* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-04-25 16:15 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-04-25 16:15 UTC (permalink / raw
To: gentoo-commits
commit: f026af57d0531cd6bb957384b6c52816a0a34ed6
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Mon Apr 25 16:14:27 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Mon Apr 25 16:15:36 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=f026af57
Update distro patch in security Kconfig for 5.18
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
4567_distro-Gentoo-Kconfig.patch | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/4567_distro-Gentoo-Kconfig.patch b/4567_distro-Gentoo-Kconfig.patch
index 9843c3e2..ab78353b 100644
--- a/4567_distro-Gentoo-Kconfig.patch
+++ b/4567_distro-Gentoo-Kconfig.patch
@@ -294,12 +294,12 @@
+ See the settings that become available for more details and fine-tuning.
+
+endmenu
---- a/security/Kconfig 2021-12-05 18:20:55.655677710 -0500
-+++ b/security/Kconfig 2021-12-05 18:23:42.404251618 -0500
+--- a/security/Kconfig 2022-04-25 11:20:45.487213970 -0400
++++ b/security/Kconfig 2022-04-25 11:22:02.514143999 -0400
@@ -167,6 +167,7 @@ config HARDENED_USERCOPY_PAGESPAN
bool "Refuse to copy allocations that span multiple pages"
depends on HARDENED_USERCOPY
- depends on EXPERT
+ depends on BROKEN
+ depends on !GENTOO_KERNEL_SELF_PROTECTION
help
When a multi-page allocation is done without __GFP_COMP,
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-04-25 16:23 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-04-25 16:23 UTC (permalink / raw
To: gentoo-commits
commit: f7636d029fe7df83bfd3cb541207ca86ce72aa0d
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Mon Apr 25 16:22:56 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Mon Apr 25 16:22:56 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=f7636d02
Add genpatches
Bluetooth: Check key sizes only when Secure Simple Pairing is
enabled. See bug #686758
tmp513 requies REGMAP_I2C to build. Select it by default in Kconfig.
See bug #710790. Thanks to Phil Stracchino
sign-file: full functionality with modern LibreSSL
Add Gentoo Linux support config settings and defaults.
Kernel Self Protection patch
CPU Optimization patch
Print firmware info (Reqs CONFIG_GENTOO_PRINT_FIRMWARE_INFO
Patch to enable link security restrictions by default.
Patch to support for namespace user.pax.* on tmpfs.
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 28 +
1500_XATTR_USER_PREFIX.patch | 67 ++
...ble-link-security-restrictions-by-default.patch | 17 +
...zes-only-if-Secure-Simple-Pairing-enabled.patch | 37 ++
...3-Fix-build-issue-by-selecting-CONFIG_REG.patch | 30 +
2920_sign-file-patch-for-libressl.patch | 16 +
3000_Support-printing-firmware-info.patch | 14 +
5010_enable-cpu-optimizations-universal.patch | 675 +++++++++++++++++++++
8 files changed, 884 insertions(+)
diff --git a/0000_README b/0000_README
index 90189932..efde5c7d 100644
--- a/0000_README
+++ b/0000_README
@@ -43,6 +43,34 @@ EXPERIMENTAL
Individual Patch Descriptions:
--------------------------------------------------------------------------
+Patch: 1500_XATTR_USER_PREFIX.patch
+From: https://bugs.gentoo.org/show_bug.cgi?id=470644
+Desc: Support for namespace user.pax.* on tmpfs.
+
+Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
+From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
+Desc: Enable link security restrictions by default.
+
+Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
+From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
+Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
+
+Patch: 2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
+From: https://bugs.gentoo.org/710790
+Desc: tmp513 requies REGMAP_I2C to build. Select it by default in Kconfig. See bug #710790. Thanks to Phil Stracchino
+
+Patch: 2920_sign-file-patch-for-libressl.patch
+From: https://bugs.gentoo.org/717166
+Desc: sign-file: full functionality with modern LibreSSL
+
+Patch: 3000_Support-printing-firmware-info.patch
+From: https://bugs.gentoo.org/732852
+Desc: Print firmware info (Reqs CONFIG_GENTOO_PRINT_FIRMWARE_INFO). Thanks to Georgy Yakovlev
+
Patch: 4567_distro-Gentoo-Kconfig.patch
From: Tom Wijsman <TomWij@gentoo.org>
Desc: Add Gentoo Linux support config settings and defaults.
+
+Patch: 5010_enable-cpu-optimizations-universal.patch
+From: https://github.com/graysky2/kernel_compiler_patch
+Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
diff --git a/1500_XATTR_USER_PREFIX.patch b/1500_XATTR_USER_PREFIX.patch
new file mode 100644
index 00000000..245dcc29
--- /dev/null
+++ b/1500_XATTR_USER_PREFIX.patch
@@ -0,0 +1,67 @@
+From: Anthony G. Basile <blueness@gentoo.org>
+
+This patch adds support for a restricted user-controlled namespace on
+tmpfs filesystem used to house PaX flags. The namespace must be of the
+form user.pax.* and its value cannot exceed a size of 8 bytes.
+
+This is needed even on all Gentoo systems so that XATTR_PAX flags
+are preserved for users who might build packages using portage on
+a tmpfs system with a non-hardened kernel and then switch to a
+hardened kernel with XATTR_PAX enabled.
+
+The namespace is added to any user with Extended Attribute support
+enabled for tmpfs. Users who do not enable xattrs will not have
+the XATTR_PAX flags preserved.
+
+diff --git a/include/uapi/linux/xattr.h b/include/uapi/linux/xattr.h
+index 1590c49..5eab462 100644
+--- a/include/uapi/linux/xattr.h
++++ b/include/uapi/linux/xattr.h
+@@ -73,5 +73,9 @@
+ #define XATTR_POSIX_ACL_DEFAULT "posix_acl_default"
+ #define XATTR_NAME_POSIX_ACL_DEFAULT XATTR_SYSTEM_PREFIX XATTR_POSIX_ACL_DEFAULT
+
++/* User namespace */
++#define XATTR_PAX_PREFIX XATTR_USER_PREFIX "pax."
++#define XATTR_PAX_FLAGS_SUFFIX "flags"
++#define XATTR_NAME_PAX_FLAGS XATTR_PAX_PREFIX XATTR_PAX_FLAGS_SUFFIX
+
+ #endif /* _UAPI_LINUX_XATTR_H */
+--- a/mm/shmem.c 2020-05-04 15:30:27.042035334 -0400
++++ b/mm/shmem.c 2020-05-04 15:34:57.013881725 -0400
+@@ -3238,6 +3238,14 @@ static int shmem_xattr_handler_set(const
+ struct shmem_inode_info *info = SHMEM_I(inode);
+
+ name = xattr_full_name(handler, name);
++
++ if (!strncmp(name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN)) {
++ if (strcmp(name, XATTR_NAME_PAX_FLAGS))
++ return -EOPNOTSUPP;
++ if (size > 8)
++ return -EINVAL;
++ }
++
+ return simple_xattr_set(&info->xattrs, name, value, size, flags, NULL);
+ }
+
+@@ -3253,6 +3261,12 @@ static const struct xattr_handler shmem_
+ .set = shmem_xattr_handler_set,
+ };
+
++static const struct xattr_handler shmem_user_xattr_handler = {
++ .prefix = XATTR_USER_PREFIX,
++ .get = shmem_xattr_handler_get,
++ .set = shmem_xattr_handler_set,
++};
++
+ static const struct xattr_handler *shmem_xattr_handlers[] = {
+ #ifdef CONFIG_TMPFS_POSIX_ACL
+ &posix_acl_access_xattr_handler,
+@@ -3260,6 +3274,7 @@ static const struct xattr_handler *shmem
+ #endif
+ &shmem_security_xattr_handler,
+ &shmem_trusted_xattr_handler,
++ &shmem_user_xattr_handler,
+ NULL
+ };
+
diff --git a/1510_fs-enable-link-security-restrictions-by-default.patch b/1510_fs-enable-link-security-restrictions-by-default.patch
new file mode 100644
index 00000000..e8c30157
--- /dev/null
+++ b/1510_fs-enable-link-security-restrictions-by-default.patch
@@ -0,0 +1,17 @@
+--- a/fs/namei.c 2022-01-23 13:02:27.876558299 -0500
++++ b/fs/namei.c 2022-03-06 12:47:39.375719693 -0500
+@@ -1020,10 +1020,10 @@ static inline void put_link(struct namei
+ path_put(&last->link);
+ }
+
+-static int sysctl_protected_symlinks __read_mostly;
+-static int sysctl_protected_hardlinks __read_mostly;
+-static int sysctl_protected_fifos __read_mostly;
+-static int sysctl_protected_regular __read_mostly;
++static int sysctl_protected_symlinks __read_mostly = 1;
++static int sysctl_protected_hardlinks __read_mostly = 1;
++int sysctl_protected_fifos __read_mostly = 1;
++int sysctl_protected_regular __read_mostly = 1;
+
+ #ifdef CONFIG_SYSCTL
+ static struct ctl_table namei_sysctls[] = {
diff --git a/2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch b/2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
new file mode 100644
index 00000000..394ad48f
--- /dev/null
+++ b/2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
@@ -0,0 +1,37 @@
+The encryption is only mandatory to be enforced when both sides are using
+Secure Simple Pairing and this means the key size check makes only sense
+in that case.
+
+On legacy Bluetooth 2.0 and earlier devices like mice the encryption was
+optional and thus causing an issue if the key size check is not bound to
+using Secure Simple Pairing.
+
+Fixes: d5bb334a8e17 ("Bluetooth: Align minimum encryption key size for LE and BR/EDR connections")
+Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
+Cc: stable@vger.kernel.org
+---
+ net/bluetooth/hci_conn.c | 9 +++++++--
+ 1 file changed, 7 insertions(+), 2 deletions(-)
+
+diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
+index 3cf0764d5793..7516cdde3373 100644
+--- a/net/bluetooth/hci_conn.c
++++ b/net/bluetooth/hci_conn.c
+@@ -1272,8 +1272,13 @@ int hci_conn_check_link_mode(struct hci_conn *conn)
+ return 0;
+ }
+
+- if (hci_conn_ssp_enabled(conn) &&
+- !test_bit(HCI_CONN_ENCRYPT, &conn->flags))
++ /* If Secure Simple Pairing is not enabled, then legacy connection
++ * setup is used and no encryption or key sizes can be enforced.
++ */
++ if (!hci_conn_ssp_enabled(conn))
++ return 1;
++
++ if (!test_bit(HCI_CONN_ENCRYPT, &conn->flags))
+ return 0;
+
+ /* The minimum encryption key size needs to be enforced by the
+--
+2.20.1
diff --git a/2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch b/2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
new file mode 100644
index 00000000..43356857
--- /dev/null
+++ b/2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
@@ -0,0 +1,30 @@
+From dc328d75a6f37f4ff11a81ae16b1ec88c3197640 Mon Sep 17 00:00:00 2001
+From: Mike Pagano <mpagano@gentoo.org>
+Date: Mon, 23 Mar 2020 08:20:06 -0400
+Subject: [PATCH 1/1] This driver requires REGMAP_I2C to build. Select it by
+ default in Kconfig. Reported at gentoo bugzilla:
+ https://bugs.gentoo.org/710790
+Cc: mpagano@gentoo.org
+
+Reported-by: Phil Stracchino <phils@caerllewys.net>
+
+Signed-off-by: Mike Pagano <mpagano@gentoo.org>
+---
+ drivers/hwmon/Kconfig | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
+index 47ac20aee06f..530b4f29ba85 100644
+--- a/drivers/hwmon/Kconfig
++++ b/drivers/hwmon/Kconfig
+@@ -1769,6 +1769,7 @@ config SENSORS_TMP421
+ config SENSORS_TMP513
+ tristate "Texas Instruments TMP513 and compatibles"
+ depends on I2C
++ select REGMAP_I2C
+ help
+ If you say yes here you get support for Texas Instruments TMP512,
+ and TMP513 temperature and power supply sensor chips.
+--
+2.24.1
+
diff --git a/2920_sign-file-patch-for-libressl.patch b/2920_sign-file-patch-for-libressl.patch
new file mode 100644
index 00000000..e6ec017d
--- /dev/null
+++ b/2920_sign-file-patch-for-libressl.patch
@@ -0,0 +1,16 @@
+--- a/scripts/sign-file.c 2020-05-20 18:47:21.282820662 -0400
++++ b/scripts/sign-file.c 2020-05-20 18:48:37.991081899 -0400
+@@ -41,9 +41,10 @@
+ * signing with anything other than SHA1 - so we're stuck with that if such is
+ * the case.
+ */
+-#if defined(LIBRESSL_VERSION_NUMBER) || \
+- OPENSSL_VERSION_NUMBER < 0x10000000L || \
+- defined(OPENSSL_NO_CMS)
++#if defined(OPENSSL_NO_CMS) || \
++ ( defined(LIBRESSL_VERSION_NUMBER) \
++ && (LIBRESSL_VERSION_NUMBER < 0x3010000fL) ) || \
++ OPENSSL_VERSION_NUMBER < 0x10000000L
+ #define USE_PKCS7
+ #endif
+ #ifndef USE_PKCS7
diff --git a/3000_Support-printing-firmware-info.patch b/3000_Support-printing-firmware-info.patch
new file mode 100644
index 00000000..a630cfbe
--- /dev/null
+++ b/3000_Support-printing-firmware-info.patch
@@ -0,0 +1,14 @@
+--- a/drivers/base/firmware_loader/main.c 2021-08-24 15:42:07.025482085 -0400
++++ b/drivers/base/firmware_loader/main.c 2021-08-24 15:44:40.782975313 -0400
+@@ -809,6 +809,11 @@ _request_firmware(const struct firmware
+
+ ret = _request_firmware_prepare(&fw, name, device, buf, size,
+ offset, opt_flags);
++
++#ifdef CONFIG_GENTOO_PRINT_FIRMWARE_INFO
++ printk(KERN_NOTICE "Loading firmware: %s\n", name);
++#endif
++
+ if (ret <= 0) /* error or already assigned */
+ goto out;
+
diff --git a/5010_enable-cpu-optimizations-universal.patch b/5010_enable-cpu-optimizations-universal.patch
new file mode 100644
index 00000000..b9c03cb6
--- /dev/null
+++ b/5010_enable-cpu-optimizations-universal.patch
@@ -0,0 +1,675 @@
+From b5892719c43f739343c628e3d357471a3bdaa368 Mon Sep 17 00:00:00 2001
+From: graysky <graysky@archlinux.us>
+Date: Tue, 15 Mar 2022 05:58:43 -0400
+Subject: [PATCH] more uarches for kernel 5.17+
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+FEATURES
+This patch adds additional CPU options to the Linux kernel accessible under:
+ Processor type and features --->
+ Processor family --->
+
+With the release of gcc 11.1 and clang 12.0, several generic 64-bit levels are
+offered which are good for supported Intel or AMD CPUs:
+• x86-64-v2
+• x86-64-v3
+• x86-64-v4
+
+Users of glibc 2.33 and above can see which level is supported by current
+hardware by running:
+ /lib/ld-linux-x86-64.so.2 --help | grep supported
+
+Alternatively, compare the flags from /proc/cpuinfo to this list.[1]
+
+CPU-specific microarchitectures include:
+• AMD Improved K8-family
+• AMD K10-family
+• AMD Family 10h (Barcelona)
+• AMD Family 14h (Bobcat)
+• AMD Family 16h (Jaguar)
+• AMD Family 15h (Bulldozer)
+• AMD Family 15h (Piledriver)
+• AMD Family 15h (Steamroller)
+• AMD Family 15h (Excavator)
+• AMD Family 17h (Zen)
+• AMD Family 17h (Zen 2)
+• AMD Family 19h (Zen 3)†
+• Intel Silvermont low-power processors
+• Intel Goldmont low-power processors (Apollo Lake and Denverton)
+• Intel Goldmont Plus low-power processors (Gemini Lake)
+• Intel 1st Gen Core i3/i5/i7 (Nehalem)
+• Intel 1.5 Gen Core i3/i5/i7 (Westmere)
+• Intel 2nd Gen Core i3/i5/i7 (Sandybridge)
+• Intel 3rd Gen Core i3/i5/i7 (Ivybridge)
+• Intel 4th Gen Core i3/i5/i7 (Haswell)
+• Intel 5th Gen Core i3/i5/i7 (Broadwell)
+• Intel 6th Gen Core i3/i5/i7 (Skylake)
+• Intel 6th Gen Core i7/i9 (Skylake X)
+• Intel 8th Gen Core i3/i5/i7 (Cannon Lake)
+• Intel 10th Gen Core i7/i9 (Ice Lake)
+• Intel Xeon (Cascade Lake)
+• Intel Xeon (Cooper Lake)*
+• Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake)*
+• Intel 3rd Gen 10nm++ Xeon (Sapphire Rapids)‡
+• Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake)‡
+• Intel 12th Gen i3/i5/i7/i9-family (Alder Lake)‡
+
+Notes: If not otherwise noted, gcc >=9.1 is required for support.
+ *Requires gcc >=10.1 or clang >=10.0
+ †Required gcc >=10.3 or clang >=12.0
+ ‡Required gcc >=11.1 or clang >=12.0
+
+It also offers to compile passing the 'native' option which, "selects the CPU
+to generate code for at compilation time by determining the processor type of
+the compiling machine. Using -march=native enables all instruction subsets
+supported by the local machine and will produce code optimized for the local
+machine under the constraints of the selected instruction set."[2]
+
+Users of Intel CPUs should select the 'Intel-Native' option and users of AMD
+CPUs should select the 'AMD-Native' option.
+
+MINOR NOTES RELATING TO INTEL ATOM PROCESSORS
+This patch also changes -march=atom to -march=bonnell in accordance with the
+gcc v4.9 changes. Upstream is using the deprecated -match=atom flags when I
+believe it should use the newer -march=bonnell flag for atom processors.[3]
+
+It is not recommended to compile on Atom-CPUs with the 'native' option.[4] The
+recommendation is to use the 'atom' option instead.
+
+BENEFITS
+Small but real speed increases are measurable using a make endpoint comparing
+a generic kernel to one built with one of the respective microarchs.
+
+See the following experimental evidence supporting this statement:
+https://github.com/graysky2/kernel_gcc_patch
+
+REQUIREMENTS
+linux version 5.17+
+gcc version >=9.0 or clang version >=9.0
+
+ACKNOWLEDGMENTS
+This patch builds on the seminal work by Jeroen.[5]
+
+REFERENCES
+1. https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
+2. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-x86-Options
+3. https://bugzilla.kernel.org/show_bug.cgi?id=77461
+4. https://github.com/graysky2/kernel_gcc_patch/issues/15
+5. http://www.linuxforge.net/docs/linux/linux-gcc.php
+
+Signed-off-by: graysky <graysky@archlinux.us>
+---
+ arch/x86/Kconfig.cpu | 332 ++++++++++++++++++++++++++++++--
+ arch/x86/Makefile | 40 +++-
+ arch/x86/include/asm/vermagic.h | 66 +++++++
+ 3 files changed, 424 insertions(+), 14 deletions(-)
+
+diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
+index 542377cd419d..22b919cdb6d1 100644
+--- a/arch/x86/Kconfig.cpu
++++ b/arch/x86/Kconfig.cpu
+@@ -157,7 +157,7 @@ config MPENTIUM4
+
+
+ config MK6
+- bool "K6/K6-II/K6-III"
++ bool "AMD K6/K6-II/K6-III"
+ depends on X86_32
+ help
+ Select this for an AMD K6-family processor. Enables use of
+@@ -165,7 +165,7 @@ config MK6
+ flags to GCC.
+
+ config MK7
+- bool "Athlon/Duron/K7"
++ bool "AMD Athlon/Duron/K7"
+ depends on X86_32
+ help
+ Select this for an AMD Athlon K7-family processor. Enables use of
+@@ -173,12 +173,98 @@ config MK7
+ flags to GCC.
+
+ config MK8
+- bool "Opteron/Athlon64/Hammer/K8"
++ bool "AMD Opteron/Athlon64/Hammer/K8"
+ help
+ Select this for an AMD Opteron or Athlon64 Hammer-family processor.
+ Enables use of some extended instructions, and passes appropriate
+ optimization flags to GCC.
+
++config MK8SSE3
++ bool "AMD Opteron/Athlon64/Hammer/K8 with SSE3"
++ help
++ Select this for improved AMD Opteron or Athlon64 Hammer-family processors.
++ Enables use of some extended instructions, and passes appropriate
++ optimization flags to GCC.
++
++config MK10
++ bool "AMD 61xx/7x50/PhenomX3/X4/II/K10"
++ help
++ Select this for an AMD 61xx Eight-Core Magny-Cours, Athlon X2 7x50,
++ Phenom X3/X4/II, Athlon II X2/X3/X4, or Turion II-family processor.
++ Enables use of some extended instructions, and passes appropriate
++ optimization flags to GCC.
++
++config MBARCELONA
++ bool "AMD Barcelona"
++ help
++ Select this for AMD Family 10h Barcelona processors.
++
++ Enables -march=barcelona
++
++config MBOBCAT
++ bool "AMD Bobcat"
++ help
++ Select this for AMD Family 14h Bobcat processors.
++
++ Enables -march=btver1
++
++config MJAGUAR
++ bool "AMD Jaguar"
++ help
++ Select this for AMD Family 16h Jaguar processors.
++
++ Enables -march=btver2
++
++config MBULLDOZER
++ bool "AMD Bulldozer"
++ help
++ Select this for AMD Family 15h Bulldozer processors.
++
++ Enables -march=bdver1
++
++config MPILEDRIVER
++ bool "AMD Piledriver"
++ help
++ Select this for AMD Family 15h Piledriver processors.
++
++ Enables -march=bdver2
++
++config MSTEAMROLLER
++ bool "AMD Steamroller"
++ help
++ Select this for AMD Family 15h Steamroller processors.
++
++ Enables -march=bdver3
++
++config MEXCAVATOR
++ bool "AMD Excavator"
++ help
++ Select this for AMD Family 15h Excavator processors.
++
++ Enables -march=bdver4
++
++config MZEN
++ bool "AMD Zen"
++ help
++ Select this for AMD Family 17h Zen processors.
++
++ Enables -march=znver1
++
++config MZEN2
++ bool "AMD Zen 2"
++ help
++ Select this for AMD Family 17h Zen 2 processors.
++
++ Enables -march=znver2
++
++config MZEN3
++ bool "AMD Zen 3"
++ depends on (CC_IS_GCC && GCC_VERSION >= 100300) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ help
++ Select this for AMD Family 19h Zen 3 processors.
++
++ Enables -march=znver3
++
+ config MCRUSOE
+ bool "Crusoe"
+ depends on X86_32
+@@ -270,7 +356,7 @@ config MPSC
+ in /proc/cpuinfo. Family 15 is an older Xeon, Family 6 a newer one.
+
+ config MCORE2
+- bool "Core 2/newer Xeon"
++ bool "Intel Core 2"
+ help
+
+ Select this for Intel Core 2 and newer Core 2 Xeons (Xeon 51xx and
+@@ -278,6 +364,8 @@ config MCORE2
+ family in /proc/cpuinfo. Newer ones have 6 and older ones 15
+ (not a typo)
+
++ Enables -march=core2
++
+ config MATOM
+ bool "Intel Atom"
+ help
+@@ -287,6 +375,182 @@ config MATOM
+ accordingly optimized code. Use a recent GCC with specific Atom
+ support in order to fully benefit from selecting this option.
+
++config MNEHALEM
++ bool "Intel Nehalem"
++ select X86_P6_NOP
++ help
++
++ Select this for 1st Gen Core processors in the Nehalem family.
++
++ Enables -march=nehalem
++
++config MWESTMERE
++ bool "Intel Westmere"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Westmere formerly Nehalem-C family.
++
++ Enables -march=westmere
++
++config MSILVERMONT
++ bool "Intel Silvermont"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Silvermont platform.
++
++ Enables -march=silvermont
++
++config MGOLDMONT
++ bool "Intel Goldmont"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Goldmont platform including Apollo Lake and Denverton.
++
++ Enables -march=goldmont
++
++config MGOLDMONTPLUS
++ bool "Intel Goldmont Plus"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Goldmont Plus platform including Gemini Lake.
++
++ Enables -march=goldmont-plus
++
++config MSANDYBRIDGE
++ bool "Intel Sandy Bridge"
++ select X86_P6_NOP
++ help
++
++ Select this for 2nd Gen Core processors in the Sandy Bridge family.
++
++ Enables -march=sandybridge
++
++config MIVYBRIDGE
++ bool "Intel Ivy Bridge"
++ select X86_P6_NOP
++ help
++
++ Select this for 3rd Gen Core processors in the Ivy Bridge family.
++
++ Enables -march=ivybridge
++
++config MHASWELL
++ bool "Intel Haswell"
++ select X86_P6_NOP
++ help
++
++ Select this for 4th Gen Core processors in the Haswell family.
++
++ Enables -march=haswell
++
++config MBROADWELL
++ bool "Intel Broadwell"
++ select X86_P6_NOP
++ help
++
++ Select this for 5th Gen Core processors in the Broadwell family.
++
++ Enables -march=broadwell
++
++config MSKYLAKE
++ bool "Intel Skylake"
++ select X86_P6_NOP
++ help
++
++ Select this for 6th Gen Core processors in the Skylake family.
++
++ Enables -march=skylake
++
++config MSKYLAKEX
++ bool "Intel Skylake X"
++ select X86_P6_NOP
++ help
++
++ Select this for 6th Gen Core processors in the Skylake X family.
++
++ Enables -march=skylake-avx512
++
++config MCANNONLAKE
++ bool "Intel Cannon Lake"
++ select X86_P6_NOP
++ help
++
++ Select this for 8th Gen Core processors
++
++ Enables -march=cannonlake
++
++config MICELAKE
++ bool "Intel Ice Lake"
++ select X86_P6_NOP
++ help
++
++ Select this for 10th Gen Core processors in the Ice Lake family.
++
++ Enables -march=icelake-client
++
++config MCASCADELAKE
++ bool "Intel Cascade Lake"
++ select X86_P6_NOP
++ help
++
++ Select this for Xeon processors in the Cascade Lake family.
++
++ Enables -march=cascadelake
++
++config MCOOPERLAKE
++ bool "Intel Cooper Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 100100) || (CC_IS_CLANG && CLANG_VERSION >= 100000)
++ select X86_P6_NOP
++ help
++
++ Select this for Xeon processors in the Cooper Lake family.
++
++ Enables -march=cooperlake
++
++config MTIGERLAKE
++ bool "Intel Tiger Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 100100) || (CC_IS_CLANG && CLANG_VERSION >= 100000)
++ select X86_P6_NOP
++ help
++
++ Select this for third-generation 10 nm process processors in the Tiger Lake family.
++
++ Enables -march=tigerlake
++
++config MSAPPHIRERAPIDS
++ bool "Intel Sapphire Rapids"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ select X86_P6_NOP
++ help
++
++ Select this for third-generation 10 nm process processors in the Sapphire Rapids family.
++
++ Enables -march=sapphirerapids
++
++config MROCKETLAKE
++ bool "Intel Rocket Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ select X86_P6_NOP
++ help
++
++ Select this for eleventh-generation processors in the Rocket Lake family.
++
++ Enables -march=rocketlake
++
++config MALDERLAKE
++ bool "Intel Alder Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ select X86_P6_NOP
++ help
++
++ Select this for twelfth-generation processors in the Alder Lake family.
++
++ Enables -march=alderlake
++
+ config GENERIC_CPU
+ bool "Generic-x86-64"
+ depends on X86_64
+@@ -294,6 +558,50 @@ config GENERIC_CPU
+ Generic x86-64 CPU.
+ Run equally well on all x86-64 CPUs.
+
++config GENERIC_CPU2
++ bool "Generic-x86-64-v2"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ depends on X86_64
++ help
++ Generic x86-64 CPU.
++ Run equally well on all x86-64 CPUs with min support of x86-64-v2.
++
++config GENERIC_CPU3
++ bool "Generic-x86-64-v3"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ depends on X86_64
++ help
++ Generic x86-64-v3 CPU with v3 instructions.
++ Run equally well on all x86-64 CPUs with min support of x86-64-v3.
++
++config GENERIC_CPU4
++ bool "Generic-x86-64-v4"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ depends on X86_64
++ help
++ Generic x86-64 CPU with v4 instructions.
++ Run equally well on all x86-64 CPUs with min support of x86-64-v4.
++
++config MNATIVE_INTEL
++ bool "Intel-Native optimizations autodetected by the compiler"
++ help
++
++ Clang 3.8, GCC 4.2 and above support -march=native, which automatically detects
++ the optimum settings to use based on your processor. Do NOT use this
++ for AMD CPUs. Intel Only!
++
++ Enables -march=native
++
++config MNATIVE_AMD
++ bool "AMD-Native optimizations autodetected by the compiler"
++ help
++
++ Clang 3.8, GCC 4.2 and above support -march=native, which automatically detects
++ the optimum settings to use based on your processor. Do NOT use this
++ for Intel CPUs. AMD Only!
++
++ Enables -march=native
++
+ endchoice
+
+ config X86_GENERIC
+@@ -318,7 +626,7 @@ config X86_INTERNODE_CACHE_SHIFT
+ config X86_L1_CACHE_SHIFT
+ int
+ default "7" if MPENTIUM4 || MPSC
+- default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
++ default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL || MNATIVE_AMD || X86_GENERIC || GENERIC_CPU || GENERIC_CPU2 || GENERIC_CPU3 || GENERIC_CPU4
+ default "4" if MELAN || M486SX || M486 || MGEODEGX1
+ default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
+
+@@ -336,11 +644,11 @@ config X86_ALIGNMENT_16
+
+ config X86_INTEL_USERCOPY
+ def_bool y
+- depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2
++ depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL
+
+ config X86_USE_PPRO_CHECKSUM
+ def_bool y
+- depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
++ depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL || MNATIVE_AMD
+
+ #
+ # P6_NOPs are a relatively minor optimization that require a family >=
+@@ -356,26 +664,26 @@ config X86_USE_PPRO_CHECKSUM
+ config X86_P6_NOP
+ def_bool y
+ depends on X86_64
+- depends on (MCORE2 || MPENTIUM4 || MPSC)
++ depends on (MCORE2 || MPENTIUM4 || MPSC || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL)
+
+ config X86_TSC
+ def_bool y
+- depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MATOM) || X86_64
++ depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MATOM || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL || MNATIVE_AMD) || X86_64
+
+ config X86_CMPXCHG64
+ def_bool y
+- depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586TSC || M586MMX || MATOM || MGEODE_LX || MGEODEGX1 || MK6 || MK7 || MK8
++ depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586TSC || M586MMX || MATOM || MGEODE_LX || MGEODEGX1 || MK6 || MK7 || MK8 || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL || MNATIVE_AMD
+
+ # this should be set for all -march=.. options where the compiler
+ # generates cmov.
+ config X86_CMOV
+ def_bool y
+- depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
++ depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL || MNATIVE_AMD)
+
+ config X86_MINIMUM_CPU_FAMILY
+ int
+ default "64" if X86_64
+- default "6" if X86_32 && (MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MEFFICEON || MATOM || MCRUSOE || MCORE2 || MK7 || MK8)
++ default "6" if X86_32 && (MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MEFFICEON || MATOM || MCRUSOE || MCORE2 || MK7 || MK8 || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MNATIVE_INTEL || MNATIVE_AMD)
+ default "5" if X86_32 && X86_CMPXCHG64
+ default "4"
+
+diff --git a/arch/x86/Makefile b/arch/x86/Makefile
+index e84cdd409b64..7d3bbf060079 100644
+--- a/arch/x86/Makefile
++++ b/arch/x86/Makefile
+@@ -131,8 +131,44 @@ else
+ # FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
+ cflags-$(CONFIG_MK8) += -march=k8
+ cflags-$(CONFIG_MPSC) += -march=nocona
+- cflags-$(CONFIG_MCORE2) += -march=core2
+- cflags-$(CONFIG_MATOM) += -march=atom
++ cflags-$(CONFIG_MK8SSE3) += -march=k8-sse3
++ cflags-$(CONFIG_MK10) += -march=amdfam10
++ cflags-$(CONFIG_MBARCELONA) += -march=barcelona
++ cflags-$(CONFIG_MBOBCAT) += -march=btver1
++ cflags-$(CONFIG_MJAGUAR) += -march=btver2
++ cflags-$(CONFIG_MBULLDOZER) += -march=bdver1
++ cflags-$(CONFIG_MPILEDRIVER) += -march=bdver2 -mno-tbm
++ cflags-$(CONFIG_MSTEAMROLLER) += -march=bdver3 -mno-tbm
++ cflags-$(CONFIG_MEXCAVATOR) += -march=bdver4 -mno-tbm
++ cflags-$(CONFIG_MZEN) += -march=znver1
++ cflags-$(CONFIG_MZEN2) += -march=znver2
++ cflags-$(CONFIG_MZEN3) += -march=znver3
++ cflags-$(CONFIG_MNATIVE_INTEL) += -march=native
++ cflags-$(CONFIG_MNATIVE_AMD) += -march=native
++ cflags-$(CONFIG_MATOM) += -march=bonnell
++ cflags-$(CONFIG_MCORE2) += -march=core2
++ cflags-$(CONFIG_MNEHALEM) += -march=nehalem
++ cflags-$(CONFIG_MWESTMERE) += -march=westmere
++ cflags-$(CONFIG_MSILVERMONT) += -march=silvermont
++ cflags-$(CONFIG_MGOLDMONT) += -march=goldmont
++ cflags-$(CONFIG_MGOLDMONTPLUS) += -march=goldmont-plus
++ cflags-$(CONFIG_MSANDYBRIDGE) += -march=sandybridge
++ cflags-$(CONFIG_MIVYBRIDGE) += -march=ivybridge
++ cflags-$(CONFIG_MHASWELL) += -march=haswell
++ cflags-$(CONFIG_MBROADWELL) += -march=broadwell
++ cflags-$(CONFIG_MSKYLAKE) += -march=skylake
++ cflags-$(CONFIG_MSKYLAKEX) += -march=skylake-avx512
++ cflags-$(CONFIG_MCANNONLAKE) += -march=cannonlake
++ cflags-$(CONFIG_MICELAKE) += -march=icelake-client
++ cflags-$(CONFIG_MCASCADELAKE) += -march=cascadelake
++ cflags-$(CONFIG_MCOOPERLAKE) += -march=cooperlake
++ cflags-$(CONFIG_MTIGERLAKE) += -march=tigerlake
++ cflags-$(CONFIG_MSAPPHIRERAPIDS) += -march=sapphirerapids
++ cflags-$(CONFIG_MROCKETLAKE) += -march=rocketlake
++ cflags-$(CONFIG_MALDERLAKE) += -march=alderlake
++ cflags-$(CONFIG_GENERIC_CPU2) += -march=x86-64-v2
++ cflags-$(CONFIG_GENERIC_CPU3) += -march=x86-64-v3
++ cflags-$(CONFIG_GENERIC_CPU4) += -march=x86-64-v4
+ cflags-$(CONFIG_GENERIC_CPU) += -mtune=generic
+ KBUILD_CFLAGS += $(cflags-y)
+
+diff --git a/arch/x86/include/asm/vermagic.h b/arch/x86/include/asm/vermagic.h
+index 75884d2cdec3..4e6a08d4c7e5 100644
+--- a/arch/x86/include/asm/vermagic.h
++++ b/arch/x86/include/asm/vermagic.h
+@@ -17,6 +17,48 @@
+ #define MODULE_PROC_FAMILY "586MMX "
+ #elif defined CONFIG_MCORE2
+ #define MODULE_PROC_FAMILY "CORE2 "
++#elif defined CONFIG_MNATIVE_INTEL
++#define MODULE_PROC_FAMILY "NATIVE_INTEL "
++#elif defined CONFIG_MNATIVE_AMD
++#define MODULE_PROC_FAMILY "NATIVE_AMD "
++#elif defined CONFIG_MNEHALEM
++#define MODULE_PROC_FAMILY "NEHALEM "
++#elif defined CONFIG_MWESTMERE
++#define MODULE_PROC_FAMILY "WESTMERE "
++#elif defined CONFIG_MSILVERMONT
++#define MODULE_PROC_FAMILY "SILVERMONT "
++#elif defined CONFIG_MGOLDMONT
++#define MODULE_PROC_FAMILY "GOLDMONT "
++#elif defined CONFIG_MGOLDMONTPLUS
++#define MODULE_PROC_FAMILY "GOLDMONTPLUS "
++#elif defined CONFIG_MSANDYBRIDGE
++#define MODULE_PROC_FAMILY "SANDYBRIDGE "
++#elif defined CONFIG_MIVYBRIDGE
++#define MODULE_PROC_FAMILY "IVYBRIDGE "
++#elif defined CONFIG_MHASWELL
++#define MODULE_PROC_FAMILY "HASWELL "
++#elif defined CONFIG_MBROADWELL
++#define MODULE_PROC_FAMILY "BROADWELL "
++#elif defined CONFIG_MSKYLAKE
++#define MODULE_PROC_FAMILY "SKYLAKE "
++#elif defined CONFIG_MSKYLAKEX
++#define MODULE_PROC_FAMILY "SKYLAKEX "
++#elif defined CONFIG_MCANNONLAKE
++#define MODULE_PROC_FAMILY "CANNONLAKE "
++#elif defined CONFIG_MICELAKE
++#define MODULE_PROC_FAMILY "ICELAKE "
++#elif defined CONFIG_MCASCADELAKE
++#define MODULE_PROC_FAMILY "CASCADELAKE "
++#elif defined CONFIG_MCOOPERLAKE
++#define MODULE_PROC_FAMILY "COOPERLAKE "
++#elif defined CONFIG_MTIGERLAKE
++#define MODULE_PROC_FAMILY "TIGERLAKE "
++#elif defined CONFIG_MSAPPHIRERAPIDS
++#define MODULE_PROC_FAMILY "SAPPHIRERAPIDS "
++#elif defined CONFIG_ROCKETLAKE
++#define MODULE_PROC_FAMILY "ROCKETLAKE "
++#elif defined CONFIG_MALDERLAKE
++#define MODULE_PROC_FAMILY "ALDERLAKE "
+ #elif defined CONFIG_MATOM
+ #define MODULE_PROC_FAMILY "ATOM "
+ #elif defined CONFIG_M686
+@@ -35,6 +77,30 @@
+ #define MODULE_PROC_FAMILY "K7 "
+ #elif defined CONFIG_MK8
+ #define MODULE_PROC_FAMILY "K8 "
++#elif defined CONFIG_MK8SSE3
++#define MODULE_PROC_FAMILY "K8SSE3 "
++#elif defined CONFIG_MK10
++#define MODULE_PROC_FAMILY "K10 "
++#elif defined CONFIG_MBARCELONA
++#define MODULE_PROC_FAMILY "BARCELONA "
++#elif defined CONFIG_MBOBCAT
++#define MODULE_PROC_FAMILY "BOBCAT "
++#elif defined CONFIG_MBULLDOZER
++#define MODULE_PROC_FAMILY "BULLDOZER "
++#elif defined CONFIG_MPILEDRIVER
++#define MODULE_PROC_FAMILY "PILEDRIVER "
++#elif defined CONFIG_MSTEAMROLLER
++#define MODULE_PROC_FAMILY "STEAMROLLER "
++#elif defined CONFIG_MJAGUAR
++#define MODULE_PROC_FAMILY "JAGUAR "
++#elif defined CONFIG_MEXCAVATOR
++#define MODULE_PROC_FAMILY "EXCAVATOR "
++#elif defined CONFIG_MZEN
++#define MODULE_PROC_FAMILY "ZEN "
++#elif defined CONFIG_MZEN2
++#define MODULE_PROC_FAMILY "ZEN2 "
++#elif defined CONFIG_MZEN3
++#define MODULE_PROC_FAMILY "ZEN3 "
+ #elif defined CONFIG_MELAN
+ #define MODULE_PROC_FAMILY "ELAN "
+ #elif defined CONFIG_MCRUSOE
+--
+2.35.1
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-05-11 17:39 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-05-11 17:39 UTC (permalink / raw
To: gentoo-commits
commit: ff2113050f76160860097c55d73a4e0f73b4aafe
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed May 11 17:25:52 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed May 11 17:39:49 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=ff211305
Update Gentoo Hardened patchset based on KSPP thanks to Peter Bo
Bug: https://bugs.gentoo.org/841488
Added:
CONFIG_HARDENED_USERCOPY=y
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
CONFIG_KFENCE=y
CONFIG_IOMMU_DEFAULT_DMA_STRICT=y
CONFIG_SCHED_CORE=y
CONFIG_ZERO_CALL_USED_REGS=y
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
4567_distro-Gentoo-Kconfig.patch | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/4567_distro-Gentoo-Kconfig.patch b/4567_distro-Gentoo-Kconfig.patch
index ab78353b..1efc0fba 100644
--- a/4567_distro-Gentoo-Kconfig.patch
+++ b/4567_distro-Gentoo-Kconfig.patch
@@ -1,14 +1,14 @@
---- a/Kconfig 2022-04-12 13:11:48.403113171 -0400
-+++ b/Kconfig 2022-04-12 13:12:36.530084675 -0400
+--- a/Kconfig 2022-05-11 13:20:07.110347567 -0400
++++ b/Kconfig 2022-05-11 13:21:12.127174393 -0400
@@ -30,3 +30,5 @@ source "lib/Kconfig"
source "lib/Kconfig.debug"
source "Documentation/Kconfig"
+
+source "distro/Kconfig"
---- /dev/null 2022-04-12 05:39:54.696333295 -0400
-+++ b/distro/Kconfig 2022-04-12 13:21:04.666379519 -0400
-@@ -0,0 +1,285 @@
+--- /dev/null 2022-05-10 13:47:17.750578524 -0400
++++ b/distro/Kconfig 2022-05-11 13:21:20.540529032 -0400
+@@ -0,0 +1,290 @@
+menu "Gentoo Linux"
+
+config GENTOO_LINUX
@@ -185,7 +185,7 @@
+config GENTOO_KERNEL_SELF_PROTECTION_COMMON
+ bool "Enable Kernel Self Protection Project Recommendations"
+
-+ depends on GENTOO_LINUX && !ACPI_CUSTOM_METHOD && !COMPAT_BRK && !DEVKMEM && !PROC_KCORE && !COMPAT_VDSO && !KEXEC && !HIBERNATION && !LEGACY_PTYS && !X86_X32 && !MODIFY_LDT_SYSCALL && GCC_PLUGINS
++ depends on GENTOO_LINUX && !ACPI_CUSTOM_METHOD && !COMPAT_BRK && !PROC_KCORE && !COMPAT_VDSO && !KEXEC && !HIBERNATION && !LEGACY_PTYS && !X86_X32 && !MODIFY_LDT_SYSCALL && GCC_PLUGINS && !IOMMU_DEFAULT_DMA_LAZY && !IOMMU_DEFAULT_PASSTHROUGH && IOMMU_DEFAULT_DMA_STRICT
+
+ select BUG
+ select STRICT_KERNEL_RWX
@@ -199,6 +199,10 @@
+ select DEBUG_NOTIFIERS
+ select DEBUG_LIST
+ select DEBUG_SG
++ select HARDENED_USERCOPY if HAVE_HARDENED_USERCOPY_ALLOCATOR=y
++ select KFENCE if HAVE_ARCH_KFENCE && (!SLAB || SLUB)
++ select RANDOMIZE_KSTACK_OFFSET_DEFAULT if HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET && (INIT_STACK_NONE || !CC_IS_CLANG || CLANG_VERSION>=140000)
++ select SCHED_CORE if SCHED_SMT
+ select BUG_ON_DATA_CORRUPTION
+ select SCHED_STACK_END_CHECK
+ select SECCOMP if HAVE_ARCH_SECCOMP
@@ -222,6 +226,7 @@
+ select GCC_PLUGIN_STRUCTLEAK_BYREF_ALL
+ select GCC_PLUGIN_RANDSTRUCT
+ select GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
++ select ZERO_CALL_USED_REGS if CC_HAS_ZERO_CALL_USED_REGS
+
+ help
+ Search for GENTOO_KERNEL_SELF_PROTECTION_{X86_64, ARM64, X86_32, ARM} for dependency
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-05-24 18:14 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-05-24 18:14 UTC (permalink / raw
To: gentoo-commits
commit: e7a03b1b749c1b7f8384972eddc70be4036ec3b5
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue May 24 18:13:25 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue May 24 18:13:25 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=e7a03b1b
Add the BMQ(BitMap Queue) Scheduler.
A new CPU scheduler developed from PDS(incld).
Inspired by the scheduler in zircon.
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 8 +
5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch | 9932 ++++++++++++++++++++++++++
5021_BMQ-and-PDS-gentoo-defaults.patch | 12 +
3 files changed, 9952 insertions(+)
diff --git a/0000_README b/0000_README
index efde5c7d..f8cc57f0 100644
--- a/0000_README
+++ b/0000_README
@@ -74,3 +74,11 @@ Desc: Add Gentoo Linux support config settings and defaults.
Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
+
+Patch: 5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
+From: https://gitlab.com/alfredchen/linux-prjc
+Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
+
+Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
+From: https://gitweb.gentoo.org/proj/linux-patches.git/
+Desc: Set defaults for BMQ. Add archs as people test, default to N
diff --git a/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch b/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
new file mode 100644
index 00000000..9b1a8ba1
--- /dev/null
+++ b/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
@@ -0,0 +1,9932 @@
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 3f1cc5e317ed..e6f88a16732b 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -5164,6 +5164,12 @@
+ sa1100ir [NET]
+ See drivers/net/irda/sa1100_ir.c.
+
++ sched_timeslice=
++ [KNL] Time slice in ms for Project C BMQ/PDS scheduler.
++ Format: integer 2, 4
++ Default: 4
++ See Documentation/scheduler/sched-BMQ.txt
++
+ sched_verbose [KNL] Enables verbose scheduler debug messages.
+
+ schedstats= [KNL,X86] Enable or disable scheduled statistics.
+diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
+index 1144ea3229a3..2accee67d6fb 100644
+--- a/Documentation/admin-guide/sysctl/kernel.rst
++++ b/Documentation/admin-guide/sysctl/kernel.rst
+@@ -1517,3 +1517,13 @@ is 10 seconds.
+
+ The softlockup threshold is (``2 * watchdog_thresh``). Setting this
+ tunable to zero will disable lockup detection altogether.
++
++yield_type:
++===========
++
++BMQ/PDS CPU scheduler only. This determines what type of yield calls
++to sched_yield will perform.
++
++ 0 - No yield.
++ 1 - Deboost and requeue task. (default)
++ 2 - Set run queue skip task.
+diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
+new file mode 100644
+index 000000000000..05c84eec0f31
+--- /dev/null
++++ b/Documentation/scheduler/sched-BMQ.txt
+@@ -0,0 +1,110 @@
++ BitMap queue CPU Scheduler
++ --------------------------
++
++CONTENT
++========
++
++ Background
++ Design
++ Overview
++ Task policy
++ Priority management
++ BitMap Queue
++ CPU Assignment and Migration
++
++
++Background
++==========
++
++BitMap Queue CPU scheduler, referred to as BMQ from here on, is an evolution
++of previous Priority and Deadline based Skiplist multiple queue scheduler(PDS),
++and inspired by Zircon scheduler. The goal of it is to keep the scheduler code
++simple, while efficiency and scalable for interactive tasks, such as desktop,
++movie playback and gaming etc.
++
++Design
++======
++
++Overview
++--------
++
++BMQ use per CPU run queue design, each CPU(logical) has it's own run queue,
++each CPU is responsible for scheduling the tasks that are putting into it's
++run queue.
++
++The run queue is a set of priority queues. Note that these queues are fifo
++queue for non-rt tasks or priority queue for rt tasks in data structure. See
++BitMap Queue below for details. BMQ is optimized for non-rt tasks in the fact
++that most applications are non-rt tasks. No matter the queue is fifo or
++priority, In each queue is an ordered list of runnable tasks awaiting execution
++and the data structures are the same. When it is time for a new task to run,
++the scheduler simply looks the lowest numbered queueue that contains a task,
++and runs the first task from the head of that queue. And per CPU idle task is
++also in the run queue, so the scheduler can always find a task to run on from
++its run queue.
++
++Each task will assigned the same timeslice(default 4ms) when it is picked to
++start running. Task will be reinserted at the end of the appropriate priority
++queue when it uses its whole timeslice. When the scheduler selects a new task
++from the priority queue it sets the CPU's preemption timer for the remainder of
++the previous timeslice. When that timer fires the scheduler will stop execution
++on that task, select another task and start over again.
++
++If a task blocks waiting for a shared resource then it's taken out of its
++priority queue and is placed in a wait queue for the shared resource. When it
++is unblocked it will be reinserted in the appropriate priority queue of an
++eligible CPU.
++
++Task policy
++-----------
++
++BMQ supports DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policy like the
++mainline CFS scheduler. But BMQ is heavy optimized for non-rt task, that's
++NORMAL/BATCH/IDLE policy tasks. Below is the implementation detail of each
++policy.
++
++DEADLINE
++ It is squashed as priority 0 FIFO task.
++
++FIFO/RR
++ All RT tasks share one single priority queue in BMQ run queue designed. The
++complexity of insert operation is O(n). BMQ is not designed for system runs
++with major rt policy tasks.
++
++NORMAL/BATCH/IDLE
++ BATCH and IDLE tasks are treated as the same policy. They compete CPU with
++NORMAL policy tasks, but they just don't boost. To control the priority of
++NORMAL/BATCH/IDLE tasks, simply use nice level.
++
++ISO
++ ISO policy is not supported in BMQ. Please use nice level -20 NORMAL policy
++task instead.
++
++Priority management
++-------------------
++
++RT tasks have priority from 0-99. For non-rt tasks, there are three different
++factors used to determine the effective priority of a task. The effective
++priority being what is used to determine which queue it will be in.
++
++The first factor is simply the task’s static priority. Which is assigned from
++task's nice level, within [-20, 19] in userland's point of view and [0, 39]
++internally.
++
++The second factor is the priority boost. This is a value bounded between
++[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] used to offset the base priority, it is
++modified by the following cases:
++
++*When a thread has used up its entire timeslice, always deboost its boost by
++increasing by one.
++*When a thread gives up cpu control(voluntary or non-voluntary) to reschedule,
++and its switch-in time(time after last switch and run) below the thredhold
++based on its priority boost, will boost its boost by decreasing by one buti is
++capped at 0 (won’t go negative).
++
++The intent in this system is to ensure that interactive threads are serviced
++quickly. These are usually the threads that interact directly with the user
++and cause user-perceivable latency. These threads usually do little work and
++spend most of their time blocked awaiting another user event. So they get the
++priority boost from unblocking while background threads that do most of the
++processing receive the priority penalty for using their entire timeslice.
+diff --git a/fs/proc/base.c b/fs/proc/base.c
+index c1031843cc6a..f2b0af41a3eb 100644
+--- a/fs/proc/base.c
++++ b/fs/proc/base.c
+@@ -479,7 +479,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
+ seq_puts(m, "0 0 0\n");
+ else
+ seq_printf(m, "%llu %llu %lu\n",
+- (unsigned long long)task->se.sum_exec_runtime,
++ (unsigned long long)tsk_seruntime(task),
+ (unsigned long long)task->sched_info.run_delay,
+ task->sched_info.pcount);
+
+diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
+index 8874f681b056..59eb72bf7d5f 100644
+--- a/include/asm-generic/resource.h
++++ b/include/asm-generic/resource.h
+@@ -23,7 +23,7 @@
+ [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ [RLIMIT_SIGPENDING] = { 0, 0 }, \
+ [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
+- [RLIMIT_NICE] = { 0, 0 }, \
++ [RLIMIT_NICE] = { 30, 30 }, \
+ [RLIMIT_RTPRIO] = { 0, 0 }, \
+ [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ }
+diff --git a/include/linux/sched.h b/include/linux/sched.h
+index a8911b1f35aa..7a4bf3a0db5a 100644
+--- a/include/linux/sched.h
++++ b/include/linux/sched.h
+@@ -753,8 +753,14 @@ struct task_struct {
+ unsigned int ptrace;
+
+ #ifdef CONFIG_SMP
+- int on_cpu;
+ struct __call_single_node wake_entry;
++#endif
++#if defined(CONFIG_SMP) || defined(CONFIG_SCHED_ALT)
++ int on_cpu;
++#endif
++
++#ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ unsigned int wakee_flips;
+ unsigned long wakee_flip_decay_ts;
+ struct task_struct *last_wakee;
+@@ -768,6 +774,7 @@ struct task_struct {
+ */
+ int recent_used_cpu;
+ int wake_cpu;
++#endif /* !CONFIG_SCHED_ALT */
+ #endif
+ int on_rq;
+
+@@ -776,6 +783,20 @@ struct task_struct {
+ int normal_prio;
+ unsigned int rt_priority;
+
++#ifdef CONFIG_SCHED_ALT
++ u64 last_ran;
++ s64 time_slice;
++ int sq_idx;
++ struct list_head sq_node;
++#ifdef CONFIG_SCHED_BMQ
++ int boost_prio;
++#endif /* CONFIG_SCHED_BMQ */
++#ifdef CONFIG_SCHED_PDS
++ u64 deadline;
++#endif /* CONFIG_SCHED_PDS */
++ /* sched_clock time spent running */
++ u64 sched_time;
++#else /* !CONFIG_SCHED_ALT */
+ struct sched_entity se;
+ struct sched_rt_entity rt;
+ struct sched_dl_entity dl;
+@@ -786,6 +807,7 @@ struct task_struct {
+ unsigned long core_cookie;
+ unsigned int core_occupation;
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_CGROUP_SCHED
+ struct task_group *sched_task_group;
+@@ -1516,6 +1538,15 @@ struct task_struct {
+ */
+ };
+
++#ifdef CONFIG_SCHED_ALT
++#define tsk_seruntime(t) ((t)->sched_time)
++/* replace the uncertian rt_timeout with 0UL */
++#define tsk_rttimeout(t) (0UL)
++#else /* CFS */
++#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
++#define tsk_rttimeout(t) ((t)->rt.timeout)
++#endif /* !CONFIG_SCHED_ALT */
++
+ static inline struct pid *task_pid(struct task_struct *task)
+ {
+ return task->thread_pid;
+diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
+index 7c83d4d5a971..fa30f98cb2be 100644
+--- a/include/linux/sched/deadline.h
++++ b/include/linux/sched/deadline.h
+@@ -1,5 +1,24 @@
+ /* SPDX-License-Identifier: GPL-2.0 */
+
++#ifdef CONFIG_SCHED_ALT
++
++static inline int dl_task(struct task_struct *p)
++{
++ return 0;
++}
++
++#ifdef CONFIG_SCHED_BMQ
++#define __tsk_deadline(p) (0UL)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
++#endif
++
++#else
++
++#define __tsk_deadline(p) ((p)->dl.deadline)
++
+ /*
+ * SCHED_DEADLINE tasks has negative priorities, reflecting
+ * the fact that any of them has higher prio than RT and
+@@ -21,6 +40,7 @@ static inline int dl_task(struct task_struct *p)
+ {
+ return dl_prio(p->prio);
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ static inline bool dl_time_before(u64 a, u64 b)
+ {
+diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
+index ab83d85e1183..6af9ae681116 100644
+--- a/include/linux/sched/prio.h
++++ b/include/linux/sched/prio.h
+@@ -18,6 +18,32 @@
+ #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
+ #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
+
++#ifdef CONFIG_SCHED_ALT
++
++/* Undefine MAX_PRIO and DEFAULT_PRIO */
++#undef MAX_PRIO
++#undef DEFAULT_PRIO
++
++/* +/- priority levels from the base priority */
++#ifdef CONFIG_SCHED_BMQ
++#define MAX_PRIORITY_ADJ (7)
++
++#define MIN_NORMAL_PRIO (MAX_RT_PRIO)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH)
++#define DEFAULT_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH / 2)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define MAX_PRIORITY_ADJ (0)
++
++#define MIN_NORMAL_PRIO (128)
++#define NORMAL_PRIO_NUM (64)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
++#define DEFAULT_PRIO (MAX_PRIO - NICE_WIDTH / 2)
++#endif
++
++#endif /* CONFIG_SCHED_ALT */
++
+ /*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
+index e5af028c08b4..0a7565d0d3cf 100644
+--- a/include/linux/sched/rt.h
++++ b/include/linux/sched/rt.h
+@@ -24,8 +24,10 @@ static inline bool task_is_realtime(struct task_struct *tsk)
+
+ if (policy == SCHED_FIFO || policy == SCHED_RR)
+ return true;
++#ifndef CONFIG_SCHED_ALT
+ if (policy == SCHED_DEADLINE)
+ return true;
++#endif
+ return false;
+ }
+
+diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
+index 56cffe42abbc..e020fc572b22 100644
+--- a/include/linux/sched/topology.h
++++ b/include/linux/sched/topology.h
+@@ -233,7 +233,8 @@ static inline bool cpus_share_cache(int this_cpu, int that_cpu)
+
+ #endif /* !CONFIG_SMP */
+
+-#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
++#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
++ !defined(CONFIG_SCHED_ALT)
+ extern void rebuild_sched_domains_energy(void);
+ #else
+ static inline void rebuild_sched_domains_energy(void)
+diff --git a/init/Kconfig b/init/Kconfig
+index ddcbefe535e9..85616423dc94 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -821,6 +821,7 @@ menu "Scheduler features"
+ config UCLAMP_TASK
+ bool "Enable utilization clamping for RT/FAIR tasks"
+ depends on CPU_FREQ_GOV_SCHEDUTIL
++ depends on !SCHED_ALT
+ help
+ This feature enables the scheduler to track the clamped utilization
+ of each CPU based on RUNNABLE tasks scheduled on that CPU.
+@@ -867,6 +868,35 @@ config UCLAMP_BUCKETS_COUNT
+
+ If in doubt, use the default value.
+
++menuconfig SCHED_ALT
++ bool "Alternative CPU Schedulers"
++ default y
++ help
++ This feature enable alternative CPU scheduler"
++
++if SCHED_ALT
++
++choice
++ prompt "Alternative CPU Scheduler"
++ default SCHED_BMQ
++
++config SCHED_BMQ
++ bool "BMQ CPU scheduler"
++ help
++ The BitMap Queue CPU scheduler for excellent interactivity and
++ responsiveness on the desktop and solid scalability on normal
++ hardware and commodity servers.
++
++config SCHED_PDS
++ bool "PDS CPU scheduler"
++ help
++ The Priority and Deadline based Skip list multiple queue CPU
++ Scheduler.
++
++endchoice
++
++endif
++
+ endmenu
+
+ #
+@@ -911,6 +941,7 @@ config NUMA_BALANCING
+ depends on ARCH_SUPPORTS_NUMA_BALANCING
+ depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
+ depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
++ depends on !SCHED_ALT
+ help
+ This option adds support for automatic NUMA aware memory/task placement.
+ The mechanism is quite primitive and is based on migrating memory when
+@@ -1003,6 +1034,7 @@ config FAIR_GROUP_SCHED
+ depends on CGROUP_SCHED
+ default CGROUP_SCHED
+
++if !SCHED_ALT
+ config CFS_BANDWIDTH
+ bool "CPU bandwidth provisioning for FAIR_GROUP_SCHED"
+ depends on FAIR_GROUP_SCHED
+@@ -1025,6 +1057,7 @@ config RT_GROUP_SCHED
+ realtime bandwidth for them.
+ See Documentation/scheduler/sched-rt-group.rst for more information.
+
++endif #!SCHED_ALT
+ endif #CGROUP_SCHED
+
+ config UCLAMP_TASK_GROUP
+@@ -1268,6 +1301,7 @@ config CHECKPOINT_RESTORE
+
+ config SCHED_AUTOGROUP
+ bool "Automatic process group scheduling"
++ depends on !SCHED_ALT
+ select CGROUPS
+ select CGROUP_SCHED
+ select FAIR_GROUP_SCHED
+diff --git a/init/init_task.c b/init/init_task.c
+index 73cc8f03511a..2d0bad762895 100644
+--- a/init/init_task.c
++++ b/init/init_task.c
+@@ -75,9 +75,15 @@ struct task_struct init_task
+ .stack = init_stack,
+ .usage = REFCOUNT_INIT(2),
+ .flags = PF_KTHREAD,
++#ifdef CONFIG_SCHED_ALT
++ .prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
++ .static_prio = DEFAULT_PRIO,
++ .normal_prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
++#else
+ .prio = MAX_PRIO - 20,
+ .static_prio = MAX_PRIO - 20,
+ .normal_prio = MAX_PRIO - 20,
++#endif
+ .policy = SCHED_NORMAL,
+ .cpus_ptr = &init_task.cpus_mask,
+ .user_cpus_ptr = NULL,
+@@ -88,6 +94,17 @@ struct task_struct init_task
+ .restart_block = {
+ .fn = do_no_restart_syscall,
+ },
++#ifdef CONFIG_SCHED_ALT
++ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
++#ifdef CONFIG_SCHED_BMQ
++ .boost_prio = 0,
++ .sq_idx = 15,
++#endif
++#ifdef CONFIG_SCHED_PDS
++ .deadline = 0,
++#endif
++ .time_slice = HZ,
++#else
+ .se = {
+ .group_node = LIST_HEAD_INIT(init_task.se.group_node),
+ },
+@@ -95,6 +112,7 @@ struct task_struct init_task
+ .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
+ .time_slice = RR_TIMESLICE,
+ },
++#endif
+ .tasks = LIST_HEAD_INIT(init_task.tasks),
+ #ifdef CONFIG_SMP
+ .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
+diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
+index c2f1fd95a821..41654679b1b2 100644
+--- a/kernel/Kconfig.preempt
++++ b/kernel/Kconfig.preempt
+@@ -117,7 +117,7 @@ config PREEMPT_DYNAMIC
+
+ config SCHED_CORE
+ bool "Core Scheduling for SMT"
+- depends on SCHED_SMT
++ depends on SCHED_SMT && !SCHED_ALT
+ help
+ This option permits Core Scheduling, a means of coordinated task
+ selection across SMT siblings. When enabled -- see
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index 71a418858a5e..7e3016873db1 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -704,7 +704,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
+ return ret;
+ }
+
+-#ifdef CONFIG_SMP
++#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * Helper routine for generate_sched_domains().
+ * Do cpusets a, b have overlapping effective cpus_allowed masks?
+@@ -1100,7 +1100,7 @@ static void rebuild_sched_domains_locked(void)
+ /* Have scheduler rebuild the domains */
+ partition_and_rebuild_sched_domains(ndoms, doms, attr);
+ }
+-#else /* !CONFIG_SMP */
++#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
+ static void rebuild_sched_domains_locked(void)
+ {
+ }
+diff --git a/kernel/delayacct.c b/kernel/delayacct.c
+index c5e8cea9e05f..8e90b2a3667a 100644
+--- a/kernel/delayacct.c
++++ b/kernel/delayacct.c
+@@ -130,7 +130,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
+ */
+ t1 = tsk->sched_info.pcount;
+ t2 = tsk->sched_info.run_delay;
+- t3 = tsk->se.sum_exec_runtime;
++ t3 = tsk_seruntime(tsk);
+
+ d->cpu_count += t1;
+
+diff --git a/kernel/exit.c b/kernel/exit.c
+index f072959fcab7..da97095a2997 100644
+--- a/kernel/exit.c
++++ b/kernel/exit.c
+@@ -124,7 +124,7 @@ static void __exit_signal(struct task_struct *tsk)
+ sig->curr_target = next_thread(tsk);
+ }
+
+- add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
++ add_device_randomness((const void*) &tsk_seruntime(tsk),
+ sizeof(unsigned long long));
+
+ /*
+@@ -145,7 +145,7 @@ static void __exit_signal(struct task_struct *tsk)
+ sig->inblock += task_io_get_inblock(tsk);
+ sig->oublock += task_io_get_oublock(tsk);
+ task_io_accounting_add(&sig->ioac, &tsk->ioac);
+- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
++ sig->sum_sched_runtime += tsk_seruntime(tsk);
+ sig->nr_threads--;
+ __unhash_process(tsk, group_dead);
+ write_sequnlock(&sig->stats_lock);
+diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
+index 8555c4efe97c..a2b3bd3fd85c 100644
+--- a/kernel/locking/rtmutex.c
++++ b/kernel/locking/rtmutex.c
+@@ -298,21 +298,25 @@ static __always_inline void
+ waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+ {
+ waiter->prio = __waiter_prio(task);
+- waiter->deadline = task->dl.deadline;
++ waiter->deadline = __tsk_deadline(task);
+ }
+
+ /*
+ * Only use with rt_mutex_waiter_{less,equal}()
+ */
+ #define task_to_waiter(p) \
+- &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
++ &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
+
+ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
+ struct rt_mutex_waiter *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline < right->deadline);
++#else
+ if (left->prio < right->prio)
+ return 1;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -321,16 +325,22 @@ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
+ */
+ if (dl_prio(left->prio))
+ return dl_time_before(left->deadline, right->deadline);
++#endif
+
+ return 0;
++#endif
+ }
+
+ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
+ struct rt_mutex_waiter *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline == right->deadline);
++#else
+ if (left->prio != right->prio)
+ return 0;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -339,8 +349,10 @@ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
+ */
+ if (dl_prio(left->prio))
+ return left->deadline == right->deadline;
++#endif
+
+ return 1;
++#endif
+ }
+
+ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
+index 976092b7bd45..31d587c16ec1 100644
+--- a/kernel/sched/Makefile
++++ b/kernel/sched/Makefile
+@@ -28,7 +28,12 @@ endif
+ # These compilation units have roughly the same size and complexity - so their
+ # build parallelizes well and finishes roughly at once:
+ #
++ifdef CONFIG_SCHED_ALT
++obj-y += alt_core.o
++obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
++else
+ obj-y += core.o
+ obj-y += fair.o
++endif
+ obj-y += build_policy.o
+ obj-y += build_utility.o
+diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
+new file mode 100644
+index 000000000000..a466a05301b8
+--- /dev/null
++++ b/kernel/sched/alt_core.c
+@@ -0,0 +1,7765 @@
++/*
++ * kernel/sched/alt_core.c
++ *
++ * Core alternative kernel scheduler code and related syscalls
++ *
++ * Copyright (C) 1991-2002 Linus Torvalds
++ *
++ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
++ * a whole lot of those previous things.
++ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
++ * scheduler by Alfred Chen.
++ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
++ */
++#include <linux/sched/cputime.h>
++#include <linux/sched/debug.h>
++#include <linux/sched/isolation.h>
++#include <linux/sched/loadavg.h>
++#include <linux/sched/mm.h>
++#include <linux/sched/nohz.h>
++#include <linux/sched/stat.h>
++#include <linux/sched/wake_q.h>
++
++#include <linux/blkdev.h>
++#include <linux/context_tracking.h>
++#include <linux/cpuset.h>
++#include <linux/delayacct.h>
++#include <linux/init_task.h>
++#include <linux/kcov.h>
++#include <linux/kprobes.h>
++#include <linux/profile.h>
++#include <linux/nmi.h>
++
++#include <uapi/linux/sched/types.h>
++
++#include <asm/switch_to.h>
++
++#define CREATE_TRACE_POINTS
++#include <trace/events/sched.h>
++#undef CREATE_TRACE_POINTS
++
++#include "sched.h"
++
++#include "../../fs/io-wq.h"
++#include "../smpboot.h"
++
++/*
++ * Export tracepoints that act as a bare tracehook (ie: have no trace event
++ * associated with them) to allow external modules to probe them.
++ */
++EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
++
++#ifdef CONFIG_SCHED_DEBUG
++#define sched_feat(x) (1)
++/*
++ * Print a warning if need_resched is set for the given duration (if
++ * LATENCY_WARN is enabled).
++ *
++ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
++ * per boot.
++ */
++__read_mostly int sysctl_resched_latency_warn_ms = 100;
++__read_mostly int sysctl_resched_latency_warn_once = 1;
++#else
++#define sched_feat(x) (0)
++#endif /* CONFIG_SCHED_DEBUG */
++
++#define ALT_SCHED_VERSION "v5.18-r1"
++
++/* rt_prio(prio) defined in include/linux/sched/rt.h */
++#define rt_task(p) rt_prio((p)->prio)
++#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
++#define task_has_rt_policy(p) (rt_policy((p)->policy))
++
++#define STOP_PRIO (MAX_RT_PRIO - 1)
++
++/* Default time slice is 4 in ms, can be set via kernel parameter "sched_timeslice" */
++u64 sched_timeslice_ns __read_mostly = (4 << 20);
++
++static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx);
++
++#ifdef CONFIG_SCHED_BMQ
++#include "bmq.h"
++#endif
++#ifdef CONFIG_SCHED_PDS
++#include "pds.h"
++#endif
++
++static int __init sched_timeslice(char *str)
++{
++ int timeslice_ms;
++
++ get_option(&str, ×lice_ms);
++ if (2 != timeslice_ms)
++ timeslice_ms = 4;
++ sched_timeslice_ns = timeslice_ms << 20;
++ sched_timeslice_imp(timeslice_ms);
++
++ return 0;
++}
++early_param("sched_timeslice", sched_timeslice);
++
++/* Reschedule if less than this many μs left */
++#define RESCHED_NS (100 << 10)
++
++/**
++ * sched_yield_type - Choose what sort of yield sched_yield will perform.
++ * 0: No yield.
++ * 1: Deboost and requeue task. (default)
++ * 2: Set rq skip task.
++ */
++int sched_yield_type __read_mostly = 1;
++
++#ifdef CONFIG_SMP
++static cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
++
++DEFINE_PER_CPU(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++DEFINE_PER_CPU(cpumask_t *, sched_cpu_llc_mask);
++DEFINE_PER_CPU(cpumask_t *, sched_cpu_topo_end_mask);
++
++#ifdef CONFIG_SCHED_SMT
++DEFINE_STATIC_KEY_FALSE(sched_smt_present);
++EXPORT_SYMBOL_GPL(sched_smt_present);
++#endif
++
++/*
++ * Keep a unique ID per domain (we use the first CPUs number in the cpumask of
++ * the domain), this allows us to quickly tell if two cpus are in the same cache
++ * domain, see cpus_share_cache().
++ */
++DEFINE_PER_CPU(int, sd_llc_id);
++#endif /* CONFIG_SMP */
++
++static DEFINE_MUTEX(sched_hotcpu_mutex);
++
++DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++#ifdef CONFIG_SCHED_SMT
++static cpumask_t sched_sg_idle_mask ____cacheline_aligned_in_smp;
++#endif
++static cpumask_t sched_rq_watermark[SCHED_BITS] ____cacheline_aligned_in_smp;
++
++/* sched_queue related functions */
++static inline void sched_queue_init(struct sched_queue *q)
++{
++ int i;
++
++ bitmap_zero(q->bitmap, SCHED_BITS);
++ for(i = 0; i < SCHED_BITS; i++)
++ INIT_LIST_HEAD(&q->heads[i]);
++}
++
++/*
++ * Init idle task and put into queue structure of rq
++ * IMPORTANT: may be called multiple times for a single cpu
++ */
++static inline void sched_queue_init_idle(struct sched_queue *q,
++ struct task_struct *idle)
++{
++ idle->sq_idx = IDLE_TASK_SCHED_PRIO;
++ INIT_LIST_HEAD(&q->heads[idle->sq_idx]);
++ list_add(&idle->sq_node, &q->heads[idle->sq_idx]);
++}
++
++/* water mark related functions */
++static inline void update_sched_rq_watermark(struct rq *rq)
++{
++ unsigned long watermark = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
++ unsigned long last_wm = rq->watermark;
++ unsigned long i;
++ int cpu;
++
++ if (watermark == last_wm)
++ return;
++
++ rq->watermark = watermark;
++ cpu = cpu_of(rq);
++ if (watermark < last_wm) {
++ for (i = last_wm; i > watermark; i--)
++ cpumask_clear_cpu(cpu, sched_rq_watermark + SCHED_BITS - 1 - i);
++#ifdef CONFIG_SCHED_SMT
++ if (static_branch_likely(&sched_smt_present) &&
++ IDLE_TASK_SCHED_PRIO == last_wm)
++ cpumask_andnot(&sched_sg_idle_mask,
++ &sched_sg_idle_mask, cpu_smt_mask(cpu));
++#endif
++ return;
++ }
++ /* last_wm < watermark */
++ for (i = watermark; i > last_wm; i--)
++ cpumask_set_cpu(cpu, sched_rq_watermark + SCHED_BITS - 1 - i);
++#ifdef CONFIG_SCHED_SMT
++ if (static_branch_likely(&sched_smt_present) &&
++ IDLE_TASK_SCHED_PRIO == watermark) {
++ cpumask_t tmp;
++
++ cpumask_and(&tmp, cpu_smt_mask(cpu), sched_rq_watermark);
++ if (cpumask_equal(&tmp, cpu_smt_mask(cpu)))
++ cpumask_or(&sched_sg_idle_mask,
++ &sched_sg_idle_mask, cpu_smt_mask(cpu));
++ }
++#endif
++}
++
++/*
++ * This routine assume that the idle task always in queue
++ */
++static inline struct task_struct *sched_rq_first_task(struct rq *rq)
++{
++ unsigned long idx = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
++ const struct list_head *head = &rq->queue.heads[sched_prio2idx(idx, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++}
++
++static inline struct task_struct *
++sched_rq_next_task(struct task_struct *p, struct rq *rq)
++{
++ unsigned long idx = p->sq_idx;
++ struct list_head *head = &rq->queue.heads[idx];
++
++ if (list_is_last(&p->sq_node, head)) {
++ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
++ sched_idx2prio(idx, rq) + 1);
++ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++ }
++
++ return list_next_entry(p, sq_node);
++}
++
++static inline struct task_struct *rq_runnable_task(struct rq *rq)
++{
++ struct task_struct *next = sched_rq_first_task(rq);
++
++ if (unlikely(next == rq->skip))
++ next = sched_rq_next_task(next, rq);
++
++ return next;
++}
++
++/*
++ * Serialization rules:
++ *
++ * Lock order:
++ *
++ * p->pi_lock
++ * rq->lock
++ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
++ *
++ * rq1->lock
++ * rq2->lock where: rq1 < rq2
++ *
++ * Regular state:
++ *
++ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
++ * local CPU's rq->lock, it optionally removes the task from the runqueue and
++ * always looks at the local rq data structures to find the most eligible task
++ * to run next.
++ *
++ * Task enqueue is also under rq->lock, possibly taken from another CPU.
++ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
++ * the local CPU to avoid bouncing the runqueue state around [ see
++ * ttwu_queue_wakelist() ]
++ *
++ * Task wakeup, specifically wakeups that involve migration, are horribly
++ * complicated to avoid having to take two rq->locks.
++ *
++ * Special state:
++ *
++ * System-calls and anything external will use task_rq_lock() which acquires
++ * both p->pi_lock and rq->lock. As a consequence the state they change is
++ * stable while holding either lock:
++ *
++ * - sched_setaffinity()/
++ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
++ * - set_user_nice(): p->se.load, p->*prio
++ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
++ * p->se.load, p->rt_priority,
++ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
++ * - sched_setnuma(): p->numa_preferred_nid
++ * - sched_move_task()/
++ * cpu_cgroup_fork(): p->sched_task_group
++ * - uclamp_update_active() p->uclamp*
++ *
++ * p->state <- TASK_*:
++ *
++ * is changed locklessly using set_current_state(), __set_current_state() or
++ * set_special_state(), see their respective comments, or by
++ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
++ * concurrent self.
++ *
++ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
++ *
++ * is set by activate_task() and cleared by deactivate_task(), under
++ * rq->lock. Non-zero indicates the task is runnable, the special
++ * ON_RQ_MIGRATING state is used for migration without holding both
++ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
++ *
++ * p->on_cpu <- { 0, 1 }:
++ *
++ * is set by prepare_task() and cleared by finish_task() such that it will be
++ * set before p is scheduled-in and cleared after p is scheduled-out, both
++ * under rq->lock. Non-zero indicates the task is running on its CPU.
++ *
++ * [ The astute reader will observe that it is possible for two tasks on one
++ * CPU to have ->on_cpu = 1 at the same time. ]
++ *
++ * task_cpu(p): is changed by set_task_cpu(), the rules are:
++ *
++ * - Don't call set_task_cpu() on a blocked task:
++ *
++ * We don't care what CPU we're not running on, this simplifies hotplug,
++ * the CPU assignment of blocked tasks isn't required to be valid.
++ *
++ * - for try_to_wake_up(), called under p->pi_lock:
++ *
++ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
++ *
++ * - for migration called under rq->lock:
++ * [ see task_on_rq_migrating() in task_rq_lock() ]
++ *
++ * o move_queued_task()
++ * o detach_task()
++ *
++ * - for migration called under double_rq_lock():
++ *
++ * o __migrate_swap_task()
++ * o push_rt_task() / pull_rt_task()
++ * o push_dl_task() / pull_dl_task()
++ * o dl_task_offline_migration()
++ *
++ */
++
++/*
++ * Context: p->pi_lock
++ */
++static inline struct rq
++*__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock(&rq->lock);
++ if (likely((p->on_cpu || task_on_rq_queued(p))
++ && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ *plock = NULL;
++ return rq;
++ }
++ }
++}
++
++static inline void
++__task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
++{
++ if (NULL != lock)
++ raw_spin_unlock(lock);
++}
++
++static inline struct rq
++*task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock,
++ unsigned long *flags)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock_irqsave(&rq->lock, *flags);
++ if (likely((p->on_cpu || task_on_rq_queued(p))
++ && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, *flags);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ raw_spin_lock_irqsave(&p->pi_lock, *flags);
++ if (likely(!p->on_cpu && !p->on_rq &&
++ rq == task_rq(p))) {
++ *plock = &p->pi_lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
++ }
++ }
++}
++
++static inline void
++task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock,
++ unsigned long *flags)
++{
++ raw_spin_unlock_irqrestore(lock, *flags);
++}
++
++/*
++ * __task_rq_lock - lock the rq @p resides on.
++ */
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ lockdep_assert_held(&p->pi_lock);
++
++ for (;;) {
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
++ return rq;
++ raw_spin_unlock(&rq->lock);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++/*
++ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
++ */
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ for (;;) {
++ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ /*
++ * move_queued_task() task_rq_lock()
++ *
++ * ACQUIRE (rq->lock)
++ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
++ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
++ * [S] ->cpu = new_cpu [L] task_rq()
++ * [L] ->on_rq
++ * RELEASE (rq->lock)
++ *
++ * If we observe the old CPU in task_rq_lock(), the acquire of
++ * the old rq->lock will fully serialize against the stores.
++ *
++ * If we observe the new CPU in task_rq_lock(), the address
++ * dependency headed by '[L] rq = task_rq()' and the acquire
++ * will pair with the WMB to ensure we then also see migrating.
++ */
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++static inline void
++rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irqsave(&rq->lock, rf->flags);
++}
++
++static inline void
++rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
++}
++
++void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
++{
++ raw_spinlock_t *lock;
++
++ /* Matches synchronize_rcu() in __sched_core_enable() */
++ preempt_disable();
++
++ for (;;) {
++ lock = __rq_lockp(rq);
++ raw_spin_lock_nested(lock, subclass);
++ if (likely(lock == __rq_lockp(rq))) {
++ /* preempt_count *MUST* be > 1 */
++ preempt_enable_no_resched();
++ return;
++ }
++ raw_spin_unlock(lock);
++ }
++}
++
++void raw_spin_rq_unlock(struct rq *rq)
++{
++ raw_spin_unlock(rq_lockp(rq));
++}
++
++/*
++ * RQ-clock updating methods:
++ */
++
++static void update_rq_clock_task(struct rq *rq, s64 delta)
++{
++/*
++ * In theory, the compile should just see 0 here, and optimize out the call
++ * to sched_rt_avg_update. But I don't trust it...
++ */
++ s64 __maybe_unused steal = 0, irq_delta = 0;
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
++
++ /*
++ * Since irq_time is only updated on {soft,}irq_exit, we might run into
++ * this case when a previous update_rq_clock() happened inside a
++ * {soft,}irq region.
++ *
++ * When this happens, we stop ->clock_task and only update the
++ * prev_irq_time stamp to account for the part that fit, so that a next
++ * update will consume the rest. This ensures ->clock_task is
++ * monotonic.
++ *
++ * It does however cause some slight miss-attribution of {soft,}irq
++ * time, a more accurate solution would be to update the irq_time using
++ * the current rq->clock timestamp, except that would require using
++ * atomic ops.
++ */
++ if (irq_delta > delta)
++ irq_delta = delta;
++
++ rq->prev_irq_time += irq_delta;
++ delta -= irq_delta;
++#endif
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++ if (static_key_false((¶virt_steal_rq_enabled))) {
++ steal = paravirt_steal_clock(cpu_of(rq));
++ steal -= rq->prev_steal_time_rq;
++
++ if (unlikely(steal > delta))
++ steal = delta;
++
++ rq->prev_steal_time_rq += steal;
++ delta -= steal;
++ }
++#endif
++
++ rq->clock_task += delta;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ if ((irq_delta + steal))
++ update_irq_load_avg(rq, irq_delta + steal);
++#endif
++}
++
++static inline void update_rq_clock(struct rq *rq)
++{
++ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
++
++ if (unlikely(delta <= 0))
++ return;
++ rq->clock += delta;
++ update_rq_time_edge(rq);
++ update_rq_clock_task(rq, delta);
++}
++
++/*
++ * RQ Load update routine
++ */
++#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
++#define RQ_UTIL_SHIFT (8)
++#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
++
++#define LOAD_BLOCK(t) ((t) >> 17)
++#define LOAD_HALF_BLOCK(t) ((t) >> 16)
++#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
++#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
++#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
++
++static inline void rq_load_update(struct rq *rq)
++{
++ u64 time = rq->clock;
++ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp),
++ RQ_LOAD_HISTORY_BITS - 1);
++ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
++ u64 curr = !!rq->nr_running;
++
++ if (delta) {
++ rq->load_history = rq->load_history >> delta;
++
++ if (delta < RQ_UTIL_SHIFT) {
++ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
++ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
++ rq->load_history ^= LOAD_BLOCK_BIT(delta);
++ }
++
++ rq->load_block = BLOCK_MASK(time) * prev;
++ } else {
++ rq->load_block += (time - rq->load_stamp) * prev;
++ }
++ if (prev ^ curr)
++ rq->load_history ^= CURRENT_LOAD_BIT;
++ rq->load_stamp = time;
++}
++
++unsigned long rq_load_util(struct rq *rq, unsigned long max)
++{
++ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
++}
++
++#ifdef CONFIG_SMP
++unsigned long sched_cpu_util(int cpu, unsigned long max)
++{
++ return rq_load_util(cpu_rq(cpu), max);
++}
++#endif /* CONFIG_SMP */
++
++#ifdef CONFIG_CPU_FREQ
++/**
++ * cpufreq_update_util - Take a note about CPU utilization changes.
++ * @rq: Runqueue to carry out the update for.
++ * @flags: Update reason flags.
++ *
++ * This function is called by the scheduler on the CPU whose utilization is
++ * being updated.
++ *
++ * It can only be called from RCU-sched read-side critical sections.
++ *
++ * The way cpufreq is currently arranged requires it to evaluate the CPU
++ * performance state (frequency/voltage) on a regular basis to prevent it from
++ * being stuck in a completely inadequate performance level for too long.
++ * That is not guaranteed to happen if the updates are only triggered from CFS
++ * and DL, though, because they may not be coming in if only RT tasks are
++ * active all the time (or there are RT tasks only).
++ *
++ * As a workaround for that issue, this function is called periodically by the
++ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
++ * but that really is a band-aid. Going forward it should be replaced with
++ * solutions targeted more specifically at RT tasks.
++ */
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++ struct update_util_data *data;
++
++#ifdef CONFIG_SMP
++ rq_load_update(rq);
++#endif
++ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
++ cpu_of(rq)));
++ if (data)
++ data->func(data, rq_clock(rq), flags);
++}
++#else
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++#ifdef CONFIG_SMP
++ rq_load_update(rq);
++#endif
++}
++#endif /* CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++/*
++ * Tick may be needed by tasks in the runqueue depending on their policy and
++ * requirements. If tick is needed, lets send the target an IPI to kick it out
++ * of nohz mode if necessary.
++ */
++static inline void sched_update_tick_dependency(struct rq *rq)
++{
++ int cpu = cpu_of(rq);
++
++ if (!tick_nohz_full_cpu(cpu))
++ return;
++
++ if (rq->nr_running < 2)
++ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
++ else
++ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
++}
++#else /* !CONFIG_NO_HZ_FULL */
++static inline void sched_update_tick_dependency(struct rq *rq) { }
++#endif
++
++bool sched_task_on_rq(struct task_struct *p)
++{
++ return task_on_rq_queued(p);
++}
++
++unsigned long get_wchan(struct task_struct *p)
++{
++ unsigned long ip = 0;
++ unsigned int state;
++
++ if (!p || p == current)
++ return 0;
++
++ /* Only get wchan if task is blocked and we can keep it that way. */
++ raw_spin_lock_irq(&p->pi_lock);
++ state = READ_ONCE(p->__state);
++ smp_rmb(); /* see try_to_wake_up() */
++ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
++ ip = __get_wchan(p);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ return ip;
++}
++
++/*
++ * Add/Remove/Requeue task to/from the runqueue routines
++ * Context: rq->lock
++ */
++#define __SCHED_DEQUEUE_TASK(p, rq, flags) \
++ psi_dequeue(p, flags & DEQUEUE_SLEEP); \
++ sched_info_dequeue(rq, p); \
++ \
++ list_del(&p->sq_node); \
++ if (list_empty(&rq->queue.heads[p->sq_idx])) \
++ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++
++#define __SCHED_ENQUEUE_TASK(p, rq, flags) \
++ sched_info_enqueue(rq, p); \
++ psi_enqueue(p, flags); \
++ \
++ p->sq_idx = task_sched_prio_idx(p, rq); \
++ list_add_tail(&p->sq_node, &rq->queue.heads[p->sq_idx]); \
++ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++
++static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->priodl);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++
++ __SCHED_DEQUEUE_TASK(p, rq, flags);
++ --rq->nr_running;
++#ifdef CONFIG_SMP
++ if (1 == rq->nr_running)
++ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
++#endif
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: enqueue(%d) %px %016llx\n", cpu_of(rq), p, p->priodl);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++
++ __SCHED_ENQUEUE_TASK(p, rq, flags);
++ update_sched_rq_watermark(rq);
++ ++rq->nr_running;
++#ifdef CONFIG_SMP
++ if (2 == rq->nr_running)
++ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
++#endif
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx)
++{
++ lockdep_assert_held(&rq->lock);
++ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->priodl);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
++ cpu_of(rq), task_cpu(p));
++
++ list_del(&p->sq_node);
++ list_add_tail(&p->sq_node, &rq->queue.heads[idx]);
++ if (idx != p->sq_idx) {
++ if (list_empty(&rq->queue.heads[p->sq_idx]))
++ clear_bit(sched_idx2prio(p->sq_idx, rq),
++ rq->queue.bitmap);
++ p->sq_idx = idx;
++ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++ update_sched_rq_watermark(rq);
++ }
++}
++
++/*
++ * cmpxchg based fetch_or, macro so it works for different integer types
++ */
++#define fetch_or(ptr, mask) \
++ ({ \
++ typeof(ptr) _ptr = (ptr); \
++ typeof(mask) _mask = (mask); \
++ typeof(*_ptr) _old, _val = *_ptr; \
++ \
++ for (;;) { \
++ _old = cmpxchg(_ptr, _val, _val | _mask); \
++ if (_old == _val) \
++ break; \
++ _val = _old; \
++ } \
++ _old; \
++})
++
++#if defined(CONFIG_SMP) && defined(TIF_POLLING_NRFLAG)
++/*
++ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
++ * this avoids any races wrt polling state changes and thereby avoids
++ * spurious IPIs.
++ */
++static bool set_nr_and_not_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
++}
++
++/*
++ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
++ *
++ * If this returns true, then the idle task promises to call
++ * sched_ttwu_pending() and reschedule soon.
++ */
++static bool set_nr_if_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ typeof(ti->flags) old, val = READ_ONCE(ti->flags);
++
++ for (;;) {
++ if (!(val & _TIF_POLLING_NRFLAG))
++ return false;
++ if (val & _TIF_NEED_RESCHED)
++ return true;
++ old = cmpxchg(&ti->flags, val, val | _TIF_NEED_RESCHED);
++ if (old == val)
++ break;
++ val = old;
++ }
++ return true;
++}
++
++#else
++static bool set_nr_and_not_polling(struct task_struct *p)
++{
++ set_tsk_need_resched(p);
++ return true;
++}
++
++#ifdef CONFIG_SMP
++static bool set_nr_if_polling(struct task_struct *p)
++{
++ return false;
++}
++#endif
++#endif
++
++static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ struct wake_q_node *node = &task->wake_q;
++
++ /*
++ * Atomically grab the task, if ->wake_q is !nil already it means
++ * it's already queued (either by us or someone else) and will get the
++ * wakeup due to that.
++ *
++ * In order to ensure that a pending wakeup will observe our pending
++ * state, even in the failed case, an explicit smp_mb() must be used.
++ */
++ smp_mb__before_atomic();
++ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
++ return false;
++
++ /*
++ * The head is context local, there can be no concurrency.
++ */
++ *head->lastp = node;
++ head->lastp = &node->next;
++ return true;
++}
++
++/**
++ * wake_q_add() - queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ */
++void wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ if (__wake_q_add(head, task))
++ get_task_struct(task);
++}
++
++/**
++ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ *
++ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
++ * that already hold reference to @task can call the 'safe' version and trust
++ * wake_q to do the right thing depending whether or not the @task is already
++ * queued for wakeup.
++ */
++void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
++{
++ if (!__wake_q_add(head, task))
++ put_task_struct(task);
++}
++
++void wake_up_q(struct wake_q_head *head)
++{
++ struct wake_q_node *node = head->first;
++
++ while (node != WAKE_Q_TAIL) {
++ struct task_struct *task;
++
++ task = container_of(node, struct task_struct, wake_q);
++ /* task can safely be re-inserted now: */
++ node = node->next;
++ task->wake_q.next = NULL;
++
++ /*
++ * wake_up_process() executes a full barrier, which pairs with
++ * the queueing in wake_q_add() so as not to miss wakeups.
++ */
++ wake_up_process(task);
++ put_task_struct(task);
++ }
++}
++
++/*
++ * resched_curr - mark rq's current task 'to be rescheduled now'.
++ *
++ * On UP this means the setting of the need_resched flag, on SMP it
++ * might also involve a cross-CPU call to trigger the scheduler on
++ * the target CPU.
++ */
++void resched_curr(struct rq *rq)
++{
++ struct task_struct *curr = rq->curr;
++ int cpu;
++
++ lockdep_assert_held(&rq->lock);
++
++ if (test_tsk_need_resched(curr))
++ return;
++
++ cpu = cpu_of(rq);
++ if (cpu == smp_processor_id()) {
++ set_tsk_need_resched(curr);
++ set_preempt_need_resched();
++ return;
++ }
++
++ if (set_nr_and_not_polling(curr))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++void resched_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (cpu_online(cpu) || cpu == smp_processor_id())
++ resched_curr(cpu_rq(cpu));
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++#ifdef CONFIG_SMP
++#ifdef CONFIG_NO_HZ_COMMON
++void nohz_balance_enter_idle(int cpu) {}
++
++void select_nohz_load_balancer(int stop_tick) {}
++
++void set_cpu_sd_state_idle(void) {}
++
++/*
++ * In the semi idle case, use the nearest busy CPU for migrating timers
++ * from an idle CPU. This is good for power-savings.
++ *
++ * We don't do similar optimization for completely idle system, as
++ * selecting an idle CPU will add more delays to the timers than intended
++ * (as that CPU's timer base may not be uptodate wrt jiffies etc).
++ */
++int get_nohz_timer_target(void)
++{
++ int i, cpu = smp_processor_id(), default_cpu = -1;
++ struct cpumask *mask;
++ const struct cpumask *hk_mask;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TIMER)) {
++ if (!idle_cpu(cpu))
++ return cpu;
++ default_cpu = cpu;
++ }
++
++ hk_mask = housekeeping_cpumask(HK_TYPE_TIMER);
++
++ for (mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
++ for_each_cpu_and(i, mask, hk_mask)
++ if (!idle_cpu(i))
++ return i;
++
++ if (default_cpu == -1)
++ default_cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
++ cpu = default_cpu;
++
++ return cpu;
++}
++
++/*
++ * When add_timer_on() enqueues a timer into the timer wheel of an
++ * idle CPU then this timer might expire before the next timer event
++ * which is scheduled to wake up that CPU. In case of a completely
++ * idle system the next event might even be infinite time into the
++ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
++ * leaves the inner idle loop so the newly added timer is taken into
++ * account when the CPU goes back to idle and evaluates the timer
++ * wheel for the next timer event.
++ */
++static inline void wake_up_idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (cpu == smp_processor_id())
++ return;
++
++ if (set_nr_and_not_polling(rq->idle))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++static inline bool wake_up_full_nohz_cpu(int cpu)
++{
++ /*
++ * We just need the target to call irq_exit() and re-evaluate
++ * the next tick. The nohz full kick at least implies that.
++ * If needed we can still optimize that later with an
++ * empty IRQ.
++ */
++ if (cpu_is_offline(cpu))
++ return true; /* Don't try to wake offline CPUs. */
++ if (tick_nohz_full_cpu(cpu)) {
++ if (cpu != smp_processor_id() ||
++ tick_nohz_tick_stopped())
++ tick_nohz_full_kick_cpu(cpu);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_nohz_cpu(int cpu)
++{
++ if (!wake_up_full_nohz_cpu(cpu))
++ wake_up_idle_cpu(cpu);
++}
++
++static void nohz_csd_func(void *info)
++{
++ struct rq *rq = info;
++ int cpu = cpu_of(rq);
++ unsigned int flags;
++
++ /*
++ * Release the rq::nohz_csd.
++ */
++ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
++ WARN_ON(!(flags & NOHZ_KICK_MASK));
++
++ rq->idle_balance = idle_cpu(cpu);
++ if (rq->idle_balance && !need_resched()) {
++ rq->nohz_idle_balance = flags;
++ raise_softirq_irqoff(SCHED_SOFTIRQ);
++ }
++}
++
++#endif /* CONFIG_NO_HZ_COMMON */
++#endif /* CONFIG_SMP */
++
++static inline void check_preempt_curr(struct rq *rq)
++{
++ if (sched_rq_first_task(rq) != rq->curr)
++ resched_curr(rq);
++}
++
++#ifdef CONFIG_SCHED_HRTICK
++/*
++ * Use HR-timers to deliver accurate preemption points.
++ */
++
++static void hrtick_clear(struct rq *rq)
++{
++ if (hrtimer_active(&rq->hrtick_timer))
++ hrtimer_cancel(&rq->hrtick_timer);
++}
++
++/*
++ * High-resolution timer tick.
++ * Runs from hardirq context with interrupts disabled.
++ */
++static enum hrtimer_restart hrtick(struct hrtimer *timer)
++{
++ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
++
++ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
++
++ raw_spin_lock(&rq->lock);
++ resched_curr(rq);
++ raw_spin_unlock(&rq->lock);
++
++ return HRTIMER_NORESTART;
++}
++
++/*
++ * Use hrtick when:
++ * - enabled by features
++ * - hrtimer is actually high res
++ */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ /**
++ * Alt schedule FW doesn't support sched_feat yet
++ if (!sched_feat(HRTICK))
++ return 0;
++ */
++ if (!cpu_active(cpu_of(rq)))
++ return 0;
++ return hrtimer_is_hres_active(&rq->hrtick_timer);
++}
++
++#ifdef CONFIG_SMP
++
++static void __hrtick_restart(struct rq *rq)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ ktime_t time = rq->hrtick_time;
++
++ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
++}
++
++/*
++ * called from hardirq (IPI) context
++ */
++static void __hrtick_start(void *arg)
++{
++ struct rq *rq = arg;
++
++ raw_spin_lock(&rq->lock);
++ __hrtick_restart(rq);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and irqs disabled
++ */
++void hrtick_start(struct rq *rq, u64 delay)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ s64 delta;
++
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense and can cause timer DoS.
++ */
++ delta = max_t(s64, delay, 10000LL);
++
++ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
++
++ if (rq == this_rq())
++ __hrtick_restart(rq);
++ else
++ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
++}
++
++#else
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and irqs disabled
++ */
++void hrtick_start(struct rq *rq, u64 delay)
++{
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense. Rely on vruntime for fairness.
++ */
++ delay = max_t(u64, delay, 10000LL);
++ hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
++ HRTIMER_MODE_REL_PINNED_HARD);
++}
++#endif /* CONFIG_SMP */
++
++static void hrtick_rq_init(struct rq *rq)
++{
++#ifdef CONFIG_SMP
++ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
++#endif
++
++ hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
++ rq->hrtick_timer.function = hrtick;
++}
++#else /* CONFIG_SCHED_HRTICK */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ return 0;
++}
++
++static inline void hrtick_clear(struct rq *rq)
++{
++}
++
++static inline void hrtick_rq_init(struct rq *rq)
++{
++}
++#endif /* CONFIG_SCHED_HRTICK */
++
++static inline int __normal_prio(int policy, int rt_prio, int static_prio)
++{
++ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) :
++ static_prio + MAX_PRIORITY_ADJ;
++}
++
++/*
++ * Calculate the expected normal priority: i.e. priority
++ * without taking RT-inheritance into account. Might be
++ * boosted by interactivity modifiers. Changes upon fork,
++ * setprio syscalls, and whenever the interactivity
++ * estimator recalculates.
++ */
++static inline int normal_prio(struct task_struct *p)
++{
++ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
++}
++
++/*
++ * Calculate the current priority, i.e. the priority
++ * taken into account by the scheduler. This value might
++ * be boosted by RT tasks as it will be RT if the task got
++ * RT-boosted. If not then it returns p->normal_prio.
++ */
++static int effective_prio(struct task_struct *p)
++{
++ p->normal_prio = normal_prio(p);
++ /*
++ * If we are RT tasks or we were boosted to RT priority,
++ * keep the priority unchanged. Otherwise, update priority
++ * to the normal priority:
++ */
++ if (!rt_prio(p->prio))
++ return p->normal_prio;
++ return p->prio;
++}
++
++/*
++ * activate_task - move a task to the runqueue.
++ *
++ * Context: rq->lock
++ */
++static void activate_task(struct task_struct *p, struct rq *rq)
++{
++ enqueue_task(p, rq, ENQUEUE_WAKEUP);
++ p->on_rq = TASK_ON_RQ_QUEUED;
++
++ /*
++ * If in_iowait is set, the code below may not trigger any cpufreq
++ * utilization updates, so do it here explicitly with the IOWAIT flag
++ * passed.
++ */
++ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
++}
++
++/*
++ * deactivate_task - remove a task from the runqueue.
++ *
++ * Context: rq->lock
++ */
++static inline void deactivate_task(struct task_struct *p, struct rq *rq)
++{
++ dequeue_task(p, rq, DEQUEUE_SLEEP);
++ p->on_rq = 0;
++ cpufreq_update_util(rq, 0);
++}
++
++static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
++{
++#ifdef CONFIG_SMP
++ /*
++ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
++ * successfully executed on another CPU. We must ensure that updates of
++ * per-task data have been completed by this moment.
++ */
++ smp_wmb();
++
++ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
++#endif
++}
++
++static inline bool is_migration_disabled(struct task_struct *p)
++{
++#ifdef CONFIG_SMP
++ return p->migration_disabled;
++#else
++ return false;
++#endif
++}
++
++#define SCA_CHECK 0x01
++#define SCA_USER 0x08
++
++#ifdef CONFIG_SMP
++
++void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
++{
++#ifdef CONFIG_SCHED_DEBUG
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * We should never call set_task_cpu() on a blocked task,
++ * ttwu() will sort out the placement.
++ */
++ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
++
++#ifdef CONFIG_LOCKDEP
++ /*
++ * The caller should hold either p->pi_lock or rq->lock, when changing
++ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
++ *
++ * sched_move_task() holds both and thus holding either pins the cgroup,
++ * see task_group().
++ */
++ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
++ lockdep_is_held(&task_rq(p)->lock)));
++#endif
++ /*
++ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
++ */
++ WARN_ON_ONCE(!cpu_online(new_cpu));
++
++ WARN_ON_ONCE(is_migration_disabled(p));
++#endif
++ if (task_cpu(p) == new_cpu)
++ return;
++ trace_sched_migrate_task(p, new_cpu);
++ rseq_migrate(p);
++ perf_event_task_migrate(p);
++
++ __set_task_cpu(p, new_cpu);
++}
++
++#define MDF_FORCE_ENABLED 0x80
++
++static void
++__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ /*
++ * This here violates the locking rules for affinity, since we're only
++ * supposed to change these variables while holding both rq->lock and
++ * p->pi_lock.
++ *
++ * HOWEVER, it magically works, because ttwu() is the only code that
++ * accesses these variables under p->pi_lock and only does so after
++ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
++ * before finish_task().
++ *
++ * XXX do further audits, this smells like something putrid.
++ */
++ SCHED_WARN_ON(!p->on_cpu);
++ p->cpus_ptr = new_mask;
++}
++
++void migrate_disable(void)
++{
++ struct task_struct *p = current;
++ int cpu;
++
++ if (p->migration_disabled) {
++ p->migration_disabled++;
++ return;
++ }
++
++ preempt_disable();
++ cpu = smp_processor_id();
++ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
++ cpu_rq(cpu)->nr_pinned++;
++ p->migration_disabled = 1;
++ p->migration_flags &= ~MDF_FORCE_ENABLED;
++
++ /*
++ * Violates locking rules! see comment in __do_set_cpus_ptr().
++ */
++ if (p->cpus_ptr == &p->cpus_mask)
++ __do_set_cpus_ptr(p, cpumask_of(cpu));
++ }
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(migrate_disable);
++
++void migrate_enable(void)
++{
++ struct task_struct *p = current;
++
++ if (0 == p->migration_disabled)
++ return;
++
++ if (p->migration_disabled > 1) {
++ p->migration_disabled--;
++ return;
++ }
++
++ if (WARN_ON_ONCE(!p->migration_disabled))
++ return;
++
++ /*
++ * Ensure stop_task runs either before or after this, and that
++ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
++ */
++ preempt_disable();
++ /*
++ * Assumption: current should be running on allowed cpu
++ */
++ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
++ if (p->cpus_ptr != &p->cpus_mask)
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ /*
++ * Mustn't clear migration_disabled() until cpus_ptr points back at the
++ * regular cpus_mask, otherwise things that race (eg.
++ * select_fallback_rq) get confused.
++ */
++ barrier();
++ p->migration_disabled = 0;
++ this_rq()->nr_pinned--;
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(migrate_enable);
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return rq->nr_pinned;
++}
++
++/*
++ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
++ * __set_cpus_allowed_ptr() and select_fallback_rq().
++ */
++static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
++{
++ /* When not in the task's cpumask, no point in looking further. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /* migrate_disabled() must be allowed to finish. */
++ if (is_migration_disabled(p))
++ return cpu_online(cpu);
++
++ /* Non kernel threads are not allowed during either online or offline. */
++ if (!(p->flags & PF_KTHREAD))
++ return cpu_active(cpu) && task_cpu_possible(cpu, p);
++
++ /* KTHREAD_IS_PER_CPU is always allowed. */
++ if (kthread_is_per_cpu(p))
++ return cpu_online(cpu);
++
++ /* Regular kernel threads don't get to stay during offline. */
++ if (cpu_dying(cpu))
++ return false;
++
++ /* But are allowed during online. */
++ return cpu_online(cpu);
++}
++
++/*
++ * This is how migration works:
++ *
++ * 1) we invoke migration_cpu_stop() on the target CPU using
++ * stop_one_cpu().
++ * 2) stopper starts to run (implicitly forcing the migrated thread
++ * off the CPU)
++ * 3) it checks whether the migrated task is still in the wrong runqueue.
++ * 4) if it's in the wrong runqueue then the migration thread removes
++ * it and puts it into the right queue.
++ * 5) stopper completes and stop_one_cpu() returns and the migration
++ * is done.
++ */
++
++/*
++ * move_queued_task - move a queued task to new rq.
++ *
++ * Returns (locked) new rq. Old rq's lock is released.
++ */
++static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int
++ new_cpu)
++{
++ lockdep_assert_held(&rq->lock);
++
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
++ dequeue_task(p, rq, 0);
++ update_sched_rq_watermark(rq);
++ set_task_cpu(p, new_cpu);
++ raw_spin_unlock(&rq->lock);
++
++ rq = cpu_rq(new_cpu);
++
++ raw_spin_lock(&rq->lock);
++ BUG_ON(task_cpu(p) != new_cpu);
++ sched_task_sanity_check(p, rq);
++ enqueue_task(p, rq, 0);
++ p->on_rq = TASK_ON_RQ_QUEUED;
++ check_preempt_curr(rq);
++
++ return rq;
++}
++
++struct migration_arg {
++ struct task_struct *task;
++ int dest_cpu;
++};
++
++/*
++ * Move (not current) task off this CPU, onto the destination CPU. We're doing
++ * this because either it can't run here any more (set_cpus_allowed()
++ * away from this CPU, or CPU going down), or because we're
++ * attempting to rebalance this task on exec (sched_exec).
++ *
++ * So we race with normal scheduler movements, but that's OK, as long
++ * as the task is no longer on this CPU.
++ */
++static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int
++ dest_cpu)
++{
++ /* Affinity changed (again). */
++ if (!is_cpu_allowed(p, dest_cpu))
++ return rq;
++
++ update_rq_clock(rq);
++ return move_queued_task(rq, p, dest_cpu);
++}
++
++/*
++ * migration_cpu_stop - this will be executed by a highprio stopper thread
++ * and performs thread migration by bumping thread off CPU then
++ * 'pushing' onto another runqueue.
++ */
++static int migration_cpu_stop(void *data)
++{
++ struct migration_arg *arg = data;
++ struct task_struct *p = arg->task;
++ struct rq *rq = this_rq();
++ unsigned long flags;
++
++ /*
++ * The original target CPU might have gone down and we might
++ * be on another CPU but it doesn't matter.
++ */
++ local_irq_save(flags);
++ /*
++ * We need to explicitly wake pending tasks before running
++ * __migrate_task() such that we will not miss enforcing cpus_ptr
++ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
++ */
++ flush_smp_call_function_from_idle();
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++ /*
++ * If task_rq(p) != rq, it cannot be migrated here, because we're
++ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
++ * we're holding p->pi_lock.
++ */
++ if (task_rq(p) == rq && task_on_rq_queued(p))
++ rq = __migrate_task(rq, p, arg->dest_cpu);
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++static inline void
++set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask)
++{
++ cpumask_copy(&p->cpus_mask, new_mask);
++ p->nr_cpus_allowed = cpumask_weight(new_mask);
++}
++
++static void
++__do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
++{
++ lockdep_assert_held(&p->pi_lock);
++ set_cpus_allowed_common(p, new_mask);
++}
++
++void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
++{
++ __do_set_cpus_allowed(p, new_mask);
++}
++
++int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
++ int node)
++{
++ if (!src->user_cpus_ptr)
++ return 0;
++
++ dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
++ if (!dst->user_cpus_ptr)
++ return -ENOMEM;
++
++ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
++ return 0;
++}
++
++static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
++{
++ struct cpumask *user_mask = NULL;
++
++ swap(p->user_cpus_ptr, user_mask);
++
++ return user_mask;
++}
++
++void release_user_cpus_ptr(struct task_struct *p)
++{
++ kfree(clear_user_cpus_ptr(p));
++}
++
++#endif
++
++/**
++ * task_curr - is this task currently executing on a CPU?
++ * @p: the task in question.
++ *
++ * Return: 1 if the task is currently executing. 0 otherwise.
++ */
++inline int task_curr(const struct task_struct *p)
++{
++ return cpu_curr(task_cpu(p)) == p;
++}
++
++#ifdef CONFIG_SMP
++/*
++ * wait_task_inactive - wait for a thread to unschedule.
++ *
++ * If @match_state is nonzero, it's the @p->state value just checked and
++ * not expected to change. If it changes, i.e. @p might have woken up,
++ * then return zero. When we succeed in waiting for @p to be off its CPU,
++ * we return a positive number (its total switch count). If a second call
++ * a short while later returns the same number, the caller can be sure that
++ * @p has remained unscheduled the whole time.
++ *
++ * The caller must ensure that the task *will* unschedule sometime soon,
++ * else this function might spin for a *long* time. This function can't
++ * be called with interrupts off, or it may introduce deadlock with
++ * smp_call_function() if an IPI is sent by the same process we are
++ * waiting to become inactive.
++ */
++unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
++{
++ unsigned long flags;
++ bool running, on_rq;
++ unsigned long ncsw;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ for (;;) {
++ rq = task_rq(p);
++
++ /*
++ * If the task is actively running on another CPU
++ * still, just relax and busy-wait without holding
++ * any locks.
++ *
++ * NOTE! Since we don't hold any locks, it's not
++ * even sure that "rq" stays as the right runqueue!
++ * But we don't care, since this will return false
++ * if the runqueue has changed and p is actually now
++ * running somewhere else!
++ */
++ while (task_running(p) && p == rq->curr) {
++ if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
++ return 0;
++ cpu_relax();
++ }
++
++ /*
++ * Ok, time to look more closely! We need the rq
++ * lock now, to be *sure*. If we're wrong, we'll
++ * just go back and repeat.
++ */
++ task_access_lock_irqsave(p, &lock, &flags);
++ trace_sched_wait_task(p);
++ running = task_running(p);
++ on_rq = p->on_rq;
++ ncsw = 0;
++ if (!match_state || READ_ONCE(p->__state) == match_state)
++ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ /*
++ * If it changed from the expected state, bail out now.
++ */
++ if (unlikely(!ncsw))
++ break;
++
++ /*
++ * Was it really running after all now that we
++ * checked with the proper locks actually held?
++ *
++ * Oops. Go back and try again..
++ */
++ if (unlikely(running)) {
++ cpu_relax();
++ continue;
++ }
++
++ /*
++ * It's not enough that it's not actively running,
++ * it must be off the runqueue _entirely_, and not
++ * preempted!
++ *
++ * So if it was still runnable (but just not actively
++ * running right now), it's preempted, and we should
++ * yield - it could be a while.
++ */
++ if (unlikely(on_rq)) {
++ ktime_t to = NSEC_PER_SEC / HZ;
++
++ set_current_state(TASK_UNINTERRUPTIBLE);
++ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
++ continue;
++ }
++
++ /*
++ * Ahh, all good. It wasn't running, and it wasn't
++ * runnable, which means that it will never become
++ * running in the future either. We're all done!
++ */
++ break;
++ }
++
++ return ncsw;
++}
++
++/***
++ * kick_process - kick a running thread to enter/exit the kernel
++ * @p: the to-be-kicked thread
++ *
++ * Cause a process which is running on another CPU to enter
++ * kernel-mode, without any delay. (to get signals handled.)
++ *
++ * NOTE: this function doesn't have to take the runqueue lock,
++ * because all it wants to ensure is that the remote task enters
++ * the kernel. If the IPI races and the task has been migrated
++ * to another CPU then no harm is done and the purpose has been
++ * achieved as well.
++ */
++void kick_process(struct task_struct *p)
++{
++ int cpu;
++
++ preempt_disable();
++ cpu = task_cpu(p);
++ if ((cpu != smp_processor_id()) && task_curr(p))
++ smp_send_reschedule(cpu);
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(kick_process);
++
++/*
++ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
++ *
++ * A few notes on cpu_active vs cpu_online:
++ *
++ * - cpu_active must be a subset of cpu_online
++ *
++ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
++ * see __set_cpus_allowed_ptr(). At this point the newly online
++ * CPU isn't yet part of the sched domains, and balancing will not
++ * see it.
++ *
++ * - on cpu-down we clear cpu_active() to mask the sched domains and
++ * avoid the load balancer to place new tasks on the to be removed
++ * CPU. Existing tasks will remain running there and will be taken
++ * off.
++ *
++ * This means that fallback selection must not select !active CPUs.
++ * And can assume that any active CPU must be online. Conversely
++ * select_task_rq() below may allow selection of !active CPUs in order
++ * to satisfy the above rules.
++ */
++static int select_fallback_rq(int cpu, struct task_struct *p)
++{
++ int nid = cpu_to_node(cpu);
++ const struct cpumask *nodemask = NULL;
++ enum { cpuset, possible, fail } state = cpuset;
++ int dest_cpu;
++
++ /*
++ * If the node that the CPU is on has been offlined, cpu_to_node()
++ * will return -1. There is no CPU on the node, and we should
++ * select the CPU on the other node.
++ */
++ if (nid != -1) {
++ nodemask = cpumask_of_node(nid);
++
++ /* Look for allowed, online CPU in same node. */
++ for_each_cpu(dest_cpu, nodemask) {
++ if (is_cpu_allowed(p, dest_cpu))
++ return dest_cpu;
++ }
++ }
++
++ for (;;) {
++ /* Any allowed, online CPU? */
++ for_each_cpu(dest_cpu, p->cpus_ptr) {
++ if (!is_cpu_allowed(p, dest_cpu))
++ continue;
++ goto out;
++ }
++
++ /* No more Mr. Nice Guy. */
++ switch (state) {
++ case cpuset:
++ if (cpuset_cpus_allowed_fallback(p)) {
++ state = possible;
++ break;
++ }
++ fallthrough;
++ case possible:
++ /*
++ * XXX When called from select_task_rq() we only
++ * hold p->pi_lock and again violate locking order.
++ *
++ * More yuck to audit.
++ */
++ do_set_cpus_allowed(p, task_cpu_possible_mask(p));
++ state = fail;
++ break;
++
++ case fail:
++ BUG();
++ break;
++ }
++ }
++
++out:
++ if (state != cpuset) {
++ /*
++ * Don't tell them about moving exiting tasks or
++ * kernel threads (both mm NULL), since they never
++ * leave kernel.
++ */
++ if (p->mm && printk_ratelimit()) {
++ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
++ task_pid_nr(p), p->comm, cpu);
++ }
++ }
++
++ return dest_cpu;
++}
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ cpumask_t chk_mask, tmp;
++
++ if (unlikely(!cpumask_and(&chk_mask, p->cpus_ptr, cpu_active_mask)))
++ return select_fallback_rq(task_cpu(p), p);
++
++ if (
++#ifdef CONFIG_SCHED_SMT
++ cpumask_and(&tmp, &chk_mask, &sched_sg_idle_mask) ||
++#endif
++ cpumask_and(&tmp, &chk_mask, sched_rq_watermark) ||
++ cpumask_and(&tmp, &chk_mask,
++ sched_rq_watermark + SCHED_BITS - task_sched_prio(p)))
++ return best_mask_cpu(task_cpu(p), &tmp);
++
++ return best_mask_cpu(task_cpu(p), &chk_mask);
++}
++
++void sched_set_stop_task(int cpu, struct task_struct *stop)
++{
++ static struct lock_class_key stop_pi_lock;
++ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
++ struct sched_param start_param = { .sched_priority = 0 };
++ struct task_struct *old_stop = cpu_rq(cpu)->stop;
++
++ if (stop) {
++ /*
++ * Make it appear like a SCHED_FIFO task, its something
++ * userspace knows about and won't get confused about.
++ *
++ * Also, it will make PI more or less work without too
++ * much confusion -- but then, stop work should not
++ * rely on PI working anyway.
++ */
++ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
++
++ /*
++ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
++ * adjust the effective priority of a task. As a result,
++ * rt_mutex_setprio() can trigger (RT) balancing operations,
++ * which can then trigger wakeups of the stop thread to push
++ * around the current task.
++ *
++ * The stop task itself will never be part of the PI-chain, it
++ * never blocks, therefore that ->pi_lock recursion is safe.
++ * Tell lockdep about this by placing the stop->pi_lock in its
++ * own class.
++ */
++ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
++ }
++
++ cpu_rq(cpu)->stop = stop;
++
++ if (old_stop) {
++ /*
++ * Reset it back to a normal scheduling policy so that
++ * it can die in pieces.
++ */
++ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
++ }
++}
++
++static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
++ raw_spinlock_t *lock, unsigned long irq_flags)
++{
++ /* Can the task run on the task's current CPU? If so, we're done */
++ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
++ if (p->migration_disabled) {
++ if (likely(p->cpus_ptr != &p->cpus_mask))
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ p->migration_disabled = 0;
++ p->migration_flags |= MDF_FORCE_ENABLED;
++ /* When p is migrate_disabled, rq->lock should be held */
++ rq->nr_pinned--;
++ }
++
++ if (task_running(p) || READ_ONCE(p->__state) == TASK_WAKING) {
++ struct migration_arg arg = { p, dest_cpu };
++
++ /* Need help from migration thread: drop lock and wait. */
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
++ return 0;
++ }
++ if (task_on_rq_queued(p)) {
++ /*
++ * OK, since we're going to drop the lock immediately
++ * afterwards anyway.
++ */
++ update_rq_clock(rq);
++ rq = move_queued_task(rq, p, dest_cpu);
++ lock = &rq->lock;
++ }
++ }
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return 0;
++}
++
++static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
++ const struct cpumask *new_mask,
++ u32 flags,
++ struct rq *rq,
++ raw_spinlock_t *lock,
++ unsigned long irq_flags)
++{
++ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
++ const struct cpumask *cpu_valid_mask = cpu_active_mask;
++ bool kthread = p->flags & PF_KTHREAD;
++ struct cpumask *user_mask = NULL;
++ int dest_cpu;
++ int ret = 0;
++
++ if (kthread || is_migration_disabled(p)) {
++ /*
++ * Kernel threads are allowed on online && !active CPUs,
++ * however, during cpu-hot-unplug, even these might get pushed
++ * away if not KTHREAD_IS_PER_CPU.
++ *
++ * Specifically, migration_disabled() tasks must not fail the
++ * cpumask_any_and_distribute() pick below, esp. so on
++ * SCA_MIGRATE_ENABLE, otherwise we'll not call
++ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
++ */
++ cpu_valid_mask = cpu_online_mask;
++ }
++
++ if (!kthread && !cpumask_subset(new_mask, cpu_allowed_mask)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ /*
++ * Must re-check here, to close a race against __kthread_bind(),
++ * sched_setaffinity() is not guaranteed to observe the flag.
++ */
++ if ((flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ if (cpumask_equal(&p->cpus_mask, new_mask))
++ goto out;
++
++ dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
++ if (dest_cpu >= nr_cpu_ids) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ __do_set_cpus_allowed(p, new_mask);
++
++ if (flags & SCA_USER)
++ user_mask = clear_user_cpus_ptr(p);
++
++ ret = affine_move_task(rq, p, dest_cpu, lock, irq_flags);
++
++ kfree(user_mask);
++
++ return ret;
++
++out:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++
++ return ret;
++}
++
++/*
++ * Change a given task's CPU affinity. Migrate the thread to a
++ * proper CPU and schedule it away if the CPU it's executing on
++ * is removed from the allowed bitmask.
++ *
++ * NOTE: the caller must have a valid reference to the task, the
++ * task must not exit() & deallocate itself prematurely. The
++ * call is not atomic; no spinlocks may be held.
++ */
++static int __set_cpus_allowed_ptr(struct task_struct *p,
++ const struct cpumask *new_mask, u32 flags)
++{
++ unsigned long irq_flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++
++ return __set_cpus_allowed_ptr_locked(p, new_mask, flags, rq, lock, irq_flags);
++}
++
++int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ return __set_cpus_allowed_ptr(p, new_mask, 0);
++}
++EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
++
++/*
++ * Change a given task's CPU affinity to the intersection of its current
++ * affinity mask and @subset_mask, writing the resulting mask to @new_mask
++ * and pointing @p->user_cpus_ptr to a copy of the old mask.
++ * If the resulting mask is empty, leave the affinity unchanged and return
++ * -EINVAL.
++ */
++static int restrict_cpus_allowed_ptr(struct task_struct *p,
++ struct cpumask *new_mask,
++ const struct cpumask *subset_mask)
++{
++ struct cpumask *user_mask = NULL;
++ unsigned long irq_flags;
++ raw_spinlock_t *lock;
++ struct rq *rq;
++ int err;
++
++ if (!p->user_cpus_ptr) {
++ user_mask = kmalloc(cpumask_size(), GFP_KERNEL);
++ if (!user_mask)
++ return -ENOMEM;
++ }
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++
++ if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
++ err = -EINVAL;
++ goto err_unlock;
++ }
++
++ /*
++ * We're about to butcher the task affinity, so keep track of what
++ * the user asked for in case we're able to restore it later on.
++ */
++ if (user_mask) {
++ cpumask_copy(user_mask, p->cpus_ptr);
++ p->user_cpus_ptr = user_mask;
++ }
++
++ /*return __set_cpus_allowed_ptr_locked(p, new_mask, 0, rq, &rf);*/
++ return __set_cpus_allowed_ptr_locked(p, new_mask, 0, rq, lock, irq_flags);
++
++err_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ kfree(user_mask);
++ return err;
++}
++
++/*
++ * Restrict the CPU affinity of task @p so that it is a subset of
++ * task_cpu_possible_mask() and point @p->user_cpu_ptr to a copy of the
++ * old affinity mask. If the resulting mask is empty, we warn and walk
++ * up the cpuset hierarchy until we find a suitable mask.
++ */
++void force_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ cpumask_var_t new_mask;
++ const struct cpumask *override_mask = task_cpu_possible_mask(p);
++
++ alloc_cpumask_var(&new_mask, GFP_KERNEL);
++
++ /*
++ * __migrate_task() can fail silently in the face of concurrent
++ * offlining of the chosen destination CPU, so take the hotplug
++ * lock to ensure that the migration succeeds.
++ */
++ cpus_read_lock();
++ if (!cpumask_available(new_mask))
++ goto out_set_mask;
++
++ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
++ goto out_free_mask;
++
++ /*
++ * We failed to find a valid subset of the affinity mask for the
++ * task, so override it based on its cpuset hierarchy.
++ */
++ cpuset_cpus_allowed(p, new_mask);
++ override_mask = new_mask;
++
++out_set_mask:
++ if (printk_ratelimit()) {
++ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
++ task_pid_nr(p), p->comm,
++ cpumask_pr_args(override_mask));
++ }
++
++ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
++out_free_mask:
++ cpus_read_unlock();
++ free_cpumask_var(new_mask);
++}
++
++static int
++__sched_setaffinity(struct task_struct *p, const struct cpumask *mask);
++
++/*
++ * Restore the affinity of a task @p which was previously restricted by a
++ * call to force_compatible_cpus_allowed_ptr(). This will clear (and free)
++ * @p->user_cpus_ptr.
++ *
++ * It is the caller's responsibility to serialise this with any calls to
++ * force_compatible_cpus_allowed_ptr(@p).
++ */
++void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ struct cpumask *user_mask = p->user_cpus_ptr;
++ unsigned long flags;
++
++ /*
++ * Try to restore the old affinity mask. If this fails, then
++ * we free the mask explicitly to avoid it being inherited across
++ * a subsequent fork().
++ */
++ if (!user_mask || !__sched_setaffinity(p, user_mask))
++ return;
++
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ user_mask = clear_user_cpus_ptr(p);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ kfree(user_mask);
++}
++
++#else /* CONFIG_SMP */
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ return 0;
++}
++
++static inline int
++__set_cpus_allowed_ptr(struct task_struct *p,
++ const struct cpumask *new_mask, u32 flags)
++{
++ return set_cpus_allowed_ptr(p, new_mask);
++}
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return false;
++}
++
++#endif /* !CONFIG_SMP */
++
++static void
++ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq;
++
++ if (!schedstat_enabled())
++ return;
++
++ rq = this_rq();
++
++#ifdef CONFIG_SMP
++ if (cpu == rq->cpu) {
++ __schedstat_inc(rq->ttwu_local);
++ __schedstat_inc(p->stats.nr_wakeups_local);
++ } else {
++ /** Alt schedule FW ToDo:
++ * How to do ttwu_wake_remote
++ */
++ }
++#endif /* CONFIG_SMP */
++
++ __schedstat_inc(rq->ttwu_count);
++ __schedstat_inc(p->stats.nr_wakeups);
++}
++
++/*
++ * Mark the task runnable and perform wakeup-preemption.
++ */
++static inline void
++ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
++{
++ check_preempt_curr(rq);
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ trace_sched_wakeup(p);
++}
++
++static inline void
++ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
++{
++ if (p->sched_contributes_to_load)
++ rq->nr_uninterruptible--;
++
++ if (
++#ifdef CONFIG_SMP
++ !(wake_flags & WF_MIGRATED) &&
++#endif
++ p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ activate_task(p, rq);
++ ttwu_do_wakeup(rq, p, 0);
++}
++
++/*
++ * Consider @p being inside a wait loop:
++ *
++ * for (;;) {
++ * set_current_state(TASK_UNINTERRUPTIBLE);
++ *
++ * if (CONDITION)
++ * break;
++ *
++ * schedule();
++ * }
++ * __set_current_state(TASK_RUNNING);
++ *
++ * between set_current_state() and schedule(). In this case @p is still
++ * runnable, so all that needs doing is change p->state back to TASK_RUNNING in
++ * an atomic manner.
++ *
++ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
++ * then schedule() must still happen and p->state can be changed to
++ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
++ * need to do a full wakeup with enqueue.
++ *
++ * Returns: %true when the wakeup is done,
++ * %false otherwise.
++ */
++static int ttwu_runnable(struct task_struct *p, int wake_flags)
++{
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ int ret = 0;
++
++ rq = __task_access_lock(p, &lock);
++ if (task_on_rq_queued(p)) {
++ /* check_preempt_curr() may use rq clock */
++ update_rq_clock(rq);
++ ttwu_do_wakeup(rq, p, wake_flags);
++ ret = 1;
++ }
++ __task_access_unlock(p, lock);
++
++ return ret;
++}
++
++#ifdef CONFIG_SMP
++void sched_ttwu_pending(void *arg)
++{
++ struct llist_node *llist = arg;
++ struct rq *rq = this_rq();
++ struct task_struct *p, *t;
++ struct rq_flags rf;
++
++ if (!llist)
++ return;
++
++ /*
++ * rq::ttwu_pending racy indication of out-standing wakeups.
++ * Races such that false-negatives are possible, since they
++ * are shorter lived that false-positives would be.
++ */
++ WRITE_ONCE(rq->ttwu_pending, 0);
++
++ rq_lock_irqsave(rq, &rf);
++ update_rq_clock(rq);
++
++ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
++ if (WARN_ON_ONCE(p->on_cpu))
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
++ set_task_cpu(p, cpu_of(rq));
++
++ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
++ }
++
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++void send_call_function_single_ipi(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (!set_nr_if_polling(rq->idle))
++ arch_send_call_function_single_ipi(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++/*
++ * Queue a task on the target CPUs wake_list and wake the CPU via IPI if
++ * necessary. The wakee CPU on receipt of the IPI will queue the task
++ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
++ * of the wakeup instead of the waker.
++ */
++static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
++
++ WRITE_ONCE(rq->ttwu_pending, 1);
++ __smp_call_single_queue(cpu, &p->wake_entry.llist);
++}
++
++static inline bool ttwu_queue_cond(int cpu, int wake_flags)
++{
++ /*
++ * Do not complicate things with the async wake_list while the CPU is
++ * in hotplug state.
++ */
++ if (!cpu_active(cpu))
++ return false;
++
++ /*
++ * If the CPU does not share cache, then queue the task on the
++ * remote rqs wakelist to avoid accessing remote data.
++ */
++ if (!cpus_share_cache(smp_processor_id(), cpu))
++ return true;
++
++ /*
++ * If the task is descheduling and the only running task on the
++ * CPU then use the wakelist to offload the task activation to
++ * the soon-to-be-idle CPU as the current CPU is likely busy.
++ * nr_running is checked to avoid unnecessary task stacking.
++ */
++ if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
++ return true;
++
++ return false;
++}
++
++static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(cpu, wake_flags)) {
++ if (WARN_ON_ONCE(cpu == smp_processor_id()))
++ return false;
++
++ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
++ __ttwu_queue_wakelist(p, cpu, wake_flags);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_if_idle(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ rcu_read_lock();
++
++ if (!is_idle_task(rcu_dereference(rq->curr)))
++ goto out;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (is_idle_task(rq->curr))
++ resched_curr(rq);
++ /* Else CPU is not idle, do nothing here */
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++out:
++ rcu_read_unlock();
++}
++
++bool cpus_share_cache(int this_cpu, int that_cpu)
++{
++ if (this_cpu == that_cpu)
++ return true;
++
++ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
++}
++#else /* !CONFIG_SMP */
++
++static inline bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ return false;
++}
++
++#endif /* CONFIG_SMP */
++
++static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (ttwu_queue_wakelist(p, cpu, wake_flags))
++ return;
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++ ttwu_do_activate(rq, p, wake_flags);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Invoked from try_to_wake_up() to check whether the task can be woken up.
++ *
++ * The caller holds p::pi_lock if p != current or has preemption
++ * disabled when p == current.
++ *
++ * The rules of PREEMPT_RT saved_state:
++ *
++ * The related locking code always holds p::pi_lock when updating
++ * p::saved_state, which means the code is fully serialized in both cases.
++ *
++ * The lock wait and lock wakeups happen via TASK_RTLOCK_WAIT. No other
++ * bits set. This allows to distinguish all wakeup scenarios.
++ */
++static __always_inline
++bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
++{
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
++ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
++ state != TASK_RTLOCK_WAIT);
++ }
++
++ if (READ_ONCE(p->__state) & state) {
++ *success = 1;
++ return true;
++ }
++
++#ifdef CONFIG_PREEMPT_RT
++ /*
++ * Saved state preserves the task state across blocking on
++ * an RT lock. If the state matches, set p::saved_state to
++ * TASK_RUNNING, but do not wake the task because it waits
++ * for a lock wakeup. Also indicate success because from
++ * the regular waker's point of view this has succeeded.
++ *
++ * After acquiring the lock the task will restore p::__state
++ * from p::saved_state which ensures that the regular
++ * wakeup is not lost. The restore will also set
++ * p::saved_state to TASK_RUNNING so any further tests will
++ * not result in false positives vs. @success
++ */
++ if (p->saved_state & state) {
++ p->saved_state = TASK_RUNNING;
++ *success = 1;
++ }
++#endif
++ return false;
++}
++
++/*
++ * Notes on Program-Order guarantees on SMP systems.
++ *
++ * MIGRATION
++ *
++ * The basic program-order guarantee on SMP systems is that when a task [t]
++ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
++ * execution on its new CPU [c1].
++ *
++ * For migration (of runnable tasks) this is provided by the following means:
++ *
++ * A) UNLOCK of the rq(c0)->lock scheduling out task t
++ * B) migration for t is required to synchronize *both* rq(c0)->lock and
++ * rq(c1)->lock (if not at the same time, then in that order).
++ * C) LOCK of the rq(c1)->lock scheduling in task
++ *
++ * Transitivity guarantees that B happens after A and C after B.
++ * Note: we only require RCpc transitivity.
++ * Note: the CPU doing B need not be c0 or c1
++ *
++ * Example:
++ *
++ * CPU0 CPU1 CPU2
++ *
++ * LOCK rq(0)->lock
++ * sched-out X
++ * sched-in Y
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(0)->lock // orders against CPU0
++ * dequeue X
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(1)->lock
++ * enqueue X
++ * UNLOCK rq(1)->lock
++ *
++ * LOCK rq(1)->lock // orders against CPU2
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(1)->lock
++ *
++ *
++ * BLOCKING -- aka. SLEEP + WAKEUP
++ *
++ * For blocking we (obviously) need to provide the same guarantee as for
++ * migration. However the means are completely different as there is no lock
++ * chain to provide order. Instead we do:
++ *
++ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
++ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
++ *
++ * Example:
++ *
++ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
++ *
++ * LOCK rq(0)->lock LOCK X->pi_lock
++ * dequeue X
++ * sched-out X
++ * smp_store_release(X->on_cpu, 0);
++ *
++ * smp_cond_load_acquire(&X->on_cpu, !VAL);
++ * X->state = WAKING
++ * set_task_cpu(X,2)
++ *
++ * LOCK rq(2)->lock
++ * enqueue X
++ * X->state = RUNNING
++ * UNLOCK rq(2)->lock
++ *
++ * LOCK rq(2)->lock // orders against CPU1
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(2)->lock
++ *
++ * UNLOCK X->pi_lock
++ * UNLOCK rq(0)->lock
++ *
++ *
++ * However; for wakeups there is a second guarantee we must provide, namely we
++ * must observe the state that lead to our wakeup. That is, not only must our
++ * task observe its own prior state, it must also observe the stores prior to
++ * its wakeup.
++ *
++ * This means that any means of doing remote wakeups must order the CPU doing
++ * the wakeup against the CPU the task is going to end up running on. This,
++ * however, is already required for the regular Program-Order guarantee above,
++ * since the waking CPU is the one issueing the ACQUIRE (smp_cond_load_acquire).
++ *
++ */
++
++/**
++ * try_to_wake_up - wake up a thread
++ * @p: the thread to be awakened
++ * @state: the mask of task states that can be woken
++ * @wake_flags: wake modifier flags (WF_*)
++ *
++ * Conceptually does:
++ *
++ * If (@state & @p->state) @p->state = TASK_RUNNING.
++ *
++ * If the task was not queued/runnable, also place it back on a runqueue.
++ *
++ * This function is atomic against schedule() which would dequeue the task.
++ *
++ * It issues a full memory barrier before accessing @p->state, see the comment
++ * with set_current_state().
++ *
++ * Uses p->pi_lock to serialize against concurrent wake-ups.
++ *
++ * Relies on p->pi_lock stabilizing:
++ * - p->sched_class
++ * - p->cpus_ptr
++ * - p->sched_task_group
++ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
++ *
++ * Tries really hard to only take one task_rq(p)->lock for performance.
++ * Takes rq->lock in:
++ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
++ * - ttwu_queue() -- new rq, for enqueue of the task;
++ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
++ *
++ * As a consequence we race really badly with just about everything. See the
++ * many memory barriers and their comments for details.
++ *
++ * Return: %true if @p->state changes (an actual wakeup was done),
++ * %false otherwise.
++ */
++static int try_to_wake_up(struct task_struct *p, unsigned int state,
++ int wake_flags)
++{
++ unsigned long flags;
++ int cpu, success = 0;
++
++ preempt_disable();
++ if (p == current) {
++ /*
++ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
++ * == smp_processor_id()'. Together this means we can special
++ * case the whole 'p->on_rq && ttwu_runnable()' case below
++ * without taking any locks.
++ *
++ * In particular:
++ * - we rely on Program-Order guarantees for all the ordering,
++ * - we're serialized against set_special_state() by virtue of
++ * it disabling IRQs (this allows not taking ->pi_lock).
++ */
++ if (!ttwu_state_match(p, state, &success))
++ goto out;
++
++ trace_sched_waking(p);
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ trace_sched_wakeup(p);
++ goto out;
++ }
++
++ /*
++ * If we are going to wake up a thread waiting for CONDITION we
++ * need to ensure that CONDITION=1 done by the caller can not be
++ * reordered with p->state check below. This pairs with smp_store_mb()
++ * in set_current_state() that the waiting thread does.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ smp_mb__after_spinlock();
++ if (!ttwu_state_match(p, state, &success))
++ goto unlock;
++
++ trace_sched_waking(p);
++
++ /*
++ * Ensure we load p->on_rq _after_ p->state, otherwise it would
++ * be possible to, falsely, observe p->on_rq == 0 and get stuck
++ * in smp_cond_load_acquire() below.
++ *
++ * sched_ttwu_pending() try_to_wake_up()
++ * STORE p->on_rq = 1 LOAD p->state
++ * UNLOCK rq->lock
++ *
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * UNLOCK rq->lock
++ *
++ * [task p]
++ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * A similar smb_rmb() lives in try_invoke_on_locked_down_task().
++ */
++ smp_rmb();
++ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
++ goto unlock;
++
++#ifdef CONFIG_SMP
++ /*
++ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
++ * possible to, falsely, observe p->on_cpu == 0.
++ *
++ * One must be running (->on_cpu == 1) in order to remove oneself
++ * from the runqueue.
++ *
++ * __schedule() (switch to task 'p') try_to_wake_up()
++ * STORE p->on_cpu = 1 LOAD p->on_rq
++ * UNLOCK rq->lock
++ *
++ * __schedule() (put 'p' to sleep)
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * STORE p->on_rq = 0 LOAD p->on_cpu
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
++ * schedule()'s deactivate_task() has 'happened' and p will no longer
++ * care about it's own p->state. See the comment in __schedule().
++ */
++ smp_acquire__after_ctrl_dep();
++
++ /*
++ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
++ * == 0), which means we need to do an enqueue, change p->state to
++ * TASK_WAKING such that we can unlock p->pi_lock before doing the
++ * enqueue, such as ttwu_queue_wakelist().
++ */
++ WRITE_ONCE(p->__state, TASK_WAKING);
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++ * this task as prev, considering queueing p on the remote CPUs wake_list
++ * which potentially sends an IPI instead of spinning on p->on_cpu to
++ * let the waker make forward progress. This is safe because IRQs are
++ * disabled and the IPI will deliver after on_cpu is cleared.
++ *
++ * Ensure we load task_cpu(p) after p->on_cpu:
++ *
++ * set_task_cpu(p, cpu);
++ * STORE p->cpu = @cpu
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock
++ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
++ * STORE p->on_cpu = 1 LOAD p->cpu
++ *
++ * to ensure we observe the correct CPU on which the task is currently
++ * scheduling.
++ */
++ if (smp_load_acquire(&p->on_cpu) &&
++ ttwu_queue_wakelist(p, task_cpu(p), wake_flags | WF_ON_CPU))
++ goto unlock;
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++ * this task as prev, wait until it's done referencing the task.
++ *
++ * Pairs with the smp_store_release() in finish_task().
++ *
++ * This ensures that tasks getting woken will be fully ordered against
++ * their previous state and preserve Program Order.
++ */
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ sched_task_ttwu(p);
++
++ cpu = select_task_rq(p);
++
++ if (cpu != task_cpu(p)) {
++ if (p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ wake_flags |= WF_MIGRATED;
++ psi_ttwu_dequeue(p);
++ set_task_cpu(p, cpu);
++ }
++#else
++ cpu = task_cpu(p);
++#endif /* CONFIG_SMP */
++
++ ttwu_queue(p, cpu, wake_flags);
++unlock:
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++out:
++ if (success)
++ ttwu_stat(p, task_cpu(p), wake_flags);
++ preempt_enable();
++
++ return success;
++}
++
++/**
++ * task_call_func - Invoke a function on task in fixed state
++ * @p: Process for which the function is to be invoked, can be @current.
++ * @func: Function to invoke.
++ * @arg: Argument to function.
++ *
++ * Fix the task in it's current state by avoiding wakeups and or rq operations
++ * and call @func(@arg) on it. This function can use ->on_rq and task_curr()
++ * to work out what the state is, if required. Given that @func can be invoked
++ * with a runqueue lock held, it had better be quite lightweight.
++ *
++ * Returns:
++ * Whatever @func returns
++ */
++int task_call_func(struct task_struct *p, task_call_f func, void *arg)
++{
++ struct rq *rq = NULL;
++ unsigned int state;
++ struct rq_flags rf;
++ int ret;
++
++ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
++
++ state = READ_ONCE(p->__state);
++
++ /*
++ * Ensure we load p->on_rq after p->__state, otherwise it would be
++ * possible to, falsely, observe p->on_rq == 0.
++ *
++ * See try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++
++ /*
++ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
++ * the task is blocked. Make sure to check @state since ttwu() can drop
++ * locks at the end, see ttwu_queue_wakelist().
++ */
++ if (state == TASK_RUNNING || state == TASK_WAKING || p->on_rq)
++ rq = __task_rq_lock(p, &rf);
++
++ /*
++ * At this point the task is pinned; either:
++ * - blocked and we're holding off wakeups (pi->lock)
++ * - woken, and we're holding off enqueue (rq->lock)
++ * - queued, and we're holding off schedule (rq->lock)
++ * - running, and we're holding off de-schedule (rq->lock)
++ *
++ * The called function (@func) can use: task_curr(), p->on_rq and
++ * p->__state to differentiate between these states.
++ */
++ ret = func(p, arg);
++
++ if (rq)
++ __task_rq_unlock(rq, &rf);
++
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
++ return ret;
++}
++
++/**
++ * wake_up_process - Wake up a specific process
++ * @p: The process to be woken up.
++ *
++ * Attempt to wake up the nominated process and move it to the set of runnable
++ * processes.
++ *
++ * Return: 1 if the process was woken up, 0 if it was already running.
++ *
++ * This function executes a full memory barrier before accessing the task state.
++ */
++int wake_up_process(struct task_struct *p)
++{
++ return try_to_wake_up(p, TASK_NORMAL, 0);
++}
++EXPORT_SYMBOL(wake_up_process);
++
++int wake_up_state(struct task_struct *p, unsigned int state)
++{
++ return try_to_wake_up(p, state, 0);
++}
++
++/*
++ * Perform scheduler related setup for a newly forked process p.
++ * p is forked by current.
++ *
++ * __sched_fork() is basic setup used by init_idle() too:
++ */
++static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ p->on_rq = 0;
++ p->on_cpu = 0;
++ p->utime = 0;
++ p->stime = 0;
++ p->sched_time = 0;
++
++#ifdef CONFIG_SCHEDSTATS
++ /* Even if schedstat is disabled, there should not be garbage */
++ memset(&p->stats, 0, sizeof(p->stats));
++#endif
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++ INIT_HLIST_HEAD(&p->preempt_notifiers);
++#endif
++
++#ifdef CONFIG_COMPACTION
++ p->capture_control = NULL;
++#endif
++#ifdef CONFIG_SMP
++ p->wake_entry.u_flags = CSD_TYPE_TTWU;
++#endif
++}
++
++/*
++ * fork()/clone()-time setup:
++ */
++int sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ __sched_fork(clone_flags, p);
++ /*
++ * We mark the process as NEW here. This guarantees that
++ * nobody will actually run it, and a signal or other external
++ * event cannot wake it up and insert it on the runqueue either.
++ */
++ p->__state = TASK_NEW;
++
++ /*
++ * Make sure we do not leak PI boosting priority to the child.
++ */
++ p->prio = current->normal_prio;
++
++ /*
++ * Revert to default priority/policy on fork if requested.
++ */
++ if (unlikely(p->sched_reset_on_fork)) {
++ if (task_has_rt_policy(p)) {
++ p->policy = SCHED_NORMAL;
++ p->static_prio = NICE_TO_PRIO(0);
++ p->rt_priority = 0;
++ } else if (PRIO_TO_NICE(p->static_prio) < 0)
++ p->static_prio = NICE_TO_PRIO(0);
++
++ p->prio = p->normal_prio = p->static_prio;
++
++ /*
++ * We don't need the reset flag anymore after the fork. It has
++ * fulfilled its duty:
++ */
++ p->sched_reset_on_fork = 0;
++ }
++
++#ifdef CONFIG_SCHED_INFO
++ if (unlikely(sched_info_on()))
++ memset(&p->sched_info, 0, sizeof(p->sched_info));
++#endif
++ init_task_preempt_count(p);
++
++ return 0;
++}
++
++void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ /*
++ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
++ * required yet, but lockdep gets upset if rules are violated.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ /*
++ * Share the timeslice between parent and child, thus the
++ * total amount of pending timeslices in the system doesn't change,
++ * resulting in more scheduling fairness.
++ */
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ rq->curr->time_slice /= 2;
++ p->time_slice = rq->curr->time_slice;
++#ifdef CONFIG_SCHED_HRTICK
++ hrtick_start(rq, rq->curr->time_slice);
++#endif
++
++ if (p->time_slice < RESCHED_NS) {
++ p->time_slice = sched_timeslice_ns;
++ resched_curr(rq);
++ }
++ sched_task_fork(p, rq);
++ raw_spin_unlock(&rq->lock);
++
++ rseq_migrate(p);
++ /*
++ * We're setting the CPU for the first time, we don't migrate,
++ * so use __set_task_cpu().
++ */
++ __set_task_cpu(p, smp_processor_id());
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++void sched_post_fork(struct task_struct *p)
++{
++}
++
++#ifdef CONFIG_SCHEDSTATS
++
++DEFINE_STATIC_KEY_FALSE(sched_schedstats);
++
++static void set_schedstats(bool enabled)
++{
++ if (enabled)
++ static_branch_enable(&sched_schedstats);
++ else
++ static_branch_disable(&sched_schedstats);
++}
++
++void force_schedstat_enabled(void)
++{
++ if (!schedstat_enabled()) {
++ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
++ static_branch_enable(&sched_schedstats);
++ }
++}
++
++static int __init setup_schedstats(char *str)
++{
++ int ret = 0;
++ if (!str)
++ goto out;
++
++ if (!strcmp(str, "enable")) {
++ set_schedstats(true);
++ ret = 1;
++ } else if (!strcmp(str, "disable")) {
++ set_schedstats(false);
++ ret = 1;
++ }
++out:
++ if (!ret)
++ pr_warn("Unable to parse schedstats=\n");
++
++ return ret;
++}
++__setup("schedstats=", setup_schedstats);
++
++#ifdef CONFIG_PROC_SYSCTL
++int sysctl_schedstats(struct ctl_table *table, int write,
++ void __user *buffer, size_t *lenp, loff_t *ppos)
++{
++ struct ctl_table t;
++ int err;
++ int state = static_branch_likely(&sched_schedstats);
++
++ if (write && !capable(CAP_SYS_ADMIN))
++ return -EPERM;
++
++ t = *table;
++ t.data = &state;
++ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
++ if (err < 0)
++ return err;
++ if (write)
++ set_schedstats(state);
++ return err;
++}
++#endif /* CONFIG_PROC_SYSCTL */
++#endif /* CONFIG_SCHEDSTATS */
++
++/*
++ * wake_up_new_task - wake up a newly created task for the first time.
++ *
++ * This function will do some initial scheduler statistics housekeeping
++ * that must be done for every newly created context, then puts the task
++ * on the runqueue and wakes it.
++ */
++void wake_up_new_task(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ rq = cpu_rq(select_task_rq(p));
++#ifdef CONFIG_SMP
++ rseq_migrate(p);
++ /*
++ * Fork balancing, do it here and not earlier because:
++ * - cpus_ptr can change in the fork path
++ * - any previously selected CPU might disappear through hotplug
++ *
++ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
++ * as we're not fully set-up yet.
++ */
++ __set_task_cpu(p, cpu_of(rq));
++#endif
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ activate_task(p, rq);
++ trace_sched_wakeup_new(p);
++ check_preempt_curr(rq);
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++
++static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
++
++void preempt_notifier_inc(void)
++{
++ static_branch_inc(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_inc);
++
++void preempt_notifier_dec(void)
++{
++ static_branch_dec(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_dec);
++
++/**
++ * preempt_notifier_register - tell me when current is being preempted & rescheduled
++ * @notifier: notifier struct to register
++ */
++void preempt_notifier_register(struct preempt_notifier *notifier)
++{
++ if (!static_branch_unlikely(&preempt_notifier_key))
++ WARN(1, "registering preempt_notifier while notifiers disabled\n");
++
++ hlist_add_head(¬ifier->link, ¤t->preempt_notifiers);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_register);
++
++/**
++ * preempt_notifier_unregister - no longer interested in preemption notifications
++ * @notifier: notifier struct to unregister
++ *
++ * This is *not* safe to call from within a preemption notifier.
++ */
++void preempt_notifier_unregister(struct preempt_notifier *notifier)
++{
++ hlist_del(¬ifier->link);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
++
++static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_in(notifier, raw_smp_processor_id());
++}
++
++static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_in_preempt_notifiers(curr);
++}
++
++static void
++__fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_out(notifier, next);
++}
++
++static __always_inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_out_preempt_notifiers(curr, next);
++}
++
++#else /* !CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++}
++
++static inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++}
++
++#endif /* CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void prepare_task(struct task_struct *next)
++{
++ /*
++ * Claim the task as running, we do this before switching to it
++ * such that any running task will have this set.
++ *
++ * See the ttwu() WF_ON_CPU case and its ordering comment.
++ */
++ WRITE_ONCE(next->on_cpu, 1);
++}
++
++static inline void finish_task(struct task_struct *prev)
++{
++#ifdef CONFIG_SMP
++ /*
++ * This must be the very last reference to @prev from this CPU. After
++ * p->on_cpu is cleared, the task can be moved to a different CPU. We
++ * must ensure this doesn't happen until the switch is completely
++ * finished.
++ *
++ * In particular, the load of prev->state in finish_task_switch() must
++ * happen before this.
++ *
++ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
++ */
++ smp_store_release(&prev->on_cpu, 0);
++#else
++ prev->on_cpu = 0;
++#endif
++}
++
++#ifdef CONFIG_SMP
++
++static void do_balance_callbacks(struct rq *rq, struct callback_head *head)
++{
++ void (*func)(struct rq *rq);
++ struct callback_head *next;
++
++ lockdep_assert_held(&rq->lock);
++
++ while (head) {
++ func = (void (*)(struct rq *))head->func;
++ next = head->next;
++ head->next = NULL;
++ head = next;
++
++ func(rq);
++ }
++}
++
++static void balance_push(struct rq *rq);
++
++struct callback_head balance_push_callback = {
++ .next = NULL,
++ .func = (void (*)(struct callback_head *))balance_push,
++};
++
++static inline struct callback_head *splice_balance_callbacks(struct rq *rq)
++{
++ struct callback_head *head = rq->balance_callback;
++
++ if (head) {
++ lockdep_assert_held(&rq->lock);
++ rq->balance_callback = NULL;
++ }
++
++ return head;
++}
++
++static void __balance_callbacks(struct rq *rq)
++{
++ do_balance_callbacks(rq, splice_balance_callbacks(rq));
++}
++
++static inline void balance_callbacks(struct rq *rq, struct callback_head *head)
++{
++ unsigned long flags;
++
++ if (unlikely(head)) {
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ do_balance_callbacks(rq, head);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++ }
++}
++
++#else
++
++static inline void __balance_callbacks(struct rq *rq)
++{
++}
++
++static inline struct callback_head *splice_balance_callbacks(struct rq *rq)
++{
++ return NULL;
++}
++
++static inline void balance_callbacks(struct rq *rq, struct callback_head *head)
++{
++}
++
++#endif
++
++static inline void
++prepare_lock_switch(struct rq *rq, struct task_struct *next)
++{
++ /*
++ * Since the runqueue lock will be released by the next
++ * task (which is an invalid locking op but in the case
++ * of the scheduler it's an obvious special-case), so we
++ * do an early lockdep release here:
++ */
++ spin_release(&rq->lock.dep_map, _THIS_IP_);
++#ifdef CONFIG_DEBUG_SPINLOCK
++ /* this is a valid case when another task releases the spinlock */
++ rq->lock.owner = next;
++#endif
++}
++
++static inline void finish_lock_switch(struct rq *rq)
++{
++ /*
++ * If we are tracking spinlock dependencies then we have to
++ * fix up the runqueue lock - which gets 'carried over' from
++ * prev into current:
++ */
++ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++/*
++ * NOP if the arch has not defined these:
++ */
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++static inline void kmap_local_sched_out(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_out();
++#endif
++}
++
++static inline void kmap_local_sched_in(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_in();
++#endif
++}
++
++/**
++ * prepare_task_switch - prepare to switch tasks
++ * @rq: the runqueue preparing to switch
++ * @next: the task we are going to switch to.
++ *
++ * This is called with the rq lock held and interrupts off. It must
++ * be paired with a subsequent finish_task_switch after the context
++ * switch.
++ *
++ * prepare_task_switch sets up locking and calls architecture specific
++ * hooks.
++ */
++static inline void
++prepare_task_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ kcov_prepare_switch(prev);
++ sched_info_switch(rq, prev, next);
++ perf_event_task_sched_out(prev, next);
++ rseq_preempt(prev);
++ fire_sched_out_preempt_notifiers(prev, next);
++ kmap_local_sched_out();
++ prepare_task(next);
++ prepare_arch_switch(next);
++}
++
++/**
++ * finish_task_switch - clean up after a task-switch
++ * @rq: runqueue associated with task-switch
++ * @prev: the thread we just switched away from.
++ *
++ * finish_task_switch must be called after the context switch, paired
++ * with a prepare_task_switch call before the context switch.
++ * finish_task_switch will reconcile locking set up by prepare_task_switch,
++ * and do any other architecture-specific cleanup actions.
++ *
++ * Note that we may have delayed dropping an mm in context_switch(). If
++ * so, we finish that here outside of the runqueue lock. (Doing it
++ * with the lock held can cause deadlocks; see schedule() for
++ * details.)
++ *
++ * The context switch have flipped the stack from under us and restored the
++ * local variables which were saved when this task called schedule() in the
++ * past. prev == current is still correct but we need to recalculate this_rq
++ * because prev may have moved to another CPU.
++ */
++static struct rq *finish_task_switch(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ struct rq *rq = this_rq();
++ struct mm_struct *mm = rq->prev_mm;
++ unsigned int prev_state;
++
++ /*
++ * The previous task will have left us with a preempt_count of 2
++ * because it left us after:
++ *
++ * schedule()
++ * preempt_disable(); // 1
++ * __schedule()
++ * raw_spin_lock_irq(&rq->lock) // 2
++ *
++ * Also, see FORK_PREEMPT_COUNT.
++ */
++ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
++ "corrupted preempt_count: %s/%d/0x%x\n",
++ current->comm, current->pid, preempt_count()))
++ preempt_count_set(FORK_PREEMPT_COUNT);
++
++ rq->prev_mm = NULL;
++
++ /*
++ * A task struct has one reference for the use as "current".
++ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
++ * schedule one last time. The schedule call will never return, and
++ * the scheduled task must drop that reference.
++ *
++ * We must observe prev->state before clearing prev->on_cpu (in
++ * finish_task), otherwise a concurrent wakeup can get prev
++ * running on another CPU and we could rave with its RUNNING -> DEAD
++ * transition, resulting in a double drop.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ vtime_task_switch(prev);
++ perf_event_task_sched_in(prev, current);
++ finish_task(prev);
++ tick_nohz_task_switch();
++ finish_lock_switch(rq);
++ finish_arch_post_lock_switch();
++ kcov_finish_switch(current);
++ /*
++ * kmap_local_sched_out() is invoked with rq::lock held and
++ * interrupts disabled. There is no requirement for that, but the
++ * sched out code does not have an interrupt enabled section.
++ * Restoring the maps on sched in does not require interrupts being
++ * disabled either.
++ */
++ kmap_local_sched_in();
++
++ fire_sched_in_preempt_notifiers(current);
++ /*
++ * When switching through a kernel thread, the loop in
++ * membarrier_{private,global}_expedited() may have observed that
++ * kernel thread and not issued an IPI. It is therefore possible to
++ * schedule between user->kernel->user threads without passing though
++ * switch_mm(). Membarrier requires a barrier after storing to
++ * rq->curr, before returning to userspace, so provide them here:
++ *
++ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
++ * provided by mmdrop(),
++ * - a sync_core for SYNC_CORE.
++ */
++ if (mm) {
++ membarrier_mm_sync_core_before_usermode(mm);
++ mmdrop_sched(mm);
++ }
++ if (unlikely(prev_state == TASK_DEAD)) {
++ /* Task is done with its stack. */
++ put_task_stack(prev);
++
++ put_task_struct_rcu_user(prev);
++ }
++
++ return rq;
++}
++
++/**
++ * schedule_tail - first thing a freshly forked thread must call.
++ * @prev: the thread we just switched away from.
++ */
++asmlinkage __visible void schedule_tail(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ /*
++ * New tasks start with FORK_PREEMPT_COUNT, see there and
++ * finish_task_switch() for details.
++ *
++ * finish_task_switch() will drop rq->lock() and lower preempt_count
++ * and the preempt_enable() will end up enabling preemption (on
++ * PREEMPT_COUNT kernels).
++ */
++
++ finish_task_switch(prev);
++ preempt_enable();
++
++ if (current->set_child_tid)
++ put_user(task_pid_vnr(current), current->set_child_tid);
++
++ calculate_sigpending();
++}
++
++/*
++ * context_switch - switch to the new MM and the new thread's register state.
++ */
++static __always_inline struct rq *
++context_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ prepare_task_switch(rq, prev, next);
++
++ /*
++ * For paravirt, this is coupled with an exit in switch_to to
++ * combine the page table reload and the switch backend into
++ * one hypercall.
++ */
++ arch_start_context_switch(prev);
++
++ /*
++ * kernel -> kernel lazy + transfer active
++ * user -> kernel lazy + mmgrab() active
++ *
++ * kernel -> user switch + mmdrop() active
++ * user -> user switch
++ */
++ if (!next->mm) { // to kernel
++ enter_lazy_tlb(prev->active_mm, next);
++
++ next->active_mm = prev->active_mm;
++ if (prev->mm) // from user
++ mmgrab(prev->active_mm);
++ else
++ prev->active_mm = NULL;
++ } else { // to user
++ membarrier_switch_mm(rq, prev->active_mm, next->mm);
++ /*
++ * sys_membarrier() requires an smp_mb() between setting
++ * rq->curr / membarrier_switch_mm() and returning to userspace.
++ *
++ * The below provides this either through switch_mm(), or in
++ * case 'prev->active_mm == next->mm' through
++ * finish_task_switch()'s mmdrop().
++ */
++ switch_mm_irqs_off(prev->active_mm, next->mm, next);
++
++ if (!prev->mm) { // from kernel
++ /* will mmdrop() in finish_task_switch(). */
++ rq->prev_mm = prev->active_mm;
++ prev->active_mm = NULL;
++ }
++ }
++
++ prepare_lock_switch(rq, next);
++
++ /* Here we just switch the register state and the stack. */
++ switch_to(prev, next, prev);
++ barrier();
++
++ return finish_task_switch(prev);
++}
++
++/*
++ * nr_running, nr_uninterruptible and nr_context_switches:
++ *
++ * externally visible scheduler statistics: current number of runnable
++ * threads, total number of context switches performed since bootup.
++ */
++unsigned int nr_running(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_online_cpu(i)
++ sum += cpu_rq(i)->nr_running;
++
++ return sum;
++}
++
++/*
++ * Check if only the current task is running on the CPU.
++ *
++ * Caution: this function does not check that the caller has disabled
++ * preemption, thus the result might have a time-of-check-to-time-of-use
++ * race. The caller is responsible to use it correctly, for example:
++ *
++ * - from a non-preemptible section (of course)
++ *
++ * - from a thread that is bound to a single CPU
++ *
++ * - in a loop with very short iterations (e.g. a polling loop)
++ */
++bool single_task_running(void)
++{
++ return raw_rq()->nr_running == 1;
++}
++EXPORT_SYMBOL(single_task_running);
++
++unsigned long long nr_context_switches(void)
++{
++ int i;
++ unsigned long long sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += cpu_rq(i)->nr_switches;
++
++ return sum;
++}
++
++/*
++ * Consumers of these two interfaces, like for example the cpuidle menu
++ * governor, are using nonsensical data. Preferring shallow idle state selection
++ * for a CPU that has IO-wait which might not even end up running the task when
++ * it does become runnable.
++ */
++
++unsigned int nr_iowait_cpu(int cpu)
++{
++ return atomic_read(&cpu_rq(cpu)->nr_iowait);
++}
++
++/*
++ * IO-wait accounting, and how it's mostly bollocks (on SMP).
++ *
++ * The idea behind IO-wait account is to account the idle time that we could
++ * have spend running if it were not for IO. That is, if we were to improve the
++ * storage performance, we'd have a proportional reduction in IO-wait time.
++ *
++ * This all works nicely on UP, where, when a task blocks on IO, we account
++ * idle time as IO-wait, because if the storage were faster, it could've been
++ * running and we'd not be idle.
++ *
++ * This has been extended to SMP, by doing the same for each CPU. This however
++ * is broken.
++ *
++ * Imagine for instance the case where two tasks block on one CPU, only the one
++ * CPU will have IO-wait accounted, while the other has regular idle. Even
++ * though, if the storage were faster, both could've ran at the same time,
++ * utilising both CPUs.
++ *
++ * This means, that when looking globally, the current IO-wait accounting on
++ * SMP is a lower bound, by reason of under accounting.
++ *
++ * Worse, since the numbers are provided per CPU, they are sometimes
++ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
++ * associated with any one particular CPU, it can wake to another CPU than it
++ * blocked on. This means the per CPU IO-wait number is meaningless.
++ *
++ * Task CPU affinities can make all that even more 'interesting'.
++ */
++
++unsigned int nr_iowait(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += nr_iowait_cpu(i);
++
++ return sum;
++}
++
++#ifdef CONFIG_SMP
++
++/*
++ * sched_exec - execve() is a valuable balancing opportunity, because at
++ * this point the task has the smallest effective memory and cache
++ * footprint.
++ */
++void sched_exec(void)
++{
++ struct task_struct *p = current;
++ unsigned long flags;
++ int dest_cpu;
++
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ dest_cpu = cpumask_any(p->cpus_ptr);
++ if (dest_cpu == smp_processor_id())
++ goto unlock;
++
++ if (likely(cpu_active(dest_cpu))) {
++ struct migration_arg arg = { p, dest_cpu };
++
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
++ return;
++ }
++unlock:
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++#endif
++
++DEFINE_PER_CPU(struct kernel_stat, kstat);
++DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
++
++EXPORT_PER_CPU_SYMBOL(kstat);
++EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
++
++static inline void update_curr(struct rq *rq, struct task_struct *p)
++{
++ s64 ns = rq->clock_task - p->last_ran;
++
++ p->sched_time += ns;
++ cgroup_account_cputime(p, ns);
++ account_group_exec_runtime(p, ns);
++
++ p->time_slice -= ns;
++ p->last_ran = rq->clock_task;
++}
++
++/*
++ * Return accounted runtime for the task.
++ * Return separately the current's pending runtime that have not been
++ * accounted yet.
++ */
++unsigned long long task_sched_runtime(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ u64 ns;
++
++#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
++ /*
++ * 64-bit doesn't need locks to atomically read a 64-bit value.
++ * So we have a optimization chance when the task's delta_exec is 0.
++ * Reading ->on_cpu is racy, but this is ok.
++ *
++ * If we race with it leaving CPU, we'll take a lock. So we're correct.
++ * If we race with it entering CPU, unaccounted time is 0. This is
++ * indistinguishable from the read occurring a few cycles earlier.
++ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
++ * been accounted, so we're correct here as well.
++ */
++ if (!p->on_cpu || !task_on_rq_queued(p))
++ return tsk_seruntime(p);
++#endif
++
++ rq = task_access_lock_irqsave(p, &lock, &flags);
++ /*
++ * Must be ->curr _and_ ->on_rq. If dequeued, we would
++ * project cycles that may never be accounted to this
++ * thread, breaking clock_gettime().
++ */
++ if (p == rq->curr && task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ update_curr(rq, p);
++ }
++ ns = tsk_seruntime(p);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ return ns;
++}
++
++/* This manages tasks that have run out of timeslice during a scheduler_tick */
++static inline void scheduler_task_tick(struct rq *rq)
++{
++ struct task_struct *p = rq->curr;
++
++ if (is_idle_task(p))
++ return;
++
++ update_curr(rq, p);
++ cpufreq_update_util(rq, 0);
++
++ /*
++ * Tasks have less than RESCHED_NS of time slice left they will be
++ * rescheduled.
++ */
++ if (p->time_slice >= RESCHED_NS)
++ return;
++ set_tsk_need_resched(p);
++ set_preempt_need_resched();
++}
++
++#ifdef CONFIG_SCHED_DEBUG
++static u64 cpu_resched_latency(struct rq *rq)
++{
++ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
++ u64 resched_latency, now = rq_clock(rq);
++ static bool warned_once;
++
++ if (sysctl_resched_latency_warn_once && warned_once)
++ return 0;
++
++ if (!need_resched() || !latency_warn_ms)
++ return 0;
++
++ if (system_state == SYSTEM_BOOTING)
++ return 0;
++
++ if (!rq->last_seen_need_resched_ns) {
++ rq->last_seen_need_resched_ns = now;
++ rq->ticks_without_resched = 0;
++ return 0;
++ }
++
++ rq->ticks_without_resched++;
++ resched_latency = now - rq->last_seen_need_resched_ns;
++ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
++ return 0;
++
++ warned_once = true;
++
++ return resched_latency;
++}
++
++static int __init setup_resched_latency_warn_ms(char *str)
++{
++ long val;
++
++ if ((kstrtol(str, 0, &val))) {
++ pr_warn("Unable to set resched_latency_warn_ms\n");
++ return 1;
++ }
++
++ sysctl_resched_latency_warn_ms = val;
++ return 1;
++}
++__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
++#else
++static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
++#endif /* CONFIG_SCHED_DEBUG */
++
++/*
++ * This function gets called by the timer code, with HZ frequency.
++ * We call it with interrupts disabled.
++ */
++void scheduler_tick(void)
++{
++ int cpu __maybe_unused = smp_processor_id();
++ struct rq *rq = cpu_rq(cpu);
++ u64 resched_latency;
++
++ arch_scale_freq_tick();
++ sched_clock_tick();
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ scheduler_task_tick(rq);
++ if (sched_feat(LATENCY_WARN))
++ resched_latency = cpu_resched_latency(rq);
++ calc_global_load_tick(rq);
++
++ rq->last_tick = rq->clock;
++ raw_spin_unlock(&rq->lock);
++
++ if (sched_feat(LATENCY_WARN) && resched_latency)
++ resched_latency_warn(cpu, resched_latency);
++
++ perf_event_task_tick();
++}
++
++#ifdef CONFIG_SCHED_SMT
++static inline int active_load_balance_cpu_stop(void *data)
++{
++ struct rq *rq = this_rq();
++ struct task_struct *p = data;
++ cpumask_t tmp;
++ unsigned long flags;
++
++ local_irq_save(flags);
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++
++ rq->active_balance = 0;
++ /* _something_ may have changed the task, double check again */
++ if (task_on_rq_queued(p) && task_rq(p) == rq &&
++ cpumask_and(&tmp, p->cpus_ptr, &sched_sg_idle_mask) &&
++ !is_migration_disabled(p)) {
++ int cpu = cpu_of(rq);
++ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu));
++ rq = move_queued_task(rq, p, dcpu);
++ }
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock(&p->pi_lock);
++
++ local_irq_restore(flags);
++
++ return 0;
++}
++
++/* sg_balance_trigger - trigger slibing group balance for @cpu */
++static inline int sg_balance_trigger(const int cpu)
++{
++ struct rq *rq= cpu_rq(cpu);
++ unsigned long flags;
++ struct task_struct *curr;
++ int res;
++
++ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
++ return 0;
++ curr = rq->curr;
++ res = (!is_idle_task(curr)) && (1 == rq->nr_running) &&\
++ cpumask_intersects(curr->cpus_ptr, &sched_sg_idle_mask) &&\
++ !is_migration_disabled(curr) && (!rq->active_balance);
++
++ if (res)
++ rq->active_balance = 1;
++
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ if (res)
++ stop_one_cpu_nowait(cpu, active_load_balance_cpu_stop,
++ curr, &rq->active_balance_work);
++ return res;
++}
++
++/*
++ * sg_balance_check - slibing group balance check for run queue @rq
++ */
++static inline void sg_balance_check(struct rq *rq)
++{
++ cpumask_t chk;
++ int cpu = cpu_of(rq);
++
++ /* exit when cpu is offline */
++ if (unlikely(!rq->online))
++ return;
++
++ /*
++ * Only cpu in slibing idle group will do the checking and then
++ * find potential cpus which can migrate the current running task
++ */
++ if (cpumask_test_cpu(cpu, &sched_sg_idle_mask) &&
++ cpumask_andnot(&chk, cpu_online_mask, sched_rq_watermark) &&
++ cpumask_andnot(&chk, &chk, &sched_rq_pending_mask)) {
++ int i;
++
++ for_each_cpu_wrap(i, &chk, cpu) {
++ if (cpumask_subset(cpu_smt_mask(i), &chk) &&
++ sg_balance_trigger(i))
++ return;
++ }
++ }
++}
++#endif /* CONFIG_SCHED_SMT */
++
++#ifdef CONFIG_NO_HZ_FULL
++
++struct tick_work {
++ int cpu;
++ atomic_t state;
++ struct delayed_work work;
++};
++/* Values for ->state, see diagram below. */
++#define TICK_SCHED_REMOTE_OFFLINE 0
++#define TICK_SCHED_REMOTE_OFFLINING 1
++#define TICK_SCHED_REMOTE_RUNNING 2
++
++/*
++ * State diagram for ->state:
++ *
++ *
++ * TICK_SCHED_REMOTE_OFFLINE
++ * | ^
++ * | |
++ * | | sched_tick_remote()
++ * | |
++ * | |
++ * +--TICK_SCHED_REMOTE_OFFLINING
++ * | ^
++ * | |
++ * sched_tick_start() | | sched_tick_stop()
++ * | |
++ * V |
++ * TICK_SCHED_REMOTE_RUNNING
++ *
++ *
++ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
++ * and sched_tick_start() are happy to leave the state in RUNNING.
++ */
++
++static struct tick_work __percpu *tick_work_cpu;
++
++static void sched_tick_remote(struct work_struct *work)
++{
++ struct delayed_work *dwork = to_delayed_work(work);
++ struct tick_work *twork = container_of(dwork, struct tick_work, work);
++ int cpu = twork->cpu;
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *curr;
++ unsigned long flags;
++ u64 delta;
++ int os;
++
++ /*
++ * Handle the tick only if it appears the remote CPU is running in full
++ * dynticks mode. The check is racy by nature, but missing a tick or
++ * having one too much is no big deal because the scheduler tick updates
++ * statistics and checks timeslices in a time-independent way, regardless
++ * of when exactly it is running.
++ */
++ if (!tick_nohz_tick_stopped_cpu(cpu))
++ goto out_requeue;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ curr = rq->curr;
++ if (cpu_is_offline(cpu))
++ goto out_unlock;
++
++ update_rq_clock(rq);
++ if (!is_idle_task(curr)) {
++ /*
++ * Make sure the next tick runs within a reasonable
++ * amount of time.
++ */
++ delta = rq_clock_task(rq) - curr->last_ran;
++ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
++ }
++ scheduler_task_tick(rq);
++
++ calc_load_nohz_remote(rq);
++out_unlock:
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++out_requeue:
++ /*
++ * Run the remote tick once per second (1Hz). This arbitrary
++ * frequency is large enough to avoid overload but short enough
++ * to keep scheduler internal stats reasonably up to date. But
++ * first update state to reflect hotplug activity if required.
++ */
++ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
++ if (os == TICK_SCHED_REMOTE_RUNNING)
++ queue_delayed_work(system_unbound_wq, dwork, HZ);
++}
++
++static void sched_tick_start(int cpu)
++{
++ int os;
++ struct tick_work *twork;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
++ if (os == TICK_SCHED_REMOTE_OFFLINE) {
++ twork->cpu = cpu;
++ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
++ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
++ }
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++static void sched_tick_stop(int cpu)
++{
++ struct tick_work *twork;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ cancel_delayed_work_sync(&twork->work);
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++int __init sched_tick_offload_init(void)
++{
++ tick_work_cpu = alloc_percpu(struct tick_work);
++ BUG_ON(!tick_work_cpu);
++ return 0;
++}
++
++#else /* !CONFIG_NO_HZ_FULL */
++static inline void sched_tick_start(int cpu) { }
++static inline void sched_tick_stop(int cpu) { }
++#endif
++
++#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
++ defined(CONFIG_PREEMPT_TRACER))
++/*
++ * If the value passed in is equal to the current preempt count
++ * then we just disabled preemption. Start timing the latency.
++ */
++static inline void preempt_latency_start(int val)
++{
++ if (preempt_count() == val) {
++ unsigned long ip = get_lock_parent_ip();
++#ifdef CONFIG_DEBUG_PREEMPT
++ current->preempt_disable_ip = ip;
++#endif
++ trace_preempt_off(CALLER_ADDR0, ip);
++ }
++}
++
++void preempt_count_add(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
++ return;
++#endif
++ __preempt_count_add(val);
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Spinlock count overflowing soon?
++ */
++ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
++ PREEMPT_MASK - 10);
++#endif
++ preempt_latency_start(val);
++}
++EXPORT_SYMBOL(preempt_count_add);
++NOKPROBE_SYMBOL(preempt_count_add);
++
++/*
++ * If the value passed in equals to the current preempt count
++ * then we just enabled preemption. Stop timing the latency.
++ */
++static inline void preempt_latency_stop(int val)
++{
++ if (preempt_count() == val)
++ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
++}
++
++void preempt_count_sub(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
++ return;
++ /*
++ * Is the spinlock portion underflowing?
++ */
++ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
++ !(preempt_count() & PREEMPT_MASK)))
++ return;
++#endif
++
++ preempt_latency_stop(val);
++ __preempt_count_sub(val);
++}
++EXPORT_SYMBOL(preempt_count_sub);
++NOKPROBE_SYMBOL(preempt_count_sub);
++
++#else
++static inline void preempt_latency_start(int val) { }
++static inline void preempt_latency_stop(int val) { }
++#endif
++
++static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ return p->preempt_disable_ip;
++#else
++ return 0;
++#endif
++}
++
++/*
++ * Print scheduling while atomic bug:
++ */
++static noinline void __schedule_bug(struct task_struct *prev)
++{
++ /* Save this before calling printk(), since that will clobber it */
++ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
++
++ if (oops_in_progress)
++ return;
++
++ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
++ prev->comm, prev->pid, preempt_count());
++
++ debug_show_held_locks(prev);
++ print_modules();
++ if (irqs_disabled())
++ print_irqtrace_events(prev);
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)
++ && in_atomic_preempt_off()) {
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, preempt_disable_ip);
++ }
++ if (panic_on_warn)
++ panic("scheduling while atomic\n");
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++
++/*
++ * Various schedule()-time debugging checks and statistics:
++ */
++static inline void schedule_debug(struct task_struct *prev, bool preempt)
++{
++#ifdef CONFIG_SCHED_STACK_END_CHECK
++ if (task_stack_end_corrupted(prev))
++ panic("corrupted stack end detected inside scheduler\n");
++
++ if (task_scs_end_corrupted(prev))
++ panic("corrupted shadow stack detected inside scheduler\n");
++#endif
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
++ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
++ prev->comm, prev->pid, prev->non_block_count);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++ }
++#endif
++
++ if (unlikely(in_atomic_preempt_off())) {
++ __schedule_bug(prev);
++ preempt_count_set(PREEMPT_DISABLED);
++ }
++ rcu_sleep_check();
++ SCHED_WARN_ON(ct_state() == CONTEXT_USER);
++
++ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
++
++ schedstat_inc(this_rq()->sched_count);
++}
++
++/*
++ * Compile time debug macro
++ * #define ALT_SCHED_DEBUG
++ */
++
++#ifdef ALT_SCHED_DEBUG
++void alt_sched_debug(void)
++{
++ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx\n",
++ sched_rq_pending_mask.bits[0],
++ sched_rq_watermark[0].bits[0],
++ sched_sg_idle_mask.bits[0]);
++}
++#else
++inline void alt_sched_debug(void) {}
++#endif
++
++#ifdef CONFIG_SMP
++
++#define SCHED_RQ_NR_MIGRATION (32U)
++/*
++ * Migrate pending tasks in @rq to @dest_cpu
++ * Will try to migrate mininal of half of @rq nr_running tasks and
++ * SCHED_RQ_NR_MIGRATION to @dest_cpu
++ */
++static inline int
++migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
++{
++ struct task_struct *p, *skip = rq->curr;
++ int nr_migrated = 0;
++ int nr_tries = min(rq->nr_running / 2, SCHED_RQ_NR_MIGRATION);
++
++ while (skip != rq->idle && nr_tries &&
++ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
++ skip = sched_rq_next_task(p, rq);
++ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
++ __SCHED_DEQUEUE_TASK(p, rq, 0);
++ set_task_cpu(p, dest_cpu);
++ sched_task_sanity_check(p, dest_rq);
++ __SCHED_ENQUEUE_TASK(p, dest_rq, 0);
++ nr_migrated++;
++ }
++ nr_tries--;
++ }
++
++ return nr_migrated;
++}
++
++static inline int take_other_rq_tasks(struct rq *rq, int cpu)
++{
++ struct cpumask *topo_mask, *end_mask;
++
++ if (unlikely(!rq->online))
++ return 0;
++
++ if (cpumask_empty(&sched_rq_pending_mask))
++ return 0;
++
++ topo_mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
++ do {
++ int i;
++ for_each_cpu_and(i, &sched_rq_pending_mask, topo_mask) {
++ int nr_migrated;
++ struct rq *src_rq;
++
++ src_rq = cpu_rq(i);
++ if (!do_raw_spin_trylock(&src_rq->lock))
++ continue;
++ spin_acquire(&src_rq->lock.dep_map,
++ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
++
++ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
++ src_rq->nr_running -= nr_migrated;
++ if (src_rq->nr_running < 2)
++ cpumask_clear_cpu(i, &sched_rq_pending_mask);
++
++ rq->nr_running += nr_migrated;
++ if (rq->nr_running > 1)
++ cpumask_set_cpu(cpu, &sched_rq_pending_mask);
++
++ cpufreq_update_util(rq, 0);
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++
++ return 1;
++ }
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++ }
++ } while (++topo_mask < end_mask);
++
++ return 0;
++}
++#endif
++
++/*
++ * Timeslices below RESCHED_NS are considered as good as expired as there's no
++ * point rescheduling when there's so little time left.
++ */
++static inline void check_curr(struct task_struct *p, struct rq *rq)
++{
++ if (unlikely(rq->idle == p))
++ return;
++
++ update_curr(rq, p);
++
++ if (p->time_slice < RESCHED_NS)
++ time_slice_expired(p, rq);
++}
++
++static inline struct task_struct *
++choose_next_task(struct rq *rq, int cpu, struct task_struct *prev)
++{
++ struct task_struct *next;
++
++ if (unlikely(rq->skip)) {
++ next = rq_runnable_task(rq);
++ if (next == rq->idle) {
++#ifdef CONFIG_SMP
++ if (!take_other_rq_tasks(rq, cpu)) {
++#endif
++ rq->skip = NULL;
++ schedstat_inc(rq->sched_goidle);
++ return next;
++#ifdef CONFIG_SMP
++ }
++ next = rq_runnable_task(rq);
++#endif
++ }
++ rq->skip = NULL;
++#ifdef CONFIG_HIGH_RES_TIMERS
++ hrtick_start(rq, next->time_slice);
++#endif
++ return next;
++ }
++
++ next = sched_rq_first_task(rq);
++ if (next == rq->idle) {
++#ifdef CONFIG_SMP
++ if (!take_other_rq_tasks(rq, cpu)) {
++#endif
++ schedstat_inc(rq->sched_goidle);
++ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
++ return next;
++#ifdef CONFIG_SMP
++ }
++ next = sched_rq_first_task(rq);
++#endif
++ }
++#ifdef CONFIG_HIGH_RES_TIMERS
++ hrtick_start(rq, next->time_slice);
++#endif
++ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu,
++ * next);*/
++ return next;
++}
++
++/*
++ * Constants for the sched_mode argument of __schedule().
++ *
++ * The mode argument allows RT enabled kernels to differentiate a
++ * preemption from blocking on an 'sleeping' spin/rwlock. Note that
++ * SM_MASK_PREEMPT for !RT has all bits set, which allows the compiler to
++ * optimize the AND operation out and just check for zero.
++ */
++#define SM_NONE 0x0
++#define SM_PREEMPT 0x1
++#define SM_RTLOCK_WAIT 0x2
++
++#ifndef CONFIG_PREEMPT_RT
++# define SM_MASK_PREEMPT (~0U)
++#else
++# define SM_MASK_PREEMPT SM_PREEMPT
++#endif
++
++/*
++ * schedule() is the main scheduler function.
++ *
++ * The main means of driving the scheduler and thus entering this function are:
++ *
++ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
++ *
++ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
++ * paths. For example, see arch/x86/entry_64.S.
++ *
++ * To drive preemption between tasks, the scheduler sets the flag in timer
++ * interrupt handler scheduler_tick().
++ *
++ * 3. Wakeups don't really cause entry into schedule(). They add a
++ * task to the run-queue and that's it.
++ *
++ * Now, if the new task added to the run-queue preempts the current
++ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
++ * called on the nearest possible occasion:
++ *
++ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
++ *
++ * - in syscall or exception context, at the next outmost
++ * preempt_enable(). (this might be as soon as the wake_up()'s
++ * spin_unlock()!)
++ *
++ * - in IRQ context, return from interrupt-handler to
++ * preemptible context
++ *
++ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
++ * then at the next:
++ *
++ * - cond_resched() call
++ * - explicit schedule() call
++ * - return from syscall or exception to user-space
++ * - return from interrupt-handler to user-space
++ *
++ * WARNING: must be called with preemption disabled!
++ */
++static void __sched notrace __schedule(unsigned int sched_mode)
++{
++ struct task_struct *prev, *next;
++ unsigned long *switch_count;
++ unsigned long prev_state;
++ struct rq *rq;
++ int cpu;
++ int deactivated = 0;
++
++ cpu = smp_processor_id();
++ rq = cpu_rq(cpu);
++ prev = rq->curr;
++
++ schedule_debug(prev, !!sched_mode);
++
++ /* by passing sched_feat(HRTICK) checking which Alt schedule FW doesn't support */
++ hrtick_clear(rq);
++
++ local_irq_disable();
++ rcu_note_context_switch(!!sched_mode);
++
++ /*
++ * Make sure that signal_pending_state()->signal_pending() below
++ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
++ * done by the caller to avoid the race with signal_wake_up():
++ *
++ * __set_current_state(@state) signal_wake_up()
++ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
++ * wake_up_state(p, state)
++ * LOCK rq->lock LOCK p->pi_state
++ * smp_mb__after_spinlock() smp_mb__after_spinlock()
++ * if (signal_pending_state()) if (p->state & @state)
++ *
++ * Also, the membarrier system call requires a full memory barrier
++ * after coming from user-space, before storing to rq->curr.
++ */
++ raw_spin_lock(&rq->lock);
++ smp_mb__after_spinlock();
++
++ update_rq_clock(rq);
++
++ switch_count = &prev->nivcsw;
++ /*
++ * We must load prev->state once (task_struct::state is volatile), such
++ * that:
++ *
++ * - we form a control dependency vs deactivate_task() below.
++ * - ptrace_{,un}freeze_traced() can change ->state underneath us.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
++ if (signal_pending_state(prev_state, prev)) {
++ WRITE_ONCE(prev->__state, TASK_RUNNING);
++ } else {
++ prev->sched_contributes_to_load =
++ (prev_state & TASK_UNINTERRUPTIBLE) &&
++ !(prev_state & TASK_NOLOAD) &&
++ !(prev->flags & PF_FROZEN);
++
++ if (prev->sched_contributes_to_load)
++ rq->nr_uninterruptible++;
++
++ /*
++ * __schedule() ttwu()
++ * prev_state = prev->state; if (p->on_rq && ...)
++ * if (prev_state) goto out;
++ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
++ * p->state = TASK_WAKING
++ *
++ * Where __schedule() and ttwu() have matching control dependencies.
++ *
++ * After this, schedule() must not care about p->state any more.
++ */
++ sched_task_deactivate(prev, rq);
++ deactivate_task(prev, rq);
++ deactivated = 1;
++
++ if (prev->in_iowait) {
++ atomic_inc(&rq->nr_iowait);
++ delayacct_blkio_start();
++ }
++ }
++ switch_count = &prev->nvcsw;
++ }
++
++ check_curr(prev, rq);
++
++ next = choose_next_task(rq, cpu, prev);
++ clear_tsk_need_resched(prev);
++ clear_preempt_need_resched();
++#ifdef CONFIG_SCHED_DEBUG
++ rq->last_seen_need_resched_ns = 0;
++#endif
++
++ if (likely(prev != next)) {
++ if (deactivated)
++ update_sched_rq_watermark(rq);
++ next->last_ran = rq->clock_task;
++ rq->last_ts_switch = rq->clock;
++
++ rq->nr_switches++;
++ /*
++ * RCU users of rcu_dereference(rq->curr) may not see
++ * changes to task_struct made by pick_next_task().
++ */
++ RCU_INIT_POINTER(rq->curr, next);
++ /*
++ * The membarrier system call requires each architecture
++ * to have a full memory barrier after updating
++ * rq->curr, before returning to user-space.
++ *
++ * Here are the schemes providing that barrier on the
++ * various architectures:
++ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
++ * switch_mm() rely on membarrier_arch_switch_mm() on PowerPC.
++ * - finish_lock_switch() for weakly-ordered
++ * architectures where spin_unlock is a full barrier,
++ * - switch_to() for arm64 (weakly-ordered, spin_unlock
++ * is a RELEASE barrier),
++ */
++ ++*switch_count;
++
++ psi_sched_switch(prev, next, !task_on_rq_queued(prev));
++
++ trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state);
++
++ /* Also unlocks the rq: */
++ rq = context_switch(rq, prev, next);
++ } else {
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++ }
++
++#ifdef CONFIG_SCHED_SMT
++ sg_balance_check(rq);
++#endif
++}
++
++void __noreturn do_task_dead(void)
++{
++ /* Causes final put_task_struct in finish_task_switch(): */
++ set_special_state(TASK_DEAD);
++
++ /* Tell freezer to ignore us: */
++ current->flags |= PF_NOFREEZE;
++
++ __schedule(SM_NONE);
++ BUG();
++
++ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
++ for (;;)
++ cpu_relax();
++}
++
++static inline void sched_submit_work(struct task_struct *tsk)
++{
++ unsigned int task_flags;
++
++ if (task_is_running(tsk))
++ return;
++
++ task_flags = tsk->flags;
++ /*
++ * If a worker goes to sleep, notify and ask workqueue whether it
++ * wants to wake up a task to maintain concurrency.
++ */
++ if (task_flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
++ if (task_flags & PF_WQ_WORKER)
++ wq_worker_sleeping(tsk);
++ else
++ io_wq_worker_sleeping(tsk);
++ }
++
++ if (tsk_is_pi_blocked(tsk))
++ return;
++
++ /*
++ * If we are going to sleep and we have plugged IO queued,
++ * make sure to submit it to avoid deadlocks.
++ */
++ blk_flush_plug(tsk->plug, true);
++}
++
++static void sched_update_worker(struct task_struct *tsk)
++{
++ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
++ if (tsk->flags & PF_WQ_WORKER)
++ wq_worker_running(tsk);
++ else
++ io_wq_worker_running(tsk);
++ }
++}
++
++asmlinkage __visible void __sched schedule(void)
++{
++ struct task_struct *tsk = current;
++
++ sched_submit_work(tsk);
++ do {
++ preempt_disable();
++ __schedule(SM_NONE);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++ sched_update_worker(tsk);
++}
++EXPORT_SYMBOL(schedule);
++
++/*
++ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
++ * state (have scheduled out non-voluntarily) by making sure that all
++ * tasks have either left the run queue or have gone into user space.
++ * As idle tasks do not do either, they must not ever be preempted
++ * (schedule out non-voluntarily).
++ *
++ * schedule_idle() is similar to schedule_preempt_disable() except that it
++ * never enables preemption because it does not call sched_submit_work().
++ */
++void __sched schedule_idle(void)
++{
++ /*
++ * As this skips calling sched_submit_work(), which the idle task does
++ * regardless because that function is a nop when the task is in a
++ * TASK_RUNNING state, make sure this isn't used someplace that the
++ * current task can be in any other state. Note, idle is always in the
++ * TASK_RUNNING state.
++ */
++ WARN_ON_ONCE(current->__state);
++ do {
++ __schedule(SM_NONE);
++ } while (need_resched());
++}
++
++#if defined(CONFIG_CONTEXT_TRACKING) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK)
++asmlinkage __visible void __sched schedule_user(void)
++{
++ /*
++ * If we come here after a random call to set_need_resched(),
++ * or we have been woken up remotely but the IPI has not yet arrived,
++ * we haven't yet exited the RCU idle mode. Do it here manually until
++ * we find a better solution.
++ *
++ * NB: There are buggy callers of this function. Ideally we
++ * should warn if prev_state != CONTEXT_USER, but that will trigger
++ * too frequently to make sense yet.
++ */
++ enum ctx_state prev_state = exception_enter();
++ schedule();
++ exception_exit(prev_state);
++}
++#endif
++
++/**
++ * schedule_preempt_disabled - called with preemption disabled
++ *
++ * Returns with preemption disabled. Note: preempt_count must be 1
++ */
++void __sched schedule_preempt_disabled(void)
++{
++ sched_preempt_enable_no_resched();
++ schedule();
++ preempt_disable();
++}
++
++#ifdef CONFIG_PREEMPT_RT
++void __sched notrace schedule_rtlock(void)
++{
++ do {
++ preempt_disable();
++ __schedule(SM_RTLOCK_WAIT);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++}
++NOKPROBE_SYMBOL(schedule_rtlock);
++#endif
++
++static void __sched notrace preempt_schedule_common(void)
++{
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ __schedule(SM_PREEMPT);
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++
++ /*
++ * Check again in case we missed a preemption opportunity
++ * between schedule and now.
++ */
++ } while (need_resched());
++}
++
++#ifdef CONFIG_PREEMPTION
++/*
++ * This is the entry point to schedule() from in-kernel preemption
++ * off of preempt_enable.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule(void)
++{
++ /*
++ * If there is a non-zero preempt_count or interrupts are disabled,
++ * we do not want to preempt the current task. Just return..
++ */
++ if (likely(!preemptible()))
++ return;
++
++ preempt_schedule_common();
++}
++NOKPROBE_SYMBOL(preempt_schedule);
++EXPORT_SYMBOL(preempt_schedule);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#ifndef preempt_schedule_dynamic_enabled
++#define preempt_schedule_dynamic_enabled preempt_schedule
++#define preempt_schedule_dynamic_disabled NULL
++#endif
++DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
++void __sched notrace dynamic_preempt_schedule(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
++ return;
++ preempt_schedule();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule);
++EXPORT_SYMBOL(dynamic_preempt_schedule);
++#endif
++#endif
++
++/**
++ * preempt_schedule_notrace - preempt_schedule called by tracing
++ *
++ * The tracing infrastructure uses preempt_enable_notrace to prevent
++ * recursion and tracing preempt enabling caused by the tracing
++ * infrastructure itself. But as tracing can happen in areas coming
++ * from userspace or just about to enter userspace, a preempt enable
++ * can occur before user_exit() is called. This will cause the scheduler
++ * to be called when the system is still in usermode.
++ *
++ * To prevent this, the preempt_enable_notrace will use this function
++ * instead of preempt_schedule() to exit user context if needed before
++ * calling the scheduler.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
++{
++ enum ctx_state prev_ctx;
++
++ if (likely(!preemptible()))
++ return;
++
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ /*
++ * Needs preempt disabled in case user_exit() is traced
++ * and the tracer calls preempt_enable_notrace() causing
++ * an infinite recursion.
++ */
++ prev_ctx = exception_enter();
++ __schedule(SM_PREEMPT);
++ exception_exit(prev_ctx);
++
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++ } while (need_resched());
++}
++EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#ifndef preempt_schedule_notrace_dynamic_enabled
++#define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
++#define preempt_schedule_notrace_dynamic_disabled NULL
++#endif
++DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
++void __sched notrace dynamic_preempt_schedule_notrace(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
++ return;
++ preempt_schedule_notrace();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
++EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
++#endif
++#endif
++
++#endif /* CONFIG_PREEMPTION */
++
++/*
++ * This is the entry point to schedule() from kernel preemption
++ * off of irq context.
++ * Note, that this is called and return with irqs disabled. This will
++ * protect us against recursive calling from irq.
++ */
++asmlinkage __visible void __sched preempt_schedule_irq(void)
++{
++ enum ctx_state prev_state;
++
++ /* Catch callers which need to be fixed */
++ BUG_ON(preempt_count() || !irqs_disabled());
++
++ prev_state = exception_enter();
++
++ do {
++ preempt_disable();
++ local_irq_enable();
++ __schedule(SM_PREEMPT);
++ local_irq_disable();
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++
++ exception_exit(prev_state);
++}
++
++int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
++ void *key)
++{
++ WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~WF_SYNC);
++ return try_to_wake_up(curr->private, mode, wake_flags);
++}
++EXPORT_SYMBOL(default_wake_function);
++
++static inline void check_task_changed(struct task_struct *p, struct rq *rq)
++{
++ int idx;
++
++ /* Trigger resched if task sched_prio has been modified. */
++ if (task_on_rq_queued(p) && (idx = task_sched_prio_idx(p, rq)) != p->sq_idx) {
++ requeue_task(p, rq, idx);
++ check_preempt_curr(rq);
++ }
++}
++
++static void __setscheduler_prio(struct task_struct *p, int prio)
++{
++ p->prio = prio;
++}
++
++#ifdef CONFIG_RT_MUTEXES
++
++static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
++{
++ if (pi_task)
++ prio = min(prio, pi_task->prio);
++
++ return prio;
++}
++
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ struct task_struct *pi_task = rt_mutex_get_top_task(p);
++
++ return __rt_effective_prio(pi_task, prio);
++}
++
++/*
++ * rt_mutex_setprio - set the current priority of a task
++ * @p: task to boost
++ * @pi_task: donor task
++ *
++ * This function changes the 'effective' priority of a task. It does
++ * not touch ->normal_prio like __setscheduler().
++ *
++ * Used by the rt_mutex code to implement priority inheritance
++ * logic. Call site only calls if the priority of the task changed.
++ */
++void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
++{
++ int prio;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ /* XXX used to be waiter->prio, not waiter->task->prio */
++ prio = __rt_effective_prio(pi_task, p->normal_prio);
++
++ /*
++ * If nothing changed; bail early.
++ */
++ if (p->pi_top_task == pi_task && prio == p->prio)
++ return;
++
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Set under pi_lock && rq->lock, such that the value can be used under
++ * either lock.
++ *
++ * Note that there is loads of tricky to make this pointer cache work
++ * right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
++ * ensure a task is de-boosted (pi_task is set to NULL) before the
++ * task is allowed to run again (and can exit). This ensures the pointer
++ * points to a blocked task -- which guarantees the task is present.
++ */
++ p->pi_top_task = pi_task;
++
++ /*
++ * For FIFO/RR we only need to set prio, if that matches we're done.
++ */
++ if (prio == p->prio)
++ goto out_unlock;
++
++ /*
++ * Idle task boosting is a nono in general. There is one
++ * exception, when PREEMPT_RT and NOHZ is active:
++ *
++ * The idle task calls get_next_timer_interrupt() and holds
++ * the timer wheel base->lock on the CPU and another CPU wants
++ * to access the timer (probably to cancel it). We can safely
++ * ignore the boosting request, as the idle CPU runs this code
++ * with interrupts disabled and will complete the lock
++ * protected section without being interrupted. So there is no
++ * real need to boost.
++ */
++ if (unlikely(p == rq->idle)) {
++ WARN_ON(p != rq->curr);
++ WARN_ON(p->pi_blocked_on);
++ goto out_unlock;
++ }
++
++ trace_sched_pi_setprio(p, pi_task);
++
++ __setscheduler_prio(p, prio);
++
++ check_task_changed(p, rq);
++out_unlock:
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++
++ __balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++
++ preempt_enable();
++}
++#else
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ return prio;
++}
++#endif
++
++void set_user_nice(struct task_struct *p, long nice)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
++ return;
++ /*
++ * We have to be careful, if called from sys_setpriority(),
++ * the task might be in the middle of scheduling on another CPU.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ rq = __task_access_lock(p, &lock);
++
++ p->static_prio = NICE_TO_PRIO(nice);
++ /*
++ * The RT priorities are set via sched_setscheduler(), but we still
++ * allow the 'normal' nice value to be set - but as expected
++ * it won't have any effect on scheduling until the task is
++ * not SCHED_NORMAL/SCHED_BATCH:
++ */
++ if (task_has_rt_policy(p))
++ goto out_unlock;
++
++ p->prio = effective_prio(p);
++
++ check_task_changed(p, rq);
++out_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++EXPORT_SYMBOL(set_user_nice);
++
++/*
++ * can_nice - check if a task can reduce its nice value
++ * @p: task
++ * @nice: nice value
++ */
++int can_nice(const struct task_struct *p, const int nice)
++{
++ /* Convert nice value [19,-20] to rlimit style value [1,40] */
++ int nice_rlim = nice_to_rlimit(nice);
++
++ return (nice_rlim <= task_rlimit(p, RLIMIT_NICE) ||
++ capable(CAP_SYS_NICE));
++}
++
++#ifdef __ARCH_WANT_SYS_NICE
++
++/*
++ * sys_nice - change the priority of the current process.
++ * @increment: priority increment
++ *
++ * sys_setpriority is a more generic, but much slower function that
++ * does similar things.
++ */
++SYSCALL_DEFINE1(nice, int, increment)
++{
++ long nice, retval;
++
++ /*
++ * Setpriority might change our priority at the same moment.
++ * We don't have to worry. Conceptually one call occurs first
++ * and we have a single winner.
++ */
++
++ increment = clamp(increment, -NICE_WIDTH, NICE_WIDTH);
++ nice = task_nice(current) + increment;
++
++ nice = clamp_val(nice, MIN_NICE, MAX_NICE);
++ if (increment < 0 && !can_nice(current, nice))
++ return -EPERM;
++
++ retval = security_task_setnice(current, nice);
++ if (retval)
++ return retval;
++
++ set_user_nice(current, nice);
++ return 0;
++}
++
++#endif
++
++/**
++ * task_prio - return the priority value of a given task.
++ * @p: the task in question.
++ *
++ * Return: The priority value as seen by users in /proc.
++ *
++ * sched policy return value kernel prio user prio/nice
++ *
++ * (BMQ)normal, batch, idle[0 ... 53] [100 ... 139] 0/[-20 ... 19]/[-7 ... 7]
++ * (PDS)normal, batch, idle[0 ... 39] 100 0/[-20 ... 19]
++ * fifo, rr [-1 ... -100] [99 ... 0] [0 ... 99]
++ */
++int task_prio(const struct task_struct *p)
++{
++ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
++ task_sched_prio_normal(p, task_rq(p));
++}
++
++/**
++ * idle_cpu - is a given CPU idle currently?
++ * @cpu: the processor in question.
++ *
++ * Return: 1 if the CPU is currently idle. 0 otherwise.
++ */
++int idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (rq->curr != rq->idle)
++ return 0;
++
++ if (rq->nr_running)
++ return 0;
++
++#ifdef CONFIG_SMP
++ if (rq->ttwu_pending)
++ return 0;
++#endif
++
++ return 1;
++}
++
++/**
++ * idle_task - return the idle task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * Return: The idle task for the cpu @cpu.
++ */
++struct task_struct *idle_task(int cpu)
++{
++ return cpu_rq(cpu)->idle;
++}
++
++/**
++ * find_process_by_pid - find a process with a matching PID value.
++ * @pid: the pid in question.
++ *
++ * The task of @pid, if found. %NULL otherwise.
++ */
++static inline struct task_struct *find_process_by_pid(pid_t pid)
++{
++ return pid ? find_task_by_vpid(pid) : current;
++}
++
++/*
++ * sched_setparam() passes in -1 for its policy, to let the functions
++ * it calls know not to change it.
++ */
++#define SETPARAM_POLICY -1
++
++static void __setscheduler_params(struct task_struct *p,
++ const struct sched_attr *attr)
++{
++ int policy = attr->sched_policy;
++
++ if (policy == SETPARAM_POLICY)
++ policy = p->policy;
++
++ p->policy = policy;
++
++ /*
++ * allow normal nice value to be set, but will not have any
++ * effect on scheduling until the task not SCHED_NORMAL/
++ * SCHED_BATCH
++ */
++ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
++
++ /*
++ * __sched_setscheduler() ensures attr->sched_priority == 0 when
++ * !rt_policy. Always setting this ensures that things like
++ * getparam()/getattr() don't report silly values for !rt tasks.
++ */
++ p->rt_priority = attr->sched_priority;
++ p->normal_prio = normal_prio(p);
++}
++
++/*
++ * check the target process has a UID that matches the current process's
++ */
++static bool check_same_owner(struct task_struct *p)
++{
++ const struct cred *cred = current_cred(), *pcred;
++ bool match;
++
++ rcu_read_lock();
++ pcred = __task_cred(p);
++ match = (uid_eq(cred->euid, pcred->euid) ||
++ uid_eq(cred->euid, pcred->uid));
++ rcu_read_unlock();
++ return match;
++}
++
++static int __sched_setscheduler(struct task_struct *p,
++ const struct sched_attr *attr,
++ bool user, bool pi)
++{
++ const struct sched_attr dl_squash_attr = {
++ .size = sizeof(struct sched_attr),
++ .sched_policy = SCHED_FIFO,
++ .sched_nice = 0,
++ .sched_priority = 99,
++ };
++ int oldpolicy = -1, policy = attr->sched_policy;
++ int retval, newprio;
++ struct callback_head *head;
++ unsigned long flags;
++ struct rq *rq;
++ int reset_on_fork;
++ raw_spinlock_t *lock;
++
++ /* The pi code expects interrupts enabled */
++ BUG_ON(pi && in_interrupt());
++
++ /*
++ * Alt schedule FW supports SCHED_DEADLINE by squash it as prio 0 SCHED_FIFO
++ */
++ if (unlikely(SCHED_DEADLINE == policy)) {
++ attr = &dl_squash_attr;
++ policy = attr->sched_policy;
++ }
++recheck:
++ /* Double check policy once rq lock held */
++ if (policy < 0) {
++ reset_on_fork = p->sched_reset_on_fork;
++ policy = oldpolicy = p->policy;
++ } else {
++ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
++
++ if (policy > SCHED_IDLE)
++ return -EINVAL;
++ }
++
++ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
++ return -EINVAL;
++
++ /*
++ * Valid priorities for SCHED_FIFO and SCHED_RR are
++ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
++ * SCHED_BATCH and SCHED_IDLE is 0.
++ */
++ if (attr->sched_priority < 0 ||
++ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
++ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
++ return -EINVAL;
++ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
++ (attr->sched_priority != 0))
++ return -EINVAL;
++
++ /*
++ * Allow unprivileged RT tasks to decrease priority:
++ */
++ if (user && !capable(CAP_SYS_NICE)) {
++ if (SCHED_FIFO == policy || SCHED_RR == policy) {
++ unsigned long rlim_rtprio =
++ task_rlimit(p, RLIMIT_RTPRIO);
++
++ /* Can't set/change the rt policy */
++ if (policy != p->policy && !rlim_rtprio)
++ return -EPERM;
++
++ /* Can't increase priority */
++ if (attr->sched_priority > p->rt_priority &&
++ attr->sched_priority > rlim_rtprio)
++ return -EPERM;
++ }
++
++ /* Can't change other user's priorities */
++ if (!check_same_owner(p))
++ return -EPERM;
++
++ /* Normal users shall not reset the sched_reset_on_fork flag */
++ if (p->sched_reset_on_fork && !reset_on_fork)
++ return -EPERM;
++ }
++
++ if (user) {
++ retval = security_task_setscheduler(p);
++ if (retval)
++ return retval;
++ }
++
++ if (pi)
++ cpuset_read_lock();
++
++ /*
++ * Make sure no PI-waiters arrive (or leave) while we are
++ * changing the priority of the task:
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++
++ /*
++ * To be able to change p->policy safely, task_access_lock()
++ * must be called.
++ * IF use task_access_lock() here:
++ * For the task p which is not running, reading rq->stop is
++ * racy but acceptable as ->stop doesn't change much.
++ * An enhancemnet can be made to read rq->stop saftly.
++ */
++ rq = __task_access_lock(p, &lock);
++
++ /*
++ * Changing the policy of the stop threads its a very bad idea
++ */
++ if (p == rq->stop) {
++ retval = -EINVAL;
++ goto unlock;
++ }
++
++ /*
++ * If not changing anything there's no need to proceed further:
++ */
++ if (unlikely(policy == p->policy)) {
++ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
++ goto change;
++ if (!rt_policy(policy) &&
++ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
++ goto change;
++
++ p->sched_reset_on_fork = reset_on_fork;
++ retval = 0;
++ goto unlock;
++ }
++change:
++
++ /* Re-check policy now with rq lock held */
++ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
++ policy = oldpolicy = -1;
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ if (pi)
++ cpuset_read_unlock();
++ goto recheck;
++ }
++
++ p->sched_reset_on_fork = reset_on_fork;
++
++ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
++ if (pi) {
++ /*
++ * Take priority boosted tasks into account. If the new
++ * effective priority is unchanged, we just store the new
++ * normal parameters and do not touch the scheduler class and
++ * the runqueue. This will be done when the task deboost
++ * itself.
++ */
++ newprio = rt_effective_prio(p, newprio);
++ }
++
++ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
++ __setscheduler_params(p, attr);
++ __setscheduler_prio(p, newprio);
++ }
++
++ check_task_changed(p, rq);
++
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++ head = splice_balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ if (pi) {
++ cpuset_read_unlock();
++ rt_mutex_adjust_pi(p);
++ }
++
++ /* Run balance callbacks after we've adjusted the PI chain: */
++ balance_callbacks(rq, head);
++ preempt_enable();
++
++ return 0;
++
++unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ if (pi)
++ cpuset_read_unlock();
++ return retval;
++}
++
++static int _sched_setscheduler(struct task_struct *p, int policy,
++ const struct sched_param *param, bool check)
++{
++ struct sched_attr attr = {
++ .sched_policy = policy,
++ .sched_priority = param->sched_priority,
++ .sched_nice = PRIO_TO_NICE(p->static_prio),
++ };
++
++ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
++ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
++ attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
++ policy &= ~SCHED_RESET_ON_FORK;
++ attr.sched_policy = policy;
++ }
++
++ return __sched_setscheduler(p, &attr, check, true);
++}
++
++/**
++ * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
++ * @p: the task in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Use sched_set_fifo(), read its comment.
++ *
++ * Return: 0 on success. An error code otherwise.
++ *
++ * NOTE that the task may be already dead.
++ */
++int sched_setscheduler(struct task_struct *p, int policy,
++ const struct sched_param *param)
++{
++ return _sched_setscheduler(p, policy, param, true);
++}
++
++int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
++{
++ return __sched_setscheduler(p, attr, true, true);
++}
++
++int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
++{
++ return __sched_setscheduler(p, attr, false, true);
++}
++EXPORT_SYMBOL_GPL(sched_setattr_nocheck);
++
++/**
++ * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
++ * @p: the task in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Just like sched_setscheduler, only don't bother checking if the
++ * current context has permission. For example, this is needed in
++ * stop_machine(): we create temporary high priority worker threads,
++ * but our caller might not have that capability.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++int sched_setscheduler_nocheck(struct task_struct *p, int policy,
++ const struct sched_param *param)
++{
++ return _sched_setscheduler(p, policy, param, false);
++}
++
++/*
++ * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
++ * incapable of resource management, which is the one thing an OS really should
++ * be doing.
++ *
++ * This is of course the reason it is limited to privileged users only.
++ *
++ * Worse still; it is fundamentally impossible to compose static priority
++ * workloads. You cannot take two correctly working static prio workloads
++ * and smash them together and still expect them to work.
++ *
++ * For this reason 'all' FIFO tasks the kernel creates are basically at:
++ *
++ * MAX_RT_PRIO / 2
++ *
++ * The administrator _MUST_ configure the system, the kernel simply doesn't
++ * know enough information to make a sensible choice.
++ */
++void sched_set_fifo(struct task_struct *p)
++{
++ struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
++ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_fifo);
++
++/*
++ * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
++ */
++void sched_set_fifo_low(struct task_struct *p)
++{
++ struct sched_param sp = { .sched_priority = 1 };
++ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_fifo_low);
++
++void sched_set_normal(struct task_struct *p, int nice)
++{
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ .sched_nice = nice,
++ };
++ WARN_ON_ONCE(sched_setattr_nocheck(p, &attr) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_normal);
++
++static int
++do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
++{
++ struct sched_param lparam;
++ struct task_struct *p;
++ int retval;
++
++ if (!param || pid < 0)
++ return -EINVAL;
++ if (copy_from_user(&lparam, param, sizeof(struct sched_param)))
++ return -EFAULT;
++
++ rcu_read_lock();
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (likely(p))
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (likely(p)) {
++ retval = sched_setscheduler(p, policy, &lparam);
++ put_task_struct(p);
++ }
++
++ return retval;
++}
++
++/*
++ * Mimics kernel/events/core.c perf_copy_attr().
++ */
++static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *attr)
++{
++ u32 size;
++ int ret;
++
++ /* Zero the full structure, so that a short copy will be nice: */
++ memset(attr, 0, sizeof(*attr));
++
++ ret = get_user(size, &uattr->size);
++ if (ret)
++ return ret;
++
++ /* ABI compatibility quirk: */
++ if (!size)
++ size = SCHED_ATTR_SIZE_VER0;
++
++ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
++ goto err_size;
++
++ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
++ if (ret) {
++ if (ret == -E2BIG)
++ goto err_size;
++ return ret;
++ }
++
++ /*
++ * XXX: Do we want to be lenient like existing syscalls; or do we want
++ * to be strict and return an error on out-of-bounds values?
++ */
++ attr->sched_nice = clamp(attr->sched_nice, -20, 19);
++
++ /* sched/core.c uses zero here but we already know ret is zero */
++ return 0;
++
++err_size:
++ put_user(sizeof(*attr), &uattr->size);
++ return -E2BIG;
++}
++
++/**
++ * sys_sched_setscheduler - set/change the scheduler policy and RT priority
++ * @pid: the pid in question.
++ * @policy: new policy.
++ *
++ * Return: 0 on success. An error code otherwise.
++ * @param: structure containing the new RT priority.
++ */
++SYSCALL_DEFINE3(sched_setscheduler, pid_t, pid, int, policy, struct sched_param __user *, param)
++{
++ if (policy < 0)
++ return -EINVAL;
++
++ return do_sched_setscheduler(pid, policy, param);
++}
++
++/**
++ * sys_sched_setparam - set/change the RT priority of a thread
++ * @pid: the pid in question.
++ * @param: structure containing the new RT priority.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
++{
++ return do_sched_setscheduler(pid, SETPARAM_POLICY, param);
++}
++
++/**
++ * sys_sched_setattr - same as above, but with extended sched_attr
++ * @pid: the pid in question.
++ * @uattr: structure containing the extended parameters.
++ */
++SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
++ unsigned int, flags)
++{
++ struct sched_attr attr;
++ struct task_struct *p;
++ int retval;
++
++ if (!uattr || pid < 0 || flags)
++ return -EINVAL;
++
++ retval = sched_copy_attr(uattr, &attr);
++ if (retval)
++ return retval;
++
++ if ((int)attr.sched_policy < 0)
++ return -EINVAL;
++
++ rcu_read_lock();
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (likely(p))
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (likely(p)) {
++ retval = sched_setattr(p, &attr);
++ put_task_struct(p);
++ }
++
++ return retval;
++}
++
++/**
++ * sys_sched_getscheduler - get the policy (scheduling class) of a thread
++ * @pid: the pid in question.
++ *
++ * Return: On success, the policy of the thread. Otherwise, a negative error
++ * code.
++ */
++SYSCALL_DEFINE1(sched_getscheduler, pid_t, pid)
++{
++ struct task_struct *p;
++ int retval = -EINVAL;
++
++ if (pid < 0)
++ goto out_nounlock;
++
++ retval = -ESRCH;
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ if (p) {
++ retval = security_task_getscheduler(p);
++ if (!retval)
++ retval = p->policy;
++ }
++ rcu_read_unlock();
++
++out_nounlock:
++ return retval;
++}
++
++/**
++ * sys_sched_getscheduler - get the RT priority of a thread
++ * @pid: the pid in question.
++ * @param: structure containing the RT priority.
++ *
++ * Return: On success, 0 and the RT priority is in @param. Otherwise, an error
++ * code.
++ */
++SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
++{
++ struct sched_param lp = { .sched_priority = 0 };
++ struct task_struct *p;
++ int retval = -EINVAL;
++
++ if (!param || pid < 0)
++ goto out_nounlock;
++
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ retval = -ESRCH;
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ if (task_has_rt_policy(p))
++ lp.sched_priority = p->rt_priority;
++ rcu_read_unlock();
++
++ /*
++ * This one might sleep, we cannot do it with a spinlock held ...
++ */
++ retval = copy_to_user(param, &lp, sizeof(*param)) ? -EFAULT : 0;
++
++out_nounlock:
++ return retval;
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++/*
++ * Copy the kernel size attribute structure (which might be larger
++ * than what user-space knows about) to user-space.
++ *
++ * Note that all cases are valid: user-space buffer can be larger or
++ * smaller than the kernel-space buffer. The usual case is that both
++ * have the same size.
++ */
++static int
++sched_attr_copy_to_user(struct sched_attr __user *uattr,
++ struct sched_attr *kattr,
++ unsigned int usize)
++{
++ unsigned int ksize = sizeof(*kattr);
++
++ if (!access_ok(uattr, usize))
++ return -EFAULT;
++
++ /*
++ * sched_getattr() ABI forwards and backwards compatibility:
++ *
++ * If usize == ksize then we just copy everything to user-space and all is good.
++ *
++ * If usize < ksize then we only copy as much as user-space has space for,
++ * this keeps ABI compatibility as well. We skip the rest.
++ *
++ * If usize > ksize then user-space is using a newer version of the ABI,
++ * which part the kernel doesn't know about. Just ignore it - tooling can
++ * detect the kernel's knowledge of attributes from the attr->size value
++ * which is set to ksize in this case.
++ */
++ kattr->size = min(usize, ksize);
++
++ if (copy_to_user(uattr, kattr, kattr->size))
++ return -EFAULT;
++
++ return 0;
++}
++
++/**
++ * sys_sched_getattr - similar to sched_getparam, but with sched_attr
++ * @pid: the pid in question.
++ * @uattr: structure containing the extended parameters.
++ * @usize: sizeof(attr) for fwd/bwd comp.
++ * @flags: for future extension.
++ */
++SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
++ unsigned int, usize, unsigned int, flags)
++{
++ struct sched_attr kattr = { };
++ struct task_struct *p;
++ int retval;
++
++ if (!uattr || pid < 0 || usize > PAGE_SIZE ||
++ usize < SCHED_ATTR_SIZE_VER0 || flags)
++ return -EINVAL;
++
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ retval = -ESRCH;
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ kattr.sched_policy = p->policy;
++ if (p->sched_reset_on_fork)
++ kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
++ if (task_has_rt_policy(p))
++ kattr.sched_priority = p->rt_priority;
++ else
++ kattr.sched_nice = task_nice(p);
++ kattr.sched_flags &= SCHED_FLAG_ALL;
++
++#ifdef CONFIG_UCLAMP_TASK
++ kattr.sched_util_min = p->uclamp_req[UCLAMP_MIN].value;
++ kattr.sched_util_max = p->uclamp_req[UCLAMP_MAX].value;
++#endif
++
++ rcu_read_unlock();
++
++ return sched_attr_copy_to_user(uattr, &kattr, usize);
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++static int
++__sched_setaffinity(struct task_struct *p, const struct cpumask *mask)
++{
++ int retval;
++ cpumask_var_t cpus_allowed, new_mask;
++
++ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
++ return -ENOMEM;
++
++ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
++ retval = -ENOMEM;
++ goto out_free_cpus_allowed;
++ }
++
++ cpuset_cpus_allowed(p, cpus_allowed);
++ cpumask_and(new_mask, mask, cpus_allowed);
++again:
++ retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK | SCA_USER);
++ if (retval)
++ goto out_free_new_mask;
++
++ cpuset_cpus_allowed(p, cpus_allowed);
++ if (!cpumask_subset(new_mask, cpus_allowed)) {
++ /*
++ * We must have raced with a concurrent cpuset
++ * update. Just reset the cpus_allowed to the
++ * cpuset's cpus_allowed
++ */
++ cpumask_copy(new_mask, cpus_allowed);
++ goto again;
++ }
++
++out_free_new_mask:
++ free_cpumask_var(new_mask);
++out_free_cpus_allowed:
++ free_cpumask_var(cpus_allowed);
++ return retval;
++}
++
++long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
++{
++ struct task_struct *p;
++ int retval;
++
++ rcu_read_lock();
++
++ p = find_process_by_pid(pid);
++ if (!p) {
++ rcu_read_unlock();
++ return -ESRCH;
++ }
++
++ /* Prevent p going away */
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (p->flags & PF_NO_SETAFFINITY) {
++ retval = -EINVAL;
++ goto out_put_task;
++ }
++
++ if (!check_same_owner(p)) {
++ rcu_read_lock();
++ if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
++ rcu_read_unlock();
++ retval = -EPERM;
++ goto out_put_task;
++ }
++ rcu_read_unlock();
++ }
++
++ retval = security_task_setscheduler(p);
++ if (retval)
++ goto out_put_task;
++
++ retval = __sched_setaffinity(p, in_mask);
++out_put_task:
++ put_task_struct(p);
++ return retval;
++}
++
++static int get_user_cpu_mask(unsigned long __user *user_mask_ptr, unsigned len,
++ struct cpumask *new_mask)
++{
++ if (len < cpumask_size())
++ cpumask_clear(new_mask);
++ else if (len > cpumask_size())
++ len = cpumask_size();
++
++ return copy_from_user(new_mask, user_mask_ptr, len) ? -EFAULT : 0;
++}
++
++/**
++ * sys_sched_setaffinity - set the CPU affinity of a process
++ * @pid: pid of the process
++ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
++ * @user_mask_ptr: user-space pointer to the new CPU mask
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_setaffinity, pid_t, pid, unsigned int, len,
++ unsigned long __user *, user_mask_ptr)
++{
++ cpumask_var_t new_mask;
++ int retval;
++
++ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
++ return -ENOMEM;
++
++ retval = get_user_cpu_mask(user_mask_ptr, len, new_mask);
++ if (retval == 0)
++ retval = sched_setaffinity(pid, new_mask);
++ free_cpumask_var(new_mask);
++ return retval;
++}
++
++long sched_getaffinity(pid_t pid, cpumask_t *mask)
++{
++ struct task_struct *p;
++ raw_spinlock_t *lock;
++ unsigned long flags;
++ int retval;
++
++ rcu_read_lock();
++
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ task_access_lock_irqsave(p, &lock, &flags);
++ cpumask_and(mask, &p->cpus_mask, cpu_active_mask);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++out_unlock:
++ rcu_read_unlock();
++
++ return retval;
++}
++
++/**
++ * sys_sched_getaffinity - get the CPU affinity of a process
++ * @pid: pid of the process
++ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
++ * @user_mask_ptr: user-space pointer to hold the current CPU mask
++ *
++ * Return: size of CPU mask copied to user_mask_ptr on success. An
++ * error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
++ unsigned long __user *, user_mask_ptr)
++{
++ int ret;
++ cpumask_var_t mask;
++
++ if ((len * BITS_PER_BYTE) < nr_cpu_ids)
++ return -EINVAL;
++ if (len & (sizeof(unsigned long)-1))
++ return -EINVAL;
++
++ if (!alloc_cpumask_var(&mask, GFP_KERNEL))
++ return -ENOMEM;
++
++ ret = sched_getaffinity(pid, mask);
++ if (ret == 0) {
++ unsigned int retlen = min_t(size_t, len, cpumask_size());
++
++ if (copy_to_user(user_mask_ptr, mask, retlen))
++ ret = -EFAULT;
++ else
++ ret = retlen;
++ }
++ free_cpumask_var(mask);
++
++ return ret;
++}
++
++static void do_sched_yield(void)
++{
++ struct rq *rq;
++ struct rq_flags rf;
++
++ if (!sched_yield_type)
++ return;
++
++ rq = this_rq_lock_irq(&rf);
++
++ schedstat_inc(rq->yld_count);
++
++ if (1 == sched_yield_type) {
++ if (!rt_task(current))
++ do_sched_yield_type_1(current, rq);
++ } else if (2 == sched_yield_type) {
++ if (rq->nr_running > 1)
++ rq->skip = current;
++ }
++
++ preempt_disable();
++ raw_spin_unlock_irq(&rq->lock);
++ sched_preempt_enable_no_resched();
++
++ schedule();
++}
++
++/**
++ * sys_sched_yield - yield the current processor to other threads.
++ *
++ * This function yields the current CPU to other tasks. If there are no
++ * other threads running on this CPU then this function will return.
++ *
++ * Return: 0.
++ */
++SYSCALL_DEFINE0(sched_yield)
++{
++ do_sched_yield();
++ return 0;
++}
++
++#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
++int __sched __cond_resched(void)
++{
++ if (should_resched(0)) {
++ preempt_schedule_common();
++ return 1;
++ }
++ /*
++ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
++ * whether the current CPU is in an RCU read-side critical section,
++ * so the tick can report quiescent states even for CPUs looping
++ * in kernel context. In contrast, in non-preemptible kernels,
++ * RCU readers leave no in-memory hints, which means that CPU-bound
++ * processes executing in kernel context might never report an
++ * RCU quiescent state. Therefore, the following code causes
++ * cond_resched() to report a quiescent state, but only when RCU
++ * is in urgent need of one.
++ */
++#ifndef CONFIG_PREEMPT_RCU
++ rcu_all_qs();
++#endif
++ return 0;
++}
++EXPORT_SYMBOL(__cond_resched);
++#endif
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#define cond_resched_dynamic_enabled __cond_resched
++#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(cond_resched);
++
++#define might_resched_dynamic_enabled __cond_resched
++#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(might_resched);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
++int __sched dynamic_cond_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_cond_resched);
++
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
++int __sched dynamic_might_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_might_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_might_resched);
++#endif
++#endif
++
++/*
++ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
++ * call schedule, and on return reacquire the lock.
++ *
++ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
++ * operations here to prevent schedule() from being called twice (once via
++ * spin_unlock(), once by hand).
++ */
++int __cond_resched_lock(spinlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held(lock);
++
++ if (spin_needbreak(lock) || resched) {
++ spin_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ spin_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_lock);
++
++int __cond_resched_rwlock_read(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_read(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ read_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ read_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_read);
++
++int __cond_resched_rwlock_write(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_write(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ write_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ write_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_write);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++
++#ifdef CONFIG_GENERIC_ENTRY
++#include <linux/entry-common.h>
++#endif
++
++/*
++ * SC:cond_resched
++ * SC:might_resched
++ * SC:preempt_schedule
++ * SC:preempt_schedule_notrace
++ * SC:irqentry_exit_cond_resched
++ *
++ *
++ * NONE:
++ * cond_resched <- __cond_resched
++ * might_resched <- RET0
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ *
++ * VOLUNTARY:
++ * cond_resched <- __cond_resched
++ * might_resched <- __cond_resched
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ *
++ * FULL:
++ * cond_resched <- RET0
++ * might_resched <- RET0
++ * preempt_schedule <- preempt_schedule
++ * preempt_schedule_notrace <- preempt_schedule_notrace
++ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
++ */
++
++enum {
++ preempt_dynamic_undefined = -1,
++ preempt_dynamic_none,
++ preempt_dynamic_voluntary,
++ preempt_dynamic_full,
++};
++
++int preempt_dynamic_mode = preempt_dynamic_undefined;
++
++int sched_dynamic_mode(const char *str)
++{
++ if (!strcmp(str, "none"))
++ return preempt_dynamic_none;
++
++ if (!strcmp(str, "voluntary"))
++ return preempt_dynamic_voluntary;
++
++ if (!strcmp(str, "full"))
++ return preempt_dynamic_full;
++
++ return -EINVAL;
++}
++
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
++#define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++#define preempt_dynamic_enable(f) static_key_enable(&sk_dynamic_##f.key)
++#define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
++#else
++#error "Unsupported PREEMPT_DYNAMIC mechanism"
++#endif
++
++void sched_dynamic_update(int mode)
++{
++ /*
++ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
++ * the ZERO state, which is invalid.
++ */
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++
++ switch (mode) {
++ case preempt_dynamic_none:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ pr_info("Dynamic Preempt: none\n");
++ break;
++
++ case preempt_dynamic_voluntary:
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ pr_info("Dynamic Preempt: voluntary\n");
++ break;
++
++ case preempt_dynamic_full:
++ preempt_dynamic_disable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ pr_info("Dynamic Preempt: full\n");
++ break;
++ }
++
++ preempt_dynamic_mode = mode;
++}
++
++static int __init setup_preempt_mode(char *str)
++{
++ int mode = sched_dynamic_mode(str);
++ if (mode < 0) {
++ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
++ return 0;
++ }
++
++ sched_dynamic_update(mode);
++ return 1;
++}
++__setup("preempt=", setup_preempt_mode);
++
++static void __init preempt_dynamic_init(void)
++{
++ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
++ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
++ sched_dynamic_update(preempt_dynamic_none);
++ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
++ sched_dynamic_update(preempt_dynamic_voluntary);
++ } else {
++ /* Default static call setting, nothing to do */
++ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
++ preempt_dynamic_mode = preempt_dynamic_full;
++ pr_info("Dynamic Preempt: full\n");
++ }
++ }
++}
++
++#else /* !CONFIG_PREEMPT_DYNAMIC */
++
++static inline void preempt_dynamic_init(void) { }
++
++#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */
++
++/**
++ * yield - yield the current processor to other threads.
++ *
++ * Do not ever use this function, there's a 99% chance you're doing it wrong.
++ *
++ * The scheduler is at all times free to pick the calling task as the most
++ * eligible task to run, if removing the yield() call from your code breaks
++ * it, it's already broken.
++ *
++ * Typical broken usage is:
++ *
++ * while (!event)
++ * yield();
++ *
++ * where one assumes that yield() will let 'the other' process run that will
++ * make event true. If the current task is a SCHED_FIFO task that will never
++ * happen. Never use yield() as a progress guarantee!!
++ *
++ * If you want to use yield() to wait for something, use wait_event().
++ * If you want to use yield() to be 'nice' for others, use cond_resched().
++ * If you still want to use yield(), do not!
++ */
++void __sched yield(void)
++{
++ set_current_state(TASK_RUNNING);
++ do_sched_yield();
++}
++EXPORT_SYMBOL(yield);
++
++/**
++ * yield_to - yield the current processor to another thread in
++ * your thread group, or accelerate that thread toward the
++ * processor it's on.
++ * @p: target task
++ * @preempt: whether task preemption is allowed or not
++ *
++ * It's the caller's job to ensure that the target task struct
++ * can't go away on us before we can do any checks.
++ *
++ * In Alt schedule FW, yield_to is not supported.
++ *
++ * Return:
++ * true (>0) if we indeed boosted the target task.
++ * false (0) if we failed to boost the target.
++ * -ESRCH if there's no task to yield to.
++ */
++int __sched yield_to(struct task_struct *p, bool preempt)
++{
++ return 0;
++}
++EXPORT_SYMBOL_GPL(yield_to);
++
++int io_schedule_prepare(void)
++{
++ int old_iowait = current->in_iowait;
++
++ current->in_iowait = 1;
++ blk_flush_plug(current->plug, true);
++ return old_iowait;
++}
++
++void io_schedule_finish(int token)
++{
++ current->in_iowait = token;
++}
++
++/*
++ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
++ * that process accounting knows that this is a task in IO wait state.
++ *
++ * But don't do that if it is a deliberate, throttling IO wait (this task
++ * has set its backing_dev_info: the queue against which it should throttle)
++ */
++
++long __sched io_schedule_timeout(long timeout)
++{
++ int token;
++ long ret;
++
++ token = io_schedule_prepare();
++ ret = schedule_timeout(timeout);
++ io_schedule_finish(token);
++
++ return ret;
++}
++EXPORT_SYMBOL(io_schedule_timeout);
++
++void __sched io_schedule(void)
++{
++ int token;
++
++ token = io_schedule_prepare();
++ schedule();
++ io_schedule_finish(token);
++}
++EXPORT_SYMBOL(io_schedule);
++
++/**
++ * sys_sched_get_priority_max - return maximum RT priority.
++ * @policy: scheduling class.
++ *
++ * Return: On success, this syscall returns the maximum
++ * rt_priority that can be used by a given scheduling class.
++ * On failure, a negative error code is returned.
++ */
++SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
++{
++ int ret = -EINVAL;
++
++ switch (policy) {
++ case SCHED_FIFO:
++ case SCHED_RR:
++ ret = MAX_RT_PRIO - 1;
++ break;
++ case SCHED_NORMAL:
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ ret = 0;
++ break;
++ }
++ return ret;
++}
++
++/**
++ * sys_sched_get_priority_min - return minimum RT priority.
++ * @policy: scheduling class.
++ *
++ * Return: On success, this syscall returns the minimum
++ * rt_priority that can be used by a given scheduling class.
++ * On failure, a negative error code is returned.
++ */
++SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
++{
++ int ret = -EINVAL;
++
++ switch (policy) {
++ case SCHED_FIFO:
++ case SCHED_RR:
++ ret = 1;
++ break;
++ case SCHED_NORMAL:
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ ret = 0;
++ break;
++ }
++ return ret;
++}
++
++static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
++{
++ struct task_struct *p;
++ int retval;
++
++ alt_sched_debug();
++
++ if (pid < 0)
++ return -EINVAL;
++
++ retval = -ESRCH;
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++ rcu_read_unlock();
++
++ *t = ns_to_timespec64(sched_timeslice_ns);
++ return 0;
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++/**
++ * sys_sched_rr_get_interval - return the default timeslice of a process.
++ * @pid: pid of the process.
++ * @interval: userspace pointer to the timeslice value.
++ *
++ *
++ * Return: On success, 0 and the timeslice is in @interval. Otherwise,
++ * an error code.
++ */
++SYSCALL_DEFINE2(sched_rr_get_interval, pid_t, pid,
++ struct __kernel_timespec __user *, interval)
++{
++ struct timespec64 t;
++ int retval = sched_rr_get_interval(pid, &t);
++
++ if (retval == 0)
++ retval = put_timespec64(&t, interval);
++
++ return retval;
++}
++
++#ifdef CONFIG_COMPAT_32BIT_TIME
++SYSCALL_DEFINE2(sched_rr_get_interval_time32, pid_t, pid,
++ struct old_timespec32 __user *, interval)
++{
++ struct timespec64 t;
++ int retval = sched_rr_get_interval(pid, &t);
++
++ if (retval == 0)
++ retval = put_old_timespec32(&t, interval);
++ return retval;
++}
++#endif
++
++void sched_show_task(struct task_struct *p)
++{
++ unsigned long free = 0;
++ int ppid;
++
++ if (!try_get_task_stack(p))
++ return;
++
++ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
++
++ if (task_is_running(p))
++ pr_cont(" running task ");
++#ifdef CONFIG_DEBUG_STACK_USAGE
++ free = stack_not_used(p);
++#endif
++ ppid = 0;
++ rcu_read_lock();
++ if (pid_alive(p))
++ ppid = task_pid_nr(rcu_dereference(p->real_parent));
++ rcu_read_unlock();
++ pr_cont(" stack:%5lu pid:%5d ppid:%6d flags:0x%08lx\n",
++ free, task_pid_nr(p), ppid,
++ read_task_thread_flags(p));
++
++ print_worker_info(KERN_INFO, p);
++ print_stop_info(KERN_INFO, p);
++ show_stack(p, NULL, KERN_INFO);
++ put_task_stack(p);
++}
++EXPORT_SYMBOL_GPL(sched_show_task);
++
++static inline bool
++state_filter_match(unsigned long state_filter, struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /* no filter, everything matches */
++ if (!state_filter)
++ return true;
++
++ /* filter, but doesn't match */
++ if (!(state & state_filter))
++ return false;
++
++ /*
++ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
++ * TASK_KILLABLE).
++ */
++ if (state_filter == TASK_UNINTERRUPTIBLE && state == TASK_IDLE)
++ return false;
++
++ return true;
++}
++
++
++void show_state_filter(unsigned int state_filter)
++{
++ struct task_struct *g, *p;
++
++ rcu_read_lock();
++ for_each_process_thread(g, p) {
++ /*
++ * reset the NMI-timeout, listing all files on a slow
++ * console might take a lot of time:
++ * Also, reset softlockup watchdogs on all CPUs, because
++ * another CPU might be blocked waiting for us to process
++ * an IPI.
++ */
++ touch_nmi_watchdog();
++ touch_all_softlockup_watchdogs();
++ if (state_filter_match(state_filter, p))
++ sched_show_task(p);
++ }
++
++#ifdef CONFIG_SCHED_DEBUG
++ /* TODO: Alt schedule FW should support this
++ if (!state_filter)
++ sysrq_sched_debug_show();
++ */
++#endif
++ rcu_read_unlock();
++ /*
++ * Only show locks if all tasks are dumped:
++ */
++ if (!state_filter)
++ debug_show_all_locks();
++}
++
++void dump_cpu_task(int cpu)
++{
++ pr_info("Task dump for CPU %d:\n", cpu);
++ sched_show_task(cpu_curr(cpu));
++}
++
++/**
++ * init_idle - set up an idle thread for a given CPU
++ * @idle: task in question
++ * @cpu: CPU the idle task belongs to
++ *
++ * NOTE: this function does not set the idle thread's NEED_RESCHED
++ * flag, to make booting more robust.
++ */
++void __init init_idle(struct task_struct *idle, int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ __sched_fork(0, idle);
++
++ raw_spin_lock_irqsave(&idle->pi_lock, flags);
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ idle->last_ran = rq->clock_task;
++ idle->__state = TASK_RUNNING;
++ /*
++ * PF_KTHREAD should already be set at this point; regardless, make it
++ * look like a proper per-CPU kthread.
++ */
++ idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY;
++ kthread_set_per_cpu(idle, cpu);
++
++ sched_queue_init_idle(&rq->queue, idle);
++
++#ifdef CONFIG_SMP
++ /*
++ * It's possible that init_idle() gets called multiple times on a task,
++ * in that case do_set_cpus_allowed() will not do the right thing.
++ *
++ * And since this is boot we can forgo the serialisation.
++ */
++ set_cpus_allowed_common(idle, cpumask_of(cpu));
++#endif
++
++ /* Silence PROVE_RCU */
++ rcu_read_lock();
++ __set_task_cpu(idle, cpu);
++ rcu_read_unlock();
++
++ rq->idle = idle;
++ rcu_assign_pointer(rq->curr, idle);
++ idle->on_cpu = 1;
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
++
++ /* Set the preempt count _outside_ the spinlocks! */
++ init_idle_preempt_count(idle, cpu);
++
++ ftrace_graph_init_idle_task(idle, cpu);
++ vtime_init_idle(idle, cpu);
++#ifdef CONFIG_SMP
++ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
++#endif
++}
++
++#ifdef CONFIG_SMP
++
++int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
++ const struct cpumask __maybe_unused *trial)
++{
++ return 1;
++}
++
++int task_can_attach(struct task_struct *p,
++ const struct cpumask *cs_cpus_allowed)
++{
++ int ret = 0;
++
++ /*
++ * Kthreads which disallow setaffinity shouldn't be moved
++ * to a new cpuset; we don't want to change their CPU
++ * affinity and isolating such threads by their set of
++ * allowed nodes is unnecessary. Thus, cpusets are not
++ * applicable for such threads. This prevents checking for
++ * success of set_cpus_allowed_ptr() on all attached tasks
++ * before cpus_mask may be changed.
++ */
++ if (p->flags & PF_NO_SETAFFINITY)
++ ret = -EINVAL;
++
++ return ret;
++}
++
++bool sched_smp_initialized __read_mostly;
++
++#ifdef CONFIG_HOTPLUG_CPU
++/*
++ * Ensures that the idle task is using init_mm right before its CPU goes
++ * offline.
++ */
++void idle_task_exit(void)
++{
++ struct mm_struct *mm = current->active_mm;
++
++ BUG_ON(current != this_rq()->idle);
++
++ if (mm != &init_mm) {
++ switch_mm(mm, &init_mm, current);
++ finish_arch_post_lock_switch();
++ }
++
++ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
++}
++
++static int __balance_push_cpu_stop(void *arg)
++{
++ struct task_struct *p = arg;
++ struct rq *rq = this_rq();
++ struct rq_flags rf;
++ int cpu;
++
++ raw_spin_lock_irq(&p->pi_lock);
++ rq_lock(rq, &rf);
++
++ update_rq_clock(rq);
++
++ if (task_rq(p) == rq && task_on_rq_queued(p)) {
++ cpu = select_fallback_rq(rq->cpu, p);
++ rq = __migrate_task(rq, p, cpu);
++ }
++
++ rq_unlock(rq, &rf);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ put_task_struct(p);
++
++ return 0;
++}
++
++static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
++
++/*
++ * This is enabled below SCHED_AP_ACTIVE; when !cpu_active(), but only
++ * effective when the hotplug motion is down.
++ */
++static void balance_push(struct rq *rq)
++{
++ struct task_struct *push_task = rq->curr;
++
++ lockdep_assert_held(&rq->lock);
++
++ /*
++ * Ensure the thing is persistent until balance_push_set(.on = false);
++ */
++ rq->balance_callback = &balance_push_callback;
++
++ /*
++ * Only active while going offline and when invoked on the outgoing
++ * CPU.
++ */
++ if (!cpu_dying(rq->cpu) || rq != this_rq())
++ return;
++
++ /*
++ * Both the cpu-hotplug and stop task are in this case and are
++ * required to complete the hotplug process.
++ */
++ if (kthread_is_per_cpu(push_task) ||
++ is_migration_disabled(push_task)) {
++
++ /*
++ * If this is the idle task on the outgoing CPU try to wake
++ * up the hotplug control thread which might wait for the
++ * last task to vanish. The rcuwait_active() check is
++ * accurate here because the waiter is pinned on this CPU
++ * and can't obviously be running in parallel.
++ *
++ * On RT kernels this also has to check whether there are
++ * pinned and scheduled out tasks on the runqueue. They
++ * need to leave the migrate disabled section first.
++ */
++ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
++ rcuwait_active(&rq->hotplug_wait)) {
++ raw_spin_unlock(&rq->lock);
++ rcuwait_wake_up(&rq->hotplug_wait);
++ raw_spin_lock(&rq->lock);
++ }
++ return;
++ }
++
++ get_task_struct(push_task);
++ /*
++ * Temporarily drop rq->lock such that we can wake-up the stop task.
++ * Both preemption and IRQs are still disabled.
++ */
++ raw_spin_unlock(&rq->lock);
++ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
++ this_cpu_ptr(&push_work));
++ /*
++ * At this point need_resched() is true and we'll take the loop in
++ * schedule(). The next pick is obviously going to be the stop task
++ * which kthread_is_per_cpu() and will push this task away.
++ */
++ raw_spin_lock(&rq->lock);
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct rq_flags rf;
++
++ rq_lock_irqsave(rq, &rf);
++ if (on) {
++ WARN_ON_ONCE(rq->balance_callback);
++ rq->balance_callback = &balance_push_callback;
++ } else if (rq->balance_callback == &balance_push_callback) {
++ rq->balance_callback = NULL;
++ }
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Invoked from a CPUs hotplug control thread after the CPU has been marked
++ * inactive. All tasks which are not per CPU kernel threads are either
++ * pushed off this CPU now via balance_push() or placed on a different CPU
++ * during wakeup. Wait until the CPU is quiescent.
++ */
++static void balance_hotplug_wait(void)
++{
++ struct rq *rq = this_rq();
++
++ rcuwait_wait_event(&rq->hotplug_wait,
++ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
++ TASK_UNINTERRUPTIBLE);
++}
++
++#else
++
++static void balance_push(struct rq *rq)
++{
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++}
++
++static inline void balance_hotplug_wait(void)
++{
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++static void set_rq_offline(struct rq *rq)
++{
++ if (rq->online)
++ rq->online = false;
++}
++
++static void set_rq_online(struct rq *rq)
++{
++ if (!rq->online)
++ rq->online = true;
++}
++
++/*
++ * used to mark begin/end of suspend/resume:
++ */
++static int num_cpus_frozen;
++
++/*
++ * Update cpusets according to cpu_active mask. If cpusets are
++ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
++ * around partition_sched_domains().
++ *
++ * If we come here as part of a suspend/resume, don't touch cpusets because we
++ * want to restore it back to its original state upon resume anyway.
++ */
++static void cpuset_cpu_active(void)
++{
++ if (cpuhp_tasks_frozen) {
++ /*
++ * num_cpus_frozen tracks how many CPUs are involved in suspend
++ * resume sequence. As long as this is not the last online
++ * operation in the resume sequence, just build a single sched
++ * domain, ignoring cpusets.
++ */
++ partition_sched_domains(1, NULL, NULL);
++ if (--num_cpus_frozen)
++ return;
++ /*
++ * This is the last CPU online operation. So fall through and
++ * restore the original sched domains by considering the
++ * cpuset configurations.
++ */
++ cpuset_force_rebuild();
++ }
++
++ cpuset_update_active_cpus();
++}
++
++static int cpuset_cpu_inactive(unsigned int cpu)
++{
++ if (!cpuhp_tasks_frozen) {
++ cpuset_update_active_cpus();
++ } else {
++ num_cpus_frozen++;
++ partition_sched_domains(1, NULL, NULL);
++ }
++ return 0;
++}
++
++int sched_cpu_activate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /*
++ * Clear the balance_push callback and prepare to schedule
++ * regular tasks.
++ */
++ balance_push_set(cpu, false);
++
++#ifdef CONFIG_SCHED_SMT
++ /*
++ * When going up, increment the number of cores with SMT present.
++ */
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
++ static_branch_inc_cpuslocked(&sched_smt_present);
++#endif
++ set_cpu_active(cpu, true);
++
++ if (sched_smp_initialized)
++ cpuset_cpu_active();
++
++ /*
++ * Put the rq online, if not already. This happens:
++ *
++ * 1) In the early boot process, because we build the real domains
++ * after all cpus have been brought up.
++ *
++ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
++ * domains.
++ */
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ set_rq_online(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ return 0;
++}
++
++int sched_cpu_deactivate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++ int ret;
++
++ set_cpu_active(cpu, false);
++
++ /*
++ * From this point forward, this CPU will refuse to run any task that
++ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
++ * push those tasks away until this gets cleared, see
++ * sched_cpu_dying().
++ */
++ balance_push_set(cpu, true);
++
++ /*
++ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
++ * users of this state to go away such that all new such users will
++ * observe it.
++ *
++ * Specifically, we rely on ttwu to no longer target this CPU, see
++ * ttwu_queue_cond() and is_cpu_allowed().
++ *
++ * Do sync before park smpboot threads to take care the rcu boost case.
++ */
++ synchronize_rcu();
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ update_rq_clock(rq);
++ set_rq_offline(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++#ifdef CONFIG_SCHED_SMT
++ /*
++ * When going down, decrement the number of cores with SMT present.
++ */
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
++ static_branch_dec_cpuslocked(&sched_smt_present);
++ if (!static_branch_likely(&sched_smt_present))
++ cpumask_clear(&sched_sg_idle_mask);
++ }
++#endif
++
++ if (!sched_smp_initialized)
++ return 0;
++
++ ret = cpuset_cpu_inactive(cpu);
++ if (ret) {
++ balance_push_set(cpu, false);
++ set_cpu_active(cpu, true);
++ return ret;
++ }
++
++ return 0;
++}
++
++static void sched_rq_cpu_starting(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ rq->calc_load_update = calc_load_update;
++}
++
++int sched_cpu_starting(unsigned int cpu)
++{
++ sched_rq_cpu_starting(cpu);
++ sched_tick_start(cpu);
++ return 0;
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++
++/*
++ * Invoked immediately before the stopper thread is invoked to bring the
++ * CPU down completely. At this point all per CPU kthreads except the
++ * hotplug thread (current) and the stopper thread (inactive) have been
++ * either parked or have been unbound from the outgoing CPU. Ensure that
++ * any of those which might be on the way out are gone.
++ *
++ * If after this point a bound task is being woken on this CPU then the
++ * responsible hotplug callback has failed to do it's job.
++ * sched_cpu_dying() will catch it with the appropriate fireworks.
++ */
++int sched_cpu_wait_empty(unsigned int cpu)
++{
++ balance_hotplug_wait();
++ return 0;
++}
++
++/*
++ * Since this CPU is going 'away' for a while, fold any nr_active delta we
++ * might have. Called from the CPU stopper task after ensuring that the
++ * stopper is the last running task on the CPU, so nr_active count is
++ * stable. We need to take the teardown thread which is calling this into
++ * account, so we hand in adjust = 1 to the load calculation.
++ *
++ * Also see the comment "Global load-average calculations".
++ */
++static void calc_load_migrate(struct rq *rq)
++{
++ long delta = calc_load_fold_active(rq, 1);
++
++ if (delta)
++ atomic_long_add(delta, &calc_load_tasks);
++}
++
++static void dump_rq_tasks(struct rq *rq, const char *loglvl)
++{
++ struct task_struct *g, *p;
++ int cpu = cpu_of(rq);
++
++ lockdep_assert_held(&rq->lock);
++
++ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
++ for_each_process_thread(g, p) {
++ if (task_cpu(p) != cpu)
++ continue;
++
++ if (!task_on_rq_queued(p))
++ continue;
++
++ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
++ }
++}
++
++int sched_cpu_dying(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /* Handle pending wakeups and then migrate everything off */
++ sched_tick_stop(cpu);
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
++ WARN(true, "Dying CPU not properly vacated!");
++ dump_rq_tasks(rq, KERN_WARNING);
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ calc_load_migrate(rq);
++ hrtick_clear(rq);
++ return 0;
++}
++#endif
++
++#ifdef CONFIG_SMP
++static void sched_init_topology_cpumask_early(void)
++{
++ int cpu;
++ cpumask_t *tmp;
++
++ for_each_possible_cpu(cpu) {
++ /* init topo masks */
++ tmp = per_cpu(sched_cpu_topo_masks, cpu);
++
++ cpumask_copy(tmp, cpumask_of(cpu));
++ tmp++;
++ cpumask_copy(tmp, cpu_possible_mask);
++ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
++ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
++ /*per_cpu(sd_llc_id, cpu) = cpu;*/
++ }
++}
++
++#define TOPOLOGY_CPUMASK(name, mask, last)\
++ if (cpumask_and(topo, topo, mask)) { \
++ cpumask_copy(topo, mask); \
++ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
++ cpu, (topo++)->bits[0]); \
++ } \
++ if (!last) \
++ cpumask_complement(topo, mask)
++
++static void sched_init_topology_cpumask(void)
++{
++ int cpu;
++ cpumask_t *topo;
++
++ for_each_online_cpu(cpu) {
++ /* take chance to reset time slice for idle tasks */
++ cpu_rq(cpu)->idle->time_slice = sched_timeslice_ns;
++
++ topo = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++
++ cpumask_complement(topo, cpumask_of(cpu));
++#ifdef CONFIG_SCHED_SMT
++ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
++#endif
++ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
++ per_cpu(sched_cpu_llc_mask, cpu) = topo;
++ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
++
++ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
++
++ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
++
++ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
++ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
++ cpu, per_cpu(sd_llc_id, cpu),
++ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
++ per_cpu(sched_cpu_topo_masks, cpu)));
++ }
++}
++#endif
++
++void __init sched_init_smp(void)
++{
++ /* Move init over to a non-isolated CPU */
++ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
++ BUG();
++ current->flags &= ~PF_NO_SETAFFINITY;
++
++ sched_init_topology_cpumask();
++
++ sched_smp_initialized = true;
++}
++#else
++void __init sched_init_smp(void)
++{
++ cpu_rq(0)->idle->time_slice = sched_timeslice_ns;
++}
++#endif /* CONFIG_SMP */
++
++int in_sched_functions(unsigned long addr)
++{
++ return in_lock_functions(addr) ||
++ (addr >= (unsigned long)__sched_text_start
++ && addr < (unsigned long)__sched_text_end);
++}
++
++#ifdef CONFIG_CGROUP_SCHED
++/* task group related information */
++struct task_group {
++ struct cgroup_subsys_state css;
++
++ struct rcu_head rcu;
++ struct list_head list;
++
++ struct task_group *parent;
++ struct list_head siblings;
++ struct list_head children;
++#ifdef CONFIG_FAIR_GROUP_SCHED
++ unsigned long shares;
++#endif
++};
++
++/*
++ * Default task group.
++ * Every task in system belongs to this group at bootup.
++ */
++struct task_group root_task_group;
++LIST_HEAD(task_groups);
++
++/* Cacheline aligned slab cache for task_group */
++static struct kmem_cache *task_group_cache __read_mostly;
++#endif /* CONFIG_CGROUP_SCHED */
++
++void __init sched_init(void)
++{
++ int i;
++ struct rq *rq;
++
++ printk(KERN_INFO ALT_SCHED_VERSION_MSG);
++
++ wait_bit_init();
++
++#ifdef CONFIG_SMP
++ for (i = 0; i < SCHED_BITS; i++)
++ cpumask_copy(sched_rq_watermark + i, cpu_present_mask);
++#endif
++
++#ifdef CONFIG_CGROUP_SCHED
++ task_group_cache = KMEM_CACHE(task_group, 0);
++
++ list_add(&root_task_group.list, &task_groups);
++ INIT_LIST_HEAD(&root_task_group.children);
++ INIT_LIST_HEAD(&root_task_group.siblings);
++#endif /* CONFIG_CGROUP_SCHED */
++ for_each_possible_cpu(i) {
++ rq = cpu_rq(i);
++
++ sched_queue_init(&rq->queue);
++ rq->watermark = IDLE_TASK_SCHED_PRIO;
++ rq->skip = NULL;
++
++ raw_spin_lock_init(&rq->lock);
++ rq->nr_running = rq->nr_uninterruptible = 0;
++ rq->calc_load_active = 0;
++ rq->calc_load_update = jiffies + LOAD_FREQ;
++#ifdef CONFIG_SMP
++ rq->online = false;
++ rq->cpu = i;
++
++#ifdef CONFIG_SCHED_SMT
++ rq->active_balance = 0;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
++#endif
++ rq->balance_callback = &balance_push_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ rcuwait_init(&rq->hotplug_wait);
++#endif
++#endif /* CONFIG_SMP */
++ rq->nr_switches = 0;
++
++ hrtick_rq_init(rq);
++ atomic_set(&rq->nr_iowait, 0);
++ }
++#ifdef CONFIG_SMP
++ /* Set rq->online for cpu 0 */
++ cpu_rq(0)->online = true;
++#endif
++ /*
++ * The boot idle thread does lazy MMU switching as well:
++ */
++ mmgrab(&init_mm);
++ enter_lazy_tlb(&init_mm, current);
++
++ /*
++ * The idle task doesn't need the kthread struct to function, but it
++ * is dressed up as a per-CPU kthread and thus needs to play the part
++ * if we want to avoid special-casing it in code that deals with per-CPU
++ * kthreads.
++ */
++ WARN_ON(!set_kthread_struct(current));
++
++ /*
++ * Make us the idle thread. Technically, schedule() should not be
++ * called from this thread, however somewhere below it might be,
++ * but because we are the idle thread, we just pick up running again
++ * when this runqueue becomes "idle".
++ */
++ init_idle(current, smp_processor_id());
++
++ calc_load_update = jiffies + LOAD_FREQ;
++
++#ifdef CONFIG_SMP
++ idle_thread_set_boot_cpu();
++ balance_push_set(smp_processor_id(), false);
++
++ sched_init_topology_cpumask_early();
++#endif /* SMP */
++
++ psi_init();
++
++ preempt_dynamic_init();
++}
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++
++void __might_sleep(const char *file, int line)
++{
++ unsigned int state = get_current_state();
++ /*
++ * Blocking primitives will set (and therefore destroy) current->state,
++ * since we will exit with TASK_RUNNING make sure we enter with it,
++ * otherwise we will destroy state.
++ */
++ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
++ "do not call blocking ops when !TASK_RUNNING; "
++ "state=%x set at [<%p>] %pS\n", state,
++ (void *)current->task_state_change,
++ (void *)current->task_state_change);
++
++ __might_resched(file, line, 0);
++}
++EXPORT_SYMBOL(__might_sleep);
++
++static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
++{
++ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
++ return;
++
++ if (preempt_count() == preempt_offset)
++ return;
++
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, ip);
++}
++
++static inline bool resched_offsets_ok(unsigned int offsets)
++{
++ unsigned int nested = preempt_count();
++
++ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
++
++ return nested == offsets;
++}
++
++void __might_resched(const char *file, int line, unsigned int offsets)
++{
++ /* Ratelimiting timestamp: */
++ static unsigned long prev_jiffy;
++
++ unsigned long preempt_disable_ip;
++
++ /* WARN_ON_ONCE() by default, no rate limit required: */
++ rcu_sleep_check();
++
++ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
++ !is_idle_task(current) && !current->non_block_count) ||
++ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
++ oops_in_progress)
++ return;
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ /* Save this before calling printk(), since that will clobber it: */
++ preempt_disable_ip = get_preempt_disable_ip(current);
++
++ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
++ file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), current->non_block_count,
++ current->pid, current->comm);
++ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
++ offsets & MIGHT_RESCHED_PREEMPT_MASK);
++
++ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
++ pr_err("RCU nest depth: %d, expected: %u\n",
++ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
++ }
++
++ if (task_stack_end_corrupted(current))
++ pr_emerg("Thread overran stack, or stack corrupted\n");
++
++ debug_show_held_locks(current);
++ if (irqs_disabled())
++ print_irqtrace_events(current);
++
++ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
++ preempt_disable_ip);
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL(__might_resched);
++
++void __cant_sleep(const char *file, int line, int preempt_offset)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > preempt_offset)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
++ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_sleep);
++
++#ifdef CONFIG_SMP
++void __cant_migrate(const char *file, int line)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (is_migration_disabled(current))
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > 0)
++ return;
++
++ if (current->migration_flags & MDF_FORCE_ENABLED)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), is_migration_disabled(current),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_migrate);
++#endif
++#endif
++
++#ifdef CONFIG_MAGIC_SYSRQ
++void normalize_rt_tasks(void)
++{
++ struct task_struct *g, *p;
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ };
++
++ read_lock(&tasklist_lock);
++ for_each_process_thread(g, p) {
++ /*
++ * Only normalize user tasks:
++ */
++ if (p->flags & PF_KTHREAD)
++ continue;
++
++ schedstat_set(p->stats.wait_start, 0);
++ schedstat_set(p->stats.sleep_start, 0);
++ schedstat_set(p->stats.block_start, 0);
++
++ if (!rt_task(p)) {
++ /*
++ * Renice negative nice level userspace
++ * tasks back to 0:
++ */
++ if (task_nice(p) < 0)
++ set_user_nice(p, 0);
++ continue;
++ }
++
++ __sched_setscheduler(p, &attr, false, false);
++ }
++ read_unlock(&tasklist_lock);
++}
++#endif /* CONFIG_MAGIC_SYSRQ */
++
++#if defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB)
++/*
++ * These functions are only useful for the IA64 MCA handling, or kdb.
++ *
++ * They can only be called when the whole system has been
++ * stopped - every CPU needs to be quiescent, and no scheduling
++ * activity can take place. Using them for anything else would
++ * be a serious bug, and as a result, they aren't even visible
++ * under any other configuration.
++ */
++
++/**
++ * curr_task - return the current task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ *
++ * Return: The current task for @cpu.
++ */
++struct task_struct *curr_task(int cpu)
++{
++ return cpu_curr(cpu);
++}
++
++#endif /* defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB) */
++
++#ifdef CONFIG_IA64
++/**
++ * ia64_set_curr_task - set the current task for a given CPU.
++ * @cpu: the processor in question.
++ * @p: the task pointer to set.
++ *
++ * Description: This function must only be used when non-maskable interrupts
++ * are serviced on a separate stack. It allows the architecture to switch the
++ * notion of the current task on a CPU in a non-blocking manner. This function
++ * must be called with all CPU's synchronised, and interrupts disabled, the
++ * and caller must save the original value of the current task (see
++ * curr_task() above) and restore that value before reenabling interrupts and
++ * re-starting the system.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ */
++void ia64_set_curr_task(int cpu, struct task_struct *p)
++{
++ cpu_curr(cpu) = p;
++}
++
++#endif
++
++#ifdef CONFIG_CGROUP_SCHED
++static void sched_free_group(struct task_group *tg)
++{
++ kmem_cache_free(task_group_cache, tg);
++}
++
++static void sched_free_group_rcu(struct rcu_head *rhp)
++{
++ sched_free_group(container_of(rhp, struct task_group, rcu));
++}
++
++static void sched_unregister_group(struct task_group *tg)
++{
++ /*
++ * We have to wait for yet another RCU grace period to expire, as
++ * print_cfs_stats() might run concurrently.
++ */
++ call_rcu(&tg->rcu, sched_free_group_rcu);
++}
++
++/* allocate runqueue etc for a new task group */
++struct task_group *sched_create_group(struct task_group *parent)
++{
++ struct task_group *tg;
++
++ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
++ if (!tg)
++ return ERR_PTR(-ENOMEM);
++
++ return tg;
++}
++
++void sched_online_group(struct task_group *tg, struct task_group *parent)
++{
++}
++
++/* rcu callback to free various structures associated with a task group */
++static void sched_unregister_group_rcu(struct rcu_head *rhp)
++{
++ /* Now it should be safe to free those cfs_rqs: */
++ sched_unregister_group(container_of(rhp, struct task_group, rcu));
++}
++
++void sched_destroy_group(struct task_group *tg)
++{
++ /* Wait for possible concurrent references to cfs_rqs complete: */
++ call_rcu(&tg->rcu, sched_unregister_group_rcu);
++}
++
++void sched_release_group(struct task_group *tg)
++{
++}
++
++static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
++{
++ return css ? container_of(css, struct task_group, css) : NULL;
++}
++
++static struct cgroup_subsys_state *
++cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
++{
++ struct task_group *parent = css_tg(parent_css);
++ struct task_group *tg;
++
++ if (!parent) {
++ /* This is early initialization for the top cgroup */
++ return &root_task_group.css;
++ }
++
++ tg = sched_create_group(parent);
++ if (IS_ERR(tg))
++ return ERR_PTR(-ENOMEM);
++ return &tg->css;
++}
++
++/* Expose task group only after completing cgroup initialization */
++static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++ struct task_group *parent = css_tg(css->parent);
++
++ if (parent)
++ sched_online_group(tg, parent);
++ return 0;
++}
++
++static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ sched_release_group(tg);
++}
++
++static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ /*
++ * Relies on the RCU grace period between css_released() and this.
++ */
++ sched_unregister_group(tg);
++}
++
++static void cpu_cgroup_fork(struct task_struct *task)
++{
++}
++
++static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
++{
++ return 0;
++}
++
++static void cpu_cgroup_attach(struct cgroup_taskset *tset)
++{
++}
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
++static DEFINE_MUTEX(shares_mutex);
++
++int sched_group_set_shares(struct task_group *tg, unsigned long shares)
++{
++ /*
++ * We can't change the weight of the root cgroup.
++ */
++ if (&root_task_group == tg)
++ return -EINVAL;
++
++ shares = clamp(shares, scale_load(MIN_SHARES), scale_load(MAX_SHARES));
++
++ mutex_lock(&shares_mutex);
++ if (tg->shares == shares)
++ goto done;
++
++ tg->shares = shares;
++done:
++ mutex_unlock(&shares_mutex);
++ return 0;
++}
++
++static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 shareval)
++{
++ if (shareval > scale_load_down(ULONG_MAX))
++ shareval = MAX_SHARES;
++ return sched_group_set_shares(css_tg(css), scale_load(shareval));
++}
++
++static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ struct task_group *tg = css_tg(css);
++
++ return (u64) scale_load_down(tg->shares);
++}
++#endif
++
++static struct cftype cpu_legacy_files[] = {
++#ifdef CONFIG_FAIR_GROUP_SCHED
++ {
++ .name = "shares",
++ .read_u64 = cpu_shares_read_u64,
++ .write_u64 = cpu_shares_write_u64,
++ },
++#endif
++ { } /* Terminate */
++};
++
++
++static struct cftype cpu_files[] = {
++ { } /* terminate */
++};
++
++static int cpu_extra_stat_show(struct seq_file *sf,
++ struct cgroup_subsys_state *css)
++{
++ return 0;
++}
++
++struct cgroup_subsys cpu_cgrp_subsys = {
++ .css_alloc = cpu_cgroup_css_alloc,
++ .css_online = cpu_cgroup_css_online,
++ .css_released = cpu_cgroup_css_released,
++ .css_free = cpu_cgroup_css_free,
++ .css_extra_stat_show = cpu_extra_stat_show,
++ .fork = cpu_cgroup_fork,
++ .can_attach = cpu_cgroup_can_attach,
++ .attach = cpu_cgroup_attach,
++ .legacy_cftypes = cpu_files,
++ .legacy_cftypes = cpu_legacy_files,
++ .dfl_cftypes = cpu_files,
++ .early_init = true,
++ .threaded = true,
++};
++#endif /* CONFIG_CGROUP_SCHED */
++
++#undef CREATE_TRACE_POINTS
+diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
+new file mode 100644
+index 000000000000..1212a031700e
+--- /dev/null
++++ b/kernel/sched/alt_debug.c
+@@ -0,0 +1,31 @@
++/*
++ * kernel/sched/alt_debug.c
++ *
++ * Print the alt scheduler debugging details
++ *
++ * Author: Alfred Chen
++ * Date : 2020
++ */
++#include "sched.h"
++
++/*
++ * This allows printing both to /proc/sched_debug and
++ * to the console
++ */
++#define SEQ_printf(m, x...) \
++ do { \
++ if (m) \
++ seq_printf(m, x); \
++ else \
++ pr_cont(x); \
++ } while (0)
++
++void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
++ struct seq_file *m)
++{
++ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
++ get_nr_threads(p));
++}
++
++void proc_sched_set_task(struct task_struct *p)
++{}
+diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
+new file mode 100644
+index 000000000000..611424bbfa9b
+--- /dev/null
++++ b/kernel/sched/alt_sched.h
+@@ -0,0 +1,645 @@
++#ifndef ALT_SCHED_H
++#define ALT_SCHED_H
++
++#include <linux/psi.h>
++#include <linux/stop_machine.h>
++#include <linux/syscalls.h>
++#include <linux/tick.h>
++
++#include <trace/events/power.h>
++#include <trace/events/sched.h>
++
++#include "../workqueue_internal.h"
++
++#include "cpupri.h"
++
++#ifdef CONFIG_SCHED_BMQ
++/* bits:
++ * RT(0-99), (Low prio adj range, nice width, high prio adj range) / 2, cpu idle task */
++#define SCHED_BITS (MAX_RT_PRIO + NICE_WIDTH / 2 + MAX_PRIORITY_ADJ + 1)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++/* bits: RT(0-99), reserved(100-127), NORMAL_PRIO_NUM, cpu idle task */
++#define SCHED_BITS (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM + 1)
++#endif /* CONFIG_SCHED_PDS */
++
++#define IDLE_TASK_SCHED_PRIO (SCHED_BITS - 1)
++
++#ifdef CONFIG_SCHED_DEBUG
++# define SCHED_WARN_ON(x) WARN_ONCE(x, #x)
++extern void resched_latency_warn(int cpu, u64 latency);
++#else
++# define SCHED_WARN_ON(x) ({ (void)(x), 0; })
++static inline void resched_latency_warn(int cpu, u64 latency) {}
++#endif
++
++/*
++ * Increase resolution of nice-level calculations for 64-bit architectures.
++ * The extra resolution improves shares distribution and load balancing of
++ * low-weight task groups (eg. nice +19 on an autogroup), deeper taskgroup
++ * hierarchies, especially on larger systems. This is not a user-visible change
++ * and does not change the user-interface for setting shares/weights.
++ *
++ * We increase resolution only if we have enough bits to allow this increased
++ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
++ * are pretty high and the returns do not justify the increased costs.
++ *
++ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
++ * increase coverage and consistency always enable it on 64-bit platforms.
++ */
++#ifdef CONFIG_64BIT
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
++# define scale_load_down(w) \
++({ \
++ unsigned long __w = (w); \
++ if (__w) \
++ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
++ __w; \
++})
++#else
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) (w)
++# define scale_load_down(w) (w)
++#endif
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
++#define ROOT_TASK_GROUP_LOAD NICE_0_LOAD
++
++/*
++ * A weight of 0 or 1 can cause arithmetics problems.
++ * A weight of a cfs_rq is the sum of weights of which entities
++ * are queued on this cfs_rq, so a weight of a entity should not be
++ * too large, so as the shares value of a task group.
++ * (The default weight is 1024 - so there's no practical
++ * limitation from this.)
++ */
++#define MIN_SHARES (1UL << 1)
++#define MAX_SHARES (1UL << 18)
++#endif
++
++/* task_struct::on_rq states: */
++#define TASK_ON_RQ_QUEUED 1
++#define TASK_ON_RQ_MIGRATING 2
++
++static inline int task_on_rq_queued(struct task_struct *p)
++{
++ return p->on_rq == TASK_ON_RQ_QUEUED;
++}
++
++static inline int task_on_rq_migrating(struct task_struct *p)
++{
++ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
++}
++
++/*
++ * wake flags
++ */
++#define WF_SYNC 0x01 /* waker goes to sleep after wakeup */
++#define WF_FORK 0x02 /* child wakeup after fork */
++#define WF_MIGRATED 0x04 /* internal use, task got migrated */
++#define WF_ON_CPU 0x08 /* Wakee is on_rq */
++
++#define SCHED_QUEUE_BITS (SCHED_BITS - 1)
++
++struct sched_queue {
++ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
++ struct list_head heads[SCHED_BITS];
++};
++
++/*
++ * This is the main, per-CPU runqueue data structure.
++ * This data should only be modified by the local cpu.
++ */
++struct rq {
++ /* runqueue lock: */
++ raw_spinlock_t lock;
++
++ struct task_struct __rcu *curr;
++ struct task_struct *idle, *stop, *skip;
++ struct mm_struct *prev_mm;
++
++ struct sched_queue queue;
++#ifdef CONFIG_SCHED_PDS
++ u64 time_edge;
++#endif
++ unsigned long watermark;
++
++ /* switch count */
++ u64 nr_switches;
++
++ atomic_t nr_iowait;
++
++#ifdef CONFIG_SCHED_DEBUG
++ u64 last_seen_need_resched_ns;
++ int ticks_without_resched;
++#endif
++
++#ifdef CONFIG_MEMBARRIER
++ int membarrier_state;
++#endif
++
++#ifdef CONFIG_SMP
++ int cpu; /* cpu of this runqueue */
++ bool online;
++
++ unsigned int ttwu_pending;
++ unsigned char nohz_idle_balance;
++ unsigned char idle_balance;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ struct sched_avg avg_irq;
++#endif
++
++#ifdef CONFIG_SCHED_SMT
++ int active_balance;
++ struct cpu_stop_work active_balance_work;
++#endif
++ struct callback_head *balance_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ struct rcuwait hotplug_wait;
++#endif
++ unsigned int nr_pinned;
++
++#endif /* CONFIG_SMP */
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ u64 prev_irq_time;
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++#ifdef CONFIG_PARAVIRT
++ u64 prev_steal_time;
++#endif /* CONFIG_PARAVIRT */
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++ u64 prev_steal_time_rq;
++#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
++
++ /* For genenal cpu load util */
++ s32 load_history;
++ u64 load_block;
++ u64 load_stamp;
++
++ /* calc_load related fields */
++ unsigned long calc_load_update;
++ long calc_load_active;
++
++ u64 clock, last_tick;
++ u64 last_ts_switch;
++ u64 clock_task;
++
++ unsigned int nr_running;
++ unsigned long nr_uninterruptible;
++
++#ifdef CONFIG_SCHED_HRTICK
++#ifdef CONFIG_SMP
++ call_single_data_t hrtick_csd;
++#endif
++ struct hrtimer hrtick_timer;
++ ktime_t hrtick_time;
++#endif
++
++#ifdef CONFIG_SCHEDSTATS
++
++ /* latency stats */
++ struct sched_info rq_sched_info;
++ unsigned long long rq_cpu_time;
++ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
++
++ /* sys_sched_yield() stats */
++ unsigned int yld_count;
++
++ /* schedule() stats */
++ unsigned int sched_switch;
++ unsigned int sched_count;
++ unsigned int sched_goidle;
++
++ /* try_to_wake_up() stats */
++ unsigned int ttwu_count;
++ unsigned int ttwu_local;
++#endif /* CONFIG_SCHEDSTATS */
++
++#ifdef CONFIG_CPU_IDLE
++ /* Must be inspected within a rcu lock section */
++ struct cpuidle_state *idle_state;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++#ifdef CONFIG_SMP
++ call_single_data_t nohz_csd;
++#endif
++ atomic_t nohz_flags;
++#endif /* CONFIG_NO_HZ_COMMON */
++};
++
++extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
++
++extern unsigned long calc_load_update;
++extern atomic_long_t calc_load_tasks;
++
++extern void calc_global_load_tick(struct rq *this_rq);
++extern long calc_load_fold_active(struct rq *this_rq, long adjust);
++
++DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
++#define this_rq() this_cpu_ptr(&runqueues)
++#define task_rq(p) cpu_rq(task_cpu(p))
++#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
++#define raw_rq() raw_cpu_ptr(&runqueues)
++
++#ifdef CONFIG_SMP
++#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
++void register_sched_domain_sysctl(void);
++void unregister_sched_domain_sysctl(void);
++#else
++static inline void register_sched_domain_sysctl(void)
++{
++}
++static inline void unregister_sched_domain_sysctl(void)
++{
++}
++#endif
++
++extern bool sched_smp_initialized;
++
++enum {
++ ITSELF_LEVEL_SPACE_HOLDER,
++#ifdef CONFIG_SCHED_SMT
++ SMT_LEVEL_SPACE_HOLDER,
++#endif
++ COREGROUP_LEVEL_SPACE_HOLDER,
++ CORE_LEVEL_SPACE_HOLDER,
++ OTHER_LEVEL_SPACE_HOLDER,
++ NR_CPU_AFFINITY_LEVELS
++};
++
++DECLARE_PER_CPU(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++DECLARE_PER_CPU(cpumask_t *, sched_cpu_llc_mask);
++
++static inline int
++__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
++{
++ int cpu;
++
++ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
++ mask++;
++
++ return cpu;
++}
++
++static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
++{
++ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
++}
++
++extern void flush_smp_call_function_from_idle(void);
++
++#else /* !CONFIG_SMP */
++static inline void flush_smp_call_function_from_idle(void) { }
++#endif
++
++#ifndef arch_scale_freq_tick
++static __always_inline
++void arch_scale_freq_tick(void)
++{
++}
++#endif
++
++#ifndef arch_scale_freq_capacity
++static __always_inline
++unsigned long arch_scale_freq_capacity(int cpu)
++{
++ return SCHED_CAPACITY_SCALE;
++}
++#endif
++
++static inline u64 __rq_clock_broken(struct rq *rq)
++{
++ return READ_ONCE(rq->clock);
++}
++
++static inline u64 rq_clock(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ, call to
++ * sched_info_xxxx() may not held rq->lock
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock;
++}
++
++static inline u64 rq_clock_task(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ, call to
++ * sched_info_xxxx() may not held rq->lock
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock_task;
++}
++
++/*
++ * {de,en}queue flags:
++ *
++ * DEQUEUE_SLEEP - task is no longer runnable
++ * ENQUEUE_WAKEUP - task just became runnable
++ *
++ */
++
++#define DEQUEUE_SLEEP 0x01
++
++#define ENQUEUE_WAKEUP 0x01
++
++
++/*
++ * Below are scheduler API which using in other kernel code
++ * It use the dummy rq_flags
++ * ToDo : BMQ need to support these APIs for compatibility with mainline
++ * scheduler code.
++ */
++struct rq_flags {
++ unsigned long flags;
++};
++
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock);
++
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock);
++
++static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++}
++
++static inline void
++rq_lock(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock(&rq->lock);
++}
++
++static inline void
++rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++static inline void
++rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline struct rq *
++this_rq_lock_irq(struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ local_irq_disable();
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ return rq;
++}
++
++static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
++{
++ return &rq->lock;
++}
++
++static inline raw_spinlock_t *rq_lockp(struct rq *rq)
++{
++ return __rq_lockp(rq);
++}
++
++static inline void lockdep_assert_rq_held(struct rq *rq)
++{
++ lockdep_assert_held(__rq_lockp(rq));
++}
++
++extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
++extern void raw_spin_rq_unlock(struct rq *rq);
++
++static inline void raw_spin_rq_lock(struct rq *rq)
++{
++ raw_spin_rq_lock_nested(rq, 0);
++}
++
++static inline void raw_spin_rq_lock_irq(struct rq *rq)
++{
++ local_irq_disable();
++ raw_spin_rq_lock(rq);
++}
++
++static inline void raw_spin_rq_unlock_irq(struct rq *rq)
++{
++ raw_spin_rq_unlock(rq);
++ local_irq_enable();
++}
++
++static inline int task_current(struct rq *rq, struct task_struct *p)
++{
++ return rq->curr == p;
++}
++
++static inline bool task_running(struct task_struct *p)
++{
++ return p->on_cpu;
++}
++
++extern int task_running_nice(struct task_struct *p);
++
++extern struct static_key_false sched_schedstats;
++
++#ifdef CONFIG_CPU_IDLE
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++ rq->idle_state = idle_state;
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ WARN_ON(!rcu_read_lock_held());
++ return rq->idle_state;
++}
++#else
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ return NULL;
++}
++#endif
++
++static inline int cpu_of(const struct rq *rq)
++{
++#ifdef CONFIG_SMP
++ return rq->cpu;
++#else
++ return 0;
++#endif
++}
++
++#include "stats.h"
++
++#ifdef CONFIG_NO_HZ_COMMON
++#define NOHZ_BALANCE_KICK_BIT 0
++#define NOHZ_STATS_KICK_BIT 1
++
++#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
++#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
++
++#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
++
++#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
++
++/* TODO: needed?
++extern void nohz_balance_exit_idle(struct rq *rq);
++#else
++static inline void nohz_balance_exit_idle(struct rq *rq) { }
++*/
++#endif
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++struct irqtime {
++ u64 total;
++ u64 tick_delta;
++ u64 irq_start_time;
++ struct u64_stats_sync sync;
++};
++
++DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
++
++/*
++ * Returns the irqtime minus the softirq time computed by ksoftirqd.
++ * Otherwise ksoftirqd's sum_exec_runtime is substracted its own runtime
++ * and never move forward.
++ */
++static inline u64 irq_time_read(int cpu)
++{
++ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
++ unsigned int seq;
++ u64 total;
++
++ do {
++ seq = __u64_stats_fetch_begin(&irqtime->sync);
++ total = irqtime->total;
++ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
++
++ return total;
++}
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++
++#ifdef CONFIG_CPU_FREQ
++DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
++#endif /* CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++extern int __init sched_tick_offload_init(void);
++#else
++static inline int sched_tick_offload_init(void) { return 0; }
++#endif
++
++#ifdef arch_scale_freq_capacity
++#ifndef arch_scale_freq_invariant
++#define arch_scale_freq_invariant() (true)
++#endif
++#else /* arch_scale_freq_capacity */
++#define arch_scale_freq_invariant() (false)
++#endif
++
++extern void schedule_idle(void);
++
++#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
++
++/*
++ * !! For sched_setattr_nocheck() (kernel) only !!
++ *
++ * This is actually gross. :(
++ *
++ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
++ * tasks, but still be able to sleep. We need this on platforms that cannot
++ * atomically change clock frequency. Remove once fast switching will be
++ * available on such platforms.
++ *
++ * SUGOV stands for SchedUtil GOVernor.
++ */
++#define SCHED_FLAG_SUGOV 0x10000000
++
++#ifdef CONFIG_MEMBARRIER
++/*
++ * The scheduler provides memory barriers required by membarrier between:
++ * - prior user-space memory accesses and store to rq->membarrier_state,
++ * - store to rq->membarrier_state and following user-space memory accesses.
++ * In the same way it provides those guarantees around store to rq->curr.
++ */
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++ int membarrier_state;
++
++ if (prev_mm == next_mm)
++ return;
++
++ membarrier_state = atomic_read(&next_mm->membarrier_state);
++ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
++ return;
++
++ WRITE_ONCE(rq->membarrier_state, membarrier_state);
++}
++#else
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++}
++#endif
++
++#ifdef CONFIG_NUMA
++extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
++#else
++static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return nr_cpu_ids;
++}
++#endif
++
++extern void swake_up_all_locked(struct swait_queue_head *q);
++extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++extern int preempt_dynamic_mode;
++extern int sched_dynamic_mode(const char *str);
++extern void sched_dynamic_update(int mode);
++#endif
++
++static inline void nohz_run_idle_balance(int cpu) { }
++
++static inline
++unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
++ struct task_struct *p)
++{
++ return util;
++}
++
++static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
++
++#endif /* ALT_SCHED_H */
+diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
+new file mode 100644
+index 000000000000..bf7ac80ec242
+--- /dev/null
++++ b/kernel/sched/bmq.h
+@@ -0,0 +1,111 @@
++#define ALT_SCHED_VERSION_MSG "sched/bmq: BMQ CPU Scheduler "ALT_SCHED_VERSION" by Alfred Chen.\n"
++
++/*
++ * BMQ only routines
++ */
++#define rq_switch_time(rq) ((rq)->clock - (rq)->last_ts_switch)
++#define boost_threshold(p) (sched_timeslice_ns >>\
++ (15 - MAX_PRIORITY_ADJ - (p)->boost_prio))
++
++static inline void boost_task(struct task_struct *p)
++{
++ int limit;
++
++ switch (p->policy) {
++ case SCHED_NORMAL:
++ limit = -MAX_PRIORITY_ADJ;
++ break;
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ limit = 0;
++ break;
++ default:
++ return;
++ }
++
++ if (p->boost_prio > limit)
++ p->boost_prio--;
++}
++
++static inline void deboost_task(struct task_struct *p)
++{
++ if (p->boost_prio < MAX_PRIORITY_ADJ)
++ p->boost_prio++;
++}
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms) {}
++
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ return p->prio + p->boost_prio - MAX_RT_PRIO;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MAX_RT_PRIO)? p->prio : MAX_RT_PRIO / 2 + (p->prio + p->boost_prio) / 2;
++}
++
++static inline int
++task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
++{
++ return task_sched_prio(p);
++}
++
++static inline int sched_prio2idx(int prio, struct rq *rq)
++{
++ return prio;
++}
++
++static inline int sched_idx2prio(int idx, struct rq *rq)
++{
++ return idx;
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sched_timeslice_ns;
++
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p)) {
++ if (SCHED_RR != p->policy)
++ deboost_task(p);
++ requeue_task(p, rq, task_sched_prio_idx(p, rq));
++ }
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
++
++inline int task_running_nice(struct task_struct *p)
++{
++ return (p->prio + p->boost_prio > DEFAULT_PRIO + MAX_PRIORITY_ADJ);
++}
++
++static void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = (p->boost_prio < 0) ?
++ p->boost_prio + MAX_PRIORITY_ADJ : MAX_PRIORITY_ADJ;
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = MAX_PRIORITY_ADJ;
++}
++
++#ifdef CONFIG_SMP
++static inline void sched_task_ttwu(struct task_struct *p)
++{
++ if(this_rq()->clock_task - p->last_ran > sched_timeslice_ns)
++ boost_task(p);
++}
++#endif
++
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
++{
++ if (rq_switch_time(rq) < boost_threshold(p))
++ boost_task(p);
++}
++
++static inline void update_rq_time_edge(struct rq *rq) {}
+diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
+index e0104b45029a..5eb28f1fdd74 100644
+--- a/kernel/sched/build_policy.c
++++ b/kernel/sched/build_policy.c
+@@ -40,13 +40,19 @@
+
+ #include "idle.c"
+
++#ifndef CONFIG_SCHED_ALT
+ #include "rt.c"
++#endif
+
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ # include "cpudeadline.c"
++#endif
+ # include "pelt.c"
+ #endif
+
+ #include "cputime.c"
+-#include "deadline.c"
+
++#ifndef CONFIG_SCHED_ALT
++#include "deadline.c"
++#endif
+diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
+index eec0849b2aae..880f4f819d77 100644
+--- a/kernel/sched/build_utility.c
++++ b/kernel/sched/build_utility.c
+@@ -84,7 +84,9 @@
+
+ #ifdef CONFIG_SMP
+ # include "cpupri.c"
++#ifndef CONFIG_SCHED_ALT
+ # include "stop_task.c"
++#endif
+ # include "topology.c"
+ #endif
+
+diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
+index 3dbf351d12d5..b2590f961139 100644
+--- a/kernel/sched/cpufreq_schedutil.c
++++ b/kernel/sched/cpufreq_schedutil.c
+@@ -160,9 +160,14 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
+ unsigned long max = arch_scale_cpu_capacity(sg_cpu->cpu);
+
+ sg_cpu->max = max;
++#ifndef CONFIG_SCHED_ALT
+ sg_cpu->bw_dl = cpu_bw_dl(rq);
+ sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu), max,
+ FREQUENCY_UTIL, NULL);
++#else
++ sg_cpu->bw_dl = 0;
++ sg_cpu->util = rq_load_util(rq, max);
++#endif /* CONFIG_SCHED_ALT */
+ }
+
+ /**
+@@ -306,8 +311,10 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
+ */
+ static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
+ {
++#ifndef CONFIG_SCHED_ALT
+ if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
+ sg_cpu->sg_policy->limits_changed = true;
++#endif
+ }
+
+ static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
+@@ -607,6 +614,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
+ }
+
+ ret = sched_setattr_nocheck(thread, &attr);
++
+ if (ret) {
+ kthread_stop(thread);
+ pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
+@@ -839,7 +847,9 @@ cpufreq_governor_init(schedutil_gov);
+ #ifdef CONFIG_ENERGY_MODEL
+ static void rebuild_sd_workfn(struct work_struct *work)
+ {
++#ifndef CONFIG_SCHED_ALT
+ rebuild_sched_domains_energy();
++#endif /* CONFIG_SCHED_ALT */
+ }
+ static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
+
+diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
+index 78a233d43757..b3bbc87d4352 100644
+--- a/kernel/sched/cputime.c
++++ b/kernel/sched/cputime.c
+@@ -122,7 +122,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
+ p->utime += cputime;
+ account_group_user_time(p, cputime);
+
+- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
++ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
+
+ /* Add user time to cpustat. */
+ task_group_account_field(p, index, cputime);
+@@ -146,7 +146,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
+ p->gtime += cputime;
+
+ /* Add guest time to cpustat. */
+- if (task_nice(p) > 0) {
++ if (task_running_nice(p)) {
+ task_group_account_field(p, CPUTIME_NICE, cputime);
+ cpustat[CPUTIME_GUEST_NICE] += cputime;
+ } else {
+@@ -269,7 +269,7 @@ static inline u64 account_other_time(u64 max)
+ #ifdef CONFIG_64BIT
+ static inline u64 read_sum_exec_runtime(struct task_struct *t)
+ {
+- return t->se.sum_exec_runtime;
++ return tsk_seruntime(t);
+ }
+ #else
+ static u64 read_sum_exec_runtime(struct task_struct *t)
+@@ -279,7 +279,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
+ struct rq *rq;
+
+ rq = task_rq_lock(t, &rf);
+- ns = t->se.sum_exec_runtime;
++ ns = tsk_seruntime(t);
+ task_rq_unlock(rq, t, &rf);
+
+ return ns;
+@@ -611,7 +611,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
+ void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
+ {
+ struct task_cputime cputime = {
+- .sum_exec_runtime = p->se.sum_exec_runtime,
++ .sum_exec_runtime = tsk_seruntime(p),
+ };
+
+ if (task_cputime(p, &cputime.utime, &cputime.stime))
+diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
+index bb3d63bdf4ae..4e1680785704 100644
+--- a/kernel/sched/debug.c
++++ b/kernel/sched/debug.c
+@@ -7,6 +7,7 @@
+ * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * This allows printing both to /proc/sched_debug and
+ * to the console
+@@ -215,6 +216,7 @@ static const struct file_operations sched_scaling_fops = {
+ };
+
+ #endif /* SMP */
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+
+@@ -278,6 +280,7 @@ static const struct file_operations sched_dynamic_fops = {
+
+ #endif /* CONFIG_PREEMPT_DYNAMIC */
+
++#ifndef CONFIG_SCHED_ALT
+ __read_mostly bool sched_debug_verbose;
+
+ static const struct seq_operations sched_debug_sops;
+@@ -293,6 +296,7 @@ static const struct file_operations sched_debug_fops = {
+ .llseek = seq_lseek,
+ .release = seq_release,
+ };
++#endif /* !CONFIG_SCHED_ALT */
+
+ static struct dentry *debugfs_sched;
+
+@@ -302,12 +306,15 @@ static __init int sched_init_debug(void)
+
+ debugfs_sched = debugfs_create_dir("sched", NULL);
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
+ debugfs_create_bool("verbose", 0644, debugfs_sched, &sched_debug_verbose);
++#endif /* !CONFIG_SCHED_ALT */
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+ debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
+ #endif
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_u32("latency_ns", 0644, debugfs_sched, &sysctl_sched_latency);
+ debugfs_create_u32("min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_min_granularity);
+ debugfs_create_u32("idle_min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_idle_min_granularity);
+@@ -336,11 +343,13 @@ static __init int sched_init_debug(void)
+ #endif
+
+ debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
++#endif /* !CONFIG_SCHED_ALT */
+
+ return 0;
+ }
+ late_initcall(sched_init_debug);
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_SMP
+
+ static cpumask_var_t sd_sysctl_cpus;
+@@ -1067,6 +1076,7 @@ void proc_sched_set_task(struct task_struct *p)
+ memset(&p->stats, 0, sizeof(p->stats));
+ #endif
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ void resched_latency_warn(int cpu, u64 latency)
+ {
+diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
+index ecb0d7052877..000c0d87de78 100644
+--- a/kernel/sched/idle.c
++++ b/kernel/sched/idle.c
+@@ -400,6 +400,7 @@ void cpu_startup_entry(enum cpuhp_state state)
+ do_idle();
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * idle-task scheduling class.
+ */
+@@ -521,3 +522,4 @@ DEFINE_SCHED_CLASS(idle) = {
+ .switched_to = switched_to_idle,
+ .update_curr = update_curr_idle,
+ };
++#endif
+diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
+new file mode 100644
+index 000000000000..56a649d02e49
+--- /dev/null
++++ b/kernel/sched/pds.h
+@@ -0,0 +1,127 @@
++#define ALT_SCHED_VERSION_MSG "sched/pds: PDS CPU Scheduler "ALT_SCHED_VERSION" by Alfred Chen.\n"
++
++static int sched_timeslice_shift = 22;
++
++#define NORMAL_PRIO_MOD(x) ((x) & (NORMAL_PRIO_NUM - 1))
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms)
++{
++ if (2 == timeslice_ms)
++ sched_timeslice_shift = 21;
++}
++
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ s64 delta = p->deadline - rq->time_edge + NORMAL_PRIO_NUM - NICE_WIDTH;
++
++ if (WARN_ONCE(delta > NORMAL_PRIO_NUM - 1,
++ "pds: task_sched_prio_normal() delta %lld\n", delta))
++ return NORMAL_PRIO_NUM - 1;
++
++ return (delta < 0) ? 0 : delta;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MAX_RT_PRIO) ? p->prio :
++ MIN_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
++}
++
++static inline int
++task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
++{
++ return (p->prio < MAX_RT_PRIO) ? p->prio : MIN_NORMAL_PRIO +
++ NORMAL_PRIO_MOD(task_sched_prio_normal(p, rq) + rq->time_edge);
++}
++
++static inline int sched_prio2idx(int prio, struct rq *rq)
++{
++ return (IDLE_TASK_SCHED_PRIO == prio || prio < MAX_RT_PRIO) ? prio :
++ MIN_NORMAL_PRIO + NORMAL_PRIO_MOD((prio - MIN_NORMAL_PRIO) +
++ rq->time_edge);
++}
++
++static inline int sched_idx2prio(int idx, struct rq *rq)
++{
++ return (idx < MAX_RT_PRIO) ? idx : MIN_NORMAL_PRIO +
++ NORMAL_PRIO_MOD((idx - MIN_NORMAL_PRIO) + NORMAL_PRIO_NUM -
++ NORMAL_PRIO_MOD(rq->time_edge));
++}
++
++static inline void sched_renew_deadline(struct task_struct *p, const struct rq *rq)
++{
++ if (p->prio >= MAX_RT_PRIO)
++ p->deadline = (rq->clock >> sched_timeslice_shift) +
++ p->static_prio - (MAX_PRIO - NICE_WIDTH);
++}
++
++int task_running_nice(struct task_struct *p)
++{
++ return (p->prio > DEFAULT_PRIO);
++}
++
++static inline void update_rq_time_edge(struct rq *rq)
++{
++ struct list_head head;
++ u64 old = rq->time_edge;
++ u64 now = rq->clock >> sched_timeslice_shift;
++ u64 prio, delta;
++
++ if (now == old)
++ return;
++
++ delta = min_t(u64, NORMAL_PRIO_NUM, now - old);
++ INIT_LIST_HEAD(&head);
++
++ for_each_set_bit(prio, &rq->queue.bitmap[2], delta)
++ list_splice_tail_init(rq->queue.heads + MIN_NORMAL_PRIO +
++ NORMAL_PRIO_MOD(prio + old), &head);
++
++ rq->queue.bitmap[2] = (NORMAL_PRIO_NUM == delta) ? 0UL :
++ rq->queue.bitmap[2] >> delta;
++ rq->time_edge = now;
++ if (!list_empty(&head)) {
++ u64 idx = MIN_NORMAL_PRIO + NORMAL_PRIO_MOD(now);
++ struct task_struct *p;
++
++ list_for_each_entry(p, &head, sq_node)
++ p->sq_idx = idx;
++
++ list_splice(&head, rq->queue.heads + idx);
++ rq->queue.bitmap[2] |= 1UL;
++ }
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sched_timeslice_ns;
++ sched_renew_deadline(p, rq);
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
++ requeue_task(p, rq, task_sched_prio_idx(p, rq));
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
++{
++ u64 max_dl = rq->time_edge + NICE_WIDTH - 1;
++ if (unlikely(p->deadline > max_dl))
++ p->deadline = max_dl;
++}
++
++static void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ sched_renew_deadline(p, rq);
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ time_slice_expired(p, rq);
++}
++
++#ifdef CONFIG_SMP
++static inline void sched_task_ttwu(struct task_struct *p) {}
++#endif
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
+diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
+index 0f310768260c..bd38bf738fe9 100644
+--- a/kernel/sched/pelt.c
++++ b/kernel/sched/pelt.c
+@@ -266,6 +266,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
+ WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * sched_entity:
+ *
+@@ -383,8 +384,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+
+ return 0;
+ }
++#endif
+
+-#ifdef CONFIG_SCHED_THERMAL_PRESSURE
++#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * thermal:
+ *
+diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
+index c336f5f481bc..5865f14714a9 100644
+--- a/kernel/sched/pelt.h
++++ b/kernel/sched/pelt.h
+@@ -1,13 +1,15 @@
+ #ifdef CONFIG_SMP
+ #include "sched-pelt.h"
+
++#ifndef CONFIG_SCHED_ALT
+ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
+ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
+ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
+ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
+ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
++#endif
+
+-#ifdef CONFIG_SCHED_THERMAL_PRESSURE
++#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ int update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity);
+
+ static inline u64 thermal_load_avg(struct rq *rq)
+@@ -44,6 +46,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
+ return PELT_MIN_DIVIDER + avg->period_contrib;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void cfs_se_util_change(struct sched_avg *avg)
+ {
+ unsigned int enqueued;
+@@ -155,9 +158,11 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
+ return rq_clock_pelt(rq_of(cfs_rq));
+ }
+ #endif
++#endif /* CONFIG_SCHED_ALT */
+
+ #else
+
++#ifndef CONFIG_SCHED_ALT
+ static inline int
+ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
+ {
+@@ -175,6 +180,7 @@ update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+ {
+ return 0;
+ }
++#endif
+
+ static inline int
+ update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity)
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index 8dccb34eb190..bb3598e0ba5d 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -5,6 +5,10 @@
+ #ifndef _KERNEL_SCHED_SCHED_H
+ #define _KERNEL_SCHED_SCHED_H
+
++#ifdef CONFIG_SCHED_ALT
++#include "alt_sched.h"
++#else
++
+ #include <linux/sched/affinity.h>
+ #include <linux/sched/autogroup.h>
+ #include <linux/sched/cpufreq.h>
+@@ -3087,4 +3091,9 @@ extern int sched_dynamic_mode(const char *str);
+ extern void sched_dynamic_update(int mode);
+ #endif
+
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (task_nice(p) > 0);
++}
++#endif /* !CONFIG_SCHED_ALT */
+ #endif /* _KERNEL_SCHED_SCHED_H */
+diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
+index 857f837f52cb..5486c63e4790 100644
+--- a/kernel/sched/stats.c
++++ b/kernel/sched/stats.c
+@@ -125,8 +125,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ } else {
+ struct rq *rq;
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ struct sched_domain *sd;
+ int dcount = 0;
++#endif
+ #endif
+ cpu = (unsigned long)(v - 2);
+ rq = cpu_rq(cpu);
+@@ -143,6 +145,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ seq_printf(seq, "\n");
+
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ /* domain-specific stats */
+ rcu_read_lock();
+ for_each_domain(cpu, sd) {
+@@ -171,6 +174,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ sd->ttwu_move_balance);
+ }
+ rcu_read_unlock();
++#endif
+ #endif
+ }
+ return 0;
+diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
+index baa839c1ba96..15238be0581b 100644
+--- a/kernel/sched/stats.h
++++ b/kernel/sched/stats.h
+@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
+
+ #endif /* CONFIG_SCHEDSTATS */
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ struct sched_entity_stats {
+ struct sched_entity se;
+@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
+ #endif
+ return &task_of(se)->stats;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PSI
+ /*
+diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
+index 810750e62118..f2cdbb696dba 100644
+--- a/kernel/sched/topology.c
++++ b/kernel/sched/topology.c
+@@ -3,6 +3,7 @@
+ * Scheduler topology setup/handling methods
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ DEFINE_MUTEX(sched_domains_mutex);
+
+ /* Protected by sched_domains_mutex: */
+@@ -1392,8 +1393,10 @@ static void asym_cpu_capacity_scan(void)
+ */
+
+ static int default_relax_domain_level = -1;
++#endif /* CONFIG_SCHED_ALT */
+ int sched_domain_level_max;
+
++#ifndef CONFIG_SCHED_ALT
+ static int __init setup_relax_domain_level(char *str)
+ {
+ if (kstrtoint(str, 0, &default_relax_domain_level))
+@@ -1626,6 +1629,7 @@ sd_init(struct sched_domain_topology_level *tl,
+
+ return sd;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ /*
+ * Topology list, bottom-up.
+@@ -1662,6 +1666,7 @@ void set_sched_topology(struct sched_domain_topology_level *tl)
+ sched_domain_topology_saved = NULL;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_NUMA
+
+ static const struct cpumask *sd_numa_mask(int cpu)
+@@ -2617,3 +2622,15 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
+ mutex_unlock(&sched_domains_mutex);
+ }
++#else /* CONFIG_SCHED_ALT */
++void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
++ struct sched_domain_attr *dattr_new)
++{}
++
++#ifdef CONFIG_NUMA
++int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return best_mask_cpu(cpu, cpus);
++}
++#endif /* CONFIG_NUMA */
++#endif
+diff --git a/kernel/sysctl.c b/kernel/sysctl.c
+index 830aaf8ca08e..7ad676d5ae3b 100644
+--- a/kernel/sysctl.c
++++ b/kernel/sysctl.c
+@@ -96,6 +96,10 @@
+
+ /* Constants used for minimum and maximum */
+
++#ifdef CONFIG_SCHED_ALT
++extern int sched_yield_type;
++#endif
++
+ #ifdef CONFIG_PERF_EVENTS
+ static const int six_hundred_forty_kb = 640 * 1024;
+ #endif
+@@ -1659,6 +1663,24 @@ int proc_do_static_key(struct ctl_table *table, int write,
+ }
+
+ static struct ctl_table kern_table[] = {
++#ifdef CONFIG_SCHED_ALT
++/* In ALT, only supported "sched_schedstats" */
++#ifdef CONFIG_SCHED_DEBUG
++#ifdef CONFIG_SMP
++#ifdef CONFIG_SCHEDSTATS
++ {
++ .procname = "sched_schedstats",
++ .data = NULL,
++ .maxlen = sizeof(unsigned int),
++ .mode = 0644,
++ .proc_handler = sysctl_schedstats,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
++ },
++#endif /* CONFIG_SCHEDSTATS */
++#endif /* CONFIG_SMP */
++#endif /* CONFIG_SCHED_DEBUG */
++#else /* !CONFIG_SCHED_ALT */
+ {
+ .procname = "sched_child_runs_first",
+ .data = &sysctl_sched_child_runs_first,
+@@ -1778,6 +1800,7 @@ static struct ctl_table kern_table[] = {
+ .extra2 = SYSCTL_ONE,
+ },
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
+ #ifdef CONFIG_PROVE_LOCKING
+ {
+ .procname = "prove_locking",
+@@ -2163,6 +2186,17 @@ static struct ctl_table kern_table[] = {
+ .proc_handler = proc_dointvec,
+ },
+ #endif
++#ifdef CONFIG_SCHED_ALT
++ {
++ .procname = "yield_type",
++ .data = &sched_yield_type,
++ .maxlen = sizeof (int),
++ .mode = 0644,
++ .proc_handler = &proc_dointvec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_TWO,
++ },
++#endif
+ #if defined(CONFIG_S390) && defined(CONFIG_SMP)
+ {
+ .procname = "spin_retry",
+diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
+index 0ea8702eb516..a27a0f3a654d 100644
+--- a/kernel/time/hrtimer.c
++++ b/kernel/time/hrtimer.c
+@@ -2088,8 +2088,10 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum hrtimer_mode mode,
+ int ret = 0;
+ u64 slack;
+
++#ifndef CONFIG_SCHED_ALT
+ slack = current->timer_slack_ns;
+ if (dl_task(current) || rt_task(current))
++#endif
+ slack = 0;
+
+ hrtimer_init_sleeper_on_stack(&t, clockid, mode);
+diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
+index 0a97193984db..e32235cdc3b1 100644
+--- a/kernel/time/posix-cpu-timers.c
++++ b/kernel/time/posix-cpu-timers.c
+@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
+ u64 stime, utime;
+
+ task_cputime(p, &utime, &stime);
+- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
++ store_samples(samples, stime, utime, tsk_seruntime(p));
+ }
+
+ static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
+@@ -866,6 +866,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
+ }
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void check_dl_overrun(struct task_struct *tsk)
+ {
+ if (tsk->dl.dl_overrun) {
+@@ -873,6 +874,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
+ __group_send_sig_info(SIGXCPU, SEND_SIG_PRIV, tsk);
+ }
+ }
++#endif
+
+ static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
+ {
+@@ -900,8 +902,10 @@ static void check_thread_timers(struct task_struct *tsk,
+ u64 samples[CPUCLOCK_MAX];
+ unsigned long soft;
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk))
+ check_dl_overrun(tsk);
++#endif
+
+ if (expiry_cache_is_inactive(pct))
+ return;
+@@ -915,7 +919,7 @@ static void check_thread_timers(struct task_struct *tsk,
+ soft = task_rlimit(tsk, RLIMIT_RTTIME);
+ if (soft != RLIM_INFINITY) {
+ /* Task RT timeout is accounted in jiffies. RTTIME is usec */
+- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
++ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
+ unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
+
+ /* At the hard limit, send SIGKILL. No further action. */
+@@ -1151,8 +1155,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
+ return true;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk) && tsk->dl.dl_overrun)
+ return true;
++#endif
+
+ return false;
+ }
+diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
+index abcadbe933bb..d4c778b0ab0e 100644
+--- a/kernel/trace/trace_selftest.c
++++ b/kernel/trace/trace_selftest.c
+@@ -1140,10 +1140,15 @@ static int trace_wakeup_test_thread(void *data)
+ {
+ /* Make this a -deadline thread */
+ static const struct sched_attr attr = {
++#ifdef CONFIG_SCHED_ALT
++ /* No deadline on BMQ/PDS, use RR */
++ .sched_policy = SCHED_RR,
++#else
+ .sched_policy = SCHED_DEADLINE,
+ .sched_runtime = 100000ULL,
+ .sched_deadline = 10000000ULL,
+ .sched_period = 10000000ULL
++#endif
+ };
+ struct wakeup_test_data *x = data;
+
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
new file mode 100644
index 00000000..7a71c6b6
--- /dev/null
+++ b/5021_BMQ-and-PDS-gentoo-defaults.patch
@@ -0,0 +1,12 @@
+--- a/init/Kconfig 2021-04-27 07:38:30.556467045 -0400
++++ b/init/Kconfig 2021-04-27 07:39:32.956412800 -0400
+@@ -780,8 +780,9 @@ config GENERIC_SCHED_CLOCK
+ menu "Scheduler features"
+
+ menuconfig SCHED_ALT
++ depends on X86_64
+ bool "Alternative CPU Schedulers"
+- default y
++ default n
+ help
+ This feature enable alternative CPU scheduler"
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-05-24 21:02 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-05-24 21:02 UTC (permalink / raw
To: gentoo-commits
commit: 500df47c29eddf6c891238c1454aa01d10f6f2a5
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue May 24 21:01:03 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue May 24 21:01:03 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=500df47c
Address -Warray-bounds warnings on sparc
See: https://github.com/KSPP/linux/issues/109
Bug: https://bugs.gentoo.org/847124
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 ++++
1700_sparc-address-warray-bound-warnings.patch | 17 +++++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/0000_README b/0000_README
index f8cc57f0..298c5715 100644
--- a/0000_README
+++ b/0000_README
@@ -51,6 +51,10 @@ Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
Desc: Enable link security restrictions by default.
+Patch: 1700_sparc-address-warray-bound-warnings.patch
+From: https://github.com/KSPP/linux/issues/109
+Desc: Address -Warray-bounds warnings
+
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1700_sparc-address-warray-bound-warnings.patch b/1700_sparc-address-warray-bound-warnings.patch
new file mode 100644
index 00000000..f9393555
--- /dev/null
+++ b/1700_sparc-address-warray-bound-warnings.patch
@@ -0,0 +1,17 @@
+--- a/arch/sparc/mm/init_64.c 2022-05-24 16:48:40.749677491 -0400
++++ b/arch/sparc/mm/init_64.c 2022-05-24 16:55:15.511356945 -0400
+@@ -3052,11 +3052,11 @@ static inline resource_size_t compute_ke
+ static void __init kernel_lds_init(void)
+ {
+ code_resource.start = compute_kern_paddr(_text);
+- code_resource.end = compute_kern_paddr(_etext - 1);
++ code_resource.end = compute_kern_paddr(_etext) - 1;
+ data_resource.start = compute_kern_paddr(_etext);
+- data_resource.end = compute_kern_paddr(_edata - 1);
++ data_resource.end = compute_kern_paddr(_edata) - 1;
+ bss_resource.start = compute_kern_paddr(__bss_start);
+- bss_resource.end = compute_kern_paddr(_end - 1);
++ bss_resource.end = compute_kern_paddr(_end) - 1;
+ }
+
+ static int __init report_memory(void)
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-05-30 13:57 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-05-30 13:57 UTC (permalink / raw
To: gentoo-commits
commit: 4c9bb1563e46363720d3778468b068a8509a2f36
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Mon May 30 13:57:08 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Mon May 30 13:57:08 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=4c9bb156
Linux patch 5.18.1
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1000_linux-5.18.1.patch | 2933 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 2937 insertions(+)
diff --git a/0000_README b/0000_README
index 298c5715..62ab5b31 100644
--- a/0000_README
+++ b/0000_README
@@ -43,6 +43,10 @@ EXPERIMENTAL
Individual Patch Descriptions:
--------------------------------------------------------------------------
+Patch: 1000_linux-5.18.1.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.1
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1000_linux-5.18.1.patch b/1000_linux-5.18.1.patch
new file mode 100644
index 00000000..679abefd
--- /dev/null
+++ b/1000_linux-5.18.1.patch
@@ -0,0 +1,2933 @@
+diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
+index 1144ea3229a37..e9c18dabc5523 100644
+--- a/Documentation/admin-guide/sysctl/kernel.rst
++++ b/Documentation/admin-guide/sysctl/kernel.rst
+@@ -994,6 +994,9 @@ This is a directory, with the following entries:
+ * ``boot_id``: a UUID generated the first time this is retrieved, and
+ unvarying after that;
+
++* ``uuid``: a UUID generated every time this is retrieved (this can
++ thus be used to generate UUIDs at will);
++
+ * ``entropy_avail``: the pool's entropy count, in bits;
+
+ * ``poolsize``: the entropy pool size, in bits;
+@@ -1001,10 +1004,7 @@ This is a directory, with the following entries:
+ * ``urandom_min_reseed_secs``: obsolete (used to determine the minimum
+ number of seconds between urandom pool reseeding). This file is
+ writable for compatibility purposes, but writing to it has no effect
+- on any RNG behavior.
+-
+-* ``uuid``: a UUID generated every time this is retrieved (this can
+- thus be used to generate UUIDs at will);
++ on any RNG behavior;
+
+ * ``write_wakeup_threshold``: when the entropy count drops below this
+ (as a number of bits), processes waiting to write to ``/dev/random``
+diff --git a/Makefile b/Makefile
+index 7d5b0bfe79602..2bb168acb8f43 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 0
++SUBLEVEL = 1
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/alpha/include/asm/timex.h b/arch/alpha/include/asm/timex.h
+index b565cc6f408e9..f89798da8a147 100644
+--- a/arch/alpha/include/asm/timex.h
++++ b/arch/alpha/include/asm/timex.h
+@@ -28,5 +28,6 @@ static inline cycles_t get_cycles (void)
+ __asm__ __volatile__ ("rpcc %0" : "=r"(ret));
+ return ret;
+ }
++#define get_cycles get_cycles
+
+ #endif
+diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
+index 7c3b3671d6c25..6d1337c169cd3 100644
+--- a/arch/arm/include/asm/timex.h
++++ b/arch/arm/include/asm/timex.h
+@@ -11,5 +11,6 @@
+
+ typedef unsigned long cycles_t;
+ #define get_cycles() ({ cycles_t c; read_current_timer(&c) ? 0 : c; })
++#define random_get_entropy() (((unsigned long)get_cycles()) ?: random_get_entropy_fallback())
+
+ #endif
+diff --git a/arch/ia64/include/asm/timex.h b/arch/ia64/include/asm/timex.h
+index 869a3ac6bf23a..7ccc077a60bed 100644
+--- a/arch/ia64/include/asm/timex.h
++++ b/arch/ia64/include/asm/timex.h
+@@ -39,6 +39,7 @@ get_cycles (void)
+ ret = ia64_getreg(_IA64_REG_AR_ITC);
+ return ret;
+ }
++#define get_cycles get_cycles
+
+ extern void ia64_cpu_local_tick (void);
+ extern unsigned long long ia64_native_sched_clock (void);
+diff --git a/arch/m68k/include/asm/timex.h b/arch/m68k/include/asm/timex.h
+index 6a21d93582805..f4a7a340f4cae 100644
+--- a/arch/m68k/include/asm/timex.h
++++ b/arch/m68k/include/asm/timex.h
+@@ -35,7 +35,7 @@ static inline unsigned long random_get_entropy(void)
+ {
+ if (mach_random_get_entropy)
+ return mach_random_get_entropy();
+- return 0;
++ return random_get_entropy_fallback();
+ }
+ #define random_get_entropy random_get_entropy
+
+diff --git a/arch/mips/include/asm/timex.h b/arch/mips/include/asm/timex.h
+index 8026baf46e729..2e107886f97ac 100644
+--- a/arch/mips/include/asm/timex.h
++++ b/arch/mips/include/asm/timex.h
+@@ -76,25 +76,24 @@ static inline cycles_t get_cycles(void)
+ else
+ return 0; /* no usable counter */
+ }
++#define get_cycles get_cycles
+
+ /*
+ * Like get_cycles - but where c0_count is not available we desperately
+ * use c0_random in an attempt to get at least a little bit of entropy.
+- *
+- * R6000 and R6000A neither have a count register nor a random register.
+- * That leaves no entropy source in the CPU itself.
+ */
+ static inline unsigned long random_get_entropy(void)
+ {
+- unsigned int prid = read_c0_prid();
+- unsigned int imp = prid & PRID_IMP_MASK;
++ unsigned int c0_random;
+
+- if (can_use_mips_counter(prid))
++ if (can_use_mips_counter(read_c0_prid()))
+ return read_c0_count();
+- else if (likely(imp != PRID_IMP_R6000 && imp != PRID_IMP_R6000A))
+- return read_c0_random();
++
++ if (cpu_has_3kex)
++ c0_random = (read_c0_random() >> 8) & 0x3f;
+ else
+- return 0; /* no usable register */
++ c0_random = read_c0_random() & 0x3f;
++ return (random_get_entropy_fallback() << 6) | (0x3f - c0_random);
+ }
+ #define random_get_entropy random_get_entropy
+
+diff --git a/arch/nios2/include/asm/timex.h b/arch/nios2/include/asm/timex.h
+index a769f871b28d9..40a1adc9bd03e 100644
+--- a/arch/nios2/include/asm/timex.h
++++ b/arch/nios2/include/asm/timex.h
+@@ -8,5 +8,8 @@
+ typedef unsigned long cycles_t;
+
+ extern cycles_t get_cycles(void);
++#define get_cycles get_cycles
++
++#define random_get_entropy() (((unsigned long)get_cycles()) ?: random_get_entropy_fallback())
+
+ #endif
+diff --git a/arch/parisc/include/asm/timex.h b/arch/parisc/include/asm/timex.h
+index 06b510f8172e3..b4622cb06a75e 100644
+--- a/arch/parisc/include/asm/timex.h
++++ b/arch/parisc/include/asm/timex.h
+@@ -13,9 +13,10 @@
+
+ typedef unsigned long cycles_t;
+
+-static inline cycles_t get_cycles (void)
++static inline cycles_t get_cycles(void)
+ {
+ return mfctl(16);
+ }
++#define get_cycles get_cycles
+
+ #endif
+diff --git a/arch/powerpc/include/asm/timex.h b/arch/powerpc/include/asm/timex.h
+index fa2e76e4093a3..14b4489de52c5 100644
+--- a/arch/powerpc/include/asm/timex.h
++++ b/arch/powerpc/include/asm/timex.h
+@@ -19,6 +19,7 @@ static inline cycles_t get_cycles(void)
+ {
+ return mftb();
+ }
++#define get_cycles get_cycles
+
+ #endif /* __KERNEL__ */
+ #endif /* _ASM_POWERPC_TIMEX_H */
+diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
+index 507cae273bc62..d6a7428f6248d 100644
+--- a/arch/riscv/include/asm/timex.h
++++ b/arch/riscv/include/asm/timex.h
+@@ -41,7 +41,7 @@ static inline u32 get_cycles_hi(void)
+ static inline unsigned long random_get_entropy(void)
+ {
+ if (unlikely(clint_time_val == NULL))
+- return 0;
++ return random_get_entropy_fallback();
+ return get_cycles();
+ }
+ #define random_get_entropy() random_get_entropy()
+diff --git a/arch/s390/include/asm/timex.h b/arch/s390/include/asm/timex.h
+index 2cfce42aa7fc4..ce878e85b6e4e 100644
+--- a/arch/s390/include/asm/timex.h
++++ b/arch/s390/include/asm/timex.h
+@@ -197,6 +197,7 @@ static inline cycles_t get_cycles(void)
+ {
+ return (cycles_t) get_tod_clock() >> 2;
+ }
++#define get_cycles get_cycles
+
+ int get_phys_clock(unsigned long *clock);
+ void init_cpu_timer(void);
+diff --git a/arch/sparc/include/asm/timex_32.h b/arch/sparc/include/asm/timex_32.h
+index 542915b462097..f86326a6f89e0 100644
+--- a/arch/sparc/include/asm/timex_32.h
++++ b/arch/sparc/include/asm/timex_32.h
+@@ -9,8 +9,6 @@
+
+ #define CLOCK_TICK_RATE 1193180 /* Underlying HZ */
+
+-/* XXX Maybe do something better at some point... -DaveM */
+-typedef unsigned long cycles_t;
+-#define get_cycles() (0)
++#include <asm-generic/timex.h>
+
+ #endif
+diff --git a/arch/um/include/asm/timex.h b/arch/um/include/asm/timex.h
+index e392a9a5bc9bd..9f27176adb26d 100644
+--- a/arch/um/include/asm/timex.h
++++ b/arch/um/include/asm/timex.h
+@@ -2,13 +2,8 @@
+ #ifndef __UM_TIMEX_H
+ #define __UM_TIMEX_H
+
+-typedef unsigned long cycles_t;
+-
+-static inline cycles_t get_cycles (void)
+-{
+- return 0;
+-}
+-
+ #define CLOCK_TICK_RATE (HZ)
+
++#include <asm-generic/timex.h>
++
+ #endif
+diff --git a/arch/x86/include/asm/timex.h b/arch/x86/include/asm/timex.h
+index a4a8b1b16c0c1..956e4145311b1 100644
+--- a/arch/x86/include/asm/timex.h
++++ b/arch/x86/include/asm/timex.h
+@@ -5,6 +5,15 @@
+ #include <asm/processor.h>
+ #include <asm/tsc.h>
+
++static inline unsigned long random_get_entropy(void)
++{
++ if (!IS_ENABLED(CONFIG_X86_TSC) &&
++ !cpu_feature_enabled(X86_FEATURE_TSC))
++ return random_get_entropy_fallback();
++ return rdtsc();
++}
++#define random_get_entropy random_get_entropy
++
+ /* Assume we use the PIT time source for the clock tick */
+ #define CLOCK_TICK_RATE PIT_TICK_RATE
+
+diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
+index 01a300a9700b9..fbdc3d9514943 100644
+--- a/arch/x86/include/asm/tsc.h
++++ b/arch/x86/include/asm/tsc.h
+@@ -20,13 +20,12 @@ extern void disable_TSC(void);
+
+ static inline cycles_t get_cycles(void)
+ {
+-#ifndef CONFIG_X86_TSC
+- if (!boot_cpu_has(X86_FEATURE_TSC))
++ if (!IS_ENABLED(CONFIG_X86_TSC) &&
++ !cpu_feature_enabled(X86_FEATURE_TSC))
+ return 0;
+-#endif
+-
+ return rdtsc();
+ }
++#define get_cycles get_cycles
+
+ extern struct system_counterval_t convert_art_to_tsc(u64 art);
+ extern struct system_counterval_t convert_art_ns_to_tsc(u64 art_ns);
+diff --git a/arch/xtensa/include/asm/timex.h b/arch/xtensa/include/asm/timex.h
+index 233ec75e60c69..3f2462f2d0270 100644
+--- a/arch/xtensa/include/asm/timex.h
++++ b/arch/xtensa/include/asm/timex.h
+@@ -29,10 +29,6 @@
+
+ extern unsigned long ccount_freq;
+
+-typedef unsigned long long cycles_t;
+-
+-#define get_cycles() (0)
+-
+ void local_timer_setup(unsigned cpu);
+
+ /*
+@@ -59,4 +55,6 @@ static inline void set_linux_timer (unsigned long ccompare)
+ xtensa_set_sr(ccompare, SREG_CCOMPARE + LINUX_TIMER);
+ }
+
++#include <asm-generic/timex.h>
++
+ #endif /* _XTENSA_TIMEX_H */
+diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c
+index a4b638bea6f16..cc2fe0618178e 100644
+--- a/drivers/acpi/sysfs.c
++++ b/drivers/acpi/sysfs.c
+@@ -415,19 +415,30 @@ static ssize_t acpi_data_show(struct file *filp, struct kobject *kobj,
+ loff_t offset, size_t count)
+ {
+ struct acpi_data_attr *data_attr;
+- void *base;
+- ssize_t rc;
++ void __iomem *base;
++ ssize_t size;
+
+ data_attr = container_of(bin_attr, struct acpi_data_attr, attr);
++ size = data_attr->attr.size;
++
++ if (offset < 0)
++ return -EINVAL;
++
++ if (offset >= size)
++ return 0;
+
+- base = acpi_os_map_memory(data_attr->addr, data_attr->attr.size);
++ if (count > size - offset)
++ count = size - offset;
++
++ base = acpi_os_map_iomem(data_attr->addr, size);
+ if (!base)
+ return -ENOMEM;
+- rc = memory_read_from_buffer(buf, count, &offset, base,
+- data_attr->attr.size);
+- acpi_os_unmap_memory(base, data_attr->attr.size);
+
+- return rc;
++ memcpy_fromio(buf, base + offset, count);
++
++ acpi_os_unmap_iomem(base, size);
++
++ return count;
+ }
+
+ static int acpi_bert_data_init(void *th, struct acpi_data_attr *data_attr)
+diff --git a/drivers/char/random.c b/drivers/char/random.c
+index 4c9adb4f3d5d7..7a66eec08e373 100644
+--- a/drivers/char/random.c
++++ b/drivers/char/random.c
+@@ -15,14 +15,12 @@
+ * - Sysctl interface.
+ *
+ * The high level overview is that there is one input pool, into which
+- * various pieces of data are hashed. Some of that data is then "credited" as
+- * having a certain number of bits of entropy. When enough bits of entropy are
+- * available, the hash is finalized and handed as a key to a stream cipher that
+- * expands it indefinitely for various consumers. This key is periodically
+- * refreshed as the various entropy collectors, described below, add data to the
+- * input pool and credit it. There is currently no Fortuna-like scheduler
+- * involved, which can lead to malicious entropy sources causing a premature
+- * reseed, and the entropy estimates are, at best, conservative guesses.
++ * various pieces of data are hashed. Prior to initialization, some of that
++ * data is then "credited" as having a certain number of bits of entropy.
++ * When enough bits of entropy are available, the hash is finalized and
++ * handed as a key to a stream cipher that expands it indefinitely for
++ * various consumers. This key is periodically refreshed as the various
++ * entropy collectors, described below, add data to the input pool.
+ */
+
+ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+@@ -53,6 +51,7 @@
+ #include <linux/completion.h>
+ #include <linux/uuid.h>
+ #include <linux/uaccess.h>
++#include <linux/siphash.h>
+ #include <crypto/chacha.h>
+ #include <crypto/blake2s.h>
+ #include <asm/processor.h>
+@@ -71,27 +70,27 @@
+ *********************************************************************/
+
+ /*
+- * crng_init = 0 --> Uninitialized
+- * 1 --> Initialized
+- * 2 --> Initialized from input_pool
+- *
+ * crng_init is protected by base_crng->lock, and only increases
+- * its value (from 0->1->2).
++ * its value (from empty->early->ready).
+ */
+-static int crng_init = 0;
+-#define crng_ready() (likely(crng_init > 1))
+-/* Various types of waiters for crng_init->2 transition. */
++static enum {
++ CRNG_EMPTY = 0, /* Little to no entropy collected */
++ CRNG_EARLY = 1, /* At least POOL_EARLY_BITS collected */
++ CRNG_READY = 2 /* Fully initialized with POOL_READY_BITS collected */
++} crng_init __read_mostly = CRNG_EMPTY;
++static DEFINE_STATIC_KEY_FALSE(crng_is_ready);
++#define crng_ready() (static_branch_likely(&crng_is_ready) || crng_init >= CRNG_READY)
++/* Various types of waiters for crng_init->CRNG_READY transition. */
+ static DECLARE_WAIT_QUEUE_HEAD(crng_init_wait);
+ static struct fasync_struct *fasync;
+ static DEFINE_SPINLOCK(random_ready_chain_lock);
+ static RAW_NOTIFIER_HEAD(random_ready_chain);
+
+ /* Control how we warn userspace. */
+-static struct ratelimit_state unseeded_warning =
+- RATELIMIT_STATE_INIT("warn_unseeded_randomness", HZ, 3);
+ static struct ratelimit_state urandom_warning =
+ RATELIMIT_STATE_INIT("warn_urandom_randomness", HZ, 3);
+-static int ratelimit_disable __read_mostly;
++static int ratelimit_disable __read_mostly =
++ IS_ENABLED(CONFIG_WARN_ALL_UNSEEDED_RANDOM);
+ module_param_named(ratelimit_disable, ratelimit_disable, int, 0644);
+ MODULE_PARM_DESC(ratelimit_disable, "Disable random ratelimit suppression");
+
+@@ -110,6 +109,11 @@ bool rng_is_initialized(void)
+ }
+ EXPORT_SYMBOL(rng_is_initialized);
+
++static void __cold crng_set_ready(struct work_struct *work)
++{
++ static_branch_enable(&crng_is_ready);
++}
++
+ /* Used by wait_for_random_bytes(), and considered an entropy collector, below. */
+ static void try_to_generate_entropy(void);
+
+@@ -144,7 +148,7 @@ EXPORT_SYMBOL(wait_for_random_bytes);
+ * returns: 0 if callback is successfully added
+ * -EALREADY if pool is already initialised (callback not called)
+ */
+-int register_random_ready_notifier(struct notifier_block *nb)
++int __cold register_random_ready_notifier(struct notifier_block *nb)
+ {
+ unsigned long flags;
+ int ret = -EALREADY;
+@@ -162,7 +166,7 @@ int register_random_ready_notifier(struct notifier_block *nb)
+ /*
+ * Delete a previously registered readiness callback function.
+ */
+-int unregister_random_ready_notifier(struct notifier_block *nb)
++int __cold unregister_random_ready_notifier(struct notifier_block *nb)
+ {
+ unsigned long flags;
+ int ret;
+@@ -173,7 +177,7 @@ int unregister_random_ready_notifier(struct notifier_block *nb)
+ return ret;
+ }
+
+-static void process_random_ready_list(void)
++static void __cold process_random_ready_list(void)
+ {
+ unsigned long flags;
+
+@@ -182,28 +186,10 @@ static void process_random_ready_list(void)
+ spin_unlock_irqrestore(&random_ready_chain_lock, flags);
+ }
+
+-#define warn_unseeded_randomness(previous) \
+- _warn_unseeded_randomness(__func__, (void *)_RET_IP_, (previous))
+-
+-static void _warn_unseeded_randomness(const char *func_name, void *caller, void **previous)
+-{
+-#ifdef CONFIG_WARN_ALL_UNSEEDED_RANDOM
+- const bool print_once = false;
+-#else
+- static bool print_once __read_mostly;
+-#endif
+-
+- if (print_once || crng_ready() ||
+- (previous && (caller == READ_ONCE(*previous))))
+- return;
+- WRITE_ONCE(*previous, caller);
+-#ifndef CONFIG_WARN_ALL_UNSEEDED_RANDOM
+- print_once = true;
+-#endif
+- if (__ratelimit(&unseeded_warning))
+- printk_deferred(KERN_NOTICE "random: %s called from %pS with crng_init=%d\n",
+- func_name, caller, crng_init);
+-}
++#define warn_unseeded_randomness() \
++ if (IS_ENABLED(CONFIG_WARN_ALL_UNSEEDED_RANDOM) && !crng_ready()) \
++ printk_deferred(KERN_NOTICE "random: %s called from %pS with crng_init=%d\n", \
++ __func__, (void *)_RET_IP_, crng_init)
+
+
+ /*********************************************************************
+@@ -216,7 +202,7 @@ static void _warn_unseeded_randomness(const char *func_name, void *caller, void
+ *
+ * There are a few exported interfaces for use by other drivers:
+ *
+- * void get_random_bytes(void *buf, size_t nbytes)
++ * void get_random_bytes(void *buf, size_t len)
+ * u32 get_random_u32()
+ * u64 get_random_u64()
+ * unsigned int get_random_int()
+@@ -232,8 +218,8 @@ static void _warn_unseeded_randomness(const char *func_name, void *caller, void
+ *********************************************************************/
+
+ enum {
+- CRNG_RESEED_INTERVAL = 300 * HZ,
+- CRNG_INIT_CNT_THRESH = 2 * CHACHA_KEY_SIZE
++ CRNG_RESEED_START_INTERVAL = HZ,
++ CRNG_RESEED_INTERVAL = 60 * HZ
+ };
+
+ static struct {
+@@ -256,24 +242,17 @@ static DEFINE_PER_CPU(struct crng, crngs) = {
+ .lock = INIT_LOCAL_LOCK(crngs.lock),
+ };
+
+-/* Used by crng_reseed() to extract a new seed from the input pool. */
+-static bool drain_entropy(void *buf, size_t nbytes, bool force);
++/* Used by crng_reseed() and crng_make_state() to extract a new seed from the input pool. */
++static void extract_entropy(void *buf, size_t len);
+
+-/*
+- * This extracts a new crng key from the input pool, but only if there is a
+- * sufficient amount of entropy available or force is true, in order to
+- * mitigate bruteforcing of newly added bits.
+- */
+-static void crng_reseed(bool force)
++/* This extracts a new crng key from the input pool. */
++static void crng_reseed(void)
+ {
+ unsigned long flags;
+ unsigned long next_gen;
+ u8 key[CHACHA_KEY_SIZE];
+- bool finalize_init = false;
+
+- /* Only reseed if we can, to prevent brute forcing a small amount of new bits. */
+- if (!drain_entropy(key, sizeof(key), force))
+- return;
++ extract_entropy(key, sizeof(key));
+
+ /*
+ * We copy the new key into the base_crng, overwriting the old one,
+@@ -288,28 +267,10 @@ static void crng_reseed(bool force)
+ ++next_gen;
+ WRITE_ONCE(base_crng.generation, next_gen);
+ WRITE_ONCE(base_crng.birth, jiffies);
+- if (!crng_ready()) {
+- crng_init = 2;
+- finalize_init = true;
+- }
++ if (!static_branch_likely(&crng_is_ready))
++ crng_init = CRNG_READY;
+ spin_unlock_irqrestore(&base_crng.lock, flags);
+ memzero_explicit(key, sizeof(key));
+- if (finalize_init) {
+- process_random_ready_list();
+- wake_up_interruptible(&crng_init_wait);
+- kill_fasync(&fasync, SIGIO, POLL_IN);
+- pr_notice("crng init done\n");
+- if (unseeded_warning.missed) {
+- pr_notice("%d get_random_xx warning(s) missed due to ratelimiting\n",
+- unseeded_warning.missed);
+- unseeded_warning.missed = 0;
+- }
+- if (urandom_warning.missed) {
+- pr_notice("%d urandom warning(s) missed due to ratelimiting\n",
+- urandom_warning.missed);
+- urandom_warning.missed = 0;
+- }
+- }
+ }
+
+ /*
+@@ -345,10 +306,10 @@ static void crng_fast_key_erasure(u8 key[CHACHA_KEY_SIZE],
+ }
+
+ /*
+- * Return whether the crng seed is considered to be sufficiently
+- * old that a reseeding might be attempted. This happens if the last
+- * reseeding was CRNG_RESEED_INTERVAL ago, or during early boot, at
+- * an interval proportional to the uptime.
++ * Return whether the crng seed is considered to be sufficiently old
++ * that a reseeding is needed. This happens if the last reseeding
++ * was CRNG_RESEED_INTERVAL ago, or during early boot, at an interval
++ * proportional to the uptime.
+ */
+ static bool crng_has_old_seed(void)
+ {
+@@ -360,10 +321,10 @@ static bool crng_has_old_seed(void)
+ if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2)
+ WRITE_ONCE(early_boot, false);
+ else
+- interval = max_t(unsigned int, 5 * HZ,
++ interval = max_t(unsigned int, CRNG_RESEED_START_INTERVAL,
+ (unsigned int)uptime / 2 * HZ);
+ }
+- return time_after(jiffies, READ_ONCE(base_crng.birth) + interval);
++ return time_is_before_jiffies(READ_ONCE(base_crng.birth) + interval);
+ }
+
+ /*
+@@ -382,28 +343,31 @@ static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
+ /*
+ * For the fast path, we check whether we're ready, unlocked first, and
+ * then re-check once locked later. In the case where we're really not
+- * ready, we do fast key erasure with the base_crng directly, because
+- * this is what crng_pre_init_inject() mutates during early init.
++ * ready, we do fast key erasure with the base_crng directly, extracting
++ * when crng_init is CRNG_EMPTY.
+ */
+ if (!crng_ready()) {
+ bool ready;
+
+ spin_lock_irqsave(&base_crng.lock, flags);
+ ready = crng_ready();
+- if (!ready)
++ if (!ready) {
++ if (crng_init == CRNG_EMPTY)
++ extract_entropy(base_crng.key, sizeof(base_crng.key));
+ crng_fast_key_erasure(base_crng.key, chacha_state,
+ random_data, random_data_len);
++ }
+ spin_unlock_irqrestore(&base_crng.lock, flags);
+ if (!ready)
+ return;
+ }
+
+ /*
+- * If the base_crng is old enough, we try to reseed, which in turn
+- * bumps the generation counter that we check below.
++ * If the base_crng is old enough, we reseed, which in turn bumps the
++ * generation counter that we check below.
+ */
+ if (unlikely(crng_has_old_seed()))
+- crng_reseed(false);
++ crng_reseed();
+
+ local_lock_irqsave(&crngs.lock, flags);
+ crng = raw_cpu_ptr(&crngs);
+@@ -433,68 +397,24 @@ static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
+ local_unlock_irqrestore(&crngs.lock, flags);
+ }
+
+-/*
+- * This function is for crng_init == 0 only. It loads entropy directly
+- * into the crng's key, without going through the input pool. It is,
+- * generally speaking, not very safe, but we use this only at early
+- * boot time when it's better to have something there rather than
+- * nothing.
+- *
+- * If account is set, then the crng_init_cnt counter is incremented.
+- * This shouldn't be set by functions like add_device_randomness(),
+- * where we can't trust the buffer passed to it is guaranteed to be
+- * unpredictable (so it might not have any entropy at all).
+- */
+-static void crng_pre_init_inject(const void *input, size_t len, bool account)
+-{
+- static int crng_init_cnt = 0;
+- struct blake2s_state hash;
+- unsigned long flags;
+-
+- blake2s_init(&hash, sizeof(base_crng.key));
+-
+- spin_lock_irqsave(&base_crng.lock, flags);
+- if (crng_init != 0) {
+- spin_unlock_irqrestore(&base_crng.lock, flags);
+- return;
+- }
+-
+- blake2s_update(&hash, base_crng.key, sizeof(base_crng.key));
+- blake2s_update(&hash, input, len);
+- blake2s_final(&hash, base_crng.key);
+-
+- if (account) {
+- crng_init_cnt += min_t(size_t, len, CRNG_INIT_CNT_THRESH - crng_init_cnt);
+- if (crng_init_cnt >= CRNG_INIT_CNT_THRESH) {
+- ++base_crng.generation;
+- crng_init = 1;
+- }
+- }
+-
+- spin_unlock_irqrestore(&base_crng.lock, flags);
+-
+- if (crng_init == 1)
+- pr_notice("fast init done\n");
+-}
+-
+-static void _get_random_bytes(void *buf, size_t nbytes)
++static void _get_random_bytes(void *buf, size_t len)
+ {
+ u32 chacha_state[CHACHA_STATE_WORDS];
+ u8 tmp[CHACHA_BLOCK_SIZE];
+- size_t len;
++ size_t first_block_len;
+
+- if (!nbytes)
++ if (!len)
+ return;
+
+- len = min_t(size_t, 32, nbytes);
+- crng_make_state(chacha_state, buf, len);
+- nbytes -= len;
+- buf += len;
++ first_block_len = min_t(size_t, 32, len);
++ crng_make_state(chacha_state, buf, first_block_len);
++ len -= first_block_len;
++ buf += first_block_len;
+
+- while (nbytes) {
+- if (nbytes < CHACHA_BLOCK_SIZE) {
++ while (len) {
++ if (len < CHACHA_BLOCK_SIZE) {
+ chacha20_block(chacha_state, tmp);
+- memcpy(buf, tmp, nbytes);
++ memcpy(buf, tmp, len);
+ memzero_explicit(tmp, sizeof(tmp));
+ break;
+ }
+@@ -502,7 +422,7 @@ static void _get_random_bytes(void *buf, size_t nbytes)
+ chacha20_block(chacha_state, buf);
+ if (unlikely(chacha_state[12] == 0))
+ ++chacha_state[13];
+- nbytes -= CHACHA_BLOCK_SIZE;
++ len -= CHACHA_BLOCK_SIZE;
+ buf += CHACHA_BLOCK_SIZE;
+ }
+
+@@ -519,22 +439,20 @@ static void _get_random_bytes(void *buf, size_t nbytes)
+ * wait_for_random_bytes() should be called and return 0 at least once
+ * at any point prior.
+ */
+-void get_random_bytes(void *buf, size_t nbytes)
++void get_random_bytes(void *buf, size_t len)
+ {
+- static void *previous;
+-
+- warn_unseeded_randomness(&previous);
+- _get_random_bytes(buf, nbytes);
++ warn_unseeded_randomness();
++ _get_random_bytes(buf, len);
+ }
+ EXPORT_SYMBOL(get_random_bytes);
+
+-static ssize_t get_random_bytes_user(void __user *buf, size_t nbytes)
++static ssize_t get_random_bytes_user(struct iov_iter *iter)
+ {
+- size_t len, left, ret = 0;
+ u32 chacha_state[CHACHA_STATE_WORDS];
+- u8 output[CHACHA_BLOCK_SIZE];
++ u8 block[CHACHA_BLOCK_SIZE];
++ size_t ret = 0, copied;
+
+- if (!nbytes)
++ if (unlikely(!iov_iter_count(iter)))
+ return 0;
+
+ /*
+@@ -548,30 +466,22 @@ static ssize_t get_random_bytes_user(void __user *buf, size_t nbytes)
+ * use chacha_state after, so we can simply return those bytes to
+ * the user directly.
+ */
+- if (nbytes <= CHACHA_KEY_SIZE) {
+- ret = nbytes - copy_to_user(buf, &chacha_state[4], nbytes);
++ if (iov_iter_count(iter) <= CHACHA_KEY_SIZE) {
++ ret = copy_to_iter(&chacha_state[4], CHACHA_KEY_SIZE, iter);
+ goto out_zero_chacha;
+ }
+
+ for (;;) {
+- chacha20_block(chacha_state, output);
++ chacha20_block(chacha_state, block);
+ if (unlikely(chacha_state[12] == 0))
+ ++chacha_state[13];
+
+- len = min_t(size_t, nbytes, CHACHA_BLOCK_SIZE);
+- left = copy_to_user(buf, output, len);
+- if (left) {
+- ret += len - left;
+- break;
+- }
+-
+- buf += len;
+- ret += len;
+- nbytes -= len;
+- if (!nbytes)
++ copied = copy_to_iter(block, sizeof(block), iter);
++ ret += copied;
++ if (!iov_iter_count(iter) || copied != sizeof(block))
+ break;
+
+- BUILD_BUG_ON(PAGE_SIZE % CHACHA_BLOCK_SIZE != 0);
++ BUILD_BUG_ON(PAGE_SIZE % sizeof(block) != 0);
+ if (ret % PAGE_SIZE == 0) {
+ if (signal_pending(current))
+ break;
+@@ -579,7 +489,7 @@ static ssize_t get_random_bytes_user(void __user *buf, size_t nbytes)
+ }
+ }
+
+- memzero_explicit(output, sizeof(output));
++ memzero_explicit(block, sizeof(block));
+ out_zero_chacha:
+ memzero_explicit(chacha_state, sizeof(chacha_state));
+ return ret ? ret : -EFAULT;
+@@ -591,98 +501,69 @@ out_zero_chacha:
+ * provided by this function is okay, the function wait_for_random_bytes()
+ * should be called and return 0 at least once at any point prior.
+ */
+-struct batched_entropy {
+- union {
+- /*
+- * We make this 1.5x a ChaCha block, so that we get the
+- * remaining 32 bytes from fast key erasure, plus one full
+- * block from the detached ChaCha state. We can increase
+- * the size of this later if needed so long as we keep the
+- * formula of (integer_blocks + 0.5) * CHACHA_BLOCK_SIZE.
+- */
+- u64 entropy_u64[CHACHA_BLOCK_SIZE * 3 / (2 * sizeof(u64))];
+- u32 entropy_u32[CHACHA_BLOCK_SIZE * 3 / (2 * sizeof(u32))];
+- };
+- local_lock_t lock;
+- unsigned long generation;
+- unsigned int position;
+-};
+-
+
+-static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u64) = {
+- .lock = INIT_LOCAL_LOCK(batched_entropy_u64.lock),
+- .position = UINT_MAX
+-};
+-
+-u64 get_random_u64(void)
+-{
+- u64 ret;
+- unsigned long flags;
+- struct batched_entropy *batch;
+- static void *previous;
+- unsigned long next_gen;
+-
+- warn_unseeded_randomness(&previous);
+-
+- local_lock_irqsave(&batched_entropy_u64.lock, flags);
+- batch = raw_cpu_ptr(&batched_entropy_u64);
+-
+- next_gen = READ_ONCE(base_crng.generation);
+- if (batch->position >= ARRAY_SIZE(batch->entropy_u64) ||
+- next_gen != batch->generation) {
+- _get_random_bytes(batch->entropy_u64, sizeof(batch->entropy_u64));
+- batch->position = 0;
+- batch->generation = next_gen;
+- }
+-
+- ret = batch->entropy_u64[batch->position];
+- batch->entropy_u64[batch->position] = 0;
+- ++batch->position;
+- local_unlock_irqrestore(&batched_entropy_u64.lock, flags);
+- return ret;
+-}
+-EXPORT_SYMBOL(get_random_u64);
+-
+-static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u32) = {
+- .lock = INIT_LOCAL_LOCK(batched_entropy_u32.lock),
+- .position = UINT_MAX
+-};
+-
+-u32 get_random_u32(void)
+-{
+- u32 ret;
+- unsigned long flags;
+- struct batched_entropy *batch;
+- static void *previous;
+- unsigned long next_gen;
+-
+- warn_unseeded_randomness(&previous);
+-
+- local_lock_irqsave(&batched_entropy_u32.lock, flags);
+- batch = raw_cpu_ptr(&batched_entropy_u32);
+-
+- next_gen = READ_ONCE(base_crng.generation);
+- if (batch->position >= ARRAY_SIZE(batch->entropy_u32) ||
+- next_gen != batch->generation) {
+- _get_random_bytes(batch->entropy_u32, sizeof(batch->entropy_u32));
+- batch->position = 0;
+- batch->generation = next_gen;
+- }
+-
+- ret = batch->entropy_u32[batch->position];
+- batch->entropy_u32[batch->position] = 0;
+- ++batch->position;
+- local_unlock_irqrestore(&batched_entropy_u32.lock, flags);
+- return ret;
+-}
+-EXPORT_SYMBOL(get_random_u32);
++#define DEFINE_BATCHED_ENTROPY(type) \
++struct batch_ ##type { \
++ /* \
++ * We make this 1.5x a ChaCha block, so that we get the \
++ * remaining 32 bytes from fast key erasure, plus one full \
++ * block from the detached ChaCha state. We can increase \
++ * the size of this later if needed so long as we keep the \
++ * formula of (integer_blocks + 0.5) * CHACHA_BLOCK_SIZE. \
++ */ \
++ type entropy[CHACHA_BLOCK_SIZE * 3 / (2 * sizeof(type))]; \
++ local_lock_t lock; \
++ unsigned long generation; \
++ unsigned int position; \
++}; \
++ \
++static DEFINE_PER_CPU(struct batch_ ##type, batched_entropy_ ##type) = { \
++ .lock = INIT_LOCAL_LOCK(batched_entropy_ ##type.lock), \
++ .position = UINT_MAX \
++}; \
++ \
++type get_random_ ##type(void) \
++{ \
++ type ret; \
++ unsigned long flags; \
++ struct batch_ ##type *batch; \
++ unsigned long next_gen; \
++ \
++ warn_unseeded_randomness(); \
++ \
++ if (!crng_ready()) { \
++ _get_random_bytes(&ret, sizeof(ret)); \
++ return ret; \
++ } \
++ \
++ local_lock_irqsave(&batched_entropy_ ##type.lock, flags); \
++ batch = raw_cpu_ptr(&batched_entropy_##type); \
++ \
++ next_gen = READ_ONCE(base_crng.generation); \
++ if (batch->position >= ARRAY_SIZE(batch->entropy) || \
++ next_gen != batch->generation) { \
++ _get_random_bytes(batch->entropy, sizeof(batch->entropy)); \
++ batch->position = 0; \
++ batch->generation = next_gen; \
++ } \
++ \
++ ret = batch->entropy[batch->position]; \
++ batch->entropy[batch->position] = 0; \
++ ++batch->position; \
++ local_unlock_irqrestore(&batched_entropy_ ##type.lock, flags); \
++ return ret; \
++} \
++EXPORT_SYMBOL(get_random_ ##type);
++
++DEFINE_BATCHED_ENTROPY(u64)
++DEFINE_BATCHED_ENTROPY(u32)
+
+ #ifdef CONFIG_SMP
+ /*
+ * This function is called when the CPU is coming up, with entry
+ * CPUHP_RANDOM_PREPARE, which comes before CPUHP_WORKQUEUE_PREP.
+ */
+-int random_prepare_cpu(unsigned int cpu)
++int __cold random_prepare_cpu(unsigned int cpu)
+ {
+ /*
+ * When the cpu comes back online, immediately invalidate both
+@@ -696,62 +577,30 @@ int random_prepare_cpu(unsigned int cpu)
+ }
+ #endif
+
+-/**
+- * randomize_page - Generate a random, page aligned address
+- * @start: The smallest acceptable address the caller will take.
+- * @range: The size of the area, starting at @start, within which the
+- * random address must fall.
+- *
+- * If @start + @range would overflow, @range is capped.
+- *
+- * NOTE: Historical use of randomize_range, which this replaces, presumed that
+- * @start was already page aligned. We now align it regardless.
+- *
+- * Return: A page aligned address within [start, start + range). On error,
+- * @start is returned.
+- */
+-unsigned long randomize_page(unsigned long start, unsigned long range)
+-{
+- if (!PAGE_ALIGNED(start)) {
+- range -= PAGE_ALIGN(start) - start;
+- start = PAGE_ALIGN(start);
+- }
+-
+- if (start > ULONG_MAX - range)
+- range = ULONG_MAX - start;
+-
+- range >>= PAGE_SHIFT;
+-
+- if (range == 0)
+- return start;
+-
+- return start + (get_random_long() % range << PAGE_SHIFT);
+-}
+-
+ /*
+ * This function will use the architecture-specific hardware random
+ * number generator if it is available. It is not recommended for
+ * use. Use get_random_bytes() instead. It returns the number of
+ * bytes filled in.
+ */
+-size_t __must_check get_random_bytes_arch(void *buf, size_t nbytes)
++size_t __must_check get_random_bytes_arch(void *buf, size_t len)
+ {
+- size_t left = nbytes;
++ size_t left = len;
+ u8 *p = buf;
+
+ while (left) {
+ unsigned long v;
+- size_t chunk = min_t(size_t, left, sizeof(unsigned long));
++ size_t block_len = min_t(size_t, left, sizeof(unsigned long));
+
+ if (!arch_get_random_long(&v))
+ break;
+
+- memcpy(p, &v, chunk);
+- p += chunk;
+- left -= chunk;
++ memcpy(p, &v, block_len);
++ p += block_len;
++ left -= block_len;
+ }
+
+- return nbytes - left;
++ return len - left;
+ }
+ EXPORT_SYMBOL(get_random_bytes_arch);
+
+@@ -762,33 +611,28 @@ EXPORT_SYMBOL(get_random_bytes_arch);
+ *
+ * Callers may add entropy via:
+ *
+- * static void mix_pool_bytes(const void *in, size_t nbytes)
++ * static void mix_pool_bytes(const void *buf, size_t len)
+ *
+ * After which, if added entropy should be credited:
+ *
+- * static void credit_entropy_bits(size_t nbits)
++ * static void credit_init_bits(size_t bits)
+ *
+- * Finally, extract entropy via these two, with the latter one
+- * setting the entropy count to zero and extracting only if there
+- * is POOL_MIN_BITS entropy credited prior or force is true:
++ * Finally, extract entropy via:
+ *
+- * static void extract_entropy(void *buf, size_t nbytes)
+- * static bool drain_entropy(void *buf, size_t nbytes, bool force)
++ * static void extract_entropy(void *buf, size_t len)
+ *
+ **********************************************************************/
+
+ enum {
+ POOL_BITS = BLAKE2S_HASH_SIZE * 8,
+- POOL_MIN_BITS = POOL_BITS /* No point in settling for less. */
++ POOL_READY_BITS = POOL_BITS, /* When crng_init->CRNG_READY */
++ POOL_EARLY_BITS = POOL_READY_BITS / 2 /* When crng_init->CRNG_EARLY */
+ };
+
+-/* For notifying userspace should write into /dev/random. */
+-static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
+-
+ static struct {
+ struct blake2s_state hash;
+ spinlock_t lock;
+- unsigned int entropy_count;
++ unsigned int init_bits;
+ } input_pool = {
+ .hash.h = { BLAKE2S_IV0 ^ (0x01010000 | BLAKE2S_HASH_SIZE),
+ BLAKE2S_IV1, BLAKE2S_IV2, BLAKE2S_IV3, BLAKE2S_IV4,
+@@ -797,48 +641,30 @@ static struct {
+ .lock = __SPIN_LOCK_UNLOCKED(input_pool.lock),
+ };
+
+-static void _mix_pool_bytes(const void *in, size_t nbytes)
++static void _mix_pool_bytes(const void *buf, size_t len)
+ {
+- blake2s_update(&input_pool.hash, in, nbytes);
++ blake2s_update(&input_pool.hash, buf, len);
+ }
+
+ /*
+- * This function adds bytes into the entropy "pool". It does not
+- * update the entropy estimate. The caller should call
+- * credit_entropy_bits if this is appropriate.
++ * This function adds bytes into the input pool. It does not
++ * update the initialization bit counter; the caller should call
++ * credit_init_bits if this is appropriate.
+ */
+-static void mix_pool_bytes(const void *in, size_t nbytes)
++static void mix_pool_bytes(const void *buf, size_t len)
+ {
+ unsigned long flags;
+
+ spin_lock_irqsave(&input_pool.lock, flags);
+- _mix_pool_bytes(in, nbytes);
++ _mix_pool_bytes(buf, len);
+ spin_unlock_irqrestore(&input_pool.lock, flags);
+ }
+
+-static void credit_entropy_bits(size_t nbits)
+-{
+- unsigned int entropy_count, orig, add;
+-
+- if (!nbits)
+- return;
+-
+- add = min_t(size_t, nbits, POOL_BITS);
+-
+- do {
+- orig = READ_ONCE(input_pool.entropy_count);
+- entropy_count = min_t(unsigned int, POOL_BITS, orig + add);
+- } while (cmpxchg(&input_pool.entropy_count, orig, entropy_count) != orig);
+-
+- if (!crng_ready() && entropy_count >= POOL_MIN_BITS)
+- crng_reseed(false);
+-}
+-
+ /*
+ * This is an HKDF-like construction for using the hashed collected entropy
+ * as a PRF key, that's then expanded block-by-block.
+ */
+-static void extract_entropy(void *buf, size_t nbytes)
++static void extract_entropy(void *buf, size_t len)
+ {
+ unsigned long flags;
+ u8 seed[BLAKE2S_HASH_SIZE], next_key[BLAKE2S_HASH_SIZE];
+@@ -867,12 +693,12 @@ static void extract_entropy(void *buf, size_t nbytes)
+ spin_unlock_irqrestore(&input_pool.lock, flags);
+ memzero_explicit(next_key, sizeof(next_key));
+
+- while (nbytes) {
+- i = min_t(size_t, nbytes, BLAKE2S_HASH_SIZE);
++ while (len) {
++ i = min_t(size_t, len, BLAKE2S_HASH_SIZE);
+ /* output = HASHPRF(seed, RDSEED || ++counter) */
+ ++block.counter;
+ blake2s(buf, (u8 *)&block, seed, i, sizeof(block), sizeof(seed));
+- nbytes -= i;
++ len -= i;
+ buf += i;
+ }
+
+@@ -880,23 +706,43 @@ static void extract_entropy(void *buf, size_t nbytes)
+ memzero_explicit(&block, sizeof(block));
+ }
+
+-/*
+- * First we make sure we have POOL_MIN_BITS of entropy in the pool unless force
+- * is true, and then we set the entropy count to zero (but don't actually touch
+- * any data). Only then can we extract a new key with extract_entropy().
+- */
+-static bool drain_entropy(void *buf, size_t nbytes, bool force)
++#define credit_init_bits(bits) if (!crng_ready()) _credit_init_bits(bits)
++
++static void __cold _credit_init_bits(size_t bits)
+ {
+- unsigned int entropy_count;
++ static struct execute_work set_ready;
++ unsigned int new, orig, add;
++ unsigned long flags;
++
++ if (!bits)
++ return;
++
++ add = min_t(size_t, bits, POOL_BITS);
++
+ do {
+- entropy_count = READ_ONCE(input_pool.entropy_count);
+- if (!force && entropy_count < POOL_MIN_BITS)
+- return false;
+- } while (cmpxchg(&input_pool.entropy_count, entropy_count, 0) != entropy_count);
+- extract_entropy(buf, nbytes);
+- wake_up_interruptible(&random_write_wait);
+- kill_fasync(&fasync, SIGIO, POLL_OUT);
+- return true;
++ orig = READ_ONCE(input_pool.init_bits);
++ new = min_t(unsigned int, POOL_BITS, orig + add);
++ } while (cmpxchg(&input_pool.init_bits, orig, new) != orig);
++
++ if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
++ crng_reseed(); /* Sets crng_init to CRNG_READY under base_crng.lock. */
++ execute_in_process_context(crng_set_ready, &set_ready);
++ process_random_ready_list();
++ wake_up_interruptible(&crng_init_wait);
++ kill_fasync(&fasync, SIGIO, POLL_IN);
++ pr_notice("crng init done\n");
++ if (urandom_warning.missed)
++ pr_notice("%d urandom warning(s) missed due to ratelimiting\n",
++ urandom_warning.missed);
++ } else if (orig < POOL_EARLY_BITS && new >= POOL_EARLY_BITS) {
++ spin_lock_irqsave(&base_crng.lock, flags);
++ /* Check if crng_init is CRNG_EMPTY, to avoid race with crng_reseed(). */
++ if (crng_init == CRNG_EMPTY) {
++ extract_entropy(base_crng.key, sizeof(base_crng.key));
++ crng_init = CRNG_EARLY;
++ }
++ spin_unlock_irqrestore(&base_crng.lock, flags);
++ }
+ }
+
+
+@@ -907,15 +753,13 @@ static bool drain_entropy(void *buf, size_t nbytes, bool force)
+ * The following exported functions are used for pushing entropy into
+ * the above entropy accumulation routines:
+ *
+- * void add_device_randomness(const void *buf, size_t size);
+- * void add_input_randomness(unsigned int type, unsigned int code,
+- * unsigned int value);
+- * void add_disk_randomness(struct gendisk *disk);
+- * void add_hwgenerator_randomness(const void *buffer, size_t count,
+- * size_t entropy);
+- * void add_bootloader_randomness(const void *buf, size_t size);
+- * void add_vmfork_randomness(const void *unique_vm_id, size_t size);
++ * void add_device_randomness(const void *buf, size_t len);
++ * void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy);
++ * void add_bootloader_randomness(const void *buf, size_t len);
++ * void add_vmfork_randomness(const void *unique_vm_id, size_t len);
+ * void add_interrupt_randomness(int irq);
++ * void add_input_randomness(unsigned int type, unsigned int code, unsigned int value);
++ * void add_disk_randomness(struct gendisk *disk);
+ *
+ * add_device_randomness() adds data to the input pool that
+ * is likely to differ between two devices (or possibly even per boot).
+@@ -925,26 +769,13 @@ static bool drain_entropy(void *buf, size_t nbytes, bool force)
+ * that might otherwise be identical and have very little entropy
+ * available to them (particularly common in the embedded world).
+ *
+- * add_input_randomness() uses the input layer interrupt timing, as well
+- * as the event type information from the hardware.
+- *
+- * add_disk_randomness() uses what amounts to the seek time of block
+- * layer request events, on a per-disk_devt basis, as input to the
+- * entropy pool. Note that high-speed solid state drives with very low
+- * seek times do not make for good sources of entropy, as their seek
+- * times are usually fairly consistent.
+- *
+- * The above two routines try to estimate how many bits of entropy
+- * to credit. They do this by keeping track of the first and second
+- * order deltas of the event timings.
+- *
+ * add_hwgenerator_randomness() is for true hardware RNGs, and will credit
+ * entropy as specified by the caller. If the entropy pool is full it will
+ * block until more entropy is needed.
+ *
+- * add_bootloader_randomness() is the same as add_hwgenerator_randomness() or
+- * add_device_randomness(), depending on whether or not the configuration
+- * option CONFIG_RANDOM_TRUST_BOOTLOADER is set.
++ * add_bootloader_randomness() is called by bootloader drivers, such as EFI
++ * and device tree, and credits its input depending on whether or not the
++ * configuration option CONFIG_RANDOM_TRUST_BOOTLOADER is set.
+ *
+ * add_vmfork_randomness() adds a unique (but not necessarily secret) ID
+ * representing the current instance of a VM to the pool, without crediting,
+@@ -955,6 +786,19 @@ static bool drain_entropy(void *buf, size_t nbytes, bool force)
+ * as inputs, it feeds the input pool roughly once a second or after 64
+ * interrupts, crediting 1 bit of entropy for whichever comes first.
+ *
++ * add_input_randomness() uses the input layer interrupt timing, as well
++ * as the event type information from the hardware.
++ *
++ * add_disk_randomness() uses what amounts to the seek time of block
++ * layer request events, on a per-disk_devt basis, as input to the
++ * entropy pool. Note that high-speed solid state drives with very low
++ * seek times do not make for good sources of entropy, as their seek
++ * times are usually fairly consistent.
++ *
++ * The last two routines try to estimate how many bits of entropy
++ * to credit. They do this by keeping track of the first and second
++ * order deltas of the event timings.
++ *
+ **********************************************************************/
+
+ static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+@@ -972,46 +816,42 @@ early_param("random.trust_bootloader", parse_trust_bootloader);
+
+ /*
+ * The first collection of entropy occurs at system boot while interrupts
+- * are still turned off. Here we push in RDSEED, a timestamp, and utsname().
+- * Depending on the above configuration knob, RDSEED may be considered
+- * sufficient for initialization. Note that much earlier setup may already
+- * have pushed entropy into the input pool by the time we get here.
++ * are still turned off. Here we push in latent entropy, RDSEED, a timestamp,
++ * utsname(), and the command line. Depending on the above configuration knob,
++ * RDSEED may be considered sufficient for initialization. Note that much
++ * earlier setup may already have pushed entropy into the input pool by the
++ * time we get here.
+ */
+-int __init rand_initialize(void)
++int __init random_init(const char *command_line)
+ {
+- size_t i;
+ ktime_t now = ktime_get_real();
+- bool arch_init = true;
+- unsigned long rv;
++ unsigned int i, arch_bytes;
++ unsigned long entropy;
+
+ #if defined(LATENT_ENTROPY_PLUGIN)
+ static const u8 compiletime_seed[BLAKE2S_BLOCK_SIZE] __initconst __latent_entropy;
+ _mix_pool_bytes(compiletime_seed, sizeof(compiletime_seed));
+ #endif
+
+- for (i = 0; i < BLAKE2S_BLOCK_SIZE; i += sizeof(rv)) {
+- if (!arch_get_random_seed_long_early(&rv) &&
+- !arch_get_random_long_early(&rv)) {
+- rv = random_get_entropy();
+- arch_init = false;
++ for (i = 0, arch_bytes = BLAKE2S_BLOCK_SIZE;
++ i < BLAKE2S_BLOCK_SIZE; i += sizeof(entropy)) {
++ if (!arch_get_random_seed_long_early(&entropy) &&
++ !arch_get_random_long_early(&entropy)) {
++ entropy = random_get_entropy();
++ arch_bytes -= sizeof(entropy);
+ }
+- _mix_pool_bytes(&rv, sizeof(rv));
++ _mix_pool_bytes(&entropy, sizeof(entropy));
+ }
+ _mix_pool_bytes(&now, sizeof(now));
+ _mix_pool_bytes(utsname(), sizeof(*(utsname())));
++ _mix_pool_bytes(command_line, strlen(command_line));
++ add_latent_entropy();
+
+- extract_entropy(base_crng.key, sizeof(base_crng.key));
+- ++base_crng.generation;
+-
+- if (arch_init && trust_cpu && !crng_ready()) {
+- crng_init = 2;
+- pr_notice("crng init done (trusting CPU's manufacturer)\n");
+- }
++ if (crng_ready())
++ crng_reseed();
++ else if (trust_cpu)
++ credit_init_bits(arch_bytes * 8);
+
+- if (ratelimit_disable) {
+- urandom_warning.interval = 0;
+- unseeded_warning.interval = 0;
+- }
+ return 0;
+ }
+
+@@ -1023,164 +863,46 @@ int __init rand_initialize(void)
+ * the entropy pool having similar initial state across largely
+ * identical devices.
+ */
+-void add_device_randomness(const void *buf, size_t size)
++void add_device_randomness(const void *buf, size_t len)
+ {
+- unsigned long cycles = random_get_entropy();
+- unsigned long flags, now = jiffies;
+-
+- if (crng_init == 0 && size)
+- crng_pre_init_inject(buf, size, false);
++ unsigned long entropy = random_get_entropy();
++ unsigned long flags;
+
+ spin_lock_irqsave(&input_pool.lock, flags);
+- _mix_pool_bytes(&cycles, sizeof(cycles));
+- _mix_pool_bytes(&now, sizeof(now));
+- _mix_pool_bytes(buf, size);
++ _mix_pool_bytes(&entropy, sizeof(entropy));
++ _mix_pool_bytes(buf, len);
+ spin_unlock_irqrestore(&input_pool.lock, flags);
+ }
+ EXPORT_SYMBOL(add_device_randomness);
+
+-/* There is one of these per entropy source */
+-struct timer_rand_state {
+- unsigned long last_time;
+- long last_delta, last_delta2;
+-};
+-
+-/*
+- * This function adds entropy to the entropy "pool" by using timing
+- * delays. It uses the timer_rand_state structure to make an estimate
+- * of how many bits of entropy this call has added to the pool.
+- *
+- * The number "num" is also added to the pool - it should somehow describe
+- * the type of event which just happened. This is currently 0-255 for
+- * keyboard scan codes, and 256 upwards for interrupts.
+- */
+-static void add_timer_randomness(struct timer_rand_state *state, unsigned int num)
+-{
+- unsigned long cycles = random_get_entropy(), now = jiffies, flags;
+- long delta, delta2, delta3;
+-
+- spin_lock_irqsave(&input_pool.lock, flags);
+- _mix_pool_bytes(&cycles, sizeof(cycles));
+- _mix_pool_bytes(&now, sizeof(now));
+- _mix_pool_bytes(&num, sizeof(num));
+- spin_unlock_irqrestore(&input_pool.lock, flags);
+-
+- /*
+- * Calculate number of bits of randomness we probably added.
+- * We take into account the first, second and third-order deltas
+- * in order to make our estimate.
+- */
+- delta = now - READ_ONCE(state->last_time);
+- WRITE_ONCE(state->last_time, now);
+-
+- delta2 = delta - READ_ONCE(state->last_delta);
+- WRITE_ONCE(state->last_delta, delta);
+-
+- delta3 = delta2 - READ_ONCE(state->last_delta2);
+- WRITE_ONCE(state->last_delta2, delta2);
+-
+- if (delta < 0)
+- delta = -delta;
+- if (delta2 < 0)
+- delta2 = -delta2;
+- if (delta3 < 0)
+- delta3 = -delta3;
+- if (delta > delta2)
+- delta = delta2;
+- if (delta > delta3)
+- delta = delta3;
+-
+- /*
+- * delta is now minimum absolute delta.
+- * Round down by 1 bit on general principles,
+- * and limit entropy estimate to 12 bits.
+- */
+- credit_entropy_bits(min_t(unsigned int, fls(delta >> 1), 11));
+-}
+-
+-void add_input_randomness(unsigned int type, unsigned int code,
+- unsigned int value)
+-{
+- static unsigned char last_value;
+- static struct timer_rand_state input_timer_state = { INITIAL_JIFFIES };
+-
+- /* Ignore autorepeat and the like. */
+- if (value == last_value)
+- return;
+-
+- last_value = value;
+- add_timer_randomness(&input_timer_state,
+- (type << 4) ^ code ^ (code >> 4) ^ value);
+-}
+-EXPORT_SYMBOL_GPL(add_input_randomness);
+-
+-#ifdef CONFIG_BLOCK
+-void add_disk_randomness(struct gendisk *disk)
+-{
+- if (!disk || !disk->random)
+- return;
+- /* First major is 1, so we get >= 0x200 here. */
+- add_timer_randomness(disk->random, 0x100 + disk_devt(disk));
+-}
+-EXPORT_SYMBOL_GPL(add_disk_randomness);
+-
+-void rand_initialize_disk(struct gendisk *disk)
+-{
+- struct timer_rand_state *state;
+-
+- /*
+- * If kzalloc returns null, we just won't use that entropy
+- * source.
+- */
+- state = kzalloc(sizeof(struct timer_rand_state), GFP_KERNEL);
+- if (state) {
+- state->last_time = INITIAL_JIFFIES;
+- disk->random = state;
+- }
+-}
+-#endif
+-
+ /*
+ * Interface for in-kernel drivers of true hardware RNGs.
+ * Those devices may produce endless random bits and will be throttled
+ * when our pool is full.
+ */
+-void add_hwgenerator_randomness(const void *buffer, size_t count,
+- size_t entropy)
++void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy)
+ {
+- if (unlikely(crng_init == 0 && entropy < POOL_MIN_BITS)) {
+- crng_pre_init_inject(buffer, count, true);
+- mix_pool_bytes(buffer, count);
+- return;
+- }
++ mix_pool_bytes(buf, len);
++ credit_init_bits(entropy);
+
+ /*
+- * Throttle writing if we're above the trickle threshold.
+- * We'll be woken up again once below POOL_MIN_BITS, when
+- * the calling thread is about to terminate, or once
+- * CRNG_RESEED_INTERVAL has elapsed.
++ * Throttle writing to once every CRNG_RESEED_INTERVAL, unless
++ * we're not yet initialized.
+ */
+- wait_event_interruptible_timeout(random_write_wait,
+- !system_wq || kthread_should_stop() ||
+- input_pool.entropy_count < POOL_MIN_BITS,
+- CRNG_RESEED_INTERVAL);
+- mix_pool_bytes(buffer, count);
+- credit_entropy_bits(entropy);
++ if (!kthread_should_stop() && crng_ready())
++ schedule_timeout_interruptible(CRNG_RESEED_INTERVAL);
+ }
+ EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+
+ /*
+- * Handle random seed passed by bootloader.
+- * If the seed is trustworthy, it would be regarded as hardware RNGs. Otherwise
+- * it would be regarded as device data.
+- * The decision is controlled by CONFIG_RANDOM_TRUST_BOOTLOADER.
++ * Handle random seed passed by bootloader, and credit it if
++ * CONFIG_RANDOM_TRUST_BOOTLOADER is set.
+ */
+-void add_bootloader_randomness(const void *buf, size_t size)
++void __cold add_bootloader_randomness(const void *buf, size_t len)
+ {
++ mix_pool_bytes(buf, len);
+ if (trust_bootloader)
+- add_hwgenerator_randomness(buf, size, size * 8);
+- else
+- add_device_randomness(buf, size);
++ credit_init_bits(len * 8);
+ }
+ EXPORT_SYMBOL_GPL(add_bootloader_randomness);
+
+@@ -1192,11 +914,11 @@ static BLOCKING_NOTIFIER_HEAD(vmfork_chain);
+ * don't credit it, but we do immediately force a reseed after so
+ * that it's used by the crng posthaste.
+ */
+-void add_vmfork_randomness(const void *unique_vm_id, size_t size)
++void __cold add_vmfork_randomness(const void *unique_vm_id, size_t len)
+ {
+- add_device_randomness(unique_vm_id, size);
++ add_device_randomness(unique_vm_id, len);
+ if (crng_ready()) {
+- crng_reseed(true);
++ crng_reseed();
+ pr_notice("crng reseeded due to virtual machine fork\n");
+ }
+ blocking_notifier_call_chain(&vmfork_chain, 0, NULL);
+@@ -1205,13 +927,13 @@ void add_vmfork_randomness(const void *unique_vm_id, size_t size)
+ EXPORT_SYMBOL_GPL(add_vmfork_randomness);
+ #endif
+
+-int register_random_vmfork_notifier(struct notifier_block *nb)
++int __cold register_random_vmfork_notifier(struct notifier_block *nb)
+ {
+ return blocking_notifier_chain_register(&vmfork_chain, nb);
+ }
+ EXPORT_SYMBOL_GPL(register_random_vmfork_notifier);
+
+-int unregister_random_vmfork_notifier(struct notifier_block *nb)
++int __cold unregister_random_vmfork_notifier(struct notifier_block *nb)
+ {
+ return blocking_notifier_chain_unregister(&vmfork_chain, nb);
+ }
+@@ -1223,17 +945,15 @@ struct fast_pool {
+ unsigned long pool[4];
+ unsigned long last;
+ unsigned int count;
+- u16 reg_idx;
+ };
+
+ static DEFINE_PER_CPU(struct fast_pool, irq_randomness) = {
+ #ifdef CONFIG_64BIT
+- /* SipHash constants */
+- .pool = { 0x736f6d6570736575UL, 0x646f72616e646f6dUL,
+- 0x6c7967656e657261UL, 0x7465646279746573UL }
++#define FASTMIX_PERM SIPHASH_PERMUTATION
++ .pool = { SIPHASH_CONST_0, SIPHASH_CONST_1, SIPHASH_CONST_2, SIPHASH_CONST_3 }
+ #else
+- /* HalfSipHash constants */
+- .pool = { 0, 0, 0x6c796765U, 0x74656462U }
++#define FASTMIX_PERM HSIPHASH_PERMUTATION
++ .pool = { HSIPHASH_CONST_0, HSIPHASH_CONST_1, HSIPHASH_CONST_2, HSIPHASH_CONST_3 }
+ #endif
+ };
+
+@@ -1241,27 +961,16 @@ static DEFINE_PER_CPU(struct fast_pool, irq_randomness) = {
+ * This is [Half]SipHash-1-x, starting from an empty key. Because
+ * the key is fixed, it assumes that its inputs are non-malicious,
+ * and therefore this has no security on its own. s represents the
+- * 128 or 256-bit SipHash state, while v represents a 128-bit input.
++ * four-word SipHash state, while v represents a two-word input.
+ */
+-static void fast_mix(unsigned long s[4], const unsigned long *v)
++static void fast_mix(unsigned long s[4], unsigned long v1, unsigned long v2)
+ {
+- size_t i;
+-
+- for (i = 0; i < 16 / sizeof(long); ++i) {
+- s[3] ^= v[i];
+-#ifdef CONFIG_64BIT
+- s[0] += s[1]; s[1] = rol64(s[1], 13); s[1] ^= s[0]; s[0] = rol64(s[0], 32);
+- s[2] += s[3]; s[3] = rol64(s[3], 16); s[3] ^= s[2];
+- s[0] += s[3]; s[3] = rol64(s[3], 21); s[3] ^= s[0];
+- s[2] += s[1]; s[1] = rol64(s[1], 17); s[1] ^= s[2]; s[2] = rol64(s[2], 32);
+-#else
+- s[0] += s[1]; s[1] = rol32(s[1], 5); s[1] ^= s[0]; s[0] = rol32(s[0], 16);
+- s[2] += s[3]; s[3] = rol32(s[3], 8); s[3] ^= s[2];
+- s[0] += s[3]; s[3] = rol32(s[3], 7); s[3] ^= s[0];
+- s[2] += s[1]; s[1] = rol32(s[1], 13); s[1] ^= s[2]; s[2] = rol32(s[2], 16);
+-#endif
+- s[0] ^= v[i];
+- }
++ s[3] ^= v1;
++ FASTMIX_PERM(s[0], s[1], s[2], s[3]);
++ s[0] ^= v1;
++ s[3] ^= v2;
++ FASTMIX_PERM(s[0], s[1], s[2], s[3]);
++ s[0] ^= v2;
+ }
+
+ #ifdef CONFIG_SMP
+@@ -1269,7 +978,7 @@ static void fast_mix(unsigned long s[4], const unsigned long *v)
+ * This function is called when the CPU has just come online, with
+ * entry CPUHP_AP_RANDOM_ONLINE, just after CPUHP_AP_WORKQUEUE_ONLINE.
+ */
+-int random_online_cpu(unsigned int cpu)
++int __cold random_online_cpu(unsigned int cpu)
+ {
+ /*
+ * During CPU shutdown and before CPU onlining, add_interrupt_
+@@ -1287,33 +996,18 @@ int random_online_cpu(unsigned int cpu)
+ }
+ #endif
+
+-static unsigned long get_reg(struct fast_pool *f, struct pt_regs *regs)
+-{
+- unsigned long *ptr = (unsigned long *)regs;
+- unsigned int idx;
+-
+- if (regs == NULL)
+- return 0;
+- idx = READ_ONCE(f->reg_idx);
+- if (idx >= sizeof(struct pt_regs) / sizeof(unsigned long))
+- idx = 0;
+- ptr += idx++;
+- WRITE_ONCE(f->reg_idx, idx);
+- return *ptr;
+-}
+-
+ static void mix_interrupt_randomness(struct work_struct *work)
+ {
+ struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
+ /*
+- * The size of the copied stack pool is explicitly 16 bytes so that we
+- * tax mix_pool_byte()'s compression function the same amount on all
+- * platforms. This means on 64-bit we copy half the pool into this,
+- * while on 32-bit we copy all of it. The entropy is supposed to be
+- * sufficiently dispersed between bits that in the sponge-like
+- * half case, on average we don't wind up "losing" some.
++ * The size of the copied stack pool is explicitly 2 longs so that we
++ * only ever ingest half of the siphash output each time, retaining
++ * the other half as the next "key" that carries over. The entropy is
++ * supposed to be sufficiently dispersed between bits so on average
++ * we don't wind up "losing" some.
+ */
+- u8 pool[16];
++ unsigned long pool[2];
++ unsigned int count;
+
+ /* Check to see if we're running on the wrong CPU due to hotplug. */
+ local_irq_disable();
+@@ -1327,17 +1021,13 @@ static void mix_interrupt_randomness(struct work_struct *work)
+ * consistent view, before we reenable irqs again.
+ */
+ memcpy(pool, fast_pool->pool, sizeof(pool));
++ count = fast_pool->count;
+ fast_pool->count = 0;
+ fast_pool->last = jiffies;
+ local_irq_enable();
+
+- if (unlikely(crng_init == 0)) {
+- crng_pre_init_inject(pool, sizeof(pool), true);
+- mix_pool_bytes(pool, sizeof(pool));
+- } else {
+- mix_pool_bytes(pool, sizeof(pool));
+- credit_entropy_bits(1);
+- }
++ mix_pool_bytes(pool, sizeof(pool));
++ credit_init_bits(max(1u, (count & U16_MAX) / 64));
+
+ memzero_explicit(pool, sizeof(pool));
+ }
+@@ -1345,37 +1035,19 @@ static void mix_interrupt_randomness(struct work_struct *work)
+ void add_interrupt_randomness(int irq)
+ {
+ enum { MIX_INFLIGHT = 1U << 31 };
+- unsigned long cycles = random_get_entropy(), now = jiffies;
++ unsigned long entropy = random_get_entropy();
+ struct fast_pool *fast_pool = this_cpu_ptr(&irq_randomness);
+ struct pt_regs *regs = get_irq_regs();
+ unsigned int new_count;
+- union {
+- u32 u32[4];
+- u64 u64[2];
+- unsigned long longs[16 / sizeof(long)];
+- } irq_data;
+-
+- if (cycles == 0)
+- cycles = get_reg(fast_pool, regs);
+-
+- if (sizeof(unsigned long) == 8) {
+- irq_data.u64[0] = cycles ^ rol64(now, 32) ^ irq;
+- irq_data.u64[1] = regs ? instruction_pointer(regs) : _RET_IP_;
+- } else {
+- irq_data.u32[0] = cycles ^ irq;
+- irq_data.u32[1] = now;
+- irq_data.u32[2] = regs ? instruction_pointer(regs) : _RET_IP_;
+- irq_data.u32[3] = get_reg(fast_pool, regs);
+- }
+
+- fast_mix(fast_pool->pool, irq_data.longs);
++ fast_mix(fast_pool->pool, entropy,
++ (regs ? instruction_pointer(regs) : _RET_IP_) ^ swab(irq));
+ new_count = ++fast_pool->count;
+
+ if (new_count & MIX_INFLIGHT)
+ return;
+
+- if (new_count < 64 && (!time_after(now, fast_pool->last + HZ) ||
+- unlikely(crng_init == 0)))
++ if (new_count < 64 && !time_is_before_jiffies(fast_pool->last + HZ))
+ return;
+
+ if (unlikely(!fast_pool->mix.func))
+@@ -1385,6 +1057,126 @@ void add_interrupt_randomness(int irq)
+ }
+ EXPORT_SYMBOL_GPL(add_interrupt_randomness);
+
++/* There is one of these per entropy source */
++struct timer_rand_state {
++ unsigned long last_time;
++ long last_delta, last_delta2;
++};
++
++/*
++ * This function adds entropy to the entropy "pool" by using timing
++ * delays. It uses the timer_rand_state structure to make an estimate
++ * of how many bits of entropy this call has added to the pool. The
++ * value "num" is also added to the pool; it should somehow describe
++ * the type of event that just happened.
++ */
++static void add_timer_randomness(struct timer_rand_state *state, unsigned int num)
++{
++ unsigned long entropy = random_get_entropy(), now = jiffies, flags;
++ long delta, delta2, delta3;
++ unsigned int bits;
++
++ /*
++ * If we're in a hard IRQ, add_interrupt_randomness() will be called
++ * sometime after, so mix into the fast pool.
++ */
++ if (in_hardirq()) {
++ fast_mix(this_cpu_ptr(&irq_randomness)->pool, entropy, num);
++ } else {
++ spin_lock_irqsave(&input_pool.lock, flags);
++ _mix_pool_bytes(&entropy, sizeof(entropy));
++ _mix_pool_bytes(&num, sizeof(num));
++ spin_unlock_irqrestore(&input_pool.lock, flags);
++ }
++
++ if (crng_ready())
++ return;
++
++ /*
++ * Calculate number of bits of randomness we probably added.
++ * We take into account the first, second and third-order deltas
++ * in order to make our estimate.
++ */
++ delta = now - READ_ONCE(state->last_time);
++ WRITE_ONCE(state->last_time, now);
++
++ delta2 = delta - READ_ONCE(state->last_delta);
++ WRITE_ONCE(state->last_delta, delta);
++
++ delta3 = delta2 - READ_ONCE(state->last_delta2);
++ WRITE_ONCE(state->last_delta2, delta2);
++
++ if (delta < 0)
++ delta = -delta;
++ if (delta2 < 0)
++ delta2 = -delta2;
++ if (delta3 < 0)
++ delta3 = -delta3;
++ if (delta > delta2)
++ delta = delta2;
++ if (delta > delta3)
++ delta = delta3;
++
++ /*
++ * delta is now minimum absolute delta. Round down by 1 bit
++ * on general principles, and limit entropy estimate to 11 bits.
++ */
++ bits = min(fls(delta >> 1), 11);
++
++ /*
++ * As mentioned above, if we're in a hard IRQ, add_interrupt_randomness()
++ * will run after this, which uses a different crediting scheme of 1 bit
++ * per every 64 interrupts. In order to let that function do accounting
++ * close to the one in this function, we credit a full 64/64 bit per bit,
++ * and then subtract one to account for the extra one added.
++ */
++ if (in_hardirq())
++ this_cpu_ptr(&irq_randomness)->count += max(1u, bits * 64) - 1;
++ else
++ _credit_init_bits(bits);
++}
++
++void add_input_randomness(unsigned int type, unsigned int code, unsigned int value)
++{
++ static unsigned char last_value;
++ static struct timer_rand_state input_timer_state = { INITIAL_JIFFIES };
++
++ /* Ignore autorepeat and the like. */
++ if (value == last_value)
++ return;
++
++ last_value = value;
++ add_timer_randomness(&input_timer_state,
++ (type << 4) ^ code ^ (code >> 4) ^ value);
++}
++EXPORT_SYMBOL_GPL(add_input_randomness);
++
++#ifdef CONFIG_BLOCK
++void add_disk_randomness(struct gendisk *disk)
++{
++ if (!disk || !disk->random)
++ return;
++ /* First major is 1, so we get >= 0x200 here. */
++ add_timer_randomness(disk->random, 0x100 + disk_devt(disk));
++}
++EXPORT_SYMBOL_GPL(add_disk_randomness);
++
++void __cold rand_initialize_disk(struct gendisk *disk)
++{
++ struct timer_rand_state *state;
++
++ /*
++ * If kzalloc returns null, we just won't use that entropy
++ * source.
++ */
++ state = kzalloc(sizeof(struct timer_rand_state), GFP_KERNEL);
++ if (state) {
++ state->last_time = INITIAL_JIFFIES;
++ disk->random = state;
++ }
++}
++#endif
++
+ /*
+ * Each time the timer fires, we expect that we got an unpredictable
+ * jump in the cycle counter. Even if the timer is running on another
+@@ -1398,40 +1190,40 @@ EXPORT_SYMBOL_GPL(add_interrupt_randomness);
+ *
+ * So the re-arming always happens in the entropy loop itself.
+ */
+-static void entropy_timer(struct timer_list *t)
++static void __cold entropy_timer(struct timer_list *t)
+ {
+- credit_entropy_bits(1);
++ credit_init_bits(1);
+ }
+
+ /*
+ * If we have an actual cycle counter, see if we can
+ * generate enough entropy with timing noise
+ */
+-static void try_to_generate_entropy(void)
++static void __cold try_to_generate_entropy(void)
+ {
+ struct {
+- unsigned long cycles;
++ unsigned long entropy;
+ struct timer_list timer;
+ } stack;
+
+- stack.cycles = random_get_entropy();
++ stack.entropy = random_get_entropy();
+
+ /* Slow counter - or none. Don't even bother */
+- if (stack.cycles == random_get_entropy())
++ if (stack.entropy == random_get_entropy())
+ return;
+
+ timer_setup_on_stack(&stack.timer, entropy_timer, 0);
+ while (!crng_ready() && !signal_pending(current)) {
+ if (!timer_pending(&stack.timer))
+ mod_timer(&stack.timer, jiffies + 1);
+- mix_pool_bytes(&stack.cycles, sizeof(stack.cycles));
++ mix_pool_bytes(&stack.entropy, sizeof(stack.entropy));
+ schedule();
+- stack.cycles = random_get_entropy();
++ stack.entropy = random_get_entropy();
+ }
+
+ del_timer_sync(&stack.timer);
+ destroy_timer_on_stack(&stack.timer);
+- mix_pool_bytes(&stack.cycles, sizeof(stack.cycles));
++ mix_pool_bytes(&stack.entropy, sizeof(stack.entropy));
+ }
+
+
+@@ -1463,9 +1255,12 @@ static void try_to_generate_entropy(void)
+ *
+ **********************************************************************/
+
+-SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count, unsigned int,
+- flags)
++SYSCALL_DEFINE3(getrandom, char __user *, ubuf, size_t, len, unsigned int, flags)
+ {
++ struct iov_iter iter;
++ struct iovec iov;
++ int ret;
++
+ if (flags & ~(GRND_NONBLOCK | GRND_RANDOM | GRND_INSECURE))
+ return -EINVAL;
+
+@@ -1476,72 +1271,60 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count, unsigned int,
+ if ((flags & (GRND_INSECURE | GRND_RANDOM)) == (GRND_INSECURE | GRND_RANDOM))
+ return -EINVAL;
+
+- if (count > INT_MAX)
+- count = INT_MAX;
+-
+- if (!(flags & GRND_INSECURE) && !crng_ready()) {
+- int ret;
+-
++ if (!crng_ready() && !(flags & GRND_INSECURE)) {
+ if (flags & GRND_NONBLOCK)
+ return -EAGAIN;
+ ret = wait_for_random_bytes();
+ if (unlikely(ret))
+ return ret;
+ }
+- return get_random_bytes_user(buf, count);
++
++ ret = import_single_range(READ, ubuf, len, &iov, &iter);
++ if (unlikely(ret))
++ return ret;
++ return get_random_bytes_user(&iter);
+ }
+
+ static __poll_t random_poll(struct file *file, poll_table *wait)
+ {
+- __poll_t mask;
+-
+ poll_wait(file, &crng_init_wait, wait);
+- poll_wait(file, &random_write_wait, wait);
+- mask = 0;
+- if (crng_ready())
+- mask |= EPOLLIN | EPOLLRDNORM;
+- if (input_pool.entropy_count < POOL_MIN_BITS)
+- mask |= EPOLLOUT | EPOLLWRNORM;
+- return mask;
++ return crng_ready() ? EPOLLIN | EPOLLRDNORM : EPOLLOUT | EPOLLWRNORM;
+ }
+
+-static int write_pool(const char __user *ubuf, size_t count)
++static ssize_t write_pool_user(struct iov_iter *iter)
+ {
+- size_t len;
+- int ret = 0;
+ u8 block[BLAKE2S_BLOCK_SIZE];
++ ssize_t ret = 0;
++ size_t copied;
+
+- while (count) {
+- len = min(count, sizeof(block));
+- if (copy_from_user(block, ubuf, len)) {
+- ret = -EFAULT;
+- goto out;
++ if (unlikely(!iov_iter_count(iter)))
++ return 0;
++
++ for (;;) {
++ copied = copy_from_iter(block, sizeof(block), iter);
++ ret += copied;
++ mix_pool_bytes(block, copied);
++ if (!iov_iter_count(iter) || copied != sizeof(block))
++ break;
++
++ BUILD_BUG_ON(PAGE_SIZE % sizeof(block) != 0);
++ if (ret % PAGE_SIZE == 0) {
++ if (signal_pending(current))
++ break;
++ cond_resched();
+ }
+- count -= len;
+- ubuf += len;
+- mix_pool_bytes(block, len);
+- cond_resched();
+ }
+
+-out:
+ memzero_explicit(block, sizeof(block));
+- return ret;
++ return ret ? ret : -EFAULT;
+ }
+
+-static ssize_t random_write(struct file *file, const char __user *buffer,
+- size_t count, loff_t *ppos)
++static ssize_t random_write_iter(struct kiocb *kiocb, struct iov_iter *iter)
+ {
+- int ret;
+-
+- ret = write_pool(buffer, count);
+- if (ret)
+- return ret;
+-
+- return (ssize_t)count;
++ return write_pool_user(iter);
+ }
+
+-static ssize_t urandom_read(struct file *file, char __user *buf, size_t nbytes,
+- loff_t *ppos)
++static ssize_t urandom_read_iter(struct kiocb *kiocb, struct iov_iter *iter)
+ {
+ static int maxwarn = 10;
+
+@@ -1552,37 +1335,38 @@ static ssize_t urandom_read(struct file *file, char __user *buf, size_t nbytes,
+ if (!crng_ready())
+ try_to_generate_entropy();
+
+- if (!crng_ready() && maxwarn > 0) {
+- maxwarn--;
+- if (__ratelimit(&urandom_warning))
+- pr_notice("%s: uninitialized urandom read (%zd bytes read)\n",
+- current->comm, nbytes);
++ if (!crng_ready()) {
++ if (!ratelimit_disable && maxwarn <= 0)
++ ++urandom_warning.missed;
++ else if (ratelimit_disable || __ratelimit(&urandom_warning)) {
++ --maxwarn;
++ pr_notice("%s: uninitialized urandom read (%zu bytes read)\n",
++ current->comm, iov_iter_count(iter));
++ }
+ }
+
+- return get_random_bytes_user(buf, nbytes);
++ return get_random_bytes_user(iter);
+ }
+
+-static ssize_t random_read(struct file *file, char __user *buf, size_t nbytes,
+- loff_t *ppos)
++static ssize_t random_read_iter(struct kiocb *kiocb, struct iov_iter *iter)
+ {
+ int ret;
+
+ ret = wait_for_random_bytes();
+ if (ret != 0)
+ return ret;
+- return get_random_bytes_user(buf, nbytes);
++ return get_random_bytes_user(iter);
+ }
+
+ static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+ {
+- int size, ent_count;
+ int __user *p = (int __user *)arg;
+- int retval;
++ int ent_count;
+
+ switch (cmd) {
+ case RNDGETENTCNT:
+ /* Inherently racy, no point locking. */
+- if (put_user(input_pool.entropy_count, p))
++ if (put_user(input_pool.init_bits, p))
+ return -EFAULT;
+ return 0;
+ case RNDADDTOENTCNT:
+@@ -1592,41 +1376,46 @@ static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+ return -EFAULT;
+ if (ent_count < 0)
+ return -EINVAL;
+- credit_entropy_bits(ent_count);
++ credit_init_bits(ent_count);
+ return 0;
+- case RNDADDENTROPY:
++ case RNDADDENTROPY: {
++ struct iov_iter iter;
++ struct iovec iov;
++ ssize_t ret;
++ int len;
++
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+ if (get_user(ent_count, p++))
+ return -EFAULT;
+ if (ent_count < 0)
+ return -EINVAL;
+- if (get_user(size, p++))
++ if (get_user(len, p++))
+ return -EFAULT;
+- retval = write_pool((const char __user *)p, size);
+- if (retval < 0)
+- return retval;
+- credit_entropy_bits(ent_count);
++ ret = import_single_range(WRITE, p, len, &iov, &iter);
++ if (unlikely(ret))
++ return ret;
++ ret = write_pool_user(&iter);
++ if (unlikely(ret < 0))
++ return ret;
++ /* Since we're crediting, enforce that it was all written into the pool. */
++ if (unlikely(ret != len))
++ return -EFAULT;
++ credit_init_bits(ent_count);
+ return 0;
++ }
+ case RNDZAPENTCNT:
+ case RNDCLEARPOOL:
+- /*
+- * Clear the entropy pool counters. We no longer clear
+- * the entropy pool, as that's silly.
+- */
++ /* No longer has any effect. */
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+- if (xchg(&input_pool.entropy_count, 0) >= POOL_MIN_BITS) {
+- wake_up_interruptible(&random_write_wait);
+- kill_fasync(&fasync, SIGIO, POLL_OUT);
+- }
+ return 0;
+ case RNDRESEEDCRNG:
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+ if (!crng_ready())
+ return -ENODATA;
+- crng_reseed(false);
++ crng_reseed();
+ return 0;
+ default:
+ return -EINVAL;
+@@ -1639,22 +1428,26 @@ static int random_fasync(int fd, struct file *filp, int on)
+ }
+
+ const struct file_operations random_fops = {
+- .read = random_read,
+- .write = random_write,
++ .read_iter = random_read_iter,
++ .write_iter = random_write_iter,
+ .poll = random_poll,
+ .unlocked_ioctl = random_ioctl,
+ .compat_ioctl = compat_ptr_ioctl,
+ .fasync = random_fasync,
+ .llseek = noop_llseek,
++ .splice_read = generic_file_splice_read,
++ .splice_write = iter_file_splice_write,
+ };
+
+ const struct file_operations urandom_fops = {
+- .read = urandom_read,
+- .write = random_write,
++ .read_iter = urandom_read_iter,
++ .write_iter = random_write_iter,
+ .unlocked_ioctl = random_ioctl,
+ .compat_ioctl = compat_ptr_ioctl,
+ .fasync = random_fasync,
+ .llseek = noop_llseek,
++ .splice_read = generic_file_splice_read,
++ .splice_write = iter_file_splice_write,
+ };
+
+
+@@ -1678,7 +1471,7 @@ const struct file_operations urandom_fops = {
+ *
+ * - write_wakeup_threshold - the amount of entropy in the input pool
+ * below which write polls to /dev/random will unblock, requesting
+- * more entropy, tied to the POOL_MIN_BITS constant. It is writable
++ * more entropy, tied to the POOL_READY_BITS constant. It is writable
+ * to avoid breaking old userspaces, but writing to it does not
+ * change any behavior of the RNG.
+ *
+@@ -1693,7 +1486,7 @@ const struct file_operations urandom_fops = {
+ #include <linux/sysctl.h>
+
+ static int sysctl_random_min_urandom_seed = CRNG_RESEED_INTERVAL / HZ;
+-static int sysctl_random_write_wakeup_bits = POOL_MIN_BITS;
++static int sysctl_random_write_wakeup_bits = POOL_READY_BITS;
+ static int sysctl_poolsize = POOL_BITS;
+ static u8 sysctl_bootid[UUID_SIZE];
+
+@@ -1702,7 +1495,7 @@ static u8 sysctl_bootid[UUID_SIZE];
+ * UUID. The difference is in whether table->data is NULL; if it is,
+ * then a new UUID is generated and returned to the user.
+ */
+-static int proc_do_uuid(struct ctl_table *table, int write, void *buffer,
++static int proc_do_uuid(struct ctl_table *table, int write, void *buf,
+ size_t *lenp, loff_t *ppos)
+ {
+ u8 tmp_uuid[UUID_SIZE], *uuid;
+@@ -1729,14 +1522,14 @@ static int proc_do_uuid(struct ctl_table *table, int write, void *buffer,
+ }
+
+ snprintf(uuid_string, sizeof(uuid_string), "%pU", uuid);
+- return proc_dostring(&fake_table, 0, buffer, lenp, ppos);
++ return proc_dostring(&fake_table, 0, buf, lenp, ppos);
+ }
+
+ /* The same as proc_dointvec, but writes don't change anything. */
+-static int proc_do_rointvec(struct ctl_table *table, int write, void *buffer,
++static int proc_do_rointvec(struct ctl_table *table, int write, void *buf,
+ size_t *lenp, loff_t *ppos)
+ {
+- return write ? 0 : proc_dointvec(table, 0, buffer, lenp, ppos);
++ return write ? 0 : proc_dointvec(table, 0, buf, lenp, ppos);
+ }
+
+ static struct ctl_table random_table[] = {
+@@ -1749,7 +1542,7 @@ static struct ctl_table random_table[] = {
+ },
+ {
+ .procname = "entropy_avail",
+- .data = &input_pool.entropy_count,
++ .data = &input_pool.init_bits,
+ .maxlen = sizeof(int),
+ .mode = 0444,
+ .proc_handler = proc_dointvec,
+@@ -1783,8 +1576,8 @@ static struct ctl_table random_table[] = {
+ };
+
+ /*
+- * rand_initialize() is called before sysctl_init(),
+- * so we cannot call register_sysctl_init() in rand_initialize()
++ * random_init() is called before sysctl_init(),
++ * so we cannot call register_sysctl_init() in random_init()
+ */
+ static int __init random_sysctls_init(void)
+ {
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+index c5de0ec4f9d03..444acd9e2cd6a 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+@@ -227,6 +227,17 @@ int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
+ dev_dbg(dev, "sid 0x%x status 0x%x\n",
+ cl_data->sensor_idx[i], cl_data->sensor_sts[i]);
+ }
++ if (privdata->mp2_ops->discovery_status &&
++ privdata->mp2_ops->discovery_status(privdata) == 0) {
++ amd_sfh_hid_client_deinit(privdata);
++ for (i = 0; i < cl_data->num_hid_devices; i++) {
++ devm_kfree(dev, cl_data->feature_report[i]);
++ devm_kfree(dev, in_data->input_report[i]);
++ devm_kfree(dev, cl_data->report_descr[i]);
++ }
++ dev_warn(dev, "Failed to discover, sensors not enabled\n");
++ return -EOPNOTSUPP;
++ }
+ schedule_delayed_work(&cl_data->work_buffer, msecs_to_jiffies(AMD_SFH_IDLE_LOOP));
+ return 0;
+
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
+index 6b5fd90b0bd1b..e18a4efd8839e 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
+@@ -130,6 +130,12 @@ static int amd_sfh_irq_init_v2(struct amd_mp2_dev *privdata)
+ return 0;
+ }
+
++static int amd_sfh_dis_sts_v2(struct amd_mp2_dev *privdata)
++{
++ return (readl(privdata->mmio + AMD_P2C_MSG(1)) &
++ SENSOR_DISCOVERY_STATUS_MASK) >> SENSOR_DISCOVERY_STATUS_SHIFT;
++}
++
+ void amd_start_sensor(struct amd_mp2_dev *privdata, struct amd_mp2_sensor_info info)
+ {
+ union sfh_cmd_param cmd_param;
+@@ -245,6 +251,7 @@ static const struct amd_mp2_ops amd_sfh_ops_v2 = {
+ .response = amd_sfh_wait_response_v2,
+ .clear_intr = amd_sfh_clear_intr_v2,
+ .init_intr = amd_sfh_irq_init_v2,
++ .discovery_status = amd_sfh_dis_sts_v2,
+ };
+
+ static const struct amd_mp2_ops amd_sfh_ops = {
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.h b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.h
+index 97b99861fae25..9aa88a91ac8d1 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.h
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.h
+@@ -39,6 +39,9 @@
+
+ #define AMD_SFH_IDLE_LOOP 200
+
++#define SENSOR_DISCOVERY_STATUS_MASK GENMASK(5, 3)
++#define SENSOR_DISCOVERY_STATUS_SHIFT 3
++
+ /* SFH Command register */
+ union sfh_cmd_base {
+ u32 ul;
+@@ -143,5 +146,6 @@ struct amd_mp2_ops {
+ int (*response)(struct amd_mp2_dev *mp2, u8 sid, u32 sensor_sts);
+ void (*clear_intr)(struct amd_mp2_dev *privdata);
+ int (*init_intr)(struct amd_mp2_dev *privdata);
++ int (*discovery_status)(struct amd_mp2_dev *privdata);
+ };
+ #endif
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index 9f44254af8ce9..b0183450e484b 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -2677,6 +2677,7 @@ extern int install_special_mapping(struct mm_struct *mm,
+ unsigned long flags, struct page **pages);
+
+ unsigned long randomize_stack_top(unsigned long stack_top);
++unsigned long randomize_page(unsigned long start, unsigned long range);
+
+ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
+
+diff --git a/include/linux/prandom.h b/include/linux/prandom.h
+index 056d31317e499..a4aadd2dc153e 100644
+--- a/include/linux/prandom.h
++++ b/include/linux/prandom.h
+@@ -10,6 +10,7 @@
+
+ #include <linux/types.h>
+ #include <linux/percpu.h>
++#include <linux/siphash.h>
+
+ u32 prandom_u32(void);
+ void prandom_bytes(void *buf, size_t nbytes);
+@@ -27,15 +28,10 @@ DECLARE_PER_CPU(unsigned long, net_rand_noise);
+ * The core SipHash round function. Each line can be executed in
+ * parallel given enough CPU resources.
+ */
+-#define PRND_SIPROUND(v0, v1, v2, v3) ( \
+- v0 += v1, v1 = rol64(v1, 13), v2 += v3, v3 = rol64(v3, 16), \
+- v1 ^= v0, v0 = rol64(v0, 32), v3 ^= v2, \
+- v0 += v3, v3 = rol64(v3, 21), v2 += v1, v1 = rol64(v1, 17), \
+- v3 ^= v0, v1 ^= v2, v2 = rol64(v2, 32) \
+-)
++#define PRND_SIPROUND(v0, v1, v2, v3) SIPHASH_PERMUTATION(v0, v1, v2, v3)
+
+-#define PRND_K0 (0x736f6d6570736575 ^ 0x6c7967656e657261)
+-#define PRND_K1 (0x646f72616e646f6d ^ 0x7465646279746573)
++#define PRND_K0 (SIPHASH_CONST_0 ^ SIPHASH_CONST_2)
++#define PRND_K1 (SIPHASH_CONST_1 ^ SIPHASH_CONST_3)
+
+ #elif BITS_PER_LONG == 32
+ /*
+@@ -43,14 +39,9 @@ DECLARE_PER_CPU(unsigned long, net_rand_noise);
+ * This is weaker, but 32-bit machines are not used for high-traffic
+ * applications, so there is less output for an attacker to analyze.
+ */
+-#define PRND_SIPROUND(v0, v1, v2, v3) ( \
+- v0 += v1, v1 = rol32(v1, 5), v2 += v3, v3 = rol32(v3, 8), \
+- v1 ^= v0, v0 = rol32(v0, 16), v3 ^= v2, \
+- v0 += v3, v3 = rol32(v3, 7), v2 += v1, v1 = rol32(v1, 13), \
+- v3 ^= v0, v1 ^= v2, v2 = rol32(v2, 16) \
+-)
+-#define PRND_K0 0x6c796765
+-#define PRND_K1 0x74656462
++#define PRND_SIPROUND(v0, v1, v2, v3) HSIPHASH_PERMUTATION(v0, v1, v2, v3)
++#define PRND_K0 (HSIPHASH_CONST_0 ^ HSIPHASH_CONST_2)
++#define PRND_K1 (HSIPHASH_CONST_1 ^ HSIPHASH_CONST_3)
+
+ #else
+ #error Unsupported BITS_PER_LONG
+diff --git a/include/linux/random.h b/include/linux/random.h
+index f673fbb838b35..4364de2300be6 100644
+--- a/include/linux/random.h
++++ b/include/linux/random.h
+@@ -12,45 +12,33 @@
+
+ struct notifier_block;
+
+-extern void add_device_randomness(const void *, size_t);
+-extern void add_bootloader_randomness(const void *, size_t);
++void add_device_randomness(const void *buf, size_t len);
++void add_bootloader_randomness(const void *buf, size_t len);
++void add_input_randomness(unsigned int type, unsigned int code,
++ unsigned int value) __latent_entropy;
++void add_interrupt_randomness(int irq) __latent_entropy;
++void add_hwgenerator_randomness(const void *buf, size_t len, size_t entropy);
+
+ #if defined(LATENT_ENTROPY_PLUGIN) && !defined(__CHECKER__)
+ static inline void add_latent_entropy(void)
+ {
+- add_device_randomness((const void *)&latent_entropy,
+- sizeof(latent_entropy));
++ add_device_randomness((const void *)&latent_entropy, sizeof(latent_entropy));
+ }
+ #else
+-static inline void add_latent_entropy(void) {}
++static inline void add_latent_entropy(void) { }
+ #endif
+
+-extern void add_input_randomness(unsigned int type, unsigned int code,
+- unsigned int value) __latent_entropy;
+-extern void add_interrupt_randomness(int irq) __latent_entropy;
+-extern void add_hwgenerator_randomness(const void *buffer, size_t count,
+- size_t entropy);
+ #if IS_ENABLED(CONFIG_VMGENID)
+-extern void add_vmfork_randomness(const void *unique_vm_id, size_t size);
+-extern int register_random_vmfork_notifier(struct notifier_block *nb);
+-extern int unregister_random_vmfork_notifier(struct notifier_block *nb);
++void add_vmfork_randomness(const void *unique_vm_id, size_t len);
++int register_random_vmfork_notifier(struct notifier_block *nb);
++int unregister_random_vmfork_notifier(struct notifier_block *nb);
+ #else
+ static inline int register_random_vmfork_notifier(struct notifier_block *nb) { return 0; }
+ static inline int unregister_random_vmfork_notifier(struct notifier_block *nb) { return 0; }
+ #endif
+
+-extern void get_random_bytes(void *buf, size_t nbytes);
+-extern int wait_for_random_bytes(void);
+-extern int __init rand_initialize(void);
+-extern bool rng_is_initialized(void);
+-extern int register_random_ready_notifier(struct notifier_block *nb);
+-extern int unregister_random_ready_notifier(struct notifier_block *nb);
+-extern size_t __must_check get_random_bytes_arch(void *buf, size_t nbytes);
+-
+-#ifndef MODULE
+-extern const struct file_operations random_fops, urandom_fops;
+-#endif
+-
++void get_random_bytes(void *buf, size_t len);
++size_t __must_check get_random_bytes_arch(void *buf, size_t len);
+ u32 get_random_u32(void);
+ u64 get_random_u64(void);
+ static inline unsigned int get_random_int(void)
+@@ -82,11 +70,15 @@ static inline unsigned long get_random_long(void)
+
+ static inline unsigned long get_random_canary(void)
+ {
+- unsigned long val = get_random_long();
+-
+- return val & CANARY_MASK;
++ return get_random_long() & CANARY_MASK;
+ }
+
++int __init random_init(const char *command_line);
++bool rng_is_initialized(void);
++int wait_for_random_bytes(void);
++int register_random_ready_notifier(struct notifier_block *nb);
++int unregister_random_ready_notifier(struct notifier_block *nb);
++
+ /* Calls wait_for_random_bytes() and then calls get_random_bytes(buf, nbytes).
+ * Returns the result of the call to wait_for_random_bytes. */
+ static inline int get_random_bytes_wait(void *buf, size_t nbytes)
+@@ -96,22 +88,20 @@ static inline int get_random_bytes_wait(void *buf, size_t nbytes)
+ return ret;
+ }
+
+-#define declare_get_random_var_wait(var) \
+- static inline int get_random_ ## var ## _wait(var *out) { \
++#define declare_get_random_var_wait(name, ret_type) \
++ static inline int get_random_ ## name ## _wait(ret_type *out) { \
+ int ret = wait_for_random_bytes(); \
+ if (unlikely(ret)) \
+ return ret; \
+- *out = get_random_ ## var(); \
++ *out = get_random_ ## name(); \
+ return 0; \
+ }
+-declare_get_random_var_wait(u32)
+-declare_get_random_var_wait(u64)
+-declare_get_random_var_wait(int)
+-declare_get_random_var_wait(long)
++declare_get_random_var_wait(u32, u32)
++declare_get_random_var_wait(u64, u32)
++declare_get_random_var_wait(int, unsigned int)
++declare_get_random_var_wait(long, unsigned long)
+ #undef declare_get_random_var
+
+-unsigned long randomize_page(unsigned long start, unsigned long range);
+-
+ /*
+ * This is designed to be standalone for just prandom
+ * users, but for now we include it from <linux/random.h>
+@@ -122,22 +112,10 @@ unsigned long randomize_page(unsigned long start, unsigned long range);
+ #ifdef CONFIG_ARCH_RANDOM
+ # include <asm/archrandom.h>
+ #else
+-static inline bool __must_check arch_get_random_long(unsigned long *v)
+-{
+- return false;
+-}
+-static inline bool __must_check arch_get_random_int(unsigned int *v)
+-{
+- return false;
+-}
+-static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
+-{
+- return false;
+-}
+-static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
+-{
+- return false;
+-}
++static inline bool __must_check arch_get_random_long(unsigned long *v) { return false; }
++static inline bool __must_check arch_get_random_int(unsigned int *v) { return false; }
++static inline bool __must_check arch_get_random_seed_long(unsigned long *v) { return false; }
++static inline bool __must_check arch_get_random_seed_int(unsigned int *v) { return false; }
+ #endif
+
+ /*
+@@ -161,8 +139,12 @@ static inline bool __init arch_get_random_long_early(unsigned long *v)
+ #endif
+
+ #ifdef CONFIG_SMP
+-extern int random_prepare_cpu(unsigned int cpu);
+-extern int random_online_cpu(unsigned int cpu);
++int random_prepare_cpu(unsigned int cpu);
++int random_online_cpu(unsigned int cpu);
++#endif
++
++#ifndef MODULE
++extern const struct file_operations random_fops, urandom_fops;
+ #endif
+
+ #endif /* _LINUX_RANDOM_H */
+diff --git a/include/linux/security.h b/include/linux/security.h
+index 25b3ef71f495e..7fc4e9f49f542 100644
+--- a/include/linux/security.h
++++ b/include/linux/security.h
+@@ -121,10 +121,12 @@ enum lockdown_reason {
+ LOCKDOWN_DEBUGFS,
+ LOCKDOWN_XMON_WR,
+ LOCKDOWN_BPF_WRITE_USER,
++ LOCKDOWN_DBG_WRITE_KERNEL,
+ LOCKDOWN_INTEGRITY_MAX,
+ LOCKDOWN_KCORE,
+ LOCKDOWN_KPROBES,
+ LOCKDOWN_BPF_READ_KERNEL,
++ LOCKDOWN_DBG_READ_KERNEL,
+ LOCKDOWN_PERF,
+ LOCKDOWN_TRACEFS,
+ LOCKDOWN_XMON_RW,
+diff --git a/include/linux/siphash.h b/include/linux/siphash.h
+index cce8a9acc76cb..3af1428da5597 100644
+--- a/include/linux/siphash.h
++++ b/include/linux/siphash.h
+@@ -138,4 +138,32 @@ static inline u32 hsiphash(const void *data, size_t len,
+ return ___hsiphash_aligned(data, len, key);
+ }
+
++/*
++ * These macros expose the raw SipHash and HalfSipHash permutations.
++ * Do not use them directly! If you think you have a use for them,
++ * be sure to CC the maintainer of this file explaining why.
++ */
++
++#define SIPHASH_PERMUTATION(a, b, c, d) ( \
++ (a) += (b), (b) = rol64((b), 13), (b) ^= (a), (a) = rol64((a), 32), \
++ (c) += (d), (d) = rol64((d), 16), (d) ^= (c), \
++ (a) += (d), (d) = rol64((d), 21), (d) ^= (a), \
++ (c) += (b), (b) = rol64((b), 17), (b) ^= (c), (c) = rol64((c), 32))
++
++#define SIPHASH_CONST_0 0x736f6d6570736575ULL
++#define SIPHASH_CONST_1 0x646f72616e646f6dULL
++#define SIPHASH_CONST_2 0x6c7967656e657261ULL
++#define SIPHASH_CONST_3 0x7465646279746573ULL
++
++#define HSIPHASH_PERMUTATION(a, b, c, d) ( \
++ (a) += (b), (b) = rol32((b), 5), (b) ^= (a), (a) = rol32((a), 16), \
++ (c) += (d), (d) = rol32((d), 8), (d) ^= (c), \
++ (a) += (d), (d) = rol32((d), 7), (d) ^= (a), \
++ (c) += (b), (b) = rol32((b), 13), (b) ^= (c), (c) = rol32((c), 16))
++
++#define HSIPHASH_CONST_0 0U
++#define HSIPHASH_CONST_1 0U
++#define HSIPHASH_CONST_2 0x6c796765U
++#define HSIPHASH_CONST_3 0x74656462U
++
+ #endif /* _LINUX_SIPHASH_H */
+diff --git a/include/linux/timex.h b/include/linux/timex.h
+index 5745c90c88005..3871b06bd302c 100644
+--- a/include/linux/timex.h
++++ b/include/linux/timex.h
+@@ -62,6 +62,8 @@
+ #include <linux/types.h>
+ #include <linux/param.h>
+
++unsigned long random_get_entropy_fallback(void);
++
+ #include <asm/timex.h>
+
+ #ifndef random_get_entropy
+@@ -74,8 +76,14 @@
+ *
+ * By default we use get_cycles() for this purpose, but individual
+ * architectures may override this in their asm/timex.h header file.
++ * If a given arch does not have get_cycles(), then we fallback to
++ * using random_get_entropy_fallback().
+ */
++#ifdef get_cycles
+ #define random_get_entropy() ((unsigned long)get_cycles())
++#else
++#define random_get_entropy() random_get_entropy_fallback()
++#endif
+ #endif
+
+ /*
+diff --git a/init/main.c b/init/main.c
+index 98182c3c2c4b3..f057c49f1d9d8 100644
+--- a/init/main.c
++++ b/init/main.c
+@@ -1035,21 +1035,18 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
+ softirq_init();
+ timekeeping_init();
+ kfence_init();
++ time_init();
+
+ /*
+ * For best initial stack canary entropy, prepare it after:
+ * - setup_arch() for any UEFI RNG entropy and boot cmdline access
+- * - timekeeping_init() for ktime entropy used in rand_initialize()
+- * - rand_initialize() to get any arch-specific entropy like RDRAND
+- * - add_latent_entropy() to get any latent entropy
+- * - adding command line entropy
++ * - timekeeping_init() for ktime entropy used in random_init()
++ * - time_init() for making random_get_entropy() work on some platforms
++ * - random_init() to initialize the RNG from from early entropy sources
+ */
+- rand_initialize();
+- add_latent_entropy();
+- add_device_randomness(command_line, strlen(command_line));
++ random_init(command_line);
+ boot_init_stack_canary();
+
+- time_init();
+ perf_event_init();
+ profile_init();
+ call_function_init();
+diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
+index da06a5553835b..7beceb447211d 100644
+--- a/kernel/debug/debug_core.c
++++ b/kernel/debug/debug_core.c
+@@ -53,6 +53,7 @@
+ #include <linux/vmacache.h>
+ #include <linux/rcupdate.h>
+ #include <linux/irq.h>
++#include <linux/security.h>
+
+ #include <asm/cacheflush.h>
+ #include <asm/byteorder.h>
+@@ -752,6 +753,29 @@ cpu_master_loop:
+ continue;
+ kgdb_connected = 0;
+ } else {
++ /*
++ * This is a brutal way to interfere with the debugger
++ * and prevent gdb being used to poke at kernel memory.
++ * This could cause trouble if lockdown is applied when
++ * there is already an active gdb session. For now the
++ * answer is simply "don't do that". Typically lockdown
++ * *will* be applied before the debug core gets started
++ * so only developers using kgdb for fairly advanced
++ * early kernel debug can be biten by this. Hopefully
++ * they are sophisticated enough to take care of
++ * themselves, especially with help from the lockdown
++ * message printed on the console!
++ */
++ if (security_locked_down(LOCKDOWN_DBG_WRITE_KERNEL)) {
++ if (IS_ENABLED(CONFIG_KGDB_KDB)) {
++ /* Switch back to kdb if possible... */
++ dbg_kdb_mode = 1;
++ continue;
++ } else {
++ /* ... otherwise just bail */
++ break;
++ }
++ }
+ error = gdb_serial_stub(ks);
+ }
+
+diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
+index 0852a537dad4c..ead4da9471270 100644
+--- a/kernel/debug/kdb/kdb_main.c
++++ b/kernel/debug/kdb/kdb_main.c
+@@ -45,6 +45,7 @@
+ #include <linux/proc_fs.h>
+ #include <linux/uaccess.h>
+ #include <linux/slab.h>
++#include <linux/security.h>
+ #include "kdb_private.h"
+
+ #undef MODULE_PARAM_PREFIX
+@@ -166,10 +167,62 @@ struct task_struct *kdb_curr_task(int cpu)
+ }
+
+ /*
+- * Check whether the flags of the current command and the permissions
+- * of the kdb console has allow a command to be run.
++ * Update the permissions flags (kdb_cmd_enabled) to match the
++ * current lockdown state.
++ *
++ * Within this function the calls to security_locked_down() are "lazy". We
++ * avoid calling them if the current value of kdb_cmd_enabled already excludes
++ * flags that might be subject to lockdown. Additionally we deliberately check
++ * the lockdown flags independently (even though read lockdown implies write
++ * lockdown) since that results in both simpler code and clearer messages to
++ * the user on first-time debugger entry.
++ *
++ * The permission masks during a read+write lockdown permits the following
++ * flags: INSPECT, SIGNAL, REBOOT (and ALWAYS_SAFE).
++ *
++ * The INSPECT commands are not blocked during lockdown because they are
++ * not arbitrary memory reads. INSPECT covers the backtrace family (sometimes
++ * forcing them to have no arguments) and lsmod. These commands do expose
++ * some kernel state but do not allow the developer seated at the console to
++ * choose what state is reported. SIGNAL and REBOOT should not be controversial,
++ * given these are allowed for root during lockdown already.
++ */
++static void kdb_check_for_lockdown(void)
++{
++ const int write_flags = KDB_ENABLE_MEM_WRITE |
++ KDB_ENABLE_REG_WRITE |
++ KDB_ENABLE_FLOW_CTRL;
++ const int read_flags = KDB_ENABLE_MEM_READ |
++ KDB_ENABLE_REG_READ;
++
++ bool need_to_lockdown_write = false;
++ bool need_to_lockdown_read = false;
++
++ if (kdb_cmd_enabled & (KDB_ENABLE_ALL | write_flags))
++ need_to_lockdown_write =
++ security_locked_down(LOCKDOWN_DBG_WRITE_KERNEL);
++
++ if (kdb_cmd_enabled & (KDB_ENABLE_ALL | read_flags))
++ need_to_lockdown_read =
++ security_locked_down(LOCKDOWN_DBG_READ_KERNEL);
++
++ /* De-compose KDB_ENABLE_ALL if required */
++ if (need_to_lockdown_write || need_to_lockdown_read)
++ if (kdb_cmd_enabled & KDB_ENABLE_ALL)
++ kdb_cmd_enabled = KDB_ENABLE_MASK & ~KDB_ENABLE_ALL;
++
++ if (need_to_lockdown_write)
++ kdb_cmd_enabled &= ~write_flags;
++
++ if (need_to_lockdown_read)
++ kdb_cmd_enabled &= ~read_flags;
++}
++
++/*
++ * Check whether the flags of the current command, the permissions of the kdb
++ * console and the lockdown state allow a command to be run.
+ */
+-static inline bool kdb_check_flags(kdb_cmdflags_t flags, int permissions,
++static bool kdb_check_flags(kdb_cmdflags_t flags, int permissions,
+ bool no_args)
+ {
+ /* permissions comes from userspace so needs massaging slightly */
+@@ -1180,6 +1233,9 @@ static int kdb_local(kdb_reason_t reason, int error, struct pt_regs *regs,
+ kdb_curr_task(raw_smp_processor_id());
+
+ KDB_DEBUG_STATE("kdb_local 1", reason);
++
++ kdb_check_for_lockdown();
++
+ kdb_go_count = 0;
+ if (reason == KDB_REASON_DEBUG) {
+ /* special case below */
+diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
+index 3b1398fbddaf8..871c912860ed5 100644
+--- a/kernel/time/timekeeping.c
++++ b/kernel/time/timekeeping.c
+@@ -17,6 +17,7 @@
+ #include <linux/clocksource.h>
+ #include <linux/jiffies.h>
+ #include <linux/time.h>
++#include <linux/timex.h>
+ #include <linux/tick.h>
+ #include <linux/stop_machine.h>
+ #include <linux/pvclock_gtod.h>
+@@ -2380,6 +2381,20 @@ static int timekeeping_validate_timex(const struct __kernel_timex *txc)
+ return 0;
+ }
+
++/**
++ * random_get_entropy_fallback - Returns the raw clock source value,
++ * used by random.c for platforms with no valid random_get_entropy().
++ */
++unsigned long random_get_entropy_fallback(void)
++{
++ struct tk_read_base *tkr = &tk_core.timekeeper.tkr_mono;
++ struct clocksource *clock = READ_ONCE(tkr->clock);
++
++ if (unlikely(timekeeping_suspended || !clock))
++ return 0;
++ return clock->read(clock);
++}
++EXPORT_SYMBOL_GPL(random_get_entropy_fallback);
+
+ /**
+ * do_adjtimex() - Accessor function to NTP __do_adjtimex function
+diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
+index 075cd25363ac3..7e282970177a8 100644
+--- a/lib/Kconfig.debug
++++ b/lib/Kconfig.debug
+@@ -1616,8 +1616,7 @@ config WARN_ALL_UNSEEDED_RANDOM
+ so architecture maintainers really need to do what they can
+ to get the CRNG seeded sooner after the system is booted.
+ However, since users cannot do anything actionable to
+- address this, by default the kernel will issue only a single
+- warning for the first use of unseeded randomness.
++ address this, by default this option is disabled.
+
+ Say Y here if you want to receive warnings for all uses of
+ unseeded randomness. This will be of use primarily for
+diff --git a/lib/siphash.c b/lib/siphash.c
+index 72b9068ab57bf..71d315a6ad623 100644
+--- a/lib/siphash.c
++++ b/lib/siphash.c
+@@ -18,19 +18,13 @@
+ #include <asm/word-at-a-time.h>
+ #endif
+
+-#define SIPROUND \
+- do { \
+- v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
+- v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
+- v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
+- v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
+- } while (0)
++#define SIPROUND SIPHASH_PERMUTATION(v0, v1, v2, v3)
+
+ #define PREAMBLE(len) \
+- u64 v0 = 0x736f6d6570736575ULL; \
+- u64 v1 = 0x646f72616e646f6dULL; \
+- u64 v2 = 0x6c7967656e657261ULL; \
+- u64 v3 = 0x7465646279746573ULL; \
++ u64 v0 = SIPHASH_CONST_0; \
++ u64 v1 = SIPHASH_CONST_1; \
++ u64 v2 = SIPHASH_CONST_2; \
++ u64 v3 = SIPHASH_CONST_3; \
+ u64 b = ((u64)(len)) << 56; \
+ v3 ^= key->key[1]; \
+ v2 ^= key->key[0]; \
+@@ -389,19 +383,13 @@ u32 hsiphash_4u32(const u32 first, const u32 second, const u32 third,
+ }
+ EXPORT_SYMBOL(hsiphash_4u32);
+ #else
+-#define HSIPROUND \
+- do { \
+- v0 += v1; v1 = rol32(v1, 5); v1 ^= v0; v0 = rol32(v0, 16); \
+- v2 += v3; v3 = rol32(v3, 8); v3 ^= v2; \
+- v0 += v3; v3 = rol32(v3, 7); v3 ^= v0; \
+- v2 += v1; v1 = rol32(v1, 13); v1 ^= v2; v2 = rol32(v2, 16); \
+- } while (0)
++#define HSIPROUND HSIPHASH_PERMUTATION(v0, v1, v2, v3)
+
+ #define HPREAMBLE(len) \
+- u32 v0 = 0; \
+- u32 v1 = 0; \
+- u32 v2 = 0x6c796765U; \
+- u32 v3 = 0x74656462U; \
++ u32 v0 = HSIPHASH_CONST_0; \
++ u32 v1 = HSIPHASH_CONST_1; \
++ u32 v2 = HSIPHASH_CONST_2; \
++ u32 v3 = HSIPHASH_CONST_3; \
+ u32 b = ((u32)(len)) << 24; \
+ v3 ^= key->key[1]; \
+ v2 ^= key->key[0]; \
+diff --git a/mm/util.c b/mm/util.c
+index 3492a9e81aa3a..ac63e5ca8b211 100644
+--- a/mm/util.c
++++ b/mm/util.c
+@@ -343,6 +343,38 @@ unsigned long randomize_stack_top(unsigned long stack_top)
+ #endif
+ }
+
++/**
++ * randomize_page - Generate a random, page aligned address
++ * @start: The smallest acceptable address the caller will take.
++ * @range: The size of the area, starting at @start, within which the
++ * random address must fall.
++ *
++ * If @start + @range would overflow, @range is capped.
++ *
++ * NOTE: Historical use of randomize_range, which this replaces, presumed that
++ * @start was already page aligned. We now align it regardless.
++ *
++ * Return: A page aligned address within [start, start + range). On error,
++ * @start is returned.
++ */
++unsigned long randomize_page(unsigned long start, unsigned long range)
++{
++ if (!PAGE_ALIGNED(start)) {
++ range -= PAGE_ALIGN(start) - start;
++ start = PAGE_ALIGN(start);
++ }
++
++ if (start > ULONG_MAX - range)
++ range = ULONG_MAX - start;
++
++ range >>= PAGE_SHIFT;
++
++ if (range == 0)
++ return start;
++
++ return start + (get_random_long() % range << PAGE_SHIFT);
++}
++
+ #ifdef CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
+ unsigned long arch_randomize_brk(struct mm_struct *mm)
+ {
+diff --git a/security/security.c b/security/security.c
+index b7cf5cbfdc677..aaf6566deb9f0 100644
+--- a/security/security.c
++++ b/security/security.c
+@@ -59,10 +59,12 @@ const char *const lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] = {
+ [LOCKDOWN_DEBUGFS] = "debugfs access",
+ [LOCKDOWN_XMON_WR] = "xmon write access",
+ [LOCKDOWN_BPF_WRITE_USER] = "use of bpf to write user RAM",
++ [LOCKDOWN_DBG_WRITE_KERNEL] = "use of kgdb/kdb to write kernel RAM",
+ [LOCKDOWN_INTEGRITY_MAX] = "integrity",
+ [LOCKDOWN_KCORE] = "/proc/kcore access",
+ [LOCKDOWN_KPROBES] = "use of kprobes",
+ [LOCKDOWN_BPF_READ_KERNEL] = "use of bpf to read kernel RAM",
++ [LOCKDOWN_DBG_READ_KERNEL] = "use of kgdb/kdb to read kernel RAM",
+ [LOCKDOWN_PERF] = "unsafe use of perf",
+ [LOCKDOWN_TRACEFS] = "use of tracefs",
+ [LOCKDOWN_XMON_RW] = "xmon read and write access",
+diff --git a/sound/pci/ctxfi/ctatc.c b/sound/pci/ctxfi/ctatc.c
+index 78f35e88aed6b..fbdb8a3d5b8e5 100644
+--- a/sound/pci/ctxfi/ctatc.c
++++ b/sound/pci/ctxfi/ctatc.c
+@@ -36,6 +36,7 @@
+ | ((IEC958_AES3_CON_FS_48000) << 24))
+
+ static const struct snd_pci_quirk subsys_20k1_list[] = {
++ SND_PCI_QUIRK(PCI_VENDOR_ID_CREATIVE, 0x0021, "SB046x", CTSB046X),
+ SND_PCI_QUIRK(PCI_VENDOR_ID_CREATIVE, 0x0022, "SB055x", CTSB055X),
+ SND_PCI_QUIRK(PCI_VENDOR_ID_CREATIVE, 0x002f, "SB055x", CTSB055X),
+ SND_PCI_QUIRK(PCI_VENDOR_ID_CREATIVE, 0x0029, "SB073x", CTSB073X),
+@@ -64,6 +65,7 @@ static const struct snd_pci_quirk subsys_20k2_list[] = {
+
+ static const char *ct_subsys_name[NUM_CTCARDS] = {
+ /* 20k1 models */
++ [CTSB046X] = "SB046x",
+ [CTSB055X] = "SB055x",
+ [CTSB073X] = "SB073x",
+ [CTUAA] = "UAA",
+diff --git a/sound/pci/ctxfi/cthardware.h b/sound/pci/ctxfi/cthardware.h
+index f406b626a28c4..2875cec83b8f2 100644
+--- a/sound/pci/ctxfi/cthardware.h
++++ b/sound/pci/ctxfi/cthardware.h
+@@ -26,8 +26,9 @@ enum CHIPTYP {
+
+ enum CTCARDS {
+ /* 20k1 models */
++ CTSB046X,
++ CT20K1_MODEL_FIRST = CTSB046X,
+ CTSB055X,
+- CT20K1_MODEL_FIRST = CTSB055X,
+ CTSB073X,
+ CTUAA,
+ CT20K1_UNKNOWN,
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-05-30 21:42 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-05-30 21:42 UTC (permalink / raw
To: gentoo-commits
commit: 27ddd65ea4863e7523fea2eb73570a4698d2d415
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Mon May 30 21:42:06 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Mon May 30 21:42:06 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=27ddd65e
Update BMQ 5.18-r1 patch
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch b/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
index 9b1a8ba1..a130157e 100644
--- a/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
+++ b/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
@@ -632,10 +632,10 @@ index 976092b7bd45..31d587c16ec1 100644
obj-y += build_utility.o
diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
new file mode 100644
-index 000000000000..a466a05301b8
+index 000000000000..189332cd6f99
--- /dev/null
+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,7765 @@
+@@ -0,0 +1,7768 @@
+/*
+ * kernel/sched/alt_core.c
+ *
@@ -667,6 +667,7 @@ index 000000000000..a466a05301b8
+#include <linux/kprobes.h>
+#include <linux/profile.h>
+#include <linux/nmi.h>
++#include <linux/scs.h>
+
+#include <uapi/linux/sched/types.h>
+
@@ -678,6 +679,8 @@ index 000000000000..a466a05301b8
+
+#include "sched.h"
+
++#include "pelt.h"
++
+#include "../../fs/io-wq.h"
+#include "../smpboot.h"
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-06 11:01 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-06 11:01 UTC (permalink / raw
To: gentoo-commits
commit: 71d470c013d9ba1388d32e220ff5cf87fb8eb6cd
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Mon Jun 6 11:00:51 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Mon Jun 6 11:00:51 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=71d470c0
Linux patch 5.18.2
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1001_linux-5.18.2.patch | 2830 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 2834 insertions(+)
diff --git a/0000_README b/0000_README
index 62ab5b31..561c7140 100644
--- a/0000_README
+++ b/0000_README
@@ -47,6 +47,10 @@ Patch: 1000_linux-5.18.1.patch
From: http://www.kernel.org
Desc: Linux 5.18.1
+Patch: 1001_linux-5.18.2.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.2
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1001_linux-5.18.2.patch b/1001_linux-5.18.2.patch
new file mode 100644
index 00000000..609efb82
--- /dev/null
+++ b/1001_linux-5.18.2.patch
@@ -0,0 +1,2830 @@
+diff --git a/Documentation/process/submitting-patches.rst b/Documentation/process/submitting-patches.rst
+index fb496b2ebfd38..92d3432460d75 100644
+--- a/Documentation/process/submitting-patches.rst
++++ b/Documentation/process/submitting-patches.rst
+@@ -77,7 +77,7 @@ as you intend it to.
+
+ The maintainer will thank you if you write your patch description in a
+ form which can be easily pulled into Linux's source code management
+-system, ``git``, as a "commit log". See :ref:`explicit_in_reply_to`.
++system, ``git``, as a "commit log". See :ref:`the_canonical_patch_format`.
+
+ Solve only one problem per patch. If your description starts to get
+ long, that's a sign that you probably need to split up your patch.
+diff --git a/Makefile b/Makefile
+index 2bb168acb8f43..6b1d606a92f6f 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 1
++SUBLEVEL = 2
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm/boot/dts/s5pv210-aries.dtsi b/arch/arm/boot/dts/s5pv210-aries.dtsi
+index c8f1c324a6c26..26f2be2d9faa2 100644
+--- a/arch/arm/boot/dts/s5pv210-aries.dtsi
++++ b/arch/arm/boot/dts/s5pv210-aries.dtsi
+@@ -895,7 +895,7 @@
+ device-wakeup-gpios = <&gpg3 4 GPIO_ACTIVE_HIGH>;
+ interrupt-parent = <&gph2>;
+ interrupts = <5 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "host-wake";
++ interrupt-names = "host-wakeup";
+ };
+ };
+
+diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
+index 45c993dd05f5e..36f2314c58e5f 100644
+--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
++++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
+@@ -361,13 +361,15 @@ static bool kvmppc_gfn_is_uvmem_pfn(unsigned long gfn, struct kvm *kvm,
+ static bool kvmppc_next_nontransitioned_gfn(const struct kvm_memory_slot *memslot,
+ struct kvm *kvm, unsigned long *gfn)
+ {
+- struct kvmppc_uvmem_slot *p;
++ struct kvmppc_uvmem_slot *p = NULL, *iter;
+ bool ret = false;
+ unsigned long i;
+
+- list_for_each_entry(p, &kvm->arch.uvmem_pfns, list)
+- if (*gfn >= p->base_pfn && *gfn < p->base_pfn + p->nr_pfns)
++ list_for_each_entry(iter, &kvm->arch.uvmem_pfns, list)
++ if (*gfn >= iter->base_pfn && *gfn < iter->base_pfn + iter->nr_pfns) {
++ p = iter;
+ break;
++ }
+ if (!p)
+ return ret;
+ /*
+diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
+index f78e2b3501a19..35f222aa66bfc 100644
+--- a/arch/x86/include/asm/uaccess.h
++++ b/arch/x86/include/asm/uaccess.h
+@@ -382,6 +382,103 @@ do { \
+
+ #endif // CONFIG_CC_HAS_ASM_GOTO_OUTPUT
+
++#ifdef CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT
++#define __try_cmpxchg_user_asm(itype, ltype, _ptr, _pold, _new, label) ({ \
++ bool success; \
++ __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
++ __typeof__(*(_ptr)) __old = *_old; \
++ __typeof__(*(_ptr)) __new = (_new); \
++ asm_volatile_goto("\n" \
++ "1: " LOCK_PREFIX "cmpxchg"itype" %[new], %[ptr]\n"\
++ _ASM_EXTABLE_UA(1b, %l[label]) \
++ : CC_OUT(z) (success), \
++ [ptr] "+m" (*_ptr), \
++ [old] "+a" (__old) \
++ : [new] ltype (__new) \
++ : "memory" \
++ : label); \
++ if (unlikely(!success)) \
++ *_old = __old; \
++ likely(success); })
++
++#ifdef CONFIG_X86_32
++#define __try_cmpxchg64_user_asm(_ptr, _pold, _new, label) ({ \
++ bool success; \
++ __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
++ __typeof__(*(_ptr)) __old = *_old; \
++ __typeof__(*(_ptr)) __new = (_new); \
++ asm_volatile_goto("\n" \
++ "1: " LOCK_PREFIX "cmpxchg8b %[ptr]\n" \
++ _ASM_EXTABLE_UA(1b, %l[label]) \
++ : CC_OUT(z) (success), \
++ "+A" (__old), \
++ [ptr] "+m" (*_ptr) \
++ : "b" ((u32)__new), \
++ "c" ((u32)((u64)__new >> 32)) \
++ : "memory" \
++ : label); \
++ if (unlikely(!success)) \
++ *_old = __old; \
++ likely(success); })
++#endif // CONFIG_X86_32
++#else // !CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT
++#define __try_cmpxchg_user_asm(itype, ltype, _ptr, _pold, _new, label) ({ \
++ int __err = 0; \
++ bool success; \
++ __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
++ __typeof__(*(_ptr)) __old = *_old; \
++ __typeof__(*(_ptr)) __new = (_new); \
++ asm volatile("\n" \
++ "1: " LOCK_PREFIX "cmpxchg"itype" %[new], %[ptr]\n"\
++ CC_SET(z) \
++ "2:\n" \
++ _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG, \
++ %[errout]) \
++ : CC_OUT(z) (success), \
++ [errout] "+r" (__err), \
++ [ptr] "+m" (*_ptr), \
++ [old] "+a" (__old) \
++ : [new] ltype (__new) \
++ : "memory", "cc"); \
++ if (unlikely(__err)) \
++ goto label; \
++ if (unlikely(!success)) \
++ *_old = __old; \
++ likely(success); })
++
++#ifdef CONFIG_X86_32
++/*
++ * Unlike the normal CMPXCHG, hardcode ECX for both success/fail and error.
++ * There are only six GPRs available and four (EAX, EBX, ECX, and EDX) are
++ * hardcoded by CMPXCHG8B, leaving only ESI and EDI. If the compiler uses
++ * both ESI and EDI for the memory operand, compilation will fail if the error
++ * is an input+output as there will be no register available for input.
++ */
++#define __try_cmpxchg64_user_asm(_ptr, _pold, _new, label) ({ \
++ int __result; \
++ __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
++ __typeof__(*(_ptr)) __old = *_old; \
++ __typeof__(*(_ptr)) __new = (_new); \
++ asm volatile("\n" \
++ "1: " LOCK_PREFIX "cmpxchg8b %[ptr]\n" \
++ "mov $0, %%ecx\n\t" \
++ "setz %%cl\n" \
++ "2:\n" \
++ _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG, %%ecx) \
++ : [result]"=c" (__result), \
++ "+A" (__old), \
++ [ptr] "+m" (*_ptr) \
++ : "b" ((u32)__new), \
++ "c" ((u32)((u64)__new >> 32)) \
++ : "memory", "cc"); \
++ if (unlikely(__result < 0)) \
++ goto label; \
++ if (unlikely(!__result)) \
++ *_old = __old; \
++ likely(__result); })
++#endif // CONFIG_X86_32
++#endif // CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT
++
+ /* FIXME: this hack is definitely wrong -AK */
+ struct __large_struct { unsigned long buf[100]; };
+ #define __m(x) (*(struct __large_struct __user *)(x))
+@@ -474,6 +571,51 @@ do { \
+ } while (0)
+ #endif // CONFIG_CC_HAS_ASM_GOTO_OUTPUT
+
++extern void __try_cmpxchg_user_wrong_size(void);
++
++#ifndef CONFIG_X86_32
++#define __try_cmpxchg64_user_asm(_ptr, _oldp, _nval, _label) \
++ __try_cmpxchg_user_asm("q", "r", (_ptr), (_oldp), (_nval), _label)
++#endif
++
++/*
++ * Force the pointer to u<size> to match the size expected by the asm helper.
++ * clang/LLVM compiles all cases and only discards the unused paths after
++ * processing errors, which breaks i386 if the pointer is an 8-byte value.
++ */
++#define unsafe_try_cmpxchg_user(_ptr, _oldp, _nval, _label) ({ \
++ bool __ret; \
++ __chk_user_ptr(_ptr); \
++ switch (sizeof(*(_ptr))) { \
++ case 1: __ret = __try_cmpxchg_user_asm("b", "q", \
++ (__force u8 *)(_ptr), (_oldp), \
++ (_nval), _label); \
++ break; \
++ case 2: __ret = __try_cmpxchg_user_asm("w", "r", \
++ (__force u16 *)(_ptr), (_oldp), \
++ (_nval), _label); \
++ break; \
++ case 4: __ret = __try_cmpxchg_user_asm("l", "r", \
++ (__force u32 *)(_ptr), (_oldp), \
++ (_nval), _label); \
++ break; \
++ case 8: __ret = __try_cmpxchg64_user_asm((__force u64 *)(_ptr), (_oldp),\
++ (_nval), _label); \
++ break; \
++ default: __try_cmpxchg_user_wrong_size(); \
++ } \
++ __ret; })
++
++/* "Returns" 0 on success, 1 on failure, -EFAULT if the access faults. */
++#define __try_cmpxchg_user(_ptr, _oldp, _nval, _label) ({ \
++ int __ret = -EFAULT; \
++ __uaccess_begin_nospec(); \
++ __ret = !unsafe_try_cmpxchg_user(_ptr, _oldp, _nval, _label); \
++_label: \
++ __uaccess_end(); \
++ __ret; \
++ })
++
+ /*
+ * We want the unsafe accessors to always be inlined and use
+ * the error labels - thus the macro games.
+diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
+index 7c63a1911fae9..3c24e6124d955 100644
+--- a/arch/x86/kernel/cpu/sgx/encl.c
++++ b/arch/x86/kernel/cpu/sgx/encl.c
+@@ -12,6 +12,92 @@
+ #include "encls.h"
+ #include "sgx.h"
+
++#define PCMDS_PER_PAGE (PAGE_SIZE / sizeof(struct sgx_pcmd))
++/*
++ * 32 PCMD entries share a PCMD page. PCMD_FIRST_MASK is used to
++ * determine the page index associated with the first PCMD entry
++ * within a PCMD page.
++ */
++#define PCMD_FIRST_MASK GENMASK(4, 0)
++
++/**
++ * reclaimer_writing_to_pcmd() - Query if any enclave page associated with
++ * a PCMD page is in process of being reclaimed.
++ * @encl: Enclave to which PCMD page belongs
++ * @start_addr: Address of enclave page using first entry within the PCMD page
++ *
++ * When an enclave page is reclaimed some Paging Crypto MetaData (PCMD) is
++ * stored. The PCMD data of a reclaimed enclave page contains enough
++ * information for the processor to verify the page at the time
++ * it is loaded back into the Enclave Page Cache (EPC).
++ *
++ * The backing storage to which enclave pages are reclaimed is laid out as
++ * follows:
++ * Encrypted enclave pages:SECS page:PCMD pages
++ *
++ * Each PCMD page contains the PCMD metadata of
++ * PAGE_SIZE/sizeof(struct sgx_pcmd) enclave pages.
++ *
++ * A PCMD page can only be truncated if it is (a) empty, and (b) not in the
++ * process of getting data (and thus soon being non-empty). (b) is tested with
++ * a check if an enclave page sharing the PCMD page is in the process of being
++ * reclaimed.
++ *
++ * The reclaimer sets the SGX_ENCL_PAGE_BEING_RECLAIMED flag when it
++ * intends to reclaim that enclave page - it means that the PCMD page
++ * associated with that enclave page is about to get some data and thus
++ * even if the PCMD page is empty, it should not be truncated.
++ *
++ * Context: Enclave mutex (&sgx_encl->lock) must be held.
++ * Return: 1 if the reclaimer is about to write to the PCMD page
++ * 0 if the reclaimer has no intention to write to the PCMD page
++ */
++static int reclaimer_writing_to_pcmd(struct sgx_encl *encl,
++ unsigned long start_addr)
++{
++ int reclaimed = 0;
++ int i;
++
++ /*
++ * PCMD_FIRST_MASK is based on number of PCMD entries within
++ * PCMD page being 32.
++ */
++ BUILD_BUG_ON(PCMDS_PER_PAGE != 32);
++
++ for (i = 0; i < PCMDS_PER_PAGE; i++) {
++ struct sgx_encl_page *entry;
++ unsigned long addr;
++
++ addr = start_addr + i * PAGE_SIZE;
++
++ /*
++ * Stop when reaching the SECS page - it does not
++ * have a page_array entry and its reclaim is
++ * started and completed with enclave mutex held so
++ * it does not use the SGX_ENCL_PAGE_BEING_RECLAIMED
++ * flag.
++ */
++ if (addr == encl->base + encl->size)
++ break;
++
++ entry = xa_load(&encl->page_array, PFN_DOWN(addr));
++ if (!entry)
++ continue;
++
++ /*
++ * VA page slot ID uses same bit as the flag so it is important
++ * to ensure that the page is not already in backing store.
++ */
++ if (entry->epc_page &&
++ (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)) {
++ reclaimed = 1;
++ break;
++ }
++ }
++
++ return reclaimed;
++}
++
+ /*
+ * Calculate byte offset of a PCMD struct associated with an enclave page. PCMD's
+ * follow right after the EPC data in the backing storage. In addition to the
+@@ -47,6 +133,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
+ unsigned long va_offset = encl_page->desc & SGX_ENCL_PAGE_VA_OFFSET_MASK;
+ struct sgx_encl *encl = encl_page->encl;
+ pgoff_t page_index, page_pcmd_off;
++ unsigned long pcmd_first_page;
+ struct sgx_pageinfo pginfo;
+ struct sgx_backing b;
+ bool pcmd_page_empty;
+@@ -58,6 +145,11 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
+ else
+ page_index = PFN_DOWN(encl->size);
+
++ /*
++ * Address of enclave page using the first entry within the PCMD page.
++ */
++ pcmd_first_page = PFN_PHYS(page_index & ~PCMD_FIRST_MASK) + encl->base;
++
+ page_pcmd_off = sgx_encl_get_backing_page_pcmd_offset(encl, page_index);
+
+ ret = sgx_encl_get_backing(encl, page_index, &b);
+@@ -84,6 +176,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
+ }
+
+ memset(pcmd_page + b.pcmd_offset, 0, sizeof(struct sgx_pcmd));
++ set_page_dirty(b.pcmd);
+
+ /*
+ * The area for the PCMD in the page was zeroed above. Check if the
+@@ -94,12 +187,20 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
+ kunmap_atomic(pcmd_page);
+ kunmap_atomic((void *)(unsigned long)pginfo.contents);
+
+- sgx_encl_put_backing(&b, false);
++ get_page(b.pcmd);
++ sgx_encl_put_backing(&b);
+
+ sgx_encl_truncate_backing_page(encl, page_index);
+
+- if (pcmd_page_empty)
++ if (pcmd_page_empty && !reclaimer_writing_to_pcmd(encl, pcmd_first_page)) {
+ sgx_encl_truncate_backing_page(encl, PFN_DOWN(page_pcmd_off));
++ pcmd_page = kmap_atomic(b.pcmd);
++ if (memchr_inv(pcmd_page, 0, PAGE_SIZE))
++ pr_warn("PCMD page not empty after truncate.\n");
++ kunmap_atomic(pcmd_page);
++ }
++
++ put_page(b.pcmd);
+
+ return ret;
+ }
+@@ -645,15 +746,9 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
+ /**
+ * sgx_encl_put_backing() - Unpin the backing storage
+ * @backing: data for accessing backing storage for the page
+- * @do_write: mark pages dirty
+ */
+-void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write)
++void sgx_encl_put_backing(struct sgx_backing *backing)
+ {
+- if (do_write) {
+- set_page_dirty(backing->pcmd);
+- set_page_dirty(backing->contents);
+- }
+-
+ put_page(backing->pcmd);
+ put_page(backing->contents);
+ }
+diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
+index fec43ca65065b..d44e7372151f0 100644
+--- a/arch/x86/kernel/cpu/sgx/encl.h
++++ b/arch/x86/kernel/cpu/sgx/encl.h
+@@ -107,7 +107,7 @@ void sgx_encl_release(struct kref *ref);
+ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
+ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
+ struct sgx_backing *backing);
+-void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
++void sgx_encl_put_backing(struct sgx_backing *backing);
+ int sgx_encl_test_and_clear_young(struct mm_struct *mm,
+ struct sgx_encl_page *page);
+
+diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
+index 8e4bc6453d263..ab4ec54bbdd94 100644
+--- a/arch/x86/kernel/cpu/sgx/main.c
++++ b/arch/x86/kernel/cpu/sgx/main.c
+@@ -191,6 +191,8 @@ static int __sgx_encl_ewb(struct sgx_epc_page *epc_page, void *va_slot,
+ backing->pcmd_offset;
+
+ ret = __ewb(&pginfo, sgx_get_epc_virt_addr(epc_page), va_slot);
++ set_page_dirty(backing->pcmd);
++ set_page_dirty(backing->contents);
+
+ kunmap_atomic((void *)(unsigned long)(pginfo.metadata -
+ backing->pcmd_offset));
+@@ -308,6 +310,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
+ sgx_encl_ewb(epc_page, backing);
+ encl_page->epc_page = NULL;
+ encl->secs_child_cnt--;
++ sgx_encl_put_backing(backing);
+
+ if (!encl->secs_child_cnt && test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) {
+ ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size),
+@@ -320,7 +323,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
+ sgx_encl_free_epc_page(encl->secs.epc_page);
+ encl->secs.epc_page = NULL;
+
+- sgx_encl_put_backing(&secs_backing, true);
++ sgx_encl_put_backing(&secs_backing);
+ }
+
+ out:
+@@ -379,11 +382,14 @@ static void sgx_reclaim_pages(void)
+ goto skip;
+
+ page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base);
++
++ mutex_lock(&encl_page->encl->lock);
+ ret = sgx_encl_get_backing(encl_page->encl, page_index, &backing[i]);
+- if (ret)
++ if (ret) {
++ mutex_unlock(&encl_page->encl->lock);
+ goto skip;
++ }
+
+- mutex_lock(&encl_page->encl->lock);
+ encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED;
+ mutex_unlock(&encl_page->encl->lock);
+ continue;
+@@ -411,7 +417,6 @@ skip:
+
+ encl_page = epc_page->owner;
+ sgx_reclaimer_write(epc_page, &backing[i]);
+- sgx_encl_put_backing(&backing[i], true);
+
+ kref_put(&encl_page->encl->refcount, sgx_encl_release);
+ epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
+diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
+index e28ab0ecc5378..0fdc807ae13f8 100644
+--- a/arch/x86/kernel/fpu/core.c
++++ b/arch/x86/kernel/fpu/core.c
+@@ -14,6 +14,8 @@
+ #include <asm/traps.h>
+ #include <asm/irq_regs.h>
+
++#include <uapi/asm/kvm.h>
++
+ #include <linux/hardirq.h>
+ #include <linux/pkeys.h>
+ #include <linux/vmalloc.h>
+@@ -232,7 +234,20 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
+ gfpu->fpstate = fpstate;
+ gfpu->xfeatures = fpu_user_cfg.default_features;
+ gfpu->perm = fpu_user_cfg.default_features;
+- gfpu->uabi_size = fpu_user_cfg.default_size;
++
++ /*
++ * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
++ * to userspace, even when XSAVE is unsupported, so that restoring FPU
++ * state on a different CPU that does support XSAVE can cleanly load
++ * the incoming state using its natural XSAVE. In other words, KVM's
++ * uABI size may be larger than this host's default size. Conversely,
++ * the default size should never be larger than KVM's base uABI size;
++ * all features that can expand the uABI size must be opt-in.
++ */
++ gfpu->uabi_size = sizeof(struct kvm_xsave);
++ if (WARN_ON_ONCE(fpu_user_cfg.default_size > gfpu->uabi_size))
++ gfpu->uabi_size = fpu_user_cfg.default_size;
++
+ fpu_init_guest_permissions(gfpu);
+
+ return true;
+diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
+index 8b1c45c9cda87..1a55bf700f926 100644
+--- a/arch/x86/kernel/kvm.c
++++ b/arch/x86/kernel/kvm.c
+@@ -191,7 +191,7 @@ void kvm_async_pf_task_wake(u32 token)
+ {
+ u32 key = hash_32(token, KVM_TASK_SLEEP_HASHBITS);
+ struct kvm_task_sleep_head *b = &async_pf_sleepers[key];
+- struct kvm_task_sleep_node *n;
++ struct kvm_task_sleep_node *n, *dummy = NULL;
+
+ if (token == ~0) {
+ apf_task_wake_all();
+@@ -203,28 +203,41 @@ again:
+ n = _find_apf_task(b, token);
+ if (!n) {
+ /*
+- * async PF was not yet handled.
+- * Add dummy entry for the token.
++ * Async #PF not yet handled, add a dummy entry for the token.
++ * Allocating the token must be down outside of the raw lock
++ * as the allocator is preemptible on PREEMPT_RT kernels.
+ */
+- n = kzalloc(sizeof(*n), GFP_ATOMIC);
+- if (!n) {
++ if (!dummy) {
++ raw_spin_unlock(&b->lock);
++ dummy = kzalloc(sizeof(*dummy), GFP_ATOMIC);
++
+ /*
+- * Allocation failed! Busy wait while other cpu
+- * handles async PF.
++ * Continue looping on allocation failure, eventually
++ * the async #PF will be handled and allocating a new
++ * node will be unnecessary.
++ */
++ if (!dummy)
++ cpu_relax();
++
++ /*
++ * Recheck for async #PF completion before enqueueing
++ * the dummy token to avoid duplicate list entries.
+ */
+- raw_spin_unlock(&b->lock);
+- cpu_relax();
+ goto again;
+ }
+- n->token = token;
+- n->cpu = smp_processor_id();
+- init_swait_queue_head(&n->wq);
+- hlist_add_head(&n->link, &b->list);
++ dummy->token = token;
++ dummy->cpu = smp_processor_id();
++ init_swait_queue_head(&dummy->wq);
++ hlist_add_head(&dummy->link, &b->list);
++ dummy = NULL;
+ } else {
+ apf_task_wake_one(n);
+ }
+ raw_spin_unlock(&b->lock);
+- return;
++
++ /* A dummy token might be allocated and ultimately not used. */
++ if (dummy)
++ kfree(dummy);
+ }
+ EXPORT_SYMBOL_GPL(kvm_async_pf_task_wake);
+
+diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
+index 45e1573f8f1d3..cf48ac96ceecb 100644
+--- a/arch/x86/kvm/mmu/mmu.c
++++ b/arch/x86/kvm/mmu/mmu.c
+@@ -1843,17 +1843,14 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm,
+ &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)]) \
+ if ((_sp)->gfn != (_gfn) || (_sp)->role.direct) {} else
+
+-static bool kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
++static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
+ struct list_head *invalid_list)
+ {
+ int ret = vcpu->arch.mmu->sync_page(vcpu, sp);
+
+- if (ret < 0) {
++ if (ret < 0)
+ kvm_mmu_prepare_zap_page(vcpu->kvm, sp, invalid_list);
+- return false;
+- }
+-
+- return !!ret;
++ return ret;
+ }
+
+ static bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm,
+@@ -1975,7 +1972,7 @@ static int mmu_sync_children(struct kvm_vcpu *vcpu,
+
+ for_each_sp(pages, sp, parents, i) {
+ kvm_unlink_unsync_page(vcpu->kvm, sp);
+- flush |= kvm_sync_page(vcpu, sp, &invalid_list);
++ flush |= kvm_sync_page(vcpu, sp, &invalid_list) > 0;
+ mmu_pages_clear_parents(&parents);
+ }
+ if (need_resched() || rwlock_needbreak(&vcpu->kvm->mmu_lock)) {
+@@ -2016,6 +2013,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
+ struct hlist_head *sp_list;
+ unsigned quadrant;
+ struct kvm_mmu_page *sp;
++ int ret;
+ int collisions = 0;
+ LIST_HEAD(invalid_list);
+
+@@ -2068,11 +2066,13 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
+ * If the sync fails, the page is zapped. If so, break
+ * in order to rebuild it.
+ */
+- if (!kvm_sync_page(vcpu, sp, &invalid_list))
++ ret = kvm_sync_page(vcpu, sp, &invalid_list);
++ if (ret < 0)
+ break;
+
+ WARN_ON(!list_empty(&invalid_list));
+- kvm_flush_remote_tlbs(vcpu->kvm);
++ if (ret > 0)
++ kvm_flush_remote_tlbs(vcpu->kvm);
+ }
+
+ __clear_sp_write_flooding_count(sp);
+diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
+index 01fee5f67ac37..beb3ce8d94eb3 100644
+--- a/arch/x86/kvm/mmu/paging_tmpl.h
++++ b/arch/x86/kvm/mmu/paging_tmpl.h
+@@ -144,42 +144,6 @@ static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
+ FNAME(is_bad_mt_xwr)(&mmu->guest_rsvd_check, gpte);
+ }
+
+-static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+- pt_element_t __user *ptep_user, unsigned index,
+- pt_element_t orig_pte, pt_element_t new_pte)
+-{
+- signed char r;
+-
+- if (!user_access_begin(ptep_user, sizeof(pt_element_t)))
+- return -EFAULT;
+-
+-#ifdef CMPXCHG
+- asm volatile("1:" LOCK_PREFIX CMPXCHG " %[new], %[ptr]\n"
+- "setnz %b[r]\n"
+- "2:"
+- _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG, %k[r])
+- : [ptr] "+m" (*ptep_user),
+- [old] "+a" (orig_pte),
+- [r] "=q" (r)
+- : [new] "r" (new_pte)
+- : "memory");
+-#else
+- asm volatile("1:" LOCK_PREFIX "cmpxchg8b %[ptr]\n"
+- "setnz %b[r]\n"
+- "2:"
+- _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG, %k[r])
+- : [ptr] "+m" (*ptep_user),
+- [old] "+A" (orig_pte),
+- [r] "=q" (r)
+- : [new_lo] "b" ((u32)new_pte),
+- [new_hi] "c" ((u32)(new_pte >> 32))
+- : "memory");
+-#endif
+-
+- user_access_end();
+- return r;
+-}
+-
+ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
+ struct kvm_mmu_page *sp, u64 *spte,
+ u64 gpte)
+@@ -278,7 +242,7 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu,
+ if (unlikely(!walker->pte_writable[level - 1]))
+ continue;
+
+- ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte);
++ ret = __try_cmpxchg_user(ptep_user, &orig_pte, pte, fault);
+ if (ret)
+ return ret;
+
+diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
+index 96bab464967f2..1a9b60cb6bcb8 100644
+--- a/arch/x86/kvm/svm/nested.c
++++ b/arch/x86/kvm/svm/nested.c
+@@ -819,9 +819,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
+ struct kvm_host_map map;
+ int rc;
+
+- /* Triple faults in L2 should never escape. */
+- WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu));
+-
+ rc = kvm_vcpu_map(vcpu, gpa_to_gfn(svm->nested.vmcb12_gpa), &map);
+ if (rc) {
+ if (rc == -EINVAL)
+diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
+index 7c392873626fd..4b7d490c0b639 100644
+--- a/arch/x86/kvm/svm/sev.c
++++ b/arch/x86/kvm/svm/sev.c
+@@ -688,7 +688,7 @@ static int sev_launch_measure(struct kvm *kvm, struct kvm_sev_cmd *argp)
+ if (params.len > SEV_FW_BLOB_MAX_SIZE)
+ return -EINVAL;
+
+- blob = kmalloc(params.len, GFP_KERNEL_ACCOUNT);
++ blob = kzalloc(params.len, GFP_KERNEL_ACCOUNT);
+ if (!blob)
+ return -ENOMEM;
+
+@@ -808,7 +808,7 @@ static int __sev_dbg_decrypt_user(struct kvm *kvm, unsigned long paddr,
+ if (!IS_ALIGNED(dst_paddr, 16) ||
+ !IS_ALIGNED(paddr, 16) ||
+ !IS_ALIGNED(size, 16)) {
+- tpage = (void *)alloc_page(GFP_KERNEL);
++ tpage = (void *)alloc_page(GFP_KERNEL | __GFP_ZERO);
+ if (!tpage)
+ return -ENOMEM;
+
+@@ -1094,7 +1094,7 @@ static int sev_get_attestation_report(struct kvm *kvm, struct kvm_sev_cmd *argp)
+ if (params.len > SEV_FW_BLOB_MAX_SIZE)
+ return -EINVAL;
+
+- blob = kmalloc(params.len, GFP_KERNEL_ACCOUNT);
++ blob = kzalloc(params.len, GFP_KERNEL_ACCOUNT);
+ if (!blob)
+ return -ENOMEM;
+
+@@ -1176,7 +1176,7 @@ static int sev_send_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
+ return -EINVAL;
+
+ /* allocate the memory to hold the session data blob */
+- session_data = kmalloc(params.session_len, GFP_KERNEL_ACCOUNT);
++ session_data = kzalloc(params.session_len, GFP_KERNEL_ACCOUNT);
+ if (!session_data)
+ return -ENOMEM;
+
+@@ -1300,11 +1300,11 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
+
+ /* allocate memory for header and transport buffer */
+ ret = -ENOMEM;
+- hdr = kmalloc(params.hdr_len, GFP_KERNEL_ACCOUNT);
++ hdr = kzalloc(params.hdr_len, GFP_KERNEL_ACCOUNT);
+ if (!hdr)
+ goto e_unpin;
+
+- trans_data = kmalloc(params.trans_len, GFP_KERNEL_ACCOUNT);
++ trans_data = kzalloc(params.trans_len, GFP_KERNEL_ACCOUNT);
+ if (!trans_data)
+ goto e_free_hdr;
+
+diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
+index 856c875638833..880d0b0c9315b 100644
+--- a/arch/x86/kvm/vmx/nested.c
++++ b/arch/x86/kvm/vmx/nested.c
+@@ -4518,9 +4518,6 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
+ /* trying to cancel vmlaunch/vmresume is a bug */
+ WARN_ON_ONCE(vmx->nested.nested_run_pending);
+
+- /* Similarly, triple faults in L2 should never escape. */
+- WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu));
+-
+ if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
+ /*
+ * KVM_REQ_GET_NESTED_STATE_PAGES is also used to map
+diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
+index 610355b9ccceb..982df9c000d31 100644
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -7856,7 +7856,7 @@ static unsigned int vmx_handle_intel_pt_intr(void)
+ struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
+
+ /* '0' on failure so that the !PT case can use a RET0 static call. */
+- if (!kvm_arch_pmi_in_guest(vcpu))
++ if (!vcpu || !kvm_handling_nmi_from_guest(vcpu))
+ return 0;
+
+ kvm_make_request(KVM_REQ_PMI, vcpu);
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 4790f0d7d40b8..39c571224ac28 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -7229,15 +7229,8 @@ static int emulator_write_emulated(struct x86_emulate_ctxt *ctxt,
+ exception, &write_emultor);
+ }
+
+-#define CMPXCHG_TYPE(t, ptr, old, new) \
+- (cmpxchg((t *)(ptr), *(t *)(old), *(t *)(new)) == *(t *)(old))
+-
+-#ifdef CONFIG_X86_64
+-# define CMPXCHG64(ptr, old, new) CMPXCHG_TYPE(u64, ptr, old, new)
+-#else
+-# define CMPXCHG64(ptr, old, new) \
+- (cmpxchg64((u64 *)(ptr), *(u64 *)(old), *(u64 *)(new)) == *(u64 *)(old))
+-#endif
++#define emulator_try_cmpxchg_user(t, ptr, old, new) \
++ (__try_cmpxchg_user((t __user *)(ptr), (t *)(old), *(t *)(new), efault ## t))
+
+ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
+ unsigned long addr,
+@@ -7246,12 +7239,11 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
+ unsigned int bytes,
+ struct x86_exception *exception)
+ {
+- struct kvm_host_map map;
+ struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+ u64 page_line_mask;
++ unsigned long hva;
+ gpa_t gpa;
+- char *kaddr;
+- bool exchanged;
++ int r;
+
+ /* guests cmpxchg8b have to be emulated atomically */
+ if (bytes > 8 || (bytes & (bytes - 1)))
+@@ -7275,31 +7267,32 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
+ if (((gpa + bytes - 1) & page_line_mask) != (gpa & page_line_mask))
+ goto emul_write;
+
+- if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
++ hva = kvm_vcpu_gfn_to_hva(vcpu, gpa_to_gfn(gpa));
++ if (kvm_is_error_hva(hva))
+ goto emul_write;
+
+- kaddr = map.hva + offset_in_page(gpa);
++ hva += offset_in_page(gpa);
+
+ switch (bytes) {
+ case 1:
+- exchanged = CMPXCHG_TYPE(u8, kaddr, old, new);
++ r = emulator_try_cmpxchg_user(u8, hva, old, new);
+ break;
+ case 2:
+- exchanged = CMPXCHG_TYPE(u16, kaddr, old, new);
++ r = emulator_try_cmpxchg_user(u16, hva, old, new);
+ break;
+ case 4:
+- exchanged = CMPXCHG_TYPE(u32, kaddr, old, new);
++ r = emulator_try_cmpxchg_user(u32, hva, old, new);
+ break;
+ case 8:
+- exchanged = CMPXCHG64(kaddr, old, new);
++ r = emulator_try_cmpxchg_user(u64, hva, old, new);
+ break;
+ default:
+ BUG();
+ }
+
+- kvm_vcpu_unmap(vcpu, &map, true);
+-
+- if (!exchanged)
++ if (r < 0)
++ goto emul_write;
++ if (r)
+ return X86EMUL_CMPXCHG_FAILED;
+
+ kvm_page_track_write(vcpu, gpa, new, bytes);
+@@ -8251,7 +8244,7 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
+ }
+ EXPORT_SYMBOL_GPL(kvm_skip_emulated_instruction);
+
+-static bool kvm_vcpu_check_breakpoint(struct kvm_vcpu *vcpu, int *r)
++static bool kvm_vcpu_check_code_breakpoint(struct kvm_vcpu *vcpu, int *r)
+ {
+ if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) &&
+ (vcpu->arch.guest_debug_dr7 & DR7_BP_EN_MASK)) {
+@@ -8320,25 +8313,23 @@ static bool is_vmware_backdoor_opcode(struct x86_emulate_ctxt *ctxt)
+ }
+
+ /*
+- * Decode to be emulated instruction. Return EMULATION_OK if success.
++ * Decode an instruction for emulation. The caller is responsible for handling
++ * code breakpoints. Note, manually detecting code breakpoints is unnecessary
++ * (and wrong) when emulating on an intercepted fault-like exception[*], as
++ * code breakpoints have higher priority and thus have already been done by
++ * hardware.
++ *
++ * [*] Except #MC, which is higher priority, but KVM should never emulate in
++ * response to a machine check.
+ */
+ int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
+ void *insn, int insn_len)
+ {
+- int r = EMULATION_OK;
+ struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
++ int r;
+
+ init_emulate_ctxt(vcpu);
+
+- /*
+- * We will reenter on the same instruction since we do not set
+- * complete_userspace_io. This does not handle watchpoints yet,
+- * those would be handled in the emulate_ops.
+- */
+- if (!(emulation_type & EMULTYPE_SKIP) &&
+- kvm_vcpu_check_breakpoint(vcpu, &r))
+- return r;
+-
+ r = x86_decode_insn(ctxt, insn, insn_len, emulation_type);
+
+ trace_kvm_emulate_insn_start(vcpu);
+@@ -8371,6 +8362,15 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+ if (!(emulation_type & EMULTYPE_NO_DECODE)) {
+ kvm_clear_exception_queue(vcpu);
+
++ /*
++ * Return immediately if RIP hits a code breakpoint, such #DBs
++ * are fault-like and are higher priority than any faults on
++ * the code fetch itself.
++ */
++ if (!(emulation_type & EMULTYPE_SKIP) &&
++ kvm_vcpu_check_code_breakpoint(vcpu, &r))
++ return r;
++
+ r = x86_decode_emulated_instruction(vcpu, emulation_type,
+ insn, insn_len);
+ if (r != EMULATION_OK) {
+@@ -11747,20 +11747,15 @@ static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
+ vcpu_put(vcpu);
+ }
+
+-static void kvm_free_vcpus(struct kvm *kvm)
++static void kvm_unload_vcpu_mmus(struct kvm *kvm)
+ {
+ unsigned long i;
+ struct kvm_vcpu *vcpu;
+
+- /*
+- * Unpin any mmu pages first.
+- */
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ kvm_clear_async_pf_completion_queue(vcpu);
+ kvm_unload_vcpu_mmu(vcpu);
+ }
+-
+- kvm_destroy_vcpus(kvm);
+ }
+
+ void kvm_arch_sync_events(struct kvm *kvm)
+@@ -11866,11 +11861,12 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
+ __x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, 0, 0);
+ mutex_unlock(&kvm->slots_lock);
+ }
++ kvm_unload_vcpu_mmus(kvm);
+ static_call_cond(kvm_x86_vm_destroy)(kvm);
+ kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
+ kvm_pic_destroy(kvm);
+ kvm_ioapic_destroy(kvm);
+- kvm_free_vcpus(kvm);
++ kvm_destroy_vcpus(kvm);
+ kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
+ kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
+ kvm_mmu_uninit_vm(kvm);
+diff --git a/crypto/ecrdsa.c b/crypto/ecrdsa.c
+index b32ffcaad9adf..f3c6b5e15e75b 100644
+--- a/crypto/ecrdsa.c
++++ b/crypto/ecrdsa.c
+@@ -113,15 +113,15 @@ static int ecrdsa_verify(struct akcipher_request *req)
+
+ /* Step 1: verify that 0 < r < q, 0 < s < q */
+ if (vli_is_zero(r, ndigits) ||
+- vli_cmp(r, ctx->curve->n, ndigits) == 1 ||
++ vli_cmp(r, ctx->curve->n, ndigits) >= 0 ||
+ vli_is_zero(s, ndigits) ||
+- vli_cmp(s, ctx->curve->n, ndigits) == 1)
++ vli_cmp(s, ctx->curve->n, ndigits) >= 0)
+ return -EKEYREJECTED;
+
+ /* Step 2: calculate hash (h) of the message (passed as input) */
+ /* Step 3: calculate e = h \mod q */
+ vli_from_le64(e, digest, ndigits);
+- if (vli_cmp(e, ctx->curve->n, ndigits) == 1)
++ if (vli_cmp(e, ctx->curve->n, ndigits) >= 0)
+ vli_sub(e, e, ctx->curve->n, ndigits);
+ if (vli_is_zero(e, ndigits))
+ e[0] = 1;
+@@ -137,7 +137,7 @@ static int ecrdsa_verify(struct akcipher_request *req)
+ /* Step 6: calculate point C = z_1P + z_2Q, and R = x_c \mod q */
+ ecc_point_mult_shamir(&cc, z1, &ctx->curve->g, z2, &ctx->pub_key,
+ ctx->curve);
+- if (vli_cmp(cc.x, ctx->curve->n, ndigits) == 1)
++ if (vli_cmp(cc.x, ctx->curve->n, ndigits) >= 0)
+ vli_sub(cc.x, cc.x, ctx->curve->n, ndigits);
+
+ /* Step 7: if R == r signature is valid */
+diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
+index f6e91fb432a3b..eab34e24d9446 100644
+--- a/drivers/bluetooth/hci_qca.c
++++ b/drivers/bluetooth/hci_qca.c
+@@ -696,9 +696,9 @@ static int qca_close(struct hci_uart *hu)
+ skb_queue_purge(&qca->tx_wait_q);
+ skb_queue_purge(&qca->txq);
+ skb_queue_purge(&qca->rx_memdump_q);
+- del_timer(&qca->tx_idle_timer);
+- del_timer(&qca->wake_retrans_timer);
+ destroy_workqueue(qca->workqueue);
++ del_timer_sync(&qca->tx_idle_timer);
++ del_timer_sync(&qca->wake_retrans_timer);
+ qca->hu = NULL;
+
+ kfree_skb(qca->rx_skb);
+diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
+index 4704fa553098b..04a3e23a4afc7 100644
+--- a/drivers/char/tpm/tpm2-cmd.c
++++ b/drivers/char/tpm/tpm2-cmd.c
+@@ -400,7 +400,16 @@ ssize_t tpm2_get_tpm_pt(struct tpm_chip *chip, u32 property_id, u32 *value,
+ if (!rc) {
+ out = (struct tpm2_get_cap_out *)
+ &buf.data[TPM_HEADER_SIZE];
+- *value = be32_to_cpu(out->value);
++ /*
++ * To prevent failing boot up of some systems, Infineon TPM2.0
++ * returns SUCCESS on TPM2_Startup in field upgrade mode. Also
++ * the TPM2_Getcapability command returns a zero length list
++ * in field upgrade mode.
++ */
++ if (be32_to_cpu(out->property_cnt) > 0)
++ *value = be32_to_cpu(out->value);
++ else
++ rc = -ENODATA;
+ }
+ tpm_buf_destroy(&buf);
+ return rc;
+diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
+index 3af4c07a9342f..d3989b257f422 100644
+--- a/drivers/char/tpm/tpm_ibmvtpm.c
++++ b/drivers/char/tpm/tpm_ibmvtpm.c
+@@ -681,6 +681,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
+ if (!wait_event_timeout(ibmvtpm->crq_queue.wq,
+ ibmvtpm->rtce_buf != NULL,
+ HZ)) {
++ rc = -ENODEV;
+ dev_err(dev, "CRQ response timed out\n");
+ goto init_irq_cleanup;
+ }
+diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
+index ca0361b2dbb07..f87aa2169e5f5 100644
+--- a/drivers/crypto/caam/ctrl.c
++++ b/drivers/crypto/caam/ctrl.c
+@@ -609,6 +609,13 @@ static bool check_version(struct fsl_mc_version *mc_version, u32 major,
+ }
+ #endif
+
++static bool needs_entropy_delay_adjustment(void)
++{
++ if (of_machine_is_compatible("fsl,imx6sx"))
++ return true;
++ return false;
++}
++
+ /* Probe routine for CAAM top (controller) level */
+ static int caam_probe(struct platform_device *pdev)
+ {
+@@ -855,6 +862,8 @@ static int caam_probe(struct platform_device *pdev)
+ * Also, if a handle was instantiated, do not change
+ * the TRNG parameters.
+ */
++ if (needs_entropy_delay_adjustment())
++ ent_delay = 12000;
+ if (!(ctrlpriv->rng4_sh_init || inst_handles)) {
+ dev_info(dev,
+ "Entropy delay = %u\n",
+@@ -871,6 +880,15 @@ static int caam_probe(struct platform_device *pdev)
+ */
+ ret = instantiate_rng(dev, inst_handles,
+ gen_sk);
++ /*
++ * Entropy delay is determined via TRNG characterization.
++ * TRNG characterization is run across different voltages
++ * and temperatures.
++ * If worst case value for ent_dly is identified,
++ * the loop can be skipped for that platform.
++ */
++ if (needs_entropy_delay_adjustment())
++ break;
+ if (ret == -EAGAIN)
+ /*
+ * if here, the loop will rerun,
+diff --git a/drivers/crypto/qat/qat_common/adf_accel_devices.h b/drivers/crypto/qat/qat_common/adf_accel_devices.h
+index a03c6cf723312..dfa7ee41c5e9c 100644
+--- a/drivers/crypto/qat/qat_common/adf_accel_devices.h
++++ b/drivers/crypto/qat/qat_common/adf_accel_devices.h
+@@ -152,9 +152,9 @@ struct adf_pfvf_ops {
+ int (*enable_comms)(struct adf_accel_dev *accel_dev);
+ u32 (*get_pf2vf_offset)(u32 i);
+ u32 (*get_vf2pf_offset)(u32 i);
+- u32 (*get_vf2pf_sources)(void __iomem *pmisc_addr);
+ void (*enable_vf2pf_interrupts)(void __iomem *pmisc_addr, u32 vf_mask);
+ void (*disable_vf2pf_interrupts)(void __iomem *pmisc_addr, u32 vf_mask);
++ u32 (*disable_pending_vf2pf_interrupts)(void __iomem *pmisc_addr);
+ int (*send_msg)(struct adf_accel_dev *accel_dev, struct pfvf_message msg,
+ u32 pfvf_offset, struct mutex *csr_lock);
+ struct pfvf_message (*recv_msg)(struct adf_accel_dev *accel_dev,
+diff --git a/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c b/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c
+index 1a9072aac2ca9..def4cc8e1039a 100644
+--- a/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c
++++ b/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c
+@@ -13,6 +13,7 @@
+ #include "adf_pfvf_utils.h"
+
+ /* VF2PF interrupts */
++#define ADF_GEN2_VF_MSK 0xFFFF
+ #define ADF_GEN2_ERR_REG_VF2PF(vf_src) (((vf_src) & 0x01FFFE00) >> 9)
+ #define ADF_GEN2_ERR_MSK_VF2PF(vf_mask) (((vf_mask) & 0xFFFF) << 9)
+
+@@ -50,23 +51,6 @@ static u32 adf_gen2_vf_get_pfvf_offset(u32 i)
+ return ADF_GEN2_VF_PF2VF_OFFSET;
+ }
+
+-static u32 adf_gen2_get_vf2pf_sources(void __iomem *pmisc_addr)
+-{
+- u32 errsou3, errmsk3, vf_int_mask;
+-
+- /* Get the interrupt sources triggered by VFs */
+- errsou3 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRSOU3);
+- vf_int_mask = ADF_GEN2_ERR_REG_VF2PF(errsou3);
+-
+- /* To avoid adding duplicate entries to work queue, clear
+- * vf_int_mask_sets bits that are already masked in ERRMSK register.
+- */
+- errmsk3 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRMSK3);
+- vf_int_mask &= ~ADF_GEN2_ERR_REG_VF2PF(errmsk3);
+-
+- return vf_int_mask;
+-}
+-
+ static void adf_gen2_enable_vf2pf_interrupts(void __iomem *pmisc_addr,
+ u32 vf_mask)
+ {
+@@ -89,6 +73,44 @@ static void adf_gen2_disable_vf2pf_interrupts(void __iomem *pmisc_addr,
+ }
+ }
+
++static u32 adf_gen2_disable_pending_vf2pf_interrupts(void __iomem *pmisc_addr)
++{
++ u32 sources, disabled, pending;
++ u32 errsou3, errmsk3;
++
++ /* Get the interrupt sources triggered by VFs */
++ errsou3 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRSOU3);
++ sources = ADF_GEN2_ERR_REG_VF2PF(errsou3);
++
++ if (!sources)
++ return 0;
++
++ /* Get the already disabled interrupts */
++ errmsk3 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRMSK3);
++ disabled = ADF_GEN2_ERR_REG_VF2PF(errmsk3);
++
++ pending = sources & ~disabled;
++ if (!pending)
++ return 0;
++
++ /* Due to HW limitations, when disabling the interrupts, we can't
++ * just disable the requested sources, as this would lead to missed
++ * interrupts if ERRSOU3 changes just before writing to ERRMSK3.
++ * To work around it, disable all and re-enable only the sources that
++ * are not in vf_mask and were not already disabled. Re-enabling will
++ * trigger a new interrupt for the sources that have changed in the
++ * meantime, if any.
++ */
++ errmsk3 |= ADF_GEN2_ERR_MSK_VF2PF(ADF_GEN2_VF_MSK);
++ ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3);
++
++ errmsk3 &= ADF_GEN2_ERR_MSK_VF2PF(sources | disabled);
++ ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3);
++
++ /* Return the sources of the (new) interrupt(s) */
++ return pending;
++}
++
+ static u32 gen2_csr_get_int_bit(enum gen2_csr_pos offset)
+ {
+ return ADF_PFVF_INT << offset;
+@@ -362,9 +384,9 @@ void adf_gen2_init_pf_pfvf_ops(struct adf_pfvf_ops *pfvf_ops)
+ pfvf_ops->enable_comms = adf_enable_pf2vf_comms;
+ pfvf_ops->get_pf2vf_offset = adf_gen2_pf_get_pfvf_offset;
+ pfvf_ops->get_vf2pf_offset = adf_gen2_pf_get_pfvf_offset;
+- pfvf_ops->get_vf2pf_sources = adf_gen2_get_vf2pf_sources;
+ pfvf_ops->enable_vf2pf_interrupts = adf_gen2_enable_vf2pf_interrupts;
+ pfvf_ops->disable_vf2pf_interrupts = adf_gen2_disable_vf2pf_interrupts;
++ pfvf_ops->disable_pending_vf2pf_interrupts = adf_gen2_disable_pending_vf2pf_interrupts;
+ pfvf_ops->send_msg = adf_gen2_pf2vf_send;
+ pfvf_ops->recv_msg = adf_gen2_vf2pf_recv;
+ }
+diff --git a/drivers/crypto/qat/qat_common/adf_gen4_pfvf.c b/drivers/crypto/qat/qat_common/adf_gen4_pfvf.c
+index d80d493a77568..40fdab857f959 100644
+--- a/drivers/crypto/qat/qat_common/adf_gen4_pfvf.c
++++ b/drivers/crypto/qat/qat_common/adf_gen4_pfvf.c
+@@ -15,6 +15,7 @@
+ /* VF2PF interrupt source registers */
+ #define ADF_4XXX_VM2PF_SOU 0x41A180
+ #define ADF_4XXX_VM2PF_MSK 0x41A1C0
++#define ADF_GEN4_VF_MSK 0xFFFF
+
+ #define ADF_PFVF_GEN4_MSGTYPE_SHIFT 2
+ #define ADF_PFVF_GEN4_MSGTYPE_MASK 0x3F
+@@ -36,16 +37,6 @@ static u32 adf_gen4_pf_get_vf2pf_offset(u32 i)
+ return ADF_4XXX_VM2PF_OFFSET(i);
+ }
+
+-static u32 adf_gen4_get_vf2pf_sources(void __iomem *pmisc_addr)
+-{
+- u32 sou, mask;
+-
+- sou = ADF_CSR_RD(pmisc_addr, ADF_4XXX_VM2PF_SOU);
+- mask = ADF_CSR_RD(pmisc_addr, ADF_4XXX_VM2PF_MSK);
+-
+- return sou & ~mask;
+-}
+-
+ static void adf_gen4_enable_vf2pf_interrupts(void __iomem *pmisc_addr,
+ u32 vf_mask)
+ {
+@@ -64,6 +55,37 @@ static void adf_gen4_disable_vf2pf_interrupts(void __iomem *pmisc_addr,
+ ADF_CSR_WR(pmisc_addr, ADF_4XXX_VM2PF_MSK, val);
+ }
+
++static u32 adf_gen4_disable_pending_vf2pf_interrupts(void __iomem *pmisc_addr)
++{
++ u32 sources, disabled, pending;
++
++ /* Get the interrupt sources triggered by VFs */
++ sources = ADF_CSR_RD(pmisc_addr, ADF_4XXX_VM2PF_SOU);
++ if (!sources)
++ return 0;
++
++ /* Get the already disabled interrupts */
++ disabled = ADF_CSR_RD(pmisc_addr, ADF_4XXX_VM2PF_MSK);
++
++ pending = sources & ~disabled;
++ if (!pending)
++ return 0;
++
++ /* Due to HW limitations, when disabling the interrupts, we can't
++ * just disable the requested sources, as this would lead to missed
++ * interrupts if VM2PF_SOU changes just before writing to VM2PF_MSK.
++ * To work around it, disable all and re-enable only the sources that
++ * are not in vf_mask and were not already disabled. Re-enabling will
++ * trigger a new interrupt for the sources that have changed in the
++ * meantime, if any.
++ */
++ ADF_CSR_WR(pmisc_addr, ADF_4XXX_VM2PF_MSK, ADF_GEN4_VF_MSK);
++ ADF_CSR_WR(pmisc_addr, ADF_4XXX_VM2PF_MSK, disabled | sources);
++
++ /* Return the sources of the (new) interrupt(s) */
++ return pending;
++}
++
+ static int adf_gen4_pfvf_send(struct adf_accel_dev *accel_dev,
+ struct pfvf_message msg, u32 pfvf_offset,
+ struct mutex *csr_lock)
+@@ -115,9 +137,9 @@ void adf_gen4_init_pf_pfvf_ops(struct adf_pfvf_ops *pfvf_ops)
+ pfvf_ops->enable_comms = adf_enable_pf2vf_comms;
+ pfvf_ops->get_pf2vf_offset = adf_gen4_pf_get_pf2vf_offset;
+ pfvf_ops->get_vf2pf_offset = adf_gen4_pf_get_vf2pf_offset;
+- pfvf_ops->get_vf2pf_sources = adf_gen4_get_vf2pf_sources;
+ pfvf_ops->enable_vf2pf_interrupts = adf_gen4_enable_vf2pf_interrupts;
+ pfvf_ops->disable_vf2pf_interrupts = adf_gen4_disable_vf2pf_interrupts;
++ pfvf_ops->disable_pending_vf2pf_interrupts = adf_gen4_disable_pending_vf2pf_interrupts;
+ pfvf_ops->send_msg = adf_gen4_pfvf_send;
+ pfvf_ops->recv_msg = adf_gen4_pfvf_recv;
+ }
+diff --git a/drivers/crypto/qat/qat_common/adf_isr.c b/drivers/crypto/qat/qat_common/adf_isr.c
+index a35149f8bf1ee..23f7fff32c642 100644
+--- a/drivers/crypto/qat/qat_common/adf_isr.c
++++ b/drivers/crypto/qat/qat_common/adf_isr.c
+@@ -76,32 +76,29 @@ void adf_disable_vf2pf_interrupts(struct adf_accel_dev *accel_dev, u32 vf_mask)
+ spin_unlock_irqrestore(&accel_dev->pf.vf2pf_ints_lock, flags);
+ }
+
+-static void adf_disable_vf2pf_interrupts_irq(struct adf_accel_dev *accel_dev,
+- u32 vf_mask)
++static u32 adf_disable_pending_vf2pf_interrupts(struct adf_accel_dev *accel_dev)
+ {
+ void __iomem *pmisc_addr = adf_get_pmisc_base(accel_dev);
++ u32 pending;
+
+ spin_lock(&accel_dev->pf.vf2pf_ints_lock);
+- GET_PFVF_OPS(accel_dev)->disable_vf2pf_interrupts(pmisc_addr, vf_mask);
++ pending = GET_PFVF_OPS(accel_dev)->disable_pending_vf2pf_interrupts(pmisc_addr);
+ spin_unlock(&accel_dev->pf.vf2pf_ints_lock);
++
++ return pending;
+ }
+
+ static bool adf_handle_vf2pf_int(struct adf_accel_dev *accel_dev)
+ {
+- void __iomem *pmisc_addr = adf_get_pmisc_base(accel_dev);
+ bool irq_handled = false;
+ unsigned long vf_mask;
+
+- /* Get the interrupt sources triggered by VFs */
+- vf_mask = GET_PFVF_OPS(accel_dev)->get_vf2pf_sources(pmisc_addr);
+-
++ /* Get the interrupt sources triggered by VFs, except for those already disabled */
++ vf_mask = adf_disable_pending_vf2pf_interrupts(accel_dev);
+ if (vf_mask) {
+ struct adf_accel_vf_info *vf_info;
+ int i;
+
+- /* Disable VF2PF interrupts for VFs with pending ints */
+- adf_disable_vf2pf_interrupts_irq(accel_dev, vf_mask);
+-
+ /*
+ * Handle VF2PF interrupt unless the VF is malicious and
+ * is attempting to flood the host OS with VF2PF interrupts.
+diff --git a/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c b/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c
+index 09599fe4d2f3f..1e7bed8b011fe 100644
+--- a/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c
++++ b/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c
+@@ -7,6 +7,8 @@
+ #include "adf_dh895xcc_hw_data.h"
+ #include "icp_qat_hw.h"
+
++#define ADF_DH895XCC_VF_MSK 0xFFFFFFFF
++
+ /* Worker thread to service arbiter mappings */
+ static const u32 thrd_to_arb_map[ADF_DH895XCC_MAX_ACCELENGINES] = {
+ 0x12222AAA, 0x11666666, 0x12222AAA, 0x11666666,
+@@ -114,29 +116,6 @@ static void adf_enable_ints(struct adf_accel_dev *accel_dev)
+ ADF_DH895XCC_SMIA1_MASK);
+ }
+
+-static u32 get_vf2pf_sources(void __iomem *pmisc_bar)
+-{
+- u32 errsou3, errmsk3, errsou5, errmsk5, vf_int_mask;
+-
+- /* Get the interrupt sources triggered by VFs */
+- errsou3 = ADF_CSR_RD(pmisc_bar, ADF_GEN2_ERRSOU3);
+- vf_int_mask = ADF_DH895XCC_ERR_REG_VF2PF_L(errsou3);
+-
+- /* To avoid adding duplicate entries to work queue, clear
+- * vf_int_mask_sets bits that are already masked in ERRMSK register.
+- */
+- errmsk3 = ADF_CSR_RD(pmisc_bar, ADF_GEN2_ERRMSK3);
+- vf_int_mask &= ~ADF_DH895XCC_ERR_REG_VF2PF_L(errmsk3);
+-
+- /* Do the same for ERRSOU5 */
+- errsou5 = ADF_CSR_RD(pmisc_bar, ADF_GEN2_ERRSOU5);
+- errmsk5 = ADF_CSR_RD(pmisc_bar, ADF_GEN2_ERRMSK5);
+- vf_int_mask |= ADF_DH895XCC_ERR_REG_VF2PF_U(errsou5);
+- vf_int_mask &= ~ADF_DH895XCC_ERR_REG_VF2PF_U(errmsk5);
+-
+- return vf_int_mask;
+-}
+-
+ static void enable_vf2pf_interrupts(void __iomem *pmisc_addr, u32 vf_mask)
+ {
+ /* Enable VF2PF Messaging Ints - VFs 0 through 15 per vf_mask[15:0] */
+@@ -150,7 +129,6 @@ static void enable_vf2pf_interrupts(void __iomem *pmisc_addr, u32 vf_mask)
+ if (vf_mask >> 16) {
+ u32 val = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRMSK5)
+ & ~ADF_DH895XCC_ERR_MSK_VF2PF_U(vf_mask);
+-
+ ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK5, val);
+ }
+ }
+@@ -173,6 +151,54 @@ static void disable_vf2pf_interrupts(void __iomem *pmisc_addr, u32 vf_mask)
+ }
+ }
+
++static u32 disable_pending_vf2pf_interrupts(void __iomem *pmisc_addr)
++{
++ u32 sources, pending, disabled;
++ u32 errsou3, errmsk3;
++ u32 errsou5, errmsk5;
++
++ /* Get the interrupt sources triggered by VFs */
++ errsou3 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRSOU3);
++ errsou5 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRSOU5);
++ sources = ADF_DH895XCC_ERR_REG_VF2PF_L(errsou3)
++ | ADF_DH895XCC_ERR_REG_VF2PF_U(errsou5);
++
++ if (!sources)
++ return 0;
++
++ /* Get the already disabled interrupts */
++ errmsk3 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRMSK3);
++ errmsk5 = ADF_CSR_RD(pmisc_addr, ADF_GEN2_ERRMSK5);
++ disabled = ADF_DH895XCC_ERR_REG_VF2PF_L(errmsk3)
++ | ADF_DH895XCC_ERR_REG_VF2PF_U(errmsk5);
++
++ pending = sources & ~disabled;
++ if (!pending)
++ return 0;
++
++ /* Due to HW limitations, when disabling the interrupts, we can't
++ * just disable the requested sources, as this would lead to missed
++ * interrupts if sources changes just before writing to ERRMSK3 and
++ * ERRMSK5.
++ * To work around it, disable all and re-enable only the sources that
++ * are not in vf_mask and were not already disabled. Re-enabling will
++ * trigger a new interrupt for the sources that have changed in the
++ * meantime, if any.
++ */
++ errmsk3 |= ADF_DH895XCC_ERR_MSK_VF2PF_L(ADF_DH895XCC_VF_MSK);
++ errmsk5 |= ADF_DH895XCC_ERR_MSK_VF2PF_U(ADF_DH895XCC_VF_MSK);
++ ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3);
++ ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK5, errmsk5);
++
++ errmsk3 &= ADF_DH895XCC_ERR_MSK_VF2PF_L(sources | disabled);
++ errmsk5 &= ADF_DH895XCC_ERR_MSK_VF2PF_U(sources | disabled);
++ ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3);
++ ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK5, errmsk5);
++
++ /* Return the sources of the (new) interrupt(s) */
++ return pending;
++}
++
+ static void configure_iov_threads(struct adf_accel_dev *accel_dev, bool enable)
+ {
+ adf_gen2_cfg_iov_thds(accel_dev, enable,
+@@ -220,9 +246,9 @@ void adf_init_hw_data_dh895xcc(struct adf_hw_device_data *hw_data)
+ hw_data->disable_iov = adf_disable_sriov;
+
+ adf_gen2_init_pf_pfvf_ops(&hw_data->pfvf_ops);
+- hw_data->pfvf_ops.get_vf2pf_sources = get_vf2pf_sources;
+ hw_data->pfvf_ops.enable_vf2pf_interrupts = enable_vf2pf_interrupts;
+ hw_data->pfvf_ops.disable_vf2pf_interrupts = disable_vf2pf_interrupts;
++ hw_data->pfvf_ops.disable_pending_vf2pf_interrupts = disable_pending_vf2pf_interrupts;
+ adf_gen2_init_hw_csr_ops(&hw_data->csr_ops);
+ }
+
+diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
+index 9333f732cda8e..5167d63010b99 100644
+--- a/drivers/gpu/drm/i915/intel_pm.c
++++ b/drivers/gpu/drm/i915/intel_pm.c
+@@ -2859,7 +2859,7 @@ static void ilk_compute_wm_level(const struct drm_i915_private *dev_priv,
+ }
+
+ static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
+- u16 wm[8])
++ u16 wm[])
+ {
+ struct intel_uncore *uncore = &dev_priv->uncore;
+
+diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
+index 053853a891c50..c297c63f3ec5c 100644
+--- a/drivers/hid/hid-ids.h
++++ b/drivers/hid/hid-ids.h
+@@ -768,6 +768,7 @@
+ #define USB_DEVICE_ID_LENOVO_X1_COVER 0x6085
+ #define USB_DEVICE_ID_LENOVO_X1_TAB 0x60a3
+ #define USB_DEVICE_ID_LENOVO_X1_TAB3 0x60b5
++#define USB_DEVICE_ID_LENOVO_X12_TAB 0x60fe
+ #define USB_DEVICE_ID_LENOVO_OPTICAL_USB_MOUSE_600E 0x600e
+ #define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_608D 0x608d
+ #define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_6019 0x6019
+diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
+index 99eabfb4145b5..6bb3890b0f2c9 100644
+--- a/drivers/hid/hid-multitouch.c
++++ b/drivers/hid/hid-multitouch.c
+@@ -2034,6 +2034,12 @@ static const struct hid_device_id mt_devices[] = {
+ USB_VENDOR_ID_LENOVO,
+ USB_DEVICE_ID_LENOVO_X1_TAB3) },
+
++ /* Lenovo X12 TAB Gen 1 */
++ { .driver_data = MT_CLS_WIN_8_FORCE_MULTI_INPUT,
++ HID_DEVICE(BUS_USB, HID_GROUP_MULTITOUCH_WIN_8,
++ USB_VENDOR_ID_LENOVO,
++ USB_DEVICE_ID_LENOVO_X12_TAB) },
++
+ /* MosArt panels */
+ { .driver_data = MT_CLS_CONFIDENCE_MINUS_ONE,
+ MT_USB_DEVICE(USB_VENDOR_ID_ASUS,
+@@ -2178,6 +2184,9 @@ static const struct hid_device_id mt_devices[] = {
+ { .driver_data = MT_CLS_GOOGLE,
+ HID_DEVICE(HID_BUS_ANY, HID_GROUP_ANY, USB_VENDOR_ID_GOOGLE,
+ USB_DEVICE_ID_GOOGLE_TOUCH_ROSE) },
++ { .driver_data = MT_CLS_GOOGLE,
++ HID_DEVICE(BUS_USB, HID_GROUP_MULTITOUCH_WIN_8, USB_VENDOR_ID_GOOGLE,
++ USB_DEVICE_ID_GOOGLE_WHISKERS) },
+
+ /* Generic MT device */
+ { HID_DEVICE(HID_BUS_ANY, HID_GROUP_MULTITOUCH, HID_ANY_ID, HID_ANY_ID) },
+diff --git a/drivers/i2c/busses/i2c-ismt.c b/drivers/i2c/busses/i2c-ismt.c
+index c16157ee8c520..6078fa0c0d488 100644
+--- a/drivers/i2c/busses/i2c-ismt.c
++++ b/drivers/i2c/busses/i2c-ismt.c
+@@ -528,6 +528,9 @@ static int ismt_access(struct i2c_adapter *adap, u16 addr,
+
+ case I2C_SMBUS_BLOCK_PROC_CALL:
+ dev_dbg(dev, "I2C_SMBUS_BLOCK_PROC_CALL\n");
++ if (data->block[0] > I2C_SMBUS_BLOCK_MAX)
++ return -EINVAL;
++
+ dma_size = I2C_SMBUS_BLOCK_MAX;
+ desc->tgtaddr_rw = ISMT_DESC_ADDR_RW(addr, 1);
+ desc->wr_len_cmd = data->block[0] + 1;
+diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
+index fb80539865d7c..159c6806c19b8 100644
+--- a/drivers/md/dm-crypt.c
++++ b/drivers/md/dm-crypt.c
+@@ -3439,6 +3439,11 @@ static int crypt_map(struct dm_target *ti, struct bio *bio)
+ return DM_MAPIO_SUBMITTED;
+ }
+
++static char hex2asc(unsigned char c)
++{
++ return c + '0' + ((unsigned)(9 - c) >> 4 & 0x27);
++}
++
+ static void crypt_status(struct dm_target *ti, status_type_t type,
+ unsigned status_flags, char *result, unsigned maxlen)
+ {
+@@ -3457,9 +3462,12 @@ static void crypt_status(struct dm_target *ti, status_type_t type,
+ if (cc->key_size > 0) {
+ if (cc->key_string)
+ DMEMIT(":%u:%s", cc->key_size, cc->key_string);
+- else
+- for (i = 0; i < cc->key_size; i++)
+- DMEMIT("%02x", cc->key[i]);
++ else {
++ for (i = 0; i < cc->key_size; i++) {
++ DMEMIT("%c%c", hex2asc(cc->key[i] >> 4),
++ hex2asc(cc->key[i] & 0xf));
++ }
++ }
+ } else
+ DMEMIT("-");
+
+diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
+index 36ae30b73a6e0..3d5a0ce123c90 100644
+--- a/drivers/md/dm-integrity.c
++++ b/drivers/md/dm-integrity.c
+@@ -4494,8 +4494,6 @@ try_smaller_buffer:
+ }
+
+ if (should_write_sb) {
+- int r;
+-
+ init_journal(ic, 0, ic->journal_sections, 0);
+ r = dm_integrity_failed(ic);
+ if (unlikely(r)) {
+diff --git a/drivers/md/dm-stats.c b/drivers/md/dm-stats.c
+index 0e039a8c0bf2e..a3f2050b9c9b4 100644
+--- a/drivers/md/dm-stats.c
++++ b/drivers/md/dm-stats.c
+@@ -225,6 +225,7 @@ void dm_stats_cleanup(struct dm_stats *stats)
+ atomic_read(&shared->in_flight[READ]),
+ atomic_read(&shared->in_flight[WRITE]));
+ }
++ cond_resched();
+ }
+ dm_stat_free(&s->rcu_head);
+ }
+@@ -330,6 +331,7 @@ static int dm_stats_create(struct dm_stats *stats, sector_t start, sector_t end,
+ for (ni = 0; ni < n_entries; ni++) {
+ atomic_set(&s->stat_shared[ni].in_flight[READ], 0);
+ atomic_set(&s->stat_shared[ni].in_flight[WRITE], 0);
++ cond_resched();
+ }
+
+ if (s->n_histogram_entries) {
+@@ -342,6 +344,7 @@ static int dm_stats_create(struct dm_stats *stats, sector_t start, sector_t end,
+ for (ni = 0; ni < n_entries; ni++) {
+ s->stat_shared[ni].tmp.histogram = hi;
+ hi += s->n_histogram_entries + 1;
++ cond_resched();
+ }
+ }
+
+@@ -362,6 +365,7 @@ static int dm_stats_create(struct dm_stats *stats, sector_t start, sector_t end,
+ for (ni = 0; ni < n_entries; ni++) {
+ p[ni].histogram = hi;
+ hi += s->n_histogram_entries + 1;
++ cond_resched();
+ }
+ }
+ }
+@@ -497,6 +501,7 @@ static int dm_stats_list(struct dm_stats *stats, const char *program,
+ }
+ DMEMIT("\n");
+ }
++ cond_resched();
+ }
+ mutex_unlock(&stats->mutex);
+
+@@ -774,6 +779,7 @@ static void __dm_stat_clear(struct dm_stat *s, size_t idx_start, size_t idx_end,
+ local_irq_enable();
+ }
+ }
++ cond_resched();
+ }
+ }
+
+@@ -889,6 +895,8 @@ static int dm_stats_print(struct dm_stats *stats, int id,
+
+ if (unlikely(sz + 1 >= maxlen))
+ goto buffer_overflow;
++
++ cond_resched();
+ }
+
+ if (clear)
+diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
+index 80133aae0db37..d6dbd47492a85 100644
+--- a/drivers/md/dm-verity-target.c
++++ b/drivers/md/dm-verity-target.c
+@@ -1312,6 +1312,7 @@ bad:
+
+ static struct target_type verity_target = {
+ .name = "verity",
++ .features = DM_TARGET_IMMUTABLE,
+ .version = {1, 8, 0},
+ .module = THIS_MODULE,
+ .ctr = verity_ctr,
+diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
+index 351d341a1ffa4..d6ce5a09fd358 100644
+--- a/drivers/md/raid5.c
++++ b/drivers/md/raid5.c
+@@ -686,17 +686,17 @@ int raid5_calc_degraded(struct r5conf *conf)
+ return degraded;
+ }
+
+-static int has_failed(struct r5conf *conf)
++static bool has_failed(struct r5conf *conf)
+ {
+- int degraded;
++ int degraded = conf->mddev->degraded;
+
+- if (conf->mddev->reshape_position == MaxSector)
+- return conf->mddev->degraded > conf->max_degraded;
++ if (test_bit(MD_BROKEN, &conf->mddev->flags))
++ return true;
+
+- degraded = raid5_calc_degraded(conf);
+- if (degraded > conf->max_degraded)
+- return 1;
+- return 0;
++ if (conf->mddev->reshape_position != MaxSector)
++ degraded = raid5_calc_degraded(conf);
++
++ return degraded > conf->max_degraded;
+ }
+
+ struct stripe_head *
+@@ -2863,34 +2863,31 @@ static void raid5_error(struct mddev *mddev, struct md_rdev *rdev)
+ unsigned long flags;
+ pr_debug("raid456: error called\n");
+
++ pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n",
++ mdname(mddev), bdevname(rdev->bdev, b));
++
+ spin_lock_irqsave(&conf->device_lock, flags);
++ set_bit(Faulty, &rdev->flags);
++ clear_bit(In_sync, &rdev->flags);
++ mddev->degraded = raid5_calc_degraded(conf);
+
+- if (test_bit(In_sync, &rdev->flags) &&
+- mddev->degraded == conf->max_degraded) {
+- /*
+- * Don't allow to achieve failed state
+- * Don't try to recover this device
+- */
++ if (has_failed(conf)) {
++ set_bit(MD_BROKEN, &conf->mddev->flags);
+ conf->recovery_disabled = mddev->recovery_disabled;
+- spin_unlock_irqrestore(&conf->device_lock, flags);
+- return;
++
++ pr_crit("md/raid:%s: Cannot continue operation (%d/%d failed).\n",
++ mdname(mddev), mddev->degraded, conf->raid_disks);
++ } else {
++ pr_crit("md/raid:%s: Operation continuing on %d devices.\n",
++ mdname(mddev), conf->raid_disks - mddev->degraded);
+ }
+
+- set_bit(Faulty, &rdev->flags);
+- clear_bit(In_sync, &rdev->flags);
+- mddev->degraded = raid5_calc_degraded(conf);
+ spin_unlock_irqrestore(&conf->device_lock, flags);
+ set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+
+ set_bit(Blocked, &rdev->flags);
+ set_mask_bits(&mddev->sb_flags, 0,
+ BIT(MD_SB_CHANGE_DEVS) | BIT(MD_SB_CHANGE_PENDING));
+- pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n"
+- "md/raid:%s: Operation continuing on %d devices.\n",
+- mdname(mddev),
+- bdevname(rdev->bdev, b),
+- mdname(mddev),
+- conf->raid_disks - mddev->degraded);
+ r5c_update_on_rdev_error(mddev, rdev);
+ }
+
+diff --git a/drivers/media/i2c/imx412.c b/drivers/media/i2c/imx412.c
+index be3f6ea555597..84279a6808730 100644
+--- a/drivers/media/i2c/imx412.c
++++ b/drivers/media/i2c/imx412.c
+@@ -1011,7 +1011,7 @@ static int imx412_power_on(struct device *dev)
+ struct imx412 *imx412 = to_imx412(sd);
+ int ret;
+
+- gpiod_set_value_cansleep(imx412->reset_gpio, 1);
++ gpiod_set_value_cansleep(imx412->reset_gpio, 0);
+
+ ret = clk_prepare_enable(imx412->inclk);
+ if (ret) {
+@@ -1024,7 +1024,7 @@ static int imx412_power_on(struct device *dev)
+ return 0;
+
+ error_reset:
+- gpiod_set_value_cansleep(imx412->reset_gpio, 0);
++ gpiod_set_value_cansleep(imx412->reset_gpio, 1);
+
+ return ret;
+ }
+@@ -1040,10 +1040,10 @@ static int imx412_power_off(struct device *dev)
+ struct v4l2_subdev *sd = dev_get_drvdata(dev);
+ struct imx412 *imx412 = to_imx412(sd);
+
+- gpiod_set_value_cansleep(imx412->reset_gpio, 0);
+-
+ clk_disable_unprepare(imx412->inclk);
+
++ gpiod_set_value_cansleep(imx412->reset_gpio, 1);
++
+ return 0;
+ }
+
+diff --git a/drivers/net/ipa/ipa_endpoint.c b/drivers/net/ipa/ipa_endpoint.c
+index cea7b2e2ce969..53764f3c0c7e4 100644
+--- a/drivers/net/ipa/ipa_endpoint.c
++++ b/drivers/net/ipa/ipa_endpoint.c
+@@ -130,9 +130,10 @@ static bool ipa_endpoint_data_valid_one(struct ipa *ipa, u32 count,
+ */
+ if (data->endpoint.config.aggregation) {
+ limit += SZ_1K * aggr_byte_limit_max(ipa->version);
+- if (buffer_size > limit) {
++ if (buffer_size - NET_SKB_PAD > limit) {
+ dev_err(dev, "RX buffer size too large for aggregated RX endpoint %u (%u > %u)\n",
+- data->endpoint_id, buffer_size, limit);
++ data->endpoint_id,
++ buffer_size - NET_SKB_PAD, limit);
+
+ return false;
+ }
+@@ -739,6 +740,7 @@ static void ipa_endpoint_init_aggr(struct ipa_endpoint *endpoint)
+ if (endpoint->data->aggregation) {
+ if (!endpoint->toward_ipa) {
+ const struct ipa_endpoint_rx_data *rx_data;
++ u32 buffer_size;
+ bool close_eof;
+ u32 limit;
+
+@@ -746,7 +748,8 @@ static void ipa_endpoint_init_aggr(struct ipa_endpoint *endpoint)
+ val |= u32_encode_bits(IPA_ENABLE_AGGR, AGGR_EN_FMASK);
+ val |= u32_encode_bits(IPA_GENERIC, AGGR_TYPE_FMASK);
+
+- limit = ipa_aggr_size_kb(rx_data->buffer_size);
++ buffer_size = rx_data->buffer_size;
++ limit = ipa_aggr_size_kb(buffer_size - NET_SKB_PAD);
+ val |= aggr_byte_limit_encoded(version, limit);
+
+ limit = IPA_AGGR_TIME_LIMIT;
+diff --git a/fs/exfat/balloc.c b/fs/exfat/balloc.c
+index 03f1423071749..9f42f25fab920 100644
+--- a/fs/exfat/balloc.c
++++ b/fs/exfat/balloc.c
+@@ -148,7 +148,9 @@ int exfat_set_bitmap(struct inode *inode, unsigned int clu, bool sync)
+ struct super_block *sb = inode->i_sb;
+ struct exfat_sb_info *sbi = EXFAT_SB(sb);
+
+- WARN_ON(clu < EXFAT_FIRST_CLUSTER);
++ if (!is_valid_cluster(sbi, clu))
++ return -EINVAL;
++
+ ent_idx = CLUSTER_TO_BITMAP_ENT(clu);
+ i = BITMAP_OFFSET_SECTOR_INDEX(sb, ent_idx);
+ b = BITMAP_OFFSET_BIT_IN_SECTOR(sb, ent_idx);
+@@ -166,7 +168,9 @@ void exfat_clear_bitmap(struct inode *inode, unsigned int clu, bool sync)
+ struct exfat_sb_info *sbi = EXFAT_SB(sb);
+ struct exfat_mount_options *opts = &sbi->options;
+
+- WARN_ON(clu < EXFAT_FIRST_CLUSTER);
++ if (!is_valid_cluster(sbi, clu))
++ return;
++
+ ent_idx = CLUSTER_TO_BITMAP_ENT(clu);
+ i = BITMAP_OFFSET_SECTOR_INDEX(sb, ent_idx);
+ b = BITMAP_OFFSET_BIT_IN_SECTOR(sb, ent_idx);
+diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
+index c6800b8809203..42d06c68d5c5e 100644
+--- a/fs/exfat/exfat_fs.h
++++ b/fs/exfat/exfat_fs.h
+@@ -381,6 +381,12 @@ static inline int exfat_sector_to_cluster(struct exfat_sb_info *sbi,
+ EXFAT_RESERVED_CLUSTERS;
+ }
+
++static inline bool is_valid_cluster(struct exfat_sb_info *sbi,
++ unsigned int clus)
++{
++ return clus >= EXFAT_FIRST_CLUSTER && clus < sbi->num_clusters;
++}
++
+ /* super.c */
+ int exfat_set_volume_dirty(struct super_block *sb);
+ int exfat_clear_volume_dirty(struct super_block *sb);
+diff --git a/fs/exfat/fatent.c b/fs/exfat/fatent.c
+index a3464e56a7e16..421c273531049 100644
+--- a/fs/exfat/fatent.c
++++ b/fs/exfat/fatent.c
+@@ -81,12 +81,6 @@ int exfat_ent_set(struct super_block *sb, unsigned int loc,
+ return 0;
+ }
+
+-static inline bool is_valid_cluster(struct exfat_sb_info *sbi,
+- unsigned int clus)
+-{
+- return clus >= EXFAT_FIRST_CLUSTER && clus < sbi->num_clusters;
+-}
+-
+ int exfat_ent_get(struct super_block *sb, unsigned int loc,
+ unsigned int *content)
+ {
+diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
+index 7eefa16ed381b..8f8cd6e2d4dbc 100644
+--- a/fs/nfs/internal.h
++++ b/fs/nfs/internal.h
+@@ -841,6 +841,7 @@ static inline bool nfs_error_is_fatal_on_server(int err)
+ case 0:
+ case -ERESTARTSYS:
+ case -EINTR:
++ case -ENOMEM:
+ return false;
+ }
+ return nfs_error_is_fatal(err);
+diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
+index 234e852fcdfad..d6e1f95ccfd8a 100644
+--- a/fs/nfsd/nfs4state.c
++++ b/fs/nfsd/nfs4state.c
+@@ -7330,16 +7330,12 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
+ if (sop->so_is_open_owner || !same_owner_str(sop, owner))
+ continue;
+
+- /* see if there are still any locks associated with it */
+- lo = lockowner(sop);
+- list_for_each_entry(stp, &sop->so_stateids, st_perstateowner) {
+- if (check_for_locks(stp->st_stid.sc_file, lo)) {
+- status = nfserr_locks_held;
+- spin_unlock(&clp->cl_lock);
+- return status;
+- }
++ if (atomic_read(&sop->so_count) != 1) {
++ spin_unlock(&clp->cl_lock);
++ return nfserr_locks_held;
+ }
+
++ lo = lockowner(sop);
+ nfs4_get_stateowner(sop);
+ break;
+ }
+diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
+index 278dcf5024102..b2b54c4553f91 100644
+--- a/fs/ntfs3/super.c
++++ b/fs/ntfs3/super.c
+@@ -668,9 +668,11 @@ static u32 format_size_gb(const u64 bytes, u32 *mb)
+
+ static u32 true_sectors_per_clst(const struct NTFS_BOOT *boot)
+ {
+- return boot->sectors_per_clusters <= 0x80
+- ? boot->sectors_per_clusters
+- : (1u << (0 - boot->sectors_per_clusters));
++ if (boot->sectors_per_clusters <= 0x80)
++ return boot->sectors_per_clusters;
++ if (boot->sectors_per_clusters >= 0xf4) /* limit shift to 2MB max */
++ return 1U << (0 - boot->sectors_per_clusters);
++ return -EINVAL;
+ }
+
+ /*
+@@ -713,6 +715,8 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+
+ /* cluster size: 512, 1K, 2K, 4K, ... 2M */
+ sct_per_clst = true_sectors_per_clst(boot);
++ if ((int)sct_per_clst < 0)
++ goto out;
+ if (!is_power_of_2(sct_per_clst))
+ goto out;
+
+diff --git a/fs/pipe.c b/fs/pipe.c
+index e140ea150bbb1..74ae9fafd25a1 100644
+--- a/fs/pipe.c
++++ b/fs/pipe.c
+@@ -653,7 +653,7 @@ pipe_poll(struct file *filp, poll_table *wait)
+ unsigned int head, tail;
+
+ /* Epoll has some historical nasty semantics, this enables them */
+- pipe->poll_usage = 1;
++ WRITE_ONCE(pipe->poll_usage, true);
+
+ /*
+ * Reading pipe state only -- no need for acquiring the semaphore.
+@@ -1245,30 +1245,33 @@ unsigned int round_pipe_size(unsigned long size)
+
+ /*
+ * Resize the pipe ring to a number of slots.
++ *
++ * Note the pipe can be reduced in capacity, but only if the current
++ * occupancy doesn't exceed nr_slots; if it does, EBUSY will be
++ * returned instead.
+ */
+ int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots)
+ {
+ struct pipe_buffer *bufs;
+ unsigned int head, tail, mask, n;
+
+- /*
+- * We can shrink the pipe, if arg is greater than the ring occupancy.
+- * Since we don't expect a lot of shrink+grow operations, just free and
+- * allocate again like we would do for growing. If the pipe currently
+- * contains more buffers than arg, then return busy.
+- */
+- mask = pipe->ring_size - 1;
+- head = pipe->head;
+- tail = pipe->tail;
+- n = pipe_occupancy(pipe->head, pipe->tail);
+- if (nr_slots < n)
+- return -EBUSY;
+-
+ bufs = kcalloc(nr_slots, sizeof(*bufs),
+ GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
+ if (unlikely(!bufs))
+ return -ENOMEM;
+
++ spin_lock_irq(&pipe->rd_wait.lock);
++ mask = pipe->ring_size - 1;
++ head = pipe->head;
++ tail = pipe->tail;
++
++ n = pipe_occupancy(head, tail);
++ if (nr_slots < n) {
++ spin_unlock_irq(&pipe->rd_wait.lock);
++ kfree(bufs);
++ return -EBUSY;
++ }
++
+ /*
+ * The pipe array wraps around, so just start the new one at zero
+ * and adjust the indices.
+@@ -1300,6 +1303,8 @@ int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots)
+ pipe->tail = tail;
+ pipe->head = head;
+
++ spin_unlock_irq(&pipe->rd_wait.lock);
++
+ /* This might have made more room for writers */
+ wake_up_interruptible(&pipe->wr_wait);
+ return 0;
+diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
+index 493e632584970..7ea18d4da84b8 100644
+--- a/include/linux/bpf_local_storage.h
++++ b/include/linux/bpf_local_storage.h
+@@ -143,9 +143,9 @@ void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
+
+ bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage,
+ struct bpf_local_storage_elem *selem,
+- bool uncharge_omem);
++ bool uncharge_omem, bool use_trace_rcu);
+
+-void bpf_selem_unlink(struct bpf_local_storage_elem *selem);
++void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool use_trace_rcu);
+
+ void bpf_selem_link_map(struct bpf_local_storage_map *smap,
+ struct bpf_local_storage_elem *selem);
+diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
+index c00c618ef290d..cb0fd633a6106 100644
+--- a/include/linux/pipe_fs_i.h
++++ b/include/linux/pipe_fs_i.h
+@@ -71,7 +71,7 @@ struct pipe_inode_info {
+ unsigned int files;
+ unsigned int r_counter;
+ unsigned int w_counter;
+- unsigned int poll_usage;
++ bool poll_usage;
+ struct page *tmp_page;
+ struct fasync_struct *fasync_readers;
+ struct fasync_struct *fasync_writers;
+diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
+index 13807ea94cd2b..2d524782f53b7 100644
+--- a/include/net/netfilter/nf_conntrack_core.h
++++ b/include/net/netfilter/nf_conntrack_core.h
+@@ -58,8 +58,13 @@ static inline int nf_conntrack_confirm(struct sk_buff *skb)
+ int ret = NF_ACCEPT;
+
+ if (ct) {
+- if (!nf_ct_is_confirmed(ct))
++ if (!nf_ct_is_confirmed(ct)) {
+ ret = __nf_conntrack_confirm(skb);
++
++ if (ret == NF_ACCEPT)
++ ct = (struct nf_conn *)skb_nfct(skb);
++ }
++
+ if (likely(ret == NF_ACCEPT))
+ nf_ct_deliver_cached_events(ct);
+ }
+diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
+index 96be8d518885c..10424a1cda51d 100644
+--- a/kernel/bpf/bpf_inode_storage.c
++++ b/kernel/bpf/bpf_inode_storage.c
+@@ -90,7 +90,7 @@ void bpf_inode_storage_free(struct inode *inode)
+ */
+ bpf_selem_unlink_map(selem);
+ free_inode_storage = bpf_selem_unlink_storage_nolock(
+- local_storage, selem, false);
++ local_storage, selem, false, false);
+ }
+ raw_spin_unlock_bh(&local_storage->lock);
+ rcu_read_unlock();
+@@ -149,7 +149,7 @@ static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
+ if (!sdata)
+ return -ENOENT;
+
+- bpf_selem_unlink(SELEM(sdata));
++ bpf_selem_unlink(SELEM(sdata), true);
+
+ return 0;
+ }
+diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
+index 01aa2b51ec4df..8ce40fd869f6a 100644
+--- a/kernel/bpf/bpf_local_storage.c
++++ b/kernel/bpf/bpf_local_storage.c
+@@ -106,7 +106,7 @@ static void bpf_selem_free_rcu(struct rcu_head *rcu)
+ */
+ bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage,
+ struct bpf_local_storage_elem *selem,
+- bool uncharge_mem)
++ bool uncharge_mem, bool use_trace_rcu)
+ {
+ struct bpf_local_storage_map *smap;
+ bool free_local_storage;
+@@ -150,11 +150,16 @@ bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage,
+ SDATA(selem))
+ RCU_INIT_POINTER(local_storage->cache[smap->cache_idx], NULL);
+
+- call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_rcu);
++ if (use_trace_rcu)
++ call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_rcu);
++ else
++ kfree_rcu(selem, rcu);
++
+ return free_local_storage;
+ }
+
+-static void __bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem)
++static void __bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem,
++ bool use_trace_rcu)
+ {
+ struct bpf_local_storage *local_storage;
+ bool free_local_storage = false;
+@@ -169,12 +174,16 @@ static void __bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem)
+ raw_spin_lock_irqsave(&local_storage->lock, flags);
+ if (likely(selem_linked_to_storage(selem)))
+ free_local_storage = bpf_selem_unlink_storage_nolock(
+- local_storage, selem, true);
++ local_storage, selem, true, use_trace_rcu);
+ raw_spin_unlock_irqrestore(&local_storage->lock, flags);
+
+- if (free_local_storage)
+- call_rcu_tasks_trace(&local_storage->rcu,
++ if (free_local_storage) {
++ if (use_trace_rcu)
++ call_rcu_tasks_trace(&local_storage->rcu,
+ bpf_local_storage_free_rcu);
++ else
++ kfree_rcu(local_storage, rcu);
++ }
+ }
+
+ void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
+@@ -214,14 +223,14 @@ void bpf_selem_link_map(struct bpf_local_storage_map *smap,
+ raw_spin_unlock_irqrestore(&b->lock, flags);
+ }
+
+-void bpf_selem_unlink(struct bpf_local_storage_elem *selem)
++void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool use_trace_rcu)
+ {
+ /* Always unlink from map before unlinking from local_storage
+ * because selem will be freed after successfully unlinked from
+ * the local_storage.
+ */
+ bpf_selem_unlink_map(selem);
+- __bpf_selem_unlink_storage(selem);
++ __bpf_selem_unlink_storage(selem, use_trace_rcu);
+ }
+
+ struct bpf_local_storage_data *
+@@ -466,7 +475,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap,
+ if (old_sdata) {
+ bpf_selem_unlink_map(SELEM(old_sdata));
+ bpf_selem_unlink_storage_nolock(local_storage, SELEM(old_sdata),
+- false);
++ false, true);
+ }
+
+ unlock:
+@@ -548,7 +557,7 @@ void bpf_local_storage_map_free(struct bpf_local_storage_map *smap,
+ migrate_disable();
+ __this_cpu_inc(*busy_counter);
+ }
+- bpf_selem_unlink(selem);
++ bpf_selem_unlink(selem, false);
+ if (busy_counter) {
+ __this_cpu_dec(*busy_counter);
+ migrate_enable();
+diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
+index 6638a0ecc3d21..57904263a710f 100644
+--- a/kernel/bpf/bpf_task_storage.c
++++ b/kernel/bpf/bpf_task_storage.c
+@@ -102,7 +102,7 @@ void bpf_task_storage_free(struct task_struct *task)
+ */
+ bpf_selem_unlink_map(selem);
+ free_task_storage = bpf_selem_unlink_storage_nolock(
+- local_storage, selem, false);
++ local_storage, selem, false, false);
+ }
+ raw_spin_unlock_irqrestore(&local_storage->lock, flags);
+ bpf_task_storage_unlock();
+@@ -192,7 +192,7 @@ static int task_storage_delete(struct task_struct *task, struct bpf_map *map)
+ if (!sdata)
+ return -ENOENT;
+
+- bpf_selem_unlink(SELEM(sdata));
++ bpf_selem_unlink(SELEM(sdata), true);
+
+ return 0;
+ }
+diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
+index 13e9dbeeedf36..05e701f0da81d 100644
+--- a/kernel/bpf/core.c
++++ b/kernel/bpf/core.c
+@@ -873,7 +873,7 @@ static size_t select_bpf_prog_pack_size(void)
+ return size;
+ }
+
+-static struct bpf_prog_pack *alloc_new_pack(void)
++static struct bpf_prog_pack *alloc_new_pack(bpf_jit_fill_hole_t bpf_fill_ill_insns)
+ {
+ struct bpf_prog_pack *pack;
+
+@@ -886,6 +886,7 @@ static struct bpf_prog_pack *alloc_new_pack(void)
+ kfree(pack);
+ return NULL;
+ }
++ bpf_fill_ill_insns(pack->ptr, bpf_prog_pack_size);
+ bitmap_zero(pack->bitmap, bpf_prog_pack_size / BPF_PROG_CHUNK_SIZE);
+ list_add_tail(&pack->list, &pack_list);
+
+@@ -895,7 +896,7 @@ static struct bpf_prog_pack *alloc_new_pack(void)
+ return pack;
+ }
+
+-static void *bpf_prog_pack_alloc(u32 size)
++static void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns)
+ {
+ unsigned int nbits = BPF_PROG_SIZE_TO_NBITS(size);
+ struct bpf_prog_pack *pack;
+@@ -910,6 +911,7 @@ static void *bpf_prog_pack_alloc(u32 size)
+ size = round_up(size, PAGE_SIZE);
+ ptr = module_alloc(size);
+ if (ptr) {
++ bpf_fill_ill_insns(ptr, size);
+ set_vm_flush_reset_perms(ptr);
+ set_memory_ro((unsigned long)ptr, size / PAGE_SIZE);
+ set_memory_x((unsigned long)ptr, size / PAGE_SIZE);
+@@ -923,7 +925,7 @@ static void *bpf_prog_pack_alloc(u32 size)
+ goto found_free_area;
+ }
+
+- pack = alloc_new_pack();
++ pack = alloc_new_pack(bpf_fill_ill_insns);
+ if (!pack)
+ goto out;
+
+@@ -1102,7 +1104,7 @@ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **image_ptr,
+
+ if (bpf_jit_charge_modmem(size))
+ return NULL;
+- ro_header = bpf_prog_pack_alloc(size);
++ ro_header = bpf_prog_pack_alloc(size, bpf_fill_ill_insns);
+ if (!ro_header) {
+ bpf_jit_uncharge_modmem(size);
+ return NULL;
+@@ -1434,6 +1436,16 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
+ insn = clone->insnsi;
+
+ for (i = 0; i < insn_cnt; i++, insn++) {
++ if (bpf_pseudo_func(insn)) {
++ /* ld_imm64 with an address of bpf subprog is not
++ * a user controlled constant. Don't randomize it,
++ * since it will conflict with jit_subprogs() logic.
++ */
++ insn++;
++ i++;
++ continue;
++ }
++
+ /* We temporarily need to hold the original ld64 insn
+ * so that we can still access the first part in the
+ * second blinding run.
+diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
+index 34725bfa1e97b..5c6c96d0e634d 100644
+--- a/kernel/bpf/stackmap.c
++++ b/kernel/bpf/stackmap.c
+@@ -100,7 +100,6 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
+ return ERR_PTR(-E2BIG);
+
+ cost = n_buckets * sizeof(struct stack_map_bucket *) + sizeof(*smap);
+- cost += n_buckets * (value_size + sizeof(struct stack_map_bucket));
+ smap = bpf_map_area_alloc(cost, bpf_map_attr_numa_node(attr));
+ if (!smap)
+ return ERR_PTR(-ENOMEM);
+diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
+index ada97751ae1b2..5d8bfb5ef239d 100644
+--- a/kernel/bpf/trampoline.c
++++ b/kernel/bpf/trampoline.c
+@@ -411,7 +411,7 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+ {
+ enum bpf_tramp_prog_type kind;
+ int err = 0;
+- int cnt;
++ int cnt = 0, i;
+
+ kind = bpf_attach_type_to_tramp(prog);
+ mutex_lock(&tr->mutex);
+@@ -422,7 +422,10 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+ err = -EBUSY;
+ goto out;
+ }
+- cnt = tr->progs_cnt[BPF_TRAMP_FENTRY] + tr->progs_cnt[BPF_TRAMP_FEXIT];
++
++ for (i = 0; i < BPF_TRAMP_MAX; i++)
++ cnt += tr->progs_cnt[i];
++
+ if (kind == BPF_TRAMP_REPLACE) {
+ /* Cannot attach extension if fentry/fexit are in use. */
+ if (cnt) {
+@@ -500,16 +503,19 @@ out:
+
+ void bpf_trampoline_put(struct bpf_trampoline *tr)
+ {
++ int i;
++
+ if (!tr)
+ return;
+ mutex_lock(&trampoline_mutex);
+ if (!refcount_dec_and_test(&tr->refcnt))
+ goto out;
+ WARN_ON_ONCE(mutex_is_locked(&tr->mutex));
+- if (WARN_ON_ONCE(!hlist_empty(&tr->progs_hlist[BPF_TRAMP_FENTRY])))
+- goto out;
+- if (WARN_ON_ONCE(!hlist_empty(&tr->progs_hlist[BPF_TRAMP_FEXIT])))
+- goto out;
++
++ for (i = 0; i < BPF_TRAMP_MAX; i++)
++ if (WARN_ON_ONCE(!hlist_empty(&tr->progs_hlist[i])))
++ goto out;
++
+ /* This code will be executed even when the last bpf_tramp_image
+ * is alive. All progs are detached from the trampoline and the
+ * trampoline image is patched with jmp into epilogue to skip
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index d175b70067b30..9c1a02b82ecd0 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -4861,6 +4861,11 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
+ return check_packet_access(env, regno, reg->off, access_size,
+ zero_size_allowed);
+ case PTR_TO_MAP_KEY:
++ if (meta && meta->raw_mode) {
++ verbose(env, "R%d cannot write into %s\n", regno,
++ reg_type_str(env, reg->type));
++ return -EACCES;
++ }
+ return check_mem_region_access(env, regno, reg->off, access_size,
+ reg->map_ptr->key_size, false);
+ case PTR_TO_MAP_VALUE:
+@@ -4871,13 +4876,23 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
+ return check_map_access(env, regno, reg->off, access_size,
+ zero_size_allowed);
+ case PTR_TO_MEM:
++ if (type_is_rdonly_mem(reg->type)) {
++ if (meta && meta->raw_mode) {
++ verbose(env, "R%d cannot write into %s\n", regno,
++ reg_type_str(env, reg->type));
++ return -EACCES;
++ }
++ }
+ return check_mem_region_access(env, regno, reg->off,
+ access_size, reg->mem_size,
+ zero_size_allowed);
+ case PTR_TO_BUF:
+ if (type_is_rdonly_mem(reg->type)) {
+- if (meta && meta->raw_mode)
++ if (meta && meta->raw_mode) {
++ verbose(env, "R%d cannot write into %s\n", regno,
++ reg_type_str(env, reg->type));
+ return -EACCES;
++ }
+
+ max_access = &env->prog->aux->max_rdonly_access;
+ } else {
+@@ -4919,8 +4934,7 @@ static int check_mem_size_reg(struct bpf_verifier_env *env,
+ * out. Only upper bounds can be learned because retval is an
+ * int type and negative retvals are allowed.
+ */
+- if (meta)
+- meta->msize_max_value = reg->umax_value;
++ meta->msize_max_value = reg->umax_value;
+
+ /* The register is SCALAR_VALUE; the access check
+ * happens using its boundaries.
+@@ -4963,24 +4977,33 @@ static int check_mem_size_reg(struct bpf_verifier_env *env,
+ int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+ u32 regno, u32 mem_size)
+ {
++ bool may_be_null = type_may_be_null(reg->type);
++ struct bpf_reg_state saved_reg;
++ struct bpf_call_arg_meta meta;
++ int err;
++
+ if (register_is_null(reg))
+ return 0;
+
+- if (type_may_be_null(reg->type)) {
+- /* Assuming that the register contains a value check if the memory
+- * access is safe. Temporarily save and restore the register's state as
+- * the conversion shouldn't be visible to a caller.
+- */
+- const struct bpf_reg_state saved_reg = *reg;
+- int rv;
+-
++ memset(&meta, 0, sizeof(meta));
++ /* Assuming that the register contains a value check if the memory
++ * access is safe. Temporarily save and restore the register's state as
++ * the conversion shouldn't be visible to a caller.
++ */
++ if (may_be_null) {
++ saved_reg = *reg;
+ mark_ptr_not_null_reg(reg);
+- rv = check_helper_mem_access(env, regno, mem_size, true, NULL);
+- *reg = saved_reg;
+- return rv;
+ }
+
+- return check_helper_mem_access(env, regno, mem_size, true, NULL);
++ err = check_helper_mem_access(env, regno, mem_size, true, &meta);
++ /* Check access for BPF_WRITE */
++ meta.raw_mode = true;
++ err = err ?: check_helper_mem_access(env, regno, mem_size, true, &meta);
++
++ if (may_be_null)
++ *reg = saved_reg;
++
++ return err;
+ }
+
+ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+@@ -4989,16 +5012,22 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
+ struct bpf_reg_state *mem_reg = &cur_regs(env)[regno - 1];
+ bool may_be_null = type_may_be_null(mem_reg->type);
+ struct bpf_reg_state saved_reg;
++ struct bpf_call_arg_meta meta;
+ int err;
+
+ WARN_ON_ONCE(regno < BPF_REG_2 || regno > BPF_REG_5);
+
++ memset(&meta, 0, sizeof(meta));
++
+ if (may_be_null) {
+ saved_reg = *mem_reg;
+ mark_ptr_not_null_reg(mem_reg);
+ }
+
+- err = check_mem_size_reg(env, reg, regno, true, NULL);
++ err = check_mem_size_reg(env, reg, regno, true, &meta);
++ /* Check access for BPF_WRITE */
++ meta.raw_mode = true;
++ err = err ?: check_mem_size_reg(env, reg, regno, true, &meta);
+
+ if (may_be_null)
+ *mem_reg = saved_reg;
+diff --git a/lib/assoc_array.c b/lib/assoc_array.c
+index 079c72e26493e..ca0b4f360c1a0 100644
+--- a/lib/assoc_array.c
++++ b/lib/assoc_array.c
+@@ -1461,6 +1461,7 @@ int assoc_array_gc(struct assoc_array *array,
+ struct assoc_array_ptr *cursor, *ptr;
+ struct assoc_array_ptr *new_root, *new_parent, **new_ptr_pp;
+ unsigned long nr_leaves_on_tree;
++ bool retained;
+ int keylen, slot, nr_free, next_slot, i;
+
+ pr_devel("-->%s()\n", __func__);
+@@ -1536,6 +1537,7 @@ continue_node:
+ goto descend;
+ }
+
++retry_compress:
+ pr_devel("-- compress node %p --\n", new_n);
+
+ /* Count up the number of empty slots in this node and work out the
+@@ -1553,6 +1555,7 @@ continue_node:
+ pr_devel("free=%d, leaves=%lu\n", nr_free, new_n->nr_leaves_on_branch);
+
+ /* See what we can fold in */
++ retained = false;
+ next_slot = 0;
+ for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
+ struct assoc_array_shortcut *s;
+@@ -1602,9 +1605,14 @@ continue_node:
+ pr_devel("[%d] retain node %lu/%d [nx %d]\n",
+ slot, child->nr_leaves_on_branch, nr_free + 1,
+ next_slot);
++ retained = true;
+ }
+ }
+
++ if (retained && new_n->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT) {
++ pr_devel("internal nodes remain despite enough space, retrying\n");
++ goto retry_compress;
++ }
+ pr_devel("after: %lu\n", new_n->nr_leaves_on_branch);
+
+ nr_leaves_on_tree = new_n->nr_leaves_on_branch;
+diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
+index 9152fbde33b50..5d5fc04385b8d 100644
+--- a/mm/zsmalloc.c
++++ b/mm/zsmalloc.c
+@@ -1718,11 +1718,40 @@ static enum fullness_group putback_zspage(struct size_class *class,
+ */
+ static void lock_zspage(struct zspage *zspage)
+ {
+- struct page *page = get_first_page(zspage);
++ struct page *curr_page, *page;
+
+- do {
+- lock_page(page);
+- } while ((page = get_next_page(page)) != NULL);
++ /*
++ * Pages we haven't locked yet can be migrated off the list while we're
++ * trying to lock them, so we need to be careful and only attempt to
++ * lock each page under migrate_read_lock(). Otherwise, the page we lock
++ * may no longer belong to the zspage. This means that we may wait for
++ * the wrong page to unlock, so we must take a reference to the page
++ * prior to waiting for it to unlock outside migrate_read_lock().
++ */
++ while (1) {
++ migrate_read_lock(zspage);
++ page = get_first_page(zspage);
++ if (trylock_page(page))
++ break;
++ get_page(page);
++ migrate_read_unlock(zspage);
++ wait_on_page_locked(page);
++ put_page(page);
++ }
++
++ curr_page = page;
++ while ((page = get_next_page(curr_page))) {
++ if (trylock_page(page)) {
++ curr_page = page;
++ } else {
++ get_page(page);
++ migrate_read_unlock(zspage);
++ wait_on_page_locked(page);
++ put_page(page);
++ migrate_read_lock(zspage);
++ }
++ }
++ migrate_read_unlock(zspage);
+ }
+
+ static int zs_init_fs_context(struct fs_context *fc)
+diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
+index e3ac363805203..83d7641ef67b0 100644
+--- a/net/core/bpf_sk_storage.c
++++ b/net/core/bpf_sk_storage.c
+@@ -40,7 +40,7 @@ static int bpf_sk_storage_del(struct sock *sk, struct bpf_map *map)
+ if (!sdata)
+ return -ENOENT;
+
+- bpf_selem_unlink(SELEM(sdata));
++ bpf_selem_unlink(SELEM(sdata), true);
+
+ return 0;
+ }
+@@ -75,8 +75,8 @@ void bpf_sk_storage_free(struct sock *sk)
+ * sk_storage.
+ */
+ bpf_selem_unlink_map(selem);
+- free_sk_storage = bpf_selem_unlink_storage_nolock(sk_storage,
+- selem, true);
++ free_sk_storage = bpf_selem_unlink_storage_nolock(
++ sk_storage, selem, true, false);
+ }
+ raw_spin_unlock_bh(&sk_storage->lock);
+ rcu_read_unlock();
+diff --git a/net/core/filter.c b/net/core/filter.c
+index 64470a727ef77..966796b345e78 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -1687,7 +1687,7 @@ BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset,
+
+ if (unlikely(flags & ~(BPF_F_RECOMPUTE_CSUM | BPF_F_INVALIDATE_HASH)))
+ return -EINVAL;
+- if (unlikely(offset > 0xffff))
++ if (unlikely(offset > INT_MAX))
+ return -EFAULT;
+ if (unlikely(bpf_try_make_writable(skb, offset + len)))
+ return -EFAULT;
+@@ -1722,7 +1722,7 @@ BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset,
+ {
+ void *ptr;
+
+- if (unlikely(offset > 0xffff))
++ if (unlikely(offset > INT_MAX))
+ goto err_clear;
+
+ ptr = skb_header_pointer(skb, offset, len, to);
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index a096b9fbbbdff..b6a9208130051 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -222,12 +222,18 @@ err_register:
+ }
+
+ static void nft_netdev_unregister_hooks(struct net *net,
+- struct list_head *hook_list)
++ struct list_head *hook_list,
++ bool release_netdev)
+ {
+- struct nft_hook *hook;
++ struct nft_hook *hook, *next;
+
+- list_for_each_entry(hook, hook_list, list)
++ list_for_each_entry_safe(hook, next, hook_list, list) {
+ nf_unregister_net_hook(net, &hook->ops);
++ if (release_netdev) {
++ list_del(&hook->list);
++ kfree_rcu(hook, rcu);
++ }
++ }
+ }
+
+ static int nf_tables_register_hook(struct net *net,
+@@ -253,9 +259,10 @@ static int nf_tables_register_hook(struct net *net,
+ return nf_register_net_hook(net, &basechain->ops);
+ }
+
+-static void nf_tables_unregister_hook(struct net *net,
+- const struct nft_table *table,
+- struct nft_chain *chain)
++static void __nf_tables_unregister_hook(struct net *net,
++ const struct nft_table *table,
++ struct nft_chain *chain,
++ bool release_netdev)
+ {
+ struct nft_base_chain *basechain;
+ const struct nf_hook_ops *ops;
+@@ -270,11 +277,19 @@ static void nf_tables_unregister_hook(struct net *net,
+ return basechain->type->ops_unregister(net, ops);
+
+ if (nft_base_chain_netdev(table->family, basechain->ops.hooknum))
+- nft_netdev_unregister_hooks(net, &basechain->hook_list);
++ nft_netdev_unregister_hooks(net, &basechain->hook_list,
++ release_netdev);
+ else
+ nf_unregister_net_hook(net, &basechain->ops);
+ }
+
++static void nf_tables_unregister_hook(struct net *net,
++ const struct nft_table *table,
++ struct nft_chain *chain)
++{
++ return __nf_tables_unregister_hook(net, table, chain, false);
++}
++
+ static void nft_trans_commit_list_add_tail(struct net *net, struct nft_trans *trans)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+@@ -2873,27 +2888,31 @@ static struct nft_expr *nft_expr_init(const struct nft_ctx *ctx,
+
+ err = nf_tables_expr_parse(ctx, nla, &expr_info);
+ if (err < 0)
+- goto err1;
++ goto err_expr_parse;
++
++ err = -EOPNOTSUPP;
++ if (!(expr_info.ops->type->flags & NFT_EXPR_STATEFUL))
++ goto err_expr_stateful;
+
+ err = -ENOMEM;
+ expr = kzalloc(expr_info.ops->size, GFP_KERNEL_ACCOUNT);
+ if (expr == NULL)
+- goto err2;
++ goto err_expr_stateful;
+
+ err = nf_tables_newexpr(ctx, &expr_info, expr);
+ if (err < 0)
+- goto err3;
++ goto err_expr_new;
+
+ return expr;
+-err3:
++err_expr_new:
+ kfree(expr);
+-err2:
++err_expr_stateful:
+ owner = expr_info.ops->type->owner;
+ if (expr_info.ops->type->release_ops)
+ expr_info.ops->type->release_ops(expr_info.ops);
+
+ module_put(owner);
+-err1:
++err_expr_parse:
+ return ERR_PTR(err);
+ }
+
+@@ -4242,6 +4261,9 @@ static int nft_set_desc_concat_parse(const struct nlattr *attr,
+ u32 len;
+ int err;
+
++ if (desc->field_count >= ARRAY_SIZE(desc->field_len))
++ return -E2BIG;
++
+ err = nla_parse_nested_deprecated(tb, NFTA_SET_FIELD_MAX, attr,
+ nft_concat_policy, NULL);
+ if (err < 0)
+@@ -4251,9 +4273,8 @@ static int nft_set_desc_concat_parse(const struct nlattr *attr,
+ return -EINVAL;
+
+ len = ntohl(nla_get_be32(tb[NFTA_SET_FIELD_LEN]));
+-
+- if (len * BITS_PER_BYTE / 32 > NFT_REG32_COUNT)
+- return -E2BIG;
++ if (!len || len > U8_MAX)
++ return -EINVAL;
+
+ desc->field_len[desc->field_count++] = len;
+
+@@ -4264,7 +4285,8 @@ static int nft_set_desc_concat(struct nft_set_desc *desc,
+ const struct nlattr *nla)
+ {
+ struct nlattr *attr;
+- int rem, err;
++ u32 num_regs = 0;
++ int rem, err, i;
+
+ nla_for_each_nested(attr, nla, rem) {
+ if (nla_type(attr) != NFTA_LIST_ELEM)
+@@ -4275,6 +4297,12 @@ static int nft_set_desc_concat(struct nft_set_desc *desc,
+ return err;
+ }
+
++ for (i = 0; i < desc->field_count; i++)
++ num_regs += DIV_ROUND_UP(desc->field_len[i], sizeof(u32));
++
++ if (num_regs > NFT_REG32_COUNT)
++ return -E2BIG;
++
+ return 0;
+ }
+
+@@ -5413,9 +5441,6 @@ struct nft_expr *nft_set_elem_expr_alloc(const struct nft_ctx *ctx,
+ return expr;
+
+ err = -EOPNOTSUPP;
+- if (!(expr->ops->type->flags & NFT_EXPR_STATEFUL))
+- goto err_set_elem_expr;
+-
+ if (expr->ops->type->flags & NFT_EXPR_GC) {
+ if (set->flags & NFT_SET_TIMEOUT)
+ goto err_set_elem_expr;
+@@ -7291,13 +7316,25 @@ static void nft_unregister_flowtable_hook(struct net *net,
+ FLOW_BLOCK_UNBIND);
+ }
+
+-static void nft_unregister_flowtable_net_hooks(struct net *net,
+- struct list_head *hook_list)
++static void __nft_unregister_flowtable_net_hooks(struct net *net,
++ struct list_head *hook_list,
++ bool release_netdev)
+ {
+- struct nft_hook *hook;
++ struct nft_hook *hook, *next;
+
+- list_for_each_entry(hook, hook_list, list)
++ list_for_each_entry_safe(hook, next, hook_list, list) {
+ nf_unregister_net_hook(net, &hook->ops);
++ if (release_netdev) {
++ list_del(&hook->list);
++ kfree_rcu(hook);
++ }
++ }
++}
++
++static void nft_unregister_flowtable_net_hooks(struct net *net,
++ struct list_head *hook_list)
++{
++ __nft_unregister_flowtable_net_hooks(net, hook_list, false);
+ }
+
+ static int nft_register_flowtable_net_hooks(struct net *net,
+@@ -9741,9 +9778,10 @@ static void __nft_release_hook(struct net *net, struct nft_table *table)
+ struct nft_chain *chain;
+
+ list_for_each_entry(chain, &table->chains, list)
+- nf_tables_unregister_hook(net, table, chain);
++ __nf_tables_unregister_hook(net, table, chain, true);
+ list_for_each_entry(flowtable, &table->flowtables, list)
+- nft_unregister_flowtable_net_hooks(net, &flowtable->hook_list);
++ __nft_unregister_flowtable_net_hooks(net, &flowtable->hook_list,
++ true);
+ }
+
+ static void __nft_release_hooks(struct net *net)
+@@ -9882,7 +9920,11 @@ static int __net_init nf_tables_init_net(struct net *net)
+
+ static void __net_exit nf_tables_pre_exit_net(struct net *net)
+ {
++ struct nftables_pernet *nft_net = nft_pernet(net);
++
++ mutex_lock(&nft_net->commit_mutex);
+ __nft_release_hooks(net);
++ mutex_unlock(&nft_net->commit_mutex);
+ }
+
+ static void __net_exit nf_tables_exit_net(struct net *net)
+diff --git a/net/netfilter/nft_limit.c b/net/netfilter/nft_limit.c
+index 04ea8b9bf2028..981addb2d0515 100644
+--- a/net/netfilter/nft_limit.c
++++ b/net/netfilter/nft_limit.c
+@@ -213,6 +213,8 @@ static int nft_limit_pkts_clone(struct nft_expr *dst, const struct nft_expr *src
+ struct nft_limit_priv_pkts *priv_dst = nft_expr_priv(dst);
+ struct nft_limit_priv_pkts *priv_src = nft_expr_priv(src);
+
++ priv_dst->cost = priv_src->cost;
++
+ return nft_limit_clone(&priv_dst->limit, &priv_src->limit);
+ }
+
+diff --git a/sound/usb/clock.c b/sound/usb/clock.c
+index 4dfe76416794f..33db334e65566 100644
+--- a/sound/usb/clock.c
++++ b/sound/usb/clock.c
+@@ -572,6 +572,17 @@ static int set_sample_rate_v2v3(struct snd_usb_audio *chip,
+ /* continue processing */
+ }
+
++ /* FIXME - TEAC devices require the immediate interface setup */
++ if (USB_ID_VENDOR(chip->usb_id) == 0x0644) {
++ bool cur_base_48k = (rate % 48000 == 0);
++ bool prev_base_48k = (prev_rate % 48000 == 0);
++ if (cur_base_48k != prev_base_48k) {
++ usb_set_interface(chip->dev, fmt->iface, fmt->altsetting);
++ if (chip->quirk_flags & QUIRK_FLAG_IFACE_DELAY)
++ msleep(50);
++ }
++ }
++
+ validation:
+ /* validate clock after rate change */
+ if (!uac_clock_source_is_valid(chip, fmt, clock))
+diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
+index 6d699065e81a2..b470404a5376c 100644
+--- a/sound/usb/pcm.c
++++ b/sound/usb/pcm.c
+@@ -439,16 +439,21 @@ static int configure_endpoints(struct snd_usb_audio *chip,
+ /* stop any running stream beforehand */
+ if (stop_endpoints(subs, false))
+ sync_pending_stops(subs);
++ if (subs->sync_endpoint) {
++ err = snd_usb_endpoint_configure(chip, subs->sync_endpoint);
++ if (err < 0)
++ return err;
++ }
+ err = snd_usb_endpoint_configure(chip, subs->data_endpoint);
+ if (err < 0)
+ return err;
+ snd_usb_set_format_quirk(subs, subs->cur_audiofmt);
+- }
+-
+- if (subs->sync_endpoint) {
+- err = snd_usb_endpoint_configure(chip, subs->sync_endpoint);
+- if (err < 0)
+- return err;
++ } else {
++ if (subs->sync_endpoint) {
++ err = snd_usb_endpoint_configure(chip, subs->sync_endpoint);
++ if (err < 0)
++ return err;
++ }
+ }
+
+ return 0;
+diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h
+index 40a5e3eb4ef26..78eb41b621d63 100644
+--- a/sound/usb/quirks-table.h
++++ b/sound/usb/quirks-table.h
+@@ -2672,6 +2672,7 @@ YAMAHA_DEVICE(0x7010, "UB99"),
+ .altset_idx = 1,
+ .attributes = 0,
+ .endpoint = 0x82,
++ .ep_idx = 1,
+ .ep_attr = USB_ENDPOINT_XFER_ISOC,
+ .datainterval = 1,
+ .maxpacksize = 0x0126,
+@@ -2875,6 +2876,7 @@ YAMAHA_DEVICE(0x7010, "UB99"),
+ .altset_idx = 1,
+ .attributes = 0x4,
+ .endpoint = 0x81,
++ .ep_idx = 1,
+ .ep_attr = USB_ENDPOINT_XFER_ISOC |
+ USB_ENDPOINT_SYNC_ASYNC,
+ .maxpacksize = 0x130,
+@@ -3391,6 +3393,7 @@ YAMAHA_DEVICE(0x7010, "UB99"),
+ .altset_idx = 1,
+ .attributes = 0,
+ .endpoint = 0x03,
++ .ep_idx = 1,
+ .rates = SNDRV_PCM_RATE_96000,
+ .ep_attr = USB_ENDPOINT_XFER_ISOC |
+ USB_ENDPOINT_SYNC_ASYNC,
+diff --git a/tools/memory-model/README b/tools/memory-model/README
+index 9edd402704c4f..dab38904206a0 100644
+--- a/tools/memory-model/README
++++ b/tools/memory-model/README
+@@ -54,7 +54,8 @@ klitmus7 Compatibility Table
+ -- 4.14 7.48 --
+ 4.15 -- 4.19 7.49 --
+ 4.20 -- 5.5 7.54 --
+- 5.6 -- 7.56 --
++ 5.6 -- 5.16 7.56 --
++ 5.17 -- 7.56.1 --
+ ============ ==========
+
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-09 11:25 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-09 11:25 UTC (permalink / raw
To: gentoo-commits
commit: 565fe84a57a3471aa2c2fbfd0bfd62e7178a6fb0
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Jun 9 11:22:59 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Jun 9 11:22:59 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=565fe84a
Linux patch 5.18.3
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1002_linux-5.18.3.patch | 37846 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 37850 insertions(+)
diff --git a/0000_README b/0000_README
index 561c7140..5acbe17f 100644
--- a/0000_README
+++ b/0000_README
@@ -51,6 +51,10 @@ Patch: 1001_linux-5.18.2.patch
From: http://www.kernel.org
Desc: Linux 5.18.2
+Patch: 1002_linux-5.18.3.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.3
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1002_linux-5.18.3.patch b/1002_linux-5.18.3.patch
new file mode 100644
index 00000000..002288d0
--- /dev/null
+++ b/1002_linux-5.18.3.patch
@@ -0,0 +1,37846 @@
+diff --git a/Documentation/accounting/psi.rst b/Documentation/accounting/psi.rst
+index 860fe651d6453..5e40b3f437f90 100644
+--- a/Documentation/accounting/psi.rst
++++ b/Documentation/accounting/psi.rst
+@@ -37,11 +37,7 @@ Pressure interface
+ Pressure information for each resource is exported through the
+ respective file in /proc/pressure/ -- cpu, memory, and io.
+
+-The format for CPU is as such::
+-
+- some avg10=0.00 avg60=0.00 avg300=0.00 total=0
+-
+-and for memory and IO::
++The format is as such::
+
+ some avg10=0.00 avg60=0.00 avg300=0.00 total=0
+ full avg10=0.00 avg60=0.00 avg300=0.00 total=0
+@@ -58,6 +54,9 @@ situation from a state where some tasks are stalled but the CPU is
+ still doing productive work. As such, time spent in this subset of the
+ stall state is tracked separately and exported in the "full" averages.
+
++CPU full is undefined at the system level, but has been reported
++since 5.13, so it is set to zero for backward compatibility.
++
+ The ratios (in %) are tracked as recent trends over ten, sixty, and
+ three hundred second windows, which gives insight into short term events
+ as well as medium and long term trends. The total absolute stall time
+diff --git a/Documentation/conf.py b/Documentation/conf.py
+index 072ee31a301dc..934727e23e0eb 100644
+--- a/Documentation/conf.py
++++ b/Documentation/conf.py
+@@ -161,7 +161,7 @@ finally:
+ #
+ # This is also used if you do content translation via gettext catalogs.
+ # Usually you set "language" from the command line for these cases.
+-language = None
++language = 'en'
+
+ # There are two options for replacing |today|: either, you set today to some
+ # non-false value, then it is used:
+diff --git a/Documentation/devicetree/bindings/display/sitronix,st7735r.yaml b/Documentation/devicetree/bindings/display/sitronix,st7735r.yaml
+index 0cebaaefda032..419c3b2ac5a6f 100644
+--- a/Documentation/devicetree/bindings/display/sitronix,st7735r.yaml
++++ b/Documentation/devicetree/bindings/display/sitronix,st7735r.yaml
+@@ -72,6 +72,7 @@ examples:
+ dc-gpios = <&gpio 43 GPIO_ACTIVE_HIGH>;
+ reset-gpios = <&gpio 80 GPIO_ACTIVE_HIGH>;
+ rotation = <270>;
++ backlight = <&backlight>;
+ };
+ };
+
+diff --git a/Documentation/devicetree/bindings/gpio/gpio-altera.txt b/Documentation/devicetree/bindings/gpio/gpio-altera.txt
+index 146e554b3c676..2a80e272cd666 100644
+--- a/Documentation/devicetree/bindings/gpio/gpio-altera.txt
++++ b/Documentation/devicetree/bindings/gpio/gpio-altera.txt
+@@ -9,8 +9,9 @@ Required properties:
+ - The second cell is reserved and is currently unused.
+ - gpio-controller : Marks the device node as a GPIO controller.
+ - interrupt-controller: Mark the device node as an interrupt controller
+-- #interrupt-cells : Should be 1. The interrupt type is fixed in the hardware.
++- #interrupt-cells : Should be 2. The interrupt type is fixed in the hardware.
+ - The first cell is the GPIO offset number within the GPIO controller.
++ - The second cell is the interrupt trigger type and level flags.
+ - interrupts: Specify the interrupt.
+ - altr,interrupt-type: Specifies the interrupt trigger type the GPIO
+ hardware is synthesized. This field is required if the Altera GPIO controller
+@@ -38,6 +39,6 @@ gpio_altr: gpio@ff200000 {
+ altr,interrupt-type = <IRQ_TYPE_EDGE_RISING>;
+ #gpio-cells = <2>;
+ gpio-controller;
+- #interrupt-cells = <1>;
++ #interrupt-cells = <2>;
+ interrupt-controller;
+ };
+diff --git a/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml b/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml
+index 61dd5af80db67..5d2d989de893c 100644
+--- a/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml
++++ b/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml
+@@ -31,7 +31,7 @@ properties:
+ $ref: "regulator.yaml#"
+
+ properties:
+- regulator-name:
++ regulator-compatible:
+ pattern: "^vbuck[1-4]$"
+
+ additionalProperties: false
+diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,smd-rpm.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,smd-rpm.yaml
+index b32457c2fc0b0..3361218e278f3 100644
+--- a/Documentation/devicetree/bindings/soc/qcom/qcom,smd-rpm.yaml
++++ b/Documentation/devicetree/bindings/soc/qcom/qcom,smd-rpm.yaml
+@@ -34,6 +34,7 @@ properties:
+ - qcom,rpm-ipq6018
+ - qcom,rpm-msm8226
+ - qcom,rpm-msm8916
++ - qcom,rpm-msm8936
+ - qcom,rpm-msm8953
+ - qcom,rpm-msm8974
+ - qcom,rpm-msm8976
+diff --git a/Documentation/devicetree/bindings/spi/qcom,spi-qcom-qspi.yaml b/Documentation/devicetree/bindings/spi/qcom,spi-qcom-qspi.yaml
+index 5a60fba14bba0..44d08aa3fd85d 100644
+--- a/Documentation/devicetree/bindings/spi/qcom,spi-qcom-qspi.yaml
++++ b/Documentation/devicetree/bindings/spi/qcom,spi-qcom-qspi.yaml
+@@ -49,6 +49,7 @@ properties:
+ maxItems: 2
+
+ interconnect-names:
++ minItems: 1
+ items:
+ - const: qspi-config
+ - const: qspi-memory
+diff --git a/Documentation/driver-api/thermal/intel_dptf.rst b/Documentation/driver-api/thermal/intel_dptf.rst
+index 96668dca753a8..372bdb4d04c6d 100644
+--- a/Documentation/driver-api/thermal/intel_dptf.rst
++++ b/Documentation/driver-api/thermal/intel_dptf.rst
+@@ -4,7 +4,7 @@
+ Intel(R) Dynamic Platform and Thermal Framework Sysfs Interface
+ ===============================================================
+
+-:Copyright: |copy| 2022 Intel Corporation
++:Copyright: © 2022 Intel Corporation
+
+ :Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
+
+diff --git a/Documentation/sound/alsa-configuration.rst b/Documentation/sound/alsa-configuration.rst
+index 34888d4fc4a83..21ab5e6f7062f 100644
+--- a/Documentation/sound/alsa-configuration.rst
++++ b/Documentation/sound/alsa-configuration.rst
+@@ -2246,7 +2246,7 @@ implicit_fb
+ Apply the generic implicit feedback sync mode. When this is set
+ and the playback stream sync mode is ASYNC, the driver tries to
+ tie an adjacent ASYNC capture stream as the implicit feedback
+- source.
++ source. This is equivalent with quirk_flags bit 17.
+ use_vmalloc
+ Use vmalloc() for allocations of the PCM buffers (default: yes).
+ For architectures with non-coherent memory like ARM or MIPS, the
+@@ -2288,6 +2288,8 @@ quirk_flags
+ * bit 14: Ignore errors for mixer access
+ * bit 15: Support generic DSD raw U32_BE format
+ * bit 16: Set up the interface at first like UAC1
++ * bit 17: Apply the generic implicit feedback sync mode
++ * bit 18: Don't apply implicit feedback sync mode
+
+ This module supports multiple devices, autoprobe and hotplugging.
+
+diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
+index f35552ff19ba8..b68e7a51009f8 100644
+--- a/Documentation/userspace-api/landlock.rst
++++ b/Documentation/userspace-api/landlock.rst
+@@ -267,8 +267,8 @@ restrict such paths with dedicated ruleset flags.
+ Ruleset layers
+ --------------
+
+-There is a limit of 64 layers of stacked rulesets. This can be an issue for a
+-task willing to enforce a new ruleset in complement to its 64 inherited
++There is a limit of 16 layers of stacked rulesets. This can be an issue for a
++task willing to enforce a new ruleset in complement to its 16 inherited
+ rulesets. Once this limit is reached, sys_landlock_restrict_self() returns
+ E2BIG. It is then strongly suggested to carefully build rulesets once in the
+ life of a thread, especially for applications able to launch other applications
+diff --git a/Documentation/userspace-api/media/lirc.h.rst.exceptions b/Documentation/userspace-api/media/lirc.h.rst.exceptions
+index 913d17b498313..1aeb7d7afe13b 100644
+--- a/Documentation/userspace-api/media/lirc.h.rst.exceptions
++++ b/Documentation/userspace-api/media/lirc.h.rst.exceptions
+@@ -30,6 +30,8 @@ ignore define LIRC_CAN_REC
+
+ ignore define LIRC_CAN_SEND_MASK
+ ignore define LIRC_CAN_REC_MASK
++ignore define LIRC_CAN_SET_REC_FILTER
++ignore define LIRC_CAN_NOTIFY_DECODE
+
+ # Obsolete ioctls
+
+diff --git a/Makefile b/Makefile
+index 6b1d606a92f6f..eb3adfec0b222 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 2
++SUBLEVEL = 3
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
+index 18f48a6f2ff6d..8f3f5eecba28b 100644
+--- a/arch/alpha/include/asm/page.h
++++ b/arch/alpha/include/asm/page.h
+@@ -18,7 +18,7 @@ extern void clear_page(void *page);
+ #define clear_user_page(page, vaddr, pg) clear_page(page)
+
+ #define alloc_zeroed_user_highpage_movable(vma, vaddr) \
+- alloc_page_vma(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, vma, vmaddr)
++ alloc_page_vma(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, vma, vaddr)
+ #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE_MOVABLE
+
+ extern void copy_page(void * _to, void * _from);
+diff --git a/arch/arm/boot/dts/bcm2835-rpi-b.dts b/arch/arm/boot/dts/bcm2835-rpi-b.dts
+index 1b63d6b19750b..25d87212cefd3 100644
+--- a/arch/arm/boot/dts/bcm2835-rpi-b.dts
++++ b/arch/arm/boot/dts/bcm2835-rpi-b.dts
+@@ -53,18 +53,17 @@
+ "GPIO18",
+ "NC", /* GPIO19 */
+ "NC", /* GPIO20 */
+- "GPIO21",
++ "CAM_GPIO0",
+ "GPIO22",
+ "GPIO23",
+ "GPIO24",
+ "GPIO25",
+ "NC", /* GPIO26 */
+- "CAM_GPIO0",
+- /* Binary number representing build/revision */
+- "CONFIG0",
+- "CONFIG1",
+- "CONFIG2",
+- "CONFIG3",
++ "GPIO27",
++ "GPIO28",
++ "GPIO29",
++ "GPIO30",
++ "GPIO31",
+ "NC", /* GPIO32 */
+ "NC", /* GPIO33 */
+ "NC", /* GPIO34 */
+diff --git a/arch/arm/boot/dts/bcm2835-rpi-zero-w.dts b/arch/arm/boot/dts/bcm2835-rpi-zero-w.dts
+index 243236bc1e00b..8b043ab62dc83 100644
+--- a/arch/arm/boot/dts/bcm2835-rpi-zero-w.dts
++++ b/arch/arm/boot/dts/bcm2835-rpi-zero-w.dts
+@@ -74,16 +74,18 @@
+ "GPIO27",
+ "SDA0",
+ "SCL0",
+- "NC", /* GPIO30 */
+- "NC", /* GPIO31 */
+- "NC", /* GPIO32 */
+- "NC", /* GPIO33 */
+- "NC", /* GPIO34 */
+- "NC", /* GPIO35 */
+- "NC", /* GPIO36 */
+- "NC", /* GPIO37 */
+- "NC", /* GPIO38 */
+- "NC", /* GPIO39 */
++ /* Used by BT module */
++ "CTS0",
++ "RTS0",
++ "TXD0",
++ "RXD0",
++ /* Used by Wifi */
++ "SD1_CLK",
++ "SD1_CMD",
++ "SD1_DATA0",
++ "SD1_DATA1",
++ "SD1_DATA2",
++ "SD1_DATA3",
+ "CAM_GPIO1", /* GPIO40 */
+ "WL_ON", /* GPIO41 */
+ "NC", /* GPIO42 */
+diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
+index e12938baaf12c..c263f5b48b96b 100644
+--- a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
++++ b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
+@@ -45,7 +45,7 @@
+ #gpio-cells = <2>;
+ gpio-line-names = "BT_ON",
+ "WL_ON",
+- "STATUS_LED_R",
++ "PWR_LED_R",
+ "LAN_RUN",
+ "",
+ "CAM_GPIO0",
+diff --git a/arch/arm/boot/dts/bcm2837-rpi-cm3-io3.dts b/arch/arm/boot/dts/bcm2837-rpi-cm3-io3.dts
+index 588d9411ceb61..3dfce4312dfc4 100644
+--- a/arch/arm/boot/dts/bcm2837-rpi-cm3-io3.dts
++++ b/arch/arm/boot/dts/bcm2837-rpi-cm3-io3.dts
+@@ -63,8 +63,8 @@
+ "GPIO43",
+ "GPIO44",
+ "GPIO45",
+- "GPIO46",
+- "GPIO47",
++ "SMPS_SCL",
++ "SMPS_SDA",
+ /* Used by eMMC */
+ "SD_CLK_R",
+ "SD_CMD_R",
+diff --git a/arch/arm/boot/dts/bcm5301x.dtsi b/arch/arm/boot/dts/bcm5301x.dtsi
+index 603c700c706f2..65f8a759f1e31 100644
+--- a/arch/arm/boot/dts/bcm5301x.dtsi
++++ b/arch/arm/boot/dts/bcm5301x.dtsi
+@@ -455,7 +455,7 @@
+ reg = <0x180 0x4>;
+ };
+
+- pinctrl: pin-controller@1c0 {
++ pinctrl: pinctrl@1c0 {
+ compatible = "brcm,bcm4708-pinmux";
+ reg = <0x1c0 0x24>;
+ reg-names = "cru_gpio_control";
+diff --git a/arch/arm/boot/dts/exynos5250-smdk5250.dts b/arch/arm/boot/dts/exynos5250-smdk5250.dts
+index 21fbbf3d8684d..71293749ac481 100644
+--- a/arch/arm/boot/dts/exynos5250-smdk5250.dts
++++ b/arch/arm/boot/dts/exynos5250-smdk5250.dts
+@@ -129,7 +129,7 @@
+ samsung,i2c-max-bus-freq = <20000>;
+
+ eeprom@50 {
+- compatible = "samsung,s524ad0xd1";
++ compatible = "samsung,s524ad0xd1", "atmel,24c128";
+ reg = <0x50>;
+ };
+
+@@ -289,7 +289,7 @@
+ samsung,i2c-max-bus-freq = <20000>;
+
+ eeprom@51 {
+- compatible = "samsung,s524ad0xd1";
++ compatible = "samsung,s524ad0xd1", "atmel,24c128";
+ reg = <0x51>;
+ };
+
+diff --git a/arch/arm/boot/dts/imx6dl-eckelmann-ci4x10.dts b/arch/arm/boot/dts/imx6dl-eckelmann-ci4x10.dts
+index b4a9523e325b4..864dc5018451f 100644
+--- a/arch/arm/boot/dts/imx6dl-eckelmann-ci4x10.dts
++++ b/arch/arm/boot/dts/imx6dl-eckelmann-ci4x10.dts
+@@ -297,7 +297,11 @@
+ phy-mode = "rmii";
+ phy-reset-gpios = <&gpio1 18 GPIO_ACTIVE_LOW>;
+ phy-handle = <&phy>;
+- clocks = <&clks IMX6QDL_CLK_ENET>, <&clks IMX6QDL_CLK_ENET>, <&rmii_clk>;
++ clocks = <&clks IMX6QDL_CLK_ENET>,
++ <&clks IMX6QDL_CLK_ENET>,
++ <&rmii_clk>,
++ <&clks IMX6QDL_CLK_ENET_REF>;
++ clock-names = "ipg", "ahb", "ptp", "enet_out";
+ status = "okay";
+
+ mdio {
+diff --git a/arch/arm/boot/dts/imx6qdl-colibri.dtsi b/arch/arm/boot/dts/imx6qdl-colibri.dtsi
+index 4e2a309c93fa8..1e86b38147080 100644
+--- a/arch/arm/boot/dts/imx6qdl-colibri.dtsi
++++ b/arch/arm/boot/dts/imx6qdl-colibri.dtsi
+@@ -1,6 +1,6 @@
+ // SPDX-License-Identifier: GPL-2.0+ OR MIT
+ /*
+- * Copyright 2014-2020 Toradex
++ * Copyright 2014-2022 Toradex
+ * Copyright 2012 Freescale Semiconductor, Inc.
+ * Copyright 2011 Linaro Ltd.
+ */
+@@ -132,7 +132,7 @@
+ clock-frequency = <100000>;
+ pinctrl-names = "default", "gpio";
+ pinctrl-0 = <&pinctrl_i2c2>;
+- pinctrl-0 = <&pinctrl_i2c2_gpio>;
++ pinctrl-1 = <&pinctrl_i2c2_gpio>;
+ scl-gpios = <&gpio2 30 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ sda-gpios = <&gpio3 16 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ status = "okay";
+@@ -488,7 +488,7 @@
+ >;
+ };
+
+- pinctrl_i2c2_gpio: i2c2grp {
++ pinctrl_i2c2_gpio: i2c2gpiogrp {
+ fsl,pins = <
+ MX6QDL_PAD_EIM_EB2__GPIO2_IO30 0x4001b8b1
+ MX6QDL_PAD_EIM_D16__GPIO3_IO16 0x4001b8b1
+diff --git a/arch/arm/boot/dts/lan966x.dtsi b/arch/arm/boot/dts/lan966x.dtsi
+index 7d28696480508..5e9cbc8cdcbce 100644
+--- a/arch/arm/boot/dts/lan966x.dtsi
++++ b/arch/arm/boot/dts/lan966x.dtsi
+@@ -114,9 +114,9 @@
+ compatible = "atmel,at91sam9g46-aes";
+ reg = <0xe004c000 0x100>;
+ interrupts = <GIC_SPI 53 IRQ_TYPE_LEVEL_HIGH>;
+- dmas = <&dma0 AT91_XDMAC_DT_PERID(13)>,
+- <&dma0 AT91_XDMAC_DT_PERID(12)>;
+- dma-names = "rx", "tx";
++ dmas = <&dma0 AT91_XDMAC_DT_PERID(12)>,
++ <&dma0 AT91_XDMAC_DT_PERID(13)>;
++ dma-names = "tx", "rx";
+ clocks = <&nic_clk>;
+ clock-names = "aes_clk";
+ };
+diff --git a/arch/arm/boot/dts/ox820.dtsi b/arch/arm/boot/dts/ox820.dtsi
+index 90846a7655b49..dde4364892bf0 100644
+--- a/arch/arm/boot/dts/ox820.dtsi
++++ b/arch/arm/boot/dts/ox820.dtsi
+@@ -287,7 +287,7 @@
+ clocks = <&armclk>;
+ };
+
+- gic: gic@1000 {
++ gic: interrupt-controller@1000 {
+ compatible = "arm,arm11mp-gic";
+ interrupt-controller;
+ #interrupt-cells = <3>;
+diff --git a/arch/arm/boot/dts/qcom-sdx65.dtsi b/arch/arm/boot/dts/qcom-sdx65.dtsi
+index 796641d30e06c..0c3f93603adc8 100644
+--- a/arch/arm/boot/dts/qcom-sdx65.dtsi
++++ b/arch/arm/boot/dts/qcom-sdx65.dtsi
+@@ -202,7 +202,7 @@
+ <WAKE_TCS 2>,
+ <CONTROL_TCS 1>;
+
+- rpmhcc: clock-controller@1 {
++ rpmhcc: clock-controller {
+ compatible = "qcom,sdx65-rpmh-clk";
+ #clock-cells = <1>;
+ clock-names = "xo";
+diff --git a/arch/arm/boot/dts/s5pv210-aries.dtsi b/arch/arm/boot/dts/s5pv210-aries.dtsi
+index 26f2be2d9faa2..962266c24aac4 100644
+--- a/arch/arm/boot/dts/s5pv210-aries.dtsi
++++ b/arch/arm/boot/dts/s5pv210-aries.dtsi
+@@ -564,7 +564,6 @@
+ reset-gpios = <&mp05 5 GPIO_ACTIVE_LOW>;
+ vdd3-supply = <&ldo7_reg>;
+ vci-supply = <&ldo17_reg>;
+- spi-cs-high;
+ spi-max-frequency = <1200000>;
+
+ pinctrl-names = "default";
+@@ -636,7 +635,7 @@
+ };
+
+ &i2s0 {
+- dmas = <&pdma0 9>, <&pdma0 10>, <&pdma0 11>;
++ dmas = <&pdma0 10>, <&pdma0 9>, <&pdma0 11>;
+ status = "okay";
+ };
+
+diff --git a/arch/arm/boot/dts/s5pv210.dtsi b/arch/arm/boot/dts/s5pv210.dtsi
+index 353ba7b09a0c0..c5265f3ae31d6 100644
+--- a/arch/arm/boot/dts/s5pv210.dtsi
++++ b/arch/arm/boot/dts/s5pv210.dtsi
+@@ -239,8 +239,8 @@
+ reg = <0xeee30000 0x1000>;
+ interrupt-parent = <&vic2>;
+ interrupts = <16>;
+- dma-names = "rx", "tx", "tx-sec";
+- dmas = <&pdma1 9>, <&pdma1 10>, <&pdma1 11>;
++ dma-names = "tx", "rx", "tx-sec";
++ dmas = <&pdma1 10>, <&pdma1 9>, <&pdma1 11>;
+ clock-names = "iis",
+ "i2s_opclk0",
+ "i2s_opclk1";
+@@ -259,8 +259,8 @@
+ reg = <0xe2100000 0x1000>;
+ interrupt-parent = <&vic2>;
+ interrupts = <17>;
+- dma-names = "rx", "tx";
+- dmas = <&pdma1 12>, <&pdma1 13>;
++ dma-names = "tx", "rx";
++ dmas = <&pdma1 13>, <&pdma1 12>;
+ clock-names = "iis", "i2s_opclk0";
+ clocks = <&clocks CLK_I2S1>, <&clocks SCLK_AUDIO1>;
+ pinctrl-names = "default";
+@@ -274,8 +274,8 @@
+ reg = <0xe2a00000 0x1000>;
+ interrupt-parent = <&vic2>;
+ interrupts = <18>;
+- dma-names = "rx", "tx";
+- dmas = <&pdma1 14>, <&pdma1 15>;
++ dma-names = "tx", "rx";
++ dmas = <&pdma1 15>, <&pdma1 14>;
+ clock-names = "iis", "i2s_opclk0";
+ clocks = <&clocks CLK_I2S2>, <&clocks SCLK_AUDIO2>;
+ pinctrl-names = "default";
+diff --git a/arch/arm/boot/dts/sama7g5.dtsi b/arch/arm/boot/dts/sama7g5.dtsi
+index f691c8f08d047..b632631296928 100644
+--- a/arch/arm/boot/dts/sama7g5.dtsi
++++ b/arch/arm/boot/dts/sama7g5.dtsi
+@@ -857,7 +857,6 @@
+ #interrupt-cells = <3>;
+ #address-cells = <0>;
+ interrupt-controller;
+- interrupt-parent;
+ reg = <0xe8c11000 0x1000>,
+ <0xe8c12000 0x2000>;
+ };
+diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi
+index 7c1d6423d7f8c..b8c5dd7860cb2 100644
+--- a/arch/arm/boot/dts/socfpga.dtsi
++++ b/arch/arm/boot/dts/socfpga.dtsi
+@@ -46,7 +46,7 @@
+ <0xff113000 0x1000>;
+ };
+
+- intc: intc@fffed000 {
++ intc: interrupt-controller@fffed000 {
+ compatible = "arm,cortex-a9-gic";
+ #interrupt-cells = <3>;
+ interrupt-controller;
+diff --git a/arch/arm/boot/dts/socfpga_arria10.dtsi b/arch/arm/boot/dts/socfpga_arria10.dtsi
+index 3ba431dfa8c94..f1e50d2e623a3 100644
+--- a/arch/arm/boot/dts/socfpga_arria10.dtsi
++++ b/arch/arm/boot/dts/socfpga_arria10.dtsi
+@@ -38,7 +38,7 @@
+ <0xff113000 0x1000>;
+ };
+
+- intc: intc@ffffd000 {
++ intc: interrupt-controller@ffffd000 {
+ compatible = "arm,cortex-a9-gic";
+ #interrupt-cells = <3>;
+ interrupt-controller;
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
+index 61e17f44ce815..76c54b006d871 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
+@@ -141,6 +141,7 @@
+ compatible = "snps,dwmac-mdio";
+ reset-gpios = <&gpioz 2 GPIO_ACTIVE_LOW>;
+ reset-delay-us = <1000>;
++ reset-post-delay-us = <1000>;
+
+ phy0: ethernet-phy@7 {
+ reg = <7>;
+diff --git a/arch/arm/boot/dts/suniv-f1c100s.dtsi b/arch/arm/boot/dts/suniv-f1c100s.dtsi
+index 6100d3b75f613..def8301014487 100644
+--- a/arch/arm/boot/dts/suniv-f1c100s.dtsi
++++ b/arch/arm/boot/dts/suniv-f1c100s.dtsi
+@@ -104,8 +104,10 @@
+
+ wdt: watchdog@1c20ca0 {
+ compatible = "allwinner,suniv-f1c100s-wdt",
+- "allwinner,sun4i-a10-wdt";
++ "allwinner,sun6i-a31-wdt";
+ reg = <0x01c20ca0 0x20>;
++ interrupts = <16>;
++ clocks = <&osc32k>;
+ };
+
+ uart0: serial@1c25000 {
+diff --git a/arch/arm/include/asm/arch_gicv3.h b/arch/arm/include/asm/arch_gicv3.h
+index 413abfb42989e..f82a819eb0dbb 100644
+--- a/arch/arm/include/asm/arch_gicv3.h
++++ b/arch/arm/include/asm/arch_gicv3.h
+@@ -48,6 +48,7 @@ static inline u32 read_ ## a64(void) \
+ return read_sysreg(a32); \
+ } \
+
++CPUIF_MAP(ICC_EOIR1, ICC_EOIR1_EL1)
+ CPUIF_MAP(ICC_PMR, ICC_PMR_EL1)
+ CPUIF_MAP(ICC_AP0R0, ICC_AP0R0_EL1)
+ CPUIF_MAP(ICC_AP0R1, ICC_AP0R1_EL1)
+@@ -63,12 +64,6 @@ CPUIF_MAP(ICC_AP1R3, ICC_AP1R3_EL1)
+
+ /* Low-level accessors */
+
+-static inline void gic_write_eoir(u32 irq)
+-{
+- write_sysreg(irq, ICC_EOIR1);
+- isb();
+-}
+-
+ static inline void gic_write_dir(u32 val)
+ {
+ write_sysreg(val, ICC_DIR);
+diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
+index 459abc5d18195..ea128e32e8ca8 100644
+--- a/arch/arm/kernel/signal.c
++++ b/arch/arm/kernel/signal.c
+@@ -708,6 +708,7 @@ static_assert(offsetof(siginfo_t, si_upper) == 0x18);
+ static_assert(offsetof(siginfo_t, si_pkey) == 0x14);
+ static_assert(offsetof(siginfo_t, si_perf_data) == 0x10);
+ static_assert(offsetof(siginfo_t, si_perf_type) == 0x14);
++static_assert(offsetof(siginfo_t, si_perf_flags) == 0x18);
+ static_assert(offsetof(siginfo_t, si_band) == 0x0c);
+ static_assert(offsetof(siginfo_t, si_fd) == 0x10);
+ static_assert(offsetof(siginfo_t, si_call_addr) == 0x0c);
+diff --git a/arch/arm/mach-hisi/platsmp.c b/arch/arm/mach-hisi/platsmp.c
+index a56cc64deeb8f..9ce93e0b6cdc3 100644
+--- a/arch/arm/mach-hisi/platsmp.c
++++ b/arch/arm/mach-hisi/platsmp.c
+@@ -67,14 +67,17 @@ static void __init hi3xxx_smp_prepare_cpus(unsigned int max_cpus)
+ }
+ ctrl_base = of_iomap(np, 0);
+ if (!ctrl_base) {
++ of_node_put(np);
+ pr_err("failed to map address\n");
+ return;
+ }
+ if (of_property_read_u32(np, "smp-offset", &offset) < 0) {
++ of_node_put(np);
+ pr_err("failed to find smp-offset property\n");
+ return;
+ }
+ ctrl_base += offset;
++ of_node_put(np);
+ }
+ }
+
+@@ -160,6 +163,7 @@ static int hip01_boot_secondary(unsigned int cpu, struct task_struct *idle)
+ if (WARN_ON(!node))
+ return -1;
+ ctrl_base = of_iomap(node, 0);
++ of_node_put(node);
+
+ /* set the secondary core boot from DDR */
+ remap_reg_value = readl_relaxed(ctrl_base + REG_SC_CTRL);
+diff --git a/arch/arm/mach-mediatek/Kconfig b/arch/arm/mach-mediatek/Kconfig
+index 9e0f592d87d8e..35a3430c7942d 100644
+--- a/arch/arm/mach-mediatek/Kconfig
++++ b/arch/arm/mach-mediatek/Kconfig
+@@ -30,6 +30,7 @@ config MACH_MT7623
+ config MACH_MT7629
+ bool "MediaTek MT7629 SoCs support"
+ default ARCH_MEDIATEK
++ select HAVE_ARM_ARCH_TIMER
+
+ config MACH_MT8127
+ bool "MediaTek MT8127 SoCs support"
+diff --git a/arch/arm/mach-omap1/clock.c b/arch/arm/mach-omap1/clock.c
+index 9d4a0ab50a468..d63d5eb8d8fdf 100644
+--- a/arch/arm/mach-omap1/clock.c
++++ b/arch/arm/mach-omap1/clock.c
+@@ -41,7 +41,7 @@ static DEFINE_SPINLOCK(clockfw_lock);
+ unsigned long omap1_uart_recalc(struct clk *clk)
+ {
+ unsigned int val = __raw_readl(clk->enable_reg);
+- return val & clk->enable_bit ? 48000000 : 12000000;
++ return val & 1 << clk->enable_bit ? 48000000 : 12000000;
+ }
+
+ unsigned long omap1_sossi_recalc(struct clk *clk)
+diff --git a/arch/arm/mach-pxa/cm-x300.c b/arch/arm/mach-pxa/cm-x300.c
+index 2e35354b61f56..167e871f059ef 100644
+--- a/arch/arm/mach-pxa/cm-x300.c
++++ b/arch/arm/mach-pxa/cm-x300.c
+@@ -354,13 +354,13 @@ static struct platform_device cm_x300_spi_gpio = {
+ static struct gpiod_lookup_table cm_x300_spi_gpiod_table = {
+ .dev_id = "spi_gpio",
+ .table = {
+- GPIO_LOOKUP("gpio-pxa", GPIO_LCD_SCL,
++ GPIO_LOOKUP("pca9555.1", GPIO_LCD_SCL - GPIO_LCD_BASE,
+ "sck", GPIO_ACTIVE_HIGH),
+- GPIO_LOOKUP("gpio-pxa", GPIO_LCD_DIN,
++ GPIO_LOOKUP("pca9555.1", GPIO_LCD_DIN - GPIO_LCD_BASE,
+ "mosi", GPIO_ACTIVE_HIGH),
+- GPIO_LOOKUP("gpio-pxa", GPIO_LCD_DOUT,
++ GPIO_LOOKUP("pca9555.1", GPIO_LCD_DOUT - GPIO_LCD_BASE,
+ "miso", GPIO_ACTIVE_HIGH),
+- GPIO_LOOKUP("gpio-pxa", GPIO_LCD_CS,
++ GPIO_LOOKUP("pca9555.1", GPIO_LCD_CS - GPIO_LCD_BASE,
+ "cs", GPIO_ACTIVE_HIGH),
+ { },
+ },
+diff --git a/arch/arm/mach-pxa/magician.c b/arch/arm/mach-pxa/magician.c
+index 200fd35168e05..fcced6499faee 100644
+--- a/arch/arm/mach-pxa/magician.c
++++ b/arch/arm/mach-pxa/magician.c
+@@ -681,7 +681,7 @@ static struct platform_device bq24022 = {
+ static struct gpiod_lookup_table bq24022_gpiod_table = {
+ .dev_id = "gpio-regulator",
+ .table = {
+- GPIO_LOOKUP("gpio-pxa", EGPIO_MAGICIAN_BQ24022_ISET2,
++ GPIO_LOOKUP("htc-egpio-0", EGPIO_MAGICIAN_BQ24022_ISET2 - MAGICIAN_EGPIO_BASE,
+ NULL, GPIO_ACTIVE_HIGH),
+ GPIO_LOOKUP("gpio-pxa", GPIO30_MAGICIAN_BQ24022_nCHARGE_EN,
+ "enable", GPIO_ACTIVE_LOW),
+diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c
+index 431709725d02b..ded5e343e1984 100644
+--- a/arch/arm/mach-pxa/tosa.c
++++ b/arch/arm/mach-pxa/tosa.c
+@@ -296,9 +296,9 @@ static struct gpiod_lookup_table tosa_mci_gpio_table = {
+ .table = {
+ GPIO_LOOKUP("gpio-pxa", TOSA_GPIO_nSD_DETECT,
+ "cd", GPIO_ACTIVE_LOW),
+- GPIO_LOOKUP("gpio-pxa", TOSA_GPIO_SD_WP,
++ GPIO_LOOKUP("sharp-scoop.0", TOSA_GPIO_SD_WP - TOSA_SCOOP_GPIO_BASE,
+ "wp", GPIO_ACTIVE_LOW),
+- GPIO_LOOKUP("gpio-pxa", TOSA_GPIO_PWR_ON,
++ GPIO_LOOKUP("sharp-scoop.0", TOSA_GPIO_PWR_ON - TOSA_SCOOP_GPIO_BASE,
+ "power", GPIO_ACTIVE_HIGH),
+ { },
+ },
+diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
+index a0554d7d04f7c..e1adc098f89ac 100644
+--- a/arch/arm/mach-vexpress/dcscb.c
++++ b/arch/arm/mach-vexpress/dcscb.c
+@@ -144,6 +144,7 @@ static int __init dcscb_init(void)
+ if (!node)
+ return -ENODEV;
+ dcscb_base = of_iomap(node, 0);
++ of_node_put(node);
+ if (!dcscb_base)
+ return -EADDRNOTAVAIL;
+ cfg = readl_relaxed(dcscb_base + DCS_CFG_R);
+diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
+index 30b123cde02c5..aaeaf57c82225 100644
+--- a/arch/arm64/Kconfig.platforms
++++ b/arch/arm64/Kconfig.platforms
+@@ -253,6 +253,7 @@ config ARCH_INTEL_SOCFPGA
+
+ config ARCH_SYNQUACER
+ bool "Socionext SynQuacer SoC Family"
++ select IRQ_FASTEOI_HIERARCHY_HANDLERS
+
+ config ARCH_TEGRA
+ bool "NVIDIA Tegra SoC Family"
+diff --git a/arch/arm64/boot/dts/arm/juno-r1-scmi.dts b/arch/arm64/boot/dts/arm/juno-r1-scmi.dts
+index 190a0fba4ad68..fd1f0d26d751a 100644
+--- a/arch/arm64/boot/dts/arm/juno-r1-scmi.dts
++++ b/arch/arm64/boot/dts/arm/juno-r1-scmi.dts
+@@ -7,11 +7,11 @@
+ };
+
+ etf@20140000 {
+- power-domains = <&scmi_devpd 0>;
++ power-domains = <&scmi_devpd 8>;
+ };
+
+ funnel@20150000 {
+- power-domains = <&scmi_devpd 0>;
++ power-domains = <&scmi_devpd 8>;
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/arm/juno-r2-scmi.dts b/arch/arm64/boot/dts/arm/juno-r2-scmi.dts
+index dbf13770084f5..35e6d4762c463 100644
+--- a/arch/arm64/boot/dts/arm/juno-r2-scmi.dts
++++ b/arch/arm64/boot/dts/arm/juno-r2-scmi.dts
+@@ -7,11 +7,11 @@
+ };
+
+ etf@20140000 {
+- power-domains = <&scmi_devpd 0>;
++ power-domains = <&scmi_devpd 8>;
+ };
+
+ funnel@20150000 {
+- power-domains = <&scmi_devpd 0>;
++ power-domains = <&scmi_devpd 8>;
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/marvell/armada-3720-espressobin-ultra.dts b/arch/arm64/boot/dts/marvell/armada-3720-espressobin-ultra.dts
+index c5eb3604dd5b7..119db6b541b7b 100644
+--- a/arch/arm64/boot/dts/marvell/armada-3720-espressobin-ultra.dts
++++ b/arch/arm64/boot/dts/marvell/armada-3720-espressobin-ultra.dts
+@@ -71,10 +71,6 @@
+
+ &spi0 {
+ flash@0 {
+- spi-max-frequency = <108000000>;
+- spi-rx-bus-width = <4>;
+- spi-tx-bus-width = <4>;
+-
+ partitions {
+ compatible = "fixed-partitions";
+ #address-cells = <1>;
+@@ -112,7 +108,6 @@
+
+ &usb3 {
+ usb-phy = <&usb3_phy>;
+- status = "disabled";
+ };
+
+ &mdio {
+diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+index 411feb2946134..bcecc74844535 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+@@ -679,7 +679,7 @@
+ assigned-clock-parents = <&clk26m>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+- status = "disable";
++ status = "disabled";
+ };
+
+ audsys: clock-controller@11210000 {
+diff --git a/arch/arm64/boot/dts/nvidia/tegra210.dtsi b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+index 218a2b32200f8..4f0e51f1a3430 100644
+--- a/arch/arm64/boot/dts/nvidia/tegra210.dtsi
++++ b/arch/arm64/boot/dts/nvidia/tegra210.dtsi
+@@ -1366,8 +1366,9 @@
+ <&tegra_car TEGRA210_CLK_DFLL_REF>,
+ <&tegra_car TEGRA210_CLK_I2C5>;
+ clock-names = "soc", "ref", "i2c";
+- resets = <&tegra_car TEGRA210_RST_DFLL_DVCO>;
+- reset-names = "dvco";
++ resets = <&tegra_car TEGRA210_RST_DFLL_DVCO>,
++ <&tegra_car 155>;
++ reset-names = "dvco", "dfll";
+ #clock-cells = <0>;
+ clock-output-names = "dfllCPU_out";
+ status = "disabled";
+diff --git a/arch/arm64/boot/dts/qcom/ipq8074.dtsi b/arch/arm64/boot/dts/qcom/ipq8074.dtsi
+index d80b1cefab100..8d4e0e1934396 100644
+--- a/arch/arm64/boot/dts/qcom/ipq8074.dtsi
++++ b/arch/arm64/boot/dts/qcom/ipq8074.dtsi
+@@ -13,7 +13,7 @@
+ clocks {
+ sleep_clk: sleep_clk {
+ compatible = "fixed-clock";
+- clock-frequency = <32000>;
++ clock-frequency = <32768>;
+ #clock-cells = <0>;
+ };
+
+diff --git a/arch/arm64/boot/dts/qcom/msm8994.dtsi b/arch/arm64/boot/dts/qcom/msm8994.dtsi
+index 8c1dc5155b713..b1e595cb4b901 100644
+--- a/arch/arm64/boot/dts/qcom/msm8994.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8994.dtsi
+@@ -183,8 +183,8 @@
+ no-map;
+ };
+
+- cont_splash_mem: memory@3800000 {
+- reg = <0 0x03800000 0 0x2400000>;
++ cont_splash_mem: memory@3401000 {
++ reg = <0 0x03401000 0 0x2200000>;
+ no-map;
+ };
+
+@@ -498,7 +498,7 @@
+ #dma-cells = <1>;
+ qcom,ee = <0>;
+ qcom,controlled-remotely;
+- num-channels = <18>;
++ num-channels = <24>;
+ qcom,num-ees = <4>;
+ };
+
+@@ -634,7 +634,7 @@
+ #dma-cells = <1>;
+ qcom,ee = <0>;
+ qcom,controlled-remotely;
+- num-channels = <18>;
++ num-channels = <24>;
+ qcom,num-ees = <4>;
+ };
+
+diff --git a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
+index 845eb7a6bf92e..0e63f707b9115 100644
+--- a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
++++ b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
+@@ -29,7 +29,7 @@
+ };
+
+ /* Fixed crystal oscillator dedicated to MCP2518FD */
+- clk40M: can_clock {
++ clk40M: can-clock {
+ compatible = "fixed-clock";
+ #clock-cells = <0>;
+ clock-frequency = <40000000>;
+diff --git a/arch/arm64/boot/dts/qcom/sc7280-herobrine.dtsi b/arch/arm64/boot/dts/qcom/sc7280-herobrine.dtsi
+index dc17f2079695f..488caa48cba3a 100644
+--- a/arch/arm64/boot/dts/qcom/sc7280-herobrine.dtsi
++++ b/arch/arm64/boot/dts/qcom/sc7280-herobrine.dtsi
+@@ -677,7 +677,6 @@ ap_ec_spi: &spi10 {
+ function = "gpio";
+ bias-disable;
+ drive-strength = <2>;
+- output-high;
+ };
+
+ fp_to_ap_irq_l: fp-to-ap-irq-l {
+@@ -691,7 +690,6 @@ ap_ec_spi: &spi10 {
+ pins = "gpio68";
+ function = "gpio";
+ bias-disable;
+- output-low;
+ };
+
+ gsc_ap_int_odl: gsc-ap-int-odl {
+@@ -741,7 +739,7 @@ ap_ec_spi: &spi10 {
+ bias-pull-up;
+ };
+
+- sar1_irq_odl: sar0-irq-odl {
++ sar1_irq_odl: sar1-irq-odl {
+ pins = "gpio140";
+ function = "gpio";
+ bias-pull-up;
+diff --git a/arch/arm64/boot/dts/qcom/sc7280-idp.dtsi b/arch/arm64/boot/dts/qcom/sc7280-idp.dtsi
+index ecbf2b89d8963..5ab3696af3548 100644
+--- a/arch/arm64/boot/dts/qcom/sc7280-idp.dtsi
++++ b/arch/arm64/boot/dts/qcom/sc7280-idp.dtsi
+@@ -400,10 +400,13 @@
+
+ &qup_uart7_cts {
+ /*
+- * Configure a pull-down on CTS to match the pull of
+- * the Bluetooth module.
++ * Configure a bias-bus-hold on CTS to lower power
++ * usage when Bluetooth is turned off. Bus hold will
++ * maintain a low power state regardless of whether
++ * the Bluetooth module drives the pin in either
++ * direction or leaves the pin fully unpowered.
+ */
+- bias-pull-down;
++ bias-bus-hold;
+ };
+
+ &qup_uart7_rts {
+@@ -495,10 +498,13 @@
+ pins = "gpio28";
+ function = "gpio";
+ /*
+- * Configure a pull-down on CTS to match the pull of
+- * the Bluetooth module.
++ * Configure a bias-bus-hold on CTS to lower power
++ * usage when Bluetooth is turned off. Bus hold will
++ * maintain a low power state regardless of whether
++ * the Bluetooth module drives the pin in either
++ * direction or leaves the pin fully unpowered.
+ */
+- bias-pull-down;
++ bias-bus-hold;
+ };
+
+ qup_uart7_sleep_rts: qup-uart7-sleep-rts {
+diff --git a/arch/arm64/boot/dts/qcom/sc7280-qcard.dtsi b/arch/arm64/boot/dts/qcom/sc7280-qcard.dtsi
+index b833ba1e8f4af..98b5cd70bca52 100644
+--- a/arch/arm64/boot/dts/qcom/sc7280-qcard.dtsi
++++ b/arch/arm64/boot/dts/qcom/sc7280-qcard.dtsi
+@@ -398,8 +398,14 @@ mos_bt_uart: &uart7 {
+
+ /* For mos_bt_uart */
+ &qup_uart7_cts {
+- /* Configure a pull-down on CTS to match the pull of the Bluetooth module. */
+- bias-pull-down;
++ /*
++ * Configure a bias-bus-hold on CTS to lower power
++ * usage when Bluetooth is turned off. Bus hold will
++ * maintain a low power state regardless of whether
++ * the Bluetooth module drives the pin in either
++ * direction or leaves the pin fully unpowered.
++ */
++ bias-bus-hold;
+ };
+
+ /* For mos_bt_uart */
+@@ -490,10 +496,13 @@ mos_bt_uart: &uart7 {
+ pins = "gpio28";
+ function = "gpio";
+ /*
+- * Configure a pull-down on CTS to match the pull of
+- * the Bluetooth module.
++ * Configure a bias-bus-hold on CTS to lower power
++ * usage when Bluetooth is turned off. Bus hold will
++ * maintain a low power state regardless of whether
++ * the Bluetooth module drives the pin in either
++ * direction or leaves the pin fully unpowered.
+ */
+- bias-pull-down;
++ bias-bus-hold;
+ };
+
+ /* For mos_bt_uart */
+diff --git a/arch/arm64/boot/dts/qcom/sdm845-xiaomi-beryllium.dts b/arch/arm64/boot/dts/qcom/sdm845-xiaomi-beryllium.dts
+index 367389526b418..a97f5e89e1d0f 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845-xiaomi-beryllium.dts
++++ b/arch/arm64/boot/dts/qcom/sdm845-xiaomi-beryllium.dts
+@@ -218,7 +218,7 @@
+ panel@0 {
+ compatible = "tianma,fhd-video";
+ reg = <0>;
+- vddi0-supply = <&vreg_l14a_1p8>;
++ vddio-supply = <&vreg_l14a_1p8>;
+ vddpos-supply = <&lab>;
+ vddneg-supply = <&ibb>;
+
+diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
+index 934e29b9e153b..e63b7b0458cf8 100644
+--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
+@@ -693,6 +693,9 @@
+ clock-names = "m-ahb", "s-ahb";
+ clocks = <&gcc GCC_QUPV3_WRAP_0_M_AHB_CLK>,
+ <&gcc GCC_QUPV3_WRAP_0_S_AHB_CLK>;
++ iommus = <&apps_smmu 0x5a3 0x0>;
++ interconnects = <&clk_virt MASTER_QUP_CORE_0 0 &clk_virt SLAVE_QUP_CORE_0 0>;
++ interconnect-names = "qup-core";
+ #address-cells = <2>;
+ #size-cells = <2>;
+ ranges;
+@@ -718,6 +721,9 @@
+ clock-names = "m-ahb", "s-ahb";
+ clocks = <&gcc GCC_QUPV3_WRAP_1_M_AHB_CLK>,
+ <&gcc GCC_QUPV3_WRAP_1_S_AHB_CLK>;
++ iommus = <&apps_smmu 0x43 0x0>;
++ interconnects = <&clk_virt MASTER_QUP_CORE_1 0 &clk_virt SLAVE_QUP_CORE_1 0>;
++ interconnect-names = "qup-core";
+ #address-cells = <2>;
+ #size-cells = <2>;
+ ranges;
+diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+index 080457a68e3c7..88f26d89eea1a 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
++++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+@@ -1534,6 +1534,7 @@
+ reg = <0xf780 0x24>;
+ clocks = <&sdhci>;
+ clock-names = "emmcclk";
++ drive-impedance-ohm = <50>;
+ #phy-cells = <0>;
+ status = "disabled";
+ };
+@@ -1544,7 +1545,6 @@
+ clock-names = "refclk";
+ #phy-cells = <1>;
+ resets = <&cru SRST_PCIEPHY>;
+- drive-impedance-ohm = <50>;
+ reset-names = "phy";
+ status = "disabled";
+ };
+diff --git a/arch/arm64/boot/dts/ti/k3-am64-mcu.dtsi b/arch/arm64/boot/dts/ti/k3-am64-mcu.dtsi
+index 2bb5c9ff172c9..02d4285acbb8d 100644
+--- a/arch/arm64/boot/dts/ti/k3-am64-mcu.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am64-mcu.dtsi
+@@ -10,7 +10,6 @@
+ compatible = "ti,am64-uart", "ti,am654-uart";
+ reg = <0x00 0x04a00000 0x00 0x100>;
+ interrupts = <GIC_SPI 185 IRQ_TYPE_LEVEL_HIGH>;
+- clock-frequency = <48000000>;
+ current-speed = <115200>;
+ power-domains = <&k3_pds 149 TI_SCI_PD_EXCLUSIVE>;
+ clocks = <&k3_clks 149 0>;
+@@ -21,7 +20,6 @@
+ compatible = "ti,am64-uart", "ti,am654-uart";
+ reg = <0x00 0x04a10000 0x00 0x100>;
+ interrupts = <GIC_SPI 186 IRQ_TYPE_LEVEL_HIGH>;
+- clock-frequency = <48000000>;
+ current-speed = <115200>;
+ power-domains = <&k3_pds 160 TI_SCI_PD_EXCLUSIVE>;
+ clocks = <&k3_clks 160 0>;
+diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
+index 50aa3d75ab4f4..f30af6e1fe401 100644
+--- a/arch/arm64/configs/defconfig
++++ b/arch/arm64/configs/defconfig
+@@ -1029,6 +1029,7 @@ CONFIG_SM_GCC_8350=y
+ CONFIG_SM_GCC_8450=y
+ CONFIG_SM_GPUCC_8150=y
+ CONFIG_SM_GPUCC_8250=y
++CONFIG_SM_DISPCC_8250=y
+ CONFIG_QCOM_HFPLL=y
+ CONFIG_CLK_GFM_LPASS_SM8250=m
+ CONFIG_CLK_RCAR_USB2_CLOCK_SEL=y
+diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
+index 8bd5afc7b692e..48d4473e8eee2 100644
+--- a/arch/arm64/include/asm/arch_gicv3.h
++++ b/arch/arm64/include/asm/arch_gicv3.h
+@@ -26,12 +26,6 @@
+ * sets the GP register's most significant bits to 0 with an explicit cast.
+ */
+
+-static inline void gic_write_eoir(u32 irq)
+-{
+- write_sysreg_s(irq, SYS_ICC_EOIR1_EL1);
+- isb();
+-}
+-
+ static __always_inline void gic_write_dir(u32 irq)
+ {
+ write_sysreg_s(irq, SYS_ICC_DIR_EL1);
+diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
+index 73e38d9a540ce..6b1a12c23fe77 100644
+--- a/arch/arm64/include/asm/processor.h
++++ b/arch/arm64/include/asm/processor.h
+@@ -381,12 +381,10 @@ long get_tagged_addr_ctrl(struct task_struct *task);
+ * of header definitions for the use of task_stack_page.
+ */
+
+-#define current_top_of_stack() \
+-({ \
+- struct stack_info _info; \
+- BUG_ON(!on_accessible_stack(current, current_stack_pointer, 1, &_info)); \
+- _info.high; \
+-})
++/*
++ * The top of the current task's task stack
++ */
++#define current_top_of_stack() ((unsigned long)current->stack + THREAD_SIZE)
+ #define on_thread_stack() (on_task_stack(current, current_stack_pointer, 1, NULL))
+
+ #endif /* __ASSEMBLY__ */
+diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
+index 4a4122ef6f39b..41b5d9d3672ab 100644
+--- a/arch/arm64/kernel/signal.c
++++ b/arch/arm64/kernel/signal.c
+@@ -1011,6 +1011,7 @@ static_assert(offsetof(siginfo_t, si_upper) == 0x28);
+ static_assert(offsetof(siginfo_t, si_pkey) == 0x20);
+ static_assert(offsetof(siginfo_t, si_perf_data) == 0x18);
+ static_assert(offsetof(siginfo_t, si_perf_type) == 0x20);
++static_assert(offsetof(siginfo_t, si_perf_flags) == 0x24);
+ static_assert(offsetof(siginfo_t, si_band) == 0x10);
+ static_assert(offsetof(siginfo_t, si_fd) == 0x18);
+ static_assert(offsetof(siginfo_t, si_call_addr) == 0x10);
+diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
+index d984282b979f8..4700f8522d27b 100644
+--- a/arch/arm64/kernel/signal32.c
++++ b/arch/arm64/kernel/signal32.c
+@@ -487,6 +487,7 @@ static_assert(offsetof(compat_siginfo_t, si_upper) == 0x18);
+ static_assert(offsetof(compat_siginfo_t, si_pkey) == 0x14);
+ static_assert(offsetof(compat_siginfo_t, si_perf_data) == 0x10);
+ static_assert(offsetof(compat_siginfo_t, si_perf_type) == 0x14);
++static_assert(offsetof(compat_siginfo_t, si_perf_flags) == 0x18);
+ static_assert(offsetof(compat_siginfo_t, si_band) == 0x0c);
+ static_assert(offsetof(compat_siginfo_t, si_fd) == 0x10);
+ static_assert(offsetof(compat_siginfo_t, si_call_addr) == 0x0c);
+diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
+index 12c6864e51e13..df14336c3a29c 100644
+--- a/arch/arm64/kernel/sys_compat.c
++++ b/arch/arm64/kernel/sys_compat.c
+@@ -113,6 +113,6 @@ long compat_arm_syscall(struct pt_regs *regs, int scno)
+ addr = instruction_pointer(regs) - (compat_thumb_mode(regs) ? 2 : 4);
+
+ arm64_notify_die("Oops - bad compat syscall(2)", regs,
+- SIGILL, ILL_ILLTRP, addr, scno);
++ SIGILL, ILL_ILLTRP, addr, 0);
+ return 0;
+ }
+diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
+index b5447e53cd73e..0dea80bf6de46 100644
+--- a/arch/arm64/mm/copypage.c
++++ b/arch/arm64/mm/copypage.c
+@@ -16,8 +16,8 @@
+
+ void copy_highpage(struct page *to, struct page *from)
+ {
+- struct page *kto = page_address(to);
+- struct page *kfrom = page_address(from);
++ void *kto = page_address(to);
++ void *kfrom = page_address(from);
+
+ copy_page(kto, kfrom);
+
+diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
+index 42920f25e73c8..34ba684d5962b 100644
+--- a/arch/csky/kernel/probes/kprobes.c
++++ b/arch/csky/kernel/probes/kprobes.c
+@@ -30,7 +30,7 @@ static int __kprobes patch_text_cb(void *priv)
+ struct csky_insn_patch *param = priv;
+ unsigned int addr = (unsigned int)param->addr;
+
+- if (atomic_inc_return(¶m->cpu_count) == 1) {
++ if (atomic_inc_return(¶m->cpu_count) == num_online_cpus()) {
+ *(u16 *) addr = cpu_to_le16(param->opcode);
+ dcache_wb_range(addr, addr + 2);
+ atomic_inc(¶m->cpu_count);
+diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
+index 16ea9a67723c0..3d5da25c73b5a 100644
+--- a/arch/m68k/Kconfig.cpu
++++ b/arch/m68k/Kconfig.cpu
+@@ -327,7 +327,7 @@ comment "Processor Specific Options"
+
+ config M68KFPU_EMU
+ bool "Math emulation support"
+- depends on MMU
++ depends on M68KCLASSIC && FPU
+ help
+ At some point in the future, this will cause floating-point math
+ instructions to be emulated by the kernel on machines that lack a
+diff --git a/arch/m68k/include/asm/raw_io.h b/arch/m68k/include/asm/raw_io.h
+index 80eb2396d01eb..3ba40bc1dfaa9 100644
+--- a/arch/m68k/include/asm/raw_io.h
++++ b/arch/m68k/include/asm/raw_io.h
+@@ -80,14 +80,14 @@
+ ({ u16 __v = le16_to_cpu(*(__force volatile u16 *) (addr)); __v; })
+
+ #define rom_out_8(addr, b) \
+- ({u8 __maybe_unused __w, __v = (b); u32 _addr = ((u32) (addr)); \
++ (void)({u8 __maybe_unused __w, __v = (b); u32 _addr = ((u32) (addr)); \
+ __w = ((*(__force volatile u8 *) ((_addr | 0x10000) + (__v<<1)))); })
+ #define rom_out_be16(addr, w) \
+- ({u16 __maybe_unused __w, __v = (w); u32 _addr = ((u32) (addr)); \
++ (void)({u16 __maybe_unused __w, __v = (w); u32 _addr = ((u32) (addr)); \
+ __w = ((*(__force volatile u16 *) ((_addr & 0xFFFF0000UL) + ((__v & 0xFF)<<1)))); \
+ __w = ((*(__force volatile u16 *) ((_addr | 0x10000) + ((__v >> 8)<<1)))); })
+ #define rom_out_le16(addr, w) \
+- ({u16 __maybe_unused __w, __v = (w); u32 _addr = ((u32) (addr)); \
++ (void)({u16 __maybe_unused __w, __v = (w); u32 _addr = ((u32) (addr)); \
+ __w = ((*(__force volatile u16 *) ((_addr & 0xFFFF0000UL) + ((__v >> 8)<<1)))); \
+ __w = ((*(__force volatile u16 *) ((_addr | 0x10000) + ((__v & 0xFF)<<1)))); })
+
+diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
+index 49533f65958a6..b9f6908a31bc3 100644
+--- a/arch/m68k/kernel/signal.c
++++ b/arch/m68k/kernel/signal.c
+@@ -625,6 +625,7 @@ static inline void siginfo_build_tests(void)
+ /* _sigfault._perf */
+ BUILD_BUG_ON(offsetof(siginfo_t, si_perf_data) != 0x10);
+ BUILD_BUG_ON(offsetof(siginfo_t, si_perf_type) != 0x14);
++ BUILD_BUG_ON(offsetof(siginfo_t, si_perf_flags) != 0x18);
+
+ /* _sigpoll */
+ BUILD_BUG_ON(offsetof(siginfo_t, si_band) != 0x0c);
+diff --git a/arch/mips/include/asm/mach-ip27/cpu-feature-overrides.h b/arch/mips/include/asm/mach-ip27/cpu-feature-overrides.h
+index c8385c4e8664a..568fe09332eb4 100644
+--- a/arch/mips/include/asm/mach-ip27/cpu-feature-overrides.h
++++ b/arch/mips/include/asm/mach-ip27/cpu-feature-overrides.h
+@@ -25,7 +25,6 @@
+ #define cpu_has_4kex 1
+ #define cpu_has_3k_cache 0
+ #define cpu_has_4k_cache 1
+-#define cpu_has_fpu 1
+ #define cpu_has_nofpuex 0
+ #define cpu_has_32fpr 1
+ #define cpu_has_counter 1
+diff --git a/arch/mips/include/asm/mach-ip30/cpu-feature-overrides.h b/arch/mips/include/asm/mach-ip30/cpu-feature-overrides.h
+index 8ad0c424a9afb..ce4e4c6e09e24 100644
+--- a/arch/mips/include/asm/mach-ip30/cpu-feature-overrides.h
++++ b/arch/mips/include/asm/mach-ip30/cpu-feature-overrides.h
+@@ -28,7 +28,6 @@
+ #define cpu_has_4kex 1
+ #define cpu_has_3k_cache 0
+ #define cpu_has_4k_cache 1
+-#define cpu_has_fpu 1
+ #define cpu_has_nofpuex 0
+ #define cpu_has_32fpr 1
+ #define cpu_has_counter 1
+diff --git a/arch/mips/include/asm/mach-ralink/spaces.h b/arch/mips/include/asm/mach-ralink/spaces.h
+index f7af11ea2d612..a9f0570d0f044 100644
+--- a/arch/mips/include/asm/mach-ralink/spaces.h
++++ b/arch/mips/include/asm/mach-ralink/spaces.h
+@@ -6,7 +6,9 @@
+ #define PCI_IOSIZE SZ_64K
+ #define IO_SPACE_LIMIT (PCI_IOSIZE - 1)
+
++#ifdef CONFIG_PCI_DRIVERS_GENERIC
+ #define pci_remap_iospace pci_remap_iospace
++#endif
+
+ #include <asm/mach-generic/spaces.h>
+ #endif
+diff --git a/arch/openrisc/include/asm/timex.h b/arch/openrisc/include/asm/timex.h
+index d52b4e536e3f9..5487fa93dd9be 100644
+--- a/arch/openrisc/include/asm/timex.h
++++ b/arch/openrisc/include/asm/timex.h
+@@ -23,6 +23,7 @@ static inline cycles_t get_cycles(void)
+ {
+ return mfspr(SPR_TTCR);
+ }
++#define get_cycles get_cycles
+
+ /* This isn't really used any more */
+ #define CLOCK_TICK_RATE 1000
+diff --git a/arch/openrisc/kernel/head.S b/arch/openrisc/kernel/head.S
+index 15f1b38dfe03b..871f4c8588595 100644
+--- a/arch/openrisc/kernel/head.S
++++ b/arch/openrisc/kernel/head.S
+@@ -521,6 +521,15 @@ _start:
+ l.ori r3,r0,0x1
+ l.mtspr r0,r3,SPR_SR
+
++ /*
++ * Start the TTCR as early as possible, so that the RNG can make use of
++ * measurements of boot time from the earliest opportunity. Especially
++ * important is that the TTCR does not return zero by the time we reach
++ * rand_initialize().
++ */
++ l.movhi r3,hi(SPR_TTMR_CR)
++ l.mtspr r0,r3,SPR_TTMR
++
+ CLEAR_GPR(r1)
+ CLEAR_GPR(r2)
+ CLEAR_GPR(r3)
+diff --git a/arch/parisc/include/asm/fb.h b/arch/parisc/include/asm/fb.h
+index c4cd6360f9964..d63a2acb91f2b 100644
+--- a/arch/parisc/include/asm/fb.h
++++ b/arch/parisc/include/asm/fb.h
+@@ -12,9 +12,13 @@ static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
+ pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE;
+ }
+
++#if defined(CONFIG_STI_CONSOLE) || defined(CONFIG_FB_STI)
++int fb_is_primary_device(struct fb_info *info);
++#else
+ static inline int fb_is_primary_device(struct fb_info *info)
+ {
+ return 0;
+ }
++#endif
+
+ #endif /* _ASM_FB_H_ */
+diff --git a/arch/parisc/kernel/processor.c b/arch/parisc/kernel/processor.c
+index 26eb568f8b961..dddaaa6e7a825 100644
+--- a/arch/parisc/kernel/processor.c
++++ b/arch/parisc/kernel/processor.c
+@@ -327,8 +327,6 @@ int init_per_cpu(int cpunum)
+ set_firmware_width();
+ ret = pdc_coproc_cfg(&coproc_cfg);
+
+- store_cpu_topology(cpunum);
+-
+ if(ret >= 0 && coproc_cfg.ccr_functional) {
+ mtctl(coproc_cfg.ccr_functional, 10); /* 10 == Coprocessor Control Reg */
+
+diff --git a/arch/parisc/kernel/topology.c b/arch/parisc/kernel/topology.c
+index 9696e3cb6a2a6..b9d845e314f8b 100644
+--- a/arch/parisc/kernel/topology.c
++++ b/arch/parisc/kernel/topology.c
+@@ -20,8 +20,6 @@
+
+ static DEFINE_PER_CPU(struct cpu, cpu_devices);
+
+-static int dualcores_found;
+-
+ /*
+ * store_cpu_topology is called at boot when only one cpu is running
+ * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
+@@ -60,7 +58,6 @@ void store_cpu_topology(unsigned int cpuid)
+ if (p->cpu_loc) {
+ cpuid_topo->core_id++;
+ cpuid_topo->package_id = cpu_topology[cpu].package_id;
+- dualcores_found = 1;
+ continue;
+ }
+ }
+@@ -80,22 +77,11 @@ void store_cpu_topology(unsigned int cpuid)
+ cpu_topology[cpuid].package_id);
+ }
+
+-static struct sched_domain_topology_level parisc_mc_topology[] = {
+-#ifdef CONFIG_SCHED_MC
+- { cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
+-#endif
+-
+- { cpu_cpu_mask, SD_INIT_NAME(DIE) },
+- { NULL, },
+-};
+-
+ /*
+ * init_cpu_topology is called at boot when only one cpu is running
+ * which prevent simultaneous write access to cpu_topology array
+ */
+ void __init init_cpu_topology(void)
+ {
+- /* Set scheduler topology descriptor */
+- if (dualcores_found)
+- set_sched_topology(parisc_mc_topology);
++ reset_cpu_topology();
+ }
+diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
+index f2c5c26869f1a..03ae544eb6cc4 100644
+--- a/arch/powerpc/include/asm/page.h
++++ b/arch/powerpc/include/asm/page.h
+@@ -216,6 +216,9 @@ static inline bool pfn_valid(unsigned long pfn)
+ #define __pa(x) ((phys_addr_t)(unsigned long)(x) - VIRT_PHYS_OFFSET)
+ #else
+ #ifdef CONFIG_PPC64
++
++#define VIRTUAL_WARN_ON(x) WARN_ON(IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && (x))
++
+ /*
+ * gcc miscompiles (unsigned long)(&static_var) - PAGE_OFFSET
+ * with -mcmodel=medium, so we use & and | instead of - and + on 64-bit.
+@@ -223,13 +226,13 @@ static inline bool pfn_valid(unsigned long pfn)
+ */
+ #define __va(x) \
+ ({ \
+- VIRTUAL_BUG_ON((unsigned long)(x) >= PAGE_OFFSET); \
++ VIRTUAL_WARN_ON((unsigned long)(x) >= PAGE_OFFSET); \
+ (void *)(unsigned long)((phys_addr_t)(x) | PAGE_OFFSET); \
+ })
+
+ #define __pa(x) \
+ ({ \
+- VIRTUAL_BUG_ON((unsigned long)(x) < PAGE_OFFSET); \
++ VIRTUAL_WARN_ON((unsigned long)(x) < PAGE_OFFSET); \
+ (unsigned long)(x) & 0x0fffffffffffffffUL; \
+ })
+
+diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
+index 83afcb6c194b5..c36f71e01c0f0 100644
+--- a/arch/powerpc/include/asm/vas.h
++++ b/arch/powerpc/include/asm/vas.h
+@@ -126,7 +126,7 @@ static inline void vas_user_win_add_mm_context(struct vas_user_win_ref *ref)
+ * Receive window attributes specified by the (in-kernel) owner of window.
+ */
+ struct vas_rx_win_attr {
+- void *rx_fifo;
++ u64 rx_fifo;
+ int rx_fifo_size;
+ int wcreds_max;
+
+diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
+index 9581906b5ee9c..da18f83ef8834 100644
+--- a/arch/powerpc/kernel/entry_64.S
++++ b/arch/powerpc/kernel/entry_64.S
+@@ -330,22 +330,22 @@ _GLOBAL(enter_rtas)
+ clrldi r4,r4,2 /* convert to realmode address */
+ mtlr r4
+
+- li r0,0
+- ori r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
+- andc r0,r6,r0
+-
+- li r9,1
+- rldicr r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
+- ori r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
+- andc r6,r0,r9
+-
+ __enter_rtas:
+- sync /* disable interrupts so SRR0/1 */
+- mtmsrd r0 /* don't get trashed */
+-
+ LOAD_REG_ADDR(r4, rtas)
+ ld r5,RTASENTRY(r4) /* get the rtas->entry value */
+ ld r4,RTASBASE(r4) /* get the rtas->base value */
++
++ /*
++ * RTAS runs in 32-bit big endian real mode, but leave MSR[RI] on as we
++ * may hit NMI (SRESET or MCE) while in RTAS. RTAS should disable RI in
++ * its critical regions (as specified in PAPR+ section 7.2.1). MSR[S]
++ * is not impacted by RFI_TO_KERNEL (only urfid can unset it). So if
++ * MSR[S] is set, it will remain when entering RTAS.
++ */
++ LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
++
++ li r0,0
++ mtmsrd r0,1 /* disable RI before using SRR0/1 */
+
+ mtspr SPRN_SRR0,r5
+ mtspr SPRN_SRR1,r6
+diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
+index 65562c4a0a690..dc2350b288cfe 100644
+--- a/arch/powerpc/kernel/fadump.c
++++ b/arch/powerpc/kernel/fadump.c
+@@ -867,7 +867,6 @@ static int fadump_alloc_mem_ranges(struct fadump_mrange_info *mrange_info)
+ sizeof(struct fadump_memory_range));
+ return 0;
+ }
+-
+ static inline int fadump_add_mem_range(struct fadump_mrange_info *mrange_info,
+ u64 base, u64 end)
+ {
+@@ -886,7 +885,12 @@ static inline int fadump_add_mem_range(struct fadump_mrange_info *mrange_info,
+ start = mem_ranges[mrange_info->mem_range_cnt - 1].base;
+ size = mem_ranges[mrange_info->mem_range_cnt - 1].size;
+
+- if ((start + size) == base)
++ /*
++ * Boot memory area needs separate PT_LOAD segment(s) as it
++ * is moved to a different location at the time of crash.
++ * So, fold only if the region is not boot memory area.
++ */
++ if ((start + size) == base && start >= fw_dump.boot_mem_top)
+ is_adjacent = true;
+ }
+ if (!is_adjacent) {
+diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
+index 4ad79eb638c62..77cd4c5a2d631 100644
+--- a/arch/powerpc/kernel/idle.c
++++ b/arch/powerpc/kernel/idle.c
+@@ -37,7 +37,7 @@ static int __init powersave_off(char *arg)
+ {
+ ppc_md.power_save = NULL;
+ cpuidle_disable = IDLE_POWERSAVE_OFF;
+- return 0;
++ return 1;
+ }
+ __setup("powersave=off", powersave_off);
+
+diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
+index 1f42aabbbab3a..6bc89d9ccf635 100644
+--- a/arch/powerpc/kernel/rtas.c
++++ b/arch/powerpc/kernel/rtas.c
+@@ -49,6 +49,15 @@ void enter_rtas(unsigned long);
+
+ static inline void do_enter_rtas(unsigned long args)
+ {
++ unsigned long msr;
++
++ /*
++ * Make sure MSR[RI] is currently enabled as it will be forced later
++ * in enter_rtas.
++ */
++ msr = mfmsr();
++ BUG_ON(!(msr & MSR_RI));
++
+ enter_rtas(args);
+
+ srr_regs_clobbered(); /* rtas uses SRRs, invalidate */
+diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
+index 6fa518f6501d5..aef0a6b423d80 100644
+--- a/arch/powerpc/kvm/book3s_hv.c
++++ b/arch/powerpc/kvm/book3s_hv.c
+@@ -4233,13 +4233,13 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
+ start_wait = ktime_get();
+
+ vc->vcore_state = VCORE_SLEEPING;
+- trace_kvmppc_vcore_blocked(vc, 0);
++ trace_kvmppc_vcore_blocked(vc->runner, 0);
+ spin_unlock(&vc->lock);
+ schedule();
+ finish_rcuwait(&vc->wait);
+ spin_lock(&vc->lock);
+ vc->vcore_state = VCORE_INACTIVE;
+- trace_kvmppc_vcore_blocked(vc, 1);
++ trace_kvmppc_vcore_blocked(vc->runner, 1);
+ ++vc->runner->stat.halt_successful_wait;
+
+ cur = ktime_get();
+@@ -4619,9 +4619,9 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
+ if (kvmppc_vcpu_check_block(vcpu))
+ break;
+
+- trace_kvmppc_vcore_blocked(vc, 0);
++ trace_kvmppc_vcore_blocked(vcpu, 0);
+ schedule();
+- trace_kvmppc_vcore_blocked(vc, 1);
++ trace_kvmppc_vcore_blocked(vcpu, 1);
+ }
+ finish_rcuwait(wait);
+ }
+@@ -5283,6 +5283,10 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
+ kvm->arch.host_lpcr = lpcr = mfspr(SPRN_LPCR);
+ lpcr &= LPCR_PECE | LPCR_LPES;
+ } else {
++ /*
++ * The L2 LPES mode will be set by the L0 according to whether
++ * or not it needs to take external interrupts in HV mode.
++ */
+ lpcr = 0;
+ }
+ lpcr |= (4UL << LPCR_DPFD_SH) | LPCR_HDICE |
+diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
+index c943a051c6e70..265bb30a0af2b 100644
+--- a/arch/powerpc/kvm/book3s_hv_nested.c
++++ b/arch/powerpc/kvm/book3s_hv_nested.c
+@@ -261,8 +261,7 @@ static void load_l2_hv_regs(struct kvm_vcpu *vcpu,
+ /*
+ * Don't let L1 change LPCR bits for the L2 except these:
+ */
+- mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD |
+- LPCR_LPES | LPCR_MER;
++ mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | LPCR_MER;
+
+ /*
+ * Additional filtering is required depending on hardware
+diff --git a/arch/powerpc/kvm/trace_hv.h b/arch/powerpc/kvm/trace_hv.h
+index 38cd0ed0a6178..32e2cb5811cc8 100644
+--- a/arch/powerpc/kvm/trace_hv.h
++++ b/arch/powerpc/kvm/trace_hv.h
+@@ -409,9 +409,9 @@ TRACE_EVENT(kvmppc_run_core,
+ );
+
+ TRACE_EVENT(kvmppc_vcore_blocked,
+- TP_PROTO(struct kvmppc_vcore *vc, int where),
++ TP_PROTO(struct kvm_vcpu *vcpu, int where),
+
+- TP_ARGS(vc, where),
++ TP_ARGS(vcpu, where),
+
+ TP_STRUCT__entry(
+ __field(int, n_runnable)
+@@ -421,8 +421,8 @@ TRACE_EVENT(kvmppc_vcore_blocked,
+ ),
+
+ TP_fast_assign(
+- __entry->runner_vcpu = vc->runner->vcpu_id;
+- __entry->n_runnable = vc->n_runnable;
++ __entry->runner_vcpu = vcpu->vcpu_id;
++ __entry->n_runnable = vcpu->arch.vcore->n_runnable;
+ __entry->where = where;
+ __entry->tgid = current->tgid;
+ ),
+diff --git a/arch/powerpc/mm/nohash/fsl_book3e.c b/arch/powerpc/mm/nohash/fsl_book3e.c
+index dfe715e0f70ac..388f7c7dabd30 100644
+--- a/arch/powerpc/mm/nohash/fsl_book3e.c
++++ b/arch/powerpc/mm/nohash/fsl_book3e.c
+@@ -287,22 +287,19 @@ void __init adjust_total_lowmem(void)
+
+ #ifdef CONFIG_STRICT_KERNEL_RWX
+ void mmu_mark_rodata_ro(void)
+-{
+- /* Everything is done in mmu_mark_initmem_nx() */
+-}
+-#endif
+-
+-void mmu_mark_initmem_nx(void)
+ {
+ unsigned long remapped;
+
+- if (!strict_kernel_rwx_enabled())
+- return;
+-
+ remapped = map_mem_in_cams(__max_low_memory, CONFIG_LOWMEM_CAM_NUM, false, false);
+
+ WARN_ON(__max_low_memory != remapped);
+ }
++#endif
++
++void mmu_mark_initmem_nx(void)
++{
++ /* Everything is done in mmu_mark_rodata_ro() */
++}
+
+ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
+ phys_addr_t first_memblock_size)
+diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
+index a74d382ecbb77..bb5d64862bc99 100644
+--- a/arch/powerpc/perf/isa207-common.c
++++ b/arch/powerpc/perf/isa207-common.c
+@@ -108,7 +108,7 @@ static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
+ *mmcra |= MMCRA_SDAR_MODE_TLB;
+ }
+
+-static u64 p10_thresh_cmp_val(u64 value)
++static int p10_thresh_cmp_val(u64 value)
+ {
+ int exp = 0;
+ u64 result = value;
+@@ -139,7 +139,7 @@ static u64 p10_thresh_cmp_val(u64 value)
+ * exponent is also zero.
+ */
+ if (!(value & 0xC0) && exp)
+- result = 0;
++ result = -1;
+ else
+ result = (exp << 8) | value;
+ }
+@@ -187,7 +187,7 @@ static bool is_thresh_cmp_valid(u64 event)
+ unsigned int cmp, exp;
+
+ if (cpu_has_feature(CPU_FTR_ARCH_31))
+- return p10_thresh_cmp_val(event) != 0;
++ return p10_thresh_cmp_val(event) >= 0;
+
+ /*
+ * Check the mantissa upper two bits are not zero, unless the
+@@ -502,12 +502,14 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp,
+ value |= CNST_THRESH_CTL_SEL_VAL(event >> EVENT_THRESH_SHIFT);
+ mask |= p10_CNST_THRESH_CMP_MASK;
+ value |= p10_CNST_THRESH_CMP_VAL(p10_thresh_cmp_val(event_config1));
+- }
++ } else if (event_is_threshold(event))
++ return -1;
+ } else if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+ if (event_is_threshold(event) && is_thresh_cmp_valid(event)) {
+ mask |= CNST_THRESH_MASK;
+ value |= CNST_THRESH_VAL(event >> EVENT_THRESH_SHIFT);
+- }
++ } else if (event_is_threshold(event))
++ return -1;
+ } else {
+ /*
+ * Special case for PM_MRK_FAB_RSP_MATCH and PM_MRK_FAB_RSP_MATCH_CYC,
+diff --git a/arch/powerpc/platforms/4xx/cpm.c b/arch/powerpc/platforms/4xx/cpm.c
+index 2571841625a23..1d3bc35ee1a7d 100644
+--- a/arch/powerpc/platforms/4xx/cpm.c
++++ b/arch/powerpc/platforms/4xx/cpm.c
+@@ -327,6 +327,6 @@ late_initcall(cpm_init);
+ static int __init cpm_powersave_off(char *arg)
+ {
+ cpm.powersave_off = 1;
+- return 0;
++ return 1;
+ }
+ __setup("powersave=off", cpm_powersave_off);
+diff --git a/arch/powerpc/platforms/8xx/cpm1.c b/arch/powerpc/platforms/8xx/cpm1.c
+index c58b6f1c40e35..3ef5e9fd3a9b6 100644
+--- a/arch/powerpc/platforms/8xx/cpm1.c
++++ b/arch/powerpc/platforms/8xx/cpm1.c
+@@ -280,6 +280,7 @@ cpm_setbrg(uint brg, uint rate)
+ out_be32(bp, (((BRG_UART_CLK_DIV16 / rate) - 1) << 1) |
+ CPM_BRG_EN | CPM_BRG_DIV16);
+ }
++EXPORT_SYMBOL(cpm_setbrg);
+
+ struct cpm_ioport16 {
+ __be16 dir, par, odr_sor, dat, intr;
+diff --git a/arch/powerpc/platforms/powernv/opal-fadump.c b/arch/powerpc/platforms/powernv/opal-fadump.c
+index c8ad057c72210..9d74d3950a523 100644
+--- a/arch/powerpc/platforms/powernv/opal-fadump.c
++++ b/arch/powerpc/platforms/powernv/opal-fadump.c
+@@ -60,7 +60,7 @@ void __init opal_fadump_dt_scan(struct fw_dump *fadump_conf, u64 node)
+ addr = be64_to_cpu(addr);
+ pr_debug("Kernel metadata addr: %llx\n", addr);
+ opal_fdm_active = (void *)addr;
+- if (opal_fdm_active->registered_regions == 0)
++ if (be16_to_cpu(opal_fdm_active->registered_regions) == 0)
+ return;
+
+ ret = opal_mpipl_query_tag(OPAL_MPIPL_TAG_BOOT_MEM, &addr);
+@@ -95,17 +95,17 @@ static int opal_fadump_unregister(struct fw_dump *fadump_conf);
+ static void opal_fadump_update_config(struct fw_dump *fadump_conf,
+ const struct opal_fadump_mem_struct *fdm)
+ {
+- pr_debug("Boot memory regions count: %d\n", fdm->region_cnt);
++ pr_debug("Boot memory regions count: %d\n", be16_to_cpu(fdm->region_cnt));
+
+ /*
+ * The destination address of the first boot memory region is the
+ * destination address of boot memory regions.
+ */
+- fadump_conf->boot_mem_dest_addr = fdm->rgn[0].dest;
++ fadump_conf->boot_mem_dest_addr = be64_to_cpu(fdm->rgn[0].dest);
+ pr_debug("Destination address of boot memory regions: %#016llx\n",
+ fadump_conf->boot_mem_dest_addr);
+
+- fadump_conf->fadumphdr_addr = fdm->fadumphdr_addr;
++ fadump_conf->fadumphdr_addr = be64_to_cpu(fdm->fadumphdr_addr);
+ }
+
+ /*
+@@ -126,9 +126,9 @@ static void __init opal_fadump_get_config(struct fw_dump *fadump_conf,
+ fadump_conf->boot_memory_size = 0;
+
+ pr_debug("Boot memory regions:\n");
+- for (i = 0; i < fdm->region_cnt; i++) {
+- base = fdm->rgn[i].src;
+- size = fdm->rgn[i].size;
++ for (i = 0; i < be16_to_cpu(fdm->region_cnt); i++) {
++ base = be64_to_cpu(fdm->rgn[i].src);
++ size = be64_to_cpu(fdm->rgn[i].size);
+ pr_debug("\t[%03d] base: 0x%lx, size: 0x%lx\n", i, base, size);
+
+ fadump_conf->boot_mem_addr[i] = base;
+@@ -143,7 +143,7 @@ static void __init opal_fadump_get_config(struct fw_dump *fadump_conf,
+ * Start address of reserve dump area (permanent reservation) for
+ * re-registering FADump after dump capture.
+ */
+- fadump_conf->reserve_dump_area_start = fdm->rgn[0].dest;
++ fadump_conf->reserve_dump_area_start = be64_to_cpu(fdm->rgn[0].dest);
+
+ /*
+ * Rarely, but it can so happen that system crashes before all
+@@ -155,13 +155,14 @@ static void __init opal_fadump_get_config(struct fw_dump *fadump_conf,
+ * Hope the memory that could not be preserved only has pages
+ * that are usually filtered out while saving the vmcore.
+ */
+- if (fdm->region_cnt > fdm->registered_regions) {
++ if (be16_to_cpu(fdm->region_cnt) > be16_to_cpu(fdm->registered_regions)) {
+ pr_warn("Not all memory regions were saved!!!\n");
+ pr_warn(" Unsaved memory regions:\n");
+- i = fdm->registered_regions;
+- while (i < fdm->region_cnt) {
++ i = be16_to_cpu(fdm->registered_regions);
++ while (i < be16_to_cpu(fdm->region_cnt)) {
+ pr_warn("\t[%03d] base: 0x%llx, size: 0x%llx\n",
+- i, fdm->rgn[i].src, fdm->rgn[i].size);
++ i, be64_to_cpu(fdm->rgn[i].src),
++ be64_to_cpu(fdm->rgn[i].size));
+ i++;
+ }
+
+@@ -170,7 +171,7 @@ static void __init opal_fadump_get_config(struct fw_dump *fadump_conf,
+ }
+
+ fadump_conf->boot_mem_top = (fadump_conf->boot_memory_size + hole_size);
+- fadump_conf->boot_mem_regs_cnt = fdm->region_cnt;
++ fadump_conf->boot_mem_regs_cnt = be16_to_cpu(fdm->region_cnt);
+ opal_fadump_update_config(fadump_conf, fdm);
+ }
+
+@@ -178,35 +179,38 @@ static void __init opal_fadump_get_config(struct fw_dump *fadump_conf,
+ static void opal_fadump_init_metadata(struct opal_fadump_mem_struct *fdm)
+ {
+ fdm->version = OPAL_FADUMP_VERSION;
+- fdm->region_cnt = 0;
+- fdm->registered_regions = 0;
+- fdm->fadumphdr_addr = 0;
++ fdm->region_cnt = cpu_to_be16(0);
++ fdm->registered_regions = cpu_to_be16(0);
++ fdm->fadumphdr_addr = cpu_to_be64(0);
+ }
+
+ static u64 opal_fadump_init_mem_struct(struct fw_dump *fadump_conf)
+ {
+ u64 addr = fadump_conf->reserve_dump_area_start;
++ u16 reg_cnt;
+ int i;
+
+ opal_fdm = __va(fadump_conf->kernel_metadata);
+ opal_fadump_init_metadata(opal_fdm);
+
+ /* Boot memory regions */
++ reg_cnt = be16_to_cpu(opal_fdm->region_cnt);
+ for (i = 0; i < fadump_conf->boot_mem_regs_cnt; i++) {
+- opal_fdm->rgn[i].src = fadump_conf->boot_mem_addr[i];
+- opal_fdm->rgn[i].dest = addr;
+- opal_fdm->rgn[i].size = fadump_conf->boot_mem_sz[i];
++ opal_fdm->rgn[i].src = cpu_to_be64(fadump_conf->boot_mem_addr[i]);
++ opal_fdm->rgn[i].dest = cpu_to_be64(addr);
++ opal_fdm->rgn[i].size = cpu_to_be64(fadump_conf->boot_mem_sz[i]);
+
+- opal_fdm->region_cnt++;
++ reg_cnt++;
+ addr += fadump_conf->boot_mem_sz[i];
+ }
++ opal_fdm->region_cnt = cpu_to_be16(reg_cnt);
+
+ /*
+ * Kernel metadata is passed to f/w and retrieved in capture kerenl.
+ * So, use it to save fadump header address instead of calculating it.
+ */
+- opal_fdm->fadumphdr_addr = (opal_fdm->rgn[0].dest +
+- fadump_conf->boot_memory_size);
++ opal_fdm->fadumphdr_addr = cpu_to_be64(be64_to_cpu(opal_fdm->rgn[0].dest) +
++ fadump_conf->boot_memory_size);
+
+ opal_fadump_update_config(fadump_conf, opal_fdm);
+
+@@ -269,18 +273,21 @@ static u64 opal_fadump_get_bootmem_min(void)
+ static int opal_fadump_register(struct fw_dump *fadump_conf)
+ {
+ s64 rc = OPAL_PARAMETER;
++ u16 registered_regs;
+ int i, err = -EIO;
+
+- for (i = 0; i < opal_fdm->region_cnt; i++) {
++ registered_regs = be16_to_cpu(opal_fdm->registered_regions);
++ for (i = 0; i < be16_to_cpu(opal_fdm->region_cnt); i++) {
+ rc = opal_mpipl_update(OPAL_MPIPL_ADD_RANGE,
+- opal_fdm->rgn[i].src,
+- opal_fdm->rgn[i].dest,
+- opal_fdm->rgn[i].size);
++ be64_to_cpu(opal_fdm->rgn[i].src),
++ be64_to_cpu(opal_fdm->rgn[i].dest),
++ be64_to_cpu(opal_fdm->rgn[i].size));
+ if (rc != OPAL_SUCCESS)
+ break;
+
+- opal_fdm->registered_regions++;
++ registered_regs++;
+ }
++ opal_fdm->registered_regions = cpu_to_be16(registered_regs);
+
+ switch (rc) {
+ case OPAL_SUCCESS:
+@@ -291,7 +298,8 @@ static int opal_fadump_register(struct fw_dump *fadump_conf)
+ case OPAL_RESOURCE:
+ /* If MAX regions limit in f/w is hit, warn and proceed. */
+ pr_warn("%d regions could not be registered for MPIPL as MAX limit is reached!\n",
+- (opal_fdm->region_cnt - opal_fdm->registered_regions));
++ (be16_to_cpu(opal_fdm->region_cnt) -
++ be16_to_cpu(opal_fdm->registered_regions)));
+ fadump_conf->dump_registered = 1;
+ err = 0;
+ break;
+@@ -312,7 +320,7 @@ static int opal_fadump_register(struct fw_dump *fadump_conf)
+ * If some regions were registered before OPAL_MPIPL_ADD_RANGE
+ * OPAL call failed, unregister all regions.
+ */
+- if ((err < 0) && (opal_fdm->registered_regions > 0))
++ if ((err < 0) && (be16_to_cpu(opal_fdm->registered_regions) > 0))
+ opal_fadump_unregister(fadump_conf);
+
+ return err;
+@@ -328,7 +336,7 @@ static int opal_fadump_unregister(struct fw_dump *fadump_conf)
+ return -EIO;
+ }
+
+- opal_fdm->registered_regions = 0;
++ opal_fdm->registered_regions = cpu_to_be16(0);
+ fadump_conf->dump_registered = 0;
+ return 0;
+ }
+@@ -563,19 +571,20 @@ static void opal_fadump_region_show(struct fw_dump *fadump_conf,
+ else
+ fdm_ptr = opal_fdm;
+
+- for (i = 0; i < fdm_ptr->region_cnt; i++) {
++ for (i = 0; i < be16_to_cpu(fdm_ptr->region_cnt); i++) {
+ /*
+ * Only regions that are registered for MPIPL
+ * would have dump data.
+ */
+ if ((fadump_conf->dump_active) &&
+- (i < fdm_ptr->registered_regions))
+- dumped_bytes = fdm_ptr->rgn[i].size;
++ (i < be16_to_cpu(fdm_ptr->registered_regions)))
++ dumped_bytes = be64_to_cpu(fdm_ptr->rgn[i].size);
+
+ seq_printf(m, "DUMP: Src: %#016llx, Dest: %#016llx, ",
+- fdm_ptr->rgn[i].src, fdm_ptr->rgn[i].dest);
++ be64_to_cpu(fdm_ptr->rgn[i].src),
++ be64_to_cpu(fdm_ptr->rgn[i].dest));
+ seq_printf(m, "Size: %#llx, Dumped: %#llx bytes\n",
+- fdm_ptr->rgn[i].size, dumped_bytes);
++ be64_to_cpu(fdm_ptr->rgn[i].size), dumped_bytes);
+ }
+
+ /* Dump is active. Show reserved area start address. */
+@@ -624,6 +633,7 @@ void __init opal_fadump_dt_scan(struct fw_dump *fadump_conf, u64 node)
+ {
+ const __be32 *prop;
+ unsigned long dn;
++ __be64 be_addr;
+ u64 addr = 0;
+ int i, len;
+ s64 ret;
+@@ -680,13 +690,13 @@ void __init opal_fadump_dt_scan(struct fw_dump *fadump_conf, u64 node)
+ if (!prop)
+ return;
+
+- ret = opal_mpipl_query_tag(OPAL_MPIPL_TAG_KERNEL, &addr);
+- if ((ret != OPAL_SUCCESS) || !addr) {
++ ret = opal_mpipl_query_tag(OPAL_MPIPL_TAG_KERNEL, &be_addr);
++ if ((ret != OPAL_SUCCESS) || !be_addr) {
+ pr_err("Failed to get Kernel metadata (%lld)\n", ret);
+ return;
+ }
+
+- addr = be64_to_cpu(addr);
++ addr = be64_to_cpu(be_addr);
+ pr_debug("Kernel metadata addr: %llx\n", addr);
+
+ opal_fdm_active = __va(addr);
+@@ -697,14 +707,14 @@ void __init opal_fadump_dt_scan(struct fw_dump *fadump_conf, u64 node)
+ }
+
+ /* Kernel regions not registered with f/w for MPIPL */
+- if (opal_fdm_active->registered_regions == 0) {
++ if (be16_to_cpu(opal_fdm_active->registered_regions) == 0) {
+ opal_fdm_active = NULL;
+ return;
+ }
+
+- ret = opal_mpipl_query_tag(OPAL_MPIPL_TAG_CPU, &addr);
+- if (addr) {
+- addr = be64_to_cpu(addr);
++ ret = opal_mpipl_query_tag(OPAL_MPIPL_TAG_CPU, &be_addr);
++ if (be_addr) {
++ addr = be64_to_cpu(be_addr);
+ pr_debug("CPU metadata addr: %llx\n", addr);
+ opal_cpu_metadata = __va(addr);
+ }
+diff --git a/arch/powerpc/platforms/powernv/opal-fadump.h b/arch/powerpc/platforms/powernv/opal-fadump.h
+index f1e9ecf548c5d..3f715efb0aa6e 100644
+--- a/arch/powerpc/platforms/powernv/opal-fadump.h
++++ b/arch/powerpc/platforms/powernv/opal-fadump.h
+@@ -31,14 +31,14 @@
+ * OPAL FADump kernel metadata
+ *
+ * The address of this structure will be registered with f/w for retrieving
+- * and processing during crash dump.
++ * in the capture kernel to process the crash dump.
+ */
+ struct opal_fadump_mem_struct {
+ u8 version;
+ u8 reserved[3];
+- u16 region_cnt; /* number of regions */
+- u16 registered_regions; /* Regions registered for MPIPL */
+- u64 fadumphdr_addr;
++ __be16 region_cnt; /* number of regions */
++ __be16 registered_regions; /* Regions registered for MPIPL */
++ __be64 fadumphdr_addr;
+ struct opal_mpipl_region rgn[FADUMP_MAX_MEM_REGS];
+ } __packed;
+
+@@ -135,7 +135,7 @@ static inline void opal_fadump_read_regs(char *bufp, unsigned int regs_cnt,
+ for (i = 0; i < regs_cnt; i++, bufp += reg_entry_size) {
+ reg_entry = (struct hdat_fadump_reg_entry *)bufp;
+ val = (cpu_endian ? be64_to_cpu(reg_entry->reg_val) :
+- reg_entry->reg_val);
++ (u64)(reg_entry->reg_val));
+ opal_fadump_set_regval_regnum(regs,
+ be32_to_cpu(reg_entry->reg_type),
+ be32_to_cpu(reg_entry->reg_num),
+diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
+index 105d889abd51a..824c3ad7a0faf 100644
+--- a/arch/powerpc/platforms/powernv/setup.c
++++ b/arch/powerpc/platforms/powernv/setup.c
+@@ -96,6 +96,15 @@ static void __init init_fw_feat_flags(struct device_node *np)
+
+ if (fw_feature_is("disabled", "needs-spec-barrier-for-bound-checks", np))
+ security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR);
++
++ if (fw_feature_is("enabled", "no-need-l1d-flush-msr-pr-1-to-0", np))
++ security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY);
++
++ if (fw_feature_is("enabled", "no-need-l1d-flush-kernel-on-user-access", np))
++ security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);
++
++ if (fw_feature_is("enabled", "no-need-store-drain-on-priv-state-switch", np))
++ security_ftr_clear(SEC_FTR_STF_BARRIER);
+ }
+
+ static void __init pnv_setup_security_mitigations(void)
+diff --git a/arch/powerpc/platforms/powernv/ultravisor.c b/arch/powerpc/platforms/powernv/ultravisor.c
+index e4a00ad06f9d3..67c8c4b2d8b17 100644
+--- a/arch/powerpc/platforms/powernv/ultravisor.c
++++ b/arch/powerpc/platforms/powernv/ultravisor.c
+@@ -55,6 +55,7 @@ static int __init uv_init(void)
+ return -ENODEV;
+
+ uv_memcons = memcons_init(node, "memcons");
++ of_node_put(node);
+ if (!uv_memcons)
+ return -ENOENT;
+
+diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
+index a7aabc18039eb..c1bfad56447d4 100644
+--- a/arch/powerpc/platforms/powernv/vas-fault.c
++++ b/arch/powerpc/platforms/powernv/vas-fault.c
+@@ -216,7 +216,7 @@ int vas_setup_fault_window(struct vas_instance *vinst)
+ vas_init_rx_win_attr(&attr, VAS_COP_TYPE_FAULT);
+
+ attr.rx_fifo_size = vinst->fault_fifo_size;
+- attr.rx_fifo = vinst->fault_fifo;
++ attr.rx_fifo = __pa(vinst->fault_fifo);
+
+ /*
+ * Max creds is based on number of CRBs can fit in the FIFO.
+diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
+index 0f8d39fbf2b21..0072682531d80 100644
+--- a/arch/powerpc/platforms/powernv/vas-window.c
++++ b/arch/powerpc/platforms/powernv/vas-window.c
+@@ -404,7 +404,7 @@ static void init_winctx_regs(struct pnv_vas_window *window,
+ *
+ * See also: Design note in function header.
+ */
+- val = __pa(winctx->rx_fifo);
++ val = winctx->rx_fifo;
+ val = SET_FIELD(VAS_PAGE_MIGRATION_SELECT, val, 0);
+ write_hvwc_reg(window, VREG(LFIFO_BAR), val);
+
+@@ -739,7 +739,7 @@ static void init_winctx_for_rxwin(struct pnv_vas_window *rxwin,
+ */
+ winctx->fifo_disable = true;
+ winctx->intr_disable = true;
+- winctx->rx_fifo = NULL;
++ winctx->rx_fifo = 0;
+ }
+
+ winctx->lnotify_lpid = rxattr->lnotify_lpid;
+diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
+index 8bb08e395de05..08d9d3d5a22b0 100644
+--- a/arch/powerpc/platforms/powernv/vas.h
++++ b/arch/powerpc/platforms/powernv/vas.h
+@@ -376,7 +376,7 @@ struct pnv_vas_window {
+ * is a container for the register fields in the window context.
+ */
+ struct vas_winctx {
+- void *rx_fifo;
++ u64 rx_fifo;
+ int rx_fifo_size;
+ int wcreds_max;
+ int rsvd_txbuf_count;
+diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
+index 39962c9055422..181b855b30509 100644
+--- a/arch/powerpc/platforms/pseries/papr_scm.c
++++ b/arch/powerpc/platforms/pseries/papr_scm.c
+@@ -125,8 +125,8 @@ struct papr_scm_priv {
+ /* The bits which needs to be overridden */
+ u64 health_bitmap_inject_mask;
+
+- /* array to have event_code and stat_id mappings */
+- char **nvdimm_events_map;
++ /* array to have event_code and stat_id mappings */
++ u8 *nvdimm_events_map;
+ };
+
+ static int papr_scm_pmem_flush(struct nd_region *nd_region,
+@@ -370,7 +370,7 @@ static int papr_scm_pmu_get_value(struct perf_event *event, struct device *dev,
+
+ stat = &stats->scm_statistic[0];
+ memcpy(&stat->stat_id,
+- p->nvdimm_events_map[event->attr.config],
++ &p->nvdimm_events_map[event->attr.config * sizeof(stat->stat_id)],
+ sizeof(stat->stat_id));
+ stat->stat_val = 0;
+
+@@ -462,14 +462,13 @@ static int papr_scm_pmu_check_events(struct papr_scm_priv *p, struct nvdimm_pmu
+ {
+ struct papr_scm_perf_stat *stat;
+ struct papr_scm_perf_stats *stats;
+- int index, rc, count;
+ u32 available_events;
+-
+- if (!p->stat_buffer_len)
+- return -ENOENT;
++ int index, rc = 0;
+
+ available_events = (p->stat_buffer_len - sizeof(struct papr_scm_perf_stats))
+ / sizeof(struct papr_scm_perf_stat);
++ if (available_events == 0)
++ return -EOPNOTSUPP;
+
+ /* Allocate the buffer for phyp where stats are written */
+ stats = kzalloc(p->stat_buffer_len, GFP_KERNEL);
+@@ -478,35 +477,30 @@ static int papr_scm_pmu_check_events(struct papr_scm_priv *p, struct nvdimm_pmu
+ return rc;
+ }
+
+- /* Allocate memory to nvdimm_event_map */
+- p->nvdimm_events_map = kcalloc(available_events, sizeof(char *), GFP_KERNEL);
+- if (!p->nvdimm_events_map) {
+- rc = -ENOMEM;
+- goto out_stats;
+- }
+-
+ /* Called to get list of events supported */
+ rc = drc_pmem_query_stats(p, stats, 0);
+ if (rc)
+- goto out_nvdimm_events_map;
+-
+- for (index = 0, stat = stats->scm_statistic, count = 0;
+- index < available_events; index++, ++stat) {
+- p->nvdimm_events_map[count] = kmemdup_nul(stat->stat_id, 8, GFP_KERNEL);
+- if (!p->nvdimm_events_map[count]) {
+- rc = -ENOMEM;
+- goto out_nvdimm_events_map;
+- }
++ goto out;
+
+- count++;
++ /*
++ * Allocate memory and populate nvdimm_event_map.
++ * Allocate an extra element for NULL entry
++ */
++ p->nvdimm_events_map = kcalloc(available_events + 1,
++ sizeof(stat->stat_id),
++ GFP_KERNEL);
++ if (!p->nvdimm_events_map) {
++ rc = -ENOMEM;
++ goto out;
+ }
+- p->nvdimm_events_map[count] = NULL;
+- kfree(stats);
+- return 0;
+
+-out_nvdimm_events_map:
+- kfree(p->nvdimm_events_map);
+-out_stats:
++ /* Copy all stat_ids to event map */
++ for (index = 0, stat = stats->scm_statistic;
++ index < available_events; index++, ++stat) {
++ memcpy(&p->nvdimm_events_map[index * sizeof(stat->stat_id)],
++ &stat->stat_id, sizeof(stat->stat_id));
++ }
++out:
+ kfree(stats);
+ return rc;
+ }
+diff --git a/arch/powerpc/sysdev/dart_iommu.c b/arch/powerpc/sysdev/dart_iommu.c
+index be6b99b1b3523..9a02aed886a0d 100644
+--- a/arch/powerpc/sysdev/dart_iommu.c
++++ b/arch/powerpc/sysdev/dart_iommu.c
+@@ -404,9 +404,10 @@ void __init iommu_init_early_dart(struct pci_controller_ops *controller_ops)
+ }
+
+ /* Initialize the DART HW */
+- if (dart_init(dn) != 0)
++ if (dart_init(dn) != 0) {
++ of_node_put(dn);
+ return;
+-
++ }
+ /*
+ * U4 supports a DART bypass, we use it for 64-bit capable devices to
+ * improve performance. However, that only works for devices connected
+@@ -419,6 +420,7 @@ void __init iommu_init_early_dart(struct pci_controller_ops *controller_ops)
+
+ /* Setup pci_dma ops */
+ set_pci_dma_ops(&dma_iommu_ops);
++ of_node_put(dn);
+ }
+
+ #ifdef CONFIG_PM
+diff --git a/arch/powerpc/sysdev/fsl_rio.c b/arch/powerpc/sysdev/fsl_rio.c
+index ff7906b48ca1e..1bfc9afa8a1a1 100644
+--- a/arch/powerpc/sysdev/fsl_rio.c
++++ b/arch/powerpc/sysdev/fsl_rio.c
+@@ -505,8 +505,10 @@ int fsl_rio_setup(struct platform_device *dev)
+ if (rc) {
+ dev_err(&dev->dev, "Can't get %pOF property 'reg'\n",
+ rmu_node);
++ of_node_put(rmu_node);
+ goto err_rmu;
+ }
++ of_node_put(rmu_node);
+ rmu_regs_win = ioremap(rmu_regs.start, resource_size(&rmu_regs));
+ if (!rmu_regs_win) {
+ dev_err(&dev->dev, "Unable to map rmu register window\n");
+diff --git a/arch/powerpc/sysdev/xics/icp-opal.c b/arch/powerpc/sysdev/xics/icp-opal.c
+index bda4c32582d97..4dae624b9f2f4 100644
+--- a/arch/powerpc/sysdev/xics/icp-opal.c
++++ b/arch/powerpc/sysdev/xics/icp-opal.c
+@@ -196,6 +196,7 @@ int __init icp_opal_init(void)
+
+ printk("XICS: Using OPAL ICP fallbacks\n");
+
++ of_node_put(np);
+ return 0;
+ }
+
+diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
+index 29456c255f9f7..503f544d28e29 100644
+--- a/arch/powerpc/sysdev/xive/spapr.c
++++ b/arch/powerpc/sysdev/xive/spapr.c
+@@ -830,12 +830,12 @@ bool __init xive_spapr_init(void)
+ /* Resource 1 is the OS ring TIMA */
+ if (of_address_to_resource(np, 1, &r)) {
+ pr_err("Failed to get thread mgmnt area resource\n");
+- return false;
++ goto err_put;
+ }
+ tima = ioremap(r.start, resource_size(&r));
+ if (!tima) {
+ pr_err("Failed to map thread mgmnt area\n");
+- return false;
++ goto err_put;
+ }
+
+ if (!xive_get_max_prio(&max_prio))
+@@ -871,6 +871,7 @@ bool __init xive_spapr_init(void)
+ if (!xive_core_init(np, &xive_spapr_ops, tima, TM_QW1_OS, max_prio))
+ goto err_mem_free;
+
++ of_node_put(np);
+ pr_info("Using %dkB queues\n", 1 << (xive_queue_shift - 10));
+ return true;
+
+@@ -878,6 +879,8 @@ err_mem_free:
+ xive_irq_bitmap_remove_all();
+ err_unmap:
+ iounmap(tima);
++err_put:
++ of_node_put(np);
+ return false;
+ }
+
+diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
+index 7d81102cffd48..c6ca1b9cbf712 100644
+--- a/arch/riscv/Makefile
++++ b/arch/riscv/Makefile
+@@ -154,3 +154,7 @@ PHONY += rv64_randconfig
+ rv64_randconfig:
+ $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/riscv/configs/64-bit.config \
+ -f $(srctree)/Makefile randconfig
++
++PHONY += rv32_defconfig
++rv32_defconfig:
++ $(Q)$(MAKE) -f $(srctree)/Makefile defconfig 32-bit.config
+diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
+index 67406c3763890..0377ce0fcc726 100644
+--- a/arch/riscv/include/asm/alternative-macros.h
++++ b/arch/riscv/include/asm/alternative-macros.h
+@@ -23,9 +23,9 @@
+ 888 :
+ \new_c
+ 889 :
+- .previous
+ .org . - (889b - 888b) + (887b - 886b)
+ .org . - (887b - 886b) + (889b - 888b)
++ .previous
+ .endif
+ .endm
+
+@@ -60,9 +60,9 @@
+ "888 :\n" \
+ new_c "\n" \
+ "889 :\n" \
+- ".previous\n" \
+ ".org . - (887b - 886b) + (889b - 888b)\n" \
+ ".org . - (889b - 888b) + (887b - 886b)\n" \
++ ".previous\n" \
+ ".endif\n"
+
+ #define __ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, enable) \
+diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
+index 8c2549b16ac06..618d7c5af1a2d 100644
+--- a/arch/riscv/include/asm/asm.h
++++ b/arch/riscv/include/asm/asm.h
+@@ -67,30 +67,4 @@
+ #error "Unexpected __SIZEOF_SHORT__"
+ #endif
+
+-#ifdef __ASSEMBLY__
+-
+-/* Common assembly source macros */
+-
+-#ifdef CONFIG_XIP_KERNEL
+-.macro XIP_FIXUP_OFFSET reg
+- REG_L t0, _xip_fixup
+- add \reg, \reg, t0
+-.endm
+-.macro XIP_FIXUP_FLASH_OFFSET reg
+- la t1, __data_loc
+- REG_L t1, _xip_phys_offset
+- sub \reg, \reg, t1
+- add \reg, \reg, t0
+-.endm
+-_xip_fixup: .dword CONFIG_PHYS_RAM_BASE - CONFIG_XIP_PHYS_ADDR - XIP_OFFSET
+-_xip_phys_offset: .dword CONFIG_XIP_PHYS_ADDR + XIP_OFFSET
+-#else
+-.macro XIP_FIXUP_OFFSET reg
+-.endm
+-.macro XIP_FIXUP_FLASH_OFFSET reg
+-.endm
+-#endif /* CONFIG_XIP_KERNEL */
+-
+-#endif /* __ASSEMBLY__ */
+-
+ #endif /* _ASM_RISCV_ASM_H */
+diff --git a/arch/riscv/include/asm/irq_work.h b/arch/riscv/include/asm/irq_work.h
+index d6c277992f76a..b53891964ae03 100644
+--- a/arch/riscv/include/asm/irq_work.h
++++ b/arch/riscv/include/asm/irq_work.h
+@@ -4,7 +4,7 @@
+
+ static inline bool arch_irq_work_has_interrupt(void)
+ {
+- return true;
++ return IS_ENABLED(CONFIG_SMP);
+ }
+ extern void arch_irq_work_raise(void);
+ #endif /* _ASM_RISCV_IRQ_WORK_H */
+diff --git a/arch/riscv/include/asm/unistd.h b/arch/riscv/include/asm/unistd.h
+index 6c316093a1e59..977ee6181dabf 100644
+--- a/arch/riscv/include/asm/unistd.h
++++ b/arch/riscv/include/asm/unistd.h
+@@ -9,7 +9,6 @@
+ */
+
+ #define __ARCH_WANT_SYS_CLONE
+-#define __ARCH_WANT_MEMFD_SECRET
+
+ #include <uapi/asm/unistd.h>
+
+diff --git a/arch/riscv/include/asm/xip_fixup.h b/arch/riscv/include/asm/xip_fixup.h
+new file mode 100644
+index 0000000000000..d4ffc3c37649f
+--- /dev/null
++++ b/arch/riscv/include/asm/xip_fixup.h
+@@ -0,0 +1,31 @@
++/* SPDX-License-Identifier: GPL-2.0-only */
++/*
++ * XIP fixup macros, only useful in assembly.
++ */
++#ifndef _ASM_RISCV_XIP_FIXUP_H
++#define _ASM_RISCV_XIP_FIXUP_H
++
++#include <linux/pgtable.h>
++
++#ifdef CONFIG_XIP_KERNEL
++.macro XIP_FIXUP_OFFSET reg
++ REG_L t0, _xip_fixup
++ add \reg, \reg, t0
++.endm
++.macro XIP_FIXUP_FLASH_OFFSET reg
++ la t1, __data_loc
++ REG_L t1, _xip_phys_offset
++ sub \reg, \reg, t1
++ add \reg, \reg, t0
++.endm
++
++_xip_fixup: .dword CONFIG_PHYS_RAM_BASE - CONFIG_XIP_PHYS_ADDR - XIP_OFFSET
++_xip_phys_offset: .dword CONFIG_XIP_PHYS_ADDR + XIP_OFFSET
++#else
++.macro XIP_FIXUP_OFFSET reg
++.endm
++.macro XIP_FIXUP_FLASH_OFFSET reg
++.endm
++#endif /* CONFIG_XIP_KERNEL */
++
++#endif
+diff --git a/arch/riscv/include/uapi/asm/unistd.h b/arch/riscv/include/uapi/asm/unistd.h
+index 8062996c2dfd0..d95fbf5846b0b 100644
+--- a/arch/riscv/include/uapi/asm/unistd.h
++++ b/arch/riscv/include/uapi/asm/unistd.h
+@@ -21,6 +21,7 @@
+ #endif /* __LP64__ */
+
+ #define __ARCH_WANT_SYS_CLONE3
++#define __ARCH_WANT_MEMFD_SECRET
+
+ #include <asm-generic/unistd.h>
+
+diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
+index 893b8bb693912..b865046e4dbbc 100644
+--- a/arch/riscv/kernel/head.S
++++ b/arch/riscv/kernel/head.S
+@@ -14,6 +14,7 @@
+ #include <asm/cpu_ops_sbi.h>
+ #include <asm/hwcap.h>
+ #include <asm/image.h>
++#include <asm/xip_fixup.h>
+ #include "efi-header.S"
+
+ __HEAD
+@@ -297,6 +298,7 @@ clear_bss_done:
+ REG_S a0, (a2)
+
+ /* Initialize page tables and relocate to virtual addresses */
++ la tp, init_task
+ la sp, init_thread_union + THREAD_SIZE
+ XIP_FIXUP_OFFSET sp
+ #ifdef CONFIG_BUILTIN_DTB
+diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
+index 834eb652a7b9d..e0a00739bd13b 100644
+--- a/arch/riscv/kernel/setup.c
++++ b/arch/riscv/kernel/setup.c
+@@ -189,7 +189,7 @@ static void __init init_resources(void)
+ res = &mem_res[res_idx--];
+
+ res->name = "Reserved";
+- res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
++ res->flags = IORESOURCE_MEM | IORESOURCE_EXCLUSIVE;
+ res->start = __pfn_to_phys(memblock_region_reserved_base_pfn(region));
+ res->end = __pfn_to_phys(memblock_region_reserved_end_pfn(region)) - 1;
+
+@@ -214,7 +214,7 @@ static void __init init_resources(void)
+
+ if (unlikely(memblock_is_nomap(region))) {
+ res->name = "Reserved";
+- res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
++ res->flags = IORESOURCE_MEM | IORESOURCE_EXCLUSIVE;
+ } else {
+ res->name = "System RAM";
+ res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+diff --git a/arch/riscv/kernel/suspend_entry.S b/arch/riscv/kernel/suspend_entry.S
+index 4b07b809a2b8c..aafcca58c19de 100644
+--- a/arch/riscv/kernel/suspend_entry.S
++++ b/arch/riscv/kernel/suspend_entry.S
+@@ -8,6 +8,7 @@
+ #include <asm/asm.h>
+ #include <asm/asm-offsets.h>
+ #include <asm/csr.h>
++#include <asm/xip_fixup.h>
+
+ .text
+ .altmacro
+diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
+index 05ed641a1134c..39e2e1d0e94f4 100644
+--- a/arch/riscv/mm/init.c
++++ b/arch/riscv/mm/init.c
+@@ -677,7 +677,7 @@ static __init pgprot_t pgprot_from_va(uintptr_t va)
+ }
+ #endif /* CONFIG_STRICT_KERNEL_RWX */
+
+-#ifdef CONFIG_64BIT
++#if defined(CONFIG_64BIT) && !defined(CONFIG_XIP_KERNEL)
+ static void __init disable_pgtable_l5(void)
+ {
+ pgtable_l5_enabled = false;
+diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
+index 1effac6a01520..1c4f585dd39b6 100644
+--- a/arch/s390/include/asm/cio.h
++++ b/arch/s390/include/asm/cio.h
+@@ -369,7 +369,7 @@ void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev);
+ struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages);
+
+ /* Function from drivers/s390/cio/chsc.c */
+-int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
++int chsc_sstpc(void *page, unsigned int op, u16 ctrl, long *clock_delta);
+ int chsc_sstpi(void *page, void *result, size_t size);
+ int chsc_stzi(void *page, void *result, size_t size);
+ int chsc_sgib(u32 origin);
+diff --git a/arch/s390/include/asm/kexec.h b/arch/s390/include/asm/kexec.h
+index 7f3c9ac34bd8d..63098df81c9f2 100644
+--- a/arch/s390/include/asm/kexec.h
++++ b/arch/s390/include/asm/kexec.h
+@@ -9,6 +9,8 @@
+ #ifndef _S390_KEXEC_H
+ #define _S390_KEXEC_H
+
++#include <linux/module.h>
++
+ #include <asm/processor.h>
+ #include <asm/page.h>
+ #include <asm/setup.h>
+@@ -83,4 +85,12 @@ struct kimage_arch {
+ extern const struct kexec_file_ops s390_kexec_image_ops;
+ extern const struct kexec_file_ops s390_kexec_elf_ops;
+
++#ifdef CONFIG_KEXEC_FILE
++struct purgatory_info;
++int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
++ Elf_Shdr *section,
++ const Elf_Shdr *relsec,
++ const Elf_Shdr *symtab);
++#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
++#endif
+ #endif /*_S390_KEXEC_H */
+diff --git a/arch/s390/include/asm/preempt.h b/arch/s390/include/asm/preempt.h
+index d9d5350cc3ec3..bf15da0fedbca 100644
+--- a/arch/s390/include/asm/preempt.h
++++ b/arch/s390/include/asm/preempt.h
+@@ -46,10 +46,17 @@ static inline bool test_preempt_need_resched(void)
+
+ static inline void __preempt_count_add(int val)
+ {
+- if (__builtin_constant_p(val) && (val >= -128) && (val <= 127))
+- __atomic_add_const(val, &S390_lowcore.preempt_count);
+- else
+- __atomic_add(val, &S390_lowcore.preempt_count);
++ /*
++ * With some obscure config options and CONFIG_PROFILE_ALL_BRANCHES
++ * enabled, gcc 12 fails to handle __builtin_constant_p().
++ */
++ if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES)) {
++ if (__builtin_constant_p(val) && (val >= -128) && (val <= 127)) {
++ __atomic_add_const(val, &S390_lowcore.preempt_count);
++ return;
++ }
++ }
++ __atomic_add(val, &S390_lowcore.preempt_count);
+ }
+
+ static inline void __preempt_count_sub(int val)
+diff --git a/arch/s390/kernel/perf_event.c b/arch/s390/kernel/perf_event.c
+index ea7729bebaa07..a7f8db73984b0 100644
+--- a/arch/s390/kernel/perf_event.c
++++ b/arch/s390/kernel/perf_event.c
+@@ -30,7 +30,7 @@ static struct kvm_s390_sie_block *sie_block(struct pt_regs *regs)
+ if (!stack)
+ return NULL;
+
+- return (struct kvm_s390_sie_block *) stack->empty1[0];
++ return (struct kvm_s390_sie_block *)stack->empty1[1];
+ }
+
+ static bool is_in_guest(struct pt_regs *regs)
+diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
+index 326cb8f75f58e..f0a1484ee00b0 100644
+--- a/arch/s390/kernel/time.c
++++ b/arch/s390/kernel/time.c
+@@ -364,7 +364,7 @@ static inline int check_sync_clock(void)
+ * Apply clock delta to the global data structures.
+ * This is called once on the CPU that performed the clock sync.
+ */
+-static void clock_sync_global(unsigned long delta)
++static void clock_sync_global(long delta)
+ {
+ unsigned long now, adj;
+ struct ptff_qto qto;
+@@ -400,7 +400,7 @@ static void clock_sync_global(unsigned long delta)
+ * Apply clock delta to the per-CPU data structures of this CPU.
+ * This is called for each online CPU after the call to clock_sync_global.
+ */
+-static void clock_sync_local(unsigned long delta)
++static void clock_sync_local(long delta)
+ {
+ /* Add the delta to the clock comparator. */
+ if (S390_lowcore.clock_comparator != clock_comparator_max) {
+@@ -424,7 +424,7 @@ static void __init time_init_wq(void)
+ struct clock_sync_data {
+ atomic_t cpus;
+ int in_sync;
+- unsigned long clock_delta;
++ long clock_delta;
+ };
+
+ /*
+@@ -544,7 +544,7 @@ static int stpinfo_valid(void)
+ static int stp_sync_clock(void *data)
+ {
+ struct clock_sync_data *sync = data;
+- u64 clock_delta, flags;
++ long clock_delta, flags;
+ static int first;
+ int rc;
+
+diff --git a/arch/sparc/kernel/signal32.c b/arch/sparc/kernel/signal32.c
+index f9fe502b81c65..dad38960d1a8a 100644
+--- a/arch/sparc/kernel/signal32.c
++++ b/arch/sparc/kernel/signal32.c
+@@ -779,5 +779,6 @@ static_assert(offsetof(compat_siginfo_t, si_upper) == 0x18);
+ static_assert(offsetof(compat_siginfo_t, si_pkey) == 0x14);
+ static_assert(offsetof(compat_siginfo_t, si_perf_data) == 0x10);
+ static_assert(offsetof(compat_siginfo_t, si_perf_type) == 0x14);
++static_assert(offsetof(compat_siginfo_t, si_perf_flags) == 0x18);
+ static_assert(offsetof(compat_siginfo_t, si_band) == 0x0c);
+ static_assert(offsetof(compat_siginfo_t, si_fd) == 0x10);
+diff --git a/arch/sparc/kernel/signal_64.c b/arch/sparc/kernel/signal_64.c
+index 8b9fc76cd3e02..570e43e6fda5c 100644
+--- a/arch/sparc/kernel/signal_64.c
++++ b/arch/sparc/kernel/signal_64.c
+@@ -590,5 +590,6 @@ static_assert(offsetof(siginfo_t, si_upper) == 0x28);
+ static_assert(offsetof(siginfo_t, si_pkey) == 0x20);
+ static_assert(offsetof(siginfo_t, si_perf_data) == 0x18);
+ static_assert(offsetof(siginfo_t, si_perf_type) == 0x20);
++static_assert(offsetof(siginfo_t, si_perf_flags) == 0x24);
+ static_assert(offsetof(siginfo_t, si_band) == 0x10);
+ static_assert(offsetof(siginfo_t, si_fd) == 0x14);
+diff --git a/arch/um/drivers/chan_user.c b/arch/um/drivers/chan_user.c
+index 6040817c036f3..25727ed648b72 100644
+--- a/arch/um/drivers/chan_user.c
++++ b/arch/um/drivers/chan_user.c
+@@ -220,7 +220,7 @@ static int winch_tramp(int fd, struct tty_port *port, int *fd_out,
+ unsigned long *stack_out)
+ {
+ struct winch_data data;
+- int fds[2], n, err;
++ int fds[2], n, err, pid;
+ char c;
+
+ err = os_pipe(fds, 1, 1);
+@@ -238,8 +238,9 @@ static int winch_tramp(int fd, struct tty_port *port, int *fd_out,
+ * problem with /dev/net/tun, which if held open by this
+ * thread, prevents the TUN/TAP device from being reused.
+ */
+- err = run_helper_thread(winch_thread, &data, CLONE_FILES, stack_out);
+- if (err < 0) {
++ pid = run_helper_thread(winch_thread, &data, CLONE_FILES, stack_out);
++ if (pid < 0) {
++ err = pid;
+ printk(UM_KERN_ERR "fork of winch_thread failed - errno = %d\n",
+ -err);
+ goto out_close;
+@@ -263,7 +264,7 @@ static int winch_tramp(int fd, struct tty_port *port, int *fd_out,
+ goto out_close;
+ }
+
+- return err;
++ return pid;
+
+ out_close:
+ close(fds[1]);
+diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c
+index ba562d68dc048..82ff3785bf69f 100644
+--- a/arch/um/drivers/virtio_uml.c
++++ b/arch/um/drivers/virtio_uml.c
+@@ -63,6 +63,7 @@ struct virtio_uml_device {
+
+ u8 config_changed_irq:1;
+ uint64_t vq_irq_vq_map;
++ int recv_rc;
+ };
+
+ struct virtio_uml_vq_info {
+@@ -148,14 +149,6 @@ static int vhost_user_recv(struct virtio_uml_device *vu_dev,
+
+ rc = vhost_user_recv_header(fd, msg);
+
+- if (rc == -ECONNRESET && vu_dev->registered) {
+- struct virtio_uml_platform_data *pdata;
+-
+- pdata = vu_dev->pdata;
+-
+- virtio_break_device(&vu_dev->vdev);
+- schedule_work(&pdata->conn_broken_wk);
+- }
+ if (rc)
+ return rc;
+ size = msg->header.size;
+@@ -164,6 +157,21 @@ static int vhost_user_recv(struct virtio_uml_device *vu_dev,
+ return full_read(fd, &msg->payload, size, false);
+ }
+
++static void vhost_user_check_reset(struct virtio_uml_device *vu_dev,
++ int rc)
++{
++ struct virtio_uml_platform_data *pdata = vu_dev->pdata;
++
++ if (rc != -ECONNRESET)
++ return;
++
++ if (!vu_dev->registered)
++ return;
++
++ virtio_break_device(&vu_dev->vdev);
++ schedule_work(&pdata->conn_broken_wk);
++}
++
+ static int vhost_user_recv_resp(struct virtio_uml_device *vu_dev,
+ struct vhost_user_msg *msg,
+ size_t max_payload_size)
+@@ -171,8 +179,10 @@ static int vhost_user_recv_resp(struct virtio_uml_device *vu_dev,
+ int rc = vhost_user_recv(vu_dev, vu_dev->sock, msg,
+ max_payload_size, true);
+
+- if (rc)
++ if (rc) {
++ vhost_user_check_reset(vu_dev, rc);
+ return rc;
++ }
+
+ if (msg->header.flags != (VHOST_USER_FLAG_REPLY | VHOST_USER_VERSION))
+ return -EPROTO;
+@@ -369,6 +379,7 @@ static irqreturn_t vu_req_read_message(struct virtio_uml_device *vu_dev,
+ sizeof(msg.msg.payload) +
+ sizeof(msg.extra_payload));
+
++ vu_dev->recv_rc = rc;
+ if (rc)
+ return IRQ_NONE;
+
+@@ -412,7 +423,9 @@ static irqreturn_t vu_req_interrupt(int irq, void *data)
+ if (!um_irq_timetravel_handler_used())
+ ret = vu_req_read_message(vu_dev, NULL);
+
+- if (vu_dev->vq_irq_vq_map) {
++ if (vu_dev->recv_rc) {
++ vhost_user_check_reset(vu_dev, vu_dev->recv_rc);
++ } else if (vu_dev->vq_irq_vq_map) {
+ struct virtqueue *vq;
+
+ virtio_device_for_each_vq((&vu_dev->vdev), vq) {
+diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild
+index f1f3f52f1e9cc..b2d834a29f3a9 100644
+--- a/arch/um/include/asm/Kbuild
++++ b/arch/um/include/asm/Kbuild
+@@ -4,6 +4,7 @@ generic-y += bug.h
+ generic-y += compat.h
+ generic-y += current.h
+ generic-y += device.h
++generic-y += dma-mapping.h
+ generic-y += emergency-restart.h
+ generic-y += exec.h
+ generic-y += extable.h
+diff --git a/arch/um/include/asm/thread_info.h b/arch/um/include/asm/thread_info.h
+index 1395cbd7e340d..c7b4b49826a2a 100644
+--- a/arch/um/include/asm/thread_info.h
++++ b/arch/um/include/asm/thread_info.h
+@@ -60,6 +60,7 @@ static inline struct thread_info *current_thread_info(void)
+ #define TIF_RESTORE_SIGMASK 7
+ #define TIF_NOTIFY_RESUME 8
+ #define TIF_SECCOMP 9 /* secure computing */
++#define TIF_SINGLESTEP 10 /* single stepping userspace */
+
+ #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
+ #define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
+@@ -68,5 +69,6 @@ static inline struct thread_info *current_thread_info(void)
+ #define _TIF_MEMDIE (1 << TIF_MEMDIE)
+ #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
+ #define _TIF_SECCOMP (1 << TIF_SECCOMP)
++#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
+
+ #endif
+diff --git a/arch/um/kernel/exec.c b/arch/um/kernel/exec.c
+index c85e40c72779f..58938d75871af 100644
+--- a/arch/um/kernel/exec.c
++++ b/arch/um/kernel/exec.c
+@@ -43,7 +43,7 @@ void start_thread(struct pt_regs *regs, unsigned long eip, unsigned long esp)
+ {
+ PT_REGS_IP(regs) = eip;
+ PT_REGS_SP(regs) = esp;
+- current->ptrace &= ~PT_DTRACE;
++ clear_thread_flag(TIF_SINGLESTEP);
+ #ifdef SUBARCH_EXECVE1
+ SUBARCH_EXECVE1(regs->regs);
+ #endif
+diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
+index 80504680be084..88c5c78442813 100644
+--- a/arch/um/kernel/process.c
++++ b/arch/um/kernel/process.c
+@@ -335,7 +335,7 @@ int singlestepping(void * t)
+ {
+ struct task_struct *task = t ? t : current;
+
+- if (!(task->ptrace & PT_DTRACE))
++ if (!test_thread_flag(TIF_SINGLESTEP))
+ return 0;
+
+ if (task->thread.singlestep_syscall)
+diff --git a/arch/um/kernel/ptrace.c b/arch/um/kernel/ptrace.c
+index bfaf6ab1ac037..5154b27de580f 100644
+--- a/arch/um/kernel/ptrace.c
++++ b/arch/um/kernel/ptrace.c
+@@ -11,7 +11,7 @@
+
+ void user_enable_single_step(struct task_struct *child)
+ {
+- child->ptrace |= PT_DTRACE;
++ set_tsk_thread_flag(child, TIF_SINGLESTEP);
+ child->thread.singlestep_syscall = 0;
+
+ #ifdef SUBARCH_SET_SINGLESTEPPING
+@@ -21,7 +21,7 @@ void user_enable_single_step(struct task_struct *child)
+
+ void user_disable_single_step(struct task_struct *child)
+ {
+- child->ptrace &= ~PT_DTRACE;
++ clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+ child->thread.singlestep_syscall = 0;
+
+ #ifdef SUBARCH_SET_SINGLESTEPPING
+@@ -120,7 +120,7 @@ static void send_sigtrap(struct uml_pt_regs *regs, int error_code)
+ }
+
+ /*
+- * XXX Check PT_DTRACE vs TIF_SINGLESTEP for singlestepping check and
++ * XXX Check TIF_SINGLESTEP for singlestepping check and
+ * PT_PTRACED vs TIF_SYSCALL_TRACE for syscall tracing check
+ */
+ int syscall_trace_enter(struct pt_regs *regs)
+@@ -144,7 +144,7 @@ void syscall_trace_leave(struct pt_regs *regs)
+ audit_syscall_exit(regs);
+
+ /* Fake a debug trap */
+- if (ptraced & PT_DTRACE)
++ if (test_thread_flag(TIF_SINGLESTEP))
+ send_sigtrap(®s->regs, 0);
+
+ if (!test_thread_flag(TIF_SYSCALL_TRACE))
+diff --git a/arch/um/kernel/signal.c b/arch/um/kernel/signal.c
+index 88cd9b5c1b744..ae4658f576ab7 100644
+--- a/arch/um/kernel/signal.c
++++ b/arch/um/kernel/signal.c
+@@ -53,7 +53,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
+ unsigned long sp;
+ int err;
+
+- if ((current->ptrace & PT_DTRACE) && (current->ptrace & PT_PTRACED))
++ if (test_thread_flag(TIF_SINGLESTEP) && (current->ptrace & PT_PTRACED))
+ singlestep = 1;
+
+ /* Did we come from a system call? */
+@@ -128,7 +128,7 @@ void do_signal(struct pt_regs *regs)
+ * on the host. The tracing thread will check this flag and
+ * PTRACE_SYSCALL if necessary.
+ */
+- if (current->ptrace & PT_DTRACE)
++ if (test_thread_flag(TIF_SINGLESTEP))
+ current->thread.singlestep_syscall =
+ is_syscall(PT_REGS_IP(¤t->thread.regs));
+
+diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
+index 4bed3abf444d1..b2c65f5733538 100644
+--- a/arch/x86/Kconfig
++++ b/arch/x86/Kconfig
+@@ -1313,7 +1313,7 @@ config MICROCODE
+
+ config MICROCODE_INTEL
+ bool "Intel microcode loading support"
+- depends on MICROCODE
++ depends on CPU_SUP_INTEL && MICROCODE
+ default MICROCODE
+ help
+ This options enables microcode patch loading support for Intel
+@@ -1325,7 +1325,7 @@ config MICROCODE_INTEL
+
+ config MICROCODE_AMD
+ bool "AMD microcode loading support"
+- depends on MICROCODE
++ depends on CPU_SUP_AMD && MICROCODE
+ help
+ If you select this option, microcode patch loading support for AMD
+ processors will be enabled.
+diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
+index 73d958522b6a4..d8376e5fe1afe 100644
+--- a/arch/x86/entry/entry_64.S
++++ b/arch/x86/entry/entry_64.S
+@@ -508,6 +508,7 @@ SYM_CODE_START(\asmsym)
+ call vc_switch_off_ist
+ movq %rax, %rsp /* Switch to new stack */
+
++ ENCODE_FRAME_POINTER
+ UNWIND_HINT_REGS
+
+ /* Update pt_regs */
+diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
+index 235a5794296ac..1000d457c3321 100644
+--- a/arch/x86/entry/vdso/vma.c
++++ b/arch/x86/entry/vdso/vma.c
+@@ -438,7 +438,7 @@ bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
+ static __init int vdso_setup(char *s)
+ {
+ vdso64_enabled = simple_strtoul(s, NULL, 0);
+- return 0;
++ return 1;
+ }
+ __setup("vdso=", vdso_setup);
+
+diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
+index 9739019d4b67a..2704ec1e42a30 100644
+--- a/arch/x86/events/amd/ibs.c
++++ b/arch/x86/events/amd/ibs.c
+@@ -304,6 +304,16 @@ static int perf_ibs_init(struct perf_event *event)
+ hwc->config_base = perf_ibs->msr;
+ hwc->config = config;
+
++ /*
++ * rip recorded by IbsOpRip will not be consistent with rsp and rbp
++ * recorded as part of interrupt regs. Thus we need to use rip from
++ * interrupt regs while unwinding call stack. Setting _EARLY flag
++ * makes sure we unwind call-stack before perf sample rip is set to
++ * IbsOpRip.
++ */
++ if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
++ event->attr.sample_type |= __PERF_SAMPLE_CALLCHAIN_EARLY;
++
+ return 0;
+ }
+
+@@ -687,6 +697,14 @@ fail:
+ data.raw = &raw;
+ }
+
++ /*
++ * rip recorded by IbsOpRip will not be consistent with rsp and rbp
++ * recorded as part of interrupt regs. Thus we need to use rip from
++ * interrupt regs while unwinding call stack.
++ */
++ if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
++ data.callchain = perf_callchain(event, iregs);
++
+ throttle = perf_event_overflow(event, &data, ®s);
+ out:
+ if (throttle) {
+@@ -759,9 +777,10 @@ static __init int perf_ibs_pmu_init(struct perf_ibs *perf_ibs, char *name)
+ return ret;
+ }
+
+-static __init void perf_event_ibs_init(void)
++static __init int perf_event_ibs_init(void)
+ {
+ struct attribute **attr = ibs_op_format_attrs;
++ int ret;
+
+ /*
+ * Some chips fail to reset the fetch count when it is written; instead
+@@ -773,7 +792,9 @@ static __init void perf_event_ibs_init(void)
+ if (boot_cpu_data.x86 == 0x19 && boot_cpu_data.x86_model < 0x10)
+ perf_ibs_fetch.fetch_ignore_if_zero_rip = 1;
+
+- perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch");
++ ret = perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch");
++ if (ret)
++ return ret;
+
+ if (ibs_caps & IBS_CAPS_OPCNT) {
+ perf_ibs_op.config_mask |= IBS_OP_CNT_CTL;
+@@ -786,15 +807,35 @@ static __init void perf_event_ibs_init(void)
+ perf_ibs_op.cnt_mask |= IBS_OP_MAX_CNT_EXT_MASK;
+ }
+
+- perf_ibs_pmu_init(&perf_ibs_op, "ibs_op");
++ ret = perf_ibs_pmu_init(&perf_ibs_op, "ibs_op");
++ if (ret)
++ goto err_op;
++
++ ret = register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs");
++ if (ret)
++ goto err_nmi;
+
+- register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs");
+ pr_info("perf: AMD IBS detected (0x%08x)\n", ibs_caps);
++ return 0;
++
++err_nmi:
++ perf_pmu_unregister(&perf_ibs_op.pmu);
++ free_percpu(perf_ibs_op.pcpu);
++ perf_ibs_op.pcpu = NULL;
++err_op:
++ perf_pmu_unregister(&perf_ibs_fetch.pmu);
++ free_percpu(perf_ibs_fetch.pcpu);
++ perf_ibs_fetch.pcpu = NULL;
++
++ return ret;
+ }
+
+ #else /* defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD) */
+
+-static __init void perf_event_ibs_init(void) { }
++static __init int perf_event_ibs_init(void)
++{
++ return 0;
++}
+
+ #endif
+
+@@ -1064,9 +1105,7 @@ static __init int amd_ibs_init(void)
+ x86_pmu_amd_ibs_starting_cpu,
+ x86_pmu_amd_ibs_dying_cpu);
+
+- perf_event_ibs_init();
+-
+- return 0;
++ return perf_event_ibs_init();
+ }
+
+ /* Since we need the pci subsystem to init ibs we can't do this earlier: */
+diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
+index fc7f458eb3de6..c6e7d358fb8d8 100644
+--- a/arch/x86/events/intel/core.c
++++ b/arch/x86/events/intel/core.c
+@@ -276,7 +276,7 @@ static struct event_constraint intel_icl_event_constraints[] = {
+ INTEL_EVENT_CONSTRAINT_RANGE(0x03, 0x0a, 0xf),
+ INTEL_EVENT_CONSTRAINT_RANGE(0x1f, 0x28, 0xf),
+ INTEL_EVENT_CONSTRAINT(0x32, 0xf), /* SW_PREFETCH_ACCESS.* */
+- INTEL_EVENT_CONSTRAINT_RANGE(0x48, 0x54, 0xf),
++ INTEL_EVENT_CONSTRAINT_RANGE(0x48, 0x56, 0xf),
+ INTEL_EVENT_CONSTRAINT_RANGE(0x60, 0x8b, 0xf),
+ INTEL_UEVENT_CONSTRAINT(0x04a3, 0xff), /* CYCLE_ACTIVITY.STALLS_TOTAL */
+ INTEL_UEVENT_CONSTRAINT(0x10a3, 0xff), /* CYCLE_ACTIVITY.CYCLES_MEM_ANY */
+diff --git a/arch/x86/include/asm/acenv.h b/arch/x86/include/asm/acenv.h
+index 9aff97f0de7fd..d937c55e717e6 100644
+--- a/arch/x86/include/asm/acenv.h
++++ b/arch/x86/include/asm/acenv.h
+@@ -13,7 +13,19 @@
+
+ /* Asm macros */
+
+-#define ACPI_FLUSH_CPU_CACHE() wbinvd()
++/*
++ * ACPI_FLUSH_CPU_CACHE() flushes caches on entering sleep states.
++ * It is required to prevent data loss.
++ *
++ * While running inside virtual machine, the kernel can bypass cache flushing.
++ * Changing sleep state in a virtual machine doesn't affect the host system
++ * sleep state and cannot lead to data loss.
++ */
++#define ACPI_FLUSH_CPU_CACHE() \
++do { \
++ if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) \
++ wbinvd(); \
++} while (0)
+
+ int __acpi_acquire_global_lock(unsigned int *lock);
+ int __acpi_release_global_lock(unsigned int *lock);
+diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
+index 11b7c06e2828c..6ad8d946cd3eb 100644
+--- a/arch/x86/include/asm/kexec.h
++++ b/arch/x86/include/asm/kexec.h
+@@ -186,6 +186,14 @@ extern int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages,
+ extern void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages);
+ #define arch_kexec_pre_free_pages arch_kexec_pre_free_pages
+
++#ifdef CONFIG_KEXEC_FILE
++struct purgatory_info;
++int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
++ Elf_Shdr *section,
++ const Elf_Shdr *relsec,
++ const Elf_Shdr *symtab);
++#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
++#endif
+ #endif
+
+ typedef void crash_vmclear_fn(void);
+diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
+index 78ca535124864..b45c4d27fd46e 100644
+--- a/arch/x86/include/asm/set_memory.h
++++ b/arch/x86/include/asm/set_memory.h
+@@ -86,56 +86,4 @@ bool kernel_page_present(struct page *page);
+
+ extern int kernel_set_to_readonly;
+
+-#ifdef CONFIG_X86_64
+-/*
+- * Prevent speculative access to the page by either unmapping
+- * it (if we do not require access to any part of the page) or
+- * marking it uncacheable (if we want to try to retrieve data
+- * from non-poisoned lines in the page).
+- */
+-static inline int set_mce_nospec(unsigned long pfn, bool unmap)
+-{
+- unsigned long decoy_addr;
+- int rc;
+-
+- /* SGX pages are not in the 1:1 map */
+- if (arch_is_platform_page(pfn << PAGE_SHIFT))
+- return 0;
+- /*
+- * We would like to just call:
+- * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
+- * but doing that would radically increase the odds of a
+- * speculative access to the poison page because we'd have
+- * the virtual address of the kernel 1:1 mapping sitting
+- * around in registers.
+- * Instead we get tricky. We create a non-canonical address
+- * that looks just like the one we want, but has bit 63 flipped.
+- * This relies on set_memory_XX() properly sanitizing any __pa()
+- * results with __PHYSICAL_MASK or PTE_PFN_MASK.
+- */
+- decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
+-
+- if (unmap)
+- rc = set_memory_np(decoy_addr, 1);
+- else
+- rc = set_memory_uc(decoy_addr, 1);
+- if (rc)
+- pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
+- return rc;
+-}
+-#define set_mce_nospec set_mce_nospec
+-
+-/* Restore full speculative operation to the pfn. */
+-static inline int clear_mce_nospec(unsigned long pfn)
+-{
+- return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
+-}
+-#define clear_mce_nospec clear_mce_nospec
+-#else
+-/*
+- * Few people would run a 32-bit kernel on a machine that supports
+- * recoverable errors because they have too much memory to boot 32-bit.
+- */
+-#endif
+-
+ #endif /* _ASM_X86_SET_MEMORY_H */
+diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h
+index 7b132d0312ebf..a800abb1a9925 100644
+--- a/arch/x86/include/asm/suspend_32.h
++++ b/arch/x86/include/asm/suspend_32.h
+@@ -19,7 +19,6 @@ struct saved_context {
+ u16 gs;
+ unsigned long cr0, cr2, cr3, cr4;
+ u64 misc_enable;
+- bool misc_enable_saved;
+ struct saved_msrs saved_msrs;
+ struct desc_ptr gdt_desc;
+ struct desc_ptr idt;
+@@ -28,6 +27,7 @@ struct saved_context {
+ unsigned long tr;
+ unsigned long safety;
+ unsigned long return_address;
++ bool misc_enable_saved;
+ } __attribute__((packed));
+
+ /* routines for saving/restoring kernel state */
+diff --git a/arch/x86/include/asm/suspend_64.h b/arch/x86/include/asm/suspend_64.h
+index 35bb35d28733e..54df06687d834 100644
+--- a/arch/x86/include/asm/suspend_64.h
++++ b/arch/x86/include/asm/suspend_64.h
+@@ -14,9 +14,13 @@
+ * Image of the saved processor state, used by the low level ACPI suspend to
+ * RAM code and by the low level hibernation code.
+ *
+- * If you modify it, fix arch/x86/kernel/acpi/wakeup_64.S and make sure that
+- * __save/__restore_processor_state(), defined in arch/x86/kernel/suspend_64.c,
+- * still work as required.
++ * If you modify it, check how it is used in arch/x86/kernel/acpi/wakeup_64.S
++ * and make sure that __save/__restore_processor_state(), defined in
++ * arch/x86/power/cpu.c, still work as required.
++ *
++ * Because the structure is packed, make sure to avoid unaligned members. For
++ * optimisation purposes but also because tools like kmemleak only search for
++ * pointers that are aligned.
+ */
+ struct saved_context {
+ struct pt_regs regs;
+@@ -36,7 +40,6 @@ struct saved_context {
+
+ unsigned long cr0, cr2, cr3, cr4;
+ u64 misc_enable;
+- bool misc_enable_saved;
+ struct saved_msrs saved_msrs;
+ unsigned long efer;
+ u16 gdt_pad; /* Unused */
+@@ -48,6 +51,7 @@ struct saved_context {
+ unsigned long tr;
+ unsigned long safety;
+ unsigned long return_address;
++ bool misc_enable_saved;
+ } __attribute__((packed));
+
+ #define loaddebug(thread,register) \
+diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
+index b70344bf66008..ed7d9cf71f68d 100644
+--- a/arch/x86/kernel/apic/apic.c
++++ b/arch/x86/kernel/apic/apic.c
+@@ -170,7 +170,7 @@ static __init int setup_apicpmtimer(char *s)
+ {
+ apic_calibrate_pmtmr = 1;
+ notsc_setup(NULL);
+- return 0;
++ return 1;
+ }
+ __setup("apicpmtimer", setup_apicpmtimer);
+ #endif
+diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
+index f5a48e66e4f54..a6e9c2794ef56 100644
+--- a/arch/x86/kernel/apic/x2apic_uv_x.c
++++ b/arch/x86/kernel/apic/x2apic_uv_x.c
+@@ -199,7 +199,13 @@ static void __init uv_tsc_check_sync(void)
+ int mmr_shift;
+ char *state;
+
+- /* Different returns from different UV BIOS versions */
++ /* UV5 guarantees synced TSCs; do not zero TSC_ADJUST */
++ if (!is_uv(UV2|UV3|UV4)) {
++ mark_tsc_async_resets("UV5+");
++ return;
++ }
++
++ /* UV2,3,4, UV BIOS TSC sync state available */
+ mmr = uv_early_read_mmr(UVH_TSC_SYNC_MMR);
+ mmr_shift =
+ is_uv2_hub() ? UVH_TSC_SYNC_SHIFT_UV2K : UVH_TSC_SYNC_SHIFT;
+diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
+index f7a5370a9b3b8..2c87d62f191e2 100644
+--- a/arch/x86/kernel/cpu/intel.c
++++ b/arch/x86/kernel/cpu/intel.c
+@@ -91,7 +91,7 @@ static bool ring3mwait_disabled __read_mostly;
+ static int __init ring3mwait_disable(char *__unused)
+ {
+ ring3mwait_disabled = true;
+- return 0;
++ return 1;
+ }
+ __setup("ring3mwait=disable", ring3mwait_disable);
+
+diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
+index 1940d305db1c0..1c87501e0fa3d 100644
+--- a/arch/x86/kernel/cpu/mce/amd.c
++++ b/arch/x86/kernel/cpu/mce/amd.c
+@@ -1294,10 +1294,23 @@ out_free:
+ kfree(bank);
+ }
+
++static void __threshold_remove_device(struct threshold_bank **bp)
++{
++ unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
++
++ for (bank = 0; bank < numbanks; bank++) {
++ if (!bp[bank])
++ continue;
++
++ threshold_remove_bank(bp[bank]);
++ bp[bank] = NULL;
++ }
++ kfree(bp);
++}
++
+ int mce_threshold_remove_device(unsigned int cpu)
+ {
+ struct threshold_bank **bp = this_cpu_read(threshold_banks);
+- unsigned int bank, numbanks = this_cpu_read(mce_num_banks);
+
+ if (!bp)
+ return 0;
+@@ -1308,13 +1321,7 @@ int mce_threshold_remove_device(unsigned int cpu)
+ */
+ this_cpu_write(threshold_banks, NULL);
+
+- for (bank = 0; bank < numbanks; bank++) {
+- if (bp[bank]) {
+- threshold_remove_bank(bp[bank]);
+- bp[bank] = NULL;
+- }
+- }
+- kfree(bp);
++ __threshold_remove_device(bp);
+ return 0;
+ }
+
+@@ -1351,15 +1358,14 @@ int mce_threshold_create_device(unsigned int cpu)
+ if (!(this_cpu_read(bank_map) & (1 << bank)))
+ continue;
+ err = threshold_create_bank(bp, cpu, bank);
+- if (err)
+- goto out_err;
++ if (err) {
++ __threshold_remove_device(bp);
++ return err;
++ }
+ }
+ this_cpu_write(threshold_banks, bp);
+
+ if (thresholding_irq_en)
+ mce_threshold_vector = amd_threshold_interrupt;
+ return 0;
+-out_err:
+- mce_threshold_remove_device(cpu);
+- return err;
+ }
+diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
+index 981496e6bc0e4..fa67bb9d1afed 100644
+--- a/arch/x86/kernel/cpu/mce/core.c
++++ b/arch/x86/kernel/cpu/mce/core.c
+@@ -579,7 +579,7 @@ static int uc_decode_notifier(struct notifier_block *nb, unsigned long val,
+
+ pfn = mce->addr >> PAGE_SHIFT;
+ if (!memory_failure(pfn, 0)) {
+- set_mce_nospec(pfn, whole_page(mce));
++ set_mce_nospec(pfn);
+ mce->kflags |= MCE_HANDLED_UC;
+ }
+
+@@ -1316,7 +1316,7 @@ static void kill_me_maybe(struct callback_head *cb)
+
+ ret = memory_failure(p->mce_addr >> PAGE_SHIFT, flags);
+ if (!ret) {
+- set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
++ set_mce_nospec(p->mce_addr >> PAGE_SHIFT);
+ sync_core();
+ return;
+ }
+@@ -1342,7 +1342,7 @@ static void kill_me_never(struct callback_head *cb)
+ p->mce_count = 0;
+ pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr);
+ if (!memory_failure(p->mce_addr >> PAGE_SHIFT, 0))
+- set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
++ set_mce_nospec(p->mce_addr >> PAGE_SHIFT);
+ }
+
+ static void queue_task_work(struct mce *m, char *msg, void (*func)(struct callback_head *))
+diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
+index 3c24e6124d955..19876ebfb5044 100644
+--- a/arch/x86/kernel/cpu/sgx/encl.c
++++ b/arch/x86/kernel/cpu/sgx/encl.c
+@@ -152,7 +152,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
+
+ page_pcmd_off = sgx_encl_get_backing_page_pcmd_offset(encl, page_index);
+
+- ret = sgx_encl_get_backing(encl, page_index, &b);
++ ret = sgx_encl_lookup_backing(encl, page_index, &b);
+ if (ret)
+ return ret;
+
+@@ -718,7 +718,7 @@ static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
+ * 0 on success,
+ * -errno otherwise.
+ */
+-int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
++static int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
+ struct sgx_backing *backing)
+ {
+ pgoff_t page_pcmd_off = sgx_encl_get_backing_page_pcmd_offset(encl, page_index);
+@@ -743,6 +743,107 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
+ return 0;
+ }
+
++/*
++ * When called from ksgxd, returns the mem_cgroup of a struct mm stored
++ * in the enclave's mm_list. When not called from ksgxd, just returns
++ * the mem_cgroup of the current task.
++ */
++static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
++{
++ struct mem_cgroup *memcg = NULL;
++ struct sgx_encl_mm *encl_mm;
++ int idx;
++
++ /*
++ * If called from normal task context, return the mem_cgroup
++ * of the current task's mm. The remainder of the handling is for
++ * ksgxd.
++ */
++ if (!current_is_ksgxd())
++ return get_mem_cgroup_from_mm(current->mm);
++
++ /*
++ * Search the enclave's mm_list to find an mm associated with
++ * this enclave to charge the allocation to.
++ */
++ idx = srcu_read_lock(&encl->srcu);
++
++ list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
++ if (!mmget_not_zero(encl_mm->mm))
++ continue;
++
++ memcg = get_mem_cgroup_from_mm(encl_mm->mm);
++
++ mmput_async(encl_mm->mm);
++
++ break;
++ }
++
++ srcu_read_unlock(&encl->srcu, idx);
++
++ /*
++ * In the rare case that there isn't an mm associated with
++ * the enclave, set memcg to the current active mem_cgroup.
++ * This will be the root mem_cgroup if there is no active
++ * mem_cgroup.
++ */
++ if (!memcg)
++ return get_mem_cgroup_from_mm(NULL);
++
++ return memcg;
++}
++
++/**
++ * sgx_encl_alloc_backing() - allocate a new backing storage page
++ * @encl: an enclave pointer
++ * @page_index: enclave page index
++ * @backing: data for accessing backing storage for the page
++ *
++ * When called from ksgxd, sets the active memcg from one of the
++ * mms in the enclave's mm_list prior to any backing page allocation,
++ * in order to ensure that shmem page allocations are charged to the
++ * enclave.
++ *
++ * Return:
++ * 0 on success,
++ * -errno otherwise.
++ */
++int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
++ struct sgx_backing *backing)
++{
++ struct mem_cgroup *encl_memcg = sgx_encl_get_mem_cgroup(encl);
++ struct mem_cgroup *memcg = set_active_memcg(encl_memcg);
++ int ret;
++
++ ret = sgx_encl_get_backing(encl, page_index, backing);
++
++ set_active_memcg(memcg);
++ mem_cgroup_put(encl_memcg);
++
++ return ret;
++}
++
++/**
++ * sgx_encl_lookup_backing() - retrieve an existing backing storage page
++ * @encl: an enclave pointer
++ * @page_index: enclave page index
++ * @backing: data for accessing backing storage for the page
++ *
++ * Retrieve a backing page for loading data back into an EPC page with ELDU.
++ * It is the caller's responsibility to ensure that it is appropriate to use
++ * sgx_encl_lookup_backing() rather than sgx_encl_alloc_backing(). If lookup is
++ * not used correctly, this will cause an allocation which is not accounted for.
++ *
++ * Return:
++ * 0 on success,
++ * -errno otherwise.
++ */
++int sgx_encl_lookup_backing(struct sgx_encl *encl, unsigned long page_index,
++ struct sgx_backing *backing)
++{
++ return sgx_encl_get_backing(encl, page_index, backing);
++}
++
+ /**
+ * sgx_encl_put_backing() - Unpin the backing storage
+ * @backing: data for accessing backing storage for the page
+diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
+index d44e7372151f0..332ef3568267e 100644
+--- a/arch/x86/kernel/cpu/sgx/encl.h
++++ b/arch/x86/kernel/cpu/sgx/encl.h
+@@ -103,10 +103,13 @@ static inline int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
+ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
+ unsigned long end, unsigned long vm_flags);
+
++bool current_is_ksgxd(void);
+ void sgx_encl_release(struct kref *ref);
+ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
+-int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
+- struct sgx_backing *backing);
++int sgx_encl_lookup_backing(struct sgx_encl *encl, unsigned long page_index,
++ struct sgx_backing *backing);
++int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
++ struct sgx_backing *backing);
+ void sgx_encl_put_backing(struct sgx_backing *backing);
+ int sgx_encl_test_and_clear_young(struct mm_struct *mm,
+ struct sgx_encl_page *page);
+diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
+index ab4ec54bbdd94..a78652d43e61b 100644
+--- a/arch/x86/kernel/cpu/sgx/main.c
++++ b/arch/x86/kernel/cpu/sgx/main.c
+@@ -313,7 +313,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
+ sgx_encl_put_backing(backing);
+
+ if (!encl->secs_child_cnt && test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) {
+- ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size),
++ ret = sgx_encl_alloc_backing(encl, PFN_DOWN(encl->size),
+ &secs_backing);
+ if (ret)
+ goto out;
+@@ -384,7 +384,7 @@ static void sgx_reclaim_pages(void)
+ page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base);
+
+ mutex_lock(&encl_page->encl->lock);
+- ret = sgx_encl_get_backing(encl_page->encl, page_index, &backing[i]);
++ ret = sgx_encl_alloc_backing(encl_page->encl, page_index, &backing[i]);
+ if (ret) {
+ mutex_unlock(&encl_page->encl->lock);
+ goto skip;
+@@ -475,6 +475,11 @@ static bool __init sgx_page_reclaimer_init(void)
+ return true;
+ }
+
++bool current_is_ksgxd(void)
++{
++ return current == ksgxd_tsk;
++}
++
+ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid)
+ {
+ struct sgx_numa_node *node = &sgx_numa_nodes[nid];
+diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
+index 566bb8e171492..0611fd83858e6 100644
+--- a/arch/x86/kernel/machine_kexec_64.c
++++ b/arch/x86/kernel/machine_kexec_64.c
+@@ -376,9 +376,6 @@ void machine_kexec(struct kimage *image)
+ #ifdef CONFIG_KEXEC_FILE
+ void *arch_kexec_kernel_image_load(struct kimage *image)
+ {
+- vfree(image->elf_headers);
+- image->elf_headers = NULL;
+-
+ if (!image->fops || !image->fops->load)
+ return ERR_PTR(-ENOEXEC);
+
+@@ -514,6 +511,15 @@ overflow:
+ (int)ELF64_R_TYPE(rel[i].r_info), value);
+ return -ENOEXEC;
+ }
++
++int arch_kimage_file_post_load_cleanup(struct kimage *image)
++{
++ vfree(image->elf_headers);
++ image->elf_headers = NULL;
++ image->elf_headers_sz = 0;
++
++ return kexec_image_post_load_cleanup_default(image);
++}
+ #endif /* CONFIG_KEXEC_FILE */
+
+ static int
+diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
+index b52407c56000e..879ef8c72f5c0 100644
+--- a/arch/x86/kernel/signal_compat.c
++++ b/arch/x86/kernel/signal_compat.c
+@@ -149,8 +149,10 @@ static inline void signal_compat_build_tests(void)
+
+ BUILD_BUG_ON(offsetof(siginfo_t, si_perf_data) != 0x18);
+ BUILD_BUG_ON(offsetof(siginfo_t, si_perf_type) != 0x20);
++ BUILD_BUG_ON(offsetof(siginfo_t, si_perf_flags) != 0x24);
+ BUILD_BUG_ON(offsetof(compat_siginfo_t, si_perf_data) != 0x10);
+ BUILD_BUG_ON(offsetof(compat_siginfo_t, si_perf_type) != 0x14);
++ BUILD_BUG_ON(offsetof(compat_siginfo_t, si_perf_flags) != 0x18);
+
+ CHECK_CSI_OFFSET(_sigpoll);
+ CHECK_CSI_SIZE (_sigpoll, 2*sizeof(int));
+diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
+index 0f3c307b37b3a..8e2b2552b5eea 100644
+--- a/arch/x86/kernel/step.c
++++ b/arch/x86/kernel/step.c
+@@ -180,8 +180,7 @@ void set_task_blockstep(struct task_struct *task, bool on)
+ *
+ * NOTE: this means that set/clear TIF_BLOCKSTEP is only safe if
+ * task is current or it can't be running, otherwise we can race
+- * with __switch_to_xtra(). We rely on ptrace_freeze_traced() but
+- * PTRACE_KILL is not safe.
++ * with __switch_to_xtra(). We rely on ptrace_freeze_traced().
+ */
+ local_irq_disable();
+ debugctl = get_debugctlmsr();
+diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
+index 660b78827638f..8cc653ffdccd7 100644
+--- a/arch/x86/kernel/sys_x86_64.c
++++ b/arch/x86/kernel/sys_x86_64.c
+@@ -68,9 +68,6 @@ static int __init control_va_addr_alignment(char *str)
+ if (*str == 0)
+ return 1;
+
+- if (*str == '=')
+- str++;
+-
+ if (!strcmp(str, "32"))
+ va_align.flags = ALIGN_VA_32;
+ else if (!strcmp(str, "64"))
+@@ -80,11 +77,11 @@ static int __init control_va_addr_alignment(char *str)
+ else if (!strcmp(str, "on"))
+ va_align.flags = ALIGN_VA_32 | ALIGN_VA_64;
+ else
+- return 0;
++ pr_warn("invalid option value: 'align_va_addr=%s'\n", str);
+
+ return 1;
+ }
+-__setup("align_va_addr", control_va_addr_alignment);
++__setup("align_va_addr=", control_va_addr_alignment);
+
+ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
+ unsigned long, prot, unsigned long, flags,
+diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
+index 66b0eb0bda94e..6268880c8eed6 100644
+--- a/arch/x86/kvm/lapic.c
++++ b/arch/x86/kvm/lapic.c
+@@ -1548,6 +1548,7 @@ static void cancel_apic_timer(struct kvm_lapic *apic)
+ if (apic->lapic_timer.hv_timer_in_use)
+ cancel_hv_timer(apic);
+ preempt_enable();
++ atomic_set(&apic->lapic_timer.pending, 0);
+ }
+
+ static void apic_update_lvtt(struct kvm_lapic *apic)
+diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
+index 880d0b0c9315b..ee7df31883cd9 100644
+--- a/arch/x86/kvm/vmx/nested.c
++++ b/arch/x86/kvm/vmx/nested.c
+@@ -3695,12 +3695,34 @@ vmcs12_guest_cr4(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
+ }
+
+ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
+- struct vmcs12 *vmcs12)
++ struct vmcs12 *vmcs12,
++ u32 vm_exit_reason, u32 exit_intr_info)
+ {
+ u32 idt_vectoring;
+ unsigned int nr;
+
+- if (vcpu->arch.exception.injected) {
++ /*
++ * Per the SDM, VM-Exits due to double and triple faults are never
++ * considered to occur during event delivery, even if the double/triple
++ * fault is the result of an escalating vectoring issue.
++ *
++ * Note, the SDM qualifies the double fault behavior with "The original
++ * event results in a double-fault exception". It's unclear why the
++ * qualification exists since exits due to double fault can occur only
++ * while vectoring a different exception (injected events are never
++ * subject to interception), i.e. there's _always_ an original event.
++ *
++ * The SDM also uses NMI as a confusing example for the "original event
++ * causes the VM exit directly" clause. NMI isn't special in any way,
++ * the same rule applies to all events that cause an exit directly.
++ * NMI is an odd choice for the example because NMIs can only occur on
++ * instruction boundaries, i.e. they _can't_ occur during vectoring.
++ */
++ if ((u16)vm_exit_reason == EXIT_REASON_TRIPLE_FAULT ||
++ ((u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
++ is_double_fault(exit_intr_info))) {
++ vmcs12->idt_vectoring_info_field = 0;
++ } else if (vcpu->arch.exception.injected) {
+ nr = vcpu->arch.exception.nr;
+ idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
+
+@@ -3733,6 +3755,8 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
+ idt_vectoring |= INTR_TYPE_EXT_INTR;
+
+ vmcs12->idt_vectoring_info_field = idt_vectoring;
++ } else {
++ vmcs12->idt_vectoring_info_field = 0;
+ }
+ }
+
+@@ -4202,12 +4226,12 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
+ if (to_vmx(vcpu)->exit_reason.enclave_mode)
+ vmcs12->vm_exit_reason |= VMX_EXIT_REASONS_SGX_ENCLAVE_MODE;
+ vmcs12->exit_qualification = exit_qualification;
+- vmcs12->vm_exit_intr_info = exit_intr_info;
+-
+- vmcs12->idt_vectoring_info_field = 0;
+- vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
+- vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
+
++ /*
++ * On VM-Exit due to a failed VM-Entry, the VMCS isn't marked launched
++ * and only EXIT_REASON and EXIT_QUALIFICATION are updated, all other
++ * exit info fields are unmodified.
++ */
+ if (!(vmcs12->vm_exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) {
+ vmcs12->launch_state = 1;
+
+@@ -4219,7 +4243,12 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
+ * Transfer the event that L0 or L1 may wanted to inject into
+ * L2 to IDT_VECTORING_INFO_FIELD.
+ */
+- vmcs12_save_pending_event(vcpu, vmcs12);
++ vmcs12_save_pending_event(vcpu, vmcs12,
++ vm_exit_reason, exit_intr_info);
++
++ vmcs12->vm_exit_intr_info = exit_intr_info;
++ vmcs12->vm_exit_instruction_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
++ vmcs12->vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
+
+ /*
+ * According to spec, there's no need to store the guest's
+diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
+index e325c290a8162..2b9d7a7e83f77 100644
+--- a/arch/x86/kvm/vmx/vmcs.h
++++ b/arch/x86/kvm/vmx/vmcs.h
+@@ -104,6 +104,11 @@ static inline bool is_breakpoint(u32 intr_info)
+ return is_exception_n(intr_info, BP_VECTOR);
+ }
+
++static inline bool is_double_fault(u32 intr_info)
++{
++ return is_exception_n(intr_info, DF_VECTOR);
++}
++
+ static inline bool is_page_fault(u32 intr_info)
+ {
+ return is_exception_n(intr_info, PF_VECTOR);
+diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
+index 65d15df6212d6..0e65d00e2339f 100644
+--- a/arch/x86/lib/delay.c
++++ b/arch/x86/lib/delay.c
+@@ -54,8 +54,8 @@ static void delay_loop(u64 __loops)
+ " jnz 2b \n"
+ "3: dec %0 \n"
+
+- : /* we don't need output */
+- :"a" (loops)
++ : "+a" (loops)
++ :
+ );
+ }
+
+diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
+index 4ba2a3ee4bce1..d5ef64ddd35e9 100644
+--- a/arch/x86/mm/pat/memtype.c
++++ b/arch/x86/mm/pat/memtype.c
+@@ -101,7 +101,7 @@ int pat_debug_enable;
+ static int __init pat_debug_setup(char *str)
+ {
+ pat_debug_enable = 1;
+- return 0;
++ return 1;
+ }
+ __setup("debugpat", pat_debug_setup);
+
+diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
+index 0656db33574d3..1abd5438f1269 100644
+--- a/arch/x86/mm/pat/set_memory.c
++++ b/arch/x86/mm/pat/set_memory.c
+@@ -19,6 +19,7 @@
+ #include <linux/vmstat.h>
+ #include <linux/kernel.h>
+ #include <linux/cc_platform.h>
++#include <linux/set_memory.h>
+
+ #include <asm/e820/api.h>
+ #include <asm/processor.h>
+@@ -29,7 +30,6 @@
+ #include <asm/pgalloc.h>
+ #include <asm/proto.h>
+ #include <asm/memtype.h>
+-#include <asm/set_memory.h>
+ #include <asm/hyperv-tlfs.h>
+ #include <asm/mshyperv.h>
+
+@@ -1805,7 +1805,7 @@ static inline int cpa_clear_pages_array(struct page **pages, int numpages,
+ }
+
+ /*
+- * _set_memory_prot is an internal helper for callers that have been passed
++ * __set_memory_prot is an internal helper for callers that have been passed
+ * a pgprot_t value from upper layers and a reservation has already been taken.
+ * If you want to set the pgprot to a specific page protocol, use the
+ * set_memory_xx() functions.
+@@ -1914,6 +1914,51 @@ int set_memory_wb(unsigned long addr, int numpages)
+ }
+ EXPORT_SYMBOL(set_memory_wb);
+
++/* Prevent speculative access to a page by marking it not-present */
++#ifdef CONFIG_X86_64
++int set_mce_nospec(unsigned long pfn)
++{
++ unsigned long decoy_addr;
++ int rc;
++
++ /* SGX pages are not in the 1:1 map */
++ if (arch_is_platform_page(pfn << PAGE_SHIFT))
++ return 0;
++ /*
++ * We would like to just call:
++ * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
++ * but doing that would radically increase the odds of a
++ * speculative access to the poison page because we'd have
++ * the virtual address of the kernel 1:1 mapping sitting
++ * around in registers.
++ * Instead we get tricky. We create a non-canonical address
++ * that looks just like the one we want, but has bit 63 flipped.
++ * This relies on set_memory_XX() properly sanitizing any __pa()
++ * results with __PHYSICAL_MASK or PTE_PFN_MASK.
++ */
++ decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
++
++ rc = set_memory_np(decoy_addr, 1);
++ if (rc)
++ pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
++ return rc;
++}
++
++static int set_memory_present(unsigned long *addr, int numpages)
++{
++ return change_page_attr_set(addr, numpages, __pgprot(_PAGE_PRESENT), 0);
++}
++
++/* Restore full speculative operation to the pfn. */
++int clear_mce_nospec(unsigned long pfn)
++{
++ unsigned long addr = (unsigned long) pfn_to_kaddr(pfn);
++
++ return set_memory_present(&addr, 1);
++}
++EXPORT_SYMBOL_GPL(clear_mce_nospec);
++#endif /* CONFIG_X86_64 */
++
+ int set_memory_x(unsigned long addr, int numpages)
+ {
+ if (!(__supported_pte_mask & _PAGE_NX))
+diff --git a/arch/x86/pci/irq.c b/arch/x86/pci/irq.c
+index 97b63e35e1528..21c4bc41741fe 100644
+--- a/arch/x86/pci/irq.c
++++ b/arch/x86/pci/irq.c
+@@ -253,6 +253,15 @@ static void write_pc_conf_nybble(u8 base, u8 index, u8 val)
+ pc_conf_set(reg, x);
+ }
+
++/*
++ * FinALi pirq rules are as follows:
++ *
++ * - bit 0 selects between INTx Routing Table Mapping Registers,
++ *
++ * - bit 3 selects the nibble within the INTx Routing Table Mapping Register,
++ *
++ * - bits 7:4 map to bits 3:0 of the PCI INTx Sensitivity Register.
++ */
+ static int pirq_finali_get(struct pci_dev *router, struct pci_dev *dev,
+ int pirq)
+ {
+@@ -260,11 +269,13 @@ static int pirq_finali_get(struct pci_dev *router, struct pci_dev *dev,
+ 0, 9, 3, 10, 4, 5, 7, 6, 0, 11, 0, 12, 0, 14, 0, 15
+ };
+ unsigned long flags;
++ u8 index;
+ u8 x;
+
++ index = (pirq & 1) << 1 | (pirq & 8) >> 3;
+ raw_spin_lock_irqsave(&pc_conf_lock, flags);
+ pc_conf_set(PC_CONF_FINALI_LOCK, PC_CONF_FINALI_LOCK_KEY);
+- x = irqmap[read_pc_conf_nybble(PC_CONF_FINALI_PCI_INTX_RT1, pirq - 1)];
++ x = irqmap[read_pc_conf_nybble(PC_CONF_FINALI_PCI_INTX_RT1, index)];
+ pc_conf_set(PC_CONF_FINALI_LOCK, 0);
+ raw_spin_unlock_irqrestore(&pc_conf_lock, flags);
+ return x;
+@@ -278,13 +289,15 @@ static int pirq_finali_set(struct pci_dev *router, struct pci_dev *dev,
+ };
+ u8 val = irqmap[irq];
+ unsigned long flags;
++ u8 index;
+
+ if (!val)
+ return 0;
+
++ index = (pirq & 1) << 1 | (pirq & 8) >> 3;
+ raw_spin_lock_irqsave(&pc_conf_lock, flags);
+ pc_conf_set(PC_CONF_FINALI_LOCK, PC_CONF_FINALI_LOCK_KEY);
+- write_pc_conf_nybble(PC_CONF_FINALI_PCI_INTX_RT1, pirq - 1, val);
++ write_pc_conf_nybble(PC_CONF_FINALI_PCI_INTX_RT1, index, val);
+ pc_conf_set(PC_CONF_FINALI_LOCK, 0);
+ raw_spin_unlock_irqrestore(&pc_conf_lock, flags);
+ return 1;
+@@ -293,7 +306,7 @@ static int pirq_finali_set(struct pci_dev *router, struct pci_dev *dev,
+ static int pirq_finali_lvl(struct pci_dev *router, struct pci_dev *dev,
+ int pirq, int irq)
+ {
+- u8 mask = ~(1u << (pirq - 1));
++ u8 mask = ~((pirq & 0xf0u) >> 4);
+ unsigned long flags;
+ u8 trig;
+
+diff --git a/arch/x86/um/ldt.c b/arch/x86/um/ldt.c
+index 3ee234b6234dd..255a44dd415a9 100644
+--- a/arch/x86/um/ldt.c
++++ b/arch/x86/um/ldt.c
+@@ -23,9 +23,11 @@ static long write_ldt_entry(struct mm_id *mm_idp, int func,
+ {
+ long res;
+ void *stub_addr;
++
++ BUILD_BUG_ON(sizeof(*desc) % sizeof(long));
++
+ res = syscall_stub_data(mm_idp, (unsigned long *)desc,
+- (sizeof(*desc) + sizeof(long) - 1) &
+- ~(sizeof(long) - 1),
++ sizeof(*desc) / sizeof(long),
+ addr, &stub_addr);
+ if (!res) {
+ unsigned long args[] = { func,
+diff --git a/arch/xtensa/kernel/entry.S b/arch/xtensa/kernel/entry.S
+index 6b6eff658795c..07d683d94e175 100644
+--- a/arch/xtensa/kernel/entry.S
++++ b/arch/xtensa/kernel/entry.S
+@@ -442,7 +442,6 @@ KABI_W or a3, a3, a0
+ moveqz a3, a0, a2 # a3 = LOCKLEVEL iff interrupt
+ KABI_W movi a2, PS_WOE_MASK
+ KABI_W or a3, a3, a2
+- rsr a2, exccause
+ #endif
+
+ /* restore return address (or 0 if return to userspace) */
+@@ -469,19 +468,27 @@ KABI_W or a3, a3, a2
+
+ save_xtregs_opt a1 a3 a4 a5 a6 a7 PT_XTREGS_OPT
+
++#ifdef CONFIG_TRACE_IRQFLAGS
++ rsr abi_tmp0, ps
++ extui abi_tmp0, abi_tmp0, PS_INTLEVEL_SHIFT, PS_INTLEVEL_WIDTH
++ beqz abi_tmp0, 1f
++ abi_call trace_hardirqs_off
++1:
++#endif
++
+ /* Go to second-level dispatcher. Set up parameters to pass to the
+ * exception handler and call the exception handler.
+ */
+
+- rsr a4, excsave1
+- addx4 a4, a2, a4
+- l32i a4, a4, EXC_TABLE_DEFAULT # load handler
+- mov abi_arg1, a2 # pass EXCCAUSE
++ l32i abi_arg1, a1, PT_EXCCAUSE # pass EXCCAUSE
++ rsr abi_tmp0, excsave1
++ addx4 abi_tmp0, abi_arg1, abi_tmp0
++ l32i abi_tmp0, abi_tmp0, EXC_TABLE_DEFAULT # load handler
+ mov abi_arg0, a1 # pass stack frame
+
+ /* Call the second-level handler */
+
+- abi_callx a4
++ abi_callx abi_tmp0
+
+ /* Jump here for exception exit */
+ .global common_exception_return
+diff --git a/arch/xtensa/kernel/ptrace.c b/arch/xtensa/kernel/ptrace.c
+index 323c678a691ff..b952e67cc0ccd 100644
+--- a/arch/xtensa/kernel/ptrace.c
++++ b/arch/xtensa/kernel/ptrace.c
+@@ -225,12 +225,12 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
+
+ void user_enable_single_step(struct task_struct *child)
+ {
+- child->ptrace |= PT_SINGLESTEP;
++ set_tsk_thread_flag(child, TIF_SINGLESTEP);
+ }
+
+ void user_disable_single_step(struct task_struct *child)
+ {
+- child->ptrace &= ~PT_SINGLESTEP;
++ clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+ }
+
+ /*
+diff --git a/arch/xtensa/kernel/signal.c b/arch/xtensa/kernel/signal.c
+index 6f68649e86ba5..ac50ec46c8f14 100644
+--- a/arch/xtensa/kernel/signal.c
++++ b/arch/xtensa/kernel/signal.c
+@@ -473,7 +473,7 @@ static void do_signal(struct pt_regs *regs)
+ /* Set up the stack frame */
+ ret = setup_frame(&ksig, sigmask_to_save(), regs);
+ signal_setup_done(ret, &ksig, 0);
+- if (current->ptrace & PT_SINGLESTEP)
++ if (test_thread_flag(TIF_SINGLESTEP))
+ task_pt_regs(current)->icountlevel = 1;
+
+ return;
+@@ -499,7 +499,7 @@ static void do_signal(struct pt_regs *regs)
+ /* If there's no signal to deliver, we just restore the saved mask. */
+ restore_saved_sigmask();
+
+- if (current->ptrace & PT_SINGLESTEP)
++ if (test_thread_flag(TIF_SINGLESTEP))
+ task_pt_regs(current)->icountlevel = 1;
+ return;
+ }
+diff --git a/arch/xtensa/kernel/traps.c b/arch/xtensa/kernel/traps.c
+index 9345007d474d3..5f86208c67c87 100644
+--- a/arch/xtensa/kernel/traps.c
++++ b/arch/xtensa/kernel/traps.c
+@@ -242,12 +242,8 @@ DEFINE_PER_CPU(unsigned long, nmi_count);
+
+ void do_nmi(struct pt_regs *regs)
+ {
+- struct pt_regs *old_regs;
++ struct pt_regs *old_regs = set_irq_regs(regs);
+
+- if ((regs->ps & PS_INTLEVEL_MASK) < LOCKLEVEL)
+- trace_hardirqs_off();
+-
+- old_regs = set_irq_regs(regs);
+ nmi_enter();
+ ++*this_cpu_ptr(&nmi_count);
+ check_valid_nmi();
+@@ -269,12 +265,9 @@ void do_interrupt(struct pt_regs *regs)
+ XCHAL_INTLEVEL6_MASK,
+ XCHAL_INTLEVEL7_MASK,
+ };
+- struct pt_regs *old_regs;
++ struct pt_regs *old_regs = set_irq_regs(regs);
+ unsigned unhandled = ~0u;
+
+- trace_hardirqs_off();
+-
+- old_regs = set_irq_regs(regs);
+ irq_enter();
+
+ for (;;) {
+diff --git a/arch/xtensa/platforms/iss/simdisk.c b/arch/xtensa/platforms/iss/simdisk.c
+index 0f0e0724397f4..4255b92fa3eb0 100644
+--- a/arch/xtensa/platforms/iss/simdisk.c
++++ b/arch/xtensa/platforms/iss/simdisk.c
+@@ -211,12 +211,18 @@ static ssize_t proc_read_simdisk(struct file *file, char __user *buf,
+ struct simdisk *dev = pde_data(file_inode(file));
+ const char *s = dev->filename;
+ if (s) {
+- ssize_t n = simple_read_from_buffer(buf, size, ppos,
+- s, strlen(s));
+- if (n < 0)
+- return n;
+- buf += n;
+- size -= n;
++ ssize_t len = strlen(s);
++ char *temp = kmalloc(len + 2, GFP_KERNEL);
++
++ if (!temp)
++ return -ENOMEM;
++
++ len = scnprintf(temp, len + 2, "%s\n", s);
++ len = simple_read_from_buffer(buf, size, ppos,
++ temp, len);
++
++ kfree(temp);
++ return len;
+ }
+ return simple_read_from_buffer(buf, size, ppos, "\n", 1);
+ }
+diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
+index 420eda2589c0e..09574af835662 100644
+--- a/block/bfq-cgroup.c
++++ b/block/bfq-cgroup.c
+@@ -557,6 +557,7 @@ static void bfq_pd_init(struct blkg_policy_data *pd)
+ */
+ bfqg->bfqd = bfqd;
+ bfqg->active_entities = 0;
++ bfqg->online = true;
+ bfqg->rq_pos_tree = RB_ROOT;
+ }
+
+@@ -585,28 +586,11 @@ static void bfq_group_set_parent(struct bfq_group *bfqg,
+ entity->sched_data = &parent->sched_data;
+ }
+
+-static struct bfq_group *bfq_lookup_bfqg(struct bfq_data *bfqd,
+- struct blkcg *blkcg)
++static void bfq_link_bfqg(struct bfq_data *bfqd, struct bfq_group *bfqg)
+ {
+- struct blkcg_gq *blkg;
+-
+- blkg = blkg_lookup(blkcg, bfqd->queue);
+- if (likely(blkg))
+- return blkg_to_bfqg(blkg);
+- return NULL;
+-}
+-
+-struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd,
+- struct blkcg *blkcg)
+-{
+- struct bfq_group *bfqg, *parent;
++ struct bfq_group *parent;
+ struct bfq_entity *entity;
+
+- bfqg = bfq_lookup_bfqg(bfqd, blkcg);
+-
+- if (unlikely(!bfqg))
+- return NULL;
+-
+ /*
+ * Update chain of bfq_groups as we might be handling a leaf group
+ * which, along with some of its relatives, has not been hooked yet
+@@ -623,8 +607,24 @@ struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd,
+ bfq_group_set_parent(curr_bfqg, parent);
+ }
+ }
++}
+
+- return bfqg;
++struct bfq_group *bfq_bio_bfqg(struct bfq_data *bfqd, struct bio *bio)
++{
++ struct blkcg_gq *blkg = bio->bi_blkg;
++ struct bfq_group *bfqg;
++
++ while (blkg) {
++ bfqg = blkg_to_bfqg(blkg);
++ if (bfqg->online) {
++ bio_associate_blkg_from_css(bio, &blkg->blkcg->css);
++ return bfqg;
++ }
++ blkg = blkg->parent;
++ }
++ bio_associate_blkg_from_css(bio,
++ &bfqg_to_blkg(bfqd->root_group)->blkcg->css);
++ return bfqd->root_group;
+ }
+
+ /**
+@@ -714,25 +714,15 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+ * Move bic to blkcg, assuming that bfqd->lock is held; which makes
+ * sure that the reference to cgroup is valid across the call (see
+ * comments in bfq_bic_update_cgroup on this issue)
+- *
+- * NOTE: an alternative approach might have been to store the current
+- * cgroup in bfqq and getting a reference to it, reducing the lookup
+- * time here, at the price of slightly more complex code.
+ */
+-static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
+- struct bfq_io_cq *bic,
+- struct blkcg *blkcg)
++static void *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
++ struct bfq_io_cq *bic,
++ struct bfq_group *bfqg)
+ {
+ struct bfq_queue *async_bfqq = bic_to_bfqq(bic, 0);
+ struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, 1);
+- struct bfq_group *bfqg;
+ struct bfq_entity *entity;
+
+- bfqg = bfq_find_set_group(bfqd, blkcg);
+-
+- if (unlikely(!bfqg))
+- bfqg = bfqd->root_group;
+-
+ if (async_bfqq) {
+ entity = &async_bfqq->entity;
+
+@@ -743,9 +733,39 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
+ }
+
+ if (sync_bfqq) {
+- entity = &sync_bfqq->entity;
+- if (entity->sched_data != &bfqg->sched_data)
+- bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
++ if (!sync_bfqq->new_bfqq && !bfq_bfqq_coop(sync_bfqq)) {
++ /* We are the only user of this bfqq, just move it */
++ if (sync_bfqq->entity.sched_data != &bfqg->sched_data)
++ bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
++ } else {
++ struct bfq_queue *bfqq;
++
++ /*
++ * The queue was merged to a different queue. Check
++ * that the merge chain still belongs to the same
++ * cgroup.
++ */
++ for (bfqq = sync_bfqq; bfqq; bfqq = bfqq->new_bfqq)
++ if (bfqq->entity.sched_data !=
++ &bfqg->sched_data)
++ break;
++ if (bfqq) {
++ /*
++ * Some queue changed cgroup so the merge is
++ * not valid anymore. We cannot easily just
++ * cancel the merge (by clearing new_bfqq) as
++ * there may be other processes using this
++ * queue and holding refs to all queues below
++ * sync_bfqq->new_bfqq. Similarly if the merge
++ * already happened, we need to detach from
++ * bfqq now so that we cannot merge bio to a
++ * request from the old cgroup.
++ */
++ bfq_put_cooperator(sync_bfqq);
++ bfq_release_process_ref(bfqd, sync_bfqq);
++ bic_set_bfqq(bic, NULL, 1);
++ }
++ }
+ }
+
+ return bfqg;
+@@ -754,20 +774,24 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
+ void bfq_bic_update_cgroup(struct bfq_io_cq *bic, struct bio *bio)
+ {
+ struct bfq_data *bfqd = bic_to_bfqd(bic);
+- struct bfq_group *bfqg = NULL;
++ struct bfq_group *bfqg = bfq_bio_bfqg(bfqd, bio);
+ uint64_t serial_nr;
+
+- rcu_read_lock();
+- serial_nr = __bio_blkcg(bio)->css.serial_nr;
++ serial_nr = bfqg_to_blkg(bfqg)->blkcg->css.serial_nr;
+
+ /*
+ * Check whether blkcg has changed. The condition may trigger
+ * spuriously on a newly created cic but there's no harm.
+ */
+ if (unlikely(!bfqd) || likely(bic->blkcg_serial_nr == serial_nr))
+- goto out;
++ return;
+
+- bfqg = __bfq_bic_change_cgroup(bfqd, bic, __bio_blkcg(bio));
++ /*
++ * New cgroup for this process. Make sure it is linked to bfq internal
++ * cgroup hierarchy.
++ */
++ bfq_link_bfqg(bfqd, bfqg);
++ __bfq_bic_change_cgroup(bfqd, bic, bfqg);
+ /*
+ * Update blkg_path for bfq_log_* functions. We cache this
+ * path, and update it here, for the following
+@@ -820,8 +844,6 @@ void bfq_bic_update_cgroup(struct bfq_io_cq *bic, struct bio *bio)
+ */
+ blkg_path(bfqg_to_blkg(bfqg), bfqg->blkg_path, sizeof(bfqg->blkg_path));
+ bic->blkcg_serial_nr = serial_nr;
+-out:
+- rcu_read_unlock();
+ }
+
+ /**
+@@ -949,6 +971,7 @@ static void bfq_pd_offline(struct blkg_policy_data *pd)
+
+ put_async_queues:
+ bfq_put_async_queues(bfqd, bfqg);
++ bfqg->online = false;
+
+ spin_unlock_irqrestore(&bfqd->lock, flags);
+ /*
+@@ -1438,7 +1461,7 @@ void bfq_end_wr_async(struct bfq_data *bfqd)
+ bfq_end_wr_async_queues(bfqd, bfqd->root_group);
+ }
+
+-struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd, struct blkcg *blkcg)
++struct bfq_group *bfq_bio_bfqg(struct bfq_data *bfqd, struct bio *bio)
+ {
+ return bfqd->root_group;
+ }
+diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
+index 1f62dbdc521ff..bf5acd8f43229 100644
+--- a/block/bfq-iosched.c
++++ b/block/bfq-iosched.c
+@@ -2133,9 +2133,7 @@ static void bfq_check_waker(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+ if (!bfqd->last_completed_rq_bfqq ||
+ bfqd->last_completed_rq_bfqq == bfqq ||
+ bfq_bfqq_has_short_ttime(bfqq) ||
+- bfqq->dispatched > 0 ||
+- now_ns - bfqd->last_completion >= 4 * NSEC_PER_MSEC ||
+- bfqd->last_completed_rq_bfqq == bfqq->waker_bfqq)
++ now_ns - bfqd->last_completion >= 4 * NSEC_PER_MSEC)
+ return;
+
+ /*
+@@ -2210,7 +2208,7 @@ static void bfq_add_request(struct request *rq)
+ bfqq->queued[rq_is_sync(rq)]++;
+ bfqd->queued++;
+
+- if (RB_EMPTY_ROOT(&bfqq->sort_list) && bfq_bfqq_sync(bfqq)) {
++ if (bfq_bfqq_sync(bfqq) && RQ_BIC(rq)->requests <= 1) {
+ bfq_check_waker(bfqd, bfqq, now_ns);
+
+ /*
+@@ -2463,10 +2461,17 @@ static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
+
+ spin_lock_irq(&bfqd->lock);
+
+- if (bic)
++ if (bic) {
++ /*
++ * Make sure cgroup info is uptodate for current process before
++ * considering the merge.
++ */
++ bfq_bic_update_cgroup(bic, bio);
++
+ bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf));
+- else
++ } else {
+ bfqd->bio_bfqq = NULL;
++ }
+ bfqd->bio_bic = bic;
+
+ ret = blk_mq_sched_try_merge(q, bio, nr_segs, &free);
+@@ -2496,8 +2501,6 @@ static int bfq_request_merge(struct request_queue *q, struct request **req,
+ return ELEVATOR_NO_MERGE;
+ }
+
+-static struct bfq_queue *bfq_init_rq(struct request *rq);
+-
+ static void bfq_request_merged(struct request_queue *q, struct request *req,
+ enum elv_merge type)
+ {
+@@ -2506,7 +2509,7 @@ static void bfq_request_merged(struct request_queue *q, struct request *req,
+ blk_rq_pos(req) <
+ blk_rq_pos(container_of(rb_prev(&req->rb_node),
+ struct request, rb_node))) {
+- struct bfq_queue *bfqq = bfq_init_rq(req);
++ struct bfq_queue *bfqq = RQ_BFQQ(req);
+ struct bfq_data *bfqd;
+ struct request *prev, *next_rq;
+
+@@ -2558,8 +2561,8 @@ static void bfq_request_merged(struct request_queue *q, struct request *req,
+ static void bfq_requests_merged(struct request_queue *q, struct request *rq,
+ struct request *next)
+ {
+- struct bfq_queue *bfqq = bfq_init_rq(rq),
+- *next_bfqq = bfq_init_rq(next);
++ struct bfq_queue *bfqq = RQ_BFQQ(rq),
++ *next_bfqq = RQ_BFQQ(next);
+
+ if (!bfqq)
+ goto remove;
+@@ -2764,6 +2767,14 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
+ if (process_refs == 0 || new_process_refs == 0)
+ return NULL;
+
++ /*
++ * Make sure merged queues belong to the same parent. Parents could
++ * have changed since the time we decided the two queues are suitable
++ * for merging.
++ */
++ if (new_bfqq->entity.parent != bfqq->entity.parent)
++ return NULL;
++
+ bfq_log_bfqq(bfqq->bfqd, bfqq, "scheduling merge with queue %d",
+ new_bfqq->pid);
+
+@@ -2901,9 +2912,12 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+ struct bfq_queue *new_bfqq =
+ bfq_setup_merge(bfqq, stable_merge_bfqq);
+
+- bic->stably_merged = true;
+- if (new_bfqq && new_bfqq->bic)
+- new_bfqq->bic->stably_merged = true;
++ if (new_bfqq) {
++ bic->stably_merged = true;
++ if (new_bfqq->bic)
++ new_bfqq->bic->stably_merged =
++ true;
++ }
+ return new_bfqq;
+ } else
+ return NULL;
+@@ -5310,7 +5324,7 @@ static void bfq_put_stable_ref(struct bfq_queue *bfqq)
+ bfq_put_queue(bfqq);
+ }
+
+-static void bfq_put_cooperator(struct bfq_queue *bfqq)
++void bfq_put_cooperator(struct bfq_queue *bfqq)
+ {
+ struct bfq_queue *__bfqq, *next;
+
+@@ -5716,14 +5730,7 @@ static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
+ struct bfq_queue *bfqq;
+ struct bfq_group *bfqg;
+
+- rcu_read_lock();
+-
+- bfqg = bfq_find_set_group(bfqd, __bio_blkcg(bio));
+- if (!bfqg) {
+- bfqq = &bfqd->oom_bfqq;
+- goto out;
+- }
+-
++ bfqg = bfq_bio_bfqg(bfqd, bio);
+ if (!is_sync) {
+ async_bfqq = bfq_async_queue_prio(bfqd, bfqg, ioprio_class,
+ ioprio);
+@@ -5769,8 +5776,6 @@ out:
+
+ if (bfqq != &bfqd->oom_bfqq && is_sync && !respawn)
+ bfqq = bfq_do_or_sched_stable_merge(bfqd, bfqq, bic);
+-
+- rcu_read_unlock();
+ return bfqq;
+ }
+
+@@ -6117,6 +6122,8 @@ static inline void bfq_update_insert_stats(struct request_queue *q,
+ unsigned int cmd_flags) {}
+ #endif /* CONFIG_BFQ_CGROUP_DEBUG */
+
++static struct bfq_queue *bfq_init_rq(struct request *rq);
++
+ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
+ bool at_head)
+ {
+@@ -6132,18 +6139,15 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
+ bfqg_stats_update_legacy_io(q, rq);
+ #endif
+ spin_lock_irq(&bfqd->lock);
++ bfqq = bfq_init_rq(rq);
+ if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
+ spin_unlock_irq(&bfqd->lock);
+ blk_mq_free_requests(&free);
+ return;
+ }
+
+- spin_unlock_irq(&bfqd->lock);
+-
+ trace_block_rq_insert(rq);
+
+- spin_lock_irq(&bfqd->lock);
+- bfqq = bfq_init_rq(rq);
+ if (!bfqq || at_head) {
+ if (at_head)
+ list_add(&rq->queuelist, &bfqd->dispatch);
+@@ -6563,6 +6567,7 @@ static void bfq_finish_requeue_request(struct request *rq)
+ bfq_completed_request(bfqq, bfqd);
+ }
+ bfq_finish_requeue_request_body(bfqq);
++ RQ_BIC(rq)->requests--;
+ spin_unlock_irqrestore(&bfqd->lock, flags);
+
+ /*
+@@ -6796,6 +6801,7 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
+
+ bfqq_request_allocated(bfqq);
+ bfqq->ref++;
++ bic->requests++;
+ bfq_log_bfqq(bfqd, bfqq, "get_request %p: bfqq %p, %d",
+ rq, bfqq, bfqq->ref);
+
+diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
+index 3b83e3d1c2e58..9af8599ab90ff 100644
+--- a/block/bfq-iosched.h
++++ b/block/bfq-iosched.h
+@@ -468,6 +468,7 @@ struct bfq_io_cq {
+ struct bfq_queue *stable_merge_bfqq;
+
+ bool stably_merged; /* non splittable if true */
++ unsigned int requests; /* Number of requests this process has in flight */
+ };
+
+ /**
+@@ -928,6 +929,8 @@ struct bfq_group {
+
+ /* reference counter (see comments in bfq_bic_update_cgroup) */
+ int ref;
++ /* Is bfq_group still online? */
++ bool online;
+
+ struct bfq_entity entity;
+ struct bfq_sched_data sched_data;
+@@ -979,6 +982,7 @@ void bfq_weights_tree_remove(struct bfq_data *bfqd,
+ void bfq_bfqq_expire(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+ bool compensate, enum bfqq_expiration reason);
+ void bfq_put_queue(struct bfq_queue *bfqq);
++void bfq_put_cooperator(struct bfq_queue *bfqq);
+ void bfq_end_wr_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg);
+ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq);
+ void bfq_schedule_dispatch(struct bfq_data *bfqd);
+@@ -1006,8 +1010,7 @@ void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+ void bfq_init_entity(struct bfq_entity *entity, struct bfq_group *bfqg);
+ void bfq_bic_update_cgroup(struct bfq_io_cq *bic, struct bio *bio);
+ void bfq_end_wr_async(struct bfq_data *bfqd);
+-struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd,
+- struct blkcg *blkcg);
++struct bfq_group *bfq_bio_bfqg(struct bfq_data *bfqd, struct bio *bio);
+ struct blkcg_gq *bfqg_to_blkg(struct bfq_group *bfqg);
+ struct bfq_group *bfqq_group(struct bfq_queue *bfqq);
+ struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node);
+diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
+index 8dfe62786cd5f..1c52752abb084 100644
+--- a/block/blk-cgroup.c
++++ b/block/blk-cgroup.c
+@@ -905,7 +905,6 @@ static void blkcg_print_one_stat(struct blkcg_gq *blkg, struct seq_file *s)
+ {
+ struct blkg_iostat_set *bis = &blkg->iostat;
+ u64 rbytes, wbytes, rios, wios, dbytes, dios;
+- bool has_stats = false;
+ const char *dname;
+ unsigned seq;
+ int i;
+@@ -931,14 +930,12 @@ static void blkcg_print_one_stat(struct blkcg_gq *blkg, struct seq_file *s)
+ } while (u64_stats_fetch_retry(&bis->sync, seq));
+
+ if (rbytes || wbytes || rios || wios) {
+- has_stats = true;
+ seq_printf(s, "rbytes=%llu wbytes=%llu rios=%llu wios=%llu dbytes=%llu dios=%llu",
+ rbytes, wbytes, rios, wios,
+ dbytes, dios);
+ }
+
+ if (blkcg_debug_stats && atomic_read(&blkg->use_delay)) {
+- has_stats = true;
+ seq_printf(s, " use_delay=%d delay_nsec=%llu",
+ atomic_read(&blkg->use_delay),
+ atomic64_read(&blkg->delay_nsec));
+@@ -950,12 +947,10 @@ static void blkcg_print_one_stat(struct blkcg_gq *blkg, struct seq_file *s)
+ if (!blkg->pd[i] || !pol->pd_stat_fn)
+ continue;
+
+- if (pol->pd_stat_fn(blkg->pd[i], s))
+- has_stats = true;
++ pol->pd_stat_fn(blkg->pd[i], s);
+ }
+
+- if (has_stats)
+- seq_printf(s, "\n");
++ seq_puts(s, "\n");
+ }
+
+ static int blkcg_print_stat(struct seq_file *sf, void *v)
+@@ -1906,12 +1901,8 @@ EXPORT_SYMBOL_GPL(bio_associate_blkg);
+ */
+ void bio_clone_blkg_association(struct bio *dst, struct bio *src)
+ {
+- if (src->bi_blkg) {
+- if (dst->bi_blkg)
+- blkg_put(dst->bi_blkg);
+- blkg_get(src->bi_blkg);
+- dst->bi_blkg = src->bi_blkg;
+- }
++ if (src->bi_blkg)
++ bio_associate_blkg_from_css(dst, &bio_blkcg(src)->css);
+ }
+ EXPORT_SYMBOL_GPL(bio_clone_blkg_association);
+
+diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
+index 47e1e38390c96..b56ba16fb6c57 100644
+--- a/block/blk-cgroup.h
++++ b/block/blk-cgroup.h
+@@ -63,7 +63,7 @@ typedef void (blkcg_pol_online_pd_fn)(struct blkg_policy_data *pd);
+ typedef void (blkcg_pol_offline_pd_fn)(struct blkg_policy_data *pd);
+ typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
+ typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkg_policy_data *pd);
+-typedef bool (blkcg_pol_stat_pd_fn)(struct blkg_policy_data *pd,
++typedef void (blkcg_pol_stat_pd_fn)(struct blkg_policy_data *pd,
+ struct seq_file *s);
+
+ struct blkcg_policy {
+diff --git a/block/blk-ia-ranges.c b/block/blk-ia-ranges.c
+index 18c68d8b9138e..56ed48d2954e6 100644
+--- a/block/blk-ia-ranges.c
++++ b/block/blk-ia-ranges.c
+@@ -54,13 +54,8 @@ static ssize_t blk_ia_range_sysfs_show(struct kobject *kobj,
+ container_of(attr, struct blk_ia_range_sysfs_entry, attr);
+ struct blk_independent_access_range *iar =
+ container_of(kobj, struct blk_independent_access_range, kobj);
+- ssize_t ret;
+
+- mutex_lock(&iar->queue->sysfs_lock);
+- ret = entry->show(iar, buf);
+- mutex_unlock(&iar->queue->sysfs_lock);
+-
+- return ret;
++ return entry->show(iar, buf);
+ }
+
+ static const struct sysfs_ops blk_ia_range_sysfs_ops = {
+diff --git a/block/blk-iocost.c b/block/blk-iocost.c
+index 9bd670999d0af..16705fbd06991 100644
+--- a/block/blk-iocost.c
++++ b/block/blk-iocost.c
+@@ -3005,13 +3005,13 @@ static void ioc_pd_free(struct blkg_policy_data *pd)
+ kfree(iocg);
+ }
+
+-static bool ioc_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
++static void ioc_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
+ {
+ struct ioc_gq *iocg = pd_to_iocg(pd);
+ struct ioc *ioc = iocg->ioc;
+
+ if (!ioc->enabled)
+- return false;
++ return;
+
+ if (iocg->level == 0) {
+ unsigned vp10k = DIV64_U64_ROUND_CLOSEST(
+@@ -3027,7 +3027,6 @@ static bool ioc_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
+ iocg->last_stat.wait_us,
+ iocg->last_stat.indebt_us,
+ iocg->last_stat.indelay_us);
+- return true;
+ }
+
+ static u64 ioc_weight_prfill(struct seq_file *sf, struct blkg_policy_data *pd,
+diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
+index 2f33932e72e36..9568bf8dfe82b 100644
+--- a/block/blk-iolatency.c
++++ b/block/blk-iolatency.c
+@@ -87,7 +87,17 @@ struct iolatency_grp;
+ struct blk_iolatency {
+ struct rq_qos rqos;
+ struct timer_list timer;
+- atomic_t enabled;
++
++ /*
++ * ->enabled is the master enable switch gating the throttling logic and
++ * inflight tracking. The number of cgroups which have iolat enabled is
++ * tracked in ->enable_cnt, and ->enable is flipped on/off accordingly
++ * from ->enable_work with the request_queue frozen. For details, See
++ * blkiolatency_enable_work_fn().
++ */
++ bool enabled;
++ atomic_t enable_cnt;
++ struct work_struct enable_work;
+ };
+
+ static inline struct blk_iolatency *BLKIOLATENCY(struct rq_qos *rqos)
+@@ -95,11 +105,6 @@ static inline struct blk_iolatency *BLKIOLATENCY(struct rq_qos *rqos)
+ return container_of(rqos, struct blk_iolatency, rqos);
+ }
+
+-static inline bool blk_iolatency_enabled(struct blk_iolatency *blkiolat)
+-{
+- return atomic_read(&blkiolat->enabled) > 0;
+-}
+-
+ struct child_latency_info {
+ spinlock_t lock;
+
+@@ -464,7 +469,7 @@ static void blkcg_iolatency_throttle(struct rq_qos *rqos, struct bio *bio)
+ struct blkcg_gq *blkg = bio->bi_blkg;
+ bool issue_as_root = bio_issue_as_root_blkg(bio);
+
+- if (!blk_iolatency_enabled(blkiolat))
++ if (!blkiolat->enabled)
+ return;
+
+ while (blkg && blkg->parent) {
+@@ -594,7 +599,6 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio)
+ u64 window_start;
+ u64 now;
+ bool issue_as_root = bio_issue_as_root_blkg(bio);
+- bool enabled = false;
+ int inflight = 0;
+
+ blkg = bio->bi_blkg;
+@@ -605,8 +609,7 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio)
+ if (!iolat)
+ return;
+
+- enabled = blk_iolatency_enabled(iolat->blkiolat);
+- if (!enabled)
++ if (!iolat->blkiolat->enabled)
+ return;
+
+ now = ktime_to_ns(ktime_get());
+@@ -645,6 +648,7 @@ static void blkcg_iolatency_exit(struct rq_qos *rqos)
+ struct blk_iolatency *blkiolat = BLKIOLATENCY(rqos);
+
+ del_timer_sync(&blkiolat->timer);
++ flush_work(&blkiolat->enable_work);
+ blkcg_deactivate_policy(rqos->q, &blkcg_policy_iolatency);
+ kfree(blkiolat);
+ }
+@@ -716,6 +720,44 @@ next:
+ rcu_read_unlock();
+ }
+
++/**
++ * blkiolatency_enable_work_fn - Enable or disable iolatency on the device
++ * @work: enable_work of the blk_iolatency of interest
++ *
++ * iolatency needs to keep track of the number of in-flight IOs per cgroup. This
++ * is relatively expensive as it involves walking up the hierarchy twice for
++ * every IO. Thus, if iolatency is not enabled in any cgroup for the device, we
++ * want to disable the in-flight tracking.
++ *
++ * We have to make sure that the counting is balanced - we don't want to leak
++ * the in-flight counts by disabling accounting in the completion path while IOs
++ * are in flight. This is achieved by ensuring that no IO is in flight by
++ * freezing the queue while flipping ->enabled. As this requires a sleepable
++ * context, ->enabled flipping is punted to this work function.
++ */
++static void blkiolatency_enable_work_fn(struct work_struct *work)
++{
++ struct blk_iolatency *blkiolat = container_of(work, struct blk_iolatency,
++ enable_work);
++ bool enabled;
++
++ /*
++ * There can only be one instance of this function running for @blkiolat
++ * and it's guaranteed to be executed at least once after the latest
++ * ->enabled_cnt modification. Acting on the latest ->enable_cnt is
++ * sufficient.
++ *
++ * Also, we know @blkiolat is safe to access as ->enable_work is flushed
++ * in blkcg_iolatency_exit().
++ */
++ enabled = atomic_read(&blkiolat->enable_cnt);
++ if (enabled != blkiolat->enabled) {
++ blk_mq_freeze_queue(blkiolat->rqos.q);
++ blkiolat->enabled = enabled;
++ blk_mq_unfreeze_queue(blkiolat->rqos.q);
++ }
++}
++
+ int blk_iolatency_init(struct request_queue *q)
+ {
+ struct blk_iolatency *blkiolat;
+@@ -741,17 +783,15 @@ int blk_iolatency_init(struct request_queue *q)
+ }
+
+ timer_setup(&blkiolat->timer, blkiolatency_timer_fn, 0);
++ INIT_WORK(&blkiolat->enable_work, blkiolatency_enable_work_fn);
+
+ return 0;
+ }
+
+-/*
+- * return 1 for enabling iolatency, return -1 for disabling iolatency, otherwise
+- * return 0.
+- */
+-static int iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val)
++static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val)
+ {
+ struct iolatency_grp *iolat = blkg_to_lat(blkg);
++ struct blk_iolatency *blkiolat = iolat->blkiolat;
+ u64 oldval = iolat->min_lat_nsec;
+
+ iolat->min_lat_nsec = val;
+@@ -759,13 +799,15 @@ static int iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val)
+ iolat->cur_win_nsec = min_t(u64, iolat->cur_win_nsec,
+ BLKIOLATENCY_MAX_WIN_SIZE);
+
+- if (!oldval && val)
+- return 1;
++ if (!oldval && val) {
++ if (atomic_inc_return(&blkiolat->enable_cnt) == 1)
++ schedule_work(&blkiolat->enable_work);
++ }
+ if (oldval && !val) {
+ blkcg_clear_delay(blkg);
+- return -1;
++ if (atomic_dec_return(&blkiolat->enable_cnt) == 0)
++ schedule_work(&blkiolat->enable_work);
+ }
+- return 0;
+ }
+
+ static void iolatency_clear_scaling(struct blkcg_gq *blkg)
+@@ -797,7 +839,6 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
+ u64 lat_val = 0;
+ u64 oldval;
+ int ret;
+- int enable = 0;
+
+ ret = blkg_conf_prep(blkcg, &blkcg_policy_iolatency, buf, &ctx);
+ if (ret)
+@@ -832,41 +873,12 @@ static ssize_t iolatency_set_limit(struct kernfs_open_file *of, char *buf,
+ blkg = ctx.blkg;
+ oldval = iolat->min_lat_nsec;
+
+- enable = iolatency_set_min_lat_nsec(blkg, lat_val);
+- if (enable) {
+- if (!blk_get_queue(blkg->q)) {
+- ret = -ENODEV;
+- goto out;
+- }
+-
+- blkg_get(blkg);
+- }
+-
+- if (oldval != iolat->min_lat_nsec) {
++ iolatency_set_min_lat_nsec(blkg, lat_val);
++ if (oldval != iolat->min_lat_nsec)
+ iolatency_clear_scaling(blkg);
+- }
+-
+ ret = 0;
+ out:
+ blkg_conf_finish(&ctx);
+- if (ret == 0 && enable) {
+- struct iolatency_grp *tmp = blkg_to_lat(blkg);
+- struct blk_iolatency *blkiolat = tmp->blkiolat;
+-
+- blk_mq_freeze_queue(blkg->q);
+-
+- if (enable == 1)
+- atomic_inc(&blkiolat->enabled);
+- else if (enable == -1)
+- atomic_dec(&blkiolat->enabled);
+- else
+- WARN_ON_ONCE(1);
+-
+- blk_mq_unfreeze_queue(blkg->q);
+-
+- blkg_put(blkg);
+- blk_put_queue(blkg->q);
+- }
+ return ret ?: nbytes;
+ }
+
+@@ -891,7 +903,7 @@ static int iolatency_print_limit(struct seq_file *sf, void *v)
+ return 0;
+ }
+
+-static bool iolatency_ssd_stat(struct iolatency_grp *iolat, struct seq_file *s)
++static void iolatency_ssd_stat(struct iolatency_grp *iolat, struct seq_file *s)
+ {
+ struct latency_stat stat;
+ int cpu;
+@@ -914,17 +926,16 @@ static bool iolatency_ssd_stat(struct iolatency_grp *iolat, struct seq_file *s)
+ (unsigned long long)stat.ps.missed,
+ (unsigned long long)stat.ps.total,
+ iolat->rq_depth.max_depth);
+- return true;
+ }
+
+-static bool iolatency_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
++static void iolatency_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
+ {
+ struct iolatency_grp *iolat = pd_to_lat(pd);
+ unsigned long long avg_lat;
+ unsigned long long cur_win;
+
+ if (!blkcg_debug_stats)
+- return false;
++ return;
+
+ if (iolat->ssd)
+ return iolatency_ssd_stat(iolat, s);
+@@ -937,7 +948,6 @@ static bool iolatency_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
+ else
+ seq_printf(s, " depth=%u avg_lat=%llu win=%llu",
+ iolat->rq_depth.max_depth, avg_lat, cur_win);
+- return true;
+ }
+
+ static struct blkg_policy_data *iolatency_pd_alloc(gfp_t gfp,
+@@ -1007,14 +1017,8 @@ static void iolatency_pd_offline(struct blkg_policy_data *pd)
+ {
+ struct iolatency_grp *iolat = pd_to_lat(pd);
+ struct blkcg_gq *blkg = lat_to_blkg(iolat);
+- struct blk_iolatency *blkiolat = iolat->blkiolat;
+- int ret;
+
+- ret = iolatency_set_min_lat_nsec(blkg, 0);
+- if (ret == 1)
+- atomic_inc(&blkiolat->enabled);
+- if (ret == -1)
+- atomic_dec(&blkiolat->enabled);
++ iolatency_set_min_lat_nsec(blkg, 0);
+ iolatency_clear_scaling(blkg);
+ }
+
+diff --git a/block/blk-throttle.c b/block/blk-throttle.c
+index 469c483719bea..5c5f2741a95fa 100644
+--- a/block/blk-throttle.c
++++ b/block/blk-throttle.c
+@@ -2189,13 +2189,14 @@ again:
+ }
+
+ out_unlock:
+- spin_unlock_irq(&q->queue_lock);
+ bio_set_flag(bio, BIO_THROTTLED);
+
+ #ifdef CONFIG_BLK_DEV_THROTTLING_LOW
+ if (throttled || !td->track_bio_latency)
+ bio->bi_issue.value |= BIO_ISSUE_THROTL_SKIP_LATENCY;
+ #endif
++ spin_unlock_irq(&q->queue_lock);
++
+ rcu_read_unlock();
+ return throttled;
+ }
+diff --git a/crypto/cryptd.c b/crypto/cryptd.c
+index a1bea0f4baa88..668095eca0faf 100644
+--- a/crypto/cryptd.c
++++ b/crypto/cryptd.c
+@@ -39,6 +39,10 @@ struct cryptd_cpu_queue {
+ };
+
+ struct cryptd_queue {
++ /*
++ * Protected by disabling BH to allow enqueueing from softinterrupt and
++ * dequeuing from kworker (cryptd_queue_worker()).
++ */
+ struct cryptd_cpu_queue __percpu *cpu_queue;
+ };
+
+@@ -125,28 +129,28 @@ static void cryptd_fini_queue(struct cryptd_queue *queue)
+ static int cryptd_enqueue_request(struct cryptd_queue *queue,
+ struct crypto_async_request *request)
+ {
+- int cpu, err;
++ int err;
+ struct cryptd_cpu_queue *cpu_queue;
+ refcount_t *refcnt;
+
+- cpu = get_cpu();
++ local_bh_disable();
+ cpu_queue = this_cpu_ptr(queue->cpu_queue);
+ err = crypto_enqueue_request(&cpu_queue->queue, request);
+
+ refcnt = crypto_tfm_ctx(request->tfm);
+
+ if (err == -ENOSPC)
+- goto out_put_cpu;
++ goto out;
+
+- queue_work_on(cpu, cryptd_wq, &cpu_queue->work);
++ queue_work_on(smp_processor_id(), cryptd_wq, &cpu_queue->work);
+
+ if (!refcount_read(refcnt))
+- goto out_put_cpu;
++ goto out;
+
+ refcount_inc(refcnt);
+
+-out_put_cpu:
+- put_cpu();
++out:
++ local_bh_enable();
+
+ return err;
+ }
+@@ -162,15 +166,10 @@ static void cryptd_queue_worker(struct work_struct *work)
+ cpu_queue = container_of(work, struct cryptd_cpu_queue, work);
+ /*
+ * Only handle one request at a time to avoid hogging crypto workqueue.
+- * preempt_disable/enable is used to prevent being preempted by
+- * cryptd_enqueue_request(). local_bh_disable/enable is used to prevent
+- * cryptd_enqueue_request() being accessed from software interrupts.
+ */
+ local_bh_disable();
+- preempt_disable();
+ backlog = crypto_get_backlog(&cpu_queue->queue);
+ req = crypto_dequeue_request(&cpu_queue->queue);
+- preempt_enable();
+ local_bh_enable();
+
+ if (!req)
+diff --git a/drivers/acpi/arm64/agdi.c b/drivers/acpi/arm64/agdi.c
+index 4df337d545b73..cf31abd0ed1bb 100644
+--- a/drivers/acpi/arm64/agdi.c
++++ b/drivers/acpi/arm64/agdi.c
+@@ -9,6 +9,7 @@
+ #define pr_fmt(fmt) "ACPI: AGDI: " fmt
+
+ #include <linux/acpi.h>
++#include <linux/acpi_agdi.h>
+ #include <linux/arm_sdei.h>
+ #include <linux/io.h>
+ #include <linux/kernel.h>
+diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
+index bc1454789a065..34576ab0e2e1d 100644
+--- a/drivers/acpi/cppc_acpi.c
++++ b/drivers/acpi/cppc_acpi.c
+@@ -100,6 +100,16 @@ static DEFINE_PER_CPU(struct cpc_desc *, cpc_desc_ptr);
+ (cpc)->cpc_entry.reg.space_id == \
+ ACPI_ADR_SPACE_PLATFORM_COMM)
+
++/* Check if a CPC register is in SystemMemory */
++#define CPC_IN_SYSTEM_MEMORY(cpc) ((cpc)->type == ACPI_TYPE_BUFFER && \
++ (cpc)->cpc_entry.reg.space_id == \
++ ACPI_ADR_SPACE_SYSTEM_MEMORY)
++
++/* Check if a CPC register is in SystemIo */
++#define CPC_IN_SYSTEM_IO(cpc) ((cpc)->type == ACPI_TYPE_BUFFER && \
++ (cpc)->cpc_entry.reg.space_id == \
++ ACPI_ADR_SPACE_SYSTEM_IO)
++
+ /* Evaluates to True if reg is a NULL register descriptor */
+ #define IS_NULL_REG(reg) ((reg)->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY && \
+ (reg)->address == 0 && \
+@@ -1447,6 +1457,9 @@ EXPORT_SYMBOL_GPL(cppc_set_perf);
+ * transition latency for performance change requests. The closest we have
+ * is the timing information from the PCCT tables which provides the info
+ * on the number and frequency of PCC commands the platform can handle.
++ *
++ * If desired_reg is in the SystemMemory or SystemIo ACPI address space,
++ * then assume there is no latency.
+ */
+ unsigned int cppc_get_transition_latency(int cpu_num)
+ {
+@@ -1472,7 +1485,9 @@ unsigned int cppc_get_transition_latency(int cpu_num)
+ return CPUFREQ_ETERNAL;
+
+ desired_reg = &cpc_desc->cpc_regs[DESIRED_PERF];
+- if (!CPC_IN_PCC(desired_reg))
++ if (CPC_IN_SYSTEM_MEMORY(desired_reg) || CPC_IN_SYSTEM_IO(desired_reg))
++ return 0;
++ else if (!CPC_IN_PCC(desired_reg))
+ return CPUFREQ_ETERNAL;
+
+ if (pcc_ss_id < 0)
+diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
+index 12bbfe8336095..2b5d53c2a0a4e 100644
+--- a/drivers/acpi/property.c
++++ b/drivers/acpi/property.c
+@@ -433,6 +433,16 @@ void acpi_init_properties(struct acpi_device *adev)
+ acpi_extract_apple_properties(adev);
+ }
+
++static void acpi_free_device_properties(struct list_head *list)
++{
++ struct acpi_device_properties *props, *tmp;
++
++ list_for_each_entry_safe(props, tmp, list, list) {
++ list_del(&props->list);
++ kfree(props);
++ }
++}
++
+ static void acpi_destroy_nondev_subnodes(struct list_head *list)
+ {
+ struct acpi_data_node *dn, *next;
+@@ -445,22 +455,18 @@ static void acpi_destroy_nondev_subnodes(struct list_head *list)
+ wait_for_completion(&dn->kobj_done);
+ list_del(&dn->sibling);
+ ACPI_FREE((void *)dn->data.pointer);
++ acpi_free_device_properties(&dn->data.properties);
+ kfree(dn);
+ }
+ }
+
+ void acpi_free_properties(struct acpi_device *adev)
+ {
+- struct acpi_device_properties *props, *tmp;
+-
+ acpi_destroy_nondev_subnodes(&adev->data.subnodes);
+ ACPI_FREE((void *)adev->data.pointer);
+ adev->data.of_compatible = NULL;
+ adev->data.pointer = NULL;
+- list_for_each_entry_safe(props, tmp, &adev->data.properties, list) {
+- list_del(&props->list);
+- kfree(props);
+- }
++ acpi_free_device_properties(&adev->data.properties);
+ }
+
+ /**
+diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
+index c992e57b2c790..3147702710afe 100644
+--- a/drivers/acpi/sleep.c
++++ b/drivers/acpi/sleep.c
+@@ -373,6 +373,18 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "20GGA00L00"),
+ },
+ },
++ /*
++ * ASUS B1400CEAE hangs on resume from suspend (see
++ * https://bugzilla.kernel.org/show_bug.cgi?id=215742).
++ */
++ {
++ .callback = init_default_s3,
++ .ident = "ASUS B1400CEAE",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "ASUS EXPERTBOOK B1400CEAE"),
++ },
++ },
+ {},
+ };
+
+diff --git a/drivers/base/memory.c b/drivers/base/memory.c
+index 7222ff9b5e05c..084d67fd55cc8 100644
+--- a/drivers/base/memory.c
++++ b/drivers/base/memory.c
+@@ -636,10 +636,9 @@ static int __add_memory_block(struct memory_block *memory)
+ }
+ ret = xa_err(xa_store(&memory_blocks, memory->dev.id, memory,
+ GFP_KERNEL));
+- if (ret) {
+- put_device(&memory->dev);
++ if (ret)
+ device_unregister(&memory->dev);
+- }
++
+ return ret;
+ }
+
+diff --git a/drivers/base/node.c b/drivers/base/node.c
+index ec8bb24a5a227..0ac6376ef7a10 100644
+--- a/drivers/base/node.c
++++ b/drivers/base/node.c
+@@ -682,6 +682,7 @@ static int register_node(struct node *node, int num)
+ */
+ void unregister_node(struct node *node)
+ {
++ compaction_unregister_node(node);
+ hugetlb_unregister_node(node); /* no-op, if memoryless node */
+ node_remove_accesses(node);
+ node_remove_caches(node);
+diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
+index 1ee878d126fdf..f0e4b0ea93e8c 100644
+--- a/drivers/base/power/domain.c
++++ b/drivers/base/power/domain.c
+@@ -1997,6 +1997,7 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
+ genpd->device_count = 0;
+ genpd->max_off_time_ns = -1;
+ genpd->max_off_time_changed = true;
++ genpd->next_wakeup = KTIME_MAX;
+ genpd->provider = NULL;
+ genpd->has_provider = false;
+ genpd->accounting_time = ktime_get();
+diff --git a/drivers/base/property.c b/drivers/base/property.c
+index c0e94cce9c294..2f6843c9612bd 100644
+--- a/drivers/base/property.c
++++ b/drivers/base/property.c
+@@ -47,12 +47,14 @@ bool fwnode_property_present(const struct fwnode_handle *fwnode,
+ {
+ bool ret;
+
++ if (IS_ERR_OR_NULL(fwnode))
++ return false;
++
+ ret = fwnode_call_bool_op(fwnode, property_present, propname);
+- if (ret == false && !IS_ERR_OR_NULL(fwnode) &&
+- !IS_ERR_OR_NULL(fwnode->secondary))
+- ret = fwnode_call_bool_op(fwnode->secondary, property_present,
+- propname);
+- return ret;
++ if (ret)
++ return ret;
++
++ return fwnode_call_bool_op(fwnode->secondary, property_present, propname);
+ }
+ EXPORT_SYMBOL_GPL(fwnode_property_present);
+
+@@ -232,15 +234,16 @@ static int fwnode_property_read_int_array(const struct fwnode_handle *fwnode,
+ {
+ int ret;
+
++ if (IS_ERR_OR_NULL(fwnode))
++ return -EINVAL;
++
+ ret = fwnode_call_int_op(fwnode, property_read_int_array, propname,
+ elem_size, val, nval);
+- if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
+- !IS_ERR_OR_NULL(fwnode->secondary))
+- ret = fwnode_call_int_op(
+- fwnode->secondary, property_read_int_array, propname,
+- elem_size, val, nval);
++ if (ret != -EINVAL)
++ return ret;
+
+- return ret;
++ return fwnode_call_int_op(fwnode->secondary, property_read_int_array, propname,
++ elem_size, val, nval);
+ }
+
+ /**
+@@ -371,14 +374,16 @@ int fwnode_property_read_string_array(const struct fwnode_handle *fwnode,
+ {
+ int ret;
+
++ if (IS_ERR_OR_NULL(fwnode))
++ return -EINVAL;
++
+ ret = fwnode_call_int_op(fwnode, property_read_string_array, propname,
+ val, nval);
+- if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
+- !IS_ERR_OR_NULL(fwnode->secondary))
+- ret = fwnode_call_int_op(fwnode->secondary,
+- property_read_string_array, propname,
+- val, nval);
+- return ret;
++ if (ret != -EINVAL)
++ return ret;
++
++ return fwnode_call_int_op(fwnode->secondary, property_read_string_array, propname,
++ val, nval);
+ }
+ EXPORT_SYMBOL_GPL(fwnode_property_read_string_array);
+
+@@ -480,15 +485,19 @@ int fwnode_property_get_reference_args(const struct fwnode_handle *fwnode,
+ {
+ int ret;
+
++ if (IS_ERR_OR_NULL(fwnode))
++ return -ENOENT;
++
+ ret = fwnode_call_int_op(fwnode, get_reference_args, prop, nargs_prop,
+ nargs, index, args);
++ if (ret == 0)
++ return ret;
+
+- if (ret < 0 && !IS_ERR_OR_NULL(fwnode) &&
+- !IS_ERR_OR_NULL(fwnode->secondary))
+- ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
+- prop, nargs_prop, nargs, index, args);
++ if (IS_ERR_OR_NULL(fwnode->secondary))
++ return ret;
+
+- return ret;
++ return fwnode_call_int_op(fwnode->secondary, get_reference_args, prop, nargs_prop,
++ nargs, index, args);
+ }
+ EXPORT_SYMBOL_GPL(fwnode_property_get_reference_args);
+
+@@ -635,12 +644,13 @@ EXPORT_SYMBOL_GPL(fwnode_count_parents);
+ struct fwnode_handle *fwnode_get_nth_parent(struct fwnode_handle *fwnode,
+ unsigned int depth)
+ {
+- unsigned int i;
+-
+ fwnode_handle_get(fwnode);
+
+- for (i = 0; i < depth && fwnode; i++)
++ do {
++ if (depth-- == 0)
++ break;
+ fwnode = fwnode_get_next_parent(fwnode);
++ } while (fwnode);
+
+ return fwnode;
+ }
+@@ -659,17 +669,17 @@ EXPORT_SYMBOL_GPL(fwnode_get_nth_parent);
+ bool fwnode_is_ancestor_of(struct fwnode_handle *test_ancestor,
+ struct fwnode_handle *test_child)
+ {
+- if (!test_ancestor)
++ if (IS_ERR_OR_NULL(test_ancestor))
+ return false;
+
+ fwnode_handle_get(test_child);
+- while (test_child) {
++ do {
+ if (test_child == test_ancestor) {
+ fwnode_handle_put(test_child);
+ return true;
+ }
+ test_child = fwnode_get_next_parent(test_child);
+- }
++ } while (test_child);
+ return false;
+ }
+
+@@ -698,7 +708,7 @@ fwnode_get_next_available_child_node(const struct fwnode_handle *fwnode,
+ {
+ struct fwnode_handle *next_child = child;
+
+- if (!fwnode)
++ if (IS_ERR_OR_NULL(fwnode))
+ return NULL;
+
+ do {
+@@ -722,16 +732,16 @@ struct fwnode_handle *device_get_next_child_node(struct device *dev,
+ const struct fwnode_handle *fwnode = dev_fwnode(dev);
+ struct fwnode_handle *next;
+
++ if (IS_ERR_OR_NULL(fwnode))
++ return NULL;
++
+ /* Try to find a child in primary fwnode */
+ next = fwnode_get_next_child_node(fwnode, child);
+ if (next)
+ return next;
+
+ /* When no more children in primary, continue with secondary */
+- if (fwnode && !IS_ERR_OR_NULL(fwnode->secondary))
+- next = fwnode_get_next_child_node(fwnode->secondary, child);
+-
+- return next;
++ return fwnode_get_next_child_node(fwnode->secondary, child);
+ }
+ EXPORT_SYMBOL_GPL(device_get_next_child_node);
+
+@@ -798,6 +808,9 @@ EXPORT_SYMBOL_GPL(fwnode_handle_put);
+ */
+ bool fwnode_device_is_available(const struct fwnode_handle *fwnode)
+ {
++ if (IS_ERR_OR_NULL(fwnode))
++ return false;
++
+ if (!fwnode_has_op(fwnode, device_is_available))
+ return true;
+
+@@ -988,14 +1001,14 @@ fwnode_graph_get_next_endpoint(const struct fwnode_handle *fwnode,
+ parent = fwnode_graph_get_port_parent(prev);
+ else
+ parent = fwnode;
++ if (IS_ERR_OR_NULL(parent))
++ return NULL;
+
+ ep = fwnode_call_ptr_op(parent, graph_get_next_endpoint, prev);
++ if (ep)
++ return ep;
+
+- if (IS_ERR_OR_NULL(ep) &&
+- !IS_ERR_OR_NULL(parent) && !IS_ERR_OR_NULL(parent->secondary))
+- ep = fwnode_graph_get_next_endpoint(parent->secondary, NULL);
+-
+- return ep;
++ return fwnode_graph_get_next_endpoint(parent->secondary, NULL);
+ }
+ EXPORT_SYMBOL_GPL(fwnode_graph_get_next_endpoint);
+
+diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
+index 4b0b25cc916ee..57b23e49ee91b 100644
+--- a/drivers/block/drbd/drbd_main.c
++++ b/drivers/block/drbd/drbd_main.c
+@@ -903,31 +903,6 @@ void drbd_gen_and_send_sync_uuid(struct drbd_peer_device *peer_device)
+ }
+ }
+
+-/* communicated if (agreed_features & DRBD_FF_WSAME) */
+-static void
+-assign_p_sizes_qlim(struct drbd_device *device, struct p_sizes *p,
+- struct request_queue *q)
+-{
+- if (q) {
+- p->qlim->physical_block_size = cpu_to_be32(queue_physical_block_size(q));
+- p->qlim->logical_block_size = cpu_to_be32(queue_logical_block_size(q));
+- p->qlim->alignment_offset = cpu_to_be32(queue_alignment_offset(q));
+- p->qlim->io_min = cpu_to_be32(queue_io_min(q));
+- p->qlim->io_opt = cpu_to_be32(queue_io_opt(q));
+- p->qlim->discard_enabled = blk_queue_discard(q);
+- p->qlim->write_same_capable = 0;
+- } else {
+- q = device->rq_queue;
+- p->qlim->physical_block_size = cpu_to_be32(queue_physical_block_size(q));
+- p->qlim->logical_block_size = cpu_to_be32(queue_logical_block_size(q));
+- p->qlim->alignment_offset = 0;
+- p->qlim->io_min = cpu_to_be32(queue_io_min(q));
+- p->qlim->io_opt = cpu_to_be32(queue_io_opt(q));
+- p->qlim->discard_enabled = 0;
+- p->qlim->write_same_capable = 0;
+- }
+-}
+-
+ int drbd_send_sizes(struct drbd_peer_device *peer_device, int trigger_reply, enum dds_flags flags)
+ {
+ struct drbd_device *device = peer_device->device;
+@@ -949,7 +924,9 @@ int drbd_send_sizes(struct drbd_peer_device *peer_device, int trigger_reply, enu
+
+ memset(p, 0, packet_size);
+ if (get_ldev_if_state(device, D_NEGOTIATING)) {
+- struct request_queue *q = bdev_get_queue(device->ldev->backing_bdev);
++ struct block_device *bdev = device->ldev->backing_bdev;
++ struct request_queue *q = bdev_get_queue(bdev);
++
+ d_size = drbd_get_max_capacity(device->ldev);
+ rcu_read_lock();
+ u_size = rcu_dereference(device->ldev->disk_conf)->disk_size;
+@@ -957,14 +934,32 @@ int drbd_send_sizes(struct drbd_peer_device *peer_device, int trigger_reply, enu
+ q_order_type = drbd_queue_order_type(device);
+ max_bio_size = queue_max_hw_sectors(q) << 9;
+ max_bio_size = min(max_bio_size, DRBD_MAX_BIO_SIZE);
+- assign_p_sizes_qlim(device, p, q);
++ p->qlim->physical_block_size =
++ cpu_to_be32(bdev_physical_block_size(bdev));
++ p->qlim->logical_block_size =
++ cpu_to_be32(bdev_logical_block_size(bdev));
++ p->qlim->alignment_offset =
++ cpu_to_be32(bdev_alignment_offset(bdev));
++ p->qlim->io_min = cpu_to_be32(bdev_io_min(bdev));
++ p->qlim->io_opt = cpu_to_be32(bdev_io_opt(bdev));
++ p->qlim->discard_enabled = blk_queue_discard(q);
+ put_ldev(device);
+ } else {
++ struct request_queue *q = device->rq_queue;
++
++ p->qlim->physical_block_size =
++ cpu_to_be32(queue_physical_block_size(q));
++ p->qlim->logical_block_size =
++ cpu_to_be32(queue_logical_block_size(q));
++ p->qlim->alignment_offset = 0;
++ p->qlim->io_min = cpu_to_be32(queue_io_min(q));
++ p->qlim->io_opt = cpu_to_be32(queue_io_opt(q));
++ p->qlim->discard_enabled = 0;
++
+ d_size = 0;
+ u_size = 0;
+ q_order_type = QUEUE_ORDERED_NONE;
+ max_bio_size = DRBD_MAX_BIO_SIZE; /* ... multiple BIOs per peer_request */
+- assign_p_sizes_qlim(device, p, NULL);
+ }
+
+ if (peer_device->connection->agreed_pro_version <= 94)
+@@ -3586,9 +3581,8 @@ const char *cmdname(enum drbd_packet cmd)
+ * when we want to support more than
+ * one PRO_VERSION */
+ static const char *cmdnames[] = {
++
+ [P_DATA] = "Data",
+- [P_WSAME] = "WriteSame",
+- [P_TRIM] = "Trim",
+ [P_DATA_REPLY] = "DataReply",
+ [P_RS_DATA_REPLY] = "RSDataReply",
+ [P_BARRIER] = "Barrier",
+@@ -3599,7 +3593,6 @@ const char *cmdname(enum drbd_packet cmd)
+ [P_DATA_REQUEST] = "DataRequest",
+ [P_RS_DATA_REQUEST] = "RSDataRequest",
+ [P_SYNC_PARAM] = "SyncParam",
+- [P_SYNC_PARAM89] = "SyncParam89",
+ [P_PROTOCOL] = "ReportProtocol",
+ [P_UUIDS] = "ReportUUIDs",
+ [P_SIZES] = "ReportSizes",
+@@ -3607,6 +3600,7 @@ const char *cmdname(enum drbd_packet cmd)
+ [P_SYNC_UUID] = "ReportSyncUUID",
+ [P_AUTH_CHALLENGE] = "AuthChallenge",
+ [P_AUTH_RESPONSE] = "AuthResponse",
++ [P_STATE_CHG_REQ] = "StateChgRequest",
+ [P_PING] = "Ping",
+ [P_PING_ACK] = "PingAck",
+ [P_RECV_ACK] = "RecvAck",
+@@ -3617,23 +3611,25 @@ const char *cmdname(enum drbd_packet cmd)
+ [P_NEG_DREPLY] = "NegDReply",
+ [P_NEG_RS_DREPLY] = "NegRSDReply",
+ [P_BARRIER_ACK] = "BarrierAck",
+- [P_STATE_CHG_REQ] = "StateChgRequest",
+ [P_STATE_CHG_REPLY] = "StateChgReply",
+ [P_OV_REQUEST] = "OVRequest",
+ [P_OV_REPLY] = "OVReply",
+ [P_OV_RESULT] = "OVResult",
+ [P_CSUM_RS_REQUEST] = "CsumRSRequest",
+ [P_RS_IS_IN_SYNC] = "CsumRSIsInSync",
++ [P_SYNC_PARAM89] = "SyncParam89",
+ [P_COMPRESSED_BITMAP] = "CBitmap",
+ [P_DELAY_PROBE] = "DelayProbe",
+ [P_OUT_OF_SYNC] = "OutOfSync",
+- [P_RETRY_WRITE] = "RetryWrite",
+ [P_RS_CANCEL] = "RSCancel",
+ [P_CONN_ST_CHG_REQ] = "conn_st_chg_req",
+ [P_CONN_ST_CHG_REPLY] = "conn_st_chg_reply",
+ [P_PROTOCOL_UPDATE] = "protocol_update",
++ [P_TRIM] = "Trim",
+ [P_RS_THIN_REQ] = "rs_thin_req",
+ [P_RS_DEALLOCATED] = "rs_deallocated",
++ [P_WSAME] = "WriteSame",
++ [P_ZEROES] = "Zeroes",
+
+ /* enum drbd_packet, but not commands - obsoleted flags:
+ * P_MAY_IGNORE
+diff --git a/drivers/block/loop.c b/drivers/block/loop.c
+index a58595f5ee2c8..ed7bec11948cd 100644
+--- a/drivers/block/loop.c
++++ b/drivers/block/loop.c
+@@ -1768,6 +1768,14 @@ out_unlock:
+ mutex_unlock(&lo->lo_mutex);
+ }
+
++static void lo_free_disk(struct gendisk *disk)
++{
++ struct loop_device *lo = disk->private_data;
++
++ mutex_destroy(&lo->lo_mutex);
++ kfree(lo);
++}
++
+ static const struct block_device_operations lo_fops = {
+ .owner = THIS_MODULE,
+ .open = lo_open,
+@@ -1776,6 +1784,7 @@ static const struct block_device_operations lo_fops = {
+ #ifdef CONFIG_COMPAT
+ .compat_ioctl = lo_compat_ioctl,
+ #endif
++ .free_disk = lo_free_disk,
+ };
+
+ /*
+@@ -2090,15 +2099,14 @@ static void loop_remove(struct loop_device *lo)
+ {
+ /* Make this loop device unreachable from pathname. */
+ del_gendisk(lo->lo_disk);
+- blk_cleanup_disk(lo->lo_disk);
++ blk_cleanup_queue(lo->lo_disk->queue);
+ blk_mq_free_tag_set(&lo->tag_set);
+
+ mutex_lock(&loop_ctl_mutex);
+ idr_remove(&loop_index_idr, lo->lo_number);
+ mutex_unlock(&loop_ctl_mutex);
+- /* There is no route which can find this loop device. */
+- mutex_destroy(&lo->lo_mutex);
+- kfree(lo);
++
++ put_disk(lo->lo_disk);
+ }
+
+ static void loop_probe(dev_t dev)
+diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
+index 5a1f98494dddf..2845570413360 100644
+--- a/drivers/block/nbd.c
++++ b/drivers/block/nbd.c
+@@ -947,11 +947,15 @@ static int wait_for_reconnect(struct nbd_device *nbd)
+ struct nbd_config *config = nbd->config;
+ if (!config->dead_conn_timeout)
+ return 0;
+- if (test_bit(NBD_RT_DISCONNECTED, &config->runtime_flags))
++
++ if (!wait_event_timeout(config->conn_wait,
++ test_bit(NBD_RT_DISCONNECTED,
++ &config->runtime_flags) ||
++ atomic_read(&config->live_connections) > 0,
++ config->dead_conn_timeout))
+ return 0;
+- return wait_event_timeout(config->conn_wait,
+- atomic_read(&config->live_connections) > 0,
+- config->dead_conn_timeout) > 0;
++
++ return !test_bit(NBD_RT_DISCONNECTED, &config->runtime_flags);
+ }
+
+ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
+@@ -2082,6 +2086,7 @@ static void nbd_disconnect_and_put(struct nbd_device *nbd)
+ mutex_lock(&nbd->config_lock);
+ nbd_disconnect(nbd);
+ sock_shutdown(nbd);
++ wake_up(&nbd->config->conn_wait);
+ /*
+ * Make sure recv thread has finished, we can safely call nbd_clear_que()
+ * to cancel the inflight I/Os.
+diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
+index a8bcf3f664af1..10bba1e00f2b4 100644
+--- a/drivers/block/virtio_blk.c
++++ b/drivers/block/virtio_blk.c
+@@ -867,11 +867,12 @@ static int virtblk_probe(struct virtio_device *vdev)
+ blk_queue_io_opt(q, blk_size * opt_io_size);
+
+ if (virtio_has_feature(vdev, VIRTIO_BLK_F_DISCARD)) {
+- q->limits.discard_granularity = blk_size;
+-
+ virtio_cread(vdev, struct virtio_blk_config,
+ discard_sector_alignment, &v);
+- q->limits.discard_alignment = v ? v << SECTOR_SHIFT : 0;
++ if (v)
++ q->limits.discard_granularity = v << SECTOR_SHIFT;
++ else
++ q->limits.discard_granularity = blk_size;
+
+ virtio_cread(vdev, struct virtio_blk_config,
+ max_discard_sectors, &v);
+diff --git a/drivers/bluetooth/btmtksdio.c b/drivers/bluetooth/btmtksdio.c
+index f3dc5881fff70..d6700efcfe8cd 100644
+--- a/drivers/bluetooth/btmtksdio.c
++++ b/drivers/bluetooth/btmtksdio.c
+@@ -379,6 +379,7 @@ static int btmtksdio_recv_event(struct hci_dev *hdev, struct sk_buff *skb)
+ {
+ struct btmtksdio_dev *bdev = hci_get_drvdata(hdev);
+ struct hci_event_hdr *hdr = (void *)skb->data;
++ u8 evt = hdr->evt;
+ int err;
+
+ /* When someone waits for the WMT event, the skb is being cloned
+@@ -396,7 +397,7 @@ static int btmtksdio_recv_event(struct hci_dev *hdev, struct sk_buff *skb)
+ if (err < 0)
+ goto err_free_skb;
+
+- if (hdr->evt == HCI_EV_WMT) {
++ if (evt == HCI_EV_WMT) {
+ if (test_and_clear_bit(BTMTKSDIO_TX_WAIT_VND_EVT,
+ &bdev->tx_state)) {
+ /* Barrier to sync with other CPUs */
+@@ -863,6 +864,14 @@ static int mt79xx_setup(struct hci_dev *hdev, const char *fwname)
+ return err;
+ }
+
++ err = btmtksdio_fw_pmctrl(bdev);
++ if (err < 0)
++ return err;
++
++ err = btmtksdio_drv_pmctrl(bdev);
++ if (err < 0)
++ return err;
++
+ /* Enable Bluetooth protocol */
+ wmt_params.op = BTMTK_WMT_FUNC_CTRL;
+ wmt_params.flag = 0;
+@@ -961,7 +970,7 @@ static int btmtksdio_get_codec_config_data(struct hci_dev *hdev,
+ }
+
+ *ven_data = kmalloc(sizeof(__u8), GFP_KERNEL);
+- if (!ven_data) {
++ if (!*ven_data) {
+ err = -ENOMEM;
+ goto error;
+ }
+@@ -1108,14 +1117,6 @@ static int btmtksdio_setup(struct hci_dev *hdev)
+ if (err < 0)
+ return err;
+
+- err = btmtksdio_fw_pmctrl(bdev);
+- if (err < 0)
+- return err;
+-
+- err = btmtksdio_drv_pmctrl(bdev);
+- if (err < 0)
+- return err;
+-
+ /* Enable SCO over I2S/PCM */
+ err = btmtksdio_sco_setting(hdev);
+ if (err < 0) {
+@@ -1188,6 +1189,10 @@ static int btmtksdio_shutdown(struct hci_dev *hdev)
+ */
+ pm_runtime_get_sync(bdev->dev);
+
++ /* wmt command only works until the reset is complete */
++ if (test_bit(BTMTKSDIO_HW_RESET_ACTIVE, &bdev->tx_state))
++ goto ignore_wmt_cmd;
++
+ /* Disable the device */
+ wmt_params.op = BTMTK_WMT_FUNC_CTRL;
+ wmt_params.flag = 0;
+@@ -1201,6 +1206,7 @@ static int btmtksdio_shutdown(struct hci_dev *hdev)
+ return err;
+ }
+
++ignore_wmt_cmd:
+ pm_runtime_put_noidle(bdev->dev);
+ pm_runtime_disable(bdev->dev);
+
+diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
+index 50df417207afd..e48c3ad069bb4 100644
+--- a/drivers/bluetooth/btusb.c
++++ b/drivers/bluetooth/btusb.c
+@@ -3335,6 +3335,12 @@ static int btusb_setup_qca(struct hci_dev *hdev)
+ msleep(QCA_BT_RESET_WAIT_MS);
+ }
+
++ /* Mark HCI_OP_ENHANCED_SETUP_SYNC_CONN as broken as it doesn't seem to
++ * work with the likes of HSP/HFP mSBC.
++ */
++ set_bit(HCI_QUIRK_BROKEN_ENHANCED_SETUP_SYNC_CONN, &hdev->quirks);
++ set_bit(HCI_QUIRK_BROKEN_ERR_DATA_REPORTING, &hdev->quirks);
++
+ return 0;
+ }
+
+diff --git a/drivers/char/hw_random/cn10k-rng.c b/drivers/char/hw_random/cn10k-rng.c
+index 35001c63648bb..a01e9307737c5 100644
+--- a/drivers/char/hw_random/cn10k-rng.c
++++ b/drivers/char/hw_random/cn10k-rng.c
+@@ -31,26 +31,23 @@ struct cn10k_rng {
+
+ #define PLAT_OCTEONTX_RESET_RNG_EBG_HEALTH_STATE 0xc2000b0f
+
+-static int reset_rng_health_state(struct cn10k_rng *rng)
++static unsigned long reset_rng_health_state(struct cn10k_rng *rng)
+ {
+ struct arm_smccc_res res;
+
+ /* Send SMC service call to reset EBG health state */
+ arm_smccc_smc(PLAT_OCTEONTX_RESET_RNG_EBG_HEALTH_STATE, 0, 0, 0, 0, 0, 0, 0, &res);
+- if (res.a0 != 0UL)
+- return -EIO;
+-
+- return 0;
++ return res.a0;
+ }
+
+ static int check_rng_health(struct cn10k_rng *rng)
+ {
+ u64 status;
+- int err;
++ unsigned long err;
+
+ /* Skip checking health */
+ if (!rng->reg_base)
+- return 0;
++ return -ENODEV;
+
+ status = readq(rng->reg_base + RNM_PF_EBG_HEALTH);
+ if (status & BIT_ULL(20)) {
+@@ -58,7 +55,9 @@ static int check_rng_health(struct cn10k_rng *rng)
+ if (err) {
+ dev_err(&rng->pdev->dev, "HWRNG: Health test failed (status=%llx)\n",
+ status);
+- dev_err(&rng->pdev->dev, "HWRNG: error during reset\n");
++ dev_err(&rng->pdev->dev, "HWRNG: error during reset (error=%lx)\n",
++ err);
++ return -EIO;
+ }
+ }
+ return 0;
+@@ -90,6 +89,7 @@ static int cn10k_rng_read(struct hwrng *hwrng, void *data,
+ {
+ struct cn10k_rng *rng = (struct cn10k_rng *)hwrng->priv;
+ unsigned int size;
++ u8 *pos = data;
+ int err = 0;
+ u64 value;
+
+@@ -102,17 +102,20 @@ static int cn10k_rng_read(struct hwrng *hwrng, void *data,
+ while (size >= 8) {
+ cn10k_read_trng(rng, &value);
+
+- *((u64 *)data) = (u64)value;
++ *((u64 *)pos) = value;
+ size -= 8;
+- data += 8;
++ pos += 8;
+ }
+
+- while (size > 0) {
++ if (size > 0) {
+ cn10k_read_trng(rng, &value);
+
+- *((u8 *)data) = (u8)value;
+- size--;
+- data++;
++ while (size > 0) {
++ *pos = (u8)value;
++ value >>= 8;
++ size--;
++ pos++;
++ }
+ }
+
+ return max - size;
+diff --git a/drivers/char/hw_random/omap3-rom-rng.c b/drivers/char/hw_random/omap3-rom-rng.c
+index e0d77fa048fb6..f06e4f95114f9 100644
+--- a/drivers/char/hw_random/omap3-rom-rng.c
++++ b/drivers/char/hw_random/omap3-rom-rng.c
+@@ -92,7 +92,7 @@ static int __maybe_unused omap_rom_rng_runtime_resume(struct device *dev)
+
+ r = ddata->rom_rng_call(0, 0, RNG_GEN_PRNG_HW_INIT);
+ if (r != 0) {
+- clk_disable(ddata->clk);
++ clk_disable_unprepare(ddata->clk);
+ dev_err(dev, "HW init failed: %d\n", r);
+
+ return -EIO;
+diff --git a/drivers/char/ipmi/ipmi_ipmb.c b/drivers/char/ipmi/ipmi_ipmb.c
+index b81b862532fb0..a8bfe0ade082b 100644
+--- a/drivers/char/ipmi/ipmi_ipmb.c
++++ b/drivers/char/ipmi/ipmi_ipmb.c
+@@ -476,6 +476,7 @@ static int ipmi_ipmb_probe(struct i2c_client *client,
+ slave_np = of_parse_phandle(dev->of_node, "slave-dev", 0);
+ if (slave_np) {
+ slave_adap = of_get_i2c_adapter_by_node(slave_np);
++ of_node_put(slave_np);
+ if (!slave_adap) {
+ dev_notice(&client->dev,
+ "Could not find slave adapter\n");
+diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
+index f1827257ef0e0..2610e809c802b 100644
+--- a/drivers/char/ipmi/ipmi_msghandler.c
++++ b/drivers/char/ipmi/ipmi_msghandler.c
+@@ -11,8 +11,8 @@
+ * Copyright 2002 MontaVista Software Inc.
+ */
+
+-#define pr_fmt(fmt) "%s" fmt, "IPMI message handler: "
+-#define dev_fmt pr_fmt
++#define pr_fmt(fmt) "IPMI message handler: " fmt
++#define dev_fmt(fmt) pr_fmt(fmt)
+
+ #include <linux/module.h>
+ #include <linux/errno.h>
+diff --git a/drivers/char/ipmi/ipmi_poweroff.c b/drivers/char/ipmi/ipmi_poweroff.c
+index bc3a18daf97a6..62e71c46ac5f7 100644
+--- a/drivers/char/ipmi/ipmi_poweroff.c
++++ b/drivers/char/ipmi/ipmi_poweroff.c
+@@ -94,9 +94,7 @@ static void dummy_recv_free(struct ipmi_recv_msg *msg)
+ {
+ atomic_dec(&dummy_count);
+ }
+-static struct ipmi_smi_msg halt_smi_msg = {
+- .done = dummy_smi_free
+-};
++static struct ipmi_smi_msg halt_smi_msg = INIT_IPMI_SMI_MSG(dummy_smi_free);
+ static struct ipmi_recv_msg halt_recv_msg = {
+ .done = dummy_recv_free
+ };
+diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
+index f199cc1948446..64c73ea9c9151 100644
+--- a/drivers/char/ipmi/ipmi_ssif.c
++++ b/drivers/char/ipmi/ipmi_ssif.c
+@@ -814,6 +814,14 @@ static void msg_done_handler(struct ssif_info *ssif_info, int result,
+ break;
+
+ case SSIF_GETTING_EVENTS:
++ if (!msg) {
++ /* Should never happen, but just in case. */
++ dev_warn(&ssif_info->client->dev,
++ "No message set while getting events\n");
++ ipmi_ssif_unlock_cond(ssif_info, flags);
++ break;
++ }
++
+ if ((result < 0) || (len < 3) || (msg->rsp[2] != 0)) {
+ /* Error getting event, probably done. */
+ msg->done(msg);
+@@ -838,6 +846,14 @@ static void msg_done_handler(struct ssif_info *ssif_info, int result,
+ break;
+
+ case SSIF_GETTING_MESSAGES:
++ if (!msg) {
++ /* Should never happen, but just in case. */
++ dev_warn(&ssif_info->client->dev,
++ "No message set while getting messages\n");
++ ipmi_ssif_unlock_cond(ssif_info, flags);
++ break;
++ }
++
+ if ((result < 0) || (len < 3) || (msg->rsp[2] != 0)) {
+ /* Error getting event, probably done. */
+ msg->done(msg);
+@@ -861,6 +877,13 @@ static void msg_done_handler(struct ssif_info *ssif_info, int result,
+ deliver_recv_msg(ssif_info, msg);
+ }
+ break;
++
++ default:
++ /* Should never happen, but just in case. */
++ dev_warn(&ssif_info->client->dev,
++ "Invalid state in message done handling: %d\n",
++ ssif_info->ssif_state);
++ ipmi_ssif_unlock_cond(ssif_info, flags);
+ }
+
+ flags = ipmi_ssif_lock_cond(ssif_info, &oflags);
+diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c
+index 0604abdd249a1..4c1e9663ea479 100644
+--- a/drivers/char/ipmi/ipmi_watchdog.c
++++ b/drivers/char/ipmi/ipmi_watchdog.c
+@@ -354,9 +354,7 @@ static void msg_free_recv(struct ipmi_recv_msg *msg)
+ complete(&msg_wait);
+ }
+ }
+-static struct ipmi_smi_msg smi_msg = {
+- .done = msg_free_smi
+-};
++static struct ipmi_smi_msg smi_msg = INIT_IPMI_SMI_MSG(msg_free_smi);
+ static struct ipmi_recv_msg recv_msg = {
+ .done = msg_free_recv
+ };
+@@ -475,9 +473,8 @@ static void panic_recv_free(struct ipmi_recv_msg *msg)
+ atomic_dec(&panic_done_count);
+ }
+
+-static struct ipmi_smi_msg panic_halt_heartbeat_smi_msg = {
+- .done = panic_smi_free
+-};
++static struct ipmi_smi_msg panic_halt_heartbeat_smi_msg =
++ INIT_IPMI_SMI_MSG(panic_smi_free);
+ static struct ipmi_recv_msg panic_halt_heartbeat_recv_msg = {
+ .done = panic_recv_free
+ };
+@@ -516,9 +513,8 @@ static void panic_halt_ipmi_heartbeat(void)
+ atomic_sub(2, &panic_done_count);
+ }
+
+-static struct ipmi_smi_msg panic_halt_smi_msg = {
+- .done = panic_smi_free
+-};
++static struct ipmi_smi_msg panic_halt_smi_msg =
++ INIT_IPMI_SMI_MSG(panic_smi_free);
+ static struct ipmi_recv_msg panic_halt_recv_msg = {
+ .done = panic_recv_free
+ };
+diff --git a/drivers/char/random.c b/drivers/char/random.c
+index 7a66eec08e373..0cfbfa8d5b50a 100644
+--- a/drivers/char/random.c
++++ b/drivers/char/random.c
+@@ -78,8 +78,7 @@ static enum {
+ CRNG_EARLY = 1, /* At least POOL_EARLY_BITS collected */
+ CRNG_READY = 2 /* Fully initialized with POOL_READY_BITS collected */
+ } crng_init __read_mostly = CRNG_EMPTY;
+-static DEFINE_STATIC_KEY_FALSE(crng_is_ready);
+-#define crng_ready() (static_branch_likely(&crng_is_ready) || crng_init >= CRNG_READY)
++#define crng_ready() (likely(crng_init >= CRNG_READY))
+ /* Various types of waiters for crng_init->CRNG_READY transition. */
+ static DECLARE_WAIT_QUEUE_HEAD(crng_init_wait);
+ static struct fasync_struct *fasync;
+@@ -109,11 +108,6 @@ bool rng_is_initialized(void)
+ }
+ EXPORT_SYMBOL(rng_is_initialized);
+
+-static void __cold crng_set_ready(struct work_struct *work)
+-{
+- static_branch_enable(&crng_is_ready);
+-}
+-
+ /* Used by wait_for_random_bytes(), and considered an entropy collector, below. */
+ static void try_to_generate_entropy(void);
+
+@@ -267,7 +261,7 @@ static void crng_reseed(void)
+ ++next_gen;
+ WRITE_ONCE(base_crng.generation, next_gen);
+ WRITE_ONCE(base_crng.birth, jiffies);
+- if (!static_branch_likely(&crng_is_ready))
++ if (!crng_ready())
+ crng_init = CRNG_READY;
+ spin_unlock_irqrestore(&base_crng.lock, flags);
+ memzero_explicit(key, sizeof(key));
+@@ -710,7 +704,6 @@ static void extract_entropy(void *buf, size_t len)
+
+ static void __cold _credit_init_bits(size_t bits)
+ {
+- static struct execute_work set_ready;
+ unsigned int new, orig, add;
+ unsigned long flags;
+
+@@ -726,7 +719,6 @@ static void __cold _credit_init_bits(size_t bits)
+
+ if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
+ crng_reseed(); /* Sets crng_init to CRNG_READY under base_crng.lock. */
+- execute_in_process_context(crng_set_ready, &set_ready);
+ process_random_ready_list();
+ wake_up_interruptible(&crng_init_wait);
+ kill_fasync(&fasync, SIGIO, POLL_IN);
+diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
+index f6c0affbb4567..bf608b6af3395 100644
+--- a/drivers/char/tpm/tpm_tis_i2c_cr50.c
++++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
+@@ -768,8 +768,8 @@ static int tpm_cr50_i2c_remove(struct i2c_client *client)
+ struct device *dev = &client->dev;
+
+ if (!chip) {
+- dev_err(dev, "Could not get client data at remove\n");
+- return -ENODEV;
++ dev_crit(dev, "Could not get client data at remove, memory corruption ahead\n");
++ return 0;
+ }
+
+ tpm_chip_unregister(chip);
+diff --git a/drivers/clk/tegra/clk-dfll.c b/drivers/clk/tegra/clk-dfll.c
+index 6144447f86c63..62238dca9a534 100644
+--- a/drivers/clk/tegra/clk-dfll.c
++++ b/drivers/clk/tegra/clk-dfll.c
+@@ -271,6 +271,7 @@ struct tegra_dfll {
+ struct clk *ref_clk;
+ struct clk *i2c_clk;
+ struct clk *dfll_clk;
++ struct reset_control *dfll_rst;
+ struct reset_control *dvco_rst;
+ unsigned long ref_rate;
+ unsigned long i2c_clk_rate;
+@@ -1464,6 +1465,7 @@ static int dfll_init(struct tegra_dfll *td)
+ return -EINVAL;
+ }
+
++ reset_control_deassert(td->dfll_rst);
+ reset_control_deassert(td->dvco_rst);
+
+ ret = clk_prepare(td->ref_clk);
+@@ -1509,6 +1511,7 @@ di_err1:
+ clk_unprepare(td->ref_clk);
+
+ reset_control_assert(td->dvco_rst);
++ reset_control_assert(td->dfll_rst);
+
+ return ret;
+ }
+@@ -1530,6 +1533,7 @@ int tegra_dfll_suspend(struct device *dev)
+ }
+
+ reset_control_assert(td->dvco_rst);
++ reset_control_assert(td->dfll_rst);
+
+ return 0;
+ }
+@@ -1548,6 +1552,7 @@ int tegra_dfll_resume(struct device *dev)
+ {
+ struct tegra_dfll *td = dev_get_drvdata(dev);
+
++ reset_control_deassert(td->dfll_rst);
+ reset_control_deassert(td->dvco_rst);
+
+ pm_runtime_get_sync(td->dev);
+@@ -1951,6 +1956,12 @@ int tegra_dfll_register(struct platform_device *pdev,
+
+ td->soc = soc;
+
++ td->dfll_rst = devm_reset_control_get_optional(td->dev, "dfll");
++ if (IS_ERR(td->dfll_rst)) {
++ dev_err(td->dev, "couldn't get dfll reset\n");
++ return PTR_ERR(td->dfll_rst);
++ }
++
+ td->dvco_rst = devm_reset_control_get(td->dev, "dvco");
+ if (IS_ERR(td->dvco_rst)) {
+ dev_err(td->dev, "couldn't get dvco reset\n");
+@@ -2087,6 +2098,7 @@ struct tegra_dfll_soc_data *tegra_dfll_unregister(struct platform_device *pdev)
+ clk_unprepare(td->i2c_clk);
+
+ reset_control_assert(td->dvco_rst);
++ reset_control_assert(td->dfll_rst);
+
+ return td->soc;
+ }
+diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
+index 80f535cc8a757..fbaa8e6c7d232 100644
+--- a/drivers/cpufreq/cpufreq.c
++++ b/drivers/cpufreq/cpufreq.c
+@@ -28,6 +28,7 @@
+ #include <linux/suspend.h>
+ #include <linux/syscore_ops.h>
+ #include <linux/tick.h>
++#include <linux/units.h>
+ #include <trace/events/power.h>
+
+ static LIST_HEAD(cpufreq_policy_list);
+@@ -1707,6 +1708,16 @@ static unsigned int cpufreq_verify_current_freq(struct cpufreq_policy *policy, b
+ return new_freq;
+
+ if (policy->cur != new_freq) {
++ /*
++ * For some platforms, the frequency returned by hardware may be
++ * slightly different from what is provided in the frequency
++ * table, for example hardware may return 499 MHz instead of 500
++ * MHz. In such cases it is better to avoid getting into
++ * unnecessary frequency updates.
++ */
++ if (abs(policy->cur - new_freq) < HZ_PER_MHZ)
++ return policy->cur;
++
+ cpufreq_out_of_sync(policy, new_freq);
+ if (update)
+ schedule_work(&policy->update);
+diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
+index 0d42cf8b88d8a..85da677c43d6b 100644
+--- a/drivers/cpufreq/cpufreq_governor.c
++++ b/drivers/cpufreq/cpufreq_governor.c
+@@ -388,6 +388,15 @@ static void free_policy_dbs_info(struct policy_dbs_info *policy_dbs,
+ gov->free(policy_dbs);
+ }
+
++static void cpufreq_dbs_data_release(struct kobject *kobj)
++{
++ struct dbs_data *dbs_data = to_dbs_data(to_gov_attr_set(kobj));
++ struct dbs_governor *gov = dbs_data->gov;
++
++ gov->exit(dbs_data);
++ kfree(dbs_data);
++}
++
+ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
+ {
+ struct dbs_governor *gov = dbs_governor_of(policy);
+@@ -425,6 +434,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
+ goto free_policy_dbs_info;
+ }
+
++ dbs_data->gov = gov;
+ gov_attr_set_init(&dbs_data->attr_set, &policy_dbs->list);
+
+ ret = gov->init(dbs_data);
+@@ -447,6 +457,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
+ policy->governor_data = policy_dbs;
+
+ gov->kobj_type.sysfs_ops = &governor_sysfs_ops;
++ gov->kobj_type.release = cpufreq_dbs_data_release;
+ ret = kobject_init_and_add(&dbs_data->attr_set.kobj, &gov->kobj_type,
+ get_governor_parent_kobj(policy),
+ "%s", gov->gov.name);
+@@ -488,13 +499,8 @@ void cpufreq_dbs_governor_exit(struct cpufreq_policy *policy)
+
+ policy->governor_data = NULL;
+
+- if (!count) {
+- if (!have_governor_per_policy())
+- gov->gdbs_data = NULL;
+-
+- gov->exit(dbs_data);
+- kfree(dbs_data);
+- }
++ if (!count && !have_governor_per_policy())
++ gov->gdbs_data = NULL;
+
+ free_policy_dbs_info(policy_dbs, gov);
+
+diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
+index a5a0bc3cc23ec..168c23fd7fcac 100644
+--- a/drivers/cpufreq/cpufreq_governor.h
++++ b/drivers/cpufreq/cpufreq_governor.h
+@@ -37,6 +37,7 @@ enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
+ /* Governor demand based switching data (per-policy or global). */
+ struct dbs_data {
+ struct gov_attr_set attr_set;
++ struct dbs_governor *gov;
+ void *tuners;
+ unsigned int ignore_nice_load;
+ unsigned int sampling_rate;
+diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
+index 866163883b48d..bfe240c726e34 100644
+--- a/drivers/cpufreq/mediatek-cpufreq.c
++++ b/drivers/cpufreq/mediatek-cpufreq.c
+@@ -44,6 +44,8 @@ struct mtk_cpu_dvfs_info {
+ bool need_voltage_tracking;
+ };
+
++static struct platform_device *cpufreq_pdev;
++
+ static LIST_HEAD(dvfs_info_list);
+
+ static struct mtk_cpu_dvfs_info *mtk_cpu_dvfs_info_lookup(int cpu)
+@@ -547,7 +549,6 @@ static int __init mtk_cpufreq_driver_init(void)
+ {
+ struct device_node *np;
+ const struct of_device_id *match;
+- struct platform_device *pdev;
+ int err;
+
+ np = of_find_node_by_path("/");
+@@ -571,16 +572,23 @@ static int __init mtk_cpufreq_driver_init(void)
+ * and the device registration codes are put here to handle defer
+ * probing.
+ */
+- pdev = platform_device_register_simple("mtk-cpufreq", -1, NULL, 0);
+- if (IS_ERR(pdev)) {
++ cpufreq_pdev = platform_device_register_simple("mtk-cpufreq", -1, NULL, 0);
++ if (IS_ERR(cpufreq_pdev)) {
+ pr_err("failed to register mtk-cpufreq platform device\n");
+ platform_driver_unregister(&mtk_cpufreq_platdrv);
+- return PTR_ERR(pdev);
++ return PTR_ERR(cpufreq_pdev);
+ }
+
+ return 0;
+ }
+-device_initcall(mtk_cpufreq_driver_init);
++module_init(mtk_cpufreq_driver_init)
++
++static void __exit mtk_cpufreq_driver_exit(void)
++{
++ platform_device_unregister(cpufreq_pdev);
++ platform_driver_unregister(&mtk_cpufreq_platdrv);
++}
++module_exit(mtk_cpufreq_driver_exit)
+
+ MODULE_DESCRIPTION("MediaTek CPUFreq driver");
+ MODULE_AUTHOR("Pi-Cheng Chen <pi-cheng.chen@linaro.org>");
+diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
+index 755bbdfc5b82f..3db4fca1172b4 100644
+--- a/drivers/cpuidle/cpuidle-psci-domain.c
++++ b/drivers/cpuidle/cpuidle-psci-domain.c
+@@ -52,7 +52,7 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
+ struct generic_pm_domain *pd;
+ struct psci_pd_provider *pd_provider;
+ struct dev_power_governor *pd_gov;
+- int ret = -ENOMEM, state_count = 0;
++ int ret = -ENOMEM;
+
+ pd = dt_idle_pd_alloc(np, psci_dt_parse_state_node);
+ if (!pd)
+@@ -71,7 +71,7 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
+ pd->flags |= GENPD_FLAG_ALWAYS_ON;
+
+ /* Use governor for CPU PM domains if it has some states to manage. */
+- pd_gov = state_count > 0 ? &pm_domain_cpu_gov : NULL;
++ pd_gov = pd->states ? &pm_domain_cpu_gov : NULL;
+
+ ret = pm_genpd_init(pd, pd_gov, false);
+ if (ret)
+diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
+index b51b5df084500..540105ca0781f 100644
+--- a/drivers/cpuidle/cpuidle-psci.c
++++ b/drivers/cpuidle/cpuidle-psci.c
+@@ -23,6 +23,7 @@
+ #include <linux/pm_runtime.h>
+ #include <linux/slab.h>
+ #include <linux/string.h>
++#include <linux/syscore_ops.h>
+
+ #include <asm/cpuidle.h>
+
+@@ -131,6 +132,49 @@ static int psci_idle_cpuhp_down(unsigned int cpu)
+ return 0;
+ }
+
++static void psci_idle_syscore_switch(bool suspend)
++{
++ bool cleared = false;
++ struct device *dev;
++ int cpu;
++
++ for_each_possible_cpu(cpu) {
++ dev = per_cpu_ptr(&psci_cpuidle_data, cpu)->dev;
++
++ if (dev && suspend) {
++ dev_pm_genpd_suspend(dev);
++ } else if (dev) {
++ dev_pm_genpd_resume(dev);
++
++ /* Account for userspace having offlined a CPU. */
++ if (pm_runtime_status_suspended(dev))
++ pm_runtime_set_active(dev);
++
++ /* Clear domain state to re-start fresh. */
++ if (!cleared) {
++ psci_set_domain_state(0);
++ cleared = true;
++ }
++ }
++ }
++}
++
++static int psci_idle_syscore_suspend(void)
++{
++ psci_idle_syscore_switch(true);
++ return 0;
++}
++
++static void psci_idle_syscore_resume(void)
++{
++ psci_idle_syscore_switch(false);
++}
++
++static struct syscore_ops psci_idle_syscore_ops = {
++ .suspend = psci_idle_syscore_suspend,
++ .resume = psci_idle_syscore_resume,
++};
++
+ static void psci_idle_init_cpuhp(void)
+ {
+ int err;
+@@ -138,6 +182,8 @@ static void psci_idle_init_cpuhp(void)
+ if (!psci_cpuidle_use_cpuhp)
+ return;
+
++ register_syscore_ops(&psci_idle_syscore_ops);
++
+ err = cpuhp_setup_state_nocalls(CPUHP_AP_CPU_PM_STARTING,
+ "cpuidle/psci:online",
+ psci_idle_cpuhp_up,
+diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
+index 5c852e6719924..1151e5e2ba824 100644
+--- a/drivers/cpuidle/cpuidle-riscv-sbi.c
++++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
+@@ -414,7 +414,7 @@ static int sbi_pd_init(struct device_node *np)
+ struct generic_pm_domain *pd;
+ struct sbi_pd_provider *pd_provider;
+ struct dev_power_governor *pd_gov;
+- int ret = -ENOMEM, state_count = 0;
++ int ret = -ENOMEM;
+
+ pd = dt_idle_pd_alloc(np, sbi_dt_parse_state_node);
+ if (!pd)
+@@ -433,7 +433,7 @@ static int sbi_pd_init(struct device_node *np)
+ pd->flags |= GENPD_FLAG_ALWAYS_ON;
+
+ /* Use governor for CPU PM domains if it has some states to manage. */
+- pd_gov = state_count > 0 ? &pm_domain_cpu_gov : NULL;
++ pd_gov = pd->states ? &pm_domain_cpu_gov : NULL;
+
+ ret = pm_genpd_init(pd, pd_gov, false);
+ if (ret)
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+index 554e400d41cad..70e2e6e373897 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+@@ -93,6 +93,68 @@ static int sun8i_ss_cipher_fallback(struct skcipher_request *areq)
+ return err;
+ }
+
++static int sun8i_ss_setup_ivs(struct skcipher_request *areq)
++{
++ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
++ struct sun8i_cipher_tfm_ctx *op = crypto_skcipher_ctx(tfm);
++ struct sun8i_ss_dev *ss = op->ss;
++ struct sun8i_cipher_req_ctx *rctx = skcipher_request_ctx(areq);
++ struct scatterlist *sg = areq->src;
++ unsigned int todo, offset;
++ unsigned int len = areq->cryptlen;
++ unsigned int ivsize = crypto_skcipher_ivsize(tfm);
++ struct sun8i_ss_flow *sf = &ss->flows[rctx->flow];
++ int i = 0;
++ u32 a;
++ int err;
++
++ rctx->ivlen = ivsize;
++ if (rctx->op_dir & SS_DECRYPTION) {
++ offset = areq->cryptlen - ivsize;
++ scatterwalk_map_and_copy(sf->biv, areq->src, offset,
++ ivsize, 0);
++ }
++
++ /* we need to copy all IVs from source in case DMA is bi-directionnal */
++ while (sg && len) {
++ if (sg_dma_len(sg) == 0) {
++ sg = sg_next(sg);
++ continue;
++ }
++ if (i == 0)
++ memcpy(sf->iv[0], areq->iv, ivsize);
++ a = dma_map_single(ss->dev, sf->iv[i], ivsize, DMA_TO_DEVICE);
++ if (dma_mapping_error(ss->dev, a)) {
++ memzero_explicit(sf->iv[i], ivsize);
++ dev_err(ss->dev, "Cannot DMA MAP IV\n");
++ err = -EFAULT;
++ goto dma_iv_error;
++ }
++ rctx->p_iv[i] = a;
++ /* we need to setup all others IVs only in the decrypt way */
++ if (rctx->op_dir & SS_ENCRYPTION)
++ return 0;
++ todo = min(len, sg_dma_len(sg));
++ len -= todo;
++ i++;
++ if (i < MAX_SG) {
++ offset = sg->length - ivsize;
++ scatterwalk_map_and_copy(sf->iv[i], sg, offset, ivsize, 0);
++ }
++ rctx->niv = i;
++ sg = sg_next(sg);
++ }
++
++ return 0;
++dma_iv_error:
++ i--;
++ while (i >= 0) {
++ dma_unmap_single(ss->dev, rctx->p_iv[i], ivsize, DMA_TO_DEVICE);
++ memzero_explicit(sf->iv[i], ivsize);
++ }
++ return err;
++}
++
+ static int sun8i_ss_cipher(struct skcipher_request *areq)
+ {
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
+@@ -101,9 +163,9 @@ static int sun8i_ss_cipher(struct skcipher_request *areq)
+ struct sun8i_cipher_req_ctx *rctx = skcipher_request_ctx(areq);
+ struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
+ struct sun8i_ss_alg_template *algt;
++ struct sun8i_ss_flow *sf = &ss->flows[rctx->flow];
+ struct scatterlist *sg;
+ unsigned int todo, len, offset, ivsize;
+- void *backup_iv = NULL;
+ int nr_sgs = 0;
+ int nr_sgd = 0;
+ int err = 0;
+@@ -134,30 +196,9 @@ static int sun8i_ss_cipher(struct skcipher_request *areq)
+
+ ivsize = crypto_skcipher_ivsize(tfm);
+ if (areq->iv && crypto_skcipher_ivsize(tfm) > 0) {
+- rctx->ivlen = ivsize;
+- rctx->biv = kzalloc(ivsize, GFP_KERNEL | GFP_DMA);
+- if (!rctx->biv) {
+- err = -ENOMEM;
++ err = sun8i_ss_setup_ivs(areq);
++ if (err)
+ goto theend_key;
+- }
+- if (rctx->op_dir & SS_DECRYPTION) {
+- backup_iv = kzalloc(ivsize, GFP_KERNEL);
+- if (!backup_iv) {
+- err = -ENOMEM;
+- goto theend_key;
+- }
+- offset = areq->cryptlen - ivsize;
+- scatterwalk_map_and_copy(backup_iv, areq->src, offset,
+- ivsize, 0);
+- }
+- memcpy(rctx->biv, areq->iv, ivsize);
+- rctx->p_iv = dma_map_single(ss->dev, rctx->biv, rctx->ivlen,
+- DMA_TO_DEVICE);
+- if (dma_mapping_error(ss->dev, rctx->p_iv)) {
+- dev_err(ss->dev, "Cannot DMA MAP IV\n");
+- err = -ENOMEM;
+- goto theend_iv;
+- }
+ }
+ if (areq->src == areq->dst) {
+ nr_sgs = dma_map_sg(ss->dev, areq->src, sg_nents(areq->src),
+@@ -243,21 +284,19 @@ theend_sgs:
+ }
+
+ theend_iv:
+- if (rctx->p_iv)
+- dma_unmap_single(ss->dev, rctx->p_iv, rctx->ivlen,
+- DMA_TO_DEVICE);
+-
+ if (areq->iv && ivsize > 0) {
+- if (rctx->biv) {
+- offset = areq->cryptlen - ivsize;
+- if (rctx->op_dir & SS_DECRYPTION) {
+- memcpy(areq->iv, backup_iv, ivsize);
+- kfree_sensitive(backup_iv);
+- } else {
+- scatterwalk_map_and_copy(areq->iv, areq->dst, offset,
+- ivsize, 0);
+- }
+- kfree(rctx->biv);
++ for (i = 0; i < rctx->niv; i++) {
++ dma_unmap_single(ss->dev, rctx->p_iv[i], ivsize, DMA_TO_DEVICE);
++ memzero_explicit(sf->iv[i], ivsize);
++ }
++
++ offset = areq->cryptlen - ivsize;
++ if (rctx->op_dir & SS_DECRYPTION) {
++ memcpy(areq->iv, sf->biv, ivsize);
++ memzero_explicit(sf->biv, ivsize);
++ } else {
++ scatterwalk_map_and_copy(areq->iv, areq->dst, offset,
++ ivsize, 0);
+ }
+ }
+
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
+index 319fe3279a716..6575305786436 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
+@@ -66,6 +66,7 @@ int sun8i_ss_run_task(struct sun8i_ss_dev *ss, struct sun8i_cipher_req_ctx *rctx
+ const char *name)
+ {
+ int flow = rctx->flow;
++ unsigned int ivlen = rctx->ivlen;
+ u32 v = SS_START;
+ int i;
+
+@@ -104,15 +105,14 @@ int sun8i_ss_run_task(struct sun8i_ss_dev *ss, struct sun8i_cipher_req_ctx *rctx
+ mutex_lock(&ss->mlock);
+ writel(rctx->p_key, ss->base + SS_KEY_ADR_REG);
+
+- if (i == 0) {
+- if (rctx->p_iv)
+- writel(rctx->p_iv, ss->base + SS_IV_ADR_REG);
+- } else {
+- if (rctx->biv) {
+- if (rctx->op_dir == SS_ENCRYPTION)
+- writel(rctx->t_dst[i - 1].addr + rctx->t_dst[i - 1].len * 4 - rctx->ivlen, ss->base + SS_IV_ADR_REG);
++ if (ivlen) {
++ if (rctx->op_dir == SS_ENCRYPTION) {
++ if (i == 0)
++ writel(rctx->p_iv[0], ss->base + SS_IV_ADR_REG);
+ else
+- writel(rctx->t_src[i - 1].addr + rctx->t_src[i - 1].len * 4 - rctx->ivlen, ss->base + SS_IV_ADR_REG);
++ writel(rctx->t_dst[i - 1].addr + rctx->t_dst[i - 1].len * 4 - ivlen, ss->base + SS_IV_ADR_REG);
++ } else {
++ writel(rctx->p_iv[i], ss->base + SS_IV_ADR_REG);
+ }
+ }
+
+@@ -464,7 +464,7 @@ static void sun8i_ss_free_flows(struct sun8i_ss_dev *ss, int i)
+ */
+ static int allocate_flows(struct sun8i_ss_dev *ss)
+ {
+- int i, err;
++ int i, j, err;
+
+ ss->flows = devm_kcalloc(ss->dev, MAXFLOW, sizeof(struct sun8i_ss_flow),
+ GFP_KERNEL);
+@@ -474,6 +474,18 @@ static int allocate_flows(struct sun8i_ss_dev *ss)
+ for (i = 0; i < MAXFLOW; i++) {
+ init_completion(&ss->flows[i].complete);
+
++ ss->flows[i].biv = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
++ GFP_KERNEL | GFP_DMA);
++ if (!ss->flows[i].biv)
++ goto error_engine;
++
++ for (j = 0; j < MAX_SG; j++) {
++ ss->flows[i].iv[j] = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
++ GFP_KERNEL | GFP_DMA);
++ if (!ss->flows[i].iv[j])
++ goto error_engine;
++ }
++
+ ss->flows[i].engine = crypto_engine_alloc_init(ss->dev, true);
+ if (!ss->flows[i].engine) {
+ dev_err(ss->dev, "Cannot allocate engine\n");
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
+index 1a71ed49d2333..ca4f280af35d2 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
+@@ -380,13 +380,21 @@ int sun8i_ss_hash_run(struct crypto_engine *engine, void *breq)
+ }
+
+ len = areq->nbytes;
+- for_each_sg(areq->src, sg, nr_sgs, i) {
++ sg = areq->src;
++ i = 0;
++ while (len > 0 && sg) {
++ if (sg_dma_len(sg) == 0) {
++ sg = sg_next(sg);
++ continue;
++ }
+ rctx->t_src[i].addr = sg_dma_address(sg);
+ todo = min(len, sg_dma_len(sg));
+ rctx->t_src[i].len = todo / 4;
+ len -= todo;
+ rctx->t_dst[i].addr = addr_res;
+ rctx->t_dst[i].len = digestsize / 4;
++ sg = sg_next(sg);
++ i++;
+ }
+ if (len > 0) {
+ dev_err(ss->dev, "remaining len %d\n", len);
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
+index 28188685b9100..57ada86538550 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
+@@ -121,11 +121,15 @@ struct sginfo {
+ * @complete: completion for the current task on this flow
+ * @status: set to 1 by interrupt if task is done
+ * @stat_req: number of request done by this flow
++ * @iv: list of IV to use for each step
++ * @biv: buffer which contain the backuped IV
+ */
+ struct sun8i_ss_flow {
+ struct crypto_engine *engine;
+ struct completion complete;
+ int status;
++ u8 *iv[MAX_SG];
++ u8 *biv;
+ #ifdef CONFIG_CRYPTO_DEV_SUN8I_SS_DEBUG
+ unsigned long stat_req;
+ #endif
+@@ -164,28 +168,28 @@ struct sun8i_ss_dev {
+ * @t_src: list of mapped SGs with their size
+ * @t_dst: list of mapped SGs with their size
+ * @p_key: DMA address of the key
+- * @p_iv: DMA address of the IV
++ * @p_iv: DMA address of the IVs
++ * @niv: Number of IVs DMA mapped
+ * @method: current algorithm for this request
+ * @op_mode: op_mode for this request
+ * @op_dir: direction (encrypt vs decrypt) for this request
+ * @flow: the flow to use for this request
+- * @ivlen: size of biv
++ * @ivlen: size of IVs
+ * @keylen: keylen for this request
+- * @biv: buffer which contain the IV
+ * @fallback_req: request struct for invoking the fallback skcipher TFM
+ */
+ struct sun8i_cipher_req_ctx {
+ struct sginfo t_src[MAX_SG];
+ struct sginfo t_dst[MAX_SG];
+ u32 p_key;
+- u32 p_iv;
++ u32 p_iv[MAX_SG];
++ int niv;
+ u32 method;
+ u32 op_mode;
+ u32 op_dir;
+ int flow;
+ unsigned int ivlen;
+ unsigned int keylen;
+- void *biv;
+ struct skcipher_request fallback_req; // keep at the end
+ };
+
+diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
+index 6ab93dfd478a9..3aefb177715e9 100644
+--- a/drivers/crypto/ccp/sev-dev.c
++++ b/drivers/crypto/ccp/sev-dev.c
+@@ -23,6 +23,7 @@
+ #include <linux/gfp.h>
+ #include <linux/cpufeature.h>
+ #include <linux/fs.h>
++#include <linux/fs_struct.h>
+
+ #include <asm/smp.h>
+
+@@ -170,6 +171,31 @@ static void *sev_fw_alloc(unsigned long len)
+ return page_address(page);
+ }
+
++static struct file *open_file_as_root(const char *filename, int flags, umode_t mode)
++{
++ struct file *fp;
++ struct path root;
++ struct cred *cred;
++ const struct cred *old_cred;
++
++ task_lock(&init_task);
++ get_fs_root(init_task.fs, &root);
++ task_unlock(&init_task);
++
++ cred = prepare_creds();
++ if (!cred)
++ return ERR_PTR(-ENOMEM);
++ cred->fsuid = GLOBAL_ROOT_UID;
++ old_cred = override_creds(cred);
++
++ fp = file_open_root(&root, filename, flags, mode);
++ path_put(&root);
++
++ revert_creds(old_cred);
++
++ return fp;
++}
++
+ static int sev_read_init_ex_file(void)
+ {
+ struct sev_device *sev = psp_master->sev_data;
+@@ -181,7 +207,7 @@ static int sev_read_init_ex_file(void)
+ if (!sev_init_ex_buffer)
+ return -EOPNOTSUPP;
+
+- fp = filp_open(init_ex_path, O_RDONLY, 0);
++ fp = open_file_as_root(init_ex_path, O_RDONLY, 0);
+ if (IS_ERR(fp)) {
+ int ret = PTR_ERR(fp);
+
+@@ -217,7 +243,7 @@ static void sev_write_init_ex_file(void)
+ if (!sev_init_ex_buffer)
+ return;
+
+- fp = filp_open(init_ex_path, O_CREAT | O_WRONLY, 0600);
++ fp = open_file_as_root(init_ex_path, O_CREAT | O_WRONLY, 0600);
+ if (IS_ERR(fp)) {
+ dev_err(sev->dev,
+ "SEV: could not open file for write, error %ld\n",
+diff --git a/drivers/crypto/ccree/cc_buffer_mgr.c b/drivers/crypto/ccree/cc_buffer_mgr.c
+index 11e0278c8631d..6140e49273226 100644
+--- a/drivers/crypto/ccree/cc_buffer_mgr.c
++++ b/drivers/crypto/ccree/cc_buffer_mgr.c
+@@ -356,12 +356,14 @@ void cc_unmap_cipher_request(struct device *dev, void *ctx,
+ req_ctx->mlli_params.mlli_dma_addr);
+ }
+
+- dma_unmap_sg(dev, src, req_ctx->in_nents, DMA_BIDIRECTIONAL);
+- dev_dbg(dev, "Unmapped req->src=%pK\n", sg_virt(src));
+-
+ if (src != dst) {
+- dma_unmap_sg(dev, dst, req_ctx->out_nents, DMA_BIDIRECTIONAL);
++ dma_unmap_sg(dev, src, req_ctx->in_nents, DMA_TO_DEVICE);
++ dma_unmap_sg(dev, dst, req_ctx->out_nents, DMA_FROM_DEVICE);
+ dev_dbg(dev, "Unmapped req->dst=%pK\n", sg_virt(dst));
++ dev_dbg(dev, "Unmapped req->src=%pK\n", sg_virt(src));
++ } else {
++ dma_unmap_sg(dev, src, req_ctx->in_nents, DMA_BIDIRECTIONAL);
++ dev_dbg(dev, "Unmapped req->src=%pK\n", sg_virt(src));
+ }
+ }
+
+@@ -377,6 +379,7 @@ int cc_map_cipher_request(struct cc_drvdata *drvdata, void *ctx,
+ u32 dummy = 0;
+ int rc = 0;
+ u32 mapped_nents = 0;
++ int src_direction = (src != dst ? DMA_TO_DEVICE : DMA_BIDIRECTIONAL);
+
+ req_ctx->dma_buf_type = CC_DMA_BUF_DLLI;
+ mlli_params->curr_pool = NULL;
+@@ -399,7 +402,7 @@ int cc_map_cipher_request(struct cc_drvdata *drvdata, void *ctx,
+ }
+
+ /* Map the src SGL */
+- rc = cc_map_sg(dev, src, nbytes, DMA_BIDIRECTIONAL, &req_ctx->in_nents,
++ rc = cc_map_sg(dev, src, nbytes, src_direction, &req_ctx->in_nents,
+ LLI_MAX_NUM_OF_DATA_ENTRIES, &dummy, &mapped_nents);
+ if (rc)
+ goto cipher_exit;
+@@ -416,7 +419,7 @@ int cc_map_cipher_request(struct cc_drvdata *drvdata, void *ctx,
+ }
+ } else {
+ /* Map the dst sg */
+- rc = cc_map_sg(dev, dst, nbytes, DMA_BIDIRECTIONAL,
++ rc = cc_map_sg(dev, dst, nbytes, DMA_FROM_DEVICE,
+ &req_ctx->out_nents, LLI_MAX_NUM_OF_DATA_ENTRIES,
+ &dummy, &mapped_nents);
+ if (rc)
+@@ -456,6 +459,7 @@ void cc_unmap_aead_request(struct device *dev, struct aead_request *req)
+ struct aead_req_ctx *areq_ctx = aead_request_ctx(req);
+ unsigned int hw_iv_size = areq_ctx->hw_iv_size;
+ struct cc_drvdata *drvdata = dev_get_drvdata(dev);
++ int src_direction = (req->src != req->dst ? DMA_TO_DEVICE : DMA_BIDIRECTIONAL);
+
+ if (areq_ctx->mac_buf_dma_addr) {
+ dma_unmap_single(dev, areq_ctx->mac_buf_dma_addr,
+@@ -514,13 +518,11 @@ void cc_unmap_aead_request(struct device *dev, struct aead_request *req)
+ sg_virt(req->src), areq_ctx->src.nents, areq_ctx->assoc.nents,
+ areq_ctx->assoclen, req->cryptlen);
+
+- dma_unmap_sg(dev, req->src, areq_ctx->src.mapped_nents,
+- DMA_BIDIRECTIONAL);
++ dma_unmap_sg(dev, req->src, areq_ctx->src.mapped_nents, src_direction);
+ if (req->src != req->dst) {
+ dev_dbg(dev, "Unmapping dst sgl: req->dst=%pK\n",
+ sg_virt(req->dst));
+- dma_unmap_sg(dev, req->dst, areq_ctx->dst.mapped_nents,
+- DMA_BIDIRECTIONAL);
++ dma_unmap_sg(dev, req->dst, areq_ctx->dst.mapped_nents, DMA_FROM_DEVICE);
+ }
+ if (drvdata->coherent &&
+ areq_ctx->gen_ctx.op_type == DRV_CRYPTO_DIRECTION_DECRYPT &&
+@@ -843,7 +845,7 @@ static int cc_aead_chain_data(struct cc_drvdata *drvdata,
+ else
+ size_for_map -= authsize;
+
+- rc = cc_map_sg(dev, req->dst, size_for_map, DMA_BIDIRECTIONAL,
++ rc = cc_map_sg(dev, req->dst, size_for_map, DMA_FROM_DEVICE,
+ &areq_ctx->dst.mapped_nents,
+ LLI_MAX_NUM_OF_DATA_ENTRIES, &dst_last_bytes,
+ &dst_mapped_nents);
+@@ -1056,7 +1058,8 @@ int cc_map_aead_request(struct cc_drvdata *drvdata, struct aead_request *req)
+ size_to_map += authsize;
+ }
+
+- rc = cc_map_sg(dev, req->src, size_to_map, DMA_BIDIRECTIONAL,
++ rc = cc_map_sg(dev, req->src, size_to_map,
++ (req->src != req->dst ? DMA_TO_DEVICE : DMA_BIDIRECTIONAL),
+ &areq_ctx->src.mapped_nents,
+ (LLI_MAX_NUM_OF_ASSOC_DATA_ENTRIES +
+ LLI_MAX_NUM_OF_DATA_ENTRIES),
+diff --git a/drivers/crypto/marvell/cesa/cipher.c b/drivers/crypto/marvell/cesa/cipher.c
+index b739d3b873dcf..c6f2fa753b7c0 100644
+--- a/drivers/crypto/marvell/cesa/cipher.c
++++ b/drivers/crypto/marvell/cesa/cipher.c
+@@ -624,7 +624,6 @@ struct skcipher_alg mv_cesa_ecb_des3_ede_alg = {
+ .decrypt = mv_cesa_ecb_des3_ede_decrypt,
+ .min_keysize = DES3_EDE_KEY_SIZE,
+ .max_keysize = DES3_EDE_KEY_SIZE,
+- .ivsize = DES3_EDE_BLOCK_SIZE,
+ .base = {
+ .cra_name = "ecb(des3_ede)",
+ .cra_driver_name = "mv-ecb-des3-ede",
+diff --git a/drivers/crypto/nx/nx-common-powernv.c b/drivers/crypto/nx/nx-common-powernv.c
+index 32a036ada5d0a..f418817c0f43e 100644
+--- a/drivers/crypto/nx/nx-common-powernv.c
++++ b/drivers/crypto/nx/nx-common-powernv.c
+@@ -827,7 +827,7 @@ static int __init vas_cfg_coproc_info(struct device_node *dn, int chip_id,
+ goto err_out;
+
+ vas_init_rx_win_attr(&rxattr, coproc->ct);
+- rxattr.rx_fifo = (void *)rx_fifo;
++ rxattr.rx_fifo = rx_fifo;
+ rxattr.rx_fifo_size = fifo_size;
+ rxattr.lnotify_lpid = lpid;
+ rxattr.lnotify_pid = pid;
+diff --git a/drivers/crypto/qat/qat_common/adf_pfvf_pf_proto.c b/drivers/crypto/qat/qat_common/adf_pfvf_pf_proto.c
+index 588352de1ef0e..d17318d3f63a4 100644
+--- a/drivers/crypto/qat/qat_common/adf_pfvf_pf_proto.c
++++ b/drivers/crypto/qat/qat_common/adf_pfvf_pf_proto.c
+@@ -154,7 +154,7 @@ static struct pfvf_message handle_blkmsg_req(struct adf_accel_vf_info *vf_info,
+ if (FIELD_GET(ADF_VF2PF_BLOCK_CRC_REQ_MASK, req.data)) {
+ dev_dbg(&GET_DEV(vf_info->accel_dev),
+ "BlockMsg of type %d for CRC over %d bytes received from VF%d\n",
+- blk_type, blk_byte, vf_info->vf_nr);
++ blk_type, blk_byte + 1, vf_info->vf_nr);
+
+ if (!adf_pf2vf_blkmsg_get_data(vf_info, blk_type, blk_byte,
+ byte_max, &resp_data,
+diff --git a/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c b/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c
+index 1e7bed8b011fe..91095ad479dc3 100644
+--- a/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c
++++ b/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c
+@@ -60,17 +60,24 @@ static u32 get_accel_cap(struct adf_accel_dev *accel_dev)
+
+ capabilities = ICP_ACCEL_CAPABILITIES_CRYPTO_SYMMETRIC |
+ ICP_ACCEL_CAPABILITIES_CRYPTO_ASYMMETRIC |
+- ICP_ACCEL_CAPABILITIES_AUTHENTICATION;
++ ICP_ACCEL_CAPABILITIES_AUTHENTICATION |
++ ICP_ACCEL_CAPABILITIES_CIPHER |
++ ICP_ACCEL_CAPABILITIES_COMPRESSION;
+
+ /* Read accelerator capabilities mask */
+ pci_read_config_dword(pdev, ADF_DEVICE_LEGFUSE_OFFSET, &legfuses);
+
+- if (legfuses & ICP_ACCEL_MASK_CIPHER_SLICE)
++ /* A set bit in legfuses means the feature is OFF in this SKU */
++ if (legfuses & ICP_ACCEL_MASK_CIPHER_SLICE) {
+ capabilities &= ~ICP_ACCEL_CAPABILITIES_CRYPTO_SYMMETRIC;
++ capabilities &= ~ICP_ACCEL_CAPABILITIES_CIPHER;
++ }
+ if (legfuses & ICP_ACCEL_MASK_PKE_SLICE)
+ capabilities &= ~ICP_ACCEL_CAPABILITIES_CRYPTO_ASYMMETRIC;
+- if (legfuses & ICP_ACCEL_MASK_AUTH_SLICE)
++ if (legfuses & ICP_ACCEL_MASK_AUTH_SLICE) {
+ capabilities &= ~ICP_ACCEL_CAPABILITIES_AUTHENTICATION;
++ capabilities &= ~ICP_ACCEL_CAPABILITIES_CIPHER;
++ }
+ if (legfuses & ICP_ACCEL_MASK_COMPRESS_SLICE)
+ capabilities &= ~ICP_ACCEL_CAPABILITIES_COMPRESSION;
+
+diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
+index 49a4b1c47299f..44e899f06094c 100644
+--- a/drivers/cxl/mem.c
++++ b/drivers/cxl/mem.c
+@@ -27,12 +27,8 @@
+ static int wait_for_media(struct cxl_memdev *cxlmd)
+ {
+ struct cxl_dev_state *cxlds = cxlmd->cxlds;
+- struct cxl_endpoint_dvsec_info *info = &cxlds->info;
+ int rc;
+
+- if (!info->mem_enabled)
+- return -EBUSY;
+-
+ rc = cxlds->wait_media_ready(cxlds);
+ if (rc)
+ return rc;
+diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
+index 3f2182d668292..bb92853c3b933 100644
+--- a/drivers/cxl/pci.c
++++ b/drivers/cxl/pci.c
+@@ -462,16 +462,24 @@ static int wait_for_media_ready(struct cxl_dev_state *cxlds)
+ return 0;
+ }
+
+-static int cxl_dvsec_ranges(struct cxl_dev_state *cxlds)
++/*
++ * Return positive number of non-zero ranges on success and a negative
++ * error code on failure. The cxl_mem driver depends on ranges == 0 to
++ * init HDM operation.
++ */
++static int __cxl_dvsec_ranges(struct cxl_dev_state *cxlds,
++ struct cxl_endpoint_dvsec_info *info)
+ {
+- struct cxl_endpoint_dvsec_info *info = &cxlds->info;
+ struct pci_dev *pdev = to_pci_dev(cxlds->dev);
++ int hdm_count, rc, i, ranges = 0;
++ struct device *dev = &pdev->dev;
+ int d = cxlds->cxl_dvsec;
+- int hdm_count, rc, i;
+ u16 cap, ctrl;
+
+- if (!d)
++ if (!d) {
++ dev_dbg(dev, "No DVSEC Capability\n");
+ return -ENXIO;
++ }
+
+ rc = pci_read_config_word(pdev, d + CXL_DVSEC_CAP_OFFSET, &cap);
+ if (rc)
+@@ -481,8 +489,10 @@ static int cxl_dvsec_ranges(struct cxl_dev_state *cxlds)
+ if (rc)
+ return rc;
+
+- if (!(cap & CXL_DVSEC_MEM_CAPABLE))
++ if (!(cap & CXL_DVSEC_MEM_CAPABLE)) {
++ dev_dbg(dev, "Not MEM Capable\n");
+ return -ENXIO;
++ }
+
+ /*
+ * It is not allowed by spec for MEM.capable to be set and have 0 legacy
+@@ -495,8 +505,10 @@ static int cxl_dvsec_ranges(struct cxl_dev_state *cxlds)
+ return -EINVAL;
+
+ rc = wait_for_valid(cxlds);
+- if (rc)
++ if (rc) {
++ dev_dbg(dev, "Failure awaiting MEM_INFO_VALID (%d)\n", rc);
+ return rc;
++ }
+
+ info->mem_enabled = FIELD_GET(CXL_DVSEC_MEM_ENABLE, ctrl);
+
+@@ -538,10 +550,17 @@ static int cxl_dvsec_ranges(struct cxl_dev_state *cxlds)
+ };
+
+ if (size)
+- info->ranges++;
++ ranges++;
+ }
+
+- return 0;
++ return ranges;
++}
++
++static void cxl_dvsec_ranges(struct cxl_dev_state *cxlds)
++{
++ struct cxl_endpoint_dvsec_info *info = &cxlds->info;
++
++ info->ranges = __cxl_dvsec_ranges(cxlds, info);
+ }
+
+ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+@@ -610,10 +629,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ if (rc)
+ return rc;
+
+- rc = cxl_dvsec_ranges(cxlds);
+- if (rc)
+- dev_warn(&pdev->dev,
+- "Failed to get DVSEC range information (%d)\n", rc);
++ cxl_dvsec_ranges(cxlds);
+
+ cxlmd = devm_cxl_add_memdev(cxlds);
+ if (IS_ERR(cxlmd))
+diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
+index 293857ebfd75d..538e8dc74f40a 100644
+--- a/drivers/devfreq/rk3399_dmc.c
++++ b/drivers/devfreq/rk3399_dmc.c
+@@ -477,6 +477,8 @@ static int rk3399_dmcfreq_remove(struct platform_device *pdev)
+ {
+ struct rk3399_dmcfreq *dmcfreq = dev_get_drvdata(&pdev->dev);
+
++ devfreq_event_disable_edev(dmcfreq->edev);
++
+ /*
+ * Before remove the opp table we need to unregister the opp notifier.
+ */
+diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
+index b9b2b4a4124ee..033df43db0cec 100644
+--- a/drivers/dma/idxd/cdev.c
++++ b/drivers/dma/idxd/cdev.c
+@@ -369,10 +369,16 @@ int idxd_cdev_register(void)
+ rc = alloc_chrdev_region(&ictx[i].devt, 0, MINORMASK,
+ ictx[i].name);
+ if (rc)
+- return rc;
++ goto err_free_chrdev_region;
+ }
+
+ return 0;
++
++err_free_chrdev_region:
++ for (i--; i >= 0; i--)
++ unregister_chrdev_region(ictx[i].devt, MINORMASK);
++
++ return rc;
+ }
+
+ void idxd_cdev_remove(void)
+diff --git a/drivers/dma/stm32-mdma.c b/drivers/dma/stm32-mdma.c
+index 6f57ff0e7b37b..f8c8b9d76aadc 100644
+--- a/drivers/dma/stm32-mdma.c
++++ b/drivers/dma/stm32-mdma.c
+@@ -34,7 +34,6 @@
+ #include "virt-dma.h"
+
+ #define STM32_MDMA_GISR0 0x0000 /* MDMA Int Status Reg 1 */
+-#define STM32_MDMA_GISR1 0x0004 /* MDMA Int Status Reg 2 */
+
+ /* MDMA Channel x interrupt/status register */
+ #define STM32_MDMA_CISR(x) (0x40 + 0x40 * (x)) /* x = 0..62 */
+@@ -168,7 +167,7 @@
+
+ #define STM32_MDMA_MAX_BUF_LEN 128
+ #define STM32_MDMA_MAX_BLOCK_LEN 65536
+-#define STM32_MDMA_MAX_CHANNELS 63
++#define STM32_MDMA_MAX_CHANNELS 32
+ #define STM32_MDMA_MAX_REQUESTS 256
+ #define STM32_MDMA_MAX_BURST 128
+ #define STM32_MDMA_VERY_HIGH_PRIORITY 0x3
+@@ -1317,26 +1316,16 @@ static void stm32_mdma_xfer_end(struct stm32_mdma_chan *chan)
+ static irqreturn_t stm32_mdma_irq_handler(int irq, void *devid)
+ {
+ struct stm32_mdma_device *dmadev = devid;
+- struct stm32_mdma_chan *chan = devid;
++ struct stm32_mdma_chan *chan;
+ u32 reg, id, ccr, ien, status;
+
+ /* Find out which channel generates the interrupt */
+ status = readl_relaxed(dmadev->base + STM32_MDMA_GISR0);
+- if (status) {
+- id = __ffs(status);
+- } else {
+- status = readl_relaxed(dmadev->base + STM32_MDMA_GISR1);
+- if (!status) {
+- dev_dbg(mdma2dev(dmadev), "spurious it\n");
+- return IRQ_NONE;
+- }
+- id = __ffs(status);
+- /*
+- * As GISR0 provides status for channel id from 0 to 31,
+- * so GISR1 provides status for channel id from 32 to 62
+- */
+- id += 32;
++ if (!status) {
++ dev_dbg(mdma2dev(dmadev), "spurious it\n");
++ return IRQ_NONE;
+ }
++ id = __ffs(status);
+
+ chan = &dmadev->chan[id];
+ if (!chan) {
+diff --git a/drivers/dma/ti/k3-psil-am62.c b/drivers/dma/ti/k3-psil-am62.c
+index d431e20332378..2b6fd6e37c610 100644
+--- a/drivers/dma/ti/k3-psil-am62.c
++++ b/drivers/dma/ti/k3-psil-am62.c
+@@ -70,10 +70,10 @@
+ /* PSI-L source thread IDs, used for RX (DMA_DEV_TO_MEM) */
+ static struct psil_ep am62_src_ep_map[] = {
+ /* SAUL */
+- PSIL_SAUL(0x7500, 20, 35, 8, 35, 0),
+- PSIL_SAUL(0x7501, 21, 35, 8, 36, 0),
+- PSIL_SAUL(0x7502, 22, 43, 8, 43, 0),
+- PSIL_SAUL(0x7503, 23, 43, 8, 44, 0),
++ PSIL_SAUL(0x7504, 20, 35, 8, 35, 0),
++ PSIL_SAUL(0x7505, 21, 35, 8, 36, 0),
++ PSIL_SAUL(0x7506, 22, 43, 8, 43, 0),
++ PSIL_SAUL(0x7507, 23, 43, 8, 44, 0),
+ /* PDMA_MAIN0 - SPI0-3 */
+ PSIL_PDMA_XY_PKT(0x4302),
+ PSIL_PDMA_XY_PKT(0x4303),
+diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c
+index b8a7d9594afd4..1fa5ca57e9ec1 100644
+--- a/drivers/edac/dmc520_edac.c
++++ b/drivers/edac/dmc520_edac.c
+@@ -489,7 +489,7 @@ static int dmc520_edac_probe(struct platform_device *pdev)
+ dev = &pdev->dev;
+
+ for (idx = 0; idx < NUMBER_OF_IRQS; idx++) {
+- irq = platform_get_irq_byname(pdev, dmc520_irq_configs[idx].name);
++ irq = platform_get_irq_byname_optional(pdev, dmc520_irq_configs[idx].name);
+ irqs[idx] = irq;
+ masks[idx] = dmc520_irq_configs[idx].mask;
+ if (irq >= 0) {
+diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c
+index 14f900047ac0c..44300dbcc643d 100644
+--- a/drivers/firmware/arm_ffa/driver.c
++++ b/drivers/firmware/arm_ffa/driver.c
+@@ -582,7 +582,7 @@ static int ffa_partition_info_get(const char *uuid_str,
+ return -ENODEV;
+ }
+
+- count = ffa_partition_probe(&uuid_null, &pbuf);
++ count = ffa_partition_probe(&uuid, &pbuf);
+ if (count <= 0)
+ return -ENOENT;
+
+@@ -688,8 +688,6 @@ static void ffa_setup_partitions(void)
+ __func__, tpbuf->id);
+ continue;
+ }
+-
+- ffa_dev_set_drvdata(ffa_dev, drv_info);
+ }
+ kfree(pbuf);
+ }
+diff --git a/drivers/firmware/arm_scmi/base.c b/drivers/firmware/arm_scmi/base.c
+index f5219334fd3a5..3fe172c03c247 100644
+--- a/drivers/firmware/arm_scmi/base.c
++++ b/drivers/firmware/arm_scmi/base.c
+@@ -197,7 +197,7 @@ scmi_base_implementation_list_get(const struct scmi_protocol_handle *ph,
+ break;
+
+ loop_num_ret = le32_to_cpu(*num_ret);
+- if (tot_num_ret + loop_num_ret > MAX_PROTOCOLS_IMP) {
++ if (loop_num_ret > MAX_PROTOCOLS_IMP - tot_num_ret) {
+ dev_err(dev, "No. of Protocol > MAX_PROTOCOLS_IMP");
+ break;
+ }
+diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
+index 2c3dac5ecb36d..243882f5e5f99 100644
+--- a/drivers/firmware/efi/Kconfig
++++ b/drivers/firmware/efi/Kconfig
+@@ -284,3 +284,18 @@ config EFI_CUSTOM_SSDT_OVERLAYS
+
+ See Documentation/admin-guide/acpi/ssdt-overlays.rst for more
+ information.
++
++config EFI_DISABLE_RUNTIME
++ bool "Disable EFI runtime services support by default"
++ default y if PREEMPT_RT
++ help
++ Allow to disable the EFI runtime services support by default. This can
++ already be achieved by using the efi=noruntime option, but it could be
++ useful to have this default without any kernel command line parameter.
++
++ The EFI runtime services are disabled by default when PREEMPT_RT is
++ enabled, because measurements have shown that some EFI functions calls
++ might take too much time to complete, causing large latencies which is
++ an issue for Real-Time kernels.
++
++ This default can be overridden by using the efi=runtime option.
+diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
+index 5502e176d51be..ff57db8f8d059 100644
+--- a/drivers/firmware/efi/efi.c
++++ b/drivers/firmware/efi/efi.c
+@@ -66,7 +66,7 @@ struct mm_struct efi_mm = {
+
+ struct workqueue_struct *efi_rts_wq;
+
+-static bool disable_runtime = IS_ENABLED(CONFIG_PREEMPT_RT);
++static bool disable_runtime = IS_ENABLED(CONFIG_EFI_DISABLE_RUNTIME);
+ static int __init setup_noefi(char *arg)
+ {
+ disable_runtime = true;
+diff --git a/drivers/gpio/gpio-rockchip.c b/drivers/gpio/gpio-rockchip.c
+index 099e358d24915..bcf5214e35866 100644
+--- a/drivers/gpio/gpio-rockchip.c
++++ b/drivers/gpio/gpio-rockchip.c
+@@ -19,6 +19,7 @@
+ #include <linux/of_address.h>
+ #include <linux/of_device.h>
+ #include <linux/of_irq.h>
++#include <linux/pinctrl/pinconf-generic.h>
+ #include <linux/regmap.h>
+
+ #include "../pinctrl/core.h"
+@@ -706,7 +707,7 @@ static int rockchip_gpio_probe(struct platform_device *pdev)
+ struct device_node *pctlnp = of_get_parent(np);
+ struct pinctrl_dev *pctldev = NULL;
+ struct rockchip_pin_bank *bank = NULL;
+- struct rockchip_pin_output_deferred *cfg;
++ struct rockchip_pin_deferred *cfg;
+ static int gpio;
+ int id, ret;
+
+@@ -747,15 +748,22 @@ static int rockchip_gpio_probe(struct platform_device *pdev)
+ return ret;
+ }
+
+- while (!list_empty(&bank->deferred_output)) {
+- cfg = list_first_entry(&bank->deferred_output,
+- struct rockchip_pin_output_deferred, head);
++ while (!list_empty(&bank->deferred_pins)) {
++ cfg = list_first_entry(&bank->deferred_pins,
++ struct rockchip_pin_deferred, head);
+ list_del(&cfg->head);
+
+- ret = rockchip_gpio_direction_output(&bank->gpio_chip, cfg->pin, cfg->arg);
+- if (ret)
+- dev_warn(dev, "setting output pin %u to %u failed\n", cfg->pin, cfg->arg);
+-
++ switch (cfg->param) {
++ case PIN_CONFIG_OUTPUT:
++ ret = rockchip_gpio_direction_output(&bank->gpio_chip, cfg->pin, cfg->arg);
++ if (ret)
++ dev_warn(dev, "setting output pin %u to %u failed\n", cfg->pin,
++ cfg->arg);
++ break;
++ default:
++ dev_warn(dev, "unknown deferred config param %d\n", cfg->param);
++ break;
++ }
+ kfree(cfg);
+ }
+
+diff --git a/drivers/gpio/gpio-sim.c b/drivers/gpio/gpio-sim.c
+index 41c31b10ae848..98109839102fb 100644
+--- a/drivers/gpio/gpio-sim.c
++++ b/drivers/gpio/gpio-sim.c
+@@ -314,8 +314,8 @@ static int gpio_sim_setup_sysfs(struct gpio_sim_chip *chip)
+
+ for (i = 0; i < num_lines; i++) {
+ attr_group = devm_kzalloc(dev, sizeof(*attr_group), GFP_KERNEL);
+- attrs = devm_kcalloc(dev, sizeof(*attrs),
+- GPIO_SIM_NUM_ATTRS, GFP_KERNEL);
++ attrs = devm_kcalloc(dev, GPIO_SIM_NUM_ATTRS, sizeof(*attrs),
++ GFP_KERNEL);
+ val_attr = devm_kzalloc(dev, sizeof(*val_attr), GFP_KERNEL);
+ pull_attr = devm_kzalloc(dev, sizeof(*pull_attr), GFP_KERNEL);
+ if (!attr_group || !attrs || !val_attr || !pull_attr)
+diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
+index 7e5e51d49d09e..6dec81b1f24be 100644
+--- a/drivers/gpio/gpiolib-of.c
++++ b/drivers/gpio/gpiolib-of.c
+@@ -931,6 +931,11 @@ static int of_gpiochip_add_pin_range(struct gpio_chip *chip)
+ if (!np)
+ return 0;
+
++ if (!of_property_read_bool(np, "gpio-ranges") &&
++ chip->of_gpio_ranges_fallback) {
++ return chip->of_gpio_ranges_fallback(chip, np);
++ }
++
+ group_names = of_find_property(np, group_names_propname, NULL);
+
+ for (;; index++) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+index d0d0ea565e3df..2019622191b59 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+@@ -116,7 +116,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
+ int ret;
+
+ if (cs->in.num_chunks == 0)
+- return 0;
++ return -EINVAL;
+
+ chunk_array = kvmalloc_array(cs->in.num_chunks, sizeof(uint64_t), GFP_KERNEL);
+ if (!chunk_array)
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+index 46ef57b07c151..7e6f8475819dc 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+@@ -1931,6 +1931,7 @@ static const struct pci_device_id pciidlist[] = {
+ {0x1002, 0x7421, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
+ {0x1002, 0x7422, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
+ {0x1002, 0x7423, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
++ {0x1002, 0x7424, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
+ {0x1002, 0x743F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
+
+ { PCI_DEVICE(0x1002, PCI_ANY_ID),
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+index a66a0881a934b..88b852b3a2cb6 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+@@ -25,6 +25,9 @@
+ */
+
+ #include <linux/io-64-nonatomic-lo-hi.h>
++#ifdef CONFIG_X86
++#include <asm/hypervisor.h>
++#endif
+
+ #include "amdgpu.h"
+ #include "amdgpu_gmc.h"
+@@ -647,12 +650,14 @@ void amdgpu_gmc_get_vbios_allocations(struct amdgpu_device *adev)
+ case CHIP_VEGA10:
+ adev->mman.keep_stolen_vga_memory = true;
+ /*
+- * VEGA10 SRIOV VF needs some firmware reserved area.
++ * VEGA10 SRIOV VF with MS_HYPERV host needs some firmware reserved area.
+ */
+- if (amdgpu_sriov_vf(adev)) {
+- adev->mman.stolen_reserved_offset = 0x100000;
+- adev->mman.stolen_reserved_size = 0x600000;
++#ifdef CONFIG_X86
++ if (amdgpu_sriov_vf(adev) && hypervisor_is_type(X86_HYPER_MS_HYPERV)) {
++ adev->mman.stolen_reserved_offset = 0x500000;
++ adev->mman.stolen_reserved_size = 0x200000;
+ }
++#endif
+ break;
+ case CHIP_RAVEN:
+ case CHIP_RENOIR:
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+index a6acec1a6155d..21aa556a6befa 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+@@ -357,7 +357,39 @@ static int psp_sw_init(void *handle)
+ }
+ }
+
++ ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
++ amdgpu_sriov_vf(adev) ?
++ AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
++ &psp->fw_pri_bo,
++ &psp->fw_pri_mc_addr,
++ &psp->fw_pri_buf);
++ if (ret)
++ return ret;
++
++ ret = amdgpu_bo_create_kernel(adev, PSP_FENCE_BUFFER_SIZE, PAGE_SIZE,
++ AMDGPU_GEM_DOMAIN_VRAM,
++ &psp->fence_buf_bo,
++ &psp->fence_buf_mc_addr,
++ &psp->fence_buf);
++ if (ret)
++ goto failed1;
++
++ ret = amdgpu_bo_create_kernel(adev, PSP_CMD_BUFFER_SIZE, PAGE_SIZE,
++ AMDGPU_GEM_DOMAIN_VRAM,
++ &psp->cmd_buf_bo, &psp->cmd_buf_mc_addr,
++ (void **)&psp->cmd_buf_mem);
++ if (ret)
++ goto failed2;
++
+ return 0;
++
++failed2:
++ amdgpu_bo_free_kernel(&psp->fw_pri_bo,
++ &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
++failed1:
++ amdgpu_bo_free_kernel(&psp->fence_buf_bo,
++ &psp->fence_buf_mc_addr, &psp->fence_buf);
++ return ret;
+ }
+
+ static int psp_sw_fini(void *handle)
+@@ -391,6 +423,13 @@ static int psp_sw_fini(void *handle)
+ kfree(cmd);
+ cmd = NULL;
+
++ amdgpu_bo_free_kernel(&psp->fw_pri_bo,
++ &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
++ amdgpu_bo_free_kernel(&psp->fence_buf_bo,
++ &psp->fence_buf_mc_addr, &psp->fence_buf);
++ amdgpu_bo_free_kernel(&psp->cmd_buf_bo, &psp->cmd_buf_mc_addr,
++ (void **)&psp->cmd_buf_mem);
++
+ return 0;
+ }
+
+@@ -2430,51 +2469,18 @@ static int psp_load_fw(struct amdgpu_device *adev)
+ struct psp_context *psp = &adev->psp;
+
+ if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev)) {
+- psp_ring_stop(psp, PSP_RING_TYPE__KM); /* should not destroy ring, only stop */
+- goto skip_memalloc;
+- }
+-
+- if (amdgpu_sriov_vf(adev)) {
+- ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
+- AMDGPU_GEM_DOMAIN_VRAM,
+- &psp->fw_pri_bo,
+- &psp->fw_pri_mc_addr,
+- &psp->fw_pri_buf);
++ /* should not destroy ring, only stop */
++ psp_ring_stop(psp, PSP_RING_TYPE__KM);
+ } else {
+- ret = amdgpu_bo_create_kernel(adev, PSP_1_MEG, PSP_1_MEG,
+- AMDGPU_GEM_DOMAIN_GTT,
+- &psp->fw_pri_bo,
+- &psp->fw_pri_mc_addr,
+- &psp->fw_pri_buf);
+- }
+-
+- if (ret)
+- goto failed;
+-
+- ret = amdgpu_bo_create_kernel(adev, PSP_FENCE_BUFFER_SIZE, PAGE_SIZE,
+- AMDGPU_GEM_DOMAIN_VRAM,
+- &psp->fence_buf_bo,
+- &psp->fence_buf_mc_addr,
+- &psp->fence_buf);
+- if (ret)
+- goto failed;
+-
+- ret = amdgpu_bo_create_kernel(adev, PSP_CMD_BUFFER_SIZE, PAGE_SIZE,
+- AMDGPU_GEM_DOMAIN_VRAM,
+- &psp->cmd_buf_bo, &psp->cmd_buf_mc_addr,
+- (void **)&psp->cmd_buf_mem);
+- if (ret)
+- goto failed;
++ memset(psp->fence_buf, 0, PSP_FENCE_BUFFER_SIZE);
+
+- memset(psp->fence_buf, 0, PSP_FENCE_BUFFER_SIZE);
+-
+- ret = psp_ring_init(psp, PSP_RING_TYPE__KM);
+- if (ret) {
+- DRM_ERROR("PSP ring init failed!\n");
+- goto failed;
++ ret = psp_ring_init(psp, PSP_RING_TYPE__KM);
++ if (ret) {
++ DRM_ERROR("PSP ring init failed!\n");
++ goto failed;
++ }
+ }
+
+-skip_memalloc:
+ ret = psp_hw_start(psp);
+ if (ret)
+ goto failed;
+@@ -2592,13 +2598,6 @@ static int psp_hw_fini(void *handle)
+ psp_tmr_terminate(psp);
+ psp_ring_destroy(psp, PSP_RING_TYPE__KM);
+
+- amdgpu_bo_free_kernel(&psp->fw_pri_bo,
+- &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
+- amdgpu_bo_free_kernel(&psp->fence_buf_bo,
+- &psp->fence_buf_mc_addr, &psp->fence_buf);
+- amdgpu_bo_free_kernel(&psp->cmd_buf_bo, &psp->cmd_buf_mc_addr,
+- (void **)&psp->cmd_buf_mem);
+-
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+index ca33505026189..aebafbc327fb2 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+@@ -714,8 +714,7 @@ int amdgpu_ucode_create_bo(struct amdgpu_device *adev)
+
+ void amdgpu_ucode_free_bo(struct amdgpu_device *adev)
+ {
+- if (adev->firmware.load_type != AMDGPU_FW_LOAD_DIRECT)
+- amdgpu_bo_free_kernel(&adev->firmware.fw_buf,
++ amdgpu_bo_free_kernel(&adev->firmware.fw_buf,
+ &adev->firmware.fw_buf_mc,
+ &adev->firmware.fw_buf_ptr);
+ }
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+index 5e3756643da3f..1d55b2bae37e7 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+@@ -864,11 +864,11 @@ static u32 amdgpu_virt_rlcg_reg_rw(struct amdgpu_device *adev, u32 offset, u32 v
+ uint32_t timeout = 50000;
+ uint32_t i, tmp;
+ uint32_t ret = 0;
+- static void *scratch_reg0;
+- static void *scratch_reg1;
+- static void *scratch_reg2;
+- static void *scratch_reg3;
+- static void *spare_int;
++ void *scratch_reg0;
++ void *scratch_reg1;
++ void *scratch_reg2;
++ void *scratch_reg3;
++ void *spare_int;
+
+ if (!adev->gfx.rlc.rlcg_reg_access_supported) {
+ dev_err(adev->dev,
+diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+index d7e8f72323641..ff86c43b63d11 100644
+--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+@@ -772,8 +772,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
+
+ DRM_DEBUG("Using doorbell -- "
+ "wptr_offs == 0x%08x "
+- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
+- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
++ "lower_32_bits(ring->wptr << 2) == 0x%08x "
++ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
+ ring->wptr_offs,
+ lower_32_bits(ring->wptr << 2),
+ upper_32_bits(ring->wptr << 2));
+diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+index a8d49c005f73d..627eb1f147c20 100644
+--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
+@@ -394,8 +394,8 @@ static void sdma_v5_0_ring_set_wptr(struct amdgpu_ring *ring)
+ if (ring->use_doorbell) {
+ DRM_DEBUG("Using doorbell -- "
+ "wptr_offs == 0x%08x "
+- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
+- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
++ "lower_32_bits(ring->wptr << 2) == 0x%08x "
++ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
+ ring->wptr_offs,
+ lower_32_bits(ring->wptr << 2),
+ upper_32_bits(ring->wptr << 2));
+@@ -774,9 +774,9 @@ static int sdma_v5_0_gfx_resume(struct amdgpu_device *adev)
+
+ if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */
+ WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR),
+- lower_32_bits(ring->wptr) << 2);
++ lower_32_bits(ring->wptr << 2));
+ WREG32(sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI),
+- upper_32_bits(ring->wptr) << 2);
++ upper_32_bits(ring->wptr << 2));
+ }
+
+ doorbell = RREG32_SOC15_IP(GC, sdma_v5_0_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
+diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+index 824eace698842..a5eb82bfeaa8d 100644
+--- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
++++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c
+@@ -295,8 +295,8 @@ static void sdma_v5_2_ring_set_wptr(struct amdgpu_ring *ring)
+ if (ring->use_doorbell) {
+ DRM_DEBUG("Using doorbell -- "
+ "wptr_offs == 0x%08x "
+- "lower_32_bits(ring->wptr) << 2 == 0x%08x "
+- "upper_32_bits(ring->wptr) << 2 == 0x%08x\n",
++ "lower_32_bits(ring->wptr << 2) == 0x%08x "
++ "upper_32_bits(ring->wptr << 2) == 0x%08x\n",
+ ring->wptr_offs,
+ lower_32_bits(ring->wptr << 2),
+ upper_32_bits(ring->wptr << 2));
+@@ -672,8 +672,8 @@ static int sdma_v5_2_gfx_resume(struct amdgpu_device *adev)
+ WREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_MINOR_PTR_UPDATE), 1);
+
+ if (!amdgpu_sriov_vf(adev)) { /* only bare-metal use register write for wptr */
+- WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr) << 2);
+- WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr) << 2);
++ WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR), lower_32_bits(ring->wptr << 2));
++ WREG32(sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_RB_WPTR_HI), upper_32_bits(ring->wptr << 2));
+ }
+
+ doorbell = RREG32_SOC15_IP(GC, sdma_v5_2_get_reg_offset(adev, i, mmSDMA0_GFX_DOORBELL));
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+index 607f65ab39ac8..10cc834a5ac31 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+@@ -944,8 +944,6 @@ err_drm_file:
+
+ bool kfd_dev_is_large_bar(struct kfd_dev *dev)
+ {
+- struct kfd_local_mem_info mem_info;
+-
+ if (debug_largebar) {
+ pr_debug("Simulate large-bar allocation on non large-bar machine\n");
+ return true;
+@@ -954,9 +952,8 @@ bool kfd_dev_is_large_bar(struct kfd_dev *dev)
+ if (dev->use_iommu_v2)
+ return false;
+
+- amdgpu_amdkfd_get_local_mem_info(dev->adev, &mem_info);
+- if (mem_info.local_mem_size_private == 0 &&
+- mem_info.local_mem_size_public > 0)
++ if (dev->local_mem_info.local_mem_size_private == 0 &&
++ dev->local_mem_info.local_mem_size_public > 0)
+ return true;
+ return false;
+ }
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+index 1eaabd2cb41b0..59b349a4c04a3 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+@@ -2152,7 +2152,7 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
+ * report the total FB size (public+private) as a single
+ * private heap.
+ */
+- amdgpu_amdkfd_get_local_mem_info(kdev->adev, &local_mem_info);
++ local_mem_info = kdev->local_mem_info;
+ sub_type_hdr = (typeof(sub_type_hdr))((char *)sub_type_hdr +
+ sub_type_hdr->length);
+
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+index 62aa6c9d5123d..c96d521447fcf 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+@@ -575,6 +575,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
+ if (kfd_resume(kfd))
+ goto kfd_resume_error;
+
++ amdgpu_amdkfd_get_local_mem_info(kfd->adev, &kfd->local_mem_info);
++
+ if (kfd_topology_add_device(kfd)) {
+ dev_err(kfd_device, "Error adding device to topology\n");
+ goto kfd_topology_add_device_error;
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+index 8f58fc491b289..49a29a60b71e4 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+@@ -272,6 +272,7 @@ struct kfd_dev {
+
+ struct kgd2kfd_shared_resources shared_resources;
+ struct kfd_vmid_info vm_info;
++ struct kfd_local_mem_info local_mem_info;
+
+ const struct kfd2kgd_calls *kfd2kgd;
+ struct mutex doorbell_mutex;
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+index 3bdcae239bc0f..9fc24f6823dfc 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+@@ -1102,15 +1102,12 @@ static uint32_t kfd_generate_gpu_id(struct kfd_dev *gpu)
+ uint32_t buf[7];
+ uint64_t local_mem_size;
+ int i;
+- struct kfd_local_mem_info local_mem_info;
+
+ if (!gpu)
+ return 0;
+
+- amdgpu_amdkfd_get_local_mem_info(gpu->adev, &local_mem_info);
+-
+- local_mem_size = local_mem_info.local_mem_size_private +
+- local_mem_info.local_mem_size_public;
++ local_mem_size = gpu->local_mem_info.local_mem_size_private +
++ gpu->local_mem_info.local_mem_size_public;
+
+ buf[0] = gpu->pdev->devfn;
+ buf[1] = gpu->pdev->subsystem_vendor |
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+index 63934ecf6be84..d71e625cc476e 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+@@ -1030,6 +1030,7 @@ static const struct dc_debug_options debug_defaults_drv = {
+ .afmt = true,
+ }
+ },
++ .disable_z10 = true,
+ .optimize_edp_link_rate = true,
+ .enable_sw_cntl_psr = true,
+ .apply_vendor_specific_lttpr_wa = true,
+diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+index 72e7b5d40af69..5472f9936febc 100644
+--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
++++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+@@ -790,7 +790,7 @@ int amdgpu_dpm_force_performance_level(struct amdgpu_device *adev,
+ AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK |
+ AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
+
+- if (!pp_funcs->force_performance_level)
++ if (!pp_funcs || !pp_funcs->force_performance_level)
+ return 0;
+
+ if (adev->pm.dpm.thermal_active)
+diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c
+index 8b23cc9f098ad..8fd0782a2b206 100644
+--- a/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c
++++ b/drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c
+@@ -1623,19 +1623,7 @@ static int kv_update_samu_dpm(struct amdgpu_device *adev, bool gate)
+
+ static u8 kv_get_acp_boot_level(struct amdgpu_device *adev)
+ {
+- u8 i;
+- struct amdgpu_clock_voltage_dependency_table *table =
+- &adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table;
+-
+- for (i = 0; i < table->count; i++) {
+- if (table->entries[i].clk >= 0) /* XXX */
+- break;
+- }
+-
+- if (i >= table->count)
+- i = table->count - 1;
+-
+- return i;
++ return 0;
+ }
+
+ static void kv_update_acp_boot_level(struct amdgpu_device *adev)
+diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+index 633dab14f51c2..49c398ec0aaf6 100644
+--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
++++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+@@ -7297,17 +7297,15 @@ static int si_parse_power_table(struct amdgpu_device *adev)
+ if (!adev->pm.dpm.ps)
+ return -ENOMEM;
+ power_state_offset = (u8 *)state_array->states;
+- for (i = 0; i < state_array->ucNumEntries; i++) {
++ for (adev->pm.dpm.num_ps = 0, i = 0; i < state_array->ucNumEntries; i++) {
+ u8 *idx;
+ power_state = (union pplib_power_state *)power_state_offset;
+ non_clock_array_index = power_state->v2.nonClockInfoIndex;
+ non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
+ &non_clock_info_array->nonClockInfo[non_clock_array_index];
+ ps = kzalloc(sizeof(struct si_ps), GFP_KERNEL);
+- if (ps == NULL) {
+- kfree(adev->pm.dpm.ps);
++ if (ps == NULL)
+ return -ENOMEM;
+- }
+ adev->pm.dpm.ps[i].ps_priv = ps;
+ si_parse_pplib_non_clock_info(adev, &adev->pm.dpm.ps[i],
+ non_clock_info,
+@@ -7329,8 +7327,8 @@ static int si_parse_power_table(struct amdgpu_device *adev)
+ k++;
+ }
+ power_state_offset += 2 + power_state->v2.ucNumDPMLevels;
++ adev->pm.dpm.num_ps++;
+ }
+- adev->pm.dpm.num_ps = state_array->ucNumEntries;
+
+ /* fill in the vce power states */
+ for (i = 0; i < adev->pm.dpm.num_of_vce_states; i++) {
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+index f10a0256413e6..32cff21f261ca 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+@@ -576,6 +576,8 @@ static int smu_early_init(void *handle)
+ smu->smu_baco.platform_support = false;
+ smu->user_dpm_profile.fan_mode = -1;
+
++ mutex_init(&smu->message_lock);
++
+ adev->powerplay.pp_handle = smu;
+ adev->powerplay.pp_funcs = &swsmu_pm_funcs;
+
+@@ -975,8 +977,6 @@ static int smu_sw_init(void *handle)
+ bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
+ bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
+
+- mutex_init(&smu->message_lock);
+-
+ INIT_WORK(&smu->throttling_logging_work, smu_throttling_logging_work_fn);
+ INIT_WORK(&smu->interrupt_work, smu_interrupt_work_fn);
+ atomic64_set(&smu->throttle_int_counter, 0);
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
+index fd6c44ece1688..012e3bd99cc23 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
+@@ -1119,6 +1119,39 @@ static int renoir_get_power_profile_mode(struct smu_context *smu,
+ return size;
+ }
+
++static void renoir_get_ss_power_percent(SmuMetrics_t *metrics,
++ uint32_t *apu_percent, uint32_t *dgpu_percent)
++{
++ uint32_t apu_boost = 0;
++ uint32_t dgpu_boost = 0;
++ uint16_t apu_limit = 0;
++ uint16_t dgpu_limit = 0;
++ uint16_t apu_power = 0;
++ uint16_t dgpu_power = 0;
++
++ apu_power = metrics->ApuPower;
++ apu_limit = metrics->StapmOriginalLimit;
++ if (apu_power > apu_limit && apu_limit != 0)
++ apu_boost = ((apu_power - apu_limit) * 100) / apu_limit;
++ apu_boost = (apu_boost > 100) ? 100 : apu_boost;
++
++ dgpu_power = metrics->dGpuPower;
++ if (metrics->StapmCurrentLimit > metrics->StapmOriginalLimit)
++ dgpu_limit = metrics->StapmCurrentLimit - metrics->StapmOriginalLimit;
++ if (dgpu_power > dgpu_limit && dgpu_limit != 0)
++ dgpu_boost = ((dgpu_power - dgpu_limit) * 100) / dgpu_limit;
++ dgpu_boost = (dgpu_boost > 100) ? 100 : dgpu_boost;
++
++ if (dgpu_boost >= apu_boost)
++ apu_boost = 0;
++ else
++ dgpu_boost = 0;
++
++ *apu_percent = apu_boost;
++ *dgpu_percent = dgpu_boost;
++}
++
++
+ static int renoir_get_smu_metrics_data(struct smu_context *smu,
+ MetricsMember_t member,
+ uint32_t *value)
+@@ -1127,6 +1160,9 @@ static int renoir_get_smu_metrics_data(struct smu_context *smu,
+
+ SmuMetrics_t *metrics = (SmuMetrics_t *)smu_table->metrics_table;
+ int ret = 0;
++ uint32_t apu_percent = 0;
++ uint32_t dgpu_percent = 0;
++
+
+ ret = smu_cmn_get_metrics_table(smu,
+ NULL,
+@@ -1171,26 +1207,18 @@ static int renoir_get_smu_metrics_data(struct smu_context *smu,
+ *value = metrics->Voltage[1];
+ break;
+ case METRICS_SS_APU_SHARE:
+- /* return the percentage of APU power with respect to APU's power limit.
+- * percentage is reported, this isn't boost value. Smartshift power
+- * boost/shift is only when the percentage is more than 100.
++ /* return the percentage of APU power boost
++ * with respect to APU's power limit.
+ */
+- if (metrics->StapmOriginalLimit > 0)
+- *value = (metrics->ApuPower * 100) / metrics->StapmOriginalLimit;
+- else
+- *value = 0;
++ renoir_get_ss_power_percent(metrics, &apu_percent, &dgpu_percent);
++ *value = apu_percent;
+ break;
+ case METRICS_SS_DGPU_SHARE:
+- /* return the percentage of dGPU power with respect to dGPU's power limit.
+- * percentage is reported, this isn't boost value. Smartshift power
+- * boost/shift is only when the percentage is more than 100.
++ /* return the percentage of dGPU power boost
++ * with respect to dGPU's power limit.
+ */
+- if ((metrics->dGpuPower > 0) &&
+- (metrics->StapmCurrentLimit > metrics->StapmOriginalLimit))
+- *value = (metrics->dGpuPower * 100) /
+- (metrics->StapmCurrentLimit - metrics->StapmOriginalLimit);
+- else
+- *value = 0;
++ renoir_get_ss_power_percent(metrics, &apu_percent, &dgpu_percent);
++ *value = dgpu_percent;
+ break;
+ default:
+ *value = UINT_MAX;
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
+index e2d099409123a..87257b1b028f7 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
+@@ -276,6 +276,42 @@ static int yellow_carp_mode2_reset(struct smu_context *smu)
+ return yellow_carp_mode_reset(smu, SMU_RESET_MODE_2);
+ }
+
++
++static void yellow_carp_get_ss_power_percent(SmuMetrics_t *metrics,
++ uint32_t *apu_percent, uint32_t *dgpu_percent)
++{
++ uint32_t apu_boost = 0;
++ uint32_t dgpu_boost = 0;
++ uint16_t apu_limit = 0;
++ uint16_t dgpu_limit = 0;
++ uint16_t apu_power = 0;
++ uint16_t dgpu_power = 0;
++
++ /* APU and dGPU power values are reported in milli Watts
++ * and STAPM power limits are in Watts */
++ apu_power = metrics->ApuPower/1000;
++ apu_limit = metrics->StapmOpnLimit;
++ if (apu_power > apu_limit && apu_limit != 0)
++ apu_boost = ((apu_power - apu_limit) * 100) / apu_limit;
++ apu_boost = (apu_boost > 100) ? 100 : apu_boost;
++
++ dgpu_power = metrics->dGpuPower/1000;
++ if (metrics->StapmCurrentLimit > metrics->StapmOpnLimit)
++ dgpu_limit = metrics->StapmCurrentLimit - metrics->StapmOpnLimit;
++ if (dgpu_power > dgpu_limit && dgpu_limit != 0)
++ dgpu_boost = ((dgpu_power - dgpu_limit) * 100) / dgpu_limit;
++ dgpu_boost = (dgpu_boost > 100) ? 100 : dgpu_boost;
++
++ if (dgpu_boost >= apu_boost)
++ apu_boost = 0;
++ else
++ dgpu_boost = 0;
++
++ *apu_percent = apu_boost;
++ *dgpu_percent = dgpu_boost;
++
++}
++
+ static int yellow_carp_get_smu_metrics_data(struct smu_context *smu,
+ MetricsMember_t member,
+ uint32_t *value)
+@@ -284,6 +320,8 @@ static int yellow_carp_get_smu_metrics_data(struct smu_context *smu,
+
+ SmuMetrics_t *metrics = (SmuMetrics_t *)smu_table->metrics_table;
+ int ret = 0;
++ uint32_t apu_percent = 0;
++ uint32_t dgpu_percent = 0;
+
+ ret = smu_cmn_get_metrics_table(smu, NULL, false);
+ if (ret)
+@@ -332,26 +370,18 @@ static int yellow_carp_get_smu_metrics_data(struct smu_context *smu,
+ *value = metrics->Voltage[1];
+ break;
+ case METRICS_SS_APU_SHARE:
+- /* return the percentage of APU power with respect to APU's power limit.
+- * percentage is reported, this isn't boost value. Smartshift power
+- * boost/shift is only when the percentage is more than 100.
++ /* return the percentage of APU power boost
++ * with respect to APU's power limit.
+ */
+- if (metrics->StapmOpnLimit > 0)
+- *value = (metrics->ApuPower * 100) / metrics->StapmOpnLimit;
+- else
+- *value = 0;
++ yellow_carp_get_ss_power_percent(metrics, &apu_percent, &dgpu_percent);
++ *value = apu_percent;
+ break;
+ case METRICS_SS_DGPU_SHARE:
+- /* return the percentage of dGPU power with respect to dGPU's power limit.
+- * percentage is reported, this isn't boost value. Smartshift power
+- * boost/shift is only when the percentage is more than 100.
++ /* return the percentage of dGPU power boost
++ * with respect to dGPU's power limit.
+ */
+- if ((metrics->dGpuPower > 0) &&
+- (metrics->StapmCurrentLimit > metrics->StapmOpnLimit))
+- *value = (metrics->dGpuPower * 100) /
+- (metrics->StapmCurrentLimit - metrics->StapmOpnLimit);
+- else
+- *value = 0;
++ yellow_carp_get_ss_power_percent(metrics, &apu_percent, &dgpu_percent);
++ *value = dgpu_percent;
+ break;
+ default:
+ *value = UINT_MAX;
+diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
+index d63d83800a8a3..517b94c3bcaf9 100644
+--- a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
++++ b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
+@@ -265,6 +265,10 @@ static int komeda_plane_add(struct komeda_kms_dev *kms,
+
+ formats = komeda_get_layer_fourcc_list(&mdev->fmt_tbl,
+ layer->layer_type, &n_formats);
++ if (!formats) {
++ kfree(kplane);
++ return -ENOMEM;
++ }
+
+ err = drm_universal_plane_init(&kms->base, plane,
+ get_possible_crtcs(kms, c->pipeline),
+@@ -275,8 +279,10 @@ static int komeda_plane_add(struct komeda_kms_dev *kms,
+
+ komeda_put_fourcc_list(formats);
+
+- if (err)
+- goto cleanup;
++ if (err) {
++ kfree(kplane);
++ return err;
++ }
+
+ drm_plane_helper_add(plane, &komeda_plane_helper_funcs);
+
+diff --git a/drivers/gpu/drm/arm/malidp_crtc.c b/drivers/gpu/drm/arm/malidp_crtc.c
+index 494075ddbef68..b5928b52e2791 100644
+--- a/drivers/gpu/drm/arm/malidp_crtc.c
++++ b/drivers/gpu/drm/arm/malidp_crtc.c
+@@ -487,7 +487,10 @@ static void malidp_crtc_reset(struct drm_crtc *crtc)
+ if (crtc->state)
+ malidp_crtc_destroy_state(crtc, crtc->state);
+
+- __drm_atomic_helper_crtc_reset(crtc, &state->base);
++ if (state)
++ __drm_atomic_helper_crtc_reset(crtc, &state->base);
++ else
++ __drm_atomic_helper_crtc_reset(crtc, NULL);
+ }
+
+ static int malidp_crtc_enable_vblank(struct drm_crtc *crtc)
+diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
+index 2145b08f95346..be2fc4791c1da 100644
+--- a/drivers/gpu/drm/bridge/Kconfig
++++ b/drivers/gpu/drm/bridge/Kconfig
+@@ -77,6 +77,8 @@ config DRM_DISPLAY_CONNECTOR
+ config DRM_ITE_IT6505
+ tristate "ITE IT6505 DisplayPort bridge"
+ depends on OF
++ select DRM_DP_AUX_BUS
++ select DRM_DP_HELPER
+ select DRM_KMS_HELPER
+ select DRM_DP_HELPER
+ select EXTCON
+diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+index 005bf18682ff8..668dcefbae17a 100644
+--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
++++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+@@ -1313,6 +1313,7 @@ err_unregister_audio:
+ adv7511_audio_exit(adv7511);
+ drm_bridge_remove(&adv7511->bridge);
+ err_unregister_cec:
++ cec_unregister_adapter(adv7511->cec_adap);
+ i2c_unregister_device(adv7511->i2c_cec);
+ clk_disable_unprepare(adv7511->cec_clk);
+ err_i2c_unregister_packet:
+diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+index eb590fb8e8d0d..988669505aa5e 100644
+--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
++++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+@@ -1632,8 +1632,19 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux,
+ struct drm_dp_aux_msg *msg)
+ {
+ struct analogix_dp_device *dp = to_dp(aux);
++ int ret;
++
++ pm_runtime_get_sync(dp->dev);
++
++ ret = analogix_dp_detect_hpd(dp);
++ if (ret)
++ goto out;
+
+- return analogix_dp_transfer(dp, msg);
++ ret = analogix_dp_transfer(dp, msg);
++out:
++ pm_runtime_put(dp->dev);
++
++ return ret;
+ }
+
+ struct analogix_dp_device *
+@@ -1698,8 +1709,10 @@ analogix_dp_probe(struct device *dev, struct analogix_dp_plat_data *plat_data)
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+
+ dp->reg_base = devm_ioremap_resource(&pdev->dev, res);
+- if (IS_ERR(dp->reg_base))
+- return ERR_CAST(dp->reg_base);
++ if (IS_ERR(dp->reg_base)) {
++ ret = PTR_ERR(dp->reg_base);
++ goto err_disable_clk;
++ }
+
+ dp->force_hpd = of_property_read_bool(dev->of_node, "force-hpd");
+
+@@ -1711,7 +1724,8 @@ analogix_dp_probe(struct device *dev, struct analogix_dp_plat_data *plat_data)
+ if (IS_ERR(dp->hpd_gpiod)) {
+ dev_err(dev, "error getting HDP GPIO: %ld\n",
+ PTR_ERR(dp->hpd_gpiod));
+- return ERR_CAST(dp->hpd_gpiod);
++ ret = PTR_ERR(dp->hpd_gpiod);
++ goto err_disable_clk;
+ }
+
+ if (dp->hpd_gpiod) {
+@@ -1731,7 +1745,8 @@ analogix_dp_probe(struct device *dev, struct analogix_dp_plat_data *plat_data)
+
+ if (dp->irq == -ENXIO) {
+ dev_err(&pdev->dev, "failed to get irq\n");
+- return ERR_PTR(-ENODEV);
++ ret = -ENODEV;
++ goto err_disable_clk;
+ }
+
+ ret = devm_request_threaded_irq(&pdev->dev, dp->irq,
+@@ -1740,11 +1755,15 @@ analogix_dp_probe(struct device *dev, struct analogix_dp_plat_data *plat_data)
+ irq_flags, "analogix-dp", dp);
+ if (ret) {
+ dev_err(&pdev->dev, "failed to request irq\n");
+- return ERR_PTR(ret);
++ goto err_disable_clk;
+ }
+ disable_irq(dp->irq);
+
+ return dp;
++
++err_disable_clk:
++ clk_disable_unprepare(dp->clock);
++ return ERR_PTR(ret);
+ }
+ EXPORT_SYMBOL_GPL(analogix_dp_probe);
+
+diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c
+index 31ecf5626f1d9..060849f8ad8b4 100644
+--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
++++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
+@@ -874,7 +874,10 @@ static int anx7625_hdcp_enable(struct anx7625_data *ctx)
+ }
+
+ /* Read downstream capability */
+- anx7625_aux_trans(ctx, DP_AUX_NATIVE_READ, 0x68028, 1, &bcap);
++ ret = anx7625_aux_trans(ctx, DP_AUX_NATIVE_READ, 0x68028, 1, &bcap);
++ if (ret < 0)
++ return ret;
++
+ if (!(bcap & 0x01)) {
+ pr_warn("downstream not support HDCP 1.4, cap(%x).\n", bcap);
+ return 0;
+@@ -1475,12 +1478,12 @@ static void anx7625_dp_adjust_swing(struct anx7625_data *ctx)
+ for (i = 0; i < ctx->pdata.dp_lane0_swing_reg_cnt; i++)
+ anx7625_reg_write(ctx, ctx->i2c.tx_p1_client,
+ DP_TX_LANE0_SWING_REG0 + i,
+- ctx->pdata.lane0_reg_data[i] & 0xFF);
++ ctx->pdata.lane0_reg_data[i]);
+
+ for (i = 0; i < ctx->pdata.dp_lane1_swing_reg_cnt; i++)
+ anx7625_reg_write(ctx, ctx->i2c.tx_p1_client,
+ DP_TX_LANE1_SWING_REG0 + i,
+- ctx->pdata.lane1_reg_data[i] & 0xFF);
++ ctx->pdata.lane1_reg_data[i]);
+ }
+
+ static void dp_hpd_change_handler(struct anx7625_data *ctx, bool on)
+@@ -1587,8 +1590,8 @@ static int anx7625_get_swing_setting(struct device *dev,
+ num_regs = DP_TX_SWING_REG_CNT;
+
+ pdata->dp_lane0_swing_reg_cnt = num_regs;
+- of_property_read_u32_array(dev->of_node, "analogix,lane0-swing",
+- pdata->lane0_reg_data, num_regs);
++ of_property_read_u8_array(dev->of_node, "analogix,lane0-swing",
++ pdata->lane0_reg_data, num_regs);
+ }
+
+ if (of_get_property(dev->of_node,
+@@ -1597,8 +1600,8 @@ static int anx7625_get_swing_setting(struct device *dev,
+ num_regs = DP_TX_SWING_REG_CNT;
+
+ pdata->dp_lane1_swing_reg_cnt = num_regs;
+- of_property_read_u32_array(dev->of_node, "analogix,lane1-swing",
+- pdata->lane1_reg_data, num_regs);
++ of_property_read_u8_array(dev->of_node, "analogix,lane1-swing",
++ pdata->lane1_reg_data, num_regs);
+ }
+
+ return 0;
+@@ -2654,7 +2657,7 @@ static int anx7625_i2c_probe(struct i2c_client *client,
+ if (ret) {
+ if (ret != -EPROBE_DEFER)
+ DRM_DEV_ERROR(dev, "fail to parse DT : %d\n", ret);
+- return ret;
++ goto free_wq;
+ }
+
+ if (anx7625_register_i2c_dummy_clients(platform, client) != 0) {
+@@ -2669,7 +2672,7 @@ static int anx7625_i2c_probe(struct i2c_client *client,
+ pm_suspend_ignore_children(dev, true);
+ ret = devm_add_action_or_reset(dev, anx7625_runtime_disable, dev);
+ if (ret)
+- return ret;
++ goto free_wq;
+
+ if (!platform->pdata.low_power_mode) {
+ anx7625_disable_pd_protocol(platform);
+diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.h b/drivers/gpu/drm/bridge/analogix/anx7625.h
+index edbbfe410a56e..e257a84db9626 100644
+--- a/drivers/gpu/drm/bridge/analogix/anx7625.h
++++ b/drivers/gpu/drm/bridge/analogix/anx7625.h
+@@ -426,9 +426,9 @@ struct anx7625_platform_data {
+ int mipi_lanes;
+ int audio_en;
+ int dp_lane0_swing_reg_cnt;
+- int lane0_reg_data[DP_TX_SWING_REG_CNT];
++ u8 lane0_reg_data[DP_TX_SWING_REG_CNT];
+ int dp_lane1_swing_reg_cnt;
+- int lane1_reg_data[DP_TX_SWING_REG_CNT];
++ u8 lane1_reg_data[DP_TX_SWING_REG_CNT];
+ u32 low_power_mode;
+ struct device_node *mipi_host_node;
+ };
+diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c
+index d9b7f48b99fbf..c871a90c0b8f4 100644
+--- a/drivers/gpu/drm/bridge/chipone-icn6211.c
++++ b/drivers/gpu/drm/bridge/chipone-icn6211.c
+@@ -15,8 +15,19 @@
+ #include <linux/of_device.h>
+ #include <linux/regulator/consumer.h>
+
+-#include <video/mipi_display.h>
+-
++#define VENDOR_ID 0x00
++#define DEVICE_ID_H 0x01
++#define DEVICE_ID_L 0x02
++#define VERSION_ID 0x03
++#define FIRMWARE_VERSION 0x08
++#define CONFIG_FINISH 0x09
++#define PD_CTRL(n) (0x0a + ((n) & 0x3)) /* 0..3 */
++#define RST_CTRL(n) (0x0e + ((n) & 0x1)) /* 0..1 */
++#define SYS_CTRL(n) (0x10 + ((n) & 0x7)) /* 0..4 */
++#define RGB_DRV(n) (0x18 + ((n) & 0x3)) /* 0..3 */
++#define RGB_DLY(n) (0x1c + ((n) & 0x1)) /* 0..1 */
++#define RGB_TEST_CTRL 0x1e
++#define ATE_PLL_EN 0x1f
+ #define HACTIVE_LI 0x20
+ #define VACTIVE_LI 0x21
+ #define VACTIVE_HACTIVE_HI 0x22
+@@ -24,9 +35,101 @@
+ #define HSYNC_LI 0x24
+ #define HBP_LI 0x25
+ #define HFP_HSW_HBP_HI 0x26
++#define HFP_HSW_HBP_HI_HFP(n) (((n) & 0x300) >> 4)
++#define HFP_HSW_HBP_HI_HS(n) (((n) & 0x300) >> 6)
++#define HFP_HSW_HBP_HI_HBP(n) (((n) & 0x300) >> 8)
+ #define VFP 0x27
+ #define VSYNC 0x28
+ #define VBP 0x29
++#define BIST_POL 0x2a
++#define BIST_POL_BIST_MODE(n) (((n) & 0xf) << 4)
++#define BIST_POL_BIST_GEN BIT(3)
++#define BIST_POL_HSYNC_POL BIT(2)
++#define BIST_POL_VSYNC_POL BIT(1)
++#define BIST_POL_DE_POL BIT(0)
++#define BIST_RED 0x2b
++#define BIST_GREEN 0x2c
++#define BIST_BLUE 0x2d
++#define BIST_CHESS_X 0x2e
++#define BIST_CHESS_Y 0x2f
++#define BIST_CHESS_XY_H 0x30
++#define BIST_FRAME_TIME_L 0x31
++#define BIST_FRAME_TIME_H 0x32
++#define FIFO_MAX_ADDR_LOW 0x33
++#define SYNC_EVENT_DLY 0x34
++#define HSW_MIN 0x35
++#define HFP_MIN 0x36
++#define LOGIC_RST_NUM 0x37
++#define OSC_CTRL(n) (0x48 + ((n) & 0x7)) /* 0..5 */
++#define BG_CTRL 0x4e
++#define LDO_PLL 0x4f
++#define PLL_CTRL(n) (0x50 + ((n) & 0xf)) /* 0..15 */
++#define PLL_CTRL_6_EXTERNAL 0x90
++#define PLL_CTRL_6_MIPI_CLK 0x92
++#define PLL_CTRL_6_INTERNAL 0x93
++#define PLL_REM(n) (0x60 + ((n) & 0x3)) /* 0..2 */
++#define PLL_DIV(n) (0x63 + ((n) & 0x3)) /* 0..2 */
++#define PLL_FRAC(n) (0x66 + ((n) & 0x3)) /* 0..2 */
++#define PLL_INT(n) (0x69 + ((n) & 0x1)) /* 0..1 */
++#define PLL_REF_DIV 0x6b
++#define PLL_REF_DIV_P(n) ((n) & 0xf)
++#define PLL_REF_DIV_Pe BIT(4)
++#define PLL_REF_DIV_S(n) (((n) & 0x7) << 5)
++#define PLL_SSC_P(n) (0x6c + ((n) & 0x3)) /* 0..2 */
++#define PLL_SSC_STEP(n) (0x6f + ((n) & 0x3)) /* 0..2 */
++#define PLL_SSC_OFFSET(n) (0x72 + ((n) & 0x3)) /* 0..3 */
++#define GPIO_OEN 0x79
++#define MIPI_CFG_PW 0x7a
++#define MIPI_CFG_PW_CONFIG_DSI 0xc1
++#define MIPI_CFG_PW_CONFIG_I2C 0x3e
++#define GPIO_SEL(n) (0x7b + ((n) & 0x1)) /* 0..1 */
++#define IRQ_SEL 0x7d
++#define DBG_SEL 0x7e
++#define DBG_SIGNAL 0x7f
++#define MIPI_ERR_VECTOR_L 0x80
++#define MIPI_ERR_VECTOR_H 0x81
++#define MIPI_ERR_VECTOR_EN_L 0x82
++#define MIPI_ERR_VECTOR_EN_H 0x83
++#define MIPI_MAX_SIZE_L 0x84
++#define MIPI_MAX_SIZE_H 0x85
++#define DSI_CTRL 0x86
++#define DSI_CTRL_UNKNOWN 0x28
++#define DSI_CTRL_DSI_LANES(n) ((n) & 0x3)
++#define MIPI_PN_SWAP 0x87
++#define MIPI_PN_SWAP_CLK BIT(4)
++#define MIPI_PN_SWAP_D(n) BIT((n) & 0x3)
++#define MIPI_SOT_SYNC_BIT_(n) (0x88 + ((n) & 0x1)) /* 0..1 */
++#define MIPI_ULPS_CTRL 0x8a
++#define MIPI_CLK_CHK_VAR 0x8e
++#define MIPI_CLK_CHK_INI 0x8f
++#define MIPI_T_TERM_EN 0x90
++#define MIPI_T_HS_SETTLE 0x91
++#define MIPI_T_TA_SURE_PRE 0x92
++#define MIPI_T_LPX_SET 0x94
++#define MIPI_T_CLK_MISS 0x95
++#define MIPI_INIT_TIME_L 0x96
++#define MIPI_INIT_TIME_H 0x97
++#define MIPI_T_CLK_TERM_EN 0x99
++#define MIPI_T_CLK_SETTLE 0x9a
++#define MIPI_TO_HS_RX_L 0x9e
++#define MIPI_TO_HS_RX_H 0x9f
++#define MIPI_PHY_(n) (0xa0 + ((n) & 0x7)) /* 0..5 */
++#define MIPI_PD_RX 0xb0
++#define MIPI_PD_TERM 0xb1
++#define MIPI_PD_HSRX 0xb2
++#define MIPI_PD_LPTX 0xb3
++#define MIPI_PD_LPRX 0xb4
++#define MIPI_PD_CK_LANE 0xb5
++#define MIPI_FORCE_0 0xb6
++#define MIPI_RST_CTRL 0xb7
++#define MIPI_RST_NUM 0xb8
++#define MIPI_DBG_SET_(n) (0xc0 + ((n) & 0xf)) /* 0..9 */
++#define MIPI_DBG_SEL 0xe0
++#define MIPI_DBG_DATA 0xe1
++#define MIPI_ATE_TEST_SEL 0xe2
++#define MIPI_ATE_STATUS_(n) (0xe3 + ((n) & 0x1)) /* 0..1 */
++#define MIPI_ATE_STATUS_1 0xe4
++#define ICN6211_MAX_REGISTER MIPI_ATE_STATUS(1)
+
+ struct chipone {
+ struct device *dev;
+@@ -63,14 +166,15 @@ static void chipone_atomic_enable(struct drm_bridge *bridge,
+ {
+ struct chipone *icn = bridge_to_chipone(bridge);
+ struct drm_display_mode *mode = &icn->mode;
++ u16 hfp, hbp, hsync;
+
+- ICN6211_DSI(icn, 0x7a, 0xc1);
++ ICN6211_DSI(icn, MIPI_CFG_PW, MIPI_CFG_PW_CONFIG_DSI);
+
+ ICN6211_DSI(icn, HACTIVE_LI, mode->hdisplay & 0xff);
+
+ ICN6211_DSI(icn, VACTIVE_LI, mode->vdisplay & 0xff);
+
+- /**
++ /*
+ * lsb nibble: 2nd nibble of hdisplay
+ * msb nibble: 2nd nibble of vdisplay
+ */
+@@ -78,13 +182,18 @@ static void chipone_atomic_enable(struct drm_bridge *bridge,
+ ((mode->hdisplay >> 8) & 0xf) |
+ (((mode->vdisplay >> 8) & 0xf) << 4));
+
+- ICN6211_DSI(icn, HFP_LI, mode->hsync_start - mode->hdisplay);
+-
+- ICN6211_DSI(icn, HSYNC_LI, mode->hsync_end - mode->hsync_start);
+-
+- ICN6211_DSI(icn, HBP_LI, mode->htotal - mode->hsync_end);
++ hfp = mode->hsync_start - mode->hdisplay;
++ hsync = mode->hsync_end - mode->hsync_start;
++ hbp = mode->htotal - mode->hsync_end;
+
+- ICN6211_DSI(icn, HFP_HSW_HBP_HI, 0x00);
++ ICN6211_DSI(icn, HFP_LI, hfp & 0xff);
++ ICN6211_DSI(icn, HSYNC_LI, hsync & 0xff);
++ ICN6211_DSI(icn, HBP_LI, hbp & 0xff);
++ /* Top two bits of Horizontal Front porch/Sync/Back porch */
++ ICN6211_DSI(icn, HFP_HSW_HBP_HI,
++ HFP_HSW_HBP_HI_HFP(hfp) |
++ HFP_HSW_HBP_HI_HS(hsync) |
++ HFP_HSW_HBP_HI_HBP(hbp));
+
+ ICN6211_DSI(icn, VFP, mode->vsync_start - mode->vdisplay);
+
+@@ -93,21 +202,21 @@ static void chipone_atomic_enable(struct drm_bridge *bridge,
+ ICN6211_DSI(icn, VBP, mode->vtotal - mode->vsync_end);
+
+ /* dsi specific sequence */
+- ICN6211_DSI(icn, MIPI_DCS_SET_TEAR_OFF, 0x80);
+- ICN6211_DSI(icn, MIPI_DCS_SET_ADDRESS_MODE, 0x28);
+- ICN6211_DSI(icn, 0xb5, 0xa0);
+- ICN6211_DSI(icn, 0x5c, 0xff);
+- ICN6211_DSI(icn, MIPI_DCS_SET_COLUMN_ADDRESS, 0x01);
+- ICN6211_DSI(icn, MIPI_DCS_GET_POWER_SAVE, 0x92);
+- ICN6211_DSI(icn, 0x6b, 0x71);
+- ICN6211_DSI(icn, 0x69, 0x2b);
+- ICN6211_DSI(icn, MIPI_DCS_ENTER_SLEEP_MODE, 0x40);
+- ICN6211_DSI(icn, MIPI_DCS_EXIT_SLEEP_MODE, 0x98);
++ ICN6211_DSI(icn, SYNC_EVENT_DLY, 0x80);
++ ICN6211_DSI(icn, HFP_MIN, hfp & 0xff);
++ ICN6211_DSI(icn, MIPI_PD_CK_LANE, 0xa0);
++ ICN6211_DSI(icn, PLL_CTRL(12), 0xff);
++ ICN6211_DSI(icn, BIST_POL, BIST_POL_DE_POL);
++ ICN6211_DSI(icn, PLL_CTRL(6), PLL_CTRL_6_MIPI_CLK);
++ ICN6211_DSI(icn, PLL_REF_DIV, 0x71);
++ ICN6211_DSI(icn, PLL_INT(0), 0x2b);
++ ICN6211_DSI(icn, SYS_CTRL(0), 0x40);
++ ICN6211_DSI(icn, SYS_CTRL(1), 0x98);
+
+ /* icn6211 specific sequence */
+- ICN6211_DSI(icn, 0xb6, 0x20);
+- ICN6211_DSI(icn, 0x51, 0x20);
+- ICN6211_DSI(icn, 0x09, 0x10);
++ ICN6211_DSI(icn, MIPI_FORCE_0, 0x20);
++ ICN6211_DSI(icn, PLL_CTRL(1), 0x20);
++ ICN6211_DSI(icn, CONFIG_FINISH, 0x10);
+
+ usleep_range(10000, 11000);
+ }
+diff --git a/drivers/gpu/drm/bridge/ite-it6505.c b/drivers/gpu/drm/bridge/ite-it6505.c
+index f2f101220ade9..c546646771724 100644
+--- a/drivers/gpu/drm/bridge/ite-it6505.c
++++ b/drivers/gpu/drm/bridge/ite-it6505.c
+@@ -737,8 +737,9 @@ static int it6505_drm_dp_link_probe(struct drm_dp_aux *aux,
+ return 0;
+ }
+
+-static int it6505_drm_dp_link_power_up(struct drm_dp_aux *aux,
+- struct it6505_drm_dp_link *link)
++static int it6505_drm_dp_link_set_power(struct drm_dp_aux *aux,
++ struct it6505_drm_dp_link *link,
++ u8 mode)
+ {
+ u8 value;
+ int err;
+@@ -752,18 +753,20 @@ static int it6505_drm_dp_link_power_up(struct drm_dp_aux *aux,
+ return err;
+
+ value &= ~DP_SET_POWER_MASK;
+- value |= DP_SET_POWER_D0;
++ value |= mode;
+
+ err = drm_dp_dpcd_writeb(aux, DP_SET_POWER, value);
+ if (err < 0)
+ return err;
+
+- /*
+- * According to the DP 1.1 specification, a "Sink Device must exit the
+- * power saving state within 1 ms" (Section 2.5.3.1, Table 5-52, "Sink
+- * Control Field" (register 0x600).
+- */
+- usleep_range(1000, 2000);
++ if (mode == DP_SET_POWER_D0) {
++ /*
++ * According to the DP 1.1 specification, a "Sink Device must
++ * exit the power saving state within 1 ms" (Section 2.5.3.1,
++ * Table 5-52, "Sink Control Field" (register 0x600).
++ */
++ usleep_range(1000, 2000);
++ }
+
+ return 0;
+ }
+@@ -2624,7 +2627,8 @@ static enum drm_connector_status it6505_detect(struct it6505 *it6505)
+ if (it6505_get_sink_hpd_status(it6505)) {
+ it6505_aux_on(it6505);
+ it6505_drm_dp_link_probe(&it6505->aux, &it6505->link);
+- it6505_drm_dp_link_power_up(&it6505->aux, &it6505->link);
++ it6505_drm_dp_link_set_power(&it6505->aux, &it6505->link,
++ DP_SET_POWER_D0);
+ it6505->auto_train_retry = AUTO_TRAIN_RETRY;
+
+ if (it6505->dpcd[0] == 0) {
+@@ -2960,8 +2964,11 @@ static void it6505_bridge_atomic_disable(struct drm_bridge *bridge,
+
+ DRM_DEV_DEBUG_DRIVER(dev, "start");
+
+- if (it6505->powered)
++ if (it6505->powered) {
+ it6505_video_disable(it6505);
++ it6505_drm_dp_link_set_power(&it6505->aux, &it6505->link,
++ DP_SET_POWER_D3);
++ }
+ }
+
+ static enum drm_connector_status
+diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c
+index 69288cf894b99..e81c106e2c2bb 100644
+--- a/drivers/gpu/drm/bridge/ite-it66121.c
++++ b/drivers/gpu/drm/bridge/ite-it66121.c
+@@ -227,7 +227,7 @@ static const struct regmap_range_cfg it66121_regmap_banks[] = {
+ .selector_mask = 0x1,
+ .selector_shift = 0,
+ .window_start = 0x00,
+- .window_len = 0x130,
++ .window_len = 0x100,
+ },
+ };
+
+diff --git a/drivers/gpu/drm/drm_bridge_connector.c b/drivers/gpu/drm/drm_bridge_connector.c
+index 60923cdfe8e1c..6b3dad03d77d0 100644
+--- a/drivers/gpu/drm/drm_bridge_connector.c
++++ b/drivers/gpu/drm/drm_bridge_connector.c
+@@ -384,8 +384,10 @@ struct drm_connector *drm_bridge_connector_init(struct drm_device *drm,
+ connector_type, ddc);
+ drm_connector_helper_add(connector, &drm_bridge_connector_helper_funcs);
+
+- if (bridge_connector->bridge_hpd)
++ if (bridge_connector->bridge_hpd) {
+ connector->polled = DRM_CONNECTOR_POLL_HPD;
++ drm_bridge_connector_enable_hpd(connector);
++ }
+ else if (bridge_connector->bridge_detect)
+ connector->polled = DRM_CONNECTOR_POLL_CONNECT
+ | DRM_CONNECTOR_POLL_DISCONNECT;
+diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
+index cc7bd58369dfe..c5b86414873e2 100644
+--- a/drivers/gpu/drm/drm_edid.c
++++ b/drivers/gpu/drm/drm_edid.c
+@@ -2031,9 +2031,6 @@ struct edid *drm_do_get_edid(struct drm_connector *connector,
+
+ connector_bad_edid(connector, edid, edid[0x7e] + 1);
+
+- edid[EDID_LENGTH-1] += edid[0x7e] - valid_extensions;
+- edid[0x7e] = valid_extensions;
+-
+ new = kmalloc_array(valid_extensions + 1, EDID_LENGTH,
+ GFP_KERNEL);
+ if (!new)
+@@ -2050,6 +2047,9 @@ struct edid *drm_do_get_edid(struct drm_connector *connector,
+ base += EDID_LENGTH;
+ }
+
++ new[EDID_LENGTH - 1] += new[0x7e] - valid_extensions;
++ new[0x7e] = valid_extensions;
++
+ kfree(edid);
+ edid = new;
+ }
+diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c
+index bc0f49773868a..e085f855a1990 100644
+--- a/drivers/gpu/drm/drm_format_helper.c
++++ b/drivers/gpu/drm/drm_format_helper.c
+@@ -594,35 +594,24 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for
+ }
+ EXPORT_SYMBOL(drm_fb_blit_toio);
+
+-static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, unsigned int pixels,
+- unsigned int start_offset, unsigned int end_len)
+-{
+- unsigned int xb, i;
+-
+- for (xb = 0; xb < pixels; xb++) {
+- unsigned int start = 0, end = 8;
+- u8 byte = 0x00;
+-
+- if (xb == 0 && start_offset)
+- start = start_offset;
+
+- if (xb == pixels - 1 && end_len)
+- end = end_len;
+-
+- for (i = start; i < end; i++) {
+- unsigned int x = xb * 8 + i;
++static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int pixels)
++{
++ while (pixels) {
++ unsigned int i, bits = min(pixels, 8U);
++ u8 byte = 0;
+
+- byte >>= 1;
+- if (src[x] >> 7)
+- byte |= BIT(7);
++ for (i = 0; i < bits; i++, pixels--) {
++ if (*src++ >= 128)
++ byte |= BIT(i);
+ }
+ *dst++ = byte;
+ }
+ }
+
+ /**
+- * drm_fb_xrgb8888_to_mono_reversed - Convert XRGB8888 to reversed monochrome
+- * @dst: reversed monochrome destination buffer
++ * drm_fb_xrgb8888_to_mono - Convert XRGB8888 to monochrome
++ * @dst: monochrome destination buffer (0=black, 1=white)
+ * @dst_pitch: Number of bytes between two consecutive scanlines within dst
+ * @src: XRGB8888 source buffer
+ * @fb: DRM framebuffer
+@@ -633,17 +622,23 @@ static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, unsigned
+ * and use this function to convert to the native format.
+ *
+ * This function uses drm_fb_xrgb8888_to_gray8() to convert to grayscale and
+- * then the result is converted from grayscale to reversed monohrome.
++ * then the result is converted from grayscale to monochrome.
++ *
++ * The first pixel (upper left corner of the clip rectangle) will be converted
++ * and copied to the first bit (LSB) in the first byte of the monochrome
++ * destination buffer.
++ * If the caller requires that the first pixel in a byte must be located at an
++ * x-coordinate that is a multiple of 8, then the caller must take care itself
++ * of supplying a suitable clip rectangle.
+ */
+-void drm_fb_xrgb8888_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *vaddr,
+- const struct drm_framebuffer *fb, const struct drm_rect *clip)
++void drm_fb_xrgb8888_to_mono(void *dst, unsigned int dst_pitch, const void *vaddr,
++ const struct drm_framebuffer *fb, const struct drm_rect *clip)
+ {
+ unsigned int linepixels = drm_rect_width(clip);
+- unsigned int lines = clip->y2 - clip->y1;
++ unsigned int lines = drm_rect_height(clip);
+ unsigned int cpp = fb->format->cpp[0];
+ unsigned int len_src32 = linepixels * cpp;
+ struct drm_device *dev = fb->dev;
+- unsigned int start_offset, end_len;
+ unsigned int y;
+ u8 *mono = dst, *gray8;
+ u32 *src32;
+@@ -652,21 +647,18 @@ void drm_fb_xrgb8888_to_mono_reversed(void *dst, unsigned int dst_pitch, const v
+ return;
+
+ /*
+- * The reversed mono destination buffer contains 1 bit per pixel
+- * and destination scanlines have to be in multiple of 8 pixels.
++ * The mono destination buffer contains 1 bit per pixel
+ */
+ if (!dst_pitch)
+ dst_pitch = DIV_ROUND_UP(linepixels, 8);
+
+- drm_WARN_ONCE(dev, dst_pitch % 8 != 0, "dst_pitch is not a multiple of 8\n");
+-
+ /*
+ * The cma memory is write-combined so reads are uncached.
+ * Speed up by fetching one line at a time.
+ *
+- * Also, format conversion from XR24 to reversed monochrome
+- * are done line-by-line but are converted to 8-bit grayscale
+- * as an intermediate step.
++ * Also, format conversion from XR24 to monochrome are done
++ * line-by-line but are converted to 8-bit grayscale as an
++ * intermediate step.
+ *
+ * Allocate a buffer to be used for both copying from the cma
+ * memory and to store the intermediate grayscale line pixels.
+@@ -677,27 +669,15 @@ void drm_fb_xrgb8888_to_mono_reversed(void *dst, unsigned int dst_pitch, const v
+
+ gray8 = (u8 *)src32 + len_src32;
+
+- /*
+- * For damage handling, it is possible that only parts of the source
+- * buffer is copied and this could lead to start and end pixels that
+- * are not aligned to multiple of 8.
+- *
+- * Calculate if the start and end pixels are not aligned and set the
+- * offsets for the reversed mono line conversion function to adjust.
+- */
+- start_offset = clip->x1 % 8;
+- end_len = clip->x2 % 8;
+-
+ vaddr += clip_offset(clip, fb->pitches[0], cpp);
+ for (y = 0; y < lines; y++) {
+ src32 = memcpy(src32, vaddr, len_src32);
+ drm_fb_xrgb8888_to_gray8_line(gray8, src32, linepixels);
+- drm_fb_gray8_to_mono_reversed_line(mono, gray8, dst_pitch,
+- start_offset, end_len);
++ drm_fb_gray8_to_mono_line(mono, gray8, linepixels);
+ vaddr += fb->pitches[0];
+ mono += dst_pitch;
+ }
+
+ kfree(src32);
+ }
+-EXPORT_SYMBOL(drm_fb_xrgb8888_to_mono_reversed);
++EXPORT_SYMBOL(drm_fb_xrgb8888_to_mono);
+diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
+index bf0daa8d9bbd9..726f2f163c269 100644
+--- a/drivers/gpu/drm/drm_plane.c
++++ b/drivers/gpu/drm/drm_plane.c
+@@ -247,6 +247,13 @@ static int __drm_universal_plane_init(struct drm_device *dev,
+ if (WARN_ON(config->num_total_plane >= 32))
+ return -EINVAL;
+
++ /*
++ * First driver to need more than 64 formats needs to fix this. Each
++ * format is encoded as a bit and the current code only supports a u64.
++ */
++ if (WARN_ON(format_count > 64))
++ return -EINVAL;
++
+ WARN_ON(drm_drv_uses_atomic_modeset(dev) &&
+ (!funcs->atomic_destroy_state ||
+ !funcs->atomic_duplicate_state));
+@@ -268,13 +275,6 @@ static int __drm_universal_plane_init(struct drm_device *dev,
+ return -ENOMEM;
+ }
+
+- /*
+- * First driver to need more than 64 formats needs to fix this. Each
+- * format is encoded as a bit and the current code only supports a u64.
+- */
+- if (WARN_ON(format_count > 64))
+- return -EINVAL;
+-
+ if (format_modifiers) {
+ const uint64_t *temp_modifiers = format_modifiers;
+
+diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
+index 9fb1a2aadbcb0..aabb997a74eb4 100644
+--- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
++++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
+@@ -286,6 +286,12 @@ void etnaviv_iommu_unmap_gem(struct etnaviv_iommu_context *context,
+
+ mutex_lock(&context->lock);
+
++ /* Bail if the mapping has been reaped by another thread */
++ if (!mapping->context) {
++ mutex_unlock(&context->lock);
++ return;
++ }
++
+ /* If the vram node is on the mm, unmap and remove the node */
+ if (mapping->vram_node.mm == &context->mm)
+ etnaviv_iommu_remove_mapping(context, mapping);
+diff --git a/drivers/gpu/drm/gma500/psb_intel_display.c b/drivers/gpu/drm/gma500/psb_intel_display.c
+index d5f95212934e2..42d1a733e1249 100644
+--- a/drivers/gpu/drm/gma500/psb_intel_display.c
++++ b/drivers/gpu/drm/gma500/psb_intel_display.c
+@@ -535,14 +535,15 @@ void psb_intel_crtc_init(struct drm_device *dev, int pipe,
+
+ struct drm_crtc *psb_intel_get_crtc_from_pipe(struct drm_device *dev, int pipe)
+ {
+- struct drm_crtc *crtc = NULL;
++ struct drm_crtc *crtc;
+
+ list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+ struct gma_crtc *gma_crtc = to_gma_crtc(crtc);
++
+ if (gma_crtc->pipe == pipe)
+- break;
++ return crtc;
+ }
+- return crtc;
++ return NULL;
+ }
+
+ int gma_connector_clones(struct drm_device *dev, int type_mask)
+diff --git a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
+index 6b4a27372c825..9b631dd651d90 100644
+--- a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
++++ b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
+@@ -124,9 +124,25 @@ struct i2c_adapter_lookup {
+ #define ICL_GPIO_DDPA_CTRLCLK_2 8
+ #define ICL_GPIO_DDPA_CTRLDATA_2 9
+
+-static enum port intel_dsi_seq_port_to_port(u8 port)
++static enum port intel_dsi_seq_port_to_port(struct intel_dsi *intel_dsi,
++ u8 seq_port)
+ {
+- return port ? PORT_C : PORT_A;
++ /*
++ * If single link DSI is being used on any port, the VBT sequence block
++ * send packet apparently always has 0 for the port. Just use the port
++ * we have configured, and ignore the sequence block port.
++ */
++ if (hweight8(intel_dsi->ports) == 1)
++ return ffs(intel_dsi->ports) - 1;
++
++ if (seq_port) {
++ if (intel_dsi->ports & PORT_B)
++ return PORT_B;
++ else if (intel_dsi->ports & PORT_C)
++ return PORT_C;
++ }
++
++ return PORT_A;
+ }
+
+ static const u8 *mipi_exec_send_packet(struct intel_dsi *intel_dsi,
+@@ -148,15 +164,10 @@ static const u8 *mipi_exec_send_packet(struct intel_dsi *intel_dsi,
+
+ seq_port = (flags >> MIPI_PORT_SHIFT) & 3;
+
+- /* For DSI single link on Port A & C, the seq_port value which is
+- * parsed from Sequence Block#53 of VBT has been set to 0
+- * Now, read/write of packets for the DSI single link on Port A and
+- * Port C will based on the DVO port from VBT block 2.
+- */
+- if (intel_dsi->ports == (1 << PORT_C))
+- port = PORT_C;
+- else
+- port = intel_dsi_seq_port_to_port(seq_port);
++ port = intel_dsi_seq_port_to_port(intel_dsi, seq_port);
++
++ if (drm_WARN_ON(&dev_priv->drm, !intel_dsi->dsi_hosts[port]))
++ goto out;
+
+ dsi_device = intel_dsi->dsi_hosts[port]->device;
+ if (!dsi_device) {
+diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
+index 0a9c3fcc09b1e..1577ab6754db1 100644
+--- a/drivers/gpu/drm/i915/i915_perf.c
++++ b/drivers/gpu/drm/i915/i915_perf.c
+@@ -4050,8 +4050,8 @@ addr_err:
+ return ERR_PTR(err);
+ }
+
+-static ssize_t show_dynamic_id(struct device *dev,
+- struct device_attribute *attr,
++static ssize_t show_dynamic_id(struct kobject *kobj,
++ struct kobj_attribute *attr,
+ char *buf)
+ {
+ struct i915_oa_config *oa_config =
+diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
+index 473a3c0544bb8..05cb9a335a971 100644
+--- a/drivers/gpu/drm/i915/i915_perf_types.h
++++ b/drivers/gpu/drm/i915/i915_perf_types.h
+@@ -55,7 +55,7 @@ struct i915_oa_config {
+
+ struct attribute_group sysfs_metric;
+ struct attribute *attrs[2];
+- struct device_attribute sysfs_metric_id;
++ struct kobj_attribute sysfs_metric_id;
+
+ struct kref ref;
+ struct rcu_head rcu;
+diff --git a/drivers/gpu/drm/mediatek/mtk_cec.c b/drivers/gpu/drm/mediatek/mtk_cec.c
+index e9cef5c0c8f7e..cdfa648910b23 100644
+--- a/drivers/gpu/drm/mediatek/mtk_cec.c
++++ b/drivers/gpu/drm/mediatek/mtk_cec.c
+@@ -85,7 +85,7 @@ static void mtk_cec_mask(struct mtk_cec *cec, unsigned int offset,
+ u32 tmp = readl(cec->regs + offset) & ~mask;
+
+ tmp |= val & mask;
+- writel(val, cec->regs + offset);
++ writel(tmp, cec->regs + offset);
+ }
+
+ void mtk_cec_set_hpd_event(struct device *dev,
+diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
+index 86c3068894b11..974462831133b 100644
+--- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h
++++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
+@@ -76,9 +76,11 @@ void mtk_ovl_layer_off(struct device *dev, unsigned int idx,
+ void mtk_ovl_start(struct device *dev);
+ void mtk_ovl_stop(struct device *dev);
+ unsigned int mtk_ovl_supported_rotations(struct device *dev);
+-void mtk_ovl_enable_vblank(struct device *dev,
+- void (*vblank_cb)(void *),
+- void *vblank_cb_data);
++void mtk_ovl_register_vblank_cb(struct device *dev,
++ void (*vblank_cb)(void *),
++ void *vblank_cb_data);
++void mtk_ovl_unregister_vblank_cb(struct device *dev);
++void mtk_ovl_enable_vblank(struct device *dev);
+ void mtk_ovl_disable_vblank(struct device *dev);
+
+ void mtk_rdma_bypass_shadow(struct device *dev);
+@@ -93,9 +95,11 @@ void mtk_rdma_layer_config(struct device *dev, unsigned int idx,
+ struct cmdq_pkt *cmdq_pkt);
+ void mtk_rdma_start(struct device *dev);
+ void mtk_rdma_stop(struct device *dev);
+-void mtk_rdma_enable_vblank(struct device *dev,
+- void (*vblank_cb)(void *),
+- void *vblank_cb_data);
++void mtk_rdma_register_vblank_cb(struct device *dev,
++ void (*vblank_cb)(void *),
++ void *vblank_cb_data);
++void mtk_rdma_unregister_vblank_cb(struct device *dev);
++void mtk_rdma_enable_vblank(struct device *dev);
+ void mtk_rdma_disable_vblank(struct device *dev);
+
+ #endif
+diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+index 17cd9b9322988..70ab22964f3b5 100644
+--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
++++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+@@ -97,14 +97,28 @@ static irqreturn_t mtk_disp_ovl_irq_handler(int irq, void *dev_id)
+ return IRQ_HANDLED;
+ }
+
+-void mtk_ovl_enable_vblank(struct device *dev,
+- void (*vblank_cb)(void *),
+- void *vblank_cb_data)
++void mtk_ovl_register_vblank_cb(struct device *dev,
++ void (*vblank_cb)(void *),
++ void *vblank_cb_data)
+ {
+ struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
+
+ ovl->vblank_cb = vblank_cb;
+ ovl->vblank_cb_data = vblank_cb_data;
++}
++
++void mtk_ovl_unregister_vblank_cb(struct device *dev)
++{
++ struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
++
++ ovl->vblank_cb = NULL;
++ ovl->vblank_cb_data = NULL;
++}
++
++void mtk_ovl_enable_vblank(struct device *dev)
++{
++ struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
++
+ writel(0x0, ovl->regs + DISP_REG_OVL_INTSTA);
+ writel_relaxed(OVL_FME_CPL_INT, ovl->regs + DISP_REG_OVL_INTEN);
+ }
+@@ -113,8 +127,6 @@ void mtk_ovl_disable_vblank(struct device *dev)
+ {
+ struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
+
+- ovl->vblank_cb = NULL;
+- ovl->vblank_cb_data = NULL;
+ writel_relaxed(0x0, ovl->regs + DISP_REG_OVL_INTEN);
+ }
+
+diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+index 662e91d9d45f6..1be4caf9ff963 100644
+--- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
++++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+@@ -95,24 +95,32 @@ static void rdma_update_bits(struct device *dev, unsigned int reg,
+ writel(tmp, rdma->regs + reg);
+ }
+
+-void mtk_rdma_enable_vblank(struct device *dev,
+- void (*vblank_cb)(void *),
+- void *vblank_cb_data)
++void mtk_rdma_register_vblank_cb(struct device *dev,
++ void (*vblank_cb)(void *),
++ void *vblank_cb_data)
+ {
+ struct mtk_disp_rdma *rdma = dev_get_drvdata(dev);
+
+ rdma->vblank_cb = vblank_cb;
+ rdma->vblank_cb_data = vblank_cb_data;
+- rdma_update_bits(dev, DISP_REG_RDMA_INT_ENABLE, RDMA_FRAME_END_INT,
+- RDMA_FRAME_END_INT);
+ }
+
+-void mtk_rdma_disable_vblank(struct device *dev)
++void mtk_rdma_unregister_vblank_cb(struct device *dev)
+ {
+ struct mtk_disp_rdma *rdma = dev_get_drvdata(dev);
+
+ rdma->vblank_cb = NULL;
+ rdma->vblank_cb_data = NULL;
++}
++
++void mtk_rdma_enable_vblank(struct device *dev)
++{
++ rdma_update_bits(dev, DISP_REG_RDMA_INT_ENABLE, RDMA_FRAME_END_INT,
++ RDMA_FRAME_END_INT);
++}
++
++void mtk_rdma_disable_vblank(struct device *dev)
++{
+ rdma_update_bits(dev, DISP_REG_RDMA_INT_ENABLE, RDMA_FRAME_END_INT, 0);
+ }
+
+diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c
+index 4554e2de14309..e61cd67b978ff 100644
+--- a/drivers/gpu/drm/mediatek/mtk_dpi.c
++++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
+@@ -819,8 +819,8 @@ static const struct mtk_dpi_conf mt8192_conf = {
+ .cal_factor = mt8183_calculate_factor,
+ .reg_h_fre_con = 0xe0,
+ .max_clock_khz = 150000,
+- .output_fmts = mt8173_output_fmts,
+- .num_output_fmts = ARRAY_SIZE(mt8173_output_fmts),
++ .output_fmts = mt8183_output_fmts,
++ .num_output_fmts = ARRAY_SIZE(mt8183_output_fmts),
+ };
+
+ static int mtk_dpi_probe(struct platform_device *pdev)
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+index ede435d2c1efa..f24b21eb03cdf 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
++++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+@@ -152,6 +152,7 @@ static void mtk_drm_cmdq_pkt_destroy(struct cmdq_pkt *pkt)
+ static void mtk_drm_crtc_destroy(struct drm_crtc *crtc)
+ {
+ struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
++ int i;
+
+ mtk_mutex_put(mtk_crtc->mutex);
+ #if IS_REACHABLE(CONFIG_MTK_CMDQ)
+@@ -162,6 +163,14 @@ static void mtk_drm_crtc_destroy(struct drm_crtc *crtc)
+ mtk_crtc->cmdq_client.chan = NULL;
+ }
+ #endif
++
++ for (i = 0; i < mtk_crtc->ddp_comp_nr; i++) {
++ struct mtk_ddp_comp *comp;
++
++ comp = mtk_crtc->ddp_comp[i];
++ mtk_ddp_comp_unregister_vblank_cb(comp);
++ }
++
+ drm_crtc_cleanup(crtc);
+ }
+
+@@ -617,7 +626,7 @@ static int mtk_drm_crtc_enable_vblank(struct drm_crtc *crtc)
+ struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
+ struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
+
+- mtk_ddp_comp_enable_vblank(comp, mtk_crtc_ddp_irq, &mtk_crtc->base);
++ mtk_ddp_comp_enable_vblank(comp);
+
+ return 0;
+ }
+@@ -926,6 +935,9 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev,
+ if (comp->funcs->ctm_set)
+ has_ctm = true;
+ }
++
++ mtk_ddp_comp_register_vblank_cb(comp, mtk_crtc_ddp_irq,
++ &mtk_crtc->base);
+ }
+
+ for (i = 0; i < mtk_crtc->ddp_comp_nr; i++)
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+index 2e99aee13dfe4..5d7504a72b11c 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
++++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+@@ -297,6 +297,8 @@ static const struct mtk_ddp_comp_funcs ddp_ovl = {
+ .config = mtk_ovl_config,
+ .start = mtk_ovl_start,
+ .stop = mtk_ovl_stop,
++ .register_vblank_cb = mtk_ovl_register_vblank_cb,
++ .unregister_vblank_cb = mtk_ovl_unregister_vblank_cb,
+ .enable_vblank = mtk_ovl_enable_vblank,
+ .disable_vblank = mtk_ovl_disable_vblank,
+ .supported_rotations = mtk_ovl_supported_rotations,
+@@ -321,6 +323,8 @@ static const struct mtk_ddp_comp_funcs ddp_rdma = {
+ .config = mtk_rdma_config,
+ .start = mtk_rdma_start,
+ .stop = mtk_rdma_stop,
++ .register_vblank_cb = mtk_rdma_register_vblank_cb,
++ .unregister_vblank_cb = mtk_rdma_unregister_vblank_cb,
+ .enable_vblank = mtk_rdma_enable_vblank,
+ .disable_vblank = mtk_rdma_disable_vblank,
+ .layer_nr = mtk_rdma_layer_nr,
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
+index ad267bb8fc9b5..1cbc6332282dc 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
++++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
+@@ -48,9 +48,11 @@ struct mtk_ddp_comp_funcs {
+ unsigned int bpc, struct cmdq_pkt *cmdq_pkt);
+ void (*start)(struct device *dev);
+ void (*stop)(struct device *dev);
+- void (*enable_vblank)(struct device *dev,
+- void (*vblank_cb)(void *),
+- void *vblank_cb_data);
++ void (*register_vblank_cb)(struct device *dev,
++ void (*vblank_cb)(void *),
++ void *vblank_cb_data);
++ void (*unregister_vblank_cb)(struct device *dev);
++ void (*enable_vblank)(struct device *dev);
+ void (*disable_vblank)(struct device *dev);
+ unsigned int (*supported_rotations)(struct device *dev);
+ unsigned int (*layer_nr)(struct device *dev);
+@@ -110,12 +112,25 @@ static inline void mtk_ddp_comp_stop(struct mtk_ddp_comp *comp)
+ comp->funcs->stop(comp->dev);
+ }
+
+-static inline void mtk_ddp_comp_enable_vblank(struct mtk_ddp_comp *comp,
+- void (*vblank_cb)(void *),
+- void *vblank_cb_data)
++static inline void mtk_ddp_comp_register_vblank_cb(struct mtk_ddp_comp *comp,
++ void (*vblank_cb)(void *),
++ void *vblank_cb_data)
++{
++ if (comp->funcs && comp->funcs->register_vblank_cb)
++ comp->funcs->register_vblank_cb(comp->dev, vblank_cb,
++ vblank_cb_data);
++}
++
++static inline void mtk_ddp_comp_unregister_vblank_cb(struct mtk_ddp_comp *comp)
++{
++ if (comp->funcs && comp->funcs->unregister_vblank_cb)
++ comp->funcs->unregister_vblank_cb(comp->dev);
++}
++
++static inline void mtk_ddp_comp_enable_vblank(struct mtk_ddp_comp *comp)
+ {
+ if (comp->funcs && comp->funcs->enable_vblank)
+- comp->funcs->enable_vblank(comp->dev, vblank_cb, vblank_cb_data);
++ comp->funcs->enable_vblank(comp->dev);
+ }
+
+ static inline void mtk_ddp_comp_disable_vblank(struct mtk_ddp_comp *comp)
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+index 247c6ff277efd..b0e4e5d689272 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
++++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+@@ -509,6 +509,8 @@ static const struct of_device_id mtk_ddp_comp_dt_ids[] = {
+ .data = (void *)MTK_DPI },
+ { .compatible = "mediatek,mt8183-dpi",
+ .data = (void *)MTK_DPI },
++ { .compatible = "mediatek,mt8192-dpi",
++ .data = (void *)MTK_DPI },
+ { .compatible = "mediatek,mt2701-dsi",
+ .data = (void *)MTK_DSI },
+ { .compatible = "mediatek,mt8173-dsi",
+diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+index 407f50a15faa4..217615e0e8507 100644
+--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+@@ -1662,28 +1662,23 @@ static struct msm_ringbuffer *a5xx_active_ring(struct msm_gpu *gpu)
+ return a5xx_gpu->cur_ring;
+ }
+
+-static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
++static u64 a5xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
+ {
+- u64 busy_cycles, busy_time;
++ u64 busy_cycles;
+
+ /* Only read the gpu busy if the hardware is already active */
+- if (pm_runtime_get_if_in_use(&gpu->pdev->dev) == 0)
++ if (pm_runtime_get_if_in_use(&gpu->pdev->dev) == 0) {
++ *out_sample_rate = 1;
+ return 0;
++ }
+
+ busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
+ REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
+-
+- busy_time = busy_cycles - gpu->devfreq.busy_cycles;
+- do_div(busy_time, clk_get_rate(gpu->core_clk) / 1000000);
+-
+- gpu->devfreq.busy_cycles = busy_cycles;
++ *out_sample_rate = clk_get_rate(gpu->core_clk);
+
+ pm_runtime_put(&gpu->pdev->dev);
+
+- if (WARN_ON(busy_time > ~0LU))
+- return ~0LU;
+-
+- return (unsigned long)busy_time;
++ return busy_cycles;
+ }
+
+ static uint32_t a5xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+index ccc4fcf7a630f..40fb92becc78a 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+@@ -1649,12 +1649,14 @@ static void a6xx_destroy(struct msm_gpu *gpu)
+ kfree(a6xx_gpu);
+ }
+
+-static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
++static u64 a6xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
+ {
+ struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+ struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+- u64 busy_cycles, busy_time;
++ u64 busy_cycles;
+
++ /* 19.2MHz */
++ *out_sample_rate = 19200000;
+
+ /* Only read the gpu busy if the hardware is already active */
+ if (pm_runtime_get_if_in_use(a6xx_gpu->gmu.dev) == 0)
+@@ -1664,17 +1666,10 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
+ REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_L,
+ REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_H);
+
+- busy_time = (busy_cycles - gpu->devfreq.busy_cycles) * 10;
+- do_div(busy_time, 192);
+-
+- gpu->devfreq.busy_cycles = busy_cycles;
+
+ pm_runtime_put(a6xx_gpu->gmu.dev);
+
+- if (WARN_ON(busy_time > ~0LU))
+- return ~0LU;
+-
+- return (unsigned long)busy_time;
++ return busy_cycles;
+ }
+
+ static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
+@@ -1919,6 +1914,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
+ BUG_ON(!node);
+
+ ret = a6xx_gmu_init(a6xx_gpu, node);
++ of_node_put(node);
+ if (ret) {
+ a6xx_destroy(&(a6xx_gpu->base.base));
+ return ERR_PTR(ret);
+diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+index 9efc84929be0b..1219f71629a52 100644
+--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+@@ -272,7 +272,10 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+ *value = 0;
+ return 0;
+ case MSM_PARAM_FAULTS:
+- *value = gpu->global_faults + ctx->aspace->faults;
++ if (ctx->aspace)
++ *value = gpu->global_faults + ctx->aspace->faults;
++ else
++ *value = gpu->global_faults;
+ return 0;
+ case MSM_PARAM_SUSPENDS:
+ *value = gpu->suspend_count;
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+index 7763558ef566b..16ba9f9b9a787 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+@@ -204,7 +204,8 @@ static int dpu_crtc_get_crc(struct drm_crtc *crtc)
+ rc = m->hw_lm->ops.collect_misr(m->hw_lm, &crcs[i]);
+
+ if (rc) {
+- DRM_DEBUG_DRIVER("MISR read failed\n");
++ if (rc != -ENODATA)
++ DRM_DEBUG_DRIVER("MISR read failed\n");
+ return rc;
+ }
+ }
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
+index c61b5b283f08d..cf9aa06ab8bdf 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
+@@ -599,6 +599,9 @@ void dpu_core_irq_uninstall(struct dpu_kms *dpu_kms)
+ {
+ int i;
+
++ if (!dpu_kms->hw_intr)
++ return;
++
+ pm_runtime_get_sync(&dpu_kms->pdev->dev);
+ for (i = 0; i < dpu_kms->hw_intr->total_irqs; i++)
+ if (!list_empty(&dpu_kms->hw_intr->irq_cb_tbl[i]))
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
+index 116e2b5b1a90f..284f5610dc35b 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c
+@@ -148,6 +148,7 @@ static void dpu_hw_intf_setup_timing_engine(struct dpu_hw_intf *ctx,
+ active_v_end = active_v_start + (p->yres * hsync_period) - 1;
+
+ display_v_start += p->hsync_pulse_width + p->h_back_porch;
++ display_v_end -= p->h_front_porch;
+
+ active_hctl = (active_h_end << 16) | active_h_start;
+ display_hctl = active_hctl;
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c
+index 86363c0ec8341..462f5082099e6 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c
+@@ -138,7 +138,7 @@ static int dpu_hw_lm_collect_misr(struct dpu_hw_mixer *ctx, u32 *misr_value)
+ ctrl = DPU_REG_READ(c, LM_MISR_CTRL);
+
+ if (!(ctrl & LM_MISR_CTRL_ENABLE))
+- return -EINVAL;
++ return -ENODATA;
+
+ if (!(ctrl & LM_MISR_CTRL_STATUS))
+ return -EINVAL;
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+index e29796c4f27be..c95bacd4f458e 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+@@ -793,8 +793,10 @@ static void _dpu_kms_hw_destroy(struct dpu_kms *dpu_kms)
+ for (i = 0; i < dpu_kms->catalog->vbif_count; i++) {
+ u32 vbif_idx = dpu_kms->catalog->vbif[i].id;
+
+- if ((vbif_idx < VBIF_MAX) && dpu_kms->hw_vbif[vbif_idx])
++ if ((vbif_idx < VBIF_MAX) && dpu_kms->hw_vbif[vbif_idx]) {
+ dpu_hw_vbif_destroy(dpu_kms->hw_vbif[vbif_idx]);
++ dpu_kms->hw_vbif[vbif_idx] = NULL;
++ }
+ }
+ }
+
+@@ -1056,7 +1058,9 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
+
+ dpu_kms_parse_data_bus_icc_path(dpu_kms);
+
+- pm_runtime_get_sync(&dpu_kms->pdev->dev);
++ rc = pm_runtime_resume_and_get(&dpu_kms->pdev->dev);
++ if (rc < 0)
++ goto error;
+
+ dpu_kms->core_rev = readl_relaxed(dpu_kms->mmio + 0x0);
+
+@@ -1240,7 +1244,7 @@ static int dpu_bind(struct device *dev, struct device *master, void *data)
+
+ priv->kms = &dpu_kms->base;
+
+- return ret;
++ return 0;
+ }
+
+ static void dpu_unbind(struct device *dev, struct device *master, void *data)
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+index b966cd69f99dd..31447da0af25c 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+@@ -612,9 +612,15 @@ static int mdp5_crtc_setup_pipeline(struct drm_crtc *crtc,
+ if (ret)
+ return ret;
+
+- mdp5_mixer_release(new_crtc_state->state, old_mixer);
++ ret = mdp5_mixer_release(new_crtc_state->state, old_mixer);
++ if (ret)
++ return ret;
++
+ if (old_r_mixer) {
+- mdp5_mixer_release(new_crtc_state->state, old_r_mixer);
++ ret = mdp5_mixer_release(new_crtc_state->state, old_r_mixer);
++ if (ret)
++ return ret;
++
+ if (!need_right_mixer)
+ pipeline->r_mixer = NULL;
+ }
+@@ -991,8 +997,10 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
+
+ ret = msm_gem_get_and_pin_iova(cursor_bo, kms->aspace,
+ &mdp5_crtc->cursor.iova);
+- if (ret)
++ if (ret) {
++ drm_gem_object_put(cursor_bo);
+ return -EINVAL;
++ }
+
+ pm_runtime_get_sync(&pdev->dev);
+
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+index 3b92372e7bdf1..1d4bbde293208 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+@@ -570,9 +570,9 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev)
+ }
+
+ irq = irq_of_parse_and_map(pdev->dev.of_node, 0);
+- if (irq < 0) {
+- ret = irq;
+- DRM_DEV_ERROR(&pdev->dev, "failed to get irq: %d\n", ret);
++ if (!irq) {
++ ret = -EINVAL;
++ DRM_DEV_ERROR(&pdev->dev, "failed to get irq\n");
+ goto fail;
+ }
+
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
+index 954db683ae444..2536def2a0005 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.c
+@@ -116,21 +116,28 @@ int mdp5_mixer_assign(struct drm_atomic_state *s, struct drm_crtc *crtc,
+ return 0;
+ }
+
+-void mdp5_mixer_release(struct drm_atomic_state *s, struct mdp5_hw_mixer *mixer)
++int mdp5_mixer_release(struct drm_atomic_state *s, struct mdp5_hw_mixer *mixer)
+ {
+ struct mdp5_global_state *global_state = mdp5_get_global_state(s);
+- struct mdp5_hw_mixer_state *new_state = &global_state->hwmixer;
++ struct mdp5_hw_mixer_state *new_state;
+
+ if (!mixer)
+- return;
++ return 0;
++
++ if (IS_ERR(global_state))
++ return PTR_ERR(global_state);
++
++ new_state = &global_state->hwmixer;
+
+ if (WARN_ON(!new_state->hwmixer_to_crtc[mixer->idx]))
+- return;
++ return -EINVAL;
+
+ DBG("%s: release from crtc %s", mixer->name,
+ new_state->hwmixer_to_crtc[mixer->idx]->name);
+
+ new_state->hwmixer_to_crtc[mixer->idx] = NULL;
++
++ return 0;
+ }
+
+ void mdp5_mixer_destroy(struct mdp5_hw_mixer *mixer)
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
+index 43c9ba43ce185..545ee223b9d74 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_mixer.h
+@@ -30,7 +30,7 @@ void mdp5_mixer_destroy(struct mdp5_hw_mixer *lm);
+ int mdp5_mixer_assign(struct drm_atomic_state *s, struct drm_crtc *crtc,
+ uint32_t caps, struct mdp5_hw_mixer **mixer,
+ struct mdp5_hw_mixer **r_mixer);
+-void mdp5_mixer_release(struct drm_atomic_state *s,
+- struct mdp5_hw_mixer *mixer);
++int mdp5_mixer_release(struct drm_atomic_state *s,
++ struct mdp5_hw_mixer *mixer);
+
+ #endif /* __MDP5_LM_H__ */
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
+index ba6695963aa66..a4f5cb90f3e80 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
+@@ -119,18 +119,23 @@ int mdp5_pipe_assign(struct drm_atomic_state *s, struct drm_plane *plane,
+ return 0;
+ }
+
+-void mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe)
++int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe)
+ {
+ struct msm_drm_private *priv = s->dev->dev_private;
+ struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(priv->kms));
+ struct mdp5_global_state *state = mdp5_get_global_state(s);
+- struct mdp5_hw_pipe_state *new_state = &state->hwpipe;
++ struct mdp5_hw_pipe_state *new_state;
+
+ if (!hwpipe)
+- return;
++ return 0;
++
++ if (IS_ERR(state))
++ return PTR_ERR(state);
++
++ new_state = &state->hwpipe;
+
+ if (WARN_ON(!new_state->hwpipe_to_plane[hwpipe->idx]))
+- return;
++ return -EINVAL;
+
+ DBG("%s: release from plane %s", hwpipe->name,
+ new_state->hwpipe_to_plane[hwpipe->idx]->name);
+@@ -141,6 +146,8 @@ void mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe)
+ }
+
+ new_state->hwpipe_to_plane[hwpipe->idx] = NULL;
++
++ return 0;
+ }
+
+ void mdp5_pipe_destroy(struct mdp5_hw_pipe *hwpipe)
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
+index 9b26d0761bd4f..cca67938cab21 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.h
+@@ -37,7 +37,7 @@ int mdp5_pipe_assign(struct drm_atomic_state *s, struct drm_plane *plane,
+ uint32_t caps, uint32_t blkcfg,
+ struct mdp5_hw_pipe **hwpipe,
+ struct mdp5_hw_pipe **r_hwpipe);
+-void mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe);
++int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe);
+
+ struct mdp5_hw_pipe *mdp5_pipe_init(enum mdp5_pipe pipe,
+ uint32_t reg_offset, uint32_t caps);
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
+index c478d25f7825a..f2d72497467bd 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
+@@ -314,12 +314,24 @@ static int mdp5_plane_atomic_check_with_state(struct drm_crtc_state *crtc_state,
+ mdp5_state->r_hwpipe = NULL;
+
+
+- mdp5_pipe_release(state->state, old_hwpipe);
+- mdp5_pipe_release(state->state, old_right_hwpipe);
++ ret = mdp5_pipe_release(state->state, old_hwpipe);
++ if (ret)
++ return ret;
++
++ ret = mdp5_pipe_release(state->state, old_right_hwpipe);
++ if (ret)
++ return ret;
++
+ }
+ } else {
+- mdp5_pipe_release(state->state, mdp5_state->hwpipe);
+- mdp5_pipe_release(state->state, mdp5_state->r_hwpipe);
++ ret = mdp5_pipe_release(state->state, mdp5_state->hwpipe);
++ if (ret)
++ return ret;
++
++ ret = mdp5_pipe_release(state->state, mdp5_state->r_hwpipe);
++ if (ret)
++ return ret;
++
+ mdp5_state->hwpipe = mdp5_state->r_hwpipe = NULL;
+ }
+
+diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
+index 53568567e05bc..08cc48af03b7d 100644
+--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
++++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
+@@ -1532,7 +1532,7 @@ static int dp_ctrl_process_phy_test_request(struct dp_ctrl_private *ctrl)
+ * running. Add the global reset just before disabling the
+ * link clocks and core clocks.
+ */
+- ret = dp_ctrl_off_link_stream(&ctrl->dp_ctrl);
++ ret = dp_ctrl_off(&ctrl->dp_ctrl);
+ if (ret) {
+ DRM_ERROR("failed to disable DP controller\n");
+ return ret;
+@@ -1699,8 +1699,6 @@ int dp_ctrl_on_link(struct dp_ctrl *dp_ctrl)
+ ctrl->link->link_params.rate,
+ ctrl->link->link_params.num_lanes, ctrl->dp_ctrl.pixel_rate);
+
+- ctrl->link->phy_params.p_level = 0;
+- ctrl->link->phy_params.v_level = 0;
+
+ rc = dp_ctrl_enable_mainlink_clocks(ctrl);
+ if (rc)
+@@ -1822,12 +1820,6 @@ int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl)
+ }
+ }
+
+- if (!dp_ctrl_channel_eq_ok(ctrl))
+- dp_ctrl_link_retrain(ctrl);
+-
+- /* stop txing train pattern to end link training */
+- dp_ctrl_clear_training_pattern(ctrl);
+-
+ ret = dp_ctrl_enable_stream_clocks(ctrl);
+ if (ret) {
+ DRM_ERROR("Failed to start pixel clocks. ret=%d\n", ret);
+@@ -1839,6 +1831,12 @@ int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl)
+ return 0;
+ }
+
++ if (!dp_ctrl_channel_eq_ok(ctrl))
++ dp_ctrl_link_retrain(ctrl);
++
++ /* stop txing train pattern to end link training */
++ dp_ctrl_clear_training_pattern(ctrl);
++
+ /*
+ * Set up transfer unit values and set controller state to send
+ * video.
+diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
+index 178b774a5fbd3..8deb92bddfdec 100644
+--- a/drivers/gpu/drm/msm/dp/dp_display.c
++++ b/drivers/gpu/drm/msm/dp/dp_display.c
+@@ -113,6 +113,7 @@ struct dp_display_private {
+ u32 hpd_state;
+ u32 event_pndx;
+ u32 event_gndx;
++ struct task_struct *ev_tsk;
+ struct dp_event event_list[DP_EVENT_Q_MAX];
+ spinlock_t event_lock;
+
+@@ -249,6 +250,8 @@ void dp_display_signal_audio_complete(struct msm_dp *dp_display)
+ complete_all(&dp->audio_comp);
+ }
+
++static int dp_hpd_event_thread_start(struct dp_display_private *dp_priv);
++
+ static int dp_display_bind(struct device *dev, struct device *master,
+ void *data)
+ {
+@@ -282,9 +285,18 @@ static int dp_display_bind(struct device *dev, struct device *master,
+ }
+
+ rc = dp_register_audio_driver(dev, dp->audio);
+- if (rc)
++ if (rc) {
+ DRM_ERROR("Audio registration Dp failed\n");
++ goto end;
++ }
+
++ rc = dp_hpd_event_thread_start(dp);
++ if (rc) {
++ DRM_ERROR("Event thread create failed\n");
++ goto end;
++ }
++
++ return 0;
+ end:
+ return rc;
+ }
+@@ -295,6 +307,11 @@ static void dp_display_unbind(struct device *dev, struct device *master,
+ struct dp_display_private *dp = dev_get_dp_display_private(dev);
+ struct msm_drm_private *priv = dev_get_drvdata(master);
+
++ /* disable all HPD interrupts */
++ dp_catalog_hpd_config_intr(dp->catalog, DP_DP_HPD_INT_MASK, false);
++
++ kthread_stop(dp->ev_tsk);
++
+ dp_power_client_deinit(dp->power);
+ dp_aux_unregister(dp->aux);
+ priv->dp[dp->id] = NULL;
+@@ -1093,12 +1110,17 @@ static int hpd_event_thread(void *data)
+ while (1) {
+ if (timeout_mode) {
+ wait_event_timeout(dp_priv->event_q,
+- (dp_priv->event_pndx == dp_priv->event_gndx),
+- EVENT_TIMEOUT);
++ (dp_priv->event_pndx == dp_priv->event_gndx) ||
++ kthread_should_stop(), EVENT_TIMEOUT);
+ } else {
+ wait_event_interruptible(dp_priv->event_q,
+- (dp_priv->event_pndx != dp_priv->event_gndx));
++ (dp_priv->event_pndx != dp_priv->event_gndx) ||
++ kthread_should_stop());
+ }
++
++ if (kthread_should_stop())
++ break;
++
+ spin_lock_irqsave(&dp_priv->event_lock, flag);
+ todo = &dp_priv->event_list[dp_priv->event_gndx];
+ if (todo->delay) {
+@@ -1168,12 +1190,17 @@ static int hpd_event_thread(void *data)
+ return 0;
+ }
+
+-static void dp_hpd_event_setup(struct dp_display_private *dp_priv)
++static int dp_hpd_event_thread_start(struct dp_display_private *dp_priv)
+ {
+- init_waitqueue_head(&dp_priv->event_q);
+- spin_lock_init(&dp_priv->event_lock);
++ /* set event q to empty */
++ dp_priv->event_gndx = 0;
++ dp_priv->event_pndx = 0;
+
+- kthread_run(hpd_event_thread, dp_priv, "dp_hpd_handler");
++ dp_priv->ev_tsk = kthread_run(hpd_event_thread, dp_priv, "dp_hpd_handler");
++ if (IS_ERR(dp_priv->ev_tsk))
++ return PTR_ERR(dp_priv->ev_tsk);
++
++ return 0;
+ }
+
+ static irqreturn_t dp_display_irq_handler(int irq, void *dev_id)
+@@ -1233,10 +1260,9 @@ int dp_display_request_irq(struct msm_dp *dp_display)
+ dp = container_of(dp_display, struct dp_display_private, dp_display);
+
+ dp->irq = irq_of_parse_and_map(dp->pdev->dev.of_node, 0);
+- if (dp->irq < 0) {
+- rc = dp->irq;
+- DRM_ERROR("failed to get irq: %d\n", rc);
+- return rc;
++ if (!dp->irq) {
++ DRM_ERROR("failed to get irq\n");
++ return -EINVAL;
+ }
+
+ rc = devm_request_irq(&dp->pdev->dev, dp->irq,
+@@ -1303,7 +1329,10 @@ static int dp_display_probe(struct platform_device *pdev)
+ return -EPROBE_DEFER;
+ }
+
++ /* setup event q */
+ mutex_init(&dp->event_mutex);
++ init_waitqueue_head(&dp->event_q);
++ spin_lock_init(&dp->event_lock);
+
+ /* Store DP audio handle inside DP display */
+ dp->dp_display.dp_audio = dp->audio;
+@@ -1483,8 +1512,6 @@ void msm_dp_irq_postinstall(struct msm_dp *dp_display)
+
+ dp = container_of(dp_display, struct dp_display_private, dp_display);
+
+- dp_hpd_event_setup(dp);
+-
+ dp_add_event(dp, EV_HPD_INIT_SETUP, 0, 100);
+ }
+
+diff --git a/drivers/gpu/drm/msm/dp/dp_drm.c b/drivers/gpu/drm/msm/dp/dp_drm.c
+index 80f59cf990898..262744914f97d 100644
+--- a/drivers/gpu/drm/msm/dp/dp_drm.c
++++ b/drivers/gpu/drm/msm/dp/dp_drm.c
+@@ -230,9 +230,13 @@ struct drm_bridge *msm_dp_bridge_init(struct msm_dp *dp_display, struct drm_devi
+ bridge->funcs = &dp_bridge_ops;
+ bridge->encoder = encoder;
+
++ drm_bridge_add(bridge);
++
+ rc = drm_bridge_attach(encoder, bridge, NULL, DRM_BRIDGE_ATTACH_NO_CONNECTOR);
+ if (rc) {
+ DRM_ERROR("failed to attach bridge, rc=%d\n", rc);
++ drm_bridge_remove(bridge);
++
+ return ERR_PTR(rc);
+ }
+
+diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
+index d51e70fab93db..8925f60fd9ecc 100644
+--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
++++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
+@@ -1341,10 +1341,10 @@ static int dsi_cmds2buf_tx(struct msm_dsi_host *msm_host,
+ dsi_get_bpp(msm_host->format) / 8;
+
+ len = dsi_cmd_dma_add(msm_host, msg);
+- if (!len) {
++ if (len < 0) {
+ pr_err("%s: failed to add cmd type = 0x%x\n",
+ __func__, msg->type);
+- return -EINVAL;
++ return len;
+ }
+
+ /* for video mode, do not send cmds more than
+@@ -1363,10 +1363,14 @@ static int dsi_cmds2buf_tx(struct msm_dsi_host *msm_host,
+ }
+
+ ret = dsi_cmd_dma_tx(msm_host, len);
+- if (ret < len) {
+- pr_err("%s: cmd dma tx failed, type=0x%x, data0=0x%x, len=%d\n",
+- __func__, msg->type, (*(u8 *)(msg->tx_buf)), len);
+- return -ECOMM;
++ if (ret < 0) {
++ pr_err("%s: cmd dma tx failed, type=0x%x, data0=0x%x, len=%d, ret=%d\n",
++ __func__, msg->type, (*(u8 *)(msg->tx_buf)), len, ret);
++ return ret;
++ } else if (ret < len) {
++ pr_err("%s: cmd dma tx failed, type=0x%x, data0=0x%x, ret=%d len=%d\n",
++ __func__, msg->type, (*(u8 *)(msg->tx_buf)), ret, len);
++ return -EIO;
+ }
+
+ return len;
+@@ -2092,9 +2096,12 @@ int msm_dsi_host_cmd_rx(struct mipi_dsi_host *host,
+ }
+
+ ret = dsi_cmds2buf_tx(msm_host, msg);
+- if (ret < msg->tx_len) {
++ if (ret < 0) {
+ pr_err("%s: Read cmd Tx failed, %d\n", __func__, ret);
+ return ret;
++ } else if (ret < msg->tx_len) {
++ pr_err("%s: Read cmd Tx failed, too short: %d\n", __func__, ret);
++ return -ECOMM;
+ }
+
+ /*
+diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c b/drivers/gpu/drm/msm/dsi/dsi_manager.c
+index 9f6af0f0fe005..84f3b2ebf1b8a 100644
+--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
++++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
+@@ -34,6 +34,32 @@ static struct msm_dsi_manager msm_dsim_glb;
+ #define IS_SYNC_NEEDED() (msm_dsim_glb.is_sync_needed)
+ #define IS_MASTER_DSI_LINK(id) (msm_dsim_glb.master_dsi_link_id == id)
+
++#ifdef CONFIG_OF
++static bool dsi_mgr_power_on_early(struct drm_bridge *bridge)
++{
++ struct drm_bridge *next_bridge = drm_bridge_get_next_bridge(bridge);
++
++ /*
++ * If the next bridge in the chain is the Parade ps8640 bridge chip
++ * then don't power on early since it seems to violate the expectations
++ * of the firmware that the bridge chip is running.
++ *
++ * NOTE: this is expected to be a temporary special case. It's expected
++ * that we'll eventually have a framework that allows the next level
++ * bridge to indicate whether it needs us to power on before it or
++ * after it. When that framework is in place then we'll use it and
++ * remove this special case.
++ */
++ return !(next_bridge && next_bridge->of_node &&
++ of_device_is_compatible(next_bridge->of_node, "parade,ps8640"));
++}
++#else
++static inline bool dsi_mgr_power_on_early(struct drm_bridge *bridge)
++{
++ return true;
++}
++#endif
++
+ static inline struct msm_dsi *dsi_mgr_get_dsi(int id)
+ {
+ return msm_dsim_glb.dsi[id];
+@@ -389,6 +415,9 @@ static void dsi_mgr_bridge_pre_enable(struct drm_bridge *bridge)
+ if (is_bonded_dsi && !IS_MASTER_DSI_LINK(id))
+ return;
+
++ if (!dsi_mgr_power_on_early(bridge))
++ dsi_mgr_bridge_power_on(bridge);
++
+ /* Always call panel functions once, because even for dual panels,
+ * there is only one drm_panel instance.
+ */
+@@ -570,7 +599,8 @@ static void dsi_mgr_bridge_mode_set(struct drm_bridge *bridge,
+ if (is_bonded_dsi && other_dsi)
+ msm_dsi_host_set_display_mode(other_dsi->host, adjusted_mode);
+
+- dsi_mgr_bridge_power_on(bridge);
++ if (dsi_mgr_power_on_early(bridge))
++ dsi_mgr_bridge_power_on(bridge);
+ }
+
+ static const struct drm_connector_funcs dsi_mgr_connector_funcs = {
+@@ -665,6 +695,8 @@ struct drm_bridge *msm_dsi_manager_bridge_init(u8 id)
+ bridge = &dsi_bridge->base;
+ bridge->funcs = &dsi_mgr_bridge_funcs;
+
++ drm_bridge_add(bridge);
++
+ ret = drm_bridge_attach(encoder, bridge, NULL, 0);
+ if (ret)
+ goto fail;
+@@ -735,6 +767,7 @@ struct drm_connector *msm_dsi_manager_ext_bridge_init(u8 id)
+
+ void msm_dsi_manager_bridge_destroy(struct drm_bridge *bridge)
+ {
++ drm_bridge_remove(bridge);
+ }
+
+ int msm_dsi_manager_cmd_xfer(int id, const struct mipi_dsi_msg *msg)
+diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
+index 75557ac99adf1..8199c53567f4e 100644
+--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
++++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
+@@ -1062,7 +1062,7 @@ const struct msm_dsi_phy_cfg dsi_phy_14nm_660_cfgs = {
+ },
+ .min_pll_rate = VCO_MIN_RATE,
+ .max_pll_rate = VCO_MAX_RATE,
+- .io_start = { 0xc994400, 0xc996000 },
++ .io_start = { 0xc994400, 0xc996400 },
+ .num_dsi_phy = 2,
+ };
+
+diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
+index ec324352e862e..f6229262dcb05 100644
+--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
++++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
+@@ -142,6 +142,10 @@ static struct hdmi *msm_hdmi_init(struct platform_device *pdev)
+ /* HDCP needs physical address of hdmi register */
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+ config->mmio_name);
++ if (!res) {
++ ret = -EINVAL;
++ goto fail;
++ }
+ hdmi->mmio_phy_addr = res->start;
+
+ hdmi->qfprom_mmio = msm_ioremap(pdev, config->qfprom_mmio_name);
+@@ -298,9 +302,9 @@ int msm_hdmi_modeset_init(struct hdmi *hdmi,
+ drm_connector_attach_encoder(hdmi->connector, hdmi->encoder);
+
+ hdmi->irq = irq_of_parse_and_map(pdev->dev.of_node, 0);
+- if (hdmi->irq < 0) {
+- ret = hdmi->irq;
+- DRM_DEV_ERROR(dev->dev, "failed to get irq: %d\n", ret);
++ if (!hdmi->irq) {
++ ret = -EINVAL;
++ DRM_DEV_ERROR(dev->dev, "failed to get irq\n");
+ goto fail;
+ }
+
+diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+index 10ebe2089cb61..97c24010c4d10 100644
+--- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
++++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+@@ -15,6 +15,7 @@ void msm_hdmi_bridge_destroy(struct drm_bridge *bridge)
+ struct hdmi_bridge *hdmi_bridge = to_hdmi_bridge(bridge);
+
+ msm_hdmi_hpd_disable(hdmi_bridge);
++ drm_bridge_remove(bridge);
+ }
+
+ static void msm_hdmi_power_on(struct drm_bridge *bridge)
+@@ -349,6 +350,8 @@ struct drm_bridge *msm_hdmi_bridge_init(struct hdmi *hdmi)
+ DRM_BRIDGE_OP_DETECT |
+ DRM_BRIDGE_OP_EDID;
+
++ drm_bridge_add(bridge);
++
+ ret = drm_bridge_attach(hdmi->encoder, bridge, NULL, DRM_BRIDGE_ATTACH_NO_CONNECTOR);
+ if (ret)
+ goto fail;
+diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
+index affa95eb05fcd..f2c46116df55c 100644
+--- a/drivers/gpu/drm/msm/msm_drv.c
++++ b/drivers/gpu/drm/msm/msm_drv.c
+@@ -11,6 +11,7 @@
+ #include <linux/uaccess.h>
+ #include <uapi/linux/sched/types.h>
+
++#include <drm/drm_bridge.h>
+ #include <drm/drm_drv.h>
+ #include <drm/drm_file.h>
+ #include <drm/drm_ioctl.h>
+@@ -112,6 +113,8 @@ static int msm_irq_postinstall(struct drm_device *dev)
+
+ static int msm_irq_install(struct drm_device *dev, unsigned int irq)
+ {
++ struct msm_drm_private *priv = dev->dev_private;
++ struct msm_kms *kms = priv->kms;
+ int ret;
+
+ if (irq == IRQ_NOTCONNECTED)
+@@ -123,6 +126,8 @@ static int msm_irq_install(struct drm_device *dev, unsigned int irq)
+ if (ret)
+ return ret;
+
++ kms->irq_requested = true;
++
+ ret = msm_irq_postinstall(dev);
+ if (ret) {
+ free_irq(irq, dev);
+@@ -138,7 +143,8 @@ static void msm_irq_uninstall(struct drm_device *dev)
+ struct msm_kms *kms = priv->kms;
+
+ kms->funcs->irq_uninstall(kms);
+- free_irq(kms->irq, dev);
++ if (kms->irq_requested)
++ free_irq(kms->irq, dev);
+ }
+
+ struct msm_vblank_work {
+@@ -232,6 +238,9 @@ static int msm_drm_uninit(struct device *dev)
+
+ drm_mode_config_cleanup(ddev);
+
++ for (i = 0; i < priv->num_bridges; i++)
++ drm_bridge_remove(priv->bridges[i]);
++
+ pm_runtime_get_sync(dev);
+ msm_irq_uninstall(ddev);
+ pm_runtime_put_sync(dev);
+diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c b/drivers/gpu/drm/msm/msm_gem_prime.c
+index e8f1b7a2ca9c5..94ab705e9b8a4 100644
+--- a/drivers/gpu/drm/msm/msm_gem_prime.c
++++ b/drivers/gpu/drm/msm/msm_gem_prime.c
+@@ -17,7 +17,7 @@ struct sg_table *msm_gem_prime_get_sg_table(struct drm_gem_object *obj)
+ int npages = obj->size >> PAGE_SHIFT;
+
+ if (WARN_ON(!msm_obj->pages)) /* should have already pinned! */
+- return NULL;
++ return ERR_PTR(-ENOMEM);
+
+ return drm_prime_pages_to_sg(obj->dev, msm_obj->pages, npages);
+ }
+diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
+index faf0c242874e8..58eb3e1662cb9 100644
+--- a/drivers/gpu/drm/msm/msm_gpu.c
++++ b/drivers/gpu/drm/msm/msm_gpu.c
+@@ -371,7 +371,8 @@ static void recover_worker(struct kthread_work *work)
+
+ /* Increment the fault counts */
+ submit->queue->faults++;
+- submit->aspace->faults++;
++ if (submit->aspace)
++ submit->aspace->faults++;
+
+ task = get_pid_task(submit->pid, PIDTYPE_PID);
+ if (task) {
+diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
+index 02419f2ca2bc5..143c56f5185b8 100644
+--- a/drivers/gpu/drm/msm/msm_gpu.h
++++ b/drivers/gpu/drm/msm/msm_gpu.h
+@@ -9,6 +9,7 @@
+
+ #include <linux/adreno-smmu-priv.h>
+ #include <linux/clk.h>
++#include <linux/devfreq.h>
+ #include <linux/interconnect.h>
+ #include <linux/pm_opp.h>
+ #include <linux/regulator/consumer.h>
+@@ -62,7 +63,7 @@ struct msm_gpu_funcs {
+ /* for generation specific debugfs: */
+ void (*debugfs_init)(struct msm_gpu *gpu, struct drm_minor *minor);
+ #endif
+- unsigned long (*gpu_busy)(struct msm_gpu *gpu);
++ u64 (*gpu_busy)(struct msm_gpu *gpu, unsigned long *out_sample_rate);
+ struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu);
+ int (*gpu_state_put)(struct msm_gpu_state *state);
+ unsigned long (*gpu_get_freq)(struct msm_gpu *gpu);
+@@ -106,11 +107,8 @@ struct msm_gpu_devfreq {
+ struct dev_pm_qos_request boost_freq;
+
+ /**
+- * busy_cycles:
+- *
+- * Used by implementation of gpu->gpu_busy() to track the last
+- * busy counter value, for calculating elapsed busy cycles since
+- * last sampling period.
++ * busy_cycles: Last busy counter value, for calculating elapsed busy
++ * cycles since last sampling period.
+ */
+ u64 busy_cycles;
+
+@@ -120,6 +118,8 @@ struct msm_gpu_devfreq {
+ /** idle_time: Time of last transition to idle: */
+ ktime_t idle_time;
+
++ struct devfreq_dev_status average_status;
++
+ /**
+ * idle_work:
+ *
+diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
+index 12641616acd30..c7dbaa4b19264 100644
+--- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
++++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
+@@ -9,6 +9,7 @@
+
+ #include <linux/devfreq.h>
+ #include <linux/devfreq_cooling.h>
++#include <linux/math64.h>
+ #include <linux/units.h>
+
+ /*
+@@ -49,18 +50,95 @@ static unsigned long get_freq(struct msm_gpu *gpu)
+ return clk_get_rate(gpu->core_clk);
+ }
+
+-static int msm_devfreq_get_dev_status(struct device *dev,
++static void get_raw_dev_status(struct msm_gpu *gpu,
+ struct devfreq_dev_status *status)
+ {
+- struct msm_gpu *gpu = dev_to_gpu(dev);
++ struct msm_gpu_devfreq *df = &gpu->devfreq;
++ u64 busy_cycles, busy_time;
++ unsigned long sample_rate;
+ ktime_t time;
+
+ status->current_frequency = get_freq(gpu);
+- status->busy_time = gpu->funcs->gpu_busy(gpu);
+-
++ busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
+ time = ktime_get();
+- status->total_time = ktime_us_delta(time, gpu->devfreq.time);
+- gpu->devfreq.time = time;
++
++ busy_time = busy_cycles - df->busy_cycles;
++ status->total_time = ktime_us_delta(time, df->time);
++
++ df->busy_cycles = busy_cycles;
++ df->time = time;
++
++ busy_time *= USEC_PER_SEC;
++ do_div(busy_time, sample_rate);
++ if (WARN_ON(busy_time > ~0LU))
++ busy_time = ~0LU;
++
++ status->busy_time = busy_time;
++}
++
++static void update_average_dev_status(struct msm_gpu *gpu,
++ const struct devfreq_dev_status *raw)
++{
++ struct msm_gpu_devfreq *df = &gpu->devfreq;
++ const u32 polling_ms = df->devfreq->profile->polling_ms;
++ const u32 max_history_ms = polling_ms * 11 / 10;
++ struct devfreq_dev_status *avg = &df->average_status;
++ u64 avg_freq;
++
++ /* simple_ondemand governor interacts poorly with gpu->clamp_to_idle.
++ * When we enforce the constraint on idle, it calls get_dev_status
++ * which would normally reset the stats. When we remove the
++ * constraint on active, it calls get_dev_status again where busy_time
++ * would be 0.
++ *
++ * To remedy this, we always return the average load over the past
++ * polling_ms.
++ */
++
++ /* raw is longer than polling_ms or avg has no history */
++ if (div_u64(raw->total_time, USEC_PER_MSEC) >= polling_ms ||
++ !avg->total_time) {
++ *avg = *raw;
++ return;
++ }
++
++ /* Truncate the oldest history first.
++ *
++ * Because we keep the history with a single devfreq_dev_status,
++ * rather than a list of devfreq_dev_status, we have to assume freq
++ * and load are the same over avg->total_time. We can scale down
++ * avg->busy_time and avg->total_time by the same factor to drop
++ * history.
++ */
++ if (div_u64(avg->total_time + raw->total_time, USEC_PER_MSEC) >=
++ max_history_ms) {
++ const u32 new_total_time = polling_ms * USEC_PER_MSEC -
++ raw->total_time;
++ avg->busy_time = div_u64(
++ mul_u32_u32(avg->busy_time, new_total_time),
++ avg->total_time);
++ avg->total_time = new_total_time;
++ }
++
++ /* compute the average freq over avg->total_time + raw->total_time */
++ avg_freq = mul_u32_u32(avg->current_frequency, avg->total_time);
++ avg_freq += mul_u32_u32(raw->current_frequency, raw->total_time);
++ do_div(avg_freq, avg->total_time + raw->total_time);
++
++ avg->current_frequency = avg_freq;
++ avg->busy_time += raw->busy_time;
++ avg->total_time += raw->total_time;
++}
++
++static int msm_devfreq_get_dev_status(struct device *dev,
++ struct devfreq_dev_status *status)
++{
++ struct msm_gpu *gpu = dev_to_gpu(dev);
++ struct devfreq_dev_status raw;
++
++ get_raw_dev_status(gpu, &raw);
++ update_average_dev_status(gpu, &raw);
++ *status = gpu->devfreq.average_status;
+
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
+index 2a4f0526cb980..401d7e19811f3 100644
+--- a/drivers/gpu/drm/msm/msm_kms.h
++++ b/drivers/gpu/drm/msm/msm_kms.h
+@@ -148,6 +148,7 @@ struct msm_kms {
+
+ /* irq number to be passed on to msm_irq_install */
+ int irq;
++ bool irq_requested;
+
+ /* mapper-id used to request GEM buffer mapped for scanout: */
+ struct msm_gem_address_space *aspace;
+diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h
+index 3d82b3c67decc..93f8f4f645784 100644
+--- a/drivers/gpu/drm/nouveau/dispnv50/atom.h
++++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h
+@@ -160,14 +160,14 @@ nv50_head_atom_get(struct drm_atomic_state *state, struct drm_crtc *crtc)
+ static inline struct drm_encoder *
+ nv50_head_atom_get_encoder(struct nv50_head_atom *atom)
+ {
+- struct drm_encoder *encoder = NULL;
++ struct drm_encoder *encoder;
+
+ /* We only ever have a single encoder */
+ drm_for_each_encoder_mask(encoder, atom->state.crtc->dev,
+ atom->state.encoder_mask)
+- break;
++ return encoder;
+
+- return encoder;
++ return NULL;
+ }
+
+ #define nv50_wndw_atom(p) container_of((p), struct nv50_wndw_atom, state)
+diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.c b/drivers/gpu/drm/nouveau/dispnv50/crc.c
+index 29428e770f146..b834e8a9ae775 100644
+--- a/drivers/gpu/drm/nouveau/dispnv50/crc.c
++++ b/drivers/gpu/drm/nouveau/dispnv50/crc.c
+@@ -390,9 +390,18 @@ void nv50_crc_atomic_check_outp(struct nv50_atom *atom)
+ struct nv50_head_atom *armh = nv50_head_atom(old_crtc_state);
+ struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
+ struct nv50_outp_atom *outp_atom;
+- struct nouveau_encoder *outp =
+- nv50_real_outp(nv50_head_atom_get_encoder(armh));
+- struct drm_encoder *encoder = &outp->base.base;
++ struct nouveau_encoder *outp;
++ struct drm_encoder *encoder, *enc;
++
++ enc = nv50_head_atom_get_encoder(armh);
++ if (!enc)
++ continue;
++
++ outp = nv50_real_outp(enc);
++ if (!outp)
++ continue;
++
++ encoder = &outp->base.base;
+
+ if (!asyh->clr.crc)
+ continue;
+@@ -443,8 +452,16 @@ void nv50_crc_atomic_set(struct nv50_head *head,
+ struct drm_device *dev = crtc->dev;
+ struct nv50_crc *crc = &head->crc;
+ const struct nv50_crc_func *func = nv50_disp(dev)->core->func->crc;
+- struct nouveau_encoder *outp =
+- nv50_real_outp(nv50_head_atom_get_encoder(asyh));
++ struct nouveau_encoder *outp;
++ struct drm_encoder *encoder;
++
++ encoder = nv50_head_atom_get_encoder(asyh);
++ if (!encoder)
++ return;
++
++ outp = nv50_real_outp(encoder);
++ if (!outp)
++ return;
+
+ func->set_src(head, outp->or, nv50_crc_source_type(outp, asyh->crc.src),
+ &crc->ctx[crc->ctx_idx]);
+diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h b/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h
+index 1665738948fb4..96113c8bee8c5 100644
+--- a/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h
++++ b/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h
+@@ -62,4 +62,6 @@ void nvkm_subdev_intr(struct nvkm_subdev *);
+ #define nvkm_debug(s,f,a...) nvkm_printk((s), DEBUG, info, f, ##a)
+ #define nvkm_trace(s,f,a...) nvkm_printk((s), TRACE, info, f, ##a)
+ #define nvkm_spam(s,f,a...) nvkm_printk((s), SPAM, dbg, f, ##a)
++
++#define nvkm_error_ratelimited(s,f,a...) nvkm_printk((s), ERROR, err_ratelimited, f, ##a)
+ #endif
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bus/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bus/gf100.c
+index 53a6651ac2258..80b5aaceeaad1 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bus/gf100.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bus/gf100.c
+@@ -35,13 +35,13 @@ gf100_bus_intr(struct nvkm_bus *bus)
+ u32 addr = nvkm_rd32(device, 0x009084);
+ u32 data = nvkm_rd32(device, 0x009088);
+
+- nvkm_error(subdev,
+- "MMIO %s of %08x FAULT at %06x [ %s%s%s]\n",
+- (addr & 0x00000002) ? "write" : "read", data,
+- (addr & 0x00fffffc),
+- (stat & 0x00000002) ? "!ENGINE " : "",
+- (stat & 0x00000004) ? "PRIVRING " : "",
+- (stat & 0x00000008) ? "TIMEOUT " : "");
++ nvkm_error_ratelimited(subdev,
++ "MMIO %s of %08x FAULT at %06x [ %s%s%s]\n",
++ (addr & 0x00000002) ? "write" : "read", data,
++ (addr & 0x00fffffc),
++ (stat & 0x00000002) ? "!ENGINE " : "",
++ (stat & 0x00000004) ? "PRIVRING " : "",
++ (stat & 0x00000008) ? "TIMEOUT " : "");
+
+ nvkm_wr32(device, 0x009084, 0x00000000);
+ nvkm_wr32(device, 0x001100, (stat & 0x0000000e));
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv31.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv31.c
+index ad8da523bb22e..c75e463f35013 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv31.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv31.c
+@@ -45,9 +45,9 @@ nv31_bus_intr(struct nvkm_bus *bus)
+ u32 addr = nvkm_rd32(device, 0x009084);
+ u32 data = nvkm_rd32(device, 0x009088);
+
+- nvkm_error(subdev, "MMIO %s of %08x FAULT at %06x\n",
+- (addr & 0x00000002) ? "write" : "read", data,
+- (addr & 0x00fffffc));
++ nvkm_error_ratelimited(subdev, "MMIO %s of %08x FAULT at %06x\n",
++ (addr & 0x00000002) ? "write" : "read", data,
++ (addr & 0x00fffffc));
+
+ stat &= ~0x00000008;
+ nvkm_wr32(device, 0x001100, 0x00000008);
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv50.c
+index 3a1e45adeedc1..2055d0b100d3f 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv50.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bus/nv50.c
+@@ -60,9 +60,9 @@ nv50_bus_intr(struct nvkm_bus *bus)
+ u32 addr = nvkm_rd32(device, 0x009084);
+ u32 data = nvkm_rd32(device, 0x009088);
+
+- nvkm_error(subdev, "MMIO %s of %08x FAULT at %06x\n",
+- (addr & 0x00000002) ? "write" : "read", data,
+- (addr & 0x00fffffc));
++ nvkm_error_ratelimited(subdev, "MMIO %s of %08x FAULT at %06x\n",
++ (addr & 0x00000002) ? "write" : "read", data,
++ (addr & 0x00fffffc));
+
+ stat &= ~0x00000008;
+ nvkm_wr32(device, 0x001100, 0x00000008);
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c
+index 57199be082fd3..c2b5cc5f97eda 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/clk/base.c
+@@ -135,10 +135,10 @@ nvkm_cstate_find_best(struct nvkm_clk *clk, struct nvkm_pstate *pstate,
+
+ list_for_each_entry_from_reverse(cstate, &pstate->list, head) {
+ if (nvkm_cstate_valid(clk, cstate, max_volt, clk->temp))
+- break;
++ return cstate;
+ }
+
+- return cstate;
++ return NULL;
+ }
+
+ static struct nvkm_cstate *
+@@ -169,6 +169,8 @@ nvkm_cstate_prog(struct nvkm_clk *clk, struct nvkm_pstate *pstate, int cstatei)
+ if (!list_empty(&pstate->list)) {
+ cstate = nvkm_cstate_get(clk, pstate, cstatei);
+ cstate = nvkm_cstate_find_best(clk, pstate, cstate);
++ if (!cstate)
++ return -EINVAL;
+ } else {
+ cstate = &pstate->base;
+ }
+diff --git a/drivers/gpu/drm/omapdrm/omap_overlay.c b/drivers/gpu/drm/omapdrm/omap_overlay.c
+index 10730c9b27523..b0bc9ad2ef73a 100644
+--- a/drivers/gpu/drm/omapdrm/omap_overlay.c
++++ b/drivers/gpu/drm/omapdrm/omap_overlay.c
+@@ -86,7 +86,7 @@ int omap_overlay_assign(struct drm_atomic_state *s, struct drm_plane *plane,
+ r_ovl = omap_plane_find_free_overlay(s->dev, overlay_map,
+ caps, fourcc);
+ if (!r_ovl) {
+- overlay_map[r_ovl->idx] = NULL;
++ overlay_map[ovl->idx] = NULL;
+ *overlay = NULL;
+ return -ENOMEM;
+ }
+diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
+index a34f4198a5341..6880dc59fa886 100644
+--- a/drivers/gpu/drm/panel/panel-simple.c
++++ b/drivers/gpu/drm/panel/panel-simple.c
+@@ -720,7 +720,7 @@ static const struct drm_display_mode ampire_am_1280800n3tzqw_t00h_mode = {
+ static const struct panel_desc ampire_am_1280800n3tzqw_t00h = {
+ .modes = &ire_am_1280800n3tzqw_t00h_mode,
+ .num_modes = 1,
+- .bpc = 6,
++ .bpc = 8,
+ .size = {
+ .width = 217,
+ .height = 136,
+@@ -2029,6 +2029,7 @@ static const struct panel_desc innolux_g070y2_l01 = {
+ .unprepare = 800,
+ },
+ .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_SPWG,
++ .bus_flags = DRM_BUS_FLAG_DE_HIGH,
+ .connector_type = DRM_MODE_CONNECTOR_LVDS,
+ };
+
+diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+index 3e8d9e2d1b675..d53037531f407 100644
+--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
++++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+@@ -2118,10 +2118,10 @@ static int vop_bind(struct device *dev, struct device *master, void *data)
+ vop_win_init(vop);
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+- vop->len = resource_size(res);
+ vop->regs = devm_ioremap_resource(dev, res);
+ if (IS_ERR(vop->regs))
+ return PTR_ERR(vop->regs);
++ vop->len = resource_size(res);
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+ if (res) {
+diff --git a/drivers/gpu/drm/selftests/test-drm_buddy.c b/drivers/gpu/drm/selftests/test-drm_buddy.c
+index fa997f89522b6..aca0c491040f2 100644
+--- a/drivers/gpu/drm/selftests/test-drm_buddy.c
++++ b/drivers/gpu/drm/selftests/test-drm_buddy.c
+@@ -488,8 +488,10 @@ static int igt_buddy_alloc_smoke(void *arg)
+ }
+
+ order = drm_random_order(mm.max_order + 1, &prng);
+- if (!order)
++ if (!order) {
++ err = -ENOMEM;
+ goto out_fini;
++ }
+
+ for (i = 0; i <= mm.max_order; ++i) {
+ struct drm_buddy_block *block;
+@@ -902,14 +904,13 @@ err_fini:
+
+ static int igt_buddy_alloc_limit(void *arg)
+ {
+- u64 end, size = U64_MAX, start = 0;
++ u64 size = U64_MAX, start = 0;
+ struct drm_buddy_block *block;
+ unsigned long flags = 0;
+ LIST_HEAD(allocated);
+ struct drm_buddy mm;
+ int err;
+
+- size = end = round_down(size, 4096);
+ err = drm_buddy_init(&mm, size, PAGE_SIZE);
+ if (err)
+ return err;
+@@ -921,7 +922,8 @@ static int igt_buddy_alloc_limit(void *arg)
+ goto out_fini;
+ }
+
+- err = drm_buddy_alloc_blocks(&mm, start, end, size,
++ size = mm.chunk_size << mm.max_order;
++ err = drm_buddy_alloc_blocks(&mm, start, size, size,
+ PAGE_SIZE, &allocated, flags);
+
+ if (unlikely(err))
+diff --git a/drivers/gpu/drm/solomon/Kconfig b/drivers/gpu/drm/solomon/Kconfig
+index 5861c3ab7c452..6230369505c96 100644
+--- a/drivers/gpu/drm/solomon/Kconfig
++++ b/drivers/gpu/drm/solomon/Kconfig
+@@ -1,6 +1,6 @@
+ config DRM_SSD130X
+ tristate "DRM support for Solomon SSD130x OLED displays"
+- depends on DRM
++ depends on DRM && MMU
+ select BACKLIGHT_CLASS_DEVICE
+ select DRM_GEM_SHMEM_HELPER
+ select DRM_KMS_HELPER
+diff --git a/drivers/gpu/drm/solomon/ssd130x.c b/drivers/gpu/drm/solomon/ssd130x.c
+index ce4dc20412e06..38b6c2c14f536 100644
+--- a/drivers/gpu/drm/solomon/ssd130x.c
++++ b/drivers/gpu/drm/solomon/ssd130x.c
+@@ -48,7 +48,7 @@
+ #define SSD130X_CONTRAST 0x81
+ #define SSD130X_SET_LOOKUP_TABLE 0x91
+ #define SSD130X_CHARGE_PUMP 0x8d
+-#define SSD130X_SEG_REMAP_ON 0xa1
++#define SSD130X_SET_SEG_REMAP 0xa0
+ #define SSD130X_DISPLAY_OFF 0xae
+ #define SSD130X_SET_MULTIPLEX_RATIO 0xa8
+ #define SSD130X_DISPLAY_ON 0xaf
+@@ -61,7 +61,9 @@
+ #define SSD130X_SET_COM_PINS_CONFIG 0xda
+ #define SSD130X_SET_VCOMH 0xdb
+
+-#define SSD130X_SET_COM_SCAN_DIR_MASK GENMASK(3, 2)
++#define SSD130X_SET_SEG_REMAP_MASK GENMASK(0, 0)
++#define SSD130X_SET_SEG_REMAP_SET(val) FIELD_PREP(SSD130X_SET_SEG_REMAP_MASK, (val))
++#define SSD130X_SET_COM_SCAN_DIR_MASK GENMASK(3, 3)
+ #define SSD130X_SET_COM_SCAN_DIR_SET(val) FIELD_PREP(SSD130X_SET_COM_SCAN_DIR_MASK, (val))
+ #define SSD130X_SET_CLOCK_DIV_MASK GENMASK(3, 0)
+ #define SSD130X_SET_CLOCK_DIV_SET(val) FIELD_PREP(SSD130X_SET_CLOCK_DIV_MASK, (val))
+@@ -235,7 +237,7 @@ static void ssd130x_power_off(struct ssd130x_device *ssd130x)
+
+ static int ssd130x_init(struct ssd130x_device *ssd130x)
+ {
+- u32 precharge, dclk, com_invdir, compins, chargepump;
++ u32 precharge, dclk, com_invdir, compins, chargepump, seg_remap;
+ int ret;
+
+ /* Set initial contrast */
+@@ -244,11 +246,11 @@ static int ssd130x_init(struct ssd130x_device *ssd130x)
+ return ret;
+
+ /* Set segment re-map */
+- if (ssd130x->seg_remap) {
+- ret = ssd130x_write_cmd(ssd130x, 1, SSD130X_SEG_REMAP_ON);
+- if (ret < 0)
+- return ret;
+- }
++ seg_remap = (SSD130X_SET_SEG_REMAP |
++ SSD130X_SET_SEG_REMAP_SET(ssd130x->seg_remap));
++ ret = ssd130x_write_cmd(ssd130x, 1, seg_remap);
++ if (ret < 0)
++ return ret;
+
+ /* Set COM direction */
+ com_invdir = (SSD130X_SET_COM_SCAN_DIR |
+@@ -353,11 +355,14 @@ static int ssd130x_update_rect(struct ssd130x_device *ssd130x, u8 *buf,
+ unsigned int width = drm_rect_width(rect);
+ unsigned int height = drm_rect_height(rect);
+ unsigned int line_length = DIV_ROUND_UP(width, 8);
+- unsigned int pages = DIV_ROUND_UP(y % 8 + height, 8);
++ unsigned int pages = DIV_ROUND_UP(height, 8);
++ struct drm_device *drm = &ssd130x->drm;
+ u32 array_idx = 0;
+ int ret, i, j, k;
+ u8 *data_array = NULL;
+
++ drm_WARN_ONCE(drm, y % 8 != 0, "y must be aligned to screen page\n");
++
+ data_array = kcalloc(width, pages, GFP_KERNEL);
+ if (!data_array)
+ return -ENOMEM;
+@@ -399,13 +404,13 @@ static int ssd130x_update_rect(struct ssd130x_device *ssd130x, u8 *buf,
+ if (ret < 0)
+ goto out_free;
+
+- for (i = y / 8; i < y / 8 + pages; i++) {
++ for (i = 0; i < pages; i++) {
+ int m = 8;
+
+ /* Last page may be partial */
+- if (8 * (i + 1) > ssd130x->height)
++ if (8 * (y / 8 + i + 1) > ssd130x->height)
+ m = ssd130x->height % 8;
+- for (j = x; j < x + width; j++) {
++ for (j = 0; j < width; j++) {
+ u8 data = 0;
+
+ for (k = 0; k < m; k++) {
+@@ -435,7 +440,8 @@ static void ssd130x_clear_screen(struct ssd130x_device *ssd130x)
+ .y2 = ssd130x->height,
+ };
+
+- buf = kcalloc(ssd130x->width, ssd130x->height, GFP_KERNEL);
++ buf = kcalloc(DIV_ROUND_UP(ssd130x->width, 8), ssd130x->height,
++ GFP_KERNEL);
+ if (!buf)
+ return;
+
+@@ -449,14 +455,20 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, const struct iosys_m
+ {
+ struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
+ void *vmap = map->vaddr; /* TODO: Use mapping abstraction properly */
++ unsigned int dst_pitch;
+ int ret = 0;
+ u8 *buf = NULL;
+
+- buf = kcalloc(fb->width, fb->height, GFP_KERNEL);
++ /* Align y to display page boundaries */
++ rect->y1 = round_down(rect->y1, 8);
++ rect->y2 = min_t(unsigned int, round_up(rect->y2, 8), ssd130x->height);
++
++ dst_pitch = DIV_ROUND_UP(drm_rect_width(rect), 8);
++ buf = kcalloc(dst_pitch, drm_rect_height(rect), GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+- drm_fb_xrgb8888_to_mono_reversed(buf, 0, vmap, fb, rect);
++ drm_fb_xrgb8888_to_mono(buf, dst_pitch, vmap, fb, rect);
+
+ ssd130x_update_rect(ssd130x, buf, rect);
+
+diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
+index 17cc050207f4a..6bd45df8f5a72 100644
+--- a/drivers/gpu/drm/stm/ltdc.c
++++ b/drivers/gpu/drm/stm/ltdc.c
+@@ -869,8 +869,8 @@ static void ltdc_crtc_mode_set_nofb(struct drm_crtc *crtc)
+ struct drm_device *ddev = crtc->dev;
+ struct drm_connector_list_iter iter;
+ struct drm_connector *connector = NULL;
+- struct drm_encoder *encoder = NULL;
+- struct drm_bridge *bridge = NULL;
++ struct drm_encoder *encoder = NULL, *en_iter;
++ struct drm_bridge *bridge = NULL, *br_iter;
+ struct drm_display_mode *mode = &crtc->state->adjusted_mode;
+ u32 hsync, vsync, accum_hbp, accum_vbp, accum_act_w, accum_act_h;
+ u32 total_width, total_height;
+@@ -880,15 +880,19 @@ static void ltdc_crtc_mode_set_nofb(struct drm_crtc *crtc)
+ int ret;
+
+ /* get encoder from crtc */
+- drm_for_each_encoder(encoder, ddev)
+- if (encoder->crtc == crtc)
++ drm_for_each_encoder(en_iter, ddev)
++ if (en_iter->crtc == crtc) {
++ encoder = en_iter;
+ break;
++ }
+
+ if (encoder) {
+ /* get bridge from encoder */
+- list_for_each_entry(bridge, &encoder->bridge_chain, chain_node)
+- if (bridge->encoder == encoder)
++ list_for_each_entry(br_iter, &encoder->bridge_chain, chain_node)
++ if (br_iter->encoder == encoder) {
++ bridge = br_iter;
+ break;
++ }
+
+ /* Get the connector from encoder */
+ drm_connector_list_iter_begin(ddev, &iter);
+diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
+index 0063403ab5e18..7c7dd84e6db84 100644
+--- a/drivers/gpu/drm/tegra/gem.c
++++ b/drivers/gpu/drm/tegra/gem.c
+@@ -88,6 +88,7 @@ static struct host1x_bo_mapping *tegra_bo_pin(struct device *dev, struct host1x_
+ if (IS_ERR(map->sgt)) {
+ dma_buf_detach(buf, map->attach);
+ err = PTR_ERR(map->sgt);
++ map->sgt = NULL;
+ goto free;
+ }
+
+diff --git a/drivers/gpu/drm/tilcdc/tilcdc_external.c b/drivers/gpu/drm/tilcdc/tilcdc_external.c
+index 7594cf6e186eb..3b86d002ef62e 100644
+--- a/drivers/gpu/drm/tilcdc/tilcdc_external.c
++++ b/drivers/gpu/drm/tilcdc/tilcdc_external.c
+@@ -60,11 +60,13 @@ struct drm_connector *tilcdc_encoder_find_connector(struct drm_device *ddev,
+ int tilcdc_add_component_encoder(struct drm_device *ddev)
+ {
+ struct tilcdc_drm_private *priv = ddev->dev_private;
+- struct drm_encoder *encoder;
++ struct drm_encoder *encoder = NULL, *iter;
+
+- list_for_each_entry(encoder, &ddev->mode_config.encoder_list, head)
+- if (encoder->possible_crtcs & (1 << priv->crtc->index))
++ list_for_each_entry(iter, &ddev->mode_config.encoder_list, head)
++ if (iter->possible_crtcs & (1 << priv->crtc->index)) {
++ encoder = iter;
+ break;
++ }
+
+ if (!encoder) {
+ dev_err(ddev->dev, "%s: No suitable encoder found\n", __func__);
+diff --git a/drivers/gpu/drm/tiny/repaper.c b/drivers/gpu/drm/tiny/repaper.c
+index 37b6bb90e46e1..a096fb8b83e99 100644
+--- a/drivers/gpu/drm/tiny/repaper.c
++++ b/drivers/gpu/drm/tiny/repaper.c
+@@ -540,7 +540,7 @@ static int repaper_fb_dirty(struct drm_framebuffer *fb)
+ if (ret)
+ goto out_free;
+
+- drm_fb_xrgb8888_to_mono_reversed(buf, 0, cma_obj->vaddr, fb, &clip);
++ drm_fb_xrgb8888_to_mono(buf, 0, cma_obj->vaddr, fb, &clip);
+
+ drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
+
+diff --git a/drivers/gpu/drm/v3d/v3d_perfmon.c b/drivers/gpu/drm/v3d/v3d_perfmon.c
+index 0288ef063513e..f6a88abccc7d9 100644
+--- a/drivers/gpu/drm/v3d/v3d_perfmon.c
++++ b/drivers/gpu/drm/v3d/v3d_perfmon.c
+@@ -25,11 +25,12 @@ void v3d_perfmon_start(struct v3d_dev *v3d, struct v3d_perfmon *perfmon)
+ {
+ unsigned int i;
+ u32 mask;
+- u8 ncounters = perfmon->ncounters;
++ u8 ncounters;
+
+ if (WARN_ON_ONCE(!perfmon || v3d->active_perfmon))
+ return;
+
++ ncounters = perfmon->ncounters;
+ mask = GENMASK(ncounters - 1, 0);
+
+ for (i = 0; i < ncounters; i++) {
+diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
+index 783890e8d43a2..477b3c5ad089c 100644
+--- a/drivers/gpu/drm/vc4/vc4_crtc.c
++++ b/drivers/gpu/drm/vc4/vc4_crtc.c
+@@ -123,7 +123,7 @@ static bool vc4_crtc_get_scanout_position(struct drm_crtc *crtc,
+ *vpos /= 2;
+
+ /* Use hpos to correct for field offset in interlaced mode. */
+- if (VC4_GET_FIELD(val, SCALER_DISPSTATX_FRAME_COUNT) % 2)
++ if (vc4_hvs_get_fifo_frame_count(dev, vc4_crtc_state->assigned_channel) % 2)
+ *hpos += mode->crtc_htotal / 2;
+ }
+
+diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
+index 4329e09d357c8..801da3e8ebdb8 100644
+--- a/drivers/gpu/drm/vc4/vc4_drv.h
++++ b/drivers/gpu/drm/vc4/vc4_drv.h
+@@ -935,6 +935,7 @@ void vc4_irq_reset(struct drm_device *dev);
+ extern struct platform_driver vc4_hvs_driver;
+ void vc4_hvs_stop_channel(struct drm_device *dev, unsigned int output);
+ int vc4_hvs_get_fifo_from_output(struct drm_device *dev, unsigned int output);
++u8 vc4_hvs_get_fifo_frame_count(struct drm_device *dev, unsigned int fifo);
+ int vc4_hvs_atomic_check(struct drm_crtc *crtc, struct drm_atomic_state *state);
+ void vc4_hvs_atomic_begin(struct drm_crtc *crtc, struct drm_atomic_state *state);
+ void vc4_hvs_atomic_enable(struct drm_crtc *crtc, struct drm_atomic_state *state);
+diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
+index 604933e20e6a2..9d88bfb50c9b0 100644
+--- a/drivers/gpu/drm/vc4/vc4_hvs.c
++++ b/drivers/gpu/drm/vc4/vc4_hvs.c
+@@ -197,6 +197,29 @@ static void vc4_hvs_update_gamma_lut(struct drm_crtc *crtc)
+ vc4_hvs_lut_load(crtc);
+ }
+
++u8 vc4_hvs_get_fifo_frame_count(struct drm_device *dev, unsigned int fifo)
++{
++ struct vc4_dev *vc4 = to_vc4_dev(dev);
++ u8 field = 0;
++
++ switch (fifo) {
++ case 0:
++ field = VC4_GET_FIELD(HVS_READ(SCALER_DISPSTAT1),
++ SCALER_DISPSTAT1_FRCNT0);
++ break;
++ case 1:
++ field = VC4_GET_FIELD(HVS_READ(SCALER_DISPSTAT1),
++ SCALER_DISPSTAT1_FRCNT1);
++ break;
++ case 2:
++ field = VC4_GET_FIELD(HVS_READ(SCALER_DISPSTAT2),
++ SCALER_DISPSTAT2_FRCNT2);
++ break;
++ }
++
++ return field;
++}
++
+ int vc4_hvs_get_fifo_from_output(struct drm_device *dev, unsigned int output)
+ {
+ struct vc4_dev *vc4 = to_vc4_dev(dev);
+@@ -582,6 +605,7 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
+ struct vc4_hvs *hvs = NULL;
+ int ret;
+ u32 dispctrl;
++ u32 reg;
+
+ hvs = devm_kzalloc(&pdev->dev, sizeof(*hvs), GFP_KERNEL);
+ if (!hvs)
+@@ -653,6 +677,26 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
+
+ vc4->hvs = hvs;
+
++ reg = HVS_READ(SCALER_DISPECTRL);
++ reg &= ~SCALER_DISPECTRL_DSP2_MUX_MASK;
++ HVS_WRITE(SCALER_DISPECTRL,
++ reg | VC4_SET_FIELD(0, SCALER_DISPECTRL_DSP2_MUX));
++
++ reg = HVS_READ(SCALER_DISPCTRL);
++ reg &= ~SCALER_DISPCTRL_DSP3_MUX_MASK;
++ HVS_WRITE(SCALER_DISPCTRL,
++ reg | VC4_SET_FIELD(3, SCALER_DISPCTRL_DSP3_MUX));
++
++ reg = HVS_READ(SCALER_DISPEOLN);
++ reg &= ~SCALER_DISPEOLN_DSP4_MUX_MASK;
++ HVS_WRITE(SCALER_DISPEOLN,
++ reg | VC4_SET_FIELD(3, SCALER_DISPEOLN_DSP4_MUX));
++
++ reg = HVS_READ(SCALER_DISPDITHER);
++ reg &= ~SCALER_DISPDITHER_DSP5_MUX_MASK;
++ HVS_WRITE(SCALER_DISPDITHER,
++ reg | VC4_SET_FIELD(3, SCALER_DISPDITHER_DSP5_MUX));
++
+ dispctrl = HVS_READ(SCALER_DISPCTRL);
+
+ dispctrl |= SCALER_DISPCTRL_ENABLE;
+@@ -660,10 +704,6 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
+ SCALER_DISPCTRL_DISPEIRQ(1) |
+ SCALER_DISPCTRL_DISPEIRQ(2);
+
+- /* Set DSP3 (PV1) to use HVS channel 2, which would otherwise
+- * be unused.
+- */
+- dispctrl &= ~SCALER_DISPCTRL_DSP3_MUX_MASK;
+ dispctrl &= ~(SCALER_DISPCTRL_DMAEIRQ |
+ SCALER_DISPCTRL_SLVWREIRQ |
+ SCALER_DISPCTRL_SLVRDEIRQ |
+@@ -677,7 +717,6 @@ static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
+ SCALER_DISPCTRL_DSPEISLUR(1) |
+ SCALER_DISPCTRL_DSPEISLUR(2) |
+ SCALER_DISPCTRL_SCLEIRQ);
+- dispctrl |= VC4_SET_FIELD(2, SCALER_DISPCTRL_DSP3_MUX);
+
+ HVS_WRITE(SCALER_DISPCTRL, dispctrl);
+
+diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
+index 24de29bc1cda4..992d6a2400029 100644
+--- a/drivers/gpu/drm/vc4/vc4_kms.c
++++ b/drivers/gpu/drm/vc4/vc4_kms.c
+@@ -385,9 +385,10 @@ static void vc4_atomic_commit_tail(struct drm_atomic_state *state)
+ }
+
+ if (vc4->hvs->hvs5) {
++ unsigned long state_rate = max(old_hvs_state->core_clock_rate,
++ new_hvs_state->core_clock_rate);
+ unsigned long core_rate = max_t(unsigned long,
+- 500000000,
+- new_hvs_state->core_clock_rate);
++ 500000000, state_rate);
+
+ clk_set_min_rate(hvs->core_clk, core_rate);
+ }
+diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
+index 33410718089ec..bae8c9cd6f7ca 100644
+--- a/drivers/gpu/drm/vc4/vc4_regs.h
++++ b/drivers/gpu/drm/vc4/vc4_regs.h
+@@ -379,8 +379,6 @@
+ # define SCALER_DISPSTATX_MODE_EOF 3
+ # define SCALER_DISPSTATX_FULL BIT(29)
+ # define SCALER_DISPSTATX_EMPTY BIT(28)
+-# define SCALER_DISPSTATX_FRAME_COUNT_MASK VC4_MASK(17, 12)
+-# define SCALER_DISPSTATX_FRAME_COUNT_SHIFT 12
+ # define SCALER_DISPSTATX_LINE_MASK VC4_MASK(11, 0)
+ # define SCALER_DISPSTATX_LINE_SHIFT 0
+
+@@ -403,9 +401,15 @@
+ (x) * (SCALER_DISPBKGND1 - \
+ SCALER_DISPBKGND0))
+ #define SCALER_DISPSTAT1 0x00000058
++# define SCALER_DISPSTAT1_FRCNT0_MASK VC4_MASK(23, 18)
++# define SCALER_DISPSTAT1_FRCNT0_SHIFT 18
++# define SCALER_DISPSTAT1_FRCNT1_MASK VC4_MASK(17, 12)
++# define SCALER_DISPSTAT1_FRCNT1_SHIFT 12
++
+ #define SCALER_DISPSTATX(x) (SCALER_DISPSTAT0 + \
+ (x) * (SCALER_DISPSTAT1 - \
+ SCALER_DISPSTAT0))
++
+ #define SCALER_DISPBASE1 0x0000005c
+ #define SCALER_DISPBASEX(x) (SCALER_DISPBASE0 + \
+ (x) * (SCALER_DISPBASE1 - \
+@@ -415,7 +419,11 @@
+ (x) * (SCALER_DISPCTRL1 - \
+ SCALER_DISPCTRL0))
+ #define SCALER_DISPBKGND2 0x00000064
++
+ #define SCALER_DISPSTAT2 0x00000068
++# define SCALER_DISPSTAT2_FRCNT2_MASK VC4_MASK(17, 12)
++# define SCALER_DISPSTAT2_FRCNT2_SHIFT 12
++
+ #define SCALER_DISPBASE2 0x0000006c
+ #define SCALER_DISPALPHA2 0x00000070
+ #define SCALER_GAMADDR 0x00000078
+diff --git a/drivers/gpu/drm/vc4/vc4_txp.c b/drivers/gpu/drm/vc4/vc4_txp.c
+index 9809ca3e29451..82beb8c159f28 100644
+--- a/drivers/gpu/drm/vc4/vc4_txp.c
++++ b/drivers/gpu/drm/vc4/vc4_txp.c
+@@ -298,12 +298,18 @@ static void vc4_txp_connector_atomic_commit(struct drm_connector *conn,
+ if (WARN_ON(i == ARRAY_SIZE(drm_fmts)))
+ return;
+
+- ctrl = TXP_GO | TXP_VSTART_AT_EOF | TXP_EI |
++ ctrl = TXP_GO | TXP_EI |
+ VC4_SET_FIELD(0xf, TXP_BYTE_ENABLE) |
+ VC4_SET_FIELD(txp_fmts[i], TXP_FORMAT);
+
+ if (fb->format->has_alpha)
+ ctrl |= TXP_ALPHA_ENABLE;
++ else
++ /*
++ * If TXP_ALPHA_ENABLE isn't set and TXP_ALPHA_INVERT is, the
++ * hardware will force the output padding to be 0xff.
++ */
++ ctrl |= TXP_ALPHA_INVERT;
+
+ gem = drm_fb_cma_get_gem_obj(fb, 0);
+ TXP_WRITE(TXP_DST_PTR, gem->paddr + fb->offsets[0]);
+diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c b/drivers/gpu/drm/virtio/virtgpu_display.c
+index 5b00310ac4cd4..f73352e7b8329 100644
+--- a/drivers/gpu/drm/virtio/virtgpu_display.c
++++ b/drivers/gpu/drm/virtio/virtgpu_display.c
+@@ -179,6 +179,8 @@ static int virtio_gpu_conn_get_modes(struct drm_connector *connector)
+ DRM_DEBUG("add mode: %dx%d\n", width, height);
+ mode = drm_cvt_mode(connector->dev, width, height, 60,
+ false, false, false);
++ if (!mode)
++ return count;
+ mode->type |= DRM_MODE_TYPE_PREFERRED;
+ drm_mode_probed_add(connector, mode);
+ count++;
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+index 93431e8f66060..9410152f9d6f1 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+@@ -914,6 +914,15 @@ static int vmw_kms_new_framebuffer_surface(struct vmw_private *dev_priv,
+ * Sanity checks.
+ */
+
++ if (!drm_any_plane_has_format(&dev_priv->drm,
++ mode_cmd->pixel_format,
++ mode_cmd->modifier[0])) {
++ drm_dbg(&dev_priv->drm,
++ "unsupported pixel format %p4cc / modifier 0x%llx\n",
++ &mode_cmd->pixel_format, mode_cmd->modifier[0]);
++ return -EINVAL;
++ }
++
+ /* Surface must be marked as a scanout. */
+ if (unlikely(!surface->metadata.scanout))
+ return -EINVAL;
+@@ -1236,20 +1245,13 @@ static int vmw_kms_new_framebuffer_bo(struct vmw_private *dev_priv,
+ return -EINVAL;
+ }
+
+- /* Limited framebuffer color depth support for screen objects */
+- if (dev_priv->active_display_unit == vmw_du_screen_object) {
+- switch (mode_cmd->pixel_format) {
+- case DRM_FORMAT_XRGB8888:
+- case DRM_FORMAT_ARGB8888:
+- break;
+- case DRM_FORMAT_XRGB1555:
+- case DRM_FORMAT_RGB565:
+- break;
+- default:
+- DRM_ERROR("Invalid pixel format: %p4cc\n",
+- &mode_cmd->pixel_format);
+- return -EINVAL;
+- }
++ if (!drm_any_plane_has_format(&dev_priv->drm,
++ mode_cmd->pixel_format,
++ mode_cmd->modifier[0])) {
++ drm_dbg(&dev_priv->drm,
++ "unsupported pixel format %p4cc / modifier 0x%llx\n",
++ &mode_cmd->pixel_format, mode_cmd->modifier[0]);
++ return -EINVAL;
+ }
+
+ vfbd = kzalloc(sizeof(*vfbd), GFP_KERNEL);
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+index 4d36e85073806..d9ebd02099a68 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+@@ -247,7 +247,6 @@ struct vmw_framebuffer_bo {
+ static const uint32_t __maybe_unused vmw_primary_plane_formats[] = {
+ DRM_FORMAT_XRGB1555,
+ DRM_FORMAT_RGB565,
+- DRM_FORMAT_RGB888,
+ DRM_FORMAT_XRGB8888,
+ DRM_FORMAT_ARGB8888,
+ };
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+index 708899ba21029..6542f14986510 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+@@ -859,22 +859,21 @@ void vmw_query_move_notify(struct ttm_buffer_object *bo,
+ struct ttm_device *bdev = bo->bdev;
+ struct vmw_private *dev_priv;
+
+-
+ dev_priv = container_of(bdev, struct vmw_private, bdev);
+
+ mutex_lock(&dev_priv->binding_mutex);
+
+- dx_query_mob = container_of(bo, struct vmw_buffer_object, base);
+- if (!dx_query_mob || !dx_query_mob->dx_query_ctx) {
+- mutex_unlock(&dev_priv->binding_mutex);
+- return;
+- }
+-
+ /* If BO is being moved from MOB to system memory */
+ if (new_mem->mem_type == TTM_PL_SYSTEM &&
+ old_mem->mem_type == VMW_PL_MOB) {
+ struct vmw_fence_obj *fence;
+
++ dx_query_mob = container_of(bo, struct vmw_buffer_object, base);
++ if (!dx_query_mob || !dx_query_mob->dx_query_ctx) {
++ mutex_unlock(&dev_priv->binding_mutex);
++ return;
++ }
++
+ (void) vmw_query_readback_all(dx_query_mob);
+ mutex_unlock(&dev_priv->binding_mutex);
+
+@@ -888,7 +887,6 @@ void vmw_query_move_notify(struct ttm_buffer_object *bo,
+ (void) ttm_bo_wait(bo, false, false);
+ } else
+ mutex_unlock(&dev_priv->binding_mutex);
+-
+ }
+
+ /**
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
+index 2bf97b6ac9735..e2a9679e32be8 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
+@@ -141,10 +141,10 @@ int amdtp_hid_probe(u32 cur_hid_dev, struct amdtp_cl_data *cli_data)
+
+ hid->driver_data = hid_data;
+ cli_data->hid_sensor_hubs[cur_hid_dev] = hid;
+- hid->bus = BUS_AMD_AMDTP;
++ hid->bus = BUS_AMD_SFH;
+ hid->vendor = AMD_SFH_HID_VENDOR;
+ hid->product = AMD_SFH_HID_PRODUCT;
+- snprintf(hid->name, sizeof(hid->name), "%s %04X:%04X", "hid-amdtp",
++ snprintf(hid->name, sizeof(hid->name), "%s %04X:%04X", "hid-amdsfh",
+ hid->vendor, hid->product);
+
+ rc = hid_add_device(hid);
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_hid.h b/drivers/hid/amd-sfh-hid/amd_sfh_hid.h
+index c60abd38054ca..cb04f47c86483 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_hid.h
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_hid.h
+@@ -12,7 +12,7 @@
+ #define AMDSFH_HID_H
+
+ #define MAX_HID_DEVICES 5
+-#define BUS_AMD_AMDTP 0x20
++#define BUS_AMD_SFH 0x20
+ #define AMD_SFH_HID_VENDOR 0x1022
+ #define AMD_SFH_HID_PRODUCT 0x0001
+
+diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
+index 74ad8bf98bfd5..e8c5e3ac9fff1 100644
+--- a/drivers/hid/hid-bigbenff.c
++++ b/drivers/hid/hid-bigbenff.c
+@@ -347,6 +347,12 @@ static int bigben_probe(struct hid_device *hid,
+ bigben->report = list_entry(report_list->next,
+ struct hid_report, list);
+
++ if (list_empty(&hid->inputs)) {
++ hid_err(hid, "no inputs found\n");
++ error = -ENODEV;
++ goto error_hw_stop;
++ }
++
+ hidinput = list_first_entry(&hid->inputs, struct hid_input, list);
+ set_bit(FF_RUMBLE, hidinput->input->ffbit);
+
+diff --git a/drivers/hid/hid-elan.c b/drivers/hid/hid-elan.c
+index 3091355d48df6..8e4a5528e25df 100644
+--- a/drivers/hid/hid-elan.c
++++ b/drivers/hid/hid-elan.c
+@@ -188,7 +188,6 @@ static int elan_input_configured(struct hid_device *hdev, struct hid_input *hi)
+ ret = input_mt_init_slots(input, ELAN_MAX_FINGERS, INPUT_MT_POINTER);
+ if (ret) {
+ hid_err(hdev, "Failed to init elan MT slots: %d\n", ret);
+- input_free_device(input);
+ return ret;
+ }
+
+@@ -200,7 +199,6 @@ static int elan_input_configured(struct hid_device *hdev, struct hid_input *hi)
+ hid_err(hdev, "Failed to register elan input device: %d\n",
+ ret);
+ input_mt_destroy_slots(input);
+- input_free_device(input);
+ return ret;
+ }
+
+diff --git a/drivers/hid/hid-led.c b/drivers/hid/hid-led.c
+index c2c66ceca1327..7d82f8d426bbc 100644
+--- a/drivers/hid/hid-led.c
++++ b/drivers/hid/hid-led.c
+@@ -366,7 +366,7 @@ static const struct hidled_config hidled_configs[] = {
+ .type = DREAM_CHEEKY,
+ .name = "Dream Cheeky Webmail Notifier",
+ .short_name = "dream_cheeky",
+- .max_brightness = 31,
++ .max_brightness = 63,
+ .num_leds = 1,
+ .report_size = 9,
+ .report_type = RAW_REQUEST,
+diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
+index dc5c35210c16a..20fc8d50a0398 100644
+--- a/drivers/hv/channel.c
++++ b/drivers/hv/channel.c
+@@ -1245,7 +1245,9 @@ u64 vmbus_next_request_id(struct vmbus_channel *channel, u64 rqst_addr)
+
+ /*
+ * Cannot return an ID of 0, which is reserved for an unsolicited
+- * message from Hyper-V.
++ * message from Hyper-V; Hyper-V does not acknowledge (respond to)
++ * VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED requests with ID of
++ * 0 sent by the guest.
+ */
+ return current_id + 1;
+ }
+@@ -1270,7 +1272,7 @@ u64 vmbus_request_addr(struct vmbus_channel *channel, u64 trans_id)
+
+ /* Hyper-V can send an unsolicited message with ID of 0 */
+ if (!trans_id)
+- return trans_id;
++ return VMBUS_RQST_ERROR;
+
+ spin_lock_irqsave(&rqstor->req_lock, flags);
+
+diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
+index c8222354c0056..53e58a9c28ea2 100644
+--- a/drivers/hwmon/peci/dimmtemp.c
++++ b/drivers/hwmon/peci/dimmtemp.c
+@@ -219,7 +219,7 @@ static int check_populated_dimms(struct peci_dimmtemp *priv)
+ int chan_rank_max = priv->gen_info->chan_rank_max;
+ int dimm_idx_max = priv->gen_info->dimm_idx_max;
+ u32 chan_rank_empty = 0;
+- u64 dimm_mask = 0;
++ u32 dimm_mask = 0;
+ int chan_rank, dimm_idx, ret;
+ u32 pcs;
+
+@@ -278,9 +278,9 @@ static int check_populated_dimms(struct peci_dimmtemp *priv)
+ return -EAGAIN;
+ }
+
+- dev_dbg(priv->dev, "Scanned populated DIMMs: %#llx\n", dimm_mask);
++ dev_dbg(priv->dev, "Scanned populated DIMMs: %#x\n", dimm_mask);
+
+- bitmap_from_u64(priv->dimm_mask, dimm_mask);
++ bitmap_from_arr32(priv->dimm_mask, &dimm_mask, DIMM_NUMS_MAX);
+
+ return 0;
+ }
+diff --git a/drivers/hwmon/pmbus/pmbus_core.c b/drivers/hwmon/pmbus/pmbus_core.c
+index d93574d6a1fb6..86429bfa4847c 100644
+--- a/drivers/hwmon/pmbus/pmbus_core.c
++++ b/drivers/hwmon/pmbus/pmbus_core.c
+@@ -2308,6 +2308,21 @@ static int pmbus_init_common(struct i2c_client *client, struct pmbus_data *data,
+ struct device *dev = &client->dev;
+ int page, ret;
+
++ /*
++ * Figure out if PEC is enabled before accessing any other register.
++ * Make sure PEC is disabled, will be enabled later if needed.
++ */
++ client->flags &= ~I2C_CLIENT_PEC;
++
++ /* Enable PEC if the controller and bus supports it */
++ if (!(data->flags & PMBUS_NO_CAPABILITY)) {
++ ret = i2c_smbus_read_byte_data(client, PMBUS_CAPABILITY);
++ if (ret >= 0 && (ret & PB_CAPABILITY_ERROR_CHECK)) {
++ if (i2c_check_functionality(client->adapter, I2C_FUNC_SMBUS_PEC))
++ client->flags |= I2C_CLIENT_PEC;
++ }
++ }
++
+ /*
+ * Some PMBus chips don't support PMBUS_STATUS_WORD, so try
+ * to use PMBUS_STATUS_BYTE instead if that is the case.
+@@ -2326,19 +2341,6 @@ static int pmbus_init_common(struct i2c_client *client, struct pmbus_data *data,
+ data->has_status_word = true;
+ }
+
+- /* Make sure PEC is disabled, will be enabled later if needed */
+- client->flags &= ~I2C_CLIENT_PEC;
+-
+- /* Enable PEC if the controller and bus supports it */
+- if (!(data->flags & PMBUS_NO_CAPABILITY)) {
+- ret = i2c_smbus_read_byte_data(client, PMBUS_CAPABILITY);
+- if (ret >= 0 && (ret & PB_CAPABILITY_ERROR_CHECK)) {
+- if (i2c_check_functionality(client->adapter, I2C_FUNC_SMBUS_PEC)) {
+- client->flags |= I2C_CLIENT_PEC;
+- }
+- }
+- }
+-
+ /*
+ * Check if the chip is write protected. If it is, we can not clear
+ * faults, and we should not try it. Also, in that case, writes into
+@@ -2548,11 +2550,78 @@ static int pmbus_regulator_get_error_flags(struct regulator_dev *rdev, unsigned
+ return 0;
+ }
+
++static int pmbus_regulator_get_voltage(struct regulator_dev *rdev)
++{
++ struct device *dev = rdev_get_dev(rdev);
++ struct i2c_client *client = to_i2c_client(dev->parent);
++ struct pmbus_data *data = i2c_get_clientdata(client);
++ struct pmbus_sensor s = {
++ .page = rdev_get_id(rdev),
++ .class = PSC_VOLTAGE_OUT,
++ .convert = true,
++ };
++
++ s.data = _pmbus_read_word_data(client, s.page, 0xff, PMBUS_READ_VOUT);
++ if (s.data < 0)
++ return s.data;
++
++ return (int)pmbus_reg2data(data, &s) * 1000; /* unit is uV */
++}
++
++static int pmbus_regulator_set_voltage(struct regulator_dev *rdev, int min_uv,
++ int max_uv, unsigned int *selector)
++{
++ struct device *dev = rdev_get_dev(rdev);
++ struct i2c_client *client = to_i2c_client(dev->parent);
++ struct pmbus_data *data = i2c_get_clientdata(client);
++ struct pmbus_sensor s = {
++ .page = rdev_get_id(rdev),
++ .class = PSC_VOLTAGE_OUT,
++ .convert = true,
++ .data = -1,
++ };
++ int val = DIV_ROUND_CLOSEST(min_uv, 1000); /* convert to mV */
++ int low, high;
++
++ *selector = 0;
++
++ if (pmbus_check_word_register(client, s.page, PMBUS_MFR_VOUT_MIN))
++ s.data = _pmbus_read_word_data(client, s.page, 0xff, PMBUS_MFR_VOUT_MIN);
++ if (s.data < 0) {
++ s.data = _pmbus_read_word_data(client, s.page, 0xff, PMBUS_VOUT_MARGIN_LOW);
++ if (s.data < 0)
++ return s.data;
++ }
++ low = pmbus_reg2data(data, &s);
++
++ s.data = -1;
++ if (pmbus_check_word_register(client, s.page, PMBUS_MFR_VOUT_MAX))
++ s.data = _pmbus_read_word_data(client, s.page, 0xff, PMBUS_MFR_VOUT_MAX);
++ if (s.data < 0) {
++ s.data = _pmbus_read_word_data(client, s.page, 0xff, PMBUS_VOUT_MARGIN_HIGH);
++ if (s.data < 0)
++ return s.data;
++ }
++ high = pmbus_reg2data(data, &s);
++
++ /* Make sure we are within margins */
++ if (low > val)
++ val = low;
++ if (high < val)
++ val = high;
++
++ val = pmbus_data2reg(data, &s, val);
++
++ return _pmbus_write_word_data(client, s.page, PMBUS_VOUT_COMMAND, (u16)val);
++}
++
+ const struct regulator_ops pmbus_regulator_ops = {
+ .enable = pmbus_regulator_enable,
+ .disable = pmbus_regulator_disable,
+ .is_enabled = pmbus_regulator_is_enabled,
+ .get_error_flags = pmbus_regulator_get_error_flags,
++ .get_voltage = pmbus_regulator_get_voltage,
++ .set_voltage = pmbus_regulator_set_voltage,
+ };
+ EXPORT_SYMBOL_NS_GPL(pmbus_regulator_ops, PMBUS);
+
+diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
+index af00dca8d1acd..ee6ce92ab4c31 100644
+--- a/drivers/hwtracing/coresight/coresight-core.c
++++ b/drivers/hwtracing/coresight/coresight-core.c
+@@ -1379,7 +1379,7 @@ static int coresight_fixup_device_conns(struct coresight_device *csdev)
+ continue;
+ conn->child_dev =
+ coresight_find_csdev_by_fwnode(conn->child_fwnode);
+- if (conn->child_dev) {
++ if (conn->child_dev && conn->child_dev->has_conns_grp) {
+ ret = coresight_make_links(csdev, conn,
+ conn->child_dev);
+ if (ret)
+@@ -1571,6 +1571,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
+ int nr_refcnts = 1;
+ atomic_t *refcnts = NULL;
+ struct coresight_device *csdev;
++ bool registered = false;
+
+ csdev = kzalloc(sizeof(*csdev), GFP_KERNEL);
+ if (!csdev) {
+@@ -1591,7 +1592,8 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
+ refcnts = kcalloc(nr_refcnts, sizeof(*refcnts), GFP_KERNEL);
+ if (!refcnts) {
+ ret = -ENOMEM;
+- goto err_free_csdev;
++ kfree(csdev);
++ goto err_out;
+ }
+
+ csdev->refcnt = refcnts;
+@@ -1616,6 +1618,13 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
+ csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev));
+ dev_set_name(&csdev->dev, "%s", desc->name);
+
++ /*
++ * Make sure the device registration and the connection fixup
++ * are synchronised, so that we don't see uninitialised devices
++ * on the coresight bus while trying to resolve the connections.
++ */
++ mutex_lock(&coresight_mutex);
++
+ ret = device_register(&csdev->dev);
+ if (ret) {
+ put_device(&csdev->dev);
+@@ -1623,7 +1632,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
+ * All resources are free'd explicitly via
+ * coresight_device_release(), triggered from put_device().
+ */
+- goto err_out;
++ goto out_unlock;
+ }
+
+ if (csdev->type == CORESIGHT_DEV_TYPE_SINK ||
+@@ -1638,11 +1647,11 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
+ * from put_device(), which is in turn called from
+ * function device_unregister().
+ */
+- goto err_out;
++ goto out_unlock;
+ }
+ }
+-
+- mutex_lock(&coresight_mutex);
++ /* Device is now registered */
++ registered = true;
+
+ ret = coresight_create_conns_sysfs_group(csdev);
+ if (!ret)
+@@ -1652,16 +1661,18 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
+ if (!ret && cti_assoc_ops && cti_assoc_ops->add)
+ cti_assoc_ops->add(csdev);
+
++out_unlock:
+ mutex_unlock(&coresight_mutex);
+- if (ret) {
++ /* Success */
++ if (!ret)
++ return csdev;
++
++ /* Unregister the device if needed */
++ if (registered) {
+ coresight_unregister(csdev);
+ return ERR_PTR(ret);
+ }
+
+- return csdev;
+-
+-err_free_csdev:
+- kfree(csdev);
+ err_out:
+ /* Cleanup the connection information */
+ coresight_release_platform_data(NULL, desc->pdata);
+diff --git a/drivers/i2c/busses/i2c-at91-master.c b/drivers/i2c/busses/i2c-at91-master.c
+index b0eae94909f44..c0c35785a0dc4 100644
+--- a/drivers/i2c/busses/i2c-at91-master.c
++++ b/drivers/i2c/busses/i2c-at91-master.c
+@@ -656,6 +656,7 @@ static int at91_twi_xfer(struct i2c_adapter *adap, struct i2c_msg *msg, int num)
+ unsigned int_addr_flag = 0;
+ struct i2c_msg *m_start = msg;
+ bool is_read;
++ u8 *dma_buf = NULL;
+
+ dev_dbg(&adap->dev, "at91_xfer: processing %d messages:\n", num);
+
+@@ -703,7 +704,17 @@ static int at91_twi_xfer(struct i2c_adapter *adap, struct i2c_msg *msg, int num)
+ dev->msg = m_start;
+ dev->recv_len_abort = false;
+
++ if (dev->use_dma) {
++ dma_buf = i2c_get_dma_safe_msg_buf(m_start, 1);
++ if (!dma_buf) {
++ ret = -ENOMEM;
++ goto out;
++ }
++ dev->buf = dma_buf;
++ }
++
+ ret = at91_do_twi_transfer(dev);
++ i2c_put_dma_safe_msg_buf(dma_buf, m_start, !ret);
+
+ ret = (ret < 0) ? ret : num;
+ out:
+diff --git a/drivers/i2c/busses/i2c-npcm7xx.c b/drivers/i2c/busses/i2c-npcm7xx.c
+index 71aad029425d8..c638f2efb97c7 100644
+--- a/drivers/i2c/busses/i2c-npcm7xx.c
++++ b/drivers/i2c/busses/i2c-npcm7xx.c
+@@ -359,14 +359,14 @@ static int npcm_i2c_get_SCL(struct i2c_adapter *_adap)
+ {
+ struct npcm_i2c *bus = container_of(_adap, struct npcm_i2c, adap);
+
+- return !!(I2CCTL3_SCL_LVL & ioread32(bus->reg + NPCM_I2CCTL3));
++ return !!(I2CCTL3_SCL_LVL & ioread8(bus->reg + NPCM_I2CCTL3));
+ }
+
+ static int npcm_i2c_get_SDA(struct i2c_adapter *_adap)
+ {
+ struct npcm_i2c *bus = container_of(_adap, struct npcm_i2c, adap);
+
+- return !!(I2CCTL3_SDA_LVL & ioread32(bus->reg + NPCM_I2CCTL3));
++ return !!(I2CCTL3_SDA_LVL & ioread8(bus->reg + NPCM_I2CCTL3));
+ }
+
+ static inline u16 npcm_i2c_get_index(struct npcm_i2c *bus)
+@@ -563,6 +563,15 @@ static inline void npcm_i2c_nack(struct npcm_i2c *bus)
+ iowrite8(val, bus->reg + NPCM_I2CCTL1);
+ }
+
++static inline void npcm_i2c_clear_master_status(struct npcm_i2c *bus)
++{
++ u8 val;
++
++ /* Clear NEGACK, STASTR and BER bits */
++ val = NPCM_I2CST_BER | NPCM_I2CST_NEGACK | NPCM_I2CST_STASTR;
++ iowrite8(val, bus->reg + NPCM_I2CST);
++}
++
+ #if IS_ENABLED(CONFIG_I2C_SLAVE)
+ static void npcm_i2c_slave_int_enable(struct npcm_i2c *bus, bool enable)
+ {
+@@ -642,8 +651,8 @@ static void npcm_i2c_reset(struct npcm_i2c *bus)
+ iowrite8(NPCM_I2CCST_BB, bus->reg + NPCM_I2CCST);
+ iowrite8(0xFF, bus->reg + NPCM_I2CST);
+
+- /* Clear EOB bit */
+- iowrite8(NPCM_I2CCST3_EO_BUSY, bus->reg + NPCM_I2CCST3);
++ /* Clear and disable EOB */
++ npcm_i2c_eob_int(bus, false);
+
+ /* Clear all fifo bits: */
+ iowrite8(NPCM_I2CFIF_CTS_CLR_FIFO, bus->reg + NPCM_I2CFIF_CTS);
+@@ -655,6 +664,9 @@ static void npcm_i2c_reset(struct npcm_i2c *bus)
+ }
+ #endif
+
++ /* clear status bits for spurious interrupts */
++ npcm_i2c_clear_master_status(bus);
++
+ bus->state = I2C_IDLE;
+ }
+
+@@ -815,15 +827,6 @@ static void npcm_i2c_read_fifo(struct npcm_i2c *bus, u8 bytes_in_fifo)
+ }
+ }
+
+-static inline void npcm_i2c_clear_master_status(struct npcm_i2c *bus)
+-{
+- u8 val;
+-
+- /* Clear NEGACK, STASTR and BER bits */
+- val = NPCM_I2CST_BER | NPCM_I2CST_NEGACK | NPCM_I2CST_STASTR;
+- iowrite8(val, bus->reg + NPCM_I2CST);
+-}
+-
+ static void npcm_i2c_master_abort(struct npcm_i2c *bus)
+ {
+ /* Only current master is allowed to issue a stop condition */
+@@ -1231,7 +1234,16 @@ static irqreturn_t npcm_i2c_int_slave_handler(struct npcm_i2c *bus)
+ ret = IRQ_HANDLED;
+ } /* SDAST */
+
+- return ret;
++ /*
++ * if irq is not one of the above, make sure EOB is disabled and all
++ * status bits are cleared.
++ */
++ if (ret == IRQ_NONE) {
++ npcm_i2c_eob_int(bus, false);
++ npcm_i2c_clear_master_status(bus);
++ }
++
++ return IRQ_HANDLED;
+ }
+
+ static int npcm_i2c_reg_slave(struct i2c_client *client)
+@@ -1467,6 +1479,9 @@ static void npcm_i2c_irq_handle_nack(struct npcm_i2c *bus)
+ npcm_i2c_eob_int(bus, false);
+ npcm_i2c_master_stop(bus);
+
++ /* Clear SDA Status bit (by reading dummy byte) */
++ npcm_i2c_rd_byte(bus);
++
+ /*
+ * The bus is released from stall only after the SW clears
+ * NEGACK bit. Then a Stop condition is sent.
+@@ -1474,6 +1489,8 @@ static void npcm_i2c_irq_handle_nack(struct npcm_i2c *bus)
+ npcm_i2c_clear_master_status(bus);
+ readx_poll_timeout_atomic(ioread8, bus->reg + NPCM_I2CCST, val,
+ !(val & NPCM_I2CCST_BUSY), 10, 200);
++ /* verify no status bits are still set after bus is released */
++ npcm_i2c_clear_master_status(bus);
+ }
+ bus->state = I2C_IDLE;
+
+@@ -1672,10 +1689,10 @@ static int npcm_i2c_recovery_tgclk(struct i2c_adapter *_adap)
+ int iter = 27;
+
+ if ((npcm_i2c_get_SDA(_adap) == 1) && (npcm_i2c_get_SCL(_adap) == 1)) {
+- dev_dbg(bus->dev, "bus%d recovery skipped, bus not stuck",
+- bus->num);
++ dev_dbg(bus->dev, "bus%d-0x%x recovery skipped, bus not stuck",
++ bus->num, bus->dest_addr);
+ npcm_i2c_reset(bus);
+- return status;
++ return 0;
+ }
+
+ npcm_i2c_int_enable(bus, false);
+@@ -1909,6 +1926,7 @@ static int npcm_i2c_init_module(struct npcm_i2c *bus, enum i2c_mode mode,
+ bus_freq_hz < I2C_FREQ_MIN_HZ || bus_freq_hz > I2C_FREQ_MAX_HZ)
+ return -EINVAL;
+
++ npcm_i2c_int_enable(bus, false);
+ npcm_i2c_disable(bus);
+
+ /* Configure FIFO mode : */
+@@ -1937,10 +1955,17 @@ static int npcm_i2c_init_module(struct npcm_i2c *bus, enum i2c_mode mode,
+ val = (val | NPCM_I2CCTL1_NMINTE) & ~NPCM_I2CCTL1_RWS;
+ iowrite8(val, bus->reg + NPCM_I2CCTL1);
+
+- npcm_i2c_int_enable(bus, true);
+-
+ npcm_i2c_reset(bus);
+
++ /* check HW is OK: SDA and SCL should be high at this point. */
++ if ((npcm_i2c_get_SDA(&bus->adap) == 0) || (npcm_i2c_get_SCL(&bus->adap) == 0)) {
++ dev_err(bus->dev, "I2C%d init fail: lines are low\n", bus->num);
++ dev_err(bus->dev, "SDA=%d SCL=%d\n", npcm_i2c_get_SDA(&bus->adap),
++ npcm_i2c_get_SCL(&bus->adap));
++ return -ENXIO;
++ }
++
++ npcm_i2c_int_enable(bus, true);
+ return 0;
+ }
+
+@@ -1988,10 +2013,14 @@ static irqreturn_t npcm_i2c_bus_irq(int irq, void *dev_id)
+ #if IS_ENABLED(CONFIG_I2C_SLAVE)
+ if (bus->slave) {
+ bus->master_or_slave = I2C_SLAVE;
+- return npcm_i2c_int_slave_handler(bus);
++ if (npcm_i2c_int_slave_handler(bus))
++ return IRQ_HANDLED;
+ }
+ #endif
+- return IRQ_NONE;
++ /* clear status bits for spurious interrupts */
++ npcm_i2c_clear_master_status(bus);
++
++ return IRQ_HANDLED;
+ }
+
+ static bool npcm_i2c_master_start_xmit(struct npcm_i2c *bus,
+@@ -2047,8 +2076,7 @@ static int npcm_i2c_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
+ u16 nwrite, nread;
+ u8 *write_data, *read_data;
+ u8 slave_addr;
+- int timeout;
+- int ret = 0;
++ unsigned long timeout;
+ bool read_block = false;
+ bool read_PEC = false;
+ u8 bus_busy;
+@@ -2099,13 +2127,13 @@ static int npcm_i2c_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
+ * 9: bits per transaction (including the ack/nack)
+ */
+ timeout_usec = (2 * 9 * USEC_PER_SEC / bus->bus_freq) * (2 + nread + nwrite);
+- timeout = max(msecs_to_jiffies(35), usecs_to_jiffies(timeout_usec));
++ timeout = max_t(unsigned long, bus->adap.timeout, usecs_to_jiffies(timeout_usec));
+ if (nwrite >= 32 * 1024 || nread >= 32 * 1024) {
+ dev_err(bus->dev, "i2c%d buffer too big\n", bus->num);
+ return -EINVAL;
+ }
+
+- time_left = jiffies + msecs_to_jiffies(DEFAULT_STALL_COUNT) + 1;
++ time_left = jiffies + timeout + 1;
+ do {
+ /*
+ * we must clear slave address immediately when the bus is not
+@@ -2138,12 +2166,12 @@ static int npcm_i2c_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
+ bus->read_block_use = read_block;
+
+ reinit_completion(&bus->cmd_complete);
+- if (!npcm_i2c_master_start_xmit(bus, slave_addr, nwrite, nread,
+- write_data, read_data, read_PEC,
+- read_block))
+- ret = -EBUSY;
+
+- if (ret != -EBUSY) {
++ npcm_i2c_int_enable(bus, true);
++
++ if (npcm_i2c_master_start_xmit(bus, slave_addr, nwrite, nread,
++ write_data, read_data, read_PEC,
++ read_block)) {
+ time_left = wait_for_completion_timeout(&bus->cmd_complete,
+ timeout);
+
+@@ -2157,26 +2185,31 @@ static int npcm_i2c_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs,
+ }
+ }
+ }
+- ret = bus->cmd_err;
+
+ /* if there was BER, check if need to recover the bus: */
+ if (bus->cmd_err == -EAGAIN)
+- ret = i2c_recover_bus(adap);
++ bus->cmd_err = i2c_recover_bus(adap);
+
+ /*
+ * After any type of error, check if LAST bit is still set,
+ * due to a HW issue.
+ * It cannot be cleared without resetting the module.
+ */
+- if (bus->cmd_err &&
+- (NPCM_I2CRXF_CTL_LAST_PEC & ioread8(bus->reg + NPCM_I2CRXF_CTL)))
++ else if (bus->cmd_err &&
++ (NPCM_I2CRXF_CTL_LAST_PEC & ioread8(bus->reg + NPCM_I2CRXF_CTL)))
+ npcm_i2c_reset(bus);
+
++ /* after any xfer, successful or not, stall and EOB must be disabled */
++ npcm_i2c_stall_after_start(bus, false);
++ npcm_i2c_eob_int(bus, false);
++
+ #if IS_ENABLED(CONFIG_I2C_SLAVE)
+ /* reenable slave if it was enabled */
+ if (bus->slave)
+ iowrite8((bus->slave->addr & 0x7F) | NPCM_I2CADDR_SAEN,
+ bus->reg + NPCM_I2CADDR1);
++#else
++ npcm_i2c_int_enable(bus, false);
+ #endif
+ return bus->cmd_err;
+ }
+@@ -2269,7 +2302,7 @@ static int npcm_i2c_probe_bus(struct platform_device *pdev)
+ adap = &bus->adap;
+ adap->owner = THIS_MODULE;
+ adap->retries = 3;
+- adap->timeout = HZ;
++ adap->timeout = msecs_to_jiffies(35);
+ adap->algo = &npcm_i2c_algo;
+ adap->quirks = &npcm_i2c_quirks;
+ adap->algo_data = bus;
+diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c
+index 0db3d75590662..0064c632af5cd 100644
+--- a/drivers/i2c/busses/i2c-rcar.c
++++ b/drivers/i2c/busses/i2c-rcar.c
+@@ -1063,8 +1063,10 @@ static int rcar_i2c_probe(struct platform_device *pdev)
+ pm_runtime_enable(dev);
+ pm_runtime_get_sync(dev);
+ ret = rcar_i2c_clock_calculate(priv);
+- if (ret < 0)
+- goto out_pm_put;
++ if (ret < 0) {
++ pm_runtime_put(dev);
++ goto out_pm_disable;
++ }
+
+ rcar_i2c_write(priv, ICSAR, 0); /* Gen2: must be 0 if not using slave */
+
+@@ -1093,19 +1095,19 @@ static int rcar_i2c_probe(struct platform_device *pdev)
+
+ ret = platform_get_irq(pdev, 0);
+ if (ret < 0)
+- goto out_pm_disable;
++ goto out_pm_put;
+ priv->irq = ret;
+ ret = devm_request_irq(dev, priv->irq, irqhandler, irqflags, dev_name(dev), priv);
+ if (ret < 0) {
+ dev_err(dev, "cannot get irq %d\n", priv->irq);
+- goto out_pm_disable;
++ goto out_pm_put;
+ }
+
+ platform_set_drvdata(pdev, priv);
+
+ ret = i2c_add_numbered_adapter(adap);
+ if (ret < 0)
+- goto out_pm_disable;
++ goto out_pm_put;
+
+ if (priv->flags & ID_P_HOST_NOTIFY) {
+ priv->host_notify_client = i2c_new_slave_host_notify_device(adap);
+@@ -1122,7 +1124,8 @@ static int rcar_i2c_probe(struct platform_device *pdev)
+ out_del_device:
+ i2c_del_adapter(&priv->adap);
+ out_pm_put:
+- pm_runtime_put(dev);
++ if (priv->flags & ID_P_PM_BLOCKED)
++ pm_runtime_put(dev);
+ out_pm_disable:
+ pm_runtime_disable(dev);
+ return ret;
+diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
+index 1783a6ea5427b..3ebdd42fec362 100644
+--- a/drivers/infiniband/hw/hfi1/file_ops.c
++++ b/drivers/infiniband/hw/hfi1/file_ops.c
+@@ -265,6 +265,8 @@ static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
+ unsigned long dim = from->nr_segs;
+ int idx;
+
++ if (!HFI1_CAP_IS_KSET(SDMA))
++ return -EINVAL;
+ idx = srcu_read_lock(&fd->pq_srcu);
+ pq = srcu_dereference(fd->pq, &fd->pq_srcu);
+ if (!cq || !pq) {
+diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
+index 4436ed41547c4..436372b314312 100644
+--- a/drivers/infiniband/hw/hfi1/init.c
++++ b/drivers/infiniband/hw/hfi1/init.c
+@@ -489,7 +489,7 @@ void set_link_ipg(struct hfi1_pportdata *ppd)
+ u16 shift, mult;
+ u64 src;
+ u32 current_egress_rate; /* Mbits /sec */
+- u32 max_pkt_time;
++ u64 max_pkt_time;
+ /*
+ * max_pkt_time is the maximum packet egress time in units
+ * of the fabric clock period 1/(805 MHz).
+diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
+index f07d328689d3d..a95b654f52540 100644
+--- a/drivers/infiniband/hw/hfi1/sdma.c
++++ b/drivers/infiniband/hw/hfi1/sdma.c
+@@ -1288,11 +1288,13 @@ void sdma_clean(struct hfi1_devdata *dd, size_t num_engines)
+ kvfree(sde->tx_ring);
+ sde->tx_ring = NULL;
+ }
+- spin_lock_irq(&dd->sde_map_lock);
+- sdma_map_free(rcu_access_pointer(dd->sdma_map));
+- RCU_INIT_POINTER(dd->sdma_map, NULL);
+- spin_unlock_irq(&dd->sde_map_lock);
+- synchronize_rcu();
++ if (rcu_access_pointer(dd->sdma_map)) {
++ spin_lock_irq(&dd->sde_map_lock);
++ sdma_map_free(rcu_access_pointer(dd->sdma_map));
++ RCU_INIT_POINTER(dd->sdma_map, NULL);
++ spin_unlock_irq(&dd->sde_map_lock);
++ synchronize_rcu();
++ }
+ kfree(dd->per_sdma);
+ dd->per_sdma = NULL;
+
+diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
+index 3083d6db1d688..2d690bf8ac16e 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_device.h
++++ b/drivers/infiniband/hw/hns/hns_roce_device.h
+@@ -535,6 +535,11 @@ struct hns_roce_cmd_context {
+ u16 busy;
+ };
+
++enum hns_roce_cmdq_state {
++ HNS_ROCE_CMDQ_STATE_NORMAL,
++ HNS_ROCE_CMDQ_STATE_FATAL_ERR,
++};
++
+ struct hns_roce_cmdq {
+ struct dma_pool *pool;
+ struct semaphore poll_sem;
+@@ -554,6 +559,7 @@ struct hns_roce_cmdq {
+ * close device, switch into poll mode(non event mode)
+ */
+ u8 use_events;
++ enum hns_roce_cmdq_state state;
+ };
+
+ struct hns_roce_cmd_mailbox {
+@@ -725,7 +731,6 @@ struct hns_roce_caps {
+ u32 num_pi_qps;
+ u32 reserved_qps;
+ int num_qpc_timer;
+- int num_cqc_timer;
+ u32 num_srqs;
+ u32 max_wqes;
+ u32 max_srq_wrs;
+diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+index 2b0cef17ad452..86f6a4aae1e5a 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
++++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+@@ -1265,6 +1265,16 @@ static int hns_roce_cmq_csq_done(struct hns_roce_dev *hr_dev)
+ return tail == priv->cmq.csq.head;
+ }
+
++static void update_cmdq_status(struct hns_roce_dev *hr_dev)
++{
++ struct hns_roce_v2_priv *priv = hr_dev->priv;
++ struct hnae3_handle *handle = priv->handle;
++
++ if (handle->rinfo.reset_state == HNS_ROCE_STATE_RST_INIT ||
++ handle->rinfo.instance_state == HNS_ROCE_STATE_INIT)
++ hr_dev->cmd.state = HNS_ROCE_CMDQ_STATE_FATAL_ERR;
++}
++
+ static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
+ struct hns_roce_cmq_desc *desc, int num)
+ {
+@@ -1318,6 +1328,8 @@ static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
+ csq->head, tail);
+ csq->head = tail;
+
++ update_cmdq_status(hr_dev);
++
+ ret = -EAGAIN;
+ }
+
+@@ -1332,6 +1344,9 @@ static int hns_roce_cmq_send(struct hns_roce_dev *hr_dev,
+ bool busy;
+ int ret;
+
++ if (hr_dev->cmd.state == HNS_ROCE_CMDQ_STATE_FATAL_ERR)
++ return -EIO;
++
+ if (!v2_chk_mbox_is_avail(hr_dev, &busy))
+ return busy ? -EBUSY : 0;
+
+@@ -1528,6 +1543,9 @@ static void hns_roce_function_clear(struct hns_roce_dev *hr_dev)
+ {
+ int i;
+
++ if (hr_dev->cmd.state == HNS_ROCE_CMDQ_STATE_FATAL_ERR)
++ return;
++
+ for (i = hr_dev->func_num - 1; i >= 0; i--) {
+ __hns_roce_function_clear(hr_dev, i);
+ if (i != 0)
+@@ -1947,7 +1965,7 @@ static void set_default_caps(struct hns_roce_dev *hr_dev)
+ caps->num_mtpts = HNS_ROCE_V2_MAX_MTPT_NUM;
+ caps->num_pds = HNS_ROCE_V2_MAX_PD_NUM;
+ caps->num_qpc_timer = HNS_ROCE_V2_MAX_QPC_TIMER_NUM;
+- caps->num_cqc_timer = HNS_ROCE_V2_MAX_CQC_TIMER_NUM;
++ caps->cqc_timer_bt_num = HNS_ROCE_V2_MAX_CQC_TIMER_BT_NUM;
+
+ caps->max_qp_init_rdma = HNS_ROCE_V2_MAX_QP_INIT_RDMA;
+ caps->max_qp_dest_rdma = HNS_ROCE_V2_MAX_QP_DEST_RDMA;
+@@ -2243,7 +2261,6 @@ static int hns_roce_query_pf_caps(struct hns_roce_dev *hr_dev)
+ caps->max_rq_sg = roundup_pow_of_two(caps->max_rq_sg);
+ caps->max_extend_sg = le32_to_cpu(resp_a->max_extend_sg);
+ caps->num_qpc_timer = le16_to_cpu(resp_a->num_qpc_timer);
+- caps->num_cqc_timer = le16_to_cpu(resp_a->num_cqc_timer);
+ caps->max_srq_sges = le16_to_cpu(resp_a->max_srq_sges);
+ caps->max_srq_sges = roundup_pow_of_two(caps->max_srq_sges);
+ caps->num_aeq_vectors = resp_a->num_aeq_vectors;
+@@ -3000,6 +3017,9 @@ static int v2_wait_mbox_complete(struct hns_roce_dev *hr_dev, u32 timeout,
+ mb_st = (struct hns_roce_mbox_status *)desc.data;
+ end = msecs_to_jiffies(timeout) + jiffies;
+ while (v2_chk_mbox_is_avail(hr_dev, &busy)) {
++ if (hr_dev->cmd.state == HNS_ROCE_CMDQ_STATE_FATAL_ERR)
++ return -EIO;
++
+ status = 0;
+ hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_QUERY_MB_ST,
+ true);
+diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+index 0d87b627601e9..9cbb230de03bd 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
++++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+@@ -41,7 +41,7 @@
+ #define HNS_ROCE_V2_MAX_SRQ_WR 0x8000
+ #define HNS_ROCE_V2_MAX_SRQ_SGE 64
+ #define HNS_ROCE_V2_MAX_CQ_NUM 0x100000
+-#define HNS_ROCE_V2_MAX_CQC_TIMER_NUM 0x100
++#define HNS_ROCE_V2_MAX_CQC_TIMER_BT_NUM 0x100
+ #define HNS_ROCE_V2_MAX_SRQ_NUM 0x100000
+ #define HNS_ROCE_V2_MAX_CQE_NUM 0x400000
+ #define HNS_ROCE_V2_MAX_RQ_SGE_NUM 64
+diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
+index f73ba619f3756..c8af4ebd7cbd3 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_main.c
++++ b/drivers/infiniband/hw/hns/hns_roce_main.c
+@@ -737,7 +737,7 @@ static int hns_roce_init_hem(struct hns_roce_dev *hr_dev)
+ ret = hns_roce_init_hem_table(hr_dev, &hr_dev->cqc_timer_table,
+ HEM_TYPE_CQC_TIMER,
+ hr_dev->caps.cqc_timer_entry_sz,
+- hr_dev->caps.num_cqc_timer, 1);
++ hr_dev->caps.cqc_timer_bt_num, 1);
+ if (ret) {
+ dev_err(dev,
+ "Failed to init CQC timer memory, aborting.\n");
+diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
+index 8ef112f883a77..3acab569fbb94 100644
+--- a/drivers/infiniband/sw/rdmavt/qp.c
++++ b/drivers/infiniband/sw/rdmavt/qp.c
+@@ -2775,7 +2775,7 @@ void rvt_qp_iter(struct rvt_dev_info *rdi,
+ EXPORT_SYMBOL(rvt_qp_iter);
+
+ /*
+- * This should be called with s_lock held.
++ * This should be called with s_lock and r_lock held.
+ */
+ void rvt_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
+ enum ib_wc_status status)
+@@ -3134,7 +3134,9 @@ send_comp:
+ rvp->n_loop_pkts++;
+ flush_send:
+ sqp->s_rnr_retry = sqp->s_rnr_retry_cnt;
++ spin_lock(&sqp->r_lock);
+ rvt_send_complete(sqp, wqe, send_status);
++ spin_unlock(&sqp->r_lock);
+ if (local_ops) {
+ atomic_dec(&sqp->local_ops_pending);
+ local_ops = 0;
+@@ -3188,7 +3190,9 @@ serr:
+ spin_unlock_irqrestore(&qp->r_lock, flags);
+ serr_no_r_lock:
+ spin_lock_irqsave(&sqp->s_lock, flags);
++ spin_lock(&sqp->r_lock);
+ rvt_send_complete(sqp, wqe, send_status);
++ spin_unlock(&sqp->r_lock);
+ if (sqp->ibqp.qp_type == IB_QPT_RC) {
+ int lastwqe;
+
+diff --git a/drivers/infiniband/sw/rxe/rxe_mcast.c b/drivers/infiniband/sw/rxe/rxe_mcast.c
+index 873a9b10307c0..86cc2e18a7fda 100644
+--- a/drivers/infiniband/sw/rxe/rxe_mcast.c
++++ b/drivers/infiniband/sw/rxe/rxe_mcast.c
+@@ -206,8 +206,10 @@ static struct rxe_mcg *rxe_get_mcg(struct rxe_dev *rxe, union ib_gid *mgid)
+
+ /* speculative alloc of new mcg */
+ mcg = kzalloc(sizeof(*mcg), GFP_KERNEL);
+- if (!mcg)
+- return ERR_PTR(-ENOMEM);
++ if (!mcg) {
++ err = -ENOMEM;
++ goto err_dec;
++ }
+
+ spin_lock_bh(&rxe->mcg_lock);
+ /* re-check to see if someone else just added it */
+diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
+index ae5fbc79dd5c6..8a1cff80a68e7 100644
+--- a/drivers/infiniband/sw/rxe/rxe_req.c
++++ b/drivers/infiniband/sw/rxe/rxe_req.c
+@@ -661,7 +661,7 @@ next_wqe:
+ opcode = next_opcode(qp, wqe, wqe->wr.opcode);
+ if (unlikely(opcode < 0)) {
+ wqe->status = IB_WC_LOC_QP_OP_ERR;
+- goto exit;
++ goto err;
+ }
+
+ mask = rxe_opcode[opcode].mask;
+diff --git a/drivers/input/keyboard/gpio_keys.c b/drivers/input/keyboard/gpio_keys.c
+index d75a8b179a8ae..a5dc4ab87fa1f 100644
+--- a/drivers/input/keyboard/gpio_keys.c
++++ b/drivers/input/keyboard/gpio_keys.c
+@@ -131,7 +131,7 @@ static void gpio_keys_quiesce_key(void *data)
+
+ if (!bdata->gpiod)
+ hrtimer_cancel(&bdata->release_timer);
+- if (bdata->debounce_use_hrtimer)
++ else if (bdata->debounce_use_hrtimer)
+ hrtimer_cancel(&bdata->debounce_timer);
+ else
+ cancel_delayed_work_sync(&bdata->work);
+diff --git a/drivers/input/misc/sparcspkr.c b/drivers/input/misc/sparcspkr.c
+index fe43e5557ed72..cdcb7737c46aa 100644
+--- a/drivers/input/misc/sparcspkr.c
++++ b/drivers/input/misc/sparcspkr.c
+@@ -205,6 +205,7 @@ static int bbc_beep_probe(struct platform_device *op)
+
+ info = &state->u.bbc;
+ info->clock_freq = of_getintprop_default(dp, "clock-frequency", 0);
++ of_node_put(dp);
+ if (!info->clock_freq)
+ goto out_free;
+
+diff --git a/drivers/input/touchscreen/stmfts.c b/drivers/input/touchscreen/stmfts.c
+index 72e0b767e1ba4..c175d44c52f37 100644
+--- a/drivers/input/touchscreen/stmfts.c
++++ b/drivers/input/touchscreen/stmfts.c
+@@ -337,13 +337,15 @@ static int stmfts_input_open(struct input_dev *dev)
+ struct stmfts_data *sdata = input_get_drvdata(dev);
+ int err;
+
+- err = pm_runtime_get_sync(&sdata->client->dev);
+- if (err < 0)
+- goto out;
++ err = pm_runtime_resume_and_get(&sdata->client->dev);
++ if (err)
++ return err;
+
+ err = i2c_smbus_write_byte(sdata->client, STMFTS_MS_MT_SENSE_ON);
+- if (err)
+- goto out;
++ if (err) {
++ pm_runtime_put_sync(&sdata->client->dev);
++ return err;
++ }
+
+ mutex_lock(&sdata->mutex);
+ sdata->running = true;
+@@ -366,9 +368,7 @@ static int stmfts_input_open(struct input_dev *dev)
+ "failed to enable touchkey\n");
+ }
+
+-out:
+- pm_runtime_put_noidle(&sdata->client->dev);
+- return err;
++ return 0;
+ }
+
+ static void stmfts_input_close(struct input_dev *dev)
+diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
+index b4a798c7b347f..d8060503ba51b 100644
+--- a/drivers/iommu/amd/init.c
++++ b/drivers/iommu/amd/init.c
+@@ -84,7 +84,7 @@
+ #define ACPI_DEVFLAG_LINT1 0x80
+ #define ACPI_DEVFLAG_ATSDIS 0x10000000
+
+-#define LOOP_TIMEOUT 100000
++#define LOOP_TIMEOUT 2000000
+ /*
+ * ACPI table definitions
+ *
+diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
+index a1ada7bff44e6..079694f894b85 100644
+--- a/drivers/iommu/amd/iommu.c
++++ b/drivers/iommu/amd/iommu.c
+@@ -1838,17 +1838,10 @@ void amd_iommu_domain_update(struct protection_domain *domain)
+ amd_iommu_domain_flush_complete(domain);
+ }
+
+-static void __init amd_iommu_init_dma_ops(void)
+-{
+- swiotlb = (iommu_default_passthrough() || sme_me_mask) ? 1 : 0;
+-}
+-
+ int __init amd_iommu_init_api(void)
+ {
+ int err;
+
+- amd_iommu_init_dma_ops();
+-
+ err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
+ if (err)
+ return err;
+diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
+index e56b137ceabd1..afb3efd565b78 100644
+--- a/drivers/iommu/amd/iommu_v2.c
++++ b/drivers/iommu/amd/iommu_v2.c
+@@ -956,6 +956,7 @@ static void __exit amd_iommu_v2_exit(void)
+ {
+ struct device_state *dev_state, *next;
+ unsigned long flags;
++ LIST_HEAD(freelist);
+
+ if (!amd_iommu_v2_supported())
+ return;
+@@ -975,11 +976,20 @@ static void __exit amd_iommu_v2_exit(void)
+
+ put_device_state(dev_state);
+ list_del(&dev_state->list);
+- free_device_state(dev_state);
++ list_add_tail(&dev_state->list, &freelist);
+ }
+
+ spin_unlock_irqrestore(&state_lock, flags);
+
++ /*
++ * Since free_device_state waits on the count to be zero,
++ * we need to free dev_state outside the spinlock.
++ */
++ list_for_each_entry_safe(dev_state, next, &freelist, list) {
++ list_del(&dev_state->list);
++ free_device_state(dev_state);
++ }
++
+ destroy_workqueue(iommu_wq);
+ }
+
+diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+index c623dae1e1154..1ef7bbb4acf30 100644
+--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
++++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+@@ -6,6 +6,7 @@
+ #include <linux/mm.h>
+ #include <linux/mmu_context.h>
+ #include <linux/mmu_notifier.h>
++#include <linux/sched/mm.h>
+ #include <linux/slab.h>
+
+ #include "arm-smmu-v3.h"
+@@ -96,9 +97,14 @@ static struct arm_smmu_ctx_desc *arm_smmu_alloc_shared_cd(struct mm_struct *mm)
+ struct arm_smmu_ctx_desc *cd;
+ struct arm_smmu_ctx_desc *ret = NULL;
+
++ /* Don't free the mm until we release the ASID */
++ mmgrab(mm);
++
+ asid = arm64_mm_context_get(mm);
+- if (!asid)
+- return ERR_PTR(-ESRCH);
++ if (!asid) {
++ err = -ESRCH;
++ goto out_drop_mm;
++ }
+
+ cd = kzalloc(sizeof(*cd), GFP_KERNEL);
+ if (!cd) {
+@@ -165,6 +171,8 @@ out_free_cd:
+ kfree(cd);
+ out_put_context:
+ arm64_mm_context_put(mm);
++out_drop_mm:
++ mmdrop(mm);
+ return err < 0 ? ERR_PTR(err) : ret;
+ }
+
+@@ -173,6 +181,7 @@ static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
+ if (arm_smmu_free_asid(cd)) {
+ /* Unpin ASID */
+ arm64_mm_context_put(cd->mm);
++ mmdrop(cd->mm);
+ kfree(cd);
+ }
+ }
+diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
+index 09f6e1c0f9c07..2932281e93fc4 100644
+--- a/drivers/iommu/dma-iommu.c
++++ b/drivers/iommu/dma-iommu.c
+@@ -776,6 +776,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
+ unsigned int count, min_size, alloc_sizes = domain->pgsize_bitmap;
+ struct page **pages;
+ dma_addr_t iova;
++ ssize_t ret;
+
+ if (static_branch_unlikely(&iommu_deferred_attach_enabled) &&
+ iommu_deferred_attach(dev, domain))
+@@ -813,8 +814,8 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
+ arch_dma_prep_coherent(sg_page(sg), sg->length);
+ }
+
+- if (iommu_map_sg_atomic(domain, iova, sgt->sgl, sgt->orig_nents, ioprot)
+- < size)
++ ret = iommu_map_sg_atomic(domain, iova, sgt->sgl, sgt->orig_nents, ioprot);
++ if (ret < 0 || ret < size)
+ goto out_free_sg;
+
+ sgt->sgl->dma_address = iova;
+@@ -1209,7 +1210,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
+ * implementation - it knows better than we do.
+ */
+ ret = iommu_map_sg_atomic(domain, iova, sg, nents, prot);
+- if (ret < iova_len)
++ if (ret < 0 || ret < iova_len)
+ goto out_free_iova;
+
+ return __finalise_sg(dev, sg, nents, iova);
+diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
+index 0ea47e17b379e..ba9a63cac47cc 100644
+--- a/drivers/iommu/intel/iommu.c
++++ b/drivers/iommu/intel/iommu.c
+@@ -5031,7 +5031,7 @@ static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
+ ver = (dev->device >> 8) & 0xff;
+ if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
+ ver != 0x4e && ver != 0x8a && ver != 0x98 &&
+- ver != 0x9a)
++ ver != 0x9a && ver != 0xa7)
+ return;
+
+ if (risky_device(dev))
+diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
+index 50f57624610f6..a16e0fe57cd8d 100644
+--- a/drivers/iommu/msm_iommu.c
++++ b/drivers/iommu/msm_iommu.c
+@@ -610,16 +610,19 @@ static void insert_iommu_master(struct device *dev,
+ static int qcom_iommu_of_xlate(struct device *dev,
+ struct of_phandle_args *spec)
+ {
+- struct msm_iommu_dev *iommu;
++ struct msm_iommu_dev *iommu = NULL, *iter;
+ unsigned long flags;
+ int ret = 0;
+
+ spin_lock_irqsave(&msm_iommu_lock, flags);
+- list_for_each_entry(iommu, &qcom_iommu_devices, dev_node)
+- if (iommu->dev->of_node == spec->np)
++ list_for_each_entry(iter, &qcom_iommu_devices, dev_node) {
++ if (iter->dev->of_node == spec->np) {
++ iommu = iter;
+ break;
++ }
++ }
+
+- if (!iommu || iommu->dev->of_node != spec->np) {
++ if (!iommu) {
+ ret = -ENODEV;
+ goto fail;
+ }
+diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
+index 6fd75a60abd67..1a31f4707222a 100644
+--- a/drivers/iommu/mtk_iommu.c
++++ b/drivers/iommu/mtk_iommu.c
+@@ -446,7 +446,7 @@ static void mtk_iommu_domain_free(struct iommu_domain *domain)
+ static int mtk_iommu_attach_device(struct iommu_domain *domain,
+ struct device *dev)
+ {
+- struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
++ struct mtk_iommu_data *data = dev_iommu_priv_get(dev), *frstdata;
+ struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+ struct device *m4udev = data->dev;
+ int ret, domid;
+@@ -456,20 +456,24 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
+ return domid;
+
+ if (!dom->data) {
+- if (mtk_iommu_domain_finalise(dom, data, domid))
++ /* Data is in the frstdata in sharing pgtable case. */
++ frstdata = mtk_iommu_get_m4u_data();
++
++ if (mtk_iommu_domain_finalise(dom, frstdata, domid))
+ return -ENODEV;
+ dom->data = data;
+ }
+
++ mutex_lock(&data->mutex);
+ if (!data->m4u_dom) { /* Initialize the M4U HW */
+ ret = pm_runtime_resume_and_get(m4udev);
+ if (ret < 0)
+- return ret;
++ goto err_unlock;
+
+ ret = mtk_iommu_hw_init(data);
+ if (ret) {
+ pm_runtime_put(m4udev);
+- return ret;
++ goto err_unlock;
+ }
+ data->m4u_dom = dom;
+ writel(dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK,
+@@ -477,9 +481,14 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
+
+ pm_runtime_put(m4udev);
+ }
++ mutex_unlock(&data->mutex);
+
+ mtk_iommu_config(data, dev, true, domid);
+ return 0;
++
++err_unlock:
++ mutex_unlock(&data->mutex);
++ return ret;
+ }
+
+ static void mtk_iommu_detach_device(struct iommu_domain *domain,
+@@ -572,6 +581,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
+ * All the ports in each a device should be in the same larbs.
+ */
+ larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
++ if (larbid >= MTK_LARB_NR_MAX)
++ return ERR_PTR(-EINVAL);
++
+ for (i = 1; i < fwspec->num_ids; i++) {
+ larbidx = MTK_M4U_TO_LARB(fwspec->ids[i]);
+ if (larbid != larbidx) {
+@@ -581,6 +593,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
+ }
+ }
+ larbdev = data->larb_imu[larbid].dev;
++ if (!larbdev)
++ return ERR_PTR(-EINVAL);
++
+ link = device_link_add(dev, larbdev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+ if (!link)
+@@ -619,6 +634,7 @@ static struct iommu_group *mtk_iommu_device_group(struct device *dev)
+ if (domid < 0)
+ return ERR_PTR(domid);
+
++ mutex_lock(&data->mutex);
+ group = data->m4u_group[domid];
+ if (!group) {
+ group = iommu_group_alloc();
+@@ -627,6 +643,7 @@ static struct iommu_group *mtk_iommu_device_group(struct device *dev)
+ } else {
+ iommu_group_ref_get(group);
+ }
++ mutex_unlock(&data->mutex);
+ return group;
+ }
+
+@@ -907,6 +924,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
+ }
+
+ platform_set_drvdata(pdev, data);
++ mutex_init(&data->mutex);
+
+ ret = iommu_device_sysfs_add(&data->iommu, dev, NULL,
+ "mtk-iommu.%pa", &ioaddr);
+@@ -952,10 +970,8 @@ static int mtk_iommu_remove(struct platform_device *pdev)
+ iommu_device_sysfs_remove(&data->iommu);
+ iommu_device_unregister(&data->iommu);
+
+- if (iommu_present(&platform_bus_type))
+- bus_set_iommu(&platform_bus_type, NULL);
++ list_del(&data->list);
+
+- clk_disable_unprepare(data->bclk);
+ device_link_remove(data->smicomm_dev, &pdev->dev);
+ pm_runtime_disable(&pdev->dev);
+ devm_free_irq(&pdev->dev, data->irq, data);
+diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
+index b742432220c5f..5e8da947affc8 100644
+--- a/drivers/iommu/mtk_iommu.h
++++ b/drivers/iommu/mtk_iommu.h
+@@ -80,6 +80,8 @@ struct mtk_iommu_data {
+
+ struct dma_iommu_mapping *mapping; /* For mtk_iommu_v1.c */
+
++ struct mutex mutex; /* Protect m4u_group/m4u_dom above */
++
+ struct list_head list;
+ struct mtk_smi_larb_iommu larb_imu[MTK_LARB_NR_MAX];
+ };
+diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
+index ecff800656e6b..74563f689fbd9 100644
+--- a/drivers/iommu/mtk_iommu_v1.c
++++ b/drivers/iommu/mtk_iommu_v1.c
+@@ -80,6 +80,7 @@
+ /* MTK generation one iommu HW only support 4K size mapping */
+ #define MT2701_IOMMU_PAGE_SHIFT 12
+ #define MT2701_IOMMU_PAGE_SIZE (1UL << MT2701_IOMMU_PAGE_SHIFT)
++#define MT2701_LARB_NR_MAX 3
+
+ /*
+ * MTK m4u support 4GB iova address space, and only support 4K page
+@@ -457,6 +458,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
+
+ /* Link the consumer device with the smi-larb device(supplier) */
+ larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
++ if (larbid >= MT2701_LARB_NR_MAX)
++ return ERR_PTR(-EINVAL);
++
+ for (idx = 1; idx < fwspec->num_ids; idx++) {
+ larbidx = mt2701_m4u_to_larb(fwspec->ids[idx]);
+ if (larbid != larbidx) {
+@@ -467,6 +471,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
+ }
+
+ larbdev = data->larb_imu[larbid].dev;
++ if (!larbdev)
++ return ERR_PTR(-EINVAL);
++
+ link = device_link_add(dev, larbdev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+ if (!link)
+diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c
+index 5b8d571c041dc..1120084cba09d 100644
+--- a/drivers/irqchip/irq-armada-370-xp.c
++++ b/drivers/irqchip/irq-armada-370-xp.c
+@@ -308,7 +308,16 @@ static inline int armada_370_xp_msi_init(struct device_node *node,
+
+ static void armada_xp_mpic_perf_init(void)
+ {
+- unsigned long cpuid = cpu_logical_map(smp_processor_id());
++ unsigned long cpuid;
++
++ /*
++ * This Performance Counter Overflow interrupt is specific for
++ * Armada 370 and XP. It is not available on Armada 375, 38x and 39x.
++ */
++ if (!of_machine_is_compatible("marvell,armada-370-xp"))
++ return;
++
++ cpuid = cpu_logical_map(smp_processor_id());
+
+ /* Enable Performance Counter Overflow interrupts */
+ writel(ARMADA_370_XP_INT_CAUSE_PERF(cpuid),
+diff --git a/drivers/irqchip/irq-aspeed-i2c-ic.c b/drivers/irqchip/irq-aspeed-i2c-ic.c
+index a47db16ff9603..9c9fc3e2967ed 100644
+--- a/drivers/irqchip/irq-aspeed-i2c-ic.c
++++ b/drivers/irqchip/irq-aspeed-i2c-ic.c
+@@ -77,8 +77,8 @@ static int __init aspeed_i2c_ic_of_init(struct device_node *node,
+ }
+
+ i2c_ic->parent_irq = irq_of_parse_and_map(node, 0);
+- if (i2c_ic->parent_irq < 0) {
+- ret = i2c_ic->parent_irq;
++ if (!i2c_ic->parent_irq) {
++ ret = -EINVAL;
+ goto err_iounmap;
+ }
+
+diff --git a/drivers/irqchip/irq-aspeed-scu-ic.c b/drivers/irqchip/irq-aspeed-scu-ic.c
+index 18b77c3e6db4b..279e92cf0b16b 100644
+--- a/drivers/irqchip/irq-aspeed-scu-ic.c
++++ b/drivers/irqchip/irq-aspeed-scu-ic.c
+@@ -157,8 +157,8 @@ static int aspeed_scu_ic_of_init_common(struct aspeed_scu_ic *scu_ic,
+ }
+
+ irq = irq_of_parse_and_map(node, 0);
+- if (irq < 0) {
+- rc = irq;
++ if (!irq) {
++ rc = -EINVAL;
+ goto err;
+ }
+
+diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
+index b252d5534547c..1af2b50f36f3e 100644
+--- a/drivers/irqchip/irq-gic-v3.c
++++ b/drivers/irqchip/irq-gic-v3.c
+@@ -556,7 +556,8 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
+
+ static void gic_eoi_irq(struct irq_data *d)
+ {
+- gic_write_eoir(gic_irq(d));
++ write_gicreg(gic_irq(d), ICC_EOIR1_EL1);
++ isb();
+ }
+
+ static void gic_eoimode1_eoi_irq(struct irq_data *d)
+@@ -640,82 +641,101 @@ static void gic_deactivate_unhandled(u32 irqnr)
+ if (irqnr < 8192)
+ gic_write_dir(irqnr);
+ } else {
+- gic_write_eoir(irqnr);
++ write_gicreg(irqnr, ICC_EOIR1_EL1);
++ isb();
+ }
+ }
+
+-static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
++/*
++ * Follow a read of the IAR with any HW maintenance that needs to happen prior
++ * to invoking the relevant IRQ handler. We must do two things:
++ *
++ * (1) Ensure instruction ordering between a read of IAR and subsequent
++ * instructions in the IRQ handler using an ISB.
++ *
++ * It is possible for the IAR to report an IRQ which was signalled *after*
++ * the CPU took an IRQ exception as multiple interrupts can race to be
++ * recognized by the GIC, earlier interrupts could be withdrawn, and/or
++ * later interrupts could be prioritized by the GIC.
++ *
++ * For devices which are tightly coupled to the CPU, such as PMUs, a
++ * context synchronization event is necessary to ensure that system
++ * register state is not stale, as these may have been indirectly written
++ * *after* exception entry.
++ *
++ * (2) Deactivate the interrupt when EOI mode 1 is in use.
++ */
++static inline void gic_complete_ack(u32 irqnr)
+ {
+- bool irqs_enabled = interrupts_enabled(regs);
+- int err;
+-
+- if (irqs_enabled)
+- nmi_enter();
+-
+ if (static_branch_likely(&supports_deactivate_key))
+- gic_write_eoir(irqnr);
+- /*
+- * Leave the PSR.I bit set to prevent other NMIs to be
+- * received while handling this one.
+- * PSR.I will be restored when we ERET to the
+- * interrupted context.
+- */
+- err = generic_handle_domain_nmi(gic_data.domain, irqnr);
+- if (err)
+- gic_deactivate_unhandled(irqnr);
++ write_gicreg(irqnr, ICC_EOIR1_EL1);
+
+- if (irqs_enabled)
+- nmi_exit();
++ isb();
+ }
+
+-static u32 do_read_iar(struct pt_regs *regs)
++static bool gic_rpr_is_nmi_prio(void)
+ {
+- u32 iar;
++ if (!gic_supports_nmi())
++ return false;
+
+- if (gic_supports_nmi() && unlikely(!interrupts_enabled(regs))) {
+- u64 pmr;
++ return unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI));
++}
+
+- /*
+- * We were in a context with IRQs disabled. However, the
+- * entry code has set PMR to a value that allows any
+- * interrupt to be acknowledged, and not just NMIs. This can
+- * lead to surprising effects if the NMI has been retired in
+- * the meantime, and that there is an IRQ pending. The IRQ
+- * would then be taken in NMI context, something that nobody
+- * wants to debug twice.
+- *
+- * Until we sort this, drop PMR again to a level that will
+- * actually only allow NMIs before reading IAR, and then
+- * restore it to what it was.
+- */
+- pmr = gic_read_pmr();
+- gic_pmr_mask_irqs();
+- isb();
++static bool gic_irqnr_is_special(u32 irqnr)
++{
++ return irqnr >= 1020 && irqnr <= 1023;
++}
+
+- iar = gic_read_iar();
++static void __gic_handle_irq(u32 irqnr, struct pt_regs *regs)
++{
++ if (gic_irqnr_is_special(irqnr))
++ return;
+
+- gic_write_pmr(pmr);
+- } else {
+- iar = gic_read_iar();
++ gic_complete_ack(irqnr);
++
++ if (generic_handle_domain_irq(gic_data.domain, irqnr)) {
++ WARN_ONCE(true, "Unexpected interrupt (irqnr %u)\n", irqnr);
++ gic_deactivate_unhandled(irqnr);
+ }
++}
++
++static void __gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
++{
++ if (gic_irqnr_is_special(irqnr))
++ return;
++
++ gic_complete_ack(irqnr);
+
+- return iar;
++ if (generic_handle_domain_nmi(gic_data.domain, irqnr)) {
++ WARN_ONCE(true, "Unexpected pseudo-NMI (irqnr %u)\n", irqnr);
++ gic_deactivate_unhandled(irqnr);
++ }
+ }
+
+-static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
++/*
++ * An exception has been taken from a context with IRQs enabled, and this could
++ * be an IRQ or an NMI.
++ *
++ * The entry code called us with DAIF.IF set to keep NMIs masked. We must clear
++ * DAIF.IF (and update ICC_PMR_EL1 to mask regular IRQs) prior to returning,
++ * after handling any NMI but before handling any IRQ.
++ *
++ * The entry code has performed IRQ entry, and if an NMI is detected we must
++ * perform NMI entry/exit around invoking the handler.
++ */
++static void __gic_handle_irq_from_irqson(struct pt_regs *regs)
+ {
++ bool is_nmi;
+ u32 irqnr;
+
+- irqnr = do_read_iar(regs);
++ irqnr = gic_read_iar();
+
+- /* Check for special IDs first */
+- if ((irqnr >= 1020 && irqnr <= 1023))
+- return;
++ is_nmi = gic_rpr_is_nmi_prio();
+
+- if (gic_supports_nmi() &&
+- unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) {
+- gic_handle_nmi(irqnr, regs);
+- return;
++ if (is_nmi) {
++ nmi_enter();
++ __gic_handle_nmi(irqnr, regs);
++ nmi_exit();
+ }
+
+ if (gic_prio_masking_enabled()) {
+@@ -723,15 +743,52 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
+ gic_arch_enable_irqs();
+ }
+
+- if (static_branch_likely(&supports_deactivate_key))
+- gic_write_eoir(irqnr);
+- else
+- isb();
++ if (!is_nmi)
++ __gic_handle_irq(irqnr, regs);
++}
+
+- if (generic_handle_domain_irq(gic_data.domain, irqnr)) {
+- WARN_ONCE(true, "Unexpected interrupt received!\n");
+- gic_deactivate_unhandled(irqnr);
+- }
++/*
++ * An exception has been taken from a context with IRQs disabled, which can only
++ * be an NMI.
++ *
++ * The entry code called us with DAIF.IF set to keep NMIs masked. We must leave
++ * DAIF.IF (and ICC_PMR_EL1) unchanged.
++ *
++ * The entry code has performed NMI entry.
++ */
++static void __gic_handle_irq_from_irqsoff(struct pt_regs *regs)
++{
++ u64 pmr;
++ u32 irqnr;
++
++ /*
++ * We were in a context with IRQs disabled. However, the
++ * entry code has set PMR to a value that allows any
++ * interrupt to be acknowledged, and not just NMIs. This can
++ * lead to surprising effects if the NMI has been retired in
++ * the meantime, and that there is an IRQ pending. The IRQ
++ * would then be taken in NMI context, something that nobody
++ * wants to debug twice.
++ *
++ * Until we sort this, drop PMR again to a level that will
++ * actually only allow NMIs before reading IAR, and then
++ * restore it to what it was.
++ */
++ pmr = gic_read_pmr();
++ gic_pmr_mask_irqs();
++ isb();
++ irqnr = gic_read_iar();
++ gic_write_pmr(pmr);
++
++ __gic_handle_nmi(irqnr, regs);
++}
++
++static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
++{
++ if (unlikely(gic_supports_nmi() && !interrupts_enabled(regs)))
++ __gic_handle_irq_from_irqsoff(regs);
++ else
++ __gic_handle_irq_from_irqson(regs);
+ }
+
+ static u32 gic_get_pribits(void)
+diff --git a/drivers/irqchip/irq-sni-exiu.c b/drivers/irqchip/irq-sni-exiu.c
+index abd011fcecf4a..c7db617e1a2f6 100644
+--- a/drivers/irqchip/irq-sni-exiu.c
++++ b/drivers/irqchip/irq-sni-exiu.c
+@@ -37,11 +37,26 @@ struct exiu_irq_data {
+ u32 spi_base;
+ };
+
+-static void exiu_irq_eoi(struct irq_data *d)
++static void exiu_irq_ack(struct irq_data *d)
+ {
+ struct exiu_irq_data *data = irq_data_get_irq_chip_data(d);
+
+ writel(BIT(d->hwirq), data->base + EIREQCLR);
++}
++
++static void exiu_irq_eoi(struct irq_data *d)
++{
++ struct exiu_irq_data *data = irq_data_get_irq_chip_data(d);
++
++ /*
++ * Level triggered interrupts are latched and must be cleared during
++ * EOI or the interrupt will be jammed on. Of course if a level
++ * triggered interrupt is still asserted then the write will not clear
++ * the interrupt.
++ */
++ if (irqd_is_level_type(d))
++ writel(BIT(d->hwirq), data->base + EIREQCLR);
++
+ irq_chip_eoi_parent(d);
+ }
+
+@@ -91,10 +106,13 @@ static int exiu_irq_set_type(struct irq_data *d, unsigned int type)
+ writel_relaxed(val, data->base + EILVL);
+
+ val = readl_relaxed(data->base + EIEDG);
+- if (type == IRQ_TYPE_LEVEL_LOW || type == IRQ_TYPE_LEVEL_HIGH)
++ if (type == IRQ_TYPE_LEVEL_LOW || type == IRQ_TYPE_LEVEL_HIGH) {
+ val &= ~BIT(d->hwirq);
+- else
++ irq_set_handler_locked(d, handle_fasteoi_irq);
++ } else {
+ val |= BIT(d->hwirq);
++ irq_set_handler_locked(d, handle_fasteoi_ack_irq);
++ }
+ writel_relaxed(val, data->base + EIEDG);
+
+ writel_relaxed(BIT(d->hwirq), data->base + EIREQCLR);
+@@ -104,6 +122,7 @@ static int exiu_irq_set_type(struct irq_data *d, unsigned int type)
+
+ static struct irq_chip exiu_irq_chip = {
+ .name = "EXIU",
++ .irq_ack = exiu_irq_ack,
+ .irq_eoi = exiu_irq_eoi,
+ .irq_enable = exiu_irq_enable,
+ .irq_mask = exiu_irq_mask,
+diff --git a/drivers/irqchip/irq-xtensa-mx.c b/drivers/irqchip/irq-xtensa-mx.c
+index 27933338f7b36..8c581c985aa7d 100644
+--- a/drivers/irqchip/irq-xtensa-mx.c
++++ b/drivers/irqchip/irq-xtensa-mx.c
+@@ -151,14 +151,25 @@ static struct irq_chip xtensa_mx_irq_chip = {
+ .irq_set_affinity = xtensa_mx_irq_set_affinity,
+ };
+
++static void __init xtensa_mx_init_common(struct irq_domain *root_domain)
++{
++ unsigned int i;
++
++ irq_set_default_host(root_domain);
++ secondary_init_irq();
++
++ /* Initialize default IRQ routing to CPU 0 */
++ for (i = 0; i < XCHAL_NUM_EXTINTERRUPTS; ++i)
++ set_er(1, MIROUT(i));
++}
++
+ int __init xtensa_mx_init_legacy(struct device_node *interrupt_parent)
+ {
+ struct irq_domain *root_domain =
+ irq_domain_add_legacy(NULL, NR_IRQS - 1, 1, 0,
+ &xtensa_mx_irq_domain_ops,
+ &xtensa_mx_irq_chip);
+- irq_set_default_host(root_domain);
+- secondary_init_irq();
++ xtensa_mx_init_common(root_domain);
+ return 0;
+ }
+
+@@ -168,8 +179,7 @@ static int __init xtensa_mx_init(struct device_node *np,
+ struct irq_domain *root_domain =
+ irq_domain_add_linear(np, NR_IRQS, &xtensa_mx_irq_domain_ops,
+ &xtensa_mx_irq_chip);
+- irq_set_default_host(root_domain);
+- secondary_init_irq();
++ xtensa_mx_init_common(root_domain);
+ return 0;
+ }
+ IRQCHIP_DECLARE(xtensa_mx_irq_chip, "cdns,xtensa-mx", xtensa_mx_init);
+diff --git a/drivers/macintosh/Kconfig b/drivers/macintosh/Kconfig
+index 5cdc361da37cb..539a2ed4e13dc 100644
+--- a/drivers/macintosh/Kconfig
++++ b/drivers/macintosh/Kconfig
+@@ -44,6 +44,7 @@ config ADB_IOP
+ config ADB_CUDA
+ bool "Support for Cuda/Egret based Macs and PowerMacs"
+ depends on (ADB || PPC_PMAC) && !PPC_PMAC64
++ select RTC_LIB
+ help
+ This provides support for Cuda/Egret based Macintosh and
+ Power Macintosh systems. This includes most m68k based Macs,
+@@ -57,6 +58,7 @@ config ADB_CUDA
+ config ADB_PMU
+ bool "Support for PMU based PowerMacs and PowerBooks"
+ depends on PPC_PMAC || MAC
++ select RTC_LIB
+ help
+ On PowerBooks, iBooks, and recent iMacs and Power Macintoshes, the
+ PMU is an embedded microprocessor whose primary function is to
+@@ -67,6 +69,10 @@ config ADB_PMU
+ this device; you should do so if your machine is one of those
+ mentioned above.
+
++config ADB_PMU_EVENT
++ def_bool y
++ depends on ADB_PMU && INPUT=y
++
+ config ADB_PMU_LED
+ bool "Support for the Power/iBook front LED"
+ depends on PPC_PMAC && ADB_PMU
+diff --git a/drivers/macintosh/Makefile b/drivers/macintosh/Makefile
+index 49819b1b6f201..712edcb3e0b08 100644
+--- a/drivers/macintosh/Makefile
++++ b/drivers/macintosh/Makefile
+@@ -12,7 +12,8 @@ obj-$(CONFIG_MAC_EMUMOUSEBTN) += mac_hid.o
+ obj-$(CONFIG_INPUT_ADBHID) += adbhid.o
+ obj-$(CONFIG_ANSLCD) += ans-lcd.o
+
+-obj-$(CONFIG_ADB_PMU) += via-pmu.o via-pmu-event.o
++obj-$(CONFIG_ADB_PMU) += via-pmu.o
++obj-$(CONFIG_ADB_PMU_EVENT) += via-pmu-event.o
+ obj-$(CONFIG_ADB_PMU_LED) += via-pmu-led.o
+ obj-$(CONFIG_PMAC_BACKLIGHT) += via-pmu-backlight.o
+ obj-$(CONFIG_ADB_CUDA) += via-cuda.o
+diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
+index 4b98bc26a94b5..2109129ea1bbf 100644
+--- a/drivers/macintosh/via-pmu.c
++++ b/drivers/macintosh/via-pmu.c
+@@ -1459,7 +1459,7 @@ next:
+ pmu_pass_intr(data, len);
+ /* len == 6 is probably a bad check. But how do I
+ * know what PMU versions send what events here? */
+- if (len == 6) {
++ if (IS_ENABLED(CONFIG_ADB_PMU_EVENT) && len == 6) {
+ via_pmu_event(PMU_EVT_POWER, !!(data[1]&8));
+ via_pmu_event(PMU_EVT_LID, data[1]&1);
+ }
+diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
+index 3e7d4b20ab34f..4229b9b5da98f 100644
+--- a/drivers/mailbox/mailbox.c
++++ b/drivers/mailbox/mailbox.c
+@@ -82,11 +82,11 @@ static void msg_submit(struct mbox_chan *chan)
+ exit:
+ spin_unlock_irqrestore(&chan->lock, flags);
+
+- /* kick start the timer immediately to avoid delays */
+ if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
+- /* but only if not already active */
+- if (!hrtimer_active(&chan->mbox->poll_hrt))
+- hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
++ /* kick start the timer immediately to avoid delays */
++ spin_lock_irqsave(&chan->mbox->poll_hrt_lock, flags);
++ hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
++ spin_unlock_irqrestore(&chan->mbox->poll_hrt_lock, flags);
+ }
+ }
+
+@@ -120,20 +120,26 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
+ container_of(hrtimer, struct mbox_controller, poll_hrt);
+ bool txdone, resched = false;
+ int i;
++ unsigned long flags;
+
+ for (i = 0; i < mbox->num_chans; i++) {
+ struct mbox_chan *chan = &mbox->chans[i];
+
+ if (chan->active_req && chan->cl) {
+- resched = true;
+ txdone = chan->mbox->ops->last_tx_done(chan);
+ if (txdone)
+ tx_tick(chan, 0);
++ else
++ resched = true;
+ }
+ }
+
+ if (resched) {
+- hrtimer_forward_now(hrtimer, ms_to_ktime(mbox->txpoll_period));
++ spin_lock_irqsave(&mbox->poll_hrt_lock, flags);
++ if (!hrtimer_is_queued(hrtimer))
++ hrtimer_forward_now(hrtimer, ms_to_ktime(mbox->txpoll_period));
++ spin_unlock_irqrestore(&mbox->poll_hrt_lock, flags);
++
+ return HRTIMER_RESTART;
+ }
+ return HRTIMER_NORESTART;
+@@ -500,6 +506,7 @@ int mbox_controller_register(struct mbox_controller *mbox)
+ hrtimer_init(&mbox->poll_hrt, CLOCK_MONOTONIC,
+ HRTIMER_MODE_REL);
+ mbox->poll_hrt.function = txdone_hrtimer;
++ spin_lock_init(&mbox->poll_hrt_lock);
+ }
+
+ for (i = 0; i < mbox->num_chans; i++) {
+diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
+index ed18936b8ce68..ebfa33a40fceb 100644
+--- a/drivers/mailbox/pcc.c
++++ b/drivers/mailbox/pcc.c
+@@ -654,7 +654,7 @@ static int pcc_mbox_probe(struct platform_device *pdev)
+ goto err;
+ }
+
+- pcc_mbox_ctrl = devm_kmalloc(dev, sizeof(*pcc_mbox_ctrl), GFP_KERNEL);
++ pcc_mbox_ctrl = devm_kzalloc(dev, sizeof(*pcc_mbox_ctrl), GFP_KERNEL);
+ if (!pcc_mbox_ctrl) {
+ rc = -ENOMEM;
+ goto err;
+diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
+index ad9f16689419d..2362bb8ef6d19 100644
+--- a/drivers/md/bcache/btree.c
++++ b/drivers/md/bcache/btree.c
+@@ -2006,8 +2006,7 @@ int bch_btree_check(struct cache_set *c)
+ int i;
+ struct bkey *k = NULL;
+ struct btree_iter iter;
+- struct btree_check_state *check_state;
+- char name[32];
++ struct btree_check_state check_state;
+
+ /* check and mark root node keys */
+ for_each_key_filter(&c->root->keys, k, &iter, bch_ptr_invalid)
+@@ -2018,63 +2017,58 @@ int bch_btree_check(struct cache_set *c)
+ if (c->root->level == 0)
+ return 0;
+
+- check_state = kzalloc(sizeof(struct btree_check_state), GFP_KERNEL);
+- if (!check_state)
+- return -ENOMEM;
+-
+- check_state->c = c;
+- check_state->total_threads = bch_btree_chkthread_nr();
+- check_state->key_idx = 0;
+- spin_lock_init(&check_state->idx_lock);
+- atomic_set(&check_state->started, 0);
+- atomic_set(&check_state->enough, 0);
+- init_waitqueue_head(&check_state->wait);
++ check_state.c = c;
++ check_state.total_threads = bch_btree_chkthread_nr();
++ check_state.key_idx = 0;
++ spin_lock_init(&check_state.idx_lock);
++ atomic_set(&check_state.started, 0);
++ atomic_set(&check_state.enough, 0);
++ init_waitqueue_head(&check_state.wait);
+
++ rw_lock(0, c->root, c->root->level);
+ /*
+ * Run multiple threads to check btree nodes in parallel,
+- * if check_state->enough is non-zero, it means current
++ * if check_state.enough is non-zero, it means current
+ * running check threads are enough, unncessary to create
+ * more.
+ */
+- for (i = 0; i < check_state->total_threads; i++) {
+- /* fetch latest check_state->enough earlier */
++ for (i = 0; i < check_state.total_threads; i++) {
++ /* fetch latest check_state.enough earlier */
+ smp_mb__before_atomic();
+- if (atomic_read(&check_state->enough))
++ if (atomic_read(&check_state.enough))
+ break;
+
+- check_state->infos[i].result = 0;
+- check_state->infos[i].state = check_state;
+- snprintf(name, sizeof(name), "bch_btrchk[%u]", i);
+- atomic_inc(&check_state->started);
++ check_state.infos[i].result = 0;
++ check_state.infos[i].state = &check_state;
+
+- check_state->infos[i].thread =
++ check_state.infos[i].thread =
+ kthread_run(bch_btree_check_thread,
+- &check_state->infos[i],
+- name);
+- if (IS_ERR(check_state->infos[i].thread)) {
++ &check_state.infos[i],
++ "bch_btrchk[%d]", i);
++ if (IS_ERR(check_state.infos[i].thread)) {
+ pr_err("fails to run thread bch_btrchk[%d]\n", i);
+ for (--i; i >= 0; i--)
+- kthread_stop(check_state->infos[i].thread);
++ kthread_stop(check_state.infos[i].thread);
+ ret = -ENOMEM;
+ goto out;
+ }
++ atomic_inc(&check_state.started);
+ }
+
+ /*
+ * Must wait for all threads to stop.
+ */
+- wait_event_interruptible(check_state->wait,
+- atomic_read(&check_state->started) == 0);
++ wait_event(check_state.wait, atomic_read(&check_state.started) == 0);
+
+- for (i = 0; i < check_state->total_threads; i++) {
+- if (check_state->infos[i].result) {
+- ret = check_state->infos[i].result;
++ for (i = 0; i < check_state.total_threads; i++) {
++ if (check_state.infos[i].result) {
++ ret = check_state.infos[i].result;
+ goto out;
+ }
+ }
+
+ out:
+- kfree(check_state);
++ rw_unlock(0, c->root);
+ return ret;
+ }
+
+diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h
+index 50482107134f1..1b5fdbc0d83eb 100644
+--- a/drivers/md/bcache/btree.h
++++ b/drivers/md/bcache/btree.h
+@@ -226,7 +226,7 @@ struct btree_check_info {
+ int result;
+ };
+
+-#define BCH_BTR_CHKTHREAD_MAX 64
++#define BCH_BTR_CHKTHREAD_MAX 12
+ struct btree_check_state {
+ struct cache_set *c;
+ int total_threads;
+diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
+index df5347ea450b5..e5da469a42357 100644
+--- a/drivers/md/bcache/journal.c
++++ b/drivers/md/bcache/journal.c
+@@ -405,6 +405,11 @@ err:
+ return ret;
+ }
+
++void bch_journal_space_reserve(struct journal *j)
++{
++ j->do_reserve = true;
++}
++
+ /* Journalling */
+
+ static void btree_flush_write(struct cache_set *c)
+@@ -621,12 +626,30 @@ static void do_journal_discard(struct cache *ca)
+ }
+ }
+
++static unsigned int free_journal_buckets(struct cache_set *c)
++{
++ struct journal *j = &c->journal;
++ struct cache *ca = c->cache;
++ struct journal_device *ja = &c->cache->journal;
++ unsigned int n;
++
++ /* In case njournal_buckets is not power of 2 */
++ if (ja->cur_idx >= ja->discard_idx)
++ n = ca->sb.njournal_buckets + ja->discard_idx - ja->cur_idx;
++ else
++ n = ja->discard_idx - ja->cur_idx;
++
++ if (n > (1 + j->do_reserve))
++ return n - (1 + j->do_reserve);
++
++ return 0;
++}
++
+ static void journal_reclaim(struct cache_set *c)
+ {
+ struct bkey *k = &c->journal.key;
+ struct cache *ca = c->cache;
+ uint64_t last_seq;
+- unsigned int next;
+ struct journal_device *ja = &ca->journal;
+ atomic_t p __maybe_unused;
+
+@@ -649,12 +672,10 @@ static void journal_reclaim(struct cache_set *c)
+ if (c->journal.blocks_free)
+ goto out;
+
+- next = (ja->cur_idx + 1) % ca->sb.njournal_buckets;
+- /* No space available on this device */
+- if (next == ja->discard_idx)
++ if (!free_journal_buckets(c))
+ goto out;
+
+- ja->cur_idx = next;
++ ja->cur_idx = (ja->cur_idx + 1) % ca->sb.njournal_buckets;
+ k->ptr[0] = MAKE_PTR(0,
+ bucket_to_sector(c, ca->sb.d[ja->cur_idx]),
+ ca->sb.nr_this_dev);
+diff --git a/drivers/md/bcache/journal.h b/drivers/md/bcache/journal.h
+index f2ea34d5f431b..cd316b4a1e95f 100644
+--- a/drivers/md/bcache/journal.h
++++ b/drivers/md/bcache/journal.h
+@@ -105,6 +105,7 @@ struct journal {
+ spinlock_t lock;
+ spinlock_t flush_write_lock;
+ bool btree_flushing;
++ bool do_reserve;
+ /* used when waiting because the journal was full */
+ struct closure_waitlist wait;
+ struct closure io;
+@@ -182,5 +183,6 @@ int bch_journal_replay(struct cache_set *c, struct list_head *list);
+
+ void bch_journal_free(struct cache_set *c);
+ int bch_journal_alloc(struct cache_set *c);
++void bch_journal_space_reserve(struct journal *j);
+
+ #endif /* _BCACHE_JOURNAL_H */
+diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
+index 320fcdfef48ef..02df49d79b4bf 100644
+--- a/drivers/md/bcache/request.c
++++ b/drivers/md/bcache/request.c
+@@ -1105,6 +1105,12 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio,
+ * which would call closure_get(&dc->disk.cl)
+ */
+ ddip = kzalloc(sizeof(struct detached_dev_io_private), GFP_NOIO);
++ if (!ddip) {
++ bio->bi_status = BLK_STS_RESOURCE;
++ bio->bi_end_io(bio);
++ return;
++ }
++
+ ddip->d = d;
+ /* Count on the bcache device */
+ ddip->orig_bdev = orig_bdev;
+diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
+index bf3de149d3c9f..2bb55278d22d6 100644
+--- a/drivers/md/bcache/super.c
++++ b/drivers/md/bcache/super.c
+@@ -2128,6 +2128,7 @@ static int run_cache_set(struct cache_set *c)
+
+ flash_devs_run(c);
+
++ bch_journal_space_reserve(&c->journal);
+ set_bit(CACHE_SET_RUNNING, &c->flags);
+ return 0;
+ err:
+diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
+index 9ee0005874cda..75b71199800dc 100644
+--- a/drivers/md/bcache/writeback.c
++++ b/drivers/md/bcache/writeback.c
+@@ -805,13 +805,11 @@ static int bch_writeback_thread(void *arg)
+
+ /* Init */
+ #define INIT_KEYS_EACH_TIME 500000
+-#define INIT_KEYS_SLEEP_MS 100
+
+ struct sectors_dirty_init {
+ struct btree_op op;
+ unsigned int inode;
+ size_t count;
+- struct bkey start;
+ };
+
+ static int sectors_dirty_init_fn(struct btree_op *_op, struct btree *b,
+@@ -827,11 +825,8 @@ static int sectors_dirty_init_fn(struct btree_op *_op, struct btree *b,
+ KEY_START(k), KEY_SIZE(k));
+
+ op->count++;
+- if (atomic_read(&b->c->search_inflight) &&
+- !(op->count % INIT_KEYS_EACH_TIME)) {
+- bkey_copy_key(&op->start, k);
+- return -EAGAIN;
+- }
++ if (!(op->count % INIT_KEYS_EACH_TIME))
++ cond_resched();
+
+ return MAP_CONTINUE;
+ }
+@@ -846,24 +841,16 @@ static int bch_root_node_dirty_init(struct cache_set *c,
+ bch_btree_op_init(&op.op, -1);
+ op.inode = d->id;
+ op.count = 0;
+- op.start = KEY(op.inode, 0, 0);
+-
+- do {
+- ret = bcache_btree(map_keys_recurse,
+- k,
+- c->root,
+- &op.op,
+- &op.start,
+- sectors_dirty_init_fn,
+- 0);
+- if (ret == -EAGAIN)
+- schedule_timeout_interruptible(
+- msecs_to_jiffies(INIT_KEYS_SLEEP_MS));
+- else if (ret < 0) {
+- pr_warn("sectors dirty init failed, ret=%d!\n", ret);
+- break;
+- }
+- } while (ret == -EAGAIN);
++
++ ret = bcache_btree(map_keys_recurse,
++ k,
++ c->root,
++ &op.op,
++ &KEY(op.inode, 0, 0),
++ sectors_dirty_init_fn,
++ 0);
++ if (ret < 0)
++ pr_warn("sectors dirty init failed, ret=%d!\n", ret);
+
+ return ret;
+ }
+@@ -907,7 +894,6 @@ static int bch_dirty_init_thread(void *arg)
+ goto out;
+ }
+ skip_nr--;
+- cond_resched();
+ }
+
+ if (p) {
+@@ -917,7 +903,6 @@ static int bch_dirty_init_thread(void *arg)
+
+ p = NULL;
+ prev_idx = cur_idx;
+- cond_resched();
+ }
+
+ out:
+@@ -948,67 +933,55 @@ void bch_sectors_dirty_init(struct bcache_device *d)
+ struct btree_iter iter;
+ struct sectors_dirty_init op;
+ struct cache_set *c = d->c;
+- struct bch_dirty_init_state *state;
+- char name[32];
++ struct bch_dirty_init_state state;
+
+ /* Just count root keys if no leaf node */
++ rw_lock(0, c->root, c->root->level);
+ if (c->root->level == 0) {
+ bch_btree_op_init(&op.op, -1);
+ op.inode = d->id;
+ op.count = 0;
+- op.start = KEY(op.inode, 0, 0);
+
+ for_each_key_filter(&c->root->keys,
+ k, &iter, bch_ptr_invalid)
+ sectors_dirty_init_fn(&op.op, c->root, k);
+- return;
+- }
+
+- state = kzalloc(sizeof(struct bch_dirty_init_state), GFP_KERNEL);
+- if (!state) {
+- pr_warn("sectors dirty init failed: cannot allocate memory\n");
++ rw_unlock(0, c->root);
+ return;
+ }
+
+- state->c = c;
+- state->d = d;
+- state->total_threads = bch_btre_dirty_init_thread_nr();
+- state->key_idx = 0;
+- spin_lock_init(&state->idx_lock);
+- atomic_set(&state->started, 0);
+- atomic_set(&state->enough, 0);
+- init_waitqueue_head(&state->wait);
+-
+- for (i = 0; i < state->total_threads; i++) {
+- /* Fetch latest state->enough earlier */
++ state.c = c;
++ state.d = d;
++ state.total_threads = bch_btre_dirty_init_thread_nr();
++ state.key_idx = 0;
++ spin_lock_init(&state.idx_lock);
++ atomic_set(&state.started, 0);
++ atomic_set(&state.enough, 0);
++ init_waitqueue_head(&state.wait);
++
++ for (i = 0; i < state.total_threads; i++) {
++ /* Fetch latest state.enough earlier */
+ smp_mb__before_atomic();
+- if (atomic_read(&state->enough))
++ if (atomic_read(&state.enough))
+ break;
+
+- state->infos[i].state = state;
+- atomic_inc(&state->started);
+- snprintf(name, sizeof(name), "bch_dirty_init[%d]", i);
+-
+- state->infos[i].thread =
+- kthread_run(bch_dirty_init_thread,
+- &state->infos[i],
+- name);
+- if (IS_ERR(state->infos[i].thread)) {
++ state.infos[i].state = &state;
++ state.infos[i].thread =
++ kthread_run(bch_dirty_init_thread, &state.infos[i],
++ "bch_dirtcnt[%d]", i);
++ if (IS_ERR(state.infos[i].thread)) {
+ pr_err("fails to run thread bch_dirty_init[%d]\n", i);
+ for (--i; i >= 0; i--)
+- kthread_stop(state->infos[i].thread);
++ kthread_stop(state.infos[i].thread);
+ goto out;
+ }
++ atomic_inc(&state.started);
+ }
+
+- /*
+- * Must wait for all threads to stop.
+- */
+- wait_event_interruptible(state->wait,
+- atomic_read(&state->started) == 0);
+-
+ out:
+- kfree(state);
++ /* Must wait for all threads to stop. */
++ wait_event(state.wait, atomic_read(&state.started) == 0);
++ rw_unlock(0, c->root);
+ }
+
+ void bch_cached_dev_writeback_init(struct cached_dev *dc)
+diff --git a/drivers/md/bcache/writeback.h b/drivers/md/bcache/writeback.h
+index 02b2f9df73f69..31df716951f66 100644
+--- a/drivers/md/bcache/writeback.h
++++ b/drivers/md/bcache/writeback.h
+@@ -20,7 +20,7 @@
+ #define BCH_WRITEBACK_FRAGMENT_THRESHOLD_MID 57
+ #define BCH_WRITEBACK_FRAGMENT_THRESHOLD_HIGH 64
+
+-#define BCH_DIRTY_INIT_THRD_MAX 64
++#define BCH_DIRTY_INIT_THRD_MAX 12
+ /*
+ * 14 (16384ths) is chosen here as something that each backing device
+ * should be a reasonable fraction of the share, and not to blow up
+diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
+index bfd6026d78099..612460d2bdaf2 100644
+--- a/drivers/md/md-bitmap.c
++++ b/drivers/md/md-bitmap.c
+@@ -639,14 +639,6 @@ re_read:
+ daemon_sleep = le32_to_cpu(sb->daemon_sleep) * HZ;
+ write_behind = le32_to_cpu(sb->write_behind);
+ sectors_reserved = le32_to_cpu(sb->sectors_reserved);
+- /* Setup nodes/clustername only if bitmap version is
+- * cluster-compatible
+- */
+- if (sb->version == cpu_to_le32(BITMAP_MAJOR_CLUSTERED)) {
+- nodes = le32_to_cpu(sb->nodes);
+- strlcpy(bitmap->mddev->bitmap_info.cluster_name,
+- sb->cluster_name, 64);
+- }
+
+ /* verify that the bitmap-specific fields are valid */
+ if (sb->magic != cpu_to_le32(BITMAP_MAGIC))
+@@ -668,6 +660,16 @@ re_read:
+ goto out;
+ }
+
++ /*
++ * Setup nodes/clustername only if bitmap version is
++ * cluster-compatible
++ */
++ if (sb->version == cpu_to_le32(BITMAP_MAJOR_CLUSTERED)) {
++ nodes = le32_to_cpu(sb->nodes);
++ strlcpy(bitmap->mddev->bitmap_info.cluster_name,
++ sb->cluster_name, 64);
++ }
++
+ /* keep the array size field of the bitmap superblock up to date */
+ sb->sync_size = cpu_to_le64(bitmap->mddev->resync_max_sectors);
+
+@@ -700,9 +702,9 @@ re_read:
+
+ out:
+ kunmap_atomic(sb);
+- /* Assigning chunksize is required for "re_read" */
+- bitmap->mddev->bitmap_info.chunksize = chunksize;
+ if (err == 0 && nodes && (bitmap->cluster_slot < 0)) {
++ /* Assigning chunksize is required for "re_read" */
++ bitmap->mddev->bitmap_info.chunksize = chunksize;
+ err = md_setup_cluster(bitmap->mddev, nodes);
+ if (err) {
+ pr_warn("%s: Could not setup cluster service (%d)\n",
+@@ -713,18 +715,18 @@ out:
+ goto re_read;
+ }
+
+-
+ out_no_sb:
+- if (test_bit(BITMAP_STALE, &bitmap->flags))
+- bitmap->events_cleared = bitmap->mddev->events;
+- bitmap->mddev->bitmap_info.chunksize = chunksize;
+- bitmap->mddev->bitmap_info.daemon_sleep = daemon_sleep;
+- bitmap->mddev->bitmap_info.max_write_behind = write_behind;
+- bitmap->mddev->bitmap_info.nodes = nodes;
+- if (bitmap->mddev->bitmap_info.space == 0 ||
+- bitmap->mddev->bitmap_info.space > sectors_reserved)
+- bitmap->mddev->bitmap_info.space = sectors_reserved;
+- if (err) {
++ if (err == 0) {
++ if (test_bit(BITMAP_STALE, &bitmap->flags))
++ bitmap->events_cleared = bitmap->mddev->events;
++ bitmap->mddev->bitmap_info.chunksize = chunksize;
++ bitmap->mddev->bitmap_info.daemon_sleep = daemon_sleep;
++ bitmap->mddev->bitmap_info.max_write_behind = write_behind;
++ bitmap->mddev->bitmap_info.nodes = nodes;
++ if (bitmap->mddev->bitmap_info.space == 0 ||
++ bitmap->mddev->bitmap_info.space > sectors_reserved)
++ bitmap->mddev->bitmap_info.space = sectors_reserved;
++ } else {
+ md_bitmap_print_sb(bitmap);
+ if (bitmap->cluster_slot < 0)
+ md_cluster_stop(bitmap->mddev);
+diff --git a/drivers/md/md.c b/drivers/md/md.c
+index 309b3af906ad3..066f792b374e3 100644
+--- a/drivers/md/md.c
++++ b/drivers/md/md.c
+@@ -2627,14 +2627,16 @@ static void sync_sbs(struct mddev *mddev, int nospares)
+
+ static bool does_sb_need_changing(struct mddev *mddev)
+ {
+- struct md_rdev *rdev;
++ struct md_rdev *rdev = NULL, *iter;
+ struct mdp_superblock_1 *sb;
+ int role;
+
+ /* Find a good rdev */
+- rdev_for_each(rdev, mddev)
+- if ((rdev->raid_disk >= 0) && !test_bit(Faulty, &rdev->flags))
++ rdev_for_each(iter, mddev)
++ if ((iter->raid_disk >= 0) && !test_bit(Faulty, &iter->flags)) {
++ rdev = iter;
+ break;
++ }
+
+ /* No good device found. */
+ if (!rdev)
+@@ -5596,8 +5598,6 @@ static void md_free(struct kobject *ko)
+
+ bioset_exit(&mddev->bio_set);
+ bioset_exit(&mddev->sync_set);
+- if (mddev->level != 1 && mddev->level != 10)
+- bioset_exit(&mddev->io_acct_set);
+ kfree(mddev);
+ }
+
+@@ -6284,8 +6284,6 @@ void md_stop(struct mddev *mddev)
+ __md_stop(mddev);
+ bioset_exit(&mddev->bio_set);
+ bioset_exit(&mddev->sync_set);
+- if (mddev->level != 1 && mddev->level != 10)
+- bioset_exit(&mddev->io_acct_set);
+ }
+
+ EXPORT_SYMBOL_GPL(md_stop);
+@@ -9791,16 +9789,18 @@ static int read_rdev(struct mddev *mddev, struct md_rdev *rdev)
+
+ void md_reload_sb(struct mddev *mddev, int nr)
+ {
+- struct md_rdev *rdev;
++ struct md_rdev *rdev = NULL, *iter;
+ int err;
+
+ /* Find the rdev */
+- rdev_for_each_rcu(rdev, mddev) {
+- if (rdev->desc_nr == nr)
++ rdev_for_each_rcu(iter, mddev) {
++ if (iter->desc_nr == nr) {
++ rdev = iter;
+ break;
++ }
+ }
+
+- if (!rdev || rdev->desc_nr != nr) {
++ if (!rdev) {
+ pr_warn("%s: %d Could not find rdev with nr %d\n", __func__, __LINE__, nr);
+ return;
+ }
+diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
+index b21e101183f44..46c30fb538a46 100644
+--- a/drivers/md/raid0.c
++++ b/drivers/md/raid0.c
+@@ -361,7 +361,6 @@ static void free_conf(struct mddev *mddev, struct r0conf *conf)
+ kfree(conf->strip_zone);
+ kfree(conf->devlist);
+ kfree(conf);
+- mddev->private = NULL;
+ }
+
+ static void raid0_free(struct mddev *mddev, void *priv)
+diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c
+index 2e12331c12a9d..01766e7447728 100644
+--- a/drivers/media/cec/core/cec-adap.c
++++ b/drivers/media/cec/core/cec-adap.c
+@@ -1278,7 +1278,7 @@ static int cec_config_log_addr(struct cec_adapter *adap,
+ * While trying to poll the physical address was reset
+ * and the adapter was unconfigured, so bail out.
+ */
+- if (!adap->is_configuring)
++ if (adap->phys_addr == CEC_PHYS_ADDR_INVALID)
+ return -EINTR;
+
+ if (err)
+@@ -1335,7 +1335,6 @@ static void cec_adap_unconfigure(struct cec_adapter *adap)
+ adap->phys_addr != CEC_PHYS_ADDR_INVALID)
+ WARN_ON(adap->ops->adap_log_addr(adap, CEC_LOG_ADDR_INVALID));
+ adap->log_addrs.log_addr_mask = 0;
+- adap->is_configuring = false;
+ adap->is_configured = false;
+ cec_flush(adap);
+ wake_up_interruptible(&adap->kthread_waitq);
+@@ -1527,9 +1526,10 @@ unconfigure:
+ for (i = 0; i < las->num_log_addrs; i++)
+ las->log_addr[i] = CEC_LOG_ADDR_INVALID;
+ cec_adap_unconfigure(adap);
++ adap->is_configuring = false;
+ adap->kthread_config = NULL;
+- mutex_unlock(&adap->lock);
+ complete(&adap->config_completion);
++ mutex_unlock(&adap->lock);
+ return 0;
+ }
+
+diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
+index fae2baabb7738..2b20aa6c37b1b 100644
+--- a/drivers/media/i2c/Kconfig
++++ b/drivers/media/i2c/Kconfig
+@@ -372,6 +372,7 @@ config VIDEO_OV13B10
+ config VIDEO_OV2640
+ tristate "OmniVision OV2640 sensor support"
+ depends on VIDEO_DEV && I2C
++ select V4L2_ASYNC
+ help
+ This is a Video4Linux2 sensor driver for the OmniVision
+ OV2640 camera.
+diff --git a/drivers/media/i2c/ccs/ccs-core.c b/drivers/media/i2c/ccs/ccs-core.c
+index 03e841b8443f0..7ae469caf9906 100644
+--- a/drivers/media/i2c/ccs/ccs-core.c
++++ b/drivers/media/i2c/ccs/ccs-core.c
+@@ -1602,8 +1602,11 @@ static int ccs_power_on(struct device *dev)
+ usleep_range(1000, 2000);
+ } while (--retry);
+
+- if (!reset)
+- return -EIO;
++ if (!reset) {
++ dev_err(dev, "software reset failed\n");
++ rval = -EIO;
++ goto out_cci_addr_fail;
++ }
+ }
+
+ if (sensor->hwcfg.i2c_addr_alt) {
+diff --git a/drivers/media/i2c/dw9714.c b/drivers/media/i2c/dw9714.c
+index cd7008ad8f2f3..8c5797ba57d41 100644
+--- a/drivers/media/i2c/dw9714.c
++++ b/drivers/media/i2c/dw9714.c
+@@ -183,6 +183,7 @@ static int dw9714_probe(struct i2c_client *client)
+ return 0;
+
+ err_cleanup:
++ regulator_disable(dw9714_dev->vcc);
+ v4l2_ctrl_handler_free(&dw9714_dev->ctrls_vcm);
+ media_entity_cleanup(&dw9714_dev->sd.entity);
+
+diff --git a/drivers/media/i2c/dw9768.c b/drivers/media/i2c/dw9768.c
+index 65c6acf3ced9a..c086580efac78 100644
+--- a/drivers/media/i2c/dw9768.c
++++ b/drivers/media/i2c/dw9768.c
+@@ -469,11 +469,6 @@ static int dw9768_probe(struct i2c_client *client)
+
+ dw9768->sd.entity.function = MEDIA_ENT_F_LENS;
+
+- /*
+- * Device is already turned on by i2c-core with ACPI domain PM.
+- * Attempt to turn off the device to satisfy the privacy LED concerns.
+- */
+- pm_runtime_set_active(dev);
+ pm_runtime_enable(dev);
+ if (!pm_runtime_enabled(dev)) {
+ ret = dw9768_runtime_resume(dev);
+@@ -488,7 +483,6 @@ static int dw9768_probe(struct i2c_client *client)
+ dev_err(dev, "failed to register V4L2 subdev: %d", ret);
+ goto err_power_off;
+ }
+- pm_runtime_idle(dev);
+
+ return 0;
+
+diff --git a/drivers/media/i2c/max9286.c b/drivers/media/i2c/max9286.c
+index d2a4915ed9f7b..3684faa72253b 100644
+--- a/drivers/media/i2c/max9286.c
++++ b/drivers/media/i2c/max9286.c
+@@ -1147,22 +1147,18 @@ static int max9286_poc_enable(struct max9286_priv *priv, bool enable)
+ return ret;
+ }
+
+-static int max9286_init(struct device *dev)
++static int max9286_init(struct max9286_priv *priv)
+ {
+- struct max9286_priv *priv;
+- struct i2c_client *client;
++ struct i2c_client *client = priv->client;
+ int ret;
+
+- client = to_i2c_client(dev);
+- priv = i2c_get_clientdata(client);
+-
+ ret = max9286_poc_enable(priv, true);
+ if (ret)
+ return ret;
+
+ ret = max9286_setup(priv);
+ if (ret) {
+- dev_err(dev, "Unable to setup max9286\n");
++ dev_err(&client->dev, "Unable to setup max9286\n");
+ goto err_poc_disable;
+ }
+
+@@ -1172,13 +1168,13 @@ static int max9286_init(struct device *dev)
+ */
+ ret = max9286_v4l2_register(priv);
+ if (ret) {
+- dev_err(dev, "Failed to register with V4L2\n");
++ dev_err(&client->dev, "Failed to register with V4L2\n");
+ goto err_poc_disable;
+ }
+
+ ret = max9286_i2c_mux_init(priv);
+ if (ret) {
+- dev_err(dev, "Unable to initialize I2C multiplexer\n");
++ dev_err(&client->dev, "Unable to initialize I2C multiplexer\n");
+ goto err_v4l2_register;
+ }
+
+@@ -1333,7 +1329,6 @@ static int max9286_probe(struct i2c_client *client)
+ mutex_init(&priv->mutex);
+
+ priv->client = client;
+- i2c_set_clientdata(client, priv);
+
+ priv->gpiod_pwdn = devm_gpiod_get_optional(&client->dev, "enable",
+ GPIOD_OUT_HIGH);
+@@ -1369,7 +1364,7 @@ static int max9286_probe(struct i2c_client *client)
+ if (ret)
+ goto err_powerdown;
+
+- ret = max9286_init(&client->dev);
++ ret = max9286_init(priv);
+ if (ret < 0)
+ goto err_cleanup_dt;
+
+@@ -1385,7 +1380,7 @@ err_powerdown:
+
+ static int max9286_remove(struct i2c_client *client)
+ {
+- struct max9286_priv *priv = i2c_get_clientdata(client);
++ struct max9286_priv *priv = sd_to_max9286(i2c_get_clientdata(client));
+
+ i2c_mux_del_adapters(priv->mux);
+
+diff --git a/drivers/media/i2c/ov5648.c b/drivers/media/i2c/ov5648.c
+index 930ff6897044a..dfcd33e9ee136 100644
+--- a/drivers/media/i2c/ov5648.c
++++ b/drivers/media/i2c/ov5648.c
+@@ -2498,9 +2498,9 @@ static int ov5648_probe(struct i2c_client *client)
+
+ /* DOVDD: digital I/O */
+ sensor->dovdd = devm_regulator_get(dev, "dovdd");
+- if (IS_ERR(sensor->dvdd)) {
++ if (IS_ERR(sensor->dovdd)) {
+ dev_err(dev, "cannot get DOVDD (digital I/O) regulator\n");
+- ret = PTR_ERR(sensor->dvdd);
++ ret = PTR_ERR(sensor->dovdd);
+ goto error_endpoint;
+ }
+
+diff --git a/drivers/media/i2c/ov7670.c b/drivers/media/i2c/ov7670.c
+index 1967464231160..1be2c0e5bdc15 100644
+--- a/drivers/media/i2c/ov7670.c
++++ b/drivers/media/i2c/ov7670.c
+@@ -2017,7 +2017,6 @@ static int ov7670_remove(struct i2c_client *client)
+ v4l2_async_unregister_subdev(sd);
+ v4l2_ctrl_handler_free(&info->hdl);
+ media_entity_cleanup(&info->sd.entity);
+- ov7670_power_off(sd);
+ return 0;
+ }
+
+diff --git a/drivers/media/i2c/rdacm20.c b/drivers/media/i2c/rdacm20.c
+index 025a610de8935..9c6f66cab5642 100644
+--- a/drivers/media/i2c/rdacm20.c
++++ b/drivers/media/i2c/rdacm20.c
+@@ -611,7 +611,7 @@ static int rdacm20_probe(struct i2c_client *client)
+ goto error_free_ctrls;
+
+ dev->pad.flags = MEDIA_PAD_FL_SOURCE;
+- dev->sd.entity.flags |= MEDIA_ENT_F_CAM_SENSOR;
++ dev->sd.entity.function = MEDIA_ENT_F_CAM_SENSOR;
+ ret = media_entity_pads_init(&dev->sd.entity, 1, &dev->pad);
+ if (ret < 0)
+ goto error_free_ctrls;
+diff --git a/drivers/media/i2c/rdacm21.c b/drivers/media/i2c/rdacm21.c
+index 12ec5467ed1ee..ef31cf5f23cac 100644
+--- a/drivers/media/i2c/rdacm21.c
++++ b/drivers/media/i2c/rdacm21.c
+@@ -583,7 +583,7 @@ static int rdacm21_probe(struct i2c_client *client)
+ goto error_free_ctrls;
+
+ dev->pad.flags = MEDIA_PAD_FL_SOURCE;
+- dev->sd.entity.flags |= MEDIA_ENT_F_CAM_SENSOR;
++ dev->sd.entity.function = MEDIA_ENT_F_CAM_SENSOR;
+ ret = media_entity_pads_init(&dev->sd.entity, 1, &dev->pad);
+ if (ret < 0)
+ goto error_free_ctrls;
+diff --git a/drivers/media/pci/cx23885/cx23885-core.c b/drivers/media/pci/cx23885/cx23885-core.c
+index f8f2ff3b00c37..a07b18f2034e9 100644
+--- a/drivers/media/pci/cx23885/cx23885-core.c
++++ b/drivers/media/pci/cx23885/cx23885-core.c
+@@ -2165,7 +2165,7 @@ static int cx23885_initdev(struct pci_dev *pci_dev,
+ err = dma_set_mask(&pci_dev->dev, 0xffffffff);
+ if (err) {
+ pr_err("%s/0: Oops: no 32bit PCI DMA ???\n", dev->name);
+- goto fail_ctrl;
++ goto fail_dma_set_mask;
+ }
+
+ err = request_irq(pci_dev->irq, cx23885_irq,
+@@ -2173,7 +2173,7 @@ static int cx23885_initdev(struct pci_dev *pci_dev,
+ if (err < 0) {
+ pr_err("%s: can't get IRQ %d\n",
+ dev->name, pci_dev->irq);
+- goto fail_irq;
++ goto fail_dma_set_mask;
+ }
+
+ switch (dev->board) {
+@@ -2195,7 +2195,7 @@ static int cx23885_initdev(struct pci_dev *pci_dev,
+
+ return 0;
+
+-fail_irq:
++fail_dma_set_mask:
+ cx23885_dev_unregister(dev);
+ fail_ctrl:
+ v4l2_ctrl_handler_free(hdl);
+diff --git a/drivers/media/pci/cx25821/cx25821-core.c b/drivers/media/pci/cx25821/cx25821-core.c
+index 3078a39f0b95d..6627fa9166d30 100644
+--- a/drivers/media/pci/cx25821/cx25821-core.c
++++ b/drivers/media/pci/cx25821/cx25821-core.c
+@@ -1332,11 +1332,11 @@ static void cx25821_finidev(struct pci_dev *pci_dev)
+ struct cx25821_dev *dev = get_cx25821(v4l2_dev);
+
+ cx25821_shutdown(dev);
+- pci_disable_device(pci_dev);
+
+ /* unregister stuff */
+ if (pci_dev->irq)
+ free_irq(pci_dev->irq, dev);
++ pci_disable_device(pci_dev);
+
+ cx25821_dev_unregister(dev);
+ v4l2_device_unregister(v4l2_dev);
+diff --git a/drivers/media/platform/amphion/vdec.c b/drivers/media/platform/amphion/vdec.c
+index 8f8dfd6ce2c68..c0dfede11ab74 100644
+--- a/drivers/media/platform/amphion/vdec.c
++++ b/drivers/media/platform/amphion/vdec.c
+@@ -782,7 +782,7 @@ static void vdec_init_fmt(struct vpu_inst *inst)
+ if (vdec->codec_info.progressive)
+ inst->cap_format.field = V4L2_FIELD_NONE;
+ else
+- inst->cap_format.field = V4L2_FIELD_SEQ_BT;
++ inst->cap_format.field = V4L2_FIELD_SEQ_TB;
+ if (vdec->codec_info.color_primaries == V4L2_COLORSPACE_DEFAULT)
+ vdec->codec_info.color_primaries = V4L2_COLORSPACE_REC709;
+ if (vdec->codec_info.transfer_chars == V4L2_XFER_FUNC_DEFAULT)
+diff --git a/drivers/media/platform/aspeed/aspeed-video.c b/drivers/media/platform/aspeed/aspeed-video.c
+index b937dbcbe9e0a..20f795ccc11b6 100644
+--- a/drivers/media/platform/aspeed/aspeed-video.c
++++ b/drivers/media/platform/aspeed/aspeed-video.c
+@@ -1993,6 +1993,7 @@ static int aspeed_video_probe(struct platform_device *pdev)
+
+ rc = aspeed_video_setup_video(video);
+ if (rc) {
++ aspeed_video_free_buf(video, &video->jpeg);
+ clk_unprepare(video->vclk);
+ clk_unprepare(video->eclk);
+ return rc;
+@@ -2024,8 +2025,7 @@ static int aspeed_video_remove(struct platform_device *pdev)
+
+ v4l2_device_unregister(v4l2_dev);
+
+- dma_free_coherent(video->dev, VE_JPEG_HEADER_SIZE, video->jpeg.virt,
+- video->jpeg.dma);
++ aspeed_video_free_buf(video, &video->jpeg);
+
+ of_reserved_mem_device_release(dev);
+
+diff --git a/drivers/media/platform/atmel/atmel-sama5d2-isc.c b/drivers/media/platform/atmel/atmel-sama5d2-isc.c
+index c5b9563e36cb1..c2d50b0c0e3d3 100644
+--- a/drivers/media/platform/atmel/atmel-sama5d2-isc.c
++++ b/drivers/media/platform/atmel/atmel-sama5d2-isc.c
+@@ -291,7 +291,7 @@ static void isc_sama5d2_config_rlp(struct isc_device *isc)
+ * Thus, if the YCYC mode is selected, replace it with the
+ * sama5d2-compliant mode which is YYCC .
+ */
+- if ((rlp_mode & ISC_RLP_CFG_MODE_YCYC) == ISC_RLP_CFG_MODE_YCYC) {
++ if ((rlp_mode & ISC_RLP_CFG_MODE_MASK) == ISC_RLP_CFG_MODE_YCYC) {
+ rlp_mode &= ~ISC_RLP_CFG_MODE_MASK;
+ rlp_mode |= ISC_RLP_CFG_MODE_YYCC;
+ }
+@@ -562,7 +562,7 @@ static int atmel_isc_probe(struct platform_device *pdev)
+ ret = clk_prepare_enable(isc->ispck);
+ if (ret) {
+ dev_err(dev, "failed to enable ispck: %d\n", ret);
+- goto cleanup_subdev;
++ goto disable_pm;
+ }
+
+ /* ispck should be greater or equal to hclock */
+@@ -580,6 +580,9 @@ static int atmel_isc_probe(struct platform_device *pdev)
+ unprepare_clk:
+ clk_disable_unprepare(isc->ispck);
+
++disable_pm:
++ pm_runtime_disable(dev);
++
+ cleanup_subdev:
+ isc_subdev_cleanup(isc);
+
+diff --git a/drivers/media/platform/chips-media/coda-common.c b/drivers/media/platform/chips-media/coda-common.c
+index a57822b050706..27d002d9f631f 100644
+--- a/drivers/media/platform/chips-media/coda-common.c
++++ b/drivers/media/platform/chips-media/coda-common.c
+@@ -1324,7 +1324,8 @@ static int coda_enum_frameintervals(struct file *file, void *fh,
+ struct v4l2_frmivalenum *f)
+ {
+ struct coda_ctx *ctx = fh_to_ctx(fh);
+- int i;
++ struct coda_q_data *q_data;
++ const struct coda_codec *codec;
+
+ if (f->index)
+ return -EINVAL;
+@@ -1333,12 +1334,19 @@ static int coda_enum_frameintervals(struct file *file, void *fh,
+ if (!ctx->vdoa && f->pixel_format == V4L2_PIX_FMT_YUYV)
+ return -EINVAL;
+
+- for (i = 0; i < CODA_MAX_FORMATS; i++) {
+- if (f->pixel_format == ctx->cvd->src_formats[i] ||
+- f->pixel_format == ctx->cvd->dst_formats[i])
+- break;
++ if (coda_format_normalize_yuv(f->pixel_format) == V4L2_PIX_FMT_YUV420) {
++ q_data = get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
++ codec = coda_find_codec(ctx->dev, f->pixel_format,
++ q_data->fourcc);
++ } else {
++ codec = coda_find_codec(ctx->dev, V4L2_PIX_FMT_YUV420,
++ f->pixel_format);
+ }
+- if (i == CODA_MAX_FORMATS)
++ if (!codec)
++ return -EINVAL;
++
++ if (f->width < MIN_W || f->width > codec->max_w ||
++ f->height < MIN_H || f->height > codec->max_h)
+ return -EINVAL;
+
+ f->type = V4L2_FRMIVAL_TYPE_CONTINUOUS;
+@@ -2344,8 +2352,8 @@ static void coda_encode_ctrls(struct coda_ctx *ctx)
+ V4L2_CID_MPEG_VIDEO_H264_CHROMA_QP_INDEX_OFFSET, -12, 12, 1, 0);
+ v4l2_ctrl_new_std_menu(&ctx->ctrls, &coda_ctrl_ops,
+ V4L2_CID_MPEG_VIDEO_H264_PROFILE,
+- V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE, 0x0,
+- V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE);
++ V4L2_MPEG_VIDEO_H264_PROFILE_CONSTRAINED_BASELINE, 0x0,
++ V4L2_MPEG_VIDEO_H264_PROFILE_CONSTRAINED_BASELINE);
+ if (ctx->dev->devtype->product == CODA_HX4 ||
+ ctx->dev->devtype->product == CODA_7541) {
+ v4l2_ctrl_new_std_menu(&ctx->ctrls, &coda_ctrl_ops,
+@@ -2359,12 +2367,15 @@ static void coda_encode_ctrls(struct coda_ctx *ctx)
+ if (ctx->dev->devtype->product == CODA_960) {
+ v4l2_ctrl_new_std_menu(&ctx->ctrls, &coda_ctrl_ops,
+ V4L2_CID_MPEG_VIDEO_H264_LEVEL,
+- V4L2_MPEG_VIDEO_H264_LEVEL_4_0,
+- ~((1 << V4L2_MPEG_VIDEO_H264_LEVEL_2_0) |
++ V4L2_MPEG_VIDEO_H264_LEVEL_4_2,
++ ~((1 << V4L2_MPEG_VIDEO_H264_LEVEL_1_0) |
++ (1 << V4L2_MPEG_VIDEO_H264_LEVEL_2_0) |
+ (1 << V4L2_MPEG_VIDEO_H264_LEVEL_3_0) |
+ (1 << V4L2_MPEG_VIDEO_H264_LEVEL_3_1) |
+ (1 << V4L2_MPEG_VIDEO_H264_LEVEL_3_2) |
+- (1 << V4L2_MPEG_VIDEO_H264_LEVEL_4_0)),
++ (1 << V4L2_MPEG_VIDEO_H264_LEVEL_4_0) |
++ (1 << V4L2_MPEG_VIDEO_H264_LEVEL_4_1) |
++ (1 << V4L2_MPEG_VIDEO_H264_LEVEL_4_2)),
+ V4L2_MPEG_VIDEO_H264_LEVEL_4_0);
+ }
+ v4l2_ctrl_new_std(&ctx->ctrls, &coda_ctrl_ops,
+@@ -2426,7 +2437,7 @@ static void coda_decode_ctrls(struct coda_ctx *ctx)
+ ctx->h264_profile_ctrl = v4l2_ctrl_new_std_menu(&ctx->ctrls,
+ &coda_ctrl_ops, V4L2_CID_MPEG_VIDEO_H264_PROFILE,
+ V4L2_MPEG_VIDEO_H264_PROFILE_HIGH,
+- ~((1 << V4L2_MPEG_VIDEO_H264_PROFILE_BASELINE) |
++ ~((1 << V4L2_MPEG_VIDEO_H264_PROFILE_CONSTRAINED_BASELINE) |
+ (1 << V4L2_MPEG_VIDEO_H264_PROFILE_MAIN) |
+ (1 << V4L2_MPEG_VIDEO_H264_PROFILE_HIGH)),
+ V4L2_MPEG_VIDEO_H264_PROFILE_HIGH);
+diff --git a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
+index 130ecef2e7664..c8ee5e2b4f699 100644
+--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
++++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
+@@ -47,14 +47,7 @@ static struct mtk_q_data *mtk_vdec_get_q_data(struct mtk_vcodec_ctx *ctx,
+ static int vidioc_try_decoder_cmd(struct file *file, void *priv,
+ struct v4l2_decoder_cmd *cmd)
+ {
+- struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
+-
+- /* Use M2M stateless helper if relevant */
+- if (ctx->dev->vdec_pdata->uses_stateless_api)
+- return v4l2_m2m_ioctl_stateless_try_decoder_cmd(file, priv,
+- cmd);
+- else
+- return v4l2_m2m_ioctl_try_decoder_cmd(file, priv, cmd);
++ return v4l2_m2m_ioctl_try_decoder_cmd(file, priv, cmd);
+ }
+
+
+@@ -69,10 +62,6 @@ static int vidioc_decoder_cmd(struct file *file, void *priv,
+ if (ret)
+ return ret;
+
+- /* Use M2M stateless helper if relevant */
+- if (ctx->dev->vdec_pdata->uses_stateless_api)
+- return v4l2_m2m_ioctl_stateless_decoder_cmd(file, priv, cmd);
+-
+ mtk_v4l2_debug(1, "decoder cmd=%u", cmd->cmd);
+ dst_vq = v4l2_m2m_get_vq(ctx->m2m_ctx,
+ V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+diff --git a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
+index df7b25e9cbc88..fe7b2f1739b15 100644
+--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
++++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
+@@ -400,6 +400,9 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
+ }
+
+ if (dev->vdec_pdata->uses_stateless_api) {
++ v4l2_disable_ioctl(vfd_dec, VIDIOC_DECODER_CMD);
++ v4l2_disable_ioctl(vfd_dec, VIDIOC_TRY_DECODER_CMD);
++
+ dev->mdev_dec.dev = &pdev->dev;
+ strscpy(dev->mdev_dec.model, MTK_VCODEC_DEC_NAME,
+ sizeof(dev->mdev_dec.model));
+@@ -487,7 +490,8 @@ static int mtk_vcodec_dec_remove(struct platform_device *pdev)
+ video_unregister_device(dev->vfd_dec);
+
+ v4l2_device_unregister(&dev->v4l2_dev);
+- pm_runtime_disable(dev->pm.dev);
++ if (!dev->vdec_pdata->is_subdev_supported)
++ pm_runtime_disable(dev->pm.dev);
+ mtk_vcodec_fw_release(dev->fw_handler);
+ return 0;
+ }
+diff --git a/drivers/media/platform/nxp/imx-mipi-csis.c b/drivers/media/platform/nxp/imx-mipi-csis.c
+index 0a72734db55e9..e0e345fbb00f8 100644
+--- a/drivers/media/platform/nxp/imx-mipi-csis.c
++++ b/drivers/media/platform/nxp/imx-mipi-csis.c
+@@ -310,7 +310,7 @@ struct mipi_csis_info {
+ unsigned int num_clocks;
+ };
+
+-struct csi_state {
++struct mipi_csis_device {
+ struct device *dev;
+ void __iomem *regs;
+ struct clk_bulk_data *clks;
+@@ -487,59 +487,60 @@ static const struct csis_pix_format *find_csis_format(u32 code)
+ * Hardware configuration
+ */
+
+-static inline u32 mipi_csis_read(struct csi_state *state, u32 reg)
++static inline u32 mipi_csis_read(struct mipi_csis_device *csis, u32 reg)
+ {
+- return readl(state->regs + reg);
++ return readl(csis->regs + reg);
+ }
+
+-static inline void mipi_csis_write(struct csi_state *state, u32 reg, u32 val)
++static inline void mipi_csis_write(struct mipi_csis_device *csis, u32 reg,
++ u32 val)
+ {
+- writel(val, state->regs + reg);
++ writel(val, csis->regs + reg);
+ }
+
+-static void mipi_csis_enable_interrupts(struct csi_state *state, bool on)
++static void mipi_csis_enable_interrupts(struct mipi_csis_device *csis, bool on)
+ {
+- mipi_csis_write(state, MIPI_CSIS_INT_MSK, on ? 0xffffffff : 0);
+- mipi_csis_write(state, MIPI_CSIS_DBG_INTR_MSK, on ? 0xffffffff : 0);
++ mipi_csis_write(csis, MIPI_CSIS_INT_MSK, on ? 0xffffffff : 0);
++ mipi_csis_write(csis, MIPI_CSIS_DBG_INTR_MSK, on ? 0xffffffff : 0);
+ }
+
+-static void mipi_csis_sw_reset(struct csi_state *state)
++static void mipi_csis_sw_reset(struct mipi_csis_device *csis)
+ {
+- u32 val = mipi_csis_read(state, MIPI_CSIS_CMN_CTRL);
++ u32 val = mipi_csis_read(csis, MIPI_CSIS_CMN_CTRL);
+
+- mipi_csis_write(state, MIPI_CSIS_CMN_CTRL,
++ mipi_csis_write(csis, MIPI_CSIS_CMN_CTRL,
+ val | MIPI_CSIS_CMN_CTRL_RESET);
+ usleep_range(10, 20);
+ }
+
+-static void mipi_csis_system_enable(struct csi_state *state, int on)
++static void mipi_csis_system_enable(struct mipi_csis_device *csis, int on)
+ {
+ u32 val, mask;
+
+- val = mipi_csis_read(state, MIPI_CSIS_CMN_CTRL);
++ val = mipi_csis_read(csis, MIPI_CSIS_CMN_CTRL);
+ if (on)
+ val |= MIPI_CSIS_CMN_CTRL_ENABLE;
+ else
+ val &= ~MIPI_CSIS_CMN_CTRL_ENABLE;
+- mipi_csis_write(state, MIPI_CSIS_CMN_CTRL, val);
++ mipi_csis_write(csis, MIPI_CSIS_CMN_CTRL, val);
+
+- val = mipi_csis_read(state, MIPI_CSIS_DPHY_CMN_CTRL);
++ val = mipi_csis_read(csis, MIPI_CSIS_DPHY_CMN_CTRL);
+ val &= ~MIPI_CSIS_DPHY_CMN_CTRL_ENABLE;
+ if (on) {
+- mask = (1 << (state->bus.num_data_lanes + 1)) - 1;
++ mask = (1 << (csis->bus.num_data_lanes + 1)) - 1;
+ val |= (mask & MIPI_CSIS_DPHY_CMN_CTRL_ENABLE);
+ }
+- mipi_csis_write(state, MIPI_CSIS_DPHY_CMN_CTRL, val);
++ mipi_csis_write(csis, MIPI_CSIS_DPHY_CMN_CTRL, val);
+ }
+
+-/* Called with the state.lock mutex held */
+-static void __mipi_csis_set_format(struct csi_state *state)
++/* Called with the csis.lock mutex held */
++static void __mipi_csis_set_format(struct mipi_csis_device *csis)
+ {
+- struct v4l2_mbus_framefmt *mf = &state->format_mbus[CSIS_PAD_SINK];
++ struct v4l2_mbus_framefmt *mf = &csis->format_mbus[CSIS_PAD_SINK];
+ u32 val;
+
+ /* Color format */
+- val = mipi_csis_read(state, MIPI_CSIS_ISP_CONFIG_CH(0));
++ val = mipi_csis_read(csis, MIPI_CSIS_ISP_CONFIG_CH(0));
+ val &= ~(MIPI_CSIS_ISPCFG_ALIGN_32BIT | MIPI_CSIS_ISPCFG_FMT_MASK
+ | MIPI_CSIS_ISPCFG_PIXEL_MASK);
+
+@@ -556,28 +557,28 @@ static void __mipi_csis_set_format(struct csi_state *state)
+ *
+ * TODO: Verify which other formats require DUAL (or QUAD) modes.
+ */
+- if (state->csis_fmt->data_type == MIPI_CSI2_DATA_TYPE_YUV422_8)
++ if (csis->csis_fmt->data_type == MIPI_CSI2_DATA_TYPE_YUV422_8)
+ val |= MIPI_CSIS_ISPCFG_PIXEL_MODE_DUAL;
+
+- val |= MIPI_CSIS_ISPCFG_FMT(state->csis_fmt->data_type);
+- mipi_csis_write(state, MIPI_CSIS_ISP_CONFIG_CH(0), val);
++ val |= MIPI_CSIS_ISPCFG_FMT(csis->csis_fmt->data_type);
++ mipi_csis_write(csis, MIPI_CSIS_ISP_CONFIG_CH(0), val);
+
+ /* Pixel resolution */
+ val = mf->width | (mf->height << 16);
+- mipi_csis_write(state, MIPI_CSIS_ISP_RESOL_CH(0), val);
++ mipi_csis_write(csis, MIPI_CSIS_ISP_RESOL_CH(0), val);
+ }
+
+-static int mipi_csis_calculate_params(struct csi_state *state)
++static int mipi_csis_calculate_params(struct mipi_csis_device *csis)
+ {
+ s64 link_freq;
+ u32 lane_rate;
+
+ /* Calculate the line rate from the pixel rate. */
+- link_freq = v4l2_get_link_freq(state->src_sd->ctrl_handler,
+- state->csis_fmt->width,
+- state->bus.num_data_lanes * 2);
++ link_freq = v4l2_get_link_freq(csis->src_sd->ctrl_handler,
++ csis->csis_fmt->width,
++ csis->bus.num_data_lanes * 2);
+ if (link_freq < 0) {
+- dev_err(state->dev, "Unable to obtain link frequency: %d\n",
++ dev_err(csis->dev, "Unable to obtain link frequency: %d\n",
+ (int)link_freq);
+ return link_freq;
+ }
+@@ -585,7 +586,7 @@ static int mipi_csis_calculate_params(struct csi_state *state)
+ lane_rate = link_freq * 2;
+
+ if (lane_rate < 80000000 || lane_rate > 1500000000) {
+- dev_dbg(state->dev, "Out-of-bound lane rate %u\n", lane_rate);
++ dev_dbg(csis->dev, "Out-of-bound lane rate %u\n", lane_rate);
+ return -EINVAL;
+ }
+
+@@ -595,57 +596,57 @@ static int mipi_csis_calculate_params(struct csi_state *state)
+ * (which is documented as corresponding to CSI-2 v0.87 to v1.00) until
+ * we figure out how to compute it correctly.
+ */
+- state->hs_settle = (lane_rate - 5000000) / 45000000;
+- state->clk_settle = 0;
++ csis->hs_settle = (lane_rate - 5000000) / 45000000;
++ csis->clk_settle = 0;
+
+- dev_dbg(state->dev, "lane rate %u, Tclk_settle %u, Ths_settle %u\n",
+- lane_rate, state->clk_settle, state->hs_settle);
++ dev_dbg(csis->dev, "lane rate %u, Tclk_settle %u, Ths_settle %u\n",
++ lane_rate, csis->clk_settle, csis->hs_settle);
+
+- if (state->debug.hs_settle < 0xff) {
+- dev_dbg(state->dev, "overriding Ths_settle with %u\n",
+- state->debug.hs_settle);
+- state->hs_settle = state->debug.hs_settle;
++ if (csis->debug.hs_settle < 0xff) {
++ dev_dbg(csis->dev, "overriding Ths_settle with %u\n",
++ csis->debug.hs_settle);
++ csis->hs_settle = csis->debug.hs_settle;
+ }
+
+- if (state->debug.clk_settle < 4) {
+- dev_dbg(state->dev, "overriding Tclk_settle with %u\n",
+- state->debug.clk_settle);
+- state->clk_settle = state->debug.clk_settle;
++ if (csis->debug.clk_settle < 4) {
++ dev_dbg(csis->dev, "overriding Tclk_settle with %u\n",
++ csis->debug.clk_settle);
++ csis->clk_settle = csis->debug.clk_settle;
+ }
+
+ return 0;
+ }
+
+-static void mipi_csis_set_params(struct csi_state *state)
++static void mipi_csis_set_params(struct mipi_csis_device *csis)
+ {
+- int lanes = state->bus.num_data_lanes;
++ int lanes = csis->bus.num_data_lanes;
+ u32 val;
+
+- val = mipi_csis_read(state, MIPI_CSIS_CMN_CTRL);
++ val = mipi_csis_read(csis, MIPI_CSIS_CMN_CTRL);
+ val &= ~MIPI_CSIS_CMN_CTRL_LANE_NR_MASK;
+ val |= (lanes - 1) << MIPI_CSIS_CMN_CTRL_LANE_NR_OFFSET;
+- if (state->info->version == MIPI_CSIS_V3_3)
++ if (csis->info->version == MIPI_CSIS_V3_3)
+ val |= MIPI_CSIS_CMN_CTRL_INTER_MODE;
+- mipi_csis_write(state, MIPI_CSIS_CMN_CTRL, val);
++ mipi_csis_write(csis, MIPI_CSIS_CMN_CTRL, val);
+
+- __mipi_csis_set_format(state);
++ __mipi_csis_set_format(csis);
+
+- mipi_csis_write(state, MIPI_CSIS_DPHY_CMN_CTRL,
+- MIPI_CSIS_DPHY_CMN_CTRL_HSSETTLE(state->hs_settle) |
+- MIPI_CSIS_DPHY_CMN_CTRL_CLKSETTLE(state->clk_settle));
++ mipi_csis_write(csis, MIPI_CSIS_DPHY_CMN_CTRL,
++ MIPI_CSIS_DPHY_CMN_CTRL_HSSETTLE(csis->hs_settle) |
++ MIPI_CSIS_DPHY_CMN_CTRL_CLKSETTLE(csis->clk_settle));
+
+ val = (0 << MIPI_CSIS_ISP_SYNC_HSYNC_LINTV_OFFSET)
+ | (0 << MIPI_CSIS_ISP_SYNC_VSYNC_SINTV_OFFSET)
+ | (0 << MIPI_CSIS_ISP_SYNC_VSYNC_EINTV_OFFSET);
+- mipi_csis_write(state, MIPI_CSIS_ISP_SYNC_CH(0), val);
++ mipi_csis_write(csis, MIPI_CSIS_ISP_SYNC_CH(0), val);
+
+- val = mipi_csis_read(state, MIPI_CSIS_CLK_CTRL);
++ val = mipi_csis_read(csis, MIPI_CSIS_CLK_CTRL);
+ val |= MIPI_CSIS_CLK_CTRL_WCLK_SRC;
+ val |= MIPI_CSIS_CLK_CTRL_CLKGATE_TRAIL_CH0(15);
+ val &= ~MIPI_CSIS_CLK_CTRL_CLKGATE_EN_MSK;
+- mipi_csis_write(state, MIPI_CSIS_CLK_CTRL, val);
++ mipi_csis_write(csis, MIPI_CSIS_CLK_CTRL, val);
+
+- mipi_csis_write(state, MIPI_CSIS_DPHY_BCTRL_L,
++ mipi_csis_write(csis, MIPI_CSIS_DPHY_BCTRL_L,
+ MIPI_CSIS_DPHY_BCTRL_L_BIAS_REF_VOLT_715MV |
+ MIPI_CSIS_DPHY_BCTRL_L_BGR_CHOPPER_FREQ_3MHZ |
+ MIPI_CSIS_DPHY_BCTRL_L_REG_12P_LVL_CTL_1_2V |
+@@ -653,95 +654,95 @@ static void mipi_csis_set_params(struct csi_state *state)
+ MIPI_CSIS_DPHY_BCTRL_L_LP_RX_VREF_LVL_715MV |
+ MIPI_CSIS_DPHY_BCTRL_L_LP_CD_HYS_60MV |
+ MIPI_CSIS_DPHY_BCTRL_L_B_DPHYCTRL(20000000));
+- mipi_csis_write(state, MIPI_CSIS_DPHY_BCTRL_H, 0);
++ mipi_csis_write(csis, MIPI_CSIS_DPHY_BCTRL_H, 0);
+
+ /* Update the shadow register. */
+- val = mipi_csis_read(state, MIPI_CSIS_CMN_CTRL);
+- mipi_csis_write(state, MIPI_CSIS_CMN_CTRL,
++ val = mipi_csis_read(csis, MIPI_CSIS_CMN_CTRL);
++ mipi_csis_write(csis, MIPI_CSIS_CMN_CTRL,
+ val | MIPI_CSIS_CMN_CTRL_UPDATE_SHADOW |
+ MIPI_CSIS_CMN_CTRL_UPDATE_SHADOW_CTRL);
+ }
+
+-static int mipi_csis_clk_enable(struct csi_state *state)
++static int mipi_csis_clk_enable(struct mipi_csis_device *csis)
+ {
+- return clk_bulk_prepare_enable(state->info->num_clocks, state->clks);
++ return clk_bulk_prepare_enable(csis->info->num_clocks, csis->clks);
+ }
+
+-static void mipi_csis_clk_disable(struct csi_state *state)
++static void mipi_csis_clk_disable(struct mipi_csis_device *csis)
+ {
+- clk_bulk_disable_unprepare(state->info->num_clocks, state->clks);
++ clk_bulk_disable_unprepare(csis->info->num_clocks, csis->clks);
+ }
+
+-static int mipi_csis_clk_get(struct csi_state *state)
++static int mipi_csis_clk_get(struct mipi_csis_device *csis)
+ {
+ unsigned int i;
+ int ret;
+
+- state->clks = devm_kcalloc(state->dev, state->info->num_clocks,
+- sizeof(*state->clks), GFP_KERNEL);
++ csis->clks = devm_kcalloc(csis->dev, csis->info->num_clocks,
++ sizeof(*csis->clks), GFP_KERNEL);
+
+- if (!state->clks)
++ if (!csis->clks)
+ return -ENOMEM;
+
+- for (i = 0; i < state->info->num_clocks; i++)
+- state->clks[i].id = mipi_csis_clk_id[i];
++ for (i = 0; i < csis->info->num_clocks; i++)
++ csis->clks[i].id = mipi_csis_clk_id[i];
+
+- ret = devm_clk_bulk_get(state->dev, state->info->num_clocks,
+- state->clks);
++ ret = devm_clk_bulk_get(csis->dev, csis->info->num_clocks,
++ csis->clks);
+ if (ret < 0)
+ return ret;
+
+ /* Set clock rate */
+- ret = clk_set_rate(state->clks[MIPI_CSIS_CLK_WRAP].clk,
+- state->clk_frequency);
++ ret = clk_set_rate(csis->clks[MIPI_CSIS_CLK_WRAP].clk,
++ csis->clk_frequency);
+ if (ret < 0)
+- dev_err(state->dev, "set rate=%d failed: %d\n",
+- state->clk_frequency, ret);
++ dev_err(csis->dev, "set rate=%d failed: %d\n",
++ csis->clk_frequency, ret);
+
+ return ret;
+ }
+
+-static void mipi_csis_start_stream(struct csi_state *state)
++static void mipi_csis_start_stream(struct mipi_csis_device *csis)
+ {
+- mipi_csis_sw_reset(state);
+- mipi_csis_set_params(state);
+- mipi_csis_system_enable(state, true);
+- mipi_csis_enable_interrupts(state, true);
++ mipi_csis_sw_reset(csis);
++ mipi_csis_set_params(csis);
++ mipi_csis_system_enable(csis, true);
++ mipi_csis_enable_interrupts(csis, true);
+ }
+
+-static void mipi_csis_stop_stream(struct csi_state *state)
++static void mipi_csis_stop_stream(struct mipi_csis_device *csis)
+ {
+- mipi_csis_enable_interrupts(state, false);
+- mipi_csis_system_enable(state, false);
++ mipi_csis_enable_interrupts(csis, false);
++ mipi_csis_system_enable(csis, false);
+ }
+
+ static irqreturn_t mipi_csis_irq_handler(int irq, void *dev_id)
+ {
+- struct csi_state *state = dev_id;
++ struct mipi_csis_device *csis = dev_id;
+ unsigned long flags;
+ unsigned int i;
+ u32 status;
+ u32 dbg_status;
+
+- status = mipi_csis_read(state, MIPI_CSIS_INT_SRC);
+- dbg_status = mipi_csis_read(state, MIPI_CSIS_DBG_INTR_SRC);
++ status = mipi_csis_read(csis, MIPI_CSIS_INT_SRC);
++ dbg_status = mipi_csis_read(csis, MIPI_CSIS_DBG_INTR_SRC);
+
+- spin_lock_irqsave(&state->slock, flags);
++ spin_lock_irqsave(&csis->slock, flags);
+
+ /* Update the event/error counters */
+- if ((status & MIPI_CSIS_INT_SRC_ERRORS) || state->debug.enable) {
++ if ((status & MIPI_CSIS_INT_SRC_ERRORS) || csis->debug.enable) {
+ for (i = 0; i < MIPI_CSIS_NUM_EVENTS; i++) {
+- struct mipi_csis_event *event = &state->events[i];
++ struct mipi_csis_event *event = &csis->events[i];
+
+ if ((!event->debug && (status & event->mask)) ||
+ (event->debug && (dbg_status & event->mask)))
+ event->counter++;
+ }
+ }
+- spin_unlock_irqrestore(&state->slock, flags);
++ spin_unlock_irqrestore(&csis->slock, flags);
+
+- mipi_csis_write(state, MIPI_CSIS_INT_SRC, status);
+- mipi_csis_write(state, MIPI_CSIS_DBG_INTR_SRC, dbg_status);
++ mipi_csis_write(csis, MIPI_CSIS_INT_SRC, status);
++ mipi_csis_write(csis, MIPI_CSIS_DBG_INTR_SRC, dbg_status);
+
+ return IRQ_HANDLED;
+ }
+@@ -750,47 +751,47 @@ static irqreturn_t mipi_csis_irq_handler(int irq, void *dev_id)
+ * PHY regulator and reset
+ */
+
+-static int mipi_csis_phy_enable(struct csi_state *state)
++static int mipi_csis_phy_enable(struct mipi_csis_device *csis)
+ {
+- if (state->info->version != MIPI_CSIS_V3_3)
++ if (csis->info->version != MIPI_CSIS_V3_3)
+ return 0;
+
+- return regulator_enable(state->mipi_phy_regulator);
++ return regulator_enable(csis->mipi_phy_regulator);
+ }
+
+-static int mipi_csis_phy_disable(struct csi_state *state)
++static int mipi_csis_phy_disable(struct mipi_csis_device *csis)
+ {
+- if (state->info->version != MIPI_CSIS_V3_3)
++ if (csis->info->version != MIPI_CSIS_V3_3)
+ return 0;
+
+- return regulator_disable(state->mipi_phy_regulator);
++ return regulator_disable(csis->mipi_phy_regulator);
+ }
+
+-static void mipi_csis_phy_reset(struct csi_state *state)
++static void mipi_csis_phy_reset(struct mipi_csis_device *csis)
+ {
+- if (state->info->version != MIPI_CSIS_V3_3)
++ if (csis->info->version != MIPI_CSIS_V3_3)
+ return;
+
+- reset_control_assert(state->mrst);
++ reset_control_assert(csis->mrst);
+ msleep(20);
+- reset_control_deassert(state->mrst);
++ reset_control_deassert(csis->mrst);
+ }
+
+-static int mipi_csis_phy_init(struct csi_state *state)
++static int mipi_csis_phy_init(struct mipi_csis_device *csis)
+ {
+- if (state->info->version != MIPI_CSIS_V3_3)
++ if (csis->info->version != MIPI_CSIS_V3_3)
+ return 0;
+
+ /* Get MIPI PHY reset and regulator. */
+- state->mrst = devm_reset_control_get_exclusive(state->dev, NULL);
+- if (IS_ERR(state->mrst))
+- return PTR_ERR(state->mrst);
++ csis->mrst = devm_reset_control_get_exclusive(csis->dev, NULL);
++ if (IS_ERR(csis->mrst))
++ return PTR_ERR(csis->mrst);
+
+- state->mipi_phy_regulator = devm_regulator_get(state->dev, "phy");
+- if (IS_ERR(state->mipi_phy_regulator))
+- return PTR_ERR(state->mipi_phy_regulator);
++ csis->mipi_phy_regulator = devm_regulator_get(csis->dev, "phy");
++ if (IS_ERR(csis->mipi_phy_regulator))
++ return PTR_ERR(csis->mipi_phy_regulator);
+
+- return regulator_set_voltage(state->mipi_phy_regulator, 1000000,
++ return regulator_set_voltage(csis->mipi_phy_regulator, 1000000,
+ 1000000);
+ }
+
+@@ -798,36 +799,36 @@ static int mipi_csis_phy_init(struct csi_state *state)
+ * Debug
+ */
+
+-static void mipi_csis_clear_counters(struct csi_state *state)
++static void mipi_csis_clear_counters(struct mipi_csis_device *csis)
+ {
+ unsigned long flags;
+ unsigned int i;
+
+- spin_lock_irqsave(&state->slock, flags);
++ spin_lock_irqsave(&csis->slock, flags);
+ for (i = 0; i < MIPI_CSIS_NUM_EVENTS; i++)
+- state->events[i].counter = 0;
+- spin_unlock_irqrestore(&state->slock, flags);
++ csis->events[i].counter = 0;
++ spin_unlock_irqrestore(&csis->slock, flags);
+ }
+
+-static void mipi_csis_log_counters(struct csi_state *state, bool non_errors)
++static void mipi_csis_log_counters(struct mipi_csis_device *csis, bool non_errors)
+ {
+ unsigned int num_events = non_errors ? MIPI_CSIS_NUM_EVENTS
+ : MIPI_CSIS_NUM_EVENTS - 8;
+ unsigned long flags;
+ unsigned int i;
+
+- spin_lock_irqsave(&state->slock, flags);
++ spin_lock_irqsave(&csis->slock, flags);
+
+ for (i = 0; i < num_events; ++i) {
+- if (state->events[i].counter > 0 || state->debug.enable)
+- dev_info(state->dev, "%s events: %d\n",
+- state->events[i].name,
+- state->events[i].counter);
++ if (csis->events[i].counter > 0 || csis->debug.enable)
++ dev_info(csis->dev, "%s events: %d\n",
++ csis->events[i].name,
++ csis->events[i].counter);
+ }
+- spin_unlock_irqrestore(&state->slock, flags);
++ spin_unlock_irqrestore(&csis->slock, flags);
+ }
+
+-static int mipi_csis_dump_regs(struct csi_state *state)
++static int mipi_csis_dump_regs(struct mipi_csis_device *csis)
+ {
+ static const struct {
+ u32 offset;
+@@ -851,11 +852,11 @@ static int mipi_csis_dump_regs(struct csi_state *state)
+ unsigned int i;
+ u32 cfg;
+
+- dev_info(state->dev, "--- REGISTERS ---\n");
++ dev_info(csis->dev, "--- REGISTERS ---\n");
+
+ for (i = 0; i < ARRAY_SIZE(registers); i++) {
+- cfg = mipi_csis_read(state, registers[i].offset);
+- dev_info(state->dev, "%14s: 0x%08x\n", registers[i].name, cfg);
++ cfg = mipi_csis_read(csis, registers[i].offset);
++ dev_info(csis->dev, "%14s: 0x%08x\n", registers[i].name, cfg);
+ }
+
+ return 0;
+@@ -863,123 +864,123 @@ static int mipi_csis_dump_regs(struct csi_state *state)
+
+ static int mipi_csis_dump_regs_show(struct seq_file *m, void *private)
+ {
+- struct csi_state *state = m->private;
++ struct mipi_csis_device *csis = m->private;
+
+- return mipi_csis_dump_regs(state);
++ return mipi_csis_dump_regs(csis);
+ }
+ DEFINE_SHOW_ATTRIBUTE(mipi_csis_dump_regs);
+
+-static void mipi_csis_debugfs_init(struct csi_state *state)
++static void mipi_csis_debugfs_init(struct mipi_csis_device *csis)
+ {
+- state->debug.hs_settle = UINT_MAX;
+- state->debug.clk_settle = UINT_MAX;
++ csis->debug.hs_settle = UINT_MAX;
++ csis->debug.clk_settle = UINT_MAX;
+
+- state->debugfs_root = debugfs_create_dir(dev_name(state->dev), NULL);
++ csis->debugfs_root = debugfs_create_dir(dev_name(csis->dev), NULL);
+
+- debugfs_create_bool("debug_enable", 0600, state->debugfs_root,
+- &state->debug.enable);
+- debugfs_create_file("dump_regs", 0600, state->debugfs_root, state,
++ debugfs_create_bool("debug_enable", 0600, csis->debugfs_root,
++ &csis->debug.enable);
++ debugfs_create_file("dump_regs", 0600, csis->debugfs_root, csis,
+ &mipi_csis_dump_regs_fops);
+- debugfs_create_u32("tclk_settle", 0600, state->debugfs_root,
+- &state->debug.clk_settle);
+- debugfs_create_u32("ths_settle", 0600, state->debugfs_root,
+- &state->debug.hs_settle);
++ debugfs_create_u32("tclk_settle", 0600, csis->debugfs_root,
++ &csis->debug.clk_settle);
++ debugfs_create_u32("ths_settle", 0600, csis->debugfs_root,
++ &csis->debug.hs_settle);
+ }
+
+-static void mipi_csis_debugfs_exit(struct csi_state *state)
++static void mipi_csis_debugfs_exit(struct mipi_csis_device *csis)
+ {
+- debugfs_remove_recursive(state->debugfs_root);
++ debugfs_remove_recursive(csis->debugfs_root);
+ }
+
+ /* -----------------------------------------------------------------------------
+ * V4L2 subdev operations
+ */
+
+-static struct csi_state *mipi_sd_to_csis_state(struct v4l2_subdev *sdev)
++static struct mipi_csis_device *sd_to_mipi_csis_device(struct v4l2_subdev *sdev)
+ {
+- return container_of(sdev, struct csi_state, sd);
++ return container_of(sdev, struct mipi_csis_device, sd);
+ }
+
+ static int mipi_csis_s_stream(struct v4l2_subdev *sd, int enable)
+ {
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+ int ret;
+
+ if (enable) {
+- ret = mipi_csis_calculate_params(state);
++ ret = mipi_csis_calculate_params(csis);
+ if (ret < 0)
+ return ret;
+
+- mipi_csis_clear_counters(state);
++ mipi_csis_clear_counters(csis);
+
+- ret = pm_runtime_resume_and_get(state->dev);
++ ret = pm_runtime_resume_and_get(csis->dev);
+ if (ret < 0)
+ return ret;
+
+- ret = v4l2_subdev_call(state->src_sd, core, s_power, 1);
++ ret = v4l2_subdev_call(csis->src_sd, core, s_power, 1);
+ if (ret < 0 && ret != -ENOIOCTLCMD)
+ goto done;
+ }
+
+- mutex_lock(&state->lock);
++ mutex_lock(&csis->lock);
+
+ if (enable) {
+- if (state->state & ST_SUSPENDED) {
++ if (csis->state & ST_SUSPENDED) {
+ ret = -EBUSY;
+ goto unlock;
+ }
+
+- mipi_csis_start_stream(state);
+- ret = v4l2_subdev_call(state->src_sd, video, s_stream, 1);
++ mipi_csis_start_stream(csis);
++ ret = v4l2_subdev_call(csis->src_sd, video, s_stream, 1);
+ if (ret < 0)
+ goto unlock;
+
+- mipi_csis_log_counters(state, true);
++ mipi_csis_log_counters(csis, true);
+
+- state->state |= ST_STREAMING;
++ csis->state |= ST_STREAMING;
+ } else {
+- v4l2_subdev_call(state->src_sd, video, s_stream, 0);
+- ret = v4l2_subdev_call(state->src_sd, core, s_power, 0);
++ v4l2_subdev_call(csis->src_sd, video, s_stream, 0);
++ ret = v4l2_subdev_call(csis->src_sd, core, s_power, 0);
+ if (ret == -ENOIOCTLCMD)
+ ret = 0;
+- mipi_csis_stop_stream(state);
+- state->state &= ~ST_STREAMING;
+- if (state->debug.enable)
+- mipi_csis_log_counters(state, true);
++ mipi_csis_stop_stream(csis);
++ csis->state &= ~ST_STREAMING;
++ if (csis->debug.enable)
++ mipi_csis_log_counters(csis, true);
+ }
+
+ unlock:
+- mutex_unlock(&state->lock);
++ mutex_unlock(&csis->lock);
+
+ done:
+ if (!enable || ret < 0)
+- pm_runtime_put(state->dev);
++ pm_runtime_put(csis->dev);
+
+ return ret;
+ }
+
+ static struct v4l2_mbus_framefmt *
+-mipi_csis_get_format(struct csi_state *state,
++mipi_csis_get_format(struct mipi_csis_device *csis,
+ struct v4l2_subdev_state *sd_state,
+ enum v4l2_subdev_format_whence which,
+ unsigned int pad)
+ {
+ if (which == V4L2_SUBDEV_FORMAT_TRY)
+- return v4l2_subdev_get_try_format(&state->sd, sd_state, pad);
++ return v4l2_subdev_get_try_format(&csis->sd, sd_state, pad);
+
+- return &state->format_mbus[pad];
++ return &csis->format_mbus[pad];
+ }
+
+ static int mipi_csis_init_cfg(struct v4l2_subdev *sd,
+ struct v4l2_subdev_state *sd_state)
+ {
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+ struct v4l2_mbus_framefmt *fmt_sink;
+ struct v4l2_mbus_framefmt *fmt_source;
+ enum v4l2_subdev_format_whence which;
+
+ which = sd_state ? V4L2_SUBDEV_FORMAT_TRY : V4L2_SUBDEV_FORMAT_ACTIVE;
+- fmt_sink = mipi_csis_get_format(state, sd_state, which, CSIS_PAD_SINK);
++ fmt_sink = mipi_csis_get_format(csis, sd_state, which, CSIS_PAD_SINK);
+
+ fmt_sink->code = MEDIA_BUS_FMT_UYVY8_1X16;
+ fmt_sink->width = MIPI_CSIS_DEF_PIX_WIDTH;
+@@ -993,15 +994,7 @@ static int mipi_csis_init_cfg(struct v4l2_subdev *sd,
+ V4L2_MAP_QUANTIZATION_DEFAULT(false, fmt_sink->colorspace,
+ fmt_sink->ycbcr_enc);
+
+- /*
+- * When called from mipi_csis_subdev_init() to initialize the active
+- * configuration, cfg is NULL, which indicates there's no source pad
+- * configuration to set.
+- */
+- if (!sd_state)
+- return 0;
+-
+- fmt_source = mipi_csis_get_format(state, sd_state, which,
++ fmt_source = mipi_csis_get_format(csis, sd_state, which,
+ CSIS_PAD_SOURCE);
+ *fmt_source = *fmt_sink;
+
+@@ -1012,15 +1005,15 @@ static int mipi_csis_get_fmt(struct v4l2_subdev *sd,
+ struct v4l2_subdev_state *sd_state,
+ struct v4l2_subdev_format *sdformat)
+ {
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+ struct v4l2_mbus_framefmt *fmt;
+
+- fmt = mipi_csis_get_format(state, sd_state, sdformat->which,
++ fmt = mipi_csis_get_format(csis, sd_state, sdformat->which,
+ sdformat->pad);
+
+- mutex_lock(&state->lock);
++ mutex_lock(&csis->lock);
+ sdformat->format = *fmt;
+- mutex_unlock(&state->lock);
++ mutex_unlock(&csis->lock);
+
+ return 0;
+ }
+@@ -1029,7 +1022,7 @@ static int mipi_csis_enum_mbus_code(struct v4l2_subdev *sd,
+ struct v4l2_subdev_state *sd_state,
+ struct v4l2_subdev_mbus_code_enum *code)
+ {
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+
+ /*
+ * The CSIS can't transcode in any way, the source format is identical
+@@ -1041,7 +1034,7 @@ static int mipi_csis_enum_mbus_code(struct v4l2_subdev *sd,
+ if (code->index > 0)
+ return -EINVAL;
+
+- fmt = mipi_csis_get_format(state, sd_state, code->which,
++ fmt = mipi_csis_get_format(csis, sd_state, code->which,
+ code->pad);
+ code->code = fmt->code;
+ return 0;
+@@ -1062,7 +1055,7 @@ static int mipi_csis_set_fmt(struct v4l2_subdev *sd,
+ struct v4l2_subdev_state *sd_state,
+ struct v4l2_subdev_format *sdformat)
+ {
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+ struct csis_pix_format const *csis_fmt;
+ struct v4l2_mbus_framefmt *fmt;
+ unsigned int align;
+@@ -1110,10 +1103,10 @@ static int mipi_csis_set_fmt(struct v4l2_subdev *sd,
+ &sdformat->format.height, 1,
+ CSIS_MAX_PIX_HEIGHT, 0, 0);
+
+- fmt = mipi_csis_get_format(state, sd_state, sdformat->which,
++ fmt = mipi_csis_get_format(csis, sd_state, sdformat->which,
+ sdformat->pad);
+
+- mutex_lock(&state->lock);
++ mutex_lock(&csis->lock);
+
+ fmt->code = csis_fmt->code;
+ fmt->width = sdformat->format.width;
+@@ -1126,7 +1119,7 @@ static int mipi_csis_set_fmt(struct v4l2_subdev *sd,
+ sdformat->format = *fmt;
+
+ /* Propagate the format from sink to source. */
+- fmt = mipi_csis_get_format(state, sd_state, sdformat->which,
++ fmt = mipi_csis_get_format(csis, sd_state, sdformat->which,
+ CSIS_PAD_SOURCE);
+ *fmt = sdformat->format;
+
+@@ -1135,22 +1128,22 @@ static int mipi_csis_set_fmt(struct v4l2_subdev *sd,
+
+ /* Store the CSIS format descriptor for active formats. */
+ if (sdformat->which == V4L2_SUBDEV_FORMAT_ACTIVE)
+- state->csis_fmt = csis_fmt;
++ csis->csis_fmt = csis_fmt;
+
+- mutex_unlock(&state->lock);
++ mutex_unlock(&csis->lock);
+
+ return 0;
+ }
+
+ static int mipi_csis_log_status(struct v4l2_subdev *sd)
+ {
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+
+- mutex_lock(&state->lock);
+- mipi_csis_log_counters(state, true);
+- if (state->debug.enable && (state->state & ST_POWERED))
+- mipi_csis_dump_regs(state);
+- mutex_unlock(&state->lock);
++ mutex_lock(&csis->lock);
++ mipi_csis_log_counters(csis, true);
++ if (csis->debug.enable && (csis->state & ST_POWERED))
++ mipi_csis_dump_regs(csis);
++ mutex_unlock(&csis->lock);
+
+ return 0;
+ }
+@@ -1185,10 +1178,10 @@ static int mipi_csis_link_setup(struct media_entity *entity,
+ const struct media_pad *remote_pad, u32 flags)
+ {
+ struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity);
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+ struct v4l2_subdev *remote_sd;
+
+- dev_dbg(state->dev, "link setup %s -> %s", remote_pad->entity->name,
++ dev_dbg(csis->dev, "link setup %s -> %s", remote_pad->entity->name,
+ local_pad->entity->name);
+
+ /* We only care about the link to the source. */
+@@ -1198,12 +1191,12 @@ static int mipi_csis_link_setup(struct media_entity *entity,
+ remote_sd = media_entity_to_v4l2_subdev(remote_pad->entity);
+
+ if (flags & MEDIA_LNK_FL_ENABLED) {
+- if (state->src_sd)
++ if (csis->src_sd)
+ return -EBUSY;
+
+- state->src_sd = remote_sd;
++ csis->src_sd = remote_sd;
+ } else {
+- state->src_sd = NULL;
++ csis->src_sd = NULL;
+ }
+
+ return 0;
+@@ -1219,18 +1212,18 @@ static const struct media_entity_operations mipi_csis_entity_ops = {
+ * Async subdev notifier
+ */
+
+-static struct csi_state *
++static struct mipi_csis_device *
+ mipi_notifier_to_csis_state(struct v4l2_async_notifier *n)
+ {
+- return container_of(n, struct csi_state, notifier);
++ return container_of(n, struct mipi_csis_device, notifier);
+ }
+
+ static int mipi_csis_notify_bound(struct v4l2_async_notifier *notifier,
+ struct v4l2_subdev *sd,
+ struct v4l2_async_subdev *asd)
+ {
+- struct csi_state *state = mipi_notifier_to_csis_state(notifier);
+- struct media_pad *sink = &state->sd.entity.pads[CSIS_PAD_SINK];
++ struct mipi_csis_device *csis = mipi_notifier_to_csis_state(notifier);
++ struct media_pad *sink = &csis->sd.entity.pads[CSIS_PAD_SINK];
+
+ return v4l2_create_fwnode_links_to_pad(sd, sink, 0);
+ }
+@@ -1239,7 +1232,7 @@ static const struct v4l2_async_notifier_operations mipi_csis_notify_ops = {
+ .bound = mipi_csis_notify_bound,
+ };
+
+-static int mipi_csis_async_register(struct csi_state *state)
++static int mipi_csis_async_register(struct mipi_csis_device *csis)
+ {
+ struct v4l2_fwnode_endpoint vep = {
+ .bus_type = V4L2_MBUS_CSI2_DPHY,
+@@ -1249,9 +1242,9 @@ static int mipi_csis_async_register(struct csi_state *state)
+ unsigned int i;
+ int ret;
+
+- v4l2_async_nf_init(&state->notifier);
++ v4l2_async_nf_init(&csis->notifier);
+
+- ep = fwnode_graph_get_endpoint_by_id(dev_fwnode(state->dev), 0, 0,
++ ep = fwnode_graph_get_endpoint_by_id(dev_fwnode(csis->dev), 0, 0,
+ FWNODE_GRAPH_ENDPOINT_NEXT);
+ if (!ep)
+ return -ENOTCONN;
+@@ -1262,19 +1255,19 @@ static int mipi_csis_async_register(struct csi_state *state)
+
+ for (i = 0; i < vep.bus.mipi_csi2.num_data_lanes; ++i) {
+ if (vep.bus.mipi_csi2.data_lanes[i] != i + 1) {
+- dev_err(state->dev,
++ dev_err(csis->dev,
+ "data lanes reordering is not supported");
+ ret = -EINVAL;
+ goto err_parse;
+ }
+ }
+
+- state->bus = vep.bus.mipi_csi2;
++ csis->bus = vep.bus.mipi_csi2;
+
+- dev_dbg(state->dev, "data lanes: %d\n", state->bus.num_data_lanes);
+- dev_dbg(state->dev, "flags: 0x%08x\n", state->bus.flags);
++ dev_dbg(csis->dev, "data lanes: %d\n", csis->bus.num_data_lanes);
++ dev_dbg(csis->dev, "flags: 0x%08x\n", csis->bus.flags);
+
+- asd = v4l2_async_nf_add_fwnode_remote(&state->notifier, ep,
++ asd = v4l2_async_nf_add_fwnode_remote(&csis->notifier, ep,
+ struct v4l2_async_subdev);
+ if (IS_ERR(asd)) {
+ ret = PTR_ERR(asd);
+@@ -1283,13 +1276,13 @@ static int mipi_csis_async_register(struct csi_state *state)
+
+ fwnode_handle_put(ep);
+
+- state->notifier.ops = &mipi_csis_notify_ops;
++ csis->notifier.ops = &mipi_csis_notify_ops;
+
+- ret = v4l2_async_subdev_nf_register(&state->sd, &state->notifier);
++ ret = v4l2_async_subdev_nf_register(&csis->sd, &csis->notifier);
+ if (ret)
+ return ret;
+
+- return v4l2_async_register_subdev(&state->sd);
++ return v4l2_async_register_subdev(&csis->sd);
+
+ err_parse:
+ fwnode_handle_put(ep);
+@@ -1304,23 +1297,23 @@ err_parse:
+ static int mipi_csis_pm_suspend(struct device *dev, bool runtime)
+ {
+ struct v4l2_subdev *sd = dev_get_drvdata(dev);
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+ int ret = 0;
+
+- mutex_lock(&state->lock);
+- if (state->state & ST_POWERED) {
+- mipi_csis_stop_stream(state);
+- ret = mipi_csis_phy_disable(state);
++ mutex_lock(&csis->lock);
++ if (csis->state & ST_POWERED) {
++ mipi_csis_stop_stream(csis);
++ ret = mipi_csis_phy_disable(csis);
+ if (ret)
+ goto unlock;
+- mipi_csis_clk_disable(state);
+- state->state &= ~ST_POWERED;
++ mipi_csis_clk_disable(csis);
++ csis->state &= ~ST_POWERED;
+ if (!runtime)
+- state->state |= ST_SUSPENDED;
++ csis->state |= ST_SUSPENDED;
+ }
+
+ unlock:
+- mutex_unlock(&state->lock);
++ mutex_unlock(&csis->lock);
+
+ return ret ? -EAGAIN : 0;
+ }
+@@ -1328,28 +1321,28 @@ unlock:
+ static int mipi_csis_pm_resume(struct device *dev, bool runtime)
+ {
+ struct v4l2_subdev *sd = dev_get_drvdata(dev);
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+ int ret = 0;
+
+- mutex_lock(&state->lock);
+- if (!runtime && !(state->state & ST_SUSPENDED))
++ mutex_lock(&csis->lock);
++ if (!runtime && !(csis->state & ST_SUSPENDED))
+ goto unlock;
+
+- if (!(state->state & ST_POWERED)) {
+- ret = mipi_csis_phy_enable(state);
++ if (!(csis->state & ST_POWERED)) {
++ ret = mipi_csis_phy_enable(csis);
+ if (ret)
+ goto unlock;
+
+- state->state |= ST_POWERED;
+- mipi_csis_clk_enable(state);
++ csis->state |= ST_POWERED;
++ mipi_csis_clk_enable(csis);
+ }
+- if (state->state & ST_STREAMING)
+- mipi_csis_start_stream(state);
++ if (csis->state & ST_STREAMING)
++ mipi_csis_start_stream(csis);
+
+- state->state &= ~ST_SUSPENDED;
++ csis->state &= ~ST_SUSPENDED;
+
+ unlock:
+- mutex_unlock(&state->lock);
++ mutex_unlock(&csis->lock);
+
+ return ret ? -EAGAIN : 0;
+ }
+@@ -1384,14 +1377,14 @@ static const struct dev_pm_ops mipi_csis_pm_ops = {
+ * Probe/remove & platform driver
+ */
+
+-static int mipi_csis_subdev_init(struct csi_state *state)
++static int mipi_csis_subdev_init(struct mipi_csis_device *csis)
+ {
+- struct v4l2_subdev *sd = &state->sd;
++ struct v4l2_subdev *sd = &csis->sd;
+
+ v4l2_subdev_init(sd, &mipi_csis_subdev_ops);
+ sd->owner = THIS_MODULE;
+ snprintf(sd->name, sizeof(sd->name), "csis-%s",
+- dev_name(state->dev));
++ dev_name(csis->dev));
+
+ sd->flags |= V4L2_SUBDEV_FL_HAS_DEVNODE;
+ sd->ctrl_handler = NULL;
+@@ -1399,26 +1392,26 @@ static int mipi_csis_subdev_init(struct csi_state *state)
+ sd->entity.function = MEDIA_ENT_F_VID_IF_BRIDGE;
+ sd->entity.ops = &mipi_csis_entity_ops;
+
+- sd->dev = state->dev;
++ sd->dev = csis->dev;
+
+- state->csis_fmt = &mipi_csis_formats[0];
++ csis->csis_fmt = &mipi_csis_formats[0];
+ mipi_csis_init_cfg(sd, NULL);
+
+- state->pads[CSIS_PAD_SINK].flags = MEDIA_PAD_FL_SINK
++ csis->pads[CSIS_PAD_SINK].flags = MEDIA_PAD_FL_SINK
+ | MEDIA_PAD_FL_MUST_CONNECT;
+- state->pads[CSIS_PAD_SOURCE].flags = MEDIA_PAD_FL_SOURCE
++ csis->pads[CSIS_PAD_SOURCE].flags = MEDIA_PAD_FL_SOURCE
+ | MEDIA_PAD_FL_MUST_CONNECT;
+ return media_entity_pads_init(&sd->entity, CSIS_PADS_NUM,
+- state->pads);
++ csis->pads);
+ }
+
+-static int mipi_csis_parse_dt(struct csi_state *state)
++static int mipi_csis_parse_dt(struct mipi_csis_device *csis)
+ {
+- struct device_node *node = state->dev->of_node;
++ struct device_node *node = csis->dev->of_node;
+
+ if (of_property_read_u32(node, "clock-frequency",
+- &state->clk_frequency))
+- state->clk_frequency = DEFAULT_SCLK_CSIS_FREQ;
++ &csis->clk_frequency))
++ csis->clk_frequency = DEFAULT_SCLK_CSIS_FREQ;
+
+ return 0;
+ }
+@@ -1426,78 +1419,78 @@ static int mipi_csis_parse_dt(struct csi_state *state)
+ static int mipi_csis_probe(struct platform_device *pdev)
+ {
+ struct device *dev = &pdev->dev;
+- struct csi_state *state;
++ struct mipi_csis_device *csis;
+ int irq;
+ int ret;
+
+- state = devm_kzalloc(dev, sizeof(*state), GFP_KERNEL);
+- if (!state)
++ csis = devm_kzalloc(dev, sizeof(*csis), GFP_KERNEL);
++ if (!csis)
+ return -ENOMEM;
+
+- mutex_init(&state->lock);
+- spin_lock_init(&state->slock);
++ mutex_init(&csis->lock);
++ spin_lock_init(&csis->slock);
+
+- state->dev = dev;
+- state->info = of_device_get_match_data(dev);
++ csis->dev = dev;
++ csis->info = of_device_get_match_data(dev);
+
+- memcpy(state->events, mipi_csis_events, sizeof(state->events));
++ memcpy(csis->events, mipi_csis_events, sizeof(csis->events));
+
+ /* Parse DT properties. */
+- ret = mipi_csis_parse_dt(state);
++ ret = mipi_csis_parse_dt(csis);
+ if (ret < 0) {
+ dev_err(dev, "Failed to parse device tree: %d\n", ret);
+ return ret;
+ }
+
+ /* Acquire resources. */
+- state->regs = devm_platform_ioremap_resource(pdev, 0);
+- if (IS_ERR(state->regs))
+- return PTR_ERR(state->regs);
++ csis->regs = devm_platform_ioremap_resource(pdev, 0);
++ if (IS_ERR(csis->regs))
++ return PTR_ERR(csis->regs);
+
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0)
+ return irq;
+
+- ret = mipi_csis_phy_init(state);
++ ret = mipi_csis_phy_init(csis);
+ if (ret < 0)
+ return ret;
+
+- ret = mipi_csis_clk_get(state);
++ ret = mipi_csis_clk_get(csis);
+ if (ret < 0)
+ return ret;
+
+ /* Reset PHY and enable the clocks. */
+- mipi_csis_phy_reset(state);
++ mipi_csis_phy_reset(csis);
+
+- ret = mipi_csis_clk_enable(state);
++ ret = mipi_csis_clk_enable(csis);
+ if (ret < 0) {
+- dev_err(state->dev, "failed to enable clocks: %d\n", ret);
++ dev_err(csis->dev, "failed to enable clocks: %d\n", ret);
+ return ret;
+ }
+
+ /* Now that the hardware is initialized, request the interrupt. */
+ ret = devm_request_irq(dev, irq, mipi_csis_irq_handler, 0,
+- dev_name(dev), state);
++ dev_name(dev), csis);
+ if (ret) {
+ dev_err(dev, "Interrupt request failed\n");
+ goto disable_clock;
+ }
+
+ /* Initialize and register the subdev. */
+- ret = mipi_csis_subdev_init(state);
++ ret = mipi_csis_subdev_init(csis);
+ if (ret < 0)
+ goto disable_clock;
+
+- platform_set_drvdata(pdev, &state->sd);
++ platform_set_drvdata(pdev, &csis->sd);
+
+- ret = mipi_csis_async_register(state);
++ ret = mipi_csis_async_register(csis);
+ if (ret < 0) {
+ dev_err(dev, "async register failed: %d\n", ret);
+ goto cleanup;
+ }
+
+ /* Initialize debugfs. */
+- mipi_csis_debugfs_init(state);
++ mipi_csis_debugfs_init(csis);
+
+ /* Enable runtime PM. */
+ pm_runtime_enable(dev);
+@@ -1508,20 +1501,20 @@ static int mipi_csis_probe(struct platform_device *pdev)
+ }
+
+ dev_info(dev, "lanes: %d, freq: %u\n",
+- state->bus.num_data_lanes, state->clk_frequency);
++ csis->bus.num_data_lanes, csis->clk_frequency);
+
+ return 0;
+
+ unregister_all:
+- mipi_csis_debugfs_exit(state);
++ mipi_csis_debugfs_exit(csis);
+ cleanup:
+- media_entity_cleanup(&state->sd.entity);
+- v4l2_async_nf_unregister(&state->notifier);
+- v4l2_async_nf_cleanup(&state->notifier);
+- v4l2_async_unregister_subdev(&state->sd);
++ media_entity_cleanup(&csis->sd.entity);
++ v4l2_async_nf_unregister(&csis->notifier);
++ v4l2_async_nf_cleanup(&csis->notifier);
++ v4l2_async_unregister_subdev(&csis->sd);
+ disable_clock:
+- mipi_csis_clk_disable(state);
+- mutex_destroy(&state->lock);
++ mipi_csis_clk_disable(csis);
++ mutex_destroy(&csis->lock);
+
+ return ret;
+ }
+@@ -1529,18 +1522,18 @@ disable_clock:
+ static int mipi_csis_remove(struct platform_device *pdev)
+ {
+ struct v4l2_subdev *sd = platform_get_drvdata(pdev);
+- struct csi_state *state = mipi_sd_to_csis_state(sd);
++ struct mipi_csis_device *csis = sd_to_mipi_csis_device(sd);
+
+- mipi_csis_debugfs_exit(state);
+- v4l2_async_nf_unregister(&state->notifier);
+- v4l2_async_nf_cleanup(&state->notifier);
+- v4l2_async_unregister_subdev(&state->sd);
++ mipi_csis_debugfs_exit(csis);
++ v4l2_async_nf_unregister(&csis->notifier);
++ v4l2_async_nf_cleanup(&csis->notifier);
++ v4l2_async_unregister_subdev(&csis->sd);
+
+ pm_runtime_disable(&pdev->dev);
+ mipi_csis_pm_suspend(&pdev->dev, true);
+- mipi_csis_clk_disable(state);
+- media_entity_cleanup(&state->sd.entity);
+- mutex_destroy(&state->lock);
++ mipi_csis_clk_disable(csis);
++ media_entity_cleanup(&csis->sd.entity);
++ mutex_destroy(&csis->lock);
+ pm_runtime_set_suspended(&pdev->dev);
+
+ return 0;
+diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
+index 0bca95d016507..fa01edd54c038 100644
+--- a/drivers/media/platform/qcom/venus/helpers.c
++++ b/drivers/media/platform/qcom/venus/helpers.c
+@@ -90,12 +90,28 @@ bool venus_helper_check_codec(struct venus_inst *inst, u32 v4l2_pixfmt)
+ }
+ EXPORT_SYMBOL_GPL(venus_helper_check_codec);
+
++static void free_dpb_buf(struct venus_inst *inst, struct intbuf *buf)
++{
++ ida_free(&inst->dpb_ids, buf->dpb_out_tag);
++
++ list_del_init(&buf->list);
++ dma_free_attrs(inst->core->dev, buf->size, buf->va, buf->da,
++ buf->attrs);
++ kfree(buf);
++}
++
+ int venus_helper_queue_dpb_bufs(struct venus_inst *inst)
+ {
+- struct intbuf *buf;
++ struct intbuf *buf, *next;
++ unsigned int dpb_size = 0;
+ int ret = 0;
+
+- list_for_each_entry(buf, &inst->dpbbufs, list) {
++ if (inst->dpb_buftype == HFI_BUFFER_OUTPUT)
++ dpb_size = inst->output_buf_size;
++ else if (inst->dpb_buftype == HFI_BUFFER_OUTPUT2)
++ dpb_size = inst->output2_buf_size;
++
++ list_for_each_entry_safe(buf, next, &inst->dpbbufs, list) {
+ struct hfi_frame_data fdata;
+
+ memset(&fdata, 0, sizeof(fdata));
+@@ -106,6 +122,12 @@ int venus_helper_queue_dpb_bufs(struct venus_inst *inst)
+ if (buf->owned_by == FIRMWARE)
+ continue;
+
++ /* free buffer from previous sequence which was released later */
++ if (dpb_size > buf->size) {
++ free_dpb_buf(inst, buf);
++ continue;
++ }
++
+ fdata.clnt_data = buf->dpb_out_tag;
+
+ ret = hfi_session_process_buf(inst, &fdata);
+@@ -127,13 +149,7 @@ int venus_helper_free_dpb_bufs(struct venus_inst *inst)
+ list_for_each_entry_safe(buf, n, &inst->dpbbufs, list) {
+ if (buf->owned_by == FIRMWARE)
+ continue;
+-
+- ida_free(&inst->dpb_ids, buf->dpb_out_tag);
+-
+- list_del_init(&buf->list);
+- dma_free_attrs(inst->core->dev, buf->size, buf->va, buf->da,
+- buf->attrs);
+- kfree(buf);
++ free_dpb_buf(inst, buf);
+ }
+
+ if (list_empty(&inst->dpbbufs))
+diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
+index 4e2151fb47f08..1968f09ad177a 100644
+--- a/drivers/media/platform/qcom/venus/hfi.c
++++ b/drivers/media/platform/qcom/venus/hfi.c
+@@ -104,6 +104,9 @@ int hfi_core_deinit(struct venus_core *core, bool blocking)
+ mutex_lock(&core->lock);
+ }
+
++ if (!core->ops)
++ goto unlock;
++
+ ret = core->ops->core_deinit(core);
+
+ if (!ret)
+diff --git a/drivers/media/platform/renesas/vsp1/vsp1_rpf.c b/drivers/media/platform/renesas/vsp1/vsp1_rpf.c
+index 85587c1b6a373..75083cb234fe3 100644
+--- a/drivers/media/platform/renesas/vsp1/vsp1_rpf.c
++++ b/drivers/media/platform/renesas/vsp1/vsp1_rpf.c
+@@ -291,11 +291,11 @@ static void rpf_configure_partition(struct vsp1_entity *entity,
+ + crop.left * fmtinfo->bpp[0] / 8;
+
+ if (format->num_planes > 1) {
++ unsigned int bpl = format->plane_fmt[1].bytesperline;
+ unsigned int offset;
+
+- offset = crop.top * format->plane_fmt[1].bytesperline
+- + crop.left / fmtinfo->hsub
+- * fmtinfo->bpp[1] / 8;
++ offset = crop.top / fmtinfo->vsub * bpl
++ + crop.left / fmtinfo->hsub * fmtinfo->bpp[1] / 8;
+ mem.addr[1] += offset;
+ mem.addr[2] += offset;
+ }
+diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
+index 3d3d1062e2122..2f8df74ad0fde 100644
+--- a/drivers/media/platform/rockchip/rga/rga.c
++++ b/drivers/media/platform/rockchip/rga/rga.c
+@@ -865,7 +865,7 @@ static int rga_probe(struct platform_device *pdev)
+
+ ret = pm_runtime_resume_and_get(rga->dev);
+ if (ret < 0)
+- goto rel_vdev;
++ goto rel_m2m;
+
+ rga->version.major = (rga_read(rga, RGA_VERSION_INFO) >> 24) & 0xFF;
+ rga->version.minor = (rga_read(rga, RGA_VERSION_INFO) >> 20) & 0x0F;
+@@ -881,7 +881,7 @@ static int rga_probe(struct platform_device *pdev)
+ DMA_ATTR_WRITE_COMBINE);
+ if (!rga->cmdbuf_virt) {
+ ret = -ENOMEM;
+- goto rel_vdev;
++ goto rel_m2m;
+ }
+
+ rga->src_mmu_pages =
+@@ -918,6 +918,8 @@ free_src_pages:
+ free_dma:
+ dma_free_attrs(rga->dev, RGA_CMDBUF_SIZE, rga->cmdbuf_virt,
+ rga->cmdbuf_phy, DMA_ATTR_WRITE_COMBINE);
++rel_m2m:
++ v4l2_m2m_release(rga->m2m_dev);
+ rel_vdev:
+ video_device_release(vfd);
+ unreg_v4l2_dev:
+diff --git a/drivers/media/platform/samsung/exynos4-is/fimc-is.c b/drivers/media/platform/samsung/exynos4-is/fimc-is.c
+index e55e411038f48..e3072d69c49fa 100644
+--- a/drivers/media/platform/samsung/exynos4-is/fimc-is.c
++++ b/drivers/media/platform/samsung/exynos4-is/fimc-is.c
+@@ -140,7 +140,7 @@ static int fimc_is_enable_clocks(struct fimc_is *is)
+ dev_err(&is->pdev->dev, "clock %s enable failed\n",
+ fimc_is_clocks[i]);
+ for (--i; i >= 0; i--)
+- clk_disable(is->clocks[i]);
++ clk_disable_unprepare(is->clocks[i]);
+ return ret;
+ }
+ pr_debug("enabled clock: %s\n", fimc_is_clocks[i]);
+@@ -830,7 +830,7 @@ static int fimc_is_probe(struct platform_device *pdev)
+
+ ret = pm_runtime_resume_and_get(dev);
+ if (ret < 0)
+- goto err_irq;
++ goto err_pm_disable;
+
+ vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
+
+@@ -864,6 +864,8 @@ err_pm:
+ pm_runtime_put_noidle(dev);
+ if (!pm_runtime_enabled(dev))
+ fimc_is_runtime_suspend(dev);
++err_pm_disable:
++ pm_runtime_disable(dev);
+ err_irq:
+ free_irq(is->irq, is);
+ err_clk:
+diff --git a/drivers/media/platform/samsung/exynos4-is/fimc-isp-video.h b/drivers/media/platform/samsung/exynos4-is/fimc-isp-video.h
+index edcb3a5e3cb90..2dd4ddbc748a1 100644
+--- a/drivers/media/platform/samsung/exynos4-is/fimc-isp-video.h
++++ b/drivers/media/platform/samsung/exynos4-is/fimc-isp-video.h
+@@ -32,7 +32,7 @@ static inline int fimc_isp_video_device_register(struct fimc_isp *isp,
+ return 0;
+ }
+
+-void fimc_isp_video_device_unregister(struct fimc_isp *isp,
++static inline void fimc_isp_video_device_unregister(struct fimc_isp *isp,
+ enum v4l2_buf_type type)
+ {
+ }
+diff --git a/drivers/media/platform/st/sti/delta/delta-v4l2.c b/drivers/media/platform/st/sti/delta/delta-v4l2.c
+index c887a31ebb540..420ad4d8df5d5 100644
+--- a/drivers/media/platform/st/sti/delta/delta-v4l2.c
++++ b/drivers/media/platform/st/sti/delta/delta-v4l2.c
+@@ -1859,7 +1859,7 @@ static int delta_probe(struct platform_device *pdev)
+ if (ret) {
+ dev_err(delta->dev, "%s failed to initialize firmware ipc channel\n",
+ DELTA_PREFIX);
+- goto err;
++ goto err_pm_disable;
+ }
+
+ /* register all available decoders */
+@@ -1873,7 +1873,7 @@ static int delta_probe(struct platform_device *pdev)
+ if (ret) {
+ dev_err(delta->dev, "%s failed to register V4L2 device\n",
+ DELTA_PREFIX);
+- goto err;
++ goto err_pm_disable;
+ }
+
+ delta->work_queue = create_workqueue(DELTA_NAME);
+@@ -1898,6 +1898,8 @@ err_work_queue:
+ destroy_workqueue(delta->work_queue);
+ err_v4l2:
+ v4l2_device_unregister(&delta->v4l2_dev);
++err_pm_disable:
++ pm_runtime_disable(dev);
+ err:
+ return ret;
+ }
+diff --git a/drivers/media/radio/Kconfig b/drivers/media/radio/Kconfig
+index cca03bd2cc426..616a38feb641e 100644
+--- a/drivers/media/radio/Kconfig
++++ b/drivers/media/radio/Kconfig
+@@ -4,10 +4,10 @@
+ #
+
+ menuconfig RADIO_ADAPTERS
+- bool "Radio Adapters"
++ tristate "Radio Adapters"
+ depends on VIDEO_DEV
+ depends on MEDIA_RADIO_SUPPORT
+- default y
++ default VIDEO_DEV
+ help
+ Say Y here to enable selecting AM/FM radio adapters.
+
+diff --git a/drivers/media/rc/bpf-lirc.c b/drivers/media/rc/bpf-lirc.c
+index 3eff08d7b8e5c..fe17c7f98e810 100644
+--- a/drivers/media/rc/bpf-lirc.c
++++ b/drivers/media/rc/bpf-lirc.c
+@@ -216,8 +216,12 @@ void lirc_bpf_run(struct rc_dev *rcdev, u32 sample)
+
+ raw->bpf_sample = sample;
+
+- if (raw->progs)
+- BPF_PROG_RUN_ARRAY(raw->progs, &raw->bpf_sample, bpf_prog_run);
++ if (raw->progs) {
++ rcu_read_lock();
++ bpf_prog_run_array(rcu_dereference(raw->progs),
++ &raw->bpf_sample, bpf_prog_run);
++ rcu_read_unlock();
++ }
+ }
+
+ /*
+diff --git a/drivers/media/rc/imon.c b/drivers/media/rc/imon.c
+index 54da6f60079ba..ab090663f975f 100644
+--- a/drivers/media/rc/imon.c
++++ b/drivers/media/rc/imon.c
+@@ -153,6 +153,24 @@ struct imon_context {
+ const struct imon_usb_dev_descr *dev_descr;
+ /* device description with key */
+ /* table for front panels */
++ /*
++ * Fields for deferring free_imon_context().
++ *
++ * Since reference to "struct imon_context" is stored into
++ * "struct file"->private_data, we need to remember
++ * how many file descriptors might access this "struct imon_context".
++ */
++ refcount_t users;
++ /*
++ * Use a flag for telling display_open()/vfd_write()/lcd_write() that
++ * imon_disconnect() was already called.
++ */
++ bool disconnected;
++ /*
++ * We need to wait for RCU grace period in order to allow
++ * display_open() to safely check ->disconnected and increment ->users.
++ */
++ struct rcu_head rcu;
+ };
+
+ #define TOUCH_TIMEOUT (HZ/30)
+@@ -160,18 +178,18 @@ struct imon_context {
+ /* vfd character device file operations */
+ static const struct file_operations vfd_fops = {
+ .owner = THIS_MODULE,
+- .open = &display_open,
+- .write = &vfd_write,
+- .release = &display_close,
++ .open = display_open,
++ .write = vfd_write,
++ .release = display_close,
+ .llseek = noop_llseek,
+ };
+
+ /* lcd character device file operations */
+ static const struct file_operations lcd_fops = {
+ .owner = THIS_MODULE,
+- .open = &display_open,
+- .write = &lcd_write,
+- .release = &display_close,
++ .open = display_open,
++ .write = lcd_write,
++ .release = display_close,
+ .llseek = noop_llseek,
+ };
+
+@@ -439,9 +457,6 @@ static struct usb_driver imon_driver = {
+ .id_table = imon_usb_id_table,
+ };
+
+-/* to prevent races between open() and disconnect(), probing, etc */
+-static DEFINE_MUTEX(driver_lock);
+-
+ /* Module bookkeeping bits */
+ MODULE_AUTHOR(MOD_AUTHOR);
+ MODULE_DESCRIPTION(MOD_DESC);
+@@ -481,9 +496,11 @@ static void free_imon_context(struct imon_context *ictx)
+ struct device *dev = ictx->dev;
+
+ usb_free_urb(ictx->tx_urb);
++ WARN_ON(ictx->dev_present_intf0);
+ usb_free_urb(ictx->rx_urb_intf0);
++ WARN_ON(ictx->dev_present_intf1);
+ usb_free_urb(ictx->rx_urb_intf1);
+- kfree(ictx);
++ kfree_rcu(ictx, rcu);
+
+ dev_dbg(dev, "%s: iMON context freed\n", __func__);
+ }
+@@ -499,9 +516,6 @@ static int display_open(struct inode *inode, struct file *file)
+ int subminor;
+ int retval = 0;
+
+- /* prevent races with disconnect */
+- mutex_lock(&driver_lock);
+-
+ subminor = iminor(inode);
+ interface = usb_find_interface(&imon_driver, subminor);
+ if (!interface) {
+@@ -509,13 +523,16 @@ static int display_open(struct inode *inode, struct file *file)
+ retval = -ENODEV;
+ goto exit;
+ }
+- ictx = usb_get_intfdata(interface);
+
+- if (!ictx) {
++ rcu_read_lock();
++ ictx = usb_get_intfdata(interface);
++ if (!ictx || ictx->disconnected || !refcount_inc_not_zero(&ictx->users)) {
++ rcu_read_unlock();
+ pr_err("no context found for minor %d\n", subminor);
+ retval = -ENODEV;
+ goto exit;
+ }
++ rcu_read_unlock();
+
+ mutex_lock(&ictx->lock);
+
+@@ -533,8 +550,10 @@ static int display_open(struct inode *inode, struct file *file)
+
+ mutex_unlock(&ictx->lock);
+
++ if (retval && refcount_dec_and_test(&ictx->users))
++ free_imon_context(ictx);
++
+ exit:
+- mutex_unlock(&driver_lock);
+ return retval;
+ }
+
+@@ -544,16 +563,9 @@ exit:
+ */
+ static int display_close(struct inode *inode, struct file *file)
+ {
+- struct imon_context *ictx = NULL;
++ struct imon_context *ictx = file->private_data;
+ int retval = 0;
+
+- ictx = file->private_data;
+-
+- if (!ictx) {
+- pr_err("no context for device\n");
+- return -ENODEV;
+- }
+-
+ mutex_lock(&ictx->lock);
+
+ if (!ictx->display_supported) {
+@@ -568,6 +580,8 @@ static int display_close(struct inode *inode, struct file *file)
+ }
+
+ mutex_unlock(&ictx->lock);
++ if (refcount_dec_and_test(&ictx->users))
++ free_imon_context(ictx);
+ return retval;
+ }
+
+@@ -934,15 +948,12 @@ static ssize_t vfd_write(struct file *file, const char __user *buf,
+ int offset;
+ int seq;
+ int retval = 0;
+- struct imon_context *ictx;
++ struct imon_context *ictx = file->private_data;
+ static const unsigned char vfd_packet6[] = {
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF };
+
+- ictx = file->private_data;
+- if (!ictx) {
+- pr_err_ratelimited("no context for device\n");
++ if (ictx->disconnected)
+ return -ENODEV;
+- }
+
+ mutex_lock(&ictx->lock);
+
+@@ -1018,13 +1029,10 @@ static ssize_t lcd_write(struct file *file, const char __user *buf,
+ size_t n_bytes, loff_t *pos)
+ {
+ int retval = 0;
+- struct imon_context *ictx;
++ struct imon_context *ictx = file->private_data;
+
+- ictx = file->private_data;
+- if (!ictx) {
+- pr_err_ratelimited("no context for device\n");
++ if (ictx->disconnected)
+ return -ENODEV;
+- }
+
+ mutex_lock(&ictx->lock);
+
+@@ -2404,7 +2412,6 @@ static int imon_probe(struct usb_interface *interface,
+ int ifnum, sysfs_err;
+ int ret = 0;
+ struct imon_context *ictx = NULL;
+- struct imon_context *first_if_ctx = NULL;
+ u16 vendor, product;
+
+ usbdev = usb_get_dev(interface_to_usbdev(interface));
+@@ -2416,17 +2423,12 @@ static int imon_probe(struct usb_interface *interface,
+ dev_dbg(dev, "%s: found iMON device (%04x:%04x, intf%d)\n",
+ __func__, vendor, product, ifnum);
+
+- /* prevent races probing devices w/multiple interfaces */
+- mutex_lock(&driver_lock);
+-
+ first_if = usb_ifnum_to_if(usbdev, 0);
+ if (!first_if) {
+ ret = -ENODEV;
+ goto fail;
+ }
+
+- first_if_ctx = usb_get_intfdata(first_if);
+-
+ if (ifnum == 0) {
+ ictx = imon_init_intf0(interface, id);
+ if (!ictx) {
+@@ -2434,9 +2436,11 @@ static int imon_probe(struct usb_interface *interface,
+ ret = -ENODEV;
+ goto fail;
+ }
++ refcount_set(&ictx->users, 1);
+
+ } else {
+ /* this is the secondary interface on the device */
++ struct imon_context *first_if_ctx = usb_get_intfdata(first_if);
+
+ /* fail early if first intf failed to register */
+ if (!first_if_ctx) {
+@@ -2450,14 +2454,13 @@ static int imon_probe(struct usb_interface *interface,
+ ret = -ENODEV;
+ goto fail;
+ }
++ refcount_inc(&ictx->users);
+
+ }
+
+ usb_set_intfdata(interface, ictx);
+
+ if (ifnum == 0) {
+- mutex_lock(&ictx->lock);
+-
+ if (product == 0xffdc && ictx->rf_device) {
+ sysfs_err = sysfs_create_group(&interface->dev.kobj,
+ &imon_rf_attr_group);
+@@ -2468,21 +2471,17 @@ static int imon_probe(struct usb_interface *interface,
+
+ if (ictx->display_supported)
+ imon_init_display(ictx, interface);
+-
+- mutex_unlock(&ictx->lock);
+ }
+
+ dev_info(dev, "iMON device (%04x:%04x, intf%d) on usb<%d:%d> initialized\n",
+ vendor, product, ifnum,
+ usbdev->bus->busnum, usbdev->devnum);
+
+- mutex_unlock(&driver_lock);
+ usb_put_dev(usbdev);
+
+ return 0;
+
+ fail:
+- mutex_unlock(&driver_lock);
+ usb_put_dev(usbdev);
+ dev_err(dev, "unable to register, err %d\n", ret);
+
+@@ -2498,10 +2497,8 @@ static void imon_disconnect(struct usb_interface *interface)
+ struct device *dev;
+ int ifnum;
+
+- /* prevent races with multi-interface device probing and display_open */
+- mutex_lock(&driver_lock);
+-
+ ictx = usb_get_intfdata(interface);
++ ictx->disconnected = true;
+ dev = ictx->dev;
+ ifnum = interface->cur_altsetting->desc.bInterfaceNumber;
+
+@@ -2542,11 +2539,9 @@ static void imon_disconnect(struct usb_interface *interface)
+ }
+ }
+
+- if (!ictx->dev_present_intf0 && !ictx->dev_present_intf1)
++ if (refcount_dec_and_test(&ictx->users))
+ free_imon_context(ictx);
+
+- mutex_unlock(&driver_lock);
+-
+ dev_dbg(dev, "%s: iMON device (intf%d) disconnected\n",
+ __func__, ifnum);
+ }
+diff --git a/drivers/media/usb/pvrusb2/pvrusb2-hdw.c b/drivers/media/usb/pvrusb2/pvrusb2-hdw.c
+index cd7b118d59290..a9666373af6b9 100644
+--- a/drivers/media/usb/pvrusb2/pvrusb2-hdw.c
++++ b/drivers/media/usb/pvrusb2/pvrusb2-hdw.c
+@@ -2569,6 +2569,11 @@ struct pvr2_hdw *pvr2_hdw_create(struct usb_interface *intf,
+ } while (0);
+ mutex_unlock(&pvr2_unit_mtx);
+
++ INIT_WORK(&hdw->workpoll, pvr2_hdw_worker_poll);
++
++ if (hdw->unit_number == -1)
++ goto fail;
++
+ cnt1 = 0;
+ cnt2 = scnprintf(hdw->name+cnt1,sizeof(hdw->name)-cnt1,"pvrusb2");
+ cnt1 += cnt2;
+@@ -2580,8 +2585,6 @@ struct pvr2_hdw *pvr2_hdw_create(struct usb_interface *intf,
+ if (cnt1 >= sizeof(hdw->name)) cnt1 = sizeof(hdw->name)-1;
+ hdw->name[cnt1] = 0;
+
+- INIT_WORK(&hdw->workpoll,pvr2_hdw_worker_poll);
+-
+ pvr2_trace(PVR2_TRACE_INIT,"Driver unit number is %d, name is %s",
+ hdw->unit_number,hdw->name);
+
+diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
+index 711556d13d03a..177181985345c 100644
+--- a/drivers/media/usb/uvc/uvc_v4l2.c
++++ b/drivers/media/usb/uvc/uvc_v4l2.c
+@@ -871,29 +871,31 @@ static int uvc_ioctl_enum_input(struct file *file, void *fh,
+ struct uvc_video_chain *chain = handle->chain;
+ const struct uvc_entity *selector = chain->selector;
+ struct uvc_entity *iterm = NULL;
++ struct uvc_entity *it;
+ u32 index = input->index;
+- int pin = 0;
+
+ if (selector == NULL ||
+ (chain->dev->quirks & UVC_QUIRK_IGNORE_SELECTOR_UNIT)) {
+ if (index != 0)
+ return -EINVAL;
+- list_for_each_entry(iterm, &chain->entities, chain) {
+- if (UVC_ENTITY_IS_ITERM(iterm))
++ list_for_each_entry(it, &chain->entities, chain) {
++ if (UVC_ENTITY_IS_ITERM(it)) {
++ iterm = it;
+ break;
++ }
+ }
+- pin = iterm->id;
+ } else if (index < selector->bNrInPins) {
+- pin = selector->baSourceID[index];
+- list_for_each_entry(iterm, &chain->entities, chain) {
+- if (!UVC_ENTITY_IS_ITERM(iterm))
++ list_for_each_entry(it, &chain->entities, chain) {
++ if (!UVC_ENTITY_IS_ITERM(it))
+ continue;
+- if (iterm->id == pin)
++ if (it->id == selector->baSourceID[index]) {
++ iterm = it;
+ break;
++ }
+ }
+ }
+
+- if (iterm == NULL || iterm->id != pin)
++ if (iterm == NULL)
+ return -EINVAL;
+
+ memset(input, 0, sizeof(*input));
+diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
+index 9c8318923ed0b..4733e7898ffe5 100644
+--- a/drivers/memory/samsung/exynos5422-dmc.c
++++ b/drivers/memory/samsung/exynos5422-dmc.c
+@@ -1322,7 +1322,6 @@ static int exynos5_dmc_init_clks(struct exynos5_dmc *dmc)
+ */
+ static int exynos5_performance_counters_init(struct exynos5_dmc *dmc)
+ {
+- int counters_size;
+ int ret, i;
+
+ dmc->num_counters = devfreq_event_get_edev_count(dmc->dev,
+@@ -1332,8 +1331,8 @@ static int exynos5_performance_counters_init(struct exynos5_dmc *dmc)
+ return dmc->num_counters;
+ }
+
+- counters_size = sizeof(struct devfreq_event_dev) * dmc->num_counters;
+- dmc->counter = devm_kzalloc(dmc->dev, counters_size, GFP_KERNEL);
++ dmc->counter = devm_kcalloc(dmc->dev, dmc->num_counters,
++ sizeof(*dmc->counter), GFP_KERNEL);
+ if (!dmc->counter)
+ return -ENOMEM;
+
+diff --git a/drivers/mfd/davinci_voicecodec.c b/drivers/mfd/davinci_voicecodec.c
+index e5c8bc998eb4e..965820481f1e1 100644
+--- a/drivers/mfd/davinci_voicecodec.c
++++ b/drivers/mfd/davinci_voicecodec.c
+@@ -46,14 +46,12 @@ static int __init davinci_vc_probe(struct platform_device *pdev)
+ }
+ clk_enable(davinci_vc->clk);
+
+- res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+-
+- fifo_base = (dma_addr_t)res->start;
+- davinci_vc->base = devm_ioremap_resource(&pdev->dev, res);
++ davinci_vc->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
+ if (IS_ERR(davinci_vc->base)) {
+ ret = PTR_ERR(davinci_vc->base);
+ goto fail;
+ }
++ fifo_base = (dma_addr_t)res->start;
+
+ davinci_vc->regmap = devm_regmap_init_mmio(&pdev->dev,
+ davinci_vc->base,
+diff --git a/drivers/mfd/ipaq-micro.c b/drivers/mfd/ipaq-micro.c
+index e92eeeb67a98a..4cd5ecc722112 100644
+--- a/drivers/mfd/ipaq-micro.c
++++ b/drivers/mfd/ipaq-micro.c
+@@ -403,7 +403,7 @@ static int __init micro_probe(struct platform_device *pdev)
+ micro_reset_comm(micro);
+
+ irq = platform_get_irq(pdev, 0);
+- if (!irq)
++ if (irq < 0)
+ return -EINVAL;
+ ret = devm_request_irq(&pdev->dev, irq, micro_serial_isr,
+ IRQF_SHARED, "ipaq-micro",
+diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
+index d80ada8cac09f..29cf292c0aba6 100644
+--- a/drivers/misc/fastrpc.c
++++ b/drivers/misc/fastrpc.c
+@@ -1747,17 +1747,18 @@ err_invoke:
+ static int fastrpc_req_mem_unmap_impl(struct fastrpc_user *fl, struct fastrpc_mem_unmap *req)
+ {
+ struct fastrpc_invoke_args args[1] = { [0] = { 0 } };
+- struct fastrpc_map *map = NULL, *m;
++ struct fastrpc_map *map = NULL, *iter, *m;
+ struct fastrpc_mem_unmap_req_msg req_msg = { 0 };
+ int err = 0;
+ u32 sc;
+ struct device *dev = fl->sctx->dev;
+
+ spin_lock(&fl->lock);
+- list_for_each_entry_safe(map, m, &fl->maps, node) {
+- if ((req->fd < 0 || map->fd == req->fd) && (map->raddr == req->vaddr))
++ list_for_each_entry_safe(iter, m, &fl->maps, node) {
++ if ((req->fd < 0 || iter->fd == req->fd) && (iter->raddr == req->vaddr)) {
++ map = iter;
+ break;
+- map = NULL;
++ }
+ }
+
+ spin_unlock(&fl->lock);
+diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
+index d881f5e40ad9e..6777c419a8da2 100644
+--- a/drivers/misc/ocxl/file.c
++++ b/drivers/misc/ocxl/file.c
+@@ -556,7 +556,9 @@ int ocxl_file_register_afu(struct ocxl_afu *afu)
+
+ err_unregister:
+ ocxl_sysfs_unregister_afu(info); // safe to call even if register failed
++ free_minor(info);
+ device_unregister(&info->dev);
++ return rc;
+ err_put:
+ ocxl_afu_put(afu);
+ free_minor(info);
+diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
+index 506dc900f5c7c..5235b03c6cffa 100644
+--- a/drivers/mmc/core/block.c
++++ b/drivers/mmc/core/block.c
+@@ -609,11 +609,11 @@ static int __mmc_blk_ioctl_cmd(struct mmc_card *card, struct mmc_blk_data *md,
+
+ if (idata->rpmb || (cmd.flags & MMC_RSP_R1B) == MMC_RSP_R1B) {
+ /*
+- * Ensure RPMB/R1B command has completed by polling CMD13
+- * "Send Status".
++ * Ensure RPMB/R1B command has completed by polling CMD13 "Send Status". Here we
++ * allow to override the default timeout value if a custom timeout is specified.
+ */
+- err = mmc_poll_for_busy(card, MMC_BLK_TIMEOUT_MS, false,
+- MMC_BUSY_IO);
++ err = mmc_poll_for_busy(card, idata->ic.cmd_timeout_ms ? : MMC_BLK_TIMEOUT_MS,
++ false, MMC_BUSY_IO);
+ }
+
+ return err;
+diff --git a/drivers/mmc/host/jz4740_mmc.c b/drivers/mmc/host/jz4740_mmc.c
+index 7ab1b38a7be50..b1d563b2ed1b0 100644
+--- a/drivers/mmc/host/jz4740_mmc.c
++++ b/drivers/mmc/host/jz4740_mmc.c
+@@ -247,6 +247,26 @@ static int jz4740_mmc_acquire_dma_channels(struct jz4740_mmc_host *host)
+ return PTR_ERR(host->dma_rx);
+ }
+
++ /*
++ * Limit the maximum segment size in any SG entry according to
++ * the parameters of the DMA engine device.
++ */
++ if (host->dma_tx) {
++ struct device *dev = host->dma_tx->device->dev;
++ unsigned int max_seg_size = dma_get_max_seg_size(dev);
++
++ if (max_seg_size < host->mmc->max_seg_size)
++ host->mmc->max_seg_size = max_seg_size;
++ }
++
++ if (host->dma_rx) {
++ struct device *dev = host->dma_rx->device->dev;
++ unsigned int max_seg_size = dma_get_max_seg_size(dev);
++
++ if (max_seg_size < host->mmc->max_seg_size)
++ host->mmc->max_seg_size = max_seg_size;
++ }
++
+ return 0;
+ }
+
+diff --git a/drivers/mmc/host/sdhci_am654.c b/drivers/mmc/host/sdhci_am654.c
+index e54fe24d47e73..e7ced1496a073 100644
+--- a/drivers/mmc/host/sdhci_am654.c
++++ b/drivers/mmc/host/sdhci_am654.c
+@@ -147,6 +147,9 @@ struct sdhci_am654_data {
+ int drv_strength;
+ int strb_sel;
+ u32 flags;
++ u32 quirks;
++
++#define SDHCI_AM654_QUIRK_FORCE_CDTEST BIT(0)
+ };
+
+ struct sdhci_am654_driver_data {
+@@ -369,6 +372,21 @@ static void sdhci_am654_write_b(struct sdhci_host *host, u8 val, int reg)
+ }
+ }
+
++static void sdhci_am654_reset(struct sdhci_host *host, u8 mask)
++{
++ u8 ctrl;
++ struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
++ struct sdhci_am654_data *sdhci_am654 = sdhci_pltfm_priv(pltfm_host);
++
++ sdhci_reset(host, mask);
++
++ if (sdhci_am654->quirks & SDHCI_AM654_QUIRK_FORCE_CDTEST) {
++ ctrl = sdhci_readb(host, SDHCI_HOST_CONTROL);
++ ctrl |= SDHCI_CTRL_CDTEST_INS | SDHCI_CTRL_CDTEST_EN;
++ sdhci_writeb(host, ctrl, SDHCI_HOST_CONTROL);
++ }
++}
++
+ static int sdhci_am654_execute_tuning(struct mmc_host *mmc, u32 opcode)
+ {
+ struct sdhci_host *host = mmc_priv(mmc);
+@@ -500,7 +518,7 @@ static struct sdhci_ops sdhci_j721e_4bit_ops = {
+ .set_clock = sdhci_j721e_4bit_set_clock,
+ .write_b = sdhci_am654_write_b,
+ .irq = sdhci_am654_cqhci_irq,
+- .reset = sdhci_reset,
++ .reset = sdhci_am654_reset,
+ };
+
+ static const struct sdhci_pltfm_data sdhci_j721e_4bit_pdata = {
+@@ -719,6 +737,9 @@ static int sdhci_am654_get_of_property(struct platform_device *pdev,
+ device_property_read_u32(dev, "ti,clkbuf-sel",
+ &sdhci_am654->clkbuf_sel);
+
++ if (device_property_read_bool(dev, "ti,fails-without-test-cd"))
++ sdhci_am654->quirks |= SDHCI_AM654_QUIRK_FORCE_CDTEST;
++
+ sdhci_get_of_property(pdev);
+
+ return 0;
+diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
+index a761134fd3bea..59334530dd46f 100644
+--- a/drivers/mtd/chips/cfi_cmdset_0002.c
++++ b/drivers/mtd/chips/cfi_cmdset_0002.c
+@@ -59,6 +59,10 @@
+ #define CFI_SR_WBASB BIT(3)
+ #define CFI_SR_SLSB BIT(1)
+
++enum cfi_quirks {
++ CFI_QUIRK_DQ_TRUE_DATA = BIT(0),
++};
++
+ static int cfi_amdstd_read (struct mtd_info *, loff_t, size_t, size_t *, u_char *);
+ static int cfi_amdstd_write_words(struct mtd_info *, loff_t, size_t, size_t *, const u_char *);
+ #if !FORCE_WORD_WRITE
+@@ -436,6 +440,15 @@ static void fixup_s29ns512p_sectors(struct mtd_info *mtd)
+ mtd->name);
+ }
+
++static void fixup_quirks(struct mtd_info *mtd)
++{
++ struct map_info *map = mtd->priv;
++ struct cfi_private *cfi = map->fldrv_priv;
++
++ if (cfi->mfr == CFI_MFR_AMD && cfi->id == 0x0c01)
++ cfi->quirks |= CFI_QUIRK_DQ_TRUE_DATA;
++}
++
+ /* Used to fix CFI-Tables of chips without Extended Query Tables */
+ static struct cfi_fixup cfi_nopri_fixup_table[] = {
+ { CFI_MFR_SST, 0x234a, fixup_sst39vf }, /* SST39VF1602 */
+@@ -474,6 +487,7 @@ static struct cfi_fixup cfi_fixup_table[] = {
+ #if !FORCE_WORD_WRITE
+ { CFI_MFR_ANY, CFI_ID_ANY, fixup_use_write_buffers },
+ #endif
++ { CFI_MFR_ANY, CFI_ID_ANY, fixup_quirks },
+ { 0, 0, NULL }
+ };
+ static struct cfi_fixup jedec_fixup_table[] = {
+@@ -802,21 +816,25 @@ static struct mtd_info *cfi_amdstd_setup(struct mtd_info *mtd)
+ }
+
+ /*
+- * Return true if the chip is ready.
++ * Return true if the chip is ready and has the correct value.
+ *
+ * Ready is one of: read mode, query mode, erase-suspend-read mode (in any
+ * non-suspended sector) and is indicated by no toggle bits toggling.
+ *
++ * Error are indicated by toggling bits or bits held with the wrong value,
++ * or with bits toggling.
++ *
+ * Note that anything more complicated than checking if no bits are toggling
+ * (including checking DQ5 for an error status) is tricky to get working
+ * correctly and is therefore not done (particularly with interleaved chips
+ * as each chip must be checked independently of the others).
+ */
+ static int __xipram chip_ready(struct map_info *map, struct flchip *chip,
+- unsigned long addr)
++ unsigned long addr, map_word *expected)
+ {
+ struct cfi_private *cfi = map->fldrv_priv;
+ map_word d, t;
++ int ret;
+
+ if (cfi_use_status_reg(cfi)) {
+ map_word ready = CMD(CFI_SR_DRB);
+@@ -826,57 +844,32 @@ static int __xipram chip_ready(struct map_info *map, struct flchip *chip,
+ */
+ cfi_send_gen_cmd(0x70, cfi->addr_unlock1, chip->start, map, cfi,
+ cfi->device_type, NULL);
+- d = map_read(map, addr);
++ t = map_read(map, addr);
+
+- return map_word_andequal(map, d, ready, ready);
++ return map_word_andequal(map, t, ready, ready);
+ }
+
+ d = map_read(map, addr);
+ t = map_read(map, addr);
+
+- return map_word_equal(map, d, t);
++ ret = map_word_equal(map, d, t);
++
++ if (!ret || !expected)
++ return ret;
++
++ return map_word_equal(map, t, *expected);
+ }
+
+-/*
+- * Return true if the chip is ready and has the correct value.
+- *
+- * Ready is one of: read mode, query mode, erase-suspend-read mode (in any
+- * non-suspended sector) and it is indicated by no bits toggling.
+- *
+- * Error are indicated by toggling bits or bits held with the wrong value,
+- * or with bits toggling.
+- *
+- * Note that anything more complicated than checking if no bits are toggling
+- * (including checking DQ5 for an error status) is tricky to get working
+- * correctly and is therefore not done (particularly with interleaved chips
+- * as each chip must be checked independently of the others).
+- *
+- */
+ static int __xipram chip_good(struct map_info *map, struct flchip *chip,
+- unsigned long addr, map_word expected)
++ unsigned long addr, map_word *expected)
+ {
+ struct cfi_private *cfi = map->fldrv_priv;
+- map_word oldd, curd;
+-
+- if (cfi_use_status_reg(cfi)) {
+- map_word ready = CMD(CFI_SR_DRB);
+-
+- /*
+- * For chips that support status register, check device
+- * ready bit
+- */
+- cfi_send_gen_cmd(0x70, cfi->addr_unlock1, chip->start, map, cfi,
+- cfi->device_type, NULL);
+- curd = map_read(map, addr);
+-
+- return map_word_andequal(map, curd, ready, ready);
+- }
++ map_word *datum = expected;
+
+- oldd = map_read(map, addr);
+- curd = map_read(map, addr);
++ if (cfi->quirks & CFI_QUIRK_DQ_TRUE_DATA)
++ datum = NULL;
+
+- return map_word_equal(map, oldd, curd) &&
+- map_word_equal(map, curd, expected);
++ return chip_ready(map, chip, addr, datum);
+ }
+
+ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr, int mode)
+@@ -893,7 +886,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
+
+ case FL_STATUS:
+ for (;;) {
+- if (chip_ready(map, chip, adr))
++ if (chip_ready(map, chip, adr, NULL))
+ break;
+
+ if (time_after(jiffies, timeo)) {
+@@ -932,7 +925,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
+ chip->state = FL_ERASE_SUSPENDING;
+ chip->erase_suspended = 1;
+ for (;;) {
+- if (chip_ready(map, chip, adr))
++ if (chip_ready(map, chip, adr, NULL))
+ break;
+
+ if (time_after(jiffies, timeo)) {
+@@ -1463,7 +1456,7 @@ static int do_otp_lock(struct map_info *map, struct flchip *chip, loff_t adr,
+ /* wait for chip to become ready */
+ timeo = jiffies + msecs_to_jiffies(2);
+ for (;;) {
+- if (chip_ready(map, chip, adr))
++ if (chip_ready(map, chip, adr, NULL))
+ break;
+
+ if (time_after(jiffies, timeo)) {
+@@ -1699,7 +1692,7 @@ static int __xipram do_write_oneword_once(struct map_info *map,
+ * "chip_good" to avoid the failure due to scheduling.
+ */
+ if (time_after(jiffies, timeo) &&
+- !chip_good(map, chip, adr, datum)) {
++ !chip_good(map, chip, adr, &datum)) {
+ xip_enable(map, chip, adr);
+ printk(KERN_WARNING "MTD %s(): software timeout\n", __func__);
+ xip_disable(map, chip, adr);
+@@ -1707,7 +1700,7 @@ static int __xipram do_write_oneword_once(struct map_info *map,
+ break;
+ }
+
+- if (chip_good(map, chip, adr, datum)) {
++ if (chip_good(map, chip, adr, &datum)) {
+ if (cfi_check_err_status(map, chip, adr))
+ ret = -EIO;
+ break;
+@@ -1979,14 +1972,14 @@ static int __xipram do_write_buffer_wait(struct map_info *map,
+ * "chip_good" to avoid the failure due to scheduling.
+ */
+ if (time_after(jiffies, timeo) &&
+- !chip_good(map, chip, adr, datum)) {
++ !chip_good(map, chip, adr, &datum)) {
+ pr_err("MTD %s(): software timeout, address:0x%.8lx.\n",
+ __func__, adr);
+ ret = -EIO;
+ break;
+ }
+
+- if (chip_good(map, chip, adr, datum)) {
++ if (chip_good(map, chip, adr, &datum)) {
+ if (cfi_check_err_status(map, chip, adr))
+ ret = -EIO;
+ break;
+@@ -2195,7 +2188,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
+ * If the driver thinks the chip is idle, and no toggle bits
+ * are changing, then the chip is actually idle for sure.
+ */
+- if (chip->state == FL_READY && chip_ready(map, chip, adr))
++ if (chip->state == FL_READY && chip_ready(map, chip, adr, NULL))
+ return 0;
+
+ /*
+@@ -2212,7 +2205,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
+
+ /* wait for the chip to become ready */
+ for (i = 0; i < jiffies_to_usecs(timeo); i++) {
+- if (chip_ready(map, chip, adr))
++ if (chip_ready(map, chip, adr, NULL))
+ return 0;
+
+ udelay(1);
+@@ -2276,13 +2269,13 @@ retry:
+ map_write(map, datum, adr);
+
+ for (i = 0; i < jiffies_to_usecs(uWriteTimeout); i++) {
+- if (chip_ready(map, chip, adr))
++ if (chip_ready(map, chip, adr, NULL))
+ break;
+
+ udelay(1);
+ }
+
+- if (!chip_good(map, chip, adr, datum) ||
++ if (!chip_ready(map, chip, adr, &datum) ||
+ cfi_check_err_status(map, chip, adr)) {
+ /* reset on all failures. */
+ map_write(map, CMD(0xF0), chip->start);
+@@ -2424,6 +2417,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
+ DECLARE_WAITQUEUE(wait, current);
+ int ret;
+ int retry_cnt = 0;
++ map_word datum = map_word_ff(map);
+
+ adr = cfi->addr_unlock1;
+
+@@ -2478,7 +2472,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
+ chip->erase_suspended = 0;
+ }
+
+- if (chip_good(map, chip, adr, map_word_ff(map))) {
++ if (chip_ready(map, chip, adr, &datum)) {
+ if (cfi_check_err_status(map, chip, adr))
+ ret = -EIO;
+ break;
+@@ -2523,6 +2517,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
+ DECLARE_WAITQUEUE(wait, current);
+ int ret;
+ int retry_cnt = 0;
++ map_word datum = map_word_ff(map);
+
+ adr += chip->start;
+
+@@ -2577,7 +2572,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
+ chip->erase_suspended = 0;
+ }
+
+- if (chip_good(map, chip, adr, map_word_ff(map))) {
++ if (chip_ready(map, chip, adr, &datum)) {
+ if (cfi_check_err_status(map, chip, adr))
+ ret = -EIO;
+ break;
+@@ -2771,7 +2766,7 @@ static int __maybe_unused do_ppb_xxlock(struct map_info *map,
+ */
+ timeo = jiffies + msecs_to_jiffies(2000); /* 2s max (un)locking */
+ for (;;) {
+- if (chip_ready(map, chip, adr))
++ if (chip_ready(map, chip, adr, NULL))
+ break;
+
+ if (time_after(jiffies, timeo)) {
+diff --git a/drivers/mtd/mtdblock.c b/drivers/mtd/mtdblock.c
+index 03e3de3a5d79e..1e94e7d10b8be 100644
+--- a/drivers/mtd/mtdblock.c
++++ b/drivers/mtd/mtdblock.c
+@@ -257,6 +257,10 @@ static int mtdblock_open(struct mtd_blktrans_dev *mbd)
+ return 0;
+ }
+
++ if (mtd_type_is_nand(mbd->mtd))
++ pr_warn("%s: MTD device '%s' is NAND, please consider using UBI block devices instead.\n",
++ mbd->tr->name, mbd->mtd->name);
++
+ /* OK, it's not open. Create cache info for it */
+ mtdblk->count = 1;
+ mutex_init(&mtdblk->cache_mutex);
+@@ -322,10 +326,6 @@ static void mtdblock_add_mtd(struct mtd_blktrans_ops *tr, struct mtd_info *mtd)
+ if (!(mtd->flags & MTD_WRITEABLE))
+ dev->mbd.readonly = 1;
+
+- if (mtd_type_is_nand(mtd))
+- pr_warn("%s: MTD device '%s' is NAND, please consider using UBI block devices instead.\n",
+- tr->name, mtd->name);
+-
+ if (add_mtd_blktrans_dev(&dev->mbd))
+ kfree(dev);
+ }
+diff --git a/drivers/mtd/nand/raw/cadence-nand-controller.c b/drivers/mtd/nand/raw/cadence-nand-controller.c
+index 7eec60ea90564..0d72672f8b64d 100644
+--- a/drivers/mtd/nand/raw/cadence-nand-controller.c
++++ b/drivers/mtd/nand/raw/cadence-nand-controller.c
+@@ -2983,11 +2983,10 @@ static int cadence_nand_dt_probe(struct platform_device *ofdev)
+ if (IS_ERR(cdns_ctrl->reg))
+ return PTR_ERR(cdns_ctrl->reg);
+
+- res = platform_get_resource(ofdev, IORESOURCE_MEM, 1);
+- cdns_ctrl->io.dma = res->start;
+- cdns_ctrl->io.virt = devm_ioremap_resource(&ofdev->dev, res);
++ cdns_ctrl->io.virt = devm_platform_get_and_ioremap_resource(ofdev, 1, &res);
+ if (IS_ERR(cdns_ctrl->io.virt))
+ return PTR_ERR(cdns_ctrl->io.virt);
++ cdns_ctrl->io.dma = res->start;
+
+ dt->clk = devm_clk_get(cdns_ctrl->dev, "nf_clk");
+ if (IS_ERR(dt->clk))
+diff --git a/drivers/mtd/nand/raw/denali_pci.c b/drivers/mtd/nand/raw/denali_pci.c
+index 20c085a30adcb..de7e722d38262 100644
+--- a/drivers/mtd/nand/raw/denali_pci.c
++++ b/drivers/mtd/nand/raw/denali_pci.c
+@@ -74,22 +74,21 @@ static int denali_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
+ return ret;
+ }
+
+- denali->reg = ioremap(csr_base, csr_len);
++ denali->reg = devm_ioremap(denali->dev, csr_base, csr_len);
+ if (!denali->reg) {
+ dev_err(&dev->dev, "Spectra: Unable to remap memory region\n");
+ return -ENOMEM;
+ }
+
+- denali->host = ioremap(mem_base, mem_len);
++ denali->host = devm_ioremap(denali->dev, mem_base, mem_len);
+ if (!denali->host) {
+ dev_err(&dev->dev, "Spectra: ioremap failed!");
+- ret = -ENOMEM;
+- goto out_unmap_reg;
++ return -ENOMEM;
+ }
+
+ ret = denali_init(denali);
+ if (ret)
+- goto out_unmap_host;
++ return ret;
+
+ nsels = denali->nbanks;
+
+@@ -117,10 +116,6 @@ static int denali_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
+
+ out_remove_denali:
+ denali_remove(denali);
+-out_unmap_host:
+- iounmap(denali->host);
+-out_unmap_reg:
+- iounmap(denali->reg);
+ return ret;
+ }
+
+@@ -129,8 +124,6 @@ static void denali_pci_remove(struct pci_dev *dev)
+ struct denali_controller *denali = pci_get_drvdata(dev);
+
+ denali_remove(denali);
+- iounmap(denali->reg);
+- iounmap(denali->host);
+ }
+
+ static struct pci_driver denali_pci_driver = {
+diff --git a/drivers/mtd/nand/raw/intel-nand-controller.c b/drivers/mtd/nand/raw/intel-nand-controller.c
+index 7c1c80dae826a..e91b879b32bdb 100644
+--- a/drivers/mtd/nand/raw/intel-nand-controller.c
++++ b/drivers/mtd/nand/raw/intel-nand-controller.c
+@@ -619,9 +619,9 @@ static int ebu_nand_probe(struct platform_device *pdev)
+ resname = devm_kasprintf(dev, GFP_KERNEL, "nand_cs%d", cs);
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, resname);
+ ebu_host->cs[cs].chipaddr = devm_ioremap_resource(dev, res);
+- ebu_host->cs[cs].nand_pa = res->start;
+ if (IS_ERR(ebu_host->cs[cs].chipaddr))
+ return PTR_ERR(ebu_host->cs[cs].chipaddr);
++ ebu_host->cs[cs].nand_pa = res->start;
+
+ ebu_host->clk = devm_clk_get(dev, NULL);
+ if (IS_ERR(ebu_host->clk))
+diff --git a/drivers/mtd/nand/spi/gigadevice.c b/drivers/mtd/nand/spi/gigadevice.c
+index 1dd1c58980934..da77ab20296ea 100644
+--- a/drivers/mtd/nand/spi/gigadevice.c
++++ b/drivers/mtd/nand/spi/gigadevice.c
+@@ -39,6 +39,14 @@ static SPINAND_OP_VARIANTS(read_cache_variants_f,
+ SPINAND_PAGE_READ_FROM_CACHE_OP_3A(true, 0, 1, NULL, 0),
+ SPINAND_PAGE_READ_FROM_CACHE_OP_3A(false, 0, 0, NULL, 0));
+
++static SPINAND_OP_VARIANTS(read_cache_variants_1gq5,
++ SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 2, NULL, 0),
++ SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
++ SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 1, NULL, 0),
++ SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
++ SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
++ SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
++
+ static SPINAND_OP_VARIANTS(write_cache_variants,
+ SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
+ SPINAND_PROG_LOAD(true, 0, NULL, 0));
+@@ -339,7 +347,7 @@ static const struct spinand_info gigadevice_spinand_table[] = {
+ SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x51),
+ NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
+ NAND_ECCREQ(4, 512),
+- SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
++ SPINAND_INFO_OP_VARIANTS(&read_cache_variants_1gq5,
+ &write_cache_variants,
+ &update_cache_variants),
+ SPINAND_HAS_QE_BIT,
+diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
+index b4f141ad9c9c3..c1630131c7342 100644
+--- a/drivers/mtd/spi-nor/core.c
++++ b/drivers/mtd/spi-nor/core.c
+@@ -788,6 +788,15 @@ static int spi_nor_write_16bit_sr_and_check(struct spi_nor *nor, u8 sr1)
+ if (ret)
+ return ret;
+
++ ret = spi_nor_read_sr(nor, sr_cr);
++ if (ret)
++ return ret;
++
++ if (sr1 != sr_cr[0]) {
++ dev_dbg(nor->dev, "SR: Read back test failed\n");
++ return -EIO;
++ }
++
+ if (nor->flags & SNOR_F_NO_READ_CR)
+ return 0;
+
+diff --git a/drivers/net/amt.c b/drivers/net/amt.c
+index 10455c9b9da0e..de4ea518c793f 100644
+--- a/drivers/net/amt.c
++++ b/drivers/net/amt.c
+@@ -943,7 +943,7 @@ static void amt_req_work(struct work_struct *work)
+ if (amt->status < AMT_STATUS_RECEIVED_ADVERTISEMENT)
+ goto out;
+
+- if (amt->req_cnt++ > AMT_MAX_REQ_COUNT) {
++ if (amt->req_cnt > AMT_MAX_REQ_COUNT) {
+ netdev_dbg(amt->dev, "Gateway is not ready");
+ amt->qi = AMT_INIT_REQ_TIMEOUT;
+ amt->ready4 = false;
+@@ -951,13 +951,15 @@ static void amt_req_work(struct work_struct *work)
+ amt->remote_ip = 0;
+ __amt_update_gw_status(amt, AMT_STATUS_INIT, false);
+ amt->req_cnt = 0;
++ goto out;
+ }
+ spin_unlock_bh(&amt->lock);
+
+ amt_send_request(amt, false);
+ amt_send_request(amt, true);
+- amt_update_gw_status(amt, AMT_STATUS_SENT_REQUEST, true);
+ spin_lock_bh(&amt->lock);
++ __amt_update_gw_status(amt, AMT_STATUS_SENT_REQUEST, true);
++ amt->req_cnt++;
+ out:
+ exp = min_t(u32, (1 * (1 << amt->req_cnt)), AMT_MAX_REQ_TIMEOUT);
+ mod_delayed_work(amt_wq, &amt->req_wq, msecs_to_jiffies(exp * 1000));
+@@ -2696,9 +2698,8 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
+ err = true;
+ goto drop;
+ }
+- if (amt_advertisement_handler(amt, skb))
+- amt->dev->stats.rx_dropped++;
+- goto out;
++ err = amt_advertisement_handler(amt, skb);
++ break;
+ case AMT_MSG_MULTICAST_DATA:
+ if (iph->saddr != amt->remote_ip) {
+ netdev_dbg(amt->dev, "Invalid Relay IP\n");
+diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
+index 38e1525481261..b5c5196e03ee0 100644
+--- a/drivers/net/bonding/bond_main.c
++++ b/drivers/net/bonding/bond_main.c
+@@ -5591,16 +5591,23 @@ static int bond_ethtool_get_ts_info(struct net_device *bond_dev,
+ const struct ethtool_ops *ops;
+ struct net_device *real_dev;
+ struct phy_device *phydev;
++ int ret = 0;
+
++ rcu_read_lock();
+ real_dev = bond_option_active_slave_get_rcu(bond);
++ dev_hold(real_dev);
++ rcu_read_unlock();
++
+ if (real_dev) {
+ ops = real_dev->ethtool_ops;
+ phydev = real_dev->phydev;
+
+ if (phy_has_tsinfo(phydev)) {
+- return phy_ts_info(phydev, info);
++ ret = phy_ts_info(phydev, info);
++ goto out;
+ } else if (ops->get_ts_info) {
+- return ops->get_ts_info(real_dev, info);
++ ret = ops->get_ts_info(real_dev, info);
++ goto out;
+ }
+ }
+
+@@ -5608,7 +5615,9 @@ static int bond_ethtool_get_ts_info(struct net_device *bond_dev,
+ SOF_TIMESTAMPING_SOFTWARE;
+ info->phc_index = -1;
+
+- return 0;
++out:
++ dev_put(real_dev);
++ return ret;
+ }
+
+ static const struct ethtool_ops bond_ethtool_ops = {
+diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
+index 9cb6b5ad8dda0..60e56fa4601d3 100644
+--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
++++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
+@@ -441,7 +441,7 @@ struct mcp251xfd_hw_tef_obj {
+ /* The tx_obj_raw version is used in spi async, i.e. without
+ * regmap. We have to take care of endianness ourselves.
+ */
+-struct mcp251xfd_hw_tx_obj_raw {
++struct __packed mcp251xfd_hw_tx_obj_raw {
+ __le32 id;
+ __le32 flags;
+ u8 data[sizeof_field(struct canfd_frame, data)];
+diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
+index e562c5ab1149a..43f0c6a064ba1 100644
+--- a/drivers/net/can/xilinx_can.c
++++ b/drivers/net/can/xilinx_can.c
+@@ -239,7 +239,7 @@ static const struct can_bittiming_const xcan_bittiming_const_canfd = {
+ };
+
+ /* AXI CANFD Data Bittiming constants as per AXI CANFD 1.0 specs */
+-static struct can_bittiming_const xcan_data_bittiming_const_canfd = {
++static const struct can_bittiming_const xcan_data_bittiming_const_canfd = {
+ .name = DRIVER_NAME,
+ .tseg1_min = 1,
+ .tseg1_max = 16,
+@@ -265,7 +265,7 @@ static const struct can_bittiming_const xcan_bittiming_const_canfd2 = {
+ };
+
+ /* AXI CANFD 2.0 Data Bittiming constants as per AXI CANFD 2.0 spec */
+-static struct can_bittiming_const xcan_data_bittiming_const_canfd2 = {
++static const struct can_bittiming_const xcan_data_bittiming_const_canfd2 = {
+ .name = DRIVER_NAME,
+ .tseg1_min = 1,
+ .tseg1_max = 32,
+diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
+index 37a3dabdce313..6d1fcb08bba1f 100644
+--- a/drivers/net/dsa/Kconfig
++++ b/drivers/net/dsa/Kconfig
+@@ -72,7 +72,6 @@ source "drivers/net/dsa/realtek/Kconfig"
+
+ config NET_DSA_SMSC_LAN9303
+ tristate
+- depends on VLAN_8021Q || VLAN_8021Q=n
+ select NET_DSA_TAG_LAN9303
+ select REGMAP
+ help
+@@ -82,6 +81,7 @@ config NET_DSA_SMSC_LAN9303
+ config NET_DSA_SMSC_LAN9303_I2C
+ tristate "SMSC/Microchip LAN9303 3-ports 10/100 ethernet switch in I2C managed mode"
+ depends on I2C
++ depends on VLAN_8021Q || VLAN_8021Q=n
+ select NET_DSA_SMSC_LAN9303
+ select REGMAP_I2C
+ help
+@@ -91,6 +91,7 @@ config NET_DSA_SMSC_LAN9303_I2C
+ config NET_DSA_SMSC_LAN9303_MDIO
+ tristate "SMSC/Microchip LAN9303 3-ports 10/100 ethernet switch in MDIO managed mode"
+ select NET_DSA_SMSC_LAN9303
++ depends on VLAN_8021Q || VLAN_8021Q=n
+ help
+ Enable access functions if the SMSC/Microchip LAN9303 is configured
+ for MDIO managed mode.
+diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
+index fe3cb26f4287e..831ccbecb0c2d 100644
+--- a/drivers/net/dsa/mt7530.c
++++ b/drivers/net/dsa/mt7530.c
+@@ -2540,13 +2540,7 @@ static void mt7531_sgmii_validate(struct mt7530_priv *priv, int port,
+ /* Port5 supports ethier RGMII or SGMII.
+ * Port6 supports SGMII only.
+ */
+- switch (port) {
+- case 5:
+- if (mt7531_is_rgmii_port(priv, port))
+- break;
+- fallthrough;
+- case 6:
+- phylink_set(supported, 1000baseX_Full);
++ if (port == 6) {
+ phylink_set(supported, 2500baseX_Full);
+ phylink_set(supported, 2500baseT_Full);
+ }
+@@ -2914,8 +2908,6 @@ static void
+ mt7530_mac_port_validate(struct dsa_switch *ds, int port,
+ unsigned long *supported)
+ {
+- if (port == 5)
+- phylink_set(supported, 1000baseX_Full);
+ }
+
+ static void mt7531_mac_port_validate(struct dsa_switch *ds, int port,
+@@ -2952,8 +2944,10 @@ mt753x_phylink_validate(struct dsa_switch *ds, int port,
+ }
+
+ /* This switch only supports 1G full-duplex. */
+- if (state->interface != PHY_INTERFACE_MODE_MII)
++ if (state->interface != PHY_INTERFACE_MODE_MII) {
+ phylink_set(mask, 1000baseT_Full);
++ phylink_set(mask, 1000baseX_Full);
++ }
+
+ priv->info->mac_port_validate(ds, port, mask);
+
+diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
+index d3ed0a7f80771..22b328bd7cd51 100644
+--- a/drivers/net/dsa/qca8k.c
++++ b/drivers/net/dsa/qca8k.c
+@@ -1287,7 +1287,12 @@ qca8k_internal_mdio_read(struct mii_bus *slave_bus, int phy, int regnum)
+ if (ret >= 0)
+ return ret;
+
+- return qca8k_mdio_read(priv, phy, regnum);
++ ret = qca8k_mdio_read(priv, phy, regnum);
++
++ if (ret < 0)
++ return 0xffff;
++
++ return ret;
+ }
+
+ static int
+diff --git a/drivers/net/ethernet/broadcom/Makefile b/drivers/net/ethernet/broadcom/Makefile
+index 0ddfb5b5d53ca..2e6c5f258a1ff 100644
+--- a/drivers/net/ethernet/broadcom/Makefile
++++ b/drivers/net/ethernet/broadcom/Makefile
+@@ -17,3 +17,8 @@ obj-$(CONFIG_BGMAC_BCMA) += bgmac-bcma.o bgmac-bcma-mdio.o
+ obj-$(CONFIG_BGMAC_PLATFORM) += bgmac-platform.o
+ obj-$(CONFIG_SYSTEMPORT) += bcmsysport.o
+ obj-$(CONFIG_BNXT) += bnxt/
++
++# FIXME: temporarily silence -Warray-bounds on non W=1+ builds
++ifndef KBUILD_EXTRA_WARN
++CFLAGS_tg3.o += -Wno-array-bounds
++endif
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+index 1d69fe0737a1c..d5149478a3510 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+@@ -10363,6 +10363,7 @@ static int __bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
+ if (BNXT_PF(bp))
+ bnxt_vf_reps_open(bp);
+ bnxt_ptp_init_rtc(bp, true);
++ bnxt_ptp_cfg_tstamp_filters(bp);
+ return 0;
+
+ open_err_irq:
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
+index 00f2f80c00733..f9c94e5fe7187 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
+@@ -295,6 +295,27 @@ static int bnxt_ptp_cfg_event(struct bnxt *bp, u8 event)
+ return hwrm_req_send(bp, req);
+ }
+
++void bnxt_ptp_cfg_tstamp_filters(struct bnxt *bp)
++{
++ struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
++ struct hwrm_port_mac_cfg_input *req;
++
++ if (!ptp || !ptp->tstamp_filters)
++ return;
++
++ if (hwrm_req_init(bp, req, HWRM_PORT_MAC_CFG))
++ goto out;
++ req->flags = cpu_to_le32(ptp->tstamp_filters);
++ req->enables = cpu_to_le32(PORT_MAC_CFG_REQ_ENABLES_RX_TS_CAPTURE_PTP_MSG_TYPE);
++ req->rx_ts_capture_ptp_msg_type = cpu_to_le16(ptp->rxctl);
++
++ if (!hwrm_req_send(bp, req))
++ return;
++ ptp->tstamp_filters = 0;
++out:
++ netdev_warn(bp->dev, "Failed to configure HW packet timestamp filters\n");
++}
++
+ void bnxt_ptp_reapply_pps(struct bnxt *bp)
+ {
+ struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
+@@ -435,27 +456,36 @@ static int bnxt_ptp_enable(struct ptp_clock_info *ptp_info,
+ static int bnxt_hwrm_ptp_cfg(struct bnxt *bp)
+ {
+ struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
+- struct hwrm_port_mac_cfg_input *req;
+ u32 flags = 0;
+- int rc;
++ int rc = 0;
+
+- rc = hwrm_req_init(bp, req, HWRM_PORT_MAC_CFG);
+- if (rc)
+- return rc;
++ switch (ptp->rx_filter) {
++ case HWTSTAMP_FILTER_NONE:
++ flags = PORT_MAC_CFG_REQ_FLAGS_PTP_RX_TS_CAPTURE_DISABLE;
++ break;
++ case HWTSTAMP_FILTER_PTP_V2_EVENT:
++ case HWTSTAMP_FILTER_PTP_V2_SYNC:
++ case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
++ flags = PORT_MAC_CFG_REQ_FLAGS_PTP_RX_TS_CAPTURE_ENABLE;
++ break;
++ }
+
+- if (ptp->rx_filter)
+- flags |= PORT_MAC_CFG_REQ_FLAGS_PTP_RX_TS_CAPTURE_ENABLE;
+- else
+- flags |= PORT_MAC_CFG_REQ_FLAGS_PTP_RX_TS_CAPTURE_DISABLE;
+ if (ptp->tx_tstamp_en)
+ flags |= PORT_MAC_CFG_REQ_FLAGS_PTP_TX_TS_CAPTURE_ENABLE;
+ else
+ flags |= PORT_MAC_CFG_REQ_FLAGS_PTP_TX_TS_CAPTURE_DISABLE;
+- req->flags = cpu_to_le32(flags);
+- req->enables = cpu_to_le32(PORT_MAC_CFG_REQ_ENABLES_RX_TS_CAPTURE_PTP_MSG_TYPE);
+- req->rx_ts_capture_ptp_msg_type = cpu_to_le16(ptp->rxctl);
+
+- return hwrm_req_send(bp, req);
++ ptp->tstamp_filters = flags;
++
++ if (netif_running(bp->dev)) {
++ rc = bnxt_close_nic(bp, false, false);
++ if (!rc)
++ rc = bnxt_open_nic(bp, false, false);
++ if (!rc && !ptp->tstamp_filters)
++ rc = -EIO;
++ }
++
++ return rc;
+ }
+
+ int bnxt_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
+index 530b9922608c8..4ce0a14c1e232 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
+@@ -113,6 +113,7 @@ struct bnxt_ptp_cfg {
+ BNXT_PTP_MSG_PDELAY_RESP)
+ u8 tx_tstamp_en:1;
+ int rx_filter;
++ u32 tstamp_filters;
+
+ u32 refclk_regs[2];
+ u32 refclk_mapped_regs[2];
+@@ -133,6 +134,7 @@ do { \
+ int bnxt_ptp_parse(struct sk_buff *skb, u16 *seq_id, u16 *hdr_off);
+ void bnxt_ptp_update_current_time(struct bnxt *bp);
+ void bnxt_ptp_pps_event(struct bnxt *bp, u32 data1, u32 data2);
++void bnxt_ptp_cfg_tstamp_filters(struct bnxt *bp);
+ void bnxt_ptp_reapply_pps(struct bnxt *bp);
+ int bnxt_hwtstamp_set(struct net_device *dev, struct ifreq *ifr);
+ int bnxt_hwtstamp_get(struct net_device *dev, struct ifreq *ifr);
+diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
+index 61284baa0496e..e9e5c3f6027c7 100644
+--- a/drivers/net/ethernet/cadence/macb_main.c
++++ b/drivers/net/ethernet/cadence/macb_main.c
+@@ -36,6 +36,7 @@
+ #include <linux/iopoll.h>
+ #include <linux/phy/phy.h>
+ #include <linux/pm_runtime.h>
++#include <linux/ptp_classify.h>
+ #include <linux/reset.h>
+ #include "macb.h"
+
+@@ -1124,6 +1125,36 @@ static void macb_tx_error_task(struct work_struct *work)
+ spin_unlock_irqrestore(&bp->lock, flags);
+ }
+
++static bool ptp_one_step_sync(struct sk_buff *skb)
++{
++ struct ptp_header *hdr;
++ unsigned int ptp_class;
++ u8 msgtype;
++
++ /* No need to parse packet if PTP TS is not involved */
++ if (likely(!(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)))
++ goto not_oss;
++
++ /* Identify and return whether PTP one step sync is being processed */
++ ptp_class = ptp_classify_raw(skb);
++ if (ptp_class == PTP_CLASS_NONE)
++ goto not_oss;
++
++ hdr = ptp_parse_header(skb, ptp_class);
++ if (!hdr)
++ goto not_oss;
++
++ if (hdr->flag_field[0] & PTP_FLAG_TWOSTEP)
++ goto not_oss;
++
++ msgtype = ptp_get_msgtype(hdr, ptp_class);
++ if (msgtype == PTP_MSGTYPE_SYNC)
++ return true;
++
++not_oss:
++ return false;
++}
++
+ static void macb_tx_interrupt(struct macb_queue *queue)
+ {
+ unsigned int tail;
+@@ -1168,8 +1199,8 @@ static void macb_tx_interrupt(struct macb_queue *queue)
+
+ /* First, update TX stats if needed */
+ if (skb) {
+- if (unlikely(skb_shinfo(skb)->tx_flags &
+- SKBTX_HW_TSTAMP) &&
++ if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
++ !ptp_one_step_sync(skb) &&
+ gem_ptp_do_txstamp(queue, skb, desc) == 0) {
+ /* skb now belongs to timestamp buffer
+ * and will be removed later
+@@ -1999,7 +2030,8 @@ static unsigned int macb_tx_map(struct macb *bp,
+ ctrl |= MACB_BF(TX_LSO, lso_ctrl);
+ ctrl |= MACB_BF(TX_TCP_SEQ_SRC, seq_ctrl);
+ if ((bp->dev->features & NETIF_F_HW_CSUM) &&
+- skb->ip_summed != CHECKSUM_PARTIAL && !lso_ctrl)
++ skb->ip_summed != CHECKSUM_PARTIAL && !lso_ctrl &&
++ !ptp_one_step_sync(skb))
+ ctrl |= MACB_BIT(TX_NOCRC);
+ } else
+ /* Only set MSS/MFS on payload descriptors
+@@ -2097,7 +2129,7 @@ static int macb_pad_and_fcs(struct sk_buff **skb, struct net_device *ndev)
+
+ if (!(ndev->features & NETIF_F_HW_CSUM) ||
+ !((*skb)->ip_summed != CHECKSUM_PARTIAL) ||
+- skb_shinfo(*skb)->gso_size) /* Not available for GSO */
++ skb_shinfo(*skb)->gso_size || ptp_one_step_sync(*skb))
+ return 0;
+
+ if (padlen <= 0) {
+@@ -4594,7 +4626,7 @@ static int zynqmp_init(struct platform_device *pdev)
+
+ if (bp->phy_interface == PHY_INTERFACE_MODE_SGMII) {
+ /* Ensure PS-GTR PHY device used in SGMII mode is ready */
+- bp->sgmii_phy = devm_phy_get(&pdev->dev, "sgmii-phy");
++ bp->sgmii_phy = devm_phy_optional_get(&pdev->dev, NULL);
+
+ if (IS_ERR(bp->sgmii_phy)) {
+ ret = PTR_ERR(bp->sgmii_phy);
+diff --git a/drivers/net/ethernet/cadence/macb_ptp.c b/drivers/net/ethernet/cadence/macb_ptp.c
+index fb6b27f46b153..9559c16078f95 100644
+--- a/drivers/net/ethernet/cadence/macb_ptp.c
++++ b/drivers/net/ethernet/cadence/macb_ptp.c
+@@ -470,8 +470,10 @@ int gem_set_hwtst(struct net_device *dev, struct ifreq *ifr, int cmd)
+ case HWTSTAMP_TX_ONESTEP_SYNC:
+ if (gem_ptp_set_one_step_sync(bp, 1) != 0)
+ return -ERANGE;
+- fallthrough;
++ tx_bd_control = TSTAMP_ALL_FRAMES;
++ break;
+ case HWTSTAMP_TX_ON:
++ gem_ptp_set_one_step_sync(bp, 0);
+ tx_bd_control = TSTAMP_ALL_FRAMES;
+ break;
+ default:
+diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+index 4b047255d9280..cd9ec80522e75 100644
+--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
++++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+@@ -1097,6 +1097,7 @@ static void dpaa2_eth_free_tx_fd(struct dpaa2_eth_priv *priv,
+ u32 fd_len = dpaa2_fd_get_len(fd);
+ struct dpaa2_sg_entry *sgt;
+ int should_free_skb = 1;
++ void *tso_hdr;
+ int i;
+
+ fd_addr = dpaa2_fd_get_addr(fd);
+@@ -1135,20 +1136,21 @@ static void dpaa2_eth_free_tx_fd(struct dpaa2_eth_priv *priv,
+ sgt = (struct dpaa2_sg_entry *)(buffer_start +
+ priv->tx_data_offset);
+
++ /* Unmap the SGT buffer */
++ dma_unmap_single(dev, fd_addr, swa->tso.sgt_size,
++ DMA_BIDIRECTIONAL);
++
+ /* Unmap and free the header */
++ tso_hdr = dpaa2_iova_to_virt(priv->iommu_domain, dpaa2_sg_get_addr(sgt));
+ dma_unmap_single(dev, dpaa2_sg_get_addr(sgt), TSO_HEADER_SIZE,
+ DMA_TO_DEVICE);
+- kfree(dpaa2_iova_to_virt(priv->iommu_domain, dpaa2_sg_get_addr(sgt)));
++ kfree(tso_hdr);
+
+ /* Unmap the other SG entries for the data */
+ for (i = 1; i < swa->tso.num_sg; i++)
+ dma_unmap_single(dev, dpaa2_sg_get_addr(&sgt[i]),
+ dpaa2_sg_get_len(&sgt[i]), DMA_TO_DEVICE);
+
+- /* Unmap the SGT buffer */
+- dma_unmap_single(dev, fd_addr, swa->sg.sgt_size,
+- DMA_BIDIRECTIONAL);
+-
+ if (!swa->tso.is_last_fd)
+ should_free_skb = 0;
+ } else {
+diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_mgmt.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_mgmt.c
+index ebc77771f5dac..4aa1f433ed24d 100644
+--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_mgmt.c
++++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_mgmt.c
+@@ -643,6 +643,7 @@ int hinic_pf_to_mgmt_init(struct hinic_pf_to_mgmt *pf_to_mgmt,
+ err = alloc_msg_buf(pf_to_mgmt);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to allocate msg buffers\n");
++ destroy_workqueue(pf_to_mgmt->workq);
+ hinic_health_reporters_destroy(hwdev->devlink_dev);
+ return err;
+ }
+@@ -650,6 +651,7 @@ int hinic_pf_to_mgmt_init(struct hinic_pf_to_mgmt *pf_to_mgmt,
+ err = hinic_api_cmd_init(pf_to_mgmt->cmd_chain, hwif);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to initialize cmd chains\n");
++ destroy_workqueue(pf_to_mgmt->workq);
+ hinic_health_reporters_destroy(hwdev->devlink_dev);
+ return err;
+ }
+diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_wq.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_wq.c
+index f7dc7d825f637..4daf6bf291ecb 100644
+--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_wq.c
++++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_wq.c
+@@ -386,7 +386,7 @@ static int alloc_wqes_shadow(struct hinic_wq *wq)
+ return -ENOMEM;
+
+ wq->shadow_idx = devm_kcalloc(&pdev->dev, wq->num_q_pages,
+- sizeof(wq->prod_idx), GFP_KERNEL);
++ sizeof(*wq->shadow_idx), GFP_KERNEL);
+ if (!wq->shadow_idx)
+ goto err_shadow_idx;
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c
+index a230edb384665..4a9de59121d85 100644
+--- a/drivers/net/ethernet/intel/ice/ice_devlink.c
++++ b/drivers/net/ethernet/intel/ice/ice_devlink.c
+@@ -753,9 +753,12 @@ int ice_devlink_create_vf_port(struct ice_vf *vf)
+
+ pf = vf->pf;
+ dev = ice_pf_to_dev(pf);
+- vsi = ice_get_vf_vsi(vf);
+ devlink_port = &vf->devlink_port;
+
++ vsi = ice_get_vf_vsi(vf);
++ if (!vsi)
++ return -EINVAL;
++
+ attrs.flavour = DEVLINK_PORT_FLAVOUR_PCI_VF;
+ attrs.pci_vf.pf = pf->hw.bus.func;
+ attrs.pci_vf.vf = vf->vf_id;
+diff --git a/drivers/net/ethernet/intel/ice/ice_repr.c b/drivers/net/ethernet/intel/ice/ice_repr.c
+index 848f2adea563e..a91b81c3088b5 100644
+--- a/drivers/net/ethernet/intel/ice/ice_repr.c
++++ b/drivers/net/ethernet/intel/ice/ice_repr.c
+@@ -293,8 +293,13 @@ static int ice_repr_add(struct ice_vf *vf)
+ struct ice_q_vector *q_vector;
+ struct ice_netdev_priv *np;
+ struct ice_repr *repr;
++ struct ice_vsi *vsi;
+ int err;
+
++ vsi = ice_get_vf_vsi(vf);
++ if (!vsi)
++ return -EINVAL;
++
+ repr = kzalloc(sizeof(*repr), GFP_KERNEL);
+ if (!repr)
+ return -ENOMEM;
+@@ -313,7 +318,7 @@ static int ice_repr_add(struct ice_vf *vf)
+ goto err_alloc;
+ }
+
+- repr->src_vsi = ice_get_vf_vsi(vf);
++ repr->src_vsi = vsi;
+ repr->vf = vf;
+ vf->repr = repr;
+ np = netdev_priv(repr->netdev);
+diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
+index 0c438219f7a39..bb1721f1321db 100644
+--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
++++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
+@@ -46,7 +46,12 @@ static void ice_free_vf_entries(struct ice_pf *pf)
+ */
+ static void ice_vf_vsi_release(struct ice_vf *vf)
+ {
+- ice_vsi_release(ice_get_vf_vsi(vf));
++ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
++
++ if (WARN_ON(!vsi))
++ return;
++
++ ice_vsi_release(vsi);
+ ice_vf_invalidate_vsi(vf);
+ }
+
+@@ -104,6 +109,8 @@ static void ice_dis_vf_mappings(struct ice_vf *vf)
+
+ hw = &pf->hw;
+ vsi = ice_get_vf_vsi(vf);
++ if (WARN_ON(!vsi))
++ return;
+
+ dev = ice_pf_to_dev(pf);
+ wr32(hw, VPINT_ALLOC(vf->vf_id), 0);
+@@ -341,6 +348,9 @@ static void ice_ena_vf_q_mappings(struct ice_vf *vf, u16 max_txq, u16 max_rxq)
+ struct ice_hw *hw = &vf->pf->hw;
+ u32 reg;
+
++ if (WARN_ON(!vsi))
++ return;
++
+ /* set regardless of mapping mode */
+ wr32(hw, VPLAN_TXQ_MAPENA(vf->vf_id), VPLAN_TXQ_MAPENA_TX_ENA_M);
+
+@@ -386,6 +396,9 @@ static void ice_ena_vf_mappings(struct ice_vf *vf)
+ {
+ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
+
++ if (WARN_ON(!vsi))
++ return;
++
+ ice_ena_vf_msix_mappings(vf);
+ ice_ena_vf_q_mappings(vf, vsi->alloc_txq, vsi->alloc_rxq);
+ }
+@@ -1128,6 +1141,8 @@ static struct ice_vf *ice_get_vf_from_pfq(struct ice_pf *pf, u16 pfq)
+ u16 rxq_idx;
+
+ vsi = ice_get_vf_vsi(vf);
++ if (!vsi)
++ continue;
+
+ ice_for_each_rxq(vsi, rxq_idx)
+ if (vsi->rxq_map[rxq_idx] == pfq) {
+@@ -1521,8 +1536,15 @@ static int ice_calc_all_vfs_min_tx_rate(struct ice_pf *pf)
+ static bool
+ ice_min_tx_rate_oversubscribed(struct ice_vf *vf, int min_tx_rate)
+ {
+- int link_speed_mbps = ice_get_link_speed_mbps(ice_get_vf_vsi(vf));
+- int all_vfs_min_tx_rate = ice_calc_all_vfs_min_tx_rate(vf->pf);
++ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
++ int all_vfs_min_tx_rate;
++ int link_speed_mbps;
++
++ if (WARN_ON(!vsi))
++ return false;
++
++ link_speed_mbps = ice_get_link_speed_mbps(vsi);
++ all_vfs_min_tx_rate = ice_calc_all_vfs_min_tx_rate(vf->pf);
+
+ /* this VF's previous rate is being overwritten */
+ all_vfs_min_tx_rate -= vf->min_tx_rate;
+@@ -1566,6 +1588,10 @@ ice_set_vf_bw(struct net_device *netdev, int vf_id, int min_tx_rate,
+ goto out_put_vf;
+
+ vsi = ice_get_vf_vsi(vf);
++ if (!vsi) {
++ ret = -EINVAL;
++ goto out_put_vf;
++ }
+
+ /* when max_tx_rate is zero that means no max Tx rate limiting, so only
+ * check if max_tx_rate is non-zero
+diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+index 6578059d94794..aefd66a4db80d 100644
+--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
++++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+@@ -220,8 +220,10 @@ static void ice_vf_clear_counters(struct ice_vf *vf)
+ {
+ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
+
++ if (vsi)
++ vsi->num_vlan = 0;
++
+ vf->num_mac = 0;
+- vsi->num_vlan = 0;
+ memset(&vf->mdd_tx_events, 0, sizeof(vf->mdd_tx_events));
+ memset(&vf->mdd_rx_events, 0, sizeof(vf->mdd_rx_events));
+ }
+@@ -251,6 +253,9 @@ static int ice_vf_rebuild_vsi(struct ice_vf *vf)
+ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
+ struct ice_pf *pf = vf->pf;
+
++ if (WARN_ON(!vsi))
++ return -EINVAL;
++
+ if (ice_vsi_rebuild(vsi, true)) {
+ dev_err(ice_pf_to_dev(pf), "failed to rebuild VF %d VSI\n",
+ vf->vf_id);
+@@ -514,6 +519,10 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
+ ice_trigger_vf_reset(vf, flags & ICE_VF_RESET_VFLR, false);
+
+ vsi = ice_get_vf_vsi(vf);
++ if (WARN_ON(!vsi)) {
++ err = -EIO;
++ goto out_unlock;
++ }
+
+ ice_dis_vf_qs(vf);
+
+@@ -572,6 +581,11 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
+
+ vf->vf_ops->post_vsi_rebuild(vf);
+ vsi = ice_get_vf_vsi(vf);
++ if (WARN_ON(!vsi)) {
++ err = -EINVAL;
++ goto out_unlock;
++ }
++
+ ice_eswitch_update_repr(vsi);
+ ice_eswitch_replay_vf_mac_rule(vf);
+
+@@ -610,6 +624,9 @@ void ice_dis_vf_qs(struct ice_vf *vf)
+ {
+ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
+
++ if (WARN_ON(!vsi))
++ return;
++
+ ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, vf->vf_id);
+ ice_vsi_stop_all_rx_rings(vsi);
+ ice_set_vf_state_qs_dis(vf);
+@@ -790,6 +807,9 @@ static int ice_vf_rebuild_host_mac_cfg(struct ice_vf *vf)
+ u8 broadcast[ETH_ALEN];
+ int status;
+
++ if (WARN_ON(!vsi))
++ return -EINVAL;
++
+ if (ice_is_eswitch_mode_switchdev(vf->pf))
+ return 0;
+
+@@ -875,6 +895,9 @@ static int ice_vf_rebuild_host_tx_rate_cfg(struct ice_vf *vf)
+ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
+ int err;
+
++ if (WARN_ON(!vsi))
++ return -EINVAL;
++
+ if (vf->min_tx_rate) {
+ err = ice_set_min_bw_limit(vsi, (u64)vf->min_tx_rate * 1000);
+ if (err) {
+@@ -938,6 +961,9 @@ void ice_vf_rebuild_host_cfg(struct ice_vf *vf)
+ struct device *dev = ice_pf_to_dev(vf->pf);
+ struct ice_vsi *vsi = ice_get_vf_vsi(vf);
+
++ if (WARN_ON(!vsi))
++ return;
++
+ ice_vf_set_host_trust_cfg(vf);
+
+ if (ice_vf_rebuild_host_mac_cfg(vf))
+diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+index 2889e050a4c93..5405a0e752cf7 100644
+--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c
++++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+@@ -2392,6 +2392,11 @@ static int ice_vc_ena_vlan_stripping(struct ice_vf *vf)
+ }
+
+ vsi = ice_get_vf_vsi(vf);
++ if (!vsi) {
++ v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ goto error_param;
++ }
++
+ if (vsi->inner_vlan_ops.ena_stripping(vsi, ETH_P_8021Q))
+ v_ret = VIRTCHNL_STATUS_ERR_PARAM;
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c
+index 8e38ee2faf586..b74ccbd1591ae 100644
+--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c
++++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c
+@@ -1344,7 +1344,12 @@ static void ice_vf_fdir_dump_info(struct ice_vf *vf)
+ pf = vf->pf;
+ hw = &pf->hw;
+ dev = ice_pf_to_dev(pf);
+- vf_vsi = pf->vsi[vf->lan_vsi_idx];
++ vf_vsi = ice_get_vf_vsi(vf);
++ if (!vf_vsi) {
++ dev_dbg(dev, "VF %d: invalid VSI pointer\n", vf->vf_id);
++ return;
++ }
++
+ vsi_num = ice_get_hw_vsi_num(hw, vf_vsi->idx);
+
+ fd_size = rd32(hw, VSIQF_FD_SIZE(vsi_num));
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+index 057dde6f44174..9401127fb0ecf 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+@@ -178,13 +178,13 @@ static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_a
+ *actions_performed = BIT(action);
+ switch (action) {
+ case DEVLINK_RELOAD_ACTION_DRIVER_REINIT:
+- return mlx5_load_one(dev);
++ return mlx5_load_one(dev, false);
+ case DEVLINK_RELOAD_ACTION_FW_ACTIVATE:
+ if (limit == DEVLINK_RELOAD_LIMIT_NO_RESET)
+ break;
+ /* On fw_activate action, also driver is reloaded and reinit performed */
+ *actions_performed |= BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT);
+- return mlx5_load_one(dev);
++ return mlx5_load_one(dev, false);
+ default:
+ /* Unsupported action should not get to this function */
+ WARN_ON(1);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
+index 8653ac0fd865c..ee34e861d3af6 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
+@@ -1221,6 +1221,7 @@ mlx5e_tx_mpwqe_supported(struct mlx5_core_dev *mdev)
+ MLX5_CAP_ETH(mdev, enhanced_multi_pkt_send_wqe);
+ }
+
++int mlx5e_get_pf_num_tirs(struct mlx5_core_dev *mdev);
+ int mlx5e_priv_init(struct mlx5e_priv *priv,
+ const struct mlx5e_profile *profile,
+ struct net_device *netdev,
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
+index bec9ed0103a93..2b80fe73549d2 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/ct_fs_smfs.c
+@@ -101,7 +101,7 @@ mlx5_ct_fs_smfs_matcher_create(struct mlx5_ct_fs *fs, struct mlx5dr_table *tbl,
+ spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS_2 | MLX5_MATCH_OUTER_HEADERS;
+
+ dr_matcher = mlx5_smfs_matcher_create(tbl, priority, spec);
+- kfree(spec);
++ kvfree(spec);
+ if (!dr_matcher)
+ return ERR_PTR(-EINVAL);
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+index fa229998606c2..72867a8ff48b6 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -5251,6 +5251,15 @@ mlx5e_calc_max_nch(struct mlx5_core_dev *mdev, struct net_device *netdev,
+ return max_nch;
+ }
+
++int mlx5e_get_pf_num_tirs(struct mlx5_core_dev *mdev)
++{
++ /* Indirect TIRS: 2 sets of TTCs (inner + outer steering)
++ * and 1 set of direct TIRS
++ */
++ return 2 * MLX5E_NUM_INDIR_TIRS
++ + mlx5e_profile_max_num_channels(mdev, &mlx5e_nic_profile);
++}
++
+ /* mlx5e generic netdev management API (move to en_common.c) */
+ int mlx5e_priv_init(struct mlx5e_priv *priv,
+ const struct mlx5e_profile *profile,
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+index 6b7e7ea6ded23..a464461f14189 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+@@ -604,10 +604,16 @@ bool mlx5e_eswitch_vf_rep(const struct net_device *netdev)
+ return netdev->netdev_ops == &mlx5e_netdev_ops_rep;
+ }
+
++/* One indirect TIR set for outer. Inner not supported in reps. */
++#define REP_NUM_INDIR_TIRS MLX5E_NUM_INDIR_TIRS
++
+ static int mlx5e_rep_max_nch_limit(struct mlx5_core_dev *mdev)
+ {
+- return (1 << MLX5_CAP_GEN(mdev, log_max_tir)) /
+- mlx5_eswitch_get_total_vports(mdev);
++ int max_tir_num = 1 << MLX5_CAP_GEN(mdev, log_max_tir);
++ int num_vports = mlx5_eswitch_get_total_vports(mdev);
++
++ return (max_tir_num - mlx5e_get_pf_num_tirs(mdev)
++ - (num_vports * REP_NUM_INDIR_TIRS)) / num_vports;
+ }
+
+ static void mlx5e_build_rep_params(struct net_device *netdev)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+index 3ad67e6b5586d..89ba72e8d1091 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+@@ -2071,16 +2071,16 @@ void mlx5_del_flow_rules(struct mlx5_flow_handle *handle)
+ down_write_ref_node(&fte->node, false);
+ for (i = handle->num_rules - 1; i >= 0; i--)
+ tree_remove_node(&handle->rule[i]->node, true);
+- if (fte->dests_size) {
+- if (fte->modify_mask)
+- modify_fte(fte);
+- up_write_ref_node(&fte->node, false);
+- } else if (list_empty(&fte->node.children)) {
++ if (list_empty(&fte->node.children)) {
+ del_hw_fte(&fte->node);
+ /* Avoid double call to del_hw_fte */
+ fte->node.del_hw_func = NULL;
+ up_write_ref_node(&fte->node, false);
+ tree_put_node(&fte->node, false);
++ } else if (fte->dests_size) {
++ if (fte->modify_mask)
++ modify_fte(fte);
++ up_write_ref_node(&fte->node, false);
+ } else {
+ up_write_ref_node(&fte->node, false);
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+index 81eb67fb95b04..052af4901c0b9 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+@@ -149,7 +149,7 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev)
+ if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) {
+ complete(&fw_reset->done);
+ } else {
+- mlx5_load_one(dev);
++ mlx5_load_one(dev, false);
+ devlink_remote_reload_actions_performed(priv_to_devlink(dev), 0,
+ BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
+ BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE));
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
+index c1df0d3595d87..d758848d34d0c 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
+@@ -10,6 +10,7 @@ struct mlx5_timeouts {
+
+ static const u32 tout_def_sw_val[MAX_TIMEOUT_TYPES] = {
+ [MLX5_TO_FW_PRE_INIT_TIMEOUT_MS] = 120000,
++ [MLX5_TO_FW_PRE_INIT_ON_RECOVERY_TIMEOUT_MS] = 7200000,
+ [MLX5_TO_FW_PRE_INIT_WARN_MESSAGE_INTERVAL_MS] = 20000,
+ [MLX5_TO_FW_PRE_INIT_WAIT_MS] = 2,
+ [MLX5_TO_FW_INIT_MS] = 2000,
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
+index 1c42ead782fa7..257c03eeab365 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
+@@ -7,6 +7,7 @@
+ enum mlx5_timeouts_types {
+ /* pre init timeouts (not read from FW) */
+ MLX5_TO_FW_PRE_INIT_TIMEOUT_MS,
++ MLX5_TO_FW_PRE_INIT_ON_RECOVERY_TIMEOUT_MS,
+ MLX5_TO_FW_PRE_INIT_WARN_MESSAGE_INTERVAL_MS,
+ MLX5_TO_FW_PRE_INIT_WAIT_MS,
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+index ef196cb764e2a..8b52636999943 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+@@ -1014,7 +1014,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev)
+ mlx5_devcom_unregister_device(dev->priv.devcom);
+ }
+
+-static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot)
++static int mlx5_function_setup(struct mlx5_core_dev *dev, u64 timeout)
+ {
+ int err;
+
+@@ -1029,11 +1029,11 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, bool boot)
+
+ /* wait for firmware to accept initialization segments configurations
+ */
+- err = wait_fw_init(dev, mlx5_tout_ms(dev, FW_PRE_INIT_TIMEOUT),
++ err = wait_fw_init(dev, timeout,
+ mlx5_tout_ms(dev, FW_PRE_INIT_WARN_MESSAGE_INTERVAL));
+ if (err) {
+ mlx5_core_err(dev, "Firmware over %llu MS in pre-initializing state, aborting\n",
+- mlx5_tout_ms(dev, FW_PRE_INIT_TIMEOUT));
++ timeout);
+ return err;
+ }
+
+@@ -1296,7 +1296,7 @@ int mlx5_init_one(struct mlx5_core_dev *dev)
+ mutex_lock(&dev->intf_state_mutex);
+ dev->state = MLX5_DEVICE_STATE_UP;
+
+- err = mlx5_function_setup(dev, true);
++ err = mlx5_function_setup(dev, mlx5_tout_ms(dev, FW_PRE_INIT_TIMEOUT));
+ if (err)
+ goto err_function;
+
+@@ -1360,9 +1360,10 @@ out:
+ mutex_unlock(&dev->intf_state_mutex);
+ }
+
+-int mlx5_load_one(struct mlx5_core_dev *dev)
++int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery)
+ {
+ int err = 0;
++ u64 timeout;
+
+ mutex_lock(&dev->intf_state_mutex);
+ if (test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
+@@ -1372,7 +1373,11 @@ int mlx5_load_one(struct mlx5_core_dev *dev)
+ /* remove any previous indication of internal error */
+ dev->state = MLX5_DEVICE_STATE_UP;
+
+- err = mlx5_function_setup(dev, false);
++ if (recovery)
++ timeout = mlx5_tout_ms(dev, FW_PRE_INIT_ON_RECOVERY_TIMEOUT);
++ else
++ timeout = mlx5_tout_ms(dev, FW_PRE_INIT_TIMEOUT);
++ err = mlx5_function_setup(dev, timeout);
+ if (err)
+ goto err_function;
+
+@@ -1746,7 +1751,7 @@ static void mlx5_pci_resume(struct pci_dev *pdev)
+
+ mlx5_pci_trace(dev, "Enter, loading driver..\n");
+
+- err = mlx5_load_one(dev);
++ err = mlx5_load_one(dev, false);
+
+ mlx5_pci_trace(dev, "Done, err = %d, device %s\n", err,
+ !err ? "recovered" : "Failed");
+@@ -1833,7 +1838,7 @@ static int mlx5_resume(struct pci_dev *pdev)
+ {
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+
+- return mlx5_load_one(dev);
++ return mlx5_load_one(dev, false);
+ }
+
+ static const struct pci_device_id mlx5_core_pci_table[] = {
+@@ -1878,7 +1883,7 @@ int mlx5_recover_device(struct mlx5_core_dev *dev)
+ return -EIO;
+ }
+
+- return mlx5_load_one(dev);
++ return mlx5_load_one(dev, true);
+ }
+
+ static struct pci_driver mlx5_core_driver = {
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+index a9b2d6ead542b..9026be1d62232 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+@@ -290,7 +290,7 @@ void mlx5_mdev_uninit(struct mlx5_core_dev *dev);
+ int mlx5_init_one(struct mlx5_core_dev *dev);
+ void mlx5_uninit_one(struct mlx5_core_dev *dev);
+ void mlx5_unload_one(struct mlx5_core_dev *dev);
+-int mlx5_load_one(struct mlx5_core_dev *dev);
++int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery);
+
+ int mlx5_vport_get_other_func_cap(struct mlx5_core_dev *dev, u16 function_id, void *out);
+
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c
+index 5f92b16913605..aff6d4f35cd2f 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c
+@@ -168,8 +168,6 @@ static int mlxsw_sp_dcbnl_ieee_setets(struct net_device *dev,
+ static int mlxsw_sp_dcbnl_app_validate(struct net_device *dev,
+ struct dcb_app *app)
+ {
+- int prio;
+-
+ if (app->priority >= IEEE_8021QAZ_MAX_TCS) {
+ netdev_err(dev, "APP entry with priority value %u is invalid\n",
+ app->priority);
+@@ -183,17 +181,6 @@ static int mlxsw_sp_dcbnl_app_validate(struct net_device *dev,
+ app->protocol);
+ return -EINVAL;
+ }
+-
+- /* Warn about any DSCP APP entries with the same PID. */
+- prio = fls(dcb_ieee_getapp_mask(dev, app));
+- if (prio--) {
+- if (prio < app->priority)
+- netdev_warn(dev, "Choosing priority %d for DSCP %d in favor of previously-active value of %d\n",
+- app->priority, app->protocol, prio);
+- else if (prio > app->priority)
+- netdev_warn(dev, "Ignoring new priority %d for DSCP %d in favor of current value of %d\n",
+- app->priority, app->protocol, prio);
+- }
+ break;
+
+ case IEEE_8021QAZ_APP_SEL_ETHERTYPE:
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
+index 47b061b99160e..ed4d0d3448f31 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_trap.c
+@@ -864,7 +864,7 @@ static const struct mlxsw_sp_trap_item mlxsw_sp_trap_items_arr[] = {
+ .trap = MLXSW_SP_TRAP_CONTROL(LLDP, LLDP, TRAP),
+ .listeners_arr = {
+ MLXSW_RXL(mlxsw_sp_rx_ptp_listener, LLDP, TRAP_TO_CPU,
+- false, SP_LLDP, DISCARD),
++ true, SP_LLDP, DISCARD),
+ },
+ },
+ {
+diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
+index f8edb3f1b73ad..186cb28c03bdb 100644
+--- a/drivers/net/ethernet/sfc/ef10.c
++++ b/drivers/net/ethernet/sfc/ef10.c
+@@ -2256,7 +2256,7 @@ int efx_ef10_tx_tso_desc(struct efx_tx_queue *tx_queue, struct sk_buff *skb,
+ * guaranteed to satisfy the second as we only attempt TSO if
+ * inner_network_header <= 208.
+ */
+- ip_tot_len = -EFX_TSO2_MAX_HDRLEN;
++ ip_tot_len = 0x10000 - EFX_TSO2_MAX_HDRLEN;
+ EFX_WARN_ON_ONCE_PARANOID(mss + EFX_TSO2_MAX_HDRLEN +
+ (tcp->doff << 2u) > ip_tot_len);
+
+diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
+index 9f1759593b942..2fc51dc5eb0bb 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
+@@ -1084,8 +1084,9 @@ static int stmmac_test_rxp(struct stmmac_priv *priv)
+ unsigned char addr[ETH_ALEN] = {0xde, 0xad, 0xbe, 0xef, 0x00, 0x00};
+ struct tc_cls_u32_offload cls_u32 = { };
+ struct stmmac_packet_attrs attr = { };
+- struct tc_action **actions, *act;
++ struct tc_action **actions;
+ struct tc_u32_sel *sel;
++ struct tcf_gact *gact;
+ struct tcf_exts *exts;
+ int ret, i, nk = 1;
+
+@@ -1110,8 +1111,8 @@ static int stmmac_test_rxp(struct stmmac_priv *priv)
+ goto cleanup_exts;
+ }
+
+- act = kcalloc(nk, sizeof(*act), GFP_KERNEL);
+- if (!act) {
++ gact = kcalloc(nk, sizeof(*gact), GFP_KERNEL);
++ if (!gact) {
+ ret = -ENOMEM;
+ goto cleanup_actions;
+ }
+@@ -1126,9 +1127,7 @@ static int stmmac_test_rxp(struct stmmac_priv *priv)
+ exts->nr_actions = nk;
+ exts->actions = actions;
+ for (i = 0; i < nk; i++) {
+- struct tcf_gact *gact = to_gact(&act[i]);
+-
+- actions[i] = &act[i];
++ actions[i] = (struct tc_action *)&gact[i];
+ gact->tcf_action = TC_ACT_SHOT;
+ }
+
+@@ -1152,7 +1151,7 @@ static int stmmac_test_rxp(struct stmmac_priv *priv)
+ stmmac_tc_setup_cls_u32(priv, priv, &cls_u32);
+
+ cleanup_act:
+- kfree(act);
++ kfree(gact);
+ cleanup_actions:
+ kfree(actions);
+ cleanup_exts:
+diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
+index affcf92cd3aa5..fb30bc5d56cb7 100644
+--- a/drivers/net/ethernet/ti/Kconfig
++++ b/drivers/net/ethernet/ti/Kconfig
+@@ -94,6 +94,7 @@ config TI_K3_AM65_CPSW_NUSS
+ depends on ARCH_K3 && OF && TI_K3_UDMA_GLUE_LAYER
+ select NET_DEVLINK
+ select TI_DAVINCI_MDIO
++ select PHYLINK
+ imply PHY_TI_GMII_SEL
+ depends on TI_K3_AM65_CPTS || !TI_K3_AM65_CPTS
+ help
+diff --git a/drivers/net/ethernet/xscale/ptp_ixp46x.c b/drivers/net/ethernet/xscale/ptp_ixp46x.c
+index 1f382777aa5a8..9abbdb71e629f 100644
+--- a/drivers/net/ethernet/xscale/ptp_ixp46x.c
++++ b/drivers/net/ethernet/xscale/ptp_ixp46x.c
+@@ -271,7 +271,7 @@ static int ptp_ixp_probe(struct platform_device *pdev)
+ ixp_clock.master_irq = platform_get_irq(pdev, 0);
+ ixp_clock.slave_irq = platform_get_irq(pdev, 1);
+ if (IS_ERR(ixp_clock.regs) ||
+- !ixp_clock.master_irq || !ixp_clock.slave_irq)
++ ixp_clock.master_irq < 0 || ixp_clock.slave_irq < 0)
+ return -ENXIO;
+
+ ixp_clock.caps = ptp_ixp_caps;
+diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
+index fde1c492ca02a..b1dece6b96986 100644
+--- a/drivers/net/hyperv/netvsc_drv.c
++++ b/drivers/net/hyperv/netvsc_drv.c
+@@ -2671,7 +2671,10 @@ static int netvsc_suspend(struct hv_device *dev)
+
+ /* Save the current config info */
+ ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev);
+-
++ if (!ndev_ctx->saved_netvsc_dev_info) {
++ ret = -ENOMEM;
++ goto out;
++ }
+ ret = netvsc_detach(net, nvdev);
+ out:
+ rtnl_unlock();
+diff --git a/drivers/net/ipa/ipa_endpoint.c b/drivers/net/ipa/ipa_endpoint.c
+index 53764f3c0c7e4..b9b4cb82790f8 100644
+--- a/drivers/net/ipa/ipa_endpoint.c
++++ b/drivers/net/ipa/ipa_endpoint.c
+@@ -587,19 +587,23 @@ static void ipa_endpoint_init_hdr_ext(struct ipa_endpoint *endpoint)
+ struct ipa *ipa = endpoint->ipa;
+ u32 val = 0;
+
+- val |= HDR_ENDIANNESS_FMASK; /* big endian */
+-
+- /* A QMAP header contains a 6 bit pad field at offset 0. The RMNet
+- * driver assumes this field is meaningful in packets it receives,
+- * and assumes the header's payload length includes that padding.
+- * The RMNet driver does *not* pad packets it sends, however, so
+- * the pad field (although 0) should be ignored.
+- */
+- if (endpoint->data->qmap && !endpoint->toward_ipa) {
+- val |= HDR_TOTAL_LEN_OR_PAD_VALID_FMASK;
+- /* HDR_TOTAL_LEN_OR_PAD is 0 (pad, not total_len) */
+- val |= HDR_PAYLOAD_LEN_INC_PADDING_FMASK;
+- /* HDR_TOTAL_LEN_OR_PAD_OFFSET is 0 */
++ if (endpoint->data->qmap) {
++ /* We have a header, so we must specify its endianness */
++ val |= HDR_ENDIANNESS_FMASK; /* big endian */
++
++ /* A QMAP header contains a 6 bit pad field at offset 0.
++ * The RMNet driver assumes this field is meaningful in
++ * packets it receives, and assumes the header's payload
++ * length includes that padding. The RMNet driver does
++ * *not* pad packets it sends, however, so the pad field
++ * (although 0) should be ignored.
++ */
++ if (!endpoint->toward_ipa) {
++ val |= HDR_TOTAL_LEN_OR_PAD_VALID_FMASK;
++ /* HDR_TOTAL_LEN_OR_PAD is 0 (pad, not total_len) */
++ val |= HDR_PAYLOAD_LEN_INC_PADDING_FMASK;
++ /* HDR_TOTAL_LEN_OR_PAD_OFFSET is 0 */
++ }
+ }
+
+ /* HDR_PAYLOAD_LEN_INC_PADDING is 0 */
+@@ -759,8 +763,6 @@ static void ipa_endpoint_init_aggr(struct ipa_endpoint *endpoint)
+
+ close_eof = rx_data->aggr_close_eof;
+ val |= aggr_sw_eof_active_encoded(version, close_eof);
+-
+- /* AGGR_HARD_BYTE_LIMIT_ENABLE is 0 */
+ } else {
+ val |= u32_encode_bits(IPA_ENABLE_DEAGGR,
+ AGGR_EN_FMASK);
+@@ -1060,7 +1062,7 @@ static int ipa_endpoint_replenish_one(struct ipa_endpoint *endpoint,
+
+ ret = gsi_trans_page_add(trans, page, len, offset);
+ if (ret)
+- __free_pages(page, get_order(buffer_size));
++ put_page(page);
+ else
+ trans->data = page; /* transaction owns page now */
+
+@@ -1383,11 +1385,8 @@ void ipa_endpoint_trans_release(struct ipa_endpoint *endpoint,
+ } else {
+ struct page *page = trans->data;
+
+- if (page) {
+- u32 buffer_size = endpoint->data->rx.buffer_size;
+-
+- __free_pages(page, get_order(buffer_size));
+- }
++ if (page)
++ put_page(page);
+ }
+ }
+
+diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
+index 832f09ac075e7..817577e713d70 100644
+--- a/drivers/net/macsec.c
++++ b/drivers/net/macsec.c
+@@ -99,6 +99,7 @@ struct pcpu_secy_stats {
+ * struct macsec_dev - private data
+ * @secy: SecY config
+ * @real_dev: pointer to underlying netdevice
++ * @dev_tracker: refcount tracker for @real_dev reference
+ * @stats: MACsec device stats
+ * @secys: linked list of SecY's on the underlying device
+ * @gro_cells: pointer to the Generic Receive Offload cell
+@@ -107,6 +108,7 @@ struct pcpu_secy_stats {
+ struct macsec_dev {
+ struct macsec_secy secy;
+ struct net_device *real_dev;
++ netdevice_tracker dev_tracker;
+ struct pcpu_secy_stats __percpu *stats;
+ struct list_head secys;
+ struct gro_cells gro_cells;
+@@ -3459,6 +3461,9 @@ static int macsec_dev_init(struct net_device *dev)
+ if (is_zero_ether_addr(dev->broadcast))
+ memcpy(dev->broadcast, real_dev->broadcast, dev->addr_len);
+
++ /* Get macsec's reference to real_dev */
++ dev_hold_track(real_dev, &macsec->dev_tracker, GFP_KERNEL);
++
+ return 0;
+ }
+
+@@ -3704,6 +3709,8 @@ static void macsec_free_netdev(struct net_device *dev)
+ free_percpu(macsec->stats);
+ free_percpu(macsec->secy.tx_sc.stats);
+
++ /* Get rid of the macsec's reference to real_dev */
++ dev_put_track(macsec->real_dev, &macsec->dev_tracker);
+ }
+
+ static void macsec_setup(struct net_device *dev)
+diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
+index cd9aa353b653f..48c7d715a9e38 100644
+--- a/drivers/net/phy/micrel.c
++++ b/drivers/net/phy/micrel.c
+@@ -497,7 +497,7 @@ static int kszphy_config_reset(struct phy_device *phydev)
+ }
+ }
+
+- if (priv->led_mode >= 0)
++ if (priv->type && priv->led_mode >= 0)
+ kszphy_setup_led(phydev, priv->type->led_mode_reg, priv->led_mode);
+
+ return 0;
+@@ -513,10 +513,10 @@ static int kszphy_config_init(struct phy_device *phydev)
+
+ type = priv->type;
+
+- if (type->has_broadcast_disable)
++ if (type && type->has_broadcast_disable)
+ kszphy_broadcast_disable(phydev);
+
+- if (type->has_nand_tree_disable)
++ if (type && type->has_nand_tree_disable)
+ kszphy_nand_tree_disable(phydev);
+
+ return kszphy_config_reset(phydev);
+@@ -1514,7 +1514,7 @@ static int kszphy_probe(struct phy_device *phydev)
+
+ priv->type = type;
+
+- if (type->led_mode_reg) {
++ if (type && type->led_mode_reg) {
+ ret = of_property_read_u32(np, "micrel,led-mode",
+ &priv->led_mode);
+ if (ret)
+@@ -1535,7 +1535,8 @@ static int kszphy_probe(struct phy_device *phydev)
+ unsigned long rate = clk_get_rate(clk);
+ bool rmii_ref_clk_sel_25_mhz;
+
+- priv->rmii_ref_clk_sel = type->has_rmii_ref_clk_sel;
++ if (type)
++ priv->rmii_ref_clk_sel = type->has_rmii_ref_clk_sel;
+ rmii_ref_clk_sel_25_mhz = of_property_read_bool(np,
+ "micrel,rmii-reference-clock-select-25-mhz");
+
+diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
+index 38e47a93fb833..5b5eb630c4b79 100644
+--- a/drivers/net/usb/asix_devices.c
++++ b/drivers/net/usb/asix_devices.c
+@@ -795,11 +795,7 @@ static int ax88772_stop(struct usbnet *dev)
+ {
+ struct asix_common_private *priv = dev->driver_priv;
+
+- /* On unplugged USB, we will get MDIO communication errors and the
+- * PHY will be set in to PHY_HALTED state.
+- */
+- if (priv->phydev->state != PHY_HALTED)
+- phy_stop(priv->phydev);
++ phy_stop(priv->phydev);
+
+ return 0;
+ }
+diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
+index 4ef61f6b85df5..edf0492ad489a 100644
+--- a/drivers/net/usb/smsc95xx.c
++++ b/drivers/net/usb/smsc95xx.c
+@@ -1243,8 +1243,7 @@ static int smsc95xx_start_phy(struct usbnet *dev)
+
+ static int smsc95xx_stop(struct usbnet *dev)
+ {
+- if (dev->net->phydev)
+- phy_stop(dev->net->phydev);
++ phy_stop(dev->net->phydev);
+
+ return 0;
+ }
+diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
+index 9a6450f796dcb..36b24ec116504 100644
+--- a/drivers/net/usb/usbnet.c
++++ b/drivers/net/usb/usbnet.c
+@@ -1616,9 +1616,6 @@ void usbnet_disconnect (struct usb_interface *intf)
+ xdev->bus->bus_name, xdev->devpath,
+ dev->driver_info->description);
+
+- if (dev->driver_info->unbind)
+- dev->driver_info->unbind(dev, intf);
+-
+ net = dev->net;
+ unregister_netdev (net);
+
+@@ -1626,6 +1623,9 @@ void usbnet_disconnect (struct usb_interface *intf)
+
+ usb_scuttle_anchored_urbs(&dev->deferred);
+
++ if (dev->driver_info->unbind)
++ dev->driver_info->unbind(dev, intf);
++
+ usb_kill_urb(dev->interrupt);
+ usb_free_urb(dev->interrupt);
+ kfree(dev->padding_pkt);
+diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
+index b11aaee8b8c03..a11b31191d5aa 100644
+--- a/drivers/net/wireless/ath/ath10k/mac.c
++++ b/drivers/net/wireless/ath/ath10k/mac.c
+@@ -5339,13 +5339,29 @@ err:
+ static void ath10k_stop(struct ieee80211_hw *hw)
+ {
+ struct ath10k *ar = hw->priv;
++ u32 opt;
+
+ ath10k_drain_tx(ar);
+
+ mutex_lock(&ar->conf_mutex);
+ if (ar->state != ATH10K_STATE_OFF) {
+- if (!ar->hw_rfkill_on)
+- ath10k_halt(ar);
++ if (!ar->hw_rfkill_on) {
++ /* If the current driver state is RESTARTING but not yet
++ * fully RESTARTED because of incoming suspend event,
++ * then ath10k_halt() is already called via
++ * ath10k_core_restart() and should not be called here.
++ */
++ if (ar->state != ATH10K_STATE_RESTARTING) {
++ ath10k_halt(ar);
++ } else {
++ /* Suspending here, because when in RESTARTING
++ * state, ath10k_core_stop() skips
++ * ath10k_wait_for_suspend().
++ */
++ opt = WMI_PDEV_SUSPEND_AND_DISABLE_INTR;
++ ath10k_wait_for_suspend(ar, opt);
++ }
++ }
+ ar->state = ATH10K_STATE_OFF;
+ }
+ mutex_unlock(&ar->conf_mutex);
+diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
+index 58ff761393db1..54d738bdee0e1 100644
+--- a/drivers/net/wireless/ath/ath11k/mac.c
++++ b/drivers/net/wireless/ath/ath11k/mac.c
+@@ -5520,8 +5520,8 @@ static void ath11k_mgmt_over_wmi_tx_work(struct work_struct *work)
+ }
+
+ arvif = ath11k_vif_to_arvif(skb_cb->vif);
+- if (ar->allocated_vdev_map & (1LL << arvif->vdev_id) &&
+- arvif->is_started) {
++ mutex_lock(&ar->conf_mutex);
++ if (ar->allocated_vdev_map & (1LL << arvif->vdev_id)) {
+ ret = ath11k_mac_mgmt_tx_wmi(ar, arvif, skb);
+ if (ret) {
+ ath11k_warn(ar->ab, "failed to tx mgmt frame, vdev_id %d :%d\n",
+@@ -5539,6 +5539,7 @@ static void ath11k_mgmt_over_wmi_tx_work(struct work_struct *work)
+ arvif->is_started);
+ ath11k_mgmt_over_wmi_tx_drop(ar, skb);
+ }
++ mutex_unlock(&ar->conf_mutex);
+ }
+ }
+
+@@ -7114,6 +7115,7 @@ ath11k_mac_op_unassign_vif_chanctx(struct ieee80211_hw *hw,
+ struct ath11k *ar = hw->priv;
+ struct ath11k_base *ab = ar->ab;
+ struct ath11k_vif *arvif = (void *)vif->drv_priv;
++ struct ath11k_peer *peer;
+ int ret;
+
+ mutex_lock(&ar->conf_mutex);
+@@ -7125,9 +7127,13 @@ ath11k_mac_op_unassign_vif_chanctx(struct ieee80211_hw *hw,
+ WARN_ON(!arvif->is_started);
+
+ if (ab->hw_params.vdev_start_delay &&
+- arvif->vdev_type == WMI_VDEV_TYPE_MONITOR &&
+- ath11k_peer_find_by_addr(ab, ar->mac_addr))
+- ath11k_peer_delete(ar, arvif->vdev_id, ar->mac_addr);
++ arvif->vdev_type == WMI_VDEV_TYPE_MONITOR) {
++ spin_lock_bh(&ab->base_lock);
++ peer = ath11k_peer_find_by_addr(ab, ar->mac_addr);
++ spin_unlock_bh(&ab->base_lock);
++ if (peer)
++ ath11k_peer_delete(ar, arvif->vdev_id, ar->mac_addr);
++ }
+
+ if (arvif->vdev_type == WMI_VDEV_TYPE_MONITOR) {
+ ret = ath11k_mac_monitor_stop(ar);
+diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
+index 903758751c99a..8a3ff12057e89 100644
+--- a/drivers/net/wireless/ath/ath11k/pci.c
++++ b/drivers/net/wireless/ath/ath11k/pci.c
+@@ -191,6 +191,7 @@ void ath11k_pci_write32(struct ath11k_base *ab, u32 offset, u32 value)
+ {
+ struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);
+ u32 window_start;
++ int ret = 0;
+
+ /* for offset beyond BAR + 4K - 32, may
+ * need to wakeup MHI to access.
+@@ -198,7 +199,7 @@ void ath11k_pci_write32(struct ath11k_base *ab, u32 offset, u32 value)
+ if (ab->hw_params.wakeup_mhi &&
+ test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) &&
+ offset >= ACCESS_ALWAYS_OFF)
+- mhi_device_get_sync(ab_pci->mhi_ctrl->mhi_dev);
++ ret = mhi_device_get_sync(ab_pci->mhi_ctrl->mhi_dev);
+
+ if (offset < WINDOW_START) {
+ iowrite32(value, ab->mem + offset);
+@@ -222,7 +223,8 @@ void ath11k_pci_write32(struct ath11k_base *ab, u32 offset, u32 value)
+
+ if (ab->hw_params.wakeup_mhi &&
+ test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) &&
+- offset >= ACCESS_ALWAYS_OFF)
++ offset >= ACCESS_ALWAYS_OFF &&
++ !ret)
+ mhi_device_put(ab_pci->mhi_ctrl->mhi_dev);
+ }
+
+@@ -230,6 +232,7 @@ u32 ath11k_pci_read32(struct ath11k_base *ab, u32 offset)
+ {
+ struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);
+ u32 val, window_start;
++ int ret = 0;
+
+ /* for offset beyond BAR + 4K - 32, may
+ * need to wakeup MHI to access.
+@@ -237,7 +240,7 @@ u32 ath11k_pci_read32(struct ath11k_base *ab, u32 offset)
+ if (ab->hw_params.wakeup_mhi &&
+ test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) &&
+ offset >= ACCESS_ALWAYS_OFF)
+- mhi_device_get_sync(ab_pci->mhi_ctrl->mhi_dev);
++ ret = mhi_device_get_sync(ab_pci->mhi_ctrl->mhi_dev);
+
+ if (offset < WINDOW_START) {
+ val = ioread32(ab->mem + offset);
+@@ -261,7 +264,8 @@ u32 ath11k_pci_read32(struct ath11k_base *ab, u32 offset)
+
+ if (ab->hw_params.wakeup_mhi &&
+ test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) &&
+- offset >= ACCESS_ALWAYS_OFF)
++ offset >= ACCESS_ALWAYS_OFF &&
++ !ret)
+ mhi_device_put(ab_pci->mhi_ctrl->mhi_dev);
+
+ return val;
+diff --git a/drivers/net/wireless/ath/ath11k/spectral.c b/drivers/net/wireless/ath/ath11k/spectral.c
+index 2b18871d5f7cb..516a7b4cd1805 100644
+--- a/drivers/net/wireless/ath/ath11k/spectral.c
++++ b/drivers/net/wireless/ath/ath11k/spectral.c
+@@ -212,7 +212,10 @@ static int ath11k_spectral_scan_config(struct ath11k *ar,
+ return -ENODEV;
+
+ arvif->spectral_enabled = (mode != ATH11K_SPECTRAL_DISABLED);
++
++ spin_lock_bh(&ar->spectral.lock);
+ ar->spectral.mode = mode;
++ spin_unlock_bh(&ar->spectral.lock);
+
+ ret = ath11k_wmi_vdev_spectral_enable(ar, arvif->vdev_id,
+ ATH11K_WMI_SPECTRAL_TRIGGER_CMD_CLEAR,
+@@ -843,9 +846,6 @@ static inline void ath11k_spectral_ring_free(struct ath11k *ar)
+ {
+ struct ath11k_spectral *sp = &ar->spectral;
+
+- if (!sp->enabled)
+- return;
+-
+ ath11k_dbring_srng_cleanup(ar, &sp->rx_ring);
+ ath11k_dbring_buf_cleanup(ar, &sp->rx_ring);
+ }
+@@ -897,15 +897,16 @@ void ath11k_spectral_deinit(struct ath11k_base *ab)
+ if (!sp->enabled)
+ continue;
+
+- ath11k_spectral_debug_unregister(ar);
+- ath11k_spectral_ring_free(ar);
++ mutex_lock(&ar->conf_mutex);
++ ath11k_spectral_scan_config(ar, ATH11K_SPECTRAL_DISABLED);
++ mutex_unlock(&ar->conf_mutex);
+
+ spin_lock_bh(&sp->lock);
+-
+- sp->mode = ATH11K_SPECTRAL_DISABLED;
+ sp->enabled = false;
+-
+ spin_unlock_bh(&sp->lock);
++
++ ath11k_spectral_debug_unregister(ar);
++ ath11k_spectral_ring_free(ar);
+ }
+ }
+
+diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c
+index 2751fe8814df7..0900f75eef202 100644
+--- a/drivers/net/wireless/ath/ath11k/wmi.c
++++ b/drivers/net/wireless/ath/ath11k/wmi.c
+@@ -5789,9 +5789,9 @@ static int ath11k_wmi_tlv_rssi_chain_parse(struct ath11k_base *ab,
+ arvif->bssid,
+ NULL);
+ if (!sta) {
+- ath11k_warn(ab, "not found station for bssid %pM\n",
+- arvif->bssid);
+- ret = -EPROTO;
++ ath11k_dbg(ab, ATH11K_DBG_WMI,
++ "not found station of bssid %pM for rssi chain\n",
++ arvif->bssid);
+ goto exit;
+ }
+
+@@ -5889,8 +5889,9 @@ static int ath11k_wmi_tlv_fw_stats_data_parse(struct ath11k_base *ab,
+ "wmi stats vdev id %d snr %d\n",
+ src->vdev_id, src->beacon_snr);
+ } else {
+- ath11k_warn(ab, "not found station for bssid %pM\n",
+- arvif->bssid);
++ ath11k_dbg(ab, ATH11K_DBG_WMI,
++ "not found station of bssid %pM for vdev stat\n",
++ arvif->bssid);
+ }
+ }
+
+diff --git a/drivers/net/wireless/ath/ath11k/wmi.h b/drivers/net/wireless/ath/ath11k/wmi.h
+index 587f423072508..b5b72483477d7 100644
+--- a/drivers/net/wireless/ath/ath11k/wmi.h
++++ b/drivers/net/wireless/ath/ath11k/wmi.h
+@@ -3088,9 +3088,6 @@ enum scan_dwelltime_adaptive_mode {
+ SCAN_DWELL_MODE_STATIC = 4
+ };
+
+-#define WLAN_SCAN_MAX_NUM_SSID 10
+-#define WLAN_SCAN_MAX_NUM_BSSID 10
+-
+ #define WLAN_SSID_MAX_LEN 32
+
+ struct element_info {
+@@ -3105,7 +3102,6 @@ struct wlan_ssid {
+
+ #define WMI_IE_BITMAP_SIZE 8
+
+-#define WMI_SCAN_MAX_NUM_SSID 0x0A
+ /* prefix used by scan requestor ids on the host */
+ #define WMI_HOST_SCAN_REQUESTOR_ID_PREFIX 0xA000
+
+@@ -3113,10 +3109,6 @@ struct wlan_ssid {
+ /* host cycles through the lower 12 bits to generate ids */
+ #define WMI_HOST_SCAN_REQ_ID_PREFIX 0xA000
+
+-#define WLAN_SCAN_PARAMS_MAX_SSID 16
+-#define WLAN_SCAN_PARAMS_MAX_BSSID 4
+-#define WLAN_SCAN_PARAMS_MAX_IE_LEN 256
+-
+ /* Values lower than this may be refused by some firmware revisions with a scan
+ * completion with a timedout reason.
+ */
+@@ -3312,8 +3304,8 @@ struct scan_req_params {
+ u32 n_probes;
+ u32 *chan_list;
+ u32 notify_scan_events;
+- struct wlan_ssid ssid[WLAN_SCAN_MAX_NUM_SSID];
+- struct wmi_mac_addr bssid_list[WLAN_SCAN_MAX_NUM_BSSID];
++ struct wlan_ssid ssid[WLAN_SCAN_PARAMS_MAX_SSID];
++ struct wmi_mac_addr bssid_list[WLAN_SCAN_PARAMS_MAX_BSSID];
+ struct element_info extraie;
+ struct element_info htcap;
+ struct element_info vhtcap;
+diff --git a/drivers/net/wireless/ath/ath9k/ar9003_eeprom.c b/drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
+index b0a4ca3559fd8..abed1effd95ca 100644
+--- a/drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
++++ b/drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
+@@ -5615,7 +5615,7 @@ unsigned int ar9003_get_paprd_scale_factor(struct ath_hw *ah,
+
+ static u8 ar9003_get_eepmisc(struct ath_hw *ah)
+ {
+- return ah->eeprom.map4k.baseEepHeader.eepMisc;
++ return ah->eeprom.ar9300_eep.baseEepHeader.opCapFlags.eepMisc;
+ }
+
+ const struct eeprom_ops eep_ar9300_ops = {
+diff --git a/drivers/net/wireless/ath/ath9k/ar9003_phy.h b/drivers/net/wireless/ath/ath9k/ar9003_phy.h
+index a171dbb29fbb6..ad949eb02f3d2 100644
+--- a/drivers/net/wireless/ath/ath9k/ar9003_phy.h
++++ b/drivers/net/wireless/ath/ath9k/ar9003_phy.h
+@@ -720,7 +720,7 @@
+ #define AR_CH0_TOP2 (AR_SREV_9300(ah) ? 0x1628c : \
+ (AR_SREV_9462(ah) ? 0x16290 : 0x16284))
+ #define AR_CH0_TOP2_XPABIASLVL (AR_SREV_9561(ah) ? 0x1e00 : 0xf000)
+-#define AR_CH0_TOP2_XPABIASLVL_S 12
++#define AR_CH0_TOP2_XPABIASLVL_S (AR_SREV_9561(ah) ? 9 : 12)
+
+ #define AR_CH0_XTAL (AR_SREV_9300(ah) ? 0x16294 : \
+ ((AR_SREV_9462(ah) || AR_SREV_9565(ah)) ? 0x16298 : \
+diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c b/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
+index 6a850a0bfa8ad..a23eaca0326d1 100644
+--- a/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
++++ b/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
+@@ -1016,6 +1016,14 @@ static bool ath9k_rx_prepare(struct ath9k_htc_priv *priv,
+ goto rx_next;
+ }
+
++ if (rxstatus->rs_keyix >= ATH_KEYMAX &&
++ rxstatus->rs_keyix != ATH9K_RXKEYIX_INVALID) {
++ ath_dbg(common, ANY,
++ "Invalid keyix, dropping (keyix: %d)\n",
++ rxstatus->rs_keyix);
++ goto rx_next;
++ }
++
+ /* Get the RX status information */
+
+ memset(rx_status, 0, sizeof(struct ieee80211_rx_status));
+diff --git a/drivers/net/wireless/ath/carl9170/tx.c b/drivers/net/wireless/ath/carl9170/tx.c
+index 1b76f4434c069..791f9f120af3a 100644
+--- a/drivers/net/wireless/ath/carl9170/tx.c
++++ b/drivers/net/wireless/ath/carl9170/tx.c
+@@ -1558,6 +1558,9 @@ static struct carl9170_vif_info *carl9170_pick_beaconing_vif(struct ar9170 *ar)
+ goto out;
+ }
+ } while (ar->beacon_enabled && i--);
++
++ /* no entry found in list */
++ return NULL;
+ }
+
+ out:
+diff --git a/drivers/net/wireless/broadcom/b43/phy_n.c b/drivers/net/wireless/broadcom/b43/phy_n.c
+index cf3ccf4ddfe72..aa5c994656749 100644
+--- a/drivers/net/wireless/broadcom/b43/phy_n.c
++++ b/drivers/net/wireless/broadcom/b43/phy_n.c
+@@ -582,7 +582,7 @@ static void b43_nphy_adjust_lna_gain_table(struct b43_wldev *dev)
+ u16 data[4];
+ s16 gain[2];
+ u16 minmax[2];
+- static const u16 lna_gain[4] = { -2, 10, 19, 25 };
++ static const s16 lna_gain[4] = { -2, 10, 19, 25 };
+
+ if (nphy->hang_avoid)
+ b43_nphy_stay_in_carrier_search(dev, 1);
+diff --git a/drivers/net/wireless/broadcom/b43legacy/phy.c b/drivers/net/wireless/broadcom/b43legacy/phy.c
+index 05404fbd1e70b..c1395e622759e 100644
+--- a/drivers/net/wireless/broadcom/b43legacy/phy.c
++++ b/drivers/net/wireless/broadcom/b43legacy/phy.c
+@@ -1123,7 +1123,7 @@ void b43legacy_phy_lo_b_measure(struct b43legacy_wldev *dev)
+ struct b43legacy_phy *phy = &dev->phy;
+ u16 regstack[12] = { 0 };
+ u16 mls;
+- u16 fval;
++ s16 fval;
+ int i;
+ int j;
+
+diff --git a/drivers/net/wireless/intel/ipw2x00/libipw_tx.c b/drivers/net/wireless/intel/ipw2x00/libipw_tx.c
+index 36d1e6b2568db..4aec1fce1ae29 100644
+--- a/drivers/net/wireless/intel/ipw2x00/libipw_tx.c
++++ b/drivers/net/wireless/intel/ipw2x00/libipw_tx.c
+@@ -383,7 +383,7 @@ netdev_tx_t libipw_xmit(struct sk_buff *skb, struct net_device *dev)
+
+ /* Each fragment may need to have room for encryption
+ * pre/postfix */
+- if (host_encrypt)
++ if (host_encrypt && crypt && crypt->ops)
+ bytes_per_frag -= crypt->ops->extra_mpdu_prefix_len +
+ crypt->ops->extra_mpdu_postfix_len;
+
+diff --git a/drivers/net/wireless/intel/iwlwifi/fw/acpi.c b/drivers/net/wireless/intel/iwlwifi/fw/acpi.c
+index 33aae639ad37e..e6d64152c81a7 100644
+--- a/drivers/net/wireless/intel/iwlwifi/fw/acpi.c
++++ b/drivers/net/wireless/intel/iwlwifi/fw/acpi.c
+@@ -937,6 +937,9 @@ int iwl_sar_geo_init(struct iwl_fw_runtime *fwrt,
+ {
+ int i, j;
+
++ if (!fwrt->geo_enabled)
++ return -ENODATA;
++
+ if (!iwl_sar_geo_support(fwrt))
+ return -EOPNOTSUPP;
+
+diff --git a/drivers/net/wireless/intel/iwlwifi/mei/main.c b/drivers/net/wireless/intel/iwlwifi/mei/main.c
+index b4f45234cfc89..357f14626cf43 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mei/main.c
++++ b/drivers/net/wireless/intel/iwlwifi/mei/main.c
+@@ -493,6 +493,7 @@ void iwl_mei_add_data_to_ring(struct sk_buff *skb, bool cb_tx)
+ if (cb_tx) {
+ struct iwl_sap_cb_data *cb_hdr = skb_push(skb, sizeof(*cb_hdr));
+
++ memset(cb_hdr, 0, sizeof(*cb_hdr));
+ cb_hdr->hdr.type = cpu_to_le16(SAP_MSG_CB_DATA_PACKET);
+ cb_hdr->hdr.len = cpu_to_le16(skb->len - sizeof(cb_hdr->hdr));
+ cb_hdr->hdr.seq_num = cpu_to_le32(atomic_inc_return(&mei->sap_seq_no));
+@@ -1019,6 +1020,8 @@ static void iwl_mei_handle_sap_data(struct mei_cl_device *cldev,
+
+ /* We need enough room for the WiFi header + SNAP + IV */
+ skb = netdev_alloc_skb(netdev, len + QOS_HDR_IV_SNAP_LEN);
++ if (!skb)
++ continue;
+
+ skb_reserve(skb, QOS_HDR_IV_SNAP_LEN);
+ ethhdr = skb_push(skb, sizeof(*ethhdr));
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/power.c b/drivers/net/wireless/intel/iwlwifi/mvm/power.c
+index b2ea2fca5376f..b9bd81242b216 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/power.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/power.c
+@@ -563,6 +563,9 @@ static void iwl_mvm_power_get_vifs_iterator(void *_data, u8 *mac,
+ struct iwl_power_vifs *power_iterator = _data;
+ bool active = mvmvif->phy_ctxt && mvmvif->phy_ctxt->id < NUM_PHY_CTX;
+
++ if (!mvmvif->uploaded)
++ return;
++
+ switch (ieee80211_vif_type_p2p(vif)) {
+ case NL80211_IFTYPE_P2P_DEVICE:
+ break;
+diff --git a/drivers/net/wireless/marvell/mwifiex/11h.c b/drivers/net/wireless/marvell/mwifiex/11h.c
+index d2ee6469e67bb..3fa25cd64cda0 100644
+--- a/drivers/net/wireless/marvell/mwifiex/11h.c
++++ b/drivers/net/wireless/marvell/mwifiex/11h.c
+@@ -303,5 +303,7 @@ void mwifiex_dfs_chan_sw_work_queue(struct work_struct *work)
+
+ mwifiex_dbg(priv->adapter, MSG,
+ "indicating channel switch completion to kernel\n");
++ mutex_lock(&priv->wdev.mtx);
+ cfg80211_ch_switch_notify(priv->netdev, &priv->dfs_chandef);
++ mutex_unlock(&priv->wdev.mtx);
+ }
+diff --git a/drivers/net/wireless/mediatek/mt76/agg-rx.c b/drivers/net/wireless/mediatek/mt76/agg-rx.c
+index 72622220051bb..6c8b441945791 100644
+--- a/drivers/net/wireless/mediatek/mt76/agg-rx.c
++++ b/drivers/net/wireless/mediatek/mt76/agg-rx.c
+@@ -162,8 +162,9 @@ void mt76_rx_aggr_reorder(struct sk_buff *skb, struct sk_buff_head *frames)
+ if (!sta)
+ return;
+
+- if (!status->aggr && !(status->flag & RX_FLAG_8023)) {
+- mt76_rx_aggr_check_ctl(skb, frames);
++ if (!status->aggr) {
++ if (!(status->flag & RX_FLAG_8023))
++ mt76_rx_aggr_check_ctl(skb, frames);
+ return;
+ }
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
+index 5b53d008eb664..8a2fedbb1451c 100644
+--- a/drivers/net/wireless/mediatek/mt76/mac80211.c
++++ b/drivers/net/wireless/mediatek/mt76/mac80211.c
+@@ -248,6 +248,8 @@ static void mt76_init_stream_cap(struct mt76_phy *phy,
+ vht_cap->cap |= IEEE80211_VHT_CAP_TXSTBC;
+ else
+ vht_cap->cap &= ~IEEE80211_VHT_CAP_TXSTBC;
++ vht_cap->cap |= IEEE80211_VHT_CAP_TX_ANTENNA_PATTERN |
++ IEEE80211_VHT_CAP_RX_ANTENNA_PATTERN;
+
+ for (i = 0; i < 8; i++) {
+ if (i < nstream)
+@@ -323,8 +325,6 @@ mt76_init_sband(struct mt76_phy *phy, struct mt76_sband *msband,
+ vht_cap->cap |= IEEE80211_VHT_CAP_RXLDPC |
+ IEEE80211_VHT_CAP_RXSTBC_1 |
+ IEEE80211_VHT_CAP_SHORT_GI_80 |
+- IEEE80211_VHT_CAP_RX_ANTENNA_PATTERN |
+- IEEE80211_VHT_CAP_TX_ANTENNA_PATTERN |
+ (3 << IEEE80211_VHT_CAP_MAX_A_MPDU_LENGTH_EXPONENT_SHIFT);
+
+ return 0;
+@@ -1303,7 +1303,7 @@ mt76_sta_add(struct mt76_dev *dev, struct ieee80211_vif *vif,
+ continue;
+
+ mtxq = (struct mt76_txq *)sta->txq[i]->drv_priv;
+- mtxq->wcid = wcid;
++ mtxq->wcid = wcid->idx;
+ }
+
+ ewma_signal_init(&wcid->rssi);
+@@ -1381,7 +1381,9 @@ void mt76_sta_pre_rcu_remove(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
+ struct mt76_wcid *wcid = (struct mt76_wcid *)sta->drv_priv;
+
+ mutex_lock(&dev->mutex);
++ spin_lock_bh(&dev->status_lock);
+ rcu_assign_pointer(dev->wcid[wcid->idx], NULL);
++ spin_unlock_bh(&dev->status_lock);
+ mutex_unlock(&dev->mutex);
+ }
+ EXPORT_SYMBOL_GPL(mt76_sta_pre_rcu_remove);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h
+index 882fb5d2517fa..522c523d5c416 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt76.h
++++ b/drivers/net/wireless/mediatek/mt76/mt76.h
+@@ -275,7 +275,7 @@ struct mt76_wcid {
+ };
+
+ struct mt76_txq {
+- struct mt76_wcid *wcid;
++ u16 wcid;
+
+ u16 agg_ssn;
+ bool send_bar;
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7603/main.c b/drivers/net/wireless/mediatek/mt76/mt7603/main.c
+index 83c5eec5b1633..1d098e9799ddc 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7603/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7603/main.c
+@@ -75,7 +75,7 @@ mt7603_add_interface(struct ieee80211_hw *hw, struct ieee80211_vif *vif)
+ mt7603_wtbl_init(dev, idx, mvif->idx, bc_addr);
+
+ mtxq = (struct mt76_txq *)vif->txq->drv_priv;
+- mtxq->wcid = &mvif->sta.wcid;
++ mtxq->wcid = idx;
+ rcu_assign_pointer(dev->mt76.wcid[idx], &mvif->sta.wcid);
+
+ out:
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/main.c b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
+index d79cbdbd5a051..6b8e3e7ae4a26 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7615/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
+@@ -234,7 +234,7 @@ static int mt7615_add_interface(struct ieee80211_hw *hw,
+ rcu_assign_pointer(dev->mt76.wcid[idx], &mvif->sta.wcid);
+ if (vif->txq) {
+ mtxq = (struct mt76_txq *)vif->txq->drv_priv;
+- mtxq->wcid = &mvif->sta.wcid;
++ mtxq->wcid = idx;
+ }
+
+ ret = mt7615_mcu_add_dev_info(phy, vif, true);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_util.c b/drivers/net/wireless/mediatek/mt76/mt76x02_util.c
+index dd30f537676da..be1d27de993ae 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt76x02_util.c
++++ b/drivers/net/wireless/mediatek/mt76/mt76x02_util.c
+@@ -292,7 +292,8 @@ mt76x02_vif_init(struct mt76x02_dev *dev, struct ieee80211_vif *vif,
+ mt76_packet_id_init(&mvif->group_wcid);
+
+ mtxq = (struct mt76_txq *)vif->txq->drv_priv;
+- mtxq->wcid = &mvif->group_wcid;
++ rcu_assign_pointer(dev->mt76.wcid[MT_VIF_WCID(idx)], &mvif->group_wcid);
++ mtxq->wcid = MT_VIF_WCID(idx);
+ }
+
+ int
+@@ -345,6 +346,7 @@ void mt76x02_remove_interface(struct ieee80211_hw *hw,
+ struct mt76x02_vif *mvif = (struct mt76x02_vif *)vif->drv_priv;
+
+ dev->mt76.vif_mask &= ~BIT(mvif->idx);
++ rcu_assign_pointer(dev->mt76.wcid[mvif->group_wcid.idx], NULL);
+ mt76_packet_id_flush(&dev->mt76, &mvif->group_wcid);
+ }
+ EXPORT_SYMBOL_GPL(mt76x02_remove_interface);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c b/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c
+index 4e1ecaec8f4fb..dece0a6e00b33 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c
+@@ -95,7 +95,7 @@ mt7915_muru_debug_set(void *data, u64 val)
+ struct mt7915_dev *dev = data;
+
+ dev->muru_debug = val;
+- mt7915_mcu_muru_debug_set(dev, data);
++ mt7915_mcu_muru_debug_set(dev, dev->muru_debug);
+
+ return 0;
+ }
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.c b/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.c
+index 5b133bcdab17d..4b1a9811646fd 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/eeprom.c
+@@ -152,6 +152,8 @@ static void mt7915_eeprom_parse_band_config(struct mt7915_phy *phy)
+ phy->mt76->cap.has_2ghz = true;
+ return;
+ }
++ } else if (val == MT_EE_BAND_SEL_DEFAULT && dev->dbdc_support) {
++ val = phy->band_idx ? MT_EE_BAND_SEL_5GHZ : MT_EE_BAND_SEL_2GHZ;
+ }
+
+ switch (val) {
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mac.c b/drivers/net/wireless/mediatek/mt76/mt7915/mac.c
+index e9e7efbf350d5..45169a027fdac 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/mac.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/mac.c
+@@ -309,7 +309,7 @@ mt7915_mac_decode_he_mu_radiotap(struct sk_buff *skb, __le32 *rxv)
+ }
+
+ static void
+-mt7915_mac_decode_he_radiotap(struct sk_buff *skb, __le32 *rxv, u32 mode)
++mt7915_mac_decode_he_radiotap(struct sk_buff *skb, __le32 *rxv, u8 mode)
+ {
+ struct mt76_rx_status *status = (struct mt76_rx_status *)skb->cb;
+ static const struct ieee80211_radiotap_he known = {
+@@ -474,10 +474,10 @@ static int
+ mt7915_mac_fill_rx_rate(struct mt7915_dev *dev,
+ struct mt76_rx_status *status,
+ struct ieee80211_supported_band *sband,
+- __le32 *rxv)
++ __le32 *rxv, u8 *mode)
+ {
+ u32 v0, v2;
+- u8 stbc, gi, bw, dcm, mode, nss;
++ u8 stbc, gi, bw, dcm, nss;
+ int i, idx;
+ bool cck = false;
+
+@@ -490,18 +490,18 @@ mt7915_mac_fill_rx_rate(struct mt7915_dev *dev,
+ if (!is_mt7915(&dev->mt76)) {
+ stbc = FIELD_GET(MT_PRXV_HT_STBC, v0);
+ gi = FIELD_GET(MT_PRXV_HT_SHORT_GI, v0);
+- mode = FIELD_GET(MT_PRXV_TX_MODE, v0);
++ *mode = FIELD_GET(MT_PRXV_TX_MODE, v0);
+ dcm = FIELD_GET(MT_PRXV_DCM, v0);
+ bw = FIELD_GET(MT_PRXV_FRAME_MODE, v0);
+ } else {
+ stbc = FIELD_GET(MT_CRXV_HT_STBC, v2);
+ gi = FIELD_GET(MT_CRXV_HT_SHORT_GI, v2);
+- mode = FIELD_GET(MT_CRXV_TX_MODE, v2);
++ *mode = FIELD_GET(MT_CRXV_TX_MODE, v2);
+ dcm = !!(idx & GENMASK(3, 0) & MT_PRXV_TX_DCM);
+ bw = FIELD_GET(MT_CRXV_FRAME_MODE, v2);
+ }
+
+- switch (mode) {
++ switch (*mode) {
+ case MT_PHY_TYPE_CCK:
+ cck = true;
+ fallthrough;
+@@ -521,7 +521,7 @@ mt7915_mac_fill_rx_rate(struct mt7915_dev *dev,
+ status->encoding = RX_ENC_VHT;
+ if (gi)
+ status->enc_flags |= RX_ENC_FLAG_SHORT_GI;
+- if (i > 9)
++ if (i > 11)
+ return -EINVAL;
+ break;
+ case MT_PHY_TYPE_HE_MU:
+@@ -546,7 +546,7 @@ mt7915_mac_fill_rx_rate(struct mt7915_dev *dev,
+ case IEEE80211_STA_RX_BW_20:
+ break;
+ case IEEE80211_STA_RX_BW_40:
+- if (mode & MT_PHY_TYPE_HE_EXT_SU &&
++ if (*mode & MT_PHY_TYPE_HE_EXT_SU &&
+ (idx & MT_PRXV_TX_ER_SU_106T)) {
+ status->bw = RATE_INFO_BW_HE_RU;
+ status->he_ru =
+@@ -566,7 +566,7 @@ mt7915_mac_fill_rx_rate(struct mt7915_dev *dev,
+ }
+
+ status->enc_flags |= RX_ENC_FLAG_STBC_MASK * stbc;
+- if (mode < MT_PHY_TYPE_HE_SU && gi)
++ if (*mode < MT_PHY_TYPE_HE_SU && gi)
+ status->enc_flags |= RX_ENC_FLAG_SHORT_GI;
+
+ return 0;
+@@ -581,7 +581,6 @@ mt7915_mac_fill_rx(struct mt7915_dev *dev, struct sk_buff *skb)
+ struct ieee80211_supported_band *sband;
+ __le32 *rxd = (__le32 *)skb->data;
+ __le32 *rxv = NULL;
+- u32 mode = 0;
+ u32 rxd0 = le32_to_cpu(rxd[0]);
+ u32 rxd1 = le32_to_cpu(rxd[1]);
+ u32 rxd2 = le32_to_cpu(rxd[2]);
+@@ -590,10 +589,10 @@ mt7915_mac_fill_rx(struct mt7915_dev *dev, struct sk_buff *skb)
+ u32 csum_mask = MT_RXD0_NORMAL_IP_SUM | MT_RXD0_NORMAL_UDP_TCP_SUM;
+ bool unicast, insert_ccmp_hdr = false;
+ u8 remove_pad, amsdu_info;
++ u8 mode = 0, qos_ctl = 0;
+ bool hdr_trans;
+ u16 hdr_gap;
+ u16 seq_ctrl = 0;
+- u8 qos_ctl = 0;
+ __le16 fc = 0;
+ int idx;
+
+@@ -766,7 +765,8 @@ mt7915_mac_fill_rx(struct mt7915_dev *dev, struct sk_buff *skb)
+ }
+
+ if (!is_mt7915(&dev->mt76) || (rxd1 & MT_RXD1_NORMAL_GROUP_5)) {
+- ret = mt7915_mac_fill_rx_rate(dev, status, sband, rxv);
++ ret = mt7915_mac_fill_rx_rate(dev, status, sband, rxv,
++ &mode);
+ if (ret < 0)
+ return ret;
+ }
+@@ -864,8 +864,11 @@ mt7915_mac_fill_rx_vector(struct mt7915_dev *dev, struct sk_buff *skb)
+ int i;
+
+ band_idx = le32_get_bits(rxv_hdr[1], MT_RXV_HDR_BAND_IDX);
+- if (band_idx && !phy->band_idx)
++ if (band_idx && !phy->band_idx) {
+ phy = mt7915_ext_phy(dev);
++ if (!phy)
++ goto out;
++ }
+
+ rcpi = le32_to_cpu(rxv[6]);
+ ib_rssi = le32_to_cpu(rxv[7]);
+@@ -890,8 +893,8 @@ mt7915_mac_fill_rx_vector(struct mt7915_dev *dev, struct sk_buff *skb)
+
+ phy->test.last_freq_offset = foe;
+ phy->test.last_snr = snr;
++out:
+ #endif
+-
+ dev_kfree_skb(skb);
+ }
+
+@@ -1017,6 +1020,7 @@ mt7915_mac_write_txwi_8023(struct mt7915_dev *dev, __le32 *txwi,
+
+ u8 tid = skb->priority & IEEE80211_QOS_CTL_TID_MASK;
+ u8 fc_type, fc_stype;
++ u16 ethertype;
+ bool wmm = false;
+ u32 val;
+
+@@ -1030,7 +1034,8 @@ mt7915_mac_write_txwi_8023(struct mt7915_dev *dev, __le32 *txwi,
+ val = FIELD_PREP(MT_TXD1_HDR_FORMAT, MT_HDR_FORMAT_802_3) |
+ FIELD_PREP(MT_TXD1_TID, tid);
+
+- if (be16_to_cpu(skb->protocol) >= ETH_P_802_3_MIN)
++ ethertype = get_unaligned_be16(&skb->data[12]);
++ if (ethertype >= ETH_P_802_3_MIN)
+ val |= MT_TXD1_ETH_802_3;
+
+ txwi[1] |= cpu_to_le32(val);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/main.c b/drivers/net/wireless/mediatek/mt76/mt7915/main.c
+index c3f44d801e7fe..187cf4ccd36e1 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/main.c
+@@ -246,7 +246,7 @@ static int mt7915_add_interface(struct ieee80211_hw *hw,
+ rcu_assign_pointer(dev->mt76.wcid[idx], &mvif->sta.wcid);
+ if (vif->txq) {
+ mtxq = (struct mt76_txq *)vif->txq->drv_priv;
+- mtxq->wcid = &mvif->sta.wcid;
++ mtxq->wcid = idx;
+ }
+
+ if (vif->type != NL80211_IFTYPE_AP &&
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+index e7a6f80e77551..736c9c342baaa 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+@@ -1854,7 +1854,8 @@ mt7915_mcu_beacon_mbss(struct sk_buff *rskb, struct sk_buff *skb,
+ continue;
+
+ for_each_element(sub_elem, elem->data + 1, elem->datalen - 1) {
+- const u8 *data;
++ const struct ieee80211_bssid_index *idx;
++ const u8 *idx_ie;
+
+ if (sub_elem->id || sub_elem->datalen < 4)
+ continue; /* not a valid BSS profile */
+@@ -1862,14 +1863,19 @@ mt7915_mcu_beacon_mbss(struct sk_buff *rskb, struct sk_buff *skb,
+ /* Find WLAN_EID_MULTI_BSSID_IDX
+ * in the merged nontransmitted profile
+ */
+- data = cfg80211_find_ie(WLAN_EID_MULTI_BSSID_IDX,
+- sub_elem->data,
+- sub_elem->datalen);
+- if (!data || data[1] < 1 || !data[2])
++ idx_ie = cfg80211_find_ie(WLAN_EID_MULTI_BSSID_IDX,
++ sub_elem->data,
++ sub_elem->datalen);
++ if (!idx_ie || idx_ie[1] < sizeof(*idx))
+ continue;
+
+- mbss->offset[data[2]] = cpu_to_le16(data - skb->data);
+- mbss->bitmap |= cpu_to_le32(BIT(data[2]));
++ idx = (void *)(idx_ie + 2);
++ if (!idx->bssid_index || idx->bssid_index > 31)
++ continue;
++
++ mbss->offset[idx->bssid_index] =
++ cpu_to_le16(idx_ie - skb->data);
++ mbss->bitmap |= cpu_to_le32(BIT(idx->bssid_index));
+ }
+ }
+ }
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h b/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
+index 6efa0a2e23458..4b6eda958ef36 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
+@@ -319,7 +319,7 @@ struct mt7915_dev {
+ void *cal;
+
+ struct {
+- u8 table_mask;
++ u16 table_mask;
+ u8 n_agrt;
+ } twt;
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/soc.c b/drivers/net/wireless/mediatek/mt76/mt7915/soc.c
+index 3028c02cb840e..be448d471b03b 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/soc.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/soc.c
+@@ -210,6 +210,8 @@ static int mt7986_wmac_gpio_setup(struct mt7915_dev *dev)
+ if (IS_ERR_OR_NULL(state))
+ return -EINVAL;
+ break;
++ default:
++ return -EINVAL;
+ }
+
+ ret = pinctrl_select_state(pinctrl, state);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c
+index 233998ca48573..c5350e7a11e62 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c
+@@ -696,7 +696,7 @@ mt7921_mac_fill_rx(struct mt7921_dev *dev, struct sk_buff *skb)
+ status->nss =
+ FIELD_GET(MT_PRXV_NSTS, v0) + 1;
+ status->encoding = RX_ENC_VHT;
+- if (i > 9)
++ if (i > 11)
+ return -EINVAL;
+ break;
+ case MT_PHY_TYPE_HE_MU:
+@@ -814,6 +814,7 @@ mt7921_mac_write_txwi_8023(struct mt7921_dev *dev, __le32 *txwi,
+ {
+ u8 tid = skb->priority & IEEE80211_QOS_CTL_TID_MASK;
+ u8 fc_type, fc_stype;
++ u16 ethertype;
+ bool wmm = false;
+ u32 val;
+
+@@ -827,7 +828,8 @@ mt7921_mac_write_txwi_8023(struct mt7921_dev *dev, __le32 *txwi,
+ val = FIELD_PREP(MT_TXD1_HDR_FORMAT, MT_HDR_FORMAT_802_3) |
+ FIELD_PREP(MT_TXD1_TID, tid);
+
+- if (be16_to_cpu(skb->protocol) >= ETH_P_802_3_MIN)
++ ethertype = get_unaligned_be16(&skb->data[12]);
++ if (ethertype >= ETH_P_802_3_MIN)
+ val |= MT_TXD1_ETH_802_3;
+
+ txwi[1] |= cpu_to_le32(val);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c
+index fdaf2451bc1de..9b9e80f56eda7 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c
+@@ -330,7 +330,7 @@ static int mt7921_add_interface(struct ieee80211_hw *hw,
+ rcu_assign_pointer(dev->mt76.wcid[idx], &mvif->sta.wcid);
+ if (vif->txq) {
+ mtxq = (struct mt76_txq *)vif->txq->drv_priv;
+- mtxq->wcid = &mvif->sta.wcid;
++ mtxq->wcid = idx;
+ }
+
+ out:
+@@ -489,8 +489,8 @@ mt7921_sniffer_interface_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
+ bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR);
+
+ mt7921_mcu_set_sniffer(dev, vif, monitor);
+- pm->enable = !monitor;
+- pm->ds_enable = !monitor;
++ pm->enable = pm->enable_user && !monitor;
++ pm->ds_enable = pm->ds_enable_user && !monitor;
+
+ mt76_connac_mcu_set_deep_sleep(&dev->mt76, pm->ds_enable);
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+index 1a01d025bbe59..b5fb22b8e0869 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+@@ -119,7 +119,6 @@ static void mt7921e_unregister_device(struct mt7921_dev *dev)
+ mt7921_mcu_exit(dev);
+
+ tasklet_disable(&dev->irq_tasklet);
+- mt76_free_device(&dev->mt76);
+ }
+
+ static u32 __mt7921_reg_addr(struct mt7921_dev *dev, u32 addr)
+@@ -302,8 +301,10 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
+ dev->bus_ops = dev->mt76.bus;
+ bus_ops = devm_kmemdup(dev->mt76.dev, dev->bus_ops, sizeof(*bus_ops),
+ GFP_KERNEL);
+- if (!bus_ops)
+- return -ENOMEM;
++ if (!bus_ops) {
++ ret = -ENOMEM;
++ goto err_free_dev;
++ }
+
+ bus_ops->rr = mt7921_rr;
+ bus_ops->wr = mt7921_wr;
+@@ -312,7 +313,7 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
+
+ ret = __mt7921e_mcu_drv_pmctrl(dev);
+ if (ret)
+- return ret;
++ goto err_free_dev;
+
+ mdev->rev = (mt7921_l1_rr(dev, MT_HW_CHIPID) << 16) |
+ (mt7921_l1_rr(dev, MT_HW_REV) & 0xff);
+@@ -354,6 +355,7 @@ static void mt7921_pci_remove(struct pci_dev *pdev)
+
+ mt7921e_unregister_device(dev);
+ devm_free_irq(&pdev->dev, pdev->irq, dev);
++ mt76_free_device(&dev->mt76);
+ pci_free_irq_vectors(pdev);
+ }
+
+diff --git a/drivers/net/wireless/mediatek/mt76/tx.c b/drivers/net/wireless/mediatek/mt76/tx.c
+index 6b8c9dc805425..5ed2d60debfb3 100644
+--- a/drivers/net/wireless/mediatek/mt76/tx.c
++++ b/drivers/net/wireless/mediatek/mt76/tx.c
+@@ -120,7 +120,7 @@ mt76_tx_status_skb_add(struct mt76_dev *dev, struct mt76_wcid *wcid,
+
+ memset(cb, 0, sizeof(*cb));
+
+- if (!wcid)
++ if (!wcid || !rcu_access_pointer(dev->wcid[wcid->idx]))
+ return MT_PACKET_ID_NO_ACK;
+
+ if (info->flags & IEEE80211_TX_CTL_NO_ACK)
+@@ -436,12 +436,11 @@ mt76_txq_stopped(struct mt76_queue *q)
+
+ static int
+ mt76_txq_send_burst(struct mt76_phy *phy, struct mt76_queue *q,
+- struct mt76_txq *mtxq)
++ struct mt76_txq *mtxq, struct mt76_wcid *wcid)
+ {
+ struct mt76_dev *dev = phy->dev;
+ struct ieee80211_txq *txq = mtxq_to_txq(mtxq);
+ enum mt76_txq_id qid = mt76_txq_get_qid(txq);
+- struct mt76_wcid *wcid = mtxq->wcid;
+ struct ieee80211_tx_info *info;
+ struct sk_buff *skb;
+ int n_frames = 1;
+@@ -521,8 +520,8 @@ mt76_txq_schedule_list(struct mt76_phy *phy, enum mt76_txq_id qid)
+ break;
+
+ mtxq = (struct mt76_txq *)txq->drv_priv;
+- wcid = mtxq->wcid;
+- if (wcid && test_bit(MT_WCID_FLAG_PS, &wcid->flags))
++ wcid = rcu_dereference(dev->wcid[mtxq->wcid]);
++ if (!wcid || test_bit(MT_WCID_FLAG_PS, &wcid->flags))
+ continue;
+
+ spin_lock_bh(&q->lock);
+@@ -541,7 +540,7 @@ mt76_txq_schedule_list(struct mt76_phy *phy, enum mt76_txq_id qid)
+ }
+
+ if (!mt76_txq_stopped(q))
+- n_frames = mt76_txq_send_burst(phy, q, mtxq);
++ n_frames = mt76_txq_send_burst(phy, q, mtxq, wcid);
+
+ spin_unlock_bh(&q->lock);
+
+diff --git a/drivers/net/wireless/microchip/wilc1000/mon.c b/drivers/net/wireless/microchip/wilc1000/mon.c
+index 6bd63934c2d84..b5a1b65c087ca 100644
+--- a/drivers/net/wireless/microchip/wilc1000/mon.c
++++ b/drivers/net/wireless/microchip/wilc1000/mon.c
+@@ -233,7 +233,7 @@ struct net_device *wilc_wfi_init_mon_interface(struct wilc *wl,
+ wl->monitor_dev->netdev_ops = &wilc_wfi_netdev_ops;
+ wl->monitor_dev->needs_free_netdev = true;
+
+- if (cfg80211_register_netdevice(wl->monitor_dev)) {
++ if (register_netdevice(wl->monitor_dev)) {
+ netdev_err(real_dev, "register_netdevice failed\n");
+ free_netdev(wl->monitor_dev);
+ return NULL;
+@@ -251,7 +251,7 @@ void wilc_wfi_deinit_mon_interface(struct wilc *wl, bool rtnl_locked)
+ return;
+
+ if (rtnl_locked)
+- cfg80211_unregister_netdevice(wl->monitor_dev);
++ unregister_netdevice(wl->monitor_dev);
+ else
+ unregister_netdev(wl->monitor_dev);
+ wl->monitor_dev = NULL;
+diff --git a/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c b/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c
+index 2477e18c7caec..025619cd14e82 100644
+--- a/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c
++++ b/drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c
+@@ -460,8 +460,10 @@ static void rtl8180_tx(struct ieee80211_hw *dev,
+ struct rtl8180_priv *priv = dev->priv;
+ struct rtl8180_tx_ring *ring;
+ struct rtl8180_tx_desc *entry;
++ unsigned int prio = 0;
+ unsigned long flags;
+- unsigned int idx, prio, hw_prio;
++ unsigned int idx, hw_prio;
++
+ dma_addr_t mapping;
+ u32 tx_flags;
+ u8 rc_flags;
+@@ -470,7 +472,9 @@ static void rtl8180_tx(struct ieee80211_hw *dev,
+ /* do arithmetic and then convert to le16 */
+ u16 frame_duration = 0;
+
+- prio = skb_get_queue_mapping(skb);
++ /* rtl8180/rtl8185 only has one useable tx queue */
++ if (dev->queues > IEEE80211_AC_BK)
++ prio = skb_get_queue_mapping(skb);
+ ring = &priv->tx_ring[prio];
+
+ mapping = dma_map_single(&priv->pdev->dev, skb->data, skb->len,
+diff --git a/drivers/net/wireless/realtek/rtlwifi/usb.c b/drivers/net/wireless/realtek/rtlwifi/usb.c
+index 86a2368732547..a8eebafb9a7ee 100644
+--- a/drivers/net/wireless/realtek/rtlwifi/usb.c
++++ b/drivers/net/wireless/realtek/rtlwifi/usb.c
+@@ -1014,7 +1014,7 @@ int rtl_usb_probe(struct usb_interface *intf,
+ hw = ieee80211_alloc_hw(sizeof(struct rtl_priv) +
+ sizeof(struct rtl_usb_priv), &rtl_ops);
+ if (!hw) {
+- WARN_ONCE(true, "rtl_usb: ieee80211 alloc failed\n");
++ pr_warn("rtl_usb: ieee80211 alloc failed\n");
+ return -ENOMEM;
+ }
+ rtlpriv = hw->priv;
+diff --git a/drivers/net/wireless/realtek/rtw88/rtw8821c.c b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
+index 99eee128ae945..ec38a7c849517 100644
+--- a/drivers/net/wireless/realtek/rtw88/rtw8821c.c
++++ b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
+@@ -512,6 +512,7 @@ static s8 get_cck_rx_pwr(struct rtw_dev *rtwdev, u8 lna_idx, u8 vga_idx)
+ static void query_phy_status_page0(struct rtw_dev *rtwdev, u8 *phy_status,
+ struct rtw_rx_pkt_stat *pkt_stat)
+ {
++ struct rtw_dm_info *dm_info = &rtwdev->dm_info;
+ s8 rx_power;
+ u8 lna_idx = 0;
+ u8 vga_idx = 0;
+@@ -523,6 +524,7 @@ static void query_phy_status_page0(struct rtw_dev *rtwdev, u8 *phy_status,
+
+ pkt_stat->rx_power[RF_PATH_A] = rx_power;
+ pkt_stat->rssi = rtw_phy_rf_power_2_rssi(pkt_stat->rx_power, 1);
++ dm_info->rssi[RF_PATH_A] = pkt_stat->rssi;
+ pkt_stat->bw = RTW_CHANNEL_WIDTH_20;
+ pkt_stat->signal_power = rx_power;
+ }
+@@ -530,6 +532,7 @@ static void query_phy_status_page0(struct rtw_dev *rtwdev, u8 *phy_status,
+ static void query_phy_status_page1(struct rtw_dev *rtwdev, u8 *phy_status,
+ struct rtw_rx_pkt_stat *pkt_stat)
+ {
++ struct rtw_dm_info *dm_info = &rtwdev->dm_info;
+ u8 rxsc, bw;
+ s8 min_rx_power = -120;
+
+@@ -549,6 +552,7 @@ static void query_phy_status_page1(struct rtw_dev *rtwdev, u8 *phy_status,
+
+ pkt_stat->rx_power[RF_PATH_A] = GET_PHY_STAT_P1_PWDB_A(phy_status) - 110;
+ pkt_stat->rssi = rtw_phy_rf_power_2_rssi(pkt_stat->rx_power, 1);
++ dm_info->rssi[RF_PATH_A] = pkt_stat->rssi;
+ pkt_stat->bw = bw;
+ pkt_stat->signal_power = max(pkt_stat->rx_power[RF_PATH_A],
+ min_rx_power);
+diff --git a/drivers/net/wireless/realtek/rtw88/rx.c b/drivers/net/wireless/realtek/rtw88/rx.c
+index d2d607e22198d..84aedabdf2853 100644
+--- a/drivers/net/wireless/realtek/rtw88/rx.c
++++ b/drivers/net/wireless/realtek/rtw88/rx.c
+@@ -158,7 +158,8 @@ void rtw_rx_fill_rx_status(struct rtw_dev *rtwdev,
+ memset(rx_status, 0, sizeof(*rx_status));
+ rx_status->freq = hw->conf.chandef.chan->center_freq;
+ rx_status->band = hw->conf.chandef.chan->band;
+- if (rtw_fw_feature_check(&rtwdev->fw, FW_FEATURE_SCAN_OFFLOAD))
++ if (rtw_fw_feature_check(&rtwdev->fw, FW_FEATURE_SCAN_OFFLOAD) &&
++ test_bit(RTW_FLAG_SCANNING, rtwdev->flags))
+ rtw_set_rx_freq_by_pktstat(pkt_stat, rx_status);
+ if (pkt_stat->crc_err)
+ rx_status->flag |= RX_FLAG_FAILED_FCS_CRC;
+diff --git a/drivers/net/wireless/realtek/rtw89/cam.c b/drivers/net/wireless/realtek/rtw89/cam.c
+index 305dbbebff6bb..26bef9fdd2053 100644
+--- a/drivers/net/wireless/realtek/rtw89/cam.c
++++ b/drivers/net/wireless/realtek/rtw89/cam.c
+@@ -421,10 +421,8 @@ static void rtw89_cam_reset_key_iter(struct ieee80211_hw *hw,
+ void *data)
+ {
+ struct rtw89_dev *rtwdev = (struct rtw89_dev *)data;
+- struct rtw89_vif *rtwvif = (struct rtw89_vif *)vif->drv_priv;
+
+ rtw89_cam_sec_key_del(rtwdev, vif, sta, key, false);
+- rtw89_cam_deinit(rtwdev, rtwvif);
+ }
+
+ void rtw89_cam_deinit_addr_cam(struct rtw89_dev *rtwdev,
+@@ -480,6 +478,12 @@ int rtw89_cam_init_addr_cam(struct rtw89_dev *rtwdev,
+ int i;
+ int ret;
+
++ if (unlikely(addr_cam->valid)) {
++ rtw89_debug(rtwdev, RTW89_DBG_FW,
++ "addr cam is already valid; skip init\n");
++ return 0;
++ }
++
+ ret = rtw89_cam_get_avail_addr_cam(rtwdev, &addr_cam_idx);
+ if (ret) {
+ rtw89_err(rtwdev, "failed to get available addr cam\n");
+@@ -531,6 +535,12 @@ static int rtw89_cam_init_bssid_cam(struct rtw89_dev *rtwdev,
+ u8 bssid_cam_idx;
+ int ret;
+
++ if (unlikely(bssid_cam->valid)) {
++ rtw89_debug(rtwdev, RTW89_DBG_FW,
++ "bssid cam is already valid; skip init\n");
++ return 0;
++ }
++
+ ret = rtw89_cam_get_avail_bssid_cam(rtwdev, &bssid_cam_idx);
+ if (ret) {
+ rtw89_err(rtwdev, "failed to get available bssid cam\n");
+diff --git a/drivers/net/wireless/realtek/rtw89/fw.c b/drivers/net/wireless/realtek/rtw89/fw.c
+index 6deaf8eec6b47..a9b5315a517e8 100644
+--- a/drivers/net/wireless/realtek/rtw89/fw.c
++++ b/drivers/net/wireless/realtek/rtw89/fw.c
+@@ -2065,7 +2065,7 @@ static void rtw89_hw_scan_add_chan(struct rtw89_dev *rtwdev, int chan_type,
+ ch_info->num_pkt = 0;
+ break;
+ case RTW89_CHAN_DFS:
+- ch_info->period = min_t(u8, ch_info->period,
++ ch_info->period = max_t(u8, ch_info->period,
+ RTW89_DFS_CHAN_TIME);
+ ch_info->dwell_time = RTW89_DWELL_TIME;
+ break;
+diff --git a/drivers/net/wireless/realtek/rtw89/phy.c b/drivers/net/wireless/realtek/rtw89/phy.c
+index ac211d8973118..8414f30184b9e 100644
+--- a/drivers/net/wireless/realtek/rtw89/phy.c
++++ b/drivers/net/wireless/realtek/rtw89/phy.c
+@@ -2213,6 +2213,11 @@ void rtw89_phy_cfo_parse(struct rtw89_dev *rtwdev, s16 cfo_val,
+ struct rtw89_cfo_tracking_info *cfo = &rtwdev->cfo_tracking;
+ u8 macid = phy_ppdu->mac_id;
+
++ if (macid >= CFO_TRACK_MAX_USER) {
++ rtw89_warn(rtwdev, "mac_id %d is out of range\n", macid);
++ return;
++ }
++
+ cfo->cfo_tail[macid] += cfo_val;
+ cfo->cfo_cnt[macid]++;
+ cfo->packet_count++;
+diff --git a/drivers/net/wireless/realtek/rtw89/ser.c b/drivers/net/wireless/realtek/rtw89/ser.c
+index 837cdc366a61a..e86f3d89ef1bf 100644
+--- a/drivers/net/wireless/realtek/rtw89/ser.c
++++ b/drivers/net/wireless/realtek/rtw89/ser.c
+@@ -220,11 +220,32 @@ static void ser_reset_vif(struct rtw89_dev *rtwdev, struct rtw89_vif *rtwvif)
+ rtwvif->trigger = false;
+ }
+
++static void ser_sta_deinit_addr_cam_iter(void *data, struct ieee80211_sta *sta)
++{
++ struct rtw89_dev *rtwdev = (struct rtw89_dev *)data;
++ struct rtw89_sta *rtwsta = (struct rtw89_sta *)sta->drv_priv;
++
++ rtw89_cam_deinit_addr_cam(rtwdev, &rtwsta->addr_cam);
++}
++
++static void ser_deinit_cam(struct rtw89_dev *rtwdev, struct rtw89_vif *rtwvif)
++{
++ if (rtwvif->net_type == RTW89_NET_TYPE_AP_MODE)
++ ieee80211_iterate_stations_atomic(rtwdev->hw,
++ ser_sta_deinit_addr_cam_iter,
++ rtwdev);
++
++ rtw89_cam_deinit(rtwdev, rtwvif);
++}
++
+ static void ser_reset_mac_binding(struct rtw89_dev *rtwdev)
+ {
+ struct rtw89_vif *rtwvif;
+
+ rtw89_cam_reset_keys(rtwdev);
++ rtw89_for_each_rtwvif(rtwdev, rtwvif)
++ ser_deinit_cam(rtwdev, rtwvif);
++
+ rtw89_core_release_all_bits_map(rtwdev->mac_id_map, RTW89_MAX_MAC_ID_NUM);
+ rtw89_for_each_rtwvif(rtwdev, rtwvif)
+ ser_reset_vif(rtwdev, rtwvif);
+diff --git a/drivers/net/wireless/ti/wl1251/event.c b/drivers/net/wireless/ti/wl1251/event.c
+index e6d426edab56b..e945aafd88ee5 100644
+--- a/drivers/net/wireless/ti/wl1251/event.c
++++ b/drivers/net/wireless/ti/wl1251/event.c
+@@ -169,11 +169,9 @@ int wl1251_event_wait(struct wl1251 *wl, u32 mask, int timeout_ms)
+ msleep(1);
+
+ /* read from both event fields */
+- wl1251_mem_read(wl, wl->mbox_ptr[0], &events_vector,
+- sizeof(events_vector));
++ events_vector = wl1251_mem_read32(wl, wl->mbox_ptr[0]);
+ event = events_vector & mask;
+- wl1251_mem_read(wl, wl->mbox_ptr[1], &events_vector,
+- sizeof(events_vector));
++ events_vector = wl1251_mem_read32(wl, wl->mbox_ptr[1]);
+ event |= events_vector & mask;
+ } while (!event);
+
+@@ -202,7 +200,7 @@ void wl1251_event_mbox_config(struct wl1251 *wl)
+
+ int wl1251_event_handle(struct wl1251 *wl, u8 mbox_num)
+ {
+- struct event_mailbox mbox;
++ struct event_mailbox *mbox;
+ int ret;
+
+ wl1251_debug(DEBUG_EVENT, "EVENT on mbox %d", mbox_num);
+@@ -210,12 +208,20 @@ int wl1251_event_handle(struct wl1251 *wl, u8 mbox_num)
+ if (mbox_num > 1)
+ return -EINVAL;
+
++ mbox = kmalloc(sizeof(*mbox), GFP_KERNEL);
++ if (!mbox) {
++ wl1251_error("can not allocate mbox buffer");
++ return -ENOMEM;
++ }
++
+ /* first we read the mbox descriptor */
+- wl1251_mem_read(wl, wl->mbox_ptr[mbox_num], &mbox,
+- sizeof(struct event_mailbox));
++ wl1251_mem_read(wl, wl->mbox_ptr[mbox_num], mbox,
++ sizeof(*mbox));
+
+ /* process the descriptor */
+- ret = wl1251_event_process(wl, &mbox);
++ ret = wl1251_event_process(wl, mbox);
++ kfree(mbox);
++
+ if (ret < 0)
+ return ret;
+
+diff --git a/drivers/net/wireless/ti/wl1251/io.c b/drivers/net/wireless/ti/wl1251/io.c
+index 5ebe7958ed5c7..e8d567af74b4b 100644
+--- a/drivers/net/wireless/ti/wl1251/io.c
++++ b/drivers/net/wireless/ti/wl1251/io.c
+@@ -121,7 +121,13 @@ void wl1251_set_partition(struct wl1251 *wl,
+ u32 mem_start, u32 mem_size,
+ u32 reg_start, u32 reg_size)
+ {
+- struct wl1251_partition partition[2];
++ struct wl1251_partition_set *partition;
++
++ partition = kmalloc(sizeof(*partition), GFP_KERNEL);
++ if (!partition) {
++ wl1251_error("can not allocate partition buffer");
++ return;
++ }
+
+ wl1251_debug(DEBUG_SPI, "mem_start %08X mem_size %08X",
+ mem_start, mem_size);
+@@ -164,10 +170,10 @@ void wl1251_set_partition(struct wl1251 *wl,
+ reg_start, reg_size);
+ }
+
+- partition[0].start = mem_start;
+- partition[0].size = mem_size;
+- partition[1].start = reg_start;
+- partition[1].size = reg_size;
++ partition->mem.start = mem_start;
++ partition->mem.size = mem_size;
++ partition->reg.start = reg_start;
++ partition->reg.size = reg_size;
+
+ wl->physical_mem_addr = mem_start;
+ wl->physical_reg_addr = reg_start;
+@@ -176,5 +182,7 @@ void wl1251_set_partition(struct wl1251 *wl,
+ wl->virtual_reg_addr = mem_size;
+
+ wl->if_ops->write(wl, HW_ACCESS_PART0_SIZE_ADDR, partition,
+- sizeof(partition));
++ sizeof(*partition));
++
++ kfree(partition);
+ }
+diff --git a/drivers/net/wireless/ti/wl1251/tx.c b/drivers/net/wireless/ti/wl1251/tx.c
+index 98cd39619d579..e9dc3c72bb110 100644
+--- a/drivers/net/wireless/ti/wl1251/tx.c
++++ b/drivers/net/wireless/ti/wl1251/tx.c
+@@ -443,19 +443,25 @@ static void wl1251_tx_packet_cb(struct wl1251 *wl,
+ void wl1251_tx_complete(struct wl1251 *wl)
+ {
+ int i, result_index, num_complete = 0, queue_len;
+- struct tx_result result[FW_TX_CMPLT_BLOCK_SIZE], *result_ptr;
++ struct tx_result *result, *result_ptr;
+ unsigned long flags;
+
+ if (unlikely(wl->state != WL1251_STATE_ON))
+ return;
+
++ result = kmalloc_array(FW_TX_CMPLT_BLOCK_SIZE, sizeof(*result), GFP_KERNEL);
++ if (!result) {
++ wl1251_error("can not allocate result buffer");
++ return;
++ }
++
+ /* First we read the result */
+- wl1251_mem_read(wl, wl->data_path->tx_complete_addr,
+- result, sizeof(result));
++ wl1251_mem_read(wl, wl->data_path->tx_complete_addr, result,
++ FW_TX_CMPLT_BLOCK_SIZE * sizeof(*result));
+
+ result_index = wl->next_tx_complete;
+
+- for (i = 0; i < ARRAY_SIZE(result); i++) {
++ for (i = 0; i < FW_TX_CMPLT_BLOCK_SIZE; i++) {
+ result_ptr = &result[result_index];
+
+ if (result_ptr->done_1 == 1 &&
+@@ -538,6 +544,7 @@ void wl1251_tx_complete(struct wl1251 *wl)
+
+ }
+
++ kfree(result);
+ wl->next_tx_complete = result_index;
+ }
+
+diff --git a/drivers/nfc/st21nfca/se.c b/drivers/nfc/st21nfca/se.c
+index c922f10d0d7b9..7e213f8ddc98b 100644
+--- a/drivers/nfc/st21nfca/se.c
++++ b/drivers/nfc/st21nfca/se.c
+@@ -241,7 +241,7 @@ int st21nfca_hci_se_io(struct nfc_hci_dev *hdev, u32 se_idx,
+ }
+ EXPORT_SYMBOL(st21nfca_hci_se_io);
+
+-static void st21nfca_se_wt_timeout(struct timer_list *t)
++static void st21nfca_se_wt_work(struct work_struct *work)
+ {
+ /*
+ * No answer from the secure element
+@@ -254,8 +254,9 @@ static void st21nfca_se_wt_timeout(struct timer_list *t)
+ */
+ /* hardware reset managed through VCC_UICC_OUT power supply */
+ u8 param = 0x01;
+- struct st21nfca_hci_info *info = from_timer(info, t,
+- se_info.bwi_timer);
++ struct st21nfca_hci_info *info = container_of(work,
++ struct st21nfca_hci_info,
++ se_info.timeout_work);
+
+ info->se_info.bwi_active = false;
+
+@@ -271,6 +272,13 @@ static void st21nfca_se_wt_timeout(struct timer_list *t)
+ info->se_info.cb(info->se_info.cb_context, NULL, 0, -ETIME);
+ }
+
++static void st21nfca_se_wt_timeout(struct timer_list *t)
++{
++ struct st21nfca_hci_info *info = from_timer(info, t, se_info.bwi_timer);
++
++ schedule_work(&info->se_info.timeout_work);
++}
++
+ static void st21nfca_se_activation_timeout(struct timer_list *t)
+ {
+ struct st21nfca_hci_info *info = from_timer(info, t,
+@@ -360,6 +368,7 @@ int st21nfca_apdu_reader_event_received(struct nfc_hci_dev *hdev,
+ switch (event) {
+ case ST21NFCA_EVT_TRANSMIT_DATA:
+ del_timer_sync(&info->se_info.bwi_timer);
++ cancel_work_sync(&info->se_info.timeout_work);
+ info->se_info.bwi_active = false;
+ r = nfc_hci_send_event(hdev, ST21NFCA_DEVICE_MGNT_GATE,
+ ST21NFCA_EVT_SE_END_OF_APDU_TRANSFER, NULL, 0);
+@@ -389,6 +398,7 @@ void st21nfca_se_init(struct nfc_hci_dev *hdev)
+ struct st21nfca_hci_info *info = nfc_hci_get_clientdata(hdev);
+
+ init_completion(&info->se_info.req_completion);
++ INIT_WORK(&info->se_info.timeout_work, st21nfca_se_wt_work);
+ /* initialize timers */
+ timer_setup(&info->se_info.bwi_timer, st21nfca_se_wt_timeout, 0);
+ info->se_info.bwi_active = false;
+@@ -416,6 +426,7 @@ void st21nfca_se_deinit(struct nfc_hci_dev *hdev)
+ if (info->se_info.se_active)
+ del_timer_sync(&info->se_info.se_active_timer);
+
++ cancel_work_sync(&info->se_info.timeout_work);
+ info->se_info.bwi_active = false;
+ info->se_info.se_active = false;
+ }
+diff --git a/drivers/nfc/st21nfca/st21nfca.h b/drivers/nfc/st21nfca/st21nfca.h
+index cb6ad916be911..ae6771cc9894a 100644
+--- a/drivers/nfc/st21nfca/st21nfca.h
++++ b/drivers/nfc/st21nfca/st21nfca.h
+@@ -141,6 +141,7 @@ struct st21nfca_se_info {
+
+ se_io_cb_t cb;
+ void *cb_context;
++ struct work_struct timeout_work;
+ };
+
+ struct st21nfca_hci_info {
+diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
+index 69a03358817f1..681cc28703a3e 100644
+--- a/drivers/nvdimm/core.c
++++ b/drivers/nvdimm/core.c
+@@ -368,9 +368,7 @@ static ssize_t capability_show(struct device *dev,
+ if (!nd_desc->fw_ops)
+ return -EOPNOTSUPP;
+
+- nvdimm_bus_lock(dev);
+ cap = nd_desc->fw_ops->capability(nd_desc);
+- nvdimm_bus_unlock(dev);
+
+ switch (cap) {
+ case NVDIMM_FWA_CAP_QUIESCE:
+@@ -395,10 +393,8 @@ static ssize_t activate_show(struct device *dev,
+ if (!nd_desc->fw_ops)
+ return -EOPNOTSUPP;
+
+- nvdimm_bus_lock(dev);
+ cap = nd_desc->fw_ops->capability(nd_desc);
+ state = nd_desc->fw_ops->activate_state(nd_desc);
+- nvdimm_bus_unlock(dev);
+
+ if (cap < NVDIMM_FWA_CAP_QUIESCE)
+ return -EOPNOTSUPP;
+@@ -443,7 +439,6 @@ static ssize_t activate_store(struct device *dev,
+ else
+ return -EINVAL;
+
+- nvdimm_bus_lock(dev);
+ state = nd_desc->fw_ops->activate_state(nd_desc);
+
+ switch (state) {
+@@ -461,7 +456,6 @@ static ssize_t activate_store(struct device *dev,
+ default:
+ rc = -ENXIO;
+ }
+- nvdimm_bus_unlock(dev);
+
+ if (rc == 0)
+ rc = len;
+@@ -484,10 +478,7 @@ static umode_t nvdimm_bus_firmware_visible(struct kobject *kobj, struct attribut
+ if (!nd_desc->fw_ops)
+ return 0;
+
+- nvdimm_bus_lock(dev);
+ cap = nd_desc->fw_ops->capability(nd_desc);
+- nvdimm_bus_unlock(dev);
+-
+ if (cap < NVDIMM_FWA_CAP_QUIESCE)
+ return 0;
+
+diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
+index 58d95242a836b..4aa17132a5572 100644
+--- a/drivers/nvdimm/pmem.c
++++ b/drivers/nvdimm/pmem.c
+@@ -158,36 +158,20 @@ static blk_status_t pmem_do_write(struct pmem_device *pmem,
+ struct page *page, unsigned int page_off,
+ sector_t sector, unsigned int len)
+ {
+- blk_status_t rc = BLK_STS_OK;
+- bool bad_pmem = false;
+ phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
+ void *pmem_addr = pmem->virt_addr + pmem_off;
+
+- if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
+- bad_pmem = true;
++ if (unlikely(is_bad_pmem(&pmem->bb, sector, len))) {
++ blk_status_t rc = pmem_clear_poison(pmem, pmem_off, len);
++
++ if (rc != BLK_STS_OK)
++ return rc;
++ }
+
+- /*
+- * Note that we write the data both before and after
+- * clearing poison. The write before clear poison
+- * handles situations where the latest written data is
+- * preserved and the clear poison operation simply marks
+- * the address range as valid without changing the data.
+- * In this case application software can assume that an
+- * interrupted write will either return the new good
+- * data or an error.
+- *
+- * However, if pmem_clear_poison() leaves the data in an
+- * indeterminate state we need to perform the write
+- * after clear poison.
+- */
+ flush_dcache_page(page);
+ write_pmem(pmem_addr, page, page_off, len);
+- if (unlikely(bad_pmem)) {
+- rc = pmem_clear_poison(pmem, pmem_off, len);
+- write_pmem(pmem_addr, page, page_off, len);
+- }
+
+- return rc;
++ return BLK_STS_OK;
+ }
+
+ static void pmem_submit_bio(struct bio *bio)
+diff --git a/drivers/nvdimm/security.c b/drivers/nvdimm/security.c
+index 4b80150e4afa7..b5aa55c614616 100644
+--- a/drivers/nvdimm/security.c
++++ b/drivers/nvdimm/security.c
+@@ -379,11 +379,6 @@ static int security_overwrite(struct nvdimm *nvdimm, unsigned int keyid)
+ || !nvdimm->sec.flags)
+ return -EOPNOTSUPP;
+
+- if (dev->driver == NULL) {
+- dev_dbg(dev, "Unable to overwrite while DIMM active.\n");
+- return -EINVAL;
+- }
+-
+ rc = check_security_state(nvdimm);
+ if (rc)
+ return rc;
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index e1846d04817f3..2d6a01853109b 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -1771,7 +1771,7 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
+ blk_queue_max_segments(q, min_t(u32, max_segments, USHRT_MAX));
+ }
+ blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
+- blk_queue_dma_alignment(q, 7);
++ blk_queue_dma_alignment(q, 3);
+ blk_queue_write_cache(q, vwc, vwc);
+ }
+
+@@ -3080,10 +3080,6 @@ int nvme_init_ctrl_finish(struct nvme_ctrl *ctrl)
+ if (ret)
+ return ret;
+
+- ret = nvme_init_non_mdts_limits(ctrl);
+- if (ret < 0)
+- return ret;
+-
+ ret = nvme_configure_apst(ctrl);
+ if (ret < 0)
+ return ret;
+@@ -4237,11 +4233,26 @@ static void nvme_scan_work(struct work_struct *work)
+ {
+ struct nvme_ctrl *ctrl =
+ container_of(work, struct nvme_ctrl, scan_work);
++ int ret;
+
+ /* No tagset on a live ctrl means IO queues could not created */
+ if (ctrl->state != NVME_CTRL_LIVE || !ctrl->tagset)
+ return;
+
++ /*
++ * Identify controller limits can change at controller reset due to
++ * new firmware download, even though it is not common we cannot ignore
++ * such scenario. Controller's non-mdts limits are reported in the unit
++ * of logical blocks that is dependent on the format of attached
++ * namespace. Hence re-read the limits at the time of ns allocation.
++ */
++ ret = nvme_init_non_mdts_limits(ctrl);
++ if (ret < 0) {
++ dev_warn(ctrl->device,
++ "reading non-mdts-limits failed: %d\n", ret);
++ return;
++ }
++
+ if (test_and_clear_bit(NVME_AER_NOTICE_NS_CHANGED, &ctrl->events)) {
+ dev_info(ctrl->device, "rescanning namespaces.\n");
+ nvme_clear_changed_ns_log(ctrl);
+diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
+index 3aacf1c0d5a5f..17aeb7d5c4852 100644
+--- a/drivers/nvme/host/pci.c
++++ b/drivers/nvme/host/pci.c
+@@ -1775,6 +1775,7 @@ static int nvme_alloc_admin_tags(struct nvme_dev *dev)
+ dev->ctrl.admin_q = blk_mq_init_queue(&dev->admin_tagset);
+ if (IS_ERR(dev->ctrl.admin_q)) {
+ blk_mq_free_tag_set(&dev->admin_tagset);
++ dev->ctrl.admin_q = NULL;
+ return -ENOMEM;
+ }
+ if (!blk_get_queue(dev->ctrl.admin_q)) {
+diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
+index ec315b060cd50..0f30496ce80bf 100644
+--- a/drivers/of/fdt.c
++++ b/drivers/of/fdt.c
+@@ -1105,6 +1105,9 @@ int __init early_init_dt_scan_memory(void)
+ if (type == NULL || strcmp(type, "memory") != 0)
+ continue;
+
++ if (!of_fdt_device_is_available(fdt, node))
++ continue;
++
+ reg = of_get_flat_dt_prop(node, "linux,usable-memory", &l);
+ if (reg == NULL)
+ reg = of_get_flat_dt_prop(node, "reg", &l);
+diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
+index b9bd1cff17938..8d374cc552be5 100644
+--- a/drivers/of/kexec.c
++++ b/drivers/of/kexec.c
+@@ -386,6 +386,15 @@ void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
+ crashk_res.end - crashk_res.start + 1);
+ if (ret)
+ goto out;
++
++ if (crashk_low_res.end) {
++ ret = fdt_appendprop_addrrange(fdt, 0, chosen_node,
++ "linux,usable-memory-range",
++ crashk_low_res.start,
++ crashk_low_res.end - crashk_low_res.start + 1);
++ if (ret)
++ goto out;
++ }
+ }
+
+ /* add bootargs */
+diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
+index d80160cf34bb7..d1187123c4fc4 100644
+--- a/drivers/of/overlay.c
++++ b/drivers/of/overlay.c
+@@ -170,9 +170,7 @@ static int overlay_notify(struct overlay_changeset *ovcs,
+
+ ret = blocking_notifier_call_chain(&overlay_notify_chain,
+ action, &nd);
+- if (ret == NOTIFY_OK || ret == NOTIFY_STOP)
+- return 0;
+- if (ret) {
++ if (notifier_to_errno(ret)) {
+ ret = notifier_to_errno(ret);
+ pr_err("overlay changeset %s notifier error %d, target: %pOF\n",
+ of_overlay_action_name[action], ret, nd.target);
+diff --git a/drivers/opp/of.c b/drivers/opp/of.c
+index 440ab5a03df9f..95b184fc33727 100644
+--- a/drivers/opp/of.c
++++ b/drivers/opp/of.c
+@@ -437,11 +437,11 @@ static int _bandwidth_supported(struct device *dev, struct opp_table *opp_table)
+
+ /* Checking only first OPP is sufficient */
+ np = of_get_next_available_child(opp_np, NULL);
++ of_node_put(opp_np);
+ if (!np) {
+ dev_err(dev, "OPP table empty\n");
+ return -EINVAL;
+ }
+- of_node_put(opp_np);
+
+ prop = of_find_property(np, "opp-peak-kBps", NULL);
+ of_node_put(np);
+diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
+index 768d33f9ebc87..a82f845cc4b52 100644
+--- a/drivers/pci/controller/cadence/pci-j721e.c
++++ b/drivers/pci/controller/cadence/pci-j721e.c
+@@ -69,6 +69,7 @@ struct j721e_pcie_data {
+ enum j721e_pcie_mode mode;
+ unsigned int quirk_retrain_flag:1;
+ unsigned int quirk_detect_quiet_flag:1;
++ unsigned int quirk_disable_flr:1;
+ u32 linkdown_irq_regfield;
+ unsigned int byte_access_allowed:1;
+ };
+@@ -307,6 +308,7 @@ static const struct j721e_pcie_data j7200_pcie_rc_data = {
+ static const struct j721e_pcie_data j7200_pcie_ep_data = {
+ .mode = PCI_MODE_EP,
+ .quirk_detect_quiet_flag = true,
++ .quirk_disable_flr = true,
+ };
+
+ static const struct j721e_pcie_data am64_pcie_rc_data = {
+@@ -405,6 +407,7 @@ static int j721e_pcie_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ ep->quirk_detect_quiet_flag = data->quirk_detect_quiet_flag;
++ ep->quirk_disable_flr = data->quirk_disable_flr;
+
+ cdns_pcie = &ep->pcie;
+ cdns_pcie->dev = dev;
+diff --git a/drivers/pci/controller/cadence/pcie-cadence-ep.c b/drivers/pci/controller/cadence/pcie-cadence-ep.c
+index 88e05b9c2e5b8..b8b655d4047ec 100644
+--- a/drivers/pci/controller/cadence/pcie-cadence-ep.c
++++ b/drivers/pci/controller/cadence/pcie-cadence-ep.c
+@@ -187,8 +187,7 @@ static int cdns_pcie_ep_map_addr(struct pci_epc *epc, u8 fn, u8 vfn,
+ struct cdns_pcie *pcie = &ep->pcie;
+ u32 r;
+
+- r = find_first_zero_bit(&ep->ob_region_map,
+- sizeof(ep->ob_region_map) * BITS_PER_LONG);
++ r = find_first_zero_bit(&ep->ob_region_map, BITS_PER_LONG);
+ if (r >= ep->max_regions - 1) {
+ dev_err(&epc->dev, "no free outbound region\n");
+ return -EINVAL;
+@@ -565,7 +564,8 @@ static int cdns_pcie_ep_start(struct pci_epc *epc)
+ struct cdns_pcie_ep *ep = epc_get_drvdata(epc);
+ struct cdns_pcie *pcie = &ep->pcie;
+ struct device *dev = pcie->dev;
+- int ret;
++ int max_epfs = sizeof(epc->function_num_map) * 8;
++ int ret, value, epf;
+
+ /*
+ * BIT(0) is hardwired to 1, hence function 0 is always enabled
+@@ -573,6 +573,21 @@ static int cdns_pcie_ep_start(struct pci_epc *epc)
+ */
+ cdns_pcie_writel(pcie, CDNS_PCIE_LM_EP_FUNC_CFG, epc->function_num_map);
+
++ if (ep->quirk_disable_flr) {
++ for (epf = 0; epf < max_epfs; epf++) {
++ if (!(epc->function_num_map & BIT(epf)))
++ continue;
++
++ value = cdns_pcie_ep_fn_readl(pcie, epf,
++ CDNS_PCIE_EP_FUNC_DEV_CAP_OFFSET +
++ PCI_EXP_DEVCAP);
++ value &= ~PCI_EXP_DEVCAP_FLR;
++ cdns_pcie_ep_fn_writel(pcie, epf,
++ CDNS_PCIE_EP_FUNC_DEV_CAP_OFFSET +
++ PCI_EXP_DEVCAP, value);
++ }
++ }
++
+ ret = cdns_pcie_start_link(pcie);
+ if (ret) {
+ dev_err(dev, "Failed to start link\n");
+diff --git a/drivers/pci/controller/cadence/pcie-cadence.h b/drivers/pci/controller/cadence/pcie-cadence.h
+index c8a27b6290cea..d9c785365da3b 100644
+--- a/drivers/pci/controller/cadence/pcie-cadence.h
++++ b/drivers/pci/controller/cadence/pcie-cadence.h
+@@ -123,6 +123,7 @@
+
+ #define CDNS_PCIE_EP_FUNC_MSI_CAP_OFFSET 0x90
+ #define CDNS_PCIE_EP_FUNC_MSIX_CAP_OFFSET 0xb0
++#define CDNS_PCIE_EP_FUNC_DEV_CAP_OFFSET 0xc0
+ #define CDNS_PCIE_EP_FUNC_SRIOV_CAP_OFFSET 0x200
+
+ /*
+@@ -357,6 +358,7 @@ struct cdns_pcie_epf {
+ * minimize time between read and write
+ * @epf: Structure to hold info about endpoint function
+ * @quirk_detect_quiet_flag: LTSSM Detect Quiet min delay set as quirk
++ * @quirk_disable_flr: Disable FLR (Function Level Reset) quirk flag
+ */
+ struct cdns_pcie_ep {
+ struct cdns_pcie pcie;
+@@ -372,6 +374,7 @@ struct cdns_pcie_ep {
+ spinlock_t lock;
+ struct cdns_pcie_epf *epf;
+ unsigned int quirk_detect_quiet_flag:1;
++ unsigned int quirk_disable_flr:1;
+ };
+
+
+diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
+index 6619e3caffe2d..7a285fb0f6199 100644
+--- a/drivers/pci/controller/dwc/pci-imx6.c
++++ b/drivers/pci/controller/dwc/pci-imx6.c
+@@ -408,6 +408,11 @@ static void imx6_pcie_assert_core_reset(struct imx6_pcie *imx6_pcie)
+ dev_err(dev, "failed to disable vpcie regulator: %d\n",
+ ret);
+ }
++
++ /* Some boards don't have PCIe reset GPIO. */
++ if (gpio_is_valid(imx6_pcie->reset_gpio))
++ gpio_set_value_cansleep(imx6_pcie->reset_gpio,
++ imx6_pcie->gpio_active_high);
+ }
+
+ static unsigned int imx6_pcie_grp_offset(const struct imx6_pcie *imx6_pcie)
+@@ -540,15 +545,6 @@ static void imx6_pcie_deassert_core_reset(struct imx6_pcie *imx6_pcie)
+ /* allow the clocks to stabilize */
+ usleep_range(200, 500);
+
+- /* Some boards don't have PCIe reset GPIO. */
+- if (gpio_is_valid(imx6_pcie->reset_gpio)) {
+- gpio_set_value_cansleep(imx6_pcie->reset_gpio,
+- imx6_pcie->gpio_active_high);
+- msleep(100);
+- gpio_set_value_cansleep(imx6_pcie->reset_gpio,
+- !imx6_pcie->gpio_active_high);
+- }
+-
+ switch (imx6_pcie->drvdata->variant) {
+ case IMX8MQ:
+ reset_control_deassert(imx6_pcie->pciephy_reset);
+@@ -595,6 +591,15 @@ static void imx6_pcie_deassert_core_reset(struct imx6_pcie *imx6_pcie)
+ break;
+ }
+
++ /* Some boards don't have PCIe reset GPIO. */
++ if (gpio_is_valid(imx6_pcie->reset_gpio)) {
++ msleep(100);
++ gpio_set_value_cansleep(imx6_pcie->reset_gpio,
++ !imx6_pcie->gpio_active_high);
++ /* Wait for 100ms after PERST# deassertion (PCIe r5.0, 6.6.1) */
++ msleep(100);
++ }
++
+ return;
+
+ err_ref_clk:
+diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
+index 2fa86f32d9642..9979302532b72 100644
+--- a/drivers/pci/controller/dwc/pcie-designware-host.c
++++ b/drivers/pci/controller/dwc/pcie-designware-host.c
+@@ -396,7 +396,8 @@ int dw_pcie_host_init(struct pcie_port *pp)
+ sizeof(pp->msi_msg),
+ DMA_FROM_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
+- if (dma_mapping_error(pci->dev, pp->msi_data)) {
++ ret = dma_mapping_error(pci->dev, pp->msi_data);
++ if (ret) {
+ dev_err(pci->dev, "Failed to map MSI data\n");
+ pp->msi_data = 0;
+ goto err_free_msi;
+diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
+index 816028c0f6edb..ed55421eb9ba9 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom.c
++++ b/drivers/pci/controller/dwc/pcie-qcom.c
+@@ -1238,12 +1238,6 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
+ goto err_disable_clocks;
+ }
+
+- ret = clk_prepare_enable(res->pipe_clk);
+- if (ret) {
+- dev_err(dev, "cannot prepare/enable pipe clock\n");
+- goto err_disable_clocks;
+- }
+-
+ /* Wait for reset to complete, required on SM8450 */
+ usleep_range(1000, 1500);
+
+@@ -1627,22 +1621,21 @@ static int qcom_pcie_probe(struct platform_device *pdev)
+ pp->ops = &qcom_pcie_dw_ops;
+
+ ret = phy_init(pcie->phy);
+- if (ret) {
+- pm_runtime_disable(&pdev->dev);
++ if (ret)
+ goto err_pm_runtime_put;
+- }
+
+ platform_set_drvdata(pdev, pcie);
+
+ ret = dw_pcie_host_init(pp);
+ if (ret) {
+ dev_err(dev, "cannot initialize host\n");
+- pm_runtime_disable(&pdev->dev);
+- goto err_pm_runtime_put;
++ goto err_phy_exit;
+ }
+
+ return 0;
+
++err_phy_exit:
++ phy_exit(pcie->phy);
+ err_pm_runtime_put:
+ pm_runtime_put(dev);
+ pm_runtime_disable(dev);
+diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
+index 3e8d70bfabc6a..5d9fd36b02d18 100644
+--- a/drivers/pci/controller/pcie-mediatek-gen3.c
++++ b/drivers/pci/controller/pcie-mediatek-gen3.c
+@@ -838,6 +838,14 @@ static int mtk_pcie_setup(struct mtk_gen3_pcie *pcie)
+ if (err)
+ return err;
+
++ /*
++ * The controller may have been left out of reset by the bootloader
++ * so make sure that we get a clean start by asserting resets here.
++ */
++ reset_control_assert(pcie->phy_reset);
++ reset_control_assert(pcie->mac_reset);
++ usleep_range(10, 20);
++
+ /* Don't touch the hardware registers before power up */
+ err = mtk_pcie_power_up(pcie);
+ if (err)
+diff --git a/drivers/pci/controller/pcie-mediatek.c b/drivers/pci/controller/pcie-mediatek.c
+index ddfbd4aebdeca..be8bd919cb88f 100644
+--- a/drivers/pci/controller/pcie-mediatek.c
++++ b/drivers/pci/controller/pcie-mediatek.c
+@@ -1008,6 +1008,7 @@ static int mtk_pcie_subsys_powerup(struct mtk_pcie *pcie)
+ "mediatek,generic-pciecfg");
+ if (cfg_node) {
+ pcie->cfg = syscon_node_to_regmap(cfg_node);
++ of_node_put(cfg_node);
+ if (IS_ERR(pcie->cfg))
+ return PTR_ERR(pcie->cfg);
+ }
+diff --git a/drivers/pci/controller/pcie-microchip-host.c b/drivers/pci/controller/pcie-microchip-host.c
+index 29d8e81e41810..2c52a8cef7260 100644
+--- a/drivers/pci/controller/pcie-microchip-host.c
++++ b/drivers/pci/controller/pcie-microchip-host.c
+@@ -406,6 +406,7 @@ static void mc_pcie_enable_msi(struct mc_pcie *port, void __iomem *base)
+ static void mc_handle_msi(struct irq_desc *desc)
+ {
+ struct mc_pcie *port = irq_desc_get_handler_data(desc);
++ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct device *dev = port->dev;
+ struct mc_msi *msi = &port->msi;
+ void __iomem *bridge_base_addr =
+@@ -414,8 +415,11 @@ static void mc_handle_msi(struct irq_desc *desc)
+ u32 bit;
+ int ret;
+
++ chained_irq_enter(chip, desc);
++
+ status = readl_relaxed(bridge_base_addr + ISTATUS_LOCAL);
+ if (status & PM_MSI_INT_MSI_MASK) {
++ writel_relaxed(status & PM_MSI_INT_MSI_MASK, bridge_base_addr + ISTATUS_LOCAL);
+ status = readl_relaxed(bridge_base_addr + ISTATUS_MSI);
+ for_each_set_bit(bit, &status, msi->num_vectors) {
+ ret = generic_handle_domain_irq(msi->dev_domain, bit);
+@@ -424,6 +428,8 @@ static void mc_handle_msi(struct irq_desc *desc)
+ bit);
+ }
+ }
++
++ chained_irq_exit(chip, desc);
+ }
+
+ static void mc_msi_bottom_irq_ack(struct irq_data *data)
+@@ -432,13 +438,8 @@ static void mc_msi_bottom_irq_ack(struct irq_data *data)
+ void __iomem *bridge_base_addr =
+ port->axi_base_addr + MC_PCIE_BRIDGE_ADDR;
+ u32 bitpos = data->hwirq;
+- unsigned long status;
+
+ writel_relaxed(BIT(bitpos), bridge_base_addr + ISTATUS_MSI);
+- status = readl_relaxed(bridge_base_addr + ISTATUS_MSI);
+- if (!status)
+- writel_relaxed(BIT(PM_MSI_INT_MSI_SHIFT),
+- bridge_base_addr + ISTATUS_LOCAL);
+ }
+
+ static void mc_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+@@ -563,6 +564,7 @@ static int mc_allocate_msi_domains(struct mc_pcie *port)
+ static void mc_handle_intx(struct irq_desc *desc)
+ {
+ struct mc_pcie *port = irq_desc_get_handler_data(desc);
++ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct device *dev = port->dev;
+ void __iomem *bridge_base_addr =
+ port->axi_base_addr + MC_PCIE_BRIDGE_ADDR;
+@@ -570,6 +572,8 @@ static void mc_handle_intx(struct irq_desc *desc)
+ u32 bit;
+ int ret;
+
++ chained_irq_enter(chip, desc);
++
+ status = readl_relaxed(bridge_base_addr + ISTATUS_LOCAL);
+ if (status & PM_MSI_INT_INTX_MASK) {
+ status &= PM_MSI_INT_INTX_MASK;
+@@ -581,6 +585,8 @@ static void mc_handle_intx(struct irq_desc *desc)
+ bit);
+ }
+ }
++
++ chained_irq_exit(chip, desc);
+ }
+
+ static void mc_ack_intx_irq(struct irq_data *data)
+diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
+index 5fb9ce6e536e0..d1a200b93b2bf 100644
+--- a/drivers/pci/controller/pcie-rockchip-ep.c
++++ b/drivers/pci/controller/pcie-rockchip-ep.c
+@@ -264,8 +264,7 @@ static int rockchip_pcie_ep_map_addr(struct pci_epc *epc, u8 fn, u8 vfn,
+ struct rockchip_pcie *pcie = &ep->rockchip;
+ u32 r;
+
+- r = find_first_zero_bit(&ep->ob_region_map,
+- sizeof(ep->ob_region_map) * BITS_PER_LONG);
++ r = find_first_zero_bit(&ep->ob_region_map, BITS_PER_LONG);
+ /*
+ * Region 0 is reserved for configuration space and shouldn't
+ * be used elsewhere per TRM, so leave it out.
+diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
+index 1f15ab7eabf81..3ae435beaf0a9 100644
+--- a/drivers/pci/pci-acpi.c
++++ b/drivers/pci/pci-acpi.c
+@@ -974,9 +974,11 @@ bool acpi_pci_power_manageable(struct pci_dev *dev)
+
+ bool acpi_pci_bridge_d3(struct pci_dev *dev)
+ {
+- const union acpi_object *obj;
+- struct acpi_device *adev;
+ struct pci_dev *rpdev;
++ struct acpi_device *adev;
++ acpi_status status;
++ unsigned long long state;
++ const union acpi_object *obj;
+
+ if (acpi_pci_disabled || !dev->is_hotplug_bridge)
+ return false;
+@@ -985,12 +987,6 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
+ if (acpi_pci_power_manageable(dev))
+ return true;
+
+- /*
+- * The ACPI firmware will provide the device-specific properties through
+- * _DSD configuration object. Look for the 'HotPlugSupportInD3' property
+- * for the root port and if it is set we know the hierarchy behind it
+- * supports D3 just fine.
+- */
+ rpdev = pcie_find_root_port(dev);
+ if (!rpdev)
+ return false;
+@@ -999,11 +995,34 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
+ if (!adev)
+ return false;
+
+- if (acpi_dev_get_property(adev, "HotPlugSupportInD3",
+- ACPI_TYPE_INTEGER, &obj) < 0)
++ /*
++ * If the Root Port cannot signal wakeup signals at all, i.e., it
++ * doesn't supply a wakeup GPE via _PRW, it cannot signal hotplug
++ * events from low-power states including D3hot and D3cold.
++ */
++ if (!adev->wakeup.flags.valid)
+ return false;
+
+- return obj->integer.value == 1;
++ /*
++ * If the Root Port cannot wake itself from D3hot or D3cold, we
++ * can't use D3.
++ */
++ status = acpi_evaluate_integer(adev->handle, "_S0W", NULL, &state);
++ if (ACPI_SUCCESS(status) && state < ACPI_STATE_D3_HOT)
++ return false;
++
++ /*
++ * The "HotPlugSupportInD3" property in a Root Port _DSD indicates
++ * the Port can signal hotplug events while in D3. We assume any
++ * bridges *below* that Root Port can also signal hotplug events
++ * while in D3.
++ */
++ if (!acpi_dev_get_property(adev, "HotPlugSupportInD3",
++ ACPI_TYPE_INTEGER, &obj) &&
++ obj->integer.value == 1)
++ return true;
++
++ return false;
+ }
+
+ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
+index d25122fbe98ab..8ac110d6c6f4b 100644
+--- a/drivers/pci/pci.c
++++ b/drivers/pci/pci.c
+@@ -2920,6 +2920,8 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "Gigabyte Technology Co., Ltd."),
+ DMI_MATCH(DMI_BOARD_NAME, "X299 DESIGNARE EX-CF"),
+ },
++ },
++ {
+ /*
+ * Downstream device is not accessible after putting a root port
+ * into D3cold and back into D0 on Elo i2.
+@@ -5113,19 +5115,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
+
+ void pci_dev_lock(struct pci_dev *dev)
+ {
+- pci_cfg_access_lock(dev);
+ /* block PM suspend, driver probe, etc. */
+ device_lock(&dev->dev);
++ pci_cfg_access_lock(dev);
+ }
+ EXPORT_SYMBOL_GPL(pci_dev_lock);
+
+ /* Return 1 on successful lock, 0 on contention */
+ int pci_dev_trylock(struct pci_dev *dev)
+ {
+- if (pci_cfg_access_trylock(dev)) {
+- if (device_trylock(&dev->dev))
++ if (device_trylock(&dev->dev)) {
++ if (pci_cfg_access_trylock(dev))
+ return 1;
+- pci_cfg_access_unlock(dev);
++ device_unlock(&dev->dev);
+ }
+
+ return 0;
+@@ -5134,8 +5136,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock);
+
+ void pci_dev_unlock(struct pci_dev *dev)
+ {
+- device_unlock(&dev->dev);
+ pci_cfg_access_unlock(dev);
++ device_unlock(&dev->dev);
+ }
+ EXPORT_SYMBOL_GPL(pci_dev_unlock);
+
+diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
+index 9fa1f97e5b270..7952e5efd6cf3 100644
+--- a/drivers/pci/pcie/aer.c
++++ b/drivers/pci/pcie/aer.c
+@@ -101,6 +101,11 @@ struct aer_stats {
+ #define ERR_COR_ID(d) (d & 0xffff)
+ #define ERR_UNCOR_ID(d) (d >> 16)
+
++#define AER_ERR_STATUS_MASK (PCI_ERR_ROOT_UNCOR_RCV | \
++ PCI_ERR_ROOT_COR_RCV | \
++ PCI_ERR_ROOT_MULTI_COR_RCV | \
++ PCI_ERR_ROOT_MULTI_UNCOR_RCV)
++
+ static int pcie_aer_disable;
+ static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
+
+@@ -1196,7 +1201,7 @@ static irqreturn_t aer_irq(int irq, void *context)
+ struct aer_err_source e_src = {};
+
+ pci_read_config_dword(rp, aer + PCI_ERR_ROOT_STATUS, &e_src.status);
+- if (!(e_src.status & (PCI_ERR_ROOT_UNCOR_RCV|PCI_ERR_ROOT_COR_RCV)))
++ if (!(e_src.status & AER_ERR_STATUS_MASK))
+ return IRQ_NONE;
+
+ pci_read_config_dword(rp, aer + PCI_ERR_ROOT_ERR_SRC, &e_src.id);
+diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
+index da829274fc66d..41aeaa2351322 100644
+--- a/drivers/pci/quirks.c
++++ b/drivers/pci/quirks.c
+@@ -12,6 +12,7 @@
+ * file, where their drivers can use them.
+ */
+
++#include <linux/bitfield.h>
+ #include <linux/types.h>
+ #include <linux/kernel.h>
+ #include <linux/export.h>
+@@ -5895,3 +5896,49 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1533, rom_bar_overlap_defect);
+ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect);
+ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect);
+ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect);
++
++#ifdef CONFIG_PCIEASPM
++/*
++ * Several Intel DG2 graphics devices advertise that they can only tolerate
++ * 1us latency when transitioning from L1 to L0, which may prevent ASPM L1
++ * from being enabled. But in fact these devices can tolerate unlimited
++ * latency. Override their Device Capabilities value to allow ASPM L1 to
++ * be enabled.
++ */
++static void aspm_l1_acceptable_latency(struct pci_dev *dev)
++{
++ u32 l1_lat = FIELD_GET(PCI_EXP_DEVCAP_L1, dev->devcap);
++
++ if (l1_lat < 7) {
++ dev->devcap |= FIELD_PREP(PCI_EXP_DEVCAP_L1, 7);
++ pci_info(dev, "ASPM: overriding L1 acceptable latency from %#x to 0x7\n",
++ l1_lat);
++ }
++}
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f80, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f81, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f82, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f83, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f84, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f85, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f86, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f87, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x4f88, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5690, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5691, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5692, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5693, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5694, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x5695, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a0, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a1, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a2, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a3, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a4, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a5, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56a6, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b0, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency);
++#endif
+diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c
+index b144ae1f729ab..9afac02e0eaa2 100644
+--- a/drivers/phy/qualcomm/phy-qcom-qmp.c
++++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
+@@ -5818,6 +5818,11 @@ static const struct phy_ops qcom_qmp_pcie_ufs_ops = {
+ .owner = THIS_MODULE,
+ };
+
++static void qcom_qmp_reset_control_put(void *data)
++{
++ reset_control_put(data);
++}
++
+ static
+ int qcom_qmp_phy_create(struct device *dev, struct device_node *np, int id,
+ void __iomem *serdes, const struct qmp_phy_cfg *cfg)
+@@ -5890,7 +5895,7 @@ int qcom_qmp_phy_create(struct device *dev, struct device_node *np, int id,
+ * all phys that don't need this.
+ */
+ snprintf(prop_name, sizeof(prop_name), "pipe%d", id);
+- qphy->pipe_clk = of_clk_get_by_name(np, prop_name);
++ qphy->pipe_clk = devm_get_clk_from_child(dev, np, prop_name);
+ if (IS_ERR(qphy->pipe_clk)) {
+ if (cfg->type == PHY_TYPE_PCIE ||
+ cfg->type == PHY_TYPE_USB3) {
+@@ -5912,6 +5917,10 @@ int qcom_qmp_phy_create(struct device *dev, struct device_node *np, int id,
+ dev_err(dev, "failed to get lane%d reset\n", id);
+ return PTR_ERR(qphy->lane_rst);
+ }
++ ret = devm_add_action_or_reset(dev, qcom_qmp_reset_control_put,
++ qphy->lane_rst);
++ if (ret)
++ return ret;
+ }
+
+ if (cfg->type == PHY_TYPE_UFS || cfg->type == PHY_TYPE_PCIE)
+diff --git a/drivers/pinctrl/bcm/pinctrl-bcm2835.c b/drivers/pinctrl/bcm/pinctrl-bcm2835.c
+index 47e433e09c5ce..dad4530547768 100644
+--- a/drivers/pinctrl/bcm/pinctrl-bcm2835.c
++++ b/drivers/pinctrl/bcm/pinctrl-bcm2835.c
+@@ -358,6 +358,22 @@ static int bcm2835_gpio_direction_output(struct gpio_chip *chip,
+ return 0;
+ }
+
++static int bcm2835_of_gpio_ranges_fallback(struct gpio_chip *gc,
++ struct device_node *np)
++{
++ struct pinctrl_dev *pctldev = of_pinctrl_get(np);
++
++ of_node_put(np);
++
++ if (!pctldev)
++ return 0;
++
++ gpiochip_add_pin_range(gc, pinctrl_dev_get_devname(pctldev), 0, 0,
++ gc->ngpio);
++
++ return 0;
++}
++
+ static const struct gpio_chip bcm2835_gpio_chip = {
+ .label = MODULE_NAME,
+ .owner = THIS_MODULE,
+@@ -372,6 +388,7 @@ static const struct gpio_chip bcm2835_gpio_chip = {
+ .base = -1,
+ .ngpio = BCM2835_NUM_GPIOS,
+ .can_sleep = false,
++ .of_gpio_ranges_fallback = bcm2835_of_gpio_ranges_fallback,
+ };
+
+ static const struct gpio_chip bcm2711_gpio_chip = {
+@@ -388,6 +405,7 @@ static const struct gpio_chip bcm2711_gpio_chip = {
+ .base = -1,
+ .ngpio = BCM2711_NUM_GPIOS,
+ .can_sleep = false,
++ .of_gpio_ranges_fallback = bcm2835_of_gpio_ranges_fallback,
+ };
+
+ static void bcm2835_gpio_irq_handle_bank(struct bcm2835_pinctrl *pc,
+diff --git a/drivers/pinctrl/mediatek/Kconfig b/drivers/pinctrl/mediatek/Kconfig
+index 40accd110c3d8..b3074082c56d3 100644
+--- a/drivers/pinctrl/mediatek/Kconfig
++++ b/drivers/pinctrl/mediatek/Kconfig
+@@ -166,6 +166,7 @@ config PINCTRL_MT8195
+ bool "Mediatek MT8195 pin control"
+ depends on OF
+ depends on ARM64 || COMPILE_TEST
++ default ARM64 && ARCH_MEDIATEK
+ select PINCTRL_MTK_PARIS
+
+ config PINCTRL_MT8365
+diff --git a/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c b/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c
+index 08cad14042e2e..adccf03b3e5af 100644
+--- a/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c
++++ b/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c
+@@ -773,7 +773,7 @@ static int armada_37xx_irqchip_register(struct platform_device *pdev,
+ for (i = 0; i < nr_irq_parent; i++) {
+ int irq = irq_of_parse_and_map(np, i);
+
+- if (irq < 0)
++ if (!irq)
+ continue;
+ girq->parents[i] = irq;
+ }
+diff --git a/drivers/pinctrl/pinctrl-apple-gpio.c b/drivers/pinctrl/pinctrl-apple-gpio.c
+index 72f4dd2466e11..6d1bff9588d99 100644
+--- a/drivers/pinctrl/pinctrl-apple-gpio.c
++++ b/drivers/pinctrl/pinctrl-apple-gpio.c
+@@ -72,6 +72,7 @@ struct regmap_config regmap_config = {
+ .max_register = 512 * sizeof(u32),
+ .num_reg_defaults_raw = 512,
+ .use_relaxed_mmio = true,
++ .use_raw_spinlock = true,
+ };
+
+ /* No locking needed to mask/unmask IRQs as the interrupt mode is per pin-register. */
+diff --git a/drivers/pinctrl/pinctrl-rockchip.c b/drivers/pinctrl/pinctrl-rockchip.c
+index 2cb79e649fcf3..e1f58451984ff 100644
+--- a/drivers/pinctrl/pinctrl-rockchip.c
++++ b/drivers/pinctrl/pinctrl-rockchip.c
+@@ -2110,19 +2110,20 @@ static bool rockchip_pinconf_pull_valid(struct rockchip_pin_ctrl *ctrl,
+ return false;
+ }
+
+-static int rockchip_pinconf_defer_output(struct rockchip_pin_bank *bank,
+- unsigned int pin, u32 arg)
++static int rockchip_pinconf_defer_pin(struct rockchip_pin_bank *bank,
++ unsigned int pin, u32 param, u32 arg)
+ {
+- struct rockchip_pin_output_deferred *cfg;
++ struct rockchip_pin_deferred *cfg;
+
+ cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
+ if (!cfg)
+ return -ENOMEM;
+
+ cfg->pin = pin;
++ cfg->param = param;
+ cfg->arg = arg;
+
+- list_add_tail(&cfg->head, &bank->deferred_output);
++ list_add_tail(&cfg->head, &bank->deferred_pins);
+
+ return 0;
+ }
+@@ -2143,6 +2144,25 @@ static int rockchip_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+ param = pinconf_to_config_param(configs[i]);
+ arg = pinconf_to_config_argument(configs[i]);
+
++ if (param == PIN_CONFIG_OUTPUT || param == PIN_CONFIG_INPUT_ENABLE) {
++ /*
++ * Check for gpio driver not being probed yet.
++ * The lock makes sure that either gpio-probe has completed
++ * or the gpio driver hasn't probed yet.
++ */
++ mutex_lock(&bank->deferred_lock);
++ if (!gpio || !gpio->direction_output) {
++ rc = rockchip_pinconf_defer_pin(bank, pin - bank->pin_base, param,
++ arg);
++ mutex_unlock(&bank->deferred_lock);
++ if (rc)
++ return rc;
++
++ break;
++ }
++ mutex_unlock(&bank->deferred_lock);
++ }
++
+ switch (param) {
+ case PIN_CONFIG_BIAS_DISABLE:
+ rc = rockchip_set_pull(bank, pin - bank->pin_base,
+@@ -2171,27 +2191,21 @@ static int rockchip_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+ if (rc != RK_FUNC_GPIO)
+ return -EINVAL;
+
+- /*
+- * Check for gpio driver not being probed yet.
+- * The lock makes sure that either gpio-probe has completed
+- * or the gpio driver hasn't probed yet.
+- */
+- mutex_lock(&bank->deferred_lock);
+- if (!gpio || !gpio->direction_output) {
+- rc = rockchip_pinconf_defer_output(bank, pin - bank->pin_base, arg);
+- mutex_unlock(&bank->deferred_lock);
+- if (rc)
+- return rc;
+-
+- break;
+- }
+- mutex_unlock(&bank->deferred_lock);
+-
+ rc = gpio->direction_output(gpio, pin - bank->pin_base,
+ arg);
+ if (rc)
+ return rc;
+ break;
++ case PIN_CONFIG_INPUT_ENABLE:
++ rc = rockchip_set_mux(bank, pin - bank->pin_base,
++ RK_FUNC_GPIO);
++ if (rc != RK_FUNC_GPIO)
++ return -EINVAL;
++
++ rc = gpio->direction_input(gpio, pin - bank->pin_base);
++ if (rc)
++ return rc;
++ break;
+ case PIN_CONFIG_DRIVE_STRENGTH:
+ /* rk3288 is the first with per-pin drive-strength */
+ if (!info->ctrl->drv_calc_reg)
+@@ -2500,7 +2514,7 @@ static int rockchip_pinctrl_register(struct platform_device *pdev,
+ pdesc++;
+ }
+
+- INIT_LIST_HEAD(&pin_bank->deferred_output);
++ INIT_LIST_HEAD(&pin_bank->deferred_pins);
+ mutex_init(&pin_bank->deferred_lock);
+ }
+
+@@ -2763,7 +2777,7 @@ static int rockchip_pinctrl_remove(struct platform_device *pdev)
+ {
+ struct rockchip_pinctrl *info = platform_get_drvdata(pdev);
+ struct rockchip_pin_bank *bank;
+- struct rockchip_pin_output_deferred *cfg;
++ struct rockchip_pin_deferred *cfg;
+ int i;
+
+ of_platform_depopulate(&pdev->dev);
+@@ -2772,9 +2786,9 @@ static int rockchip_pinctrl_remove(struct platform_device *pdev)
+ bank = &info->ctrl->pin_banks[i];
+
+ mutex_lock(&bank->deferred_lock);
+- while (!list_empty(&bank->deferred_output)) {
+- cfg = list_first_entry(&bank->deferred_output,
+- struct rockchip_pin_output_deferred, head);
++ while (!list_empty(&bank->deferred_pins)) {
++ cfg = list_first_entry(&bank->deferred_pins,
++ struct rockchip_pin_deferred, head);
+ list_del(&cfg->head);
+ kfree(cfg);
+ }
+diff --git a/drivers/pinctrl/pinctrl-rockchip.h b/drivers/pinctrl/pinctrl-rockchip.h
+index 91f10279d0844..98a01a616da67 100644
+--- a/drivers/pinctrl/pinctrl-rockchip.h
++++ b/drivers/pinctrl/pinctrl-rockchip.h
+@@ -171,7 +171,7 @@ struct rockchip_pin_bank {
+ u32 toggle_edge_mode;
+ u32 recalced_mask;
+ u32 route_mask;
+- struct list_head deferred_output;
++ struct list_head deferred_pins;
+ struct mutex deferred_lock;
+ };
+
+@@ -247,9 +247,12 @@ struct rockchip_pin_config {
+ unsigned int nconfigs;
+ };
+
+-struct rockchip_pin_output_deferred {
++enum pin_config_param;
++
++struct rockchip_pin_deferred {
+ struct list_head head;
+ unsigned int pin;
++ enum pin_config_param param;
+ u32 arg;
+ };
+
+diff --git a/drivers/pinctrl/renesas/core.c b/drivers/pinctrl/renesas/core.c
+index d0d4714731c14..3d8bf521c3e77 100644
+--- a/drivers/pinctrl/renesas/core.c
++++ b/drivers/pinctrl/renesas/core.c
+@@ -71,12 +71,11 @@ static int sh_pfc_map_resources(struct sh_pfc *pfc,
+
+ /* Fill them. */
+ for (i = 0; i < num_windows; i++) {
+- res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+- windows->phys = res->start;
+- windows->size = resource_size(res);
+- windows->virt = devm_ioremap_resource(pfc->dev, res);
++ windows->virt = devm_platform_get_and_ioremap_resource(pdev, i, &res);
+ if (IS_ERR(windows->virt))
+ return -ENOMEM;
++ windows->phys = res->start;
++ windows->size = resource_size(res);
+ windows++;
+ }
+ for (i = 0; i < num_irqs; i++)
+diff --git a/drivers/pinctrl/renesas/pfc-r8a779a0.c b/drivers/pinctrl/renesas/pfc-r8a779a0.c
+index 4a668a04b7ca6..0c26e95ba7db1 100644
+--- a/drivers/pinctrl/renesas/pfc-r8a779a0.c
++++ b/drivers/pinctrl/renesas/pfc-r8a779a0.c
+@@ -629,7 +629,36 @@ enum {
+ };
+
+ static const u16 pinmux_data[] = {
++/* Using GP_2_[2-15] requires disabling I2C in MOD_SEL2 */
++#define GP_2_2_FN GP_2_2_FN, FN_SEL_I2C0_0
++#define GP_2_3_FN GP_2_3_FN, FN_SEL_I2C0_0
++#define GP_2_4_FN GP_2_4_FN, FN_SEL_I2C1_0
++#define GP_2_5_FN GP_2_5_FN, FN_SEL_I2C1_0
++#define GP_2_6_FN GP_2_6_FN, FN_SEL_I2C2_0
++#define GP_2_7_FN GP_2_7_FN, FN_SEL_I2C2_0
++#define GP_2_8_FN GP_2_8_FN, FN_SEL_I2C3_0
++#define GP_2_9_FN GP_2_9_FN, FN_SEL_I2C3_0
++#define GP_2_10_FN GP_2_10_FN, FN_SEL_I2C4_0
++#define GP_2_11_FN GP_2_11_FN, FN_SEL_I2C4_0
++#define GP_2_12_FN GP_2_12_FN, FN_SEL_I2C5_0
++#define GP_2_13_FN GP_2_13_FN, FN_SEL_I2C5_0
++#define GP_2_14_FN GP_2_14_FN, FN_SEL_I2C6_0
++#define GP_2_15_FN GP_2_15_FN, FN_SEL_I2C6_0
+ PINMUX_DATA_GP_ALL(),
++#undef GP_2_2_FN
++#undef GP_2_3_FN
++#undef GP_2_4_FN
++#undef GP_2_5_FN
++#undef GP_2_6_FN
++#undef GP_2_7_FN
++#undef GP_2_8_FN
++#undef GP_2_9_FN
++#undef GP_2_10_FN
++#undef GP_2_11_FN
++#undef GP_2_12_FN
++#undef GP_2_13_FN
++#undef GP_2_14_FN
++#undef GP_2_15_FN
+
+ PINMUX_SINGLE(MMC_D7),
+ PINMUX_SINGLE(MMC_D6),
+diff --git a/drivers/pinctrl/renesas/pfc-r8a779f0.c b/drivers/pinctrl/renesas/pfc-r8a779f0.c
+index 91860608242c5..3b4ca9622bbe1 100644
+--- a/drivers/pinctrl/renesas/pfc-r8a779f0.c
++++ b/drivers/pinctrl/renesas/pfc-r8a779f0.c
+@@ -257,7 +257,28 @@ enum {
+ };
+
+ static const u16 pinmux_data[] = {
++/* Using GP_1_[0-9] requires disabling I2C in MOD_SEL1 */
++#define GP_1_0_FN GP_1_0_FN, FN_SEL_I2C0_0
++#define GP_1_1_FN GP_1_1_FN, FN_SEL_I2C0_0
++#define GP_1_2_FN GP_1_2_FN, FN_SEL_I2C1_0
++#define GP_1_3_FN GP_1_3_FN, FN_SEL_I2C1_0
++#define GP_1_4_FN GP_1_4_FN, FN_SEL_I2C2_0
++#define GP_1_5_FN GP_1_5_FN, FN_SEL_I2C2_0
++#define GP_1_6_FN GP_1_6_FN, FN_SEL_I2C3_0
++#define GP_1_7_FN GP_1_7_FN, FN_SEL_I2C3_0
++#define GP_1_8_FN GP_1_8_FN, FN_SEL_I2C4_0
++#define GP_1_9_FN GP_1_9_FN, FN_SEL_I2C4_0
+ PINMUX_DATA_GP_ALL(),
++#undef GP_1_0_FN
++#undef GP_1_1_FN
++#undef GP_1_2_FN
++#undef GP_1_3_FN
++#undef GP_1_4_FN
++#undef GP_1_5_FN
++#undef GP_1_6_FN
++#undef GP_1_7_FN
++#undef GP_1_8_FN
++#undef GP_1_9_FN
+
+ PINMUX_SINGLE(SD_WP),
+ PINMUX_SINGLE(SD_CD),
+diff --git a/drivers/pinctrl/renesas/pinctrl-rzn1.c b/drivers/pinctrl/renesas/pinctrl-rzn1.c
+index ef5fb25b6016d..849d091205d4d 100644
+--- a/drivers/pinctrl/renesas/pinctrl-rzn1.c
++++ b/drivers/pinctrl/renesas/pinctrl-rzn1.c
+@@ -865,17 +865,15 @@ static int rzn1_pinctrl_probe(struct platform_device *pdev)
+ ipctl->mdio_func[0] = -1;
+ ipctl->mdio_func[1] = -1;
+
+- res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+- ipctl->lev1_protect_phys = (u32)res->start + 0x400;
+- ipctl->lev1 = devm_ioremap_resource(&pdev->dev, res);
++ ipctl->lev1 = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
+ if (IS_ERR(ipctl->lev1))
+ return PTR_ERR(ipctl->lev1);
++ ipctl->lev1_protect_phys = (u32)res->start + 0x400;
+
+- res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+- ipctl->lev2_protect_phys = (u32)res->start + 0x400;
+- ipctl->lev2 = devm_ioremap_resource(&pdev->dev, res);
++ ipctl->lev2 = devm_platform_get_and_ioremap_resource(pdev, 1, &res);
+ if (IS_ERR(ipctl->lev2))
+ return PTR_ERR(ipctl->lev2);
++ ipctl->lev2_protect_phys = (u32)res->start + 0x400;
+
+ ipctl->clk = devm_clk_get(&pdev->dev, NULL);
+ if (IS_ERR(ipctl->clk))
+diff --git a/drivers/platform/chrome/cros_ec.c b/drivers/platform/chrome/cros_ec.c
+index d49a4efe46c8a..a5cc8f24299eb 100644
+--- a/drivers/platform/chrome/cros_ec.c
++++ b/drivers/platform/chrome/cros_ec.c
+@@ -189,6 +189,8 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
+ ec_dev->max_request = sizeof(struct ec_params_hello);
+ ec_dev->max_response = sizeof(struct ec_response_get_protocol_info);
+ ec_dev->max_passthru = 0;
++ ec_dev->ec = NULL;
++ ec_dev->pd = NULL;
+
+ ec_dev->din = devm_kzalloc(dev, ec_dev->din_size, GFP_KERNEL);
+ if (!ec_dev->din)
+@@ -245,18 +247,16 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
+ if (IS_ERR(ec_dev->pd)) {
+ dev_err(ec_dev->dev,
+ "Failed to create CrOS PD platform device\n");
+- platform_device_unregister(ec_dev->ec);
+- return PTR_ERR(ec_dev->pd);
++ err = PTR_ERR(ec_dev->pd);
++ goto exit;
+ }
+ }
+
+ if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
+ err = devm_of_platform_populate(dev);
+ if (err) {
+- platform_device_unregister(ec_dev->pd);
+- platform_device_unregister(ec_dev->ec);
+ dev_err(dev, "Failed to register sub-devices\n");
+- return err;
++ goto exit;
+ }
+ }
+
+@@ -278,7 +278,7 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
+ err = blocking_notifier_chain_register(&ec_dev->event_notifier,
+ &ec_dev->notifier_ready);
+ if (err)
+- return err;
++ goto exit;
+ }
+
+ dev_info(dev, "Chrome EC device registered\n");
+@@ -291,6 +291,10 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
+ cros_ec_irq_thread(0, ec_dev);
+
+ return 0;
++exit:
++ platform_device_unregister(ec_dev->ec);
++ platform_device_unregister(ec_dev->pd);
++ return err;
+ }
+ EXPORT_SYMBOL(cros_ec_register);
+
+diff --git a/drivers/platform/chrome/cros_ec_chardev.c b/drivers/platform/chrome/cros_ec_chardev.c
+index e0bce869c49a9..fd33de546aee0 100644
+--- a/drivers/platform/chrome/cros_ec_chardev.c
++++ b/drivers/platform/chrome/cros_ec_chardev.c
+@@ -301,7 +301,7 @@ static long cros_ec_chardev_ioctl_xcmd(struct cros_ec_dev *ec, void __user *arg)
+ }
+
+ s_cmd->command += ec->cmd_offset;
+- ret = cros_ec_cmd_xfer_status(ec->ec_dev, s_cmd);
++ ret = cros_ec_cmd_xfer(ec->ec_dev, s_cmd);
+ /* Only copy data to userland if data was received. */
+ if (ret < 0)
+ goto exit;
+diff --git a/drivers/platform/chrome/cros_ec_proto.c b/drivers/platform/chrome/cros_ec_proto.c
+index c4caf2e2de825..ac1419881ff35 100644
+--- a/drivers/platform/chrome/cros_ec_proto.c
++++ b/drivers/platform/chrome/cros_ec_proto.c
+@@ -560,22 +560,28 @@ exit:
+ EXPORT_SYMBOL(cros_ec_query_all);
+
+ /**
+- * cros_ec_cmd_xfer_status() - Send a command to the ChromeOS EC.
++ * cros_ec_cmd_xfer() - Send a command to the ChromeOS EC.
+ * @ec_dev: EC device.
+ * @msg: Message to write.
+ *
+- * Call this to send a command to the ChromeOS EC. This should be used instead of calling the EC's
+- * cmd_xfer() callback directly. It returns success status only if both the command was transmitted
+- * successfully and the EC replied with success status.
++ * Call this to send a command to the ChromeOS EC. This should be used instead
++ * of calling the EC's cmd_xfer() callback directly. This function does not
++ * convert EC command execution error codes to Linux error codes. Most
++ * in-kernel users will want to use cros_ec_cmd_xfer_status() instead since
++ * that function implements the conversion.
+ *
+ * Return:
+- * >=0 - The number of bytes transferred
+- * <0 - Linux error code
++ * >0 - EC command was executed successfully. The return value is the number
++ * of bytes returned by the EC (excluding the header).
++ * =0 - EC communication was successful. EC command execution results are
++ * reported in msg->result. The result will be EC_RES_SUCCESS if the
++ * command was executed successfully or report an EC command execution
++ * error.
++ * <0 - EC communication error. Return value is the Linux error code.
+ */
+-int cros_ec_cmd_xfer_status(struct cros_ec_device *ec_dev,
+- struct cros_ec_command *msg)
++int cros_ec_cmd_xfer(struct cros_ec_device *ec_dev, struct cros_ec_command *msg)
+ {
+- int ret, mapped;
++ int ret;
+
+ mutex_lock(&ec_dev->lock);
+ if (ec_dev->proto_version == EC_PROTO_VERSION_UNKNOWN) {
+@@ -616,6 +622,32 @@ int cros_ec_cmd_xfer_status(struct cros_ec_device *ec_dev,
+ ret = send_command(ec_dev, msg);
+ mutex_unlock(&ec_dev->lock);
+
++ return ret;
++}
++EXPORT_SYMBOL(cros_ec_cmd_xfer);
++
++/**
++ * cros_ec_cmd_xfer_status() - Send a command to the ChromeOS EC.
++ * @ec_dev: EC device.
++ * @msg: Message to write.
++ *
++ * Call this to send a command to the ChromeOS EC. This should be used instead of calling the EC's
++ * cmd_xfer() callback directly. It returns success status only if both the command was transmitted
++ * successfully and the EC replied with success status.
++ *
++ * Return:
++ * >=0 - The number of bytes transferred.
++ * <0 - Linux error code
++ */
++int cros_ec_cmd_xfer_status(struct cros_ec_device *ec_dev,
++ struct cros_ec_command *msg)
++{
++ int ret, mapped;
++
++ ret = cros_ec_cmd_xfer(ec_dev, msg);
++ if (ret < 0)
++ return ret;
++
+ mapped = cros_ec_map_error(msg->result);
+ if (mapped) {
+ dev_dbg(ec_dev->dev, "Command result (err: %d [%d])\n",
+diff --git a/drivers/platform/mips/cpu_hwmon.c b/drivers/platform/mips/cpu_hwmon.c
+index 386389ffec419..d8c5f9195f85f 100644
+--- a/drivers/platform/mips/cpu_hwmon.c
++++ b/drivers/platform/mips/cpu_hwmon.c
+@@ -55,55 +55,6 @@ out:
+ static int nr_packages;
+ static struct device *cpu_hwmon_dev;
+
+-static SENSOR_DEVICE_ATTR(name, 0444, NULL, NULL, 0);
+-
+-static struct attribute *cpu_hwmon_attributes[] = {
+- &sensor_dev_attr_name.dev_attr.attr,
+- NULL
+-};
+-
+-/* Hwmon device attribute group */
+-static struct attribute_group cpu_hwmon_attribute_group = {
+- .attrs = cpu_hwmon_attributes,
+-};
+-
+-static ssize_t get_cpu_temp(struct device *dev,
+- struct device_attribute *attr, char *buf);
+-static ssize_t cpu_temp_label(struct device *dev,
+- struct device_attribute *attr, char *buf);
+-
+-static SENSOR_DEVICE_ATTR(temp1_input, 0444, get_cpu_temp, NULL, 1);
+-static SENSOR_DEVICE_ATTR(temp1_label, 0444, cpu_temp_label, NULL, 1);
+-static SENSOR_DEVICE_ATTR(temp2_input, 0444, get_cpu_temp, NULL, 2);
+-static SENSOR_DEVICE_ATTR(temp2_label, 0444, cpu_temp_label, NULL, 2);
+-static SENSOR_DEVICE_ATTR(temp3_input, 0444, get_cpu_temp, NULL, 3);
+-static SENSOR_DEVICE_ATTR(temp3_label, 0444, cpu_temp_label, NULL, 3);
+-static SENSOR_DEVICE_ATTR(temp4_input, 0444, get_cpu_temp, NULL, 4);
+-static SENSOR_DEVICE_ATTR(temp4_label, 0444, cpu_temp_label, NULL, 4);
+-
+-static const struct attribute *hwmon_cputemp[4][3] = {
+- {
+- &sensor_dev_attr_temp1_input.dev_attr.attr,
+- &sensor_dev_attr_temp1_label.dev_attr.attr,
+- NULL
+- },
+- {
+- &sensor_dev_attr_temp2_input.dev_attr.attr,
+- &sensor_dev_attr_temp2_label.dev_attr.attr,
+- NULL
+- },
+- {
+- &sensor_dev_attr_temp3_input.dev_attr.attr,
+- &sensor_dev_attr_temp3_label.dev_attr.attr,
+- NULL
+- },
+- {
+- &sensor_dev_attr_temp4_input.dev_attr.attr,
+- &sensor_dev_attr_temp4_label.dev_attr.attr,
+- NULL
+- }
+-};
+-
+ static ssize_t cpu_temp_label(struct device *dev,
+ struct device_attribute *attr, char *buf)
+ {
+@@ -121,24 +72,47 @@ static ssize_t get_cpu_temp(struct device *dev,
+ return sprintf(buf, "%d\n", value);
+ }
+
+-static int create_sysfs_cputemp_files(struct kobject *kobj)
+-{
+- int i, ret = 0;
+-
+- for (i = 0; i < nr_packages; i++)
+- ret = sysfs_create_files(kobj, hwmon_cputemp[i]);
++static SENSOR_DEVICE_ATTR(temp1_input, 0444, get_cpu_temp, NULL, 1);
++static SENSOR_DEVICE_ATTR(temp1_label, 0444, cpu_temp_label, NULL, 1);
++static SENSOR_DEVICE_ATTR(temp2_input, 0444, get_cpu_temp, NULL, 2);
++static SENSOR_DEVICE_ATTR(temp2_label, 0444, cpu_temp_label, NULL, 2);
++static SENSOR_DEVICE_ATTR(temp3_input, 0444, get_cpu_temp, NULL, 3);
++static SENSOR_DEVICE_ATTR(temp3_label, 0444, cpu_temp_label, NULL, 3);
++static SENSOR_DEVICE_ATTR(temp4_input, 0444, get_cpu_temp, NULL, 4);
++static SENSOR_DEVICE_ATTR(temp4_label, 0444, cpu_temp_label, NULL, 4);
+
+- return ret;
+-}
++static struct attribute *cpu_hwmon_attributes[] = {
++ &sensor_dev_attr_temp1_input.dev_attr.attr,
++ &sensor_dev_attr_temp1_label.dev_attr.attr,
++ &sensor_dev_attr_temp2_input.dev_attr.attr,
++ &sensor_dev_attr_temp2_label.dev_attr.attr,
++ &sensor_dev_attr_temp3_input.dev_attr.attr,
++ &sensor_dev_attr_temp3_label.dev_attr.attr,
++ &sensor_dev_attr_temp4_input.dev_attr.attr,
++ &sensor_dev_attr_temp4_label.dev_attr.attr,
++ NULL
++};
+
+-static void remove_sysfs_cputemp_files(struct kobject *kobj)
++static umode_t cpu_hwmon_is_visible(struct kobject *kobj,
++ struct attribute *attr, int i)
+ {
+- int i;
++ int id = i / 2;
+
+- for (i = 0; i < nr_packages; i++)
+- sysfs_remove_files(kobj, hwmon_cputemp[i]);
++ if (id < nr_packages)
++ return attr->mode;
++ return 0;
+ }
+
++static struct attribute_group cpu_hwmon_group = {
++ .attrs = cpu_hwmon_attributes,
++ .is_visible = cpu_hwmon_is_visible,
++};
++
++static const struct attribute_group *cpu_hwmon_groups[] = {
++ &cpu_hwmon_group,
++ NULL
++};
++
+ #define CPU_THERMAL_THRESHOLD 90000
+ static struct delayed_work thermal_work;
+
+@@ -159,50 +133,31 @@ static void do_thermal_timer(struct work_struct *work)
+
+ static int __init loongson_hwmon_init(void)
+ {
+- int ret;
+-
+ pr_info("Loongson Hwmon Enter...\n");
+
+ if (cpu_has_csr())
+ csr_temp_enable = csr_readl(LOONGSON_CSR_FEATURES) &
+ LOONGSON_CSRF_TEMP;
+
+- cpu_hwmon_dev = hwmon_device_register_with_info(NULL, "cpu_hwmon", NULL, NULL, NULL);
+- if (IS_ERR(cpu_hwmon_dev)) {
+- ret = PTR_ERR(cpu_hwmon_dev);
+- pr_err("hwmon_device_register fail!\n");
+- goto fail_hwmon_device_register;
+- }
+-
+ nr_packages = loongson_sysconf.nr_cpus /
+ loongson_sysconf.cores_per_package;
+
+- ret = create_sysfs_cputemp_files(&cpu_hwmon_dev->kobj);
+- if (ret) {
+- pr_err("fail to create cpu temperature interface!\n");
+- goto fail_create_sysfs_cputemp_files;
++ cpu_hwmon_dev = hwmon_device_register_with_groups(NULL, "cpu_hwmon",
++ NULL, cpu_hwmon_groups);
++ if (IS_ERR(cpu_hwmon_dev)) {
++ pr_err("hwmon_device_register fail!\n");
++ return PTR_ERR(cpu_hwmon_dev);
+ }
+
+ INIT_DEFERRABLE_WORK(&thermal_work, do_thermal_timer);
+ schedule_delayed_work(&thermal_work, msecs_to_jiffies(20000));
+
+- return ret;
+-
+-fail_create_sysfs_cputemp_files:
+- sysfs_remove_group(&cpu_hwmon_dev->kobj,
+- &cpu_hwmon_attribute_group);
+- hwmon_device_unregister(cpu_hwmon_dev);
+-
+-fail_hwmon_device_register:
+- return ret;
++ return 0;
+ }
+
+ static void __exit loongson_hwmon_exit(void)
+ {
+ cancel_delayed_work_sync(&thermal_work);
+- remove_sysfs_cputemp_files(&cpu_hwmon_dev->kobj);
+- sysfs_remove_group(&cpu_hwmon_dev->kobj,
+- &cpu_hwmon_attribute_group);
+ hwmon_device_unregister(cpu_hwmon_dev);
+ }
+
+diff --git a/drivers/platform/x86/intel/chtwc_int33fe.c b/drivers/platform/x86/intel/chtwc_int33fe.c
+index 0de509fbf0209..c52ac23e23315 100644
+--- a/drivers/platform/x86/intel/chtwc_int33fe.c
++++ b/drivers/platform/x86/intel/chtwc_int33fe.c
+@@ -389,6 +389,8 @@ static int cht_int33fe_typec_probe(struct platform_device *pdev)
+ goto out_unregister_fusb302;
+ }
+
++ platform_set_drvdata(pdev, data);
++
+ return 0;
+
+ out_unregister_fusb302:
+diff --git a/drivers/platform/x86/intel/hid.c b/drivers/platform/x86/intel/hid.c
+index 2def562c6e1de..216d31e3403dd 100644
+--- a/drivers/platform/x86/intel/hid.c
++++ b/drivers/platform/x86/intel/hid.c
+@@ -238,7 +238,7 @@ static bool intel_hid_evaluate_method(acpi_handle handle,
+
+ method_name = (char *)intel_hid_dsm_fn_to_method[fn_index];
+
+- if (!(intel_hid_dsm_fn_mask & fn_index))
++ if (!(intel_hid_dsm_fn_mask & BIT(fn_index)))
+ goto skip_dsm_eval;
+
+ obj = acpi_evaluate_dsm_typed(handle, &intel_dsm_guid,
+diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
+index d2553970a67ba..c4d844ffad7a6 100644
+--- a/drivers/regulator/core.c
++++ b/drivers/regulator/core.c
+@@ -2133,10 +2133,13 @@ struct regulator *_regulator_get(struct device *dev, const char *id,
+ rdev->exclusive = 1;
+
+ ret = _regulator_is_enabled(rdev);
+- if (ret > 0)
++ if (ret > 0) {
+ rdev->use_count = 1;
+- else
++ regulator->enable_count = 1;
++ } else {
+ rdev->use_count = 0;
++ regulator->enable_count = 0;
++ }
+ }
+
+ link = device_link_add(dev, &rdev->dev, DL_FLAG_STATELESS);
+diff --git a/drivers/regulator/da9121-regulator.c b/drivers/regulator/da9121-regulator.c
+index eb9df485bd8aa..76e0e23bf598c 100644
+--- a/drivers/regulator/da9121-regulator.c
++++ b/drivers/regulator/da9121-regulator.c
+@@ -1030,6 +1030,8 @@ static int da9121_assign_chip_model(struct i2c_client *i2c,
+ chip->variant_id = DA9121_TYPE_DA9142;
+ regmap = &da9121_2ch_regmap_config;
+ break;
++ default:
++ return -EINVAL;
+ }
+
+ /* Set these up for of_regulator_match call which may want .of_map_modes */
+diff --git a/drivers/regulator/pfuze100-regulator.c b/drivers/regulator/pfuze100-regulator.c
+index d60d7d1b7fa25..aa55cfca9e400 100644
+--- a/drivers/regulator/pfuze100-regulator.c
++++ b/drivers/regulator/pfuze100-regulator.c
+@@ -521,6 +521,7 @@ static int pfuze_parse_regulators_dt(struct pfuze_chip *chip)
+ parent = of_get_child_by_name(np, "regulators");
+ if (!parent) {
+ dev_err(dev, "regulators node not found\n");
++ of_node_put(np);
+ return -EINVAL;
+ }
+
+@@ -550,6 +551,7 @@ static int pfuze_parse_regulators_dt(struct pfuze_chip *chip)
+ }
+
+ of_node_put(parent);
++ of_node_put(np);
+ if (ret < 0) {
+ dev_err(dev, "Error parsing regulator init data: %d\n",
+ ret);
+diff --git a/drivers/regulator/qcom_smd-regulator.c b/drivers/regulator/qcom_smd-regulator.c
+index 8490aa8eecb1a..7dff94a2eb7e9 100644
+--- a/drivers/regulator/qcom_smd-regulator.c
++++ b/drivers/regulator/qcom_smd-regulator.c
+@@ -944,32 +944,31 @@ static const struct rpm_regulator_data rpm_pm8950_regulators[] = {
+ { "s2", QCOM_SMD_RPM_SMPA, 2, &pm8950_hfsmps, "vdd_s2" },
+ { "s3", QCOM_SMD_RPM_SMPA, 3, &pm8950_hfsmps, "vdd_s3" },
+ { "s4", QCOM_SMD_RPM_SMPA, 4, &pm8950_hfsmps, "vdd_s4" },
+- { "s5", QCOM_SMD_RPM_SMPA, 5, &pm8950_ftsmps2p5, "vdd_s5" },
++ /* S5 is managed via SPMI. */
+ { "s6", QCOM_SMD_RPM_SMPA, 6, &pm8950_hfsmps, "vdd_s6" },
+
+ { "l1", QCOM_SMD_RPM_LDOA, 1, &pm8950_ult_nldo, "vdd_l1_l19" },
+ { "l2", QCOM_SMD_RPM_LDOA, 2, &pm8950_ult_nldo, "vdd_l2_l23" },
+ { "l3", QCOM_SMD_RPM_LDOA, 3, &pm8950_ult_nldo, "vdd_l3" },
+- { "l4", QCOM_SMD_RPM_LDOA, 4, &pm8950_ult_pldo, "vdd_l4_l5_l6_l7_l16" },
+- { "l5", QCOM_SMD_RPM_LDOA, 5, &pm8950_pldo_lv, "vdd_l4_l5_l6_l7_l16" },
+- { "l6", QCOM_SMD_RPM_LDOA, 6, &pm8950_pldo_lv, "vdd_l4_l5_l6_l7_l16" },
+- { "l7", QCOM_SMD_RPM_LDOA, 7, &pm8950_pldo_lv, "vdd_l4_l5_l6_l7_l16" },
++ /* L4 seems not to exist. */
++ { "l5", QCOM_SMD_RPM_LDOA, 5, &pm8950_pldo_lv, "vdd_l5_l6_l7_l16" },
++ { "l6", QCOM_SMD_RPM_LDOA, 6, &pm8950_pldo_lv, "vdd_l5_l6_l7_l16" },
++ { "l7", QCOM_SMD_RPM_LDOA, 7, &pm8950_pldo_lv, "vdd_l5_l6_l7_l16" },
+ { "l8", QCOM_SMD_RPM_LDOA, 8, &pm8950_ult_pldo, "vdd_l8_l11_l12_l17_l22" },
+ { "l9", QCOM_SMD_RPM_LDOA, 9, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18" },
+ { "l10", QCOM_SMD_RPM_LDOA, 10, &pm8950_ult_nldo, "vdd_l9_l10_l13_l14_l15_l18"},
+- { "l11", QCOM_SMD_RPM_LDOA, 11, &pm8950_ult_pldo, "vdd_l8_l11_l12_l17_l22"},
+- { "l12", QCOM_SMD_RPM_LDOA, 12, &pm8950_ult_pldo, "vdd_l8_l11_l12_l17_l22"},
+- { "l13", QCOM_SMD_RPM_LDOA, 13, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18"},
+- { "l14", QCOM_SMD_RPM_LDOA, 14, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18"},
+- { "l15", QCOM_SMD_RPM_LDOA, 15, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18"},
+- { "l16", QCOM_SMD_RPM_LDOA, 16, &pm8950_ult_pldo, "vdd_l4_l5_l6_l7_l16"},
+- { "l17", QCOM_SMD_RPM_LDOA, 17, &pm8950_ult_pldo, "vdd_l8_l11_l12_l17_l22"},
+- { "l18", QCOM_SMD_RPM_LDOA, 18, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18"},
+- { "l19", QCOM_SMD_RPM_LDOA, 18, &pm8950_pldo, "vdd_l1_l19"},
+- { "l20", QCOM_SMD_RPM_LDOA, 18, &pm8950_pldo, "vdd_l20"},
+- { "l21", QCOM_SMD_RPM_LDOA, 18, &pm8950_pldo, "vdd_l21"},
+- { "l22", QCOM_SMD_RPM_LDOA, 18, &pm8950_pldo, "vdd_l8_l11_l12_l17_l22"},
+- { "l23", QCOM_SMD_RPM_LDOA, 18, &pm8950_pldo, "vdd_l2_l23"},
++ { "l11", QCOM_SMD_RPM_LDOA, 11, &pm8950_ult_pldo, "vdd_l8_l11_l12_l17_l22" },
++ { "l12", QCOM_SMD_RPM_LDOA, 12, &pm8950_ult_pldo, "vdd_l8_l11_l12_l17_l22" },
++ { "l13", QCOM_SMD_RPM_LDOA, 13, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18" },
++ { "l14", QCOM_SMD_RPM_LDOA, 14, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18" },
++ { "l15", QCOM_SMD_RPM_LDOA, 15, &pm8950_ult_pldo, "vdd_l9_l10_l13_l14_l15_l18" },
++ { "l16", QCOM_SMD_RPM_LDOA, 16, &pm8950_ult_pldo, "vdd_l5_l6_l7_l16" },
++ { "l17", QCOM_SMD_RPM_LDOA, 17, &pm8950_ult_pldo, "vdd_l8_l11_l12_l17_l22" },
++ /* L18 seems not to exist. */
++ { "l19", QCOM_SMD_RPM_LDOA, 19, &pm8950_pldo, "vdd_l1_l19" },
++ /* L20 & L21 seem not to exist. */
++ { "l22", QCOM_SMD_RPM_LDOA, 22, &pm8950_pldo, "vdd_l8_l11_l12_l17_l22" },
++ { "l23", QCOM_SMD_RPM_LDOA, 23, &pm8950_pldo, "vdd_l2_l23" },
+ {}
+ };
+
+diff --git a/drivers/regulator/scmi-regulator.c b/drivers/regulator/scmi-regulator.c
+index 1f02f60ad1366..41ae7ac27ff6a 100644
+--- a/drivers/regulator/scmi-regulator.c
++++ b/drivers/regulator/scmi-regulator.c
+@@ -352,7 +352,7 @@ static int scmi_regulator_probe(struct scmi_device *sdev)
+ return ret;
+ }
+ }
+-
++ of_node_put(np);
+ /*
+ * Register a regulator for each valid regulator-DT-entry that we
+ * can successfully reach via SCMI and has a valid associated voltage
+diff --git a/drivers/s390/cio/chsc.c b/drivers/s390/cio/chsc.c
+index 297fb399363cc..620a917cd3a15 100644
+--- a/drivers/s390/cio/chsc.c
++++ b/drivers/s390/cio/chsc.c
+@@ -1255,7 +1255,7 @@ exit:
+ EXPORT_SYMBOL_GPL(css_general_characteristics);
+ EXPORT_SYMBOL_GPL(css_chsc_characteristics);
+
+-int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta)
++int chsc_sstpc(void *page, unsigned int op, u16 ctrl, long *clock_delta)
+ {
+ struct {
+ struct chsc_header request;
+@@ -1266,7 +1266,7 @@ int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta)
+ unsigned int rsvd2[5];
+ struct chsc_header response;
+ unsigned int rsvd3[3];
+- u64 clock_delta;
++ s64 clock_delta;
+ unsigned int rsvd4[2];
+ } *rr;
+ int rc;
+diff --git a/drivers/scsi/dc395x.c b/drivers/scsi/dc395x.c
+index 67a89715c8630..670a836a6ba19 100644
+--- a/drivers/scsi/dc395x.c
++++ b/drivers/scsi/dc395x.c
+@@ -3585,10 +3585,19 @@ static struct DeviceCtlBlk *device_alloc(struct AdapterCtlBlk *acb,
+ #endif
+ if (dcb->target_lun != 0) {
+ /* Copy settings */
+- struct DeviceCtlBlk *p;
+- list_for_each_entry(p, &acb->dcb_list, list)
+- if (p->target_id == dcb->target_id)
++ struct DeviceCtlBlk *p = NULL, *iter;
++
++ list_for_each_entry(iter, &acb->dcb_list, list)
++ if (iter->target_id == dcb->target_id) {
++ p = iter;
+ break;
++ }
++
++ if (!p) {
++ kfree(dcb);
++ return NULL;
++ }
++
+ dprintkdbg(DBG_1,
+ "device_alloc: <%02i-%i> copy from <%02i-%i>\n",
+ dcb->target_id, dcb->target_lun,
+diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c
+index 1756a0ac6f083..558f3f4e18593 100644
+--- a/drivers/scsi/fcoe/fcoe_ctlr.c
++++ b/drivers/scsi/fcoe/fcoe_ctlr.c
+@@ -1969,7 +1969,7 @@ EXPORT_SYMBOL(fcoe_ctlr_recv_flogi);
+ *
+ * Returns: u64 fc world wide name
+ */
+-u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN],
++u64 fcoe_wwn_from_mac(unsigned char mac[ETH_ALEN],
+ unsigned int scheme, unsigned int port)
+ {
+ u64 wwn;
+diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
+index 4bda2f6cb3526..849cc5fc86af6 100644
+--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
++++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
+@@ -446,6 +446,8 @@ void hisi_sas_task_deliver(struct hisi_hba *hisi_hba,
+ return;
+ }
+
++ /* Make slot memories observable before marking as ready */
++ smp_wmb();
+ WRITE_ONCE(slot->ready, 1);
+
+ spin_lock(&dq->lock);
+@@ -709,8 +711,6 @@ static int hisi_sas_init_device(struct domain_device *device)
+ struct scsi_lun lun;
+ int retry = HISI_SAS_DISK_RECOVER_CNT;
+ struct hisi_hba *hisi_hba = dev_to_hisi_hba(device);
+- struct device *dev = hisi_hba->dev;
+- struct sas_phy *local_phy;
+
+ switch (device->dev_type) {
+ case SAS_END_DEVICE:
+@@ -729,30 +729,18 @@ static int hisi_sas_init_device(struct domain_device *device)
+ case SAS_SATA_PM_PORT:
+ case SAS_SATA_PENDING:
+ /*
+- * send HARD RESET to clear previous affiliation of
+- * STP target port
++ * If an expander is swapped when a SATA disk is attached then
++ * we should issue a hard reset to clear previous affiliation
++ * of STP target port, see SPL (chapter 6.19.4).
++ *
++ * However we don't need to issue a hard reset here for these
++ * reasons:
++ * a. When probing the device, libsas/libata already issues a
++ * hard reset in sas_probe_sata() -> ata_sas_async_probe().
++ * Note that in hisi_sas_debug_I_T_nexus_reset() we take care
++ * to issue a hard reset by checking the dev status (== INIT).
++ * b. When resetting the controller, this is simply unnecessary.
+ */
+- local_phy = sas_get_local_phy(device);
+- if (!scsi_is_sas_phy_local(local_phy) &&
+- !test_bit(HISI_SAS_RESETTING_BIT, &hisi_hba->flags)) {
+- unsigned long deadline = ata_deadline(jiffies, 20000);
+- struct sata_device *sata_dev = &device->sata_dev;
+- struct ata_host *ata_host = sata_dev->ata_host;
+- struct ata_port_operations *ops = ata_host->ops;
+- struct ata_port *ap = sata_dev->ap;
+- struct ata_link *link;
+- unsigned int classes;
+-
+- ata_for_each_link(link, ap, EDGE)
+- rc = ops->hardreset(link, &classes,
+- deadline);
+- }
+- sas_put_local_phy(local_phy);
+- if (rc) {
+- dev_warn(dev, "SATA disk hardreset fail: %d\n", rc);
+- return rc;
+- }
+-
+ while (retry-- > 0) {
+ rc = hisi_sas_softreset_ata_disk(device);
+ if (!rc)
+@@ -768,15 +756,19 @@ static int hisi_sas_init_device(struct domain_device *device)
+
+ int hisi_sas_slave_alloc(struct scsi_device *sdev)
+ {
+- struct domain_device *ddev;
++ struct domain_device *ddev = sdev_to_domain_dev(sdev);
++ struct hisi_sas_device *sas_dev = ddev->lldd_dev;
+ int rc;
+
+ rc = sas_slave_alloc(sdev);
+ if (rc)
+ return rc;
+- ddev = sdev_to_domain_dev(sdev);
+
+- return hisi_sas_init_device(ddev);
++ rc = hisi_sas_init_device(ddev);
++ if (rc)
++ return rc;
++ sas_dev->dev_status = HISI_SAS_DEV_NORMAL;
++ return 0;
+ }
+ EXPORT_SYMBOL_GPL(hisi_sas_slave_alloc);
+
+@@ -826,7 +818,6 @@ static int hisi_sas_dev_found(struct domain_device *device)
+ dev_info(dev, "dev[%d:%x] found\n",
+ sas_dev->device_id, sas_dev->dev_type);
+
+- sas_dev->dev_status = HISI_SAS_DEV_NORMAL;
+ return 0;
+
+ err_out:
+diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+index 79f87d7c3e682..7d819fc0395e4 100644
+--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
++++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+@@ -1563,9 +1563,15 @@ static irqreturn_t phy_up_v3_hw(int phy_no, struct hisi_hba *hisi_hba)
+
+ phy->port_id = port_id;
+
+- /* Call pm_runtime_put_sync() with pairs in hisi_sas_phyup_pm_work() */
++ /*
++ * Call pm_runtime_get_noresume() which pairs with
++ * hisi_sas_phyup_pm_work() -> pm_runtime_put_sync().
++ * For failure call pm_runtime_put() as we are in a hardirq context.
++ */
+ pm_runtime_get_noresume(dev);
+- hisi_sas_notify_phy_event(phy, HISI_PHYE_PHY_UP_PM);
++ res = hisi_sas_notify_phy_event(phy, HISI_PHYE_PHY_UP_PM);
++ if (!res)
++ pm_runtime_put(dev);
+
+ res = IRQ_HANDLED;
+
+diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
+index 0025760230e51..da5e91a911510 100644
+--- a/drivers/scsi/lpfc/lpfc.h
++++ b/drivers/scsi/lpfc/lpfc.h
+@@ -1025,6 +1025,7 @@ struct lpfc_hba {
+ #define LS_MDS_LINK_DOWN 0x8 /* MDS Diagnostics Link Down */
+ #define LS_MDS_LOOPBACK 0x10 /* MDS Diagnostics Link Up (Loopback) */
+ #define LS_CT_VEN_RPA 0x20 /* Vendor RPA sent to switch */
++#define LS_EXTERNAL_LOOPBACK 0x40 /* External loopback plug inserted */
+
+ uint32_t hba_flag; /* hba generic flags */
+ #define HBA_ERATT_HANDLED 0x1 /* This flag is set when eratt handled */
+diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
+index 872a26376ccbb..892b3da1ba450 100644
+--- a/drivers/scsi/lpfc/lpfc_els.c
++++ b/drivers/scsi/lpfc/lpfc_els.c
+@@ -1387,6 +1387,9 @@ lpfc_issue_els_flogi(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
+
+ phba->hba_flag |= (HBA_FLOGI_ISSUED | HBA_FLOGI_OUTSTANDING);
+
++ /* Clear external loopback plug detected flag */
++ phba->link_flag &= ~LS_EXTERNAL_LOOPBACK;
++
+ /* Check for a deferred FLOGI ACC condition */
+ if (phba->defer_flogi_acc_flag) {
+ /* lookup ndlp for received FLOGI */
+@@ -1532,10 +1535,13 @@ lpfc_initial_flogi(struct lpfc_vport *vport)
+ }
+
+ if (lpfc_issue_els_flogi(vport, ndlp, 0)) {
+- /* This decrement of reference count to node shall kick off
+- * the release of the node.
++ /* A node reference should be retained while registered with a
++ * transport or dev-loss-evt work is pending.
++ * Otherwise, decrement node reference to trigger release.
+ */
+- lpfc_nlp_put(ndlp);
++ if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD)) &&
++ !(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
++ lpfc_nlp_put(ndlp);
+ return 0;
+ }
+ return 1;
+@@ -1578,10 +1584,13 @@ lpfc_initial_fdisc(struct lpfc_vport *vport)
+ }
+
+ if (lpfc_issue_els_fdisc(vport, ndlp, 0)) {
+- /* decrement node reference count to trigger the release of
+- * the node.
++ /* A node reference should be retained while registered with a
++ * transport or dev-loss-evt work is pending.
++ * Otherwise, decrement node reference to trigger release.
+ */
+- lpfc_nlp_put(ndlp);
++ if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD)) &&
++ !(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
++ lpfc_nlp_put(ndlp);
+ return 0;
+ }
+ return 1;
+@@ -1983,6 +1992,7 @@ lpfc_cmpl_els_plogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ int disc;
+ struct serv_parm *sp = NULL;
+ u32 ulp_status, ulp_word4, did, iotag;
++ bool release_node = false;
+
+ /* we pass cmdiocb to state machine which needs rspiocb as well */
+ cmdiocb->context_un.rsp_iocb = rspiocb;
+@@ -2071,19 +2081,21 @@ lpfc_cmpl_els_plogi(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ spin_unlock_irq(&ndlp->lock);
+ goto out;
+ }
+- spin_unlock_irq(&ndlp->lock);
+
+ /* No PLOGI collision and the node is not registered with the
+ * scsi or nvme transport. It is no longer an active node. Just
+ * start the device remove process.
+ */
+ if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD))) {
+- spin_lock_irq(&ndlp->lock);
+ ndlp->nlp_flag &= ~NLP_NPR_2B_DISC;
+- spin_unlock_irq(&ndlp->lock);
++ if (!(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
++ release_node = true;
++ }
++ spin_unlock_irq(&ndlp->lock);
++
++ if (release_node)
+ lpfc_disc_state_machine(vport, ndlp, cmdiocb,
+ NLP_EVT_DEVICE_RM);
+- }
+ } else {
+ /* Good status, call state machine */
+ prsp = list_entry(((struct lpfc_dmabuf *)
+@@ -2294,6 +2306,7 @@ lpfc_cmpl_els_prli(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ u32 loglevel;
+ u32 ulp_status;
+ u32 ulp_word4;
++ bool release_node = false;
+
+ /* we pass cmdiocb to state machine which needs rspiocb as well */
+ cmdiocb->context_un.rsp_iocb = rspiocb;
+@@ -2370,14 +2383,18 @@ lpfc_cmpl_els_prli(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ * it is no longer an active node. Otherwise devloss
+ * handles the final cleanup.
+ */
++ spin_lock_irq(&ndlp->lock);
+ if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD)) &&
+ !ndlp->fc4_prli_sent) {
+- spin_lock_irq(&ndlp->lock);
+ ndlp->nlp_flag &= ~NLP_NPR_2B_DISC;
+- spin_unlock_irq(&ndlp->lock);
++ if (!(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
++ release_node = true;
++ }
++ spin_unlock_irq(&ndlp->lock);
++
++ if (release_node)
+ lpfc_disc_state_machine(vport, ndlp, cmdiocb,
+ NLP_EVT_DEVICE_RM);
+- }
+ } else {
+ /* Good status, call state machine. However, if another
+ * PRLI is outstanding, don't call the state machine
+@@ -2749,6 +2766,7 @@ lpfc_cmpl_els_adisc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ struct lpfc_nodelist *ndlp;
+ int disc;
+ u32 ulp_status, ulp_word4, tmo;
++ bool release_node = false;
+
+ /* we pass cmdiocb to state machine which needs rspiocb as well */
+ cmdiocb->context_un.rsp_iocb = rspiocb;
+@@ -2815,13 +2833,17 @@ lpfc_cmpl_els_adisc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ * transport, it is no longer an active node. Otherwise
+ * devloss handles the final cleanup.
+ */
++ spin_lock_irq(&ndlp->lock);
+ if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD))) {
+- spin_lock_irq(&ndlp->lock);
+ ndlp->nlp_flag &= ~NLP_NPR_2B_DISC;
+- spin_unlock_irq(&ndlp->lock);
++ if (!(ndlp->nlp_flag & NLP_IN_DEV_LOSS))
++ release_node = true;
++ }
++ spin_unlock_irq(&ndlp->lock);
++
++ if (release_node)
+ lpfc_disc_state_machine(vport, ndlp, cmdiocb,
+ NLP_EVT_DEVICE_RM);
+- }
+ } else
+ /* Good status, call state machine */
+ lpfc_disc_state_machine(vport, ndlp, cmdiocb,
+@@ -3855,9 +3877,6 @@ lpfc_least_capable_settings(struct lpfc_hba *phba,
+ {
+ u32 rsp_sig_cap = 0, drv_sig_cap = 0;
+ u32 rsp_sig_freq_cyc = 0, rsp_sig_freq_scale = 0;
+- struct lpfc_cgn_info *cp;
+- u32 crc;
+- u16 sig_freq;
+
+ /* Get rsp signal and frequency capabilities. */
+ rsp_sig_cap = be32_to_cpu(pcgd->xmt_signal_capability);
+@@ -3913,25 +3932,7 @@ lpfc_least_capable_settings(struct lpfc_hba *phba,
+ }
+ }
+
+- if (!phba->cgn_i)
+- return;
+-
+- /* Update signal frequency in congestion info buffer */
+- cp = (struct lpfc_cgn_info *)phba->cgn_i->virt;
+-
+- /* Frequency (in ms) Signal Warning/Signal Congestion Notifications
+- * are received by the HBA
+- */
+- sig_freq = phba->cgn_sig_freq;
+-
+- if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ONLY)
+- cp->cgn_warn_freq = cpu_to_le16(sig_freq);
+- if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM) {
+- cp->cgn_alarm_freq = cpu_to_le16(sig_freq);
+- cp->cgn_warn_freq = cpu_to_le16(sig_freq);
+- }
+- crc = lpfc_cgn_calc_crc32(cp, LPFC_CGN_INFO_SZ, LPFC_CGN_CRC32_SEED);
+- cp->cgn_info_crc = cpu_to_le32(crc);
++ /* We are NOT recording signal frequency in congestion info buffer */
+ return;
+
+ out_no_support:
+@@ -8163,6 +8164,9 @@ lpfc_els_rcv_flogi(struct lpfc_vport *vport, struct lpfc_iocbq *cmdiocb,
+ uint32_t fc_flag = 0;
+ uint32_t port_state = 0;
+
++ /* Clear external loopback plug detected flag */
++ phba->link_flag &= ~LS_EXTERNAL_LOOPBACK;
++
+ cmd = *lp++;
+ sp = (struct serv_parm *) lp;
+
+@@ -8214,6 +8218,12 @@ lpfc_els_rcv_flogi(struct lpfc_vport *vport, struct lpfc_iocbq *cmdiocb,
+ return 1;
+ }
+
++ /* External loopback plug insertion detected */
++ phba->link_flag |= LS_EXTERNAL_LOOPBACK;
++
++ lpfc_printf_vlog(vport, KERN_INFO, LOG_ELS | LOG_LIBDFC,
++ "1119 External Loopback plug detected\n");
++
+ /* abort the flogi coming back to ourselves
+ * due to external loopback on the port.
+ */
+@@ -9940,11 +9950,14 @@ lpfc_els_rcv_fpin_cgn(struct lpfc_hba *phba, struct fc_tlv_desc *tlv)
+ /* Take action here for an Alarm event */
+ if (phba->cmf_active_mode != LPFC_CFG_OFF) {
+ if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_ALARM) {
+- /* Track of alarm cnt for cgn_info */
+- atomic_inc(&phba->cgn_fabric_alarm_cnt);
+ /* Track of alarm cnt for SYNC_WQE */
+ atomic_inc(&phba->cgn_sync_alarm_cnt);
+ }
++ /* Track alarm cnt for cgn_info regardless
++ * of whether CMF is configured for Signals
++ * or FPINs.
++ */
++ atomic_inc(&phba->cgn_fabric_alarm_cnt);
+ goto cleanup;
+ }
+ break;
+@@ -9952,11 +9965,14 @@ lpfc_els_rcv_fpin_cgn(struct lpfc_hba *phba, struct fc_tlv_desc *tlv)
+ /* Take action here for a Warning event */
+ if (phba->cmf_active_mode != LPFC_CFG_OFF) {
+ if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_WARN) {
+- /* Track of warning cnt for cgn_info */
+- atomic_inc(&phba->cgn_fabric_warn_cnt);
+ /* Track of warning cnt for SYNC_WQE */
+ atomic_inc(&phba->cgn_sync_warn_cnt);
+ }
++ /* Track warning cnt and freq for cgn_info
++ * regardless of whether CMF is configured for
++ * Signals or FPINs.
++ */
++ atomic_inc(&phba->cgn_fabric_warn_cnt);
+ cleanup:
+ /* Save frequency in ms */
+ phba->cgn_fpin_frequency =
+@@ -9965,14 +9981,10 @@ cleanup:
+ if (phba->cgn_i) {
+ cp = (struct lpfc_cgn_info *)
+ phba->cgn_i->virt;
+- if (phba->cgn_reg_fpin &
+- LPFC_CGN_FPIN_ALARM)
+- cp->cgn_alarm_freq =
+- cpu_to_le16(value);
+- if (phba->cgn_reg_fpin &
+- LPFC_CGN_FPIN_WARN)
+- cp->cgn_warn_freq =
+- cpu_to_le16(value);
++ cp->cgn_alarm_freq =
++ cpu_to_le16(value);
++ cp->cgn_warn_freq =
++ cpu_to_le16(value);
+ crc = lpfc_cgn_calc_crc32
+ (cp,
+ LPFC_CGN_INFO_SZ,
+diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
+index 2b877dff5ed4f..6b6b3790d7b58 100644
+--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
++++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
+@@ -1221,6 +1221,9 @@ lpfc_linkdown(struct lpfc_hba *phba)
+
+ phba->defer_flogi_acc_flag = false;
+
++ /* Clear external loopback plug detected flag */
++ phba->link_flag &= ~LS_EXTERNAL_LOOPBACK;
++
+ spin_lock_irq(&phba->hbalock);
+ phba->fcf.fcf_flag &= ~(FCF_AVAILABLE | FCF_SCAN_DONE);
+ spin_unlock_irq(&phba->hbalock);
+diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
+index 461d333b1b3a8..011849c1ed3c9 100644
+--- a/drivers/scsi/lpfc/lpfc_init.c
++++ b/drivers/scsi/lpfc/lpfc_init.c
+@@ -5866,21 +5866,8 @@ lpfc_cgn_save_evt_cnt(struct lpfc_hba *phba)
+
+ /* Use the frequency found in the last rcv'ed FPIN */
+ value = phba->cgn_fpin_frequency;
+- if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_WARN)
+- cp->cgn_warn_freq = cpu_to_le16(value);
+- if (phba->cgn_reg_fpin & LPFC_CGN_FPIN_ALARM)
+- cp->cgn_alarm_freq = cpu_to_le16(value);
+-
+- /* Frequency (in ms) Signal Warning/Signal Congestion Notifications
+- * are received by the HBA
+- */
+- value = phba->cgn_sig_freq;
+-
+- if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ONLY ||
+- phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM)
+- cp->cgn_warn_freq = cpu_to_le16(value);
+- if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM)
+- cp->cgn_alarm_freq = cpu_to_le16(value);
++ cp->cgn_warn_freq = cpu_to_le16(value);
++ cp->cgn_alarm_freq = cpu_to_le16(value);
+
+ lvalue = lpfc_cgn_calc_crc32(cp, LPFC_CGN_INFO_SZ,
+ LPFC_CGN_CRC32_SEED);
+@@ -6595,9 +6582,6 @@ lpfc_sli4_async_sli_evt(struct lpfc_hba *phba, struct lpfc_acqe_sli *acqe_sli)
+ /* Alarm overrides warning, so check that first */
+ if (cgn_signal->alarm_cnt) {
+ if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM) {
+- /* Keep track of alarm cnt for cgn_info */
+- atomic_add(cgn_signal->alarm_cnt,
+- &phba->cgn_fabric_alarm_cnt);
+ /* Keep track of alarm cnt for CMF_SYNC_WQE */
+ atomic_add(cgn_signal->alarm_cnt,
+ &phba->cgn_sync_alarm_cnt);
+@@ -6606,8 +6590,6 @@ lpfc_sli4_async_sli_evt(struct lpfc_hba *phba, struct lpfc_acqe_sli *acqe_sli)
+ /* signal action needs to be taken */
+ if (phba->cgn_reg_signal == EDC_CG_SIG_WARN_ONLY ||
+ phba->cgn_reg_signal == EDC_CG_SIG_WARN_ALARM) {
+- /* Keep track of warning cnt for cgn_info */
+- atomic_add(cnt, &phba->cgn_fabric_warn_cnt);
+ /* Keep track of warning cnt for CMF_SYNC_WQE */
+ atomic_add(cnt, &phba->cgn_sync_warn_cnt);
+ }
+@@ -15700,34 +15682,7 @@ void lpfc_dmp_dbg(struct lpfc_hba *phba)
+ unsigned int temp_idx;
+ int i;
+ int j = 0;
+- unsigned long rem_nsec, iflags;
+- bool log_verbose = false;
+- struct lpfc_vport *port_iterator;
+-
+- /* Don't dump messages if we explicitly set log_verbose for the
+- * physical port or any vport.
+- */
+- if (phba->cfg_log_verbose)
+- return;
+-
+- spin_lock_irqsave(&phba->port_list_lock, iflags);
+- list_for_each_entry(port_iterator, &phba->port_list, listentry) {
+- if (port_iterator->load_flag & FC_UNLOADING)
+- continue;
+- if (scsi_host_get(lpfc_shost_from_vport(port_iterator))) {
+- if (port_iterator->cfg_log_verbose)
+- log_verbose = true;
+-
+- scsi_host_put(lpfc_shost_from_vport(port_iterator));
+-
+- if (log_verbose) {
+- spin_unlock_irqrestore(&phba->port_list_lock,
+- iflags);
+- return;
+- }
+- }
+- }
+- spin_unlock_irqrestore(&phba->port_list_lock, iflags);
++ unsigned long rem_nsec;
+
+ if (atomic_cmpxchg(&phba->dbg_log_dmping, 0, 1) != 0)
+ return;
+diff --git a/drivers/scsi/lpfc/lpfc_logmsg.h b/drivers/scsi/lpfc/lpfc_logmsg.h
+index 7d480c7987942..a5aafe230c74f 100644
+--- a/drivers/scsi/lpfc/lpfc_logmsg.h
++++ b/drivers/scsi/lpfc/lpfc_logmsg.h
+@@ -73,7 +73,7 @@ do { \
+ #define lpfc_printf_vlog(vport, level, mask, fmt, arg...) \
+ do { \
+ { if (((mask) & (vport)->cfg_log_verbose) || (level[1] <= '3')) { \
+- if ((mask) & LOG_TRACE_EVENT) \
++ if ((mask) & LOG_TRACE_EVENT && !(vport)->cfg_log_verbose) \
+ lpfc_dmp_dbg((vport)->phba); \
+ dev_printk(level, &((vport)->phba->pcidev)->dev, "%d:(%d):" \
+ fmt, (vport)->phba->brd_no, vport->vpi, ##arg); \
+@@ -89,11 +89,11 @@ do { \
+ (phba)->pport->cfg_log_verbose : \
+ (phba)->cfg_log_verbose; \
+ if (((mask) & log_verbose) || (level[1] <= '3')) { \
+- if ((mask) & LOG_TRACE_EVENT) \
++ if ((mask) & LOG_TRACE_EVENT && !log_verbose) \
+ lpfc_dmp_dbg(phba); \
+ dev_printk(level, &((phba)->pcidev)->dev, "%d:" \
+ fmt, phba->brd_no, ##arg); \
+- } else if (!(phba)->cfg_log_verbose)\
++ } else if (!log_verbose)\
+ lpfc_dbg_print(phba, "%d:" fmt, phba->brd_no, ##arg); \
+ } \
+ } while (0)
+diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c
+index c4e1a07066a2e..4b065c51ee1b0 100644
+--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
++++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
+@@ -614,9 +614,15 @@ lpfc_rcv_plogi(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
+ stat.un.b.lsRjtRsnCode = LSRJT_INVALID_CMD;
+ stat.un.b.lsRjtRsnCodeExp = LSEXP_NOTHING_MORE;
+ rc = lpfc_els_rsp_reject(vport, stat.un.lsRjtError, cmdiocb,
+- ndlp, login_mbox);
+- if (rc)
++ ndlp, login_mbox);
++ if (rc) {
++ mp = (struct lpfc_dmabuf *)login_mbox->ctx_buf;
++ if (mp) {
++ lpfc_mbuf_free(phba, mp->virt, mp->phys);
++ kfree(mp);
++ }
+ mempool_free(login_mbox, phba->mbox_mem_pool);
++ }
+ return 1;
+ }
+
+diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
+index ba9dbb51b75f0..f617a2ef6b0f4 100644
+--- a/drivers/scsi/lpfc/lpfc_scsi.c
++++ b/drivers/scsi/lpfc/lpfc_scsi.c
+@@ -3835,7 +3835,7 @@ lpfc_update_cmf_cmpl(struct lpfc_hba *phba,
+ else
+ time = div_u64(time + 500, 1000); /* round it */
+
+- cgs = this_cpu_ptr(phba->cmf_stat);
++ cgs = per_cpu_ptr(phba->cmf_stat, raw_smp_processor_id());
+ atomic64_add(size, &cgs->rcv_bytes);
+ atomic64_add(time, &cgs->rx_latency);
+ atomic_inc(&cgs->rx_io_cnt);
+@@ -3879,7 +3879,7 @@ lpfc_update_cmf_cmd(struct lpfc_hba *phba, uint32_t size)
+ atomic_set(&phba->rx_max_read_cnt, size);
+ }
+
+- cgs = this_cpu_ptr(phba->cmf_stat);
++ cgs = per_cpu_ptr(phba->cmf_stat, raw_smp_processor_id());
+ atomic64_add(size, &cgs->total_bytes);
+ return 0;
+ }
+@@ -5864,25 +5864,25 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
+ if (!lpfc_cmd)
+ return ret;
+
+- spin_lock_irqsave(&phba->hbalock, flags);
++ /* Guard against IO completion being called at same time */
++ spin_lock_irqsave(&lpfc_cmd->buf_lock, flags);
++
++ spin_lock(&phba->hbalock);
+ /* driver queued commands are in process of being flushed */
+ if (phba->hba_flag & HBA_IOQ_FLUSH) {
+ lpfc_printf_vlog(vport, KERN_WARNING, LOG_FCP,
+ "3168 SCSI Layer abort requested I/O has been "
+ "flushed by LLD.\n");
+ ret = FAILED;
+- goto out_unlock;
++ goto out_unlock_hba;
+ }
+
+- /* Guard against IO completion being called at same time */
+- spin_lock(&lpfc_cmd->buf_lock);
+-
+ if (!lpfc_cmd->pCmd) {
+ lpfc_printf_vlog(vport, KERN_WARNING, LOG_FCP,
+ "2873 SCSI Layer I/O Abort Request IO CMPL Status "
+ "x%x ID %d LUN %llu\n",
+ SUCCESS, cmnd->device->id, cmnd->device->lun);
+- goto out_unlock_buf;
++ goto out_unlock_hba;
+ }
+
+ iocb = &lpfc_cmd->cur_iocbq;
+@@ -5890,7 +5890,7 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
+ pring_s4 = phba->sli4_hba.hdwq[iocb->hba_wqidx].io_wq->pring;
+ if (!pring_s4) {
+ ret = FAILED;
+- goto out_unlock_buf;
++ goto out_unlock_hba;
+ }
+ spin_lock(&pring_s4->ring_lock);
+ }
+@@ -5923,8 +5923,8 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
+ "3389 SCSI Layer I/O Abort Request is pending\n");
+ if (phba->sli_rev == LPFC_SLI_REV4)
+ spin_unlock(&pring_s4->ring_lock);
+- spin_unlock(&lpfc_cmd->buf_lock);
+- spin_unlock_irqrestore(&phba->hbalock, flags);
++ spin_unlock(&phba->hbalock);
++ spin_unlock_irqrestore(&lpfc_cmd->buf_lock, flags);
+ goto wait_for_cmpl;
+ }
+
+@@ -5945,15 +5945,13 @@ lpfc_abort_handler(struct scsi_cmnd *cmnd)
+ if (ret_val != IOCB_SUCCESS) {
+ /* Indicate the IO is not being aborted by the driver. */
+ lpfc_cmd->waitq = NULL;
+- spin_unlock(&lpfc_cmd->buf_lock);
+- spin_unlock_irqrestore(&phba->hbalock, flags);
+ ret = FAILED;
+- goto out;
++ goto out_unlock_hba;
+ }
+
+ /* no longer need the lock after this point */
+- spin_unlock(&lpfc_cmd->buf_lock);
+- spin_unlock_irqrestore(&phba->hbalock, flags);
++ spin_unlock(&phba->hbalock);
++ spin_unlock_irqrestore(&lpfc_cmd->buf_lock, flags);
+
+ if (phba->cfg_poll & DISABLE_FCP_RING_INT)
+ lpfc_sli_handle_fast_ring_event(phba,
+@@ -5988,10 +5986,9 @@ wait_for_cmpl:
+ out_unlock_ring:
+ if (phba->sli_rev == LPFC_SLI_REV4)
+ spin_unlock(&pring_s4->ring_lock);
+-out_unlock_buf:
+- spin_unlock(&lpfc_cmd->buf_lock);
+-out_unlock:
+- spin_unlock_irqrestore(&phba->hbalock, flags);
++out_unlock_hba:
++ spin_unlock(&phba->hbalock);
++ spin_unlock_irqrestore(&lpfc_cmd->buf_lock, flags);
+ out:
+ lpfc_printf_vlog(vport, KERN_WARNING, LOG_FCP,
+ "0749 SCSI Layer I/O Abort Request Status x%x ID %d "
+diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
+index 6adaf79e67cc0..331241a71452f 100644
+--- a/drivers/scsi/lpfc/lpfc_sli.c
++++ b/drivers/scsi/lpfc/lpfc_sli.c
+@@ -1373,7 +1373,7 @@ static void
+ __lpfc_sli_release_iocbq_s4(struct lpfc_hba *phba, struct lpfc_iocbq *iocbq)
+ {
+ struct lpfc_sglq *sglq;
+- size_t start_clean = offsetof(struct lpfc_iocbq, iocb);
++ size_t start_clean = offsetof(struct lpfc_iocbq, wqe);
+ unsigned long iflag = 0;
+ struct lpfc_sli_ring *pring;
+
+@@ -10800,24 +10800,15 @@ __lpfc_sli_prep_xmit_seq64_s4(struct lpfc_iocbq *cmdiocbq,
+ {
+ union lpfc_wqe128 *wqe;
+ struct ulp_bde64 *bpl;
+- struct ulp_bde64_le *bde;
+
+ wqe = &cmdiocbq->wqe;
+ memset(wqe, 0, sizeof(*wqe));
+
+ /* Words 0 - 2 */
+ bpl = (struct ulp_bde64 *)bmp->virt;
+- if (cmdiocbq->cmd_flag & (LPFC_IO_LIBDFC | LPFC_IO_LOOPBACK)) {
+- wqe->xmit_sequence.bde.addrHigh = bpl->addrHigh;
+- wqe->xmit_sequence.bde.addrLow = bpl->addrLow;
+- wqe->xmit_sequence.bde.tus.w = bpl->tus.w;
+- } else {
+- bde = (struct ulp_bde64_le *)&wqe->xmit_sequence.bde;
+- bde->addr_low = cpu_to_le32(putPaddrLow(bmp->phys));
+- bde->addr_high = cpu_to_le32(putPaddrHigh(bmp->phys));
+- bde->type_size = cpu_to_le32(bpl->tus.f.bdeSize);
+- bde->type_size |= cpu_to_le32(ULP_BDE64_TYPE_BDE_64);
+- }
++ wqe->xmit_sequence.bde.addrHigh = bpl->addrHigh;
++ wqe->xmit_sequence.bde.addrLow = bpl->addrLow;
++ wqe->xmit_sequence.bde.tus.w = bpl->tus.w;
+
+ /* Word 5 */
+ bf_set(wqe_ls, &wqe->xmit_sequence.wge_ctl, last_seq);
+@@ -12066,6 +12057,8 @@ lpfc_ignore_els_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ {
+ struct lpfc_nodelist *ndlp = NULL;
+ IOCB_t *irsp;
++ LPFC_MBOXQ_t *mbox;
++ struct lpfc_dmabuf *mp;
+ u32 ulp_command, ulp_status, ulp_word4, iotag;
+
+ ulp_command = get_job_cmnd(phba, cmdiocb);
+@@ -12077,6 +12070,21 @@ lpfc_ignore_els_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ } else {
+ irsp = &rspiocb->iocb;
+ iotag = irsp->ulpIoTag;
++
++ /* It is possible a PLOGI_RJT for NPIV ports to get aborted.
++ * The MBX_REG_LOGIN64 mbox command is freed back to the
++ * mbox_mem_pool here.
++ */
++ if (cmdiocb->context_un.mbox) {
++ mbox = cmdiocb->context_un.mbox;
++ mp = (struct lpfc_dmabuf *)mbox->ctx_buf;
++ if (mp) {
++ lpfc_mbuf_free(phba, mp->virt, mp->phys);
++ kfree(mp);
++ }
++ mempool_free(mbox, phba->mbox_mem_pool);
++ cmdiocb->context_un.mbox = NULL;
++ }
+ }
+
+ /* ELS cmd tag <ulpIoTag> completes */
+@@ -12185,7 +12193,8 @@ lpfc_sli_issue_abort_iotag(struct lpfc_hba *phba, struct lpfc_sli_ring *pring,
+
+ if (phba->link_state < LPFC_LINK_UP ||
+ (phba->sli_rev == LPFC_SLI_REV4 &&
+- phba->sli4_hba.link_state.status == LPFC_FC_LA_TYPE_LINK_DOWN))
++ phba->sli4_hba.link_state.status == LPFC_FC_LA_TYPE_LINK_DOWN) ||
++ (phba->link_flag & LS_EXTERNAL_LOOPBACK))
+ ia = true;
+ else
+ ia = false;
+@@ -12644,7 +12653,8 @@ lpfc_sli_abort_taskmgmt(struct lpfc_vport *vport, struct lpfc_sli_ring *pring,
+ ndlp = lpfc_cmd->rdata->pnode;
+
+ if (lpfc_is_link_up(phba) &&
+- (ndlp && ndlp->nlp_state == NLP_STE_MAPPED_NODE))
++ (ndlp && ndlp->nlp_state == NLP_STE_MAPPED_NODE) &&
++ !(phba->link_flag & LS_EXTERNAL_LOOPBACK))
+ ia = false;
+ else
+ ia = true;
+@@ -18107,7 +18117,6 @@ lpfc_fc_frame_check(struct lpfc_hba *phba, struct fc_frame_header *fc_hdr)
+ case FC_RCTL_ELS_REP: /* extended link services reply */
+ case FC_RCTL_ELS4_REQ: /* FC-4 ELS request */
+ case FC_RCTL_ELS4_REP: /* FC-4 ELS reply */
+- case FC_RCTL_BA_NOP: /* basic link service NOP */
+ case FC_RCTL_BA_ABTS: /* basic link service abort */
+ case FC_RCTL_BA_RMC: /* remove connection */
+ case FC_RCTL_BA_ACC: /* basic accept */
+@@ -18128,6 +18137,7 @@ lpfc_fc_frame_check(struct lpfc_hba *phba, struct fc_frame_header *fc_hdr)
+ fc_vft_hdr = (struct fc_vft_header *)fc_hdr;
+ fc_hdr = &((struct fc_frame_header *)fc_vft_hdr)[1];
+ return lpfc_fc_frame_check(phba, fc_hdr);
++ case FC_RCTL_BA_NOP: /* basic link service NOP */
+ default:
+ goto drop;
+ }
+@@ -18942,12 +18952,14 @@ lpfc_sli4_send_seq_to_ulp(struct lpfc_vport *vport,
+ if (!lpfc_complete_unsol_iocb(phba,
+ phba->sli4_hba.els_wq->pring,
+ iocbq, fc_hdr->fh_r_ctl,
+- fc_hdr->fh_type))
++ fc_hdr->fh_type)) {
+ lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,
+ "2540 Ring %d handler: unexpected Rctl "
+ "x%x Type x%x received\n",
+ LPFC_ELS_RING,
+ fc_hdr->fh_r_ctl, fc_hdr->fh_type);
++ lpfc_in_buf_free(phba, &seq_dmabuf->dbuf);
++ }
+
+ /* Free iocb created in lpfc_prep_seq */
+ list_for_each_entry_safe(curr_iocb, next_iocb,
+@@ -21107,7 +21119,7 @@ lpfc_sli4_issue_abort_iotag(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ abtswqe = &abtsiocb->wqe;
+ memset(abtswqe, 0, sizeof(*abtswqe));
+
+- if (!lpfc_is_link_up(phba))
++ if (!lpfc_is_link_up(phba) || (phba->link_flag & LS_EXTERNAL_LOOPBACK))
+ bf_set(abort_cmd_ia, &abtswqe->abort_cmd, 1);
+ bf_set(abort_cmd_criteria, &abtswqe->abort_cmd, T_XRI_TAG);
+ abtswqe->abort_cmd.rsrvd5 = 0;
+diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
+index a5d8cee2d5106..bf491af9f0d65 100644
+--- a/drivers/scsi/megaraid.c
++++ b/drivers/scsi/megaraid.c
+@@ -4607,7 +4607,7 @@ static int __init megaraid_init(void)
+ * major number allocation.
+ */
+ major = register_chrdev(0, "megadev_legacy", &megadev_fops);
+- if (!major) {
++ if (major < 0) {
+ printk(KERN_WARNING
+ "megaraid: failed to register char device\n");
+ }
+diff --git a/drivers/scsi/ufs/ti-j721e-ufs.c b/drivers/scsi/ufs/ti-j721e-ufs.c
+index eafe0db98d542..122d650d08102 100644
+--- a/drivers/scsi/ufs/ti-j721e-ufs.c
++++ b/drivers/scsi/ufs/ti-j721e-ufs.c
+@@ -29,11 +29,9 @@ static int ti_j721e_ufs_probe(struct platform_device *pdev)
+ return PTR_ERR(regbase);
+
+ pm_runtime_enable(dev);
+- ret = pm_runtime_get_sync(dev);
+- if (ret < 0) {
+- pm_runtime_put_noidle(dev);
++ ret = pm_runtime_resume_and_get(dev);
++ if (ret < 0)
+ goto disable_pm;
+- }
+
+ /* Select MPHY refclk frequency */
+ clk = devm_clk_get(dev, NULL);
+diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
+index 586c0e567ff9a..e1b6c9e7a7f25 100644
+--- a/drivers/scsi/ufs/ufs-qcom.c
++++ b/drivers/scsi/ufs/ufs-qcom.c
+@@ -641,12 +641,7 @@ static int ufs_qcom_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
+ return err;
+ }
+
+- err = ufs_qcom_ice_resume(host);
+- if (err)
+- return err;
+-
+- hba->is_sys_suspended = false;
+- return 0;
++ return ufs_qcom_ice_resume(host);
+ }
+
+ static void ufs_qcom_dev_ref_clk_ctrl(struct ufs_qcom_host *host, bool enable)
+@@ -687,8 +682,11 @@ static void ufs_qcom_dev_ref_clk_ctrl(struct ufs_qcom_host *host, bool enable)
+
+ writel_relaxed(temp, host->dev_ref_clk_ctrl_mmio);
+
+- /* ensure that ref_clk is enabled/disabled before we return */
+- wmb();
++ /*
++ * Make sure the write to ref_clk reaches the destination and
++ * not stored in a Write Buffer (WB).
++ */
++ readl(host->dev_ref_clk_ctrl_mmio);
+
+ /*
+ * If we call hibern8 exit after this, we need to make sure that
+diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
+index 3f9caafa91bfa..4c9eb4be449ca 100644
+--- a/drivers/scsi/ufs/ufshcd.c
++++ b/drivers/scsi/ufs/ufshcd.c
+@@ -113,8 +113,13 @@ int ufshcd_dump_regs(struct ufs_hba *hba, size_t offset, size_t len,
+ if (!regs)
+ return -ENOMEM;
+
+- for (pos = 0; pos < len; pos += 4)
++ for (pos = 0; pos < len; pos += 4) {
++ if (offset == 0 &&
++ pos >= REG_UIC_ERROR_CODE_PHY_ADAPTER_LAYER &&
++ pos <= REG_UIC_ERROR_CODE_DME)
++ continue;
+ regs[pos / 4] = ufshcd_readl(hba, offset + pos);
++ }
+
+ ufshcd_hex_dump(prefix, regs, len);
+ kfree(regs);
+diff --git a/drivers/soc/bcm/bcm63xx/bcm-pmb.c b/drivers/soc/bcm/bcm63xx/bcm-pmb.c
+index 7bbe46ea5f945..9407cac47fdbe 100644
+--- a/drivers/soc/bcm/bcm63xx/bcm-pmb.c
++++ b/drivers/soc/bcm/bcm63xx/bcm-pmb.c
+@@ -312,6 +312,9 @@ static int bcm_pmb_probe(struct platform_device *pdev)
+ for (e = table; e->name; e++) {
+ struct bcm_pmb_pm_domain *pd = devm_kzalloc(dev, sizeof(*pd), GFP_KERNEL);
+
++ if (!pd)
++ return -ENOMEM;
++
+ pd->pmb = pmb;
+ pd->data = e;
+ pd->genpd.name = e->name;
+diff --git a/drivers/soc/qcom/llcc-qcom.c b/drivers/soc/qcom/llcc-qcom.c
+index eecafeded56ff..85ba8209b1826 100644
+--- a/drivers/soc/qcom/llcc-qcom.c
++++ b/drivers/soc/qcom/llcc-qcom.c
+@@ -749,6 +749,7 @@ static const struct of_device_id qcom_llcc_of_match[] = {
+ { .compatible = "qcom,sm8450-llcc", .data = &sm8450_cfg },
+ { }
+ };
++MODULE_DEVICE_TABLE(of, qcom_llcc_of_match);
+
+ static struct platform_driver qcom_llcc_driver = {
+ .driver = {
+diff --git a/drivers/soc/qcom/smp2p.c b/drivers/soc/qcom/smp2p.c
+index 4a157240f419e..59dbf4b61e6c2 100644
+--- a/drivers/soc/qcom/smp2p.c
++++ b/drivers/soc/qcom/smp2p.c
+@@ -493,6 +493,7 @@ static int smp2p_parse_ipc(struct qcom_smp2p *smp2p)
+ }
+
+ smp2p->ipc_regmap = syscon_node_to_regmap(syscon);
++ of_node_put(syscon);
+ if (IS_ERR(smp2p->ipc_regmap))
+ return PTR_ERR(smp2p->ipc_regmap);
+
+diff --git a/drivers/soc/qcom/smsm.c b/drivers/soc/qcom/smsm.c
+index ef15d014c03a3..9df9bba242f3e 100644
+--- a/drivers/soc/qcom/smsm.c
++++ b/drivers/soc/qcom/smsm.c
+@@ -374,6 +374,7 @@ static int smsm_parse_ipc(struct qcom_smsm *smsm, unsigned host_id)
+ return 0;
+
+ host->ipc_regmap = syscon_node_to_regmap(syscon);
++ of_node_put(syscon);
+ if (IS_ERR(host->ipc_regmap))
+ return PTR_ERR(host->ipc_regmap);
+
+diff --git a/drivers/soc/ti/ti_sci_pm_domains.c b/drivers/soc/ti/ti_sci_pm_domains.c
+index 8afb3f45d2637..a33ec7eaf23d1 100644
+--- a/drivers/soc/ti/ti_sci_pm_domains.c
++++ b/drivers/soc/ti/ti_sci_pm_domains.c
+@@ -183,6 +183,8 @@ static int ti_sci_pm_domain_probe(struct platform_device *pdev)
+ devm_kcalloc(dev, max_id + 1,
+ sizeof(*pd_provider->data.domains),
+ GFP_KERNEL);
++ if (!pd_provider->data.domains)
++ return -ENOMEM;
+
+ pd_provider->data.num_domains = max_id + 1;
+ pd_provider->data.xlate = ti_sci_pd_xlate;
+diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
+index 19686fb47bb35..ec53b807909e7 100644
+--- a/drivers/spi/spi-cadence-quadspi.c
++++ b/drivers/spi/spi-cadence-quadspi.c
+@@ -1865,7 +1865,7 @@ static const struct cqspi_driver_platdata intel_lgm_qspi = {
+ };
+
+ static const struct cqspi_driver_platdata socfpga_qspi = {
+- .quirks = CQSPI_NO_SUPPORT_WR_COMPLETION,
++ .quirks = CQSPI_DISABLE_DAC_MODE | CQSPI_NO_SUPPORT_WR_COMPLETION,
+ };
+
+ static const struct cqspi_driver_platdata versal_ospi = {
+diff --git a/drivers/spi/spi-fsl-qspi.c b/drivers/spi/spi-fsl-qspi.c
+index 9851551ebbe05..46ae46a944c5c 100644
+--- a/drivers/spi/spi-fsl-qspi.c
++++ b/drivers/spi/spi-fsl-qspi.c
+@@ -876,6 +876,10 @@ static int fsl_qspi_probe(struct platform_device *pdev)
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+ "QuadSPI-memory");
++ if (!res) {
++ ret = -EINVAL;
++ goto err_put_ctrl;
++ }
+ q->memmap_phy = res->start;
+ /* Since there are 4 cs, map size required is 4 times ahb_buf_size */
+ q->ahb_addr = devm_ioremap(dev, q->memmap_phy,
+diff --git a/drivers/spi/spi-img-spfi.c b/drivers/spi/spi-img-spfi.c
+index 5f05d519fbbd0..71376b6df89db 100644
+--- a/drivers/spi/spi-img-spfi.c
++++ b/drivers/spi/spi-img-spfi.c
+@@ -731,7 +731,7 @@ static int img_spfi_resume(struct device *dev)
+ int ret;
+
+ ret = pm_runtime_get_sync(dev);
+- if (ret) {
++ if (ret < 0) {
+ pm_runtime_put_noidle(dev);
+ return ret;
+ }
+diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c
+index cdc16eecaf6b5..a08215eb9e148 100644
+--- a/drivers/spi/spi-rockchip.c
++++ b/drivers/spi/spi-rockchip.c
+@@ -196,6 +196,8 @@ struct rockchip_spi {
+
+ bool slave_abort;
+ bool cs_inactive; /* spi slave tansmition stop when cs inactive */
++ bool cs_high_supported; /* native CS supports active-high polarity */
++
+ struct spi_transfer *xfer; /* Store xfer temporarily */
+ };
+
+@@ -719,6 +721,11 @@ static int rockchip_spi_setup(struct spi_device *spi)
+ struct rockchip_spi *rs = spi_controller_get_devdata(spi->controller);
+ u32 cr0;
+
++ if (!spi->cs_gpiod && (spi->mode & SPI_CS_HIGH) && !rs->cs_high_supported) {
++ dev_warn(&spi->dev, "setup: non GPIO CS can't be active-high\n");
++ return -EINVAL;
++ }
++
+ pm_runtime_get_sync(rs->dev);
+
+ cr0 = readl_relaxed(rs->regs + ROCKCHIP_SPI_CTRLR0);
+@@ -899,6 +906,7 @@ static int rockchip_spi_probe(struct platform_device *pdev)
+
+ switch (readl_relaxed(rs->regs + ROCKCHIP_SPI_VERSION)) {
+ case ROCKCHIP_SPI_VER2_TYPE2:
++ rs->cs_high_supported = true;
+ ctlr->mode_bits |= SPI_CS_HIGH;
+ if (ctlr->can_dma && slave_mode)
+ rs->cs_inactive = true;
+diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
+index bd5708d7e5a15..7a014eeec2d0d 100644
+--- a/drivers/spi/spi-rspi.c
++++ b/drivers/spi/spi-rspi.c
+@@ -1108,14 +1108,11 @@ static struct dma_chan *rspi_request_dma_chan(struct device *dev,
+ }
+
+ memset(&cfg, 0, sizeof(cfg));
++ cfg.dst_addr = port_addr + RSPI_SPDR;
++ cfg.src_addr = port_addr + RSPI_SPDR;
++ cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
++ cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+ cfg.direction = dir;
+- if (dir == DMA_MEM_TO_DEV) {
+- cfg.dst_addr = port_addr;
+- cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+- } else {
+- cfg.src_addr = port_addr;
+- cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+- }
+
+ ret = dmaengine_slave_config(chan, &cfg);
+ if (ret) {
+@@ -1146,12 +1143,12 @@ static int rspi_request_dma(struct device *dev, struct spi_controller *ctlr,
+ }
+
+ ctlr->dma_tx = rspi_request_dma_chan(dev, DMA_MEM_TO_DEV, dma_tx_id,
+- res->start + RSPI_SPDR);
++ res->start);
+ if (!ctlr->dma_tx)
+ return -ENODEV;
+
+ ctlr->dma_rx = rspi_request_dma_chan(dev, DMA_DEV_TO_MEM, dma_rx_id,
+- res->start + RSPI_SPDR);
++ res->start);
+ if (!ctlr->dma_rx) {
+ dma_release_channel(ctlr->dma_tx);
+ ctlr->dma_tx = NULL;
+diff --git a/drivers/spi/spi-stm32-qspi.c b/drivers/spi/spi-stm32-qspi.c
+index ffdc55f87e821..dd38cb8ffbc20 100644
+--- a/drivers/spi/spi-stm32-qspi.c
++++ b/drivers/spi/spi-stm32-qspi.c
+@@ -308,7 +308,8 @@ static int stm32_qspi_wait_cmd(struct stm32_qspi *qspi,
+ if (!op->data.nbytes)
+ goto wait_nobusy;
+
+- if (readl_relaxed(qspi->io_base + QSPI_SR) & SR_TCF)
++ if ((readl_relaxed(qspi->io_base + QSPI_SR) & SR_TCF) ||
++ qspi->fmode == CCR_FMODE_APM)
+ goto out;
+
+ reinit_completion(&qspi->data_completion);
+diff --git a/drivers/spi/spi-ti-qspi.c b/drivers/spi/spi-ti-qspi.c
+index e06aafe169e0c..081da1fd3fd7e 100644
+--- a/drivers/spi/spi-ti-qspi.c
++++ b/drivers/spi/spi-ti-qspi.c
+@@ -448,6 +448,7 @@ static int ti_qspi_dma_xfer(struct ti_qspi *qspi, dma_addr_t dma_dst,
+ enum dma_ctrl_flags flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;
+ struct dma_async_tx_descriptor *tx;
+ int ret;
++ unsigned long time_left;
+
+ tx = dmaengine_prep_dma_memcpy(chan, dma_dst, dma_src, len, flags);
+ if (!tx) {
+@@ -467,9 +468,9 @@ static int ti_qspi_dma_xfer(struct ti_qspi *qspi, dma_addr_t dma_dst,
+ }
+
+ dma_async_issue_pending(chan);
+- ret = wait_for_completion_timeout(&qspi->transfer_complete,
++ time_left = wait_for_completion_timeout(&qspi->transfer_complete,
+ msecs_to_jiffies(len));
+- if (ret <= 0) {
++ if (time_left == 0) {
+ dmaengine_terminate_sync(chan);
+ dev_err(qspi->dev, "DMA wait_for_completion_timeout\n");
+ return -ETIMEDOUT;
+diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
+index dc768884cb79a..bd7d11032c94a 100644
+--- a/drivers/staging/media/hantro/hantro_drv.c
++++ b/drivers/staging/media/hantro/hantro_drv.c
+@@ -56,6 +56,10 @@ dma_addr_t hantro_get_ref(struct hantro_ctx *ctx, u64 ts)
+ return hantro_get_dec_buf_addr(ctx, buf);
+ }
+
++static const struct v4l2_event hantro_eos_event = {
++ .type = V4L2_EVENT_EOS
++};
++
+ static void hantro_job_finish_no_pm(struct hantro_dev *vpu,
+ struct hantro_ctx *ctx,
+ enum vb2_buffer_state result)
+@@ -73,6 +77,12 @@ static void hantro_job_finish_no_pm(struct hantro_dev *vpu,
+ src->sequence = ctx->sequence_out++;
+ dst->sequence = ctx->sequence_cap++;
+
++ if (v4l2_m2m_is_last_draining_src_buf(ctx->fh.m2m_ctx, src)) {
++ dst->flags |= V4L2_BUF_FLAG_LAST;
++ v4l2_event_queue_fh(&ctx->fh, &hantro_eos_event);
++ v4l2_m2m_mark_stopped(ctx->fh.m2m_ctx);
++ }
++
+ v4l2_m2m_buf_done_and_job_finish(ctx->dev->m2m_dev, ctx->fh.m2m_ctx,
+ result);
+ }
+@@ -809,10 +819,13 @@ static int hantro_add_func(struct hantro_dev *vpu, unsigned int funcid)
+ snprintf(vfd->name, sizeof(vfd->name), "%s-%s", match->compatible,
+ funcid == MEDIA_ENT_F_PROC_VIDEO_ENCODER ? "enc" : "dec");
+
+- if (funcid == MEDIA_ENT_F_PROC_VIDEO_ENCODER)
++ if (funcid == MEDIA_ENT_F_PROC_VIDEO_ENCODER) {
+ vpu->encoder = func;
+- else
++ } else {
+ vpu->decoder = func;
++ v4l2_disable_ioctl(vfd, VIDIOC_TRY_ENCODER_CMD);
++ v4l2_disable_ioctl(vfd, VIDIOC_ENCODER_CMD);
++ }
+
+ video_set_drvdata(vfd, vpu);
+
+diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
+index c524af41baf59..5f3178bac9c80 100644
+--- a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
++++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
+@@ -60,7 +60,7 @@ static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
+ no_chroma = 1;
+ for (j = 0, tmp_w = 0; j < num_tile_cols - 1; j++) {
+ tmp_w += pps->column_width_minus1[j] + 1;
+- *p++ = pps->column_width_minus1[j + 1];
++ *p++ = pps->column_width_minus1[j] + 1;
+ *p++ = h;
+ if (i == 0 && h == 1 && ctb_size == 16)
+ no_chroma = 1;
+@@ -180,13 +180,8 @@ static void set_params(struct hantro_ctx *ctx)
+ hantro_reg_write(vpu, &g2_max_cu_qpd_depth, 0);
+ }
+
+- if (pps->flags & V4L2_HEVC_PPS_FLAG_PPS_SLICE_CHROMA_QP_OFFSETS_PRESENT) {
+- hantro_reg_write(vpu, &g2_cb_qp_offset, pps->pps_cb_qp_offset);
+- hantro_reg_write(vpu, &g2_cr_qp_offset, pps->pps_cr_qp_offset);
+- } else {
+- hantro_reg_write(vpu, &g2_cb_qp_offset, 0);
+- hantro_reg_write(vpu, &g2_cr_qp_offset, 0);
+- }
++ hantro_reg_write(vpu, &g2_cb_qp_offset, pps->pps_cb_qp_offset);
++ hantro_reg_write(vpu, &g2_cr_qp_offset, pps->pps_cr_qp_offset);
+
+ hantro_reg_write(vpu, &g2_filt_offset_beta, pps->pps_beta_offset_div2);
+ hantro_reg_write(vpu, &g2_filt_offset_tc, pps->pps_tc_offset_div2);
+diff --git a/drivers/staging/media/hantro/hantro_h264.c b/drivers/staging/media/hantro/hantro_h264.c
+index 0b4d2491be3b8..228629fb3cdf9 100644
+--- a/drivers/staging/media/hantro/hantro_h264.c
++++ b/drivers/staging/media/hantro/hantro_h264.c
+@@ -354,8 +354,6 @@ u16 hantro_h264_get_ref_nbr(struct hantro_ctx *ctx, unsigned int dpb_idx)
+
+ if (!(dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE))
+ return 0;
+- if (dpb->flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM)
+- return dpb->pic_num;
+ return dpb->frame_num;
+ }
+
+diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
+index 67148ba346f52..71a6279750bf7 100644
+--- a/drivers/staging/media/hantro/hantro_v4l2.c
++++ b/drivers/staging/media/hantro/hantro_v4l2.c
+@@ -628,6 +628,38 @@ static int vidioc_s_selection(struct file *file, void *priv,
+ return 0;
+ }
+
++static const struct v4l2_event hantro_eos_event = {
++ .type = V4L2_EVENT_EOS
++};
++
++static int vidioc_encoder_cmd(struct file *file, void *priv,
++ struct v4l2_encoder_cmd *ec)
++{
++ struct hantro_ctx *ctx = fh_to_ctx(priv);
++ int ret;
++
++ ret = v4l2_m2m_ioctl_try_encoder_cmd(file, priv, ec);
++ if (ret < 0)
++ return ret;
++
++ if (!vb2_is_streaming(v4l2_m2m_get_src_vq(ctx->fh.m2m_ctx)) ||
++ !vb2_is_streaming(v4l2_m2m_get_dst_vq(ctx->fh.m2m_ctx)))
++ return 0;
++
++ ret = v4l2_m2m_ioctl_encoder_cmd(file, priv, ec);
++ if (ret < 0)
++ return ret;
++
++ if (ec->cmd == V4L2_ENC_CMD_STOP &&
++ v4l2_m2m_has_stopped(ctx->fh.m2m_ctx))
++ v4l2_event_queue_fh(&ctx->fh, &hantro_eos_event);
++
++ if (ec->cmd == V4L2_ENC_CMD_START)
++ vb2_clear_last_buffer_dequeued(&ctx->fh.m2m_ctx->cap_q_ctx.q);
++
++ return 0;
++}
++
+ const struct v4l2_ioctl_ops hantro_ioctl_ops = {
+ .vidioc_querycap = vidioc_querycap,
+ .vidioc_enum_framesizes = vidioc_enum_framesizes,
+@@ -657,6 +689,9 @@ const struct v4l2_ioctl_ops hantro_ioctl_ops = {
+
+ .vidioc_g_selection = vidioc_g_selection,
+ .vidioc_s_selection = vidioc_s_selection,
++
++ .vidioc_try_encoder_cmd = v4l2_m2m_ioctl_try_encoder_cmd,
++ .vidioc_encoder_cmd = vidioc_encoder_cmd,
+ };
+
+ static int
+@@ -733,8 +768,12 @@ static int hantro_buf_prepare(struct vb2_buffer *vb)
+ * (for OUTPUT buffers, if userspace passes 0 bytesused, v4l2-core sets
+ * it to buffer length).
+ */
+- if (V4L2_TYPE_IS_CAPTURE(vq->type))
+- vb2_set_plane_payload(vb, 0, pix_fmt->plane_fmt[0].sizeimage);
++ if (V4L2_TYPE_IS_CAPTURE(vq->type)) {
++ if (ctx->is_encoder)
++ vb2_set_plane_payload(vb, 0, 0);
++ else
++ vb2_set_plane_payload(vb, 0, pix_fmt->plane_fmt[0].sizeimage);
++ }
+
+ return 0;
+ }
+@@ -744,6 +783,22 @@ static void hantro_buf_queue(struct vb2_buffer *vb)
+ struct hantro_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+ struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
+
++ if (V4L2_TYPE_IS_CAPTURE(vb->vb2_queue->type) &&
++ vb2_is_streaming(vb->vb2_queue) &&
++ v4l2_m2m_dst_buf_is_last(ctx->fh.m2m_ctx)) {
++ unsigned int i;
++
++ for (i = 0; i < vb->num_planes; i++)
++ vb2_set_plane_payload(vb, i, 0);
++
++ vbuf->field = V4L2_FIELD_NONE;
++ vbuf->sequence = ctx->sequence_cap++;
++
++ v4l2_m2m_last_buffer_done(ctx->fh.m2m_ctx, vbuf);
++ v4l2_event_queue_fh(&ctx->fh, &hantro_eos_event);
++ return;
++ }
++
+ v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf);
+ }
+
+@@ -759,6 +814,8 @@ static int hantro_start_streaming(struct vb2_queue *q, unsigned int count)
+ struct hantro_ctx *ctx = vb2_get_drv_priv(q);
+ int ret = 0;
+
++ v4l2_m2m_update_start_streaming_state(ctx->fh.m2m_ctx, q);
++
+ if (V4L2_TYPE_IS_OUTPUT(q->type))
+ ctx->sequence_out = 0;
+ else
+@@ -831,6 +888,12 @@ static void hantro_stop_streaming(struct vb2_queue *q)
+ hantro_return_bufs(q, v4l2_m2m_src_buf_remove);
+ else
+ hantro_return_bufs(q, v4l2_m2m_dst_buf_remove);
++
++ v4l2_m2m_update_stop_streaming_state(ctx->fh.m2m_ctx, q);
++
++ if (V4L2_TYPE_IS_OUTPUT(q->type) &&
++ v4l2_m2m_has_stopped(ctx->fh.m2m_ctx))
++ v4l2_event_queue_fh(&ctx->fh, &hantro_eos_event);
+ }
+
+ static void hantro_buf_request_complete(struct vb2_buffer *vb)
+diff --git a/drivers/staging/media/rkvdec/rkvdec-h264.c b/drivers/staging/media/rkvdec/rkvdec-h264.c
+index 951e19231da21..22b4bf9e9ef40 100644
+--- a/drivers/staging/media/rkvdec/rkvdec-h264.c
++++ b/drivers/staging/media/rkvdec/rkvdec-h264.c
+@@ -112,6 +112,7 @@ struct rkvdec_h264_run {
+ const struct v4l2_ctrl_h264_sps *sps;
+ const struct v4l2_ctrl_h264_pps *pps;
+ const struct v4l2_ctrl_h264_scaling_matrix *scaling_matrix;
++ int ref_buf_idx[V4L2_H264_NUM_DPB_ENTRIES];
+ };
+
+ struct rkvdec_h264_ctx {
+@@ -661,8 +662,8 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
+ WRITE_PPS(0xff, PROFILE_IDC);
+ WRITE_PPS(1, CONSTRAINT_SET3_FLAG);
+ WRITE_PPS(sps->chroma_format_idc, CHROMA_FORMAT_IDC);
+- WRITE_PPS(sps->bit_depth_luma_minus8 + 8, BIT_DEPTH_LUMA);
+- WRITE_PPS(sps->bit_depth_chroma_minus8 + 8, BIT_DEPTH_CHROMA);
++ WRITE_PPS(sps->bit_depth_luma_minus8, BIT_DEPTH_LUMA);
++ WRITE_PPS(sps->bit_depth_chroma_minus8, BIT_DEPTH_CHROMA);
+ WRITE_PPS(0, QPPRIME_Y_ZERO_TRANSFORM_BYPASS_FLAG);
+ WRITE_PPS(sps->log2_max_frame_num_minus4, LOG2_MAX_FRAME_NUM_MINUS4);
+ WRITE_PPS(sps->max_num_ref_frames, MAX_NUM_REF_FRAMES);
+@@ -725,6 +726,26 @@ static void assemble_hw_pps(struct rkvdec_ctx *ctx,
+ }
+ }
+
++static void lookup_ref_buf_idx(struct rkvdec_ctx *ctx,
++ struct rkvdec_h264_run *run)
++{
++ const struct v4l2_ctrl_h264_decode_params *dec_params = run->decode_params;
++ u32 i;
++
++ for (i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) {
++ struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
++ const struct v4l2_h264_dpb_entry *dpb = run->decode_params->dpb;
++ struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
++ int buf_idx = -1;
++
++ if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
++ buf_idx = vb2_find_timestamp(cap_q,
++ dpb[i].reference_ts, 0);
++
++ run->ref_buf_idx[i] = buf_idx;
++ }
++}
++
+ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
+ struct rkvdec_h264_run *run)
+ {
+@@ -762,7 +783,7 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
+
+ for (j = 0; j < RKVDEC_NUM_REFLIST; j++) {
+ for (i = 0; i < h264_ctx->reflists.num_valid; i++) {
+- u8 dpb_valid = 0;
++ bool dpb_valid = run->ref_buf_idx[i] >= 0;
+ u8 idx = 0;
+
+ switch (j) {
+@@ -779,8 +800,6 @@ static void assemble_hw_rps(struct rkvdec_ctx *ctx,
+
+ if (idx >= ARRAY_SIZE(dec_params->dpb))
+ continue;
+- dpb_valid = !!(dpb[idx].flags &
+- V4L2_H264_DPB_ENTRY_FLAG_ACTIVE);
+
+ set_ps_field(hw_rps, DPB_INFO(i, j),
+ idx | dpb_valid << 4);
+@@ -859,13 +878,8 @@ get_ref_buf(struct rkvdec_ctx *ctx, struct rkvdec_h264_run *run,
+ unsigned int dpb_idx)
+ {
+ struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx;
+- const struct v4l2_h264_dpb_entry *dpb = run->decode_params->dpb;
+ struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q;
+- int buf_idx = -1;
+-
+- if (dpb[dpb_idx].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)
+- buf_idx = vb2_find_timestamp(cap_q,
+- dpb[dpb_idx].reference_ts, 0);
++ int buf_idx = run->ref_buf_idx[dpb_idx];
+
+ /*
+ * If a DPB entry is unused or invalid, address of current destination
+@@ -1102,6 +1116,7 @@ static int rkvdec_h264_run(struct rkvdec_ctx *ctx)
+
+ assemble_hw_scaling_list(ctx, &run);
+ assemble_hw_pps(ctx, &run);
++ lookup_ref_buf_idx(ctx, &run);
+ assemble_hw_rps(ctx, &run);
+ config_registers(ctx, &run);
+
+diff --git a/drivers/staging/r8188eu/os_dep/ioctl_linux.c b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
+index eb9375b0c6606..60bd1cc2b3afd 100644
+--- a/drivers/staging/r8188eu/os_dep/ioctl_linux.c
++++ b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
+@@ -1131,9 +1131,11 @@ static int rtw_wx_set_scan(struct net_device *dev, struct iw_request_info *a,
+ break;
+ }
+ sec_len = *(pos++); len -= 1;
+- if (sec_len > 0 && sec_len <= len) {
++ if (sec_len > 0 &&
++ sec_len <= len &&
++ sec_len <= 32) {
+ ssid[ssid_index].SsidLength = sec_len;
+- memcpy(ssid[ssid_index].Ssid, pos, ssid[ssid_index].SsidLength);
++ memcpy(ssid[ssid_index].Ssid, pos, sec_len);
+ ssid_index++;
+ }
+ pos += sec_len;
+@@ -1886,88 +1888,6 @@ static int rtw_wx_get_nick(struct net_device *dev,
+ return 0;
+ }
+
+-static int rtw_wx_read32(struct net_device *dev,
+- struct iw_request_info *info,
+- union iwreq_data *wrqu, char *extra)
+-{
+- struct adapter *padapter;
+- struct iw_point *p;
+- u16 len;
+- u32 addr;
+- u32 data32;
+- u32 bytes;
+- u8 *ptmp;
+- int ret;
+-
+- padapter = (struct adapter *)rtw_netdev_priv(dev);
+- p = &wrqu->data;
+- len = p->length;
+- ptmp = memdup_user(p->pointer, len);
+- if (IS_ERR(ptmp))
+- return PTR_ERR(ptmp);
+-
+- bytes = 0;
+- addr = 0;
+- sscanf(ptmp, "%d,%x", &bytes, &addr);
+-
+- switch (bytes) {
+- case 1:
+- data32 = rtw_read8(padapter, addr);
+- sprintf(extra, "0x%02X", data32);
+- break;
+- case 2:
+- data32 = rtw_read16(padapter, addr);
+- sprintf(extra, "0x%04X", data32);
+- break;
+- case 4:
+- data32 = rtw_read32(padapter, addr);
+- sprintf(extra, "0x%08X", data32);
+- break;
+- default:
+- ret = -EINVAL;
+- goto err_free_ptmp;
+- }
+-
+- kfree(ptmp);
+- return 0;
+-
+-err_free_ptmp:
+- kfree(ptmp);
+- return ret;
+-}
+-
+-static int rtw_wx_write32(struct net_device *dev,
+- struct iw_request_info *info,
+- union iwreq_data *wrqu, char *extra)
+-{
+- struct adapter *padapter = (struct adapter *)rtw_netdev_priv(dev);
+-
+- u32 addr;
+- u32 data32;
+- u32 bytes;
+-
+- bytes = 0;
+- addr = 0;
+- data32 = 0;
+- sscanf(extra, "%d,%x,%x", &bytes, &addr, &data32);
+-
+- switch (bytes) {
+- case 1:
+- rtw_write8(padapter, addr, (u8)data32);
+- break;
+- case 2:
+- rtw_write16(padapter, addr, (u16)data32);
+- break;
+- case 4:
+- rtw_write32(padapter, addr, data32);
+- break;
+- default:
+- return -EINVAL;
+- }
+-
+- return 0;
+-}
+-
+ static int rtw_wx_read_rf(struct net_device *dev,
+ struct iw_request_info *info,
+ union iwreq_data *wrqu, char *extra)
+@@ -3895,8 +3815,8 @@ static const struct iw_priv_args rtw_private_args[] = {
+ };
+
+ static iw_handler rtw_private_handler[] = {
+-rtw_wx_write32, /* 0x00 */
+-rtw_wx_read32, /* 0x01 */
++ NULL, /* 0x00 */
++ NULL, /* 0x01 */
+ NULL, /* 0x02 */
+ NULL, /* 0x03 */
+ /* for MM DTV platform */
+diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
+index 44bb380e7390c..fa866acef5bb2 100644
+--- a/drivers/target/target_core_device.c
++++ b/drivers/target/target_core_device.c
+@@ -850,7 +850,6 @@ bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib,
+ attrib->unmap_granularity = q->limits.discard_granularity / block_size;
+ attrib->unmap_granularity_alignment = q->limits.discard_alignment /
+ block_size;
+- attrib->unmap_zeroes_data = !!(q->limits.max_write_zeroes_sectors);
+ return true;
+ }
+ EXPORT_SYMBOL(target_configure_unmap_from_queue);
+diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
+index fd7267baa7078..3deaeecb712e3 100644
+--- a/drivers/target/target_core_user.c
++++ b/drivers/target/target_core_user.c
+@@ -20,6 +20,7 @@
+ #include <linux/configfs.h>
+ #include <linux/mutex.h>
+ #include <linux/workqueue.h>
++#include <linux/pagemap.h>
+ #include <net/genetlink.h>
+ #include <scsi/scsi_common.h>
+ #include <scsi/scsi_proto.h>
+@@ -1660,17 +1661,37 @@ static int tcmu_check_and_free_pending_cmd(struct tcmu_cmd *cmd)
+ static u32 tcmu_blocks_release(struct tcmu_dev *udev, unsigned long first,
+ unsigned long last)
+ {
+- XA_STATE(xas, &udev->data_pages, first * udev->data_pages_per_blk);
+ struct page *page;
++ unsigned long dpi;
+ u32 pages_freed = 0;
+
+- xas_lock(&xas);
+- xas_for_each(&xas, page, (last + 1) * udev->data_pages_per_blk - 1) {
+- xas_store(&xas, NULL);
++ first = first * udev->data_pages_per_blk;
++ last = (last + 1) * udev->data_pages_per_blk - 1;
++ xa_for_each_range(&udev->data_pages, dpi, page, first, last) {
++ xa_erase(&udev->data_pages, dpi);
++ /*
++ * While reaching here there may be page faults occurring on
++ * the to-be-released pages. A race condition may occur if
++ * unmap_mapping_range() is called before page faults on these
++ * pages have completed; a valid but stale map is created.
++ *
++ * If another command subsequently runs and needs to extend
++ * dbi_thresh, it may reuse the slot corresponding to the
++ * previous page in data_bitmap. Though we will allocate a new
++ * page for the slot in data_area, no page fault will happen
++ * because we have a valid map. Therefore the command's data
++ * will be lost.
++ *
++ * We lock and unlock pages that are to be released to ensure
++ * all page faults have completed. This way
++ * unmap_mapping_range() can ensure stale maps are cleanly
++ * removed.
++ */
++ lock_page(page);
++ unlock_page(page);
+ __free_page(page);
+ pages_freed++;
+ }
+- xas_unlock(&xas);
+
+ atomic_sub(pages_freed, &global_page_count);
+
+@@ -1822,6 +1843,7 @@ static struct page *tcmu_try_get_data_page(struct tcmu_dev *udev, uint32_t dpi)
+ page = xa_load(&udev->data_pages, dpi);
+ if (likely(page)) {
+ get_page(page);
++ lock_page(page);
+ mutex_unlock(&udev->cmdr_lock);
+ return page;
+ }
+@@ -1863,6 +1885,7 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
+ struct page *page;
+ unsigned long offset;
+ void *addr;
++ vm_fault_t ret = 0;
+
+ int mi = tcmu_find_mem_index(vmf->vma);
+ if (mi < 0)
+@@ -1887,10 +1910,11 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf)
+ page = tcmu_try_get_data_page(udev, dpi);
+ if (!page)
+ return VM_FAULT_SIGBUS;
++ ret = VM_FAULT_LOCKED;
+ }
+
+ vmf->page = page;
+- return 0;
++ return ret;
+ }
+
+ static const struct vm_operations_struct tcmu_vm_ops = {
+@@ -3205,12 +3229,22 @@ static void find_free_blocks(void)
+ udev->dbi_max = block;
+ }
+
++ /*
++ * Release the block pages.
++ *
++ * Also note that since tcmu_vma_fault() gets an extra page
++ * refcount, tcmu_blocks_release() won't free pages if pages
++ * are mapped. This means it is safe to call
++ * tcmu_blocks_release() before unmap_mapping_range() which
++ * drops the refcount of any pages it unmaps and thus releases
++ * them.
++ */
++ pages_freed = tcmu_blocks_release(udev, start, end - 1);
++
+ /* Here will truncate the data area from off */
+ off = udev->data_off + (loff_t)start * udev->data_blk_size;
+ unmap_mapping_range(udev->inode->i_mapping, off, 0, 1);
+
+- /* Release the block pages */
+- pages_freed = tcmu_blocks_release(udev, start, end - 1);
+ mutex_unlock(&udev->cmdr_lock);
+
+ total_pages_freed += pages_freed;
+diff --git a/drivers/thermal/broadcom/bcm2711_thermal.c b/drivers/thermal/broadcom/bcm2711_thermal.c
+index 1ec57d9ecf539..e9bef5c3414b6 100644
+--- a/drivers/thermal/broadcom/bcm2711_thermal.c
++++ b/drivers/thermal/broadcom/bcm2711_thermal.c
+@@ -38,7 +38,6 @@ static int bcm2711_get_temp(void *data, int *temp)
+ int offset = thermal_zone_get_offset(priv->thermal);
+ u32 val;
+ int ret;
+- long t;
+
+ ret = regmap_read(priv->regmap, AVS_RO_TEMP_STATUS, &val);
+ if (ret)
+@@ -50,9 +49,7 @@ static int bcm2711_get_temp(void *data, int *temp)
+ val &= AVS_RO_TEMP_STATUS_DATA_MSK;
+
+ /* Convert a HW code to a temperature reading (millidegree celsius) */
+- t = slope * val + offset;
+-
+- *temp = t < 0 ? 0 : t;
++ *temp = slope * val + offset;
+
+ return 0;
+ }
+diff --git a/drivers/thermal/broadcom/sr-thermal.c b/drivers/thermal/broadcom/sr-thermal.c
+index 475ce29007713..85ab9edd580cc 100644
+--- a/drivers/thermal/broadcom/sr-thermal.c
++++ b/drivers/thermal/broadcom/sr-thermal.c
+@@ -60,6 +60,9 @@ static int sr_thermal_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
++ if (!res)
++ return -ENOENT;
++
+ sr_thermal->regs = (void __iomem *)devm_memremap(&pdev->dev, res->start,
+ resource_size(res),
+ MEMREMAP_WB);
+diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
+index 4310cb342a9fb..d38a80adec733 100644
+--- a/drivers/thermal/devfreq_cooling.c
++++ b/drivers/thermal/devfreq_cooling.c
+@@ -358,21 +358,28 @@ of_devfreq_cooling_register_power(struct device_node *np, struct devfreq *df,
+ struct thermal_cooling_device *cdev;
+ struct device *dev = df->dev.parent;
+ struct devfreq_cooling_device *dfc;
++ struct thermal_cooling_device_ops *ops;
+ char *name;
+ int err, num_opps;
+
+- dfc = kzalloc(sizeof(*dfc), GFP_KERNEL);
+- if (!dfc)
++ ops = kmemdup(&devfreq_cooling_ops, sizeof(*ops), GFP_KERNEL);
++ if (!ops)
+ return ERR_PTR(-ENOMEM);
+
++ dfc = kzalloc(sizeof(*dfc), GFP_KERNEL);
++ if (!dfc) {
++ err = -ENOMEM;
++ goto free_ops;
++ }
++
+ dfc->devfreq = df;
+
+ dfc->em_pd = em_pd_get(dev);
+ if (dfc->em_pd) {
+- devfreq_cooling_ops.get_requested_power =
++ ops->get_requested_power =
+ devfreq_cooling_get_requested_power;
+- devfreq_cooling_ops.state2power = devfreq_cooling_state2power;
+- devfreq_cooling_ops.power2state = devfreq_cooling_power2state;
++ ops->state2power = devfreq_cooling_state2power;
++ ops->power2state = devfreq_cooling_power2state;
+
+ dfc->power_ops = dfc_power;
+
+@@ -407,8 +414,7 @@ of_devfreq_cooling_register_power(struct device_node *np, struct devfreq *df,
+ if (!name)
+ goto remove_qos_req;
+
+- cdev = thermal_of_cooling_device_register(np, name, dfc,
+- &devfreq_cooling_ops);
++ cdev = thermal_of_cooling_device_register(np, name, dfc, ops);
+ kfree(name);
+
+ if (IS_ERR(cdev)) {
+@@ -429,6 +435,8 @@ free_table:
+ kfree(dfc->freq_table);
+ free_dfc:
+ kfree(dfc);
++free_ops:
++ kfree(ops);
+
+ return ERR_PTR(err);
+ }
+@@ -510,11 +518,13 @@ EXPORT_SYMBOL_GPL(devfreq_cooling_em_register);
+ void devfreq_cooling_unregister(struct thermal_cooling_device *cdev)
+ {
+ struct devfreq_cooling_device *dfc;
++ const struct thermal_cooling_device_ops *ops;
+ struct device *dev;
+
+ if (IS_ERR_OR_NULL(cdev))
+ return;
+
++ ops = cdev->ops;
+ dfc = cdev->devdata;
+ dev = dfc->devfreq->dev.parent;
+
+@@ -525,5 +535,6 @@ void devfreq_cooling_unregister(struct thermal_cooling_device *cdev)
+
+ kfree(dfc->freq_table);
+ kfree(dfc);
++ kfree(ops);
+ }
+ EXPORT_SYMBOL_GPL(devfreq_cooling_unregister);
+diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
+index 8d76dbfde6a9f..331a241eb0ef3 100644
+--- a/drivers/thermal/imx_sc_thermal.c
++++ b/drivers/thermal/imx_sc_thermal.c
+@@ -94,8 +94,8 @@ static int imx_sc_thermal_probe(struct platform_device *pdev)
+ sensor = devm_kzalloc(&pdev->dev, sizeof(*sensor), GFP_KERNEL);
+ if (!sensor) {
+ of_node_put(child);
+- of_node_put(sensor_np);
+- return -ENOMEM;
++ ret = -ENOMEM;
++ goto put_node;
+ }
+
+ ret = thermal_zone_of_get_sensor_id(child,
+@@ -124,7 +124,9 @@ static int imx_sc_thermal_probe(struct platform_device *pdev)
+ dev_warn(&pdev->dev, "failed to add hwmon sysfs attributes\n");
+ }
+
++put_node:
+ of_node_put(sensor_np);
++ of_node_put(np);
+
+ return ret;
+ }
+diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
+index 82654dc8382b8..cdc0552e8c42e 100644
+--- a/drivers/thermal/thermal_core.c
++++ b/drivers/thermal/thermal_core.c
+@@ -947,6 +947,7 @@ __thermal_cooling_device_register(struct device_node *np,
+ return cdev;
+
+ out_kfree_type:
++ thermal_cooling_device_destroy_sysfs(cdev);
+ kfree(cdev->type);
+ put_device(&cdev->device);
+ cdev = NULL;
+diff --git a/drivers/tty/goldfish.c b/drivers/tty/goldfish.c
+index ad13532e92fe2..9e8ccb8ed6d69 100644
+--- a/drivers/tty/goldfish.c
++++ b/drivers/tty/goldfish.c
+@@ -61,13 +61,13 @@ static void do_rw_io(struct goldfish_tty *qtty,
+ spin_lock_irqsave(&qtty->lock, irq_flags);
+ gf_write_ptr((void *)address, base + GOLDFISH_TTY_REG_DATA_PTR,
+ base + GOLDFISH_TTY_REG_DATA_PTR_HIGH);
+- __raw_writel(count, base + GOLDFISH_TTY_REG_DATA_LEN);
++ gf_iowrite32(count, base + GOLDFISH_TTY_REG_DATA_LEN);
+
+ if (is_write)
+- __raw_writel(GOLDFISH_TTY_CMD_WRITE_BUFFER,
++ gf_iowrite32(GOLDFISH_TTY_CMD_WRITE_BUFFER,
+ base + GOLDFISH_TTY_REG_CMD);
+ else
+- __raw_writel(GOLDFISH_TTY_CMD_READ_BUFFER,
++ gf_iowrite32(GOLDFISH_TTY_CMD_READ_BUFFER,
+ base + GOLDFISH_TTY_REG_CMD);
+
+ spin_unlock_irqrestore(&qtty->lock, irq_flags);
+@@ -142,7 +142,7 @@ static irqreturn_t goldfish_tty_interrupt(int irq, void *dev_id)
+ unsigned char *buf;
+ u32 count;
+
+- count = __raw_readl(base + GOLDFISH_TTY_REG_BYTES_READY);
++ count = gf_ioread32(base + GOLDFISH_TTY_REG_BYTES_READY);
+ if (count == 0)
+ return IRQ_NONE;
+
+@@ -159,7 +159,7 @@ static int goldfish_tty_activate(struct tty_port *port, struct tty_struct *tty)
+ {
+ struct goldfish_tty *qtty = container_of(port, struct goldfish_tty,
+ port);
+- __raw_writel(GOLDFISH_TTY_CMD_INT_ENABLE, qtty->base + GOLDFISH_TTY_REG_CMD);
++ gf_iowrite32(GOLDFISH_TTY_CMD_INT_ENABLE, qtty->base + GOLDFISH_TTY_REG_CMD);
+ return 0;
+ }
+
+@@ -167,7 +167,7 @@ static void goldfish_tty_shutdown(struct tty_port *port)
+ {
+ struct goldfish_tty *qtty = container_of(port, struct goldfish_tty,
+ port);
+- __raw_writel(GOLDFISH_TTY_CMD_INT_DISABLE, qtty->base + GOLDFISH_TTY_REG_CMD);
++ gf_iowrite32(GOLDFISH_TTY_CMD_INT_DISABLE, qtty->base + GOLDFISH_TTY_REG_CMD);
+ }
+
+ static int goldfish_tty_open(struct tty_struct *tty, struct file *filp)
+@@ -202,7 +202,7 @@ static unsigned int goldfish_tty_chars_in_buffer(struct tty_struct *tty)
+ {
+ struct goldfish_tty *qtty = &goldfish_ttys[tty->index];
+ void __iomem *base = qtty->base;
+- return __raw_readl(base + GOLDFISH_TTY_REG_BYTES_READY);
++ return gf_ioread32(base + GOLDFISH_TTY_REG_BYTES_READY);
+ }
+
+ static void goldfish_tty_console_write(struct console *co, const char *b,
+@@ -355,7 +355,7 @@ static int goldfish_tty_probe(struct platform_device *pdev)
+ * on Ranchu emulator (qemu2) returns 1 here and
+ * driver will use physical addresses.
+ */
+- qtty->version = __raw_readl(base + GOLDFISH_TTY_REG_VERSION);
++ qtty->version = gf_ioread32(base + GOLDFISH_TTY_REG_VERSION);
+
+ /*
+ * Goldfish TTY device on Ranchu emulator (qemu2)
+@@ -374,7 +374,7 @@ static int goldfish_tty_probe(struct platform_device *pdev)
+ }
+ }
+
+- __raw_writel(GOLDFISH_TTY_CMD_INT_DISABLE, base + GOLDFISH_TTY_REG_CMD);
++ gf_iowrite32(GOLDFISH_TTY_CMD_INT_DISABLE, base + GOLDFISH_TTY_REG_CMD);
+
+ ret = request_irq(irq, goldfish_tty_interrupt, IRQF_SHARED,
+ "goldfish_tty", qtty);
+@@ -436,7 +436,7 @@ static int goldfish_tty_remove(struct platform_device *pdev)
+ #ifdef CONFIG_GOLDFISH_TTY_EARLY_CONSOLE
+ static void gf_early_console_putchar(struct uart_port *port, unsigned char ch)
+ {
+- __raw_writel(ch, port->membase);
++ gf_iowrite32(ch, port->membase);
+ }
+
+ static void gf_early_write(struct console *con, const char *s, unsigned int n)
+diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
+index fd8b86dde5255..ea5381dedb07c 100644
+--- a/drivers/tty/n_gsm.c
++++ b/drivers/tty/n_gsm.c
+@@ -444,6 +444,25 @@ static u8 gsm_encode_modem(const struct gsm_dlci *dlci)
+ return modembits;
+ }
+
++static void gsm_hex_dump_bytes(const char *fname, const u8 *data,
++ unsigned long len)
++{
++ char *prefix;
++
++ if (!fname) {
++ print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE, 16, 1, data, len,
++ true);
++ return;
++ }
++
++ prefix = kasprintf(GFP_KERNEL, "%s: ", fname);
++ if (!prefix)
++ return;
++ print_hex_dump(KERN_INFO, prefix, DUMP_PREFIX_OFFSET, 16, 1, data, len,
++ true);
++ kfree(prefix);
++}
++
+ /**
+ * gsm_print_packet - display a frame for debug
+ * @hdr: header to print before decode
+@@ -508,7 +527,7 @@ static void gsm_print_packet(const char *hdr, int addr, int cr,
+ else
+ pr_cont("(F)");
+
+- print_hex_dump_bytes("", DUMP_PREFIX_NONE, data, dlen);
++ gsm_hex_dump_bytes(NULL, data, dlen);
+ }
+
+
+@@ -698,9 +717,7 @@ static void gsm_data_kick(struct gsm_mux *gsm, struct gsm_dlci *dlci)
+ }
+
+ if (debug & 4)
+- print_hex_dump_bytes("gsm_data_kick: ",
+- DUMP_PREFIX_OFFSET,
+- gsm->txframe, len);
++ gsm_hex_dump_bytes(__func__, gsm->txframe, len);
+ if (gsmld_output(gsm, gsm->txframe, len) <= 0)
+ break;
+ /* FIXME: Can eliminate one SOF in many more cases */
+@@ -2448,8 +2465,7 @@ static int gsmld_output(struct gsm_mux *gsm, u8 *data, int len)
+ return -ENOSPC;
+ }
+ if (debug & 4)
+- print_hex_dump_bytes("gsmld_output: ", DUMP_PREFIX_OFFSET,
+- data, len);
++ gsm_hex_dump_bytes(__func__, data, len);
+ return gsm->tty->ops->write(gsm->tty, data, len);
+ }
+
+@@ -2525,8 +2541,7 @@ static void gsmld_receive_buf(struct tty_struct *tty, const unsigned char *cp,
+ char flags = TTY_NORMAL;
+
+ if (debug & 4)
+- print_hex_dump_bytes("gsmld_receive: ", DUMP_PREFIX_OFFSET,
+- cp, count);
++ gsm_hex_dump_bytes(__func__, cp, count);
+
+ for (; count; count--, cp++) {
+ if (fp)
+diff --git a/drivers/tty/serial/pch_uart.c b/drivers/tty/serial/pch_uart.c
+index affe71f8b50c7..937636ecdc3c3 100644
+--- a/drivers/tty/serial/pch_uart.c
++++ b/drivers/tty/serial/pch_uart.c
+@@ -624,22 +624,6 @@ static int push_rx(struct eg20t_port *priv, const unsigned char *buf,
+ return 0;
+ }
+
+-static int pop_tx_x(struct eg20t_port *priv, unsigned char *buf)
+-{
+- int ret = 0;
+- struct uart_port *port = &priv->port;
+-
+- if (port->x_char) {
+- dev_dbg(priv->port.dev, "%s:X character send %02x (%lu)\n",
+- __func__, port->x_char, jiffies);
+- buf[0] = port->x_char;
+- port->x_char = 0;
+- ret = 1;
+- }
+-
+- return ret;
+-}
+-
+ static int dma_push_rx(struct eg20t_port *priv, int size)
+ {
+ int room;
+@@ -889,9 +873,10 @@ static unsigned int handle_tx(struct eg20t_port *priv)
+
+ fifo_size = max(priv->fifo_size, 1);
+ tx_empty = 1;
+- if (pop_tx_x(priv, xmit->buf)) {
+- pch_uart_hal_write(priv, xmit->buf, 1);
++ if (port->x_char) {
++ pch_uart_hal_write(priv, &port->x_char, 1);
+ port->icount.tx++;
++ port->x_char = 0;
+ tx_empty = 0;
+ fifo_size--;
+ }
+@@ -946,9 +931,11 @@ static unsigned int dma_handle_tx(struct eg20t_port *priv)
+ }
+
+ fifo_size = max(priv->fifo_size, 1);
+- if (pop_tx_x(priv, xmit->buf)) {
+- pch_uart_hal_write(priv, xmit->buf, 1);
++
++ if (port->x_char) {
++ pch_uart_hal_write(priv, &port->x_char, 1);
+ port->icount.tx++;
++ port->x_char = 0;
+ fifo_size--;
+ }
+
+diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
+index 646510476c304..bfa431a8e6902 100644
+--- a/drivers/tty/tty_buffer.c
++++ b/drivers/tty/tty_buffer.c
+@@ -175,7 +175,8 @@ static struct tty_buffer *tty_buffer_alloc(struct tty_port *port, size_t size)
+ */
+ if (atomic_read(&port->buf.mem_used) > port->buf.mem_limit)
+ return NULL;
+- p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
++ p = kmalloc(sizeof(struct tty_buffer) + 2 * size,
++ GFP_ATOMIC | __GFP_NOWARN);
+ if (p == NULL)
+ return NULL;
+
+diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
+index d9712c2602afe..06eea8848ccc2 100644
+--- a/drivers/usb/core/hcd.c
++++ b/drivers/usb/core/hcd.c
+@@ -2816,6 +2816,7 @@ int usb_add_hcd(struct usb_hcd *hcd,
+ {
+ int retval;
+ struct usb_device *rhdev;
++ struct usb_hcd *shared_hcd;
+
+ if (!hcd->skip_phy_initialization && usb_hcd_is_primary_hcd(hcd)) {
+ hcd->phy_roothub = usb_phy_roothub_alloc(hcd->self.sysdev);
+@@ -2976,13 +2977,26 @@ int usb_add_hcd(struct usb_hcd *hcd,
+ goto err_hcd_driver_start;
+ }
+
++ /* starting here, usbcore will pay attention to the shared HCD roothub */
++ shared_hcd = hcd->shared_hcd;
++ if (!usb_hcd_is_primary_hcd(hcd) && shared_hcd && HCD_DEFER_RH_REGISTER(shared_hcd)) {
++ retval = register_root_hub(shared_hcd);
++ if (retval != 0)
++ goto err_register_root_hub;
++
++ if (shared_hcd->uses_new_polling && HCD_POLL_RH(shared_hcd))
++ usb_hcd_poll_rh_status(shared_hcd);
++ }
++
+ /* starting here, usbcore will pay attention to this root hub */
+- retval = register_root_hub(hcd);
+- if (retval != 0)
+- goto err_register_root_hub;
++ if (!HCD_DEFER_RH_REGISTER(hcd)) {
++ retval = register_root_hub(hcd);
++ if (retval != 0)
++ goto err_register_root_hub;
+
+- if (hcd->uses_new_polling && HCD_POLL_RH(hcd))
+- usb_hcd_poll_rh_status(hcd);
++ if (hcd->uses_new_polling && HCD_POLL_RH(hcd))
++ usb_hcd_poll_rh_status(hcd);
++ }
+
+ return retval;
+
+@@ -3020,6 +3034,7 @@ EXPORT_SYMBOL_GPL(usb_add_hcd);
+ void usb_remove_hcd(struct usb_hcd *hcd)
+ {
+ struct usb_device *rhdev = hcd->self.root_hub;
++ bool rh_registered;
+
+ dev_info(hcd->self.controller, "remove, state %x\n", hcd->state);
+
+@@ -3030,6 +3045,7 @@ void usb_remove_hcd(struct usb_hcd *hcd)
+
+ dev_dbg(hcd->self.controller, "roothub graceful disconnect\n");
+ spin_lock_irq (&hcd_root_hub_lock);
++ rh_registered = hcd->rh_registered;
+ hcd->rh_registered = 0;
+ spin_unlock_irq (&hcd_root_hub_lock);
+
+@@ -3039,7 +3055,8 @@ void usb_remove_hcd(struct usb_hcd *hcd)
+ cancel_work_sync(&hcd->died_work);
+
+ mutex_lock(&usb_bus_idr_lock);
+- usb_disconnect(&rhdev); /* Sets rhdev to NULL */
++ if (rh_registered)
++ usb_disconnect(&rhdev); /* Sets rhdev to NULL */
+ mutex_unlock(&usb_bus_idr_lock);
+
+ /*
+diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
+index 97b44a68668a5..f99a65a64588f 100644
+--- a/drivers/usb/core/quirks.c
++++ b/drivers/usb/core/quirks.c
+@@ -510,6 +510,9 @@ static const struct usb_device_id usb_quirk_list[] = {
+ /* DJI CineSSD */
+ { USB_DEVICE(0x2ca3, 0x0031), .driver_info = USB_QUIRK_NO_LPM },
+
++ /* DELL USB GEN2 */
++ { USB_DEVICE(0x413c, 0xb062), .driver_info = USB_QUIRK_NO_LPM | USB_QUIRK_RESET_RESUME },
++
+ /* VCOM device */
+ { USB_DEVICE(0x4296, 0x7570), .driver_info = USB_QUIRK_CONFIG_INTF_STRINGS },
+
+diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
+index 0b9c2493844a8..026fc360cc506 100644
+--- a/drivers/usb/dwc3/gadget.c
++++ b/drivers/usb/dwc3/gadget.c
+@@ -3380,14 +3380,14 @@ static bool dwc3_gadget_endpoint_trbs_complete(struct dwc3_ep *dep,
+ struct dwc3 *dwc = dep->dwc;
+ bool no_started_trb = true;
+
+- if (!dep->endpoint.desc)
+- return no_started_trb;
+-
+ dwc3_gadget_ep_cleanup_completed_requests(dep, event, status);
+
+ if (dep->flags & DWC3_EP_END_TRANSFER_PENDING)
+ goto out;
+
++ if (!dep->endpoint.desc)
++ return no_started_trb;
++
+ if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
+ list_empty(&dep->started_list) &&
+ (list_empty(&dep->pending_list) || status == -EXDEV))
+diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
+index d7e0e6ebf0800..d57c5ff5ae1f4 100644
+--- a/drivers/usb/host/xhci-pci.c
++++ b/drivers/usb/host/xhci-pci.c
+@@ -59,6 +59,7 @@
+ #define PCI_DEVICE_ID_INTEL_TIGER_LAKE_XHCI 0x9a13
+ #define PCI_DEVICE_ID_INTEL_MAPLE_RIDGE_XHCI 0x1138
+ #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_XHCI 0x461e
++#define PCI_DEVICE_ID_INTEL_ALDER_LAKE_N_XHCI 0x464e
+ #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI 0x51ed
+
+ #define PCI_DEVICE_ID_AMD_RENOIR_XHCI 0x1639
+@@ -268,6 +269,7 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
+ pdev->device == PCI_DEVICE_ID_INTEL_TIGER_LAKE_XHCI ||
+ pdev->device == PCI_DEVICE_ID_INTEL_MAPLE_RIDGE_XHCI ||
+ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_XHCI ||
++ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_N_XHCI ||
+ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI))
+ xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
+
+diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
+index 25b87e99b4dd4..2be38d9de8df4 100644
+--- a/drivers/usb/host/xhci.c
++++ b/drivers/usb/host/xhci.c
+@@ -696,6 +696,8 @@ int xhci_run(struct usb_hcd *hcd)
+ xhci_dbg_trace(xhci, trace_xhci_dbg_init,
+ "Finished xhci_run for USB2 roothub");
+
++ set_bit(HCD_FLAG_DEFER_RH_REGISTER, &hcd->flags);
++
+ xhci_create_dbc_dev(xhci);
+
+ xhci_debugfs_init(xhci);
+diff --git a/drivers/usb/isp1760/isp1760-core.c b/drivers/usb/isp1760/isp1760-core.c
+index d1d9a7d5da175..af88f4fe00d27 100644
+--- a/drivers/usb/isp1760/isp1760-core.c
++++ b/drivers/usb/isp1760/isp1760-core.c
+@@ -251,6 +251,8 @@ static const struct reg_field isp1760_hc_reg_fields[] = {
+ [HW_DM_PULLDOWN] = REG_FIELD(ISP176x_HC_OTG_CTRL, 2, 2),
+ [HW_DP_PULLDOWN] = REG_FIELD(ISP176x_HC_OTG_CTRL, 1, 1),
+ [HW_DP_PULLUP] = REG_FIELD(ISP176x_HC_OTG_CTRL, 0, 0),
++ /* Make sure the array is sized properly during compilation */
++ [HC_FIELD_MAX] = {},
+ };
+
+ static const struct reg_field isp1763_hc_reg_fields[] = {
+@@ -321,6 +323,8 @@ static const struct reg_field isp1763_hc_reg_fields[] = {
+ [HW_DM_PULLDOWN_CLEAR] = REG_FIELD(ISP1763_HC_OTG_CTRL_CLEAR, 2, 2),
+ [HW_DP_PULLDOWN_CLEAR] = REG_FIELD(ISP1763_HC_OTG_CTRL_CLEAR, 1, 1),
+ [HW_DP_PULLUP_CLEAR] = REG_FIELD(ISP1763_HC_OTG_CTRL_CLEAR, 0, 0),
++ /* Make sure the array is sized properly during compilation */
++ [HC_FIELD_MAX] = {},
+ };
+
+ static const struct regmap_range isp1763_hc_volatile_ranges[] = {
+@@ -405,6 +409,8 @@ static const struct reg_field isp1761_dc_reg_fields[] = {
+ [DC_CHIP_ID_HIGH] = REG_FIELD(ISP176x_DC_CHIPID, 16, 31),
+ [DC_CHIP_ID_LOW] = REG_FIELD(ISP176x_DC_CHIPID, 0, 15),
+ [DC_SCRATCH] = REG_FIELD(ISP176x_DC_SCRATCH, 0, 15),
++ /* Make sure the array is sized properly during compilation */
++ [DC_FIELD_MAX] = {},
+ };
+
+ static const struct regmap_range isp1763_dc_volatile_ranges[] = {
+@@ -458,6 +464,8 @@ static const struct reg_field isp1763_dc_reg_fields[] = {
+ [DC_CHIP_ID_HIGH] = REG_FIELD(ISP1763_DC_CHIPID_HIGH, 0, 15),
+ [DC_CHIP_ID_LOW] = REG_FIELD(ISP1763_DC_CHIPID_LOW, 0, 15),
+ [DC_SCRATCH] = REG_FIELD(ISP1763_DC_SCRATCH, 0, 15),
++ /* Make sure the array is sized properly during compilation */
++ [DC_FIELD_MAX] = {},
+ };
+
+ static const struct regmap_config isp1763_dc_regmap_conf = {
+diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
+index 152ad882657d7..e60425bbf5376 100644
+--- a/drivers/usb/serial/option.c
++++ b/drivers/usb/serial/option.c
+@@ -1137,6 +1137,8 @@ static const struct usb_device_id option_ids[] = {
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM12, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, 0x0620, 0xff, 0xff, 0x30) }, /* EM160R-GL */
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, 0x0620, 0xff, 0, 0) },
++ { USB_DEVICE_INTERFACE_CLASS(QUECTEL_VENDOR_ID, 0x0700, 0xff), /* BG95 */
++ .driver_info = RSVD(3) | ZLP },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0xff, 0x30) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0xff, 0x10),
+diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
+index 1d878d05a6584..3506c47e1eef0 100644
+--- a/drivers/usb/serial/pl2303.c
++++ b/drivers/usb/serial/pl2303.c
+@@ -421,6 +421,9 @@ static int pl2303_detect_type(struct usb_serial *serial)
+ bcdUSB = le16_to_cpu(desc->bcdUSB);
+
+ switch (bcdUSB) {
++ case 0x101:
++ /* USB 1.0.1? Let's assume they meant 1.1... */
++ fallthrough;
+ case 0x110:
+ switch (bcdDevice) {
+ case 0x300:
+diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
+index ddbe142af09ae..881f9864c437c 100644
+--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
++++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
+@@ -353,11 +353,14 @@ static void vdpasim_set_vq_ready(struct vdpa_device *vdpa, u16 idx, bool ready)
+ {
+ struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+ struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
++ bool old_ready;
+
+ spin_lock(&vdpasim->lock);
++ old_ready = vq->ready;
+ vq->ready = ready;
+- if (vq->ready)
++ if (vq->ready && !old_ready) {
+ vdpasim_queue_ready(vdpasim, idx);
++ }
+ spin_unlock(&vdpasim->lock);
+ }
+
+diff --git a/drivers/video/console/sticon.c b/drivers/video/console/sticon.c
+index 40496e9e9b438..f304163e87e99 100644
+--- a/drivers/video/console/sticon.c
++++ b/drivers/video/console/sticon.c
+@@ -46,6 +46,7 @@
+ #include <linux/slab.h>
+ #include <linux/font.h>
+ #include <linux/crc32.h>
++#include <linux/fb.h>
+
+ #include <asm/io.h>
+
+@@ -392,7 +393,9 @@ static int __init sticonsole_init(void)
+ for (i = 0; i < MAX_NR_CONSOLES; i++)
+ font_data[i] = STI_DEF_FONT;
+
+- pr_info("sticon: Initializing STI text console.\n");
++ pr_info("sticon: Initializing STI text console on %s at [%s]\n",
++ sticon_sti->sti_data->inq_outptr.dev_name,
++ sticon_sti->pa_path);
+ console_lock();
+ err = do_take_over_console(&sti_con, 0, MAX_NR_CONSOLES - 1,
+ PAGE0->mem_cons.cl_class != CL_DUPLEX);
+diff --git a/drivers/video/console/sticore.c b/drivers/video/console/sticore.c
+index f869b723494f1..6a947ff96d6eb 100644
+--- a/drivers/video/console/sticore.c
++++ b/drivers/video/console/sticore.c
+@@ -30,10 +30,11 @@
+ #include <asm/pdc.h>
+ #include <asm/cacheflush.h>
+ #include <asm/grfioctl.h>
++#include <asm/fb.h>
+
+ #include "../fbdev/sticore.h"
+
+-#define STI_DRIVERVERSION "Version 0.9b"
++#define STI_DRIVERVERSION "Version 0.9c"
+
+ static struct sti_struct *default_sti __read_mostly;
+
+@@ -502,7 +503,7 @@ sti_select_fbfont(struct sti_cooked_rom *cooked_rom, const char *fbfont_name)
+ if (!fbfont)
+ return NULL;
+
+- pr_info("STI selected %ux%u framebuffer font %s for sticon\n",
++ pr_info(" using %ux%u framebuffer font %s\n",
+ fbfont->width, fbfont->height, fbfont->name);
+
+ bpc = ((fbfont->width+7)/8) * fbfont->height;
+@@ -946,6 +947,7 @@ out_err:
+
+ static void sticore_check_for_default_sti(struct sti_struct *sti, char *path)
+ {
++ pr_info(" located at [%s]\n", sti->pa_path);
+ if (strcmp (path, default_sti_path) == 0)
+ default_sti = sti;
+ }
+@@ -957,7 +959,6 @@ static void sticore_check_for_default_sti(struct sti_struct *sti, char *path)
+ */
+ static int __init sticore_pa_init(struct parisc_device *dev)
+ {
+- char pa_path[21];
+ struct sti_struct *sti = NULL;
+ int hpa = dev->hpa.start;
+
+@@ -970,8 +971,8 @@ static int __init sticore_pa_init(struct parisc_device *dev)
+ if (!sti)
+ return 1;
+
+- print_pa_hwpath(dev, pa_path);
+- sticore_check_for_default_sti(sti, pa_path);
++ print_pa_hwpath(dev, sti->pa_path);
++ sticore_check_for_default_sti(sti, sti->pa_path);
+ return 0;
+ }
+
+@@ -1007,9 +1008,8 @@ static int sticore_pci_init(struct pci_dev *pd, const struct pci_device_id *ent)
+
+ sti = sti_try_rom_generic(rom_base, fb_base, pd);
+ if (sti) {
+- char pa_path[30];
+- print_pci_hwpath(pd, pa_path);
+- sticore_check_for_default_sti(sti, pa_path);
++ print_pci_hwpath(pd, sti->pa_path);
++ sticore_check_for_default_sti(sti, sti->pa_path);
+ }
+
+ if (!sti) {
+@@ -1127,6 +1127,22 @@ int sti_call(const struct sti_struct *sti, unsigned long func,
+ return ret;
+ }
+
++/* check if given fb_info is the primary device */
++int fb_is_primary_device(struct fb_info *info)
++{
++ struct sti_struct *sti;
++
++ sti = sti_get_rom(0);
++
++ /* if no built-in graphics card found, allow any fb driver as default */
++ if (!sti)
++ return true;
++
++ /* return true if it's the default built-in framebuffer driver */
++ return (sti->info == info);
++}
++EXPORT_SYMBOL(fb_is_primary_device);
++
+ MODULE_AUTHOR("Philipp Rumpf, Helge Deller, Thomas Bogendoerfer");
+ MODULE_DESCRIPTION("Core STI driver for HP's NGLE series graphics cards in HP PARISC machines");
+ MODULE_LICENSE("GPL v2");
+diff --git a/drivers/video/fbdev/amba-clcd.c b/drivers/video/fbdev/amba-clcd.c
+index 9ec969e136bfd..8080116aea844 100644
+--- a/drivers/video/fbdev/amba-clcd.c
++++ b/drivers/video/fbdev/amba-clcd.c
+@@ -758,12 +758,15 @@ static int clcdfb_of_vram_setup(struct clcd_fb *fb)
+ return -ENODEV;
+
+ fb->fb.screen_base = of_iomap(memory, 0);
+- if (!fb->fb.screen_base)
++ if (!fb->fb.screen_base) {
++ of_node_put(memory);
+ return -ENOMEM;
++ }
+
+ fb->fb.fix.smem_start = of_translate_address(memory,
+ of_get_address(memory, 0, &size, NULL));
+ fb->fb.fix.smem_len = size;
++ of_node_put(memory);
+
+ return 0;
+ }
+diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
+index 842c66b3e33d0..6aaf6d0abf399 100644
+--- a/drivers/video/fbdev/core/fb_defio.c
++++ b/drivers/video/fbdev/core/fb_defio.c
+@@ -59,7 +59,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
+ printk(KERN_ERR "no mapping available\n");
+
+ BUG_ON(!page->mapping);
+- INIT_LIST_HEAD(&page->lru);
+ page->index = vmf->pgoff;
+
+ vmf->page = page;
+@@ -213,6 +212,8 @@ static void fb_deferred_io_work(struct work_struct *work)
+ void fb_deferred_io_init(struct fb_info *info)
+ {
+ struct fb_deferred_io *fbdefio = info->fbdefio;
++ struct page *page;
++ unsigned int i;
+
+ BUG_ON(!fbdefio);
+ mutex_init(&fbdefio->lock);
+@@ -220,6 +221,12 @@ void fb_deferred_io_init(struct fb_info *info)
+ INIT_LIST_HEAD(&fbdefio->pagelist);
+ if (fbdefio->delay == 0) /* set a default of 1 s */
+ fbdefio->delay = HZ;
++
++ /* initialize all the page lists one time */
++ for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
++ page = fb_deferred_io_page(info, i);
++ INIT_LIST_HEAD(&page->lru);
++ }
+ }
+ EXPORT_SYMBOL_GPL(fb_deferred_io_init);
+
+diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
+index 2fc1b80a26ad9..9a8ae6fa6ecbb 100644
+--- a/drivers/video/fbdev/core/fbcon.c
++++ b/drivers/video/fbdev/core/fbcon.c
+@@ -3265,6 +3265,9 @@ static void fbcon_register_existing_fbs(struct work_struct *work)
+
+ console_lock();
+
++ deferred_takeover = false;
++ logo_shown = FBCON_LOGO_DONTSHOW;
++
+ for_each_registered_fb(i)
+ fbcon_fb_registered(registered_fb[i]);
+
+@@ -3282,8 +3285,6 @@ static int fbcon_output_notifier(struct notifier_block *nb,
+ pr_info("fbcon: Taking over console\n");
+
+ dummycon_unregister_output_notifier(&fbcon_output_nb);
+- deferred_takeover = false;
+- logo_shown = FBCON_LOGO_DONTSHOW;
+
+ /* We may get called in atomic context */
+ schedule_work(&fbcon_deferred_takeover_work);
+diff --git a/drivers/video/fbdev/sticore.h b/drivers/video/fbdev/sticore.h
+index c338f7848ae2b..0ebdd28a0b813 100644
+--- a/drivers/video/fbdev/sticore.h
++++ b/drivers/video/fbdev/sticore.h
+@@ -370,6 +370,9 @@ struct sti_struct {
+
+ /* pointer to all internal data */
+ struct sti_all_data *sti_data;
++
++ /* pa_path of this device */
++ char pa_path[24];
+ };
+
+
+diff --git a/drivers/video/fbdev/stifb.c b/drivers/video/fbdev/stifb.c
+index bebb2eea6448b..38a861e22c339 100644
+--- a/drivers/video/fbdev/stifb.c
++++ b/drivers/video/fbdev/stifb.c
+@@ -1358,11 +1358,11 @@ static int __init stifb_init_fb(struct sti_struct *sti, int bpp_pref)
+ goto out_err3;
+ }
+
++ /* save for primary gfx device detection & unregister_framebuffer() */
++ sti->info = info;
+ if (register_framebuffer(&fb->info) < 0)
+ goto out_err4;
+
+- sti->info = info; /* save for unregister_framebuffer() */
+-
+ fb_info(&fb->info, "%s %dx%d-%d frame buffer device, %s, id: %04x, mmio: 0x%04lx\n",
+ fix->id,
+ var->xres,
+diff --git a/drivers/video/fbdev/vesafb.c b/drivers/video/fbdev/vesafb.c
+index e25e8de5ff672..929d4775cb4bc 100644
+--- a/drivers/video/fbdev/vesafb.c
++++ b/drivers/video/fbdev/vesafb.c
+@@ -490,11 +490,12 @@ static int vesafb_remove(struct platform_device *pdev)
+ {
+ struct fb_info *info = platform_get_drvdata(pdev);
+
+- /* vesafb_destroy takes care of info cleanup */
+- unregister_framebuffer(info);
+ if (((struct vesafb_par *)(info->par))->region)
+ release_region(0x3c0, 32);
+
++ /* vesafb_destroy takes care of info cleanup */
++ unregister_framebuffer(info);
++
+ return 0;
+ }
+
+diff --git a/fs/afs/misc.c b/fs/afs/misc.c
+index 1d1a8debe4723..933e67fcdab1a 100644
+--- a/fs/afs/misc.c
++++ b/fs/afs/misc.c
+@@ -163,8 +163,11 @@ void afs_prioritise_error(struct afs_error *e, int error, u32 abort_code)
+ return;
+
+ case -ECONNABORTED:
++ error = afs_abort_to_error(abort_code);
++ fallthrough;
++ case -ENETRESET: /* Responded, but we seem to have changed address */
+ e->responded = true;
+- e->error = afs_abort_to_error(abort_code);
++ e->error = error;
+ return;
+ }
+ }
+diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c
+index 79e1a5f6701be..a840c3588ebbb 100644
+--- a/fs/afs/rotate.c
++++ b/fs/afs/rotate.c
+@@ -292,6 +292,10 @@ bool afs_select_fileserver(struct afs_operation *op)
+ op->error = error;
+ goto iterate_address;
+
++ case -ENETRESET:
++ pr_warn("kAFS: Peer reset %s (op=%x)\n",
++ op->type ? op->type->name : "???", op->debug_id);
++ fallthrough;
+ case -ECONNRESET:
+ _debug("call reset");
+ op->error = error;
+diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
+index 23a1a92d64bb5..a5434f3e57c68 100644
+--- a/fs/afs/rxrpc.c
++++ b/fs/afs/rxrpc.c
+@@ -537,6 +537,8 @@ static void afs_deliver_to_call(struct afs_call *call)
+ case -ENODATA:
+ case -EBADMSG:
+ case -EMSGSIZE:
++ case -ENOMEM:
++ case -EFAULT:
+ abort_code = RXGEN_CC_UNMARSHAL;
+ if (state != AFS_CALL_CL_AWAIT_REPLY)
+ abort_code = RXGEN_SS_UNMARSHAL;
+@@ -544,7 +546,7 @@ static void afs_deliver_to_call(struct afs_call *call)
+ abort_code, ret, "KUM");
+ goto local_abort;
+ default:
+- abort_code = RX_USER_ABORT;
++ abort_code = RX_CALL_DEAD;
+ rxrpc_kernel_abort_call(call->net->socket, call->rxcall,
+ abort_code, ret, "KER");
+ goto local_abort;
+@@ -836,7 +838,7 @@ void afs_send_empty_reply(struct afs_call *call)
+ case -ENOMEM:
+ _debug("oom");
+ rxrpc_kernel_abort_call(net->socket, call->rxcall,
+- RX_USER_ABORT, -ENOMEM, "KOO");
++ RXGEN_SS_MARSHAL, -ENOMEM, "KOO");
+ fallthrough;
+ default:
+ _leave(" [error]");
+@@ -878,7 +880,7 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
+ if (n == -ENOMEM) {
+ _debug("oom");
+ rxrpc_kernel_abort_call(net->socket, call->rxcall,
+- RX_USER_ABORT, -ENOMEM, "KOO");
++ RXGEN_SS_MARSHAL, -ENOMEM, "KOO");
+ }
+ _leave(" [error]");
+ }
+diff --git a/fs/afs/write.c b/fs/afs/write.c
+index 4763132ca57e7..c1bc52ac7de11 100644
+--- a/fs/afs/write.c
++++ b/fs/afs/write.c
+@@ -636,6 +636,7 @@ static ssize_t afs_write_back_from_locked_folio(struct address_space *mapping,
+ case -EKEYEXPIRED:
+ case -EKEYREJECTED:
+ case -EKEYREVOKED:
++ case -ENETRESET:
+ afs_redirty_pages(wbc, mapping, start, len);
+ mapping_set_error(mapping, ret);
+ break;
+diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
+index 6268981500112..dca0b6875f9c3 100644
+--- a/fs/binfmt_flat.c
++++ b/fs/binfmt_flat.c
+@@ -440,6 +440,30 @@ static void old_reloc(unsigned long rl)
+
+ /****************************************************************************/
+
++static inline u32 __user *skip_got_header(u32 __user *rp)
++{
++ if (IS_ENABLED(CONFIG_RISCV)) {
++ /*
++ * RISC-V has a 16 byte GOT PLT header for elf64-riscv
++ * and 8 byte GOT PLT header for elf32-riscv.
++ * Skip the whole GOT PLT header, since it is reserved
++ * for the dynamic linker (ld.so).
++ */
++ u32 rp_val0, rp_val1;
++
++ if (get_user(rp_val0, rp))
++ return rp;
++ if (get_user(rp_val1, rp + 1))
++ return rp;
++
++ if (rp_val0 == 0xffffffff && rp_val1 == 0xffffffff)
++ rp += 4;
++ else if (rp_val0 == 0xffffffff)
++ rp += 2;
++ }
++ return rp;
++}
++
+ static int load_flat_file(struct linux_binprm *bprm,
+ struct lib_info *libinfo, int id, unsigned long *extra_stack)
+ {
+@@ -789,7 +813,8 @@ static int load_flat_file(struct linux_binprm *bprm,
+ * image.
+ */
+ if (flags & FLAT_FLAG_GOTPIC) {
+- for (rp = (u32 __user *)datapos; ; rp++) {
++ rp = skip_got_header((u32 __user *) datapos);
++ for (; ; rp++) {
+ u32 addr, rp_val;
+ if (get_user(rp_val, rp))
+ return -EFAULT;
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index 0dd6de9941999..667b7025d5030 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -1367,6 +1367,14 @@ void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info)
+ goto next;
+ }
+
++ ret = btrfs_zone_finish(block_group);
++ if (ret < 0) {
++ btrfs_dec_block_group_ro(block_group);
++ if (ret == -EAGAIN)
++ ret = 0;
++ goto next;
++ }
++
+ /*
+ * Want to do this before we do anything else so we can recover
+ * properly if we fail to join the transaction.
+diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
+index e8308f2ad07d1..19db5693175fe 100644
+--- a/fs/btrfs/block-group.h
++++ b/fs/btrfs/block-group.h
+@@ -212,6 +212,8 @@ struct btrfs_block_group {
+ u64 meta_write_pointer;
+ struct map_lookup *physical_map;
+ struct list_head active_bg_list;
++ struct work_struct zone_finish_work;
++ struct extent_buffer *last_eb;
+ };
+
+ static inline u64 btrfs_block_group_end(struct btrfs_block_group *block_group)
+diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
+index 31c3f592e5875..30d0bbfdb3bca 100644
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -3611,7 +3611,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
+ ~BTRFS_FEATURE_INCOMPAT_SUPP;
+ if (features) {
+ btrfs_err(fs_info,
+- "cannot mount because of unsupported optional features (%llx)",
++ "cannot mount because of unsupported optional features (0x%llx)",
+ features);
+ err = -EINVAL;
+ goto fail_alloc;
+@@ -3649,7 +3649,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
+ ~BTRFS_FEATURE_COMPAT_RO_SUPP;
+ if (!sb_rdonly(sb) && features) {
+ btrfs_err(fs_info,
+- "cannot mount read-write because of unsupported optional features (%llx)",
++ "cannot mount read-write because of unsupported optional features (0x%llx)",
+ features);
+ err = -EINVAL;
+ goto fail_alloc;
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index 33c19f51d79b0..a23a42ba88cae 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -3743,8 +3743,12 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
+ this_bio_flag,
+ force_bio_submit);
+ if (ret) {
+- unlock_extent(tree, cur, cur + iosize - 1);
+- end_page_read(page, false, cur, iosize);
++ /*
++ * We have to unlock the remaining range, or the page
++ * will never be unlocked.
++ */
++ unlock_extent(tree, cur, end);
++ end_page_read(page, false, cur, end + 1 - cur);
+ goto out;
+ }
+ cur = cur + iosize;
+@@ -3920,10 +3924,12 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
+ u64 extent_offset;
+ u64 block_start;
+ struct extent_map *em;
++ int saved_ret = 0;
+ int ret = 0;
+ int nr = 0;
+ u32 opf = REQ_OP_WRITE;
+ const unsigned int write_flags = wbc_to_write_flags(wbc);
++ bool has_error = false;
+ bool compressed;
+
+ ret = btrfs_writepage_cow_fixup(page);
+@@ -3973,6 +3979,9 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
+ if (IS_ERR(em)) {
+ btrfs_page_set_error(fs_info, page, cur, end - cur + 1);
+ ret = PTR_ERR_OR_ZERO(em);
++ has_error = true;
++ if (!saved_ret)
++ saved_ret = ret;
+ break;
+ }
+
+@@ -4036,6 +4045,10 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
+ end_bio_extent_writepage,
+ 0, 0, false);
+ if (ret) {
++ has_error = true;
++ if (!saved_ret)
++ saved_ret = ret;
++
+ btrfs_page_set_error(fs_info, page, cur, iosize);
+ if (PageWriteback(page))
+ btrfs_page_clear_writeback(fs_info, page, cur,
+@@ -4049,8 +4062,10 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
+ * If we finish without problem, we should not only clear page dirty,
+ * but also empty subpage dirty bits
+ */
+- if (!ret)
++ if (!has_error)
+ btrfs_page_assert_not_dirty(fs_info, page);
++ else
++ ret = saved_ret;
+ *nr_ret = nr;
+ return ret;
+ }
+@@ -4181,9 +4196,6 @@ void wait_on_extent_buffer_writeback(struct extent_buffer *eb)
+
+ static void end_extent_buffer_writeback(struct extent_buffer *eb)
+ {
+- if (test_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags))
+- btrfs_zone_finish_endio(eb->fs_info, eb->start, eb->len);
+-
+ clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
+ smp_mb__after_atomic();
+ wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
+@@ -4803,8 +4815,7 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc,
+ /*
+ * Implies write in zoned mode. Mark the last eb in a block group.
+ */
+- if (cache->seq_zone && eb->start + eb->len == cache->zone_capacity)
+- set_bit(EXTENT_BUFFER_ZONE_FINISH, &eb->bflags);
++ btrfs_schedule_zone_finish_bg(cache, eb);
+ btrfs_put_block_group(cache);
+ }
+ ret = write_one_eb(eb, wbc, epd);
+diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
+index 151e9da5da2dc..c37a3e5f5eb98 100644
+--- a/fs/btrfs/extent_io.h
++++ b/fs/btrfs/extent_io.h
+@@ -32,7 +32,6 @@ enum {
+ /* write IO error */
+ EXTENT_BUFFER_WRITE_ERR,
+ EXTENT_BUFFER_NO_CHECK,
+- EXTENT_BUFFER_ZONE_FINISH,
+ };
+
+ /* these are flags for __process_pages_contig */
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index 95c499b8424e7..861b9748a0d9f 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -64,6 +64,8 @@ struct btrfs_iget_args {
+ struct btrfs_dio_data {
+ ssize_t submitted;
+ struct extent_changeset *data_reserved;
++ bool data_space_reserved;
++ bool nocow_done;
+ };
+
+ struct btrfs_rename_ctx {
+@@ -7489,15 +7491,25 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
+ ret = PTR_ERR(em2);
+ goto out;
+ }
++
++ dio_data->nocow_done = true;
+ } else {
+ /* Our caller expects us to free the input extent map. */
+ free_extent_map(em);
+ *map = NULL;
+
+- /* We have to COW, so need to reserve metadata and data space. */
+- ret = btrfs_delalloc_reserve_space(BTRFS_I(inode),
+- &dio_data->data_reserved,
+- start, len);
++ /*
++ * If we could not allocate data space before locking the file
++ * range and we can't do a NOCOW write, then we have to fail.
++ */
++ if (!dio_data->data_space_reserved)
++ return -ENOSPC;
++
++ /*
++ * We have to COW and we have already reserved data space before,
++ * so now we reserve only metadata.
++ */
++ ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), len, len);
+ if (ret < 0)
+ goto out;
+ space_reserved = true;
+@@ -7510,10 +7522,8 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
+ *map = em;
+ len = min(len, em->len - (start - em->start));
+ if (len < prev_len)
+- btrfs_delalloc_release_space(BTRFS_I(inode),
+- dio_data->data_reserved,
+- start + len, prev_len - len,
+- true);
++ btrfs_delalloc_release_metadata(BTRFS_I(inode),
++ prev_len - len, true);
+ }
+
+ /*
+@@ -7531,15 +7541,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
+ out:
+ if (ret && space_reserved) {
+ btrfs_delalloc_release_extents(BTRFS_I(inode), len);
+- if (can_nocow) {
+- btrfs_delalloc_release_metadata(BTRFS_I(inode), len, true);
+- } else {
+- btrfs_delalloc_release_space(BTRFS_I(inode),
+- dio_data->data_reserved,
+- start, len, true);
+- extent_changeset_free(dio_data->data_reserved);
+- dio_data->data_reserved = NULL;
+- }
++ btrfs_delalloc_release_metadata(BTRFS_I(inode), len, true);
+ }
+ return ret;
+ }
+@@ -7556,6 +7558,7 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
+ const bool write = !!(flags & IOMAP_WRITE);
+ int ret = 0;
+ u64 len = length;
++ const u64 data_alloc_len = length;
+ bool unlock_extents = false;
+
+ if (!write)
+@@ -7584,6 +7587,25 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
+
+ iomap->private = dio_data;
+
++ /*
++ * We always try to allocate data space and must do it before locking
++ * the file range, to avoid deadlocks with concurrent writes to the same
++ * range if the range has several extents and the writes don't expand the
++ * current i_size (the inode lock is taken in shared mode). If we fail to
++ * allocate data space here we continue and later, after locking the
++ * file range, we fail with ENOSPC only if we figure out we can not do a
++ * NOCOW write.
++ */
++ if (write && !(flags & IOMAP_NOWAIT)) {
++ ret = btrfs_check_data_free_space(BTRFS_I(inode),
++ &dio_data->data_reserved,
++ start, data_alloc_len);
++ if (!ret)
++ dio_data->data_space_reserved = true;
++ else if (ret && !(BTRFS_I(inode)->flags &
++ (BTRFS_INODE_NODATACOW | BTRFS_INODE_PREALLOC)))
++ goto err;
++ }
+
+ /*
+ * If this errors out it's because we couldn't invalidate pagecache for
+@@ -7658,6 +7680,24 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
+ unlock_extents = true;
+ /* Recalc len in case the new em is smaller than requested */
+ len = min(len, em->len - (start - em->start));
++ if (dio_data->data_space_reserved) {
++ u64 release_offset;
++ u64 release_len = 0;
++
++ if (dio_data->nocow_done) {
++ release_offset = start;
++ release_len = data_alloc_len;
++ } else if (len < data_alloc_len) {
++ release_offset = start + len;
++ release_len = data_alloc_len - len;
++ }
++
++ if (release_len > 0)
++ btrfs_free_reserved_data_space(BTRFS_I(inode),
++ dio_data->data_reserved,
++ release_offset,
++ release_len);
++ }
+ } else {
+ /*
+ * We need to unlock only the end area that we aren't using.
+@@ -7702,6 +7742,13 @@ unlock_err:
+ unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
+ &cached_state);
+ err:
++ if (dio_data->data_space_reserved) {
++ btrfs_free_reserved_data_space(BTRFS_I(inode),
++ dio_data->data_reserved,
++ start, data_alloc_len);
++ extent_changeset_free(dio_data->data_reserved);
++ }
++
+ kfree(dio_data);
+
+ return ret;
+diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
+index be6c24577dbe0..777801902511e 100644
+--- a/fs/btrfs/ioctl.c
++++ b/fs/btrfs/ioctl.c
+@@ -561,7 +561,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ struct timespec64 cur_time = current_time(dir);
+ struct inode *inode;
+ int ret;
+- dev_t anon_dev = 0;
++ dev_t anon_dev;
+ u64 objectid;
+ u64 index = 0;
+
+@@ -571,11 +571,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+
+ ret = btrfs_get_free_objectid(fs_info->tree_root, &objectid);
+ if (ret)
+- goto fail_free;
+-
+- ret = get_anon_bdev(&anon_dev);
+- if (ret < 0)
+- goto fail_free;
++ goto out_root_item;
+
+ /*
+ * Don't create subvolume whose level is not zero. Or qgroup will be
+@@ -583,9 +579,13 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ */
+ if (btrfs_qgroup_level(objectid)) {
+ ret = -ENOSPC;
+- goto fail_free;
++ goto out_root_item;
+ }
+
++ ret = get_anon_bdev(&anon_dev);
++ if (ret < 0)
++ goto out_root_item;
++
+ btrfs_init_block_rsv(&block_rsv, BTRFS_BLOCK_RSV_TEMP);
+ /*
+ * The same as the snapshot creation, please see the comment
+@@ -593,26 +593,26 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ */
+ ret = btrfs_subvolume_reserve_metadata(root, &block_rsv, 8, false);
+ if (ret)
+- goto fail_free;
++ goto out_anon_dev;
+
+ trans = btrfs_start_transaction(root, 0);
+ if (IS_ERR(trans)) {
+ ret = PTR_ERR(trans);
+ btrfs_subvolume_release_metadata(root, &block_rsv);
+- goto fail_free;
++ goto out_anon_dev;
+ }
+ trans->block_rsv = &block_rsv;
+ trans->bytes_reserved = block_rsv.size;
+
+ ret = btrfs_qgroup_inherit(trans, 0, objectid, inherit);
+ if (ret)
+- goto fail;
++ goto out;
+
+ leaf = btrfs_alloc_tree_block(trans, root, 0, objectid, NULL, 0, 0, 0,
+ BTRFS_NESTING_NORMAL);
+ if (IS_ERR(leaf)) {
+ ret = PTR_ERR(leaf);
+- goto fail;
++ goto out;
+ }
+
+ btrfs_mark_buffer_dirty(leaf);
+@@ -667,7 +667,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ btrfs_tree_unlock(leaf);
+ btrfs_free_tree_block(trans, objectid, leaf, 0, 1);
+ free_extent_buffer(leaf);
+- goto fail;
++ goto out;
+ }
+
+ free_extent_buffer(leaf);
+@@ -676,19 +676,18 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ key.offset = (u64)-1;
+ new_root = btrfs_get_new_fs_root(fs_info, objectid, anon_dev);
+ if (IS_ERR(new_root)) {
+- free_anon_bdev(anon_dev);
+ ret = PTR_ERR(new_root);
+ btrfs_abort_transaction(trans, ret);
+- goto fail;
++ goto out;
+ }
+- /* Freeing will be done in btrfs_put_root() of new_root */
++ /* anon_dev is owned by new_root now. */
+ anon_dev = 0;
+
+ ret = btrfs_record_root_in_trans(trans, new_root);
+ if (ret) {
+ btrfs_put_root(new_root);
+ btrfs_abort_transaction(trans, ret);
+- goto fail;
++ goto out;
+ }
+
+ ret = btrfs_create_subvol_root(trans, new_root, root, mnt_userns);
+@@ -696,7 +695,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ if (ret) {
+ /* We potentially lose an unused inode item here */
+ btrfs_abort_transaction(trans, ret);
+- goto fail;
++ goto out;
+ }
+
+ /*
+@@ -705,28 +704,28 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ ret = btrfs_set_inode_index(BTRFS_I(dir), &index);
+ if (ret) {
+ btrfs_abort_transaction(trans, ret);
+- goto fail;
++ goto out;
+ }
+
+ ret = btrfs_insert_dir_item(trans, name, namelen, BTRFS_I(dir), &key,
+ BTRFS_FT_DIR, index);
+ if (ret) {
+ btrfs_abort_transaction(trans, ret);
+- goto fail;
++ goto out;
+ }
+
+ btrfs_i_size_write(BTRFS_I(dir), dir->i_size + namelen * 2);
+ ret = btrfs_update_inode(trans, root, BTRFS_I(dir));
+ if (ret) {
+ btrfs_abort_transaction(trans, ret);
+- goto fail;
++ goto out;
+ }
+
+ ret = btrfs_add_root_ref(trans, objectid, root->root_key.objectid,
+ btrfs_ino(BTRFS_I(dir)), index, name, namelen);
+ if (ret) {
+ btrfs_abort_transaction(trans, ret);
+- goto fail;
++ goto out;
+ }
+
+ ret = btrfs_uuid_tree_add(trans, root_item->uuid,
+@@ -734,8 +733,7 @@ static noinline int create_subvol(struct user_namespace *mnt_userns,
+ if (ret)
+ btrfs_abort_transaction(trans, ret);
+
+-fail:
+- kfree(root_item);
++out:
+ trans->block_rsv = NULL;
+ trans->bytes_reserved = 0;
+ btrfs_subvolume_release_metadata(root, &block_rsv);
+@@ -751,11 +749,10 @@ fail:
+ return PTR_ERR(inode);
+ d_instantiate(dentry, inode);
+ }
+- return ret;
+-
+-fail_free:
++out_anon_dev:
+ if (anon_dev)
+ free_anon_bdev(anon_dev);
++out_root_item:
+ kfree(root_item);
+ return ret;
+ }
+diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
+index a8cc736731fdb..659575526e9fe 100644
+--- a/fs/btrfs/volumes.c
++++ b/fs/btrfs/volumes.c
+@@ -7671,12 +7671,12 @@ int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info)
+ * do another round of validation checks.
+ */
+ if (total_dev != fs_info->fs_devices->total_devices) {
+- btrfs_err(fs_info,
+- "super_num_devices %llu mismatch with num_devices %llu found here",
++ btrfs_warn(fs_info,
++"super block num_devices %llu mismatch with DEV_ITEM count %llu, will be repaired on next transaction commit",
+ btrfs_super_num_devices(fs_info->super_copy),
+ total_dev);
+- ret = -EINVAL;
+- goto error;
++ fs_info->fs_devices->total_devices = total_dev;
++ btrfs_set_super_num_devices(fs_info->super_copy, total_dev);
+ }
+ if (btrfs_super_total_bytes(fs_info->super_copy) <
+ fs_info->fs_devices->total_rw_bytes) {
+diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
+index d31b0eda210f1..b5236f5afca7c 100644
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -1896,7 +1896,7 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
+ /* Check if we have unwritten allocated space */
+ if ((block_group->flags &
+ (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM)) &&
+- block_group->alloc_offset > block_group->meta_write_pointer) {
++ block_group->start + block_group->alloc_offset > block_group->meta_write_pointer) {
+ spin_unlock(&block_group->lock);
+ return -EAGAIN;
+ }
+@@ -2000,6 +2000,7 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
+ struct btrfs_block_group *block_group;
+ struct map_lookup *map;
+ struct btrfs_device *device;
++ u64 min_alloc_bytes;
+ u64 physical;
+
+ if (!btrfs_is_zoned(fs_info))
+@@ -2008,7 +2009,15 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
+ block_group = btrfs_lookup_block_group(fs_info, logical);
+ ASSERT(block_group);
+
+- if (logical + length < block_group->start + block_group->zone_capacity)
++ /* No MIXED_BG on zoned btrfs. */
++ if (block_group->flags & BTRFS_BLOCK_GROUP_DATA)
++ min_alloc_bytes = fs_info->sectorsize;
++ else
++ min_alloc_bytes = fs_info->nodesize;
++
++ /* Bail out if we can allocate more data from this block group. */
++ if (logical + length + min_alloc_bytes <=
++ block_group->start + block_group->zone_capacity)
+ goto out;
+
+ spin_lock(&block_group->lock);
+@@ -2046,6 +2055,37 @@ out:
+ btrfs_put_block_group(block_group);
+ }
+
++static void btrfs_zone_finish_endio_workfn(struct work_struct *work)
++{
++ struct btrfs_block_group *bg =
++ container_of(work, struct btrfs_block_group, zone_finish_work);
++
++ wait_on_extent_buffer_writeback(bg->last_eb);
++ free_extent_buffer(bg->last_eb);
++ btrfs_zone_finish_endio(bg->fs_info, bg->start, bg->length);
++ btrfs_put_block_group(bg);
++}
++
++void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
++ struct extent_buffer *eb)
++{
++ if (!bg->seq_zone || eb->start + eb->len * 2 <= bg->start + bg->zone_capacity)
++ return;
++
++ if (WARN_ON(bg->zone_finish_work.func == btrfs_zone_finish_endio_workfn)) {
++ btrfs_err(bg->fs_info, "double scheduling of bg %llu zone finishing",
++ bg->start);
++ return;
++ }
++
++ /* For the work */
++ btrfs_get_block_group(bg);
++ atomic_inc(&eb->refs);
++ bg->last_eb = eb;
++ INIT_WORK(&bg->zone_finish_work, btrfs_zone_finish_endio_workfn);
++ queue_work(system_unbound_wq, &bg->zone_finish_work);
++}
++
+ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg)
+ {
+ struct btrfs_fs_info *fs_info = bg->fs_info;
+diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
+index 6dee76248cb4d..2d898970aec5f 100644
+--- a/fs/btrfs/zoned.h
++++ b/fs/btrfs/zoned.h
+@@ -76,6 +76,8 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group);
+ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags);
+ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical,
+ u64 length);
++void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
++ struct extent_buffer *eb);
+ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
+ void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
+ #else /* CONFIG_BLK_DEV_ZONED */
+@@ -233,6 +235,9 @@ static inline bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices,
+ static inline void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info,
+ u64 logical, u64 length) { }
+
++static inline void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
++ struct extent_buffer *eb) { }
++
+ static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
+
+ static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
+diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
+index 00c3de177dd66..1bd3e1bb0fdf0 100644
+--- a/fs/ceph/mds_client.c
++++ b/fs/ceph/mds_client.c
+@@ -3375,13 +3375,17 @@ static void handle_session(struct ceph_mds_session *session,
+ }
+
+ if (msg_version >= 5) {
+- u32 flags;
+- /* version >= 4, struct_v, struct_cv, len, metric_spec */
+- ceph_decode_skip_n(&p, end, 2 + sizeof(u32) * 2, bad);
++ u32 flags, len;
++
++ /* version >= 4 */
++ ceph_decode_skip_16(&p, end, bad); /* struct_v, struct_cv */
++ ceph_decode_32_safe(&p, end, len, bad); /* len */
++ ceph_decode_skip_n(&p, end, len, bad); /* metric_spec */
++
+ /* version >= 5, flags */
+- ceph_decode_32_safe(&p, end, flags, bad);
++ ceph_decode_32_safe(&p, end, flags, bad);
+ if (flags & CEPH_SESSION_BLOCKLISTED) {
+- pr_warn("mds%d session blocklisted\n", session->s_mds);
++ pr_warn("mds%d session blocklisted\n", session->s_mds);
+ blocklisted = true;
+ }
+ }
+diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
+index 2b1a1c029c75e..6d150bb87aaf8 100644
+--- a/fs/cifs/cifsfs.c
++++ b/fs/cifs/cifsfs.c
+@@ -836,7 +836,7 @@ cifs_smb3_do_mount(struct file_system_type *fs_type,
+ int flags, struct smb3_fs_context *old_ctx)
+ {
+ int rc;
+- struct super_block *sb;
++ struct super_block *sb = NULL;
+ struct cifs_sb_info *cifs_sb = NULL;
+ struct cifs_mnt_data mnt_data;
+ struct dentry *root;
+@@ -932,9 +932,11 @@ out_super:
+ return root;
+ out:
+ if (cifs_sb) {
+- kfree(cifs_sb->prepath);
+- smb3_cleanup_fs_context(cifs_sb->ctx);
+- kfree(cifs_sb);
++ if (!sb || IS_ERR(sb)) { /* otherwise kill_sb will handle */
++ kfree(cifs_sb->prepath);
++ smb3_cleanup_fs_context(cifs_sb->ctx);
++ kfree(cifs_sb);
++ }
+ }
+ return root;
+ }
+diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
+index 8de977c359b11..5024b6792dab6 100644
+--- a/fs/cifs/cifsglob.h
++++ b/fs/cifs/cifsglob.h
+@@ -944,7 +944,7 @@ struct cifs_ses {
+ and after mount option parsing we fill it */
+ char *domainName;
+ char *password;
+- char *workstation_name;
++ char workstation_name[CIFS_MAX_WORKSTATION_LEN];
+ struct session_key auth_key;
+ struct ntlmssp_auth *ntlmssp; /* ciphertext, flags, server challenge */
+ enum securityEnum sectype; /* what security flavor was specified? */
+@@ -1979,4 +1979,17 @@ static inline bool cifs_is_referral_server(struct cifs_tcon *tcon,
+ return is_tcon_dfs(tcon) || (ref && (ref->flags & DFSREF_REFERRAL_SERVER));
+ }
+
++static inline size_t ntlmssp_workstation_name_size(const struct cifs_ses *ses)
++{
++ if (WARN_ON_ONCE(!ses || !ses->server))
++ return 0;
++ /*
++ * Make workstation name no more than 15 chars when using insecure dialects as some legacy
++ * servers do require it during NTLMSSP.
++ */
++ if (ses->server->dialect <= SMB20_PROT_ID)
++ return min_t(size_t, sizeof(ses->workstation_name), RFC1001_NAME_LEN_WITH_NULL);
++ return sizeof(ses->workstation_name);
++}
++
+ #endif /* _CIFS_GLOB_H */
+diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
+index 42e14f408856d..aa2d4c49e2a5b 100644
+--- a/fs/cifs/connect.c
++++ b/fs/cifs/connect.c
+@@ -2037,18 +2037,7 @@ cifs_set_cifscreds(struct smb3_fs_context *ctx, struct cifs_ses *ses)
+ }
+ }
+
+- ctx->workstation_name = kstrdup(ses->workstation_name, GFP_KERNEL);
+- if (!ctx->workstation_name) {
+- cifs_dbg(FYI, "Unable to allocate memory for workstation_name\n");
+- rc = -ENOMEM;
+- kfree(ctx->username);
+- ctx->username = NULL;
+- kfree_sensitive(ctx->password);
+- ctx->password = NULL;
+- kfree(ctx->domainname);
+- ctx->domainname = NULL;
+- goto out_key_put;
+- }
++ strscpy(ctx->workstation_name, ses->workstation_name, sizeof(ctx->workstation_name));
+
+ out_key_put:
+ up_read(&key->sem);
+@@ -2157,12 +2146,9 @@ cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
+ if (!ses->domainName)
+ goto get_ses_fail;
+ }
+- if (ctx->workstation_name) {
+- ses->workstation_name = kstrdup(ctx->workstation_name,
+- GFP_KERNEL);
+- if (!ses->workstation_name)
+- goto get_ses_fail;
+- }
++
++ strscpy(ses->workstation_name, ctx->workstation_name, sizeof(ses->workstation_name));
++
+ if (ctx->domainauto)
+ ses->domainAuto = ctx->domainauto;
+ ses->cred_uid = ctx->cred_uid;
+@@ -3420,8 +3406,9 @@ cifs_are_all_path_components_accessible(struct TCP_Server_Info *server,
+ }
+
+ /*
+- * Check if path is remote (e.g. a DFS share). Return -EREMOTE if it is,
+- * otherwise 0.
++ * Check if path is remote (i.e. a DFS share).
++ *
++ * Return -EREMOTE if it is, otherwise 0 or -errno.
+ */
+ static int is_path_remote(struct mount_ctx *mnt_ctx)
+ {
+@@ -3432,6 +3419,7 @@ static int is_path_remote(struct mount_ctx *mnt_ctx)
+ struct cifs_tcon *tcon = mnt_ctx->tcon;
+ struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+ char *full_path;
++ bool nodfs = cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS;
+
+ if (!server->ops->is_path_accessible)
+ return -EOPNOTSUPP;
+@@ -3449,14 +3437,20 @@ static int is_path_remote(struct mount_ctx *mnt_ctx)
+ rc = server->ops->is_path_accessible(xid, tcon, cifs_sb,
+ full_path);
+ #ifdef CONFIG_CIFS_DFS_UPCALL
++ if (nodfs) {
++ if (rc == -EREMOTE)
++ rc = -EOPNOTSUPP;
++ goto out;
++ }
++
++ /* path *might* exist with non-ASCII characters in DFS root
++ * try again with full path (only if nodfs is not set) */
+ if (rc == -ENOENT && is_tcon_dfs(tcon))
+ rc = cifs_dfs_query_info_nonascii_quirk(xid, tcon, cifs_sb,
+ full_path);
+ #endif
+- if (rc != 0 && rc != -EREMOTE) {
+- kfree(full_path);
+- return rc;
+- }
++ if (rc != 0 && rc != -EREMOTE)
++ goto out;
+
+ if (rc != -EREMOTE) {
+ rc = cifs_are_all_path_components_accessible(server, xid, tcon,
+@@ -3468,6 +3462,7 @@ static int is_path_remote(struct mount_ctx *mnt_ctx)
+ }
+ }
+
++out:
+ kfree(full_path);
+ return rc;
+ }
+@@ -3703,6 +3698,7 @@ int cifs_mount(struct cifs_sb_info *cifs_sb, struct smb3_fs_context *ctx)
+ if (!isdfs)
+ goto out;
+
++ /* proceed as DFS mount */
+ uuid_gen(&mnt_ctx.mount_id);
+ rc = connect_dfs_root(&mnt_ctx, &tl);
+ dfs_cache_free_tgts(&tl);
+@@ -3960,7 +3956,7 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
+ if (rc == 0) {
+ spin_lock(&cifs_tcp_ses_lock);
+ if (server->tcpStatus == CifsInNegotiate)
+- server->tcpStatus = CifsNeedSessSetup;
++ server->tcpStatus = CifsGood;
+ else
+ rc = -EHOSTDOWN;
+ spin_unlock(&cifs_tcp_ses_lock);
+@@ -3983,19 +3979,18 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
+ bool is_binding = false;
+
+ /* only send once per connect */
++ spin_lock(&ses->chan_lock);
++ is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
++ spin_unlock(&ses->chan_lock);
++
+ spin_lock(&cifs_tcp_ses_lock);
+- if ((server->tcpStatus != CifsNeedSessSetup) &&
+- (ses->status == CifsGood)) {
++ if (ses->status == CifsExiting) {
+ spin_unlock(&cifs_tcp_ses_lock);
+ return 0;
+ }
+- server->tcpStatus = CifsInSessSetup;
++ ses->status = CifsInSessSetup;
+ spin_unlock(&cifs_tcp_ses_lock);
+
+- spin_lock(&ses->chan_lock);
+- is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
+- spin_unlock(&ses->chan_lock);
+-
+ if (!is_binding) {
+ ses->capabilities = server->capabilities;
+ if (!linuxExtEnabled)
+@@ -4019,13 +4014,13 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
+ if (rc) {
+ cifs_server_dbg(VFS, "Send error in SessSetup = %d\n", rc);
+ spin_lock(&cifs_tcp_ses_lock);
+- if (server->tcpStatus == CifsInSessSetup)
+- server->tcpStatus = CifsNeedSessSetup;
++ if (ses->status == CifsInSessSetup)
++ ses->status = CifsNeedSessSetup;
+ spin_unlock(&cifs_tcp_ses_lock);
+ } else {
+ spin_lock(&cifs_tcp_ses_lock);
+- if (server->tcpStatus == CifsInSessSetup)
+- server->tcpStatus = CifsGood;
++ if (ses->status == CifsInSessSetup)
++ ses->status = CifsGood;
+ /* Even if one channel is active, session is in good state */
+ ses->status = CifsGood;
+ spin_unlock(&cifs_tcp_ses_lock);
+diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
+index 956f8e5cf3e74..c5dd6f7305bd1 100644
+--- a/fs/cifs/dfs_cache.c
++++ b/fs/cifs/dfs_cache.c
+@@ -654,7 +654,7 @@ static struct cache_entry *__lookup_cache_entry(const char *path, unsigned int h
+ return ce;
+ }
+ }
+- return ERR_PTR(-EEXIST);
++ return ERR_PTR(-ENOENT);
+ }
+
+ /*
+@@ -662,7 +662,7 @@ static struct cache_entry *__lookup_cache_entry(const char *path, unsigned int h
+ *
+ * Use whole path components in the match. Must be called with htable_rw_lock held.
+ *
+- * Return ERR_PTR(-EEXIST) if the entry is not found.
++ * Return ERR_PTR(-ENOENT) if the entry is not found.
+ */
+ static struct cache_entry *lookup_cache_entry(const char *path)
+ {
+@@ -710,7 +710,7 @@ static struct cache_entry *lookup_cache_entry(const char *path)
+ while (e > s && *e != sep)
+ e--;
+ }
+- return ERR_PTR(-EEXIST);
++ return ERR_PTR(-ENOENT);
+ }
+
+ /**
+diff --git a/fs/cifs/fs_context.c b/fs/cifs/fs_context.c
+index a92e9eec521f3..fbb0e98c7d2c4 100644
+--- a/fs/cifs/fs_context.c
++++ b/fs/cifs/fs_context.c
+@@ -312,7 +312,6 @@ smb3_fs_context_dup(struct smb3_fs_context *new_ctx, struct smb3_fs_context *ctx
+ new_ctx->password = NULL;
+ new_ctx->server_hostname = NULL;
+ new_ctx->domainname = NULL;
+- new_ctx->workstation_name = NULL;
+ new_ctx->UNC = NULL;
+ new_ctx->source = NULL;
+ new_ctx->iocharset = NULL;
+@@ -327,7 +326,6 @@ smb3_fs_context_dup(struct smb3_fs_context *new_ctx, struct smb3_fs_context *ctx
+ DUP_CTX_STR(UNC);
+ DUP_CTX_STR(source);
+ DUP_CTX_STR(domainname);
+- DUP_CTX_STR(workstation_name);
+ DUP_CTX_STR(nodename);
+ DUP_CTX_STR(iocharset);
+
+@@ -766,8 +764,7 @@ static int smb3_verify_reconfigure_ctx(struct fs_context *fc,
+ cifs_errorf(fc, "can not change domainname during remount\n");
+ return -EINVAL;
+ }
+- if (new_ctx->workstation_name &&
+- (!old_ctx->workstation_name || strcmp(new_ctx->workstation_name, old_ctx->workstation_name))) {
++ if (strcmp(new_ctx->workstation_name, old_ctx->workstation_name)) {
+ cifs_errorf(fc, "can not change workstation_name during remount\n");
+ return -EINVAL;
+ }
+@@ -814,7 +811,6 @@ static int smb3_reconfigure(struct fs_context *fc)
+ STEAL_STRING(cifs_sb, ctx, username);
+ STEAL_STRING(cifs_sb, ctx, password);
+ STEAL_STRING(cifs_sb, ctx, domainname);
+- STEAL_STRING(cifs_sb, ctx, workstation_name);
+ STEAL_STRING(cifs_sb, ctx, nodename);
+ STEAL_STRING(cifs_sb, ctx, iocharset);
+
+@@ -1467,22 +1463,15 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
+
+ int smb3_init_fs_context(struct fs_context *fc)
+ {
+- int rc;
+ struct smb3_fs_context *ctx;
+ char *nodename = utsname()->nodename;
+ int i;
+
+ ctx = kzalloc(sizeof(struct smb3_fs_context), GFP_KERNEL);
+- if (unlikely(!ctx)) {
+- rc = -ENOMEM;
+- goto err_exit;
+- }
++ if (unlikely(!ctx))
++ return -ENOMEM;
+
+- ctx->workstation_name = kstrdup(nodename, GFP_KERNEL);
+- if (unlikely(!ctx->workstation_name)) {
+- rc = -ENOMEM;
+- goto err_exit;
+- }
++ strscpy(ctx->workstation_name, nodename, sizeof(ctx->workstation_name));
+
+ /*
+ * does not have to be perfect mapping since field is
+@@ -1555,14 +1544,6 @@ int smb3_init_fs_context(struct fs_context *fc)
+ fc->fs_private = ctx;
+ fc->ops = &smb3_fs_context_ops;
+ return 0;
+-
+-err_exit:
+- if (ctx) {
+- kfree(ctx->workstation_name);
+- kfree(ctx);
+- }
+-
+- return rc;
+ }
+
+ void
+@@ -1588,8 +1569,6 @@ smb3_cleanup_fs_context_contents(struct smb3_fs_context *ctx)
+ ctx->source = NULL;
+ kfree(ctx->domainname);
+ ctx->domainname = NULL;
+- kfree(ctx->workstation_name);
+- ctx->workstation_name = NULL;
+ kfree(ctx->nodename);
+ ctx->nodename = NULL;
+ kfree(ctx->iocharset);
+diff --git a/fs/cifs/fs_context.h b/fs/cifs/fs_context.h
+index e54090d9ef368..3a156c1439254 100644
+--- a/fs/cifs/fs_context.h
++++ b/fs/cifs/fs_context.h
+@@ -170,7 +170,7 @@ struct smb3_fs_context {
+ char *server_hostname;
+ char *UNC;
+ char *nodename;
+- char *workstation_name;
++ char workstation_name[CIFS_MAX_WORKSTATION_LEN];
+ char *iocharset; /* local code page for mapping to and from Unicode */
+ char source_rfc1001_name[RFC1001_NAME_LEN_WITH_NULL]; /* clnt nb name */
+ char target_rfc1001_name[RFC1001_NAME_LEN_WITH_NULL]; /* srvr nb name */
+diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
+index afaf59c221936..5a803d6861464 100644
+--- a/fs/cifs/misc.c
++++ b/fs/cifs/misc.c
+@@ -95,7 +95,6 @@ sesInfoFree(struct cifs_ses *buf_to_free)
+ kfree_sensitive(buf_to_free->password);
+ kfree(buf_to_free->user_name);
+ kfree(buf_to_free->domainName);
+- kfree(buf_to_free->workstation_name);
+ kfree_sensitive(buf_to_free->auth_key.response);
+ kfree(buf_to_free->iface_list);
+ kfree_sensitive(buf_to_free);
+@@ -1309,7 +1308,7 @@ int cifs_update_super_prepath(struct cifs_sb_info *cifs_sb, char *prefix)
+ * for "\<server>\<dfsname>\<linkpath>" DFS reference,
+ * where <dfsname> contains non-ASCII unicode symbols.
+ *
+- * Check such DFS reference and emulate -ENOENT if it is actual.
++ * Check such DFS reference.
+ */
+ int cifs_dfs_query_info_nonascii_quirk(const unsigned int xid,
+ struct cifs_tcon *tcon,
+@@ -1341,10 +1340,6 @@ int cifs_dfs_query_info_nonascii_quirk(const unsigned int xid,
+ cifs_dbg(FYI, "DFS ref '%s' is found, emulate -EREMOTE\n",
+ dfspath);
+ rc = -EREMOTE;
+- } else if (rc == -EEXIST) {
+- cifs_dbg(FYI, "DFS ref '%s' is not found, emulate -ENOENT\n",
+- dfspath);
+- rc = -ENOENT;
+ } else {
+ cifs_dbg(FYI, "%s: dfs_cache_find returned %d\n", __func__, rc);
+ }
+diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
+index 32f478c7a66d8..1a0995bb5d90c 100644
+--- a/fs/cifs/sess.c
++++ b/fs/cifs/sess.c
+@@ -714,9 +714,9 @@ static int size_of_ntlmssp_blob(struct cifs_ses *ses, int base_size)
+ else
+ sz += sizeof(__le16);
+
+- if (ses->workstation_name)
++ if (ses->workstation_name[0])
+ sz += sizeof(__le16) * strnlen(ses->workstation_name,
+- CIFS_MAX_WORKSTATION_LEN);
++ ntlmssp_workstation_name_size(ses));
+ else
+ sz += sizeof(__le16);
+
+@@ -960,7 +960,7 @@ int build_ntlmssp_auth_blob(unsigned char **pbuffer,
+
+ cifs_security_buffer_from_str(&sec_blob->WorkstationName,
+ ses->workstation_name,
+- CIFS_MAX_WORKSTATION_LEN,
++ ntlmssp_workstation_name_size(ses),
+ *pbuffer, &tmp,
+ nls_cp);
+
+diff --git a/fs/cifs/smb2inode.c b/fs/cifs/smb2inode.c
+index fe5bfa245fa7e..1b89b9b8a212a 100644
+--- a/fs/cifs/smb2inode.c
++++ b/fs/cifs/smb2inode.c
+@@ -362,8 +362,6 @@ smb2_compound_op(const unsigned int xid, struct cifs_tcon *tcon,
+ num_rqst++;
+
+ if (cfile) {
+- cifsFileInfo_put(cfile);
+- cfile = NULL;
+ rc = compound_send_recv(xid, ses, server,
+ flags, num_rqst - 2,
+ &rqst[1], &resp_buftype[1],
+diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
+index d6aaeff4a30a5..861291662c95d 100644
+--- a/fs/cifs/smb2ops.c
++++ b/fs/cifs/smb2ops.c
+@@ -760,8 +760,8 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
+ struct cifs_sb_info *cifs_sb,
+ struct cached_fid **cfid)
+ {
+- struct cifs_ses *ses = tcon->ses;
+- struct TCP_Server_Info *server = ses->server;
++ struct cifs_ses *ses;
++ struct TCP_Server_Info *server;
+ struct cifs_open_parms oparms;
+ struct smb2_create_rsp *o_rsp = NULL;
+ struct smb2_query_info_rsp *qi_rsp = NULL;
+@@ -779,6 +779,9 @@ int open_cached_dir(unsigned int xid, struct cifs_tcon *tcon,
+ if (tcon->nohandlecache)
+ return -ENOTSUPP;
+
++ ses = tcon->ses;
++ server = ses->server;
++
+ if (cifs_sb->root == NULL)
+ return -ENOENT;
+
+@@ -3837,7 +3840,7 @@ static long smb3_simple_falloc(struct file *file, struct cifs_tcon *tcon,
+ if (rc)
+ goto out;
+
+- if ((cifsi->cifsAttrs & FILE_ATTRIBUTE_SPARSE_FILE) == 0)
++ if (cifsi->cifsAttrs & FILE_ATTRIBUTE_SPARSE_FILE)
+ smb2_set_sparse(xid, tcon, cfile, inode, false);
+
+ eof = cpu_to_le64(off + len);
+diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
+index 1b7ad0c095668..f5321a3500f38 100644
+--- a/fs/cifs/smb2pdu.c
++++ b/fs/cifs/smb2pdu.c
+@@ -3899,7 +3899,8 @@ SMB2_echo(struct TCP_Server_Info *server)
+ cifs_dbg(FYI, "In echo request for conn_id %lld\n", server->conn_id);
+
+ spin_lock(&cifs_tcp_ses_lock);
+- if (server->tcpStatus == CifsNeedNegotiate) {
++ if (server->ops->need_neg &&
++ server->ops->need_neg(server)) {
+ spin_unlock(&cifs_tcp_ses_lock);
+ /* No need to send echo on newly established connections */
+ mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
+diff --git a/fs/cifs/smb2transport.c b/fs/cifs/smb2transport.c
+index 2af79093b78bc..01b732641edb2 100644
+--- a/fs/cifs/smb2transport.c
++++ b/fs/cifs/smb2transport.c
+@@ -641,7 +641,8 @@ smb2_sign_rqst(struct smb_rqst *rqst, struct TCP_Server_Info *server)
+ if (!is_signed)
+ return 0;
+ spin_lock(&cifs_tcp_ses_lock);
+- if (server->tcpStatus == CifsNeedNegotiate) {
++ if (server->ops->need_neg &&
++ server->ops->need_neg(server)) {
+ spin_unlock(&cifs_tcp_ses_lock);
+ return 0;
+ }
+diff --git a/fs/dax.c b/fs/dax.c
+index 67a08a32fccb7..a372304c9695b 100644
+--- a/fs/dax.c
++++ b/fs/dax.c
+@@ -845,7 +845,8 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
+ if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
+ goto unlock_pmd;
+
+- flush_cache_page(vma, address, pfn);
++ flush_cache_range(vma, address,
++ address + HPAGE_PMD_SIZE);
+ pmd = pmdp_invalidate(vma, address, pmdp);
+ pmd = pmd_wrprotect(pmd);
+ pmd = pmd_mkclean(pmd);
+diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
+index bdb51d209ba25..5b485cd96c931 100644
+--- a/fs/dlm/lock.c
++++ b/fs/dlm/lock.c
+@@ -1559,6 +1559,7 @@ static int _remove_from_waiters(struct dlm_lkb *lkb, int mstype,
+ lkb->lkb_wait_type = 0;
+ lkb->lkb_flags &= ~DLM_IFL_OVERLAP_CANCEL;
+ lkb->lkb_wait_count--;
++ unhold_lkb(lkb);
+ goto out_del;
+ }
+
+@@ -1585,6 +1586,7 @@ static int _remove_from_waiters(struct dlm_lkb *lkb, int mstype,
+ log_error(ls, "remwait error %x reply %d wait_type %d overlap",
+ lkb->lkb_id, mstype, lkb->lkb_wait_type);
+ lkb->lkb_wait_count--;
++ unhold_lkb(lkb);
+ lkb->lkb_wait_type = 0;
+ }
+
+@@ -1795,7 +1797,6 @@ static void shrink_bucket(struct dlm_ls *ls, int b)
+ memcpy(ls->ls_remove_name, name, DLM_RESNAME_MAXLEN);
+ spin_unlock(&ls->ls_remove_spin);
+ spin_unlock(&ls->ls_rsbtbl[b].lock);
+- wake_up(&ls->ls_remove_wait);
+
+ send_remove(r);
+
+@@ -1804,6 +1805,7 @@ static void shrink_bucket(struct dlm_ls *ls, int b)
+ ls->ls_remove_len = 0;
+ memset(ls->ls_remove_name, 0, DLM_RESNAME_MAXLEN);
+ spin_unlock(&ls->ls_remove_spin);
++ wake_up(&ls->ls_remove_wait);
+
+ dlm_free_rsb(r);
+ }
+@@ -4079,7 +4081,6 @@ static void send_repeat_remove(struct dlm_ls *ls, char *ms_name, int len)
+ memcpy(ls->ls_remove_name, name, DLM_RESNAME_MAXLEN);
+ spin_unlock(&ls->ls_remove_spin);
+ spin_unlock(&ls->ls_rsbtbl[b].lock);
+- wake_up(&ls->ls_remove_wait);
+
+ rv = _create_message(ls, sizeof(struct dlm_message) + len,
+ dir_nodeid, DLM_MSG_REMOVE, &ms, &mh);
+@@ -4095,6 +4096,7 @@ static void send_repeat_remove(struct dlm_ls *ls, char *ms_name, int len)
+ ls->ls_remove_len = 0;
+ memset(ls->ls_remove_name, 0, DLM_RESNAME_MAXLEN);
+ spin_unlock(&ls->ls_remove_spin);
++ wake_up(&ls->ls_remove_wait);
+ }
+
+ static int receive_request(struct dlm_ls *ls, struct dlm_message *ms)
+@@ -5331,11 +5333,16 @@ int dlm_recover_waiters_post(struct dlm_ls *ls)
+ lkb->lkb_flags &= ~DLM_IFL_OVERLAP_UNLOCK;
+ lkb->lkb_flags &= ~DLM_IFL_OVERLAP_CANCEL;
+ lkb->lkb_wait_type = 0;
+- lkb->lkb_wait_count = 0;
++ /* drop all wait_count references we still
++ * hold a reference for this iteration.
++ */
++ while (lkb->lkb_wait_count) {
++ lkb->lkb_wait_count--;
++ unhold_lkb(lkb);
++ }
+ mutex_lock(&ls->ls_waiters_mutex);
+ list_del_init(&lkb->lkb_wait_reply);
+ mutex_unlock(&ls->ls_waiters_mutex);
+- unhold_lkb(lkb); /* for waiters list */
+
+ if (oc || ou) {
+ /* do an unlock or cancel instead of resending */
+diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
+index e284d696c1fdc..6ed935ad82471 100644
+--- a/fs/dlm/lowcomms.c
++++ b/fs/dlm/lowcomms.c
+@@ -1789,7 +1789,7 @@ static int dlm_listen_for_all(void)
+ SOCK_STREAM, dlm_proto_ops->proto, &sock);
+ if (result < 0) {
+ log_print("Can't create comms socket: %d", result);
+- goto out;
++ return result;
+ }
+
+ sock_set_mark(sock->sk, dlm_config.ci_mark);
+diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
+index c38b2b8ffd1d3..a10d2bcfe75a8 100644
+--- a/fs/dlm/plock.c
++++ b/fs/dlm/plock.c
+@@ -23,11 +23,11 @@ struct plock_op {
+ struct list_head list;
+ int done;
+ struct dlm_plock_info info;
++ int (*callback)(struct file_lock *fl, int result);
+ };
+
+ struct plock_xop {
+ struct plock_op xop;
+- int (*callback)(struct file_lock *fl, int result);
+ void *fl;
+ void *file;
+ struct file_lock flc;
+@@ -129,19 +129,18 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
+ /* fl_owner is lockd which doesn't distinguish
+ processes on the nfs client */
+ op->info.owner = (__u64) fl->fl_pid;
+- xop->callback = fl->fl_lmops->lm_grant;
++ op->callback = fl->fl_lmops->lm_grant;
+ locks_init_lock(&xop->flc);
+ locks_copy_lock(&xop->flc, fl);
+ xop->fl = fl;
+ xop->file = file;
+ } else {
+ op->info.owner = (__u64)(long) fl->fl_owner;
+- xop->callback = NULL;
+ }
+
+ send_op(op);
+
+- if (xop->callback == NULL) {
++ if (!op->callback) {
+ rv = wait_event_interruptible(recv_wq, (op->done != 0));
+ if (rv == -ERESTARTSYS) {
+ log_debug(ls, "dlm_posix_lock: wait killed %llx",
+@@ -203,7 +202,7 @@ static int dlm_plock_callback(struct plock_op *op)
+ file = xop->file;
+ flc = &xop->flc;
+ fl = xop->fl;
+- notify = xop->callback;
++ notify = op->callback;
+
+ if (op->info.rv) {
+ notify(fl, op->info.rv);
+@@ -436,10 +435,9 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
+ if (op->info.fsid == info.fsid &&
+ op->info.number == info.number &&
+ op->info.owner == info.owner) {
+- struct plock_xop *xop = (struct plock_xop *)op;
+ list_del_init(&op->list);
+ memcpy(&op->info, &info, sizeof(info));
+- if (xop->callback)
++ if (op->callback)
+ do_callback = 1;
+ else
+ op->done = 1;
+diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
+index 3efa686c76441..0e0d1fc0f1301 100644
+--- a/fs/erofs/decompressor.c
++++ b/fs/erofs/decompressor.c
+@@ -322,6 +322,7 @@ static int z_erofs_shifted_transform(struct z_erofs_decompress_req *rq,
+ PAGE_ALIGN(rq->pageofs_out + rq->outputsize) >> PAGE_SHIFT;
+ const unsigned int righthalf = min_t(unsigned int, rq->outputsize,
+ PAGE_SIZE - rq->pageofs_out);
++ const unsigned int lefthalf = rq->outputsize - righthalf;
+ unsigned char *src, *dst;
+
+ if (nrpages_out > 2) {
+@@ -344,10 +345,10 @@ static int z_erofs_shifted_transform(struct z_erofs_decompress_req *rq,
+ if (nrpages_out == 2) {
+ DBG_BUGON(!rq->out[1]);
+ if (rq->out[1] == *rq->in) {
+- memmove(src, src + righthalf, rq->pageofs_out);
++ memmove(src, src + righthalf, lefthalf);
+ } else {
+ dst = kmap_atomic(rq->out[1]);
+- memcpy(dst, src + righthalf, rq->pageofs_out);
++ memcpy(dst, src + righthalf, lefthalf);
+ kunmap_atomic(dst);
+ }
+ }
+diff --git a/fs/exec.c b/fs/exec.c
+index e3e55d5e0be1f..75eb6e0ee7b2f 100644
+--- a/fs/exec.c
++++ b/fs/exec.c
+@@ -1308,8 +1308,6 @@ int begin_new_exec(struct linux_binprm * bprm)
+ if (retval)
+ goto out_unlock;
+
+- if (me->flags & PF_KTHREAD)
+- free_kthread_struct(me);
+ me->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
+ PF_NOFREEZE | PF_NO_SETAFFINITY);
+ flush_thread();
+@@ -1955,6 +1953,10 @@ int kernel_execve(const char *kernel_filename,
+ int fd = AT_FDCWD;
+ int retval;
+
++ if (WARN_ON_ONCE((current->flags & PF_KTHREAD) &&
++ (current->worker_private)))
++ return -EINVAL;
++
+ filename = getname_kernel(kernel_filename);
+ if (IS_ERR(filename))
+ return PTR_ERR(filename);
+diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
+index 0106eba46d5af..3ef80d000e13d 100644
+--- a/fs/exportfs/expfs.c
++++ b/fs/exportfs/expfs.c
+@@ -145,7 +145,7 @@ static struct dentry *reconnect_one(struct vfsmount *mnt,
+ if (err)
+ goto out_err;
+ dprintk("%s: found name: %s\n", __func__, nbuf);
+- tmp = lookup_one_len_unlocked(nbuf, parent, strlen(nbuf));
++ tmp = lookup_one_unlocked(mnt_user_ns(mnt), nbuf, parent, strlen(nbuf));
+ if (IS_ERR(tmp)) {
+ dprintk("%s: lookup failed: %d\n", __func__, PTR_ERR(tmp));
+ err = PTR_ERR(tmp);
+@@ -525,7 +525,8 @@ exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
+ }
+
+ inode_lock(target_dir->d_inode);
+- nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf));
++ nresult = lookup_one(mnt_user_ns(mnt), nbuf,
++ target_dir, strlen(nbuf));
+ if (!IS_ERR(nresult)) {
+ if (unlikely(nresult->d_inode != result->d_inode)) {
+ dput(nresult);
+diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
+index a743b1e3b89ec..f6d6661817b63 100644
+--- a/fs/ext4/ext4.h
++++ b/fs/ext4/ext4.h
+@@ -1440,12 +1440,6 @@ struct ext4_super_block {
+
+ #ifdef __KERNEL__
+
+-#ifdef CONFIG_FS_ENCRYPTION
+-#define DUMMY_ENCRYPTION_ENABLED(sbi) ((sbi)->s_dummy_enc_policy.policy != NULL)
+-#else
+-#define DUMMY_ENCRYPTION_ENABLED(sbi) (0)
+-#endif
+-
+ /* Number of quota types we support */
+ #define EXT4_MAXQUOTAS 3
+
+diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
+index e473fde6b64b4..c148bb97b5273 100644
+--- a/fs/ext4/extents.c
++++ b/fs/ext4/extents.c
+@@ -372,7 +372,7 @@ static int ext4_valid_extent_entries(struct inode *inode,
+ {
+ unsigned short entries;
+ ext4_lblk_t lblock = 0;
+- ext4_lblk_t prev = 0;
++ ext4_lblk_t cur = 0;
+
+ if (eh->eh_entries == 0)
+ return 1;
+@@ -396,11 +396,11 @@ static int ext4_valid_extent_entries(struct inode *inode,
+
+ /* Check for overlapping extents */
+ lblock = le32_to_cpu(ext->ee_block);
+- if ((lblock <= prev) && prev) {
++ if (lblock < cur) {
+ *pblk = ext4_ext_pblock(ext);
+ return 0;
+ }
+- prev = lblock + ext4_ext_get_actual_len(ext) - 1;
++ cur = lblock + ext4_ext_get_actual_len(ext);
+ ext++;
+ entries--;
+ }
+@@ -420,13 +420,13 @@ static int ext4_valid_extent_entries(struct inode *inode,
+
+ /* Check for overlapping index extents */
+ lblock = le32_to_cpu(ext_idx->ei_block);
+- if ((lblock <= prev) && prev) {
++ if (lblock < cur) {
+ *pblk = ext4_idx_pblock(ext_idx);
+ return 0;
+ }
+ ext_idx++;
+ entries--;
+- prev = lblock;
++ cur = lblock + 1;
+ }
+ }
+ return 1;
+@@ -4693,15 +4693,17 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
+ FALLOC_FL_INSERT_RANGE))
+ return -EOPNOTSUPP;
+
++ inode_lock(inode);
++ ret = ext4_convert_inline_data(inode);
++ inode_unlock(inode);
++ if (ret)
++ goto exit;
++
+ if (mode & FALLOC_FL_PUNCH_HOLE) {
+ ret = ext4_punch_hole(file, offset, len);
+ goto exit;
+ }
+
+- ret = ext4_convert_inline_data(inode);
+- if (ret)
+- goto exit;
+-
+ if (mode & FALLOC_FL_COLLAPSE_RANGE) {
+ ret = ext4_collapse_range(file, offset, len);
+ goto exit;
+diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
+index 9c076262770d9..e9ef5cf309694 100644
+--- a/fs/ext4/inline.c
++++ b/fs/ext4/inline.c
+@@ -2005,6 +2005,18 @@ int ext4_convert_inline_data(struct inode *inode)
+ if (!ext4_has_inline_data(inode)) {
+ ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
+ return 0;
++ } else if (!ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) {
++ /*
++ * Inode has inline data but EXT4_STATE_MAY_INLINE_DATA is
++ * cleared. This means we are in the middle of moving of
++ * inline data to delay allocated block. Just force writeout
++ * here to finish conversion.
++ */
++ error = filemap_flush(inode->i_mapping);
++ if (error)
++ return error;
++ if (!ext4_has_inline_data(inode))
++ return 0;
+ }
+
+ needed_blocks = ext4_writepage_trans_blocks(inode);
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index 646ece9b3455f..beed9e32571c0 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -3967,15 +3967,6 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
+
+ trace_ext4_punch_hole(inode, offset, length, 0);
+
+- ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
+- if (ext4_has_inline_data(inode)) {
+- filemap_invalidate_lock(mapping);
+- ret = ext4_convert_inline_data(inode);
+- filemap_invalidate_unlock(mapping);
+- if (ret)
+- return ret;
+- }
+-
+ /*
+ * Write out all dirty pages to avoid race conditions
+ * Then release them.
+@@ -5398,6 +5389,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
+ if (attr->ia_valid & ATTR_SIZE) {
+ handle_t *handle;
+ loff_t oldsize = inode->i_size;
++ loff_t old_disksize;
+ int shrink = (attr->ia_size < inode->i_size);
+
+ if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) {
+@@ -5469,6 +5461,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
+ inode->i_sb->s_blocksize_bits);
+
+ down_write(&EXT4_I(inode)->i_data_sem);
++ old_disksize = EXT4_I(inode)->i_disksize;
+ EXT4_I(inode)->i_disksize = attr->ia_size;
+ rc = ext4_mark_inode_dirty(handle, inode);
+ if (!error)
+@@ -5480,6 +5473,8 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
+ */
+ if (!error)
+ i_size_write(inode, attr->ia_size);
++ else
++ EXT4_I(inode)->i_disksize = old_disksize;
+ up_write(&EXT4_I(inode)->i_data_sem);
+ ext4_journal_stop(handle);
+ if (error)
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index 252c168454c7f..87d85ce04d58b 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -6398,6 +6398,7 @@ __releases(ext4_group_lock_ptr(sb, e4b->bd_group))
+ * @start: first group block to examine
+ * @max: last group block to examine
+ * @minblocks: minimum extent block count
++ * @set_trimmed: set the trimmed flag if at least one block is trimmed
+ *
+ * ext4_trim_all_free walks through group's block bitmap searching for free
+ * extents. When the free extent is found, mark it as used in group buddy
+@@ -6407,7 +6408,7 @@ __releases(ext4_group_lock_ptr(sb, e4b->bd_group))
+ static ext4_grpblk_t
+ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
+ ext4_grpblk_t start, ext4_grpblk_t max,
+- ext4_grpblk_t minblocks)
++ ext4_grpblk_t minblocks, bool set_trimmed)
+ {
+ struct ext4_buddy e4b;
+ int ret;
+@@ -6426,7 +6427,7 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
+ if (!EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) ||
+ minblocks < EXT4_SB(sb)->s_last_trim_minblks) {
+ ret = ext4_try_to_trim_range(sb, &e4b, start, max, minblocks);
+- if (ret >= 0)
++ if (ret >= 0 && set_trimmed)
+ EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info);
+ } else {
+ ret = 0;
+@@ -6463,6 +6464,7 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
+ ext4_fsblk_t first_data_blk =
+ le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block);
+ ext4_fsblk_t max_blks = ext4_blocks_count(EXT4_SB(sb)->s_es);
++ bool whole_group, eof = false;
+ int ret = 0;
+
+ start = range->start >> sb->s_blocksize_bits;
+@@ -6481,8 +6483,10 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
+ if (minlen > EXT4_CLUSTERS_PER_GROUP(sb))
+ goto out;
+ }
+- if (end >= max_blks)
++ if (end >= max_blks - 1) {
+ end = max_blks - 1;
++ eof = true;
++ }
+ if (end <= first_data_blk)
+ goto out;
+ if (start < first_data_blk)
+@@ -6496,6 +6500,7 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
+
+ /* end now represents the last cluster to discard in this group */
+ end = EXT4_CLUSTERS_PER_GROUP(sb) - 1;
++ whole_group = true;
+
+ for (group = first_group; group <= last_group; group++) {
+ grp = ext4_get_group_info(sb, group);
+@@ -6512,12 +6517,13 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
+ * change it for the last group, note that last_cluster is
+ * already computed earlier by ext4_get_group_no_and_offset()
+ */
+- if (group == last_group)
++ if (group == last_group) {
+ end = last_cluster;
+-
++ whole_group = eof ? true : end == EXT4_CLUSTERS_PER_GROUP(sb) - 1;
++ }
+ if (grp->bb_free >= minlen) {
+ cnt = ext4_trim_all_free(sb, group, first_cluster,
+- end, minlen);
++ end, minlen, whole_group);
+ if (cnt < 0) {
+ ret = cnt;
+ break;
+diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
+index 767b4bfe39c38..e9cba12e5e128 100644
+--- a/fs/ext4/namei.c
++++ b/fs/ext4/namei.c
+@@ -277,9 +277,9 @@ static struct dx_frame *dx_probe(struct ext4_filename *fname,
+ struct dx_hash_info *hinfo,
+ struct dx_frame *frame);
+ static void dx_release(struct dx_frame *frames);
+-static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de,
+- unsigned blocksize, struct dx_hash_info *hinfo,
+- struct dx_map_entry map[]);
++static int dx_make_map(struct inode *dir, struct buffer_head *bh,
++ struct dx_hash_info *hinfo,
++ struct dx_map_entry *map_tail);
+ static void dx_sort_map(struct dx_map_entry *map, unsigned count);
+ static struct ext4_dir_entry_2 *dx_move_dirents(struct inode *dir, char *from,
+ char *to, struct dx_map_entry *offsets,
+@@ -777,12 +777,14 @@ static struct dx_frame *
+ dx_probe(struct ext4_filename *fname, struct inode *dir,
+ struct dx_hash_info *hinfo, struct dx_frame *frame_in)
+ {
+- unsigned count, indirect;
++ unsigned count, indirect, level, i;
+ struct dx_entry *at, *entries, *p, *q, *m;
+ struct dx_root *root;
+ struct dx_frame *frame = frame_in;
+ struct dx_frame *ret_err = ERR_PTR(ERR_BAD_DX_DIR);
+ u32 hash;
++ ext4_lblk_t block;
++ ext4_lblk_t blocks[EXT4_HTREE_LEVEL];
+
+ memset(frame_in, 0, EXT4_HTREE_LEVEL * sizeof(frame_in[0]));
+ frame->bh = ext4_read_dirblock(dir, 0, INDEX);
+@@ -854,6 +856,8 @@ dx_probe(struct ext4_filename *fname, struct inode *dir,
+ }
+
+ dxtrace(printk("Look up %x", hash));
++ level = 0;
++ blocks[0] = 0;
+ while (1) {
+ count = dx_get_count(entries);
+ if (!count || count > dx_get_limit(entries)) {
+@@ -882,15 +886,27 @@ dx_probe(struct ext4_filename *fname, struct inode *dir,
+ dx_get_block(at)));
+ frame->entries = entries;
+ frame->at = at;
+- if (!indirect--)
++
++ block = dx_get_block(at);
++ for (i = 0; i <= level; i++) {
++ if (blocks[i] == block) {
++ ext4_warning_inode(dir,
++ "dx entry: tree cycle block %u points back to block %u",
++ blocks[level], block);
++ goto fail;
++ }
++ }
++ if (++level > indirect)
+ return frame;
++ blocks[level] = block;
+ frame++;
+- frame->bh = ext4_read_dirblock(dir, dx_get_block(at), INDEX);
++ frame->bh = ext4_read_dirblock(dir, block, INDEX);
+ if (IS_ERR(frame->bh)) {
+ ret_err = (struct dx_frame *) frame->bh;
+ frame->bh = NULL;
+ goto fail;
+ }
++
+ entries = ((struct dx_node *) frame->bh->b_data)->entries;
+
+ if (dx_get_limit(entries) != dx_node_limit(dir)) {
+@@ -1249,15 +1265,23 @@ static inline int search_dirblock(struct buffer_head *bh,
+ * Create map of hash values, offsets, and sizes, stored at end of block.
+ * Returns number of entries mapped.
+ */
+-static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de,
+- unsigned blocksize, struct dx_hash_info *hinfo,
++static int dx_make_map(struct inode *dir, struct buffer_head *bh,
++ struct dx_hash_info *hinfo,
+ struct dx_map_entry *map_tail)
+ {
+ int count = 0;
+- char *base = (char *) de;
++ struct ext4_dir_entry_2 *de = (struct ext4_dir_entry_2 *)bh->b_data;
++ unsigned int buflen = bh->b_size;
++ char *base = bh->b_data;
+ struct dx_hash_info h = *hinfo;
+
+- while ((char *) de < base + blocksize) {
++ if (ext4_has_metadata_csum(dir->i_sb))
++ buflen -= sizeof(struct ext4_dir_entry_tail);
++
++ while ((char *) de < base + buflen) {
++ if (ext4_check_dir_entry(dir, NULL, de, bh, base, buflen,
++ ((char *)de) - base))
++ return -EFSCORRUPTED;
+ if (de->name_len && de->inode) {
+ if (ext4_hash_in_dirent(dir))
+ h.hash = EXT4_DIRENT_HASH(de);
+@@ -1270,8 +1294,7 @@ static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de,
+ count++;
+ cond_resched();
+ }
+- /* XXX: do we need to check rec_len == 0 case? -Chris */
+- de = ext4_next_entry(de, blocksize);
++ de = ext4_next_entry(de, dir->i_sb->s_blocksize);
+ }
+ return count;
+ }
+@@ -1943,8 +1966,11 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
+
+ /* create map in the end of data2 block */
+ map = (struct dx_map_entry *) (data2 + blocksize);
+- count = dx_make_map(dir, (struct ext4_dir_entry_2 *) data1,
+- blocksize, hinfo, map);
++ count = dx_make_map(dir, *bh, hinfo, map);
++ if (count < 0) {
++ err = count;
++ goto journal_error;
++ }
+ map -= count;
+ dx_sort_map(map, count);
+ /* Ensure that neither split block is over half full */
+@@ -3455,6 +3481,9 @@ static struct buffer_head *ext4_get_first_dir_block(handle_t *handle,
+ struct buffer_head *bh;
+
+ if (!ext4_has_inline_data(inode)) {
++ struct ext4_dir_entry_2 *de;
++ unsigned int offset;
++
+ /* The first directory block must not be a hole, so
+ * treat it as DIRENT_HTREE
+ */
+@@ -3463,9 +3492,30 @@ static struct buffer_head *ext4_get_first_dir_block(handle_t *handle,
+ *retval = PTR_ERR(bh);
+ return NULL;
+ }
+- *parent_de = ext4_next_entry(
+- (struct ext4_dir_entry_2 *)bh->b_data,
+- inode->i_sb->s_blocksize);
++
++ de = (struct ext4_dir_entry_2 *) bh->b_data;
++ if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data,
++ bh->b_size, 0) ||
++ le32_to_cpu(de->inode) != inode->i_ino ||
++ strcmp(".", de->name)) {
++ EXT4_ERROR_INODE(inode, "directory missing '.'");
++ brelse(bh);
++ *retval = -EFSCORRUPTED;
++ return NULL;
++ }
++ offset = ext4_rec_len_from_disk(de->rec_len,
++ inode->i_sb->s_blocksize);
++ de = ext4_next_entry(de, inode->i_sb->s_blocksize);
++ if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data,
++ bh->b_size, offset) ||
++ le32_to_cpu(de->inode) == 0 || strcmp("..", de->name)) {
++ EXT4_ERROR_INODE(inode, "directory missing '..'");
++ brelse(bh);
++ *retval = -EFSCORRUPTED;
++ return NULL;
++ }
++ *parent_de = de;
++
+ return bh;
+ }
+
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index 1466fbdbc8e34..a0c79304f92ff 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -1913,6 +1913,7 @@ static const struct mount_opts {
+ MOPT_EXT4_ONLY | MOPT_CLEAR},
+ {Opt_warn_on_error, EXT4_MOUNT_WARN_ON_ERROR, MOPT_SET},
+ {Opt_nowarn_on_error, EXT4_MOUNT_WARN_ON_ERROR, MOPT_CLEAR},
++ {Opt_commit, 0, MOPT_NO_EXT2},
+ {Opt_nojournal_checksum, EXT4_MOUNT_JOURNAL_CHECKSUM,
+ MOPT_EXT4_ONLY | MOPT_CLEAR},
+ {Opt_journal_checksum, EXT4_MOUNT_JOURNAL_CHECKSUM,
+@@ -2427,11 +2428,12 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param)
+ ctx->spec |= EXT4_SPEC_DUMMY_ENCRYPTION;
+ ctx->test_dummy_enc_arg = kmemdup_nul(param->string, param->size,
+ GFP_KERNEL);
++ return 0;
+ #else
+ ext4_msg(NULL, KERN_WARNING,
+- "Test dummy encryption mount option ignored");
++ "test_dummy_encryption option not supported");
++ return -EINVAL;
+ #endif
+- return 0;
+ case Opt_dax:
+ case Opt_dax_type:
+ #ifdef CONFIG_FS_DAX
+@@ -2625,8 +2627,10 @@ parse_failed:
+ ret = ext4_apply_options(fc, sb);
+
+ out_free:
+- kfree(s_ctx);
+- kfree(fc);
++ if (fc) {
++ ext4_fc_free(fc);
++ kfree(fc);
++ }
+ kfree(s_mount_opts);
+ return ret;
+ }
+@@ -2786,12 +2790,44 @@ err_jquota_specified:
+ #endif
+ }
+
++static int ext4_check_test_dummy_encryption(const struct fs_context *fc,
++ struct super_block *sb)
++{
++#ifdef CONFIG_FS_ENCRYPTION
++ const struct ext4_fs_context *ctx = fc->fs_private;
++ const struct ext4_sb_info *sbi = EXT4_SB(sb);
++
++ if (!(ctx->spec & EXT4_SPEC_DUMMY_ENCRYPTION))
++ return 0;
++
++ if (!ext4_has_feature_encrypt(sb)) {
++ ext4_msg(NULL, KERN_WARNING,
++ "test_dummy_encryption requires encrypt feature");
++ return -EINVAL;
++ }
++ /*
++ * This mount option is just for testing, and it's not worthwhile to
++ * implement the extra complexity (e.g. RCU protection) that would be
++ * needed to allow it to be set or changed during remount. We do allow
++ * it to be specified during remount, but only if there is no change.
++ */
++ if (fc->purpose == FS_CONTEXT_FOR_RECONFIGURE &&
++ !sbi->s_dummy_enc_policy.policy) {
++ ext4_msg(NULL, KERN_WARNING,
++ "Can't set test_dummy_encryption on remount");
++ return -EINVAL;
++ }
++#endif /* CONFIG_FS_ENCRYPTION */
++ return 0;
++}
++
+ static int ext4_check_opt_consistency(struct fs_context *fc,
+ struct super_block *sb)
+ {
+ struct ext4_fs_context *ctx = fc->fs_private;
+ struct ext4_sb_info *sbi = fc->s_fs_info;
+ int is_remount = fc->purpose == FS_CONTEXT_FOR_RECONFIGURE;
++ int err;
+
+ if ((ctx->opt_flags & MOPT_NO_EXT2) && IS_EXT2_SB(sb)) {
+ ext4_msg(NULL, KERN_ERR,
+@@ -2821,20 +2857,9 @@ static int ext4_check_opt_consistency(struct fs_context *fc,
+ "for blocksize < PAGE_SIZE");
+ }
+
+-#ifdef CONFIG_FS_ENCRYPTION
+- /*
+- * This mount option is just for testing, and it's not worthwhile to
+- * implement the extra complexity (e.g. RCU protection) that would be
+- * needed to allow it to be set or changed during remount. We do allow
+- * it to be specified during remount, but only if there is no change.
+- */
+- if ((ctx->spec & EXT4_SPEC_DUMMY_ENCRYPTION) &&
+- is_remount && !sbi->s_dummy_enc_policy.policy) {
+- ext4_msg(NULL, KERN_WARNING,
+- "Can't set test_dummy_encryption on remount");
+- return -1;
+- }
+-#endif
++ err = ext4_check_test_dummy_encryption(fc, sb);
++ if (err)
++ return err;
+
+ if ((ctx->spec & EXT4_SPEC_DATAJ) && is_remount) {
+ if (!sbi->s_journal) {
+@@ -4409,7 +4434,8 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
+ int silent = fc->sb_flags & SB_SILENT;
+
+ /* Set defaults for the variables that will be set during parsing */
+- ctx->journal_ioprio = DEFAULT_JOURNAL_IOPRIO;
++ if (!(ctx->spec & EXT4_SPEC_JOURNAL_IOPRIO))
++ ctx->journal_ioprio = DEFAULT_JOURNAL_IOPRIO;
+
+ sbi->s_inode_readahead_blks = EXT4_DEF_INODE_READAHEAD_BLKS;
+ sbi->s_sectors_written_start =
+@@ -4886,7 +4912,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
+ sbi->s_inodes_per_block;
+ sbi->s_desc_per_block = blocksize / EXT4_DESC_SIZE(sb);
+ sbi->s_sbh = bh;
+- sbi->s_mount_state = le16_to_cpu(es->s_state);
++ sbi->s_mount_state = le16_to_cpu(es->s_state) & ~EXT4_FC_REPLAY;
+ sbi->s_addr_per_block_bits = ilog2(EXT4_ADDR_PER_BLOCK(sb));
+ sbi->s_desc_per_block_bits = ilog2(EXT4_DESC_PER_BLOCK(sb));
+
+@@ -5279,12 +5305,6 @@ no_journal:
+ goto failed_mount_wq;
+ }
+
+- if (DUMMY_ENCRYPTION_ENABLED(sbi) && !sb_rdonly(sb) &&
+- !ext4_has_feature_encrypt(sb)) {
+- ext4_set_feature_encrypt(sb);
+- ext4_commit_super(sb);
+- }
+-
+ /*
+ * Get the # of file system overhead blocks from the
+ * superblock if present.
+@@ -6276,7 +6296,6 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
+ char *to_free[EXT4_MAXQUOTAS];
+ #endif
+
+- ctx->journal_ioprio = DEFAULT_JOURNAL_IOPRIO;
+
+ /* Store the original options */
+ old_sb_flags = sb->s_flags;
+@@ -6302,9 +6321,14 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
+ } else
+ old_opts.s_qf_names[i] = NULL;
+ #endif
+- if (sbi->s_journal && sbi->s_journal->j_task->io_context)
+- ctx->journal_ioprio =
+- sbi->s_journal->j_task->io_context->ioprio;
++ if (!(ctx->spec & EXT4_SPEC_JOURNAL_IOPRIO)) {
++ if (sbi->s_journal && sbi->s_journal->j_task->io_context)
++ ctx->journal_ioprio =
++ sbi->s_journal->j_task->io_context->ioprio;
++ else
++ ctx->journal_ioprio = DEFAULT_JOURNAL_IOPRIO;
++
++ }
+
+ ext4_apply_options(fc, sb);
+
+@@ -6445,7 +6469,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
+ if (err)
+ goto restore_opts;
+ }
+- sbi->s_mount_state = le16_to_cpu(es->s_state);
++ sbi->s_mount_state = (le16_to_cpu(es->s_state) &
++ ~EXT4_FC_REPLAY);
+
+ err = ext4_setup_super(sb, es, 0);
+ if (err)
+diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
+index a0e51937d92eb..d5bd7932fb642 100644
+--- a/fs/f2fs/dir.c
++++ b/fs/f2fs/dir.c
+@@ -82,7 +82,8 @@ int f2fs_init_casefolded_name(const struct inode *dir,
+ #if IS_ENABLED(CONFIG_UNICODE)
+ struct super_block *sb = dir->i_sb;
+
+- if (IS_CASEFOLDED(dir)) {
++ if (IS_CASEFOLDED(dir) &&
++ !is_dot_dotdot(fname->usr_fname->name, fname->usr_fname->len)) {
+ fname->cf_name.name = f2fs_kmem_cache_alloc(f2fs_cf_name_slab,
+ GFP_NOFS, false, F2FS_SB(sb));
+ if (!fname->cf_name.name)
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index 8c570de21ed5a..6ec8c6d4711f4 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -508,11 +508,11 @@ struct f2fs_filename {
+ #if IS_ENABLED(CONFIG_UNICODE)
+ /*
+ * For casefolded directories: the casefolded name, but it's left NULL
+- * if the original name is not valid Unicode, if the directory is both
+- * casefolded and encrypted and its encryption key is unavailable, or if
+- * the filesystem is doing an internal operation where usr_fname is also
+- * NULL. In all these cases we fall back to treating the name as an
+- * opaque byte sequence.
++ * if the original name is not valid Unicode, if the original name is
++ * "." or "..", if the directory is both casefolded and encrypted and
++ * its encryption key is unavailable, or if the filesystem is doing an
++ * internal operation where usr_fname is also NULL. In all these cases
++ * we fall back to treating the name as an opaque byte sequence.
+ */
+ struct fscrypt_str cf_name;
+ #endif
+@@ -1117,8 +1117,8 @@ enum count_type {
+ */
+ #define PAGE_TYPE_OF_BIO(type) ((type) > META ? META : (type))
+ enum page_type {
+- DATA,
+- NODE,
++ DATA = 0,
++ NODE = 1, /* should not change this */
+ META,
+ NR_PAGE_TYPE,
+ META_FLUSH,
+@@ -2605,11 +2605,17 @@ static inline void dec_valid_node_count(struct f2fs_sb_info *sbi,
+ {
+ spin_lock(&sbi->stat_lock);
+
+- f2fs_bug_on(sbi, !sbi->total_valid_block_count);
+- f2fs_bug_on(sbi, !sbi->total_valid_node_count);
++ if (unlikely(!sbi->total_valid_block_count ||
++ !sbi->total_valid_node_count)) {
++ f2fs_warn(sbi, "dec_valid_node_count: inconsistent block counts, total_valid_block:%u, total_valid_node:%u",
++ sbi->total_valid_block_count,
++ sbi->total_valid_node_count);
++ set_sbi_flag(sbi, SBI_NEED_FSCK);
++ } else {
++ sbi->total_valid_block_count--;
++ sbi->total_valid_node_count--;
++ }
+
+- sbi->total_valid_node_count--;
+- sbi->total_valid_block_count--;
+ if (sbi->reserved_blocks &&
+ sbi->current_reserved_blocks < sbi->reserved_blocks)
+ sbi->current_reserved_blocks++;
+@@ -4046,6 +4052,7 @@ extern struct kmem_cache *f2fs_inode_entry_slab;
+ * inline.c
+ */
+ bool f2fs_may_inline_data(struct inode *inode);
++bool f2fs_sanity_check_inline_data(struct inode *inode);
+ bool f2fs_may_inline_dentry(struct inode *inode);
+ void f2fs_do_read_inline_data(struct page *page, struct page *ipage);
+ void f2fs_truncate_inline_inode(struct inode *inode,
+diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
+index 5b89af0f27f05..176e97b985e61 100644
+--- a/fs/f2fs/file.c
++++ b/fs/f2fs/file.c
+@@ -1437,11 +1437,19 @@ static int f2fs_do_zero_range(struct dnode_of_data *dn, pgoff_t start,
+ ret = -ENOSPC;
+ break;
+ }
+- if (dn->data_blkaddr != NEW_ADDR) {
+- f2fs_invalidate_blocks(sbi, dn->data_blkaddr);
+- dn->data_blkaddr = NEW_ADDR;
+- f2fs_set_data_blkaddr(dn);
++
++ if (dn->data_blkaddr == NEW_ADDR)
++ continue;
++
++ if (!f2fs_is_valid_blkaddr(sbi, dn->data_blkaddr,
++ DATA_GENERIC_ENHANCE)) {
++ ret = -EFSCORRUPTED;
++ break;
+ }
++
++ f2fs_invalidate_blocks(sbi, dn->data_blkaddr);
++ dn->data_blkaddr = NEW_ADDR;
++ f2fs_set_data_blkaddr(dn);
+ }
+
+ f2fs_update_extent_cache_range(dn, start, 0, index - start);
+@@ -1766,6 +1774,10 @@ static long f2fs_fallocate(struct file *file, int mode,
+
+ inode_lock(inode);
+
++ ret = file_modified(file);
++ if (ret)
++ goto out;
++
+ if (mode & FALLOC_FL_PUNCH_HOLE) {
+ if (offset >= inode->i_size)
+ goto out;
+diff --git a/fs/f2fs/hash.c b/fs/f2fs/hash.c
+index 3cb1e7a24740f..049ce50cec9b0 100644
+--- a/fs/f2fs/hash.c
++++ b/fs/f2fs/hash.c
+@@ -91,7 +91,7 @@ static u32 TEA_hash_name(const u8 *p, size_t len)
+ /*
+ * Compute @fname->hash. For all directories, @fname->disk_name must be set.
+ * For casefolded directories, @fname->usr_fname must be set, and also
+- * @fname->cf_name if the filename is valid Unicode.
++ * @fname->cf_name if the filename is valid Unicode and is not "." or "..".
+ */
+ void f2fs_hash_filename(const struct inode *dir, struct f2fs_filename *fname)
+ {
+@@ -110,10 +110,11 @@ void f2fs_hash_filename(const struct inode *dir, struct f2fs_filename *fname)
+ /*
+ * If the casefolded name is provided, hash it instead of the
+ * on-disk name. If the casefolded name is *not* provided, that
+- * should only be because the name wasn't valid Unicode, so fall
+- * back to treating the name as an opaque byte sequence. Note
+- * that to handle encrypted directories, the fallback must use
+- * usr_fname (plaintext) rather than disk_name (ciphertext).
++ * should only be because the name wasn't valid Unicode or was
++ * "." or "..", so fall back to treating the name as an opaque
++ * byte sequence. Note that to handle encrypted directories,
++ * the fallback must use usr_fname (plaintext) rather than
++ * disk_name (ciphertext).
+ */
+ WARN_ON_ONCE(!fname->usr_fname->name);
+ if (fname->cf_name.name) {
+diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
+index a578bf83b803b..bf46a7dfbea2f 100644
+--- a/fs/f2fs/inline.c
++++ b/fs/f2fs/inline.c
+@@ -14,21 +14,40 @@
+ #include "node.h"
+ #include <trace/events/f2fs.h>
+
+-bool f2fs_may_inline_data(struct inode *inode)
++static bool support_inline_data(struct inode *inode)
+ {
+ if (f2fs_is_atomic_file(inode))
+ return false;
+-
+ if (!S_ISREG(inode->i_mode) && !S_ISLNK(inode->i_mode))
+ return false;
+-
+ if (i_size_read(inode) > MAX_INLINE_DATA(inode))
+ return false;
++ return true;
++}
+
+- if (f2fs_post_read_required(inode))
++bool f2fs_may_inline_data(struct inode *inode)
++{
++ if (!support_inline_data(inode))
+ return false;
+
+- return true;
++ return !f2fs_post_read_required(inode);
++}
++
++bool f2fs_sanity_check_inline_data(struct inode *inode)
++{
++ if (!f2fs_has_inline_data(inode))
++ return false;
++
++ if (!support_inline_data(inode))
++ return true;
++
++ /*
++ * used by sanity_check_inode(), when disk layout fields has not
++ * been synchronized to inmem fields.
++ */
++ return (S_ISREG(inode->i_mode) &&
++ (file_is_encrypt(inode) || file_is_verity(inode) ||
++ (F2FS_I(inode)->i_flags & F2FS_COMPR_FL)));
+ }
+
+ bool f2fs_may_inline_dentry(struct inode *inode)
+diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
+index 83639238a1fe9..e9818723103c6 100644
+--- a/fs/f2fs/inode.c
++++ b/fs/f2fs/inode.c
+@@ -276,8 +276,7 @@ static bool sanity_check_inode(struct inode *inode, struct page *node_page)
+ }
+ }
+
+- if (f2fs_has_inline_data(inode) &&
+- (!S_ISREG(inode->i_mode) && !S_ISLNK(inode->i_mode))) {
++ if (f2fs_sanity_check_inline_data(inode)) {
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
+ f2fs_warn(sbi, "%s: inode (ino=%lx, mode=%u) should not have inline_data, run fsck to fix",
+ __func__, inode->i_ino, inode->i_mode);
+@@ -796,8 +795,22 @@ retry:
+ f2fs_lock_op(sbi);
+ err = f2fs_remove_inode_page(inode);
+ f2fs_unlock_op(sbi);
+- if (err == -ENOENT)
++ if (err == -ENOENT) {
+ err = 0;
++
++ /*
++ * in fuzzed image, another node may has the same
++ * block address as inode's, if it was truncated
++ * previously, truncation of inode node will fail.
++ */
++ if (is_inode_flag_set(inode, FI_DIRTY_INODE)) {
++ f2fs_warn(F2FS_I_SB(inode),
++ "f2fs_evict_inode: inconsistent node id, ino:%lu",
++ inode->i_ino);
++ f2fs_inode_synced(inode);
++ set_sbi_flag(sbi, SBI_NEED_FSCK);
++ }
++ }
+ }
+
+ /* give more chances, if ENOMEM case */
+diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
+index 5ed79b29999fc..fffafd2aa4387 100644
+--- a/fs/f2fs/namei.c
++++ b/fs/f2fs/namei.c
+@@ -461,6 +461,13 @@ static int __recover_dot_dentries(struct inode *dir, nid_t pino)
+ return 0;
+ }
+
++ if (!S_ISDIR(dir->i_mode)) {
++ f2fs_err(sbi, "inconsistent inode status, skip recovering inline_dots inode (ino:%lu, i_mode:%u, pino:%u)",
++ dir->i_ino, dir->i_mode, pino);
++ set_sbi_flag(sbi, SBI_NEED_FSCK);
++ return -ENOTDIR;
++ }
++
+ err = f2fs_dquot_initialize(dir);
+ if (err)
+ return err;
+diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
+index bd9731cdec565..aa0162664a1ec 100644
+--- a/fs/f2fs/segment.c
++++ b/fs/f2fs/segment.c
+@@ -355,16 +355,19 @@ void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct list_head *head = &fi->inmem_pages;
+ struct inmem_pages *cur = NULL;
++ struct inmem_pages *tmp;
+
+ f2fs_bug_on(sbi, !page_private_atomic(page));
+
+ mutex_lock(&fi->inmem_lock);
+- list_for_each_entry(cur, head, list) {
+- if (cur->page == page)
++ list_for_each_entry(tmp, head, list) {
++ if (tmp->page == page) {
++ cur = tmp;
+ break;
++ }
+ }
+
+- f2fs_bug_on(sbi, list_empty(head) || cur->page != page);
++ f2fs_bug_on(sbi, !cur);
+ list_del(&cur->list);
+ mutex_unlock(&fi->inmem_lock);
+
+@@ -4457,7 +4460,7 @@ static int build_sit_entries(struct f2fs_sb_info *sbi)
+ unsigned int i, start, end;
+ unsigned int readed, start_blk = 0;
+ int err = 0;
+- block_t total_node_blocks = 0;
++ block_t sit_valid_blocks[2] = {0, 0};
+
+ do {
+ readed = f2fs_ra_meta_pages(sbi, start_blk, BIO_MAX_VECS,
+@@ -4482,8 +4485,8 @@ static int build_sit_entries(struct f2fs_sb_info *sbi)
+ if (err)
+ return err;
+ seg_info_from_raw_sit(se, &sit);
+- if (IS_NODESEG(se->type))
+- total_node_blocks += se->valid_blocks;
++
++ sit_valid_blocks[SE_PAGETYPE(se)] += se->valid_blocks;
+
+ if (f2fs_block_unit_discard(sbi)) {
+ /* build discard map only one time */
+@@ -4523,15 +4526,15 @@ static int build_sit_entries(struct f2fs_sb_info *sbi)
+ sit = sit_in_journal(journal, i);
+
+ old_valid_blocks = se->valid_blocks;
+- if (IS_NODESEG(se->type))
+- total_node_blocks -= old_valid_blocks;
++
++ sit_valid_blocks[SE_PAGETYPE(se)] -= old_valid_blocks;
+
+ err = check_block_count(sbi, start, &sit);
+ if (err)
+ break;
+ seg_info_from_raw_sit(se, &sit);
+- if (IS_NODESEG(se->type))
+- total_node_blocks += se->valid_blocks;
++
++ sit_valid_blocks[SE_PAGETYPE(se)] += se->valid_blocks;
+
+ if (f2fs_block_unit_discard(sbi)) {
+ if (is_set_ckpt_flags(sbi, CP_TRIMMED_FLAG)) {
+@@ -4553,13 +4556,24 @@ static int build_sit_entries(struct f2fs_sb_info *sbi)
+ }
+ up_read(&curseg->journal_rwsem);
+
+- if (!err && total_node_blocks != valid_node_count(sbi)) {
++ if (err)
++ return err;
++
++ if (sit_valid_blocks[NODE] != valid_node_count(sbi)) {
+ f2fs_err(sbi, "SIT is corrupted node# %u vs %u",
+- total_node_blocks, valid_node_count(sbi));
+- err = -EFSCORRUPTED;
++ sit_valid_blocks[NODE], valid_node_count(sbi));
++ return -EFSCORRUPTED;
+ }
+
+- return err;
++ if (sit_valid_blocks[DATA] + sit_valid_blocks[NODE] >
++ valid_user_blocks(sbi)) {
++ f2fs_err(sbi, "SIT is corrupted data# %u %u vs %u",
++ sit_valid_blocks[DATA], sit_valid_blocks[NODE],
++ valid_user_blocks(sbi));
++ return -EFSCORRUPTED;
++ }
++
++ return 0;
+ }
+
+ static void init_free_segmap(struct f2fs_sb_info *sbi)
+diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
+index 5c94caf0c0a1d..1fa26a9603cb8 100644
+--- a/fs/f2fs/segment.h
++++ b/fs/f2fs/segment.h
+@@ -24,6 +24,7 @@
+
+ #define IS_DATASEG(t) ((t) <= CURSEG_COLD_DATA)
+ #define IS_NODESEG(t) ((t) >= CURSEG_HOT_NODE && (t) <= CURSEG_COLD_NODE)
++#define SE_PAGETYPE(se) ((IS_NODESEG((se)->type) ? NODE : DATA))
+
+ static inline void sanity_check_seg_type(struct f2fs_sb_info *sbi,
+ unsigned short seg_type)
+@@ -572,11 +573,10 @@ static inline int reserved_sections(struct f2fs_sb_info *sbi)
+ return GET_SEC_FROM_SEG(sbi, reserved_segments(sbi));
+ }
+
+-static inline bool has_curseg_enough_space(struct f2fs_sb_info *sbi)
++static inline bool has_curseg_enough_space(struct f2fs_sb_info *sbi,
++ unsigned int node_blocks, unsigned int dent_blocks)
+ {
+- unsigned int node_blocks = get_pages(sbi, F2FS_DIRTY_NODES) +
+- get_pages(sbi, F2FS_DIRTY_DENTS);
+- unsigned int dent_blocks = get_pages(sbi, F2FS_DIRTY_DENTS);
++
+ unsigned int segno, left_blocks;
+ int i;
+
+@@ -602,19 +602,28 @@ static inline bool has_curseg_enough_space(struct f2fs_sb_info *sbi)
+ static inline bool has_not_enough_free_secs(struct f2fs_sb_info *sbi,
+ int freed, int needed)
+ {
+- int node_secs = get_blocktype_secs(sbi, F2FS_DIRTY_NODES);
+- int dent_secs = get_blocktype_secs(sbi, F2FS_DIRTY_DENTS);
+- int imeta_secs = get_blocktype_secs(sbi, F2FS_DIRTY_IMETA);
++ unsigned int total_node_blocks = get_pages(sbi, F2FS_DIRTY_NODES) +
++ get_pages(sbi, F2FS_DIRTY_DENTS) +
++ get_pages(sbi, F2FS_DIRTY_IMETA);
++ unsigned int total_dent_blocks = get_pages(sbi, F2FS_DIRTY_DENTS);
++ unsigned int node_secs = total_node_blocks / BLKS_PER_SEC(sbi);
++ unsigned int dent_secs = total_dent_blocks / BLKS_PER_SEC(sbi);
++ unsigned int node_blocks = total_node_blocks % BLKS_PER_SEC(sbi);
++ unsigned int dent_blocks = total_dent_blocks % BLKS_PER_SEC(sbi);
++ unsigned int free, need_lower, need_upper;
+
+ if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
+ return false;
+
+- if (free_sections(sbi) + freed == reserved_sections(sbi) + needed &&
+- has_curseg_enough_space(sbi))
++ free = free_sections(sbi) + freed;
++ need_lower = node_secs + dent_secs + reserved_sections(sbi) + needed;
++ need_upper = need_lower + (node_blocks ? 1 : 0) + (dent_blocks ? 1 : 0);
++
++ if (free > need_upper)
+ return false;
+- return (free_sections(sbi) + freed) <=
+- (node_secs + 2 * dent_secs + imeta_secs +
+- reserved_sections(sbi) + needed);
++ else if (free <= need_lower)
++ return true;
++ return !has_curseg_enough_space(sbi, node_blocks, dent_blocks);
+ }
+
+ static inline bool f2fs_is_checkpoint_ready(struct f2fs_sb_info *sbi)
+diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
+index 4368f90571bd6..1b59f95606c72 100644
+--- a/fs/f2fs/super.c
++++ b/fs/f2fs/super.c
+@@ -2684,7 +2684,8 @@ int f2fs_quota_sync(struct super_block *sb, int type)
+ if (!sb_has_quota_active(sb, cnt))
+ continue;
+
+- inode_lock(dqopt->files[cnt]);
++ if (!f2fs_sb_has_quota_ino(sbi))
++ inode_lock(dqopt->files[cnt]);
+
+ /*
+ * do_quotactl
+@@ -2703,7 +2704,8 @@ int f2fs_quota_sync(struct super_block *sb, int type)
+ f2fs_up_read(&sbi->quota_sem);
+ f2fs_unlock_op(sbi);
+
+- inode_unlock(dqopt->files[cnt]);
++ if (!f2fs_sb_has_quota_ino(sbi))
++ inode_unlock(dqopt->files[cnt]);
+
+ if (ret)
+ break;
+diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
+index 978ac6751aeb7..1db348f8f887a 100644
+--- a/fs/fat/fatent.c
++++ b/fs/fat/fatent.c
+@@ -94,7 +94,8 @@ static int fat12_ent_bread(struct super_block *sb, struct fat_entry *fatent,
+ err_brelse:
+ brelse(bhs[0]);
+ err:
+- fat_msg(sb, KERN_ERR, "FAT read failed (blocknr %llu)", (llu)blocknr);
++ fat_msg_ratelimit(sb, KERN_ERR, "FAT read failed (blocknr %llu)",
++ (llu)blocknr);
+ return -EIO;
+ }
+
+@@ -107,8 +108,8 @@ static int fat_ent_bread(struct super_block *sb, struct fat_entry *fatent,
+ fatent->fat_inode = MSDOS_SB(sb)->fat_inode;
+ fatent->bhs[0] = sb_bread(sb, blocknr);
+ if (!fatent->bhs[0]) {
+- fat_msg(sb, KERN_ERR, "FAT read failed (blocknr %llu)",
+- (llu)blocknr);
++ fat_msg_ratelimit(sb, KERN_ERR, "FAT read failed (blocknr %llu)",
++ (llu)blocknr);
+ return -EIO;
+ }
+ fatent->nr_bhs = 1;
+diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
+index 1fae0196292a1..a1074a26e784d 100644
+--- a/fs/fs-writeback.c
++++ b/fs/fs-writeback.c
+@@ -1779,11 +1779,12 @@ static long writeback_sb_inodes(struct super_block *sb,
+ };
+ unsigned long start_time = jiffies;
+ long write_chunk;
+- long wrote = 0; /* count both pages and inodes */
++ long total_wrote = 0; /* count both pages and inodes */
+
+ while (!list_empty(&wb->b_io)) {
+ struct inode *inode = wb_inode(wb->b_io.prev);
+ struct bdi_writeback *tmp_wb;
++ long wrote;
+
+ if (inode->i_sb != sb) {
+ if (work->sb) {
+@@ -1859,7 +1860,9 @@ static long writeback_sb_inodes(struct super_block *sb,
+
+ wbc_detach_inode(&wbc);
+ work->nr_pages -= write_chunk - wbc.nr_to_write;
+- wrote += write_chunk - wbc.nr_to_write;
++ wrote = write_chunk - wbc.nr_to_write - wbc.pages_skipped;
++ wrote = wrote < 0 ? 0 : wrote;
++ total_wrote += wrote;
+
+ if (need_resched()) {
+ /*
+@@ -1881,7 +1884,7 @@ static long writeback_sb_inodes(struct super_block *sb,
+ tmp_wb = inode_to_wb_and_lock_list(inode);
+ spin_lock(&inode->i_lock);
+ if (!(inode->i_state & I_DIRTY_ALL))
+- wrote++;
++ total_wrote++;
+ requeue_inode(inode, tmp_wb, &wbc);
+ inode_sync_complete(inode);
+ spin_unlock(&inode->i_lock);
+@@ -1895,14 +1898,14 @@ static long writeback_sb_inodes(struct super_block *sb,
+ * bail out to wb_writeback() often enough to check
+ * background threshold and other termination conditions.
+ */
+- if (wrote) {
++ if (total_wrote) {
+ if (time_is_before_jiffies(start_time + HZ / 10UL))
+ break;
+ if (work->nr_pages <= 0)
+ break;
+ }
+ }
+- return wrote;
++ return total_wrote;
+ }
+
+ static long __writeback_inodes_wb(struct bdi_writeback *wb,
+diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
+index be0997e24d60b..dc77080a82bbf 100644
+--- a/fs/gfs2/quota.c
++++ b/fs/gfs2/quota.c
+@@ -531,34 +531,42 @@ static void qdsb_put(struct gfs2_quota_data *qd)
+ */
+ int gfs2_qa_get(struct gfs2_inode *ip)
+ {
+- int error = 0;
+ struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
++ struct inode *inode = &ip->i_inode;
+
+ if (sdp->sd_args.ar_quota == GFS2_QUOTA_OFF)
+ return 0;
+
+- down_write(&ip->i_rw_mutex);
++ spin_lock(&inode->i_lock);
+ if (ip->i_qadata == NULL) {
+- ip->i_qadata = kmem_cache_zalloc(gfs2_qadata_cachep, GFP_NOFS);
+- if (!ip->i_qadata) {
+- error = -ENOMEM;
+- goto out;
+- }
++ struct gfs2_qadata *tmp;
++
++ spin_unlock(&inode->i_lock);
++ tmp = kmem_cache_zalloc(gfs2_qadata_cachep, GFP_NOFS);
++ if (!tmp)
++ return -ENOMEM;
++
++ spin_lock(&inode->i_lock);
++ if (ip->i_qadata == NULL)
++ ip->i_qadata = tmp;
++ else
++ kmem_cache_free(gfs2_qadata_cachep, tmp);
+ }
+ ip->i_qadata->qa_ref++;
+-out:
+- up_write(&ip->i_rw_mutex);
+- return error;
++ spin_unlock(&inode->i_lock);
++ return 0;
+ }
+
+ void gfs2_qa_put(struct gfs2_inode *ip)
+ {
+- down_write(&ip->i_rw_mutex);
++ struct inode *inode = &ip->i_inode;
++
++ spin_lock(&inode->i_lock);
+ if (ip->i_qadata && --ip->i_qadata->qa_ref == 0) {
+ kmem_cache_free(gfs2_qadata_cachep, ip->i_qadata);
+ ip->i_qadata = NULL;
+ }
+- up_write(&ip->i_rw_mutex);
++ spin_unlock(&inode->i_lock);
+ }
+
+ int gfs2_quota_hold(struct gfs2_inode *ip, kuid_t uid, kgid_t gid)
+diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
+index dd3a088db11d1..591599829e2a6 100644
+--- a/fs/hugetlbfs/inode.c
++++ b/fs/hugetlbfs/inode.c
+@@ -1048,12 +1048,12 @@ static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
+ if (sbinfo->spool) {
+ long free_pages;
+
+- spin_lock(&sbinfo->spool->lock);
++ spin_lock_irq(&sbinfo->spool->lock);
+ buf->f_blocks = sbinfo->spool->max_hpages;
+ free_pages = sbinfo->spool->max_hpages
+ - sbinfo->spool->used_hpages;
+ buf->f_bavail = buf->f_bfree = free_pages;
+- spin_unlock(&sbinfo->spool->lock);
++ spin_unlock_irq(&sbinfo->spool->lock);
+ buf->f_files = sbinfo->max_inodes;
+ buf->f_ffree = sbinfo->free_inodes;
+ }
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+index e0823f58f7959..9e247335e70d5 100644
+--- a/fs/io_uring.c
++++ b/fs/io_uring.c
+@@ -5981,6 +5981,7 @@ static void io_poll_cancel_req(struct io_kiocb *req)
+
+ #define wqe_to_req(wait) ((void *)((unsigned long) (wait)->private & ~1))
+ #define wqe_is_double(wait) ((unsigned long) (wait)->private & 1)
++#define IO_ASYNC_POLL_COMMON (EPOLLONESHOT | POLLPRI)
+
+ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
+ void *key)
+@@ -6015,7 +6016,7 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
+ }
+
+ /* for instances that support it check for an event match first */
+- if (mask && !(mask & poll->events))
++ if (mask && !(mask & (poll->events & ~IO_ASYNC_POLL_COMMON)))
+ return 0;
+
+ if (io_poll_get_ownership(req)) {
+@@ -6171,7 +6172,7 @@ static int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
+ struct io_ring_ctx *ctx = req->ctx;
+ struct async_poll *apoll;
+ struct io_poll_table ipt;
+- __poll_t mask = EPOLLONESHOT | POLLERR | POLLPRI;
++ __poll_t mask = IO_ASYNC_POLL_COMMON | POLLERR;
+ int ret;
+
+ if (!def->pollin && !def->pollout)
+@@ -7327,6 +7328,8 @@ fail:
+ * wait for request slots on the block side.
+ */
+ if (!needs_poll) {
++ if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
++ break;
+ cond_resched();
+ continue;
+ }
+diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
+index 8ce8720093b9a..358ee1fb6f0db 100644
+--- a/fs/iomap/buffered-io.c
++++ b/fs/iomap/buffered-io.c
+@@ -531,7 +531,8 @@ iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
+ * write started inside the existing inode size.
+ */
+ if (pos + len > i_size)
+- truncate_pagecache_range(inode, max(pos, i_size), pos + len);
++ truncate_pagecache_range(inode, max(pos, i_size),
++ pos + len - 1);
+ }
+
+ static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
+diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c
+index d8502f4989d9d..e75f31b81d634 100644
+--- a/fs/jfs/jfs_dmap.c
++++ b/fs/jfs/jfs_dmap.c
+@@ -385,7 +385,8 @@ int dbFree(struct inode *ip, s64 blkno, s64 nblocks)
+ }
+
+ /* write the last buffer. */
+- write_metapage(mp);
++ if (mp)
++ write_metapage(mp);
+
+ IREAD_UNLOCK(ipbmap);
+
+diff --git a/fs/ksmbd/connection.c b/fs/ksmbd/connection.c
+index 208d2cff7bd37..bc6050b67256d 100644
+--- a/fs/ksmbd/connection.c
++++ b/fs/ksmbd/connection.c
+@@ -62,7 +62,7 @@ struct ksmbd_conn *ksmbd_conn_alloc(void)
+ atomic_set(&conn->req_running, 0);
+ atomic_set(&conn->r_count, 0);
+ conn->total_credits = 1;
+- conn->outstanding_credits = 1;
++ conn->outstanding_credits = 0;
+
+ init_waitqueue_head(&conn->req_running_q);
+ INIT_LIST_HEAD(&conn->conns_list);
+diff --git a/fs/ksmbd/smb2misc.c b/fs/ksmbd/smb2misc.c
+index 4a9460153b595..f8f456377a51d 100644
+--- a/fs/ksmbd/smb2misc.c
++++ b/fs/ksmbd/smb2misc.c
+@@ -338,7 +338,7 @@ static int smb2_validate_credit_charge(struct ksmbd_conn *conn,
+ ret = 1;
+ }
+
+- if ((u64)conn->outstanding_credits + credit_charge > conn->vals->max_credits) {
++ if ((u64)conn->outstanding_credits + credit_charge > conn->total_credits) {
+ ksmbd_debug(SMB, "Limits exceeding the maximum allowable outstanding requests, given : %u, pending : %u\n",
+ credit_charge, conn->outstanding_credits);
+ ret = 1;
+diff --git a/fs/ksmbd/smb_common.c b/fs/ksmbd/smb_common.c
+index 9a7e211dbf4f4..7f8ab14fb8ec1 100644
+--- a/fs/ksmbd/smb_common.c
++++ b/fs/ksmbd/smb_common.c
+@@ -140,8 +140,10 @@ int ksmbd_verify_smb_message(struct ksmbd_work *work)
+
+ hdr = work->request_buf;
+ if (*(__le32 *)hdr->Protocol == SMB1_PROTO_NUMBER &&
+- hdr->Command == SMB_COM_NEGOTIATE)
++ hdr->Command == SMB_COM_NEGOTIATE) {
++ work->conn->outstanding_credits++;
+ return 0;
++ }
+
+ return -EINVAL;
+ }
+diff --git a/fs/namei.c b/fs/namei.c
+index 509657fdf4f56..fd3c95ac261bd 100644
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -2768,7 +2768,8 @@ struct dentry *lookup_one(struct user_namespace *mnt_userns, const char *name,
+ EXPORT_SYMBOL(lookup_one);
+
+ /**
+- * lookup_one_len_unlocked - filesystem helper to lookup single pathname component
++ * lookup_one_unlocked - filesystem helper to lookup single pathname component
++ * @mnt_userns: idmapping of the mount the lookup is performed from
+ * @name: pathname component to lookup
+ * @base: base directory to lookup from
+ * @len: maximum length @len should be interpreted to
+@@ -2779,14 +2780,15 @@ EXPORT_SYMBOL(lookup_one);
+ * Unlike lookup_one_len, it should be called without the parent
+ * i_mutex held, and will take the i_mutex itself if necessary.
+ */
+-struct dentry *lookup_one_len_unlocked(const char *name,
+- struct dentry *base, int len)
++struct dentry *lookup_one_unlocked(struct user_namespace *mnt_userns,
++ const char *name, struct dentry *base,
++ int len)
+ {
+ struct qstr this;
+ int err;
+ struct dentry *ret;
+
+- err = lookup_one_common(&init_user_ns, name, base, len, &this);
++ err = lookup_one_common(mnt_userns, name, base, len, &this);
+ if (err)
+ return ERR_PTR(err);
+
+@@ -2795,6 +2797,59 @@ struct dentry *lookup_one_len_unlocked(const char *name,
+ ret = lookup_slow(&this, base, 0);
+ return ret;
+ }
++EXPORT_SYMBOL(lookup_one_unlocked);
++
++/**
++ * lookup_one_positive_unlocked - filesystem helper to lookup single
++ * pathname component
++ * @mnt_userns: idmapping of the mount the lookup is performed from
++ * @name: pathname component to lookup
++ * @base: base directory to lookup from
++ * @len: maximum length @len should be interpreted to
++ *
++ * This helper will yield ERR_PTR(-ENOENT) on negatives. The helper returns
++ * known positive or ERR_PTR(). This is what most of the users want.
++ *
++ * Note that pinned negative with unlocked parent _can_ become positive at any
++ * time, so callers of lookup_one_unlocked() need to be very careful; pinned
++ * positives have >d_inode stable, so this one avoids such problems.
++ *
++ * Note that this routine is purely a helper for filesystem usage and should
++ * not be called by generic code.
++ *
++ * The helper should be called without i_mutex held.
++ */
++struct dentry *lookup_one_positive_unlocked(struct user_namespace *mnt_userns,
++ const char *name,
++ struct dentry *base, int len)
++{
++ struct dentry *ret = lookup_one_unlocked(mnt_userns, name, base, len);
++
++ if (!IS_ERR(ret) && d_flags_negative(smp_load_acquire(&ret->d_flags))) {
++ dput(ret);
++ ret = ERR_PTR(-ENOENT);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(lookup_one_positive_unlocked);
++
++/**
++ * lookup_one_len_unlocked - filesystem helper to lookup single pathname component
++ * @name: pathname component to lookup
++ * @base: base directory to lookup from
++ * @len: maximum length @len should be interpreted to
++ *
++ * Note that this routine is purely a helper for filesystem usage and should
++ * not be called by generic code.
++ *
++ * Unlike lookup_one_len, it should be called without the parent
++ * i_mutex held, and will take the i_mutex itself if necessary.
++ */
++struct dentry *lookup_one_len_unlocked(const char *name,
++ struct dentry *base, int len)
++{
++ return lookup_one_unlocked(&init_user_ns, name, base, len);
++}
+ EXPORT_SYMBOL(lookup_one_len_unlocked);
+
+ /*
+@@ -2808,12 +2863,7 @@ EXPORT_SYMBOL(lookup_one_len_unlocked);
+ struct dentry *lookup_positive_unlocked(const char *name,
+ struct dentry *base, int len)
+ {
+- struct dentry *ret = lookup_one_len_unlocked(name, base, len);
+- if (!IS_ERR(ret) && d_flags_negative(smp_load_acquire(&ret->d_flags))) {
+- dput(ret);
+- ret = ERR_PTR(-ENOENT);
+- }
+- return ret;
++ return lookup_one_positive_unlocked(&init_user_ns, name, base, len);
+ }
+ EXPORT_SYMBOL(lookup_positive_unlocked);
+
+diff --git a/fs/namespace.c b/fs/namespace.c
+index afe2b64b14f1f..41461f55c0390 100644
+--- a/fs/namespace.c
++++ b/fs/namespace.c
+@@ -4026,8 +4026,9 @@ static int can_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
+ static inline bool mnt_allow_writers(const struct mount_kattr *kattr,
+ const struct mount *mnt)
+ {
+- return !(kattr->attr_set & MNT_READONLY) ||
+- (mnt->mnt.mnt_flags & MNT_READONLY);
++ return (!(kattr->attr_set & MNT_READONLY) ||
++ (mnt->mnt.mnt_flags & MNT_READONLY)) &&
++ !kattr->mnt_userns;
+ }
+
+ static int mount_setattr_prepare(struct mount_kattr *kattr, struct mount *mnt)
+diff --git a/fs/nfs/file.c b/fs/nfs/file.c
+index 150b7fa8f0a73..3f17748eaf290 100644
+--- a/fs/nfs/file.c
++++ b/fs/nfs/file.c
+@@ -204,15 +204,16 @@ static int
+ nfs_file_fsync_commit(struct file *file, int datasync)
+ {
+ struct inode *inode = file_inode(file);
+- int ret;
++ int ret, ret2;
+
+ dprintk("NFS: fsync file(%pD2) datasync %d\n", file, datasync);
+
+ nfs_inc_stats(inode, NFSIOS_VFSFSYNC);
+ ret = nfs_commit_inode(inode, FLUSH_SYNC);
+- if (ret < 0)
+- return ret;
+- return file_check_and_advance_wb_err(file);
++ ret2 = file_check_and_advance_wb_err(file);
++ if (ret2 < 0)
++ return ret2;
++ return ret;
+ }
+
+ int
+@@ -385,11 +386,8 @@ static int nfs_write_end(struct file *file, struct address_space *mapping,
+ return status;
+ NFS_I(mapping->host)->write_io += copied;
+
+- if (nfs_ctx_key_to_expire(ctx, mapping->host)) {
+- status = nfs_wb_all(mapping->host);
+- if (status < 0)
+- return status;
+- }
++ if (nfs_ctx_key_to_expire(ctx, mapping->host))
++ nfs_wb_all(mapping->host);
+
+ return copied;
+ }
+@@ -597,18 +595,6 @@ static const struct vm_operations_struct nfs_file_vm_ops = {
+ .page_mkwrite = nfs_vm_page_mkwrite,
+ };
+
+-static int nfs_need_check_write(struct file *filp, struct inode *inode,
+- int error)
+-{
+- struct nfs_open_context *ctx;
+-
+- ctx = nfs_file_open_context(filp);
+- if (nfs_error_is_fatal_on_server(error) ||
+- nfs_ctx_key_to_expire(ctx, inode))
+- return 1;
+- return 0;
+-}
+-
+ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
+ {
+ struct file *file = iocb->ki_filp;
+@@ -636,7 +622,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
+ if (iocb->ki_flags & IOCB_APPEND || iocb->ki_pos > i_size_read(inode)) {
+ result = nfs_revalidate_file_size(inode, file);
+ if (result)
+- goto out;
++ return result;
+ }
+
+ nfs_clear_invalid_mapping(file->f_mapping);
+@@ -655,6 +641,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
+
+ written = result;
+ iocb->ki_pos += written;
++ nfs_add_stats(inode, NFSIOS_NORMALWRITTENBYTES, written);
+
+ if (mntflags & NFS_MOUNT_WRITE_EAGER) {
+ result = filemap_fdatawrite_range(file->f_mapping,
+@@ -672,17 +659,22 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
+ }
+ result = generic_write_sync(iocb, written);
+ if (result < 0)
+- goto out;
++ return result;
+
++out:
+ /* Return error values */
+ error = filemap_check_wb_err(file->f_mapping, since);
+- if (nfs_need_check_write(file, inode, error)) {
+- int err = nfs_wb_all(inode);
+- if (err < 0)
+- result = err;
++ switch (error) {
++ default:
++ break;
++ case -EDQUOT:
++ case -EFBIG:
++ case -ENOSPC:
++ nfs_wb_all(inode);
++ error = file_check_and_advance_wb_err(file);
++ if (error < 0)
++ result = error;
+ }
+- nfs_add_stats(inode, NFSIOS_NORMALWRITTENBYTES, written);
+-out:
+ return result;
+
+ out_swapfile:
+diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
+index f73c09a9cf0a9..e861d7bae305f 100644
+--- a/fs/nfs/fscache.c
++++ b/fs/nfs/fscache.c
+@@ -231,11 +231,10 @@ void nfs_fscache_release_file(struct inode *inode, struct file *filp)
+ {
+ struct nfs_fscache_inode_auxdata auxdata;
+ struct fscache_cookie *cookie = nfs_i_fscache(inode);
++ loff_t i_size = i_size_read(inode);
+
+- if (fscache_cookie_valid(cookie)) {
+- nfs_fscache_update_auxdata(&auxdata, inode);
+- fscache_unuse_cookie(cookie, &auxdata, NULL);
+- }
++ nfs_fscache_update_auxdata(&auxdata, inode);
++ fscache_unuse_cookie(cookie, &auxdata, &i_size);
+ }
+
+ /*
+diff --git a/fs/nfs/nfs4namespace.c b/fs/nfs/nfs4namespace.c
+index 3680c8da510c9..f2dbf904c5989 100644
+--- a/fs/nfs/nfs4namespace.c
++++ b/fs/nfs/nfs4namespace.c
+@@ -417,6 +417,9 @@ static int nfs_do_refmount(struct fs_context *fc, struct rpc_clnt *client)
+ fs_locations = kmalloc(sizeof(struct nfs4_fs_locations), GFP_KERNEL);
+ if (!fs_locations)
+ goto out_free;
++ fs_locations->fattr = nfs_alloc_fattr();
++ if (!fs_locations->fattr)
++ goto out_free_2;
+
+ /* Get locations */
+ dentry = ctx->clone_data.dentry;
+@@ -427,14 +430,16 @@ static int nfs_do_refmount(struct fs_context *fc, struct rpc_clnt *client)
+ err = nfs4_proc_fs_locations(client, d_inode(parent), &dentry->d_name, fs_locations, page);
+ dput(parent);
+ if (err != 0)
+- goto out_free_2;
++ goto out_free_3;
+
+ err = -ENOENT;
+ if (fs_locations->nlocations <= 0 ||
+ fs_locations->fs_path.ncomponents <= 0)
+- goto out_free_2;
++ goto out_free_3;
+
+ err = nfs_follow_referral(fc, fs_locations);
++out_free_3:
++ kfree(fs_locations->fattr);
+ out_free_2:
+ kfree(fs_locations);
+ out_free:
+diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
+index a79f66432bd39..8c5907287c161 100644
+--- a/fs/nfs/nfs4proc.c
++++ b/fs/nfs/nfs4proc.c
+@@ -1162,7 +1162,7 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
+ {
+ unsigned short task_flags = 0;
+
+- if (server->nfs_client->cl_minorversion)
++ if (server->caps & NFS_CAP_MOVEABLE)
+ task_flags = RPC_TASK_MOVEABLE;
+ return nfs4_do_call_sync(clnt, server, msg, args, res, task_flags);
+ }
+@@ -2568,7 +2568,7 @@ static int nfs4_run_open_task(struct nfs4_opendata *data,
+ };
+ int status;
+
+- if (server->nfs_client->cl_minorversion)
++ if (nfs_server_capable(dir, NFS_CAP_MOVEABLE))
+ task_setup_data.flags |= RPC_TASK_MOVEABLE;
+
+ kref_get(&data->kref);
+@@ -3733,7 +3733,7 @@ int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait)
+ };
+ int status = -ENOMEM;
+
+- if (server->nfs_client->cl_minorversion)
++ if (nfs_server_capable(state->inode, NFS_CAP_MOVEABLE))
+ task_setup_data.flags |= RPC_TASK_MOVEABLE;
+
+ nfs4_state_protect(server->nfs_client, NFS_SP4_MACH_CRED_CLEANUP,
+@@ -4243,6 +4243,8 @@ static int nfs4_get_referral(struct rpc_clnt *client, struct inode *dir,
+ if (locations == NULL)
+ goto out;
+
++ locations->fattr = fattr;
++
+ status = nfs4_proc_fs_locations(client, dir, name, locations, page);
+ if (status != 0)
+ goto out;
+@@ -4252,17 +4254,14 @@ static int nfs4_get_referral(struct rpc_clnt *client, struct inode *dir,
+ * referral. Cause us to drop into the exception handler, which
+ * will kick off migration recovery.
+ */
+- if (nfs_fsid_equal(&NFS_SERVER(dir)->fsid, &locations->fattr.fsid)) {
++ if (nfs_fsid_equal(&NFS_SERVER(dir)->fsid, &fattr->fsid)) {
+ dprintk("%s: server did not return a different fsid for"
+ " a referral at %s\n", __func__, name->name);
+ status = -NFS4ERR_MOVED;
+ goto out;
+ }
+ /* Fixup attributes for the nfs_lookup() call to nfs_fhget() */
+- nfs_fixup_referral_attributes(&locations->fattr);
+-
+- /* replace the lookup nfs_fattr with the locations nfs_fattr */
+- memcpy(fattr, &locations->fattr, sizeof(struct nfs_fattr));
++ nfs_fixup_referral_attributes(fattr);
+ memset(fhandle, 0, sizeof(struct nfs_fh));
+ out:
+ if (page)
+@@ -4404,7 +4403,7 @@ static int _nfs4_proc_lookup(struct rpc_clnt *clnt, struct inode *dir,
+ };
+ unsigned short task_flags = 0;
+
+- if (server->nfs_client->cl_minorversion)
++ if (nfs_server_capable(dir, NFS_CAP_MOVEABLE))
+ task_flags = RPC_TASK_MOVEABLE;
+
+ /* Is this is an attribute revalidation, subject to softreval? */
+@@ -6612,10 +6611,13 @@ static int _nfs4_proc_delegreturn(struct inode *inode, const struct cred *cred,
+ .rpc_client = server->client,
+ .rpc_message = &msg,
+ .callback_ops = &nfs4_delegreturn_ops,
+- .flags = RPC_TASK_ASYNC | RPC_TASK_TIMEOUT | RPC_TASK_MOVEABLE,
++ .flags = RPC_TASK_ASYNC | RPC_TASK_TIMEOUT,
+ };
+ int status = 0;
+
++ if (nfs_server_capable(inode, NFS_CAP_MOVEABLE))
++ task_setup_data.flags |= RPC_TASK_MOVEABLE;
++
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (data == NULL)
+ return -ENOMEM;
+@@ -6929,10 +6931,8 @@ static struct rpc_task *nfs4_do_unlck(struct file_lock *fl,
+ .workqueue = nfsiod_workqueue,
+ .flags = RPC_TASK_ASYNC,
+ };
+- struct nfs_client *client =
+- NFS_SERVER(lsp->ls_state->inode)->nfs_client;
+
+- if (client->cl_minorversion)
++ if (nfs_server_capable(lsp->ls_state->inode, NFS_CAP_MOVEABLE))
+ task_setup_data.flags |= RPC_TASK_MOVEABLE;
+
+ nfs4_state_protect(NFS_SERVER(lsp->ls_state->inode)->nfs_client,
+@@ -7203,9 +7203,8 @@ static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *f
+ .flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
+ };
+ int ret;
+- struct nfs_client *client = NFS_SERVER(state->inode)->nfs_client;
+
+- if (client->cl_minorversion)
++ if (nfs_server_capable(state->inode, NFS_CAP_MOVEABLE))
+ task_setup_data.flags |= RPC_TASK_MOVEABLE;
+
+ data = nfs4_alloc_lockdata(fl, nfs_file_open_context(fl->fl_file),
+@@ -7902,7 +7901,7 @@ static int _nfs4_proc_fs_locations(struct rpc_clnt *client, struct inode *dir,
+ else
+ bitmask[1] &= ~FATTR4_WORD1_MOUNTED_ON_FILEID;
+
+- nfs_fattr_init(&fs_locations->fattr);
++ nfs_fattr_init(fs_locations->fattr);
+ fs_locations->server = server;
+ fs_locations->nlocations = 0;
+ status = nfs4_call_sync(client, server, &msg, &args.seq_args, &res.seq_res, 0);
+@@ -7967,7 +7966,7 @@ static int _nfs40_proc_get_locations(struct nfs_server *server,
+ unsigned long now = jiffies;
+ int status;
+
+- nfs_fattr_init(&locations->fattr);
++ nfs_fattr_init(locations->fattr);
+ locations->server = server;
+ locations->nlocations = 0;
+
+@@ -8032,7 +8031,7 @@ static int _nfs41_proc_get_locations(struct nfs_server *server,
+ };
+ int status;
+
+- nfs_fattr_init(&locations->fattr);
++ nfs_fattr_init(locations->fattr);
+ locations->server = server;
+ locations->nlocations = 0;
+
+@@ -10391,7 +10390,8 @@ static const struct nfs4_minor_version_ops nfs_v4_1_minor_ops = {
+ | NFS_CAP_POSIX_LOCK
+ | NFS_CAP_STATEID_NFSV41
+ | NFS_CAP_ATOMIC_OPEN_V1
+- | NFS_CAP_LGOPEN,
++ | NFS_CAP_LGOPEN
++ | NFS_CAP_MOVEABLE,
+ .init_client = nfs41_init_client,
+ .shutdown_client = nfs41_shutdown_client,
+ .match_stateid = nfs41_match_stateid,
+@@ -10426,7 +10426,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
+ | NFS_CAP_LAYOUTSTATS
+ | NFS_CAP_CLONE
+ | NFS_CAP_LAYOUTERROR
+- | NFS_CAP_READ_PLUS,
++ | NFS_CAP_READ_PLUS
++ | NFS_CAP_MOVEABLE,
+ .init_client = nfs41_init_client,
+ .shutdown_client = nfs41_shutdown_client,
+ .match_stateid = nfs41_match_stateid,
+diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
+index 9e1c987c81e7f..9656d40bb4887 100644
+--- a/fs/nfs/nfs4state.c
++++ b/fs/nfs/nfs4state.c
+@@ -2106,6 +2106,11 @@ static int nfs4_try_migration(struct nfs_server *server, const struct cred *cred
+ dprintk("<-- %s: no memory\n", __func__);
+ goto out;
+ }
++ locations->fattr = nfs_alloc_fattr();
++ if (locations->fattr == NULL) {
++ dprintk("<-- %s: no memory\n", __func__);
++ goto out;
++ }
+
+ inode = d_inode(server->super->s_root);
+ result = nfs4_proc_get_locations(server, NFS_FH(inode), locations,
+@@ -2120,7 +2125,7 @@ static int nfs4_try_migration(struct nfs_server *server, const struct cred *cred
+ if (!locations->nlocations)
+ goto out;
+
+- if (!(locations->fattr.valid & NFS_ATTR_FATTR_V4_LOCATIONS)) {
++ if (!(locations->fattr->valid & NFS_ATTR_FATTR_V4_LOCATIONS)) {
+ dprintk("<-- %s: No fs_locations data, migration skipped\n",
+ __func__);
+ goto out;
+@@ -2145,6 +2150,8 @@ static int nfs4_try_migration(struct nfs_server *server, const struct cred *cred
+ out:
+ if (page != NULL)
+ __free_page(page);
++ if (locations != NULL)
++ kfree(locations->fattr);
+ kfree(locations);
+ if (result) {
+ pr_err("NFS: migration recovery failed (server %s)\n",
+diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
+index 86a5f6516928e..5d822594336dc 100644
+--- a/fs/nfs/nfs4xdr.c
++++ b/fs/nfs/nfs4xdr.c
+@@ -7051,7 +7051,7 @@ static int nfs4_xdr_dec_fs_locations(struct rpc_rqst *req,
+ if (res->migration) {
+ xdr_enter_page(xdr, PAGE_SIZE);
+ status = decode_getfattr_generic(xdr,
+- &res->fs_locations->fattr,
++ res->fs_locations->fattr,
+ NULL, res->fs_locations,
+ res->fs_locations->server);
+ if (status)
+@@ -7064,7 +7064,7 @@ static int nfs4_xdr_dec_fs_locations(struct rpc_rqst *req,
+ goto out;
+ xdr_enter_page(xdr, PAGE_SIZE);
+ status = decode_getfattr_generic(xdr,
+- &res->fs_locations->fattr,
++ res->fs_locations->fattr,
+ NULL, res->fs_locations,
+ res->fs_locations->server);
+ }
+diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
+index 9157dd19b8b4f..317cedfa52bf6 100644
+--- a/fs/nfs/pagelist.c
++++ b/fs/nfs/pagelist.c
+@@ -767,6 +767,9 @@ int nfs_initiate_pgio(struct rpc_clnt *clnt, struct nfs_pgio_header *hdr,
+ .flags = RPC_TASK_ASYNC | flags,
+ };
+
++ if (nfs_server_capable(hdr->inode, NFS_CAP_MOVEABLE))
++ task_setup_data.flags |= RPC_TASK_MOVEABLE;
++
+ hdr->rw_ops->rw_initiate(hdr, &msg, rpc_ops, &task_setup_data, how);
+
+ dprintk("NFS: initiated pgio call "
+diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
+index 856c962273c71..68a87be3e6f96 100644
+--- a/fs/nfs/pnfs.c
++++ b/fs/nfs/pnfs.c
+@@ -2000,6 +2000,7 @@ lookup_again:
+ lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
+ if (lo == NULL) {
+ spin_unlock(&ino->i_lock);
++ lseg = ERR_PTR(-ENOMEM);
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
+ PNFS_UPDATE_LAYOUT_NOMEM);
+ goto out;
+@@ -2128,6 +2129,7 @@ lookup_again:
+
+ lgp = pnfs_alloc_init_layoutget_args(ino, ctx, &stateid, &arg, gfp_flags);
+ if (!lgp) {
++ lseg = ERR_PTR(-ENOMEM);
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, NULL,
+ PNFS_UPDATE_LAYOUT_NOMEM);
+ nfs_layoutget_end(lo);
+diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
+index 6f325e10056ce..9697cd5d2561c 100644
+--- a/fs/nfs/unlink.c
++++ b/fs/nfs/unlink.c
+@@ -102,6 +102,10 @@ static void nfs_do_call_unlink(struct inode *inode, struct nfs_unlinkdata *data)
+ };
+ struct rpc_task *task;
+ struct inode *dir = d_inode(data->dentry->d_parent);
++
++ if (nfs_server_capable(inode, NFS_CAP_MOVEABLE))
++ task_setup_data.flags |= RPC_TASK_MOVEABLE;
++
+ nfs_sb_active(dir->i_sb);
+ data->args.fh = NFS_FH(dir);
+ nfs_fattr_init(data->res.dir_attr);
+@@ -344,6 +348,10 @@ nfs_async_rename(struct inode *old_dir, struct inode *new_dir,
+ .flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF,
+ };
+
++ if (nfs_server_capable(old_dir, NFS_CAP_MOVEABLE) &&
++ nfs_server_capable(new_dir, NFS_CAP_MOVEABLE))
++ task_setup_data.flags |= RPC_TASK_MOVEABLE;
++
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (data == NULL)
+ return ERR_PTR(-ENOMEM);
+diff --git a/fs/nfs/write.c b/fs/nfs/write.c
+index f00d45cf80ef3..1c706465d090b 100644
+--- a/fs/nfs/write.c
++++ b/fs/nfs/write.c
+@@ -603,8 +603,9 @@ static void nfs_write_error(struct nfs_page *req, int error)
+ * Find an associated nfs write request, and prepare to flush it out
+ * May return an error if the user signalled nfs_wait_on_request().
+ */
+-static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
+- struct page *page)
++static int nfs_page_async_flush(struct page *page,
++ struct writeback_control *wbc,
++ struct nfs_pageio_descriptor *pgio)
+ {
+ struct nfs_page *req;
+ int ret = 0;
+@@ -630,11 +631,11 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
+ /*
+ * Remove the problematic req upon fatal errors on the server
+ */
+- if (nfs_error_is_fatal(ret)) {
+- if (nfs_error_is_fatal_on_server(ret))
+- goto out_launder;
+- } else
+- ret = -EAGAIN;
++ if (nfs_error_is_fatal_on_server(ret))
++ goto out_launder;
++ if (wbc->sync_mode == WB_SYNC_NONE)
++ ret = AOP_WRITEPAGE_ACTIVATE;
++ redirty_page_for_writepage(wbc, page);
+ nfs_redirty_request(req);
+ pgio->pg_error = 0;
+ } else
+@@ -650,15 +651,8 @@ out_launder:
+ static int nfs_do_writepage(struct page *page, struct writeback_control *wbc,
+ struct nfs_pageio_descriptor *pgio)
+ {
+- int ret;
+-
+ nfs_pageio_cond_complete(pgio, page_index(page));
+- ret = nfs_page_async_flush(pgio, page);
+- if (ret == -EAGAIN) {
+- redirty_page_for_writepage(wbc, page);
+- ret = AOP_WRITEPAGE_ACTIVATE;
+- }
+- return ret;
++ return nfs_page_async_flush(page, wbc, pgio);
+ }
+
+ /*
+@@ -681,11 +675,7 @@ static int nfs_writepage_locked(struct page *page,
+ err = nfs_do_writepage(page, wbc, &pgio);
+ pgio.pg_error = 0;
+ nfs_pageio_complete(&pgio);
+- if (err < 0)
+- return err;
+- if (nfs_error_is_fatal(pgio.pg_error))
+- return pgio.pg_error;
+- return 0;
++ return err;
+ }
+
+ int nfs_writepage(struct page *page, struct writeback_control *wbc)
+@@ -737,19 +727,19 @@ int nfs_writepages(struct address_space *mapping, struct writeback_control *wbc)
+ priority = wb_priority(wbc);
+ }
+
+- nfs_pageio_init_write(&pgio, inode, priority, false,
+- &nfs_async_write_completion_ops);
+- pgio.pg_io_completion = ioc;
+- err = write_cache_pages(mapping, wbc, nfs_writepages_callback, &pgio);
+- pgio.pg_error = 0;
+- nfs_pageio_complete(&pgio);
++ do {
++ nfs_pageio_init_write(&pgio, inode, priority, false,
++ &nfs_async_write_completion_ops);
++ pgio.pg_io_completion = ioc;
++ err = write_cache_pages(mapping, wbc, nfs_writepages_callback,
++ &pgio);
++ pgio.pg_error = 0;
++ nfs_pageio_complete(&pgio);
++ } while (err < 0 && !nfs_error_is_fatal(err));
+ nfs_io_completion_put(ioc);
+
+ if (err < 0)
+ goto out_err;
+- err = pgio.pg_error;
+- if (nfs_error_is_fatal(err))
+- goto out_err;
+ return 0;
+ out_err:
+ return err;
+@@ -1444,7 +1434,7 @@ static void nfs_async_write_error(struct list_head *head, int error)
+ while (!list_empty(head)) {
+ req = nfs_list_entry(head->next);
+ nfs_list_remove_request(req);
+- if (nfs_error_is_fatal(error))
++ if (nfs_error_is_fatal_on_server(error))
+ nfs_write_error(req, error);
+ else
+ nfs_redirty_request(req);
+@@ -1719,6 +1709,10 @@ int nfs_initiate_commit(struct rpc_clnt *clnt, struct nfs_commit_data *data,
+ .flags = RPC_TASK_ASYNC | flags,
+ .priority = priority,
+ };
++
++ if (nfs_server_capable(data->inode, NFS_CAP_MOVEABLE))
++ task_setup_data.flags |= RPC_TASK_MOVEABLE;
++
+ /* Set up the initial task struct. */
+ nfs_ops->commit_setup(data, &msg, &task_setup_data.rpc_client);
+ trace_nfs_initiate_commit(data);
+diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
+index 0b3f12aa37ff5..7da88bdc0d6c3 100644
+--- a/fs/nfsd/nfscache.c
++++ b/fs/nfsd/nfscache.c
+@@ -206,7 +206,6 @@ void nfsd_reply_cache_shutdown(struct nfsd_net *nn)
+ struct svc_cacherep *rp;
+ unsigned int i;
+
+- nfsd_reply_cache_stats_destroy(nn);
+ unregister_shrinker(&nn->nfsd_reply_cache_shrinker);
+
+ for (i = 0; i < nn->drc_hashsize; i++) {
+@@ -217,6 +216,7 @@ void nfsd_reply_cache_shutdown(struct nfsd_net *nn)
+ rp, nn);
+ }
+ }
++ nfsd_reply_cache_stats_destroy(nn);
+
+ kvfree(nn->drc_hashtbl);
+ nn->drc_hashtbl = NULL;
+diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
+index a792e21c53099..16d8fc84713a4 100644
+--- a/fs/notify/fanotify/fanotify_user.c
++++ b/fs/notify/fanotify/fanotify_user.c
+@@ -264,7 +264,7 @@ static int create_fd(struct fsnotify_group *group, struct path *path,
+ * originally opened O_WRONLY.
+ */
+ new_file = dentry_open(path,
+- group->fanotify_data.f_flags | FMODE_NONOTIFY,
++ group->fanotify_data.f_flags | __FMODE_NONOTIFY,
+ current_cred());
+ if (IS_ERR(new_file)) {
+ /*
+@@ -1348,7 +1348,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
+ (!(fid_mode & FAN_REPORT_NAME) || !(fid_mode & FAN_REPORT_FID)))
+ return -EINVAL;
+
+- f_flags = O_RDWR | FMODE_NONOTIFY;
++ f_flags = O_RDWR | __FMODE_NONOTIFY;
+ if (flags & FAN_CLOEXEC)
+ f_flags |= O_CLOEXEC;
+ if (flags & FAN_NONBLOCK)
+diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
+index 57f0d5d9f934e..3451708fd035c 100644
+--- a/fs/notify/fdinfo.c
++++ b/fs/notify/fdinfo.c
+@@ -83,16 +83,9 @@ static void inotify_fdinfo(struct seq_file *m, struct fsnotify_mark *mark)
+ inode_mark = container_of(mark, struct inotify_inode_mark, fsn_mark);
+ inode = igrab(fsnotify_conn_inode(mark->connector));
+ if (inode) {
+- /*
+- * IN_ALL_EVENTS represents all of the mask bits
+- * that we expose to userspace. There is at
+- * least one bit (FS_EVENT_ON_CHILD) which is
+- * used only internally to the kernel.
+- */
+- u32 mask = mark->mask & IN_ALL_EVENTS;
+- seq_printf(m, "inotify wd:%x ino:%lx sdev:%x mask:%x ignored_mask:%x ",
++ seq_printf(m, "inotify wd:%x ino:%lx sdev:%x mask:%x ignored_mask:0 ",
+ inode_mark->wd, inode->i_ino, inode->i_sb->s_dev,
+- mask, mark->ignored_mask);
++ inotify_mark_user_mask(mark));
+ show_mark_fhandle(m, inode);
+ seq_putc(m, '\n');
+ iput(inode);
+diff --git a/fs/notify/inotify/inotify.h b/fs/notify/inotify/inotify.h
+index 2007e37119160..8f00151eb731f 100644
+--- a/fs/notify/inotify/inotify.h
++++ b/fs/notify/inotify/inotify.h
+@@ -22,6 +22,18 @@ static inline struct inotify_event_info *INOTIFY_E(struct fsnotify_event *fse)
+ return container_of(fse, struct inotify_event_info, fse);
+ }
+
++/*
++ * INOTIFY_USER_FLAGS represents all of the mask bits that we expose to
++ * userspace. There is at least one bit (FS_EVENT_ON_CHILD) which is
++ * used only internally to the kernel.
++ */
++#define INOTIFY_USER_MASK (IN_ALL_EVENTS | IN_ONESHOT | IN_EXCL_UNLINK)
++
++static inline __u32 inotify_mark_user_mask(struct fsnotify_mark *fsn_mark)
++{
++ return fsn_mark->mask & INOTIFY_USER_MASK;
++}
++
+ extern void inotify_ignored_and_remove_idr(struct fsnotify_mark *fsn_mark,
+ struct fsnotify_group *group);
+ extern int inotify_handle_inode_event(struct fsnotify_mark *inode_mark,
+diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
+index 54583f62dc440..3ef57db0ec9d6 100644
+--- a/fs/notify/inotify/inotify_user.c
++++ b/fs/notify/inotify/inotify_user.c
+@@ -110,7 +110,7 @@ static inline __u32 inotify_arg_to_mask(struct inode *inode, u32 arg)
+ mask |= FS_EVENT_ON_CHILD;
+
+ /* mask off the flags used to open the fd */
+- mask |= (arg & (IN_ALL_EVENTS | IN_ONESHOT | IN_EXCL_UNLINK));
++ mask |= (arg & INOTIFY_USER_MASK);
+
+ return mask;
+ }
+diff --git a/fs/notify/mark.c b/fs/notify/mark.c
+index 4853184f7ddef..c86982be2d505 100644
+--- a/fs/notify/mark.c
++++ b/fs/notify/mark.c
+@@ -452,7 +452,7 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
+ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
+ struct fsnotify_group *group)
+ {
+- mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
++ mutex_lock(&group->mark_mutex);
+ fsnotify_detach_mark(mark);
+ mutex_unlock(&group->mark_mutex);
+ fsnotify_free_mark(mark);
+@@ -770,7 +770,7 @@ void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
+ * move marks to free to to_free list in one go and then free marks in
+ * to_free list one by one.
+ */
+- mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
++ mutex_lock(&group->mark_mutex);
+ list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
+ if (mark->connector->type == obj_type)
+ list_move(&mark->g_list, &to_free);
+@@ -779,7 +779,7 @@ void fsnotify_clear_marks_by_group(struct fsnotify_group *group,
+
+ clear:
+ while (1) {
+- mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
++ mutex_lock(&group->mark_mutex);
+ if (list_empty(head)) {
+ mutex_unlock(&group->mark_mutex);
+ break;
+diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
+index 787b53b984ee1..3bae76930e68a 100644
+--- a/fs/ntfs3/file.c
++++ b/fs/ntfs3/file.c
+@@ -495,7 +495,7 @@ static int ntfs_truncate(struct inode *inode, loff_t new_size)
+
+ down_write(&ni->file.run_lock);
+ err = attr_set_size(ni, ATTR_DATA, NULL, 0, &ni->file.run, new_size,
+- &new_valid, true, NULL);
++ &new_valid, ni->mi.sbi->options->prealloc, NULL);
+ up_write(&ni->file.run_lock);
+
+ if (new_valid < ni->i_valid)
+@@ -662,7 +662,13 @@ static long ntfs_fallocate(struct file *file, int mode, loff_t vbo, loff_t len)
+ /*
+ * Normal file: Allocate clusters, do not change 'valid' size.
+ */
+- err = ntfs_set_size(inode, max(end, i_size));
++ loff_t new_size = max(end, i_size);
++
++ err = inode_newsize_ok(inode, new_size);
++ if (err)
++ goto out;
++
++ err = ntfs_set_size(inode, new_size);
+ if (err)
+ goto out;
+
+@@ -762,7 +768,7 @@ int ntfs3_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
+ }
+ inode_dio_wait(inode);
+
+- if (attr->ia_size < oldsize)
++ if (attr->ia_size <= oldsize)
+ err = ntfs_truncate(inode, attr->ia_size);
+ else if (attr->ia_size > oldsize)
+ err = ntfs_extend(inode, attr->ia_size, 0, NULL);
+diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c
+index 6f47a9c17f896..18842998c8fa3 100644
+--- a/fs/ntfs3/frecord.c
++++ b/fs/ntfs3/frecord.c
+@@ -1964,10 +1964,8 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
+
+ vcn += clen;
+
+- if (vbo + bytes >= end) {
++ if (vbo + bytes >= end)
+ bytes = end - vbo;
+- flags |= FIEMAP_EXTENT_LAST;
+- }
+
+ if (vbo + bytes <= valid) {
+ ;
+@@ -1977,6 +1975,9 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
+ /* vbo < valid && valid < vbo + bytes */
+ u64 dlen = valid - vbo;
+
++ if (vbo + dlen >= end)
++ flags |= FIEMAP_EXTENT_LAST;
++
+ err = fiemap_fill_next_extent(fieinfo, vbo, lbo, dlen,
+ flags);
+ if (err < 0)
+@@ -1995,6 +1996,9 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
+ flags |= FIEMAP_EXTENT_UNWRITTEN;
+ }
+
++ if (vbo + bytes >= end)
++ flags |= FIEMAP_EXTENT_LAST;
++
+ err = fiemap_fill_next_extent(fieinfo, vbo, lbo, bytes, flags);
+ if (err < 0)
+ break;
+diff --git a/fs/ntfs3/fslog.c b/fs/ntfs3/fslog.c
+index 06492f088d602..49b7df6167785 100644
+--- a/fs/ntfs3/fslog.c
++++ b/fs/ntfs3/fslog.c
+@@ -1185,8 +1185,6 @@ static int log_read_rst(struct ntfs_log *log, u32 l_size, bool first,
+ if (!r_page)
+ return -ENOMEM;
+
+- memset(info, 0, sizeof(struct restart_info));
+-
+ /* Determine which restart area we are looking for. */
+ if (first) {
+ vbo = 0;
+@@ -3791,10 +3789,11 @@ int log_replay(struct ntfs_inode *ni, bool *initialized)
+ if (!log)
+ return -ENOMEM;
+
++ memset(&rst_info, 0, sizeof(struct restart_info));
++
+ log->ni = ni;
+ log->l_size = l_size;
+ log->one_page_buf = kmalloc(page_size, GFP_NOFS);
+-
+ if (!log->one_page_buf) {
+ err = -ENOMEM;
+ goto out;
+@@ -3842,6 +3841,7 @@ int log_replay(struct ntfs_inode *ni, bool *initialized)
+ if (rst_info.vbo)
+ goto check_restart_area;
+
++ memset(&rst_info2, 0, sizeof(struct restart_info));
+ err = log_read_rst(log, l_size, false, &rst_info2);
+
+ /* Determine which restart area to use. */
+@@ -4085,8 +4085,10 @@ process_log:
+ if (client == LFS_NO_CLIENT_LE) {
+ /* Insert "NTFS" client LogFile. */
+ client = ra->client_idx[0];
+- if (client == LFS_NO_CLIENT_LE)
+- return -EINVAL;
++ if (client == LFS_NO_CLIENT_LE) {
++ err = -EINVAL;
++ goto out;
++ }
+
+ t16 = le16_to_cpu(client);
+ cr = ca + t16;
+diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
+index 9eab11e3b0341..38045264a61bd 100644
+--- a/fs/ntfs3/inode.c
++++ b/fs/ntfs3/inode.c
+@@ -757,6 +757,7 @@ static ssize_t ntfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+ loff_t vbo = iocb->ki_pos;
+ loff_t end;
+ int wr = iov_iter_rw(iter) & WRITE;
++ size_t iter_count = iov_iter_count(iter);
+ loff_t valid;
+ ssize_t ret;
+
+@@ -770,10 +771,13 @@ static ssize_t ntfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+ wr ? ntfs_get_block_direct_IO_W
+ : ntfs_get_block_direct_IO_R);
+
+- if (ret <= 0)
++ if (ret > 0)
++ end = vbo + ret;
++ else if (wr && ret == -EIOCBQUEUED)
++ end = vbo + iter_count;
++ else
+ goto out;
+
+- end = vbo + ret;
+ valid = ni->i_valid;
+ if (wr) {
+ if (end > valid && !S_ISBLK(inode->i_mode)) {
+@@ -1951,6 +1955,7 @@ const struct address_space_operations ntfs_aops = {
+ .direct_IO = ntfs_direct_IO,
+ .bmap = ntfs_bmap,
+ .dirty_folio = block_dirty_folio,
++ .invalidate_folio = block_invalidate_folio,
+ };
+
+ const struct address_space_operations ntfs_aops_cmpr = {
+diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
+index afd0ddad826ff..0968565ff2ca0 100644
+--- a/fs/ntfs3/xattr.c
++++ b/fs/ntfs3/xattr.c
+@@ -112,7 +112,7 @@ static int ntfs_read_ea(struct ntfs_inode *ni, struct EA_FULL **ea,
+ return -ENOMEM;
+
+ if (!size) {
+- ;
++ /* EA info persists, but xattr is empty. Looks like EA problem. */
+ } else if (attr_ea->non_res) {
+ struct runs_tree run;
+
+@@ -541,7 +541,7 @@ struct posix_acl *ntfs_get_acl(struct inode *inode, int type, bool rcu)
+
+ static noinline int ntfs_set_acl_ex(struct user_namespace *mnt_userns,
+ struct inode *inode, struct posix_acl *acl,
+- int type)
++ int type, bool init_acl)
+ {
+ const char *name;
+ size_t size, name_len;
+@@ -554,8 +554,9 @@ static noinline int ntfs_set_acl_ex(struct user_namespace *mnt_userns,
+
+ switch (type) {
+ case ACL_TYPE_ACCESS:
+- if (acl) {
+- umode_t mode = inode->i_mode;
++ /* Do not change i_mode if we are in init_acl */
++ if (acl && !init_acl) {
++ umode_t mode;
+
+ err = posix_acl_update_mode(mnt_userns, inode, &mode,
+ &acl);
+@@ -616,7 +617,68 @@ out:
+ int ntfs_set_acl(struct user_namespace *mnt_userns, struct inode *inode,
+ struct posix_acl *acl, int type)
+ {
+- return ntfs_set_acl_ex(mnt_userns, inode, acl, type);
++ return ntfs_set_acl_ex(mnt_userns, inode, acl, type, false);
++}
++
++static int ntfs_xattr_get_acl(struct user_namespace *mnt_userns,
++ struct inode *inode, int type, void *buffer,
++ size_t size)
++{
++ struct posix_acl *acl;
++ int err;
++
++ if (!(inode->i_sb->s_flags & SB_POSIXACL)) {
++ ntfs_inode_warn(inode, "add mount option \"acl\" to use acl");
++ return -EOPNOTSUPP;
++ }
++
++ acl = ntfs_get_acl(inode, type, false);
++ if (IS_ERR(acl))
++ return PTR_ERR(acl);
++
++ if (!acl)
++ return -ENODATA;
++
++ err = posix_acl_to_xattr(mnt_userns, acl, buffer, size);
++ posix_acl_release(acl);
++
++ return err;
++}
++
++static int ntfs_xattr_set_acl(struct user_namespace *mnt_userns,
++ struct inode *inode, int type, const void *value,
++ size_t size)
++{
++ struct posix_acl *acl;
++ int err;
++
++ if (!(inode->i_sb->s_flags & SB_POSIXACL)) {
++ ntfs_inode_warn(inode, "add mount option \"acl\" to use acl");
++ return -EOPNOTSUPP;
++ }
++
++ if (!inode_owner_or_capable(mnt_userns, inode))
++ return -EPERM;
++
++ if (!value) {
++ acl = NULL;
++ } else {
++ acl = posix_acl_from_xattr(mnt_userns, value, size);
++ if (IS_ERR(acl))
++ return PTR_ERR(acl);
++
++ if (acl) {
++ err = posix_acl_valid(mnt_userns, acl);
++ if (err)
++ goto release_and_out;
++ }
++ }
++
++ err = ntfs_set_acl(mnt_userns, inode, acl, type);
++
++release_and_out:
++ posix_acl_release(acl);
++ return err;
+ }
+
+ /*
+@@ -636,7 +698,7 @@ int ntfs_init_acl(struct user_namespace *mnt_userns, struct inode *inode,
+
+ if (default_acl) {
+ err = ntfs_set_acl_ex(mnt_userns, inode, default_acl,
+- ACL_TYPE_DEFAULT);
++ ACL_TYPE_DEFAULT, true);
+ posix_acl_release(default_acl);
+ } else {
+ inode->i_default_acl = NULL;
+@@ -647,7 +709,7 @@ int ntfs_init_acl(struct user_namespace *mnt_userns, struct inode *inode,
+ else {
+ if (!err)
+ err = ntfs_set_acl_ex(mnt_userns, inode, acl,
+- ACL_TYPE_ACCESS);
++ ACL_TYPE_ACCESS, true);
+ posix_acl_release(acl);
+ }
+
+@@ -785,6 +847,23 @@ static int ntfs_getxattr(const struct xattr_handler *handler, struct dentry *de,
+ goto out;
+ }
+
++#ifdef CONFIG_NTFS3_FS_POSIX_ACL
++ if ((name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 &&
++ !memcmp(name, XATTR_NAME_POSIX_ACL_ACCESS,
++ sizeof(XATTR_NAME_POSIX_ACL_ACCESS))) ||
++ (name_len == sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1 &&
++ !memcmp(name, XATTR_NAME_POSIX_ACL_DEFAULT,
++ sizeof(XATTR_NAME_POSIX_ACL_DEFAULT)))) {
++ /* TODO: init_user_ns? */
++ err = ntfs_xattr_get_acl(
++ &init_user_ns, inode,
++ name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1
++ ? ACL_TYPE_ACCESS
++ : ACL_TYPE_DEFAULT,
++ buffer, size);
++ goto out;
++ }
++#endif
+ /* Deal with NTFS extended attribute. */
+ err = ntfs_get_ea(inode, name, name_len, buffer, size, NULL);
+
+@@ -897,10 +976,29 @@ set_new_fa:
+ goto out;
+ }
+
++#ifdef CONFIG_NTFS3_FS_POSIX_ACL
++ if ((name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1 &&
++ !memcmp(name, XATTR_NAME_POSIX_ACL_ACCESS,
++ sizeof(XATTR_NAME_POSIX_ACL_ACCESS))) ||
++ (name_len == sizeof(XATTR_NAME_POSIX_ACL_DEFAULT) - 1 &&
++ !memcmp(name, XATTR_NAME_POSIX_ACL_DEFAULT,
++ sizeof(XATTR_NAME_POSIX_ACL_DEFAULT)))) {
++ err = ntfs_xattr_set_acl(
++ mnt_userns, inode,
++ name_len == sizeof(XATTR_NAME_POSIX_ACL_ACCESS) - 1
++ ? ACL_TYPE_ACCESS
++ : ACL_TYPE_DEFAULT,
++ value, size);
++ goto out;
++ }
++#endif
+ /* Deal with NTFS extended attribute. */
+ err = ntfs_set_ea(inode, name, name_len, value, size, flags);
+
+ out:
++ inode->i_ctime = current_time(inode);
++ mark_inode_dirty(inode);
++
+ return err;
+ }
+
+diff --git a/fs/ocfs2/dlmfs/userdlm.c b/fs/ocfs2/dlmfs/userdlm.c
+index 29f183a15798e..c1d67c806e1d3 100644
+--- a/fs/ocfs2/dlmfs/userdlm.c
++++ b/fs/ocfs2/dlmfs/userdlm.c
+@@ -433,6 +433,11 @@ again:
+ }
+
+ spin_lock(&lockres->l_lock);
++ if (lockres->l_flags & USER_LOCK_IN_TEARDOWN) {
++ spin_unlock(&lockres->l_lock);
++ status = -EAGAIN;
++ goto bail;
++ }
+
+ /* We only compare against the currently granted level
+ * here. If the lock is blocked waiting on a downconvert,
+@@ -595,7 +600,7 @@ int user_dlm_destroy_lock(struct user_lock_res *lockres)
+ spin_lock(&lockres->l_lock);
+ if (lockres->l_flags & USER_LOCK_IN_TEARDOWN) {
+ spin_unlock(&lockres->l_lock);
+- return 0;
++ goto bail;
+ }
+
+ lockres->l_flags |= USER_LOCK_IN_TEARDOWN;
+@@ -609,12 +614,17 @@ int user_dlm_destroy_lock(struct user_lock_res *lockres)
+ }
+
+ if (lockres->l_ro_holders || lockres->l_ex_holders) {
++ lockres->l_flags &= ~USER_LOCK_IN_TEARDOWN;
+ spin_unlock(&lockres->l_lock);
+ goto bail;
+ }
+
+ status = 0;
+ if (!(lockres->l_flags & USER_LOCK_ATTACHED)) {
++ /*
++ * lock is never requested, leave USER_LOCK_IN_TEARDOWN set
++ * to avoid new lock request coming in.
++ */
+ spin_unlock(&lockres->l_lock);
+ goto bail;
+ }
+@@ -625,6 +635,10 @@ int user_dlm_destroy_lock(struct user_lock_res *lockres)
+
+ status = ocfs2_dlm_unlock(conn, &lockres->l_lksb, DLM_LKF_VALBLK);
+ if (status) {
++ spin_lock(&lockres->l_lock);
++ lockres->l_flags &= ~USER_LOCK_IN_TEARDOWN;
++ lockres->l_flags &= ~USER_LOCK_BUSY;
++ spin_unlock(&lockres->l_lock);
+ user_log_dlm_error("ocfs2_dlm_unlock", status, lockres);
+ goto bail;
+ }
+diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
+index 5739dc3015698..bb116c39b5813 100644
+--- a/fs/ocfs2/inode.c
++++ b/fs/ocfs2/inode.c
+@@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
+ struct inode *inode = NULL;
+ struct super_block *sb = osb->sb;
+ struct ocfs2_find_inode_args args;
++ journal_t *journal = osb->journal->j_journal;
+
+ trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
+ sysfile_type);
+@@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
+ * part of the transaction - the inode could have been reclaimed and
+ * now it is reread from disk.
+ */
+- if (osb->journal) {
++ if (journal) {
+ transaction_t *transaction;
+ tid_t tid;
+ struct ocfs2_inode_info *oi = OCFS2_I(inode);
+- journal_t *journal = osb->journal->j_journal;
+
+ read_lock(&journal->j_state_lock);
+ if (journal->j_running_transaction)
+diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
+index 1887a27087097..fa87d89cf7542 100644
+--- a/fs/ocfs2/journal.c
++++ b/fs/ocfs2/journal.c
+@@ -810,22 +810,20 @@ void ocfs2_set_journal_params(struct ocfs2_super *osb)
+ write_unlock(&journal->j_state_lock);
+ }
+
+-int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
++/*
++ * alloc & initialize skeleton for journal structure.
++ * ocfs2_journal_init() will make fs have journal ability.
++ */
++int ocfs2_journal_alloc(struct ocfs2_super *osb)
+ {
+- int status = -1;
+- struct inode *inode = NULL; /* the journal inode */
+- journal_t *j_journal = NULL;
+- struct ocfs2_journal *journal = NULL;
+- struct ocfs2_dinode *di = NULL;
+- struct buffer_head *bh = NULL;
+- int inode_lock = 0;
++ int status = 0;
++ struct ocfs2_journal *journal;
+
+- /* initialize our journal structure */
+ journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
+ if (!journal) {
+ mlog(ML_ERROR, "unable to alloc journal\n");
+ status = -ENOMEM;
+- goto done;
++ goto bail;
+ }
+ osb->journal = journal;
+ journal->j_osb = osb;
+@@ -839,6 +837,21 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
+ INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
+ journal->j_state = OCFS2_JOURNAL_FREE;
+
++bail:
++ return status;
++}
++
++int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
++{
++ int status = -1;
++ struct inode *inode = NULL; /* the journal inode */
++ journal_t *j_journal = NULL;
++ struct ocfs2_journal *journal = osb->journal;
++ struct ocfs2_dinode *di = NULL;
++ struct buffer_head *bh = NULL;
++ int inode_lock = 0;
++
++ BUG_ON(!journal);
+ /* already have the inode for our journal */
+ inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
+ osb->slot_num);
+diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
+index 8dcb2f2cadbc5..969d0aa287187 100644
+--- a/fs/ocfs2/journal.h
++++ b/fs/ocfs2/journal.h
+@@ -154,6 +154,7 @@ int ocfs2_compute_replay_slots(struct ocfs2_super *osb);
+ * Journal Control:
+ * Initialize, Load, Shutdown, Wipe a journal.
+ *
++ * ocfs2_journal_alloc - Initialize skeleton for journal structure.
+ * ocfs2_journal_init - Initialize journal structures in the OSB.
+ * ocfs2_journal_load - Load the given journal off disk. Replay it if
+ * there's transactions still in there.
+@@ -167,6 +168,7 @@ int ocfs2_compute_replay_slots(struct ocfs2_super *osb);
+ * ocfs2_start_checkpoint - Kick the commit thread to do a checkpoint.
+ */
+ void ocfs2_set_journal_params(struct ocfs2_super *osb);
++int ocfs2_journal_alloc(struct ocfs2_super *osb);
+ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty);
+ void ocfs2_journal_shutdown(struct ocfs2_super *osb);
+ int ocfs2_journal_wipe(struct ocfs2_journal *journal,
+diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
+index 477cdf94122e7..311433c69a3f5 100644
+--- a/fs/ocfs2/super.c
++++ b/fs/ocfs2/super.c
+@@ -2195,6 +2195,15 @@ static int ocfs2_initialize_super(struct super_block *sb,
+
+ get_random_bytes(&osb->s_next_generation, sizeof(u32));
+
++ /*
++ * FIXME
++ * This should be done in ocfs2_journal_init(), but any inode
++ * writes back operation will cause the filesystem to crash.
++ */
++ status = ocfs2_journal_alloc(osb);
++ if (status < 0)
++ goto bail;
++
+ INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
+ init_llist_head(&osb->dquot_drop_list);
+
+@@ -2483,6 +2492,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
+
+ kfree(osb->osb_orphan_wipes);
+ kfree(osb->slot_recovery_generations);
++ /* FIXME
++ * This belongs in journal shutdown, but because we have to
++ * allocate osb->journal at the middle of ocfs2_initialize_super(),
++ * we free it here.
++ */
++ kfree(osb->journal);
+ kfree(osb->local_alloc_copy);
+ kfree(osb->uuid_str);
+ kfree(osb->vol_label);
+diff --git a/fs/proc/generic.c b/fs/proc/generic.c
+index f2132407e1335..587b91d9d998f 100644
+--- a/fs/proc/generic.c
++++ b/fs/proc/generic.c
+@@ -448,6 +448,9 @@ static struct proc_dir_entry *__proc_create(struct proc_dir_entry **parent,
+ proc_set_user(ent, (*parent)->uid, (*parent)->gid);
+
+ ent->proc_dops = &proc_misc_dentry_ops;
++ /* Revalidate everything under /proc/${pid}/net */
++ if ((*parent)->proc_dops == &proc_net_dentry_ops)
++ pde_force_lookup(ent);
+
+ out:
+ return ent;
+diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
+index e1cfeda397f3f..913e5acefbb66 100644
+--- a/fs/proc/proc_net.c
++++ b/fs/proc/proc_net.c
+@@ -376,6 +376,9 @@ static __net_init int proc_net_ns_init(struct net *net)
+
+ proc_set_user(netd, uid, gid);
+
++ /* Seed dentry revalidation for /proc/${pid}/net */
++ pde_force_lookup(netd);
++
+ err = -EEXIST;
+ net_statd = proc_net_mkdir(net, "stat", netd);
+ if (!net_statd)
+diff --git a/fs/seq_file.c b/fs/seq_file.c
+index 7ab8a58c29b61..9456a2032224a 100644
+--- a/fs/seq_file.c
++++ b/fs/seq_file.c
+@@ -931,6 +931,38 @@ struct list_head *seq_list_next(void *v, struct list_head *head, loff_t *ppos)
+ }
+ EXPORT_SYMBOL(seq_list_next);
+
++struct list_head *seq_list_start_rcu(struct list_head *head, loff_t pos)
++{
++ struct list_head *lh;
++
++ list_for_each_rcu(lh, head)
++ if (pos-- == 0)
++ return lh;
++
++ return NULL;
++}
++EXPORT_SYMBOL(seq_list_start_rcu);
++
++struct list_head *seq_list_start_head_rcu(struct list_head *head, loff_t pos)
++{
++ if (!pos)
++ return head;
++
++ return seq_list_start_rcu(head, pos - 1);
++}
++EXPORT_SYMBOL(seq_list_start_head_rcu);
++
++struct list_head *seq_list_next_rcu(void *v, struct list_head *head,
++ loff_t *ppos)
++{
++ struct list_head *lh;
++
++ lh = list_next_rcu((struct list_head *)v);
++ ++*ppos;
++ return lh == head ? NULL : lh;
++}
++EXPORT_SYMBOL(seq_list_next_rcu);
++
+ /**
+ * seq_hlist_start - start an iteration of a hlist
+ * @head: the head of the hlist
+diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
+index 144c495b99c48..d6b2aeb342119 100644
+--- a/include/drm/drm_edid.h
++++ b/include/drm/drm_edid.h
+@@ -121,7 +121,7 @@ struct detailed_data_monitor_range {
+ u8 supported_scalings;
+ u8 preferred_refresh;
+ } __attribute__((packed)) cvt;
+- } formula;
++ } __attribute__((packed)) formula;
+ } __attribute__((packed));
+
+ struct detailed_data_wpindex {
+@@ -154,7 +154,7 @@ struct detailed_non_pixel {
+ struct detailed_data_wpindex color;
+ struct std_timing timings[6];
+ struct cvt_timing cvt[4];
+- } data;
++ } __attribute__((packed)) data;
+ } __attribute__((packed));
+
+ #define EDID_DETAIL_EST_TIMINGS 0xf7
+@@ -172,7 +172,7 @@ struct detailed_timing {
+ union {
+ struct detailed_pixel_timing pixel_data;
+ struct detailed_non_pixel other_data;
+- } data;
++ } __attribute__((packed)) data;
+ } __attribute__((packed));
+
+ #define DRM_EDID_INPUT_SERRATION_VSYNC (1 << 0)
+diff --git a/include/drm/drm_format_helper.h b/include/drm/drm_format_helper.h
+index 0b0937c0b2f63..55145eca07828 100644
+--- a/include/drm/drm_format_helper.h
++++ b/include/drm/drm_format_helper.h
+@@ -43,8 +43,7 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for
+ const void *vmap, const struct drm_framebuffer *fb,
+ const struct drm_rect *rect);
+
+-void drm_fb_xrgb8888_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *src,
+- const struct drm_framebuffer *fb,
+- const struct drm_rect *clip);
++void drm_fb_xrgb8888_to_mono(void *dst, unsigned int dst_pitch, const void *src,
++ const struct drm_framebuffer *fb, const struct drm_rect *clip);
+
+ #endif /* __LINUX_DRM_FORMAT_HELPER_H */
+diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
+index 1973ef9bd40fc..4fa359c2c01fb 100644
+--- a/include/linux/blk_types.h
++++ b/include/linux/blk_types.h
+@@ -246,9 +246,8 @@ typedef unsigned int blk_qc_t;
+ struct bio {
+ struct bio *bi_next; /* request queue link */
+ struct block_device *bi_bdev;
+- unsigned int bi_opf; /* bottom bits req flags,
+- * top bits REQ_OP. Use
+- * accessors.
++ unsigned int bi_opf; /* bottom bits REQ_OP, top bits
++ * req_flags.
+ */
+ unsigned short bi_flags; /* BIO_* below */
+ unsigned short bi_ioprio;
+diff --git a/include/linux/bpf.h b/include/linux/bpf.h
+index bdb5298735ce9..83bd5598ec4df 100644
+--- a/include/linux/bpf.h
++++ b/include/linux/bpf.h
+@@ -672,7 +672,7 @@ struct btf_func_model {
+ #define BPF_TRAMP_F_RET_FENTRY_RET BIT(4)
+
+ /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
+- * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2
++ * bytes on x86.
+ */
+ #define BPF_MAX_TRAMP_PROGS 38
+
+@@ -1221,7 +1221,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
+ /* an array of programs to be executed under rcu_lock.
+ *
+ * Typical usage:
+- * ret = BPF_PROG_RUN_ARRAY(&bpf_prog_array, ctx, bpf_prog_run);
++ * ret = bpf_prog_run_array(rcu_dereference(&bpf_prog_array), ctx, bpf_prog_run);
+ *
+ * the structure returned by bpf_prog_array_alloc() should be populated
+ * with program pointers and the last pointer must be NULL.
+@@ -1315,83 +1315,22 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
+
+ typedef u32 (*bpf_prog_run_fn)(const struct bpf_prog *prog, const void *ctx);
+
+-static __always_inline int
+-BPF_PROG_RUN_ARRAY_CG_FLAGS(const struct bpf_prog_array __rcu *array_rcu,
+- const void *ctx, bpf_prog_run_fn run_prog,
+- int retval, u32 *ret_flags)
+-{
+- const struct bpf_prog_array_item *item;
+- const struct bpf_prog *prog;
+- const struct bpf_prog_array *array;
+- struct bpf_run_ctx *old_run_ctx;
+- struct bpf_cg_run_ctx run_ctx;
+- u32 func_ret;
+-
+- run_ctx.retval = retval;
+- migrate_disable();
+- rcu_read_lock();
+- array = rcu_dereference(array_rcu);
+- item = &array->items[0];
+- old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
+- while ((prog = READ_ONCE(item->prog))) {
+- run_ctx.prog_item = item;
+- func_ret = run_prog(prog, ctx);
+- if (!(func_ret & 1) && !IS_ERR_VALUE((long)run_ctx.retval))
+- run_ctx.retval = -EPERM;
+- *(ret_flags) |= (func_ret >> 1);
+- item++;
+- }
+- bpf_reset_run_ctx(old_run_ctx);
+- rcu_read_unlock();
+- migrate_enable();
+- return run_ctx.retval;
+-}
+-
+-static __always_inline int
+-BPF_PROG_RUN_ARRAY_CG(const struct bpf_prog_array __rcu *array_rcu,
+- const void *ctx, bpf_prog_run_fn run_prog,
+- int retval)
+-{
+- const struct bpf_prog_array_item *item;
+- const struct bpf_prog *prog;
+- const struct bpf_prog_array *array;
+- struct bpf_run_ctx *old_run_ctx;
+- struct bpf_cg_run_ctx run_ctx;
+-
+- run_ctx.retval = retval;
+- migrate_disable();
+- rcu_read_lock();
+- array = rcu_dereference(array_rcu);
+- item = &array->items[0];
+- old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
+- while ((prog = READ_ONCE(item->prog))) {
+- run_ctx.prog_item = item;
+- if (!run_prog(prog, ctx) && !IS_ERR_VALUE((long)run_ctx.retval))
+- run_ctx.retval = -EPERM;
+- item++;
+- }
+- bpf_reset_run_ctx(old_run_ctx);
+- rcu_read_unlock();
+- migrate_enable();
+- return run_ctx.retval;
+-}
+-
+ static __always_inline u32
+-BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
++bpf_prog_run_array(const struct bpf_prog_array *array,
+ const void *ctx, bpf_prog_run_fn run_prog)
+ {
+ const struct bpf_prog_array_item *item;
+ const struct bpf_prog *prog;
+- const struct bpf_prog_array *array;
+ struct bpf_run_ctx *old_run_ctx;
+ struct bpf_trace_run_ctx run_ctx;
+ u32 ret = 1;
+
+- migrate_disable();
+- rcu_read_lock();
+- array = rcu_dereference(array_rcu);
++ RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "no rcu lock held");
++
+ if (unlikely(!array))
+- goto out;
++ return ret;
++
++ migrate_disable();
+ old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
+ item = &array->items[0];
+ while ((prog = READ_ONCE(item->prog))) {
+@@ -1400,50 +1339,10 @@ BPF_PROG_RUN_ARRAY(const struct bpf_prog_array __rcu *array_rcu,
+ item++;
+ }
+ bpf_reset_run_ctx(old_run_ctx);
+-out:
+- rcu_read_unlock();
+ migrate_enable();
+ return ret;
+ }
+
+-/* To be used by __cgroup_bpf_run_filter_skb for EGRESS BPF progs
+- * so BPF programs can request cwr for TCP packets.
+- *
+- * Current cgroup skb programs can only return 0 or 1 (0 to drop the
+- * packet. This macro changes the behavior so the low order bit
+- * indicates whether the packet should be dropped (0) or not (1)
+- * and the next bit is a congestion notification bit. This could be
+- * used by TCP to call tcp_enter_cwr()
+- *
+- * Hence, new allowed return values of CGROUP EGRESS BPF programs are:
+- * 0: drop packet
+- * 1: keep packet
+- * 2: drop packet and cn
+- * 3: keep packet and cn
+- *
+- * This macro then converts it to one of the NET_XMIT or an error
+- * code that is then interpreted as drop packet (and no cn):
+- * 0: NET_XMIT_SUCCESS skb should be transmitted
+- * 1: NET_XMIT_DROP skb should be dropped and cn
+- * 2: NET_XMIT_CN skb should be transmitted and cn
+- * 3: -err skb should be dropped
+- */
+-#define BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY(array, ctx, func) \
+- ({ \
+- u32 _flags = 0; \
+- bool _cn; \
+- u32 _ret; \
+- _ret = BPF_PROG_RUN_ARRAY_CG_FLAGS(array, ctx, func, 0, &_flags); \
+- _cn = _flags & BPF_RET_SET_CN; \
+- if (_ret && !IS_ERR_VALUE((long)_ret)) \
+- _ret = -EFAULT; \
+- if (!_ret) \
+- _ret = (_cn ? NET_XMIT_CN : NET_XMIT_SUCCESS); \
+- else \
+- _ret = (_cn ? NET_XMIT_DROP : _ret); \
+- _ret; \
+- })
+-
+ #ifdef CONFIG_BPF_SYSCALL
+ DECLARE_PER_CPU(int, bpf_prog_active);
+ extern struct mutex bpf_stats_enabled_mutex;
+@@ -2085,6 +1984,8 @@ void bpf_offload_dev_netdev_unregister(struct bpf_offload_dev *offdev,
+ struct net_device *netdev);
+ bool bpf_offload_dev_match(struct bpf_prog *prog, struct net_device *netdev);
+
++void unpriv_ebpf_notify(int new_state);
++
+ #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL)
+ int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr);
+
+diff --git a/include/linux/compat.h b/include/linux/compat.h
+index 1c758b0e03598..01fddf72a81f0 100644
+--- a/include/linux/compat.h
++++ b/include/linux/compat.h
+@@ -235,6 +235,7 @@ typedef struct compat_siginfo {
+ struct {
+ compat_ulong_t _data;
+ u32 _type;
++ u32 _flags;
+ } _perf;
+ };
+ } _sigfault;
+diff --git a/include/linux/efi.h b/include/linux/efi.h
+index ccd4d3f91c98c..cc6d2be2ffd51 100644
+--- a/include/linux/efi.h
++++ b/include/linux/efi.h
+@@ -213,6 +213,8 @@ struct capsule_info {
+ size_t page_bytes_remain;
+ };
+
++int efi_capsule_setup_info(struct capsule_info *cap_info, void *kbuff,
++ size_t hdr_bytes);
+ int __efi_capsule_setup_info(struct capsule_info *cap_info);
+
+ /*
+diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
+index 3a532ba66f6c2..7defac04f9a38 100644
+--- a/include/linux/fwnode.h
++++ b/include/linux/fwnode.h
+@@ -148,12 +148,12 @@ struct fwnode_operations {
+ int (*add_links)(struct fwnode_handle *fwnode);
+ };
+
+-#define fwnode_has_op(fwnode, op) \
+- ((fwnode) && (fwnode)->ops && (fwnode)->ops->op)
++#define fwnode_has_op(fwnode, op) \
++ (!IS_ERR_OR_NULL(fwnode) && (fwnode)->ops && (fwnode)->ops->op)
++
+ #define fwnode_call_int_op(fwnode, op, ...) \
+- (fwnode ? (fwnode_has_op(fwnode, op) ? \
+- (fwnode)->ops->op(fwnode, ## __VA_ARGS__) : -ENXIO) : \
+- -EINVAL)
++ (fwnode_has_op(fwnode, op) ? \
++ (fwnode)->ops->op(fwnode, ## __VA_ARGS__) : (IS_ERR_OR_NULL(fwnode) ? -EINVAL : -ENXIO))
+
+ #define fwnode_call_bool_op(fwnode, op, ...) \
+ (fwnode_has_op(fwnode, op) ? \
+diff --git a/include/linux/goldfish.h b/include/linux/goldfish.h
+index 12be1601fd845..bcc17f95b9066 100644
+--- a/include/linux/goldfish.h
++++ b/include/linux/goldfish.h
+@@ -8,14 +8,21 @@
+
+ /* Helpers for Goldfish virtual platform */
+
++#ifndef gf_ioread32
++#define gf_ioread32 ioread32
++#endif
++#ifndef gf_iowrite32
++#define gf_iowrite32 iowrite32
++#endif
++
+ static inline void gf_write_ptr(const void *ptr, void __iomem *portl,
+ void __iomem *porth)
+ {
+ const unsigned long addr = (unsigned long)ptr;
+
+- __raw_writel(lower_32_bits(addr), portl);
++ gf_iowrite32(lower_32_bits(addr), portl);
+ #ifdef CONFIG_64BIT
+- __raw_writel(upper_32_bits(addr), porth);
++ gf_iowrite32(upper_32_bits(addr), porth);
+ #endif
+ }
+
+@@ -23,9 +30,9 @@ static inline void gf_write_dma_addr(const dma_addr_t addr,
+ void __iomem *portl,
+ void __iomem *porth)
+ {
+- __raw_writel(lower_32_bits(addr), portl);
++ gf_iowrite32(lower_32_bits(addr), portl);
+ #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
+- __raw_writel(upper_32_bits(addr), porth);
++ gf_iowrite32(upper_32_bits(addr), porth);
+ #endif
+ }
+
+diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h
+index 874aabd270c9b..48d03eb4e5d8a 100644
+--- a/include/linux/gpio/driver.h
++++ b/include/linux/gpio/driver.h
+@@ -501,6 +501,18 @@ struct gpio_chip {
+ */
+ int (*of_xlate)(struct gpio_chip *gc,
+ const struct of_phandle_args *gpiospec, u32 *flags);
++
++ /**
++ * @of_gpio_ranges_fallback:
++ *
++ * Optional hook for the case that no gpio-ranges property is defined
++ * within the device tree node "np" (usually DT before introduction
++ * of gpio-ranges). So this callback is helpful to provide the
++ * necessary backward compatibility for the pin ranges.
++ */
++ int (*of_gpio_ranges_fallback)(struct gpio_chip *gc,
++ struct device_node *np);
++
+ #endif /* CONFIG_OF_GPIO */
+ };
+
+diff --git a/include/linux/ipmi_smi.h b/include/linux/ipmi_smi.h
+index 9277d21c2690c..5d69820d8b027 100644
+--- a/include/linux/ipmi_smi.h
++++ b/include/linux/ipmi_smi.h
+@@ -125,6 +125,12 @@ struct ipmi_smi_msg {
+ void (*done)(struct ipmi_smi_msg *msg);
+ };
+
++#define INIT_IPMI_SMI_MSG(done_handler) \
++{ \
++ .done = done_handler, \
++ .type = IPMI_SMI_MSG_TYPE_NORMAL \
++}
++
+ struct ipmi_smi_handlers {
+ struct module *owner;
+
+diff --git a/include/linux/kexec.h b/include/linux/kexec.h
+index 58d1b58a971e3..fcd5035209f19 100644
+--- a/include/linux/kexec.h
++++ b/include/linux/kexec.h
+@@ -193,14 +193,6 @@ void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name);
+ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
+ unsigned long buf_len);
+ void *arch_kexec_kernel_image_load(struct kimage *image);
+-int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
+- Elf_Shdr *section,
+- const Elf_Shdr *relsec,
+- const Elf_Shdr *symtab);
+-int arch_kexec_apply_relocations(struct purgatory_info *pi,
+- Elf_Shdr *section,
+- const Elf_Shdr *relsec,
+- const Elf_Shdr *symtab);
+ int arch_kimage_file_post_load_cleanup(struct kimage *image);
+ #ifdef CONFIG_KEXEC_SIG
+ int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
+@@ -229,6 +221,44 @@ extern int crash_exclude_mem_range(struct crash_mem *mem,
+ unsigned long long mend);
+ extern int crash_prepare_elf64_headers(struct crash_mem *mem, int kernel_map,
+ void **addr, unsigned long *sz);
++
++#ifndef arch_kexec_apply_relocations_add
++/*
++ * arch_kexec_apply_relocations_add - apply relocations of type RELA
++ * @pi: Purgatory to be relocated.
++ * @section: Section relocations applying to.
++ * @relsec: Section containing RELAs.
++ * @symtab: Corresponding symtab.
++ *
++ * Return: 0 on success, negative errno on error.
++ */
++static inline int
++arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
++ const Elf_Shdr *relsec, const Elf_Shdr *symtab)
++{
++ pr_err("RELA relocation unsupported.\n");
++ return -ENOEXEC;
++}
++#endif
++
++#ifndef arch_kexec_apply_relocations
++/*
++ * arch_kexec_apply_relocations - apply relocations of type REL
++ * @pi: Purgatory to be relocated.
++ * @section: Section relocations applying to.
++ * @relsec: Section containing RELs.
++ * @symtab: Corresponding symtab.
++ *
++ * Return: 0 on success, negative errno on error.
++ */
++static inline int
++arch_kexec_apply_relocations(struct purgatory_info *pi, Elf_Shdr *section,
++ const Elf_Shdr *relsec, const Elf_Shdr *symtab)
++{
++ pr_err("REL relocation unsupported.\n");
++ return -ENOEXEC;
++}
++#endif
+ #endif /* CONFIG_KEXEC_FILE */
+
+ #ifdef CONFIG_KEXEC_ELF
+diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
+index 157168769fc26..55041d2f884de 100644
+--- a/include/linux/kprobes.h
++++ b/include/linux/kprobes.h
+@@ -424,7 +424,7 @@ void unregister_kretprobe(struct kretprobe *rp);
+ int register_kretprobes(struct kretprobe **rps, int num);
+ void unregister_kretprobes(struct kretprobe **rps, int num);
+
+-#ifdef CONFIG_KRETPROBE_ON_RETHOOK
++#if defined(CONFIG_KRETPROBE_ON_RETHOOK) || !defined(CONFIG_KRETPROBES)
+ #define kprobe_flush_task(tk) do {} while (0)
+ #else
+ void kprobe_flush_task(struct task_struct *tk);
+diff --git a/include/linux/linkage.h b/include/linux/linkage.h
+index acb1ad2356f1b..1feab6136b5b5 100644
+--- a/include/linux/linkage.h
++++ b/include/linux/linkage.h
+@@ -171,12 +171,9 @@
+
+ /* SYM_ALIAS -- use only if you have to */
+ #ifndef SYM_ALIAS
+-#define SYM_ALIAS(alias, name, sym_type, linkage) \
+- linkage(alias) ASM_NL \
+- .set alias, name ASM_NL \
+- .type alias sym_type ASM_NL \
+- .set .L__sym_size_##alias, .L__sym_size_##name ASM_NL \
+- .size alias, .L__sym_size_##alias
++#define SYM_ALIAS(alias, name, linkage) \
++ linkage(alias) ASM_NL \
++ .set alias, name ASM_NL
+ #endif
+
+ /* === code annotations === */
+@@ -261,7 +258,7 @@
+ */
+ #ifndef SYM_FUNC_ALIAS
+ #define SYM_FUNC_ALIAS(alias, name) \
+- SYM_ALIAS(alias, name, SYM_T_FUNC, SYM_L_GLOBAL)
++ SYM_ALIAS(alias, name, SYM_L_GLOBAL)
+ #endif
+
+ /*
+@@ -269,7 +266,7 @@
+ */
+ #ifndef SYM_FUNC_ALIAS_LOCAL
+ #define SYM_FUNC_ALIAS_LOCAL(alias, name) \
+- SYM_ALIAS(alias, name, SYM_T_FUNC, SYM_L_LOCAL)
++ SYM_ALIAS(alias, name, SYM_L_LOCAL)
+ #endif
+
+ /*
+@@ -277,7 +274,7 @@
+ */
+ #ifndef SYM_FUNC_ALIAS_WEAK
+ #define SYM_FUNC_ALIAS_WEAK(alias, name) \
+- SYM_ALIAS(alias, name, SYM_T_FUNC, SYM_L_WEAK)
++ SYM_ALIAS(alias, name, SYM_L_WEAK)
+ #endif
+
+ /* SYM_CODE_START -- use for non-C (special) functions */
+diff --git a/include/linux/list.h b/include/linux/list.h
+index dd6c2041d09c1..0df13cb03028b 100644
+--- a/include/linux/list.h
++++ b/include/linux/list.h
+@@ -35,7 +35,7 @@
+ static inline void INIT_LIST_HEAD(struct list_head *list)
+ {
+ WRITE_ONCE(list->next, list);
+- list->prev = list;
++ WRITE_ONCE(list->prev, list);
+ }
+
+ #ifdef CONFIG_DEBUG_LIST
+@@ -306,7 +306,7 @@ static inline int list_empty(const struct list_head *head)
+ static inline void list_del_init_careful(struct list_head *entry)
+ {
+ __list_del_entry(entry);
+- entry->prev = entry;
++ WRITE_ONCE(entry->prev, entry);
+ smp_store_release(&entry->next, entry);
+ }
+
+@@ -326,7 +326,7 @@ static inline void list_del_init_careful(struct list_head *entry)
+ static inline int list_empty_careful(const struct list_head *head)
+ {
+ struct list_head *next = smp_load_acquire(&head->next);
+- return list_is_head(next, head) && (next == head->prev);
++ return list_is_head(next, head) && (next == READ_ONCE(head->prev));
+ }
+
+ /**
+@@ -579,6 +579,16 @@ static inline void list_splice_tail_init(struct list_head *list,
+ #define list_for_each(pos, head) \
+ for (pos = (head)->next; !list_is_head(pos, (head)); pos = pos->next)
+
++/**
++ * list_for_each_rcu - Iterate over a list in an RCU-safe fashion
++ * @pos: the &struct list_head to use as a loop cursor.
++ * @head: the head for your list.
++ */
++#define list_for_each_rcu(pos, head) \
++ for (pos = rcu_dereference((head)->next); \
++ !list_is_head(pos, (head)); \
++ pos = rcu_dereference(pos->next))
++
+ /**
+ * list_for_each_continue - continue iteration over a list
+ * @pos: the &struct list_head to use as a loop cursor.
+diff --git a/include/linux/mailbox_controller.h b/include/linux/mailbox_controller.h
+index 36d6ce673503c..6fee33cb52f58 100644
+--- a/include/linux/mailbox_controller.h
++++ b/include/linux/mailbox_controller.h
+@@ -83,6 +83,7 @@ struct mbox_controller {
+ const struct of_phandle_args *sp);
+ /* Internal to API */
+ struct hrtimer poll_hrt;
++ spinlock_t poll_hrt_lock;
+ struct list_head node;
+ };
+
+diff --git a/include/linux/module.h b/include/linux/module.h
+index 1e135fd5c076a..d5e9066990ca0 100644
+--- a/include/linux/module.h
++++ b/include/linux/module.h
+@@ -290,8 +290,7 @@ extern typeof(name) __mod_##type##__##name##_device_table \
+ * files require multiple MODULE_FIRMWARE() specifiers */
+ #define MODULE_FIRMWARE(_firmware) MODULE_INFO(firmware, _firmware)
+
+-#define _MODULE_IMPORT_NS(ns) MODULE_INFO(import_ns, #ns)
+-#define MODULE_IMPORT_NS(ns) _MODULE_IMPORT_NS(ns)
++#define MODULE_IMPORT_NS(ns) MODULE_INFO(import_ns, __stringify(ns))
+
+ struct notifier_block;
+
+diff --git a/include/linux/mtd/cfi.h b/include/linux/mtd/cfi.h
+index fd1ecb8211060..d88bb56c18e2e 100644
+--- a/include/linux/mtd/cfi.h
++++ b/include/linux/mtd/cfi.h
+@@ -286,6 +286,7 @@ struct cfi_private {
+ map_word sector_erase_cmd;
+ unsigned long chipshift; /* Because they're of the same type */
+ const char *im_name; /* inter_module name for cmdset_setup */
++ unsigned long quirks;
+ struct flchip chips[]; /* per-chip data structure for each chip */
+ };
+
+diff --git a/include/linux/namei.h b/include/linux/namei.h
+index e89329bb3134e..caeb08a98536c 100644
+--- a/include/linux/namei.h
++++ b/include/linux/namei.h
+@@ -69,6 +69,12 @@ extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
+ extern struct dentry *lookup_one_len_unlocked(const char *, struct dentry *, int);
+ extern struct dentry *lookup_positive_unlocked(const char *, struct dentry *, int);
+ struct dentry *lookup_one(struct user_namespace *, const char *, struct dentry *, int);
++struct dentry *lookup_one_unlocked(struct user_namespace *mnt_userns,
++ const char *name, struct dentry *base,
++ int len);
++struct dentry *lookup_one_positive_unlocked(struct user_namespace *mnt_userns,
++ const char *name,
++ struct dentry *base, int len);
+
+ extern int follow_down_one(struct path *);
+ extern int follow_down(struct path *);
+diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
+index 157d2bd6b2417..ea2f7e6b1b0b5 100644
+--- a/include/linux/nfs_fs_sb.h
++++ b/include/linux/nfs_fs_sb.h
+@@ -287,4 +287,5 @@ struct nfs_server {
+ #define NFS_CAP_XATTR (1U << 28)
+ #define NFS_CAP_READ_PLUS (1U << 29)
+ #define NFS_CAP_FS_LOCATIONS (1U << 30)
++#define NFS_CAP_MOVEABLE (1U << 31)
+ #endif
+diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
+index 2863e5a69c6ab..20e97329fe463 100644
+--- a/include/linux/nfs_xdr.h
++++ b/include/linux/nfs_xdr.h
+@@ -1212,7 +1212,7 @@ struct nfs4_fs_location {
+
+ #define NFS4_FS_LOCATIONS_MAXENTRIES 10
+ struct nfs4_fs_locations {
+- struct nfs_fattr fattr;
++ struct nfs_fattr *fattr;
+ const struct nfs_server *server;
+ struct nfs4_pathname fs_path;
+ int nlocations;
+diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
+index 567c3ddba2c42..c6199dbe25913 100644
+--- a/include/linux/nodemask.h
++++ b/include/linux/nodemask.h
+@@ -375,14 +375,13 @@ static inline void __nodes_fold(nodemask_t *dstp, const nodemask_t *origp,
+ }
+
+ #if MAX_NUMNODES > 1
+-#define for_each_node_mask(node, mask) \
+- for ((node) = first_node(mask); \
+- (node) < MAX_NUMNODES; \
+- (node) = next_node((node), (mask)))
++#define for_each_node_mask(node, mask) \
++ for ((node) = first_node(mask); \
++ (node >= 0) && (node) < MAX_NUMNODES; \
++ (node) = next_node((node), (mask)))
+ #else /* MAX_NUMNODES == 1 */
+-#define for_each_node_mask(node, mask) \
+- if (!nodes_empty(mask)) \
+- for ((node) = 0; (node) < 1; (node)++)
++#define for_each_node_mask(node, mask) \
++ for ((node) = 0; (node) < 1 && !nodes_empty(mask); (node)++)
+ #endif /* MAX_NUMNODES */
+
+ /*
+diff --git a/include/linux/platform_data/cros_ec_proto.h b/include/linux/platform_data/cros_ec_proto.h
+index df3c78c92ca2f..16931569adce1 100644
+--- a/include/linux/platform_data/cros_ec_proto.h
++++ b/include/linux/platform_data/cros_ec_proto.h
+@@ -216,6 +216,9 @@ int cros_ec_prepare_tx(struct cros_ec_device *ec_dev,
+ int cros_ec_check_result(struct cros_ec_device *ec_dev,
+ struct cros_ec_command *msg);
+
++int cros_ec_cmd_xfer(struct cros_ec_device *ec_dev,
++ struct cros_ec_command *msg);
++
+ int cros_ec_cmd_xfer_status(struct cros_ec_device *ec_dev,
+ struct cros_ec_command *msg);
+
+diff --git a/include/linux/ptp_classify.h b/include/linux/ptp_classify.h
+index fefa7790dc460..2b6ea36ad1626 100644
+--- a/include/linux/ptp_classify.h
++++ b/include/linux/ptp_classify.h
+@@ -43,6 +43,9 @@
+ #define OFF_PTP_SOURCE_UUID 22 /* PTPv1 only */
+ #define OFF_PTP_SEQUENCE_ID 30
+
++/* PTP header flag fields */
++#define PTP_FLAG_TWOSTEP BIT(1)
++
+ /* Below defines should actually be removed at some point in time. */
+ #define IP6_HLEN 40
+ #define UDP_HLEN 8
+diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
+index 15b3d176b6b4d..c952c5ba8fab6 100644
+--- a/include/linux/ptrace.h
++++ b/include/linux/ptrace.h
+@@ -30,7 +30,6 @@ extern int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
+
+ #define PT_SEIZED 0x00010000 /* SEIZE used, enable new behavior */
+ #define PT_PTRACED 0x00000001
+-#define PT_DTRACE 0x00000002 /* delayed trace (used on m68k, i386) */
+
+ #define PT_OPT_FLAG_SHIFT 3
+ /* PT_TRACE_* event enable flags */
+@@ -47,12 +46,6 @@ extern int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
+ #define PT_EXITKILL (PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
+ #define PT_SUSPEND_SECCOMP (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
+
+-/* single stepping state bits (used on ARM and PA-RISC) */
+-#define PT_SINGLESTEP_BIT 31
+-#define PT_SINGLESTEP (1<<PT_SINGLESTEP_BIT)
+-#define PT_BLOCKSTEP_BIT 30
+-#define PT_BLOCKSTEP (1<<PT_BLOCKSTEP_BIT)
+-
+ extern long arch_ptrace(struct task_struct *child, long request,
+ unsigned long addr, unsigned long data);
+ extern int ptrace_readdata(struct task_struct *tsk, unsigned long src, char __user *dst, int len);
+diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
+index 3c8b34876744b..bab7cc56b13a8 100644
+--- a/include/linux/sched/signal.h
++++ b/include/linux/sched/signal.h
+@@ -320,7 +320,7 @@ int send_sig_mceerr(int code, void __user *, short, struct task_struct *);
+
+ int force_sig_bnderr(void __user *addr, void __user *lower, void __user *upper);
+ int force_sig_pkuerr(void __user *addr, u32 pkey);
+-int force_sig_perf(void __user *addr, u32 type, u64 sig_data);
++int send_sig_perf(void __user *addr, u32 type, u64 sig_data);
+
+ int force_sig_ptrace_errno_trap(int errno, void __user *addr);
+ int force_sig_fault_trapno(int sig, int code, void __user *addr, int trapno);
+diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
+index 719c9a6cac8d8..4492266935dd7 100644
+--- a/include/linux/sched/task.h
++++ b/include/linux/sched/task.h
+@@ -32,6 +32,7 @@ struct kernel_clone_args {
+ size_t set_tid_size;
+ int cgroup;
+ int io_thread;
++ int kthread;
+ struct cgroup *cgrp;
+ struct css_set *cset;
+ };
+@@ -89,6 +90,7 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
+ struct task_struct *fork_idle(int);
+ struct mm_struct *copy_init_mm(void);
+ extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
++extern pid_t user_mode_thread(int (*fn)(void *), void *arg, unsigned long flags);
+ extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
+ int kernel_wait(pid_t pid, int *stat);
+
+diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
+index 60820ab511d22..bd023dd38ae6f 100644
+--- a/include/linux/seq_file.h
++++ b/include/linux/seq_file.h
+@@ -277,6 +277,10 @@ extern struct list_head *seq_list_start_head(struct list_head *head,
+ extern struct list_head *seq_list_next(void *v, struct list_head *head,
+ loff_t *ppos);
+
++extern struct list_head *seq_list_start_rcu(struct list_head *head, loff_t pos);
++extern struct list_head *seq_list_start_head_rcu(struct list_head *head, loff_t pos);
++extern struct list_head *seq_list_next_rcu(void *v, struct list_head *head, loff_t *ppos);
++
+ /*
+ * Helpers for iteration over hlist_head-s in seq_files
+ */
+diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
+index f36be5166c197..369769ce7399d 100644
+--- a/include/linux/set_memory.h
++++ b/include/linux/set_memory.h
+@@ -42,14 +42,14 @@ static inline bool can_set_direct_map(void)
+ #endif
+ #endif /* CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+
+-#ifndef set_mce_nospec
+-static inline int set_mce_nospec(unsigned long pfn, bool unmap)
++#ifdef CONFIG_X86_64
++int set_mce_nospec(unsigned long pfn);
++int clear_mce_nospec(unsigned long pfn);
++#else
++static inline int set_mce_nospec(unsigned long pfn)
+ {
+ return 0;
+ }
+-#endif
+-
+-#ifndef clear_mce_nospec
+ static inline int clear_mce_nospec(unsigned long pfn)
+ {
+ return 0;
+diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
+index 548a028f2dabb..2c1fc9212cf28 100644
+--- a/include/linux/usb/hcd.h
++++ b/include/linux/usb/hcd.h
+@@ -124,6 +124,7 @@ struct usb_hcd {
+ #define HCD_FLAG_RH_RUNNING 5 /* root hub is running? */
+ #define HCD_FLAG_DEAD 6 /* controller has died? */
+ #define HCD_FLAG_INTF_AUTHORIZED 7 /* authorize interfaces? */
++#define HCD_FLAG_DEFER_RH_REGISTER 8 /* Defer roothub registration */
+
+ /* The flags can be tested using these macros; they are likely to
+ * be slightly faster than test_bit().
+@@ -134,6 +135,7 @@ struct usb_hcd {
+ #define HCD_WAKEUP_PENDING(hcd) ((hcd)->flags & (1U << HCD_FLAG_WAKEUP_PENDING))
+ #define HCD_RH_RUNNING(hcd) ((hcd)->flags & (1U << HCD_FLAG_RH_RUNNING))
+ #define HCD_DEAD(hcd) ((hcd)->flags & (1U << HCD_FLAG_DEAD))
++#define HCD_DEFER_RH_REGISTER(hcd) ((hcd)->flags & (1U << HCD_FLAG_DEFER_RH_REGISTER))
+
+ /*
+ * Specifies if interfaces are authorized by default
+diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
+index 69ef31cea5822..62a9bb022aedf 100644
+--- a/include/net/bluetooth/hci.h
++++ b/include/net/bluetooth/hci.h
+@@ -265,6 +265,15 @@ enum {
+ * runtime suspend, because event filtering takes place there.
+ */
+ HCI_QUIRK_BROKEN_FILTER_CLEAR_ALL,
++
++ /*
++ * When this quirk is set, disables the use of
++ * HCI_OP_ENHANCED_SETUP_SYNC_CONN command to setup SCO connections.
++ *
++ * This quirk can be set before hci_register_dev is called or
++ * during the hdev->setup vendor callback.
++ */
++ HCI_QUIRK_BROKEN_ENHANCED_SETUP_SYNC_CONN,
+ };
+
+ /* HCI device flags */
+diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
+index 62d7b81b1cb74..5a52a2018b56a 100644
+--- a/include/net/bluetooth/hci_core.h
++++ b/include/net/bluetooth/hci_core.h
+@@ -1495,8 +1495,12 @@ void hci_conn_del_sysfs(struct hci_conn *conn);
+ #define privacy_mode_capable(dev) (use_ll_privacy(dev) && \
+ (hdev->commands[39] & 0x04))
+
+-/* Use enhanced synchronous connection if command is supported */
+-#define enhanced_sco_capable(dev) ((dev)->commands[29] & 0x08)
++/* Use enhanced synchronous connection if command is supported and its quirk
++ * has not been set.
++ */
++#define enhanced_sync_conn_capable(dev) \
++ (((dev)->commands[29] & 0x08) && \
++ !test_bit(HCI_QUIRK_BROKEN_ENHANCED_SETUP_SYNC_CONN, &(dev)->quirks))
+
+ /* Use ext scanning if set ext scan param and ext scan enable is supported */
+ #define use_ext_scan(dev) (((dev)->commands[37] & 0x20) && \
+diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
+index 4cfdef6ca4f64..c8490729b4aea 100644
+--- a/include/net/if_inet6.h
++++ b/include/net/if_inet6.h
+@@ -64,6 +64,14 @@ struct inet6_ifaddr {
+
+ struct hlist_node addr_lst;
+ struct list_head if_list;
++ /*
++ * Used to safely traverse idev->addr_list in process context
++ * if the idev->lock needed to protect idev->addr_list cannot be held.
++ * In that case, add the items to this list temporarily and iterate
++ * without holding idev->lock.
++ * See addrconf_ifdown and dev_forward_change.
++ */
++ struct list_head if_list_aux;
+
+ struct list_head tmp_list;
+ struct inet6_ifaddr *ifpub;
+diff --git a/include/net/ip.h b/include/net/ip.h
+index 0161137914cf9..26fffda78cca4 100644
+--- a/include/net/ip.h
++++ b/include/net/ip.h
+@@ -94,7 +94,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
+
+ ipcm->sockc.mark = inet->sk.sk_mark;
+ ipcm->sockc.tsflags = inet->sk.sk_tsflags;
+- ipcm->oif = inet->sk.sk_bound_dev_if;
++ ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
+ ipcm->addr = inet->inet_saddr;
+ }
+
+diff --git a/include/net/sock.h b/include/net/sock.h
+index c4b91fc19b9ca..3c4fb8f03fd99 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -2866,13 +2866,14 @@ static inline void sk_pacing_shift_update(struct sock *sk, int val)
+ */
+ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif)
+ {
++ int bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
+ int mdif;
+
+- if (!sk->sk_bound_dev_if || sk->sk_bound_dev_if == dif)
++ if (!bound_dev_if || bound_dev_if == dif)
+ return true;
+
+ mdif = l3mdev_master_ifindex_by_index(sock_net(sk), dif);
+- if (mdif && mdif == sk->sk_bound_dev_if)
++ if (mdif && mdif == bound_dev_if)
+ return true;
+
+ return false;
+diff --git a/include/scsi/libfcoe.h b/include/scsi/libfcoe.h
+index fac8e89aed81d..310e0dbffda99 100644
+--- a/include/scsi/libfcoe.h
++++ b/include/scsi/libfcoe.h
+@@ -249,7 +249,8 @@ int fcoe_ctlr_recv_flogi(struct fcoe_ctlr *, struct fc_lport *,
+ struct fc_frame *);
+
+ /* libfcoe funcs */
+-u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], unsigned int, unsigned int);
++u64 fcoe_wwn_from_mac(unsigned char mac[ETH_ALEN], unsigned int scheme,
++ unsigned int port);
+ int fcoe_libfc_config(struct fc_lport *, struct fcoe_ctlr *,
+ const struct libfc_function_template *, int init_fcp);
+ u32 fcoe_fc_crc(struct fc_frame *fp);
+diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
+index d0a24779c52dc..c0703cd20a993 100644
+--- a/include/scsi/libiscsi.h
++++ b/include/scsi/libiscsi.h
+@@ -54,9 +54,9 @@ enum {
+ #define ISID_SIZE 6
+
+ /* Connection flags */
+-#define ISCSI_CONN_FLAG_SUSPEND_TX BIT(0)
+-#define ISCSI_CONN_FLAG_SUSPEND_RX BIT(1)
+-#define ISCSI_CONN_FLAG_BOUND BIT(2)
++#define ISCSI_CONN_FLAG_SUSPEND_TX 0
++#define ISCSI_CONN_FLAG_SUSPEND_RX 1
++#define ISCSI_CONN_FLAG_BOUND 2
+
+ #define ISCSI_ITT_MASK 0x1fff
+ #define ISCSI_TOTAL_CMDS_MAX 4096
+diff --git a/include/sound/cs35l41.h b/include/sound/cs35l41.h
+index bf7f9a9aeba04..9341130257ea6 100644
+--- a/include/sound/cs35l41.h
++++ b/include/sound/cs35l41.h
+@@ -536,7 +536,6 @@
+
+ #define CS35L41_MAX_CACHE_REG 36
+ #define CS35L41_OTP_SIZE_WORDS 32
+-#define CS35L41_NUM_OTP_ELEM 100
+
+ #define CS35L41_VALID_PDATA 0x80000000
+ #define CS35L41_NUM_SUPPLIES 2
+diff --git a/include/sound/jack.h b/include/sound/jack.h
+index 1181f536557eb..1ed90e2109e9b 100644
+--- a/include/sound/jack.h
++++ b/include/sound/jack.h
+@@ -62,6 +62,7 @@ struct snd_jack {
+ const char *id;
+ #ifdef CONFIG_SND_JACK_INPUT_DEV
+ struct input_dev *input_dev;
++ struct mutex input_dev_lock;
+ int registered;
+ int type;
+ char name[100];
+diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h
+index 4a3ab0ed6e062..1c714336b8635 100644
+--- a/include/trace/events/rxrpc.h
++++ b/include/trace/events/rxrpc.h
+@@ -1509,7 +1509,7 @@ TRACE_EVENT(rxrpc_call_reset,
+ __entry->call_serial = call->rx_serial;
+ __entry->conn_serial = call->conn->hi_serial;
+ __entry->tx_seq = call->tx_hard_ack;
+- __entry->rx_seq = call->ackr_seen;
++ __entry->rx_seq = call->rx_hard_ack;
+ ),
+
+ TP_printk("c=%08x %08x:%08x r=%08x/%08x tx=%08x rx=%08x",
+diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
+index de136dbd623ac..e7d28dc549dab 100644
+--- a/include/trace/events/vmscan.h
++++ b/include/trace/events/vmscan.h
+@@ -297,7 +297,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
+ __field(unsigned long, nr_scanned)
+ __field(unsigned long, nr_skipped)
+ __field(unsigned long, nr_taken)
+- __field(isolate_mode_t, isolate_mode)
++ __field(unsigned int, isolate_mode)
+ __field(int, lru)
+ ),
+
+@@ -308,7 +308,7 @@ TRACE_EVENT(mm_vmscan_lru_isolate,
+ __entry->nr_scanned = nr_scanned;
+ __entry->nr_skipped = nr_skipped;
+ __entry->nr_taken = nr_taken;
+- __entry->isolate_mode = isolate_mode;
++ __entry->isolate_mode = (__force unsigned int)isolate_mode;
+ __entry->lru = lru;
+ ),
+
+diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
+index 3ba180f550d7c..ffbe4cec9f32d 100644
+--- a/include/uapi/asm-generic/siginfo.h
++++ b/include/uapi/asm-generic/siginfo.h
+@@ -99,6 +99,7 @@ union __sifields {
+ struct {
+ unsigned long _data;
+ __u32 _type;
++ __u32 _flags;
+ } _perf;
+ };
+ } _sigfault;
+@@ -164,6 +165,7 @@ typedef struct siginfo {
+ #define si_pkey _sifields._sigfault._addr_pkey._pkey
+ #define si_perf_data _sifields._sigfault._perf._data
+ #define si_perf_type _sifields._sigfault._perf._type
++#define si_perf_flags _sifields._sigfault._perf._flags
+ #define si_band _sifields._sigpoll._band
+ #define si_fd _sifields._sigpoll._fd
+ #define si_call_addr _sifields._sigsys._call_addr
+@@ -270,6 +272,11 @@ typedef struct siginfo {
+ * that are of the form: ((PTRACE_EVENT_XXX << 8) | SIGTRAP)
+ */
+
++/*
++ * Flags for si_perf_flags if SIGTRAP si_code is TRAP_PERF.
++ */
++#define TRAP_PERF_FLAG_ASYNC (1u << 0)
++
+ /*
+ * SIGCHLD si_codes
+ */
+diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h
+index 11157fae8a8e7..688bcdaeed536 100644
+--- a/include/uapi/linux/android/binder.h
++++ b/include/uapi/linux/android/binder.h
+@@ -289,7 +289,7 @@ struct binder_transaction_data {
+ /* General information about the transaction. */
+ __u32 flags;
+ __kernel_pid_t sender_pid;
+- __kernel_uid_t sender_euid;
++ __kernel_uid32_t sender_euid;
+ binder_size_t data_size; /* number of bytes of data */
+ binder_size_t offsets_size; /* number of bytes of offsets */
+
+diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
+index b3d952067f59c..21c8d58283c9e 100644
+--- a/include/uapi/linux/landlock.h
++++ b/include/uapi/linux/landlock.h
+@@ -33,7 +33,9 @@ struct landlock_ruleset_attr {
+ * - %LANDLOCK_CREATE_RULESET_VERSION: Get the highest supported Landlock ABI
+ * version.
+ */
++/* clang-format off */
+ #define LANDLOCK_CREATE_RULESET_VERSION (1U << 0)
++/* clang-format on */
+
+ /**
+ * enum landlock_rule_type - Landlock rule type
+@@ -60,8 +62,9 @@ struct landlock_path_beneath_attr {
+ */
+ __u64 allowed_access;
+ /**
+- * @parent_fd: File descriptor, open with ``O_PATH``, which identifies
+- * the parent directory of a file hierarchy, or just a file.
++ * @parent_fd: File descriptor, preferably opened with ``O_PATH``,
++ * which identifies the parent directory of a file hierarchy, or just a
++ * file.
+ */
+ __s32 parent_fd;
+ /*
+@@ -120,6 +123,7 @@ struct landlock_path_beneath_attr {
+ * :manpage:`access(2)`.
+ * Future Landlock evolutions will enable to restrict them.
+ */
++/* clang-format off */
+ #define LANDLOCK_ACCESS_FS_EXECUTE (1ULL << 0)
+ #define LANDLOCK_ACCESS_FS_WRITE_FILE (1ULL << 1)
+ #define LANDLOCK_ACCESS_FS_READ_FILE (1ULL << 2)
+@@ -133,5 +137,6 @@ struct landlock_path_beneath_attr {
+ #define LANDLOCK_ACCESS_FS_MAKE_FIFO (1ULL << 10)
+ #define LANDLOCK_ACCESS_FS_MAKE_BLOCK (1ULL << 11)
+ #define LANDLOCK_ACCESS_FS_MAKE_SYM (1ULL << 12)
++/* clang-format on */
+
+ #endif /* _UAPI_LINUX_LANDLOCK_H */
+diff --git a/include/uapi/linux/lirc.h b/include/uapi/linux/lirc.h
+index 23b0f2c8ba81e..8d7ca7c6af42e 100644
+--- a/include/uapi/linux/lirc.h
++++ b/include/uapi/linux/lirc.h
+@@ -84,6 +84,13 @@
+ #define LIRC_CAN_SEND(x) ((x)&LIRC_CAN_SEND_MASK)
+ #define LIRC_CAN_REC(x) ((x)&LIRC_CAN_REC_MASK)
+
++/*
++ * Unused features. These features were never implemented, in tree or
++ * out of tree. These definitions are here so not to break the lircd build.
++ */
++#define LIRC_CAN_SET_REC_FILTER 0
++#define LIRC_CAN_NOTIFY_DECODE 0
++
+ /*** IOCTL commands for lirc driver ***/
+
+ #define LIRC_GET_FEATURES _IOR('i', 0x00000000, __u32)
+diff --git a/include/uapi/linux/types.h b/include/uapi/linux/types.h
+index c4dc597f3dcf4..308433be33c26 100644
+--- a/include/uapi/linux/types.h
++++ b/include/uapi/linux/types.h
+@@ -26,6 +26,9 @@
+ #define __bitwise
+ #endif
+
++/* The kernel doesn't use this legacy form, but user space does */
++#define __bitwise__ __bitwise
++
+ typedef __u16 __bitwise __le16;
+ typedef __u16 __bitwise __be16;
+ typedef __u32 __bitwise __le32;
+diff --git a/init/Kconfig b/init/Kconfig
+index ddcbefe535e9e..b19e2eeaae803 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -77,6 +77,11 @@ config CC_HAS_ASM_GOTO_OUTPUT
+ depends on CC_HAS_ASM_GOTO
+ def_bool $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
+
++config CC_HAS_ASM_GOTO_TIED_OUTPUT
++ depends on CC_HAS_ASM_GOTO_OUTPUT
++ # Detect buggy gcc and clang, fixed in gcc-11 clang-14.
++ def_bool $(success,echo 'int foo(int *x) { asm goto (".long (%l[bar]) - .\n": "+m"(*x) ::: bar); return *x; bar: return 0; }' | $CC -x c - -c -o /dev/null)
++
+ config TOOLS_SUPPORT_RELR
+ def_bool $(success,env "CC=$(CC)" "LD=$(LD)" "NM=$(NM)" "OBJCOPY=$(OBJCOPY)" $(srctree)/scripts/tools-support-relr.sh)
+
+diff --git a/init/main.c b/init/main.c
+index f057c49f1d9d8..80bd217dd35ea 100644
+--- a/init/main.c
++++ b/init/main.c
+@@ -688,7 +688,7 @@ noinline void __ref rest_init(void)
+ * the init task will end up wanting to create kthreads, which, if
+ * we schedule it before we create kthreadd, will OOPS.
+ */
+- pid = kernel_thread(kernel_init, NULL, CLONE_FS);
++ pid = user_mode_thread(kernel_init, NULL, CLONE_FS);
+ /*
+ * Pin init on the boot CPU. Task migration is not properly working
+ * until sched_init_smp() has been run. It will set the allowed
+diff --git a/ipc/mqueue.c b/ipc/mqueue.c
+index 7c08eb3c258d2..54cb6264f8cff 100644
+--- a/ipc/mqueue.c
++++ b/ipc/mqueue.c
+@@ -45,6 +45,7 @@
+
+ struct mqueue_fs_context {
+ struct ipc_namespace *ipc_ns;
++ bool newns; /* Set if newly created ipc namespace */
+ };
+
+ #define MQUEUE_MAGIC 0x19800202
+@@ -427,6 +428,14 @@ static int mqueue_get_tree(struct fs_context *fc)
+ {
+ struct mqueue_fs_context *ctx = fc->fs_private;
+
++ /*
++ * With a newly created ipc namespace, we don't need to do a search
++ * for an ipc namespace match, but we still need to set s_fs_info.
++ */
++ if (ctx->newns) {
++ fc->s_fs_info = ctx->ipc_ns;
++ return get_tree_nodev(fc, mqueue_fill_super);
++ }
+ return get_tree_keyed(fc, mqueue_fill_super, ctx->ipc_ns);
+ }
+
+@@ -454,6 +463,10 @@ static int mqueue_init_fs_context(struct fs_context *fc)
+ return 0;
+ }
+
++/*
++ * mq_init_ns() is currently the only caller of mq_create_mount().
++ * So the ns parameter is always a newly created ipc namespace.
++ */
+ static struct vfsmount *mq_create_mount(struct ipc_namespace *ns)
+ {
+ struct mqueue_fs_context *ctx;
+@@ -465,6 +478,7 @@ static struct vfsmount *mq_create_mount(struct ipc_namespace *ns)
+ return ERR_CAST(fc);
+
+ ctx = fc->fs_private;
++ ctx->newns = true;
+ put_ipc_ns(ctx->ipc_ns);
+ ctx->ipc_ns = get_ipc_ns(ns);
+ put_user_ns(fc->user_ns);
+diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
+index 128028efda647..0cb6211fcb58d 100644
+--- a/kernel/bpf/cgroup.c
++++ b/kernel/bpf/cgroup.c
+@@ -22,6 +22,72 @@
+ DEFINE_STATIC_KEY_ARRAY_FALSE(cgroup_bpf_enabled_key, MAX_CGROUP_BPF_ATTACH_TYPE);
+ EXPORT_SYMBOL(cgroup_bpf_enabled_key);
+
++/* __always_inline is necessary to prevent indirect call through run_prog
++ * function pointer.
++ */
++static __always_inline int
++bpf_prog_run_array_cg_flags(const struct cgroup_bpf *cgrp,
++ enum cgroup_bpf_attach_type atype,
++ const void *ctx, bpf_prog_run_fn run_prog,
++ int retval, u32 *ret_flags)
++{
++ const struct bpf_prog_array_item *item;
++ const struct bpf_prog *prog;
++ const struct bpf_prog_array *array;
++ struct bpf_run_ctx *old_run_ctx;
++ struct bpf_cg_run_ctx run_ctx;
++ u32 func_ret;
++
++ run_ctx.retval = retval;
++ migrate_disable();
++ rcu_read_lock();
++ array = rcu_dereference(cgrp->effective[atype]);
++ item = &array->items[0];
++ old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
++ while ((prog = READ_ONCE(item->prog))) {
++ run_ctx.prog_item = item;
++ func_ret = run_prog(prog, ctx);
++ if (!(func_ret & 1) && !IS_ERR_VALUE((long)run_ctx.retval))
++ run_ctx.retval = -EPERM;
++ *(ret_flags) |= (func_ret >> 1);
++ item++;
++ }
++ bpf_reset_run_ctx(old_run_ctx);
++ rcu_read_unlock();
++ migrate_enable();
++ return run_ctx.retval;
++}
++
++static __always_inline int
++bpf_prog_run_array_cg(const struct cgroup_bpf *cgrp,
++ enum cgroup_bpf_attach_type atype,
++ const void *ctx, bpf_prog_run_fn run_prog,
++ int retval)
++{
++ const struct bpf_prog_array_item *item;
++ const struct bpf_prog *prog;
++ const struct bpf_prog_array *array;
++ struct bpf_run_ctx *old_run_ctx;
++ struct bpf_cg_run_ctx run_ctx;
++
++ run_ctx.retval = retval;
++ migrate_disable();
++ rcu_read_lock();
++ array = rcu_dereference(cgrp->effective[atype]);
++ item = &array->items[0];
++ old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
++ while ((prog = READ_ONCE(item->prog))) {
++ run_ctx.prog_item = item;
++ if (!run_prog(prog, ctx) && !IS_ERR_VALUE((long)run_ctx.retval))
++ run_ctx.retval = -EPERM;
++ item++;
++ }
++ bpf_reset_run_ctx(old_run_ctx);
++ rcu_read_unlock();
++ migrate_enable();
++ return run_ctx.retval;
++}
++
+ void cgroup_bpf_offline(struct cgroup *cgrp)
+ {
+ cgroup_get(cgrp);
+@@ -1075,11 +1141,38 @@ int __cgroup_bpf_run_filter_skb(struct sock *sk,
+ bpf_compute_and_save_data_end(skb, &saved_data_end);
+
+ if (atype == CGROUP_INET_EGRESS) {
+- ret = BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY(
+- cgrp->bpf.effective[atype], skb, __bpf_prog_run_save_cb);
++ u32 flags = 0;
++ bool cn;
++
++ ret = bpf_prog_run_array_cg_flags(
++ &cgrp->bpf, atype,
++ skb, __bpf_prog_run_save_cb, 0, &flags);
++
++ /* Return values of CGROUP EGRESS BPF programs are:
++ * 0: drop packet
++ * 1: keep packet
++ * 2: drop packet and cn
++ * 3: keep packet and cn
++ *
++ * The returned value is then converted to one of the NET_XMIT
++ * or an error code that is then interpreted as drop packet
++ * (and no cn):
++ * 0: NET_XMIT_SUCCESS skb should be transmitted
++ * 1: NET_XMIT_DROP skb should be dropped and cn
++ * 2: NET_XMIT_CN skb should be transmitted and cn
++ * 3: -err skb should be dropped
++ */
++
++ cn = flags & BPF_RET_SET_CN;
++ if (ret && !IS_ERR_VALUE((long)ret))
++ ret = -EFAULT;
++ if (!ret)
++ ret = (cn ? NET_XMIT_CN : NET_XMIT_SUCCESS);
++ else
++ ret = (cn ? NET_XMIT_DROP : ret);
+ } else {
+- ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], skb,
+- __bpf_prog_run_save_cb, 0);
++ ret = bpf_prog_run_array_cg(&cgrp->bpf, atype,
++ skb, __bpf_prog_run_save_cb, 0);
+ if (ret && !IS_ERR_VALUE((long)ret))
+ ret = -EFAULT;
+ }
+@@ -1109,8 +1202,7 @@ int __cgroup_bpf_run_filter_sk(struct sock *sk,
+ {
+ struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
+
+- return BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], sk,
+- bpf_prog_run, 0);
++ return bpf_prog_run_array_cg(&cgrp->bpf, atype, sk, bpf_prog_run, 0);
+ }
+ EXPORT_SYMBOL(__cgroup_bpf_run_filter_sk);
+
+@@ -1155,8 +1247,8 @@ int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
+ }
+
+ cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
+- return BPF_PROG_RUN_ARRAY_CG_FLAGS(cgrp->bpf.effective[atype], &ctx,
+- bpf_prog_run, 0, flags);
++ return bpf_prog_run_array_cg_flags(&cgrp->bpf, atype,
++ &ctx, bpf_prog_run, 0, flags);
+ }
+ EXPORT_SYMBOL(__cgroup_bpf_run_filter_sock_addr);
+
+@@ -1182,8 +1274,8 @@ int __cgroup_bpf_run_filter_sock_ops(struct sock *sk,
+ {
+ struct cgroup *cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
+
+- return BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], sock_ops,
+- bpf_prog_run, 0);
++ return bpf_prog_run_array_cg(&cgrp->bpf, atype, sock_ops, bpf_prog_run,
++ 0);
+ }
+ EXPORT_SYMBOL(__cgroup_bpf_run_filter_sock_ops);
+
+@@ -1200,8 +1292,7 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
+
+ rcu_read_lock();
+ cgrp = task_dfl_cgroup(current);
+- ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], &ctx,
+- bpf_prog_run, 0);
++ ret = bpf_prog_run_array_cg(&cgrp->bpf, atype, &ctx, bpf_prog_run, 0);
+ rcu_read_unlock();
+
+ return ret;
+@@ -1366,8 +1457,7 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
+
+ rcu_read_lock();
+ cgrp = task_dfl_cgroup(current);
+- ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[atype], &ctx,
+- bpf_prog_run, 0);
++ ret = bpf_prog_run_array_cg(&cgrp->bpf, atype, &ctx, bpf_prog_run, 0);
+ rcu_read_unlock();
+
+ kfree(ctx.cur_val);
+@@ -1459,7 +1549,7 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
+ }
+
+ lock_sock(sk);
+- ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[CGROUP_SETSOCKOPT],
++ ret = bpf_prog_run_array_cg(&cgrp->bpf, CGROUP_SETSOCKOPT,
+ &ctx, bpf_prog_run, 0);
+ release_sock(sk);
+
+@@ -1559,7 +1649,7 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
+ }
+
+ lock_sock(sk);
+- ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[CGROUP_GETSOCKOPT],
++ ret = bpf_prog_run_array_cg(&cgrp->bpf, CGROUP_GETSOCKOPT,
+ &ctx, bpf_prog_run, retval);
+ release_sock(sk);
+
+@@ -1608,7 +1698,7 @@ int __cgroup_bpf_run_filter_getsockopt_kern(struct sock *sk, int level,
+ * be called if that data shouldn't be "exported".
+ */
+
+- ret = BPF_PROG_RUN_ARRAY_CG(cgrp->bpf.effective[CGROUP_GETSOCKOPT],
++ ret = bpf_prog_run_array_cg(&cgrp->bpf, CGROUP_GETSOCKOPT,
+ &ctx, bpf_prog_run, retval);
+ if (ret < 0)
+ return ret;
+diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
+index f8ff598596b85..ac740630c79c2 100644
+--- a/kernel/dma/debug.c
++++ b/kernel/dma/debug.c
+@@ -448,7 +448,7 @@ void debug_dma_dump_mappings(struct device *dev)
+ * other hand, consumes a single dma_debug_entry, but inserts 'nents'
+ * entries into the tree.
+ */
+-static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
++static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC);
+ static DEFINE_SPINLOCK(radix_lock);
+ #define ACTIVE_CACHELINE_MAX_OVERLAP ((1 << RADIX_TREE_MAX_TAGS) - 1)
+ #define CACHELINE_PER_PAGE_SHIFT (PAGE_SHIFT - L1_CACHE_SHIFT)
+diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
+index 9743c6ccce1a9..e978f36e6be86 100644
+--- a/kernel/dma/direct.c
++++ b/kernel/dma/direct.c
+@@ -79,7 +79,7 @@ static int dma_set_decrypted(struct device *dev, void *vaddr, size_t size)
+ {
+ if (!force_dma_unencrypted(dev))
+ return 0;
+- return set_memory_decrypted((unsigned long)vaddr, 1 << get_order(size));
++ return set_memory_decrypted((unsigned long)vaddr, PFN_UP(size));
+ }
+
+ static int dma_set_encrypted(struct device *dev, void *vaddr, size_t size)
+@@ -88,7 +88,7 @@ static int dma_set_encrypted(struct device *dev, void *vaddr, size_t size)
+
+ if (!force_dma_unencrypted(dev))
+ return 0;
+- ret = set_memory_encrypted((unsigned long)vaddr, 1 << get_order(size));
++ ret = set_memory_encrypted((unsigned long)vaddr, PFN_UP(size));
+ if (ret)
+ pr_warn_ratelimited("leaking DMA memory that can't be re-encrypted\n");
+ return ret;
+@@ -115,7 +115,7 @@ static struct page *dma_direct_alloc_swiotlb(struct device *dev, size_t size)
+ }
+
+ static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
+- gfp_t gfp)
++ gfp_t gfp, bool allow_highmem)
+ {
+ int node = dev_to_node(dev);
+ struct page *page = NULL;
+@@ -129,9 +129,12 @@ static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
+ gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
+ &phys_limit);
+ page = dma_alloc_contiguous(dev, size, gfp);
+- if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
+- dma_free_contiguous(dev, page, size);
+- page = NULL;
++ if (page) {
++ if (!dma_coherent_ok(dev, page_to_phys(page), size) ||
++ (!allow_highmem && PageHighMem(page))) {
++ dma_free_contiguous(dev, page, size);
++ page = NULL;
++ }
+ }
+ again:
+ if (!page)
+@@ -189,7 +192,7 @@ static void *dma_direct_alloc_no_mapping(struct device *dev, size_t size,
+ {
+ struct page *page;
+
+- page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
++ page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO, true);
+ if (!page)
+ return NULL;
+
+@@ -262,7 +265,7 @@ void *dma_direct_alloc(struct device *dev, size_t size,
+ return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
+
+ /* we always manually zero the memory once we are done */
+- page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
++ page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO, true);
+ if (!page)
+ return NULL;
+
+@@ -370,19 +373,9 @@ struct page *dma_direct_alloc_pages(struct device *dev, size_t size,
+ if (force_dma_unencrypted(dev) && dma_direct_use_pool(dev, gfp))
+ return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
+
+- page = __dma_direct_alloc_pages(dev, size, gfp);
++ page = __dma_direct_alloc_pages(dev, size, gfp, false);
+ if (!page)
+ return NULL;
+- if (PageHighMem(page)) {
+- /*
+- * Depending on the cma= arguments and per-arch setup
+- * dma_alloc_contiguous could return highmem pages.
+- * Without remapping there is no way to return them here,
+- * so log an error and fail.
+- */
+- dev_info(dev, "Rejecting highmem page from CMA.\n");
+- goto out_free_pages;
+- }
+
+ ret = page_address(page);
+ if (dma_set_decrypted(dev, ret, size))
+diff --git a/kernel/events/core.c b/kernel/events/core.c
+index 7f1e4c5897e75..950b25c3f2103 100644
+--- a/kernel/events/core.c
++++ b/kernel/events/core.c
+@@ -6428,8 +6428,8 @@ static void perf_sigtrap(struct perf_event *event)
+ if (current->flags & PF_EXITING)
+ return;
+
+- force_sig_perf((void __user *)event->pending_addr,
+- event->attr.type, event->attr.sig_data);
++ send_sig_perf((void __user *)event->pending_addr,
++ event->attr.type, event->attr.sig_data);
+ }
+
+ static void perf_pending_event_disable(struct perf_event *event)
+diff --git a/kernel/fork.c b/kernel/fork.c
+index 35a3beff140b6..0d8abfb9e0f4a 100644
+--- a/kernel/fork.c
++++ b/kernel/fork.c
+@@ -2157,7 +2157,7 @@ static __latent_entropy struct task_struct *copy_process(
+ p->io_context = NULL;
+ audit_set_context(p, NULL);
+ cgroup_fork(p);
+- if (p->flags & PF_KTHREAD) {
++ if (args->kthread) {
+ if (!set_kthread_struct(p))
+ goto bad_fork_cleanup_delayacct;
+ }
+@@ -2548,7 +2548,8 @@ struct task_struct * __init fork_idle(int cpu)
+ {
+ struct task_struct *task;
+ struct kernel_clone_args args = {
+- .flags = CLONE_VM,
++ .flags = CLONE_VM,
++ .kthread = 1,
+ };
+
+ task = copy_process(&init_struct_pid, 0, cpu_to_node(cpu), &args);
+@@ -2679,6 +2680,23 @@ pid_t kernel_clone(struct kernel_clone_args *args)
+ * Create a kernel thread.
+ */
+ pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
++{
++ struct kernel_clone_args args = {
++ .flags = ((lower_32_bits(flags) | CLONE_VM |
++ CLONE_UNTRACED) & ~CSIGNAL),
++ .exit_signal = (lower_32_bits(flags) & CSIGNAL),
++ .stack = (unsigned long)fn,
++ .stack_size = (unsigned long)arg,
++ .kthread = 1,
++ };
++
++ return kernel_clone(&args);
++}
++
++/*
++ * Create a user mode thread.
++ */
++pid_t user_mode_thread(int (*fn)(void *), void *arg, unsigned long flags)
+ {
+ struct kernel_clone_args args = {
+ .flags = ((lower_32_bits(flags) | CLONE_VM |
+diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
+index 8347fc158d2b9..c108a2a887546 100644
+--- a/kernel/kexec_file.c
++++ b/kernel/kexec_file.c
+@@ -108,40 +108,6 @@ int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
+ }
+ #endif
+
+-/*
+- * arch_kexec_apply_relocations_add - apply relocations of type RELA
+- * @pi: Purgatory to be relocated.
+- * @section: Section relocations applying to.
+- * @relsec: Section containing RELAs.
+- * @symtab: Corresponding symtab.
+- *
+- * Return: 0 on success, negative errno on error.
+- */
+-int __weak
+-arch_kexec_apply_relocations_add(struct purgatory_info *pi, Elf_Shdr *section,
+- const Elf_Shdr *relsec, const Elf_Shdr *symtab)
+-{
+- pr_err("RELA relocation unsupported.\n");
+- return -ENOEXEC;
+-}
+-
+-/*
+- * arch_kexec_apply_relocations - apply relocations of type REL
+- * @pi: Purgatory to be relocated.
+- * @section: Section relocations applying to.
+- * @relsec: Section containing RELs.
+- * @symtab: Corresponding symtab.
+- *
+- * Return: 0 on success, negative errno on error.
+- */
+-int __weak
+-arch_kexec_apply_relocations(struct purgatory_info *pi, Elf_Shdr *section,
+- const Elf_Shdr *relsec, const Elf_Shdr *symtab)
+-{
+- pr_err("REL relocation unsupported.\n");
+- return -ENOEXEC;
+-}
+-
+ /*
+ * Free up memory used by kernel, initrd, and command line. This is temporary
+ * memory allocation which is not needed any more after these buffers have
+diff --git a/kernel/kprobes.c b/kernel/kprobes.c
+index dd58c0be9ce25..f214f8c088ede 100644
+--- a/kernel/kprobes.c
++++ b/kernel/kprobes.c
+@@ -1257,79 +1257,6 @@ void kprobe_busy_end(void)
+ preempt_enable();
+ }
+
+-#if !defined(CONFIG_KRETPROBE_ON_RETHOOK)
+-static void free_rp_inst_rcu(struct rcu_head *head)
+-{
+- struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
+-
+- if (refcount_dec_and_test(&ri->rph->ref))
+- kfree(ri->rph);
+- kfree(ri);
+-}
+-NOKPROBE_SYMBOL(free_rp_inst_rcu);
+-
+-static void recycle_rp_inst(struct kretprobe_instance *ri)
+-{
+- struct kretprobe *rp = get_kretprobe(ri);
+-
+- if (likely(rp))
+- freelist_add(&ri->freelist, &rp->freelist);
+- else
+- call_rcu(&ri->rcu, free_rp_inst_rcu);
+-}
+-NOKPROBE_SYMBOL(recycle_rp_inst);
+-
+-/*
+- * This function is called from delayed_put_task_struct() when a task is
+- * dead and cleaned up to recycle any kretprobe instances associated with
+- * this task. These left over instances represent probed functions that
+- * have been called but will never return.
+- */
+-void kprobe_flush_task(struct task_struct *tk)
+-{
+- struct kretprobe_instance *ri;
+- struct llist_node *node;
+-
+- /* Early boot, not yet initialized. */
+- if (unlikely(!kprobes_initialized))
+- return;
+-
+- kprobe_busy_begin();
+-
+- node = __llist_del_all(&tk->kretprobe_instances);
+- while (node) {
+- ri = container_of(node, struct kretprobe_instance, llist);
+- node = node->next;
+-
+- recycle_rp_inst(ri);
+- }
+-
+- kprobe_busy_end();
+-}
+-NOKPROBE_SYMBOL(kprobe_flush_task);
+-
+-static inline void free_rp_inst(struct kretprobe *rp)
+-{
+- struct kretprobe_instance *ri;
+- struct freelist_node *node;
+- int count = 0;
+-
+- node = rp->freelist.head;
+- while (node) {
+- ri = container_of(node, struct kretprobe_instance, freelist);
+- node = node->next;
+-
+- kfree(ri);
+- count++;
+- }
+-
+- if (refcount_sub_and_test(count, &rp->rph->ref)) {
+- kfree(rp->rph);
+- rp->rph = NULL;
+- }
+-}
+-#endif /* !CONFIG_KRETPROBE_ON_RETHOOK */
+-
+ /* Add the new probe to 'ap->list'. */
+ static int add_new_kprobe(struct kprobe *ap, struct kprobe *p)
+ {
+@@ -1928,6 +1855,77 @@ static struct notifier_block kprobe_exceptions_nb = {
+ #ifdef CONFIG_KRETPROBES
+
+ #if !defined(CONFIG_KRETPROBE_ON_RETHOOK)
++static void free_rp_inst_rcu(struct rcu_head *head)
++{
++ struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
++
++ if (refcount_dec_and_test(&ri->rph->ref))
++ kfree(ri->rph);
++ kfree(ri);
++}
++NOKPROBE_SYMBOL(free_rp_inst_rcu);
++
++static void recycle_rp_inst(struct kretprobe_instance *ri)
++{
++ struct kretprobe *rp = get_kretprobe(ri);
++
++ if (likely(rp))
++ freelist_add(&ri->freelist, &rp->freelist);
++ else
++ call_rcu(&ri->rcu, free_rp_inst_rcu);
++}
++NOKPROBE_SYMBOL(recycle_rp_inst);
++
++/*
++ * This function is called from delayed_put_task_struct() when a task is
++ * dead and cleaned up to recycle any kretprobe instances associated with
++ * this task. These left over instances represent probed functions that
++ * have been called but will never return.
++ */
++void kprobe_flush_task(struct task_struct *tk)
++{
++ struct kretprobe_instance *ri;
++ struct llist_node *node;
++
++ /* Early boot, not yet initialized. */
++ if (unlikely(!kprobes_initialized))
++ return;
++
++ kprobe_busy_begin();
++
++ node = __llist_del_all(&tk->kretprobe_instances);
++ while (node) {
++ ri = container_of(node, struct kretprobe_instance, llist);
++ node = node->next;
++
++ recycle_rp_inst(ri);
++ }
++
++ kprobe_busy_end();
++}
++NOKPROBE_SYMBOL(kprobe_flush_task);
++
++static inline void free_rp_inst(struct kretprobe *rp)
++{
++ struct kretprobe_instance *ri;
++ struct freelist_node *node;
++ int count = 0;
++
++ node = rp->freelist.head;
++ while (node) {
++ ri = container_of(node, struct kretprobe_instance, freelist);
++ node = node->next;
++
++ kfree(ri);
++ count++;
++ }
++
++ if (refcount_sub_and_test(count, &rp->rph->ref)) {
++ kfree(rp->rph);
++ rp->rph = NULL;
++ }
++}
++
+ /* This assumes the 'tsk' is the current task or the is not running. */
+ static kprobe_opcode_t *__kretprobe_find_ret_addr(struct task_struct *tsk,
+ struct llist_node **cur)
+diff --git a/kernel/module.c b/kernel/module.c
+index 6cea788fd965c..6529c84c536f6 100644
+--- a/kernel/module.c
++++ b/kernel/module.c
+@@ -3033,6 +3033,10 @@ static int elf_validity_check(struct load_info *info)
+ * strings in the section safe.
+ */
+ info->secstrings = (void *)info->hdr + strhdr->sh_offset;
++ if (strhdr->sh_size == 0) {
++ pr_err("empty section name table\n");
++ goto no_exec;
++ }
+ if (info->secstrings[strhdr->sh_size - 1] != '\0') {
+ pr_err("ELF Spec violation: section name table isn't null terminated\n");
+ goto no_exec;
+diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
+index 0153b0ca7b23e..6219aaa454b5b 100644
+--- a/kernel/power/energy_model.c
++++ b/kernel/power/energy_model.c
+@@ -259,6 +259,8 @@ static void em_cpufreq_update_efficiencies(struct device *dev)
+ found++;
+ }
+
++ cpufreq_cpu_put(policy);
++
+ if (!found)
+ return;
+
+diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
+index da03c15ecc898..1ead794fc2f47 100644
+--- a/kernel/printk/printk.c
++++ b/kernel/printk/printk.c
+@@ -746,8 +746,19 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
+ goto out;
+ }
+
++ /*
++ * Guarantee this task is visible on the waitqueue before
++ * checking the wake condition.
++ *
++ * The full memory barrier within set_current_state() of
++ * prepare_to_wait_event() pairs with the full memory barrier
++ * within wq_has_sleeper().
++ *
++ * This pairs with __wake_up_klogd:A.
++ */
+ ret = wait_event_interruptible(log_wait,
+- prb_read_valid(prb, atomic64_read(&user->seq), r));
++ prb_read_valid(prb,
++ atomic64_read(&user->seq), r)); /* LMM(devkmsg_read:A) */
+ if (ret)
+ goto out;
+ }
+@@ -1513,7 +1524,18 @@ static int syslog_print(char __user *buf, int size)
+ seq = syslog_seq;
+
+ mutex_unlock(&syslog_lock);
+- len = wait_event_interruptible(log_wait, prb_read_valid(prb, seq, NULL));
++ /*
++ * Guarantee this task is visible on the waitqueue before
++ * checking the wake condition.
++ *
++ * The full memory barrier within set_current_state() of
++ * prepare_to_wait_event() pairs with the full memory barrier
++ * within wq_has_sleeper().
++ *
++ * This pairs with __wake_up_klogd:A.
++ */
++ len = wait_event_interruptible(log_wait,
++ prb_read_valid(prb, seq, NULL)); /* LMM(syslog_print:A) */
+ mutex_lock(&syslog_lock);
+
+ if (len)
+@@ -3310,28 +3332,43 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
+ static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) =
+ IRQ_WORK_INIT_LAZY(wake_up_klogd_work_func);
+
+-void wake_up_klogd(void)
++static void __wake_up_klogd(int val)
+ {
+ if (!printk_percpu_data_ready())
+ return;
+
+ preempt_disable();
+- if (waitqueue_active(&log_wait)) {
+- this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
++ /*
++ * Guarantee any new records can be seen by tasks preparing to wait
++ * before this context checks if the wait queue is empty.
++ *
++ * The full memory barrier within wq_has_sleeper() pairs with the full
++ * memory barrier within set_current_state() of
++ * prepare_to_wait_event(), which is called after ___wait_event() adds
++ * the waiter but before it has checked the wait condition.
++ *
++ * This pairs with devkmsg_read:A and syslog_print:A.
++ */
++ if (wq_has_sleeper(&log_wait) || /* LMM(__wake_up_klogd:A) */
++ (val & PRINTK_PENDING_OUTPUT)) {
++ this_cpu_or(printk_pending, val);
+ irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
+ }
+ preempt_enable();
+ }
+
+-void defer_console_output(void)
++void wake_up_klogd(void)
+ {
+- if (!printk_percpu_data_ready())
+- return;
++ __wake_up_klogd(PRINTK_PENDING_WAKEUP);
++}
+
+- preempt_disable();
+- this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+- irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
+- preempt_enable();
++void defer_console_output(void)
++{
++ /*
++ * New messages may have been added directly to the ringbuffer
++ * using vprintk_store(), so wake any waiters as well.
++ */
++ __wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
+ }
+
+ void printk_trigger_flush(void)
+diff --git a/kernel/ptrace.c b/kernel/ptrace.c
+index ccc4b465775b8..6149ca5e0e14b 100644
+--- a/kernel/ptrace.c
++++ b/kernel/ptrace.c
+@@ -1236,9 +1236,8 @@ int ptrace_request(struct task_struct *child, long request,
+ return ptrace_resume(child, request, data);
+
+ case PTRACE_KILL:
+- if (child->exit_state) /* already dead */
+- return 0;
+- return ptrace_resume(child, request, SIGKILL);
++ send_sig_info(SIGKILL, SEND_SIG_NOINFO, child);
++ return 0;
+
+ #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
+ case PTRACE_GETREGSET:
+diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
+index bf8e341e75b4f..f559870fbf8b3 100644
+--- a/kernel/rcu/Kconfig
++++ b/kernel/rcu/Kconfig
+@@ -86,6 +86,7 @@ config TASKS_RCU
+
+ config TASKS_RUDE_RCU
+ def_bool 0
++ select IRQ_WORK
+ help
+ This option enables a task-based RCU implementation that uses
+ only context switch (including preemption) and user-mode
+diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
+index 99cf3a13954cf..00ff0896fb000 100644
+--- a/kernel/rcu/tasks.h
++++ b/kernel/rcu/tasks.h
+@@ -460,7 +460,7 @@ static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu
+ }
+ }
+
+- if (rcu_segcblist_empty(&rtpcp->cblist))
++ if (rcu_segcblist_empty(&rtpcp->cblist) || !cpu_possible(cpu))
+ return;
+ raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
+ rcu_segcblist_advance(&rtpcp->cblist, rcu_seq_current(&rtp->tasks_gp_seq));
+@@ -950,6 +950,9 @@ static void rcu_tasks_be_rude(struct work_struct *work)
+ // Wait for one rude RCU-tasks grace period.
+ static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
+ {
++ if (num_online_cpus() <= 1)
++ return; // Fastpath for only one CPU.
++
+ rtp->n_ipis += cpumask_weight(cpu_online_mask);
+ schedule_on_each_cpu(rcu_tasks_be_rude);
+ }
+diff --git a/kernel/scftorture.c b/kernel/scftorture.c
+index dcb0410950e45..5d113aa59e773 100644
+--- a/kernel/scftorture.c
++++ b/kernel/scftorture.c
+@@ -267,9 +267,10 @@ static void scf_handler(void *scfc_in)
+ }
+ this_cpu_inc(scf_invoked_count);
+ if (longwait <= 0) {
+- if (!(r & 0xffc0))
++ if (!(r & 0xffc0)) {
+ udelay(r & 0x3f);
+- goto out;
++ goto out;
++ }
+ }
+ if (r & 0xfff)
+ goto out;
+diff --git a/kernel/sched/core.c b/kernel/sched/core.c
+index d58c0389eb23c..e58d894df2074 100644
+--- a/kernel/sched/core.c
++++ b/kernel/sched/core.c
+@@ -610,10 +610,10 @@ void double_rq_lock(struct rq *rq1, struct rq *rq2)
+ swap(rq1, rq2);
+
+ raw_spin_rq_lock(rq1);
+- if (__rq_lockp(rq1) == __rq_lockp(rq2))
+- return;
++ if (__rq_lockp(rq1) != __rq_lockp(rq2))
++ raw_spin_rq_lock_nested(rq2, SINGLE_DEPTH_NESTING);
+
+- raw_spin_rq_lock_nested(rq2, SINGLE_DEPTH_NESTING);
++ double_rq_clock_clear_update(rq1, rq2);
+ }
+ #endif
+
+diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
+index fb4255ae0b2c8..b61281d104584 100644
+--- a/kernel/sched/deadline.c
++++ b/kernel/sched/deadline.c
+@@ -1832,6 +1832,7 @@ out:
+
+ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused)
+ {
++ struct rq_flags rf;
+ struct rq *rq;
+
+ if (READ_ONCE(p->__state) != TASK_WAKING)
+@@ -1843,7 +1844,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
+ * from try_to_wake_up(). Hence, p->pi_lock is locked, but
+ * rq->lock is not... So, lock it
+ */
+- raw_spin_rq_lock(rq);
++ rq_lock(rq, &rf);
+ if (p->dl.dl_non_contending) {
+ update_rq_clock(rq);
+ sub_running_bw(&p->dl, &rq->dl);
+@@ -1859,7 +1860,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
+ put_task_struct(p);
+ }
+ sub_rq_bw(&p->dl, &rq->dl);
+- raw_spin_rq_unlock(rq);
++ rq_unlock(rq, &rf);
+ }
+
+ static void check_preempt_equal_dl(struct rq *rq, struct task_struct *p)
+diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
+index a68482d665355..cc8daa3dcc8bc 100644
+--- a/kernel/sched/fair.c
++++ b/kernel/sched/fair.c
+@@ -4846,8 +4846,8 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
+
+ cfs_rq->throttle_count--;
+ if (!cfs_rq->throttle_count) {
+- cfs_rq->throttled_clock_task_time += rq_clock_task(rq) -
+- cfs_rq->throttled_clock_task;
++ cfs_rq->throttled_clock_pelt_time += rq_clock_pelt(rq) -
++ cfs_rq->throttled_clock_pelt;
+
+ /* Add cfs_rq with load or one or more already running entities to the list */
+ if (!cfs_rq_is_decayed(cfs_rq) || cfs_rq->nr_running)
+@@ -4864,7 +4864,7 @@ static int tg_throttle_down(struct task_group *tg, void *data)
+
+ /* group is entering throttled state, stop time */
+ if (!cfs_rq->throttle_count) {
+- cfs_rq->throttled_clock_task = rq_clock_task(rq);
++ cfs_rq->throttled_clock_pelt = rq_clock_pelt(rq);
+ list_del_leaf_cfs_rq(cfs_rq);
+ }
+ cfs_rq->throttle_count++;
+@@ -5308,7 +5308,7 @@ static void sync_throttle(struct task_group *tg, int cpu)
+ pcfs_rq = tg->parent->cfs_rq[cpu];
+
+ cfs_rq->throttle_count = pcfs_rq->throttle_count;
+- cfs_rq->throttled_clock_task = rq_clock_task(cpu_rq(cpu));
++ cfs_rq->throttled_clock_pelt = rq_clock_pelt(cpu_rq(cpu));
+ }
+
+ /* conditionally throttle active cfs_rq's from put_prev_entity() */
+diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
+index c336f5f481bca..4ff2ed4f8fa15 100644
+--- a/kernel/sched/pelt.h
++++ b/kernel/sched/pelt.h
+@@ -145,9 +145,9 @@ static inline u64 rq_clock_pelt(struct rq *rq)
+ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
+ {
+ if (unlikely(cfs_rq->throttle_count))
+- return cfs_rq->throttled_clock_task - cfs_rq->throttled_clock_task_time;
++ return cfs_rq->throttled_clock_pelt - cfs_rq->throttled_clock_pelt_time;
+
+- return rq_clock_pelt(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
++ return rq_clock_pelt(rq_of(cfs_rq)) - cfs_rq->throttled_clock_pelt_time;
+ }
+ #else
+ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
+diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
+index a4fa3aadfcba6..ed9fb557dadd0 100644
+--- a/kernel/sched/psi.c
++++ b/kernel/sched/psi.c
+@@ -1060,14 +1060,17 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
+ mutex_unlock(&group->avgs_lock);
+
+ for (full = 0; full < 2; full++) {
+- unsigned long avg[3];
+- u64 total;
++ unsigned long avg[3] = { 0, };
++ u64 total = 0;
+ int w;
+
+- for (w = 0; w < 3; w++)
+- avg[w] = group->avg[res * 2 + full][w];
+- total = div_u64(group->total[PSI_AVGS][res * 2 + full],
+- NSEC_PER_USEC);
++ /* CPU FULL is undefined at the system level */
++ if (!(group == &psi_system && res == PSI_CPU && full)) {
++ for (w = 0; w < 3; w++)
++ avg[w] = group->avg[res * 2 + full][w];
++ total = div_u64(group->total[PSI_AVGS][res * 2 + full],
++ NSEC_PER_USEC);
++ }
+
+ seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
+ full ? "full" : "some",
+diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
+index a32c46889af89..7891c0f0e1ff7 100644
+--- a/kernel/sched/rt.c
++++ b/kernel/sched/rt.c
+@@ -871,6 +871,7 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
+ int enqueue = 0;
+ struct rt_rq *rt_rq = sched_rt_period_rt_rq(rt_b, i);
+ struct rq *rq = rq_of_rt_rq(rt_rq);
++ struct rq_flags rf;
+ int skip;
+
+ /*
+@@ -885,7 +886,7 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
+ if (skip)
+ continue;
+
+- raw_spin_rq_lock(rq);
++ rq_lock(rq, &rf);
+ update_rq_clock(rq);
+
+ if (rt_rq->rt_time) {
+@@ -923,7 +924,7 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
+
+ if (enqueue)
+ sched_rt_rq_enqueue(rt_rq);
+- raw_spin_rq_unlock(rq);
++ rq_unlock(rq, &rf);
+ }
+
+ if (!throttled && (!rt_bandwidth_enabled() || rt_b->rt_runtime == RUNTIME_INF))
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index 8dccb34eb1908..0d2b6b758f324 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -603,8 +603,8 @@ struct cfs_rq {
+ s64 runtime_remaining;
+
+ u64 throttled_clock;
+- u64 throttled_clock_task;
+- u64 throttled_clock_task_time;
++ u64 throttled_clock_pelt;
++ u64 throttled_clock_pelt_time;
+ int throttled;
+ int throttle_count;
+ struct list_head throttled_list;
+@@ -2478,6 +2478,24 @@ unsigned long arch_scale_freq_capacity(int cpu)
+ }
+ #endif
+
++#ifdef CONFIG_SCHED_DEBUG
++/*
++ * In double_lock_balance()/double_rq_lock(), we use raw_spin_rq_lock() to
++ * acquire rq lock instead of rq_lock(). So at the end of these two functions
++ * we need to call double_rq_clock_clear_update() to clear RQCF_UPDATED of
++ * rq->clock_update_flags to avoid the WARN_DOUBLE_CLOCK warning.
++ */
++static inline void double_rq_clock_clear_update(struct rq *rq1, struct rq *rq2)
++{
++ rq1->clock_update_flags &= (RQCF_REQ_SKIP|RQCF_ACT_SKIP);
++ /* rq1 == rq2 for !CONFIG_SMP, so just clear RQCF_UPDATED once. */
++#ifdef CONFIG_SMP
++ rq2->clock_update_flags &= (RQCF_REQ_SKIP|RQCF_ACT_SKIP);
++#endif
++}
++#else
++static inline void double_rq_clock_clear_update(struct rq *rq1, struct rq *rq2) {}
++#endif
+
+ #ifdef CONFIG_SMP
+
+@@ -2543,14 +2561,15 @@ static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
+ __acquires(busiest->lock)
+ __acquires(this_rq->lock)
+ {
+- if (__rq_lockp(this_rq) == __rq_lockp(busiest))
+- return 0;
+-
+- if (likely(raw_spin_rq_trylock(busiest)))
++ if (__rq_lockp(this_rq) == __rq_lockp(busiest) ||
++ likely(raw_spin_rq_trylock(busiest))) {
++ double_rq_clock_clear_update(this_rq, busiest);
+ return 0;
++ }
+
+ if (rq_order_less(this_rq, busiest)) {
+ raw_spin_rq_lock_nested(busiest, SINGLE_DEPTH_NESTING);
++ double_rq_clock_clear_update(this_rq, busiest);
+ return 0;
+ }
+
+@@ -2644,6 +2663,7 @@ static inline void double_rq_lock(struct rq *rq1, struct rq *rq2)
+ BUG_ON(rq1 != rq2);
+ raw_spin_rq_lock(rq1);
+ __acquire(rq2->lock); /* Fake it out ;) */
++ double_rq_clock_clear_update(rq1, rq2);
+ }
+
+ /*
+diff --git a/kernel/signal.c b/kernel/signal.c
+index 30cd1ca43bcd5..e43bc2a692f5e 100644
+--- a/kernel/signal.c
++++ b/kernel/signal.c
+@@ -1805,7 +1805,7 @@ int force_sig_pkuerr(void __user *addr, u32 pkey)
+ }
+ #endif
+
+-int force_sig_perf(void __user *addr, u32 type, u64 sig_data)
++int send_sig_perf(void __user *addr, u32 type, u64 sig_data)
+ {
+ struct kernel_siginfo info;
+
+@@ -1817,7 +1817,18 @@ int force_sig_perf(void __user *addr, u32 type, u64 sig_data)
+ info.si_perf_data = sig_data;
+ info.si_perf_type = type;
+
+- return force_sig_info(&info);
++ /*
++ * Signals generated by perf events should not terminate the whole
++ * process if SIGTRAP is blocked, however, delivering the signal
++ * asynchronously is better than not delivering at all. But tell user
++ * space if the signal was asynchronous, so it can clearly be
++ * distinguished from normal synchronous ones.
++ */
++ info.si_perf_flags = sigismember(¤t->blocked, info.si_signo) ?
++ TRAP_PERF_FLAG_ASYNC :
++ 0;
++
++ return send_sig_info(info.si_signo, &info, current);
+ }
+
+ /**
+@@ -3432,6 +3443,7 @@ void copy_siginfo_to_external32(struct compat_siginfo *to,
+ to->si_addr = ptr_to_compat(from->si_addr);
+ to->si_perf_data = from->si_perf_data;
+ to->si_perf_type = from->si_perf_type;
++ to->si_perf_flags = from->si_perf_flags;
+ break;
+ case SIL_CHLD:
+ to->si_pid = from->si_pid;
+@@ -3509,6 +3521,7 @@ static int post_copy_siginfo_from_user32(kernel_siginfo_t *to,
+ to->si_addr = compat_ptr(from->si_addr);
+ to->si_perf_data = from->si_perf_data;
+ to->si_perf_type = from->si_perf_type;
++ to->si_perf_flags = from->si_perf_flags;
+ break;
+ case SIL_CHLD:
+ to->si_pid = from->si_pid;
+@@ -4722,6 +4735,7 @@ static inline void siginfo_buildtime_checks(void)
+ CHECK_OFFSET(si_pkey);
+ CHECK_OFFSET(si_perf_data);
+ CHECK_OFFSET(si_perf_type);
++ CHECK_OFFSET(si_perf_flags);
+
+ /* sigpoll */
+ CHECK_OFFSET(si_band);
+diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
+index d8553f46caa29..6b58fc6813dfc 100644
+--- a/kernel/trace/bpf_trace.c
++++ b/kernel/trace/bpf_trace.c
+@@ -129,7 +129,10 @@ unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
+ * out of events when it was updated in between this and the
+ * rcu_dereference() which is accepted risk.
+ */
+- ret = BPF_PROG_RUN_ARRAY(call->prog_array, ctx, bpf_prog_run);
++ rcu_read_lock();
++ ret = bpf_prog_run_array(rcu_dereference(call->prog_array),
++ ctx, bpf_prog_run);
++ rcu_read_unlock();
+
+ out:
+ __this_cpu_dec(bpf_prog_active);
+diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
+index af899b058c8d0..9c2031941b460 100644
+--- a/kernel/trace/ftrace.c
++++ b/kernel/trace/ftrace.c
+@@ -4465,7 +4465,7 @@ int ftrace_func_mapper_add_ip(struct ftrace_func_mapper *mapper,
+ * @ip: The instruction pointer address to remove the data from
+ *
+ * Returns the data if it is found, otherwise NULL.
+- * Note, if the data pointer is used as the data itself, (see
++ * Note, if the data pointer is used as the data itself, (see
+ * ftrace_func_mapper_find_ip(), then the return value may be meaningless,
+ * if the data pointer was set to zero.
+ */
+@@ -5195,8 +5195,6 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
+ goto out_unlock;
+
+ ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
+- if (ret)
+- remove_hash_entry(direct_functions, entry);
+
+ if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
+ ret = register_ftrace_function(&direct_ops);
+@@ -5205,6 +5203,7 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
+ }
+
+ if (ret) {
++ remove_hash_entry(direct_functions, entry);
+ kfree(entry);
+ if (!direct->count) {
+ list_del_rcu(&direct->next);
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index f4de111fa18ff..f6fb04d79eba6 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -721,13 +721,16 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
+ pos = 0;
+
+ ret = trace_get_user(&parser, ubuf, cnt, &pos);
+- if (ret < 0 || !trace_parser_loaded(&parser))
++ if (ret < 0)
+ break;
+
+ read += ret;
+ ubuf += ret;
+ cnt -= ret;
+
++ if (!trace_parser_loaded(&parser))
++ break;
++
+ ret = -EINVAL;
+ if (kstrtoul(parser.buffer, 0, &val))
+ break;
+@@ -753,7 +756,6 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
+ if (!nr_pids) {
+ /* Cleared the list of pids */
+ trace_pid_list_free(pid_list);
+- read = ret;
+ pid_list = NULL;
+ }
+
+diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
+index 0580287d7a0d1..778200dd8edea 100644
+--- a/kernel/trace/trace_boot.c
++++ b/kernel/trace/trace_boot.c
+@@ -300,7 +300,7 @@ trace_boot_hist_add_handlers(struct xbc_node *hnode, char **bufp,
+ {
+ struct xbc_node *node;
+ const char *p, *handler;
+- int ret;
++ int ret = 0;
+
+ handler = xbc_node_get_data(hnode);
+
+diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
+index f97de82d1342a..f5c1dfb4b0117 100644
+--- a/kernel/trace/trace_events.c
++++ b/kernel/trace/trace_events.c
+@@ -392,12 +392,6 @@ static void test_event_printk(struct trace_event_call *call)
+ if (!(dereference_flags & (1ULL << arg)))
+ goto next_arg;
+
+- /* Check for __get_sockaddr */;
+- if (str_has_prefix(fmt + i, "__get_sockaddr(")) {
+- dereference_flags &= ~(1ULL << arg);
+- goto next_arg;
+- }
+-
+ /* Find the REC-> in the argument */
+ c = strchr(fmt + i, ',');
+ r = strstr(fmt + i, "REC->");
+@@ -413,7 +407,14 @@ static void test_event_printk(struct trace_event_call *call)
+ a = strchr(fmt + i, '&');
+ if ((a && (a < r)) || test_field(r, call))
+ dereference_flags &= ~(1ULL << arg);
++ } else if ((r = strstr(fmt + i, "__get_dynamic_array(")) &&
++ (!c || r < c)) {
++ dereference_flags &= ~(1ULL << arg);
++ } else if ((r = strstr(fmt + i, "__get_sockaddr(")) &&
++ (!c || r < c)) {
++ dereference_flags &= ~(1ULL << arg);
+ }
++
+ next_arg:
+ i--;
+ arg++;
+diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
+index 44db5ba9cabb8..a0e41906d9ce1 100644
+--- a/kernel/trace/trace_events_hist.c
++++ b/kernel/trace/trace_events_hist.c
+@@ -2093,8 +2093,11 @@ static int init_var_ref(struct hist_field *ref_field,
+ return err;
+ free:
+ kfree(ref_field->system);
++ ref_field->system = NULL;
+ kfree(ref_field->event_name);
++ ref_field->event_name = NULL;
+ kfree(ref_field->name);
++ ref_field->name = NULL;
+
+ goto out;
+ }
+diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
+index afb92e2f0aeab..d8e8167a079ff 100644
+--- a/kernel/trace/trace_osnoise.c
++++ b/kernel/trace/trace_osnoise.c
+@@ -1578,11 +1578,12 @@ static enum hrtimer_restart timerlat_irq(struct hrtimer *timer)
+
+ trace_timerlat_sample(&s);
+
+- notify_new_max_latency(diff);
+-
+- if (osnoise_data.stop_tracing)
+- if (time_to_us(diff) >= osnoise_data.stop_tracing)
++ if (osnoise_data.stop_tracing) {
++ if (time_to_us(diff) >= osnoise_data.stop_tracing) {
+ osnoise_stop_tracing();
++ notify_new_max_latency(diff);
++ }
++ }
+
+ wake_up_process(tlat->kthread);
+
+diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
+index abcadbe933bb7..a2d301f58ceda 100644
+--- a/kernel/trace/trace_selftest.c
++++ b/kernel/trace/trace_selftest.c
+@@ -895,6 +895,9 @@ trace_selftest_startup_function_graph(struct tracer *trace,
+ ret = -1;
+ goto out;
+ }
++
++ /* Enable tracing on all functions again */
++ ftrace_set_global_filter(NULL, 0, 1);
+ #endif
+
+ /* Don't test dynamic tracing, the function tracer already did */
+diff --git a/kernel/umh.c b/kernel/umh.c
+index 36c123360ab88..b989736e87074 100644
+--- a/kernel/umh.c
++++ b/kernel/umh.c
+@@ -132,7 +132,7 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
+
+ /* If SIGCLD is ignored do_wait won't populate the status. */
+ kernel_sigaction(SIGCHLD, SIG_DFL);
+- pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD);
++ pid = user_mode_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD);
+ if (pid < 0)
+ sub_info->retval = pid;
+ else
+@@ -171,8 +171,8 @@ static void call_usermodehelper_exec_work(struct work_struct *work)
+ * want to pollute current->children, and we need a parent
+ * that always ignores SIGCHLD to ensure auto-reaping.
+ */
+- pid = kernel_thread(call_usermodehelper_exec_async, sub_info,
+- CLONE_PARENT | SIGCHLD);
++ pid = user_mode_thread(call_usermodehelper_exec_async, sub_info,
++ CLONE_PARENT | SIGCHLD);
+ if (pid < 0) {
+ sub_info->retval = pid;
+ umh_complete(sub_info);
+diff --git a/lib/kunit/debugfs.c b/lib/kunit/debugfs.c
+index b71db0abc12bf..1048ef1b8d6ec 100644
+--- a/lib/kunit/debugfs.c
++++ b/lib/kunit/debugfs.c
+@@ -52,7 +52,7 @@ static void debugfs_print_result(struct seq_file *seq,
+ static int debugfs_print_results(struct seq_file *seq, void *v)
+ {
+ struct kunit_suite *suite = (struct kunit_suite *)seq->private;
+- bool success = kunit_suite_has_succeeded(suite);
++ enum kunit_status success = kunit_suite_has_succeeded(suite);
+ struct kunit_case *test_case;
+
+ if (!suite || !suite->log)
+diff --git a/lib/kunit/executor.c b/lib/kunit/executor.c
+index 22640c9ee8198..96f96e42ce062 100644
+--- a/lib/kunit/executor.c
++++ b/lib/kunit/executor.c
+@@ -71,9 +71,13 @@ kunit_filter_tests(struct kunit_suite *const suite, const char *test_glob)
+
+ /* Use memcpy to workaround copy->name being const. */
+ copy = kmalloc(sizeof(*copy), GFP_KERNEL);
++ if (!copy)
++ return ERR_PTR(-ENOMEM);
+ memcpy(copy, suite, sizeof(*copy));
+
+ filtered = kcalloc(n + 1, sizeof(*filtered), GFP_KERNEL);
++ if (!filtered)
++ return ERR_PTR(-ENOMEM);
+
+ n = 0;
+ kunit_suite_for_each_test_case(suite, test_case) {
+@@ -106,14 +110,16 @@ kunit_filter_subsuite(struct kunit_suite * const * const subsuite,
+
+ filtered = kmalloc_array(n + 1, sizeof(*filtered), GFP_KERNEL);
+ if (!filtered)
+- return NULL;
++ return ERR_PTR(-ENOMEM);
+
+ n = 0;
+ for (i = 0; subsuite[i] != NULL; ++i) {
+ if (!glob_match(filter->suite_glob, subsuite[i]->name))
+ continue;
+ filtered_suite = kunit_filter_tests(subsuite[i], filter->test_glob);
+- if (filtered_suite)
++ if (IS_ERR(filtered_suite))
++ return ERR_CAST(filtered_suite);
++ else if (filtered_suite)
+ filtered[n++] = filtered_suite;
+ }
+ filtered[n] = NULL;
+@@ -146,7 +152,8 @@ static void kunit_free_suite_set(struct suite_set suite_set)
+ }
+
+ static struct suite_set kunit_filter_suites(const struct suite_set *suite_set,
+- const char *filter_glob)
++ const char *filter_glob,
++ int *err)
+ {
+ int i;
+ struct kunit_suite * const **copy, * const *filtered_subsuite;
+@@ -166,6 +173,10 @@ static struct suite_set kunit_filter_suites(const struct suite_set *suite_set,
+
+ for (i = 0; i < max; ++i) {
+ filtered_subsuite = kunit_filter_subsuite(suite_set->start[i], &filter);
++ if (IS_ERR(filtered_subsuite)) {
++ *err = PTR_ERR(filtered_subsuite);
++ return filtered;
++ }
+ if (filtered_subsuite)
+ *copy++ = filtered_subsuite;
+ }
+@@ -236,9 +247,15 @@ int kunit_run_all_tests(void)
+ .start = __kunit_suites_start,
+ .end = __kunit_suites_end,
+ };
++ int err = 0;
+
+- if (filter_glob_param)
+- suite_set = kunit_filter_suites(&suite_set, filter_glob_param);
++ if (filter_glob_param) {
++ suite_set = kunit_filter_suites(&suite_set, filter_glob_param, &err);
++ if (err) {
++ pr_err("kunit executor: error filtering suites: %d\n", err);
++ goto out;
++ }
++ }
+
+ if (!action_param)
+ kunit_exec_run_tests(&suite_set);
+@@ -251,9 +268,10 @@ int kunit_run_all_tests(void)
+ kunit_free_suite_set(suite_set);
+ }
+
+- kunit_handle_shutdown();
+
+- return 0;
++out:
++ kunit_handle_shutdown();
++ return err;
+ }
+
+ #if IS_BUILTIN(CONFIG_KUNIT_TEST)
+diff --git a/lib/kunit/executor_test.c b/lib/kunit/executor_test.c
+index 4ed57fd94e427..eac6ff4802738 100644
+--- a/lib/kunit/executor_test.c
++++ b/lib/kunit/executor_test.c
+@@ -137,14 +137,16 @@ static void filter_suites_test(struct kunit *test)
+ .end = suites + 2,
+ };
+ struct suite_set filtered = {.start = NULL, .end = NULL};
++ int err = 0;
+
+ /* Emulate two files, each having one suite */
+ subsuites[0][0] = alloc_fake_suite(test, "suite0", dummy_test_cases);
+ subsuites[1][0] = alloc_fake_suite(test, "suite1", dummy_test_cases);
+
+ /* Filter out suite1 */
+- filtered = kunit_filter_suites(&suite_set, "suite0");
++ filtered = kunit_filter_suites(&suite_set, "suite0", &err);
+ kfree_subsuites_at_end(test, &filtered); /* let us use ASSERTs without leaking */
++ KUNIT_EXPECT_EQ(test, err, 0);
+ KUNIT_ASSERT_EQ(test, filtered.end - filtered.start, (ptrdiff_t)1);
+
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, filtered.start);
+diff --git a/lib/string_helpers.c b/lib/string_helpers.c
+index 4f877e9551d5b..5ed3beb066e6d 100644
+--- a/lib/string_helpers.c
++++ b/lib/string_helpers.c
+@@ -757,6 +757,9 @@ char **devm_kasprintf_strarray(struct device *dev, const char *prefix, size_t n)
+ return ERR_PTR(-ENOMEM);
+ }
+
++ ptr->n = n;
++ devres_add(dev, ptr);
++
+ return ptr->array;
+ }
+ EXPORT_SYMBOL_GPL(devm_kasprintf_strarray);
+diff --git a/mm/cma.c b/mm/cma.c
+index eaa4b5c920a20..4a978e09547a8 100644
+--- a/mm/cma.c
++++ b/mm/cma.c
+@@ -37,6 +37,7 @@
+
+ struct cma cma_areas[MAX_CMA_AREAS];
+ unsigned cma_area_count;
++static DEFINE_MUTEX(cma_mutex);
+
+ phys_addr_t cma_get_base(const struct cma *cma)
+ {
+@@ -468,9 +469,10 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
+ spin_unlock_irq(&cma->lock);
+
+ pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
++ mutex_lock(&cma_mutex);
+ ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
+ GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
+-
++ mutex_unlock(&cma_mutex);
+ if (ret == 0) {
+ page = pfn_to_page(pfn);
+ break;
+diff --git a/mm/compaction.c b/mm/compaction.c
+index fe915db6149b9..de42b8e487589 100644
+--- a/mm/compaction.c
++++ b/mm/compaction.c
+@@ -1858,6 +1858,8 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
+
+ update_fast_start_pfn(cc, free_pfn);
+ pfn = pageblock_start_pfn(free_pfn);
++ if (pfn < cc->zone->zone_start_pfn)
++ pfn = cc->zone->zone_start_pfn;
+ cc->fast_search_fail = 0;
+ found_block = true;
+ set_pageblock_skip(freepage);
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 3fc721789743e..410bbb0aee321 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -6562,7 +6562,14 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ pud_clear(pud);
+ put_page(virt_to_page(ptep));
+ mm_dec_nr_pmds(mm);
+- *addr = ALIGN(*addr, HPAGE_SIZE * PTRS_PER_PTE) - HPAGE_SIZE;
++ /*
++ * This update of passed address optimizes loops sequentially
++ * processing addresses in increments of huge page size (PMD_SIZE
++ * in this case). By clearing the pud, a PUD_SIZE area is unmapped.
++ * Update address to the 'last page' in the cleared area so that
++ * calling loop can move to first page past this area.
++ */
++ *addr |= PUD_SIZE - PMD_SIZE;
+ return 1;
+ }
+
+diff --git a/mm/memremap.c b/mm/memremap.c
+index af0223605e697..2554a6b07007f 100644
+--- a/mm/memremap.c
++++ b/mm/memremap.c
+@@ -214,7 +214,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
+
+ if (!mhp_range_allowed(range->start, range_len(range), !is_private)) {
+ error = -EINVAL;
+- goto err_pfn_remap;
++ goto err_kasan;
+ }
+
+ mem_hotplug_begin();
+diff --git a/mm/page_alloc.c b/mm/page_alloc.c
+index 0e42038382c12..5ced6cb260ed1 100644
+--- a/mm/page_alloc.c
++++ b/mm/page_alloc.c
+@@ -5324,8 +5324,8 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
+ page = __rmqueue_pcplist(zone, 0, ac.migratetype, alloc_flags,
+ pcp, pcp_list);
+ if (unlikely(!page)) {
+- /* Try and get at least one page */
+- if (!nr_populated)
++ /* Try and allocate at least one page */
++ if (!nr_account)
+ goto failed_irq;
+ break;
+ }
+diff --git a/mm/page_owner.c b/mm/page_owner.c
+index fb3a05fdebdbf..19bc559e49040 100644
+--- a/mm/page_owner.c
++++ b/mm/page_owner.c
+@@ -168,7 +168,7 @@ static inline void __set_page_owner_handle(struct page_ext *page_ext,
+ page_owner->pid = current->pid;
+ page_owner->tgid = current->tgid;
+ page_owner->ts_nsec = local_clock();
+- strlcpy(page_owner->comm, current->comm,
++ strscpy(page_owner->comm, current->comm,
+ sizeof(page_owner->comm));
+ __set_bit(PAGE_EXT_OWNER, &page_ext->flags);
+ __set_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
+diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
+index fe803bee419a9..ac06c9724c7f3 100644
+--- a/net/bluetooth/hci_conn.c
++++ b/net/bluetooth/hci_conn.c
+@@ -481,7 +481,7 @@ static bool hci_setup_sync_conn(struct hci_conn *conn, __u16 handle)
+
+ bool hci_setup_sync(struct hci_conn *conn, __u16 handle)
+ {
+- if (enhanced_sco_capable(conn->hdev))
++ if (enhanced_sync_conn_capable(conn->hdev))
+ return hci_enhanced_setup_sync_conn(conn, handle);
+
+ return hci_setup_sync_conn(conn, handle);
+@@ -943,10 +943,11 @@ static void create_le_conn_complete(struct hci_dev *hdev, void *data, int err)
+
+ bt_dev_err(hdev, "request failed to create LE connection: err %d", err);
+
+- if (!conn)
++ /* Check if connection is still pending */
++ if (conn != hci_lookup_le_connect(hdev))
+ goto done;
+
+- hci_le_conn_failed(conn, err);
++ hci_conn_failed(conn, err);
+
+ done:
+ hci_dev_unlock(hdev);
+diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
+index 66451661283c2..af17dfb20e017 100644
+--- a/net/bluetooth/hci_event.c
++++ b/net/bluetooth/hci_event.c
+@@ -1835,7 +1835,9 @@ static u8 hci_cc_le_clear_accept_list(struct hci_dev *hdev, void *data,
+ if (rp->status)
+ return rp->status;
+
++ hci_dev_lock(hdev);
+ hci_bdaddr_list_clear(&hdev->le_accept_list);
++ hci_dev_unlock(hdev);
+
+ return rp->status;
+ }
+@@ -1855,8 +1857,10 @@ static u8 hci_cc_le_add_to_accept_list(struct hci_dev *hdev, void *data,
+ if (!sent)
+ return rp->status;
+
++ hci_dev_lock(hdev);
+ hci_bdaddr_list_add(&hdev->le_accept_list, &sent->bdaddr,
+ sent->bdaddr_type);
++ hci_dev_unlock(hdev);
+
+ return rp->status;
+ }
+@@ -1876,8 +1880,10 @@ static u8 hci_cc_le_del_from_accept_list(struct hci_dev *hdev, void *data,
+ if (!sent)
+ return rp->status;
+
++ hci_dev_lock(hdev);
+ hci_bdaddr_list_del(&hdev->le_accept_list, &sent->bdaddr,
+ sent->bdaddr_type);
++ hci_dev_unlock(hdev);
+
+ return rp->status;
+ }
+@@ -1949,9 +1955,11 @@ static u8 hci_cc_le_add_to_resolv_list(struct hci_dev *hdev, void *data,
+ if (!sent)
+ return rp->status;
+
++ hci_dev_lock(hdev);
+ hci_bdaddr_list_add_with_irk(&hdev->le_resolv_list, &sent->bdaddr,
+ sent->bdaddr_type, sent->peer_irk,
+ sent->local_irk);
++ hci_dev_unlock(hdev);
+
+ return rp->status;
+ }
+@@ -1971,8 +1979,10 @@ static u8 hci_cc_le_del_from_resolv_list(struct hci_dev *hdev, void *data,
+ if (!sent)
+ return rp->status;
+
++ hci_dev_lock(hdev);
+ hci_bdaddr_list_del_with_irk(&hdev->le_resolv_list, &sent->bdaddr,
+ sent->bdaddr_type);
++ hci_dev_unlock(hdev);
+
+ return rp->status;
+ }
+@@ -1987,7 +1997,9 @@ static u8 hci_cc_le_clear_resolv_list(struct hci_dev *hdev, void *data,
+ if (rp->status)
+ return rp->status;
+
++ hci_dev_lock(hdev);
+ hci_bdaddr_list_clear(&hdev->le_resolv_list);
++ hci_dev_unlock(hdev);
+
+ return rp->status;
+ }
+@@ -3225,10 +3237,12 @@ static void hci_conn_request_evt(struct hci_dev *hdev, void *data,
+ return;
+ }
+
++ hci_dev_lock(hdev);
++
+ if (hci_bdaddr_list_lookup(&hdev->reject_list, &ev->bdaddr,
+ BDADDR_BREDR)) {
+ hci_reject_conn(hdev, &ev->bdaddr);
+- return;
++ goto unlock;
+ }
+
+ /* Require HCI_CONNECTABLE or an accept list entry to accept the
+@@ -3240,13 +3254,11 @@ static void hci_conn_request_evt(struct hci_dev *hdev, void *data,
+ !hci_bdaddr_list_lookup_with_flags(&hdev->accept_list, &ev->bdaddr,
+ BDADDR_BREDR)) {
+ hci_reject_conn(hdev, &ev->bdaddr);
+- return;
++ goto unlock;
+ }
+
+ /* Connection accepted */
+
+- hci_dev_lock(hdev);
+-
+ ie = hci_inquiry_cache_lookup(hdev, &ev->bdaddr);
+ if (ie)
+ memcpy(ie->data.dev_class, ev->dev_class, 3);
+@@ -3258,8 +3270,7 @@ static void hci_conn_request_evt(struct hci_dev *hdev, void *data,
+ HCI_ROLE_SLAVE);
+ if (!conn) {
+ bt_dev_err(hdev, "no memory for new connection");
+- hci_dev_unlock(hdev);
+- return;
++ goto unlock;
+ }
+ }
+
+@@ -3299,6 +3310,10 @@ static void hci_conn_request_evt(struct hci_dev *hdev, void *data,
+ conn->state = BT_CONNECT2;
+ hci_connect_cfm(conn, 0);
+ }
++
++ return;
++unlock:
++ hci_dev_unlock(hdev);
+ }
+
+ static u8 hci_to_mgmt_reason(u8 err)
+@@ -5617,10 +5632,12 @@ static void le_conn_complete_evt(struct hci_dev *hdev, u8 status,
+ status = HCI_ERROR_INVALID_PARAMETERS;
+ }
+
+- if (status) {
+- hci_conn_failed(conn, status);
++ /* All connection failure handling is taken care of by the
++ * hci_conn_failed function which is triggered by the HCI
++ * request completion callbacks used for connecting.
++ */
++ if (status)
+ goto unlock;
+- }
+
+ if (conn->dst_type == ADDR_LE_DEV_PUBLIC)
+ addr_type = BDADDR_LE_PUBLIC;
+diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
+index 42c8047a9897d..f4afe482e3004 100644
+--- a/net/bluetooth/hci_request.c
++++ b/net/bluetooth/hci_request.c
+@@ -2260,6 +2260,7 @@ static int active_scan(struct hci_request *req, unsigned long opt)
+ if (err < 0)
+ own_addr_type = ADDR_LE_DEV_PUBLIC;
+
++ hci_dev_lock(hdev);
+ if (hci_is_adv_monitoring(hdev)) {
+ /* Duplicate filter should be disabled when some advertisement
+ * monitor is activated, otherwise AdvMon can only receive one
+@@ -2276,6 +2277,7 @@ static int active_scan(struct hci_request *req, unsigned long opt)
+ */
+ filter_dup = LE_SCAN_FILTER_DUP_DISABLE;
+ }
++ hci_dev_unlock(hdev);
+
+ hci_req_start_scan(req, LE_SCAN_ACTIVE, interval,
+ hdev->le_scan_window_discovery, own_addr_type,
+diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
+index 8eabf41b29939..1111da4e2f2bd 100644
+--- a/net/bluetooth/sco.c
++++ b/net/bluetooth/sco.c
+@@ -574,19 +574,24 @@ static int sco_sock_connect(struct socket *sock, struct sockaddr *addr, int alen
+ addr->sa_family != AF_BLUETOOTH)
+ return -EINVAL;
+
+- if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND)
+- return -EBADFD;
++ lock_sock(sk);
++ if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND) {
++ err = -EBADFD;
++ goto done;
++ }
+
+- if (sk->sk_type != SOCK_SEQPACKET)
+- return -EINVAL;
++ if (sk->sk_type != SOCK_SEQPACKET) {
++ err = -EINVAL;
++ goto done;
++ }
+
+ hdev = hci_get_route(&sa->sco_bdaddr, &sco_pi(sk)->src, BDADDR_BREDR);
+- if (!hdev)
+- return -EHOSTUNREACH;
++ if (!hdev) {
++ err = -EHOSTUNREACH;
++ goto done;
++ }
+ hci_dev_lock(hdev);
+
+- lock_sock(sk);
+-
+ /* Set destination address and psm */
+ bacpy(&sco_pi(sk)->dst, &sa->sco_bdaddr);
+
+@@ -885,7 +890,7 @@ static int sco_sock_setsockopt(struct socket *sock, int level, int optname,
+ err = -EBADFD;
+ break;
+ }
+- if (enhanced_sco_capable(hdev) &&
++ if (enhanced_sync_conn_capable(hdev) &&
+ voice.setting == BT_VOICE_TRANSPARENT)
+ sco_pi(sk)->codec.id = BT_CODEC_TRANSPARENT;
+ hci_dev_put(hdev);
+diff --git a/net/core/dev.c b/net/core/dev.c
+index 2771fd22dc6ae..0784c339cd7d8 100644
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -3215,11 +3215,15 @@ int skb_checksum_help(struct sk_buff *skb)
+ }
+
+ offset = skb_checksum_start_offset(skb);
+- BUG_ON(offset >= skb_headlen(skb));
++ ret = -EINVAL;
++ if (WARN_ON_ONCE(offset >= skb_headlen(skb)))
++ goto out;
++
+ csum = skb_checksum(skb, offset, skb->len - offset, 0);
+
+ offset += skb->csum_offset;
+- BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb));
++ if (WARN_ON_ONCE(offset + sizeof(__sum16) > skb_headlen(skb)))
++ goto out;
+
+ ret = skb_ensure_writable(skb, offset + sizeof(__sum16));
+ if (ret)
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 60f99e9fb6d12..1f3ce7aea7168 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -5711,7 +5711,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
+ &tp->last_oow_ack_time))
+ tcp_send_dupack(sk, skb);
+ } else if (tcp_reset_check(sk, skb)) {
+- tcp_reset(sk, skb);
++ goto reset;
+ }
+ goto discard;
+ }
+@@ -5747,17 +5747,16 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
+ }
+
+ if (rst_seq_match)
+- tcp_reset(sk, skb);
+- else {
+- /* Disable TFO if RST is out-of-order
+- * and no data has been received
+- * for current active TFO socket
+- */
+- if (tp->syn_fastopen && !tp->data_segs_in &&
+- sk->sk_state == TCP_ESTABLISHED)
+- tcp_fastopen_active_disable(sk);
+- tcp_send_challenge_ack(sk);
+- }
++ goto reset;
++
++ /* Disable TFO if RST is out-of-order
++ * and no data has been received
++ * for current active TFO socket
++ */
++ if (tp->syn_fastopen && !tp->data_segs_in &&
++ sk->sk_state == TCP_ESTABLISHED)
++ tcp_fastopen_active_disable(sk);
++ tcp_send_challenge_ack(sk);
+ goto discard;
+ }
+
+@@ -5782,6 +5781,11 @@ syn_challenge:
+ discard:
+ tcp_drop(sk, skb);
+ return false;
++
++reset:
++ tcp_reset(sk, skb);
++ __kfree_skb(skb);
++ return false;
+ }
+
+ /*
+diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
+index b225041765885..51e77dc6571a2 100644
+--- a/net/ipv6/addrconf.c
++++ b/net/ipv6/addrconf.c
+@@ -797,6 +797,7 @@ static void dev_forward_change(struct inet6_dev *idev)
+ {
+ struct net_device *dev;
+ struct inet6_ifaddr *ifa;
++ LIST_HEAD(tmp_addr_list);
+
+ if (!idev)
+ return;
+@@ -815,14 +816,24 @@ static void dev_forward_change(struct inet6_dev *idev)
+ }
+ }
+
++ read_lock_bh(&idev->lock);
+ list_for_each_entry(ifa, &idev->addr_list, if_list) {
+ if (ifa->flags&IFA_F_TENTATIVE)
+ continue;
++ list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
++ }
++ read_unlock_bh(&idev->lock);
++
++ while (!list_empty(&tmp_addr_list)) {
++ ifa = list_first_entry(&tmp_addr_list,
++ struct inet6_ifaddr, if_list_aux);
++ list_del(&ifa->if_list_aux);
+ if (idev->cnf.forwarding)
+ addrconf_join_anycast(ifa);
+ else
+ addrconf_leave_anycast(ifa);
+ }
++
+ inet6_netconf_notify_devconf(dev_net(dev), RTM_NEWNETCONF,
+ NETCONFA_FORWARDING,
+ dev->ifindex, &idev->cnf);
+@@ -3728,7 +3739,8 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
+ unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN;
+ struct net *net = dev_net(dev);
+ struct inet6_dev *idev;
+- struct inet6_ifaddr *ifa, *tmp;
++ struct inet6_ifaddr *ifa;
++ LIST_HEAD(tmp_addr_list);
+ bool keep_addr = false;
+ bool was_ready;
+ int state, i;
+@@ -3820,16 +3832,23 @@ restart:
+ write_lock_bh(&idev->lock);
+ }
+
+- list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
++ list_for_each_entry(ifa, &idev->addr_list, if_list)
++ list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
++ write_unlock_bh(&idev->lock);
++
++ while (!list_empty(&tmp_addr_list)) {
+ struct fib6_info *rt = NULL;
+ bool keep;
+
++ ifa = list_first_entry(&tmp_addr_list,
++ struct inet6_ifaddr, if_list_aux);
++ list_del(&ifa->if_list_aux);
++
+ addrconf_del_dad_work(ifa);
+
+ keep = keep_addr && (ifa->flags & IFA_F_PERMANENT) &&
+ !addr_is_local(&ifa->addr);
+
+- write_unlock_bh(&idev->lock);
+ spin_lock_bh(&ifa->lock);
+
+ if (keep) {
+@@ -3860,15 +3879,14 @@ restart:
+ addrconf_leave_solict(ifa->idev, &ifa->addr);
+ }
+
+- write_lock_bh(&idev->lock);
+ if (!keep) {
++ write_lock_bh(&idev->lock);
+ list_del_rcu(&ifa->if_list);
++ write_unlock_bh(&idev->lock);
+ in6_ifa_put(ifa);
+ }
+ }
+
+- write_unlock_bh(&idev->lock);
+-
+ /* Step 5: Discard anycast and multicast list */
+ if (unregister) {
+ ipv6_ac_destroy_dev(idev);
+@@ -4201,7 +4219,8 @@ static void addrconf_dad_completed(struct inet6_ifaddr *ifp, bool bump_id,
+ send_rs = send_mld &&
+ ipv6_accept_ra(ifp->idev) &&
+ ifp->idev->cnf.rtr_solicits != 0 &&
+- (dev->flags&IFF_LOOPBACK) == 0;
++ (dev->flags & IFF_LOOPBACK) == 0 &&
++ (dev->type != ARPHRD_TUNNEL);
+ read_unlock_bh(&ifp->idev->lock);
+
+ /* While dad is in progress mld report's source address is in6_addrany.
+diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
+index 206f66310a88d..0324e26850169 100644
+--- a/net/ipv6/datagram.c
++++ b/net/ipv6/datagram.c
+@@ -218,11 +218,11 @@ ipv4_connected:
+ err = -EINVAL;
+ goto out;
+ }
+- sk->sk_bound_dev_if = usin->sin6_scope_id;
++ WRITE_ONCE(sk->sk_bound_dev_if, usin->sin6_scope_id);
+ }
+
+ if (!sk->sk_bound_dev_if && (addr_type & IPV6_ADDR_MULTICAST))
+- sk->sk_bound_dev_if = np->mcast_oif;
++ WRITE_ONCE(sk->sk_bound_dev_if, np->mcast_oif);
+
+ /* Connect to link-local address requires an interface */
+ if (!sk->sk_bound_dev_if) {
+@@ -798,7 +798,7 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
+ if (src_idx) {
+ if (fl6->flowi6_oif &&
+ src_idx != fl6->flowi6_oif &&
+- (sk->sk_bound_dev_if != fl6->flowi6_oif ||
++ (READ_ONCE(sk->sk_bound_dev_if) != fl6->flowi6_oif ||
+ !sk_dev_equal_l3scope(sk, src_idx)))
+ return -EINVAL;
+ fl6->flowi6_oif = src_idx;
+diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
+index 7f0fa9bd9ffe0..a535c3f2e4af4 100644
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -105,7 +105,7 @@ static int compute_score(struct sock *sk, struct net *net,
+ const struct in6_addr *daddr, unsigned short hnum,
+ int dif, int sdif)
+ {
+- int score;
++ int bound_dev_if, score;
+ struct inet_sock *inet;
+ bool dev_match;
+
+@@ -132,10 +132,11 @@ static int compute_score(struct sock *sk, struct net *net,
+ score++;
+ }
+
+- dev_match = udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif);
++ bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
++ dev_match = udp_sk_bound_dev_eq(net, bound_dev_if, dif, sdif);
+ if (!dev_match)
+ return -1;
+- if (sk->sk_bound_dev_if)
++ if (bound_dev_if)
+ score++;
+
+ if (READ_ONCE(sk->sk_incoming_cpu) == raw_smp_processor_id())
+@@ -789,7 +790,7 @@ static bool __udp_v6_is_mcast_sock(struct net *net, struct sock *sk,
+ (inet->inet_dport && inet->inet_dport != rmt_port) ||
+ (!ipv6_addr_any(&sk->sk_v6_daddr) &&
+ !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
+- !udp_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif) ||
++ !udp_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif, sdif) ||
+ (!ipv6_addr_any(&sk->sk_v6_rcv_saddr) &&
+ !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr)))
+ return false;
+@@ -1433,7 +1434,7 @@ do_udp_sendmsg:
+ }
+
+ if (!fl6->flowi6_oif)
+- fl6->flowi6_oif = sk->sk_bound_dev_if;
++ fl6->flowi6_oif = READ_ONCE(sk->sk_bound_dev_if);
+
+ if (!fl6->flowi6_oif)
+ fl6->flowi6_oif = np->sticky_pktinfo.ipi6_ifindex;
+diff --git a/net/mac80211/chan.c b/net/mac80211/chan.c
+index e26d42de14ec2..3bb88c6b6d849 100644
+--- a/net/mac80211/chan.c
++++ b/net/mac80211/chan.c
+@@ -1749,12 +1749,9 @@ int ieee80211_vif_use_reserved_context(struct ieee80211_sub_if_data *sdata)
+
+ if (new_ctx->replace_state == IEEE80211_CHANCTX_REPLACE_NONE) {
+ if (old_ctx)
+- err = ieee80211_vif_use_reserved_reassign(sdata);
+- else
+- err = ieee80211_vif_use_reserved_assign(sdata);
++ return ieee80211_vif_use_reserved_reassign(sdata);
+
+- if (err)
+- return err;
++ return ieee80211_vif_use_reserved_assign(sdata);
+ }
+
+ /*
+diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
+index d4a7ba4a82025..e58aa6fa58f24 100644
+--- a/net/mac80211/ieee80211_i.h
++++ b/net/mac80211/ieee80211_i.h
+@@ -1148,6 +1148,9 @@ struct tpt_led_trigger {
+ * a scan complete for an aborted scan.
+ * @SCAN_HW_CANCELLED: Set for our scan work function when the scan is being
+ * cancelled.
++ * @SCAN_BEACON_WAIT: Set whenever we're passive scanning because of radar/no-IR
++ * and could send a probe request after receiving a beacon.
++ * @SCAN_BEACON_DONE: Beacon received, we can now send a probe request
+ */
+ enum {
+ SCAN_SW_SCANNING,
+@@ -1156,6 +1159,8 @@ enum {
+ SCAN_COMPLETED,
+ SCAN_ABORTED,
+ SCAN_HW_CANCELLED,
++ SCAN_BEACON_WAIT,
++ SCAN_BEACON_DONE,
+ };
+
+ /**
+diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
+index 9c6ace858107a..5a6bf46a42483 100644
+--- a/net/mac80211/rc80211_minstrel_ht.c
++++ b/net/mac80211/rc80211_minstrel_ht.c
+@@ -362,6 +362,9 @@ minstrel_ht_get_stats(struct minstrel_priv *mp, struct minstrel_ht_sta *mi,
+
+ group = MINSTREL_CCK_GROUP;
+ for (idx = 0; idx < ARRAY_SIZE(mp->cck_rates); idx++) {
++ if (!(mi->supported[group] & BIT(idx)))
++ continue;
++
+ if (rate->idx != mp->cck_rates[idx])
+ continue;
+
+diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c
+index 5e6b275afc9e6..b698756887eb5 100644
+--- a/net/mac80211/scan.c
++++ b/net/mac80211/scan.c
+@@ -281,6 +281,16 @@ void ieee80211_scan_rx(struct ieee80211_local *local, struct sk_buff *skb)
+ if (likely(!sdata1 && !sdata2))
+ return;
+
++ if (test_and_clear_bit(SCAN_BEACON_WAIT, &local->scanning)) {
++ /*
++ * we were passive scanning because of radar/no-IR, but
++ * the beacon/proberesp rx gives us an opportunity to upgrade
++ * to active scan
++ */
++ set_bit(SCAN_BEACON_DONE, &local->scanning);
++ ieee80211_queue_delayed_work(&local->hw, &local->scan_work, 0);
++ }
++
+ if (ieee80211_is_probe_resp(mgmt->frame_control)) {
+ struct cfg80211_scan_request *scan_req;
+ struct cfg80211_sched_scan_request *sched_scan_req;
+@@ -787,6 +797,8 @@ static int __ieee80211_start_scan(struct ieee80211_sub_if_data *sdata,
+ IEEE80211_CHAN_RADAR)) ||
+ !req->n_ssids) {
+ next_delay = IEEE80211_PASSIVE_CHANNEL_TIME;
++ if (req->n_ssids)
++ set_bit(SCAN_BEACON_WAIT, &local->scanning);
+ } else {
+ ieee80211_scan_state_send_probe(local, &next_delay);
+ next_delay = IEEE80211_CHANNEL_TIME;
+@@ -998,6 +1010,8 @@ set_channel:
+ !scan_req->n_ssids) {
+ *next_delay = IEEE80211_PASSIVE_CHANNEL_TIME;
+ local->next_scan_state = SCAN_DECISION;
++ if (scan_req->n_ssids)
++ set_bit(SCAN_BEACON_WAIT, &local->scanning);
+ return;
+ }
+
+@@ -1090,6 +1104,8 @@ void ieee80211_scan_work(struct work_struct *work)
+ goto out;
+ }
+
++ clear_bit(SCAN_BEACON_WAIT, &local->scanning);
++
+ /*
+ * as long as no delay is required advance immediately
+ * without scheduling a new work
+@@ -1100,6 +1116,10 @@ void ieee80211_scan_work(struct work_struct *work)
+ goto out_complete;
+ }
+
++ if (test_and_clear_bit(SCAN_BEACON_DONE, &local->scanning) &&
++ local->next_scan_state == SCAN_DECISION)
++ local->next_scan_state = SCAN_SEND_PROBE;
++
+ switch (local->next_scan_state) {
+ case SCAN_DECISION:
+ /* if no more bands/channels left, complete scan */
+diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
+index aa51b100e0335..4d6a61acc4870 100644
+--- a/net/mptcp/pm.c
++++ b/net/mptcp/pm.c
+@@ -261,14 +261,25 @@ void mptcp_pm_rm_addr_received(struct mptcp_sock *msk,
+ spin_unlock_bh(&pm->lock);
+ }
+
+-void mptcp_pm_mp_prio_received(struct sock *sk, u8 bkup)
++void mptcp_pm_mp_prio_received(struct sock *ssk, u8 bkup)
+ {
+- struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
++ struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
++ struct sock *sk = subflow->conn;
++ struct mptcp_sock *msk;
+
+ pr_debug("subflow->backup=%d, bkup=%d\n", subflow->backup, bkup);
+- subflow->backup = bkup;
++ msk = mptcp_sk(sk);
++ if (subflow->backup != bkup) {
++ subflow->backup = bkup;
++ mptcp_data_lock(sk);
++ if (!sock_owned_by_user(sk))
++ msk->last_snd = NULL;
++ else
++ __set_bit(MPTCP_RESET_SCHEDULER, &msk->cb_flags);
++ mptcp_data_unlock(sk);
++ }
+
+- mptcp_event(MPTCP_EVENT_SUB_PRIORITY, mptcp_sk(subflow->conn), sk, GFP_ATOMIC);
++ mptcp_event(MPTCP_EVENT_SUB_PRIORITY, msk, ssk, GFP_ATOMIC);
+ }
+
+ void mptcp_pm_mp_fail_received(struct sock *sk, u64 fail_seq)
+diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
+index b5e8de6f75076..e3dcc5501579f 100644
+--- a/net/mptcp/pm_netlink.c
++++ b/net/mptcp/pm_netlink.c
+@@ -727,6 +727,8 @@ static int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk,
+ if (!addresses_equal(&local, addr, addr->port))
+ continue;
+
++ if (subflow->backup != bkup)
++ msk->last_snd = NULL;
+ subflow->backup = bkup;
+ subflow->send_mp_prio = 1;
+ subflow->request_bkup = bkup;
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index 0cbea3b6d0a42..8f54293c1d887 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -3092,15 +3092,19 @@ static void mptcp_release_cb(struct sock *sk)
+ spin_lock_bh(&sk->sk_lock.slock);
+ }
+
+- /* be sure to set the current sk state before tacking actions
+- * depending on sk_state
+- */
+- if (__test_and_clear_bit(MPTCP_CONNECTED, &msk->cb_flags))
+- __mptcp_set_connected(sk);
+ if (__test_and_clear_bit(MPTCP_CLEAN_UNA, &msk->cb_flags))
+ __mptcp_clean_una_wakeup(sk);
+- if (__test_and_clear_bit(MPTCP_ERROR_REPORT, &msk->cb_flags))
+- __mptcp_error_report(sk);
++ if (unlikely(&msk->cb_flags)) {
++ /* be sure to set the current sk state before tacking actions
++ * depending on sk_state, that is processing MPTCP_ERROR_REPORT
++ */
++ if (__test_and_clear_bit(MPTCP_CONNECTED, &msk->cb_flags))
++ __mptcp_set_connected(sk);
++ if (__test_and_clear_bit(MPTCP_ERROR_REPORT, &msk->cb_flags))
++ __mptcp_error_report(sk);
++ if (__test_and_clear_bit(MPTCP_RESET_SCHEDULER, &msk->cb_flags))
++ msk->last_snd = NULL;
++ }
+
+ __mptcp_update_rmem(sk);
+ }
+diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
+index 5655a63aa6a8b..9ac63fa4866ef 100644
+--- a/net/mptcp/protocol.h
++++ b/net/mptcp/protocol.h
+@@ -124,6 +124,7 @@
+ #define MPTCP_RETRANSMIT 4
+ #define MPTCP_FLUSH_JOIN_LIST 5
+ #define MPTCP_CONNECTED 6
++#define MPTCP_RESET_SCHEDULER 7
+
+ static inline bool before64(__u64 seq1, __u64 seq2)
+ {
+diff --git a/net/nfc/core.c b/net/nfc/core.c
+index 5b286e1e0a6ff..6ff3e10ff8e35 100644
+--- a/net/nfc/core.c
++++ b/net/nfc/core.c
+@@ -1166,6 +1166,7 @@ void nfc_unregister_device(struct nfc_dev *dev)
+ if (dev->rfkill) {
+ rfkill_unregister(dev->rfkill);
+ rfkill_destroy(dev->rfkill);
++ dev->rfkill = NULL;
+ }
+ dev->shutting_down = true;
+ device_unlock(&dev->dev);
+diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
+index 969e532f77a90..f2d593e27b64f 100644
+--- a/net/rxrpc/ar-internal.h
++++ b/net/rxrpc/ar-internal.h
+@@ -68,7 +68,7 @@ struct rxrpc_net {
+ struct proc_dir_entry *proc_net; /* Subdir in /proc/net */
+ u32 epoch; /* Local epoch for detecting local-end reset */
+ struct list_head calls; /* List of calls active in this namespace */
+- rwlock_t call_lock; /* Lock for ->calls */
++ spinlock_t call_lock; /* Lock for ->calls */
+ atomic_t nr_calls; /* Count of allocated calls */
+
+ atomic_t nr_conns;
+@@ -676,13 +676,12 @@ struct rxrpc_call {
+
+ spinlock_t input_lock; /* Lock for packet input to this call */
+
+- /* receive-phase ACK management */
++ /* Receive-phase ACK management (ACKs we send). */
+ u8 ackr_reason; /* reason to ACK */
+ rxrpc_serial_t ackr_serial; /* serial of packet being ACK'd */
+- rxrpc_serial_t ackr_first_seq; /* first sequence number received */
+- rxrpc_seq_t ackr_prev_seq; /* previous sequence number received */
+- rxrpc_seq_t ackr_consumed; /* Highest packet shown consumed */
+- rxrpc_seq_t ackr_seen; /* Highest packet shown seen */
++ rxrpc_seq_t ackr_highest_seq; /* Higest sequence number received */
++ atomic_t ackr_nr_unacked; /* Number of unacked packets */
++ atomic_t ackr_nr_consumed; /* Number of packets needing hard ACK */
+
+ /* RTT management */
+ rxrpc_serial_t rtt_serial[4]; /* Serial number of DATA or PING sent */
+@@ -692,8 +691,10 @@ struct rxrpc_call {
+ #define RXRPC_CALL_RTT_AVAIL_MASK 0xf
+ #define RXRPC_CALL_RTT_PEND_SHIFT 8
+
+- /* transmission-phase ACK management */
++ /* Transmission-phase ACK management (ACKs we've received). */
+ ktime_t acks_latest_ts; /* Timestamp of latest ACK received */
++ rxrpc_seq_t acks_first_seq; /* first sequence number received */
++ rxrpc_seq_t acks_prev_seq; /* Highest previousPacket received */
+ rxrpc_seq_t acks_lowest_nak; /* Lowest NACK in the buffer (or ==tx_hard_ack) */
+ rxrpc_seq_t acks_lost_top; /* tx_top at the time lost-ack ping sent */
+ rxrpc_serial_t acks_lost_ping; /* Serial number of probe ACK */
+diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
+index 1ae90fb979362..8b24ffbc72efb 100644
+--- a/net/rxrpc/call_accept.c
++++ b/net/rxrpc/call_accept.c
+@@ -140,9 +140,9 @@ static int rxrpc_service_prealloc_one(struct rxrpc_sock *rx,
+ write_unlock(&rx->call_lock);
+
+ rxnet = call->rxnet;
+- write_lock(&rxnet->call_lock);
+- list_add_tail(&call->link, &rxnet->calls);
+- write_unlock(&rxnet->call_lock);
++ spin_lock_bh(&rxnet->call_lock);
++ list_add_tail_rcu(&call->link, &rxnet->calls);
++ spin_unlock_bh(&rxnet->call_lock);
+
+ b->call_backlog[call_head] = call;
+ smp_store_release(&b->call_backlog_head, (call_head + 1) & (size - 1));
+diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
+index 22e05de5d1ca9..f8ecad2b730e8 100644
+--- a/net/rxrpc/call_event.c
++++ b/net/rxrpc/call_event.c
+@@ -377,9 +377,9 @@ recheck_state:
+ if (test_bit(RXRPC_CALL_RX_HEARD, &call->flags) &&
+ (int)call->conn->hi_serial - (int)call->rx_serial > 0) {
+ trace_rxrpc_call_reset(call);
+- rxrpc_abort_call("EXP", call, 0, RX_USER_ABORT, -ECONNRESET);
++ rxrpc_abort_call("EXP", call, 0, RX_CALL_DEAD, -ECONNRESET);
+ } else {
+- rxrpc_abort_call("EXP", call, 0, RX_USER_ABORT, -ETIME);
++ rxrpc_abort_call("EXP", call, 0, RX_CALL_TIMEOUT, -ETIME);
+ }
+ set_bit(RXRPC_CALL_EV_ABORT, &call->events);
+ goto recheck_state;
+@@ -406,7 +406,8 @@ recheck_state:
+ goto recheck_state;
+ }
+
+- if (test_and_clear_bit(RXRPC_CALL_EV_RESEND, &call->events)) {
++ if (test_and_clear_bit(RXRPC_CALL_EV_RESEND, &call->events) &&
++ call->state != RXRPC_CALL_CLIENT_RECV_REPLY) {
+ rxrpc_resend(call, now);
+ goto recheck_state;
+ }
+diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
+index 043508fd8d8a5..25c9a2cbf048c 100644
+--- a/net/rxrpc/call_object.c
++++ b/net/rxrpc/call_object.c
+@@ -337,9 +337,9 @@ struct rxrpc_call *rxrpc_new_client_call(struct rxrpc_sock *rx,
+ write_unlock(&rx->call_lock);
+
+ rxnet = call->rxnet;
+- write_lock(&rxnet->call_lock);
+- list_add_tail(&call->link, &rxnet->calls);
+- write_unlock(&rxnet->call_lock);
++ spin_lock_bh(&rxnet->call_lock);
++ list_add_tail_rcu(&call->link, &rxnet->calls);
++ spin_unlock_bh(&rxnet->call_lock);
+
+ /* From this point on, the call is protected by its own lock. */
+ release_sock(&rx->sk);
+@@ -631,9 +631,9 @@ void rxrpc_put_call(struct rxrpc_call *call, enum rxrpc_call_trace op)
+ ASSERTCMP(call->state, ==, RXRPC_CALL_COMPLETE);
+
+ if (!list_empty(&call->link)) {
+- write_lock(&rxnet->call_lock);
++ spin_lock_bh(&rxnet->call_lock);
+ list_del_init(&call->link);
+- write_unlock(&rxnet->call_lock);
++ spin_unlock_bh(&rxnet->call_lock);
+ }
+
+ rxrpc_cleanup_call(call);
+@@ -705,7 +705,7 @@ void rxrpc_destroy_all_calls(struct rxrpc_net *rxnet)
+ _enter("");
+
+ if (!list_empty(&rxnet->calls)) {
+- write_lock(&rxnet->call_lock);
++ spin_lock_bh(&rxnet->call_lock);
+
+ while (!list_empty(&rxnet->calls)) {
+ call = list_entry(rxnet->calls.next,
+@@ -720,12 +720,12 @@ void rxrpc_destroy_all_calls(struct rxrpc_net *rxnet)
+ rxrpc_call_states[call->state],
+ call->flags, call->events);
+
+- write_unlock(&rxnet->call_lock);
++ spin_unlock_bh(&rxnet->call_lock);
+ cond_resched();
+- write_lock(&rxnet->call_lock);
++ spin_lock_bh(&rxnet->call_lock);
+ }
+
+- write_unlock(&rxnet->call_lock);
++ spin_unlock_bh(&rxnet->call_lock);
+ }
+
+ atomic_dec(&rxnet->nr_calls);
+diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c
+index b2159dbf5412c..660cd9b1a4658 100644
+--- a/net/rxrpc/conn_object.c
++++ b/net/rxrpc/conn_object.c
+@@ -183,7 +183,7 @@ void __rxrpc_disconnect_call(struct rxrpc_connection *conn,
+ chan->last_type = RXRPC_PACKET_TYPE_ABORT;
+ break;
+ default:
+- chan->last_abort = RX_USER_ABORT;
++ chan->last_abort = RX_CALL_DEAD;
+ chan->last_type = RXRPC_PACKET_TYPE_ABORT;
+ break;
+ }
+diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
+index dc201363f2c48..3521ebd0ee41c 100644
+--- a/net/rxrpc/input.c
++++ b/net/rxrpc/input.c
+@@ -412,8 +412,8 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb)
+ {
+ struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
+ enum rxrpc_call_state state;
+- unsigned int j, nr_subpackets;
+- rxrpc_serial_t serial = sp->hdr.serial, ack_serial = 0;
++ unsigned int j, nr_subpackets, nr_unacked = 0;
++ rxrpc_serial_t serial = sp->hdr.serial, ack_serial = serial;
+ rxrpc_seq_t seq0 = sp->hdr.seq, hard_ack;
+ bool immediate_ack = false, jumbo_bad = false;
+ u8 ack = 0;
+@@ -453,7 +453,6 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb)
+ !rxrpc_receiving_reply(call))
+ goto unlock;
+
+- call->ackr_prev_seq = seq0;
+ hard_ack = READ_ONCE(call->rx_hard_ack);
+
+ nr_subpackets = sp->nr_subpackets;
+@@ -534,6 +533,9 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb)
+ ack_serial = serial;
+ }
+
++ if (after(seq0, call->ackr_highest_seq))
++ call->ackr_highest_seq = seq0;
++
+ /* Queue the packet. We use a couple of memory barriers here as need
+ * to make sure that rx_top is perceived to be set after the buffer
+ * pointer and that the buffer pointer is set after the annotation and
+@@ -567,6 +569,8 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb)
+ sp = NULL;
+ }
+
++ nr_unacked++;
++
+ if (last) {
+ set_bit(RXRPC_CALL_RX_LAST, &call->flags);
+ if (!ack) {
+@@ -586,9 +590,14 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb)
+ }
+ call->rx_expect_next = seq + 1;
+ }
++ if (!ack)
++ ack_serial = serial;
+ }
+
+ ack:
++ if (atomic_add_return(nr_unacked, &call->ackr_nr_unacked) > 2 && !ack)
++ ack = RXRPC_ACK_IDLE;
++
+ if (ack)
+ rxrpc_propose_ACK(call, ack, ack_serial,
+ immediate_ack, true,
+@@ -812,7 +821,7 @@ static void rxrpc_input_soft_acks(struct rxrpc_call *call, u8 *acks,
+ static bool rxrpc_is_ack_valid(struct rxrpc_call *call,
+ rxrpc_seq_t first_pkt, rxrpc_seq_t prev_pkt)
+ {
+- rxrpc_seq_t base = READ_ONCE(call->ackr_first_seq);
++ rxrpc_seq_t base = READ_ONCE(call->acks_first_seq);
+
+ if (after(first_pkt, base))
+ return true; /* The window advanced */
+@@ -820,7 +829,7 @@ static bool rxrpc_is_ack_valid(struct rxrpc_call *call,
+ if (before(first_pkt, base))
+ return false; /* firstPacket regressed */
+
+- if (after_eq(prev_pkt, call->ackr_prev_seq))
++ if (after_eq(prev_pkt, call->acks_prev_seq))
+ return true; /* previousPacket hasn't regressed. */
+
+ /* Some rx implementations put a serial number in previousPacket. */
+@@ -903,11 +912,38 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
+ rxrpc_propose_ack_respond_to_ack);
+ }
+
++ /* If we get an EXCEEDS_WINDOW ACK from the server, it probably
++ * indicates that the client address changed due to NAT. The server
++ * lost the call because it switched to a different peer.
++ */
++ if (unlikely(buf.ack.reason == RXRPC_ACK_EXCEEDS_WINDOW) &&
++ first_soft_ack == 1 &&
++ prev_pkt == 0 &&
++ rxrpc_is_client_call(call)) {
++ rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED,
++ 0, -ENETRESET);
++ return;
++ }
++
++ /* If we get an OUT_OF_SEQUENCE ACK from the server, that can also
++ * indicate a change of address. However, we can retransmit the call
++ * if we still have it buffered to the beginning.
++ */
++ if (unlikely(buf.ack.reason == RXRPC_ACK_OUT_OF_SEQUENCE) &&
++ first_soft_ack == 1 &&
++ prev_pkt == 0 &&
++ call->tx_hard_ack == 0 &&
++ rxrpc_is_client_call(call)) {
++ rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED,
++ 0, -ENETRESET);
++ return;
++ }
++
+ /* Discard any out-of-order or duplicate ACKs (outside lock). */
+ if (!rxrpc_is_ack_valid(call, first_soft_ack, prev_pkt)) {
+ trace_rxrpc_rx_discard_ack(call->debug_id, ack_serial,
+- first_soft_ack, call->ackr_first_seq,
+- prev_pkt, call->ackr_prev_seq);
++ first_soft_ack, call->acks_first_seq,
++ prev_pkt, call->acks_prev_seq);
+ return;
+ }
+
+@@ -922,14 +958,14 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct sk_buff *skb)
+ /* Discard any out-of-order or duplicate ACKs (inside lock). */
+ if (!rxrpc_is_ack_valid(call, first_soft_ack, prev_pkt)) {
+ trace_rxrpc_rx_discard_ack(call->debug_id, ack_serial,
+- first_soft_ack, call->ackr_first_seq,
+- prev_pkt, call->ackr_prev_seq);
++ first_soft_ack, call->acks_first_seq,
++ prev_pkt, call->acks_prev_seq);
+ goto out;
+ }
+ call->acks_latest_ts = skb->tstamp;
+
+- call->ackr_first_seq = first_soft_ack;
+- call->ackr_prev_seq = prev_pkt;
++ call->acks_first_seq = first_soft_ack;
++ call->acks_prev_seq = prev_pkt;
+
+ /* Parse rwind and mtu sizes if provided. */
+ if (buf.info.rxMTU)
+diff --git a/net/rxrpc/net_ns.c b/net/rxrpc/net_ns.c
+index cc7e30733feb0..e4d6d432515bc 100644
+--- a/net/rxrpc/net_ns.c
++++ b/net/rxrpc/net_ns.c
+@@ -50,7 +50,7 @@ static __net_init int rxrpc_init_net(struct net *net)
+ rxnet->epoch |= RXRPC_RANDOM_EPOCH;
+
+ INIT_LIST_HEAD(&rxnet->calls);
+- rwlock_init(&rxnet->call_lock);
++ spin_lock_init(&rxnet->call_lock);
+ atomic_set(&rxnet->nr_calls, 1);
+
+ atomic_set(&rxnet->nr_conns, 1);
+diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
+index a45c83f22236e..9683617db7049 100644
+--- a/net/rxrpc/output.c
++++ b/net/rxrpc/output.c
+@@ -74,11 +74,18 @@ static size_t rxrpc_fill_out_ack(struct rxrpc_connection *conn,
+ u8 reason)
+ {
+ rxrpc_serial_t serial;
++ unsigned int tmp;
+ rxrpc_seq_t hard_ack, top, seq;
+ int ix;
+ u32 mtu, jmax;
+ u8 *ackp = pkt->acks;
+
++ tmp = atomic_xchg(&call->ackr_nr_unacked, 0);
++ tmp |= atomic_xchg(&call->ackr_nr_consumed, 0);
++ if (!tmp && (reason == RXRPC_ACK_DELAY ||
++ reason == RXRPC_ACK_IDLE))
++ return 0;
++
+ /* Barrier against rxrpc_input_data(). */
+ serial = call->ackr_serial;
+ hard_ack = READ_ONCE(call->rx_hard_ack);
+@@ -89,7 +96,7 @@ static size_t rxrpc_fill_out_ack(struct rxrpc_connection *conn,
+ pkt->ack.bufferSpace = htons(8);
+ pkt->ack.maxSkew = htons(0);
+ pkt->ack.firstPacket = htonl(hard_ack + 1);
+- pkt->ack.previousPacket = htonl(call->ackr_prev_seq);
++ pkt->ack.previousPacket = htonl(call->ackr_highest_seq);
+ pkt->ack.serial = htonl(serial);
+ pkt->ack.reason = reason;
+ pkt->ack.nAcks = top - hard_ack;
+@@ -223,6 +230,10 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, bool ping,
+ n = rxrpc_fill_out_ack(conn, call, pkt, &hard_ack, &top, reason);
+
+ spin_unlock_bh(&call->lock);
++ if (n == 0) {
++ kfree(pkt);
++ return 0;
++ }
+
+ iov[0].iov_base = pkt;
+ iov[0].iov_len = sizeof(pkt->whdr) + sizeof(pkt->ack) + n;
+@@ -259,13 +270,6 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, bool ping,
+ ntohl(pkt->ack.serial),
+ false, true,
+ rxrpc_propose_ack_retry_tx);
+- } else {
+- spin_lock_bh(&call->lock);
+- if (after(hard_ack, call->ackr_consumed))
+- call->ackr_consumed = hard_ack;
+- if (after(top, call->ackr_seen))
+- call->ackr_seen = top;
+- spin_unlock_bh(&call->lock);
+ }
+
+ rxrpc_set_keepalive(call);
+diff --git a/net/rxrpc/proc.c b/net/rxrpc/proc.c
+index e2f990754f882..5a67955cc00f6 100644
+--- a/net/rxrpc/proc.c
++++ b/net/rxrpc/proc.c
+@@ -26,29 +26,23 @@ static const char *const rxrpc_conn_states[RXRPC_CONN__NR_STATES] = {
+ */
+ static void *rxrpc_call_seq_start(struct seq_file *seq, loff_t *_pos)
+ __acquires(rcu)
+- __acquires(rxnet->call_lock)
+ {
+ struct rxrpc_net *rxnet = rxrpc_net(seq_file_net(seq));
+
+ rcu_read_lock();
+- read_lock(&rxnet->call_lock);
+- return seq_list_start_head(&rxnet->calls, *_pos);
++ return seq_list_start_head_rcu(&rxnet->calls, *_pos);
+ }
+
+ static void *rxrpc_call_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+ {
+ struct rxrpc_net *rxnet = rxrpc_net(seq_file_net(seq));
+
+- return seq_list_next(v, &rxnet->calls, pos);
++ return seq_list_next_rcu(v, &rxnet->calls, pos);
+ }
+
+ static void rxrpc_call_seq_stop(struct seq_file *seq, void *v)
+- __releases(rxnet->call_lock)
+ __releases(rcu)
+ {
+- struct rxrpc_net *rxnet = rxrpc_net(seq_file_net(seq));
+-
+- read_unlock(&rxnet->call_lock);
+ rcu_read_unlock();
+ }
+
+diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
+index eca6dda26c77e..250f23bc1c076 100644
+--- a/net/rxrpc/recvmsg.c
++++ b/net/rxrpc/recvmsg.c
+@@ -260,11 +260,9 @@ static void rxrpc_rotate_rx_window(struct rxrpc_call *call)
+ rxrpc_end_rx_phase(call, serial);
+ } else {
+ /* Check to see if there's an ACK that needs sending. */
+- if (after_eq(hard_ack, call->ackr_consumed + 2) ||
+- after_eq(top, call->ackr_seen + 2) ||
+- (hard_ack == top && after(hard_ack, call->ackr_consumed)))
+- rxrpc_propose_ACK(call, RXRPC_ACK_DELAY, serial,
+- true, true,
++ if (atomic_inc_return(&call->ackr_nr_consumed) > 2)
++ rxrpc_propose_ACK(call, RXRPC_ACK_IDLE, serial,
++ true, false,
+ rxrpc_propose_ack_rotate_rx);
+ if (call->ackr_reason && call->ackr_reason != RXRPC_ACK_DELAY)
+ rxrpc_send_ack_packet(call, false, NULL);
+diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c
+index af8ad6c30b9fb..1d38e279e2efa 100644
+--- a/net/rxrpc/sendmsg.c
++++ b/net/rxrpc/sendmsg.c
+@@ -444,6 +444,12 @@ static int rxrpc_send_data(struct rxrpc_sock *rx,
+
+ success:
+ ret = copied;
++ if (READ_ONCE(call->state) == RXRPC_CALL_COMPLETE) {
++ read_lock_bh(&call->state_lock);
++ if (call->error < 0)
++ ret = call->error;
++ read_unlock_bh(&call->state_lock);
++ }
+ out:
+ call->tx_pending = skb;
+ _leave(" = %d", ret);
+diff --git a/net/rxrpc/sysctl.c b/net/rxrpc/sysctl.c
+index 540351d6a5f47..555e0910786bc 100644
+--- a/net/rxrpc/sysctl.c
++++ b/net/rxrpc/sysctl.c
+@@ -12,7 +12,7 @@
+
+ static struct ctl_table_header *rxrpc_sysctl_reg_table;
+ static const unsigned int four = 4;
+-static const unsigned int thirtytwo = 32;
++static const unsigned int max_backlog = RXRPC_BACKLOG_MAX - 1;
+ static const unsigned int n_65535 = 65535;
+ static const unsigned int n_max_acks = RXRPC_RXTX_BUFF_SIZE - 1;
+ static const unsigned long one_jiffy = 1;
+@@ -89,7 +89,7 @@ static struct ctl_table rxrpc_sysctl_table[] = {
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = (void *)&four,
+- .extra2 = (void *)&thirtytwo,
++ .extra2 = (void *)&max_backlog,
+ },
+ {
+ .procname = "rx_window_size",
+diff --git a/net/sctp/input.c b/net/sctp/input.c
+index 90e12bafdd489..4f43afa8678f9 100644
+--- a/net/sctp/input.c
++++ b/net/sctp/input.c
+@@ -92,6 +92,7 @@ int sctp_rcv(struct sk_buff *skb)
+ struct sctp_chunk *chunk;
+ union sctp_addr src;
+ union sctp_addr dest;
++ int bound_dev_if;
+ int family;
+ struct sctp_af *af;
+ struct net *net = dev_net(skb->dev);
+@@ -169,7 +170,8 @@ int sctp_rcv(struct sk_buff *skb)
+ * If a frame arrives on an interface and the receiving socket is
+ * bound to another interface, via SO_BINDTODEVICE, treat it as OOTB
+ */
+- if (sk->sk_bound_dev_if && (sk->sk_bound_dev_if != af->skb_iif(skb))) {
++ bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
++ if (bound_dev_if && (bound_dev_if != af->skb_iif(skb))) {
+ if (transport) {
+ sctp_transport_put(transport);
+ asoc = NULL;
+diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
+index fce16b9d6e1a4..45a24d24210f0 100644
+--- a/net/smc/af_smc.c
++++ b/net/smc/af_smc.c
+@@ -1564,9 +1564,9 @@ static int smc_connect(struct socket *sock, struct sockaddr *addr,
+ if (rc && rc != -EINPROGRESS)
+ goto out;
+
+- sock_hold(&smc->sk); /* sock put in passive closing */
+ if (smc->use_fallback)
+ goto out;
++ sock_hold(&smc->sk); /* sock put in passive closing */
+ if (flags & O_NONBLOCK) {
+ if (queue_work(smc_hs_wq, &smc->connect_work))
+ smc->connect_nonblock = 1;
+diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
+index 1a3551b6d18bb..fd8b48b889681 100644
+--- a/net/wireless/nl80211.c
++++ b/net/wireless/nl80211.c
+@@ -3719,6 +3719,7 @@ static int nl80211_send_iface(struct sk_buff *msg, u32 portid, u32 seq, int flag
+ wdev_lock(wdev);
+ switch (wdev->iftype) {
+ case NL80211_IFTYPE_AP:
++ case NL80211_IFTYPE_P2P_GO:
+ if (wdev->ssid_len &&
+ nla_put(msg, NL80211_ATTR_SSID, wdev->ssid_len, wdev->ssid))
+ goto nla_put_failure_locked;
+@@ -16286,8 +16287,7 @@ static const struct genl_small_ops nl80211_small_ops[] = {
+ .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = nl80211_color_change,
+ .flags = GENL_UNS_ADMIN_PERM,
+- .internal_flags = NL80211_FLAG_NEED_NETDEV_UP |
+- NL80211_FLAG_NEED_RTNL,
++ .internal_flags = NL80211_FLAG_NEED_NETDEV_UP,
+ },
+ {
+ .cmd = NL80211_CMD_SET_FILS_AAD,
+diff --git a/net/wireless/reg.c b/net/wireless/reg.c
+index c76cd973f06e4..58e83ce642ad2 100644
+--- a/net/wireless/reg.c
++++ b/net/wireless/reg.c
+@@ -807,6 +807,8 @@ static int __init load_builtin_regdb_keys(void)
+ return 0;
+ }
+
++MODULE_FIRMWARE("regulatory.db.p7s");
++
+ static bool regdb_has_valid_signature(const u8 *data, unsigned int size)
+ {
+ const struct firmware *sig;
+@@ -1078,6 +1080,8 @@ static void regdb_fw_cb(const struct firmware *fw, void *context)
+ release_firmware(fw);
+ }
+
++MODULE_FIRMWARE("regulatory.db");
++
+ static int query_regdb_file(const char *alpha2)
+ {
+ ASSERT_RTNL();
+diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
+index 38638845db9d7..72bb85c18804f 100644
+--- a/samples/bpf/Makefile
++++ b/samples/bpf/Makefile
+@@ -368,16 +368,15 @@ VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
+
+ $(obj)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL)
+ ifeq ($(VMLINUX_H),)
++ifeq ($(VMLINUX_BTF),)
++ $(error Cannot find a vmlinux for VMLINUX_BTF at any of "$(VMLINUX_BTF_PATHS)",\
++ build the kernel or set VMLINUX_BTF or VMLINUX_H variable)
++endif
+ $(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@
+ else
+ $(Q)cp "$(VMLINUX_H)" $@
+ endif
+
+-ifeq ($(VMLINUX_BTF),)
+- $(error Cannot find a vmlinux for VMLINUX_BTF at any of "$(VMLINUX_BTF_PATHS)",\
+- build the kernel or set VMLINUX_BTF variable)
+-endif
+-
+ clean-files += vmlinux.h
+
+ # Get Clang's default includes on this system, as opposed to those seen by
+diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
+index 8859fc1935428..c089e9cdaf328 100644
+--- a/samples/landlock/sandboxer.c
++++ b/samples/landlock/sandboxer.c
+@@ -22,9 +22,9 @@
+ #include <unistd.h>
+
+ #ifndef landlock_create_ruleset
+-static inline int landlock_create_ruleset(
+- const struct landlock_ruleset_attr *const attr,
+- const size_t size, const __u32 flags)
++static inline int
++landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
++ const size_t size, const __u32 flags)
+ {
+ return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+ }
+@@ -32,17 +32,18 @@ static inline int landlock_create_ruleset(
+
+ #ifndef landlock_add_rule
+ static inline int landlock_add_rule(const int ruleset_fd,
+- const enum landlock_rule_type rule_type,
+- const void *const rule_attr, const __u32 flags)
++ const enum landlock_rule_type rule_type,
++ const void *const rule_attr,
++ const __u32 flags)
+ {
+- return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
+- rule_attr, flags);
++ return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type, rule_attr,
++ flags);
+ }
+ #endif
+
+ #ifndef landlock_restrict_self
+ static inline int landlock_restrict_self(const int ruleset_fd,
+- const __u32 flags)
++ const __u32 flags)
+ {
+ return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
+ }
+@@ -70,14 +71,17 @@ static int parse_path(char *env_path, const char ***const path_list)
+ return num_paths;
+ }
+
++/* clang-format off */
++
+ #define ACCESS_FILE ( \
+ LANDLOCK_ACCESS_FS_EXECUTE | \
+ LANDLOCK_ACCESS_FS_WRITE_FILE | \
+ LANDLOCK_ACCESS_FS_READ_FILE)
+
+-static int populate_ruleset(
+- const char *const env_var, const int ruleset_fd,
+- const __u64 allowed_access)
++/* clang-format on */
++
++static int populate_ruleset(const char *const env_var, const int ruleset_fd,
++ const __u64 allowed_access)
+ {
+ int num_paths, i, ret = 1;
+ char *env_path_name;
+@@ -107,12 +111,10 @@ static int populate_ruleset(
+ for (i = 0; i < num_paths; i++) {
+ struct stat statbuf;
+
+- path_beneath.parent_fd = open(path_list[i], O_PATH |
+- O_CLOEXEC);
++ path_beneath.parent_fd = open(path_list[i], O_PATH | O_CLOEXEC);
+ if (path_beneath.parent_fd < 0) {
+ fprintf(stderr, "Failed to open \"%s\": %s\n",
+- path_list[i],
+- strerror(errno));
++ path_list[i], strerror(errno));
+ goto out_free_name;
+ }
+ if (fstat(path_beneath.parent_fd, &statbuf)) {
+@@ -123,9 +125,10 @@ static int populate_ruleset(
+ if (!S_ISDIR(statbuf.st_mode))
+ path_beneath.allowed_access &= ACCESS_FILE;
+ if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0)) {
+- fprintf(stderr, "Failed to update the ruleset with \"%s\": %s\n",
+- path_list[i], strerror(errno));
++ &path_beneath, 0)) {
++ fprintf(stderr,
++ "Failed to update the ruleset with \"%s\": %s\n",
++ path_list[i], strerror(errno));
+ close(path_beneath.parent_fd);
+ goto out_free_name;
+ }
+@@ -139,6 +142,8 @@ out_free_name:
+ return ret;
+ }
+
++/* clang-format off */
++
+ #define ACCESS_FS_ROUGHLY_READ ( \
+ LANDLOCK_ACCESS_FS_EXECUTE | \
+ LANDLOCK_ACCESS_FS_READ_FILE | \
+@@ -156,6 +161,8 @@ out_free_name:
+ LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
+ LANDLOCK_ACCESS_FS_MAKE_SYM)
+
++/* clang-format on */
++
+ int main(const int argc, char *const argv[], char *const *const envp)
+ {
+ const char *cmd_path;
+@@ -163,55 +170,64 @@ int main(const int argc, char *const argv[], char *const *const envp)
+ int ruleset_fd;
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_FS_ROUGHLY_READ |
+- ACCESS_FS_ROUGHLY_WRITE,
++ ACCESS_FS_ROUGHLY_WRITE,
+ };
+
+ if (argc < 2) {
+- fprintf(stderr, "usage: %s=\"...\" %s=\"...\" %s <cmd> [args]...\n\n",
+- ENV_FS_RO_NAME, ENV_FS_RW_NAME, argv[0]);
+- fprintf(stderr, "Launch a command in a restricted environment.\n\n");
++ fprintf(stderr,
++ "usage: %s=\"...\" %s=\"...\" %s <cmd> [args]...\n\n",
++ ENV_FS_RO_NAME, ENV_FS_RW_NAME, argv[0]);
++ fprintf(stderr,
++ "Launch a command in a restricted environment.\n\n");
+ fprintf(stderr, "Environment variables containing paths, "
+ "each separated by a colon:\n");
+- fprintf(stderr, "* %s: list of paths allowed to be used in a read-only way.\n",
+- ENV_FS_RO_NAME);
+- fprintf(stderr, "* %s: list of paths allowed to be used in a read-write way.\n",
+- ENV_FS_RW_NAME);
+- fprintf(stderr, "\nexample:\n"
+- "%s=\"/bin:/lib:/usr:/proc:/etc:/dev/urandom\" "
+- "%s=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" "
+- "%s bash -i\n",
+- ENV_FS_RO_NAME, ENV_FS_RW_NAME, argv[0]);
++ fprintf(stderr,
++ "* %s: list of paths allowed to be used in a read-only way.\n",
++ ENV_FS_RO_NAME);
++ fprintf(stderr,
++ "* %s: list of paths allowed to be used in a read-write way.\n",
++ ENV_FS_RW_NAME);
++ fprintf(stderr,
++ "\nexample:\n"
++ "%s=\"/bin:/lib:/usr:/proc:/etc:/dev/urandom\" "
++ "%s=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" "
++ "%s bash -i\n",
++ ENV_FS_RO_NAME, ENV_FS_RW_NAME, argv[0]);
+ return 1;
+ }
+
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ if (ruleset_fd < 0) {
+ const int err = errno;
+
+ perror("Failed to create a ruleset");
+ switch (err) {
+ case ENOSYS:
+- fprintf(stderr, "Hint: Landlock is not supported by the current kernel. "
+- "To support it, build the kernel with "
+- "CONFIG_SECURITY_LANDLOCK=y and prepend "
+- "\"landlock,\" to the content of CONFIG_LSM.\n");
++ fprintf(stderr,
++ "Hint: Landlock is not supported by the current kernel. "
++ "To support it, build the kernel with "
++ "CONFIG_SECURITY_LANDLOCK=y and prepend "
++ "\"landlock,\" to the content of CONFIG_LSM.\n");
+ break;
+ case EOPNOTSUPP:
+- fprintf(stderr, "Hint: Landlock is currently disabled. "
+- "It can be enabled in the kernel configuration by "
+- "prepending \"landlock,\" to the content of CONFIG_LSM, "
+- "or at boot time by setting the same content to the "
+- "\"lsm\" kernel parameter.\n");
++ fprintf(stderr,
++ "Hint: Landlock is currently disabled. "
++ "It can be enabled in the kernel configuration by "
++ "prepending \"landlock,\" to the content of CONFIG_LSM, "
++ "or at boot time by setting the same content to the "
++ "\"lsm\" kernel parameter.\n");
+ break;
+ }
+ return 1;
+ }
+ if (populate_ruleset(ENV_FS_RO_NAME, ruleset_fd,
+- ACCESS_FS_ROUGHLY_READ)) {
++ ACCESS_FS_ROUGHLY_READ)) {
+ goto err_close_ruleset;
+ }
+ if (populate_ruleset(ENV_FS_RW_NAME, ruleset_fd,
+- ACCESS_FS_ROUGHLY_READ | ACCESS_FS_ROUGHLY_WRITE)) {
++ ACCESS_FS_ROUGHLY_READ |
++ ACCESS_FS_ROUGHLY_WRITE)) {
+ goto err_close_ruleset;
+ }
+ if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+@@ -228,7 +244,7 @@ int main(const int argc, char *const argv[], char *const *const envp)
+ cmd_argv = argv + 1;
+ execvpe(cmd_path, cmd_argv, envp);
+ fprintf(stderr, "Failed to execute \"%s\": %s\n", cmd_path,
+- strerror(errno));
++ strerror(errno));
+ fprintf(stderr, "Hint: access to the binary, the interpreter or "
+ "shared libraries may be denied.\n");
+ return 1;
+diff --git a/scripts/faddr2line b/scripts/faddr2line
+index 6c6439f69a725..0e6268d598835 100755
+--- a/scripts/faddr2line
++++ b/scripts/faddr2line
+@@ -44,17 +44,6 @@
+ set -o errexit
+ set -o nounset
+
+-READELF="${CROSS_COMPILE:-}readelf"
+-ADDR2LINE="${CROSS_COMPILE:-}addr2line"
+-SIZE="${CROSS_COMPILE:-}size"
+-NM="${CROSS_COMPILE:-}nm"
+-
+-command -v awk >/dev/null 2>&1 || die "awk isn't installed"
+-command -v ${READELF} >/dev/null 2>&1 || die "readelf isn't installed"
+-command -v ${ADDR2LINE} >/dev/null 2>&1 || die "addr2line isn't installed"
+-command -v ${SIZE} >/dev/null 2>&1 || die "size isn't installed"
+-command -v ${NM} >/dev/null 2>&1 || die "nm isn't installed"
+-
+ usage() {
+ echo "usage: faddr2line [--list] <object file> <func+offset> <func+offset>..." >&2
+ exit 1
+@@ -69,6 +58,14 @@ die() {
+ exit 1
+ }
+
++READELF="${CROSS_COMPILE:-}readelf"
++ADDR2LINE="${CROSS_COMPILE:-}addr2line"
++AWK="awk"
++
++command -v ${AWK} >/dev/null 2>&1 || die "${AWK} isn't installed"
++command -v ${READELF} >/dev/null 2>&1 || die "${READELF} isn't installed"
++command -v ${ADDR2LINE} >/dev/null 2>&1 || die "${ADDR2LINE} isn't installed"
++
+ # Try to figure out the source directory prefix so we can remove it from the
+ # addr2line output. HACK ALERT: This assumes that start_kernel() is in
+ # init/main.c! This only works for vmlinux. Otherwise it falls back to
+@@ -76,7 +73,7 @@ die() {
+ find_dir_prefix() {
+ local objfile=$1
+
+- local start_kernel_addr=$(${READELF} -sW $objfile | awk '$8 == "start_kernel" {printf "0x%s", $2}')
++ local start_kernel_addr=$(${READELF} --symbols --wide $objfile | ${AWK} '$8 == "start_kernel" {printf "0x%s", $2}')
+ [[ -z $start_kernel_addr ]] && return
+
+ local file_line=$(${ADDR2LINE} -e $objfile $start_kernel_addr)
+@@ -97,86 +94,133 @@ __faddr2line() {
+ local dir_prefix=$3
+ local print_warnings=$4
+
+- local func=${func_addr%+*}
++ local sym_name=${func_addr%+*}
+ local offset=${func_addr#*+}
+ offset=${offset%/*}
+- local size=
+- [[ $func_addr =~ "/" ]] && size=${func_addr#*/}
++ local user_size=
++ [[ $func_addr =~ "/" ]] && user_size=${func_addr#*/}
+
+- if [[ -z $func ]] || [[ -z $offset ]] || [[ $func = $func_addr ]]; then
++ if [[ -z $sym_name ]] || [[ -z $offset ]] || [[ $sym_name = $func_addr ]]; then
+ warn "bad func+offset $func_addr"
+ DONE=1
+ return
+ fi
+
+ # Go through each of the object's symbols which match the func name.
+- # In rare cases there might be duplicates.
+- file_end=$(${SIZE} -Ax $objfile | awk '$1 == ".text" {print $2}')
+- while read symbol; do
+- local fields=($symbol)
+- local sym_base=0x${fields[0]}
+- local sym_type=${fields[1]}
+- local sym_end=${fields[3]}
+-
+- # calculate the size
+- local sym_size=$(($sym_end - $sym_base))
++ # In rare cases there might be duplicates, in which case we print all
++ # matches.
++ while read line; do
++ local fields=($line)
++ local sym_addr=0x${fields[1]}
++ local sym_elf_size=${fields[2]}
++ local sym_sec=${fields[6]}
++
++ # Get the section size:
++ local sec_size=$(${READELF} --section-headers --wide $objfile |
++ sed 's/\[ /\[/' |
++ ${AWK} -v sec=$sym_sec '$1 == "[" sec "]" { print "0x" $6; exit }')
++
++ if [[ -z $sec_size ]]; then
++ warn "bad section size: section: $sym_sec"
++ DONE=1
++ return
++ fi
++
++ # Calculate the symbol size.
++ #
++ # Unfortunately we can't use the ELF size, because kallsyms
++ # also includes the padding bytes in its size calculation. For
++ # kallsyms, the size calculation is the distance between the
++ # symbol and the next symbol in a sorted list.
++ local sym_size
++ local cur_sym_addr
++ local found=0
++ while read line; do
++ local fields=($line)
++ cur_sym_addr=0x${fields[1]}
++ local cur_sym_elf_size=${fields[2]}
++ local cur_sym_name=${fields[7]:-}
++
++ if [[ $cur_sym_addr = $sym_addr ]] &&
++ [[ $cur_sym_elf_size = $sym_elf_size ]] &&
++ [[ $cur_sym_name = $sym_name ]]; then
++ found=1
++ continue
++ fi
++
++ if [[ $found = 1 ]]; then
++ sym_size=$(($cur_sym_addr - $sym_addr))
++ [[ $sym_size -lt $sym_elf_size ]] && continue;
++ found=2
++ break
++ fi
++ done < <(${READELF} --symbols --wide $objfile | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
++
++ if [[ $found = 0 ]]; then
++ warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
++ DONE=1
++ return
++ fi
++
++ # If nothing was found after the symbol, assume it's the last
++ # symbol in the section.
++ [[ $found = 1 ]] && sym_size=$(($sec_size - $sym_addr))
++
+ if [[ -z $sym_size ]] || [[ $sym_size -le 0 ]]; then
+- warn "bad symbol size: base: $sym_base end: $sym_end"
++ warn "bad symbol size: sym_addr: $sym_addr cur_sym_addr: $cur_sym_addr"
+ DONE=1
+ return
+ fi
++
+ sym_size=0x$(printf %x $sym_size)
+
+- # calculate the address
+- local addr=$(($sym_base + $offset))
++ # Calculate the section address from user-supplied offset:
++ local addr=$(($sym_addr + $offset))
+ if [[ -z $addr ]] || [[ $addr = 0 ]]; then
+- warn "bad address: $sym_base + $offset"
++ warn "bad address: $sym_addr + $offset"
+ DONE=1
+ return
+ fi
+ addr=0x$(printf %x $addr)
+
+- # weed out non-function symbols
+- if [[ $sym_type != t ]] && [[ $sym_type != T ]]; then
+- [[ $print_warnings = 1 ]] &&
+- echo "skipping $func address at $addr due to non-function symbol of type '$sym_type'"
+- continue
+- fi
+-
+- # if the user provided a size, make sure it matches the symbol's size
+- if [[ -n $size ]] && [[ $size -ne $sym_size ]]; then
++ # If the user provided a size, make sure it matches the symbol's size:
++ if [[ -n $user_size ]] && [[ $user_size -ne $sym_size ]]; then
+ [[ $print_warnings = 1 ]] &&
+- echo "skipping $func address at $addr due to size mismatch ($size != $sym_size)"
++ echo "skipping $sym_name address at $addr due to size mismatch ($user_size != $sym_size)"
+ continue;
+ fi
+
+- # make sure the provided offset is within the symbol's range
++ # Make sure the provided offset is within the symbol's range:
+ if [[ $offset -gt $sym_size ]]; then
+ [[ $print_warnings = 1 ]] &&
+- echo "skipping $func address at $addr due to size mismatch ($offset > $sym_size)"
++ echo "skipping $sym_name address at $addr due to size mismatch ($offset > $sym_size)"
+ continue
+ fi
+
+- # separate multiple entries with a blank line
++ # In case of duplicates or multiple addresses specified on the
++ # cmdline, separate multiple entries with a blank line:
+ [[ $FIRST = 0 ]] && echo
+ FIRST=0
+
+- # pass real address to addr2line
+- echo "$func+$offset/$sym_size:"
+- local file_lines=$(${ADDR2LINE} -fpie $objfile $addr | sed "s; $dir_prefix\(\./\)*; ;")
+- [[ -z $file_lines ]] && return
++ echo "$sym_name+$offset/$sym_size:"
+
++ # Pass section address to addr2line and strip absolute paths
++ # from the output:
++ local output=$(${ADDR2LINE} -fpie $objfile $addr | sed "s; $dir_prefix\(\./\)*; ;")
++ [[ -z $output ]] && continue
++
++ # Default output (non --list):
+ if [[ $LIST = 0 ]]; then
+- echo "$file_lines" | while read -r line
++ echo "$output" | while read -r line
+ do
+ echo $line
+ done
+ DONE=1;
+- return
++ continue
+ fi
+
+- # show each line with context
+- echo "$file_lines" | while read -r line
++ # For --list, show each line with its corresponding source code:
++ echo "$output" | while read -r line
+ do
+ echo
+ echo $line
+@@ -184,12 +228,12 @@ __faddr2line() {
+ n1=$[$n-5]
+ n2=$[$n+5]
+ f=$(echo $line | sed 's/.*at \(.\+\):.*/\1/g')
+- awk 'NR>=strtonum("'$n1'") && NR<=strtonum("'$n2'") { if (NR=='$n') printf(">%d<", NR); else printf(" %d ", NR); printf("\t%s\n", $0)}' $f
++ ${AWK} 'NR>=strtonum("'$n1'") && NR<=strtonum("'$n2'") { if (NR=='$n') printf(">%d<", NR); else printf(" %d ", NR); printf("\t%s\n", $0)}' $f
+ done
+
+ DONE=1
+
+- done < <(${NM} -n $objfile | awk -v fn=$func -v end=$file_end '$3 == fn { found=1; line=$0; start=$1; next } found == 1 { found=0; print line, "0x"$1 } END {if (found == 1) print line, end; }')
++ done < <(${READELF} --symbols --wide $objfile | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn')
+ }
+
+ [[ $# -lt 2 ]] && usage
+diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
+index f3a9cc201c8c2..7249f16257c72 100644
+--- a/security/integrity/ima/Kconfig
++++ b/security/integrity/ima/Kconfig
+@@ -69,10 +69,9 @@ choice
+ hash, defined as 20 bytes, and a null terminated pathname,
+ limited to 255 characters. The 'ima-ng' measurement list
+ template permits both larger hash digests and longer
+- pathnames.
++ pathnames. The configured default template can be replaced
++ by specifying "ima_template=" on the boot command line.
+
+- config IMA_TEMPLATE
+- bool "ima"
+ config IMA_NG_TEMPLATE
+ bool "ima-ng (default)"
+ config IMA_SIG_TEMPLATE
+@@ -82,7 +81,6 @@ endchoice
+ config IMA_DEFAULT_TEMPLATE
+ string
+ depends on IMA
+- default "ima" if IMA_TEMPLATE
+ default "ima-ng" if IMA_NG_TEMPLATE
+ default "ima-sig" if IMA_SIG_TEMPLATE
+
+@@ -102,19 +100,19 @@ choice
+
+ config IMA_DEFAULT_HASH_SHA256
+ bool "SHA256"
+- depends on CRYPTO_SHA256=y && !IMA_TEMPLATE
++ depends on CRYPTO_SHA256=y
+
+ config IMA_DEFAULT_HASH_SHA512
+ bool "SHA512"
+- depends on CRYPTO_SHA512=y && !IMA_TEMPLATE
++ depends on CRYPTO_SHA512=y
+
+ config IMA_DEFAULT_HASH_WP512
+ bool "WP512"
+- depends on CRYPTO_WP512=y && !IMA_TEMPLATE
++ depends on CRYPTO_WP512=y
+
+ config IMA_DEFAULT_HASH_SM3
+ bool "SM3"
+- depends on CRYPTO_SM3=y && !IMA_TEMPLATE
++ depends on CRYPTO_SM3=y
+ endchoice
+
+ config IMA_DEFAULT_HASH
+diff --git a/security/integrity/platform_certs/keyring_handler.h b/security/integrity/platform_certs/keyring_handler.h
+index 284558f30411e..212d894a8c0c0 100644
+--- a/security/integrity/platform_certs/keyring_handler.h
++++ b/security/integrity/platform_certs/keyring_handler.h
+@@ -35,3 +35,11 @@ efi_element_handler_t get_handler_for_mok(const efi_guid_t *sig_type);
+ efi_element_handler_t get_handler_for_dbx(const efi_guid_t *sig_type);
+
+ #endif
++
++#ifndef UEFI_QUIRK_SKIP_CERT
++#define UEFI_QUIRK_SKIP_CERT(vendor, product) \
++ .matches = { \
++ DMI_MATCH(DMI_BOARD_VENDOR, vendor), \
++ DMI_MATCH(DMI_PRODUCT_NAME, product), \
++ },
++#endif
+diff --git a/security/integrity/platform_certs/load_uefi.c b/security/integrity/platform_certs/load_uefi.c
+index 5f45c3c07dbd4..093894a640dca 100644
+--- a/security/integrity/platform_certs/load_uefi.c
++++ b/security/integrity/platform_certs/load_uefi.c
+@@ -3,6 +3,7 @@
+ #include <linux/kernel.h>
+ #include <linux/sched.h>
+ #include <linux/cred.h>
++#include <linux/dmi.h>
+ #include <linux/err.h>
+ #include <linux/efi.h>
+ #include <linux/slab.h>
+@@ -12,6 +13,31 @@
+ #include "../integrity.h"
+ #include "keyring_handler.h"
+
++/*
++ * On T2 Macs reading the db and dbx efi variables to load UEFI Secure Boot
++ * certificates causes occurrence of a page fault in Apple's firmware and
++ * a crash disabling EFI runtime services. The following quirk skips reading
++ * these variables.
++ */
++static const struct dmi_system_id uefi_skip_cert[] = {
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro15,1") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro15,2") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro15,3") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro15,4") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro16,1") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro16,2") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro16,3") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookPro16,4") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookAir8,1") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookAir8,2") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacBookAir9,1") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacMini8,1") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "MacPro7,1") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "iMac20,1") },
++ { UEFI_QUIRK_SKIP_CERT("Apple Inc.", "iMac20,2") },
++ { }
++};
++
+ /*
+ * Look to see if a UEFI variable called MokIgnoreDB exists and return true if
+ * it does.
+@@ -138,6 +164,13 @@ static int __init load_uefi_certs(void)
+ unsigned long dbsize = 0, dbxsize = 0, mokxsize = 0;
+ efi_status_t status;
+ int rc = 0;
++ const struct dmi_system_id *dmi_id;
++
++ dmi_id = dmi_first_match(uefi_skip_cert);
++ if (dmi_id) {
++ pr_err("Reading UEFI Secure Boot Certs is not supported on T2 Macs.\n");
++ return false;
++ }
+
+ if (!efi_rt_services_supported(EFI_RT_SUPPORTED_GET_VARIABLE))
+ return false;
+diff --git a/security/landlock/cred.c b/security/landlock/cred.c
+index 6725af24c6841..ec6c37f04a191 100644
+--- a/security/landlock/cred.c
++++ b/security/landlock/cred.c
+@@ -15,7 +15,7 @@
+ #include "setup.h"
+
+ static int hook_cred_prepare(struct cred *const new,
+- const struct cred *const old, const gfp_t gfp)
++ const struct cred *const old, const gfp_t gfp)
+ {
+ struct landlock_ruleset *const old_dom = landlock_cred(old)->domain;
+
+@@ -42,5 +42,5 @@ static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
+ __init void landlock_add_cred_hooks(void)
+ {
+ security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+- LANDLOCK_NAME);
++ LANDLOCK_NAME);
+ }
+diff --git a/security/landlock/cred.h b/security/landlock/cred.h
+index 5f99d3decade6..af89ab00e6d10 100644
+--- a/security/landlock/cred.h
++++ b/security/landlock/cred.h
+@@ -20,8 +20,8 @@ struct landlock_cred_security {
+ struct landlock_ruleset *domain;
+ };
+
+-static inline struct landlock_cred_security *landlock_cred(
+- const struct cred *cred)
++static inline struct landlock_cred_security *
++landlock_cred(const struct cred *cred)
+ {
+ return cred->security + landlock_blob_sizes.lbs_cred;
+ }
+@@ -34,8 +34,8 @@ static inline const struct landlock_ruleset *landlock_get_current_domain(void)
+ /*
+ * The call needs to come from an RCU read-side critical section.
+ */
+-static inline const struct landlock_ruleset *landlock_get_task_domain(
+- const struct task_struct *const task)
++static inline const struct landlock_ruleset *
++landlock_get_task_domain(const struct task_struct *const task)
+ {
+ return landlock_cred(__task_cred(task))->domain;
+ }
+diff --git a/security/landlock/fs.c b/security/landlock/fs.c
+index 97b8e421f6171..c5749301b37d6 100644
+--- a/security/landlock/fs.c
++++ b/security/landlock/fs.c
+@@ -141,23 +141,26 @@ retry:
+ }
+
+ /* All access rights that can be tied to files. */
++/* clang-format off */
+ #define ACCESS_FILE ( \
+ LANDLOCK_ACCESS_FS_EXECUTE | \
+ LANDLOCK_ACCESS_FS_WRITE_FILE | \
+ LANDLOCK_ACCESS_FS_READ_FILE)
++/* clang-format on */
+
+ /*
+ * @path: Should have been checked by get_path_from_fd().
+ */
+ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
+- const struct path *const path, u32 access_rights)
++ const struct path *const path,
++ access_mask_t access_rights)
+ {
+ int err;
+ struct landlock_object *object;
+
+ /* Files only get access rights that make sense. */
+- if (!d_is_dir(path->dentry) && (access_rights | ACCESS_FILE) !=
+- ACCESS_FILE)
++ if (!d_is_dir(path->dentry) &&
++ (access_rights | ACCESS_FILE) != ACCESS_FILE)
+ return -EINVAL;
+ if (WARN_ON_ONCE(ruleset->num_layers != 1))
+ return -EINVAL;
+@@ -180,59 +183,93 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
+
+ /* Access-control management */
+
+-static inline u64 unmask_layers(
+- const struct landlock_ruleset *const domain,
+- const struct path *const path, const u32 access_request,
+- u64 layer_mask)
++/*
++ * The lifetime of the returned rule is tied to @domain.
++ *
++ * Returns NULL if no rule is found or if @dentry is negative.
++ */
++static inline const struct landlock_rule *
++find_rule(const struct landlock_ruleset *const domain,
++ const struct dentry *const dentry)
+ {
+ const struct landlock_rule *rule;
+ const struct inode *inode;
+- size_t i;
+
+- if (d_is_negative(path->dentry))
+- /* Ignore nonexistent leafs. */
+- return layer_mask;
+- inode = d_backing_inode(path->dentry);
++ /* Ignores nonexistent leafs. */
++ if (d_is_negative(dentry))
++ return NULL;
++
++ inode = d_backing_inode(dentry);
+ rcu_read_lock();
+- rule = landlock_find_rule(domain,
+- rcu_dereference(landlock_inode(inode)->object));
++ rule = landlock_find_rule(
++ domain, rcu_dereference(landlock_inode(inode)->object));
+ rcu_read_unlock();
++ return rule;
++}
++
++/*
++ * @layer_masks is read and may be updated according to the access request and
++ * the matching rule.
++ *
++ * Returns true if the request is allowed (i.e. relevant layer masks for the
++ * request are empty).
++ */
++static inline bool
++unmask_layers(const struct landlock_rule *const rule,
++ const access_mask_t access_request,
++ layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
++{
++ size_t layer_level;
++
++ if (!access_request || !layer_masks)
++ return true;
+ if (!rule)
+- return layer_mask;
++ return false;
+
+ /*
+ * An access is granted if, for each policy layer, at least one rule
+- * encountered on the pathwalk grants the requested accesses,
+- * regardless of their position in the layer stack. We must then check
++ * encountered on the pathwalk grants the requested access,
++ * regardless of its position in the layer stack. We must then check
+ * the remaining layers for each inode, from the first added layer to
+- * the last one.
++ * the last one. When there is multiple requested accesses, for each
++ * policy layer, the full set of requested accesses may not be granted
++ * by only one rule, but by the union (binary OR) of multiple rules.
++ * E.g. /a/b <execute> + /a <read> => /a/b <execute + read>
+ */
+- for (i = 0; i < rule->num_layers; i++) {
+- const struct landlock_layer *const layer = &rule->layers[i];
+- const u64 layer_level = BIT_ULL(layer->level - 1);
+-
+- /* Checks that the layer grants access to the full request. */
+- if ((layer->access & access_request) == access_request) {
+- layer_mask &= ~layer_level;
++ for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
++ const struct landlock_layer *const layer =
++ &rule->layers[layer_level];
++ const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
++ const unsigned long access_req = access_request;
++ unsigned long access_bit;
++ bool is_empty;
+
+- if (layer_mask == 0)
+- return layer_mask;
++ /*
++ * Records in @layer_masks which layer grants access to each
++ * requested access.
++ */
++ is_empty = true;
++ for_each_set_bit(access_bit, &access_req,
++ ARRAY_SIZE(*layer_masks)) {
++ if (layer->access & BIT_ULL(access_bit))
++ (*layer_masks)[access_bit] &= ~layer_bit;
++ is_empty = is_empty && !(*layer_masks)[access_bit];
+ }
++ if (is_empty)
++ return true;
+ }
+- return layer_mask;
++ return false;
+ }
+
+ static int check_access_path(const struct landlock_ruleset *const domain,
+- const struct path *const path, u32 access_request)
++ const struct path *const path,
++ const access_mask_t access_request)
+ {
+- bool allowed = false;
++ layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
++ bool allowed = false, has_access = false;
+ struct path walker_path;
+- u64 layer_mask;
+ size_t i;
+
+- /* Make sure all layers can be checked. */
+- BUILD_BUG_ON(BITS_PER_TYPE(layer_mask) < LANDLOCK_MAX_NUM_LAYERS);
+-
+ if (!access_request)
+ return 0;
+ if (WARN_ON_ONCE(!domain || !path))
+@@ -243,20 +280,27 @@ static int check_access_path(const struct landlock_ruleset *const domain,
+ * /proc/<pid>/fd/<file-descriptor> .
+ */
+ if ((path->dentry->d_sb->s_flags & SB_NOUSER) ||
+- (d_is_positive(path->dentry) &&
+- unlikely(IS_PRIVATE(d_backing_inode(path->dentry)))))
++ (d_is_positive(path->dentry) &&
++ unlikely(IS_PRIVATE(d_backing_inode(path->dentry)))))
+ return 0;
+ if (WARN_ON_ONCE(domain->num_layers < 1))
+ return -EACCES;
+
+ /* Saves all layers handling a subset of requested accesses. */
+- layer_mask = 0;
+ for (i = 0; i < domain->num_layers; i++) {
+- if (domain->fs_access_masks[i] & access_request)
+- layer_mask |= BIT_ULL(i);
++ const unsigned long access_req = access_request;
++ unsigned long access_bit;
++
++ for_each_set_bit(access_bit, &access_req,
++ ARRAY_SIZE(layer_masks)) {
++ if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
++ layer_masks[access_bit] |= BIT_ULL(i);
++ has_access = true;
++ }
++ }
+ }
+ /* An access request not handled by the domain is allowed. */
+- if (layer_mask == 0)
++ if (!has_access)
+ return 0;
+
+ walker_path = *path;
+@@ -268,13 +312,11 @@ static int check_access_path(const struct landlock_ruleset *const domain,
+ while (true) {
+ struct dentry *parent_dentry;
+
+- layer_mask = unmask_layers(domain, &walker_path,
+- access_request, layer_mask);
+- if (layer_mask == 0) {
++ allowed = unmask_layers(find_rule(domain, walker_path.dentry),
++ access_request, &layer_masks);
++ if (allowed)
+ /* Stops when a rule from each layer grants access. */
+- allowed = true;
+ break;
+- }
+
+ jump_up:
+ if (walker_path.dentry == walker_path.mnt->mnt_root) {
+@@ -308,7 +350,7 @@ jump_up:
+ }
+
+ static inline int current_check_access_path(const struct path *const path,
+- const u32 access_request)
++ const access_mask_t access_request)
+ {
+ const struct landlock_ruleset *const dom =
+ landlock_get_current_domain();
+@@ -436,8 +478,8 @@ static void hook_sb_delete(struct super_block *const sb)
+ if (prev_inode)
+ iput(prev_inode);
+ /* Waits for pending iput() in release_inode(). */
+- wait_var_event(&landlock_superblock(sb)->inode_refs, !atomic_long_read(
+- &landlock_superblock(sb)->inode_refs));
++ wait_var_event(&landlock_superblock(sb)->inode_refs,
++ !atomic_long_read(&landlock_superblock(sb)->inode_refs));
+ }
+
+ /*
+@@ -459,8 +501,8 @@ static void hook_sb_delete(struct super_block *const sb)
+ * a dedicated user space option would be required (e.g. as a ruleset flag).
+ */
+ static int hook_sb_mount(const char *const dev_name,
+- const struct path *const path, const char *const type,
+- const unsigned long flags, void *const data)
++ const struct path *const path, const char *const type,
++ const unsigned long flags, void *const data)
+ {
+ if (!landlock_get_current_domain())
+ return 0;
+@@ -468,7 +510,7 @@ static int hook_sb_mount(const char *const dev_name,
+ }
+
+ static int hook_move_mount(const struct path *const from_path,
+- const struct path *const to_path)
++ const struct path *const to_path)
+ {
+ if (!landlock_get_current_domain())
+ return 0;
+@@ -502,7 +544,7 @@ static int hook_sb_remount(struct super_block *const sb, void *const mnt_opts)
+ * view of the filesystem.
+ */
+ static int hook_sb_pivotroot(const struct path *const old_path,
+- const struct path *const new_path)
++ const struct path *const new_path)
+ {
+ if (!landlock_get_current_domain())
+ return 0;
+@@ -511,7 +553,7 @@ static int hook_sb_pivotroot(const struct path *const old_path,
+
+ /* Path hooks */
+
+-static inline u32 get_mode_access(const umode_t mode)
++static inline access_mask_t get_mode_access(const umode_t mode)
+ {
+ switch (mode & S_IFMT) {
+ case S_IFLNK:
+@@ -545,8 +587,8 @@ static inline u32 get_mode_access(const umode_t mode)
+ * deal with that.
+ */
+ static int hook_path_link(struct dentry *const old_dentry,
+- const struct path *const new_dir,
+- struct dentry *const new_dentry)
++ const struct path *const new_dir,
++ struct dentry *const new_dentry)
+ {
+ const struct landlock_ruleset *const dom =
+ landlock_get_current_domain();
+@@ -559,22 +601,23 @@ static int hook_path_link(struct dentry *const old_dentry,
+ return -EXDEV;
+ if (unlikely(d_is_negative(old_dentry)))
+ return -ENOENT;
+- return check_access_path(dom, new_dir,
+- get_mode_access(d_backing_inode(old_dentry)->i_mode));
++ return check_access_path(
++ dom, new_dir,
++ get_mode_access(d_backing_inode(old_dentry)->i_mode));
+ }
+
+-static inline u32 maybe_remove(const struct dentry *const dentry)
++static inline access_mask_t maybe_remove(const struct dentry *const dentry)
+ {
+ if (d_is_negative(dentry))
+ return 0;
+ return d_is_dir(dentry) ? LANDLOCK_ACCESS_FS_REMOVE_DIR :
+- LANDLOCK_ACCESS_FS_REMOVE_FILE;
++ LANDLOCK_ACCESS_FS_REMOVE_FILE;
+ }
+
+ static int hook_path_rename(const struct path *const old_dir,
+- struct dentry *const old_dentry,
+- const struct path *const new_dir,
+- struct dentry *const new_dentry)
++ struct dentry *const old_dentry,
++ const struct path *const new_dir,
++ struct dentry *const new_dentry)
+ {
+ const struct landlock_ruleset *const dom =
+ landlock_get_current_domain();
+@@ -588,20 +631,21 @@ static int hook_path_rename(const struct path *const old_dir,
+ if (unlikely(d_is_negative(old_dentry)))
+ return -ENOENT;
+ /* RENAME_EXCHANGE is handled because directories are the same. */
+- return check_access_path(dom, old_dir, maybe_remove(old_dentry) |
+- maybe_remove(new_dentry) |
++ return check_access_path(
++ dom, old_dir,
++ maybe_remove(old_dentry) | maybe_remove(new_dentry) |
+ get_mode_access(d_backing_inode(old_dentry)->i_mode));
+ }
+
+ static int hook_path_mkdir(const struct path *const dir,
+- struct dentry *const dentry, const umode_t mode)
++ struct dentry *const dentry, const umode_t mode)
+ {
+ return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_DIR);
+ }
+
+ static int hook_path_mknod(const struct path *const dir,
+- struct dentry *const dentry, const umode_t mode,
+- const unsigned int dev)
++ struct dentry *const dentry, const umode_t mode,
++ const unsigned int dev)
+ {
+ const struct landlock_ruleset *const dom =
+ landlock_get_current_domain();
+@@ -612,28 +656,29 @@ static int hook_path_mknod(const struct path *const dir,
+ }
+
+ static int hook_path_symlink(const struct path *const dir,
+- struct dentry *const dentry, const char *const old_name)
++ struct dentry *const dentry,
++ const char *const old_name)
+ {
+ return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_SYM);
+ }
+
+ static int hook_path_unlink(const struct path *const dir,
+- struct dentry *const dentry)
++ struct dentry *const dentry)
+ {
+ return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_FILE);
+ }
+
+ static int hook_path_rmdir(const struct path *const dir,
+- struct dentry *const dentry)
++ struct dentry *const dentry)
+ {
+ return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_DIR);
+ }
+
+ /* File hooks */
+
+-static inline u32 get_file_access(const struct file *const file)
++static inline access_mask_t get_file_access(const struct file *const file)
+ {
+- u32 access = 0;
++ access_mask_t access = 0;
+
+ if (file->f_mode & FMODE_READ) {
+ /* A directory can only be opened in read mode. */
+@@ -688,5 +733,5 @@ static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
+ __init void landlock_add_fs_hooks(void)
+ {
+ security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+- LANDLOCK_NAME);
++ LANDLOCK_NAME);
+ }
+diff --git a/security/landlock/fs.h b/security/landlock/fs.h
+index 187284b421c9d..8db7acf9109b6 100644
+--- a/security/landlock/fs.h
++++ b/security/landlock/fs.h
+@@ -50,14 +50,14 @@ struct landlock_superblock_security {
+ atomic_long_t inode_refs;
+ };
+
+-static inline struct landlock_inode_security *landlock_inode(
+- const struct inode *const inode)
++static inline struct landlock_inode_security *
++landlock_inode(const struct inode *const inode)
+ {
+ return inode->i_security + landlock_blob_sizes.lbs_inode;
+ }
+
+-static inline struct landlock_superblock_security *landlock_superblock(
+- const struct super_block *const superblock)
++static inline struct landlock_superblock_security *
++landlock_superblock(const struct super_block *const superblock)
+ {
+ return superblock->s_security + landlock_blob_sizes.lbs_superblock;
+ }
+@@ -65,6 +65,7 @@ static inline struct landlock_superblock_security *landlock_superblock(
+ __init void landlock_add_fs_hooks(void);
+
+ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
+- const struct path *const path, u32 access_hierarchy);
++ const struct path *const path,
++ access_mask_t access_hierarchy);
+
+ #endif /* _SECURITY_LANDLOCK_FS_H */
+diff --git a/security/landlock/limits.h b/security/landlock/limits.h
+index 2a0a1095ee27e..17c2a2e7fe1ef 100644
+--- a/security/landlock/limits.h
++++ b/security/landlock/limits.h
+@@ -9,13 +9,19 @@
+ #ifndef _SECURITY_LANDLOCK_LIMITS_H
+ #define _SECURITY_LANDLOCK_LIMITS_H
+
++#include <linux/bitops.h>
+ #include <linux/limits.h>
+ #include <uapi/linux/landlock.h>
+
+-#define LANDLOCK_MAX_NUM_LAYERS 64
++/* clang-format off */
++
++#define LANDLOCK_MAX_NUM_LAYERS 16
+ #define LANDLOCK_MAX_NUM_RULES U32_MAX
+
+ #define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
+ #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
++#define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS)
++
++/* clang-format on */
+
+ #endif /* _SECURITY_LANDLOCK_LIMITS_H */
+diff --git a/security/landlock/object.c b/security/landlock/object.c
+index d674fdf9ff04f..1f50612f01850 100644
+--- a/security/landlock/object.c
++++ b/security/landlock/object.c
+@@ -17,9 +17,9 @@
+
+ #include "object.h"
+
+-struct landlock_object *landlock_create_object(
+- const struct landlock_object_underops *const underops,
+- void *const underobj)
++struct landlock_object *
++landlock_create_object(const struct landlock_object_underops *const underops,
++ void *const underobj)
+ {
+ struct landlock_object *new_object;
+
+diff --git a/security/landlock/object.h b/security/landlock/object.h
+index 3f80674c6c8d3..5f28c35e8aa8c 100644
+--- a/security/landlock/object.h
++++ b/security/landlock/object.h
+@@ -76,9 +76,9 @@ struct landlock_object {
+ };
+ };
+
+-struct landlock_object *landlock_create_object(
+- const struct landlock_object_underops *const underops,
+- void *const underobj);
++struct landlock_object *
++landlock_create_object(const struct landlock_object_underops *const underops,
++ void *const underobj);
+
+ void landlock_put_object(struct landlock_object *const object);
+
+diff --git a/security/landlock/ptrace.c b/security/landlock/ptrace.c
+index f55b82446de21..4c5b9cd712861 100644
+--- a/security/landlock/ptrace.c
++++ b/security/landlock/ptrace.c
+@@ -30,7 +30,7 @@
+ * means a subset of) the @child domain.
+ */
+ static bool domain_scope_le(const struct landlock_ruleset *const parent,
+- const struct landlock_ruleset *const child)
++ const struct landlock_ruleset *const child)
+ {
+ const struct landlock_hierarchy *walker;
+
+@@ -48,7 +48,7 @@ static bool domain_scope_le(const struct landlock_ruleset *const parent,
+ }
+
+ static bool task_is_scoped(const struct task_struct *const parent,
+- const struct task_struct *const child)
++ const struct task_struct *const child)
+ {
+ bool is_scoped;
+ const struct landlock_ruleset *dom_parent, *dom_child;
+@@ -62,7 +62,7 @@ static bool task_is_scoped(const struct task_struct *const parent,
+ }
+
+ static int task_ptrace(const struct task_struct *const parent,
+- const struct task_struct *const child)
++ const struct task_struct *const child)
+ {
+ /* Quick return for non-landlocked tasks. */
+ if (!landlocked(parent))
+@@ -86,7 +86,7 @@ static int task_ptrace(const struct task_struct *const parent,
+ * granted, -errno if denied.
+ */
+ static int hook_ptrace_access_check(struct task_struct *const child,
+- const unsigned int mode)
++ const unsigned int mode)
+ {
+ return task_ptrace(current, child);
+ }
+@@ -116,5 +116,5 @@ static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
+ __init void landlock_add_ptrace_hooks(void)
+ {
+ security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+- LANDLOCK_NAME);
++ LANDLOCK_NAME);
+ }
+diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
+index ec72b9262bf38..996484f98bfde 100644
+--- a/security/landlock/ruleset.c
++++ b/security/landlock/ruleset.c
+@@ -28,8 +28,9 @@ static struct landlock_ruleset *create_ruleset(const u32 num_layers)
+ {
+ struct landlock_ruleset *new_ruleset;
+
+- new_ruleset = kzalloc(struct_size(new_ruleset, fs_access_masks,
+- num_layers), GFP_KERNEL_ACCOUNT);
++ new_ruleset =
++ kzalloc(struct_size(new_ruleset, fs_access_masks, num_layers),
++ GFP_KERNEL_ACCOUNT);
+ if (!new_ruleset)
+ return ERR_PTR(-ENOMEM);
+ refcount_set(&new_ruleset->usage, 1);
+@@ -44,7 +45,8 @@ static struct landlock_ruleset *create_ruleset(const u32 num_layers)
+ return new_ruleset;
+ }
+
+-struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask)
++struct landlock_ruleset *
++landlock_create_ruleset(const access_mask_t fs_access_mask)
+ {
+ struct landlock_ruleset *new_ruleset;
+
+@@ -66,11 +68,10 @@ static void build_check_rule(void)
+ BUILD_BUG_ON(rule.num_layers < LANDLOCK_MAX_NUM_LAYERS);
+ }
+
+-static struct landlock_rule *create_rule(
+- struct landlock_object *const object,
+- const struct landlock_layer (*const layers)[],
+- const u32 num_layers,
+- const struct landlock_layer *const new_layer)
++static struct landlock_rule *
++create_rule(struct landlock_object *const object,
++ const struct landlock_layer (*const layers)[], const u32 num_layers,
++ const struct landlock_layer *const new_layer)
+ {
+ struct landlock_rule *new_rule;
+ u32 new_num_layers;
+@@ -85,7 +86,7 @@ static struct landlock_rule *create_rule(
+ new_num_layers = num_layers;
+ }
+ new_rule = kzalloc(struct_size(new_rule, layers, new_num_layers),
+- GFP_KERNEL_ACCOUNT);
++ GFP_KERNEL_ACCOUNT);
+ if (!new_rule)
+ return ERR_PTR(-ENOMEM);
+ RB_CLEAR_NODE(&new_rule->node);
+@@ -94,7 +95,7 @@ static struct landlock_rule *create_rule(
+ new_rule->num_layers = new_num_layers;
+ /* Copies the original layer stack. */
+ memcpy(new_rule->layers, layers,
+- flex_array_size(new_rule, layers, num_layers));
++ flex_array_size(new_rule, layers, num_layers));
+ if (new_layer)
+ /* Adds a copy of @new_layer on the layer stack. */
+ new_rule->layers[new_rule->num_layers - 1] = *new_layer;
+@@ -142,9 +143,9 @@ static void build_check_ruleset(void)
+ * access rights.
+ */
+ static int insert_rule(struct landlock_ruleset *const ruleset,
+- struct landlock_object *const object,
+- const struct landlock_layer (*const layers)[],
+- size_t num_layers)
++ struct landlock_object *const object,
++ const struct landlock_layer (*const layers)[],
++ size_t num_layers)
+ {
+ struct rb_node **walker_node;
+ struct rb_node *parent_node = NULL;
+@@ -156,8 +157,8 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
+ return -ENOENT;
+ walker_node = &(ruleset->root.rb_node);
+ while (*walker_node) {
+- struct landlock_rule *const this = rb_entry(*walker_node,
+- struct landlock_rule, node);
++ struct landlock_rule *const this =
++ rb_entry(*walker_node, struct landlock_rule, node);
+
+ if (this->object != object) {
+ parent_node = *walker_node;
+@@ -194,7 +195,7 @@ static int insert_rule(struct landlock_ruleset *const ruleset,
+ * ruleset and a domain.
+ */
+ new_rule = create_rule(object, &this->layers, this->num_layers,
+- &(*layers)[0]);
++ &(*layers)[0]);
+ if (IS_ERR(new_rule))
+ return PTR_ERR(new_rule);
+ rb_replace_node(&this->node, &new_rule->node, &ruleset->root);
+@@ -228,13 +229,14 @@ static void build_check_layer(void)
+
+ /* @ruleset must be locked by the caller. */
+ int landlock_insert_rule(struct landlock_ruleset *const ruleset,
+- struct landlock_object *const object, const u32 access)
++ struct landlock_object *const object,
++ const access_mask_t access)
+ {
+- struct landlock_layer layers[] = {{
++ struct landlock_layer layers[] = { {
+ .access = access,
+ /* When @level is zero, insert_rule() extends @ruleset. */
+ .level = 0,
+- }};
++ } };
+
+ build_check_layer();
+ return insert_rule(ruleset, object, &layers, ARRAY_SIZE(layers));
+@@ -257,7 +259,7 @@ static void put_hierarchy(struct landlock_hierarchy *hierarchy)
+ }
+
+ static int merge_ruleset(struct landlock_ruleset *const dst,
+- struct landlock_ruleset *const src)
++ struct landlock_ruleset *const src)
+ {
+ struct landlock_rule *walker_rule, *next_rule;
+ int err = 0;
+@@ -282,11 +284,11 @@ static int merge_ruleset(struct landlock_ruleset *const dst,
+ dst->fs_access_masks[dst->num_layers - 1] = src->fs_access_masks[0];
+
+ /* Merges the @src tree. */
+- rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
+- &src->root, node) {
+- struct landlock_layer layers[] = {{
++ rbtree_postorder_for_each_entry_safe(walker_rule, next_rule, &src->root,
++ node) {
++ struct landlock_layer layers[] = { {
+ .level = dst->num_layers,
+- }};
++ } };
+
+ if (WARN_ON_ONCE(walker_rule->num_layers != 1)) {
+ err = -EINVAL;
+@@ -298,7 +300,7 @@ static int merge_ruleset(struct landlock_ruleset *const dst,
+ }
+ layers[0].access = walker_rule->layers[0].access;
+ err = insert_rule(dst, walker_rule->object, &layers,
+- ARRAY_SIZE(layers));
++ ARRAY_SIZE(layers));
+ if (err)
+ goto out_unlock;
+ }
+@@ -310,7 +312,7 @@ out_unlock:
+ }
+
+ static int inherit_ruleset(struct landlock_ruleset *const parent,
+- struct landlock_ruleset *const child)
++ struct landlock_ruleset *const child)
+ {
+ struct landlock_rule *walker_rule, *next_rule;
+ int err = 0;
+@@ -325,9 +327,10 @@ static int inherit_ruleset(struct landlock_ruleset *const parent,
+
+ /* Copies the @parent tree. */
+ rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
+- &parent->root, node) {
++ &parent->root, node) {
+ err = insert_rule(child, walker_rule->object,
+- &walker_rule->layers, walker_rule->num_layers);
++ &walker_rule->layers,
++ walker_rule->num_layers);
+ if (err)
+ goto out_unlock;
+ }
+@@ -338,7 +341,7 @@ static int inherit_ruleset(struct landlock_ruleset *const parent,
+ }
+ /* Copies the parent layer stack and leaves a space for the new layer. */
+ memcpy(child->fs_access_masks, parent->fs_access_masks,
+- flex_array_size(parent, fs_access_masks, parent->num_layers));
++ flex_array_size(parent, fs_access_masks, parent->num_layers));
+
+ if (WARN_ON_ONCE(!parent->hierarchy)) {
+ err = -EINVAL;
+@@ -358,8 +361,7 @@ static void free_ruleset(struct landlock_ruleset *const ruleset)
+ struct landlock_rule *freeme, *next;
+
+ might_sleep();
+- rbtree_postorder_for_each_entry_safe(freeme, next, &ruleset->root,
+- node)
++ rbtree_postorder_for_each_entry_safe(freeme, next, &ruleset->root, node)
+ free_rule(freeme);
+ put_hierarchy(ruleset->hierarchy);
+ kfree(ruleset);
+@@ -397,9 +399,9 @@ void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset)
+ * Returns the intersection of @parent and @ruleset, or returns @parent if
+ * @ruleset is empty, or returns a duplicate of @ruleset if @parent is empty.
+ */
+-struct landlock_ruleset *landlock_merge_ruleset(
+- struct landlock_ruleset *const parent,
+- struct landlock_ruleset *const ruleset)
++struct landlock_ruleset *
++landlock_merge_ruleset(struct landlock_ruleset *const parent,
++ struct landlock_ruleset *const ruleset)
+ {
+ struct landlock_ruleset *new_dom;
+ u32 num_layers;
+@@ -421,8 +423,8 @@ struct landlock_ruleset *landlock_merge_ruleset(
+ new_dom = create_ruleset(num_layers);
+ if (IS_ERR(new_dom))
+ return new_dom;
+- new_dom->hierarchy = kzalloc(sizeof(*new_dom->hierarchy),
+- GFP_KERNEL_ACCOUNT);
++ new_dom->hierarchy =
++ kzalloc(sizeof(*new_dom->hierarchy), GFP_KERNEL_ACCOUNT);
+ if (!new_dom->hierarchy) {
+ err = -ENOMEM;
+ goto out_put_dom;
+@@ -449,9 +451,9 @@ out_put_dom:
+ /*
+ * The returned access has the same lifetime as @ruleset.
+ */
+-const struct landlock_rule *landlock_find_rule(
+- const struct landlock_ruleset *const ruleset,
+- const struct landlock_object *const object)
++const struct landlock_rule *
++landlock_find_rule(const struct landlock_ruleset *const ruleset,
++ const struct landlock_object *const object)
+ {
+ const struct rb_node *node;
+
+@@ -459,8 +461,8 @@ const struct landlock_rule *landlock_find_rule(
+ return NULL;
+ node = ruleset->root.rb_node;
+ while (node) {
+- struct landlock_rule *this = rb_entry(node,
+- struct landlock_rule, node);
++ struct landlock_rule *this =
++ rb_entry(node, struct landlock_rule, node);
+
+ if (this->object == object)
+ return this;
+diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
+index 2d3ed7ec5a0ab..d43231b783e4f 100644
+--- a/security/landlock/ruleset.h
++++ b/security/landlock/ruleset.h
+@@ -9,13 +9,26 @@
+ #ifndef _SECURITY_LANDLOCK_RULESET_H
+ #define _SECURITY_LANDLOCK_RULESET_H
+
++#include <linux/bitops.h>
++#include <linux/build_bug.h>
+ #include <linux/mutex.h>
+ #include <linux/rbtree.h>
+ #include <linux/refcount.h>
+ #include <linux/workqueue.h>
+
++#include "limits.h"
+ #include "object.h"
+
++typedef u16 access_mask_t;
++/* Makes sure all filesystem access rights can be stored. */
++static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);
++/* Makes sure for_each_set_bit() and for_each_clear_bit() calls are OK. */
++static_assert(sizeof(unsigned long) >= sizeof(access_mask_t));
++
++typedef u16 layer_mask_t;
++/* Makes sure all layers can be checked. */
++static_assert(BITS_PER_TYPE(layer_mask_t) >= LANDLOCK_MAX_NUM_LAYERS);
++
+ /**
+ * struct landlock_layer - Access rights for a given layer
+ */
+@@ -28,7 +41,7 @@ struct landlock_layer {
+ * @access: Bitfield of allowed actions on the kernel object. They are
+ * relative to the object type (e.g. %LANDLOCK_ACTION_FS_READ).
+ */
+- u16 access;
++ access_mask_t access;
+ };
+
+ /**
+@@ -135,26 +148,28 @@ struct landlock_ruleset {
+ * layers are set once and never changed for the
+ * lifetime of the ruleset.
+ */
+- u16 fs_access_masks[];
++ access_mask_t fs_access_masks[];
+ };
+ };
+ };
+
+-struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask);
++struct landlock_ruleset *
++landlock_create_ruleset(const access_mask_t fs_access_mask);
+
+ void landlock_put_ruleset(struct landlock_ruleset *const ruleset);
+ void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset);
+
+ int landlock_insert_rule(struct landlock_ruleset *const ruleset,
+- struct landlock_object *const object, const u32 access);
++ struct landlock_object *const object,
++ const access_mask_t access);
+
+-struct landlock_ruleset *landlock_merge_ruleset(
+- struct landlock_ruleset *const parent,
+- struct landlock_ruleset *const ruleset);
++struct landlock_ruleset *
++landlock_merge_ruleset(struct landlock_ruleset *const parent,
++ struct landlock_ruleset *const ruleset);
+
+-const struct landlock_rule *landlock_find_rule(
+- const struct landlock_ruleset *const ruleset,
+- const struct landlock_object *const object);
++const struct landlock_rule *
++landlock_find_rule(const struct landlock_ruleset *const ruleset,
++ const struct landlock_object *const object);
+
+ static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
+ {
+diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
+index 7e27ce394020d..507d43827afed 100644
+--- a/security/landlock/syscalls.c
++++ b/security/landlock/syscalls.c
+@@ -43,9 +43,10 @@
+ * @src: User space pointer or NULL.
+ * @usize: (Alleged) size of the data pointed to by @src.
+ */
+-static __always_inline int copy_min_struct_from_user(void *const dst,
+- const size_t ksize, const size_t ksize_min,
+- const void __user *const src, const size_t usize)
++static __always_inline int
++copy_min_struct_from_user(void *const dst, const size_t ksize,
++ const size_t ksize_min, const void __user *const src,
++ const size_t usize)
+ {
+ /* Checks buffer inconsistencies. */
+ BUILD_BUG_ON(!dst);
+@@ -93,7 +94,7 @@ static void build_check_abi(void)
+ /* Ruleset handling */
+
+ static int fop_ruleset_release(struct inode *const inode,
+- struct file *const filp)
++ struct file *const filp)
+ {
+ struct landlock_ruleset *ruleset = filp->private_data;
+
+@@ -102,15 +103,15 @@ static int fop_ruleset_release(struct inode *const inode,
+ }
+
+ static ssize_t fop_dummy_read(struct file *const filp, char __user *const buf,
+- const size_t size, loff_t *const ppos)
++ const size_t size, loff_t *const ppos)
+ {
+ /* Dummy handler to enable FMODE_CAN_READ. */
+ return -EINVAL;
+ }
+
+ static ssize_t fop_dummy_write(struct file *const filp,
+- const char __user *const buf, const size_t size,
+- loff_t *const ppos)
++ const char __user *const buf, const size_t size,
++ loff_t *const ppos)
+ {
+ /* Dummy handler to enable FMODE_CAN_WRITE. */
+ return -EINVAL;
+@@ -128,7 +129,7 @@ static const struct file_operations ruleset_fops = {
+ .write = fop_dummy_write,
+ };
+
+-#define LANDLOCK_ABI_VERSION 1
++#define LANDLOCK_ABI_VERSION 1
+
+ /**
+ * sys_landlock_create_ruleset - Create a new ruleset
+@@ -168,22 +169,23 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
+ return -EOPNOTSUPP;
+
+ if (flags) {
+- if ((flags == LANDLOCK_CREATE_RULESET_VERSION)
+- && !attr && !size)
++ if ((flags == LANDLOCK_CREATE_RULESET_VERSION) && !attr &&
++ !size)
+ return LANDLOCK_ABI_VERSION;
+ return -EINVAL;
+ }
+
+ /* Copies raw user space buffer. */
+ err = copy_min_struct_from_user(&ruleset_attr, sizeof(ruleset_attr),
+- offsetofend(typeof(ruleset_attr), handled_access_fs),
+- attr, size);
++ offsetofend(typeof(ruleset_attr),
++ handled_access_fs),
++ attr, size);
+ if (err)
+ return err;
+
+ /* Checks content (and 32-bits cast). */
+ if ((ruleset_attr.handled_access_fs | LANDLOCK_MASK_ACCESS_FS) !=
+- LANDLOCK_MASK_ACCESS_FS)
++ LANDLOCK_MASK_ACCESS_FS)
+ return -EINVAL;
+
+ /* Checks arguments and transforms to kernel struct. */
+@@ -193,7 +195,7 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
+
+ /* Creates anonymous FD referring to the ruleset. */
+ ruleset_fd = anon_inode_getfd("[landlock-ruleset]", &ruleset_fops,
+- ruleset, O_RDWR | O_CLOEXEC);
++ ruleset, O_RDWR | O_CLOEXEC);
+ if (ruleset_fd < 0)
+ landlock_put_ruleset(ruleset);
+ return ruleset_fd;
+@@ -204,7 +206,7 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
+ * landlock_put_ruleset() on the return value.
+ */
+ static struct landlock_ruleset *get_ruleset_from_fd(const int fd,
+- const fmode_t mode)
++ const fmode_t mode)
+ {
+ struct fd ruleset_f;
+ struct landlock_ruleset *ruleset;
+@@ -244,8 +246,8 @@ static int get_path_from_fd(const s32 fd, struct path *const path)
+ struct fd f;
+ int err = 0;
+
+- BUILD_BUG_ON(!__same_type(fd,
+- ((struct landlock_path_beneath_attr *)NULL)->parent_fd));
++ BUILD_BUG_ON(!__same_type(
++ fd, ((struct landlock_path_beneath_attr *)NULL)->parent_fd));
+
+ /* Handles O_PATH. */
+ f = fdget_raw(fd);
+@@ -257,10 +259,10 @@ static int get_path_from_fd(const s32 fd, struct path *const path)
+ * pipefs).
+ */
+ if ((f.file->f_op == &ruleset_fops) ||
+- (f.file->f_path.mnt->mnt_flags & MNT_INTERNAL) ||
+- (f.file->f_path.dentry->d_sb->s_flags & SB_NOUSER) ||
+- d_is_negative(f.file->f_path.dentry) ||
+- IS_PRIVATE(d_backing_inode(f.file->f_path.dentry))) {
++ (f.file->f_path.mnt->mnt_flags & MNT_INTERNAL) ||
++ (f.file->f_path.dentry->d_sb->s_flags & SB_NOUSER) ||
++ d_is_negative(f.file->f_path.dentry) ||
++ IS_PRIVATE(d_backing_inode(f.file->f_path.dentry))) {
+ err = -EBADFD;
+ goto out_fdput;
+ }
+@@ -290,19 +292,18 @@ out_fdput:
+ *
+ * - EOPNOTSUPP: Landlock is supported by the kernel but disabled at boot time;
+ * - EINVAL: @flags is not 0, or inconsistent access in the rule (i.e.
+- * &landlock_path_beneath_attr.allowed_access is not a subset of the rule's
+- * accesses);
++ * &landlock_path_beneath_attr.allowed_access is not a subset of the
++ * ruleset handled accesses);
+ * - ENOMSG: Empty accesses (e.g. &landlock_path_beneath_attr.allowed_access);
+ * - EBADF: @ruleset_fd is not a file descriptor for the current thread, or a
+ * member of @rule_attr is not a file descriptor as expected;
+ * - EBADFD: @ruleset_fd is not a ruleset file descriptor, or a member of
+- * @rule_attr is not the expected file descriptor type (e.g. file open
+- * without O_PATH);
++ * @rule_attr is not the expected file descriptor type;
+ * - EPERM: @ruleset_fd has no write access to the underlying ruleset;
+ * - EFAULT: @rule_attr inconsistency.
+ */
+-SYSCALL_DEFINE4(landlock_add_rule,
+- const int, ruleset_fd, const enum landlock_rule_type, rule_type,
++SYSCALL_DEFINE4(landlock_add_rule, const int, ruleset_fd,
++ const enum landlock_rule_type, rule_type,
+ const void __user *const, rule_attr, const __u32, flags)
+ {
+ struct landlock_path_beneath_attr path_beneath_attr;
+@@ -317,20 +318,24 @@ SYSCALL_DEFINE4(landlock_add_rule,
+ if (flags)
+ return -EINVAL;
+
+- if (rule_type != LANDLOCK_RULE_PATH_BENEATH)
+- return -EINVAL;
+-
+- /* Copies raw user space buffer, only one type for now. */
+- res = copy_from_user(&path_beneath_attr, rule_attr,
+- sizeof(path_beneath_attr));
+- if (res)
+- return -EFAULT;
+-
+ /* Gets and checks the ruleset. */
+ ruleset = get_ruleset_from_fd(ruleset_fd, FMODE_CAN_WRITE);
+ if (IS_ERR(ruleset))
+ return PTR_ERR(ruleset);
+
++ if (rule_type != LANDLOCK_RULE_PATH_BENEATH) {
++ err = -EINVAL;
++ goto out_put_ruleset;
++ }
++
++ /* Copies raw user space buffer, only one type for now. */
++ res = copy_from_user(&path_beneath_attr, rule_attr,
++ sizeof(path_beneath_attr));
++ if (res) {
++ err = -EFAULT;
++ goto out_put_ruleset;
++ }
++
+ /*
+ * Informs about useless rule: empty allowed_access (i.e. deny rules)
+ * are ignored in path walks.
+@@ -344,7 +349,7 @@ SYSCALL_DEFINE4(landlock_add_rule,
+ * (ruleset->fs_access_masks[0] is automatically upgraded to 64-bits).
+ */
+ if ((path_beneath_attr.allowed_access | ruleset->fs_access_masks[0]) !=
+- ruleset->fs_access_masks[0]) {
++ ruleset->fs_access_masks[0]) {
+ err = -EINVAL;
+ goto out_put_ruleset;
+ }
+@@ -356,7 +361,7 @@ SYSCALL_DEFINE4(landlock_add_rule,
+
+ /* Imports the new rule. */
+ err = landlock_append_fs_rule(ruleset, &path,
+- path_beneath_attr.allowed_access);
++ path_beneath_attr.allowed_access);
+ path_put(&path);
+
+ out_put_ruleset:
+@@ -389,8 +394,8 @@ out_put_ruleset:
+ * - E2BIG: The maximum number of stacked rulesets is reached for the current
+ * thread.
+ */
+-SYSCALL_DEFINE2(landlock_restrict_self,
+- const int, ruleset_fd, const __u32, flags)
++SYSCALL_DEFINE2(landlock_restrict_self, const int, ruleset_fd, const __u32,
++ flags)
+ {
+ struct landlock_ruleset *new_dom, *ruleset;
+ struct cred *new_cred;
+@@ -400,18 +405,18 @@ SYSCALL_DEFINE2(landlock_restrict_self,
+ if (!landlock_initialized)
+ return -EOPNOTSUPP;
+
+- /* No flag for now. */
+- if (flags)
+- return -EINVAL;
+-
+ /*
+ * Similar checks as for seccomp(2), except that an -EPERM may be
+ * returned.
+ */
+ if (!task_no_new_privs(current) &&
+- !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
++ !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
+ return -EPERM;
+
++ /* No flag for now. */
++ if (flags)
++ return -EINVAL;
++
+ /* Gets and checks the ruleset. */
+ ruleset = get_ruleset_from_fd(ruleset_fd, FMODE_CAN_READ);
+ if (IS_ERR(ruleset))
+diff --git a/sound/core/jack.c b/sound/core/jack.c
+index d1e3055f2b6a5..88493cc31914b 100644
+--- a/sound/core/jack.c
++++ b/sound/core/jack.c
+@@ -42,8 +42,11 @@ static int snd_jack_dev_disconnect(struct snd_device *device)
+ #ifdef CONFIG_SND_JACK_INPUT_DEV
+ struct snd_jack *jack = device->device_data;
+
+- if (!jack->input_dev)
++ mutex_lock(&jack->input_dev_lock);
++ if (!jack->input_dev) {
++ mutex_unlock(&jack->input_dev_lock);
+ return 0;
++ }
+
+ /* If the input device is registered with the input subsystem
+ * then we need to use a different deallocator. */
+@@ -52,6 +55,7 @@ static int snd_jack_dev_disconnect(struct snd_device *device)
+ else
+ input_free_device(jack->input_dev);
+ jack->input_dev = NULL;
++ mutex_unlock(&jack->input_dev_lock);
+ #endif /* CONFIG_SND_JACK_INPUT_DEV */
+ return 0;
+ }
+@@ -90,8 +94,11 @@ static int snd_jack_dev_register(struct snd_device *device)
+ snprintf(jack->name, sizeof(jack->name), "%s %s",
+ card->shortname, jack->id);
+
+- if (!jack->input_dev)
++ mutex_lock(&jack->input_dev_lock);
++ if (!jack->input_dev) {
++ mutex_unlock(&jack->input_dev_lock);
+ return 0;
++ }
+
+ jack->input_dev->name = jack->name;
+
+@@ -116,6 +123,7 @@ static int snd_jack_dev_register(struct snd_device *device)
+ if (err == 0)
+ jack->registered = 1;
+
++ mutex_unlock(&jack->input_dev_lock);
+ return err;
+ }
+ #endif /* CONFIG_SND_JACK_INPUT_DEV */
+@@ -517,9 +525,11 @@ int snd_jack_new(struct snd_card *card, const char *id, int type,
+ return -ENOMEM;
+ }
+
+- /* don't creat input device for phantom jack */
+- if (!phantom_jack) {
+ #ifdef CONFIG_SND_JACK_INPUT_DEV
++ mutex_init(&jack->input_dev_lock);
++
++ /* don't create input device for phantom jack */
++ if (!phantom_jack) {
+ int i;
+
+ jack->input_dev = input_allocate_device();
+@@ -537,8 +547,8 @@ int snd_jack_new(struct snd_card *card, const char *id, int type,
+ input_set_capability(jack->input_dev, EV_SW,
+ jack_switch_types[i]);
+
+-#endif /* CONFIG_SND_JACK_INPUT_DEV */
+ }
++#endif /* CONFIG_SND_JACK_INPUT_DEV */
+
+ err = snd_device_new(card, SNDRV_DEV_JACK, jack, &ops);
+ if (err < 0)
+@@ -578,10 +588,14 @@ EXPORT_SYMBOL(snd_jack_new);
+ void snd_jack_set_parent(struct snd_jack *jack, struct device *parent)
+ {
+ WARN_ON(jack->registered);
+- if (!jack->input_dev)
++ mutex_lock(&jack->input_dev_lock);
++ if (!jack->input_dev) {
++ mutex_unlock(&jack->input_dev_lock);
+ return;
++ }
+
+ jack->input_dev->dev.parent = parent;
++ mutex_unlock(&jack->input_dev_lock);
+ }
+ EXPORT_SYMBOL(snd_jack_set_parent);
+
+@@ -629,6 +643,8 @@ EXPORT_SYMBOL(snd_jack_set_key);
+
+ /**
+ * snd_jack_report - Report the current status of a jack
++ * Note: This function uses mutexes and should be called from a
++ * context which can sleep (such as a workqueue).
+ *
+ * @jack: The jack to report status for
+ * @status: The current status of the jack
+@@ -654,8 +670,11 @@ void snd_jack_report(struct snd_jack *jack, int status)
+ status & jack_kctl->mask_bits);
+
+ #ifdef CONFIG_SND_JACK_INPUT_DEV
+- if (!jack->input_dev)
++ mutex_lock(&jack->input_dev_lock);
++ if (!jack->input_dev) {
++ mutex_unlock(&jack->input_dev_lock);
+ return;
++ }
+
+ for (i = 0; i < ARRAY_SIZE(jack->key); i++) {
+ int testbit = ((SND_JACK_BTN_0 >> i) & ~mask_bits);
+@@ -675,6 +694,7 @@ void snd_jack_report(struct snd_jack *jack, int status)
+ }
+
+ input_sync(jack->input_dev);
++ mutex_unlock(&jack->input_dev_lock);
+ #endif /* CONFIG_SND_JACK_INPUT_DEV */
+ }
+ EXPORT_SYMBOL(snd_jack_report);
+diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
+index 8848d2f3160d8..b8296b6eb2c19 100644
+--- a/sound/core/pcm_memory.c
++++ b/sound/core/pcm_memory.c
+@@ -453,7 +453,6 @@ EXPORT_SYMBOL(snd_pcm_lib_malloc_pages);
+ */
+ int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
+ {
+- struct snd_card *card = substream->pcm->card;
+ struct snd_pcm_runtime *runtime;
+
+ if (PCM_RUNTIME_CHECK(substream))
+@@ -462,6 +461,8 @@ int snd_pcm_lib_free_pages(struct snd_pcm_substream *substream)
+ if (runtime->dma_area == NULL)
+ return 0;
+ if (runtime->dma_buffer_p != &substream->dma_buffer) {
++ struct snd_card *card = substream->pcm->card;
++
+ /* it's a newly allocated buffer. release it now. */
+ do_free_pages(card, runtime->dma_buffer_p);
+ kfree(runtime->dma_buffer_p);
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index ad292df7d805c..323c74a042688 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -1981,6 +1981,7 @@ enum {
+ ALC1220_FIXUP_CLEVO_PB51ED_PINS,
+ ALC887_FIXUP_ASUS_AUDIO,
+ ALC887_FIXUP_ASUS_HMIC,
++ ALCS1200A_FIXUP_MIC_VREF,
+ };
+
+ static void alc889_fixup_coef(struct hda_codec *codec,
+@@ -2526,6 +2527,14 @@ static const struct hda_fixup alc882_fixups[] = {
+ .chained = true,
+ .chain_id = ALC887_FIXUP_ASUS_AUDIO,
+ },
++ [ALCS1200A_FIXUP_MIC_VREF] = {
++ .type = HDA_FIXUP_PINCTLS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x18, PIN_VREF50 }, /* rear mic */
++ { 0x19, PIN_VREF50 }, /* front mic */
++ {}
++ }
++ },
+ };
+
+ static const struct snd_pci_quirk alc882_fixup_tbl[] = {
+@@ -2563,6 +2572,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1043, 0x835f, "Asus Eee 1601", ALC888_FIXUP_EEE1601),
+ SND_PCI_QUIRK(0x1043, 0x84bc, "ASUS ET2700", ALC887_FIXUP_ASUS_BASS),
+ SND_PCI_QUIRK(0x1043, 0x8691, "ASUS ROG Ranger VIII", ALC882_FIXUP_GPIO3),
++ SND_PCI_QUIRK(0x1043, 0x8797, "ASUS TUF B550M-PLUS", ALCS1200A_FIXUP_MIC_VREF),
+ SND_PCI_QUIRK(0x104d, 0x9043, "Sony Vaio VGC-LN51JGB", ALC882_FIXUP_NO_PRIMARY_HP),
+ SND_PCI_QUIRK(0x104d, 0x9044, "Sony VAIO AiO", ALC882_FIXUP_NO_PRIMARY_HP),
+ SND_PCI_QUIRK(0x104d, 0x9047, "Sony Vaio TT", ALC889_FIXUP_VAIO_TT),
+@@ -3131,6 +3141,7 @@ enum {
+ ALC269_TYPE_ALC257,
+ ALC269_TYPE_ALC215,
+ ALC269_TYPE_ALC225,
++ ALC269_TYPE_ALC245,
+ ALC269_TYPE_ALC287,
+ ALC269_TYPE_ALC294,
+ ALC269_TYPE_ALC300,
+@@ -3168,6 +3179,7 @@ static int alc269_parse_auto_config(struct hda_codec *codec)
+ case ALC269_TYPE_ALC257:
+ case ALC269_TYPE_ALC215:
+ case ALC269_TYPE_ALC225:
++ case ALC269_TYPE_ALC245:
+ case ALC269_TYPE_ALC287:
+ case ALC269_TYPE_ALC294:
+ case ALC269_TYPE_ALC300:
+@@ -3695,7 +3707,8 @@ static void alc225_init(struct hda_codec *codec)
+ hda_nid_t hp_pin = alc_get_hp_pin(spec);
+ bool hp1_pin_sense, hp2_pin_sense;
+
+- if (spec->codec_variant != ALC269_TYPE_ALC287)
++ if (spec->codec_variant != ALC269_TYPE_ALC287 &&
++ spec->codec_variant != ALC269_TYPE_ALC245)
+ /* required only at boot or S3 and S4 resume time */
+ if (!spec->done_hp_init ||
+ is_s3_resume(codec) ||
+@@ -8954,6 +8967,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1028, 0x0a62, "Dell Precision 5560", ALC289_FIXUP_DUAL_SPK),
+ SND_PCI_QUIRK(0x1028, 0x0a9d, "Dell Latitude 5430", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1028, 0x0a9e, "Dell Latitude 5430", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE),
++ SND_PCI_QUIRK(0x1028, 0x0b19, "Dell XPS 15 9520", ALC289_FIXUP_DUAL_SPK),
+ SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
+@@ -10148,7 +10162,10 @@ static int patch_alc269(struct hda_codec *codec)
+ case 0x10ec0245:
+ case 0x10ec0285:
+ case 0x10ec0289:
+- spec->codec_variant = ALC269_TYPE_ALC215;
++ if (alc_get_coef0(codec) & 0x0010)
++ spec->codec_variant = ALC269_TYPE_ALC245;
++ else
++ spec->codec_variant = ALC269_TYPE_ALC215;
+ spec->shutup = alc225_shutup;
+ spec->init_hook = alc225_init;
+ spec->gen.mixer_nid = 0;
+diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c
+index 9a767f47b89f1..959b70e8baf21 100644
+--- a/sound/soc/amd/yc/acp6x-mach.c
++++ b/sound/soc/amd/yc/acp6x-mach.c
+@@ -45,108 +45,126 @@ static struct snd_soc_card acp6x_card = {
+
+ static const struct dmi_system_id yc_acp_quirk_table[] = {
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21D2"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21D3"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21D4"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21D5"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CF"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CG"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CQ"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CR"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21AW"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21AX"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21BN"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21BQ"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CH"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CJ"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CK"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CL"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21D8"),
+ }
+ },
+ {
++ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21D9"),
+@@ -157,18 +175,21 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
+
+ static int acp6x_probe(struct platform_device *pdev)
+ {
++ const struct dmi_system_id *dmi_id;
+ struct acp6x_pdm *machine = NULL;
+ struct snd_soc_card *card;
+ int ret;
+- const struct dmi_system_id *dmi_id;
+
++ /* check for any DMI overrides */
+ dmi_id = dmi_first_match(yc_acp_quirk_table);
+- if (!dmi_id)
++ if (dmi_id)
++ platform_set_drvdata(pdev, dmi_id->driver_data);
++
++ card = platform_get_drvdata(pdev);
++ if (!card)
+ return -ENODEV;
+- card = &acp6x_card;
+ acp6x_card.dev = &pdev->dev;
+
+- platform_set_drvdata(pdev, card);
+ snd_soc_card_set_drvdata(card, machine);
+ ret = devm_snd_soc_register_card(&pdev->dev, card);
+ if (ret) {
+diff --git a/sound/soc/atmel/atmel-classd.c b/sound/soc/atmel/atmel-classd.c
+index a9f9f449c48c2..74b7b2611aa70 100644
+--- a/sound/soc/atmel/atmel-classd.c
++++ b/sound/soc/atmel/atmel-classd.c
+@@ -458,7 +458,6 @@ static const struct snd_soc_component_driver atmel_classd_cpu_dai_component = {
+ .num_controls = ARRAY_SIZE(atmel_classd_snd_controls),
+ .idle_bias_on = 1,
+ .use_pmdown_time = 1,
+- .endianness = 1,
+ };
+
+ /* ASoC sound card */
+diff --git a/sound/soc/atmel/atmel-pdmic.c b/sound/soc/atmel/atmel-pdmic.c
+index 42117de299e74..ea34efac2fff5 100644
+--- a/sound/soc/atmel/atmel-pdmic.c
++++ b/sound/soc/atmel/atmel-pdmic.c
+@@ -481,7 +481,6 @@ static const struct snd_soc_component_driver atmel_pdmic_cpu_dai_component = {
+ .num_controls = ARRAY_SIZE(atmel_pdmic_snd_controls),
+ .idle_bias_on = 1,
+ .use_pmdown_time = 1,
+- .endianness = 1,
+ };
+
+ /* ASoC sound card */
+diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig
+index f46a226601032..3dea20b2c4054 100644
+--- a/sound/soc/codecs/Kconfig
++++ b/sound/soc/codecs/Kconfig
+@@ -953,7 +953,6 @@ config SND_SOC_MAX98095
+
+ config SND_SOC_MAX98357A
+ tristate "Maxim MAX98357A CODEC"
+- depends on GPIOLIB
+
+ config SND_SOC_MAX98371
+ tristate
+@@ -1213,7 +1212,6 @@ config SND_SOC_RT1015
+
+ config SND_SOC_RT1015P
+ tristate
+- depends on GPIOLIB
+
+ config SND_SOC_RT1019
+ tristate
+diff --git a/sound/soc/codecs/cs35l41-lib.c b/sound/soc/codecs/cs35l41-lib.c
+index aa6823fbd1a4d..17cf782f39af6 100644
+--- a/sound/soc/codecs/cs35l41-lib.c
++++ b/sound/soc/codecs/cs35l41-lib.c
+@@ -422,7 +422,7 @@ static bool cs35l41_volatile_reg(struct device *dev, unsigned int reg)
+ }
+ }
+
+-static const struct cs35l41_otp_packed_element_t otp_map_1[CS35L41_NUM_OTP_ELEM] = {
++static const struct cs35l41_otp_packed_element_t otp_map_1[] = {
+ /* addr shift size */
+ { 0x00002030, 0, 4 }, /*TRIM_OSC_FREQ_TRIM*/
+ { 0x00002030, 7, 1 }, /*TRIM_OSC_TRIM_DONE*/
+@@ -525,7 +525,7 @@ static const struct cs35l41_otp_packed_element_t otp_map_1[CS35L41_NUM_OTP_ELEM]
+ { 0x00017044, 0, 24 }, /*LOT_NUMBER*/
+ };
+
+-static const struct cs35l41_otp_packed_element_t otp_map_2[CS35L41_NUM_OTP_ELEM] = {
++static const struct cs35l41_otp_packed_element_t otp_map_2[] = {
+ /* addr shift size */
+ { 0x00002030, 0, 4 }, /*TRIM_OSC_FREQ_TRIM*/
+ { 0x00002030, 7, 1 }, /*TRIM_OSC_TRIM_DONE*/
+@@ -671,35 +671,35 @@ static const struct cs35l41_otp_map_element_t cs35l41_otp_map_map[] = {
+ {
+ .id = 0x01,
+ .map = otp_map_1,
+- .num_elements = CS35L41_NUM_OTP_ELEM,
++ .num_elements = ARRAY_SIZE(otp_map_1),
+ .bit_offset = 16,
+ .word_offset = 2,
+ },
+ {
+ .id = 0x02,
+ .map = otp_map_2,
+- .num_elements = CS35L41_NUM_OTP_ELEM,
++ .num_elements = ARRAY_SIZE(otp_map_2),
+ .bit_offset = 16,
+ .word_offset = 2,
+ },
+ {
+ .id = 0x03,
+ .map = otp_map_2,
+- .num_elements = CS35L41_NUM_OTP_ELEM,
++ .num_elements = ARRAY_SIZE(otp_map_2),
+ .bit_offset = 16,
+ .word_offset = 2,
+ },
+ {
+ .id = 0x06,
+ .map = otp_map_2,
+- .num_elements = CS35L41_NUM_OTP_ELEM,
++ .num_elements = ARRAY_SIZE(otp_map_2),
+ .bit_offset = 16,
+ .word_offset = 2,
+ },
+ {
+ .id = 0x08,
+ .map = otp_map_1,
+- .num_elements = CS35L41_NUM_OTP_ELEM,
++ .num_elements = ARRAY_SIZE(otp_map_1),
+ .bit_offset = 16,
+ .word_offset = 2,
+ },
+diff --git a/sound/soc/codecs/lpass-macro-common.c b/sound/soc/codecs/lpass-macro-common.c
+index 6cede75ed3b5d..1b9082d237c13 100644
+--- a/sound/soc/codecs/lpass-macro-common.c
++++ b/sound/soc/codecs/lpass-macro-common.c
+@@ -24,42 +24,45 @@ struct lpass_macro *lpass_macro_pds_init(struct device *dev)
+ return ERR_PTR(-ENOMEM);
+
+ l_pds->macro_pd = dev_pm_domain_attach_by_name(dev, "macro");
+- if (IS_ERR_OR_NULL(l_pds->macro_pd))
+- return NULL;
+-
+- ret = pm_runtime_get_sync(l_pds->macro_pd);
+- if (ret < 0) {
+- pm_runtime_put_noidle(l_pds->macro_pd);
++ if (IS_ERR_OR_NULL(l_pds->macro_pd)) {
++ ret = l_pds->macro_pd ? PTR_ERR(l_pds->macro_pd) : -ENODATA;
+ goto macro_err;
+ }
+
++ ret = pm_runtime_resume_and_get(l_pds->macro_pd);
++ if (ret < 0)
++ goto macro_sync_err;
++
+ l_pds->dcodec_pd = dev_pm_domain_attach_by_name(dev, "dcodec");
+- if (IS_ERR_OR_NULL(l_pds->dcodec_pd))
++ if (IS_ERR_OR_NULL(l_pds->dcodec_pd)) {
++ ret = l_pds->dcodec_pd ? PTR_ERR(l_pds->dcodec_pd) : -ENODATA;
+ goto dcodec_err;
++ }
+
+- ret = pm_runtime_get_sync(l_pds->dcodec_pd);
+- if (ret < 0) {
+- pm_runtime_put_noidle(l_pds->dcodec_pd);
++ ret = pm_runtime_resume_and_get(l_pds->dcodec_pd);
++ if (ret < 0)
+ goto dcodec_sync_err;
+- }
+ return l_pds;
+
+ dcodec_sync_err:
+ dev_pm_domain_detach(l_pds->dcodec_pd, false);
+ dcodec_err:
+ pm_runtime_put(l_pds->macro_pd);
+-macro_err:
++macro_sync_err:
+ dev_pm_domain_detach(l_pds->macro_pd, false);
++macro_err:
+ return ERR_PTR(ret);
+ }
+ EXPORT_SYMBOL_GPL(lpass_macro_pds_init);
+
+ void lpass_macro_pds_exit(struct lpass_macro *pds)
+ {
+- pm_runtime_put(pds->macro_pd);
+- dev_pm_domain_detach(pds->macro_pd, false);
+- pm_runtime_put(pds->dcodec_pd);
+- dev_pm_domain_detach(pds->dcodec_pd, false);
++ if (pds) {
++ pm_runtime_put(pds->macro_pd);
++ dev_pm_domain_detach(pds->macro_pd, false);
++ pm_runtime_put(pds->dcodec_pd);
++ dev_pm_domain_detach(pds->dcodec_pd, false);
++ }
+ }
+ EXPORT_SYMBOL_GPL(lpass_macro_pds_exit);
+
+diff --git a/sound/soc/codecs/max98090.c b/sound/soc/codecs/max98090.c
+index 62b41ca050a20..5513acd360b8f 100644
+--- a/sound/soc/codecs/max98090.c
++++ b/sound/soc/codecs/max98090.c
+@@ -393,7 +393,8 @@ static int max98090_put_enab_tlv(struct snd_kcontrol *kcontrol,
+ struct soc_mixer_control *mc =
+ (struct soc_mixer_control *)kcontrol->private_value;
+ unsigned int mask = (1 << fls(mc->max)) - 1;
+- unsigned int sel = ucontrol->value.integer.value[0];
++ int sel_unchecked = ucontrol->value.integer.value[0];
++ unsigned int sel;
+ unsigned int val = snd_soc_component_read(component, mc->reg);
+ unsigned int *select;
+
+@@ -413,8 +414,9 @@ static int max98090_put_enab_tlv(struct snd_kcontrol *kcontrol,
+
+ val = (val >> mc->shift) & mask;
+
+- if (sel < 0 || sel > mc->max)
++ if (sel_unchecked < 0 || sel_unchecked > mc->max)
+ return -EINVAL;
++ sel = sel_unchecked;
+
+ *select = sel;
+
+diff --git a/sound/soc/codecs/rk3328_codec.c b/sound/soc/codecs/rk3328_codec.c
+index 758d439e8c7a5..86b679cf7aef9 100644
+--- a/sound/soc/codecs/rk3328_codec.c
++++ b/sound/soc/codecs/rk3328_codec.c
+@@ -481,7 +481,7 @@ static int rk3328_platform_probe(struct platform_device *pdev)
+ ret = clk_prepare_enable(rk3328->pclk);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "failed to enable acodec pclk\n");
+- return ret;
++ goto err_unprepare_mclk;
+ }
+
+ base = devm_platform_ioremap_resource(pdev, 0);
+diff --git a/sound/soc/codecs/rt5514.c b/sound/soc/codecs/rt5514.c
+index 577680df70520..92428f2b459ba 100644
+--- a/sound/soc/codecs/rt5514.c
++++ b/sound/soc/codecs/rt5514.c
+@@ -419,7 +419,7 @@ static int rt5514_dsp_voice_wake_up_put(struct snd_kcontrol *kcontrol,
+ }
+ }
+
+- return 0;
++ return 1;
+ }
+
+ static const struct snd_kcontrol_new rt5514_snd_controls[] = {
+diff --git a/sound/soc/codecs/rt5645.c b/sound/soc/codecs/rt5645.c
+index 197c560479470..4b2e027c10331 100644
+--- a/sound/soc/codecs/rt5645.c
++++ b/sound/soc/codecs/rt5645.c
+@@ -4154,9 +4154,14 @@ static int rt5645_i2c_remove(struct i2c_client *i2c)
+ if (i2c->irq)
+ free_irq(i2c->irq, rt5645);
+
++ /*
++ * Since the rt5645_btn_check_callback() can queue jack_detect_work,
++ * the timer need to be delted first
++ */
++ del_timer_sync(&rt5645->btn_check_timer);
++
+ cancel_delayed_work_sync(&rt5645->jack_detect_work);
+ cancel_delayed_work_sync(&rt5645->rcclock_work);
+- del_timer_sync(&rt5645->btn_check_timer);
+
+ regulator_bulk_disable(ARRAY_SIZE(rt5645->supplies), rt5645->supplies);
+
+diff --git a/sound/soc/codecs/tscs454.c b/sound/soc/codecs/tscs454.c
+index 7e1826d6f06f4..32e6fa7b0a061 100644
+--- a/sound/soc/codecs/tscs454.c
++++ b/sound/soc/codecs/tscs454.c
+@@ -3120,18 +3120,17 @@ static int set_aif_sample_format(struct snd_soc_component *component,
+ unsigned int width;
+ int ret;
+
+- switch (format) {
+- case SNDRV_PCM_FORMAT_S16_LE:
++ switch (snd_pcm_format_width(format)) {
++ case 16:
+ width = FV_WL_16;
+ break;
+- case SNDRV_PCM_FORMAT_S20_3LE:
++ case 20:
+ width = FV_WL_20;
+ break;
+- case SNDRV_PCM_FORMAT_S24_3LE:
++ case 24:
+ width = FV_WL_24;
+ break;
+- case SNDRV_PCM_FORMAT_S24_LE:
+- case SNDRV_PCM_FORMAT_S32_LE:
++ case 32:
+ width = FV_WL_32;
+ break;
+ default:
+@@ -3326,6 +3325,7 @@ static const struct snd_soc_component_driver soc_component_dev_tscs454 = {
+ .num_dapm_routes = ARRAY_SIZE(tscs454_intercon),
+ .controls = tscs454_snd_controls,
+ .num_controls = ARRAY_SIZE(tscs454_snd_controls),
++ .endianness = 1,
+ };
+
+ #define TSCS454_RATES SNDRV_PCM_RATE_8000_96000
+diff --git a/sound/soc/codecs/wm2000.c b/sound/soc/codecs/wm2000.c
+index 72e165cc64439..97ece3114b3dc 100644
+--- a/sound/soc/codecs/wm2000.c
++++ b/sound/soc/codecs/wm2000.c
+@@ -536,7 +536,7 @@ static int wm2000_anc_transition(struct wm2000_priv *wm2000,
+ {
+ struct i2c_client *i2c = wm2000->i2c;
+ int i, j;
+- int ret;
++ int ret = 0;
+
+ if (wm2000->anc_mode == mode)
+ return 0;
+@@ -566,13 +566,13 @@ static int wm2000_anc_transition(struct wm2000_priv *wm2000,
+ ret = anc_transitions[i].step[j](i2c,
+ anc_transitions[i].analogue);
+ if (ret != 0)
+- return ret;
++ break;
+ }
+
+ if (anc_transitions[i].dest == ANC_OFF)
+ clk_disable_unprepare(wm2000->mclk);
+
+- return 0;
++ return ret;
+ }
+
+ static int wm2000_anc_set_mode(struct wm2000_priv *wm2000)
+diff --git a/sound/soc/fsl/imx-hdmi.c b/sound/soc/fsl/imx-hdmi.c
+index 929f69b758af4..ec149dc739383 100644
+--- a/sound/soc/fsl/imx-hdmi.c
++++ b/sound/soc/fsl/imx-hdmi.c
+@@ -126,6 +126,7 @@ static int imx_hdmi_probe(struct platform_device *pdev)
+ data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
+ if (!data) {
+ ret = -ENOMEM;
++ put_device(&cpu_pdev->dev);
+ goto fail;
+ }
+
+diff --git a/sound/soc/fsl/imx-sgtl5000.c b/sound/soc/fsl/imx-sgtl5000.c
+index 8daced42d55e4..580a0d963f0eb 100644
+--- a/sound/soc/fsl/imx-sgtl5000.c
++++ b/sound/soc/fsl/imx-sgtl5000.c
+@@ -120,19 +120,19 @@ static int imx_sgtl5000_probe(struct platform_device *pdev)
+ data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
+ if (!data) {
+ ret = -ENOMEM;
+- goto fail;
++ goto put_device;
+ }
+
+ comp = devm_kzalloc(&pdev->dev, 3 * sizeof(*comp), GFP_KERNEL);
+ if (!comp) {
+ ret = -ENOMEM;
+- goto fail;
++ goto put_device;
+ }
+
+ data->codec_clk = clk_get(&codec_dev->dev, NULL);
+ if (IS_ERR(data->codec_clk)) {
+ ret = PTR_ERR(data->codec_clk);
+- goto fail;
++ goto put_device;
+ }
+
+ data->clk_frequency = clk_get_rate(data->codec_clk);
+@@ -158,10 +158,10 @@ static int imx_sgtl5000_probe(struct platform_device *pdev)
+ data->card.dev = &pdev->dev;
+ ret = snd_soc_of_parse_card_name(&data->card, "model");
+ if (ret)
+- goto fail;
++ goto put_device;
+ ret = snd_soc_of_parse_audio_routing(&data->card, "audio-routing");
+ if (ret)
+- goto fail;
++ goto put_device;
+ data->card.num_links = 1;
+ data->card.owner = THIS_MODULE;
+ data->card.dai_link = &data->dai;
+@@ -174,7 +174,7 @@ static int imx_sgtl5000_probe(struct platform_device *pdev)
+ ret = devm_snd_soc_register_card(&pdev->dev, &data->card);
+ if (ret) {
+ dev_err_probe(&pdev->dev, ret, "snd_soc_register_card failed\n");
+- goto fail;
++ goto put_device;
+ }
+
+ of_node_put(ssi_np);
+@@ -182,6 +182,8 @@ static int imx_sgtl5000_probe(struct platform_device *pdev)
+
+ return 0;
+
++put_device:
++ put_device(&codec_dev->dev);
+ fail:
+ if (data && !IS_ERR(data->codec_clk))
+ clk_put(data->codec_clk);
+diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c
+index d76a505052fb7..f81ae742faa78 100644
+--- a/sound/soc/intel/boards/bytcr_rt5640.c
++++ b/sound/soc/intel/boards/bytcr_rt5640.c
+@@ -773,6 +773,18 @@ static const struct dmi_system_id byt_rt5640_quirk_table[] = {
+ BYT_RT5640_OVCD_SF_0P75 |
+ BYT_RT5640_MCLK_EN),
+ },
++ { /* HP Pro Tablet 408 */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "HP Pro Tablet 408"),
++ },
++ .driver_data = (void *)(BYT_RT5640_DMIC1_MAP |
++ BYT_RT5640_JD_SRC_JD2_IN4N |
++ BYT_RT5640_OVCD_TH_1500UA |
++ BYT_RT5640_OVCD_SF_0P75 |
++ BYT_RT5640_SSP0_AIF1 |
++ BYT_RT5640_MCLK_EN),
++ },
+ { /* HP Stream 7 */
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
+diff --git a/sound/soc/intel/boards/sof_ssp_amp.c b/sound/soc/intel/boards/sof_ssp_amp.c
+index 88530e9de5435..ef70c6f27fe18 100644
+--- a/sound/soc/intel/boards/sof_ssp_amp.c
++++ b/sound/soc/intel/boards/sof_ssp_amp.c
+@@ -9,6 +9,7 @@
+
+ #include <linux/acpi.h>
+ #include <linux/delay.h>
++#include <linux/dmi.h>
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include <sound/core.h>
+@@ -78,6 +79,16 @@ struct sof_card_private {
+ bool idisp_codec;
+ };
+
++static const struct dmi_system_id chromebook_platforms[] = {
++ {
++ .ident = "Google Chromebooks",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Google"),
++ }
++ },
++ {},
++};
++
+ static const struct snd_soc_dapm_widget sof_ssp_amp_dapm_widgets[] = {
+ SND_SOC_DAPM_MIC("SoC DMIC", NULL),
+ };
+@@ -371,7 +382,7 @@ static int sof_ssp_amp_probe(struct platform_device *pdev)
+ struct snd_soc_dai_link *dai_links;
+ struct snd_soc_acpi_mach *mach;
+ struct sof_card_private *ctx;
+- int dmic_be_num, hdmi_num = 0;
++ int dmic_be_num = 0, hdmi_num = 0;
+ int ret, ssp_codec;
+
+ ctx = devm_kzalloc(&pdev->dev, sizeof(*ctx), GFP_KERNEL);
+@@ -383,7 +394,8 @@ static int sof_ssp_amp_probe(struct platform_device *pdev)
+
+ mach = pdev->dev.platform_data;
+
+- dmic_be_num = mach->mach_params.dmic_num;
++ if (dmi_check_system(chromebook_platforms) || mach->mach_params.dmic_num > 0)
++ dmic_be_num = 2;
+
+ ssp_codec = sof_ssp_amp_quirk & SOF_AMPLIFIER_SSP_MASK;
+
+diff --git a/sound/soc/mediatek/mt2701/mt2701-wm8960.c b/sound/soc/mediatek/mt2701/mt2701-wm8960.c
+index f56de1b918bf0..0cdf2ae362439 100644
+--- a/sound/soc/mediatek/mt2701/mt2701-wm8960.c
++++ b/sound/soc/mediatek/mt2701/mt2701-wm8960.c
+@@ -129,7 +129,8 @@ static int mt2701_wm8960_machine_probe(struct platform_device *pdev)
+ if (!codec_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_platform_node;
+ }
+ for_each_card_prelinks(card, i, dai_link) {
+ if (dai_link->codecs->name)
+@@ -140,7 +141,7 @@ static int mt2701_wm8960_machine_probe(struct platform_device *pdev)
+ ret = snd_soc_of_parse_audio_routing(card, "audio-routing");
+ if (ret) {
+ dev_err(&pdev->dev, "failed to parse audio-routing: %d\n", ret);
+- return ret;
++ goto put_codec_node;
+ }
+
+ ret = devm_snd_soc_register_card(&pdev->dev, card);
+@@ -148,6 +149,10 @@ static int mt2701_wm8960_machine_probe(struct platform_device *pdev)
+ dev_err(&pdev->dev, "%s snd_soc_register_card fail %d\n",
+ __func__, ret);
+
++put_codec_node:
++ of_node_put(codec_node);
++put_platform_node:
++ of_node_put(platform_node);
+ return ret;
+ }
+
+diff --git a/sound/soc/mediatek/mt8173/mt8173-max98090.c b/sound/soc/mediatek/mt8173/mt8173-max98090.c
+index 4cb90da89262b..58778cd2e61b1 100644
+--- a/sound/soc/mediatek/mt8173/mt8173-max98090.c
++++ b/sound/soc/mediatek/mt8173/mt8173-max98090.c
+@@ -167,7 +167,8 @@ static int mt8173_max98090_dev_probe(struct platform_device *pdev)
+ if (!codec_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_platform_node;
+ }
+ for_each_card_prelinks(card, i, dai_link) {
+ if (dai_link->codecs->name)
+@@ -179,6 +180,8 @@ static int mt8173_max98090_dev_probe(struct platform_device *pdev)
+ ret = devm_snd_soc_register_card(&pdev->dev, card);
+
+ of_node_put(codec_node);
++
++put_platform_node:
+ of_node_put(platform_node);
+ return ret;
+ }
+diff --git a/sound/soc/mxs/mxs-saif.c b/sound/soc/mxs/mxs-saif.c
+index 879c1221a809b..7afe1a1acc568 100644
+--- a/sound/soc/mxs/mxs-saif.c
++++ b/sound/soc/mxs/mxs-saif.c
+@@ -754,6 +754,7 @@ static int mxs_saif_probe(struct platform_device *pdev)
+ saif->master_id = saif->id;
+ } else {
+ ret = of_alias_get_id(master, "saif");
++ of_node_put(master);
+ if (ret < 0)
+ return ret;
+ else
+diff --git a/sound/soc/samsung/aries_wm8994.c b/sound/soc/samsung/aries_wm8994.c
+index 5265e546b124c..83acbe57b2489 100644
+--- a/sound/soc/samsung/aries_wm8994.c
++++ b/sound/soc/samsung/aries_wm8994.c
+@@ -585,10 +585,10 @@ static int aries_audio_probe(struct platform_device *pdev)
+
+ extcon_np = of_parse_phandle(np, "extcon", 0);
+ priv->usb_extcon = extcon_find_edev_by_node(extcon_np);
++ of_node_put(extcon_np);
+ if (IS_ERR(priv->usb_extcon))
+ return dev_err_probe(dev, PTR_ERR(priv->usb_extcon),
+ "Failed to get extcon device");
+- of_node_put(extcon_np);
+
+ priv->adc = devm_iio_channel_get(dev, "headset-detect");
+ if (IS_ERR(priv->adc))
+diff --git a/sound/soc/sh/rcar/core.c b/sound/soc/sh/rcar/core.c
+index 6a8fe0da7670b..af8ef2a27d341 100644
+--- a/sound/soc/sh/rcar/core.c
++++ b/sound/soc/sh/rcar/core.c
+@@ -1159,6 +1159,7 @@ void rsnd_parse_connect_common(struct rsnd_dai *rdai, char *name,
+ struct device_node *capture)
+ {
+ struct rsnd_priv *priv = rsnd_rdai_to_priv(rdai);
++ struct device *dev = rsnd_priv_to_dev(priv);
+ struct device_node *np;
+ int i;
+
+@@ -1169,7 +1170,11 @@ void rsnd_parse_connect_common(struct rsnd_dai *rdai, char *name,
+ for_each_child_of_node(node, np) {
+ struct rsnd_mod *mod;
+
+- i = rsnd_node_fixed_index(np, name, i);
++ i = rsnd_node_fixed_index(dev, np, name, i);
++ if (i < 0) {
++ of_node_put(np);
++ break;
++ }
+
+ mod = mod_get(priv, i);
+
+@@ -1183,7 +1188,7 @@ void rsnd_parse_connect_common(struct rsnd_dai *rdai, char *name,
+ of_node_put(node);
+ }
+
+-int rsnd_node_fixed_index(struct device_node *node, char *name, int idx)
++int rsnd_node_fixed_index(struct device *dev, struct device_node *node, char *name, int idx)
+ {
+ char node_name[16];
+
+@@ -1210,6 +1215,8 @@ int rsnd_node_fixed_index(struct device_node *node, char *name, int idx)
+ return idx;
+ }
+
++ dev_err(dev, "strange node numbering (%s)",
++ of_node_full_name(node));
+ return -EINVAL;
+ }
+
+@@ -1221,10 +1228,8 @@ int rsnd_node_count(struct rsnd_priv *priv, struct device_node *node, char *name
+
+ i = 0;
+ for_each_child_of_node(node, np) {
+- i = rsnd_node_fixed_index(np, name, i);
++ i = rsnd_node_fixed_index(dev, np, name, i);
+ if (i < 0) {
+- dev_err(dev, "strange node numbering (%s)",
+- of_node_full_name(node));
+ of_node_put(np);
+ return 0;
+ }
+diff --git a/sound/soc/sh/rcar/dma.c b/sound/soc/sh/rcar/dma.c
+index 03e0d4eca7815..463ab237d7bd4 100644
+--- a/sound/soc/sh/rcar/dma.c
++++ b/sound/soc/sh/rcar/dma.c
+@@ -240,12 +240,19 @@ static int rsnd_dmaen_start(struct rsnd_mod *mod,
+ struct dma_chan *rsnd_dma_request_channel(struct device_node *of_node, char *name,
+ struct rsnd_mod *mod, char *x)
+ {
++ struct rsnd_priv *priv = rsnd_mod_to_priv(mod);
++ struct device *dev = rsnd_priv_to_dev(priv);
+ struct dma_chan *chan = NULL;
+ struct device_node *np;
+ int i = 0;
+
+ for_each_child_of_node(of_node, np) {
+- i = rsnd_node_fixed_index(np, name, i);
++ i = rsnd_node_fixed_index(dev, np, name, i);
++ if (i < 0) {
++ chan = NULL;
++ of_node_put(np);
++ break;
++ }
+
+ if (i == rsnd_mod_id_raw(mod) && (!chan))
+ chan = of_dma_request_slave_channel(np, x);
+diff --git a/sound/soc/sh/rcar/rsnd.h b/sound/soc/sh/rcar/rsnd.h
+index 6580bab0e229b..d9cd190d7e198 100644
+--- a/sound/soc/sh/rcar/rsnd.h
++++ b/sound/soc/sh/rcar/rsnd.h
+@@ -460,7 +460,7 @@ void rsnd_parse_connect_common(struct rsnd_dai *rdai, char *name,
+ struct device_node *playback,
+ struct device_node *capture);
+ int rsnd_node_count(struct rsnd_priv *priv, struct device_node *node, char *name);
+-int rsnd_node_fixed_index(struct device_node *node, char *name, int idx);
++int rsnd_node_fixed_index(struct device *dev, struct device_node *node, char *name, int idx);
+
+ int rsnd_channel_normalization(int chan);
+ #define rsnd_runtime_channel_original(io) \
+diff --git a/sound/soc/sh/rcar/src.c b/sound/soc/sh/rcar/src.c
+index 42a100c6303d4..0ea84ae57c6ac 100644
+--- a/sound/soc/sh/rcar/src.c
++++ b/sound/soc/sh/rcar/src.c
+@@ -676,7 +676,12 @@ int rsnd_src_probe(struct rsnd_priv *priv)
+ if (!of_device_is_available(np))
+ goto skip;
+
+- i = rsnd_node_fixed_index(np, SRC_NAME, i);
++ i = rsnd_node_fixed_index(dev, np, SRC_NAME, i);
++ if (i < 0) {
++ ret = -EINVAL;
++ of_node_put(np);
++ goto rsnd_src_probe_done;
++ }
+
+ src = rsnd_src_get(priv, i);
+
+diff --git a/sound/soc/sh/rcar/ssi.c b/sound/soc/sh/rcar/ssi.c
+index 87e606f688d3f..43c5e27dc5c86 100644
+--- a/sound/soc/sh/rcar/ssi.c
++++ b/sound/soc/sh/rcar/ssi.c
+@@ -1105,6 +1105,7 @@ void rsnd_parse_connect_ssi(struct rsnd_dai *rdai,
+ struct device_node *capture)
+ {
+ struct rsnd_priv *priv = rsnd_rdai_to_priv(rdai);
++ struct device *dev = rsnd_priv_to_dev(priv);
+ struct device_node *node;
+ struct device_node *np;
+ int i;
+@@ -1117,7 +1118,11 @@ void rsnd_parse_connect_ssi(struct rsnd_dai *rdai,
+ for_each_child_of_node(node, np) {
+ struct rsnd_mod *mod;
+
+- i = rsnd_node_fixed_index(np, SSI_NAME, i);
++ i = rsnd_node_fixed_index(dev, np, SSI_NAME, i);
++ if (i < 0) {
++ of_node_put(np);
++ break;
++ }
+
+ mod = rsnd_ssi_mod_get(priv, i);
+
+@@ -1182,7 +1187,12 @@ int rsnd_ssi_probe(struct rsnd_priv *priv)
+ if (!of_device_is_available(np))
+ goto skip;
+
+- i = rsnd_node_fixed_index(np, SSI_NAME, i);
++ i = rsnd_node_fixed_index(dev, np, SSI_NAME, i);
++ if (i < 0) {
++ ret = -EINVAL;
++ of_node_put(np);
++ goto rsnd_ssi_probe_done;
++ }
+
+ ssi = rsnd_ssi_get(priv, i);
+
+diff --git a/sound/soc/sh/rcar/ssiu.c b/sound/soc/sh/rcar/ssiu.c
+index 0d8f97633dd26..4b8a63e336c77 100644
+--- a/sound/soc/sh/rcar/ssiu.c
++++ b/sound/soc/sh/rcar/ssiu.c
+@@ -102,6 +102,8 @@ bool rsnd_ssiu_busif_err_status_clear(struct rsnd_mod *mod)
+ shift = 1;
+ offset = 1;
+ break;
++ default:
++ goto out;
+ }
+
+ for (i = 0; i < 4; i++) {
+@@ -120,7 +122,7 @@ bool rsnd_ssiu_busif_err_status_clear(struct rsnd_mod *mod)
+ }
+ rsnd_mod_write(mod, reg, val);
+ }
+-
++out:
+ return error;
+ }
+
+@@ -460,6 +462,7 @@ void rsnd_parse_connect_ssiu(struct rsnd_dai *rdai,
+ struct device_node *capture)
+ {
+ struct rsnd_priv *priv = rsnd_rdai_to_priv(rdai);
++ struct device *dev = rsnd_priv_to_dev(priv);
+ struct device_node *node = rsnd_ssiu_of_node(priv);
+ struct rsnd_dai_stream *io_p = &rdai->playback;
+ struct rsnd_dai_stream *io_c = &rdai->capture;
+@@ -472,7 +475,11 @@ void rsnd_parse_connect_ssiu(struct rsnd_dai *rdai,
+ for_each_child_of_node(node, np) {
+ struct rsnd_mod *mod;
+
+- i = rsnd_node_fixed_index(np, SSIU_NAME, i);
++ i = rsnd_node_fixed_index(dev, np, SSIU_NAME, i);
++ if (i < 0) {
++ of_node_put(np);
++ break;
++ }
+
+ mod = rsnd_ssiu_mod_get(priv, i);
+
+diff --git a/sound/soc/sh/rz-ssi.c b/sound/soc/sh/rz-ssi.c
+index e8edaed05d4cf..8a0c01ca06bee 100644
+--- a/sound/soc/sh/rz-ssi.c
++++ b/sound/soc/sh/rz-ssi.c
+@@ -978,22 +978,24 @@ static int rz_ssi_probe(struct platform_device *pdev)
+
+ /* Error Interrupt */
+ ssi->irq_int = platform_get_irq_byname(pdev, "int_req");
+- if (ssi->irq_int < 0)
+- return dev_err_probe(&pdev->dev, -ENODEV,
+- "Unable to get SSI int_req IRQ\n");
++ if (ssi->irq_int < 0) {
++ rz_ssi_release_dma_channels(ssi);
++ return ssi->irq_int;
++ }
+
+ ret = devm_request_irq(&pdev->dev, ssi->irq_int, &rz_ssi_interrupt,
+ 0, dev_name(&pdev->dev), ssi);
+- if (ret < 0)
++ if (ret < 0) {
++ rz_ssi_release_dma_channels(ssi);
+ return dev_err_probe(&pdev->dev, ret,
+ "irq request error (int_req)\n");
++ }
+
+ if (!rz_ssi_is_dma_enabled(ssi)) {
+ /* Tx and Rx interrupts (pio only) */
+ ssi->irq_tx = platform_get_irq_byname(pdev, "dma_tx");
+ if (ssi->irq_tx < 0)
+- return dev_err_probe(&pdev->dev, -ENODEV,
+- "Unable to get SSI dma_tx IRQ\n");
++ return ssi->irq_tx;
+
+ ret = devm_request_irq(&pdev->dev, ssi->irq_tx,
+ &rz_ssi_interrupt, 0,
+@@ -1004,8 +1006,7 @@ static int rz_ssi_probe(struct platform_device *pdev)
+
+ ssi->irq_rx = platform_get_irq_byname(pdev, "dma_rx");
+ if (ssi->irq_rx < 0)
+- return dev_err_probe(&pdev->dev, -ENODEV,
+- "Unable to get SSI dma_rx IRQ\n");
++ return ssi->irq_rx;
+
+ ret = devm_request_irq(&pdev->dev, ssi->irq_rx,
+ &rz_ssi_interrupt, 0,
+@@ -1016,13 +1017,16 @@ static int rz_ssi_probe(struct platform_device *pdev)
+ }
+
+ ssi->rstc = devm_reset_control_get_exclusive(&pdev->dev, NULL);
+- if (IS_ERR(ssi->rstc))
++ if (IS_ERR(ssi->rstc)) {
++ rz_ssi_release_dma_channels(ssi);
+ return PTR_ERR(ssi->rstc);
++ }
+
+ reset_control_deassert(ssi->rstc);
+ pm_runtime_enable(&pdev->dev);
+ ret = pm_runtime_resume_and_get(&pdev->dev);
+ if (ret < 0) {
++ rz_ssi_release_dma_channels(ssi);
+ pm_runtime_disable(ssi->dev);
+ reset_control_assert(ssi->rstc);
+ return dev_err_probe(ssi->dev, ret, "pm_runtime_resume_and_get failed\n");
+diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
+index ca917a849c423..869c76506b669 100644
+--- a/sound/soc/soc-dapm.c
++++ b/sound/soc/soc-dapm.c
+@@ -3437,7 +3437,6 @@ int snd_soc_dapm_put_volsw(struct snd_kcontrol *kcontrol,
+ update.val = val;
+ card->update = &update;
+ }
+- change |= reg_change;
+
+ ret = soc_dapm_mixer_update_power(card, kcontrol, connect,
+ rconnect);
+@@ -3539,7 +3538,6 @@ int snd_soc_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
+ update.val = val;
+ card->update = &update;
+ }
+- change |= reg_change;
+
+ ret = soc_dapm_mux_update_power(card, kcontrol, item[0], e);
+
+diff --git a/sound/soc/sof/amd/pci-rn.c b/sound/soc/sof/amd/pci-rn.c
+index 392ffbdf64179..d809d151a38c4 100644
+--- a/sound/soc/sof/amd/pci-rn.c
++++ b/sound/soc/sof/amd/pci-rn.c
+@@ -93,6 +93,7 @@ static int acp_pci_rn_probe(struct pci_dev *pci, const struct pci_device_id *pci
+ res = devm_kzalloc(&pci->dev, sizeof(struct resource) * ARRAY_SIZE(renoir_res), GFP_KERNEL);
+ if (!res) {
+ sof_pci_remove(pci);
++ platform_device_unregister(dmic_dev);
+ return -ENOMEM;
+ }
+
+diff --git a/sound/soc/sof/ipc3-topology.c b/sound/soc/sof/ipc3-topology.c
+index 2f8450a8c0a1f..cdff48c4195f8 100644
+--- a/sound/soc/sof/ipc3-topology.c
++++ b/sound/soc/sof/ipc3-topology.c
+@@ -20,7 +20,8 @@
+ struct sof_widget_data {
+ int ctrl_type;
+ int ipc_cmd;
+- struct sof_abi_hdr *pdata;
++ void *pdata;
++ size_t pdata_size;
+ struct snd_sof_control *control;
+ };
+
+@@ -784,16 +785,26 @@ static int sof_get_control_data(struct snd_soc_component *scomp,
+ }
+
+ cdata = wdata[i].control->ipc_control_data;
+- wdata[i].pdata = cdata->data;
+- if (!wdata[i].pdata)
+- return -EINVAL;
+
+- /* make sure data is valid - data can be updated at runtime */
+- if (widget->dobj.widget.kcontrol_type[i] == SND_SOC_TPLG_TYPE_BYTES &&
+- wdata[i].pdata->magic != SOF_ABI_MAGIC)
+- return -EINVAL;
++ if (widget->dobj.widget.kcontrol_type[i] == SND_SOC_TPLG_TYPE_BYTES) {
++ /* make sure data is valid - data can be updated at runtime */
++ if (cdata->data->magic != SOF_ABI_MAGIC)
++ return -EINVAL;
++
++ wdata[i].pdata = cdata->data->data;
++ wdata[i].pdata_size = cdata->data->size;
++ } else {
++ /* points to the control data union */
++ wdata[i].pdata = cdata->chanv;
++ /*
++ * wdata[i].control->size is calculated with struct_size
++ * and includes the size of struct sof_ipc_ctrl_data
++ */
++ wdata[i].pdata_size = wdata[i].control->size -
++ sizeof(struct sof_ipc_ctrl_data);
++ }
+
+- *size += wdata[i].pdata->size;
++ *size += wdata[i].pdata_size;
+
+ /* get data type */
+ switch (cdata->cmd) {
+@@ -876,10 +887,12 @@ static int sof_process_load(struct snd_soc_component *scomp,
+ */
+ if (ipc_data_size) {
+ for (i = 0; i < widget->num_kcontrols; i++) {
+- memcpy(&process->data[offset],
+- wdata[i].pdata->data,
+- wdata[i].pdata->size);
+- offset += wdata[i].pdata->size;
++ if (!wdata[i].pdata_size)
++ continue;
++
++ memcpy(&process->data[offset], wdata[i].pdata,
++ wdata[i].pdata_size);
++ offset += wdata[i].pdata_size;
+ }
+ }
+
+@@ -1592,6 +1605,7 @@ static int sof_ipc3_control_load_bytes(struct snd_sof_dev *sdev, struct snd_sof_
+ if (scontrol->priv_size > 0) {
+ memcpy(cdata->data, scontrol->priv, scontrol->priv_size);
+ kfree(scontrol->priv);
++ scontrol->priv = NULL;
+
+ if (cdata->data->magic != SOF_ABI_MAGIC) {
+ dev_err(sdev->dev, "Wrong ABI magic 0x%08x.\n", cdata->data->magic);
+diff --git a/sound/soc/ti/j721e-evm.c b/sound/soc/ti/j721e-evm.c
+index 4077e15ec48b7..6a969874c9270 100644
+--- a/sound/soc/ti/j721e-evm.c
++++ b/sound/soc/ti/j721e-evm.c
+@@ -630,17 +630,18 @@ static int j721e_soc_probe_cpb(struct j721e_priv *priv, int *link_idx,
+ codec_node = of_parse_phandle(node, "ti,cpb-codec", 0);
+ if (!codec_node) {
+ dev_err(priv->dev, "CPB codec node is not provided\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_dai_node;
+ }
+
+ domain = &priv->audio_domains[J721E_AUDIO_DOMAIN_CPB];
+ ret = j721e_get_clocks(priv->dev, &domain->codec, "cpb-codec-scki");
+ if (ret)
+- return ret;
++ goto put_codec_node;
+
+ ret = j721e_get_clocks(priv->dev, &domain->mcasp, "cpb-mcasp-auxclk");
+ if (ret)
+- return ret;
++ goto put_codec_node;
+
+ /*
+ * Common Processor Board, two links
+@@ -650,8 +651,10 @@ static int j721e_soc_probe_cpb(struct j721e_priv *priv, int *link_idx,
+ comp_count = 6;
+ compnent = devm_kzalloc(priv->dev, comp_count * sizeof(*compnent),
+ GFP_KERNEL);
+- if (!compnent)
+- return -ENOMEM;
++ if (!compnent) {
++ ret = -ENOMEM;
++ goto put_codec_node;
++ }
+
+ comp_idx = 0;
+ priv->dai_links[*link_idx].cpus = &compnent[comp_idx++];
+@@ -702,6 +705,12 @@ static int j721e_soc_probe_cpb(struct j721e_priv *priv, int *link_idx,
+ (*conf_idx)++;
+
+ return 0;
++
++put_codec_node:
++ of_node_put(codec_node);
++put_dai_node:
++ of_node_put(dai_node);
++ return ret;
+ }
+
+ static int j721e_soc_probe_ivi(struct j721e_priv *priv, int *link_idx,
+@@ -726,23 +735,25 @@ static int j721e_soc_probe_ivi(struct j721e_priv *priv, int *link_idx,
+ codeca_node = of_parse_phandle(node, "ti,ivi-codec-a", 0);
+ if (!codeca_node) {
+ dev_err(priv->dev, "IVI codec-a node is not provided\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_dai_node;
+ }
+
+ codecb_node = of_parse_phandle(node, "ti,ivi-codec-b", 0);
+ if (!codecb_node) {
+ dev_warn(priv->dev, "IVI codec-b node is not provided\n");
+- return 0;
++ ret = 0;
++ goto put_codeca_node;
+ }
+
+ domain = &priv->audio_domains[J721E_AUDIO_DOMAIN_IVI];
+ ret = j721e_get_clocks(priv->dev, &domain->codec, "ivi-codec-scki");
+ if (ret)
+- return ret;
++ goto put_codecb_node;
+
+ ret = j721e_get_clocks(priv->dev, &domain->mcasp, "ivi-mcasp-auxclk");
+ if (ret)
+- return ret;
++ goto put_codecb_node;
+
+ /*
+ * IVI extension, two links
+@@ -754,8 +765,10 @@ static int j721e_soc_probe_ivi(struct j721e_priv *priv, int *link_idx,
+ comp_count = 8;
+ compnent = devm_kzalloc(priv->dev, comp_count * sizeof(*compnent),
+ GFP_KERNEL);
+- if (!compnent)
+- return -ENOMEM;
++ if (!compnent) {
++ ret = -ENOMEM;
++ goto put_codecb_node;
++ }
+
+ comp_idx = 0;
+ priv->dai_links[*link_idx].cpus = &compnent[comp_idx++];
+@@ -816,6 +829,15 @@ static int j721e_soc_probe_ivi(struct j721e_priv *priv, int *link_idx,
+ (*conf_idx)++;
+
+ return 0;
++
++
++put_codecb_node:
++ of_node_put(codecb_node);
++put_codeca_node:
++ of_node_put(codeca_node);
++put_dai_node:
++ of_node_put(dai_node);
++ return ret;
+ }
+
+ static int j721e_soc_probe(struct platform_device *pdev)
+diff --git a/sound/usb/implicit.c b/sound/usb/implicit.c
+index 2d444ec742029..e1bf1b5da423c 100644
+--- a/sound/usb/implicit.c
++++ b/sound/usb/implicit.c
+@@ -45,11 +45,6 @@ struct snd_usb_implicit_fb_match {
+
+ /* Implicit feedback quirk table for playback */
+ static const struct snd_usb_implicit_fb_match playback_implicit_fb_quirks[] = {
+- /* Generic matching */
+- IMPLICIT_FB_GENERIC_DEV(0x0499, 0x1509), /* Steinberg UR22 */
+- IMPLICIT_FB_GENERIC_DEV(0x0763, 0x2030), /* M-Audio Fast Track C400 */
+- IMPLICIT_FB_GENERIC_DEV(0x0763, 0x2031), /* M-Audio Fast Track C600 */
+-
+ /* Fixed EP */
+ /* FIXME: check the availability of generic matching */
+ IMPLICIT_FB_FIXED_DEV(0x0763, 0x2080, 0x81, 2), /* M-Audio FastTrack Ultra */
+@@ -350,7 +345,8 @@ static int audioformat_implicit_fb_quirk(struct snd_usb_audio *chip,
+ }
+
+ /* Try the generic implicit fb if available */
+- if (chip->generic_implicit_fb)
++ if (chip->generic_implicit_fb ||
++ (chip->quirk_flags & QUIRK_FLAG_GENERIC_IMPLICIT_FB))
+ return add_generic_implicit_fb(chip, fmt, alts);
+
+ /* No quirk */
+@@ -387,6 +383,8 @@ int snd_usb_parse_implicit_fb_quirk(struct snd_usb_audio *chip,
+ struct audioformat *fmt,
+ struct usb_host_interface *alts)
+ {
++ if (chip->quirk_flags & QUIRK_FLAG_SKIP_IMPLICIT_FB)
++ return 0;
+ if (fmt->endpoint & USB_DIR_IN)
+ return audioformat_capture_quirk(chip, fmt, alts);
+ else
+diff --git a/sound/usb/midi.c b/sound/usb/midi.c
+index 7c6ca2b433a53..344fbeadf161b 100644
+--- a/sound/usb/midi.c
++++ b/sound/usb/midi.c
+@@ -1145,6 +1145,9 @@ static int snd_usbmidi_output_open(struct snd_rawmidi_substream *substream)
+
+ static int snd_usbmidi_output_close(struct snd_rawmidi_substream *substream)
+ {
++ struct usbmidi_out_port *port = substream->runtime->private_data;
++
++ cancel_work_sync(&port->ep->work);
+ return substream_open(substream, 0, 0);
+ }
+
+diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
+index fbbe59054c3fb..e8468f9b007d1 100644
+--- a/sound/usb/quirks.c
++++ b/sound/usb/quirks.c
+@@ -1793,6 +1793,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_CTL_MSG_DELAY_1M | QUIRK_FLAG_IGNORE_CTL_ERROR),
+ DEVICE_FLG(0x046d, 0x09a4, /* Logitech QuickCam E 3500 */
+ QUIRK_FLAG_CTL_MSG_DELAY_1M | QUIRK_FLAG_IGNORE_CTL_ERROR),
++ DEVICE_FLG(0x0499, 0x1509, /* Steinberg UR22 */
++ QUIRK_FLAG_GENERIC_IMPLICIT_FB),
+ DEVICE_FLG(0x04d8, 0xfeea, /* Benchmark DAC1 Pre */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
+ DEVICE_FLG(0x04e8, 0xa051, /* Samsung USBC Headset (AKG) */
+@@ -1826,6 +1828,10 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_GET_SAMPLE_RATE),
+ DEVICE_FLG(0x074d, 0x3553, /* Outlaw RR2150 (Micronas UAC3553B) */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
++ DEVICE_FLG(0x0763, 0x2030, /* M-Audio Fast Track C400 */
++ QUIRK_FLAG_GENERIC_IMPLICIT_FB),
++ DEVICE_FLG(0x0763, 0x2031, /* M-Audio Fast Track C600 */
++ QUIRK_FLAG_GENERIC_IMPLICIT_FB),
+ DEVICE_FLG(0x08bb, 0x2702, /* LineX FM Transmitter */
+ QUIRK_FLAG_IGNORE_CTL_ERROR),
+ DEVICE_FLG(0x0951, 0x16ad, /* Kingston HyperX */
+diff --git a/sound/usb/usbaudio.h b/sound/usb/usbaudio.h
+index b8359a0aa008a..044cd7ab27cbb 100644
+--- a/sound/usb/usbaudio.h
++++ b/sound/usb/usbaudio.h
+@@ -164,6 +164,10 @@ extern bool snd_usb_skip_validation;
+ * Support generic DSD raw U32_BE format
+ * QUIRK_FLAG_SET_IFACE_FIRST:
+ * Set up the interface at first like UAC1
++ * QUIRK_FLAG_GENERIC_IMPLICIT_FB
++ * Apply the generic implicit feedback sync mode (same as implicit_fb=1 option)
++ * QUIRK_FLAG_SKIP_IMPLICIT_FB
++ * Don't apply implicit feedback sync mode
+ */
+
+ #define QUIRK_FLAG_GET_SAMPLE_RATE (1U << 0)
+@@ -183,5 +187,7 @@ extern bool snd_usb_skip_validation;
+ #define QUIRK_FLAG_IGNORE_CTL_ERROR (1U << 14)
+ #define QUIRK_FLAG_DSD_RAW (1U << 15)
+ #define QUIRK_FLAG_SET_IFACE_FIRST (1U << 16)
++#define QUIRK_FLAG_GENERIC_IMPLICIT_FB (1U << 17)
++#define QUIRK_FLAG_SKIP_IMPLICIT_FB (1U << 18)
+
+ #endif /* __USBAUDIO_H */
+diff --git a/tools/build/feature/test-libbpf-btf__load_from_kernel_by_id.c b/tools/build/feature/test-libbpf-btf__load_from_kernel_by_id.c
+index f7c084428735a..a17647f7d5a43 100644
+--- a/tools/build/feature/test-libbpf-btf__load_from_kernel_by_id.c
++++ b/tools/build/feature/test-libbpf-btf__load_from_kernel_by_id.c
+@@ -1,7 +1,8 @@
+ // SPDX-License-Identifier: GPL-2.0
+-#include <bpf/libbpf.h>
++#include <bpf/btf.h>
+
+ int main(void)
+ {
+- return btf__load_from_kernel_by_id(20151128, NULL);
++ btf__load_from_kernel_by_id(20151128);
++ return 0;
+ }
+diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
+index 809fe209cdcc0..881ea905ca815 100644
+--- a/tools/lib/bpf/libbpf.c
++++ b/tools/lib/bpf/libbpf.c
+@@ -4587,7 +4587,7 @@ static int probe_kern_probe_read_kernel(void)
+ };
+ int fd, insn_cnt = ARRAY_SIZE(insns);
+
+- fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL, "GPL", insns, insn_cnt, NULL);
++ fd = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, NULL);
+ return probe_fd(fd);
+ }
+
+@@ -5646,9 +5646,10 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
+ */
+ prog = NULL;
+ for (i = 0; i < obj->nr_programs; i++) {
+- prog = &obj->programs[i];
+- if (strcmp(prog->sec_name, sec_name) == 0)
++ if (strcmp(obj->programs[i].sec_name, sec_name) == 0) {
++ prog = &obj->programs[i];
+ break;
++ }
+ }
+ if (!prog) {
+ pr_warn("sec '%s': failed to find a BPF program\n", sec_name);
+@@ -5665,10 +5666,17 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
+ insn_idx = rec->insn_off / BPF_INSN_SZ;
+ prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx);
+ if (!prog) {
+- pr_warn("sec '%s': failed to find program at insn #%d for CO-RE offset relocation #%d\n",
+- sec_name, insn_idx, i);
+- err = -EINVAL;
+- goto out;
++ /* When __weak subprog is "overridden" by another instance
++ * of the subprog from a different object file, linker still
++ * appends all the .BTF.ext info that used to belong to that
++ * eliminated subprogram.
++ * This is similar to what x86-64 linker does for relocations.
++ * So just ignore such relocations just like we ignore
++ * subprog instructions when discovering subprograms.
++ */
++ pr_debug("sec '%s': skipping CO-RE relocation #%d for insn #%d belonging to eliminated weak subprogram\n",
++ sec_name, i, insn_idx);
++ continue;
+ }
+ /* no need to apply CO-RE relocation if the program is
+ * not going to be loaded
+diff --git a/tools/objtool/check.c b/tools/objtool/check.c
+index ca5b746030089..8a0971a620f09 100644
+--- a/tools/objtool/check.c
++++ b/tools/objtool/check.c
+@@ -5,6 +5,7 @@
+
+ #include <string.h>
+ #include <stdlib.h>
++#include <inttypes.h>
+ #include <sys/mman.h>
+
+ #include <arch/elf.h>
+@@ -560,12 +561,12 @@ static int add_dead_ends(struct objtool_file *file)
+ else if (reloc->addend == reloc->sym->sec->sh.sh_size) {
+ insn = find_last_insn(file, reloc->sym->sec);
+ if (!insn) {
+- WARN("can't find unreachable insn at %s+0x%lx",
++ WARN("can't find unreachable insn at %s+0x%" PRIx64,
+ reloc->sym->sec->name, reloc->addend);
+ return -1;
+ }
+ } else {
+- WARN("can't find unreachable insn at %s+0x%lx",
++ WARN("can't find unreachable insn at %s+0x%" PRIx64,
+ reloc->sym->sec->name, reloc->addend);
+ return -1;
+ }
+@@ -595,12 +596,12 @@ reachable:
+ else if (reloc->addend == reloc->sym->sec->sh.sh_size) {
+ insn = find_last_insn(file, reloc->sym->sec);
+ if (!insn) {
+- WARN("can't find reachable insn at %s+0x%lx",
++ WARN("can't find reachable insn at %s+0x%" PRIx64,
+ reloc->sym->sec->name, reloc->addend);
+ return -1;
+ }
+ } else {
+- WARN("can't find reachable insn at %s+0x%lx",
++ WARN("can't find reachable insn at %s+0x%" PRIx64,
+ reloc->sym->sec->name, reloc->addend);
+ return -1;
+ }
+diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
+index ebf2ba5755c1e..e84cf15b8c11b 100644
+--- a/tools/objtool/elf.c
++++ b/tools/objtool/elf.c
+@@ -374,6 +374,9 @@ static void elf_add_symbol(struct elf *elf, struct symbol *sym)
+ struct list_head *entry;
+ struct rb_node *pnode;
+
++ INIT_LIST_HEAD(&sym->pv_target);
++ sym->alias = sym;
++
+ sym->type = GELF_ST_TYPE(sym->sym.st_info);
+ sym->bind = GELF_ST_BIND(sym->sym.st_info);
+
+@@ -435,8 +438,6 @@ static int read_symbols(struct elf *elf)
+ return -1;
+ }
+ memset(sym, 0, sizeof(*sym));
+- INIT_LIST_HEAD(&sym->pv_target);
+- sym->alias = sym;
+
+ sym->idx = i;
+
+@@ -546,7 +547,7 @@ static struct section *elf_create_reloc_section(struct elf *elf,
+ int reltype);
+
+ int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+- unsigned int type, struct symbol *sym, long addend)
++ unsigned int type, struct symbol *sym, s64 addend)
+ {
+ struct reloc *reloc;
+
+@@ -600,24 +601,21 @@ static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
+ }
+
+ /*
+- * Move the first global symbol, as per sh_info, into a new, higher symbol
+- * index. This fees up the shndx for a new local symbol.
++ * The libelf API is terrible; gelf_update_sym*() takes a data block relative
++ * index value, *NOT* the symbol index. As such, iterate the data blocks and
++ * adjust index until it fits.
++ *
++ * If no data block is found, allow adding a new data block provided the index
++ * is only one past the end.
+ */
+-static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
+- struct section *symtab_shndx)
++static int elf_update_symbol(struct elf *elf, struct section *symtab,
++ struct section *symtab_shndx, struct symbol *sym)
+ {
+- Elf_Data *data, *shndx_data = NULL;
+- Elf32_Word first_non_local;
+- struct symbol *sym;
+- Elf_Scn *s;
+-
+- first_non_local = symtab->sh.sh_info;
+-
+- sym = find_symbol_by_index(elf, first_non_local);
+- if (!sym) {
+- WARN("no non-local symbols !?");
+- return first_non_local;
+- }
++ Elf32_Word shndx = sym->sec ? sym->sec->idx : SHN_UNDEF;
++ Elf_Data *symtab_data = NULL, *shndx_data = NULL;
++ Elf64_Xword entsize = symtab->sh.sh_entsize;
++ int max_idx, idx = sym->idx;
++ Elf_Scn *s, *t = NULL;
+
+ s = elf_getscn(elf->elf, symtab->idx);
+ if (!s) {
+@@ -625,79 +623,124 @@ static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
+ return -1;
+ }
+
+- data = elf_newdata(s);
+- if (!data) {
+- WARN_ELF("elf_newdata");
+- return -1;
++ if (symtab_shndx) {
++ t = elf_getscn(elf->elf, symtab_shndx->idx);
++ if (!t) {
++ WARN_ELF("elf_getscn");
++ return -1;
++ }
+ }
+
+- data->d_buf = &sym->sym;
+- data->d_size = sizeof(sym->sym);
+- data->d_align = 1;
+- data->d_type = ELF_T_SYM;
++ for (;;) {
++ /* get next data descriptor for the relevant sections */
++ symtab_data = elf_getdata(s, symtab_data);
++ if (t)
++ shndx_data = elf_getdata(t, shndx_data);
+
+- sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
+- elf_dirty_reloc_sym(elf, sym);
++ /* end-of-list */
++ if (!symtab_data) {
++ void *buf;
+
+- symtab->sh.sh_info += 1;
+- symtab->sh.sh_size += data->d_size;
+- symtab->changed = true;
++ if (idx) {
++ /* we don't do holes in symbol tables */
++ WARN("index out of range");
++ return -1;
++ }
+
+- if (symtab_shndx) {
+- s = elf_getscn(elf->elf, symtab_shndx->idx);
+- if (!s) {
+- WARN_ELF("elf_getscn");
++ /* if @idx == 0, it's the next contiguous entry, create it */
++ symtab_data = elf_newdata(s);
++ if (t)
++ shndx_data = elf_newdata(t);
++
++ buf = calloc(1, entsize);
++ if (!buf) {
++ WARN("malloc");
++ return -1;
++ }
++
++ symtab_data->d_buf = buf;
++ symtab_data->d_size = entsize;
++ symtab_data->d_align = 1;
++ symtab_data->d_type = ELF_T_SYM;
++
++ symtab->sh.sh_size += entsize;
++ symtab->changed = true;
++
++ if (t) {
++ shndx_data->d_buf = &sym->sec->idx;
++ shndx_data->d_size = sizeof(Elf32_Word);
++ shndx_data->d_align = sizeof(Elf32_Word);
++ shndx_data->d_type = ELF_T_WORD;
++
++ symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
++ symtab_shndx->changed = true;
++ }
++
++ break;
++ }
++
++ /* empty blocks should not happen */
++ if (!symtab_data->d_size) {
++ WARN("zero size data");
+ return -1;
+ }
+
+- shndx_data = elf_newdata(s);
++ /* is this the right block? */
++ max_idx = symtab_data->d_size / entsize;
++ if (idx < max_idx)
++ break;
++
++ /* adjust index and try again */
++ idx -= max_idx;
++ }
++
++ /* something went side-ways */
++ if (idx < 0) {
++ WARN("negative index");
++ return -1;
++ }
++
++ /* setup extended section index magic and write the symbol */
++ if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
++ sym->sym.st_shndx = shndx;
++ if (!shndx_data)
++ shndx = 0;
++ } else {
++ sym->sym.st_shndx = SHN_XINDEX;
+ if (!shndx_data) {
+- WARN_ELF("elf_newshndx_data");
++ WARN("no .symtab_shndx");
+ return -1;
+ }
++ }
+
+- shndx_data->d_buf = &sym->sec->idx;
+- shndx_data->d_size = sizeof(Elf32_Word);
+- shndx_data->d_align = 4;
+- shndx_data->d_type = ELF_T_WORD;
+-
+- symtab_shndx->sh.sh_size += 4;
+- symtab_shndx->changed = true;
++ if (!gelf_update_symshndx(symtab_data, shndx_data, idx, &sym->sym, shndx)) {
++ WARN_ELF("gelf_update_symshndx");
++ return -1;
+ }
+
+- return first_non_local;
++ return 0;
+ }
+
+ static struct symbol *
+ elf_create_section_symbol(struct elf *elf, struct section *sec)
+ {
+ struct section *symtab, *symtab_shndx;
+- Elf_Data *shndx_data = NULL;
+- struct symbol *sym;
+- Elf32_Word shndx;
++ Elf32_Word first_non_local, new_idx;
++ struct symbol *sym, *old;
+
+ symtab = find_section_by_name(elf, ".symtab");
+ if (symtab) {
+ symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
+- if (symtab_shndx)
+- shndx_data = symtab_shndx->data;
+ } else {
+ WARN("no .symtab");
+ return NULL;
+ }
+
+- sym = malloc(sizeof(*sym));
++ sym = calloc(1, sizeof(*sym));
+ if (!sym) {
+ perror("malloc");
+ return NULL;
+ }
+- memset(sym, 0, sizeof(*sym));
+-
+- sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
+- if (sym->idx < 0) {
+- WARN("elf_move_global_symbol");
+- return NULL;
+- }
+
+ sym->name = sec->name;
+ sym->sec = sec;
+@@ -707,24 +750,41 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
+ // st_other 0
+ // st_value 0
+ // st_size 0
+- shndx = sec->idx;
+- if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+- sym->sym.st_shndx = shndx;
+- if (!shndx_data)
+- shndx = 0;
+- } else {
+- sym->sym.st_shndx = SHN_XINDEX;
+- if (!shndx_data) {
+- WARN("no .symtab_shndx");
++
++ /*
++ * Move the first global symbol, as per sh_info, into a new, higher
++ * symbol index. This fees up a spot for a new local symbol.
++ */
++ first_non_local = symtab->sh.sh_info;
++ new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
++ old = find_symbol_by_index(elf, first_non_local);
++ if (old) {
++ old->idx = new_idx;
++
++ hlist_del(&old->hash);
++ elf_hash_add(symbol, &old->hash, old->idx);
++
++ elf_dirty_reloc_sym(elf, old);
++
++ if (elf_update_symbol(elf, symtab, symtab_shndx, old)) {
++ WARN("elf_update_symbol move");
+ return NULL;
+ }
++
++ new_idx = first_non_local;
+ }
+
+- if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
+- WARN_ELF("gelf_update_symshndx");
++ sym->idx = new_idx;
++ if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
++ WARN("elf_update_symbol");
+ return NULL;
+ }
+
++ /*
++ * Either way, we added a LOCAL symbol.
++ */
++ symtab->sh.sh_info += 1;
++
+ elf_add_symbol(elf, sym);
+
+ return sym;
+diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
+index 9b36802ed86f6..82e57eb4b4c5d 100644
+--- a/tools/objtool/include/objtool/elf.h
++++ b/tools/objtool/include/objtool/elf.h
+@@ -73,7 +73,7 @@ struct reloc {
+ struct symbol *sym;
+ unsigned long offset;
+ unsigned int type;
+- long addend;
++ s64 addend;
+ int idx;
+ bool jump_table_start;
+ };
+@@ -135,7 +135,7 @@ struct elf *elf_open_read(const char *name, int flags);
+ struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
+
+ int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+- unsigned int type, struct symbol *sym, long addend);
++ unsigned int type, struct symbol *sym, s64 addend);
+ int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+ unsigned long offset, unsigned int type,
+ struct section *insn_sec, unsigned long insn_off);
+diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
+index 1bd64e7404b9f..c38423807d010 100644
+--- a/tools/perf/Makefile.config
++++ b/tools/perf/Makefile.config
+@@ -239,18 +239,33 @@ ifdef PARSER_DEBUG
+ endif
+
+ # Try different combinations to accommodate systems that only have
+-# python[2][-config] in weird combinations but always preferring
+-# python2 and python2-config as per pep-0394. If python2 or python
+-# aren't found, then python3 is used.
+-PYTHON_AUTO := python
+-PYTHON_AUTO := $(if $(call get-executable,python3),python3,$(PYTHON_AUTO))
+-PYTHON_AUTO := $(if $(call get-executable,python),python,$(PYTHON_AUTO))
+-PYTHON_AUTO := $(if $(call get-executable,python2),python2,$(PYTHON_AUTO))
+-override PYTHON := $(call get-executable-or-default,PYTHON,$(PYTHON_AUTO))
+-PYTHON_AUTO_CONFIG := \
+- $(if $(call get-executable,$(PYTHON)-config),$(PYTHON)-config,python-config)
+-override PYTHON_CONFIG := \
+- $(call get-executable-or-default,PYTHON_CONFIG,$(PYTHON_AUTO_CONFIG))
++# python[2][3]-config in weird combinations in the following order of
++# priority from lowest to highest:
++# * python3-config
++# * python-config
++# * python2-config as per pep-0394.
++# * $(PYTHON)-config (If PYTHON is user supplied but PYTHON_CONFIG isn't)
++#
++PYTHON_AUTO := python-config
++PYTHON_AUTO := $(if $(call get-executable,python3-config),python3-config,$(PYTHON_AUTO))
++PYTHON_AUTO := $(if $(call get-executable,python-config),python-config,$(PYTHON_AUTO))
++PYTHON_AUTO := $(if $(call get-executable,python2-config),python2-config,$(PYTHON_AUTO))
++
++# If PYTHON is defined but PYTHON_CONFIG isn't, then take $(PYTHON)-config as if it was the user
++# supplied value for PYTHON_CONFIG. Because it's "user supplied", error out if it doesn't exist.
++ifdef PYTHON
++ ifndef PYTHON_CONFIG
++ PYTHON_CONFIG_AUTO := $(call get-executable,$(PYTHON)-config)
++ PYTHON_CONFIG := $(if $(PYTHON_CONFIG_AUTO),$(PYTHON_CONFIG_AUTO),\
++ $(call $(error $(PYTHON)-config not found)))
++ endif
++endif
++
++# Select either auto detected python and python-config or use user supplied values if they are
++# defined. get-executable-or-default fails with an error if the first argument is supplied but
++# doesn't exist.
++override PYTHON_CONFIG := $(call get-executable-or-default,PYTHON_CONFIG,$(PYTHON_AUTO))
++override PYTHON := $(call get-executable-or-default,PYTHON,$(subst -config,,$(PYTHON_AUTO)))
+
+ grep-libs = $(filter -l%,$(1))
+ strip-libs = $(filter-out -l%,$(1))
+diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
+index cfc208d71f00a..75564a7df15be 100644
+--- a/tools/perf/arch/x86/util/evlist.c
++++ b/tools/perf/arch/x86/util/evlist.c
+@@ -36,7 +36,7 @@ struct evsel *arch_evlist__leader(struct list_head *list)
+ if (slots == first)
+ return first;
+ }
+- if (!strncasecmp(evsel->name, "topdown", 7))
++ if (strcasestr(evsel->name, "topdown"))
+ has_topdown = true;
+ if (slots && has_topdown)
+ return slots;
+diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
+index ac2899a25b7a3..0c9e56ab07b5b 100644
+--- a/tools/perf/arch/x86/util/evsel.c
++++ b/tools/perf/arch/x86/util/evsel.c
+@@ -3,6 +3,7 @@
+ #include <stdlib.h>
+ #include "util/evsel.h"
+ #include "util/env.h"
++#include "util/pmu.h"
+ #include "linux/string.h"
+
+ void arch_evsel__set_sample_weight(struct evsel *evsel)
+@@ -29,3 +30,14 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
+
+ free(env.cpuid);
+ }
++
++bool arch_evsel__must_be_in_group(const struct evsel *evsel)
++{
++ if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
++ !pmu_have_event("cpu", "slots"))
++ return false;
++
++ return evsel->name &&
++ (strcasestr(evsel->name, "slots") ||
++ strcasestr(evsel->name, "topdown"));
++}
+diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
+index fbbed434014f4..8c9ffacbdd281 100644
+--- a/tools/perf/builtin-c2c.c
++++ b/tools/perf/builtin-c2c.c
+@@ -2735,9 +2735,7 @@ static int perf_c2c__report(int argc, const char **argv)
+ "the input file to process"),
+ OPT_INCR('N', "node-info", &c2c.node_info,
+ "show extra node info in report (repeat for more info)"),
+-#ifdef HAVE_SLANG_SUPPORT
+ OPT_BOOLEAN(0, "stdio", &c2c.use_stdio, "Use the stdio interface"),
+-#endif
+ OPT_BOOLEAN(0, "stats", &c2c.stats_only,
+ "Display only statistic tables (implies --stdio)"),
+ OPT_BOOLEAN(0, "full-symbols", &c2c.symbol_full,
+@@ -2767,6 +2765,10 @@ static int perf_c2c__report(int argc, const char **argv)
+ if (argc)
+ usage_with_options(report_c2c_usage, options);
+
++#ifndef HAVE_SLANG_SUPPORT
++ c2c.use_stdio = true;
++#endif
++
+ if (c2c.stats_only)
+ c2c.use_stdio = true;
+
+diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
+index a96f106dc93a0..f058e8cddfa85 100644
+--- a/tools/perf/builtin-stat.c
++++ b/tools/perf/builtin-stat.c
+@@ -271,11 +271,8 @@ static void evlist__check_cpu_maps(struct evlist *evlist)
+ pr_warning(" %s: %s\n", evsel->name, buf);
+ }
+
+- for_each_group_evsel(pos, leader) {
+- evsel__set_leader(pos, pos);
+- pos->core.nr_members = 0;
+- }
+- evsel->core.leader->nr_members = 0;
++ for_each_group_evsel(pos, leader)
++ evsel__remove_from_group(pos, leader);
+ }
+ }
+
+diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
+index 159d9eab6e799..b1eb68c861e7a 100644
+--- a/tools/perf/pmu-events/jevents.c
++++ b/tools/perf/pmu-events/jevents.c
+@@ -612,7 +612,7 @@ static int json_events(const char *fn,
+ } else if (json_streq(map, field, "ExtSel")) {
+ char *code = NULL;
+ addfield(map, &code, "", "", val);
+- eventcode |= strtoul(code, NULL, 0) << 21;
++ eventcode |= strtoul(code, NULL, 0) << 8;
+ free(code);
+ } else if (json_streq(map, field, "EventName")) {
+ addfield(map, &je.name, "", "", val);
+diff --git a/tools/perf/util/data.h b/tools/perf/util/data.h
+index c9de82af5584e..1402d9657ef27 100644
+--- a/tools/perf/util/data.h
++++ b/tools/perf/util/data.h
+@@ -4,6 +4,7 @@
+
+ #include <stdio.h>
+ #include <stdbool.h>
++#include <linux/types.h>
+
+ enum perf_data_mode {
+ PERF_DATA_MODE_WRITE,
+diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
+index 52ea004ba01e5..3084ec7e93254 100644
+--- a/tools/perf/util/evlist.c
++++ b/tools/perf/util/evlist.c
+@@ -1790,8 +1790,13 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
+ if (evsel__has_leader(c2, leader)) {
+ if (is_open && close)
+ perf_evsel__close(&c2->core);
+- evsel__set_leader(c2, c2);
+- c2->core.nr_members = 0;
++ /*
++ * We want to close all members of the group and reopen
++ * them. Some events, like Intel topdown, require being
++ * in a group and so keep these in the group.
++ */
++ evsel__remove_from_group(c2, leader);
++
+ /*
+ * Set this for all former members of the group
+ * to indicate they get reopened.
+@@ -1799,6 +1804,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel *
+ c2->reset_group = true;
+ }
+ }
++ /* Reset the leader count if all entries were removed. */
++ if (leader->core.nr_members == 1)
++ leader->core.nr_members = 0;
+ return leader;
+ }
+
+diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
+index 2a1729e7aee46..deb428ee5e509 100644
+--- a/tools/perf/util/evsel.c
++++ b/tools/perf/util/evsel.c
+@@ -3077,3 +3077,22 @@ int evsel__source_count(const struct evsel *evsel)
+ }
+ return count;
+ }
++
++bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
++{
++ return false;
++}
++
++/*
++ * Remove an event from a given group (leader).
++ * Some events, e.g., perf metrics Topdown events,
++ * must always be grouped. Ignore the events.
++ */
++void evsel__remove_from_group(struct evsel *evsel, struct evsel *leader)
++{
++ if (!arch_evsel__must_be_in_group(evsel) && evsel != leader) {
++ evsel__set_leader(evsel, evsel);
++ evsel->core.nr_members = 0;
++ leader->core.nr_members--;
++ }
++}
+diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
+index 041b42d33bf5a..47f65f8e7c749 100644
+--- a/tools/perf/util/evsel.h
++++ b/tools/perf/util/evsel.h
+@@ -483,6 +483,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
+ bool evsel__is_leader(struct evsel *evsel);
+ void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
+ int evsel__source_count(const struct evsel *evsel);
++void evsel__remove_from_group(struct evsel *evsel, struct evsel *leader);
++
++bool arch_evsel__must_be_in_group(const struct evsel *evsel);
+
+ /*
+ * Macro to swap the bit-field postition and size.
+diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
+index bc5ae0872fed9..babede4486dec 100644
+--- a/tools/power/x86/turbostat/turbostat.c
++++ b/tools/power/x86/turbostat/turbostat.c
+@@ -4376,6 +4376,7 @@ static double rapl_dram_energy_units_probe(int model, double rapl_energy_units)
+ case INTEL_FAM6_BROADWELL_X: /* BDX */
+ case INTEL_FAM6_SKYLAKE_X: /* SKX */
+ case INTEL_FAM6_XEON_PHI_KNL: /* KNL */
++ case INTEL_FAM6_ICELAKE_X: /* ICX */
+ return (rapl_dram_energy_units = 15.3 / 1000000);
+ default:
+ return (rapl_energy_units);
+diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
+index 05ff334761dd0..2f93ed1d7f990 100644
+--- a/tools/testing/kunit/kunit_parser.py
++++ b/tools/testing/kunit/kunit_parser.py
+@@ -789,8 +789,11 @@ def parse_test(lines: LineStream, expected_num: int, log: List[str]) -> Test:
+
+ # Check for there being no tests
+ if parent_test and len(subtests) == 0:
+- test.status = TestStatus.NO_TESTS
+- test.add_error('0 tests run!')
++ # Don't override a bad status if this test had one reported.
++ # Assumption: no subtests means CRASHED is from Test.__init__()
++ if test.status in (TestStatus.TEST_CRASHED, TestStatus.SUCCESS):
++ test.status = TestStatus.NO_TESTS
++ test.add_error('0 tests run!')
+
+ # Add statuses to TestCounts attribute in Test object
+ bubble_up_test_results(test)
+diff --git a/tools/testing/kunit/test_data/test_is_test_passed-no_tests_no_plan.log b/tools/testing/kunit/test_data/test_is_test_passed-no_tests_no_plan.log
+index dd873c9811086..4f81876ee6f18 100644
+--- a/tools/testing/kunit/test_data/test_is_test_passed-no_tests_no_plan.log
++++ b/tools/testing/kunit/test_data/test_is_test_passed-no_tests_no_plan.log
+@@ -3,5 +3,5 @@ TAP version 14
+ # Subtest: suite
+ 1..1
+ # Subtest: case
+- ok 1 - case # SKIP
++ ok 1 - case
+ ok 1 - suite
+diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
+index 2319ec87f53d6..bd2ac8b3bf1f5 100644
+--- a/tools/testing/selftests/Makefile
++++ b/tools/testing/selftests/Makefile
+@@ -9,6 +9,7 @@ TARGETS += clone3
+ TARGETS += core
+ TARGETS += cpufreq
+ TARGETS += cpu-hotplug
++TARGETS += damon
+ TARGETS += drivers/dma-buf
+ TARGETS += efivarfs
+ TARGETS += exec
+diff --git a/tools/testing/selftests/arm64/bti/Makefile b/tools/testing/selftests/arm64/bti/Makefile
+index 73e013c082a65..dafa1c2aa5c47 100644
+--- a/tools/testing/selftests/arm64/bti/Makefile
++++ b/tools/testing/selftests/arm64/bti/Makefile
+@@ -39,7 +39,7 @@ BTI_OBJS = \
+ teststubs-bti.o \
+ trampoline-bti.o
+ gen/btitest: $(BTI_OBJS)
+- $(CC) $(CFLAGS_BTI) $(CFLAGS_COMMON) -nostdlib -o $@ $^
++ $(CC) $(CFLAGS_BTI) $(CFLAGS_COMMON) -nostdlib -static -o $@ $^
+
+ NOBTI_OBJS = \
+ test-nobti.o \
+@@ -50,7 +50,7 @@ NOBTI_OBJS = \
+ teststubs-nobti.o \
+ trampoline-nobti.o
+ gen/nobtitest: $(NOBTI_OBJS)
+- $(CC) $(CFLAGS_BTI) $(CFLAGS_COMMON) -nostdlib -o $@ $^
++ $(CC) $(CFLAGS_BTI) $(CFLAGS_COMMON) -nostdlib -static -o $@ $^
+
+ # Including KSFT lib.mk here will also mangle the TEST_GEN_PROGS list
+ # to account for any OUTPUT target-dirs optionally provided by
+diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
+index 3820608faf57f..6e2383701ce0b 100644
+--- a/tools/testing/selftests/bpf/Makefile
++++ b/tools/testing/selftests/bpf/Makefile
+@@ -75,7 +75,7 @@ TEST_PROGS := test_kmod.sh \
+ test_xsk.sh
+
+ TEST_PROGS_EXTENDED := with_addr.sh \
+- with_tunnels.sh \
++ with_tunnels.sh ima_setup.sh \
+ test_xdp_vlan.sh test_bpftool.py
+
+ # Compile but not part of 'make run_tests'
+@@ -415,11 +415,11 @@ $(TRUNNER_BPF_SKELS): %.skel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
+
+ $(TRUNNER_BPF_LSKELS): %.lskel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
+ $$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@)
+- $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked1.o) $$<
+- $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked2.o) $$(<:.o=.linked1.o)
+- $(Q)$$(BPFTOOL) gen object $$(<:.o=.linked3.o) $$(<:.o=.linked2.o)
+- $(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o)
+- $(Q)$$(BPFTOOL) gen skeleton -L $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=_lskel)) > $$@
++ $(Q)$$(BPFTOOL) gen object $$(<:.o=.llinked1.o) $$<
++ $(Q)$$(BPFTOOL) gen object $$(<:.o=.llinked2.o) $$(<:.o=.llinked1.o)
++ $(Q)$$(BPFTOOL) gen object $$(<:.o=.llinked3.o) $$(<:.o=.llinked2.o)
++ $(Q)diff $$(<:.o=.llinked2.o) $$(<:.o=.llinked3.o)
++ $(Q)$$(BPFTOOL) gen skeleton -L $$(<:.o=.llinked3.o) name $$(notdir $$(<:.o=_lskel)) > $$@
+
+ $(TRUNNER_BPF_SKELS_LINKED): $(TRUNNER_BPF_OBJS) $(BPFTOOL) | $(TRUNNER_OUTPUT)
+ $$(call msg,LINK-BPF,$(TRUNNER_BINARY),$$(@:.skel.h=.o))
+diff --git a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c
+index 9c795ee52b7bf..b0acbda6dbf5e 100644
+--- a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c
++++ b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c
+@@ -1,126 +1,94 @@
+ // SPDX-License-Identifier: GPL-2.0-only
+ #define _GNU_SOURCE
+-#include <sched.h>
+-#include <sys/prctl.h>
+ #include <test_progs.h>
+
+ #define MAX_TRAMP_PROGS 38
+
+ struct inst {
+ struct bpf_object *obj;
+- struct bpf_link *link_fentry;
+- struct bpf_link *link_fexit;
++ struct bpf_link *link;
+ };
+
+-static int test_task_rename(void)
+-{
+- int fd, duration = 0, err;
+- char buf[] = "test_overhead";
+-
+- fd = open("/proc/self/comm", O_WRONLY|O_TRUNC);
+- if (CHECK(fd < 0, "open /proc", "err %d", errno))
+- return -1;
+- err = write(fd, buf, sizeof(buf));
+- if (err < 0) {
+- CHECK(err < 0, "task rename", "err %d", errno);
+- close(fd);
+- return -1;
+- }
+- close(fd);
+- return 0;
+-}
+-
+-static struct bpf_link *load(struct bpf_object *obj, const char *name)
++static struct bpf_program *load_prog(char *file, char *name, struct inst *inst)
+ {
++ struct bpf_object *obj;
+ struct bpf_program *prog;
+- int duration = 0;
++ int err;
++
++ obj = bpf_object__open_file(file, NULL);
++ if (!ASSERT_OK_PTR(obj, "obj_open_file"))
++ return NULL;
++
++ inst->obj = obj;
++
++ err = bpf_object__load(obj);
++ if (!ASSERT_OK(err, "obj_load"))
++ return NULL;
+
+ prog = bpf_object__find_program_by_name(obj, name);
+- if (CHECK(!prog, "find_probe", "prog '%s' not found\n", name))
+- return ERR_PTR(-EINVAL);
+- return bpf_program__attach_trace(prog);
++ if (!ASSERT_OK_PTR(prog, "obj_find_prog"))
++ return NULL;
++
++ return prog;
+ }
+
+ /* TODO: use different target function to run in concurrent mode */
+ void serial_test_trampoline_count(void)
+ {
+- const char *fentry_name = "prog1";
+- const char *fexit_name = "prog2";
+- const char *object = "test_trampoline_count.o";
+- struct inst inst[MAX_TRAMP_PROGS] = {};
+- int err, i = 0, duration = 0;
+- struct bpf_object *obj;
++ char *file = "test_trampoline_count.o";
++ char *const progs[] = { "fentry_test", "fmod_ret_test", "fexit_test" };
++ struct inst inst[MAX_TRAMP_PROGS + 1] = {};
++ struct bpf_program *prog;
+ struct bpf_link *link;
+- char comm[16] = {};
++ int prog_fd, err, i;
++ LIBBPF_OPTS(bpf_test_run_opts, opts);
+
+ /* attach 'allowed' trampoline programs */
+ for (i = 0; i < MAX_TRAMP_PROGS; i++) {
+- obj = bpf_object__open_file(object, NULL);
+- if (!ASSERT_OK_PTR(obj, "obj_open_file")) {
+- obj = NULL;
++ prog = load_prog(file, progs[i % ARRAY_SIZE(progs)], &inst[i]);
++ if (!prog)
+ goto cleanup;
+- }
+
+- err = bpf_object__load(obj);
+- if (CHECK(err, "obj_load", "err %d\n", err))
++ link = bpf_program__attach(prog);
++ if (!ASSERT_OK_PTR(link, "attach_prog"))
+ goto cleanup;
+- inst[i].obj = obj;
+- obj = NULL;
+-
+- if (rand() % 2) {
+- link = load(inst[i].obj, fentry_name);
+- if (!ASSERT_OK_PTR(link, "attach_prog")) {
+- link = NULL;
+- goto cleanup;
+- }
+- inst[i].link_fentry = link;
+- } else {
+- link = load(inst[i].obj, fexit_name);
+- if (!ASSERT_OK_PTR(link, "attach_prog")) {
+- link = NULL;
+- goto cleanup;
+- }
+- inst[i].link_fexit = link;
+- }
++
++ inst[i].link = link;
+ }
+
+ /* and try 1 extra.. */
+- obj = bpf_object__open_file(object, NULL);
+- if (!ASSERT_OK_PTR(obj, "obj_open_file")) {
+- obj = NULL;
++ prog = load_prog(file, "fmod_ret_test", &inst[i]);
++ if (!prog)
+ goto cleanup;
+- }
+-
+- err = bpf_object__load(obj);
+- if (CHECK(err, "obj_load", "err %d\n", err))
+- goto cleanup_extra;
+
+ /* ..that needs to fail */
+- link = load(obj, fentry_name);
+- err = libbpf_get_error(link);
+- if (!ASSERT_ERR_PTR(link, "cannot attach over the limit")) {
+- bpf_link__destroy(link);
+- goto cleanup_extra;
++ link = bpf_program__attach(prog);
++ if (!ASSERT_ERR_PTR(link, "attach_prog")) {
++ inst[i].link = link;
++ goto cleanup;
+ }
+
+ /* with E2BIG error */
+- ASSERT_EQ(err, -E2BIG, "proper error check");
+- ASSERT_EQ(link, NULL, "ptr_is_null");
++ if (!ASSERT_EQ(libbpf_get_error(link), -E2BIG, "E2BIG"))
++ goto cleanup;
++ if (!ASSERT_EQ(link, NULL, "ptr_is_null"))
++ goto cleanup;
+
+ /* and finaly execute the probe */
+- if (CHECK_FAIL(prctl(PR_GET_NAME, comm, 0L, 0L, 0L)))
+- goto cleanup_extra;
+- CHECK_FAIL(test_task_rename());
+- CHECK_FAIL(prctl(PR_SET_NAME, comm, 0L, 0L, 0L));
++ prog_fd = bpf_program__fd(prog);
++ if (!ASSERT_GE(prog_fd, 0, "bpf_program__fd"))
++ goto cleanup;
++
++ err = bpf_prog_test_run_opts(prog_fd, &opts);
++ if (!ASSERT_OK(err, "bpf_prog_test_run_opts"))
++ goto cleanup;
++
++ ASSERT_EQ(opts.retval & 0xffff, 4, "bpf_modify_return_test.result");
++ ASSERT_EQ(opts.retval >> 16, 1, "bpf_modify_return_test.side_effect");
+
+-cleanup_extra:
+- bpf_object__close(obj);
+ cleanup:
+- if (i >= MAX_TRAMP_PROGS)
+- i = MAX_TRAMP_PROGS - 1;
+ for (; i >= 0; i--) {
+- bpf_link__destroy(inst[i].link_fentry);
+- bpf_link__destroy(inst[i].link_fexit);
++ bpf_link__destroy(inst[i].link);
+ bpf_object__close(inst[i].obj);
+ }
+ }
+diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c
+index 1c7105fcae3c4..4ee4748133fec 100644
+--- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c
++++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c
+@@ -94,7 +94,7 @@ typedef void (* (*signal_t)(int, void (*)(int)))(int);
+
+ typedef char * (*fn_ptr_arr1_t[10])(int **);
+
+-typedef char * (* const (* const fn_ptr_arr2_t[5])())(char * (*)(int));
++typedef char * (* (* const fn_ptr_arr2_t[5])())(char * (*)(int));
+
+ struct struct_w_typedefs {
+ int_t a;
+diff --git a/tools/testing/selftests/bpf/progs/profiler.inc.h b/tools/testing/selftests/bpf/progs/profiler.inc.h
+index 4896fdf816f73..92331053dba3b 100644
+--- a/tools/testing/selftests/bpf/progs/profiler.inc.h
++++ b/tools/testing/selftests/bpf/progs/profiler.inc.h
+@@ -826,8 +826,9 @@ out:
+
+ SEC("kprobe/vfs_link")
+ int BPF_KPROBE(kprobe__vfs_link,
+- struct dentry* old_dentry, struct inode* dir,
+- struct dentry* new_dentry, struct inode** delegated_inode)
++ struct dentry* old_dentry, struct user_namespace *mnt_userns,
++ struct inode* dir, struct dentry* new_dentry,
++ struct inode** delegated_inode)
+ {
+ struct bpf_func_stats_ctx stats_ctx;
+ bpf_stats_enter(&stats_ctx, profiler_bpf_vfs_link);
+diff --git a/tools/testing/selftests/bpf/progs/test_trampoline_count.c b/tools/testing/selftests/bpf/progs/test_trampoline_count.c
+index f030e469d05b5..7765720da7d58 100644
+--- a/tools/testing/selftests/bpf/progs/test_trampoline_count.c
++++ b/tools/testing/selftests/bpf/progs/test_trampoline_count.c
+@@ -1,20 +1,22 @@
+ // SPDX-License-Identifier: GPL-2.0
+-#include <stdbool.h>
+-#include <stddef.h>
+ #include <linux/bpf.h>
+ #include <bpf/bpf_helpers.h>
+ #include <bpf/bpf_tracing.h>
+
+-struct task_struct;
++SEC("fentry/bpf_modify_return_test")
++int BPF_PROG(fentry_test, int a, int *b)
++{
++ return 0;
++}
+
+-SEC("fentry/__set_task_comm")
+-int BPF_PROG(prog1, struct task_struct *tsk, const char *buf, bool exec)
++SEC("fmod_ret/bpf_modify_return_test")
++int BPF_PROG(fmod_ret_test, int a, int *b, int ret)
+ {
+ return 0;
+ }
+
+-SEC("fexit/__set_task_comm")
+-int BPF_PROG(prog2, struct task_struct *tsk, const char *buf, bool exec)
++SEC("fexit/bpf_modify_return_test")
++int BPF_PROG(fexit_test, int a, int *b, int ret)
+ {
+ return 0;
+ }
+diff --git a/tools/testing/selftests/bpf/test_bpftool_synctypes.py b/tools/testing/selftests/bpf/test_bpftool_synctypes.py
+index 6bf21e47882af..c0e7acd698edf 100755
+--- a/tools/testing/selftests/bpf/test_bpftool_synctypes.py
++++ b/tools/testing/selftests/bpf/test_bpftool_synctypes.py
+@@ -180,7 +180,7 @@ class FileExtractor(object):
+ @enum_name: name of the enum to parse
+ """
+ start_marker = re.compile(f'enum {enum_name} {{\n')
+- pattern = re.compile('^\s*(BPF_\w+),?$')
++ pattern = re.compile('^\s*(BPF_\w+),?(\s+/\*.*\*/)?$')
+ end_marker = re.compile('^};')
+ parser = BlockParser(self.reader)
+ parser.search_block(start_marker)
+diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
+index 3d6217e3aff7a..9c4be2cdb21a0 100644
+--- a/tools/testing/selftests/bpf/trace_helpers.c
++++ b/tools/testing/selftests/bpf/trace_helpers.c
+@@ -25,15 +25,12 @@ static int ksym_cmp(const void *p1, const void *p2)
+
+ int load_kallsyms(void)
+ {
+- FILE *f = fopen("/proc/kallsyms", "r");
++ FILE *f;
+ char func[256], buf[256];
+ char symbol;
+ void *addr;
+ int i = 0;
+
+- if (!f)
+- return -ENOENT;
+-
+ /*
+ * This is called/used from multiplace places,
+ * load symbols just once.
+@@ -41,6 +38,10 @@ int load_kallsyms(void)
+ if (sym_cnt)
+ return 0;
+
++ f = fopen("/proc/kallsyms", "r");
++ if (!f)
++ return -ENOENT;
++
+ while (fgets(buf, sizeof(buf), f)) {
+ if (sscanf(buf, "%p %c %s", &addr, &symbol, func) != 3)
+ break;
+diff --git a/tools/testing/selftests/cgroup/test_stress.sh b/tools/testing/selftests/cgroup/test_stress.sh
+index 15d9d58963941..3c9c4554d5f6a 100755
+--- a/tools/testing/selftests/cgroup/test_stress.sh
++++ b/tools/testing/selftests/cgroup/test_stress.sh
+@@ -1,4 +1,4 @@
+ #!/bin/bash
+ # SPDX-License-Identifier: GPL-2.0
+
+-./with_stress.sh -s subsys -s fork ./test_core
++./with_stress.sh -s subsys -s fork ${OUTPUT:-.}/test_core
+diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
+index ca40abe9daa86..35f64832b869c 100644
+--- a/tools/testing/selftests/landlock/base_test.c
++++ b/tools/testing/selftests/landlock/base_test.c
+@@ -18,10 +18,11 @@
+ #include "common.h"
+
+ #ifndef O_PATH
+-#define O_PATH 010000000
++#define O_PATH 010000000
+ #endif
+
+-TEST(inconsistent_attr) {
++TEST(inconsistent_attr)
++{
+ const long page_size = sysconf(_SC_PAGESIZE);
+ char *const buf = malloc(page_size + 1);
+ struct landlock_ruleset_attr *const ruleset_attr = (void *)buf;
+@@ -34,20 +35,26 @@ TEST(inconsistent_attr) {
+ ASSERT_EQ(EINVAL, errno);
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, 1, 0));
+ ASSERT_EQ(EINVAL, errno);
++ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, 7, 0));
++ ASSERT_EQ(EINVAL, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(NULL, 1, 0));
+ /* The size if less than sizeof(struct landlock_attr_enforce). */
+ ASSERT_EQ(EFAULT, errno);
+
+- ASSERT_EQ(-1, landlock_create_ruleset(NULL,
+- sizeof(struct landlock_ruleset_attr), 0));
++ ASSERT_EQ(-1, landlock_create_ruleset(
++ NULL, sizeof(struct landlock_ruleset_attr), 0));
+ ASSERT_EQ(EFAULT, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, page_size + 1, 0));
+ ASSERT_EQ(E2BIG, errno);
+
+- ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr,
+- sizeof(struct landlock_ruleset_attr), 0));
++ /* Checks minimal valid attribute size. */
++ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, 8, 0));
++ ASSERT_EQ(ENOMSG, errno);
++ ASSERT_EQ(-1, landlock_create_ruleset(
++ ruleset_attr,
++ sizeof(struct landlock_ruleset_attr), 0));
+ ASSERT_EQ(ENOMSG, errno);
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, page_size, 0));
+ ASSERT_EQ(ENOMSG, errno);
+@@ -63,38 +70,44 @@ TEST(inconsistent_attr) {
+ free(buf);
+ }
+
+-TEST(abi_version) {
++TEST(abi_version)
++{
+ const struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
+ };
+ ASSERT_EQ(1, landlock_create_ruleset(NULL, 0,
+- LANDLOCK_CREATE_RULESET_VERSION));
++ LANDLOCK_CREATE_RULESET_VERSION));
+
+ ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
+- LANDLOCK_CREATE_RULESET_VERSION));
++ LANDLOCK_CREATE_RULESET_VERSION));
+ ASSERT_EQ(EINVAL, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(NULL, sizeof(ruleset_attr),
+- LANDLOCK_CREATE_RULESET_VERSION));
++ LANDLOCK_CREATE_RULESET_VERSION));
+ ASSERT_EQ(EINVAL, errno);
+
+- ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr),
+- LANDLOCK_CREATE_RULESET_VERSION));
++ ASSERT_EQ(-1,
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr),
++ LANDLOCK_CREATE_RULESET_VERSION));
+ ASSERT_EQ(EINVAL, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(NULL, 0,
+- LANDLOCK_CREATE_RULESET_VERSION | 1 << 31));
++ LANDLOCK_CREATE_RULESET_VERSION |
++ 1 << 31));
+ ASSERT_EQ(EINVAL, errno);
+ }
+
+-TEST(inval_create_ruleset_flags) {
++/* Tests ordering of syscall argument checks. */
++TEST(create_ruleset_checks_ordering)
++{
+ const int last_flag = LANDLOCK_CREATE_RULESET_VERSION;
+ const int invalid_flag = last_flag << 1;
++ int ruleset_fd;
+ const struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
+ };
+
++ /* Checks priority for invalid flags. */
+ ASSERT_EQ(-1, landlock_create_ruleset(NULL, 0, invalid_flag));
+ ASSERT_EQ(EINVAL, errno);
+
+@@ -102,44 +115,121 @@ TEST(inval_create_ruleset_flags) {
+ ASSERT_EQ(EINVAL, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(NULL, sizeof(ruleset_attr),
+- invalid_flag));
++ invalid_flag));
++ ASSERT_EQ(EINVAL, errno);
++
++ ASSERT_EQ(-1,
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr),
++ invalid_flag));
+ ASSERT_EQ(EINVAL, errno);
+
+- ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), invalid_flag));
++ /* Checks too big ruleset_attr size. */
++ ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, -1, 0));
++ ASSERT_EQ(E2BIG, errno);
++
++ /* Checks too small ruleset_attr size. */
++ ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0, 0));
++ ASSERT_EQ(EINVAL, errno);
++ ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 1, 0));
+ ASSERT_EQ(EINVAL, errno);
++
++ /* Checks valid call. */
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
++ ASSERT_LE(0, ruleset_fd);
++ ASSERT_EQ(0, close(ruleset_fd));
+ }
+
+-TEST(empty_path_beneath_attr) {
++/* Tests ordering of syscall argument checks. */
++TEST(add_rule_checks_ordering)
++{
+ const struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_EXECUTE,
+ };
+- const int ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ struct landlock_path_beneath_attr path_beneath_attr = {
++ .allowed_access = LANDLOCK_ACCESS_FS_EXECUTE,
++ .parent_fd = -1,
++ };
++ const int ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+
+ ASSERT_LE(0, ruleset_fd);
+
+- /* Similar to struct landlock_path_beneath_attr.parent_fd = 0 */
++ /* Checks invalid flags. */
++ ASSERT_EQ(-1, landlock_add_rule(-1, 0, NULL, 1));
++ ASSERT_EQ(EINVAL, errno);
++
++ /* Checks invalid ruleset FD. */
++ ASSERT_EQ(-1, landlock_add_rule(-1, 0, NULL, 0));
++ ASSERT_EQ(EBADF, errno);
++
++ /* Checks invalid rule type. */
++ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, 0, NULL, 0));
++ ASSERT_EQ(EINVAL, errno);
++
++ /* Checks invalid rule attr. */
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- NULL, 0));
++ NULL, 0));
+ ASSERT_EQ(EFAULT, errno);
++
++ /* Checks invalid path_beneath.parent_fd. */
++ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
++ &path_beneath_attr, 0));
++ ASSERT_EQ(EBADF, errno);
++
++ /* Checks valid call. */
++ path_beneath_attr.parent_fd =
++ open("/tmp", O_PATH | O_NOFOLLOW | O_DIRECTORY | O_CLOEXEC);
++ ASSERT_LE(0, path_beneath_attr.parent_fd);
++ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
++ &path_beneath_attr, 0));
++ ASSERT_EQ(0, close(path_beneath_attr.parent_fd));
+ ASSERT_EQ(0, close(ruleset_fd));
+ }
+
+-TEST(inval_fd_enforce) {
++/* Tests ordering of syscall argument and permission checks. */
++TEST(restrict_self_checks_ordering)
++{
++ const struct landlock_ruleset_attr ruleset_attr = {
++ .handled_access_fs = LANDLOCK_ACCESS_FS_EXECUTE,
++ };
++ struct landlock_path_beneath_attr path_beneath_attr = {
++ .allowed_access = LANDLOCK_ACCESS_FS_EXECUTE,
++ .parent_fd = -1,
++ };
++ const int ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
++
++ ASSERT_LE(0, ruleset_fd);
++ path_beneath_attr.parent_fd =
++ open("/tmp", O_PATH | O_NOFOLLOW | O_DIRECTORY | O_CLOEXEC);
++ ASSERT_LE(0, path_beneath_attr.parent_fd);
++ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
++ &path_beneath_attr, 0));
++ ASSERT_EQ(0, close(path_beneath_attr.parent_fd));
++
++ /* Checks unprivileged enforcement without no_new_privs. */
++ drop_caps(_metadata);
++ ASSERT_EQ(-1, landlock_restrict_self(-1, -1));
++ ASSERT_EQ(EPERM, errno);
++ ASSERT_EQ(-1, landlock_restrict_self(-1, 0));
++ ASSERT_EQ(EPERM, errno);
++ ASSERT_EQ(-1, landlock_restrict_self(ruleset_fd, 0));
++ ASSERT_EQ(EPERM, errno);
++
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+
++ /* Checks invalid flags. */
++ ASSERT_EQ(-1, landlock_restrict_self(-1, -1));
++ ASSERT_EQ(EINVAL, errno);
++
++ /* Checks invalid ruleset FD. */
+ ASSERT_EQ(-1, landlock_restrict_self(-1, 0));
+ ASSERT_EQ(EBADF, errno);
+-}
+-
+-TEST(unpriv_enforce_without_no_new_privs) {
+- int err;
+
+- drop_caps(_metadata);
+- err = landlock_restrict_self(-1, 0);
+- ASSERT_EQ(EPERM, errno);
+- ASSERT_EQ(err, -1);
++ /* Checks valid call. */
++ ASSERT_EQ(0, landlock_restrict_self(ruleset_fd, 0));
++ ASSERT_EQ(0, close(ruleset_fd));
+ }
+
+ TEST(ruleset_fd_io)
+@@ -151,8 +241,8 @@ TEST(ruleset_fd_io)
+ char buf;
+
+ drop_caps(_metadata);
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+
+ ASSERT_EQ(-1, write(ruleset_fd, ".", 1));
+@@ -197,14 +287,15 @@ TEST(ruleset_fd_transfer)
+ drop_caps(_metadata);
+
+ /* Creates a test ruleset with a simple rule. */
+- ruleset_fd_tx = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ ruleset_fd_tx =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd_tx);
+- path_beneath_attr.parent_fd = open("/tmp", O_PATH | O_NOFOLLOW |
+- O_DIRECTORY | O_CLOEXEC);
++ path_beneath_attr.parent_fd =
++ open("/tmp", O_PATH | O_NOFOLLOW | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath_attr.parent_fd);
+- ASSERT_EQ(0, landlock_add_rule(ruleset_fd_tx, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath_attr, 0));
++ ASSERT_EQ(0,
++ landlock_add_rule(ruleset_fd_tx, LANDLOCK_RULE_PATH_BENEATH,
++ &path_beneath_attr, 0));
+ ASSERT_EQ(0, close(path_beneath_attr.parent_fd));
+
+ cmsg = CMSG_FIRSTHDR(&msg);
+@@ -215,7 +306,8 @@ TEST(ruleset_fd_transfer)
+ memcpy(CMSG_DATA(cmsg), &ruleset_fd_tx, sizeof(ruleset_fd_tx));
+
+ /* Sends the ruleset FD over a socketpair and then close it. */
+- ASSERT_EQ(0, socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, socket_fds));
++ ASSERT_EQ(0, socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0,
++ socket_fds));
+ ASSERT_EQ(sizeof(data_tx), sendmsg(socket_fds[0], &msg, 0));
+ ASSERT_EQ(0, close(socket_fds[0]));
+ ASSERT_EQ(0, close(ruleset_fd_tx));
+@@ -226,7 +318,8 @@ TEST(ruleset_fd_transfer)
+ int ruleset_fd_rx;
+
+ *(char *)msg.msg_iov->iov_base = '\0';
+- ASSERT_EQ(sizeof(data_tx), recvmsg(socket_fds[1], &msg, MSG_CMSG_CLOEXEC));
++ ASSERT_EQ(sizeof(data_tx),
++ recvmsg(socket_fds[1], &msg, MSG_CMSG_CLOEXEC));
+ ASSERT_EQ('.', *(char *)msg.msg_iov->iov_base);
+ ASSERT_EQ(0, close(socket_fds[1]));
+ cmsg = CMSG_FIRSTHDR(&msg);
+diff --git a/tools/testing/selftests/landlock/common.h b/tools/testing/selftests/landlock/common.h
+index 183b7e8e1b957..7ba18eb237838 100644
+--- a/tools/testing/selftests/landlock/common.h
++++ b/tools/testing/selftests/landlock/common.h
+@@ -25,6 +25,7 @@
+ * this to be possible, we must not call abort() but instead exit smoothly
+ * (hence the step print).
+ */
++/* clang-format off */
+ #define TEST_F_FORK(fixture_name, test_name) \
+ static void fixture_name##_##test_name##_child( \
+ struct __test_metadata *_metadata, \
+@@ -71,11 +72,12 @@
+ FIXTURE_DATA(fixture_name) __attribute__((unused)) *self, \
+ const FIXTURE_VARIANT(fixture_name) \
+ __attribute__((unused)) *variant)
++/* clang-format on */
+
+ #ifndef landlock_create_ruleset
+-static inline int landlock_create_ruleset(
+- const struct landlock_ruleset_attr *const attr,
+- const size_t size, const __u32 flags)
++static inline int
++landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
++ const size_t size, const __u32 flags)
+ {
+ return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+ }
+@@ -83,17 +85,18 @@ static inline int landlock_create_ruleset(
+
+ #ifndef landlock_add_rule
+ static inline int landlock_add_rule(const int ruleset_fd,
+- const enum landlock_rule_type rule_type,
+- const void *const rule_attr, const __u32 flags)
++ const enum landlock_rule_type rule_type,
++ const void *const rule_attr,
++ const __u32 flags)
+ {
+- return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
+- rule_attr, flags);
++ return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type, rule_attr,
++ flags);
+ }
+ #endif
+
+ #ifndef landlock_restrict_self
+ static inline int landlock_restrict_self(const int ruleset_fd,
+- const __u32 flags)
++ const __u32 flags)
+ {
+ return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
+ }
+@@ -111,69 +114,76 @@ static void _init_caps(struct __test_metadata *const _metadata, bool drop_all)
+ };
+
+ cap_p = cap_get_proc();
+- EXPECT_NE(NULL, cap_p) {
++ EXPECT_NE(NULL, cap_p)
++ {
+ TH_LOG("Failed to cap_get_proc: %s", strerror(errno));
+ }
+- EXPECT_NE(-1, cap_clear(cap_p)) {
++ EXPECT_NE(-1, cap_clear(cap_p))
++ {
+ TH_LOG("Failed to cap_clear: %s", strerror(errno));
+ }
+ if (!drop_all) {
+ EXPECT_NE(-1, cap_set_flag(cap_p, CAP_PERMITTED,
+- ARRAY_SIZE(caps), caps, CAP_SET)) {
++ ARRAY_SIZE(caps), caps, CAP_SET))
++ {
+ TH_LOG("Failed to cap_set_flag: %s", strerror(errno));
+ }
+ }
+- EXPECT_NE(-1, cap_set_proc(cap_p)) {
++ EXPECT_NE(-1, cap_set_proc(cap_p))
++ {
+ TH_LOG("Failed to cap_set_proc: %s", strerror(errno));
+ }
+- EXPECT_NE(-1, cap_free(cap_p)) {
++ EXPECT_NE(-1, cap_free(cap_p))
++ {
+ TH_LOG("Failed to cap_free: %s", strerror(errno));
+ }
+ }
+
+ /* We cannot put such helpers in a library because of kselftest_harness.h . */
+-__attribute__((__unused__))
+-static void disable_caps(struct __test_metadata *const _metadata)
++__attribute__((__unused__)) static void
++disable_caps(struct __test_metadata *const _metadata)
+ {
+ _init_caps(_metadata, false);
+ }
+
+-__attribute__((__unused__))
+-static void drop_caps(struct __test_metadata *const _metadata)
++__attribute__((__unused__)) static void
++drop_caps(struct __test_metadata *const _metadata)
+ {
+ _init_caps(_metadata, true);
+ }
+
+ static void _effective_cap(struct __test_metadata *const _metadata,
+- const cap_value_t caps, const cap_flag_value_t value)
++ const cap_value_t caps, const cap_flag_value_t value)
+ {
+ cap_t cap_p;
+
+ cap_p = cap_get_proc();
+- EXPECT_NE(NULL, cap_p) {
++ EXPECT_NE(NULL, cap_p)
++ {
+ TH_LOG("Failed to cap_get_proc: %s", strerror(errno));
+ }
+- EXPECT_NE(-1, cap_set_flag(cap_p, CAP_EFFECTIVE, 1, &caps, value)) {
++ EXPECT_NE(-1, cap_set_flag(cap_p, CAP_EFFECTIVE, 1, &caps, value))
++ {
+ TH_LOG("Failed to cap_set_flag: %s", strerror(errno));
+ }
+- EXPECT_NE(-1, cap_set_proc(cap_p)) {
++ EXPECT_NE(-1, cap_set_proc(cap_p))
++ {
+ TH_LOG("Failed to cap_set_proc: %s", strerror(errno));
+ }
+- EXPECT_NE(-1, cap_free(cap_p)) {
++ EXPECT_NE(-1, cap_free(cap_p))
++ {
+ TH_LOG("Failed to cap_free: %s", strerror(errno));
+ }
+ }
+
+-__attribute__((__unused__))
+-static void set_cap(struct __test_metadata *const _metadata,
+- const cap_value_t caps)
++__attribute__((__unused__)) static void
++set_cap(struct __test_metadata *const _metadata, const cap_value_t caps)
+ {
+ _effective_cap(_metadata, caps, CAP_SET);
+ }
+
+-__attribute__((__unused__))
+-static void clear_cap(struct __test_metadata *const _metadata,
+- const cap_value_t caps)
++__attribute__((__unused__)) static void
++clear_cap(struct __test_metadata *const _metadata, const cap_value_t caps)
+ {
+ _effective_cap(_metadata, caps, CAP_CLEAR);
+ }
+diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
+index 10c9a1e4ebd9b..a4fdcda62bdee 100644
+--- a/tools/testing/selftests/landlock/fs_test.c
++++ b/tools/testing/selftests/landlock/fs_test.c
+@@ -22,8 +22,21 @@
+
+ #include "common.h"
+
+-#define TMP_DIR "tmp"
+-#define BINARY_PATH "./true"
++#ifndef renameat2
++int renameat2(int olddirfd, const char *oldpath, int newdirfd,
++ const char *newpath, unsigned int flags)
++{
++ return syscall(__NR_renameat2, olddirfd, oldpath, newdirfd, newpath,
++ flags);
++}
++#endif
++
++#ifndef RENAME_EXCHANGE
++#define RENAME_EXCHANGE (1 << 1)
++#endif
++
++#define TMP_DIR "tmp"
++#define BINARY_PATH "./true"
+
+ /* Paths (sibling number and depth) */
+ static const char dir_s1d1[] = TMP_DIR "/s1d1";
+@@ -75,7 +88,7 @@ static const char dir_s3d3[] = TMP_DIR "/s3d1/s3d2/s3d3";
+ */
+
+ static void mkdir_parents(struct __test_metadata *const _metadata,
+- const char *const path)
++ const char *const path)
+ {
+ char *walker;
+ const char *parent;
+@@ -90,9 +103,10 @@ static void mkdir_parents(struct __test_metadata *const _metadata,
+ continue;
+ walker[i] = '\0';
+ err = mkdir(parent, 0700);
+- ASSERT_FALSE(err && errno != EEXIST) {
+- TH_LOG("Failed to create directory \"%s\": %s",
+- parent, strerror(errno));
++ ASSERT_FALSE(err && errno != EEXIST)
++ {
++ TH_LOG("Failed to create directory \"%s\": %s", parent,
++ strerror(errno));
+ }
+ walker[i] = '/';
+ }
+@@ -100,22 +114,24 @@ static void mkdir_parents(struct __test_metadata *const _metadata,
+ }
+
+ static void create_directory(struct __test_metadata *const _metadata,
+- const char *const path)
++ const char *const path)
+ {
+ mkdir_parents(_metadata, path);
+- ASSERT_EQ(0, mkdir(path, 0700)) {
++ ASSERT_EQ(0, mkdir(path, 0700))
++ {
+ TH_LOG("Failed to create directory \"%s\": %s", path,
+- strerror(errno));
++ strerror(errno));
+ }
+ }
+
+ static void create_file(struct __test_metadata *const _metadata,
+- const char *const path)
++ const char *const path)
+ {
+ mkdir_parents(_metadata, path);
+- ASSERT_EQ(0, mknod(path, S_IFREG | 0700, 0)) {
++ ASSERT_EQ(0, mknod(path, S_IFREG | 0700, 0))
++ {
+ TH_LOG("Failed to create file \"%s\": %s", path,
+- strerror(errno));
++ strerror(errno));
+ }
+ }
+
+@@ -221,8 +237,9 @@ static void remove_layout1(struct __test_metadata *const _metadata)
+ EXPECT_EQ(0, remove_path(dir_s3d2));
+ }
+
+-FIXTURE(layout1) {
+-};
++/* clang-format off */
++FIXTURE(layout1) {};
++/* clang-format on */
+
+ FIXTURE_SETUP(layout1)
+ {
+@@ -242,7 +259,8 @@ FIXTURE_TEARDOWN(layout1)
+ * This helper enables to use the ASSERT_* macros and print the line number
+ * pointing to the test caller.
+ */
+-static int test_open_rel(const int dirfd, const char *const path, const int flags)
++static int test_open_rel(const int dirfd, const char *const path,
++ const int flags)
+ {
+ int fd;
+
+@@ -291,23 +309,23 @@ TEST_F_FORK(layout1, inval)
+ {
+ struct landlock_path_beneath_attr path_beneath = {
+ .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ .parent_fd = -1,
+ };
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ };
+ int ruleset_fd;
+
+- path_beneath.parent_fd = open(dir_s1d2, O_PATH | O_DIRECTORY |
+- O_CLOEXEC);
++ path_beneath.parent_fd =
++ open(dir_s1d2, O_PATH | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+
+ ruleset_fd = open(dir_s1d1, O_PATH | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ /* Returns EBADF because ruleset_fd is not a landlock-ruleset FD. */
+ ASSERT_EQ(EBADF, errno);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -315,55 +333,55 @@ TEST_F_FORK(layout1, inval)
+ ruleset_fd = open(dir_s1d1, O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ /* Returns EBADFD because ruleset_fd is not a valid ruleset. */
+ ASSERT_EQ(EBADFD, errno);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Gets a real ruleset. */
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+
+ /* Tests without O_PATH. */
+ path_beneath.parent_fd = open(dir_s1d2, O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+
+ /* Tests with a ruleset FD. */
+ path_beneath.parent_fd = ruleset_fd;
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ ASSERT_EQ(EBADFD, errno);
+
+ /* Checks unhandled allowed_access. */
+- path_beneath.parent_fd = open(dir_s1d2, O_PATH | O_DIRECTORY |
+- O_CLOEXEC);
++ path_beneath.parent_fd =
++ open(dir_s1d2, O_PATH | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+
+ /* Test with legitimate values. */
+ path_beneath.allowed_access |= LANDLOCK_ACCESS_FS_EXECUTE;
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ ASSERT_EQ(EINVAL, errno);
+ path_beneath.allowed_access &= ~LANDLOCK_ACCESS_FS_EXECUTE;
+
+ /* Test with unknown (64-bits) value. */
+ path_beneath.allowed_access |= (1ULL << 60);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ ASSERT_EQ(EINVAL, errno);
+ path_beneath.allowed_access &= ~(1ULL << 60);
+
+ /* Test with no access. */
+ path_beneath.allowed_access = 0;
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ ASSERT_EQ(ENOMSG, errno);
+ path_beneath.allowed_access &= ~(1ULL << 60);
+
+@@ -376,6 +394,8 @@ TEST_F_FORK(layout1, inval)
+ ASSERT_EQ(0, close(ruleset_fd));
+ }
+
++/* clang-format off */
++
+ #define ACCESS_FILE ( \
+ LANDLOCK_ACCESS_FS_EXECUTE | \
+ LANDLOCK_ACCESS_FS_WRITE_FILE | \
+@@ -396,53 +416,87 @@ TEST_F_FORK(layout1, inval)
+ LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
+ ACCESS_LAST)
+
+-TEST_F_FORK(layout1, file_access_rights)
++/* clang-format on */
++
++TEST_F_FORK(layout1, file_and_dir_access_rights)
+ {
+ __u64 access;
+ int err;
+- struct landlock_path_beneath_attr path_beneath = {};
++ struct landlock_path_beneath_attr path_beneath_file = {},
++ path_beneath_dir = {};
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_ALL,
+ };
+- const int ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ const int ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ /* Tests access rights for files. */
+- path_beneath.parent_fd = open(file1_s1d2, O_PATH | O_CLOEXEC);
+- ASSERT_LE(0, path_beneath.parent_fd);
++ path_beneath_file.parent_fd = open(file1_s1d2, O_PATH | O_CLOEXEC);
++ ASSERT_LE(0, path_beneath_file.parent_fd);
++
++ /* Tests access rights for directories. */
++ path_beneath_dir.parent_fd =
++ open(dir_s1d2, O_PATH | O_DIRECTORY | O_CLOEXEC);
++ ASSERT_LE(0, path_beneath_dir.parent_fd);
++
+ for (access = 1; access <= ACCESS_LAST; access <<= 1) {
+- path_beneath.allowed_access = access;
++ path_beneath_dir.allowed_access = access;
++ ASSERT_EQ(0, landlock_add_rule(ruleset_fd,
++ LANDLOCK_RULE_PATH_BENEATH,
++ &path_beneath_dir, 0));
++
++ path_beneath_file.allowed_access = access;
+ err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0);
+- if ((access | ACCESS_FILE) == ACCESS_FILE) {
++ &path_beneath_file, 0);
++ if (access & ACCESS_FILE) {
+ ASSERT_EQ(0, err);
+ } else {
+ ASSERT_EQ(-1, err);
+ ASSERT_EQ(EINVAL, errno);
+ }
+ }
+- ASSERT_EQ(0, close(path_beneath.parent_fd));
++ ASSERT_EQ(0, close(path_beneath_file.parent_fd));
++ ASSERT_EQ(0, close(path_beneath_dir.parent_fd));
++ ASSERT_EQ(0, close(ruleset_fd));
++}
++
++TEST_F_FORK(layout1, unknown_access_rights)
++{
++ __u64 access_mask;
++
++ for (access_mask = 1ULL << 63; access_mask != ACCESS_LAST;
++ access_mask >>= 1) {
++ struct landlock_ruleset_attr ruleset_attr = {
++ .handled_access_fs = access_mask,
++ };
++
++ ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr,
++ sizeof(ruleset_attr), 0));
++ ASSERT_EQ(EINVAL, errno);
++ }
+ }
+
+ static void add_path_beneath(struct __test_metadata *const _metadata,
+- const int ruleset_fd, const __u64 allowed_access,
+- const char *const path)
++ const int ruleset_fd, const __u64 allowed_access,
++ const char *const path)
+ {
+ struct landlock_path_beneath_attr path_beneath = {
+ .allowed_access = allowed_access,
+ };
+
+ path_beneath.parent_fd = open(path, O_PATH | O_CLOEXEC);
+- ASSERT_LE(0, path_beneath.parent_fd) {
++ ASSERT_LE(0, path_beneath.parent_fd)
++ {
+ TH_LOG("Failed to open directory \"%s\": %s", path,
+- strerror(errno));
++ strerror(errno));
+ }
+ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0)) {
++ &path_beneath, 0))
++ {
+ TH_LOG("Failed to update the ruleset with \"%s\": %s", path,
+- strerror(errno));
++ strerror(errno));
+ }
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+ }
+@@ -452,6 +506,8 @@ struct rule {
+ __u64 access;
+ };
+
++/* clang-format off */
++
+ #define ACCESS_RO ( \
+ LANDLOCK_ACCESS_FS_READ_FILE | \
+ LANDLOCK_ACCESS_FS_READ_DIR)
+@@ -460,39 +516,46 @@ struct rule {
+ ACCESS_RO | \
+ LANDLOCK_ACCESS_FS_WRITE_FILE)
+
++/* clang-format on */
++
+ static int create_ruleset(struct __test_metadata *const _metadata,
+- const __u64 handled_access_fs, const struct rule rules[])
++ const __u64 handled_access_fs,
++ const struct rule rules[])
+ {
+ int ruleset_fd, i;
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = handled_access_fs,
+ };
+
+- ASSERT_NE(NULL, rules) {
++ ASSERT_NE(NULL, rules)
++ {
+ TH_LOG("No rule list");
+ }
+- ASSERT_NE(NULL, rules[0].path) {
++ ASSERT_NE(NULL, rules[0].path)
++ {
+ TH_LOG("Empty rule list");
+ }
+
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
+- ASSERT_LE(0, ruleset_fd) {
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
++ ASSERT_LE(0, ruleset_fd)
++ {
+ TH_LOG("Failed to create a ruleset: %s", strerror(errno));
+ }
+
+ for (i = 0; rules[i].path; i++) {
+ add_path_beneath(_metadata, ruleset_fd, rules[i].access,
+- rules[i].path);
++ rules[i].path);
+ }
+ return ruleset_fd;
+ }
+
+ static void enforce_ruleset(struct __test_metadata *const _metadata,
+- const int ruleset_fd)
++ const int ruleset_fd)
+ {
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+- ASSERT_EQ(0, landlock_restrict_self(ruleset_fd, 0)) {
++ ASSERT_EQ(0, landlock_restrict_self(ruleset_fd, 0))
++ {
+ TH_LOG("Failed to enforce ruleset: %s", strerror(errno));
+ }
+ }
+@@ -503,13 +566,14 @@ TEST_F_FORK(layout1, proc_nsfs)
+ {
+ .path = "/dev/null",
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ struct landlock_path_beneath_attr path_beneath;
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access |
+- LANDLOCK_ACCESS_FS_READ_DIR, rules);
++ const int ruleset_fd = create_ruleset(
++ _metadata, rules[0].access | LANDLOCK_ACCESS_FS_READ_DIR,
++ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(0, test_open("/proc/self/ns/mnt", O_RDONLY));
+@@ -536,22 +600,23 @@ TEST_F_FORK(layout1, proc_nsfs)
+ * references to a ruleset.
+ */
+ path_beneath.allowed_access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ path_beneath.parent_fd = open("/proc/self/ns/mnt", O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+- &path_beneath, 0));
++ &path_beneath, 0));
+ ASSERT_EQ(EBADFD, errno);
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+ }
+
+-TEST_F_FORK(layout1, unpriv) {
++TEST_F_FORK(layout1, unpriv)
++{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ int ruleset_fd;
+
+@@ -577,9 +642,9 @@ TEST_F_FORK(layout1, effective_access)
+ {
+ .path = file1_s2d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+ char buf;
+@@ -589,17 +654,23 @@ TEST_F_FORK(layout1, effective_access)
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+- /* Tests on a directory. */
++ /* Tests on a directory (with or without O_PATH). */
+ ASSERT_EQ(EACCES, test_open("/", O_RDONLY));
++ ASSERT_EQ(0, test_open("/", O_RDONLY | O_PATH));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY));
++ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY | O_PATH));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
++ ASSERT_EQ(0, test_open(file1_s1d1, O_RDONLY | O_PATH));
++
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+
+- /* Tests on a file. */
++ /* Tests on a file (with or without O_PATH). */
+ ASSERT_EQ(EACCES, test_open(dir_s2d2, O_RDONLY));
++ ASSERT_EQ(0, test_open(dir_s2d2, O_RDONLY | O_PATH));
++
+ ASSERT_EQ(0, test_open(file1_s2d2, O_RDONLY));
+
+ /* Checks effective read and write actions. */
+@@ -626,7 +697,7 @@ TEST_F_FORK(layout1, unhandled_access)
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ /* Here, we only handle read accesses, not write accesses. */
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RO, rules);
+@@ -653,14 +724,14 @@ TEST_F_FORK(layout1, ruleset_overlap)
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_READ_DIR,
++ LANDLOCK_ACCESS_FS_READ_DIR,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -687,6 +758,113 @@ TEST_F_FORK(layout1, ruleset_overlap)
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+ }
+
++TEST_F_FORK(layout1, layer_rule_unions)
++{
++ const struct rule layer1[] = {
++ {
++ .path = dir_s1d2,
++ .access = LANDLOCK_ACCESS_FS_READ_FILE,
++ },
++ /* dir_s1d3 should allow READ_FILE and WRITE_FILE (O_RDWR). */
++ {
++ .path = dir_s1d3,
++ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
++ },
++ {},
++ };
++ const struct rule layer2[] = {
++ /* Doesn't change anything from layer1. */
++ {
++ .path = dir_s1d2,
++ .access = LANDLOCK_ACCESS_FS_READ_FILE |
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
++ },
++ {},
++ };
++ const struct rule layer3[] = {
++ /* Only allows write (but not read) to dir_s1d3. */
++ {
++ .path = dir_s1d2,
++ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
++ },
++ {},
++ };
++ int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer1);
++
++ ASSERT_LE(0, ruleset_fd);
++ enforce_ruleset(_metadata, ruleset_fd);
++ ASSERT_EQ(0, close(ruleset_fd));
++
++ /* Checks s1d1 hierarchy with layer1. */
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Checks s1d2 hierarchy with layer1. */
++ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Checks s1d3 hierarchy with layer1. */
++ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
++ ASSERT_EQ(0, test_open(file1_s1d3, O_WRONLY));
++ /* dir_s1d3 should allow READ_FILE and WRITE_FILE (O_RDWR). */
++ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Doesn't change anything from layer1. */
++ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer2);
++ ASSERT_LE(0, ruleset_fd);
++ enforce_ruleset(_metadata, ruleset_fd);
++ ASSERT_EQ(0, close(ruleset_fd));
++
++ /* Checks s1d1 hierarchy with layer2. */
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Checks s1d2 hierarchy with layer2. */
++ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Checks s1d3 hierarchy with layer2. */
++ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
++ ASSERT_EQ(0, test_open(file1_s1d3, O_WRONLY));
++ /* dir_s1d3 should allow READ_FILE and WRITE_FILE (O_RDWR). */
++ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Only allows write (but not read) to dir_s1d3. */
++ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer3);
++ ASSERT_LE(0, ruleset_fd);
++ enforce_ruleset(_metadata, ruleset_fd);
++ ASSERT_EQ(0, close(ruleset_fd));
++
++ /* Checks s1d1 hierarchy with layer3. */
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Checks s1d2 hierarchy with layer3. */
++ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
++ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++
++ /* Checks s1d3 hierarchy with layer3. */
++ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_RDONLY));
++ ASSERT_EQ(0, test_open(file1_s1d3, O_WRONLY));
++ /* dir_s1d3 should now deny READ_FILE and WRITE_FILE (O_RDWR). */
++ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_RDWR));
++ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
++}
++
+ TEST_F_FORK(layout1, non_overlapping_accesses)
+ {
+ const struct rule layer1[] = {
+@@ -694,22 +872,22 @@ TEST_F_FORK(layout1, non_overlapping_accesses)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+- {}
++ {},
+ };
+ const struct rule layer2[] = {
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+- {}
++ {},
+ };
+ int ruleset_fd;
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+
+- ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_MAKE_REG,
+- layer1);
++ ruleset_fd =
++ create_ruleset(_metadata, LANDLOCK_ACCESS_FS_MAKE_REG, layer1);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -720,7 +898,7 @@ TEST_F_FORK(layout1, non_overlapping_accesses)
+ ASSERT_EQ(0, unlink(file1_s1d2));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_REMOVE_FILE,
+- layer2);
++ layer2);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -758,7 +936,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ .path = file1_s1d3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+- {}
++ {},
+ };
+ /* First rule with write restrictions. */
+ const struct rule layer2_read_write[] = {
+@@ -766,14 +944,14 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ /* ...but also denies read access via its grandparent directory. */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ const struct rule layer3_read[] = {
+ /* Allows read access via its great-grandparent directory. */
+@@ -781,7 +959,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ .path = dir_s1d1,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+- {}
++ {},
+ };
+ const struct rule layer4_read_write[] = {
+ /*
+@@ -792,7 +970,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+- {}
++ {},
+ };
+ const struct rule layer5_read[] = {
+ /*
+@@ -803,7 +981,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+- {}
++ {},
+ };
+ const struct rule layer6_execute[] = {
+ /*
+@@ -814,7 +992,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ .path = dir_s2d1,
+ .access = LANDLOCK_ACCESS_FS_EXECUTE,
+ },
+- {}
++ {},
+ };
+ const struct rule layer7_read_write[] = {
+ /*
+@@ -825,12 +1003,12 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ int ruleset_fd;
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE,
+- layer1_read);
++ layer1_read);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -840,8 +1018,10 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file2_s1d3, O_WRONLY));
+
+- ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE, layer2_read_write);
++ ruleset_fd = create_ruleset(_metadata,
++ LANDLOCK_ACCESS_FS_READ_FILE |
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
++ layer2_read_write);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -852,7 +1032,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ ASSERT_EQ(0, test_open(file2_s1d3, O_WRONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE,
+- layer3_read);
++ layer3_read);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -863,8 +1043,10 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ ASSERT_EQ(0, test_open(file2_s1d3, O_WRONLY));
+
+ /* This time, denies write access for the file hierarchy. */
+- ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE, layer4_read_write);
++ ruleset_fd = create_ruleset(_metadata,
++ LANDLOCK_ACCESS_FS_READ_FILE |
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
++ layer4_read_write);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -879,7 +1061,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_WRONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE,
+- layer5_read);
++ layer5_read);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -891,7 +1073,7 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_EXECUTE,
+- layer6_execute);
++ layer6_execute);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -902,8 +1084,10 @@ TEST_F_FORK(layout1, interleaved_masked_accesses)
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+
+- ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE, layer7_read_write);
++ ruleset_fd = create_ruleset(_metadata,
++ LANDLOCK_ACCESS_FS_READ_FILE |
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
++ layer7_read_write);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+@@ -921,9 +1105,9 @@ TEST_F_FORK(layout1, inherit_subset)
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_READ_DIR,
++ LANDLOCK_ACCESS_FS_READ_DIR,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -949,7 +1133,7 @@ TEST_F_FORK(layout1, inherit_subset)
+ * ANDed with the previous ones.
+ */
+ add_path_beneath(_metadata, ruleset_fd, LANDLOCK_ACCESS_FS_WRITE_FILE,
+- dir_s1d2);
++ dir_s1d2);
+ /*
+ * According to ruleset_fd, dir_s1d2 should now have the
+ * LANDLOCK_ACCESS_FS_READ_FILE and LANDLOCK_ACCESS_FS_WRITE_FILE
+@@ -1004,7 +1188,7 @@ TEST_F_FORK(layout1, inherit_subset)
+ * that there was no rule tied to it before.
+ */
+ add_path_beneath(_metadata, ruleset_fd, LANDLOCK_ACCESS_FS_WRITE_FILE,
+- dir_s1d3);
++ dir_s1d3);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+@@ -1039,7 +1223,7 @@ TEST_F_FORK(layout1, inherit_superset)
+ .path = dir_s1d3,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -1054,8 +1238,10 @@ TEST_F_FORK(layout1, inherit_superset)
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+
+ /* Now dir_s1d2, parent of dir_s1d3, gets a new rule tied to it. */
+- add_path_beneath(_metadata, ruleset_fd, LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_READ_DIR, dir_s1d2);
++ add_path_beneath(_metadata, ruleset_fd,
++ LANDLOCK_ACCESS_FS_READ_FILE |
++ LANDLOCK_ACCESS_FS_READ_DIR,
++ dir_s1d2);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+@@ -1075,12 +1261,12 @@ TEST_F_FORK(layout1, max_layers)
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+- for (i = 0; i < 64; i++)
++ for (i = 0; i < 16; i++)
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ for (i = 0; i < 2; i++) {
+@@ -1097,15 +1283,15 @@ TEST_F_FORK(layout1, empty_or_same_ruleset)
+ int ruleset_fd;
+
+ /* Tests empty handled_access_fs. */
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(-1, ruleset_fd);
+ ASSERT_EQ(ENOMSG, errno);
+
+ /* Enforces policy which deny read access to all files. */
+ ruleset_attr.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE;
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+@@ -1113,8 +1299,8 @@ TEST_F_FORK(layout1, empty_or_same_ruleset)
+
+ /* Nests a policy which deny read access to all directories. */
+ ruleset_attr.handled_access_fs = LANDLOCK_ACCESS_FS_READ_DIR;
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+@@ -1137,7 +1323,7 @@ TEST_F_FORK(layout1, rule_on_mountpoint)
+ .path = dir_s3d2,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -1166,7 +1352,7 @@ TEST_F_FORK(layout1, rule_over_mountpoint)
+ .path = dir_s3d1,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -1194,7 +1380,7 @@ TEST_F_FORK(layout1, rule_over_root_allow_then_deny)
+ .path = "/",
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -1224,7 +1410,7 @@ TEST_F_FORK(layout1, rule_over_root_deny)
+ .path = "/",
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -1244,12 +1430,13 @@ TEST_F_FORK(layout1, rule_inside_mount_ns)
+ .path = "s3d3",
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ int ruleset_fd;
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+- ASSERT_EQ(0, syscall(SYS_pivot_root, dir_s3d2, dir_s3d3)) {
++ ASSERT_EQ(0, syscall(__NR_pivot_root, dir_s3d2, dir_s3d3))
++ {
+ TH_LOG("Failed to pivot root: %s", strerror(errno));
+ };
+ ASSERT_EQ(0, chdir("/"));
+@@ -1271,7 +1458,7 @@ TEST_F_FORK(layout1, mount_and_pivot)
+ .path = dir_s3d2,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -1282,7 +1469,7 @@ TEST_F_FORK(layout1, mount_and_pivot)
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(-1, mount(NULL, dir_s3d2, NULL, MS_RDONLY, NULL));
+ ASSERT_EQ(EPERM, errno);
+- ASSERT_EQ(-1, syscall(SYS_pivot_root, dir_s3d2, dir_s3d3));
++ ASSERT_EQ(-1, syscall(__NR_pivot_root, dir_s3d2, dir_s3d3));
+ ASSERT_EQ(EPERM, errno);
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ }
+@@ -1294,28 +1481,29 @@ TEST_F_FORK(layout1, move_mount)
+ .path = dir_s3d2,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+- ASSERT_EQ(0, syscall(SYS_move_mount, AT_FDCWD, dir_s3d2, AT_FDCWD,
+- dir_s1d2, 0)) {
++ ASSERT_EQ(0, syscall(__NR_move_mount, AT_FDCWD, dir_s3d2, AT_FDCWD,
++ dir_s1d2, 0))
++ {
+ TH_LOG("Failed to move mount: %s", strerror(errno));
+ }
+
+- ASSERT_EQ(0, syscall(SYS_move_mount, AT_FDCWD, dir_s1d2, AT_FDCWD,
+- dir_s3d2, 0));
++ ASSERT_EQ(0, syscall(__NR_move_mount, AT_FDCWD, dir_s1d2, AT_FDCWD,
++ dir_s3d2, 0));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+- ASSERT_EQ(-1, syscall(SYS_move_mount, AT_FDCWD, dir_s3d2, AT_FDCWD,
+- dir_s1d2, 0));
++ ASSERT_EQ(-1, syscall(__NR_move_mount, AT_FDCWD, dir_s3d2, AT_FDCWD,
++ dir_s1d2, 0));
+ ASSERT_EQ(EPERM, errno);
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ }
+@@ -1335,7 +1523,7 @@ TEST_F_FORK(layout1, release_inodes)
+ .path = dir_s3d3,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+@@ -1362,7 +1550,7 @@ enum relative_access {
+ };
+
+ static void test_relative_path(struct __test_metadata *const _metadata,
+- const enum relative_access rel)
++ const enum relative_access rel)
+ {
+ /*
+ * Common layer to check that chroot doesn't ignore it (i.e. a chroot
+@@ -1373,7 +1561,7 @@ static void test_relative_path(struct __test_metadata *const _metadata,
+ .path = TMP_DIR,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ const struct rule layer2_subs[] = {
+ {
+@@ -1384,7 +1572,7 @@ static void test_relative_path(struct __test_metadata *const _metadata,
+ .path = dir_s2d2,
+ .access = ACCESS_RO,
+ },
+- {}
++ {},
+ };
+ int dirfd, ruleset_fd;
+
+@@ -1425,14 +1613,16 @@ static void test_relative_path(struct __test_metadata *const _metadata,
+ break;
+ case REL_CHROOT_ONLY:
+ /* Do chroot into dir_s1d2 (relative to dir_s2d2). */
+- ASSERT_EQ(0, chroot("../../s1d1/s1d2")) {
++ ASSERT_EQ(0, chroot("../../s1d1/s1d2"))
++ {
+ TH_LOG("Failed to chroot: %s", strerror(errno));
+ }
+ dirfd = AT_FDCWD;
+ break;
+ case REL_CHROOT_CHDIR:
+ /* Do chroot into dir_s1d2. */
+- ASSERT_EQ(0, chroot(".")) {
++ ASSERT_EQ(0, chroot("."))
++ {
+ TH_LOG("Failed to chroot: %s", strerror(errno));
+ }
+ dirfd = AT_FDCWD;
+@@ -1440,7 +1630,7 @@ static void test_relative_path(struct __test_metadata *const _metadata,
+ }
+
+ ASSERT_EQ((rel == REL_CHROOT_CHDIR) ? 0 : EACCES,
+- test_open_rel(dirfd, "..", O_RDONLY));
++ test_open_rel(dirfd, "..", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, ".", O_RDONLY));
+
+ if (rel == REL_CHROOT_ONLY) {
+@@ -1462,11 +1652,13 @@ static void test_relative_path(struct __test_metadata *const _metadata,
+ if (rel != REL_CHROOT_CHDIR) {
+ ASSERT_EQ(EACCES, test_open_rel(dirfd, "../../s1d1", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, "../../s1d1/s1d2", O_RDONLY));
+- ASSERT_EQ(0, test_open_rel(dirfd, "../../s1d1/s1d2/s1d3", O_RDONLY));
++ ASSERT_EQ(0, test_open_rel(dirfd, "../../s1d1/s1d2/s1d3",
++ O_RDONLY));
+
+ ASSERT_EQ(EACCES, test_open_rel(dirfd, "../../s2d1", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, "../../s2d1/s2d2", O_RDONLY));
+- ASSERT_EQ(0, test_open_rel(dirfd, "../../s2d1/s2d2/s2d3", O_RDONLY));
++ ASSERT_EQ(0, test_open_rel(dirfd, "../../s2d1/s2d2/s2d3",
++ O_RDONLY));
+ }
+
+ if (rel == REL_OPEN)
+@@ -1495,40 +1687,42 @@ TEST_F_FORK(layout1, relative_chroot_chdir)
+ }
+
+ static void copy_binary(struct __test_metadata *const _metadata,
+- const char *const dst_path)
++ const char *const dst_path)
+ {
+ int dst_fd, src_fd;
+ struct stat statbuf;
+
+ dst_fd = open(dst_path, O_WRONLY | O_TRUNC | O_CLOEXEC);
+- ASSERT_LE(0, dst_fd) {
+- TH_LOG("Failed to open \"%s\": %s", dst_path,
+- strerror(errno));
++ ASSERT_LE(0, dst_fd)
++ {
++ TH_LOG("Failed to open \"%s\": %s", dst_path, strerror(errno));
+ }
+ src_fd = open(BINARY_PATH, O_RDONLY | O_CLOEXEC);
+- ASSERT_LE(0, src_fd) {
++ ASSERT_LE(0, src_fd)
++ {
+ TH_LOG("Failed to open \"" BINARY_PATH "\": %s",
+- strerror(errno));
++ strerror(errno));
+ }
+ ASSERT_EQ(0, fstat(src_fd, &statbuf));
+- ASSERT_EQ(statbuf.st_size, sendfile(dst_fd, src_fd, 0,
+- statbuf.st_size));
++ ASSERT_EQ(statbuf.st_size,
++ sendfile(dst_fd, src_fd, 0, statbuf.st_size));
+ ASSERT_EQ(0, close(src_fd));
+ ASSERT_EQ(0, close(dst_fd));
+ }
+
+-static void test_execute(struct __test_metadata *const _metadata,
+- const int err, const char *const path)
++static void test_execute(struct __test_metadata *const _metadata, const int err,
++ const char *const path)
+ {
+ int status;
+- char *const argv[] = {(char *)path, NULL};
++ char *const argv[] = { (char *)path, NULL };
+ const pid_t child = fork();
+
+ ASSERT_LE(0, child);
+ if (child == 0) {
+- ASSERT_EQ(err ? -1 : 0, execve(path, argv, NULL)) {
++ ASSERT_EQ(err ? -1 : 0, execve(path, argv, NULL))
++ {
+ TH_LOG("Failed to execute \"%s\": %s", path,
+- strerror(errno));
++ strerror(errno));
+ };
+ ASSERT_EQ(err, errno);
+ _exit(_metadata->passed ? 2 : 1);
+@@ -1536,9 +1730,10 @@ static void test_execute(struct __test_metadata *const _metadata,
+ }
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ ASSERT_EQ(1, WIFEXITED(status));
+- ASSERT_EQ(err ? 2 : 0, WEXITSTATUS(status)) {
++ ASSERT_EQ(err ? 2 : 0, WEXITSTATUS(status))
++ {
+ TH_LOG("Unexpected return code for \"%s\": %s", path,
+- strerror(errno));
++ strerror(errno));
+ };
+ }
+
+@@ -1549,10 +1744,10 @@ TEST_F_FORK(layout1, execute)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_EXECUTE,
+ },
+- {}
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ copy_binary(_metadata, file1_s1d1);
+@@ -1577,15 +1772,21 @@ TEST_F_FORK(layout1, execute)
+
+ TEST_F_FORK(layout1, link)
+ {
+- const struct rule rules[] = {
++ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+- {}
++ {},
++ };
++ const struct rule layer2[] = {
++ {
++ .path = dir_s1d3,
++ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
++ },
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ int ruleset_fd = create_ruleset(_metadata, layer1[0].access, layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+
+@@ -1598,14 +1799,30 @@ TEST_F_FORK(layout1, link)
+
+ ASSERT_EQ(-1, link(file2_s1d1, file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
++
+ /* Denies linking because of reparenting. */
+ ASSERT_EQ(-1, link(file1_s2d1, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, link(file2_s1d2, file1_s1d3));
+ ASSERT_EQ(EXDEV, errno);
++ ASSERT_EQ(-1, link(file2_s1d3, file1_s1d2));
++ ASSERT_EQ(EXDEV, errno);
+
+ ASSERT_EQ(0, link(file2_s1d2, file1_s1d2));
+ ASSERT_EQ(0, link(file2_s1d3, file1_s1d3));
++
++ /* Prepares for next unlinks. */
++ ASSERT_EQ(0, unlink(file2_s1d2));
++ ASSERT_EQ(0, unlink(file2_s1d3));
++
++ ruleset_fd = create_ruleset(_metadata, layer2[0].access, layer2);
++ ASSERT_LE(0, ruleset_fd);
++ enforce_ruleset(_metadata, ruleset_fd);
++ ASSERT_EQ(0, close(ruleset_fd));
++
++ /* Checks that linkind doesn't require the ability to delete a file. */
++ ASSERT_EQ(0, link(file1_s1d2, file2_s1d2));
++ ASSERT_EQ(0, link(file1_s1d3, file2_s1d3));
+ }
+
+ TEST_F_FORK(layout1, rename_file)
+@@ -1619,14 +1836,13 @@ TEST_F_FORK(layout1, rename_file)
+ .path = dir_s2d2,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+- {}
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+- ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+@@ -1662,9 +1878,15 @@ TEST_F_FORK(layout1, rename_file)
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, dir_s2d2, AT_FDCWD, file1_s2d1,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
++ /* Checks that file1_s2d1 cannot be removed (instead of ENOTDIR). */
++ ASSERT_EQ(-1, rename(dir_s2d2, file1_s2d1));
++ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file1_s2d1, AT_FDCWD, dir_s2d2,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
++ /* Checks that file1_s1d1 cannot be removed (instead of EISDIR). */
++ ASSERT_EQ(-1, rename(file1_s1d1, dir_s1d2));
++ ASSERT_EQ(EACCES, errno);
+
+ /* Renames files with different parents. */
+ ASSERT_EQ(-1, rename(file1_s2d2, file1_s1d2));
+@@ -1675,14 +1897,14 @@ TEST_F_FORK(layout1, rename_file)
+
+ /* Exchanges and renames files with same parent. */
+ ASSERT_EQ(0, renameat2(AT_FDCWD, file2_s2d3, AT_FDCWD, file1_s2d3,
+- RENAME_EXCHANGE));
++ RENAME_EXCHANGE));
+ ASSERT_EQ(0, rename(file2_s2d3, file1_s2d3));
+
+ /* Exchanges files and directories with same parent, twice. */
+ ASSERT_EQ(0, renameat2(AT_FDCWD, file1_s2d2, AT_FDCWD, dir_s2d3,
+- RENAME_EXCHANGE));
++ RENAME_EXCHANGE));
+ ASSERT_EQ(0, renameat2(AT_FDCWD, file1_s2d2, AT_FDCWD, dir_s2d3,
+- RENAME_EXCHANGE));
++ RENAME_EXCHANGE));
+ }
+
+ TEST_F_FORK(layout1, rename_dir)
+@@ -1696,10 +1918,10 @@ TEST_F_FORK(layout1, rename_dir)
+ .path = dir_s2d1,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ },
+- {}
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+@@ -1727,16 +1949,22 @@ TEST_F_FORK(layout1, rename_dir)
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, dir_s1d1, AT_FDCWD, dir_s2d1,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
++ /* Checks that dir_s1d2 cannot be removed (instead of ENOTDIR). */
++ ASSERT_EQ(-1, rename(dir_s1d2, file1_s1d1));
++ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file1_s1d1, AT_FDCWD, dir_s1d2,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
++ /* Checks that dir_s1d2 cannot be removed (instead of EISDIR). */
++ ASSERT_EQ(-1, rename(file1_s1d1, dir_s1d2));
++ ASSERT_EQ(EACCES, errno);
+
+ /*
+ * Exchanges and renames directory to the same parent, which allows
+ * directory removal.
+ */
+ ASSERT_EQ(0, renameat2(AT_FDCWD, dir_s1d3, AT_FDCWD, file1_s1d2,
+- RENAME_EXCHANGE));
++ RENAME_EXCHANGE));
+ ASSERT_EQ(0, unlink(dir_s1d3));
+ ASSERT_EQ(0, mkdir(dir_s1d3, 0700));
+ ASSERT_EQ(0, rename(file1_s1d2, dir_s1d3));
+@@ -1750,10 +1978,10 @@ TEST_F_FORK(layout1, remove_dir)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ },
+- {}
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+@@ -1787,10 +2015,10 @@ TEST_F_FORK(layout1, remove_file)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+- {}
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+@@ -1805,14 +2033,15 @@ TEST_F_FORK(layout1, remove_file)
+ }
+
+ static void test_make_file(struct __test_metadata *const _metadata,
+- const __u64 access, const mode_t mode, const dev_t dev)
++ const __u64 access, const mode_t mode,
++ const dev_t dev)
+ {
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = access,
+ },
+- {}
++ {},
+ };
+ const int ruleset_fd = create_ruleset(_metadata, access, rules);
+
+@@ -1820,9 +2049,10 @@ static void test_make_file(struct __test_metadata *const _metadata,
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file2_s1d1));
+- ASSERT_EQ(0, mknod(file2_s1d1, mode | 0400, dev)) {
+- TH_LOG("Failed to make file \"%s\": %s",
+- file2_s1d1, strerror(errno));
++ ASSERT_EQ(0, mknod(file2_s1d1, mode | 0400, dev))
++ {
++ TH_LOG("Failed to make file \"%s\": %s", file2_s1d1,
++ strerror(errno));
+ };
+
+ ASSERT_EQ(0, unlink(file1_s1d2));
+@@ -1841,9 +2071,10 @@ static void test_make_file(struct __test_metadata *const _metadata,
+ ASSERT_EQ(-1, rename(file2_s1d1, file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+
+- ASSERT_EQ(0, mknod(file1_s1d2, mode | 0400, dev)) {
+- TH_LOG("Failed to make file \"%s\": %s",
+- file1_s1d2, strerror(errno));
++ ASSERT_EQ(0, mknod(file1_s1d2, mode | 0400, dev))
++ {
++ TH_LOG("Failed to make file \"%s\": %s", file1_s1d2,
++ strerror(errno));
+ };
+ ASSERT_EQ(0, link(file1_s1d2, file2_s1d2));
+ ASSERT_EQ(0, unlink(file2_s1d2));
+@@ -1860,7 +2091,7 @@ TEST_F_FORK(layout1, make_char)
+ /* Creates a /dev/null device. */
+ set_cap(_metadata, CAP_MKNOD);
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_CHAR, S_IFCHR,
+- makedev(1, 3));
++ makedev(1, 3));
+ }
+
+ TEST_F_FORK(layout1, make_block)
+@@ -1868,7 +2099,7 @@ TEST_F_FORK(layout1, make_block)
+ /* Creates a /dev/loop0 device. */
+ set_cap(_metadata, CAP_MKNOD);
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_BLOCK, S_IFBLK,
+- makedev(7, 0));
++ makedev(7, 0));
+ }
+
+ TEST_F_FORK(layout1, make_reg_1)
+@@ -1898,10 +2129,10 @@ TEST_F_FORK(layout1, make_sym)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_SYM,
+ },
+- {}
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+@@ -1943,10 +2174,10 @@ TEST_F_FORK(layout1, make_dir)
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_DIR,
+ },
+- {}
++ {},
+ };
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+@@ -1965,12 +2196,12 @@ TEST_F_FORK(layout1, make_dir)
+ }
+
+ static int open_proc_fd(struct __test_metadata *const _metadata, const int fd,
+- const int open_flags)
++ const int open_flags)
+ {
+ static const char path_template[] = "/proc/self/fd/%d";
+ char procfd_path[sizeof(path_template) + 10];
+- const int procfd_path_size = snprintf(procfd_path, sizeof(procfd_path),
+- path_template, fd);
++ const int procfd_path_size =
++ snprintf(procfd_path, sizeof(procfd_path), path_template, fd);
+
+ ASSERT_LT(procfd_path_size, sizeof(procfd_path));
+ return open(procfd_path, open_flags);
+@@ -1983,12 +2214,13 @@ TEST_F_FORK(layout1, proc_unlinked_file)
+ .path = file1_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+- {}
++ {},
+ };
+ int reg_fd, proc_fd;
+- const int ruleset_fd = create_ruleset(_metadata,
+- LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE, rules);
++ const int ruleset_fd = create_ruleset(
++ _metadata,
++ LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_WRITE_FILE,
++ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+@@ -2005,9 +2237,10 @@ TEST_F_FORK(layout1, proc_unlinked_file)
+ ASSERT_EQ(0, close(proc_fd));
+
+ proc_fd = open_proc_fd(_metadata, reg_fd, O_RDWR | O_CLOEXEC);
+- ASSERT_EQ(-1, proc_fd) {
+- TH_LOG("Successfully opened /proc/self/fd/%d: %s",
+- reg_fd, strerror(errno));
++ ASSERT_EQ(-1, proc_fd)
++ {
++ TH_LOG("Successfully opened /proc/self/fd/%d: %s", reg_fd,
++ strerror(errno));
+ }
+ ASSERT_EQ(EACCES, errno);
+
+@@ -2023,13 +2256,13 @@ TEST_F_FORK(layout1, proc_pipe)
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ /* Limits read and write access to files tied to the filesystem. */
+- const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+- rules);
++ const int ruleset_fd =
++ create_ruleset(_metadata, rules[0].access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+@@ -2041,7 +2274,8 @@ TEST_F_FORK(layout1, proc_pipe)
+
+ /* Checks access to pipes through FD. */
+ ASSERT_EQ(0, pipe2(pipe_fds, O_CLOEXEC));
+- ASSERT_EQ(1, write(pipe_fds[1], ".", 1)) {
++ ASSERT_EQ(1, write(pipe_fds[1], ".", 1))
++ {
+ TH_LOG("Failed to write in pipe: %s", strerror(errno));
+ }
+ ASSERT_EQ(1, read(pipe_fds[0], &buf, 1));
+@@ -2050,9 +2284,10 @@ TEST_F_FORK(layout1, proc_pipe)
+ /* Checks write access to pipe through /proc/self/fd . */
+ proc_fd = open_proc_fd(_metadata, pipe_fds[1], O_WRONLY | O_CLOEXEC);
+ ASSERT_LE(0, proc_fd);
+- ASSERT_EQ(1, write(proc_fd, ".", 1)) {
++ ASSERT_EQ(1, write(proc_fd, ".", 1))
++ {
+ TH_LOG("Failed to write through /proc/self/fd/%d: %s",
+- pipe_fds[1], strerror(errno));
++ pipe_fds[1], strerror(errno));
+ }
+ ASSERT_EQ(0, close(proc_fd));
+
+@@ -2060,9 +2295,10 @@ TEST_F_FORK(layout1, proc_pipe)
+ proc_fd = open_proc_fd(_metadata, pipe_fds[0], O_RDONLY | O_CLOEXEC);
+ ASSERT_LE(0, proc_fd);
+ buf = '\0';
+- ASSERT_EQ(1, read(proc_fd, &buf, 1)) {
++ ASSERT_EQ(1, read(proc_fd, &buf, 1))
++ {
+ TH_LOG("Failed to read through /proc/self/fd/%d: %s",
+- pipe_fds[1], strerror(errno));
++ pipe_fds[1], strerror(errno));
+ }
+ ASSERT_EQ(0, close(proc_fd));
+
+@@ -2070,8 +2306,9 @@ TEST_F_FORK(layout1, proc_pipe)
+ ASSERT_EQ(0, close(pipe_fds[1]));
+ }
+
+-FIXTURE(layout1_bind) {
+-};
++/* clang-format off */
++FIXTURE(layout1_bind) {};
++/* clang-format on */
+
+ FIXTURE_SETUP(layout1_bind)
+ {
+@@ -2161,7 +2398,7 @@ TEST_F_FORK(layout1_bind, same_content_same_file)
+ .path = dir_s2d1,
+ .access = ACCESS_RW,
+ },
+- {}
++ {},
+ };
+ /*
+ * Sets access rights on the same bind-mounted directories. The result
+@@ -2177,7 +2414,7 @@ TEST_F_FORK(layout1_bind, same_content_same_file)
+ .path = dir_s2d2,
+ .access = ACCESS_RW,
+ },
+- {}
++ {},
+ };
+ /* Only allow read-access to the s1d3 hierarchies. */
+ const struct rule layer3_source[] = {
+@@ -2185,7 +2422,7 @@ TEST_F_FORK(layout1_bind, same_content_same_file)
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+- {}
++ {},
+ };
+ /* Removes all access rights. */
+ const struct rule layer4_destination[] = {
+@@ -2193,7 +2430,7 @@ TEST_F_FORK(layout1_bind, same_content_same_file)
+ .path = bind_file1_s1d3,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ int ruleset_fd;
+
+@@ -2282,8 +2519,8 @@ TEST_F_FORK(layout1_bind, same_content_same_file)
+ ASSERT_EQ(EACCES, test_open(bind_file1_s1d3, O_WRONLY));
+ }
+
+-#define LOWER_BASE TMP_DIR "/lower"
+-#define LOWER_DATA LOWER_BASE "/data"
++#define LOWER_BASE TMP_DIR "/lower"
++#define LOWER_DATA LOWER_BASE "/data"
+ static const char lower_fl1[] = LOWER_DATA "/fl1";
+ static const char lower_dl1[] = LOWER_DATA "/dl1";
+ static const char lower_dl1_fl2[] = LOWER_DATA "/dl1/fl2";
+@@ -2295,23 +2532,23 @@ static const char lower_do1_fl3[] = LOWER_DATA "/do1/fl3";
+ static const char (*lower_base_files[])[] = {
+ &lower_fl1,
+ &lower_fo1,
+- NULL
++ NULL,
+ };
+ static const char (*lower_base_directories[])[] = {
+ &lower_dl1,
+ &lower_do1,
+- NULL
++ NULL,
+ };
+ static const char (*lower_sub_files[])[] = {
+ &lower_dl1_fl2,
+ &lower_do1_fo2,
+ &lower_do1_fl3,
+- NULL
++ NULL,
+ };
+
+-#define UPPER_BASE TMP_DIR "/upper"
+-#define UPPER_DATA UPPER_BASE "/data"
+-#define UPPER_WORK UPPER_BASE "/work"
++#define UPPER_BASE TMP_DIR "/upper"
++#define UPPER_DATA UPPER_BASE "/data"
++#define UPPER_WORK UPPER_BASE "/work"
+ static const char upper_fu1[] = UPPER_DATA "/fu1";
+ static const char upper_du1[] = UPPER_DATA "/du1";
+ static const char upper_du1_fu2[] = UPPER_DATA "/du1/fu2";
+@@ -2323,22 +2560,22 @@ static const char upper_do1_fu3[] = UPPER_DATA "/do1/fu3";
+ static const char (*upper_base_files[])[] = {
+ &upper_fu1,
+ &upper_fo1,
+- NULL
++ NULL,
+ };
+ static const char (*upper_base_directories[])[] = {
+ &upper_du1,
+ &upper_do1,
+- NULL
++ NULL,
+ };
+ static const char (*upper_sub_files[])[] = {
+ &upper_du1_fu2,
+ &upper_do1_fo2,
+ &upper_do1_fu3,
+- NULL
++ NULL,
+ };
+
+-#define MERGE_BASE TMP_DIR "/merge"
+-#define MERGE_DATA MERGE_BASE "/data"
++#define MERGE_BASE TMP_DIR "/merge"
++#define MERGE_DATA MERGE_BASE "/data"
+ static const char merge_fl1[] = MERGE_DATA "/fl1";
+ static const char merge_dl1[] = MERGE_DATA "/dl1";
+ static const char merge_dl1_fl2[] = MERGE_DATA "/dl1/fl2";
+@@ -2355,21 +2592,17 @@ static const char (*merge_base_files[])[] = {
+ &merge_fl1,
+ &merge_fu1,
+ &merge_fo1,
+- NULL
++ NULL,
+ };
+ static const char (*merge_base_directories[])[] = {
+ &merge_dl1,
+ &merge_du1,
+ &merge_do1,
+- NULL
++ NULL,
+ };
+ static const char (*merge_sub_files[])[] = {
+- &merge_dl1_fl2,
+- &merge_du1_fu2,
+- &merge_do1_fo2,
+- &merge_do1_fl3,
+- &merge_do1_fu3,
+- NULL
++ &merge_dl1_fl2, &merge_du1_fu2, &merge_do1_fo2,
++ &merge_do1_fl3, &merge_do1_fu3, NULL,
+ };
+
+ /*
+@@ -2411,8 +2644,9 @@ static const char (*merge_sub_files[])[] = {
+ * └── work
+ */
+
+-FIXTURE(layout2_overlay) {
+-};
++/* clang-format off */
++FIXTURE(layout2_overlay) {};
++/* clang-format on */
+
+ FIXTURE_SETUP(layout2_overlay)
+ {
+@@ -2444,9 +2678,8 @@ FIXTURE_SETUP(layout2_overlay)
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ set_cap(_metadata, CAP_DAC_OVERRIDE);
+ ASSERT_EQ(0, mount("overlay", MERGE_DATA, "overlay", 0,
+- "lowerdir=" LOWER_DATA
+- ",upperdir=" UPPER_DATA
+- ",workdir=" UPPER_WORK));
++ "lowerdir=" LOWER_DATA ",upperdir=" UPPER_DATA
++ ",workdir=" UPPER_WORK));
+ clear_cap(_metadata, CAP_DAC_OVERRIDE);
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ }
+@@ -2513,9 +2746,9 @@ TEST_F_FORK(layout2_overlay, no_restriction)
+ ASSERT_EQ(0, test_open(merge_do1_fu3, O_RDONLY));
+ }
+
+-#define for_each_path(path_list, path_entry, i) \
+- for (i = 0, path_entry = *path_list[i]; path_list[i]; \
+- path_entry = *path_list[++i])
++#define for_each_path(path_list, path_entry, i) \
++ for (i = 0, path_entry = *path_list[i]; path_list[i]; \
++ path_entry = *path_list[++i])
+
+ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ {
+@@ -2533,7 +2766,7 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ .path = MERGE_BASE,
+ .access = ACCESS_RW,
+ },
+- {}
++ {},
+ };
+ const struct rule layer2_data[] = {
+ {
+@@ -2548,7 +2781,7 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ .path = MERGE_DATA,
+ .access = ACCESS_RW,
+ },
+- {}
++ {},
+ };
+ /* Sets access right on directories inside both layers. */
+ const struct rule layer3_subdirs[] = {
+@@ -2580,7 +2813,7 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ .path = merge_do1,
+ .access = ACCESS_RW,
+ },
+- {}
++ {},
+ };
+ /* Tighten access rights to the files. */
+ const struct rule layer4_files[] = {
+@@ -2611,37 +2844,37 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ {
+ .path = merge_dl1_fl2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_du1_fu2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_do1_fo2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_do1_fl3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_do1_fu3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ const struct rule layer5_merge_only[] = {
+ {
+ .path = MERGE_DATA,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+- LANDLOCK_ACCESS_FS_WRITE_FILE,
++ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+- {}
++ {},
+ };
+ int ruleset_fd;
+ size_t i;
+@@ -2659,7 +2892,8 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ for_each_path(lower_base_directories, path_entry, i) {
+- ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
++ ASSERT_EQ(EACCES,
++ test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(lower_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+@@ -2671,7 +2905,8 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ for_each_path(upper_base_directories, path_entry, i) {
+- ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
++ ASSERT_EQ(EACCES,
++ test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(upper_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+@@ -2756,7 +2991,8 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDWR));
+ }
+ for_each_path(merge_base_directories, path_entry, i) {
+- ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
++ ASSERT_EQ(EACCES,
++ test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(merge_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+@@ -2781,7 +3017,8 @@ TEST_F_FORK(layout2_overlay, same_content_different_file)
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDWR));
+ }
+ for_each_path(merge_base_directories, path_entry, i) {
+- ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
++ ASSERT_EQ(EACCES,
++ test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(merge_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c
+index 15fbef9cc8496..c28ef98ff3ac1 100644
+--- a/tools/testing/selftests/landlock/ptrace_test.c
++++ b/tools/testing/selftests/landlock/ptrace_test.c
+@@ -26,9 +26,10 @@ static void create_domain(struct __test_metadata *const _metadata)
+ .handled_access_fs = LANDLOCK_ACCESS_FS_MAKE_BLOCK,
+ };
+
+- ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+- sizeof(ruleset_attr), 0);
+- EXPECT_LE(0, ruleset_fd) {
++ ruleset_fd =
++ landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
++ EXPECT_LE(0, ruleset_fd)
++ {
+ TH_LOG("Failed to create a ruleset: %s", strerror(errno));
+ }
+ EXPECT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+@@ -43,7 +44,7 @@ static int test_ptrace_read(const pid_t pid)
+ int procenv_path_size, fd;
+
+ procenv_path_size = snprintf(procenv_path, sizeof(procenv_path),
+- path_template, pid);
++ path_template, pid);
+ if (procenv_path_size >= sizeof(procenv_path))
+ return E2BIG;
+
+@@ -59,9 +60,12 @@ static int test_ptrace_read(const pid_t pid)
+ return 0;
+ }
+
+-FIXTURE(hierarchy) { };
++/* clang-format off */
++FIXTURE(hierarchy) {};
++/* clang-format on */
+
+-FIXTURE_VARIANT(hierarchy) {
++FIXTURE_VARIANT(hierarchy)
++{
+ const bool domain_both;
+ const bool domain_parent;
+ const bool domain_child;
+@@ -83,7 +87,9 @@ FIXTURE_VARIANT(hierarchy) {
+ * \ P2 -> P1 : allow
+ * 'P2
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, allow_without_domain) {
++ /* clang-format on */
+ .domain_both = false,
+ .domain_parent = false,
+ .domain_child = false,
+@@ -98,7 +104,9 @@ FIXTURE_VARIANT_ADD(hierarchy, allow_without_domain) {
+ * | P2 |
+ * '------'
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, allow_with_one_domain) {
++ /* clang-format on */
+ .domain_both = false,
+ .domain_parent = false,
+ .domain_child = true,
+@@ -112,7 +120,9 @@ FIXTURE_VARIANT_ADD(hierarchy, allow_with_one_domain) {
+ * '
+ * P2
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, deny_with_parent_domain) {
++ /* clang-format on */
+ .domain_both = false,
+ .domain_parent = true,
+ .domain_child = false,
+@@ -127,7 +137,9 @@ FIXTURE_VARIANT_ADD(hierarchy, deny_with_parent_domain) {
+ * | P2 |
+ * '------'
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, deny_with_sibling_domain) {
++ /* clang-format on */
+ .domain_both = false,
+ .domain_parent = true,
+ .domain_child = true,
+@@ -142,7 +154,9 @@ FIXTURE_VARIANT_ADD(hierarchy, deny_with_sibling_domain) {
+ * | P2 |
+ * '-------------'
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, allow_sibling_domain) {
++ /* clang-format on */
+ .domain_both = true,
+ .domain_parent = false,
+ .domain_child = false,
+@@ -158,7 +172,9 @@ FIXTURE_VARIANT_ADD(hierarchy, allow_sibling_domain) {
+ * | '------' |
+ * '-----------------'
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, allow_with_nested_domain) {
++ /* clang-format on */
+ .domain_both = true,
+ .domain_parent = false,
+ .domain_child = true,
+@@ -174,7 +190,9 @@ FIXTURE_VARIANT_ADD(hierarchy, allow_with_nested_domain) {
+ * | P2 |
+ * '-----------------'
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, deny_with_nested_and_parent_domain) {
++ /* clang-format on */
+ .domain_both = true,
+ .domain_parent = true,
+ .domain_child = false,
+@@ -192,17 +210,21 @@ FIXTURE_VARIANT_ADD(hierarchy, deny_with_nested_and_parent_domain) {
+ * | '------' |
+ * '-----------------'
+ */
++/* clang-format off */
+ FIXTURE_VARIANT_ADD(hierarchy, deny_with_forked_domain) {
++ /* clang-format on */
+ .domain_both = true,
+ .domain_parent = true,
+ .domain_child = true,
+ };
+
+ FIXTURE_SETUP(hierarchy)
+-{ }
++{
++}
+
+ FIXTURE_TEARDOWN(hierarchy)
+-{ }
++{
++}
+
+ /* Test PTRACE_TRACEME and PTRACE_ATTACH for parent and child. */
+ TEST_F(hierarchy, trace)
+@@ -330,7 +352,7 @@ TEST_F(hierarchy, trace)
+ ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ if (WIFSIGNALED(status) || !WIFEXITED(status) ||
+- WEXITSTATUS(status) != EXIT_SUCCESS)
++ WEXITSTATUS(status) != EXIT_SUCCESS)
+ _metadata->passed = 0;
+ }
+
+diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
+index 51e5cf22632f7..56ccbeae0638d 100644
+--- a/tools/testing/selftests/resctrl/fill_buf.c
++++ b/tools/testing/selftests/resctrl/fill_buf.c
+@@ -121,8 +121,10 @@ static int fill_cache_read(unsigned char *start_ptr, unsigned char *end_ptr,
+
+ /* Consume read result so that reading memory is not optimized out. */
+ fp = fopen("/dev/null", "w");
+- if (!fp)
++ if (!fp) {
+ perror("Unable to write to /dev/null");
++ return -1;
++ }
+ fprintf(fp, "Sum: %d ", ret);
+ fclose(fp);
+
+diff --git a/tools/tracing/rtla/Makefile b/tools/tracing/rtla/Makefile
+index 11fb417abb42f..523f0a8c38c23 100644
+--- a/tools/tracing/rtla/Makefile
++++ b/tools/tracing/rtla/Makefile
+@@ -23,6 +23,7 @@ $(call allow-override,LD_SO_CONF_PATH,/etc/ld.so.conf.d/)
+ $(call allow-override,LDCONFIG,ldconfig)
+
+ INSTALL = install
++MKDIR = mkdir
+ FOPTS := -flto=auto -ffat-lto-objects -fexceptions -fstack-protector-strong \
+ -fasynchronous-unwind-tables -fstack-clash-protection
+ WOPTS := -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -Wno-maybe-uninitialized
+@@ -31,7 +32,7 @@ TRACEFS_HEADERS := $$($(PKG_CONFIG) --cflags libtracefs)
+
+ CFLAGS := -O -g -DVERSION=\"$(VERSION)\" $(FOPTS) $(MOPTS) $(WOPTS) $(TRACEFS_HEADERS)
+ LDFLAGS := -ggdb
+-LIBS := $$($(PKG_CONFIG) --libs libtracefs) -lprocps
++LIBS := $$($(PKG_CONFIG) --libs libtracefs)
+
+ SRC := $(wildcard src/*.c)
+ HDR := $(wildcard src/*.h)
+@@ -68,7 +69,7 @@ static: $(OBJ)
+
+ .PHONY: install
+ install: doc_install
+- $(INSTALL) -d -m 755 $(DESTDIR)$(BINDIR)
++ $(MKDIR) -p $(DESTDIR)$(BINDIR)
+ $(INSTALL) rtla -m 755 $(DESTDIR)$(BINDIR)
+ $(STRIP) $(DESTDIR)$(BINDIR)/rtla
+ @test ! -f $(DESTDIR)$(BINDIR)/osnoise || rm $(DESTDIR)$(BINDIR)/osnoise
+diff --git a/tools/tracing/rtla/README.txt b/tools/tracing/rtla/README.txt
+index 6c88446f7e74a..4af3fd40f1716 100644
+--- a/tools/tracing/rtla/README.txt
++++ b/tools/tracing/rtla/README.txt
+@@ -1,19 +1,16 @@
+ RTLA: Real-Time Linux Analysis tools
+
+-The rtla is a meta-tool that includes a set of commands that
+-aims to analyze the real-time properties of Linux. But, instead of
+-testing Linux as a black box, rtla leverages kernel tracing
+-capabilities to provide precise information about the properties
+-and root causes of unexpected results.
++The rtla meta-tool includes a set of commands that aims to analyze
++the real-time properties of Linux. Instead of testing Linux as a black box,
++rtla leverages kernel tracing capabilities to provide precise information
++about the properties and root causes of unexpected results.
+
+ Installing RTLA
+
+-RTLA depends on some libraries and tools. More precisely, it depends on the
+-following libraries:
++RTLA depends on the following libraries and tools:
+
+ - libtracefs
+ - libtraceevent
+- - procps
+
+ It also depends on python3-docutils to compile man pages.
+
+diff --git a/tools/tracing/rtla/src/osnoise_hist.c b/tools/tracing/rtla/src/osnoise_hist.c
+index b4380d45cacd4..5d7ea479ac89f 100644
+--- a/tools/tracing/rtla/src/osnoise_hist.c
++++ b/tools/tracing/rtla/src/osnoise_hist.c
+@@ -809,7 +809,7 @@ int osnoise_hist_main(int argc, char *argv[])
+ retval = set_comm_sched_attr("osnoise/", ¶ms->sched_param);
+ if (retval) {
+ err_msg("Failed to set sched parameters\n");
+- goto out_hist;
++ goto out_free;
+ }
+ }
+
+@@ -819,7 +819,7 @@ int osnoise_hist_main(int argc, char *argv[])
+ record = osnoise_init_trace_tool("osnoise");
+ if (!record) {
+ err_msg("Failed to enable the trace instance\n");
+- goto out_hist;
++ goto out_free;
+ }
+
+ if (params->events) {
+@@ -869,6 +869,7 @@ int osnoise_hist_main(int argc, char *argv[])
+ out_hist:
+ trace_events_destroy(&record->trace, params->events);
+ params->events = NULL;
++out_free:
+ osnoise_free_histogram(tool->data);
+ out_destroy:
+ osnoise_destroy_tool(record);
+diff --git a/tools/tracing/rtla/src/osnoise_top.c b/tools/tracing/rtla/src/osnoise_top.c
+index 72c2fd6ce005d..76479bfb29224 100644
+--- a/tools/tracing/rtla/src/osnoise_top.c
++++ b/tools/tracing/rtla/src/osnoise_top.c
+@@ -572,7 +572,7 @@ int osnoise_top_main(int argc, char **argv)
+ retval = osnoise_top_apply_config(tool, params);
+ if (retval) {
+ err_msg("Could not apply config\n");
+- goto out_top;
++ goto out_free;
+ }
+
+ trace = &tool->trace;
+@@ -580,14 +580,14 @@ int osnoise_top_main(int argc, char **argv)
+ retval = enable_osnoise(trace);
+ if (retval) {
+ err_msg("Failed to enable osnoise tracer\n");
+- goto out_top;
++ goto out_free;
+ }
+
+ if (params->set_sched) {
+ retval = set_comm_sched_attr("osnoise/", ¶ms->sched_param);
+ if (retval) {
+ err_msg("Failed to set sched parameters\n");
+- goto out_top;
++ goto out_free;
+ }
+ }
+
+@@ -597,7 +597,7 @@ int osnoise_top_main(int argc, char **argv)
+ record = osnoise_init_trace_tool("osnoise");
+ if (!record) {
+ err_msg("Failed to enable the trace instance\n");
+- goto out_top;
++ goto out_free;
+ }
+
+ if (params->events) {
+@@ -649,6 +649,7 @@ int osnoise_top_main(int argc, char **argv)
+ out_top:
+ trace_events_destroy(&record->trace, params->events);
+ params->events = NULL;
++out_free:
+ osnoise_free_top(tool->data);
+ osnoise_destroy_tool(record);
+ osnoise_destroy_tool(tool);
+diff --git a/tools/tracing/rtla/src/timerlat_hist.c b/tools/tracing/rtla/src/timerlat_hist.c
+index dc908126c610d..f3ec628f5e519 100644
+--- a/tools/tracing/rtla/src/timerlat_hist.c
++++ b/tools/tracing/rtla/src/timerlat_hist.c
+@@ -821,7 +821,7 @@ int timerlat_hist_main(int argc, char *argv[])
+ retval = timerlat_hist_apply_config(tool, params);
+ if (retval) {
+ err_msg("Could not apply config\n");
+- goto out_hist;
++ goto out_free;
+ }
+
+ trace = &tool->trace;
+@@ -829,14 +829,14 @@ int timerlat_hist_main(int argc, char *argv[])
+ retval = enable_timerlat(trace);
+ if (retval) {
+ err_msg("Failed to enable timerlat tracer\n");
+- goto out_hist;
++ goto out_free;
+ }
+
+ if (params->set_sched) {
+ retval = set_comm_sched_attr("timerlat/", ¶ms->sched_param);
+ if (retval) {
+ err_msg("Failed to set sched parameters\n");
+- goto out_hist;
++ goto out_free;
+ }
+ }
+
+@@ -844,7 +844,7 @@ int timerlat_hist_main(int argc, char *argv[])
+ dma_latency_fd = set_cpu_dma_latency(params->dma_latency);
+ if (dma_latency_fd < 0) {
+ err_msg("Could not set /dev/cpu_dma_latency.\n");
+- goto out_hist;
++ goto out_free;
+ }
+ }
+
+@@ -854,7 +854,7 @@ int timerlat_hist_main(int argc, char *argv[])
+ record = osnoise_init_trace_tool("timerlat");
+ if (!record) {
+ err_msg("Failed to enable the trace instance\n");
+- goto out_hist;
++ goto out_free;
+ }
+
+ if (params->events) {
+@@ -904,6 +904,7 @@ out_hist:
+ close(dma_latency_fd);
+ trace_events_destroy(&record->trace, params->events);
+ params->events = NULL;
++out_free:
+ timerlat_free_histogram(tool->data);
+ osnoise_destroy_tool(record);
+ osnoise_destroy_tool(tool);
+diff --git a/tools/tracing/rtla/src/timerlat_top.c b/tools/tracing/rtla/src/timerlat_top.c
+index 1f754c3df53f2..35452a1d45e9f 100644
+--- a/tools/tracing/rtla/src/timerlat_top.c
++++ b/tools/tracing/rtla/src/timerlat_top.c
+@@ -612,7 +612,7 @@ int timerlat_top_main(int argc, char *argv[])
+ retval = timerlat_top_apply_config(top, params);
+ if (retval) {
+ err_msg("Could not apply config\n");
+- goto out_top;
++ goto out_free;
+ }
+
+ trace = &top->trace;
+@@ -620,14 +620,14 @@ int timerlat_top_main(int argc, char *argv[])
+ retval = enable_timerlat(trace);
+ if (retval) {
+ err_msg("Failed to enable timerlat tracer\n");
+- goto out_top;
++ goto out_free;
+ }
+
+ if (params->set_sched) {
+ retval = set_comm_sched_attr("timerlat/", ¶ms->sched_param);
+ if (retval) {
+ err_msg("Failed to set sched parameters\n");
+- goto out_top;
++ goto out_free;
+ }
+ }
+
+@@ -635,7 +635,7 @@ int timerlat_top_main(int argc, char *argv[])
+ dma_latency_fd = set_cpu_dma_latency(params->dma_latency);
+ if (dma_latency_fd < 0) {
+ err_msg("Could not set /dev/cpu_dma_latency.\n");
+- goto out_top;
++ goto out_free;
+ }
+ }
+
+@@ -645,7 +645,7 @@ int timerlat_top_main(int argc, char *argv[])
+ record = osnoise_init_trace_tool("timerlat");
+ if (!record) {
+ err_msg("Failed to enable the trace instance\n");
+- goto out_top;
++ goto out_free;
+ }
+
+ if (params->events) {
+@@ -699,6 +699,7 @@ out_top:
+ close(dma_latency_fd);
+ trace_events_destroy(&record->trace, params->events);
+ params->events = NULL;
++out_free:
+ timerlat_free_top(top->data);
+ osnoise_destroy_tool(record);
+ osnoise_destroy_tool(top);
+diff --git a/tools/tracing/rtla/src/utils.c b/tools/tracing/rtla/src/utils.c
+index da2b590edaede..5352167a1e751 100644
+--- a/tools/tracing/rtla/src/utils.c
++++ b/tools/tracing/rtla/src/utils.c
+@@ -3,7 +3,7 @@
+ * Copyright (C) 2021 Red Hat Inc, Daniel Bristot de Oliveira <bristot@kernel.org>
+ */
+
+-#include <proc/readproc.h>
++#include <dirent.h>
+ #include <stdarg.h>
+ #include <stdlib.h>
+ #include <string.h>
+@@ -255,50 +255,114 @@ int __set_sched_attr(int pid, struct sched_attr *attr)
+
+ retval = sched_setattr(pid, attr, flags);
+ if (retval < 0) {
+- err_msg("boost_with_deadline failed to boost pid %d: %s\n",
++ err_msg("Failed to set sched attributes to the pid %d: %s\n",
+ pid, strerror(errno));
+ return 1;
+ }
+
+ return 0;
+ }
++
++/*
++ * procfs_is_workload_pid - check if a procfs entry contains a comm_prefix* comm
++ *
++ * Check if the procfs entry is a directory of a process, and then check if the
++ * process has a comm with the prefix set in char *comm_prefix. As the
++ * current users of this function only check for kernel threads, there is no
++ * need to check for the threads for the process.
++ *
++ * Return: True if the proc_entry contains a comm file with comm_prefix*.
++ * Otherwise returns false.
++ */
++static int procfs_is_workload_pid(const char *comm_prefix, struct dirent *proc_entry)
++{
++ char buffer[MAX_PATH];
++ int comm_fd, retval;
++ char *t_name;
++
++ if (proc_entry->d_type != DT_DIR)
++ return 0;
++
++ if (*proc_entry->d_name == '.')
++ return 0;
++
++ /* check if the string is a pid */
++ for (t_name = proc_entry->d_name; t_name; t_name++) {
++ if (!isdigit(*t_name))
++ break;
++ }
++
++ if (*t_name != '\0')
++ return 0;
++
++ snprintf(buffer, MAX_PATH, "/proc/%s/comm", proc_entry->d_name);
++ comm_fd = open(buffer, O_RDONLY);
++ if (comm_fd < 0)
++ return 0;
++
++ memset(buffer, 0, MAX_PATH);
++ retval = read(comm_fd, buffer, MAX_PATH);
++
++ close(comm_fd);
++
++ if (retval <= 0)
++ return 0;
++
++ retval = strncmp(comm_prefix, buffer, strlen(comm_prefix));
++ if (retval)
++ return 0;
++
++ /* comm already have \n */
++ debug_msg("Found workload pid:%s comm:%s", proc_entry->d_name, buffer);
++
++ return 1;
++}
++
+ /*
+- * set_comm_sched_attr - set sched params to threads starting with char *comm
++ * set_comm_sched_attr - set sched params to threads starting with char *comm_prefix
+ *
+- * This function uses procps to list the currently running threads and then
+- * set the sched_attr *attr to the threads that start with char *comm. It is
++ * This function uses procfs to list the currently running threads and then set the
++ * sched_attr *attr to the threads that start with char *comm_prefix. It is
+ * mainly used to set the priority to the kernel threads created by the
+ * tracers.
+ */
+-int set_comm_sched_attr(const char *comm, struct sched_attr *attr)
++int set_comm_sched_attr(const char *comm_prefix, struct sched_attr *attr)
+ {
+- int flags = PROC_FILLCOM | PROC_FILLSTAT;
+- PROCTAB *ptp;
+- proc_t task;
++ struct dirent *proc_entry;
++ DIR *procfs;
+ int retval;
+
+- ptp = openproc(flags);
+- if (!ptp) {
+- err_msg("error openproc()\n");
+- return -ENOENT;
++ if (strlen(comm_prefix) >= MAX_PATH) {
++ err_msg("Command prefix is too long: %d < strlen(%s)\n",
++ MAX_PATH, comm_prefix);
++ return 1;
+ }
+
+- memset(&task, 0, sizeof(task));
++ procfs = opendir("/proc");
++ if (!procfs) {
++ err_msg("Could not open procfs\n");
++ return 1;
++ }
+
+- while (readproc(ptp, &task)) {
+- retval = strncmp(comm, task.cmd, strlen(comm));
+- if (retval)
++ while ((proc_entry = readdir(procfs))) {
++
++ retval = procfs_is_workload_pid(comm_prefix, proc_entry);
++ if (!retval)
+ continue;
+- retval = __set_sched_attr(task.tid, attr);
+- if (retval)
++
++ /* procfs_is_workload_pid confirmed it is a pid */
++ retval = __set_sched_attr(atoi(proc_entry->d_name), attr);
++ if (retval) {
++ err_msg("Error setting sched attributes for pid:%s\n", proc_entry->d_name);
+ goto out_err;
+- }
++ }
+
+- closeproc(ptp);
++ debug_msg("Set sched attributes for pid:%s\n", proc_entry->d_name);
++ }
+ return 0;
+
+ out_err:
+- closeproc(ptp);
++ closedir(procfs);
+ return 1;
+ }
+
+diff --git a/tools/tracing/rtla/src/utils.h b/tools/tracing/rtla/src/utils.h
+index fa08e374870ac..5571afd3b5498 100644
+--- a/tools/tracing/rtla/src/utils.h
++++ b/tools/tracing/rtla/src/utils.h
+@@ -6,6 +6,7 @@
+ * '18446744073709551615\0'
+ */
+ #define BUFF_U64_STR_SIZE 24
++#define MAX_PATH 1024
+
+ #define container_of(ptr, type, member)({ \
+ const typeof(((type *)0)->member) *__mptr = (ptr); \
+@@ -53,5 +54,5 @@ struct sched_attr {
+ };
+
+ int parse_prio(char *arg, struct sched_attr *sched_param);
+-int set_comm_sched_attr(const char *comm, struct sched_attr *attr);
++int set_comm_sched_attr(const char *comm_prefix, struct sched_attr *attr);
+ int set_cpu_dma_latency(int32_t latency);
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-09 18:40 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-09 18:40 UTC (permalink / raw
To: gentoo-commits
commit: ea28214f21dc64ec385bdcf962806c9d93a9e7e6
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Jun 9 18:39:55 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Jun 9 18:39:55 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=ea28214f
cifs: fix minor compile warning
Bug: https://bugs.gentoo.org/85076
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 ++++
1950_cifs-fix-minor-compile-warning.patch | 33 +++++++++++++++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/0000_README b/0000_README
index 5acbe17f..01987387 100644
--- a/0000_README
+++ b/0000_README
@@ -67,6 +67,10 @@ Patch: 1700_sparc-address-warray-bound-warnings.patch
From: https://github.com/KSPP/linux/issues/109
Desc: Address -Warray-bounds warnings
+Patch: 1950_cifs-fix-minor-compile-warning.patch
+From: https://git.kernel.org/
+Desc: cifs: fix minor compile warning
+
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1950_cifs-fix-minor-compile-warning.patch b/1950_cifs-fix-minor-compile-warning.patch
new file mode 100644
index 00000000..65787238
--- /dev/null
+++ b/1950_cifs-fix-minor-compile-warning.patch
@@ -0,0 +1,33 @@
+From 93ed91c020aa4f021600a633f1f87790a5e50b91 Mon Sep 17 00:00:00 2001
+From: Steve French <stfrench@microsoft.com>
+Date: Sun, 22 May 2022 21:25:24 -0500
+Subject: cifs: fix minor compile warning
+
+Add ifdef around nodfs variable from patch:
+ "cifs: don't call cifs_dfs_query_info_nonascii_quirk() if nodfs was set"
+which is unused when CONFIG_DFS_UPCALL is not set.
+
+Signed-off-by: Steve French <stfrench@microsoft.com>
+---
+ fs/cifs/connect.c | 2 ++
+ 1 file changed, 2 insertions(+)
+
+(limited to 'fs/cifs/connect.c')
+
+diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
+index 44dc66f21d832..0b08693d1af8f 100644
+--- a/fs/cifs/connect.c
++++ b/fs/cifs/connect.c
+@@ -3433,7 +3433,9 @@ static int is_path_remote(struct mount_ctx *mnt_ctx)
+ struct cifs_tcon *tcon = mnt_ctx->tcon;
+ struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+ char *full_path;
++#ifdef CONFIG_CIFS_DFS_UPCALL
+ bool nodfs = cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS;
++#endif
+
+ if (!server->ops->is_path_accessible)
+ return -EOPNOTSUPP;
+--
+cgit 1.2.3-1.el7
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-14 17:29 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-14 17:29 UTC (permalink / raw
To: gentoo-commits
commit: 2b6f39f4128d3fd3810c18307955fe90031a6854
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue Jun 14 17:29:07 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue Jun 14 17:29:07 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=2b6f39f4
Linux patch 5.18.4
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1003_linux-5.18.4.patch | 13704 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 13708 insertions(+)
diff --git a/0000_README b/0000_README
index 01987387..1dbd2f58 100644
--- a/0000_README
+++ b/0000_README
@@ -55,6 +55,10 @@ Patch: 1002_linux-5.18.3.patch
From: http://www.kernel.org
Desc: Linux 5.18.3
+Patch: 1003_linux-5.18.4.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.4
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1003_linux-5.18.4.patch b/1003_linux-5.18.4.patch
new file mode 100644
index 00000000..7e2824f0
--- /dev/null
+++ b/1003_linux-5.18.4.patch
@@ -0,0 +1,13704 @@
+diff --git a/Documentation/ABI/testing/sysfs-ata b/Documentation/ABI/testing/sysfs-ata
+index 2f726c9147522..3daecac48964f 100644
+--- a/Documentation/ABI/testing/sysfs-ata
++++ b/Documentation/ABI/testing/sysfs-ata
+@@ -107,13 +107,14 @@ Description:
+ described in ATA8 7.16 and 7.17. Only valid if
+ the device is not a PM.
+
+- pio_mode: (RO) Transfer modes supported by the device when
+- in PIO mode. Mostly used by PATA device.
++ pio_mode: (RO) PIO transfer mode used by the device.
++ Mostly used by PATA devices.
+
+- xfer_mode: (RO) Current transfer mode
++ xfer_mode: (RO) Current transfer mode. Mostly used by
++ PATA devices.
+
+- dma_mode: (RO) Transfer modes supported by the device when
+- in DMA mode. Mostly used by PATA device.
++ dma_mode: (RO) DMA transfer mode used by the device.
++ Mostly used by PATA devices.
+
+ class: (RO) Device class. Can be "ata" for disk,
+ "atapi" for packet device, "pmp" for PM, or
+diff --git a/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml b/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml
+index 5d2d989de893c..37402c370fbbc 100644
+--- a/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml
++++ b/Documentation/devicetree/bindings/regulator/mt6315-regulator.yaml
+@@ -55,7 +55,7 @@ examples:
+ regulator-min-microvolt = <300000>;
+ regulator-max-microvolt = <1193750>;
+ regulator-enable-ramp-delay = <256>;
+- regulator-allowed-modes = <0 1 2 4>;
++ regulator-allowed-modes = <0 1 2>;
+ };
+
+ vbuck3 {
+@@ -63,7 +63,7 @@ examples:
+ regulator-min-microvolt = <300000>;
+ regulator-max-microvolt = <1193750>;
+ regulator-enable-ramp-delay = <256>;
+- regulator-allowed-modes = <0 1 2 4>;
++ regulator-allowed-modes = <0 1 2>;
+ };
+ };
+ };
+diff --git a/Documentation/devicetree/bindings/remoteproc/mtk,scp.yaml b/Documentation/devicetree/bindings/remoteproc/mtk,scp.yaml
+index 5b693a2d049c1..d55b861db605f 100644
+--- a/Documentation/devicetree/bindings/remoteproc/mtk,scp.yaml
++++ b/Documentation/devicetree/bindings/remoteproc/mtk,scp.yaml
+@@ -23,11 +23,13 @@ properties:
+
+ reg:
+ description:
+- Should contain the address ranges for memory regions SRAM, CFG, and
+- L1TCM.
++ Should contain the address ranges for memory regions SRAM, CFG, and,
++ on some platforms, L1TCM.
++ minItems: 2
+ maxItems: 3
+
+ reg-names:
++ minItems: 2
+ items:
+ - const: sram
+ - const: cfg
+@@ -47,16 +49,30 @@ required:
+ - reg
+ - reg-names
+
+-if:
+- properties:
+- compatible:
+- enum:
+- - mediatek,mt8183-scp
+- - mediatek,mt8192-scp
+-then:
+- required:
+- - clocks
+- - clock-names
++allOf:
++ - if:
++ properties:
++ compatible:
++ enum:
++ - mediatek,mt8183-scp
++ - mediatek,mt8192-scp
++ then:
++ required:
++ - clocks
++ - clock-names
++
++ - if:
++ properties:
++ compatible:
++ enum:
++ - mediatek,mt8183-scp
++ - mediatek,mt8186-scp
++ then:
++ properties:
++ reg:
++ maxItems: 2
++ reg-names:
++ maxItems: 2
+
+ additionalProperties:
+ type: object
+@@ -76,10 +92,10 @@ additionalProperties:
+
+ examples:
+ - |
+- #include <dt-bindings/clock/mt8183-clk.h>
++ #include <dt-bindings/clock/mt8192-clk.h>
+
+ scp@10500000 {
+- compatible = "mediatek,mt8183-scp";
++ compatible = "mediatek,mt8192-scp";
+ reg = <0x10500000 0x80000>,
+ <0x10700000 0x8000>,
+ <0x10720000 0xe0000>;
+diff --git a/Documentation/tools/rtla/Makefile b/Documentation/tools/rtla/Makefile
+index 9f2b84af1a6c7..093af6d7a0e93 100644
+--- a/Documentation/tools/rtla/Makefile
++++ b/Documentation/tools/rtla/Makefile
+@@ -17,9 +17,21 @@ DOC_MAN1 = $(addprefix $(OUTPUT),$(_DOC_MAN1))
+ RST2MAN_DEP := $(shell command -v rst2man 2>/dev/null)
+ RST2MAN_OPTS += --verbose
+
++TEST_RST2MAN = $(shell sh -c "rst2man --version > /dev/null 2>&1 || echo n")
++
+ $(OUTPUT)%.1: %.rst
+ ifndef RST2MAN_DEP
+- $(error "rst2man not found, but required to generate man pages")
++ $(info ********************************************)
++ $(info ** NOTICE: rst2man not found)
++ $(info **)
++ $(info ** Consider installing the latest rst2man from your)
++ $(info ** distribution, e.g., 'dnf install python3-docutils' on Fedora,)
++ $(info ** or from source:)
++ $(info **)
++ $(info ** https://docutils.sourceforge.io/docs/dev/repository.html )
++ $(info **)
++ $(info ********************************************)
++ $(error NOTICE: rst2man required to generate man pages)
+ endif
+ rst2man $(RST2MAN_OPTS) $< > $@
+
+diff --git a/Makefile b/Makefile
+index eb3adfec0b222..6cbf7bb15edde 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 3
++SUBLEVEL = 4
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm/boot/dts/aspeed-ast2600-evb.dts b/arch/arm/boot/dts/aspeed-ast2600-evb.dts
+index b7eb552640cbf..788448cdd6b3f 100644
+--- a/arch/arm/boot/dts/aspeed-ast2600-evb.dts
++++ b/arch/arm/boot/dts/aspeed-ast2600-evb.dts
+@@ -103,7 +103,7 @@
+ &mac0 {
+ status = "okay";
+
+- phy-mode = "rgmii";
++ phy-mode = "rgmii-rxid";
+ phy-handle = <ðphy0>;
+
+ pinctrl-names = "default";
+@@ -114,7 +114,7 @@
+ &mac1 {
+ status = "okay";
+
+- phy-mode = "rgmii";
++ phy-mode = "rgmii-rxid";
+ phy-handle = <ðphy1>;
+
+ pinctrl-names = "default";
+diff --git a/arch/arm/mach-ep93xx/clock.c b/arch/arm/mach-ep93xx/clock.c
+index 4fa6ea5461b79..85a496ddc6197 100644
+--- a/arch/arm/mach-ep93xx/clock.c
++++ b/arch/arm/mach-ep93xx/clock.c
+@@ -345,9 +345,10 @@ static struct clk_hw *clk_hw_register_ddiv(const char *name,
+ psc->hw.init = &init;
+
+ clk = clk_register(NULL, &psc->hw);
+- if (IS_ERR(clk))
++ if (IS_ERR(clk)) {
+ kfree(psc);
+-
++ return ERR_CAST(clk);
++ }
+ return &psc->hw;
+ }
+
+@@ -452,9 +453,10 @@ static struct clk_hw *clk_hw_register_div(const char *name,
+ psc->hw.init = &init;
+
+ clk = clk_register(NULL, &psc->hw);
+- if (IS_ERR(clk))
++ if (IS_ERR(clk)) {
+ kfree(psc);
+-
++ return ERR_CAST(clk);
++ }
+ return &psc->hw;
+ }
+
+diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
+index fcc675aa1670d..c779e604edac1 100644
+--- a/arch/arm64/net/bpf_jit_comp.c
++++ b/arch/arm64/net/bpf_jit_comp.c
+@@ -1261,6 +1261,7 @@ skip_init_ctx:
+ bpf_jit_binary_free(header);
+ prog->bpf_func = NULL;
+ prog->jited = 0;
++ prog->jited_len = 0;
+ goto out_off;
+ }
+ bpf_jit_binary_lock_ro(header);
+diff --git a/arch/m68k/Kconfig.machine b/arch/m68k/Kconfig.machine
+index eeab4f3e6c197..946853a08502e 100644
+--- a/arch/m68k/Kconfig.machine
++++ b/arch/m68k/Kconfig.machine
+@@ -335,6 +335,7 @@ comment "Machine Options"
+
+ config UBOOT
+ bool "Support for U-Boot command line parameters"
++ depends on COLDFIRE
+ help
+ If you say Y here kernel will try to collect command
+ line parameters from the initial u-boot stack.
+diff --git a/arch/m68k/include/asm/pgtable_no.h b/arch/m68k/include/asm/pgtable_no.h
+index 87151d67d91e7..bce5ca56c3883 100644
+--- a/arch/m68k/include/asm/pgtable_no.h
++++ b/arch/m68k/include/asm/pgtable_no.h
+@@ -42,7 +42,8 @@ extern void paging_init(void);
+ * ZERO_PAGE is a global shared page that is always zero: used
+ * for zero-mapped memory areas etc..
+ */
+-#define ZERO_PAGE(vaddr) (virt_to_page(0))
++extern void *empty_zero_page;
++#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
+
+ /*
+ * All 32bit addresses are effectively valid for vmalloc...
+diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
+index 8f94feed969c4..07317367ead89 100644
+--- a/arch/m68k/kernel/setup_mm.c
++++ b/arch/m68k/kernel/setup_mm.c
+@@ -87,15 +87,8 @@ void (*mach_sched_init) (void) __initdata = NULL;
+ void (*mach_init_IRQ) (void) __initdata = NULL;
+ void (*mach_get_model) (char *model);
+ void (*mach_get_hardware_list) (struct seq_file *m);
+-/* machine dependent timer functions */
+-int (*mach_hwclk) (int, struct rtc_time*);
+-EXPORT_SYMBOL(mach_hwclk);
+ unsigned int (*mach_get_ss)(void);
+-int (*mach_get_rtc_pll)(struct rtc_pll_info *);
+-int (*mach_set_rtc_pll)(struct rtc_pll_info *);
+ EXPORT_SYMBOL(mach_get_ss);
+-EXPORT_SYMBOL(mach_get_rtc_pll);
+-EXPORT_SYMBOL(mach_set_rtc_pll);
+ void (*mach_reset)( void );
+ void (*mach_halt)( void );
+ void (*mach_power_off)( void );
+diff --git a/arch/m68k/kernel/setup_no.c b/arch/m68k/kernel/setup_no.c
+index 5e4104f07a443..19eea73d3c170 100644
+--- a/arch/m68k/kernel/setup_no.c
++++ b/arch/m68k/kernel/setup_no.c
+@@ -50,7 +50,6 @@ char __initdata command_line[COMMAND_LINE_SIZE];
+
+ /* machine dependent timer functions */
+ void (*mach_sched_init)(void) __initdata = NULL;
+-int (*mach_hwclk) (int, struct rtc_time*);
+
+ /* machine dependent reboot functions */
+ void (*mach_reset)(void);
+diff --git a/arch/m68k/kernel/time.c b/arch/m68k/kernel/time.c
+index 340ffeea0a9dc..a97600b2af502 100644
+--- a/arch/m68k/kernel/time.c
++++ b/arch/m68k/kernel/time.c
+@@ -63,6 +63,15 @@ void timer_heartbeat(void)
+ #endif /* CONFIG_HEARTBEAT */
+
+ #ifdef CONFIG_M68KCLASSIC
++/* machine dependent timer functions */
++int (*mach_hwclk) (int, struct rtc_time*);
++EXPORT_SYMBOL(mach_hwclk);
++
++int (*mach_get_rtc_pll)(struct rtc_pll_info *);
++int (*mach_set_rtc_pll)(struct rtc_pll_info *);
++EXPORT_SYMBOL(mach_get_rtc_pll);
++EXPORT_SYMBOL(mach_set_rtc_pll);
++
+ #if !IS_BUILTIN(CONFIG_RTC_DRV_GENERIC)
+ void read_persistent_clock64(struct timespec64 *ts)
+ {
+diff --git a/arch/mips/kernel/mips-cpc.c b/arch/mips/kernel/mips-cpc.c
+index 17aff13cd7ce6..3e386f7e15450 100644
+--- a/arch/mips/kernel/mips-cpc.c
++++ b/arch/mips/kernel/mips-cpc.c
+@@ -28,6 +28,7 @@ phys_addr_t __weak mips_cpc_default_phys_base(void)
+ cpc_node = of_find_compatible_node(of_root, NULL, "mti,mips-cpc");
+ if (cpc_node) {
+ err = of_address_to_resource(cpc_node, 0, &res);
++ of_node_put(cpc_node);
+ if (!err)
+ return res.start;
+ }
+diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
+index 174edabb74fa1..500f2a0831ef8 100644
+--- a/arch/powerpc/Kconfig
++++ b/arch/powerpc/Kconfig
+@@ -218,7 +218,6 @@ config PPC
+ select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
+ select HAVE_HW_BREAKPOINT if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx)
+ select HAVE_IOREMAP_PROT
+- select HAVE_IRQ_EXIT_ON_IRQ_STACK
+ select HAVE_IRQ_TIME_ACCOUNTING
+ select HAVE_KERNEL_GZIP
+ select HAVE_KERNEL_LZMA if DEFAULT_UIMAGE
+@@ -771,7 +770,6 @@ config THREAD_SHIFT
+ range 13 15
+ default "15" if PPC_256K_PAGES
+ default "14" if PPC64
+- default "14" if KASAN
+ default "13"
+ help
+ Used to define the stack size. The default is almost always what you
+diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
+index 125328d1b9802..af58f1ed3952e 100644
+--- a/arch/powerpc/include/asm/thread_info.h
++++ b/arch/powerpc/include/asm/thread_info.h
+@@ -14,10 +14,16 @@
+
+ #ifdef __KERNEL__
+
+-#if defined(CONFIG_VMAP_STACK) && CONFIG_THREAD_SHIFT < PAGE_SHIFT
++#ifdef CONFIG_KASAN
++#define MIN_THREAD_SHIFT (CONFIG_THREAD_SHIFT + 1)
++#else
++#define MIN_THREAD_SHIFT CONFIG_THREAD_SHIFT
++#endif
++
++#if defined(CONFIG_VMAP_STACK) && MIN_THREAD_SHIFT < PAGE_SHIFT
+ #define THREAD_SHIFT PAGE_SHIFT
+ #else
+-#define THREAD_SHIFT CONFIG_THREAD_SHIFT
++#define THREAD_SHIFT MIN_THREAD_SHIFT
+ #endif
+
+ #define THREAD_SIZE (1 << THREAD_SHIFT)
+diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+index 5dca19361316e..09c49632bfe59 100644
+--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
++++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+@@ -17,9 +17,13 @@ int ptrace_get_fpr(struct task_struct *child, int index, unsigned long *data)
+
+ #ifdef CONFIG_PPC_FPU_REGS
+ flush_fp_to_thread(child);
+- if (fpidx < (PT_FPSCR - PT_FPR0))
+- memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
+- else
++ if (fpidx < (PT_FPSCR - PT_FPR0)) {
++ if (IS_ENABLED(CONFIG_PPC32))
++ // On 32-bit the index we are passed refers to 32-bit words
++ *data = ((u32 *)child->thread.fp_state.fpr)[fpidx];
++ else
++ memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
++ } else
+ *data = child->thread.fp_state.fpscr;
+ #else
+ *data = 0;
+@@ -39,9 +43,13 @@ int ptrace_put_fpr(struct task_struct *child, int index, unsigned long data)
+
+ #ifdef CONFIG_PPC_FPU_REGS
+ flush_fp_to_thread(child);
+- if (fpidx < (PT_FPSCR - PT_FPR0))
+- memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
+- else
++ if (fpidx < (PT_FPSCR - PT_FPR0)) {
++ if (IS_ENABLED(CONFIG_PPC32))
++ // On 32-bit the index we are passed refers to 32-bit words
++ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
++ else
++ memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
++ } else
+ child->thread.fp_state.fpscr = data;
+ #endif
+
+diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
+index 6d5026a9db4fc..eab6fa59d9d2d 100644
+--- a/arch/powerpc/kernel/ptrace/ptrace.c
++++ b/arch/powerpc/kernel/ptrace/ptrace.c
+@@ -450,4 +450,7 @@ void __init pt_regs_check(void)
+ #else
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_HAVE_FUNCTION_DESCRIPTORS));
+ #endif
++
++ // ptrace_get/put_fpr() rely on PPC32 and VSX being incompatible
++ BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_VSX));
+ }
+diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
+index 181b855b30509..82cae08976bcd 100644
+--- a/arch/powerpc/platforms/pseries/papr_scm.c
++++ b/arch/powerpc/platforms/pseries/papr_scm.c
+@@ -465,6 +465,9 @@ static int papr_scm_pmu_check_events(struct papr_scm_priv *p, struct nvdimm_pmu
+ u32 available_events;
+ int index, rc = 0;
+
++ if (!p->stat_buffer_len)
++ return -ENOENT;
++
+ available_events = (p->stat_buffer_len - sizeof(struct papr_scm_perf_stats))
+ / sizeof(struct papr_scm_perf_stat);
+ if (available_events == 0)
+diff --git a/arch/riscv/kernel/efi.c b/arch/riscv/kernel/efi.c
+index 0241592982314..1aa540350abd3 100644
+--- a/arch/riscv/kernel/efi.c
++++ b/arch/riscv/kernel/efi.c
+@@ -65,7 +65,7 @@ static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data)
+
+ if (md->attribute & EFI_MEMORY_RO) {
+ val = pte_val(pte) & ~_PAGE_WRITE;
+- val = pte_val(pte) | _PAGE_READ;
++ val |= _PAGE_READ;
+ pte = __pte(val);
+ }
+ if (md->attribute & EFI_MEMORY_XP) {
+diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
+index cbef0fc73afa8..df8e24559035c 100644
+--- a/arch/riscv/kernel/machine_kexec.c
++++ b/arch/riscv/kernel/machine_kexec.c
+@@ -65,7 +65,9 @@ machine_kexec_prepare(struct kimage *image)
+ if (image->segment[i].memsz <= sizeof(fdt))
+ continue;
+
+- if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt)))
++ if (image->file_mode)
++ memcpy(&fdt, image->segment[i].buf, sizeof(fdt));
++ else if (copy_from_user(&fdt, image->segment[i].buf, sizeof(fdt)))
+ continue;
+
+ if (fdt_check_header(&fdt))
+diff --git a/arch/s390/crypto/aes_s390.c b/arch/s390/crypto/aes_s390.c
+index 54c7536f2482d..1023e9d43d443 100644
+--- a/arch/s390/crypto/aes_s390.c
++++ b/arch/s390/crypto/aes_s390.c
+@@ -701,7 +701,7 @@ static inline void _gcm_sg_unmap_and_advance(struct gcm_sg_walk *gw,
+ unsigned int nbytes)
+ {
+ gw->walk_bytes_remain -= nbytes;
+- scatterwalk_unmap(&gw->walk);
++ scatterwalk_unmap(gw->walk_ptr);
+ scatterwalk_advance(&gw->walk, nbytes);
+ scatterwalk_done(&gw->walk, 0, gw->walk_bytes_remain);
+ gw->walk_ptr = NULL;
+@@ -776,7 +776,7 @@ static int gcm_out_walk_go(struct gcm_sg_walk *gw, unsigned int minbytesneeded)
+ goto out;
+ }
+
+- scatterwalk_unmap(&gw->walk);
++ scatterwalk_unmap(gw->walk_ptr);
+ gw->walk_ptr = NULL;
+
+ gw->ptr = gw->buf;
+diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
+index 59b69c8ab5e1f..85e9703e52a8c 100644
+--- a/arch/s390/kernel/entry.S
++++ b/arch/s390/kernel/entry.S
+@@ -258,6 +258,10 @@ ENTRY(sie64a)
+ BPEXIT __SF_SIE_FLAGS(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST)
+ .Lsie_entry:
+ sie 0(%r14)
++# Let the next instruction be NOP to avoid triggering a machine check
++# and handling it in a guest as result of the instruction execution.
++ nopr 7
++.Lsie_leave:
+ BPOFF
+ BPENTER __SF_SIE_FLAGS(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST)
+ .Lsie_skip:
+@@ -557,7 +561,7 @@ ENTRY(mcck_int_handler)
+ jno .Lmcck_panic
+ #if IS_ENABLED(CONFIG_KVM)
+ OUTSIDE %r9,.Lsie_gmap,.Lsie_done,6f
+- OUTSIDE %r9,.Lsie_entry,.Lsie_skip,4f
++ OUTSIDE %r9,.Lsie_entry,.Lsie_leave,4f
+ oi __LC_CPU_FLAGS+7, _CIF_MCCK_GUEST
+ j 5f
+ 4: CHKSTG .Lmcck_panic
+diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
+index 1ac73917a8d39..b8ae4a4aa2ba4 100644
+--- a/arch/s390/mm/gmap.c
++++ b/arch/s390/mm/gmap.c
+@@ -2608,6 +2608,18 @@ static int __s390_enable_skey_pte(pte_t *pte, unsigned long addr,
+ return 0;
+ }
+
++/*
++ * Give a chance to schedule after setting a key to 256 pages.
++ * We only hold the mm lock, which is a rwsem and the kvm srcu.
++ * Both can sleep.
++ */
++static int __s390_enable_skey_pmd(pmd_t *pmd, unsigned long addr,
++ unsigned long next, struct mm_walk *walk)
++{
++ cond_resched();
++ return 0;
++}
++
+ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
+ unsigned long hmask, unsigned long next,
+ struct mm_walk *walk)
+@@ -2630,12 +2642,14 @@ static int __s390_enable_skey_hugetlb(pte_t *pte, unsigned long addr,
+ end = start + HPAGE_SIZE - 1;
+ __storage_key_init_range(start, end);
+ set_bit(PG_arch_1, &page->flags);
++ cond_resched();
+ return 0;
+ }
+
+ static const struct mm_walk_ops enable_skey_walk_ops = {
+ .hugetlb_entry = __s390_enable_skey_hugetlb,
+ .pte_entry = __s390_enable_skey_pte,
++ .pmd_entry = __s390_enable_skey_pmd,
+ };
+
+ int s390_enable_skey(void)
+diff --git a/arch/um/drivers/chan_kern.c b/arch/um/drivers/chan_kern.c
+index 62997055c4547..26a702a065154 100644
+--- a/arch/um/drivers/chan_kern.c
++++ b/arch/um/drivers/chan_kern.c
+@@ -133,7 +133,7 @@ static void line_timer_cb(struct work_struct *work)
+ struct line *line = container_of(work, struct line, task.work);
+
+ if (!line->throttled)
+- chan_interrupt(line, line->driver->read_irq);
++ chan_interrupt(line, line->read_irq);
+ }
+
+ int enable_chan(struct line *line)
+@@ -195,9 +195,9 @@ void free_irqs(void)
+ chan = list_entry(ele, struct chan, free_list);
+
+ if (chan->input && chan->enabled)
+- um_free_irq(chan->line->driver->read_irq, chan);
++ um_free_irq(chan->line->read_irq, chan);
+ if (chan->output && chan->enabled)
+- um_free_irq(chan->line->driver->write_irq, chan);
++ um_free_irq(chan->line->write_irq, chan);
+ chan->enabled = 0;
+ }
+ }
+@@ -215,9 +215,9 @@ static void close_one_chan(struct chan *chan, int delay_free_irq)
+ spin_unlock_irqrestore(&irqs_to_free_lock, flags);
+ } else {
+ if (chan->input && chan->enabled)
+- um_free_irq(chan->line->driver->read_irq, chan);
++ um_free_irq(chan->line->read_irq, chan);
+ if (chan->output && chan->enabled)
+- um_free_irq(chan->line->driver->write_irq, chan);
++ um_free_irq(chan->line->write_irq, chan);
+ chan->enabled = 0;
+ }
+ if (chan->ops->close != NULL)
+diff --git a/arch/um/drivers/line.c b/arch/um/drivers/line.c
+index 8febf95da96e1..02b0befd67632 100644
+--- a/arch/um/drivers/line.c
++++ b/arch/um/drivers/line.c
+@@ -139,7 +139,7 @@ static int flush_buffer(struct line *line)
+ count = line->buffer + LINE_BUFSIZE - line->head;
+
+ n = write_chan(line->chan_out, line->head, count,
+- line->driver->write_irq);
++ line->write_irq);
+ if (n < 0)
+ return n;
+ if (n == count) {
+@@ -156,7 +156,7 @@ static int flush_buffer(struct line *line)
+
+ count = line->tail - line->head;
+ n = write_chan(line->chan_out, line->head, count,
+- line->driver->write_irq);
++ line->write_irq);
+
+ if (n < 0)
+ return n;
+@@ -195,7 +195,7 @@ int line_write(struct tty_struct *tty, const unsigned char *buf, int len)
+ ret = buffer_data(line, buf, len);
+ else {
+ n = write_chan(line->chan_out, buf, len,
+- line->driver->write_irq);
++ line->write_irq);
+ if (n < 0) {
+ ret = n;
+ goto out_up;
+@@ -215,7 +215,7 @@ void line_throttle(struct tty_struct *tty)
+ {
+ struct line *line = tty->driver_data;
+
+- deactivate_chan(line->chan_in, line->driver->read_irq);
++ deactivate_chan(line->chan_in, line->read_irq);
+ line->throttled = 1;
+ }
+
+@@ -224,7 +224,7 @@ void line_unthrottle(struct tty_struct *tty)
+ struct line *line = tty->driver_data;
+
+ line->throttled = 0;
+- chan_interrupt(line, line->driver->read_irq);
++ chan_interrupt(line, line->read_irq);
+ }
+
+ static irqreturn_t line_write_interrupt(int irq, void *data)
+@@ -260,19 +260,23 @@ int line_setup_irq(int fd, int input, int output, struct line *line, void *data)
+ int err;
+
+ if (input) {
+- err = um_request_irq(driver->read_irq, fd, IRQ_READ,
+- line_interrupt, IRQF_SHARED,
++ err = um_request_irq(UM_IRQ_ALLOC, fd, IRQ_READ,
++ line_interrupt, 0,
+ driver->read_irq_name, data);
+ if (err < 0)
+ return err;
++
++ line->read_irq = err;
+ }
+
+ if (output) {
+- err = um_request_irq(driver->write_irq, fd, IRQ_WRITE,
+- line_write_interrupt, IRQF_SHARED,
++ err = um_request_irq(UM_IRQ_ALLOC, fd, IRQ_WRITE,
++ line_write_interrupt, 0,
+ driver->write_irq_name, data);
+ if (err < 0)
+ return err;
++
++ line->write_irq = err;
+ }
+
+ return 0;
+diff --git a/arch/um/drivers/line.h b/arch/um/drivers/line.h
+index bdb16b96e76fd..f15be75a3bf3b 100644
+--- a/arch/um/drivers/line.h
++++ b/arch/um/drivers/line.h
+@@ -23,9 +23,7 @@ struct line_driver {
+ const short minor_start;
+ const short type;
+ const short subtype;
+- const int read_irq;
+ const char *read_irq_name;
+- const int write_irq;
+ const char *write_irq_name;
+ struct mc_device mc;
+ struct tty_driver *driver;
+@@ -35,6 +33,8 @@ struct line {
+ struct tty_port port;
+ int valid;
+
++ int read_irq, write_irq;
++
+ char *init_str;
+ struct list_head chan_list;
+ struct chan *chan_in, *chan_out;
+diff --git a/arch/um/drivers/ssl.c b/arch/um/drivers/ssl.c
+index 41eae2e8fb652..8514966778d53 100644
+--- a/arch/um/drivers/ssl.c
++++ b/arch/um/drivers/ssl.c
+@@ -47,9 +47,7 @@ static struct line_driver driver = {
+ .minor_start = 64,
+ .type = TTY_DRIVER_TYPE_SERIAL,
+ .subtype = 0,
+- .read_irq = SSL_IRQ,
+ .read_irq_name = "ssl",
+- .write_irq = SSL_WRITE_IRQ,
+ .write_irq_name = "ssl-write",
+ .mc = {
+ .list = LIST_HEAD_INIT(driver.mc.list),
+diff --git a/arch/um/drivers/stdio_console.c b/arch/um/drivers/stdio_console.c
+index e8b762f4d8c25..489d5a746ed33 100644
+--- a/arch/um/drivers/stdio_console.c
++++ b/arch/um/drivers/stdio_console.c
+@@ -53,9 +53,7 @@ static struct line_driver driver = {
+ .minor_start = 0,
+ .type = TTY_DRIVER_TYPE_CONSOLE,
+ .subtype = SYSTEM_TYPE_CONSOLE,
+- .read_irq = CONSOLE_IRQ,
+ .read_irq_name = "console",
+- .write_irq = CONSOLE_WRITE_IRQ,
+ .write_irq_name = "console-write",
+ .mc = {
+ .list = LIST_HEAD_INIT(driver.mc.list),
+diff --git a/arch/um/include/asm/irq.h b/arch/um/include/asm/irq.h
+index e187c789369d3..749dfe8512e84 100644
+--- a/arch/um/include/asm/irq.h
++++ b/arch/um/include/asm/irq.h
+@@ -4,19 +4,15 @@
+
+ #define TIMER_IRQ 0
+ #define UMN_IRQ 1
+-#define CONSOLE_IRQ 2
+-#define CONSOLE_WRITE_IRQ 3
+-#define UBD_IRQ 4
+-#define UM_ETH_IRQ 5
+-#define SSL_IRQ 6
+-#define SSL_WRITE_IRQ 7
+-#define ACCEPT_IRQ 8
+-#define MCONSOLE_IRQ 9
+-#define WINCH_IRQ 10
+-#define SIGIO_WRITE_IRQ 11
+-#define TELNETD_IRQ 12
+-#define XTERM_IRQ 13
+-#define RANDOM_IRQ 14
++#define UBD_IRQ 2
++#define UM_ETH_IRQ 3
++#define ACCEPT_IRQ 4
++#define MCONSOLE_IRQ 5
++#define WINCH_IRQ 6
++#define SIGIO_WRITE_IRQ 7
++#define TELNETD_IRQ 8
++#define XTERM_IRQ 9
++#define RANDOM_IRQ 10
+
+ #ifdef CONFIG_UML_NET_VECTOR
+
+diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
+index 1261842d006c7..49a3b122279e5 100644
+--- a/arch/x86/include/asm/cpufeature.h
++++ b/arch/x86/include/asm/cpufeature.h
+@@ -51,7 +51,7 @@ extern const char * const x86_power_flags[32];
+ extern const char * const x86_bug_flags[NBUGINTS*32];
+
+ #define test_cpu_cap(c, bit) \
+- test_bit(bit, (unsigned long *)((c)->x86_capability))
++ arch_test_bit(bit, (unsigned long *)((c)->x86_capability))
+
+ /*
+ * There are 32 bits/features in each mask word. The high bits
+diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
+index 35f222aa66bfc..913e593a3b45f 100644
+--- a/arch/x86/include/asm/uaccess.h
++++ b/arch/x86/include/asm/uaccess.h
+@@ -439,7 +439,7 @@ do { \
+ [ptr] "+m" (*_ptr), \
+ [old] "+a" (__old) \
+ : [new] ltype (__new) \
+- : "memory", "cc"); \
++ : "memory"); \
+ if (unlikely(__err)) \
+ goto label; \
+ if (unlikely(!success)) \
+diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
+index cf48ac96ceecb..7f9c3b0fcd0b5 100644
+--- a/arch/x86/kvm/mmu/mmu.c
++++ b/arch/x86/kvm/mmu/mmu.c
+@@ -5168,7 +5168,7 @@ static void __kvm_mmu_free_obsolete_roots(struct kvm *kvm, struct kvm_mmu *mmu)
+ roots_to_free |= KVM_MMU_ROOT_CURRENT;
+
+ for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
+- if (is_obsolete_root(kvm, mmu->root.hpa))
++ if (is_obsolete_root(kvm, mmu->prev_roots[i].hpa))
+ roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
+ }
+
+diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
+index 1a9b60cb6bcb8..9a5963bff0f95 100644
+--- a/arch/x86/kvm/svm/nested.c
++++ b/arch/x86/kvm/svm/nested.c
+@@ -896,7 +896,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
+ if (svm->tsc_ratio_msr != kvm_default_tsc_scaling_ratio) {
+ WARN_ON(!svm->tsc_scaling_enabled);
+ vcpu->arch.tsc_scaling_ratio = vcpu->arch.l1_tsc_scaling_ratio;
+- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
++ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
+ }
+
+ svm->nested.ctl.nested_cr3 = 0;
+@@ -1293,7 +1293,7 @@ void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu)
+ vcpu->arch.tsc_scaling_ratio =
+ kvm_calc_nested_tsc_multiplier(vcpu->arch.l1_tsc_scaling_ratio,
+ svm->tsc_ratio_msr);
+- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
++ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
+ }
+
+ /* Inverse operation of nested_copy_vmcb_control_to_cache(). asid is copied too. */
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index 7e45d03cd018a..0c0a09b43b105 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -463,11 +463,24 @@ static int has_svm(void)
+ return 1;
+ }
+
++void __svm_write_tsc_multiplier(u64 multiplier)
++{
++ preempt_disable();
++
++ if (multiplier == __this_cpu_read(current_tsc_ratio))
++ goto out;
++
++ wrmsrl(MSR_AMD64_TSC_RATIO, multiplier);
++ __this_cpu_write(current_tsc_ratio, multiplier);
++out:
++ preempt_enable();
++}
++
+ static void svm_hardware_disable(void)
+ {
+ /* Make sure we clean up behind us */
+ if (tsc_scaling)
+- wrmsrl(MSR_AMD64_TSC_RATIO, SVM_TSC_RATIO_DEFAULT);
++ __svm_write_tsc_multiplier(SVM_TSC_RATIO_DEFAULT);
+
+ cpu_svm_disable();
+
+@@ -513,8 +526,7 @@ static int svm_hardware_enable(void)
+ * Set the default value, even if we don't use TSC scaling
+ * to avoid having stale value in the msr
+ */
+- wrmsrl(MSR_AMD64_TSC_RATIO, SVM_TSC_RATIO_DEFAULT);
+- __this_cpu_write(current_tsc_ratio, SVM_TSC_RATIO_DEFAULT);
++ __svm_write_tsc_multiplier(SVM_TSC_RATIO_DEFAULT);
+ }
+
+
+@@ -915,11 +927,12 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
+ vmcb_mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
+ }
+
+-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
++static void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
+ {
+- wrmsrl(MSR_AMD64_TSC_RATIO, multiplier);
++ __svm_write_tsc_multiplier(multiplier);
+ }
+
++
+ /* Evaluate instruction intercepts that depend on guest CPUID features. */
+ static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
+ struct vcpu_svm *svm)
+@@ -1276,13 +1289,8 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+ sev_es_prepare_switch_to_guest(hostsa);
+ }
+
+- if (tsc_scaling) {
+- u64 tsc_ratio = vcpu->arch.tsc_scaling_ratio;
+- if (tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
+- __this_cpu_write(current_tsc_ratio, tsc_ratio);
+- wrmsrl(MSR_AMD64_TSC_RATIO, tsc_ratio);
+- }
+- }
++ if (tsc_scaling)
++ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
+
+ if (likely(tsc_aux_uret_slot >= 0))
+ kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
+diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
+index f76deff71002c..34babf9185fe5 100644
+--- a/arch/x86/kvm/svm/svm.h
++++ b/arch/x86/kvm/svm/svm.h
+@@ -558,7 +558,7 @@ int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
+ bool has_error_code, u32 error_code);
+ int nested_svm_exit_special(struct vcpu_svm *svm);
+ void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu);
+-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier);
++void __svm_write_tsc_multiplier(u64 multiplier);
+ void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
+ struct vmcb_control_area *control);
+ void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
+diff --git a/block/bio.c b/block/bio.c
+index 4259125e16ab2..d3ca79c3ebdff 100644
+--- a/block/bio.c
++++ b/block/bio.c
+@@ -693,6 +693,7 @@ static void bio_alloc_cache_destroy(struct bio_set *bs)
+ bio_alloc_cache_prune(cache, -1U);
+ }
+ free_percpu(bs->cache);
++ bs->cache = NULL;
+ }
+
+ /**
+@@ -1336,10 +1337,12 @@ void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter,
+ struct bio_vec src_bv = bio_iter_iovec(src, *src_iter);
+ struct bio_vec dst_bv = bio_iter_iovec(dst, *dst_iter);
+ unsigned int bytes = min(src_bv.bv_len, dst_bv.bv_len);
+- void *src_buf;
++ void *src_buf = bvec_kmap_local(&src_bv);
++ void *dst_buf = bvec_kmap_local(&dst_bv);
+
+- src_buf = bvec_kmap_local(&src_bv);
+- memcpy_to_bvec(&dst_bv, src_buf);
++ memcpy(dst_buf, src_buf, bytes);
++
++ kunmap_local(dst_buf);
+ kunmap_local(src_buf);
+
+ bio_advance_iter_single(src, src_iter, bytes);
+diff --git a/block/blk-core.c b/block/blk-core.c
+index bc05067721523..84f7b7884d072 100644
+--- a/block/blk-core.c
++++ b/block/blk-core.c
+@@ -948,7 +948,7 @@ int bio_poll(struct bio *bio, struct io_comp_batch *iob, unsigned int flags)
+
+ blk_flush_plug(current->plug, false);
+
+- if (blk_queue_enter(q, BLK_MQ_REQ_NOWAIT))
++ if (bio_queue_enter(bio))
+ return 0;
+ if (queue_is_mq(q)) {
+ ret = blk_mq_poll(q, cookie, iob, flags);
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index 84d749511f551..de7fc69572718 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -133,7 +133,8 @@ static bool blk_mq_check_inflight(struct request *rq, void *priv,
+ {
+ struct mq_inflight *mi = priv;
+
+- if ((!mi->part->bd_partno || rq->part == mi->part) &&
++ if (rq->part && blk_do_io_stat(rq) &&
++ (!mi->part->bd_partno || rq->part == mi->part) &&
+ blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
+ mi->inflight[rq_data_dir(rq)]++;
+
+@@ -2123,8 +2124,7 @@ static bool blk_mq_has_sqsched(struct request_queue *q)
+ */
+ static struct blk_mq_hw_ctx *blk_mq_get_sq_hctx(struct request_queue *q)
+ {
+- struct blk_mq_hw_ctx *hctx;
+-
++ struct blk_mq_ctx *ctx = blk_mq_get_ctx(q);
+ /*
+ * If the IO scheduler does not respect hardware queues when
+ * dispatching, we just don't bother with multiple HW queues and
+@@ -2132,8 +2132,8 @@ static struct blk_mq_hw_ctx *blk_mq_get_sq_hctx(struct request_queue *q)
+ * just causes lock contention inside the scheduler and pointless cache
+ * bouncing.
+ */
+- hctx = blk_mq_map_queue_type(q, HCTX_TYPE_DEFAULT,
+- raw_smp_processor_id());
++ struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, 0, ctx);
++
+ if (!blk_mq_hctx_stopped(hctx))
+ return hctx;
+ return NULL;
+diff --git a/block/genhd.c b/block/genhd.c
+index b8b6759d670f0..3008ec2136543 100644
+--- a/block/genhd.c
++++ b/block/genhd.c
+@@ -385,6 +385,8 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
+
+ if (disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN))
+ return -EINVAL;
++ if (test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
++ return -EINVAL;
+ if (disk->open_partitions)
+ return -EBUSY;
+
+diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
+index ca64837641be2..3d57fa84e2be8 100644
+--- a/drivers/ata/libata-core.c
++++ b/drivers/ata/libata-core.c
+@@ -2003,16 +2003,16 @@ retry:
+ return err_mask;
+ }
+
+-static bool ata_log_supported(struct ata_device *dev, u8 log)
++static int ata_log_supported(struct ata_device *dev, u8 log)
+ {
+ struct ata_port *ap = dev->link->ap;
+
+ if (dev->horkage & ATA_HORKAGE_NO_LOG_DIR)
+- return false;
++ return 0;
+
+ if (ata_read_log_page(dev, ATA_LOG_DIRECTORY, 0, ap->sector_buf, 1))
+- return false;
+- return get_unaligned_le16(&ap->sector_buf[log * 2]) ? true : false;
++ return 0;
++ return get_unaligned_le16(&ap->sector_buf[log * 2]);
+ }
+
+ static bool ata_identify_page_supported(struct ata_device *dev, u8 page)
+@@ -2448,15 +2448,20 @@ static void ata_dev_config_cpr(struct ata_device *dev)
+ struct ata_cpr_log *cpr_log = NULL;
+ u8 *desc, *buf = NULL;
+
+- if (ata_id_major_version(dev->id) < 11 ||
+- !ata_log_supported(dev, ATA_LOG_CONCURRENT_POSITIONING_RANGES))
++ if (ata_id_major_version(dev->id) < 11)
++ goto out;
++
++ buf_len = ata_log_supported(dev, ATA_LOG_CONCURRENT_POSITIONING_RANGES);
++ if (buf_len == 0)
+ goto out;
+
+ /*
+ * Read the concurrent positioning ranges log (0x47). We can have at
+- * most 255 32B range descriptors plus a 64B header.
++ * most 255 32B range descriptors plus a 64B header. This log varies in
++ * size, so use the size reported in the GPL directory. Reading beyond
++ * the supported length will result in an error.
+ */
+- buf_len = (64 + 255 * 32 + 511) & ~511;
++ buf_len <<= 9;
+ buf = kzalloc(buf_len, GFP_KERNEL);
+ if (!buf)
+ goto out;
+diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
+index 06c9d90238d9e..0ea9c3115529e 100644
+--- a/drivers/ata/libata-scsi.c
++++ b/drivers/ata/libata-scsi.c
+@@ -2101,7 +2101,7 @@ static unsigned int ata_scsiop_inq_b9(struct ata_scsi_args *args, u8 *rbuf)
+
+ /* SCSI Concurrent Positioning Ranges VPD page: SBC-5 rev 1 or later */
+ rbuf[1] = 0xb9;
+- put_unaligned_be16(64 + (int)cpr_log->nr_cpr * 32 - 4, &rbuf[3]);
++ put_unaligned_be16(64 + (int)cpr_log->nr_cpr * 32 - 4, &rbuf[2]);
+
+ for (i = 0; i < cpr_log->nr_cpr; i++, desc += 32) {
+ desc[0] = cpr_log->cpr[i].num;
+diff --git a/drivers/ata/libata-transport.c b/drivers/ata/libata-transport.c
+index ca129854a88c7..c380278874990 100644
+--- a/drivers/ata/libata-transport.c
++++ b/drivers/ata/libata-transport.c
+@@ -196,7 +196,7 @@ static struct {
+ { XFER_PIO_0, "XFER_PIO_0" },
+ { XFER_PIO_SLOW, "XFER_PIO_SLOW" }
+ };
+-ata_bitfield_name_match(xfer,ata_xfer_names)
++ata_bitfield_name_search(xfer, ata_xfer_names)
+
+ /*
+ * ATA Port attributes
+diff --git a/drivers/ata/pata_octeon_cf.c b/drivers/ata/pata_octeon_cf.c
+index 6b5ed3046b44d..35608a0cf552e 100644
+--- a/drivers/ata/pata_octeon_cf.c
++++ b/drivers/ata/pata_octeon_cf.c
+@@ -856,12 +856,14 @@ static int octeon_cf_probe(struct platform_device *pdev)
+ int i;
+ res_dma = platform_get_resource(dma_dev, IORESOURCE_MEM, 0);
+ if (!res_dma) {
++ put_device(&dma_dev->dev);
+ of_node_put(dma_node);
+ return -EINVAL;
+ }
+ cf_port->dma_base = (u64)devm_ioremap(&pdev->dev, res_dma->start,
+ resource_size(res_dma));
+ if (!cf_port->dma_base) {
++ put_device(&dma_dev->dev);
+ of_node_put(dma_node);
+ return -EINVAL;
+ }
+@@ -871,6 +873,7 @@ static int octeon_cf_probe(struct platform_device *pdev)
+ irq = i;
+ irq_handler = octeon_cf_interrupt;
+ }
++ put_device(&dma_dev->dev);
+ }
+ of_node_put(dma_node);
+ }
+diff --git a/drivers/base/bus.c b/drivers/base/bus.c
+index 97936ec49bde0..7ca47e5b3c1f4 100644
+--- a/drivers/base/bus.c
++++ b/drivers/base/bus.c
+@@ -617,7 +617,7 @@ int bus_add_driver(struct device_driver *drv)
+ if (drv->bus->p->drivers_autoprobe) {
+ error = driver_attach(drv);
+ if (error)
+- goto out_unregister;
++ goto out_del_list;
+ }
+ module_add_driver(drv->owner, drv);
+
+@@ -644,6 +644,8 @@ int bus_add_driver(struct device_driver *drv)
+
+ return 0;
+
++out_del_list:
++ klist_del(&priv->knode_bus);
+ out_unregister:
+ kobject_put(&priv->kobj);
+ /* drv->p is freed in driver_release() */
+diff --git a/drivers/base/dd.c b/drivers/base/dd.c
+index 3fc3b5940bb31..d6980f33afc44 100644
+--- a/drivers/base/dd.c
++++ b/drivers/base/dd.c
+@@ -257,7 +257,6 @@ DEFINE_SHOW_ATTRIBUTE(deferred_devs);
+
+ int driver_deferred_probe_timeout;
+ EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
+-static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue);
+
+ static int __init deferred_probe_timeout_setup(char *str)
+ {
+@@ -312,7 +311,6 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
+ list_for_each_entry(p, &deferred_probe_pending_list, deferred_probe)
+ dev_info(p->device, "deferred probe pending\n");
+ mutex_unlock(&deferred_probe_mutex);
+- wake_up_all(&probe_timeout_waitqueue);
+ }
+ static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
+
+@@ -716,9 +714,6 @@ int driver_probe_done(void)
+ */
+ void wait_for_device_probe(void)
+ {
+- /* wait for probe timeout */
+- wait_event(probe_timeout_waitqueue, !driver_deferred_probe_timeout);
+-
+ /* wait for the deferred probe workqueue to finish */
+ flush_work(&deferred_probe_work);
+
+@@ -941,6 +936,7 @@ out_unlock:
+ static int __device_attach(struct device *dev, bool allow_async)
+ {
+ int ret = 0;
++ bool async = false;
+
+ device_lock(dev);
+ if (dev->p->dead) {
+@@ -979,7 +975,7 @@ static int __device_attach(struct device *dev, bool allow_async)
+ */
+ dev_dbg(dev, "scheduling asynchronous probe\n");
+ get_device(dev);
+- async_schedule_dev(__device_attach_async_helper, dev);
++ async = true;
+ } else {
+ pm_request_idle(dev);
+ }
+@@ -989,6 +985,8 @@ static int __device_attach(struct device *dev, bool allow_async)
+ }
+ out_unlock:
+ device_unlock(dev);
++ if (async)
++ async_schedule_dev(__device_attach_async_helper, dev);
+ return ret;
+ }
+
+diff --git a/drivers/block/loop.c b/drivers/block/loop.c
+index ed7bec11948cd..4e1dce3beab0c 100644
+--- a/drivers/block/loop.c
++++ b/drivers/block/loop.c
+@@ -1066,7 +1066,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
+ lo->lo_flags |= LO_FLAGS_PARTSCAN;
+ partscan = lo->lo_flags & LO_FLAGS_PARTSCAN;
+ if (partscan)
+- lo->lo_disk->flags &= ~GENHD_FL_NO_PART;
++ clear_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
+
+ loop_global_unlock(lo, is_loop);
+ if (partscan)
+@@ -1185,7 +1185,7 @@ static void __loop_clr_fd(struct loop_device *lo, bool release)
+ */
+ lo->lo_flags = 0;
+ if (!part_shift)
+- lo->lo_disk->flags |= GENHD_FL_NO_PART;
++ set_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
+ mutex_lock(&lo->lo_mutex);
+ lo->lo_state = Lo_unbound;
+ mutex_unlock(&lo->lo_mutex);
+@@ -1295,7 +1295,7 @@ out_unfreeze:
+
+ if (!err && (lo->lo_flags & LO_FLAGS_PARTSCAN) &&
+ !(prev_lo_flags & LO_FLAGS_PARTSCAN)) {
+- lo->lo_disk->flags &= ~GENHD_FL_NO_PART;
++ clear_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
+ partscan = true;
+ }
+ out_unlock:
+@@ -2054,7 +2054,7 @@ static int loop_add(int i)
+ * userspace tools. Parameters like this in general should be avoided.
+ */
+ if (!part_shift)
+- disk->flags |= GENHD_FL_NO_PART;
++ set_bit(GD_SUPPRESS_PART_SCAN, &disk->state);
+ atomic_set(&lo->lo_refcnt, 0);
+ mutex_init(&lo->lo_mutex);
+ lo->lo_number = i;
+diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
+index 2845570413360..ee5adca0ba7b9 100644
+--- a/drivers/block/nbd.c
++++ b/drivers/block/nbd.c
+@@ -404,13 +404,14 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
+ if (!mutex_trylock(&cmd->lock))
+ return BLK_EH_RESET_TIMER;
+
+- if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
++ if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ mutex_unlock(&cmd->lock);
+ return BLK_EH_DONE;
+ }
+
+ if (!refcount_inc_not_zero(&nbd->config_refs)) {
+ cmd->status = BLK_STS_TIMEOUT;
++ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ mutex_unlock(&cmd->lock);
+ goto done;
+ }
+@@ -479,6 +480,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
+ dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
+ set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags);
+ cmd->status = BLK_STS_IOERR;
++ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ mutex_unlock(&cmd->lock);
+ sock_shutdown(nbd);
+ nbd_config_put(nbd);
+@@ -746,7 +748,7 @@ static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
+ cmd = blk_mq_rq_to_pdu(req);
+
+ mutex_lock(&cmd->lock);
+- if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
++ if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+ tag, cmd->status, cmd->flags);
+ ret = -ENOENT;
+@@ -855,8 +857,16 @@ static void recv_work(struct work_struct *work)
+ }
+
+ rq = blk_mq_rq_from_pdu(cmd);
+- if (likely(!blk_should_fake_timeout(rq->q)))
+- blk_mq_complete_request(rq);
++ if (likely(!blk_should_fake_timeout(rq->q))) {
++ bool complete;
++
++ mutex_lock(&cmd->lock);
++ complete = __test_and_clear_bit(NBD_CMD_INFLIGHT,
++ &cmd->flags);
++ mutex_unlock(&cmd->lock);
++ if (complete)
++ blk_mq_complete_request(rq);
++ }
+ percpu_ref_put(&q->q_usage_counter);
+ }
+
+@@ -1424,7 +1434,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
+ static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
+ struct block_device *bdev)
+ {
+- sock_shutdown(nbd);
++ nbd_clear_sock(nbd);
+ __invalidate_device(bdev, true);
+ nbd_bdev_reset(bdev);
+ if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
+@@ -1523,15 +1533,20 @@ static struct nbd_config *nbd_alloc_config(void)
+ {
+ struct nbd_config *config;
+
++ if (!try_module_get(THIS_MODULE))
++ return ERR_PTR(-ENODEV);
++
+ config = kzalloc(sizeof(struct nbd_config), GFP_NOFS);
+- if (!config)
+- return NULL;
++ if (!config) {
++ module_put(THIS_MODULE);
++ return ERR_PTR(-ENOMEM);
++ }
++
+ atomic_set(&config->recv_threads, 0);
+ init_waitqueue_head(&config->recv_wq);
+ init_waitqueue_head(&config->conn_wait);
+ config->blksize_bits = NBD_DEF_BLKSIZE_BITS;
+ atomic_set(&config->live_connections, 0);
+- try_module_get(THIS_MODULE);
+ return config;
+ }
+
+@@ -1558,12 +1573,13 @@ static int nbd_open(struct block_device *bdev, fmode_t mode)
+ mutex_unlock(&nbd->config_lock);
+ goto out;
+ }
+- config = nbd->config = nbd_alloc_config();
+- if (!config) {
+- ret = -ENOMEM;
++ config = nbd_alloc_config();
++ if (IS_ERR(config)) {
++ ret = PTR_ERR(config);
+ mutex_unlock(&nbd->config_lock);
+ goto out;
+ }
++ nbd->config = config;
+ refcount_set(&nbd->config_refs, 1);
+ refcount_inc(&nbd->refs);
+ mutex_unlock(&nbd->config_lock);
+@@ -1804,17 +1820,7 @@ static struct nbd_device *nbd_dev_add(int index, unsigned int refs)
+ refcount_set(&nbd->refs, 0);
+ INIT_LIST_HEAD(&nbd->list);
+ disk->major = NBD_MAJOR;
+-
+- /* Too big first_minor can cause duplicate creation of
+- * sysfs files/links, since index << part_shift might overflow, or
+- * MKDEV() expect that the max bits of first_minor is 20.
+- */
+ disk->first_minor = index << part_shift;
+- if (disk->first_minor < index || disk->first_minor > MINORMASK) {
+- err = -EINVAL;
+- goto out_free_work;
+- }
+-
+ disk->minors = 1 << part_shift;
+ disk->fops = &nbd_fops;
+ disk->private_data = nbd;
+@@ -1919,8 +1925,19 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
+ if (!netlink_capable(skb, CAP_SYS_ADMIN))
+ return -EPERM;
+
+- if (info->attrs[NBD_ATTR_INDEX])
++ if (info->attrs[NBD_ATTR_INDEX]) {
+ index = nla_get_u32(info->attrs[NBD_ATTR_INDEX]);
++
++ /*
++ * Too big first_minor can cause duplicate creation of
++ * sysfs files/links, since index << part_shift might overflow, or
++ * MKDEV() expect that the max bits of first_minor is 20.
++ */
++ if (index < 0 || index > MINORMASK >> part_shift) {
++ printk(KERN_ERR "nbd: illegal input index %d\n", index);
++ return -EINVAL;
++ }
++ }
+ if (!info->attrs[NBD_ATTR_SOCKETS]) {
+ printk(KERN_ERR "nbd: must specify at least one socket\n");
+ return -EINVAL;
+@@ -1970,13 +1987,14 @@ again:
+ nbd_put(nbd);
+ return -EINVAL;
+ }
+- config = nbd->config = nbd_alloc_config();
+- if (!nbd->config) {
++ config = nbd_alloc_config();
++ if (IS_ERR(config)) {
+ mutex_unlock(&nbd->config_lock);
+ nbd_put(nbd);
+ printk(KERN_ERR "nbd: couldn't allocate config\n");
+- return -ENOMEM;
++ return PTR_ERR(config);
+ }
++ nbd->config = config;
+ refcount_set(&nbd->config_refs, 1);
+ set_bit(NBD_RT_BOUND, &config->runtime_flags);
+
+@@ -2534,6 +2552,12 @@ static void __exit nbd_cleanup(void)
+ struct nbd_device *nbd;
+ LIST_HEAD(del_list);
+
++ /*
++ * Unregister netlink interface prior to waiting
++ * for the completion of netlink commands.
++ */
++ genl_unregister_family(&nbd_genl_family);
++
+ nbd_dbg_close();
+
+ mutex_lock(&nbd_index_mutex);
+@@ -2543,6 +2567,9 @@ static void __exit nbd_cleanup(void)
+ while (!list_empty(&del_list)) {
+ nbd = list_first_entry(&del_list, struct nbd_device, list);
+ list_del_init(&nbd->list);
++ if (refcount_read(&nbd->config_refs))
++ printk(KERN_ERR "nbd: possibly leaking nbd_config (ref %d)\n",
++ refcount_read(&nbd->config_refs));
+ if (refcount_read(&nbd->refs) != 1)
+ printk(KERN_ERR "nbd: possibly leaking a device\n");
+ nbd_put(nbd);
+@@ -2552,7 +2579,6 @@ static void __exit nbd_cleanup(void)
+ destroy_workqueue(nbd_del_wq);
+
+ idr_destroy(&nbd_index_idr);
+- genl_unregister_family(&nbd_genl_family);
+ unregister_blkdev(NBD_MAJOR, "nbd");
+ }
+
+diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
+index 7a1b1f9e49333..70d00cea9d22a 100644
+--- a/drivers/bus/ti-sysc.c
++++ b/drivers/bus/ti-sysc.c
+@@ -3395,7 +3395,9 @@ static int sysc_remove(struct platform_device *pdev)
+ struct sysc *ddata = platform_get_drvdata(pdev);
+ int error;
+
+- cancel_delayed_work_sync(&ddata->idle_work);
++ /* Device can still be enabled, see deferred idle quirk in probe */
++ if (cancel_delayed_work_sync(&ddata->idle_work))
++ ti_sysc_idle(&ddata->idle_work.work);
+
+ error = pm_runtime_resume_and_get(ddata->dev);
+ if (error < 0) {
+diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
+index e856df7e285c7..a6f3a8a2aca6d 100644
+--- a/drivers/char/hw_random/virtio-rng.c
++++ b/drivers/char/hw_random/virtio-rng.c
+@@ -159,6 +159,8 @@ static int probe_common(struct virtio_device *vdev)
+ goto err_find;
+ }
+
++ virtio_device_ready(vdev);
++
+ /* we always have a pending entropy request */
+ request_entropy(vi);
+
+diff --git a/drivers/char/random.c b/drivers/char/random.c
+index 0cfbfa8d5b50a..1f3072ee6b7cd 100644
+--- a/drivers/char/random.c
++++ b/drivers/char/random.c
+@@ -793,8 +793,8 @@ static void __cold _credit_init_bits(size_t bits)
+ *
+ **********************************************************************/
+
+-static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+-static bool trust_bootloader __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
++static bool trust_cpu __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
++static bool trust_bootloader __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
+ static int __init parse_trust_cpu(char *arg)
+ {
+ return kstrtobool(arg, &trust_cpu);
+@@ -817,7 +817,7 @@ early_param("random.trust_bootloader", parse_trust_bootloader);
+ int __init random_init(const char *command_line)
+ {
+ ktime_t now = ktime_get_real();
+- unsigned int i, arch_bytes;
++ unsigned int i, arch_bits;
+ unsigned long entropy;
+
+ #if defined(LATENT_ENTROPY_PLUGIN)
+@@ -825,12 +825,12 @@ int __init random_init(const char *command_line)
+ _mix_pool_bytes(compiletime_seed, sizeof(compiletime_seed));
+ #endif
+
+- for (i = 0, arch_bytes = BLAKE2S_BLOCK_SIZE;
++ for (i = 0, arch_bits = BLAKE2S_BLOCK_SIZE * 8;
+ i < BLAKE2S_BLOCK_SIZE; i += sizeof(entropy)) {
+ if (!arch_get_random_seed_long_early(&entropy) &&
+ !arch_get_random_long_early(&entropy)) {
+ entropy = random_get_entropy();
+- arch_bytes -= sizeof(entropy);
++ arch_bits -= sizeof(entropy) * 8;
+ }
+ _mix_pool_bytes(&entropy, sizeof(entropy));
+ }
+@@ -842,7 +842,7 @@ int __init random_init(const char *command_line)
+ if (crng_ready())
+ crng_reseed();
+ else if (trust_cpu)
+- credit_init_bits(arch_bytes * 8);
++ _credit_init_bits(arch_bits);
+
+ return 0;
+ }
+@@ -890,13 +890,12 @@ EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+ * Handle random seed passed by bootloader, and credit it if
+ * CONFIG_RANDOM_TRUST_BOOTLOADER is set.
+ */
+-void __cold add_bootloader_randomness(const void *buf, size_t len)
++void __init add_bootloader_randomness(const void *buf, size_t len)
+ {
+ mix_pool_bytes(buf, len);
+ if (trust_bootloader)
+ credit_init_bits(len * 8);
+ }
+-EXPORT_SYMBOL_GPL(add_bootloader_randomness);
+
+ #if IS_ENABLED(CONFIG_VMGENID)
+ static BLOCKING_NOTIFIER_HEAD(vmfork_chain);
+diff --git a/drivers/char/xillybus/xillyusb.c b/drivers/char/xillybus/xillyusb.c
+index dc3551796e5ed..39bcbfd908b46 100644
+--- a/drivers/char/xillybus/xillyusb.c
++++ b/drivers/char/xillybus/xillyusb.c
+@@ -549,6 +549,7 @@ static void cleanup_dev(struct kref *kref)
+ if (xdev->workq)
+ destroy_workqueue(xdev->workq);
+
++ usb_put_dev(xdev->udev);
+ kfree(xdev->channels); /* Argument may be NULL, and that's fine */
+ kfree(xdev);
+ }
+diff --git a/drivers/clocksource/timer-oxnas-rps.c b/drivers/clocksource/timer-oxnas-rps.c
+index 56c0cc32d0ac6..d514b44e67dd1 100644
+--- a/drivers/clocksource/timer-oxnas-rps.c
++++ b/drivers/clocksource/timer-oxnas-rps.c
+@@ -236,7 +236,7 @@ static int __init oxnas_rps_timer_init(struct device_node *np)
+ }
+
+ rps->irq = irq_of_parse_and_map(np, 0);
+- if (rps->irq < 0) {
++ if (!rps->irq) {
+ ret = -EINVAL;
+ goto err_iomap;
+ }
+diff --git a/drivers/clocksource/timer-riscv.c b/drivers/clocksource/timer-riscv.c
+index 1767f8bf20133..593d5a957b69d 100644
+--- a/drivers/clocksource/timer-riscv.c
++++ b/drivers/clocksource/timer-riscv.c
+@@ -34,7 +34,7 @@ static int riscv_clock_next_event(unsigned long delta,
+ static unsigned int riscv_clock_event_irq;
+ static DEFINE_PER_CPU(struct clock_event_device, riscv_clock_event) = {
+ .name = "riscv_timer_clockevent",
+- .features = CLOCK_EVT_FEAT_ONESHOT,
++ .features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP,
+ .rating = 100,
+ .set_next_event = riscv_clock_next_event,
+ };
+diff --git a/drivers/clocksource/timer-sp804.c b/drivers/clocksource/timer-sp804.c
+index 401d592e85f5a..e6a87f4af2b50 100644
+--- a/drivers/clocksource/timer-sp804.c
++++ b/drivers/clocksource/timer-sp804.c
+@@ -259,6 +259,11 @@ static int __init sp804_of_init(struct device_node *np, struct sp804_timer *time
+ struct clk *clk1, *clk2;
+ const char *name = of_get_property(np, "compatible", NULL);
+
++ if (initialized) {
++ pr_debug("%pOF: skipping further SP804 timer device\n", np);
++ return 0;
++ }
++
+ base = of_iomap(np, 0);
+ if (!base)
+ return -ENXIO;
+@@ -270,11 +275,6 @@ static int __init sp804_of_init(struct device_node *np, struct sp804_timer *time
+ writel(0, timer1_base + timer->ctrl);
+ writel(0, timer2_base + timer->ctrl);
+
+- if (initialized || !of_device_is_available(np)) {
+- ret = -EINVAL;
+- goto err;
+- }
+-
+ clk1 = of_clk_get(np, 0);
+ if (IS_ERR(clk1))
+ clk1 = NULL;
+diff --git a/drivers/dma/idxd/dma.c b/drivers/dma/idxd/dma.c
+index bfff59617d047..f9ed550117756 100644
+--- a/drivers/dma/idxd/dma.c
++++ b/drivers/dma/idxd/dma.c
+@@ -87,6 +87,27 @@ static inline void idxd_prep_desc_common(struct idxd_wq *wq,
+ hw->completion_addr = compl;
+ }
+
++static struct dma_async_tx_descriptor *
++idxd_dma_prep_interrupt(struct dma_chan *c, unsigned long flags)
++{
++ struct idxd_wq *wq = to_idxd_wq(c);
++ u32 desc_flags;
++ struct idxd_desc *desc;
++
++ if (wq->state != IDXD_WQ_ENABLED)
++ return NULL;
++
++ op_flag_setup(flags, &desc_flags);
++ desc = idxd_alloc_desc(wq, IDXD_OP_BLOCK);
++ if (IS_ERR(desc))
++ return NULL;
++
++ idxd_prep_desc_common(wq, desc->hw, DSA_OPCODE_NOOP,
++ 0, 0, 0, desc->compl_dma, desc_flags);
++ desc->txd.flags = flags;
++ return &desc->txd;
++}
++
+ static struct dma_async_tx_descriptor *
+ idxd_dma_submit_memcpy(struct dma_chan *c, dma_addr_t dma_dest,
+ dma_addr_t dma_src, size_t len, unsigned long flags)
+@@ -193,10 +214,12 @@ int idxd_register_dma_device(struct idxd_device *idxd)
+ INIT_LIST_HEAD(&dma->channels);
+ dma->dev = dev;
+
++ dma_cap_set(DMA_INTERRUPT, dma->cap_mask);
+ dma_cap_set(DMA_PRIVATE, dma->cap_mask);
+ dma_cap_set(DMA_COMPLETION_NO_ORDER, dma->cap_mask);
+ dma->device_release = idxd_dma_release;
+
++ dma->device_prep_dma_interrupt = idxd_dma_prep_interrupt;
+ if (idxd->hw.opcap.bits[0] & IDXD_OPCAP_MEMMOVE) {
+ dma_cap_set(DMA_MEMCPY, dma->cap_mask);
+ dma->device_prep_dma_memcpy = idxd_dma_submit_memcpy;
+diff --git a/drivers/dma/xilinx/zynqmp_dma.c b/drivers/dma/xilinx/zynqmp_dma.c
+index 7aa63b6520272..3ffa7f37c7017 100644
+--- a/drivers/dma/xilinx/zynqmp_dma.c
++++ b/drivers/dma/xilinx/zynqmp_dma.c
+@@ -229,7 +229,7 @@ struct zynqmp_dma_chan {
+ bool is_dmacoherent;
+ struct tasklet_struct tasklet;
+ bool idle;
+- u32 desc_size;
++ size_t desc_size;
+ bool err;
+ u32 bus_width;
+ u32 src_burst_len;
+@@ -486,7 +486,8 @@ static int zynqmp_dma_alloc_chan_resources(struct dma_chan *dchan)
+ }
+
+ chan->desc_pool_v = dma_alloc_coherent(chan->dev,
+- (2 * chan->desc_size * ZYNQMP_DMA_NUM_DESCS),
++ (2 * ZYNQMP_DMA_DESC_SIZE(chan) *
++ ZYNQMP_DMA_NUM_DESCS),
+ &chan->desc_pool_p, GFP_KERNEL);
+ if (!chan->desc_pool_v)
+ return -ENOMEM;
+diff --git a/drivers/extcon/extcon-axp288.c b/drivers/extcon/extcon-axp288.c
+index 7c6d5857ff25e..180be768c2157 100644
+--- a/drivers/extcon/extcon-axp288.c
++++ b/drivers/extcon/extcon-axp288.c
+@@ -394,8 +394,8 @@ static int axp288_extcon_probe(struct platform_device *pdev)
+ if (adev) {
+ info->id_extcon = extcon_get_extcon_dev(acpi_dev_name(adev));
+ put_device(&adev->dev);
+- if (!info->id_extcon)
+- return -EPROBE_DEFER;
++ if (IS_ERR(info->id_extcon))
++ return PTR_ERR(info->id_extcon);
+
+ dev_info(dev, "controlling USB role\n");
+ } else {
+diff --git a/drivers/extcon/extcon-ptn5150.c b/drivers/extcon/extcon-ptn5150.c
+index 5b9a3cf8df268..2a7874108df87 100644
+--- a/drivers/extcon/extcon-ptn5150.c
++++ b/drivers/extcon/extcon-ptn5150.c
+@@ -194,6 +194,13 @@ static int ptn5150_init_dev_type(struct ptn5150_info *info)
+ return 0;
+ }
+
++static void ptn5150_work_sync_and_put(void *data)
++{
++ struct ptn5150_info *info = data;
++
++ cancel_work_sync(&info->irq_work);
++}
++
+ static int ptn5150_i2c_probe(struct i2c_client *i2c)
+ {
+ struct device *dev = &i2c->dev;
+@@ -284,6 +291,10 @@ static int ptn5150_i2c_probe(struct i2c_client *i2c)
+ if (ret)
+ return -EINVAL;
+
++ ret = devm_add_action_or_reset(dev, ptn5150_work_sync_and_put, info);
++ if (ret)
++ return ret;
++
+ /*
+ * Update current extcon state if for example OTG connection was there
+ * before the probe
+diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c
+index a09e704fd0fa1..97e35c32bfa55 100644
+--- a/drivers/extcon/extcon.c
++++ b/drivers/extcon/extcon.c
+@@ -851,6 +851,8 @@ EXPORT_SYMBOL_GPL(extcon_set_property_capability);
+ * @extcon_name: the extcon name provided with extcon_dev_register()
+ *
+ * Return the pointer of extcon device if success or ERR_PTR(err) if fail.
++ * NOTE: This function returns -EPROBE_DEFER so it may only be called from
++ * probe() functions.
+ */
+ struct extcon_dev *extcon_get_extcon_dev(const char *extcon_name)
+ {
+@@ -864,7 +866,7 @@ struct extcon_dev *extcon_get_extcon_dev(const char *extcon_name)
+ if (!strcmp(sd->name, extcon_name))
+ goto out;
+ }
+- sd = NULL;
++ sd = ERR_PTR(-EPROBE_DEFER);
+ out:
+ mutex_unlock(&extcon_dev_list_lock);
+ return sd;
+@@ -1218,19 +1220,14 @@ int extcon_dev_register(struct extcon_dev *edev)
+ edev->dev.type = &edev->extcon_dev_type;
+ }
+
+- ret = device_register(&edev->dev);
+- if (ret) {
+- put_device(&edev->dev);
+- goto err_dev;
+- }
+-
+ spin_lock_init(&edev->lock);
+- edev->nh = devm_kcalloc(&edev->dev, edev->max_supported,
+- sizeof(*edev->nh), GFP_KERNEL);
+- if (!edev->nh) {
+- ret = -ENOMEM;
+- device_unregister(&edev->dev);
+- goto err_dev;
++ if (edev->max_supported) {
++ edev->nh = kcalloc(edev->max_supported, sizeof(*edev->nh),
++ GFP_KERNEL);
++ if (!edev->nh) {
++ ret = -ENOMEM;
++ goto err_alloc_nh;
++ }
+ }
+
+ for (index = 0; index < edev->max_supported; index++)
+@@ -1241,6 +1238,12 @@ int extcon_dev_register(struct extcon_dev *edev)
+ dev_set_drvdata(&edev->dev, edev);
+ edev->state = 0;
+
++ ret = device_register(&edev->dev);
++ if (ret) {
++ put_device(&edev->dev);
++ goto err_dev;
++ }
++
+ mutex_lock(&extcon_dev_list_lock);
+ list_add(&edev->entry, &extcon_dev_list);
+ mutex_unlock(&extcon_dev_list_lock);
+@@ -1248,6 +1251,9 @@ int extcon_dev_register(struct extcon_dev *edev)
+ return 0;
+
+ err_dev:
++ if (edev->max_supported)
++ kfree(edev->nh);
++err_alloc_nh:
+ if (edev->max_supported)
+ kfree(edev->extcon_dev_type.groups);
+ err_alloc_groups:
+@@ -1308,6 +1314,7 @@ void extcon_dev_unregister(struct extcon_dev *edev)
+ if (edev->max_supported) {
+ kfree(edev->extcon_dev_type.groups);
+ kfree(edev->cables);
++ kfree(edev->nh);
+ }
+
+ put_device(&edev->dev);
+diff --git a/drivers/firmware/dmi-sysfs.c b/drivers/firmware/dmi-sysfs.c
+index 3a353776bd344..66727ad3361b9 100644
+--- a/drivers/firmware/dmi-sysfs.c
++++ b/drivers/firmware/dmi-sysfs.c
+@@ -604,7 +604,7 @@ static void __init dmi_sysfs_register_handle(const struct dmi_header *dh,
+ "%d-%d", dh->type, entry->instance);
+
+ if (*ret) {
+- kfree(entry);
++ kobject_put(&entry->kobj);
+ return;
+ }
+
+diff --git a/drivers/firmware/stratix10-svc.c b/drivers/firmware/stratix10-svc.c
+index 8177a0fae11d8..14663f6713235 100644
+--- a/drivers/firmware/stratix10-svc.c
++++ b/drivers/firmware/stratix10-svc.c
+@@ -948,17 +948,17 @@ EXPORT_SYMBOL_GPL(stratix10_svc_allocate_memory);
+ void stratix10_svc_free_memory(struct stratix10_svc_chan *chan, void *kaddr)
+ {
+ struct stratix10_svc_data_mem *pmem;
+- size_t size = 0;
+
+ list_for_each_entry(pmem, &svc_data_mem, node)
+ if (pmem->vaddr == kaddr) {
+- size = pmem->size;
+- break;
++ gen_pool_free(chan->ctrl->genpool,
++ (unsigned long)kaddr, pmem->size);
++ pmem->vaddr = NULL;
++ list_del(&pmem->node);
++ return;
+ }
+
+- gen_pool_free(chan->ctrl->genpool, (unsigned long)kaddr, size);
+- pmem->vaddr = NULL;
+- list_del(&pmem->node);
++ list_del(&svc_data_mem);
+ }
+ EXPORT_SYMBOL_GPL(stratix10_svc_free_memory);
+
+diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
+index 8726921a11294..33683295a0bfe 100644
+--- a/drivers/gpio/gpio-pca953x.c
++++ b/drivers/gpio/gpio-pca953x.c
+@@ -1108,20 +1108,21 @@ static int pca953x_regcache_sync(struct device *dev)
+ {
+ struct pca953x_chip *chip = dev_get_drvdata(dev);
+ int ret;
++ u8 regaddr;
+
+ /*
+ * The ordering between direction and output is important,
+ * sync these registers first and only then sync the rest.
+ */
+- ret = regcache_sync_region(chip->regmap, chip->regs->direction,
+- chip->regs->direction + NBANK(chip));
++ regaddr = pca953x_recalc_addr(chip, chip->regs->direction, 0);
++ ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip));
+ if (ret) {
+ dev_err(dev, "Failed to sync GPIO dir registers: %d\n", ret);
+ return ret;
+ }
+
+- ret = regcache_sync_region(chip->regmap, chip->regs->output,
+- chip->regs->output + NBANK(chip));
++ regaddr = pca953x_recalc_addr(chip, chip->regs->output, 0);
++ ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip));
+ if (ret) {
+ dev_err(dev, "Failed to sync GPIO out registers: %d\n", ret);
+ return ret;
+@@ -1129,16 +1130,18 @@ static int pca953x_regcache_sync(struct device *dev)
+
+ #ifdef CONFIG_GPIO_PCA953X_IRQ
+ if (chip->driver_data & PCA_PCAL) {
+- ret = regcache_sync_region(chip->regmap, PCAL953X_IN_LATCH,
+- PCAL953X_IN_LATCH + NBANK(chip));
++ regaddr = pca953x_recalc_addr(chip, PCAL953X_IN_LATCH, 0);
++ ret = regcache_sync_region(chip->regmap, regaddr,
++ regaddr + NBANK(chip));
+ if (ret) {
+ dev_err(dev, "Failed to sync INT latch registers: %d\n",
+ ret);
+ return ret;
+ }
+
+- ret = regcache_sync_region(chip->regmap, PCAL953X_INT_MASK,
+- PCAL953X_INT_MASK + NBANK(chip));
++ regaddr = pca953x_recalc_addr(chip, PCAL953X_INT_MASK, 0);
++ ret = regcache_sync_region(chip->regmap, regaddr,
++ regaddr + NBANK(chip));
+ if (ret) {
+ dev_err(dev, "Failed to sync INT mask registers: %d\n",
+ ret);
+diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
+index 299de1d131d82..5ac1a43a16b4a 100644
+--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c
+@@ -535,6 +535,10 @@ void jpeg_v2_0_dec_ring_emit_ib(struct amdgpu_ring *ring,
+ {
+ unsigned vmid = AMDGPU_JOB_GET_VMID(job);
+
++ amdgpu_ring_write(ring, PACKETJ(mmUVD_JPEG_IH_CTRL_INTERNAL_OFFSET,
++ 0, 0, PACKETJ_TYPE0));
++ amdgpu_ring_write(ring, (vmid << JPEG_IH_CTRL__IH_VMID__SHIFT));
++
+ amdgpu_ring_write(ring, PACKETJ(mmUVD_LMI_JRBC_IB_VMID_INTERNAL_OFFSET,
+ 0, 0, PACKETJ_TYPE0));
+ amdgpu_ring_write(ring, (vmid | (vmid << 4)));
+@@ -768,7 +772,7 @@ static const struct amdgpu_ring_funcs jpeg_v2_0_dec_ring_vm_funcs = {
+ 8 + /* jpeg_v2_0_dec_ring_emit_vm_flush */
+ 18 + 18 + /* jpeg_v2_0_dec_ring_emit_fence x2 vm fence */
+ 8 + 16,
+- .emit_ib_size = 22, /* jpeg_v2_0_dec_ring_emit_ib */
++ .emit_ib_size = 24, /* jpeg_v2_0_dec_ring_emit_ib */
+ .emit_ib = jpeg_v2_0_dec_ring_emit_ib,
+ .emit_fence = jpeg_v2_0_dec_ring_emit_fence,
+ .emit_vm_flush = jpeg_v2_0_dec_ring_emit_vm_flush,
+diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h
+index 1a03baa597557..654e43e83e2c4 100644
+--- a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h
++++ b/drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.h
+@@ -41,6 +41,7 @@
+ #define mmUVD_JRBC_RB_REF_DATA_INTERNAL_OFFSET 0x4084
+ #define mmUVD_JRBC_STATUS_INTERNAL_OFFSET 0x4089
+ #define mmUVD_JPEG_PITCH_INTERNAL_OFFSET 0x401f
++#define mmUVD_JPEG_IH_CTRL_INTERNAL_OFFSET 0x4149
+
+ #define JRBC_DEC_EXTERNAL_REG_WRITE_ADDR 0x18000
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
+index e19f14c3ef59f..8af685fd85318 100644
+--- a/drivers/gpu/drm/amd/amdgpu/nv.c
++++ b/drivers/gpu/drm/amd/amdgpu/nv.c
+@@ -170,6 +170,7 @@ static const struct amdgpu_video_codec_info yc_video_codecs_decode_array[] = {
+ {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_HEVC, 8192, 4352, 186)},
+ {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VP9, 8192, 4352, 0)},
+ {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_JPEG, 4096, 4096, 0)},
++ {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_AV1, 8192, 4352, 0)},
+ };
+
+ static const struct amdgpu_video_codecs yc_video_codecs_decode = {
+diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+index cb5f0a12333f3..57a34e775da38 100644
+--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c
+@@ -1821,23 +1821,21 @@ static const struct amdgpu_ring_funcs vcn_v3_0_dec_sw_ring_vm_funcs = {
+ .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
+ };
+
+-static int vcn_v3_0_limit_sched(struct amdgpu_cs_parser *p,
+- struct amdgpu_job *job)
++static int vcn_v3_0_limit_sched(struct amdgpu_cs_parser *p)
+ {
+ struct drm_gpu_scheduler **scheds;
+
+ /* The create msg must be in the first IB submitted */
+- if (atomic_read(&job->base.entity->fence_seq))
++ if (atomic_read(&p->entity->fence_seq))
+ return -EINVAL;
+
+ scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_DEC]
+ [AMDGPU_RING_PRIO_DEFAULT].sched;
+- drm_sched_entity_modify_sched(job->base.entity, scheds, 1);
++ drm_sched_entity_modify_sched(p->entity, scheds, 1);
+ return 0;
+ }
+
+-static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
+- uint64_t addr)
++static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, uint64_t addr)
+ {
+ struct ttm_operation_ctx ctx = { false, false };
+ struct amdgpu_bo_va_mapping *map;
+@@ -1908,7 +1906,7 @@ static int vcn_v3_0_dec_msg(struct amdgpu_cs_parser *p, struct amdgpu_job *job,
+ if (create[0] == 0x7 || create[0] == 0x10 || create[0] == 0x11)
+ continue;
+
+- r = vcn_v3_0_limit_sched(p, job);
++ r = vcn_v3_0_limit_sched(p);
+ if (r)
+ goto out;
+ }
+@@ -1922,7 +1920,7 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
+ struct amdgpu_job *job,
+ struct amdgpu_ib *ib)
+ {
+- struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
++ struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
+ uint32_t msg_lo = 0, msg_hi = 0;
+ unsigned i;
+ int r;
+@@ -1941,8 +1939,7 @@ static int vcn_v3_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
+ msg_hi = val;
+ } else if (reg == PACKET0(p->adev->vcn.internal.cmd, 0) &&
+ val == 0) {
+- r = vcn_v3_0_dec_msg(p, job,
+- ((u64)msg_hi) << 32 | msg_lo);
++ r = vcn_v3_0_dec_msg(p, ((u64)msg_hi) << 32 | msg_lo);
+ if (r)
+ return r;
+ }
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+index c96d521447fcf..651498bfecc8d 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+@@ -156,7 +156,9 @@ static void kfd_device_info_init(struct kfd_dev *kfd,
+
+ if (gc_version < IP_VERSION(11, 0, 0)) {
+ /* Navi2x+, Navi1x+ */
+- if (gc_version >= IP_VERSION(10, 3, 0))
++ if (gc_version == IP_VERSION(10, 3, 6))
++ kfd->device_info.no_atomic_fw_version = 14;
++ else if (gc_version >= IP_VERSION(10, 3, 0))
+ kfd->device_info.no_atomic_fw_version = 92;
+ else if (gc_version >= IP_VERSION(10, 1, 1))
+ kfd->device_info.no_atomic_fw_version = 145;
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 62139ff35476c..8dd03de7c2775 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -771,7 +771,7 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params)
+
+ do {
+ dc_stat_get_dmub_notification(adev->dm.dc, ¬ify);
+- if (notify.type > ARRAY_SIZE(dm->dmub_thread_offload)) {
++ if (notify.type >= ARRAY_SIZE(dm->dmub_thread_offload)) {
+ DRM_ERROR("DM: notify type %d invalid!", notify.type);
+ continue;
+ }
+diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
+index 8be4c19706285..2918ad07d489d 100644
+--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
++++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
+@@ -41,9 +41,7 @@
+
+ #include "dc_dmub_srv.h"
+
+-#if defined (CONFIG_DRM_AMD_DC_DP2_0)
+ #include "dc_link_dp.h"
+-#endif
+
+ #define TO_CLK_MGR_DCN315(clk_mgr)\
+ container_of(clk_mgr, struct clk_mgr_dcn315, base)
+@@ -91,7 +89,8 @@ static void dcn315_disable_otg_wa(struct clk_mgr *clk_mgr_base, bool disable)
+
+ if (pipe->top_pipe || pipe->prev_odm_pipe)
+ continue;
+- if (pipe->stream && (pipe->stream->dpms_off || dc_is_virtual_signal(pipe->stream->signal))) {
++ if (pipe->stream && (pipe->stream->dpms_off || pipe->plane_state == NULL ||
++ dc_is_virtual_signal(pipe->stream->signal))) {
+ if (disable)
+ pipe->stream_res.tg->funcs->immediate_disable_crtc(pipe->stream_res.tg);
+ else
+diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c
+index 3121dd2d2a911..fc3af81ed6c62 100644
+--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c
++++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c
+@@ -122,7 +122,8 @@ static void dcn316_disable_otg_wa(struct clk_mgr *clk_mgr_base, bool disable)
+
+ if (pipe->top_pipe || pipe->prev_odm_pipe)
+ continue;
+- if (pipe->stream && (pipe->stream->dpms_off || dc_is_virtual_signal(pipe->stream->signal))) {
++ if (pipe->stream && (pipe->stream->dpms_off || pipe->plane_state == NULL ||
++ dc_is_virtual_signal(pipe->stream->signal))) {
+ if (disable)
+ pipe->stream_res.tg->funcs->immediate_disable_crtc(pipe->stream_res.tg);
+ else
+diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
+index cc5128e67daff..8e9a7409c17a7 100644
+--- a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
++++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
+@@ -1105,9 +1105,12 @@ static bool get_pixel_clk_frequency_100hz(
+ * not be programmed equal to DPREFCLK
+ */
+ modulo_hz = REG_READ(MODULO[inst]);
+- *pixel_clk_khz = div_u64((uint64_t)clock_hz*
+- clock_source->ctx->dc->clk_mgr->dprefclk_khz*10,
+- modulo_hz);
++ if (modulo_hz)
++ *pixel_clk_khz = div_u64((uint64_t)clock_hz*
++ clock_source->ctx->dc->clk_mgr->dprefclk_khz*10,
++ modulo_hz);
++ else
++ *pixel_clk_khz = 0;
+ } else {
+ /* NOTE: There is agreement with VBIOS here that MODULO is
+ * programmed equal to DPREFCLK, in which case PHASE will be
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml_wrapper.c b/drivers/gpu/drm/amd/display/dc/dml/dml_wrapper.c
+index 789f7562cdc75..d2273674e8729 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dml_wrapper.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dml_wrapper.c
+@@ -1284,10 +1284,8 @@ static bool is_dtbclk_required(struct dc *dc, struct dc_state *context)
+ for (i = 0; i < dc->res_pool->pipe_count; i++) {
+ if (!context->res_ctx.pipe_ctx[i].stream)
+ continue;
+-#if defined (CONFIG_DRM_AMD_DC_DP2_0)
+ if (is_dp_128b_132b_signal(&context->res_ctx.pipe_ctx[i]))
+ return true;
+-#endif
+ }
+ return false;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+index 38f04836c82f2..7a1e225fb8230 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+@@ -586,12 +586,28 @@ static int sienna_cichlid_get_smu_metrics_data(struct smu_context *smu,
+ uint16_t average_gfx_activity;
+ int ret = 0;
+
+- if ((smu->adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 7)) &&
+- (smu->smc_fw_version >= 0x3A4900))
+- use_metrics_v3 = true;
+- else if ((smu->adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 7)) &&
+- (smu->smc_fw_version >= 0x3A4300))
+- use_metrics_v2 = true;
++ switch (smu->adev->ip_versions[MP1_HWIP][0]) {
++ case IP_VERSION(11, 0, 7):
++ if (smu->smc_fw_version >= 0x3A4900)
++ use_metrics_v3 = true;
++ else if (smu->smc_fw_version >= 0x3A4300)
++ use_metrics_v2 = true;
++ break;
++ case IP_VERSION(11, 0, 11):
++ if (smu->smc_fw_version >= 0x412D00)
++ use_metrics_v2 = true;
++ break;
++ case IP_VERSION(11, 0, 12):
++ if (smu->smc_fw_version >= 0x3B2300)
++ use_metrics_v2 = true;
++ break;
++ case IP_VERSION(11, 0, 13):
++ if (smu->smc_fw_version >= 0x491100)
++ use_metrics_v2 = true;
++ break;
++ default:
++ break;
++ }
+
+ ret = smu_cmn_get_metrics_table(smu,
+ NULL,
+@@ -3701,13 +3717,28 @@ static ssize_t sienna_cichlid_get_gpu_metrics(struct smu_context *smu,
+ uint16_t average_gfx_activity;
+ int ret = 0;
+
+- if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 7)) &&
+- (smu->smc_fw_version >= 0x3A4900))
+- use_metrics_v3 = true;
+- else if ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 7)) &&
+- (smu->smc_fw_version >= 0x3A4300))
+- use_metrics_v2 = true;
+-
++ switch (smu->adev->ip_versions[MP1_HWIP][0]) {
++ case IP_VERSION(11, 0, 7):
++ if (smu->smc_fw_version >= 0x3A4900)
++ use_metrics_v3 = true;
++ else if (smu->smc_fw_version >= 0x3A4300)
++ use_metrics_v2 = true;
++ break;
++ case IP_VERSION(11, 0, 11):
++ if (smu->smc_fw_version >= 0x412D00)
++ use_metrics_v2 = true;
++ break;
++ case IP_VERSION(11, 0, 12):
++ if (smu->smc_fw_version >= 0x3B2300)
++ use_metrics_v2 = true;
++ break;
++ case IP_VERSION(11, 0, 13):
++ if (smu->smc_fw_version >= 0x491100)
++ use_metrics_v2 = true;
++ break;
++ default:
++ break;
++ }
+
+ ret = smu_cmn_get_metrics_table(smu,
+ &metrics_external,
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+index b87f550af26ba..5f8809f6990dd 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+@@ -781,7 +781,7 @@ int smu_v11_0_set_allowed_mask(struct smu_context *smu)
+ goto failed;
+ }
+
+- bitmap_copy((unsigned long *)feature_mask, feature->allowed, 64);
++ bitmap_to_arr32(feature_mask, feature->allowed, 64);
+
+ ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetAllowedFeaturesMaskHigh,
+ feature_mask[1], NULL);
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+index cd81f848d45ab..7f998f24af81d 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+@@ -1664,6 +1664,7 @@ static const struct throttling_logging_label {
+ uint32_t feature_mask;
+ const char *label;
+ } logging_label[] = {
++ {(1U << THROTTLER_TEMP_GPU_BIT), "GPU"},
+ {(1U << THROTTLER_TEMP_MEM_BIT), "HBM"},
+ {(1U << THROTTLER_TEMP_VR_GFX_BIT), "VR of GFX rail"},
+ {(1U << THROTTLER_TEMP_VR_MEM_BIT), "VR of HBM rail"},
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+index cf09e30bdfe0b..747430ce63942 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+@@ -730,7 +730,7 @@ int smu_v13_0_set_allowed_mask(struct smu_context *smu)
+ feature->feature_num < 64)
+ return -EINVAL;
+
+- bitmap_copy((unsigned long *)feature_mask, feature->allowed, 64);
++ bitmap_to_arr32(feature_mask, feature->allowed, 64);
+
+ ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetAllowedFeaturesMaskHigh,
+ feature_mask[1], NULL);
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
+index 87257b1b028f7..feff4f8c927cc 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c
+@@ -190,6 +190,9 @@ static int yellow_carp_fini_smc_tables(struct smu_context *smu)
+ kfree(smu_table->watermarks_table);
+ smu_table->watermarks_table = NULL;
+
++ kfree(smu_table->gpu_metrics_table);
++ smu_table->gpu_metrics_table = NULL;
++
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+index 988669505aa5e..26d8fc3c18673 100644
+--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
++++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+@@ -1268,6 +1268,25 @@ static int analogix_dp_bridge_attach(struct drm_bridge *bridge,
+ return 0;
+ }
+
++static
++struct drm_crtc *analogix_dp_get_old_crtc(struct analogix_dp_device *dp,
++ struct drm_atomic_state *state)
++{
++ struct drm_encoder *encoder = dp->encoder;
++ struct drm_connector *connector;
++ struct drm_connector_state *conn_state;
++
++ connector = drm_atomic_get_old_connector_for_encoder(state, encoder);
++ if (!connector)
++ return NULL;
++
++ conn_state = drm_atomic_get_old_connector_state(state, connector);
++ if (!conn_state)
++ return NULL;
++
++ return conn_state->crtc;
++}
++
+ static
+ struct drm_crtc *analogix_dp_get_new_crtc(struct analogix_dp_device *dp,
+ struct drm_atomic_state *state)
+@@ -1448,14 +1467,16 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge,
+ {
+ struct drm_atomic_state *old_state = old_bridge_state->base.state;
+ struct analogix_dp_device *dp = bridge->driver_private;
+- struct drm_crtc *crtc;
++ struct drm_crtc *old_crtc, *new_crtc;
++ struct drm_crtc_state *old_crtc_state = NULL;
+ struct drm_crtc_state *new_crtc_state = NULL;
++ int ret;
+
+- crtc = analogix_dp_get_new_crtc(dp, old_state);
+- if (!crtc)
++ new_crtc = analogix_dp_get_new_crtc(dp, old_state);
++ if (!new_crtc)
+ goto out;
+
+- new_crtc_state = drm_atomic_get_new_crtc_state(old_state, crtc);
++ new_crtc_state = drm_atomic_get_new_crtc_state(old_state, new_crtc);
+ if (!new_crtc_state)
+ goto out;
+
+@@ -1464,6 +1485,19 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge,
+ return;
+
+ out:
++ old_crtc = analogix_dp_get_old_crtc(dp, old_state);
++ if (old_crtc) {
++ old_crtc_state = drm_atomic_get_old_crtc_state(old_state,
++ old_crtc);
++
++ /* When moving from PSR to fully disabled, exit PSR first. */
++ if (old_crtc_state && old_crtc_state->self_refresh_active) {
++ ret = analogix_dp_disable_psr(dp);
++ if (ret)
++ DRM_ERROR("Failed to disable psr (%d)\n", ret);
++ }
++ }
++
+ analogix_dp_bridge_disable(bridge);
+ }
+
+diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+index 19daaddd29a41..3d58110465fe9 100644
+--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
++++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+@@ -573,7 +573,7 @@ static int sn65dsi83_parse_dt(struct sn65dsi83 *ctx, enum sn65dsi83_model model)
+ ctx->host_node = of_graph_get_remote_port_parent(endpoint);
+ of_node_put(endpoint);
+
+- if (ctx->dsi_lanes < 0 || ctx->dsi_lanes > 4) {
++ if (ctx->dsi_lanes <= 0 || ctx->dsi_lanes > 4) {
+ ret = -EINVAL;
+ goto err_put_node;
+ }
+diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
+index 9603193d2fa13..987e4b212e9fb 100644
+--- a/drivers/gpu/drm/drm_atomic_helper.c
++++ b/drivers/gpu/drm/drm_atomic_helper.c
+@@ -1011,9 +1011,19 @@ crtc_needs_disable(struct drm_crtc_state *old_state,
+ return drm_atomic_crtc_effectively_active(old_state);
+
+ /*
+- * We need to run through the crtc_funcs->disable() function if the CRTC
+- * is currently on, if it's transitioning to self refresh mode, or if
+- * it's in self refresh mode and needs to be fully disabled.
++ * We need to disable bridge(s) and CRTC if we're transitioning out of
++ * self-refresh and changing CRTCs at the same time, because the
++ * bridge tracks self-refresh status via CRTC state.
++ */
++ if (old_state->self_refresh_active &&
++ old_state->crtc != new_state->crtc)
++ return true;
++
++ /*
++ * We also need to run through the crtc_funcs->disable() function if
++ * the CRTC is currently on, if it's transitioning to self refresh
++ * mode, or if it's in self refresh mode and needs to be fully
++ * disabled.
+ */
+ return old_state->active ||
+ (old_state->self_refresh_active && !new_state->active) ||
+diff --git a/drivers/gpu/drm/imx/ipuv3-crtc.c b/drivers/gpu/drm/imx/ipuv3-crtc.c
+index 9c8829f945b23..f7863d6dea804 100644
+--- a/drivers/gpu/drm/imx/ipuv3-crtc.c
++++ b/drivers/gpu/drm/imx/ipuv3-crtc.c
+@@ -69,7 +69,7 @@ static void ipu_crtc_disable_planes(struct ipu_crtc *ipu_crtc,
+ drm_atomic_crtc_state_for_each_plane(plane, old_crtc_state) {
+ if (plane == &ipu_crtc->plane[0]->base)
+ disable_full = true;
+- if (&ipu_crtc->plane[1] && plane == &ipu_crtc->plane[1]->base)
++ if (ipu_crtc->plane[1] && plane == &ipu_crtc->plane[1]->base)
+ disable_partial = true;
+ }
+
+diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
+index 08cc48af03b7d..de1974916ad2d 100644
+--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
++++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
+@@ -1380,8 +1380,13 @@ void dp_ctrl_reset_irq_ctrl(struct dp_ctrl *dp_ctrl, bool enable)
+
+ dp_catalog_ctrl_reset(ctrl->catalog);
+
+- if (enable)
+- dp_catalog_ctrl_enable_irq(ctrl->catalog, enable);
++ /*
++ * all dp controller programmable registers will not
++ * be reset to default value after DP_SW_RESET
++ * therefore interrupt mask bits have to be updated
++ * to enable/disable interrupts
++ */
++ dp_catalog_ctrl_enable_irq(ctrl->catalog, enable);
+ }
+
+ void dp_ctrl_phy_init(struct dp_ctrl *dp_ctrl)
+diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
+index 94b6f0a19c83a..47780fe597f23 100644
+--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
++++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
+@@ -233,6 +233,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
+ struct drm_file *file)
+ {
+ struct panfrost_device *pfdev = dev->dev_private;
++ struct panfrost_file_priv *file_priv = file->driver_priv;
+ struct drm_panfrost_submit *args = data;
+ struct drm_syncobj *sync_out = NULL;
+ struct panfrost_job *job;
+@@ -262,12 +263,12 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
+ job->jc = args->jc;
+ job->requirements = args->requirements;
+ job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
+- job->file_priv = file->driver_priv;
++ job->mmu = file_priv->mmu;
+
+ slot = panfrost_job_get_slot(job);
+
+ ret = drm_sched_job_init(&job->base,
+- &job->file_priv->sched_entity[slot],
++ &file_priv->sched_entity[slot],
+ NULL);
+ if (ret)
+ goto out_put_job;
+diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
+index a6925dbb6224f..22c2af1a4627d 100644
+--- a/drivers/gpu/drm/panfrost/panfrost_job.c
++++ b/drivers/gpu/drm/panfrost/panfrost_job.c
+@@ -201,7 +201,7 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
+ return;
+ }
+
+- cfg = panfrost_mmu_as_get(pfdev, job->file_priv->mmu);
++ cfg = panfrost_mmu_as_get(pfdev, job->mmu);
+
+ job_write(pfdev, JS_HEAD_NEXT_LO(js), lower_32_bits(jc_head));
+ job_write(pfdev, JS_HEAD_NEXT_HI(js), upper_32_bits(jc_head));
+@@ -431,7 +431,7 @@ static void panfrost_job_handle_err(struct panfrost_device *pfdev,
+ job->jc = 0;
+ }
+
+- panfrost_mmu_as_put(pfdev, job->file_priv->mmu);
++ panfrost_mmu_as_put(pfdev, job->mmu);
+ panfrost_devfreq_record_idle(&pfdev->pfdevfreq);
+
+ if (signal_fence)
+@@ -452,7 +452,7 @@ static void panfrost_job_handle_done(struct panfrost_device *pfdev,
+ * happen when we receive the DONE interrupt while doing a GPU reset).
+ */
+ job->jc = 0;
+- panfrost_mmu_as_put(pfdev, job->file_priv->mmu);
++ panfrost_mmu_as_put(pfdev, job->mmu);
+ panfrost_devfreq_record_idle(&pfdev->pfdevfreq);
+
+ dma_fence_signal_locked(job->done_fence);
+diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
+index 77e6d0e6f612f..8becc1ba0eb95 100644
+--- a/drivers/gpu/drm/panfrost/panfrost_job.h
++++ b/drivers/gpu/drm/panfrost/panfrost_job.h
+@@ -17,7 +17,7 @@ struct panfrost_job {
+ struct kref refcount;
+
+ struct panfrost_device *pfdev;
+- struct panfrost_file_priv *file_priv;
++ struct panfrost_mmu *mmu;
+
+ /* Fence to be signaled by IRQ handler when the job is complete. */
+ struct dma_fence *done_fence;
+diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c
+index 0cb1345c6ba40..fabe4f4ca1249 100644
+--- a/drivers/gpu/drm/radeon/radeon_connectors.c
++++ b/drivers/gpu/drm/radeon/radeon_connectors.c
+@@ -473,6 +473,8 @@ static struct drm_display_mode *radeon_fp_native_mode(struct drm_encoder *encode
+ native_mode->vdisplay != 0 &&
+ native_mode->clock != 0) {
+ mode = drm_mode_duplicate(dev, native_mode);
++ if (!mode)
++ return NULL;
+ mode->type = DRM_MODE_TYPE_PREFERRED | DRM_MODE_TYPE_DRIVER;
+ drm_mode_set_name(mode);
+
+@@ -487,6 +489,8 @@ static struct drm_display_mode *radeon_fp_native_mode(struct drm_encoder *encode
+ * simpler.
+ */
+ mode = drm_cvt_mode(dev, native_mode->hdisplay, native_mode->vdisplay, 60, true, false, false);
++ if (!mode)
++ return NULL;
+ mode->type = DRM_MODE_TYPE_PREFERRED | DRM_MODE_TYPE_DRIVER;
+ DRM_DEBUG_KMS("Adding cvt approximation of native panel mode %s\n", mode->name);
+ }
+diff --git a/drivers/hwtracing/coresight/coresight-cpu-debug.c b/drivers/hwtracing/coresight/coresight-cpu-debug.c
+index 8845ec4b4402a..1874df7c6a736 100644
+--- a/drivers/hwtracing/coresight/coresight-cpu-debug.c
++++ b/drivers/hwtracing/coresight/coresight-cpu-debug.c
+@@ -380,9 +380,10 @@ static int debug_notifier_call(struct notifier_block *self,
+ int cpu;
+ struct debug_drvdata *drvdata;
+
+- mutex_lock(&debug_lock);
++ /* Bail out if we can't acquire the mutex or the functionality is off */
++ if (!mutex_trylock(&debug_lock))
++ return NOTIFY_DONE;
+
+- /* Bail out if the functionality is disabled */
+ if (!debug_enable)
+ goto skip_dump;
+
+@@ -401,7 +402,7 @@ static int debug_notifier_call(struct notifier_block *self,
+
+ skip_dump:
+ mutex_unlock(&debug_lock);
+- return 0;
++ return NOTIFY_DONE;
+ }
+
+ static struct notifier_block debug_notifier = {
+diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c
+index 805c77143a0f9..b4c1ad19cdaec 100644
+--- a/drivers/i2c/busses/i2c-cadence.c
++++ b/drivers/i2c/busses/i2c-cadence.c
+@@ -760,7 +760,7 @@ static void cdns_i2c_master_reset(struct i2c_adapter *adap)
+ static int cdns_i2c_process_msg(struct cdns_i2c *id, struct i2c_msg *msg,
+ struct i2c_adapter *adap)
+ {
+- unsigned long time_left;
++ unsigned long time_left, msg_timeout;
+ u32 reg;
+
+ id->p_msg = msg;
+@@ -785,8 +785,16 @@ static int cdns_i2c_process_msg(struct cdns_i2c *id, struct i2c_msg *msg,
+ else
+ cdns_i2c_msend(id);
+
++ /* Minimal time to execute this message */
++ msg_timeout = msecs_to_jiffies((1000 * msg->len * BITS_PER_BYTE) / id->i2c_clk);
++ /* Plus some wiggle room */
++ msg_timeout += msecs_to_jiffies(500);
++
++ if (msg_timeout < adap->timeout)
++ msg_timeout = adap->timeout;
++
+ /* Wait for the signal of completion */
+- time_left = wait_for_completion_timeout(&id->xfer_done, adap->timeout);
++ time_left = wait_for_completion_timeout(&id->xfer_done, msg_timeout);
+ if (time_left == 0) {
+ cdns_i2c_master_reset(adap);
+ dev_err(id->adap.dev.parent,
+diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
+index f651d3e124d6c..bdecb78bfc26d 100644
+--- a/drivers/i2c/busses/i2c-mt65xx.c
++++ b/drivers/i2c/busses/i2c-mt65xx.c
+@@ -1177,7 +1177,7 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
+ int left_num = num;
+ struct mtk_i2c *i2c = i2c_get_adapdata(adap);
+
+- ret = clk_bulk_prepare_enable(I2C_MT65XX_CLK_MAX, i2c->clocks);
++ ret = clk_bulk_enable(I2C_MT65XX_CLK_MAX, i2c->clocks);
+ if (ret)
+ return ret;
+
+@@ -1231,7 +1231,7 @@ static int mtk_i2c_transfer(struct i2c_adapter *adap,
+ ret = num;
+
+ err_exit:
+- clk_bulk_disable_unprepare(I2C_MT65XX_CLK_MAX, i2c->clocks);
++ clk_bulk_disable(I2C_MT65XX_CLK_MAX, i2c->clocks);
+ return ret;
+ }
+
+@@ -1412,7 +1412,7 @@ static int mtk_i2c_probe(struct platform_device *pdev)
+ return ret;
+ }
+ mtk_i2c_init_hw(i2c);
+- clk_bulk_disable_unprepare(I2C_MT65XX_CLK_MAX, i2c->clocks);
++ clk_bulk_disable(I2C_MT65XX_CLK_MAX, i2c->clocks);
+
+ ret = devm_request_irq(&pdev->dev, irq, mtk_i2c_irq,
+ IRQF_NO_SUSPEND | IRQF_TRIGGER_NONE,
+@@ -1439,6 +1439,8 @@ static int mtk_i2c_remove(struct platform_device *pdev)
+
+ i2c_del_adapter(&i2c->adap);
+
++ clk_bulk_unprepare(I2C_MT65XX_CLK_MAX, i2c->clocks);
++
+ return 0;
+ }
+
+@@ -1448,6 +1450,7 @@ static int mtk_i2c_suspend_noirq(struct device *dev)
+ struct mtk_i2c *i2c = dev_get_drvdata(dev);
+
+ i2c_mark_adapter_suspended(&i2c->adap);
++ clk_bulk_unprepare(I2C_MT65XX_CLK_MAX, i2c->clocks);
+
+ return 0;
+ }
+@@ -1465,7 +1468,7 @@ static int mtk_i2c_resume_noirq(struct device *dev)
+
+ mtk_i2c_init_hw(i2c);
+
+- clk_bulk_disable_unprepare(I2C_MT65XX_CLK_MAX, i2c->clocks);
++ clk_bulk_disable(I2C_MT65XX_CLK_MAX, i2c->clocks);
+
+ i2c_mark_adapter_resumed(&i2c->adap);
+
+diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
+index 47551ab73ca8a..c5a019eab5ec9 100644
+--- a/drivers/idle/intel_idle.c
++++ b/drivers/idle/intel_idle.c
+@@ -115,6 +115,18 @@ static unsigned int mwait_substates __initdata;
+ #define flg2MWAIT(flags) (((flags) >> 24) & 0xFF)
+ #define MWAIT2flg(eax) ((eax & 0xFF) << 24)
+
++static __always_inline int __intel_idle(struct cpuidle_device *dev,
++ struct cpuidle_driver *drv, int index)
++{
++ struct cpuidle_state *state = &drv->states[index];
++ unsigned long eax = flg2MWAIT(state->flags);
++ unsigned long ecx = 1; /* break on interrupt flag */
++
++ mwait_idle_with_hints(eax, ecx);
++
++ return index;
++}
++
+ /**
+ * intel_idle - Ask the processor to enter the given idle state.
+ * @dev: cpuidle device of the target CPU.
+@@ -132,16 +144,19 @@ static unsigned int mwait_substates __initdata;
+ static __cpuidle int intel_idle(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index)
+ {
+- struct cpuidle_state *state = &drv->states[index];
+- unsigned long eax = flg2MWAIT(state->flags);
+- unsigned long ecx = 1; /* break on interrupt flag */
++ return __intel_idle(dev, drv, index);
++}
+
+- if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
+- local_irq_enable();
++static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
++ struct cpuidle_driver *drv, int index)
++{
++ int ret;
+
+- mwait_idle_with_hints(eax, ecx);
++ raw_local_irq_enable();
++ ret = __intel_idle(dev, drv, index);
++ raw_local_irq_disable();
+
+- return index;
++ return ret;
+ }
+
+ /**
+@@ -1668,6 +1683,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
+ /* Structure copy. */
+ drv->states[drv->state_count] = cpuidle_state_table[cstate];
+
++ if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE)
++ drv->states[drv->state_count].enter = intel_idle_irq;
++
+ if ((disabled_states_mask & BIT(drv->state_count)) ||
+ ((icpu->use_acpi || force_use_acpi) &&
+ intel_idle_off_by_default(mwait_hint) &&
+diff --git a/drivers/iio/adc/ad7124.c b/drivers/iio/adc/ad7124.c
+index c47ead15f6e5a..3752b2c889598 100644
+--- a/drivers/iio/adc/ad7124.c
++++ b/drivers/iio/adc/ad7124.c
+@@ -188,7 +188,6 @@ static const struct iio_chan_spec ad7124_channel_template = {
+ .sign = 'u',
+ .realbits = 24,
+ .storagebits = 32,
+- .shift = 8,
+ .endianness = IIO_BE,
+ },
+ };
+diff --git a/drivers/iio/adc/sc27xx_adc.c b/drivers/iio/adc/sc27xx_adc.c
+index 00098caf6d9ee..cfe003cc4f0b6 100644
+--- a/drivers/iio/adc/sc27xx_adc.c
++++ b/drivers/iio/adc/sc27xx_adc.c
+@@ -36,8 +36,8 @@
+
+ /* Bits and mask definition for SC27XX_ADC_CH_CFG register */
+ #define SC27XX_ADC_CHN_ID_MASK GENMASK(4, 0)
+-#define SC27XX_ADC_SCALE_MASK GENMASK(10, 8)
+-#define SC27XX_ADC_SCALE_SHIFT 8
++#define SC27XX_ADC_SCALE_MASK GENMASK(10, 9)
++#define SC27XX_ADC_SCALE_SHIFT 9
+
+ /* Bits definitions for SC27XX_ADC_INT_EN registers */
+ #define SC27XX_ADC_IRQ_EN BIT(0)
+@@ -103,14 +103,14 @@ static struct sc27xx_adc_linear_graph small_scale_graph = {
+ 100, 341,
+ };
+
+-static const struct sc27xx_adc_linear_graph big_scale_graph_calib = {
+- 4200, 856,
+- 3600, 733,
++static const struct sc27xx_adc_linear_graph sc2731_big_scale_graph_calib = {
++ 4200, 850,
++ 3600, 728,
+ };
+
+-static const struct sc27xx_adc_linear_graph small_scale_graph_calib = {
+- 1000, 833,
+- 100, 80,
++static const struct sc27xx_adc_linear_graph sc2731_small_scale_graph_calib = {
++ 1000, 838,
++ 100, 84,
+ };
+
+ static int sc27xx_adc_get_calib_data(u32 calib_data, int calib_adc)
+@@ -130,11 +130,11 @@ static int sc27xx_adc_scale_calibration(struct sc27xx_adc_data *data,
+ size_t len;
+
+ if (big_scale) {
+- calib_graph = &big_scale_graph_calib;
++ calib_graph = &sc2731_big_scale_graph_calib;
+ graph = &big_scale_graph;
+ cell_name = "big_scale_calib";
+ } else {
+- calib_graph = &small_scale_graph_calib;
++ calib_graph = &sc2731_small_scale_graph_calib;
+ graph = &small_scale_graph;
+ cell_name = "small_scale_calib";
+ }
+diff --git a/drivers/iio/adc/stmpe-adc.c b/drivers/iio/adc/stmpe-adc.c
+index d2d4053884991..83e0ac4467ca0 100644
+--- a/drivers/iio/adc/stmpe-adc.c
++++ b/drivers/iio/adc/stmpe-adc.c
+@@ -61,7 +61,7 @@ struct stmpe_adc {
+ static int stmpe_read_voltage(struct stmpe_adc *info,
+ struct iio_chan_spec const *chan, int *val)
+ {
+- long ret;
++ unsigned long ret;
+
+ mutex_lock(&info->lock);
+
+@@ -79,7 +79,7 @@ static int stmpe_read_voltage(struct stmpe_adc *info,
+
+ ret = wait_for_completion_timeout(&info->completion, STMPE_ADC_TIMEOUT);
+
+- if (ret <= 0) {
++ if (ret == 0) {
+ stmpe_reg_write(info->stmpe, STMPE_REG_ADC_INT_STA,
+ STMPE_ADC_CH(info->channel));
+ mutex_unlock(&info->lock);
+@@ -96,7 +96,7 @@ static int stmpe_read_voltage(struct stmpe_adc *info,
+ static int stmpe_read_temp(struct stmpe_adc *info,
+ struct iio_chan_spec const *chan, int *val)
+ {
+- long ret;
++ unsigned long ret;
+
+ mutex_lock(&info->lock);
+
+@@ -114,7 +114,7 @@ static int stmpe_read_temp(struct stmpe_adc *info,
+
+ ret = wait_for_completion_timeout(&info->completion, STMPE_ADC_TIMEOUT);
+
+- if (ret <= 0) {
++ if (ret == 0) {
+ mutex_unlock(&info->lock);
+ return -ETIMEDOUT;
+ }
+diff --git a/drivers/iio/common/st_sensors/st_sensors_core.c b/drivers/iio/common/st_sensors/st_sensors_core.c
+index fa9bcdf0d1900..b92de90a125c6 100644
+--- a/drivers/iio/common/st_sensors/st_sensors_core.c
++++ b/drivers/iio/common/st_sensors/st_sensors_core.c
+@@ -71,16 +71,18 @@ st_sensors_match_odr_error:
+
+ int st_sensors_set_odr(struct iio_dev *indio_dev, unsigned int odr)
+ {
+- int err;
++ int err = 0;
+ struct st_sensor_odr_avl odr_out = {0, 0};
+ struct st_sensor_data *sdata = iio_priv(indio_dev);
+
++ mutex_lock(&sdata->odr_lock);
++
+ if (!sdata->sensor_settings->odr.mask)
+- return 0;
++ goto unlock_mutex;
+
+ err = st_sensors_match_odr(sdata->sensor_settings, odr, &odr_out);
+ if (err < 0)
+- goto st_sensors_match_odr_error;
++ goto unlock_mutex;
+
+ if ((sdata->sensor_settings->odr.addr ==
+ sdata->sensor_settings->pw.addr) &&
+@@ -103,7 +105,9 @@ int st_sensors_set_odr(struct iio_dev *indio_dev, unsigned int odr)
+ if (err >= 0)
+ sdata->odr = odr_out.hz;
+
+-st_sensors_match_odr_error:
++unlock_mutex:
++ mutex_unlock(&sdata->odr_lock);
++
+ return err;
+ }
+ EXPORT_SYMBOL_NS(st_sensors_set_odr, IIO_ST_SENSORS);
+@@ -361,6 +365,8 @@ int st_sensors_init_sensor(struct iio_dev *indio_dev,
+ struct st_sensors_platform_data *of_pdata;
+ int err = 0;
+
++ mutex_init(&sdata->odr_lock);
++
+ /* If OF/DT pdata exists, it will take precedence of anything else */
+ of_pdata = st_sensors_dev_probe(indio_dev->dev.parent, pdata);
+ if (IS_ERR(of_pdata))
+@@ -554,18 +560,24 @@ int st_sensors_read_info_raw(struct iio_dev *indio_dev,
+ err = -EBUSY;
+ goto out;
+ } else {
++ mutex_lock(&sdata->odr_lock);
+ err = st_sensors_set_enable(indio_dev, true);
+- if (err < 0)
++ if (err < 0) {
++ mutex_unlock(&sdata->odr_lock);
+ goto out;
++ }
+
+ msleep((sdata->sensor_settings->bootime * 1000) / sdata->odr);
+ err = st_sensors_read_axis_data(indio_dev, ch, val);
+- if (err < 0)
++ if (err < 0) {
++ mutex_unlock(&sdata->odr_lock);
+ goto out;
++ }
+
+ *val = *val >> ch->scan_type.shift;
+
+ err = st_sensors_set_enable(indio_dev, false);
++ mutex_unlock(&sdata->odr_lock);
+ }
+ out:
+ mutex_unlock(&indio_dev->mlock);
+diff --git a/drivers/iio/dummy/iio_simple_dummy.c b/drivers/iio/dummy/iio_simple_dummy.c
+index c0b7ef9007354..c24f609c2ade6 100644
+--- a/drivers/iio/dummy/iio_simple_dummy.c
++++ b/drivers/iio/dummy/iio_simple_dummy.c
+@@ -575,10 +575,9 @@ static struct iio_sw_device *iio_dummy_probe(const char *name)
+ */
+
+ swd = kzalloc(sizeof(*swd), GFP_KERNEL);
+- if (!swd) {
+- ret = -ENOMEM;
+- goto error_kzalloc;
+- }
++ if (!swd)
++ return ERR_PTR(-ENOMEM);
++
+ /*
+ * Allocate an IIO device.
+ *
+@@ -590,7 +589,7 @@ static struct iio_sw_device *iio_dummy_probe(const char *name)
+ indio_dev = iio_device_alloc(parent, sizeof(*st));
+ if (!indio_dev) {
+ ret = -ENOMEM;
+- goto error_ret;
++ goto error_free_swd;
+ }
+
+ st = iio_priv(indio_dev);
+@@ -616,6 +615,10 @@ static struct iio_sw_device *iio_dummy_probe(const char *name)
+ * indio_dev->name = spi_get_device_id(spi)->name;
+ */
+ indio_dev->name = kstrdup(name, GFP_KERNEL);
++ if (!indio_dev->name) {
++ ret = -ENOMEM;
++ goto error_free_device;
++ }
+
+ /* Provide description of available channels */
+ indio_dev->channels = iio_dummy_channels;
+@@ -632,7 +635,7 @@ static struct iio_sw_device *iio_dummy_probe(const char *name)
+
+ ret = iio_simple_dummy_events_register(indio_dev);
+ if (ret < 0)
+- goto error_free_device;
++ goto error_free_name;
+
+ ret = iio_simple_dummy_configure_buffer(indio_dev);
+ if (ret < 0)
+@@ -649,11 +652,12 @@ error_unconfigure_buffer:
+ iio_simple_dummy_unconfigure_buffer(indio_dev);
+ error_unregister_events:
+ iio_simple_dummy_events_unregister(indio_dev);
++error_free_name:
++ kfree(indio_dev->name);
+ error_free_device:
+ iio_device_free(indio_dev);
+-error_ret:
++error_free_swd:
+ kfree(swd);
+-error_kzalloc:
+ return ERR_PTR(ret);
+ }
+
+diff --git a/drivers/iio/proximity/vl53l0x-i2c.c b/drivers/iio/proximity/vl53l0x-i2c.c
+index 661a79ea200dd..a284b20529fb7 100644
+--- a/drivers/iio/proximity/vl53l0x-i2c.c
++++ b/drivers/iio/proximity/vl53l0x-i2c.c
+@@ -104,6 +104,7 @@ static int vl53l0x_read_proximity(struct vl53l0x_data *data,
+ u16 tries = 20;
+ u8 buffer[12];
+ int ret;
++ unsigned long time_left;
+
+ ret = i2c_smbus_write_byte_data(client, VL_REG_SYSRANGE_START, 1);
+ if (ret < 0)
+@@ -112,10 +113,8 @@ static int vl53l0x_read_proximity(struct vl53l0x_data *data,
+ if (data->client->irq) {
+ reinit_completion(&data->completion);
+
+- ret = wait_for_completion_timeout(&data->completion, HZ/10);
+- if (ret < 0)
+- return ret;
+- else if (ret == 0)
++ time_left = wait_for_completion_timeout(&data->completion, HZ/10);
++ if (time_left == 0)
+ return -ETIMEDOUT;
+
+ vl53l0x_clear_irq(data);
+diff --git a/drivers/input/mouse/bcm5974.c b/drivers/input/mouse/bcm5974.c
+index 59a14505b9cd1..ca150618d32f1 100644
+--- a/drivers/input/mouse/bcm5974.c
++++ b/drivers/input/mouse/bcm5974.c
+@@ -942,17 +942,22 @@ static int bcm5974_probe(struct usb_interface *iface,
+ if (!dev->tp_data)
+ goto err_free_bt_buffer;
+
+- if (dev->bt_urb)
++ if (dev->bt_urb) {
+ usb_fill_int_urb(dev->bt_urb, udev,
+ usb_rcvintpipe(udev, cfg->bt_ep),
+ dev->bt_data, dev->cfg.bt_datalen,
+ bcm5974_irq_button, dev, 1);
+
++ dev->bt_urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
++ }
++
+ usb_fill_int_urb(dev->tp_urb, udev,
+ usb_rcvintpipe(udev, cfg->tp_ep),
+ dev->tp_data, dev->cfg.tp_datalen,
+ bcm5974_irq_trackpad, dev, 1);
+
++ dev->tp_urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
++
+ /* create bcm5974 device */
+ usb_make_path(udev, dev->phys, sizeof(dev->phys));
+ strlcat(dev->phys, "/input0", sizeof(dev->phys));
+diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+index 627a3ed5ee8fd..88817a3376ef0 100644
+--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
++++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+@@ -3770,6 +3770,8 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
+
+ /* Base address */
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
++ if (!res)
++ return -EINVAL;
+ if (resource_size(res) < arm_smmu_resource_size(smmu)) {
+ dev_err(dev, "MMIO region too small (%pr)\n", res);
+ return -EINVAL;
+diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
+index 568cce590ccc1..52b71f6aee3fe 100644
+--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
++++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
+@@ -2092,11 +2092,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
+ if (err)
+ return err;
+
+- res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+- ioaddr = res->start;
+- smmu->base = devm_ioremap_resource(dev, res);
++ smmu->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
+ if (IS_ERR(smmu->base))
+ return PTR_ERR(smmu->base);
++ ioaddr = res->start;
+ /*
+ * The resource size should effectively match the value of SMMU_TOP;
+ * stash that temporarily until we know PAGESIZE to validate it with.
+diff --git a/drivers/md/md.c b/drivers/md/md.c
+index 066f792b374e3..f79cab8c77009 100644
+--- a/drivers/md/md.c
++++ b/drivers/md/md.c
+@@ -7960,17 +7960,22 @@ EXPORT_SYMBOL(md_register_thread);
+
+ void md_unregister_thread(struct md_thread **threadp)
+ {
+- struct md_thread *thread = *threadp;
+- if (!thread)
+- return;
+- pr_debug("interrupting MD-thread pid %d\n", task_pid_nr(thread->tsk));
+- /* Locking ensures that mddev_unlock does not wake_up a
++ struct md_thread *thread;
++
++ /*
++ * Locking ensures that mddev_unlock does not wake_up a
+ * non-existent thread
+ */
+ spin_lock(&pers_lock);
++ thread = *threadp;
++ if (!thread) {
++ spin_unlock(&pers_lock);
++ return;
++ }
+ *threadp = NULL;
+ spin_unlock(&pers_lock);
+
++ pr_debug("interrupting MD-thread pid %d\n", task_pid_nr(thread->tsk));
+ kthread_stop(thread->tsk);
+ kfree(thread);
+ }
+diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
+index 46c30fb538a46..0566def9764d5 100644
+--- a/drivers/md/raid0.c
++++ b/drivers/md/raid0.c
+@@ -128,21 +128,6 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
+ pr_debug("md/raid0:%s: FINAL %d zones\n",
+ mdname(mddev), conf->nr_strip_zones);
+
+- if (conf->nr_strip_zones == 1) {
+- conf->layout = RAID0_ORIG_LAYOUT;
+- } else if (mddev->layout == RAID0_ORIG_LAYOUT ||
+- mddev->layout == RAID0_ALT_MULTIZONE_LAYOUT) {
+- conf->layout = mddev->layout;
+- } else if (default_layout == RAID0_ORIG_LAYOUT ||
+- default_layout == RAID0_ALT_MULTIZONE_LAYOUT) {
+- conf->layout = default_layout;
+- } else {
+- pr_err("md/raid0:%s: cannot assemble multi-zone RAID0 with default_layout setting\n",
+- mdname(mddev));
+- pr_err("md/raid0: please set raid0.default_layout to 1 or 2\n");
+- err = -ENOTSUPP;
+- goto abort;
+- }
+ /*
+ * now since we have the hard sector sizes, we can make sure
+ * chunk size is a multiple of that sector size
+@@ -273,6 +258,22 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
+ (unsigned long long)smallest->sectors);
+ }
+
++ if (conf->nr_strip_zones == 1 || conf->strip_zone[1].nb_dev == 1) {
++ conf->layout = RAID0_ORIG_LAYOUT;
++ } else if (mddev->layout == RAID0_ORIG_LAYOUT ||
++ mddev->layout == RAID0_ALT_MULTIZONE_LAYOUT) {
++ conf->layout = mddev->layout;
++ } else if (default_layout == RAID0_ORIG_LAYOUT ||
++ default_layout == RAID0_ALT_MULTIZONE_LAYOUT) {
++ conf->layout = default_layout;
++ } else {
++ pr_err("md/raid0:%s: cannot assemble multi-zone RAID0 with default_layout setting\n",
++ mdname(mddev));
++ pr_err("md/raid0: please set raid0.default_layout to 1 or 2\n");
++ err = -EOPNOTSUPP;
++ goto abort;
++ }
++
+ pr_debug("md/raid0:%s: done.\n", mdname(mddev));
+ *private_conf = conf;
+
+diff --git a/drivers/misc/cardreader/rtsx_usb.c b/drivers/misc/cardreader/rtsx_usb.c
+index 59eda55d92a38..1ef9b61077c44 100644
+--- a/drivers/misc/cardreader/rtsx_usb.c
++++ b/drivers/misc/cardreader/rtsx_usb.c
+@@ -667,6 +667,7 @@ static int rtsx_usb_probe(struct usb_interface *intf,
+ return 0;
+
+ out_init_fail:
++ usb_set_intfdata(ucr->pusb_intf, NULL);
+ usb_free_coherent(ucr->pusb_dev, IOBUF_SIZE, ucr->iobuf,
+ ucr->iobuf_dma);
+ return ret;
+diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
+index 29cf292c0aba6..93ebd174d8487 100644
+--- a/drivers/misc/fastrpc.c
++++ b/drivers/misc/fastrpc.c
+@@ -1606,17 +1606,18 @@ static int fastrpc_req_munmap_impl(struct fastrpc_user *fl,
+ struct fastrpc_req_munmap *req)
+ {
+ struct fastrpc_invoke_args args[1] = { [0] = { 0 } };
+- struct fastrpc_buf *buf, *b;
++ struct fastrpc_buf *buf = NULL, *iter, *b;
+ struct fastrpc_munmap_req_msg req_msg;
+ struct device *dev = fl->sctx->dev;
+ int err;
+ u32 sc;
+
+ spin_lock(&fl->lock);
+- list_for_each_entry_safe(buf, b, &fl->mmaps, node) {
+- if ((buf->raddr == req->vaddrout) && (buf->size == req->size))
++ list_for_each_entry_safe(iter, b, &fl->mmaps, node) {
++ if ((iter->raddr == req->vaddrout) && (iter->size == req->size)) {
++ buf = iter;
+ break;
+- buf = NULL;
++ }
+ }
+ spin_unlock(&fl->lock);
+
+diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
+index f21854ac5cc2b..8cb342c562af6 100644
+--- a/drivers/misc/lkdtm/bugs.c
++++ b/drivers/misc/lkdtm/bugs.c
+@@ -327,6 +327,11 @@ void lkdtm_ARRAY_BOUNDS(void)
+
+ not_checked = kmalloc(sizeof(*not_checked) * 2, GFP_KERNEL);
+ checked = kmalloc(sizeof(*checked) * 2, GFP_KERNEL);
++ if (!not_checked || !checked) {
++ kfree(not_checked);
++ kfree(checked);
++ return;
++ }
+
+ pr_info("Array access within bounds ...\n");
+ /* For both, touch all bytes in the actual member size. */
+@@ -346,7 +351,10 @@ void lkdtm_ARRAY_BOUNDS(void)
+ kfree(not_checked);
+ kfree(checked);
+ pr_err("FAIL: survived array bounds overflow!\n");
+- pr_expected_config(CONFIG_UBSAN_BOUNDS);
++ if (IS_ENABLED(CONFIG_UBSAN_BOUNDS))
++ pr_expected_config(CONFIG_UBSAN_TRAP);
++ else
++ pr_expected_config(CONFIG_UBSAN_BOUNDS);
+ }
+
+ void lkdtm_CORRUPT_LIST_ADD(void)
+diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
+index 305fc2ec3f256..90f87b193c1ee 100644
+--- a/drivers/misc/lkdtm/lkdtm.h
++++ b/drivers/misc/lkdtm/lkdtm.h
+@@ -9,19 +9,19 @@
+ extern char *lkdtm_kernel_info;
+
+ #define pr_expected_config(kconfig) \
+-{ \
++do { \
+ if (IS_ENABLED(kconfig)) \
+ pr_err("Unexpected! This %s was built with " #kconfig "=y\n", \
+ lkdtm_kernel_info); \
+ else \
+ pr_warn("This is probably expected, since this %s was built *without* " #kconfig "=y\n", \
+ lkdtm_kernel_info); \
+-}
++} while (0)
+
+ #ifndef MODULE
+ int lkdtm_check_bool_cmdline(const char *param);
+ #define pr_expected_config_param(kconfig, param) \
+-{ \
++do { \
+ if (IS_ENABLED(kconfig)) { \
+ switch (lkdtm_check_bool_cmdline(param)) { \
+ case 0: \
+@@ -52,7 +52,7 @@ int lkdtm_check_bool_cmdline(const char *param);
+ break; \
+ } \
+ } \
+-}
++} while (0)
+ #else
+ #define pr_expected_config_param(kconfig, param) pr_expected_config(kconfig)
+ #endif
+diff --git a/drivers/misc/lkdtm/usercopy.c b/drivers/misc/lkdtm/usercopy.c
+index 9161ce7ed47a6..3fead5efe523a 100644
+--- a/drivers/misc/lkdtm/usercopy.c
++++ b/drivers/misc/lkdtm/usercopy.c
+@@ -30,12 +30,12 @@ static const unsigned char test_text[] = "This is a test.\n";
+ */
+ static noinline unsigned char *trick_compiler(unsigned char *stack)
+ {
+- return stack + 0;
++ return stack + unconst;
+ }
+
+ static noinline unsigned char *do_usercopy_stack_callee(int value)
+ {
+- unsigned char buf[32];
++ unsigned char buf[128];
+ int i;
+
+ /* Exercise stack to avoid everything living in registers. */
+@@ -43,7 +43,12 @@ static noinline unsigned char *do_usercopy_stack_callee(int value)
+ buf[i] = value & 0xff;
+ }
+
+- return trick_compiler(buf);
++ /*
++ * Put the target buffer in the middle of stack allocation
++ * so that we don't step on future stack users regardless
++ * of stack growth direction.
++ */
++ return trick_compiler(&buf[(128/2)-32]);
+ }
+
+ static noinline void do_usercopy_stack(bool to_user, bool bad_frame)
+@@ -66,6 +71,12 @@ static noinline void do_usercopy_stack(bool to_user, bool bad_frame)
+ bad_stack -= sizeof(unsigned long);
+ }
+
++#ifdef ARCH_HAS_CURRENT_STACK_POINTER
++ pr_info("stack : %px\n", (void *)current_stack_pointer);
++#endif
++ pr_info("good_stack: %px-%px\n", good_stack, good_stack + sizeof(good_stack));
++ pr_info("bad_stack : %px-%px\n", bad_stack, bad_stack + sizeof(good_stack));
++
+ user_addr = vm_mmap(NULL, 0, PAGE_SIZE,
+ PROT_READ | PROT_WRITE | PROT_EXEC,
+ MAP_ANONYMOUS | MAP_PRIVATE, 0);
+diff --git a/drivers/misc/pvpanic/pvpanic.c b/drivers/misc/pvpanic/pvpanic.c
+index 4b8f1c7d726d1..049a120063489 100644
+--- a/drivers/misc/pvpanic/pvpanic.c
++++ b/drivers/misc/pvpanic/pvpanic.c
+@@ -34,7 +34,9 @@ pvpanic_send_event(unsigned int event)
+ {
+ struct pvpanic_instance *pi_cur;
+
+- spin_lock(&pvpanic_lock);
++ if (!spin_trylock(&pvpanic_lock))
++ return;
++
+ list_for_each_entry(pi_cur, &pvpanic_list, list) {
+ if (event & pi_cur->capability & pi_cur->events)
+ iowrite8(event, pi_cur->base);
+@@ -55,9 +57,13 @@ pvpanic_panic_notify(struct notifier_block *nb, unsigned long code, void *unused
+ return NOTIFY_DONE;
+ }
+
++/*
++ * Call our notifier very early on panic, deferring the
++ * action taken to the hypervisor.
++ */
+ static struct notifier_block pvpanic_panic_nb = {
+ .notifier_call = pvpanic_panic_notify,
+- .priority = 1, /* let this called before broken drm_fb_helper() */
++ .priority = INT_MAX,
+ };
+
+ static void pvpanic_remove(void *param)
+diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
+index 5235b03c6cffa..23fcb7a166057 100644
+--- a/drivers/mmc/core/block.c
++++ b/drivers/mmc/core/block.c
+@@ -1482,8 +1482,7 @@ void mmc_blk_cqe_recovery(struct mmc_queue *mq)
+ err = mmc_cqe_recovery(host);
+ if (err)
+ mmc_blk_reset(mq->blkdata, host, MMC_BLK_CQE_RECOVERY);
+- else
+- mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
++ mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
+
+ pr_debug("%s: CQE recovery done\n", mmc_hostname(host));
+ }
+diff --git a/drivers/mmc/host/sdhci-pci-gli.c b/drivers/mmc/host/sdhci-pci-gli.c
+index d09728c37d03e..d81e5dc90e15e 100644
+--- a/drivers/mmc/host/sdhci-pci-gli.c
++++ b/drivers/mmc/host/sdhci-pci-gli.c
+@@ -972,6 +972,9 @@ static int gl9763e_runtime_resume(struct sdhci_pci_chip *chip)
+ struct sdhci_host *host = slot->host;
+ u16 clock;
+
++ if (host->mmc->ios.power_mode != MMC_POWER_ON)
++ return 0;
++
+ clock = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
+
+ clock |= SDHCI_CLOCK_PLL_EN;
+diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
+index 28f55f9cf7153..053ab52668e8b 100644
+--- a/drivers/mtd/ubi/fastmap-wl.c
++++ b/drivers/mtd/ubi/fastmap-wl.c
+@@ -97,6 +97,33 @@ out:
+ return e;
+ }
+
++/*
++ * has_enough_free_count - whether ubi has enough free pebs to fill fm pools
++ * @ubi: UBI device description object
++ * @is_wl_pool: whether UBI is filling wear leveling pool
++ *
++ * This helper function checks whether there are enough free pebs (deducted
++ * by fastmap pebs) to fill fm_pool and fm_wl_pool, above rule works after
++ * there is at least one of free pebs is filled into fm_wl_pool.
++ * For wear leveling pool, UBI should also reserve free pebs for bad pebs
++ * handling, because there maybe no enough free pebs for user volumes after
++ * producing new bad pebs.
++ */
++static bool has_enough_free_count(struct ubi_device *ubi, bool is_wl_pool)
++{
++ int fm_used = 0; // fastmap non anchor pebs.
++ int beb_rsvd_pebs;
++
++ if (!ubi->free.rb_node)
++ return false;
++
++ beb_rsvd_pebs = is_wl_pool ? ubi->beb_rsvd_pebs : 0;
++ if (ubi->fm_wl_pool.size > 0 && !(ubi->ro_mode || ubi->fm_disabled))
++ fm_used = ubi->fm_size / ubi->leb_size - 1;
++
++ return ubi->free_count - beb_rsvd_pebs > fm_used;
++}
++
+ /**
+ * ubi_refill_pools - refills all fastmap PEB pools.
+ * @ubi: UBI device description object
+@@ -120,21 +147,17 @@ void ubi_refill_pools(struct ubi_device *ubi)
+ wl_tree_add(ubi->fm_anchor, &ubi->free);
+ ubi->free_count++;
+ }
+- if (ubi->fm_next_anchor) {
+- wl_tree_add(ubi->fm_next_anchor, &ubi->free);
+- ubi->free_count++;
+- }
+
+- /* All available PEBs are in ubi->free, now is the time to get
++ /*
++ * All available PEBs are in ubi->free, now is the time to get
+ * the best anchor PEBs.
+ */
+ ubi->fm_anchor = ubi_wl_get_fm_peb(ubi, 1);
+- ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
+
+ for (;;) {
+ enough = 0;
+ if (pool->size < pool->max_size) {
+- if (!ubi->free.rb_node)
++ if (!has_enough_free_count(ubi, false))
+ break;
+
+ e = wl_get_wle(ubi);
+@@ -147,8 +170,7 @@ void ubi_refill_pools(struct ubi_device *ubi)
+ enough++;
+
+ if (wl_pool->size < wl_pool->max_size) {
+- if (!ubi->free.rb_node ||
+- (ubi->free_count - ubi->beb_rsvd_pebs < 5))
++ if (!has_enough_free_count(ubi, true))
+ break;
+
+ e = find_wl_entry(ubi, &ubi->free, WL_FREE_MAX_DIFF);
+@@ -286,20 +308,26 @@ static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi)
+ int ubi_ensure_anchor_pebs(struct ubi_device *ubi)
+ {
+ struct ubi_work *wrk;
++ struct ubi_wl_entry *anchor;
+
+ spin_lock(&ubi->wl_lock);
+
+- /* Do we have a next anchor? */
+- if (!ubi->fm_next_anchor) {
+- ubi->fm_next_anchor = ubi_wl_get_fm_peb(ubi, 1);
+- if (!ubi->fm_next_anchor)
+- /* Tell wear leveling to produce a new anchor PEB */
+- ubi->fm_do_produce_anchor = 1;
++ /* Do we already have an anchor? */
++ if (ubi->fm_anchor) {
++ spin_unlock(&ubi->wl_lock);
++ return 0;
+ }
+
+- /* Do wear leveling to get a new anchor PEB or check the
+- * existing next anchor candidate.
+- */
++ /* See if we can find an anchor PEB on the list of free PEBs */
++ anchor = ubi_wl_get_fm_peb(ubi, 1);
++ if (anchor) {
++ ubi->fm_anchor = anchor;
++ spin_unlock(&ubi->wl_lock);
++ return 0;
++ }
++
++ ubi->fm_do_produce_anchor = 1;
++ /* No luck, trigger wear leveling to produce a new anchor PEB. */
+ if (ubi->wl_scheduled) {
+ spin_unlock(&ubi->wl_lock);
+ return 0;
+@@ -381,11 +409,6 @@ static void ubi_fastmap_close(struct ubi_device *ubi)
+ ubi->fm_anchor = NULL;
+ }
+
+- if (ubi->fm_next_anchor) {
+- return_unused_peb(ubi, ubi->fm_next_anchor);
+- ubi->fm_next_anchor = NULL;
+- }
+-
+ if (ubi->fm) {
+ for (i = 0; i < ubi->fm->used_blocks; i++)
+ kfree(ubi->fm->e[i]);
+diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
+index 6b5f1ffd961b9..6e95c4b1473e6 100644
+--- a/drivers/mtd/ubi/fastmap.c
++++ b/drivers/mtd/ubi/fastmap.c
+@@ -1230,17 +1230,6 @@ static int ubi_write_fastmap(struct ubi_device *ubi,
+ fm_pos += sizeof(*fec);
+ ubi_assert(fm_pos <= ubi->fm_size);
+ }
+- if (ubi->fm_next_anchor) {
+- fec = (struct ubi_fm_ec *)(fm_raw + fm_pos);
+-
+- fec->pnum = cpu_to_be32(ubi->fm_next_anchor->pnum);
+- set_seen(ubi, ubi->fm_next_anchor->pnum, seen_pebs);
+- fec->ec = cpu_to_be32(ubi->fm_next_anchor->ec);
+-
+- free_peb_count++;
+- fm_pos += sizeof(*fec);
+- ubi_assert(fm_pos <= ubi->fm_size);
+- }
+ fmh->free_peb_count = cpu_to_be32(free_peb_count);
+
+ ubi_for_each_used_peb(ubi, wl_e, tmp_rb) {
+diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
+index 7c083ad58274a..078112e23dfd5 100644
+--- a/drivers/mtd/ubi/ubi.h
++++ b/drivers/mtd/ubi/ubi.h
+@@ -489,8 +489,7 @@ struct ubi_debug_info {
+ * @fm_work: fastmap work queue
+ * @fm_work_scheduled: non-zero if fastmap work was scheduled
+ * @fast_attach: non-zero if UBI was attached by fastmap
+- * @fm_anchor: The new anchor PEB used during fastmap update
+- * @fm_next_anchor: An anchor PEB candidate for the next time fastmap is updated
++ * @fm_anchor: The next anchor PEB to use for fastmap
+ * @fm_do_produce_anchor: If true produce an anchor PEB in wl
+ *
+ * @used: RB-tree of used physical eraseblocks
+@@ -601,7 +600,6 @@ struct ubi_device {
+ int fm_work_scheduled;
+ int fast_attach;
+ struct ubi_wl_entry *fm_anchor;
+- struct ubi_wl_entry *fm_next_anchor;
+ int fm_do_produce_anchor;
+
+ /* Wear-leveling sub-system's stuff */
+diff --git a/drivers/mtd/ubi/vmt.c b/drivers/mtd/ubi/vmt.c
+index 1bc7b3a056046..6ea95ade4ca6b 100644
+--- a/drivers/mtd/ubi/vmt.c
++++ b/drivers/mtd/ubi/vmt.c
+@@ -309,7 +309,6 @@ out_mapping:
+ ubi->volumes[vol_id] = NULL;
+ ubi->vol_count -= 1;
+ spin_unlock(&ubi->volumes_lock);
+- ubi_eba_destroy_table(eba_tbl);
+ out_acc:
+ spin_lock(&ubi->volumes_lock);
+ ubi->rsvd_pebs -= vol->reserved_pebs;
+diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
+index 8455f1d47f3c9..afcdacb9d0e99 100644
+--- a/drivers/mtd/ubi/wl.c
++++ b/drivers/mtd/ubi/wl.c
+@@ -689,16 +689,16 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
+
+ #ifdef CONFIG_MTD_UBI_FASTMAP
+ e1 = find_anchor_wl_entry(&ubi->used);
+- if (e1 && ubi->fm_next_anchor &&
+- (ubi->fm_next_anchor->ec - e1->ec >= UBI_WL_THRESHOLD)) {
++ if (e1 && ubi->fm_anchor &&
++ (ubi->fm_anchor->ec - e1->ec >= UBI_WL_THRESHOLD)) {
+ ubi->fm_do_produce_anchor = 1;
+- /* fm_next_anchor is no longer considered a good anchor
+- * candidate.
++ /*
++ * fm_anchor is no longer considered a good anchor.
+ * NULL assignment also prevents multiple wear level checks
+ * of this PEB.
+ */
+- wl_tree_add(ubi->fm_next_anchor, &ubi->free);
+- ubi->fm_next_anchor = NULL;
++ wl_tree_add(ubi->fm_anchor, &ubi->free);
++ ubi->fm_anchor = NULL;
+ ubi->free_count++;
+ }
+
+@@ -1085,12 +1085,13 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
+ if (!err) {
+ spin_lock(&ubi->wl_lock);
+
+- if (!ubi->fm_disabled && !ubi->fm_next_anchor &&
++ if (!ubi->fm_disabled && !ubi->fm_anchor &&
+ e->pnum < UBI_FM_MAX_START) {
+- /* Abort anchor production, if needed it will be
++ /*
++ * Abort anchor production, if needed it will be
+ * enabled again in the wear leveling started below.
+ */
+- ubi->fm_next_anchor = e;
++ ubi->fm_anchor = e;
+ ubi->fm_do_produce_anchor = 0;
+ } else {
+ wl_tree_add(e, &ubi->free);
+diff --git a/drivers/net/amt.c b/drivers/net/amt.c
+index de4ea518c793f..14fe03dbd9b1d 100644
+--- a/drivers/net/amt.c
++++ b/drivers/net/amt.c
+@@ -51,6 +51,7 @@ static char *status_str[] = {
+ };
+
+ static char *type_str[] = {
++ "", /* Type 0 is not defined */
+ "AMT_MSG_DISCOVERY",
+ "AMT_MSG_ADVERTISEMENT",
+ "AMT_MSG_REQUEST",
+@@ -2220,8 +2221,7 @@ static bool amt_advertisement_handler(struct amt_dev *amt, struct sk_buff *skb)
+ struct amt_header_advertisement *amta;
+ int hdr_size;
+
+- hdr_size = sizeof(*amta) - sizeof(struct amt_header);
+-
++ hdr_size = sizeof(*amta) + sizeof(struct udphdr);
+ if (!pskb_may_pull(skb, hdr_size))
+ return true;
+
+@@ -2251,19 +2251,27 @@ static bool amt_multicast_data_handler(struct amt_dev *amt, struct sk_buff *skb)
+ struct ethhdr *eth;
+ struct iphdr *iph;
+
++ hdr_size = sizeof(*amtmd) + sizeof(struct udphdr);
++ if (!pskb_may_pull(skb, hdr_size))
++ return true;
++
+ amtmd = (struct amt_header_mcast_data *)(udp_hdr(skb) + 1);
+ if (amtmd->reserved || amtmd->version)
+ return true;
+
+- hdr_size = sizeof(*amtmd) + sizeof(struct udphdr);
+ if (iptunnel_pull_header(skb, hdr_size, htons(ETH_P_IP), false))
+ return true;
++
+ skb_reset_network_header(skb);
+ skb_push(skb, sizeof(*eth));
+ skb_reset_mac_header(skb);
+ skb_pull(skb, sizeof(*eth));
+ eth = eth_hdr(skb);
++
++ if (!pskb_may_pull(skb, sizeof(*iph)))
++ return true;
+ iph = ip_hdr(skb);
++
+ if (iph->version == 4) {
+ if (!ipv4_is_multicast(iph->daddr))
+ return true;
+@@ -2274,6 +2282,9 @@ static bool amt_multicast_data_handler(struct amt_dev *amt, struct sk_buff *skb)
+ } else if (iph->version == 6) {
+ struct ipv6hdr *ip6h;
+
++ if (!pskb_may_pull(skb, sizeof(*ip6h)))
++ return true;
++
+ ip6h = ipv6_hdr(skb);
+ if (!ipv6_addr_is_multicast(&ip6h->daddr))
+ return true;
+@@ -2306,8 +2317,7 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ struct iphdr *iph;
+ int hdr_size, len;
+
+- hdr_size = sizeof(*amtmq) - sizeof(struct amt_header);
+-
++ hdr_size = sizeof(*amtmq) + sizeof(struct udphdr);
+ if (!pskb_may_pull(skb, hdr_size))
+ return true;
+
+@@ -2315,22 +2325,27 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ if (amtmq->reserved || amtmq->version)
+ return true;
+
+- hdr_size = sizeof(*amtmq) + sizeof(struct udphdr) - sizeof(*eth);
++ hdr_size -= sizeof(*eth);
+ if (iptunnel_pull_header(skb, hdr_size, htons(ETH_P_TEB), false))
+ return true;
++
+ oeth = eth_hdr(skb);
+ skb_reset_mac_header(skb);
+ skb_pull(skb, sizeof(*eth));
+ skb_reset_network_header(skb);
+ eth = eth_hdr(skb);
++ if (!pskb_may_pull(skb, sizeof(*iph)))
++ return true;
++
+ iph = ip_hdr(skb);
+ if (iph->version == 4) {
+- if (!ipv4_is_multicast(iph->daddr))
+- return true;
+ if (!pskb_may_pull(skb, sizeof(*iph) + AMT_IPHDR_OPTS +
+ sizeof(*ihv3)))
+ return true;
+
++ if (!ipv4_is_multicast(iph->daddr))
++ return true;
++
+ ihv3 = skb_pull(skb, sizeof(*iph) + AMT_IPHDR_OPTS);
+ skb_reset_transport_header(skb);
+ skb_push(skb, sizeof(*iph) + AMT_IPHDR_OPTS);
+@@ -2345,15 +2360,17 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ ip_eth_mc_map(iph->daddr, eth->h_dest);
+ #if IS_ENABLED(CONFIG_IPV6)
+ } else if (iph->version == 6) {
+- struct ipv6hdr *ip6h = ipv6_hdr(skb);
+ struct mld2_query *mld2q;
++ struct ipv6hdr *ip6h;
+
+- if (!ipv6_addr_is_multicast(&ip6h->daddr))
+- return true;
+ if (!pskb_may_pull(skb, sizeof(*ip6h) + AMT_IP6HDR_OPTS +
+ sizeof(*mld2q)))
+ return true;
+
++ ip6h = ipv6_hdr(skb);
++ if (!ipv6_addr_is_multicast(&ip6h->daddr))
++ return true;
++
+ mld2q = skb_pull(skb, sizeof(*ip6h) + AMT_IP6HDR_OPTS);
+ skb_reset_transport_header(skb);
+ skb_push(skb, sizeof(*ip6h) + AMT_IP6HDR_OPTS);
+@@ -2389,23 +2406,23 @@ static bool amt_update_handler(struct amt_dev *amt, struct sk_buff *skb)
+ {
+ struct amt_header_membership_update *amtmu;
+ struct amt_tunnel_list *tunnel;
+- struct udphdr *udph;
+ struct ethhdr *eth;
+ struct iphdr *iph;
+- int len;
++ int len, hdr_size;
+
+ iph = ip_hdr(skb);
+- udph = udp_hdr(skb);
+
+- if (__iptunnel_pull_header(skb, sizeof(*udph), skb->protocol,
+- false, false))
++ hdr_size = sizeof(*amtmu) + sizeof(struct udphdr);
++ if (!pskb_may_pull(skb, hdr_size))
+ return true;
+
+- amtmu = (struct amt_header_membership_update *)skb->data;
++ amtmu = (struct amt_header_membership_update *)(udp_hdr(skb) + 1);
+ if (amtmu->reserved || amtmu->version)
+ return true;
+
+- skb_pull(skb, sizeof(*amtmu));
++ if (iptunnel_pull_header(skb, hdr_size, skb->protocol, false))
++ return true;
++
+ skb_reset_network_header(skb);
+
+ list_for_each_entry_rcu(tunnel, &amt->tunnel_list, list) {
+@@ -2423,9 +2440,12 @@ static bool amt_update_handler(struct amt_dev *amt, struct sk_buff *skb)
+ }
+ }
+
+- return false;
++ return true;
+
+ report:
++ if (!pskb_may_pull(skb, sizeof(*iph)))
++ return true;
++
+ iph = ip_hdr(skb);
+ if (iph->version == 4) {
+ if (ip_mc_check_igmp(skb)) {
+@@ -2679,6 +2699,7 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
+ amt = rcu_dereference_sk_user_data(sk);
+ if (!amt) {
+ err = true;
++ kfree_skb(skb);
+ goto out;
+ }
+
+diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
+index b5c5196e03ee0..26a6573adf0f5 100644
+--- a/drivers/net/bonding/bond_main.c
++++ b/drivers/net/bonding/bond_main.c
+@@ -6159,7 +6159,9 @@ static int bond_check_params(struct bond_params *params)
+ strscpy_pad(params->primary, primary, sizeof(params->primary));
+
+ memcpy(params->arp_targets, arp_target, sizeof(arp_target));
++#if IS_ENABLED(CONFIG_IPV6)
+ memset(params->ns_targets, 0, sizeof(struct in6_addr) * BOND_MAX_NS_TARGETS);
++#endif
+
+ return 0;
+ }
+diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
+index f427fa1737c77..6f404f9c34e34 100644
+--- a/drivers/net/bonding/bond_netlink.c
++++ b/drivers/net/bonding/bond_netlink.c
+@@ -290,11 +290,6 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[],
+
+ addr6 = nla_get_in6_addr(attr);
+
+- if (ipv6_addr_type(&addr6) & IPV6_ADDR_LINKLOCAL) {
+- NL_SET_ERR_MSG(extack, "Invalid IPv6 addr6");
+- return -EINVAL;
+- }
+-
+ bond_opt_initextra(&newval, &addr6, sizeof(addr6));
+ err = __bond_opt_set(bond, BOND_OPT_NS_TARGETS,
+ &newval);
+diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
+index 64f7db2627ce9..1f8323ad5282a 100644
+--- a/drivers/net/bonding/bond_options.c
++++ b/drivers/net/bonding/bond_options.c
+@@ -34,10 +34,8 @@ static int bond_option_arp_ip_target_add(struct bonding *bond, __be32 target);
+ static int bond_option_arp_ip_target_rem(struct bonding *bond, __be32 target);
+ static int bond_option_arp_ip_targets_set(struct bonding *bond,
+ const struct bond_opt_value *newval);
+-#if IS_ENABLED(CONFIG_IPV6)
+ static int bond_option_ns_ip6_targets_set(struct bonding *bond,
+ const struct bond_opt_value *newval);
+-#endif
+ static int bond_option_arp_validate_set(struct bonding *bond,
+ const struct bond_opt_value *newval);
+ static int bond_option_arp_all_targets_set(struct bonding *bond,
+@@ -299,7 +297,6 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
+ .flags = BOND_OPTFLAG_RAWVAL,
+ .set = bond_option_arp_ip_targets_set
+ },
+-#if IS_ENABLED(CONFIG_IPV6)
+ [BOND_OPT_NS_TARGETS] = {
+ .id = BOND_OPT_NS_TARGETS,
+ .name = "ns_ip6_target",
+@@ -307,7 +304,6 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
+ .flags = BOND_OPTFLAG_RAWVAL,
+ .set = bond_option_ns_ip6_targets_set
+ },
+-#endif
+ [BOND_OPT_DOWNDELAY] = {
+ .id = BOND_OPT_DOWNDELAY,
+ .name = "downdelay",
+@@ -1254,6 +1250,12 @@ static int bond_option_ns_ip6_targets_set(struct bonding *bond,
+
+ return 0;
+ }
++#else
++static int bond_option_ns_ip6_targets_set(struct bonding *bond,
++ const struct bond_opt_value *newval)
++{
++ return -EPERM;
++}
+ #endif
+
+ static int bond_option_arp_validate_set(struct bonding *bond,
+diff --git a/drivers/net/bonding/bond_procfs.c b/drivers/net/bonding/bond_procfs.c
+index cfe37be42be4e..43be458422b3f 100644
+--- a/drivers/net/bonding/bond_procfs.c
++++ b/drivers/net/bonding/bond_procfs.c
+@@ -129,6 +129,21 @@ static void bond_info_show_master(struct seq_file *seq)
+ printed = 1;
+ }
+ seq_printf(seq, "\n");
++
++#if IS_ENABLED(CONFIG_IPV6)
++ printed = 0;
++ seq_printf(seq, "NS IPv6 target/s (xx::xx form):");
++
++ for (i = 0; (i < BOND_MAX_NS_TARGETS); i++) {
++ if (ipv6_addr_any(&bond->params.ns_targets[i]))
++ break;
++ if (printed)
++ seq_printf(seq, ",");
++ seq_printf(seq, " %pI6c", &bond->params.ns_targets[i]);
++ printed = 1;
++ }
++ seq_printf(seq, "\n");
++#endif
+ }
+
+ if (BOND_MODE(bond) == BOND_MODE_8023AD) {
+diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c
+index 12c15da55664b..9284373222fae 100644
+--- a/drivers/net/dsa/lantiq_gswip.c
++++ b/drivers/net/dsa/lantiq_gswip.c
+@@ -2069,8 +2069,10 @@ static int gswip_gphy_fw_list(struct gswip_priv *priv,
+ for_each_available_child_of_node(gphy_fw_list_np, gphy_fw_np) {
+ err = gswip_gphy_fw_probe(priv, &priv->gphy_fw[i],
+ gphy_fw_np, i);
+- if (err)
++ if (err) {
++ of_node_put(gphy_fw_np);
+ goto remove_gphy;
++ }
+ i++;
+ }
+
+diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
+index 64f4fdd029027..732570fb97b19 100644
+--- a/drivers/net/dsa/mv88e6xxx/chip.c
++++ b/drivers/net/dsa/mv88e6xxx/chip.c
+@@ -3960,6 +3960,7 @@ static int mv88e6xxx_mdios_register(struct mv88e6xxx_chip *chip,
+ */
+ child = of_get_child_by_name(np, "mdio");
+ err = mv88e6xxx_mdio_register(chip, child, false);
++ of_node_put(child);
+ if (err)
+ return err;
+
+diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
+index 7b37d45bc9fb5..1a19c5284f2cb 100644
+--- a/drivers/net/dsa/mv88e6xxx/serdes.c
++++ b/drivers/net/dsa/mv88e6xxx/serdes.c
+@@ -50,22 +50,17 @@ static int mv88e6390_serdes_write(struct mv88e6xxx_chip *chip,
+ }
+
+ static int mv88e6xxx_serdes_pcs_get_state(struct mv88e6xxx_chip *chip,
+- u16 ctrl, u16 status, u16 lpa,
++ u16 bmsr, u16 lpa, u16 status,
+ struct phylink_link_state *state)
+ {
+ state->link = !!(status & MV88E6390_SGMII_PHY_STATUS_LINK);
++ state->an_complete = !!(bmsr & BMSR_ANEGCOMPLETE);
+
+ if (status & MV88E6390_SGMII_PHY_STATUS_SPD_DPL_VALID) {
+ /* The Spped and Duplex Resolved register is 1 if AN is enabled
+ * and complete, or if AN is disabled. So with disabled AN we
+- * still get here on link up. But we want to set an_complete
+- * only if AN was enabled, thus we look at BMCR_ANENABLE.
+- * (According to 802.3-2008 section 22.2.4.2.10, we should be
+- * able to get this same value from BMSR_ANEGCAPABLE, but tests
+- * show that these Marvell PHYs don't conform to this part of
+- * the specificaion - BMSR_ANEGCAPABLE is simply always 1.)
++ * still get here on link up.
+ */
+- state->an_complete = !!(ctrl & BMCR_ANENABLE);
+ state->duplex = status &
+ MV88E6390_SGMII_PHY_STATUS_DUPLEX_FULL ?
+ DUPLEX_FULL : DUPLEX_HALF;
+@@ -191,12 +186,12 @@ int mv88e6352_serdes_pcs_config(struct mv88e6xxx_chip *chip, int port,
+ int mv88e6352_serdes_pcs_get_state(struct mv88e6xxx_chip *chip, int port,
+ int lane, struct phylink_link_state *state)
+ {
+- u16 lpa, status, ctrl;
++ u16 bmsr, lpa, status;
+ int err;
+
+- err = mv88e6352_serdes_read(chip, MII_BMCR, &ctrl);
++ err = mv88e6352_serdes_read(chip, MII_BMSR, &bmsr);
+ if (err) {
+- dev_err(chip->dev, "can't read Serdes PHY control: %d\n", err);
++ dev_err(chip->dev, "can't read Serdes BMSR: %d\n", err);
+ return err;
+ }
+
+@@ -212,7 +207,7 @@ int mv88e6352_serdes_pcs_get_state(struct mv88e6xxx_chip *chip, int port,
+ return err;
+ }
+
+- return mv88e6xxx_serdes_pcs_get_state(chip, ctrl, status, lpa, state);
++ return mv88e6xxx_serdes_pcs_get_state(chip, bmsr, lpa, status, state);
+ }
+
+ int mv88e6352_serdes_pcs_an_restart(struct mv88e6xxx_chip *chip, int port,
+@@ -918,13 +913,13 @@ int mv88e6390_serdes_pcs_config(struct mv88e6xxx_chip *chip, int port,
+ static int mv88e6390_serdes_pcs_get_state_sgmii(struct mv88e6xxx_chip *chip,
+ int port, int lane, struct phylink_link_state *state)
+ {
+- u16 lpa, status, ctrl;
++ u16 bmsr, lpa, status;
+ int err;
+
+ err = mv88e6390_serdes_read(chip, lane, MDIO_MMD_PHYXS,
+- MV88E6390_SGMII_BMCR, &ctrl);
++ MV88E6390_SGMII_BMSR, &bmsr);
+ if (err) {
+- dev_err(chip->dev, "can't read Serdes PHY control: %d\n", err);
++ dev_err(chip->dev, "can't read Serdes PHY BMSR: %d\n", err);
+ return err;
+ }
+
+@@ -942,7 +937,7 @@ static int mv88e6390_serdes_pcs_get_state_sgmii(struct mv88e6xxx_chip *chip,
+ return err;
+ }
+
+- return mv88e6xxx_serdes_pcs_get_state(chip, ctrl, status, lpa, state);
++ return mv88e6xxx_serdes_pcs_get_state(chip, bmsr, lpa, status, state);
+ }
+
+ static int mv88e6390_serdes_pcs_get_state_10g(struct mv88e6xxx_chip *chip,
+diff --git a/drivers/net/dsa/realtek/rtl8365mb.c b/drivers/net/dsa/realtek/rtl8365mb.c
+index 3d70e8a77ecf0..907c743370e3a 100644
+--- a/drivers/net/dsa/realtek/rtl8365mb.c
++++ b/drivers/net/dsa/realtek/rtl8365mb.c
+@@ -955,35 +955,21 @@ static int rtl8365mb_ext_config_forcemode(struct realtek_priv *priv, int port,
+ return 0;
+ }
+
+-static bool rtl8365mb_phy_mode_supported(struct dsa_switch *ds, int port,
+- phy_interface_t interface)
+-{
+- int ext_int;
+-
+- ext_int = rtl8365mb_extint_port_map[port];
+-
+- if (ext_int < 0 &&
+- (interface == PHY_INTERFACE_MODE_NA ||
+- interface == PHY_INTERFACE_MODE_INTERNAL ||
+- interface == PHY_INTERFACE_MODE_GMII))
+- /* Internal PHY */
+- return true;
+- else if ((ext_int >= 1) &&
+- phy_interface_mode_is_rgmii(interface))
+- /* Extension MAC */
+- return true;
+-
+- return false;
+-}
+-
+ static void rtl8365mb_phylink_get_caps(struct dsa_switch *ds, int port,
+ struct phylink_config *config)
+ {
+- if (dsa_is_user_port(ds, port))
++ if (dsa_is_user_port(ds, port)) {
+ __set_bit(PHY_INTERFACE_MODE_INTERNAL,
+ config->supported_interfaces);
+- else if (dsa_is_cpu_port(ds, port))
++
++ /* GMII is the default interface mode for phylib, so
++ * we have to support it for ports with integrated PHY.
++ */
++ __set_bit(PHY_INTERFACE_MODE_GMII,
++ config->supported_interfaces);
++ } else if (dsa_is_cpu_port(ds, port)) {
+ phy_interface_set_rgmii(config->supported_interfaces);
++ }
+
+ config->mac_capabilities = MAC_SYM_PAUSE | MAC_ASYM_PAUSE |
+ MAC_10 | MAC_100 | MAC_1000FD;
+@@ -996,12 +982,6 @@ static void rtl8365mb_phylink_mac_config(struct dsa_switch *ds, int port,
+ struct realtek_priv *priv = ds->priv;
+ int ret;
+
+- if (!rtl8365mb_phy_mode_supported(ds, port, state->interface)) {
+- dev_err(priv->dev, "phy mode %s is unsupported on port %d\n",
+- phy_modes(state->interface), port);
+- return;
+- }
+-
+ if (mode != MLO_AN_PHY && mode != MLO_AN_FIXED) {
+ dev_err(priv->dev,
+ "port %d supports only conventional PHY or fixed-link\n",
+diff --git a/drivers/net/ethernet/altera/altera_tse_main.c b/drivers/net/ethernet/altera/altera_tse_main.c
+index a3816264c35cb..8c5828582c21e 100644
+--- a/drivers/net/ethernet/altera/altera_tse_main.c
++++ b/drivers/net/ethernet/altera/altera_tse_main.c
+@@ -163,7 +163,8 @@ static int altera_tse_mdio_create(struct net_device *dev, unsigned int id)
+ mdio = mdiobus_alloc();
+ if (mdio == NULL) {
+ netdev_err(dev, "Error allocating MDIO bus\n");
+- return -ENOMEM;
++ ret = -ENOMEM;
++ goto put_node;
+ }
+
+ mdio->name = ALTERA_TSE_RESOURCE_NAME;
+@@ -180,6 +181,7 @@ static int altera_tse_mdio_create(struct net_device *dev, unsigned int id)
+ mdio->id);
+ goto out_free_mdio;
+ }
++ of_node_put(mdio_node);
+
+ if (netif_msg_drv(priv))
+ netdev_info(dev, "MDIO bus %s: created\n", mdio->id);
+@@ -189,6 +191,8 @@ static int altera_tse_mdio_create(struct net_device *dev, unsigned int id)
+ out_free_mdio:
+ mdiobus_free(mdio);
+ mdio = NULL;
++put_node:
++ of_node_put(mdio_node);
+ return ret;
+ }
+
+diff --git a/drivers/net/ethernet/broadcom/bgmac-bcma-mdio.c b/drivers/net/ethernet/broadcom/bgmac-bcma-mdio.c
+index 086739e4f40a9..9b83d53616994 100644
+--- a/drivers/net/ethernet/broadcom/bgmac-bcma-mdio.c
++++ b/drivers/net/ethernet/broadcom/bgmac-bcma-mdio.c
+@@ -234,6 +234,7 @@ struct mii_bus *bcma_mdio_mii_register(struct bgmac *bgmac)
+ np = of_get_child_by_name(core->dev.of_node, "mdio");
+
+ err = of_mdiobus_register(mii_bus, np);
++ of_node_put(np);
+ if (err) {
+ dev_err(&core->dev, "Registration of mii bus failed\n");
+ goto err_free_bus;
+diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+index 7f11c0a8e7a91..d4e63f0644c36 100644
+--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+@@ -1184,9 +1184,9 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
+
+ switch (xcast_mode) {
+ case IXGBEVF_XCAST_MODE_NONE:
+- disable = IXGBE_VMOLR_BAM | IXGBE_VMOLR_ROMPE |
++ disable = IXGBE_VMOLR_ROMPE |
+ IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE | IXGBE_VMOLR_VPE;
+- enable = 0;
++ enable = IXGBE_VMOLR_BAM;
+ break;
+ case IXGBEVF_XCAST_MODE_MULTI:
+ disable = IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE | IXGBE_VMOLR_VPE;
+@@ -1208,9 +1208,9 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
+ return -EPERM;
+ }
+
+- disable = 0;
++ disable = IXGBE_VMOLR_VPE;
+ enable = IXGBE_VMOLR_BAM | IXGBE_VMOLR_ROMPE |
+- IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE | IXGBE_VMOLR_VPE;
++ IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE;
+ break;
+ default:
+ return -EOPNOTSUPP;
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c
+index a79201a9a6f03..a9da85e418a49 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c
+@@ -579,7 +579,7 @@ static bool is_valid_offset(struct rvu *rvu, struct cpt_rd_wr_reg_msg *req)
+
+ blkaddr = validate_and_get_cpt_blkaddr(req->blkaddr);
+ if (blkaddr < 0)
+- return blkaddr;
++ return false;
+
+ /* Registers that can be accessed from PF/VF */
+ if ((offset & 0xFF000) == CPT_AF_LFX_CTL(0) ||
+diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+index f02d07ec5ccbf..a50090e62c8f9 100644
+--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
++++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+@@ -1949,6 +1949,9 @@ static int mtk_hwlro_get_fdir_entry(struct net_device *dev,
+ struct ethtool_rx_flow_spec *fsp =
+ (struct ethtool_rx_flow_spec *)&cmd->fs;
+
++ if (fsp->location >= ARRAY_SIZE(mac->hwlro_ip))
++ return -EINVAL;
++
+ /* only tcp dst ipv4 is meaningful, others are meaningless */
+ fsp->flow_type = TCP_V4_FLOW;
+ fsp->h_u.tcp_ip4_spec.ip4dst = ntohl(mac->hwlro_ip[fsp->location]);
+diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+index ed5038d98ef6e..6400a827173cf 100644
+--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
++++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+@@ -2110,7 +2110,7 @@ static int mlx4_en_get_module_eeprom(struct net_device *dev,
+ en_err(priv,
+ "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n",
+ i, offset, ee->len - i, ret);
+- return 0;
++ return ret;
+ }
+
+ i += ret;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+index ba6dad97e308d..c4ca4e157ff7d 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+@@ -555,12 +555,9 @@ static u32 mlx5_gen_pci_id(const struct mlx5_core_dev *dev)
+ PCI_SLOT(dev->pdev->devfn));
+ }
+
+-static int next_phys_dev(struct device *dev, const void *data)
++static int _next_phys_dev(struct mlx5_core_dev *mdev,
++ const struct mlx5_core_dev *curr)
+ {
+- struct mlx5_adev *madev = container_of(dev, struct mlx5_adev, adev.dev);
+- struct mlx5_core_dev *mdev = madev->mdev;
+- const struct mlx5_core_dev *curr = data;
+-
+ if (!mlx5_core_is_pf(mdev))
+ return 0;
+
+@@ -574,22 +571,51 @@ static int next_phys_dev(struct device *dev, const void *data)
+ return 1;
+ }
+
+-/* Must be called with intf_mutex held */
+-struct mlx5_core_dev *mlx5_get_next_phys_dev(struct mlx5_core_dev *dev)
++static void *pci_get_other_drvdata(struct device *this, struct device *other)
+ {
+- struct auxiliary_device *adev;
+- struct mlx5_adev *madev;
++ if (this->driver != other->driver)
++ return NULL;
++
++ return pci_get_drvdata(to_pci_dev(other));
++}
++
++static int next_phys_dev_lag(struct device *dev, const void *data)
++{
++ struct mlx5_core_dev *mdev, *this = (struct mlx5_core_dev *)data;
++
++ mdev = pci_get_other_drvdata(this->device, dev);
++ if (!mdev)
++ return 0;
++
++ if (!MLX5_CAP_GEN(mdev, vport_group_manager) ||
++ !MLX5_CAP_GEN(mdev, lag_master) ||
++ MLX5_CAP_GEN(mdev, num_lag_ports) != MLX5_MAX_PORTS)
++ return 0;
++
++ return _next_phys_dev(mdev, data);
++}
++
++static struct mlx5_core_dev *mlx5_get_next_dev(struct mlx5_core_dev *dev,
++ int (*match)(struct device *dev, const void *data))
++{
++ struct device *next;
+
+ if (!mlx5_core_is_pf(dev))
+ return NULL;
+
+- adev = auxiliary_find_device(NULL, dev, &next_phys_dev);
+- if (!adev)
++ next = bus_find_device(&pci_bus_type, NULL, dev, match);
++ if (!next)
+ return NULL;
+
+- madev = container_of(adev, struct mlx5_adev, adev);
+- put_device(&adev->dev);
+- return madev->mdev;
++ put_device(next);
++ return pci_get_drvdata(to_pci_dev(next));
++}
++
++/* Must be called with intf_mutex held */
++struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev)
++{
++ lockdep_assert_held(&mlx5_intf_mutex);
++ return mlx5_get_next_dev(dev, &next_phys_dev_lag);
+ }
+
+ void mlx5_dev_list_lock(void)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
+index eae9aa9c08118..978a2bb8e1220 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
+@@ -675,6 +675,9 @@ static void mlx5_fw_tracer_handle_traces(struct work_struct *work)
+ if (!tracer->owner)
+ return;
+
++ if (unlikely(!tracer->str_db.loaded))
++ goto arm;
++
+ block_count = tracer->buff.size / TRACER_BLOCK_SIZE_BYTE;
+ start_offset = tracer->buff.consumer_index * TRACER_BLOCK_SIZE_BYTE;
+
+@@ -732,6 +735,7 @@ static void mlx5_fw_tracer_handle_traces(struct work_struct *work)
+ &tmp_trace_block[TRACES_PER_BLOCK - 1]);
+ }
+
++arm:
+ mlx5_fw_tracer_arm(dev);
+ }
+
+@@ -1136,8 +1140,7 @@ static int fw_tracer_event(struct notifier_block *nb, unsigned long action, void
+ queue_work(tracer->work_queue, &tracer->ownership_change_work);
+ break;
+ case MLX5_TRACER_SUBTYPE_TRACES_AVAILABLE:
+- if (likely(tracer->str_db.loaded))
+- queue_work(tracer->work_queue, &tracer->handle_traces_work);
++ queue_work(tracer->work_queue, &tracer->handle_traces_work);
+ break;
+ default:
+ mlx5_core_dbg(dev, "FWTracer: Event with unrecognized subtype: sub_type %d\n",
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
+index ee34e861d3af6..d0d14325a0d95 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
+@@ -765,6 +765,7 @@ struct mlx5e_rq {
+ u8 wq_type;
+ u32 rqn;
+ struct mlx5_core_dev *mdev;
++ struct mlx5e_channel *channel;
+ u32 umr_mkey;
+ struct mlx5e_dma_info wqe_overflow;
+
+@@ -1077,6 +1078,9 @@ void mlx5e_close_cq(struct mlx5e_cq *cq);
+ int mlx5e_open_locked(struct net_device *netdev);
+ int mlx5e_close_locked(struct net_device *netdev);
+
++void mlx5e_trigger_napi_icosq(struct mlx5e_channel *c);
++void mlx5e_trigger_napi_sched(struct napi_struct *napi);
++
+ int mlx5e_open_channels(struct mlx5e_priv *priv,
+ struct mlx5e_channels *chs);
+ void mlx5e_close_channels(struct mlx5e_channels *chs);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
+index 678ffbb48a25f..e3e8c1c3ff242 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
+@@ -12,6 +12,7 @@ struct mlx5e_post_act;
+ enum {
+ MLX5E_TC_FT_LEVEL = 0,
+ MLX5E_TC_TTC_FT_LEVEL,
++ MLX5E_TC_MISS_LEVEL,
+ };
+
+ struct mlx5e_tc_table {
+@@ -20,6 +21,7 @@ struct mlx5e_tc_table {
+ */
+ struct mutex t_lock;
+ struct mlx5_flow_table *t;
++ struct mlx5_flow_table *miss_t;
+ struct mlx5_fs_chains *chains;
+ struct mlx5e_post_act *post_act;
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+index 335b20b6383b6..047f88f092038 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+@@ -736,6 +736,7 @@ void mlx5e_ptp_activate_channel(struct mlx5e_ptp *c)
+ if (test_bit(MLX5E_PTP_STATE_RX, c->state)) {
+ mlx5e_ptp_rx_set_fs(c->priv);
+ mlx5e_activate_rq(&c->rq);
++ mlx5e_trigger_napi_sched(&c->napi);
+ }
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
+index 2684e9da9f412..fc366e66d0b0f 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
+@@ -123,6 +123,8 @@ static int mlx5e_rx_reporter_err_icosq_cqe_recover(void *ctx)
+ xskrq->stats->recover++;
+ }
+
++ mlx5e_trigger_napi_icosq(icosq->channel);
++
+ mutex_unlock(&icosq->channel->icosq_recovery_lock);
+
+ return 0;
+@@ -166,6 +168,10 @@ static int mlx5e_rx_reporter_err_rq_cqe_recover(void *ctx)
+ clear_bit(MLX5E_RQ_STATE_RECOVERING, &rq->state);
+ mlx5e_activate_rq(rq);
+ rq->stats->recover++;
++ if (rq->channel)
++ mlx5e_trigger_napi_icosq(rq->channel);
++ else
++ mlx5e_trigger_napi_sched(rq->cq.napi);
+ return 0;
+ out:
+ clear_bit(MLX5E_RQ_STATE_RECOVERING, &rq->state);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+index ab4b0f3ee2a0a..1ff7a07bcd06f 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+@@ -701,7 +701,7 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
+ struct mlx5_flow_attr *attr,
+ struct flow_rule *flow_rule,
+ struct mlx5e_mod_hdr_handle **mh,
+- u8 zone_restore_id, bool nat)
++ u8 zone_restore_id, bool nat_table, bool has_nat)
+ {
+ DECLARE_MOD_HDR_ACTS_ACTIONS(actions_arr, MLX5_CT_MIN_MOD_ACTS);
+ DECLARE_MOD_HDR_ACTS(mod_acts, actions_arr);
+@@ -717,11 +717,12 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
+ &attr->ct_attr.ct_labels_id);
+ if (err)
+ return -EOPNOTSUPP;
+- if (nat) {
+- err = mlx5_tc_ct_entry_create_nat(ct_priv, flow_rule,
+- &mod_acts);
+- if (err)
+- goto err_mapping;
++ if (nat_table) {
++ if (has_nat) {
++ err = mlx5_tc_ct_entry_create_nat(ct_priv, flow_rule, &mod_acts);
++ if (err)
++ goto err_mapping;
++ }
+
+ ct_state |= MLX5_CT_STATE_NAT_BIT;
+ }
+@@ -736,7 +737,7 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
+ if (err)
+ goto err_mapping;
+
+- if (nat) {
++ if (nat_table && has_nat) {
+ attr->modify_hdr = mlx5_modify_header_alloc(ct_priv->dev, ct_priv->ns_type,
+ mod_acts.num_actions,
+ mod_acts.actions);
+@@ -804,7 +805,9 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv,
+
+ err = mlx5_tc_ct_entry_create_mod_hdr(ct_priv, attr, flow_rule,
+ &zone_rule->mh,
+- zone_restore_id, nat);
++ zone_restore_id,
++ nat,
++ mlx5_tc_ct_entry_has_nat(entry));
+ if (err) {
+ ct_dbg("Failed to create ct entry mod hdr");
+ goto err_mod_hdr;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
+index 857840ab1e918..11f2a7fb72a9e 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
+@@ -179,6 +179,7 @@ static void mlx5e_activate_trap(struct mlx5e_trap *trap)
+ {
+ napi_enable(&trap->napi);
+ mlx5e_activate_rq(&trap->rq);
++ mlx5e_trigger_napi_sched(&trap->napi);
+ }
+
+ void mlx5e_deactivate_trap(struct mlx5e_priv *priv)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
+index 279cd8f4e79f7..2c520394aa1d6 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
+@@ -117,6 +117,7 @@ static int mlx5e_xsk_enable_locked(struct mlx5e_priv *priv,
+ goto err_remove_pool;
+
+ mlx5e_activate_xsk(c);
++ mlx5e_trigger_napi_icosq(c);
+
+ /* Don't wait for WQEs, because the newer xdpsock sample doesn't provide
+ * any Fill Ring entries at the setup stage.
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
+index 3ad7f1301fa8d..98ed9ef3a6bdd 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
+@@ -64,6 +64,7 @@ static int mlx5e_init_xsk_rq(struct mlx5e_channel *c,
+ rq->clock = &mdev->clock;
+ rq->icosq = &c->icosq;
+ rq->ix = c->ix;
++ rq->channel = c;
+ rq->mdev = mdev;
+ rq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
+ rq->xdpsq = &c->rq_xdpsq;
+@@ -179,10 +180,6 @@ void mlx5e_activate_xsk(struct mlx5e_channel *c)
+ mlx5e_reporter_icosq_resume_recovery(c);
+
+ /* TX queue is created active. */
+-
+- spin_lock_bh(&c->async_icosq_lock);
+- mlx5e_trigger_irq(&c->async_icosq);
+- spin_unlock_bh(&c->async_icosq_lock);
+ }
+
+ void mlx5e_deactivate_xsk(struct mlx5e_channel *c)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+index 72867a8ff48b6..58b6c8b82fd0c 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -478,6 +478,7 @@ static int mlx5e_init_rxq_rq(struct mlx5e_channel *c, struct mlx5e_params *param
+ rq->clock = &mdev->clock;
+ rq->icosq = &c->icosq;
+ rq->ix = c->ix;
++ rq->channel = c;
+ rq->mdev = mdev;
+ rq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
+ rq->xdpsq = &c->rq_xdpsq;
+@@ -1072,13 +1073,6 @@ err_free_rq:
+ void mlx5e_activate_rq(struct mlx5e_rq *rq)
+ {
+ set_bit(MLX5E_RQ_STATE_ENABLED, &rq->state);
+- if (rq->icosq) {
+- mlx5e_trigger_irq(rq->icosq);
+- } else {
+- local_bh_disable();
+- napi_schedule(rq->cq.napi);
+- local_bh_enable();
+- }
+ }
+
+ void mlx5e_deactivate_rq(struct mlx5e_rq *rq)
+@@ -2233,6 +2227,20 @@ static int mlx5e_channel_stats_alloc(struct mlx5e_priv *priv, int ix, int cpu)
+ return 0;
+ }
+
++void mlx5e_trigger_napi_icosq(struct mlx5e_channel *c)
++{
++ spin_lock_bh(&c->async_icosq_lock);
++ mlx5e_trigger_irq(&c->async_icosq);
++ spin_unlock_bh(&c->async_icosq_lock);
++}
++
++void mlx5e_trigger_napi_sched(struct napi_struct *napi)
++{
++ local_bh_disable();
++ napi_schedule(napi);
++ local_bh_enable();
++}
++
+ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
+ struct mlx5e_params *params,
+ struct mlx5e_channel_param *cparam,
+@@ -2314,6 +2322,8 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
+
+ if (test_bit(MLX5E_CHANNEL_STATE_XSK, c->state))
+ mlx5e_activate_xsk(c);
++
++ mlx5e_trigger_napi_icosq(c);
+ }
+
+ static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
+@@ -4571,6 +4581,11 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
+
+ unlock:
+ mutex_unlock(&priv->state_lock);
++
++ /* Need to fix some features. */
++ if (!err)
++ netdev_update_features(netdev);
++
+ return err;
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+index a464461f14189..52caefdbabb1a 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+@@ -937,6 +937,13 @@ err_event_reg:
+ return err;
+ }
+
++static void mlx5e_cleanup_uplink_rep_tx(struct mlx5e_rep_priv *rpriv)
++{
++ mlx5e_rep_tc_netdevice_event_unregister(rpriv);
++ mlx5e_rep_bond_cleanup(rpriv);
++ mlx5e_rep_tc_cleanup(rpriv);
++}
++
+ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv)
+ {
+ struct mlx5e_rep_priv *rpriv = priv->ppriv;
+@@ -948,42 +955,36 @@ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv)
+ return err;
+ }
+
+- err = mlx5e_tc_ht_init(&rpriv->tc_ht);
+- if (err)
+- goto err_ht_init;
+-
+ if (rpriv->rep->vport == MLX5_VPORT_UPLINK) {
+ err = mlx5e_init_uplink_rep_tx(rpriv);
+ if (err)
+ goto err_init_tx;
+ }
+
++ err = mlx5e_tc_ht_init(&rpriv->tc_ht);
++ if (err)
++ goto err_ht_init;
++
+ return 0;
+
+-err_init_tx:
+- mlx5e_tc_ht_cleanup(&rpriv->tc_ht);
+ err_ht_init:
++ if (rpriv->rep->vport == MLX5_VPORT_UPLINK)
++ mlx5e_cleanup_uplink_rep_tx(rpriv);
++err_init_tx:
+ mlx5e_destroy_tises(priv);
+ return err;
+ }
+
+-static void mlx5e_cleanup_uplink_rep_tx(struct mlx5e_rep_priv *rpriv)
+-{
+- mlx5e_rep_tc_netdevice_event_unregister(rpriv);
+- mlx5e_rep_bond_cleanup(rpriv);
+- mlx5e_rep_tc_cleanup(rpriv);
+-}
+-
+ static void mlx5e_cleanup_rep_tx(struct mlx5e_priv *priv)
+ {
+ struct mlx5e_rep_priv *rpriv = priv->ppriv;
+
+- mlx5e_destroy_tises(priv);
++ mlx5e_tc_ht_cleanup(&rpriv->tc_ht);
+
+ if (rpriv->rep->vport == MLX5_VPORT_UPLINK)
+ mlx5e_cleanup_uplink_rep_tx(rpriv);
+
+- mlx5e_tc_ht_cleanup(&rpriv->tc_ht);
++ mlx5e_destroy_tises(priv);
+ }
+
+ static void mlx5e_rep_enable(struct mlx5e_priv *priv)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+index ac0f73074f7ab..ec2dfecd7f0f1 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+@@ -4688,6 +4688,33 @@ static int mlx5e_tc_nic_get_ft_size(struct mlx5_core_dev *dev)
+ return tc_tbl_size;
+ }
+
++static int mlx5e_tc_nic_create_miss_table(struct mlx5e_priv *priv)
++{
++ struct mlx5_flow_table **ft = &priv->fs.tc.miss_t;
++ struct mlx5_flow_table_attr ft_attr = {};
++ struct mlx5_flow_namespace *ns;
++ int err = 0;
++
++ ft_attr.max_fte = 1;
++ ft_attr.autogroup.max_num_groups = 1;
++ ft_attr.level = MLX5E_TC_MISS_LEVEL;
++ ft_attr.prio = 0;
++ ns = mlx5_get_flow_namespace(priv->mdev, MLX5_FLOW_NAMESPACE_KERNEL);
++
++ *ft = mlx5_create_auto_grouped_flow_table(ns, &ft_attr);
++ if (IS_ERR(*ft)) {
++ err = PTR_ERR(*ft);
++ netdev_err(priv->netdev, "failed to create tc nic miss table err=%d\n", err);
++ }
++
++ return err;
++}
++
++static void mlx5e_tc_nic_destroy_miss_table(struct mlx5e_priv *priv)
++{
++ mlx5_destroy_flow_table(priv->fs.tc.miss_t);
++}
++
+ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
+ {
+ struct mlx5e_tc_table *tc = &priv->fs.tc;
+@@ -4720,19 +4747,23 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
+ }
+ tc->mapping = chains_mapping;
+
++ err = mlx5e_tc_nic_create_miss_table(priv);
++ if (err)
++ goto err_chains;
++
+ if (MLX5_CAP_FLOWTABLE_NIC_RX(priv->mdev, ignore_flow_level))
+ attr.flags = MLX5_CHAINS_AND_PRIOS_SUPPORTED |
+ MLX5_CHAINS_IGNORE_FLOW_LEVEL_SUPPORTED;
+ attr.ns = MLX5_FLOW_NAMESPACE_KERNEL;
+ attr.max_ft_sz = mlx5e_tc_nic_get_ft_size(dev);
+ attr.max_grp_num = MLX5E_TC_TABLE_NUM_GROUPS;
+- attr.default_ft = mlx5e_vlan_get_flowtable(priv->fs.vlan);
++ attr.default_ft = priv->fs.tc.miss_t;
+ attr.mapping = chains_mapping;
+
+ tc->chains = mlx5_chains_create(dev, &attr);
+ if (IS_ERR(tc->chains)) {
+ err = PTR_ERR(tc->chains);
+- goto err_chains;
++ goto err_miss;
+ }
+
+ tc->post_act = mlx5e_tc_post_act_init(priv, tc->chains, MLX5_FLOW_NAMESPACE_KERNEL);
+@@ -4755,6 +4786,8 @@ err_reg:
+ mlx5_tc_ct_clean(tc->ct);
+ mlx5e_tc_post_act_destroy(tc->post_act);
+ mlx5_chains_destroy(tc->chains);
++err_miss:
++ mlx5e_tc_nic_destroy_miss_table(priv);
+ err_chains:
+ mapping_destroy(chains_mapping);
+ err_mapping:
+@@ -4795,6 +4828,7 @@ void mlx5e_tc_nic_cleanup(struct mlx5e_priv *priv)
+ mlx5e_tc_post_act_destroy(tc->post_act);
+ mapping_destroy(tc->mapping);
+ mlx5_chains_destroy(tc->chains);
++ mlx5e_tc_nic_destroy_miss_table(priv);
+ }
+
+ int mlx5e_tc_ht_init(struct rhashtable *tc_ht)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+index 3b151332e2f89..796d97bcf1aa0 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+@@ -49,6 +49,7 @@
+ #include "en_tc.h"
+ #include "en/mapping.h"
+ #include "devlink.h"
++#include "lag/lag.h"
+
+ #define mlx5_esw_for_each_rep(esw, i, rep) \
+ xa_for_each(&((esw)->offloads.vport_reps), i, rep)
+@@ -2687,9 +2688,6 @@ static int mlx5_esw_offloads_devcom_event(int event,
+
+ switch (event) {
+ case ESW_OFFLOADS_DEVCOM_PAIR:
+- if (mlx5_get_next_phys_dev(esw->dev) != peer_esw->dev)
+- break;
+-
+ if (mlx5_eswitch_vport_match_metadata_enabled(esw) !=
+ mlx5_eswitch_vport_match_metadata_enabled(peer_esw))
+ break;
+@@ -2741,6 +2739,9 @@ static void esw_offloads_devcom_init(struct mlx5_eswitch *esw)
+ if (!MLX5_CAP_ESW(esw->dev, merged_eswitch))
+ return;
+
++ if (!mlx5_is_lag_supported(esw->dev))
++ return;
++
+ mlx5_devcom_register_component(devcom,
+ MLX5_DEVCOM_ESW_OFFLOADS,
+ mlx5_esw_offloads_devcom_event,
+@@ -2758,6 +2759,9 @@ static void esw_offloads_devcom_cleanup(struct mlx5_eswitch *esw)
+ if (!MLX5_CAP_ESW(esw->dev, merged_eswitch))
+ return;
+
++ if (!mlx5_is_lag_supported(esw->dev))
++ return;
++
+ mlx5_devcom_send_event(devcom, MLX5_DEVCOM_ESW_OFFLOADS,
+ ESW_OFFLOADS_DEVCOM_UNPAIR, esw);
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+index 89ba72e8d1091..beedaf5b03ee1 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+@@ -116,7 +116,7 @@
+ #define KERNEL_MIN_LEVEL (KERNEL_NIC_PRIO_NUM_LEVELS + 1)
+
+ #define KERNEL_NIC_TC_NUM_PRIOS 1
+-#define KERNEL_NIC_TC_NUM_LEVELS 2
++#define KERNEL_NIC_TC_NUM_LEVELS 3
+
+ #define ANCHOR_NUM_LEVELS 1
+ #define ANCHOR_NUM_PRIOS 1
+@@ -1560,9 +1560,22 @@ static struct mlx5_flow_rule *find_flow_rule(struct fs_fte *fte,
+ return NULL;
+ }
+
+-static bool check_conflicting_actions(u32 action1, u32 action2)
++static bool check_conflicting_actions_vlan(const struct mlx5_fs_vlan *vlan0,
++ const struct mlx5_fs_vlan *vlan1)
+ {
+- u32 xored_actions = action1 ^ action2;
++ return vlan0->ethtype != vlan1->ethtype ||
++ vlan0->vid != vlan1->vid ||
++ vlan0->prio != vlan1->prio;
++}
++
++static bool check_conflicting_actions(const struct mlx5_flow_act *act1,
++ const struct mlx5_flow_act *act2)
++{
++ u32 action1 = act1->action;
++ u32 action2 = act2->action;
++ u32 xored_actions;
++
++ xored_actions = action1 ^ action2;
+
+ /* if one rule only wants to count, it's ok */
+ if (action1 == MLX5_FLOW_CONTEXT_ACTION_COUNT ||
+@@ -1579,6 +1592,22 @@ static bool check_conflicting_actions(u32 action1, u32 action2)
+ MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH_2))
+ return true;
+
++ if (action1 & MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT &&
++ act1->pkt_reformat != act2->pkt_reformat)
++ return true;
++
++ if (action1 & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR &&
++ act1->modify_hdr != act2->modify_hdr)
++ return true;
++
++ if (action1 & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH &&
++ check_conflicting_actions_vlan(&act1->vlan[0], &act2->vlan[0]))
++ return true;
++
++ if (action1 & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH_2 &&
++ check_conflicting_actions_vlan(&act1->vlan[1], &act2->vlan[1]))
++ return true;
++
+ return false;
+ }
+
+@@ -1586,7 +1615,7 @@ static int check_conflicting_ftes(struct fs_fte *fte,
+ const struct mlx5_flow_context *flow_context,
+ const struct mlx5_flow_act *flow_act)
+ {
+- if (check_conflicting_actions(flow_act->action, fte->action.action)) {
++ if (check_conflicting_actions(flow_act, &fte->action)) {
+ mlx5_core_warn(get_dev(&fte->node),
+ "Found two FTEs with conflicting actions\n");
+ return -EEXIST;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+index 6cad3b72c1339..a8b98242edb16 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+@@ -924,12 +924,7 @@ static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev)
+ struct mlx5_lag *ldev = NULL;
+ struct mlx5_core_dev *tmp_dev;
+
+- if (!MLX5_CAP_GEN(dev, vport_group_manager) ||
+- !MLX5_CAP_GEN(dev, lag_master) ||
+- MLX5_CAP_GEN(dev, num_lag_ports) != MLX5_MAX_PORTS)
+- return 0;
+-
+- tmp_dev = mlx5_get_next_phys_dev(dev);
++ tmp_dev = mlx5_get_next_phys_dev_lag(dev);
+ if (tmp_dev)
+ ldev = tmp_dev->priv.lag;
+
+@@ -974,6 +969,11 @@ void mlx5_lag_add_mdev(struct mlx5_core_dev *dev)
+ {
+ int err;
+
++ if (!MLX5_CAP_GEN(dev, vport_group_manager) ||
++ !MLX5_CAP_GEN(dev, lag_master) ||
++ MLX5_CAP_GEN(dev, num_lag_ports) != MLX5_MAX_PORTS)
++ return;
++
+ recheck:
+ mlx5_dev_list_lock();
+ err = __mlx5_lag_dev_add_mdev(dev);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+index cbf9a9003e55b..d58df1e9c13f9 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.h
+@@ -58,6 +58,16 @@ struct mlx5_lag {
+ struct mlx5_lag_port_sel port_sel;
+ };
+
++static inline bool mlx5_is_lag_supported(struct mlx5_core_dev *dev)
++{
++ if (!MLX5_CAP_GEN(dev, vport_group_manager) ||
++ !MLX5_CAP_GEN(dev, lag_master) ||
++ MLX5_CAP_GEN(dev, num_lag_ports) < 2 ||
++ MLX5_CAP_GEN(dev, num_lag_ports) > MLX5_MAX_PORTS)
++ return false;
++ return true;
++}
++
+ static inline struct mlx5_lag *
+ mlx5_lag_dev(struct mlx5_core_dev *dev)
+ {
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+index 9026be1d62232..9cc7afea2758f 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+@@ -209,7 +209,7 @@ int mlx5_attach_device(struct mlx5_core_dev *dev);
+ void mlx5_detach_device(struct mlx5_core_dev *dev);
+ int mlx5_register_device(struct mlx5_core_dev *dev);
+ void mlx5_unregister_device(struct mlx5_core_dev *dev);
+-struct mlx5_core_dev *mlx5_get_next_phys_dev(struct mlx5_core_dev *dev);
++struct mlx5_core_dev *mlx5_get_next_phys_dev_lag(struct mlx5_core_dev *dev);
+ void mlx5_dev_list_lock(void);
+ void mlx5_dev_list_unlock(void);
+ int mlx5_dev_list_trylock(void);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+index 728f818825892..6a9abba92df6e 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+@@ -44,11 +44,10 @@ static int set_miss_action(struct mlx5_flow_root_namespace *ns,
+ err = mlx5dr_table_set_miss_action(ft->fs_dr_table.dr_table, action);
+ if (err && action) {
+ err = mlx5dr_action_destroy(action);
+- if (err) {
+- action = NULL;
+- mlx5_core_err(ns->dev, "Failed to destroy action (%d)\n",
+- err);
+- }
++ if (err)
++ mlx5_core_err(ns->dev,
++ "Failed to destroy action (%d)\n", err);
++ action = NULL;
+ }
+ ft->fs_dr_table.miss_action = action;
+ if (old_miss_action) {
+diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+index 05f6dcc9dfd52..f180a157eea49 100644
+--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
++++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+@@ -1080,8 +1080,13 @@ static int lan966x_probe(struct platform_device *pdev)
+ lan966x->ports[p]->fwnode = fwnode_handle_get(portnp);
+
+ serdes = devm_of_phy_get(lan966x->dev, to_of_node(portnp), NULL);
+- if (!IS_ERR(serdes))
+- lan966x->ports[p]->serdes = serdes;
++ if (PTR_ERR(serdes) == -ENODEV)
++ serdes = NULL;
++ if (IS_ERR(serdes)) {
++ err = PTR_ERR(serdes);
++ goto cleanup_ports;
++ }
++ lan966x->ports[p]->serdes = serdes;
+
+ lan966x_port_init(lan966x->ports[p]);
+ }
+diff --git a/drivers/net/ethernet/netronome/nfp/flower/conntrack.c b/drivers/net/ethernet/netronome/nfp/flower/conntrack.c
+index bfd7d1c350767..7e9fcc16286e2 100644
+--- a/drivers/net/ethernet/netronome/nfp/flower/conntrack.c
++++ b/drivers/net/ethernet/netronome/nfp/flower/conntrack.c
+@@ -442,6 +442,11 @@ nfp_fl_calc_key_layers_sz(struct nfp_fl_key_ls in_key_ls, uint16_t *map)
+ key_size += sizeof(struct nfp_flower_ipv6);
+ }
+
++ if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_QINQ) {
++ map[FLOW_PAY_QINQ] = key_size;
++ key_size += sizeof(struct nfp_flower_vlan);
++ }
++
+ if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_GRE) {
+ map[FLOW_PAY_GRE] = key_size;
+ if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_TUN_IPV6)
+@@ -450,11 +455,6 @@ nfp_fl_calc_key_layers_sz(struct nfp_fl_key_ls in_key_ls, uint16_t *map)
+ key_size += sizeof(struct nfp_flower_ipv4_gre_tun);
+ }
+
+- if (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_QINQ) {
+- map[FLOW_PAY_QINQ] = key_size;
+- key_size += sizeof(struct nfp_flower_vlan);
+- }
+-
+ if ((in_key_ls.key_layer & NFP_FLOWER_LAYER_VXLAN) ||
+ (in_key_ls.key_layer_two & NFP_FLOWER_LAYER2_GENEVE)) {
+ map[FLOW_PAY_UDP_TUN] = key_size;
+@@ -693,6 +693,17 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry)
+ }
+ }
+
++ if (NFP_FLOWER_LAYER2_QINQ & key_layer.key_layer_two) {
++ offset = key_map[FLOW_PAY_QINQ];
++ key = kdata + offset;
++ msk = mdata + offset;
++ for (i = 0; i < _CT_TYPE_MAX; i++) {
++ nfp_flower_compile_vlan((struct nfp_flower_vlan *)key,
++ (struct nfp_flower_vlan *)msk,
++ rules[i]);
++ }
++ }
++
+ if (key_layer.key_layer_two & NFP_FLOWER_LAYER2_GRE) {
+ offset = key_map[FLOW_PAY_GRE];
+ key = kdata + offset;
+@@ -733,17 +744,6 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry)
+ }
+ }
+
+- if (NFP_FLOWER_LAYER2_QINQ & key_layer.key_layer_two) {
+- offset = key_map[FLOW_PAY_QINQ];
+- key = kdata + offset;
+- msk = mdata + offset;
+- for (i = 0; i < _CT_TYPE_MAX; i++) {
+- nfp_flower_compile_vlan((struct nfp_flower_vlan *)key,
+- (struct nfp_flower_vlan *)msk,
+- rules[i]);
+- }
+- }
+-
+ if (key_layer.key_layer & NFP_FLOWER_LAYER_VXLAN ||
+ key_layer.key_layer_two & NFP_FLOWER_LAYER2_GENEVE) {
+ offset = key_map[FLOW_PAY_UDP_TUN];
+diff --git a/drivers/net/ethernet/netronome/nfp/flower/match.c b/drivers/net/ethernet/netronome/nfp/flower/match.c
+index 9d86eea4dc169..fb8bd2135c63a 100644
+--- a/drivers/net/ethernet/netronome/nfp/flower/match.c
++++ b/drivers/net/ethernet/netronome/nfp/flower/match.c
+@@ -602,6 +602,14 @@ int nfp_flower_compile_flow_match(struct nfp_app *app,
+ msk += sizeof(struct nfp_flower_ipv6);
+ }
+
++ if (NFP_FLOWER_LAYER2_QINQ & key_ls->key_layer_two) {
++ nfp_flower_compile_vlan((struct nfp_flower_vlan *)ext,
++ (struct nfp_flower_vlan *)msk,
++ rule);
++ ext += sizeof(struct nfp_flower_vlan);
++ msk += sizeof(struct nfp_flower_vlan);
++ }
++
+ if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_GRE) {
+ if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_TUN_IPV6) {
+ struct nfp_flower_ipv6_gre_tun *gre_match;
+@@ -637,14 +645,6 @@ int nfp_flower_compile_flow_match(struct nfp_app *app,
+ }
+ }
+
+- if (NFP_FLOWER_LAYER2_QINQ & key_ls->key_layer_two) {
+- nfp_flower_compile_vlan((struct nfp_flower_vlan *)ext,
+- (struct nfp_flower_vlan *)msk,
+- rule);
+- ext += sizeof(struct nfp_flower_vlan);
+- msk += sizeof(struct nfp_flower_vlan);
+- }
+-
+ if (key_ls->key_layer & NFP_FLOWER_LAYER_VXLAN ||
+ key_ls->key_layer_two & NFP_FLOWER_LAYER2_GENEVE) {
+ if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_TUN_IPV6) {
+diff --git a/drivers/net/ethernet/netronome/nfp/nfdk/dp.c b/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
+index e3da9ac20e57d..e509d6dcba5cb 100644
+--- a/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
++++ b/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
+@@ -314,7 +314,7 @@ netdev_tx_t nfp_nfdk_tx(struct sk_buff *skb, struct net_device *netdev)
+ FIELD_PREP(NFDK_DESC_TX_TYPE_HEAD, type);
+
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+- nfp_desc_set_dma_addr(txd, dma_addr);
++ nfp_nfdk_tx_desc_set_dma_addr(txd, dma_addr);
+
+ /* starts at bit 0 */
+ BUILD_BUG_ON(!(NFDK_DESC_TX_DMA_LEN_HEAD & 1));
+@@ -339,7 +339,7 @@ netdev_tx_t nfp_nfdk_tx(struct sk_buff *skb, struct net_device *netdev)
+ dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN, dma_len);
+
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+- nfp_desc_set_dma_addr(txd, dma_addr);
++ nfp_nfdk_tx_desc_set_dma_addr(txd, dma_addr);
+
+ dma_len -= dlen_type;
+ dma_addr += dlen_type + 1;
+@@ -929,7 +929,7 @@ nfp_nfdk_tx_xdp_buf(struct nfp_net_dp *dp, struct nfp_net_rx_ring *rx_ring,
+ FIELD_PREP(NFDK_DESC_TX_TYPE_HEAD, type);
+
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+- nfp_desc_set_dma_addr(txd, dma_addr);
++ nfp_nfdk_tx_desc_set_dma_addr(txd, dma_addr);
+
+ tmp_dlen = dlen_type & NFDK_DESC_TX_DMA_LEN_HEAD;
+ dma_len -= tmp_dlen;
+@@ -940,7 +940,7 @@ nfp_nfdk_tx_xdp_buf(struct nfp_net_dp *dp, struct nfp_net_rx_ring *rx_ring,
+ dma_len -= 1;
+ dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN, dma_len);
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+- nfp_desc_set_dma_addr(txd, dma_addr);
++ nfp_nfdk_tx_desc_set_dma_addr(txd, dma_addr);
+
+ dlen_type &= NFDK_DESC_TX_DMA_LEN;
+ dma_len -= dlen_type;
+@@ -1332,7 +1332,7 @@ nfp_nfdk_ctrl_tx_one(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
+ FIELD_PREP(NFDK_DESC_TX_TYPE_HEAD, type);
+
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+- nfp_desc_set_dma_addr(txd, dma_addr);
++ nfp_nfdk_tx_desc_set_dma_addr(txd, dma_addr);
+
+ tmp_dlen = dlen_type & NFDK_DESC_TX_DMA_LEN_HEAD;
+ dma_len -= tmp_dlen;
+@@ -1343,7 +1343,7 @@ nfp_nfdk_ctrl_tx_one(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
+ dma_len -= 1;
+ dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN, dma_len);
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+- nfp_desc_set_dma_addr(txd, dma_addr);
++ nfp_nfdk_tx_desc_set_dma_addr(txd, dma_addr);
+
+ dlen_type &= NFDK_DESC_TX_DMA_LEN;
+ dma_len -= dlen_type;
+diff --git a/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h b/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h
+index c41e0975eb738..0ea51d9f2325d 100644
+--- a/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h
++++ b/drivers/net/ethernet/netronome/nfp/nfdk/nfdk.h
+@@ -46,8 +46,7 @@
+ struct nfp_nfdk_tx_desc {
+ union {
+ struct {
+- u8 dma_addr_hi; /* High bits of host buf address */
+- u8 padding; /* Must be zero */
++ __le16 dma_addr_hi; /* High bits of host buf address */
+ __le16 dma_len_type; /* Length to DMA for this desc */
+ __le32 dma_addr_lo; /* Low 32bit of host buf addr */
+ };
+diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
+index 428783b7018bb..3dd3a92d2e7fd 100644
+--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
++++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
+@@ -117,13 +117,22 @@ struct nfp_nfdk_tx_buf;
+ /* Convenience macro for writing dma address into RX/TX descriptors */
+ #define nfp_desc_set_dma_addr(desc, dma_addr) \
+ do { \
+- __typeof(desc) __d = (desc); \
++ __typeof__(desc) __d = (desc); \
+ dma_addr_t __addr = (dma_addr); \
+ \
+ __d->dma_addr_lo = cpu_to_le32(lower_32_bits(__addr)); \
+ __d->dma_addr_hi = upper_32_bits(__addr) & 0xff; \
+ } while (0)
+
++#define nfp_nfdk_tx_desc_set_dma_addr(desc, dma_addr) \
++ do { \
++ __typeof__(desc) __d = (desc); \
++ dma_addr_t __addr = (dma_addr); \
++ \
++ __d->dma_addr_hi = cpu_to_le16(upper_32_bits(__addr) & 0xff); \
++ __d->dma_addr_lo = cpu_to_le32(lower_32_bits(__addr)); \
++ } while (0)
++
+ /**
+ * struct nfp_net_tx_ring - TX ring structure
+ * @r_vec: Back pointer to ring vector structure
+diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+index 61c8b450aafb1..df0afd271a21e 100644
+--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
++++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+@@ -289,8 +289,6 @@ nfp_net_get_link_ksettings(struct net_device *netdev,
+
+ /* Init to unknowns */
+ ethtool_link_ksettings_add_link_mode(cmd, supported, FIBRE);
+- ethtool_link_ksettings_add_link_mode(cmd, supported, Pause);
+- ethtool_link_ksettings_add_link_mode(cmd, advertising, Pause);
+ cmd->base.port = PORT_OTHER;
+ cmd->base.speed = SPEED_UNKNOWN;
+ cmd->base.duplex = DUPLEX_UNKNOWN;
+@@ -298,6 +296,8 @@ nfp_net_get_link_ksettings(struct net_device *netdev,
+ port = nfp_port_from_netdev(netdev);
+ eth_port = nfp_port_get_eth_port(port);
+ if (eth_port) {
++ ethtool_link_ksettings_add_link_mode(cmd, supported, Pause);
++ ethtool_link_ksettings_add_link_mode(cmd, advertising, Pause);
+ cmd->base.autoneg = eth_port->aneg != NFP_ANEG_DISABLED ?
+ AUTONEG_ENABLE : AUTONEG_DISABLE;
+ nfp_net_set_fec_link_mode(eth_port, cmd);
+diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c
+index 40df910aa1401..b9cf873e1e424 100644
+--- a/drivers/net/ethernet/sfc/efx_channels.c
++++ b/drivers/net/ethernet/sfc/efx_channels.c
+@@ -324,6 +324,7 @@ int efx_probe_interrupts(struct efx_nic *efx)
+ efx->n_channels = 1;
+ efx->n_rx_channels = 1;
+ efx->n_tx_channels = 1;
++ efx->tx_channel_offset = 0;
+ efx->n_xdp_channels = 0;
+ efx->xdp_channel_offset = efx->n_channels;
+ rc = pci_enable_msi(efx->pci_dev);
+@@ -344,6 +345,7 @@ int efx_probe_interrupts(struct efx_nic *efx)
+ efx->n_channels = 1 + (efx_separate_tx_channels ? 1 : 0);
+ efx->n_rx_channels = 1;
+ efx->n_tx_channels = 1;
++ efx->tx_channel_offset = 1;
+ efx->n_xdp_channels = 0;
+ efx->xdp_channel_offset = efx->n_channels;
+ efx->legacy_irq = efx->pci_dev->irq;
+@@ -979,10 +981,6 @@ int efx_set_channels(struct efx_nic *efx)
+ struct efx_channel *channel;
+ int rc;
+
+- efx->tx_channel_offset =
+- efx_separate_tx_channels ?
+- efx->n_channels - efx->n_tx_channels : 0;
+-
+ if (efx->xdp_tx_queue_count) {
+ EFX_WARN_ON_PARANOID(efx->xdp_tx_queues);
+
+diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
+index c75dc75e2857d..d7255d54707c0 100644
+--- a/drivers/net/ethernet/sfc/net_driver.h
++++ b/drivers/net/ethernet/sfc/net_driver.h
+@@ -1535,7 +1535,7 @@ static inline bool efx_channel_is_xdp_tx(struct efx_channel *channel)
+
+ static inline bool efx_channel_has_tx_queues(struct efx_channel *channel)
+ {
+- return true;
++ return channel && channel->channel >= channel->efx->tx_channel_offset;
+ }
+
+ static inline unsigned int efx_channel_num_tx_queues(struct efx_channel *channel)
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
+index 0b0be0898ac57..f6d8109e7edc5 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
+@@ -1072,13 +1072,11 @@ static int intel_eth_pci_probe(struct pci_dev *pdev,
+
+ ret = stmmac_dvr_probe(&pdev->dev, plat, &res);
+ if (ret) {
+- goto err_dvr_probe;
++ goto err_alloc_irq;
+ }
+
+ return 0;
+
+-err_dvr_probe:
+- pci_free_irq_vectors(pdev);
+ err_alloc_irq:
+ clk_disable_unprepare(plat->stmmac_clk);
+ clk_unregister_fixed_rate(plat->stmmac_clk);
+diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+index d2747e9db286d..6d978dbf708f2 100644
+--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
++++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+@@ -9,6 +9,7 @@
+ #include <linux/etherdevice.h>
+ #include <linux/if_vlan.h>
+ #include <linux/interrupt.h>
++#include <linux/irqdomain.h>
+ #include <linux/kernel.h>
+ #include <linux/kmemleak.h>
+ #include <linux/module.h>
+@@ -1796,6 +1797,7 @@ static int am65_cpsw_init_cpts(struct am65_cpsw_common *common)
+ if (IS_ERR(cpts)) {
+ int ret = PTR_ERR(cpts);
+
++ of_node_put(node);
+ if (ret == -EOPNOTSUPP) {
+ dev_info(dev, "cpts disabled\n");
+ return 0;
+@@ -1989,7 +1991,9 @@ am65_cpsw_nuss_init_port_ndev(struct am65_cpsw_common *common, u32 port_idx)
+
+ phy_interface_set_rgmii(port->slave.phylink_config.supported_interfaces);
+
+- phylink = phylink_create(&port->slave.phylink_config, dev->fwnode, port->slave.phy_if,
++ phylink = phylink_create(&port->slave.phylink_config,
++ of_node_to_fwnode(port->slave.phy_node),
++ port->slave.phy_if,
+ &am65_cpsw_phylink_mac_ops);
+ if (IS_ERR(phylink))
+ return PTR_ERR(phylink);
+@@ -2670,9 +2674,9 @@ static int am65_cpsw_nuss_probe(struct platform_device *pdev)
+ if (!node)
+ return -ENOENT;
+ common->port_num = of_get_child_count(node);
++ of_node_put(node);
+ if (common->port_num < 1 || common->port_num > AM65_CPSW_MAX_PORTS)
+ return -ENOENT;
+- of_node_put(node);
+
+ common->rx_flow_id_base = -1;
+ init_completion(&common->tdown_complete);
+diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
+index 73926006d3198..6a467e7817a6a 100644
+--- a/drivers/net/phy/at803x.c
++++ b/drivers/net/phy/at803x.c
+@@ -433,20 +433,21 @@ static void at803x_context_restore(struct phy_device *phydev,
+ static int at803x_set_wol(struct phy_device *phydev,
+ struct ethtool_wolinfo *wol)
+ {
+- struct net_device *ndev = phydev->attached_dev;
+- const u8 *mac;
+ int ret, irq_enabled;
+- unsigned int i;
+- static const unsigned int offsets[] = {
+- AT803X_LOC_MAC_ADDR_32_47_OFFSET,
+- AT803X_LOC_MAC_ADDR_16_31_OFFSET,
+- AT803X_LOC_MAC_ADDR_0_15_OFFSET,
+- };
+-
+- if (!ndev)
+- return -ENODEV;
+
+ if (wol->wolopts & WAKE_MAGIC) {
++ struct net_device *ndev = phydev->attached_dev;
++ const u8 *mac;
++ unsigned int i;
++ static const unsigned int offsets[] = {
++ AT803X_LOC_MAC_ADDR_32_47_OFFSET,
++ AT803X_LOC_MAC_ADDR_16_31_OFFSET,
++ AT803X_LOC_MAC_ADDR_0_15_OFFSET,
++ };
++
++ if (!ndev)
++ return -ENODEV;
++
+ mac = (const u8 *) ndev->dev_addr;
+
+ if (!is_valid_ether_addr(mac))
+@@ -857,6 +858,9 @@ static int at803x_probe(struct phy_device *phydev)
+ if (phydev->drv->phy_id == ATH8031_PHY_ID) {
+ int ccr = phy_read(phydev, AT803X_REG_CHIP_CONFIG);
+ int mode_cfg;
++ struct ethtool_wolinfo wol = {
++ .wolopts = 0,
++ };
+
+ if (ccr < 0)
+ goto err;
+@@ -872,6 +876,13 @@ static int at803x_probe(struct phy_device *phydev)
+ priv->is_fiber = true;
+ break;
+ }
++
++ /* Disable WOL by default */
++ ret = at803x_set_wol(phydev, &wol);
++ if (ret < 0) {
++ phydev_err(phydev, "failed to disable WOL on probe: %d\n", ret);
++ goto err;
++ }
+ }
+
+ return 0;
+diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
+index 8561f2d4443bf..13dafe7a29bde 100644
+--- a/drivers/net/phy/dp83867.c
++++ b/drivers/net/phy/dp83867.c
+@@ -137,6 +137,7 @@
+ #define DP83867_DOWNSHIFT_2_COUNT 2
+ #define DP83867_DOWNSHIFT_4_COUNT 4
+ #define DP83867_DOWNSHIFT_8_COUNT 8
++#define DP83867_SGMII_AUTONEG_EN BIT(7)
+
+ /* CFG3 bits */
+ #define DP83867_CFG3_INT_OE BIT(7)
+@@ -855,6 +856,32 @@ static int dp83867_phy_reset(struct phy_device *phydev)
+ DP83867_PHYCR_FORCE_LINK_GOOD, 0);
+ }
+
++static void dp83867_link_change_notify(struct phy_device *phydev)
++{
++ /* There is a limitation in DP83867 PHY device where SGMII AN is
++ * only triggered once after the device is booted up. Even after the
++ * PHY TPI is down and up again, SGMII AN is not triggered and
++ * hence no new in-band message from PHY to MAC side SGMII.
++ * This could cause an issue during power up, when PHY is up prior
++ * to MAC. At this condition, once MAC side SGMII is up, MAC side
++ * SGMII wouldn`t receive new in-band message from TI PHY with
++ * correct link status, speed and duplex info.
++ * Thus, implemented a SW solution here to retrigger SGMII Auto-Neg
++ * whenever there is a link change.
++ */
++ if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
++ int val = 0;
++
++ val = phy_clear_bits(phydev, DP83867_CFG2,
++ DP83867_SGMII_AUTONEG_EN);
++ if (val < 0)
++ return;
++
++ phy_set_bits(phydev, DP83867_CFG2,
++ DP83867_SGMII_AUTONEG_EN);
++ }
++}
++
+ static struct phy_driver dp83867_driver[] = {
+ {
+ .phy_id = DP83867_PHY_ID,
+@@ -879,6 +906,8 @@ static struct phy_driver dp83867_driver[] = {
+
+ .suspend = genphy_suspend,
+ .resume = genphy_resume,
++
++ .link_change_notify = dp83867_link_change_notify,
+ },
+ };
+ module_phy_driver(dp83867_driver);
+diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
+index 58d602985877b..8a2dbe849866d 100644
+--- a/drivers/net/phy/mdio_bus.c
++++ b/drivers/net/phy/mdio_bus.c
+@@ -1046,7 +1046,6 @@ int __init mdio_bus_init(void)
+
+ return ret;
+ }
+-EXPORT_SYMBOL_GPL(mdio_bus_init);
+
+ #if IS_ENABLED(CONFIG_PHYLIB)
+ void mdio_bus_exit(void)
+diff --git a/drivers/nfc/st21nfca/se.c b/drivers/nfc/st21nfca/se.c
+index 7e213f8ddc98b..df8d27cf2956b 100644
+--- a/drivers/nfc/st21nfca/se.c
++++ b/drivers/nfc/st21nfca/se.c
+@@ -300,6 +300,8 @@ int st21nfca_connectivity_event_received(struct nfc_hci_dev *hdev, u8 host,
+ int r = 0;
+ struct device *dev = &hdev->ndev->dev;
+ struct nfc_evt_transaction *transaction;
++ u32 aid_len;
++ u8 params_len;
+
+ pr_debug("connectivity gate event: %x\n", event);
+
+@@ -308,43 +310,48 @@ int st21nfca_connectivity_event_received(struct nfc_hci_dev *hdev, u8 host,
+ r = nfc_se_connectivity(hdev->ndev, host);
+ break;
+ case ST21NFCA_EVT_TRANSACTION:
+- /*
+- * According to specification etsi 102 622
++ /* According to specification etsi 102 622
+ * 11.2.2.4 EVT_TRANSACTION Table 52
+ * Description Tag Length
+ * AID 81 5 to 16
+ * PARAMETERS 82 0 to 255
++ *
++ * The key differences are aid storage length is variably sized
++ * in the packet, but fixed in nfc_evt_transaction, and that the aid_len
++ * is u8 in the packet, but u32 in the structure, and the tags in
++ * the packet are not included in nfc_evt_transaction.
++ *
++ * size in bytes: 1 1 5-16 1 1 0-255
++ * offset: 0 1 2 aid_len + 2 aid_len + 3 aid_len + 4
++ * member name: aid_tag(M) aid_len aid params_tag(M) params_len params
++ * example: 0x81 5-16 X 0x82 0-255 X
+ */
+- if (skb->len < NFC_MIN_AID_LENGTH + 2 &&
+- skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG)
++ if (skb->len < 2 || skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG)
+ return -EPROTO;
+
+- transaction = devm_kzalloc(dev, skb->len - 2, GFP_KERNEL);
+- if (!transaction)
+- return -ENOMEM;
+-
+- transaction->aid_len = skb->data[1];
++ aid_len = skb->data[1];
+
+- /* Checking if the length of the AID is valid */
+- if (transaction->aid_len > sizeof(transaction->aid))
+- return -EINVAL;
++ if (skb->len < aid_len + 4 || aid_len > sizeof(transaction->aid))
++ return -EPROTO;
+
+- memcpy(transaction->aid, &skb->data[2],
+- transaction->aid_len);
++ params_len = skb->data[aid_len + 3];
+
+- /* Check next byte is PARAMETERS tag (82) */
+- if (skb->data[transaction->aid_len + 2] !=
+- NFC_EVT_TRANSACTION_PARAMS_TAG)
++ /* Verify PARAMETERS tag is (82), and final check that there is enough
++ * space in the packet to read everything.
++ */
++ if ((skb->data[aid_len + 2] != NFC_EVT_TRANSACTION_PARAMS_TAG) ||
++ (skb->len < aid_len + 4 + params_len))
+ return -EPROTO;
+
+- transaction->params_len = skb->data[transaction->aid_len + 3];
++ transaction = devm_kzalloc(dev, sizeof(*transaction) + params_len, GFP_KERNEL);
++ if (!transaction)
++ return -ENOMEM;
+
+- /* Total size is allocated (skb->len - 2) minus fixed array members */
+- if (transaction->params_len > ((skb->len - 2) - sizeof(struct nfc_evt_transaction)))
+- return -EINVAL;
++ transaction->aid_len = aid_len;
++ transaction->params_len = params_len;
+
+- memcpy(transaction->params, skb->data +
+- transaction->aid_len + 4, transaction->params_len);
++ memcpy(transaction->aid, &skb->data[2], aid_len);
++ memcpy(transaction->params, &skb->data[aid_len + 4], params_len);
+
+ r = nfc_se_transaction(hdev->ndev, host, transaction);
+ break;
+diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
+index 375c0c40bbf8d..e61058e138182 100644
+--- a/drivers/pci/controller/pcie-brcmstb.c
++++ b/drivers/pci/controller/pcie-brcmstb.c
+@@ -24,7 +24,6 @@
+ #include <linux/pci.h>
+ #include <linux/pci-ecam.h>
+ #include <linux/printk.h>
+-#include <linux/regulator/consumer.h>
+ #include <linux/reset.h>
+ #include <linux/sizes.h>
+ #include <linux/slab.h>
+@@ -196,8 +195,6 @@ static inline void brcm_pcie_bridge_sw_init_set_generic(struct brcm_pcie *pcie,
+ static inline void brcm_pcie_perst_set_4908(struct brcm_pcie *pcie, u32 val);
+ static inline void brcm_pcie_perst_set_7278(struct brcm_pcie *pcie, u32 val);
+ static inline void brcm_pcie_perst_set_generic(struct brcm_pcie *pcie, u32 val);
+-static int brcm_pcie_linkup(struct brcm_pcie *pcie);
+-static int brcm_pcie_add_bus(struct pci_bus *bus);
+
+ enum {
+ RGR1_SW_INIT_1,
+@@ -286,14 +283,6 @@ static const struct pcie_cfg_data bcm2711_cfg = {
+ .bridge_sw_init_set = brcm_pcie_bridge_sw_init_set_generic,
+ };
+
+-struct subdev_regulators {
+- unsigned int num_supplies;
+- struct regulator_bulk_data supplies[];
+-};
+-
+-static int pci_subdev_regulators_add_bus(struct pci_bus *bus);
+-static void pci_subdev_regulators_remove_bus(struct pci_bus *bus);
+-
+ struct brcm_msi {
+ struct device *dev;
+ void __iomem *base;
+@@ -331,9 +320,6 @@ struct brcm_pcie {
+ u32 hw_rev;
+ void (*perst_set)(struct brcm_pcie *pcie, u32 val);
+ void (*bridge_sw_init_set)(struct brcm_pcie *pcie, u32 val);
+- bool refusal_mode;
+- struct subdev_regulators *sr;
+- bool ep_wakeup_capable;
+ };
+
+ static inline bool is_bmips(const struct brcm_pcie *pcie)
+@@ -450,99 +436,6 @@ static int brcm_pcie_set_ssc(struct brcm_pcie *pcie)
+ return ssc && pll ? 0 : -EIO;
+ }
+
+-static void *alloc_subdev_regulators(struct device *dev)
+-{
+- static const char * const supplies[] = {
+- "vpcie3v3",
+- "vpcie3v3aux",
+- "vpcie12v",
+- };
+- const size_t size = sizeof(struct subdev_regulators)
+- + sizeof(struct regulator_bulk_data) * ARRAY_SIZE(supplies);
+- struct subdev_regulators *sr;
+- int i;
+-
+- sr = devm_kzalloc(dev, size, GFP_KERNEL);
+- if (sr) {
+- sr->num_supplies = ARRAY_SIZE(supplies);
+- for (i = 0; i < ARRAY_SIZE(supplies); i++)
+- sr->supplies[i].supply = supplies[i];
+- }
+-
+- return sr;
+-}
+-
+-static int pci_subdev_regulators_add_bus(struct pci_bus *bus)
+-{
+- struct device *dev = &bus->dev;
+- struct subdev_regulators *sr;
+- int ret;
+-
+- if (!dev->of_node || !bus->parent || !pci_is_root_bus(bus->parent))
+- return 0;
+-
+- if (dev->driver_data)
+- dev_err(dev, "dev.driver_data unexpectedly non-NULL\n");
+-
+- sr = alloc_subdev_regulators(dev);
+- if (!sr)
+- return -ENOMEM;
+-
+- dev->driver_data = sr;
+- ret = regulator_bulk_get(dev, sr->num_supplies, sr->supplies);
+- if (ret)
+- return ret;
+-
+- ret = regulator_bulk_enable(sr->num_supplies, sr->supplies);
+- if (ret) {
+- dev_err(dev, "failed to enable regulators for downstream device\n");
+- return ret;
+- }
+-
+- return 0;
+-}
+-
+-static int brcm_pcie_add_bus(struct pci_bus *bus)
+-{
+- struct device *dev = &bus->dev;
+- struct brcm_pcie *pcie = (struct brcm_pcie *) bus->sysdata;
+- int ret;
+-
+- if (!dev->of_node || !bus->parent || !pci_is_root_bus(bus->parent))
+- return 0;
+-
+- ret = pci_subdev_regulators_add_bus(bus);
+- if (ret)
+- return ret;
+-
+- /* Grab the regulators for suspend/resume */
+- pcie->sr = bus->dev.driver_data;
+-
+- /*
+- * If we have failed linkup there is no point to return an error as
+- * currently it will cause a WARNING() from pci_alloc_child_bus().
+- * We return 0 and turn on the "refusal_mode" so that any further
+- * accesses to the pci_dev just get 0xffffffff
+- */
+- if (brcm_pcie_linkup(pcie) != 0)
+- pcie->refusal_mode = true;
+-
+- return 0;
+-}
+-
+-static void pci_subdev_regulators_remove_bus(struct pci_bus *bus)
+-{
+- struct device *dev = &bus->dev;
+- struct subdev_regulators *sr = dev->driver_data;
+-
+- if (!sr || !bus->parent || !pci_is_root_bus(bus->parent))
+- return;
+-
+- if (regulator_bulk_disable(sr->num_supplies, sr->supplies))
+- dev_err(dev, "failed to disable regulators for downstream device\n");
+- dev->driver_data = NULL;
+-}
+-
+ /* Limits operation to a specific generation (1, 2, or 3) */
+ static void brcm_pcie_set_gen(struct brcm_pcie *pcie, int gen)
+ {
+@@ -858,18 +751,6 @@ static void __iomem *brcm_pcie_map_conf(struct pci_bus *bus, unsigned int devfn,
+ /* Accesses to the RC go right to the RC registers if slot==0 */
+ if (pci_is_root_bus(bus))
+ return PCI_SLOT(devfn) ? NULL : base + where;
+- if (pcie->refusal_mode) {
+- /*
+- * At this point we do not have link. There will be a CPU
+- * abort -- a quirk with this controller --if Linux tries
+- * to read any config-space registers besides those
+- * targeting the host bridge. To prevent this we hijack
+- * the address to point to a safe access that will return
+- * 0xffffffff.
+- */
+- writel(0xffffffff, base + PCIE_MISC_RC_BAR2_CONFIG_HI);
+- return base + PCIE_MISC_RC_BAR2_CONFIG_HI + (where & 0x3);
+- }
+
+ /* For devices, write to the config space index register */
+ idx = PCIE_ECAM_OFFSET(bus->number, devfn, 0);
+@@ -898,8 +779,6 @@ static struct pci_ops brcm_pcie_ops = {
+ .map_bus = brcm_pcie_map_conf,
+ .read = pci_generic_config_read,
+ .write = pci_generic_config_write,
+- .add_bus = brcm_pcie_add_bus,
+- .remove_bus = pci_subdev_regulators_remove_bus,
+ };
+
+ static struct pci_ops brcm_pcie_ops32 = {
+@@ -1047,9 +926,16 @@ static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
+
+ static int brcm_pcie_setup(struct brcm_pcie *pcie)
+ {
++ struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie);
+ u64 rc_bar2_offset, rc_bar2_size;
+ void __iomem *base = pcie->base;
+- int ret, memc;
++ struct device *dev = pcie->dev;
++ struct resource_entry *entry;
++ bool ssc_good = false;
++ struct resource *res;
++ int num_out_wins = 0;
++ u16 nlw, cls, lnksta;
++ int i, ret, memc;
+ u32 tmp, burst, aspm_support;
+
+ /* Reset the bridge */
+@@ -1139,40 +1025,6 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
+ if (pcie->gen)
+ brcm_pcie_set_gen(pcie, pcie->gen);
+
+- /* Don't advertise L0s capability if 'aspm-no-l0s' */
+- aspm_support = PCIE_LINK_STATE_L1;
+- if (!of_property_read_bool(pcie->np, "aspm-no-l0s"))
+- aspm_support |= PCIE_LINK_STATE_L0S;
+- tmp = readl(base + PCIE_RC_CFG_PRIV1_LINK_CAPABILITY);
+- u32p_replace_bits(&tmp, aspm_support,
+- PCIE_RC_CFG_PRIV1_LINK_CAPABILITY_ASPM_SUPPORT_MASK);
+- writel(tmp, base + PCIE_RC_CFG_PRIV1_LINK_CAPABILITY);
+-
+- /*
+- * For config space accesses on the RC, show the right class for
+- * a PCIe-PCIe bridge (the default setting is to be EP mode).
+- */
+- tmp = readl(base + PCIE_RC_CFG_PRIV1_ID_VAL3);
+- u32p_replace_bits(&tmp, 0x060400,
+- PCIE_RC_CFG_PRIV1_ID_VAL3_CLASS_CODE_MASK);
+- writel(tmp, base + PCIE_RC_CFG_PRIV1_ID_VAL3);
+-
+- return 0;
+-}
+-
+-static int brcm_pcie_linkup(struct brcm_pcie *pcie)
+-{
+- struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie);
+- struct device *dev = pcie->dev;
+- void __iomem *base = pcie->base;
+- struct resource_entry *entry;
+- struct resource *res;
+- int num_out_wins = 0;
+- u16 nlw, cls, lnksta;
+- bool ssc_good = false;
+- u32 tmp;
+- int ret, i;
+-
+ /* Unassert the fundamental reset */
+ pcie->perst_set(pcie, 0);
+
+@@ -1223,6 +1075,24 @@ static int brcm_pcie_linkup(struct brcm_pcie *pcie)
+ num_out_wins++;
+ }
+
++ /* Don't advertise L0s capability if 'aspm-no-l0s' */
++ aspm_support = PCIE_LINK_STATE_L1;
++ if (!of_property_read_bool(pcie->np, "aspm-no-l0s"))
++ aspm_support |= PCIE_LINK_STATE_L0S;
++ tmp = readl(base + PCIE_RC_CFG_PRIV1_LINK_CAPABILITY);
++ u32p_replace_bits(&tmp, aspm_support,
++ PCIE_RC_CFG_PRIV1_LINK_CAPABILITY_ASPM_SUPPORT_MASK);
++ writel(tmp, base + PCIE_RC_CFG_PRIV1_LINK_CAPABILITY);
++
++ /*
++ * For config space accesses on the RC, show the right class for
++ * a PCIe-PCIe bridge (the default setting is to be EP mode).
++ */
++ tmp = readl(base + PCIE_RC_CFG_PRIV1_ID_VAL3);
++ u32p_replace_bits(&tmp, 0x060400,
++ PCIE_RC_CFG_PRIV1_ID_VAL3_CLASS_CODE_MASK);
++ writel(tmp, base + PCIE_RC_CFG_PRIV1_ID_VAL3);
++
+ if (pcie->ssc) {
+ ret = brcm_pcie_set_ssc(pcie);
+ if (ret == 0)
+@@ -1351,21 +1221,9 @@ static void brcm_pcie_turn_off(struct brcm_pcie *pcie)
+ pcie->bridge_sw_init_set(pcie, 1);
+ }
+
+-static int pci_dev_may_wakeup(struct pci_dev *dev, void *data)
+-{
+- bool *ret = data;
+-
+- if (device_may_wakeup(&dev->dev)) {
+- *ret = true;
+- dev_info(&dev->dev, "disable cancelled for wake-up device\n");
+- }
+- return (int) *ret;
+-}
+-
+ static int brcm_pcie_suspend(struct device *dev)
+ {
+ struct brcm_pcie *pcie = dev_get_drvdata(dev);
+- struct pci_host_bridge *bridge = pci_host_bridge_from_priv(pcie);
+ int ret;
+
+ brcm_pcie_turn_off(pcie);
+@@ -1383,25 +1241,6 @@ static int brcm_pcie_suspend(struct device *dev)
+ return ret;
+ }
+
+- if (pcie->sr) {
+- /*
+- * Now turn off the regulators, but if at least one
+- * downstream device is enabled as a wake-up source, do not
+- * turn off regulators.
+- */
+- pcie->ep_wakeup_capable = false;
+- pci_walk_bus(bridge->bus, pci_dev_may_wakeup,
+- &pcie->ep_wakeup_capable);
+- if (!pcie->ep_wakeup_capable) {
+- ret = regulator_bulk_disable(pcie->sr->num_supplies,
+- pcie->sr->supplies);
+- if (ret) {
+- dev_err(dev, "Could not turn off regulators\n");
+- reset_control_reset(pcie->rescal);
+- return ret;
+- }
+- }
+- }
+ clk_disable_unprepare(pcie->clk);
+
+ return 0;
+@@ -1419,28 +1258,9 @@ static int brcm_pcie_resume(struct device *dev)
+ if (ret)
+ return ret;
+
+- if (pcie->sr) {
+- if (pcie->ep_wakeup_capable) {
+- /*
+- * We are resuming from a suspend. In the suspend we
+- * did not disable the power supplies, so there is
+- * no need to enable them (and falsely increase their
+- * usage count).
+- */
+- pcie->ep_wakeup_capable = false;
+- } else {
+- ret = regulator_bulk_enable(pcie->sr->num_supplies,
+- pcie->sr->supplies);
+- if (ret) {
+- dev_err(dev, "Could not turn on regulators\n");
+- goto err_disable_clk;
+- }
+- }
+- }
+-
+ ret = reset_control_reset(pcie->rescal);
+ if (ret)
+- goto err_regulator;
++ goto err_disable_clk;
+
+ ret = brcm_phy_start(pcie);
+ if (ret)
+@@ -1461,10 +1281,6 @@ static int brcm_pcie_resume(struct device *dev)
+ if (ret)
+ goto err_reset;
+
+- ret = brcm_pcie_linkup(pcie);
+- if (ret)
+- goto err_reset;
+-
+ if (pcie->msi)
+ brcm_msi_set_regs(pcie->msi);
+
+@@ -1472,9 +1288,6 @@ static int brcm_pcie_resume(struct device *dev)
+
+ err_reset:
+ reset_control_rearm(pcie->rescal);
+-err_regulator:
+- if (pcie->sr)
+- regulator_bulk_disable(pcie->sr->num_supplies, pcie->sr->supplies);
+ err_disable_clk:
+ clk_disable_unprepare(pcie->clk);
+ return ret;
+@@ -1606,17 +1419,7 @@ static int brcm_pcie_probe(struct platform_device *pdev)
+
+ platform_set_drvdata(pdev, pcie);
+
+- ret = pci_host_probe(bridge);
+- if (!ret && !brcm_pcie_link_up(pcie))
+- ret = -ENODEV;
+-
+- if (ret) {
+- brcm_pcie_remove(pdev);
+- return ret;
+- }
+-
+- return 0;
+-
++ return pci_host_probe(bridge);
+ fail:
+ __brcm_pcie_remove(pcie);
+ return ret;
+@@ -1625,8 +1428,8 @@ fail:
+ MODULE_DEVICE_TABLE(of, brcm_pcie_match);
+
+ static const struct dev_pm_ops brcm_pcie_pm_ops = {
+- .suspend_noirq = brcm_pcie_suspend,
+- .resume_noirq = brcm_pcie_resume,
++ .suspend = brcm_pcie_suspend,
++ .resume = brcm_pcie_resume,
+ };
+
+ static struct platform_driver brcm_pcie_driver = {
+diff --git a/drivers/pcmcia/Kconfig b/drivers/pcmcia/Kconfig
+index 2ce261cfff8ee..89e4511e9c435 100644
+--- a/drivers/pcmcia/Kconfig
++++ b/drivers/pcmcia/Kconfig
+@@ -151,7 +151,7 @@ config TCIC
+
+ config PCMCIA_ALCHEMY_DEVBOARD
+ tristate "Alchemy Db/Pb1xxx PCMCIA socket services"
+- depends on MIPS_ALCHEMY && PCMCIA
++ depends on MIPS_DB1XXX && PCMCIA
+ help
+ Enable this driver of you want PCMCIA support on your Alchemy
+ Db1000, Db/Pb1100, Db/Pb1500, Db/Pb1550, Db/Pb1200, DB1300
+diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c
+index 9afac02e0eaa2..4d50a69256007 100644
+--- a/drivers/phy/qualcomm/phy-qcom-qmp.c
++++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
+@@ -5246,7 +5246,7 @@ static int qcom_qmp_phy_power_on(struct phy *phy)
+
+ ret = reset_control_deassert(qmp->ufs_reset);
+ if (ret)
+- goto err_lane_rst;
++ goto err_pcs_ready;
+
+ qcom_qmp_phy_configure(pcs_misc, cfg->regs, cfg->pcs_misc_tbl,
+ cfg->pcs_misc_tbl_num);
+diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
+index eca77e44a4c1b..cba5c32cbaeed 100644
+--- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
++++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
+@@ -940,8 +940,14 @@ static irqreturn_t rockchip_usb2phy_irq(int irq, void *data)
+ if (!rport->phy)
+ continue;
+
+- /* Handle linestate irq for both otg port and host port */
+- ret = rockchip_usb2phy_linestate_irq(irq, rport);
++ switch (rport->port_id) {
++ case USB2PHY_PORT_OTG:
++ ret |= rockchip_usb2phy_otg_mux_irq(irq, rport);
++ break;
++ case USB2PHY_PORT_HOST:
++ ret |= rockchip_usb2phy_linestate_irq(irq, rport);
++ break;
++ }
+ }
+
+ return ret;
+diff --git a/drivers/platform/x86/barco-p50-gpio.c b/drivers/platform/x86/barco-p50-gpio.c
+index 05534287bc26b..8dd6723394858 100644
+--- a/drivers/platform/x86/barco-p50-gpio.c
++++ b/drivers/platform/x86/barco-p50-gpio.c
+@@ -405,11 +405,14 @@ MODULE_DEVICE_TABLE(dmi, dmi_ids);
+ static int __init p50_module_init(void)
+ {
+ struct resource res = DEFINE_RES_IO(P50_GPIO_IO_PORT_BASE, P50_PORT_CMD + 1);
++ int ret;
+
+ if (!dmi_first_match(dmi_ids))
+ return -ENODEV;
+
+- platform_driver_register(&p50_gpio_driver);
++ ret = platform_driver_register(&p50_gpio_driver);
++ if (ret)
++ return ret;
+
+ gpio_pdev = platform_device_register_simple(DRIVER_NAME, PLATFORM_DEVID_NONE, &res, 1);
+ if (IS_ERR(gpio_pdev)) {
+diff --git a/drivers/platform/x86/hp-wmi.c b/drivers/platform/x86/hp-wmi.c
+index 0e9a25b56e0e4..0e6ed75c70f36 100644
+--- a/drivers/platform/x86/hp-wmi.c
++++ b/drivers/platform/x86/hp-wmi.c
+@@ -38,6 +38,7 @@ MODULE_ALIAS("wmi:5FB7F034-2C63-45e9-BE91-3D44E2C707E4");
+ #define HPWMI_EVENT_GUID "95F24279-4D7B-4334-9387-ACCDC67EF61C"
+ #define HPWMI_BIOS_GUID "5FB7F034-2C63-45e9-BE91-3D44E2C707E4"
+ #define HP_OMEN_EC_THERMAL_PROFILE_OFFSET 0x95
++#define zero_if_sup(tmp) (zero_insize_support?0:sizeof(tmp)) // use when zero insize is required
+
+ /* DMI board names of devices that should use the omen specific path for
+ * thermal profiles.
+@@ -220,6 +221,7 @@ static struct input_dev *hp_wmi_input_dev;
+ static struct platform_device *hp_wmi_platform_dev;
+ static struct platform_profile_handler platform_profile_handler;
+ static bool platform_profile_support;
++static bool zero_insize_support;
+
+ static struct rfkill *wifi_rfkill;
+ static struct rfkill *bluetooth_rfkill;
+@@ -290,14 +292,16 @@ static int hp_wmi_perform_query(int query, enum hp_wmi_command command,
+ struct bios_return *bios_return;
+ union acpi_object *obj = NULL;
+ struct bios_args *args = NULL;
+- int mid, actual_outsize, ret;
++ int mid, actual_insize, actual_outsize;
+ size_t bios_args_size;
++ int ret;
+
+ mid = encode_outsize_for_pvsz(outsize);
+ if (WARN_ON(mid < 0))
+ return mid;
+
+- bios_args_size = struct_size(args, data, insize);
++ actual_insize = max(insize, 128);
++ bios_args_size = struct_size(args, data, actual_insize);
+ args = kmalloc(bios_args_size, GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+@@ -374,7 +378,7 @@ static int hp_wmi_read_int(int query)
+ int val = 0, ret;
+
+ ret = hp_wmi_perform_query(query, HPWMI_READ, &val,
+- 0, sizeof(val));
++ zero_if_sup(val), sizeof(val));
+
+ if (ret)
+ return ret < 0 ? ret : -EINVAL;
+@@ -410,7 +414,8 @@ static int hp_wmi_get_tablet_mode(void)
+ return -ENODEV;
+
+ ret = hp_wmi_perform_query(HPWMI_SYSTEM_DEVICE_MODE, HPWMI_READ,
+- system_device_mode, 0, sizeof(system_device_mode));
++ system_device_mode, zero_if_sup(system_device_mode),
++ sizeof(system_device_mode));
+ if (ret < 0)
+ return ret;
+
+@@ -497,7 +502,7 @@ static int hp_wmi_fan_speed_max_get(void)
+ int val = 0, ret;
+
+ ret = hp_wmi_perform_query(HPWMI_FAN_SPEED_MAX_GET_QUERY, HPWMI_GM,
+- &val, 0, sizeof(val));
++ &val, zero_if_sup(val), sizeof(val));
+
+ if (ret)
+ return ret < 0 ? ret : -EINVAL;
+@@ -509,7 +514,7 @@ static int __init hp_wmi_bios_2008_later(void)
+ {
+ int state = 0;
+ int ret = hp_wmi_perform_query(HPWMI_FEATURE_QUERY, HPWMI_READ, &state,
+- 0, sizeof(state));
++ zero_if_sup(state), sizeof(state));
+ if (!ret)
+ return 1;
+
+@@ -520,7 +525,7 @@ static int __init hp_wmi_bios_2009_later(void)
+ {
+ u8 state[128];
+ int ret = hp_wmi_perform_query(HPWMI_FEATURE2_QUERY, HPWMI_READ, &state,
+- 0, sizeof(state));
++ zero_if_sup(state), sizeof(state));
+ if (!ret)
+ return 1;
+
+@@ -598,7 +603,7 @@ static int hp_wmi_rfkill2_refresh(void)
+ int err, i;
+
+ err = hp_wmi_perform_query(HPWMI_WIRELESS2_QUERY, HPWMI_READ, &state,
+- 0, sizeof(state));
++ zero_if_sup(state), sizeof(state));
+ if (err)
+ return err;
+
+@@ -1000,7 +1005,7 @@ static int __init hp_wmi_rfkill2_setup(struct platform_device *device)
+ int err, i;
+
+ err = hp_wmi_perform_query(HPWMI_WIRELESS2_QUERY, HPWMI_READ, &state,
+- 0, sizeof(state));
++ zero_if_sup(state), sizeof(state));
+ if (err)
+ return err < 0 ? err : -EINVAL;
+
+@@ -1475,11 +1480,15 @@ static int __init hp_wmi_init(void)
+ {
+ int event_capable = wmi_has_guid(HPWMI_EVENT_GUID);
+ int bios_capable = wmi_has_guid(HPWMI_BIOS_GUID);
+- int err;
++ int err, tmp = 0;
+
+ if (!bios_capable && !event_capable)
+ return -ENODEV;
+
++ if (hp_wmi_perform_query(HPWMI_HARDWARE_QUERY, HPWMI_READ, &tmp,
++ sizeof(tmp), sizeof(tmp)) == HPWMI_RET_INVALID_PARAMETERS)
++ zero_insize_support = true;
++
+ if (event_capable) {
+ err = hp_wmi_input_setup();
+ if (err)
+diff --git a/drivers/power/supply/ab8500_fg.c b/drivers/power/supply/ab8500_fg.c
+index 97ac588a9e9ce..ec8a404d71b44 100644
+--- a/drivers/power/supply/ab8500_fg.c
++++ b/drivers/power/supply/ab8500_fg.c
+@@ -3037,13 +3037,6 @@ static int ab8500_fg_bind(struct device *dev, struct device *master,
+ {
+ struct ab8500_fg *di = dev_get_drvdata(dev);
+
+- /* Create a work queue for running the FG algorithm */
+- di->fg_wq = alloc_ordered_workqueue("ab8500_fg_wq", WQ_MEM_RECLAIM);
+- if (di->fg_wq == NULL) {
+- dev_err(dev, "failed to create work queue\n");
+- return -ENOMEM;
+- }
+-
+ di->bat_cap.max_mah_design = di->bm->bi->charge_full_design_uah;
+ di->bat_cap.max_mah = di->bat_cap.max_mah_design;
+ di->vbat_nom_uv = di->bm->bi->voltage_max_design_uv;
+@@ -3067,8 +3060,7 @@ static void ab8500_fg_unbind(struct device *dev, struct device *master,
+ if (ret)
+ dev_err(dev, "failed to disable coulomb counter\n");
+
+- destroy_workqueue(di->fg_wq);
+- flush_scheduled_work();
++ flush_workqueue(di->fg_wq);
+ }
+
+ static const struct component_ops ab8500_fg_component_ops = {
+@@ -3117,6 +3109,13 @@ static int ab8500_fg_probe(struct platform_device *pdev)
+ ab8500_fg_charge_state_to(di, AB8500_FG_CHARGE_INIT);
+ ab8500_fg_discharge_state_to(di, AB8500_FG_DISCHARGE_INIT);
+
++ /* Create a work queue for running the FG algorithm */
++ di->fg_wq = alloc_ordered_workqueue("ab8500_fg_wq", WQ_MEM_RECLAIM);
++ if (di->fg_wq == NULL) {
++ dev_err(dev, "failed to create work queue\n");
++ return -ENOMEM;
++ }
++
+ /* Init work for running the fg algorithm instantly */
+ INIT_WORK(&di->fg_work, ab8500_fg_instant_work);
+
+@@ -3227,6 +3226,8 @@ static int ab8500_fg_remove(struct platform_device *pdev)
+ {
+ struct ab8500_fg *di = platform_get_drvdata(pdev);
+
++ destroy_workqueue(di->fg_wq);
++ flush_scheduled_work();
+ component_del(&pdev->dev, &ab8500_fg_component_ops);
+ list_del(&di->node);
+ ab8500_fg_sysfs_exit(di);
+diff --git a/drivers/power/supply/axp288_charger.c b/drivers/power/supply/axp288_charger.c
+index 19746e658a6a8..15219ed43ce95 100644
+--- a/drivers/power/supply/axp288_charger.c
++++ b/drivers/power/supply/axp288_charger.c
+@@ -865,17 +865,20 @@ static int axp288_charger_probe(struct platform_device *pdev)
+ info->regmap_irqc = axp20x->regmap_irqc;
+
+ info->cable.edev = extcon_get_extcon_dev(AXP288_EXTCON_DEV_NAME);
+- if (info->cable.edev == NULL) {
+- dev_dbg(dev, "%s is not ready, probe deferred\n",
+- AXP288_EXTCON_DEV_NAME);
+- return -EPROBE_DEFER;
++ if (IS_ERR(info->cable.edev)) {
++ dev_err_probe(dev, PTR_ERR(info->cable.edev),
++ "extcon_get_extcon_dev(%s) failed\n",
++ AXP288_EXTCON_DEV_NAME);
++ return PTR_ERR(info->cable.edev);
+ }
+
+ if (acpi_dev_present(USB_HOST_EXTCON_HID, NULL, -1)) {
+ info->otg.cable = extcon_get_extcon_dev(USB_HOST_EXTCON_NAME);
+- if (info->otg.cable == NULL) {
+- dev_dbg(dev, "EXTCON_USB_HOST is not ready, probe deferred\n");
+- return -EPROBE_DEFER;
++ if (IS_ERR(info->otg.cable)) {
++ dev_err_probe(dev, PTR_ERR(info->otg.cable),
++ "extcon_get_extcon_dev(%s) failed\n",
++ USB_HOST_EXTCON_NAME);
++ return PTR_ERR(info->otg.cable);
+ }
+ dev_info(dev, "Using " USB_HOST_EXTCON_HID " extcon for usb-id\n");
+ }
+diff --git a/drivers/power/supply/axp288_fuel_gauge.c b/drivers/power/supply/axp288_fuel_gauge.c
+index e9f285dae489c..8e6f8a6550790 100644
+--- a/drivers/power/supply/axp288_fuel_gauge.c
++++ b/drivers/power/supply/axp288_fuel_gauge.c
+@@ -90,6 +90,8 @@
+ #define AXP288_REG_UPDATE_INTERVAL (60 * HZ)
+ #define AXP288_FG_INTR_NUM 6
+
++#define AXP288_QUIRK_NO_BATTERY BIT(0)
++
+ static bool no_current_sense_res;
+ module_param(no_current_sense_res, bool, 0444);
+ MODULE_PARM_DESC(no_current_sense_res, "No (or broken) current sense resistor");
+@@ -524,7 +526,7 @@ static struct power_supply_desc fuel_gauge_desc = {
+ * detection reports one despite it not being there.
+ * Please keep this listed sorted alphabetically.
+ */
+-static const struct dmi_system_id axp288_no_battery_list[] = {
++static const struct dmi_system_id axp288_quirks[] = {
+ {
+ /* ACEPC T8 Cherry Trail Z8350 mini PC */
+ .matches = {
+@@ -534,6 +536,7 @@ static const struct dmi_system_id axp288_no_battery_list[] = {
+ /* also match on somewhat unique bios-version */
+ DMI_EXACT_MATCH(DMI_BIOS_VERSION, "1.000"),
+ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ {
+ /* ACEPC T11 Cherry Trail Z8350 mini PC */
+@@ -544,6 +547,7 @@ static const struct dmi_system_id axp288_no_battery_list[] = {
+ /* also match on somewhat unique bios-version */
+ DMI_EXACT_MATCH(DMI_BIOS_VERSION, "1.000"),
+ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ {
+ /* Intel Cherry Trail Compute Stick, Windows version */
+@@ -551,6 +555,7 @@ static const struct dmi_system_id axp288_no_battery_list[] = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Intel"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "STK1AW32SC"),
+ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ {
+ /* Intel Cherry Trail Compute Stick, version without an OS */
+@@ -558,34 +563,54 @@ static const struct dmi_system_id axp288_no_battery_list[] = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Intel"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "STK1A32SC"),
+ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ {
+ /* Meegopad T02 */
+ .matches = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "MEEGOPAD T02"),
+ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ { /* Mele PCG03 Mini PC */
+ .matches = {
+ DMI_EXACT_MATCH(DMI_BOARD_VENDOR, "Mini PC"),
+ DMI_EXACT_MATCH(DMI_BOARD_NAME, "Mini PC"),
+ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ {
+ /* Minix Neo Z83-4 mini PC */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "MINIX"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Z83-4"),
+- }
++ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ {
+- /* Various Ace PC/Meegopad/MinisForum/Wintel Mini-PCs/HDMI-sticks */
++ /*
++ * One Mix 1, this uses the "T3 MRD" boardname used by
++ * generic mini PCs, but it is a mini laptop so it does
++ * actually have a battery!
++ */
++ .matches = {
++ DMI_MATCH(DMI_BOARD_NAME, "T3 MRD"),
++ DMI_MATCH(DMI_BIOS_DATE, "06/14/2018"),
++ },
++ .driver_data = NULL,
++ },
++ {
++ /*
++ * Various Ace PC/Meegopad/MinisForum/Wintel Mini-PCs/HDMI-sticks
++ * This entry must be last because it is generic, this allows
++ * adding more specifuc quirks overriding this generic entry.
++ */
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "T3 MRD"),
+ DMI_MATCH(DMI_CHASSIS_TYPE, "3"),
+ DMI_MATCH(DMI_BIOS_VENDOR, "American Megatrends Inc."),
+- DMI_MATCH(DMI_BIOS_VERSION, "5.11"),
+ },
++ .driver_data = (void *)AXP288_QUIRK_NO_BATTERY,
+ },
+ {}
+ };
+@@ -665,7 +690,9 @@ static int axp288_fuel_gauge_probe(struct platform_device *pdev)
+ [BAT_D_CURR] = "axp288-chrg-d-curr",
+ [BAT_VOLT] = "axp288-batt-volt",
+ };
++ const struct dmi_system_id *dmi_id;
+ struct device *dev = &pdev->dev;
++ unsigned long quirks = 0;
+ int i, pirq, ret;
+
+ /*
+@@ -675,7 +702,11 @@ static int axp288_fuel_gauge_probe(struct platform_device *pdev)
+ if (!acpi_quirk_skip_acpi_ac_and_battery())
+ return -ENODEV;
+
+- if (dmi_check_system(axp288_no_battery_list))
++ dmi_id = dmi_first_match(axp288_quirks);
++ if (dmi_id)
++ quirks = (unsigned long)dmi_id->driver_data;
++
++ if (quirks & AXP288_QUIRK_NO_BATTERY)
+ return -ENODEV;
+
+ info = devm_kzalloc(dev, sizeof(*info), GFP_KERNEL);
+diff --git a/drivers/power/supply/charger-manager.c b/drivers/power/supply/charger-manager.c
+index d67edb760c948..92db79400a6ad 100644
+--- a/drivers/power/supply/charger-manager.c
++++ b/drivers/power/supply/charger-manager.c
+@@ -985,13 +985,10 @@ static int charger_extcon_init(struct charger_manager *cm,
+ cable->nb.notifier_call = charger_extcon_notifier;
+
+ cable->extcon_dev = extcon_get_extcon_dev(cable->extcon_name);
+- if (IS_ERR_OR_NULL(cable->extcon_dev)) {
++ if (IS_ERR(cable->extcon_dev)) {
+ pr_err("Cannot find extcon_dev for %s (cable: %s)\n",
+ cable->extcon_name, cable->name);
+- if (cable->extcon_dev == NULL)
+- return -EPROBE_DEFER;
+- else
+- return PTR_ERR(cable->extcon_dev);
++ return PTR_ERR(cable->extcon_dev);
+ }
+
+ for (i = 0; i < ARRAY_SIZE(extcon_mapping); i++) {
+diff --git a/drivers/power/supply/max8997_charger.c b/drivers/power/supply/max8997_charger.c
+index 127c73b0b3bd7..1ec3535a257d9 100644
+--- a/drivers/power/supply/max8997_charger.c
++++ b/drivers/power/supply/max8997_charger.c
+@@ -242,10 +242,10 @@ static int max8997_battery_probe(struct platform_device *pdev)
+ dev_info(&pdev->dev, "couldn't get charger regulator\n");
+ }
+ charger->edev = extcon_get_extcon_dev("max8997-muic");
+- if (IS_ERR_OR_NULL(charger->edev)) {
+- if (!charger->edev)
+- return -EPROBE_DEFER;
+- dev_info(charger->dev, "couldn't get extcon device\n");
++ if (IS_ERR(charger->edev)) {
++ dev_err_probe(charger->dev, PTR_ERR(charger->edev),
++ "couldn't get extcon device: max8997-muic\n");
++ return PTR_ERR(charger->edev);
+ }
+
+ if (!IS_ERR(charger->reg) && !IS_ERR_OR_NULL(charger->edev)) {
+diff --git a/drivers/power/supply/power_supply_core.c b/drivers/power/supply/power_supply_core.c
+index d925cb137e126..fad5890c899e2 100644
+--- a/drivers/power/supply/power_supply_core.c
++++ b/drivers/power/supply/power_supply_core.c
+@@ -616,7 +616,7 @@ int power_supply_get_battery_info(struct power_supply *psy,
+ goto out_put_node;
+ }
+
+- info = devm_kmalloc(&psy->dev, sizeof(*info), GFP_KERNEL);
++ info = devm_kzalloc(&psy->dev, sizeof(*info), GFP_KERNEL);
+ if (!info) {
+ err = -ENOMEM;
+ goto out_put_node;
+diff --git a/drivers/pwm/pwm-lp3943.c b/drivers/pwm/pwm-lp3943.c
+index ea17d446a6276..2bd04ecb508cf 100644
+--- a/drivers/pwm/pwm-lp3943.c
++++ b/drivers/pwm/pwm-lp3943.c
+@@ -125,6 +125,7 @@ static int lp3943_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
+ if (err)
+ return err;
+
++ duty_ns = min(duty_ns, period_ns);
+ val = (u8)(duty_ns * LP3943_MAX_DUTY / period_ns);
+
+ return lp3943_write_byte(lp3943, reg_duty, val);
+diff --git a/drivers/pwm/pwm-raspberrypi-poe.c b/drivers/pwm/pwm-raspberrypi-poe.c
+index e52e29fc8231f..6ff73029f367f 100644
+--- a/drivers/pwm/pwm-raspberrypi-poe.c
++++ b/drivers/pwm/pwm-raspberrypi-poe.c
+@@ -66,7 +66,7 @@ static int raspberrypi_pwm_get_property(struct rpi_firmware *firmware,
+ u32 reg, u32 *val)
+ {
+ struct raspberrypi_pwm_prop msg = {
+- .reg = reg
++ .reg = cpu_to_le32(reg),
+ };
+ int ret;
+
+diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
+index 7a096f1891e61..91eb037089ef8 100644
+--- a/drivers/remoteproc/imx_rproc.c
++++ b/drivers/remoteproc/imx_rproc.c
+@@ -423,6 +423,9 @@ static int imx_rproc_prepare(struct rproc *rproc)
+ if (!strcmp(it.node->name, "vdev0buffer"))
+ continue;
+
++ if (!strcmp(it.node->name, "rsc-table"))
++ continue;
++
+ rmem = of_reserved_mem_lookup(it.node);
+ if (!rmem) {
+ dev_err(priv->dev, "unable to acquire memory-region\n");
+diff --git a/drivers/remoteproc/mtk_common.h b/drivers/remoteproc/mtk_common.h
+index 71ce4977cb0b3..ea6fa1100a00b 100644
+--- a/drivers/remoteproc/mtk_common.h
++++ b/drivers/remoteproc/mtk_common.h
+@@ -54,6 +54,8 @@
+ #define MT8192_CORE0_WDT_IRQ 0x10030
+ #define MT8192_CORE0_WDT_CFG 0x10034
+
++#define MT8195_L1TCM_SRAM_PDN_RESERVED_RSI_BITS GENMASK(7, 4)
++
+ #define SCP_FW_VER_LEN 32
+ #define SCP_SHARE_BUFFER_SIZE 288
+
+diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c
+index 38609153bf64c..621174ea7fd62 100644
+--- a/drivers/remoteproc/mtk_scp.c
++++ b/drivers/remoteproc/mtk_scp.c
+@@ -365,22 +365,22 @@ static int mt8183_scp_before_load(struct mtk_scp *scp)
+ return 0;
+ }
+
+-static void mt8192_power_on_sram(void __iomem *addr)
++static void scp_sram_power_on(void __iomem *addr, u32 reserved_mask)
+ {
+ int i;
+
+ for (i = 31; i >= 0; i--)
+- writel(GENMASK(i, 0), addr);
++ writel(GENMASK(i, 0) & ~reserved_mask, addr);
+ writel(0, addr);
+ }
+
+-static void mt8192_power_off_sram(void __iomem *addr)
++static void scp_sram_power_off(void __iomem *addr, u32 reserved_mask)
+ {
+ int i;
+
+ writel(0, addr);
+ for (i = 0; i < 32; i++)
+- writel(GENMASK(i, 0), addr);
++ writel(GENMASK(i, 0) & ~reserved_mask, addr);
+ }
+
+ static int mt8186_scp_before_load(struct mtk_scp *scp)
+@@ -393,7 +393,7 @@ static int mt8186_scp_before_load(struct mtk_scp *scp)
+ writel(0x0, scp->reg_base + MT8183_SCP_CLK_DIV_SEL);
+
+ /* Turn on the power of SCP's SRAM before using it. Enable 1 block per time*/
+- mt8192_power_on_sram(scp->reg_base + MT8183_SCP_SRAM_PDN);
++ scp_sram_power_on(scp->reg_base + MT8183_SCP_SRAM_PDN, 0);
+
+ /* Initialize TCM before loading FW. */
+ writel(0x0, scp->reg_base + MT8183_SCP_L1_SRAM_PD);
+@@ -412,11 +412,32 @@ static int mt8192_scp_before_load(struct mtk_scp *scp)
+ writel(1, scp->reg_base + MT8192_CORE0_SW_RSTN_SET);
+
+ /* enable SRAM clock */
+- mt8192_power_on_sram(scp->reg_base + MT8192_L2TCM_SRAM_PD_0);
+- mt8192_power_on_sram(scp->reg_base + MT8192_L2TCM_SRAM_PD_1);
+- mt8192_power_on_sram(scp->reg_base + MT8192_L2TCM_SRAM_PD_2);
+- mt8192_power_on_sram(scp->reg_base + MT8192_L1TCM_SRAM_PDN);
+- mt8192_power_on_sram(scp->reg_base + MT8192_CPU0_SRAM_PD);
++ scp_sram_power_on(scp->reg_base + MT8192_L2TCM_SRAM_PD_0, 0);
++ scp_sram_power_on(scp->reg_base + MT8192_L2TCM_SRAM_PD_1, 0);
++ scp_sram_power_on(scp->reg_base + MT8192_L2TCM_SRAM_PD_2, 0);
++ scp_sram_power_on(scp->reg_base + MT8192_L1TCM_SRAM_PDN, 0);
++ scp_sram_power_on(scp->reg_base + MT8192_CPU0_SRAM_PD, 0);
++
++ /* enable MPU for all memory regions */
++ writel(0xff, scp->reg_base + MT8192_CORE0_MEM_ATT_PREDEF);
++
++ return 0;
++}
++
++static int mt8195_scp_before_load(struct mtk_scp *scp)
++{
++ /* clear SPM interrupt, SCP2SPM_IPC_CLR */
++ writel(0xff, scp->reg_base + MT8192_SCP2SPM_IPC_CLR);
++
++ writel(1, scp->reg_base + MT8192_CORE0_SW_RSTN_SET);
++
++ /* enable SRAM clock */
++ scp_sram_power_on(scp->reg_base + MT8192_L2TCM_SRAM_PD_0, 0);
++ scp_sram_power_on(scp->reg_base + MT8192_L2TCM_SRAM_PD_1, 0);
++ scp_sram_power_on(scp->reg_base + MT8192_L2TCM_SRAM_PD_2, 0);
++ scp_sram_power_on(scp->reg_base + MT8192_L1TCM_SRAM_PDN,
++ MT8195_L1TCM_SRAM_PDN_RESERVED_RSI_BITS);
++ scp_sram_power_on(scp->reg_base + MT8192_CPU0_SRAM_PD, 0);
+
+ /* enable MPU for all memory regions */
+ writel(0xff, scp->reg_base + MT8192_CORE0_MEM_ATT_PREDEF);
+@@ -572,11 +593,25 @@ static void mt8183_scp_stop(struct mtk_scp *scp)
+ static void mt8192_scp_stop(struct mtk_scp *scp)
+ {
+ /* Disable SRAM clock */
+- mt8192_power_off_sram(scp->reg_base + MT8192_L2TCM_SRAM_PD_0);
+- mt8192_power_off_sram(scp->reg_base + MT8192_L2TCM_SRAM_PD_1);
+- mt8192_power_off_sram(scp->reg_base + MT8192_L2TCM_SRAM_PD_2);
+- mt8192_power_off_sram(scp->reg_base + MT8192_L1TCM_SRAM_PDN);
+- mt8192_power_off_sram(scp->reg_base + MT8192_CPU0_SRAM_PD);
++ scp_sram_power_off(scp->reg_base + MT8192_L2TCM_SRAM_PD_0, 0);
++ scp_sram_power_off(scp->reg_base + MT8192_L2TCM_SRAM_PD_1, 0);
++ scp_sram_power_off(scp->reg_base + MT8192_L2TCM_SRAM_PD_2, 0);
++ scp_sram_power_off(scp->reg_base + MT8192_L1TCM_SRAM_PDN, 0);
++ scp_sram_power_off(scp->reg_base + MT8192_CPU0_SRAM_PD, 0);
++
++ /* Disable SCP watchdog */
++ writel(0, scp->reg_base + MT8192_CORE0_WDT_CFG);
++}
++
++static void mt8195_scp_stop(struct mtk_scp *scp)
++{
++ /* Disable SRAM clock */
++ scp_sram_power_off(scp->reg_base + MT8192_L2TCM_SRAM_PD_0, 0);
++ scp_sram_power_off(scp->reg_base + MT8192_L2TCM_SRAM_PD_1, 0);
++ scp_sram_power_off(scp->reg_base + MT8192_L2TCM_SRAM_PD_2, 0);
++ scp_sram_power_off(scp->reg_base + MT8192_L1TCM_SRAM_PDN,
++ MT8195_L1TCM_SRAM_PDN_RESERVED_RSI_BITS);
++ scp_sram_power_off(scp->reg_base + MT8192_CPU0_SRAM_PD, 0);
+
+ /* Disable SCP watchdog */
+ writel(0, scp->reg_base + MT8192_CORE0_WDT_CFG);
+@@ -877,7 +912,6 @@ static int scp_remove(struct platform_device *pdev)
+ for (i = 0; i < SCP_IPI_MAX; i++)
+ mutex_destroy(&scp->ipi_desc[i].lock);
+ mutex_destroy(&scp->send_lock);
+- rproc_free(scp->rproc);
+
+ return 0;
+ }
+@@ -922,11 +956,11 @@ static const struct mtk_scp_of_data mt8192_of_data = {
+
+ static const struct mtk_scp_of_data mt8195_of_data = {
+ .scp_clk_get = mt8195_scp_clk_get,
+- .scp_before_load = mt8192_scp_before_load,
++ .scp_before_load = mt8195_scp_before_load,
+ .scp_irq_handler = mt8192_scp_irq_handler,
+ .scp_reset_assert = mt8192_scp_reset_assert,
+ .scp_reset_deassert = mt8192_scp_reset_deassert,
+- .scp_stop = mt8192_scp_stop,
++ .scp_stop = mt8195_scp_stop,
+ .scp_da_to_va = mt8192_scp_da_to_va,
+ .host_to_scp_reg = MT8192_GIPC_IN_SET,
+ .host_to_scp_int_bit = MT8192_HOST_IPC_INT_BIT,
+diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
+index 764c980507beb..1957b27c4cf37 100644
+--- a/drivers/rpmsg/qcom_smd.c
++++ b/drivers/rpmsg/qcom_smd.c
+@@ -1407,9 +1407,9 @@ static int qcom_smd_parse_edge(struct device *dev,
+ edge->name = node->name;
+
+ irq = irq_of_parse_and_map(node, 0);
+- if (irq < 0) {
++ if (!irq) {
+ dev_err(dev, "required smd interrupt missing\n");
+- ret = irq;
++ ret = -EINVAL;
+ goto put_node;
+ }
+
+diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
+index 3ede25b1f2e43..905ac7910c98f 100644
+--- a/drivers/rpmsg/virtio_rpmsg_bus.c
++++ b/drivers/rpmsg/virtio_rpmsg_bus.c
+@@ -851,7 +851,7 @@ static struct rpmsg_device *rpmsg_virtio_add_ctrl_dev(struct virtio_device *vdev
+
+ err = rpmsg_ctrldev_register_device(rpdev_ctrl);
+ if (err) {
+- kfree(vch);
++ /* vch will be free in virtio_rpmsg_release_device() */
+ return ERR_PTR(err);
+ }
+
+@@ -862,7 +862,7 @@ static void rpmsg_virtio_del_ctrl_dev(struct rpmsg_device *rpdev_ctrl)
+ {
+ if (!rpdev_ctrl)
+ return;
+- kfree(to_virtio_rpmsg_channel(rpdev_ctrl));
++ device_unregister(&rpdev_ctrl->dev);
+ }
+
+ static int rpmsg_probe(struct virtio_device *vdev)
+@@ -973,7 +973,8 @@ static int rpmsg_probe(struct virtio_device *vdev)
+
+ err = rpmsg_ns_register_device(rpdev_ns);
+ if (err)
+- goto free_vch;
++ /* vch will be free in virtio_rpmsg_release_device() */
++ goto free_ctrldev;
+ }
+
+ /*
+@@ -997,8 +998,6 @@ static int rpmsg_probe(struct virtio_device *vdev)
+
+ return 0;
+
+-free_vch:
+- kfree(vch);
+ free_ctrldev:
+ rpmsg_virtio_del_ctrl_dev(rpdev_ctrl);
+ free_coherent:
+diff --git a/drivers/rtc/rtc-ftrtc010.c b/drivers/rtc/rtc-ftrtc010.c
+index 53bb08fe1cd46..25c6e7d9570f0 100644
+--- a/drivers/rtc/rtc-ftrtc010.c
++++ b/drivers/rtc/rtc-ftrtc010.c
+@@ -137,26 +137,34 @@ static int ftrtc010_rtc_probe(struct platform_device *pdev)
+ ret = clk_prepare_enable(rtc->extclk);
+ if (ret) {
+ dev_err(dev, "failed to enable EXTCLK\n");
+- return ret;
++ goto err_disable_pclk;
+ }
+ }
+
+ rtc->rtc_irq = platform_get_irq(pdev, 0);
+- if (rtc->rtc_irq < 0)
+- return rtc->rtc_irq;
++ if (rtc->rtc_irq < 0) {
++ ret = rtc->rtc_irq;
++ goto err_disable_extclk;
++ }
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+- if (!res)
+- return -ENODEV;
++ if (!res) {
++ ret = -ENODEV;
++ goto err_disable_extclk;
++ }
+
+ rtc->rtc_base = devm_ioremap(dev, res->start,
+ resource_size(res));
+- if (!rtc->rtc_base)
+- return -ENOMEM;
++ if (!rtc->rtc_base) {
++ ret = -ENOMEM;
++ goto err_disable_extclk;
++ }
+
+ rtc->rtc_dev = devm_rtc_allocate_device(dev);
+- if (IS_ERR(rtc->rtc_dev))
+- return PTR_ERR(rtc->rtc_dev);
++ if (IS_ERR(rtc->rtc_dev)) {
++ ret = PTR_ERR(rtc->rtc_dev);
++ goto err_disable_extclk;
++ }
+
+ rtc->rtc_dev->ops = &ftrtc010_rtc_ops;
+
+@@ -172,9 +180,15 @@ static int ftrtc010_rtc_probe(struct platform_device *pdev)
+ ret = devm_request_irq(dev, rtc->rtc_irq, ftrtc010_rtc_interrupt,
+ IRQF_SHARED, pdev->name, dev);
+ if (unlikely(ret))
+- return ret;
++ goto err_disable_extclk;
+
+ return devm_rtc_register_device(rtc->rtc_dev);
++
++err_disable_extclk:
++ clk_disable_unprepare(rtc->extclk);
++err_disable_pclk:
++ clk_disable_unprepare(rtc->pclk);
++ return ret;
+ }
+
+ static int ftrtc010_rtc_remove(struct platform_device *pdev)
+diff --git a/drivers/rtc/rtc-mt6397.c b/drivers/rtc/rtc-mt6397.c
+index 80dc479a6ff02..1d297af80f878 100644
+--- a/drivers/rtc/rtc-mt6397.c
++++ b/drivers/rtc/rtc-mt6397.c
+@@ -269,6 +269,8 @@ static int mtk_rtc_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
++ if (!res)
++ return -EINVAL;
+ rtc->addr_base = res->start;
+
+ rtc->data = of_device_get_match_data(&pdev->dev);
+diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
+index 9897a1aa387b6..9ed67fdd77f97 100644
+--- a/drivers/scsi/lpfc/lpfc_crtn.h
++++ b/drivers/scsi/lpfc/lpfc_crtn.h
+@@ -418,8 +418,6 @@ int lpfc_sli_issue_iocb_wait(struct lpfc_hba *, uint32_t,
+ uint32_t);
+ void lpfc_sli_abort_fcp_cmpl(struct lpfc_hba *, struct lpfc_iocbq *,
+ struct lpfc_iocbq *);
+-void lpfc_sli4_abort_fcp_cmpl(struct lpfc_hba *h, struct lpfc_iocbq *i,
+- struct lpfc_wcqe_complete *w);
+
+ void lpfc_sli_free_hbq(struct lpfc_hba *, struct hbq_dmabuf *);
+
+@@ -627,7 +625,7 @@ void lpfc_nvmet_invalidate_host(struct lpfc_hba *phba,
+ struct lpfc_nodelist *ndlp);
+ void lpfc_nvme_abort_fcreq_cmpl(struct lpfc_hba *phba,
+ struct lpfc_iocbq *cmdiocb,
+- struct lpfc_wcqe_complete *abts_cmpl);
++ struct lpfc_iocbq *rspiocb);
+ void lpfc_create_multixri_pools(struct lpfc_hba *phba);
+ void lpfc_create_destroy_pools(struct lpfc_hba *phba);
+ void lpfc_move_xri_pvt_to_pbl(struct lpfc_hba *phba, u32 hwqid);
+diff --git a/drivers/scsi/lpfc/lpfc_ct.c b/drivers/scsi/lpfc/lpfc_ct.c
+index 4b024aa03c1b2..87124fd652725 100644
+--- a/drivers/scsi/lpfc/lpfc_ct.c
++++ b/drivers/scsi/lpfc/lpfc_ct.c
+@@ -197,7 +197,7 @@ lpfc_ct_reject_event(struct lpfc_nodelist *ndlp,
+ memset(bpl, 0, sizeof(struct ulp_bde64));
+ bpl->addrHigh = le32_to_cpu(putPaddrHigh(mp->phys));
+ bpl->addrLow = le32_to_cpu(putPaddrLow(mp->phys));
+- bpl->tus.f.bdeFlags = BUFF_TYPE_BLP_64;
++ bpl->tus.f.bdeFlags = BUFF_TYPE_BDE_64;
+ bpl->tus.f.bdeSize = (LPFC_CT_PREAMBLE - 4);
+ bpl->tus.w = le32_to_cpu(bpl->tus.w);
+
+diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
+index 011849c1ed3c9..a83e820de5873 100644
+--- a/drivers/scsi/lpfc/lpfc_init.c
++++ b/drivers/scsi/lpfc/lpfc_init.c
+@@ -12063,7 +12063,7 @@ lpfc_sli_enable_msi(struct lpfc_hba *phba)
+ rc = pci_enable_msi(phba->pcidev);
+ if (!rc)
+ lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
+- "0462 PCI enable MSI mode success.\n");
++ "0012 PCI enable MSI mode success.\n");
+ else {
+ lpfc_printf_log(phba, KERN_INFO, LOG_INIT,
+ "0471 PCI enable MSI mode failed (%d)\n", rc);
+diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
+index 8d26f207ebd22..d3a542466e981 100644
+--- a/drivers/scsi/lpfc/lpfc_nvme.c
++++ b/drivers/scsi/lpfc/lpfc_nvme.c
+@@ -1741,7 +1741,7 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port *pnvme_lport,
+ * lpfc_nvme_abort_fcreq_cmpl - Complete an NVME FCP abort request.
+ * @phba: Pointer to HBA context object
+ * @cmdiocb: Pointer to command iocb object.
+- * @abts_cmpl: Pointer to wcqe complete object.
++ * @rspiocb: Pointer to response iocb object.
+ *
+ * This is the callback function for any NVME FCP IO that was aborted.
+ *
+@@ -1750,8 +1750,10 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port *pnvme_lport,
+ **/
+ void
+ lpfc_nvme_abort_fcreq_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+- struct lpfc_wcqe_complete *abts_cmpl)
++ struct lpfc_iocbq *rspiocb)
+ {
++ struct lpfc_wcqe_complete *abts_cmpl = &rspiocb->wcqe_cmpl;
++
+ lpfc_printf_log(phba, KERN_INFO, LOG_NVME,
+ "6145 ABORT_XRI_CN completing on rpi x%x "
+ "original iotag x%x, abort cmd iotag x%x "
+diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
+index f617a2ef6b0f4..add238dc599bc 100644
+--- a/drivers/scsi/lpfc/lpfc_scsi.c
++++ b/drivers/scsi/lpfc/lpfc_scsi.c
+@@ -6316,6 +6316,9 @@ lpfc_device_reset_handler(struct scsi_cmnd *cmnd)
+ int status;
+ u32 logit = LOG_FCP;
+
++ if (!rport)
++ return FAILED;
++
+ rdata = rport->dd_data;
+ if (!rdata || !rdata->pnode) {
+ lpfc_printf_vlog(vport, KERN_ERR, LOG_TRACE_EVENT,
+@@ -6394,6 +6397,9 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd)
+ unsigned long flags;
+ DECLARE_WAIT_QUEUE_HEAD_ONSTACK(waitq);
+
++ if (!rport)
++ return FAILED;
++
+ rdata = rport->dd_data;
+ if (!rdata || !rdata->pnode) {
+ lpfc_printf_vlog(vport, KERN_ERR, LOG_TRACE_EVENT,
+diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
+index 331241a71452f..123a18784aa48 100644
+--- a/drivers/scsi/lpfc/lpfc_sli.c
++++ b/drivers/scsi/lpfc/lpfc_sli.c
+@@ -1930,7 +1930,7 @@ lpfc_issue_cmf_sync_wqe(struct lpfc_hba *phba, u32 ms, u64 total)
+ sync_buf = __lpfc_sli_get_iocbq(phba);
+ if (!sync_buf) {
+ lpfc_printf_log(phba, KERN_ERR, LOG_CGN_MGMT,
+- "6213 No available WQEs for CMF_SYNC_WQE\n");
++ "6244 No available WQEs for CMF_SYNC_WQE\n");
+ ret_val = ENOMEM;
+ goto out_unlock;
+ }
+@@ -3816,7 +3816,7 @@ lpfc_sli_process_sol_iocb(struct lpfc_hba *phba, struct lpfc_sli_ring *pring,
+ set_job_ulpword4(cmdiocbp,
+ IOERR_ABORT_REQUESTED);
+ /*
+- * For SLI4, irsiocb contains
++ * For SLI4, irspiocb contains
+ * NO_XRI in sli_xritag, it
+ * shall not affect releasing
+ * sgl (xri) process.
+@@ -3834,7 +3834,7 @@ lpfc_sli_process_sol_iocb(struct lpfc_hba *phba, struct lpfc_sli_ring *pring,
+ }
+ }
+ }
+- (cmdiocbp->cmd_cmpl) (phba, cmdiocbp, saveq);
++ cmdiocbp->cmd_cmpl(phba, cmdiocbp, saveq);
+ } else
+ lpfc_sli_release_iocbq(phba, cmdiocbp);
+ } else {
+@@ -4074,8 +4074,7 @@ lpfc_sli_handle_fast_ring_event(struct lpfc_hba *phba,
+ cmdiocbq->cmd_flag &= ~LPFC_DRIVER_ABORTED;
+ if (cmdiocbq->cmd_cmpl) {
+ spin_unlock_irqrestore(&phba->hbalock, iflag);
+- (cmdiocbq->cmd_cmpl)(phba, cmdiocbq,
+- &rspiocbq);
++ cmdiocbq->cmd_cmpl(phba, cmdiocbq, &rspiocbq);
+ spin_lock_irqsave(&phba->hbalock, iflag);
+ }
+ break;
+@@ -10304,7 +10303,7 @@ __lpfc_sli_issue_iocb_s3(struct lpfc_hba *phba, uint32_t ring_number,
+ * @flag: Flag indicating if this command can be put into txq.
+ *
+ * __lpfc_sli_issue_fcp_io_s3 is wrapper function to invoke lockless func to
+- * send an iocb command to an HBA with SLI-4 interface spec.
++ * send an iocb command to an HBA with SLI-3 interface spec.
+ *
+ * This function takes the hbalock before invoking the lockless version.
+ * The function will return success after it successfully submit the wqe to
+@@ -12741,7 +12740,7 @@ lpfc_sli_wake_iocb_wait(struct lpfc_hba *phba,
+ cmdiocbq->cmd_cmpl = cmdiocbq->wait_cmd_cmpl;
+ cmdiocbq->wait_cmd_cmpl = NULL;
+ if (cmdiocbq->cmd_cmpl)
+- (cmdiocbq->cmd_cmpl)(phba, cmdiocbq, NULL);
++ cmdiocbq->cmd_cmpl(phba, cmdiocbq, NULL);
+ else
+ lpfc_sli_release_iocbq(phba, cmdiocbq);
+ return;
+@@ -12755,9 +12754,9 @@ lpfc_sli_wake_iocb_wait(struct lpfc_hba *phba,
+
+ /* Set the exchange busy flag for task management commands */
+ if ((cmdiocbq->cmd_flag & LPFC_IO_FCP) &&
+- !(cmdiocbq->cmd_flag & LPFC_IO_LIBDFC)) {
++ !(cmdiocbq->cmd_flag & LPFC_IO_LIBDFC)) {
+ lpfc_cmd = container_of(cmdiocbq, struct lpfc_io_buf,
+- cur_iocbq);
++ cur_iocbq);
+ if (rspiocbq && (rspiocbq->cmd_flag & LPFC_EXCHANGE_BUSY))
+ lpfc_cmd->flags |= LPFC_SBUF_XBUSY;
+ else
+@@ -13897,7 +13896,7 @@ void lpfc_sli4_els_xri_abort_event_proc(struct lpfc_hba *phba)
+ * @irspiocbq: Pointer to work-queue completion queue entry.
+ *
+ * This routine handles an ELS work-queue completion event and construct
+- * a pseudo response ELS IODBQ from the SLI4 ELS WCQE for the common
++ * a pseudo response ELS IOCBQ from the SLI4 ELS WCQE for the common
+ * discovery engine to handle.
+ *
+ * Return: Pointer to the receive IOCBQ, NULL otherwise.
+@@ -13941,7 +13940,7 @@ lpfc_sli4_els_preprocess_rspiocbq(struct lpfc_hba *phba,
+
+ if (bf_get(lpfc_wcqe_c_xb, wcqe)) {
+ spin_lock_irqsave(&phba->hbalock, iflags);
+- cmdiocbq->cmd_flag |= LPFC_EXCHANGE_BUSY;
++ irspiocbq->cmd_flag |= LPFC_EXCHANGE_BUSY;
+ spin_unlock_irqrestore(&phba->hbalock, iflags);
+ }
+
+@@ -14800,7 +14799,7 @@ lpfc_sli4_fp_handle_fcp_wcqe(struct lpfc_hba *phba, struct lpfc_queue *cq,
+ /* Pass the cmd_iocb and the wcqe to the upper layer */
+ memcpy(&cmdiocbq->wcqe_cmpl, wcqe,
+ sizeof(struct lpfc_wcqe_complete));
+- (cmdiocbq->cmd_cmpl)(phba, cmdiocbq, cmdiocbq);
++ cmdiocbq->cmd_cmpl(phba, cmdiocbq, cmdiocbq);
+ } else {
+ lpfc_printf_log(phba, KERN_WARNING, LOG_SLI,
+ "0375 FCP cmdiocb not callback function "
+@@ -18963,7 +18962,7 @@ lpfc_sli4_send_seq_to_ulp(struct lpfc_vport *vport,
+
+ /* Free iocb created in lpfc_prep_seq */
+ list_for_each_entry_safe(curr_iocb, next_iocb,
+- &iocbq->list, list) {
++ &iocbq->list, list) {
+ list_del_init(&curr_iocb->list);
+ lpfc_sli_release_iocbq(phba, curr_iocb);
+ }
+diff --git a/drivers/scsi/myrb.c b/drivers/scsi/myrb.c
+index 71585528e8db9..e885c1dbf61f9 100644
+--- a/drivers/scsi/myrb.c
++++ b/drivers/scsi/myrb.c
+@@ -1239,7 +1239,8 @@ static void myrb_cleanup(struct myrb_hba *cb)
+ myrb_unmap(cb);
+
+ if (cb->mmio_base) {
+- cb->disable_intr(cb->io_base);
++ if (cb->disable_intr)
++ cb->disable_intr(cb->io_base);
+ iounmap(cb->mmio_base);
+ }
+ if (cb->irq)
+@@ -3413,9 +3414,13 @@ static struct myrb_hba *myrb_detect(struct pci_dev *pdev,
+ mutex_init(&cb->dcmd_mutex);
+ mutex_init(&cb->dma_mutex);
+ cb->pdev = pdev;
++ cb->host = shost;
+
+- if (pci_enable_device(pdev))
+- goto failure;
++ if (pci_enable_device(pdev)) {
++ dev_err(&pdev->dev, "Failed to enable PCI device\n");
++ scsi_host_put(shost);
++ return NULL;
++ }
+
+ if (privdata->hw_init == DAC960_PD_hw_init ||
+ privdata->hw_init == DAC960_P_hw_init) {
+diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
+index dc6e55761fd1f..edc27a88a0575 100644
+--- a/drivers/scsi/sd.c
++++ b/drivers/scsi/sd.c
+@@ -3067,7 +3067,7 @@ static void sd_read_cpr(struct scsi_disk *sdkp)
+ goto out;
+
+ /* We must have at least a 64B header and one 32B range descriptor */
+- vpd_len = get_unaligned_be16(&buffer[2]) + 3;
++ vpd_len = get_unaligned_be16(&buffer[2]) + 4;
+ if (vpd_len > buf_len || vpd_len < 64 + 32 || (vpd_len & 31)) {
+ sd_printk(KERN_ERR, sdkp,
+ "Invalid Concurrent Positioning Ranges VPD page\n");
+@@ -3475,7 +3475,7 @@ static int sd_probe(struct device *dev)
+ error = device_add_disk(dev, gd, NULL);
+ if (error) {
+ put_device(&sdkp->disk_dev);
+- blk_cleanup_disk(gd);
++ put_disk(gd);
+ goto out;
+ }
+
+@@ -3501,7 +3501,6 @@ static int sd_probe(struct device *dev)
+ out_put:
+ put_disk(gd);
+ out_free:
+- sd_zbc_release_disk(sdkp);
+ kfree(sdkp);
+ out:
+ scsi_autopm_put_device(sdp);
+diff --git a/drivers/soc/rockchip/grf.c b/drivers/soc/rockchip/grf.c
+index 494cf2b5bf7b6..343ff61ccccbb 100644
+--- a/drivers/soc/rockchip/grf.c
++++ b/drivers/soc/rockchip/grf.c
+@@ -148,12 +148,14 @@ static int __init rockchip_grf_init(void)
+ return -ENODEV;
+ if (!match || !match->data) {
+ pr_err("%s: missing grf data\n", __func__);
++ of_node_put(np);
+ return -EINVAL;
+ }
+
+ grf_info = match->data;
+
+ grf = syscon_node_to_regmap(np);
++ of_node_put(np);
+ if (IS_ERR(grf)) {
+ pr_err("%s: could not get grf syscon\n", __func__);
+ return PTR_ERR(grf);
+diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
+index 63101f1ba2713..32e5fdb823c40 100644
+--- a/drivers/soundwire/intel.c
++++ b/drivers/soundwire/intel.c
+@@ -1293,6 +1293,9 @@ static int intel_link_probe(struct auxiliary_device *auxdev,
+ /* use generic bandwidth allocation algorithm */
+ sdw->cdns.bus.compute_params = sdw_compute_params;
+
++ /* avoid resuming from pm_runtime suspend if it's not required */
++ dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_SUSPEND);
++
+ ret = sdw_bus_master_add(bus, dev, dev->fwnode);
+ if (ret) {
+ dev_err(dev, "sdw_bus_master_add fail: %d\n", ret);
+diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
+index da1ad7ebb1aa7..7b8ef45abee44 100644
+--- a/drivers/soundwire/qcom.c
++++ b/drivers/soundwire/qcom.c
+@@ -105,7 +105,7 @@
+
+ #define SWRM_SPECIAL_CMD_ID 0xF
+ #define MAX_FREQ_NUM 1
+-#define TIMEOUT_MS (2 * HZ)
++#define TIMEOUT_MS 100
+ #define QCOM_SWRM_MAX_RD_LEN 0x1
+ #define QCOM_SDW_MAX_PORTS 14
+ #define DEFAULT_CLK_FREQ 9600000
+@@ -516,6 +516,7 @@ static irqreturn_t qcom_swrm_wake_irq_handler(int irq, void *dev_id)
+ "pm_runtime_get_sync failed in %s, ret %d\n",
+ __func__, ret);
+ pm_runtime_put_noidle(swrm->dev);
++ return ret;
+ }
+
+ if (swrm->wake_irq > 0) {
+@@ -1258,6 +1259,7 @@ static int swrm_reg_show(struct seq_file *s_file, void *data)
+ "pm_runtime_get_sync failed in %s, ret %d\n",
+ __func__, ret);
+ pm_runtime_put_noidle(swrm->dev);
++ return ret;
+ }
+
+ for (reg = 0; reg <= SWR_MSTR_MAX_REG_ADDR; reg += 4) {
+@@ -1452,7 +1454,7 @@ static bool swrm_wait_for_frame_gen_enabled(struct qcom_swrm_ctrl *swrm)
+ } while (retry--);
+
+ dev_err(swrm->dev, "%s: link status not %s\n", __func__,
+- comp_sts && SWRM_FRM_GEN_ENABLED ? "connected" : "disconnected");
++ comp_sts & SWRM_FRM_GEN_ENABLED ? "connected" : "disconnected");
+
+ return false;
+ }
+diff --git a/drivers/spi/spi-fsi.c b/drivers/spi/spi-fsi.c
+index d403a7a3021d0..72ab066ce5523 100644
+--- a/drivers/spi/spi-fsi.c
++++ b/drivers/spi/spi-fsi.c
+@@ -319,12 +319,12 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
+
+ end = jiffies + msecs_to_jiffies(SPI_FSI_STATUS_TIMEOUT_MS);
+ do {
++ if (time_after(jiffies, end))
++ return -ETIMEDOUT;
++
+ rc = fsi_spi_status(ctx, &status, "TX");
+ if (rc)
+ return rc;
+-
+- if (time_after(jiffies, end))
+- return -ETIMEDOUT;
+ } while (status & SPI_FSI_STATUS_TDR_FULL);
+
+ sent += nb;
+@@ -337,12 +337,12 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
+ while (transfer->len > recv) {
+ end = jiffies + msecs_to_jiffies(SPI_FSI_STATUS_TIMEOUT_MS);
+ do {
++ if (time_after(jiffies, end))
++ return -ETIMEDOUT;
++
+ rc = fsi_spi_status(ctx, &status, "RX");
+ if (rc)
+ return rc;
+-
+- if (time_after(jiffies, end))
+- return -ETIMEDOUT;
+ } while (!(status & SPI_FSI_STATUS_RDR_FULL));
+
+ rc = fsi_spi_read_reg(ctx, SPI_FSI_DATA_RX, &in);
+diff --git a/drivers/staging/fieldbus/anybuss/host.c b/drivers/staging/fieldbus/anybuss/host.c
+index a344410e48fe9..cd86b9c9e3458 100644
+--- a/drivers/staging/fieldbus/anybuss/host.c
++++ b/drivers/staging/fieldbus/anybuss/host.c
+@@ -1384,7 +1384,7 @@ anybuss_host_common_probe(struct device *dev,
+ goto err_device;
+ return cd;
+ err_device:
+- device_unregister(&cd->client->dev);
++ put_device(&cd->client->dev);
+ err_kthread:
+ kthread_stop(cd->qthread);
+ err_reset:
+diff --git a/drivers/staging/greybus/audio_codec.c b/drivers/staging/greybus/audio_codec.c
+index b589cf6b1d034..e19b91e7a72ef 100644
+--- a/drivers/staging/greybus/audio_codec.c
++++ b/drivers/staging/greybus/audio_codec.c
+@@ -599,8 +599,8 @@ static int gbcodec_mute_stream(struct snd_soc_dai *dai, int mute, int stream)
+ break;
+ }
+ if (!data) {
+- dev_err(dai->dev, "%s:%s DATA connection missing\n",
+- dai->name, module->name);
++ dev_err(dai->dev, "%s DATA connection missing\n",
++ dai->name);
+ mutex_unlock(&codec->lock);
+ return -ENODEV;
+ }
+diff --git a/drivers/staging/r8188eu/core/rtw_fw.c b/drivers/staging/r8188eu/core/rtw_fw.c
+index 625d186c36471..ce431d8ffea0d 100644
+--- a/drivers/staging/r8188eu/core/rtw_fw.c
++++ b/drivers/staging/r8188eu/core/rtw_fw.c
+@@ -29,7 +29,7 @@ struct rt_firmware_hdr {
+ * FW for different conditions */
+ __le16 Version; /* FW Version */
+ u8 Subversion; /* FW Subversion, default 0x00 */
+- u16 Rsvd1;
++ u8 Rsvd1;
+
+ /* LONG WORD 1 ---- */
+ u8 Month; /* Release time Month field */
+diff --git a/drivers/staging/r8188eu/core/rtw_mlme.c b/drivers/staging/r8188eu/core/rtw_mlme.c
+index 6f0bff1864778..76cf6a69bf0f3 100644
+--- a/drivers/staging/r8188eu/core/rtw_mlme.c
++++ b/drivers/staging/r8188eu/core/rtw_mlme.c
+@@ -1071,8 +1071,10 @@ void rtw_joinbss_event_prehandle(struct adapter *adapter, u8 *pbuf)
+ rtw_indicate_connect(adapter);
+ }
+
++ spin_unlock_bh(&pmlmepriv->lock);
+ /* s5. Cancel assoc_timer */
+ del_timer_sync(&pmlmepriv->assoc_timer);
++ spin_lock_bh(&pmlmepriv->lock);
+ } else {
+ spin_unlock_bh(&pmlmepriv->scanned_queue.lock);
+ goto ignore_joinbss_callback;
+@@ -1310,7 +1312,7 @@ void _rtw_join_timeout_handler (struct adapter *adapter)
+ if (adapter->bDriverStopped || adapter->bSurpriseRemoved)
+ return;
+
+- spin_lock_bh(&pmlmepriv->lock);
++ spin_lock_irq(&pmlmepriv->lock);
+
+ if (rtw_to_roaming(adapter) > 0) { /* join timeout caused by roaming */
+ while (1) {
+@@ -1329,7 +1331,7 @@ void _rtw_join_timeout_handler (struct adapter *adapter)
+ rtw_indicate_disconnect(adapter);
+ free_scanqueue(pmlmepriv);/* */
+ }
+- spin_unlock_bh(&pmlmepriv->lock);
++ spin_unlock_irq(&pmlmepriv->lock);
+
+ }
+
+diff --git a/drivers/staging/r8188eu/core/rtw_xmit.c b/drivers/staging/r8188eu/core/rtw_xmit.c
+index c2a550e7250e8..2ee92bbe66a08 100644
+--- a/drivers/staging/r8188eu/core/rtw_xmit.c
++++ b/drivers/staging/r8188eu/core/rtw_xmit.c
+@@ -178,7 +178,12 @@ s32 _rtw_init_xmit_priv(struct xmit_priv *pxmitpriv, struct adapter *padapter)
+
+ pxmitpriv->free_xmit_extbuf_cnt = num_xmit_extbuf;
+
+- rtw_alloc_hwxmits(padapter);
++ res = rtw_alloc_hwxmits(padapter);
++ if (res) {
++ res = _FAIL;
++ goto exit;
++ }
++
+ rtw_init_hwxmits(pxmitpriv->hwxmits, pxmitpriv->hwxmit_entry);
+
+ for (i = 0; i < 4; i++)
+@@ -1474,7 +1479,7 @@ exit:
+ return res;
+ }
+
+-void rtw_alloc_hwxmits(struct adapter *padapter)
++int rtw_alloc_hwxmits(struct adapter *padapter)
+ {
+ struct hw_xmit *hwxmits;
+ struct xmit_priv *pxmitpriv = &padapter->xmitpriv;
+@@ -1482,6 +1487,8 @@ void rtw_alloc_hwxmits(struct adapter *padapter)
+ pxmitpriv->hwxmit_entry = HWXMIT_ENTRY;
+
+ pxmitpriv->hwxmits = kzalloc(sizeof(struct hw_xmit) * pxmitpriv->hwxmit_entry, GFP_KERNEL);
++ if (!pxmitpriv->hwxmits)
++ return -ENOMEM;
+
+ hwxmits = pxmitpriv->hwxmits;
+
+@@ -1498,6 +1505,8 @@ void rtw_alloc_hwxmits(struct adapter *padapter)
+ hwxmits[3] .sta_queue = &pxmitpriv->bk_pending;
+ } else {
+ }
++
++ return 0;
+ }
+
+ void rtw_free_hwxmits(struct adapter *padapter)
+diff --git a/drivers/staging/r8188eu/include/rtw_xmit.h b/drivers/staging/r8188eu/include/rtw_xmit.h
+index b2df1480d66b3..e73632972900d 100644
+--- a/drivers/staging/r8188eu/include/rtw_xmit.h
++++ b/drivers/staging/r8188eu/include/rtw_xmit.h
+@@ -341,7 +341,7 @@ s32 rtw_txframes_sta_ac_pending(struct adapter *padapter,
+ void rtw_init_hwxmits(struct hw_xmit *phwxmit, int entry);
+ s32 _rtw_init_xmit_priv(struct xmit_priv *pxmitpriv, struct adapter *padapter);
+ void _rtw_free_xmit_priv(struct xmit_priv *pxmitpriv);
+-void rtw_alloc_hwxmits(struct adapter *padapter);
++int rtw_alloc_hwxmits(struct adapter *padapter);
+ void rtw_free_hwxmits(struct adapter *padapter);
+ s32 rtw_xmit(struct adapter *padapter, struct sk_buff **pkt);
+
+diff --git a/drivers/staging/rtl8192e/rtllib_softmac.c b/drivers/staging/rtl8192e/rtllib_softmac.c
+index 4b6c2295a3cf7..b5a38f0a8d79b 100644
+--- a/drivers/staging/rtl8192e/rtllib_softmac.c
++++ b/drivers/staging/rtl8192e/rtllib_softmac.c
+@@ -651,9 +651,9 @@ static void rtllib_beacons_stop(struct rtllib_device *ieee)
+ spin_lock_irqsave(&ieee->beacon_lock, flags);
+
+ ieee->beacon_txing = 0;
+- del_timer_sync(&ieee->beacon_timer);
+
+ spin_unlock_irqrestore(&ieee->beacon_lock, flags);
++ del_timer_sync(&ieee->beacon_timer);
+
+ }
+
+diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
+index 1a43979939a8a..79f3fbe25556a 100644
+--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
++++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac.c
+@@ -528,9 +528,9 @@ static void ieee80211_beacons_stop(struct ieee80211_device *ieee)
+ spin_lock_irqsave(&ieee->beacon_lock, flags);
+
+ ieee->beacon_txing = 0;
+- del_timer_sync(&ieee->beacon_timer);
+
+ spin_unlock_irqrestore(&ieee->beacon_lock, flags);
++ del_timer_sync(&ieee->beacon_timer);
+ }
+
+ void ieee80211_stop_send_beacons(struct ieee80211_device *ieee)
+diff --git a/drivers/staging/rtl8712/os_intfs.c b/drivers/staging/rtl8712/os_intfs.c
+index d15d52c0d1a74..003e972051240 100644
+--- a/drivers/staging/rtl8712/os_intfs.c
++++ b/drivers/staging/rtl8712/os_intfs.c
+@@ -332,7 +332,6 @@ void r8712_free_drv_sw(struct _adapter *padapter)
+ r8712_free_evt_priv(&padapter->evtpriv);
+ r8712_DeInitSwLeds(padapter);
+ r8712_free_mlme_priv(&padapter->mlmepriv);
+- r8712_free_io_queue(padapter);
+ _free_xmit_priv(&padapter->xmitpriv);
+ _r8712_free_sta_priv(&padapter->stapriv);
+ _r8712_free_recv_priv(&padapter->recvpriv);
+diff --git a/drivers/staging/rtl8712/usb_intf.c b/drivers/staging/rtl8712/usb_intf.c
+index ee4c61f85a076..1ff3e2658e77e 100644
+--- a/drivers/staging/rtl8712/usb_intf.c
++++ b/drivers/staging/rtl8712/usb_intf.c
+@@ -265,6 +265,7 @@ static uint r8712_usb_dvobj_init(struct _adapter *padapter)
+
+ static void r8712_usb_dvobj_deinit(struct _adapter *padapter)
+ {
++ r8712_free_io_queue(padapter);
+ }
+
+ void rtl871x_intf_stop(struct _adapter *padapter)
+@@ -302,9 +303,6 @@ void r871x_dev_unload(struct _adapter *padapter)
+ rtl8712_hal_deinit(padapter);
+ }
+
+- /*s6.*/
+- if (padapter->dvobj_deinit)
+- padapter->dvobj_deinit(padapter);
+ padapter->bup = false;
+ }
+ }
+@@ -538,13 +536,13 @@ static int r871xu_drv_init(struct usb_interface *pusb_intf,
+ } else {
+ AutoloadFail = false;
+ }
+- if (((mac[0] == 0xff) && (mac[1] == 0xff) &&
++ if ((!AutoloadFail) ||
++ ((mac[0] == 0xff) && (mac[1] == 0xff) &&
+ (mac[2] == 0xff) && (mac[3] == 0xff) &&
+ (mac[4] == 0xff) && (mac[5] == 0xff)) ||
+ ((mac[0] == 0x00) && (mac[1] == 0x00) &&
+ (mac[2] == 0x00) && (mac[3] == 0x00) &&
+- (mac[4] == 0x00) && (mac[5] == 0x00)) ||
+- (!AutoloadFail)) {
++ (mac[4] == 0x00) && (mac[5] == 0x00))) {
+ mac[0] = 0x00;
+ mac[1] = 0xe0;
+ mac[2] = 0x4c;
+@@ -607,6 +605,8 @@ static void r871xu_dev_remove(struct usb_interface *pusb_intf)
+ /* Stop driver mlme relation timer */
+ r8712_stop_drv_timers(padapter);
+ r871x_dev_unload(padapter);
++ if (padapter->dvobj_deinit)
++ padapter->dvobj_deinit(padapter);
+ r8712_free_drv_sw(padapter);
+ free_netdev(pnetdev);
+
+diff --git a/drivers/staging/rtl8712/usb_ops.c b/drivers/staging/rtl8712/usb_ops.c
+index e64845e6adf3d..af9966d03979c 100644
+--- a/drivers/staging/rtl8712/usb_ops.c
++++ b/drivers/staging/rtl8712/usb_ops.c
+@@ -29,7 +29,8 @@ static u8 usb_read8(struct intf_hdl *intfhdl, u32 addr)
+ u16 wvalue;
+ u16 index;
+ u16 len;
+- __le32 data;
++ int status;
++ __le32 data = 0;
+ struct intf_priv *intfpriv = intfhdl->pintfpriv;
+
+ request = 0x05;
+@@ -37,8 +38,10 @@ static u8 usb_read8(struct intf_hdl *intfhdl, u32 addr)
+ index = 0;
+ wvalue = (u16)(addr & 0x0000ffff);
+ len = 1;
+- r8712_usbctrl_vendorreq(intfpriv, request, wvalue, index, &data, len,
+- requesttype);
++ status = r8712_usbctrl_vendorreq(intfpriv, request, wvalue, index,
++ &data, len, requesttype);
++ if (status < 0)
++ return 0;
+ return (u8)(le32_to_cpu(data) & 0x0ff);
+ }
+
+@@ -49,7 +52,8 @@ static u16 usb_read16(struct intf_hdl *intfhdl, u32 addr)
+ u16 wvalue;
+ u16 index;
+ u16 len;
+- __le32 data;
++ int status;
++ __le32 data = 0;
+ struct intf_priv *intfpriv = intfhdl->pintfpriv;
+
+ request = 0x05;
+@@ -57,8 +61,10 @@ static u16 usb_read16(struct intf_hdl *intfhdl, u32 addr)
+ index = 0;
+ wvalue = (u16)(addr & 0x0000ffff);
+ len = 2;
+- r8712_usbctrl_vendorreq(intfpriv, request, wvalue, index, &data, len,
+- requesttype);
++ status = r8712_usbctrl_vendorreq(intfpriv, request, wvalue, index,
++ &data, len, requesttype);
++ if (status < 0)
++ return 0;
+ return (u16)(le32_to_cpu(data) & 0xffff);
+ }
+
+@@ -69,7 +75,8 @@ static u32 usb_read32(struct intf_hdl *intfhdl, u32 addr)
+ u16 wvalue;
+ u16 index;
+ u16 len;
+- __le32 data;
++ int status;
++ __le32 data = 0;
+ struct intf_priv *intfpriv = intfhdl->pintfpriv;
+
+ request = 0x05;
+@@ -77,8 +84,10 @@ static u32 usb_read32(struct intf_hdl *intfhdl, u32 addr)
+ index = 0;
+ wvalue = (u16)(addr & 0x0000ffff);
+ len = 4;
+- r8712_usbctrl_vendorreq(intfpriv, request, wvalue, index, &data, len,
+- requesttype);
++ status = r8712_usbctrl_vendorreq(intfpriv, request, wvalue, index,
++ &data, len, requesttype);
++ if (status < 0)
++ return 0;
+ return le32_to_cpu(data);
+ }
+
+diff --git a/drivers/staging/rtl8723bs/core/rtw_mlme.c b/drivers/staging/rtl8723bs/core/rtw_mlme.c
+index ed2d3b7d44d9a..24d6af886f72e 100644
+--- a/drivers/staging/rtl8723bs/core/rtw_mlme.c
++++ b/drivers/staging/rtl8723bs/core/rtw_mlme.c
+@@ -751,7 +751,9 @@ void rtw_surveydone_event_callback(struct adapter *adapter, u8 *pbuf)
+ }
+
+ if (check_fwstate(pmlmepriv, _FW_UNDER_SURVEY)) {
++ spin_unlock_bh(&pmlmepriv->lock);
+ del_timer_sync(&pmlmepriv->scan_to_timer);
++ spin_lock_bh(&pmlmepriv->lock);
+ _clr_fwstate_(pmlmepriv, _FW_UNDER_SURVEY);
+ }
+
+@@ -1238,8 +1240,10 @@ void rtw_joinbss_event_prehandle(struct adapter *adapter, u8 *pbuf)
+
+ spin_unlock_bh(&pmlmepriv->scanned_queue.lock);
+
++ spin_unlock_bh(&pmlmepriv->lock);
+ /* s5. Cancel assoc_timer */
+ del_timer_sync(&pmlmepriv->assoc_timer);
++ spin_lock_bh(&pmlmepriv->lock);
+ } else {
+ spin_unlock_bh(&(pmlmepriv->scanned_queue.lock));
+ }
+@@ -1545,7 +1549,7 @@ void _rtw_join_timeout_handler(struct timer_list *t)
+ if (adapter->bDriverStopped || adapter->bSurpriseRemoved)
+ return;
+
+- spin_lock_bh(&pmlmepriv->lock);
++ spin_lock_irq(&pmlmepriv->lock);
+
+ if (rtw_to_roam(adapter) > 0) { /* join timeout caused by roaming */
+ while (1) {
+@@ -1573,7 +1577,7 @@ void _rtw_join_timeout_handler(struct timer_list *t)
+
+ }
+
+- spin_unlock_bh(&pmlmepriv->lock);
++ spin_unlock_irq(&pmlmepriv->lock);
+ }
+
+ /*
+@@ -1586,11 +1590,11 @@ void rtw_scan_timeout_handler(struct timer_list *t)
+ mlmepriv.scan_to_timer);
+ struct mlme_priv *pmlmepriv = &adapter->mlmepriv;
+
+- spin_lock_bh(&pmlmepriv->lock);
++ spin_lock_irq(&pmlmepriv->lock);
+
+ _clr_fwstate_(pmlmepriv, _FW_UNDER_SURVEY);
+
+- spin_unlock_bh(&pmlmepriv->lock);
++ spin_unlock_irq(&pmlmepriv->lock);
+
+ rtw_indicate_scan_done(adapter, true);
+ }
+diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
+index 9beb47b31c75a..44d04b651a8b8 100644
+--- a/drivers/thunderbolt/tb.c
++++ b/drivers/thunderbolt/tb.c
+@@ -867,7 +867,7 @@ static struct tb_port *tb_find_dp_out(struct tb *tb, struct tb_port *in)
+
+ static void tb_tunnel_dp(struct tb *tb)
+ {
+- int available_up, available_down, ret;
++ int available_up, available_down, ret, link_nr;
+ struct tb_cm *tcm = tb_priv(tb);
+ struct tb_port *port, *in, *out;
+ struct tb_tunnel *tunnel;
+@@ -912,6 +912,20 @@ static void tb_tunnel_dp(struct tb *tb)
+ return;
+ }
+
++ /*
++ * This is only applicable to links that are not bonded (so
++ * when Thunderbolt 1 hardware is involved somewhere in the
++ * topology). For these try to share the DP bandwidth between
++ * the two lanes.
++ */
++ link_nr = 1;
++ list_for_each_entry(tunnel, &tcm->tunnel_list, list) {
++ if (tb_tunnel_is_dp(tunnel)) {
++ link_nr = 0;
++ break;
++ }
++ }
++
+ /*
+ * DP stream needs the domain to be active so runtime resume
+ * both ends of the tunnel.
+@@ -943,7 +957,8 @@ static void tb_tunnel_dp(struct tb *tb)
+ tb_dbg(tb, "available bandwidth for new DP tunnel %u/%u Mb/s\n",
+ available_up, available_down);
+
+- tunnel = tb_tunnel_alloc_dp(tb, in, out, available_up, available_down);
++ tunnel = tb_tunnel_alloc_dp(tb, in, out, link_nr, available_up,
++ available_down);
+ if (!tunnel) {
+ tb_port_dbg(out, "could not allocate DP tunnel\n");
+ goto err_reclaim;
+diff --git a/drivers/thunderbolt/test.c b/drivers/thunderbolt/test.c
+index 1f69bab236ee9..66b6e665e96f0 100644
+--- a/drivers/thunderbolt/test.c
++++ b/drivers/thunderbolt/test.c
+@@ -1348,7 +1348,7 @@ static void tb_test_tunnel_dp(struct kunit *test)
+ in = &host->ports[5];
+ out = &dev->ports[13];
+
+- tunnel = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ tunnel = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, tunnel != NULL);
+ KUNIT_EXPECT_EQ(test, tunnel->type, TB_TUNNEL_DP);
+ KUNIT_EXPECT_PTR_EQ(test, tunnel->src_port, in);
+@@ -1394,7 +1394,7 @@ static void tb_test_tunnel_dp_chain(struct kunit *test)
+ in = &host->ports[5];
+ out = &dev4->ports[14];
+
+- tunnel = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ tunnel = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, tunnel != NULL);
+ KUNIT_EXPECT_EQ(test, tunnel->type, TB_TUNNEL_DP);
+ KUNIT_EXPECT_PTR_EQ(test, tunnel->src_port, in);
+@@ -1444,7 +1444,7 @@ static void tb_test_tunnel_dp_tree(struct kunit *test)
+ in = &dev2->ports[13];
+ out = &dev5->ports[13];
+
+- tunnel = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ tunnel = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, tunnel != NULL);
+ KUNIT_EXPECT_EQ(test, tunnel->type, TB_TUNNEL_DP);
+ KUNIT_EXPECT_PTR_EQ(test, tunnel->src_port, in);
+@@ -1509,7 +1509,7 @@ static void tb_test_tunnel_dp_max_length(struct kunit *test)
+ in = &dev6->ports[13];
+ out = &dev12->ports[13];
+
+- tunnel = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ tunnel = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, tunnel != NULL);
+ KUNIT_EXPECT_EQ(test, tunnel->type, TB_TUNNEL_DP);
+ KUNIT_EXPECT_PTR_EQ(test, tunnel->src_port, in);
+@@ -1627,7 +1627,7 @@ static void tb_test_tunnel_port_on_path(struct kunit *test)
+ in = &dev2->ports[13];
+ out = &dev5->ports[13];
+
+- dp_tunnel = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ dp_tunnel = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, dp_tunnel != NULL);
+
+ KUNIT_EXPECT_TRUE(test, tb_tunnel_port_on_path(dp_tunnel, in));
+@@ -2009,7 +2009,7 @@ static void tb_test_credit_alloc_dp(struct kunit *test)
+ in = &host->ports[5];
+ out = &dev->ports[14];
+
+- tunnel = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ tunnel = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, tunnel != NULL);
+ KUNIT_ASSERT_EQ(test, tunnel->npaths, (size_t)3);
+
+@@ -2245,7 +2245,7 @@ static struct tb_tunnel *TB_TEST_DP_TUNNEL1(struct kunit *test,
+
+ in = &host->ports[5];
+ out = &dev->ports[13];
+- dp_tunnel1 = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ dp_tunnel1 = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, dp_tunnel1 != NULL);
+ KUNIT_ASSERT_EQ(test, dp_tunnel1->npaths, (size_t)3);
+
+@@ -2282,7 +2282,7 @@ static struct tb_tunnel *TB_TEST_DP_TUNNEL2(struct kunit *test,
+
+ in = &host->ports[6];
+ out = &dev->ports[14];
+- dp_tunnel2 = tb_tunnel_alloc_dp(NULL, in, out, 0, 0);
++ dp_tunnel2 = tb_tunnel_alloc_dp(NULL, in, out, 1, 0, 0);
+ KUNIT_ASSERT_TRUE(test, dp_tunnel2 != NULL);
+ KUNIT_ASSERT_EQ(test, dp_tunnel2->npaths, (size_t)3);
+
+diff --git a/drivers/thunderbolt/tunnel.c b/drivers/thunderbolt/tunnel.c
+index 118742ec93ed7..8ccd70920b6a6 100644
+--- a/drivers/thunderbolt/tunnel.c
++++ b/drivers/thunderbolt/tunnel.c
+@@ -858,6 +858,7 @@ err_free:
+ * @tb: Pointer to the domain structure
+ * @in: DP in adapter port
+ * @out: DP out adapter port
++ * @link_nr: Preferred lane adapter when the link is not bonded
+ * @max_up: Maximum available upstream bandwidth for the DP tunnel (%0
+ * if not limited)
+ * @max_down: Maximum available downstream bandwidth for the DP tunnel
+@@ -869,8 +870,8 @@ err_free:
+ * Return: Returns a tb_tunnel on success or NULL on failure.
+ */
+ struct tb_tunnel *tb_tunnel_alloc_dp(struct tb *tb, struct tb_port *in,
+- struct tb_port *out, int max_up,
+- int max_down)
++ struct tb_port *out, int link_nr,
++ int max_up, int max_down)
+ {
+ struct tb_tunnel *tunnel;
+ struct tb_path **paths;
+@@ -894,21 +895,21 @@ struct tb_tunnel *tb_tunnel_alloc_dp(struct tb *tb, struct tb_port *in,
+ paths = tunnel->paths;
+
+ path = tb_path_alloc(tb, in, TB_DP_VIDEO_HOPID, out, TB_DP_VIDEO_HOPID,
+- 1, "Video");
++ link_nr, "Video");
+ if (!path)
+ goto err_free;
+ tb_dp_init_video_path(path);
+ paths[TB_DP_VIDEO_PATH_OUT] = path;
+
+ path = tb_path_alloc(tb, in, TB_DP_AUX_TX_HOPID, out,
+- TB_DP_AUX_TX_HOPID, 1, "AUX TX");
++ TB_DP_AUX_TX_HOPID, link_nr, "AUX TX");
+ if (!path)
+ goto err_free;
+ tb_dp_init_aux_path(path);
+ paths[TB_DP_AUX_PATH_OUT] = path;
+
+ path = tb_path_alloc(tb, out, TB_DP_AUX_RX_HOPID, in,
+- TB_DP_AUX_RX_HOPID, 1, "AUX RX");
++ TB_DP_AUX_RX_HOPID, link_nr, "AUX RX");
+ if (!path)
+ goto err_free;
+ tb_dp_init_aux_path(path);
+diff --git a/drivers/thunderbolt/tunnel.h b/drivers/thunderbolt/tunnel.h
+index 03e56076b5bcf..bb4d1f1d6d0b0 100644
+--- a/drivers/thunderbolt/tunnel.h
++++ b/drivers/thunderbolt/tunnel.h
+@@ -71,8 +71,8 @@ struct tb_tunnel *tb_tunnel_alloc_pci(struct tb *tb, struct tb_port *up,
+ struct tb_tunnel *tb_tunnel_discover_dp(struct tb *tb, struct tb_port *in,
+ bool alloc_hopid);
+ struct tb_tunnel *tb_tunnel_alloc_dp(struct tb *tb, struct tb_port *in,
+- struct tb_port *out, int max_up,
+- int max_down);
++ struct tb_port *out, int link_nr,
++ int max_up, int max_down);
+ struct tb_tunnel *tb_tunnel_alloc_dma(struct tb *tb, struct tb_port *nhi,
+ struct tb_port *dst, int transmit_path,
+ int transmit_ring, int receive_path,
+diff --git a/drivers/tty/goldfish.c b/drivers/tty/goldfish.c
+index 9e8ccb8ed6d69..c7968aecd8702 100644
+--- a/drivers/tty/goldfish.c
++++ b/drivers/tty/goldfish.c
+@@ -405,6 +405,7 @@ static int goldfish_tty_probe(struct platform_device *pdev)
+ err_tty_register_device_failed:
+ free_irq(irq, qtty);
+ err_dec_line_count:
++ tty_port_destroy(&qtty->port);
+ goldfish_tty_current_line_count--;
+ if (goldfish_tty_current_line_count == 0)
+ goldfish_tty_delete_driver();
+@@ -426,6 +427,7 @@ static int goldfish_tty_remove(struct platform_device *pdev)
+ iounmap(qtty->base);
+ qtty->base = NULL;
+ free_irq(qtty->irq, pdev);
++ tty_port_destroy(&qtty->port);
+ goldfish_tty_current_line_count--;
+ if (goldfish_tty_current_line_count == 0)
+ goldfish_tty_delete_driver();
+diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
+index efc72104c8400..bdc314aeab886 100644
+--- a/drivers/tty/n_tty.c
++++ b/drivers/tty/n_tty.c
+@@ -1975,6 +1975,35 @@ static bool canon_copy_from_read_buf(struct tty_struct *tty,
+ return ldata->read_tail != canon_head;
+ }
+
++/*
++ * If we finished a read at the exact location of an
++ * EOF (special EOL character that's a __DISABLED_CHAR)
++ * in the stream, silently eat the EOF.
++ */
++static void canon_skip_eof(struct tty_struct *tty)
++{
++ struct n_tty_data *ldata = tty->disc_data;
++ size_t tail, canon_head;
++
++ canon_head = smp_load_acquire(&ldata->canon_head);
++ tail = ldata->read_tail;
++
++ // No data?
++ if (tail == canon_head)
++ return;
++
++ // See if the tail position is EOF in the circular buffer
++ tail &= (N_TTY_BUF_SIZE - 1);
++ if (!test_bit(tail, ldata->read_flags))
++ return;
++ if (read_buf(ldata, tail) != __DISABLED_CHAR)
++ return;
++
++ // Clear the EOL bit, skip the EOF char.
++ clear_bit(tail, ldata->read_flags);
++ smp_store_release(&ldata->read_tail, ldata->read_tail + 1);
++}
++
+ /**
+ * job_control - check job control
+ * @tty: tty
+@@ -2045,7 +2074,14 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
+ */
+ if (*cookie) {
+ if (ldata->icanon && !L_EXTPROC(tty)) {
+- if (canon_copy_from_read_buf(tty, &kb, &nr))
++ /*
++ * If we have filled the user buffer, see
++ * if we should skip an EOF character before
++ * releasing the lock and returning done.
++ */
++ if (!nr)
++ canon_skip_eof(tty);
++ else if (canon_copy_from_read_buf(tty, &kb, &nr))
+ return kb - kbuf;
+ } else {
+ if (copy_from_read_buf(tty, &kb, &nr))
+diff --git a/drivers/tty/serial/8250/8250_aspeed_vuart.c b/drivers/tty/serial/8250/8250_aspeed_vuart.c
+index 93fe10c680fbe..9d2a7856784f7 100644
+--- a/drivers/tty/serial/8250/8250_aspeed_vuart.c
++++ b/drivers/tty/serial/8250/8250_aspeed_vuart.c
+@@ -429,6 +429,8 @@ static int aspeed_vuart_probe(struct platform_device *pdev)
+ timer_setup(&vuart->unthrottle_timer, aspeed_vuart_unthrottle_exp, 0);
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
++ if (!res)
++ return -EINVAL;
+
+ memset(&port, 0, sizeof(port));
+ port.port.private_data = vuart;
+diff --git a/drivers/tty/serial/8250/8250_fintek.c b/drivers/tty/serial/8250/8250_fintek.c
+index 251f0018ae8ca..dba5950b8d0e2 100644
+--- a/drivers/tty/serial/8250/8250_fintek.c
++++ b/drivers/tty/serial/8250/8250_fintek.c
+@@ -200,12 +200,12 @@ static int fintek_8250_rs485_config(struct uart_port *port,
+ if (!pdata)
+ return -EINVAL;
+
+- /* Hardware do not support same RTS level on send and receive */
+- if (!(rs485->flags & SER_RS485_RTS_ON_SEND) ==
+- !(rs485->flags & SER_RS485_RTS_AFTER_SEND))
+- return -EINVAL;
+
+ if (rs485->flags & SER_RS485_ENABLED) {
++ /* Hardware do not support same RTS level on send and receive */
++ if (!(rs485->flags & SER_RS485_RTS_ON_SEND) ==
++ !(rs485->flags & SER_RS485_RTS_AFTER_SEND))
++ return -EINVAL;
+ memset(rs485->padding, 0, sizeof(rs485->padding));
+ config |= RS485_URA;
+ } else {
+diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c
+index 21053db93ff1e..54051ec7b4992 100644
+--- a/drivers/tty/serial/8250/8250_mtk.c
++++ b/drivers/tty/serial/8250/8250_mtk.c
+@@ -54,9 +54,6 @@
+ #define MTK_UART_TX_TRIGGER 1
+ #define MTK_UART_RX_TRIGGER MTK_UART_RX_SIZE
+
+-#define MTK_UART_FEATURE_SEL 39 /* Feature Selection register */
+-#define MTK_UART_FEAT_NEWRMAP BIT(0) /* Use new register map */
+-
+ #define MTK_UART_XON1 40 /* I/O: Xon character 1 */
+ #define MTK_UART_XOFF1 42 /* I/O: Xoff character 1 */
+
+@@ -575,10 +572,6 @@ static int mtk8250_probe(struct platform_device *pdev)
+ uart.dma = data->dma;
+ #endif
+
+- /* Set AP UART new register map */
+- writel(MTK_UART_FEAT_NEWRMAP, uart.port.membase +
+- (MTK_UART_FEATURE_SEL << uart.port.regshift));
+-
+ /* Disable Rate Fix function */
+ writel(0x0, uart.port.membase +
+ (MTK_UART_RATE_FIX << uart.port.regshift));
+diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
+index d6d3db9c3b1f8..db07d6a5d764d 100644
+--- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
++++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
+@@ -1247,7 +1247,7 @@ static int cpm_uart_init_port(struct device_node *np,
+ }
+
+ #ifdef CONFIG_PPC_EARLY_DEBUG_CPM
+-#ifdef CONFIG_CONSOLE_POLL
++#if defined(CONFIG_CONSOLE_POLL) && defined(CONFIG_SERIAL_CPM_CONSOLE)
+ if (!udbg_port)
+ #endif
+ udbg_putc = NULL;
+diff --git a/drivers/tty/serial/digicolor-usart.c b/drivers/tty/serial/digicolor-usart.c
+index e37a917b9dbbc..af951e6a2ef4e 100644
+--- a/drivers/tty/serial/digicolor-usart.c
++++ b/drivers/tty/serial/digicolor-usart.c
+@@ -309,6 +309,8 @@ static void digicolor_uart_set_termios(struct uart_port *port,
+ case CS8:
+ default:
+ config |= UA_CONFIG_CHAR_LEN;
++ termios->c_cflag &= ~CSIZE;
++ termios->c_cflag |= CS8;
+ break;
+ }
+
+diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
+index be12fee94db55..2cb89491dd09f 100644
+--- a/drivers/tty/serial/fsl_lpuart.c
++++ b/drivers/tty/serial/fsl_lpuart.c
+@@ -239,8 +239,6 @@
+ /* IMX lpuart has four extra unused regs located at the beginning */
+ #define IMX_REG_OFF 0x10
+
+-static DEFINE_IDA(fsl_lpuart_ida);
+-
+ enum lpuart_type {
+ VF610_LPUART,
+ LS1021A_LPUART,
+@@ -276,7 +274,6 @@ struct lpuart_port {
+ int rx_dma_rng_buf_len;
+ unsigned int dma_tx_nents;
+ wait_queue_head_t dma_wait;
+- bool id_allocated;
+ };
+
+ struct lpuart_soc_data {
+@@ -2717,23 +2714,18 @@ static int lpuart_probe(struct platform_device *pdev)
+
+ ret = of_alias_get_id(np, "serial");
+ if (ret < 0) {
+- ret = ida_simple_get(&fsl_lpuart_ida, 0, UART_NR, GFP_KERNEL);
+- if (ret < 0) {
+- dev_err(&pdev->dev, "port line is full, add device failed\n");
+- return ret;
+- }
+- sport->id_allocated = true;
++ dev_err(&pdev->dev, "failed to get alias id, errno %d\n", ret);
++ return ret;
+ }
+ if (ret >= ARRAY_SIZE(lpuart_ports)) {
+ dev_err(&pdev->dev, "serial%d out of range\n", ret);
+- ret = -EINVAL;
+- goto failed_out_of_range;
++ return -EINVAL;
+ }
+ sport->port.line = ret;
+
+ ret = lpuart_enable_clks(sport);
+ if (ret)
+- goto failed_clock_enable;
++ return ret;
+ sport->port.uartclk = lpuart_get_baud_clk_rate(sport);
+
+ lpuart_ports[sport->port.line] = sport;
+@@ -2781,10 +2773,6 @@ failed_reset:
+ uart_remove_one_port(&lpuart_reg, &sport->port);
+ failed_attach_port:
+ lpuart_disable_clks(sport);
+-failed_clock_enable:
+-failed_out_of_range:
+- if (sport->id_allocated)
+- ida_simple_remove(&fsl_lpuart_ida, sport->port.line);
+ return ret;
+ }
+
+@@ -2794,9 +2782,6 @@ static int lpuart_remove(struct platform_device *pdev)
+
+ uart_remove_one_port(&lpuart_reg, &sport->port);
+
+- if (sport->id_allocated)
+- ida_simple_remove(&fsl_lpuart_ida, sport->port.line);
+-
+ lpuart_disable_clks(sport);
+
+ if (sport->dma_tx_chan)
+@@ -2926,7 +2911,6 @@ static int __init lpuart_serial_init(void)
+
+ static void __exit lpuart_serial_exit(void)
+ {
+- ida_destroy(&fsl_lpuart_ida);
+ platform_driver_unregister(&lpuart_driver);
+ uart_unregister_driver(&lpuart_reg);
+ }
+diff --git a/drivers/tty/serial/icom.c b/drivers/tty/serial/icom.c
+index 03a2fe9f4c9a9..02b375ba2f078 100644
+--- a/drivers/tty/serial/icom.c
++++ b/drivers/tty/serial/icom.c
+@@ -1501,7 +1501,7 @@ static int icom_probe(struct pci_dev *dev,
+ retval = pci_read_config_dword(dev, PCI_COMMAND, &command_reg);
+ if (retval) {
+ dev_err(&dev->dev, "PCI Config read FAILED\n");
+- return retval;
++ goto probe_exit0;
+ }
+
+ pci_write_config_dword(dev, PCI_COMMAND,
+diff --git a/drivers/tty/serial/meson_uart.c b/drivers/tty/serial/meson_uart.c
+index 2bf1c57e0981e..39021dac09cc2 100644
+--- a/drivers/tty/serial/meson_uart.c
++++ b/drivers/tty/serial/meson_uart.c
+@@ -253,6 +253,14 @@ static const char *meson_uart_type(struct uart_port *port)
+ return (port->type == PORT_MESON) ? "meson_uart" : NULL;
+ }
+
++/*
++ * This function is called only from probe() using a temporary io mapping
++ * in order to perform a reset before setting up the device. Since the
++ * temporarily mapped region was successfully requested, there can be no
++ * console on this port at this time. Hence it is not necessary for this
++ * function to acquire the port->lock. (Since there is no console on this
++ * port at this time, the port->lock is not initialized yet.)
++ */
+ static void meson_uart_reset(struct uart_port *port)
+ {
+ u32 val;
+@@ -267,9 +275,12 @@ static void meson_uart_reset(struct uart_port *port)
+
+ static int meson_uart_startup(struct uart_port *port)
+ {
++ unsigned long flags;
+ u32 val;
+ int ret = 0;
+
++ spin_lock_irqsave(&port->lock, flags);
++
+ val = readl(port->membase + AML_UART_CONTROL);
+ val |= AML_UART_CLEAR_ERR;
+ writel(val, port->membase + AML_UART_CONTROL);
+@@ -285,6 +296,8 @@ static int meson_uart_startup(struct uart_port *port)
+ val = (AML_UART_RECV_IRQ(1) | AML_UART_XMIT_IRQ(port->fifosize / 2));
+ writel(val, port->membase + AML_UART_MISC);
+
++ spin_unlock_irqrestore(&port->lock, flags);
++
+ ret = request_irq(port->irq, meson_uart_interrupt, 0,
+ port->name, port);
+
+diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c
+index 23c94b9277760..e676ec761f18c 100644
+--- a/drivers/tty/serial/msm_serial.c
++++ b/drivers/tty/serial/msm_serial.c
+@@ -1599,6 +1599,7 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line)
+ static void __msm_console_write(struct uart_port *port, const char *s,
+ unsigned int count, bool is_uartdm)
+ {
++ unsigned long flags;
+ int i;
+ int num_newlines = 0;
+ bool replaced = false;
+@@ -1616,6 +1617,8 @@ static void __msm_console_write(struct uart_port *port, const char *s,
+ num_newlines++;
+ count += num_newlines;
+
++ local_irq_save(flags);
++
+ if (port->sysrq)
+ locked = 0;
+ else if (oops_in_progress)
+@@ -1661,6 +1664,8 @@ static void __msm_console_write(struct uart_port *port, const char *s,
+
+ if (locked)
+ spin_unlock(&port->lock);
++
++ local_irq_restore(flags);
+ }
+
+ static void msm_console_write(struct console *co, const char *s,
+diff --git a/drivers/tty/serial/owl-uart.c b/drivers/tty/serial/owl-uart.c
+index 5250bd7d390a4..0866d749a9f44 100644
+--- a/drivers/tty/serial/owl-uart.c
++++ b/drivers/tty/serial/owl-uart.c
+@@ -731,6 +731,7 @@ static int owl_uart_probe(struct platform_device *pdev)
+ owl_port->port.uartclk = clk_get_rate(owl_port->clk);
+ if (owl_port->port.uartclk == 0) {
+ dev_err(&pdev->dev, "clock rate is zero\n");
++ clk_disable_unprepare(owl_port->clk);
+ return -EINVAL;
+ }
+ owl_port->port.flags = UPF_BOOT_AUTOCONF | UPF_IOREMAP | UPF_LOW_LATENCY;
+diff --git a/drivers/tty/serial/rda-uart.c b/drivers/tty/serial/rda-uart.c
+index e5f1fded423aa..f556b4955f59e 100644
+--- a/drivers/tty/serial/rda-uart.c
++++ b/drivers/tty/serial/rda-uart.c
+@@ -262,6 +262,8 @@ static void rda_uart_set_termios(struct uart_port *port,
+ fallthrough;
+ case CS7:
+ ctrl &= ~RDA_UART_DBITS_8;
++ termios->c_cflag &= ~CSIZE;
++ termios->c_cflag |= CS7;
+ break;
+ default:
+ ctrl |= RDA_UART_DBITS_8;
+diff --git a/drivers/tty/serial/sa1100.c b/drivers/tty/serial/sa1100.c
+index 5fe6cccfc1aeb..e64e42a19d1a0 100644
+--- a/drivers/tty/serial/sa1100.c
++++ b/drivers/tty/serial/sa1100.c
+@@ -446,6 +446,8 @@ sa1100_set_termios(struct uart_port *port, struct ktermios *termios,
+ baud = uart_get_baud_rate(port, termios, old, 0, port->uartclk/16);
+ quot = uart_get_divisor(port, baud);
+
++ del_timer_sync(&sport->timer);
++
+ spin_lock_irqsave(&sport->port.lock, flags);
+
+ sport->port.read_status_mask &= UTSR0_TO_SM(UTSR0_TFS);
+@@ -476,8 +478,6 @@ sa1100_set_termios(struct uart_port *port, struct ktermios *termios,
+ UTSR1_TO_SM(UTSR1_ROR);
+ }
+
+- del_timer_sync(&sport->timer);
+-
+ /*
+ * Update the per-port timeout.
+ */
+diff --git a/drivers/tty/serial/serial_txx9.c b/drivers/tty/serial/serial_txx9.c
+index 2213e6b841d3d..228e380db0804 100644
+--- a/drivers/tty/serial/serial_txx9.c
++++ b/drivers/tty/serial/serial_txx9.c
+@@ -618,6 +618,8 @@ serial_txx9_set_termios(struct uart_port *up, struct ktermios *termios,
+ case CS6: /* not supported */
+ case CS8:
+ cval |= TXX9_SILCR_UMODE_8BIT;
++ termios->c_cflag &= ~CSIZE;
++ termios->c_cflag |= CS8;
+ break;
+ }
+
+diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
+index 0f9b8bd235006..0075a14200057 100644
+--- a/drivers/tty/serial/sh-sci.c
++++ b/drivers/tty/serial/sh-sci.c
+@@ -2379,8 +2379,12 @@ static void sci_set_termios(struct uart_port *port, struct ktermios *termios,
+ int best_clk = -1;
+ unsigned long flags;
+
+- if ((termios->c_cflag & CSIZE) == CS7)
++ if ((termios->c_cflag & CSIZE) == CS7) {
+ smr_val |= SCSMR_CHR;
++ } else {
++ termios->c_cflag &= ~CSIZE;
++ termios->c_cflag |= CS8;
++ }
+ if (termios->c_cflag & PARENB)
+ smr_val |= SCSMR_PE;
+ if (termios->c_cflag & PARODD)
+diff --git a/drivers/tty/serial/sifive.c b/drivers/tty/serial/sifive.c
+index f5ac14c384c4d..776aec6516c4a 100644
+--- a/drivers/tty/serial/sifive.c
++++ b/drivers/tty/serial/sifive.c
+@@ -666,12 +666,16 @@ static void sifive_serial_set_termios(struct uart_port *port,
+ int rate;
+ char nstop;
+
+- if ((termios->c_cflag & CSIZE) != CS8)
++ if ((termios->c_cflag & CSIZE) != CS8) {
+ dev_err_once(ssp->port.dev, "only 8-bit words supported\n");
++ termios->c_cflag &= ~CSIZE;
++ termios->c_cflag |= CS8;
++ }
+ if (termios->c_iflag & (INPCK | PARMRK))
+ dev_err_once(ssp->port.dev, "parity checking not supported\n");
+ if (termios->c_iflag & BRKINT)
+ dev_err_once(ssp->port.dev, "BREAK detection not supported\n");
++ termios->c_iflag &= ~(INPCK|PARMRK|BRKINT);
+
+ /* Set number of stop bits */
+ nstop = (termios->c_cflag & CSTOPB) ? 2 : 1;
+@@ -998,7 +1002,7 @@ static int sifive_serial_probe(struct platform_device *pdev)
+ /* Set up clock divider */
+ ssp->clkin_rate = clk_get_rate(ssp->clk);
+ ssp->baud_rate = SIFIVE_DEFAULT_BAUD_RATE;
+- ssp->port.uartclk = ssp->baud_rate * 16;
++ ssp->port.uartclk = ssp->clkin_rate;
+ __ssp_update_div(ssp);
+
+ platform_set_drvdata(pdev, ssp);
+diff --git a/drivers/tty/serial/st-asc.c b/drivers/tty/serial/st-asc.c
+index d7fd692286cf6..1b0da603ab549 100644
+--- a/drivers/tty/serial/st-asc.c
++++ b/drivers/tty/serial/st-asc.c
+@@ -535,10 +535,14 @@ static void asc_set_termios(struct uart_port *port, struct ktermios *termios,
+ /* set character length */
+ if ((cflag & CSIZE) == CS7) {
+ ctrl_val |= ASC_CTL_MODE_7BIT_PAR;
++ cflag |= PARENB;
+ } else {
+ ctrl_val |= (cflag & PARENB) ? ASC_CTL_MODE_8BIT_PAR :
+ ASC_CTL_MODE_8BIT;
++ cflag &= ~CSIZE;
++ cflag |= CS8;
+ }
++ termios->c_cflag = cflag;
+
+ /* set stop bit */
+ ctrl_val |= (cflag & CSTOPB) ? ASC_CTL_STOP_2BIT : ASC_CTL_STOP_1BIT;
+diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
+index 87b5cd4c9743e..3c551fd4f3ff8 100644
+--- a/drivers/tty/serial/stm32-usart.c
++++ b/drivers/tty/serial/stm32-usart.c
+@@ -1037,13 +1037,22 @@ static void stm32_usart_set_termios(struct uart_port *port,
+ * CS8 or (CS7 + parity), 8 bits word aka [M1:M0] = 0b00
+ * M0 and M1 already cleared by cr1 initialization.
+ */
+- if (bits == 9)
++ if (bits == 9) {
+ cr1 |= USART_CR1_M0;
+- else if ((bits == 7) && cfg->has_7bits_data)
++ } else if ((bits == 7) && cfg->has_7bits_data) {
+ cr1 |= USART_CR1_M1;
+- else if (bits != 8)
++ } else if (bits != 8) {
+ dev_dbg(port->dev, "Unsupported data bits config: %u bits\n"
+ , bits);
++ cflag &= ~CSIZE;
++ cflag |= CS8;
++ termios->c_cflag = cflag;
++ bits = 8;
++ if (cflag & PARENB) {
++ bits++;
++ cr1 |= USART_CR1_M0;
++ }
++ }
+
+ if (ofs->rtor != UNDEF_REG && (stm32_port->rx_ch ||
+ (stm32_port->fifoen &&
+diff --git a/drivers/tty/serial/uartlite.c b/drivers/tty/serial/uartlite.c
+index 007db67292a2d..880e2afbb97b6 100644
+--- a/drivers/tty/serial/uartlite.c
++++ b/drivers/tty/serial/uartlite.c
+@@ -321,7 +321,8 @@ static void ulite_set_termios(struct uart_port *port, struct ktermios *termios,
+ struct uartlite_data *pdata = port->private_data;
+
+ /* Set termios to what the hardware supports */
+- termios->c_cflag &= ~(BRKINT | CSTOPB | PARENB | PARODD | CSIZE);
++ termios->c_iflag &= ~BRKINT;
++ termios->c_cflag &= ~(CSTOPB | PARENB | PARODD | CSIZE);
+ termios->c_cflag |= pdata->cflags & (PARENB | PARODD | CSIZE);
+ tty_termios_encode_baud_rate(termios, pdata->baud, pdata->baud);
+
+diff --git a/drivers/tty/synclink_gt.c b/drivers/tty/synclink_gt.c
+index 25c558e65ece0..9bc2a92652772 100644
+--- a/drivers/tty/synclink_gt.c
++++ b/drivers/tty/synclink_gt.c
+@@ -1746,6 +1746,8 @@ static int hdlcdev_init(struct slgt_info *info)
+ */
+ static void hdlcdev_exit(struct slgt_info *info)
+ {
++ if (!info->netdev)
++ return;
+ unregister_hdlc_device(info->netdev);
+ free_netdev(info->netdev);
+ info->netdev = NULL;
+diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
+index bbfd004449b5b..34cfdda4aff5d 100644
+--- a/drivers/tty/sysrq.c
++++ b/drivers/tty/sysrq.c
+@@ -232,8 +232,10 @@ static void showacpu(void *dummy)
+ unsigned long flags;
+
+ /* Idle CPUs have no interesting backtrace. */
+- if (idle_cpu(smp_processor_id()))
++ if (idle_cpu(smp_processor_id())) {
++ pr_info("CPU%d: backtrace skipped as idling\n", smp_processor_id());
+ return;
++ }
+
+ raw_spin_lock_irqsave(&show_lock, flags);
+ pr_info("CPU%d:\n", smp_processor_id());
+@@ -260,10 +262,13 @@ static void sysrq_handle_showallcpus(int key)
+
+ if (in_hardirq())
+ regs = get_irq_regs();
+- if (regs) {
+- pr_info("CPU%d:\n", smp_processor_id());
++
++ pr_info("CPU%d:\n", smp_processor_id());
++ if (regs)
+ show_regs(regs);
+- }
++ else
++ show_stack(NULL, NULL, KERN_INFO);
++
+ schedule_work(&sysrq_showallcpus);
+ }
+ }
+diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
+index 8176bc81a635d..ae5e6d572376b 100644
+--- a/drivers/usb/core/hcd-pci.c
++++ b/drivers/usb/core/hcd-pci.c
+@@ -616,10 +616,10 @@ const struct dev_pm_ops usb_hcd_pci_pm_ops = {
+ .suspend_noirq = hcd_pci_suspend_noirq,
+ .resume_noirq = hcd_pci_resume_noirq,
+ .resume = hcd_pci_resume,
+- .freeze = check_root_hub_suspended,
++ .freeze = hcd_pci_suspend,
+ .freeze_noirq = check_root_hub_suspended,
+ .thaw_noirq = NULL,
+- .thaw = NULL,
++ .thaw = hcd_pci_resume,
+ .poweroff = hcd_pci_suspend,
+ .poweroff_noirq = hcd_pci_suspend_noirq,
+ .restore_noirq = hcd_pci_resume_noirq,
+diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
+index eee3504397e6e..fe2a58c758610 100644
+--- a/drivers/usb/dwc2/gadget.c
++++ b/drivers/usb/dwc2/gadget.c
+@@ -4544,7 +4544,6 @@ static int dwc2_hsotg_udc_start(struct usb_gadget *gadget,
+
+ WARN_ON(hsotg->driver);
+
+- driver->driver.bus = NULL;
+ hsotg->driver = driver;
+ hsotg->gadget.dev.of_node = hsotg->dev->of_node;
+ hsotg->gadget.speed = USB_SPEED_UNKNOWN;
+diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
+index 8cad9e7d33687..4982edd130476 100644
+--- a/drivers/usb/dwc3/drd.c
++++ b/drivers/usb/dwc3/drd.c
+@@ -455,13 +455,8 @@ static struct extcon_dev *dwc3_get_extcon(struct dwc3 *dwc)
+ * This device property is for kernel internal use only and
+ * is expected to be set by the glue code.
+ */
+- if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
+- edev = extcon_get_extcon_dev(name);
+- if (!edev)
+- return ERR_PTR(-EPROBE_DEFER);
+-
+- return edev;
+- }
++ if (device_property_read_string(dev, "linux,extcon-name", &name) == 0)
++ return extcon_get_extcon_dev(name);
+
+ /*
+ * Try to get an extcon device from the USB PHY controller's "port"
+diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c
+index 2e19e0e4ea538..ba51de7dd7605 100644
+--- a/drivers/usb/dwc3/dwc3-pci.c
++++ b/drivers/usb/dwc3/dwc3-pci.c
+@@ -288,7 +288,7 @@ static void dwc3_pci_resume_work(struct work_struct *work)
+ int ret;
+
+ ret = pm_runtime_get_sync(&dwc3->dev);
+- if (ret) {
++ if (ret < 0) {
+ pm_runtime_put_sync_autosuspend(&dwc3->dev);
+ return;
+ }
+diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
+index 026fc360cc506..bf2eaa09d73c8 100644
+--- a/drivers/usb/dwc3/gadget.c
++++ b/drivers/usb/dwc3/gadget.c
+@@ -2001,10 +2001,10 @@ static void dwc3_gadget_ep_skip_trbs(struct dwc3_ep *dep, struct dwc3_request *r
+ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
+ {
+ struct dwc3_request *req;
+- struct dwc3_request *tmp;
+ struct dwc3 *dwc = dep->dwc;
+
+- list_for_each_entry_safe(req, tmp, &dep->cancelled_list, list) {
++ while (!list_empty(&dep->cancelled_list)) {
++ req = next_request(&dep->cancelled_list);
+ dwc3_gadget_ep_skip_trbs(dep, req);
+ switch (req->status) {
+ case DWC3_REQUEST_STATUS_DISCONNECTED:
+@@ -2021,6 +2021,12 @@ static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
+ dwc3_gadget_giveback(dep, req, -ECONNRESET);
+ break;
+ }
++ /*
++ * The endpoint is disabled, let the dwc3_remove_requests()
++ * handle the cleanup.
++ */
++ if (!dep->endpoint.desc)
++ break;
+ }
+ }
+
+@@ -3333,15 +3339,21 @@ static void dwc3_gadget_ep_cleanup_completed_requests(struct dwc3_ep *dep,
+ const struct dwc3_event_depevt *event, int status)
+ {
+ struct dwc3_request *req;
+- struct dwc3_request *tmp;
+
+- list_for_each_entry_safe(req, tmp, &dep->started_list, list) {
++ while (!list_empty(&dep->started_list)) {
+ int ret;
+
++ req = next_request(&dep->started_list);
+ ret = dwc3_gadget_ep_cleanup_completed_request(dep, event,
+ req, status);
+ if (ret)
+ break;
++ /*
++ * The endpoint is disabled, let the dwc3_remove_requests()
++ * handle the cleanup.
++ */
++ if (!dep->endpoint.desc)
++ break;
+ }
+ }
+
+@@ -3673,6 +3685,17 @@ static void dwc3_reset_gadget(struct dwc3 *dwc)
+ void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
+ bool interrupt)
+ {
++ struct dwc3 *dwc = dep->dwc;
++
++ /*
++ * Only issue End Transfer command to the control endpoint of a started
++ * Data Phase. Typically we should only do so in error cases such as
++ * invalid/unexpected direction as described in the control transfer
++ * flow of the programming guide.
++ */
++ if (dep->number <= 1 && dwc->ep0state != EP0_DATA_PHASE)
++ return;
++
+ if (!(dep->flags & DWC3_EP_TRANSFER_STARTED) ||
+ (dep->flags & DWC3_EP_DELAY_STOP) ||
+ (dep->flags & DWC3_EP_END_TRANSFER_PENDING))
+diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c
+index eda871973d6cc..f56c30cf151e4 100644
+--- a/drivers/usb/dwc3/host.c
++++ b/drivers/usb/dwc3/host.c
+@@ -7,7 +7,6 @@
+ * Authors: Felipe Balbi <balbi@ti.com>,
+ */
+
+-#include <linux/acpi.h>
+ #include <linux/irq.h>
+ #include <linux/of.h>
+ #include <linux/platform_device.h>
+@@ -83,7 +82,6 @@ int dwc3_host_init(struct dwc3 *dwc)
+ }
+
+ xhci->dev.parent = dwc->dev;
+- ACPI_COMPANION_SET(&xhci->dev, ACPI_COMPANION(dwc->dev));
+
+ dwc->xhci = xhci;
+
+diff --git a/drivers/usb/host/isp116x-hcd.c b/drivers/usb/host/isp116x-hcd.c
+index 8835f6bd528e1..8c7f0991c21b5 100644
+--- a/drivers/usb/host/isp116x-hcd.c
++++ b/drivers/usb/host/isp116x-hcd.c
+@@ -1541,10 +1541,12 @@ static int isp116x_remove(struct platform_device *pdev)
+
+ iounmap(isp116x->data_reg);
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+- release_mem_region(res->start, 2);
++ if (res)
++ release_mem_region(res->start, 2);
+ iounmap(isp116x->addr_reg);
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+- release_mem_region(res->start, 2);
++ if (res)
++ release_mem_region(res->start, 2);
+
+ usb_put_hcd(hcd);
+ return 0;
+diff --git a/drivers/usb/host/oxu210hp-hcd.c b/drivers/usb/host/oxu210hp-hcd.c
+index b741670525e34..ee403df33093c 100644
+--- a/drivers/usb/host/oxu210hp-hcd.c
++++ b/drivers/usb/host/oxu210hp-hcd.c
+@@ -3909,8 +3909,10 @@ static int oxu_bus_suspend(struct usb_hcd *hcd)
+ }
+ }
+
++ spin_unlock_irq(&oxu->lock);
+ /* turn off now-idle HC */
+ del_timer_sync(&oxu->watchdog);
++ spin_lock_irq(&oxu->lock);
+ ehci_halt(oxu);
+ hcd->state = HC_STATE_SUSPENDED;
+
+diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
+index d2b7e613eb34f..f571a65ae6ee2 100644
+--- a/drivers/usb/musb/omap2430.c
++++ b/drivers/usb/musb/omap2430.c
+@@ -362,6 +362,7 @@ static int omap2430_probe(struct platform_device *pdev)
+ control_node = of_parse_phandle(np, "ctrl-module", 0);
+ if (control_node) {
+ control_pdev = of_find_device_by_node(control_node);
++ of_node_put(control_node);
+ if (!control_pdev) {
+ dev_err(&pdev->dev, "Failed to get control device\n");
+ ret = -EINVAL;
+diff --git a/drivers/usb/phy/phy-omap-otg.c b/drivers/usb/phy/phy-omap-otg.c
+index ee0863c6553ed..6e6ef8c0bc7ed 100644
+--- a/drivers/usb/phy/phy-omap-otg.c
++++ b/drivers/usb/phy/phy-omap-otg.c
+@@ -95,8 +95,8 @@ static int omap_otg_probe(struct platform_device *pdev)
+ return -ENODEV;
+
+ extcon = extcon_get_extcon_dev(config->extcon);
+- if (!extcon)
+- return -EPROBE_DEFER;
++ if (IS_ERR(extcon))
++ return PTR_ERR(extcon);
+
+ otg_dev = devm_kzalloc(&pdev->dev, sizeof(*otg_dev), GFP_KERNEL);
+ if (!otg_dev)
+diff --git a/drivers/usb/storage/karma.c b/drivers/usb/storage/karma.c
+index 05cec81dcd3f2..38ddfedef629c 100644
+--- a/drivers/usb/storage/karma.c
++++ b/drivers/usb/storage/karma.c
+@@ -174,24 +174,25 @@ static void rio_karma_destructor(void *extra)
+
+ static int rio_karma_init(struct us_data *us)
+ {
+- int ret = 0;
+ struct karma_data *data = kzalloc(sizeof(struct karma_data), GFP_NOIO);
+
+ if (!data)
+- goto out;
++ return -ENOMEM;
+
+ data->recv = kmalloc(RIO_RECV_LEN, GFP_NOIO);
+ if (!data->recv) {
+ kfree(data);
+- goto out;
++ return -ENOMEM;
+ }
+
+ us->extra = data;
+ us->extra_destructor = rio_karma_destructor;
+- ret = rio_karma_send_command(RIO_ENTER_STORAGE, us);
+- data->in_storage = (ret == 0);
+-out:
+- return ret;
++ if (rio_karma_send_command(RIO_ENTER_STORAGE, us))
++ return -EIO;
++
++ data->in_storage = 1;
++
++ return 0;
+ }
+
+ static struct scsi_host_template karma_host_template;
+diff --git a/drivers/usb/typec/mux.c b/drivers/usb/typec/mux.c
+index c8340de0ed495..d2aaf294b6493 100644
+--- a/drivers/usb/typec/mux.c
++++ b/drivers/usb/typec/mux.c
+@@ -131,8 +131,11 @@ typec_switch_register(struct device *parent,
+ sw->dev.class = &typec_mux_class;
+ sw->dev.type = &typec_switch_dev_type;
+ sw->dev.driver_data = desc->drvdata;
+- dev_set_name(&sw->dev, "%s-switch",
+- desc->name ? desc->name : dev_name(parent));
++ ret = dev_set_name(&sw->dev, "%s-switch", desc->name ? desc->name : dev_name(parent));
++ if (ret) {
++ put_device(&sw->dev);
++ return ERR_PTR(ret);
++ }
+
+ ret = device_add(&sw->dev);
+ if (ret) {
+@@ -338,8 +341,11 @@ typec_mux_register(struct device *parent, const struct typec_mux_desc *desc)
+ mux->dev.class = &typec_mux_class;
+ mux->dev.type = &typec_mux_dev_type;
+ mux->dev.driver_data = desc->drvdata;
+- dev_set_name(&mux->dev, "%s-mux",
+- desc->name ? desc->name : dev_name(parent));
++ ret = dev_set_name(&mux->dev, "%s-mux", desc->name ? desc->name : dev_name(parent));
++ if (ret) {
++ put_device(&mux->dev);
++ return ERR_PTR(ret);
++ }
+
+ ret = device_add(&mux->dev);
+ if (ret) {
+diff --git a/drivers/usb/typec/tcpm/fusb302.c b/drivers/usb/typec/tcpm/fusb302.c
+index 72f9001b07921..96c55eaf3f808 100644
+--- a/drivers/usb/typec/tcpm/fusb302.c
++++ b/drivers/usb/typec/tcpm/fusb302.c
+@@ -1708,8 +1708,8 @@ static int fusb302_probe(struct i2c_client *client,
+ */
+ if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
+ chip->extcon = extcon_get_extcon_dev(name);
+- if (!chip->extcon)
+- return -EPROBE_DEFER;
++ if (IS_ERR(chip->extcon))
++ return PTR_ERR(chip->extcon);
+ }
+
+ chip->vbus = devm_regulator_get(chip->dev, "vbus");
+diff --git a/drivers/usb/usbip/stub_dev.c b/drivers/usb/usbip/stub_dev.c
+index d8d3892e5a69a..3c6d452e3bf40 100644
+--- a/drivers/usb/usbip/stub_dev.c
++++ b/drivers/usb/usbip/stub_dev.c
+@@ -393,7 +393,6 @@ static int stub_probe(struct usb_device *udev)
+
+ err_port:
+ dev_set_drvdata(&udev->dev, NULL);
+- usb_put_dev(udev);
+
+ /* we already have busid_priv, just lock busid_lock */
+ spin_lock(&busid_priv->busid_lock);
+@@ -408,6 +407,7 @@ call_put_busid_priv:
+ put_busid_priv(busid_priv);
+
+ sdev_free:
++ usb_put_dev(udev);
+ stub_device_free(sdev);
+
+ return rc;
+diff --git a/drivers/usb/usbip/stub_rx.c b/drivers/usb/usbip/stub_rx.c
+index 325c22008e536..5dd41e8215e0f 100644
+--- a/drivers/usb/usbip/stub_rx.c
++++ b/drivers/usb/usbip/stub_rx.c
+@@ -138,7 +138,9 @@ static int tweak_set_configuration_cmd(struct urb *urb)
+ req = (struct usb_ctrlrequest *) urb->setup_packet;
+ config = le16_to_cpu(req->wValue);
+
++ usb_lock_device(sdev->udev);
+ err = usb_set_configuration(sdev->udev, config);
++ usb_unlock_device(sdev->udev);
+ if (err && err != -ENODEV)
+ dev_err(&sdev->udev->dev, "can't set config #%d, error %d\n",
+ config, err);
+diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
+index 4366320fb68d3..197d52e7b801c 100644
+--- a/drivers/vdpa/ifcvf/ifcvf_main.c
++++ b/drivers/vdpa/ifcvf/ifcvf_main.c
+@@ -765,7 +765,6 @@ static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name,
+ }
+
+ ifcvf_mgmt_dev->adapter = adapter;
+- pci_set_drvdata(pdev, ifcvf_mgmt_dev);
+
+ vf = &adapter->vf;
+ vf->dev_type = get_dev_type(pdev);
+@@ -880,6 +879,8 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ goto err;
+ }
+
++ pci_set_drvdata(pdev, ifcvf_mgmt_dev);
++
+ return 0;
+
+ err:
+diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
+index 2b75c00b10054..fac89a0d81783 100644
+--- a/drivers/vdpa/vdpa.c
++++ b/drivers/vdpa/vdpa.c
+@@ -756,14 +756,19 @@ static int vdpa_nl_cmd_dev_get_doit(struct sk_buff *skb, struct genl_info *info)
+ goto mdev_err;
+ }
+ err = vdpa_dev_fill(vdev, msg, info->snd_portid, info->snd_seq, 0, info->extack);
+- if (!err)
+- err = genlmsg_reply(msg, info);
++ if (err)
++ goto mdev_err;
++
++ err = genlmsg_reply(msg, info);
++ put_device(dev);
++ mutex_unlock(&vdpa_dev_mutex);
++ return err;
++
+ mdev_err:
+ put_device(dev);
+ err:
+ mutex_unlock(&vdpa_dev_mutex);
+- if (err)
+- nlmsg_free(msg);
++ nlmsg_free(msg);
+ return err;
+ }
+
+diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
+index f85d1a08ed87c..160e40d030847 100644
+--- a/drivers/vdpa/vdpa_user/vduse_dev.c
++++ b/drivers/vdpa/vdpa_user/vduse_dev.c
+@@ -1344,9 +1344,9 @@ static int vduse_create_dev(struct vduse_dev_config *config,
+
+ dev->minor = ret;
+ dev->msg_timeout = VDUSE_MSG_DEFAULT_TIMEOUT;
+- dev->dev = device_create(vduse_class, NULL,
+- MKDEV(MAJOR(vduse_major), dev->minor),
+- dev, "%s", config->name);
++ dev->dev = device_create_with_groups(vduse_class, NULL,
++ MKDEV(MAJOR(vduse_major), dev->minor),
++ dev, vduse_dev_groups, "%s", config->name);
+ if (IS_ERR(dev->dev)) {
+ ret = PTR_ERR(dev->dev);
+ goto err_dev;
+@@ -1595,7 +1595,6 @@ static int vduse_init(void)
+ return PTR_ERR(vduse_class);
+
+ vduse_class->devnode = vduse_devnode;
+- vduse_class->dev_groups = vduse_dev_groups;
+
+ ret = alloc_chrdev_region(&vduse_major, 0, VDUSE_DEV_MAX, "vduse");
+ if (ret)
+diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
+index 14e2043d76852..eab55accf381f 100644
+--- a/drivers/vhost/vringh.c
++++ b/drivers/vhost/vringh.c
+@@ -292,7 +292,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
+ int (*copy)(const struct vringh *vrh,
+ void *dst, const void *src, size_t len))
+ {
+- int err, count = 0, up_next, desc_max;
++ int err, count = 0, indirect_count = 0, up_next, desc_max;
+ struct vring_desc desc, *descs;
+ struct vringh_range range = { -1ULL, 0 }, slowrange;
+ bool slow = false;
+@@ -349,7 +349,12 @@ __vringh_iov(struct vringh *vrh, u16 i,
+ continue;
+ }
+
+- if (count++ == vrh->vring.num) {
++ if (up_next == -1)
++ count++;
++ else
++ indirect_count++;
++
++ if (count > vrh->vring.num || indirect_count > desc_max) {
+ vringh_bad("Descriptor loop in %p", descs);
+ err = -ELOOP;
+ goto fail;
+@@ -411,6 +416,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
+ i = return_from_indirect(vrh, &up_next,
+ &descs, &desc_max);
+ slow = false;
++ indirect_count = 0;
+ } else
+ break;
+ }
+diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
+index c8e0ea27caf1d..58c304a3b7c41 100644
+--- a/drivers/video/fbdev/hyperv_fb.c
++++ b/drivers/video/fbdev/hyperv_fb.c
+@@ -1009,7 +1009,6 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
+ struct pci_dev *pdev = NULL;
+ void __iomem *fb_virt;
+ int gen2vm = efi_enabled(EFI_BOOT);
+- resource_size_t pot_start, pot_end;
+ phys_addr_t paddr;
+ int ret;
+
+@@ -1060,23 +1059,7 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
+ dio_fb_size =
+ screen_width * screen_height * screen_depth / 8;
+
+- if (gen2vm) {
+- pot_start = 0;
+- pot_end = -1;
+- } else {
+- if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM) ||
+- pci_resource_len(pdev, 0) < screen_fb_size) {
+- pr_err("Resource not available or (0x%lx < 0x%lx)\n",
+- (unsigned long) pci_resource_len(pdev, 0),
+- (unsigned long) screen_fb_size);
+- goto err1;
+- }
+-
+- pot_end = pci_resource_end(pdev, 0);
+- pot_start = pot_end - screen_fb_size + 1;
+- }
+-
+- ret = vmbus_allocate_mmio(&par->mem, hdev, pot_start, pot_end,
++ ret = vmbus_allocate_mmio(&par->mem, hdev, 0, -1,
+ screen_fb_size, 0x100000, true);
+ if (ret != 0) {
+ pr_err("Unable to allocate framebuffer memory\n");
+diff --git a/drivers/video/fbdev/pxa3xx-gcu.c b/drivers/video/fbdev/pxa3xx-gcu.c
+index 350b3139c863e..043cc8f9ef1c7 100644
+--- a/drivers/video/fbdev/pxa3xx-gcu.c
++++ b/drivers/video/fbdev/pxa3xx-gcu.c
+@@ -646,6 +646,7 @@ static int pxa3xx_gcu_probe(struct platform_device *pdev)
+ for (i = 0; i < 8; i++) {
+ ret = pxa3xx_gcu_add_buffer(dev, priv);
+ if (ret) {
++ pxa3xx_gcu_free_buffers(dev, priv);
+ dev_err(dev, "failed to allocate DMA memory\n");
+ goto err_disable_clk;
+ }
+@@ -662,15 +663,15 @@ static int pxa3xx_gcu_probe(struct platform_device *pdev)
+ SHARED_SIZE, irq);
+ return 0;
+
+-err_free_dma:
+- dma_free_coherent(dev, SHARED_SIZE,
+- priv->shared, priv->shared_phys);
++err_disable_clk:
++ clk_disable_unprepare(priv->clk);
+
+ err_misc_deregister:
+ misc_deregister(&priv->misc_dev);
+
+-err_disable_clk:
+- clk_disable_unprepare(priv->clk);
++err_free_dma:
++ dma_free_coherent(dev, SHARED_SIZE,
++ priv->shared, priv->shared_phys);
+
+ return ret;
+ }
+@@ -683,6 +684,7 @@ static int pxa3xx_gcu_remove(struct platform_device *pdev)
+ pxa3xx_gcu_wait_idle(priv);
+ misc_deregister(&priv->misc_dev);
+ dma_free_coherent(dev, SHARED_SIZE, priv->shared, priv->shared_phys);
++ clk_disable_unprepare(priv->clk);
+ pxa3xx_gcu_free_buffers(dev, priv);
+
+ return 0;
+diff --git a/drivers/virtio/virtio_pci_modern_dev.c b/drivers/virtio/virtio_pci_modern_dev.c
+index 591738ad3d565..4093f9cca7a6a 100644
+--- a/drivers/virtio/virtio_pci_modern_dev.c
++++ b/drivers/virtio/virtio_pci_modern_dev.c
+@@ -347,6 +347,7 @@ err_map_notify:
+ err_map_isr:
+ pci_iounmap(pci_dev, mdev->common);
+ err_map_common:
++ pci_release_selected_regions(pci_dev, mdev->modern_bars);
+ return err;
+ }
+ EXPORT_SYMBOL_GPL(vp_modern_probe);
+diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
+index db843f8258602..00ebeffc674fd 100644
+--- a/drivers/watchdog/rti_wdt.c
++++ b/drivers/watchdog/rti_wdt.c
+@@ -226,7 +226,7 @@ static int rti_wdt_probe(struct platform_device *pdev)
+
+ pm_runtime_enable(dev);
+ ret = pm_runtime_get_sync(dev);
+- if (ret) {
++ if (ret < 0) {
+ pm_runtime_put_noidle(dev);
+ pm_runtime_disable(&pdev->dev);
+ return dev_err_probe(dev, ret, "runtime pm failed\n");
+diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c
+index 6b426df34fd6f..88274704b260d 100644
+--- a/drivers/watchdog/rzg2l_wdt.c
++++ b/drivers/watchdog/rzg2l_wdt.c
+@@ -43,6 +43,8 @@ struct rzg2l_wdt_priv {
+ struct reset_control *rstc;
+ unsigned long osc_clk_rate;
+ unsigned long delay;
++ struct clk *pclk;
++ struct clk *osc_clk;
+ };
+
+ static void rzg2l_wdt_wait_delay(struct rzg2l_wdt_priv *priv)
+@@ -53,7 +55,7 @@ static void rzg2l_wdt_wait_delay(struct rzg2l_wdt_priv *priv)
+
+ static u32 rzg2l_wdt_get_cycle_usec(unsigned long cycle, u32 wdttime)
+ {
+- u64 timer_cycle_us = 1024 * 1024 * (wdttime + 1) * MICRO;
++ u64 timer_cycle_us = 1024 * 1024ULL * (wdttime + 1) * MICRO;
+
+ return div64_ul(timer_cycle_us, cycle);
+ }
+@@ -86,7 +88,6 @@ static int rzg2l_wdt_start(struct watchdog_device *wdev)
+ {
+ struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev);
+
+- reset_control_deassert(priv->rstc);
+ pm_runtime_get_sync(wdev->parent);
+
+ /* Initialize time out */
+@@ -106,7 +107,7 @@ static int rzg2l_wdt_stop(struct watchdog_device *wdev)
+ struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev);
+
+ pm_runtime_put(wdev->parent);
+- reset_control_assert(priv->rstc);
++ reset_control_reset(priv->rstc);
+
+ return 0;
+ }
+@@ -118,7 +119,9 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev,
+
+ /* Reset the module before we modify any register */
+ reset_control_reset(priv->rstc);
+- pm_runtime_get_sync(wdev->parent);
++
++ clk_prepare_enable(priv->pclk);
++ clk_prepare_enable(priv->osc_clk);
+
+ /* smallest counter value to reboot soon */
+ rzg2l_wdt_write(priv, WDTSET_COUNTER_VAL(1), WDTSET);
+@@ -151,12 +154,11 @@ static const struct watchdog_ops rzg2l_wdt_ops = {
+ .restart = rzg2l_wdt_restart,
+ };
+
+-static void rzg2l_wdt_reset_assert_pm_disable_put(void *data)
++static void rzg2l_wdt_reset_assert_pm_disable(void *data)
+ {
+ struct watchdog_device *wdev = data;
+ struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev);
+
+- pm_runtime_put(wdev->parent);
+ pm_runtime_disable(wdev->parent);
+ reset_control_assert(priv->rstc);
+ }
+@@ -166,7 +168,6 @@ static int rzg2l_wdt_probe(struct platform_device *pdev)
+ struct device *dev = &pdev->dev;
+ struct rzg2l_wdt_priv *priv;
+ unsigned long pclk_rate;
+- struct clk *wdt_clk;
+ int ret;
+
+ priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+@@ -178,22 +179,20 @@ static int rzg2l_wdt_probe(struct platform_device *pdev)
+ return PTR_ERR(priv->base);
+
+ /* Get watchdog main clock */
+- wdt_clk = clk_get(&pdev->dev, "oscclk");
+- if (IS_ERR(wdt_clk))
+- return dev_err_probe(&pdev->dev, PTR_ERR(wdt_clk), "no oscclk");
++ priv->osc_clk = devm_clk_get(&pdev->dev, "oscclk");
++ if (IS_ERR(priv->osc_clk))
++ return dev_err_probe(&pdev->dev, PTR_ERR(priv->osc_clk), "no oscclk");
+
+- priv->osc_clk_rate = clk_get_rate(wdt_clk);
+- clk_put(wdt_clk);
++ priv->osc_clk_rate = clk_get_rate(priv->osc_clk);
+ if (!priv->osc_clk_rate)
+ return dev_err_probe(&pdev->dev, -EINVAL, "oscclk rate is 0");
+
+ /* Get Peripheral clock */
+- wdt_clk = clk_get(&pdev->dev, "pclk");
+- if (IS_ERR(wdt_clk))
+- return dev_err_probe(&pdev->dev, PTR_ERR(wdt_clk), "no pclk");
++ priv->pclk = devm_clk_get(&pdev->dev, "pclk");
++ if (IS_ERR(priv->pclk))
++ return dev_err_probe(&pdev->dev, PTR_ERR(priv->pclk), "no pclk");
+
+- pclk_rate = clk_get_rate(wdt_clk);
+- clk_put(wdt_clk);
++ pclk_rate = clk_get_rate(priv->pclk);
+ if (!pclk_rate)
+ return dev_err_probe(&pdev->dev, -EINVAL, "pclk rate is 0");
+
+@@ -206,11 +205,6 @@ static int rzg2l_wdt_probe(struct platform_device *pdev)
+
+ reset_control_deassert(priv->rstc);
+ pm_runtime_enable(&pdev->dev);
+- ret = pm_runtime_resume_and_get(&pdev->dev);
+- if (ret < 0) {
+- dev_err(dev, "pm_runtime_resume_and_get failed ret=%pe", ERR_PTR(ret));
+- goto out_pm_get;
+- }
+
+ priv->wdev.info = &rzg2l_wdt_ident;
+ priv->wdev.ops = &rzg2l_wdt_ops;
+@@ -222,7 +216,7 @@ static int rzg2l_wdt_probe(struct platform_device *pdev)
+
+ watchdog_set_drvdata(&priv->wdev, priv);
+ ret = devm_add_action_or_reset(&pdev->dev,
+- rzg2l_wdt_reset_assert_pm_disable_put,
++ rzg2l_wdt_reset_assert_pm_disable,
+ &priv->wdev);
+ if (ret < 0)
+ return ret;
+@@ -235,12 +229,6 @@ static int rzg2l_wdt_probe(struct platform_device *pdev)
+ dev_warn(dev, "Specified timeout invalid, using default");
+
+ return devm_watchdog_register_device(&pdev->dev, &priv->wdev);
+-
+-out_pm_get:
+- pm_runtime_disable(dev);
+- reset_control_assert(priv->rstc);
+-
+- return ret;
+ }
+
+ static const struct of_device_id rzg2l_wdt_ids[] = {
+diff --git a/drivers/watchdog/ts4800_wdt.c b/drivers/watchdog/ts4800_wdt.c
+index c137ad2bd5c31..0ea554c7cda57 100644
+--- a/drivers/watchdog/ts4800_wdt.c
++++ b/drivers/watchdog/ts4800_wdt.c
+@@ -125,13 +125,16 @@ static int ts4800_wdt_probe(struct platform_device *pdev)
+ ret = of_property_read_u32_index(np, "syscon", 1, ®);
+ if (ret < 0) {
+ dev_err(dev, "no offset in syscon\n");
++ of_node_put(syscon_np);
+ return ret;
+ }
+
+ /* allocate memory for watchdog struct */
+ wdt = devm_kzalloc(dev, sizeof(*wdt), GFP_KERNEL);
+- if (!wdt)
++ if (!wdt) {
++ of_node_put(syscon_np);
+ return -ENOMEM;
++ }
+
+ /* set regmap and offset to know where to write */
+ wdt->feed_offset = reg;
+diff --git a/drivers/watchdog/wdat_wdt.c b/drivers/watchdog/wdat_wdt.c
+index 195c8c004b69d..4fac8148a8e62 100644
+--- a/drivers/watchdog/wdat_wdt.c
++++ b/drivers/watchdog/wdat_wdt.c
+@@ -462,6 +462,7 @@ static int wdat_wdt_probe(struct platform_device *pdev)
+ return ret;
+
+ watchdog_set_nowayout(&wdat->wdd, nowayout);
++ watchdog_stop_on_reboot(&wdat->wdd);
+ return devm_watchdog_register_device(dev, &wdat->wdd);
+ }
+
+diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
+index 34742c6e189e3..f17c4c03db30c 100644
+--- a/drivers/xen/xlate_mmu.c
++++ b/drivers/xen/xlate_mmu.c
+@@ -261,7 +261,6 @@ int __init xen_xlate_map_ballooned_pages(xen_pfn_t **gfns, void **virt,
+
+ return 0;
+ }
+-EXPORT_SYMBOL_GPL(xen_xlate_map_ballooned_pages);
+
+ struct remap_pfn {
+ struct mm_struct *mm;
+diff --git a/fs/afs/dir.c b/fs/afs/dir.c
+index 932e61e28e5d9..bdac73554e6e5 100644
+--- a/fs/afs/dir.c
++++ b/fs/afs/dir.c
+@@ -463,8 +463,11 @@ static int afs_dir_iterate_block(struct afs_vnode *dvnode,
+ }
+
+ /* skip if starts before the current position */
+- if (offset < curr)
++ if (offset < curr) {
++ if (next > curr)
++ ctx->pos = blkoff + next * sizeof(union afs_xdr_dirent);
+ continue;
++ }
+
+ /* found the next entry */
+ if (!dir_emit(ctx, dire->u.name, nlen,
+diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
+index b6edcf89a429f..adef10a6e5c7b 100644
+--- a/fs/ceph/addr.c
++++ b/fs/ceph/addr.c
+@@ -1644,7 +1644,7 @@ int ceph_uninline_data(struct file *file)
+ struct inode *inode = file_inode(file);
+ struct ceph_inode_info *ci = ceph_inode(inode);
+ struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+- struct ceph_osd_request *req;
++ struct ceph_osd_request *req = NULL;
+ struct ceph_cap_flush *prealloc_cf;
+ struct folio *folio = NULL;
+ u64 inline_version = CEPH_INLINE_NONE;
+@@ -1652,10 +1652,23 @@ int ceph_uninline_data(struct file *file)
+ int err = 0;
+ u64 len;
+
++ spin_lock(&ci->i_ceph_lock);
++ inline_version = ci->i_inline_version;
++ spin_unlock(&ci->i_ceph_lock);
++
++ dout("uninline_data %p %llx.%llx inline_version %llu\n",
++ inode, ceph_vinop(inode), inline_version);
++
++ if (inline_version == CEPH_INLINE_NONE)
++ return 0;
++
+ prealloc_cf = ceph_alloc_cap_flush();
+ if (!prealloc_cf)
+ return -ENOMEM;
+
++ if (inline_version == 1) /* initial version, no data */
++ goto out_uninline;
++
+ folio = read_mapping_folio(inode->i_mapping, 0, file);
+ if (IS_ERR(folio)) {
+ err = PTR_ERR(folio);
+@@ -1664,17 +1677,6 @@ int ceph_uninline_data(struct file *file)
+
+ folio_lock(folio);
+
+- spin_lock(&ci->i_ceph_lock);
+- inline_version = ci->i_inline_version;
+- spin_unlock(&ci->i_ceph_lock);
+-
+- dout("uninline_data %p %llx.%llx inline_version %llu\n",
+- inode, ceph_vinop(inode), inline_version);
+-
+- if (inline_version == 1 || /* initial version, no data */
+- inline_version == CEPH_INLINE_NONE)
+- goto out_unlock;
+-
+ len = i_size_read(inode);
+ if (len > folio_size(folio))
+ len = folio_size(folio);
+@@ -1739,6 +1741,7 @@ int ceph_uninline_data(struct file *file)
+ ceph_update_write_metrics(&fsc->mdsc->metric, req->r_start_latency,
+ req->r_end_latency, len, err);
+
++out_uninline:
+ if (!err) {
+ int dirty;
+
+@@ -1757,8 +1760,10 @@ out_put_req:
+ if (err == -ECANCELED)
+ err = 0;
+ out_unlock:
+- folio_unlock(folio);
+- folio_put(folio);
++ if (folio) {
++ folio_unlock(folio);
++ folio_put(folio);
++ }
+ out:
+ ceph_free_cap_flush(prealloc_cf);
+ dout("uninline_data %p %llx.%llx inline_version %llu = %d\n",
+diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
+index 1bd3e1bb0fdf0..8c249511344da 100644
+--- a/fs/ceph/mds_client.c
++++ b/fs/ceph/mds_client.c
+@@ -4700,15 +4700,17 @@ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
+ }
+
+ /*
+- * wait for all write mds requests to flush.
++ * flush the mdlog and wait for all write mds requests to flush.
+ */
+-static void wait_unsafe_requests(struct ceph_mds_client *mdsc, u64 want_tid)
++static void flush_mdlog_and_wait_mdsc_unsafe_requests(struct ceph_mds_client *mdsc,
++ u64 want_tid)
+ {
+ struct ceph_mds_request *req = NULL, *nextreq;
++ struct ceph_mds_session *last_session = NULL;
+ struct rb_node *n;
+
+ mutex_lock(&mdsc->mutex);
+- dout("wait_unsafe_requests want %lld\n", want_tid);
++ dout("%s want %lld\n", __func__, want_tid);
+ restart:
+ req = __get_oldest_req(mdsc);
+ while (req && req->r_tid <= want_tid) {
+@@ -4720,14 +4722,32 @@ restart:
+ nextreq = NULL;
+ if (req->r_op != CEPH_MDS_OP_SETFILELOCK &&
+ (req->r_op & CEPH_MDS_OP_WRITE)) {
++ struct ceph_mds_session *s = req->r_session;
++
++ if (!s) {
++ req = nextreq;
++ continue;
++ }
++
+ /* write op */
+ ceph_mdsc_get_request(req);
+ if (nextreq)
+ ceph_mdsc_get_request(nextreq);
++ s = ceph_get_mds_session(s);
+ mutex_unlock(&mdsc->mutex);
+- dout("wait_unsafe_requests wait on %llu (want %llu)\n",
++
++ /* send flush mdlog request to MDS */
++ if (last_session != s) {
++ send_flush_mdlog(s);
++ ceph_put_mds_session(last_session);
++ last_session = s;
++ } else {
++ ceph_put_mds_session(s);
++ }
++ dout("%s wait on %llu (want %llu)\n", __func__,
+ req->r_tid, want_tid);
+ wait_for_completion(&req->r_safe_completion);
++
+ mutex_lock(&mdsc->mutex);
+ ceph_mdsc_put_request(req);
+ if (!nextreq)
+@@ -4742,7 +4762,8 @@ restart:
+ req = nextreq;
+ }
+ mutex_unlock(&mdsc->mutex);
+- dout("wait_unsafe_requests done\n");
++ ceph_put_mds_session(last_session);
++ dout("%s done\n", __func__);
+ }
+
+ void ceph_mdsc_sync(struct ceph_mds_client *mdsc)
+@@ -4771,7 +4792,7 @@ void ceph_mdsc_sync(struct ceph_mds_client *mdsc)
+ dout("sync want tid %lld flush_seq %lld\n",
+ want_tid, want_flush);
+
+- wait_unsafe_requests(mdsc, want_tid);
++ flush_mdlog_and_wait_mdsc_unsafe_requests(mdsc, want_tid);
+ wait_caps_flush(mdsc, want_flush);
+ }
+
+diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
+index afec840884719..8c2dc2c762a4e 100644
+--- a/fs/ceph/xattr.c
++++ b/fs/ceph/xattr.c
+@@ -366,6 +366,14 @@ static ssize_t ceph_vxattrcb_auth_mds(struct ceph_inode_info *ci,
+ }
+ #define XATTR_RSTAT_FIELD(_type, _name) \
+ XATTR_NAME_CEPH(_type, _name, VXATTR_FLAG_RSTAT)
++#define XATTR_RSTAT_FIELD_UPDATABLE(_type, _name) \
++ { \
++ .name = CEPH_XATTR_NAME(_type, _name), \
++ .name_size = sizeof (CEPH_XATTR_NAME(_type, _name)), \
++ .getxattr_cb = ceph_vxattrcb_ ## _type ## _ ## _name, \
++ .exists_cb = NULL, \
++ .flags = VXATTR_FLAG_RSTAT, \
++ }
+ #define XATTR_LAYOUT_FIELD(_type, _name, _field) \
+ { \
+ .name = CEPH_XATTR_NAME2(_type, _name, _field), \
+@@ -404,7 +412,7 @@ static struct ceph_vxattr ceph_dir_vxattrs[] = {
+ XATTR_RSTAT_FIELD(dir, rsubdirs),
+ XATTR_RSTAT_FIELD(dir, rsnaps),
+ XATTR_RSTAT_FIELD(dir, rbytes),
+- XATTR_RSTAT_FIELD(dir, rctime),
++ XATTR_RSTAT_FIELD_UPDATABLE(dir, rctime),
+ {
+ .name = "ceph.dir.pin",
+ .name_size = sizeof("ceph.dir.pin"),
+diff --git a/fs/cifs/cifs_swn.c b/fs/cifs/cifs_swn.c
+index 180c234c2f46c..1e4c7cc5287f0 100644
+--- a/fs/cifs/cifs_swn.c
++++ b/fs/cifs/cifs_swn.c
+@@ -465,7 +465,7 @@ static int cifs_swn_reconnect(struct cifs_tcon *tcon, struct sockaddr_storage *a
+ int ret = 0;
+
+ /* Store the reconnect address */
+- mutex_lock(&tcon->ses->server->srv_mutex);
++ cifs_server_lock(tcon->ses->server);
+ if (cifs_sockaddr_equal(&tcon->ses->server->dstaddr, addr))
+ goto unlock;
+
+@@ -501,7 +501,7 @@ static int cifs_swn_reconnect(struct cifs_tcon *tcon, struct sockaddr_storage *a
+ cifs_signal_cifsd_for_reconnect(tcon->ses->server, false);
+
+ unlock:
+- mutex_unlock(&tcon->ses->server->srv_mutex);
++ cifs_server_unlock(tcon->ses->server);
+
+ return ret;
+ }
+diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
+index 0912d8bbbac14..663cb9db49085 100644
+--- a/fs/cifs/cifsencrypt.c
++++ b/fs/cifs/cifsencrypt.c
+@@ -236,9 +236,9 @@ int cifs_verify_signature(struct smb_rqst *rqst,
+ cpu_to_le32(expected_sequence_number);
+ cifs_pdu->Signature.Sequence.Reserved = 0;
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ rc = cifs_calc_signature(rqst, server, what_we_think_sig_should_be);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ if (rc)
+ return rc;
+@@ -626,7 +626,7 @@ setup_ntlmv2_rsp(struct cifs_ses *ses, const struct nls_table *nls_cp)
+
+ memcpy(ses->auth_key.response + baselen, tiblob, tilen);
+
+- mutex_lock(&ses->server->srv_mutex);
++ cifs_server_lock(ses->server);
+
+ rc = cifs_alloc_hash("hmac(md5)",
+ &ses->server->secmech.hmacmd5,
+@@ -678,7 +678,7 @@ setup_ntlmv2_rsp(struct cifs_ses *ses, const struct nls_table *nls_cp)
+ cifs_dbg(VFS, "%s: Could not generate md5 hash\n", __func__);
+
+ unlock:
+- mutex_unlock(&ses->server->srv_mutex);
++ cifs_server_unlock(ses->server);
+ setup_ntlmv2_rsp_ret:
+ kfree(tiblob);
+
+diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
+index 6d150bb87aaf8..9b4fd76866996 100644
+--- a/fs/cifs/cifsfs.c
++++ b/fs/cifs/cifsfs.c
+@@ -1084,7 +1084,7 @@ struct file_system_type cifs_fs_type = {
+ };
+ MODULE_ALIAS_FS("cifs");
+
+-static struct file_system_type smb3_fs_type = {
++struct file_system_type smb3_fs_type = {
+ .owner = THIS_MODULE,
+ .name = "smb3",
+ .init_fs_context = smb3_init_fs_context,
+diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
+index c0542bdcd06bc..cc40a55fbfacd 100644
+--- a/fs/cifs/cifsfs.h
++++ b/fs/cifs/cifsfs.h
+@@ -38,7 +38,7 @@ static inline unsigned long cifs_get_time(struct dentry *dentry)
+ return (unsigned long) dentry->d_fsdata;
+ }
+
+-extern struct file_system_type cifs_fs_type;
++extern struct file_system_type cifs_fs_type, smb3_fs_type;
+ extern const struct address_space_operations cifs_addr_ops;
+ extern const struct address_space_operations cifs_addr_ops_smallbuf;
+
+diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
+index 5024b6792dab6..a6cade2aebd98 100644
+--- a/fs/cifs/cifsglob.h
++++ b/fs/cifs/cifsglob.h
+@@ -16,6 +16,7 @@
+ #include <linux/mempool.h>
+ #include <linux/workqueue.h>
+ #include <linux/utsname.h>
++#include <linux/sched/mm.h>
+ #include <linux/netfs.h>
+ #include "cifs_fs_sb.h"
+ #include "cifsacl.h"
+@@ -621,7 +622,8 @@ struct TCP_Server_Info {
+ unsigned int in_flight; /* number of requests on the wire to server */
+ unsigned int max_in_flight; /* max number of requests that were on wire */
+ spinlock_t req_lock; /* protect the two values above */
+- struct mutex srv_mutex;
++ struct mutex _srv_mutex;
++ unsigned int nofs_flag;
+ struct task_struct *tsk;
+ char server_GUID[16];
+ __u16 sec_mode;
+@@ -736,6 +738,22 @@ struct TCP_Server_Info {
+ #endif
+ };
+
++static inline void cifs_server_lock(struct TCP_Server_Info *server)
++{
++ unsigned int nofs_flag = memalloc_nofs_save();
++
++ mutex_lock(&server->_srv_mutex);
++ server->nofs_flag = nofs_flag;
++}
++
++static inline void cifs_server_unlock(struct TCP_Server_Info *server)
++{
++ unsigned int nofs_flag = server->nofs_flag;
++
++ mutex_unlock(&server->_srv_mutex);
++ memalloc_nofs_restore(nofs_flag);
++}
++
+ struct cifs_credits {
+ unsigned int value;
+ unsigned int instance;
+@@ -1911,11 +1929,13 @@ extern mempool_t *cifs_mid_poolp;
+
+ /* Operations for different SMB versions */
+ #define SMB1_VERSION_STRING "1.0"
++#define SMB20_VERSION_STRING "2.0"
++#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
+ extern struct smb_version_operations smb1_operations;
+ extern struct smb_version_values smb1_values;
+-#define SMB20_VERSION_STRING "2.0"
+ extern struct smb_version_operations smb20_operations;
+ extern struct smb_version_values smb20_values;
++#endif /* CIFS_ALLOW_INSECURE_LEGACY */
+ #define SMB21_VERSION_STRING "2.1"
+ extern struct smb_version_operations smb21_operations;
+ extern struct smb_version_values smb21_values;
+diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
+index aa2d4c49e2a5b..98e4a1aa898e2 100644
+--- a/fs/cifs/connect.c
++++ b/fs/cifs/connect.c
+@@ -97,6 +97,10 @@ static int reconn_set_ipaddr_from_hostname(struct TCP_Server_Info *server)
+ if (!server->hostname)
+ return -EINVAL;
+
++ /* if server hostname isn't populated, there's nothing to do here */
++ if (server->hostname[0] == '\0')
++ return 0;
++
+ len = strlen(server->hostname) + 3;
+
+ unc = kmalloc(len, GFP_KERNEL);
+@@ -148,7 +152,7 @@ static void cifs_resolve_server(struct work_struct *work)
+ struct TCP_Server_Info *server = container_of(work,
+ struct TCP_Server_Info, resolve.work);
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+
+ /*
+ * Resolve the hostname again to make sure that IP address is up-to-date.
+@@ -159,7 +163,7 @@ static void cifs_resolve_server(struct work_struct *work)
+ __func__, rc);
+ }
+
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ }
+
+ /*
+@@ -267,7 +271,7 @@ cifs_abort_connection(struct TCP_Server_Info *server)
+
+ /* do not want to be sending data on a socket we are freeing */
+ cifs_dbg(FYI, "%s: tearing down socket\n", __func__);
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ if (server->ssocket) {
+ cifs_dbg(FYI, "State: 0x%x Flags: 0x%lx\n", server->ssocket->state,
+ server->ssocket->flags);
+@@ -296,7 +300,7 @@ cifs_abort_connection(struct TCP_Server_Info *server)
+ mid->mid_flags |= MID_DELETED;
+ }
+ spin_unlock(&GlobalMid_Lock);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ cifs_dbg(FYI, "%s: issuing mid callbacks\n", __func__);
+ list_for_each_entry_safe(mid, nmid, &retry_list, qhead) {
+@@ -306,9 +310,9 @@ cifs_abort_connection(struct TCP_Server_Info *server)
+ }
+
+ if (cifs_rdma_enabled(server)) {
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ smbd_destroy(server);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ }
+ }
+
+@@ -359,7 +363,7 @@ static int __cifs_reconnect(struct TCP_Server_Info *server,
+
+ do {
+ try_to_freeze();
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+
+ if (!cifs_swn_set_server_dstaddr(server)) {
+ /* resolve the hostname again to make sure that IP address is up-to-date */
+@@ -372,7 +376,7 @@ static int __cifs_reconnect(struct TCP_Server_Info *server,
+ else
+ rc = generic_ip_connect(server);
+ if (rc) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ cifs_dbg(FYI, "%s: reconnect error %d\n", __func__, rc);
+ msleep(3000);
+ } else {
+@@ -383,7 +387,7 @@ static int __cifs_reconnect(struct TCP_Server_Info *server,
+ server->tcpStatus = CifsNeedNegotiate;
+ spin_unlock(&cifs_tcp_ses_lock);
+ cifs_swn_reset_server_dstaddr(server);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
+ }
+ } while (server->tcpStatus == CifsNeedReconnect);
+@@ -488,12 +492,12 @@ static int reconnect_dfs_server(struct TCP_Server_Info *server)
+
+ do {
+ try_to_freeze();
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+
+ rc = reconnect_target_unlocked(server, &tl, &target_hint);
+ if (rc) {
+ /* Failed to reconnect socket */
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ cifs_dbg(FYI, "%s: reconnect error %d\n", __func__, rc);
+ msleep(3000);
+ continue;
+@@ -510,7 +514,7 @@ static int reconnect_dfs_server(struct TCP_Server_Info *server)
+ server->tcpStatus = CifsNeedNegotiate;
+ spin_unlock(&cifs_tcp_ses_lock);
+ cifs_swn_reset_server_dstaddr(server);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
+ } while (server->tcpStatus == CifsNeedReconnect);
+
+@@ -1565,7 +1569,7 @@ cifs_get_tcp_session(struct smb3_fs_context *ctx,
+ init_waitqueue_head(&tcp_ses->response_q);
+ init_waitqueue_head(&tcp_ses->request_q);
+ INIT_LIST_HEAD(&tcp_ses->pending_mid_q);
+- mutex_init(&tcp_ses->srv_mutex);
++ mutex_init(&tcp_ses->_srv_mutex);
+ memcpy(tcp_ses->workstation_RFC1001_name,
+ ctx->source_rfc1001_name, RFC1001_NAME_LEN_WITH_NULL);
+ memcpy(tcp_ses->server_RFC1001_name,
+diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
+index c5dd6f7305bd1..aa7d00b5b3e7c 100644
+--- a/fs/cifs/dfs_cache.c
++++ b/fs/cifs/dfs_cache.c
+@@ -1327,9 +1327,9 @@ static bool target_share_equal(struct TCP_Server_Info *server, const char *s1, c
+ cifs_dbg(VFS, "%s: failed to convert address \'%s\'. skip address matching.\n",
+ __func__, ip);
+ } else {
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ match = cifs_match_ipaddr((struct sockaddr *)&server->dstaddr, &sa);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ }
+
+ kfree(ip);
+diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
+index 5a803d6861464..4abf36b3b3450 100644
+--- a/fs/cifs/misc.c
++++ b/fs/cifs/misc.c
+@@ -1209,18 +1209,23 @@ static struct super_block *__cifs_get_super(void (*f)(struct super_block *, void
+ .data = data,
+ .sb = NULL,
+ };
++ struct file_system_type **fs_type = (struct file_system_type *[]) {
++ &cifs_fs_type, &smb3_fs_type, NULL,
++ };
+
+- iterate_supers_type(&cifs_fs_type, f, &sd);
+-
+- if (!sd.sb)
+- return ERR_PTR(-EINVAL);
+- /*
+- * Grab an active reference in order to prevent automounts (DFS links)
+- * of expiring and then freeing up our cifs superblock pointer while
+- * we're doing failover.
+- */
+- cifs_sb_active(sd.sb);
+- return sd.sb;
++ for (; *fs_type; fs_type++) {
++ iterate_supers_type(*fs_type, f, &sd);
++ if (sd.sb) {
++ /*
++ * Grab an active reference in order to prevent automounts (DFS links)
++ * of expiring and then freeing up our cifs superblock pointer while
++ * we're doing failover.
++ */
++ cifs_sb_active(sd.sb);
++ return sd.sb;
++ }
++ }
++ return ERR_PTR(-EINVAL);
+ }
+
+ static void __cifs_put_super(struct super_block *sb)
+diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
+index 1a0995bb5d90c..83b9047c945aa 100644
+--- a/fs/cifs/sess.c
++++ b/fs/cifs/sess.c
+@@ -274,7 +274,10 @@ cifs_ses_add_channel(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses,
+ /* Auth */
+ ctx.domainauto = ses->domainAuto;
+ ctx.domainname = ses->domainName;
+- ctx.server_hostname = ses->server->hostname;
++
++ /* no hostname for extra channels */
++ ctx.server_hostname = "";
++
+ ctx.username = ses->user_name;
+ ctx.password = ses->password;
+ ctx.sectype = ses->sectype;
+@@ -1093,14 +1096,14 @@ sess_establish_session(struct sess_data *sess_data)
+ struct cifs_ses *ses = sess_data->ses;
+ struct TCP_Server_Info *server = sess_data->server;
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ if (!server->session_estab) {
+ if (server->sign) {
+ server->session_key.response =
+ kmemdup(ses->auth_key.response,
+ ses->auth_key.len, GFP_KERNEL);
+ if (!server->session_key.response) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ return -ENOMEM;
+ }
+ server->session_key.len =
+@@ -1109,7 +1112,7 @@ sess_establish_session(struct sess_data *sess_data)
+ server->sequence_number = 0x2;
+ server->session_estab = true;
+ }
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ cifs_dbg(FYI, "CIFS session established successfully\n");
+ return 0;
+diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
+index c71c9a44bef4b..2e20ee4dab7b7 100644
+--- a/fs/cifs/smb1ops.c
++++ b/fs/cifs/smb1ops.c
+@@ -38,10 +38,10 @@ send_nt_cancel(struct TCP_Server_Info *server, struct smb_rqst *rqst,
+ in_buf->WordCount = 0;
+ put_bcc(0, in_buf);
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ rc = cifs_sign_smb(in_buf, server, &mid->sequence_number);
+ if (rc) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ return rc;
+ }
+
+@@ -55,7 +55,7 @@ send_nt_cancel(struct TCP_Server_Info *server, struct smb_rqst *rqst,
+ if (rc < 0)
+ server->sequence_number--;
+
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ cifs_dbg(FYI, "issued NT_CANCEL for mid %u, rc = %d\n",
+ get_mid(in_buf), rc);
+diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
+index 861291662c95d..6e26edbffc486 100644
+--- a/fs/cifs/smb2ops.c
++++ b/fs/cifs/smb2ops.c
+@@ -4326,11 +4326,13 @@ smb3_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock,
+ }
+ }
+
++#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
+ static bool
+ smb2_is_read_op(__u32 oplock)
+ {
+ return oplock == SMB2_OPLOCK_LEVEL_II;
+ }
++#endif /* CIFS_ALLOW_INSECURE_LEGACY */
+
+ static bool
+ smb21_is_read_op(__u32 oplock)
+@@ -5429,7 +5431,7 @@ out:
+ return rc;
+ }
+
+-
++#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
+ struct smb_version_operations smb20_operations = {
+ .compare_fids = smb2_compare_fids,
+ .setup_request = smb2_setup_request,
+@@ -5528,6 +5530,7 @@ struct smb_version_operations smb20_operations = {
+ .is_status_io_timeout = smb2_is_status_io_timeout,
+ .is_network_name_deleted = smb2_is_network_name_deleted,
+ };
++#endif /* CIFS_ALLOW_INSECURE_LEGACY */
+
+ struct smb_version_operations smb21_operations = {
+ .compare_fids = smb2_compare_fids,
+@@ -5859,6 +5862,7 @@ struct smb_version_operations smb311_operations = {
+ .is_network_name_deleted = smb2_is_network_name_deleted,
+ };
+
++#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
+ struct smb_version_values smb20_values = {
+ .version_string = SMB20_VERSION_STRING,
+ .protocol_id = SMB20_PROT_ID,
+@@ -5879,6 +5883,7 @@ struct smb_version_values smb20_values = {
+ .signing_required = SMB2_NEGOTIATE_SIGNING_REQUIRED,
+ .create_lease_size = sizeof(struct create_lease),
+ };
++#endif /* ALLOW_INSECURE_LEGACY */
+
+ struct smb_version_values smb21_values = {
+ .version_string = SMB21_VERSION_STRING,
+diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
+index f5321a3500f38..179c1630bf561 100644
+--- a/fs/cifs/smb2pdu.c
++++ b/fs/cifs/smb2pdu.c
+@@ -288,6 +288,9 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
+ mutex_unlock(&ses->session_mutex);
+ rc = -EHOSTDOWN;
+ goto failed;
++ } else if (rc) {
++ mutex_unlock(&ses->session_mutex);
++ goto out;
+ }
+ } else {
+ mutex_unlock(&ses->session_mutex);
+@@ -1369,13 +1372,13 @@ SMB2_sess_establish_session(struct SMB2_sess_data *sess_data)
+ struct cifs_ses *ses = sess_data->ses;
+ struct TCP_Server_Info *server = sess_data->server;
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ if (server->ops->generate_signingkey) {
+ rc = server->ops->generate_signingkey(ses, server);
+ if (rc) {
+ cifs_dbg(FYI,
+ "SMB3 session key generation failed\n");
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ return rc;
+ }
+ }
+@@ -1383,7 +1386,7 @@ SMB2_sess_establish_session(struct SMB2_sess_data *sess_data)
+ server->sequence_number = 0x2;
+ server->session_estab = true;
+ }
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ cifs_dbg(FYI, "SMB2/3 session established successfully\n");
+ return rc;
+diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
+index 31ef64eb7fbb9..35829d2a09181 100644
+--- a/fs/cifs/smbdirect.c
++++ b/fs/cifs/smbdirect.c
+@@ -1382,9 +1382,9 @@ void smbd_destroy(struct TCP_Server_Info *server)
+ log_rdma_event(INFO, "freeing mr list\n");
+ wake_up_interruptible_all(&info->wait_mr);
+ while (atomic_read(&info->mr_used_count)) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ msleep(1000);
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ }
+ destroy_mr_list(info);
+
+diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
+index c667e6ddfe2f7..71750cf7bf55a 100644
+--- a/fs/cifs/transport.c
++++ b/fs/cifs/transport.c
+@@ -822,7 +822,7 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_rqst *rqst,
+ } else
+ instance = exist_credits->instance;
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+
+ /*
+ * We can't use credits obtained from the previous session to send this
+@@ -830,14 +830,14 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_rqst *rqst,
+ * return -EAGAIN in such cases to let callers handle it.
+ */
+ if (instance != server->reconnect_instance) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ add_credits_and_wake_if(server, &credits, optype);
+ return -EAGAIN;
+ }
+
+ mid = server->ops->setup_async_request(server, rqst);
+ if (IS_ERR(mid)) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ add_credits_and_wake_if(server, &credits, optype);
+ return PTR_ERR(mid);
+ }
+@@ -868,7 +868,7 @@ cifs_call_async(struct TCP_Server_Info *server, struct smb_rqst *rqst,
+ cifs_delete_mid(mid);
+ }
+
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ if (rc == 0)
+ return 0;
+@@ -1109,7 +1109,7 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
+ * of smb data.
+ */
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+
+ /*
+ * All the parts of the compound chain belong obtained credits from the
+@@ -1119,7 +1119,7 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
+ * handle it.
+ */
+ if (instance != server->reconnect_instance) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ for (j = 0; j < num_rqst; j++)
+ add_credits(server, &credits[j], optype);
+ return -EAGAIN;
+@@ -1131,7 +1131,7 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
+ revert_current_mid(server, i);
+ for (j = 0; j < i; j++)
+ cifs_delete_mid(midQ[j]);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ /* Update # of requests on wire to server */
+ for (j = 0; j < num_rqst; j++)
+@@ -1163,7 +1163,7 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
+ server->sequence_number -= 2;
+ }
+
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ /*
+ * If sending failed for some reason or it is an oplock break that we
+@@ -1190,9 +1190,9 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
+ if ((ses->status == CifsNew) || (optype & CIFS_NEG_OP) || (optype & CIFS_SESS_OP)) {
+ spin_unlock(&cifs_tcp_ses_lock);
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ smb311_update_preauth_hash(ses, server, rqst[0].rq_iov, rqst[0].rq_nvec);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ spin_lock(&cifs_tcp_ses_lock);
+ }
+@@ -1266,9 +1266,9 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
+ .iov_len = resp_iov[0].iov_len
+ };
+ spin_unlock(&cifs_tcp_ses_lock);
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+ smb311_update_preauth_hash(ses, server, &iov, 1);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ spin_lock(&cifs_tcp_ses_lock);
+ }
+ spin_unlock(&cifs_tcp_ses_lock);
+@@ -1385,11 +1385,11 @@ SendReceive(const unsigned int xid, struct cifs_ses *ses,
+ and avoid races inside tcp sendmsg code that could cause corruption
+ of smb data */
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+
+ rc = allocate_mid(ses, in_buf, &midQ);
+ if (rc) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ /* Update # of requests on wire to server */
+ add_credits(server, &credits, 0);
+ return rc;
+@@ -1397,7 +1397,7 @@ SendReceive(const unsigned int xid, struct cifs_ses *ses,
+
+ rc = cifs_sign_smb(in_buf, server, &midQ->sequence_number);
+ if (rc) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ goto out;
+ }
+
+@@ -1411,7 +1411,7 @@ SendReceive(const unsigned int xid, struct cifs_ses *ses,
+ if (rc < 0)
+ server->sequence_number -= 2;
+
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ if (rc < 0)
+ goto out;
+@@ -1530,18 +1530,18 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifs_tcon *tcon,
+ and avoid races inside tcp sendmsg code that could cause corruption
+ of smb data */
+
+- mutex_lock(&server->srv_mutex);
++ cifs_server_lock(server);
+
+ rc = allocate_mid(ses, in_buf, &midQ);
+ if (rc) {
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ return rc;
+ }
+
+ rc = cifs_sign_smb(in_buf, server, &midQ->sequence_number);
+ if (rc) {
+ cifs_delete_mid(midQ);
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+ return rc;
+ }
+
+@@ -1554,7 +1554,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifs_tcon *tcon,
+ if (rc < 0)
+ server->sequence_number -= 2;
+
+- mutex_unlock(&server->srv_mutex);
++ cifs_server_unlock(server);
+
+ if (rc < 0) {
+ cifs_delete_mid(midQ);
+diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
+index e6dea6dfca161..3e3e96043b5b5 100644
+--- a/fs/erofs/zdata.c
++++ b/fs/erofs/zdata.c
+@@ -214,7 +214,7 @@ struct z_erofs_decompress_frontend {
+
+ #define DECOMPRESS_FRONTEND_INIT(__i) { \
+ .inode = __i, .owned_head = Z_EROFS_PCLUSTER_TAIL, \
+- .mode = COLLECT_PRIMARY_FOLLOWED }
++ .mode = COLLECT_PRIMARY_FOLLOWED, .backmost = true }
+
+ static struct page *z_pagemap_global[Z_EROFS_VMAP_GLOBAL_PAGES];
+ static DEFINE_MUTEX(z_pagemap_global_lock);
+diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
+index 909085a78f9c3..beceac9885c3e 100644
+--- a/fs/f2fs/checkpoint.c
++++ b/fs/f2fs/checkpoint.c
+@@ -98,13 +98,7 @@ repeat:
+ }
+
+ if (unlikely(!PageUptodate(page))) {
+- if (page->index == sbi->metapage_eio_ofs) {
+- if (sbi->metapage_eio_cnt++ == MAX_RETRY_META_PAGE_EIO)
+- set_ckpt_flags(sbi, CP_ERROR_FLAG);
+- } else {
+- sbi->metapage_eio_ofs = page->index;
+- sbi->metapage_eio_cnt = 0;
+- }
++ f2fs_handle_page_eio(sbi, page->index, META);
+ f2fs_put_page(page, 1);
+ return ERR_PTR(-EIO);
+ }
+@@ -158,7 +152,7 @@ static bool __is_bitmap_valid(struct f2fs_sb_info *sbi, block_t blkaddr,
+ f2fs_err(sbi, "Inconsistent error blkaddr:%u, sit bitmap:%d",
+ blkaddr, exist);
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
+- WARN_ON(1);
++ dump_stack();
+ }
+ return exist;
+ }
+@@ -196,7 +190,7 @@ bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
+ f2fs_warn(sbi, "access invalid blkaddr:%u",
+ blkaddr);
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
+- WARN_ON(1);
++ dump_stack();
+ return false;
+ } else {
+ return __is_bitmap_valid(sbi, blkaddr, type);
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index 6ec8c6d4711f4..9b89f26af1f3f 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -578,8 +578,8 @@ enum {
+ /* maximum retry quota flush count */
+ #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT 8
+
+-/* maximum retry of EIO'ed meta page */
+-#define MAX_RETRY_META_PAGE_EIO 100
++/* maximum retry of EIO'ed page */
++#define MAX_RETRY_PAGE_EIO 100
+
+ #define F2FS_LINK_MAX 0xffffffff /* maximum link count per file */
+
+@@ -1614,8 +1614,8 @@ struct f2fs_sb_info {
+ /* keep migration IO order for LFS mode */
+ struct f2fs_rwsem io_order_lock;
+ mempool_t *write_io_dummy; /* Dummy pages */
+- pgoff_t metapage_eio_ofs; /* EIO page offset */
+- int metapage_eio_cnt; /* EIO count */
++ pgoff_t page_eio_ofs[NR_PAGE_TYPE]; /* EIO page offset */
++ int page_eio_cnt[NR_PAGE_TYPE]; /* EIO count */
+
+ /* for checkpoint */
+ struct f2fs_checkpoint *ckpt; /* raw checkpoint pointer */
+@@ -4541,6 +4541,21 @@ static inline void f2fs_io_schedule_timeout(long timeout)
+ io_schedule_timeout(timeout);
+ }
+
++static inline void f2fs_handle_page_eio(struct f2fs_sb_info *sbi, pgoff_t ofs,
++ enum page_type type)
++{
++ if (unlikely(f2fs_cp_error(sbi)))
++ return;
++
++ if (ofs == sbi->page_eio_ofs[type]) {
++ if (sbi->page_eio_cnt[type]++ == MAX_RETRY_PAGE_EIO)
++ set_ckpt_flags(sbi, CP_ERROR_FLAG);
++ } else {
++ sbi->page_eio_ofs[type] = ofs;
++ sbi->page_eio_cnt[type] = 0;
++ }
++}
++
+ #define EFSBADCRC EBADMSG /* Bad CRC detected */
+ #define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */
+
+diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
+index 176e97b985e61..5d1b97e852e74 100644
+--- a/fs/f2fs/file.c
++++ b/fs/f2fs/file.c
+@@ -2687,6 +2687,7 @@ do_map:
+ }
+
+ set_page_dirty(page);
++ set_page_private_gcing(page);
+ f2fs_put_page(page, 1);
+
+ idx++;
+diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
+index c45d341dcf6e5..a8d0fa2731cbe 100644
+--- a/fs/f2fs/node.c
++++ b/fs/f2fs/node.c
+@@ -1416,8 +1416,7 @@ repeat:
+
+ err = read_node_page(page, 0);
+ if (err < 0) {
+- f2fs_put_page(page, 1);
+- return ERR_PTR(err);
++ goto out_put_err;
+ } else if (err == LOCKED_PAGE) {
+ err = 0;
+ goto page_hit;
+@@ -1443,19 +1442,21 @@ repeat:
+ goto out_err;
+ }
+ page_hit:
+- if (unlikely(nid != nid_of_node(page))) {
+- f2fs_warn(sbi, "inconsistent node block, nid:%lu, node_footer[nid:%u,ino:%u,ofs:%u,cpver:%llu,blkaddr:%u]",
++ if (likely(nid == nid_of_node(page)))
++ return page;
++
++ f2fs_warn(sbi, "inconsistent node block, nid:%lu, node_footer[nid:%u,ino:%u,ofs:%u,cpver:%llu,blkaddr:%u]",
+ nid, nid_of_node(page), ino_of_node(page),
+ ofs_of_node(page), cpver_of_node(page),
+ next_blkaddr_of_node(page));
+- set_sbi_flag(sbi, SBI_NEED_FSCK);
+- err = -EINVAL;
++ set_sbi_flag(sbi, SBI_NEED_FSCK);
++ err = -EINVAL;
+ out_err:
+- ClearPageUptodate(page);
+- f2fs_put_page(page, 1);
+- return ERR_PTR(err);
+- }
+- return page;
++ ClearPageUptodate(page);
++out_put_err:
++ f2fs_handle_page_eio(sbi, page->index, NODE);
++ f2fs_put_page(page, 1);
++ return ERR_PTR(err);
+ }
+
+ struct page *f2fs_get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid)
+diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
+index a1074a26e784d..90050b4add7a5 100644
+--- a/fs/fs-writeback.c
++++ b/fs/fs-writeback.c
+@@ -120,6 +120,7 @@ static bool inode_io_list_move_locked(struct inode *inode,
+ struct list_head *head)
+ {
+ assert_spin_locked(&wb->list_lock);
++ assert_spin_locked(&inode->i_lock);
+
+ list_move(&inode->i_io_list, head);
+
+@@ -1365,9 +1366,9 @@ static int move_expired_inodes(struct list_head *delaying_queue,
+ inode = wb_inode(delaying_queue->prev);
+ if (inode_dirtied_after(inode, dirtied_before))
+ break;
++ spin_lock(&inode->i_lock);
+ list_move(&inode->i_io_list, &tmp);
+ moved++;
+- spin_lock(&inode->i_lock);
+ inode->i_state |= I_SYNC_QUEUED;
+ spin_unlock(&inode->i_lock);
+ if (sb_is_blkdev_sb(inode->i_sb))
+@@ -1383,7 +1384,12 @@ static int move_expired_inodes(struct list_head *delaying_queue,
+ goto out;
+ }
+
+- /* Move inodes from one superblock together */
++ /*
++ * Although inode's i_io_list is moved from 'tmp' to 'dispatch_queue',
++ * we don't take inode->i_lock here because it is just a pointless overhead.
++ * Inode is already marked as I_SYNC_QUEUED so writeback list handling is
++ * fully under our control.
++ */
+ while (!list_empty(&tmp)) {
+ sb = wb_inode(tmp.prev)->i_sb;
+ list_for_each_prev_safe(pos, node, &tmp) {
+@@ -1826,8 +1832,8 @@ static long writeback_sb_inodes(struct super_block *sb,
+ * We'll have another go at writing back this inode
+ * when we completed a full scan of b_io.
+ */
+- spin_unlock(&inode->i_lock);
+ requeue_io(inode, wb);
++ spin_unlock(&inode->i_lock);
+ trace_writeback_sb_inodes_requeue(inode);
+ continue;
+ }
+@@ -2358,6 +2364,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
+ {
+ struct super_block *sb = inode->i_sb;
+ int dirtytime = 0;
++ struct bdi_writeback *wb = NULL;
+
+ trace_writeback_mark_inode_dirty(inode, flags);
+
+@@ -2409,6 +2416,17 @@ void __mark_inode_dirty(struct inode *inode, int flags)
+ inode->i_state &= ~I_DIRTY_TIME;
+ inode->i_state |= flags;
+
++ /*
++ * Grab inode's wb early because it requires dropping i_lock and we
++ * need to make sure following checks happen atomically with dirty
++ * list handling so that we don't move inodes under flush worker's
++ * hands.
++ */
++ if (!was_dirty) {
++ wb = locked_inode_to_wb_and_lock_list(inode);
++ spin_lock(&inode->i_lock);
++ }
++
+ /*
+ * If the inode is queued for writeback by flush worker, just
+ * update its dirty state. Once the flush worker is done with
+@@ -2416,7 +2434,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
+ * list, based upon its state.
+ */
+ if (inode->i_state & I_SYNC_QUEUED)
+- goto out_unlock_inode;
++ goto out_unlock;
+
+ /*
+ * Only add valid (hashed) inodes to the superblock's
+@@ -2424,22 +2442,19 @@ void __mark_inode_dirty(struct inode *inode, int flags)
+ */
+ if (!S_ISBLK(inode->i_mode)) {
+ if (inode_unhashed(inode))
+- goto out_unlock_inode;
++ goto out_unlock;
+ }
+ if (inode->i_state & I_FREEING)
+- goto out_unlock_inode;
++ goto out_unlock;
+
+ /*
+ * If the inode was already on b_dirty/b_io/b_more_io, don't
+ * reposition it (that would break b_dirty time-ordering).
+ */
+ if (!was_dirty) {
+- struct bdi_writeback *wb;
+ struct list_head *dirty_list;
+ bool wakeup_bdi = false;
+
+- wb = locked_inode_to_wb_and_lock_list(inode);
+-
+ inode->dirtied_when = jiffies;
+ if (dirtytime)
+ inode->dirtied_time_when = jiffies;
+@@ -2453,6 +2468,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
+ dirty_list);
+
+ spin_unlock(&wb->list_lock);
++ spin_unlock(&inode->i_lock);
+ trace_writeback_dirty_inode_enqueue(inode);
+
+ /*
+@@ -2467,6 +2483,9 @@ void __mark_inode_dirty(struct inode *inode, int flags)
+ return;
+ }
+ }
++out_unlock:
++ if (wb)
++ spin_unlock(&wb->list_lock);
+ out_unlock_inode:
+ spin_unlock(&inode->i_lock);
+ }
+diff --git a/fs/inode.c b/fs/inode.c
+index 9d9b422504d1a..bd4da9c5207ea 100644
+--- a/fs/inode.c
++++ b/fs/inode.c
+@@ -27,7 +27,7 @@
+ * Inode locking rules:
+ *
+ * inode->i_lock protects:
+- * inode->i_state, inode->i_hash, __iget()
++ * inode->i_state, inode->i_hash, __iget(), inode->i_io_list
+ * Inode LRU list locks protect:
+ * inode->i_sb->s_inode_lru, inode->i_lru
+ * inode->i_sb->s_inode_list_lock protects:
+diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c
+index 71f03a5d36ed2..f83a468b64883 100644
+--- a/fs/jffs2/fs.c
++++ b/fs/jffs2/fs.c
+@@ -604,6 +604,7 @@ out_root:
+ jffs2_free_raw_node_refs(c);
+ kvfree(c->blocks);
+ jffs2_clear_xattr_subsystem(c);
++ jffs2_sum_exit(c);
+ out_inohash:
+ kfree(c->inocache_list);
+ out_wbuf:
+diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
+index e205fde7163ab..6eca72cfa1f28 100644
+--- a/fs/kernfs/dir.c
++++ b/fs/kernfs/dir.c
+@@ -18,7 +18,15 @@
+ #include "kernfs-internal.h"
+
+ static DEFINE_SPINLOCK(kernfs_rename_lock); /* kn->parent and ->name */
+-static char kernfs_pr_cont_buf[PATH_MAX]; /* protected by rename_lock */
++/*
++ * Don't use rename_lock to piggy back on pr_cont_buf. We don't want to
++ * call pr_cont() while holding rename_lock. Because sometimes pr_cont()
++ * will perform wakeups when releasing console_sem. Holding rename_lock
++ * will introduce deadlock if the scheduler reads the kernfs_name in the
++ * wakeup path.
++ */
++static DEFINE_SPINLOCK(kernfs_pr_cont_lock);
++static char kernfs_pr_cont_buf[PATH_MAX]; /* protected by pr_cont_lock */
+ static DEFINE_SPINLOCK(kernfs_idr_lock); /* root->ino_idr */
+
+ #define rb_to_kn(X) rb_entry((X), struct kernfs_node, rb)
+@@ -229,12 +237,12 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
+ {
+ unsigned long flags;
+
+- spin_lock_irqsave(&kernfs_rename_lock, flags);
++ spin_lock_irqsave(&kernfs_pr_cont_lock, flags);
+
+- kernfs_name_locked(kn, kernfs_pr_cont_buf, sizeof(kernfs_pr_cont_buf));
++ kernfs_name(kn, kernfs_pr_cont_buf, sizeof(kernfs_pr_cont_buf));
+ pr_cont("%s", kernfs_pr_cont_buf);
+
+- spin_unlock_irqrestore(&kernfs_rename_lock, flags);
++ spin_unlock_irqrestore(&kernfs_pr_cont_lock, flags);
+ }
+
+ /**
+@@ -248,10 +256,10 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
+ unsigned long flags;
+ int sz;
+
+- spin_lock_irqsave(&kernfs_rename_lock, flags);
++ spin_lock_irqsave(&kernfs_pr_cont_lock, flags);
+
+- sz = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
+- sizeof(kernfs_pr_cont_buf));
++ sz = kernfs_path_from_node(kn, NULL, kernfs_pr_cont_buf,
++ sizeof(kernfs_pr_cont_buf));
+ if (sz < 0) {
+ pr_cont("(error)");
+ goto out;
+@@ -265,7 +273,7 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
+ pr_cont("%s", kernfs_pr_cont_buf);
+
+ out:
+- spin_unlock_irqrestore(&kernfs_rename_lock, flags);
++ spin_unlock_irqrestore(&kernfs_pr_cont_lock, flags);
+ }
+
+ /**
+@@ -823,13 +831,12 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
+
+ lockdep_assert_held_read(&kernfs_root(parent)->kernfs_rwsem);
+
+- /* grab kernfs_rename_lock to piggy back on kernfs_pr_cont_buf */
+- spin_lock_irq(&kernfs_rename_lock);
++ spin_lock_irq(&kernfs_pr_cont_lock);
+
+ len = strlcpy(kernfs_pr_cont_buf, path, sizeof(kernfs_pr_cont_buf));
+
+ if (len >= sizeof(kernfs_pr_cont_buf)) {
+- spin_unlock_irq(&kernfs_rename_lock);
++ spin_unlock_irq(&kernfs_pr_cont_lock);
+ return NULL;
+ }
+
+@@ -841,7 +848,7 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
+ parent = kernfs_find_ns(parent, name, ns);
+ }
+
+- spin_unlock_irq(&kernfs_rename_lock);
++ spin_unlock_irq(&kernfs_pr_cont_lock);
+
+ return parent;
+ }
+diff --git a/fs/ksmbd/smbacl.c b/fs/ksmbd/smbacl.c
+index 6ecf55ea1fed5..38f23bf981ac9 100644
+--- a/fs/ksmbd/smbacl.c
++++ b/fs/ksmbd/smbacl.c
+@@ -1261,6 +1261,7 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
+ if (!access_bits)
+ access_bits =
+ SET_MINIMUM_RIGHTS;
++ posix_acl_release(posix_acls);
+ goto check_access_bits;
+ }
+ }
+diff --git a/fs/ksmbd/transport_rdma.c b/fs/ksmbd/transport_rdma.c
+index e646d79554b8c..3f5d135716945 100644
+--- a/fs/ksmbd/transport_rdma.c
++++ b/fs/ksmbd/transport_rdma.c
+@@ -569,6 +569,7 @@ static void recv_done(struct ib_cq *cq, struct ib_wc *wc)
+ }
+ t->negotiation_requested = true;
+ t->full_packet_received = true;
++ t->status = SMB_DIRECT_CS_CONNECTED;
+ enqueue_reassembly(t, recvmsg, 0);
+ wake_up_interruptible(&t->wait_status);
+ break;
+diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
+index 8c5907287c161..d1eaaeb7f7135 100644
+--- a/fs/nfs/nfs4proc.c
++++ b/fs/nfs/nfs4proc.c
+@@ -3098,6 +3098,10 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
+ }
+
+ out:
++ if (opendata->lgp) {
++ nfs4_lgopen_release(opendata->lgp);
++ opendata->lgp = NULL;
++ }
+ if (!opendata->cancelled)
+ nfs4_sequence_free_slot(&opendata->o_res.seq_res);
+ return ret;
+diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
+index 2c1b027774d41..0326bdec5de77 100644
+--- a/fs/nfsd/filecache.c
++++ b/fs/nfsd/filecache.c
+@@ -306,11 +306,12 @@ nfsd_file_put(struct nfsd_file *nf)
+ if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags) == 0) {
+ nfsd_file_flush(nf);
+ nfsd_file_put_noref(nf);
+- } else {
++ } else if (nf->nf_file) {
+ nfsd_file_put_noref(nf);
+- if (nf->nf_file)
+- nfsd_file_schedule_laundrette();
+- }
++ nfsd_file_schedule_laundrette();
++ } else
++ nfsd_file_put_noref(nf);
++
+ if (atomic_long_read(&nfsd_filecache_count) >= NFSD_FILE_LRU_LIMIT)
+ nfsd_file_gc();
+ }
+diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
+index e20e7c8414896..1c2ece9611287 100644
+--- a/fs/zonefs/super.c
++++ b/fs/zonefs/super.c
+@@ -1690,11 +1690,6 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
+ sbi->s_mount_opts = ZONEFS_MNTOPT_ERRORS_RO;
+ sbi->s_max_open_zones = bdev_max_open_zones(sb->s_bdev);
+ atomic_set(&sbi->s_open_zones, 0);
+- if (!sbi->s_max_open_zones &&
+- sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
+- zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
+- sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
+- }
+
+ ret = zonefs_read_super(sb);
+ if (ret)
+@@ -1713,6 +1708,12 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
+ zonefs_info(sb, "Mounting %u zones",
+ blkdev_nr_zones(sb->s_bdev->bd_disk));
+
++ if (!sbi->s_max_open_zones &&
++ sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
++ zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
++ sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
++ }
++
+ /* Create root directory inode */
+ ret = -ENOMEM;
+ inode = new_inode(sb);
+diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
+index 60d0161389971..108e3d114bfc1 100644
+--- a/include/linux/blkdev.h
++++ b/include/linux/blkdev.h
+@@ -147,6 +147,7 @@ struct gendisk {
+ #define GD_DEAD 2
+ #define GD_NATIVE_CAPACITY 3
+ #define GD_ADDED 4
++#define GD_SUPPRESS_PART_SCAN 5
+
+ struct mutex open_mutex; /* open/close mutex */
+ unsigned open_partitions; /* number of open partitions */
+diff --git a/include/linux/export.h b/include/linux/export.h
+index 27d848712b90b..5910ccb66ca2d 100644
+--- a/include/linux/export.h
++++ b/include/linux/export.h
+@@ -2,6 +2,8 @@
+ #ifndef _LINUX_EXPORT_H
+ #define _LINUX_EXPORT_H
+
++#include <linux/stringify.h>
++
+ /*
+ * Export symbols from the kernel to modules. Forked from module.h
+ * to reduce the amount of pointless cruft we feed to gcc when only
+@@ -154,7 +156,6 @@ struct kernel_symbol {
+ #endif /* CONFIG_MODULES */
+
+ #ifdef DEFAULT_SYMBOL_NAMESPACE
+-#include <linux/stringify.h>
+ #define _EXPORT_SYMBOL(sym, sec) __EXPORT_SYMBOL(sym, sec, __stringify(DEFAULT_SYMBOL_NAMESPACE))
+ #else
+ #define _EXPORT_SYMBOL(sym, sec) __EXPORT_SYMBOL(sym, sec, "")
+@@ -162,8 +163,8 @@ struct kernel_symbol {
+
+ #define EXPORT_SYMBOL(sym) _EXPORT_SYMBOL(sym, "")
+ #define EXPORT_SYMBOL_GPL(sym) _EXPORT_SYMBOL(sym, "_gpl")
+-#define EXPORT_SYMBOL_NS(sym, ns) __EXPORT_SYMBOL(sym, "", #ns)
+-#define EXPORT_SYMBOL_NS_GPL(sym, ns) __EXPORT_SYMBOL(sym, "_gpl", #ns)
++#define EXPORT_SYMBOL_NS(sym, ns) __EXPORT_SYMBOL(sym, "", __stringify(ns))
++#define EXPORT_SYMBOL_NS_GPL(sym, ns) __EXPORT_SYMBOL(sym, "_gpl", __stringify(ns))
+
+ #endif /* !__ASSEMBLY__ */
+
+diff --git a/include/linux/extcon.h b/include/linux/extcon.h
+index 0c19010da77fa..685401d94d398 100644
+--- a/include/linux/extcon.h
++++ b/include/linux/extcon.h
+@@ -296,7 +296,7 @@ static inline void devm_extcon_unregister_notifier_all(struct device *dev,
+
+ static inline struct extcon_dev *extcon_get_extcon_dev(const char *extcon_name)
+ {
+- return ERR_PTR(-ENODEV);
++ return NULL;
+ }
+
+ static inline struct extcon_dev *extcon_find_edev_by_node(struct device_node *node)
+diff --git a/include/linux/iio/common/st_sensors.h b/include/linux/iio/common/st_sensors.h
+index 22f67845cdd36..db4a1b260348c 100644
+--- a/include/linux/iio/common/st_sensors.h
++++ b/include/linux/iio/common/st_sensors.h
+@@ -237,6 +237,7 @@ struct st_sensor_settings {
+ * @hw_irq_trigger: if we're using the hardware interrupt on the sensor.
+ * @hw_timestamp: Latest timestamp from the interrupt handler, when in use.
+ * @buffer_data: Data used by buffer part.
++ * @odr_lock: Local lock for preventing concurrent ODR accesses/changes
+ */
+ struct st_sensor_data {
+ struct iio_trigger *trig;
+@@ -261,6 +262,8 @@ struct st_sensor_data {
+ s64 hw_timestamp;
+
+ char buffer_data[ST_SENSORS_MAX_BUFFER_SIZE] ____cacheline_aligned;
++
++ struct mutex odr_lock;
+ };
+
+ #ifdef CONFIG_IIO_BUFFER
+diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
+index 107751cc047be..bf1eef337a074 100644
+--- a/include/linux/jump_label.h
++++ b/include/linux/jump_label.h
+@@ -256,9 +256,9 @@ extern void static_key_disable_cpuslocked(struct static_key *key);
+ #include <linux/atomic.h>
+ #include <linux/bug.h>
+
+-static inline int static_key_count(struct static_key *key)
++static __always_inline int static_key_count(struct static_key *key)
+ {
+- return atomic_read(&key->enabled);
++ return arch_atomic_read(&key->enabled);
+ }
+
+ static __always_inline void jump_label_init(void)
+diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
+index 7d2d0ba821441..2e162ec2a3d3b 100644
+--- a/include/linux/mlx5/mlx5_ifc.h
++++ b/include/linux/mlx5/mlx5_ifc.h
+@@ -5180,12 +5180,11 @@ struct mlx5_ifc_query_qp_out_bits {
+
+ u8 syndrome[0x20];
+
+- u8 reserved_at_40[0x20];
+- u8 ece[0x20];
++ u8 reserved_at_40[0x40];
+
+ u8 opt_param_mask[0x20];
+
+- u8 reserved_at_a0[0x20];
++ u8 ece[0x20];
+
+ struct mlx5_ifc_qpc_bits qpc;
+
+diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
+index c6199dbe25913..0f233b76c9cec 100644
+--- a/include/linux/nodemask.h
++++ b/include/linux/nodemask.h
+@@ -42,11 +42,11 @@
+ * void nodes_shift_right(dst, src, n) Shift right
+ * void nodes_shift_left(dst, src, n) Shift left
+ *
+- * int first_node(mask) Number lowest set bit, or MAX_NUMNODES
+- * int next_node(node, mask) Next node past 'node', or MAX_NUMNODES
+- * int next_node_in(node, mask) Next node past 'node', or wrap to first,
++ * unsigned int first_node(mask) Number lowest set bit, or MAX_NUMNODES
++ * unsigend int next_node(node, mask) Next node past 'node', or MAX_NUMNODES
++ * unsigned int next_node_in(node, mask) Next node past 'node', or wrap to first,
+ * or MAX_NUMNODES
+- * int first_unset_node(mask) First node not set in mask, or
++ * unsigned int first_unset_node(mask) First node not set in mask, or
+ * MAX_NUMNODES
+ *
+ * nodemask_t nodemask_of_node(node) Return nodemask with bit 'node' set
+@@ -153,7 +153,7 @@ static inline void __nodes_clear(nodemask_t *dstp, unsigned int nbits)
+
+ #define node_test_and_set(node, nodemask) \
+ __node_test_and_set((node), &(nodemask))
+-static inline int __node_test_and_set(int node, nodemask_t *addr)
++static inline bool __node_test_and_set(int node, nodemask_t *addr)
+ {
+ return test_and_set_bit(node, addr->bits);
+ }
+@@ -200,7 +200,7 @@ static inline void __nodes_complement(nodemask_t *dstp,
+
+ #define nodes_equal(src1, src2) \
+ __nodes_equal(&(src1), &(src2), MAX_NUMNODES)
+-static inline int __nodes_equal(const nodemask_t *src1p,
++static inline bool __nodes_equal(const nodemask_t *src1p,
+ const nodemask_t *src2p, unsigned int nbits)
+ {
+ return bitmap_equal(src1p->bits, src2p->bits, nbits);
+@@ -208,7 +208,7 @@ static inline int __nodes_equal(const nodemask_t *src1p,
+
+ #define nodes_intersects(src1, src2) \
+ __nodes_intersects(&(src1), &(src2), MAX_NUMNODES)
+-static inline int __nodes_intersects(const nodemask_t *src1p,
++static inline bool __nodes_intersects(const nodemask_t *src1p,
+ const nodemask_t *src2p, unsigned int nbits)
+ {
+ return bitmap_intersects(src1p->bits, src2p->bits, nbits);
+@@ -216,20 +216,20 @@ static inline int __nodes_intersects(const nodemask_t *src1p,
+
+ #define nodes_subset(src1, src2) \
+ __nodes_subset(&(src1), &(src2), MAX_NUMNODES)
+-static inline int __nodes_subset(const nodemask_t *src1p,
++static inline bool __nodes_subset(const nodemask_t *src1p,
+ const nodemask_t *src2p, unsigned int nbits)
+ {
+ return bitmap_subset(src1p->bits, src2p->bits, nbits);
+ }
+
+ #define nodes_empty(src) __nodes_empty(&(src), MAX_NUMNODES)
+-static inline int __nodes_empty(const nodemask_t *srcp, unsigned int nbits)
++static inline bool __nodes_empty(const nodemask_t *srcp, unsigned int nbits)
+ {
+ return bitmap_empty(srcp->bits, nbits);
+ }
+
+ #define nodes_full(nodemask) __nodes_full(&(nodemask), MAX_NUMNODES)
+-static inline int __nodes_full(const nodemask_t *srcp, unsigned int nbits)
++static inline bool __nodes_full(const nodemask_t *srcp, unsigned int nbits)
+ {
+ return bitmap_full(srcp->bits, nbits);
+ }
+@@ -260,15 +260,15 @@ static inline void __nodes_shift_left(nodemask_t *dstp,
+ > MAX_NUMNODES, then the silly min_ts could be dropped. */
+
+ #define first_node(src) __first_node(&(src))
+-static inline int __first_node(const nodemask_t *srcp)
++static inline unsigned int __first_node(const nodemask_t *srcp)
+ {
+- return min_t(int, MAX_NUMNODES, find_first_bit(srcp->bits, MAX_NUMNODES));
++ return min_t(unsigned int, MAX_NUMNODES, find_first_bit(srcp->bits, MAX_NUMNODES));
+ }
+
+ #define next_node(n, src) __next_node((n), &(src))
+-static inline int __next_node(int n, const nodemask_t *srcp)
++static inline unsigned int __next_node(int n, const nodemask_t *srcp)
+ {
+- return min_t(int,MAX_NUMNODES,find_next_bit(srcp->bits, MAX_NUMNODES, n+1));
++ return min_t(unsigned int, MAX_NUMNODES, find_next_bit(srcp->bits, MAX_NUMNODES, n+1));
+ }
+
+ /*
+@@ -276,7 +276,7 @@ static inline int __next_node(int n, const nodemask_t *srcp)
+ * the first node in src if needed. Returns MAX_NUMNODES if src is empty.
+ */
+ #define next_node_in(n, src) __next_node_in((n), &(src))
+-int __next_node_in(int node, const nodemask_t *srcp);
++unsigned int __next_node_in(int node, const nodemask_t *srcp);
+
+ static inline void init_nodemask_of_node(nodemask_t *mask, int node)
+ {
+@@ -296,9 +296,9 @@ static inline void init_nodemask_of_node(nodemask_t *mask, int node)
+ })
+
+ #define first_unset_node(mask) __first_unset_node(&(mask))
+-static inline int __first_unset_node(const nodemask_t *maskp)
++static inline unsigned int __first_unset_node(const nodemask_t *maskp)
+ {
+- return min_t(int,MAX_NUMNODES,
++ return min_t(unsigned int, MAX_NUMNODES,
+ find_first_zero_bit(maskp->bits, MAX_NUMNODES));
+ }
+
+@@ -435,11 +435,11 @@ static inline int num_node_state(enum node_states state)
+
+ #define first_online_node first_node(node_states[N_ONLINE])
+ #define first_memory_node first_node(node_states[N_MEMORY])
+-static inline int next_online_node(int nid)
++static inline unsigned int next_online_node(int nid)
+ {
+ return next_node(nid, node_states[N_ONLINE]);
+ }
+-static inline int next_memory_node(int nid)
++static inline unsigned int next_memory_node(int nid)
+ {
+ return next_node(nid, node_states[N_MEMORY]);
+ }
+diff --git a/include/linux/random.h b/include/linux/random.h
+index 4364de2300be6..e5a543fd29c87 100644
+--- a/include/linux/random.h
++++ b/include/linux/random.h
+@@ -13,7 +13,7 @@
+ struct notifier_block;
+
+ void add_device_randomness(const void *buf, size_t len);
+-void add_bootloader_randomness(const void *buf, size_t len);
++void __init add_bootloader_randomness(const void *buf, size_t len);
+ void add_input_randomness(unsigned int type, unsigned int code,
+ unsigned int value) __latent_entropy;
+ void add_interrupt_randomness(int irq) __latent_entropy;
+diff --git a/include/linux/xarray.h b/include/linux/xarray.h
+index 72feab5ea8d4a..c29e11b2c0739 100644
+--- a/include/linux/xarray.h
++++ b/include/linux/xarray.h
+@@ -1508,6 +1508,7 @@ void *xas_find_marked(struct xa_state *, unsigned long max, xa_mark_t);
+ void xas_init_marks(const struct xa_state *);
+
+ bool xas_nomem(struct xa_state *, gfp_t);
++void xas_destroy(struct xa_state *);
+ void xas_pause(struct xa_state *);
+
+ void xas_create_range(struct xa_state *);
+diff --git a/include/net/ax25.h b/include/net/ax25.h
+index 0f9790c455bbd..a427a05672e2a 100644
+--- a/include/net/ax25.h
++++ b/include/net/ax25.h
+@@ -228,6 +228,7 @@ typedef struct ax25_dev {
+ ax25_dama_info dama;
+ #endif
+ refcount_t refcount;
++ bool device_up;
+ } ax25_dev;
+
+ typedef struct ax25_cb {
+diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
+index 5a52a2018b56a..c0ea2a4892b16 100644
+--- a/include/net/bluetooth/hci_core.h
++++ b/include/net/bluetooth/hci_core.h
+@@ -155,21 +155,18 @@ struct bdaddr_list_with_irk {
+ u8 local_irk[16];
+ };
+
++/* Bitmask of connection flags */
+ enum hci_conn_flags {
+- HCI_CONN_FLAG_REMOTE_WAKEUP,
+- HCI_CONN_FLAG_DEVICE_PRIVACY,
+-
+- __HCI_CONN_NUM_FLAGS,
++ HCI_CONN_FLAG_REMOTE_WAKEUP = 1,
++ HCI_CONN_FLAG_DEVICE_PRIVACY = 2,
+ };
+-
+-/* Make sure number of flags doesn't exceed sizeof(current_flags) */
+-static_assert(__HCI_CONN_NUM_FLAGS < 32);
++typedef u8 hci_conn_flags_t;
+
+ struct bdaddr_list_with_flags {
+ struct list_head list;
+ bdaddr_t bdaddr;
+ u8 bdaddr_type;
+- DECLARE_BITMAP(flags, __HCI_CONN_NUM_FLAGS);
++ hci_conn_flags_t flags;
+ };
+
+ struct bt_uuid {
+@@ -576,7 +573,7 @@ struct hci_dev {
+ struct rfkill *rfkill;
+
+ DECLARE_BITMAP(dev_flags, __HCI_NUM_FLAGS);
+- DECLARE_BITMAP(conn_flags, __HCI_CONN_NUM_FLAGS);
++ hci_conn_flags_t conn_flags;
+
+ __s8 adv_tx_power;
+ __u8 adv_data[HCI_MAX_EXT_AD_LENGTH];
+@@ -775,7 +772,7 @@ struct hci_conn_params {
+
+ struct hci_conn *conn;
+ bool explicit_connect;
+- DECLARE_BITMAP(flags, __HCI_CONN_NUM_FLAGS);
++ hci_conn_flags_t flags;
+ u8 privacy_mode;
+ };
+
+diff --git a/include/net/bonding.h b/include/net/bonding.h
+index b14f4c0b4e9ed..cb904d356e31e 100644
+--- a/include/net/bonding.h
++++ b/include/net/bonding.h
+@@ -149,7 +149,9 @@ struct bond_params {
+ struct reciprocal_value reciprocal_packets_per_slave;
+ u16 ad_actor_sys_prio;
+ u16 ad_user_port_key;
++#if IS_ENABLED(CONFIG_IPV6)
+ struct in6_addr ns_targets[BOND_MAX_NS_TARGETS];
++#endif
+
+ /* 2 bytes of padding : see ether_addr_equal_64bits() */
+ u8 ad_actor_system[ETH_ALEN + 2];
+@@ -503,12 +505,14 @@ static inline int bond_is_ip_target_ok(__be32 addr)
+ return !ipv4_is_lbcast(addr) && !ipv4_is_zeronet(addr);
+ }
+
++#if IS_ENABLED(CONFIG_IPV6)
+ static inline int bond_is_ip6_target_ok(struct in6_addr *addr)
+ {
+ return !ipv6_addr_any(addr) &&
+ !ipv6_addr_loopback(addr) &&
+ !ipv6_addr_is_multicast(addr);
+ }
++#endif
+
+ /* Get the oldest arp which we've received on this slave for bond's
+ * arp_targets.
+@@ -746,6 +750,7 @@ static inline int bond_get_targets_ip(__be32 *targets, __be32 ip)
+ return -1;
+ }
+
++#if IS_ENABLED(CONFIG_IPV6)
+ static inline int bond_get_targets_ip6(struct in6_addr *targets, struct in6_addr *ip)
+ {
+ int i;
+@@ -758,6 +763,7 @@ static inline int bond_get_targets_ip6(struct in6_addr *targets, struct in6_addr
+
+ return -1;
+ }
++#endif
+
+ /* exported from bond_main.c */
+ extern unsigned int bond_net_id;
+diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
+index 021778a7e1afa..6484095a8c011 100644
+--- a/include/net/flow_offload.h
++++ b/include/net/flow_offload.h
+@@ -612,5 +612,6 @@ int flow_indr_dev_setup_offload(struct net_device *dev, struct Qdisc *sch,
+ enum tc_setup_type type, void *data,
+ struct flow_block_offload *bo,
+ void (*cleanup)(struct flow_block_cb *block_cb));
++bool flow_indr_dev_exists(void);
+
+ #endif /* _NET_FLOW_OFFLOAD_H */
+diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
+index 20af9d3557b9d..279ae0fff7adb 100644
+--- a/include/net/netfilter/nf_tables.h
++++ b/include/net/netfilter/nf_tables.h
+@@ -1090,7 +1090,6 @@ struct nft_stats {
+
+ struct nft_hook {
+ struct list_head list;
+- bool inactive;
+ struct nf_hook_ops ops;
+ struct rcu_head rcu;
+ };
+diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
+index 7971478439580..3568b6a2f5f0f 100644
+--- a/include/net/netfilter/nf_tables_offload.h
++++ b/include/net/netfilter/nf_tables_offload.h
+@@ -92,7 +92,7 @@ int nft_flow_rule_offload_commit(struct net *net);
+ NFT_OFFLOAD_MATCH(__key, __base, __field, __len, __reg) \
+ memset(&(__reg)->mask, 0xff, (__reg)->len);
+
+-int nft_chain_offload_priority(struct nft_base_chain *basechain);
++bool nft_chain_offload_support(const struct nft_base_chain *basechain);
+
+ int nft_offload_init(void);
+ void nft_offload_exit(void);
+diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
+index 9bab396c1f3ba..d6cf5116b5f98 100644
+--- a/include/net/sch_generic.h
++++ b/include/net/sch_generic.h
+@@ -187,37 +187,17 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
+ if (spin_trylock(&qdisc->seqlock))
+ return true;
+
+- /* Paired with smp_mb__after_atomic() to make sure
+- * STATE_MISSED checking is synchronized with clearing
+- * in pfifo_fast_dequeue().
++ /* No need to insist if the MISSED flag was already set.
++ * Note that test_and_set_bit() also gives us memory ordering
++ * guarantees wrt potential earlier enqueue() and below
++ * spin_trylock(), both of which are necessary to prevent races
+ */
+- smp_mb__before_atomic();
+-
+- /* If the MISSED flag is set, it means other thread has
+- * set the MISSED flag before second spin_trylock(), so
+- * we can return false here to avoid multi cpus doing
+- * the set_bit() and second spin_trylock() concurrently.
+- */
+- if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
++ if (test_and_set_bit(__QDISC_STATE_MISSED, &qdisc->state))
+ return false;
+
+- /* Set the MISSED flag before the second spin_trylock(),
+- * if the second spin_trylock() return false, it means
+- * other cpu holding the lock will do dequeuing for us
+- * or it will see the MISSED flag set after releasing
+- * lock and reschedule the net_tx_action() to do the
+- * dequeuing.
+- */
+- set_bit(__QDISC_STATE_MISSED, &qdisc->state);
+-
+- /* spin_trylock() only has load-acquire semantic, so use
+- * smp_mb__after_atomic() to ensure STATE_MISSED is set
+- * before doing the second spin_trylock().
+- */
+- smp_mb__after_atomic();
+-
+- /* Retry again in case other CPU may not see the new flag
+- * after it releases the lock at the end of qdisc_run_end().
++ /* Try to take the lock again to make sure that we will either
++ * grab it or the CPU that still has it will see MISSED set
++ * when testing it in qdisc_run_end()
+ */
+ return spin_trylock(&qdisc->seqlock);
+ }
+@@ -229,6 +209,12 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
+ if (qdisc->flags & TCQ_F_NOLOCK) {
+ spin_unlock(&qdisc->seqlock);
+
++ /* spin_unlock() only has store-release semantic. The unlock
++ * and test_bit() ordering is a store-load ordering, so a full
++ * memory barrier is needed here.
++ */
++ smp_mb();
++
+ if (unlikely(test_bit(__QDISC_STATE_MISSED,
+ &qdisc->state)))
+ __netif_schedule(qdisc);
+diff --git a/include/net/tcp.h b/include/net/tcp.h
+index cc1295037533a..2d9a78b3beaa9 100644
+--- a/include/net/tcp.h
++++ b/include/net/tcp.h
+@@ -1215,9 +1215,20 @@ static inline unsigned int tcp_packets_in_flight(const struct tcp_sock *tp)
+
+ #define TCP_INFINITE_SSTHRESH 0x7fffffff
+
++static inline u32 tcp_snd_cwnd(const struct tcp_sock *tp)
++{
++ return tp->snd_cwnd;
++}
++
++static inline void tcp_snd_cwnd_set(struct tcp_sock *tp, u32 val)
++{
++ WARN_ON_ONCE((int)val <= 0);
++ tp->snd_cwnd = val;
++}
++
+ static inline bool tcp_in_slow_start(const struct tcp_sock *tp)
+ {
+- return tp->snd_cwnd < tp->snd_ssthresh;
++ return tcp_snd_cwnd(tp) < tp->snd_ssthresh;
+ }
+
+ static inline bool tcp_in_initial_slowstart(const struct tcp_sock *tp)
+@@ -1243,8 +1254,8 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
+ return tp->snd_ssthresh;
+ else
+ return max(tp->snd_ssthresh,
+- ((tp->snd_cwnd >> 1) +
+- (tp->snd_cwnd >> 2)));
++ ((tcp_snd_cwnd(tp) >> 1) +
++ (tcp_snd_cwnd(tp) >> 2)));
+ }
+
+ /* Use define here intentionally to get WARN_ON location shown at the caller */
+@@ -1286,7 +1297,7 @@ static inline bool tcp_is_cwnd_limited(const struct sock *sk)
+
+ /* If in slow start, ensure cwnd grows to twice what was ACKed. */
+ if (tcp_in_slow_start(tp))
+- return tp->snd_cwnd < 2 * tp->max_packets_out;
++ return tcp_snd_cwnd(tp) < 2 * tp->max_packets_out;
+
+ return tp->is_cwnd_limited;
+ }
+diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
+index 521059d8dc0a6..edcd6369de102 100644
+--- a/include/trace/events/tcp.h
++++ b/include/trace/events/tcp.h
+@@ -279,7 +279,7 @@ TRACE_EVENT(tcp_probe,
+ __entry->data_len = skb->len - __tcp_hdrlen(th);
+ __entry->snd_nxt = tp->snd_nxt;
+ __entry->snd_una = tp->snd_una;
+- __entry->snd_cwnd = tp->snd_cwnd;
++ __entry->snd_cwnd = tcp_snd_cwnd(tp);
+ __entry->snd_wnd = tp->snd_wnd;
+ __entry->rcv_wnd = tp->rcv_wnd;
+ __entry->ssthresh = tcp_current_ssthresh(sk);
+diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
+index 05e701f0da81d..1e92b52fc8146 100644
+--- a/kernel/bpf/core.c
++++ b/kernel/bpf/core.c
+@@ -1950,6 +1950,11 @@ out:
+ CONT; \
+ LDX_MEM_##SIZEOP: \
+ DST = *(SIZE *)(unsigned long) (SRC + insn->off); \
++ CONT; \
++ LDX_PROBE_MEM_##SIZEOP: \
++ bpf_probe_read_kernel(&DST, sizeof(SIZE), \
++ (const void *)(long) (SRC + insn->off)); \
++ DST = *((SIZE *)&DST); \
+ CONT;
+
+ LDST(B, u8)
+@@ -1957,15 +1962,6 @@ out:
+ LDST(W, u32)
+ LDST(DW, u64)
+ #undef LDST
+-#define LDX_PROBE(SIZEOP, SIZE) \
+- LDX_PROBE_MEM_##SIZEOP: \
+- bpf_probe_read_kernel(&DST, SIZE, (const void *)(long) (SRC + insn->off)); \
+- CONT;
+- LDX_PROBE(B, 1)
+- LDX_PROBE(H, 2)
+- LDX_PROBE(W, 4)
+- LDX_PROBE(DW, 8)
+-#undef LDX_PROBE
+
+ #define ATOMIC_ALU_OP(BOP, KOP) \
+ case BOP: \
+diff --git a/kernel/sched/autogroup.c b/kernel/sched/autogroup.c
+index 16092b49ff6a9..4ebaf97f7bd85 100644
+--- a/kernel/sched/autogroup.c
++++ b/kernel/sched/autogroup.c
+@@ -36,6 +36,7 @@ void __init autogroup_init(struct task_struct *init_task)
+ kref_init(&autogroup_default.kref);
+ init_rwsem(&autogroup_default.lock);
+ init_task->signal->autogroup = &autogroup_default;
++ sched_autogroup_sysctl_init();
+ }
+
+ void autogroup_free(struct task_group *tg)
+@@ -219,7 +220,6 @@ void sched_autogroup_exit(struct signal_struct *sig)
+ static int __init setup_autogroup(char *str)
+ {
+ sysctl_sched_autogroup_enabled = 0;
+- sched_autogroup_sysctl_init();
+
+ return 1;
+ }
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index f6fb04d79eba6..114c31bdf8f97 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -2837,7 +2837,7 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb,
+ }
+ EXPORT_SYMBOL_GPL(trace_event_buffer_lock_reserve);
+
+-static DEFINE_SPINLOCK(tracepoint_iter_lock);
++static DEFINE_RAW_SPINLOCK(tracepoint_iter_lock);
+ static DEFINE_MUTEX(tracepoint_printk_mutex);
+
+ static void output_printk(struct trace_event_buffer *fbuffer)
+@@ -2865,14 +2865,14 @@ static void output_printk(struct trace_event_buffer *fbuffer)
+
+ event = &fbuffer->trace_file->event_call->event;
+
+- spin_lock_irqsave(&tracepoint_iter_lock, flags);
++ raw_spin_lock_irqsave(&tracepoint_iter_lock, flags);
+ trace_seq_init(&iter->seq);
+ iter->ent = fbuffer->entry;
+ event_call->event.funcs->trace(iter, 0, event);
+ trace_seq_putc(&iter->seq, 0);
+ printk("%s", iter->seq.buffer);
+
+- spin_unlock_irqrestore(&tracepoint_iter_lock, flags);
++ raw_spin_unlock_irqrestore(&tracepoint_iter_lock, flags);
+ }
+
+ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
+@@ -6334,12 +6334,18 @@ static void tracing_set_nop(struct trace_array *tr)
+ tr->current_trace = &nop_trace;
+ }
+
++static bool tracer_options_updated;
++
+ static void add_tracer_options(struct trace_array *tr, struct tracer *t)
+ {
+ /* Only enable if the directory has been created already. */
+ if (!tr->dir)
+ return;
+
++ /* Only create trace option files after update_tracer_options finish */
++ if (!tracer_options_updated)
++ return;
++
+ create_trace_option_files(tr, t);
+ }
+
+@@ -9178,6 +9184,7 @@ static void __update_tracer_options(struct trace_array *tr)
+ static void update_tracer_options(struct trace_array *tr)
+ {
+ mutex_lock(&trace_types_lock);
++ tracer_options_updated = true;
+ __update_tracer_options(tr);
+ mutex_unlock(&trace_types_lock);
+ }
+diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
+index f755bde42fd07..b69e207012c99 100644
+--- a/kernel/trace/trace_syscalls.c
++++ b/kernel/trace/trace_syscalls.c
+@@ -154,7 +154,7 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
+ goto end;
+
+ /* parameter types */
+- if (tr->trace_flags & TRACE_ITER_VERBOSE)
++ if (tr && tr->trace_flags & TRACE_ITER_VERBOSE)
+ trace_seq_printf(s, "%s ", entry->types[i]);
+
+ /* parameter values */
+@@ -296,9 +296,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
+ struct trace_event_file *trace_file;
+ struct syscall_trace_enter *entry;
+ struct syscall_metadata *sys_data;
+- struct ring_buffer_event *event;
+- struct trace_buffer *buffer;
+- unsigned int trace_ctx;
++ struct trace_event_buffer fbuffer;
+ unsigned long args[6];
+ int syscall_nr;
+ int size;
+@@ -321,20 +319,16 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
+
+ size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
+
+- trace_ctx = tracing_gen_ctx();
+-
+- event = trace_event_buffer_lock_reserve(&buffer, trace_file,
+- sys_data->enter_event->event.type, size, trace_ctx);
+- if (!event)
++ entry = trace_event_buffer_reserve(&fbuffer, trace_file, size);
++ if (!entry)
+ return;
+
+- entry = ring_buffer_event_data(event);
++ entry = ring_buffer_event_data(fbuffer.event);
+ entry->nr = syscall_nr;
+ syscall_get_arguments(current, regs, args);
+ memcpy(entry->args, args, sizeof(unsigned long) * sys_data->nb_args);
+
+- event_trigger_unlock_commit(trace_file, buffer, event, entry,
+- trace_ctx);
++ trace_event_buffer_commit(&fbuffer);
+ }
+
+ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
+@@ -343,9 +337,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
+ struct trace_event_file *trace_file;
+ struct syscall_trace_exit *entry;
+ struct syscall_metadata *sys_data;
+- struct ring_buffer_event *event;
+- struct trace_buffer *buffer;
+- unsigned int trace_ctx;
++ struct trace_event_buffer fbuffer;
+ int syscall_nr;
+
+ syscall_nr = trace_get_syscall_nr(current, regs);
+@@ -364,20 +356,15 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
+ if (!sys_data)
+ return;
+
+- trace_ctx = tracing_gen_ctx();
+-
+- event = trace_event_buffer_lock_reserve(&buffer, trace_file,
+- sys_data->exit_event->event.type, sizeof(*entry),
+- trace_ctx);
+- if (!event)
++ entry = trace_event_buffer_reserve(&fbuffer, trace_file, sizeof(*entry));
++ if (!entry)
+ return;
+
+- entry = ring_buffer_event_data(event);
++ entry = ring_buffer_event_data(fbuffer.event);
+ entry->nr = syscall_nr;
+ entry->ret = syscall_get_return_value(current, regs);
+
+- event_trigger_unlock_commit(trace_file, buffer, event, entry,
+- trace_ctx);
++ trace_event_buffer_commit(&fbuffer);
+ }
+
+ static int reg_event_syscall_enter(struct trace_event_file *file,
+diff --git a/lib/Makefile b/lib/Makefile
+index 6b9ffc1bd1eed..08053df16c7c8 100644
+--- a/lib/Makefile
++++ b/lib/Makefile
+@@ -279,7 +279,7 @@ $(foreach file, $(libfdt_files), \
+ $(eval CFLAGS_$(file) = -I $(srctree)/scripts/dtc/libfdt))
+ lib-$(CONFIG_LIBFDT) += $(libfdt_files)
+
+-lib-$(CONFIG_BOOT_CONFIG) += bootconfig.o
++obj-$(CONFIG_BOOT_CONFIG) += bootconfig.o
+
+ obj-$(CONFIG_RBTREE_TEST) += rbtree_test.o
+ obj-$(CONFIG_INTERVAL_TREE_TEST) += interval_tree_test.o
+diff --git a/lib/iov_iter.c b/lib/iov_iter.c
+index 6dd5330f7a995..0b64695ab632f 100644
+--- a/lib/iov_iter.c
++++ b/lib/iov_iter.c
+@@ -1434,7 +1434,7 @@ static ssize_t iter_xarray_get_pages(struct iov_iter *i,
+ {
+ unsigned nr, offset;
+ pgoff_t index, count;
+- size_t size = maxsize, actual;
++ size_t size = maxsize;
+ loff_t pos;
+
+ if (!size || !maxpages)
+@@ -1461,13 +1461,7 @@ static ssize_t iter_xarray_get_pages(struct iov_iter *i,
+ if (nr == 0)
+ return 0;
+
+- actual = PAGE_SIZE * nr;
+- actual -= offset;
+- if (nr == count && size > 0) {
+- unsigned last_offset = (nr > 1) ? 0 : offset;
+- actual -= PAGE_SIZE - (last_offset + size);
+- }
+- return actual;
++ return min_t(size_t, nr * PAGE_SIZE - offset, maxsize);
+ }
+
+ /* must be done on non-empty ITER_IOVEC one */
+@@ -1602,7 +1596,7 @@ static ssize_t iter_xarray_get_pages_alloc(struct iov_iter *i,
+ struct page **p;
+ unsigned nr, offset;
+ pgoff_t index, count;
+- size_t size = maxsize, actual;
++ size_t size = maxsize;
+ loff_t pos;
+
+ if (!size)
+@@ -1631,13 +1625,7 @@ static ssize_t iter_xarray_get_pages_alloc(struct iov_iter *i,
+ if (nr == 0)
+ return 0;
+
+- actual = PAGE_SIZE * nr;
+- actual -= offset;
+- if (nr == count && size > 0) {
+- unsigned last_offset = (nr > 1) ? 0 : offset;
+- actual -= PAGE_SIZE - (last_offset + size);
+- }
+- return actual;
++ return min_t(size_t, nr * PAGE_SIZE - offset, maxsize);
+ }
+
+ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
+diff --git a/lib/nodemask.c b/lib/nodemask.c
+index 3aa454c54c0de..e22647f5181b3 100644
+--- a/lib/nodemask.c
++++ b/lib/nodemask.c
+@@ -3,9 +3,9 @@
+ #include <linux/module.h>
+ #include <linux/random.h>
+
+-int __next_node_in(int node, const nodemask_t *srcp)
++unsigned int __next_node_in(int node, const nodemask_t *srcp)
+ {
+- int ret = __next_node(node, srcp);
++ unsigned int ret = __next_node(node, srcp);
+
+ if (ret == MAX_NUMNODES)
+ ret = __first_node(srcp);
+diff --git a/lib/xarray.c b/lib/xarray.c
+index 54e646e8e6ee7..ea9ce1f0b3863 100644
+--- a/lib/xarray.c
++++ b/lib/xarray.c
+@@ -264,9 +264,10 @@ static void xa_node_free(struct xa_node *node)
+ * xas_destroy() - Free any resources allocated during the XArray operation.
+ * @xas: XArray operation state.
+ *
+- * This function is now internal-only.
++ * Most users will not need to call this function; it is called for you
++ * by xas_nomem().
+ */
+-static void xas_destroy(struct xa_state *xas)
++void xas_destroy(struct xa_state *xas)
+ {
+ struct xa_node *next, *node = xas->xa_alloc;
+
+diff --git a/mm/filemap.c b/mm/filemap.c
+index 9a1eef6c5d350..61dd39990fda2 100644
+--- a/mm/filemap.c
++++ b/mm/filemap.c
+@@ -2991,11 +2991,12 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
+ struct address_space *mapping = file->f_mapping;
+ DEFINE_READAHEAD(ractl, file, ra, mapping, vmf->pgoff);
+ struct file *fpin = NULL;
++ unsigned long vm_flags = vmf->vma->vm_flags;
+ unsigned int mmap_miss;
+
+ #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ /* Use the readahead code, even if readahead is disabled */
+- if (vmf->vma->vm_flags & VM_HUGEPAGE) {
++ if (vm_flags & VM_HUGEPAGE) {
+ fpin = maybe_unlock_mmap_for_io(vmf, fpin);
+ ractl._index &= ~((unsigned long)HPAGE_PMD_NR - 1);
+ ra->size = HPAGE_PMD_NR;
+@@ -3003,7 +3004,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
+ * Fetch two PMD folios, so we get the chance to actually
+ * readahead, unless we've been told not to.
+ */
+- if (!(vmf->vma->vm_flags & VM_RAND_READ))
++ if (!(vm_flags & VM_RAND_READ))
+ ra->size *= 2;
+ ra->async_size = HPAGE_PMD_NR;
+ page_cache_ra_order(&ractl, ra, HPAGE_PMD_ORDER);
+@@ -3012,12 +3013,12 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
+ #endif
+
+ /* If we don't want any read-ahead, don't bother */
+- if (vmf->vma->vm_flags & VM_RAND_READ)
++ if (vm_flags & VM_RAND_READ)
+ return fpin;
+ if (!ra->ra_pages)
+ return fpin;
+
+- if (vmf->vma->vm_flags & VM_SEQ_READ) {
++ if (vm_flags & VM_SEQ_READ) {
+ fpin = maybe_unlock_mmap_for_io(vmf, fpin);
+ page_cache_sync_ra(&ractl, ra->ra_pages);
+ return fpin;
+diff --git a/mm/huge_memory.c b/mm/huge_memory.c
+index 910a138e9859e..b7d0697b1f26e 100644
+--- a/mm/huge_memory.c
++++ b/mm/huge_memory.c
+@@ -2622,8 +2622,7 @@ out_unlock:
+ if (mapping)
+ i_mmap_unlock_read(mapping);
+ out:
+- /* Free any memory we didn't use */
+- xas_nomem(&xas, 0);
++ xas_destroy(&xas);
+ count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
+ return ret;
+ }
+diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
+index 363d47f945324..289f355e18531 100644
+--- a/net/ax25/af_ax25.c
++++ b/net/ax25/af_ax25.c
+@@ -62,12 +62,12 @@ static void ax25_free_sock(struct sock *sk)
+ */
+ static void ax25_cb_del(ax25_cb *ax25)
+ {
++ spin_lock_bh(&ax25_list_lock);
+ if (!hlist_unhashed(&ax25->ax25_node)) {
+- spin_lock_bh(&ax25_list_lock);
+ hlist_del_init(&ax25->ax25_node);
+- spin_unlock_bh(&ax25_list_lock);
+ ax25_cb_put(ax25);
+ }
++ spin_unlock_bh(&ax25_list_lock);
+ }
+
+ /*
+@@ -81,6 +81,7 @@ static void ax25_kill_by_device(struct net_device *dev)
+
+ if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL)
+ return;
++ ax25_dev->device_up = false;
+
+ spin_lock_bh(&ax25_list_lock);
+ again:
+@@ -91,6 +92,7 @@ again:
+ spin_unlock_bh(&ax25_list_lock);
+ ax25_disconnect(s, ENETUNREACH);
+ s->ax25_dev = NULL;
++ ax25_cb_del(s);
+ spin_lock_bh(&ax25_list_lock);
+ goto again;
+ }
+@@ -103,6 +105,7 @@ again:
+ dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker);
+ ax25_dev_put(ax25_dev);
+ }
++ ax25_cb_del(s);
+ release_sock(sk);
+ spin_lock_bh(&ax25_list_lock);
+ sock_put(sk);
+@@ -995,9 +998,11 @@ static int ax25_release(struct socket *sock)
+ if (sk->sk_type == SOCK_SEQPACKET) {
+ switch (ax25->state) {
+ case AX25_STATE_0:
+- release_sock(sk);
+- ax25_disconnect(ax25, 0);
+- lock_sock(sk);
++ if (!sock_flag(ax25->sk, SOCK_DEAD)) {
++ release_sock(sk);
++ ax25_disconnect(ax25, 0);
++ lock_sock(sk);
++ }
+ ax25_destroy_socket(ax25);
+ break;
+
+@@ -1053,11 +1058,13 @@ static int ax25_release(struct socket *sock)
+ ax25_destroy_socket(ax25);
+ }
+ if (ax25_dev) {
+- del_timer_sync(&ax25->timer);
+- del_timer_sync(&ax25->t1timer);
+- del_timer_sync(&ax25->t2timer);
+- del_timer_sync(&ax25->t3timer);
+- del_timer_sync(&ax25->idletimer);
++ if (!ax25_dev->device_up) {
++ del_timer_sync(&ax25->timer);
++ del_timer_sync(&ax25->t1timer);
++ del_timer_sync(&ax25->t2timer);
++ del_timer_sync(&ax25->t3timer);
++ del_timer_sync(&ax25->idletimer);
++ }
+ dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker);
+ ax25_dev_put(ax25_dev);
+ }
+diff --git a/net/ax25/ax25_dev.c b/net/ax25/ax25_dev.c
+index d2a244e1c260f..5451be15e072b 100644
+--- a/net/ax25/ax25_dev.c
++++ b/net/ax25/ax25_dev.c
+@@ -62,6 +62,7 @@ void ax25_dev_device_up(struct net_device *dev)
+ ax25_dev->dev = dev;
+ dev_hold_track(dev, &ax25_dev->dev_tracker, GFP_ATOMIC);
+ ax25_dev->forward = NULL;
++ ax25_dev->device_up = true;
+
+ ax25_dev->values[AX25_VALUES_IPDEFMODE] = AX25_DEF_IPDEFMODE;
+ ax25_dev->values[AX25_VALUES_AXDEFMODE] = AX25_DEF_AXDEFMODE;
+diff --git a/net/ax25/ax25_subr.c b/net/ax25/ax25_subr.c
+index 3a476e4f6cd0b..9ff98f46dc6be 100644
+--- a/net/ax25/ax25_subr.c
++++ b/net/ax25/ax25_subr.c
+@@ -268,7 +268,7 @@ void ax25_disconnect(ax25_cb *ax25, int reason)
+ del_timer_sync(&ax25->t3timer);
+ del_timer_sync(&ax25->idletimer);
+ } else {
+- if (!ax25->sk || !sock_flag(ax25->sk, SOCK_DESTROY))
++ if (ax25->sk && !sock_flag(ax25->sk, SOCK_DESTROY))
+ ax25_stop_heartbeat(ax25);
+ ax25_stop_t1timer(ax25);
+ ax25_stop_t2timer(ax25);
+diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
+index 45c2dd2e15905..19df3905c5f8e 100644
+--- a/net/bluetooth/hci_core.c
++++ b/net/bluetooth/hci_core.c
+@@ -2153,7 +2153,7 @@ int hci_bdaddr_list_add_with_flags(struct list_head *list, bdaddr_t *bdaddr,
+
+ bacpy(&entry->bdaddr, bdaddr);
+ entry->bdaddr_type = type;
+- bitmap_from_u64(entry->flags, flags);
++ entry->flags = flags;
+
+ list_add(&entry->list, list);
+
+@@ -2634,7 +2634,7 @@ int hci_register_dev(struct hci_dev *hdev)
+ * callback.
+ */
+ if (hdev->wakeup)
+- set_bit(HCI_CONN_FLAG_REMOTE_WAKEUP, hdev->conn_flags);
++ hdev->conn_flags |= HCI_CONN_FLAG_REMOTE_WAKEUP;
+
+ hci_sock_dev_event(hdev, HCI_DEV_REG);
+ hci_dev_hold(hdev);
+diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
+index f4afe482e3004..95689982eedbc 100644
+--- a/net/bluetooth/hci_request.c
++++ b/net/bluetooth/hci_request.c
+@@ -482,7 +482,7 @@ static int add_to_accept_list(struct hci_request *req,
+
+ /* During suspend, only wakeable devices can be in accept list */
+ if (hdev->suspended &&
+- !test_bit(HCI_CONN_FLAG_REMOTE_WAKEUP, params->flags))
++ !(params->flags & HCI_CONN_FLAG_REMOTE_WAKEUP))
+ return 0;
+
+ *num_entries += 1;
+diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
+index 13600bf120b02..351c2390164d0 100644
+--- a/net/bluetooth/hci_sync.c
++++ b/net/bluetooth/hci_sync.c
+@@ -1637,7 +1637,7 @@ static int hci_le_set_privacy_mode_sync(struct hci_dev *hdev,
+ * indicates that LL Privacy has been enabled and
+ * HCI_OP_LE_SET_PRIVACY_MODE is supported.
+ */
+- if (!test_bit(HCI_CONN_FLAG_DEVICE_PRIVACY, params->flags))
++ if (!(params->flags & HCI_CONN_FLAG_DEVICE_PRIVACY))
+ return 0;
+
+ irk = hci_find_irk_by_addr(hdev, ¶ms->addr, params->addr_type);
+@@ -1664,20 +1664,19 @@ static int hci_le_add_accept_list_sync(struct hci_dev *hdev,
+ struct hci_cp_le_add_to_accept_list cp;
+ int err;
+
++ /* During suspend, only wakeable devices can be in acceptlist */
++ if (hdev->suspended &&
++ !(params->flags & HCI_CONN_FLAG_REMOTE_WAKEUP))
++ return 0;
++
+ /* Select filter policy to accept all advertising */
+ if (*num_entries >= hdev->le_accept_list_size)
+ return -ENOSPC;
+
+ /* Accept list can not be used with RPAs */
+ if (!use_ll_privacy(hdev) &&
+- hci_find_irk_by_addr(hdev, ¶ms->addr, params->addr_type)) {
++ hci_find_irk_by_addr(hdev, ¶ms->addr, params->addr_type))
+ return -EINVAL;
+- }
+-
+- /* During suspend, only wakeable devices can be in acceptlist */
+- if (hdev->suspended &&
+- !test_bit(HCI_CONN_FLAG_REMOTE_WAKEUP, params->flags))
+- return 0;
+
+ /* Attempt to program the device in the resolving list first to avoid
+ * having to rollback in case it fails since the resolving list is
+@@ -4857,7 +4856,7 @@ static int hci_update_event_filter_sync(struct hci_dev *hdev)
+ hci_clear_event_filter_sync(hdev);
+
+ list_for_each_entry(b, &hdev->accept_list, list) {
+- if (!test_bit(HCI_CONN_FLAG_REMOTE_WAKEUP, b->flags))
++ if (!(b->flags & HCI_CONN_FLAG_REMOTE_WAKEUP))
+ continue;
+
+ bt_dev_dbg(hdev, "Adding event filters for %pMR", &b->bdaddr);
+@@ -4881,10 +4880,28 @@ static int hci_update_event_filter_sync(struct hci_dev *hdev)
+ return 0;
+ }
+
++/* This function disables scan (BR and LE) and mark it as paused */
++static int hci_pause_scan_sync(struct hci_dev *hdev)
++{
++ if (hdev->scanning_paused)
++ return 0;
++
++ /* Disable page scan if enabled */
++ if (test_bit(HCI_PSCAN, &hdev->flags))
++ hci_write_scan_enable_sync(hdev, SCAN_DISABLED);
++
++ hci_scan_disable_sync(hdev);
++
++ hdev->scanning_paused = true;
++
++ return 0;
++}
++
+ /* This function performs the HCI suspend procedures in the follow order:
+ *
+ * Pause discovery (active scanning/inquiry)
+ * Pause Directed Advertising/Advertising
++ * Pause Scanning (passive scanning in case discovery was not active)
+ * Disconnect all connections
+ * Set suspend_status to BT_SUSPEND_DISCONNECT if hdev cannot wakeup
+ * otherwise:
+@@ -4910,15 +4927,11 @@ int hci_suspend_sync(struct hci_dev *hdev)
+ /* Pause other advertisements */
+ hci_pause_advertising_sync(hdev);
+
+- /* Disable page scan if enabled */
+- if (test_bit(HCI_PSCAN, &hdev->flags))
+- hci_write_scan_enable_sync(hdev, SCAN_DISABLED);
+-
+ /* Suspend monitor filters */
+ hci_suspend_monitor_sync(hdev);
+
+ /* Prevent disconnects from causing scanning to be re-enabled */
+- hdev->scanning_paused = true;
++ hci_pause_scan_sync(hdev);
+
+ /* Soft disconnect everything (power off) */
+ err = hci_disconnect_all_sync(hdev, HCI_ERROR_REMOTE_POWER_OFF);
+@@ -4989,6 +5002,22 @@ static void hci_resume_monitor_sync(struct hci_dev *hdev)
+ }
+ }
+
++/* This function resume scan and reset paused flag */
++static int hci_resume_scan_sync(struct hci_dev *hdev)
++{
++ if (!hdev->scanning_paused)
++ return 0;
++
++ hci_update_scan_sync(hdev);
++
++ /* Reset passive scanning to normal */
++ hci_update_passive_scan_sync(hdev);
++
++ hdev->scanning_paused = false;
++
++ return 0;
++}
++
+ /* This function performs the HCI suspend procedures in the follow order:
+ *
+ * Restore event mask
+@@ -5011,10 +5040,9 @@ int hci_resume_sync(struct hci_dev *hdev)
+
+ /* Clear any event filters and restore scan state */
+ hci_clear_event_filter_sync(hdev);
+- hci_update_scan_sync(hdev);
+
+- /* Reset passive scanning to normal */
+- hci_update_passive_scan_sync(hdev);
++ /* Resume scanning */
++ hci_resume_scan_sync(hdev);
+
+ /* Resume monitor filters */
+ hci_resume_monitor_sync(hdev);
+diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
+index d2d390534e544..ae758ab1b558d 100644
+--- a/net/bluetooth/mgmt.c
++++ b/net/bluetooth/mgmt.c
+@@ -4013,10 +4013,11 @@ static int exp_ll_privacy_feature_changed(bool enabled, struct hci_dev *hdev,
+ memcpy(ev.uuid, rpa_resolution_uuid, 16);
+ ev.flags = cpu_to_le32((enabled ? BIT(0) : 0) | BIT(1));
+
++ // Do we need to be atomic with the conn_flags?
+ if (enabled && privacy_mode_capable(hdev))
+- set_bit(HCI_CONN_FLAG_DEVICE_PRIVACY, hdev->conn_flags);
++ hdev->conn_flags |= HCI_CONN_FLAG_DEVICE_PRIVACY;
+ else
+- clear_bit(HCI_CONN_FLAG_DEVICE_PRIVACY, hdev->conn_flags);
++ hdev->conn_flags &= ~HCI_CONN_FLAG_DEVICE_PRIVACY;
+
+ return mgmt_limited_event(MGMT_EV_EXP_FEATURE_CHANGED, hdev,
+ &ev, sizeof(ev),
+@@ -4435,8 +4436,7 @@ static int get_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+
+ hci_dev_lock(hdev);
+
+- bitmap_to_arr32(&supported_flags, hdev->conn_flags,
+- __HCI_CONN_NUM_FLAGS);
++ supported_flags = hdev->conn_flags;
+
+ memset(&rp, 0, sizeof(rp));
+
+@@ -4447,8 +4447,7 @@ static int get_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+ if (!br_params)
+ goto done;
+
+- bitmap_to_arr32(¤t_flags, br_params->flags,
+- __HCI_CONN_NUM_FLAGS);
++ current_flags = br_params->flags;
+ } else {
+ params = hci_conn_params_lookup(hdev, &cp->addr.bdaddr,
+ le_addr_type(cp->addr.type));
+@@ -4456,8 +4455,7 @@ static int get_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+ if (!params)
+ goto done;
+
+- bitmap_to_arr32(¤t_flags, params->flags,
+- __HCI_CONN_NUM_FLAGS);
++ current_flags = params->flags;
+ }
+
+ bacpy(&rp.addr.bdaddr, &cp->addr.bdaddr);
+@@ -4502,8 +4500,8 @@ static int set_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+ &cp->addr.bdaddr, cp->addr.type,
+ __le32_to_cpu(current_flags));
+
+- bitmap_to_arr32(&supported_flags, hdev->conn_flags,
+- __HCI_CONN_NUM_FLAGS);
++ // We should take hci_dev_lock() early, I think.. conn_flags can change
++ supported_flags = hdev->conn_flags;
+
+ if ((supported_flags | current_flags) != supported_flags) {
+ bt_dev_warn(hdev, "Bad flag given (0x%x) vs supported (0x%0x)",
+@@ -4519,7 +4517,7 @@ static int set_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+ cp->addr.type);
+
+ if (br_params) {
+- bitmap_from_u64(br_params->flags, current_flags);
++ br_params->flags = current_flags;
+ status = MGMT_STATUS_SUCCESS;
+ } else {
+ bt_dev_warn(hdev, "No such BR/EDR device %pMR (0x%x)",
+@@ -4529,14 +4527,26 @@ static int set_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+ params = hci_conn_params_lookup(hdev, &cp->addr.bdaddr,
+ le_addr_type(cp->addr.type));
+ if (params) {
+- bitmap_from_u64(params->flags, current_flags);
++ /* Devices using RPAs can only be programmed in the
++ * acceptlist LL Privacy has been enable otherwise they
++ * cannot mark HCI_CONN_FLAG_REMOTE_WAKEUP.
++ */
++ if ((current_flags & HCI_CONN_FLAG_REMOTE_WAKEUP) &&
++ !use_ll_privacy(hdev) &&
++ hci_find_irk_by_addr(hdev, ¶ms->addr,
++ params->addr_type)) {
++ bt_dev_warn(hdev,
++ "Cannot set wakeable for RPA");
++ goto unlock;
++ }
++
++ params->flags = current_flags;
+ status = MGMT_STATUS_SUCCESS;
+
+ /* Update passive scan if HCI_CONN_FLAG_DEVICE_PRIVACY
+ * has been set.
+ */
+- if (test_bit(HCI_CONN_FLAG_DEVICE_PRIVACY,
+- params->flags))
++ if (params->flags & HCI_CONN_FLAG_DEVICE_PRIVACY)
+ hci_update_passive_scan(hdev);
+ } else {
+ bt_dev_warn(hdev, "No such LE device %pMR (0x%x)",
+@@ -4545,6 +4555,7 @@ static int set_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+ }
+ }
+
++unlock:
+ hci_dev_unlock(hdev);
+
+ done:
+@@ -7136,8 +7147,7 @@ static int add_device(struct sock *sk, struct hci_dev *hdev,
+ params = hci_conn_params_lookup(hdev, &cp->addr.bdaddr,
+ addr_type);
+ if (params)
+- bitmap_to_arr32(¤t_flags, params->flags,
+- __HCI_CONN_NUM_FLAGS);
++ current_flags = params->flags;
+ }
+
+ err = hci_cmd_sync_queue(hdev, add_device_sync, NULL, NULL);
+@@ -7146,8 +7156,7 @@ static int add_device(struct sock *sk, struct hci_dev *hdev,
+
+ added:
+ device_added(sk, hdev, &cp->addr.bdaddr, cp->addr.type, cp->action);
+- bitmap_to_arr32(&supported_flags, hdev->conn_flags,
+- __HCI_CONN_NUM_FLAGS);
++ supported_flags = hdev->conn_flags;
+ device_flags_changed(NULL, hdev, &cp->addr.bdaddr, cp->addr.type,
+ supported_flags, current_flags);
+
+diff --git a/net/core/filter.c b/net/core/filter.c
+index 966796b345e78..8847316ee20e0 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -5173,7 +5173,7 @@ static int _bpf_setsockopt(struct sock *sk, int level, int optname,
+ if (val <= 0 || tp->data_segs_out > tp->syn_data)
+ ret = -EINVAL;
+ else
+- tp->snd_cwnd = val;
++ tcp_snd_cwnd_set(tp, val);
+ break;
+ case TCP_BPF_SNDCWND_CLAMP:
+ if (val <= 0) {
+diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
+index 73f68d4625f32..929f6379a2798 100644
+--- a/net/core/flow_offload.c
++++ b/net/core/flow_offload.c
+@@ -595,3 +595,9 @@ int flow_indr_dev_setup_offload(struct net_device *dev, struct Qdisc *sch,
+ return (bo && list_empty(&bo->cb_list)) ? -EOPNOTSUPP : count;
+ }
+ EXPORT_SYMBOL(flow_indr_dev_setup_offload);
++
++bool flow_indr_dev_exists(void)
++{
++ return !list_empty(&flow_block_indr_dev_list);
++}
++EXPORT_SYMBOL(flow_indr_dev_exists);
+diff --git a/net/core/neighbour.c b/net/core/neighbour.c
+index f64ebd050f6c4..fd69133dc7c5b 100644
+--- a/net/core/neighbour.c
++++ b/net/core/neighbour.c
+@@ -1579,7 +1579,7 @@ static void neigh_managed_work(struct work_struct *work)
+ list_for_each_entry(neigh, &tbl->managed_list, managed_list)
+ neigh_event_send_probe(neigh, NULL, false);
+ queue_delayed_work(system_power_efficient_wq, &tbl->managed_work,
+- NEIGH_VAR(&tbl->parms, DELAY_PROBE_TIME));
++ max(NEIGH_VAR(&tbl->parms, DELAY_PROBE_TIME), HZ));
+ write_unlock_bh(&tbl->lock);
+ }
+
+diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
+index a5d57fa679caa..55654e335d43d 100644
+--- a/net/ipv4/inet_hashtables.c
++++ b/net/ipv4/inet_hashtables.c
+@@ -917,10 +917,12 @@ void __init inet_hashinfo2_init(struct inet_hashinfo *h, const char *name,
+ init_hashinfo_lhash2(h);
+
+ /* this one is used for source ports of outgoing connections */
+- table_perturb = kmalloc_array(INET_TABLE_PERTURB_SIZE,
+- sizeof(*table_perturb), GFP_KERNEL);
+- if (!table_perturb)
+- panic("TCP: failed to alloc table_perturb");
++ table_perturb = alloc_large_system_hash("Table-perturb",
++ sizeof(*table_perturb),
++ INET_TABLE_PERTURB_SIZE,
++ 0, 0, NULL, NULL,
++ INET_TABLE_PERTURB_SIZE,
++ INET_TABLE_PERTURB_SIZE);
+ }
+
+ int inet_hashinfo2_init_mod(struct inet_hashinfo *h)
+diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
+index aacee9dd771b4..bc8dfdf1c48ad 100644
+--- a/net/ipv4/ip_gre.c
++++ b/net/ipv4/ip_gre.c
+@@ -629,21 +629,20 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb,
+ }
+
+ if (dev->header_ops) {
+- const int pull_len = tunnel->hlen + sizeof(struct iphdr);
+-
+ if (skb_cow_head(skb, 0))
+ goto free_skb;
+
+ tnl_params = (const struct iphdr *)skb->data;
+
+- if (pull_len > skb_transport_offset(skb))
+- goto free_skb;
+-
+ /* Pull skb since ip_tunnel_xmit() needs skb->data pointing
+ * to gre header.
+ */
+- skb_pull(skb, pull_len);
++ skb_pull(skb, tunnel->hlen + sizeof(struct iphdr));
+ skb_reset_mac_header(skb);
++
++ if (skb->ip_summed == CHECKSUM_PARTIAL &&
++ skb_checksum_start(skb) < skb->data)
++ goto free_skb;
+ } else {
+ if (skb_cow_head(skb, dev->needed_headroom))
+ goto free_skb;
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index cf18fbcbf123a..e31cf137c6140 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -429,7 +429,7 @@ void tcp_init_sock(struct sock *sk)
+ * algorithms that we must have the following bandaid to talk
+ * efficiently to them. -DaveM
+ */
+- tp->snd_cwnd = TCP_INIT_CWND;
++ tcp_snd_cwnd_set(tp, TCP_INIT_CWND);
+
+ /* There's a bubble in the pipe until at least the first ACK. */
+ tp->app_limited = ~0U;
+@@ -3033,7 +3033,7 @@ int tcp_disconnect(struct sock *sk, int flags)
+ icsk->icsk_rto_min = TCP_RTO_MIN;
+ icsk->icsk_delack_max = TCP_DELACK_MAX;
+ tp->snd_ssthresh = TCP_INFINITE_SSTHRESH;
+- tp->snd_cwnd = TCP_INIT_CWND;
++ tcp_snd_cwnd_set(tp, TCP_INIT_CWND);
+ tp->snd_cwnd_cnt = 0;
+ tp->window_clamp = 0;
+ tp->delivered = 0;
+@@ -3744,7 +3744,7 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
+ info->tcpi_max_pacing_rate = rate64;
+
+ info->tcpi_reordering = tp->reordering;
+- info->tcpi_snd_cwnd = tp->snd_cwnd;
++ info->tcpi_snd_cwnd = tcp_snd_cwnd(tp);
+
+ if (info->tcpi_state == TCP_LISTEN) {
+ /* listeners aliased fields :
+@@ -3915,7 +3915,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk,
+ rate64 = tcp_compute_delivery_rate(tp);
+ nla_put_u64_64bit(stats, TCP_NLA_DELIVERY_RATE, rate64, TCP_NLA_PAD);
+
+- nla_put_u32(stats, TCP_NLA_SND_CWND, tp->snd_cwnd);
++ nla_put_u32(stats, TCP_NLA_SND_CWND, tcp_snd_cwnd(tp));
+ nla_put_u32(stats, TCP_NLA_REORDERING, tp->reordering);
+ nla_put_u32(stats, TCP_NLA_MIN_RTT, tcp_min_rtt(tp));
+
+diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
+index 02e8626ccb278..c7d30a3bbd81d 100644
+--- a/net/ipv4/tcp_bbr.c
++++ b/net/ipv4/tcp_bbr.c
+@@ -276,7 +276,7 @@ static void bbr_init_pacing_rate_from_rtt(struct sock *sk)
+ } else { /* no RTT sample yet */
+ rtt_us = USEC_PER_MSEC; /* use nominal default RTT */
+ }
+- bw = (u64)tp->snd_cwnd * BW_UNIT;
++ bw = (u64)tcp_snd_cwnd(tp) * BW_UNIT;
+ do_div(bw, rtt_us);
+ sk->sk_pacing_rate = bbr_bw_to_pacing_rate(sk, bw, bbr_high_gain);
+ }
+@@ -323,9 +323,9 @@ static void bbr_save_cwnd(struct sock *sk)
+ struct bbr *bbr = inet_csk_ca(sk);
+
+ if (bbr->prev_ca_state < TCP_CA_Recovery && bbr->mode != BBR_PROBE_RTT)
+- bbr->prior_cwnd = tp->snd_cwnd; /* this cwnd is good enough */
++ bbr->prior_cwnd = tcp_snd_cwnd(tp); /* this cwnd is good enough */
+ else /* loss recovery or BBR_PROBE_RTT have temporarily cut cwnd */
+- bbr->prior_cwnd = max(bbr->prior_cwnd, tp->snd_cwnd);
++ bbr->prior_cwnd = max(bbr->prior_cwnd, tcp_snd_cwnd(tp));
+ }
+
+ static void bbr_cwnd_event(struct sock *sk, enum tcp_ca_event event)
+@@ -482,7 +482,7 @@ static bool bbr_set_cwnd_to_recover_or_restore(
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+ u8 prev_state = bbr->prev_ca_state, state = inet_csk(sk)->icsk_ca_state;
+- u32 cwnd = tp->snd_cwnd;
++ u32 cwnd = tcp_snd_cwnd(tp);
+
+ /* An ACK for P pkts should release at most 2*P packets. We do this
+ * in two steps. First, here we deduct the number of lost packets.
+@@ -520,7 +520,7 @@ static void bbr_set_cwnd(struct sock *sk, const struct rate_sample *rs,
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct bbr *bbr = inet_csk_ca(sk);
+- u32 cwnd = tp->snd_cwnd, target_cwnd = 0;
++ u32 cwnd = tcp_snd_cwnd(tp), target_cwnd = 0;
+
+ if (!acked)
+ goto done; /* no packet fully ACKed; just apply caps */
+@@ -544,9 +544,9 @@ static void bbr_set_cwnd(struct sock *sk, const struct rate_sample *rs,
+ cwnd = max(cwnd, bbr_cwnd_min_target);
+
+ done:
+- tp->snd_cwnd = min(cwnd, tp->snd_cwnd_clamp); /* apply global cap */
++ tcp_snd_cwnd_set(tp, min(cwnd, tp->snd_cwnd_clamp)); /* apply global cap */
+ if (bbr->mode == BBR_PROBE_RTT) /* drain queue, refresh min_rtt */
+- tp->snd_cwnd = min(tp->snd_cwnd, bbr_cwnd_min_target);
++ tcp_snd_cwnd_set(tp, min(tcp_snd_cwnd(tp), bbr_cwnd_min_target));
+ }
+
+ /* End cycle phase if it's time and/or we hit the phase's in-flight target. */
+@@ -856,7 +856,7 @@ static void bbr_update_ack_aggregation(struct sock *sk,
+ bbr->ack_epoch_acked = min_t(u32, 0xFFFFF,
+ bbr->ack_epoch_acked + rs->acked_sacked);
+ extra_acked = bbr->ack_epoch_acked - expected_acked;
+- extra_acked = min(extra_acked, tp->snd_cwnd);
++ extra_acked = min(extra_acked, tcp_snd_cwnd(tp));
+ if (extra_acked > bbr->extra_acked[bbr->extra_acked_win_idx])
+ bbr->extra_acked[bbr->extra_acked_win_idx] = extra_acked;
+ }
+@@ -914,7 +914,7 @@ static void bbr_check_probe_rtt_done(struct sock *sk)
+ return;
+
+ bbr->min_rtt_stamp = tcp_jiffies32; /* wait a while until PROBE_RTT */
+- tp->snd_cwnd = max(tp->snd_cwnd, bbr->prior_cwnd);
++ tcp_snd_cwnd_set(tp, max(tcp_snd_cwnd(tp), bbr->prior_cwnd));
+ bbr_reset_mode(sk);
+ }
+
+@@ -1093,7 +1093,7 @@ static u32 bbr_undo_cwnd(struct sock *sk)
+ bbr->full_bw = 0; /* spurious slow-down; reset full pipe detection */
+ bbr->full_bw_cnt = 0;
+ bbr_reset_lt_bw_sampling(sk);
+- return tcp_sk(sk)->snd_cwnd;
++ return tcp_snd_cwnd(tcp_sk(sk));
+ }
+
+ /* Entering loss recovery, so save cwnd for when we exit or undo recovery. */
+diff --git a/net/ipv4/tcp_bic.c b/net/ipv4/tcp_bic.c
+index f5f588b1f6e9d..58358bf92e1b8 100644
+--- a/net/ipv4/tcp_bic.c
++++ b/net/ipv4/tcp_bic.c
+@@ -150,7 +150,7 @@ static void bictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ if (!acked)
+ return;
+ }
+- bictcp_update(ca, tp->snd_cwnd);
++ bictcp_update(ca, tcp_snd_cwnd(tp));
+ tcp_cong_avoid_ai(tp, ca->cnt, acked);
+ }
+
+@@ -166,16 +166,16 @@ static u32 bictcp_recalc_ssthresh(struct sock *sk)
+ ca->epoch_start = 0; /* end of epoch */
+
+ /* Wmax and fast convergence */
+- if (tp->snd_cwnd < ca->last_max_cwnd && fast_convergence)
+- ca->last_max_cwnd = (tp->snd_cwnd * (BICTCP_BETA_SCALE + beta))
++ if (tcp_snd_cwnd(tp) < ca->last_max_cwnd && fast_convergence)
++ ca->last_max_cwnd = (tcp_snd_cwnd(tp) * (BICTCP_BETA_SCALE + beta))
+ / (2 * BICTCP_BETA_SCALE);
+ else
+- ca->last_max_cwnd = tp->snd_cwnd;
++ ca->last_max_cwnd = tcp_snd_cwnd(tp);
+
+- if (tp->snd_cwnd <= low_window)
+- return max(tp->snd_cwnd >> 1U, 2U);
++ if (tcp_snd_cwnd(tp) <= low_window)
++ return max(tcp_snd_cwnd(tp) >> 1U, 2U);
+ else
+- return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
++ return max((tcp_snd_cwnd(tp) * beta) / BICTCP_BETA_SCALE, 2U);
+ }
+
+ static void bictcp_state(struct sock *sk, u8 new_state)
+diff --git a/net/ipv4/tcp_cdg.c b/net/ipv4/tcp_cdg.c
+index 709d238018239..ddc7ba0554bdd 100644
+--- a/net/ipv4/tcp_cdg.c
++++ b/net/ipv4/tcp_cdg.c
+@@ -161,8 +161,8 @@ static void tcp_cdg_hystart_update(struct sock *sk)
+ LINUX_MIB_TCPHYSTARTTRAINDETECT);
+ NET_ADD_STATS(sock_net(sk),
+ LINUX_MIB_TCPHYSTARTTRAINCWND,
+- tp->snd_cwnd);
+- tp->snd_ssthresh = tp->snd_cwnd;
++ tcp_snd_cwnd(tp));
++ tp->snd_ssthresh = tcp_snd_cwnd(tp);
+ return;
+ }
+ }
+@@ -180,8 +180,8 @@ static void tcp_cdg_hystart_update(struct sock *sk)
+ LINUX_MIB_TCPHYSTARTDELAYDETECT);
+ NET_ADD_STATS(sock_net(sk),
+ LINUX_MIB_TCPHYSTARTDELAYCWND,
+- tp->snd_cwnd);
+- tp->snd_ssthresh = tp->snd_cwnd;
++ tcp_snd_cwnd(tp));
++ tp->snd_ssthresh = tcp_snd_cwnd(tp);
+ }
+ }
+ }
+@@ -252,7 +252,7 @@ static bool tcp_cdg_backoff(struct sock *sk, u32 grad)
+ return false;
+ }
+
+- ca->shadow_wnd = max(ca->shadow_wnd, tp->snd_cwnd);
++ ca->shadow_wnd = max(ca->shadow_wnd, tcp_snd_cwnd(tp));
+ ca->state = CDG_BACKOFF;
+ tcp_enter_cwr(sk);
+ return true;
+@@ -285,14 +285,14 @@ static void tcp_cdg_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ }
+
+ if (!tcp_is_cwnd_limited(sk)) {
+- ca->shadow_wnd = min(ca->shadow_wnd, tp->snd_cwnd);
++ ca->shadow_wnd = min(ca->shadow_wnd, tcp_snd_cwnd(tp));
+ return;
+ }
+
+- prior_snd_cwnd = tp->snd_cwnd;
++ prior_snd_cwnd = tcp_snd_cwnd(tp);
+ tcp_reno_cong_avoid(sk, ack, acked);
+
+- incr = tp->snd_cwnd - prior_snd_cwnd;
++ incr = tcp_snd_cwnd(tp) - prior_snd_cwnd;
+ ca->shadow_wnd = max(ca->shadow_wnd, ca->shadow_wnd + incr);
+ }
+
+@@ -331,15 +331,15 @@ static u32 tcp_cdg_ssthresh(struct sock *sk)
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if (ca->state == CDG_BACKOFF)
+- return max(2U, (tp->snd_cwnd * min(1024U, backoff_beta)) >> 10);
++ return max(2U, (tcp_snd_cwnd(tp) * min(1024U, backoff_beta)) >> 10);
+
+ if (ca->state == CDG_NONFULL && use_tolerance)
+- return tp->snd_cwnd;
++ return tcp_snd_cwnd(tp);
+
+- ca->shadow_wnd = min(ca->shadow_wnd >> 1, tp->snd_cwnd);
++ ca->shadow_wnd = min(ca->shadow_wnd >> 1, tcp_snd_cwnd(tp));
+ if (use_shadow)
+- return max3(2U, ca->shadow_wnd, tp->snd_cwnd >> 1);
+- return max(2U, tp->snd_cwnd >> 1);
++ return max3(2U, ca->shadow_wnd, tcp_snd_cwnd(tp) >> 1);
++ return max(2U, tcp_snd_cwnd(tp) >> 1);
+ }
+
+ static void tcp_cdg_cwnd_event(struct sock *sk, const enum tcp_ca_event ev)
+@@ -357,7 +357,7 @@ static void tcp_cdg_cwnd_event(struct sock *sk, const enum tcp_ca_event ev)
+
+ ca->gradients = gradients;
+ ca->rtt_seq = tp->snd_nxt;
+- ca->shadow_wnd = tp->snd_cwnd;
++ ca->shadow_wnd = tcp_snd_cwnd(tp);
+ break;
+ case CA_EVENT_COMPLETE_CWR:
+ ca->state = CDG_UNKNOWN;
+@@ -380,7 +380,7 @@ static void tcp_cdg_init(struct sock *sk)
+ ca->gradients = kcalloc(window, sizeof(ca->gradients[0]),
+ GFP_NOWAIT | __GFP_NOWARN);
+ ca->rtt_seq = tp->snd_nxt;
+- ca->shadow_wnd = tp->snd_cwnd;
++ ca->shadow_wnd = tcp_snd_cwnd(tp);
+ }
+
+ static void tcp_cdg_release(struct sock *sk)
+diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
+index dc95572163df3..d854bcfb99060 100644
+--- a/net/ipv4/tcp_cong.c
++++ b/net/ipv4/tcp_cong.c
+@@ -393,10 +393,10 @@ int tcp_set_congestion_control(struct sock *sk, const char *name, bool load,
+ */
+ u32 tcp_slow_start(struct tcp_sock *tp, u32 acked)
+ {
+- u32 cwnd = min(tp->snd_cwnd + acked, tp->snd_ssthresh);
++ u32 cwnd = min(tcp_snd_cwnd(tp) + acked, tp->snd_ssthresh);
+
+- acked -= cwnd - tp->snd_cwnd;
+- tp->snd_cwnd = min(cwnd, tp->snd_cwnd_clamp);
++ acked -= cwnd - tcp_snd_cwnd(tp);
++ tcp_snd_cwnd_set(tp, min(cwnd, tp->snd_cwnd_clamp));
+
+ return acked;
+ }
+@@ -410,7 +410,7 @@ void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked)
+ /* If credits accumulated at a higher w, apply them gently now. */
+ if (tp->snd_cwnd_cnt >= w) {
+ tp->snd_cwnd_cnt = 0;
+- tp->snd_cwnd++;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ }
+
+ tp->snd_cwnd_cnt += acked;
+@@ -418,9 +418,9 @@ void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked)
+ u32 delta = tp->snd_cwnd_cnt / w;
+
+ tp->snd_cwnd_cnt -= delta * w;
+- tp->snd_cwnd += delta;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + delta);
+ }
+- tp->snd_cwnd = min(tp->snd_cwnd, tp->snd_cwnd_clamp);
++ tcp_snd_cwnd_set(tp, min(tcp_snd_cwnd(tp), tp->snd_cwnd_clamp));
+ }
+ EXPORT_SYMBOL_GPL(tcp_cong_avoid_ai);
+
+@@ -445,7 +445,7 @@ void tcp_reno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ return;
+ }
+ /* In dangerous area, increase slowly. */
+- tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
++ tcp_cong_avoid_ai(tp, tcp_snd_cwnd(tp), acked);
+ }
+ EXPORT_SYMBOL_GPL(tcp_reno_cong_avoid);
+
+@@ -454,7 +454,7 @@ u32 tcp_reno_ssthresh(struct sock *sk)
+ {
+ const struct tcp_sock *tp = tcp_sk(sk);
+
+- return max(tp->snd_cwnd >> 1U, 2U);
++ return max(tcp_snd_cwnd(tp) >> 1U, 2U);
+ }
+ EXPORT_SYMBOL_GPL(tcp_reno_ssthresh);
+
+@@ -462,7 +462,7 @@ u32 tcp_reno_undo_cwnd(struct sock *sk)
+ {
+ const struct tcp_sock *tp = tcp_sk(sk);
+
+- return max(tp->snd_cwnd, tp->prior_cwnd);
++ return max(tcp_snd_cwnd(tp), tp->prior_cwnd);
+ }
+ EXPORT_SYMBOL_GPL(tcp_reno_undo_cwnd);
+
+diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
+index 24d562dd62254..b0918839bee7c 100644
+--- a/net/ipv4/tcp_cubic.c
++++ b/net/ipv4/tcp_cubic.c
+@@ -334,7 +334,7 @@ static void cubictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ if (!acked)
+ return;
+ }
+- bictcp_update(ca, tp->snd_cwnd, acked);
++ bictcp_update(ca, tcp_snd_cwnd(tp), acked);
+ tcp_cong_avoid_ai(tp, ca->cnt, acked);
+ }
+
+@@ -346,13 +346,13 @@ static u32 cubictcp_recalc_ssthresh(struct sock *sk)
+ ca->epoch_start = 0; /* end of epoch */
+
+ /* Wmax and fast convergence */
+- if (tp->snd_cwnd < ca->last_max_cwnd && fast_convergence)
+- ca->last_max_cwnd = (tp->snd_cwnd * (BICTCP_BETA_SCALE + beta))
++ if (tcp_snd_cwnd(tp) < ca->last_max_cwnd && fast_convergence)
++ ca->last_max_cwnd = (tcp_snd_cwnd(tp) * (BICTCP_BETA_SCALE + beta))
+ / (2 * BICTCP_BETA_SCALE);
+ else
+- ca->last_max_cwnd = tp->snd_cwnd;
++ ca->last_max_cwnd = tcp_snd_cwnd(tp);
+
+- return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
++ return max((tcp_snd_cwnd(tp) * beta) / BICTCP_BETA_SCALE, 2U);
+ }
+
+ static void cubictcp_state(struct sock *sk, u8 new_state)
+@@ -413,13 +413,13 @@ static void hystart_update(struct sock *sk, u32 delay)
+ ca->found = 1;
+ pr_debug("hystart_ack_train (%u > %u) delay_min %u (+ ack_delay %u) cwnd %u\n",
+ now - ca->round_start, threshold,
+- ca->delay_min, hystart_ack_delay(sk), tp->snd_cwnd);
++ ca->delay_min, hystart_ack_delay(sk), tcp_snd_cwnd(tp));
+ NET_INC_STATS(sock_net(sk),
+ LINUX_MIB_TCPHYSTARTTRAINDETECT);
+ NET_ADD_STATS(sock_net(sk),
+ LINUX_MIB_TCPHYSTARTTRAINCWND,
+- tp->snd_cwnd);
+- tp->snd_ssthresh = tp->snd_cwnd;
++ tcp_snd_cwnd(tp));
++ tp->snd_ssthresh = tcp_snd_cwnd(tp);
+ }
+ }
+ }
+@@ -438,8 +438,8 @@ static void hystart_update(struct sock *sk, u32 delay)
+ LINUX_MIB_TCPHYSTARTDELAYDETECT);
+ NET_ADD_STATS(sock_net(sk),
+ LINUX_MIB_TCPHYSTARTDELAYCWND,
+- tp->snd_cwnd);
+- tp->snd_ssthresh = tp->snd_cwnd;
++ tcp_snd_cwnd(tp));
++ tp->snd_ssthresh = tcp_snd_cwnd(tp);
+ }
+ }
+ }
+@@ -469,7 +469,7 @@ static void cubictcp_acked(struct sock *sk, const struct ack_sample *sample)
+
+ /* hystart triggers when cwnd is larger than some threshold */
+ if (!ca->found && tcp_in_slow_start(tp) && hystart &&
+- tp->snd_cwnd >= hystart_low_window)
++ tcp_snd_cwnd(tp) >= hystart_low_window)
+ hystart_update(sk, delay);
+ }
+
+diff --git a/net/ipv4/tcp_dctcp.c b/net/ipv4/tcp_dctcp.c
+index 1943a6630341c..ab034a4e9324a 100644
+--- a/net/ipv4/tcp_dctcp.c
++++ b/net/ipv4/tcp_dctcp.c
+@@ -106,8 +106,8 @@ static u32 dctcp_ssthresh(struct sock *sk)
+ struct dctcp *ca = inet_csk_ca(sk);
+ struct tcp_sock *tp = tcp_sk(sk);
+
+- ca->loss_cwnd = tp->snd_cwnd;
+- return max(tp->snd_cwnd - ((tp->snd_cwnd * ca->dctcp_alpha) >> 11U), 2U);
++ ca->loss_cwnd = tcp_snd_cwnd(tp);
++ return max(tcp_snd_cwnd(tp) - ((tcp_snd_cwnd(tp) * ca->dctcp_alpha) >> 11U), 2U);
+ }
+
+ static void dctcp_update_alpha(struct sock *sk, u32 flags)
+@@ -148,8 +148,8 @@ static void dctcp_react_to_loss(struct sock *sk)
+ struct dctcp *ca = inet_csk_ca(sk);
+ struct tcp_sock *tp = tcp_sk(sk);
+
+- ca->loss_cwnd = tp->snd_cwnd;
+- tp->snd_ssthresh = max(tp->snd_cwnd >> 1U, 2U);
++ ca->loss_cwnd = tcp_snd_cwnd(tp);
++ tp->snd_ssthresh = max(tcp_snd_cwnd(tp) >> 1U, 2U);
+ }
+
+ static void dctcp_state(struct sock *sk, u8 new_state)
+@@ -211,8 +211,9 @@ static size_t dctcp_get_info(struct sock *sk, u32 ext, int *attr,
+ static u32 dctcp_cwnd_undo(struct sock *sk)
+ {
+ const struct dctcp *ca = inet_csk_ca(sk);
++ struct tcp_sock *tp = tcp_sk(sk);
+
+- return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
++ return max(tcp_snd_cwnd(tp), ca->loss_cwnd);
+ }
+
+ static struct tcp_congestion_ops dctcp __read_mostly = {
+diff --git a/net/ipv4/tcp_highspeed.c b/net/ipv4/tcp_highspeed.c
+index 349069d6cd0aa..c6de5ce79ad3c 100644
+--- a/net/ipv4/tcp_highspeed.c
++++ b/net/ipv4/tcp_highspeed.c
+@@ -127,22 +127,22 @@ static void hstcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ * snd_cwnd <=
+ * hstcp_aimd_vals[ca->ai].cwnd
+ */
+- if (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd) {
+- while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd &&
++ if (tcp_snd_cwnd(tp) > hstcp_aimd_vals[ca->ai].cwnd) {
++ while (tcp_snd_cwnd(tp) > hstcp_aimd_vals[ca->ai].cwnd &&
+ ca->ai < HSTCP_AIMD_MAX - 1)
+ ca->ai++;
+- } else if (ca->ai && tp->snd_cwnd <= hstcp_aimd_vals[ca->ai-1].cwnd) {
+- while (ca->ai && tp->snd_cwnd <= hstcp_aimd_vals[ca->ai-1].cwnd)
++ } else if (ca->ai && tcp_snd_cwnd(tp) <= hstcp_aimd_vals[ca->ai-1].cwnd) {
++ while (ca->ai && tcp_snd_cwnd(tp) <= hstcp_aimd_vals[ca->ai-1].cwnd)
+ ca->ai--;
+ }
+
+ /* Do additive increase */
+- if (tp->snd_cwnd < tp->snd_cwnd_clamp) {
++ if (tcp_snd_cwnd(tp) < tp->snd_cwnd_clamp) {
+ /* cwnd = cwnd + a(w) / cwnd */
+ tp->snd_cwnd_cnt += ca->ai + 1;
+- if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
+- tp->snd_cwnd_cnt -= tp->snd_cwnd;
+- tp->snd_cwnd++;
++ if (tp->snd_cwnd_cnt >= tcp_snd_cwnd(tp)) {
++ tp->snd_cwnd_cnt -= tcp_snd_cwnd(tp);
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ }
+ }
+ }
+@@ -154,7 +154,7 @@ static u32 hstcp_ssthresh(struct sock *sk)
+ struct hstcp *ca = inet_csk_ca(sk);
+
+ /* Do multiplicative decrease */
+- return max(tp->snd_cwnd - ((tp->snd_cwnd * hstcp_aimd_vals[ca->ai].md) >> 8), 2U);
++ return max(tcp_snd_cwnd(tp) - ((tcp_snd_cwnd(tp) * hstcp_aimd_vals[ca->ai].md) >> 8), 2U);
+ }
+
+ static struct tcp_congestion_ops tcp_highspeed __read_mostly = {
+diff --git a/net/ipv4/tcp_htcp.c b/net/ipv4/tcp_htcp.c
+index 55adcfcf96fea..52b1f2665dfae 100644
+--- a/net/ipv4/tcp_htcp.c
++++ b/net/ipv4/tcp_htcp.c
+@@ -124,7 +124,7 @@ static void measure_achieved_throughput(struct sock *sk,
+
+ ca->packetcount += sample->pkts_acked;
+
+- if (ca->packetcount >= tp->snd_cwnd - (ca->alpha >> 7 ? : 1) &&
++ if (ca->packetcount >= tcp_snd_cwnd(tp) - (ca->alpha >> 7 ? : 1) &&
+ now - ca->lasttime >= ca->minRTT &&
+ ca->minRTT > 0) {
+ __u32 cur_Bi = ca->packetcount * HZ / (now - ca->lasttime);
+@@ -225,7 +225,7 @@ static u32 htcp_recalc_ssthresh(struct sock *sk)
+ const struct htcp *ca = inet_csk_ca(sk);
+
+ htcp_param_update(sk);
+- return max((tp->snd_cwnd * ca->beta) >> 7, 2U);
++ return max((tcp_snd_cwnd(tp) * ca->beta) >> 7, 2U);
+ }
+
+ static void htcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+@@ -242,9 +242,9 @@ static void htcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ /* In dangerous area, increase slowly.
+ * In theory this is tp->snd_cwnd += alpha / tp->snd_cwnd
+ */
+- if ((tp->snd_cwnd_cnt * ca->alpha)>>7 >= tp->snd_cwnd) {
+- if (tp->snd_cwnd < tp->snd_cwnd_clamp)
+- tp->snd_cwnd++;
++ if ((tp->snd_cwnd_cnt * ca->alpha)>>7 >= tcp_snd_cwnd(tp)) {
++ if (tcp_snd_cwnd(tp) < tp->snd_cwnd_clamp)
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ tp->snd_cwnd_cnt = 0;
+ htcp_alpha_update(ca);
+ } else
+diff --git a/net/ipv4/tcp_hybla.c b/net/ipv4/tcp_hybla.c
+index be39327e04e6c..abd7d91807e54 100644
+--- a/net/ipv4/tcp_hybla.c
++++ b/net/ipv4/tcp_hybla.c
+@@ -54,7 +54,7 @@ static void hybla_init(struct sock *sk)
+ ca->rho2_7ls = 0;
+ ca->snd_cwnd_cents = 0;
+ ca->hybla_en = true;
+- tp->snd_cwnd = 2;
++ tcp_snd_cwnd_set(tp, 2);
+ tp->snd_cwnd_clamp = 65535;
+
+ /* 1st Rho measurement based on initial srtt */
+@@ -62,7 +62,7 @@ static void hybla_init(struct sock *sk)
+
+ /* set minimum rtt as this is the 1st ever seen */
+ ca->minrtt_us = tp->srtt_us;
+- tp->snd_cwnd = ca->rho;
++ tcp_snd_cwnd_set(tp, ca->rho);
+ }
+
+ static void hybla_state(struct sock *sk, u8 ca_state)
+@@ -137,31 +137,31 @@ static void hybla_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ * as long as increment is estimated as (rho<<7)/window
+ * it already is <<7 and we can easily count its fractions.
+ */
+- increment = ca->rho2_7ls / tp->snd_cwnd;
++ increment = ca->rho2_7ls / tcp_snd_cwnd(tp);
+ if (increment < 128)
+ tp->snd_cwnd_cnt++;
+ }
+
+ odd = increment % 128;
+- tp->snd_cwnd += increment >> 7;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + (increment >> 7));
+ ca->snd_cwnd_cents += odd;
+
+ /* check when fractions goes >=128 and increase cwnd by 1. */
+ while (ca->snd_cwnd_cents >= 128) {
+- tp->snd_cwnd++;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ ca->snd_cwnd_cents -= 128;
+ tp->snd_cwnd_cnt = 0;
+ }
+ /* check when cwnd has not been incremented for a while */
+- if (increment == 0 && odd == 0 && tp->snd_cwnd_cnt >= tp->snd_cwnd) {
+- tp->snd_cwnd++;
++ if (increment == 0 && odd == 0 && tp->snd_cwnd_cnt >= tcp_snd_cwnd(tp)) {
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ tp->snd_cwnd_cnt = 0;
+ }
+ /* clamp down slowstart cwnd to ssthresh value. */
+ if (is_slowstart)
+- tp->snd_cwnd = min(tp->snd_cwnd, tp->snd_ssthresh);
++ tcp_snd_cwnd_set(tp, min(tcp_snd_cwnd(tp), tp->snd_ssthresh));
+
+- tp->snd_cwnd = min_t(u32, tp->snd_cwnd, tp->snd_cwnd_clamp);
++ tcp_snd_cwnd_set(tp, min(tcp_snd_cwnd(tp), tp->snd_cwnd_clamp));
+ }
+
+ static struct tcp_congestion_ops tcp_hybla __read_mostly = {
+diff --git a/net/ipv4/tcp_illinois.c b/net/ipv4/tcp_illinois.c
+index 00e54873213e8..c0c81a2c77fae 100644
+--- a/net/ipv4/tcp_illinois.c
++++ b/net/ipv4/tcp_illinois.c
+@@ -224,7 +224,7 @@ static void update_params(struct sock *sk)
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct illinois *ca = inet_csk_ca(sk);
+
+- if (tp->snd_cwnd < win_thresh) {
++ if (tcp_snd_cwnd(tp) < win_thresh) {
+ ca->alpha = ALPHA_BASE;
+ ca->beta = BETA_BASE;
+ } else if (ca->cnt_rtt > 0) {
+@@ -284,9 +284,9 @@ static void tcp_illinois_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ * tp->snd_cwnd += alpha/tp->snd_cwnd
+ */
+ delta = (tp->snd_cwnd_cnt * ca->alpha) >> ALPHA_SHIFT;
+- if (delta >= tp->snd_cwnd) {
+- tp->snd_cwnd = min(tp->snd_cwnd + delta / tp->snd_cwnd,
+- (u32)tp->snd_cwnd_clamp);
++ if (delta >= tcp_snd_cwnd(tp)) {
++ tcp_snd_cwnd_set(tp, min(tcp_snd_cwnd(tp) + delta / tcp_snd_cwnd(tp),
++ (u32)tp->snd_cwnd_clamp));
+ tp->snd_cwnd_cnt = 0;
+ }
+ }
+@@ -296,9 +296,11 @@ static u32 tcp_illinois_ssthresh(struct sock *sk)
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct illinois *ca = inet_csk_ca(sk);
++ u32 decr;
+
+ /* Multiplicative decrease */
+- return max(tp->snd_cwnd - ((tp->snd_cwnd * ca->beta) >> BETA_SHIFT), 2U);
++ decr = (tcp_snd_cwnd(tp) * ca->beta) >> BETA_SHIFT;
++ return max(tcp_snd_cwnd(tp) - decr, 2U);
+ }
+
+ /* Extract info for Tcp socket info provided via netlink. */
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 1f3ce7aea7168..6b8fcf79688b5 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -414,7 +414,7 @@ static void tcp_sndbuf_expand(struct sock *sk)
+ per_mss = roundup_pow_of_two(per_mss) +
+ SKB_DATA_ALIGN(sizeof(struct sk_buff));
+
+- nr_segs = max_t(u32, TCP_INIT_CWND, tp->snd_cwnd);
++ nr_segs = max_t(u32, TCP_INIT_CWND, tcp_snd_cwnd(tp));
+ nr_segs = max_t(u32, nr_segs, tp->reordering + 1);
+
+ /* Fast Recovery (RFC 5681 3.2) :
+@@ -909,12 +909,12 @@ static void tcp_update_pacing_rate(struct sock *sk)
+ * If snd_cwnd >= (tp->snd_ssthresh / 2), we are approaching
+ * end of slow start and should slow down.
+ */
+- if (tp->snd_cwnd < tp->snd_ssthresh / 2)
++ if (tcp_snd_cwnd(tp) < tp->snd_ssthresh / 2)
+ rate *= sock_net(sk)->ipv4.sysctl_tcp_pacing_ss_ratio;
+ else
+ rate *= sock_net(sk)->ipv4.sysctl_tcp_pacing_ca_ratio;
+
+- rate *= max(tp->snd_cwnd, tp->packets_out);
++ rate *= max(tcp_snd_cwnd(tp), tp->packets_out);
+
+ if (likely(tp->srtt_us))
+ do_div(rate, tp->srtt_us);
+@@ -2147,12 +2147,12 @@ void tcp_enter_loss(struct sock *sk)
+ !after(tp->high_seq, tp->snd_una) ||
+ (icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
+ tp->prior_ssthresh = tcp_current_ssthresh(sk);
+- tp->prior_cwnd = tp->snd_cwnd;
++ tp->prior_cwnd = tcp_snd_cwnd(tp);
+ tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
+ tcp_ca_event(sk, CA_EVENT_LOSS);
+ tcp_init_undo(tp);
+ }
+- tp->snd_cwnd = tcp_packets_in_flight(tp) + 1;
++ tcp_snd_cwnd_set(tp, tcp_packets_in_flight(tp) + 1);
+ tp->snd_cwnd_cnt = 0;
+ tp->snd_cwnd_stamp = tcp_jiffies32;
+
+@@ -2458,7 +2458,7 @@ static void DBGUNDO(struct sock *sk, const char *msg)
+ pr_debug("Undo %s %pI4/%u c%u l%u ss%u/%u p%u\n",
+ msg,
+ &inet->inet_daddr, ntohs(inet->inet_dport),
+- tp->snd_cwnd, tcp_left_out(tp),
++ tcp_snd_cwnd(tp), tcp_left_out(tp),
+ tp->snd_ssthresh, tp->prior_ssthresh,
+ tp->packets_out);
+ }
+@@ -2467,7 +2467,7 @@ static void DBGUNDO(struct sock *sk, const char *msg)
+ pr_debug("Undo %s %pI6/%u c%u l%u ss%u/%u p%u\n",
+ msg,
+ &sk->sk_v6_daddr, ntohs(inet->inet_dport),
+- tp->snd_cwnd, tcp_left_out(tp),
++ tcp_snd_cwnd(tp), tcp_left_out(tp),
+ tp->snd_ssthresh, tp->prior_ssthresh,
+ tp->packets_out);
+ }
+@@ -2492,7 +2492,7 @@ static void tcp_undo_cwnd_reduction(struct sock *sk, bool unmark_loss)
+ if (tp->prior_ssthresh) {
+ const struct inet_connection_sock *icsk = inet_csk(sk);
+
+- tp->snd_cwnd = icsk->icsk_ca_ops->undo_cwnd(sk);
++ tcp_snd_cwnd_set(tp, icsk->icsk_ca_ops->undo_cwnd(sk));
+
+ if (tp->prior_ssthresh > tp->snd_ssthresh) {
+ tp->snd_ssthresh = tp->prior_ssthresh;
+@@ -2599,7 +2599,7 @@ static void tcp_init_cwnd_reduction(struct sock *sk)
+ tp->high_seq = tp->snd_nxt;
+ tp->tlp_high_seq = 0;
+ tp->snd_cwnd_cnt = 0;
+- tp->prior_cwnd = tp->snd_cwnd;
++ tp->prior_cwnd = tcp_snd_cwnd(tp);
+ tp->prr_delivered = 0;
+ tp->prr_out = 0;
+ tp->snd_ssthresh = inet_csk(sk)->icsk_ca_ops->ssthresh(sk);
+@@ -2629,7 +2629,7 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
+ }
+ /* Force a fast retransmit upon entering fast recovery */
+ sndcnt = max(sndcnt, (tp->prr_out ? 0 : 1));
+- tp->snd_cwnd = tcp_packets_in_flight(tp) + sndcnt;
++ tcp_snd_cwnd_set(tp, tcp_packets_in_flight(tp) + sndcnt);
+ }
+
+ static inline void tcp_end_cwnd_reduction(struct sock *sk)
+@@ -2642,7 +2642,7 @@ static inline void tcp_end_cwnd_reduction(struct sock *sk)
+ /* Reset cwnd to ssthresh in CWR or Recovery (unless it's undone) */
+ if (tp->snd_ssthresh < TCP_INFINITE_SSTHRESH &&
+ (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR || tp->undo_marker)) {
+- tp->snd_cwnd = tp->snd_ssthresh;
++ tcp_snd_cwnd_set(tp, tp->snd_ssthresh);
+ tp->snd_cwnd_stamp = tcp_jiffies32;
+ }
+ tcp_ca_event(sk, CA_EVENT_COMPLETE_CWR);
+@@ -2706,12 +2706,15 @@ static void tcp_mtup_probe_success(struct sock *sk)
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct inet_connection_sock *icsk = inet_csk(sk);
++ u64 val;
+
+- /* FIXME: breaks with very large cwnd */
+ tp->prior_ssthresh = tcp_current_ssthresh(sk);
+- tp->snd_cwnd = tp->snd_cwnd *
+- tcp_mss_to_mtu(sk, tp->mss_cache) /
+- icsk->icsk_mtup.probe_size;
++
++ val = (u64)tcp_snd_cwnd(tp) * tcp_mss_to_mtu(sk, tp->mss_cache);
++ do_div(val, icsk->icsk_mtup.probe_size);
++ WARN_ON_ONCE((u32)val != val);
++ tcp_snd_cwnd_set(tp, max_t(u32, 1U, val));
++
+ tp->snd_cwnd_cnt = 0;
+ tp->snd_cwnd_stamp = tcp_jiffies32;
+ tp->snd_ssthresh = tcp_current_ssthresh(sk);
+@@ -3034,7 +3037,7 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
+ tp->snd_una == tp->mtu_probe.probe_seq_start) {
+ tcp_mtup_probe_failed(sk);
+ /* Restores the reduction we did in tcp_mtup_probe() */
+- tp->snd_cwnd++;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ tcp_simple_retransmit(sk);
+ return;
+ }
+@@ -5437,7 +5440,7 @@ static bool tcp_should_expand_sndbuf(struct sock *sk)
+ return false;
+
+ /* If we filled the congestion window, do not expand. */
+- if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
++ if (tcp_packets_in_flight(tp) >= tcp_snd_cwnd(tp))
+ return false;
+
+ return true;
+@@ -6013,9 +6016,9 @@ void tcp_init_transfer(struct sock *sk, int bpf_op, struct sk_buff *skb)
+ * retransmission has occurred.
+ */
+ if (tp->total_retrans > 1 && tp->undo_marker)
+- tp->snd_cwnd = 1;
++ tcp_snd_cwnd_set(tp, 1);
+ else
+- tp->snd_cwnd = tcp_init_cwnd(tp, __sk_dst_get(sk));
++ tcp_snd_cwnd_set(tp, tcp_init_cwnd(tp, __sk_dst_get(sk)));
+ tp->snd_cwnd_stamp = tcp_jiffies32;
+
+ bpf_skops_established(sk, bpf_op, skb);
+diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
+index 457f5b5d5d4a9..30a74e4eeab49 100644
+--- a/net/ipv4/tcp_ipv4.c
++++ b/net/ipv4/tcp_ipv4.c
+@@ -2621,7 +2621,7 @@ static void get_tcp4_sock(struct sock *sk, struct seq_file *f, int i)
+ jiffies_to_clock_t(icsk->icsk_rto),
+ jiffies_to_clock_t(icsk->icsk_ack.ato),
+ (icsk->icsk_ack.quick << 1) | inet_csk_in_pingpong_mode(sk),
+- tp->snd_cwnd,
++ tcp_snd_cwnd(tp),
+ state == TCP_LISTEN ?
+ fastopenq->max_qlen :
+ (tcp_in_initial_slowstart(tp) ? -1 : tp->snd_ssthresh));
+diff --git a/net/ipv4/tcp_lp.c b/net/ipv4/tcp_lp.c
+index 82b36ec3f2f82..ae36780977d27 100644
+--- a/net/ipv4/tcp_lp.c
++++ b/net/ipv4/tcp_lp.c
+@@ -297,7 +297,7 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
+ lp->flag &= ~LP_WITHIN_THR;
+
+ pr_debug("TCP-LP: %05o|%5u|%5u|%15u|%15u|%15u\n", lp->flag,
+- tp->snd_cwnd, lp->remote_hz, lp->owd_min, lp->owd_max,
++ tcp_snd_cwnd(tp), lp->remote_hz, lp->owd_min, lp->owd_max,
+ lp->sowd >> 3);
+
+ if (lp->flag & LP_WITHIN_THR)
+@@ -313,12 +313,12 @@ static void tcp_lp_pkts_acked(struct sock *sk, const struct ack_sample *sample)
+ /* happened within inference
+ * drop snd_cwnd into 1 */
+ if (lp->flag & LP_WITHIN_INF)
+- tp->snd_cwnd = 1U;
++ tcp_snd_cwnd_set(tp, 1U);
+
+ /* happened after inference
+ * cut snd_cwnd into half */
+ else
+- tp->snd_cwnd = max(tp->snd_cwnd >> 1U, 1U);
++ tcp_snd_cwnd_set(tp, max(tcp_snd_cwnd(tp) >> 1U, 1U));
+
+ /* record this drop time */
+ lp->last_drop = now;
+diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
+index 0588b004ddac1..7029b0e98edb2 100644
+--- a/net/ipv4/tcp_metrics.c
++++ b/net/ipv4/tcp_metrics.c
+@@ -388,15 +388,15 @@ void tcp_update_metrics(struct sock *sk)
+ if (!net->ipv4.sysctl_tcp_no_ssthresh_metrics_save &&
+ !tcp_metric_locked(tm, TCP_METRIC_SSTHRESH)) {
+ val = tcp_metric_get(tm, TCP_METRIC_SSTHRESH);
+- if (val && (tp->snd_cwnd >> 1) > val)
++ if (val && (tcp_snd_cwnd(tp) >> 1) > val)
+ tcp_metric_set(tm, TCP_METRIC_SSTHRESH,
+- tp->snd_cwnd >> 1);
++ tcp_snd_cwnd(tp) >> 1);
+ }
+ if (!tcp_metric_locked(tm, TCP_METRIC_CWND)) {
+ val = tcp_metric_get(tm, TCP_METRIC_CWND);
+- if (tp->snd_cwnd > val)
++ if (tcp_snd_cwnd(tp) > val)
+ tcp_metric_set(tm, TCP_METRIC_CWND,
+- tp->snd_cwnd);
++ tcp_snd_cwnd(tp));
+ }
+ } else if (!tcp_in_slow_start(tp) &&
+ icsk->icsk_ca_state == TCP_CA_Open) {
+@@ -404,10 +404,10 @@ void tcp_update_metrics(struct sock *sk)
+ if (!net->ipv4.sysctl_tcp_no_ssthresh_metrics_save &&
+ !tcp_metric_locked(tm, TCP_METRIC_SSTHRESH))
+ tcp_metric_set(tm, TCP_METRIC_SSTHRESH,
+- max(tp->snd_cwnd >> 1, tp->snd_ssthresh));
++ max(tcp_snd_cwnd(tp) >> 1, tp->snd_ssthresh));
+ if (!tcp_metric_locked(tm, TCP_METRIC_CWND)) {
+ val = tcp_metric_get(tm, TCP_METRIC_CWND);
+- tcp_metric_set(tm, TCP_METRIC_CWND, (val + tp->snd_cwnd) >> 1);
++ tcp_metric_set(tm, TCP_METRIC_CWND, (val + tcp_snd_cwnd(tp)) >> 1);
+ }
+ } else {
+ /* Else slow start did not finish, cwnd is non-sense,
+diff --git a/net/ipv4/tcp_nv.c b/net/ipv4/tcp_nv.c
+index ab552356bdba8..a60662f4bdf92 100644
+--- a/net/ipv4/tcp_nv.c
++++ b/net/ipv4/tcp_nv.c
+@@ -197,10 +197,10 @@ static void tcpnv_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ }
+
+ if (ca->cwnd_growth_factor < 0) {
+- cnt = tp->snd_cwnd << -ca->cwnd_growth_factor;
++ cnt = tcp_snd_cwnd(tp) << -ca->cwnd_growth_factor;
+ tcp_cong_avoid_ai(tp, cnt, acked);
+ } else {
+- cnt = max(4U, tp->snd_cwnd >> ca->cwnd_growth_factor);
++ cnt = max(4U, tcp_snd_cwnd(tp) >> ca->cwnd_growth_factor);
+ tcp_cong_avoid_ai(tp, cnt, acked);
+ }
+ }
+@@ -209,7 +209,7 @@ static u32 tcpnv_recalc_ssthresh(struct sock *sk)
+ {
+ const struct tcp_sock *tp = tcp_sk(sk);
+
+- return max((tp->snd_cwnd * nv_loss_dec_factor) >> 10, 2U);
++ return max((tcp_snd_cwnd(tp) * nv_loss_dec_factor) >> 10, 2U);
+ }
+
+ static void tcpnv_state(struct sock *sk, u8 new_state)
+@@ -257,7 +257,7 @@ static void tcpnv_acked(struct sock *sk, const struct ack_sample *sample)
+ return;
+
+ /* Stop cwnd growth if we were in catch up mode */
+- if (ca->nv_catchup && tp->snd_cwnd >= nv_min_cwnd) {
++ if (ca->nv_catchup && tcp_snd_cwnd(tp) >= nv_min_cwnd) {
+ ca->nv_catchup = 0;
+ ca->nv_allow_cwnd_growth = 0;
+ }
+@@ -371,7 +371,7 @@ static void tcpnv_acked(struct sock *sk, const struct ack_sample *sample)
+ * if cwnd < max_win, grow cwnd
+ * else leave the same
+ */
+- if (tp->snd_cwnd > max_win) {
++ if (tcp_snd_cwnd(tp) > max_win) {
+ /* there is congestion, check that it is ok
+ * to make a CA decision
+ * 1. We should have at least nv_dec_eval_min_calls
+@@ -398,20 +398,20 @@ static void tcpnv_acked(struct sock *sk, const struct ack_sample *sample)
+ ca->nv_allow_cwnd_growth = 0;
+ tp->snd_ssthresh =
+ (nv_ssthresh_factor * max_win) >> 3;
+- if (tp->snd_cwnd - max_win > 2) {
++ if (tcp_snd_cwnd(tp) - max_win > 2) {
+ /* gap > 2, we do exponential cwnd decrease */
+ int dec;
+
+- dec = max(2U, ((tp->snd_cwnd - max_win) *
++ dec = max(2U, ((tcp_snd_cwnd(tp) - max_win) *
+ nv_cong_dec_mult) >> 7);
+- tp->snd_cwnd -= dec;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) - dec);
+ } else if (nv_cong_dec_mult > 0) {
+- tp->snd_cwnd = max_win;
++ tcp_snd_cwnd_set(tp, max_win);
+ }
+ if (ca->cwnd_growth_factor > 0)
+ ca->cwnd_growth_factor = 0;
+ ca->nv_no_cong_cnt = 0;
+- } else if (tp->snd_cwnd <= max_win - nv_pad_buffer) {
++ } else if (tcp_snd_cwnd(tp) <= max_win - nv_pad_buffer) {
+ /* There is no congestion, grow cwnd if allowed*/
+ if (ca->nv_eval_call_cnt < nv_inc_eval_min_calls)
+ return;
+@@ -444,8 +444,8 @@ static void tcpnv_acked(struct sock *sk, const struct ack_sample *sample)
+ * (it wasn't before, if it is now is because nv
+ * decreased it).
+ */
+- if (tp->snd_cwnd < nv_min_cwnd)
+- tp->snd_cwnd = nv_min_cwnd;
++ if (tcp_snd_cwnd(tp) < nv_min_cwnd)
++ tcp_snd_cwnd_set(tp, nv_min_cwnd);
+ }
+ }
+
+diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
+index 1ca2f28c99810..6b00c17c72aa8 100644
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -143,7 +143,7 @@ void tcp_cwnd_restart(struct sock *sk, s32 delta)
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+ u32 restart_cwnd = tcp_init_cwnd(tp, __sk_dst_get(sk));
+- u32 cwnd = tp->snd_cwnd;
++ u32 cwnd = tcp_snd_cwnd(tp);
+
+ tcp_ca_event(sk, CA_EVENT_CWND_RESTART);
+
+@@ -152,7 +152,7 @@ void tcp_cwnd_restart(struct sock *sk, s32 delta)
+
+ while ((delta -= inet_csk(sk)->icsk_rto) > 0 && cwnd > restart_cwnd)
+ cwnd >>= 1;
+- tp->snd_cwnd = max(cwnd, restart_cwnd);
++ tcp_snd_cwnd_set(tp, max(cwnd, restart_cwnd));
+ tp->snd_cwnd_stamp = tcp_jiffies32;
+ tp->snd_cwnd_used = 0;
+ }
+@@ -1014,7 +1014,7 @@ static void tcp_tsq_write(struct sock *sk)
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if (tp->lost_out > tp->retrans_out &&
+- tp->snd_cwnd > tcp_packets_in_flight(tp)) {
++ tcp_snd_cwnd(tp) > tcp_packets_in_flight(tp)) {
+ tcp_mstamp_refresh(tp);
+ tcp_xmit_retransmit_queue(sk);
+ }
+@@ -1861,9 +1861,9 @@ static void tcp_cwnd_application_limited(struct sock *sk)
+ /* Limited by application or receiver window. */
+ u32 init_win = tcp_init_cwnd(tp, __sk_dst_get(sk));
+ u32 win_used = max(tp->snd_cwnd_used, init_win);
+- if (win_used < tp->snd_cwnd) {
++ if (win_used < tcp_snd_cwnd(tp)) {
+ tp->snd_ssthresh = tcp_current_ssthresh(sk);
+- tp->snd_cwnd = (tp->snd_cwnd + win_used) >> 1;
++ tcp_snd_cwnd_set(tp, (tcp_snd_cwnd(tp) + win_used) >> 1);
+ }
+ tp->snd_cwnd_used = 0;
+ }
+@@ -2044,7 +2044,7 @@ static inline unsigned int tcp_cwnd_test(const struct tcp_sock *tp,
+ return 1;
+
+ in_flight = tcp_packets_in_flight(tp);
+- cwnd = tp->snd_cwnd;
++ cwnd = tcp_snd_cwnd(tp);
+ if (in_flight >= cwnd)
+ return 0;
+
+@@ -2197,12 +2197,12 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
+ in_flight = tcp_packets_in_flight(tp);
+
+ BUG_ON(tcp_skb_pcount(skb) <= 1);
+- BUG_ON(tp->snd_cwnd <= in_flight);
++ BUG_ON(tcp_snd_cwnd(tp) <= in_flight);
+
+ send_win = tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq;
+
+ /* From in_flight test above, we know that cwnd > in_flight. */
+- cong_win = (tp->snd_cwnd - in_flight) * tp->mss_cache;
++ cong_win = (tcp_snd_cwnd(tp) - in_flight) * tp->mss_cache;
+
+ limit = min(send_win, cong_win);
+
+@@ -2216,7 +2216,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
+
+ win_divisor = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_tso_win_divisor);
+ if (win_divisor) {
+- u32 chunk = min(tp->snd_wnd, tp->snd_cwnd * tp->mss_cache);
++ u32 chunk = min(tp->snd_wnd, tcp_snd_cwnd(tp) * tp->mss_cache);
+
+ /* If at least some fraction of a window is available,
+ * just use it.
+@@ -2346,7 +2346,7 @@ static int tcp_mtu_probe(struct sock *sk)
+ if (likely(!icsk->icsk_mtup.enabled ||
+ icsk->icsk_mtup.probe_size ||
+ inet_csk(sk)->icsk_ca_state != TCP_CA_Open ||
+- tp->snd_cwnd < 11 ||
++ tcp_snd_cwnd(tp) < 11 ||
+ tp->rx_opt.num_sacks || tp->rx_opt.dsack))
+ return -1;
+
+@@ -2382,7 +2382,7 @@ static int tcp_mtu_probe(struct sock *sk)
+ return 0;
+
+ /* Do we need to wait to drain cwnd? With none in flight, don't stall */
+- if (tcp_packets_in_flight(tp) + 2 > tp->snd_cwnd) {
++ if (tcp_packets_in_flight(tp) + 2 > tcp_snd_cwnd(tp)) {
+ if (!tcp_packets_in_flight(tp))
+ return -1;
+ else
+@@ -2451,7 +2451,7 @@ static int tcp_mtu_probe(struct sock *sk)
+ if (!tcp_transmit_skb(sk, nskb, 1, GFP_ATOMIC)) {
+ /* Decrement cwnd here because we are sending
+ * effectively two packets. */
+- tp->snd_cwnd--;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) - 1);
+ tcp_event_new_data_sent(sk, nskb);
+
+ icsk->icsk_mtup.probe_size = tcp_mss_to_mtu(sk, nskb->len);
+@@ -2709,7 +2709,7 @@ repair:
+ else
+ tcp_chrono_stop(sk, TCP_CHRONO_RWND_LIMITED);
+
+- is_cwnd_limited |= (tcp_packets_in_flight(tp) >= tp->snd_cwnd);
++ is_cwnd_limited |= (tcp_packets_in_flight(tp) >= tcp_snd_cwnd(tp));
+ if (likely(sent_pkts || is_cwnd_limited))
+ tcp_cwnd_validate(sk, is_cwnd_limited);
+
+@@ -2819,7 +2819,7 @@ void tcp_send_loss_probe(struct sock *sk)
+ if (unlikely(!skb)) {
+ WARN_ONCE(tp->packets_out,
+ "invalid inflight: %u state %u cwnd %u mss %d\n",
+- tp->packets_out, sk->sk_state, tp->snd_cwnd, mss);
++ tp->packets_out, sk->sk_state, tcp_snd_cwnd(tp), mss);
+ inet_csk(sk)->icsk_pending = 0;
+ return;
+ }
+@@ -3303,7 +3303,7 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
+ if (!hole)
+ tp->retransmit_skb_hint = skb;
+
+- segs = tp->snd_cwnd - tcp_packets_in_flight(tp);
++ segs = tcp_snd_cwnd(tp) - tcp_packets_in_flight(tp);
+ if (segs <= 0)
+ break;
+ sacked = TCP_SKB_CB(skb)->sacked;
+@@ -4113,8 +4113,8 @@ int tcp_rtx_synack(const struct sock *sk, struct request_sock *req)
+ res = af_ops->send_synack(sk, NULL, &fl, req, NULL, TCP_SYNACK_NORMAL,
+ NULL);
+ if (!res) {
+- __TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS);
+- __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
++ TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS);
++ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
+ if (unlikely(tcp_passive_fastopen(sk)))
+ tcp_sk(sk)->total_retrans++;
+ trace_tcp_retransmit_synack(sk, req);
+diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c
+index 9a8e014d9b5b9..a8f6d9d06f2eb 100644
+--- a/net/ipv4/tcp_rate.c
++++ b/net/ipv4/tcp_rate.c
+@@ -200,7 +200,7 @@ void tcp_rate_check_app_limited(struct sock *sk)
+ /* Nothing in sending host's qdisc queues or NIC tx queue. */
+ sk_wmem_alloc_get(sk) < SKB_TRUESIZE(1) &&
+ /* We are not limited by CWND. */
+- tcp_packets_in_flight(tp) < tp->snd_cwnd &&
++ tcp_packets_in_flight(tp) < tcp_snd_cwnd(tp) &&
+ /* All lost packets have been retransmitted. */
+ tp->lost_out <= tp->retrans_out)
+ tp->app_limited =
+diff --git a/net/ipv4/tcp_scalable.c b/net/ipv4/tcp_scalable.c
+index 5842081bc8a25..862b96248a92d 100644
+--- a/net/ipv4/tcp_scalable.c
++++ b/net/ipv4/tcp_scalable.c
+@@ -27,7 +27,7 @@ static void tcp_scalable_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ if (!acked)
+ return;
+ }
+- tcp_cong_avoid_ai(tp, min(tp->snd_cwnd, TCP_SCALABLE_AI_CNT),
++ tcp_cong_avoid_ai(tp, min(tcp_snd_cwnd(tp), TCP_SCALABLE_AI_CNT),
+ acked);
+ }
+
+@@ -35,7 +35,7 @@ static u32 tcp_scalable_ssthresh(struct sock *sk)
+ {
+ const struct tcp_sock *tp = tcp_sk(sk);
+
+- return max(tp->snd_cwnd - (tp->snd_cwnd>>TCP_SCALABLE_MD_SCALE), 2U);
++ return max(tcp_snd_cwnd(tp) - (tcp_snd_cwnd(tp)>>TCP_SCALABLE_MD_SCALE), 2U);
+ }
+
+ static struct tcp_congestion_ops tcp_scalable __read_mostly = {
+diff --git a/net/ipv4/tcp_vegas.c b/net/ipv4/tcp_vegas.c
+index c8003c8aad2c0..786848ad37ea8 100644
+--- a/net/ipv4/tcp_vegas.c
++++ b/net/ipv4/tcp_vegas.c
+@@ -159,7 +159,7 @@ EXPORT_SYMBOL_GPL(tcp_vegas_cwnd_event);
+
+ static inline u32 tcp_vegas_ssthresh(struct tcp_sock *tp)
+ {
+- return min(tp->snd_ssthresh, tp->snd_cwnd);
++ return min(tp->snd_ssthresh, tcp_snd_cwnd(tp));
+ }
+
+ static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+@@ -217,14 +217,14 @@ static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ * This is:
+ * (actual rate in segments) * baseRTT
+ */
+- target_cwnd = (u64)tp->snd_cwnd * vegas->baseRTT;
++ target_cwnd = (u64)tcp_snd_cwnd(tp) * vegas->baseRTT;
+ do_div(target_cwnd, rtt);
+
+ /* Calculate the difference between the window we had,
+ * and the window we would like to have. This quantity
+ * is the "Diff" from the Arizona Vegas papers.
+ */
+- diff = tp->snd_cwnd * (rtt-vegas->baseRTT) / vegas->baseRTT;
++ diff = tcp_snd_cwnd(tp) * (rtt-vegas->baseRTT) / vegas->baseRTT;
+
+ if (diff > gamma && tcp_in_slow_start(tp)) {
+ /* Going too fast. Time to slow down
+@@ -238,7 +238,8 @@ static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ * truncation robs us of full link
+ * utilization.
+ */
+- tp->snd_cwnd = min(tp->snd_cwnd, (u32)target_cwnd+1);
++ tcp_snd_cwnd_set(tp, min(tcp_snd_cwnd(tp),
++ (u32)target_cwnd + 1));
+ tp->snd_ssthresh = tcp_vegas_ssthresh(tp);
+
+ } else if (tcp_in_slow_start(tp)) {
+@@ -254,14 +255,14 @@ static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ /* The old window was too fast, so
+ * we slow down.
+ */
+- tp->snd_cwnd--;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) - 1);
+ tp->snd_ssthresh
+ = tcp_vegas_ssthresh(tp);
+ } else if (diff < alpha) {
+ /* We don't have enough extra packets
+ * in the network, so speed up.
+ */
+- tp->snd_cwnd++;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ } else {
+ /* Sending just as fast as we
+ * should be.
+@@ -269,10 +270,10 @@ static void tcp_vegas_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ }
+ }
+
+- if (tp->snd_cwnd < 2)
+- tp->snd_cwnd = 2;
+- else if (tp->snd_cwnd > tp->snd_cwnd_clamp)
+- tp->snd_cwnd = tp->snd_cwnd_clamp;
++ if (tcp_snd_cwnd(tp) < 2)
++ tcp_snd_cwnd_set(tp, 2);
++ else if (tcp_snd_cwnd(tp) > tp->snd_cwnd_clamp)
++ tcp_snd_cwnd_set(tp, tp->snd_cwnd_clamp);
+
+ tp->snd_ssthresh = tcp_current_ssthresh(sk);
+ }
+diff --git a/net/ipv4/tcp_veno.c b/net/ipv4/tcp_veno.c
+index cd50a61c9976d..366ff6f214b2e 100644
+--- a/net/ipv4/tcp_veno.c
++++ b/net/ipv4/tcp_veno.c
+@@ -146,11 +146,11 @@ static void tcp_veno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+
+ rtt = veno->minrtt;
+
+- target_cwnd = (u64)tp->snd_cwnd * veno->basertt;
++ target_cwnd = (u64)tcp_snd_cwnd(tp) * veno->basertt;
+ target_cwnd <<= V_PARAM_SHIFT;
+ do_div(target_cwnd, rtt);
+
+- veno->diff = (tp->snd_cwnd << V_PARAM_SHIFT) - target_cwnd;
++ veno->diff = (tcp_snd_cwnd(tp) << V_PARAM_SHIFT) - target_cwnd;
+
+ if (tcp_in_slow_start(tp)) {
+ /* Slow start. */
+@@ -164,15 +164,15 @@ static void tcp_veno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ /* In the "non-congestive state", increase cwnd
+ * every rtt.
+ */
+- tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
++ tcp_cong_avoid_ai(tp, tcp_snd_cwnd(tp), acked);
+ } else {
+ /* In the "congestive state", increase cwnd
+ * every other rtt.
+ */
+- if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
++ if (tp->snd_cwnd_cnt >= tcp_snd_cwnd(tp)) {
+ if (veno->inc &&
+- tp->snd_cwnd < tp->snd_cwnd_clamp) {
+- tp->snd_cwnd++;
++ tcp_snd_cwnd(tp) < tp->snd_cwnd_clamp) {
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) + 1);
+ veno->inc = 0;
+ } else
+ veno->inc = 1;
+@@ -181,10 +181,10 @@ static void tcp_veno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+ tp->snd_cwnd_cnt += acked;
+ }
+ done:
+- if (tp->snd_cwnd < 2)
+- tp->snd_cwnd = 2;
+- else if (tp->snd_cwnd > tp->snd_cwnd_clamp)
+- tp->snd_cwnd = tp->snd_cwnd_clamp;
++ if (tcp_snd_cwnd(tp) < 2)
++ tcp_snd_cwnd_set(tp, 2);
++ else if (tcp_snd_cwnd(tp) > tp->snd_cwnd_clamp)
++ tcp_snd_cwnd_set(tp, tp->snd_cwnd_clamp);
+ }
+ /* Wipe the slate clean for the next rtt. */
+ /* veno->cntrtt = 0; */
+@@ -199,10 +199,10 @@ static u32 tcp_veno_ssthresh(struct sock *sk)
+
+ if (veno->diff < beta)
+ /* in "non-congestive state", cut cwnd by 1/5 */
+- return max(tp->snd_cwnd * 4 / 5, 2U);
++ return max(tcp_snd_cwnd(tp) * 4 / 5, 2U);
+ else
+ /* in "congestive state", cut cwnd by 1/2 */
+- return max(tp->snd_cwnd >> 1U, 2U);
++ return max(tcp_snd_cwnd(tp) >> 1U, 2U);
+ }
+
+ static struct tcp_congestion_ops tcp_veno __read_mostly = {
+diff --git a/net/ipv4/tcp_westwood.c b/net/ipv4/tcp_westwood.c
+index b2e05c4cea00f..c6e97141eef25 100644
+--- a/net/ipv4/tcp_westwood.c
++++ b/net/ipv4/tcp_westwood.c
+@@ -244,7 +244,8 @@ static void tcp_westwood_event(struct sock *sk, enum tcp_ca_event event)
+
+ switch (event) {
+ case CA_EVENT_COMPLETE_CWR:
+- tp->snd_cwnd = tp->snd_ssthresh = tcp_westwood_bw_rttmin(sk);
++ tp->snd_ssthresh = tcp_westwood_bw_rttmin(sk);
++ tcp_snd_cwnd_set(tp, tp->snd_ssthresh);
+ break;
+ case CA_EVENT_LOSS:
+ tp->snd_ssthresh = tcp_westwood_bw_rttmin(sk);
+diff --git a/net/ipv4/tcp_yeah.c b/net/ipv4/tcp_yeah.c
+index 07c4c93b9fdb6..18b07ff5d20e6 100644
+--- a/net/ipv4/tcp_yeah.c
++++ b/net/ipv4/tcp_yeah.c
+@@ -71,11 +71,11 @@ static void tcp_yeah_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+
+ if (!yeah->doing_reno_now) {
+ /* Scalable */
+- tcp_cong_avoid_ai(tp, min(tp->snd_cwnd, TCP_SCALABLE_AI_CNT),
++ tcp_cong_avoid_ai(tp, min(tcp_snd_cwnd(tp), TCP_SCALABLE_AI_CNT),
+ acked);
+ } else {
+ /* Reno */
+- tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
++ tcp_cong_avoid_ai(tp, tcp_snd_cwnd(tp), acked);
+ }
+
+ /* The key players are v_vegas.beg_snd_una and v_beg_snd_nxt.
+@@ -130,7 +130,7 @@ do_vegas:
+ /* Compute excess number of packets above bandwidth
+ * Avoid doing full 64 bit divide.
+ */
+- bw = tp->snd_cwnd;
++ bw = tcp_snd_cwnd(tp);
+ bw *= rtt - yeah->vegas.baseRTT;
+ do_div(bw, rtt);
+ queue = bw;
+@@ -138,20 +138,20 @@ do_vegas:
+ if (queue > TCP_YEAH_ALPHA ||
+ rtt - yeah->vegas.baseRTT > (yeah->vegas.baseRTT / TCP_YEAH_PHY)) {
+ if (queue > TCP_YEAH_ALPHA &&
+- tp->snd_cwnd > yeah->reno_count) {
++ tcp_snd_cwnd(tp) > yeah->reno_count) {
+ u32 reduction = min(queue / TCP_YEAH_GAMMA ,
+- tp->snd_cwnd >> TCP_YEAH_EPSILON);
++ tcp_snd_cwnd(tp) >> TCP_YEAH_EPSILON);
+
+- tp->snd_cwnd -= reduction;
++ tcp_snd_cwnd_set(tp, tcp_snd_cwnd(tp) - reduction);
+
+- tp->snd_cwnd = max(tp->snd_cwnd,
+- yeah->reno_count);
++ tcp_snd_cwnd_set(tp, max(tcp_snd_cwnd(tp),
++ yeah->reno_count));
+
+- tp->snd_ssthresh = tp->snd_cwnd;
++ tp->snd_ssthresh = tcp_snd_cwnd(tp);
+ }
+
+ if (yeah->reno_count <= 2)
+- yeah->reno_count = max(tp->snd_cwnd>>1, 2U);
++ yeah->reno_count = max(tcp_snd_cwnd(tp)>>1, 2U);
+ else
+ yeah->reno_count++;
+
+@@ -176,7 +176,7 @@ do_vegas:
+ */
+ yeah->vegas.beg_snd_una = yeah->vegas.beg_snd_nxt;
+ yeah->vegas.beg_snd_nxt = tp->snd_nxt;
+- yeah->vegas.beg_snd_cwnd = tp->snd_cwnd;
++ yeah->vegas.beg_snd_cwnd = tcp_snd_cwnd(tp);
+
+ /* Wipe the slate clean for the next RTT. */
+ yeah->vegas.cntRTT = 0;
+@@ -193,16 +193,16 @@ static u32 tcp_yeah_ssthresh(struct sock *sk)
+ if (yeah->doing_reno_now < TCP_YEAH_RHO) {
+ reduction = yeah->lastQ;
+
+- reduction = min(reduction, max(tp->snd_cwnd>>1, 2U));
++ reduction = min(reduction, max(tcp_snd_cwnd(tp)>>1, 2U));
+
+- reduction = max(reduction, tp->snd_cwnd >> TCP_YEAH_DELTA);
++ reduction = max(reduction, tcp_snd_cwnd(tp) >> TCP_YEAH_DELTA);
+ } else
+- reduction = max(tp->snd_cwnd>>1, 2U);
++ reduction = max(tcp_snd_cwnd(tp)>>1, 2U);
+
+ yeah->fast_count = 0;
+ yeah->reno_count = max(yeah->reno_count>>1, 2U);
+
+- return max_t(int, tp->snd_cwnd - reduction, 2);
++ return max_t(int, tcp_snd_cwnd(tp) - reduction, 2);
+ }
+
+ static struct tcp_congestion_ops tcp_yeah __read_mostly = {
+diff --git a/net/ipv4/xfrm4_protocol.c b/net/ipv4/xfrm4_protocol.c
+index 2fe5860c21d6e..b146ce88c5d0c 100644
+--- a/net/ipv4/xfrm4_protocol.c
++++ b/net/ipv4/xfrm4_protocol.c
+@@ -304,4 +304,3 @@ void __init xfrm4_protocol_init(void)
+ {
+ xfrm_input_register_afinfo(&xfrm4_input_afinfo);
+ }
+-EXPORT_SYMBOL(xfrm4_protocol_init);
+diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
+index ff033d16549e9..ecf3a553a0dc4 100644
+--- a/net/ipv6/ping.c
++++ b/net/ipv6/ping.c
+@@ -101,6 +101,9 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ ipc6.sockc.tsflags = sk->sk_tsflags;
+ ipc6.sockc.mark = sk->sk_mark;
+
++ memset(&fl6, 0, sizeof(fl6));
++ fl6.flowi6_oif = oif;
++
+ if (msg->msg_controllen) {
+ struct ipv6_txoptions opt = {};
+
+@@ -112,17 +115,14 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ return err;
+
+ /* Changes to txoptions and flow info are not implemented, yet.
+- * Drop the options, fl6 is wiped below.
++ * Drop the options.
+ */
+ ipc6.opt = NULL;
+ }
+
+- memset(&fl6, 0, sizeof(fl6));
+-
+ fl6.flowi6_proto = IPPROTO_ICMPV6;
+ fl6.saddr = np->saddr;
+ fl6.daddr = *daddr;
+- fl6.flowi6_oif = oif;
+ fl6.flowi6_mark = ipc6.sockc.mark;
+ fl6.flowi6_uid = sk->sk_uid;
+ fl6.fl6_icmp_type = user_icmph.icmp6_type;
+diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c
+index 29bc4e7c3046e..6de01185cc68f 100644
+--- a/net/ipv6/seg6_hmac.c
++++ b/net/ipv6/seg6_hmac.c
+@@ -399,7 +399,6 @@ int __init seg6_hmac_init(void)
+ {
+ return seg6_hmac_init_algo();
+ }
+-EXPORT_SYMBOL(seg6_hmac_init);
+
+ int __net_init seg6_hmac_net_init(struct net *net)
+ {
+diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
+index 9fbe243a0e810..98a34287439cc 100644
+--- a/net/ipv6/seg6_local.c
++++ b/net/ipv6/seg6_local.c
+@@ -218,6 +218,7 @@ seg6_lookup_any_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr,
+ struct flowi6 fl6;
+ int dev_flags = 0;
+
++ memset(&fl6, 0, sizeof(fl6));
+ fl6.flowi6_iif = skb->dev->ifindex;
+ fl6.daddr = nhaddr ? *nhaddr : hdr->daddr;
+ fl6.saddr = hdr->saddr;
+diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
+index faaddaf43c90b..cbc5fff3d8466 100644
+--- a/net/ipv6/tcp_ipv6.c
++++ b/net/ipv6/tcp_ipv6.c
+@@ -2044,7 +2044,7 @@ static void get_tcp6_sock(struct seq_file *seq, struct sock *sp, int i)
+ jiffies_to_clock_t(icsk->icsk_rto),
+ jiffies_to_clock_t(icsk->icsk_ack.ato),
+ (icsk->icsk_ack.quick << 1) | inet_csk_in_pingpong_mode(sp),
+- tp->snd_cwnd,
++ tcp_snd_cwnd(tp),
+ state == TCP_LISTEN ?
+ fastopenq->max_qlen :
+ (tcp_in_initial_slowstart(tp) ? -1 : tp->snd_ssthresh)
+diff --git a/net/key/af_key.c b/net/key/af_key.c
+index 339d95df19d32..d93bde6573593 100644
+--- a/net/key/af_key.c
++++ b/net/key/af_key.c
+@@ -2826,10 +2826,12 @@ static int pfkey_process(struct sock *sk, struct sk_buff *skb, const struct sadb
+ void *ext_hdrs[SADB_EXT_MAX];
+ int err;
+
+- err = pfkey_broadcast(skb_clone(skb, GFP_KERNEL), GFP_KERNEL,
+- BROADCAST_PROMISC_ONLY, NULL, sock_net(sk));
+- if (err)
+- return err;
++ /* Non-zero return value of pfkey_broadcast() does not always signal
++ * an error and even on an actual error we may still want to process
++ * the message so rather ignore the return value.
++ */
++ pfkey_broadcast(skb_clone(skb, GFP_KERNEL), GFP_KERNEL,
++ BROADCAST_PROMISC_ONLY, NULL, sock_net(sk));
+
+ memset(ext_hdrs, 0, sizeof(ext_hdrs));
+ err = parse_exthdrs(skb, hdr, ext_hdrs);
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index b6a9208130051..81243c834abbe 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -544,6 +544,7 @@ static int nft_trans_flowtable_add(struct nft_ctx *ctx, int msg_type,
+ if (msg_type == NFT_MSG_NEWFLOWTABLE)
+ nft_activate_next(ctx->net, flowtable);
+
++ INIT_LIST_HEAD(&nft_trans_flowtable_hooks(trans));
+ nft_trans_flowtable(trans) = flowtable;
+ nft_trans_commit_list_add_tail(ctx->net, trans);
+
+@@ -1914,7 +1915,6 @@ static struct nft_hook *nft_netdev_hook_alloc(struct net *net,
+ goto err_hook_dev;
+ }
+ hook->ops.dev = dev;
+- hook->inactive = false;
+
+ return hook;
+
+@@ -2166,7 +2166,7 @@ static int nft_basechain_init(struct nft_base_chain *basechain, u8 family,
+ chain->flags |= NFT_CHAIN_BASE | flags;
+ basechain->policy = NF_ACCEPT;
+ if (chain->flags & NFT_CHAIN_HW_OFFLOAD &&
+- nft_chain_offload_priority(basechain) < 0)
++ !nft_chain_offload_support(basechain))
+ return -EOPNOTSUPP;
+
+ flow_block_init(&basechain->flow_block);
+@@ -7326,7 +7326,7 @@ static void __nft_unregister_flowtable_net_hooks(struct net *net,
+ nf_unregister_net_hook(net, &hook->ops);
+ if (release_netdev) {
+ list_del(&hook->list);
+- kfree_rcu(hook);
++ kfree_rcu(hook, rcu);
+ }
+ }
+ }
+@@ -7427,11 +7427,15 @@ static int nft_flowtable_update(struct nft_ctx *ctx, const struct nlmsghdr *nlh,
+
+ if (nla[NFTA_FLOWTABLE_FLAGS]) {
+ flags = ntohl(nla_get_be32(nla[NFTA_FLOWTABLE_FLAGS]));
+- if (flags & ~NFT_FLOWTABLE_MASK)
+- return -EOPNOTSUPP;
++ if (flags & ~NFT_FLOWTABLE_MASK) {
++ err = -EOPNOTSUPP;
++ goto err_flowtable_update_hook;
++ }
+ if ((flowtable->data.flags & NFT_FLOWTABLE_HW_OFFLOAD) ^
+- (flags & NFT_FLOWTABLE_HW_OFFLOAD))
+- return -EOPNOTSUPP;
++ (flags & NFT_FLOWTABLE_HW_OFFLOAD)) {
++ err = -EOPNOTSUPP;
++ goto err_flowtable_update_hook;
++ }
+ } else {
+ flags = flowtable->data.flags;
+ }
+@@ -7612,6 +7616,7 @@ static int nft_delflowtable_hook(struct nft_ctx *ctx,
+ {
+ const struct nlattr * const *nla = ctx->nla;
+ struct nft_flowtable_hook flowtable_hook;
++ LIST_HEAD(flowtable_del_list);
+ struct nft_hook *this, *hook;
+ struct nft_trans *trans;
+ int err;
+@@ -7627,7 +7632,7 @@ static int nft_delflowtable_hook(struct nft_ctx *ctx,
+ err = -ENOENT;
+ goto err_flowtable_del_hook;
+ }
+- hook->inactive = true;
++ list_move(&hook->list, &flowtable_del_list);
+ }
+
+ trans = nft_trans_alloc(ctx, NFT_MSG_DELFLOWTABLE,
+@@ -7640,6 +7645,7 @@ static int nft_delflowtable_hook(struct nft_ctx *ctx,
+ nft_trans_flowtable(trans) = flowtable;
+ nft_trans_flowtable_update(trans) = true;
+ INIT_LIST_HEAD(&nft_trans_flowtable_hooks(trans));
++ list_splice(&flowtable_del_list, &nft_trans_flowtable_hooks(trans));
+ nft_flowtable_hook_release(&flowtable_hook);
+
+ nft_trans_commit_list_add_tail(ctx->net, trans);
+@@ -7647,13 +7653,7 @@ static int nft_delflowtable_hook(struct nft_ctx *ctx,
+ return 0;
+
+ err_flowtable_del_hook:
+- list_for_each_entry(this, &flowtable_hook.list, list) {
+- hook = nft_hook_list_find(&flowtable->hook_list, this);
+- if (!hook)
+- break;
+-
+- hook->inactive = false;
+- }
++ list_splice(&flowtable_del_list, &flowtable->hook_list);
+ nft_flowtable_hook_release(&flowtable_hook);
+
+ return err;
+@@ -8323,6 +8323,9 @@ static void nft_commit_release(struct nft_trans *trans)
+ nf_tables_chain_destroy(&trans->ctx);
+ break;
+ case NFT_MSG_DELRULE:
++ if (trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD)
++ nft_flow_rule_destroy(nft_trans_flow_rule(trans));
++
+ nf_tables_rule_destroy(&trans->ctx, nft_trans_rule(trans));
+ break;
+ case NFT_MSG_DELSET:
+@@ -8559,17 +8562,6 @@ void nft_chain_del(struct nft_chain *chain)
+ list_del_rcu(&chain->list);
+ }
+
+-static void nft_flowtable_hooks_del(struct nft_flowtable *flowtable,
+- struct list_head *hook_list)
+-{
+- struct nft_hook *hook, *next;
+-
+- list_for_each_entry_safe(hook, next, &flowtable->hook_list, list) {
+- if (hook->inactive)
+- list_move(&hook->list, hook_list);
+- }
+-}
+-
+ static void nf_tables_module_autoload_cleanup(struct net *net)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+@@ -8824,6 +8816,9 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ nf_tables_rule_notify(&trans->ctx,
+ nft_trans_rule(trans),
+ NFT_MSG_NEWRULE);
++ if (trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD)
++ nft_flow_rule_destroy(nft_trans_flow_rule(trans));
++
+ nft_trans_destroy(trans);
+ break;
+ case NFT_MSG_DELRULE:
+@@ -8914,8 +8909,6 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ break;
+ case NFT_MSG_DELFLOWTABLE:
+ if (nft_trans_flowtable_update(trans)) {
+- nft_flowtable_hooks_del(nft_trans_flowtable(trans),
+- &nft_trans_flowtable_hooks(trans));
+ nf_tables_flowtable_notify(&trans->ctx,
+ nft_trans_flowtable(trans),
+ &nft_trans_flowtable_hooks(trans),
+@@ -8996,7 +8989,6 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ struct nftables_pernet *nft_net = nft_pernet(net);
+ struct nft_trans *trans, *next;
+ struct nft_trans_elem *te;
+- struct nft_hook *hook;
+
+ if (action == NFNL_ABORT_VALIDATE &&
+ nf_tables_validate(net) < 0)
+@@ -9127,8 +9119,8 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ break;
+ case NFT_MSG_DELFLOWTABLE:
+ if (nft_trans_flowtable_update(trans)) {
+- list_for_each_entry(hook, &nft_trans_flowtable(trans)->hook_list, list)
+- hook->inactive = false;
++ list_splice(&nft_trans_flowtable_hooks(trans),
++ &nft_trans_flowtable(trans)->hook_list);
+ } else {
+ trans->ctx.table->use++;
+ nft_clear(trans->ctx.net, nft_trans_flowtable(trans));
+diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
+index 2d36952b13920..910ef881c3b85 100644
+--- a/net/netfilter/nf_tables_offload.c
++++ b/net/netfilter/nf_tables_offload.c
+@@ -208,7 +208,7 @@ static int nft_setup_cb_call(enum tc_setup_type type, void *type_data,
+ return 0;
+ }
+
+-int nft_chain_offload_priority(struct nft_base_chain *basechain)
++static int nft_chain_offload_priority(const struct nft_base_chain *basechain)
+ {
+ if (basechain->ops.priority <= 0 ||
+ basechain->ops.priority > USHRT_MAX)
+@@ -217,6 +217,27 @@ int nft_chain_offload_priority(struct nft_base_chain *basechain)
+ return 0;
+ }
+
++bool nft_chain_offload_support(const struct nft_base_chain *basechain)
++{
++ struct net_device *dev;
++ struct nft_hook *hook;
++
++ if (nft_chain_offload_priority(basechain) < 0)
++ return false;
++
++ list_for_each_entry(hook, &basechain->hook_list, list) {
++ if (hook->ops.pf != NFPROTO_NETDEV ||
++ hook->ops.hooknum != NF_NETDEV_INGRESS)
++ return false;
++
++ dev = hook->ops.dev;
++ if (!dev->netdev_ops->ndo_setup_tc && !flow_indr_dev_exists())
++ return false;
++ }
++
++ return true;
++}
++
+ static void nft_flow_cls_offload_setup(struct flow_cls_offload *cls_flow,
+ const struct nft_base_chain *basechain,
+ const struct nft_rule *rule,
+diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c
+index 4394df4bc99b4..e5fd6995e4bf3 100644
+--- a/net/netfilter/nft_nat.c
++++ b/net/netfilter/nft_nat.c
+@@ -335,7 +335,8 @@ static void nft_nat_inet_eval(const struct nft_expr *expr,
+ {
+ const struct nft_nat *priv = nft_expr_priv(expr);
+
+- if (priv->family == nft_pf(pkt))
++ if (priv->family == nft_pf(pkt) ||
++ priv->family == NFPROTO_INET)
+ nft_nat_eval(expr, regs, pkt);
+ }
+
+diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
+index 1b5d73079dc9b..868db4669a291 100644
+--- a/net/openvswitch/actions.c
++++ b/net/openvswitch/actions.c
+@@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
+ update_ip_l4_checksum(skb, nh, *addr, new_addr);
+ csum_replace4(&nh->check, *addr, new_addr);
+ skb_clear_hash(skb);
++ ovs_ct_clear(skb, NULL);
+ *addr = new_addr;
+ }
+
+@@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto,
+ update_ipv6_checksum(skb, l4_proto, addr, new_addr);
+
+ skb_clear_hash(skb);
++ ovs_ct_clear(skb, NULL);
+ memcpy(addr, new_addr, sizeof(__be32[4]));
+ }
+
+@@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
+ static void set_tp_port(struct sk_buff *skb, __be16 *port,
+ __be16 new_port, __sum16 *check)
+ {
++ ovs_ct_clear(skb, NULL);
+ inet_proto_csum_replace2(check, skb, *port, new_port, false);
+ *port = new_port;
+ }
+@@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key,
+ uh->dest = dst;
+ flow_key->tp.src = src;
+ flow_key->tp.dst = dst;
++ ovs_ct_clear(skb, NULL);
+ }
+
+ skb_clear_hash(skb);
+@@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
+ sh->checksum = old_csum ^ old_correct_csum ^ new_csum;
+
+ skb_clear_hash(skb);
++ ovs_ct_clear(skb, NULL);
++
+ flow_key->tp.src = sh->source;
+ flow_key->tp.dst = sh->dest;
+
+diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
+index 4a947c13c813a..4e70df91d0f2a 100644
+--- a/net/openvswitch/conntrack.c
++++ b/net/openvswitch/conntrack.c
+@@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
+
+ nf_ct_put(ct);
+ nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
+- ovs_ct_fill_key(skb, key, false);
++
++ if (key)
++ ovs_ct_fill_key(skb, key, false);
+
+ return 0;
+ }
+diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
+index b1f502fce5956..b3ca837fd4e82 100644
+--- a/net/sched/act_ct.c
++++ b/net/sched/act_ct.c
+@@ -548,7 +548,7 @@ tcf_ct_flow_table_fill_tuple_ipv6(struct sk_buff *skb,
+ break;
+ #endif
+ default:
+- return -1;
++ return false;
+ }
+
+ if (ip6h->hop_limit <= 1)
+diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
+index 45a24d24210f0..540b32d86d9b1 100644
+--- a/net/smc/af_smc.c
++++ b/net/smc/af_smc.c
+@@ -2136,6 +2136,7 @@ static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
+
+ not_found:
+ ini->smcr_version &= ~SMC_V2;
++ ini->smcrv2.ib_dev_v2 = NULL;
+ ini->check_smcrv2 = false;
+ }
+
+diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
+index 5c731f27996ef..53f63bfbaf5f9 100644
+--- a/net/smc/smc_cdc.c
++++ b/net/smc/smc_cdc.c
+@@ -82,7 +82,7 @@ int smc_cdc_get_free_slot(struct smc_connection *conn,
+ /* abnormal termination */
+ if (!rc)
+ smc_wr_tx_put_slot(link,
+- (struct smc_wr_tx_pend_priv *)pend);
++ (struct smc_wr_tx_pend_priv *)(*pend));
+ rc = -EPIPE;
+ }
+ return rc;
+diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
+index df194cc070350..b57cf9df4de89 100644
+--- a/net/sunrpc/xdr.c
++++ b/net/sunrpc/xdr.c
+@@ -979,7 +979,11 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
+ */
+ xdr->p = (void *)p + frag2bytes;
+ space_left = xdr->buf->buflen - xdr->buf->len;
+- xdr->end = (void *)p + min_t(int, space_left, PAGE_SIZE);
++ if (space_left - nbytes >= PAGE_SIZE)
++ xdr->end = (void *)p + PAGE_SIZE;
++ else
++ xdr->end = (void *)p + space_left - frag1bytes;
++
+ xdr->buf->page_len += frag2bytes;
+ xdr->buf->len += nbytes;
+ return p;
+diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
+index 281ddb87ac8d5..190a4de239c85 100644
+--- a/net/sunrpc/xprtrdma/rpc_rdma.c
++++ b/net/sunrpc/xprtrdma/rpc_rdma.c
+@@ -1121,6 +1121,7 @@ static bool
+ rpcrdma_is_bcall(struct rpcrdma_xprt *r_xprt, struct rpcrdma_rep *rep)
+ #if defined(CONFIG_SUNRPC_BACKCHANNEL)
+ {
++ struct rpc_xprt *xprt = &r_xprt->rx_xprt;
+ struct xdr_stream *xdr = &rep->rr_stream;
+ __be32 *p;
+
+@@ -1144,6 +1145,10 @@ rpcrdma_is_bcall(struct rpcrdma_xprt *r_xprt, struct rpcrdma_rep *rep)
+ if (*p != cpu_to_be32(RPC_CALL))
+ return false;
+
++ /* No bc service. */
++ if (xprt->bc_serv == NULL)
++ return false;
++
+ /* Now that we are sure this is a backchannel call,
+ * advance to the RPC header.
+ */
+diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
+index 5f0155fdefc7b..11cf7c6466443 100644
+--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
++++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
+@@ -478,10 +478,10 @@ svc_rdma_build_writes(struct svc_rdma_write_info *info,
+ unsigned int write_len;
+ u64 offset;
+
+- seg = &info->wi_chunk->ch_segments[info->wi_seg_no];
+- if (!seg)
++ if (info->wi_seg_no >= info->wi_chunk->ch_segcount)
+ goto out_overflow;
+
++ seg = &info->wi_chunk->ch_segments[info->wi_seg_no];
+ write_len = min(remaining, seg->rs_length - info->wi_seg_off);
+ if (!write_len)
+ goto out_overflow;
+diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
+index 6d39ca05f2495..932c87b98eca0 100644
+--- a/net/tipc/bearer.c
++++ b/net/tipc/bearer.c
+@@ -259,9 +259,8 @@ static int tipc_enable_bearer(struct net *net, const char *name,
+ u32 i;
+
+ if (!bearer_name_validate(name, &b_names)) {
+- errstr = "illegal name";
+ NL_SET_ERR_MSG(extack, "Illegal name");
+- goto rejected;
++ return res;
+ }
+
+ if (prio > TIPC_MAX_LINK_PRI && prio != TIPC_MEDIA_LINK_PRI) {
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index e71a312faa1e2..4aed12e94221a 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -490,7 +490,7 @@ static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other)
+ * -ECONNREFUSED. Otherwise, if we haven't queued any skbs
+ * to other and its full, we will hang waiting for POLLOUT.
+ */
+- if (unix_recvq_full(other) && !sock_flag(other, SOCK_DEAD))
++ if (unix_recvq_full_lockless(other) && !sock_flag(other, SOCK_DEAD))
+ return 1;
+
+ if (connected)
+diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
+index 3a9348030e207..d6bcdbfd0fc58 100644
+--- a/net/xdp/xsk.c
++++ b/net/xdp/xsk.c
+@@ -373,7 +373,8 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
+ goto out;
+ }
+
+- nb_pkts = xskq_cons_peek_desc_batch(xs->tx, pool, max_entries);
++ max_entries = xskq_cons_nb_entries(xs->tx, max_entries);
++ nb_pkts = xskq_cons_read_desc_batch(xs->tx, pool, max_entries);
+ if (!nb_pkts) {
+ xs->tx->queue_empty_descs++;
+ goto out;
+@@ -389,7 +390,7 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
+ if (!nb_pkts)
+ goto out;
+
+- xskq_cons_release_n(xs->tx, nb_pkts);
++ xskq_cons_release_n(xs->tx, max_entries);
+ __xskq_cons_release(xs->tx);
+ xs->sk.sk_write_space(&xs->sk);
+
+diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
+index 801cda5d19381..64b43f31942fd 100644
+--- a/net/xdp/xsk_queue.h
++++ b/net/xdp/xsk_queue.h
+@@ -282,14 +282,6 @@ static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
+ return xskq_cons_read_desc(q, desc, pool);
+ }
+
+-static inline u32 xskq_cons_peek_desc_batch(struct xsk_queue *q, struct xsk_buff_pool *pool,
+- u32 max)
+-{
+- u32 entries = xskq_cons_nb_entries(q, max);
+-
+- return xskq_cons_read_desc_batch(q, pool, entries);
+-}
+-
+ /* To improve performance in the xskq_cons_release functions, only update local state here.
+ * Reflect this to global state when we get new entries from the ring in
+ * xskq_cons_get_entries() and whenever Rx or Tx processing are completed in the NAPI loop.
+diff --git a/scripts/gdb/linux/config.py b/scripts/gdb/linux/config.py
+index 90e1565b19671..8843ab3cbaddc 100644
+--- a/scripts/gdb/linux/config.py
++++ b/scripts/gdb/linux/config.py
+@@ -24,9 +24,9 @@ class LxConfigDump(gdb.Command):
+ filename = arg
+
+ try:
+- py_config_ptr = gdb.parse_and_eval("kernel_config_data + 8")
+- py_config_size = gdb.parse_and_eval(
+- "sizeof(kernel_config_data) - 1 - 8 * 2")
++ py_config_ptr = gdb.parse_and_eval("&kernel_config_data")
++ py_config_ptr_end = gdb.parse_and_eval("&kernel_config_data_end")
++ py_config_size = py_config_ptr_end - py_config_ptr
+ except gdb.error as e:
+ raise gdb.GdbError("Can't find config, enable CONFIG_IKCONFIG?")
+
+diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
+index 1389db76cff31..0ffd5531242aa 100755
+--- a/scripts/get_abi.pl
++++ b/scripts/get_abi.pl
+@@ -981,11 +981,11 @@ __END__
+
+ =head1 NAME
+
+-abi_book.pl - parse the Linux ABI files and produce a ReST book.
++get_abi.pl - parse the Linux ABI files and produce a ReST book.
+
+ =head1 SYNOPSIS
+
+-B<abi_book.pl> [--debug <level>] [--enable-lineno] [--man] [--help]
++B<get_abi.pl> [--debug <level>] [--enable-lineno] [--man] [--help]
+ [--(no-)rst-source] [--dir=<dir>] [--show-hints]
+ [--search-string <regex>]
+ <COMMAND> [<ARGUMENT>]
+diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
+index ed9d056d2108a..b28344fd7408e 100644
+--- a/scripts/mod/modpost.c
++++ b/scripts/mod/modpost.c
+@@ -1267,7 +1267,8 @@ static int secref_whitelist(const struct sectioncheck *mismatch,
+
+ static inline int is_arm_mapping_symbol(const char *str)
+ {
+- return str[0] == '$' && strchr("axtd", str[1])
++ return str[0] == '$' &&
++ (str[1] == 'a' || str[1] == 'd' || str[1] == 't' || str[1] == 'x')
+ && (str[2] == '\0' || str[2] == '.');
+ }
+
+@@ -1993,7 +1994,7 @@ static char *remove_dot(char *s)
+
+ if (n && s[n]) {
+ size_t m = strspn(s + n + 1, "0123456789");
+- if (m && (s[n + m] == '.' || s[n + m] == 0))
++ if (m && (s[n + m + 1] == '.' || s[n + m + 1] == 0))
+ s[n] = 0;
+
+ /* strip trailing .prelink */
+diff --git a/security/keys/trusted-keys/trusted_tpm2.c b/security/keys/trusted-keys/trusted_tpm2.c
+index 0165da386289c..2b2c8eb258d5b 100644
+--- a/security/keys/trusted-keys/trusted_tpm2.c
++++ b/security/keys/trusted-keys/trusted_tpm2.c
+@@ -283,8 +283,8 @@ int tpm2_seal_trusted(struct tpm_chip *chip,
+ /* key properties */
+ flags = 0;
+ flags |= options->policydigest_len ? 0 : TPM2_OA_USER_WITH_AUTH;
+- flags |= payload->migratable ? (TPM2_OA_FIXED_TPM |
+- TPM2_OA_FIXED_PARENT) : 0;
++ flags |= payload->migratable ? 0 : (TPM2_OA_FIXED_TPM |
++ TPM2_OA_FIXED_PARENT);
+ tpm_buf_append_u32(&buf, flags);
+
+ /* policy */
+diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
+index 0515137a75b0f..bce2cef80000b 100644
+--- a/sound/pci/hda/patch_conexant.c
++++ b/sound/pci/hda/patch_conexant.c
+@@ -1052,6 +1052,13 @@ static int patch_conexant_auto(struct hda_codec *codec)
+ snd_hda_pick_fixup(codec, cxt5051_fixup_models,
+ cxt5051_fixups, cxt_fixups);
+ break;
++ case 0x14f15098:
++ codec->pin_amp_workaround = 1;
++ spec->gen.mixer_nid = 0x22;
++ spec->gen.add_stereo_mix_input = HDA_HINT_STEREO_MIX_AUTO;
++ snd_hda_pick_fixup(codec, cxt5066_fixup_models,
++ cxt5066_fixups, cxt_fixups);
++ break;
+ case 0x14f150f2:
+ codec->power_save_node = 1;
+ fallthrough;
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index 323c74a042688..8d2d29880716f 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -9111,6 +9111,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x89c3, "Zbook Studio G9", ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x89c6, "Zbook Fury 17 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x89ca, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
++ SND_PCI_QUIRK(0x103c, 0x8a78, "HP Dev One", ALC285_FIXUP_HP_LIMIT_INT_MIC_BOOST),
+ SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300),
+ SND_PCI_QUIRK(0x1043, 0x106d, "Asus K53BE", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
+@@ -9310,6 +9311,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x17aa, 0x3176, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
+ SND_PCI_QUIRK(0x17aa, 0x3178, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
+ SND_PCI_QUIRK(0x17aa, 0x31af, "ThinkCentre Station", ALC623_FIXUP_LENOVO_THINKSTATION_P340),
++ SND_PCI_QUIRK(0x17aa, 0x3802, "Lenovo Yoga DuetITL 2021", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS),
+ SND_PCI_QUIRK(0x17aa, 0x3813, "Legion 7i 15IMHG05", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS),
+ SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940", ALC298_FIXUP_LENOVO_SPK_VOLUME),
+ SND_PCI_QUIRK(0x17aa, 0x3819, "Lenovo 13s Gen2 ITL", ALC287_FIXUP_13S_GEN2_SPEAKERS),
+diff --git a/sound/soc/amd/acp/acp-pci.c b/sound/soc/amd/acp/acp-pci.c
+index 340e39d7f4205..c893963ee2d06 100644
+--- a/sound/soc/amd/acp/acp-pci.c
++++ b/sound/soc/amd/acp/acp-pci.c
+@@ -16,6 +16,7 @@
+ #include <linux/pci.h>
+ #include <linux/platform_device.h>
+ #include <linux/pm_runtime.h>
++#include <linux/module.h>
+
+ #include "amd.h"
+ #include "../mach-config.h"
+diff --git a/sound/soc/codecs/rt5640.c b/sound/soc/codecs/rt5640.c
+index 30c2e7cb7ed24..3559d9ecfa076 100644
+--- a/sound/soc/codecs/rt5640.c
++++ b/sound/soc/codecs/rt5640.c
+@@ -2094,12 +2094,14 @@ EXPORT_SYMBOL_GPL(rt5640_sel_asrc_clk_src);
+ void rt5640_enable_micbias1_for_ovcd(struct snd_soc_component *component)
+ {
+ struct snd_soc_dapm_context *dapm = snd_soc_component_get_dapm(component);
++ struct rt5640_priv *rt5640 = snd_soc_component_get_drvdata(component);
+
+ snd_soc_dapm_mutex_lock(dapm);
+ snd_soc_dapm_force_enable_pin_unlocked(dapm, "LDO2");
+ snd_soc_dapm_force_enable_pin_unlocked(dapm, "MICBIAS1");
+ /* OVCD is unreliable when used with RCCLK as sysclk-source */
+- snd_soc_dapm_force_enable_pin_unlocked(dapm, "Platform Clock");
++ if (rt5640->use_platform_clock)
++ snd_soc_dapm_force_enable_pin_unlocked(dapm, "Platform Clock");
+ snd_soc_dapm_sync_unlocked(dapm);
+ snd_soc_dapm_mutex_unlock(dapm);
+ }
+@@ -2108,9 +2110,11 @@ EXPORT_SYMBOL_GPL(rt5640_enable_micbias1_for_ovcd);
+ void rt5640_disable_micbias1_for_ovcd(struct snd_soc_component *component)
+ {
+ struct snd_soc_dapm_context *dapm = snd_soc_component_get_dapm(component);
++ struct rt5640_priv *rt5640 = snd_soc_component_get_drvdata(component);
+
+ snd_soc_dapm_mutex_lock(dapm);
+- snd_soc_dapm_disable_pin_unlocked(dapm, "Platform Clock");
++ if (rt5640->use_platform_clock)
++ snd_soc_dapm_disable_pin_unlocked(dapm, "Platform Clock");
+ snd_soc_dapm_disable_pin_unlocked(dapm, "MICBIAS1");
+ snd_soc_dapm_disable_pin_unlocked(dapm, "LDO2");
+ snd_soc_dapm_sync_unlocked(dapm);
+@@ -2535,6 +2539,9 @@ static void rt5640_enable_jack_detect(struct snd_soc_component *component,
+ rt5640->jd_gpio_irq_requested = true;
+ }
+
++ if (jack_data && jack_data->use_platform_clock)
++ rt5640->use_platform_clock = jack_data->use_platform_clock;
++
+ ret = request_irq(rt5640->irq, rt5640_irq,
+ IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
+ "rt5640", rt5640);
+diff --git a/sound/soc/codecs/rt5640.h b/sound/soc/codecs/rt5640.h
+index 9e49b9a0ccaad..505c93514051a 100644
+--- a/sound/soc/codecs/rt5640.h
++++ b/sound/soc/codecs/rt5640.h
+@@ -2155,11 +2155,13 @@ struct rt5640_priv {
+ bool jd_inverted;
+ unsigned int ovcd_th;
+ unsigned int ovcd_sf;
++ bool use_platform_clock;
+ };
+
+ struct rt5640_set_jack_data {
+ int codec_irq_override;
+ struct gpio_desc *jd_gpio;
++ bool use_platform_clock;
+ };
+
+ int rt5640_dmic_enable(struct snd_soc_component *component,
+diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
+index 7310fd02cc3c7..bd0b56589bdbc 100644
+--- a/sound/soc/fsl/fsl_sai.h
++++ b/sound/soc/fsl/fsl_sai.h
+@@ -80,8 +80,8 @@
+ #define FSL_SAI_xCR3(tx, ofs) (tx ? FSL_SAI_TCR3(ofs) : FSL_SAI_RCR3(ofs))
+ #define FSL_SAI_xCR4(tx, ofs) (tx ? FSL_SAI_TCR4(ofs) : FSL_SAI_RCR4(ofs))
+ #define FSL_SAI_xCR5(tx, ofs) (tx ? FSL_SAI_TCR5(ofs) : FSL_SAI_RCR5(ofs))
+-#define FSL_SAI_xDR(tx, ofs) (tx ? FSL_SAI_TDR(ofs) : FSL_SAI_RDR(ofs))
+-#define FSL_SAI_xFR(tx, ofs) (tx ? FSL_SAI_TFR(ofs) : FSL_SAI_RFR(ofs))
++#define FSL_SAI_xDR0(tx) (tx ? FSL_SAI_TDR0 : FSL_SAI_RDR0)
++#define FSL_SAI_xFR0(tx) (tx ? FSL_SAI_TFR0 : FSL_SAI_RFR0)
+ #define FSL_SAI_xMR(tx) (tx ? FSL_SAI_TMR : FSL_SAI_RMR)
+
+ /* SAI Transmit/Receive Control Register */
+diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c
+index f81ae742faa78..75ec4a9322bb3 100644
+--- a/sound/soc/intel/boards/bytcr_rt5640.c
++++ b/sound/soc/intel/boards/bytcr_rt5640.c
+@@ -1191,12 +1191,14 @@ static int byt_rt5640_init(struct snd_soc_pcm_runtime *runtime)
+ {
+ struct snd_soc_card *card = runtime->card;
+ struct byt_rt5640_private *priv = snd_soc_card_get_drvdata(card);
++ struct rt5640_set_jack_data *jack_data = &priv->jack_data;
+ struct snd_soc_component *component = asoc_rtd_to_codec(runtime, 0)->component;
+ const struct snd_soc_dapm_route *custom_map = NULL;
+ int num_routes = 0;
+ int ret;
+
+ card->dapm.idle_bias_off = true;
++ jack_data->use_platform_clock = true;
+
+ /* Start with RC clk for jack-detect (we disable MCLK below) */
+ if (byt_rt5640_quirk & BYT_RT5640_MCLK_EN)
+diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
+index b470404a5376c..e692ae04436a5 100644
+--- a/sound/usb/pcm.c
++++ b/sound/usb/pcm.c
+@@ -291,6 +291,9 @@ int snd_usb_audioformat_set_sync_ep(struct snd_usb_audio *chip,
+ bool is_playback;
+ int err;
+
++ if (fmt->sync_ep)
++ return 0; /* already set up */
++
+ alts = snd_usb_get_host_interface(chip, fmt->iface, fmt->altsetting);
+ if (!alts)
+ return 0;
+@@ -304,7 +307,7 @@ int snd_usb_audioformat_set_sync_ep(struct snd_usb_audio *chip,
+ * Generic sync EP handling
+ */
+
+- if (altsd->bNumEndpoints < 2)
++ if (fmt->ep_idx > 0 || altsd->bNumEndpoints < 2)
+ return 0;
+
+ is_playback = !(get_endpoint(alts, 0)->bEndpointAddress & USB_DIR_IN);
+diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h
+index 78eb41b621d63..4f56e1784932a 100644
+--- a/sound/usb/quirks-table.h
++++ b/sound/usb/quirks-table.h
+@@ -2658,7 +2658,12 @@ YAMAHA_DEVICE(0x7010, "UB99"),
+ .nr_rates = 2,
+ .rate_table = (unsigned int[]) {
+ 44100, 48000
+- }
++ },
++ .sync_ep = 0x82,
++ .sync_iface = 0,
++ .sync_altsetting = 1,
++ .sync_ep_idx = 1,
++ .implicit_fb = 1,
+ }
+ },
+ {
+diff --git a/tools/objtool/check.c b/tools/objtool/check.c
+index 8a0971a620f09..f66e4ac0af948 100644
+--- a/tools/objtool/check.c
++++ b/tools/objtool/check.c
+@@ -185,7 +185,8 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func,
+ "do_group_exit",
+ "stop_this_cpu",
+ "__invalid_creds",
+- "cpu_startup_entry",
++ "cpu_startup_entry",
++ "__ubsan_handle_builtin_unreachable",
+ };
+
+ if (!func)
+diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
+index 75564a7df15be..68f681ad54c1e 100644
+--- a/tools/perf/arch/x86/util/evlist.c
++++ b/tools/perf/arch/x86/util/evlist.c
+@@ -3,6 +3,7 @@
+ #include "util/pmu.h"
+ #include "util/evlist.h"
+ #include "util/parse-events.h"
++#include "topdown.h"
+
+ #define TOPDOWN_L1_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound}"
+ #define TOPDOWN_L2_EVENTS "{slots,topdown-retiring,topdown-bad-spec,topdown-fe-bound,topdown-be-bound,topdown-heavy-ops,topdown-br-mispredict,topdown-fetch-lat,topdown-mem-bound}"
+@@ -25,12 +26,12 @@ struct evsel *arch_evlist__leader(struct list_head *list)
+
+ first = list_first_entry(list, struct evsel, core.node);
+
+- if (!pmu_have_event("cpu", "slots"))
++ if (!topdown_sys_has_perf_metrics())
+ return first;
+
+ /* If there is a slots event and a topdown event then the slots event comes first. */
+ __evlist__for_each_entry(list, evsel) {
+- if (evsel->pmu_name && !strcmp(evsel->pmu_name, "cpu") && evsel->name) {
++ if (evsel->pmu_name && !strncmp(evsel->pmu_name, "cpu", 3) && evsel->name) {
+ if (strcasestr(evsel->name, "slots")) {
+ slots = evsel;
+ if (slots == first)
+diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
+index 0c9e56ab07b5b..3501399cef350 100644
+--- a/tools/perf/arch/x86/util/evsel.c
++++ b/tools/perf/arch/x86/util/evsel.c
+@@ -5,6 +5,7 @@
+ #include "util/env.h"
+ #include "util/pmu.h"
+ #include "linux/string.h"
++#include "evsel.h"
+
+ void arch_evsel__set_sample_weight(struct evsel *evsel)
+ {
+@@ -31,10 +32,29 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
+ free(env.cpuid);
+ }
+
++/* Check whether the evsel's PMU supports the perf metrics */
++bool evsel__sys_has_perf_metrics(const struct evsel *evsel)
++{
++ const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
++
++ /*
++ * The PERF_TYPE_RAW type is the core PMU type, e.g., "cpu" PMU
++ * on a non-hybrid machine, "cpu_core" PMU on a hybrid machine.
++ * The slots event is only available for the core PMU, which
++ * supports the perf metrics feature.
++ * Checking both the PERF_TYPE_RAW type and the slots event
++ * should be good enough to detect the perf metrics feature.
++ */
++ if ((evsel->core.attr.type == PERF_TYPE_RAW) &&
++ pmu_have_event(pmu_name, "slots"))
++ return true;
++
++ return false;
++}
++
+ bool arch_evsel__must_be_in_group(const struct evsel *evsel)
+ {
+- if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
+- !pmu_have_event("cpu", "slots"))
++ if (!evsel__sys_has_perf_metrics(evsel))
+ return false;
+
+ return evsel->name &&
+diff --git a/tools/perf/arch/x86/util/evsel.h b/tools/perf/arch/x86/util/evsel.h
+new file mode 100644
+index 0000000000000..19ad1691374dc
+--- /dev/null
++++ b/tools/perf/arch/x86/util/evsel.h
+@@ -0,0 +1,7 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef _EVSEL_H
++#define _EVSEL_H 1
++
++bool evsel__sys_has_perf_metrics(const struct evsel *evsel);
++
++#endif
+diff --git a/tools/perf/arch/x86/util/topdown.c b/tools/perf/arch/x86/util/topdown.c
+index 2f3d96aa92a58..f81a7cfe4d633 100644
+--- a/tools/perf/arch/x86/util/topdown.c
++++ b/tools/perf/arch/x86/util/topdown.c
+@@ -3,6 +3,32 @@
+ #include "api/fs/fs.h"
+ #include "util/pmu.h"
+ #include "util/topdown.h"
++#include "topdown.h"
++#include "evsel.h"
++
++/* Check whether there is a PMU which supports the perf metrics. */
++bool topdown_sys_has_perf_metrics(void)
++{
++ static bool has_perf_metrics;
++ static bool cached;
++ struct perf_pmu *pmu;
++
++ if (cached)
++ return has_perf_metrics;
++
++ /*
++ * The perf metrics feature is a core PMU feature.
++ * The PERF_TYPE_RAW type is the type of a core PMU.
++ * The slots event is only available when the core PMU
++ * supports the perf metrics feature.
++ */
++ pmu = perf_pmu__find_by_type(PERF_TYPE_RAW);
++ if (pmu && pmu_have_event(pmu->name, "slots"))
++ has_perf_metrics = true;
++
++ cached = true;
++ return has_perf_metrics;
++}
+
+ /*
+ * Check whether we can use a group for top down.
+@@ -30,33 +56,19 @@ void arch_topdown_group_warn(void)
+
+ #define TOPDOWN_SLOTS 0x0400
+
+-static bool is_topdown_slots_event(struct evsel *counter)
+-{
+- if (!counter->pmu_name)
+- return false;
+-
+- if (strcmp(counter->pmu_name, "cpu"))
+- return false;
+-
+- if (counter->core.attr.config == TOPDOWN_SLOTS)
+- return true;
+-
+- return false;
+-}
+-
+ /*
+ * Check whether a topdown group supports sample-read.
+ *
+- * Only Topdown metic supports sample-read. The slots
++ * Only Topdown metric supports sample-read. The slots
+ * event must be the leader of the topdown group.
+ */
+
+ bool arch_topdown_sample_read(struct evsel *leader)
+ {
+- if (!pmu_have_event("cpu", "slots"))
++ if (!evsel__sys_has_perf_metrics(leader))
+ return false;
+
+- if (is_topdown_slots_event(leader))
++ if (leader->core.attr.config == TOPDOWN_SLOTS)
+ return true;
+
+ return false;
+diff --git a/tools/perf/arch/x86/util/topdown.h b/tools/perf/arch/x86/util/topdown.h
+new file mode 100644
+index 0000000000000..46bf9273e572f
+--- /dev/null
++++ b/tools/perf/arch/x86/util/topdown.h
+@@ -0,0 +1,7 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef _TOPDOWN_H
++#define _TOPDOWN_H 1
++
++bool topdown_sys_has_perf_metrics(void);
++
++#endif
+diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
+index 8c9ffacbdd281..157533d451e3c 100644
+--- a/tools/perf/builtin-c2c.c
++++ b/tools/perf/builtin-c2c.c
+@@ -925,8 +925,8 @@ percent_rmt_hitm_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ double per_left;
+ double per_right;
+
+- per_left = PERCENT(left, lcl_hitm);
+- per_right = PERCENT(right, lcl_hitm);
++ per_left = PERCENT(left, rmt_hitm);
++ per_right = PERCENT(right, rmt_hitm);
+
+ return per_left - per_right;
+ }
+diff --git a/tools/testing/selftests/bpf/progs/test_stacktrace_build_id.c b/tools/testing/selftests/bpf/progs/test_stacktrace_build_id.c
+index 6c62bfb8bb6fa..0c4426592a260 100644
+--- a/tools/testing/selftests/bpf/progs/test_stacktrace_build_id.c
++++ b/tools/testing/selftests/bpf/progs/test_stacktrace_build_id.c
+@@ -39,7 +39,7 @@ struct {
+ __type(value, stack_trace_t);
+ } stack_amap SEC(".maps");
+
+-SEC("kprobe/urandom_read")
++SEC("kprobe/urandom_read_iter")
+ int oncpu(struct pt_regs *args)
+ {
+ __u32 max_len = sizeof(struct bpf_stack_build_id)
+diff --git a/tools/testing/selftests/net/bpf/Makefile b/tools/testing/selftests/net/bpf/Makefile
+index f91bf14bbee7b..8a69c91fcca07 100644
+--- a/tools/testing/selftests/net/bpf/Makefile
++++ b/tools/testing/selftests/net/bpf/Makefile
+@@ -2,6 +2,7 @@
+
+ CLANG ?= clang
+ CCINCLUDE += -I../../bpf
++CCINCLUDE += -I../../../lib
+ CCINCLUDE += -I../../../../../usr/include/
+
+ TEST_CUSTOM_PROGS = $(OUTPUT)/bpf/nat6to4.o
+@@ -10,5 +11,4 @@ all: $(TEST_CUSTOM_PROGS)
+ $(OUTPUT)/%.o: %.c
+ $(CLANG) -O2 -target bpf -c $< $(CCINCLUDE) -o $@
+
+-clean:
+- rm -f $(TEST_CUSTOM_PROGS)
++EXTRA_CLEAN := $(TEST_CUSTOM_PROGS)
+diff --git a/tools/testing/selftests/netfilter/nft_nat.sh b/tools/testing/selftests/netfilter/nft_nat.sh
+index eb8543b9a5c40..924ecb3f1f737 100755
+--- a/tools/testing/selftests/netfilter/nft_nat.sh
++++ b/tools/testing/selftests/netfilter/nft_nat.sh
+@@ -374,6 +374,45 @@ EOF
+ return $lret
+ }
+
++test_local_dnat_portonly()
++{
++ local family=$1
++ local daddr=$2
++ local lret=0
++ local sr_s
++ local sr_r
++
++ip netns exec "$ns0" nft -f /dev/stdin <<EOF
++table $family nat {
++ chain output {
++ type nat hook output priority 0; policy accept;
++ meta l4proto tcp dnat to :2000
++
++ }
++}
++EOF
++ if [ $? -ne 0 ]; then
++ if [ $family = "inet" ];then
++ echo "SKIP: inet port test"
++ test_inet_nat=false
++ return
++ fi
++ echo "SKIP: Could not add $family dnat hook"
++ return
++ fi
++
++ echo SERVER-$family | ip netns exec "$ns1" timeout 5 socat -u STDIN TCP-LISTEN:2000 &
++ sc_s=$!
++
++ result=$(ip netns exec "$ns0" timeout 1 socat TCP:$daddr:2000 STDOUT)
++
++ if [ "$result" = "SERVER-inet" ];then
++ echo "PASS: inet port rewrite without l3 address"
++ else
++ echo "ERROR: inet port rewrite"
++ ret=1
++ fi
++}
+
+ test_masquerade6()
+ {
+@@ -1148,6 +1187,10 @@ fi
+ reset_counters
+ test_local_dnat ip
+ test_local_dnat6 ip6
++
++reset_counters
++test_local_dnat_portonly inet 10.0.1.99
++
+ reset_counters
+ $test_inet_nat && test_local_dnat inet
+ $test_inet_nat && test_local_dnat6 inet
+diff --git a/tools/tracing/rtla/Makefile b/tools/tracing/rtla/Makefile
+index 523f0a8c38c23..3822f4ea5f495 100644
+--- a/tools/tracing/rtla/Makefile
++++ b/tools/tracing/rtla/Makefile
+@@ -58,6 +58,41 @@ else
+ DOCSRC = $(SRCTREE)/../../../Documentation/tools/rtla/
+ endif
+
++LIBTRACEEVENT_MIN_VERSION = 1.5
++LIBTRACEFS_MIN_VERSION = 1.3
++
++TEST_LIBTRACEEVENT = $(shell sh -c "$(PKG_CONFIG) --atleast-version $(LIBTRACEEVENT_MIN_VERSION) libtraceevent > /dev/null 2>&1 || echo n")
++ifeq ("$(TEST_LIBTRACEEVENT)", "n")
++.PHONY: warning_traceevent
++warning_traceevent:
++ @echo "********************************************"
++ @echo "** NOTICE: libtraceevent version $(LIBTRACEEVENT_MIN_VERSION) or higher not found"
++ @echo "**"
++ @echo "** Consider installing the latest libtraceevent from your"
++ @echo "** distribution, e.g., 'dnf install libtraceevent' on Fedora,"
++ @echo "** or from source:"
++ @echo "**"
++ @echo "** https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/ "
++ @echo "**"
++ @echo "********************************************"
++endif
++
++TEST_LIBTRACEFS = $(shell sh -c "$(PKG_CONFIG) --atleast-version $(LIBTRACEFS_MIN_VERSION) libtracefs > /dev/null 2>&1 || echo n")
++ifeq ("$(TEST_LIBTRACEFS)", "n")
++.PHONY: warning_tracefs
++warning_tracefs:
++ @echo "********************************************"
++ @echo "** NOTICE: libtracefs version $(LIBTRACEFS_MIN_VERSION) or higher not found"
++ @echo "**"
++ @echo "** Consider installing the latest libtracefs from your"
++ @echo "** distribution, e.g., 'dnf install libtracefs' on Fedora,"
++ @echo "** or from source:"
++ @echo "**"
++ @echo "** https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/ "
++ @echo "**"
++ @echo "********************************************"
++endif
++
+ .PHONY: all
+ all: rtla
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-16 12:07 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-16 12:07 UTC (permalink / raw
To: gentoo-commits
commit: 37fd5a860fa5ee01ce4a8a3c79b9f4e1e91780b6
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Jun 16 12:06:00 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Jun 16 12:06:00 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=37fd5a86
Linux patch 5.18.5
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1004_linux-5.18.5.patch | 1113 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 1117 insertions(+)
diff --git a/0000_README b/0000_README
index 1dbd2f58..d6cb6557 100644
--- a/0000_README
+++ b/0000_README
@@ -59,6 +59,10 @@ Patch: 1003_linux-5.18.4.patch
From: http://www.kernel.org
Desc: Linux 5.18.4
+Patch: 1004_linux-5.18.5.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.5
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1004_linux-5.18.5.patch b/1004_linux-5.18.5.patch
new file mode 100644
index 00000000..88623faf
--- /dev/null
+++ b/1004_linux-5.18.5.patch
@@ -0,0 +1,1113 @@
+diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
+index 2ad01cad7f1c8..bcc974d276dc4 100644
+--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
++++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
+@@ -526,6 +526,7 @@ What: /sys/devices/system/cpu/vulnerabilities
+ /sys/devices/system/cpu/vulnerabilities/srbds
+ /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+ /sys/devices/system/cpu/vulnerabilities/itlb_multihit
++ /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
+ Date: January 2018
+ Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
+ Description: Information about CPU vulnerabilities
+diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
+index 8cbc711cda935..4df436e7c4177 100644
+--- a/Documentation/admin-guide/hw-vuln/index.rst
++++ b/Documentation/admin-guide/hw-vuln/index.rst
+@@ -17,3 +17,4 @@ are configurable at compile, boot or run time.
+ special-register-buffer-data-sampling.rst
+ core-scheduling.rst
+ l1d_flush.rst
++ processor_mmio_stale_data.rst
+diff --git a/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst b/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
+new file mode 100644
+index 0000000000000..9393c50b5afc9
+--- /dev/null
++++ b/Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
+@@ -0,0 +1,246 @@
++=========================================
++Processor MMIO Stale Data Vulnerabilities
++=========================================
++
++Processor MMIO Stale Data Vulnerabilities are a class of memory-mapped I/O
++(MMIO) vulnerabilities that can expose data. The sequences of operations for
++exposing data range from simple to very complex. Because most of the
++vulnerabilities require the attacker to have access to MMIO, many environments
++are not affected. System environments using virtualization where MMIO access is
++provided to untrusted guests may need mitigation. These vulnerabilities are
++not transient execution attacks. However, these vulnerabilities may propagate
++stale data into core fill buffers where the data can subsequently be inferred
++by an unmitigated transient execution attack. Mitigation for these
++vulnerabilities includes a combination of microcode update and software
++changes, depending on the platform and usage model. Some of these mitigations
++are similar to those used to mitigate Microarchitectural Data Sampling (MDS) or
++those used to mitigate Special Register Buffer Data Sampling (SRBDS).
++
++Data Propagators
++================
++Propagators are operations that result in stale data being copied or moved from
++one microarchitectural buffer or register to another. Processor MMIO Stale Data
++Vulnerabilities are operations that may result in stale data being directly
++read into an architectural, software-visible state or sampled from a buffer or
++register.
++
++Fill Buffer Stale Data Propagator (FBSDP)
++-----------------------------------------
++Stale data may propagate from fill buffers (FB) into the non-coherent portion
++of the uncore on some non-coherent writes. Fill buffer propagation by itself
++does not make stale data architecturally visible. Stale data must be propagated
++to a location where it is subject to reading or sampling.
++
++Sideband Stale Data Propagator (SSDP)
++-------------------------------------
++The sideband stale data propagator (SSDP) is limited to the client (including
++Intel Xeon server E3) uncore implementation. The sideband response buffer is
++shared by all client cores. For non-coherent reads that go to sideband
++destinations, the uncore logic returns 64 bytes of data to the core, including
++both requested data and unrequested stale data, from a transaction buffer and
++the sideband response buffer. As a result, stale data from the sideband
++response and transaction buffers may now reside in a core fill buffer.
++
++Primary Stale Data Propagator (PSDP)
++------------------------------------
++The primary stale data propagator (PSDP) is limited to the client (including
++Intel Xeon server E3) uncore implementation. Similar to the sideband response
++buffer, the primary response buffer is shared by all client cores. For some
++processors, MMIO primary reads will return 64 bytes of data to the core fill
++buffer including both requested data and unrequested stale data. This is
++similar to the sideband stale data propagator.
++
++Vulnerabilities
++===============
++Device Register Partial Write (DRPW) (CVE-2022-21166)
++-----------------------------------------------------
++Some endpoint MMIO registers incorrectly handle writes that are smaller than
++the register size. Instead of aborting the write or only copying the correct
++subset of bytes (for example, 2 bytes for a 2-byte write), more bytes than
++specified by the write transaction may be written to the register. On
++processors affected by FBSDP, this may expose stale data from the fill buffers
++of the core that created the write transaction.
++
++Shared Buffers Data Sampling (SBDS) (CVE-2022-21125)
++----------------------------------------------------
++After propagators may have moved data around the uncore and copied stale data
++into client core fill buffers, processors affected by MFBDS can leak data from
++the fill buffer. It is limited to the client (including Intel Xeon server E3)
++uncore implementation.
++
++Shared Buffers Data Read (SBDR) (CVE-2022-21123)
++------------------------------------------------
++It is similar to Shared Buffer Data Sampling (SBDS) except that the data is
++directly read into the architectural software-visible state. It is limited to
++the client (including Intel Xeon server E3) uncore implementation.
++
++Affected Processors
++===================
++Not all the CPUs are affected by all the variants. For instance, most
++processors for the server market (excluding Intel Xeon E3 processors) are
++impacted by only Device Register Partial Write (DRPW).
++
++Below is the list of affected Intel processors [#f1]_:
++
++ =================== ============ =========
++ Common name Family_Model Steppings
++ =================== ============ =========
++ HASWELL_X 06_3FH 2,4
++ SKYLAKE_L 06_4EH 3
++ BROADWELL_X 06_4FH All
++ SKYLAKE_X 06_55H 3,4,6,7,11
++ BROADWELL_D 06_56H 3,4,5
++ SKYLAKE 06_5EH 3
++ ICELAKE_X 06_6AH 4,5,6
++ ICELAKE_D 06_6CH 1
++ ICELAKE_L 06_7EH 5
++ ATOM_TREMONT_D 06_86H All
++ LAKEFIELD 06_8AH 1
++ KABYLAKE_L 06_8EH 9 to 12
++ ATOM_TREMONT 06_96H 1
++ ATOM_TREMONT_L 06_9CH 0
++ KABYLAKE 06_9EH 9 to 13
++ COMETLAKE 06_A5H 2,3,5
++ COMETLAKE_L 06_A6H 0,1
++ ROCKETLAKE 06_A7H 1
++ =================== ============ =========
++
++If a CPU is in the affected processor list, but not affected by a variant, it
++is indicated by new bits in MSR IA32_ARCH_CAPABILITIES. As described in a later
++section, mitigation largely remains the same for all the variants, i.e. to
++clear the CPU fill buffers via VERW instruction.
++
++New bits in MSRs
++================
++Newer processors and microcode update on existing affected processors added new
++bits to IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate
++specific variants of Processor MMIO Stale Data vulnerabilities and mitigation
++capability.
++
++MSR IA32_ARCH_CAPABILITIES
++--------------------------
++Bit 13 - SBDR_SSDP_NO - When set, processor is not affected by either the
++ Shared Buffers Data Read (SBDR) vulnerability or the sideband stale
++ data propagator (SSDP).
++Bit 14 - FBSDP_NO - When set, processor is not affected by the Fill Buffer
++ Stale Data Propagator (FBSDP).
++Bit 15 - PSDP_NO - When set, processor is not affected by Primary Stale Data
++ Propagator (PSDP).
++Bit 17 - FB_CLEAR - When set, VERW instruction will overwrite CPU fill buffer
++ values as part of MD_CLEAR operations. Processors that do not
++ enumerate MDS_NO (meaning they are affected by MDS) but that do
++ enumerate support for both L1D_FLUSH and MD_CLEAR implicitly enumerate
++ FB_CLEAR as part of their MD_CLEAR support.
++Bit 18 - FB_CLEAR_CTRL - Processor supports read and write to MSR
++ IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]. On such processors, the FB_CLEAR_DIS
++ bit can be set to cause the VERW instruction to not perform the
++ FB_CLEAR action. Not all processors that support FB_CLEAR will support
++ FB_CLEAR_CTRL.
++
++MSR IA32_MCU_OPT_CTRL
++---------------------
++Bit 3 - FB_CLEAR_DIS - When set, VERW instruction does not perform the FB_CLEAR
++action. This may be useful to reduce the performance impact of FB_CLEAR in
++cases where system software deems it warranted (for example, when performance
++is more critical, or the untrusted software has no MMIO access). Note that
++FB_CLEAR_DIS has no impact on enumeration (for example, it does not change
++FB_CLEAR or MD_CLEAR enumeration) and it may not be supported on all processors
++that enumerate FB_CLEAR.
++
++Mitigation
++==========
++Like MDS, all variants of Processor MMIO Stale Data vulnerabilities have the
++same mitigation strategy to force the CPU to clear the affected buffers before
++an attacker can extract the secrets.
++
++This is achieved by using the otherwise unused and obsolete VERW instruction in
++combination with a microcode update. The microcode clears the affected CPU
++buffers when the VERW instruction is executed.
++
++Kernel reuses the MDS function to invoke the buffer clearing:
++
++ mds_clear_cpu_buffers()
++
++On MDS affected CPUs, the kernel already invokes CPU buffer clear on
++kernel/userspace, hypervisor/guest and C-state (idle) transitions. No
++additional mitigation is needed on such CPUs.
++
++For CPUs not affected by MDS or TAA, mitigation is needed only for the attacker
++with MMIO capability. Therefore, VERW is not required for kernel/userspace. For
++virtualization case, VERW is only needed at VMENTER for a guest with MMIO
++capability.
++
++Mitigation points
++-----------------
++Return to user space
++^^^^^^^^^^^^^^^^^^^^
++Same mitigation as MDS when affected by MDS/TAA, otherwise no mitigation
++needed.
++
++C-State transition
++^^^^^^^^^^^^^^^^^^
++Control register writes by CPU during C-state transition can propagate data
++from fill buffer to uncore buffers. Execute VERW before C-state transition to
++clear CPU fill buffers.
++
++Guest entry point
++^^^^^^^^^^^^^^^^^
++Same mitigation as MDS when processor is also affected by MDS/TAA, otherwise
++execute VERW at VMENTER only for MMIO capable guests. On CPUs not affected by
++MDS/TAA, guest without MMIO access cannot extract secrets using Processor MMIO
++Stale Data vulnerabilities, so there is no need to execute VERW for such guests.
++
++Mitigation control on the kernel command line
++---------------------------------------------
++The kernel command line allows to control the Processor MMIO Stale Data
++mitigations at boot time with the option "mmio_stale_data=". The valid
++arguments for this option are:
++
++ ========== =================================================================
++ full If the CPU is vulnerable, enable mitigation; CPU buffer clearing
++ on exit to userspace and when entering a VM. Idle transitions are
++ protected as well. It does not automatically disable SMT.
++ full,nosmt Same as full, with SMT disabled on vulnerable CPUs. This is the
++ complete mitigation.
++ off Disables mitigation completely.
++ ========== =================================================================
++
++If the CPU is affected and mmio_stale_data=off is not supplied on the kernel
++command line, then the kernel selects the appropriate mitigation.
++
++Mitigation status information
++-----------------------------
++The Linux kernel provides a sysfs interface to enumerate the current
++vulnerability status of the system: whether the system is vulnerable, and
++which mitigations are active. The relevant sysfs file is:
++
++ /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
++
++The possible values in this file are:
++
++ .. list-table::
++
++ * - 'Not affected'
++ - The processor is not vulnerable
++ * - 'Vulnerable'
++ - The processor is vulnerable, but no mitigation enabled
++ * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
++ - The processor is vulnerable, but microcode is not updated. The
++ mitigation is enabled on a best effort basis.
++ * - 'Mitigation: Clear CPU buffers'
++ - The processor is vulnerable and the CPU buffer clearing mitigation is
++ enabled.
++
++If the processor is vulnerable then the following information is appended to
++the above information:
++
++ ======================== ===========================================
++ 'SMT vulnerable' SMT is enabled
++ 'SMT disabled' SMT is disabled
++ 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
++ ======================== ===========================================
++
++References
++----------
++.. [#f1] Affected Processors
++ https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 3f1cc5e317ed4..c4893782055b4 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -3105,6 +3105,7 @@
+ kvm.nx_huge_pages=off [X86]
+ no_entry_flush [PPC]
+ no_uaccess_flush [PPC]
++ mmio_stale_data=off [X86]
+
+ Exceptions:
+ This does not have any effect on
+@@ -3126,6 +3127,7 @@
+ Equivalent to: l1tf=flush,nosmt [X86]
+ mds=full,nosmt [X86]
+ tsx_async_abort=full,nosmt [X86]
++ mmio_stale_data=full,nosmt [X86]
+
+ mminit_loglevel=
+ [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
+@@ -3135,6 +3137,40 @@
+ log everything. Information is printed at KERN_DEBUG
+ so loglevel=8 may also need to be specified.
+
++ mmio_stale_data=
++ [X86,INTEL] Control mitigation for the Processor
++ MMIO Stale Data vulnerabilities.
++
++ Processor MMIO Stale Data is a class of
++ vulnerabilities that may expose data after an MMIO
++ operation. Exposed data could originate or end in
++ the same CPU buffers as affected by MDS and TAA.
++ Therefore, similar to MDS and TAA, the mitigation
++ is to clear the affected CPU buffers.
++
++ This parameter controls the mitigation. The
++ options are:
++
++ full - Enable mitigation on vulnerable CPUs
++
++ full,nosmt - Enable mitigation and disable SMT on
++ vulnerable CPUs.
++
++ off - Unconditionally disable mitigation
++
++ On MDS or TAA affected machines,
++ mmio_stale_data=off can be prevented by an active
++ MDS or TAA mitigation as these vulnerabilities are
++ mitigated with the same mechanism so in order to
++ disable this mitigation, you need to specify
++ mds=off and tsx_async_abort=off too.
++
++ Not specifying this option is equivalent to
++ mmio_stale_data=full.
++
++ For details see:
++ Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
++
+ module.sig_enforce
+ [KNL] When CONFIG_MODULE_SIG is set, this means that
+ modules without (valid) signatures will fail to load.
+diff --git a/Makefile b/Makefile
+index 6cbf7bb15edde..34bfb76d63332 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 4
++SUBLEVEL = 5
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
+index 73e643ae94b6f..e17de69faa543 100644
+--- a/arch/x86/include/asm/cpufeatures.h
++++ b/arch/x86/include/asm/cpufeatures.h
+@@ -443,5 +443,6 @@
+ #define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
+ #define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
+ #define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
++#define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
+
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
+index ee15311b6be1d..4425d6773183b 100644
+--- a/arch/x86/include/asm/msr-index.h
++++ b/arch/x86/include/asm/msr-index.h
+@@ -114,6 +114,30 @@
+ * Not susceptible to
+ * TSX Async Abort (TAA) vulnerabilities.
+ */
++#define ARCH_CAP_SBDR_SSDP_NO BIT(13) /*
++ * Not susceptible to SBDR and SSDP
++ * variants of Processor MMIO stale data
++ * vulnerabilities.
++ */
++#define ARCH_CAP_FBSDP_NO BIT(14) /*
++ * Not susceptible to FBSDP variant of
++ * Processor MMIO stale data
++ * vulnerabilities.
++ */
++#define ARCH_CAP_PSDP_NO BIT(15) /*
++ * Not susceptible to PSDP variant of
++ * Processor MMIO stale data
++ * vulnerabilities.
++ */
++#define ARCH_CAP_FB_CLEAR BIT(17) /*
++ * VERW clears CPU fill buffer
++ * even on MDS_NO CPUs.
++ */
++#define ARCH_CAP_FB_CLEAR_CTRL BIT(18) /*
++ * MSR_IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]
++ * bit available to control VERW
++ * behavior.
++ */
+
+ #define MSR_IA32_FLUSH_CMD 0x0000010b
+ #define L1D_FLUSH BIT(0) /*
+@@ -131,6 +155,7 @@
+ #define MSR_IA32_MCU_OPT_CTRL 0x00000123
+ #define RNGDS_MITG_DIS BIT(0) /* SRBDS support */
+ #define RTM_ALLOW BIT(1) /* TSX development mode */
++#define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */
+
+ #define MSR_IA32_SYSENTER_CS 0x00000174
+ #define MSR_IA32_SYSENTER_ESP 0x00000175
+diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
+index acbaeaf83b61a..da251a5645b0e 100644
+--- a/arch/x86/include/asm/nospec-branch.h
++++ b/arch/x86/include/asm/nospec-branch.h
+@@ -269,6 +269,8 @@ DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
+
+ DECLARE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
+
++DECLARE_STATIC_KEY_FALSE(mmio_stale_data_clear);
++
+ #include <asm/segment.h>
+
+ /**
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index 6296e1ebed1db..a8a9f64063315 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -41,8 +41,10 @@ static void __init spectre_v2_select_mitigation(void);
+ static void __init ssb_select_mitigation(void);
+ static void __init l1tf_select_mitigation(void);
+ static void __init mds_select_mitigation(void);
+-static void __init mds_print_mitigation(void);
++static void __init md_clear_update_mitigation(void);
++static void __init md_clear_select_mitigation(void);
+ static void __init taa_select_mitigation(void);
++static void __init mmio_select_mitigation(void);
+ static void __init srbds_select_mitigation(void);
+ static void __init l1d_flush_select_mitigation(void);
+
+@@ -85,6 +87,10 @@ EXPORT_SYMBOL_GPL(mds_idle_clear);
+ */
+ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
+
++/* Controls CPU Fill buffer clear before KVM guest MMIO accesses */
++DEFINE_STATIC_KEY_FALSE(mmio_stale_data_clear);
++EXPORT_SYMBOL_GPL(mmio_stale_data_clear);
++
+ void __init check_bugs(void)
+ {
+ identify_boot_cpu();
+@@ -117,17 +123,10 @@ void __init check_bugs(void)
+ spectre_v2_select_mitigation();
+ ssb_select_mitigation();
+ l1tf_select_mitigation();
+- mds_select_mitigation();
+- taa_select_mitigation();
++ md_clear_select_mitigation();
+ srbds_select_mitigation();
+ l1d_flush_select_mitigation();
+
+- /*
+- * As MDS and TAA mitigations are inter-related, print MDS
+- * mitigation until after TAA mitigation selection is done.
+- */
+- mds_print_mitigation();
+-
+ arch_smt_update();
+
+ #ifdef CONFIG_X86_32
+@@ -267,14 +266,6 @@ static void __init mds_select_mitigation(void)
+ }
+ }
+
+-static void __init mds_print_mitigation(void)
+-{
+- if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off())
+- return;
+-
+- pr_info("%s\n", mds_strings[mds_mitigation]);
+-}
+-
+ static int __init mds_cmdline(char *str)
+ {
+ if (!boot_cpu_has_bug(X86_BUG_MDS))
+@@ -329,7 +320,7 @@ static void __init taa_select_mitigation(void)
+ /* TSX previously disabled by tsx=off */
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ taa_mitigation = TAA_MITIGATION_TSX_DISABLED;
+- goto out;
++ return;
+ }
+
+ if (cpu_mitigations_off()) {
+@@ -343,7 +334,7 @@ static void __init taa_select_mitigation(void)
+ */
+ if (taa_mitigation == TAA_MITIGATION_OFF &&
+ mds_mitigation == MDS_MITIGATION_OFF)
+- goto out;
++ return;
+
+ if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ taa_mitigation = TAA_MITIGATION_VERW;
+@@ -375,18 +366,6 @@ static void __init taa_select_mitigation(void)
+
+ if (taa_nosmt || cpu_mitigations_auto_nosmt())
+ cpu_smt_disable(false);
+-
+- /*
+- * Update MDS mitigation, if necessary, as the mds_user_clear is
+- * now enabled for TAA mitigation.
+- */
+- if (mds_mitigation == MDS_MITIGATION_OFF &&
+- boot_cpu_has_bug(X86_BUG_MDS)) {
+- mds_mitigation = MDS_MITIGATION_FULL;
+- mds_select_mitigation();
+- }
+-out:
+- pr_info("%s\n", taa_strings[taa_mitigation]);
+ }
+
+ static int __init tsx_async_abort_parse_cmdline(char *str)
+@@ -410,6 +389,151 @@ static int __init tsx_async_abort_parse_cmdline(char *str)
+ }
+ early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
+
++#undef pr_fmt
++#define pr_fmt(fmt) "MMIO Stale Data: " fmt
++
++enum mmio_mitigations {
++ MMIO_MITIGATION_OFF,
++ MMIO_MITIGATION_UCODE_NEEDED,
++ MMIO_MITIGATION_VERW,
++};
++
++/* Default mitigation for Processor MMIO Stale Data vulnerabilities */
++static enum mmio_mitigations mmio_mitigation __ro_after_init = MMIO_MITIGATION_VERW;
++static bool mmio_nosmt __ro_after_init = false;
++
++static const char * const mmio_strings[] = {
++ [MMIO_MITIGATION_OFF] = "Vulnerable",
++ [MMIO_MITIGATION_UCODE_NEEDED] = "Vulnerable: Clear CPU buffers attempted, no microcode",
++ [MMIO_MITIGATION_VERW] = "Mitigation: Clear CPU buffers",
++};
++
++static void __init mmio_select_mitigation(void)
++{
++ u64 ia32_cap;
++
++ if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) ||
++ cpu_mitigations_off()) {
++ mmio_mitigation = MMIO_MITIGATION_OFF;
++ return;
++ }
++
++ if (mmio_mitigation == MMIO_MITIGATION_OFF)
++ return;
++
++ ia32_cap = x86_read_arch_cap_msr();
++
++ /*
++ * Enable CPU buffer clear mitigation for host and VMM, if also affected
++ * by MDS or TAA. Otherwise, enable mitigation for VMM only.
++ */
++ if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) &&
++ boot_cpu_has(X86_FEATURE_RTM)))
++ static_branch_enable(&mds_user_clear);
++ else
++ static_branch_enable(&mmio_stale_data_clear);
++
++ /*
++ * If Processor-MMIO-Stale-Data bug is present and Fill Buffer data can
++ * be propagated to uncore buffers, clearing the Fill buffers on idle
++ * is required irrespective of SMT state.
++ */
++ if (!(ia32_cap & ARCH_CAP_FBSDP_NO))
++ static_branch_enable(&mds_idle_clear);
++
++ /*
++ * Check if the system has the right microcode.
++ *
++ * CPU Fill buffer clear mitigation is enumerated by either an explicit
++ * FB_CLEAR or by the presence of both MD_CLEAR and L1D_FLUSH on MDS
++ * affected systems.
++ */
++ if ((ia32_cap & ARCH_CAP_FB_CLEAR) ||
++ (boot_cpu_has(X86_FEATURE_MD_CLEAR) &&
++ boot_cpu_has(X86_FEATURE_FLUSH_L1D) &&
++ !(ia32_cap & ARCH_CAP_MDS_NO)))
++ mmio_mitigation = MMIO_MITIGATION_VERW;
++ else
++ mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED;
++
++ if (mmio_nosmt || cpu_mitigations_auto_nosmt())
++ cpu_smt_disable(false);
++}
++
++static int __init mmio_stale_data_parse_cmdline(char *str)
++{
++ if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
++ return 0;
++
++ if (!str)
++ return -EINVAL;
++
++ if (!strcmp(str, "off")) {
++ mmio_mitigation = MMIO_MITIGATION_OFF;
++ } else if (!strcmp(str, "full")) {
++ mmio_mitigation = MMIO_MITIGATION_VERW;
++ } else if (!strcmp(str, "full,nosmt")) {
++ mmio_mitigation = MMIO_MITIGATION_VERW;
++ mmio_nosmt = true;
++ }
++
++ return 0;
++}
++early_param("mmio_stale_data", mmio_stale_data_parse_cmdline);
++
++#undef pr_fmt
++#define pr_fmt(fmt) "" fmt
++
++static void __init md_clear_update_mitigation(void)
++{
++ if (cpu_mitigations_off())
++ return;
++
++ if (!static_key_enabled(&mds_user_clear))
++ goto out;
++
++ /*
++ * mds_user_clear is now enabled. Update MDS, TAA and MMIO Stale Data
++ * mitigation, if necessary.
++ */
++ if (mds_mitigation == MDS_MITIGATION_OFF &&
++ boot_cpu_has_bug(X86_BUG_MDS)) {
++ mds_mitigation = MDS_MITIGATION_FULL;
++ mds_select_mitigation();
++ }
++ if (taa_mitigation == TAA_MITIGATION_OFF &&
++ boot_cpu_has_bug(X86_BUG_TAA)) {
++ taa_mitigation = TAA_MITIGATION_VERW;
++ taa_select_mitigation();
++ }
++ if (mmio_mitigation == MMIO_MITIGATION_OFF &&
++ boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) {
++ mmio_mitigation = MMIO_MITIGATION_VERW;
++ mmio_select_mitigation();
++ }
++out:
++ if (boot_cpu_has_bug(X86_BUG_MDS))
++ pr_info("MDS: %s\n", mds_strings[mds_mitigation]);
++ if (boot_cpu_has_bug(X86_BUG_TAA))
++ pr_info("TAA: %s\n", taa_strings[taa_mitigation]);
++ if (boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
++ pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]);
++}
++
++static void __init md_clear_select_mitigation(void)
++{
++ mds_select_mitigation();
++ taa_select_mitigation();
++ mmio_select_mitigation();
++
++ /*
++ * As MDS, TAA and MMIO Stale Data mitigations are inter-related, update
++ * and print their mitigation after MDS, TAA and MMIO Stale Data
++ * mitigation selection is done.
++ */
++ md_clear_update_mitigation();
++}
++
+ #undef pr_fmt
+ #define pr_fmt(fmt) "SRBDS: " fmt
+
+@@ -471,11 +595,13 @@ static void __init srbds_select_mitigation(void)
+ return;
+
+ /*
+- * Check to see if this is one of the MDS_NO systems supporting
+- * TSX that are only exposed to SRBDS when TSX is enabled.
++ * Check to see if this is one of the MDS_NO systems supporting TSX that
++ * are only exposed to SRBDS when TSX is enabled or when CPU is affected
++ * by Processor MMIO Stale Data vulnerability.
+ */
+ ia32_cap = x86_read_arch_cap_msr();
+- if ((ia32_cap & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM))
++ if ((ia32_cap & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM) &&
++ !boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
+ srbds_mitigation = SRBDS_MITIGATION_TSX_OFF;
+ else if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
+ srbds_mitigation = SRBDS_MITIGATION_HYPERVISOR;
+@@ -1109,6 +1235,8 @@ static void update_indir_branch_cond(void)
+ /* Update the static key controlling the MDS CPU buffer clear in idle */
+ static void update_mds_branch_idle(void)
+ {
++ u64 ia32_cap = x86_read_arch_cap_msr();
++
+ /*
+ * Enable the idle clearing if SMT is active on CPUs which are
+ * affected only by MSBDS and not any other MDS variant.
+@@ -1120,14 +1248,17 @@ static void update_mds_branch_idle(void)
+ if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
+ return;
+
+- if (sched_smt_active())
++ if (sched_smt_active()) {
+ static_branch_enable(&mds_idle_clear);
+- else
++ } else if (mmio_mitigation == MMIO_MITIGATION_OFF ||
++ (ia32_cap & ARCH_CAP_FBSDP_NO)) {
+ static_branch_disable(&mds_idle_clear);
++ }
+ }
+
+ #define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+ #define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
++#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n"
+
+ void cpu_bugs_smt_update(void)
+ {
+@@ -1172,6 +1303,16 @@ void cpu_bugs_smt_update(void)
+ break;
+ }
+
++ switch (mmio_mitigation) {
++ case MMIO_MITIGATION_VERW:
++ case MMIO_MITIGATION_UCODE_NEEDED:
++ if (sched_smt_active())
++ pr_warn_once(MMIO_MSG_SMT);
++ break;
++ case MMIO_MITIGATION_OFF:
++ break;
++ }
++
+ mutex_unlock(&spec_ctrl_mutex);
+ }
+
+@@ -1774,6 +1915,20 @@ static ssize_t tsx_async_abort_show_state(char *buf)
+ sched_smt_active() ? "vulnerable" : "disabled");
+ }
+
++static ssize_t mmio_stale_data_show_state(char *buf)
++{
++ if (mmio_mitigation == MMIO_MITIGATION_OFF)
++ return sysfs_emit(buf, "%s\n", mmio_strings[mmio_mitigation]);
++
++ if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
++ return sysfs_emit(buf, "%s; SMT Host state unknown\n",
++ mmio_strings[mmio_mitigation]);
++ }
++
++ return sysfs_emit(buf, "%s; SMT %s\n", mmio_strings[mmio_mitigation],
++ sched_smt_active() ? "vulnerable" : "disabled");
++}
++
+ static char *stibp_state(void)
+ {
+ if (spectre_v2_in_eibrs_mode(spectre_v2_enabled))
+@@ -1874,6 +2029,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
+ case X86_BUG_SRBDS:
+ return srbds_show_state(buf);
+
++ case X86_BUG_MMIO_STALE_DATA:
++ return mmio_stale_data_show_state(buf);
++
+ default:
+ break;
+ }
+@@ -1925,4 +2083,9 @@ ssize_t cpu_show_srbds(struct device *dev, struct device_attribute *attr, char *
+ {
+ return cpu_show_common(dev, attr, buf, X86_BUG_SRBDS);
+ }
++
++ssize_t cpu_show_mmio_stale_data(struct device *dev, struct device_attribute *attr, char *buf)
++{
++ return cpu_show_common(dev, attr, buf, X86_BUG_MMIO_STALE_DATA);
++}
+ #endif
+diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
+index e342ae4db3c4d..af5d0c188f7b8 100644
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -1237,18 +1237,42 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+ X86_FEATURE_ANY, issues)
+
+ #define SRBDS BIT(0)
++/* CPU is affected by X86_BUG_MMIO_STALE_DATA */
++#define MMIO BIT(1)
++/* CPU is affected by Shared Buffers Data Sampling (SBDS), a variant of X86_BUG_MMIO_STALE_DATA */
++#define MMIO_SBDS BIT(2)
+
+ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
+ VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(HASWELL, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(HASWELL_L, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(HASWELL_G, X86_STEPPING_ANY, SRBDS),
++ VULNBL_INTEL_STEPPINGS(HASWELL_X, BIT(2) | BIT(4), MMIO),
++ VULNBL_INTEL_STEPPINGS(BROADWELL_D, X86_STEPPINGS(0x3, 0x5), MMIO),
+ VULNBL_INTEL_STEPPINGS(BROADWELL_G, X86_STEPPING_ANY, SRBDS),
++ VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO),
+ VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE_X, BIT(3) | BIT(4) | BIT(6) |
++ BIT(7) | BIT(0xB), MMIO),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0xC), SRBDS),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0xD), SRBDS),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x9, 0xC), SRBDS | MMIO),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0x8), SRBDS),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x9, 0xD), SRBDS | MMIO),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0x8), SRBDS),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPINGS(0x5, 0x5), MMIO | MMIO_SBDS),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPINGS(0x1, 0x1), MMIO),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPINGS(0x4, 0x6), MMIO),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE, BIT(2) | BIT(3) | BIT(5), MMIO | MMIO_SBDS),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO),
++ VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
++ VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPINGS(0x1, 0x1), MMIO),
++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO),
++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPINGS(0x0, 0x0), MMIO | MMIO_SBDS),
+ {}
+ };
+
+@@ -1269,6 +1293,13 @@ u64 x86_read_arch_cap_msr(void)
+ return ia32_cap;
+ }
+
++static bool arch_cap_mmio_immune(u64 ia32_cap)
++{
++ return (ia32_cap & ARCH_CAP_FBSDP_NO &&
++ ia32_cap & ARCH_CAP_PSDP_NO &&
++ ia32_cap & ARCH_CAP_SBDR_SSDP_NO);
++}
++
+ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+ {
+ u64 ia32_cap = x86_read_arch_cap_msr();
+@@ -1322,12 +1353,27 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+ /*
+ * SRBDS affects CPUs which support RDRAND or RDSEED and are listed
+ * in the vulnerability blacklist.
++ *
++ * Some of the implications and mitigation of Shared Buffers Data
++ * Sampling (SBDS) are similar to SRBDS. Give SBDS same treatment as
++ * SRBDS.
+ */
+ if ((cpu_has(c, X86_FEATURE_RDRAND) ||
+ cpu_has(c, X86_FEATURE_RDSEED)) &&
+- cpu_matches(cpu_vuln_blacklist, SRBDS))
++ cpu_matches(cpu_vuln_blacklist, SRBDS | MMIO_SBDS))
+ setup_force_cpu_bug(X86_BUG_SRBDS);
+
++ /*
++ * Processor MMIO Stale Data bug enumeration
++ *
++ * Affected CPU list is generally enough to enumerate the vulnerability,
++ * but for virtualization case check for ARCH_CAP MSR bits also, VMM may
++ * not want the guest to enumerate the bug.
++ */
++ if (cpu_matches(cpu_vuln_blacklist, MMIO) &&
++ !arch_cap_mmio_immune(ia32_cap))
++ setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA);
++
+ if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
+ return;
+
+diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
+index 982df9c000d31..9646ae886b4b5 100644
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -229,6 +229,9 @@ static const struct {
+ #define L1D_CACHE_ORDER 4
+ static void *vmx_l1d_flush_pages;
+
++/* Control for disabling CPU Fill buffer clear */
++static bool __read_mostly vmx_fb_clear_ctrl_available;
++
+ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
+ {
+ struct page *page;
+@@ -360,6 +363,60 @@ static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
+ return sprintf(s, "%s\n", vmentry_l1d_param[l1tf_vmx_mitigation].option);
+ }
+
++static void vmx_setup_fb_clear_ctrl(void)
++{
++ u64 msr;
++
++ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) &&
++ !boot_cpu_has_bug(X86_BUG_MDS) &&
++ !boot_cpu_has_bug(X86_BUG_TAA)) {
++ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
++ if (msr & ARCH_CAP_FB_CLEAR_CTRL)
++ vmx_fb_clear_ctrl_available = true;
++ }
++}
++
++static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx)
++{
++ u64 msr;
++
++ if (!vmx->disable_fb_clear)
++ return;
++
++ rdmsrl(MSR_IA32_MCU_OPT_CTRL, msr);
++ msr |= FB_CLEAR_DIS;
++ wrmsrl(MSR_IA32_MCU_OPT_CTRL, msr);
++ /* Cache the MSR value to avoid reading it later */
++ vmx->msr_ia32_mcu_opt_ctrl = msr;
++}
++
++static __always_inline void vmx_enable_fb_clear(struct vcpu_vmx *vmx)
++{
++ if (!vmx->disable_fb_clear)
++ return;
++
++ vmx->msr_ia32_mcu_opt_ctrl &= ~FB_CLEAR_DIS;
++ wrmsrl(MSR_IA32_MCU_OPT_CTRL, vmx->msr_ia32_mcu_opt_ctrl);
++}
++
++static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx)
++{
++ vmx->disable_fb_clear = vmx_fb_clear_ctrl_available;
++
++ /*
++ * If guest will not execute VERW, there is no need to set FB_CLEAR_DIS
++ * at VMEntry. Skip the MSR read/write when a guest has no use case to
++ * execute VERW.
++ */
++ if ((vcpu->arch.arch_capabilities & ARCH_CAP_FB_CLEAR) ||
++ ((vcpu->arch.arch_capabilities & ARCH_CAP_MDS_NO) &&
++ (vcpu->arch.arch_capabilities & ARCH_CAP_TAA_NO) &&
++ (vcpu->arch.arch_capabilities & ARCH_CAP_PSDP_NO) &&
++ (vcpu->arch.arch_capabilities & ARCH_CAP_FBSDP_NO) &&
++ (vcpu->arch.arch_capabilities & ARCH_CAP_SBDR_SSDP_NO)))
++ vmx->disable_fb_clear = false;
++}
++
+ static const struct kernel_param_ops vmentry_l1d_flush_ops = {
+ .set = vmentry_l1d_flush_set,
+ .get = vmentry_l1d_flush_get,
+@@ -2252,6 +2309,10 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
+ ret = kvm_set_msr_common(vcpu, msr_info);
+ }
+
++ /* FB_CLEAR may have changed, also update the FB_CLEAR_DIS behavior */
++ if (msr_index == MSR_IA32_ARCH_CAPABILITIES)
++ vmx_update_fb_clear_dis(vcpu, vmx);
++
+ return ret;
+ }
+
+@@ -4553,6 +4614,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
+ kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
+
+ vpid_sync_context(vmx->vpid);
++
++ vmx_update_fb_clear_dis(vcpu, vmx);
+ }
+
+ static void vmx_enable_irq_window(struct kvm_vcpu *vcpu)
+@@ -6773,6 +6836,11 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
+ vmx_l1d_flush(vcpu);
+ else if (static_branch_unlikely(&mds_user_clear))
+ mds_clear_cpu_buffers();
++ else if (static_branch_unlikely(&mmio_stale_data_clear) &&
++ kvm_arch_has_assigned_device(vcpu->kvm))
++ mds_clear_cpu_buffers();
++
++ vmx_disable_fb_clear(vmx);
+
+ if (vcpu->arch.cr2 != native_read_cr2())
+ native_write_cr2(vcpu->arch.cr2);
+@@ -6782,6 +6850,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
+
+ vcpu->arch.cr2 = native_read_cr2();
+
++ vmx_enable_fb_clear(vmx);
++
+ guest_state_exit_irqoff();
+ }
+
+@@ -8182,6 +8252,8 @@ static int __init vmx_init(void)
+ return r;
+ }
+
++ vmx_setup_fb_clear_ctrl();
++
+ for_each_possible_cpu(cpu) {
+ INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
+
+diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
+index b98c7e96697a9..8d2342ede0c59 100644
+--- a/arch/x86/kvm/vmx/vmx.h
++++ b/arch/x86/kvm/vmx/vmx.h
+@@ -348,6 +348,8 @@ struct vcpu_vmx {
+ u64 msr_ia32_feature_control_valid_bits;
+ /* SGX Launch Control public key hash */
+ u64 msr_ia32_sgxlepubkeyhash[4];
++ u64 msr_ia32_mcu_opt_ctrl;
++ bool disable_fb_clear;
+
+ struct pt_desc pt_desc;
+ struct lbr_desc lbr_desc;
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 39c571224ac28..558d1f2ab5b49 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -1587,6 +1587,9 @@ static u64 kvm_get_arch_capabilities(void)
+ */
+ }
+
++ /* Guests don't need to know "Fill buffer clear control" exists */
++ data &= ~ARCH_CAP_FB_CLEAR_CTRL;
++
+ return data;
+ }
+
+diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
+index 2ef23fce0860c..a97776ea9d990 100644
+--- a/drivers/base/cpu.c
++++ b/drivers/base/cpu.c
+@@ -564,6 +564,12 @@ ssize_t __weak cpu_show_srbds(struct device *dev,
+ return sysfs_emit(buf, "Not affected\n");
+ }
+
++ssize_t __weak cpu_show_mmio_stale_data(struct device *dev,
++ struct device_attribute *attr, char *buf)
++{
++ return sysfs_emit(buf, "Not affected\n");
++}
++
+ static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
+ static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
+ static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
+@@ -573,6 +579,7 @@ static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
+ static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);
+ static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);
+ static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL);
++static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL);
+
+ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
+ &dev_attr_meltdown.attr,
+@@ -584,6 +591,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
+ &dev_attr_tsx_async_abort.attr,
+ &dev_attr_itlb_multihit.attr,
+ &dev_attr_srbds.attr,
++ &dev_attr_mmio_stale_data.attr,
+ NULL
+ };
+
+diff --git a/include/linux/cpu.h b/include/linux/cpu.h
+index 54dc2f9a2d56e..2c74773547444 100644
+--- a/include/linux/cpu.h
++++ b/include/linux/cpu.h
+@@ -65,6 +65,9 @@ extern ssize_t cpu_show_tsx_async_abort(struct device *dev,
+ extern ssize_t cpu_show_itlb_multihit(struct device *dev,
+ struct device_attribute *attr, char *buf);
+ extern ssize_t cpu_show_srbds(struct device *dev, struct device_attribute *attr, char *buf);
++extern ssize_t cpu_show_mmio_stale_data(struct device *dev,
++ struct device_attribute *attr,
++ char *buf);
+
+ extern __printf(4, 5)
+ struct device *cpu_device_create(struct device *parent, void *drvdata,
+diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
+index 73e643ae94b6f..e17de69faa543 100644
+--- a/tools/arch/x86/include/asm/cpufeatures.h
++++ b/tools/arch/x86/include/asm/cpufeatures.h
+@@ -443,5 +443,6 @@
+ #define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
+ #define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
+ #define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
++#define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
+
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
+index ee15311b6be1d..4425d6773183b 100644
+--- a/tools/arch/x86/include/asm/msr-index.h
++++ b/tools/arch/x86/include/asm/msr-index.h
+@@ -114,6 +114,30 @@
+ * Not susceptible to
+ * TSX Async Abort (TAA) vulnerabilities.
+ */
++#define ARCH_CAP_SBDR_SSDP_NO BIT(13) /*
++ * Not susceptible to SBDR and SSDP
++ * variants of Processor MMIO stale data
++ * vulnerabilities.
++ */
++#define ARCH_CAP_FBSDP_NO BIT(14) /*
++ * Not susceptible to FBSDP variant of
++ * Processor MMIO stale data
++ * vulnerabilities.
++ */
++#define ARCH_CAP_PSDP_NO BIT(15) /*
++ * Not susceptible to PSDP variant of
++ * Processor MMIO stale data
++ * vulnerabilities.
++ */
++#define ARCH_CAP_FB_CLEAR BIT(17) /*
++ * VERW clears CPU fill buffer
++ * even on MDS_NO CPUs.
++ */
++#define ARCH_CAP_FB_CLEAR_CTRL BIT(18) /*
++ * MSR_IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]
++ * bit available to control VERW
++ * behavior.
++ */
+
+ #define MSR_IA32_FLUSH_CMD 0x0000010b
+ #define L1D_FLUSH BIT(0) /*
+@@ -131,6 +155,7 @@
+ #define MSR_IA32_MCU_OPT_CTRL 0x00000123
+ #define RNGDS_MITG_DIS BIT(0) /* SRBDS support */
+ #define RTM_ALLOW BIT(1) /* TSX development mode */
++#define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */
+
+ #define MSR_IA32_SYSENTER_CS 0x00000174
+ #define MSR_IA32_SYSENTER_ESP 0x00000175
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-22 12:53 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-22 12:53 UTC (permalink / raw
To: gentoo-commits
commit: 1033c5a449fe9c1742dbea3a4850ae7bdd29010d
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Jun 22 12:53:16 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Jun 22 12:53:16 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=1033c5a4
Linux patch 5.18.6
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1005_linux-5.18.6.patch | 7495 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 7499 insertions(+)
diff --git a/0000_README b/0000_README
index d6cb6557..84b72755 100644
--- a/0000_README
+++ b/0000_README
@@ -63,6 +63,10 @@ Patch: 1004_linux-5.18.5.patch
From: http://www.kernel.org
Desc: Linux 5.18.5
+Patch: 1005_linux-5.18.6.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.6
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1005_linux-5.18.6.patch b/1005_linux-5.18.6.patch
new file mode 100644
index 00000000..529a089e
--- /dev/null
+++ b/1005_linux-5.18.6.patch
@@ -0,0 +1,7495 @@
+diff --git a/Documentation/ABI/testing/sysfs-driver-bd9571mwv-regulator b/Documentation/ABI/testing/sysfs-driver-bd9571mwv-regulator
+index 42214b4ff14a1..90596d8bb51c0 100644
+--- a/Documentation/ABI/testing/sysfs-driver-bd9571mwv-regulator
++++ b/Documentation/ABI/testing/sysfs-driver-bd9571mwv-regulator
+@@ -26,6 +26,6 @@ Description: Read/write the current state of DDR Backup Mode, which controls
+ DDR Backup Mode must be explicitly enabled by the user,
+ to invoke step 1.
+
+- See also Documentation/devicetree/bindings/mfd/bd9571mwv.txt.
++ See also Documentation/devicetree/bindings/mfd/rohm,bd9571mwv.yaml.
+ Users: User space applications for embedded boards equipped with a
+ BD9571MWV PMIC.
+diff --git a/Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt b/Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt
+index 73470ecd1f12f..ce91a91976976 100644
+--- a/Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt
++++ b/Documentation/devicetree/bindings/cpufreq/brcm,stb-avs-cpu-freq.txt
+@@ -16,7 +16,7 @@ has been processed. See [2] for more information on the brcm,l2-intc node.
+ firmware. On some SoCs, this firmware supports DFS and DVFS in addition to
+ Adaptive Voltage Scaling.
+
+-[2] Documentation/devicetree/bindings/interrupt-controller/brcm,l2-intc.txt
++[2] Documentation/devicetree/bindings/interrupt-controller/brcm,l2-intc.yaml
+
+
+ Node brcm,avs-cpu-data-mem
+diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
+index 69f00179fdfeb..0483abcafcb01 100644
+--- a/Documentation/filesystems/netfs_library.rst
++++ b/Documentation/filesystems/netfs_library.rst
+@@ -37,30 +37,31 @@ The network filesystem helper library needs a place to store a bit of state for
+ its use on each netfs inode it is helping to manage. To this end, a context
+ structure is defined::
+
+- struct netfs_i_context {
++ struct netfs_inode {
++ struct inode inode;
+ const struct netfs_request_ops *ops;
+- struct fscache_cookie *cache;
++ struct fscache_cookie *cache;
+ };
+
+-A network filesystem that wants to use netfs lib must place one of these
+-directly after the VFS ``struct inode`` it allocates, usually as part of its
+-own struct. This can be done in a way similar to the following::
++A network filesystem that wants to use netfs lib must place one of these in its
++inode wrapper struct instead of the VFS ``struct inode``. This can be done in
++a way similar to the following::
+
+ struct my_inode {
+- struct {
+- /* These must be contiguous */
+- struct inode vfs_inode;
+- struct netfs_i_context netfs_ctx;
+- };
++ struct netfs_inode netfs; /* Netfslib context and vfs inode */
+ ...
+ };
+
+-This allows netfslib to find its state by simple offset from the inode pointer,
+-thereby allowing the netfslib helper functions to be pointed to directly by the
+-VFS/VM operation tables.
++This allows netfslib to find its state by using ``container_of()`` from the
++inode pointer, thereby allowing the netfslib helper functions to be pointed to
++directly by the VFS/VM operation tables.
+
+ The structure contains the following fields:
+
++ * ``inode``
++
++ The VFS inode structure.
++
+ * ``ops``
+
+ The set of operations provided by the network filesystem to netfslib.
+@@ -78,14 +79,12 @@ To help deal with the per-inode context, a number helper functions are
+ provided. Firstly, a function to perform basic initialisation on a context and
+ set the operations table pointer::
+
+- void netfs_i_context_init(struct inode *inode,
+- const struct netfs_request_ops *ops);
++ void netfs_inode_init(struct inode *inode,
++ const struct netfs_request_ops *ops);
+
+-then two functions to cast between the VFS inode structure and the netfs
+-context::
++then a function to cast from the VFS inode structure to the netfs context::
+
+- struct netfs_i_context *netfs_i_context(struct inode *inode);
+- struct inode *netfs_inode(struct netfs_i_context *ctx);
++ struct netfs_inode *netfs_node(struct inode *inode);
+
+ and finally, a function to get the cache cookie pointer from the context
+ attached to an inode (or NULL if fscache is disabled)::
+diff --git a/Makefile b/Makefile
+index 34bfb76d63332..27850d452d652 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 5
++SUBLEVEL = 6
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+@@ -787,6 +787,7 @@ stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG) := -fstack-protector-strong
+ KBUILD_CFLAGS += $(stackp-flags-y)
+
+ KBUILD_CFLAGS-$(CONFIG_WERROR) += -Werror
++KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds
+ KBUILD_CFLAGS += $(KBUILD_CFLAGS-y) $(CONFIG_CC_IMPLICIT_FALLTHROUGH)
+
+ ifdef CONFIG_CC_IS_CLANG
+@@ -804,6 +805,9 @@ endif
+ KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable)
+ KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
+
++# These result in bogus false positives
++KBUILD_CFLAGS += $(call cc-disable-warning, dangling-pointer)
++
+ ifdef CONFIG_FRAME_POINTER
+ KBUILD_CFLAGS += -fno-omit-frame-pointer -fno-optimize-sibling-calls
+ else
+diff --git a/arch/arm64/boot/dts/freescale/imx8mm-beacon-baseboard.dtsi b/arch/arm64/boot/dts/freescale/imx8mm-beacon-baseboard.dtsi
+index ec3f2c1770357..f338a886d8117 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mm-beacon-baseboard.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx8mm-beacon-baseboard.dtsi
+@@ -278,6 +278,7 @@
+ pinctrl-0 = <&pinctrl_uart3>;
+ assigned-clocks = <&clk IMX8MM_CLK_UART3>;
+ assigned-clock-parents = <&clk IMX8MM_SYS_PLL1_80M>;
++ uart-has-rtscts;
+ status = "okay";
+ };
+
+@@ -386,6 +387,8 @@
+ fsl,pins = <
+ MX8MM_IOMUXC_ECSPI1_SCLK_UART3_DCE_RX 0x40
+ MX8MM_IOMUXC_ECSPI1_MOSI_UART3_DCE_TX 0x40
++ MX8MM_IOMUXC_ECSPI1_MISO_UART3_DCE_CTS_B 0x40
++ MX8MM_IOMUXC_ECSPI1_SS0_UART3_DCE_RTS_B 0x40
+ >;
+ };
+
+diff --git a/arch/arm64/boot/dts/freescale/imx8mn-beacon-baseboard.dtsi b/arch/arm64/boot/dts/freescale/imx8mn-beacon-baseboard.dtsi
+index 0f40b43ac091c..02f37dcda7eda 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mn-beacon-baseboard.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx8mn-beacon-baseboard.dtsi
+@@ -175,6 +175,7 @@
+ pinctrl-0 = <&pinctrl_uart3>;
+ assigned-clocks = <&clk IMX8MN_CLK_UART3>;
+ assigned-clock-parents = <&clk IMX8MN_SYS_PLL1_80M>;
++ uart-has-rtscts;
+ status = "okay";
+ };
+
+@@ -258,6 +259,8 @@
+ fsl,pins = <
+ MX8MN_IOMUXC_ECSPI1_SCLK_UART3_DCE_RX 0x40
+ MX8MN_IOMUXC_ECSPI1_MOSI_UART3_DCE_TX 0x40
++ MX8MN_IOMUXC_ECSPI1_MISO_UART3_DCE_CTS_B 0x40
++ MX8MN_IOMUXC_ECSPI1_SS0_UART3_DCE_RTS_B 0x40
+ >;
+ };
+
+diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
+index 4506c4a90ac10..f3184cd81b190 100644
+--- a/arch/arm64/kernel/ftrace.c
++++ b/arch/arm64/kernel/ftrace.c
+@@ -78,47 +78,76 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
+ }
+
+ /*
+- * Turn on the call to ftrace_caller() in instrumented function
++ * Find the address the callsite must branch to in order to reach '*addr'.
++ *
++ * Due to the limited range of 'BL' instructions, modules may be placed too far
++ * away to branch directly and must use a PLT.
++ *
++ * Returns true when '*addr' contains a reachable target address, or has been
++ * modified to contain a PLT address. Returns false otherwise.
+ */
+-int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
++static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
++ struct module *mod,
++ unsigned long *addr)
+ {
+ unsigned long pc = rec->ip;
+- u32 old, new;
+- long offset = (long)pc - (long)addr;
++ long offset = (long)*addr - (long)pc;
++ struct plt_entry *plt;
+
+- if (offset < -SZ_128M || offset >= SZ_128M) {
+- struct module *mod;
+- struct plt_entry *plt;
++ /*
++ * When the target is within range of the 'BL' instruction, use 'addr'
++ * as-is and branch to that directly.
++ */
++ if (offset >= -SZ_128M && offset < SZ_128M)
++ return true;
+
+- if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
+- return -EINVAL;
++ /*
++ * When the target is outside of the range of a 'BL' instruction, we
++ * must use a PLT to reach it. We can only place PLTs for modules, and
++ * only when module PLT support is built-in.
++ */
++ if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
++ return false;
+
+- /*
+- * On kernels that support module PLTs, the offset between the
+- * branch instruction and its target may legally exceed the
+- * range of an ordinary relative 'bl' opcode. In this case, we
+- * need to branch via a trampoline in the module.
+- *
+- * NOTE: __module_text_address() must be called with preemption
+- * disabled, but we can rely on ftrace_lock to ensure that 'mod'
+- * retains its validity throughout the remainder of this code.
+- */
++ /*
++ * 'mod' is only set at module load time, but if we end up
++ * dealing with an out-of-range condition, we can assume it
++ * is due to a module being loaded far away from the kernel.
++ *
++ * NOTE: __module_text_address() must be called with preemption
++ * disabled, but we can rely on ftrace_lock to ensure that 'mod'
++ * retains its validity throughout the remainder of this code.
++ */
++ if (!mod) {
+ preempt_disable();
+ mod = __module_text_address(pc);
+ preempt_enable();
++ }
+
+- if (WARN_ON(!mod))
+- return -EINVAL;
++ if (WARN_ON(!mod))
++ return false;
+
+- plt = get_ftrace_plt(mod, addr);
+- if (!plt) {
+- pr_err("ftrace: no module PLT for %ps\n", (void *)addr);
+- return -EINVAL;
+- }
+-
+- addr = (unsigned long)plt;
++ plt = get_ftrace_plt(mod, *addr);
++ if (!plt) {
++ pr_err("ftrace: no module PLT for %ps\n", (void *)*addr);
++ return false;
+ }
+
++ *addr = (unsigned long)plt;
++ return true;
++}
++
++/*
++ * Turn on the call to ftrace_caller() in instrumented function
++ */
++int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
++{
++ unsigned long pc = rec->ip;
++ u32 old, new;
++
++ if (!ftrace_find_callable_addr(rec, NULL, &addr))
++ return -EINVAL;
++
+ old = aarch64_insn_gen_nop();
+ new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+
+@@ -132,6 +161,11 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+ unsigned long pc = rec->ip;
+ u32 old, new;
+
++ if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
++ return -EINVAL;
++ if (!ftrace_find_callable_addr(rec, NULL, &addr))
++ return -EINVAL;
++
+ old = aarch64_insn_gen_branch_imm(pc, old_addr,
+ AARCH64_INSN_BRANCH_LINK);
+ new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+@@ -181,54 +215,15 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
+ unsigned long addr)
+ {
+ unsigned long pc = rec->ip;
+- bool validate = true;
+ u32 old = 0, new;
+- long offset = (long)pc - (long)addr;
+
+- if (offset < -SZ_128M || offset >= SZ_128M) {
+- u32 replaced;
+-
+- if (!IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
+- return -EINVAL;
+-
+- /*
+- * 'mod' is only set at module load time, but if we end up
+- * dealing with an out-of-range condition, we can assume it
+- * is due to a module being loaded far away from the kernel.
+- */
+- if (!mod) {
+- preempt_disable();
+- mod = __module_text_address(pc);
+- preempt_enable();
+-
+- if (WARN_ON(!mod))
+- return -EINVAL;
+- }
+-
+- /*
+- * The instruction we are about to patch may be a branch and
+- * link instruction that was redirected via a PLT entry. In
+- * this case, the normal validation will fail, but we can at
+- * least check that we are dealing with a branch and link
+- * instruction that points into the right module.
+- */
+- if (aarch64_insn_read((void *)pc, &replaced))
+- return -EFAULT;
+-
+- if (!aarch64_insn_is_bl(replaced) ||
+- !within_module(pc + aarch64_get_branch_offset(replaced),
+- mod))
+- return -EINVAL;
+-
+- validate = false;
+- } else {
+- old = aarch64_insn_gen_branch_imm(pc, addr,
+- AARCH64_INSN_BRANCH_LINK);
+- }
++ if (!ftrace_find_callable_addr(rec, mod, &addr))
++ return -EINVAL;
+
++ old = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+ new = aarch64_insn_gen_nop();
+
+- return ftrace_modify_code(pc, old, new, validate);
++ return ftrace_modify_code(pc, old, new, true);
+ }
+
+ void arch_ftrace_update_code(int command)
+diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
+index 397fdac75cb12..7fe29bf6d1938 100644
+--- a/arch/arm64/kvm/fpsimd.c
++++ b/arch/arm64/kvm/fpsimd.c
+@@ -80,6 +80,7 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
+ vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
+ vcpu->arch.flags |= KVM_ARM64_FP_HOST;
+
++ vcpu->arch.flags &= ~KVM_ARM64_HOST_SVE_ENABLED;
+ if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
+ vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
+ }
+diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v2.c b/arch/arm64/kvm/vgic/vgic-mmio-v2.c
+index 12e4c223e6b8c..54167fcfb6053 100644
+--- a/arch/arm64/kvm/vgic/vgic-mmio-v2.c
++++ b/arch/arm64/kvm/vgic/vgic-mmio-v2.c
+@@ -417,11 +417,11 @@ static const struct vgic_register_region vgic_v2_dist_registers[] = {
+ VGIC_ACCESS_32bit),
+ REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_SET,
+ vgic_mmio_read_pending, vgic_mmio_write_spending,
+- NULL, vgic_uaccess_write_spending, 1,
++ vgic_uaccess_read_pending, vgic_uaccess_write_spending, 1,
+ VGIC_ACCESS_32bit),
+ REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_CLEAR,
+ vgic_mmio_read_pending, vgic_mmio_write_cpending,
+- NULL, vgic_uaccess_write_cpending, 1,
++ vgic_uaccess_read_pending, vgic_uaccess_write_cpending, 1,
+ VGIC_ACCESS_32bit),
+ REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ACTIVE_SET,
+ vgic_mmio_read_active, vgic_mmio_write_sactive,
+diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
+index 49837d3a3ef56..dc8c52487e470 100644
+--- a/arch/arm64/kvm/vgic/vgic-mmio.c
++++ b/arch/arm64/kvm/vgic/vgic-mmio.c
+@@ -226,8 +226,9 @@ int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
+ return 0;
+ }
+
+-unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
+- gpa_t addr, unsigned int len)
++static unsigned long __read_pending(struct kvm_vcpu *vcpu,
++ gpa_t addr, unsigned int len,
++ bool is_user)
+ {
+ u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+ u32 value = 0;
+@@ -248,7 +249,7 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
+ IRQCHIP_STATE_PENDING,
+ &val);
+ WARN_RATELIMIT(err, "IRQ %d", irq->host_irq);
+- } else if (vgic_irq_is_mapped_level(irq)) {
++ } else if (!is_user && vgic_irq_is_mapped_level(irq)) {
+ val = vgic_get_phys_line_level(irq);
+ } else {
+ val = irq_is_pending(irq);
+@@ -263,6 +264,18 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
+ return value;
+ }
+
++unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
++ gpa_t addr, unsigned int len)
++{
++ return __read_pending(vcpu, addr, len, false);
++}
++
++unsigned long vgic_uaccess_read_pending(struct kvm_vcpu *vcpu,
++ gpa_t addr, unsigned int len)
++{
++ return __read_pending(vcpu, addr, len, true);
++}
++
+ static bool is_vgic_v2_sgi(struct kvm_vcpu *vcpu, struct vgic_irq *irq)
+ {
+ return (vgic_irq_is_sgi(irq->intid) &&
+diff --git a/arch/arm64/kvm/vgic/vgic-mmio.h b/arch/arm64/kvm/vgic/vgic-mmio.h
+index 3fa696f198a37..6082d4b66d398 100644
+--- a/arch/arm64/kvm/vgic/vgic-mmio.h
++++ b/arch/arm64/kvm/vgic/vgic-mmio.h
+@@ -149,6 +149,9 @@ int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
+ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
+ gpa_t addr, unsigned int len);
+
++unsigned long vgic_uaccess_read_pending(struct kvm_vcpu *vcpu,
++ gpa_t addr, unsigned int len);
++
+ void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
+ gpa_t addr, unsigned int len,
+ unsigned long val);
+diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
+index 0ea6cc25dc663..21c907987080f 100644
+--- a/arch/arm64/mm/cache.S
++++ b/arch/arm64/mm/cache.S
+@@ -218,8 +218,6 @@ SYM_FUNC_ALIAS(__dma_flush_area, __pi___dma_flush_area)
+ */
+ SYM_FUNC_START(__pi___dma_map_area)
+ add x1, x0, x1
+- cmp w2, #DMA_FROM_DEVICE
+- b.eq __pi_dcache_inval_poc
+ b __pi_dcache_clean_poc
+ SYM_FUNC_END(__pi___dma_map_area)
+ SYM_FUNC_ALIAS(__dma_map_area, __pi___dma_map_area)
+diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
+index 984813a4d5dc4..a75d20f23dac8 100644
+--- a/arch/powerpc/kernel/process.c
++++ b/arch/powerpc/kernel/process.c
+@@ -2160,12 +2160,12 @@ static unsigned long ___get_wchan(struct task_struct *p)
+ return 0;
+
+ do {
+- sp = *(unsigned long *)sp;
++ sp = READ_ONCE_NOCHECK(*(unsigned long *)sp);
+ if (!validate_sp(sp, p, STACK_FRAME_OVERHEAD) ||
+ task_is_running(p))
+ return 0;
+ if (count > 0) {
+- ip = ((unsigned long *)sp)[STACK_FRAME_LR_SAVE];
++ ip = READ_ONCE_NOCHECK(((unsigned long *)sp)[STACK_FRAME_LR_SAVE]);
+ if (!in_sched_functions(ip))
+ return ip;
+ }
+diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c b/arch/powerpc/mm/nohash/kaslr_booke.c
+index 96c38f971603f..5f81c076621f2 100644
+--- a/arch/powerpc/mm/nohash/kaslr_booke.c
++++ b/arch/powerpc/mm/nohash/kaslr_booke.c
+@@ -18,7 +18,6 @@
+ #include <asm/prom.h>
+ #include <asm/kdump.h>
+ #include <mm/mmu_decl.h>
+-#include <generated/compile.h>
+ #include <generated/utsrelease.h>
+
+ struct regions {
+@@ -36,10 +35,6 @@ struct regions {
+ int reserved_mem_size_cells;
+ };
+
+-/* Simplified build-specific string for starting entropy. */
+-static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
+- LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
+-
+ struct regions __initdata regions;
+
+ static __init void kaslr_get_cmdline(void *fdt)
+@@ -70,7 +65,8 @@ static unsigned long __init get_boot_seed(void *fdt)
+ {
+ unsigned long hash = 0;
+
+- hash = rotate_xor(hash, build_str, sizeof(build_str));
++ /* build-specific string for starting entropy. */
++ hash = rotate_xor(hash, linux_banner, strlen(linux_banner));
+ hash = rotate_xor(hash, fdt, fdt_totalsize(fdt));
+
+ return hash;
+diff --git a/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi b/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi
+index cf2f55e1dcb67..f44fce1fe080c 100644
+--- a/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi
++++ b/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi
+@@ -188,6 +188,15 @@
+ riscv,ndev = <186>;
+ };
+
++ pdma: dma-controller@3000000 {
++ compatible = "sifive,fu540-c000-pdma", "sifive,pdma0";
++ reg = <0x0 0x3000000 0x0 0x8000>;
++ interrupt-parent = <&plic>;
++ interrupts = <5 6>, <7 8>, <9 10>, <11 12>;
++ dma-channels = <4>;
++ #dma-cells = <1>;
++ };
++
+ clkcfg: clkcfg@20002000 {
+ compatible = "microchip,mpfs-clkcfg";
+ reg = <0x0 0x20002000 0x0 0x1000>, <0x0 0x3E001000 0x0 0x1000>;
+diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
+index e084c72104f86..359b0cc0dc35d 100644
+--- a/arch/s390/Kconfig
++++ b/arch/s390/Kconfig
+@@ -125,6 +125,7 @@ config S390
+ select CLONE_BACKWARDS2
+ select DMA_OPS if PCI
+ select DYNAMIC_FTRACE if FUNCTION_TRACER
++ select GCC12_NO_ARRAY_BOUNDS
+ select GENERIC_ALLOCATOR
+ select GENERIC_CPU_AUTOPROBE
+ select GENERIC_CPU_VULNERABILITIES
+diff --git a/arch/s390/Makefile b/arch/s390/Makefile
+index df325eacf62d2..eba70d585cb2c 100644
+--- a/arch/s390/Makefile
++++ b/arch/s390/Makefile
+@@ -30,15 +30,7 @@ KBUILD_CFLAGS_DECOMPRESSOR += -fno-stack-protector
+ KBUILD_CFLAGS_DECOMPRESSOR += $(call cc-disable-warning, address-of-packed-member)
+ KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO),-g)
+ KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO_DWARF4), $(call cc-option, -gdwarf-4,))
+-
+-ifdef CONFIG_CC_IS_GCC
+- ifeq ($(call cc-ifversion, -ge, 1200, y), y)
+- ifeq ($(call cc-ifversion, -lt, 1300, y), y)
+- KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds)
+- KBUILD_CFLAGS_DECOMPRESSOR += $(call cc-disable-warning, array-bounds)
+- endif
+- endif
+-endif
++KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds)
+
+ UTS_MACHINE := s390x
+ STACK_SIZE := $(if $(CONFIG_KASAN),65536,16384)
+diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
+index c41ef42adbe8a..25828e4c62375 100644
+--- a/arch/x86/kernel/Makefile
++++ b/arch/x86/kernel/Makefile
+@@ -36,10 +36,6 @@ KCSAN_SANITIZE := n
+
+ OBJECT_FILES_NON_STANDARD_test_nx.o := y
+
+-ifdef CONFIG_FRAME_POINTER
+-OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o := y
+-endif
+-
+ # If instrumentation of this dir is enabled, boot hangs during first second.
+ # Probably could be more selective here, but note that files related to irqs,
+ # boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
+diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
+index 4ec13608d3c62..dfeb227de5617 100644
+--- a/arch/x86/kernel/ftrace_64.S
++++ b/arch/x86/kernel/ftrace_64.S
+@@ -175,6 +175,7 @@ SYM_INNER_LABEL(ftrace_caller_end, SYM_L_GLOBAL)
+
+ jmp ftrace_epilogue
+ SYM_FUNC_END(ftrace_caller);
++STACK_FRAME_NON_STANDARD_FP(ftrace_caller)
+
+ SYM_FUNC_START(ftrace_epilogue)
+ /*
+@@ -282,6 +283,7 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
+ jmp ftrace_epilogue
+
+ SYM_FUNC_END(ftrace_regs_caller)
++STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller)
+
+
+ #else /* ! CONFIG_DYNAMIC_FTRACE */
+@@ -311,10 +313,14 @@ trace:
+ jmp ftrace_stub
+ SYM_FUNC_END(__fentry__)
+ EXPORT_SYMBOL(__fentry__)
++STACK_FRAME_NON_STANDARD_FP(__fentry__)
++
+ #endif /* CONFIG_DYNAMIC_FTRACE */
+
+ #ifdef CONFIG_FUNCTION_GRAPH_TRACER
+-SYM_FUNC_START(return_to_handler)
++SYM_CODE_START(return_to_handler)
++ UNWIND_HINT_EMPTY
++ ANNOTATE_NOENDBR
+ subq $16, %rsp
+
+ /* Save the return values */
+@@ -339,7 +345,6 @@ SYM_FUNC_START(return_to_handler)
+ int3
+ .Ldo_rop:
+ mov %rdi, (%rsp)
+- UNWIND_HINT_FUNC
+ RET
+-SYM_FUNC_END(return_to_handler)
++SYM_CODE_END(return_to_handler)
+ #endif
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index de7fc69572718..631fb87b4976f 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -579,6 +579,8 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
+ if (!blk_mq_hw_queue_mapped(data.hctx))
+ goto out_queue_exit;
+ cpu = cpumask_first_and(data.hctx->cpumask, cpu_online_mask);
++ if (cpu >= nr_cpu_ids)
++ goto out_queue_exit;
+ data.ctx = __blk_mq_get_ctx(q, cpu);
+
+ if (!q->elevator)
+diff --git a/certs/blacklist_hashes.c b/certs/blacklist_hashes.c
+index 344892337be07..d5961aa3d3380 100644
+--- a/certs/blacklist_hashes.c
++++ b/certs/blacklist_hashes.c
+@@ -1,7 +1,7 @@
+ // SPDX-License-Identifier: GPL-2.0
+ #include "blacklist.h"
+
+-const char __initdata *const blacklist_hashes[] = {
++const char __initconst *const blacklist_hashes[] = {
+ #include CONFIG_SYSTEM_BLACKLIST_HASH_LIST
+ , NULL
+ };
+diff --git a/crypto/Kconfig b/crypto/Kconfig
+index 41068811fd0e1..b4e00a7a046b9 100644
+--- a/crypto/Kconfig
++++ b/crypto/Kconfig
+@@ -15,6 +15,7 @@ source "crypto/async_tx/Kconfig"
+ #
+ menuconfig CRYPTO
+ tristate "Cryptographic API"
++ select LIB_MEMNEQ
+ help
+ This option provides the core Cryptographic API.
+
+diff --git a/crypto/Makefile b/crypto/Makefile
+index f754c4d17d6bd..a40e6d5fb2c83 100644
+--- a/crypto/Makefile
++++ b/crypto/Makefile
+@@ -4,7 +4,7 @@
+ #
+
+ obj-$(CONFIG_CRYPTO) += crypto.o
+-crypto-y := api.o cipher.o compress.o memneq.o
++crypto-y := api.o cipher.o compress.o
+
+ obj-$(CONFIG_CRYPTO_ENGINE) += crypto_engine.o
+ obj-$(CONFIG_CRYPTO_FIPS) += fips.o
+diff --git a/crypto/memneq.c b/crypto/memneq.c
+deleted file mode 100644
+index fb11608b1ec1d..0000000000000
+--- a/crypto/memneq.c
++++ /dev/null
+@@ -1,176 +0,0 @@
+-/*
+- * Constant-time equality testing of memory regions.
+- *
+- * Authors:
+- *
+- * James Yonan <james@openvpn.net>
+- * Daniel Borkmann <dborkman@redhat.com>
+- *
+- * This file is provided under a dual BSD/GPLv2 license. When using or
+- * redistributing this file, you may do so under either license.
+- *
+- * GPL LICENSE SUMMARY
+- *
+- * Copyright(c) 2013 OpenVPN Technologies, Inc. All rights reserved.
+- *
+- * This program is free software; you can redistribute it and/or modify
+- * it under the terms of version 2 of the GNU General Public License as
+- * published by the Free Software Foundation.
+- *
+- * This program is distributed in the hope that it will be useful, but
+- * WITHOUT ANY WARRANTY; without even the implied warranty of
+- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+- * General Public License for more details.
+- *
+- * You should have received a copy of the GNU General Public License
+- * along with this program; if not, write to the Free Software
+- * Foundation, Inc., 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+- * The full GNU General Public License is included in this distribution
+- * in the file called LICENSE.GPL.
+- *
+- * BSD LICENSE
+- *
+- * Copyright(c) 2013 OpenVPN Technologies, Inc. All rights reserved.
+- *
+- * Redistribution and use in source and binary forms, with or without
+- * modification, are permitted provided that the following conditions
+- * are met:
+- *
+- * * Redistributions of source code must retain the above copyright
+- * notice, this list of conditions and the following disclaimer.
+- * * Redistributions in binary form must reproduce the above copyright
+- * notice, this list of conditions and the following disclaimer in
+- * the documentation and/or other materials provided with the
+- * distribution.
+- * * Neither the name of OpenVPN Technologies nor the names of its
+- * contributors may be used to endorse or promote products derived
+- * from this software without specific prior written permission.
+- *
+- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+- */
+-
+-#include <crypto/algapi.h>
+-#include <asm/unaligned.h>
+-
+-#ifndef __HAVE_ARCH_CRYPTO_MEMNEQ
+-
+-/* Generic path for arbitrary size */
+-static inline unsigned long
+-__crypto_memneq_generic(const void *a, const void *b, size_t size)
+-{
+- unsigned long neq = 0;
+-
+-#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
+- while (size >= sizeof(unsigned long)) {
+- neq |= get_unaligned((unsigned long *)a) ^
+- get_unaligned((unsigned long *)b);
+- OPTIMIZER_HIDE_VAR(neq);
+- a += sizeof(unsigned long);
+- b += sizeof(unsigned long);
+- size -= sizeof(unsigned long);
+- }
+-#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+- while (size > 0) {
+- neq |= *(unsigned char *)a ^ *(unsigned char *)b;
+- OPTIMIZER_HIDE_VAR(neq);
+- a += 1;
+- b += 1;
+- size -= 1;
+- }
+- return neq;
+-}
+-
+-/* Loop-free fast-path for frequently used 16-byte size */
+-static inline unsigned long __crypto_memneq_16(const void *a, const void *b)
+-{
+- unsigned long neq = 0;
+-
+-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+- if (sizeof(unsigned long) == 8) {
+- neq |= get_unaligned((unsigned long *)a) ^
+- get_unaligned((unsigned long *)b);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= get_unaligned((unsigned long *)(a + 8)) ^
+- get_unaligned((unsigned long *)(b + 8));
+- OPTIMIZER_HIDE_VAR(neq);
+- } else if (sizeof(unsigned int) == 4) {
+- neq |= get_unaligned((unsigned int *)a) ^
+- get_unaligned((unsigned int *)b);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= get_unaligned((unsigned int *)(a + 4)) ^
+- get_unaligned((unsigned int *)(b + 4));
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= get_unaligned((unsigned int *)(a + 8)) ^
+- get_unaligned((unsigned int *)(b + 8));
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= get_unaligned((unsigned int *)(a + 12)) ^
+- get_unaligned((unsigned int *)(b + 12));
+- OPTIMIZER_HIDE_VAR(neq);
+- } else
+-#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
+- {
+- neq |= *(unsigned char *)(a) ^ *(unsigned char *)(b);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+1) ^ *(unsigned char *)(b+1);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+2) ^ *(unsigned char *)(b+2);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+3) ^ *(unsigned char *)(b+3);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+4) ^ *(unsigned char *)(b+4);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+5) ^ *(unsigned char *)(b+5);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+6) ^ *(unsigned char *)(b+6);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+7) ^ *(unsigned char *)(b+7);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+8) ^ *(unsigned char *)(b+8);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+9) ^ *(unsigned char *)(b+9);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+10) ^ *(unsigned char *)(b+10);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+11) ^ *(unsigned char *)(b+11);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+12) ^ *(unsigned char *)(b+12);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+13) ^ *(unsigned char *)(b+13);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+14) ^ *(unsigned char *)(b+14);
+- OPTIMIZER_HIDE_VAR(neq);
+- neq |= *(unsigned char *)(a+15) ^ *(unsigned char *)(b+15);
+- OPTIMIZER_HIDE_VAR(neq);
+- }
+-
+- return neq;
+-}
+-
+-/* Compare two areas of memory without leaking timing information,
+- * and with special optimizations for common sizes. Users should
+- * not call this function directly, but should instead use
+- * crypto_memneq defined in crypto/algapi.h.
+- */
+-noinline unsigned long __crypto_memneq(const void *a, const void *b,
+- size_t size)
+-{
+- switch (size) {
+- case 16:
+- return __crypto_memneq_16(a, b);
+- default:
+- return __crypto_memneq_generic(a, b, size);
+- }
+-}
+-EXPORT_SYMBOL(__crypto_memneq);
+-
+-#endif /* __HAVE_ARCH_CRYPTO_MEMNEQ */
+diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
+index 3d57fa84e2be8..ea96718212589 100644
+--- a/drivers/ata/libata-core.c
++++ b/drivers/ata/libata-core.c
+@@ -5506,7 +5506,7 @@ struct ata_host *ata_host_alloc_pinfo(struct device *dev,
+ const struct ata_port_info * const * ppi,
+ int n_ports)
+ {
+- const struct ata_port_info *pi;
++ const struct ata_port_info *pi = &ata_dummy_port_info;
+ struct ata_host *host;
+ int i, j;
+
+@@ -5514,7 +5514,7 @@ struct ata_host *ata_host_alloc_pinfo(struct device *dev,
+ if (!host)
+ return NULL;
+
+- for (i = 0, j = 0, pi = NULL; i < host->n_ports; i++) {
++ for (i = 0, j = 0; i < host->n_ports; i++) {
+ struct ata_port *ap = host->ports[i];
+
+ if (ppi[j])
+diff --git a/drivers/base/init.c b/drivers/base/init.c
+index d8d0fe687111a..397eb9880cecb 100644
+--- a/drivers/base/init.c
++++ b/drivers/base/init.c
+@@ -8,6 +8,7 @@
+ #include <linux/init.h>
+ #include <linux/memory.h>
+ #include <linux/of.h>
++#include <linux/backing-dev.h>
+
+ #include "base.h"
+
+@@ -20,6 +21,7 @@
+ void __init driver_init(void)
+ {
+ /* These are the core pieces */
++ bdi_init(&noop_backing_dev_info);
+ devtmpfs_init();
+ devices_init();
+ buses_init();
+diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
+index 8fd4a356a86ec..74593a1722fe0 100644
+--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
++++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
+@@ -1236,14 +1236,14 @@ error_cleanup_mc_io:
+ static int fsl_mc_bus_remove(struct platform_device *pdev)
+ {
+ struct fsl_mc *mc = platform_get_drvdata(pdev);
++ struct fsl_mc_io *mc_io;
+
+ if (!fsl_mc_is_root_dprc(&mc->root_mc_bus_dev->dev))
+ return -EINVAL;
+
++ mc_io = mc->root_mc_bus_dev->mc_io;
+ fsl_mc_device_remove(mc->root_mc_bus_dev);
+-
+- fsl_destroy_mc_io(mc->root_mc_bus_dev->mc_io);
+- mc->root_mc_bus_dev->mc_io = NULL;
++ fsl_destroy_mc_io(mc_io);
+
+ bus_unregister_notifier(&fsl_mc_bus_type, &fsl_mc_nb);
+
+diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
+index 55f48375e3fe5..d454428f4981d 100644
+--- a/drivers/char/Kconfig
++++ b/drivers/char/Kconfig
+@@ -428,28 +428,40 @@ config ADI
+ driver include crash and makedumpfile.
+
+ config RANDOM_TRUST_CPU
+- bool "Trust the CPU manufacturer to initialize Linux's CRNG"
++ bool "Initialize RNG using CPU RNG instructions"
++ default y
+ depends on ARCH_RANDOM
+- default n
+ help
+- Assume that CPU manufacturer (e.g., Intel or AMD for RDSEED or
+- RDRAND, IBM for the S390 and Power PC architectures) is trustworthy
+- for the purposes of initializing Linux's CRNG. Since this is not
+- something that can be independently audited, this amounts to trusting
+- that CPU manufacturer (perhaps with the insistence or mandate
+- of a Nation State's intelligence or law enforcement agencies)
+- has not installed a hidden back door to compromise the CPU's
+- random number generation facilities. This can also be configured
+- at boot with "random.trust_cpu=on/off".
++ Initialize the RNG using random numbers supplied by the CPU's
++ RNG instructions (e.g. RDRAND), if supported and available. These
++ random numbers are never used directly, but are rather hashed into
++ the main input pool, and this happens regardless of whether or not
++ this option is enabled. Instead, this option controls whether the
++ they are credited and hence can initialize the RNG. Additionally,
++ other sources of randomness are always used, regardless of this
++ setting. Enabling this implies trusting that the CPU can supply high
++ quality and non-backdoored random numbers.
++
++ Say Y here unless you have reason to mistrust your CPU or believe
++ its RNG facilities may be faulty. This may also be configured at
++ boot time with "random.trust_cpu=on/off".
+
+ config RANDOM_TRUST_BOOTLOADER
+- bool "Trust the bootloader to initialize Linux's CRNG"
+- help
+- Some bootloaders can provide entropy to increase the kernel's initial
+- device randomness. Say Y here to assume the entropy provided by the
+- booloader is trustworthy so it will be added to the kernel's entropy
+- pool. Otherwise, say N here so it will be regarded as device input that
+- only mixes the entropy pool. This can also be configured at boot with
+- "random.trust_bootloader=on/off".
++ bool "Initialize RNG using bootloader-supplied seed"
++ default y
++ help
++ Initialize the RNG using a seed supplied by the bootloader or boot
++ environment (e.g. EFI or a bootloader-generated device tree). This
++ seed is not used directly, but is rather hashed into the main input
++ pool, and this happens regardless of whether or not this option is
++ enabled. Instead, this option controls whether the seed is credited
++ and hence can initialize the RNG. Additionally, other sources of
++ randomness are always used, regardless of this setting. Enabling
++ this implies trusting that the bootloader can supply high quality and
++ non-backdoored seeds.
++
++ Say Y here unless you have reason to mistrust your bootloader or
++ believe its RNG facilities may be faulty. This may also be configured
++ at boot time with "random.trust_bootloader=on/off".
+
+ endmenu
+diff --git a/drivers/clk/imx/clk-imx8mp.c b/drivers/clk/imx/clk-imx8mp.c
+index 18f5b7c3ca9d8..74dabcdf81aba 100644
+--- a/drivers/clk/imx/clk-imx8mp.c
++++ b/drivers/clk/imx/clk-imx8mp.c
+@@ -659,7 +659,7 @@ static int imx8mp_clocks_probe(struct platform_device *pdev)
+ hws[IMX8MP_CLK_UART2_ROOT] = imx_clk_hw_gate4("uart2_root_clk", "uart2", ccm_base + 0x44a0, 0);
+ hws[IMX8MP_CLK_UART3_ROOT] = imx_clk_hw_gate4("uart3_root_clk", "uart3", ccm_base + 0x44b0, 0);
+ hws[IMX8MP_CLK_UART4_ROOT] = imx_clk_hw_gate4("uart4_root_clk", "uart4", ccm_base + 0x44c0, 0);
+- hws[IMX8MP_CLK_USB_ROOT] = imx_clk_hw_gate4("usb_root_clk", "osc_32k", ccm_base + 0x44d0, 0);
++ hws[IMX8MP_CLK_USB_ROOT] = imx_clk_hw_gate4("usb_root_clk", "hsio_axi", ccm_base + 0x44d0, 0);
+ hws[IMX8MP_CLK_USB_PHY_ROOT] = imx_clk_hw_gate4("usb_phy_root_clk", "usb_phy_ref", ccm_base + 0x44f0, 0);
+ hws[IMX8MP_CLK_USDHC1_ROOT] = imx_clk_hw_gate4("usdhc1_root_clk", "usdhc1", ccm_base + 0x4510, 0);
+ hws[IMX8MP_CLK_USDHC2_ROOT] = imx_clk_hw_gate4("usdhc2_root_clk", "usdhc2", ccm_base + 0x4520, 0);
+diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
+index ff188ab68496e..bb47610bbd1c4 100644
+--- a/drivers/clocksource/hyperv_timer.c
++++ b/drivers/clocksource/hyperv_timer.c
+@@ -565,4 +565,3 @@ void __init hv_init_clocksource(void)
+ hv_sched_clock_offset = hv_read_reference_counter();
+ hv_setup_sched_clock(read_hv_sched_clock_msr);
+ }
+-EXPORT_SYMBOL_GPL(hv_init_clocksource);
+diff --git a/drivers/comedi/drivers/vmk80xx.c b/drivers/comedi/drivers/vmk80xx.c
+index 46023adc53958..4536ed43f65b2 100644
+--- a/drivers/comedi/drivers/vmk80xx.c
++++ b/drivers/comedi/drivers/vmk80xx.c
+@@ -684,7 +684,7 @@ static int vmk80xx_alloc_usb_buffers(struct comedi_device *dev)
+ if (!devpriv->usb_rx_buf)
+ return -ENOMEM;
+
+- size = max(usb_endpoint_maxp(devpriv->ep_rx), MIN_BUF_SIZE);
++ size = max(usb_endpoint_maxp(devpriv->ep_tx), MIN_BUF_SIZE);
+ devpriv->usb_tx_buf = kzalloc(size, GFP_KERNEL);
+ if (!devpriv->usb_tx_buf)
+ return -ENOMEM;
+diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
+index b0f3aca61974c..9467d695a33e2 100644
+--- a/drivers/gpio/gpio-dwapb.c
++++ b/drivers/gpio/gpio-dwapb.c
+@@ -652,10 +652,9 @@ static int dwapb_get_clks(struct dwapb_gpio *gpio)
+ gpio->clks[1].id = "db";
+ err = devm_clk_bulk_get_optional(gpio->dev, DWAPB_NR_CLOCKS,
+ gpio->clks);
+- if (err) {
+- dev_err(gpio->dev, "Cannot get APB/Debounce clocks\n");
+- return err;
+- }
++ if (err)
++ return dev_err_probe(gpio->dev, err,
++ "Cannot get APB/Debounce clocks\n");
+
+ err = clk_bulk_prepare_enable(DWAPB_NR_CLOCKS, gpio->clks);
+ if (err) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+index cd89d2e46852d..f4509656ea8c8 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+@@ -1955,9 +1955,6 @@ int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct amdgpu_device *adev,
+ return -EINVAL;
+ }
+
+- /* delete kgd_mem from kfd_bo_list to avoid re-validating
+- * this BO in BO's restoring after eviction.
+- */
+ mutex_lock(&mem->process_info->lock);
+
+ ret = amdgpu_bo_reserve(bo, true);
+@@ -1980,7 +1977,6 @@ int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct amdgpu_device *adev,
+
+ amdgpu_amdkfd_remove_eviction_fence(
+ bo, mem->process_info->eviction_fence);
+- list_del_init(&mem->validate_list.head);
+
+ if (size)
+ *size = amdgpu_bo_size(bo);
+@@ -2544,12 +2540,15 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
+ process_info->eviction_fence = new_fence;
+ *ef = dma_fence_get(&new_fence->base);
+
+- /* Attach new eviction fence to all BOs */
++ /* Attach new eviction fence to all BOs except pinned ones */
+ list_for_each_entry(mem, &process_info->kfd_bo_list,
+- validate_list.head)
++ validate_list.head) {
++ if (mem->bo->tbo.pin_count)
++ continue;
++
+ amdgpu_bo_fence(mem->bo,
+ &process_info->eviction_fence->base, true);
+-
++ }
+ /* Attach eviction fence to PD / PT BOs */
+ list_for_each_entry(peer_vm, &process_info->vm_list_head,
+ vm_list_node) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+index 28a736c507bb3..bd3b32e5ba9e9 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+@@ -625,17 +625,20 @@ int amdgpu_get_gfx_off_status(struct amdgpu_device *adev, uint32_t *value)
+ int amdgpu_gfx_ras_late_init(struct amdgpu_device *adev, struct ras_common_if *ras_block)
+ {
+ int r;
+- r = amdgpu_ras_block_late_init(adev, ras_block);
+- if (r)
+- return r;
+
+ if (amdgpu_ras_is_supported(adev, ras_block->block)) {
+ if (!amdgpu_persistent_edc_harvesting_supported(adev))
+ amdgpu_ras_reset_error_status(adev, AMDGPU_RAS_BLOCK__GFX);
+
++ r = amdgpu_ras_block_late_init(adev, ras_block);
++ if (r)
++ return r;
++
+ r = amdgpu_irq_get(adev, &adev->gfx.cp_ecc_error_irq, 0);
+ if (r)
+ goto late_fini;
++ } else {
++ amdgpu_ras_feature_enable_on_boot(adev, ras_block, 0);
+ }
+
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+index 6b626c293e724..49c55d82cba80 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+@@ -634,7 +634,6 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+ atomic64_read(&adev->visible_pin_size),
+ vram_gtt.vram_size);
+ vram_gtt.gtt_size = ttm_manager_type(&adev->mman.bdev, TTM_PL_TT)->size;
+- vram_gtt.gtt_size *= PAGE_SIZE;
+ vram_gtt.gtt_size -= atomic64_read(&adev->gart_pin_size);
+ return copy_to_user(out, &vram_gtt,
+ min((size_t)size, sizeof(vram_gtt))) ? -EFAULT : 0;
+@@ -667,7 +666,6 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+ mem.cpu_accessible_vram.usable_heap_size * 3 / 4;
+
+ mem.gtt.total_heap_size = gtt_man->size;
+- mem.gtt.total_heap_size *= PAGE_SIZE;
+ mem.gtt.usable_heap_size = mem.gtt.total_heap_size -
+ atomic64_read(&adev->gart_pin_size);
+ mem.gtt.heap_usage = ttm_resource_manager_usage(gtt_man);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+index 424c22a841f40..3f96dadf2698b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+@@ -195,6 +195,13 @@ static ssize_t amdgpu_ras_debugfs_read(struct file *f, char __user *buf,
+ if (amdgpu_ras_query_error_status(obj->adev, &info))
+ return -EINVAL;
+
++ /* Hardware counter will be reset automatically after the query on Vega20 and Arcturus */
++ if (obj->adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) &&
++ obj->adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) {
++ if (amdgpu_ras_reset_error_status(obj->adev, info.head.block))
++ dev_warn(obj->adev->dev, "Failed to reset error counter and error status");
++ }
++
+ s = snprintf(val, sizeof(val), "%s: %lu\n%s: %lu\n",
+ "ue", info.ue_count,
+ "ce", info.ce_count);
+@@ -548,9 +555,10 @@ static ssize_t amdgpu_ras_sysfs_read(struct device *dev,
+ if (amdgpu_ras_query_error_status(obj->adev, &info))
+ return -EINVAL;
+
+- if (obj->adev->asic_type == CHIP_ALDEBARAN) {
++ if (obj->adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) &&
++ obj->adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) {
+ if (amdgpu_ras_reset_error_status(obj->adev, info.head.block))
+- DRM_WARN("Failed to reset error counter and error status");
++ dev_warn(obj->adev->dev, "Failed to reset error counter and error status");
+ }
+
+ return sysfs_emit(buf, "%s: %lu\n%s: %lu\n", "ue", info.ue_count,
+@@ -1023,9 +1031,6 @@ int amdgpu_ras_query_error_status(struct amdgpu_device *adev,
+ }
+ }
+
+- if (!amdgpu_persistent_edc_harvesting_supported(adev))
+- amdgpu_ras_reset_error_status(adev, info->head.block);
+-
+ return 0;
+ }
+
+@@ -1145,6 +1150,12 @@ int amdgpu_ras_query_error_count(struct amdgpu_device *adev,
+ if (res)
+ return res;
+
++ if (adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) &&
++ adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) {
++ if (amdgpu_ras_reset_error_status(adev, info.head.block))
++ dev_warn(adev->dev, "Failed to reset error counter and error status");
++ }
++
+ ce += info.ce_count;
+ ue += info.ue_count;
+ }
+@@ -1705,6 +1716,12 @@ static void amdgpu_ras_log_on_err_counter(struct amdgpu_device *adev)
+ continue;
+
+ amdgpu_ras_query_error_status(adev, &info);
++
++ if (adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) &&
++ adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) {
++ if (amdgpu_ras_reset_error_status(adev, info.head.block))
++ dev_warn(adev->dev, "Failed to reset error counter and error status");
++ }
+ }
+ }
+
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+index 3b8856b4cece8..5979335d7afdc 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+@@ -2286,6 +2286,8 @@ svm_range_cpu_invalidate_pagetables(struct mmu_interval_notifier *mni,
+
+ if (range->event == MMU_NOTIFY_RELEASE)
+ return true;
++ if (!mmget_not_zero(mni->mm))
++ return true;
+
+ start = mni->interval_tree.start;
+ last = mni->interval_tree.last;
+@@ -2312,6 +2314,7 @@ svm_range_cpu_invalidate_pagetables(struct mmu_interval_notifier *mni,
+ }
+
+ svm_range_unlock(prange);
++ mmput(mni->mm);
+
+ return true;
+ }
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 8dd03de7c2775..78a38c3b7d664 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -2835,7 +2835,7 @@ static struct drm_mode_config_helper_funcs amdgpu_dm_mode_config_helperfuncs = {
+
+ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector)
+ {
+- u32 max_cll, min_cll, max, min, q, r;
++ u32 max_avg, min_cll, max, min, q, r;
+ struct amdgpu_dm_backlight_caps *caps;
+ struct amdgpu_display_manager *dm;
+ struct drm_connector *conn_base;
+@@ -2865,7 +2865,7 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector)
+ caps = &dm->backlight_caps[i];
+ caps->ext_caps = &aconnector->dc_link->dpcd_sink_ext_caps;
+ caps->aux_support = false;
+- max_cll = conn_base->hdr_sink_metadata.hdmi_type1.max_cll;
++ max_avg = conn_base->hdr_sink_metadata.hdmi_type1.max_fall;
+ min_cll = conn_base->hdr_sink_metadata.hdmi_type1.min_cll;
+
+ if (caps->ext_caps->bits.oled == 1 /*||
+@@ -2893,8 +2893,8 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector)
+ * The results of the above expressions can be verified at
+ * pre_computed_values.
+ */
+- q = max_cll >> 5;
+- r = max_cll % 32;
++ q = max_avg >> 5;
++ r = max_avg % 32;
+ max = (1 << q) * pre_computed_values[r];
+
+ // min luminance: maxLum * (CV/255)^2 / 100
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c
+index d94fd1010debc..8b12b4111c887 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dio_link_encoder.c
+@@ -230,9 +230,7 @@ static void enc31_hw_init(struct link_encoder *enc)
+ AUX_RX_PHASE_DETECT_LEN, [21,20] = 0x3 default is 3
+ AUX_RX_DETECTION_THRESHOLD [30:28] = 1
+ */
+- AUX_REG_WRITE(AUX_DPHY_RX_CONTROL0, 0x103d1110);
+-
+- AUX_REG_WRITE(AUX_DPHY_TX_CONTROL, 0x21c7a);
++ // dmub will read AUX_DPHY_RX_CONTROL0/AUX_DPHY_TX_CONTROL from vbios table in dp_aux_init
+
+ //AUX_DPHY_TX_REF_CONTROL'AUX_TX_REF_DIV HW default is 0x32;
+ // Set AUX_TX_REF_DIV Divider to generate 2 MHz reference from refclk
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+index d71e625cc476e..559cf10831bea 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c
+@@ -1392,12 +1392,6 @@ static struct stream_encoder *dcn31_stream_encoder_create(
+ return NULL;
+ }
+
+- if (ctx->asic_id.chip_family == FAMILY_YELLOW_CARP &&
+- ctx->asic_id.hw_internal_rev == YELLOW_CARP_B0) {
+- if ((eng_id == ENGINE_ID_DIGC) || (eng_id == ENGINE_ID_DIGD))
+- eng_id = eng_id + 3; // For B0 only. C->F, D->G.
+- }
+-
+ dcn30_dio_stream_encoder_construct(enc1, ctx, ctx->dc_bios,
+ eng_id, vpg, afmt,
+ &stream_enc_regs[eng_id],
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+index c881130444948..9b6fbad476465 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
++++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+@@ -154,7 +154,7 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct intel_uc_fw *uc_fw)
+ [INTEL_UC_FW_TYPE_GUC] = { blobs_guc, ARRAY_SIZE(blobs_guc) },
+ [INTEL_UC_FW_TYPE_HUC] = { blobs_huc, ARRAY_SIZE(blobs_huc) },
+ };
+- static const struct uc_fw_platform_requirement *fw_blobs;
++ const struct uc_fw_platform_requirement *fw_blobs;
+ enum intel_platform p = INTEL_INFO(i915)->platform;
+ u32 fw_count;
+ u8 rev = INTEL_REVID(i915);
+diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
+index a4d1759375b9d..66a8880eaf19a 100644
+--- a/drivers/gpu/drm/i915/i915_sysfs.c
++++ b/drivers/gpu/drm/i915/i915_sysfs.c
+@@ -432,7 +432,14 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
+ struct device *kdev = kobj_to_dev(kobj);
+ struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
+ struct i915_gpu_coredump *gpu;
+- ssize_t ret;
++ ssize_t ret = 0;
++
++ /*
++ * FIXME: Concurrent clients triggering resets and reading + clearing
++ * dumps can cause inconsistent sysfs reads when a user calls in with a
++ * non-zero offset to complete a prior partial read but the
++ * gpu_coredump has been cleared or replaced.
++ */
+
+ gpu = i915_first_error_state(i915);
+ if (IS_ERR(gpu)) {
+@@ -444,8 +451,10 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
+ const char *str = "No error state collected\n";
+ size_t len = strlen(str);
+
+- ret = min_t(size_t, count, len - off);
+- memcpy(buf, str + off, ret);
++ if (off < len) {
++ ret = min_t(size_t, count, len - off);
++ memcpy(buf, str + off, ret);
++ }
+ }
+
+ return ret;
+diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
+index 85a2142c93840..073ff9b6f5352 100644
+--- a/drivers/hv/channel_mgmt.c
++++ b/drivers/hv/channel_mgmt.c
+@@ -637,6 +637,7 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
+ */
+ if (newchannel->offermsg.offer.sub_channel_index == 0) {
+ mutex_unlock(&vmbus_connection.channel_mutex);
++ cpus_read_unlock();
+ /*
+ * Don't call free_channel(), because newchannel->kobj
+ * is not initialized yet.
+diff --git a/drivers/i2c/busses/i2c-designware-common.c b/drivers/i2c/busses/i2c-designware-common.c
+index 9f8574320eb2d..b08e5bc2b64cc 100644
+--- a/drivers/i2c/busses/i2c-designware-common.c
++++ b/drivers/i2c/busses/i2c-designware-common.c
+@@ -477,9 +477,6 @@ int i2c_dw_prepare_clk(struct dw_i2c_dev *dev, bool prepare)
+ {
+ int ret;
+
+- if (IS_ERR(dev->clk))
+- return PTR_ERR(dev->clk);
+-
+ if (prepare) {
+ /* Optional interface clock */
+ ret = clk_prepare_enable(dev->pclk);
+diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
+index 70ade5306e458..ba043b5473936 100644
+--- a/drivers/i2c/busses/i2c-designware-platdrv.c
++++ b/drivers/i2c/busses/i2c-designware-platdrv.c
+@@ -320,8 +320,17 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
+ goto exit_reset;
+ }
+
+- dev->clk = devm_clk_get(&pdev->dev, NULL);
+- if (!i2c_dw_prepare_clk(dev, true)) {
++ dev->clk = devm_clk_get_optional(&pdev->dev, NULL);
++ if (IS_ERR(dev->clk)) {
++ ret = PTR_ERR(dev->clk);
++ goto exit_reset;
++ }
++
++ ret = i2c_dw_prepare_clk(dev, true);
++ if (ret)
++ goto exit_reset;
++
++ if (dev->clk) {
+ u64 clk_khz;
+
+ dev->get_clk_rate_khz = i2c_dw_get_clk_rate_khz;
+diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
+index bdecb78bfc26d..8e6985354fd59 100644
+--- a/drivers/i2c/busses/i2c-mt65xx.c
++++ b/drivers/i2c/busses/i2c-mt65xx.c
+@@ -1420,17 +1420,22 @@ static int mtk_i2c_probe(struct platform_device *pdev)
+ if (ret < 0) {
+ dev_err(&pdev->dev,
+ "Request I2C IRQ %d fail\n", irq);
+- return ret;
++ goto err_bulk_unprepare;
+ }
+
+ i2c_set_adapdata(&i2c->adap, i2c);
+ ret = i2c_add_adapter(&i2c->adap);
+ if (ret)
+- return ret;
++ goto err_bulk_unprepare;
+
+ platform_set_drvdata(pdev, i2c);
+
+ return 0;
++
++err_bulk_unprepare:
++ clk_bulk_unprepare(I2C_MT65XX_CLK_MAX, i2c->clocks);
++
++ return ret;
+ }
+
+ static int mtk_i2c_remove(struct platform_device *pdev)
+diff --git a/drivers/i2c/busses/i2c-npcm7xx.c b/drivers/i2c/busses/i2c-npcm7xx.c
+index c638f2efb97c7..743ac20a405c2 100644
+--- a/drivers/i2c/busses/i2c-npcm7xx.c
++++ b/drivers/i2c/busses/i2c-npcm7xx.c
+@@ -2369,8 +2369,7 @@ static struct platform_driver npcm_i2c_bus_driver = {
+ static int __init npcm_i2c_init(void)
+ {
+ npcm_i2c_debugfs_dir = debugfs_create_dir("npcm_i2c", NULL);
+- platform_driver_register(&npcm_i2c_bus_driver);
+- return 0;
++ return platform_driver_register(&npcm_i2c_bus_driver);
+ }
+ module_init(npcm_i2c_init);
+
+diff --git a/drivers/input/misc/soc_button_array.c b/drivers/input/misc/soc_button_array.c
+index cbb1599a520e6..480476121c010 100644
+--- a/drivers/input/misc/soc_button_array.c
++++ b/drivers/input/misc/soc_button_array.c
+@@ -85,13 +85,13 @@ static const struct dmi_system_id dmi_use_low_level_irq[] = {
+ },
+ {
+ /*
+- * Lenovo Yoga Tab2 1051L, something messes with the home-button
++ * Lenovo Yoga Tab2 1051F/1051L, something messes with the home-button
+ * IRQ settings, leading to a non working home-button.
+ */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "60073"),
+- DMI_MATCH(DMI_PRODUCT_VERSION, "1051L"),
++ DMI_MATCH(DMI_PRODUCT_VERSION, "1051"),
+ },
+ },
+ {} /* Terminating entry */
+diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
+index 12dd48727a15f..5ac83185ff479 100644
+--- a/drivers/irqchip/irq-apple-aic.c
++++ b/drivers/irqchip/irq-apple-aic.c
+@@ -1035,6 +1035,7 @@ static void build_fiq_affinity(struct aic_irq_chip *ic, struct device_node *aff)
+ continue;
+
+ cpu = of_cpu_node_to_id(cpu_node);
++ of_node_put(cpu_node);
+ if (WARN_ON(cpu < 0))
+ continue;
+
+@@ -1143,6 +1144,7 @@ static int __init aic_of_ic_init(struct device_node *node, struct device_node *p
+ for_each_child_of_node(affs, chld)
+ build_fiq_affinity(irqc, chld);
+ }
++ of_node_put(affs);
+
+ set_handle_irq(aic_handle_irq);
+ set_handle_fiq(aic_handle_fiq);
+diff --git a/drivers/irqchip/irq-gic-realview.c b/drivers/irqchip/irq-gic-realview.c
+index b4c1924f02554..38fab02ffe9d0 100644
+--- a/drivers/irqchip/irq-gic-realview.c
++++ b/drivers/irqchip/irq-gic-realview.c
+@@ -57,6 +57,7 @@ realview_gic_of_init(struct device_node *node, struct device_node *parent)
+
+ /* The PB11MPCore GIC needs to be configured in the syscon */
+ map = syscon_node_to_regmap(np);
++ of_node_put(np);
+ if (!IS_ERR(map)) {
+ /* new irq mode with no DCC */
+ regmap_write(map, REALVIEW_SYS_LOCK_OFFSET,
+diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
+index 1af2b50f36f3e..d5420f9d6219f 100644
+--- a/drivers/irqchip/irq-gic-v3.c
++++ b/drivers/irqchip/irq-gic-v3.c
+@@ -1922,7 +1922,7 @@ static void __init gic_populate_ppi_partitions(struct device_node *gic_node)
+
+ gic_data.ppi_descs = kcalloc(gic_data.ppi_nr, sizeof(*gic_data.ppi_descs), GFP_KERNEL);
+ if (!gic_data.ppi_descs)
+- return;
++ goto out_put_node;
+
+ nr_parts = of_get_child_count(parts_node);
+
+@@ -1963,12 +1963,15 @@ static void __init gic_populate_ppi_partitions(struct device_node *gic_node)
+ continue;
+
+ cpu = of_cpu_node_to_id(cpu_node);
+- if (WARN_ON(cpu < 0))
++ if (WARN_ON(cpu < 0)) {
++ of_node_put(cpu_node);
+ continue;
++ }
+
+ pr_cont("%pOF[%d] ", cpu_node, cpu);
+
+ cpumask_set_cpu(cpu, &part->mask);
++ of_node_put(cpu_node);
+ }
+
+ pr_cont("}\n");
+diff --git a/drivers/irqchip/irq-realtek-rtl.c b/drivers/irqchip/irq-realtek-rtl.c
+index 50a56820c99bc..56bf502d9c673 100644
+--- a/drivers/irqchip/irq-realtek-rtl.c
++++ b/drivers/irqchip/irq-realtek-rtl.c
+@@ -134,9 +134,9 @@ static int __init map_interrupts(struct device_node *node, struct irq_domain *do
+ if (!cpu_ictl)
+ return -EINVAL;
+ ret = of_property_read_u32(cpu_ictl, "#interrupt-cells", &tmp);
++ of_node_put(cpu_ictl);
+ if (ret || tmp != 1)
+ return -EINVAL;
+- of_node_put(cpu_ictl);
+
+ cpu_int = be32_to_cpup(imap + 2);
+ if (cpu_int > 7 || cpu_int < 2)
+diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
+index a6606736d8c50..2776ca5fc33f3 100644
+--- a/drivers/isdn/mISDN/socket.c
++++ b/drivers/isdn/mISDN/socket.c
+@@ -121,7 +121,7 @@ mISDN_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ if (sk->sk_state == MISDN_CLOSED)
+ return 0;
+
+- skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ return err;
+
+diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
+index 4277853c75351..3cd78f1e82a44 100644
+--- a/drivers/md/dm-core.h
++++ b/drivers/md/dm-core.h
+@@ -32,6 +32,14 @@ struct dm_kobject_holder {
+ * access their members!
+ */
+
++/*
++ * For mempools pre-allocation at the table loading time.
++ */
++struct dm_md_mempools {
++ struct bio_set bs;
++ struct bio_set io_bs;
++};
++
+ struct mapped_device {
+ struct mutex suspend_lock;
+
+@@ -109,8 +117,7 @@ struct mapped_device {
+ /*
+ * io objects are allocated from here.
+ */
+- struct bio_set io_bs;
+- struct bio_set bs;
++ struct dm_md_mempools *mempools;
+
+ /* kobject and completion */
+ struct dm_kobject_holder kobj_holder;
+diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
+index 06f328928a7f5..2dda05aada231 100644
+--- a/drivers/md/dm-log.c
++++ b/drivers/md/dm-log.c
+@@ -415,8 +415,7 @@ static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
+ /*
+ * Work out how many "unsigned long"s we need to hold the bitset.
+ */
+- bitset_size = dm_round_up(region_count,
+- sizeof(*lc->clean_bits) << BYTE_SHIFT);
++ bitset_size = dm_round_up(region_count, BITS_PER_LONG);
+ bitset_size >>= BYTE_SHIFT;
+
+ lc->bitset_uint32_count = bitset_size / sizeof(*lc->clean_bits);
+diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
+index 6087cdcaad46d..a83b98a8d2a99 100644
+--- a/drivers/md/dm-rq.c
++++ b/drivers/md/dm-rq.c
+@@ -319,7 +319,7 @@ static int setup_clone(struct request *clone, struct request *rq,
+ {
+ int r;
+
+- r = blk_rq_prep_clone(clone, rq, &tio->md->bs, gfp_mask,
++ r = blk_rq_prep_clone(clone, rq, &tio->md->mempools->bs, gfp_mask,
+ dm_rq_bio_constructor, tio);
+ if (r)
+ return r;
+diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
+index 03541cfc2317c..4b5cf33726995 100644
+--- a/drivers/md/dm-table.c
++++ b/drivers/md/dm-table.c
+@@ -1030,17 +1030,6 @@ static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *
+ return 0;
+ }
+
+-void dm_table_free_md_mempools(struct dm_table *t)
+-{
+- dm_free_md_mempools(t->mempools);
+- t->mempools = NULL;
+-}
+-
+-struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t)
+-{
+- return t->mempools;
+-}
+-
+ static int setup_indexes(struct dm_table *t)
+ {
+ int i;
+diff --git a/drivers/md/dm.c b/drivers/md/dm.c
+index 82957bd460e89..83dd17abf1af9 100644
+--- a/drivers/md/dm.c
++++ b/drivers/md/dm.c
+@@ -131,14 +131,6 @@ static int get_swap_bios(void)
+ return latch;
+ }
+
+-/*
+- * For mempools pre-allocation at the table loading time.
+- */
+-struct dm_md_mempools {
+- struct bio_set bs;
+- struct bio_set io_bs;
+-};
+-
+ struct table_device {
+ struct list_head list;
+ refcount_t count;
+@@ -551,6 +543,10 @@ static void dm_start_io_acct(struct dm_io *io, struct bio *clone)
+ return;
+ /* Can afford locking given DM_TIO_IS_DUPLICATE_BIO */
+ spin_lock_irqsave(&io->lock, flags);
++ if (dm_io_flagged(io, DM_IO_ACCOUNTED)) {
++ spin_unlock_irqrestore(&io->lock, flags);
++ return;
++ }
+ dm_io_set_flag(io, DM_IO_ACCOUNTED);
+ spin_unlock_irqrestore(&io->lock, flags);
+ }
+@@ -569,7 +565,7 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio)
+ struct dm_target_io *tio;
+ struct bio *clone;
+
+- clone = bio_alloc_clone(bio->bi_bdev, bio, GFP_NOIO, &md->io_bs);
++ clone = bio_alloc_clone(bio->bi_bdev, bio, GFP_NOIO, &md->mempools->io_bs);
+
+ tio = clone_to_tio(clone);
+ tio->flags = 0;
+@@ -611,7 +607,7 @@ static struct bio *alloc_tio(struct clone_info *ci, struct dm_target *ti,
+ clone = &tio->clone;
+ } else {
+ clone = bio_alloc_clone(ci->bio->bi_bdev, ci->bio,
+- gfp_mask, &ci->io->md->bs);
++ gfp_mask, &ci->io->md->mempools->bs);
+ if (!clone)
+ return NULL;
+
+@@ -1771,8 +1767,7 @@ static void cleanup_mapped_device(struct mapped_device *md)
+ {
+ if (md->wq)
+ destroy_workqueue(md->wq);
+- bioset_exit(&md->bs);
+- bioset_exit(&md->io_bs);
++ dm_free_md_mempools(md->mempools);
+
+ if (md->dax_dev) {
+ dax_remove_host(md->disk);
+@@ -1944,48 +1939,6 @@ static void free_dev(struct mapped_device *md)
+ kvfree(md);
+ }
+
+-static int __bind_mempools(struct mapped_device *md, struct dm_table *t)
+-{
+- struct dm_md_mempools *p = dm_table_get_md_mempools(t);
+- int ret = 0;
+-
+- if (dm_table_bio_based(t)) {
+- /*
+- * The md may already have mempools that need changing.
+- * If so, reload bioset because front_pad may have changed
+- * because a different table was loaded.
+- */
+- bioset_exit(&md->bs);
+- bioset_exit(&md->io_bs);
+-
+- } else if (bioset_initialized(&md->bs)) {
+- /*
+- * There's no need to reload with request-based dm
+- * because the size of front_pad doesn't change.
+- * Note for future: If you are to reload bioset,
+- * prep-ed requests in the queue may refer
+- * to bio from the old bioset, so you must walk
+- * through the queue to unprep.
+- */
+- goto out;
+- }
+-
+- BUG_ON(!p ||
+- bioset_initialized(&md->bs) ||
+- bioset_initialized(&md->io_bs));
+-
+- ret = bioset_init_from_src(&md->bs, &p->bs);
+- if (ret)
+- goto out;
+- ret = bioset_init_from_src(&md->io_bs, &p->io_bs);
+- if (ret)
+- bioset_exit(&md->bs);
+-out:
+- /* mempool bind completed, no longer need any mempools in the table */
+- dm_table_free_md_mempools(t);
+- return ret;
+-}
+-
+ /*
+ * Bind a table to the device.
+ */
+@@ -2039,12 +1992,28 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
+ * immutable singletons - used to optimize dm_mq_queue_rq.
+ */
+ md->immutable_target = dm_table_get_immutable_target(t);
+- }
+
+- ret = __bind_mempools(md, t);
+- if (ret) {
+- old_map = ERR_PTR(ret);
+- goto out;
++ /*
++ * There is no need to reload with request-based dm because the
++ * size of front_pad doesn't change.
++ *
++ * Note for future: If you are to reload bioset, prep-ed
++ * requests in the queue may refer to bio from the old bioset,
++ * so you must walk through the queue to unprep.
++ */
++ if (!md->mempools) {
++ md->mempools = t->mempools;
++ t->mempools = NULL;
++ }
++ } else {
++ /*
++ * The md may already have mempools that need changing.
++ * If so, reload bioset because front_pad may have changed
++ * because a different table was loaded.
++ */
++ dm_free_md_mempools(md->mempools);
++ md->mempools = t->mempools;
++ t->mempools = NULL;
+ }
+
+ ret = dm_table_set_restrictions(t, md->queue, limits);
+diff --git a/drivers/md/dm.h b/drivers/md/dm.h
+index 9013dc1a7b002..968199b95fd1e 100644
+--- a/drivers/md/dm.h
++++ b/drivers/md/dm.h
+@@ -71,8 +71,6 @@ struct dm_target *dm_table_get_immutable_target(struct dm_table *t);
+ struct dm_target *dm_table_get_wildcard_target(struct dm_table *t);
+ bool dm_table_bio_based(struct dm_table *t);
+ bool dm_table_request_based(struct dm_table *t);
+-void dm_table_free_md_mempools(struct dm_table *t);
+-struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t);
+
+ void dm_lock_md_type(struct mapped_device *md);
+ void dm_unlock_md_type(struct mapped_device *md);
+diff --git a/drivers/md/raid5-ppl.c b/drivers/md/raid5-ppl.c
+index d3962d92df18a..18acbbed8f0c4 100644
+--- a/drivers/md/raid5-ppl.c
++++ b/drivers/md/raid5-ppl.c
+@@ -629,9 +629,9 @@ static void ppl_do_flush(struct ppl_io_unit *io)
+ if (bdev) {
+ struct bio *bio;
+
+- bio = bio_alloc_bioset(bdev, 0, GFP_NOIO,
++ bio = bio_alloc_bioset(bdev, 0,
+ REQ_OP_WRITE | REQ_PREFLUSH,
+- &ppl_conf->flush_bs);
++ GFP_NOIO, &ppl_conf->flush_bs);
+ bio->bi_private = io;
+ bio->bi_end_io = ppl_flush_endio;
+
+diff --git a/drivers/misc/atmel-ssc.c b/drivers/misc/atmel-ssc.c
+index d6cd5537126c6..69f9b0336410d 100644
+--- a/drivers/misc/atmel-ssc.c
++++ b/drivers/misc/atmel-ssc.c
+@@ -232,9 +232,9 @@ static int ssc_probe(struct platform_device *pdev)
+ clk_disable_unprepare(ssc->clk);
+
+ ssc->irq = platform_get_irq(pdev, 0);
+- if (!ssc->irq) {
++ if (ssc->irq < 0) {
+ dev_dbg(&pdev->dev, "could not get irq\n");
+- return -ENXIO;
++ return ssc->irq;
+ }
+
+ mutex_lock(&user_lock);
+diff --git a/drivers/misc/mei/hbm.c b/drivers/misc/mei/hbm.c
+index cebcca6d6d3ef..cf2b8261da144 100644
+--- a/drivers/misc/mei/hbm.c
++++ b/drivers/misc/mei/hbm.c
+@@ -1351,7 +1351,8 @@ int mei_hbm_dispatch(struct mei_device *dev, struct mei_msg_hdr *hdr)
+
+ if (dev->dev_state != MEI_DEV_INIT_CLIENTS ||
+ dev->hbm_state != MEI_HBM_CAP_SETUP) {
+- if (dev->dev_state == MEI_DEV_POWER_DOWN) {
++ if (dev->dev_state == MEI_DEV_POWER_DOWN ||
++ dev->dev_state == MEI_DEV_POWERING_DOWN) {
+ dev_dbg(dev->dev, "hbm: capabilities response: on shutdown, ignoring\n");
+ return 0;
+ }
+diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h
+index 64ce3f830262b..15e8e2b322b1a 100644
+--- a/drivers/misc/mei/hw-me-regs.h
++++ b/drivers/misc/mei/hw-me-regs.h
+@@ -109,6 +109,8 @@
+ #define MEI_DEV_ID_ADP_P 0x51E0 /* Alder Lake Point P */
+ #define MEI_DEV_ID_ADP_N 0x54E0 /* Alder Lake Point N */
+
++#define MEI_DEV_ID_RPL_S 0x7A68 /* Raptor Lake Point S */
++
+ /*
+ * MEI HW Section
+ */
+diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
+index 33e58821e4785..5435604327a71 100644
+--- a/drivers/misc/mei/pci-me.c
++++ b/drivers/misc/mei/pci-me.c
+@@ -116,6 +116,8 @@ static const struct pci_device_id mei_me_pci_tbl[] = {
+ {MEI_PCI_DEVICE(MEI_DEV_ID_ADP_P, MEI_ME_PCH15_CFG)},
+ {MEI_PCI_DEVICE(MEI_DEV_ID_ADP_N, MEI_ME_PCH15_CFG)},
+
++ {MEI_PCI_DEVICE(MEI_DEV_ID_RPL_S, MEI_ME_PCH15_CFG)},
++
+ /* required last entry */
+ {0, }
+ };
+diff --git a/drivers/net/ethernet/broadcom/bgmac-bcma.c b/drivers/net/ethernet/broadcom/bgmac-bcma.c
+index e6f48786949c0..02bd3cf9a260e 100644
+--- a/drivers/net/ethernet/broadcom/bgmac-bcma.c
++++ b/drivers/net/ethernet/broadcom/bgmac-bcma.c
+@@ -332,7 +332,6 @@ static void bgmac_remove(struct bcma_device *core)
+ bcma_mdio_mii_unregister(bgmac->mii_bus);
+ bgmac_enet_remove(bgmac);
+ bcma_set_drvdata(core, NULL);
+- kfree(bgmac);
+ }
+
+ static struct bcma_driver bgmac_bcma_driver = {
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+index 79c64f4e67d2b..3affcdb34c915 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+@@ -765,6 +765,7 @@ struct hnae3_tc_info {
+ u8 prio_tc[HNAE3_MAX_USER_PRIO]; /* TC indexed by prio */
+ u16 tqp_count[HNAE3_MAX_TC];
+ u16 tqp_offset[HNAE3_MAX_TC];
++ u8 max_tc; /* Total number of TCs */
+ u8 num_tc; /* Total number of enabled TCs */
+ bool mqprio_active;
+ };
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+index 8cebb180c812f..c0b4ff73128fc 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+@@ -3276,7 +3276,7 @@ static int hclge_tp_port_init(struct hclge_dev *hdev)
+ static int hclge_update_port_info(struct hclge_dev *hdev)
+ {
+ struct hclge_mac *mac = &hdev->hw.mac;
+- int speed = HCLGE_MAC_SPEED_UNKNOWN;
++ int speed;
+ int ret;
+
+ /* get the port info from SFP cmd if not copper port */
+@@ -3287,10 +3287,13 @@ static int hclge_update_port_info(struct hclge_dev *hdev)
+ if (!hdev->support_sfp_query)
+ return 0;
+
+- if (hdev->ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V2)
++ if (hdev->ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V2) {
++ speed = mac->speed;
+ ret = hclge_get_sfp_info(hdev, mac);
+- else
++ } else {
++ speed = HCLGE_MAC_SPEED_UNKNOWN;
+ ret = hclge_get_sfp_speed(hdev, &speed);
++ }
+
+ if (ret == -EOPNOTSUPP) {
+ hdev->support_sfp_query = false;
+@@ -3302,6 +3305,8 @@ static int hclge_update_port_info(struct hclge_dev *hdev)
+ if (hdev->ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V2) {
+ if (mac->speed_type == QUERY_ACTIVE_SPEED) {
+ hclge_update_port_capability(hdev, mac);
++ if (mac->speed != speed)
++ (void)hclge_tm_port_shaper_cfg(hdev);
+ return 0;
+ }
+ return hclge_cfg_mac_speed_dup(hdev, mac->speed,
+@@ -3384,6 +3389,12 @@ static int hclge_set_vf_link_state(struct hnae3_handle *handle, int vf,
+ link_state_old = vport->vf_info.link_state;
+ vport->vf_info.link_state = link_state;
+
++ /* return success directly if the VF is unalive, VF will
++ * query link state itself when it starts work.
++ */
++ if (!test_bit(HCLGE_VPORT_STATE_ALIVE, &vport->state))
++ return 0;
++
+ ret = hclge_push_vf_link_status(vport);
+ if (ret) {
+ vport->vf_info.link_state = link_state_old;
+@@ -10136,6 +10147,7 @@ static int hclge_modify_port_base_vlan_tag(struct hclge_vport *vport,
+ if (ret)
+ return ret;
+
++ vport->port_base_vlan_cfg.tbl_sta = false;
+ /* remove old VLAN tag */
+ if (old_info->vlan_tag == 0)
+ ret = hclge_set_vf_vlan_common(hdev, vport->vport_id,
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+index 1f87a8a3fe321..2f33b036a47a7 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+@@ -282,8 +282,8 @@ static int hclge_tm_pg_to_pri_map_cfg(struct hclge_dev *hdev,
+ return hclge_cmd_send(&hdev->hw, &desc, 1);
+ }
+
+-static int hclge_tm_qs_to_pri_map_cfg(struct hclge_dev *hdev,
+- u16 qs_id, u8 pri)
++static int hclge_tm_qs_to_pri_map_cfg(struct hclge_dev *hdev, u16 qs_id, u8 pri,
++ bool link_vld)
+ {
+ struct hclge_qs_to_pri_link_cmd *map;
+ struct hclge_desc desc;
+@@ -294,7 +294,7 @@ static int hclge_tm_qs_to_pri_map_cfg(struct hclge_dev *hdev,
+
+ map->qs_id = cpu_to_le16(qs_id);
+ map->priority = pri;
+- map->link_vld = HCLGE_TM_QS_PRI_LINK_VLD_MSK;
++ map->link_vld = link_vld ? HCLGE_TM_QS_PRI_LINK_VLD_MSK : 0;
+
+ return hclge_cmd_send(&hdev->hw, &desc, 1);
+ }
+@@ -420,7 +420,7 @@ static int hclge_tm_pg_shapping_cfg(struct hclge_dev *hdev,
+ return hclge_cmd_send(&hdev->hw, &desc, 1);
+ }
+
+-static int hclge_tm_port_shaper_cfg(struct hclge_dev *hdev)
++int hclge_tm_port_shaper_cfg(struct hclge_dev *hdev)
+ {
+ struct hclge_port_shapping_cmd *shap_cfg_cmd;
+ struct hclge_shaper_ir_para ir_para;
+@@ -642,11 +642,13 @@ static void hclge_tm_update_kinfo_rss_size(struct hclge_vport *vport)
+ * one tc for VF for simplicity. VF's vport_id is non zero.
+ */
+ if (vport->vport_id) {
++ kinfo->tc_info.max_tc = 1;
+ kinfo->tc_info.num_tc = 1;
+ vport->qs_offset = HNAE3_MAX_TC +
+ vport->vport_id - HCLGE_VF_VPORT_START_NUM;
+ vport_max_rss_size = hdev->vf_rss_size_max;
+ } else {
++ kinfo->tc_info.max_tc = hdev->tc_max;
+ kinfo->tc_info.num_tc =
+ min_t(u16, vport->alloc_tqps, hdev->tm_info.num_tc);
+ vport->qs_offset = 0;
+@@ -679,7 +681,9 @@ static void hclge_tm_vport_tc_info_update(struct hclge_vport *vport)
+ kinfo->num_tqps = hclge_vport_get_tqp_num(vport);
+ vport->dwrr = 100; /* 100 percent as init */
+ vport->bw_limit = hdev->tm_info.pg_info[0].bw_limit;
+- hdev->rss_cfg.rss_size = kinfo->rss_size;
++
++ if (vport->vport_id == PF_VPORT_ID)
++ hdev->rss_cfg.rss_size = kinfo->rss_size;
+
+ /* when enable mqprio, the tc_info has been updated. */
+ if (kinfo->tc_info.mqprio_active)
+@@ -714,14 +718,22 @@ static void hclge_tm_vport_info_update(struct hclge_dev *hdev)
+
+ static void hclge_tm_tc_info_init(struct hclge_dev *hdev)
+ {
+- u8 i;
++ u8 i, tc_sch_mode;
++ u32 bw_limit;
++
++ for (i = 0; i < hdev->tc_max; i++) {
++ if (i < hdev->tm_info.num_tc) {
++ tc_sch_mode = HCLGE_SCH_MODE_DWRR;
++ bw_limit = hdev->tm_info.pg_info[0].bw_limit;
++ } else {
++ tc_sch_mode = HCLGE_SCH_MODE_SP;
++ bw_limit = 0;
++ }
+
+- for (i = 0; i < hdev->tm_info.num_tc; i++) {
+ hdev->tm_info.tc_info[i].tc_id = i;
+- hdev->tm_info.tc_info[i].tc_sch_mode = HCLGE_SCH_MODE_DWRR;
++ hdev->tm_info.tc_info[i].tc_sch_mode = tc_sch_mode;
+ hdev->tm_info.tc_info[i].pgid = 0;
+- hdev->tm_info.tc_info[i].bw_limit =
+- hdev->tm_info.pg_info[0].bw_limit;
++ hdev->tm_info.tc_info[i].bw_limit = bw_limit;
+ }
+
+ for (i = 0; i < HNAE3_MAX_USER_PRIO; i++)
+@@ -926,10 +938,13 @@ static int hclge_tm_pri_q_qs_cfg_tc_base(struct hclge_dev *hdev)
+ for (k = 0; k < hdev->num_alloc_vport; k++) {
+ struct hnae3_knic_private_info *kinfo = &vport[k].nic.kinfo;
+
+- for (i = 0; i < kinfo->tc_info.num_tc; i++) {
++ for (i = 0; i < kinfo->tc_info.max_tc; i++) {
++ u8 pri = i < kinfo->tc_info.num_tc ? i : 0;
++ bool link_vld = i < kinfo->tc_info.num_tc;
++
+ ret = hclge_tm_qs_to_pri_map_cfg(hdev,
+ vport[k].qs_offset + i,
+- i);
++ pri, link_vld);
+ if (ret)
+ return ret;
+ }
+@@ -949,7 +964,7 @@ static int hclge_tm_pri_q_qs_cfg_vnet_base(struct hclge_dev *hdev)
+ for (i = 0; i < HNAE3_MAX_TC; i++) {
+ ret = hclge_tm_qs_to_pri_map_cfg(hdev,
+ vport[k].qs_offset + i,
+- k);
++ k, true);
+ if (ret)
+ return ret;
+ }
+@@ -989,33 +1004,39 @@ static int hclge_tm_pri_tc_base_shaper_cfg(struct hclge_dev *hdev)
+ {
+ u32 max_tm_rate = hdev->ae_dev->dev_specs.max_tm_rate;
+ struct hclge_shaper_ir_para ir_para;
+- u32 shaper_para;
++ u32 shaper_para_c, shaper_para_p;
+ int ret;
+ u32 i;
+
+- for (i = 0; i < hdev->tm_info.num_tc; i++) {
++ for (i = 0; i < hdev->tc_max; i++) {
+ u32 rate = hdev->tm_info.tc_info[i].bw_limit;
+
+- ret = hclge_shaper_para_calc(rate, HCLGE_SHAPER_LVL_PRI,
+- &ir_para, max_tm_rate);
+- if (ret)
+- return ret;
++ if (rate) {
++ ret = hclge_shaper_para_calc(rate, HCLGE_SHAPER_LVL_PRI,
++ &ir_para, max_tm_rate);
++ if (ret)
++ return ret;
++
++ shaper_para_c = hclge_tm_get_shapping_para(0, 0, 0,
++ HCLGE_SHAPER_BS_U_DEF,
++ HCLGE_SHAPER_BS_S_DEF);
++ shaper_para_p = hclge_tm_get_shapping_para(ir_para.ir_b,
++ ir_para.ir_u,
++ ir_para.ir_s,
++ HCLGE_SHAPER_BS_U_DEF,
++ HCLGE_SHAPER_BS_S_DEF);
++ } else {
++ shaper_para_c = 0;
++ shaper_para_p = 0;
++ }
+
+- shaper_para = hclge_tm_get_shapping_para(0, 0, 0,
+- HCLGE_SHAPER_BS_U_DEF,
+- HCLGE_SHAPER_BS_S_DEF);
+ ret = hclge_tm_pri_shapping_cfg(hdev, HCLGE_TM_SHAP_C_BUCKET, i,
+- shaper_para, rate);
++ shaper_para_c, rate);
+ if (ret)
+ return ret;
+
+- shaper_para = hclge_tm_get_shapping_para(ir_para.ir_b,
+- ir_para.ir_u,
+- ir_para.ir_s,
+- HCLGE_SHAPER_BS_U_DEF,
+- HCLGE_SHAPER_BS_S_DEF);
+ ret = hclge_tm_pri_shapping_cfg(hdev, HCLGE_TM_SHAP_P_BUCKET, i,
+- shaper_para, rate);
++ shaper_para_p, rate);
+ if (ret)
+ return ret;
+ }
+@@ -1125,7 +1146,7 @@ static int hclge_tm_pri_tc_base_dwrr_cfg(struct hclge_dev *hdev)
+ int ret;
+ u32 i, k;
+
+- for (i = 0; i < hdev->tm_info.num_tc; i++) {
++ for (i = 0; i < hdev->tc_max; i++) {
+ pg_info =
+ &hdev->tm_info.pg_info[hdev->tm_info.tc_info[i].pgid];
+ dwrr = pg_info->tc_dwrr[i];
+@@ -1135,9 +1156,15 @@ static int hclge_tm_pri_tc_base_dwrr_cfg(struct hclge_dev *hdev)
+ return ret;
+
+ for (k = 0; k < hdev->num_alloc_vport; k++) {
++ struct hnae3_knic_private_info *kinfo = &vport[k].nic.kinfo;
++
++ if (i >= kinfo->tc_info.max_tc)
++ continue;
++
++ dwrr = i < kinfo->tc_info.num_tc ? vport[k].dwrr : 0;
+ ret = hclge_tm_qs_weight_cfg(
+ hdev, vport[k].qs_offset + i,
+- vport[k].dwrr);
++ dwrr);
+ if (ret)
+ return ret;
+ }
+@@ -1303,6 +1330,7 @@ static int hclge_tm_schd_mode_tc_base_cfg(struct hclge_dev *hdev, u8 pri_id)
+ {
+ struct hclge_vport *vport = hdev->vport;
+ int ret;
++ u8 mode;
+ u16 i;
+
+ ret = hclge_tm_pri_schd_mode_cfg(hdev, pri_id);
+@@ -1310,9 +1338,16 @@ static int hclge_tm_schd_mode_tc_base_cfg(struct hclge_dev *hdev, u8 pri_id)
+ return ret;
+
+ for (i = 0; i < hdev->num_alloc_vport; i++) {
++ struct hnae3_knic_private_info *kinfo = &vport[i].nic.kinfo;
++
++ if (pri_id >= kinfo->tc_info.max_tc)
++ continue;
++
++ mode = pri_id < kinfo->tc_info.num_tc ? HCLGE_SCH_MODE_DWRR :
++ HCLGE_SCH_MODE_SP;
+ ret = hclge_tm_qs_schd_mode_cfg(hdev,
+ vport[i].qs_offset + pri_id,
+- HCLGE_SCH_MODE_DWRR);
++ mode);
+ if (ret)
+ return ret;
+ }
+@@ -1353,7 +1388,7 @@ static int hclge_tm_lvl34_schd_mode_cfg(struct hclge_dev *hdev)
+ u8 i;
+
+ if (hdev->tx_sch_mode == HCLGE_FLAG_TC_BASE_SCH_MODE) {
+- for (i = 0; i < hdev->tm_info.num_tc; i++) {
++ for (i = 0; i < hdev->tc_max; i++) {
+ ret = hclge_tm_schd_mode_tc_base_cfg(hdev, i);
+ if (ret)
+ return ret;
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h
+index 619cc30a2dfcc..d943943912f76 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h
+@@ -237,6 +237,7 @@ int hclge_pause_addr_cfg(struct hclge_dev *hdev, const u8 *mac_addr);
+ void hclge_pfc_rx_stats_get(struct hclge_dev *hdev, u64 *stats);
+ void hclge_pfc_tx_stats_get(struct hclge_dev *hdev, u64 *stats);
+ int hclge_tm_qs_shaper_cfg(struct hclge_vport *vport, int max_tx_rate);
++int hclge_tm_port_shaper_cfg(struct hclge_dev *hdev);
+ int hclge_tm_get_qset_num(struct hclge_dev *hdev, u16 *qset_num);
+ int hclge_tm_get_pri_num(struct hclge_dev *hdev, u8 *pri_num);
+ int hclge_tm_get_qset_map_pri(struct hclge_dev *hdev, u16 qset_id, u8 *priority,
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+index e48499624d22c..06c05a6b8b71f 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+@@ -2584,15 +2584,16 @@ static void i40e_diag_test(struct net_device *netdev,
+
+ set_bit(__I40E_TESTING, pf->state);
+
++ if (test_bit(__I40E_RESET_RECOVERY_PENDING, pf->state) ||
++ test_bit(__I40E_RESET_INTR_RECEIVED, pf->state)) {
++ dev_warn(&pf->pdev->dev,
++ "Cannot start offline testing when PF is in reset state.\n");
++ goto skip_ol_tests;
++ }
++
+ if (i40e_active_vfs(pf) || i40e_active_vmdqs(pf)) {
+ dev_warn(&pf->pdev->dev,
+ "Please take active VFs and Netqueues offline and restart the adapter before running NIC diagnostics\n");
+- data[I40E_ETH_TEST_REG] = 1;
+- data[I40E_ETH_TEST_EEPROM] = 1;
+- data[I40E_ETH_TEST_INTR] = 1;
+- data[I40E_ETH_TEST_LINK] = 1;
+- eth_test->flags |= ETH_TEST_FL_FAILED;
+- clear_bit(__I40E_TESTING, pf->state);
+ goto skip_ol_tests;
+ }
+
+@@ -2639,9 +2640,17 @@ static void i40e_diag_test(struct net_device *netdev,
+ data[I40E_ETH_TEST_INTR] = 0;
+ }
+
+-skip_ol_tests:
+-
+ netif_info(pf, drv, netdev, "testing finished\n");
++ return;
++
++skip_ol_tests:
++ data[I40E_ETH_TEST_REG] = 1;
++ data[I40E_ETH_TEST_EEPROM] = 1;
++ data[I40E_ETH_TEST_INTR] = 1;
++ data[I40E_ETH_TEST_LINK] = 1;
++ eth_test->flags |= ETH_TEST_FL_FAILED;
++ clear_bit(__I40E_TESTING, pf->state);
++ netif_info(pf, drv, netdev, "testing failed\n");
+ }
+
+ static void i40e_get_wol(struct net_device *netdev,
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
+index 98871f0149946..46bb1169a004d 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
+@@ -8537,6 +8537,11 @@ static int i40e_configure_clsflower(struct i40e_vsi *vsi,
+ return -EOPNOTSUPP;
+ }
+
++ if (!tc) {
++ dev_err(&pf->pdev->dev, "Unable to add filter because of invalid destination");
++ return -EINVAL;
++ }
++
+ if (test_bit(__I40E_RESET_RECOVERY_PENDING, pf->state) ||
+ test_bit(__I40E_RESET_INTR_RECEIVED, pf->state))
+ return -EBUSY;
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+index 2606e8f0f19be..033ea71763e3d 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+@@ -2282,7 +2282,7 @@ static int i40e_vc_config_queues_msg(struct i40e_vf *vf, u8 *msg)
+ }
+
+ if (vf->adq_enabled) {
+- for (i = 0; i < I40E_MAX_VF_VSI; i++)
++ for (i = 0; i < vf->num_tc; i++)
+ num_qps_all += vf->ch[i].num_qps;
+ if (num_qps_all != qci->num_queue_pairs) {
+ aq_ret = I40E_ERR_PARAM;
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
+index 7dfcf78b57fb5..f3ecb3bca33dd 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
+@@ -984,7 +984,7 @@ struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter,
+ list_add_tail(&f->list, &adapter->mac_filter_list);
+ f->add = true;
+ f->is_new_mac = true;
+- f->is_primary = false;
++ f->is_primary = ether_addr_equal(macaddr, adapter->hw.mac.addr);
+ adapter->aq_required |= IAVF_FLAG_AQ_ADD_MAC_FILTER;
+ } else {
+ f->remove = false;
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index 963a5f40e071b..d069b19f9bf7f 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -5746,25 +5746,38 @@ static netdev_features_t
+ ice_fix_features(struct net_device *netdev, netdev_features_t features)
+ {
+ struct ice_netdev_priv *np = netdev_priv(netdev);
+- netdev_features_t supported_vlan_filtering;
+- netdev_features_t requested_vlan_filtering;
+- struct ice_vsi *vsi = np->vsi;
+-
+- requested_vlan_filtering = features & NETIF_VLAN_FILTERING_FEATURES;
+-
+- /* make sure supported_vlan_filtering works for both SVM and DVM */
+- supported_vlan_filtering = NETIF_F_HW_VLAN_CTAG_FILTER;
+- if (ice_is_dvm_ena(&vsi->back->hw))
+- supported_vlan_filtering |= NETIF_F_HW_VLAN_STAG_FILTER;
+-
+- if (requested_vlan_filtering &&
+- requested_vlan_filtering != supported_vlan_filtering) {
+- if (requested_vlan_filtering & NETIF_F_HW_VLAN_CTAG_FILTER) {
+- netdev_warn(netdev, "cannot support requested VLAN filtering settings, enabling all supported VLAN filtering settings\n");
+- features |= supported_vlan_filtering;
++ netdev_features_t req_vlan_fltr, cur_vlan_fltr;
++ bool cur_ctag, cur_stag, req_ctag, req_stag;
++
++ cur_vlan_fltr = netdev->features & NETIF_VLAN_FILTERING_FEATURES;
++ cur_ctag = cur_vlan_fltr & NETIF_F_HW_VLAN_CTAG_FILTER;
++ cur_stag = cur_vlan_fltr & NETIF_F_HW_VLAN_STAG_FILTER;
++
++ req_vlan_fltr = features & NETIF_VLAN_FILTERING_FEATURES;
++ req_ctag = req_vlan_fltr & NETIF_F_HW_VLAN_CTAG_FILTER;
++ req_stag = req_vlan_fltr & NETIF_F_HW_VLAN_STAG_FILTER;
++
++ if (req_vlan_fltr != cur_vlan_fltr) {
++ if (ice_is_dvm_ena(&np->vsi->back->hw)) {
++ if (req_ctag && req_stag) {
++ features |= NETIF_VLAN_FILTERING_FEATURES;
++ } else if (!req_ctag && !req_stag) {
++ features &= ~NETIF_VLAN_FILTERING_FEATURES;
++ } else if ((!cur_ctag && req_ctag && !cur_stag) ||
++ (!cur_stag && req_stag && !cur_ctag)) {
++ features |= NETIF_VLAN_FILTERING_FEATURES;
++ netdev_warn(netdev, "802.1Q and 802.1ad VLAN filtering must be either both on or both off. VLAN filtering has been enabled for both types.\n");
++ } else if ((cur_ctag && !req_ctag && cur_stag) ||
++ (cur_stag && !req_stag && cur_ctag)) {
++ features &= ~NETIF_VLAN_FILTERING_FEATURES;
++ netdev_warn(netdev, "802.1Q and 802.1ad VLAN filtering must be either both on or both off. VLAN filtering has been disabled for both types.\n");
++ }
+ } else {
+- netdev_warn(netdev, "cannot support requested VLAN filtering settings, clearing all supported VLAN filtering settings\n");
+- features &= ~supported_vlan_filtering;
++ if (req_vlan_fltr & NETIF_F_HW_VLAN_STAG_FILTER)
++ netdev_warn(netdev, "cannot support requested 802.1ad filtering setting in SVM mode\n");
++
++ if (req_vlan_fltr & NETIF_F_HW_VLAN_CTAG_FILTER)
++ features |= NETIF_F_HW_VLAN_CTAG_FILTER;
+ }
+ }
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
+index 662947c882e8b..ef9344ef0d8e4 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
++++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
+@@ -2271,7 +2271,7 @@ static int
+ ice_ptp_init_tx_e822(struct ice_pf *pf, struct ice_ptp_tx *tx, u8 port)
+ {
+ tx->quad = port / ICE_PORTS_PER_QUAD;
+- tx->quad_offset = tx->quad * INDEX_PER_PORT;
++ tx->quad_offset = (port % ICE_PORTS_PER_QUAD) * INDEX_PER_PORT;
+ tx->len = INDEX_PER_PORT;
+
+ return ice_ptp_alloc_tx_tracker(tx);
+diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h
+index afd048d699598..10e396abf1309 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ptp.h
++++ b/drivers/net/ethernet/intel/ice/ice_ptp.h
+@@ -49,6 +49,37 @@ struct ice_perout_channel {
+ * To allow multiple ports to access the shared register block independently,
+ * the blocks are split up so that indexes are assigned to each port based on
+ * hardware logical port number.
++ *
++ * The timestamp blocks are handled differently for E810- and E822-based
++ * devices. In E810 devices, each port has its own block of timestamps, while in
++ * E822 there is a need to logically break the block of registers into smaller
++ * chunks based on the port number to avoid collisions.
++ *
++ * Example for port 5 in E810:
++ * +--------+--------+--------+--------+--------+--------+--------+--------+
++ * |register|register|register|register|register|register|register|register|
++ * | block | block | block | block | block | block | block | block |
++ * | for | for | for | for | for | for | for | for |
++ * | port 0 | port 1 | port 2 | port 3 | port 4 | port 5 | port 6 | port 7 |
++ * +--------+--------+--------+--------+--------+--------+--------+--------+
++ * ^^
++ * ||
++ * |--- quad offset is always 0
++ * ---- quad number
++ *
++ * Example for port 5 in E822:
++ * +-----------------------------+-----------------------------+
++ * | register block for quad 0 | register block for quad 1 |
++ * |+------+------+------+------+|+------+------+------+------+|
++ * ||port 0|port 1|port 2|port 3|||port 0|port 1|port 2|port 3||
++ * |+------+------+------+------+|+------+------+------+------+|
++ * +-----------------------------+-------^---------------------+
++ * ^ |
++ * | --- quad offset*
++ * ---- quad number
++ *
++ * * PHY port 5 is port 1 in quad 1
++ *
+ */
+
+ /**
+diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+index aefd66a4db80d..9790df872c2ae 100644
+--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
++++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+@@ -504,6 +504,11 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
+ }
+
+ if (ice_is_vf_disabled(vf)) {
++ vsi = ice_get_vf_vsi(vf);
++ if (WARN_ON(!vsi))
++ return -EINVAL;
++ ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, vf->vf_id);
++ ice_vsi_stop_all_rx_rings(vsi);
+ dev_dbg(dev, "VF is already disabled, there is no need for resetting it, telling VM, all is fine %d\n",
+ vf->vf_id);
+ return 0;
+diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+index 5405a0e752cf7..da7c5ce15be0d 100644
+--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c
++++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+@@ -1587,35 +1587,27 @@ error_param:
+ */
+ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
+ {
+- enum virtchnl_status_code v_ret = VIRTCHNL_STATUS_SUCCESS;
+ struct virtchnl_vsi_queue_config_info *qci =
+ (struct virtchnl_vsi_queue_config_info *)msg;
+ struct virtchnl_queue_pair_info *qpi;
+ struct ice_pf *pf = vf->pf;
+ struct ice_vsi *vsi;
+- int i, q_idx;
++ int i = -1, q_idx;
+
+- if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states)) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states))
+ goto error_param;
+- }
+
+- if (!ice_vc_isvalid_vsi_id(vf, qci->vsi_id)) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ if (!ice_vc_isvalid_vsi_id(vf, qci->vsi_id))
+ goto error_param;
+- }
+
+ vsi = ice_get_vf_vsi(vf);
+- if (!vsi) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ if (!vsi)
+ goto error_param;
+- }
+
+ if (qci->num_queue_pairs > ICE_MAX_RSS_QS_PER_VF ||
+ qci->num_queue_pairs > min_t(u16, vsi->alloc_txq, vsi->alloc_rxq)) {
+ dev_err(ice_pf_to_dev(pf), "VF-%d requesting more than supported number of queues: %d\n",
+ vf->vf_id, min_t(u16, vsi->alloc_txq, vsi->alloc_rxq));
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
+ goto error_param;
+ }
+
+@@ -1628,7 +1620,6 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
+ !ice_vc_isvalid_ring_len(qpi->txq.ring_len) ||
+ !ice_vc_isvalid_ring_len(qpi->rxq.ring_len) ||
+ !ice_vc_isvalid_q_id(vf, qci->vsi_id, qpi->txq.queue_id)) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
+ goto error_param;
+ }
+
+@@ -1638,7 +1629,6 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
+ * for selected "vsi"
+ */
+ if (q_idx >= vsi->alloc_txq || q_idx >= vsi->alloc_rxq) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
+ goto error_param;
+ }
+
+@@ -1648,14 +1638,13 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
+ vsi->tx_rings[i]->count = qpi->txq.ring_len;
+
+ /* Disable any existing queue first */
+- if (ice_vf_vsi_dis_single_txq(vf, vsi, q_idx)) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ if (ice_vf_vsi_dis_single_txq(vf, vsi, q_idx))
+ goto error_param;
+- }
+
+ /* Configure a queue with the requested settings */
+ if (ice_vsi_cfg_single_txq(vsi, vsi->tx_rings, q_idx)) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ dev_warn(ice_pf_to_dev(pf), "VF-%d failed to configure TX queue %d\n",
++ vf->vf_id, i);
+ goto error_param;
+ }
+ }
+@@ -1669,17 +1658,13 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
+
+ if (qpi->rxq.databuffer_size != 0 &&
+ (qpi->rxq.databuffer_size > ((16 * 1024) - 128) ||
+- qpi->rxq.databuffer_size < 1024)) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ qpi->rxq.databuffer_size < 1024))
+ goto error_param;
+- }
+ vsi->rx_buf_len = qpi->rxq.databuffer_size;
+ vsi->rx_rings[i]->rx_buf_len = vsi->rx_buf_len;
+ if (qpi->rxq.max_pkt_size > max_frame_size ||
+- qpi->rxq.max_pkt_size < 64) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ qpi->rxq.max_pkt_size < 64)
+ goto error_param;
+- }
+
+ vsi->max_frame = qpi->rxq.max_pkt_size;
+ /* add space for the port VLAN since the VF driver is
+@@ -1690,16 +1675,30 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
+ vsi->max_frame += VLAN_HLEN;
+
+ if (ice_vsi_cfg_single_rxq(vsi, q_idx)) {
+- v_ret = VIRTCHNL_STATUS_ERR_PARAM;
++ dev_warn(ice_pf_to_dev(pf), "VF-%d failed to configure RX queue %d\n",
++ vf->vf_id, i);
+ goto error_param;
+ }
+ }
+ }
+
++ /* send the response to the VF */
++ return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES,
++ VIRTCHNL_STATUS_SUCCESS, NULL, 0);
+ error_param:
++ /* disable whatever we can */
++ for (; i >= 0; i--) {
++ if (ice_vsi_ctrl_one_rx_ring(vsi, false, i, true))
++ dev_err(ice_pf_to_dev(pf), "VF-%d could not disable RX queue %d\n",
++ vf->vf_id, i);
++ if (ice_vf_vsi_dis_single_txq(vf, vsi, i))
++ dev_err(ice_pf_to_dev(pf), "VF-%d could not disable TX queue %d\n",
++ vf->vf_id, i);
++ }
++
+ /* send the response to the VF */
+- return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES, v_ret,
+- NULL, 0);
++ return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES,
++ VIRTCHNL_STATUS_ERR_PARAM, NULL, 0);
+ }
+
+ /**
+diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+index a50090e62c8f9..c075670bc5625 100644
+--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
++++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+@@ -778,6 +778,17 @@ static inline bool mtk_rx_get_desc(struct mtk_rx_dma *rxd,
+ return true;
+ }
+
++static void *mtk_max_lro_buf_alloc(gfp_t gfp_mask)
++{
++ unsigned int size = mtk_max_frag_size(MTK_MAX_LRO_RX_LENGTH);
++ unsigned long data;
++
++ data = __get_free_pages(gfp_mask | __GFP_COMP | __GFP_NOWARN,
++ get_order(size));
++
++ return (void *)data;
++}
++
+ /* the qdma core needs scratch memory to be setup */
+ static int mtk_init_fq_dma(struct mtk_eth *eth)
+ {
+@@ -1269,7 +1280,10 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
+ goto release_desc;
+
+ /* alloc new buffer */
+- new_data = napi_alloc_frag(ring->frag_size);
++ if (ring->frag_size <= PAGE_SIZE)
++ new_data = napi_alloc_frag(ring->frag_size);
++ else
++ new_data = mtk_max_lro_buf_alloc(GFP_ATOMIC);
+ if (unlikely(!new_data)) {
+ netdev->stats.rx_dropped++;
+ goto release_desc;
+@@ -1683,7 +1697,10 @@ static int mtk_rx_alloc(struct mtk_eth *eth, int ring_no, int rx_flag)
+ return -ENOMEM;
+
+ for (i = 0; i < rx_dma_size; i++) {
+- ring->data[i] = netdev_alloc_frag(ring->frag_size);
++ if (ring->frag_size <= PAGE_SIZE)
++ ring->data[i] = netdev_alloc_frag(ring->frag_size);
++ else
++ ring->data[i] = mtk_max_lro_buf_alloc(GFP_KERNEL);
+ if (!ring->data[i])
+ return -ENOMEM;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+index a8b98242edb16..a1e9d30515332 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+@@ -561,7 +561,7 @@ static void mlx5_do_bond(struct mlx5_lag *ldev)
+ {
+ struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
+ struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
+- struct lag_tracker tracker;
++ struct lag_tracker tracker = { };
+ bool do_bond, roce_lag;
+ int err;
+
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_cnt.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_cnt.h
+index a68d931090dd5..15c8d4de83508 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_cnt.h
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_cnt.h
+@@ -8,8 +8,8 @@
+ #include "spectrum.h"
+
+ enum mlxsw_sp_counter_sub_pool_id {
+- MLXSW_SP_COUNTER_SUB_POOL_FLOW,
+ MLXSW_SP_COUNTER_SUB_POOL_RIF,
++ MLXSW_SP_COUNTER_SUB_POOL_FLOW,
+ };
+
+ int mlxsw_sp_counter_alloc(struct mlxsw_sp *mlxsw_sp,
+diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
+index e172743948ed7..ce2cbb5903d7b 100644
+--- a/drivers/net/ppp/pppoe.c
++++ b/drivers/net/ppp/pppoe.c
+@@ -1012,8 +1012,7 @@ static int pppoe_recvmsg(struct socket *sock, struct msghdr *m,
+ goto end;
+ }
+
+- skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
+- flags & MSG_DONTWAIT, &error);
++ skb = skb_recv_datagram(sk, flags, &error);
+ if (error < 0)
+ goto end;
+
+diff --git a/drivers/nfc/nfcmrvl/usb.c b/drivers/nfc/nfcmrvl/usb.c
+index a99aedff795dc..ea73094530968 100644
+--- a/drivers/nfc/nfcmrvl/usb.c
++++ b/drivers/nfc/nfcmrvl/usb.c
+@@ -388,13 +388,25 @@ static void nfcmrvl_play_deferred(struct nfcmrvl_usb_drv_data *drv_data)
+ int err;
+
+ while ((urb = usb_get_from_anchor(&drv_data->deferred))) {
++ usb_anchor_urb(urb, &drv_data->tx_anchor);
++
+ err = usb_submit_urb(urb, GFP_ATOMIC);
+- if (err)
++ if (err) {
++ kfree(urb->setup_packet);
++ usb_unanchor_urb(urb);
++ usb_free_urb(urb);
+ break;
++ }
+
+ drv_data->tx_in_flight++;
++ usb_free_urb(urb);
++ }
++
++ /* Cleanup the rest deferred urbs. */
++ while ((urb = usb_get_from_anchor(&drv_data->deferred))) {
++ kfree(urb->setup_packet);
++ usb_free_urb(urb);
+ }
+- usb_scuttle_anchored_urbs(&drv_data->deferred);
+ }
+
+ static int nfcmrvl_resume(struct usb_interface *intf)
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index 2d6a01853109b..1ea85c88d7951 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -3226,8 +3226,8 @@ static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
+ * we have no UUID set
+ */
+ if (uuid_is_null(&ids->uuid)) {
+- printk_ratelimited(KERN_WARNING
+- "No UUID available providing old NGUID\n");
++ dev_warn_ratelimited(dev,
++ "No UUID available providing old NGUID\n");
+ return sysfs_emit(buf, "%pU\n", ids->nguid);
+ }
+ return sysfs_emit(buf, "%pU\n", &ids->uuid);
+diff --git a/drivers/platform/mips/Kconfig b/drivers/platform/mips/Kconfig
+index d421e14823957..6b51ad01f7915 100644
+--- a/drivers/platform/mips/Kconfig
++++ b/drivers/platform/mips/Kconfig
+@@ -17,7 +17,7 @@ menuconfig MIPS_PLATFORM_DEVICES
+ if MIPS_PLATFORM_DEVICES
+
+ config CPU_HWMON
+- tristate "Loongson-3 CPU HWMon Driver"
++ bool "Loongson-3 CPU HWMon Driver"
+ depends on MACH_LOONGSON64
+ select HWMON
+ default y
+diff --git a/drivers/platform/x86/gigabyte-wmi.c b/drivers/platform/x86/gigabyte-wmi.c
+index e87a931eab1e7..78446b1953f7c 100644
+--- a/drivers/platform/x86/gigabyte-wmi.c
++++ b/drivers/platform/x86/gigabyte-wmi.c
+@@ -140,6 +140,7 @@ static u8 gigabyte_wmi_detect_sensor_usability(struct wmi_device *wdev)
+ }}
+
+ static const struct dmi_system_id gigabyte_wmi_known_working_platforms[] = {
++ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B450M DS3H-CF"),
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B450M S2H V2"),
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B550 AORUS ELITE AX V2"),
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B550 AORUS ELITE"),
+@@ -154,6 +155,7 @@ static const struct dmi_system_id gigabyte_wmi_known_working_platforms[] = {
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("X570 GAMING X"),
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("X570 I AORUS PRO WIFI"),
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("X570 UD"),
++ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("Z690M AORUS ELITE AX DDR4"),
+ { }
+ };
+
+diff --git a/drivers/platform/x86/intel/hid.c b/drivers/platform/x86/intel/hid.c
+index 216d31e3403dd..79cff1fc675c2 100644
+--- a/drivers/platform/x86/intel/hid.c
++++ b/drivers/platform/x86/intel/hid.c
+@@ -122,6 +122,12 @@ static const struct dmi_system_id dmi_vgbs_allow_list[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "HP Spectre x360 Convertible 15-df0xxx"),
+ },
+ },
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Surface Go"),
++ },
++ },
+ { }
+ };
+
+diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
+index ac19fcc9abbf5..8ee15a7252c71 100644
+--- a/drivers/platform/x86/intel/pmc/core.c
++++ b/drivers/platform/x86/intel/pmc/core.c
+@@ -1912,6 +1912,7 @@ static const struct x86_cpu_id intel_pmc_core_ids[] = {
+ X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &tgl_reg_map),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &tgl_reg_map),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_reg_map),
++ X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, &tgl_reg_map),
+ {}
+ };
+
+diff --git a/drivers/platform/x86/intel/pmt/crashlog.c b/drivers/platform/x86/intel/pmt/crashlog.c
+index 34daf9df168b1..ace1239bc0a0b 100644
+--- a/drivers/platform/x86/intel/pmt/crashlog.c
++++ b/drivers/platform/x86/intel/pmt/crashlog.c
+@@ -282,7 +282,7 @@ static int pmt_crashlog_probe(struct auxiliary_device *auxdev,
+ auxiliary_set_drvdata(auxdev, priv);
+
+ for (i = 0; i < intel_vsec_dev->num_resources; i++) {
+- struct intel_pmt_entry *entry = &priv->entry[i].entry;
++ struct intel_pmt_entry *entry = &priv->entry[priv->num_entries].entry;
+
+ ret = intel_pmt_dev_create(entry, &pmt_crashlog_ns, intel_vsec_dev, i);
+ if (ret < 0)
+diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
+index 104bee9b3a9dc..00593d8953f10 100644
+--- a/drivers/scsi/ipr.c
++++ b/drivers/scsi/ipr.c
+@@ -9795,7 +9795,7 @@ static int ipr_alloc_mem(struct ipr_ioa_cfg *ioa_cfg)
+ GFP_KERNEL);
+
+ if (!ioa_cfg->hrrq[i].host_rrq) {
+- while (--i > 0)
++ while (--i >= 0)
+ dma_free_coherent(&pdev->dev,
+ sizeof(u32) * ioa_cfg->hrrq[i].size,
+ ioa_cfg->hrrq[i].host_rrq,
+@@ -10068,7 +10068,7 @@ static int ipr_request_other_msi_irqs(struct ipr_ioa_cfg *ioa_cfg,
+ ioa_cfg->vectors_info[i].desc,
+ &ioa_cfg->hrrq[i]);
+ if (rc) {
+- while (--i >= 0)
++ while (--i > 0)
+ free_irq(pci_irq_vector(pdev, i),
+ &ioa_cfg->hrrq[i]);
+ return rc;
+diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
+index 892b3da1ba450..9e38995800390 100644
+--- a/drivers/scsi/lpfc/lpfc_els.c
++++ b/drivers/scsi/lpfc/lpfc_els.c
+@@ -3035,18 +3035,10 @@ lpfc_cmpl_els_logo(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ spin_unlock_irq(&ndlp->lock);
+ lpfc_disc_state_machine(vport, ndlp, cmdiocb,
+ NLP_EVT_DEVICE_RM);
+- lpfc_els_free_iocb(phba, cmdiocb);
+- lpfc_nlp_put(ndlp);
+-
+- /* Presume the node was released. */
+- return;
++ goto out_rsrc_free;
+ }
+
+ out:
+- /* Driver is done with the IO. */
+- lpfc_els_free_iocb(phba, cmdiocb);
+- lpfc_nlp_put(ndlp);
+-
+ /* At this point, the LOGO processing is complete. NOTE: For a
+ * pt2pt topology, we are assuming the NPortID will only change
+ * on link up processing. For a LOGO / PLOGI initiated by the
+@@ -3073,6 +3065,10 @@ out:
+ ndlp->nlp_DID, ulp_status,
+ ulp_word4, tmo,
+ vport->num_disc_nodes);
++
++ lpfc_els_free_iocb(phba, cmdiocb);
++ lpfc_nlp_put(ndlp);
++
+ lpfc_disc_start(vport);
+ return;
+ }
+@@ -3089,6 +3085,10 @@ out:
+ lpfc_disc_state_machine(vport, ndlp, cmdiocb,
+ NLP_EVT_DEVICE_RM);
+ }
++out_rsrc_free:
++ /* Driver is done with the I/O. */
++ lpfc_els_free_iocb(phba, cmdiocb);
++ lpfc_nlp_put(ndlp);
+ }
+
+ /**
+diff --git a/drivers/scsi/lpfc/lpfc_hw4.h b/drivers/scsi/lpfc/lpfc_hw4.h
+index 02e230ed62802..e7daef5500952 100644
+--- a/drivers/scsi/lpfc/lpfc_hw4.h
++++ b/drivers/scsi/lpfc/lpfc_hw4.h
+@@ -4488,6 +4488,9 @@ struct wqe_common {
+ #define wqe_sup_SHIFT 6
+ #define wqe_sup_MASK 0x00000001
+ #define wqe_sup_WORD word11
++#define wqe_ffrq_SHIFT 6
++#define wqe_ffrq_MASK 0x00000001
++#define wqe_ffrq_WORD word11
+ #define wqe_wqec_SHIFT 7
+ #define wqe_wqec_MASK 0x00000001
+ #define wqe_wqec_WORD word11
+diff --git a/drivers/scsi/lpfc/lpfc_nportdisc.c b/drivers/scsi/lpfc/lpfc_nportdisc.c
+index 4b065c51ee1b0..f5de88877ffe7 100644
+--- a/drivers/scsi/lpfc/lpfc_nportdisc.c
++++ b/drivers/scsi/lpfc/lpfc_nportdisc.c
+@@ -835,7 +835,8 @@ lpfc_rcv_logo(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
+ lpfc_nvmet_invalidate_host(phba, ndlp);
+
+ if (ndlp->nlp_DID == Fabric_DID) {
+- if (vport->port_state <= LPFC_FDISC)
++ if (vport->port_state <= LPFC_FDISC ||
++ vport->fc_flag & FC_PT2PT)
+ goto out;
+ lpfc_linkdown_port(vport);
+ spin_lock_irq(shost->host_lock);
+diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
+index d3a542466e981..49f44d9d04ea7 100644
+--- a/drivers/scsi/lpfc/lpfc_nvme.c
++++ b/drivers/scsi/lpfc/lpfc_nvme.c
+@@ -1194,7 +1194,8 @@ lpfc_nvme_prep_io_cmd(struct lpfc_vport *vport,
+ {
+ struct lpfc_hba *phba = vport->phba;
+ struct nvmefc_fcp_req *nCmd = lpfc_ncmd->nvmeCmd;
+- struct lpfc_iocbq *pwqeq = &(lpfc_ncmd->cur_iocbq);
++ struct nvme_common_command *sqe;
++ struct lpfc_iocbq *pwqeq = &lpfc_ncmd->cur_iocbq;
+ union lpfc_wqe128 *wqe = &pwqeq->wqe;
+ uint32_t req_len;
+
+@@ -1251,8 +1252,14 @@ lpfc_nvme_prep_io_cmd(struct lpfc_vport *vport,
+ cstat->control_requests++;
+ }
+
+- if (pnode->nlp_nvme_info & NLP_NVME_NSLER)
++ if (pnode->nlp_nvme_info & NLP_NVME_NSLER) {
+ bf_set(wqe_erp, &wqe->generic.wqe_com, 1);
++ sqe = &((struct nvme_fc_cmd_iu *)
++ nCmd->cmdaddr)->sqe.common;
++ if (sqe->opcode == nvme_admin_async_event)
++ bf_set(wqe_ffrq, &wqe->generic.wqe_com, 1);
++ }
++
+ /*
+ * Finish initializing those WQE fields that are independent
+ * of the nvme_cmnd request_buffer
+diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
+index 538d2c0cd9713..aa142052ebe45 100644
+--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
++++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
+@@ -5368,6 +5368,7 @@ static int _base_assign_fw_reported_qd(struct MPT3SAS_ADAPTER *ioc)
+ Mpi2ConfigReply_t mpi_reply;
+ Mpi2SasIOUnitPage1_t *sas_iounit_pg1 = NULL;
+ Mpi26PCIeIOUnitPage1_t pcie_iounit_pg1;
++ u16 depth;
+ int sz;
+ int rc = 0;
+
+@@ -5379,7 +5380,7 @@ static int _base_assign_fw_reported_qd(struct MPT3SAS_ADAPTER *ioc)
+ goto out;
+ /* sas iounit page 1 */
+ sz = offsetof(Mpi2SasIOUnitPage1_t, PhyData);
+- sas_iounit_pg1 = kzalloc(sz, GFP_KERNEL);
++ sas_iounit_pg1 = kzalloc(sizeof(Mpi2SasIOUnitPage1_t), GFP_KERNEL);
+ if (!sas_iounit_pg1) {
+ pr_err("%s: failure at %s:%d/%s()!\n",
+ ioc->name, __FILE__, __LINE__, __func__);
+@@ -5392,16 +5393,16 @@ static int _base_assign_fw_reported_qd(struct MPT3SAS_ADAPTER *ioc)
+ ioc->name, __FILE__, __LINE__, __func__);
+ goto out;
+ }
+- ioc->max_wideport_qd =
+- (le16_to_cpu(sas_iounit_pg1->SASWideMaxQueueDepth)) ?
+- le16_to_cpu(sas_iounit_pg1->SASWideMaxQueueDepth) :
+- MPT3SAS_SAS_QUEUE_DEPTH;
+- ioc->max_narrowport_qd =
+- (le16_to_cpu(sas_iounit_pg1->SASNarrowMaxQueueDepth)) ?
+- le16_to_cpu(sas_iounit_pg1->SASNarrowMaxQueueDepth) :
+- MPT3SAS_SAS_QUEUE_DEPTH;
+- ioc->max_sata_qd = (sas_iounit_pg1->SATAMaxQDepth) ?
+- sas_iounit_pg1->SATAMaxQDepth : MPT3SAS_SATA_QUEUE_DEPTH;
++
++ depth = le16_to_cpu(sas_iounit_pg1->SASWideMaxQueueDepth);
++ ioc->max_wideport_qd = (depth ? depth : MPT3SAS_SAS_QUEUE_DEPTH);
++
++ depth = le16_to_cpu(sas_iounit_pg1->SASNarrowMaxQueueDepth);
++ ioc->max_narrowport_qd = (depth ? depth : MPT3SAS_SAS_QUEUE_DEPTH);
++
++ depth = sas_iounit_pg1->SATAMaxQDepth;
++ ioc->max_sata_qd = (depth ? depth : MPT3SAS_SATA_QUEUE_DEPTH);
++
+ /* pcie iounit page 1 */
+ rc = mpt3sas_config_get_pcie_iounit_pg1(ioc, &mpi_reply,
+ &pcie_iounit_pg1, sizeof(Mpi26PCIeIOUnitPage1_t));
+diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c
+index fd674ed1febed..6d94837c90493 100644
+--- a/drivers/scsi/pmcraid.c
++++ b/drivers/scsi/pmcraid.c
+@@ -4031,7 +4031,7 @@ pmcraid_register_interrupt_handler(struct pmcraid_instance *pinstance)
+ return 0;
+
+ out_unwind:
+- while (--i > 0)
++ while (--i >= 0)
+ free_irq(pci_irq_vector(pdev, i), &pinstance->hrrq_vector[i]);
+ pci_free_irq_vectors(pdev);
+ return rc;
+diff --git a/drivers/scsi/vmw_pvscsi.h b/drivers/scsi/vmw_pvscsi.h
+index 51a82f7803d3c..9d16cf9254837 100644
+--- a/drivers/scsi/vmw_pvscsi.h
++++ b/drivers/scsi/vmw_pvscsi.h
+@@ -331,8 +331,8 @@ struct PVSCSIRingReqDesc {
+ u8 tag;
+ u8 bus;
+ u8 target;
+- u8 vcpuHint;
+- u8 unused[59];
++ u16 vcpuHint;
++ u8 unused[58];
+ } __packed;
+
+ /*
+diff --git a/drivers/staging/r8188eu/core/rtw_xmit.c b/drivers/staging/r8188eu/core/rtw_xmit.c
+index 2ee92bbe66a08..ea5c88904961d 100644
+--- a/drivers/staging/r8188eu/core/rtw_xmit.c
++++ b/drivers/staging/r8188eu/core/rtw_xmit.c
+@@ -178,8 +178,7 @@ s32 _rtw_init_xmit_priv(struct xmit_priv *pxmitpriv, struct adapter *padapter)
+
+ pxmitpriv->free_xmit_extbuf_cnt = num_xmit_extbuf;
+
+- res = rtw_alloc_hwxmits(padapter);
+- if (res) {
++ if (rtw_alloc_hwxmits(padapter)) {
+ res = _FAIL;
+ goto exit;
+ }
+@@ -1492,19 +1491,10 @@ int rtw_alloc_hwxmits(struct adapter *padapter)
+
+ hwxmits = pxmitpriv->hwxmits;
+
+- if (pxmitpriv->hwxmit_entry == 5) {
+- hwxmits[0] .sta_queue = &pxmitpriv->bm_pending;
+- hwxmits[1] .sta_queue = &pxmitpriv->vo_pending;
+- hwxmits[2] .sta_queue = &pxmitpriv->vi_pending;
+- hwxmits[3] .sta_queue = &pxmitpriv->bk_pending;
+- hwxmits[4] .sta_queue = &pxmitpriv->be_pending;
+- } else if (pxmitpriv->hwxmit_entry == 4) {
+- hwxmits[0] .sta_queue = &pxmitpriv->vo_pending;
+- hwxmits[1] .sta_queue = &pxmitpriv->vi_pending;
+- hwxmits[2] .sta_queue = &pxmitpriv->be_pending;
+- hwxmits[3] .sta_queue = &pxmitpriv->bk_pending;
+- } else {
+- }
++ hwxmits[0].sta_queue = &pxmitpriv->vo_pending;
++ hwxmits[1].sta_queue = &pxmitpriv->vi_pending;
++ hwxmits[2].sta_queue = &pxmitpriv->be_pending;
++ hwxmits[3].sta_queue = &pxmitpriv->bk_pending;
+
+ return 0;
+ }
+diff --git a/drivers/staging/r8188eu/os_dep/ioctl_linux.c b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
+index 60bd1cc2b3afd..607c5e1eb3205 100644
+--- a/drivers/staging/r8188eu/os_dep/ioctl_linux.c
++++ b/drivers/staging/r8188eu/os_dep/ioctl_linux.c
+@@ -404,7 +404,7 @@ static int wpa_set_encryption(struct net_device *dev, struct ieee_param *param,
+
+ if (wep_key_len > 0) {
+ wep_key_len = wep_key_len <= 5 ? 5 : 13;
+- wep_total_len = wep_key_len + FIELD_OFFSET(struct ndis_802_11_wep, KeyMaterial);
++ wep_total_len = wep_key_len + sizeof(*pwep);
+ pwep = kzalloc(wep_total_len, GFP_KERNEL);
+ if (!pwep)
+ goto exit;
+diff --git a/drivers/tty/goldfish.c b/drivers/tty/goldfish.c
+index c7968aecd8702..d02de3f0326fb 100644
+--- a/drivers/tty/goldfish.c
++++ b/drivers/tty/goldfish.c
+@@ -426,7 +426,7 @@ static int goldfish_tty_remove(struct platform_device *pdev)
+ tty_unregister_device(goldfish_tty_driver, qtty->console.index);
+ iounmap(qtty->base);
+ qtty->base = NULL;
+- free_irq(qtty->irq, pdev);
++ free_irq(qtty->irq, qtty);
+ tty_port_destroy(&qtty->port);
+ goldfish_tty_current_line_count--;
+ if (goldfish_tty_current_line_count == 0)
+diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
+index ea5381dedb07c..9f2a8c0e1e334 100644
+--- a/drivers/tty/n_gsm.c
++++ b/drivers/tty/n_gsm.c
+@@ -455,7 +455,7 @@ static void gsm_hex_dump_bytes(const char *fname, const u8 *data,
+ return;
+ }
+
+- prefix = kasprintf(GFP_KERNEL, "%s: ", fname);
++ prefix = kasprintf(GFP_ATOMIC, "%s: ", fname);
+ if (!prefix)
+ return;
+ print_hex_dump(KERN_INFO, prefix, DUMP_PREFIX_OFFSET, 16, 1, data, len,
+diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
+index 1fbd5bf264bec..7e295d2701b28 100644
+--- a/drivers/tty/serial/8250/8250_port.c
++++ b/drivers/tty/serial/8250/8250_port.c
+@@ -1535,6 +1535,8 @@ static inline void __stop_tx(struct uart_8250_port *p)
+
+ if (em485) {
+ unsigned char lsr = serial_in(p, UART_LSR);
++ p->lsr_saved_flags |= lsr & LSR_SAVE_FLAGS;
++
+ /*
+ * To provide required timeing and allow FIFO transfer,
+ * __stop_tx_rs485() must be called only when both FIFO and
+diff --git a/drivers/usb/cdns3/cdnsp-ring.c b/drivers/usb/cdns3/cdnsp-ring.c
+index e45c3d6e1536c..794e413800ae8 100644
+--- a/drivers/usb/cdns3/cdnsp-ring.c
++++ b/drivers/usb/cdns3/cdnsp-ring.c
+@@ -1941,13 +1941,16 @@ int cdnsp_queue_bulk_tx(struct cdnsp_device *pdev, struct cdnsp_request *preq)
+ }
+
+ if (enqd_len + trb_buff_len >= full_len) {
+- if (need_zero_pkt)
+- zero_len_trb = !zero_len_trb;
+-
+- field &= ~TRB_CHAIN;
+- field |= TRB_IOC;
+- more_trbs_coming = false;
+- preq->td.last_trb = ring->enqueue;
++ if (need_zero_pkt && !zero_len_trb) {
++ zero_len_trb = true;
++ } else {
++ zero_len_trb = false;
++ field &= ~TRB_CHAIN;
++ field |= TRB_IOC;
++ more_trbs_coming = false;
++ need_zero_pkt = false;
++ preq->td.last_trb = ring->enqueue;
++ }
+ }
+
+ /* Only set interrupt on short packet for OUT endpoints. */
+@@ -1962,7 +1965,7 @@ int cdnsp_queue_bulk_tx(struct cdnsp_device *pdev, struct cdnsp_request *preq)
+ length_field = TRB_LEN(trb_buff_len) | TRB_TD_SIZE(remainder) |
+ TRB_INTR_TARGET(0);
+
+- cdnsp_queue_trb(pdev, ring, more_trbs_coming | zero_len_trb,
++ cdnsp_queue_trb(pdev, ring, more_trbs_coming,
+ lower_32_bits(send_addr),
+ upper_32_bits(send_addr),
+ length_field,
+diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
+index f63a27d11fac8..3f107a06817d8 100644
+--- a/drivers/usb/dwc2/hcd.c
++++ b/drivers/usb/dwc2/hcd.c
+@@ -5190,7 +5190,7 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg)
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ retval = -EINVAL;
+- goto error1;
++ goto error2;
+ }
+ hcd->rsrc_start = res->start;
+ hcd->rsrc_len = resource_size(res);
+diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c
+index ba51de7dd7605..6b018048fe2e1 100644
+--- a/drivers/usb/dwc3/dwc3-pci.c
++++ b/drivers/usb/dwc3/dwc3-pci.c
+@@ -127,6 +127,7 @@ static const struct property_entry dwc3_pci_intel_phy_charger_detect_properties[
+ PROPERTY_ENTRY_STRING("dr_mode", "peripheral"),
+ PROPERTY_ENTRY_BOOL("snps,dis_u2_susphy_quirk"),
+ PROPERTY_ENTRY_BOOL("linux,phy_charger_detect"),
++ PROPERTY_ENTRY_BOOL("linux,sysdev_is_parent"),
+ {}
+ };
+
+diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
+index bf2eaa09d73c8..0a6633e7edce2 100644
+--- a/drivers/usb/dwc3/gadget.c
++++ b/drivers/usb/dwc3/gadget.c
+@@ -2984,6 +2984,7 @@ static int dwc3_gadget_init_in_endpoint(struct dwc3_ep *dep)
+ struct dwc3 *dwc = dep->dwc;
+ u32 mdwidth;
+ int size;
++ int maxpacket;
+
+ mdwidth = dwc3_mdwidth(dwc);
+
+@@ -2996,21 +2997,24 @@ static int dwc3_gadget_init_in_endpoint(struct dwc3_ep *dep)
+ else
+ size = DWC31_GTXFIFOSIZ_TXFDEP(size);
+
+- /* FIFO Depth is in MDWDITH bytes. Multiply */
+- size *= mdwidth;
+-
+ /*
+- * To meet performance requirement, a minimum TxFIFO size of 3x
+- * MaxPacketSize is recommended for endpoints that support burst and a
+- * minimum TxFIFO size of 2x MaxPacketSize for endpoints that don't
+- * support burst. Use those numbers and we can calculate the max packet
+- * limit as below.
++ * maxpacket size is determined as part of the following, after assuming
++ * a mult value of one maxpacket:
++ * DWC3 revision 280A and prior:
++ * fifo_size = mult * (max_packet / mdwidth) + 1;
++ * maxpacket = mdwidth * (fifo_size - 1);
++ *
++ * DWC3 revision 290A and onwards:
++ * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1
++ * maxpacket = mdwidth * ((fifo_size - 1) - 1) - mdwidth;
+ */
+- if (dwc->maximum_speed >= USB_SPEED_SUPER)
+- size /= 3;
++ if (DWC3_VER_IS_PRIOR(DWC3, 290A))
++ maxpacket = mdwidth * (size - 1);
+ else
+- size /= 2;
++ maxpacket = mdwidth * ((size - 1) - 1) - mdwidth;
+
++ /* Functionally, space for one max packet is sufficient */
++ size = min_t(int, maxpacket, 1024);
+ usb_ep_set_maxpacket_limit(&dep->endpoint, size);
+
+ dep->endpoint.max_streams = 16;
+diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
+index 4585ee3a444a8..e0fa4b186ec6d 100644
+--- a/drivers/usb/gadget/function/f_fs.c
++++ b/drivers/usb/gadget/function/f_fs.c
+@@ -122,8 +122,6 @@ struct ffs_ep {
+ struct usb_endpoint_descriptor *descs[3];
+
+ u8 num;
+-
+- int status; /* P: epfile->mutex */
+ };
+
+ struct ffs_epfile {
+@@ -227,6 +225,9 @@ struct ffs_io_data {
+ bool use_sg;
+
+ struct ffs_data *ffs;
++
++ int status;
++ struct completion done;
+ };
+
+ struct ffs_desc_helper {
+@@ -707,12 +708,15 @@ static const struct file_operations ffs_ep0_operations = {
+
+ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
+ {
++ struct ffs_io_data *io_data = req->context;
++
+ ENTER();
+- if (req->context) {
+- struct ffs_ep *ep = _ep->driver_data;
+- ep->status = req->status ? req->status : req->actual;
+- complete(req->context);
+- }
++ if (req->status)
++ io_data->status = req->status;
++ else
++ io_data->status = req->actual;
++
++ complete(&io_data->done);
+ }
+
+ static ssize_t ffs_copy_to_iter(void *data, int data_len, struct iov_iter *iter)
+@@ -1050,7 +1054,6 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
+ WARN(1, "%s: data_len == -EINVAL\n", __func__);
+ ret = -EINVAL;
+ } else if (!io_data->aio) {
+- DECLARE_COMPLETION_ONSTACK(done);
+ bool interrupted = false;
+
+ req = ep->req;
+@@ -1066,7 +1069,8 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
+
+ io_data->buf = data;
+
+- req->context = &done;
++ init_completion(&io_data->done);
++ req->context = io_data;
+ req->complete = ffs_epfile_io_complete;
+
+ ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
+@@ -1075,7 +1079,12 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
+
+ spin_unlock_irq(&epfile->ffs->eps_lock);
+
+- if (wait_for_completion_interruptible(&done)) {
++ if (wait_for_completion_interruptible(&io_data->done)) {
++ spin_lock_irq(&epfile->ffs->eps_lock);
++ if (epfile->ep != ep) {
++ ret = -ESHUTDOWN;
++ goto error_lock;
++ }
+ /*
+ * To avoid race condition with ffs_epfile_io_complete,
+ * dequeue the request first then check
+@@ -1083,17 +1092,18 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
+ * condition with req->complete callback.
+ */
+ usb_ep_dequeue(ep->ep, req);
+- wait_for_completion(&done);
+- interrupted = ep->status < 0;
++ spin_unlock_irq(&epfile->ffs->eps_lock);
++ wait_for_completion(&io_data->done);
++ interrupted = io_data->status < 0;
+ }
+
+ if (interrupted)
+ ret = -EINTR;
+- else if (io_data->read && ep->status > 0)
+- ret = __ffs_epfile_read_data(epfile, data, ep->status,
++ else if (io_data->read && io_data->status > 0)
++ ret = __ffs_epfile_read_data(epfile, data, io_data->status,
+ &io_data->data);
+ else
+- ret = ep->status;
++ ret = io_data->status;
+ goto error_mutex;
+ } else if (!(req = usb_ep_alloc_request(ep->ep, GFP_ATOMIC))) {
+ ret = -ENOMEM;
+diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
+index 6f5d45ef2e39a..f51694f29de92 100644
+--- a/drivers/usb/gadget/function/u_ether.c
++++ b/drivers/usb/gadget/function/u_ether.c
+@@ -775,9 +775,13 @@ struct eth_dev *gether_setup_name(struct usb_gadget *g,
+ dev->qmult = qmult;
+ snprintf(net->name, sizeof(net->name), "%s%%d", netname);
+
+- if (get_ether_addr(dev_addr, addr))
++ if (get_ether_addr(dev_addr, addr)) {
++ net->addr_assign_type = NET_ADDR_RANDOM;
+ dev_warn(&g->dev,
+ "using random %s ethernet address\n", "self");
++ } else {
++ net->addr_assign_type = NET_ADDR_SET;
++ }
+ eth_hw_addr_set(net, addr);
+ if (get_ether_addr(host_addr, dev->host_mac))
+ dev_warn(&g->dev,
+@@ -844,6 +848,10 @@ struct net_device *gether_setup_name_default(const char *netname)
+
+ eth_random_addr(dev->dev_mac);
+ pr_warn("using random %s ethernet address\n", "self");
++
++ /* by default we always have a random MAC address */
++ net->addr_assign_type = NET_ADDR_RANDOM;
++
+ eth_random_addr(dev->host_mac);
+ pr_warn("using random %s ethernet address\n", "host");
+
+@@ -871,7 +879,6 @@ int gether_register_netdev(struct net_device *net)
+ dev = netdev_priv(net);
+ g = dev->gadget;
+
+- net->addr_assign_type = NET_ADDR_RANDOM;
+ eth_hw_addr_set(net, dev->dev_mac);
+
+ status = register_netdev(net);
+@@ -912,6 +919,7 @@ int gether_set_dev_addr(struct net_device *net, const char *dev_addr)
+ if (get_ether_addr(dev_addr, new_addr))
+ return -EINVAL;
+ memcpy(dev->dev_mac, new_addr, ETH_ALEN);
++ net->addr_assign_type = NET_ADDR_SET;
+ return 0;
+ }
+ EXPORT_SYMBOL_GPL(gether_set_dev_addr);
+diff --git a/drivers/usb/gadget/udc/lpc32xx_udc.c b/drivers/usb/gadget/udc/lpc32xx_udc.c
+index 6117ae8e7242b..cea10cdb83ae5 100644
+--- a/drivers/usb/gadget/udc/lpc32xx_udc.c
++++ b/drivers/usb/gadget/udc/lpc32xx_udc.c
+@@ -3016,6 +3016,7 @@ static int lpc32xx_udc_probe(struct platform_device *pdev)
+ }
+
+ udc->isp1301_i2c_client = isp1301_get_client(isp1301_node);
++ of_node_put(isp1301_node);
+ if (!udc->isp1301_i2c_client) {
+ return -EPROBE_DEFER;
+ }
+diff --git a/drivers/usb/serial/io_ti.c b/drivers/usb/serial/io_ti.c
+index a7b3c15957ba9..feba2a8d1233a 100644
+--- a/drivers/usb/serial/io_ti.c
++++ b/drivers/usb/serial/io_ti.c
+@@ -166,6 +166,7 @@ static const struct usb_device_id edgeport_2port_id_table[] = {
+ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_TI_EDGEPORT_8S) },
+ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_TI_EDGEPORT_416) },
+ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_TI_EDGEPORT_416B) },
++ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_E5805A) },
+ { }
+ };
+
+@@ -204,6 +205,7 @@ static const struct usb_device_id id_table_combined[] = {
+ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_TI_EDGEPORT_8S) },
+ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_TI_EDGEPORT_416) },
+ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_TI_EDGEPORT_416B) },
++ { USB_DEVICE(USB_VENDOR_ID_ION, ION_DEVICE_ID_E5805A) },
+ { }
+ };
+
+diff --git a/drivers/usb/serial/io_usbvend.h b/drivers/usb/serial/io_usbvend.h
+index 52cbc353051fe..9a6f742ad3abd 100644
+--- a/drivers/usb/serial/io_usbvend.h
++++ b/drivers/usb/serial/io_usbvend.h
+@@ -212,6 +212,7 @@
+ //
+ // Definitions for other product IDs
+ #define ION_DEVICE_ID_MT4X56USB 0x1403 // OEM device
++#define ION_DEVICE_ID_E5805A 0x1A01 // OEM device (rebranded Edgeport/4)
+
+
+ #define GENERATION_ID_FROM_USB_PRODUCT_ID(ProductId) \
+diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
+index e60425bbf5376..ed1e50d83ccab 100644
+--- a/drivers/usb/serial/option.c
++++ b/drivers/usb/serial/option.c
+@@ -432,6 +432,8 @@ static void option_instat_callback(struct urb *urb);
+ #define CINTERION_PRODUCT_CLS8 0x00b0
+ #define CINTERION_PRODUCT_MV31_MBIM 0x00b3
+ #define CINTERION_PRODUCT_MV31_RMNET 0x00b7
++#define CINTERION_PRODUCT_MV31_2_MBIM 0x00b8
++#define CINTERION_PRODUCT_MV31_2_RMNET 0x00b9
+ #define CINTERION_PRODUCT_MV32_WA 0x00f1
+ #define CINTERION_PRODUCT_MV32_WB 0x00f2
+
+@@ -1979,6 +1981,10 @@ static const struct usb_device_id option_ids[] = {
+ .driver_info = RSVD(3)},
+ { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV31_RMNET, 0xff),
+ .driver_info = RSVD(0)},
++ { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV31_2_MBIM, 0xff),
++ .driver_info = RSVD(3)},
++ { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV31_2_RMNET, 0xff),
++ .driver_info = RSVD(0)},
+ { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV32_WA, 0xff),
+ .driver_info = RSVD(3)},
+ { USB_DEVICE_INTERFACE_CLASS(CINTERION_VENDOR_ID, CINTERION_PRODUCT_MV32_WB, 0xff),
+diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
+index 56128b9c46eba..1dd396d4bebb2 100644
+--- a/drivers/virtio/virtio_mmio.c
++++ b/drivers/virtio/virtio_mmio.c
+@@ -688,6 +688,7 @@ static int vm_cmdline_set(const char *device,
+ if (!vm_cmdline_parent_registered) {
+ err = device_register(&vm_cmdline_parent);
+ if (err) {
++ put_device(&vm_cmdline_parent);
+ pr_err("Failed to register parent device!\n");
+ return err;
+ }
+diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
+index d724f676608ba..5046efcffb4c0 100644
+--- a/drivers/virtio/virtio_pci_common.c
++++ b/drivers/virtio/virtio_pci_common.c
+@@ -254,8 +254,7 @@ void vp_del_vqs(struct virtio_device *vdev)
+
+ if (vp_dev->msix_affinity_masks) {
+ for (i = 0; i < vp_dev->msix_vectors; i++)
+- if (vp_dev->msix_affinity_masks[i])
+- free_cpumask_var(vp_dev->msix_affinity_masks[i]);
++ free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+ }
+
+ if (vp_dev->msix_enabled) {
+diff --git a/fs/9p/cache.c b/fs/9p/cache.c
+index 1c8dc696d516f..cebba4eaa0b57 100644
+--- a/fs/9p/cache.c
++++ b/fs/9p/cache.c
+@@ -62,12 +62,12 @@ void v9fs_cache_inode_get_cookie(struct inode *inode)
+ version = cpu_to_le32(v9inode->qid.version);
+ path = cpu_to_le64(v9inode->qid.path);
+ v9ses = v9fs_inode2v9ses(inode);
+- v9inode->netfs_ctx.cache =
++ v9inode->netfs.cache =
+ fscache_acquire_cookie(v9fs_session_cache(v9ses),
+ 0,
+ &path, sizeof(path),
+ &version, sizeof(version),
+- i_size_read(&v9inode->vfs_inode));
++ i_size_read(&v9inode->netfs.inode));
+
+ p9_debug(P9_DEBUG_FSC, "inode %p get cookie %p\n",
+ inode, v9fs_inode_cookie(v9inode));
+diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
+index e28ddf763b3b9..0129de2ea31ae 100644
+--- a/fs/9p/v9fs.c
++++ b/fs/9p/v9fs.c
+@@ -625,7 +625,7 @@ static void v9fs_inode_init_once(void *foo)
+ struct v9fs_inode *v9inode = (struct v9fs_inode *)foo;
+
+ memset(&v9inode->qid, 0, sizeof(v9inode->qid));
+- inode_init_once(&v9inode->vfs_inode);
++ inode_init_once(&v9inode->netfs.inode);
+ }
+
+ /**
+diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
+index ec0e8df3b2eb2..1b219c21d15eb 100644
+--- a/fs/9p/v9fs.h
++++ b/fs/9p/v9fs.h
+@@ -109,11 +109,7 @@ struct v9fs_session_info {
+ #define V9FS_INO_INVALID_ATTR 0x01
+
+ struct v9fs_inode {
+- struct {
+- /* These must be contiguous */
+- struct inode vfs_inode; /* the VFS's inode record */
+- struct netfs_i_context netfs_ctx; /* Netfslib context */
+- };
++ struct netfs_inode netfs; /* Netfslib context and vfs inode */
+ struct p9_qid qid;
+ unsigned int cache_validity;
+ struct p9_fid *writeback_fid;
+@@ -122,13 +118,13 @@ struct v9fs_inode {
+
+ static inline struct v9fs_inode *V9FS_I(const struct inode *inode)
+ {
+- return container_of(inode, struct v9fs_inode, vfs_inode);
++ return container_of(inode, struct v9fs_inode, netfs.inode);
+ }
+
+ static inline struct fscache_cookie *v9fs_inode_cookie(struct v9fs_inode *v9inode)
+ {
+ #ifdef CONFIG_9P_FSCACHE
+- return netfs_i_cookie(&v9inode->vfs_inode);
++ return netfs_i_cookie(&v9inode->netfs.inode);
+ #else
+ return NULL;
+ #endif
+diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
+index 5011281883432..595875228672f 100644
+--- a/fs/9p/vfs_addr.c
++++ b/fs/9p/vfs_addr.c
+@@ -141,7 +141,7 @@ static void v9fs_write_to_cache_done(void *priv, ssize_t transferred_or_error,
+ transferred_or_error != -ENOBUFS) {
+ version = cpu_to_le32(v9inode->qid.version);
+ fscache_invalidate(v9fs_inode_cookie(v9inode), &version,
+- i_size_read(&v9inode->vfs_inode), 0);
++ i_size_read(&v9inode->netfs.inode), 0);
+ }
+ }
+
+diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
+index 55367ecb9442e..e660c6348b9da 100644
+--- a/fs/9p/vfs_inode.c
++++ b/fs/9p/vfs_inode.c
+@@ -234,7 +234,7 @@ struct inode *v9fs_alloc_inode(struct super_block *sb)
+ v9inode->writeback_fid = NULL;
+ v9inode->cache_validity = 0;
+ mutex_init(&v9inode->v_mutex);
+- return &v9inode->vfs_inode;
++ return &v9inode->netfs.inode;
+ }
+
+ /**
+@@ -252,7 +252,7 @@ void v9fs_free_inode(struct inode *inode)
+ */
+ static void v9fs_set_netfs_context(struct inode *inode)
+ {
+- netfs_i_context_init(inode, &v9fs_req_ops);
++ netfs_inode_init(inode, &v9fs_req_ops);
+ }
+
+ int v9fs_init_inode(struct v9fs_session_info *v9ses,
+diff --git a/fs/afs/callback.c b/fs/afs/callback.c
+index 1b4d5809808d0..a484fa6428081 100644
+--- a/fs/afs/callback.c
++++ b/fs/afs/callback.c
+@@ -30,7 +30,7 @@ void afs_invalidate_mmap_work(struct work_struct *work)
+ {
+ struct afs_vnode *vnode = container_of(work, struct afs_vnode, cb_work);
+
+- unmap_mapping_pages(vnode->vfs_inode.i_mapping, 0, 0, false);
++ unmap_mapping_pages(vnode->netfs.inode.i_mapping, 0, 0, false);
+ }
+
+ void afs_server_init_callback_work(struct work_struct *work)
+diff --git a/fs/afs/dir.c b/fs/afs/dir.c
+index bdac73554e6e5..5cbb5234f7ce8 100644
+--- a/fs/afs/dir.c
++++ b/fs/afs/dir.c
+@@ -109,7 +109,7 @@ struct afs_lookup_cookie {
+ */
+ static void afs_dir_read_cleanup(struct afs_read *req)
+ {
+- struct address_space *mapping = req->vnode->vfs_inode.i_mapping;
++ struct address_space *mapping = req->vnode->netfs.inode.i_mapping;
+ struct folio *folio;
+ pgoff_t last = req->nr_pages - 1;
+
+@@ -153,7 +153,7 @@ static bool afs_dir_check_folio(struct afs_vnode *dvnode, struct folio *folio,
+ block = kmap_local_folio(folio, offset);
+ if (block->hdr.magic != AFS_DIR_MAGIC) {
+ printk("kAFS: %s(%lx): [%llx] bad magic %zx/%zx is %04hx\n",
+- __func__, dvnode->vfs_inode.i_ino,
++ __func__, dvnode->netfs.inode.i_ino,
+ pos, offset, size, ntohs(block->hdr.magic));
+ trace_afs_dir_check_failed(dvnode, pos + offset, i_size);
+ kunmap_local(block);
+@@ -183,7 +183,7 @@ error:
+ static void afs_dir_dump(struct afs_vnode *dvnode, struct afs_read *req)
+ {
+ union afs_xdr_dir_block *block;
+- struct address_space *mapping = dvnode->vfs_inode.i_mapping;
++ struct address_space *mapping = dvnode->netfs.inode.i_mapping;
+ struct folio *folio;
+ pgoff_t last = req->nr_pages - 1;
+ size_t offset, size;
+@@ -217,7 +217,7 @@ static void afs_dir_dump(struct afs_vnode *dvnode, struct afs_read *req)
+ */
+ static int afs_dir_check(struct afs_vnode *dvnode, struct afs_read *req)
+ {
+- struct address_space *mapping = dvnode->vfs_inode.i_mapping;
++ struct address_space *mapping = dvnode->netfs.inode.i_mapping;
+ struct folio *folio;
+ pgoff_t last = req->nr_pages - 1;
+ int ret = 0;
+@@ -269,7 +269,7 @@ static int afs_dir_open(struct inode *inode, struct file *file)
+ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
+ __acquires(&dvnode->validate_lock)
+ {
+- struct address_space *mapping = dvnode->vfs_inode.i_mapping;
++ struct address_space *mapping = dvnode->netfs.inode.i_mapping;
+ struct afs_read *req;
+ loff_t i_size;
+ int nr_pages, i;
+@@ -287,7 +287,7 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
+ req->cleanup = afs_dir_read_cleanup;
+
+ expand:
+- i_size = i_size_read(&dvnode->vfs_inode);
++ i_size = i_size_read(&dvnode->netfs.inode);
+ if (i_size < 2048) {
+ ret = afs_bad(dvnode, afs_file_error_dir_small);
+ goto error;
+@@ -305,7 +305,7 @@ expand:
+ req->actual_len = i_size; /* May change */
+ req->len = nr_pages * PAGE_SIZE; /* We can ask for more than there is */
+ req->data_version = dvnode->status.data_version; /* May change */
+- iov_iter_xarray(&req->def_iter, READ, &dvnode->vfs_inode.i_mapping->i_pages,
++ iov_iter_xarray(&req->def_iter, READ, &dvnode->netfs.inode.i_mapping->i_pages,
+ 0, i_size);
+ req->iter = &req->def_iter;
+
+@@ -897,7 +897,7 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry,
+
+ out_op:
+ if (op->error == 0) {
+- inode = &op->file[1].vnode->vfs_inode;
++ inode = &op->file[1].vnode->netfs.inode;
+ op->file[1].vnode = NULL;
+ }
+
+@@ -1139,7 +1139,7 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
+ afs_stat_v(dir, n_reval);
+
+ /* search the directory for this vnode */
+- ret = afs_do_lookup_one(&dir->vfs_inode, dentry, &fid, key, &dir_version);
++ ret = afs_do_lookup_one(&dir->netfs.inode, dentry, &fid, key, &dir_version);
+ switch (ret) {
+ case 0:
+ /* the filename maps to something */
+@@ -1170,7 +1170,7 @@ static int afs_d_revalidate(struct dentry *dentry, unsigned int flags)
+ _debug("%pd: file deleted (uq %u -> %u I:%u)",
+ dentry, fid.unique,
+ vnode->fid.unique,
+- vnode->vfs_inode.i_generation);
++ vnode->netfs.inode.i_generation);
+ goto not_found;
+ }
+ goto out_valid;
+@@ -1368,7 +1368,7 @@ static void afs_dir_remove_subdir(struct dentry *dentry)
+ if (d_really_is_positive(dentry)) {
+ struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
+
+- clear_nlink(&vnode->vfs_inode);
++ clear_nlink(&vnode->netfs.inode);
+ set_bit(AFS_VNODE_DELETED, &vnode->flags);
+ clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
+ clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+@@ -1487,8 +1487,8 @@ static void afs_dir_remove_link(struct afs_operation *op)
+ /* Already done */
+ } else if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) {
+ write_seqlock(&vnode->cb_lock);
+- drop_nlink(&vnode->vfs_inode);
+- if (vnode->vfs_inode.i_nlink == 0) {
++ drop_nlink(&vnode->netfs.inode);
++ if (vnode->netfs.inode.i_nlink == 0) {
+ set_bit(AFS_VNODE_DELETED, &vnode->flags);
+ __afs_break_callback(vnode, afs_cb_break_for_unlink);
+ }
+@@ -1504,7 +1504,7 @@ static void afs_dir_remove_link(struct afs_operation *op)
+ op->error = ret;
+ }
+
+- _debug("nlink %d [val %d]", vnode->vfs_inode.i_nlink, op->error);
++ _debug("nlink %d [val %d]", vnode->netfs.inode.i_nlink, op->error);
+ }
+
+ static void afs_unlink_success(struct afs_operation *op)
+@@ -1680,8 +1680,8 @@ static void afs_link_success(struct afs_operation *op)
+ afs_update_dentry_version(op, dvp, op->dentry);
+ if (op->dentry_2->d_parent == op->dentry->d_parent)
+ afs_update_dentry_version(op, dvp, op->dentry_2);
+- ihold(&vp->vnode->vfs_inode);
+- d_instantiate(op->dentry, &vp->vnode->vfs_inode);
++ ihold(&vp->vnode->netfs.inode);
++ d_instantiate(op->dentry, &vp->vnode->netfs.inode);
+ }
+
+ static void afs_link_put(struct afs_operation *op)
+diff --git a/fs/afs/dir_edit.c b/fs/afs/dir_edit.c
+index d98e109ecee9a..0ab7752d1b758 100644
+--- a/fs/afs/dir_edit.c
++++ b/fs/afs/dir_edit.c
+@@ -109,7 +109,7 @@ static void afs_clear_contig_bits(union afs_xdr_dir_block *block,
+ */
+ static struct folio *afs_dir_get_folio(struct afs_vnode *vnode, pgoff_t index)
+ {
+- struct address_space *mapping = vnode->vfs_inode.i_mapping;
++ struct address_space *mapping = vnode->netfs.inode.i_mapping;
+ struct folio *folio;
+
+ folio = __filemap_get_folio(mapping, index,
+@@ -216,7 +216,7 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
+
+ _enter(",,{%d,%s},", name->len, name->name);
+
+- i_size = i_size_read(&vnode->vfs_inode);
++ i_size = i_size_read(&vnode->netfs.inode);
+ if (i_size > AFS_DIR_BLOCK_SIZE * AFS_DIR_MAX_BLOCKS ||
+ (i_size & (AFS_DIR_BLOCK_SIZE - 1))) {
+ clear_bit(AFS_VNODE_DIR_VALID, &vnode->flags);
+@@ -336,7 +336,7 @@ found_space:
+ if (b < AFS_DIR_BLOCKS_WITH_CTR)
+ meta->meta.alloc_ctrs[b] -= need_slots;
+
+- inode_inc_iversion_raw(&vnode->vfs_inode);
++ inode_inc_iversion_raw(&vnode->netfs.inode);
+ afs_stat_v(vnode, n_dir_cr);
+ _debug("Insert %s in %u[%u]", name->name, b, slot);
+
+@@ -383,7 +383,7 @@ void afs_edit_dir_remove(struct afs_vnode *vnode,
+
+ _enter(",,{%d,%s},", name->len, name->name);
+
+- i_size = i_size_read(&vnode->vfs_inode);
++ i_size = i_size_read(&vnode->netfs.inode);
+ if (i_size < AFS_DIR_BLOCK_SIZE ||
+ i_size > AFS_DIR_BLOCK_SIZE * AFS_DIR_MAX_BLOCKS ||
+ (i_size & (AFS_DIR_BLOCK_SIZE - 1))) {
+@@ -463,7 +463,7 @@ found_dirent:
+ if (b < AFS_DIR_BLOCKS_WITH_CTR)
+ meta->meta.alloc_ctrs[b] += need_slots;
+
+- inode_set_iversion_raw(&vnode->vfs_inode, vnode->status.data_version);
++ inode_set_iversion_raw(&vnode->netfs.inode, vnode->status.data_version);
+ afs_stat_v(vnode, n_dir_rm);
+ _debug("Remove %s from %u[%u]", name->name, b, slot);
+
+diff --git a/fs/afs/dir_silly.c b/fs/afs/dir_silly.c
+index 45cfd50a95210..bb5807e87fa4c 100644
+--- a/fs/afs/dir_silly.c
++++ b/fs/afs/dir_silly.c
+@@ -131,7 +131,7 @@ int afs_sillyrename(struct afs_vnode *dvnode, struct afs_vnode *vnode,
+ goto out;
+ } while (!d_is_negative(sdentry));
+
+- ihold(&vnode->vfs_inode);
++ ihold(&vnode->netfs.inode);
+
+ ret = afs_do_silly_rename(dvnode, vnode, dentry, sdentry, key);
+ switch (ret) {
+@@ -148,7 +148,7 @@ int afs_sillyrename(struct afs_vnode *dvnode, struct afs_vnode *vnode,
+ d_drop(sdentry);
+ }
+
+- iput(&vnode->vfs_inode);
++ iput(&vnode->netfs.inode);
+ dput(sdentry);
+ out:
+ _leave(" = %d", ret);
+diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c
+index f120bcb8bf738..3a5bbffdf0534 100644
+--- a/fs/afs/dynroot.c
++++ b/fs/afs/dynroot.c
+@@ -76,7 +76,7 @@ struct inode *afs_iget_pseudo_dir(struct super_block *sb, bool root)
+ /* there shouldn't be an existing inode */
+ BUG_ON(!(inode->i_state & I_NEW));
+
+- netfs_i_context_init(inode, NULL);
++ netfs_inode_init(inode, NULL);
+ inode->i_size = 0;
+ inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO;
+ if (root) {
+diff --git a/fs/afs/file.c b/fs/afs/file.c
+index 26292a110a8f9..fab8324833ba0 100644
+--- a/fs/afs/file.c
++++ b/fs/afs/file.c
+@@ -194,7 +194,7 @@ int afs_release(struct inode *inode, struct file *file)
+ afs_put_wb_key(af->wb);
+
+ if ((file->f_mode & FMODE_WRITE)) {
+- i_size = i_size_read(&vnode->vfs_inode);
++ i_size = i_size_read(&vnode->netfs.inode);
+ afs_set_cache_aux(vnode, &aux);
+ fscache_unuse_cookie(afs_vnode_cache(vnode), &aux, &i_size);
+ } else {
+@@ -325,7 +325,7 @@ static void afs_issue_read(struct netfs_io_subrequest *subreq)
+ fsreq->iter = &fsreq->def_iter;
+
+ iov_iter_xarray(&fsreq->def_iter, READ,
+- &fsreq->vnode->vfs_inode.i_mapping->i_pages,
++ &fsreq->vnode->netfs.inode.i_mapping->i_pages,
+ fsreq->pos, fsreq->len);
+
+ afs_fetch_data(fsreq->vnode, fsreq);
+diff --git a/fs/afs/fs_operation.c b/fs/afs/fs_operation.c
+index d222dfbe976be..7a3803ce3a229 100644
+--- a/fs/afs/fs_operation.c
++++ b/fs/afs/fs_operation.c
+@@ -232,14 +232,14 @@ int afs_put_operation(struct afs_operation *op)
+ if (op->file[1].modification && op->file[1].vnode != op->file[0].vnode)
+ clear_bit(AFS_VNODE_MODIFYING, &op->file[1].vnode->flags);
+ if (op->file[0].put_vnode)
+- iput(&op->file[0].vnode->vfs_inode);
++ iput(&op->file[0].vnode->netfs.inode);
+ if (op->file[1].put_vnode)
+- iput(&op->file[1].vnode->vfs_inode);
++ iput(&op->file[1].vnode->netfs.inode);
+
+ if (op->more_files) {
+ for (i = 0; i < op->nr_files - 2; i++)
+ if (op->more_files[i].put_vnode)
+- iput(&op->more_files[i].vnode->vfs_inode);
++ iput(&op->more_files[i].vnode->netfs.inode);
+ kfree(op->more_files);
+ }
+
+diff --git a/fs/afs/inode.c b/fs/afs/inode.c
+index 30b066299d39f..22811e9eacf58 100644
+--- a/fs/afs/inode.c
++++ b/fs/afs/inode.c
+@@ -58,7 +58,7 @@ static noinline void dump_vnode(struct afs_vnode *vnode, struct afs_vnode *paren
+ */
+ static void afs_set_netfs_context(struct afs_vnode *vnode)
+ {
+- netfs_i_context_init(&vnode->vfs_inode, &afs_req_ops);
++ netfs_inode_init(&vnode->netfs.inode, &afs_req_ops);
+ }
+
+ /*
+@@ -96,7 +96,7 @@ static int afs_inode_init_from_status(struct afs_operation *op,
+ inode->i_flags |= S_NOATIME;
+ inode->i_uid = make_kuid(&init_user_ns, status->owner);
+ inode->i_gid = make_kgid(&init_user_ns, status->group);
+- set_nlink(&vnode->vfs_inode, status->nlink);
++ set_nlink(&vnode->netfs.inode, status->nlink);
+
+ switch (status->type) {
+ case AFS_FTYPE_FILE:
+@@ -139,7 +139,7 @@ static int afs_inode_init_from_status(struct afs_operation *op,
+ afs_set_netfs_context(vnode);
+
+ vnode->invalid_before = status->data_version;
+- inode_set_iversion_raw(&vnode->vfs_inode, status->data_version);
++ inode_set_iversion_raw(&vnode->netfs.inode, status->data_version);
+
+ if (!vp->scb.have_cb) {
+ /* it's a symlink we just created (the fileserver
+@@ -163,7 +163,7 @@ static void afs_apply_status(struct afs_operation *op,
+ {
+ struct afs_file_status *status = &vp->scb.status;
+ struct afs_vnode *vnode = vp->vnode;
+- struct inode *inode = &vnode->vfs_inode;
++ struct inode *inode = &vnode->netfs.inode;
+ struct timespec64 t;
+ umode_t mode;
+ bool data_changed = false;
+@@ -246,7 +246,7 @@ static void afs_apply_status(struct afs_operation *op,
+ * idea of what the size should be that's not the same as
+ * what's on the server.
+ */
+- vnode->netfs_ctx.remote_i_size = status->size;
++ vnode->netfs.remote_i_size = status->size;
+ if (change_size) {
+ afs_set_i_size(vnode, status->size);
+ inode->i_ctime = t;
+@@ -289,7 +289,7 @@ void afs_vnode_commit_status(struct afs_operation *op, struct afs_vnode_param *v
+ */
+ if (vp->scb.status.abort_code == VNOVNODE) {
+ set_bit(AFS_VNODE_DELETED, &vnode->flags);
+- clear_nlink(&vnode->vfs_inode);
++ clear_nlink(&vnode->netfs.inode);
+ __afs_break_callback(vnode, afs_cb_break_for_deleted);
+ op->flags &= ~AFS_OPERATION_DIR_CONFLICT;
+ }
+@@ -306,8 +306,8 @@ void afs_vnode_commit_status(struct afs_operation *op, struct afs_vnode_param *v
+ if (vp->scb.have_cb)
+ afs_apply_callback(op, vp);
+ } else if (vp->op_unlinked && !(op->flags & AFS_OPERATION_DIR_CONFLICT)) {
+- drop_nlink(&vnode->vfs_inode);
+- if (vnode->vfs_inode.i_nlink == 0) {
++ drop_nlink(&vnode->netfs.inode);
++ if (vnode->netfs.inode.i_nlink == 0) {
+ set_bit(AFS_VNODE_DELETED, &vnode->flags);
+ __afs_break_callback(vnode, afs_cb_break_for_deleted);
+ }
+@@ -326,7 +326,7 @@ static void afs_fetch_status_success(struct afs_operation *op)
+ struct afs_vnode *vnode = vp->vnode;
+ int ret;
+
+- if (vnode->vfs_inode.i_state & I_NEW) {
++ if (vnode->netfs.inode.i_state & I_NEW) {
+ ret = afs_inode_init_from_status(op, vp, vnode);
+ op->error = ret;
+ if (ret == 0)
+@@ -430,7 +430,7 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
+ struct afs_vnode_cache_aux aux;
+
+ if (vnode->status.type != AFS_FTYPE_FILE) {
+- vnode->netfs_ctx.cache = NULL;
++ vnode->netfs.cache = NULL;
+ return;
+ }
+
+@@ -457,7 +457,7 @@ static void afs_get_inode_cache(struct afs_vnode *vnode)
+ struct inode *afs_iget(struct afs_operation *op, struct afs_vnode_param *vp)
+ {
+ struct afs_vnode_param *dvp = &op->file[0];
+- struct super_block *sb = dvp->vnode->vfs_inode.i_sb;
++ struct super_block *sb = dvp->vnode->netfs.inode.i_sb;
+ struct afs_vnode *vnode;
+ struct inode *inode;
+ int ret;
+@@ -582,10 +582,10 @@ static void afs_zap_data(struct afs_vnode *vnode)
+ /* nuke all the non-dirty pages that aren't locked, mapped or being
+ * written back in a regular file and completely discard the pages in a
+ * directory or symlink */
+- if (S_ISREG(vnode->vfs_inode.i_mode))
+- invalidate_remote_inode(&vnode->vfs_inode);
++ if (S_ISREG(vnode->netfs.inode.i_mode))
++ invalidate_remote_inode(&vnode->netfs.inode);
+ else
+- invalidate_inode_pages2(vnode->vfs_inode.i_mapping);
++ invalidate_inode_pages2(vnode->netfs.inode.i_mapping);
+ }
+
+ /*
+@@ -683,8 +683,8 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
+ key_serial(key));
+
+ if (unlikely(test_bit(AFS_VNODE_DELETED, &vnode->flags))) {
+- if (vnode->vfs_inode.i_nlink)
+- clear_nlink(&vnode->vfs_inode);
++ if (vnode->netfs.inode.i_nlink)
++ clear_nlink(&vnode->netfs.inode);
+ goto valid;
+ }
+
+@@ -826,7 +826,7 @@ void afs_evict_inode(struct inode *inode)
+ static void afs_setattr_success(struct afs_operation *op)
+ {
+ struct afs_vnode_param *vp = &op->file[0];
+- struct inode *inode = &vp->vnode->vfs_inode;
++ struct inode *inode = &vp->vnode->netfs.inode;
+ loff_t old_i_size = i_size_read(inode);
+
+ op->setattr.old_i_size = old_i_size;
+@@ -843,7 +843,7 @@ static void afs_setattr_success(struct afs_operation *op)
+ static void afs_setattr_edit_file(struct afs_operation *op)
+ {
+ struct afs_vnode_param *vp = &op->file[0];
+- struct inode *inode = &vp->vnode->vfs_inode;
++ struct inode *inode = &vp->vnode->netfs.inode;
+
+ if (op->setattr.attr->ia_valid & ATTR_SIZE) {
+ loff_t size = op->setattr.attr->ia_size;
+@@ -875,7 +875,7 @@ int afs_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
+ ATTR_MTIME | ATTR_MTIME_SET | ATTR_TIMES_SET | ATTR_TOUCH;
+ struct afs_operation *op;
+ struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
+- struct inode *inode = &vnode->vfs_inode;
++ struct inode *inode = &vnode->netfs.inode;
+ loff_t i_size;
+ int ret;
+
+diff --git a/fs/afs/internal.h b/fs/afs/internal.h
+index 7b7ef945dc78e..40518b5857609 100644
+--- a/fs/afs/internal.h
++++ b/fs/afs/internal.h
+@@ -619,12 +619,7 @@ enum afs_lock_state {
+ * leak from one inode to another.
+ */
+ struct afs_vnode {
+- struct {
+- /* These must be contiguous */
+- struct inode vfs_inode; /* the VFS's inode record */
+- struct netfs_i_context netfs_ctx; /* Netfslib context */
+- };
+-
++ struct netfs_inode netfs; /* Netfslib context and vfs inode */
+ struct afs_volume *volume; /* volume on which vnode resides */
+ struct afs_fid fid; /* the file identifier for this inode */
+ struct afs_file_status status; /* AFS status info for this file */
+@@ -675,7 +670,7 @@ struct afs_vnode {
+ static inline struct fscache_cookie *afs_vnode_cache(struct afs_vnode *vnode)
+ {
+ #ifdef CONFIG_AFS_FSCACHE
+- return netfs_i_cookie(&vnode->vfs_inode);
++ return netfs_i_cookie(&vnode->netfs.inode);
+ #else
+ return NULL;
+ #endif
+@@ -685,7 +680,7 @@ static inline void afs_vnode_set_cache(struct afs_vnode *vnode,
+ struct fscache_cookie *cookie)
+ {
+ #ifdef CONFIG_AFS_FSCACHE
+- vnode->netfs_ctx.cache = cookie;
++ vnode->netfs.cache = cookie;
+ #endif
+ }
+
+@@ -892,7 +887,7 @@ static inline void afs_invalidate_cache(struct afs_vnode *vnode, unsigned int fl
+
+ afs_set_cache_aux(vnode, &aux);
+ fscache_invalidate(afs_vnode_cache(vnode), &aux,
+- i_size_read(&vnode->vfs_inode), flags);
++ i_size_read(&vnode->netfs.inode), flags);
+ }
+
+ /*
+@@ -1217,7 +1212,7 @@ static inline struct afs_net *afs_i2net(struct inode *inode)
+
+ static inline struct afs_net *afs_v2net(struct afs_vnode *vnode)
+ {
+- return afs_i2net(&vnode->vfs_inode);
++ return afs_i2net(&vnode->netfs.inode);
+ }
+
+ static inline struct afs_net *afs_sock2net(struct sock *sk)
+@@ -1593,12 +1588,12 @@ extern void yfs_fs_store_opaque_acl2(struct afs_operation *);
+ */
+ static inline struct afs_vnode *AFS_FS_I(struct inode *inode)
+ {
+- return container_of(inode, struct afs_vnode, vfs_inode);
++ return container_of(inode, struct afs_vnode, netfs.inode);
+ }
+
+ static inline struct inode *AFS_VNODE_TO_I(struct afs_vnode *vnode)
+ {
+- return &vnode->vfs_inode;
++ return &vnode->netfs.inode;
+ }
+
+ /*
+@@ -1621,8 +1616,8 @@ static inline void afs_update_dentry_version(struct afs_operation *op,
+ */
+ static inline void afs_set_i_size(struct afs_vnode *vnode, u64 size)
+ {
+- i_size_write(&vnode->vfs_inode, size);
+- vnode->vfs_inode.i_blocks = ((size + 1023) >> 10) << 1;
++ i_size_write(&vnode->netfs.inode, size);
++ vnode->netfs.inode.i_blocks = ((size + 1023) >> 10) << 1;
+ }
+
+ /*
+diff --git a/fs/afs/super.c b/fs/afs/super.c
+index 1fea195b0b275..95d713074dc81 100644
+--- a/fs/afs/super.c
++++ b/fs/afs/super.c
+@@ -659,7 +659,7 @@ static void afs_i_init_once(void *_vnode)
+ struct afs_vnode *vnode = _vnode;
+
+ memset(vnode, 0, sizeof(*vnode));
+- inode_init_once(&vnode->vfs_inode);
++ inode_init_once(&vnode->netfs.inode);
+ mutex_init(&vnode->io_lock);
+ init_rwsem(&vnode->validate_lock);
+ spin_lock_init(&vnode->wb_lock);
+@@ -700,8 +700,8 @@ static struct inode *afs_alloc_inode(struct super_block *sb)
+ init_rwsem(&vnode->rmdir_lock);
+ INIT_WORK(&vnode->cb_work, afs_invalidate_mmap_work);
+
+- _leave(" = %p", &vnode->vfs_inode);
+- return &vnode->vfs_inode;
++ _leave(" = %p", &vnode->netfs.inode);
++ return &vnode->netfs.inode;
+ }
+
+ static void afs_free_inode(struct inode *inode)
+diff --git a/fs/afs/write.c b/fs/afs/write.c
+index c1bc52ac7de11..270e41ac9af4f 100644
+--- a/fs/afs/write.c
++++ b/fs/afs/write.c
+@@ -146,10 +146,10 @@ int afs_write_end(struct file *file, struct address_space *mapping,
+
+ write_end_pos = pos + copied;
+
+- i_size = i_size_read(&vnode->vfs_inode);
++ i_size = i_size_read(&vnode->netfs.inode);
+ if (write_end_pos > i_size) {
+ write_seqlock(&vnode->cb_lock);
+- i_size = i_size_read(&vnode->vfs_inode);
++ i_size = i_size_read(&vnode->netfs.inode);
+ if (write_end_pos > i_size)
+ afs_set_i_size(vnode, write_end_pos);
+ write_sequnlock(&vnode->cb_lock);
+@@ -257,7 +257,7 @@ static void afs_redirty_pages(struct writeback_control *wbc,
+ */
+ static void afs_pages_written_back(struct afs_vnode *vnode, loff_t start, unsigned int len)
+ {
+- struct address_space *mapping = vnode->vfs_inode.i_mapping;
++ struct address_space *mapping = vnode->netfs.inode.i_mapping;
+ struct folio *folio;
+ pgoff_t end;
+
+@@ -354,7 +354,6 @@ static const struct afs_operation_ops afs_store_data_operation = {
+ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t pos,
+ bool laundering)
+ {
+- struct netfs_i_context *ictx = &vnode->netfs_ctx;
+ struct afs_operation *op;
+ struct afs_wb_key *wbk = NULL;
+ loff_t size = iov_iter_count(iter);
+@@ -385,9 +384,9 @@ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t
+ op->store.write_iter = iter;
+ op->store.pos = pos;
+ op->store.size = size;
+- op->store.i_size = max(pos + size, ictx->remote_i_size);
++ op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
+ op->store.laundering = laundering;
+- op->mtime = vnode->vfs_inode.i_mtime;
++ op->mtime = vnode->netfs.inode.i_mtime;
+ op->flags |= AFS_OPERATION_UNINTR;
+ op->ops = &afs_store_data_operation;
+
+@@ -554,7 +553,7 @@ static ssize_t afs_write_back_from_locked_folio(struct address_space *mapping,
+ struct iov_iter iter;
+ unsigned long priv;
+ unsigned int offset, to, len, max_len;
+- loff_t i_size = i_size_read(&vnode->vfs_inode);
++ loff_t i_size = i_size_read(&vnode->netfs.inode);
+ bool new_content = test_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
+ bool caching = fscache_cookie_enabled(afs_vnode_cache(vnode));
+ long count = wbc->nr_to_write;
+@@ -845,7 +844,7 @@ ssize_t afs_file_write(struct kiocb *iocb, struct iov_iter *from)
+ _enter("{%llx:%llu},{%zu},",
+ vnode->fid.vid, vnode->fid.vnode, count);
+
+- if (IS_SWAPFILE(&vnode->vfs_inode)) {
++ if (IS_SWAPFILE(&vnode->netfs.inode)) {
+ printk(KERN_INFO
+ "AFS: Attempt to write to active swap file!\n");
+ return -EBUSY;
+@@ -958,8 +957,8 @@ void afs_prune_wb_keys(struct afs_vnode *vnode)
+ /* Discard unused keys */
+ spin_lock(&vnode->wb_lock);
+
+- if (!mapping_tagged(&vnode->vfs_inode.i_data, PAGECACHE_TAG_WRITEBACK) &&
+- !mapping_tagged(&vnode->vfs_inode.i_data, PAGECACHE_TAG_DIRTY)) {
++ if (!mapping_tagged(&vnode->netfs.inode.i_data, PAGECACHE_TAG_WRITEBACK) &&
++ !mapping_tagged(&vnode->netfs.inode.i_data, PAGECACHE_TAG_DIRTY)) {
+ list_for_each_entry_safe(wbk, tmp, &vnode->wb_keys, vnode_link) {
+ if (refcount_read(&wbk->usage) == 1)
+ list_move(&wbk->vnode_link, &graveyard);
+@@ -1034,6 +1033,6 @@ static void afs_write_to_cache(struct afs_vnode *vnode,
+ bool caching)
+ {
+ fscache_write_to_cache(afs_vnode_cache(vnode),
+- vnode->vfs_inode.i_mapping, start, len, i_size,
++ vnode->netfs.inode.i_mapping, start, len, i_size,
+ afs_write_to_cache_done, vnode, caching);
+ }
+diff --git a/fs/attr.c b/fs/attr.c
+index 66899b6e9bd86..dbe996b0dedfc 100644
+--- a/fs/attr.c
++++ b/fs/attr.c
+@@ -61,9 +61,15 @@ static bool chgrp_ok(struct user_namespace *mnt_userns,
+ const struct inode *inode, kgid_t gid)
+ {
+ kgid_t kgid = i_gid_into_mnt(mnt_userns, inode);
+- if (uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) &&
+- (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
+- return true;
++ if (uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode))) {
++ kgid_t mapped_gid;
++
++ if (gid_eq(gid, inode->i_gid))
++ return true;
++ mapped_gid = mapped_kgid_fs(mnt_userns, i_user_ns(inode), gid);
++ if (in_group_p(mapped_gid))
++ return true;
++ }
+ if (capable_wrt_inode_uidgid(mnt_userns, inode, CAP_CHOWN))
+ return true;
+ if (gid_eq(kgid, INVALID_GID) &&
+@@ -123,12 +129,20 @@ int setattr_prepare(struct user_namespace *mnt_userns, struct dentry *dentry,
+
+ /* Make sure a caller can chmod. */
+ if (ia_valid & ATTR_MODE) {
++ kgid_t mapped_gid;
++
+ if (!inode_owner_or_capable(mnt_userns, inode))
+ return -EPERM;
++
++ if (ia_valid & ATTR_GID)
++ mapped_gid = mapped_kgid_fs(mnt_userns,
++ i_user_ns(inode), attr->ia_gid);
++ else
++ mapped_gid = i_gid_into_mnt(mnt_userns, inode);
++
+ /* Also check the setgid bit! */
+- if (!in_group_p((ia_valid & ATTR_GID) ? attr->ia_gid :
+- i_gid_into_mnt(mnt_userns, inode)) &&
+- !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
++ if (!in_group_p(mapped_gid) &&
++ !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
+ attr->ia_mode &= ~S_ISGID;
+ }
+
+diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
+index adef10a6e5c7b..11dbb1133a21e 100644
+--- a/fs/ceph/addr.c
++++ b/fs/ceph/addr.c
+@@ -1795,7 +1795,7 @@ enum {
+ static int __ceph_pool_perm_get(struct ceph_inode_info *ci,
+ s64 pool, struct ceph_string *pool_ns)
+ {
+- struct ceph_fs_client *fsc = ceph_inode_to_client(&ci->vfs_inode);
++ struct ceph_fs_client *fsc = ceph_inode_to_client(&ci->netfs.inode);
+ struct ceph_mds_client *mdsc = fsc->mdsc;
+ struct ceph_osd_request *rd_req = NULL, *wr_req = NULL;
+ struct rb_node **p, *parent;
+@@ -1910,7 +1910,7 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci,
+ 0, false, true);
+ err = ceph_osdc_start_request(&fsc->client->osdc, rd_req, false);
+
+- wr_req->r_mtime = ci->vfs_inode.i_mtime;
++ wr_req->r_mtime = ci->netfs.inode.i_mtime;
+ err2 = ceph_osdc_start_request(&fsc->client->osdc, wr_req, false);
+
+ if (!err)
+diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c
+index ddea999220739..177d8e8d73fe4 100644
+--- a/fs/ceph/cache.c
++++ b/fs/ceph/cache.c
+@@ -29,9 +29,9 @@ void ceph_fscache_register_inode_cookie(struct inode *inode)
+ if (!(inode->i_state & I_NEW))
+ return;
+
+- WARN_ON_ONCE(ci->netfs_ctx.cache);
++ WARN_ON_ONCE(ci->netfs.cache);
+
+- ci->netfs_ctx.cache =
++ ci->netfs.cache =
+ fscache_acquire_cookie(fsc->fscache, 0,
+ &ci->i_vino, sizeof(ci->i_vino),
+ &ci->i_version, sizeof(ci->i_version),
+diff --git a/fs/ceph/cache.h b/fs/ceph/cache.h
+index 7255b790a4c1c..26c6ae06e2f41 100644
+--- a/fs/ceph/cache.h
++++ b/fs/ceph/cache.h
+@@ -28,7 +28,7 @@ void ceph_fscache_invalidate(struct inode *inode, bool dio_write);
+
+ static inline struct fscache_cookie *ceph_fscache_cookie(struct ceph_inode_info *ci)
+ {
+- return netfs_i_cookie(&ci->vfs_inode);
++ return netfs_i_cookie(&ci->netfs.inode);
+ }
+
+ static inline void ceph_fscache_resize(struct inode *inode, loff_t to)
+diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
+index 5c14ef04e4742..a0467bca39fa7 100644
+--- a/fs/ceph/caps.c
++++ b/fs/ceph/caps.c
+@@ -492,7 +492,7 @@ static void __cap_set_timeouts(struct ceph_mds_client *mdsc,
+ struct ceph_mount_options *opt = mdsc->fsc->mount_options;
+ ci->i_hold_caps_max = round_jiffies(jiffies +
+ opt->caps_wanted_delay_max * HZ);
+- dout("__cap_set_timeouts %p %lu\n", &ci->vfs_inode,
++ dout("__cap_set_timeouts %p %lu\n", &ci->netfs.inode,
+ ci->i_hold_caps_max - jiffies);
+ }
+
+@@ -507,7 +507,7 @@ static void __cap_set_timeouts(struct ceph_mds_client *mdsc,
+ static void __cap_delay_requeue(struct ceph_mds_client *mdsc,
+ struct ceph_inode_info *ci)
+ {
+- dout("__cap_delay_requeue %p flags 0x%lx at %lu\n", &ci->vfs_inode,
++ dout("__cap_delay_requeue %p flags 0x%lx at %lu\n", &ci->netfs.inode,
+ ci->i_ceph_flags, ci->i_hold_caps_max);
+ if (!mdsc->stopping) {
+ spin_lock(&mdsc->cap_delay_lock);
+@@ -531,7 +531,7 @@ no_change:
+ static void __cap_delay_requeue_front(struct ceph_mds_client *mdsc,
+ struct ceph_inode_info *ci)
+ {
+- dout("__cap_delay_requeue_front %p\n", &ci->vfs_inode);
++ dout("__cap_delay_requeue_front %p\n", &ci->netfs.inode);
+ spin_lock(&mdsc->cap_delay_lock);
+ ci->i_ceph_flags |= CEPH_I_FLUSH;
+ if (!list_empty(&ci->i_cap_delay_list))
+@@ -548,7 +548,7 @@ static void __cap_delay_requeue_front(struct ceph_mds_client *mdsc,
+ static void __cap_delay_cancel(struct ceph_mds_client *mdsc,
+ struct ceph_inode_info *ci)
+ {
+- dout("__cap_delay_cancel %p\n", &ci->vfs_inode);
++ dout("__cap_delay_cancel %p\n", &ci->netfs.inode);
+ if (list_empty(&ci->i_cap_delay_list))
+ return;
+ spin_lock(&mdsc->cap_delay_lock);
+@@ -568,7 +568,7 @@ static void __check_cap_issue(struct ceph_inode_info *ci, struct ceph_cap *cap,
+ * Each time we receive FILE_CACHE anew, we increment
+ * i_rdcache_gen.
+ */
+- if (S_ISREG(ci->vfs_inode.i_mode) &&
++ if (S_ISREG(ci->netfs.inode.i_mode) &&
+ (issued & (CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO)) &&
+ (had & (CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO)) == 0) {
+ ci->i_rdcache_gen++;
+@@ -583,14 +583,14 @@ static void __check_cap_issue(struct ceph_inode_info *ci, struct ceph_cap *cap,
+ if ((issued & CEPH_CAP_FILE_SHARED) != (had & CEPH_CAP_FILE_SHARED)) {
+ if (issued & CEPH_CAP_FILE_SHARED)
+ atomic_inc(&ci->i_shared_gen);
+- if (S_ISDIR(ci->vfs_inode.i_mode)) {
+- dout(" marking %p NOT complete\n", &ci->vfs_inode);
++ if (S_ISDIR(ci->netfs.inode.i_mode)) {
++ dout(" marking %p NOT complete\n", &ci->netfs.inode);
+ __ceph_dir_clear_complete(ci);
+ }
+ }
+
+ /* Wipe saved layout if we're losing DIR_CREATE caps */
+- if (S_ISDIR(ci->vfs_inode.i_mode) && (had & CEPH_CAP_DIR_CREATE) &&
++ if (S_ISDIR(ci->netfs.inode.i_mode) && (had & CEPH_CAP_DIR_CREATE) &&
+ !(issued & CEPH_CAP_DIR_CREATE)) {
+ ceph_put_string(rcu_dereference_raw(ci->i_cached_layout.pool_ns));
+ memset(&ci->i_cached_layout, 0, sizeof(ci->i_cached_layout));
+@@ -771,7 +771,7 @@ static int __cap_is_valid(struct ceph_cap *cap)
+
+ if (cap->cap_gen < gen || time_after_eq(jiffies, ttl)) {
+ dout("__cap_is_valid %p cap %p issued %s "
+- "but STALE (gen %u vs %u)\n", &cap->ci->vfs_inode,
++ "but STALE (gen %u vs %u)\n", &cap->ci->netfs.inode,
+ cap, ceph_cap_string(cap->issued), cap->cap_gen, gen);
+ return 0;
+ }
+@@ -797,7 +797,7 @@ int __ceph_caps_issued(struct ceph_inode_info *ci, int *implemented)
+ if (!__cap_is_valid(cap))
+ continue;
+ dout("__ceph_caps_issued %p cap %p issued %s\n",
+- &ci->vfs_inode, cap, ceph_cap_string(cap->issued));
++ &ci->netfs.inode, cap, ceph_cap_string(cap->issued));
+ have |= cap->issued;
+ if (implemented)
+ *implemented |= cap->implemented;
+@@ -844,12 +844,12 @@ static void __touch_cap(struct ceph_cap *cap)
+
+ spin_lock(&s->s_cap_lock);
+ if (!s->s_cap_iterator) {
+- dout("__touch_cap %p cap %p mds%d\n", &cap->ci->vfs_inode, cap,
++ dout("__touch_cap %p cap %p mds%d\n", &cap->ci->netfs.inode, cap,
+ s->s_mds);
+ list_move_tail(&cap->session_caps, &s->s_caps);
+ } else {
+ dout("__touch_cap %p cap %p mds%d NOP, iterating over caps\n",
+- &cap->ci->vfs_inode, cap, s->s_mds);
++ &cap->ci->netfs.inode, cap, s->s_mds);
+ }
+ spin_unlock(&s->s_cap_lock);
+ }
+@@ -867,7 +867,7 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int touch)
+
+ if ((have & mask) == mask) {
+ dout("__ceph_caps_issued_mask ino 0x%llx snap issued %s"
+- " (mask %s)\n", ceph_ino(&ci->vfs_inode),
++ " (mask %s)\n", ceph_ino(&ci->netfs.inode),
+ ceph_cap_string(have),
+ ceph_cap_string(mask));
+ return 1;
+@@ -879,7 +879,7 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int touch)
+ continue;
+ if ((cap->issued & mask) == mask) {
+ dout("__ceph_caps_issued_mask ino 0x%llx cap %p issued %s"
+- " (mask %s)\n", ceph_ino(&ci->vfs_inode), cap,
++ " (mask %s)\n", ceph_ino(&ci->netfs.inode), cap,
+ ceph_cap_string(cap->issued),
+ ceph_cap_string(mask));
+ if (touch)
+@@ -891,7 +891,7 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int touch)
+ have |= cap->issued;
+ if ((have & mask) == mask) {
+ dout("__ceph_caps_issued_mask ino 0x%llx combo issued %s"
+- " (mask %s)\n", ceph_ino(&ci->vfs_inode),
++ " (mask %s)\n", ceph_ino(&ci->netfs.inode),
+ ceph_cap_string(cap->issued),
+ ceph_cap_string(mask));
+ if (touch) {
+@@ -919,7 +919,7 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, int mask, int touch)
+ int __ceph_caps_issued_mask_metric(struct ceph_inode_info *ci, int mask,
+ int touch)
+ {
+- struct ceph_fs_client *fsc = ceph_sb_to_client(ci->vfs_inode.i_sb);
++ struct ceph_fs_client *fsc = ceph_sb_to_client(ci->netfs.inode.i_sb);
+ int r;
+
+ r = __ceph_caps_issued_mask(ci, mask, touch);
+@@ -950,7 +950,7 @@ int __ceph_caps_revoking_other(struct ceph_inode_info *ci,
+
+ int ceph_caps_revoking(struct ceph_inode_info *ci, int mask)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ int ret;
+
+ spin_lock(&ci->i_ceph_lock);
+@@ -969,8 +969,8 @@ int __ceph_caps_used(struct ceph_inode_info *ci)
+ if (ci->i_rd_ref)
+ used |= CEPH_CAP_FILE_RD;
+ if (ci->i_rdcache_ref ||
+- (S_ISREG(ci->vfs_inode.i_mode) &&
+- ci->vfs_inode.i_data.nrpages))
++ (S_ISREG(ci->netfs.inode.i_mode) &&
++ ci->netfs.inode.i_data.nrpages))
+ used |= CEPH_CAP_FILE_CACHE;
+ if (ci->i_wr_ref)
+ used |= CEPH_CAP_FILE_WR;
+@@ -993,11 +993,11 @@ int __ceph_caps_file_wanted(struct ceph_inode_info *ci)
+ const int WR_SHIFT = ffs(CEPH_FILE_MODE_WR);
+ const int LAZY_SHIFT = ffs(CEPH_FILE_MODE_LAZY);
+ struct ceph_mount_options *opt =
+- ceph_inode_to_client(&ci->vfs_inode)->mount_options;
++ ceph_inode_to_client(&ci->netfs.inode)->mount_options;
+ unsigned long used_cutoff = jiffies - opt->caps_wanted_delay_max * HZ;
+ unsigned long idle_cutoff = jiffies - opt->caps_wanted_delay_min * HZ;
+
+- if (S_ISDIR(ci->vfs_inode.i_mode)) {
++ if (S_ISDIR(ci->netfs.inode.i_mode)) {
+ int want = 0;
+
+ /* use used_cutoff here, to keep dir's wanted caps longer */
+@@ -1050,7 +1050,7 @@ int __ceph_caps_file_wanted(struct ceph_inode_info *ci)
+ int __ceph_caps_wanted(struct ceph_inode_info *ci)
+ {
+ int w = __ceph_caps_file_wanted(ci) | __ceph_caps_used(ci);
+- if (S_ISDIR(ci->vfs_inode.i_mode)) {
++ if (S_ISDIR(ci->netfs.inode.i_mode)) {
+ /* we want EXCL if holding caps of dir ops */
+ if (w & CEPH_CAP_ANY_DIR_OPS)
+ w |= CEPH_CAP_FILE_EXCL;
+@@ -1116,9 +1116,9 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
+
+ lockdep_assert_held(&ci->i_ceph_lock);
+
+- dout("__ceph_remove_cap %p from %p\n", cap, &ci->vfs_inode);
++ dout("__ceph_remove_cap %p from %p\n", cap, &ci->netfs.inode);
+
+- mdsc = ceph_inode_to_client(&ci->vfs_inode)->mdsc;
++ mdsc = ceph_inode_to_client(&ci->netfs.inode)->mdsc;
+
+ /* remove from inode's cap rbtree, and clear auth cap */
+ rb_erase(&cap->ci_node, &ci->i_caps);
+@@ -1169,7 +1169,7 @@ void __ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
+ * keep i_snap_realm.
+ */
+ if (ci->i_wr_ref == 0 && ci->i_snap_realm)
+- ceph_change_snap_realm(&ci->vfs_inode, NULL);
++ ceph_change_snap_realm(&ci->netfs.inode, NULL);
+
+ __cap_delay_cancel(mdsc, ci);
+ }
+@@ -1188,11 +1188,11 @@ void ceph_remove_cap(struct ceph_cap *cap, bool queue_release)
+
+ lockdep_assert_held(&ci->i_ceph_lock);
+
+- fsc = ceph_inode_to_client(&ci->vfs_inode);
++ fsc = ceph_inode_to_client(&ci->netfs.inode);
+ WARN_ON_ONCE(ci->i_auth_cap == cap &&
+ !list_empty(&ci->i_dirty_item) &&
+ !fsc->blocklisted &&
+- !ceph_inode_is_shutdown(&ci->vfs_inode));
++ !ceph_inode_is_shutdown(&ci->netfs.inode));
+
+ __ceph_remove_cap(cap, queue_release);
+ }
+@@ -1343,7 +1343,7 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap,
+ int flushing, u64 flush_tid, u64 oldest_flush_tid)
+ {
+ struct ceph_inode_info *ci = cap->ci;
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ int held, revoking;
+
+ lockdep_assert_held(&ci->i_ceph_lock);
+@@ -1440,7 +1440,7 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap,
+ static void __send_cap(struct cap_msg_args *arg, struct ceph_inode_info *ci)
+ {
+ struct ceph_msg *msg;
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+
+ msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, CAP_MSG_SIZE, GFP_NOFS, false);
+ if (!msg) {
+@@ -1528,7 +1528,7 @@ static void __ceph_flush_snaps(struct ceph_inode_info *ci,
+ __releases(ci->i_ceph_lock)
+ __acquires(ci->i_ceph_lock)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ struct ceph_mds_client *mdsc = session->s_mdsc;
+ struct ceph_cap_snap *capsnap;
+ u64 oldest_flush_tid = 0;
+@@ -1621,7 +1621,7 @@ static void __ceph_flush_snaps(struct ceph_inode_info *ci,
+ void ceph_flush_snaps(struct ceph_inode_info *ci,
+ struct ceph_mds_session **psession)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc;
+ struct ceph_mds_session *session = NULL;
+ int mds;
+@@ -1681,8 +1681,8 @@ int __ceph_mark_dirty_caps(struct ceph_inode_info *ci, int mask,
+ struct ceph_cap_flush **pcf)
+ {
+ struct ceph_mds_client *mdsc =
+- ceph_sb_to_client(ci->vfs_inode.i_sb)->mdsc;
+- struct inode *inode = &ci->vfs_inode;
++ ceph_sb_to_client(ci->netfs.inode.i_sb)->mdsc;
++ struct inode *inode = &ci->netfs.inode;
+ int was = ci->i_dirty_caps;
+ int dirty = 0;
+
+@@ -1695,7 +1695,7 @@ int __ceph_mark_dirty_caps(struct ceph_inode_info *ci, int mask,
+ return 0;
+ }
+
+- dout("__mark_dirty_caps %p %s dirty %s -> %s\n", &ci->vfs_inode,
++ dout("__mark_dirty_caps %p %s dirty %s -> %s\n", &ci->netfs.inode,
+ ceph_cap_string(mask), ceph_cap_string(was),
+ ceph_cap_string(was | mask));
+ ci->i_dirty_caps |= mask;
+@@ -1711,7 +1711,7 @@ int __ceph_mark_dirty_caps(struct ceph_inode_info *ci, int mask,
+ ci->i_snap_realm->cached_context);
+ }
+ dout(" inode %p now dirty snapc %p auth cap %p\n",
+- &ci->vfs_inode, ci->i_head_snapc, ci->i_auth_cap);
++ &ci->netfs.inode, ci->i_head_snapc, ci->i_auth_cap);
+ BUG_ON(!list_empty(&ci->i_dirty_item));
+ spin_lock(&mdsc->cap_dirty_lock);
+ list_add(&ci->i_dirty_item, &session->s_cap_dirty);
+@@ -1874,7 +1874,7 @@ static int try_nonblocking_invalidate(struct inode *inode)
+
+ bool __ceph_should_report_size(struct ceph_inode_info *ci)
+ {
+- loff_t size = i_size_read(&ci->vfs_inode);
++ loff_t size = i_size_read(&ci->netfs.inode);
+ /* mds will adjust max size according to the reported size */
+ if (ci->i_flushing_caps & CEPH_CAP_FILE_WR)
+ return false;
+@@ -1899,7 +1899,7 @@ bool __ceph_should_report_size(struct ceph_inode_info *ci)
+ void ceph_check_caps(struct ceph_inode_info *ci, int flags,
+ struct ceph_mds_session *session)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb);
+ struct ceph_cap *cap;
+ u64 flush_tid, oldest_flush_tid;
+@@ -2446,7 +2446,7 @@ static void __kick_flushing_caps(struct ceph_mds_client *mdsc,
+ __releases(ci->i_ceph_lock)
+ __acquires(ci->i_ceph_lock)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ struct ceph_cap *cap;
+ struct ceph_cap_flush *cf;
+ int ret;
+@@ -2539,7 +2539,7 @@ void ceph_early_kick_flushing_caps(struct ceph_mds_client *mdsc,
+ cap = ci->i_auth_cap;
+ if (!(cap && cap->session == session)) {
+ pr_err("%p auth cap %p not mds%d ???\n",
+- &ci->vfs_inode, cap, session->s_mds);
++ &ci->netfs.inode, cap, session->s_mds);
+ spin_unlock(&ci->i_ceph_lock);
+ continue;
+ }
+@@ -2589,7 +2589,7 @@ void ceph_kick_flushing_caps(struct ceph_mds_client *mdsc,
+ cap = ci->i_auth_cap;
+ if (!(cap && cap->session == session)) {
+ pr_err("%p auth cap %p not mds%d ???\n",
+- &ci->vfs_inode, cap, session->s_mds);
++ &ci->netfs.inode, cap, session->s_mds);
+ spin_unlock(&ci->i_ceph_lock);
+ continue;
+ }
+@@ -2609,7 +2609,7 @@ void ceph_kick_flushing_inode_caps(struct ceph_mds_session *session,
+
+ lockdep_assert_held(&ci->i_ceph_lock);
+
+- dout("%s %p flushing %s\n", __func__, &ci->vfs_inode,
++ dout("%s %p flushing %s\n", __func__, &ci->netfs.inode,
+ ceph_cap_string(ci->i_flushing_caps));
+
+ if (!list_empty(&ci->i_cap_flush_list)) {
+@@ -2652,10 +2652,10 @@ void ceph_take_cap_refs(struct ceph_inode_info *ci, int got,
+ }
+ if (got & CEPH_CAP_FILE_BUFFER) {
+ if (ci->i_wb_ref == 0)
+- ihold(&ci->vfs_inode);
++ ihold(&ci->netfs.inode);
+ ci->i_wb_ref++;
+ dout("%s %p wb %d -> %d (?)\n", __func__,
+- &ci->vfs_inode, ci->i_wb_ref-1, ci->i_wb_ref);
++ &ci->netfs.inode, ci->i_wb_ref-1, ci->i_wb_ref);
+ }
+ }
+
+@@ -2983,7 +2983,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
+ return ret;
+ }
+
+- if (S_ISREG(ci->vfs_inode.i_mode) &&
++ if (S_ISREG(ci->netfs.inode.i_mode) &&
+ ci->i_inline_version != CEPH_INLINE_NONE &&
+ (_got & (CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO)) &&
+ i_size_read(inode) > 0) {
+@@ -3073,7 +3073,7 @@ enum put_cap_refs_mode {
+ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had,
+ enum put_cap_refs_mode mode)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ int last = 0, put = 0, flushsnaps = 0, wake = 0;
+ bool check_flushsnaps = false;
+
+@@ -3181,7 +3181,7 @@ void ceph_put_cap_refs_no_check_caps(struct ceph_inode_info *ci, int had)
+ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr,
+ struct ceph_snap_context *snapc)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ struct ceph_cap_snap *capsnap = NULL;
+ int put = 0;
+ bool last = false;
+@@ -3678,7 +3678,7 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid,
+ session->s_mds,
+ &list_first_entry(&session->s_cap_flushing,
+ struct ceph_inode_info,
+- i_flushing_item)->vfs_inode);
++ i_flushing_item)->netfs.inode);
+ }
+ }
+ mdsc->num_cap_flushing--;
+@@ -4326,7 +4326,7 @@ unsigned long ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
+ break;
+ list_del_init(&ci->i_cap_delay_list);
+
+- inode = igrab(&ci->vfs_inode);
++ inode = igrab(&ci->netfs.inode);
+ if (inode) {
+ spin_unlock(&mdsc->cap_delay_lock);
+ dout("check_delayed_caps on %p\n", inode);
+@@ -4354,7 +4354,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s)
+ while (!list_empty(&s->s_cap_dirty)) {
+ ci = list_first_entry(&s->s_cap_dirty, struct ceph_inode_info,
+ i_dirty_item);
+- inode = &ci->vfs_inode;
++ inode = &ci->netfs.inode;
+ ihold(inode);
+ dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode));
+ spin_unlock(&mdsc->cap_dirty_lock);
+@@ -4388,7 +4388,7 @@ void __ceph_touch_fmode(struct ceph_inode_info *ci,
+
+ void ceph_get_fmode(struct ceph_inode_info *ci, int fmode, int count)
+ {
+- struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(ci->vfs_inode.i_sb);
++ struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(ci->netfs.inode.i_sb);
+ int bits = (fmode << 1) | 1;
+ bool already_opened = false;
+ int i;
+@@ -4422,7 +4422,7 @@ void ceph_get_fmode(struct ceph_inode_info *ci, int fmode, int count)
+ */
+ void ceph_put_fmode(struct ceph_inode_info *ci, int fmode, int count)
+ {
+- struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(ci->vfs_inode.i_sb);
++ struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(ci->netfs.inode.i_sb);
+ int bits = (fmode << 1) | 1;
+ bool is_closed = true;
+ int i;
+@@ -4637,7 +4637,7 @@ int ceph_purge_inode_cap(struct inode *inode, struct ceph_cap *cap, bool *invali
+ lockdep_assert_held(&ci->i_ceph_lock);
+
+ dout("removing cap %p, ci is %p, inode is %p\n",
+- cap, ci, &ci->vfs_inode);
++ cap, ci, &ci->netfs.inode);
+
+ is_auth = (cap == ci->i_auth_cap);
+ __ceph_remove_cap(cap, false);
+diff --git a/fs/ceph/file.c b/fs/ceph/file.c
+index 8c8226c0feacc..da59e836a06eb 100644
+--- a/fs/ceph/file.c
++++ b/fs/ceph/file.c
+@@ -205,7 +205,7 @@ static int ceph_init_file_info(struct inode *inode, struct file *file,
+ {
+ struct ceph_inode_info *ci = ceph_inode(inode);
+ struct ceph_mount_options *opt =
+- ceph_inode_to_client(&ci->vfs_inode)->mount_options;
++ ceph_inode_to_client(&ci->netfs.inode)->mount_options;
+ struct ceph_file_info *fi;
+ int ret;
+
+diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
+index 63113e2a48907..f7a99a7e53c28 100644
+--- a/fs/ceph/inode.c
++++ b/fs/ceph/inode.c
+@@ -176,7 +176,7 @@ static struct ceph_inode_frag *__get_or_create_frag(struct ceph_inode_info *ci,
+ rb_insert_color(&frag->node, &ci->i_fragtree);
+
+ dout("get_or_create_frag added %llx.%llx frag %x\n",
+- ceph_vinop(&ci->vfs_inode), f);
++ ceph_vinop(&ci->netfs.inode), f);
+ return frag;
+ }
+
+@@ -457,10 +457,10 @@ struct inode *ceph_alloc_inode(struct super_block *sb)
+ if (!ci)
+ return NULL;
+
+- dout("alloc_inode %p\n", &ci->vfs_inode);
++ dout("alloc_inode %p\n", &ci->netfs.inode);
+
+ /* Set parameters for the netfs library */
+- netfs_i_context_init(&ci->vfs_inode, &ceph_netfs_ops);
++ netfs_inode_init(&ci->netfs.inode, &ceph_netfs_ops);
+
+ spin_lock_init(&ci->i_ceph_lock);
+
+@@ -547,7 +547,7 @@ struct inode *ceph_alloc_inode(struct super_block *sb)
+ INIT_WORK(&ci->i_work, ceph_inode_work);
+ ci->i_work_mask = 0;
+ memset(&ci->i_btime, '\0', sizeof(ci->i_btime));
+- return &ci->vfs_inode;
++ return &ci->netfs.inode;
+ }
+
+ void ceph_free_inode(struct inode *inode)
+@@ -1977,7 +1977,7 @@ static void ceph_inode_work(struct work_struct *work)
+ {
+ struct ceph_inode_info *ci = container_of(work, struct ceph_inode_info,
+ i_work);
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+
+ if (test_and_clear_bit(CEPH_I_WORK_WRITEBACK, &ci->i_work_mask)) {
+ dout("writeback %p\n", inode);
+diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
+index 8c249511344da..3f72ba137ff8e 100644
+--- a/fs/ceph/mds_client.c
++++ b/fs/ceph/mds_client.c
+@@ -1564,7 +1564,7 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session,
+ p = session->s_caps.next;
+ while (p != &session->s_caps) {
+ cap = list_entry(p, struct ceph_cap, session_caps);
+- inode = igrab(&cap->ci->vfs_inode);
++ inode = igrab(&cap->ci->netfs.inode);
+ if (!inode) {
+ p = p->next;
+ continue;
+@@ -1622,7 +1622,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
+ int iputs;
+
+ dout("removing cap %p, ci is %p, inode is %p\n",
+- cap, ci, &ci->vfs_inode);
++ cap, ci, &ci->netfs.inode);
+ spin_lock(&ci->i_ceph_lock);
+ iputs = ceph_purge_inode_cap(inode, cap, &invalidate);
+ spin_unlock(&ci->i_ceph_lock);
+diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
+index 322ee5add9426..864cdaa0d2bd6 100644
+--- a/fs/ceph/snap.c
++++ b/fs/ceph/snap.c
+@@ -521,7 +521,7 @@ static bool has_new_snaps(struct ceph_snap_context *o,
+ static void ceph_queue_cap_snap(struct ceph_inode_info *ci,
+ struct ceph_cap_snap **pcapsnap)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ struct ceph_snap_context *old_snapc, *new_snapc;
+ struct ceph_cap_snap *capsnap = *pcapsnap;
+ struct ceph_buffer *old_blob = NULL;
+@@ -652,7 +652,7 @@ update_snapc:
+ int __ceph_finish_cap_snap(struct ceph_inode_info *ci,
+ struct ceph_cap_snap *capsnap)
+ {
+- struct inode *inode = &ci->vfs_inode;
++ struct inode *inode = &ci->netfs.inode;
+ struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(inode->i_sb);
+
+ BUG_ON(capsnap->writing);
+@@ -712,7 +712,7 @@ static void queue_realm_cap_snaps(struct ceph_snap_realm *realm)
+
+ spin_lock(&realm->inodes_with_caps_lock);
+ list_for_each_entry(ci, &realm->inodes_with_caps, i_snap_realm_item) {
+- struct inode *inode = igrab(&ci->vfs_inode);
++ struct inode *inode = igrab(&ci->netfs.inode);
+ if (!inode)
+ continue;
+ spin_unlock(&realm->inodes_with_caps_lock);
+@@ -904,7 +904,7 @@ static void flush_snaps(struct ceph_mds_client *mdsc)
+ while (!list_empty(&mdsc->snap_flush_list)) {
+ ci = list_first_entry(&mdsc->snap_flush_list,
+ struct ceph_inode_info, i_snap_flush_item);
+- inode = &ci->vfs_inode;
++ inode = &ci->netfs.inode;
+ ihold(inode);
+ spin_unlock(&mdsc->snap_flush_lock);
+ ceph_flush_snaps(ci, &session);
+diff --git a/fs/ceph/super.c b/fs/ceph/super.c
+index e6987d295079f..612d8bc73ea96 100644
+--- a/fs/ceph/super.c
++++ b/fs/ceph/super.c
+@@ -876,7 +876,7 @@ mempool_t *ceph_wb_pagevec_pool;
+ static void ceph_inode_init_once(void *foo)
+ {
+ struct ceph_inode_info *ci = foo;
+- inode_init_once(&ci->vfs_inode);
++ inode_init_once(&ci->netfs.inode);
+ }
+
+ static int __init init_caches(void)
+diff --git a/fs/ceph/super.h b/fs/ceph/super.h
+index 20ceab74e8717..755a1ad260169 100644
+--- a/fs/ceph/super.h
++++ b/fs/ceph/super.h
+@@ -316,11 +316,7 @@ struct ceph_inode_xattrs_info {
+ * Ceph inode.
+ */
+ struct ceph_inode_info {
+- struct {
+- /* These must be contiguous */
+- struct inode vfs_inode;
+- struct netfs_i_context netfs_ctx; /* Netfslib context */
+- };
++ struct netfs_inode netfs; /* Netfslib context and vfs inode */
+ struct ceph_vino i_vino; /* ceph ino + snap */
+
+ spinlock_t i_ceph_lock;
+@@ -436,7 +432,7 @@ struct ceph_inode_info {
+ static inline struct ceph_inode_info *
+ ceph_inode(const struct inode *inode)
+ {
+- return container_of(inode, struct ceph_inode_info, vfs_inode);
++ return container_of(inode, struct ceph_inode_info, netfs.inode);
+ }
+
+ static inline struct ceph_fs_client *
+@@ -1295,7 +1291,7 @@ static inline void __ceph_update_quota(struct ceph_inode_info *ci,
+ has_quota = __ceph_has_any_quota(ci);
+
+ if (had_quota != has_quota)
+- ceph_adjust_quota_realms_count(&ci->vfs_inode, has_quota);
++ ceph_adjust_quota_realms_count(&ci->netfs.inode, has_quota);
+ }
+
+ extern void ceph_handle_quota(struct ceph_mds_client *mdsc,
+diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
+index 8c2dc2c762a4e..f141f5246163d 100644
+--- a/fs/ceph/xattr.c
++++ b/fs/ceph/xattr.c
+@@ -57,7 +57,7 @@ static bool ceph_vxattrcb_layout_exists(struct ceph_inode_info *ci)
+ static ssize_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val,
+ size_t size)
+ {
+- struct ceph_fs_client *fsc = ceph_sb_to_client(ci->vfs_inode.i_sb);
++ struct ceph_fs_client *fsc = ceph_sb_to_client(ci->netfs.inode.i_sb);
+ struct ceph_osd_client *osdc = &fsc->client->osdc;
+ struct ceph_string *pool_ns;
+ s64 pool = ci->i_layout.pool_id;
+@@ -69,7 +69,7 @@ static ssize_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val,
+
+ pool_ns = ceph_try_get_string(ci->i_layout.pool_ns);
+
+- dout("ceph_vxattrcb_layout %p\n", &ci->vfs_inode);
++ dout("ceph_vxattrcb_layout %p\n", &ci->netfs.inode);
+ down_read(&osdc->lock);
+ pool_name = ceph_pg_pool_name_by_id(osdc->osdmap, pool);
+ if (pool_name) {
+@@ -161,7 +161,7 @@ static ssize_t ceph_vxattrcb_layout_pool(struct ceph_inode_info *ci,
+ char *val, size_t size)
+ {
+ ssize_t ret;
+- struct ceph_fs_client *fsc = ceph_sb_to_client(ci->vfs_inode.i_sb);
++ struct ceph_fs_client *fsc = ceph_sb_to_client(ci->netfs.inode.i_sb);
+ struct ceph_osd_client *osdc = &fsc->client->osdc;
+ s64 pool = ci->i_layout.pool_id;
+ const char *pool_name;
+@@ -313,7 +313,7 @@ static ssize_t ceph_vxattrcb_snap_btime(struct ceph_inode_info *ci, char *val,
+ static ssize_t ceph_vxattrcb_cluster_fsid(struct ceph_inode_info *ci,
+ char *val, size_t size)
+ {
+- struct ceph_fs_client *fsc = ceph_sb_to_client(ci->vfs_inode.i_sb);
++ struct ceph_fs_client *fsc = ceph_sb_to_client(ci->netfs.inode.i_sb);
+
+ return ceph_fmt_xattr(val, size, "%pU", &fsc->client->fsid);
+ }
+@@ -321,7 +321,7 @@ static ssize_t ceph_vxattrcb_cluster_fsid(struct ceph_inode_info *ci,
+ static ssize_t ceph_vxattrcb_client_id(struct ceph_inode_info *ci,
+ char *val, size_t size)
+ {
+- struct ceph_fs_client *fsc = ceph_sb_to_client(ci->vfs_inode.i_sb);
++ struct ceph_fs_client *fsc = ceph_sb_to_client(ci->netfs.inode.i_sb);
+
+ return ceph_fmt_xattr(val, size, "client%lld",
+ ceph_client_gid(fsc->client));
+@@ -629,7 +629,7 @@ static int __set_xattr(struct ceph_inode_info *ci,
+ }
+
+ dout("__set_xattr_val added %llx.%llx xattr %p %.*s=%.*s\n",
+- ceph_vinop(&ci->vfs_inode), xattr, name_len, name, val_len, val);
++ ceph_vinop(&ci->netfs.inode), xattr, name_len, name, val_len, val);
+
+ return 0;
+ }
+@@ -871,7 +871,7 @@ struct ceph_buffer *__ceph_build_xattrs_blob(struct ceph_inode_info *ci)
+ struct ceph_buffer *old_blob = NULL;
+ void *dest;
+
+- dout("__build_xattrs_blob %p\n", &ci->vfs_inode);
++ dout("__build_xattrs_blob %p\n", &ci->netfs.inode);
+ if (ci->i_xattrs.dirty) {
+ int need = __get_required_blob_size(ci, 0, 0);
+
+diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
+index 9b4fd76866996..26f0210d04bda 100644
+--- a/fs/cifs/cifsfs.c
++++ b/fs/cifs/cifsfs.c
+@@ -377,7 +377,7 @@ cifs_alloc_inode(struct super_block *sb)
+ cifs_inode->flags = 0;
+ spin_lock_init(&cifs_inode->writers_lock);
+ cifs_inode->writers = 0;
+- cifs_inode->vfs_inode.i_blkbits = 14; /* 2**14 = CIFS_MAX_MSGSIZE */
++ cifs_inode->netfs.inode.i_blkbits = 14; /* 2**14 = CIFS_MAX_MSGSIZE */
+ cifs_inode->server_eof = 0;
+ cifs_inode->uniqueid = 0;
+ cifs_inode->createtime = 0;
+@@ -389,12 +389,12 @@ cifs_alloc_inode(struct super_block *sb)
+ * Can not set i_flags here - they get immediately overwritten to zero
+ * by the VFS.
+ */
+- /* cifs_inode->vfs_inode.i_flags = S_NOATIME | S_NOCMTIME; */
++ /* cifs_inode->netfs.inode.i_flags = S_NOATIME | S_NOCMTIME; */
+ INIT_LIST_HEAD(&cifs_inode->openFileList);
+ INIT_LIST_HEAD(&cifs_inode->llist);
+ INIT_LIST_HEAD(&cifs_inode->deferred_closes);
+ spin_lock_init(&cifs_inode->deferred_lock);
+- return &cifs_inode->vfs_inode;
++ return &cifs_inode->netfs.inode;
+ }
+
+ static void
+@@ -1416,7 +1416,7 @@ cifs_init_once(void *inode)
+ {
+ struct cifsInodeInfo *cifsi = inode;
+
+- inode_init_once(&cifsi->vfs_inode);
++ inode_init_once(&cifsi->netfs.inode);
+ init_rwsem(&cifsi->lock_sem);
+ }
+
+diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
+index a6cade2aebd98..4247910682c46 100644
+--- a/fs/cifs/cifsglob.h
++++ b/fs/cifs/cifsglob.h
+@@ -1414,20 +1414,16 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file);
+ #define CIFS_CACHE_RW_FLG (CIFS_CACHE_READ_FLG | CIFS_CACHE_WRITE_FLG)
+ #define CIFS_CACHE_RHW_FLG (CIFS_CACHE_RW_FLG | CIFS_CACHE_HANDLE_FLG)
+
+-#define CIFS_CACHE_READ(cinode) ((cinode->oplock & CIFS_CACHE_READ_FLG) || (CIFS_SB(cinode->vfs_inode.i_sb)->mnt_cifs_flags & CIFS_MOUNT_RO_CACHE))
++#define CIFS_CACHE_READ(cinode) ((cinode->oplock & CIFS_CACHE_READ_FLG) || (CIFS_SB(cinode->netfs.inode.i_sb)->mnt_cifs_flags & CIFS_MOUNT_RO_CACHE))
+ #define CIFS_CACHE_HANDLE(cinode) (cinode->oplock & CIFS_CACHE_HANDLE_FLG)
+-#define CIFS_CACHE_WRITE(cinode) ((cinode->oplock & CIFS_CACHE_WRITE_FLG) || (CIFS_SB(cinode->vfs_inode.i_sb)->mnt_cifs_flags & CIFS_MOUNT_RW_CACHE))
++#define CIFS_CACHE_WRITE(cinode) ((cinode->oplock & CIFS_CACHE_WRITE_FLG) || (CIFS_SB(cinode->netfs.inode.i_sb)->mnt_cifs_flags & CIFS_MOUNT_RW_CACHE))
+
+ /*
+ * One of these for each file inode
+ */
+
+ struct cifsInodeInfo {
+- struct {
+- /* These must be contiguous */
+- struct inode vfs_inode; /* the VFS's inode record */
+- struct netfs_i_context netfs_ctx; /* Netfslib context */
+- };
++ struct netfs_inode netfs; /* Netfslib context and vfs inode */
+ bool can_cache_brlcks;
+ struct list_head llist; /* locks helb by this inode */
+ /*
+@@ -1466,7 +1462,7 @@ struct cifsInodeInfo {
+ static inline struct cifsInodeInfo *
+ CIFS_I(struct inode *inode)
+ {
+- return container_of(inode, struct cifsInodeInfo, vfs_inode);
++ return container_of(inode, struct cifsInodeInfo, netfs.inode);
+ }
+
+ static inline struct cifs_sb_info *
+diff --git a/fs/cifs/file.c b/fs/cifs/file.c
+index d511a78383c38..58dce567ceaf1 100644
+--- a/fs/cifs/file.c
++++ b/fs/cifs/file.c
+@@ -2004,7 +2004,7 @@ struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode,
+ bool fsuid_only)
+ {
+ struct cifsFileInfo *open_file = NULL;
+- struct cifs_sb_info *cifs_sb = CIFS_SB(cifs_inode->vfs_inode.i_sb);
++ struct cifs_sb_info *cifs_sb = CIFS_SB(cifs_inode->netfs.inode.i_sb);
+
+ /* only filter by fsuid on multiuser mounts */
+ if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MULTIUSER))
+@@ -2060,7 +2060,7 @@ cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, int flags,
+ return rc;
+ }
+
+- cifs_sb = CIFS_SB(cifs_inode->vfs_inode.i_sb);
++ cifs_sb = CIFS_SB(cifs_inode->netfs.inode.i_sb);
+
+ /* only filter by fsuid on multiuser mounts */
+ if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MULTIUSER))
+@@ -4665,14 +4665,14 @@ bool is_size_safe_to_change(struct cifsInodeInfo *cifsInode, __u64 end_of_file)
+ /* This inode is open for write at least once */
+ struct cifs_sb_info *cifs_sb;
+
+- cifs_sb = CIFS_SB(cifsInode->vfs_inode.i_sb);
++ cifs_sb = CIFS_SB(cifsInode->netfs.inode.i_sb);
+ if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_DIRECT_IO) {
+ /* since no page cache to corrupt on directio
+ we can change size safely */
+ return true;
+ }
+
+- if (i_size_read(&cifsInode->vfs_inode) < end_of_file)
++ if (i_size_read(&cifsInode->netfs.inode) < end_of_file)
+ return true;
+
+ return false;
+diff --git a/fs/cifs/fscache.c b/fs/cifs/fscache.c
+index a638b29e90620..23ef56f55ce50 100644
+--- a/fs/cifs/fscache.c
++++ b/fs/cifs/fscache.c
+@@ -101,13 +101,13 @@ void cifs_fscache_get_inode_cookie(struct inode *inode)
+ struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
+ struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
+
+- cifs_fscache_fill_coherency(&cifsi->vfs_inode, &cd);
++ cifs_fscache_fill_coherency(&cifsi->netfs.inode, &cd);
+
+- cifsi->netfs_ctx.cache =
++ cifsi->netfs.cache =
+ fscache_acquire_cookie(tcon->fscache, 0,
+ &cifsi->uniqueid, sizeof(cifsi->uniqueid),
+ &cd, sizeof(cd),
+- i_size_read(&cifsi->vfs_inode));
++ i_size_read(&cifsi->netfs.inode));
+ }
+
+ void cifs_fscache_unuse_inode_cookie(struct inode *inode, bool update)
+@@ -131,7 +131,7 @@ void cifs_fscache_release_inode_cookie(struct inode *inode)
+ if (cookie) {
+ cifs_dbg(FYI, "%s: (0x%p)\n", __func__, cookie);
+ fscache_relinquish_cookie(cookie, false);
+- cifsi->netfs_ctx.cache = NULL;
++ cifsi->netfs.cache = NULL;
+ }
+ }
+
+diff --git a/fs/cifs/fscache.h b/fs/cifs/fscache.h
+index 52355c0912aee..ab9a51d0125cd 100644
+--- a/fs/cifs/fscache.h
++++ b/fs/cifs/fscache.h
+@@ -52,10 +52,10 @@ void cifs_fscache_fill_coherency(struct inode *inode,
+ struct cifsInodeInfo *cifsi = CIFS_I(inode);
+
+ memset(cd, 0, sizeof(*cd));
+- cd->last_write_time_sec = cpu_to_le64(cifsi->vfs_inode.i_mtime.tv_sec);
+- cd->last_write_time_nsec = cpu_to_le32(cifsi->vfs_inode.i_mtime.tv_nsec);
+- cd->last_change_time_sec = cpu_to_le64(cifsi->vfs_inode.i_ctime.tv_sec);
+- cd->last_change_time_nsec = cpu_to_le32(cifsi->vfs_inode.i_ctime.tv_nsec);
++ cd->last_write_time_sec = cpu_to_le64(cifsi->netfs.inode.i_mtime.tv_sec);
++ cd->last_write_time_nsec = cpu_to_le32(cifsi->netfs.inode.i_mtime.tv_nsec);
++ cd->last_change_time_sec = cpu_to_le64(cifsi->netfs.inode.i_ctime.tv_sec);
++ cd->last_change_time_nsec = cpu_to_le32(cifsi->netfs.inode.i_ctime.tv_nsec);
+ }
+
+
+diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
+index 2f9e7d2f81b6f..81da81e185538 100644
+--- a/fs/cifs/inode.c
++++ b/fs/cifs/inode.c
+@@ -115,7 +115,7 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr)
+ __func__, cifs_i->uniqueid);
+ set_bit(CIFS_INO_INVALID_MAPPING, &cifs_i->flags);
+ /* Invalidate fscache cookie */
+- cifs_fscache_fill_coherency(&cifs_i->vfs_inode, &cd);
++ cifs_fscache_fill_coherency(&cifs_i->netfs.inode, &cd);
+ fscache_invalidate(cifs_inode_cookie(inode), &cd, i_size_read(inode), 0);
+ }
+
+@@ -2499,7 +2499,7 @@ int cifs_fiemap(struct inode *inode, struct fiemap_extent_info *fei, u64 start,
+ u64 len)
+ {
+ struct cifsInodeInfo *cifs_i = CIFS_I(inode);
+- struct cifs_sb_info *cifs_sb = CIFS_SB(cifs_i->vfs_inode.i_sb);
++ struct cifs_sb_info *cifs_sb = CIFS_SB(cifs_i->netfs.inode.i_sb);
+ struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
+ struct TCP_Server_Info *server = tcon->ses->server;
+ struct cifsFileInfo *cfile;
+diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
+index 4abf36b3b3450..14e8dfc83c7e0 100644
+--- a/fs/cifs/misc.c
++++ b/fs/cifs/misc.c
+@@ -535,11 +535,11 @@ void cifs_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock)
+ if (oplock == OPLOCK_EXCLUSIVE) {
+ cinode->oplock = CIFS_CACHE_WRITE_FLG | CIFS_CACHE_READ_FLG;
+ cifs_dbg(FYI, "Exclusive Oplock granted on inode %p\n",
+- &cinode->vfs_inode);
++ &cinode->netfs.inode);
+ } else if (oplock == OPLOCK_READ) {
+ cinode->oplock = CIFS_CACHE_READ_FLG;
+ cifs_dbg(FYI, "Level II Oplock granted on inode %p\n",
+- &cinode->vfs_inode);
++ &cinode->netfs.inode);
+ } else
+ cinode->oplock = 0;
+ }
+diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
+index 6e26edbffc486..19f957785a737 100644
+--- a/fs/cifs/smb2ops.c
++++ b/fs/cifs/smb2ops.c
+@@ -4241,15 +4241,15 @@ smb2_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock,
+ if (oplock == SMB2_OPLOCK_LEVEL_BATCH) {
+ cinode->oplock = CIFS_CACHE_RHW_FLG;
+ cifs_dbg(FYI, "Batch Oplock granted on inode %p\n",
+- &cinode->vfs_inode);
++ &cinode->netfs.inode);
+ } else if (oplock == SMB2_OPLOCK_LEVEL_EXCLUSIVE) {
+ cinode->oplock = CIFS_CACHE_RW_FLG;
+ cifs_dbg(FYI, "Exclusive Oplock granted on inode %p\n",
+- &cinode->vfs_inode);
++ &cinode->netfs.inode);
+ } else if (oplock == SMB2_OPLOCK_LEVEL_II) {
+ cinode->oplock = CIFS_CACHE_READ_FLG;
+ cifs_dbg(FYI, "Level II Oplock granted on inode %p\n",
+- &cinode->vfs_inode);
++ &cinode->netfs.inode);
+ } else
+ cinode->oplock = 0;
+ }
+@@ -4288,7 +4288,7 @@ smb21_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock,
+
+ cinode->oplock = new_oplock;
+ cifs_dbg(FYI, "%s Lease granted on inode %p\n", message,
+- &cinode->vfs_inode);
++ &cinode->netfs.inode);
+ }
+
+ static void
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index 87d85ce04d58b..816448041db87 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -4107,6 +4107,15 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
+ size = size >> bsbits;
+ start = start_off >> bsbits;
+
++ /*
++ * For tiny groups (smaller than 8MB) the chosen allocation
++ * alignment may be larger than group size. Make sure the
++ * alignment does not move allocation to a different group which
++ * makes mballoc fail assertions later.
++ */
++ start = max(start, rounddown(ac->ac_o_ex.fe_logical,
++ (ext4_lblk_t)EXT4_BLOCKS_PER_GROUP(ac->ac_sb)));
++
+ /* don't cover already allocated blocks in selected range */
+ if (ar->pleft && start <= ar->lleft) {
+ size -= ar->lleft + 1 - start;
+diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
+index e9cba12e5e128..4f0420b1ff3ec 100644
+--- a/fs/ext4/namei.c
++++ b/fs/ext4/namei.c
+@@ -1929,7 +1929,8 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
+ struct dx_hash_info *hinfo)
+ {
+ unsigned blocksize = dir->i_sb->s_blocksize;
+- unsigned count, continued;
++ unsigned continued;
++ int count;
+ struct buffer_head *bh2;
+ ext4_lblk_t newblock;
+ u32 hash2;
+diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
+index 90a941d20dfff..8b70a47012931 100644
+--- a/fs/ext4/resize.c
++++ b/fs/ext4/resize.c
+@@ -53,6 +53,16 @@ int ext4_resize_begin(struct super_block *sb)
+ if (!capable(CAP_SYS_RESOURCE))
+ return -EPERM;
+
++ /*
++ * If the reserved GDT blocks is non-zero, the resize_inode feature
++ * should always be set.
++ */
++ if (EXT4_SB(sb)->s_es->s_reserved_gdt_blocks &&
++ !ext4_has_feature_resize_inode(sb)) {
++ ext4_error(sb, "resize_inode disabled but reserved GDT blocks non-zero");
++ return -EFSCORRUPTED;
++ }
++
+ /*
+ * If we are not using the primary superblock/GDT copy don't resize,
+ * because the user tools have no way of handling this. Probably a
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index a0c79304f92ff..552285de2c7b0 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -5422,14 +5422,6 @@ no_journal:
+ err = percpu_counter_init(&sbi->s_freeinodes_counter, freei,
+ GFP_KERNEL);
+ }
+- /*
+- * Update the checksum after updating free space/inode
+- * counters. Otherwise the superblock can have an incorrect
+- * checksum in the buffer cache until it is written out and
+- * e2fsprogs programs trying to open a file system immediately
+- * after it is mounted can fail.
+- */
+- ext4_superblock_csum_set(sb);
+ if (!err)
+ err = percpu_counter_init(&sbi->s_dirs_counter,
+ ext4_count_dirs(sb), GFP_KERNEL);
+@@ -5487,6 +5479,14 @@ no_journal:
+ EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
+ ext4_orphan_cleanup(sb, es);
+ EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
++ /*
++ * Update the checksum after updating free space/inode counters and
++ * ext4_orphan_cleanup. Otherwise the superblock can have an incorrect
++ * checksum in the buffer cache until it is written out and
++ * e2fsprogs programs trying to open a file system immediately
++ * after it is mounted can fail.
++ */
++ ext4_superblock_csum_set(sb);
+ if (needs_recovery) {
+ ext4_msg(sb, KERN_INFO, "recovery complete");
+ err = ext4_mark_recovery_complete(sb, es);
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+index 9e247335e70d5..3d123ca028c97 100644
+--- a/fs/io_uring.c
++++ b/fs/io_uring.c
+@@ -111,7 +111,8 @@
+ IOSQE_IO_DRAIN | IOSQE_CQE_SKIP_SUCCESS)
+
+ #define IO_REQ_CLEAN_FLAGS (REQ_F_BUFFER_SELECTED | REQ_F_NEED_CLEANUP | \
+- REQ_F_POLLED | REQ_F_CREDS | REQ_F_ASYNC_DATA)
++ REQ_F_POLLED | REQ_F_INFLIGHT | REQ_F_CREDS | \
++ REQ_F_ASYNC_DATA)
+
+ #define IO_TCTX_REFS_CACHE_NR (1U << 10)
+
+@@ -493,6 +494,7 @@ struct io_uring_task {
+ const struct io_ring_ctx *last;
+ struct io_wq *io_wq;
+ struct percpu_counter inflight;
++ atomic_t inflight_tracked;
+ atomic_t in_idle;
+
+ spinlock_t task_lock;
+@@ -1186,8 +1188,6 @@ static void io_clean_op(struct io_kiocb *req);
+ static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
+ unsigned issue_flags);
+ static inline struct file *io_file_get_normal(struct io_kiocb *req, int fd);
+-static void io_drop_inflight_file(struct io_kiocb *req);
+-static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags);
+ static void __io_queue_sqe(struct io_kiocb *req);
+ static void io_rsrc_put_work(struct work_struct *work);
+
+@@ -1435,9 +1435,29 @@ static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
+ bool cancel_all)
+ __must_hold(&req->ctx->timeout_lock)
+ {
++ struct io_kiocb *req;
++
+ if (task && head->task != task)
+ return false;
+- return cancel_all;
++ if (cancel_all)
++ return true;
++
++ io_for_each_link(req, head) {
++ if (req->flags & REQ_F_INFLIGHT)
++ return true;
++ }
++ return false;
++}
++
++static bool io_match_linked(struct io_kiocb *head)
++{
++ struct io_kiocb *req;
++
++ io_for_each_link(req, head) {
++ if (req->flags & REQ_F_INFLIGHT)
++ return true;
++ }
++ return false;
+ }
+
+ /*
+@@ -1447,9 +1467,24 @@ static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
+ static bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
+ bool cancel_all)
+ {
++ bool matched;
++
+ if (task && head->task != task)
+ return false;
+- return cancel_all;
++ if (cancel_all)
++ return true;
++
++ if (head->flags & REQ_F_LINK_TIMEOUT) {
++ struct io_ring_ctx *ctx = head->ctx;
++
++ /* protect against races with linked timeouts */
++ spin_lock_irq(&ctx->timeout_lock);
++ matched = io_match_linked(head);
++ spin_unlock_irq(&ctx->timeout_lock);
++ } else {
++ matched = io_match_linked(head);
++ }
++ return matched;
+ }
+
+ static inline bool req_has_async_data(struct io_kiocb *req)
+@@ -1608,6 +1643,14 @@ static inline bool io_req_ffs_set(struct io_kiocb *req)
+ return req->flags & REQ_F_FIXED_FILE;
+ }
+
++static inline void io_req_track_inflight(struct io_kiocb *req)
++{
++ if (!(req->flags & REQ_F_INFLIGHT)) {
++ req->flags |= REQ_F_INFLIGHT;
++ atomic_inc(¤t->io_uring->inflight_tracked);
++ }
++}
++
+ static struct io_kiocb *__io_prep_linked_timeout(struct io_kiocb *req)
+ {
+ if (WARN_ON_ONCE(!req->link))
+@@ -2516,8 +2559,6 @@ static void io_req_task_work_add(struct io_kiocb *req, bool priority)
+
+ WARN_ON_ONCE(!tctx);
+
+- io_drop_inflight_file(req);
+-
+ spin_lock_irqsave(&tctx->task_lock, flags);
+ if (priority)
+ wq_list_add_tail(&req->io_task_work.node, &tctx->prior_task_list);
+@@ -5869,10 +5910,6 @@ static int io_poll_check_events(struct io_kiocb *req, bool locked)
+
+ if (!req->result) {
+ struct poll_table_struct pt = { ._key = req->apoll_events };
+- unsigned flags = locked ? 0 : IO_URING_F_UNLOCKED;
+-
+- if (unlikely(!io_assign_file(req, flags)))
+- return -EBADF;
+ req->result = vfs_poll(req->file, &pt) & req->apoll_events;
+ }
+
+@@ -7097,6 +7134,11 @@ static void io_clean_op(struct io_kiocb *req)
+ kfree(req->apoll);
+ req->apoll = NULL;
+ }
++ if (req->flags & REQ_F_INFLIGHT) {
++ struct io_uring_task *tctx = req->task->io_uring;
++
++ atomic_dec(&tctx->inflight_tracked);
++ }
+ if (req->flags & REQ_F_CREDS)
+ put_cred(req->creds);
+ if (req->flags & REQ_F_ASYNC_DATA) {
+@@ -7393,19 +7435,6 @@ out:
+ return file;
+ }
+
+-/*
+- * Drop the file for requeue operations. Only used of req->file is the
+- * io_uring descriptor itself.
+- */
+-static void io_drop_inflight_file(struct io_kiocb *req)
+-{
+- if (unlikely(req->flags & REQ_F_INFLIGHT)) {
+- fput(req->file);
+- req->file = NULL;
+- req->flags &= ~REQ_F_INFLIGHT;
+- }
+-}
+-
+ static struct file *io_file_get_normal(struct io_kiocb *req, int fd)
+ {
+ struct file *file = fget(fd);
+@@ -7414,7 +7443,7 @@ static struct file *io_file_get_normal(struct io_kiocb *req, int fd)
+
+ /* we don't allow fixed io_uring files */
+ if (file && file->f_op == &io_uring_fops)
+- req->flags |= REQ_F_INFLIGHT;
++ io_req_track_inflight(req);
+ return file;
+ }
+
+@@ -8479,11 +8508,19 @@ static void __io_sqe_files_unregister(struct io_ring_ctx *ctx)
+
+ static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
+ {
++ unsigned nr = ctx->nr_user_files;
+ int ret;
+
+ if (!ctx->file_data)
+ return -ENXIO;
++
++ /*
++ * Quiesce may unlock ->uring_lock, and while it's not held
++ * prevent new requests using the table.
++ */
++ ctx->nr_user_files = 0;
+ ret = io_rsrc_ref_quiesce(ctx->file_data, ctx);
++ ctx->nr_user_files = nr;
+ if (!ret)
+ __io_sqe_files_unregister(ctx);
+ return ret;
+@@ -9211,6 +9248,7 @@ static __cold int io_uring_alloc_task_context(struct task_struct *task,
+ xa_init(&tctx->xa);
+ init_waitqueue_head(&tctx->wait);
+ atomic_set(&tctx->in_idle, 0);
++ atomic_set(&tctx->inflight_tracked, 0);
+ task->io_uring = tctx;
+ spin_lock_init(&tctx->task_lock);
+ INIT_WQ_LIST(&tctx->task_list);
+@@ -9457,12 +9495,19 @@ static void __io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
+
+ static int io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
+ {
++ unsigned nr = ctx->nr_user_bufs;
+ int ret;
+
+ if (!ctx->buf_data)
+ return -ENXIO;
+
++ /*
++ * Quiesce may unlock ->uring_lock, and while it's not held
++ * prevent new requests using the table.
++ */
++ ctx->nr_user_bufs = 0;
+ ret = io_rsrc_ref_quiesce(ctx->buf_data, ctx);
++ ctx->nr_user_bufs = nr;
+ if (!ret)
+ __io_sqe_buffers_unregister(ctx);
+ return ret;
+@@ -10402,7 +10447,7 @@ static __cold void io_uring_clean_tctx(struct io_uring_task *tctx)
+ static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
+ {
+ if (tracked)
+- return 0;
++ return atomic_read(&tctx->inflight_tracked);
+ return percpu_counter_sum(&tctx->inflight);
+ }
+
+diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
+index 281a88a5b8dcd..e8e3359a4c547 100644
+--- a/fs/netfs/buffered_read.c
++++ b/fs/netfs/buffered_read.c
+@@ -155,7 +155,7 @@ static void netfs_rreq_expand(struct netfs_io_request *rreq,
+ void netfs_readahead(struct readahead_control *ractl)
+ {
+ struct netfs_io_request *rreq;
+- struct netfs_i_context *ctx = netfs_i_context(ractl->mapping->host);
++ struct netfs_inode *ctx = netfs_inode(ractl->mapping->host);
+ int ret;
+
+ _enter("%lx,%x", readahead_index(ractl), readahead_count(ractl));
+@@ -216,7 +216,7 @@ int netfs_readpage(struct file *file, struct page *subpage)
+ struct folio *folio = page_folio(subpage);
+ struct address_space *mapping = folio_file_mapping(folio);
+ struct netfs_io_request *rreq;
+- struct netfs_i_context *ctx = netfs_i_context(mapping->host);
++ struct netfs_inode *ctx = netfs_inode(mapping->host);
+ int ret;
+
+ _enter("%lx", folio_index(folio));
+@@ -333,7 +333,7 @@ int netfs_write_begin(struct file *file, struct address_space *mapping,
+ struct folio **_folio, void **_fsdata)
+ {
+ struct netfs_io_request *rreq;
+- struct netfs_i_context *ctx = netfs_i_context(file_inode(file ));
++ struct netfs_inode *ctx = netfs_inode(file_inode(file ));
+ struct folio *folio;
+ unsigned int fgp_flags;
+ pgoff_t index = pos >> PAGE_SHIFT;
+diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
+index b7b0e3d18d9e8..43fac1b14e40c 100644
+--- a/fs/netfs/internal.h
++++ b/fs/netfs/internal.h
+@@ -91,7 +91,7 @@ static inline void netfs_stat_d(atomic_t *stat)
+ /*
+ * Miscellaneous functions.
+ */
+-static inline bool netfs_is_cache_enabled(struct netfs_i_context *ctx)
++static inline bool netfs_is_cache_enabled(struct netfs_inode *ctx)
+ {
+ #if IS_ENABLED(CONFIG_FSCACHE)
+ struct fscache_cookie *cookie = ctx->cache;
+diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
+index e86107b30ba44..c6afa605b63b8 100644
+--- a/fs/netfs/objects.c
++++ b/fs/netfs/objects.c
+@@ -18,7 +18,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
+ {
+ static atomic_t debug_ids;
+ struct inode *inode = file ? file_inode(file) : mapping->host;
+- struct netfs_i_context *ctx = netfs_i_context(inode);
++ struct netfs_inode *ctx = netfs_inode(inode);
+ struct netfs_io_request *rreq;
+ int ret;
+
+diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
+index c8520284dda78..c1eda73254e16 100644
+--- a/fs/nfs/callback_proc.c
++++ b/fs/nfs/callback_proc.c
+@@ -288,6 +288,7 @@ static u32 initiate_file_draining(struct nfs_client *clp,
+ rv = NFS4_OK;
+ break;
+ case -ENOENT:
++ set_bit(NFS_LAYOUT_DRAIN, &lo->plh_flags);
+ /* Embrace your forgetfulness! */
+ rv = NFS4ERR_NOMATCHING_LAYOUT;
+
+diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
+index 68a87be3e6f96..41a9b6b58fb9f 100644
+--- a/fs/nfs/pnfs.c
++++ b/fs/nfs/pnfs.c
+@@ -469,6 +469,7 @@ pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo,
+ pnfs_clear_lseg_state(lseg, lseg_list);
+ pnfs_clear_layoutreturn_info(lo);
+ pnfs_free_returned_lsegs(lo, lseg_list, &range, 0);
++ set_bit(NFS_LAYOUT_DRAIN, &lo->plh_flags);
+ if (test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags) &&
+ !test_and_set_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags))
+ pnfs_clear_layoutreturn_waitbit(lo);
+@@ -1917,8 +1918,9 @@ static void nfs_layoutget_begin(struct pnfs_layout_hdr *lo)
+
+ static void nfs_layoutget_end(struct pnfs_layout_hdr *lo)
+ {
+- if (atomic_dec_and_test(&lo->plh_outstanding))
+- wake_up_var(&lo->plh_outstanding);
++ if (atomic_dec_and_test(&lo->plh_outstanding) &&
++ test_and_clear_bit(NFS_LAYOUT_DRAIN, &lo->plh_flags))
++ wake_up_bit(&lo->plh_flags, NFS_LAYOUT_DRAIN);
+ }
+
+ static bool pnfs_is_first_layoutget(struct pnfs_layout_hdr *lo)
+@@ -2025,11 +2027,11 @@ lookup_again:
+ * If the layout segment list is empty, but there are outstanding
+ * layoutget calls, then they might be subject to a layoutrecall.
+ */
+- if ((list_empty(&lo->plh_segs) || !pnfs_layout_is_valid(lo)) &&
++ if (test_bit(NFS_LAYOUT_DRAIN, &lo->plh_flags) &&
+ atomic_read(&lo->plh_outstanding) != 0) {
+ spin_unlock(&ino->i_lock);
+- lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding,
+- !atomic_read(&lo->plh_outstanding)));
++ lseg = ERR_PTR(wait_on_bit(&lo->plh_flags, NFS_LAYOUT_DRAIN,
++ TASK_KILLABLE));
+ if (IS_ERR(lseg))
+ goto out_put_layout_hdr;
+ pnfs_put_layout_hdr(lo);
+@@ -2152,6 +2154,12 @@ lookup_again:
+ case -ERECALLCONFLICT:
+ case -EAGAIN:
+ break;
++ case -ENODATA:
++ /* The server returned NFS4ERR_LAYOUTUNAVAILABLE */
++ pnfs_layout_set_fail_bit(
++ lo, pnfs_iomode_to_fail_bit(iomode));
++ lseg = NULL;
++ goto out_put_layout_hdr;
+ default:
+ if (!nfs_error_is_fatal(PTR_ERR(lseg))) {
+ pnfs_layout_clear_fail_bit(lo, pnfs_iomode_to_fail_bit(iomode));
+@@ -2407,7 +2415,8 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
+ goto out_forget;
+ }
+
+- if (!pnfs_layout_is_valid(lo) && !pnfs_is_first_layoutget(lo))
++ if (test_bit(NFS_LAYOUT_DRAIN, &lo->plh_flags) &&
++ !pnfs_is_first_layoutget(lo))
+ goto out_forget;
+
+ if (nfs4_stateid_match_other(&lo->plh_stateid, &res->stateid)) {
+diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
+index 07f11489e4e9f..f331f067691b0 100644
+--- a/fs/nfs/pnfs.h
++++ b/fs/nfs/pnfs.h
+@@ -105,6 +105,7 @@ enum {
+ NFS_LAYOUT_FIRST_LAYOUTGET, /* Serialize first layoutget */
+ NFS_LAYOUT_INODE_FREEING, /* The inode is being freed */
+ NFS_LAYOUT_HASHED, /* The layout visible */
++ NFS_LAYOUT_DRAIN,
+ };
+
+ enum layoutdriver_policy_flags {
+diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
+index a74aef99bd3d6..09d1307959d08 100644
+--- a/fs/quota/dquot.c
++++ b/fs/quota/dquot.c
+@@ -79,6 +79,7 @@
+ #include <linux/capability.h>
+ #include <linux/quotaops.h>
+ #include <linux/blkdev.h>
++#include <linux/sched/mm.h>
+ #include "../internal.h" /* ugh */
+
+ #include <linux/uaccess.h>
+@@ -425,9 +426,11 @@ EXPORT_SYMBOL(mark_info_dirty);
+ int dquot_acquire(struct dquot *dquot)
+ {
+ int ret = 0, ret2 = 0;
++ unsigned int memalloc;
+ struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+
+ mutex_lock(&dquot->dq_lock);
++ memalloc = memalloc_nofs_save();
+ if (!test_bit(DQ_READ_B, &dquot->dq_flags)) {
+ ret = dqopt->ops[dquot->dq_id.type]->read_dqblk(dquot);
+ if (ret < 0)
+@@ -458,6 +461,7 @@ int dquot_acquire(struct dquot *dquot)
+ smp_mb__before_atomic();
+ set_bit(DQ_ACTIVE_B, &dquot->dq_flags);
+ out_iolock:
++ memalloc_nofs_restore(memalloc);
+ mutex_unlock(&dquot->dq_lock);
+ return ret;
+ }
+@@ -469,9 +473,11 @@ EXPORT_SYMBOL(dquot_acquire);
+ int dquot_commit(struct dquot *dquot)
+ {
+ int ret = 0;
++ unsigned int memalloc;
+ struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+
+ mutex_lock(&dquot->dq_lock);
++ memalloc = memalloc_nofs_save();
+ if (!clear_dquot_dirty(dquot))
+ goto out_lock;
+ /* Inactive dquot can be only if there was error during read/init
+@@ -481,6 +487,7 @@ int dquot_commit(struct dquot *dquot)
+ else
+ ret = -EIO;
+ out_lock:
++ memalloc_nofs_restore(memalloc);
+ mutex_unlock(&dquot->dq_lock);
+ return ret;
+ }
+@@ -492,9 +499,11 @@ EXPORT_SYMBOL(dquot_commit);
+ int dquot_release(struct dquot *dquot)
+ {
+ int ret = 0, ret2 = 0;
++ unsigned int memalloc;
+ struct quota_info *dqopt = sb_dqopt(dquot->dq_sb);
+
+ mutex_lock(&dquot->dq_lock);
++ memalloc = memalloc_nofs_save();
+ /* Check whether we are not racing with some other dqget() */
+ if (dquot_is_busy(dquot))
+ goto out_dqlock;
+@@ -510,6 +519,7 @@ int dquot_release(struct dquot *dquot)
+ }
+ clear_bit(DQ_ACTIVE_B, &dquot->dq_flags);
+ out_dqlock:
++ memalloc_nofs_restore(memalloc);
+ mutex_unlock(&dquot->dq_lock);
+ return ret;
+ }
+diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
+index 87ce24d238f34..8c2eed1b69c17 100644
+--- a/include/linux/backing-dev.h
++++ b/include/linux/backing-dev.h
+@@ -121,6 +121,8 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
+
+ extern struct backing_dev_info noop_backing_dev_info;
+
++int bdi_init(struct backing_dev_info *bdi);
++
+ /**
+ * writeback_in_progress - determine whether there is writeback in progress
+ * @wb: bdi_writeback of interest
+diff --git a/include/linux/netfs.h b/include/linux/netfs.h
+index c7bf1eaf51d5a..a9c6f73877ecb 100644
+--- a/include/linux/netfs.h
++++ b/include/linux/netfs.h
+@@ -119,9 +119,10 @@ typedef void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
+ bool was_async);
+
+ /*
+- * Per-inode description. This must be directly after the inode struct.
++ * Per-inode context. This wraps the VFS inode.
+ */
+-struct netfs_i_context {
++struct netfs_inode {
++ struct inode inode; /* The VFS inode */
+ const struct netfs_request_ops *ops;
+ #if IS_ENABLED(CONFIG_FSCACHE)
+ struct fscache_cookie *cache;
+@@ -255,7 +256,7 @@ struct netfs_cache_ops {
+ * boundary as appropriate.
+ */
+ enum netfs_io_source (*prepare_read)(struct netfs_io_subrequest *subreq,
+- loff_t i_size);
++ loff_t i_size);
+
+ /* Prepare a write operation, working out what part of the write we can
+ * actually do.
+@@ -287,45 +288,35 @@ extern void netfs_put_subrequest(struct netfs_io_subrequest *subreq,
+ extern void netfs_stats_show(struct seq_file *);
+
+ /**
+- * netfs_i_context - Get the netfs inode context from the inode
++ * netfs_inode - Get the netfs inode context from the inode
+ * @inode: The inode to query
+ *
+ * Get the netfs lib inode context from the network filesystem's inode. The
+ * context struct is expected to directly follow on from the VFS inode struct.
+ */
+-static inline struct netfs_i_context *netfs_i_context(struct inode *inode)
++static inline struct netfs_inode *netfs_inode(struct inode *inode)
+ {
+- return (struct netfs_i_context *)(inode + 1);
++ return container_of(inode, struct netfs_inode, inode);
+ }
+
+ /**
+- * netfs_inode - Get the netfs inode from the inode context
+- * @ctx: The context to query
+- *
+- * Get the netfs inode from the netfs library's inode context. The VFS inode
+- * is expected to directly precede the context struct.
+- */
+-static inline struct inode *netfs_inode(struct netfs_i_context *ctx)
+-{
+- return ((struct inode *)ctx) - 1;
+-}
+-
+-/**
+- * netfs_i_context_init - Initialise a netfs lib context
++ * netfs_inode_init - Initialise a netfslib inode context
+ * @inode: The inode with which the context is associated
+ * @ops: The netfs's operations list
+ *
+ * Initialise the netfs library context struct. This is expected to follow on
+ * directly from the VFS inode struct.
+ */
+-static inline void netfs_i_context_init(struct inode *inode,
+- const struct netfs_request_ops *ops)
++static inline void netfs_inode_init(struct inode *inode,
++ const struct netfs_request_ops *ops)
+ {
+- struct netfs_i_context *ctx = netfs_i_context(inode);
++ struct netfs_inode *ctx = netfs_inode(inode);
+
+- memset(ctx, 0, sizeof(*ctx));
+ ctx->ops = ops;
+ ctx->remote_i_size = i_size_read(inode);
++#if IS_ENABLED(CONFIG_FSCACHE)
++ ctx->cache = NULL;
++#endif
+ }
+
+ /**
+@@ -337,7 +328,7 @@ static inline void netfs_i_context_init(struct inode *inode,
+ */
+ static inline void netfs_resize_file(struct inode *inode, loff_t new_i_size)
+ {
+- struct netfs_i_context *ctx = netfs_i_context(inode);
++ struct netfs_inode *ctx = netfs_inode(inode);
+
+ ctx->remote_i_size = new_i_size;
+ }
+@@ -351,7 +342,7 @@ static inline void netfs_resize_file(struct inode *inode, loff_t new_i_size)
+ static inline struct fscache_cookie *netfs_i_cookie(struct inode *inode)
+ {
+ #if IS_ENABLED(CONFIG_FSCACHE)
+- struct netfs_i_context *ctx = netfs_i_context(inode);
++ struct netfs_inode *ctx = netfs_inode(inode);
+ return ctx->cache;
+ #else
+ return NULL;
+diff --git a/include/linux/objtool.h b/include/linux/objtool.h
+index 586d35720f135..c81ea2264ad8a 100644
+--- a/include/linux/objtool.h
++++ b/include/linux/objtool.h
+@@ -141,6 +141,12 @@ struct unwind_hint {
+ .popsection
+ .endm
+
++.macro STACK_FRAME_NON_STANDARD_FP func:req
++#ifdef CONFIG_FRAME_POINTER
++ STACK_FRAME_NON_STANDARD \func
++#endif
++.endm
++
+ .macro ANNOTATE_NOENDBR
+ .Lhere_\@:
+ .pushsection .discard.noendbr
+diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
+index 3a30cae8b0a55..2394441fa3dd5 100644
+--- a/include/linux/skbuff.h
++++ b/include/linux/skbuff.h
+@@ -3836,8 +3836,7 @@ struct sk_buff *__skb_try_recv_datagram(struct sock *sk,
+ struct sk_buff *__skb_recv_datagram(struct sock *sk,
+ struct sk_buff_head *sk_queue,
+ unsigned int flags, int *off, int *err);
+-struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned flags, int noblock,
+- int *err);
++struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned int flags, int *err);
+ __poll_t datagram_poll(struct file *file, struct socket *sock,
+ struct poll_table_struct *wait);
+ int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
+diff --git a/include/net/ipv6.h b/include/net/ipv6.h
+index 213612f1680c7..023435ce16062 100644
+--- a/include/net/ipv6.h
++++ b/include/net/ipv6.h
+@@ -1019,7 +1019,7 @@ int ip6_find_1stfragopt(struct sk_buff *skb, u8 **nexthdr);
+ int ip6_append_data(struct sock *sk,
+ int getfrag(void *from, char *to, int offset, int len,
+ int odd, struct sk_buff *skb),
+- void *from, int length, int transhdrlen,
++ void *from, size_t length, int transhdrlen,
+ struct ipcm6_cookie *ipc6, struct flowi6 *fl6,
+ struct rt6_info *rt, unsigned int flags);
+
+@@ -1035,7 +1035,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk, struct sk_buff_head *queue,
+ struct sk_buff *ip6_make_skb(struct sock *sk,
+ int getfrag(void *from, char *to, int offset,
+ int len, int odd, struct sk_buff *skb),
+- void *from, int length, int transhdrlen,
++ void *from, size_t length, int transhdrlen,
+ struct ipcm6_cookie *ipc6,
+ struct rt6_info *rt, unsigned int flags,
+ struct inet_cork_full *cork);
+diff --git a/init/Kconfig b/init/Kconfig
+index b19e2eeaae803..fa63cc019ebfc 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -899,6 +899,15 @@ config CC_IMPLICIT_FALLTHROUGH
+ default "-Wimplicit-fallthrough=5" if CC_IS_GCC && $(cc-option,-Wimplicit-fallthrough=5)
+ default "-Wimplicit-fallthrough" if CC_IS_CLANG && $(cc-option,-Wunreachable-code-fallthrough)
+
++# Currently, disable gcc-12 array-bounds globally.
++# We may want to target only particular configurations some day.
++config GCC12_NO_ARRAY_BOUNDS
++ def_bool y
++
++config CC_NO_ARRAY_BOUNDS
++ bool
++ default y if CC_IS_GCC && GCC_VERSION >= 120000 && GCC_VERSION < 130000 && GCC12_NO_ARRAY_BOUNDS
++
+ #
+ # For architectures that know their GCC __int128 support is sound
+ #
+diff --git a/kernel/auditsc.c b/kernel/auditsc.c
+index f3a2abd6d1a19..3a8c9d744800a 100644
+--- a/kernel/auditsc.c
++++ b/kernel/auditsc.c
+@@ -1014,10 +1014,10 @@ static void audit_reset_context(struct audit_context *ctx)
+ ctx->target_comm[0] = '\0';
+ unroll_tree_refs(ctx, NULL, 0);
+ WARN_ON(!list_empty(&ctx->killed_trees));
+- ctx->type = 0;
+ audit_free_module(ctx);
+ ctx->fds[0] = -1;
+ audit_proctitle_free(ctx);
++ ctx->type = 0; /* reset last for audit_free_*() */
+ }
+
+ static inline struct audit_context *audit_alloc_context(enum audit_state state)
+diff --git a/kernel/cfi.c b/kernel/cfi.c
+index 9594cfd1cf2cf..08102d19ec15a 100644
+--- a/kernel/cfi.c
++++ b/kernel/cfi.c
+@@ -281,6 +281,8 @@ static inline cfi_check_fn find_module_check_fn(unsigned long ptr)
+ static inline cfi_check_fn find_check_fn(unsigned long ptr)
+ {
+ cfi_check_fn fn = NULL;
++ unsigned long flags;
++ bool rcu_idle;
+
+ if (is_kernel_text(ptr))
+ return __cfi_check;
+@@ -290,13 +292,21 @@ static inline cfi_check_fn find_check_fn(unsigned long ptr)
+ * the shadow and __module_address use RCU, so we need to wake it
+ * up if necessary.
+ */
+- RCU_NONIDLE({
+- if (IS_ENABLED(CONFIG_CFI_CLANG_SHADOW))
+- fn = find_shadow_check_fn(ptr);
++ rcu_idle = !rcu_is_watching();
++ if (rcu_idle) {
++ local_irq_save(flags);
++ rcu_irq_enter();
++ }
++
++ if (IS_ENABLED(CONFIG_CFI_CLANG_SHADOW))
++ fn = find_shadow_check_fn(ptr);
++ if (!fn)
++ fn = find_module_check_fn(ptr);
+
+- if (!fn)
+- fn = find_module_check_fn(ptr);
+- });
++ if (rcu_idle) {
++ rcu_irq_exit();
++ local_irq_restore(flags);
++ }
+
+ return fn;
+ }
+diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
+index ac740630c79c2..2caafd13f8aac 100644
+--- a/kernel/dma/debug.c
++++ b/kernel/dma/debug.c
+@@ -564,7 +564,7 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
+
+ rc = active_cacheline_insert(entry);
+ if (rc == -ENOMEM) {
+- pr_err("cacheline tracking ENOMEM, dma-debug disabled\n");
++ pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
+ global_disable = true;
+ } else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
+ err_printk(entry->dev, entry,
+diff --git a/kernel/sched/core.c b/kernel/sched/core.c
+index e58d894df2074..dd11daa7a84b1 100644
+--- a/kernel/sched/core.c
++++ b/kernel/sched/core.c
+@@ -4755,25 +4755,55 @@ static void do_balance_callbacks(struct rq *rq, struct callback_head *head)
+
+ static void balance_push(struct rq *rq);
+
++/*
++ * balance_push_callback is a right abuse of the callback interface and plays
++ * by significantly different rules.
++ *
++ * Where the normal balance_callback's purpose is to be ran in the same context
++ * that queued it (only later, when it's safe to drop rq->lock again),
++ * balance_push_callback is specifically targeted at __schedule().
++ *
++ * This abuse is tolerated because it places all the unlikely/odd cases behind
++ * a single test, namely: rq->balance_callback == NULL.
++ */
+ struct callback_head balance_push_callback = {
+ .next = NULL,
+ .func = (void (*)(struct callback_head *))balance_push,
+ };
+
+-static inline struct callback_head *splice_balance_callbacks(struct rq *rq)
++static inline struct callback_head *
++__splice_balance_callbacks(struct rq *rq, bool split)
+ {
+ struct callback_head *head = rq->balance_callback;
+
++ if (likely(!head))
++ return NULL;
++
+ lockdep_assert_rq_held(rq);
+- if (head)
++ /*
++ * Must not take balance_push_callback off the list when
++ * splice_balance_callbacks() and balance_callbacks() are not
++ * in the same rq->lock section.
++ *
++ * In that case it would be possible for __schedule() to interleave
++ * and observe the list empty.
++ */
++ if (split && head == &balance_push_callback)
++ head = NULL;
++ else
+ rq->balance_callback = NULL;
+
+ return head;
+ }
+
++static inline struct callback_head *splice_balance_callbacks(struct rq *rq)
++{
++ return __splice_balance_callbacks(rq, true);
++}
++
+ static void __balance_callbacks(struct rq *rq)
+ {
+- do_balance_callbacks(rq, splice_balance_callbacks(rq));
++ do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
+ }
+
+ static inline void balance_callbacks(struct rq *rq, struct callback_head *head)
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index 0d2b6b758f324..84bba67c92dc6 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -1686,6 +1686,11 @@ queue_balance_callback(struct rq *rq,
+ {
+ lockdep_assert_rq_held(rq);
+
++ /*
++ * Don't (re)queue an already queued item; nor queue anything when
++ * balance_push() is active, see the comment with
++ * balance_push_callback.
++ */
+ if (unlikely(head->next || rq->balance_callback == &balance_push_callback))
+ return;
+
+diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
+index 6b58fc6813dfc..41d0d9657fa19 100644
+--- a/kernel/trace/bpf_trace.c
++++ b/kernel/trace/bpf_trace.c
+@@ -2433,7 +2433,7 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
+ return -EINVAL;
+
+ size = cnt * sizeof(*addrs);
+- addrs = kvmalloc(size, GFP_KERNEL);
++ addrs = kvmalloc_array(cnt, sizeof(*addrs), GFP_KERNEL);
+ if (!addrs)
+ return -ENOMEM;
+
+@@ -2450,7 +2450,7 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
+
+ ucookies = u64_to_user_ptr(attr->link_create.kprobe_multi.cookies);
+ if (ucookies) {
+- cookies = kvmalloc(size, GFP_KERNEL);
++ cookies = kvmalloc_array(cnt, sizeof(*addrs), GFP_KERNEL);
+ if (!cookies) {
+ err = -ENOMEM;
+ goto error;
+diff --git a/lib/Kconfig b/lib/Kconfig
+index 087e06b4cdfde..55f0bba8f8c00 100644
+--- a/lib/Kconfig
++++ b/lib/Kconfig
+@@ -120,6 +120,9 @@ config INDIRECT_IOMEM_FALLBACK
+
+ source "lib/crypto/Kconfig"
+
++config LIB_MEMNEQ
++ bool
++
+ config CRC_CCITT
+ tristate "CRC-CCITT functions"
+ help
+diff --git a/lib/Makefile b/lib/Makefile
+index 08053df16c7c8..60843ab661ba6 100644
+--- a/lib/Makefile
++++ b/lib/Makefile
+@@ -251,6 +251,7 @@ obj-$(CONFIG_DIMLIB) += dim/
+ obj-$(CONFIG_SIGNATURE) += digsig.o
+
+ lib-$(CONFIG_CLZ_TAB) += clz_tab.o
++lib-$(CONFIG_LIB_MEMNEQ) += memneq.o
+
+ obj-$(CONFIG_GENERIC_STRNCPY_FROM_USER) += strncpy_from_user.o
+ obj-$(CONFIG_GENERIC_STRNLEN_USER) += strnlen_user.o
+diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
+index 379a66d7f504c..017cba1253865 100644
+--- a/lib/crypto/Kconfig
++++ b/lib/crypto/Kconfig
+@@ -71,6 +71,7 @@ config CRYPTO_LIB_CURVE25519
+ tristate "Curve25519 scalar multiplication library"
+ depends on CRYPTO_ARCH_HAVE_LIB_CURVE25519 || !CRYPTO_ARCH_HAVE_LIB_CURVE25519
+ select CRYPTO_LIB_CURVE25519_GENERIC if CRYPTO_ARCH_HAVE_LIB_CURVE25519=n
++ select LIB_MEMNEQ
+ help
+ Enable the Curve25519 library interface. This interface may be
+ fulfilled by either the generic implementation or an arch-specific
+diff --git a/lib/memneq.c b/lib/memneq.c
+new file mode 100644
+index 0000000000000..fb11608b1ec1d
+--- /dev/null
++++ b/lib/memneq.c
+@@ -0,0 +1,176 @@
++/*
++ * Constant-time equality testing of memory regions.
++ *
++ * Authors:
++ *
++ * James Yonan <james@openvpn.net>
++ * Daniel Borkmann <dborkman@redhat.com>
++ *
++ * This file is provided under a dual BSD/GPLv2 license. When using or
++ * redistributing this file, you may do so under either license.
++ *
++ * GPL LICENSE SUMMARY
++ *
++ * Copyright(c) 2013 OpenVPN Technologies, Inc. All rights reserved.
++ *
++ * This program is free software; you can redistribute it and/or modify
++ * it under the terms of version 2 of the GNU General Public License as
++ * published by the Free Software Foundation.
++ *
++ * This program is distributed in the hope that it will be useful, but
++ * WITHOUT ANY WARRANTY; without even the implied warranty of
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
++ * General Public License for more details.
++ *
++ * You should have received a copy of the GNU General Public License
++ * along with this program; if not, write to the Free Software
++ * Foundation, Inc., 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
++ * The full GNU General Public License is included in this distribution
++ * in the file called LICENSE.GPL.
++ *
++ * BSD LICENSE
++ *
++ * Copyright(c) 2013 OpenVPN Technologies, Inc. All rights reserved.
++ *
++ * Redistribution and use in source and binary forms, with or without
++ * modification, are permitted provided that the following conditions
++ * are met:
++ *
++ * * Redistributions of source code must retain the above copyright
++ * notice, this list of conditions and the following disclaimer.
++ * * Redistributions in binary form must reproduce the above copyright
++ * notice, this list of conditions and the following disclaimer in
++ * the documentation and/or other materials provided with the
++ * distribution.
++ * * Neither the name of OpenVPN Technologies nor the names of its
++ * contributors may be used to endorse or promote products derived
++ * from this software without specific prior written permission.
++ *
++ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
++ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
++ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
++ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
++ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
++ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
++ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
++ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
++ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
++ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
++ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
++ */
++
++#include <crypto/algapi.h>
++#include <asm/unaligned.h>
++
++#ifndef __HAVE_ARCH_CRYPTO_MEMNEQ
++
++/* Generic path for arbitrary size */
++static inline unsigned long
++__crypto_memneq_generic(const void *a, const void *b, size_t size)
++{
++ unsigned long neq = 0;
++
++#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
++ while (size >= sizeof(unsigned long)) {
++ neq |= get_unaligned((unsigned long *)a) ^
++ get_unaligned((unsigned long *)b);
++ OPTIMIZER_HIDE_VAR(neq);
++ a += sizeof(unsigned long);
++ b += sizeof(unsigned long);
++ size -= sizeof(unsigned long);
++ }
++#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
++ while (size > 0) {
++ neq |= *(unsigned char *)a ^ *(unsigned char *)b;
++ OPTIMIZER_HIDE_VAR(neq);
++ a += 1;
++ b += 1;
++ size -= 1;
++ }
++ return neq;
++}
++
++/* Loop-free fast-path for frequently used 16-byte size */
++static inline unsigned long __crypto_memneq_16(const void *a, const void *b)
++{
++ unsigned long neq = 0;
++
++#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
++ if (sizeof(unsigned long) == 8) {
++ neq |= get_unaligned((unsigned long *)a) ^
++ get_unaligned((unsigned long *)b);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= get_unaligned((unsigned long *)(a + 8)) ^
++ get_unaligned((unsigned long *)(b + 8));
++ OPTIMIZER_HIDE_VAR(neq);
++ } else if (sizeof(unsigned int) == 4) {
++ neq |= get_unaligned((unsigned int *)a) ^
++ get_unaligned((unsigned int *)b);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= get_unaligned((unsigned int *)(a + 4)) ^
++ get_unaligned((unsigned int *)(b + 4));
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= get_unaligned((unsigned int *)(a + 8)) ^
++ get_unaligned((unsigned int *)(b + 8));
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= get_unaligned((unsigned int *)(a + 12)) ^
++ get_unaligned((unsigned int *)(b + 12));
++ OPTIMIZER_HIDE_VAR(neq);
++ } else
++#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
++ {
++ neq |= *(unsigned char *)(a) ^ *(unsigned char *)(b);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+1) ^ *(unsigned char *)(b+1);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+2) ^ *(unsigned char *)(b+2);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+3) ^ *(unsigned char *)(b+3);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+4) ^ *(unsigned char *)(b+4);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+5) ^ *(unsigned char *)(b+5);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+6) ^ *(unsigned char *)(b+6);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+7) ^ *(unsigned char *)(b+7);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+8) ^ *(unsigned char *)(b+8);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+9) ^ *(unsigned char *)(b+9);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+10) ^ *(unsigned char *)(b+10);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+11) ^ *(unsigned char *)(b+11);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+12) ^ *(unsigned char *)(b+12);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+13) ^ *(unsigned char *)(b+13);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+14) ^ *(unsigned char *)(b+14);
++ OPTIMIZER_HIDE_VAR(neq);
++ neq |= *(unsigned char *)(a+15) ^ *(unsigned char *)(b+15);
++ OPTIMIZER_HIDE_VAR(neq);
++ }
++
++ return neq;
++}
++
++/* Compare two areas of memory without leaking timing information,
++ * and with special optimizations for common sizes. Users should
++ * not call this function directly, but should instead use
++ * crypto_memneq defined in crypto/algapi.h.
++ */
++noinline unsigned long __crypto_memneq(const void *a, const void *b,
++ size_t size)
++{
++ switch (size) {
++ case 16:
++ return __crypto_memneq_16(a, b);
++ default:
++ return __crypto_memneq_generic(a, b, size);
++ }
++}
++EXPORT_SYMBOL(__crypto_memneq);
++
++#endif /* __HAVE_ARCH_CRYPTO_MEMNEQ */
+diff --git a/mm/backing-dev.c b/mm/backing-dev.c
+index 7176af65b103a..e262739a0a238 100644
+--- a/mm/backing-dev.c
++++ b/mm/backing-dev.c
+@@ -230,20 +230,13 @@ static __init int bdi_class_init(void)
+ }
+ postcore_initcall(bdi_class_init);
+
+-static int bdi_init(struct backing_dev_info *bdi);
+-
+ static int __init default_bdi_init(void)
+ {
+- int err;
+-
+ bdi_wq = alloc_workqueue("writeback", WQ_MEM_RECLAIM | WQ_UNBOUND |
+ WQ_SYSFS, 0);
+ if (!bdi_wq)
+ return -ENOMEM;
+-
+- err = bdi_init(&noop_backing_dev_info);
+-
+- return err;
++ return 0;
+ }
+ subsys_initcall(default_bdi_init);
+
+@@ -782,7 +775,7 @@ static void cgwb_remove_from_bdi_list(struct bdi_writeback *wb)
+
+ #endif /* CONFIG_CGROUP_WRITEBACK */
+
+-static int bdi_init(struct backing_dev_info *bdi)
++int bdi_init(struct backing_dev_info *bdi)
+ {
+ int ret;
+
+diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
+index bf5736c1d4584..a06f4d4a6f476 100644
+--- a/net/appletalk/ddp.c
++++ b/net/appletalk/ddp.c
+@@ -1753,8 +1753,7 @@ static int atalk_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ int err = 0;
+ struct sk_buff *skb;
+
+- skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
+- flags & MSG_DONTWAIT, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ lock_sock(sk);
+
+ if (!skb)
+diff --git a/net/atm/common.c b/net/atm/common.c
+index 1cfa9bf1d1871..d0c8ab7ff8f6a 100644
+--- a/net/atm/common.c
++++ b/net/atm/common.c
+@@ -540,7 +540,7 @@ int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ !test_bit(ATM_VF_READY, &vcc->flags))
+ return 0;
+
+- skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &error);
++ skb = skb_recv_datagram(sk, flags, &error);
+ if (!skb)
+ return error;
+
+diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
+index 289f355e18531..4c7030ed8d331 100644
+--- a/net/ax25/af_ax25.c
++++ b/net/ax25/af_ax25.c
+@@ -1661,9 +1661,12 @@ static int ax25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ int flags)
+ {
+ struct sock *sk = sock->sk;
+- struct sk_buff *skb;
++ struct sk_buff *skb, *last;
++ struct sk_buff_head *sk_queue;
+ int copied;
+ int err = 0;
++ int off = 0;
++ long timeo;
+
+ lock_sock(sk);
+ /*
+@@ -1675,11 +1678,29 @@ static int ax25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ goto out;
+ }
+
+- /* Now we can treat all alike */
+- skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
+- flags & MSG_DONTWAIT, &err);
+- if (skb == NULL)
+- goto out;
++ /* We need support for non-blocking reads. */
++ sk_queue = &sk->sk_receive_queue;
++ skb = __skb_try_recv_datagram(sk, sk_queue, flags, &off, &err, &last);
++ /* If no packet is available, release_sock(sk) and try again. */
++ if (!skb) {
++ if (err != -EAGAIN)
++ goto out;
++ release_sock(sk);
++ timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
++ while (timeo && !__skb_wait_for_more_packets(sk, sk_queue, &err,
++ &timeo, last)) {
++ skb = __skb_try_recv_datagram(sk, sk_queue, flags, &off,
++ &err, &last);
++ if (skb)
++ break;
++
++ if (err != -EAGAIN)
++ goto done;
++ }
++ if (!skb)
++ goto done;
++ lock_sock(sk);
++ }
+
+ if (!sk_to_ax25(sk)->pidincl)
+ skb_pull(skb, 1); /* Remove PID */
+@@ -1726,6 +1747,7 @@ static int ax25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ out:
+ release_sock(sk);
+
++done:
+ return err;
+ }
+
+diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
+index a0cb2e3da8d4c..62705734343b6 100644
+--- a/net/bluetooth/af_bluetooth.c
++++ b/net/bluetooth/af_bluetooth.c
+@@ -251,7 +251,6 @@ EXPORT_SYMBOL(bt_accept_dequeue);
+ int bt_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ int flags)
+ {
+- int noblock = flags & MSG_DONTWAIT;
+ struct sock *sk = sock->sk;
+ struct sk_buff *skb;
+ size_t copied;
+@@ -263,7 +262,7 @@ int bt_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ if (flags & MSG_OOB)
+ return -EOPNOTSUPP;
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb) {
+ if (sk->sk_shutdown & RCV_SHUTDOWN)
+ return 0;
+diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
+index 33b3c0ffc3399..189e3115c8c62 100644
+--- a/net/bluetooth/hci_sock.c
++++ b/net/bluetooth/hci_sock.c
+@@ -1453,7 +1453,6 @@ static void hci_sock_cmsg(struct sock *sk, struct msghdr *msg,
+ static int hci_sock_recvmsg(struct socket *sock, struct msghdr *msg,
+ size_t len, int flags)
+ {
+- int noblock = flags & MSG_DONTWAIT;
+ struct sock *sk = sock->sk;
+ struct sk_buff *skb;
+ int copied, err;
+@@ -1470,7 +1469,7 @@ static int hci_sock_recvmsg(struct socket *sock, struct msghdr *msg,
+ if (sk->sk_state == BT_CLOSED)
+ return 0;
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ return err;
+
+diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
+index 2b8892d502f7f..251e666ba9a28 100644
+--- a/net/caif/caif_socket.c
++++ b/net/caif/caif_socket.c
+@@ -282,7 +282,7 @@ static int caif_seqpkt_recvmsg(struct socket *sock, struct msghdr *m,
+ if (flags & MSG_OOB)
+ goto read_error;
+
+- skb = skb_recv_datagram(sk, flags, 0 , &ret);
++ skb = skb_recv_datagram(sk, flags, &ret);
+ if (!skb)
+ goto read_error;
+ copylen = skb->len;
+diff --git a/net/can/bcm.c b/net/can/bcm.c
+index 95d209b52e6a6..64c07e650bb41 100644
+--- a/net/can/bcm.c
++++ b/net/can/bcm.c
+@@ -1632,12 +1632,9 @@ static int bcm_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ struct sock *sk = sock->sk;
+ struct sk_buff *skb;
+ int error = 0;
+- int noblock;
+ int err;
+
+- noblock = flags & MSG_DONTWAIT;
+- flags &= ~MSG_DONTWAIT;
+- skb = skb_recv_datagram(sk, flags, noblock, &error);
++ skb = skb_recv_datagram(sk, flags, &error);
+ if (!skb)
+ return error;
+
+diff --git a/net/can/isotp.c b/net/can/isotp.c
+index 1e7c6a460ef9a..35a1ae61744ca 100644
+--- a/net/can/isotp.c
++++ b/net/can/isotp.c
+@@ -1055,7 +1055,6 @@ static int isotp_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ struct sock *sk = sock->sk;
+ struct sk_buff *skb;
+ struct isotp_sock *so = isotp_sk(sk);
+- int noblock = flags & MSG_DONTWAIT;
+ int ret = 0;
+
+ if (flags & ~(MSG_DONTWAIT | MSG_TRUNC | MSG_PEEK))
+@@ -1064,8 +1063,7 @@ static int isotp_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ if (!so->bound)
+ return -EADDRNOTAVAIL;
+
+- flags &= ~MSG_DONTWAIT;
+- skb = skb_recv_datagram(sk, flags, noblock, &ret);
++ skb = skb_recv_datagram(sk, flags, &ret);
+ if (!skb)
+ return ret;
+
+diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
+index 6dff4510687a4..0bb4fd3f6264d 100644
+--- a/net/can/j1939/socket.c
++++ b/net/can/j1939/socket.c
+@@ -802,7 +802,7 @@ static int j1939_sk_recvmsg(struct socket *sock, struct msghdr *msg,
+ return sock_recv_errqueue(sock->sk, msg, size, SOL_CAN_J1939,
+ SCM_J1939_ERRQUEUE);
+
+- skb = skb_recv_datagram(sk, flags, 0, &ret);
++ skb = skb_recv_datagram(sk, flags, &ret);
+ if (!skb)
+ return ret;
+
+diff --git a/net/can/raw.c b/net/can/raw.c
+index 7105fa4824e4b..0cf728dcff36f 100644
+--- a/net/can/raw.c
++++ b/net/can/raw.c
+@@ -846,16 +846,12 @@ static int raw_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ struct sock *sk = sock->sk;
+ struct sk_buff *skb;
+ int err = 0;
+- int noblock;
+-
+- noblock = flags & MSG_DONTWAIT;
+- flags &= ~MSG_DONTWAIT;
+
+ if (flags & MSG_ERRQUEUE)
+ return sock_recv_errqueue(sk, msg, size,
+ SOL_CAN_RAW, SCM_CAN_RAW_ERRQUEUE);
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ return err;
+
+diff --git a/net/core/datagram.c b/net/core/datagram.c
+index ee290776c661d..70126d15ca6e0 100644
+--- a/net/core/datagram.c
++++ b/net/core/datagram.c
+@@ -310,12 +310,11 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk,
+ EXPORT_SYMBOL(__skb_recv_datagram);
+
+ struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned int flags,
+- int noblock, int *err)
++ int *err)
+ {
+ int off = 0;
+
+- return __skb_recv_datagram(sk, &sk->sk_receive_queue,
+- flags | (noblock ? MSG_DONTWAIT : 0),
++ return __skb_recv_datagram(sk, &sk->sk_receive_queue, flags,
+ &off, err);
+ }
+ EXPORT_SYMBOL(skb_recv_datagram);
+diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
+index 3b2366a88c3cc..a725dd9bbda8b 100644
+--- a/net/ieee802154/socket.c
++++ b/net/ieee802154/socket.c
+@@ -314,7 +314,8 @@ static int raw_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+ int err = -EOPNOTSUPP;
+ struct sk_buff *skb;
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto out;
+
+@@ -703,7 +704,8 @@ static int dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+ struct dgram_sock *ro = dgram_sk(sk);
+ DECLARE_SOCKADDR(struct sockaddr_ieee802154 *, saddr, msg->msg_name);
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto out;
+
+diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
+index aa9a11b20d18e..4e5ceca7ff7f9 100644
+--- a/net/ipv4/ping.c
++++ b/net/ipv4/ping.c
+@@ -871,7 +871,8 @@ int ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int noblock,
+ if (flags & MSG_ERRQUEUE)
+ return inet_recv_error(sk, msg, len, addr_len);
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto out;
+
+diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
+index 9f97b9cbf7b37..c9dd9603f2e73 100644
+--- a/net/ipv4/raw.c
++++ b/net/ipv4/raw.c
+@@ -769,7 +769,8 @@ static int raw_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+ goto out;
+ }
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto out;
+
+diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
+index fa63ef2bd99cc..87067e0ddaa39 100644
+--- a/net/ipv6/ip6_output.c
++++ b/net/ipv6/ip6_output.c
+@@ -1428,7 +1428,7 @@ static int __ip6_append_data(struct sock *sk,
+ struct page_frag *pfrag,
+ int getfrag(void *from, char *to, int offset,
+ int len, int odd, struct sk_buff *skb),
+- void *from, int length, int transhdrlen,
++ void *from, size_t length, int transhdrlen,
+ unsigned int flags, struct ipcm6_cookie *ipc6)
+ {
+ struct sk_buff *skb, *skb_prev = NULL;
+@@ -1776,7 +1776,7 @@ error:
+ int ip6_append_data(struct sock *sk,
+ int getfrag(void *from, char *to, int offset, int len,
+ int odd, struct sk_buff *skb),
+- void *from, int length, int transhdrlen,
++ void *from, size_t length, int transhdrlen,
+ struct ipcm6_cookie *ipc6, struct flowi6 *fl6,
+ struct rt6_info *rt, unsigned int flags)
+ {
+@@ -1973,7 +1973,7 @@ EXPORT_SYMBOL_GPL(ip6_flush_pending_frames);
+ struct sk_buff *ip6_make_skb(struct sock *sk,
+ int getfrag(void *from, char *to, int offset,
+ int len, int odd, struct sk_buff *skb),
+- void *from, int length, int transhdrlen,
++ void *from, size_t length, int transhdrlen,
+ struct ipcm6_cookie *ipc6, struct rt6_info *rt,
+ unsigned int flags, struct inet_cork_full *cork)
+ {
+diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
+index c51d5ce3711c2..8bb41f3b246a9 100644
+--- a/net/ipv6/raw.c
++++ b/net/ipv6/raw.c
+@@ -477,7 +477,8 @@ static int rawv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+ if (np->rxpmtu && np->rxopt.bits.rxpmtu)
+ return ipv6_recv_rxpmtu(sk, msg, len, addr_len);
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto out;
+
+diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
+index a1760add5bf1e..a0385ddbffcfc 100644
+--- a/net/iucv/af_iucv.c
++++ b/net/iucv/af_iucv.c
+@@ -1223,7 +1223,6 @@ static void iucv_process_message_q(struct sock *sk)
+ static int iucv_sock_recvmsg(struct socket *sock, struct msghdr *msg,
+ size_t len, int flags)
+ {
+- int noblock = flags & MSG_DONTWAIT;
+ struct sock *sk = sock->sk;
+ struct iucv_sock *iucv = iucv_sk(sk);
+ unsigned int copied, rlen;
+@@ -1242,7 +1241,7 @@ static int iucv_sock_recvmsg(struct socket *sock, struct msghdr *msg,
+
+ /* receive/dequeue next skb:
+ * the function understands MSG_PEEK and, thus, does not dequeue skb */
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb) {
+ if (sk->sk_shutdown & RCV_SHUTDOWN)
+ return 0;
+diff --git a/net/key/af_key.c b/net/key/af_key.c
+index d93bde6573593..c249b84efbb20 100644
+--- a/net/key/af_key.c
++++ b/net/key/af_key.c
+@@ -3700,7 +3700,7 @@ static int pfkey_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT))
+ goto out;
+
+- skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (skb == NULL)
+ goto out;
+
+diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
+index b3edafa5fba4a..c6a5cc2d88e72 100644
+--- a/net/l2tp/l2tp_ip.c
++++ b/net/l2tp/l2tp_ip.c
+@@ -526,7 +526,8 @@ static int l2tp_ip_recvmsg(struct sock *sk, struct msghdr *msg,
+ if (flags & MSG_OOB)
+ goto out;
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto out;
+
+diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
+index 96f975777438f..8f76e647adbbd 100644
+--- a/net/l2tp/l2tp_ip6.c
++++ b/net/l2tp/l2tp_ip6.c
+@@ -502,14 +502,15 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ struct ipcm6_cookie ipc6;
+ int addr_len = msg->msg_namelen;
+ int transhdrlen = 4; /* zero session-id */
+- int ulen = len + transhdrlen;
++ int ulen;
+ int err;
+
+ /* Rough check on arithmetic overflow,
+ * better check is made in ip6_append_data().
+ */
+- if (len > INT_MAX)
++ if (len > INT_MAX - transhdrlen)
+ return -EMSGSIZE;
++ ulen = len + transhdrlen;
+
+ /* Mirror BSD error message compatibility */
+ if (msg->msg_flags & MSG_OOB)
+@@ -671,7 +672,8 @@ static int l2tp_ip6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+ if (flags & MSG_ERRQUEUE)
+ return ipv6_recv_error(sk, msg, len, addr_len);
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto out;
+
+diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
+index bf35710127dd0..8be1fdc68a0bb 100644
+--- a/net/l2tp/l2tp_ppp.c
++++ b/net/l2tp/l2tp_ppp.c
+@@ -191,8 +191,7 @@ static int pppol2tp_recvmsg(struct socket *sock, struct msghdr *msg,
+ goto end;
+
+ err = 0;
+- skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
+- flags & MSG_DONTWAIT, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb)
+ goto end;
+
+diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
+index e22b0cbb2f353..221863afc4b12 100644
+--- a/net/mctp/af_mctp.c
++++ b/net/mctp/af_mctp.c
+@@ -216,7 +216,7 @@ static int mctp_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ if (flags & ~(MSG_DONTWAIT | MSG_TRUNC | MSG_PEEK))
+ return -EOPNOTSUPP;
+
+- skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &rc);
++ skb = skb_recv_datagram(sk, flags, &rc);
+ if (!skb)
+ return rc;
+
+diff --git a/net/mctp/test/route-test.c b/net/mctp/test/route-test.c
+index 61205cf400746..24df29e135ed6 100644
+--- a/net/mctp/test/route-test.c
++++ b/net/mctp/test/route-test.c
+@@ -352,7 +352,7 @@ static void mctp_test_route_input_sk(struct kunit *test)
+ if (params->deliver) {
+ KUNIT_EXPECT_EQ(test, rc, 0);
+
+- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
++ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
+ KUNIT_EXPECT_NOT_ERR_OR_NULL(test, skb2);
+ KUNIT_EXPECT_EQ(test, skb->len, 1);
+
+@@ -360,7 +360,7 @@ static void mctp_test_route_input_sk(struct kunit *test)
+
+ } else {
+ KUNIT_EXPECT_NE(test, rc, 0);
+- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
++ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
+ KUNIT_EXPECT_PTR_EQ(test, skb2, NULL);
+ }
+
+@@ -423,7 +423,7 @@ static void mctp_test_route_input_sk_reasm(struct kunit *test)
+ rc = mctp_route_input(&rt->rt, skb);
+ }
+
+- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
++ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
+
+ if (params->rx_len) {
+ KUNIT_EXPECT_NOT_ERR_OR_NULL(test, skb2);
+@@ -582,7 +582,7 @@ static void mctp_test_route_input_sk_keys(struct kunit *test)
+ rc = mctp_route_input(&rt->rt, skb);
+
+ /* (potentially) receive message */
+- skb2 = skb_recv_datagram(sock->sk, 0, 1, &rc);
++ skb2 = skb_recv_datagram(sock->sk, MSG_DONTWAIT, &rc);
+
+ if (params->deliver)
+ KUNIT_EXPECT_NOT_ERR_OR_NULL(test, skb2);
+diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
+index 73e9c0a9c1876..0cd91f813a3bd 100644
+--- a/net/netlink/af_netlink.c
++++ b/net/netlink/af_netlink.c
+@@ -1931,7 +1931,6 @@ static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ struct scm_cookie scm;
+ struct sock *sk = sock->sk;
+ struct netlink_sock *nlk = nlk_sk(sk);
+- int noblock = flags & MSG_DONTWAIT;
+ size_t copied;
+ struct sk_buff *skb, *data_skb;
+ int err, ret;
+@@ -1941,7 +1940,7 @@ static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+
+ copied = 0;
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (skb == NULL)
+ goto out;
+
+diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
+index fa9dc2ba39418..6f7f4392cffb1 100644
+--- a/net/netrom/af_netrom.c
++++ b/net/netrom/af_netrom.c
+@@ -1159,7 +1159,8 @@ static int nr_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ }
+
+ /* Now we can treat all alike */
+- if ((skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &er)) == NULL) {
++ skb = skb_recv_datagram(sk, flags, &er);
++ if (!skb) {
+ release_sock(sk);
+ return er;
+ }
+diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
+index 4ca35791c93b7..77642d18a3b43 100644
+--- a/net/nfc/llcp_sock.c
++++ b/net/nfc/llcp_sock.c
+@@ -821,7 +821,6 @@ static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
+ static int llcp_sock_recvmsg(struct socket *sock, struct msghdr *msg,
+ size_t len, int flags)
+ {
+- int noblock = flags & MSG_DONTWAIT;
+ struct sock *sk = sock->sk;
+ unsigned int copied, rlen;
+ struct sk_buff *skb, *cskb;
+@@ -842,7 +841,7 @@ static int llcp_sock_recvmsg(struct socket *sock, struct msghdr *msg,
+ if (flags & (MSG_OOB))
+ return -EOPNOTSUPP;
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+ if (!skb) {
+ pr_err("Recv datagram failed state %d %d %d",
+ sk->sk_state, err, sock_error(sk));
+diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
+index 0ca214ab5aeff..8dd569765f96e 100644
+--- a/net/nfc/rawsock.c
++++ b/net/nfc/rawsock.c
+@@ -238,7 +238,6 @@ static int rawsock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+ static int rawsock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ int flags)
+ {
+- int noblock = flags & MSG_DONTWAIT;
+ struct sock *sk = sock->sk;
+ struct sk_buff *skb;
+ int copied;
+@@ -246,7 +245,7 @@ static int rawsock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+
+ pr_debug("sock=%p sk=%p len=%zu flags=%d\n", sock, sk, len, flags);
+
+- skb = skb_recv_datagram(sk, flags, noblock, &rc);
++ skb = skb_recv_datagram(sk, flags, &rc);
+ if (!skb)
+ return rc;
+
+diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
+index 002d2b9c69dd1..243566129784a 100644
+--- a/net/packet/af_packet.c
++++ b/net/packet/af_packet.c
+@@ -3426,7 +3426,7 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ * but then it will block.
+ */
+
+- skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &err);
++ skb = skb_recv_datagram(sk, flags, &err);
+
+ /*
+ * An error occurred so return it. Because skb_recv_datagram()
+diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
+index 393e6aa7a5927..3f2e62b63dd42 100644
+--- a/net/phonet/datagram.c
++++ b/net/phonet/datagram.c
+@@ -123,7 +123,8 @@ static int pn_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+ MSG_CMSG_COMPAT))
+ goto out_nofree;
+
+- skb = skb_recv_datagram(sk, flags, noblock, &rval);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &rval);
+ if (skb == NULL)
+ goto out_nofree;
+
+diff --git a/net/phonet/pep.c b/net/phonet/pep.c
+index 65d463ad87707..441a267065923 100644
+--- a/net/phonet/pep.c
++++ b/net/phonet/pep.c
+@@ -772,7 +772,8 @@ static struct sock *pep_sock_accept(struct sock *sk, int flags, int *errp,
+ u8 pipe_handle, enabled, n_sb;
+ u8 aligned = 0;
+
+- skb = skb_recv_datagram(sk, 0, flags & O_NONBLOCK, errp);
++ skb = skb_recv_datagram(sk, (flags & O_NONBLOCK) ? MSG_DONTWAIT : 0,
++ errp);
+ if (!skb)
+ return NULL;
+
+@@ -1267,7 +1268,8 @@ static int pep_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+ return -EINVAL;
+ }
+
+- skb = skb_recv_datagram(sk, flags, noblock, &err);
++ flags |= (noblock ? MSG_DONTWAIT : 0);
++ skb = skb_recv_datagram(sk, flags, &err);
+ lock_sock(sk);
+ if (skb == NULL) {
+ if (err == -ENOTCONN && sk->sk_state == TCP_CLOSE_WAIT)
+diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
+index ec23225297278..5c2fb992803b7 100644
+--- a/net/qrtr/af_qrtr.c
++++ b/net/qrtr/af_qrtr.c
+@@ -1035,8 +1035,7 @@ static int qrtr_recvmsg(struct socket *sock, struct msghdr *msg,
+ return -EADDRNOTAVAIL;
+ }
+
+- skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
+- flags & MSG_DONTWAIT, &rc);
++ skb = skb_recv_datagram(sk, flags, &rc);
+ if (!skb) {
+ release_sock(sk);
+ return rc;
+diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
+index 30a1cf4c16c67..bf2d986a6bc39 100644
+--- a/net/rose/af_rose.c
++++ b/net/rose/af_rose.c
+@@ -1230,7 +1230,8 @@ static int rose_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ return -ENOTCONN;
+
+ /* Now we can treat all alike */
+- if ((skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &er)) == NULL)
++ skb = skb_recv_datagram(sk, flags, &er);
++ if (!skb)
+ return er;
+
+ qbit = (skb->data[0] & ROSE_Q_BIT) == ROSE_Q_BIT;
+diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
+index e2c6eca0271b3..b6781ada3aa8d 100644
+--- a/net/sunrpc/clnt.c
++++ b/net/sunrpc/clnt.c
+@@ -651,6 +651,7 @@ static struct rpc_clnt *__rpc_clone_client(struct rpc_create_args *args,
+ new->cl_discrtry = clnt->cl_discrtry;
+ new->cl_chatty = clnt->cl_chatty;
+ new->cl_principal = clnt->cl_principal;
++ new->cl_max_connect = clnt->cl_max_connect;
+ return new;
+
+ out_err:
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index 4aed12e94221a..6114d69b8a2da 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -1643,7 +1643,8 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags,
+ * so that no locks are necessary.
+ */
+
+- skb = skb_recv_datagram(sk, 0, flags&O_NONBLOCK, &err);
++ skb = skb_recv_datagram(sk, (flags & O_NONBLOCK) ? MSG_DONTWAIT : 0,
++ &err);
+ if (!skb) {
+ /* This means receive shutdown. */
+ if (err == 0)
+@@ -2500,7 +2501,7 @@ static int unix_read_sock(struct sock *sk, read_descriptor_t *desc,
+ int used, err;
+
+ mutex_lock(&u->iolock);
+- skb = skb_recv_datagram(sk, 0, 1, &err);
++ skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err);
+ mutex_unlock(&u->iolock);
+ if (!skb)
+ return err;
+diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
+index b17dc9745188e..b14f0ed7427bc 100644
+--- a/net/vmw_vsock/vmci_transport.c
++++ b/net/vmw_vsock/vmci_transport.c
+@@ -1732,19 +1732,16 @@ static int vmci_transport_dgram_dequeue(struct vsock_sock *vsk,
+ int flags)
+ {
+ int err;
+- int noblock;
+ struct vmci_datagram *dg;
+ size_t payload_len;
+ struct sk_buff *skb;
+
+- noblock = flags & MSG_DONTWAIT;
+-
+ if (flags & MSG_OOB || flags & MSG_ERRQUEUE)
+ return -EOPNOTSUPP;
+
+ /* Retrieve the head sk_buff from the socket's receive queue. */
+ err = 0;
+- skb = skb_recv_datagram(&vsk->sk, flags, noblock, &err);
++ skb = skb_recv_datagram(&vsk->sk, flags, &err);
+ if (!skb)
+ return err;
+
+diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
+index 3a171828638b1..6bc2ac8d8146d 100644
+--- a/net/x25/af_x25.c
++++ b/net/x25/af_x25.c
+@@ -1315,8 +1315,7 @@ static int x25_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
+ } else {
+ /* Now we can treat all alike */
+ release_sock(sk);
+- skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
+- flags & MSG_DONTWAIT, &rc);
++ skb = skb_recv_datagram(sk, flags, &rc);
+ lock_sock(sk);
+ if (!skb)
+ goto out;
+diff --git a/scripts/faddr2line b/scripts/faddr2line
+index 0e6268d598835..94ed98dd899f3 100755
+--- a/scripts/faddr2line
++++ b/scripts/faddr2line
+@@ -95,17 +95,25 @@ __faddr2line() {
+ local print_warnings=$4
+
+ local sym_name=${func_addr%+*}
+- local offset=${func_addr#*+}
+- offset=${offset%/*}
++ local func_offset=${func_addr#*+}
++ func_offset=${func_offset%/*}
+ local user_size=
++ local file_type
++ local is_vmlinux=0
+ [[ $func_addr =~ "/" ]] && user_size=${func_addr#*/}
+
+- if [[ -z $sym_name ]] || [[ -z $offset ]] || [[ $sym_name = $func_addr ]]; then
++ if [[ -z $sym_name ]] || [[ -z $func_offset ]] || [[ $sym_name = $func_addr ]]; then
+ warn "bad func+offset $func_addr"
+ DONE=1
+ return
+ fi
+
++ # vmlinux uses absolute addresses in the section table rather than
++ # section offsets.
++ local file_type=$(${READELF} --file-header $objfile |
++ ${AWK} '$1 == "Type:" { print $2; exit }')
++ [[ $file_type = "EXEC" ]] && is_vmlinux=1
++
+ # Go through each of the object's symbols which match the func name.
+ # In rare cases there might be duplicates, in which case we print all
+ # matches.
+@@ -114,9 +122,11 @@ __faddr2line() {
+ local sym_addr=0x${fields[1]}
+ local sym_elf_size=${fields[2]}
+ local sym_sec=${fields[6]}
++ local sec_size
++ local sec_name
+
+ # Get the section size:
+- local sec_size=$(${READELF} --section-headers --wide $objfile |
++ sec_size=$(${READELF} --section-headers --wide $objfile |
+ sed 's/\[ /\[/' |
+ ${AWK} -v sec=$sym_sec '$1 == "[" sec "]" { print "0x" $6; exit }')
+
+@@ -126,6 +136,17 @@ __faddr2line() {
+ return
+ fi
+
++ # Get the section name:
++ sec_name=$(${READELF} --section-headers --wide $objfile |
++ sed 's/\[ /\[/' |
++ ${AWK} -v sec=$sym_sec '$1 == "[" sec "]" { print $2; exit }')
++
++ if [[ -z $sec_name ]]; then
++ warn "bad section name: section: $sym_sec"
++ DONE=1
++ return
++ fi
++
+ # Calculate the symbol size.
+ #
+ # Unfortunately we can't use the ELF size, because kallsyms
+@@ -174,10 +195,10 @@ __faddr2line() {
+
+ sym_size=0x$(printf %x $sym_size)
+
+- # Calculate the section address from user-supplied offset:
+- local addr=$(($sym_addr + $offset))
++ # Calculate the address from user-supplied offset:
++ local addr=$(($sym_addr + $func_offset))
+ if [[ -z $addr ]] || [[ $addr = 0 ]]; then
+- warn "bad address: $sym_addr + $offset"
++ warn "bad address: $sym_addr + $func_offset"
+ DONE=1
+ return
+ fi
+@@ -191,9 +212,9 @@ __faddr2line() {
+ fi
+
+ # Make sure the provided offset is within the symbol's range:
+- if [[ $offset -gt $sym_size ]]; then
++ if [[ $func_offset -gt $sym_size ]]; then
+ [[ $print_warnings = 1 ]] &&
+- echo "skipping $sym_name address at $addr due to size mismatch ($offset > $sym_size)"
++ echo "skipping $sym_name address at $addr due to size mismatch ($func_offset > $sym_size)"
+ continue
+ fi
+
+@@ -202,11 +223,13 @@ __faddr2line() {
+ [[ $FIRST = 0 ]] && echo
+ FIRST=0
+
+- echo "$sym_name+$offset/$sym_size:"
++ echo "$sym_name+$func_offset/$sym_size:"
+
+ # Pass section address to addr2line and strip absolute paths
+ # from the output:
+- local output=$(${ADDR2LINE} -fpie $objfile $addr | sed "s; $dir_prefix\(\./\)*; ;")
++ local args="--functions --pretty-print --inlines --exe=$objfile"
++ [[ $is_vmlinux = 0 ]] && args="$args --section=$sec_name"
++ local output=$(${ADDR2LINE} $args $addr | sed "s; $dir_prefix\(\./\)*; ;")
+ [[ -z $output ]] && continue
+
+ # Default output (non --list):
+diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
+index e9e959343de98..cac9368be4ce5 100644
+--- a/security/selinux/hooks.c
++++ b/security/selinux/hooks.c
+@@ -2600,8 +2600,9 @@ static int selinux_sb_eat_lsm_opts(char *options, void **mnt_opts)
+ }
+ }
+ rc = selinux_add_opt(token, arg, mnt_opts);
++ kfree(arg);
++ arg = NULL;
+ if (unlikely(rc)) {
+- kfree(arg);
+ goto free_opt;
+ }
+ } else {
+@@ -2792,17 +2793,13 @@ static int selinux_fs_context_parse_param(struct fs_context *fc,
+ struct fs_parameter *param)
+ {
+ struct fs_parse_result result;
+- int opt, rc;
++ int opt;
+
+ opt = fs_parse(fc, selinux_fs_parameters, param, &result);
+ if (opt < 0)
+ return opt;
+
+- rc = selinux_add_opt(opt, param->string, &fc->security);
+- if (!rc)
+- param->string = NULL;
+-
+- return rc;
++ return selinux_add_opt(opt, param->string, &fc->security);
+ }
+
+ /* inode security operations */
+diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c
+index 3e9e9ac804f62..b7e5032b61c97 100644
+--- a/sound/hda/hdac_device.c
++++ b/sound/hda/hdac_device.c
+@@ -660,6 +660,7 @@ static const struct hda_vendor_id hda_vendor_ids[] = {
+ { 0x14f1, "Conexant" },
+ { 0x17e8, "Chrontel" },
+ { 0x1854, "LG" },
++ { 0x19e5, "Huawei" },
+ { 0x1aec, "Wolfson Microelectronics" },
+ { 0x1af4, "QEMU" },
+ { 0x434d, "C-Media" },
+diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
+index 0a83eb6b88b1f..a77165bd92a98 100644
+--- a/sound/pci/hda/hda_intel.c
++++ b/sound/pci/hda/hda_intel.c
+@@ -2525,6 +2525,9 @@ static const struct pci_device_id azx_ids[] = {
+ .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE},
+ { PCI_DEVICE(0x8086, 0x51cf),
+ .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE},
++ /* Meteorlake-P */
++ { PCI_DEVICE(0x8086, 0x7e28),
++ .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE},
+ /* Broxton-P(Apollolake) */
+ { PCI_DEVICE(0x8086, 0x5a98),
+ .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_BROXTON },
+diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
+index 31fe417955712..6c209cd26c0ca 100644
+--- a/sound/pci/hda/patch_hdmi.c
++++ b/sound/pci/hda/patch_hdmi.c
+@@ -4554,6 +4554,7 @@ HDA_CODEC_ENTRY(0x8086281a, "Jasperlake HDMI", patch_i915_icl_hdmi),
+ HDA_CODEC_ENTRY(0x8086281b, "Elkhartlake HDMI", patch_i915_icl_hdmi),
+ HDA_CODEC_ENTRY(0x8086281c, "Alderlake-P HDMI", patch_i915_adlp_hdmi),
+ HDA_CODEC_ENTRY(0x8086281f, "Raptorlake-P HDMI", patch_i915_adlp_hdmi),
++HDA_CODEC_ENTRY(0x8086281d, "Meteorlake HDMI", patch_i915_adlp_hdmi),
+ HDA_CODEC_ENTRY(0x80862880, "CedarTrail HDMI", patch_generic_hdmi),
+ HDA_CODEC_ENTRY(0x80862882, "Valleyview2 HDMI", patch_i915_byt_hdmi),
+ HDA_CODEC_ENTRY(0x80862883, "Braswell HDMI", patch_i915_byt_hdmi),
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index 8d2d29880716f..588d4a59c8d92 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -443,6 +443,7 @@ static void alc_fill_eapd_coef(struct hda_codec *codec)
+ case 0x10ec0245:
+ case 0x10ec0255:
+ case 0x10ec0256:
++ case 0x19e58326:
+ case 0x10ec0257:
+ case 0x10ec0282:
+ case 0x10ec0283:
+@@ -580,6 +581,7 @@ static void alc_shutup_pins(struct hda_codec *codec)
+ switch (codec->core.vendor_id) {
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ case 0x10ec0283:
+ case 0x10ec0286:
+ case 0x10ec0288:
+@@ -3247,6 +3249,7 @@ static void alc_disable_headset_jack_key(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_write_coef_idx(codec, 0x48, 0x0);
+ alc_update_coef_idx(codec, 0x49, 0x0045, 0x0);
+ break;
+@@ -3275,6 +3278,7 @@ static void alc_enable_headset_jack_key(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_write_coef_idx(codec, 0x48, 0xd011);
+ alc_update_coef_idx(codec, 0x49, 0x007f, 0x0045);
+ break;
+@@ -4910,6 +4914,7 @@ static void alc_headset_mode_unplugged(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_process_coef_fw(codec, coef0256);
+ break;
+ case 0x10ec0234:
+@@ -5025,6 +5030,7 @@ static void alc_headset_mode_mic_in(struct hda_codec *codec, hda_nid_t hp_pin,
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_write_coef_idx(codec, 0x45, 0xc489);
+ snd_hda_set_pin_ctl_cache(codec, hp_pin, 0);
+ alc_process_coef_fw(codec, coef0256);
+@@ -5175,6 +5181,7 @@ static void alc_headset_mode_default(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_write_coef_idx(codec, 0x1b, 0x0e4b);
+ alc_write_coef_idx(codec, 0x45, 0xc089);
+ msleep(50);
+@@ -5274,6 +5281,7 @@ static void alc_headset_mode_ctia(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_process_coef_fw(codec, coef0256);
+ break;
+ case 0x10ec0234:
+@@ -5388,6 +5396,7 @@ static void alc_headset_mode_omtp(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_process_coef_fw(codec, coef0256);
+ break;
+ case 0x10ec0234:
+@@ -5489,6 +5498,7 @@ static void alc_determine_headset_type(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_write_coef_idx(codec, 0x1b, 0x0e4b);
+ alc_write_coef_idx(codec, 0x06, 0x6104);
+ alc_write_coefex_idx(codec, 0x57, 0x3, 0x09a3);
+@@ -5783,6 +5793,7 @@ static void alc255_set_default_jack_type(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_process_coef_fw(codec, alc256fw);
+ break;
+ }
+@@ -6385,6 +6396,7 @@ static void alc_combo_jack_hp_jd_restart(struct hda_codec *codec)
+ case 0x10ec0236:
+ case 0x10ec0255:
+ case 0x10ec0256:
++ case 0x19e58326:
+ alc_update_coef_idx(codec, 0x1b, 0x8000, 1 << 15); /* Reset HP JD */
+ alc_update_coef_idx(codec, 0x1b, 0x8000, 0 << 15);
+ break;
+@@ -10149,6 +10161,7 @@ static int patch_alc269(struct hda_codec *codec)
+ case 0x10ec0230:
+ case 0x10ec0236:
+ case 0x10ec0256:
++ case 0x19e58326:
+ spec->codec_variant = ALC269_TYPE_ALC256;
+ spec->shutup = alc256_shutup;
+ spec->init_hook = alc256_init;
+@@ -11599,6 +11612,7 @@ static const struct hda_device_id snd_hda_id_realtek[] = {
+ HDA_CODEC_ENTRY(0x10ec0b00, "ALCS1200A", patch_alc882),
+ HDA_CODEC_ENTRY(0x10ec1168, "ALC1220", patch_alc882),
+ HDA_CODEC_ENTRY(0x10ec1220, "ALC1220", patch_alc882),
++ HDA_CODEC_ENTRY(0x19e58326, "HW8326", patch_alc269),
+ {} /* terminator */
+ };
+ MODULE_DEVICE_TABLE(hdaudio, snd_hda_id_realtek);
+diff --git a/sound/soc/codecs/cs35l36.c b/sound/soc/codecs/cs35l36.c
+index d83c1b318c1c4..0accdb45ed727 100644
+--- a/sound/soc/codecs/cs35l36.c
++++ b/sound/soc/codecs/cs35l36.c
+@@ -444,7 +444,8 @@ static bool cs35l36_volatile_reg(struct device *dev, unsigned int reg)
+ }
+ }
+
+-static DECLARE_TLV_DB_SCALE(dig_vol_tlv, -10200, 25, 0);
++static const DECLARE_TLV_DB_RANGE(dig_vol_tlv, 0, 912,
++ TLV_DB_MINMAX_ITEM(-10200, 1200));
+ static DECLARE_TLV_DB_SCALE(amp_gain_tlv, 0, 1, 1);
+
+ static const char * const cs35l36_pcm_sftramp_text[] = {
+diff --git a/sound/soc/codecs/cs42l51.c b/sound/soc/codecs/cs42l51.c
+index e9c3cb4e2bfcb..b9c262a15edf4 100644
+--- a/sound/soc/codecs/cs42l51.c
++++ b/sound/soc/codecs/cs42l51.c
+@@ -146,7 +146,7 @@ static const struct snd_kcontrol_new cs42l51_snd_controls[] = {
+ 0, 0xA0, 96, adc_att_tlv),
+ SOC_DOUBLE_R_SX_TLV("PGA Volume",
+ CS42L51_ALC_PGA_CTL, CS42L51_ALC_PGB_CTL,
+- 0, 0x1A, 30, pga_tlv),
++ 0, 0x19, 30, pga_tlv),
+ SOC_SINGLE("Playback Deemphasis Switch", CS42L51_DAC_CTL, 3, 1, 0),
+ SOC_SINGLE("Auto-Mute Switch", CS42L51_DAC_CTL, 2, 1, 0),
+ SOC_SINGLE("Soft Ramp Switch", CS42L51_DAC_CTL, 1, 1, 0),
+diff --git a/sound/soc/codecs/cs42l52.c b/sound/soc/codecs/cs42l52.c
+index 80161151b3f2c..c19ad3c247026 100644
+--- a/sound/soc/codecs/cs42l52.c
++++ b/sound/soc/codecs/cs42l52.c
+@@ -137,7 +137,9 @@ static DECLARE_TLV_DB_SCALE(mic_tlv, 1600, 100, 0);
+
+ static DECLARE_TLV_DB_SCALE(pga_tlv, -600, 50, 0);
+
+-static DECLARE_TLV_DB_SCALE(mix_tlv, -50, 50, 0);
++static DECLARE_TLV_DB_SCALE(pass_tlv, -6000, 50, 0);
++
++static DECLARE_TLV_DB_SCALE(mix_tlv, -5150, 50, 0);
+
+ static DECLARE_TLV_DB_SCALE(beep_tlv, -56, 200, 0);
+
+@@ -351,7 +353,7 @@ static const struct snd_kcontrol_new cs42l52_snd_controls[] = {
+ CS42L52_SPKB_VOL, 0, 0x40, 0xC0, hl_tlv),
+
+ SOC_DOUBLE_R_SX_TLV("Bypass Volume", CS42L52_PASSTHRUA_VOL,
+- CS42L52_PASSTHRUB_VOL, 0, 0x88, 0x90, pga_tlv),
++ CS42L52_PASSTHRUB_VOL, 0, 0x88, 0x90, pass_tlv),
+
+ SOC_DOUBLE("Bypass Mute", CS42L52_MISC_CTL, 4, 5, 1, 0),
+
+@@ -364,7 +366,7 @@ static const struct snd_kcontrol_new cs42l52_snd_controls[] = {
+ CS42L52_ADCB_VOL, 0, 0xA0, 0x78, ipd_tlv),
+ SOC_DOUBLE_R_SX_TLV("ADC Mixer Volume",
+ CS42L52_ADCA_MIXER_VOL, CS42L52_ADCB_MIXER_VOL,
+- 0, 0x19, 0x7F, ipd_tlv),
++ 0, 0x19, 0x7F, mix_tlv),
+
+ SOC_DOUBLE("ADC Switch", CS42L52_ADC_MISC_CTL, 0, 1, 1, 0),
+
+diff --git a/sound/soc/codecs/cs42l56.c b/sound/soc/codecs/cs42l56.c
+index 3cf8a0b4478cd..b39c25409c239 100644
+--- a/sound/soc/codecs/cs42l56.c
++++ b/sound/soc/codecs/cs42l56.c
+@@ -391,9 +391,9 @@ static const struct snd_kcontrol_new cs42l56_snd_controls[] = {
+ SOC_DOUBLE("ADC Boost Switch", CS42L56_GAIN_BIAS_CTL, 3, 2, 1, 1),
+
+ SOC_DOUBLE_R_SX_TLV("Headphone Volume", CS42L56_HPA_VOLUME,
+- CS42L56_HPB_VOLUME, 0, 0x84, 0x48, hl_tlv),
++ CS42L56_HPB_VOLUME, 0, 0x44, 0x48, hl_tlv),
+ SOC_DOUBLE_R_SX_TLV("LineOut Volume", CS42L56_LOA_VOLUME,
+- CS42L56_LOB_VOLUME, 0, 0x84, 0x48, hl_tlv),
++ CS42L56_LOB_VOLUME, 0, 0x44, 0x48, hl_tlv),
+
+ SOC_SINGLE_TLV("Bass Shelving Volume", CS42L56_TONE_CTL,
+ 0, 0x00, 1, tone_tlv),
+diff --git a/sound/soc/codecs/cs53l30.c b/sound/soc/codecs/cs53l30.c
+index f2087bd38dbc8..c2912ad3851b7 100644
+--- a/sound/soc/codecs/cs53l30.c
++++ b/sound/soc/codecs/cs53l30.c
+@@ -348,22 +348,22 @@ static const struct snd_kcontrol_new cs53l30_snd_controls[] = {
+ SOC_ENUM("ADC2 NG Delay", adc2_ng_delay_enum),
+
+ SOC_SINGLE_SX_TLV("ADC1A PGA Volume",
+- CS53L30_ADC1A_AFE_CTL, 0, 0x34, 0x18, pga_tlv),
++ CS53L30_ADC1A_AFE_CTL, 0, 0x34, 0x24, pga_tlv),
+ SOC_SINGLE_SX_TLV("ADC1B PGA Volume",
+- CS53L30_ADC1B_AFE_CTL, 0, 0x34, 0x18, pga_tlv),
++ CS53L30_ADC1B_AFE_CTL, 0, 0x34, 0x24, pga_tlv),
+ SOC_SINGLE_SX_TLV("ADC2A PGA Volume",
+- CS53L30_ADC2A_AFE_CTL, 0, 0x34, 0x18, pga_tlv),
++ CS53L30_ADC2A_AFE_CTL, 0, 0x34, 0x24, pga_tlv),
+ SOC_SINGLE_SX_TLV("ADC2B PGA Volume",
+- CS53L30_ADC2B_AFE_CTL, 0, 0x34, 0x18, pga_tlv),
++ CS53L30_ADC2B_AFE_CTL, 0, 0x34, 0x24, pga_tlv),
+
+ SOC_SINGLE_SX_TLV("ADC1A Digital Volume",
+- CS53L30_ADC1A_DIG_VOL, 0, 0xA0, 0x0C, dig_tlv),
++ CS53L30_ADC1A_DIG_VOL, 0, 0xA0, 0x6C, dig_tlv),
+ SOC_SINGLE_SX_TLV("ADC1B Digital Volume",
+- CS53L30_ADC1B_DIG_VOL, 0, 0xA0, 0x0C, dig_tlv),
++ CS53L30_ADC1B_DIG_VOL, 0, 0xA0, 0x6C, dig_tlv),
+ SOC_SINGLE_SX_TLV("ADC2A Digital Volume",
+- CS53L30_ADC2A_DIG_VOL, 0, 0xA0, 0x0C, dig_tlv),
++ CS53L30_ADC2A_DIG_VOL, 0, 0xA0, 0x6C, dig_tlv),
+ SOC_SINGLE_SX_TLV("ADC2B Digital Volume",
+- CS53L30_ADC2B_DIG_VOL, 0, 0xA0, 0x0C, dig_tlv),
++ CS53L30_ADC2B_DIG_VOL, 0, 0xA0, 0x6C, dig_tlv),
+ };
+
+ static const struct snd_soc_dapm_widget cs53l30_dapm_widgets[] = {
+diff --git a/sound/soc/codecs/es8328.c b/sound/soc/codecs/es8328.c
+index 3f00ead97006e..dd53dfd87b04e 100644
+--- a/sound/soc/codecs/es8328.c
++++ b/sound/soc/codecs/es8328.c
+@@ -161,13 +161,16 @@ static int es8328_put_deemph(struct snd_kcontrol *kcontrol,
+ if (deemph > 1)
+ return -EINVAL;
+
++ if (es8328->deemph == deemph)
++ return 0;
++
+ ret = es8328_set_deemph(component);
+ if (ret < 0)
+ return ret;
+
+ es8328->deemph = deemph;
+
+- return 0;
++ return 1;
+ }
+
+
+diff --git a/sound/soc/codecs/nau8822.c b/sound/soc/codecs/nau8822.c
+index 58123390c7a31..b436e532993d1 100644
+--- a/sound/soc/codecs/nau8822.c
++++ b/sound/soc/codecs/nau8822.c
+@@ -740,6 +740,8 @@ static int nau8822_set_pll(struct snd_soc_dai *dai, int pll_id, int source,
+ pll_param->pll_int, pll_param->pll_frac,
+ pll_param->mclk_scaler, pll_param->pre_factor);
+
++ snd_soc_component_update_bits(component,
++ NAU8822_REG_POWER_MANAGEMENT_1, NAU8822_PLL_EN_MASK, NAU8822_PLL_OFF);
+ snd_soc_component_update_bits(component,
+ NAU8822_REG_PLL_N, NAU8822_PLLMCLK_DIV2 | NAU8822_PLLN_MASK,
+ (pll_param->pre_factor ? NAU8822_PLLMCLK_DIV2 : 0) |
+@@ -757,6 +759,8 @@ static int nau8822_set_pll(struct snd_soc_dai *dai, int pll_id, int source,
+ pll_param->mclk_scaler << NAU8822_MCLKSEL_SFT);
+ snd_soc_component_update_bits(component,
+ NAU8822_REG_CLOCKING, NAU8822_CLKM_MASK, NAU8822_CLKM_PLL);
++ snd_soc_component_update_bits(component,
++ NAU8822_REG_POWER_MANAGEMENT_1, NAU8822_PLL_EN_MASK, NAU8822_PLL_ON);
+
+ return 0;
+ }
+diff --git a/sound/soc/codecs/nau8822.h b/sound/soc/codecs/nau8822.h
+index 489191ff187ec..b45d42c15de6b 100644
+--- a/sound/soc/codecs/nau8822.h
++++ b/sound/soc/codecs/nau8822.h
+@@ -90,6 +90,9 @@
+ #define NAU8822_REFIMP_3K 0x3
+ #define NAU8822_IOBUF_EN (0x1 << 2)
+ #define NAU8822_ABIAS_EN (0x1 << 3)
++#define NAU8822_PLL_EN_MASK (0x1 << 5)
++#define NAU8822_PLL_ON (0x1 << 5)
++#define NAU8822_PLL_OFF (0x0 << 5)
+
+ /* NAU8822_REG_AUDIO_INTERFACE (0x4) */
+ #define NAU8822_AIFMT_MASK (0x3 << 3)
+diff --git a/sound/soc/codecs/wm8962.c b/sound/soc/codecs/wm8962.c
+index 2c41d31956aa8..f622a6bbd2fb3 100644
+--- a/sound/soc/codecs/wm8962.c
++++ b/sound/soc/codecs/wm8962.c
+@@ -3871,6 +3871,7 @@ static int wm8962_runtime_suspend(struct device *dev)
+ #endif
+
+ static const struct dev_pm_ops wm8962_pm = {
++ SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+ SET_RUNTIME_PM_OPS(wm8962_runtime_suspend, wm8962_runtime_resume, NULL)
+ };
+
+diff --git a/sound/soc/codecs/wm_adsp.c b/sound/soc/codecs/wm_adsp.c
+index e32c8ded181d3..9cfd4f18493fb 100644
+--- a/sound/soc/codecs/wm_adsp.c
++++ b/sound/soc/codecs/wm_adsp.c
+@@ -333,7 +333,7 @@ int wm_adsp_fw_put(struct snd_kcontrol *kcontrol,
+ struct snd_soc_component *component = snd_soc_kcontrol_component(kcontrol);
+ struct soc_enum *e = (struct soc_enum *)kcontrol->private_value;
+ struct wm_adsp *dsp = snd_soc_component_get_drvdata(component);
+- int ret = 0;
++ int ret = 1;
+
+ if (ucontrol->value.enumerated.item[0] == dsp[e->shift_l].fw)
+ return 0;
+diff --git a/sound/soc/intel/boards/sof_cirrus_common.c b/sound/soc/intel/boards/sof_cirrus_common.c
+index e71d74ec1b0b8..f4192df962d60 100644
+--- a/sound/soc/intel/boards/sof_cirrus_common.c
++++ b/sound/soc/intel/boards/sof_cirrus_common.c
+@@ -54,22 +54,29 @@ static struct snd_soc_dai_link_component cs35l41_components[] = {
+ },
+ };
+
++/*
++ * Mapping between ACPI instance id and speaker position.
++ *
++ * Four speakers:
++ * 0: Tweeter left, 1: Woofer left
++ * 2: Tweeter right, 3: Woofer right
++ */
+ static struct snd_soc_codec_conf cs35l41_codec_conf[] = {
+ {
+ .dlc = COMP_CODEC_CONF(CS35L41_DEV0_NAME),
+- .name_prefix = "WL",
++ .name_prefix = "TL",
+ },
+ {
+ .dlc = COMP_CODEC_CONF(CS35L41_DEV1_NAME),
+- .name_prefix = "WR",
++ .name_prefix = "WL",
+ },
+ {
+ .dlc = COMP_CODEC_CONF(CS35L41_DEV2_NAME),
+- .name_prefix = "TL",
++ .name_prefix = "TR",
+ },
+ {
+ .dlc = COMP_CODEC_CONF(CS35L41_DEV3_NAME),
+- .name_prefix = "TR",
++ .name_prefix = "WR",
+ },
+ };
+
+@@ -101,6 +108,21 @@ static int cs35l41_init(struct snd_soc_pcm_runtime *rtd)
+ return ret;
+ }
+
++/*
++ * Channel map:
++ *
++ * TL/WL: ASPRX1 on slot 0, ASPRX2 on slot 1 (default)
++ * TR/WR: ASPRX1 on slot 1, ASPRX2 on slot 0
++ */
++static const struct {
++ unsigned int rx[2];
++} cs35l41_channel_map[] = {
++ {.rx = {0, 1}}, /* TL */
++ {.rx = {0, 1}}, /* WL */
++ {.rx = {1, 0}}, /* TR */
++ {.rx = {1, 0}}, /* WR */
++};
++
+ static int cs35l41_hw_params(struct snd_pcm_substream *substream,
+ struct snd_pcm_hw_params *params)
+ {
+@@ -134,6 +156,16 @@ static int cs35l41_hw_params(struct snd_pcm_substream *substream,
+ ret);
+ return ret;
+ }
++
++ /* setup channel map */
++ ret = snd_soc_dai_set_channel_map(codec_dai, 0, NULL,
++ ARRAY_SIZE(cs35l41_channel_map[i].rx),
++ (unsigned int *)cs35l41_channel_map[i].rx);
++ if (ret < 0) {
++ dev_err(codec_dai->dev, "fail to set channel map, ret %d\n",
++ ret);
++ return ret;
++ }
+ }
+
+ return 0;
+diff --git a/sound/soc/qcom/lpass-platform.c b/sound/soc/qcom/lpass-platform.c
+index 74d62f377dfdd..ae2a7837e5ccf 100644
+--- a/sound/soc/qcom/lpass-platform.c
++++ b/sound/soc/qcom/lpass-platform.c
+@@ -898,7 +898,7 @@ static int lpass_platform_cdc_dma_mmap(struct snd_pcm_substream *substream,
+ struct snd_pcm_runtime *runtime = substream->runtime;
+ unsigned long size, offset;
+
+- vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
++ vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+ size = vma->vm_end - vma->vm_start;
+ offset = vma->vm_pgoff << PAGE_SHIFT;
+ return io_remap_pfn_range(vma, vma->vm_start,
+diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
+index 586d35720f135..c81ea2264ad8a 100644
+--- a/tools/include/linux/objtool.h
++++ b/tools/include/linux/objtool.h
+@@ -141,6 +141,12 @@ struct unwind_hint {
+ .popsection
+ .endm
+
++.macro STACK_FRAME_NON_STANDARD_FP func:req
++#ifdef CONFIG_FRAME_POINTER
++ STACK_FRAME_NON_STANDARD \func
++#endif
++.endm
++
+ .macro ANNOTATE_NOENDBR
+ .Lhere_\@:
+ .pushsection .discard.noendbr
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-25 19:42 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-25 19:42 UTC (permalink / raw
To: gentoo-commits
commit: 94c6ca333405e69c23dd1c0a2ccc86236f50e833
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sat Jun 25 19:42:33 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sat Jun 25 19:42:33 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=94c6ca33
Linux patch 5.18.7
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1006_linux-5.18.7.patch | 779 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 783 insertions(+)
diff --git a/0000_README b/0000_README
index 84b72755..17ef0755 100644
--- a/0000_README
+++ b/0000_README
@@ -67,6 +67,10 @@ Patch: 1005_linux-5.18.6.patch
From: http://www.kernel.org
Desc: Linux 5.18.6
+Patch: 1006_linux-5.18.7.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.7
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1006_linux-5.18.7.patch b/1006_linux-5.18.7.patch
new file mode 100644
index 00000000..ec6938d8
--- /dev/null
+++ b/1006_linux-5.18.7.patch
@@ -0,0 +1,779 @@
+diff --git a/Documentation/devicetree/bindings/nvmem/fsl,layerscape-sfp.yaml b/Documentation/devicetree/bindings/nvmem/fsl,layerscape-sfp.yaml
+index 80914b93638e4..24c5e38584528 100644
+--- a/Documentation/devicetree/bindings/nvmem/fsl,layerscape-sfp.yaml
++++ b/Documentation/devicetree/bindings/nvmem/fsl,layerscape-sfp.yaml
+@@ -24,15 +24,29 @@ properties:
+ reg:
+ maxItems: 1
+
++ clocks:
++ maxItems: 1
++ description:
++ The SFP clock. Typically, this is the platform clock divided by 4.
++
++ clock-names:
++ const: sfp
++
+ required:
+ - compatible
+ - reg
++ - clock-names
++ - clocks
+
+ unevaluatedProperties: false
+
+ examples:
+ - |
++ #include <dt-bindings/clock/fsl,qoriq-clockgen.h>
+ efuse@1e80000 {
+ compatible = "fsl,ls1028a-sfp";
+ reg = <0x1e80000 0x8000>;
++ clocks = <&clockgen QORIQ_CLK_PLATFORM_PLL
++ QORIQ_CLK_PLL_DIV(4)>;
++ clock-names = "sfp";
+ };
+diff --git a/Makefile b/Makefile
+index 27850d452d652..61d63068553c8 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 6
++SUBLEVEL = 7
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
+index 697df02362af1..4909dcd762e8c 100644
+--- a/arch/s390/mm/pgtable.c
++++ b/arch/s390/mm/pgtable.c
+@@ -748,7 +748,7 @@ void ptep_zap_key(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+ pgste_val(pgste) |= PGSTE_GR_BIT | PGSTE_GC_BIT;
+ ptev = pte_val(*ptep);
+ if (!(ptev & _PAGE_INVALID) && (ptev & _PAGE_WRITE))
+- page_set_storage_key(ptev & PAGE_MASK, PAGE_DEFAULT_KEY, 1);
++ page_set_storage_key(ptev & PAGE_MASK, PAGE_DEFAULT_KEY, 0);
+ pgste_set_unlock(ptep, pgste);
+ preempt_enable();
+ }
+diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
+index 34c9dbb6a47d6..686a9d75a0e41 100644
+--- a/arch/x86/boot/boot.h
++++ b/arch/x86/boot/boot.h
+@@ -110,66 +110,78 @@ typedef unsigned int addr_t;
+
+ static inline u8 rdfs8(addr_t addr)
+ {
++ u8 *ptr = (u8 *)absolute_pointer(addr);
+ u8 v;
+- asm volatile("movb %%fs:%1,%0" : "=q" (v) : "m" (*(u8 *)addr));
++ asm volatile("movb %%fs:%1,%0" : "=q" (v) : "m" (*ptr));
+ return v;
+ }
+ static inline u16 rdfs16(addr_t addr)
+ {
++ u16 *ptr = (u16 *)absolute_pointer(addr);
+ u16 v;
+- asm volatile("movw %%fs:%1,%0" : "=r" (v) : "m" (*(u16 *)addr));
++ asm volatile("movw %%fs:%1,%0" : "=r" (v) : "m" (*ptr));
+ return v;
+ }
+ static inline u32 rdfs32(addr_t addr)
+ {
++ u32 *ptr = (u32 *)absolute_pointer(addr);
+ u32 v;
+- asm volatile("movl %%fs:%1,%0" : "=r" (v) : "m" (*(u32 *)addr));
++ asm volatile("movl %%fs:%1,%0" : "=r" (v) : "m" (*ptr));
+ return v;
+ }
+
+ static inline void wrfs8(u8 v, addr_t addr)
+ {
+- asm volatile("movb %1,%%fs:%0" : "+m" (*(u8 *)addr) : "qi" (v));
++ u8 *ptr = (u8 *)absolute_pointer(addr);
++ asm volatile("movb %1,%%fs:%0" : "+m" (*ptr) : "qi" (v));
+ }
+ static inline void wrfs16(u16 v, addr_t addr)
+ {
+- asm volatile("movw %1,%%fs:%0" : "+m" (*(u16 *)addr) : "ri" (v));
++ u16 *ptr = (u16 *)absolute_pointer(addr);
++ asm volatile("movw %1,%%fs:%0" : "+m" (*ptr) : "ri" (v));
+ }
+ static inline void wrfs32(u32 v, addr_t addr)
+ {
+- asm volatile("movl %1,%%fs:%0" : "+m" (*(u32 *)addr) : "ri" (v));
++ u32 *ptr = (u32 *)absolute_pointer(addr);
++ asm volatile("movl %1,%%fs:%0" : "+m" (*ptr) : "ri" (v));
+ }
+
+ static inline u8 rdgs8(addr_t addr)
+ {
++ u8 *ptr = (u8 *)absolute_pointer(addr);
+ u8 v;
+- asm volatile("movb %%gs:%1,%0" : "=q" (v) : "m" (*(u8 *)addr));
++ asm volatile("movb %%gs:%1,%0" : "=q" (v) : "m" (*ptr));
+ return v;
+ }
+ static inline u16 rdgs16(addr_t addr)
+ {
++ u16 *ptr = (u16 *)absolute_pointer(addr);
+ u16 v;
+- asm volatile("movw %%gs:%1,%0" : "=r" (v) : "m" (*(u16 *)addr));
++ asm volatile("movw %%gs:%1,%0" : "=r" (v) : "m" (*ptr));
+ return v;
+ }
+ static inline u32 rdgs32(addr_t addr)
+ {
++ u32 *ptr = (u32 *)absolute_pointer(addr);
+ u32 v;
+- asm volatile("movl %%gs:%1,%0" : "=r" (v) : "m" (*(u32 *)addr));
++ asm volatile("movl %%gs:%1,%0" : "=r" (v) : "m" (*ptr));
+ return v;
+ }
+
+ static inline void wrgs8(u8 v, addr_t addr)
+ {
+- asm volatile("movb %1,%%gs:%0" : "+m" (*(u8 *)addr) : "qi" (v));
++ u8 *ptr = (u8 *)absolute_pointer(addr);
++ asm volatile("movb %1,%%gs:%0" : "+m" (*ptr) : "qi" (v));
+ }
+ static inline void wrgs16(u16 v, addr_t addr)
+ {
+- asm volatile("movw %1,%%gs:%0" : "+m" (*(u16 *)addr) : "ri" (v));
++ u16 *ptr = (u16 *)absolute_pointer(addr);
++ asm volatile("movw %1,%%gs:%0" : "+m" (*ptr) : "ri" (v));
+ }
+ static inline void wrgs32(u32 v, addr_t addr)
+ {
+- asm volatile("movl %1,%%gs:%0" : "+m" (*(u32 *)addr) : "ri" (v));
++ u32 *ptr = (u32 *)absolute_pointer(addr);
++ asm volatile("movl %1,%%gs:%0" : "+m" (*ptr) : "ri" (v));
+ }
+
+ /* Note: these only return true/false, not a signed return value! */
+diff --git a/arch/x86/boot/main.c b/arch/x86/boot/main.c
+index e3add857c2c9d..c421af5a3cdce 100644
+--- a/arch/x86/boot/main.c
++++ b/arch/x86/boot/main.c
+@@ -33,7 +33,7 @@ static void copy_boot_params(void)
+ u16 cl_offset;
+ };
+ const struct old_cmdline * const oldcmd =
+- (const struct old_cmdline *)OLD_CL_ADDRESS;
++ absolute_pointer(OLD_CL_ADDRESS);
+
+ BUILD_BUG_ON(sizeof(boot_params) != 4096);
+ memcpy(&boot_params.hdr, &hdr, sizeof(hdr));
+diff --git a/drivers/net/ethernet/sun/cassini.c b/drivers/net/ethernet/sun/cassini.c
+index b04a6a7bf5669..435dc00d04e5d 100644
+--- a/drivers/net/ethernet/sun/cassini.c
++++ b/drivers/net/ethernet/sun/cassini.c
+@@ -1313,7 +1313,7 @@ static void cas_init_rx_dma(struct cas *cp)
+ writel(val, cp->regs + REG_RX_PAGE_SIZE);
+
+ /* enable the header parser if desired */
+- if (CAS_HP_FIRMWARE == cas_prog_null)
++ if (&CAS_HP_FIRMWARE[0] == &cas_prog_null[0])
+ return;
+
+ val = CAS_BASE(HP_CFG_NUM_CPU, CAS_NCPUS > 63 ? 0 : CAS_NCPUS);
+@@ -3780,7 +3780,7 @@ static void cas_reset(struct cas *cp, int blkflag)
+
+ /* program header parser */
+ if ((cp->cas_flags & CAS_FLAG_TARGET_ABORT) ||
+- (CAS_HP_ALT_FIRMWARE == cas_prog_null)) {
++ (&CAS_HP_ALT_FIRMWARE[0] == &cas_prog_null[0])) {
+ cas_load_firmware(cp, CAS_HP_FIRMWARE);
+ } else {
+ cas_load_firmware(cp, CAS_HP_ALT_FIRMWARE);
+diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
+index 51fe51bb05041..15e6a6aded319 100644
+--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
++++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192de/phy.c
+@@ -2386,10 +2386,7 @@ void rtl92d_phy_reload_iqk_setting(struct ieee80211_hw *hw, u8 channel)
+ rtl_dbg(rtlpriv, COMP_SCAN, DBG_LOUD,
+ "Just Read IQK Matrix reg for channel:%d....\n",
+ channel);
+- if ((rtlphy->iqk_matrix[indexforchannel].
+- value[0] != NULL)
+- /*&&(regea4 != 0) */)
+- _rtl92d_phy_patha_fill_iqk_matrix(hw, true,
++ _rtl92d_phy_patha_fill_iqk_matrix(hw, true,
+ rtlphy->iqk_matrix[
+ indexforchannel].value, 0,
+ (rtlphy->iqk_matrix[
+diff --git a/drivers/net/wwan/iosm/iosm_ipc_protocol_ops.c b/drivers/net/wwan/iosm/iosm_ipc_protocol_ops.c
+index c6b032f95d2e4..4627847c6daab 100644
+--- a/drivers/net/wwan/iosm/iosm_ipc_protocol_ops.c
++++ b/drivers/net/wwan/iosm/iosm_ipc_protocol_ops.c
+@@ -372,8 +372,6 @@ bool ipc_protocol_dl_td_prepare(struct iosm_protocol *ipc_protocol,
+ struct sk_buff *ipc_protocol_dl_td_process(struct iosm_protocol *ipc_protocol,
+ struct ipc_pipe *pipe)
+ {
+- u32 tail =
+- le32_to_cpu(ipc_protocol->p_ap_shm->tail_array[pipe->pipe_nr]);
+ struct ipc_protocol_td *p_td;
+ struct sk_buff *skb;
+
+@@ -403,14 +401,6 @@ struct sk_buff *ipc_protocol_dl_td_process(struct iosm_protocol *ipc_protocol,
+ goto ret;
+ }
+
+- if (!IPC_CB(skb)) {
+- dev_err(ipc_protocol->dev, "pipe# %d, tail: %d skb_cb is NULL",
+- pipe->pipe_nr, tail);
+- ipc_pcie_kfree_skb(ipc_protocol->pcie, skb);
+- skb = NULL;
+- goto ret;
+- }
+-
+ if (p_td->buffer.address != IPC_CB(skb)->mapping) {
+ dev_err(ipc_protocol->dev, "invalid buf=%llx or skb=%p",
+ (unsigned long long)p_td->buffer.address, skb->data);
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+index 3d123ca028c97..68aab48838e41 100644
+--- a/fs/io_uring.c
++++ b/fs/io_uring.c
+@@ -1647,7 +1647,7 @@ static inline void io_req_track_inflight(struct io_kiocb *req)
+ {
+ if (!(req->flags & REQ_F_INFLIGHT)) {
+ req->flags |= REQ_F_INFLIGHT;
+- atomic_inc(¤t->io_uring->inflight_tracked);
++ atomic_inc(&req->task->io_uring->inflight_tracked);
+ }
+ }
+
+diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
+index 985e995d2a398..4f897e1095470 100644
+--- a/fs/notify/fanotify/fanotify.c
++++ b/fs/notify/fanotify/fanotify.c
+@@ -319,12 +319,8 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
+ return 0;
+ }
+
+- fsnotify_foreach_iter_type(type) {
+- if (!fsnotify_iter_should_report_type(iter_info, type))
+- continue;
+- mark = iter_info->marks[type];
+-
+- /* Apply ignore mask regardless of ISDIR and ON_CHILD flags */
++ fsnotify_foreach_iter_mark_type(iter_info, mark, type) {
++ /* Apply ignore mask regardless of mark's ISDIR flag */
+ marks_ignored_mask |= mark->ignored_mask;
+
+ /*
+@@ -334,14 +330,6 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
+ if (event_mask & FS_ISDIR && !(mark->mask & FS_ISDIR))
+ continue;
+
+- /*
+- * If the event is on a child and this mark is on a parent not
+- * watching children, don't send it!
+- */
+- if (type == FSNOTIFY_ITER_TYPE_PARENT &&
+- !(mark->mask & FS_EVENT_ON_CHILD))
+- continue;
+-
+ marks_mask |= mark->mask;
+
+ /* Record the mark types of this group that matched the event */
+@@ -849,16 +837,14 @@ out:
+ */
+ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
+ {
++ struct fsnotify_mark *mark;
+ int type;
+ __kernel_fsid_t fsid = {};
+
+- fsnotify_foreach_iter_type(type) {
++ fsnotify_foreach_iter_mark_type(iter_info, mark, type) {
+ struct fsnotify_mark_connector *conn;
+
+- if (!fsnotify_iter_should_report_type(iter_info, type))
+- continue;
+-
+- conn = READ_ONCE(iter_info->marks[type]->connector);
++ conn = READ_ONCE(mark->connector);
+ /* Mark is just getting destroyed or created? */
+ if (!conn)
+ continue;
+diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
+index 70a8516b78bc5..6d63423cd4928 100644
+--- a/fs/notify/fsnotify.c
++++ b/fs/notify/fsnotify.c
+@@ -290,22 +290,15 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
+ }
+
+ if (parent_mark) {
+- /*
+- * parent_mark indicates that the parent inode is watching
+- * children and interested in this event, which is an event
+- * possible on child. But is *this mark* watching children and
+- * interested in this event?
+- */
+- if (parent_mark->mask & FS_EVENT_ON_CHILD) {
+- ret = fsnotify_handle_inode_event(group, parent_mark, mask,
+- data, data_type, dir, name, 0);
+- if (ret)
+- return ret;
+- }
+- if (!inode_mark)
+- return 0;
++ ret = fsnotify_handle_inode_event(group, parent_mark, mask,
++ data, data_type, dir, name, 0);
++ if (ret)
++ return ret;
+ }
+
++ if (!inode_mark)
++ return 0;
++
+ if (mask & FS_EVENT_ON_CHILD) {
+ /*
+ * Some events can be sent on both parent dir and child marks
+@@ -335,31 +328,23 @@ static int send_to_group(__u32 mask, const void *data, int data_type,
+ struct fsnotify_mark *mark;
+ int type;
+
+- if (WARN_ON(!iter_info->report_mask))
++ if (!iter_info->report_mask)
+ return 0;
+
+ /* clear ignored on inode modification */
+ if (mask & FS_MODIFY) {
+- fsnotify_foreach_iter_type(type) {
+- if (!fsnotify_iter_should_report_type(iter_info, type))
+- continue;
+- mark = iter_info->marks[type];
+- if (mark &&
+- !(mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY))
++ fsnotify_foreach_iter_mark_type(iter_info, mark, type) {
++ if (!(mark->flags &
++ FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY))
+ mark->ignored_mask = 0;
+ }
+ }
+
+- fsnotify_foreach_iter_type(type) {
+- if (!fsnotify_iter_should_report_type(iter_info, type))
+- continue;
+- mark = iter_info->marks[type];
+- /* does the object mark tell us to do something? */
+- if (mark) {
+- group = mark->group;
+- marks_mask |= mark->mask;
+- marks_ignored_mask |= mark->ignored_mask;
+- }
++ /* Are any of the group marks interested in this event? */
++ fsnotify_foreach_iter_mark_type(iter_info, mark, type) {
++ group = mark->group;
++ marks_mask |= mark->mask;
++ marks_ignored_mask |= mark->ignored_mask;
+ }
+
+ pr_debug("%s: group=%p mask=%x marks_mask=%x marks_ignored_mask=%x data=%p data_type=%d dir=%p cookie=%d\n",
+@@ -403,11 +388,11 @@ static struct fsnotify_mark *fsnotify_next_mark(struct fsnotify_mark *mark)
+
+ /*
+ * iter_info is a multi head priority queue of marks.
+- * Pick a subset of marks from queue heads, all with the
+- * same group and set the report_mask for selected subset.
+- * Returns the report_mask of the selected subset.
++ * Pick a subset of marks from queue heads, all with the same group
++ * and set the report_mask to a subset of the selected marks.
++ * Returns false if there are no more groups to iterate.
+ */
+-static unsigned int fsnotify_iter_select_report_types(
++static bool fsnotify_iter_select_report_types(
+ struct fsnotify_iter_info *iter_info)
+ {
+ struct fsnotify_group *max_prio_group = NULL;
+@@ -423,30 +408,48 @@ static unsigned int fsnotify_iter_select_report_types(
+ }
+
+ if (!max_prio_group)
+- return 0;
++ return false;
+
+ /* Set the report mask for marks from same group as max prio group */
++ iter_info->current_group = max_prio_group;
+ iter_info->report_mask = 0;
+ fsnotify_foreach_iter_type(type) {
+ mark = iter_info->marks[type];
+- if (mark &&
+- fsnotify_compare_groups(max_prio_group, mark->group) == 0)
++ if (mark && mark->group == iter_info->current_group) {
++ /*
++ * FSNOTIFY_ITER_TYPE_PARENT indicates that this inode
++ * is watching children and interested in this event,
++ * which is an event possible on child.
++ * But is *this mark* watching children?
++ */
++ if (type == FSNOTIFY_ITER_TYPE_PARENT &&
++ !(mark->mask & FS_EVENT_ON_CHILD))
++ continue;
++
+ fsnotify_iter_set_report_type(iter_info, type);
++ }
+ }
+
+- return iter_info->report_mask;
++ return true;
+ }
+
+ /*
+- * Pop from iter_info multi head queue, the marks that were iterated in the
++ * Pop from iter_info multi head queue, the marks that belong to the group of
+ * current iteration step.
+ */
+ static void fsnotify_iter_next(struct fsnotify_iter_info *iter_info)
+ {
++ struct fsnotify_mark *mark;
+ int type;
+
++ /*
++ * We cannot use fsnotify_foreach_iter_mark_type() here because we
++ * may need to advance a mark of type X that belongs to current_group
++ * but was not selected for reporting.
++ */
+ fsnotify_foreach_iter_type(type) {
+- if (fsnotify_iter_should_report_type(iter_info, type))
++ mark = iter_info->marks[type];
++ if (mark && mark->group == iter_info->current_group)
+ iter_info->marks[type] =
+ fsnotify_next_mark(iter_info->marks[type]);
+ }
+diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
+index 1c2ece9611287..15a4c7c07a3bf 100644
+--- a/fs/zonefs/super.c
++++ b/fs/zonefs/super.c
+@@ -72,15 +72,51 @@ static inline void zonefs_i_size_write(struct inode *inode, loff_t isize)
+ zi->i_flags &= ~ZONEFS_ZONE_OPEN;
+ }
+
+-static int zonefs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
+- unsigned int flags, struct iomap *iomap,
+- struct iomap *srcmap)
++static int zonefs_read_iomap_begin(struct inode *inode, loff_t offset,
++ loff_t length, unsigned int flags,
++ struct iomap *iomap, struct iomap *srcmap)
+ {
+ struct zonefs_inode_info *zi = ZONEFS_I(inode);
+ struct super_block *sb = inode->i_sb;
+ loff_t isize;
+
+- /* All I/Os should always be within the file maximum size */
++ /*
++ * All blocks are always mapped below EOF. If reading past EOF,
++ * act as if there is a hole up to the file maximum size.
++ */
++ mutex_lock(&zi->i_truncate_mutex);
++ iomap->bdev = inode->i_sb->s_bdev;
++ iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize);
++ isize = i_size_read(inode);
++ if (iomap->offset >= isize) {
++ iomap->type = IOMAP_HOLE;
++ iomap->addr = IOMAP_NULL_ADDR;
++ iomap->length = length;
++ } else {
++ iomap->type = IOMAP_MAPPED;
++ iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset;
++ iomap->length = isize - iomap->offset;
++ }
++ mutex_unlock(&zi->i_truncate_mutex);
++
++ trace_zonefs_iomap_begin(inode, iomap);
++
++ return 0;
++}
++
++static const struct iomap_ops zonefs_read_iomap_ops = {
++ .iomap_begin = zonefs_read_iomap_begin,
++};
++
++static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset,
++ loff_t length, unsigned int flags,
++ struct iomap *iomap, struct iomap *srcmap)
++{
++ struct zonefs_inode_info *zi = ZONEFS_I(inode);
++ struct super_block *sb = inode->i_sb;
++ loff_t isize;
++
++ /* All write I/Os should always be within the file maximum size */
+ if (WARN_ON_ONCE(offset + length > zi->i_max_size))
+ return -EIO;
+
+@@ -90,7 +126,7 @@ static int zonefs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
+ * operation.
+ */
+ if (WARN_ON_ONCE(zi->i_ztype == ZONEFS_ZTYPE_SEQ &&
+- (flags & IOMAP_WRITE) && !(flags & IOMAP_DIRECT)))
++ !(flags & IOMAP_DIRECT)))
+ return -EIO;
+
+ /*
+@@ -99,47 +135,44 @@ static int zonefs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
+ * write pointer) and unwriten beyond.
+ */
+ mutex_lock(&zi->i_truncate_mutex);
++ iomap->bdev = inode->i_sb->s_bdev;
++ iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize);
++ iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset;
+ isize = i_size_read(inode);
+- if (offset >= isize)
++ if (iomap->offset >= isize) {
+ iomap->type = IOMAP_UNWRITTEN;
+- else
++ iomap->length = zi->i_max_size - iomap->offset;
++ } else {
+ iomap->type = IOMAP_MAPPED;
+- if (flags & IOMAP_WRITE)
+- length = zi->i_max_size - offset;
+- else
+- length = min(length, isize - offset);
++ iomap->length = isize - iomap->offset;
++ }
+ mutex_unlock(&zi->i_truncate_mutex);
+
+- iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize);
+- iomap->length = ALIGN(offset + length, sb->s_blocksize) - iomap->offset;
+- iomap->bdev = inode->i_sb->s_bdev;
+- iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset;
+-
+ trace_zonefs_iomap_begin(inode, iomap);
+
+ return 0;
+ }
+
+-static const struct iomap_ops zonefs_iomap_ops = {
+- .iomap_begin = zonefs_iomap_begin,
++static const struct iomap_ops zonefs_write_iomap_ops = {
++ .iomap_begin = zonefs_write_iomap_begin,
+ };
+
+ static int zonefs_readpage(struct file *unused, struct page *page)
+ {
+- return iomap_readpage(page, &zonefs_iomap_ops);
++ return iomap_readpage(page, &zonefs_read_iomap_ops);
+ }
+
+ static void zonefs_readahead(struct readahead_control *rac)
+ {
+- iomap_readahead(rac, &zonefs_iomap_ops);
++ iomap_readahead(rac, &zonefs_read_iomap_ops);
+ }
+
+ /*
+ * Map blocks for page writeback. This is used only on conventional zone files,
+ * which implies that the page range can only be within the fixed inode size.
+ */
+-static int zonefs_map_blocks(struct iomap_writepage_ctx *wpc,
+- struct inode *inode, loff_t offset)
++static int zonefs_write_map_blocks(struct iomap_writepage_ctx *wpc,
++ struct inode *inode, loff_t offset)
+ {
+ struct zonefs_inode_info *zi = ZONEFS_I(inode);
+
+@@ -153,12 +186,12 @@ static int zonefs_map_blocks(struct iomap_writepage_ctx *wpc,
+ offset < wpc->iomap.offset + wpc->iomap.length)
+ return 0;
+
+- return zonefs_iomap_begin(inode, offset, zi->i_max_size - offset,
+- IOMAP_WRITE, &wpc->iomap, NULL);
++ return zonefs_write_iomap_begin(inode, offset, zi->i_max_size - offset,
++ IOMAP_WRITE, &wpc->iomap, NULL);
+ }
+
+ static const struct iomap_writeback_ops zonefs_writeback_ops = {
+- .map_blocks = zonefs_map_blocks,
++ .map_blocks = zonefs_write_map_blocks,
+ };
+
+ static int zonefs_writepage(struct page *page, struct writeback_control *wbc)
+@@ -188,7 +221,8 @@ static int zonefs_swap_activate(struct swap_info_struct *sis,
+ return -EINVAL;
+ }
+
+- return iomap_swapfile_activate(sis, swap_file, span, &zonefs_iomap_ops);
++ return iomap_swapfile_activate(sis, swap_file, span,
++ &zonefs_read_iomap_ops);
+ }
+
+ static const struct address_space_operations zonefs_file_aops = {
+@@ -607,7 +641,7 @@ static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf)
+
+ /* Serialize against truncates */
+ filemap_invalidate_lock_shared(inode->i_mapping);
+- ret = iomap_page_mkwrite(vmf, &zonefs_iomap_ops);
++ ret = iomap_page_mkwrite(vmf, &zonefs_write_iomap_ops);
+ filemap_invalidate_unlock_shared(inode->i_mapping);
+
+ sb_end_pagefault(inode->i_sb);
+@@ -860,7 +894,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
+ if (append)
+ ret = zonefs_file_dio_append(iocb, from);
+ else
+- ret = iomap_dio_rw(iocb, from, &zonefs_iomap_ops,
++ ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops,
+ &zonefs_write_dio_ops, 0, 0);
+ if (zi->i_ztype == ZONEFS_ZTYPE_SEQ &&
+ (ret > 0 || ret == -EIOCBQUEUED)) {
+@@ -902,7 +936,7 @@ static ssize_t zonefs_file_buffered_write(struct kiocb *iocb,
+ if (ret <= 0)
+ goto inode_unlock;
+
+- ret = iomap_file_buffered_write(iocb, from, &zonefs_iomap_ops);
++ ret = iomap_file_buffered_write(iocb, from, &zonefs_write_iomap_ops);
+ if (ret > 0)
+ iocb->ki_pos += ret;
+ else if (ret == -EIO)
+@@ -995,7 +1029,7 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
+ goto inode_unlock;
+ }
+ file_accessed(iocb->ki_filp);
+- ret = iomap_dio_rw(iocb, to, &zonefs_iomap_ops,
++ ret = iomap_dio_rw(iocb, to, &zonefs_read_iomap_ops,
+ &zonefs_read_dio_ops, 0, 0);
+ } else {
+ ret = generic_file_read_iter(iocb, to);
+diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
+index 0805b74cae441..beb9c99fea07d 100644
+--- a/include/linux/fsnotify_backend.h
++++ b/include/linux/fsnotify_backend.h
+@@ -370,6 +370,7 @@ static inline bool fsnotify_valid_obj_type(unsigned int obj_type)
+
+ struct fsnotify_iter_info {
+ struct fsnotify_mark *marks[FSNOTIFY_ITER_TYPE_COUNT];
++ struct fsnotify_group *current_group;
+ unsigned int report_mask;
+ int srcu_idx;
+ };
+@@ -386,20 +387,31 @@ static inline void fsnotify_iter_set_report_type(
+ iter_info->report_mask |= (1U << iter_type);
+ }
+
+-static inline void fsnotify_iter_set_report_type_mark(
+- struct fsnotify_iter_info *iter_info, int iter_type,
+- struct fsnotify_mark *mark)
++static inline struct fsnotify_mark *fsnotify_iter_mark(
++ struct fsnotify_iter_info *iter_info, int iter_type)
+ {
+- iter_info->marks[iter_type] = mark;
+- iter_info->report_mask |= (1U << iter_type);
++ if (fsnotify_iter_should_report_type(iter_info, iter_type))
++ return iter_info->marks[iter_type];
++ return NULL;
++}
++
++static inline int fsnotify_iter_step(struct fsnotify_iter_info *iter, int type,
++ struct fsnotify_mark **markp)
++{
++ while (type < FSNOTIFY_ITER_TYPE_COUNT) {
++ *markp = fsnotify_iter_mark(iter, type);
++ if (*markp)
++ break;
++ type++;
++ }
++ return type;
+ }
+
+ #define FSNOTIFY_ITER_FUNCS(name, NAME) \
+ static inline struct fsnotify_mark *fsnotify_iter_##name##_mark( \
+ struct fsnotify_iter_info *iter_info) \
+ { \
+- return (iter_info->report_mask & (1U << FSNOTIFY_ITER_TYPE_##NAME)) ? \
+- iter_info->marks[FSNOTIFY_ITER_TYPE_##NAME] : NULL; \
++ return fsnotify_iter_mark(iter_info, FSNOTIFY_ITER_TYPE_##NAME); \
+ }
+
+ FSNOTIFY_ITER_FUNCS(inode, INODE)
+@@ -409,6 +421,11 @@ FSNOTIFY_ITER_FUNCS(sb, SB)
+
+ #define fsnotify_foreach_iter_type(type) \
+ for (type = 0; type < FSNOTIFY_ITER_TYPE_COUNT; type++)
++#define fsnotify_foreach_iter_mark_type(iter, mark, type) \
++ for (type = 0; \
++ type = fsnotify_iter_step(iter, type, &mark), \
++ type < FSNOTIFY_ITER_TYPE_COUNT; \
++ type++)
+
+ /*
+ * fsnotify_connp_t is what we embed in objects which connector can be attached
+diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
+index 0918a39279f6c..feef799884d1f 100644
+--- a/kernel/bpf/btf.c
++++ b/kernel/bpf/btf.c
+@@ -5769,6 +5769,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
+ struct bpf_reg_state *regs,
+ bool ptr_to_mem_ok)
+ {
++ enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
+ struct bpf_verifier_log *log = &env->log;
+ u32 i, nargs, ref_id, ref_obj_id = 0;
+ bool is_kfunc = btf_is_kernel(btf);
+@@ -5834,8 +5835,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
+ if (ret < 0)
+ return ret;
+
+- if (btf_get_prog_ctx_type(log, btf, t,
+- env->prog->type, i)) {
++ if (btf_get_prog_ctx_type(log, btf, t, prog_type, i)) {
+ /* If function expects ctx type in BTF check that caller
+ * is passing PTR_TO_CTX.
+ */
+diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c
+index d9aad15e0d242..02bb8cbf91949 100644
+--- a/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c
++++ b/tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c
+@@ -395,6 +395,18 @@ static void test_func_map_prog_compatibility(void)
+ "./test_attach_probe.o");
+ }
+
++static void test_func_replace_global_func(void)
++{
++ const char *prog_name[] = {
++ "freplace/test_pkt_access",
++ };
++
++ test_fexit_bpf2bpf_common("./freplace_global_func.o",
++ "./test_pkt_access.o",
++ ARRAY_SIZE(prog_name),
++ prog_name, false, NULL);
++}
++
+ /* NOTE: affect other tests, must run in serial mode */
+ void serial_test_fexit_bpf2bpf(void)
+ {
+@@ -416,4 +428,6 @@ void serial_test_fexit_bpf2bpf(void)
+ test_func_replace_multi();
+ if (test__start_subtest("fmod_ret_freplace"))
+ test_fmod_ret_freplace();
++ if (test__start_subtest("func_replace_global_func"))
++ test_func_replace_global_func();
+ }
+diff --git a/tools/testing/selftests/bpf/progs/freplace_global_func.c b/tools/testing/selftests/bpf/progs/freplace_global_func.c
+new file mode 100644
+index 0000000000000..96cb61a6ce87a
+--- /dev/null
++++ b/tools/testing/selftests/bpf/progs/freplace_global_func.c
+@@ -0,0 +1,18 @@
++// SPDX-License-Identifier: GPL-2.0
++#include <linux/bpf.h>
++#include <bpf/bpf_helpers.h>
++
++__noinline
++int test_ctx_global_func(struct __sk_buff *skb)
++{
++ volatile int retval = 1;
++ return retval;
++}
++
++SEC("freplace/test_pkt_access")
++int new_test_pkt_access(struct __sk_buff *skb)
++{
++ return test_ctx_global_func(skb);
++}
++
++char _license[] SEC("license") = "GPL";
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-26 21:52 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-26 21:52 UTC (permalink / raw
To: gentoo-commits
commit: 9ae3c38079c69dc3335f4e20816987575a5ea5c7
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sun Jun 26 21:51:26 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sun Jun 26 21:51:26 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=9ae3c380
Updated BMQ Schedular patch to r2
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 2 +-
...=> 5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch | 55 +++++++---------------
2 files changed, 19 insertions(+), 38 deletions(-)
diff --git a/0000_README b/0000_README
index 17ef0755..728697d0 100644
--- a/0000_README
+++ b/0000_README
@@ -111,7 +111,7 @@ Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
-Patch: 5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
+Patch: 5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch
From: https://gitlab.com/alfredchen/linux-prjc
Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
diff --git a/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch b/5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch
similarity index 99%
rename from 5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
rename to 5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch
index a130157e..cf13d856 100644
--- a/5020_BMQ-and-PDS-io-scheduler-v5.18-r1.patch
+++ b/5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch
@@ -632,10 +632,10 @@ index 976092b7bd45..31d587c16ec1 100644
obj-y += build_utility.o
diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
new file mode 100644
-index 000000000000..189332cd6f99
+index 000000000000..b8e67d568e17
--- /dev/null
+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,7768 @@
+@@ -0,0 +1,7750 @@
+/*
+ * kernel/sched/alt_core.c
+ *
@@ -705,7 +705,7 @@ index 000000000000..189332cd6f99
+#define sched_feat(x) (0)
+#endif /* CONFIG_SCHED_DEBUG */
+
-+#define ALT_SCHED_VERSION "v5.18-r1"
++#define ALT_SCHED_VERSION "v5.18-r2"
+
+/* rt_prio(prio) defined in include/linux/sched/rt.h */
+#define rt_task(p) rt_prio((p)->prio)
@@ -785,14 +785,14 @@ index 000000000000..189332cd6f99
+#ifdef CONFIG_SCHED_SMT
+static cpumask_t sched_sg_idle_mask ____cacheline_aligned_in_smp;
+#endif
-+static cpumask_t sched_rq_watermark[SCHED_BITS] ____cacheline_aligned_in_smp;
++static cpumask_t sched_rq_watermark[SCHED_QUEUE_BITS] ____cacheline_aligned_in_smp;
+
+/* sched_queue related functions */
+static inline void sched_queue_init(struct sched_queue *q)
+{
+ int i;
+
-+ bitmap_zero(q->bitmap, SCHED_BITS);
++ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
+ for(i = 0; i < SCHED_BITS; i++)
+ INIT_LIST_HEAD(&q->heads[i]);
+}
@@ -824,7 +824,7 @@ index 000000000000..189332cd6f99
+ cpu = cpu_of(rq);
+ if (watermark < last_wm) {
+ for (i = last_wm; i > watermark; i--)
-+ cpumask_clear_cpu(cpu, sched_rq_watermark + SCHED_BITS - 1 - i);
++ cpumask_clear_cpu(cpu, sched_rq_watermark + SCHED_QUEUE_BITS - i);
+#ifdef CONFIG_SCHED_SMT
+ if (static_branch_likely(&sched_smt_present) &&
+ IDLE_TASK_SCHED_PRIO == last_wm)
@@ -835,7 +835,7 @@ index 000000000000..189332cd6f99
+ }
+ /* last_wm < watermark */
+ for (i = watermark; i > last_wm; i--)
-+ cpumask_set_cpu(cpu, sched_rq_watermark + SCHED_BITS - 1 - i);
++ cpumask_set_cpu(cpu, sched_rq_watermark + SCHED_QUEUE_BITS - i);
+#ifdef CONFIG_SCHED_SMT
+ if (static_branch_likely(&sched_smt_present) &&
+ IDLE_TASK_SCHED_PRIO == watermark) {
@@ -2543,7 +2543,7 @@ index 000000000000..189332cd6f99
+#endif
+ cpumask_and(&tmp, &chk_mask, sched_rq_watermark) ||
+ cpumask_and(&tmp, &chk_mask,
-+ sched_rq_watermark + SCHED_BITS - task_sched_prio(p)))
++ sched_rq_watermark + SCHED_QUEUE_BITS - 1 - task_sched_prio(p)))
+ return best_mask_cpu(task_cpu(p), &tmp);
+
+ return best_mask_cpu(task_cpu(p), &chk_mask);
@@ -4334,24 +4334,6 @@ index 000000000000..189332cd6f99
+ */
+void sched_exec(void)
+{
-+ struct task_struct *p = current;
-+ unsigned long flags;
-+ int dest_cpu;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ dest_cpu = cpumask_any(p->cpus_ptr);
-+ if (dest_cpu == smp_processor_id())
-+ goto unlock;
-+
-+ if (likely(cpu_active(dest_cpu))) {
-+ struct migration_arg arg = { p, dest_cpu };
-+
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
-+ return;
-+ }
-+unlock:
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+}
+
+#endif
@@ -4519,7 +4501,7 @@ index 000000000000..189332cd6f99
+}
+
+#ifdef CONFIG_SCHED_SMT
-+static inline int active_load_balance_cpu_stop(void *data)
++static inline int sg_balance_cpu_stop(void *data)
+{
+ struct rq *rq = this_rq();
+ struct task_struct *p = data;
@@ -4570,15 +4552,15 @@ index 000000000000..189332cd6f99
+ raw_spin_unlock_irqrestore(&rq->lock, flags);
+
+ if (res)
-+ stop_one_cpu_nowait(cpu, active_load_balance_cpu_stop,
-+ curr, &rq->active_balance_work);
++ stop_one_cpu_nowait(cpu, sg_balance_cpu_stop, curr,
++ &rq->active_balance_work);
+ return res;
+}
+
+/*
-+ * sg_balance_check - slibing group balance check for run queue @rq
++ * sg_balance - slibing group balance check for run queue @rq
+ */
-+static inline void sg_balance_check(struct rq *rq)
++static inline void sg_balance(struct rq *rq)
+{
+ cpumask_t chk;
+ int cpu = cpu_of(rq);
@@ -5243,7 +5225,7 @@ index 000000000000..189332cd6f99
+ }
+
+#ifdef CONFIG_SCHED_SMT
-+ sg_balance_check(rq);
++ sg_balance(rq);
+#endif
+}
+
@@ -7884,7 +7866,7 @@ index 000000000000..189332cd6f99
+ wait_bit_init();
+
+#ifdef CONFIG_SMP
-+ for (i = 0; i < SCHED_BITS; i++)
++ for (i = 0; i < SCHED_QUEUE_BITS; i++)
+ cpumask_copy(sched_rq_watermark + i, cpu_present_mask);
+#endif
+
@@ -9094,10 +9076,10 @@ index 000000000000..611424bbfa9b
+#endif /* ALT_SCHED_H */
diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
new file mode 100644
-index 000000000000..bf7ac80ec242
+index 000000000000..66b77291b9d0
--- /dev/null
+++ b/kernel/sched/bmq.h
-@@ -0,0 +1,111 @@
+@@ -0,0 +1,110 @@
+#define ALT_SCHED_VERSION_MSG "sched/bmq: BMQ CPU Scheduler "ALT_SCHED_VERSION" by Alfred Chen.\n"
+
+/*
@@ -9185,8 +9167,7 @@ index 000000000000..bf7ac80ec242
+
+static void sched_task_fork(struct task_struct *p, struct rq *rq)
+{
-+ p->boost_prio = (p->boost_prio < 0) ?
-+ p->boost_prio + MAX_PRIORITY_ADJ : MAX_PRIORITY_ADJ;
++ p->boost_prio = MAX_PRIORITY_ADJ;
+}
+
+static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-06-29 11:07 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-06-29 11:07 UTC (permalink / raw
To: gentoo-commits
commit: 7646e418ef54f5f0f9d2dfa172c81d5b50674c07
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Jun 29 11:07:00 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Jun 29 11:07:00 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=7646e418
Linux patch 5.18.8
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1007_linux-5.18.8.patch | 6743 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 6747 insertions(+)
diff --git a/0000_README b/0000_README
index 728697d0..b676cc58 100644
--- a/0000_README
+++ b/0000_README
@@ -71,6 +71,10 @@ Patch: 1006_linux-5.18.7.patch
From: http://www.kernel.org
Desc: Linux 5.18.7
+Patch: 1007_linux-5.18.8.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.8
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1007_linux-5.18.8.patch b/1007_linux-5.18.8.patch
new file mode 100644
index 00000000..5231e488
--- /dev/null
+++ b/1007_linux-5.18.8.patch
@@ -0,0 +1,6743 @@
+diff --git a/Documentation/ABI/testing/sysfs-bus-iio-vf610 b/Documentation/ABI/testing/sysfs-bus-iio-vf610
+index 308a6756d3bf3..491ead8044888 100644
+--- a/Documentation/ABI/testing/sysfs-bus-iio-vf610
++++ b/Documentation/ABI/testing/sysfs-bus-iio-vf610
+@@ -1,4 +1,4 @@
+-What: /sys/bus/iio/devices/iio:deviceX/conversion_mode
++What: /sys/bus/iio/devices/iio:deviceX/in_conversion_mode
+ KernelVersion: 4.2
+ Contact: linux-iio@vger.kernel.org
+ Description:
+diff --git a/Documentation/devicetree/bindings/usb/generic-ehci.yaml b/Documentation/devicetree/bindings/usb/generic-ehci.yaml
+index 8913497624de2..cb5da1df8d405 100644
+--- a/Documentation/devicetree/bindings/usb/generic-ehci.yaml
++++ b/Documentation/devicetree/bindings/usb/generic-ehci.yaml
+@@ -135,7 +135,8 @@ properties:
+ Phandle of a companion.
+
+ phys:
+- maxItems: 1
++ minItems: 1
++ maxItems: 3
+
+ phy-names:
+ const: usb
+diff --git a/Documentation/devicetree/bindings/usb/generic-ohci.yaml b/Documentation/devicetree/bindings/usb/generic-ohci.yaml
+index acbf94fa5f74a..d5fd3aa53ed29 100644
+--- a/Documentation/devicetree/bindings/usb/generic-ohci.yaml
++++ b/Documentation/devicetree/bindings/usb/generic-ohci.yaml
+@@ -102,7 +102,8 @@ properties:
+ Overrides the detected port count
+
+ phys:
+- maxItems: 1
++ minItems: 1
++ maxItems: 3
+
+ phy-names:
+ const: usb
+diff --git a/Documentation/vm/hwpoison.rst b/Documentation/vm/hwpoison.rst
+index c742de1769d18..b9d5253c13057 100644
+--- a/Documentation/vm/hwpoison.rst
++++ b/Documentation/vm/hwpoison.rst
+@@ -120,7 +120,8 @@ Testing
+ unpoison-pfn
+ Software-unpoison page at PFN echoed into this file. This way
+ a page can be reused again. This only works for Linux
+- injected failures, not for real memory failures.
++ injected failures, not for real memory failures. Once any hardware
++ memory failure happens, this feature is disabled.
+
+ Note these injection interfaces are not stable and might change between
+ kernel versions
+diff --git a/MAINTAINERS b/MAINTAINERS
+index f468864fd268c..8e6622ed6de69 100644
+--- a/MAINTAINERS
++++ b/MAINTAINERS
+@@ -427,6 +427,7 @@ ACPI VIOT DRIVER
+ M: Jean-Philippe Brucker <jean-philippe@linaro.org>
+ L: linux-acpi@vger.kernel.org
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Maintained
+ F: drivers/acpi/viot.c
+ F: include/linux/acpi_viot.h
+@@ -960,6 +961,7 @@ AMD IOMMU (AMD-VI)
+ M: Joerg Roedel <joro@8bytes.org>
+ R: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Maintained
+ T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
+ F: drivers/iommu/amd/
+@@ -5898,6 +5900,7 @@ M: Christoph Hellwig <hch@lst.de>
+ M: Marek Szyprowski <m.szyprowski@samsung.com>
+ R: Robin Murphy <robin.murphy@arm.com>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Supported
+ W: http://git.infradead.org/users/hch/dma-mapping.git
+ T: git git://git.infradead.org/users/hch/dma-mapping.git
+@@ -5910,6 +5913,7 @@ F: kernel/dma/
+ DMA MAPPING BENCHMARK
+ M: Xiang Chen <chenxiang66@hisilicon.com>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ F: kernel/dma/map_benchmark.c
+ F: tools/testing/selftests/dma/
+
+@@ -7476,6 +7480,7 @@ F: drivers/gpu/drm/exynos/exynos_dp*
+ EXYNOS SYSMMU (IOMMU) driver
+ M: Marek Szyprowski <m.szyprowski@samsung.com>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Maintained
+ F: drivers/iommu/exynos-iommu.c
+
+@@ -9875,6 +9880,7 @@ INTEL IOMMU (VT-d)
+ M: David Woodhouse <dwmw2@infradead.org>
+ M: Lu Baolu <baolu.lu@linux.intel.com>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Supported
+ T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
+ F: drivers/iommu/intel/
+@@ -10253,6 +10259,7 @@ IOMMU DRIVERS
+ M: Joerg Roedel <joro@8bytes.org>
+ M: Will Deacon <will@kernel.org>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Maintained
+ T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
+ F: Documentation/devicetree/bindings/iommu/
+@@ -12369,6 +12376,7 @@ F: drivers/i2c/busses/i2c-mt65xx.c
+ MEDIATEK IOMMU DRIVER
+ M: Yong Wu <yong.wu@mediatek.com>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
+ S: Supported
+ F: Documentation/devicetree/bindings/iommu/mediatek*
+@@ -16354,6 +16362,7 @@ F: drivers/i2c/busses/i2c-qcom-cci.c
+ QUALCOMM IOMMU
+ M: Rob Clark <robdclark@gmail.com>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ L: linux-arm-msm@vger.kernel.org
+ S: Maintained
+ F: drivers/iommu/arm/arm-smmu/qcom_iommu.c
+@@ -18939,6 +18948,7 @@ F: arch/x86/boot/video*
+ SWIOTLB SUBSYSTEM
+ M: Christoph Hellwig <hch@infradead.org>
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Supported
+ W: http://git.infradead.org/users/hch/dma-mapping.git
+ T: git git://git.infradead.org/users/hch/dma-mapping.git
+@@ -21609,6 +21619,7 @@ M: Juergen Gross <jgross@suse.com>
+ M: Stefano Stabellini <sstabellini@kernel.org>
+ L: xen-devel@lists.xenproject.org (moderated for non-subscribers)
+ L: iommu@lists.linux-foundation.org
++L: iommu@lists.linux.dev
+ S: Supported
+ F: arch/x86/xen/*swiotlb*
+ F: drivers/xen/*swiotlb*
+diff --git a/Makefile b/Makefile
+index 61d63068553c8..6ac3335f65aff 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 7
++SUBLEVEL = 8
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+@@ -1139,7 +1139,7 @@ KBUILD_MODULES := 1
+
+ autoksyms_recursive: descend modules.order
+ $(Q)$(CONFIG_SHELL) $(srctree)/scripts/adjust_autoksyms.sh \
+- "$(MAKE) -f $(srctree)/Makefile vmlinux"
++ "$(MAKE) -f $(srctree)/Makefile autoksyms_recursive"
+ endif
+
+ autoksyms_h := $(if $(CONFIG_TRIM_UNUSED_KSYMS), include/generated/autoksyms.h)
+diff --git a/arch/arm/boot/dts/bcm2711-rpi-400.dts b/arch/arm/boot/dts/bcm2711-rpi-400.dts
+index f4d2fc20397c7..c53d9eb0b8027 100644
+--- a/arch/arm/boot/dts/bcm2711-rpi-400.dts
++++ b/arch/arm/boot/dts/bcm2711-rpi-400.dts
+@@ -28,12 +28,12 @@
+ &expgpio {
+ gpio-line-names = "BT_ON",
+ "WL_ON",
+- "",
++ "PWR_LED_OFF",
+ "GLOBAL_RESET",
+ "VDD_SD_IO_SEL",
+- "CAM_GPIO",
++ "GLOBAL_SHUTDOWN",
+ "SD_PWR_ON",
+- "SD_OC_N";
++ "SHUTDOWN_REQUEST";
+ };
+
+ &genet_mdio {
+diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
+index d27beb47f9a3b..652feff334966 100644
+--- a/arch/arm/boot/dts/imx6qdl.dtsi
++++ b/arch/arm/boot/dts/imx6qdl.dtsi
+@@ -762,7 +762,7 @@
+ regulator-name = "vddpu";
+ regulator-min-microvolt = <725000>;
+ regulator-max-microvolt = <1450000>;
+- regulator-enable-ramp-delay = <150>;
++ regulator-enable-ramp-delay = <380>;
+ anatop-reg-offset = <0x140>;
+ anatop-vol-bit-shift = <9>;
+ anatop-vol-bit-width = <5>;
+diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi
+index 5af6d58666f42..9dd525871adf4 100644
+--- a/arch/arm/boot/dts/imx7s.dtsi
++++ b/arch/arm/boot/dts/imx7s.dtsi
+@@ -120,6 +120,7 @@
+ compatible = "usb-nop-xceiv";
+ clocks = <&clks IMX7D_USB_HSIC_ROOT_CLK>;
+ clock-names = "main_clk";
++ power-domains = <&pgc_hsic_phy>;
+ #phy-cells = <0>;
+ };
+
+@@ -1153,7 +1154,6 @@
+ compatible = "fsl,imx7d-usb", "fsl,imx27-usb";
+ reg = <0x30b30000 0x200>;
+ interrupts = <GIC_SPI 40 IRQ_TYPE_LEVEL_HIGH>;
+- power-domains = <&pgc_hsic_phy>;
+ clocks = <&clks IMX7D_USB_CTRL_CLK>;
+ fsl,usbphy = <&usbphynop3>;
+ fsl,usbmisc = <&usbmisc3 0>;
+diff --git a/arch/arm/kernel/crash_dump.c b/arch/arm/kernel/crash_dump.c
+index 53cb924353920..938bd932df9a0 100644
+--- a/arch/arm/kernel/crash_dump.c
++++ b/arch/arm/kernel/crash_dump.c
+@@ -14,22 +14,10 @@
+ #include <linux/crash_dump.h>
+ #include <linux/uaccess.h>
+ #include <linux/io.h>
++#include <linux/uio.h>
+
+-/**
+- * copy_oldmem_page() - copy one page from old kernel memory
+- * @pfn: page frame number to be copied
+- * @buf: buffer where the copied page is placed
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page
+- * @userbuf: if set, @buf is int he user address space
+- *
+- * This function copies one page from old kernel memory into buffer pointed by
+- * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+- * copied or negative error in case of failure.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset,
+- int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+ void *vaddr;
+
+@@ -40,14 +28,7 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+ if (!vaddr)
+ return -ENOMEM;
+
+- if (userbuf) {
+- if (copy_to_user(buf, vaddr + offset, csize)) {
+- iounmap(vaddr);
+- return -EFAULT;
+- }
+- } else {
+- memcpy(buf, vaddr + offset, csize);
+- }
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+
+ iounmap(vaddr);
+ return csize;
+diff --git a/arch/arm/mach-axxia/platsmp.c b/arch/arm/mach-axxia/platsmp.c
+index 512943eae30a5..2e203626eda52 100644
+--- a/arch/arm/mach-axxia/platsmp.c
++++ b/arch/arm/mach-axxia/platsmp.c
+@@ -39,6 +39,7 @@ static int axxia_boot_secondary(unsigned int cpu, struct task_struct *idle)
+ return -ENOENT;
+
+ syscon = of_iomap(syscon_np, 0);
++ of_node_put(syscon_np);
+ if (!syscon)
+ return -ENOMEM;
+
+diff --git a/arch/arm/mach-cns3xxx/core.c b/arch/arm/mach-cns3xxx/core.c
+index e4f4b20b83a2d..3fc4ec830e3a3 100644
+--- a/arch/arm/mach-cns3xxx/core.c
++++ b/arch/arm/mach-cns3xxx/core.c
+@@ -372,6 +372,7 @@ static void __init cns3xxx_init(void)
+ /* De-Asscer SATA Reset */
+ cns3xxx_pwr_soft_rst(CNS3XXX_PWR_SOFTWARE_RST(SATA));
+ }
++ of_node_put(dn);
+
+ dn = of_find_compatible_node(NULL, NULL, "cavium,cns3420-sdhci");
+ if (of_device_is_available(dn)) {
+@@ -385,6 +386,7 @@ static void __init cns3xxx_init(void)
+ cns3xxx_pwr_clk_en(CNS3XXX_PWR_CLK_EN(SDIO));
+ cns3xxx_pwr_soft_rst(CNS3XXX_PWR_SOFTWARE_RST(SDIO));
+ }
++ of_node_put(dn);
+
+ pm_power_off = cns3xxx_power_off;
+
+diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c
+index 8b48326be9fd5..51a247ca4da8c 100644
+--- a/arch/arm/mach-exynos/exynos.c
++++ b/arch/arm/mach-exynos/exynos.c
+@@ -149,6 +149,7 @@ static void exynos_map_pmu(void)
+ np = of_find_matching_node(NULL, exynos_dt_pmu_match);
+ if (np)
+ pmu_base_addr = of_iomap(np, 0);
++ of_node_put(np);
+ }
+
+ static void __init exynos_init_irq(void)
+diff --git a/arch/arm64/boot/dts/exynos/exynos7885.dtsi b/arch/arm64/boot/dts/exynos/exynos7885.dtsi
+index 3170661f5b672..9c233c56558ce 100644
+--- a/arch/arm64/boot/dts/exynos/exynos7885.dtsi
++++ b/arch/arm64/boot/dts/exynos/exynos7885.dtsi
+@@ -280,8 +280,8 @@
+ interrupts = <GIC_SPI 246 IRQ_TYPE_LEVEL_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart0_bus>;
+- clocks = <&cmu_peri CLK_GOUT_UART0_EXT_UCLK>,
+- <&cmu_peri CLK_GOUT_UART0_PCLK>;
++ clocks = <&cmu_peri CLK_GOUT_UART0_PCLK>,
++ <&cmu_peri CLK_GOUT_UART0_EXT_UCLK>;
+ clock-names = "uart", "clk_uart_baud0";
+ samsung,uart-fifosize = <64>;
+ status = "disabled";
+@@ -293,8 +293,8 @@
+ interrupts = <GIC_SPI 247 IRQ_TYPE_LEVEL_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart1_bus>;
+- clocks = <&cmu_peri CLK_GOUT_UART1_EXT_UCLK>,
+- <&cmu_peri CLK_GOUT_UART1_PCLK>;
++ clocks = <&cmu_peri CLK_GOUT_UART1_PCLK>,
++ <&cmu_peri CLK_GOUT_UART1_EXT_UCLK>;
+ clock-names = "uart", "clk_uart_baud0";
+ samsung,uart-fifosize = <256>;
+ status = "disabled";
+@@ -306,8 +306,8 @@
+ interrupts = <GIC_SPI 279 IRQ_TYPE_LEVEL_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart2_bus>;
+- clocks = <&cmu_peri CLK_GOUT_UART2_EXT_UCLK>,
+- <&cmu_peri CLK_GOUT_UART2_PCLK>;
++ clocks = <&cmu_peri CLK_GOUT_UART2_PCLK>,
++ <&cmu_peri CLK_GOUT_UART2_EXT_UCLK>;
+ clock-names = "uart", "clk_uart_baud0";
+ samsung,uart-fifosize = <256>;
+ status = "disabled";
+diff --git a/arch/arm64/boot/dts/ti/k3-am64-main.dtsi b/arch/arm64/boot/dts/ti/k3-am64-main.dtsi
+index f64b368c6c371..cdb530597c5eb 100644
+--- a/arch/arm64/boot/dts/ti/k3-am64-main.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am64-main.dtsi
+@@ -456,13 +456,11 @@
+ clock-names = "clk_ahb", "clk_xin";
+ mmc-ddr-1_8v;
+ mmc-hs200-1_8v;
+- mmc-hs400-1_8v;
+ ti,trm-icp = <0x2>;
+ ti,otap-del-sel-legacy = <0x0>;
+ ti,otap-del-sel-mmc-hs = <0x0>;
+ ti,otap-del-sel-ddr52 = <0x6>;
+ ti,otap-del-sel-hs200 = <0x7>;
+- ti,otap-del-sel-hs400 = <0x4>;
+ };
+
+ sdhci1: mmc@fa00000 {
+diff --git a/arch/arm64/boot/dts/ti/k3-j721s2-main.dtsi b/arch/arm64/boot/dts/ti/k3-j721s2-main.dtsi
+index be7f39299894e..19966f72c5b38 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721s2-main.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j721s2-main.dtsi
+@@ -33,7 +33,7 @@
+ ranges;
+ #interrupt-cells = <3>;
+ interrupt-controller;
+- reg = <0x00 0x01800000 0x00 0x200000>, /* GICD */
++ reg = <0x00 0x01800000 0x00 0x100000>, /* GICD */
+ <0x00 0x01900000 0x00 0x100000>, /* GICR */
+ <0x00 0x6f000000 0x00 0x2000>, /* GICC */
+ <0x00 0x6f010000 0x00 0x1000>, /* GICH */
+diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
+index 58303a9ec32c4..670e4ce818223 100644
+--- a/arch/arm64/kernel/crash_dump.c
++++ b/arch/arm64/kernel/crash_dump.c
+@@ -9,25 +9,11 @@
+ #include <linux/crash_dump.h>
+ #include <linux/errno.h>
+ #include <linux/io.h>
+-#include <linux/memblock.h>
+-#include <linux/uaccess.h>
++#include <linux/uio.h>
+ #include <asm/memory.h>
+
+-/**
+- * copy_oldmem_page() - copy one page from old kernel memory
+- * @pfn: page frame number to be copied
+- * @buf: buffer where the copied page is placed
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page
+- * @userbuf: if set, @buf is in a user address space
+- *
+- * This function copies one page from old kernel memory into buffer pointed by
+- * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+- * copied or negative error in case of failure.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset,
+- int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+ void *vaddr;
+
+@@ -38,14 +24,7 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+ if (!vaddr)
+ return -ENOMEM;
+
+- if (userbuf) {
+- if (copy_to_user((char __user *)buf, vaddr + offset, csize)) {
+- memunmap(vaddr);
+- return -EFAULT;
+- }
+- } else {
+- memcpy(buf, vaddr + offset, csize);
+- }
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+
+ memunmap(vaddr);
+
+diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
+index a66d83540c15a..f88919a793adf 100644
+--- a/arch/arm64/kvm/arm.c
++++ b/arch/arm64/kvm/arm.c
+@@ -2011,11 +2011,11 @@ static int finalize_hyp_mode(void)
+ return 0;
+
+ /*
+- * Exclude HYP BSS from kmemleak so that it doesn't get peeked
+- * at, which would end badly once the section is inaccessible.
+- * None of other sections should ever be introspected.
++ * Exclude HYP sections from kmemleak so that they don't get peeked
++ * at, which would end badly once inaccessible.
+ */
+ kmemleak_free_part(__hyp_bss_start, __hyp_bss_end - __hyp_bss_start);
++ kmemleak_free_part(__va(hyp_mem_base), hyp_mem_size);
+ return pkvm_drop_host_privileges();
+ }
+
+diff --git a/arch/ia64/kernel/crash_dump.c b/arch/ia64/kernel/crash_dump.c
+index 0ed3c3dee4cde..4ef68e2aa7571 100644
+--- a/arch/ia64/kernel/crash_dump.c
++++ b/arch/ia64/kernel/crash_dump.c
+@@ -10,42 +10,18 @@
+ #include <linux/errno.h>
+ #include <linux/types.h>
+ #include <linux/crash_dump.h>
+-
++#include <linux/uio.h>
+ #include <asm/page.h>
+-#include <linux/uaccess.h>
+
+-/**
+- * copy_oldmem_page - copy one page from "oldmem"
+- * @pfn: page frame number to be copied
+- * @buf: target memory address for the copy; this can be in kernel address
+- * space or user address space (see @userbuf)
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page (based on pfn) to begin the copy
+- * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+- * otherwise @buf is in kernel address space, use memcpy().
+- *
+- * Copy a page from "oldmem". For this page, there is no pte mapped
+- * in the current kernel. We stitch up a pte, similar to kmap_atomic.
+- *
+- * Calling copy_to_user() in atomic context is not desirable. Hence first
+- * copying the data to a pre-allocated kernel page and then copying to user
+- * space in non-atomic context.
+- */
+-ssize_t
+-copy_oldmem_page(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+ void *vaddr;
+
+ if (!csize)
+ return 0;
+ vaddr = __va(pfn<<PAGE_SHIFT);
+- if (userbuf) {
+- if (copy_to_user(buf, (vaddr + offset), csize)) {
+- return -EFAULT;
+- }
+- } else
+- memcpy(buf, (vaddr + offset), csize);
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+ return csize;
+ }
+
+diff --git a/arch/mips/kernel/crash_dump.c b/arch/mips/kernel/crash_dump.c
+index 2e50f55185a65..6e50f49024094 100644
+--- a/arch/mips/kernel/crash_dump.c
++++ b/arch/mips/kernel/crash_dump.c
+@@ -1,22 +1,10 @@
+ // SPDX-License-Identifier: GPL-2.0
+ #include <linux/highmem.h>
+ #include <linux/crash_dump.h>
++#include <linux/uio.h>
+
+-/**
+- * copy_oldmem_page - copy one page from "oldmem"
+- * @pfn: page frame number to be copied
+- * @buf: target memory address for the copy; this can be in kernel address
+- * space or user address space (see @userbuf)
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page (based on pfn) to begin the copy
+- * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+- * otherwise @buf is in kernel address space, use memcpy().
+- *
+- * Copy a page from "oldmem". For this page, there is no pte mapped
+- * in the current kernel.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+ void *vaddr;
+
+@@ -24,14 +12,7 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+ return 0;
+
+ vaddr = kmap_local_pfn(pfn);
+-
+- if (!userbuf) {
+- memcpy(buf, vaddr + offset, csize);
+- } else {
+- if (copy_to_user(buf, vaddr + offset, csize))
+- csize = -EFAULT;
+- }
+-
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+ kunmap_local(vaddr);
+
+ return csize;
+diff --git a/arch/mips/vr41xx/common/icu.c b/arch/mips/vr41xx/common/icu.c
+index 7b7f25b4b057e..9240bcdbe74e4 100644
+--- a/arch/mips/vr41xx/common/icu.c
++++ b/arch/mips/vr41xx/common/icu.c
+@@ -640,8 +640,6 @@ static int icu_get_irq(unsigned int irq)
+
+ printk(KERN_ERR "spurious ICU interrupt: %04x,%04x\n", pend1, pend2);
+
+- atomic_inc(&irq_err_count);
+-
+ return -1;
+ }
+
+diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
+index bd22578859d00..f3a2044ee4020 100644
+--- a/arch/parisc/Kconfig
++++ b/arch/parisc/Kconfig
+@@ -10,6 +10,7 @@ config PARISC
+ select ARCH_WANT_FRAME_POINTERS
+ select ARCH_HAS_ELF_RANDOMIZE
+ select ARCH_HAS_STRICT_KERNEL_RWX
++ select ARCH_HAS_STRICT_MODULE_RWX
+ select ARCH_HAS_UBSAN_SANITIZE_ALL
+ select ARCH_HAS_PTE_SPECIAL
+ select ARCH_NO_SG_CHAIN
+diff --git a/arch/parisc/include/asm/fb.h b/arch/parisc/include/asm/fb.h
+index d63a2acb91f2b..55d29c4f716e6 100644
+--- a/arch/parisc/include/asm/fb.h
++++ b/arch/parisc/include/asm/fb.h
+@@ -12,7 +12,7 @@ static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
+ pgprot_val(vma->vm_page_prot) |= _PAGE_NO_CACHE;
+ }
+
+-#if defined(CONFIG_STI_CONSOLE) || defined(CONFIG_FB_STI)
++#if defined(CONFIG_FB_STI)
+ int fb_is_primary_device(struct fb_info *info);
+ #else
+ static inline int fb_is_primary_device(struct fb_info *info)
+diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
+index 0fd04073d4b68..a20c1c47b7808 100644
+--- a/arch/parisc/kernel/cache.c
++++ b/arch/parisc/kernel/cache.c
+@@ -722,7 +722,10 @@ void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned lon
+ return;
+
+ if (parisc_requires_coherency()) {
+- flush_user_cache_page(vma, vmaddr);
++ if (vma->vm_flags & VM_SHARED)
++ flush_data_cache();
++ else
++ flush_user_cache_page(vma, vmaddr);
+ return;
+ }
+
+diff --git a/arch/powerpc/kernel/crash_dump.c b/arch/powerpc/kernel/crash_dump.c
+index 5693e1c67c2b4..32b4a97f1b79b 100644
+--- a/arch/powerpc/kernel/crash_dump.c
++++ b/arch/powerpc/kernel/crash_dump.c
+@@ -16,7 +16,7 @@
+ #include <asm/kdump.h>
+ #include <asm/prom.h>
+ #include <asm/firmware.h>
+-#include <linux/uaccess.h>
++#include <linux/uio.h>
+ #include <asm/rtas.h>
+ #include <asm/inst.h>
+
+@@ -68,33 +68,8 @@ void __init setup_kdump_trampoline(void)
+ }
+ #endif /* CONFIG_NONSTATIC_KERNEL */
+
+-static size_t copy_oldmem_vaddr(void *vaddr, char *buf, size_t csize,
+- unsigned long offset, int userbuf)
+-{
+- if (userbuf) {
+- if (copy_to_user((char __user *)buf, (vaddr + offset), csize))
+- return -EFAULT;
+- } else
+- memcpy(buf, (vaddr + offset), csize);
+-
+- return csize;
+-}
+-
+-/**
+- * copy_oldmem_page - copy one page from "oldmem"
+- * @pfn: page frame number to be copied
+- * @buf: target memory address for the copy; this can be in kernel address
+- * space or user address space (see @userbuf)
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page (based on pfn) to begin the copy
+- * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+- * otherwise @buf is in kernel address space, use memcpy().
+- *
+- * Copy a page from "oldmem". For this page, there is no pte mapped
+- * in the current kernel. We stitch up a pte, similar to kmap_atomic.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+ void *vaddr;
+ phys_addr_t paddr;
+@@ -107,10 +82,10 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+
+ if (memblock_is_region_memory(paddr, csize)) {
+ vaddr = __va(paddr);
+- csize = copy_oldmem_vaddr(vaddr, buf, csize, offset, userbuf);
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+ } else {
+ vaddr = ioremap_cache(paddr, PAGE_SIZE);
+- csize = copy_oldmem_vaddr(vaddr, buf, csize, offset, userbuf);
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+ iounmap(vaddr);
+ }
+
+diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
+index a75d20f23dac8..9be279469a851 100644
+--- a/arch/powerpc/kernel/process.c
++++ b/arch/powerpc/kernel/process.c
+@@ -1857,7 +1857,7 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
+ tm_reclaim_current(0);
+ #endif
+
+- memset(regs->gpr, 0, sizeof(regs->gpr));
++ memset(®s->gpr[1], 0, sizeof(regs->gpr) - sizeof(regs->gpr[0]));
+ regs->ctr = 0;
+ regs->link = 0;
+ regs->xer = 0;
+diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
+index 6bc89d9ccf635..276b4eb1435b5 100644
+--- a/arch/powerpc/kernel/rtas.c
++++ b/arch/powerpc/kernel/rtas.c
+@@ -1061,7 +1061,7 @@ static struct rtas_filter rtas_filters[] __ro_after_init = {
+ { "get-time-of-day", -1, -1, -1, -1, -1 },
+ { "ibm,get-vpd", -1, 0, -1, 1, 2 },
+ { "ibm,lpar-perftools", -1, 2, 3, -1, -1 },
+- { "ibm,platform-dump", -1, 4, 5, -1, -1 },
++ { "ibm,platform-dump", -1, 4, 5, -1, -1 }, /* Special cased */
+ { "ibm,read-slot-reset-state", -1, -1, -1, -1, -1 },
+ { "ibm,scan-log-dump", -1, 0, 1, -1, -1 },
+ { "ibm,set-dynamic-indicator", -1, 2, -1, -1, -1 },
+@@ -1110,6 +1110,15 @@ static bool block_rtas_call(int token, int nargs,
+ size = 1;
+
+ end = base + size - 1;
++
++ /*
++ * Special case for ibm,platform-dump - NULL buffer
++ * address is used to indicate end of dump processing
++ */
++ if (!strcmp(f->name, "ibm,platform-dump") &&
++ base == 0)
++ return false;
++
+ if (!in_rmo_buf(base, end))
+ goto err;
+ }
+diff --git a/arch/powerpc/platforms/microwatt/microwatt.h b/arch/powerpc/platforms/microwatt/microwatt.h
+new file mode 100644
+index 0000000000000..335417e95e66f
+--- /dev/null
++++ b/arch/powerpc/platforms/microwatt/microwatt.h
+@@ -0,0 +1,7 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef _MICROWATT_H
++#define _MICROWATT_H
++
++void microwatt_rng_init(void);
++
++#endif /* _MICROWATT_H */
+diff --git a/arch/powerpc/platforms/microwatt/rng.c b/arch/powerpc/platforms/microwatt/rng.c
+index 7bc4d1cbfaf04..8ece87d005c86 100644
+--- a/arch/powerpc/platforms/microwatt/rng.c
++++ b/arch/powerpc/platforms/microwatt/rng.c
+@@ -11,6 +11,7 @@
+ #include <asm/archrandom.h>
+ #include <asm/cputable.h>
+ #include <asm/machdep.h>
++#include "microwatt.h"
+
+ #define DARN_ERR 0xFFFFFFFFFFFFFFFFul
+
+@@ -29,7 +30,7 @@ static int microwatt_get_random_darn(unsigned long *v)
+ return 1;
+ }
+
+-static __init int rng_init(void)
++void __init microwatt_rng_init(void)
+ {
+ unsigned long val;
+ int i;
+@@ -37,12 +38,7 @@ static __init int rng_init(void)
+ for (i = 0; i < 10; i++) {
+ if (microwatt_get_random_darn(&val)) {
+ ppc_md.get_random_seed = microwatt_get_random_darn;
+- return 0;
++ return;
+ }
+ }
+-
+- pr_warn("Unable to use DARN for get_random_seed()\n");
+-
+- return -EIO;
+ }
+-machine_subsys_initcall(, rng_init);
+diff --git a/arch/powerpc/platforms/microwatt/setup.c b/arch/powerpc/platforms/microwatt/setup.c
+index 0b02603bdb747..6b32539395a48 100644
+--- a/arch/powerpc/platforms/microwatt/setup.c
++++ b/arch/powerpc/platforms/microwatt/setup.c
+@@ -16,6 +16,8 @@
+ #include <asm/xics.h>
+ #include <asm/udbg.h>
+
++#include "microwatt.h"
++
+ static void __init microwatt_init_IRQ(void)
+ {
+ xics_init();
+@@ -32,10 +34,16 @@ static int __init microwatt_populate(void)
+ }
+ machine_arch_initcall(microwatt, microwatt_populate);
+
++static void __init microwatt_setup_arch(void)
++{
++ microwatt_rng_init();
++}
++
+ define_machine(microwatt) {
+ .name = "microwatt",
+ .probe = microwatt_probe,
+ .init_IRQ = microwatt_init_IRQ,
++ .setup_arch = microwatt_setup_arch,
+ .progress = udbg_progress,
+ .calibrate_decr = generic_calibrate_decr,
+ };
+diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
+index e297bf4abfcb8..866efdc103fdd 100644
+--- a/arch/powerpc/platforms/powernv/powernv.h
++++ b/arch/powerpc/platforms/powernv/powernv.h
+@@ -42,4 +42,6 @@ ssize_t memcons_copy(struct memcons *mc, char *to, loff_t pos, size_t count);
+ u32 __init memcons_get_size(struct memcons *mc);
+ struct memcons *__init memcons_init(struct device_node *node, const char *mc_prop_name);
+
++void pnv_rng_init(void);
++
+ #endif /* _POWERNV_H */
+diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
+index e3d44b36ae98f..463c78c52cc5d 100644
+--- a/arch/powerpc/platforms/powernv/rng.c
++++ b/arch/powerpc/platforms/powernv/rng.c
+@@ -17,6 +17,7 @@
+ #include <asm/prom.h>
+ #include <asm/machdep.h>
+ #include <asm/smp.h>
++#include "powernv.h"
+
+ #define DARN_ERR 0xFFFFFFFFFFFFFFFFul
+
+@@ -28,7 +29,6 @@ struct powernv_rng {
+
+ static DEFINE_PER_CPU(struct powernv_rng *, powernv_rng);
+
+-
+ int powernv_hwrng_present(void)
+ {
+ struct powernv_rng *rng;
+@@ -98,9 +98,6 @@ static int __init initialise_darn(void)
+ return 0;
+ }
+ }
+-
+- pr_warn("Unable to use DARN for get_random_seed()\n");
+-
+ return -EIO;
+ }
+
+@@ -163,32 +160,55 @@ static __init int rng_create(struct device_node *dn)
+
+ rng_init_per_cpu(rng, dn);
+
+- pr_info_once("Registering arch random hook.\n");
+-
+ ppc_md.get_random_seed = powernv_get_random_long;
+
+ return 0;
+ }
+
+-static __init int rng_init(void)
++static int __init pnv_get_random_long_early(unsigned long *v)
+ {
+ struct device_node *dn;
+- int rc;
++
++ if (!slab_is_available())
++ return 0;
++
++ if (cmpxchg(&ppc_md.get_random_seed, pnv_get_random_long_early,
++ NULL) != pnv_get_random_long_early)
++ return 0;
+
+ for_each_compatible_node(dn, NULL, "ibm,power-rng") {
+- rc = rng_create(dn);
+- if (rc) {
+- pr_err("Failed creating rng for %pOF (%d).\n",
+- dn, rc);
++ if (rng_create(dn))
+ continue;
+- }
+-
+ /* Create devices for hwrng driver */
+ of_platform_device_create(dn, NULL, NULL);
+ }
+
+- initialise_darn();
++ if (!ppc_md.get_random_seed)
++ return 0;
++ return ppc_md.get_random_seed(v);
++}
++
++void __init pnv_rng_init(void)
++{
++ struct device_node *dn;
+
++ /* Prefer darn over the rest. */
++ if (!initialise_darn())
++ return;
++
++ dn = of_find_compatible_node(NULL, NULL, "ibm,power-rng");
++ if (dn)
++ ppc_md.get_random_seed = pnv_get_random_long_early;
++
++ of_node_put(dn);
++}
++
++static int __init pnv_rng_late_init(void)
++{
++ unsigned long v;
++ /* In case it wasn't called during init for some other reason. */
++ if (ppc_md.get_random_seed == pnv_get_random_long_early)
++ pnv_get_random_long_early(&v);
+ return 0;
+ }
+-machine_subsys_initcall(powernv, rng_init);
++machine_subsys_initcall(powernv, pnv_rng_late_init);
+diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
+index 824c3ad7a0faf..dac545aa03082 100644
+--- a/arch/powerpc/platforms/powernv/setup.c
++++ b/arch/powerpc/platforms/powernv/setup.c
+@@ -203,6 +203,8 @@ static void __init pnv_setup_arch(void)
+ pnv_check_guarded_cores();
+
+ /* XXX PMCS */
++
++ pnv_rng_init();
+ }
+
+ static void __init pnv_init(void)
+diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
+index af162aeeae86d..3f9b51298aa34 100644
+--- a/arch/powerpc/platforms/pseries/pseries.h
++++ b/arch/powerpc/platforms/pseries/pseries.h
+@@ -121,4 +121,6 @@ void pseries_lpar_read_hblkrm_characteristics(void);
+ static inline void pseries_lpar_read_hblkrm_characteristics(void) { }
+ #endif
+
++void pseries_rng_init(void);
++
+ #endif /* _PSERIES_PSERIES_H */
+diff --git a/arch/powerpc/platforms/pseries/rng.c b/arch/powerpc/platforms/pseries/rng.c
+index 6268545947b83..6ddfdeaace9ef 100644
+--- a/arch/powerpc/platforms/pseries/rng.c
++++ b/arch/powerpc/platforms/pseries/rng.c
+@@ -10,6 +10,7 @@
+ #include <asm/archrandom.h>
+ #include <asm/machdep.h>
+ #include <asm/plpar_wrappers.h>
++#include "pseries.h"
+
+
+ static int pseries_get_random_long(unsigned long *v)
+@@ -24,19 +25,13 @@ static int pseries_get_random_long(unsigned long *v)
+ return 0;
+ }
+
+-static __init int rng_init(void)
++void __init pseries_rng_init(void)
+ {
+ struct device_node *dn;
+
+ dn = of_find_compatible_node(NULL, NULL, "ibm,random");
+ if (!dn)
+- return -ENODEV;
+-
+- pr_info("Registering arch random hook.\n");
+-
++ return;
+ ppc_md.get_random_seed = pseries_get_random_long;
+-
+ of_node_put(dn);
+- return 0;
+ }
+-machine_subsys_initcall(pseries, rng_init);
+diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
+index 955ff8aa1644d..f27735f623bae 100644
+--- a/arch/powerpc/platforms/pseries/setup.c
++++ b/arch/powerpc/platforms/pseries/setup.c
+@@ -852,6 +852,8 @@ static void __init pSeries_setup_arch(void)
+
+ if (swiotlb_force == SWIOTLB_FORCE)
+ ppc_swiotlb_enable = 1;
++
++ pseries_rng_init();
+ }
+
+ static void pseries_panic(char *str)
+diff --git a/arch/riscv/kernel/crash_dump.c b/arch/riscv/kernel/crash_dump.c
+index 86cc0ada57522..ea2158cee97b3 100644
+--- a/arch/riscv/kernel/crash_dump.c
++++ b/arch/riscv/kernel/crash_dump.c
+@@ -7,22 +7,10 @@
+
+ #include <linux/crash_dump.h>
+ #include <linux/io.h>
++#include <linux/uio.h>
+
+-/**
+- * copy_oldmem_page() - copy one page from old kernel memory
+- * @pfn: page frame number to be copied
+- * @buf: buffer where the copied page is placed
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page
+- * @userbuf: if set, @buf is in a user address space
+- *
+- * This function copies one page from old kernel memory into buffer pointed by
+- * @buf. If @buf is in userspace, set @userbuf to %1. Returns number of bytes
+- * copied or negative error in case of failure.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset,
+- int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+ void *vaddr;
+
+@@ -33,13 +21,7 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+ if (!vaddr)
+ return -ENOMEM;
+
+- if (userbuf) {
+- if (copy_to_user((char __user *)buf, vaddr + offset, csize)) {
+- memunmap(vaddr);
+- return -EFAULT;
+- }
+- } else
+- memcpy(buf, vaddr + offset, csize);
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+
+ memunmap(vaddr);
+ return csize;
+diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
+index 69819b7652504..28124d0fa1d5e 100644
+--- a/arch/s390/kernel/crash_dump.c
++++ b/arch/s390/kernel/crash_dump.c
+@@ -15,6 +15,7 @@
+ #include <linux/slab.h>
+ #include <linux/memblock.h>
+ #include <linux/elf.h>
++#include <linux/uio.h>
+ #include <asm/asm-offsets.h>
+ #include <asm/os_info.h>
+ #include <asm/elf.h>
+@@ -212,20 +213,30 @@ static int copy_oldmem_user(void __user *dst, unsigned long src, size_t count)
+ /*
+ * Copy one page from "oldmem"
+ */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
+- unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn, size_t csize,
++ unsigned long offset)
+ {
+ unsigned long src;
+ int rc;
+
++ if (!(iter_is_iovec(iter) || iov_iter_is_kvec(iter)))
++ return -EINVAL;
++ /* Multi-segment iterators are not supported */
++ if (iter->nr_segs > 1)
++ return -EINVAL;
+ if (!csize)
+ return 0;
+ src = pfn_to_phys(pfn) + offset;
+- if (userbuf)
+- rc = copy_oldmem_user((void __force __user *) buf, src, csize);
++
++ /* XXX: pass the iov_iter down to a common function */
++ if (iter_is_iovec(iter))
++ rc = copy_oldmem_user(iter->iov->iov_base, src, csize);
+ else
+- rc = copy_oldmem_kernel((void *) buf, src, csize);
+- return rc;
++ rc = copy_oldmem_kernel(iter->kvec->iov_base, src, csize);
++ if (rc < 0)
++ return rc;
++ iov_iter_advance(iter, csize);
++ return csize;
+ }
+
+ /*
+diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
+index 483ab5e10164d..f7dd3c849e68c 100644
+--- a/arch/s390/kernel/perf_cpum_cf.c
++++ b/arch/s390/kernel/perf_cpum_cf.c
+@@ -516,6 +516,26 @@ static int __hw_perf_event_init(struct perf_event *event, unsigned int type)
+ return err;
+ }
+
++/* Events CPU_CYLCES and INSTRUCTIONS can be submitted with two different
++ * attribute::type values:
++ * - PERF_TYPE_HARDWARE:
++ * - pmu->type:
++ * Handle both type of invocations identical. They address the same hardware.
++ * The result is different when event modifiers exclude_kernel and/or
++ * exclude_user are also set.
++ */
++static int cpumf_pmu_event_type(struct perf_event *event)
++{
++ u64 ev = event->attr.config;
++
++ if (cpumf_generic_events_basic[PERF_COUNT_HW_CPU_CYCLES] == ev ||
++ cpumf_generic_events_basic[PERF_COUNT_HW_INSTRUCTIONS] == ev ||
++ cpumf_generic_events_user[PERF_COUNT_HW_CPU_CYCLES] == ev ||
++ cpumf_generic_events_user[PERF_COUNT_HW_INSTRUCTIONS] == ev)
++ return PERF_TYPE_HARDWARE;
++ return PERF_TYPE_RAW;
++}
++
+ static int cpumf_pmu_event_init(struct perf_event *event)
+ {
+ unsigned int type = event->attr.type;
+@@ -525,7 +545,7 @@ static int cpumf_pmu_event_init(struct perf_event *event)
+ err = __hw_perf_event_init(event, type);
+ else if (event->pmu->type == type)
+ /* Registered as unknown PMU */
+- err = __hw_perf_event_init(event, PERF_TYPE_RAW);
++ err = __hw_perf_event_init(event, cpumf_pmu_event_type(event));
+ else
+ return -ENOENT;
+
+diff --git a/arch/sh/kernel/crash_dump.c b/arch/sh/kernel/crash_dump.c
+index 5b41b59698c1e..19ce6a950aaca 100644
+--- a/arch/sh/kernel/crash_dump.c
++++ b/arch/sh/kernel/crash_dump.c
+@@ -8,23 +8,11 @@
+ #include <linux/errno.h>
+ #include <linux/crash_dump.h>
+ #include <linux/io.h>
++#include <linux/uio.h>
+ #include <linux/uaccess.h>
+
+-/**
+- * copy_oldmem_page - copy one page from "oldmem"
+- * @pfn: page frame number to be copied
+- * @buf: target memory address for the copy; this can be in kernel address
+- * space or user address space (see @userbuf)
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page (based on pfn) to begin the copy
+- * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+- * otherwise @buf is in kernel address space, use memcpy().
+- *
+- * Copy a page from "oldmem". For this page, there is no pte mapped
+- * in the current kernel. We stitch up a pte, similar to kmap_atomic.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+ void __iomem *vaddr;
+
+@@ -32,15 +20,8 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf,
+ return 0;
+
+ vaddr = ioremap(pfn << PAGE_SHIFT, PAGE_SIZE);
+-
+- if (userbuf) {
+- if (copy_to_user((void __user *)buf, (vaddr + offset), csize)) {
+- iounmap(vaddr);
+- return -EFAULT;
+- }
+- } else
+- memcpy(buf, (vaddr + offset), csize);
+-
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+ iounmap(vaddr);
++
+ return csize;
+ }
+diff --git a/arch/x86/kernel/crash_dump_32.c b/arch/x86/kernel/crash_dump_32.c
+index 5fcac46aaf6b1..5f4ae5476e193 100644
+--- a/arch/x86/kernel/crash_dump_32.c
++++ b/arch/x86/kernel/crash_dump_32.c
+@@ -10,8 +10,7 @@
+ #include <linux/errno.h>
+ #include <linux/highmem.h>
+ #include <linux/crash_dump.h>
+-
+-#include <linux/uaccess.h>
++#include <linux/uio.h>
+
+ static inline bool is_crashed_pfn_valid(unsigned long pfn)
+ {
+@@ -29,21 +28,8 @@ static inline bool is_crashed_pfn_valid(unsigned long pfn)
+ #endif
+ }
+
+-/**
+- * copy_oldmem_page - copy one page from "oldmem"
+- * @pfn: page frame number to be copied
+- * @buf: target memory address for the copy; this can be in kernel address
+- * space or user address space (see @userbuf)
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page (based on pfn) to begin the copy
+- * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+- * otherwise @buf is in kernel address space, use memcpy().
+- *
+- * Copy a page from "oldmem". For this page, there might be no pte mapped
+- * in the current kernel.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
+- unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn, size_t csize,
++ unsigned long offset)
+ {
+ void *vaddr;
+
+@@ -54,14 +40,7 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
+ return -EFAULT;
+
+ vaddr = kmap_local_pfn(pfn);
+-
+- if (!userbuf) {
+- memcpy(buf, vaddr + offset, csize);
+- } else {
+- if (copy_to_user(buf, vaddr + offset, csize))
+- csize = -EFAULT;
+- }
+-
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+ kunmap_local(vaddr);
+
+ return csize;
+diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
+index 97529552dd249..94fe4aff9694b 100644
+--- a/arch/x86/kernel/crash_dump_64.c
++++ b/arch/x86/kernel/crash_dump_64.c
+@@ -8,12 +8,12 @@
+
+ #include <linux/errno.h>
+ #include <linux/crash_dump.h>
+-#include <linux/uaccess.h>
++#include <linux/uio.h>
+ #include <linux/io.h>
+ #include <linux/cc_platform.h>
+
+-static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
+- unsigned long offset, int userbuf,
++static ssize_t __copy_oldmem_page(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset,
+ bool encrypted)
+ {
+ void *vaddr;
+@@ -29,46 +29,27 @@ static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
+ if (!vaddr)
+ return -ENOMEM;
+
+- if (userbuf) {
+- if (copy_to_user((void __user *)buf, vaddr + offset, csize)) {
+- iounmap((void __iomem *)vaddr);
+- return -EFAULT;
+- }
+- } else
+- memcpy(buf, vaddr + offset, csize);
++ csize = copy_to_iter(vaddr + offset, csize, iter);
+
+ iounmap((void __iomem *)vaddr);
+ return csize;
+ }
+
+-/**
+- * copy_oldmem_page - copy one page of memory
+- * @pfn: page frame number to be copied
+- * @buf: target memory address for the copy; this can be in kernel address
+- * space or user address space (see @userbuf)
+- * @csize: number of bytes to copy
+- * @offset: offset in bytes into the page (based on pfn) to begin the copy
+- * @userbuf: if set, @buf is in user address space, use copy_to_user(),
+- * otherwise @buf is in kernel address space, use memcpy().
+- *
+- * Copy a page from the old kernel's memory. For this page, there is no pte
+- * mapped in the current kernel. We stitch up a pte, similar to kmap_atomic.
+- */
+-ssize_t copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
+- unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page(struct iov_iter *iter, unsigned long pfn, size_t csize,
++ unsigned long offset)
+ {
+- return __copy_oldmem_page(pfn, buf, csize, offset, userbuf, false);
++ return __copy_oldmem_page(iter, pfn, csize, offset, false);
+ }
+
+-/**
++/*
+ * copy_oldmem_page_encrypted - same as copy_oldmem_page() above but ioremap the
+ * memory with the encryption mask set to accommodate kdump on SME-enabled
+ * machines.
+ */
+-ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize,
+- unsigned long offset, int userbuf)
++ssize_t copy_oldmem_page_encrypted(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset)
+ {
+- return __copy_oldmem_page(pfn, buf, csize, offset, userbuf, true);
++ return __copy_oldmem_page(iter, pfn, csize, offset, true);
+ }
+
+ ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
+diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
+index 4b7d490c0b639..76e9e6eb71d63 100644
+--- a/arch/x86/kvm/svm/sev.c
++++ b/arch/x86/kvm/svm/sev.c
+@@ -1665,19 +1665,24 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
+ {
+ struct kvm_sev_info *dst = &to_kvm_svm(dst_kvm)->sev_info;
+ struct kvm_sev_info *src = &to_kvm_svm(src_kvm)->sev_info;
++ struct kvm_vcpu *dst_vcpu, *src_vcpu;
++ struct vcpu_svm *dst_svm, *src_svm;
+ struct kvm_sev_info *mirror;
++ unsigned long i;
+
+ dst->active = true;
+ dst->asid = src->asid;
+ dst->handle = src->handle;
+ dst->pages_locked = src->pages_locked;
+ dst->enc_context_owner = src->enc_context_owner;
++ dst->es_active = src->es_active;
+
+ src->asid = 0;
+ src->active = false;
+ src->handle = 0;
+ src->pages_locked = 0;
+ src->enc_context_owner = NULL;
++ src->es_active = false;
+
+ list_cut_before(&dst->regions_list, &src->regions_list, &src->regions_list);
+
+@@ -1704,26 +1709,21 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
+ list_del(&src->mirror_entry);
+ list_add_tail(&dst->mirror_entry, &owner_sev_info->mirror_vms);
+ }
+-}
+
+-static int sev_es_migrate_from(struct kvm *dst, struct kvm *src)
+-{
+- unsigned long i;
+- struct kvm_vcpu *dst_vcpu, *src_vcpu;
+- struct vcpu_svm *dst_svm, *src_svm;
++ kvm_for_each_vcpu(i, dst_vcpu, dst_kvm) {
++ dst_svm = to_svm(dst_vcpu);
+
+- if (atomic_read(&src->online_vcpus) != atomic_read(&dst->online_vcpus))
+- return -EINVAL;
++ sev_init_vmcb(dst_svm);
+
+- kvm_for_each_vcpu(i, src_vcpu, src) {
+- if (!src_vcpu->arch.guest_state_protected)
+- return -EINVAL;
+- }
++ if (!dst->es_active)
++ continue;
+
+- kvm_for_each_vcpu(i, src_vcpu, src) {
++ /*
++ * Note, the source is not required to have the same number of
++ * vCPUs as the destination when migrating a vanilla SEV VM.
++ */
++ src_vcpu = kvm_get_vcpu(dst_kvm, i);
+ src_svm = to_svm(src_vcpu);
+- dst_vcpu = kvm_get_vcpu(dst, i);
+- dst_svm = to_svm(dst_vcpu);
+
+ /*
+ * Transfer VMSA and GHCB state to the destination. Nullify and
+@@ -1740,8 +1740,23 @@ static int sev_es_migrate_from(struct kvm *dst, struct kvm *src)
+ src_svm->vmcb->control.vmsa_pa = INVALID_PAGE;
+ src_vcpu->arch.guest_state_protected = false;
+ }
+- to_kvm_svm(src)->sev_info.es_active = false;
+- to_kvm_svm(dst)->sev_info.es_active = true;
++}
++
++static int sev_check_source_vcpus(struct kvm *dst, struct kvm *src)
++{
++ struct kvm_vcpu *src_vcpu;
++ unsigned long i;
++
++ if (!sev_es_guest(src))
++ return 0;
++
++ if (atomic_read(&src->online_vcpus) != atomic_read(&dst->online_vcpus))
++ return -EINVAL;
++
++ kvm_for_each_vcpu(i, src_vcpu, src) {
++ if (!src_vcpu->arch.guest_state_protected)
++ return -EINVAL;
++ }
+
+ return 0;
+ }
+@@ -1789,11 +1804,9 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
+ if (ret)
+ goto out_dst_vcpu;
+
+- if (sev_es_guest(source_kvm)) {
+- ret = sev_es_migrate_from(kvm, source_kvm);
+- if (ret)
+- goto out_source_vcpu;
+- }
++ ret = sev_check_source_vcpus(kvm, source_kvm);
++ if (ret)
++ goto out_source_vcpu;
+
+ sev_migrate_from(kvm, source_kvm);
+ kvm_vm_dead(source_kvm);
+@@ -2910,7 +2923,7 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in)
+ count, in);
+ }
+
+-void sev_es_init_vmcb(struct vcpu_svm *svm)
++static void sev_es_init_vmcb(struct vcpu_svm *svm)
+ {
+ struct kvm_vcpu *vcpu = &svm->vcpu;
+
+@@ -2955,6 +2968,15 @@ void sev_es_init_vmcb(struct vcpu_svm *svm)
+ set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
+ }
+
++void sev_init_vmcb(struct vcpu_svm *svm)
++{
++ svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ENABLE;
++ clr_exception_intercept(svm, UD_VECTOR);
++
++ if (sev_es_guest(svm->vcpu.kvm))
++ sev_es_init_vmcb(svm);
++}
++
+ void sev_es_vcpu_reset(struct vcpu_svm *svm)
+ {
+ /*
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index 0c0a09b43b105..6bfb0b0e66bd3 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -1125,15 +1125,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
+ svm->vmcb->control.int_ctl |= V_GIF_ENABLE_MASK;
+ }
+
+- if (sev_guest(vcpu->kvm)) {
+- svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ENABLE;
+- clr_exception_intercept(svm, UD_VECTOR);
+-
+- if (sev_es_guest(vcpu->kvm)) {
+- /* Perform SEV-ES specific VMCB updates */
+- sev_es_init_vmcb(svm);
+- }
+- }
++ if (sev_guest(vcpu->kvm))
++ sev_init_vmcb(svm);
+
+ svm_hv_init_vmcb(svm->vmcb);
+ init_vmcb_after_set_cpuid(vcpu);
+diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
+index 34babf9185fe5..8ec8fb58b924a 100644
+--- a/arch/x86/kvm/svm/svm.h
++++ b/arch/x86/kvm/svm/svm.h
+@@ -616,10 +616,10 @@ void __init sev_set_cpu_caps(void);
+ void __init sev_hardware_setup(void);
+ void sev_hardware_unsetup(void);
+ int sev_cpu_init(struct svm_cpu_data *sd);
++void sev_init_vmcb(struct vcpu_svm *svm);
+ void sev_free_vcpu(struct kvm_vcpu *vcpu);
+ int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
+ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
+-void sev_es_init_vmcb(struct vcpu_svm *svm);
+ void sev_es_vcpu_reset(struct vcpu_svm *svm);
+ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
+ void sev_es_prepare_switch_to_guest(struct vmcb_save_area *hostsa);
+diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
+index 16b6efacf7c67..4c71fa04e784c 100644
+--- a/arch/x86/net/bpf_jit_comp.c
++++ b/arch/x86/net/bpf_jit_comp.c
+@@ -1415,8 +1415,9 @@ st: if (is_imm8(insn->off))
+ case BPF_JMP | BPF_CALL:
+ func = (u8 *) __bpf_call_base + imm32;
+ if (tail_call_reachable) {
++ /* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */
+ EMIT3_off32(0x48, 0x8B, 0x85,
+- -(bpf_prog->aux->stack_depth + 8));
++ -round_up(bpf_prog->aux->stack_depth, 8) - 8);
+ if (!imm32 || emit_call(&prog, func, image + addrs[i - 1] + 7))
+ return -EINVAL;
+ } else {
+diff --git a/arch/xtensa/kernel/time.c b/arch/xtensa/kernel/time.c
+index e8ceb15286081..16b8a6273772c 100644
+--- a/arch/xtensa/kernel/time.c
++++ b/arch/xtensa/kernel/time.c
+@@ -154,6 +154,7 @@ static void __init calibrate_ccount(void)
+ cpu = of_find_compatible_node(NULL, NULL, "cdns,xtensa-cpu");
+ if (cpu) {
+ clk = of_clk_get(cpu, 0);
++ of_node_put(cpu);
+ if (!IS_ERR(clk)) {
+ ccount_freq = clk_get_rate(clk);
+ return;
+diff --git a/arch/xtensa/platforms/xtfpga/setup.c b/arch/xtensa/platforms/xtfpga/setup.c
+index 538e6748e85a7..c79c1d09ea863 100644
+--- a/arch/xtensa/platforms/xtfpga/setup.c
++++ b/arch/xtensa/platforms/xtfpga/setup.c
+@@ -133,6 +133,7 @@ static int __init machine_setup(void)
+
+ if ((eth = of_find_compatible_node(eth, NULL, "opencores,ethoc")))
+ update_local_mac(eth);
++ of_node_put(eth);
+ return 0;
+ }
+ arch_initcall(machine_setup);
+diff --git a/block/blk-core.c b/block/blk-core.c
+index 84f7b7884d072..a7329475aba25 100644
+--- a/block/blk-core.c
++++ b/block/blk-core.c
+@@ -322,19 +322,6 @@ void blk_cleanup_queue(struct request_queue *q)
+ blk_mq_exit_queue(q);
+ }
+
+- /*
+- * In theory, request pool of sched_tags belongs to request queue.
+- * However, the current implementation requires tag_set for freeing
+- * requests, so free the pool now.
+- *
+- * Queue has become frozen, there can't be any in-queue requests, so
+- * it is safe to free requests now.
+- */
+- mutex_lock(&q->sysfs_lock);
+- if (q->elevator)
+- blk_mq_sched_free_rqs(q);
+- mutex_unlock(&q->sysfs_lock);
+-
+ /* @q is and will stay empty, shutdown and put */
+ blk_put_queue(q);
+ }
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index 631fb87b4976f..37caa73bff893 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -2777,15 +2777,20 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
+ return NULL;
+ }
+
+- rq_qos_throttle(q, *bio);
+-
+ if (blk_mq_get_hctx_type((*bio)->bi_opf) != rq->mq_hctx->type)
+ return NULL;
+ if (op_is_flush(rq->cmd_flags) != op_is_flush((*bio)->bi_opf))
+ return NULL;
+
+- rq->cmd_flags = (*bio)->bi_opf;
++ /*
++ * If any qos ->throttle() end up blocking, we will have flushed the
++ * plug and hence killed the cached_rq list as well. Pop this entry
++ * before we throttle.
++ */
+ plug->cached_rq = rq_list_next(rq);
++ rq_qos_throttle(q, *bio);
++
++ rq->cmd_flags = (*bio)->bi_opf;
+ INIT_LIST_HEAD(&rq->queuelist);
+ return rq;
+ }
+diff --git a/block/genhd.c b/block/genhd.c
+index 3008ec2136543..13daac1a9aefa 100644
+--- a/block/genhd.c
++++ b/block/genhd.c
+@@ -652,6 +652,17 @@ void del_gendisk(struct gendisk *disk)
+
+ blk_sync_queue(q);
+ blk_flush_integrity();
++ blk_mq_cancel_work_sync(q);
++
++ blk_mq_quiesce_queue(q);
++ if (q->elevator) {
++ mutex_lock(&q->sysfs_lock);
++ elevator_exit(q);
++ mutex_unlock(&q->sysfs_lock);
++ }
++ rq_qos_exit(q);
++ blk_mq_unquiesce_queue(q);
++
+ /*
+ * Allow using passthrough request again after the queue is torn down.
+ */
+@@ -1120,31 +1131,6 @@ static const struct attribute_group *disk_attr_groups[] = {
+ NULL
+ };
+
+-static void disk_release_mq(struct request_queue *q)
+-{
+- blk_mq_cancel_work_sync(q);
+-
+- /*
+- * There can't be any non non-passthrough bios in flight here, but
+- * requests stay around longer, including passthrough ones so we
+- * still need to freeze the queue here.
+- */
+- blk_mq_freeze_queue(q);
+-
+- /*
+- * Since the I/O scheduler exit code may access cgroup information,
+- * perform I/O scheduler exit before disassociating from the block
+- * cgroup controller.
+- */
+- if (q->elevator) {
+- mutex_lock(&q->sysfs_lock);
+- elevator_exit(q);
+- mutex_unlock(&q->sysfs_lock);
+- }
+- rq_qos_exit(q);
+- __blk_mq_unfreeze_queue(q, true);
+-}
+-
+ /**
+ * disk_release - releases all allocated resources of the gendisk
+ * @dev: the device representing this disk
+@@ -1166,9 +1152,6 @@ static void disk_release(struct device *dev)
+ might_sleep();
+ WARN_ON_ONCE(disk_live(disk));
+
+- if (queue_is_mq(disk->queue))
+- disk_release_mq(disk->queue);
+-
+ blkcg_exit_queue(disk->queue);
+
+ disk_release_events(disk);
+diff --git a/drivers/base/memory.c b/drivers/base/memory.c
+index 084d67fd55cc8..bc60c9cd32308 100644
+--- a/drivers/base/memory.c
++++ b/drivers/base/memory.c
+@@ -558,7 +558,7 @@ static ssize_t hard_offline_page_store(struct device *dev,
+ if (kstrtoull(buf, 0, &pfn) < 0)
+ return -EINVAL;
+ pfn >>= PAGE_SHIFT;
+- ret = memory_failure(pfn, 0);
++ ret = memory_failure(pfn, MF_SW_SIMULATED);
+ if (ret == -EOPNOTSUPP)
+ ret = 0;
+ return ret ? ret : count;
+diff --git a/drivers/base/regmap/regmap-irq.c b/drivers/base/regmap/regmap-irq.c
+index 400c7412a7dcf..a6db605707b00 100644
+--- a/drivers/base/regmap/regmap-irq.c
++++ b/drivers/base/regmap/regmap-irq.c
+@@ -252,6 +252,7 @@ static void regmap_irq_enable(struct irq_data *data)
+ struct regmap_irq_chip_data *d = irq_data_get_irq_chip_data(data);
+ struct regmap *map = d->map;
+ const struct regmap_irq *irq_data = irq_to_regmap_irq(d, data->hwirq);
++ unsigned int reg = irq_data->reg_offset / map->reg_stride;
+ unsigned int mask, type;
+
+ type = irq_data->type.type_falling_val | irq_data->type.type_rising_val;
+@@ -268,14 +269,14 @@ static void regmap_irq_enable(struct irq_data *data)
+ * at the corresponding offset in regmap_irq_set_type().
+ */
+ if (d->chip->type_in_mask && type)
+- mask = d->type_buf[irq_data->reg_offset / map->reg_stride];
++ mask = d->type_buf[reg] & irq_data->mask;
+ else
+ mask = irq_data->mask;
+
+ if (d->chip->clear_on_unmask)
+ d->clear_status = true;
+
+- d->mask_buf[irq_data->reg_offset / map->reg_stride] &= ~mask;
++ d->mask_buf[reg] &= ~mask;
+ }
+
+ static void regmap_irq_disable(struct irq_data *data)
+@@ -386,6 +387,7 @@ static inline int read_sub_irq_data(struct regmap_irq_chip_data *data,
+ subreg = &chip->sub_reg_offsets[b];
+ for (i = 0; i < subreg->num_regs; i++) {
+ unsigned int offset = subreg->offset[i];
++ unsigned int index = offset / map->reg_stride;
+
+ if (chip->not_fixed_stride)
+ ret = regmap_read(map,
+@@ -394,7 +396,7 @@ static inline int read_sub_irq_data(struct regmap_irq_chip_data *data,
+ else
+ ret = regmap_read(map,
+ chip->status_base + offset,
+- &data->status_buf[offset]);
++ &data->status_buf[index]);
+
+ if (ret)
+ break;
+diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
+index 003056d4f7f5f..966a6bf4c1627 100644
+--- a/drivers/block/xen-blkfront.c
++++ b/drivers/block/xen-blkfront.c
+@@ -2137,9 +2137,11 @@ static void blkfront_closing(struct blkfront_info *info)
+ return;
+
+ /* No more blkif_request(). */
+- blk_mq_stop_hw_queues(info->rq);
+- blk_mark_disk_dead(info->gd);
+- set_capacity(info->gd, 0);
++ if (info->rq && info->gd) {
++ blk_mq_stop_hw_queues(info->rq);
++ blk_mark_disk_dead(info->gd);
++ set_capacity(info->gd, 0);
++ }
+
+ for_each_rinfo(info, rinfo, i) {
+ /* No more gnttab callback work. */
+@@ -2480,16 +2482,19 @@ static int blkfront_remove(struct xenbus_device *xbdev)
+
+ dev_dbg(&xbdev->dev, "%s removed", xbdev->nodename);
+
+- del_gendisk(info->gd);
++ if (info->gd)
++ del_gendisk(info->gd);
+
+ mutex_lock(&blkfront_mutex);
+ list_del(&info->info_list);
+ mutex_unlock(&blkfront_mutex);
+
+ blkif_free(info, 0);
+- xlbd_release_minors(info->gd->first_minor, info->gd->minors);
+- blk_cleanup_disk(info->gd);
+- blk_mq_free_tag_set(&info->tag_set);
++ if (info->gd) {
++ xlbd_release_minors(info->gd->first_minor, info->gd->minors);
++ blk_cleanup_disk(info->gd);
++ blk_mq_free_tag_set(&info->tag_set);
++ }
+
+ kfree(info);
+ return 0;
+diff --git a/drivers/char/random.c b/drivers/char/random.c
+index 1f3072ee6b7cd..dd52e948a9a48 100644
+--- a/drivers/char/random.c
++++ b/drivers/char/random.c
+@@ -87,7 +87,7 @@ static RAW_NOTIFIER_HEAD(random_ready_chain);
+
+ /* Control how we warn userspace. */
+ static struct ratelimit_state urandom_warning =
+- RATELIMIT_STATE_INIT("warn_urandom_randomness", HZ, 3);
++ RATELIMIT_STATE_INIT_FLAGS("urandom_warning", HZ, 3, RATELIMIT_MSG_ON_RELEASE);
+ static int ratelimit_disable __read_mostly =
+ IS_ENABLED(CONFIG_WARN_ALL_UNSEEDED_RANDOM);
+ module_param_named(ratelimit_disable, ratelimit_disable, int, 0644);
+@@ -451,7 +451,7 @@ static ssize_t get_random_bytes_user(struct iov_iter *iter)
+
+ /*
+ * Immediately overwrite the ChaCha key at index 4 with random
+- * bytes, in case userspace causes copy_to_user() below to sleep
++ * bytes, in case userspace causes copy_to_iter() below to sleep
+ * forever, so that we still retain forward secrecy in that case.
+ */
+ crng_make_state(chacha_state, (u8 *)&chacha_state[4], CHACHA_KEY_SIZE);
+@@ -1038,7 +1038,7 @@ void add_interrupt_randomness(int irq)
+ if (new_count & MIX_INFLIGHT)
+ return;
+
+- if (new_count < 64 && !time_is_before_jiffies(fast_pool->last + HZ))
++ if (new_count < 1024 && !time_is_before_jiffies(fast_pool->last + HZ))
+ return;
+
+ if (unlikely(!fast_pool->mix.func))
+diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
+index e7330684d3b82..9631f2fd2faf7 100644
+--- a/drivers/dma-buf/udmabuf.c
++++ b/drivers/dma-buf/udmabuf.c
+@@ -32,8 +32,11 @@ static vm_fault_t udmabuf_vm_fault(struct vm_fault *vmf)
+ {
+ struct vm_area_struct *vma = vmf->vma;
+ struct udmabuf *ubuf = vma->vm_private_data;
++ pgoff_t pgoff = vmf->pgoff;
+
+- vmf->page = ubuf->pages[vmf->pgoff];
++ if (pgoff >= ubuf->pagecount)
++ return VM_FAULT_SIGBUS;
++ vmf->page = ubuf->pages[pgoff];
+ get_page(vmf->page);
+ return 0;
+ }
+diff --git a/drivers/gpio/gpio-vr41xx.c b/drivers/gpio/gpio-vr41xx.c
+index 98cd715ccc33c..8d09b619c1669 100644
+--- a/drivers/gpio/gpio-vr41xx.c
++++ b/drivers/gpio/gpio-vr41xx.c
+@@ -217,8 +217,6 @@ static int giu_get_irq(unsigned int irq)
+ printk(KERN_ERR "spurious GIU interrupt: %04x(%04x),%04x(%04x)\n",
+ maskl, pendl, maskh, pendh);
+
+- atomic_inc(&irq_err_count);
+-
+ return -EINVAL;
+ }
+
+diff --git a/drivers/gpio/gpio-winbond.c b/drivers/gpio/gpio-winbond.c
+index 7f8f5b02e31d5..4b61d975cc0ec 100644
+--- a/drivers/gpio/gpio-winbond.c
++++ b/drivers/gpio/gpio-winbond.c
+@@ -385,12 +385,13 @@ static int winbond_gpio_get(struct gpio_chip *gc, unsigned int offset)
+ unsigned long *base = gpiochip_get_data(gc);
+ const struct winbond_gpio_info *info;
+ bool val;
++ int ret;
+
+ winbond_gpio_get_info(&offset, &info);
+
+- val = winbond_sio_enter(*base);
+- if (val)
+- return val;
++ ret = winbond_sio_enter(*base);
++ if (ret)
++ return ret;
+
+ winbond_sio_select_logical(*base, info->dev);
+
+diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+index 95b5b5bfa1ffa..71b15e2df235b 100644
+--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
++++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+@@ -944,7 +944,7 @@ static void override_lane_settings(const struct link_training_settings *lt_setti
+
+ return;
+
+- for (lane = 1; lane < LANE_COUNT_DP_MAX; lane++) {
++ for (lane = 0; lane < LANE_COUNT_DP_MAX; lane++) {
+ if (lt_settings->voltage_swing)
+ lane_settings[lane].VOLTAGE_SWING = *lt_settings->voltage_swing;
+ if (lt_settings->pre_emphasis)
+diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+index 248602c15f3a0..6007b847b54f2 100644
+--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
++++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+@@ -1771,29 +1771,9 @@ void dce110_enable_accelerated_mode(struct dc *dc, struct dc_state *context)
+ break;
+ }
+ }
+-
+- /*
+- * TO-DO: So far the code logic below only addresses single eDP case.
+- * For dual eDP case, there are a few things that need to be
+- * implemented first:
+- *
+- * 1. Change the fastboot logic above, so eDP link[0 or 1]'s
+- * stream[0 or 1] will all be checked.
+- *
+- * 2. Change keep_edp_vdd_on to an array, and maintain keep_edp_vdd_on
+- * for each eDP.
+- *
+- * Once above 2 things are completed, we can then change the logic below
+- * correspondingly, so dual eDP case will be fully covered.
+- */
+-
+- // We are trying to enable eDP, don't power down VDD if eDP stream is existing
+- if ((edp_stream_num == 1 && edp_streams[0] != NULL) || can_apply_edp_fast_boot) {
++ // We are trying to enable eDP, don't power down VDD
++ if (can_apply_edp_fast_boot)
+ keep_edp_vdd_on = true;
+- DC_LOG_EVENT_LINK_TRAINING("Keep eDP Vdd on\n");
+- } else {
+- DC_LOG_EVENT_LINK_TRAINING("No eDP stream enabled, turn eDP Vdd off\n");
+- }
+ }
+
+ // Check seamless boot support
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
+index 970b65efeac10..eaa7032f0f1a3 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_dpp.c
+@@ -212,6 +212,9 @@ static void dpp2_cnv_setup (
+ break;
+ }
+
++ /* Set default color space based on format if none is given. */
++ color_space = input_color_space ? input_color_space : color_space;
++
+ if (is_2bit == 1 && alpha_2bit_lut != NULL) {
+ REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, alpha_2bit_lut->lut0);
+ REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, alpha_2bit_lut->lut1);
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
+index 8b6505b7dca86..f50ab961bc174 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_dpp.c
+@@ -153,6 +153,9 @@ static void dpp201_cnv_setup(
+ break;
+ }
+
++ /* Set default color space based on format if none is given. */
++ color_space = input_color_space ? input_color_space : color_space;
++
+ if (is_2bit == 1 && alpha_2bit_lut != NULL) {
+ REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, alpha_2bit_lut->lut0);
+ REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, alpha_2bit_lut->lut1);
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
+index ab3918c0a15b0..0dcc07531643f 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
+@@ -294,6 +294,9 @@ static void dpp3_cnv_setup (
+ break;
+ }
+
++ /* Set default color space based on format if none is given. */
++ color_space = input_color_space ? input_color_space : color_space;
++
+ if (is_2bit == 1 && alpha_2bit_lut != NULL) {
+ REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT0, alpha_2bit_lut->lut0);
+ REG_UPDATE(ALPHA_2BIT_LUT, ALPHA_2BIT_LUT1, alpha_2bit_lut->lut1);
+diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
+index 569903d47aea5..a76f037001aec 100644
+--- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
++++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
+@@ -2437,7 +2437,7 @@ static void icl_wrpll_params_populate(struct skl_wrpll_params *params,
+ }
+
+ /*
+- * Display WA #22010492432: ehl, tgl, adl-p
++ * Display WA #22010492432: ehl, tgl, adl-s, adl-p
+ * Program half of the nominal DCO divider fraction value.
+ */
+ static bool
+@@ -2445,7 +2445,7 @@ ehl_combo_pll_div_frac_wa_needed(struct drm_i915_private *i915)
+ {
+ return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) &&
+ IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
+- IS_TIGERLAKE(i915) || IS_ALDERLAKE_P(i915)) &&
++ IS_TIGERLAKE(i915) || IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) &&
+ i915->dpll.ref_clks.nssc == 38400;
+ }
+
+diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+index 1219f71629a52..1ced7b108f2c7 100644
+--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+@@ -1002,7 +1002,8 @@ void adreno_gpu_cleanup(struct adreno_gpu *adreno_gpu)
+ for (i = 0; i < ARRAY_SIZE(adreno_gpu->info->fw); i++)
+ release_firmware(adreno_gpu->fw[i]);
+
+- pm_runtime_disable(&priv->gpu_pdev->dev);
++ if (pm_runtime_enabled(&priv->gpu_pdev->dev))
++ pm_runtime_disable(&priv->gpu_pdev->dev);
+
+ msm_gpu_cleanup(&adreno_gpu->base);
+ }
+diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+index 3cf476c551584..d92193db7eb2d 100644
+--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
++++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c
+@@ -217,6 +217,7 @@ static int mdp4_modeset_init_intf(struct mdp4_kms *mdp4_kms,
+ encoder = mdp4_lcdc_encoder_init(dev, panel_node);
+ if (IS_ERR(encoder)) {
+ DRM_DEV_ERROR(dev->dev, "failed to construct LCDC encoder\n");
++ of_node_put(panel_node);
+ return PTR_ERR(encoder);
+ }
+
+@@ -226,6 +227,7 @@ static int mdp4_modeset_init_intf(struct mdp4_kms *mdp4_kms,
+ connector = mdp4_lvds_connector_init(dev, panel_node, encoder);
+ if (IS_ERR(connector)) {
+ DRM_DEV_ERROR(dev->dev, "failed to initialize LVDS connector\n");
++ of_node_put(panel_node);
+ return PTR_ERR(connector);
+ }
+
+diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
+index de1974916ad2d..499d0bbc442c9 100644
+--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
++++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
+@@ -1523,6 +1523,8 @@ end:
+ return ret;
+ }
+
++static int dp_ctrl_on_stream_phy_test_report(struct dp_ctrl *dp_ctrl);
++
+ static int dp_ctrl_process_phy_test_request(struct dp_ctrl_private *ctrl)
+ {
+ int ret = 0;
+@@ -1545,7 +1547,7 @@ static int dp_ctrl_process_phy_test_request(struct dp_ctrl_private *ctrl)
+
+ ret = dp_ctrl_on_link(&ctrl->dp_ctrl);
+ if (!ret)
+- ret = dp_ctrl_on_stream(&ctrl->dp_ctrl);
++ ret = dp_ctrl_on_stream_phy_test_report(&ctrl->dp_ctrl);
+ else
+ DRM_ERROR("failed to enable DP link controller\n");
+
+@@ -1800,7 +1802,27 @@ static int dp_ctrl_link_retrain(struct dp_ctrl_private *ctrl)
+ return dp_ctrl_setup_main_link(ctrl, &training_step);
+ }
+
+-int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl)
++static int dp_ctrl_on_stream_phy_test_report(struct dp_ctrl *dp_ctrl)
++{
++ int ret;
++ struct dp_ctrl_private *ctrl;
++
++ ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
++
++ ctrl->dp_ctrl.pixel_rate = ctrl->panel->dp_mode.drm_mode.clock;
++
++ ret = dp_ctrl_enable_stream_clocks(ctrl);
++ if (ret) {
++ DRM_ERROR("Failed to start pixel clocks. ret=%d\n", ret);
++ return ret;
++ }
++
++ dp_ctrl_send_phy_test_pattern(ctrl);
++
++ return 0;
++}
++
++int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl, bool force_link_train)
+ {
+ int ret = 0;
+ bool mainlink_ready = false;
+@@ -1831,12 +1853,7 @@ int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl)
+ goto end;
+ }
+
+- if (ctrl->link->sink_request & DP_TEST_LINK_PHY_TEST_PATTERN) {
+- dp_ctrl_send_phy_test_pattern(ctrl);
+- return 0;
+- }
+-
+- if (!dp_ctrl_channel_eq_ok(ctrl))
++ if (force_link_train || !dp_ctrl_channel_eq_ok(ctrl))
+ dp_ctrl_link_retrain(ctrl);
+
+ /* stop txing train pattern to end link training */
+diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.h b/drivers/gpu/drm/msm/dp/dp_ctrl.h
+index 2433edbc70a6d..dcc7af21a5f05 100644
+--- a/drivers/gpu/drm/msm/dp/dp_ctrl.h
++++ b/drivers/gpu/drm/msm/dp/dp_ctrl.h
+@@ -20,7 +20,7 @@ struct dp_ctrl {
+ };
+
+ int dp_ctrl_on_link(struct dp_ctrl *dp_ctrl);
+-int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl);
++int dp_ctrl_on_stream(struct dp_ctrl *dp_ctrl, bool force_link_train);
+ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl);
+ int dp_ctrl_off(struct dp_ctrl *dp_ctrl);
+ void dp_ctrl_push_idle(struct dp_ctrl *dp_ctrl);
+diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
+index 8deb92bddfdec..12270bd3cff98 100644
+--- a/drivers/gpu/drm/msm/dp/dp_display.c
++++ b/drivers/gpu/drm/msm/dp/dp_display.c
+@@ -308,7 +308,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,
+ struct msm_drm_private *priv = dev_get_drvdata(master);
+
+ /* disable all HPD interrupts */
+- dp_catalog_hpd_config_intr(dp->catalog, DP_DP_HPD_INT_MASK, false);
++ if (dp->core_initialized)
++ dp_catalog_hpd_config_intr(dp->catalog, DP_DP_HPD_INT_MASK, false);
+
+ kthread_stop(dp->ev_tsk);
+
+@@ -902,7 +903,7 @@ static int dp_display_enable(struct dp_display_private *dp, u32 data)
+ return 0;
+ }
+
+- rc = dp_ctrl_on_stream(dp->ctrl);
++ rc = dp_ctrl_on_stream(dp->ctrl, data);
+ if (!rc)
+ dp_display->power_on = true;
+
+@@ -1589,6 +1590,7 @@ int msm_dp_display_enable(struct msm_dp *dp, struct drm_encoder *encoder)
+ int rc = 0;
+ struct dp_display_private *dp_display;
+ u32 state;
++ bool force_link_train = false;
+
+ dp_display = container_of(dp, struct dp_display_private, dp_display);
+ if (!dp_display->dp_mode.drm_mode.clock) {
+@@ -1617,10 +1619,12 @@ int msm_dp_display_enable(struct msm_dp *dp, struct drm_encoder *encoder)
+
+ state = dp_display->hpd_state;
+
+- if (state == ST_DISPLAY_OFF)
++ if (state == ST_DISPLAY_OFF) {
+ dp_display_host_phy_init(dp_display);
++ force_link_train = true;
++ }
+
+- dp_display_enable(dp_display, 0);
++ dp_display_enable(dp_display, force_link_train);
+
+ rc = dp_display_post_enable(dp);
+ if (rc) {
+@@ -1629,10 +1633,6 @@ int msm_dp_display_enable(struct msm_dp *dp, struct drm_encoder *encoder)
+ dp_display_unprepare(dp);
+ }
+
+- /* manual kick off plug event to train link */
+- if (state == ST_DISPLAY_OFF)
+- dp_add_event(dp_display, EV_IRQ_HPD_INT, 0, 0);
+-
+ /* completed connection */
+ dp_display->hpd_state = ST_CONNECTED;
+
+diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
+index f2c46116df55c..b5f6acfe7c6e9 100644
+--- a/drivers/gpu/drm/msm/msm_drv.c
++++ b/drivers/gpu/drm/msm/msm_drv.c
+@@ -967,7 +967,7 @@ static const struct drm_driver msm_driver = {
+ .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
+ .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
+ .gem_prime_import_sg_table = msm_gem_prime_import_sg_table,
+- .gem_prime_mmap = drm_gem_prime_mmap,
++ .gem_prime_mmap = msm_gem_prime_mmap,
+ #ifdef CONFIG_DEBUG_FS
+ .debugfs_init = msm_debugfs_init,
+ #endif
+diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
+index d661debb50f11..9b985b641319d 100644
+--- a/drivers/gpu/drm/msm/msm_drv.h
++++ b/drivers/gpu/drm/msm/msm_drv.h
+@@ -288,6 +288,7 @@ unsigned long msm_gem_shrinker_shrink(struct drm_device *dev, unsigned long nr_t
+ void msm_gem_shrinker_init(struct drm_device *dev);
+ void msm_gem_shrinker_cleanup(struct drm_device *dev);
+
++int msm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
+ struct sg_table *msm_gem_prime_get_sg_table(struct drm_gem_object *obj);
+ int msm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map);
+ void msm_gem_prime_vunmap(struct drm_gem_object *obj, struct iosys_map *map);
+diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c b/drivers/gpu/drm/msm/msm_gem_prime.c
+index 94ab705e9b8a4..dcc8a573bc762 100644
+--- a/drivers/gpu/drm/msm/msm_gem_prime.c
++++ b/drivers/gpu/drm/msm/msm_gem_prime.c
+@@ -11,6 +11,21 @@
+ #include "msm_drv.h"
+ #include "msm_gem.h"
+
++int msm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
++{
++ int ret;
++
++ /* Ensure the mmap offset is initialized. We lazily initialize it,
++ * so if it has not been first mmap'd directly as a GEM object, the
++ * mmap offset will not be already initialized.
++ */
++ ret = drm_gem_create_mmap_offset(obj);
++ if (ret)
++ return ret;
++
++ return drm_gem_prime_mmap(obj, vma);
++}
++
+ struct sg_table *msm_gem_prime_get_sg_table(struct drm_gem_object *obj)
+ {
+ struct msm_gem_object *msm_obj = to_msm_bo(obj);
+diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
+index 58eb3e1662cb9..7d27d7cee688b 100644
+--- a/drivers/gpu/drm/msm/msm_gpu.c
++++ b/drivers/gpu/drm/msm/msm_gpu.c
+@@ -664,7 +664,6 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+ msm_submit_retire(submit);
+
+ pm_runtime_mark_last_busy(&gpu->pdev->dev);
+- pm_runtime_put_autosuspend(&gpu->pdev->dev);
+
+ spin_lock_irqsave(&ring->submit_lock, flags);
+ list_del(&submit->node);
+@@ -678,6 +677,8 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+ msm_devfreq_idle(gpu);
+ mutex_unlock(&gpu->active_lock);
+
++ pm_runtime_put_autosuspend(&gpu->pdev->dev);
++
+ msm_gem_submit_put(submit);
+ }
+
+diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
+index bcaddbba564df..a54ed354578b5 100644
+--- a/drivers/gpu/drm/msm/msm_iommu.c
++++ b/drivers/gpu/drm/msm/msm_iommu.c
+@@ -58,7 +58,7 @@ static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
+ u64 addr = iova;
+ unsigned int i;
+
+- for_each_sg(sgt->sgl, sg, sgt->nents, i) {
++ for_each_sgtable_sg(sgt, sg, i) {
+ size_t size = sg->length;
+ phys_addr_t phys = sg_phys(sg);
+
+diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c b/drivers/gpu/drm/sun4i/sun4i_drv.c
+index 6a9ba8a77c778..4b29de65a5630 100644
+--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
++++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
+@@ -73,7 +73,6 @@ static int sun4i_drv_bind(struct device *dev)
+ goto free_drm;
+ }
+
+- dev_set_drvdata(dev, drm);
+ drm->dev_private = drv;
+ INIT_LIST_HEAD(&drv->frontend_list);
+ INIT_LIST_HEAD(&drv->engine_list);
+@@ -114,6 +113,8 @@ static int sun4i_drv_bind(struct device *dev)
+
+ drm_fbdev_generic_setup(drm, 32);
+
++ dev_set_drvdata(dev, drm);
++
+ return 0;
+
+ finish_poll:
+@@ -130,6 +131,7 @@ static void sun4i_drv_unbind(struct device *dev)
+ {
+ struct drm_device *drm = dev_get_drvdata(dev);
+
++ dev_set_drvdata(dev, NULL);
+ drm_dev_unregister(drm);
+ drm_kms_helper_poll_fini(drm);
+ drm_atomic_helper_shutdown(drm);
+diff --git a/drivers/iio/accel/bma180.c b/drivers/iio/accel/bma180.c
+index 4f73bc827eecb..9c9e985786670 100644
+--- a/drivers/iio/accel/bma180.c
++++ b/drivers/iio/accel/bma180.c
+@@ -1006,11 +1006,12 @@ static int bma180_probe(struct i2c_client *client,
+
+ data->trig->ops = &bma180_trigger_ops;
+ iio_trigger_set_drvdata(data->trig, indio_dev);
+- indio_dev->trig = iio_trigger_get(data->trig);
+
+ ret = iio_trigger_register(data->trig);
+ if (ret)
+ goto err_trigger_free;
++
++ indio_dev->trig = iio_trigger_get(data->trig);
+ }
+
+ ret = iio_triggered_buffer_setup(indio_dev, NULL,
+diff --git a/drivers/iio/accel/kxcjk-1013.c b/drivers/iio/accel/kxcjk-1013.c
+index ac74cdcd2bc8c..748b35c2f0c37 100644
+--- a/drivers/iio/accel/kxcjk-1013.c
++++ b/drivers/iio/accel/kxcjk-1013.c
+@@ -1554,12 +1554,12 @@ static int kxcjk1013_probe(struct i2c_client *client,
+
+ data->dready_trig->ops = &kxcjk1013_trigger_ops;
+ iio_trigger_set_drvdata(data->dready_trig, indio_dev);
+- indio_dev->trig = data->dready_trig;
+- iio_trigger_get(indio_dev->trig);
+ ret = iio_trigger_register(data->dready_trig);
+ if (ret)
+ goto err_poweroff;
+
++ indio_dev->trig = iio_trigger_get(data->dready_trig);
++
+ data->motion_trig->ops = &kxcjk1013_trigger_ops;
+ iio_trigger_set_drvdata(data->motion_trig, indio_dev);
+ ret = iio_trigger_register(data->motion_trig);
+diff --git a/drivers/iio/accel/mma8452.c b/drivers/iio/accel/mma8452.c
+index 9c02c681c84c3..f4f835274d751 100644
+--- a/drivers/iio/accel/mma8452.c
++++ b/drivers/iio/accel/mma8452.c
+@@ -1510,10 +1510,14 @@ static int mma8452_reset(struct i2c_client *client)
+ int i;
+ int ret;
+
+- ret = i2c_smbus_write_byte_data(client, MMA8452_CTRL_REG2,
++ /*
++ * Find on fxls8471, after config reset bit, it reset immediately,
++ * and will not give ACK, so here do not check the return value.
++ * The following code will read the reset register, and check whether
++ * this reset works.
++ */
++ i2c_smbus_write_byte_data(client, MMA8452_CTRL_REG2,
+ MMA8452_CTRL_REG2_RST);
+- if (ret < 0)
+- return ret;
+
+ for (i = 0; i < 10; i++) {
+ usleep_range(100, 200);
+@@ -1556,11 +1560,13 @@ static int mma8452_probe(struct i2c_client *client,
+ mutex_init(&data->lock);
+
+ data->chip_info = device_get_match_data(&client->dev);
+- if (!data->chip_info && id) {
+- data->chip_info = &mma_chip_info_table[id->driver_data];
+- } else {
+- dev_err(&client->dev, "unknown device model\n");
+- return -ENODEV;
++ if (!data->chip_info) {
++ if (id) {
++ data->chip_info = &mma_chip_info_table[id->driver_data];
++ } else {
++ dev_err(&client->dev, "unknown device model\n");
++ return -ENODEV;
++ }
+ }
+
+ ret = iio_read_mount_matrix(&client->dev, &data->orientation);
+diff --git a/drivers/iio/accel/mxc4005.c b/drivers/iio/accel/mxc4005.c
+index b3afbf0649152..df600d2917c0a 100644
+--- a/drivers/iio/accel/mxc4005.c
++++ b/drivers/iio/accel/mxc4005.c
+@@ -456,8 +456,6 @@ static int mxc4005_probe(struct i2c_client *client,
+
+ data->dready_trig->ops = &mxc4005_trigger_ops;
+ iio_trigger_set_drvdata(data->dready_trig, indio_dev);
+- indio_dev->trig = data->dready_trig;
+- iio_trigger_get(indio_dev->trig);
+ ret = devm_iio_trigger_register(&client->dev,
+ data->dready_trig);
+ if (ret) {
+@@ -465,6 +463,8 @@ static int mxc4005_probe(struct i2c_client *client,
+ "failed to register trigger\n");
+ return ret;
+ }
++
++ indio_dev->trig = iio_trigger_get(data->dready_trig);
+ }
+
+ return devm_iio_device_register(&client->dev, indio_dev);
+diff --git a/drivers/iio/adc/adi-axi-adc.c b/drivers/iio/adc/adi-axi-adc.c
+index a73e3c2d212fa..a9e655e69eaa2 100644
+--- a/drivers/iio/adc/adi-axi-adc.c
++++ b/drivers/iio/adc/adi-axi-adc.c
+@@ -322,16 +322,19 @@ static struct adi_axi_adc_client *adi_axi_adc_attach_client(struct device *dev)
+
+ if (!try_module_get(cl->dev->driver->owner)) {
+ mutex_unlock(®istered_clients_lock);
++ of_node_put(cln);
+ return ERR_PTR(-ENODEV);
+ }
+
+ get_device(cl->dev);
+ cl->info = info;
+ mutex_unlock(®istered_clients_lock);
++ of_node_put(cln);
+ return cl;
+ }
+
+ mutex_unlock(®istered_clients_lock);
++ of_node_put(cln);
+
+ return ERR_PTR(-EPROBE_DEFER);
+ }
+diff --git a/drivers/iio/adc/aspeed_adc.c b/drivers/iio/adc/aspeed_adc.c
+index 0793d2474cdcf..9341e0e0eb556 100644
+--- a/drivers/iio/adc/aspeed_adc.c
++++ b/drivers/iio/adc/aspeed_adc.c
+@@ -186,6 +186,7 @@ static int aspeed_adc_set_trim_data(struct iio_dev *indio_dev)
+ return -EOPNOTSUPP;
+ }
+ scu = syscon_node_to_regmap(syscon);
++ of_node_put(syscon);
+ if (IS_ERR(scu)) {
+ dev_warn(data->dev, "Failed to get syscon regmap\n");
+ return -EOPNOTSUPP;
+diff --git a/drivers/iio/adc/axp288_adc.c b/drivers/iio/adc/axp288_adc.c
+index a4b8be5b8f883..580361bd98492 100644
+--- a/drivers/iio/adc/axp288_adc.c
++++ b/drivers/iio/adc/axp288_adc.c
+@@ -196,6 +196,14 @@ static const struct dmi_system_id axp288_adc_ts_bias_override[] = {
+ },
+ .driver_data = (void *)(uintptr_t)AXP288_ADC_TS_BIAS_80UA,
+ },
++ {
++ /* Nuvision Solo 10 Draw */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "TMAX"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "TM101W610L"),
++ },
++ .driver_data = (void *)(uintptr_t)AXP288_ADC_TS_BIAS_80UA,
++ },
+ {}
+ };
+
+diff --git a/drivers/iio/adc/rzg2l_adc.c b/drivers/iio/adc/rzg2l_adc.c
+index 7585144b9715b..5b09a93fdf34f 100644
+--- a/drivers/iio/adc/rzg2l_adc.c
++++ b/drivers/iio/adc/rzg2l_adc.c
+@@ -334,11 +334,15 @@ static int rzg2l_adc_parse_properties(struct platform_device *pdev, struct rzg2l
+ i = 0;
+ device_for_each_child_node(&pdev->dev, fwnode) {
+ ret = fwnode_property_read_u32(fwnode, "reg", &channel);
+- if (ret)
++ if (ret) {
++ fwnode_handle_put(fwnode);
+ return ret;
++ }
+
+- if (channel >= RZG2L_ADC_MAX_CHANNELS)
++ if (channel >= RZG2L_ADC_MAX_CHANNELS) {
++ fwnode_handle_put(fwnode);
+ return -EINVAL;
++ }
+
+ chan_array[i].type = IIO_VOLTAGE;
+ chan_array[i].indexed = 1;
+diff --git a/drivers/iio/adc/stm32-adc-core.c b/drivers/iio/adc/stm32-adc-core.c
+index 1426562321575..3efb8c404ccc3 100644
+--- a/drivers/iio/adc/stm32-adc-core.c
++++ b/drivers/iio/adc/stm32-adc-core.c
+@@ -64,6 +64,7 @@ struct stm32_adc_priv;
+ * @max_clk_rate_hz: maximum analog clock rate (Hz, from datasheet)
+ * @has_syscfg: SYSCFG capability flags
+ * @num_irqs: number of interrupt lines
++ * @num_adcs: maximum number of ADC instances in the common registers
+ */
+ struct stm32_adc_priv_cfg {
+ const struct stm32_adc_common_regs *regs;
+@@ -71,6 +72,7 @@ struct stm32_adc_priv_cfg {
+ u32 max_clk_rate_hz;
+ unsigned int has_syscfg;
+ unsigned int num_irqs;
++ unsigned int num_adcs;
+ };
+
+ /**
+@@ -352,7 +354,7 @@ static void stm32_adc_irq_handler(struct irq_desc *desc)
+ * before invoking the interrupt handler (e.g. call ISR only for
+ * IRQ-enabled ADCs).
+ */
+- for (i = 0; i < priv->cfg->num_irqs; i++) {
++ for (i = 0; i < priv->cfg->num_adcs; i++) {
+ if ((status & priv->cfg->regs->eoc_msk[i] &&
+ stm32_adc_eoc_enabled(priv, i)) ||
+ (status & priv->cfg->regs->ovr_msk[i]))
+@@ -792,6 +794,7 @@ static const struct stm32_adc_priv_cfg stm32f4_adc_priv_cfg = {
+ .clk_sel = stm32f4_adc_clk_sel,
+ .max_clk_rate_hz = 36000000,
+ .num_irqs = 1,
++ .num_adcs = 3,
+ };
+
+ static const struct stm32_adc_priv_cfg stm32h7_adc_priv_cfg = {
+@@ -800,14 +803,16 @@ static const struct stm32_adc_priv_cfg stm32h7_adc_priv_cfg = {
+ .max_clk_rate_hz = 36000000,
+ .has_syscfg = HAS_VBOOSTER,
+ .num_irqs = 1,
++ .num_adcs = 2,
+ };
+
+ static const struct stm32_adc_priv_cfg stm32mp1_adc_priv_cfg = {
+ .regs = &stm32h7_adc_common_regs,
+ .clk_sel = stm32h7_adc_clk_sel,
+- .max_clk_rate_hz = 40000000,
++ .max_clk_rate_hz = 36000000,
+ .has_syscfg = HAS_VBOOSTER | HAS_ANASWVDD,
+ .num_irqs = 2,
++ .num_adcs = 2,
+ };
+
+ static const struct of_device_id stm32_adc_of_match[] = {
+diff --git a/drivers/iio/adc/stm32-adc.c b/drivers/iio/adc/stm32-adc.c
+index a68ecbda6480b..11ef873d64532 100644
+--- a/drivers/iio/adc/stm32-adc.c
++++ b/drivers/iio/adc/stm32-adc.c
+@@ -1365,7 +1365,7 @@ static int stm32_adc_read_raw(struct iio_dev *indio_dev,
+ else
+ ret = -EINVAL;
+
+- if (mask == IIO_CHAN_INFO_PROCESSED && adc->vrefint.vrefint_cal)
++ if (mask == IIO_CHAN_INFO_PROCESSED)
+ *val = STM32_ADC_VREFINT_VOLTAGE * adc->vrefint.vrefint_cal / *val;
+
+ iio_device_release_direct_mode(indio_dev);
+@@ -1407,7 +1407,6 @@ static irqreturn_t stm32_adc_threaded_isr(int irq, void *data)
+ struct stm32_adc *adc = iio_priv(indio_dev);
+ const struct stm32_adc_regspec *regs = adc->cfg->regs;
+ u32 status = stm32_adc_readl(adc, regs->isr_eoc.reg);
+- u32 mask = stm32_adc_readl(adc, regs->ier_eoc.reg);
+
+ /* Check ovr status right now, as ovr mask should be already disabled */
+ if (status & regs->isr_ovr.mask) {
+@@ -1422,11 +1421,6 @@ static irqreturn_t stm32_adc_threaded_isr(int irq, void *data)
+ return IRQ_HANDLED;
+ }
+
+- if (!(status & mask))
+- dev_err_ratelimited(&indio_dev->dev,
+- "Unexpected IRQ: IER=0x%08x, ISR=0x%08x\n",
+- mask, status);
+-
+ return IRQ_NONE;
+ }
+
+@@ -1436,10 +1430,6 @@ static irqreturn_t stm32_adc_isr(int irq, void *data)
+ struct stm32_adc *adc = iio_priv(indio_dev);
+ const struct stm32_adc_regspec *regs = adc->cfg->regs;
+ u32 status = stm32_adc_readl(adc, regs->isr_eoc.reg);
+- u32 mask = stm32_adc_readl(adc, regs->ier_eoc.reg);
+-
+- if (!(status & mask))
+- return IRQ_WAKE_THREAD;
+
+ if (status & regs->isr_ovr.mask) {
+ /*
+@@ -1979,10 +1969,10 @@ static int stm32_adc_populate_int_ch(struct iio_dev *indio_dev, const char *ch_n
+
+ for (i = 0; i < STM32_ADC_INT_CH_NB; i++) {
+ if (!strncmp(stm32_adc_ic[i].name, ch_name, STM32_ADC_CH_SZ)) {
+- adc->int_ch[i] = chan;
+-
+- if (stm32_adc_ic[i].idx != STM32_ADC_INT_CH_VREFINT)
+- continue;
++ if (stm32_adc_ic[i].idx != STM32_ADC_INT_CH_VREFINT) {
++ adc->int_ch[i] = chan;
++ break;
++ }
+
+ /* Get calibration data for vrefint channel */
+ ret = nvmem_cell_read_u16(&indio_dev->dev, "vrefint", &vrefint);
+@@ -1990,10 +1980,15 @@ static int stm32_adc_populate_int_ch(struct iio_dev *indio_dev, const char *ch_n
+ return dev_err_probe(indio_dev->dev.parent, ret,
+ "nvmem access error\n");
+ }
+- if (ret == -ENOENT)
+- dev_dbg(&indio_dev->dev, "vrefint calibration not found\n");
+- else
+- adc->vrefint.vrefint_cal = vrefint;
++ if (ret == -ENOENT) {
++ dev_dbg(&indio_dev->dev, "vrefint calibration not found. Skip vrefint channel\n");
++ return ret;
++ } else if (!vrefint) {
++ dev_dbg(&indio_dev->dev, "Null vrefint calibration value. Skip vrefint channel\n");
++ return -ENOENT;
++ }
++ adc->int_ch[i] = chan;
++ adc->vrefint.vrefint_cal = vrefint;
+ }
+ }
+
+@@ -2030,7 +2025,9 @@ static int stm32_adc_generic_chan_init(struct iio_dev *indio_dev,
+ }
+ strncpy(adc->chan_name[val], name, STM32_ADC_CH_SZ);
+ ret = stm32_adc_populate_int_ch(indio_dev, name, val);
+- if (ret)
++ if (ret == -ENOENT)
++ continue;
++ else if (ret)
+ goto err;
+ } else if (ret != -EINVAL) {
+ dev_err(&indio_dev->dev, "Invalid label %d\n", ret);
+diff --git a/drivers/iio/adc/ti-ads131e08.c b/drivers/iio/adc/ti-ads131e08.c
+index 0c2025a225750..80a09817c1194 100644
+--- a/drivers/iio/adc/ti-ads131e08.c
++++ b/drivers/iio/adc/ti-ads131e08.c
+@@ -739,7 +739,7 @@ static int ads131e08_alloc_channels(struct iio_dev *indio_dev)
+ device_for_each_child_node(dev, node) {
+ ret = fwnode_property_read_u32(node, "reg", &channel);
+ if (ret)
+- return ret;
++ goto err_child_out;
+
+ ret = fwnode_property_read_u32(node, "ti,gain", &tmp);
+ if (ret) {
+@@ -747,7 +747,7 @@ static int ads131e08_alloc_channels(struct iio_dev *indio_dev)
+ } else {
+ ret = ads131e08_pga_gain_to_field_value(st, tmp);
+ if (ret < 0)
+- return ret;
++ goto err_child_out;
+
+ channel_config[i].pga_gain = tmp;
+ }
+@@ -758,7 +758,7 @@ static int ads131e08_alloc_channels(struct iio_dev *indio_dev)
+ } else {
+ ret = ads131e08_validate_channel_mux(st, tmp);
+ if (ret)
+- return ret;
++ goto err_child_out;
+
+ channel_config[i].mux = tmp;
+ }
+@@ -784,6 +784,10 @@ static int ads131e08_alloc_channels(struct iio_dev *indio_dev)
+ st->channel_config = channel_config;
+
+ return 0;
++
++err_child_out:
++ fwnode_handle_put(node);
++ return ret;
+ }
+
+ static void ads131e08_regulator_disable(void *data)
+diff --git a/drivers/iio/adc/xilinx-ams.c b/drivers/iio/adc/xilinx-ams.c
+index a55396c1f8b28..a7687706012d2 100644
+--- a/drivers/iio/adc/xilinx-ams.c
++++ b/drivers/iio/adc/xilinx-ams.c
+@@ -1409,7 +1409,7 @@ static int ams_probe(struct platform_device *pdev)
+
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0)
+- return ret;
++ return irq;
+
+ ret = devm_request_irq(&pdev->dev, irq, &ams_irq, 0, "ams-irq",
+ indio_dev);
+diff --git a/drivers/iio/afe/iio-rescale.c b/drivers/iio/afe/iio-rescale.c
+index 7e511293d6d12..dc426e1484f0d 100644
+--- a/drivers/iio/afe/iio-rescale.c
++++ b/drivers/iio/afe/iio-rescale.c
+@@ -278,7 +278,7 @@ static int rescale_configure_channel(struct device *dev,
+ chan->ext_info = rescale->ext_info;
+ chan->type = rescale->cfg->type;
+
+- if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) ||
++ if (iio_channel_has_info(schan, IIO_CHAN_INFO_RAW) &&
+ iio_channel_has_info(schan, IIO_CHAN_INFO_SCALE)) {
+ dev_info(dev, "using raw+scale source channel\n");
+ } else if (iio_channel_has_info(schan, IIO_CHAN_INFO_PROCESSED)) {
+diff --git a/drivers/iio/chemical/ccs811.c b/drivers/iio/chemical/ccs811.c
+index 847194fa1e464..80ef1aa9aae3b 100644
+--- a/drivers/iio/chemical/ccs811.c
++++ b/drivers/iio/chemical/ccs811.c
+@@ -499,11 +499,11 @@ static int ccs811_probe(struct i2c_client *client,
+
+ data->drdy_trig->ops = &ccs811_trigger_ops;
+ iio_trigger_set_drvdata(data->drdy_trig, indio_dev);
+- indio_dev->trig = data->drdy_trig;
+- iio_trigger_get(indio_dev->trig);
+ ret = iio_trigger_register(data->drdy_trig);
+ if (ret)
+ goto err_poweroff;
++
++ indio_dev->trig = iio_trigger_get(data->drdy_trig);
+ }
+
+ ret = iio_triggered_buffer_setup(indio_dev, NULL,
+diff --git a/drivers/iio/gyro/mpu3050-core.c b/drivers/iio/gyro/mpu3050-core.c
+index ea387efab62d2..f4c2f4cb48349 100644
+--- a/drivers/iio/gyro/mpu3050-core.c
++++ b/drivers/iio/gyro/mpu3050-core.c
+@@ -874,6 +874,7 @@ static int mpu3050_power_up(struct mpu3050 *mpu3050)
+ ret = regmap_update_bits(mpu3050->map, MPU3050_PWR_MGM,
+ MPU3050_PWR_MGM_SLEEP, 0);
+ if (ret) {
++ regulator_bulk_disable(ARRAY_SIZE(mpu3050->regs), mpu3050->regs);
+ dev_err(mpu3050->dev, "error setting power mode\n");
+ return ret;
+ }
+diff --git a/drivers/iio/humidity/hts221_buffer.c b/drivers/iio/humidity/hts221_buffer.c
+index f29692b9d2db0..66b32413cf5e2 100644
+--- a/drivers/iio/humidity/hts221_buffer.c
++++ b/drivers/iio/humidity/hts221_buffer.c
+@@ -135,9 +135,12 @@ int hts221_allocate_trigger(struct iio_dev *iio_dev)
+
+ iio_trigger_set_drvdata(hw->trig, iio_dev);
+ hw->trig->ops = &hts221_trigger_ops;
++
++ err = devm_iio_trigger_register(hw->dev, hw->trig);
++
+ iio_dev->trig = iio_trigger_get(hw->trig);
+
+- return devm_iio_trigger_register(hw->dev, hw->trig);
++ return err;
+ }
+
+ static int hts221_buffer_preenable(struct iio_dev *iio_dev)
+diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600.h b/drivers/iio/imu/inv_icm42600/inv_icm42600.h
+index c0f5059b13b31..995a9dc06521d 100644
+--- a/drivers/iio/imu/inv_icm42600/inv_icm42600.h
++++ b/drivers/iio/imu/inv_icm42600/inv_icm42600.h
+@@ -17,6 +17,7 @@
+ #include "inv_icm42600_buffer.h"
+
+ enum inv_icm42600_chip {
++ INV_CHIP_INVALID,
+ INV_CHIP_ICM42600,
+ INV_CHIP_ICM42602,
+ INV_CHIP_ICM42605,
+diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+index 86858da9cc38f..ca85fccc98393 100644
+--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
++++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+@@ -565,7 +565,7 @@ int inv_icm42600_core_probe(struct regmap *regmap, int chip, int irq,
+ bool open_drain;
+ int ret;
+
+- if (chip < 0 || chip >= INV_CHIP_NB) {
++ if (chip <= INV_CHIP_INVALID || chip >= INV_CHIP_NB) {
+ dev_err(dev, "invalid chip = %d\n", chip);
+ return -ENODEV;
+ }
+diff --git a/drivers/iio/magnetometer/yamaha-yas530.c b/drivers/iio/magnetometer/yamaha-yas530.c
+index 9ff7b0e56cf67..b2bc637150bfa 100644
+--- a/drivers/iio/magnetometer/yamaha-yas530.c
++++ b/drivers/iio/magnetometer/yamaha-yas530.c
+@@ -639,7 +639,7 @@ static int yas532_get_calibration_data(struct yas5xx *yas5xx)
+ dev_dbg(yas5xx->dev, "calibration data: %*ph\n", 14, data);
+
+ /* Sanity check, is this all zeroes? */
+- if (memchr_inv(data, 0x00, 13)) {
++ if (memchr_inv(data, 0x00, 13) == NULL) {
+ if (!(data[13] & BIT(7)))
+ dev_warn(yas5xx->dev, "calibration is blank!\n");
+ }
+diff --git a/drivers/iio/proximity/sx9324.c b/drivers/iio/proximity/sx9324.c
+index 70c37f664f6da..63fbcaa4cac81 100644
+--- a/drivers/iio/proximity/sx9324.c
++++ b/drivers/iio/proximity/sx9324.c
+@@ -885,6 +885,9 @@ sx9324_get_default_reg(struct device *dev, int idx,
+ break;
+ ret = device_property_read_u32_array(dev, prop, pin_defs,
+ ARRAY_SIZE(pin_defs));
++ if (ret)
++ break;
++
+ for (pin = 0; pin < SX9324_NUM_PINS; pin++)
+ raw |= (pin_defs[pin] << (2 * pin)) &
+ SX9324_REG_AFE_PH0_PIN_MASK(pin);
+diff --git a/drivers/iio/test/Kconfig b/drivers/iio/test/Kconfig
+index 56ca0ad7e77a2..4c66c3f18c345 100644
+--- a/drivers/iio/test/Kconfig
++++ b/drivers/iio/test/Kconfig
+@@ -6,7 +6,7 @@
+ # Keep in alphabetical order
+ config IIO_RESCALE_KUNIT_TEST
+ bool "Test IIO rescale conversion functions"
+- depends on KUNIT=y && !IIO_RESCALE
++ depends on KUNIT=y && IIO_RESCALE=y
+ default KUNIT_ALL_TESTS
+ help
+ If you want to run tests on the iio-rescale code say Y here.
+diff --git a/drivers/iio/test/Makefile b/drivers/iio/test/Makefile
+index f15ae0a6394f7..880360f8d02c2 100644
+--- a/drivers/iio/test/Makefile
++++ b/drivers/iio/test/Makefile
+@@ -4,6 +4,6 @@
+ #
+
+ # Keep in alphabetical order
+-obj-$(CONFIG_IIO_RESCALE_KUNIT_TEST) += iio-test-rescale.o ../afe/iio-rescale.o
++obj-$(CONFIG_IIO_RESCALE_KUNIT_TEST) += iio-test-rescale.o
+ obj-$(CONFIG_IIO_TEST_FORMAT) += iio-test-format.o
+ CFLAGS_iio-test-format.o += $(DISABLE_STRUCTLEAK_PLUGIN)
+diff --git a/drivers/iio/trigger/iio-trig-sysfs.c b/drivers/iio/trigger/iio-trig-sysfs.c
+index 2a4b75897910f..3d911c24b2650 100644
+--- a/drivers/iio/trigger/iio-trig-sysfs.c
++++ b/drivers/iio/trigger/iio-trig-sysfs.c
+@@ -191,6 +191,7 @@ static int iio_sysfs_trigger_remove(int id)
+ }
+
+ iio_trigger_unregister(t->trig);
++ irq_work_sync(&t->work);
+ iio_trigger_free(t->trig);
+
+ list_del(&t->l);
+diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
+index 8fdb84b3642bd..1d42084d02767 100644
+--- a/drivers/iommu/ipmmu-vmsa.c
++++ b/drivers/iommu/ipmmu-vmsa.c
+@@ -987,7 +987,7 @@ static const struct of_device_id ipmmu_of_ids[] = {
+ .compatible = "renesas,ipmmu-r8a779a0",
+ .data = &ipmmu_features_rcar_gen4,
+ }, {
+- .compatible = "renesas,rcar-gen4-ipmmu",
++ .compatible = "renesas,rcar-gen4-ipmmu-vmsa",
+ .data = &ipmmu_features_rcar_gen4,
+ }, {
+ /* Terminator */
+diff --git a/drivers/md/dm-era-target.c b/drivers/md/dm-era-target.c
+index 1f6bf152b3c74..e92c1afc3677f 100644
+--- a/drivers/md/dm-era-target.c
++++ b/drivers/md/dm-era-target.c
+@@ -1400,7 +1400,7 @@ static void start_worker(struct era *era)
+ static void stop_worker(struct era *era)
+ {
+ atomic_set(&era->suspended, 1);
+- flush_workqueue(era->wq);
++ drain_workqueue(era->wq);
+ }
+
+ /*----------------------------------------------------------------
+@@ -1570,6 +1570,12 @@ static void era_postsuspend(struct dm_target *ti)
+ }
+
+ stop_worker(era);
++
++ r = metadata_commit(era->md);
++ if (r) {
++ DMERR("%s: metadata_commit failed", __func__);
++ /* FIXME: fail mode */
++ }
+ }
+
+ static int era_preresume(struct dm_target *ti)
+diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
+index 2dda05aada231..0c6620e7b7bf6 100644
+--- a/drivers/md/dm-log.c
++++ b/drivers/md/dm-log.c
+@@ -615,7 +615,7 @@ static int disk_resume(struct dm_dirty_log *log)
+ log_clear_bit(lc, lc->clean_bits, i);
+
+ /* clear any old bits -- device has shrunk */
+- for (i = lc->region_count; i % (sizeof(*lc->clean_bits) << BYTE_SHIFT); i++)
++ for (i = lc->region_count; i % BITS_PER_LONG; i++)
+ log_clear_bit(lc, lc->clean_bits, i);
+
+ /* copy clean across to sync */
+diff --git a/drivers/md/dm.c b/drivers/md/dm.c
+index 83dd17abf1af9..f01d33bc36136 100644
+--- a/drivers/md/dm.c
++++ b/drivers/md/dm.c
+@@ -899,9 +899,11 @@ static void dm_io_complete(struct dm_io *io)
+ if (io_error == BLK_STS_AGAIN) {
+ /* io_uring doesn't handle BLK_STS_AGAIN (yet) */
+ queue_io(md, bio);
++ return;
+ }
+ }
+- return;
++ if (io_error == BLK_STS_DM_REQUEUE)
++ return;
+ }
+
+ if (bio_is_flush_with_data(bio)) {
+diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
+index 86a3d34f418e8..4c5154e0bf00c 100644
+--- a/drivers/memory/mtk-smi.c
++++ b/drivers/memory/mtk-smi.c
+@@ -404,13 +404,16 @@ static int mtk_smi_device_link_common(struct device *dev, struct device **com_de
+ of_node_put(smi_com_node);
+ if (smi_com_pdev) {
+ /* smi common is the supplier, Make sure it is ready before */
+- if (!platform_get_drvdata(smi_com_pdev))
++ if (!platform_get_drvdata(smi_com_pdev)) {
++ put_device(&smi_com_pdev->dev);
+ return -EPROBE_DEFER;
++ }
+ smi_com_dev = &smi_com_pdev->dev;
+ link = device_link_add(dev, smi_com_dev,
+ DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+ if (!link) {
+ dev_err(dev, "Unable to link smi-common dev\n");
++ put_device(&smi_com_pdev->dev);
+ return -ENODEV;
+ }
+ *com_dev = smi_com_dev;
+diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
+index 4733e7898ffe5..c491cd549644f 100644
+--- a/drivers/memory/samsung/exynos5422-dmc.c
++++ b/drivers/memory/samsung/exynos5422-dmc.c
+@@ -1187,33 +1187,39 @@ static int of_get_dram_timings(struct exynos5_dmc *dmc)
+
+ dmc->timing_row = devm_kmalloc_array(dmc->dev, TIMING_COUNT,
+ sizeof(u32), GFP_KERNEL);
+- if (!dmc->timing_row)
+- return -ENOMEM;
++ if (!dmc->timing_row) {
++ ret = -ENOMEM;
++ goto put_node;
++ }
+
+ dmc->timing_data = devm_kmalloc_array(dmc->dev, TIMING_COUNT,
+ sizeof(u32), GFP_KERNEL);
+- if (!dmc->timing_data)
+- return -ENOMEM;
++ if (!dmc->timing_data) {
++ ret = -ENOMEM;
++ goto put_node;
++ }
+
+ dmc->timing_power = devm_kmalloc_array(dmc->dev, TIMING_COUNT,
+ sizeof(u32), GFP_KERNEL);
+- if (!dmc->timing_power)
+- return -ENOMEM;
++ if (!dmc->timing_power) {
++ ret = -ENOMEM;
++ goto put_node;
++ }
+
+ dmc->timings = of_lpddr3_get_ddr_timings(np_ddr, dmc->dev,
+ DDR_TYPE_LPDDR3,
+ &dmc->timings_arr_size);
+ if (!dmc->timings) {
+- of_node_put(np_ddr);
+ dev_warn(dmc->dev, "could not get timings from DT\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_node;
+ }
+
+ dmc->min_tck = of_lpddr3_get_min_tck(np_ddr, dmc->dev);
+ if (!dmc->min_tck) {
+- of_node_put(np_ddr);
+ dev_warn(dmc->dev, "could not get tck from DT\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_node;
+ }
+
+ /* Sorted array of OPPs with frequency ascending */
+@@ -1227,13 +1233,14 @@ static int of_get_dram_timings(struct exynos5_dmc *dmc)
+ clk_period_ps);
+ }
+
+- of_node_put(np_ddr);
+
+ /* Take the highest frequency's timings as 'bypass' */
+ dmc->bypass_timing_row = dmc->timing_row[idx - 1];
+ dmc->bypass_timing_data = dmc->timing_data[idx - 1];
+ dmc->bypass_timing_power = dmc->timing_power[idx - 1];
+
++put_node:
++ of_node_put(np_ddr);
+ return ret;
+ }
+
+diff --git a/drivers/mmc/host/mtk-sd.c b/drivers/mmc/host/mtk-sd.c
+index e61b0b98065a2..b74a0e54e652f 100644
+--- a/drivers/mmc/host/mtk-sd.c
++++ b/drivers/mmc/host/mtk-sd.c
+@@ -1356,7 +1356,7 @@ static void msdc_data_xfer_next(struct msdc_host *host, struct mmc_request *mrq)
+ msdc_request_done(host, mrq);
+ }
+
+-static bool msdc_data_xfer_done(struct msdc_host *host, u32 events,
++static void msdc_data_xfer_done(struct msdc_host *host, u32 events,
+ struct mmc_request *mrq, struct mmc_data *data)
+ {
+ struct mmc_command *stop;
+@@ -1376,7 +1376,7 @@ static bool msdc_data_xfer_done(struct msdc_host *host, u32 events,
+ spin_unlock_irqrestore(&host->lock, flags);
+
+ if (done)
+- return true;
++ return;
+ stop = data->stop;
+
+ if (check_data || (stop && stop->error)) {
+@@ -1385,12 +1385,15 @@ static bool msdc_data_xfer_done(struct msdc_host *host, u32 events,
+ sdr_set_field(host->base + MSDC_DMA_CTRL, MSDC_DMA_CTRL_STOP,
+ 1);
+
++ ret = readl_poll_timeout_atomic(host->base + MSDC_DMA_CTRL, val,
++ !(val & MSDC_DMA_CTRL_STOP), 1, 20000);
++ if (ret)
++ dev_dbg(host->dev, "DMA stop timed out\n");
++
+ ret = readl_poll_timeout_atomic(host->base + MSDC_DMA_CFG, val,
+ !(val & MSDC_DMA_CFG_STS), 1, 20000);
+- if (ret) {
+- dev_dbg(host->dev, "DMA stop timed out\n");
+- return false;
+- }
++ if (ret)
++ dev_dbg(host->dev, "DMA inactive timed out\n");
+
+ sdr_clr_bits(host->base + MSDC_INTEN, data_ints_mask);
+ dev_dbg(host->dev, "DMA stop\n");
+@@ -1415,9 +1418,7 @@ static bool msdc_data_xfer_done(struct msdc_host *host, u32 events,
+ }
+
+ msdc_data_xfer_next(host, mrq);
+- done = true;
+ }
+- return done;
+ }
+
+ static void msdc_set_buswidth(struct msdc_host *host, u32 width)
+@@ -2416,6 +2417,9 @@ static void msdc_cqe_disable(struct mmc_host *mmc, bool recovery)
+ if (recovery) {
+ sdr_set_field(host->base + MSDC_DMA_CTRL,
+ MSDC_DMA_CTRL_STOP, 1);
++ if (WARN_ON(readl_poll_timeout(host->base + MSDC_DMA_CTRL, val,
++ !(val & MSDC_DMA_CTRL_STOP), 1, 3000)))
++ return;
+ if (WARN_ON(readl_poll_timeout(host->base + MSDC_DMA_CFG, val,
+ !(val & MSDC_DMA_CFG_STS), 1, 3000)))
+ return;
+diff --git a/drivers/mmc/host/sdhci-pci-o2micro.c b/drivers/mmc/host/sdhci-pci-o2micro.c
+index 92c20cb8074a6..0d4d343dbb77d 100644
+--- a/drivers/mmc/host/sdhci-pci-o2micro.c
++++ b/drivers/mmc/host/sdhci-pci-o2micro.c
+@@ -152,6 +152,8 @@ static int sdhci_o2_get_cd(struct mmc_host *mmc)
+
+ if (!(sdhci_readw(host, O2_PLL_DLL_WDT_CONTROL1) & O2_PLL_LOCK_STATUS))
+ sdhci_o2_enable_internal_clock(host);
++ else
++ sdhci_o2_wait_card_detect_stable(host);
+
+ return !!(sdhci_readl(host, SDHCI_PRESENT_STATE) & SDHCI_CARD_PRESENT);
+ }
+diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+index 44b14c9dc9a73..375529b7d12e3 100644
+--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
++++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+@@ -695,7 +695,7 @@ static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
+ hw->timing0 = BF_GPMI_TIMING0_ADDRESS_SETUP(addr_setup_cycles) |
+ BF_GPMI_TIMING0_DATA_HOLD(data_hold_cycles) |
+ BF_GPMI_TIMING0_DATA_SETUP(data_setup_cycles);
+- hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(busy_timeout_cycles * 4096);
++ hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(DIV_ROUND_UP(busy_timeout_cycles, 4096));
+
+ /*
+ * Derive NFC ideal delay from {3}:
+diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
+index 26a6573adf0f5..93c7a551264eb 100644
+--- a/drivers/net/bonding/bond_main.c
++++ b/drivers/net/bonding/bond_main.c
+@@ -3684,9 +3684,11 @@ re_arm:
+ if (!rtnl_trylock())
+ return;
+
+- if (should_notify_peers)
++ if (should_notify_peers) {
++ bond->send_peer_notif--;
+ call_netdevice_notifiers(NETDEV_NOTIFY_PEERS,
+ bond->dev);
++ }
+ if (should_notify_rtnl) {
+ bond_slave_state_notify(bond);
+ bond_slave_link_notify(bond);
+diff --git a/drivers/net/dsa/qca8k.h b/drivers/net/dsa/qca8k.h
+index f375627174c8c..e553e3e6fa0fb 100644
+--- a/drivers/net/dsa/qca8k.h
++++ b/drivers/net/dsa/qca8k.h
+@@ -15,7 +15,7 @@
+
+ #define QCA8K_ETHERNET_MDIO_PRIORITY 7
+ #define QCA8K_ETHERNET_PHY_PRIORITY 6
+-#define QCA8K_ETHERNET_TIMEOUT 100
++#define QCA8K_ETHERNET_TIMEOUT 5
+
+ #define QCA8K_NUM_PORTS 7
+ #define QCA8K_NUM_CPU_PORTS 2
+diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
+index 24cda7e1f916c..8aee4ae4cc8c9 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
++++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
+@@ -2191,6 +2191,42 @@ ice_setup_autoneg(struct ice_port_info *p, struct ethtool_link_ksettings *ks,
+ return err;
+ }
+
++/**
++ * ice_set_phy_type_from_speed - set phy_types based on speeds
++ * and advertised modes
++ * @ks: ethtool link ksettings struct
++ * @phy_type_low: pointer to the lower part of phy_type
++ * @phy_type_high: pointer to the higher part of phy_type
++ * @adv_link_speed: targeted link speeds bitmap
++ */
++static void
++ice_set_phy_type_from_speed(const struct ethtool_link_ksettings *ks,
++ u64 *phy_type_low, u64 *phy_type_high,
++ u16 adv_link_speed)
++{
++ /* Handle 1000M speed in a special way because ice_update_phy_type
++ * enables all link modes, but having mixed copper and optical
++ * standards is not supported.
++ */
++ adv_link_speed &= ~ICE_AQ_LINK_SPEED_1000MB;
++
++ if (ethtool_link_ksettings_test_link_mode(ks, advertising,
++ 1000baseT_Full))
++ *phy_type_low |= ICE_PHY_TYPE_LOW_1000BASE_T |
++ ICE_PHY_TYPE_LOW_1G_SGMII;
++
++ if (ethtool_link_ksettings_test_link_mode(ks, advertising,
++ 1000baseKX_Full))
++ *phy_type_low |= ICE_PHY_TYPE_LOW_1000BASE_KX;
++
++ if (ethtool_link_ksettings_test_link_mode(ks, advertising,
++ 1000baseX_Full))
++ *phy_type_low |= ICE_PHY_TYPE_LOW_1000BASE_SX |
++ ICE_PHY_TYPE_LOW_1000BASE_LX;
++
++ ice_update_phy_type(phy_type_low, phy_type_high, adv_link_speed);
++}
++
+ /**
+ * ice_set_link_ksettings - Set Speed and Duplex
+ * @netdev: network interface device structure
+@@ -2322,7 +2358,8 @@ ice_set_link_ksettings(struct net_device *netdev,
+ adv_link_speed = curr_link_speed;
+
+ /* Convert the advertise link speeds to their corresponded PHY_TYPE */
+- ice_update_phy_type(&phy_type_low, &phy_type_high, adv_link_speed);
++ ice_set_phy_type_from_speed(ks, &phy_type_low, &phy_type_high,
++ adv_link_speed);
+
+ if (!autoneg_changed && adv_link_speed == curr_link_speed) {
+ netdev_info(netdev, "Nothing changed, exiting without setting anything.\n");
+@@ -3440,6 +3477,16 @@ static int ice_set_channels(struct net_device *dev, struct ethtool_channels *ch)
+ new_rx = ch->combined_count + ch->rx_count;
+ new_tx = ch->combined_count + ch->tx_count;
+
++ if (new_rx < vsi->tc_cfg.numtc) {
++ netdev_err(dev, "Cannot set less Rx channels, than Traffic Classes you have (%u)\n",
++ vsi->tc_cfg.numtc);
++ return -EINVAL;
++ }
++ if (new_tx < vsi->tc_cfg.numtc) {
++ netdev_err(dev, "Cannot set less Tx channels, than Traffic Classes you have (%u)\n",
++ vsi->tc_cfg.numtc);
++ return -EINVAL;
++ }
+ if (new_rx > ice_get_max_rxq(pf)) {
+ netdev_err(dev, "Maximum allowed Rx channels is %d\n",
+ ice_get_max_rxq(pf));
+diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
+index 454e01ae09b97..f7f9c973ec54d 100644
+--- a/drivers/net/ethernet/intel/ice/ice_lib.c
++++ b/drivers/net/ethernet/intel/ice/ice_lib.c
+@@ -909,7 +909,7 @@ static void ice_set_dflt_vsi_ctx(struct ice_hw *hw, struct ice_vsi_ctx *ctxt)
+ * @vsi: the VSI being configured
+ * @ctxt: VSI context structure
+ */
+-static void ice_vsi_setup_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt)
++static int ice_vsi_setup_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt)
+ {
+ u16 offset = 0, qmap = 0, tx_count = 0, pow = 0;
+ u16 num_txq_per_tc, num_rxq_per_tc;
+@@ -982,7 +982,18 @@ static void ice_vsi_setup_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt)
+ else
+ vsi->num_rxq = num_rxq_per_tc;
+
++ if (vsi->num_rxq > vsi->alloc_rxq) {
++ dev_err(ice_pf_to_dev(vsi->back), "Trying to use more Rx queues (%u), than were allocated (%u)!\n",
++ vsi->num_rxq, vsi->alloc_rxq);
++ return -EINVAL;
++ }
++
+ vsi->num_txq = tx_count;
++ if (vsi->num_txq > vsi->alloc_txq) {
++ dev_err(ice_pf_to_dev(vsi->back), "Trying to use more Tx queues (%u), than were allocated (%u)!\n",
++ vsi->num_txq, vsi->alloc_txq);
++ return -EINVAL;
++ }
+
+ if (vsi->type == ICE_VSI_VF && vsi->num_txq != vsi->num_rxq) {
+ dev_dbg(ice_pf_to_dev(vsi->back), "VF VSI should have same number of Tx and Rx queues. Hence making them equal\n");
+@@ -1000,6 +1011,8 @@ static void ice_vsi_setup_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt)
+ */
+ ctxt->info.q_mapping[0] = cpu_to_le16(vsi->rxq_map[0]);
+ ctxt->info.q_mapping[1] = cpu_to_le16(vsi->num_rxq);
++
++ return 0;
+ }
+
+ /**
+@@ -1187,7 +1200,10 @@ static int ice_vsi_init(struct ice_vsi *vsi, bool init_vsi)
+ if (vsi->type == ICE_VSI_CHNL) {
+ ice_chnl_vsi_setup_q_map(vsi, ctxt);
+ } else {
+- ice_vsi_setup_q_map(vsi, ctxt);
++ ret = ice_vsi_setup_q_map(vsi, ctxt);
++ if (ret)
++ goto out;
++
+ if (!init_vsi) /* means VSI being updated */
+ /* must to indicate which section of VSI context are
+ * being modified
+@@ -3464,7 +3480,7 @@ void ice_vsi_cfg_netdev_tc(struct ice_vsi *vsi, u8 ena_tc)
+ *
+ * Prepares VSI tc_config to have queue configurations based on MQPRIO options.
+ */
+-static void
++static int
+ ice_vsi_setup_q_map_mqprio(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt,
+ u8 ena_tc)
+ {
+@@ -3513,7 +3529,18 @@ ice_vsi_setup_q_map_mqprio(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt,
+
+ /* Set actual Tx/Rx queue pairs */
+ vsi->num_txq = offset + qcount_tx;
++ if (vsi->num_txq > vsi->alloc_txq) {
++ dev_err(ice_pf_to_dev(vsi->back), "Trying to use more Tx queues (%u), than were allocated (%u)!\n",
++ vsi->num_txq, vsi->alloc_txq);
++ return -EINVAL;
++ }
++
+ vsi->num_rxq = offset + qcount_rx;
++ if (vsi->num_rxq > vsi->alloc_rxq) {
++ dev_err(ice_pf_to_dev(vsi->back), "Trying to use more Rx queues (%u), than were allocated (%u)!\n",
++ vsi->num_rxq, vsi->alloc_rxq);
++ return -EINVAL;
++ }
+
+ /* Setup queue TC[0].qmap for given VSI context */
+ ctxt->info.tc_mapping[0] = cpu_to_le16(qmap);
+@@ -3531,6 +3558,8 @@ ice_vsi_setup_q_map_mqprio(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt,
+ dev_dbg(ice_pf_to_dev(vsi->back), "vsi->num_rxq = %d\n", vsi->num_rxq);
+ dev_dbg(ice_pf_to_dev(vsi->back), "all_numtc %u, all_enatc: 0x%04x, tc_cfg.numtc %u\n",
+ vsi->all_numtc, vsi->all_enatc, vsi->tc_cfg.numtc);
++
++ return 0;
+ }
+
+ /**
+@@ -3580,9 +3609,12 @@ int ice_vsi_cfg_tc(struct ice_vsi *vsi, u8 ena_tc)
+
+ if (vsi->type == ICE_VSI_PF &&
+ test_bit(ICE_FLAG_TC_MQPRIO, pf->flags))
+- ice_vsi_setup_q_map_mqprio(vsi, ctx, ena_tc);
++ ret = ice_vsi_setup_q_map_mqprio(vsi, ctx, ena_tc);
+ else
+- ice_vsi_setup_q_map(vsi, ctx);
++ ret = ice_vsi_setup_q_map(vsi, ctx);
++
++ if (ret)
++ goto out;
+
+ /* must to indicate which section of VSI context are being modified */
+ ctx->info.valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_RXQ_MAP_VALID);
+diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.c b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
+index 3acd9f921c441..2ce2694fcbd78 100644
+--- a/drivers/net/ethernet/intel/ice/ice_tc_lib.c
++++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
+@@ -524,6 +524,7 @@ ice_eswitch_add_tc_fltr(struct ice_vsi *vsi, struct ice_tc_flower_fltr *fltr)
+ */
+ fltr->rid = rule_added.rid;
+ fltr->rule_id = rule_added.rule_id;
++ fltr->dest_id = rule_added.vsi_handle;
+
+ exit:
+ kfree(list);
+@@ -994,7 +995,9 @@ ice_parse_cls_flower(struct net_device *filter_dev, struct ice_vsi *vsi,
+ n_proto_key = ntohs(match.key->n_proto);
+ n_proto_mask = ntohs(match.mask->n_proto);
+
+- if (n_proto_key == ETH_P_ALL || n_proto_key == 0) {
++ if (n_proto_key == ETH_P_ALL || n_proto_key == 0 ||
++ fltr->tunnel_type == TNL_GTPU ||
++ fltr->tunnel_type == TNL_GTPC) {
+ n_proto_key = 0;
+ n_proto_mask = 0;
+ } else {
+diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
+index 68be2976f539f..c5f04c40284bf 100644
+--- a/drivers/net/ethernet/intel/igb/igb_main.c
++++ b/drivers/net/ethernet/intel/igb/igb_main.c
+@@ -4819,8 +4819,11 @@ static void igb_clean_tx_ring(struct igb_ring *tx_ring)
+ while (i != tx_ring->next_to_use) {
+ union e1000_adv_tx_desc *eop_desc, *tx_desc;
+
+- /* Free all the Tx ring sk_buffs */
+- dev_kfree_skb_any(tx_buffer->skb);
++ /* Free all the Tx ring sk_buffs or xdp frames */
++ if (tx_buffer->type == IGB_TYPE_SKB)
++ dev_kfree_skb_any(tx_buffer->skb);
++ else
++ xdp_return_frame(tx_buffer->xdpf);
+
+ /* unmap skb header data */
+ dma_unmap_single(tx_ring->dev,
+@@ -9898,11 +9901,10 @@ static void igb_init_dmac(struct igb_adapter *adapter, u32 pba)
+ struct e1000_hw *hw = &adapter->hw;
+ u32 dmac_thr;
+ u16 hwm;
++ u32 reg;
+
+ if (hw->mac.type > e1000_82580) {
+ if (adapter->flags & IGB_FLAG_DMAC) {
+- u32 reg;
+-
+ /* force threshold to 0. */
+ wr32(E1000_DMCTXTH, 0);
+
+@@ -9935,7 +9937,6 @@ static void igb_init_dmac(struct igb_adapter *adapter, u32 pba)
+ /* Disable BMC-to-OS Watchdog Enable */
+ if (hw->mac.type != e1000_i354)
+ reg &= ~E1000_DMACR_DC_BMC2OSW_EN;
+-
+ wr32(E1000_DMACR, reg);
+
+ /* no lower threshold to disable
+@@ -9952,12 +9953,12 @@ static void igb_init_dmac(struct igb_adapter *adapter, u32 pba)
+ */
+ wr32(E1000_DMCTXTH, (IGB_MIN_TXPBSIZE -
+ (IGB_TX_BUF_4096 + adapter->max_frame_size)) >> 6);
++ }
+
+- /* make low power state decision controlled
+- * by DMA coal
+- */
++ if (hw->mac.type >= e1000_i210 ||
++ (adapter->flags & IGB_FLAG_DMAC)) {
+ reg = rd32(E1000_PCIEMISC);
+- reg &= ~E1000_PCIEMISC_LX_DECISION;
++ reg |= E1000_PCIEMISC_LX_DECISION;
+ wr32(E1000_PCIEMISC, reg);
+ } /* endif adapter->dmac is not disabled */
+ } else if (hw->mac.type == e1000_82580) {
+diff --git a/drivers/net/phy/aquantia_main.c b/drivers/net/phy/aquantia_main.c
+index a8db1a19011bd..c7047f5d7a9b0 100644
+--- a/drivers/net/phy/aquantia_main.c
++++ b/drivers/net/phy/aquantia_main.c
+@@ -34,6 +34,8 @@
+ #define MDIO_AN_VEND_PROV 0xc400
+ #define MDIO_AN_VEND_PROV_1000BASET_FULL BIT(15)
+ #define MDIO_AN_VEND_PROV_1000BASET_HALF BIT(14)
++#define MDIO_AN_VEND_PROV_5000BASET_FULL BIT(11)
++#define MDIO_AN_VEND_PROV_2500BASET_FULL BIT(10)
+ #define MDIO_AN_VEND_PROV_DOWNSHIFT_EN BIT(4)
+ #define MDIO_AN_VEND_PROV_DOWNSHIFT_MASK GENMASK(3, 0)
+ #define MDIO_AN_VEND_PROV_DOWNSHIFT_DFLT 4
+@@ -231,9 +233,20 @@ static int aqr_config_aneg(struct phy_device *phydev)
+ phydev->advertising))
+ reg |= MDIO_AN_VEND_PROV_1000BASET_HALF;
+
++ /* Handle the case when the 2.5G and 5G speeds are not advertised */
++ if (linkmode_test_bit(ETHTOOL_LINK_MODE_2500baseT_Full_BIT,
++ phydev->advertising))
++ reg |= MDIO_AN_VEND_PROV_2500BASET_FULL;
++
++ if (linkmode_test_bit(ETHTOOL_LINK_MODE_5000baseT_Full_BIT,
++ phydev->advertising))
++ reg |= MDIO_AN_VEND_PROV_5000BASET_FULL;
++
+ ret = phy_modify_mmd_changed(phydev, MDIO_MMD_AN, MDIO_AN_VEND_PROV,
+ MDIO_AN_VEND_PROV_1000BASET_HALF |
+- MDIO_AN_VEND_PROV_1000BASET_FULL, reg);
++ MDIO_AN_VEND_PROV_1000BASET_FULL |
++ MDIO_AN_VEND_PROV_2500BASET_FULL |
++ MDIO_AN_VEND_PROV_5000BASET_FULL, reg);
+ if (ret < 0)
+ return ret;
+ if (ret > 0)
+diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
+index 6a467e7817a6a..59fe356942b51 100644
+--- a/drivers/net/phy/at803x.c
++++ b/drivers/net/phy/at803x.c
+@@ -2072,6 +2072,8 @@ static struct phy_driver at803x_driver[] = {
+ /* ATHEROS AR9331 */
+ PHY_ID_MATCH_EXACT(ATH9331_PHY_ID),
+ .name = "Qualcomm Atheros AR9331 built-in PHY",
++ .probe = at803x_probe,
++ .remove = at803x_remove,
+ .suspend = at803x_suspend,
+ .resume = at803x_resume,
+ .flags = PHY_POLL_CABLE_TEST,
+@@ -2087,6 +2089,8 @@ static struct phy_driver at803x_driver[] = {
+ /* Qualcomm Atheros QCA9561 */
+ PHY_ID_MATCH_EXACT(QCA9561_PHY_ID),
+ .name = "Qualcomm Atheros QCA9561 built-in PHY",
++ .probe = at803x_probe,
++ .remove = at803x_remove,
+ .suspend = at803x_suspend,
+ .resume = at803x_resume,
+ .flags = PHY_POLL_CABLE_TEST,
+@@ -2151,6 +2155,8 @@ static struct phy_driver at803x_driver[] = {
+ PHY_ID_MATCH_EXACT(QCA8081_PHY_ID),
+ .name = "Qualcomm QCA8081",
+ .flags = PHY_POLL_CABLE_TEST,
++ .probe = at803x_probe,
++ .remove = at803x_remove,
+ .config_intr = at803x_config_intr,
+ .handle_interrupt = at803x_handle_interrupt,
+ .get_tunable = at803x_get_tunable,
+diff --git a/drivers/net/veth.c b/drivers/net/veth.c
+index eb0121a64d6d2..1d1dea07d9326 100644
+--- a/drivers/net/veth.c
++++ b/drivers/net/veth.c
+@@ -312,6 +312,7 @@ static bool veth_skb_is_eligible_for_gro(const struct net_device *dev,
+ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
+ {
+ struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
++ struct netdev_queue *queue = NULL;
+ struct veth_rq *rq = NULL;
+ struct net_device *rcv;
+ int length = skb->len;
+@@ -329,6 +330,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
+ rxq = skb_get_queue_mapping(skb);
+ if (rxq < rcv->real_num_rx_queues) {
+ rq = &rcv_priv->rq[rxq];
++ queue = netdev_get_tx_queue(dev, rxq);
+
+ /* The napi pointer is available when an XDP program is
+ * attached or when GRO is enabled
+@@ -340,6 +342,8 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
+
+ skb_tx_timestamp(skb);
+ if (likely(veth_forward_skb(rcv, skb, rq, use_napi) == NET_RX_SUCCESS)) {
++ if (queue)
++ txq_trans_cond_update(queue);
+ if (!use_napi)
+ dev_lstats_add(dev, length);
+ } else {
+diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
+index cbba9d2e8f322..10d548b07b9c6 100644
+--- a/drivers/net/virtio_net.c
++++ b/drivers/net/virtio_net.c
+@@ -2768,7 +2768,6 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
+ static void virtnet_freeze_down(struct virtio_device *vdev)
+ {
+ struct virtnet_info *vi = vdev->priv;
+- int i;
+
+ /* Make sure no work handler is accessing the device */
+ flush_work(&vi->config_work);
+@@ -2776,14 +2775,8 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
+ netif_tx_lock_bh(vi->dev);
+ netif_device_detach(vi->dev);
+ netif_tx_unlock_bh(vi->dev);
+- cancel_delayed_work_sync(&vi->refill);
+-
+- if (netif_running(vi->dev)) {
+- for (i = 0; i < vi->max_queue_pairs; i++) {
+- napi_disable(&vi->rq[i].napi);
+- virtnet_napi_tx_disable(&vi->sq[i].napi);
+- }
+- }
++ if (netif_running(vi->dev))
++ virtnet_close(vi->dev);
+ }
+
+ static int init_vqs(struct virtnet_info *vi);
+@@ -2791,7 +2784,7 @@ static int init_vqs(struct virtnet_info *vi);
+ static int virtnet_restore_up(struct virtio_device *vdev)
+ {
+ struct virtnet_info *vi = vdev->priv;
+- int err, i;
++ int err;
+
+ err = init_vqs(vi);
+ if (err)
+@@ -2800,15 +2793,9 @@ static int virtnet_restore_up(struct virtio_device *vdev)
+ virtio_device_ready(vdev);
+
+ if (netif_running(vi->dev)) {
+- for (i = 0; i < vi->curr_queue_pairs; i++)
+- if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
+- schedule_delayed_work(&vi->refill, 0);
+-
+- for (i = 0; i < vi->max_queue_pairs; i++) {
+- virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
+- virtnet_napi_tx_enable(vi, vi->sq[i].vq,
+- &vi->sq[i].napi);
+- }
++ err = virtnet_open(vi->dev);
++ if (err)
++ return err;
+ }
+
+ netif_tx_lock_bh(vi->dev);
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index 1ea85c88d7951..a2862a56fadc4 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -2487,6 +2487,20 @@ static const struct nvme_core_quirk_entry core_quirks[] = {
+ .vid = 0x1e0f,
+ .mn = "KCD6XVUL6T40",
+ .quirks = NVME_QUIRK_NO_APST,
++ },
++ {
++ /*
++ * The external Samsung X5 SSD fails initialization without a
++ * delay before checking if it is ready and has a whole set of
++ * other problems. To make this even more interesting, it
++ * shares the PCI ID with internal Samsung 970 Evo Plus that
++ * does not need or want these quirks.
++ */
++ .vid = 0x144d,
++ .mn = "Samsung Portable SSD X5",
++ .quirks = NVME_QUIRK_DELAY_BEFORE_CHK_RDY |
++ NVME_QUIRK_NO_DEEPEST_PS |
++ NVME_QUIRK_IGNORE_DEV_SUBNQN,
+ }
+ };
+
+diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
+index 17aeb7d5c4852..ddea0fb90c288 100644
+--- a/drivers/nvme/host/pci.c
++++ b/drivers/nvme/host/pci.c
+@@ -3475,10 +3475,6 @@ static const struct pci_device_id nvme_id_table[] = {
+ NVME_QUIRK_128_BYTES_SQES |
+ NVME_QUIRK_SHARED_TAGS |
+ NVME_QUIRK_SKIP_CID_GEN },
+- { PCI_DEVICE(0x144d, 0xa808), /* Samsung X5 */
+- .driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY|
+- NVME_QUIRK_NO_DEEPEST_PS |
+- NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+ { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
+ { 0, }
+ };
+diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
+index d0eab5700dc57..00684e11976be 100644
+--- a/drivers/scsi/ibmvscsi/ibmvfc.c
++++ b/drivers/scsi/ibmvscsi/ibmvfc.c
+@@ -160,8 +160,8 @@ static void ibmvfc_npiv_logout(struct ibmvfc_host *);
+ static void ibmvfc_tgt_implicit_logout_and_del(struct ibmvfc_target *);
+ static void ibmvfc_tgt_move_login(struct ibmvfc_target *);
+
+-static void ibmvfc_release_sub_crqs(struct ibmvfc_host *);
+-static void ibmvfc_init_sub_crqs(struct ibmvfc_host *);
++static void ibmvfc_dereg_sub_crqs(struct ibmvfc_host *);
++static void ibmvfc_reg_sub_crqs(struct ibmvfc_host *);
+
+ static const char *unknown_error = "unknown error";
+
+@@ -917,7 +917,7 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host *vhost)
+ struct vio_dev *vdev = to_vio_dev(vhost->dev);
+ unsigned long flags;
+
+- ibmvfc_release_sub_crqs(vhost);
++ ibmvfc_dereg_sub_crqs(vhost);
+
+ /* Re-enable the CRQ */
+ do {
+@@ -936,7 +936,7 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host *vhost)
+ spin_unlock(vhost->crq.q_lock);
+ spin_unlock_irqrestore(vhost->host->host_lock, flags);
+
+- ibmvfc_init_sub_crqs(vhost);
++ ibmvfc_reg_sub_crqs(vhost);
+
+ return rc;
+ }
+@@ -955,7 +955,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
+ struct vio_dev *vdev = to_vio_dev(vhost->dev);
+ struct ibmvfc_queue *crq = &vhost->crq;
+
+- ibmvfc_release_sub_crqs(vhost);
++ ibmvfc_dereg_sub_crqs(vhost);
+
+ /* Close the CRQ */
+ do {
+@@ -988,7 +988,7 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
+ spin_unlock(vhost->crq.q_lock);
+ spin_unlock_irqrestore(vhost->host->host_lock, flags);
+
+- ibmvfc_init_sub_crqs(vhost);
++ ibmvfc_reg_sub_crqs(vhost);
+
+ return rc;
+ }
+@@ -5682,6 +5682,8 @@ static int ibmvfc_alloc_queue(struct ibmvfc_host *vhost,
+ queue->cur = 0;
+ queue->fmt = fmt;
+ queue->size = PAGE_SIZE / fmt_size;
++
++ queue->vhost = vhost;
+ return 0;
+ }
+
+@@ -5757,9 +5759,6 @@ static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
+
+ ENTER;
+
+- if (ibmvfc_alloc_queue(vhost, scrq, IBMVFC_SUB_CRQ_FMT))
+- return -ENOMEM;
+-
+ rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE,
+ &scrq->cookie, &scrq->hw_irq);
+
+@@ -5790,7 +5789,6 @@ static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
+ }
+
+ scrq->hwq_id = index;
+- scrq->vhost = vhost;
+
+ LEAVE;
+ return 0;
+@@ -5800,7 +5798,6 @@ irq_failed:
+ rc = plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address, scrq->cookie);
+ } while (rtas_busy_delay(rc));
+ reg_failed:
+- ibmvfc_free_queue(vhost, scrq);
+ LEAVE;
+ return rc;
+ }
+@@ -5826,12 +5823,50 @@ static void ibmvfc_deregister_scsi_channel(struct ibmvfc_host *vhost, int index)
+ if (rc)
+ dev_err(dev, "Failed to free sub-crq[%d]: rc=%ld\n", index, rc);
+
+- ibmvfc_free_queue(vhost, scrq);
++ /* Clean out the queue */
++ memset(scrq->msgs.crq, 0, PAGE_SIZE);
++ scrq->cur = 0;
++
++ LEAVE;
++}
++
++static void ibmvfc_reg_sub_crqs(struct ibmvfc_host *vhost)
++{
++ int i, j;
++
++ ENTER;
++ if (!vhost->mq_enabled || !vhost->scsi_scrqs.scrqs)
++ return;
++
++ for (i = 0; i < nr_scsi_hw_queues; i++) {
++ if (ibmvfc_register_scsi_channel(vhost, i)) {
++ for (j = i; j > 0; j--)
++ ibmvfc_deregister_scsi_channel(vhost, j - 1);
++ vhost->do_enquiry = 0;
++ return;
++ }
++ }
++
++ LEAVE;
++}
++
++static void ibmvfc_dereg_sub_crqs(struct ibmvfc_host *vhost)
++{
++ int i;
++
++ ENTER;
++ if (!vhost->mq_enabled || !vhost->scsi_scrqs.scrqs)
++ return;
++
++ for (i = 0; i < nr_scsi_hw_queues; i++)
++ ibmvfc_deregister_scsi_channel(vhost, i);
++
+ LEAVE;
+ }
+
+ static void ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
+ {
++ struct ibmvfc_queue *scrq;
+ int i, j;
+
+ ENTER;
+@@ -5847,30 +5882,41 @@ static void ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
+ }
+
+ for (i = 0; i < nr_scsi_hw_queues; i++) {
+- if (ibmvfc_register_scsi_channel(vhost, i)) {
+- for (j = i; j > 0; j--)
+- ibmvfc_deregister_scsi_channel(vhost, j - 1);
++ scrq = &vhost->scsi_scrqs.scrqs[i];
++ if (ibmvfc_alloc_queue(vhost, scrq, IBMVFC_SUB_CRQ_FMT)) {
++ for (j = i; j > 0; j--) {
++ scrq = &vhost->scsi_scrqs.scrqs[j - 1];
++ ibmvfc_free_queue(vhost, scrq);
++ }
+ kfree(vhost->scsi_scrqs.scrqs);
+ vhost->scsi_scrqs.scrqs = NULL;
+ vhost->scsi_scrqs.active_queues = 0;
+ vhost->do_enquiry = 0;
+- break;
++ vhost->mq_enabled = 0;
++ return;
+ }
+ }
+
++ ibmvfc_reg_sub_crqs(vhost);
++
+ LEAVE;
+ }
+
+ static void ibmvfc_release_sub_crqs(struct ibmvfc_host *vhost)
+ {
++ struct ibmvfc_queue *scrq;
+ int i;
+
+ ENTER;
+ if (!vhost->scsi_scrqs.scrqs)
+ return;
+
+- for (i = 0; i < nr_scsi_hw_queues; i++)
+- ibmvfc_deregister_scsi_channel(vhost, i);
++ ibmvfc_dereg_sub_crqs(vhost);
++
++ for (i = 0; i < nr_scsi_hw_queues; i++) {
++ scrq = &vhost->scsi_scrqs.scrqs[i];
++ ibmvfc_free_queue(vhost, scrq);
++ }
+
+ kfree(vhost->scsi_scrqs.scrqs);
+ vhost->scsi_scrqs.scrqs = NULL;
+diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
+index 3718406e09887..c39a245f43d02 100644
+--- a/drivers/scsi/ibmvscsi/ibmvfc.h
++++ b/drivers/scsi/ibmvscsi/ibmvfc.h
+@@ -789,6 +789,7 @@ struct ibmvfc_queue {
+ spinlock_t _lock;
+ spinlock_t *q_lock;
+
++ struct ibmvfc_host *vhost;
+ struct ibmvfc_event_pool evt_pool;
+ struct list_head sent;
+ struct list_head free;
+@@ -797,7 +798,6 @@ struct ibmvfc_queue {
+ union ibmvfc_iu cancel_rsp;
+
+ /* Sub-CRQ fields */
+- struct ibmvfc_host *vhost;
+ unsigned long cookie;
+ unsigned long vios_cookie;
+ unsigned long hw_irq;
+diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
+index 592a290e6cfaa..6cdd67f2a08e9 100644
+--- a/drivers/scsi/scsi_debug.c
++++ b/drivers/scsi/scsi_debug.c
+@@ -2788,6 +2788,24 @@ static void zbc_open_zone(struct sdebug_dev_info *devip,
+ }
+ }
+
++static inline void zbc_set_zone_full(struct sdebug_dev_info *devip,
++ struct sdeb_zone_state *zsp)
++{
++ switch (zsp->z_cond) {
++ case ZC2_IMPLICIT_OPEN:
++ devip->nr_imp_open--;
++ break;
++ case ZC3_EXPLICIT_OPEN:
++ devip->nr_exp_open--;
++ break;
++ default:
++ WARN_ONCE(true, "Invalid zone %llu condition %x\n",
++ zsp->z_start, zsp->z_cond);
++ break;
++ }
++ zsp->z_cond = ZC5_FULL;
++}
++
+ static void zbc_inc_wp(struct sdebug_dev_info *devip,
+ unsigned long long lba, unsigned int num)
+ {
+@@ -2800,7 +2818,7 @@ static void zbc_inc_wp(struct sdebug_dev_info *devip,
+ if (zsp->z_type == ZBC_ZONE_TYPE_SWR) {
+ zsp->z_wp += num;
+ if (zsp->z_wp >= zend)
+- zsp->z_cond = ZC5_FULL;
++ zbc_set_zone_full(devip, zsp);
+ return;
+ }
+
+@@ -2819,7 +2837,7 @@ static void zbc_inc_wp(struct sdebug_dev_info *devip,
+ n = num;
+ }
+ if (zsp->z_wp >= zend)
+- zsp->z_cond = ZC5_FULL;
++ zbc_set_zone_full(devip, zsp);
+
+ num -= n;
+ lba += n;
+diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
+index 2c0dd64159b09..5d21f07456c6d 100644
+--- a/drivers/scsi/scsi_transport_iscsi.c
++++ b/drivers/scsi/scsi_transport_iscsi.c
+@@ -212,7 +212,12 @@ iscsi_create_endpoint(int dd_size)
+ return NULL;
+
+ mutex_lock(&iscsi_ep_idr_mutex);
+- id = idr_alloc(&iscsi_ep_idr, ep, 0, -1, GFP_NOIO);
++
++ /*
++ * First endpoint id should be 1 to comply with user space
++ * applications (iscsid).
++ */
++ id = idr_alloc(&iscsi_ep_idr, ep, 1, -1, GFP_NOIO);
+ if (id < 0) {
+ mutex_unlock(&iscsi_ep_idr_mutex);
+ printk(KERN_ERR "Could not allocate endpoint ID. Error %d.\n",
+diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
+index 9a0bba5a51a71..4b1f1d73eee8b 100644
+--- a/drivers/scsi/storvsc_drv.c
++++ b/drivers/scsi/storvsc_drv.c
+@@ -1916,7 +1916,7 @@ static struct scsi_host_template scsi_driver = {
+ .cmd_per_lun = 2048,
+ .this_id = -1,
+ /* Ensure there are no gaps in presented sgls */
+- .virt_boundary_mask = PAGE_SIZE-1,
++ .virt_boundary_mask = HV_HYP_PAGE_SIZE - 1,
+ .no_write_same = 1,
+ .track_queue_depth = 1,
+ .change_queue_depth = storvsc_change_queue_depth,
+@@ -1970,6 +1970,7 @@ static int storvsc_probe(struct hv_device *device,
+ int max_targets;
+ int max_channels;
+ int max_sub_channels = 0;
++ u32 max_xfer_bytes;
+
+ /*
+ * Based on the windows host we are running on,
+@@ -2059,12 +2060,28 @@ static int storvsc_probe(struct hv_device *device,
+ }
+ /* max cmd length */
+ host->max_cmd_len = STORVSC_MAX_CMD_LEN;
+-
+ /*
+- * set the table size based on the info we got
+- * from the host.
++ * Any reasonable Hyper-V configuration should provide
++ * max_transfer_bytes value aligning to HV_HYP_PAGE_SIZE,
++ * protecting it from any weird value.
++ */
++ max_xfer_bytes = round_down(stor_device->max_transfer_bytes, HV_HYP_PAGE_SIZE);
++ /* max_hw_sectors_kb */
++ host->max_sectors = max_xfer_bytes >> 9;
++ /*
++ * There are 2 requirements for Hyper-V storvsc sgl segments,
++ * based on which the below calculation for max segments is
++ * done:
++ *
++ * 1. Except for the first and last sgl segment, all sgl segments
++ * should be align to HV_HYP_PAGE_SIZE, that also means the
++ * maximum number of segments in a sgl can be calculated by
++ * dividing the total max transfer length by HV_HYP_PAGE_SIZE.
++ *
++ * 2. Except for the first and last, each entry in the SGL must
++ * have an offset that is a multiple of HV_HYP_PAGE_SIZE.
+ */
+- host->sg_tablesize = (stor_device->max_transfer_bytes >> PAGE_SHIFT);
++ host->sg_tablesize = (max_xfer_bytes >> HV_HYP_PAGE_SHIFT) + 1;
+ /*
+ * For non-IDE disks, the host supports multiple channels.
+ * Set the number of HW queues we are supporting.
+diff --git a/drivers/soc/bcm/brcmstb/pm/pm-arm.c b/drivers/soc/bcm/brcmstb/pm/pm-arm.c
+index 3cbb165d6e309..70ad0f3dce283 100644
+--- a/drivers/soc/bcm/brcmstb/pm/pm-arm.c
++++ b/drivers/soc/bcm/brcmstb/pm/pm-arm.c
+@@ -783,6 +783,7 @@ static int brcmstb_pm_probe(struct platform_device *pdev)
+ }
+
+ ret = brcmstb_init_sram(dn);
++ of_node_put(dn);
+ if (ret) {
+ pr_err("error setting up SRAM for PM\n");
+ return ret;
+diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
+index dc6c96e04bcfe..3b8bf6daf7d0f 100644
+--- a/drivers/usb/chipidea/udc.c
++++ b/drivers/usb/chipidea/udc.c
+@@ -1048,6 +1048,9 @@ isr_setup_status_complete(struct usb_ep *ep, struct usb_request *req)
+ struct ci_hdrc *ci = req->context;
+ unsigned long flags;
+
++ if (req->status < 0)
++ return;
++
+ if (ci->setaddr) {
+ hw_usb_set_address(ci, ci->address);
+ ci->setaddr = false;
+diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
+index 7f59a0c474020..4e4a7c3126462 100644
+--- a/drivers/usb/gadget/function/uvc_video.c
++++ b/drivers/usb/gadget/function/uvc_video.c
+@@ -415,6 +415,9 @@ static void uvcg_video_pump(struct work_struct *work)
+ uvcg_queue_cancel(queue, 0);
+ break;
+ }
++
++ /* Endpoint now owns the request */
++ req = NULL;
+ video->req_int_count++;
+ }
+
+diff --git a/drivers/usb/gadget/legacy/raw_gadget.c b/drivers/usb/gadget/legacy/raw_gadget.c
+index e9440f7bf019d..ed7c2127fb911 100644
+--- a/drivers/usb/gadget/legacy/raw_gadget.c
++++ b/drivers/usb/gadget/legacy/raw_gadget.c
+@@ -11,6 +11,7 @@
+ #include <linux/ctype.h>
+ #include <linux/debugfs.h>
+ #include <linux/delay.h>
++#include <linux/idr.h>
+ #include <linux/kref.h>
+ #include <linux/miscdevice.h>
+ #include <linux/module.h>
+@@ -36,6 +37,9 @@ MODULE_LICENSE("GPL");
+
+ /*----------------------------------------------------------------------*/
+
++static DEFINE_IDA(driver_id_numbers);
++#define DRIVER_DRIVER_NAME_LENGTH_MAX 32
++
+ #define RAW_EVENT_QUEUE_SIZE 16
+
+ struct raw_event_queue {
+@@ -161,6 +165,9 @@ struct raw_dev {
+ /* Reference to misc device: */
+ struct device *dev;
+
++ /* Make driver names unique */
++ int driver_id_number;
++
+ /* Protected by lock: */
+ enum dev_state state;
+ bool gadget_registered;
+@@ -189,6 +196,7 @@ static struct raw_dev *dev_new(void)
+ spin_lock_init(&dev->lock);
+ init_completion(&dev->ep0_done);
+ raw_event_queue_init(&dev->queue);
++ dev->driver_id_number = -1;
+ return dev;
+ }
+
+@@ -199,6 +207,9 @@ static void dev_free(struct kref *kref)
+
+ kfree(dev->udc_name);
+ kfree(dev->driver.udc_name);
++ kfree(dev->driver.driver.name);
++ if (dev->driver_id_number >= 0)
++ ida_free(&driver_id_numbers, dev->driver_id_number);
+ if (dev->req) {
+ if (dev->ep0_urb_queued)
+ usb_ep_dequeue(dev->gadget->ep0, dev->req);
+@@ -419,9 +430,11 @@ out_put:
+ static int raw_ioctl_init(struct raw_dev *dev, unsigned long value)
+ {
+ int ret = 0;
++ int driver_id_number;
+ struct usb_raw_init arg;
+ char *udc_driver_name;
+ char *udc_device_name;
++ char *driver_driver_name;
+ unsigned long flags;
+
+ if (copy_from_user(&arg, (void __user *)value, sizeof(arg)))
+@@ -440,36 +453,43 @@ static int raw_ioctl_init(struct raw_dev *dev, unsigned long value)
+ return -EINVAL;
+ }
+
++ driver_id_number = ida_alloc(&driver_id_numbers, GFP_KERNEL);
++ if (driver_id_number < 0)
++ return driver_id_number;
++
++ driver_driver_name = kmalloc(DRIVER_DRIVER_NAME_LENGTH_MAX, GFP_KERNEL);
++ if (!driver_driver_name) {
++ ret = -ENOMEM;
++ goto out_free_driver_id_number;
++ }
++ snprintf(driver_driver_name, DRIVER_DRIVER_NAME_LENGTH_MAX,
++ DRIVER_NAME ".%d", driver_id_number);
++
+ udc_driver_name = kmalloc(UDC_NAME_LENGTH_MAX, GFP_KERNEL);
+- if (!udc_driver_name)
+- return -ENOMEM;
++ if (!udc_driver_name) {
++ ret = -ENOMEM;
++ goto out_free_driver_driver_name;
++ }
+ ret = strscpy(udc_driver_name, &arg.driver_name[0],
+ UDC_NAME_LENGTH_MAX);
+- if (ret < 0) {
+- kfree(udc_driver_name);
+- return ret;
+- }
++ if (ret < 0)
++ goto out_free_udc_driver_name;
+ ret = 0;
+
+ udc_device_name = kmalloc(UDC_NAME_LENGTH_MAX, GFP_KERNEL);
+ if (!udc_device_name) {
+- kfree(udc_driver_name);
+- return -ENOMEM;
++ ret = -ENOMEM;
++ goto out_free_udc_driver_name;
+ }
+ ret = strscpy(udc_device_name, &arg.device_name[0],
+ UDC_NAME_LENGTH_MAX);
+- if (ret < 0) {
+- kfree(udc_driver_name);
+- kfree(udc_device_name);
+- return ret;
+- }
++ if (ret < 0)
++ goto out_free_udc_device_name;
+ ret = 0;
+
+ spin_lock_irqsave(&dev->lock, flags);
+ if (dev->state != STATE_DEV_OPENED) {
+ dev_dbg(dev->dev, "fail, device is not opened\n");
+- kfree(udc_driver_name);
+- kfree(udc_device_name);
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+@@ -484,14 +504,25 @@ static int raw_ioctl_init(struct raw_dev *dev, unsigned long value)
+ dev->driver.suspend = gadget_suspend;
+ dev->driver.resume = gadget_resume;
+ dev->driver.reset = gadget_reset;
+- dev->driver.driver.name = DRIVER_NAME;
++ dev->driver.driver.name = driver_driver_name;
+ dev->driver.udc_name = udc_device_name;
+ dev->driver.match_existing_only = 1;
++ dev->driver_id_number = driver_id_number;
+
+ dev->state = STATE_DEV_INITIALIZED;
++ spin_unlock_irqrestore(&dev->lock, flags);
++ return ret;
+
+ out_unlock:
+ spin_unlock_irqrestore(&dev->lock, flags);
++out_free_udc_device_name:
++ kfree(udc_device_name);
++out_free_udc_driver_name:
++ kfree(udc_driver_name);
++out_free_driver_driver_name:
++ kfree(driver_driver_name);
++out_free_driver_id_number:
++ ida_free(&driver_id_numbers, driver_id_number);
+ return ret;
+ }
+
+diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
+index f65f1ba2b5929..fc322a9526c8c 100644
+--- a/drivers/usb/host/xhci-hub.c
++++ b/drivers/usb/host/xhci-hub.c
+@@ -652,7 +652,7 @@ struct xhci_hub *xhci_get_rhub(struct usb_hcd *hcd)
+ * It will release and re-aquire the lock while calling ACPI
+ * method.
+ */
+-static void xhci_set_port_power(struct xhci_hcd *xhci, struct usb_hcd *hcd,
++void xhci_set_port_power(struct xhci_hcd *xhci, struct usb_hcd *hcd,
+ u16 index, bool on, unsigned long *flags)
+ __must_hold(&xhci->lock)
+ {
+diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
+index d57c5ff5ae1f4..64173010d4666 100644
+--- a/drivers/usb/host/xhci-pci.c
++++ b/drivers/usb/host/xhci-pci.c
+@@ -61,6 +61,8 @@
+ #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_XHCI 0x461e
+ #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_N_XHCI 0x464e
+ #define PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI 0x51ed
++#define PCI_DEVICE_ID_INTEL_RAPTOR_LAKE_XHCI 0xa71e
++#define PCI_DEVICE_ID_INTEL_METEOR_LAKE_XHCI 0x7ec0
+
+ #define PCI_DEVICE_ID_AMD_RENOIR_XHCI 0x1639
+ #define PCI_DEVICE_ID_AMD_PROMONTORYA_4 0x43b9
+@@ -270,7 +272,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
+ pdev->device == PCI_DEVICE_ID_INTEL_MAPLE_RIDGE_XHCI ||
+ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_XHCI ||
+ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_N_XHCI ||
+- pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI))
++ pdev->device == PCI_DEVICE_ID_INTEL_ALDER_LAKE_PCH_XHCI ||
++ pdev->device == PCI_DEVICE_ID_INTEL_RAPTOR_LAKE_XHCI ||
++ pdev->device == PCI_DEVICE_ID_INTEL_METEOR_LAKE_XHCI))
+ xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
+
+ if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
+diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
+index 2be38d9de8df4..162d975b648c1 100644
+--- a/drivers/usb/host/xhci.c
++++ b/drivers/usb/host/xhci.c
+@@ -779,6 +779,8 @@ static void xhci_stop(struct usb_hcd *hcd)
+ void xhci_shutdown(struct usb_hcd *hcd)
+ {
+ struct xhci_hcd *xhci = hcd_to_xhci(hcd);
++ unsigned long flags;
++ int i;
+
+ if (xhci->quirks & XHCI_SPURIOUS_REBOOT)
+ usb_disable_xhci_ports(to_pci_dev(hcd->self.sysdev));
+@@ -794,12 +796,21 @@ void xhci_shutdown(struct usb_hcd *hcd)
+ del_timer_sync(&xhci->shared_hcd->rh_timer);
+ }
+
+- spin_lock_irq(&xhci->lock);
++ spin_lock_irqsave(&xhci->lock, flags);
+ xhci_halt(xhci);
++
++ /* Power off USB2 ports*/
++ for (i = 0; i < xhci->usb2_rhub.num_ports; i++)
++ xhci_set_port_power(xhci, xhci->main_hcd, i, false, &flags);
++
++ /* Power off USB3 ports*/
++ for (i = 0; i < xhci->usb3_rhub.num_ports; i++)
++ xhci_set_port_power(xhci, xhci->shared_hcd, i, false, &flags);
++
+ /* Workaround for spurious wakeups at shutdown with HSW */
+ if (xhci->quirks & XHCI_SPURIOUS_WAKEUP)
+ xhci_reset(xhci, XHCI_RESET_SHORT_USEC);
+- spin_unlock_irq(&xhci->lock);
++ spin_unlock_irqrestore(&xhci->lock, flags);
+
+ xhci_cleanup_msix(xhci);
+
+diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
+index 473a33ce299e4..1f3f311d9951e 100644
+--- a/drivers/usb/host/xhci.h
++++ b/drivers/usb/host/xhci.h
+@@ -2172,6 +2172,8 @@ int xhci_hub_control(struct usb_hcd *hcd, u16 typeReq, u16 wValue, u16 wIndex,
+ int xhci_hub_status_data(struct usb_hcd *hcd, char *buf);
+ int xhci_find_raw_port_number(struct usb_hcd *hcd, int port1);
+ struct xhci_hub *xhci_get_rhub(struct usb_hcd *hcd);
++void xhci_set_port_power(struct xhci_hcd *xhci, struct usb_hcd *hcd, u16 index,
++ bool on, unsigned long *flags);
+
+ void xhci_hc_died(struct xhci_hcd *xhci);
+
+diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
+index ed1e50d83ccab..de59fa919540a 100644
+--- a/drivers/usb/serial/option.c
++++ b/drivers/usb/serial/option.c
+@@ -252,10 +252,12 @@ static void option_instat_callback(struct urb *urb);
+ #define QUECTEL_PRODUCT_EG95 0x0195
+ #define QUECTEL_PRODUCT_BG96 0x0296
+ #define QUECTEL_PRODUCT_EP06 0x0306
++#define QUECTEL_PRODUCT_EM05G 0x030a
+ #define QUECTEL_PRODUCT_EM12 0x0512
+ #define QUECTEL_PRODUCT_RM500Q 0x0800
+ #define QUECTEL_PRODUCT_EC200S_CN 0x6002
+ #define QUECTEL_PRODUCT_EC200T 0x6026
++#define QUECTEL_PRODUCT_RM500K 0x7001
+
+ #define CMOTECH_VENDOR_ID 0x16d8
+ #define CMOTECH_PRODUCT_6001 0x6001
+@@ -1134,6 +1136,8 @@ static const struct usb_device_id option_ids[] = {
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EP06, 0xff, 0xff, 0xff),
+ .driver_info = RSVD(1) | RSVD(2) | RSVD(3) | RSVD(4) | NUMEP2 },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EP06, 0xff, 0, 0) },
++ { USB_DEVICE_INTERFACE_CLASS(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM05G, 0xff),
++ .driver_info = RSVD(6) | ZLP },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM12, 0xff, 0xff, 0xff),
+ .driver_info = RSVD(1) | RSVD(2) | RSVD(3) | RSVD(4) | NUMEP2 },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM12, 0xff, 0, 0) },
+@@ -1147,6 +1151,7 @@ static const struct usb_device_id option_ids[] = {
+ .driver_info = ZLP },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200S_CN, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200T, 0xff, 0, 0) },
++ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500K, 0xff, 0x00, 0x00) },
+
+ { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) },
+ { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_CMU_300) },
+@@ -1279,6 +1284,7 @@ static const struct usb_device_id option_ids[] = {
+ .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) },
+ { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1231, 0xff), /* Telit LE910Cx (RNDIS) */
+ .driver_info = NCTRL(2) | RSVD(3) },
++ { USB_DEVICE_AND_INTERFACE_INFO(TELIT_VENDOR_ID, 0x1250, 0xff, 0x00, 0x00) }, /* Telit LE910Cx (rmnet) */
+ { USB_DEVICE(TELIT_VENDOR_ID, 0x1260),
+ .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) },
+ { USB_DEVICE(TELIT_VENDOR_ID, 0x1261),
+diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
+index 3506c47e1eef0..40b1ab3d284dc 100644
+--- a/drivers/usb/serial/pl2303.c
++++ b/drivers/usb/serial/pl2303.c
+@@ -436,22 +436,27 @@ static int pl2303_detect_type(struct usb_serial *serial)
+ break;
+ case 0x200:
+ switch (bcdDevice) {
+- case 0x100:
++ case 0x100: /* GC */
+ case 0x105:
++ return TYPE_HXN;
++ case 0x300: /* GT / TA */
++ if (pl2303_supports_hx_status(serial))
++ return TYPE_TA;
++ fallthrough;
+ case 0x305:
++ case 0x400: /* GL */
+ case 0x405:
++ return TYPE_HXN;
++ case 0x500: /* GE / TB */
++ if (pl2303_supports_hx_status(serial))
++ return TYPE_TB;
++ fallthrough;
++ case 0x505:
++ case 0x600: /* GS */
+ case 0x605:
+- /*
+- * Assume it's an HXN-type if the device doesn't
+- * support the old read request value.
+- */
+- if (!pl2303_supports_hx_status(serial))
+- return TYPE_HXN;
+- break;
+- case 0x300:
+- return TYPE_TA;
+- case 0x500:
+- return TYPE_TB;
++ case 0x700: /* GR */
++ case 0x705:
++ return TYPE_HXN;
+ }
+ break;
+ }
+diff --git a/drivers/usb/typec/tcpm/Kconfig b/drivers/usb/typec/tcpm/Kconfig
+index 557f392fe24da..073fd2ea5e0bb 100644
+--- a/drivers/usb/typec/tcpm/Kconfig
++++ b/drivers/usb/typec/tcpm/Kconfig
+@@ -56,7 +56,6 @@ config TYPEC_WCOVE
+ tristate "Intel WhiskeyCove PMIC USB Type-C PHY driver"
+ depends on ACPI
+ depends on MFD_INTEL_PMC_BXT
+- depends on INTEL_SOC_PMIC
+ depends on BXT_WC_PMIC_OPREGION
+ help
+ This driver adds support for USB Type-C on Intel Broxton platforms
+diff --git a/drivers/video/console/sticore.c b/drivers/video/console/sticore.c
+index 6a947ff96d6eb..19fd3389946d9 100644
+--- a/drivers/video/console/sticore.c
++++ b/drivers/video/console/sticore.c
+@@ -1127,6 +1127,7 @@ int sti_call(const struct sti_struct *sti, unsigned long func,
+ return ret;
+ }
+
++#if defined(CONFIG_FB_STI)
+ /* check if given fb_info is the primary device */
+ int fb_is_primary_device(struct fb_info *info)
+ {
+@@ -1142,6 +1143,7 @@ int fb_is_primary_device(struct fb_info *info)
+ return (sti->info == info);
+ }
+ EXPORT_SYMBOL(fb_is_primary_device);
++#endif
+
+ MODULE_AUTHOR("Philipp Rumpf, Helge Deller, Thomas Bogendoerfer");
+ MODULE_DESCRIPTION("Core STI driver for HP's NGLE series graphics cards in HP PARISC machines");
+diff --git a/drivers/xen/features.c b/drivers/xen/features.c
+index 7b591443833c9..87f1828d40d5e 100644
+--- a/drivers/xen/features.c
++++ b/drivers/xen/features.c
+@@ -42,7 +42,7 @@ void xen_setup_features(void)
+ if (HYPERVISOR_xen_version(XENVER_get_features, &fi) < 0)
+ break;
+ for (j = 0; j < 32; j++)
+- xen_features[i * 32 + j] = !!(fi.submap & 1<<j);
++ xen_features[i * 32 + j] = !!(fi.submap & 1U << j);
+ }
+
+ if (xen_pv_domain()) {
+diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
+index 20d7d059dadb5..40ef379c28ab0 100644
+--- a/drivers/xen/gntdev-common.h
++++ b/drivers/xen/gntdev-common.h
+@@ -16,6 +16,7 @@
+ #include <linux/mmu_notifier.h>
+ #include <linux/types.h>
+ #include <xen/interface/event_channel.h>
++#include <xen/grant_table.h>
+
+ struct gntdev_dmabuf_priv;
+
+@@ -56,6 +57,7 @@ struct gntdev_grant_map {
+ struct gnttab_unmap_grant_ref *unmap_ops;
+ struct gnttab_map_grant_ref *kmap_ops;
+ struct gnttab_unmap_grant_ref *kunmap_ops;
++ bool *being_removed;
+ struct page **pages;
+ unsigned long pages_vm_start;
+
+@@ -73,6 +75,11 @@ struct gntdev_grant_map {
+ /* Needed to avoid allocation in gnttab_dma_free_pages(). */
+ xen_pfn_t *frames;
+ #endif
++
++ /* Number of live grants */
++ atomic_t live_grants;
++ /* Needed to avoid allocation in __unmap_grant_pages */
++ struct gntab_unmap_queue_data unmap_data;
+ };
+
+ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
+diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
+index 59ffea8000791..4b56c39f766d4 100644
+--- a/drivers/xen/gntdev.c
++++ b/drivers/xen/gntdev.c
+@@ -35,6 +35,7 @@
+ #include <linux/slab.h>
+ #include <linux/highmem.h>
+ #include <linux/refcount.h>
++#include <linux/workqueue.h>
+
+ #include <xen/xen.h>
+ #include <xen/grant_table.h>
+@@ -60,10 +61,11 @@ module_param(limit, uint, 0644);
+ MODULE_PARM_DESC(limit,
+ "Maximum number of grants that may be mapped by one mapping request");
+
++/* True in PV mode, false otherwise */
+ static int use_ptemod;
+
+-static int unmap_grant_pages(struct gntdev_grant_map *map,
+- int offset, int pages);
++static void unmap_grant_pages(struct gntdev_grant_map *map,
++ int offset, int pages);
+
+ static struct miscdevice gntdev_miscdev;
+
+@@ -120,6 +122,7 @@ static void gntdev_free_map(struct gntdev_grant_map *map)
+ kvfree(map->unmap_ops);
+ kvfree(map->kmap_ops);
+ kvfree(map->kunmap_ops);
++ kvfree(map->being_removed);
+ kfree(map);
+ }
+
+@@ -140,10 +143,13 @@ struct gntdev_grant_map *gntdev_alloc_map(struct gntdev_priv *priv, int count,
+ add->unmap_ops = kvmalloc_array(count, sizeof(add->unmap_ops[0]),
+ GFP_KERNEL);
+ add->pages = kvcalloc(count, sizeof(add->pages[0]), GFP_KERNEL);
++ add->being_removed =
++ kvcalloc(count, sizeof(add->being_removed[0]), GFP_KERNEL);
+ if (NULL == add->grants ||
+ NULL == add->map_ops ||
+ NULL == add->unmap_ops ||
+- NULL == add->pages)
++ NULL == add->pages ||
++ NULL == add->being_removed)
+ goto err;
+ if (use_ptemod) {
+ add->kmap_ops = kvmalloc_array(count, sizeof(add->kmap_ops[0]),
+@@ -250,9 +256,36 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
+ if (!refcount_dec_and_test(&map->users))
+ return;
+
+- if (map->pages && !use_ptemod)
++ if (map->pages && !use_ptemod) {
++ /*
++ * Increment the reference count. This ensures that the
++ * subsequent call to unmap_grant_pages() will not wind up
++ * re-entering itself. It *can* wind up calling
++ * gntdev_put_map() recursively, but such calls will be with a
++ * reference count greater than 1, so they will return before
++ * this code is reached. The recursion depth is thus limited to
++ * 1. Do NOT use refcount_inc() here, as it will detect that
++ * the reference count is zero and WARN().
++ */
++ refcount_set(&map->users, 1);
++
++ /*
++ * Unmap the grants. This may or may not be asynchronous, so it
++ * is possible that the reference count is 1 on return, but it
++ * could also be greater than 1.
++ */
+ unmap_grant_pages(map, 0, map->count);
+
++ /* Check if the memory now needs to be freed */
++ if (!refcount_dec_and_test(&map->users))
++ return;
++
++ /*
++ * All pages have been returned to the hypervisor, so free the
++ * map.
++ */
++ }
++
+ if (map->notify.flags & UNMAP_NOTIFY_SEND_EVENT) {
+ notify_remote_via_evtchn(map->notify.event);
+ evtchn_put(map->notify.event);
+@@ -283,6 +316,7 @@ static int find_grant_ptes(pte_t *pte, unsigned long addr, void *data)
+
+ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
+ {
++ size_t alloced = 0;
+ int i, err = 0;
+
+ if (!use_ptemod) {
+@@ -331,97 +365,116 @@ int gntdev_map_grant_pages(struct gntdev_grant_map *map)
+ map->count);
+
+ for (i = 0; i < map->count; i++) {
+- if (map->map_ops[i].status == GNTST_okay)
++ if (map->map_ops[i].status == GNTST_okay) {
+ map->unmap_ops[i].handle = map->map_ops[i].handle;
+- else if (!err)
++ if (!use_ptemod)
++ alloced++;
++ } else if (!err)
+ err = -EINVAL;
+
+ if (map->flags & GNTMAP_device_map)
+ map->unmap_ops[i].dev_bus_addr = map->map_ops[i].dev_bus_addr;
+
+ if (use_ptemod) {
+- if (map->kmap_ops[i].status == GNTST_okay)
++ if (map->kmap_ops[i].status == GNTST_okay) {
++ if (map->map_ops[i].status == GNTST_okay)
++ alloced++;
+ map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
+- else if (!err)
++ } else if (!err)
+ err = -EINVAL;
+ }
+ }
++ atomic_add(alloced, &map->live_grants);
+ return err;
+ }
+
+-static int __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+- int pages)
++static void __unmap_grant_pages_done(int result,
++ struct gntab_unmap_queue_data *data)
+ {
+- int i, err = 0;
+- struct gntab_unmap_queue_data unmap_data;
+-
+- if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
+- int pgno = (map->notify.addr >> PAGE_SHIFT);
+- if (pgno >= offset && pgno < offset + pages) {
+- /* No need for kmap, pages are in lowmem */
+- uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
+- tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
+- map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
+- }
+- }
+-
+- unmap_data.unmap_ops = map->unmap_ops + offset;
+- unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
+- unmap_data.pages = map->pages + offset;
+- unmap_data.count = pages;
+-
+- err = gnttab_unmap_refs_sync(&unmap_data);
+- if (err)
+- return err;
++ unsigned int i;
++ struct gntdev_grant_map *map = data->data;
++ unsigned int offset = data->unmap_ops - map->unmap_ops;
+
+- for (i = 0; i < pages; i++) {
+- if (map->unmap_ops[offset+i].status)
+- err = -EINVAL;
++ for (i = 0; i < data->count; i++) {
++ WARN_ON(map->unmap_ops[offset+i].status);
+ pr_debug("unmap handle=%d st=%d\n",
+ map->unmap_ops[offset+i].handle,
+ map->unmap_ops[offset+i].status);
+ map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
+ if (use_ptemod) {
+- if (map->kunmap_ops[offset+i].status)
+- err = -EINVAL;
++ WARN_ON(map->kunmap_ops[offset+i].status);
+ pr_debug("kunmap handle=%u st=%d\n",
+ map->kunmap_ops[offset+i].handle,
+ map->kunmap_ops[offset+i].status);
+ map->kunmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
+ }
+ }
+- return err;
++ /*
++ * Decrease the live-grant counter. This must happen after the loop to
++ * prevent premature reuse of the grants by gnttab_mmap().
++ */
++ atomic_sub(data->count, &map->live_grants);
++
++ /* Release reference taken by __unmap_grant_pages */
++ gntdev_put_map(NULL, map);
++}
++
++static void __unmap_grant_pages(struct gntdev_grant_map *map, int offset,
++ int pages)
++{
++ if (map->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
++ int pgno = (map->notify.addr >> PAGE_SHIFT);
++
++ if (pgno >= offset && pgno < offset + pages) {
++ /* No need for kmap, pages are in lowmem */
++ uint8_t *tmp = pfn_to_kaddr(page_to_pfn(map->pages[pgno]));
++
++ tmp[map->notify.addr & (PAGE_SIZE-1)] = 0;
++ map->notify.flags &= ~UNMAP_NOTIFY_CLEAR_BYTE;
++ }
++ }
++
++ map->unmap_data.unmap_ops = map->unmap_ops + offset;
++ map->unmap_data.kunmap_ops = use_ptemod ? map->kunmap_ops + offset : NULL;
++ map->unmap_data.pages = map->pages + offset;
++ map->unmap_data.count = pages;
++ map->unmap_data.done = __unmap_grant_pages_done;
++ map->unmap_data.data = map;
++ refcount_inc(&map->users); /* to keep map alive during async call below */
++
++ gnttab_unmap_refs_async(&map->unmap_data);
+ }
+
+-static int unmap_grant_pages(struct gntdev_grant_map *map, int offset,
+- int pages)
++static void unmap_grant_pages(struct gntdev_grant_map *map, int offset,
++ int pages)
+ {
+- int range, err = 0;
++ int range;
++
++ if (atomic_read(&map->live_grants) == 0)
++ return; /* Nothing to do */
+
+ pr_debug("unmap %d+%d [%d+%d]\n", map->index, map->count, offset, pages);
+
+ /* It is possible the requested range will have a "hole" where we
+ * already unmapped some of the grants. Only unmap valid ranges.
+ */
+- while (pages && !err) {
+- while (pages &&
+- map->unmap_ops[offset].handle == INVALID_GRANT_HANDLE) {
++ while (pages) {
++ while (pages && map->being_removed[offset]) {
+ offset++;
+ pages--;
+ }
+ range = 0;
+ while (range < pages) {
+- if (map->unmap_ops[offset + range].handle ==
+- INVALID_GRANT_HANDLE)
++ if (map->being_removed[offset + range])
+ break;
++ map->being_removed[offset + range] = true;
+ range++;
+ }
+- err = __unmap_grant_pages(map, offset, range);
++ if (range)
++ __unmap_grant_pages(map, offset, range);
+ offset += range;
+ pages -= range;
+ }
+-
+- return err;
+ }
+
+ /* ------------------------------------------------------------------ */
+@@ -473,7 +526,6 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
+ struct gntdev_grant_map *map =
+ container_of(mn, struct gntdev_grant_map, notifier);
+ unsigned long mstart, mend;
+- int err;
+
+ if (!mmu_notifier_range_blockable(range))
+ return false;
+@@ -494,10 +546,9 @@ static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
+ map->index, map->count,
+ map->vma->vm_start, map->vma->vm_end,
+ range->start, range->end, mstart, mend);
+- err = unmap_grant_pages(map,
++ unmap_grant_pages(map,
+ (mstart - map->vma->vm_start) >> PAGE_SHIFT,
+ (mend - mstart) >> PAGE_SHIFT);
+- WARN_ON(err);
+
+ return true;
+ }
+@@ -985,6 +1036,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
+ goto unlock_out;
+ if (use_ptemod && map->vma)
+ goto unlock_out;
++ if (atomic_read(&map->live_grants)) {
++ err = -EAGAIN;
++ goto unlock_out;
++ }
+ refcount_inc(&map->users);
+
+ vma->vm_ops = &gntdev_vmops;
+diff --git a/fs/9p/fid.c b/fs/9p/fid.c
+index 79df61fe0e596..baf2b152229e3 100644
+--- a/fs/9p/fid.c
++++ b/fs/9p/fid.c
+@@ -152,7 +152,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
+ const unsigned char **wnames, *uname;
+ int i, n, l, clone, access;
+ struct v9fs_session_info *v9ses;
+- struct p9_fid *fid, *old_fid = NULL;
++ struct p9_fid *fid, *old_fid;
+
+ v9ses = v9fs_dentry2v9ses(dentry);
+ access = v9ses->flags & V9FS_ACCESS_MASK;
+@@ -194,13 +194,12 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
+ if (IS_ERR(fid))
+ return fid;
+
++ refcount_inc(&fid->count);
+ v9fs_fid_add(dentry->d_sb->s_root, fid);
+ }
+ /* If we are root ourself just return that */
+- if (dentry->d_sb->s_root == dentry) {
+- refcount_inc(&fid->count);
++ if (dentry->d_sb->s_root == dentry)
+ return fid;
+- }
+ /*
+ * Do a multipath walk with attached root.
+ * When walking parent we need to make sure we
+@@ -212,6 +211,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
+ fid = ERR_PTR(n);
+ goto err_out;
+ }
++ old_fid = fid;
+ clone = 1;
+ i = 0;
+ while (i < n) {
+@@ -221,19 +221,15 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
+ * walk to ensure none of the patch component change
+ */
+ fid = p9_client_walk(fid, l, &wnames[i], clone);
++ /* non-cloning walk will return the same fid */
++ if (fid != old_fid) {
++ p9_client_clunk(old_fid);
++ old_fid = fid;
++ }
+ if (IS_ERR(fid)) {
+- if (old_fid) {
+- /*
+- * If we fail, clunk fid which are mapping
+- * to path component and not the last component
+- * of the path.
+- */
+- p9_client_clunk(old_fid);
+- }
+ kfree(wnames);
+ goto err_out;
+ }
+- old_fid = fid;
+ i += l;
+ clone = 0;
+ }
+diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c
+index 595875228672f..a58c554b40706 100644
+--- a/fs/9p/vfs_addr.c
++++ b/fs/9p/vfs_addr.c
+@@ -58,8 +58,21 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq)
+ */
+ static int v9fs_init_request(struct netfs_io_request *rreq, struct file *file)
+ {
++ struct inode *inode = file_inode(file);
++ struct v9fs_inode *v9inode = V9FS_I(inode);
+ struct p9_fid *fid = file->private_data;
+
++ BUG_ON(!fid);
++
++ /* we might need to read from a fid that was opened write-only
++ * for read-modify-write of page cache, use the writeback fid
++ * for that */
++ if (rreq->origin == NETFS_READ_FOR_WRITE &&
++ (fid->mode & O_ACCMODE) == O_WRONLY) {
++ fid = v9inode->writeback_fid;
++ BUG_ON(!fid);
++ }
++
+ refcount_inc(&fid->count);
+ rreq->netfs_priv = fid;
+ return 0;
+diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
+index e660c6348b9da..d4b705a866eab 100644
+--- a/fs/9p/vfs_inode.c
++++ b/fs/9p/vfs_inode.c
+@@ -1250,15 +1250,15 @@ static const char *v9fs_vfs_get_link(struct dentry *dentry,
+ return ERR_PTR(-ECHILD);
+
+ v9ses = v9fs_dentry2v9ses(dentry);
+- fid = v9fs_fid_lookup(dentry);
++ if (!v9fs_proto_dotu(v9ses))
++ return ERR_PTR(-EBADF);
++
+ p9_debug(P9_DEBUG_VFS, "%pd\n", dentry);
++ fid = v9fs_fid_lookup(dentry);
+
+ if (IS_ERR(fid))
+ return ERR_CAST(fid);
+
+- if (!v9fs_proto_dotu(v9ses))
+- return ERR_PTR(-EBADF);
+-
+ st = p9_client_stat(fid);
+ p9_client_clunk(fid);
+ if (IS_ERR(st))
+diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
+index d17502a738a94..b6eb1160296c3 100644
+--- a/fs/9p/vfs_inode_dotl.c
++++ b/fs/9p/vfs_inode_dotl.c
+@@ -274,6 +274,7 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
+ if (IS_ERR(ofid)) {
+ err = PTR_ERR(ofid);
+ p9_debug(P9_DEBUG_VFS, "p9_client_walk failed %d\n", err);
++ p9_client_clunk(dfid);
+ goto out;
+ }
+
+@@ -285,6 +286,7 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
+ if (err) {
+ p9_debug(P9_DEBUG_VFS, "Failed to get acl values in creat %d\n",
+ err);
++ p9_client_clunk(dfid);
+ goto error;
+ }
+ err = p9_client_create_dotl(ofid, name, v9fs_open_to_dotl_flags(flags),
+@@ -292,6 +294,7 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
+ if (err < 0) {
+ p9_debug(P9_DEBUG_VFS, "p9_client_open_dotl failed in creat %d\n",
+ err);
++ p9_client_clunk(dfid);
+ goto error;
+ }
+ v9fs_invalidate_inode_attr(dir);
+diff --git a/fs/afs/inode.c b/fs/afs/inode.c
+index 22811e9eacf58..c4c9f6dff0a23 100644
+--- a/fs/afs/inode.c
++++ b/fs/afs/inode.c
+@@ -745,7 +745,8 @@ int afs_getattr(struct user_namespace *mnt_userns, const struct path *path,
+
+ _enter("{ ino=%lu v=%u }", inode->i_ino, inode->i_generation);
+
+- if (!(query_flags & AT_STATX_DONT_SYNC) &&
++ if (vnode->volume &&
++ !(query_flags & AT_STATX_DONT_SYNC) &&
+ !test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
+ key = afs_request_key(vnode->volume->cell);
+ if (IS_ERR(key))
+diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
+index 30d0bbfdb3bca..6f30413ed9a9b 100644
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -4639,6 +4639,17 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
+ int ret;
+
+ set_bit(BTRFS_FS_CLOSING_START, &fs_info->flags);
++
++ /*
++ * We may have the reclaim task running and relocating a data block group,
++ * in which case it may create delayed iputs. So stop it before we park
++ * the cleaner kthread otherwise we can get new delayed iputs after
++ * parking the cleaner, and that can make the async reclaim task to hang
++ * if it's waiting for delayed iputs to complete, since the cleaner is
++ * parked and can not run delayed iputs - this will make us hang when
++ * trying to stop the async reclaim task.
++ */
++ cancel_work_sync(&fs_info->reclaim_bgs_work);
+ /*
+ * We don't want the cleaner to start new transactions, add more delayed
+ * iputs, etc. while we're closing. We can't use kthread_stop() yet
+@@ -4679,8 +4690,6 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
+ cancel_work_sync(&fs_info->async_data_reclaim_work);
+ cancel_work_sync(&fs_info->preempt_reclaim_work);
+
+- cancel_work_sync(&fs_info->reclaim_bgs_work);
+-
+ /* Cancel or finish ongoing discard work */
+ btrfs_discard_cleanup(fs_info);
+
+diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
+index 380054c94e4b6..153920acd2269 100644
+--- a/fs/btrfs/file.c
++++ b/fs/btrfs/file.c
+@@ -2359,25 +2359,62 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
+ */
+ btrfs_inode_unlock(inode, BTRFS_ILOCK_MMAP);
+
+- if (ret != BTRFS_NO_LOG_SYNC) {
++ if (ret == BTRFS_NO_LOG_SYNC) {
++ ret = btrfs_end_transaction(trans);
++ goto out;
++ }
++
++ /* We successfully logged the inode, attempt to sync the log. */
++ if (!ret) {
++ ret = btrfs_sync_log(trans, root, &ctx);
+ if (!ret) {
+- ret = btrfs_sync_log(trans, root, &ctx);
+- if (!ret) {
+- ret = btrfs_end_transaction(trans);
+- goto out;
+- }
+- }
+- if (!full_sync) {
+- ret = btrfs_wait_ordered_range(inode, start, len);
+- if (ret) {
+- btrfs_end_transaction(trans);
+- goto out;
+- }
++ ret = btrfs_end_transaction(trans);
++ goto out;
+ }
+- ret = btrfs_commit_transaction(trans);
+- } else {
++ }
++
++ /*
++ * At this point we need to commit the transaction because we had
++ * btrfs_need_log_full_commit() or some other error.
++ *
++ * If we didn't do a full sync we have to stop the trans handle, wait on
++ * the ordered extents, start it again and commit the transaction. If
++ * we attempt to wait on the ordered extents here we could deadlock with
++ * something like fallocate() that is holding the extent lock trying to
++ * start a transaction while some other thread is trying to commit the
++ * transaction while we (fsync) are currently holding the transaction
++ * open.
++ */
++ if (!full_sync) {
+ ret = btrfs_end_transaction(trans);
++ if (ret)
++ goto out;
++ ret = btrfs_wait_ordered_range(inode, start, len);
++ if (ret)
++ goto out;
++
++ /*
++ * This is safe to use here because we're only interested in
++ * making sure the transaction that had the ordered extents is
++ * committed. We aren't waiting on anything past this point,
++ * we're purely getting the transaction and committing it.
++ */
++ trans = btrfs_attach_transaction_barrier(root);
++ if (IS_ERR(trans)) {
++ ret = PTR_ERR(trans);
++
++ /*
++ * We committed the transaction and there's no currently
++ * running transaction, this means everything we care
++ * about made it to disk and we are done.
++ */
++ if (ret == -ENOENT)
++ ret = 0;
++ goto out;
++ }
+ }
++
++ ret = btrfs_commit_transaction(trans);
+ out:
+ ASSERT(list_empty(&ctx.list));
+ err = file_check_and_advance_wb_err(file);
+diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
+index 313d9d685adb7..33461b4f9c8b5 100644
+--- a/fs/btrfs/locking.c
++++ b/fs/btrfs/locking.c
+@@ -45,7 +45,6 @@ void __btrfs_tree_read_lock(struct extent_buffer *eb, enum btrfs_lock_nesting ne
+ start_ns = ktime_get_ns();
+
+ down_read_nested(&eb->lock, nest);
+- eb->lock_owner = current->pid;
+ trace_btrfs_tree_read_lock(eb, start_ns);
+ }
+
+@@ -62,7 +61,6 @@ void btrfs_tree_read_lock(struct extent_buffer *eb)
+ int btrfs_try_tree_read_lock(struct extent_buffer *eb)
+ {
+ if (down_read_trylock(&eb->lock)) {
+- eb->lock_owner = current->pid;
+ trace_btrfs_try_tree_read_lock(eb);
+ return 1;
+ }
+@@ -90,7 +88,6 @@ int btrfs_try_tree_write_lock(struct extent_buffer *eb)
+ void btrfs_tree_read_unlock(struct extent_buffer *eb)
+ {
+ trace_btrfs_tree_read_unlock(eb);
+- eb->lock_owner = 0;
+ up_read(&eb->lock);
+ }
+
+diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
+index 998e3f180d90e..6db7f50de84da 100644
+--- a/fs/btrfs/reflink.c
++++ b/fs/btrfs/reflink.c
+@@ -344,6 +344,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
+ int ret;
+ const u64 len = olen_aligned;
+ u64 last_dest_end = destoff;
++ u64 prev_extent_end = off;
+
+ ret = -ENOMEM;
+ buf = kvmalloc(fs_info->nodesize, GFP_KERNEL);
+@@ -363,7 +364,6 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
+ key.offset = off;
+
+ while (1) {
+- u64 next_key_min_offset = key.offset + 1;
+ struct btrfs_file_extent_item *extent;
+ u64 extent_gen;
+ int type;
+@@ -431,14 +431,21 @@ process_slot:
+ * The first search might have left us at an extent item that
+ * ends before our target range's start, can happen if we have
+ * holes and NO_HOLES feature enabled.
++ *
++ * Subsequent searches may leave us on a file range we have
++ * processed before - this happens due to a race with ordered
++ * extent completion for a file range that is outside our source
++ * range, but that range was part of a file extent item that
++ * also covered a leading part of our source range.
+ */
+- if (key.offset + datal <= off) {
++ if (key.offset + datal <= prev_extent_end) {
+ path->slots[0]++;
+ goto process_slot;
+ } else if (key.offset >= off + len) {
+ break;
+ }
+- next_key_min_offset = key.offset + datal;
++
++ prev_extent_end = key.offset + datal;
+ size = btrfs_item_size(leaf, slot);
+ read_extent_buffer(leaf, buf, btrfs_item_ptr_offset(leaf, slot),
+ size);
+@@ -550,7 +557,7 @@ process_slot:
+ break;
+
+ btrfs_release_path(path);
+- key.offset = next_key_min_offset;
++ key.offset = prev_extent_end;
+
+ if (fatal_signal_pending(current)) {
+ ret = -EINTR;
+diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
+index b228efe8ab6e2..0b2a387615f64 100644
+--- a/fs/btrfs/super.c
++++ b/fs/btrfs/super.c
+@@ -763,6 +763,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ compress_force = false;
+ no_compress++;
+ } else {
++ btrfs_err(info, "unrecognized compression value %s",
++ args[0].from);
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -821,8 +823,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ case Opt_thread_pool:
+ ret = match_int(&args[0], &intarg);
+ if (ret) {
++ btrfs_err(info, "unrecognized thread_pool value %s",
++ args[0].from);
+ goto out;
+ } else if (intarg == 0) {
++ btrfs_err(info, "invalid value 0 for thread_pool");
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -883,8 +888,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ break;
+ case Opt_ratio:
+ ret = match_int(&args[0], &intarg);
+- if (ret)
++ if (ret) {
++ btrfs_err(info, "unrecognized metadata_ratio value %s",
++ args[0].from);
+ goto out;
++ }
+ info->metadata_ratio = intarg;
+ btrfs_info(info, "metadata ratio %u",
+ info->metadata_ratio);
+@@ -901,6 +909,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ btrfs_set_and_info(info, DISCARD_ASYNC,
+ "turning on async discard");
+ } else {
++ btrfs_err(info, "unrecognized discard mode value %s",
++ args[0].from);
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -933,6 +943,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ btrfs_set_and_info(info, FREE_SPACE_TREE,
+ "enabling free space tree");
+ } else {
++ btrfs_err(info, "unrecognized space_cache value %s",
++ args[0].from);
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -1014,8 +1026,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ break;
+ case Opt_check_integrity_print_mask:
+ ret = match_int(&args[0], &intarg);
+- if (ret)
++ if (ret) {
++ btrfs_err(info,
++ "unrecognized check_integrity_print_mask value %s",
++ args[0].from);
+ goto out;
++ }
+ info->check_integrity_print_mask = intarg;
+ btrfs_info(info, "check_integrity_print_mask 0x%x",
+ info->check_integrity_print_mask);
+@@ -1030,13 +1046,15 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ goto out;
+ #endif
+ case Opt_fatal_errors:
+- if (strcmp(args[0].from, "panic") == 0)
++ if (strcmp(args[0].from, "panic") == 0) {
+ btrfs_set_opt(info->mount_opt,
+ PANIC_ON_FATAL_ERROR);
+- else if (strcmp(args[0].from, "bug") == 0)
++ } else if (strcmp(args[0].from, "bug") == 0) {
+ btrfs_clear_opt(info->mount_opt,
+ PANIC_ON_FATAL_ERROR);
+- else {
++ } else {
++ btrfs_err(info, "unrecognized fatal_errors value %s",
++ args[0].from);
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -1044,8 +1062,12 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ case Opt_commit_interval:
+ intarg = 0;
+ ret = match_int(&args[0], &intarg);
+- if (ret)
++ if (ret) {
++ btrfs_err(info, "unrecognized commit_interval value %s",
++ args[0].from);
++ ret = -EINVAL;
+ goto out;
++ }
+ if (intarg == 0) {
+ btrfs_info(info,
+ "using default commit interval %us",
+@@ -1059,8 +1081,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
+ break;
+ case Opt_rescue:
+ ret = parse_rescue_options(info, args[0].from);
+- if (ret < 0)
++ if (ret < 0) {
++ btrfs_err(info, "unrecognized rescue value %s",
++ args[0].from);
+ goto out;
++ }
+ break;
+ #ifdef CONFIG_BTRFS_DEBUG
+ case Opt_fragment_all:
+@@ -1986,6 +2011,14 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
+ if (ret)
+ goto restore;
+
++ /* V1 cache is not supported for subpage mount. */
++ if (fs_info->sectorsize < PAGE_SIZE && btrfs_test_opt(fs_info, SPACE_CACHE)) {
++ btrfs_warn(fs_info,
++ "v1 space cache is not supported for page size %lu with sectorsize %u",
++ PAGE_SIZE, fs_info->sectorsize);
++ ret = -EINVAL;
++ goto restore;
++ }
+ btrfs_remount_begin(fs_info, old_opts, *flags);
+ btrfs_resize_thread_pool(fs_info,
+ fs_info->thread_pool_size, old_thread_pool_size);
+diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
+index 179c1630bf561..6a8a00f28b192 100644
+--- a/fs/cifs/smb2pdu.c
++++ b/fs/cifs/smb2pdu.c
+@@ -543,6 +543,7 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
+ struct TCP_Server_Info *server, unsigned int *total_len)
+ {
+ char *pneg_ctxt;
++ char *hostname = NULL;
+ unsigned int ctxt_len, neg_context_count;
+
+ if (*total_len > 200) {
+@@ -570,16 +571,24 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
+ *total_len += ctxt_len;
+ pneg_ctxt += ctxt_len;
+
+- ctxt_len = build_netname_ctxt((struct smb2_netname_neg_context *)pneg_ctxt,
+- server->hostname);
+- *total_len += ctxt_len;
+- pneg_ctxt += ctxt_len;
+-
+ build_posix_ctxt((struct smb2_posix_neg_context *)pneg_ctxt);
+ *total_len += sizeof(struct smb2_posix_neg_context);
+ pneg_ctxt += sizeof(struct smb2_posix_neg_context);
+
+- neg_context_count = 4;
++ /*
++ * secondary channels don't have the hostname field populated
++ * use the hostname field in the primary channel instead
++ */
++ hostname = CIFS_SERVER_IS_CHAN(server) ?
++ server->primary_server->hostname : server->hostname;
++ if (hostname && (hostname[0] != 0)) {
++ ctxt_len = build_netname_ctxt((struct smb2_netname_neg_context *)pneg_ctxt,
++ hostname);
++ *total_len += ctxt_len;
++ pneg_ctxt += ctxt_len;
++ neg_context_count = 4;
++ } else /* second channels do not have a hostname */
++ neg_context_count = 3;
+
+ if (server->compress_algorithm) {
+ build_compression_ctxt((struct smb2_compression_capabilities_context *)
+diff --git a/fs/f2fs/iostat.c b/fs/f2fs/iostat.c
+index be599f31d3c48..d84c5f6cc09d7 100644
+--- a/fs/f2fs/iostat.c
++++ b/fs/f2fs/iostat.c
+@@ -91,8 +91,9 @@ static inline void __record_iostat_latency(struct f2fs_sb_info *sbi)
+ unsigned int cnt;
+ struct f2fs_iostat_latency iostat_lat[MAX_IO_TYPE][NR_PAGE_TYPE];
+ struct iostat_lat_info *io_lat = sbi->iostat_io_lat;
++ unsigned long flags;
+
+- spin_lock_bh(&sbi->iostat_lat_lock);
++ spin_lock_irqsave(&sbi->iostat_lat_lock, flags);
+ for (idx = 0; idx < MAX_IO_TYPE; idx++) {
+ for (io = 0; io < NR_PAGE_TYPE; io++) {
+ cnt = io_lat->bio_cnt[idx][io];
+@@ -106,7 +107,7 @@ static inline void __record_iostat_latency(struct f2fs_sb_info *sbi)
+ io_lat->bio_cnt[idx][io] = 0;
+ }
+ }
+- spin_unlock_bh(&sbi->iostat_lat_lock);
++ spin_unlock_irqrestore(&sbi->iostat_lat_lock, flags);
+
+ trace_f2fs_iostat_latency(sbi, iostat_lat);
+ }
+@@ -115,14 +116,15 @@ static inline void f2fs_record_iostat(struct f2fs_sb_info *sbi)
+ {
+ unsigned long long iostat_diff[NR_IO_TYPE];
+ int i;
++ unsigned long flags;
+
+ if (time_is_after_jiffies(sbi->iostat_next_period))
+ return;
+
+ /* Need double check under the lock */
+- spin_lock_bh(&sbi->iostat_lock);
++ spin_lock_irqsave(&sbi->iostat_lock, flags);
+ if (time_is_after_jiffies(sbi->iostat_next_period)) {
+- spin_unlock_bh(&sbi->iostat_lock);
++ spin_unlock_irqrestore(&sbi->iostat_lock, flags);
+ return;
+ }
+ sbi->iostat_next_period = jiffies +
+@@ -133,7 +135,7 @@ static inline void f2fs_record_iostat(struct f2fs_sb_info *sbi)
+ sbi->prev_rw_iostat[i];
+ sbi->prev_rw_iostat[i] = sbi->rw_iostat[i];
+ }
+- spin_unlock_bh(&sbi->iostat_lock);
++ spin_unlock_irqrestore(&sbi->iostat_lock, flags);
+
+ trace_f2fs_iostat(sbi, iostat_diff);
+
+@@ -145,25 +147,27 @@ void f2fs_reset_iostat(struct f2fs_sb_info *sbi)
+ struct iostat_lat_info *io_lat = sbi->iostat_io_lat;
+ int i;
+
+- spin_lock_bh(&sbi->iostat_lock);
++ spin_lock_irq(&sbi->iostat_lock);
+ for (i = 0; i < NR_IO_TYPE; i++) {
+ sbi->rw_iostat[i] = 0;
+ sbi->prev_rw_iostat[i] = 0;
+ }
+- spin_unlock_bh(&sbi->iostat_lock);
++ spin_unlock_irq(&sbi->iostat_lock);
+
+- spin_lock_bh(&sbi->iostat_lat_lock);
++ spin_lock_irq(&sbi->iostat_lat_lock);
+ memset(io_lat, 0, sizeof(struct iostat_lat_info));
+- spin_unlock_bh(&sbi->iostat_lat_lock);
++ spin_unlock_irq(&sbi->iostat_lat_lock);
+ }
+
+ void f2fs_update_iostat(struct f2fs_sb_info *sbi,
+ enum iostat_type type, unsigned long long io_bytes)
+ {
++ unsigned long flags;
++
+ if (!sbi->iostat_enable)
+ return;
+
+- spin_lock_bh(&sbi->iostat_lock);
++ spin_lock_irqsave(&sbi->iostat_lock, flags);
+ sbi->rw_iostat[type] += io_bytes;
+
+ if (type == APP_BUFFERED_IO || type == APP_DIRECT_IO)
+@@ -172,7 +176,7 @@ void f2fs_update_iostat(struct f2fs_sb_info *sbi,
+ if (type == APP_BUFFERED_READ_IO || type == APP_DIRECT_READ_IO)
+ sbi->rw_iostat[APP_READ_IO] += io_bytes;
+
+- spin_unlock_bh(&sbi->iostat_lock);
++ spin_unlock_irqrestore(&sbi->iostat_lock, flags);
+
+ f2fs_record_iostat(sbi);
+ }
+@@ -185,6 +189,7 @@ static inline void __update_iostat_latency(struct bio_iostat_ctx *iostat_ctx,
+ struct f2fs_sb_info *sbi = iostat_ctx->sbi;
+ struct iostat_lat_info *io_lat = sbi->iostat_io_lat;
+ int idx;
++ unsigned long flags;
+
+ if (!sbi->iostat_enable)
+ return;
+@@ -202,12 +207,12 @@ static inline void __update_iostat_latency(struct bio_iostat_ctx *iostat_ctx,
+ idx = WRITE_ASYNC_IO;
+ }
+
+- spin_lock_bh(&sbi->iostat_lat_lock);
++ spin_lock_irqsave(&sbi->iostat_lat_lock, flags);
+ io_lat->sum_lat[idx][iotype] += ts_diff;
+ io_lat->bio_cnt[idx][iotype]++;
+ if (ts_diff > io_lat->peak_lat[idx][iotype])
+ io_lat->peak_lat[idx][iotype] = ts_diff;
+- spin_unlock_bh(&sbi->iostat_lat_lock);
++ spin_unlock_irqrestore(&sbi->iostat_lat_lock, flags);
+ }
+
+ void iostat_update_and_unbind_ctx(struct bio *bio, int rw)
+diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
+index fffafd2aa4387..3764e12f19db0 100644
+--- a/fs/f2fs/namei.c
++++ b/fs/f2fs/namei.c
+@@ -92,8 +92,6 @@ static struct inode *f2fs_new_inode(struct user_namespace *mnt_userns,
+ if (test_opt(sbi, INLINE_XATTR))
+ set_inode_flag(inode, FI_INLINE_XATTR);
+
+- if (test_opt(sbi, INLINE_DATA) && f2fs_may_inline_data(inode))
+- set_inode_flag(inode, FI_INLINE_DATA);
+ if (f2fs_may_inline_dentry(inode))
+ set_inode_flag(inode, FI_INLINE_DENTRY);
+
+@@ -110,10 +108,6 @@ static struct inode *f2fs_new_inode(struct user_namespace *mnt_userns,
+
+ f2fs_init_extent_tree(inode, NULL);
+
+- stat_inc_inline_xattr(inode);
+- stat_inc_inline_inode(inode);
+- stat_inc_inline_dir(inode);
+-
+ F2FS_I(inode)->i_flags =
+ f2fs_mask_flags(mode, F2FS_I(dir)->i_flags & F2FS_FL_INHERITED);
+
+@@ -130,6 +124,14 @@ static struct inode *f2fs_new_inode(struct user_namespace *mnt_userns,
+ set_compress_context(inode);
+ }
+
++ /* Should enable inline_data after compression set */
++ if (test_opt(sbi, INLINE_DATA) && f2fs_may_inline_data(inode))
++ set_inode_flag(inode, FI_INLINE_DATA);
++
++ stat_inc_inline_xattr(inode);
++ stat_inc_inline_inode(inode);
++ stat_inc_inline_dir(inode);
++
+ f2fs_set_inode_flags(inode);
+
+ trace_f2fs_new_inode(inode, 0);
+@@ -328,6 +330,9 @@ static void set_compress_inode(struct f2fs_sb_info *sbi, struct inode *inode,
+ if (!is_extension_exist(name, ext[i], false))
+ continue;
+
++ /* Do not use inline_data with compression */
++ stat_dec_inline_inode(inode);
++ clear_inode_flag(inode, FI_INLINE_DATA);
+ set_compress_context(inode);
+ return;
+ }
+diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
+index a8d0fa2731cbe..aedc3d334113b 100644
+--- a/fs/f2fs/node.c
++++ b/fs/f2fs/node.c
+@@ -1454,7 +1454,9 @@ page_hit:
+ out_err:
+ ClearPageUptodate(page);
+ out_put_err:
+- f2fs_handle_page_eio(sbi, page->index, NODE);
++ /* ENOENT comes from read_node_page which is not an error. */
++ if (err != -ENOENT)
++ f2fs_handle_page_eio(sbi, page->index, NODE);
+ f2fs_put_page(page, 1);
+ return ERR_PTR(err);
+ }
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+index 68aab48838e41..e4186635aaa8d 100644
+--- a/fs/io_uring.c
++++ b/fs/io_uring.c
+@@ -926,7 +926,7 @@ struct io_kiocb {
+ /* used by request caches, completion batching and iopoll */
+ struct io_wq_work_node comp_list;
+ /* cache ->apoll->events */
+- int apoll_events;
++ __poll_t apoll_events;
+ };
+ atomic_t refs;
+ atomic_t poll_refs;
+@@ -5984,7 +5984,8 @@ static void io_apoll_task_func(struct io_kiocb *req, bool *locked)
+ io_req_complete_failed(req, ret);
+ }
+
+-static void __io_poll_execute(struct io_kiocb *req, int mask, int events)
++static void __io_poll_execute(struct io_kiocb *req, int mask,
++ __poll_t __maybe_unused events)
+ {
+ req->result = mask;
+ /*
+@@ -5993,7 +5994,6 @@ static void __io_poll_execute(struct io_kiocb *req, int mask, int events)
+ * CPU. We want to avoid pulling in req->apoll->events for that
+ * case.
+ */
+- req->apoll_events = events;
+ if (req->opcode == IORING_OP_POLL_ADD)
+ req->io_task_work.func = io_poll_task_func;
+ else
+@@ -6003,7 +6003,8 @@ static void __io_poll_execute(struct io_kiocb *req, int mask, int events)
+ io_req_task_work_add(req, false);
+ }
+
+-static inline void io_poll_execute(struct io_kiocb *req, int res, int events)
++static inline void io_poll_execute(struct io_kiocb *req, int res,
++ __poll_t events)
+ {
+ if (io_poll_get_ownership(req))
+ __io_poll_execute(req, res, events);
+@@ -6142,6 +6143,8 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
+ io_init_poll_iocb(poll, mask, io_poll_wake);
+ poll->file = req->file;
+
++ req->apoll_events = poll->events;
++
+ ipt->pt._key = mask;
+ ipt->req = req;
+ ipt->error = 0;
+@@ -6172,8 +6175,11 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
+
+ if (mask) {
+ /* can't multishot if failed, just queue the event we've got */
+- if (unlikely(ipt->error || !ipt->nr_entries))
++ if (unlikely(ipt->error || !ipt->nr_entries)) {
+ poll->events |= EPOLLONESHOT;
++ req->apoll_events |= EPOLLONESHOT;
++ ipt->error = 0;
++ }
+ __io_poll_execute(req, mask, poll->events);
+ return 0;
+ }
+@@ -6386,7 +6392,7 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
+ return -EINVAL;
+
+ io_req_set_refcount(req);
+- req->apoll_events = poll->events = io_poll_parse_events(sqe, flags);
++ poll->events = io_poll_parse_events(sqe, flags);
+ return 0;
+ }
+
+@@ -6399,6 +6405,8 @@ static int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
+ ipt.pt._qproc = io_poll_queue_proc;
+
+ ret = __io_arm_poll_handler(req, &req->poll, &ipt, poll->events);
++ if (!ret && ipt.error)
++ req_set_fail(req);
+ ret = ret ?: ipt.error;
+ if (ret)
+ __io_req_complete(req, issue_flags, ret, 0);
+diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
+index 6f1b8ddc6f7a4..54dda2e19ed12 100644
+--- a/fs/proc/vmcore.c
++++ b/fs/proc/vmcore.c
+@@ -26,6 +26,7 @@
+ #include <linux/vmalloc.h>
+ #include <linux/pagemap.h>
+ #include <linux/uaccess.h>
++#include <linux/uio.h>
+ #include <linux/cc_platform.h>
+ #include <asm/io.h>
+ #include "internal.h"
+@@ -128,9 +129,8 @@ static int open_vmcore(struct inode *inode, struct file *file)
+ }
+
+ /* Reads a page from the oldmem device from given offset. */
+-ssize_t read_from_oldmem(char *buf, size_t count,
+- u64 *ppos, int userbuf,
+- bool encrypted)
++static ssize_t read_from_oldmem_iter(struct iov_iter *iter, size_t count,
++ u64 *ppos, bool encrypted)
+ {
+ unsigned long pfn, offset;
+ size_t nr_bytes;
+@@ -152,29 +152,23 @@ ssize_t read_from_oldmem(char *buf, size_t count,
+
+ /* If pfn is not ram, return zeros for sparse dump files */
+ if (!pfn_is_ram(pfn)) {
+- tmp = 0;
+- if (!userbuf)
+- memset(buf, 0, nr_bytes);
+- else if (clear_user(buf, nr_bytes))
+- tmp = -EFAULT;
++ tmp = iov_iter_zero(nr_bytes, iter);
+ } else {
+ if (encrypted)
+- tmp = copy_oldmem_page_encrypted(pfn, buf,
++ tmp = copy_oldmem_page_encrypted(iter, pfn,
+ nr_bytes,
+- offset,
+- userbuf);
++ offset);
+ else
+- tmp = copy_oldmem_page(pfn, buf, nr_bytes,
+- offset, userbuf);
++ tmp = copy_oldmem_page(iter, pfn, nr_bytes,
++ offset);
+ }
+- if (tmp < 0) {
++ if (tmp < nr_bytes) {
+ srcu_read_unlock(&vmcore_cb_srcu, idx);
+- return tmp;
++ return -EFAULT;
+ }
+
+ *ppos += nr_bytes;
+ count -= nr_bytes;
+- buf += nr_bytes;
+ read += nr_bytes;
+ ++pfn;
+ offset = 0;
+@@ -184,6 +178,27 @@ ssize_t read_from_oldmem(char *buf, size_t count,
+ return read;
+ }
+
++ssize_t read_from_oldmem(char *buf, size_t count,
++ u64 *ppos, int userbuf,
++ bool encrypted)
++{
++ struct iov_iter iter;
++ struct iovec iov;
++ struct kvec kvec;
++
++ if (userbuf) {
++ iov.iov_base = (__force void __user *)buf;
++ iov.iov_len = count;
++ iov_iter_init(&iter, READ, &iov, 1, count);
++ } else {
++ kvec.iov_base = buf;
++ kvec.iov_len = count;
++ iov_iter_kvec(&iter, READ, &kvec, 1, count);
++ }
++
++ return read_from_oldmem_iter(&iter, count, ppos, encrypted);
++}
++
+ /*
+ * Architectures may override this function to allocate ELF header in 2nd kernel
+ */
+@@ -228,11 +243,10 @@ int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
+ /*
+ * Architectures which support memory encryption override this.
+ */
+-ssize_t __weak
+-copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize,
+- unsigned long offset, int userbuf)
++ssize_t __weak copy_oldmem_page_encrypted(struct iov_iter *iter,
++ unsigned long pfn, size_t csize, unsigned long offset)
+ {
+- return copy_oldmem_page(pfn, buf, csize, offset, userbuf);
++ return copy_oldmem_page(iter, pfn, csize, offset);
+ }
+
+ /*
+diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
+index 620821549b23a..a1cf7d5c03c7c 100644
+--- a/include/linux/crash_dump.h
++++ b/include/linux/crash_dump.h
+@@ -24,11 +24,10 @@ extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,
+ unsigned long from, unsigned long pfn,
+ unsigned long size, pgprot_t prot);
+
+-extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
+- unsigned long, int);
+-extern ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf,
+- size_t csize, unsigned long offset,
+- int userbuf);
++ssize_t copy_oldmem_page(struct iov_iter *i, unsigned long pfn, size_t csize,
++ unsigned long offset);
++ssize_t copy_oldmem_page_encrypted(struct iov_iter *iter, unsigned long pfn,
++ size_t csize, unsigned long offset);
+
+ void vmcore_cleanup(void);
+
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index b0183450e484b..da08cce2a9fa8 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -3188,6 +3188,7 @@ enum mf_flags {
+ MF_MUST_KILL = 1 << 2,
+ MF_SOFT_OFFLINE = 1 << 3,
+ MF_UNPOISON = 1 << 4,
++ MF_SW_SIMULATED = 1 << 5,
+ };
+ extern int memory_failure(unsigned long pfn, int flags);
+ extern void memory_failure_queue(unsigned long pfn, int flags);
+diff --git a/include/linux/ratelimit_types.h b/include/linux/ratelimit_types.h
+index c21c7f8103e2b..002266693e506 100644
+--- a/include/linux/ratelimit_types.h
++++ b/include/linux/ratelimit_types.h
+@@ -23,12 +23,16 @@ struct ratelimit_state {
+ unsigned long flags;
+ };
+
+-#define RATELIMIT_STATE_INIT(name, interval_init, burst_init) { \
+- .lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \
+- .interval = interval_init, \
+- .burst = burst_init, \
++#define RATELIMIT_STATE_INIT_FLAGS(name, interval_init, burst_init, flags_init) { \
++ .lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \
++ .interval = interval_init, \
++ .burst = burst_init, \
++ .flags = flags_init, \
+ }
+
++#define RATELIMIT_STATE_INIT(name, interval_init, burst_init) \
++ RATELIMIT_STATE_INIT_FLAGS(name, interval_init, burst_init, 0)
++
+ #define RATELIMIT_STATE_INIT_DISABLED \
+ RATELIMIT_STATE_INIT(ratelimit_state, 0, DEFAULT_RATELIMIT_BURST)
+
+diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
+index 234d70ae5f4cb..48e4c59d85e24 100644
+--- a/include/net/inet_sock.h
++++ b/include/net/inet_sock.h
+@@ -252,6 +252,11 @@ struct inet_sock {
+ #define IP_CMSG_CHECKSUM BIT(7)
+ #define IP_CMSG_RECVFRAGSIZE BIT(8)
+
++static inline bool sk_is_inet(struct sock *sk)
++{
++ return sk->sk_family == AF_INET || sk->sk_family == AF_INET6;
++}
++
+ /**
+ * sk_to_full_sk - Access to a full socket
+ * @sk: pointer to a socket
+diff --git a/include/trace/events/libata.h b/include/trace/events/libata.h
+index d4e631aa976fb..6025dd8ba4aa1 100644
+--- a/include/trace/events/libata.h
++++ b/include/trace/events/libata.h
+@@ -288,6 +288,7 @@ DECLARE_EVENT_CLASS(ata_qc_complete_template,
+ __entry->hob_feature = qc->result_tf.hob_feature;
+ __entry->nsect = qc->result_tf.nsect;
+ __entry->hob_nsect = qc->result_tf.hob_nsect;
++ __entry->flags = qc->flags;
+ ),
+
+ TP_printk("ata_port=%u ata_dev=%u tag=%d flags=%s status=%s " \
+diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
+index e978f36e6be86..8d0b68a170422 100644
+--- a/kernel/dma/direct.c
++++ b/kernel/dma/direct.c
+@@ -357,7 +357,7 @@ void dma_direct_free(struct device *dev, size_t size,
+ } else {
+ if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED))
+ arch_dma_clear_uncached(cpu_addr, size);
+- if (dma_set_encrypted(dev, cpu_addr, 1 << page_order))
++ if (dma_set_encrypted(dev, cpu_addr, size))
+ return;
+ }
+
+@@ -392,7 +392,6 @@ void dma_direct_free_pages(struct device *dev, size_t size,
+ struct page *page, dma_addr_t dma_addr,
+ enum dma_data_direction dir)
+ {
+- unsigned int page_order = get_order(size);
+ void *vaddr = page_address(page);
+
+ /* If cpu_addr is not from an atomic pool, dma_free_from_pool() fails */
+@@ -400,7 +399,7 @@ void dma_direct_free_pages(struct device *dev, size_t size,
+ dma_free_from_pool(dev, vaddr, size))
+ return;
+
+- if (dma_set_encrypted(dev, vaddr, 1 << page_order))
++ if (dma_set_encrypted(dev, vaddr, size))
+ return;
+ __dma_direct_free_pages(dev, page, size);
+ }
+diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
+index b56833700d23f..c69d82273ce78 100644
+--- a/kernel/trace/rethook.c
++++ b/kernel/trace/rethook.c
+@@ -154,6 +154,15 @@ struct rethook_node *rethook_try_get(struct rethook *rh)
+ if (unlikely(!handler))
+ return NULL;
+
++ /*
++ * This expects the caller will set up a rethook on a function entry.
++ * When the function returns, the rethook will eventually be reclaimed
++ * or released in the rethook_recycle() with call_rcu().
++ * This means the caller must be run in the RCU-availabe context.
++ */
++ if (unlikely(!rcu_is_watching()))
++ return NULL;
++
+ fn = freelist_try_get(&rh->pool);
+ if (!fn)
+ return NULL;
+diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
+index 47cebef78532c..13439743285c5 100644
+--- a/kernel/trace/trace_kprobe.c
++++ b/kernel/trace/trace_kprobe.c
+@@ -1718,8 +1718,17 @@ static int
+ kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
+ {
+ struct kretprobe *rp = get_kretprobe(ri);
+- struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
++ struct trace_kprobe *tk;
++
++ /*
++ * There is a small chance that get_kretprobe(ri) returns NULL when
++ * the kretprobe is unregister on another CPU between kretprobe's
++ * trampoline_handler and this function.
++ */
++ if (unlikely(!rp))
++ return 0;
+
++ tk = container_of(rp, struct trace_kprobe, rp);
+ raw_cpu_inc(*tk->nhit);
+
+ if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
+diff --git a/mm/filemap.c b/mm/filemap.c
+index 61dd39990fda2..be1859a276e1e 100644
+--- a/mm/filemap.c
++++ b/mm/filemap.c
+@@ -2385,6 +2385,8 @@ static void filemap_get_read_batch(struct address_space *mapping,
+ continue;
+ if (xas.xa_index > max || xa_is_value(folio))
+ break;
++ if (xa_is_sibling(folio))
++ break;
+ if (!folio_try_get_rcu(folio))
+ goto retry;
+
+diff --git a/mm/hwpoison-inject.c b/mm/hwpoison-inject.c
+index bb0cea5468cbf..f483742e9dea8 100644
+--- a/mm/hwpoison-inject.c
++++ b/mm/hwpoison-inject.c
+@@ -48,7 +48,7 @@ static int hwpoison_inject(void *data, u64 val)
+
+ inject:
+ pr_info("Injecting memory failure at pfn %#lx\n", pfn);
+- err = memory_failure(pfn, 0);
++ err = memory_failure(pfn, MF_SW_SIMULATED);
+ return (err == -EOPNOTSUPP) ? 0 : err;
+ }
+
+diff --git a/mm/madvise.c b/mm/madvise.c
+index 1873616a37d2e..4d29a11c18e9e 100644
+--- a/mm/madvise.c
++++ b/mm/madvise.c
+@@ -1101,7 +1101,7 @@ static int madvise_inject_error(int behavior,
+ } else {
+ pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n",
+ pfn, start);
+- ret = memory_failure(pfn, MF_COUNT_INCREASED);
++ ret = memory_failure(pfn, MF_COUNT_INCREASED | MF_SW_SIMULATED);
+ if (ret == -EOPNOTSUPP)
+ ret = 0;
+ }
+diff --git a/mm/memory-failure.c b/mm/memory-failure.c
+index d4a4adcca01f3..94dac77f5ebad 100644
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -68,6 +68,8 @@ int sysctl_memory_failure_recovery __read_mostly = 1;
+
+ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
+
++static bool hw_memory_failure __read_mostly = false;
++
+ static bool __page_handle_poison(struct page *page)
+ {
+ int ret;
+@@ -1780,6 +1782,9 @@ int memory_failure(unsigned long pfn, int flags)
+
+ mutex_lock(&mf_mutex);
+
++ if (!(flags & MF_SW_SIMULATED))
++ hw_memory_failure = true;
++
+ p = pfn_to_online_page(pfn);
+ if (!p) {
+ res = arch_memory_failure(pfn, flags);
+@@ -2138,6 +2143,13 @@ int unpoison_memory(unsigned long pfn)
+
+ mutex_lock(&mf_mutex);
+
++ if (hw_memory_failure) {
++ unpoison_pr_info("Unpoison: Disabled after HW memory failure %#lx\n",
++ pfn, &unpoison_rs);
++ ret = -EOPNOTSUPP;
++ goto unlock_mutex;
++ }
++
+ if (!PageHWPoison(p)) {
+ unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
+ pfn, &unpoison_rs);
+diff --git a/mm/readahead.c b/mm/readahead.c
+index 4a60cdb64262a..38635af5bab75 100644
+--- a/mm/readahead.c
++++ b/mm/readahead.c
+@@ -508,6 +508,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
+ new_order--;
+ }
+
++ filemap_invalidate_lock_shared(mapping);
+ while (index <= limit) {
+ unsigned int order = new_order;
+
+@@ -534,6 +535,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
+ }
+
+ read_pages(ractl);
++ filemap_invalidate_unlock_shared(mapping);
+
+ /*
+ * If there were already pages in the page cache, then we may have
+diff --git a/mm/slub.c b/mm/slub.c
+index ed5c2c03a47aa..46de927322fc4 100644
+--- a/mm/slub.c
++++ b/mm/slub.c
+@@ -2939,6 +2939,7 @@ redo:
+
+ if (!freelist) {
+ c->slab = NULL;
++ c->tid = next_tid(c->tid);
+ local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+ stat(s, DEACTIVATE_BYPASS);
+ goto new_slab;
+@@ -2971,6 +2972,7 @@ deactivate_slab:
+ freelist = c->freelist;
+ c->slab = NULL;
+ c->freelist = NULL;
++ c->tid = next_tid(c->tid);
+ local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+ deactivate_slab(s, slab, freelist);
+
+diff --git a/mm/swap.c b/mm/swap.c
+index 7e320ec08c6ae..8a98d21d2786c 100644
+--- a/mm/swap.c
++++ b/mm/swap.c
+@@ -881,7 +881,7 @@ void lru_cache_disable(void)
+ * lru_disable_count = 0 will have exited the critical
+ * section when synchronize_rcu() returns.
+ */
+- synchronize_rcu();
++ synchronize_rcu_expedited();
+ #ifdef CONFIG_SMP
+ __lru_add_drain_all(true);
+ #else
+diff --git a/net/core/dev.c b/net/core/dev.c
+index 0784c339cd7d8..842917883adb4 100644
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -396,16 +396,18 @@ static void list_netdevice(struct net_device *dev)
+ /* Device list removal
+ * caller must respect a RCU grace period before freeing/reusing dev
+ */
+-static void unlist_netdevice(struct net_device *dev)
++static void unlist_netdevice(struct net_device *dev, bool lock)
+ {
+ ASSERT_RTNL();
+
+ /* Unlink dev from the device chain */
+- write_lock(&dev_base_lock);
++ if (lock)
++ write_lock(&dev_base_lock);
+ list_del_rcu(&dev->dev_list);
+ netdev_name_node_del(dev->name_node);
+ hlist_del_rcu(&dev->index_hlist);
+- write_unlock(&dev_base_lock);
++ if (lock)
++ write_unlock(&dev_base_lock);
+
+ dev_base_seq_inc(dev_net(dev));
+ }
+@@ -9963,11 +9965,11 @@ int register_netdevice(struct net_device *dev)
+ goto err_uninit;
+
+ ret = netdev_register_kobject(dev);
+- if (ret) {
+- dev->reg_state = NETREG_UNREGISTERED;
++ write_lock(&dev_base_lock);
++ dev->reg_state = ret ? NETREG_UNREGISTERED : NETREG_REGISTERED;
++ write_unlock(&dev_base_lock);
++ if (ret)
+ goto err_uninit;
+- }
+- dev->reg_state = NETREG_REGISTERED;
+
+ __netdev_update_features(dev);
+
+@@ -10249,7 +10251,9 @@ void netdev_run_todo(void)
+ continue;
+ }
+
++ write_lock(&dev_base_lock);
+ dev->reg_state = NETREG_UNREGISTERED;
++ write_unlock(&dev_base_lock);
+ linkwatch_forget_dev(dev);
+ }
+
+@@ -10727,9 +10731,10 @@ void unregister_netdevice_many(struct list_head *head)
+
+ list_for_each_entry(dev, head, unreg_list) {
+ /* And unlink it from device chain. */
+- unlist_netdevice(dev);
+-
++ write_lock(&dev_base_lock);
++ unlist_netdevice(dev, false);
+ dev->reg_state = NETREG_UNREGISTERING;
++ write_unlock(&dev_base_lock);
+ }
+ flush_all_backlogs();
+
+@@ -10876,7 +10881,7 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
+ dev_close(dev);
+
+ /* And unlink it from device chain */
+- unlist_netdevice(dev);
++ unlist_netdevice(dev, true);
+
+ synchronize_net();
+
+diff --git a/net/core/filter.c b/net/core/filter.c
+index 8847316ee20e0..af1e77f2f24a8 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -6506,10 +6506,21 @@ __bpf_sk_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
+ ifindex, proto, netns_id, flags);
+
+ if (sk) {
+- sk = sk_to_full_sk(sk);
+- if (!sk_fullsock(sk)) {
++ struct sock *sk2 = sk_to_full_sk(sk);
++
++ /* sk_to_full_sk() may return (sk)->rsk_listener, so make sure the original sk
++ * sock refcnt is decremented to prevent a request_sock leak.
++ */
++ if (!sk_fullsock(sk2))
++ sk2 = NULL;
++ if (sk2 != sk) {
+ sock_gen_put(sk);
+- return NULL;
++ /* Ensure there is no need to bump sk2 refcnt */
++ if (unlikely(sk2 && !sock_flag(sk2, SOCK_RCU_FREE))) {
++ WARN_ONCE(1, "Found non-RCU, unreferenced socket!");
++ return NULL;
++ }
++ sk = sk2;
+ }
+ }
+
+@@ -6543,10 +6554,21 @@ bpf_sk_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
+ flags);
+
+ if (sk) {
+- sk = sk_to_full_sk(sk);
+- if (!sk_fullsock(sk)) {
++ struct sock *sk2 = sk_to_full_sk(sk);
++
++ /* sk_to_full_sk() may return (sk)->rsk_listener, so make sure the original sk
++ * sock refcnt is decremented to prevent a request_sock leak.
++ */
++ if (!sk_fullsock(sk2))
++ sk2 = NULL;
++ if (sk2 != sk) {
+ sock_gen_put(sk);
+- return NULL;
++ /* Ensure there is no need to bump sk2 refcnt */
++ if (unlikely(sk2 && !sock_flag(sk2, SOCK_RCU_FREE))) {
++ WARN_ONCE(1, "Found non-RCU, unreferenced socket!");
++ return NULL;
++ }
++ sk = sk2;
+ }
+ }
+
+diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
+index 9cbc1c8289bcd..9ee57997354a2 100644
+--- a/net/core/net-sysfs.c
++++ b/net/core/net-sysfs.c
+@@ -32,6 +32,7 @@ static const char fmt_dec[] = "%d\n";
+ static const char fmt_ulong[] = "%lu\n";
+ static const char fmt_u64[] = "%llu\n";
+
++/* Caller holds RTNL or dev_base_lock */
+ static inline int dev_isalive(const struct net_device *dev)
+ {
+ return dev->reg_state <= NETREG_REGISTERED;
+diff --git a/net/core/skmsg.c b/net/core/skmsg.c
+index cc381165ea080..ede0af308f404 100644
+--- a/net/core/skmsg.c
++++ b/net/core/skmsg.c
+@@ -695,6 +695,11 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)
+
+ write_lock_bh(&sk->sk_callback_lock);
+
++ if (sk_is_inet(sk) && inet_csk_has_ulp(sk)) {
++ psock = ERR_PTR(-EINVAL);
++ goto out;
++ }
++
+ if (sk->sk_user_data) {
+ psock = ERR_PTR(-EBUSY);
+ goto out;
+diff --git a/net/ethtool/eeprom.c b/net/ethtool/eeprom.c
+index 7e6b37a54add3..1c94bb8ea03f2 100644
+--- a/net/ethtool/eeprom.c
++++ b/net/ethtool/eeprom.c
+@@ -36,7 +36,7 @@ static int fallback_set_params(struct eeprom_req_info *request,
+ if (request->page)
+ offset = request->page * ETH_MODULE_EEPROM_PAGE_LEN + offset;
+
+- if (modinfo->type == ETH_MODULE_SFF_8079 &&
++ if (modinfo->type == ETH_MODULE_SFF_8472 &&
+ request->i2c_address == 0x51)
+ offset += ETH_MODULE_EEPROM_PAGE_LEN * 2;
+
+diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
+index bc8dfdf1c48ad..3186735179766 100644
+--- a/net/ipv4/ip_gre.c
++++ b/net/ipv4/ip_gre.c
+@@ -524,7 +524,6 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
+ int tunnel_hlen;
+ int version;
+ int nhoff;
+- int thoff;
+
+ tun_info = skb_tunnel_info(skb);
+ if (unlikely(!tun_info || !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
+@@ -558,10 +557,16 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
+ (ntohs(ip_hdr(skb)->tot_len) > skb->len - nhoff))
+ truncate = true;
+
+- thoff = skb_transport_header(skb) - skb_mac_header(skb);
+- if (skb->protocol == htons(ETH_P_IPV6) &&
+- (ntohs(ipv6_hdr(skb)->payload_len) > skb->len - thoff))
+- truncate = true;
++ if (skb->protocol == htons(ETH_P_IPV6)) {
++ int thoff;
++
++ if (skb_transport_header_was_set(skb))
++ thoff = skb_transport_header(skb) - skb_mac_header(skb);
++ else
++ thoff = nhoff + sizeof(struct ipv6hdr);
++ if (ntohs(ipv6_hdr(skb)->payload_len) > skb->len - thoff)
++ truncate = true;
++ }
+
+ if (version == 1) {
+ erspan_build_header(skb, ntohl(tunnel_id_to_key32(key->tun_id)),
+diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
+index 4e5ceca7ff7f9..9dccbf863f826 100644
+--- a/net/ipv4/ping.c
++++ b/net/ipv4/ping.c
+@@ -319,12 +319,16 @@ static int ping_check_bind_addr(struct sock *sk, struct inet_sock *isk,
+ pr_debug("ping_check_bind_addr(sk=%p,addr=%pI4,port=%d)\n",
+ sk, &addr->sin_addr.s_addr, ntohs(addr->sin_port));
+
++ if (addr->sin_addr.s_addr == htonl(INADDR_ANY))
++ return 0;
++
+ tb_id = l3mdev_fib_table_by_index(net, sk->sk_bound_dev_if) ? : tb_id;
+ chk_addr_ret = inet_addr_type_table(net, addr->sin_addr.s_addr, tb_id);
+
+- if (!inet_addr_valid_or_nonlocal(net, inet_sk(sk),
+- addr->sin_addr.s_addr,
+- chk_addr_ret))
++ if (chk_addr_ret == RTN_MULTICAST ||
++ chk_addr_ret == RTN_BROADCAST ||
++ (chk_addr_ret != RTN_LOCAL &&
++ !inet_can_nonlocal_bind(net, isk)))
+ return -EADDRNOTAVAIL;
+
+ #if IS_ENABLED(CONFIG_IPV6)
+diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
+index 1cdcb4df0eb7e..2c597a4e429ab 100644
+--- a/net/ipv4/tcp_bpf.c
++++ b/net/ipv4/tcp_bpf.c
+@@ -612,9 +612,6 @@ int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
+ return 0;
+ }
+
+- if (inet_csk_has_ulp(sk))
+- return -EINVAL;
+-
+ if (sk->sk_family == AF_INET6) {
+ if (tcp_bpf_assert_proto_ops(psock->sk_proto))
+ return -EINVAL;
+diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
+index 5136959b3dc5d..b996ccaff56e3 100644
+--- a/net/ipv6/ip6_gre.c
++++ b/net/ipv6/ip6_gre.c
+@@ -944,7 +944,6 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
+ __be16 proto;
+ __u32 mtu;
+ int nhoff;
+- int thoff;
+
+ if (!pskb_inet_may_pull(skb))
+ goto tx_err;
+@@ -965,10 +964,16 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
+ (ntohs(ip_hdr(skb)->tot_len) > skb->len - nhoff))
+ truncate = true;
+
+- thoff = skb_transport_header(skb) - skb_mac_header(skb);
+- if (skb->protocol == htons(ETH_P_IPV6) &&
+- (ntohs(ipv6_hdr(skb)->payload_len) > skb->len - thoff))
+- truncate = true;
++ if (skb->protocol == htons(ETH_P_IPV6)) {
++ int thoff;
++
++ if (skb_transport_header_was_set(skb))
++ thoff = skb_transport_header(skb) - skb_mac_header(skb);
++ else
++ thoff = nhoff + sizeof(struct ipv6hdr);
++ if (ntohs(ipv6_hdr(skb)->payload_len) > skb->len - thoff)
++ truncate = true;
++ }
+
+ if (skb_cow_head(skb, dev->needed_headroom ?: t->hlen))
+ goto tx_err;
+diff --git a/net/netfilter/nf_dup_netdev.c b/net/netfilter/nf_dup_netdev.c
+index 7873bd1389c36..a8e2425e43b0d 100644
+--- a/net/netfilter/nf_dup_netdev.c
++++ b/net/netfilter/nf_dup_netdev.c
+@@ -13,14 +13,31 @@
+ #include <net/netfilter/nf_tables_offload.h>
+ #include <net/netfilter/nf_dup_netdev.h>
+
+-static void nf_do_netdev_egress(struct sk_buff *skb, struct net_device *dev)
++#define NF_RECURSION_LIMIT 2
++
++static DEFINE_PER_CPU(u8, nf_dup_skb_recursion);
++
++static void nf_do_netdev_egress(struct sk_buff *skb, struct net_device *dev,
++ enum nf_dev_hooks hook)
+ {
+- if (skb_mac_header_was_set(skb))
++ if (__this_cpu_read(nf_dup_skb_recursion) > NF_RECURSION_LIMIT)
++ goto err;
++
++ if (hook == NF_NETDEV_INGRESS && skb_mac_header_was_set(skb)) {
++ if (skb_cow_head(skb, skb->mac_len))
++ goto err;
++
+ skb_push(skb, skb->mac_len);
++ }
+
+ skb->dev = dev;
+ skb_clear_tstamp(skb);
++ __this_cpu_inc(nf_dup_skb_recursion);
+ dev_queue_xmit(skb);
++ __this_cpu_dec(nf_dup_skb_recursion);
++ return;
++err:
++ kfree_skb(skb);
+ }
+
+ void nf_fwd_netdev_egress(const struct nft_pktinfo *pkt, int oif)
+@@ -33,7 +50,7 @@ void nf_fwd_netdev_egress(const struct nft_pktinfo *pkt, int oif)
+ return;
+ }
+
+- nf_do_netdev_egress(pkt->skb, dev);
++ nf_do_netdev_egress(pkt->skb, dev, nft_hook(pkt));
+ }
+ EXPORT_SYMBOL_GPL(nf_fwd_netdev_egress);
+
+@@ -48,7 +65,7 @@ void nf_dup_netdev_egress(const struct nft_pktinfo *pkt, int oif)
+
+ skb = skb_clone(pkt->skb, GFP_ATOMIC);
+ if (skb)
+- nf_do_netdev_egress(skb, dev);
++ nf_do_netdev_egress(skb, dev, nft_hook(pkt));
+ }
+ EXPORT_SYMBOL_GPL(nf_dup_netdev_egress);
+
+diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
+index ac4859241e177..55d2d49c34259 100644
+--- a/net/netfilter/nft_meta.c
++++ b/net/netfilter/nft_meta.c
+@@ -14,6 +14,7 @@
+ #include <linux/in.h>
+ #include <linux/ip.h>
+ #include <linux/ipv6.h>
++#include <linux/random.h>
+ #include <linux/smp.h>
+ #include <linux/static_key.h>
+ #include <net/dst.h>
+@@ -32,8 +33,6 @@
+ #define NFT_META_SECS_PER_DAY 86400
+ #define NFT_META_DAYS_PER_WEEK 7
+
+-static DEFINE_PER_CPU(struct rnd_state, nft_prandom_state);
+-
+ static u8 nft_meta_weekday(void)
+ {
+ time64_t secs = ktime_get_real_seconds();
+@@ -271,13 +270,6 @@ static bool nft_meta_get_eval_ifname(enum nft_meta_keys key, u32 *dest,
+ return true;
+ }
+
+-static noinline u32 nft_prandom_u32(void)
+-{
+- struct rnd_state *state = this_cpu_ptr(&nft_prandom_state);
+-
+- return prandom_u32_state(state);
+-}
+-
+ #ifdef CONFIG_IP_ROUTE_CLASSID
+ static noinline bool
+ nft_meta_get_eval_rtclassid(const struct sk_buff *skb, u32 *dest)
+@@ -389,7 +381,7 @@ void nft_meta_get_eval(const struct nft_expr *expr,
+ break;
+ #endif
+ case NFT_META_PRANDOM:
+- *dest = nft_prandom_u32();
++ *dest = get_random_u32();
+ break;
+ #ifdef CONFIG_XFRM
+ case NFT_META_SECPATH:
+@@ -518,7 +510,6 @@ int nft_meta_get_init(const struct nft_ctx *ctx,
+ len = IFNAMSIZ;
+ break;
+ case NFT_META_PRANDOM:
+- prandom_init_once(&nft_prandom_state);
+ len = sizeof(u32);
+ break;
+ #ifdef CONFIG_XFRM
+diff --git a/net/netfilter/nft_numgen.c b/net/netfilter/nft_numgen.c
+index 81b40c663d86a..45d3dc9e96f2c 100644
+--- a/net/netfilter/nft_numgen.c
++++ b/net/netfilter/nft_numgen.c
+@@ -9,12 +9,11 @@
+ #include <linux/netlink.h>
+ #include <linux/netfilter.h>
+ #include <linux/netfilter/nf_tables.h>
++#include <linux/random.h>
+ #include <linux/static_key.h>
+ #include <net/netfilter/nf_tables.h>
+ #include <net/netfilter/nf_tables_core.h>
+
+-static DEFINE_PER_CPU(struct rnd_state, nft_numgen_prandom_state);
+-
+ struct nft_ng_inc {
+ u8 dreg;
+ u32 modulus;
+@@ -135,12 +134,9 @@ struct nft_ng_random {
+ u32 offset;
+ };
+
+-static u32 nft_ng_random_gen(struct nft_ng_random *priv)
++static u32 nft_ng_random_gen(const struct nft_ng_random *priv)
+ {
+- struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
+-
+- return reciprocal_scale(prandom_u32_state(state), priv->modulus) +
+- priv->offset;
++ return reciprocal_scale(get_random_u32(), priv->modulus) + priv->offset;
+ }
+
+ static void nft_ng_random_eval(const struct nft_expr *expr,
+@@ -168,8 +164,6 @@ static int nft_ng_random_init(const struct nft_ctx *ctx,
+ if (priv->offset + priv->modulus - 1 < priv->offset)
+ return -EOVERFLOW;
+
+- prandom_init_once(&nft_numgen_prandom_state);
+-
+ return nft_parse_register_store(ctx, tb[NFTA_NG_DREG], &priv->dreg,
+ NULL, NFT_DATA_VALUE, sizeof(u32));
+ }
+diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
+index 372bf54a0ca9e..e20d1a9734175 100644
+--- a/net/openvswitch/flow.c
++++ b/net/openvswitch/flow.c
+@@ -407,7 +407,7 @@ static int parse_ipv6hdr(struct sk_buff *skb, struct sw_flow_key *key)
+ if (flags & IP6_FH_F_FRAG) {
+ if (frag_off) {
+ key->ip.frag = OVS_FRAG_TYPE_LATER;
+- key->ip.proto = nexthdr;
++ key->ip.proto = NEXTHDR_FRAGMENT;
+ return 0;
+ }
+ key->ip.frag = OVS_FRAG_TYPE_FIRST;
+diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
+index ed4ccef5d6a82..5449ed114e406 100644
+--- a/net/sched/sch_netem.c
++++ b/net/sched/sch_netem.c
+@@ -1146,9 +1146,9 @@ static int netem_dump(struct Qdisc *sch, struct sk_buff *skb)
+ struct tc_netem_rate rate;
+ struct tc_netem_slot slot;
+
+- qopt.latency = min_t(psched_tdiff_t, PSCHED_NS2TICKS(q->latency),
++ qopt.latency = min_t(psched_time_t, PSCHED_NS2TICKS(q->latency),
+ UINT_MAX);
+- qopt.jitter = min_t(psched_tdiff_t, PSCHED_NS2TICKS(q->jitter),
++ qopt.jitter = min_t(psched_time_t, PSCHED_NS2TICKS(q->jitter),
+ UINT_MAX);
+ qopt.limit = q->limit;
+ qopt.loss = q->loss;
+diff --git a/net/tipc/core.c b/net/tipc/core.c
+index 3f4542e0f0650..434e70eabe081 100644
+--- a/net/tipc/core.c
++++ b/net/tipc/core.c
+@@ -109,10 +109,9 @@ static void __net_exit tipc_exit_net(struct net *net)
+ struct tipc_net *tn = tipc_net(net);
+
+ tipc_detach_loopback(net);
++ tipc_net_stop(net);
+ /* Make sure the tipc_net_finalize_work() finished */
+ cancel_work_sync(&tn->work);
+- tipc_net_stop(net);
+-
+ tipc_bcast_stop(net);
+ tipc_nametbl_stop(net);
+ tipc_sk_rht_destroy(net);
+diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
+index 7b2b0e7ffee4c..5c9697840ef70 100644
+--- a/net/tls/tls_main.c
++++ b/net/tls/tls_main.c
+@@ -873,6 +873,8 @@ static void tls_update(struct sock *sk, struct proto *p,
+ {
+ struct tls_context *ctx;
+
++ WARN_ON_ONCE(sk->sk_prot == p);
++
+ ctx = tls_get_ctx(sk);
+ if (likely(ctx)) {
+ ctx->sk_write_space = write_space;
+diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
+index d6bcdbfd0fc58..9b12ea3ab85a7 100644
+--- a/net/xdp/xsk.c
++++ b/net/xdp/xsk.c
+@@ -538,12 +538,6 @@ static int xsk_generic_xmit(struct sock *sk)
+ goto out;
+ }
+
+- skb = xsk_build_skb(xs, &desc);
+- if (IS_ERR(skb)) {
+- err = PTR_ERR(skb);
+- goto out;
+- }
+-
+ /* This is the backpressure mechanism for the Tx path.
+ * Reserve space in the completion queue and only proceed
+ * if there is space in it. This avoids having to implement
+@@ -552,11 +546,19 @@ static int xsk_generic_xmit(struct sock *sk)
+ spin_lock_irqsave(&xs->pool->cq_lock, flags);
+ if (xskq_prod_reserve(xs->pool->cq)) {
+ spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
+- kfree_skb(skb);
+ goto out;
+ }
+ spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
+
++ skb = xsk_build_skb(xs, &desc);
++ if (IS_ERR(skb)) {
++ err = PTR_ERR(skb);
++ spin_lock_irqsave(&xs->pool->cq_lock, flags);
++ xskq_prod_cancel(xs->pool->cq);
++ spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
++ goto out;
++ }
++
+ err = __dev_direct_xmit(skb, xs->queue_id);
+ if (err == NETDEV_TX_BUSY) {
+ /* Tell user-space to retry the send */
+diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
+index b28344fd7408e..0005900b19b08 100644
+--- a/scripts/mod/modpost.c
++++ b/scripts/mod/modpost.c
+@@ -1115,7 +1115,7 @@ static const struct sectioncheck sectioncheck[] = {
+ },
+ /* Do not export init/exit functions or data */
+ {
+- .fromsec = { "__ksymtab*", NULL },
++ .fromsec = { "___ksymtab*", NULL },
+ .bad_tosec = { INIT_SECTIONS, EXIT_SECTIONS, NULL },
+ .mismatch = EXPORT_TO_INIT_EXIT,
+ .symbol_white_list = { DEFAULT_SYMBOL_WHITE_LIST, NULL },
+diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
+index 15dc7160ba34e..8cfdaee779050 100644
+--- a/sound/core/memalloc.c
++++ b/sound/core/memalloc.c
+@@ -431,33 +431,17 @@ static const struct snd_malloc_ops snd_dma_iram_ops = {
+ */
+ static void *snd_dma_dev_alloc(struct snd_dma_buffer *dmab, size_t size)
+ {
+- void *p;
+-
+- p = dma_alloc_coherent(dmab->dev.dev, size, &dmab->addr, DEFAULT_GFP);
+-#ifdef CONFIG_X86
+- if (p && dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC)
+- set_memory_wc((unsigned long)p, PAGE_ALIGN(size) >> PAGE_SHIFT);
+-#endif
+- return p;
++ return dma_alloc_coherent(dmab->dev.dev, size, &dmab->addr, DEFAULT_GFP);
+ }
+
+ static void snd_dma_dev_free(struct snd_dma_buffer *dmab)
+ {
+-#ifdef CONFIG_X86
+- if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC)
+- set_memory_wb((unsigned long)dmab->area,
+- PAGE_ALIGN(dmab->bytes) >> PAGE_SHIFT);
+-#endif
+ dma_free_coherent(dmab->dev.dev, dmab->bytes, dmab->area, dmab->addr);
+ }
+
+ static int snd_dma_dev_mmap(struct snd_dma_buffer *dmab,
+ struct vm_area_struct *area)
+ {
+-#ifdef CONFIG_X86
+- if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC)
+- area->vm_page_prot = pgprot_writecombine(area->vm_page_prot);
+-#endif
+ return dma_mmap_coherent(dmab->dev.dev, area,
+ dmab->area, dmab->addr, dmab->bytes);
+ }
+@@ -471,10 +455,6 @@ static const struct snd_malloc_ops snd_dma_dev_ops = {
+ /*
+ * Write-combined pages
+ */
+-#ifdef CONFIG_X86
+-/* On x86, share the same ops as the standard dev ops */
+-#define snd_dma_wc_ops snd_dma_dev_ops
+-#else /* CONFIG_X86 */
+ static void *snd_dma_wc_alloc(struct snd_dma_buffer *dmab, size_t size)
+ {
+ return dma_alloc_wc(dmab->dev.dev, size, &dmab->addr, DEFAULT_GFP);
+@@ -497,7 +477,6 @@ static const struct snd_malloc_ops snd_dma_wc_ops = {
+ .free = snd_dma_wc_free,
+ .mmap = snd_dma_wc_mmap,
+ };
+-#endif /* CONFIG_X86 */
+
+ #ifdef CONFIG_SND_DMA_SGBUF
+ static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size);
+diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c
+index 3f35972e1cf75..161a9711cd63e 100644
+--- a/sound/hda/hdac_i915.c
++++ b/sound/hda/hdac_i915.c
+@@ -119,21 +119,18 @@ static int i915_component_master_match(struct device *dev, int subcomponent,
+ /* check whether Intel graphics is present and reachable */
+ static int i915_gfx_present(struct pci_dev *hdac_pci)
+ {
+- unsigned int class = PCI_BASE_CLASS_DISPLAY << 16;
+ struct pci_dev *display_dev = NULL;
+- bool match = false;
+
+- do {
+- display_dev = pci_get_class(class, display_dev);
+-
+- if (display_dev && display_dev->vendor == PCI_VENDOR_ID_INTEL &&
++ for_each_pci_dev(display_dev) {
++ if (display_dev->vendor == PCI_VENDOR_ID_INTEL &&
++ (display_dev->class >> 16) == PCI_BASE_CLASS_DISPLAY &&
+ connectivity_check(display_dev, hdac_pci)) {
+ pci_dev_put(display_dev);
+- match = true;
++ return true;
+ }
+- } while (!match && display_dev);
++ }
+
+- return match;
++ return false;
+ }
+
+ /**
+diff --git a/sound/pci/hda/hda_auto_parser.c b/sound/pci/hda/hda_auto_parser.c
+index cd1db943b7e07..7c6b1fe8dfcce 100644
+--- a/sound/pci/hda/hda_auto_parser.c
++++ b/sound/pci/hda/hda_auto_parser.c
+@@ -819,7 +819,7 @@ static void set_pin_targets(struct hda_codec *codec,
+ snd_hda_set_pin_ctl_cache(codec, cfg->nid, cfg->val);
+ }
+
+-static void apply_fixup(struct hda_codec *codec, int id, int action, int depth)
++void __snd_hda_apply_fixup(struct hda_codec *codec, int id, int action, int depth)
+ {
+ const char *modelname = codec->fixup_name;
+
+@@ -829,7 +829,7 @@ static void apply_fixup(struct hda_codec *codec, int id, int action, int depth)
+ if (++depth > 10)
+ break;
+ if (fix->chained_before)
+- apply_fixup(codec, fix->chain_id, action, depth + 1);
++ __snd_hda_apply_fixup(codec, fix->chain_id, action, depth + 1);
+
+ switch (fix->type) {
+ case HDA_FIXUP_PINS:
+@@ -870,6 +870,7 @@ static void apply_fixup(struct hda_codec *codec, int id, int action, int depth)
+ id = fix->chain_id;
+ }
+ }
++EXPORT_SYMBOL_GPL(__snd_hda_apply_fixup);
+
+ /**
+ * snd_hda_apply_fixup - Apply the fixup chain with the given action
+@@ -879,7 +880,7 @@ static void apply_fixup(struct hda_codec *codec, int id, int action, int depth)
+ void snd_hda_apply_fixup(struct hda_codec *codec, int action)
+ {
+ if (codec->fixup_list)
+- apply_fixup(codec, codec->fixup_id, action, 0);
++ __snd_hda_apply_fixup(codec, codec->fixup_id, action, 0);
+ }
+ EXPORT_SYMBOL_GPL(snd_hda_apply_fixup);
+
+diff --git a/sound/pci/hda/hda_local.h b/sound/pci/hda/hda_local.h
+index aca592651870e..682dca2057dbe 100644
+--- a/sound/pci/hda/hda_local.h
++++ b/sound/pci/hda/hda_local.h
+@@ -348,6 +348,7 @@ void snd_hda_apply_verbs(struct hda_codec *codec);
+ void snd_hda_apply_pincfgs(struct hda_codec *codec,
+ const struct hda_pintbl *cfg);
+ void snd_hda_apply_fixup(struct hda_codec *codec, int action);
++void __snd_hda_apply_fixup(struct hda_codec *codec, int id, int action, int depth);
+ void snd_hda_pick_fixup(struct hda_codec *codec,
+ const struct hda_model_fixup *models,
+ const struct snd_pci_quirk *quirk,
+diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
+index bce2cef80000b..0b7d500249f6e 100644
+--- a/sound/pci/hda/patch_conexant.c
++++ b/sound/pci/hda/patch_conexant.c
+@@ -1079,11 +1079,11 @@ static int patch_conexant_auto(struct hda_codec *codec)
+ if (err < 0)
+ goto error;
+
+- err = snd_hda_gen_parse_auto_config(codec, &spec->gen.autocfg);
++ err = cx_auto_parse_beep(codec);
+ if (err < 0)
+ goto error;
+
+- err = cx_auto_parse_beep(codec);
++ err = snd_hda_gen_parse_auto_config(codec, &spec->gen.autocfg);
+ if (err < 0)
+ goto error;
+
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index 588d4a59c8d92..d3d786de8f4c4 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -2634,6 +2634,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1558, 0x67e1, "Clevo PB71[DE][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x67e5, "Clevo PC70D[PRS](?:-D|-G)?", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x67f1, "Clevo PC70H[PRS]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
++ SND_PCI_QUIRK(0x1558, 0x67f5, "Clevo PD70PN[NRT]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x70d1, "Clevo PC70[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x7714, "Clevo X170SM", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x7715, "Clevo X170KM-G", ALC1220_FIXUP_CLEVO_PB51ED),
+@@ -7056,6 +7057,7 @@ enum {
+ ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS,
+ ALC287_FIXUP_LEGION_15IMHG05_AUTOMUTE,
+ ALC287_FIXUP_YOGA7_14ITL_SPEAKERS,
++ ALC298_FIXUP_LENOVO_C940_DUET7,
+ ALC287_FIXUP_13S_GEN2_SPEAKERS,
+ ALC256_FIXUP_SET_COEF_DEFAULTS,
+ ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE,
+@@ -7074,6 +7076,23 @@ enum {
+ ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE,
+ };
+
++/* A special fixup for Lenovo C940 and Yoga Duet 7;
++ * both have the very same PCI SSID, and we need to apply different fixups
++ * depending on the codec ID
++ */
++static void alc298_fixup_lenovo_c940_duet7(struct hda_codec *codec,
++ const struct hda_fixup *fix,
++ int action)
++{
++ int id;
++
++ if (codec->core.vendor_id == 0x10ec0298)
++ id = ALC298_FIXUP_LENOVO_SPK_VOLUME; /* C940 */
++ else
++ id = ALC287_FIXUP_YOGA7_14ITL_SPEAKERS; /* Duet 7 */
++ __snd_hda_apply_fixup(codec, id, action, 0);
++}
++
+ static const struct hda_fixup alc269_fixups[] = {
+ [ALC269_FIXUP_GPIO2] = {
+ .type = HDA_FIXUP_FUNC,
+@@ -8773,6 +8792,10 @@ static const struct hda_fixup alc269_fixups[] = {
+ .chained = true,
+ .chain_id = ALC269_FIXUP_HEADSET_MODE,
+ },
++ [ALC298_FIXUP_LENOVO_C940_DUET7] = {
++ .type = HDA_FIXUP_FUNC,
++ .v.func = alc298_fixup_lenovo_c940_duet7,
++ },
+ [ALC287_FIXUP_13S_GEN2_SPEAKERS] = {
+ .type = HDA_FIXUP_VERBS,
+ .v.verbs = (const struct hda_verb[]) {
+@@ -9074,6 +9097,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8783, "HP ZBook Fury 15 G7 Mobile Workstation",
+ ALC285_FIXUP_HP_GPIO_AMP_INIT),
++ SND_PCI_QUIRK(0x103c, 0x8787, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x8788, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x87c8, "HP", ALC287_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x87e5, "HP ProBook 440 G8 Notebook PC", ALC236_FIXUP_HP_GPIO_LED),
+@@ -9239,6 +9263,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1558, 0x70f3, "Clevo NH77DPQ", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x70f4, "Clevo NH77EPY", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x70f6, "Clevo NH77DPQ-Y", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
++ SND_PCI_QUIRK(0x1558, 0x7716, "Clevo NS50PU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x8228, "Clevo NR40BU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x8520, "Clevo NH50D[CD]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x8521, "Clevo NH77D[CD]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+@@ -9325,7 +9350,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x17aa, 0x31af, "ThinkCentre Station", ALC623_FIXUP_LENOVO_THINKSTATION_P340),
+ SND_PCI_QUIRK(0x17aa, 0x3802, "Lenovo Yoga DuetITL 2021", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS),
+ SND_PCI_QUIRK(0x17aa, 0x3813, "Legion 7i 15IMHG05", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS),
+- SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940", ALC298_FIXUP_LENOVO_SPK_VOLUME),
++ SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940 / Yoga Duet 7", ALC298_FIXUP_LENOVO_C940_DUET7),
+ SND_PCI_QUIRK(0x17aa, 0x3819, "Lenovo 13s Gen2 ITL", ALC287_FIXUP_13S_GEN2_SPEAKERS),
+ SND_PCI_QUIRK(0x17aa, 0x3820, "Yoga Duet 7 13ITL6", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS),
+ SND_PCI_QUIRK(0x17aa, 0x3824, "Legion Y9000X 2020", ALC285_FIXUP_LEGION_Y9000X_SPEAKERS),
+@@ -10789,6 +10814,7 @@ enum {
+ ALC668_FIXUP_MIC_DET_COEF,
+ ALC897_FIXUP_LENOVO_HEADSET_MIC,
+ ALC897_FIXUP_HEADSET_MIC_PIN,
++ ALC897_FIXUP_HP_HSMIC_VERB,
+ };
+
+ static const struct hda_fixup alc662_fixups[] = {
+@@ -11208,6 +11234,13 @@ static const struct hda_fixup alc662_fixups[] = {
+ .chained = true,
+ .chain_id = ALC897_FIXUP_LENOVO_HEADSET_MIC
+ },
++ [ALC897_FIXUP_HP_HSMIC_VERB] = {
++ .type = HDA_FIXUP_PINS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x19, 0x01a1913c }, /* use as headset mic, without its own jack detect */
++ { }
++ },
++ },
+ };
+
+ static const struct snd_pci_quirk alc662_fixup_tbl[] = {
+@@ -11233,6 +11266,7 @@ static const struct snd_pci_quirk alc662_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1028, 0x0698, "Dell", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1028, 0x069f, "Dell", ALC668_FIXUP_DELL_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x1632, "HP RP5800", ALC662_FIXUP_HP_RP5800),
++ SND_PCI_QUIRK(0x103c, 0x8719, "HP", ALC897_FIXUP_HP_HSMIC_VERB),
+ SND_PCI_QUIRK(0x103c, 0x873e, "HP", ALC671_FIXUP_HP_HEADSET_MIC2),
+ SND_PCI_QUIRK(0x103c, 0x885f, "HP 288 Pro G8", ALC671_FIXUP_HP_HEADSET_MIC2),
+ SND_PCI_QUIRK(0x1043, 0x1080, "Asus UX501VW", ALC668_FIXUP_HEADSET_MODE),
+diff --git a/sound/pci/hda/patch_via.c b/sound/pci/hda/patch_via.c
+index 773a136161f11..a188901a83bbe 100644
+--- a/sound/pci/hda/patch_via.c
++++ b/sound/pci/hda/patch_via.c
+@@ -520,11 +520,11 @@ static int via_parse_auto_config(struct hda_codec *codec)
+ if (err < 0)
+ return err;
+
+- err = snd_hda_gen_parse_auto_config(codec, &spec->gen.autocfg);
++ err = auto_parse_beep(codec);
+ if (err < 0)
+ return err;
+
+- err = auto_parse_beep(codec);
++ err = snd_hda_gen_parse_auto_config(codec, &spec->gen.autocfg);
+ if (err < 0)
+ return err;
+
+diff --git a/tools/perf/tests/shell/test_arm_callgraph_fp.sh b/tools/perf/tests/shell/test_arm_callgraph_fp.sh
+index 6ffbb27afabac..ec108d45d3c61 100755
+--- a/tools/perf/tests/shell/test_arm_callgraph_fp.sh
++++ b/tools/perf/tests/shell/test_arm_callgraph_fp.sh
+@@ -43,7 +43,7 @@ CFLAGS="-g -O0 -fno-inline -fno-omit-frame-pointer"
+ cc $CFLAGS $TEST_PROGRAM_SOURCE -o $TEST_PROGRAM || exit 1
+
+ # Add a 1 second delay to skip samples that are not in the leaf() function
+-perf record -o $PERF_DATA --call-graph fp -e cycles//u -D 1000 -- $TEST_PROGRAM 2> /dev/null &
++perf record -o $PERF_DATA --call-graph fp -e cycles//u -D 1000 --user-callchains -- $TEST_PROGRAM 2> /dev/null &
+ PID=$!
+
+ echo " + Recording (PID=$PID)..."
+diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c
+index d23a9e322ff52..0b4f61b6cc6b8 100644
+--- a/tools/perf/tests/topology.c
++++ b/tools/perf/tests/topology.c
+@@ -115,7 +115,7 @@ static int check_cpu_topology(char *path, struct perf_cpu_map *map)
+ * physical_package_id will be set to -1. Hence skip this
+ * test if physical_package_id returns -1 for cpu from perf_cpu_map.
+ */
+- if (strncmp(session->header.env.arch, "powerpc", 7)) {
++ if (!strncmp(session->header.env.arch, "ppc64le", 7)) {
+ if (cpu__get_socket_id(perf_cpu_map__cpu(map, 0)) == -1)
+ return TEST_SKIP;
+ }
+diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
+index 1a80151baed96..d040406f3314c 100644
+--- a/tools/perf/util/arm-spe.c
++++ b/tools/perf/util/arm-spe.c
+@@ -387,26 +387,16 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq,
+ return arm_spe_deliver_synth_event(spe, speq, event, &sample);
+ }
+
+-#define SPE_MEM_TYPE (ARM_SPE_L1D_ACCESS | ARM_SPE_L1D_MISS | \
+- ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS | \
+- ARM_SPE_REMOTE_ACCESS)
+-
+-static bool arm_spe__is_memory_event(enum arm_spe_sample_type type)
+-{
+- if (type & SPE_MEM_TYPE)
+- return true;
+-
+- return false;
+-}
+-
+ static u64 arm_spe__synth_data_source(const struct arm_spe_record *record)
+ {
+ union perf_mem_data_src data_src = { 0 };
+
+ if (record->op == ARM_SPE_LD)
+ data_src.mem_op = PERF_MEM_OP_LOAD;
+- else
++ else if (record->op == ARM_SPE_ST)
+ data_src.mem_op = PERF_MEM_OP_STORE;
++ else
++ return 0;
+
+ if (record->type & (ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS)) {
+ data_src.mem_lvl = PERF_MEM_LVL_L3;
+@@ -510,7 +500,11 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
+ return err;
+ }
+
+- if (spe->sample_memory && arm_spe__is_memory_event(record->type)) {
++ /*
++ * When data_src is zero it means the record is not a memory operation,
++ * skip to synthesize memory sample for this case.
++ */
++ if (spe->sample_memory && data_src) {
+ err = arm_spe__synth_mem_sample(speq, spe->memory_id, data_src);
+ if (err)
+ return err;
+diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
+index 82f3d46bea70e..328668f38c69d 100644
+--- a/tools/perf/util/build-id.c
++++ b/tools/perf/util/build-id.c
+@@ -872,6 +872,30 @@ out_free:
+ return err;
+ }
+
++static int filename__read_build_id_ns(const char *filename,
++ struct build_id *bid,
++ struct nsinfo *nsi)
++{
++ struct nscookie nsc;
++ int ret;
++
++ nsinfo__mountns_enter(nsi, &nsc);
++ ret = filename__read_build_id(filename, bid);
++ nsinfo__mountns_exit(&nsc);
++
++ return ret;
++}
++
++static bool dso__build_id_mismatch(struct dso *dso, const char *name)
++{
++ struct build_id bid;
++
++ if (filename__read_build_id_ns(name, &bid, dso->nsinfo) < 0)
++ return false;
++
++ return !dso__build_id_equal(dso, &bid);
++}
++
+ static int dso__cache_build_id(struct dso *dso, struct machine *machine,
+ void *priv __maybe_unused)
+ {
+@@ -886,6 +910,10 @@ static int dso__cache_build_id(struct dso *dso, struct machine *machine,
+ is_kallsyms = true;
+ name = machine->mmap_name;
+ }
++
++ if (!is_kallsyms && dso__build_id_mismatch(dso, name))
++ return 0;
++
+ return build_id_cache__add_b(&dso->bid, name, dso->nsinfo,
+ is_kallsyms, is_vdso);
+ }
+diff --git a/tools/testing/selftests/dma/Makefile b/tools/testing/selftests/dma/Makefile
+index aa8e8b5b3864e..cd8c5ece1cba4 100644
+--- a/tools/testing/selftests/dma/Makefile
++++ b/tools/testing/selftests/dma/Makefile
+@@ -1,5 +1,6 @@
+ # SPDX-License-Identifier: GPL-2.0
+ CFLAGS += -I../../../../usr/include/
++CFLAGS += -I../../../../include/
+
+ TEST_GEN_PROGS := dma_map_benchmark
+
+diff --git a/tools/testing/selftests/dma/dma_map_benchmark.c b/tools/testing/selftests/dma/dma_map_benchmark.c
+index c3b3c09e995e8..5c997f17fcbdb 100644
+--- a/tools/testing/selftests/dma/dma_map_benchmark.c
++++ b/tools/testing/selftests/dma/dma_map_benchmark.c
+@@ -10,8 +10,8 @@
+ #include <unistd.h>
+ #include <sys/ioctl.h>
+ #include <sys/mman.h>
+-#include <linux/map_benchmark.h>
+ #include <linux/types.h>
++#include <linux/map_benchmark.h>
+
+ #define NSEC_PER_MSEC 1000000L
+
+diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
+index 54701c8b0cd70..03b586760164a 100755
+--- a/tools/testing/selftests/net/fcnal-test.sh
++++ b/tools/testing/selftests/net/fcnal-test.sh
+@@ -70,6 +70,10 @@ NSB_LO_IP6=2001:db8:2::2
+ NL_IP=172.17.1.1
+ NL_IP6=2001:db8:4::1
+
++# multicast and broadcast addresses
++MCAST_IP=224.0.0.1
++BCAST_IP=255.255.255.255
++
+ MD5_PW=abc123
+ MD5_WRONG_PW=abc1234
+
+@@ -308,6 +312,9 @@ addr2str()
+ 127.0.0.1) echo "loopback";;
+ ::1) echo "IPv6 loopback";;
+
++ ${BCAST_IP}) echo "broadcast";;
++ ${MCAST_IP}) echo "multicast";;
++
+ ${NSA_IP}) echo "ns-A IP";;
+ ${NSA_IP6}) echo "ns-A IPv6";;
+ ${NSA_LO_IP}) echo "ns-A loopback IP";;
+@@ -1793,12 +1800,33 @@ ipv4_addr_bind_novrf()
+ done
+
+ #
+- # raw socket with nonlocal bind
++ # tests for nonlocal bind
+ #
+ a=${NL_IP}
+ log_start
+- run_cmd nettest -s -R -P icmp -f -l ${a} -I ${NSA_DEV} -b
+- log_test_addr ${a} $? 0 "Raw socket bind to nonlocal address after device bind"
++ run_cmd nettest -s -R -f -l ${a} -b
++ log_test_addr ${a} $? 0 "Raw socket bind to nonlocal address"
++
++ log_start
++ run_cmd nettest -s -f -l ${a} -b
++ log_test_addr ${a} $? 0 "TCP socket bind to nonlocal address"
++
++ log_start
++ run_cmd nettest -s -D -P icmp -f -l ${a} -b
++ log_test_addr ${a} $? 0 "ICMP socket bind to nonlocal address"
++
++ #
++ # check that ICMP sockets cannot bind to broadcast and multicast addresses
++ #
++ a=${BCAST_IP}
++ log_start
++ run_cmd nettest -s -D -P icmp -l ${a} -b
++ log_test_addr ${a} $? 1 "ICMP socket bind to broadcast address"
++
++ a=${MCAST_IP}
++ log_start
++ run_cmd nettest -s -D -P icmp -l ${a} -b
++ log_test_addr ${a} $? 1 "ICMP socket bind to multicast address"
+
+ #
+ # tcp sockets
+@@ -1850,13 +1878,34 @@ ipv4_addr_bind_vrf()
+ log_test_addr ${a} $? 1 "Raw socket bind to out of scope address after VRF bind"
+
+ #
+- # raw socket with nonlocal bind
++ # tests for nonlocal bind
+ #
+ a=${NL_IP}
+ log_start
+- run_cmd nettest -s -R -P icmp -f -l ${a} -I ${VRF} -b
++ run_cmd nettest -s -R -f -l ${a} -I ${VRF} -b
+ log_test_addr ${a} $? 0 "Raw socket bind to nonlocal address after VRF bind"
+
++ log_start
++ run_cmd nettest -s -f -l ${a} -I ${VRF} -b
++ log_test_addr ${a} $? 0 "TCP socket bind to nonlocal address after VRF bind"
++
++ log_start
++ run_cmd nettest -s -D -P icmp -f -l ${a} -I ${VRF} -b
++ log_test_addr ${a} $? 0 "ICMP socket bind to nonlocal address after VRF bind"
++
++ #
++ # check that ICMP sockets cannot bind to broadcast and multicast addresses
++ #
++ a=${BCAST_IP}
++ log_start
++ run_cmd nettest -s -D -P icmp -l ${a} -I ${VRF} -b
++ log_test_addr ${a} $? 1 "ICMP socket bind to broadcast address after VRF bind"
++
++ a=${MCAST_IP}
++ log_start
++ run_cmd nettest -s -D -P icmp -l ${a} -I ${VRF} -b
++ log_test_addr ${a} $? 1 "ICMP socket bind to multicast address after VRF bind"
++
+ #
+ # tcp sockets
+ #
+@@ -1889,10 +1938,12 @@ ipv4_addr_bind()
+
+ log_subsection "No VRF"
+ setup
++ set_sysctl net.ipv4.ping_group_range='0 2147483647' 2>/dev/null
+ ipv4_addr_bind_novrf
+
+ log_subsection "With VRF"
+ setup "yes"
++ set_sysctl net.ipv4.ping_group_range='0 2147483647' 2>/dev/null
+ ipv4_addr_bind_vrf
+ }
+
+diff --git a/tools/testing/selftests/netfilter/nft_concat_range.sh b/tools/testing/selftests/netfilter/nft_concat_range.sh
+index b35010cc7f6ae..a6991877e50cd 100755
+--- a/tools/testing/selftests/netfilter/nft_concat_range.sh
++++ b/tools/testing/selftests/netfilter/nft_concat_range.sh
+@@ -31,7 +31,7 @@ BUGS="flush_remove_add reload"
+
+ # List of possible paths to pktgen script from kernel tree for performance tests
+ PKTGEN_SCRIPT_PATHS="
+- ../../../samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
++ ../../../../samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
+ pktgen/pktgen_bench_xmit_mode_netif_receive.sh"
+
+ # Definition of set types:
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-02 16:12 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-07-02 16:12 UTC (permalink / raw
To: gentoo-commits
commit: d89b7b70ccefe56028a258075397ad66e474e665
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sat Jul 2 16:12:04 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sat Jul 2 16:12:04 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=d89b7b70
Linux patch 5.18.9
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1008_linux-5.18.9.patch | 283 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 287 insertions(+)
diff --git a/0000_README b/0000_README
index b676cc58..e3668eda 100644
--- a/0000_README
+++ b/0000_README
@@ -75,6 +75,10 @@ Patch: 1007_linux-5.18.8.patch
From: http://www.kernel.org
Desc: Linux 5.18.8
+Patch: 1008_linux-5.18.9.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.9
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1008_linux-5.18.9.patch b/1008_linux-5.18.9.patch
new file mode 100644
index 00000000..0716a3e2
--- /dev/null
+++ b/1008_linux-5.18.9.patch
@@ -0,0 +1,283 @@
+diff --git a/Makefile b/Makefile
+index 6ac3335f65aff..751cfd786c8c0 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 8
++SUBLEVEL = 9
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
+index d83758acd1c7c..44a37a2b6a1cf 100644
+--- a/arch/powerpc/include/asm/ftrace.h
++++ b/arch/powerpc/include/asm/ftrace.h
+@@ -86,7 +86,7 @@ static inline bool arch_syscall_match_sym_name(const char *sym, const char *name
+ #endif /* PPC64_ELF_ABI_v1 */
+ #endif /* CONFIG_FTRACE_SYSCALLS */
+
+-#ifdef CONFIG_PPC64
++#if defined(CONFIG_PPC64) && defined(CONFIG_FUNCTION_TRACER)
+ #include <asm/paca.h>
+
+ static inline void this_cpu_disable_ftrace(void)
+@@ -110,11 +110,13 @@ static inline u8 this_cpu_get_ftrace_enabled(void)
+ return get_paca()->ftrace_enabled;
+ }
+
++void ftrace_free_init_tramp(void);
+ #else /* CONFIG_PPC64 */
+ static inline void this_cpu_disable_ftrace(void) { }
+ static inline void this_cpu_enable_ftrace(void) { }
+ static inline void this_cpu_set_ftrace_enabled(u8 ftrace_enabled) { }
+ static inline u8 this_cpu_get_ftrace_enabled(void) { return 1; }
++static inline void ftrace_free_init_tramp(void) { }
+ #endif /* CONFIG_PPC64 */
+ #endif /* !__ASSEMBLY__ */
+
+diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
+index 4ee04aacf9f13..a778f2ae1f3f5 100644
+--- a/arch/powerpc/kernel/trace/ftrace.c
++++ b/arch/powerpc/kernel/trace/ftrace.c
+@@ -306,9 +306,7 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
+
+ /* Is this a known long jump tramp? */
+ for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
+- if (!ftrace_tramps[i])
+- break;
+- else if (ftrace_tramps[i] == tramp)
++ if (ftrace_tramps[i] == tramp)
+ return 0;
+
+ /* Is this a known plt tramp? */
+@@ -863,6 +861,17 @@ void arch_ftrace_update_code(int command)
+
+ extern unsigned int ftrace_tramp_text[], ftrace_tramp_init[];
+
++void ftrace_free_init_tramp(void)
++{
++ int i;
++
++ for (i = 0; i < NUM_FTRACE_TRAMPS && ftrace_tramps[i]; i++)
++ if (ftrace_tramps[i] == (unsigned long)ftrace_tramp_init) {
++ ftrace_tramps[i] = 0;
++ return;
++ }
++}
++
+ int __init ftrace_dyn_arch_init(void)
+ {
+ int i;
+diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
+index 4d221d033804e..149635e5c1653 100644
+--- a/arch/powerpc/mm/mem.c
++++ b/arch/powerpc/mm/mem.c
+@@ -22,6 +22,7 @@
+ #include <asm/kasan.h>
+ #include <asm/svm.h>
+ #include <asm/mmzone.h>
++#include <asm/ftrace.h>
+
+ #include <mm/mmu_decl.h>
+
+@@ -312,6 +313,7 @@ void free_initmem(void)
+ ppc_md.progress = ppc_printk_progress;
+ mark_initmem_nx();
+ free_initmem_default(POISON_FREE_INITMEM);
++ ftrace_free_init_tramp();
+ }
+
+ /*
+diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
+index 1589ae7d5abb6..8182ff2d12fe1 100644
+--- a/drivers/clocksource/Kconfig
++++ b/drivers/clocksource/Kconfig
+@@ -80,7 +80,7 @@ config IXP4XX_TIMER
+ bool "Intel XScale IXP4xx timer driver" if COMPILE_TEST
+ depends on HAS_IOMEM
+ select CLKSRC_MMIO
+- select TIMER_OF if OF
++ select TIMER_OF
+ help
+ Enables support for the Intel XScale IXP4xx SoC timer.
+
+diff --git a/drivers/clocksource/timer-ixp4xx.c b/drivers/clocksource/timer-ixp4xx.c
+index cbb184953510b..720ed70a2964f 100644
+--- a/drivers/clocksource/timer-ixp4xx.c
++++ b/drivers/clocksource/timer-ixp4xx.c
+@@ -19,8 +19,6 @@
+ #include <linux/of_address.h>
+ #include <linux/of_irq.h>
+ #include <linux/platform_device.h>
+-/* Goes away with OF conversion */
+-#include <linux/platform_data/timer-ixp4xx.h>
+
+ /*
+ * Constants to make it easy to access Timer Control/Status registers
+@@ -263,28 +261,6 @@ static struct platform_driver ixp4xx_timer_driver = {
+ };
+ builtin_platform_driver(ixp4xx_timer_driver);
+
+-/**
+- * ixp4xx_timer_setup() - Timer setup function to be called from boardfiles
+- * @timerbase: physical base of timer block
+- * @timer_irq: Linux IRQ number for the timer
+- * @timer_freq: Fixed frequency of the timer
+- */
+-void __init ixp4xx_timer_setup(resource_size_t timerbase,
+- int timer_irq,
+- unsigned int timer_freq)
+-{
+- void __iomem *base;
+-
+- base = ioremap(timerbase, 0x100);
+- if (!base) {
+- pr_crit("IXP4xx: can't remap timer\n");
+- return;
+- }
+- ixp4xx_timer_register(base, timer_irq, timer_freq);
+-}
+-EXPORT_SYMBOL_GPL(ixp4xx_timer_setup);
+-
+-#ifdef CONFIG_OF
+ static __init int ixp4xx_of_timer_init(struct device_node *np)
+ {
+ void __iomem *base;
+@@ -315,4 +291,3 @@ out_unmap:
+ return ret;
+ }
+ TIMER_OF_DECLARE(ixp4xx, "intel,ixp4xx-timer", ixp4xx_of_timer_init);
+-#endif
+diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
+index 2362bb8ef6d19..e136d6edc1ed6 100644
+--- a/drivers/md/bcache/btree.c
++++ b/drivers/md/bcache/btree.c
+@@ -2017,6 +2017,7 @@ int bch_btree_check(struct cache_set *c)
+ if (c->root->level == 0)
+ return 0;
+
++ memset(&check_state, 0, sizeof(struct btree_check_state));
+ check_state.c = c;
+ check_state.total_threads = bch_btree_chkthread_nr();
+ check_state.key_idx = 0;
+diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
+index 75b71199800dc..d138a2d732406 100644
+--- a/drivers/md/bcache/writeback.c
++++ b/drivers/md/bcache/writeback.c
+@@ -950,6 +950,7 @@ void bch_sectors_dirty_init(struct bcache_device *d)
+ return;
+ }
+
++ memset(&state, 0, sizeof(struct bch_dirty_init_state));
+ state.c = c;
+ state.d = d;
+ state.total_threads = bch_btre_dirty_init_thread_nr();
+diff --git a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c
+index 60ae8bfc5f69a..1749d26f4befc 100644
+--- a/drivers/net/ethernet/huawei/hinic/hinic_devlink.c
++++ b/drivers/net/ethernet/huawei/hinic/hinic_devlink.c
+@@ -43,9 +43,7 @@ static bool check_image_valid(struct hinic_devlink_priv *priv, const u8 *buf,
+
+ for (i = 0; i < fw_image->fw_info.fw_section_cnt; i++) {
+ len += fw_image->fw_section_info[i].fw_section_len;
+- memcpy(&host_image->image_section_info[i],
+- &fw_image->fw_section_info[i],
+- sizeof(struct fw_section_info_st));
++ host_image->image_section_info[i] = fw_image->fw_section_info[i];
+ }
+
+ if (len != fw_image->fw_len ||
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+index e4186635aaa8d..7c190e8853404 100644
+--- a/fs/io_uring.c
++++ b/fs/io_uring.c
+@@ -3187,6 +3187,21 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+ int ret;
+
+ kiocb->ki_pos = READ_ONCE(sqe->off);
++ /* used for fixed read/write too - just read unconditionally */
++ req->buf_index = READ_ONCE(sqe->buf_index);
++ req->imu = NULL;
++
++ if (req->opcode == IORING_OP_READ_FIXED ||
++ req->opcode == IORING_OP_WRITE_FIXED) {
++ struct io_ring_ctx *ctx = req->ctx;
++ u16 index;
++
++ if (unlikely(req->buf_index >= ctx->nr_user_bufs))
++ return -EFAULT;
++ index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
++ req->imu = ctx->user_bufs[index];
++ io_req_set_rsrc_node(req, ctx, 0);
++ }
+
+ ioprio = READ_ONCE(sqe->ioprio);
+ if (ioprio) {
+@@ -3199,11 +3214,9 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+ kiocb->ki_ioprio = get_current_ioprio();
+ }
+
+- req->imu = NULL;
+ req->rw.addr = READ_ONCE(sqe->addr);
+ req->rw.len = READ_ONCE(sqe->len);
+ req->rw.flags = READ_ONCE(sqe->rw_flags);
+- req->buf_index = READ_ONCE(sqe->buf_index);
+ return 0;
+ }
+
+@@ -3335,20 +3348,9 @@ static int __io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter
+ static int io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
+ unsigned int issue_flags)
+ {
+- struct io_mapped_ubuf *imu = req->imu;
+- u16 index, buf_index = req->buf_index;
+-
+- if (likely(!imu)) {
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- if (unlikely(buf_index >= ctx->nr_user_bufs))
+- return -EFAULT;
+- io_req_set_rsrc_node(req, ctx, issue_flags);
+- index = array_index_nospec(buf_index, ctx->nr_user_bufs);
+- imu = READ_ONCE(ctx->user_bufs[index]);
+- req->imu = imu;
+- }
+- return __io_import_fixed(req, rw, iter, imu);
++ if (WARN_ON_ONCE(!req->imu))
++ return -EFAULT;
++ return __io_import_fixed(req, rw, iter, req->imu);
+ }
+
+ static void io_ring_submit_unlock(struct io_ring_ctx *ctx, bool needs_lock)
+diff --git a/include/linux/platform_data/timer-ixp4xx.h b/include/linux/platform_data/timer-ixp4xx.h
+deleted file mode 100644
+index ee92ae7edaed7..0000000000000
+--- a/include/linux/platform_data/timer-ixp4xx.h
++++ /dev/null
+@@ -1,11 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-#ifndef __TIMER_IXP4XX_H
+-#define __TIMER_IXP4XX_H
+-
+-#include <linux/ioport.h>
+-
+-void __init ixp4xx_timer_setup(resource_size_t timerbase,
+- int timer_irq,
+- unsigned int timer_freq);
+-
+-#endif
+diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
+index d257721c68b8f..cd296da509c96 100644
+--- a/kernel/time/tick-sched.c
++++ b/kernel/time/tick-sched.c
+@@ -526,7 +526,6 @@ void __init tick_nohz_full_setup(cpumask_var_t cpumask)
+ cpumask_copy(tick_nohz_full_mask, cpumask);
+ tick_nohz_full_running = true;
+ }
+-EXPORT_SYMBOL_GPL(tick_nohz_full_setup);
+
+ static int tick_nohz_cpu_down(unsigned int cpu)
+ {
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-07 16:26 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-07-07 16:26 UTC (permalink / raw
To: gentoo-commits
commit: cfcdf3b0507246ef8ff1a4291c90feedb92166f9
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Jul 7 16:26:00 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Jul 7 16:26:00 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=cfcdf3b0
Linux patch 5.18.10
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1009_linux-5.18.10.patch | 3905 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 3909 insertions(+)
diff --git a/0000_README b/0000_README
index e3668eda..557264e1 100644
--- a/0000_README
+++ b/0000_README
@@ -79,6 +79,10 @@ Patch: 1008_linux-5.18.9.patch
From: http://www.kernel.org
Desc: Linux 5.18.9
+Patch: 1009_linux-5.18.10.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.10
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1009_linux-5.18.10.patch b/1009_linux-5.18.10.patch
new file mode 100644
index 00000000..064488e3
--- /dev/null
+++ b/1009_linux-5.18.10.patch
@@ -0,0 +1,3905 @@
+diff --git a/Makefile b/Makefile
+index 751cfd786c8c0..088b84f99203c 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 9
++SUBLEVEL = 10
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
+index 84a1cea1f43b9..309648c17f486 100644
+--- a/arch/arm/xen/p2m.c
++++ b/arch/arm/xen/p2m.c
+@@ -63,11 +63,12 @@ out:
+
+ unsigned long __pfn_to_mfn(unsigned long pfn)
+ {
+- struct rb_node *n = phys_to_mach.rb_node;
++ struct rb_node *n;
+ struct xen_p2m_entry *entry;
+ unsigned long irqflags;
+
+ read_lock_irqsave(&p2m_lock, irqflags);
++ n = phys_to_mach.rb_node;
+ while (n) {
+ entry = rb_entry(n, struct xen_p2m_entry, rbnode_phys);
+ if (entry->pfn <= pfn &&
+@@ -152,10 +153,11 @@ bool __set_phys_to_machine_multi(unsigned long pfn,
+ int rc;
+ unsigned long irqflags;
+ struct xen_p2m_entry *p2m_entry;
+- struct rb_node *n = phys_to_mach.rb_node;
++ struct rb_node *n;
+
+ if (mfn == INVALID_P2M_ENTRY) {
+ write_lock_irqsave(&p2m_lock, irqflags);
++ n = phys_to_mach.rb_node;
+ while (n) {
+ p2m_entry = rb_entry(n, struct xen_p2m_entry, rbnode_phys);
+ if (p2m_entry->pfn <= pfn &&
+diff --git a/arch/parisc/kernel/asm-offsets.c b/arch/parisc/kernel/asm-offsets.c
+index 2673d57eeb008..94652e13c2603 100644
+--- a/arch/parisc/kernel/asm-offsets.c
++++ b/arch/parisc/kernel/asm-offsets.c
+@@ -224,8 +224,13 @@ int main(void)
+ BLANK();
+ DEFINE(ASM_SIGFRAME_SIZE, PARISC_RT_SIGFRAME_SIZE);
+ DEFINE(SIGFRAME_CONTEXT_REGS, offsetof(struct rt_sigframe, uc.uc_mcontext) - PARISC_RT_SIGFRAME_SIZE);
++#ifdef CONFIG_64BIT
+ DEFINE(ASM_SIGFRAME_SIZE32, PARISC_RT_SIGFRAME_SIZE32);
+ DEFINE(SIGFRAME_CONTEXT_REGS32, offsetof(struct compat_rt_sigframe, uc.uc_mcontext) - PARISC_RT_SIGFRAME_SIZE32);
++#else
++ DEFINE(ASM_SIGFRAME_SIZE32, PARISC_RT_SIGFRAME_SIZE);
++ DEFINE(SIGFRAME_CONTEXT_REGS32, offsetof(struct rt_sigframe, uc.uc_mcontext) - PARISC_RT_SIGFRAME_SIZE);
++#endif
+ BLANK();
+ DEFINE(ICACHE_BASE, offsetof(struct pdc_cache_info, ic_base));
+ DEFINE(ICACHE_STRIDE, offsetof(struct pdc_cache_info, ic_stride));
+diff --git a/arch/parisc/kernel/unaligned.c b/arch/parisc/kernel/unaligned.c
+index ed1e88a74dc42..bac581b5ecfc5 100644
+--- a/arch/parisc/kernel/unaligned.c
++++ b/arch/parisc/kernel/unaligned.c
+@@ -146,7 +146,7 @@ static int emulate_ldw(struct pt_regs *regs, int toreg, int flop)
+ " depw %%r0,31,2,%4\n"
+ "1: ldw 0(%%sr1,%4),%0\n"
+ "2: ldw 4(%%sr1,%4),%3\n"
+-" subi 32,%4,%2\n"
++" subi 32,%2,%2\n"
+ " mtctl %2,11\n"
+ " vshd %0,%3,%0\n"
+ "3: \n"
+diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
+index 500f2a0831ef8..45e471516c3c3 100644
+--- a/arch/powerpc/Kconfig
++++ b/arch/powerpc/Kconfig
+@@ -358,6 +358,10 @@ config ARCH_SUSPEND_NONZERO_CPU
+ def_bool y
+ depends on PPC_POWERNV || PPC_PSERIES
+
++config ARCH_HAS_ADD_PAGES
++ def_bool y
++ depends on ARCH_ENABLE_MEMORY_HOTPLUG
++
+ config PPC_DCR_NATIVE
+ bool
+
+diff --git a/arch/powerpc/include/asm/bpf_perf_event.h b/arch/powerpc/include/asm/bpf_perf_event.h
+new file mode 100644
+index 0000000000000..e8a7b4ffb58c2
+--- /dev/null
++++ b/arch/powerpc/include/asm/bpf_perf_event.h
+@@ -0,0 +1,9 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef _ASM_POWERPC_BPF_PERF_EVENT_H
++#define _ASM_POWERPC_BPF_PERF_EVENT_H
++
++#include <asm/ptrace.h>
++
++typedef struct user_pt_regs bpf_user_pt_regs_t;
++
++#endif /* _ASM_POWERPC_BPF_PERF_EVENT_H */
+diff --git a/arch/powerpc/include/uapi/asm/bpf_perf_event.h b/arch/powerpc/include/uapi/asm/bpf_perf_event.h
+deleted file mode 100644
+index 5e1e648aeec4c..0000000000000
+--- a/arch/powerpc/include/uapi/asm/bpf_perf_event.h
++++ /dev/null
+@@ -1,9 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+-#ifndef _UAPI__ASM_BPF_PERF_EVENT_H__
+-#define _UAPI__ASM_BPF_PERF_EVENT_H__
+-
+-#include <asm/ptrace.h>
+-
+-typedef struct user_pt_regs bpf_user_pt_regs_t;
+-
+-#endif /* _UAPI__ASM_BPF_PERF_EVENT_H__ */
+diff --git a/arch/powerpc/kernel/prom_init_check.sh b/arch/powerpc/kernel/prom_init_check.sh
+index b183ab9c5107c..dfa5f729f774d 100644
+--- a/arch/powerpc/kernel/prom_init_check.sh
++++ b/arch/powerpc/kernel/prom_init_check.sh
+@@ -13,7 +13,7 @@
+ # If you really need to reference something from prom_init.o add
+ # it to the list below:
+
+-grep "^CONFIG_KASAN=y$" .config >/dev/null
++grep "^CONFIG_KASAN=y$" ${KCONFIG_CONFIG} >/dev/null
+ if [ $? -eq 0 ]
+ then
+ MEM_FUNCS="__memcpy __memset"
+diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
+index 149635e5c1653..5dd434d80f445 100644
+--- a/arch/powerpc/mm/mem.c
++++ b/arch/powerpc/mm/mem.c
+@@ -103,6 +103,37 @@ void __ref arch_remove_linear_mapping(u64 start, u64 size)
+ vm_unmap_aliases();
+ }
+
++/*
++ * After memory hotplug the variables max_pfn, max_low_pfn and high_memory need
++ * updating.
++ */
++static void update_end_of_memory_vars(u64 start, u64 size)
++{
++ unsigned long end_pfn = PFN_UP(start + size);
++
++ if (end_pfn > max_pfn) {
++ max_pfn = end_pfn;
++ max_low_pfn = end_pfn;
++ high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1) + 1;
++ }
++}
++
++int __ref add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
++ struct mhp_params *params)
++{
++ int ret;
++
++ ret = __add_pages(nid, start_pfn, nr_pages, params);
++ if (ret)
++ return ret;
++
++ /* update max_pfn, max_low_pfn and high_memory */
++ update_end_of_memory_vars(start_pfn << PAGE_SHIFT,
++ nr_pages << PAGE_SHIFT);
++
++ return ret;
++}
++
+ int __ref arch_add_memory(int nid, u64 start, u64 size,
+ struct mhp_params *params)
+ {
+@@ -113,7 +144,7 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
+ rc = arch_create_linear_mapping(nid, start, size, params);
+ if (rc)
+ return rc;
+- rc = __add_pages(nid, start_pfn, nr_pages, params);
++ rc = add_pages(nid, start_pfn, nr_pages, params);
+ if (rc)
+ arch_remove_linear_mapping(start, size);
+ return rc;
+diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c
+index 7d4368d055a68..b80fc4a91a534 100644
+--- a/arch/powerpc/mm/nohash/book3e_pgtable.c
++++ b/arch/powerpc/mm/nohash/book3e_pgtable.c
+@@ -96,8 +96,8 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
+ pgdp = pgd_offset_k(ea);
+ p4dp = p4d_offset(pgdp, ea);
+ if (p4d_none(*p4dp)) {
+- pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
+- p4d_populate(&init_mm, p4dp, pmdp);
++ pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
++ p4d_populate(&init_mm, p4dp, pudp);
+ }
+ pudp = pud_offset(p4dp, ea);
+ if (pud_none(*pudp)) {
+@@ -106,7 +106,7 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
+ }
+ pmdp = pmd_offset(pudp, ea);
+ if (!pmd_present(*pmdp)) {
+- ptep = early_alloc_pgtable(PAGE_SIZE);
++ ptep = early_alloc_pgtable(PTE_TABLE_SIZE);
+ pmd_populate_kernel(&init_mm, pmdp, ptep);
+ }
+ ptep = pte_offset_kernel(pmdp, ea);
+diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
+index 359b0cc0dc35d..d7dcad9011468 100644
+--- a/arch/s390/Kconfig
++++ b/arch/s390/Kconfig
+@@ -487,7 +487,6 @@ config KEXEC
+ config KEXEC_FILE
+ bool "kexec file based system call"
+ select KEXEC_CORE
+- select BUILD_BIN2C
+ depends on CRYPTO
+ depends on CRYPTO_SHA256
+ depends on CRYPTO_SHA256_S390
+diff --git a/arch/s390/crypto/arch_random.c b/arch/s390/crypto/arch_random.c
+index 56007c763902a..1f2d40993c4d2 100644
+--- a/arch/s390/crypto/arch_random.c
++++ b/arch/s390/crypto/arch_random.c
+@@ -4,232 +4,15 @@
+ *
+ * Copyright IBM Corp. 2017, 2020
+ * Author(s): Harald Freudenberger
+- *
+- * The s390_arch_random_generate() function may be called from random.c
+- * in interrupt context. So this implementation does the best to be very
+- * fast. There is a buffer of random data which is asynchronously checked
+- * and filled by a workqueue thread.
+- * If there are enough bytes in the buffer the s390_arch_random_generate()
+- * just delivers these bytes. Otherwise false is returned until the
+- * worker thread refills the buffer.
+- * The worker fills the rng buffer by pulling fresh entropy from the
+- * high quality (but slow) true hardware random generator. This entropy
+- * is then spread over the buffer with an pseudo random generator PRNG.
+- * As the arch_get_random_seed_long() fetches 8 bytes and the calling
+- * function add_interrupt_randomness() counts this as 1 bit entropy the
+- * distribution needs to make sure there is in fact 1 bit entropy contained
+- * in 8 bytes of the buffer. The current values pull 32 byte entropy
+- * and scatter this into a 2048 byte buffer. So 8 byte in the buffer
+- * will contain 1 bit of entropy.
+- * The worker thread is rescheduled based on the charge level of the
+- * buffer but at least with 500 ms delay to avoid too much CPU consumption.
+- * So the max. amount of rng data delivered via arch_get_random_seed is
+- * limited to 4k bytes per second.
+ */
+
+ #include <linux/kernel.h>
+ #include <linux/atomic.h>
+ #include <linux/random.h>
+-#include <linux/slab.h>
+ #include <linux/static_key.h>
+-#include <linux/workqueue.h>
+-#include <linux/moduleparam.h>
+ #include <asm/cpacf.h>
+
+ DEFINE_STATIC_KEY_FALSE(s390_arch_random_available);
+
+ atomic64_t s390_arch_random_counter = ATOMIC64_INIT(0);
+ EXPORT_SYMBOL(s390_arch_random_counter);
+-
+-#define ARCH_REFILL_TICKS (HZ/2)
+-#define ARCH_PRNG_SEED_SIZE 32
+-#define ARCH_RNG_BUF_SIZE 2048
+-
+-static DEFINE_SPINLOCK(arch_rng_lock);
+-static u8 *arch_rng_buf;
+-static unsigned int arch_rng_buf_idx;
+-
+-static void arch_rng_refill_buffer(struct work_struct *);
+-static DECLARE_DELAYED_WORK(arch_rng_work, arch_rng_refill_buffer);
+-
+-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes)
+-{
+- /* max hunk is ARCH_RNG_BUF_SIZE */
+- if (nbytes > ARCH_RNG_BUF_SIZE)
+- return false;
+-
+- /* lock rng buffer */
+- if (!spin_trylock(&arch_rng_lock))
+- return false;
+-
+- /* try to resolve the requested amount of bytes from the buffer */
+- arch_rng_buf_idx -= nbytes;
+- if (arch_rng_buf_idx < ARCH_RNG_BUF_SIZE) {
+- memcpy(buf, arch_rng_buf + arch_rng_buf_idx, nbytes);
+- atomic64_add(nbytes, &s390_arch_random_counter);
+- spin_unlock(&arch_rng_lock);
+- return true;
+- }
+-
+- /* not enough bytes in rng buffer, refill is done asynchronously */
+- spin_unlock(&arch_rng_lock);
+-
+- return false;
+-}
+-EXPORT_SYMBOL(s390_arch_random_generate);
+-
+-static void arch_rng_refill_buffer(struct work_struct *unused)
+-{
+- unsigned int delay = ARCH_REFILL_TICKS;
+-
+- spin_lock(&arch_rng_lock);
+- if (arch_rng_buf_idx > ARCH_RNG_BUF_SIZE) {
+- /* buffer is exhausted and needs refill */
+- u8 seed[ARCH_PRNG_SEED_SIZE];
+- u8 prng_wa[240];
+- /* fetch ARCH_PRNG_SEED_SIZE bytes of entropy */
+- cpacf_trng(NULL, 0, seed, sizeof(seed));
+- /* blow this entropy up to ARCH_RNG_BUF_SIZE with PRNG */
+- memset(prng_wa, 0, sizeof(prng_wa));
+- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
+- &prng_wa, NULL, 0, seed, sizeof(seed));
+- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN,
+- &prng_wa, arch_rng_buf, ARCH_RNG_BUF_SIZE, NULL, 0);
+- arch_rng_buf_idx = ARCH_RNG_BUF_SIZE;
+- }
+- delay += (ARCH_REFILL_TICKS * arch_rng_buf_idx) / ARCH_RNG_BUF_SIZE;
+- spin_unlock(&arch_rng_lock);
+-
+- /* kick next check */
+- queue_delayed_work(system_long_wq, &arch_rng_work, delay);
+-}
+-
+-/*
+- * Here follows the implementation of s390_arch_get_random_long().
+- *
+- * The random longs to be pulled by arch_get_random_long() are
+- * prepared in an 4K buffer which is filled from the NIST 800-90
+- * compliant s390 drbg. By default the random long buffer is refilled
+- * 256 times before the drbg itself needs a reseed. The reseed of the
+- * drbg is done with 32 bytes fetched from the high quality (but slow)
+- * trng which is assumed to deliver 100% entropy. So the 32 * 8 = 256
+- * bits of entropy are spread over 256 * 4KB = 1MB serving 131072
+- * arch_get_random_long() invocations before reseeded.
+- *
+- * How often the 4K random long buffer is refilled with the drbg
+- * before the drbg is reseeded can be adjusted. There is a module
+- * parameter 's390_arch_rnd_long_drbg_reseed' accessible via
+- * /sys/module/arch_random/parameters/rndlong_drbg_reseed
+- * or as kernel command line parameter
+- * arch_random.rndlong_drbg_reseed=<value>
+- * This parameter tells how often the drbg fills the 4K buffer before
+- * it is re-seeded by fresh entropy from the trng.
+- * A value of 16 results in reseeding the drbg at every 16 * 4 KB = 64
+- * KB with 32 bytes of fresh entropy pulled from the trng. So a value
+- * of 16 would result in 256 bits entropy per 64 KB.
+- * A value of 256 results in 1MB of drbg output before a reseed of the
+- * drbg is done. So this would spread the 256 bits of entropy among 1MB.
+- * Setting this parameter to 0 forces the reseed to take place every
+- * time the 4K buffer is depleted, so the entropy rises to 256 bits
+- * entropy per 4K or 0.5 bit entropy per arch_get_random_long(). With
+- * setting this parameter to negative values all this effort is
+- * disabled, arch_get_random long() returns false and thus indicating
+- * that the arch_get_random_long() feature is disabled at all.
+- */
+-
+-static unsigned long rndlong_buf[512];
+-static DEFINE_SPINLOCK(rndlong_lock);
+-static int rndlong_buf_index;
+-
+-static int rndlong_drbg_reseed = 256;
+-module_param_named(rndlong_drbg_reseed, rndlong_drbg_reseed, int, 0600);
+-MODULE_PARM_DESC(rndlong_drbg_reseed, "s390 arch_get_random_long() drbg reseed");
+-
+-static inline void refill_rndlong_buf(void)
+-{
+- static u8 prng_ws[240];
+- static int drbg_counter;
+-
+- if (--drbg_counter < 0) {
+- /* need to re-seed the drbg */
+- u8 seed[32];
+-
+- /* fetch seed from trng */
+- cpacf_trng(NULL, 0, seed, sizeof(seed));
+- /* seed drbg */
+- memset(prng_ws, 0, sizeof(prng_ws));
+- cpacf_prno(CPACF_PRNO_SHA512_DRNG_SEED,
+- &prng_ws, NULL, 0, seed, sizeof(seed));
+- /* re-init counter for drbg */
+- drbg_counter = rndlong_drbg_reseed;
+- }
+-
+- /* fill the arch_get_random_long buffer from drbg */
+- cpacf_prno(CPACF_PRNO_SHA512_DRNG_GEN, &prng_ws,
+- (u8 *) rndlong_buf, sizeof(rndlong_buf),
+- NULL, 0);
+-}
+-
+-bool s390_arch_get_random_long(unsigned long *v)
+-{
+- bool rc = false;
+- unsigned long flags;
+-
+- /* arch_get_random_long() disabled ? */
+- if (rndlong_drbg_reseed < 0)
+- return false;
+-
+- /* try to lock the random long lock */
+- if (!spin_trylock_irqsave(&rndlong_lock, flags))
+- return false;
+-
+- if (--rndlong_buf_index >= 0) {
+- /* deliver next long value from the buffer */
+- *v = rndlong_buf[rndlong_buf_index];
+- rc = true;
+- goto out;
+- }
+-
+- /* buffer is depleted and needs refill */
+- if (in_interrupt()) {
+- /* delay refill in interrupt context to next caller */
+- rndlong_buf_index = 0;
+- goto out;
+- }
+-
+- /* refill random long buffer */
+- refill_rndlong_buf();
+- rndlong_buf_index = ARRAY_SIZE(rndlong_buf);
+-
+- /* and provide one random long */
+- *v = rndlong_buf[--rndlong_buf_index];
+- rc = true;
+-
+-out:
+- spin_unlock_irqrestore(&rndlong_lock, flags);
+- return rc;
+-}
+-EXPORT_SYMBOL(s390_arch_get_random_long);
+-
+-static int __init s390_arch_random_init(void)
+-{
+- /* all the needed PRNO subfunctions available ? */
+- if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG) &&
+- cpacf_query_func(CPACF_PRNO, CPACF_PRNO_SHA512_DRNG_GEN)) {
+-
+- /* alloc arch random working buffer */
+- arch_rng_buf = kmalloc(ARCH_RNG_BUF_SIZE, GFP_KERNEL);
+- if (!arch_rng_buf)
+- return -ENOMEM;
+-
+- /* kick worker queue job to fill the random buffer */
+- queue_delayed_work(system_long_wq,
+- &arch_rng_work, ARCH_REFILL_TICKS);
+-
+- /* enable arch random to the outside world */
+- static_branch_enable(&s390_arch_random_available);
+- }
+-
+- return 0;
+-}
+-arch_initcall(s390_arch_random_init);
+diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
+index 5dc712fde3c7f..2c6e1c6ecbe78 100644
+--- a/arch/s390/include/asm/archrandom.h
++++ b/arch/s390/include/asm/archrandom.h
+@@ -15,17 +15,13 @@
+
+ #include <linux/static_key.h>
+ #include <linux/atomic.h>
++#include <asm/cpacf.h>
+
+ DECLARE_STATIC_KEY_FALSE(s390_arch_random_available);
+ extern atomic64_t s390_arch_random_counter;
+
+-bool s390_arch_get_random_long(unsigned long *v);
+-bool s390_arch_random_generate(u8 *buf, unsigned int nbytes);
+-
+ static inline bool __must_check arch_get_random_long(unsigned long *v)
+ {
+- if (static_branch_likely(&s390_arch_random_available))
+- return s390_arch_get_random_long(v);
+ return false;
+ }
+
+@@ -37,7 +33,9 @@ static inline bool __must_check arch_get_random_int(unsigned int *v)
+ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
+ {
+ if (static_branch_likely(&s390_arch_random_available)) {
+- return s390_arch_random_generate((u8 *)v, sizeof(*v));
++ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
++ atomic64_add(sizeof(*v), &s390_arch_random_counter);
++ return true;
+ }
+ return false;
+ }
+@@ -45,7 +43,9 @@ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
+ static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
+ {
+ if (static_branch_likely(&s390_arch_random_available)) {
+- return s390_arch_random_generate((u8 *)v, sizeof(*v));
++ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
++ atomic64_add(sizeof(*v), &s390_arch_random_counter);
++ return true;
+ }
+ return false;
+ }
+diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
+index d860ac3009197..2cef49983e9e7 100644
+--- a/arch/s390/kernel/setup.c
++++ b/arch/s390/kernel/setup.c
+@@ -875,6 +875,11 @@ static void __init setup_randomness(void)
+ if (stsi(vmms, 3, 2, 2) == 0 && vmms->count)
+ add_device_randomness(&vmms->vm, sizeof(vmms->vm[0]) * vmms->count);
+ memblock_free(vmms, PAGE_SIZE);
++
++#ifdef CONFIG_ARCH_RANDOM
++ if (cpacf_query_func(CPACF_PRNO, CPACF_PRNO_TRNG))
++ static_branch_enable(&s390_arch_random_available);
++#endif
+ }
+
+ /*
+diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
+index 990ff5b0aeb87..e4ea42b83b512 100644
+--- a/drivers/acpi/acpi_video.c
++++ b/drivers/acpi/acpi_video.c
+@@ -73,6 +73,7 @@ module_param(device_id_scheme, bool, 0444);
+ static int only_lcd = -1;
+ module_param(only_lcd, int, 0444);
+
++static bool has_backlight;
+ static int register_count;
+ static DEFINE_MUTEX(register_count_mutex);
+ static DEFINE_MUTEX(video_list_lock);
+@@ -1222,6 +1223,9 @@ acpi_video_bus_get_one_device(struct acpi_device *device,
+ acpi_video_device_bind(video, data);
+ acpi_video_device_find_cap(data);
+
++ if (data->cap._BCM && data->cap._BCL)
++ has_backlight = true;
++
+ mutex_lock(&video->device_list_lock);
+ list_add_tail(&data->entry, &video->video_device_list);
+ mutex_unlock(&video->device_list_lock);
+@@ -2250,6 +2254,7 @@ void acpi_video_unregister(void)
+ if (register_count) {
+ acpi_bus_unregister_driver(&acpi_video_bus);
+ register_count = 0;
++ has_backlight = false;
+ }
+ mutex_unlock(®ister_count_mutex);
+ }
+@@ -2271,13 +2276,7 @@ void acpi_video_unregister_backlight(void)
+
+ bool acpi_video_handles_brightness_key_presses(void)
+ {
+- bool have_video_busses;
+-
+- mutex_lock(&video_list_lock);
+- have_video_busses = !list_empty(&video_bus_head);
+- mutex_unlock(&video_list_lock);
+-
+- return have_video_busses &&
++ return has_backlight &&
+ (report_key_events & REPORT_BRIGHTNESS_KEY_EVENTS);
+ }
+ EXPORT_SYMBOL(acpi_video_handles_brightness_key_presses);
+diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
+index 966a6bf4c1627..cf9cfc40a0283 100644
+--- a/drivers/block/xen-blkfront.c
++++ b/drivers/block/xen-blkfront.c
+@@ -152,6 +152,10 @@ static unsigned int xen_blkif_max_ring_order;
+ module_param_named(max_ring_page_order, xen_blkif_max_ring_order, int, 0444);
+ MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages to be used for the shared ring");
+
++static bool __read_mostly xen_blkif_trusted = true;
++module_param_named(trusted, xen_blkif_trusted, bool, 0644);
++MODULE_PARM_DESC(trusted, "Is the backend trusted");
++
+ #define BLK_RING_SIZE(info) \
+ __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * (info)->nr_ring_pages)
+
+@@ -210,6 +214,7 @@ struct blkfront_info
+ unsigned int feature_discard:1;
+ unsigned int feature_secdiscard:1;
+ unsigned int feature_persistent:1;
++ unsigned int bounce:1;
+ unsigned int discard_granularity;
+ unsigned int discard_alignment;
+ /* Number of 4KB segments handled */
+@@ -312,8 +317,8 @@ static int fill_grant_buffer(struct blkfront_ring_info *rinfo, int num)
+ if (!gnt_list_entry)
+ goto out_of_memory;
+
+- if (info->feature_persistent) {
+- granted_page = alloc_page(GFP_NOIO);
++ if (info->bounce) {
++ granted_page = alloc_page(GFP_NOIO | __GFP_ZERO);
+ if (!granted_page) {
+ kfree(gnt_list_entry);
+ goto out_of_memory;
+@@ -332,7 +337,7 @@ out_of_memory:
+ list_for_each_entry_safe(gnt_list_entry, n,
+ &rinfo->grants, node) {
+ list_del(&gnt_list_entry->node);
+- if (info->feature_persistent)
++ if (info->bounce)
+ __free_page(gnt_list_entry->page);
+ kfree(gnt_list_entry);
+ i--;
+@@ -378,7 +383,7 @@ static struct grant *get_grant(grant_ref_t *gref_head,
+ /* Assign a gref to this page */
+ gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head);
+ BUG_ON(gnt_list_entry->gref == -ENOSPC);
+- if (info->feature_persistent)
++ if (info->bounce)
+ grant_foreign_access(gnt_list_entry, info);
+ else {
+ /* Grant access to the GFN passed by the caller */
+@@ -402,7 +407,7 @@ static struct grant *get_indirect_grant(grant_ref_t *gref_head,
+ /* Assign a gref to this page */
+ gnt_list_entry->gref = gnttab_claim_grant_reference(gref_head);
+ BUG_ON(gnt_list_entry->gref == -ENOSPC);
+- if (!info->feature_persistent) {
++ if (!info->bounce) {
+ struct page *indirect_page;
+
+ /* Fetch a pre-allocated page to use for indirect grefs */
+@@ -705,7 +710,7 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
+ .grant_idx = 0,
+ .segments = NULL,
+ .rinfo = rinfo,
+- .need_copy = rq_data_dir(req) && info->feature_persistent,
++ .need_copy = rq_data_dir(req) && info->bounce,
+ };
+
+ /*
+@@ -983,11 +988,12 @@ static void xlvbd_flush(struct blkfront_info *info)
+ {
+ blk_queue_write_cache(info->rq, info->feature_flush ? true : false,
+ info->feature_fua ? true : false);
+- pr_info("blkfront: %s: %s %s %s %s %s\n",
++ pr_info("blkfront: %s: %s %s %s %s %s %s %s\n",
+ info->gd->disk_name, flush_info(info),
+ "persistent grants:", info->feature_persistent ?
+ "enabled;" : "disabled;", "indirect descriptors:",
+- info->max_indirect_segments ? "enabled;" : "disabled;");
++ info->max_indirect_segments ? "enabled;" : "disabled;",
++ "bounce buffer:", info->bounce ? "enabled" : "disabled;");
+ }
+
+ static int xen_translate_vdev(int vdevice, int *minor, unsigned int *offset)
+@@ -1209,7 +1215,7 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo)
+ if (!list_empty(&rinfo->indirect_pages)) {
+ struct page *indirect_page, *n;
+
+- BUG_ON(info->feature_persistent);
++ BUG_ON(info->bounce);
+ list_for_each_entry_safe(indirect_page, n, &rinfo->indirect_pages, lru) {
+ list_del(&indirect_page->lru);
+ __free_page(indirect_page);
+@@ -1226,7 +1232,7 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo)
+ 0UL);
+ rinfo->persistent_gnts_c--;
+ }
+- if (info->feature_persistent)
++ if (info->bounce)
+ __free_page(persistent_gnt->page);
+ kfree(persistent_gnt);
+ }
+@@ -1247,7 +1253,7 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo)
+ for (j = 0; j < segs; j++) {
+ persistent_gnt = rinfo->shadow[i].grants_used[j];
+ gnttab_end_foreign_access(persistent_gnt->gref, 0UL);
+- if (info->feature_persistent)
++ if (info->bounce)
+ __free_page(persistent_gnt->page);
+ kfree(persistent_gnt);
+ }
+@@ -1437,7 +1443,7 @@ static int blkif_completion(unsigned long *id,
+ data.s = s;
+ num_sg = s->num_sg;
+
+- if (bret->operation == BLKIF_OP_READ && info->feature_persistent) {
++ if (bret->operation == BLKIF_OP_READ && info->bounce) {
+ for_each_sg(s->sg, sg, num_sg, i) {
+ BUG_ON(sg->offset + sg->length > PAGE_SIZE);
+
+@@ -1496,7 +1502,7 @@ static int blkif_completion(unsigned long *id,
+ * Add the used indirect page back to the list of
+ * available pages for indirect grefs.
+ */
+- if (!info->feature_persistent) {
++ if (!info->bounce) {
+ indirect_page = s->indirect_grants[i]->page;
+ list_add(&indirect_page->lru, &rinfo->indirect_pages);
+ }
+@@ -1689,7 +1695,7 @@ static int setup_blkring(struct xenbus_device *dev,
+ for (i = 0; i < info->nr_ring_pages; i++)
+ rinfo->ring_ref[i] = GRANT_INVALID_REF;
+
+- sring = alloc_pages_exact(ring_size, GFP_NOIO);
++ sring = alloc_pages_exact(ring_size, GFP_NOIO | __GFP_ZERO);
+ if (!sring) {
+ xenbus_dev_fatal(dev, -ENOMEM, "allocating shared ring");
+ return -ENOMEM;
+@@ -1787,6 +1793,10 @@ static int talk_to_blkback(struct xenbus_device *dev,
+ if (!info)
+ return -ENODEV;
+
++ /* Check if backend is trusted. */
++ info->bounce = !xen_blkif_trusted ||
++ !xenbus_read_unsigned(dev->nodename, "trusted", 1);
++
+ max_page_order = xenbus_read_unsigned(info->xbdev->otherend,
+ "max-ring-page-order", 0);
+ ring_page_order = min(xen_blkif_max_ring_order, max_page_order);
+@@ -2196,17 +2206,18 @@ static int blkfront_setup_indirect(struct blkfront_ring_info *rinfo)
+ if (err)
+ goto out_of_memory;
+
+- if (!info->feature_persistent && info->max_indirect_segments) {
++ if (!info->bounce && info->max_indirect_segments) {
+ /*
+- * We are using indirect descriptors but not persistent
+- * grants, we need to allocate a set of pages that can be
++ * We are using indirect descriptors but don't have a bounce
++ * buffer, we need to allocate a set of pages that can be
+ * used for mapping indirect grefs
+ */
+ int num = INDIRECT_GREFS(grants) * BLK_RING_SIZE(info);
+
+ BUG_ON(!list_empty(&rinfo->indirect_pages));
+ for (i = 0; i < num; i++) {
+- struct page *indirect_page = alloc_page(GFP_KERNEL);
++ struct page *indirect_page = alloc_page(GFP_KERNEL |
++ __GFP_ZERO);
+ if (!indirect_page)
+ goto out_of_memory;
+ list_add(&indirect_page->lru, &rinfo->indirect_pages);
+@@ -2299,6 +2310,8 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
+ info->feature_persistent =
+ !!xenbus_read_unsigned(info->xbdev->otherend,
+ "feature-persistent", 0);
++ if (info->feature_persistent)
++ info->bounce = true;
+
+ indirect_segments = xenbus_read_unsigned(info->xbdev->otherend,
+ "feature-max-indirect-segments", 0);
+@@ -2570,6 +2583,13 @@ static void blkfront_delay_work(struct work_struct *work)
+ struct blkfront_info *info;
+ bool need_schedule_work = false;
+
++ /*
++ * Note that when using bounce buffers but not persistent grants
++ * there's no need to run blkfront_delay_work because grants are
++ * revoked in blkif_completion or else an error is reported and the
++ * connection is closed.
++ */
++
+ mutex_lock(&blkfront_mutex);
+
+ list_for_each_entry(info, &info_list, info_list) {
+diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
+index 7be38bc6a673b..9ac75c1cde9c2 100644
+--- a/drivers/cpufreq/amd-pstate.c
++++ b/drivers/cpufreq/amd-pstate.c
+@@ -566,6 +566,28 @@ static int amd_pstate_cpu_exit(struct cpufreq_policy *policy)
+ return 0;
+ }
+
++static int amd_pstate_cpu_resume(struct cpufreq_policy *policy)
++{
++ int ret;
++
++ ret = amd_pstate_enable(true);
++ if (ret)
++ pr_err("failed to enable amd-pstate during resume, return %d\n", ret);
++
++ return ret;
++}
++
++static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
++{
++ int ret;
++
++ ret = amd_pstate_enable(false);
++ if (ret)
++ pr_err("failed to disable amd-pstate during suspend, return %d\n", ret);
++
++ return ret;
++}
++
+ /* Sysfs attributes */
+
+ /*
+@@ -636,6 +658,8 @@ static struct cpufreq_driver amd_pstate_driver = {
+ .target = amd_pstate_target,
+ .init = amd_pstate_cpu_init,
+ .exit = amd_pstate_cpu_exit,
++ .suspend = amd_pstate_cpu_suspend,
++ .resume = amd_pstate_cpu_resume,
+ .set_boost = amd_pstate_set_boost,
+ .name = "amd-pstate",
+ .attr = amd_pstate_attr,
+diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
+index 0253731d6d25d..36c79580fba25 100644
+--- a/drivers/cpufreq/qcom-cpufreq-hw.c
++++ b/drivers/cpufreq/qcom-cpufreq-hw.c
+@@ -442,6 +442,9 @@ static int qcom_cpufreq_hw_cpu_online(struct cpufreq_policy *policy)
+ struct platform_device *pdev = cpufreq_get_driver_data();
+ int ret;
+
++ if (data->throttle_irq <= 0)
++ return 0;
++
+ ret = irq_set_affinity_hint(data->throttle_irq, policy->cpus);
+ if (ret)
+ dev_err(&pdev->dev, "Failed to set CPU affinity of %s[%d]\n",
+@@ -469,6 +472,9 @@ static int qcom_cpufreq_hw_cpu_offline(struct cpufreq_policy *policy)
+
+ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
+ {
++ if (data->throttle_irq <= 0)
++ return;
++
+ free_irq(data->throttle_irq, data);
+ }
+
+diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
+index 6b6b20da2bcfc..573b417e14833 100644
+--- a/drivers/cpufreq/qoriq-cpufreq.c
++++ b/drivers/cpufreq/qoriq-cpufreq.c
+@@ -275,6 +275,7 @@ static int qoriq_cpufreq_probe(struct platform_device *pdev)
+
+ np = of_find_matching_node(NULL, qoriq_cpufreq_blacklist);
+ if (np) {
++ of_node_put(np);
+ dev_info(&pdev->dev, "Disabling due to erratum A-008083");
+ return -ENODEV;
+ }
+diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c
+index 9b849d7811167..a443e7c42dafa 100644
+--- a/drivers/devfreq/event/exynos-ppmu.c
++++ b/drivers/devfreq/event/exynos-ppmu.c
+@@ -519,15 +519,19 @@ static int of_get_devfreq_events(struct device_node *np,
+
+ count = of_get_child_count(events_np);
+ desc = devm_kcalloc(dev, count, sizeof(*desc), GFP_KERNEL);
+- if (!desc)
++ if (!desc) {
++ of_node_put(events_np);
+ return -ENOMEM;
++ }
+ info->num_events = count;
+
+ of_id = of_match_device(exynos_ppmu_id_match, dev);
+ if (of_id)
+ info->ppmu_type = (enum exynos_ppmu_type)of_id->data;
+- else
++ else {
++ of_node_put(events_np);
+ return -EINVAL;
++ }
+
+ j = 0;
+ for_each_child_of_node(events_np, node) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+index 6ca1db3c243f9..3610bcd29ad92 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+@@ -703,7 +703,8 @@ int amdgpu_amdkfd_flush_gpu_tlb_pasid(struct amdgpu_device *adev,
+ {
+ bool all_hub = false;
+
+- if (adev->family == AMDGPU_FAMILY_AI)
++ if (adev->family == AMDGPU_FAMILY_AI ||
++ adev->family == AMDGPU_FAMILY_RV)
+ all_hub = true;
+
+ return amdgpu_gmc_flush_gpu_tlb_pasid(adev, pasid, flush_type, all_hub);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+index 49f734137f158..66e40cac5eb36 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+@@ -5140,7 +5140,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
+ */
+ amdgpu_unregister_gpu_instance(tmp_adev);
+
+- drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
++ drm_fb_helper_set_suspend_unlocked(adev_to_drm(tmp_adev)->fb_helper, true);
+
+ /* disable ras on ALL IPs */
+ if (!need_emergency_restart &&
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+index ea3e8c66211fd..ebd53dacfac4c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+@@ -333,6 +333,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
+ if (!amdgpu_device_has_dc_support(adev)) {
+ if (!adev->enable_virtual_display)
+ /* Disable vblank IRQs aggressively for power-saving */
++ /* XXX: can this be enabled for DC? */
+ adev_to_drm(adev)->vblank_disable_immediate = true;
+
+ r = drm_vblank_init(adev_to_drm(adev), adev->mode_info.num_crtc);
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 78a38c3b7d664..6dc9808760fc8 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -4286,9 +4286,6 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
+ }
+ #endif
+
+- /* Disable vblank IRQs aggressively for power-saving. */
+- adev_to_drm(adev)->vblank_disable_immediate = true;
+-
+ /* loops over all connectors on the board */
+ for (i = 0; i < link_cnt; i++) {
+ struct dc_link *link = NULL;
+diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
+index 9ae294eb7fb4b..12b7d4d392167 100644
+--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
++++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
+@@ -932,8 +932,9 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
+ case I915_CONTEXT_PARAM_PERSISTENCE:
+ if (args->size)
+ ret = -EINVAL;
+- ret = proto_context_set_persistence(fpriv->dev_priv, pc,
+- args->value);
++ else
++ ret = proto_context_set_persistence(fpriv->dev_priv, pc,
++ args->value);
+ break;
+
+ case I915_CONTEXT_PARAM_PROTECTED_CONTENT:
+diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
+index 62b3f332bbf5c..0478fa6259ebf 100644
+--- a/drivers/gpu/drm/i915/i915_driver.c
++++ b/drivers/gpu/drm/i915/i915_driver.c
+@@ -538,6 +538,7 @@ mask_err:
+ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
+ {
+ struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
++ struct pci_dev *root_pdev;
+ int ret;
+
+ if (i915_inject_probe_failure(dev_priv))
+@@ -651,6 +652,15 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
+
+ intel_bw_init_hw(dev_priv);
+
++ /*
++ * FIXME: Temporary hammer to avoid freezing the machine on our DGFX
++ * This should be totally removed when we handle the pci states properly
++ * on runtime PM and on s2idle cases.
++ */
++ root_pdev = pcie_find_root_port(pdev);
++ if (root_pdev)
++ pci_d3cold_disable(root_pdev);
++
+ return 0;
+
+ err_msi:
+@@ -674,11 +684,16 @@ err_perf:
+ static void i915_driver_hw_remove(struct drm_i915_private *dev_priv)
+ {
+ struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
++ struct pci_dev *root_pdev;
+
+ i915_perf_fini(dev_priv);
+
+ if (pdev->msi_enabled)
+ pci_disable_msi(pdev);
++
++ root_pdev = pcie_find_root_port(pdev);
++ if (root_pdev)
++ pci_d3cold_enable(root_pdev);
+ }
+
+ /**
+@@ -1195,14 +1210,6 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation)
+ goto out;
+ }
+
+- /*
+- * FIXME: Temporary hammer to avoid freezing the machine on our DGFX
+- * This should be totally removed when we handle the pci states properly
+- * on runtime PM and on s2idle cases.
+- */
+- if (suspend_to_idle(dev_priv))
+- pci_d3cold_disable(pdev);
+-
+ pci_disable_device(pdev);
+ /*
+ * During hibernation on some platforms the BIOS may try to access
+@@ -1367,8 +1374,6 @@ static int i915_drm_resume_early(struct drm_device *dev)
+
+ pci_set_master(pdev);
+
+- pci_d3cold_enable(pdev);
+-
+ disable_rpm_wakeref_asserts(&dev_priv->runtime_pm);
+
+ ret = vlv_resume_prepare(dev_priv, false);
+@@ -1545,7 +1550,6 @@ static int intel_runtime_suspend(struct device *kdev)
+ {
+ struct drm_i915_private *dev_priv = kdev_to_i915(kdev);
+ struct intel_runtime_pm *rpm = &dev_priv->runtime_pm;
+- struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
+ int ret;
+
+ if (drm_WARN_ON_ONCE(&dev_priv->drm, !HAS_RUNTIME_PM(dev_priv)))
+@@ -1591,12 +1595,6 @@ static int intel_runtime_suspend(struct device *kdev)
+ drm_err(&dev_priv->drm,
+ "Unclaimed access detected prior to suspending\n");
+
+- /*
+- * FIXME: Temporary hammer to avoid freezing the machine on our DGFX
+- * This should be totally removed when we handle the pci states properly
+- * on runtime PM and on s2idle cases.
+- */
+- pci_d3cold_disable(pdev);
+ rpm->suspended = true;
+
+ /*
+@@ -1635,7 +1633,6 @@ static int intel_runtime_resume(struct device *kdev)
+ {
+ struct drm_i915_private *dev_priv = kdev_to_i915(kdev);
+ struct intel_runtime_pm *rpm = &dev_priv->runtime_pm;
+- struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
+ int ret;
+
+ if (drm_WARN_ON_ONCE(&dev_priv->drm, !HAS_RUNTIME_PM(dev_priv)))
+@@ -1648,7 +1645,6 @@ static int intel_runtime_resume(struct device *kdev)
+
+ intel_opregion_notify_adapter(dev_priv, PCI_D0);
+ rpm->suspended = false;
+- pci_d3cold_enable(pdev);
+ if (intel_uncore_unclaimed_mmio(&dev_priv->uncore))
+ drm_dbg(&dev_priv->drm,
+ "Unclaimed access during suspend, bios?\n");
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+index 3940b9c6323be..fffd2ef897a00 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+@@ -1187,12 +1187,13 @@ static void dpu_encoder_vblank_callback(struct drm_encoder *drm_enc,
+ DPU_ATRACE_BEGIN("encoder_vblank_callback");
+ dpu_enc = to_dpu_encoder_virt(drm_enc);
+
++ atomic_inc(&phy_enc->vsync_cnt);
++
+ spin_lock_irqsave(&dpu_enc->enc_spinlock, lock_flags);
+ if (dpu_enc->crtc)
+ dpu_crtc_vblank_callback(dpu_enc->crtc);
+ spin_unlock_irqrestore(&dpu_enc->enc_spinlock, lock_flags);
+
+- atomic_inc(&phy_enc->vsync_cnt);
+ DPU_ATRACE_END("encoder_vblank_callback");
+ }
+
+diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
+index c6d60c8d286de..fec4e39732879 100644
+--- a/drivers/gpu/drm/msm/msm_gem_submit.c
++++ b/drivers/gpu/drm/msm/msm_gem_submit.c
+@@ -913,7 +913,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
+ INT_MAX, GFP_KERNEL);
+ }
+ if (submit->fence_id < 0) {
+- ret = submit->fence_id = 0;
++ ret = submit->fence_id;
+ submit->fence_id = 0;
+ }
+
+diff --git a/drivers/hwmon/ibmaem.c b/drivers/hwmon/ibmaem.c
+index de6baf6ca3d1e..dab4908b78a88 100644
+--- a/drivers/hwmon/ibmaem.c
++++ b/drivers/hwmon/ibmaem.c
+@@ -550,7 +550,7 @@ static int aem_init_aem1_inst(struct aem_ipmi_data *probe, u8 module_handle)
+
+ res = platform_device_add(data->pdev);
+ if (res)
+- goto ipmi_err;
++ goto dev_add_err;
+
+ platform_set_drvdata(data->pdev, data);
+
+@@ -598,7 +598,9 @@ hwmon_reg_err:
+ ipmi_destroy_user(data->ipmi.user);
+ ipmi_err:
+ platform_set_drvdata(data->pdev, NULL);
+- platform_device_unregister(data->pdev);
++ platform_device_del(data->pdev);
++dev_add_err:
++ platform_device_put(data->pdev);
+ dev_err:
+ ida_simple_remove(&aem_ida, data->id);
+ id_err:
+@@ -690,7 +692,7 @@ static int aem_init_aem2_inst(struct aem_ipmi_data *probe,
+
+ res = platform_device_add(data->pdev);
+ if (res)
+- goto ipmi_err;
++ goto dev_add_err;
+
+ platform_set_drvdata(data->pdev, data);
+
+@@ -738,7 +740,9 @@ hwmon_reg_err:
+ ipmi_destroy_user(data->ipmi.user);
+ ipmi_err:
+ platform_set_drvdata(data->pdev, NULL);
+- platform_device_unregister(data->pdev);
++ platform_device_del(data->pdev);
++dev_add_err:
++ platform_device_put(data->pdev);
+ dev_err:
+ ida_simple_remove(&aem_ida, data->id);
+ id_err:
+diff --git a/drivers/hwmon/occ/common.c b/drivers/hwmon/occ/common.c
+index f00cd59f1d19f..1757f3ab842e1 100644
+--- a/drivers/hwmon/occ/common.c
++++ b/drivers/hwmon/occ/common.c
+@@ -145,7 +145,7 @@ static int occ_poll(struct occ *occ)
+ cmd[6] = 0; /* checksum lsb */
+
+ /* mutex should already be locked if necessary */
+- rc = occ->send_cmd(occ, cmd, sizeof(cmd));
++ rc = occ->send_cmd(occ, cmd, sizeof(cmd), &occ->resp, sizeof(occ->resp));
+ if (rc) {
+ occ->last_error = rc;
+ if (occ->error_count++ > OCC_ERROR_COUNT_THRESHOLD)
+@@ -182,6 +182,7 @@ static int occ_set_user_power_cap(struct occ *occ, u16 user_power_cap)
+ {
+ int rc;
+ u8 cmd[8];
++ u8 resp[8];
+ __be16 user_power_cap_be = cpu_to_be16(user_power_cap);
+
+ cmd[0] = 0; /* sequence number */
+@@ -198,7 +199,7 @@ static int occ_set_user_power_cap(struct occ *occ, u16 user_power_cap)
+ if (rc)
+ return rc;
+
+- rc = occ->send_cmd(occ, cmd, sizeof(cmd));
++ rc = occ->send_cmd(occ, cmd, sizeof(cmd), resp, sizeof(resp));
+
+ mutex_unlock(&occ->lock);
+
+diff --git a/drivers/hwmon/occ/common.h b/drivers/hwmon/occ/common.h
+index 2dd4a4d240c0f..726943af9a077 100644
+--- a/drivers/hwmon/occ/common.h
++++ b/drivers/hwmon/occ/common.h
+@@ -96,7 +96,8 @@ struct occ {
+
+ int powr_sample_time_us; /* average power sample time */
+ u8 poll_cmd_data; /* to perform OCC poll command */
+- int (*send_cmd)(struct occ *occ, u8 *cmd, size_t len);
++ int (*send_cmd)(struct occ *occ, u8 *cmd, size_t len, void *resp,
++ size_t resp_len);
+
+ unsigned long next_update;
+ struct mutex lock; /* lock OCC access */
+diff --git a/drivers/hwmon/occ/p8_i2c.c b/drivers/hwmon/occ/p8_i2c.c
+index 9e61e1fb5142c..c35c07964d856 100644
+--- a/drivers/hwmon/occ/p8_i2c.c
++++ b/drivers/hwmon/occ/p8_i2c.c
+@@ -111,7 +111,8 @@ static int p8_i2c_occ_putscom_be(struct i2c_client *client, u32 address,
+ be32_to_cpu(data1));
+ }
+
+-static int p8_i2c_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len)
++static int p8_i2c_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len,
++ void *resp, size_t resp_len)
+ {
+ int i, rc;
+ unsigned long start;
+@@ -120,7 +121,7 @@ static int p8_i2c_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len)
+ const long wait_time = msecs_to_jiffies(OCC_CMD_IN_PRG_WAIT_MS);
+ struct p8_i2c_occ *ctx = to_p8_i2c_occ(occ);
+ struct i2c_client *client = ctx->client;
+- struct occ_response *resp = &occ->resp;
++ struct occ_response *or = (struct occ_response *)resp;
+
+ start = jiffies;
+
+@@ -151,7 +152,7 @@ static int p8_i2c_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len)
+ return rc;
+
+ /* wait for OCC */
+- if (resp->return_status == OCC_RESP_CMD_IN_PRG) {
++ if (or->return_status == OCC_RESP_CMD_IN_PRG) {
+ rc = -EALREADY;
+
+ if (time_after(jiffies, start + timeout))
+@@ -163,7 +164,7 @@ static int p8_i2c_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len)
+ } while (rc);
+
+ /* check the OCC response */
+- switch (resp->return_status) {
++ switch (or->return_status) {
+ case OCC_RESP_CMD_IN_PRG:
+ rc = -ETIMEDOUT;
+ break;
+@@ -192,8 +193,8 @@ static int p8_i2c_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len)
+ if (rc < 0)
+ return rc;
+
+- data_length = get_unaligned_be16(&resp->data_length);
+- if (data_length > OCC_RESP_DATA_BYTES)
++ data_length = get_unaligned_be16(&or->data_length);
++ if ((data_length + 7) > resp_len)
+ return -EMSGSIZE;
+
+ /* fetch the rest of the response data */
+diff --git a/drivers/hwmon/occ/p9_sbe.c b/drivers/hwmon/occ/p9_sbe.c
+index 49b13cc01073a..bad349bf9f339 100644
+--- a/drivers/hwmon/occ/p9_sbe.c
++++ b/drivers/hwmon/occ/p9_sbe.c
+@@ -78,11 +78,10 @@ done:
+ return notify;
+ }
+
+-static int p9_sbe_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len)
++static int p9_sbe_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len,
++ void *resp, size_t resp_len)
+ {
+- struct occ_response *resp = &occ->resp;
+ struct p9_sbe_occ *ctx = to_p9_sbe_occ(occ);
+- size_t resp_len = sizeof(*resp);
+ int rc;
+
+ rc = fsi_occ_submit(ctx->sbe, cmd, len, resp, &resp_len);
+@@ -96,7 +95,7 @@ static int p9_sbe_occ_send_cmd(struct occ *occ, u8 *cmd, size_t len)
+ return rc;
+ }
+
+- switch (resp->return_status) {
++ switch (((struct occ_response *)resp)->return_status) {
+ case OCC_RESP_CMD_IN_PRG:
+ rc = -ETIMEDOUT;
+ break;
+diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
+index 1c107d6d03b99..b985e0d9bc05e 100644
+--- a/drivers/infiniband/core/cm.c
++++ b/drivers/infiniband/core/cm.c
+@@ -1252,8 +1252,10 @@ struct ib_cm_id *ib_cm_insert_listen(struct ib_device *device,
+ return ERR_CAST(cm_id_priv);
+
+ err = cm_init_listen(cm_id_priv, service_id, 0);
+- if (err)
++ if (err) {
++ ib_destroy_cm_id(&cm_id_priv->id);
+ return ERR_PTR(err);
++ }
+
+ spin_lock_irq(&cm_id_priv->lock);
+ listen_id_priv = cm_insert_listen(cm_id_priv, cm_handler);
+diff --git a/drivers/infiniband/hw/qedr/qedr.h b/drivers/infiniband/hw/qedr/qedr.h
+index 8def88cfa3009..db9ef3e1eb97c 100644
+--- a/drivers/infiniband/hw/qedr/qedr.h
++++ b/drivers/infiniband/hw/qedr/qedr.h
+@@ -418,6 +418,7 @@ struct qedr_qp {
+ u32 sq_psn;
+ u32 qkey;
+ u32 dest_qp_num;
++ u8 timeout;
+
+ /* Relevant to qps created from kernel space only (ULPs) */
+ u8 prev_wqe_size;
+diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
+index a53476653b0d9..df4d7970c1ad5 100644
+--- a/drivers/infiniband/hw/qedr/verbs.c
++++ b/drivers/infiniband/hw/qedr/verbs.c
+@@ -2612,6 +2612,8 @@ int qedr_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
+ 1 << max_t(int, attr->timeout - 8, 0);
+ else
+ qp_params.ack_timeout = 0;
++
++ qp->timeout = attr->timeout;
+ }
+
+ if (attr_mask & IB_QP_RETRY_CNT) {
+@@ -2771,7 +2773,7 @@ int qedr_query_qp(struct ib_qp *ibqp,
+ rdma_ah_set_dgid_raw(&qp_attr->ah_attr, ¶ms.dgid.bytes[0]);
+ rdma_ah_set_port_num(&qp_attr->ah_attr, 1);
+ rdma_ah_set_sl(&qp_attr->ah_attr, 0);
+- qp_attr->timeout = params.timeout;
++ qp_attr->timeout = qp->timeout;
+ qp_attr->rnr_retry = params.rnr_retry;
+ qp_attr->retry_cnt = params.retry_cnt;
+ qp_attr->min_rnr_timer = params.min_rnr_nak_timer;
+diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
+index 2b26435a6946e..e362a7471512d 100644
+--- a/drivers/md/dm-raid.c
++++ b/drivers/md/dm-raid.c
+@@ -1001,12 +1001,13 @@ static int validate_region_size(struct raid_set *rs, unsigned long region_size)
+ static int validate_raid_redundancy(struct raid_set *rs)
+ {
+ unsigned int i, rebuild_cnt = 0;
+- unsigned int rebuilds_per_group = 0, copies;
++ unsigned int rebuilds_per_group = 0, copies, raid_disks;
+ unsigned int group_size, last_group_start;
+
+- for (i = 0; i < rs->md.raid_disks; i++)
+- if (!test_bit(In_sync, &rs->dev[i].rdev.flags) ||
+- !rs->dev[i].rdev.sb_page)
++ for (i = 0; i < rs->raid_disks; i++)
++ if (!test_bit(FirstUse, &rs->dev[i].rdev.flags) &&
++ ((!test_bit(In_sync, &rs->dev[i].rdev.flags) ||
++ !rs->dev[i].rdev.sb_page)))
+ rebuild_cnt++;
+
+ switch (rs->md.level) {
+@@ -1046,8 +1047,9 @@ static int validate_raid_redundancy(struct raid_set *rs)
+ * A A B B C
+ * C D D E E
+ */
++ raid_disks = min(rs->raid_disks, rs->md.raid_disks);
+ if (__is_raid10_near(rs->md.new_layout)) {
+- for (i = 0; i < rs->md.raid_disks; i++) {
++ for (i = 0; i < raid_disks; i++) {
+ if (!(i % copies))
+ rebuilds_per_group = 0;
+ if ((!rs->dev[i].rdev.sb_page ||
+@@ -1070,10 +1072,10 @@ static int validate_raid_redundancy(struct raid_set *rs)
+ * results in the need to treat the last (potentially larger)
+ * set differently.
+ */
+- group_size = (rs->md.raid_disks / copies);
+- last_group_start = (rs->md.raid_disks / group_size) - 1;
++ group_size = (raid_disks / copies);
++ last_group_start = (raid_disks / group_size) - 1;
+ last_group_start *= group_size;
+- for (i = 0; i < rs->md.raid_disks; i++) {
++ for (i = 0; i < raid_disks; i++) {
+ if (!(i % copies) && !(i > last_group_start))
+ rebuilds_per_group = 0;
+ if ((!rs->dev[i].rdev.sb_page ||
+@@ -1588,7 +1590,7 @@ static sector_t __rdev_sectors(struct raid_set *rs)
+ {
+ int i;
+
+- for (i = 0; i < rs->md.raid_disks; i++) {
++ for (i = 0; i < rs->raid_disks; i++) {
+ struct md_rdev *rdev = &rs->dev[i].rdev;
+
+ if (!test_bit(Journal, &rdev->flags) &&
+@@ -3771,13 +3773,13 @@ static int raid_iterate_devices(struct dm_target *ti,
+ unsigned int i;
+ int r = 0;
+
+- for (i = 0; !r && i < rs->md.raid_disks; i++)
+- if (rs->dev[i].data_dev)
+- r = fn(ti,
+- rs->dev[i].data_dev,
+- 0, /* No offset on data devs */
+- rs->md.dev_sectors,
+- data);
++ for (i = 0; !r && i < rs->raid_disks; i++) {
++ if (rs->dev[i].data_dev) {
++ r = fn(ti, rs->dev[i].data_dev,
++ 0, /* No offset on data devs */
++ rs->md.dev_sectors, data);
++ }
++ }
+
+ return r;
+ }
+diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
+index d6ce5a09fd358..0dd4679deb612 100644
+--- a/drivers/md/raid5.c
++++ b/drivers/md/raid5.c
+@@ -8023,6 +8023,7 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
+ */
+ if (rdev->saved_raid_disk >= 0 &&
+ rdev->saved_raid_disk >= first &&
++ rdev->saved_raid_disk <= last &&
+ conf->disks[rdev->saved_raid_disk].rdev == NULL)
+ first = rdev->saved_raid_disk;
+
+diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
+index a86b1f71762ea..d7fb33c078e81 100644
+--- a/drivers/net/bonding/bond_3ad.c
++++ b/drivers/net/bonding/bond_3ad.c
+@@ -2228,7 +2228,8 @@ void bond_3ad_unbind_slave(struct slave *slave)
+ temp_aggregator->num_of_ports--;
+ if (__agg_active_ports(temp_aggregator) == 0) {
+ select_new_active_agg = temp_aggregator->is_active;
+- ad_clear_agg(temp_aggregator);
++ if (temp_aggregator->num_of_ports == 0)
++ ad_clear_agg(temp_aggregator);
+ if (select_new_active_agg) {
+ slave_info(bond->dev, slave->dev, "Removing an active aggregator\n");
+ /* select new active aggregator */
+diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
+index 303c8d32d451e..007d43e46dcb0 100644
+--- a/drivers/net/bonding/bond_alb.c
++++ b/drivers/net/bonding/bond_alb.c
+@@ -1302,12 +1302,12 @@ int bond_alb_initialize(struct bonding *bond, int rlb_enabled)
+ return res;
+
+ if (rlb_enabled) {
+- bond->alb_info.rlb_enabled = 1;
+ res = rlb_initialize(bond);
+ if (res) {
+ tlb_deinitialize(bond);
+ return res;
+ }
++ bond->alb_info.rlb_enabled = 1;
+ } else {
+ bond->alb_info.rlb_enabled = 0;
+ }
+diff --git a/drivers/net/caif/caif_virtio.c b/drivers/net/caif/caif_virtio.c
+index 444ef6a342f69..14c5b2db65f41 100644
+--- a/drivers/net/caif/caif_virtio.c
++++ b/drivers/net/caif/caif_virtio.c
+@@ -721,13 +721,21 @@ static int cfv_probe(struct virtio_device *vdev)
+ /* Carrier is off until netdevice is opened */
+ netif_carrier_off(netdev);
+
++ /* serialize netdev register + virtio_device_ready() with ndo_open() */
++ rtnl_lock();
++
+ /* register Netdev */
+- err = register_netdev(netdev);
++ err = register_netdevice(netdev);
+ if (err) {
++ rtnl_unlock();
+ dev_err(&vdev->dev, "Unable to register netdev (%d)\n", err);
+ goto err;
+ }
+
++ virtio_device_ready(vdev);
++
++ rtnl_unlock();
++
+ debugfs_init(cfv);
+
+ return 0;
+diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
+index 87e81c636339f..be0edfa093d04 100644
+--- a/drivers/net/dsa/bcm_sf2.c
++++ b/drivers/net/dsa/bcm_sf2.c
+@@ -878,6 +878,11 @@ static void bcm_sf2_sw_mac_link_up(struct dsa_switch *ds, int port,
+ if (duplex == DUPLEX_FULL)
+ reg |= DUPLX_MODE;
+
++ if (tx_pause)
++ reg |= TXFLOW_CNTL;
++ if (rx_pause)
++ reg |= RXFLOW_CNTL;
++
+ core_writel(priv, reg, offset);
+ }
+
+diff --git a/drivers/net/dsa/hirschmann/hellcreek_ptp.c b/drivers/net/dsa/hirschmann/hellcreek_ptp.c
+index 2572c6087bb5a..b28baab6d56a1 100644
+--- a/drivers/net/dsa/hirschmann/hellcreek_ptp.c
++++ b/drivers/net/dsa/hirschmann/hellcreek_ptp.c
+@@ -300,6 +300,7 @@ static int hellcreek_led_setup(struct hellcreek *hellcreek)
+ const char *label, *state;
+ int ret = -EINVAL;
+
++ of_node_get(hellcreek->dev->of_node);
+ leds = of_find_node_by_name(hellcreek->dev->of_node, "leds");
+ if (!leds) {
+ dev_err(hellcreek->dev, "No LEDs specified in device tree!\n");
+diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
+index 52a8566071edd..4a071f96ea283 100644
+--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
++++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
+@@ -1883,6 +1883,8 @@ static void vsc9959_psfp_sgi_table_del(struct ocelot *ocelot,
+ static void vsc9959_psfp_counters_get(struct ocelot *ocelot, u32 index,
+ struct felix_stream_filter_counters *counters)
+ {
++ mutex_lock(&ocelot->stats_lock);
++
+ ocelot_rmw(ocelot, SYS_STAT_CFG_STAT_VIEW(index),
+ SYS_STAT_CFG_STAT_VIEW_M,
+ SYS_STAT_CFG);
+@@ -1897,6 +1899,8 @@ static void vsc9959_psfp_counters_get(struct ocelot *ocelot, u32 index,
+ SYS_STAT_CFG_STAT_VIEW(index) |
+ SYS_STAT_CFG_STAT_CLEAR_SHOT(0x10),
+ SYS_STAT_CFG);
++
++ mutex_unlock(&ocelot->stats_lock);
+ }
+
+ static int vsc9959_psfp_filter_add(struct ocelot *ocelot, int port,
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+index 79deb19e3a194..7ad663c5b1ab7 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+@@ -4418,6 +4418,8 @@ static int mlxsw_sp_nexthop4_init(struct mlxsw_sp *mlxsw_sp,
+ return 0;
+
+ err_nexthop_neigh_init:
++ list_del(&nh->router_list_node);
++ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
+ mlxsw_sp_nexthop_remove(mlxsw_sp, nh);
+ return err;
+ }
+@@ -6743,6 +6745,7 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
+ const struct fib6_info *rt)
+ {
+ struct net_device *dev = rt->fib6_nh->fib_nh_dev;
++ int err;
+
+ nh->nhgi = nh_grp->nhgi;
+ nh->nh_weight = rt->fib6_nh->fib_nh_weight;
+@@ -6758,7 +6761,16 @@ static int mlxsw_sp_nexthop6_init(struct mlxsw_sp *mlxsw_sp,
+ return 0;
+ nh->ifindex = dev->ifindex;
+
+- return mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
++ err = mlxsw_sp_nexthop_type_init(mlxsw_sp, nh, dev);
++ if (err)
++ goto err_nexthop_type_init;
++
++ return 0;
++
++err_nexthop_type_init:
++ list_del(&nh->router_list_node);
++ mlxsw_sp_nexthop_counter_free(mlxsw_sp, nh);
++ return err;
+ }
+
+ static void mlxsw_sp_nexthop6_fini(struct mlxsw_sp *mlxsw_sp,
+diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+index 5389fffc694ab..5edc8b7176c82 100644
+--- a/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
++++ b/drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
+@@ -396,6 +396,14 @@ static int sparx5_handle_port_mdb_add(struct net_device *dev,
+ u32 mact_entry;
+ int res, err;
+
++ if (!sparx5_netdevice_check(dev))
++ return -EOPNOTSUPP;
++
++ if (netif_is_bridge_master(v->obj.orig_dev)) {
++ sparx5_mact_learn(spx5, PGID_CPU, v->addr, v->vid);
++ return 0;
++ }
++
+ /* When VLAN unaware the vlan value is not parsed and we receive vid 0.
+ * Fall back to bridge vid 1.
+ */
+@@ -461,6 +469,14 @@ static int sparx5_handle_port_mdb_del(struct net_device *dev,
+ u32 mact_entry, res, pgid_entry[3];
+ int err;
+
++ if (!sparx5_netdevice_check(dev))
++ return -EOPNOTSUPP;
++
++ if (netif_is_bridge_master(v->obj.orig_dev)) {
++ sparx5_mact_forget(spx5, v->addr, v->vid);
++ return 0;
++ }
++
+ if (!br_vlan_enabled(spx5->hw_bridge_dev))
+ vid = 1;
+ else
+@@ -500,6 +516,7 @@ static int sparx5_handle_port_obj_add(struct net_device *dev,
+ SWITCHDEV_OBJ_PORT_VLAN(obj));
+ break;
+ case SWITCHDEV_OBJ_ID_PORT_MDB:
++ case SWITCHDEV_OBJ_ID_HOST_MDB:
+ err = sparx5_handle_port_mdb_add(dev, nb,
+ SWITCHDEV_OBJ_PORT_MDB(obj));
+ break;
+@@ -552,6 +569,7 @@ static int sparx5_handle_port_obj_del(struct net_device *dev,
+ SWITCHDEV_OBJ_PORT_VLAN(obj)->vid);
+ break;
+ case SWITCHDEV_OBJ_ID_PORT_MDB:
++ case SWITCHDEV_OBJ_ID_HOST_MDB:
+ err = sparx5_handle_port_mdb_del(dev, nb,
+ SWITCHDEV_OBJ_PORT_MDB(obj));
+ break;
+diff --git a/drivers/net/ethernet/smsc/epic100.c b/drivers/net/ethernet/smsc/epic100.c
+index a0654e88444cf..0329caf63279c 100644
+--- a/drivers/net/ethernet/smsc/epic100.c
++++ b/drivers/net/ethernet/smsc/epic100.c
+@@ -1515,14 +1515,14 @@ static void epic_remove_one(struct pci_dev *pdev)
+ struct net_device *dev = pci_get_drvdata(pdev);
+ struct epic_private *ep = netdev_priv(dev);
+
++ unregister_netdev(dev);
+ dma_free_coherent(&pdev->dev, TX_TOTAL_SIZE, ep->tx_ring,
+ ep->tx_ring_dma);
+ dma_free_coherent(&pdev->dev, RX_TOTAL_SIZE, ep->rx_ring,
+ ep->rx_ring_dma);
+- unregister_netdev(dev);
+ pci_iounmap(pdev, ep->ioaddr);
+- pci_release_regions(pdev);
+ free_netdev(dev);
++ pci_release_regions(pdev);
+ pci_disable_device(pdev);
+ /* pci_power_off(pdev, -1); */
+ }
+diff --git a/drivers/net/phy/ax88796b.c b/drivers/net/phy/ax88796b.c
+index 4578963375055..0f1e617a26c91 100644
+--- a/drivers/net/phy/ax88796b.c
++++ b/drivers/net/phy/ax88796b.c
+@@ -88,8 +88,10 @@ static void asix_ax88772a_link_change_notify(struct phy_device *phydev)
+ /* Reset PHY, otherwise MII_LPA will provide outdated information.
+ * This issue is reproducible only with some link partner PHYs
+ */
+- if (phydev->state == PHY_NOLINK && phydev->drv->soft_reset)
+- phydev->drv->soft_reset(phydev);
++ if (phydev->state == PHY_NOLINK) {
++ phy_init_hw(phydev);
++ phy_start_aneg(phydev);
++ }
+ }
+
+ static struct phy_driver asix_driver[] = {
+diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
+index ce17b2af3218f..a792dd6d2ec33 100644
+--- a/drivers/net/phy/dp83822.c
++++ b/drivers/net/phy/dp83822.c
+@@ -228,9 +228,7 @@ static int dp83822_config_intr(struct phy_device *phydev)
+ if (misr_status < 0)
+ return misr_status;
+
+- misr_status |= (DP83822_RX_ERR_HF_INT_EN |
+- DP83822_FALSE_CARRIER_HF_INT_EN |
+- DP83822_LINK_STAT_INT_EN |
++ misr_status |= (DP83822_LINK_STAT_INT_EN |
+ DP83822_ENERGY_DET_INT_EN |
+ DP83822_LINK_QUAL_INT_EN);
+
+diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
+index f122026c46826..2fc851082e7b4 100644
+--- a/drivers/net/phy/phy.c
++++ b/drivers/net/phy/phy.c
+@@ -31,6 +31,7 @@
+ #include <linux/io.h>
+ #include <linux/uaccess.h>
+ #include <linux/atomic.h>
++#include <linux/suspend.h>
+ #include <net/netlink.h>
+ #include <net/genetlink.h>
+ #include <net/sock.h>
+@@ -972,6 +973,28 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
+ struct phy_driver *drv = phydev->drv;
+ irqreturn_t ret;
+
++ /* Wakeup interrupts may occur during a system sleep transition.
++ * Postpone handling until the PHY has resumed.
++ */
++ if (IS_ENABLED(CONFIG_PM_SLEEP) && phydev->irq_suspended) {
++ struct net_device *netdev = phydev->attached_dev;
++
++ if (netdev) {
++ struct device *parent = netdev->dev.parent;
++
++ if (netdev->wol_enabled)
++ pm_system_wakeup();
++ else if (device_may_wakeup(&netdev->dev))
++ pm_wakeup_dev_event(&netdev->dev, 0, true);
++ else if (parent && device_may_wakeup(parent))
++ pm_wakeup_dev_event(parent, 0, true);
++ }
++
++ phydev->irq_rerun = 1;
++ disable_irq_nosync(irq);
++ return IRQ_HANDLED;
++ }
++
+ mutex_lock(&phydev->lock);
+ ret = drv->handle_interrupt(phydev);
+ mutex_unlock(&phydev->lock);
+diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
+index 8406ac739def8..2a53ae38a962b 100644
+--- a/drivers/net/phy/phy_device.c
++++ b/drivers/net/phy/phy_device.c
+@@ -277,6 +277,15 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
+ if (phydev->mac_managed_pm)
+ return 0;
+
++ /* Wakeup interrupts may occur during the system sleep transition when
++ * the PHY is inaccessible. Set flag to postpone handling until the PHY
++ * has resumed. Wait for concurrent interrupt handler to complete.
++ */
++ if (phy_interrupt_is_valid(phydev)) {
++ phydev->irq_suspended = 1;
++ synchronize_irq(phydev->irq);
++ }
++
+ /* We must stop the state machine manually, otherwise it stops out of
+ * control, possibly with the phydev->lock held. Upon resume, netdev
+ * may call phy routines that try to grab the same lock, and that may
+@@ -314,6 +323,20 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
+ if (ret < 0)
+ return ret;
+ no_resume:
++ if (phy_interrupt_is_valid(phydev)) {
++ phydev->irq_suspended = 0;
++ synchronize_irq(phydev->irq);
++
++ /* Rerun interrupts which were postponed by phy_interrupt()
++ * because they occurred during the system sleep transition.
++ */
++ if (phydev->irq_rerun) {
++ phydev->irq_rerun = 0;
++ enable_irq(phydev->irq);
++ irq_wake_thread(phydev->irq, phydev);
++ }
++ }
++
+ if (phydev->attached_dev && phydev->adjust_link)
+ phy_start_machine(phydev);
+
+diff --git a/drivers/net/tun.c b/drivers/net/tun.c
+index dbe4c0a4be2cd..4ebf83bb9888b 100644
+--- a/drivers/net/tun.c
++++ b/drivers/net/tun.c
+@@ -274,6 +274,12 @@ static void tun_napi_init(struct tun_struct *tun, struct tun_file *tfile,
+ }
+ }
+
++static void tun_napi_enable(struct tun_file *tfile)
++{
++ if (tfile->napi_enabled)
++ napi_enable(&tfile->napi);
++}
++
+ static void tun_napi_disable(struct tun_file *tfile)
+ {
+ if (tfile->napi_enabled)
+@@ -635,7 +641,8 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
+ tun = rtnl_dereference(tfile->tun);
+
+ if (tun && clean) {
+- tun_napi_disable(tfile);
++ if (!tfile->detached)
++ tun_napi_disable(tfile);
+ tun_napi_del(tfile);
+ }
+
+@@ -654,8 +661,10 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
+ if (clean) {
+ RCU_INIT_POINTER(tfile->tun, NULL);
+ sock_put(&tfile->sk);
+- } else
++ } else {
+ tun_disable_queue(tun, tfile);
++ tun_napi_disable(tfile);
++ }
+
+ synchronize_net();
+ tun_flow_delete_by_queue(tun, tun->numqueues + 1);
+@@ -728,6 +737,7 @@ static void tun_detach_all(struct net_device *dev)
+ sock_put(&tfile->sk);
+ }
+ list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
++ tun_napi_del(tfile);
+ tun_enable_queue(tfile);
+ tun_queue_purge(tfile);
+ xdp_rxq_info_unreg(&tfile->xdp_rxq);
+@@ -808,6 +818,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
+
+ if (tfile->detached) {
+ tun_enable_queue(tfile);
++ tun_napi_enable(tfile);
+ } else {
+ sock_hold(&tfile->sk);
+ tun_napi_init(tun, tfile, napi, napi_frags);
+diff --git a/drivers/net/usb/asix.h b/drivers/net/usb/asix.h
+index 2c81236c6c7c6..45d3cc5cc355e 100644
+--- a/drivers/net/usb/asix.h
++++ b/drivers/net/usb/asix.h
+@@ -126,8 +126,7 @@
+ AX_MEDIUM_RE)
+
+ #define AX88772_MEDIUM_DEFAULT \
+- (AX_MEDIUM_FD | AX_MEDIUM_RFC | \
+- AX_MEDIUM_TFC | AX_MEDIUM_PS | \
++ (AX_MEDIUM_FD | AX_MEDIUM_PS | \
+ AX_MEDIUM_AC | AX_MEDIUM_RE)
+
+ /* AX88772 & AX88178 RX_CTL values */
+diff --git a/drivers/net/usb/asix_common.c b/drivers/net/usb/asix_common.c
+index 632fa6c1d5e30..b4a1b7abcfc97 100644
+--- a/drivers/net/usb/asix_common.c
++++ b/drivers/net/usb/asix_common.c
+@@ -431,6 +431,7 @@ void asix_adjust_link(struct net_device *netdev)
+
+ asix_write_medium_mode(dev, mode, 0);
+ phy_print_status(phydev);
++ usbnet_link_change(dev, phydev->link, 0);
+ }
+
+ int asix_write_gpio(struct usbnet *dev, u16 value, int sleep, int in_pm)
+diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
+index e2fa56b926853..873f6deabbd1a 100644
+--- a/drivers/net/usb/ax88179_178a.c
++++ b/drivers/net/usb/ax88179_178a.c
+@@ -1472,6 +1472,42 @@ static int ax88179_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
+ * are bundled into this buffer and where we can find an array of
+ * per-packet metadata (which contains elements encoded into u16).
+ */
++
++ /* SKB contents for current firmware:
++ * <packet 1> <padding>
++ * ...
++ * <packet N> <padding>
++ * <per-packet metadata entry 1> <dummy header>
++ * ...
++ * <per-packet metadata entry N> <dummy header>
++ * <padding2> <rx_hdr>
++ *
++ * where:
++ * <packet N> contains pkt_len bytes:
++ * 2 bytes of IP alignment pseudo header
++ * packet received
++ * <per-packet metadata entry N> contains 4 bytes:
++ * pkt_len and fields AX_RXHDR_*
++ * <padding> 0-7 bytes to terminate at
++ * 8 bytes boundary (64-bit).
++ * <padding2> 4 bytes to make rx_hdr terminate at
++ * 8 bytes boundary (64-bit)
++ * <dummy-header> contains 4 bytes:
++ * pkt_len=0 and AX_RXHDR_DROP_ERR
++ * <rx-hdr> contains 4 bytes:
++ * pkt_cnt and hdr_off (offset of
++ * <per-packet metadata entry 1>)
++ *
++ * pkt_cnt is number of entrys in the per-packet metadata.
++ * In current firmware there is 2 entrys per packet.
++ * The first points to the packet and the
++ * second is a dummy header.
++ * This was done probably to align fields in 64-bit and
++ * maintain compatibility with old firmware.
++ * This code assumes that <dummy header> and <padding2> are
++ * optional.
++ */
++
+ if (skb->len < 4)
+ return 0;
+ skb_trim(skb, skb->len - 4);
+@@ -1485,51 +1521,66 @@ static int ax88179_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
+ /* Make sure that the bounds of the metadata array are inside the SKB
+ * (and in front of the counter at the end).
+ */
+- if (pkt_cnt * 2 + hdr_off > skb->len)
++ if (pkt_cnt * 4 + hdr_off > skb->len)
+ return 0;
+ pkt_hdr = (u32 *)(skb->data + hdr_off);
+
+ /* Packets must not overlap the metadata array */
+ skb_trim(skb, hdr_off);
+
+- for (; ; pkt_cnt--, pkt_hdr++) {
++ for (; pkt_cnt > 0; pkt_cnt--, pkt_hdr++) {
++ u16 pkt_len_plus_padd;
+ u16 pkt_len;
+
+ le32_to_cpus(pkt_hdr);
+ pkt_len = (*pkt_hdr >> 16) & 0x1fff;
++ pkt_len_plus_padd = (pkt_len + 7) & 0xfff8;
+
+- if (pkt_len > skb->len)
++ /* Skip dummy header used for alignment
++ */
++ if (pkt_len == 0)
++ continue;
++
++ if (pkt_len_plus_padd > skb->len)
+ return 0;
+
+ /* Check CRC or runt packet */
+- if (((*pkt_hdr & (AX_RXHDR_CRC_ERR | AX_RXHDR_DROP_ERR)) == 0) &&
+- pkt_len >= 2 + ETH_HLEN) {
+- bool last = (pkt_cnt == 0);
+-
+- if (last) {
+- ax_skb = skb;
+- } else {
+- ax_skb = skb_clone(skb, GFP_ATOMIC);
+- if (!ax_skb)
+- return 0;
+- }
+- ax_skb->len = pkt_len;
+- /* Skip IP alignment pseudo header */
+- skb_pull(ax_skb, 2);
+- skb_set_tail_pointer(ax_skb, ax_skb->len);
+- ax_skb->truesize = pkt_len + sizeof(struct sk_buff);
+- ax88179_rx_checksum(ax_skb, pkt_hdr);
++ if ((*pkt_hdr & (AX_RXHDR_CRC_ERR | AX_RXHDR_DROP_ERR)) ||
++ pkt_len < 2 + ETH_HLEN) {
++ dev->net->stats.rx_errors++;
++ skb_pull(skb, pkt_len_plus_padd);
++ continue;
++ }
+
+- if (last)
+- return 1;
++ /* last packet */
++ if (pkt_len_plus_padd == skb->len) {
++ skb_trim(skb, pkt_len);
+
+- usbnet_skb_return(dev, ax_skb);
++ /* Skip IP alignment pseudo header */
++ skb_pull(skb, 2);
++
++ skb->truesize = SKB_TRUESIZE(pkt_len_plus_padd);
++ ax88179_rx_checksum(skb, pkt_hdr);
++ return 1;
+ }
+
+- /* Trim this packet away from the SKB */
+- if (!skb_pull(skb, (pkt_len + 7) & 0xFFF8))
++ ax_skb = skb_clone(skb, GFP_ATOMIC);
++ if (!ax_skb)
+ return 0;
++ skb_trim(ax_skb, pkt_len);
++
++ /* Skip IP alignment pseudo header */
++ skb_pull(ax_skb, 2);
++
++ skb->truesize = pkt_len_plus_padd +
++ SKB_DATA_ALIGN(sizeof(struct sk_buff));
++ ax88179_rx_checksum(ax_skb, pkt_hdr);
++ usbnet_skb_return(dev, ax_skb);
++
++ skb_pull(skb, pkt_len_plus_padd);
+ }
++
++ return 0;
+ }
+
+ static struct sk_buff *
+diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
+index 36b24ec116504..2ea81931543c1 100644
+--- a/drivers/net/usb/usbnet.c
++++ b/drivers/net/usb/usbnet.c
+@@ -2004,7 +2004,7 @@ static int __usbnet_read_cmd(struct usbnet *dev, u8 cmd, u8 reqtype,
+ cmd, reqtype, value, index, size);
+
+ if (size) {
+- buf = kmalloc(size, GFP_KERNEL);
++ buf = kmalloc(size, GFP_NOIO);
+ if (!buf)
+ goto out;
+ }
+@@ -2036,7 +2036,7 @@ static int __usbnet_write_cmd(struct usbnet *dev, u8 cmd, u8 reqtype,
+ cmd, reqtype, value, index, size);
+
+ if (data) {
+- buf = kmemdup(data, size, GFP_KERNEL);
++ buf = kmemdup(data, size, GFP_NOIO);
+ if (!buf)
+ goto out;
+ } else {
+diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
+index 10d548b07b9c6..c7804fce204cc 100644
+--- a/drivers/net/virtio_net.c
++++ b/drivers/net/virtio_net.c
+@@ -3641,14 +3641,20 @@ static int virtnet_probe(struct virtio_device *vdev)
+ if (vi->has_rss || vi->has_rss_hash_report)
+ virtnet_init_default_rss(vi);
+
+- err = register_netdev(dev);
++ /* serialize netdev register + virtio_device_ready() with ndo_open() */
++ rtnl_lock();
++
++ err = register_netdevice(dev);
+ if (err) {
+ pr_debug("virtio_net: registering device failed\n");
++ rtnl_unlock();
+ goto free_failover;
+ }
+
+ virtio_device_ready(vdev);
+
++ rtnl_unlock();
++
+ err = virtnet_cpu_notif_add(vi);
+ if (err) {
+ pr_debug("virtio_net: registering cpu notifier failed\n");
+diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
+index e2b4a1893a132..97c7633070a3c 100644
+--- a/drivers/net/xen-netfront.c
++++ b/drivers/net/xen-netfront.c
+@@ -66,6 +66,10 @@ module_param_named(max_queues, xennet_max_queues, uint, 0644);
+ MODULE_PARM_DESC(max_queues,
+ "Maximum number of queues per virtual interface");
+
++static bool __read_mostly xennet_trusted = true;
++module_param_named(trusted, xennet_trusted, bool, 0644);
++MODULE_PARM_DESC(trusted, "Is the backend trusted");
++
+ #define XENNET_TIMEOUT (5 * HZ)
+
+ static const struct ethtool_ops xennet_ethtool_ops;
+@@ -175,6 +179,9 @@ struct netfront_info {
+ /* Is device behaving sane? */
+ bool broken;
+
++ /* Should skbs be bounced into a zeroed buffer? */
++ bool bounce;
++
+ atomic_t rx_gso_checksum_fixup;
+ };
+
+@@ -273,7 +280,8 @@ static struct sk_buff *xennet_alloc_one_rx_buffer(struct netfront_queue *queue)
+ if (unlikely(!skb))
+ return NULL;
+
+- page = page_pool_dev_alloc_pages(queue->page_pool);
++ page = page_pool_alloc_pages(queue->page_pool,
++ GFP_ATOMIC | __GFP_NOWARN | __GFP_ZERO);
+ if (unlikely(!page)) {
+ kfree_skb(skb);
+ return NULL;
+@@ -667,6 +675,33 @@ static int xennet_xdp_xmit(struct net_device *dev, int n,
+ return nxmit;
+ }
+
++struct sk_buff *bounce_skb(const struct sk_buff *skb)
++{
++ unsigned int headerlen = skb_headroom(skb);
++ /* Align size to allocate full pages and avoid contiguous data leaks */
++ unsigned int size = ALIGN(skb_end_offset(skb) + skb->data_len,
++ XEN_PAGE_SIZE);
++ struct sk_buff *n = alloc_skb(size, GFP_ATOMIC | __GFP_ZERO);
++
++ if (!n)
++ return NULL;
++
++ if (!IS_ALIGNED((uintptr_t)n->head, XEN_PAGE_SIZE)) {
++ WARN_ONCE(1, "misaligned skb allocated\n");
++ kfree_skb(n);
++ return NULL;
++ }
++
++ /* Set the data pointer */
++ skb_reserve(n, headerlen);
++ /* Set the tail pointer and length */
++ skb_put(n, skb->len);
++
++ BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
++
++ skb_copy_header(n, skb);
++ return n;
++}
+
+ #define MAX_XEN_SKB_FRAGS (65536 / XEN_PAGE_SIZE + 1)
+
+@@ -720,9 +755,13 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
+
+ /* The first req should be at least ETH_HLEN size or the packet will be
+ * dropped by netback.
++ *
++ * If the backend is not trusted bounce all data to zeroed pages to
++ * avoid exposing contiguous data on the granted page not belonging to
++ * the skb.
+ */
+- if (unlikely(PAGE_SIZE - offset < ETH_HLEN)) {
+- nskb = skb_copy(skb, GFP_ATOMIC);
++ if (np->bounce || unlikely(PAGE_SIZE - offset < ETH_HLEN)) {
++ nskb = bounce_skb(skb);
+ if (!nskb)
+ goto drop;
+ dev_consume_skb_any(skb);
+@@ -1055,8 +1094,10 @@ static int xennet_get_responses(struct netfront_queue *queue,
+ }
+ }
+ rcu_read_unlock();
+-next:
++
+ __skb_queue_tail(list, skb);
++
++next:
+ if (!(rx->flags & XEN_NETRXF_more_data))
+ break;
+
+@@ -2246,6 +2287,10 @@ static int talk_to_netback(struct xenbus_device *dev,
+
+ info->netdev->irq = 0;
+
++ /* Check if backend is trusted. */
++ info->bounce = !xennet_trusted ||
++ !xenbus_read_unsigned(dev->nodename, "trusted", 1);
++
+ /* Check if backend supports multiple queues */
+ max_queues = xenbus_read_unsigned(info->xbdev->otherend,
+ "multi-queue-max-queues", 1);
+@@ -2413,6 +2458,9 @@ static int xennet_connect(struct net_device *dev)
+ return err;
+ if (np->netback_has_xdp_headroom)
+ pr_info("backend supports XDP headroom\n");
++ if (np->bounce)
++ dev_info(&np->xbdev->dev,
++ "bouncing transmitted data to zeroed pages\n");
+
+ /* talk_to_netback() sets the correct number of queues */
+ num_queues = dev->real_num_tx_queues;
+diff --git a/drivers/nfc/nfcmrvl/i2c.c b/drivers/nfc/nfcmrvl/i2c.c
+index ceef81d93ac99..01329b91d59d5 100644
+--- a/drivers/nfc/nfcmrvl/i2c.c
++++ b/drivers/nfc/nfcmrvl/i2c.c
+@@ -167,9 +167,9 @@ static int nfcmrvl_i2c_parse_dt(struct device_node *node,
+ pdata->irq_polarity = IRQF_TRIGGER_RISING;
+
+ ret = irq_of_parse_and_map(node, 0);
+- if (ret < 0) {
+- pr_err("Unable to get irq, error: %d\n", ret);
+- return ret;
++ if (!ret) {
++ pr_err("Unable to get irq\n");
++ return -EINVAL;
+ }
+ pdata->irq = ret;
+
+diff --git a/drivers/nfc/nfcmrvl/spi.c b/drivers/nfc/nfcmrvl/spi.c
+index a38e2fcdfd39f..ad3359a4942c7 100644
+--- a/drivers/nfc/nfcmrvl/spi.c
++++ b/drivers/nfc/nfcmrvl/spi.c
+@@ -115,9 +115,9 @@ static int nfcmrvl_spi_parse_dt(struct device_node *node,
+ }
+
+ ret = irq_of_parse_and_map(node, 0);
+- if (ret < 0) {
+- pr_err("Unable to get irq, error: %d\n", ret);
+- return ret;
++ if (!ret) {
++ pr_err("Unable to get irq\n");
++ return -EINVAL;
+ }
+ pdata->irq = ret;
+
+diff --git a/drivers/nfc/nxp-nci/i2c.c b/drivers/nfc/nxp-nci/i2c.c
+index 7e451c10985df..e8f3b35afbee4 100644
+--- a/drivers/nfc/nxp-nci/i2c.c
++++ b/drivers/nfc/nxp-nci/i2c.c
+@@ -162,6 +162,9 @@ static int nxp_nci_i2c_nci_read(struct nxp_nci_i2c_phy *phy,
+
+ skb_put_data(*skb, (void *)&header, NCI_CTRL_HDR_SIZE);
+
++ if (!header.plen)
++ return 0;
++
+ r = i2c_master_recv(client, skb_put(*skb, header.plen), header.plen);
+ if (r != header.plen) {
+ nfc_err(&client->dev,
+diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
+index 7b0d1443217a3..5db16857b80e3 100644
+--- a/drivers/nvdimm/bus.c
++++ b/drivers/nvdimm/bus.c
+@@ -182,8 +182,8 @@ static int nvdimm_clear_badblocks_region(struct device *dev, void *data)
+ ndr_end = nd_region->ndr_start + nd_region->ndr_size - 1;
+
+ /* make sure we are in the region */
+- if (ctx->phys < nd_region->ndr_start
+- || (ctx->phys + ctx->cleared) > ndr_end)
++ if (ctx->phys < nd_region->ndr_start ||
++ (ctx->phys + ctx->cleared - 1) > ndr_end)
+ return 0;
+
+ sector = (ctx->phys - nd_region->ndr_start) / 512;
+diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
+index ddea0fb90c288..fe829377c7c2a 100644
+--- a/drivers/nvme/host/pci.c
++++ b/drivers/nvme/host/pci.c
+@@ -3436,8 +3436,11 @@ static const struct pci_device_id nvme_id_table[] = {
+ { PCI_DEVICE(0x1b4b, 0x1092), /* Lexar 256 GB SSD */
+ .driver_data = NVME_QUIRK_NO_NS_DESC_LIST |
+ NVME_QUIRK_IGNORE_DEV_SUBNQN, },
++ { PCI_DEVICE(0x1cc1, 0x33f8), /* ADATA IM2P33F8ABR1 1 TB */
++ .driver_data = NVME_QUIRK_BOGUS_NID, },
+ { PCI_DEVICE(0x10ec, 0x5762), /* ADATA SX6000LNP */
+- .driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
++ .driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN |
++ NVME_QUIRK_BOGUS_NID, },
+ { PCI_DEVICE(0x1cc1, 0x8201), /* ADATA SX8200PNP 512GB */
+ .driver_data = NVME_QUIRK_NO_DEEPEST_PS |
+ NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
+index e44b2988759ec..ff77c3d2354fa 100644
+--- a/drivers/nvme/target/configfs.c
++++ b/drivers/nvme/target/configfs.c
+@@ -773,11 +773,31 @@ static ssize_t nvmet_passthru_io_timeout_store(struct config_item *item,
+ }
+ CONFIGFS_ATTR(nvmet_passthru_, io_timeout);
+
++static ssize_t nvmet_passthru_clear_ids_show(struct config_item *item,
++ char *page)
++{
++ return sprintf(page, "%u\n", to_subsys(item->ci_parent)->clear_ids);
++}
++
++static ssize_t nvmet_passthru_clear_ids_store(struct config_item *item,
++ const char *page, size_t count)
++{
++ struct nvmet_subsys *subsys = to_subsys(item->ci_parent);
++ unsigned int clear_ids;
++
++ if (kstrtouint(page, 0, &clear_ids))
++ return -EINVAL;
++ subsys->clear_ids = clear_ids;
++ return count;
++}
++CONFIGFS_ATTR(nvmet_passthru_, clear_ids);
++
+ static struct configfs_attribute *nvmet_passthru_attrs[] = {
+ &nvmet_passthru_attr_device_path,
+ &nvmet_passthru_attr_enable,
+ &nvmet_passthru_attr_admin_timeout,
+ &nvmet_passthru_attr_io_timeout,
++ &nvmet_passthru_attr_clear_ids,
+ NULL,
+ };
+
+diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
+index 90e75324dae05..c27660a660d9a 100644
+--- a/drivers/nvme/target/core.c
++++ b/drivers/nvme/target/core.c
+@@ -1374,6 +1374,12 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
+ ctrl->port = req->port;
+ ctrl->ops = req->ops;
+
++#ifdef CONFIG_NVME_TARGET_PASSTHRU
++ /* By default, set loop targets to clear IDS by default */
++ if (ctrl->port->disc_addr.trtype == NVMF_TRTYPE_LOOP)
++ subsys->clear_ids = 1;
++#endif
++
+ INIT_WORK(&ctrl->async_event_work, nvmet_async_event_work);
+ INIT_LIST_HEAD(&ctrl->async_events);
+ INIT_RADIX_TREE(&ctrl->p2p_ns_map, GFP_KERNEL);
+diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
+index 69818752a33a5..2b3e5719f24e4 100644
+--- a/drivers/nvme/target/nvmet.h
++++ b/drivers/nvme/target/nvmet.h
+@@ -249,6 +249,7 @@ struct nvmet_subsys {
+ struct config_group passthru_group;
+ unsigned int admin_timeout;
+ unsigned int io_timeout;
++ unsigned int clear_ids;
+ #endif /* CONFIG_NVME_TARGET_PASSTHRU */
+
+ #ifdef CONFIG_BLK_DEV_ZONED
+diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
+index 5247c24538eba..6506831cb0121 100644
+--- a/drivers/nvme/target/passthru.c
++++ b/drivers/nvme/target/passthru.c
+@@ -30,6 +30,53 @@ void nvmet_passthrough_override_cap(struct nvmet_ctrl *ctrl)
+ ctrl->cap &= ~(1ULL << 43);
+ }
+
++static u16 nvmet_passthru_override_id_descs(struct nvmet_req *req)
++{
++ struct nvmet_ctrl *ctrl = req->sq->ctrl;
++ u16 status = NVME_SC_SUCCESS;
++ int pos, len;
++ bool csi_seen = false;
++ void *data;
++ u8 csi;
++
++ if (!ctrl->subsys->clear_ids)
++ return status;
++
++ data = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
++ if (!data)
++ return NVME_SC_INTERNAL;
++
++ status = nvmet_copy_from_sgl(req, 0, data, NVME_IDENTIFY_DATA_SIZE);
++ if (status)
++ goto out_free;
++
++ for (pos = 0; pos < NVME_IDENTIFY_DATA_SIZE; pos += len) {
++ struct nvme_ns_id_desc *cur = data + pos;
++
++ if (cur->nidl == 0)
++ break;
++ if (cur->nidt == NVME_NIDT_CSI) {
++ memcpy(&csi, cur + 1, NVME_NIDT_CSI_LEN);
++ csi_seen = true;
++ break;
++ }
++ len = sizeof(struct nvme_ns_id_desc) + cur->nidl;
++ }
++
++ memset(data, 0, NVME_IDENTIFY_DATA_SIZE);
++ if (csi_seen) {
++ struct nvme_ns_id_desc *cur = data;
++
++ cur->nidt = NVME_NIDT_CSI;
++ cur->nidl = NVME_NIDT_CSI_LEN;
++ memcpy(cur + 1, &csi, NVME_NIDT_CSI_LEN);
++ }
++ status = nvmet_copy_to_sgl(req, 0, data, NVME_IDENTIFY_DATA_SIZE);
++out_free:
++ kfree(data);
++ return status;
++}
++
+ static u16 nvmet_passthru_override_id_ctrl(struct nvmet_req *req)
+ {
+ struct nvmet_ctrl *ctrl = req->sq->ctrl;
+@@ -152,6 +199,11 @@ static u16 nvmet_passthru_override_id_ns(struct nvmet_req *req)
+ */
+ id->mc = 0;
+
++ if (req->sq->ctrl->subsys->clear_ids) {
++ memset(id->nguid, 0, NVME_NIDT_NGUID_LEN);
++ memset(id->eui64, 0, NVME_NIDT_EUI64_LEN);
++ }
++
+ status = nvmet_copy_to_sgl(req, 0, id, sizeof(*id));
+
+ out_free:
+@@ -176,6 +228,9 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
+ case NVME_ID_CNS_NS:
+ nvmet_passthru_override_id_ns(req);
+ break;
++ case NVME_ID_CNS_NS_DESC_LIST:
++ nvmet_passthru_override_id_descs(req);
++ break;
+ }
+ } else if (status < 0)
+ status = NVME_SC_INTERNAL;
+diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
+index 2793554e622ee..0a9542599ad1c 100644
+--- a/drivers/nvme/target/tcp.c
++++ b/drivers/nvme/target/tcp.c
+@@ -405,7 +405,7 @@ err:
+ return NVME_SC_INTERNAL;
+ }
+
+-static void nvmet_tcp_send_ddgst(struct ahash_request *hash,
++static void nvmet_tcp_calc_ddgst(struct ahash_request *hash,
+ struct nvmet_tcp_cmd *cmd)
+ {
+ ahash_request_set_crypt(hash, cmd->req.sg,
+@@ -413,23 +413,6 @@ static void nvmet_tcp_send_ddgst(struct ahash_request *hash,
+ crypto_ahash_digest(hash);
+ }
+
+-static void nvmet_tcp_recv_ddgst(struct ahash_request *hash,
+- struct nvmet_tcp_cmd *cmd)
+-{
+- struct scatterlist sg;
+- struct kvec *iov;
+- int i;
+-
+- crypto_ahash_init(hash);
+- for (i = 0, iov = cmd->iov; i < cmd->nr_mapped; i++, iov++) {
+- sg_init_one(&sg, iov->iov_base, iov->iov_len);
+- ahash_request_set_crypt(hash, &sg, NULL, iov->iov_len);
+- crypto_ahash_update(hash);
+- }
+- ahash_request_set_crypt(hash, NULL, (void *)&cmd->exp_ddgst, 0);
+- crypto_ahash_final(hash);
+-}
+-
+ static void nvmet_setup_c2h_data_pdu(struct nvmet_tcp_cmd *cmd)
+ {
+ struct nvme_tcp_data_pdu *pdu = cmd->data_pdu;
+@@ -454,7 +437,7 @@ static void nvmet_setup_c2h_data_pdu(struct nvmet_tcp_cmd *cmd)
+
+ if (queue->data_digest) {
+ pdu->hdr.flags |= NVME_TCP_F_DDGST;
+- nvmet_tcp_send_ddgst(queue->snd_hash, cmd);
++ nvmet_tcp_calc_ddgst(queue->snd_hash, cmd);
+ }
+
+ if (cmd->queue->hdr_digest) {
+@@ -1137,7 +1120,7 @@ static void nvmet_tcp_prep_recv_ddgst(struct nvmet_tcp_cmd *cmd)
+ {
+ struct nvmet_tcp_queue *queue = cmd->queue;
+
+- nvmet_tcp_recv_ddgst(queue->rcv_hash, cmd);
++ nvmet_tcp_calc_ddgst(queue->rcv_hash, cmd);
+ queue->offset = 0;
+ queue->left = NVME_TCP_DIGEST_LENGTH;
+ queue->rcv_state = NVMET_TCP_RECV_DDGST;
+diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
+index 5d9dd70e4e0f5..ddb8f14247c01 100644
+--- a/drivers/platform/x86/Kconfig
++++ b/drivers/platform/x86/Kconfig
+@@ -945,6 +945,8 @@ config PANASONIC_LAPTOP
+ tristate "Panasonic Laptop Extras"
+ depends on INPUT && ACPI
+ depends on BACKLIGHT_CLASS_DEVICE
++ depends on ACPI_VIDEO=n || ACPI_VIDEO
++ depends on SERIO_I8042 || SERIO_I8042 = n
+ select INPUT_SPARSEKMAP
+ help
+ This driver adds support for access to backlight control and hotkeys
+diff --git a/drivers/platform/x86/ideapad-laptop.c b/drivers/platform/x86/ideapad-laptop.c
+index 3ccb7b71dfb12..abd0c81d62c40 100644
+--- a/drivers/platform/x86/ideapad-laptop.c
++++ b/drivers/platform/x86/ideapad-laptop.c
+@@ -152,6 +152,10 @@ static bool no_bt_rfkill;
+ module_param(no_bt_rfkill, bool, 0444);
+ MODULE_PARM_DESC(no_bt_rfkill, "No rfkill for bluetooth.");
+
++static bool allow_v4_dytc;
++module_param(allow_v4_dytc, bool, 0444);
++MODULE_PARM_DESC(allow_v4_dytc, "Enable DYTC version 4 platform-profile support.");
++
+ /*
+ * ACPI Helpers
+ */
+@@ -871,12 +875,18 @@ static void dytc_profile_refresh(struct ideapad_private *priv)
+ static const struct dmi_system_id ideapad_dytc_v4_allow_table[] = {
+ {
+ /* Ideapad 5 Pro 16ACH6 */
+- .ident = "LENOVO 82L5",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "82L5")
+ }
+ },
++ {
++ /* Ideapad 5 15ITL05 */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
++ DMI_MATCH(DMI_PRODUCT_VERSION, "IdeaPad 5 15ITL05")
++ }
++ },
+ {}
+ };
+
+@@ -901,13 +911,16 @@ static int ideapad_dytc_profile_init(struct ideapad_private *priv)
+
+ dytc_version = (output >> DYTC_QUERY_REV_BIT) & 0xF;
+
+- if (dytc_version < 5) {
+- if (dytc_version < 4 || !dmi_check_system(ideapad_dytc_v4_allow_table)) {
+- dev_info(&priv->platform_device->dev,
+- "DYTC_VERSION is less than 4 or is not allowed: %d\n",
+- dytc_version);
+- return -ENODEV;
+- }
++ if (dytc_version < 4) {
++ dev_info(&priv->platform_device->dev, "DYTC_VERSION < 4 is not supported\n");
++ return -ENODEV;
++ }
++
++ if (dytc_version < 5 &&
++ !(allow_v4_dytc || dmi_check_system(ideapad_dytc_v4_allow_table))) {
++ dev_info(&priv->platform_device->dev,
++ "DYTC_VERSION 4 support may not work. Pass ideapad_laptop.allow_v4_dytc=Y on the kernel commandline to enable\n");
++ return -ENODEV;
+ }
+
+ priv->dytc = kzalloc(sizeof(*priv->dytc), GFP_KERNEL);
+diff --git a/drivers/platform/x86/panasonic-laptop.c b/drivers/platform/x86/panasonic-laptop.c
+index 37850d07987d8..615e39cbbbf19 100644
+--- a/drivers/platform/x86/panasonic-laptop.c
++++ b/drivers/platform/x86/panasonic-laptop.c
+@@ -119,20 +119,22 @@
+ * - v0.1 start from toshiba_acpi driver written by John Belmonte
+ */
+
+-#include <linux/kernel.h>
+-#include <linux/module.h>
+-#include <linux/init.h>
+-#include <linux/types.h>
++#include <linux/acpi.h>
+ #include <linux/backlight.h>
+ #include <linux/ctype.h>
+-#include <linux/seq_file.h>
+-#include <linux/uaccess.h>
+-#include <linux/slab.h>
+-#include <linux/acpi.h>
++#include <linux/i8042.h>
++#include <linux/init.h>
+ #include <linux/input.h>
+ #include <linux/input/sparse-keymap.h>
++#include <linux/kernel.h>
++#include <linux/module.h>
+ #include <linux/platform_device.h>
+-
++#include <linux/seq_file.h>
++#include <linux/serio.h>
++#include <linux/slab.h>
++#include <linux/types.h>
++#include <linux/uaccess.h>
++#include <acpi/video.h>
+
+ MODULE_AUTHOR("Hiroshi Miura <miura@da-cha.org>");
+ MODULE_AUTHOR("David Bronaugh <dbronaugh@linuxboxen.org>");
+@@ -241,6 +243,42 @@ struct pcc_acpi {
+ struct platform_device *platform;
+ };
+
++/*
++ * On some Panasonic models the volume up / down / mute keys send duplicate
++ * keypress events over the PS/2 kbd interface, filter these out.
++ */
++static bool panasonic_i8042_filter(unsigned char data, unsigned char str,
++ struct serio *port)
++{
++ static bool extended;
++
++ if (str & I8042_STR_AUXDATA)
++ return false;
++
++ if (data == 0xe0) {
++ extended = true;
++ return true;
++ } else if (extended) {
++ extended = false;
++
++ switch (data & 0x7f) {
++ case 0x20: /* e0 20 / e0 a0, Volume Mute press / release */
++ case 0x2e: /* e0 2e / e0 ae, Volume Down press / release */
++ case 0x30: /* e0 30 / e0 b0, Volume Up press / release */
++ return true;
++ default:
++ /*
++ * Report the previously filtered e0 before continuing
++ * with the next non-filtered byte.
++ */
++ serio_interrupt(port, 0xe0, 0);
++ return false;
++ }
++ }
++
++ return false;
++}
++
+ /* method access functions */
+ static int acpi_pcc_write_sset(struct pcc_acpi *pcc, int func, int val)
+ {
+@@ -762,6 +800,8 @@ static void acpi_pcc_generate_keyinput(struct pcc_acpi *pcc)
+ struct input_dev *hotk_input_dev = pcc->input_dev;
+ int rc;
+ unsigned long long result;
++ unsigned int key;
++ unsigned int updown;
+
+ rc = acpi_evaluate_integer(pcc->handle, METHOD_HKEY_QUERY,
+ NULL, &result);
+@@ -770,20 +810,27 @@ static void acpi_pcc_generate_keyinput(struct pcc_acpi *pcc)
+ return;
+ }
+
++ key = result & 0xf;
++ updown = result & 0x80; /* 0x80 == key down; 0x00 = key up */
++
+ /* hack: some firmware sends no key down for sleep / hibernate */
+- if ((result & 0xf) == 0x7 || (result & 0xf) == 0xa) {
+- if (result & 0x80)
++ if (key == 7 || key == 10) {
++ if (updown)
+ sleep_keydown_seen = 1;
+ if (!sleep_keydown_seen)
+ sparse_keymap_report_event(hotk_input_dev,
+- result & 0xf, 0x80, false);
++ key, 0x80, false);
+ }
+
+- if ((result & 0xf) == 0x7 || (result & 0xf) == 0x9 || (result & 0xf) == 0xa) {
+- if (!sparse_keymap_report_event(hotk_input_dev,
+- result & 0xf, result & 0x80, false))
+- pr_err("Unknown hotkey event: 0x%04llx\n", result);
+- }
++ /*
++ * Don't report brightness key-presses if they are also reported
++ * by the ACPI video bus.
++ */
++ if ((key == 1 || key == 2) && acpi_video_handles_brightness_key_presses())
++ return;
++
++ if (!sparse_keymap_report_event(hotk_input_dev, key, updown, false))
++ pr_err("Unknown hotkey event: 0x%04llx\n", result);
+ }
+
+ static void acpi_pcc_hotkey_notify(struct acpi_device *device, u32 event)
+@@ -997,6 +1044,7 @@ static int acpi_pcc_hotkey_add(struct acpi_device *device)
+ pcc->platform = NULL;
+ }
+
++ i8042_install_filter(panasonic_i8042_filter);
+ return 0;
+
+ out_platform:
+@@ -1020,6 +1068,8 @@ static int acpi_pcc_hotkey_remove(struct acpi_device *device)
+ if (!device || !pcc)
+ return -EINVAL;
+
++ i8042_remove_filter(panasonic_i8042_filter);
++
+ if (pcc->platform) {
+ device_remove_file(&pcc->platform->dev, &dev_attr_cdpower);
+ platform_device_unregister(pcc->platform);
+diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
+index e6cb4a14cdd47..aa6ffeaa39329 100644
+--- a/drivers/platform/x86/thinkpad_acpi.c
++++ b/drivers/platform/x86/thinkpad_acpi.c
+@@ -4529,6 +4529,7 @@ static void thinkpad_acpi_amd_s2idle_restore(void)
+ iounmap(addr);
+ cleanup_resource:
+ release_resource(res);
++ kfree(res);
+ }
+
+ static struct acpi_s2idle_dev_ops thinkpad_acpi_s2idle_dev_ops = {
+diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+index e0de44000d92d..c290386aa2f37 100644
+--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
++++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+@@ -1757,6 +1757,8 @@ static void mlx5_vdpa_set_vq_cb(struct vdpa_device *vdev, u16 idx, struct vdpa_c
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+
+ ndev->event_cbs[idx] = *cb;
++ if (is_ctrl_vq_idx(mvdev, idx))
++ mvdev->cvq.event_cb = *cb;
+ }
+
+ static void mlx5_cvq_notify(struct vringh *vring)
+diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
+index a0467bca39fa7..263d1cd3af76f 100644
+--- a/fs/ceph/caps.c
++++ b/fs/ceph/caps.c
+@@ -4358,6 +4358,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s)
+ ihold(inode);
+ dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode));
+ spin_unlock(&mdsc->cap_dirty_lock);
++ ceph_wait_on_async_create(inode);
+ ceph_check_caps(ci, CHECK_CAPS_FLUSH, NULL);
+ iput(inode);
+ spin_lock(&mdsc->cap_dirty_lock);
+diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
+index 98e4a1aa898e2..409cad848ae69 100644
+--- a/fs/cifs/connect.c
++++ b/fs/cifs/connect.c
+@@ -3423,7 +3423,9 @@ static int is_path_remote(struct mount_ctx *mnt_ctx)
+ struct cifs_tcon *tcon = mnt_ctx->tcon;
+ struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+ char *full_path;
++#ifdef CONFIG_CIFS_DFS_UPCALL
+ bool nodfs = cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS;
++#endif
+
+ if (!server->ops->is_path_accessible)
+ return -EOPNOTSUPP;
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+index 7c190e8853404..7e8c715052c09 100644
+--- a/fs/io_uring.c
++++ b/fs/io_uring.c
+@@ -5254,7 +5254,7 @@ static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+- if (unlikely(sqe->addr2 || sqe->file_index))
++ if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
+ return -EINVAL;
+
+ sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
+@@ -5467,7 +5467,7 @@ static int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+- if (unlikely(sqe->addr2 || sqe->file_index))
++ if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
+ return -EINVAL;
+
+ sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
+diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c
+index 16c803a9d996f..31138e9be1dc2 100644
+--- a/fs/ksmbd/smb2pdu.c
++++ b/fs/ksmbd/smb2pdu.c
+@@ -7705,7 +7705,7 @@ int smb2_ioctl(struct ksmbd_work *work)
+ {
+ struct file_zero_data_information *zero_data;
+ struct ksmbd_file *fp;
+- loff_t off, len;
++ loff_t off, len, bfz;
+
+ if (!test_tree_conn_flag(work->tcon, KSMBD_TREE_CONN_FLAG_WRITABLE)) {
+ ksmbd_debug(SMB,
+@@ -7722,19 +7722,26 @@ int smb2_ioctl(struct ksmbd_work *work)
+ zero_data =
+ (struct file_zero_data_information *)&req->Buffer[0];
+
+- fp = ksmbd_lookup_fd_fast(work, id);
+- if (!fp) {
+- ret = -ENOENT;
++ off = le64_to_cpu(zero_data->FileOffset);
++ bfz = le64_to_cpu(zero_data->BeyondFinalZero);
++ if (off > bfz) {
++ ret = -EINVAL;
+ goto out;
+ }
+
+- off = le64_to_cpu(zero_data->FileOffset);
+- len = le64_to_cpu(zero_data->BeyondFinalZero) - off;
++ len = bfz - off;
++ if (len) {
++ fp = ksmbd_lookup_fd_fast(work, id);
++ if (!fp) {
++ ret = -ENOENT;
++ goto out;
++ }
+
+- ret = ksmbd_vfs_zero_data(work, fp, off, len);
+- ksmbd_fd_put(work, fp);
+- if (ret < 0)
+- goto out;
++ ret = ksmbd_vfs_zero_data(work, fp, off, len);
++ ksmbd_fd_put(work, fp);
++ if (ret < 0)
++ goto out;
++ }
+ break;
+ }
+ case FSCTL_QUERY_ALLOCATED_RANGES:
+@@ -7808,14 +7815,24 @@ int smb2_ioctl(struct ksmbd_work *work)
+ src_off = le64_to_cpu(dup_ext->SourceFileOffset);
+ dst_off = le64_to_cpu(dup_ext->TargetFileOffset);
+ length = le64_to_cpu(dup_ext->ByteCount);
+- cloned = vfs_clone_file_range(fp_in->filp, src_off, fp_out->filp,
+- dst_off, length, 0);
++ /*
++ * XXX: It is not clear if FSCTL_DUPLICATE_EXTENTS_TO_FILE
++ * should fall back to vfs_copy_file_range(). This could be
++ * beneficial when re-exporting nfs/smb mount, but note that
++ * this can result in partial copy that returns an error status.
++ * If/when FSCTL_DUPLICATE_EXTENTS_TO_FILE_EX is implemented,
++ * fall back to vfs_copy_file_range(), should be avoided when
++ * the flag DUPLICATE_EXTENTS_DATA_EX_SOURCE_ATOMIC is set.
++ */
++ cloned = vfs_clone_file_range(fp_in->filp, src_off,
++ fp_out->filp, dst_off, length, 0);
+ if (cloned == -EXDEV || cloned == -EOPNOTSUPP) {
+ ret = -EOPNOTSUPP;
+ goto dup_ext_out;
+ } else if (cloned != length) {
+ cloned = vfs_copy_file_range(fp_in->filp, src_off,
+- fp_out->filp, dst_off, length, 0);
++ fp_out->filp, dst_off,
++ length, 0);
+ if (cloned != length) {
+ if (cloned < 0)
+ ret = cloned;
+diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c
+index dcdd07c6efffd..05efcdf7a4a73 100644
+--- a/fs/ksmbd/vfs.c
++++ b/fs/ksmbd/vfs.c
+@@ -1015,7 +1015,9 @@ int ksmbd_vfs_zero_data(struct ksmbd_work *work, struct ksmbd_file *fp,
+ FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+ off, len);
+
+- return vfs_fallocate(fp->filp, FALLOC_FL_ZERO_RANGE, off, len);
++ return vfs_fallocate(fp->filp,
++ FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
++ off, len);
+ }
+
+ int ksmbd_vfs_fqar_lseek(struct ksmbd_file *fp, loff_t start, loff_t length,
+@@ -1046,7 +1048,7 @@ int ksmbd_vfs_fqar_lseek(struct ksmbd_file *fp, loff_t start, loff_t length,
+ *out_count = 0;
+ end = start + length;
+ while (start < end && *out_count < in_count) {
+- extent_start = f->f_op->llseek(f, start, SEEK_DATA);
++ extent_start = vfs_llseek(f, start, SEEK_DATA);
+ if (extent_start < 0) {
+ if (extent_start != -ENXIO)
+ ret = (int)extent_start;
+@@ -1056,7 +1058,7 @@ int ksmbd_vfs_fqar_lseek(struct ksmbd_file *fp, loff_t start, loff_t length,
+ if (extent_start >= end)
+ break;
+
+- extent_end = f->f_op->llseek(f, extent_start, SEEK_HOLE);
++ extent_end = vfs_llseek(f, extent_start, SEEK_HOLE);
+ if (extent_end < 0) {
+ if (extent_end != -ENXIO)
+ ret = (int)extent_end;
+@@ -1777,6 +1779,10 @@ int ksmbd_vfs_copy_file_ranges(struct ksmbd_work *work,
+
+ ret = vfs_copy_file_range(src_fp->filp, src_off,
+ dst_fp->filp, dst_off, len, 0);
++ if (ret == -EOPNOTSUPP || ret == -EXDEV)
++ ret = generic_copy_file_range(src_fp->filp, src_off,
++ dst_fp->filp, dst_off,
++ len, 0);
+ if (ret < 0)
+ return ret;
+
+diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
+index d1eaaeb7f7135..f0069395f02c7 100644
+--- a/fs/nfs/nfs4proc.c
++++ b/fs/nfs/nfs4proc.c
+@@ -4012,22 +4012,29 @@ static int _nfs4_discover_trunking(struct nfs_server *server,
+ }
+
+ page = alloc_page(GFP_KERNEL);
++ if (!page)
++ return -ENOMEM;
+ locations = kmalloc(sizeof(struct nfs4_fs_locations), GFP_KERNEL);
+- if (page == NULL || locations == NULL)
+- goto out;
++ if (!locations)
++ goto out_free;
++ locations->fattr = nfs_alloc_fattr();
++ if (!locations->fattr)
++ goto out_free_2;
+
+ status = nfs4_proc_get_locations(server, fhandle, locations, page,
+ cred);
+ if (status)
+- goto out;
++ goto out_free_3;
+
+ for (i = 0; i < locations->nlocations; i++)
+ test_fs_location_for_trunking(&locations->locations[i], clp,
+ server);
+-out:
+- if (page)
+- __free_page(page);
++out_free_3:
++ kfree(locations->fattr);
++out_free_2:
+ kfree(locations);
++out_free:
++ __free_page(page);
+ return status;
+ }
+
+diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
+index 9656d40bb4887..673af2d4b5a23 100644
+--- a/fs/nfs/nfs4state.c
++++ b/fs/nfs/nfs4state.c
+@@ -2743,5 +2743,6 @@ again:
+ goto again;
+
+ nfs_put_client(clp);
++ module_put_and_kthread_exit(0);
+ return 0;
+ }
+diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
+index c22ad0532e8ee..67c851f02b249 100644
+--- a/fs/nfsd/vfs.c
++++ b/fs/nfsd/vfs.c
+@@ -577,6 +577,7 @@ out_err:
+ ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
+ u64 dst_pos, u64 count)
+ {
++ ssize_t ret;
+
+ /*
+ * Limit copy to 4MB to prevent indefinitely blocking an nfsd
+@@ -587,7 +588,12 @@ ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
+ * limit like this and pipeline multiple COPY requests.
+ */
+ count = min_t(u64, count, 1 << 22);
+- return vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
++ ret = vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
++
++ if (ret == -EOPNOTSUPP || ret == -EXDEV)
++ ret = generic_copy_file_range(src, src_pos, dst, dst_pos,
++ count, 0);
++ return ret;
+ }
+
+ __be32 nfsd4_vfs_fallocate(struct svc_rqst *rqstp, struct svc_fh *fhp,
+@@ -1170,6 +1176,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ nfsd_copy_write_verifier(verf, nn);
+ err2 = filemap_check_wb_err(nf->nf_file->f_mapping,
+ since);
++ err = nfserrno(err2);
+ break;
+ case -EINVAL:
+ err = nfserr_notsupp;
+@@ -1177,8 +1184,8 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset,
+ default:
+ nfsd_reset_write_verifier(nn);
+ trace_nfsd_writeverf_reset(nn, rqstp, err2);
++ err = nfserrno(err2);
+ }
+- err = nfserrno(err2);
+ } else
+ nfsd_copy_write_verifier(verf, nn);
+
+diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
+index 16d8fc84713a4..a12391d06fe5c 100644
+--- a/fs/notify/fanotify/fanotify_user.c
++++ b/fs/notify/fanotify/fanotify_user.c
+@@ -1483,8 +1483,15 @@ static int fanotify_test_fid(struct dentry *dentry)
+ return 0;
+ }
+
+-static int fanotify_events_supported(struct path *path, __u64 mask)
++static int fanotify_events_supported(struct fsnotify_group *group,
++ struct path *path, __u64 mask,
++ unsigned int flags)
+ {
++ unsigned int mark_type = flags & FANOTIFY_MARK_TYPE_BITS;
++ /* Strict validation of events in non-dir inode mask with v5.17+ APIs */
++ bool strict_dir_events = FAN_GROUP_FLAG(group, FAN_REPORT_TARGET_FID) ||
++ (mask & FAN_RENAME);
++
+ /*
+ * Some filesystems such as 'proc' acquire unusual locks when opening
+ * files. For them fanotify permission events have high chances of
+@@ -1496,6 +1503,16 @@ static int fanotify_events_supported(struct path *path, __u64 mask)
+ if (mask & FANOTIFY_PERM_EVENTS &&
+ path->mnt->mnt_sb->s_type->fs_flags & FS_DISALLOW_NOTIFY_PERM)
+ return -EINVAL;
++
++ /*
++ * We shouldn't have allowed setting dirent events and the directory
++ * flags FAN_ONDIR and FAN_EVENT_ON_CHILD in mask of non-dir inode,
++ * but because we always allowed it, error only when using new APIs.
++ */
++ if (strict_dir_events && mark_type == FAN_MARK_INODE &&
++ !d_is_dir(path->dentry) && (mask & FANOTIFY_DIRONLY_EVENT_BITS))
++ return -ENOTDIR;
++
+ return 0;
+ }
+
+@@ -1634,7 +1651,7 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
+ goto fput_and_out;
+
+ if (flags & FAN_MARK_ADD) {
+- ret = fanotify_events_supported(&path, mask);
++ ret = fanotify_events_supported(group, &path, mask, flags);
+ if (ret)
+ goto path_put_and_out;
+ }
+@@ -1657,19 +1674,6 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
+ else
+ mnt = path.mnt;
+
+- /*
+- * FAN_RENAME is not allowed on non-dir (for now).
+- * We shouldn't have allowed setting any dirent events in mask of
+- * non-dir, but because we always allowed it, error only if group
+- * was initialized with the new flag FAN_REPORT_TARGET_FID.
+- */
+- ret = -ENOTDIR;
+- if (inode && !S_ISDIR(inode->i_mode) &&
+- ((mask & FAN_RENAME) ||
+- ((mask & FANOTIFY_DIRENT_EVENTS) &&
+- FAN_GROUP_FLAG(group, FAN_REPORT_TARGET_FID))))
+- goto path_put_and_out;
+-
+ /* Mask out FAN_EVENT_ON_CHILD flag for sb/mount/non-dir marks */
+ if (mnt || !S_ISDIR(inode->i_mode)) {
+ mask &= ~FAN_EVENT_ON_CHILD;
+diff --git a/fs/read_write.c b/fs/read_write.c
+index e643aec2b0efe..671f47d5984ce 100644
+--- a/fs/read_write.c
++++ b/fs/read_write.c
+@@ -1381,28 +1381,6 @@ ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
+ }
+ EXPORT_SYMBOL(generic_copy_file_range);
+
+-static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
+- struct file *file_out, loff_t pos_out,
+- size_t len, unsigned int flags)
+-{
+- /*
+- * Although we now allow filesystems to handle cross sb copy, passing
+- * a file of the wrong filesystem type to filesystem driver can result
+- * in an attempt to dereference the wrong type of ->private_data, so
+- * avoid doing that until we really have a good reason. NFS defines
+- * several different file_system_type structures, but they all end up
+- * using the same ->copy_file_range() function pointer.
+- */
+- if (file_out->f_op->copy_file_range &&
+- file_out->f_op->copy_file_range == file_in->f_op->copy_file_range)
+- return file_out->f_op->copy_file_range(file_in, pos_in,
+- file_out, pos_out,
+- len, flags);
+-
+- return generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
+- flags);
+-}
+-
+ /*
+ * Performs necessary checks before doing a file copy
+ *
+@@ -1424,6 +1402,24 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in,
+ if (ret)
+ return ret;
+
++ /*
++ * We allow some filesystems to handle cross sb copy, but passing
++ * a file of the wrong filesystem type to filesystem driver can result
++ * in an attempt to dereference the wrong type of ->private_data, so
++ * avoid doing that until we really have a good reason.
++ *
++ * nfs and cifs define several different file_system_type structures
++ * and several different sets of file_operations, but they all end up
++ * using the same ->copy_file_range() function pointer.
++ */
++ if (file_out->f_op->copy_file_range) {
++ if (file_in->f_op->copy_file_range !=
++ file_out->f_op->copy_file_range)
++ return -EXDEV;
++ } else if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb) {
++ return -EXDEV;
++ }
++
+ /* Don't touch certain kinds of inodes */
+ if (IS_IMMUTABLE(inode_out))
+ return -EPERM;
+@@ -1489,26 +1485,41 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
+ file_start_write(file_out);
+
+ /*
+- * Try cloning first, this is supported by more file systems, and
+- * more efficient if both clone and copy are supported (e.g. NFS).
++ * Cloning is supported by more file systems, so we implement copy on
++ * same sb using clone, but for filesystems where both clone and copy
++ * are supported (e.g. nfs,cifs), we only call the copy method.
+ */
++ if (file_out->f_op->copy_file_range) {
++ ret = file_out->f_op->copy_file_range(file_in, pos_in,
++ file_out, pos_out,
++ len, flags);
++ goto done;
++ }
++
+ if (file_in->f_op->remap_file_range &&
+ file_inode(file_in)->i_sb == file_inode(file_out)->i_sb) {
+- loff_t cloned;
+-
+- cloned = file_in->f_op->remap_file_range(file_in, pos_in,
++ ret = file_in->f_op->remap_file_range(file_in, pos_in,
+ file_out, pos_out,
+ min_t(loff_t, MAX_RW_COUNT, len),
+ REMAP_FILE_CAN_SHORTEN);
+- if (cloned > 0) {
+- ret = cloned;
++ if (ret > 0)
+ goto done;
+- }
+ }
+
+- ret = do_copy_file_range(file_in, pos_in, file_out, pos_out, len,
+- flags);
+- WARN_ON_ONCE(ret == -EOPNOTSUPP);
++ /*
++ * We can get here for same sb copy of filesystems that do not implement
++ * ->copy_file_range() in case filesystem does not support clone or in
++ * case filesystem supports clone but rejected the clone request (e.g.
++ * because it was not block aligned).
++ *
++ * In both cases, fall back to kernel copy so we are able to maintain a
++ * consistent story about which filesystems support copy_file_range()
++ * and which filesystems do not, that will allow userspace tools to
++ * make consistent desicions w.r.t using copy_file_range().
++ */
++ ret = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
++ flags);
++
+ done:
+ if (ret > 0) {
+ fsnotify_access(file_in);
+diff --git a/include/linux/dim.h b/include/linux/dim.h
+index b698266d00356..6c5733981563e 100644
+--- a/include/linux/dim.h
++++ b/include/linux/dim.h
+@@ -21,7 +21,7 @@
+ * We consider 10% difference as significant.
+ */
+ #define IS_SIGNIFICANT_DIFF(val, ref) \
+- (((100UL * abs((val) - (ref))) / (ref)) > 10)
++ ((ref) && (((100UL * abs((val) - (ref))) / (ref)) > 10))
+
+ /*
+ * Calculate the gap between two values.
+diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
+index 419cadcd7ff55..2f0e418888543 100644
+--- a/include/linux/fanotify.h
++++ b/include/linux/fanotify.h
+@@ -110,6 +110,10 @@
+ FANOTIFY_PERM_EVENTS | \
+ FAN_Q_OVERFLOW | FAN_ONDIR)
+
++/* Events and flags relevant only for directories */
++#define FANOTIFY_DIRONLY_EVENT_BITS (FANOTIFY_DIRENT_EVENTS | \
++ FAN_EVENT_ON_CHILD | FAN_ONDIR)
++
+ #define ALL_FANOTIFY_EVENT_BITS (FANOTIFY_OUTGOING_EVENTS | \
+ FANOTIFY_EVENT_FLAGS)
+
+diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
+index f736c020cde27..bfa27972f8605 100644
+--- a/include/linux/netdevice.h
++++ b/include/linux/netdevice.h
+@@ -1653,7 +1653,7 @@ enum netdev_priv_flags {
+ IFF_FAILOVER_SLAVE = 1<<28,
+ IFF_L3MDEV_RX_HANDLER = 1<<29,
+ IFF_LIVE_RENAME_OK = 1<<30,
+- IFF_TX_SKB_NO_LINEAR = 1<<31,
++ IFF_TX_SKB_NO_LINEAR = BIT_ULL(31),
+ IFF_CHANGE_PROTO_DOWN = BIT_ULL(32),
+ };
+
+diff --git a/include/linux/phy.h b/include/linux/phy.h
+index 36ca2b5c22533..b2b76dc2e5e26 100644
+--- a/include/linux/phy.h
++++ b/include/linux/phy.h
+@@ -571,6 +571,10 @@ struct macsec_ops;
+ * @mdix: Current crossover
+ * @mdix_ctrl: User setting of crossover
+ * @interrupts: Flag interrupts have been enabled
++ * @irq_suspended: Flag indicating PHY is suspended and therefore interrupt
++ * handling shall be postponed until PHY has resumed
++ * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
++ * requiring a rerun of the interrupt handler after resume
+ * @interface: enum phy_interface_t value
+ * @skb: Netlink message for cable diagnostics
+ * @nest: Netlink nest used for cable diagnostics
+@@ -625,6 +629,8 @@ struct phy_device {
+
+ /* Interrupts are enabled */
+ unsigned interrupts:1;
++ unsigned irq_suspended:1;
++ unsigned irq_rerun:1;
+
+ enum phy_state state;
+
+diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
+index fc0c1454d2757..7b9e3f9a0f00b 100644
+--- a/include/uapi/drm/drm_fourcc.h
++++ b/include/uapi/drm/drm_fourcc.h
+@@ -1375,11 +1375,11 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
+ #define AMD_FMT_MOD_PIPE_MASK 0x7
+
+ #define AMD_FMT_MOD_SET(field, value) \
+- ((uint64_t)(value) << AMD_FMT_MOD_##field##_SHIFT)
++ ((__u64)(value) << AMD_FMT_MOD_##field##_SHIFT)
+ #define AMD_FMT_MOD_GET(field, value) \
+ (((value) >> AMD_FMT_MOD_##field##_SHIFT) & AMD_FMT_MOD_##field##_MASK)
+ #define AMD_FMT_MOD_CLEAR(field) \
+- (~((uint64_t)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
++ (~((__u64)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
+
+ #if defined(__cplusplus)
+ }
+diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
+index 9690efedb5fa6..3e085b6c05a6d 100644
+--- a/include/uapi/linux/mptcp.h
++++ b/include/uapi/linux/mptcp.h
+@@ -2,16 +2,17 @@
+ #ifndef _UAPI_MPTCP_H
+ #define _UAPI_MPTCP_H
+
++#ifndef __KERNEL__
++#include <netinet/in.h> /* for sockaddr_in and sockaddr_in6 */
++#include <sys/socket.h> /* for struct sockaddr */
++#endif
++
+ #include <linux/const.h>
+ #include <linux/types.h>
+ #include <linux/in.h> /* for sockaddr_in */
+ #include <linux/in6.h> /* for sockaddr_in6 */
+ #include <linux/socket.h> /* for sockaddr_storage and sa_family */
+
+-#ifndef __KERNEL__
+-#include <sys/socket.h> /* for struct sockaddr */
+-#endif
+-
+ #define MPTCP_SUBFLOW_FLAG_MCAP_REM _BITUL(0)
+ #define MPTCP_SUBFLOW_FLAG_MCAP_LOC _BITUL(1)
+ #define MPTCP_SUBFLOW_FLAG_JOIN_REM _BITUL(2)
+diff --git a/lib/sbitmap.c b/lib/sbitmap.c
+index ae4fd4de9ebe7..29eb0484215af 100644
+--- a/lib/sbitmap.c
++++ b/lib/sbitmap.c
+@@ -528,7 +528,7 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
+
+ sbitmap_deferred_clear(map);
+ if (map->word == (1UL << (map_depth - 1)) - 1)
+- continue;
++ goto next;
+
+ nr = find_first_zero_bit(&map->word, map_depth);
+ if (nr + nr_tags <= map_depth) {
+@@ -539,6 +539,8 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
+ get_mask = ((1UL << map_tags) - 1) << nr;
+ do {
+ val = READ_ONCE(map->word);
++ if ((val & ~get_mask) != val)
++ goto next;
+ ret = atomic_long_cmpxchg(ptr, val, get_mask | val);
+ } while (ret != val);
+ get_mask = (get_mask & ~ret) >> nr;
+@@ -549,6 +551,7 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
+ return get_mask;
+ }
+ }
++next:
+ /* Jump to next index. */
+ if (++index >= sb->map_nr)
+ index = 0;
+diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
+index 6b2dc7b2b6127..cc1caab4a6549 100644
+--- a/net/ipv4/ip_tunnel_core.c
++++ b/net/ipv4/ip_tunnel_core.c
+@@ -410,7 +410,7 @@ int skb_tunnel_check_pmtu(struct sk_buff *skb, struct dst_entry *encap_dst,
+ u32 mtu = dst_mtu(encap_dst) - headroom;
+
+ if ((skb_is_gso(skb) && skb_gso_validate_network_len(skb, mtu)) ||
+- (!skb_is_gso(skb) && (skb->len - skb_mac_header_len(skb)) <= mtu))
++ (!skb_is_gso(skb) && (skb->len - skb_network_offset(skb)) <= mtu))
+ return 0;
+
+ skb_dst_update_pmtu_no_confirm(skb, mtu);
+diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
+index 30a74e4eeab49..cd78b4fc334f7 100644
+--- a/net/ipv4/tcp_ipv4.c
++++ b/net/ipv4/tcp_ipv4.c
+@@ -1965,7 +1965,10 @@ process:
+ struct sock *nsk;
+
+ sk = req->rsk_listener;
+- drop_reason = tcp_inbound_md5_hash(sk, skb,
++ if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
++ drop_reason = SKB_DROP_REASON_XFRM_POLICY;
++ else
++ drop_reason = tcp_inbound_md5_hash(sk, skb,
+ &iph->saddr, &iph->daddr,
+ AF_INET, dif, sdif);
+ if (unlikely(drop_reason)) {
+@@ -2017,6 +2020,7 @@ process:
+ }
+ goto discard_and_relse;
+ }
++ nf_reset_ct(skb);
+ if (nsk == sk) {
+ reqsk_put(req);
+ tcp_v4_restore_cb(skb);
+diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
+index 51e77dc6571a2..3b47c901c832b 100644
+--- a/net/ipv6/addrconf.c
++++ b/net/ipv6/addrconf.c
+@@ -1109,10 +1109,6 @@ ipv6_add_addr(struct inet6_dev *idev, struct ifa6_config *cfg,
+ goto out;
+ }
+
+- if (net->ipv6.devconf_all->disable_policy ||
+- idev->cnf.disable_policy)
+- f6i->dst_nopolicy = true;
+-
+ neigh_parms_data_state_setall(idev->nd_parms);
+
+ ifa->addr = *cfg->pfx;
+@@ -5174,9 +5170,9 @@ next:
+ fillargs->event = RTM_GETMULTICAST;
+
+ /* multicast address */
+- for (ifmca = rcu_dereference(idev->mc_list);
++ for (ifmca = rtnl_dereference(idev->mc_list);
+ ifmca;
+- ifmca = rcu_dereference(ifmca->next), ip_idx++) {
++ ifmca = rtnl_dereference(ifmca->next), ip_idx++) {
+ if (ip_idx < s_ip_idx)
+ continue;
+ err = inet6_fill_ifmcaddr(skb, ifmca, fillargs);
+diff --git a/net/ipv6/route.c b/net/ipv6/route.c
+index c4b6ce017d5e3..83786de847abf 100644
+--- a/net/ipv6/route.c
++++ b/net/ipv6/route.c
+@@ -4565,8 +4565,15 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
+ }
+
+ f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);
+- if (!IS_ERR(f6i))
++ if (!IS_ERR(f6i)) {
+ f6i->dst_nocount = true;
++
++ if (!anycast &&
++ (net->ipv6.devconf_all->disable_policy ||
++ idev->cnf.disable_policy))
++ f6i->dst_nopolicy = true;
++ }
++
+ return f6i;
+ }
+
+diff --git a/net/ipv6/seg6_hmac.c b/net/ipv6/seg6_hmac.c
+index 6de01185cc68f..d43c50a7310d6 100644
+--- a/net/ipv6/seg6_hmac.c
++++ b/net/ipv6/seg6_hmac.c
+@@ -406,7 +406,6 @@ int __net_init seg6_hmac_net_init(struct net *net)
+
+ return rhashtable_init(&sdata->hmac_infos, &rht_params);
+ }
+-EXPORT_SYMBOL(seg6_hmac_net_init);
+
+ void seg6_hmac_exit(void)
+ {
+diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
+index c0b138c209925..6bcd5e419a083 100644
+--- a/net/ipv6/sit.c
++++ b/net/ipv6/sit.c
+@@ -323,8 +323,6 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
+ kcalloc(cmax, sizeof(*kp), GFP_KERNEL_ACCOUNT | __GFP_NOWARN) :
+ NULL;
+
+- rcu_read_lock();
+-
+ ca = min(t->prl_count, cmax);
+
+ if (!kp) {
+@@ -341,7 +339,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
+ }
+ }
+
+- c = 0;
++ rcu_read_lock();
+ for_each_prl_rcu(t->prl) {
+ if (c >= cmax)
+ break;
+@@ -353,7 +351,7 @@ static int ipip6_tunnel_get_prl(struct net_device *dev, struct ip_tunnel_prl __u
+ if (kprl.addr != htonl(INADDR_ANY))
+ break;
+ }
+-out:
++
+ rcu_read_unlock();
+
+ len = sizeof(*kp) * c;
+@@ -362,7 +360,7 @@ out:
+ ret = -EFAULT;
+
+ kfree(kp);
+-
++out:
+ return ret;
+ }
+
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index 8f54293c1d887..713077eef04ac 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -2305,6 +2305,11 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
+ kfree_rcu(subflow, rcu);
+ } else {
+ /* otherwise tcp will dispose of the ssk and subflow ctx */
++ if (ssk->sk_state == TCP_LISTEN) {
++ tcp_set_state(ssk, TCP_CLOSE);
++ mptcp_subflow_queue_clean(ssk);
++ inet_csk_listen_stop(ssk);
++ }
+ __tcp_close(ssk, 0);
+
+ /* close acquired an extra ref */
+diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
+index 9ac63fa4866ef..2aab5aff6bcdf 100644
+--- a/net/mptcp/protocol.h
++++ b/net/mptcp/protocol.h
+@@ -286,6 +286,7 @@ struct mptcp_sock {
+
+ u32 setsockopt_seq;
+ char ca_name[TCP_CA_NAME_MAX];
++ struct mptcp_sock *dl_next;
+ };
+
+ #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
+@@ -585,6 +586,7 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
+ struct mptcp_subflow_context *subflow);
+ void mptcp_subflow_send_ack(struct sock *ssk);
+ void mptcp_subflow_reset(struct sock *ssk);
++void mptcp_subflow_queue_clean(struct sock *ssk);
+ void mptcp_sock_graft(struct sock *sk, struct socket *parent);
+ struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk);
+
+diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
+index be76ada89d969..7919e259175df 100644
+--- a/net/mptcp/subflow.c
++++ b/net/mptcp/subflow.c
+@@ -1688,6 +1688,58 @@ static void subflow_state_change(struct sock *sk)
+ }
+ }
+
++void mptcp_subflow_queue_clean(struct sock *listener_ssk)
++{
++ struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
++ struct mptcp_sock *msk, *next, *head = NULL;
++ struct request_sock *req;
++
++ /* build a list of all unaccepted mptcp sockets */
++ spin_lock_bh(&queue->rskq_lock);
++ for (req = queue->rskq_accept_head; req; req = req->dl_next) {
++ struct mptcp_subflow_context *subflow;
++ struct sock *ssk = req->sk;
++ struct mptcp_sock *msk;
++
++ if (!sk_is_mptcp(ssk))
++ continue;
++
++ subflow = mptcp_subflow_ctx(ssk);
++ if (!subflow || !subflow->conn)
++ continue;
++
++ /* skip if already in list */
++ msk = mptcp_sk(subflow->conn);
++ if (msk->dl_next || msk == head)
++ continue;
++
++ msk->dl_next = head;
++ head = msk;
++ }
++ spin_unlock_bh(&queue->rskq_lock);
++ if (!head)
++ return;
++
++ /* can't acquire the msk socket lock under the subflow one,
++ * or will cause ABBA deadlock
++ */
++ release_sock(listener_ssk);
++
++ for (msk = head; msk; msk = next) {
++ struct sock *sk = (struct sock *)msk;
++ bool slow;
++
++ slow = lock_sock_fast_nested(sk);
++ next = msk->dl_next;
++ msk->first = NULL;
++ msk->dl_next = NULL;
++ unlock_sock_fast(sk, slow);
++ }
++
++ /* we are still under the listener msk socket lock */
++ lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
++}
++
+ static int subflow_ulp_init(struct sock *sk)
+ {
+ struct inet_connection_sock *icsk = inet_csk(sk);
+diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
+index df40314de21f5..76de6c8d98655 100644
+--- a/net/netfilter/nft_set_hash.c
++++ b/net/netfilter/nft_set_hash.c
+@@ -143,6 +143,7 @@ static bool nft_rhash_update(struct nft_set *set, const u32 *key,
+ /* Another cpu may race to insert the element with the same key */
+ if (prev) {
+ nft_set_elem_destroy(set, he, true);
++ atomic_dec(&set->nelems);
+ he = prev;
+ }
+
+@@ -152,6 +153,7 @@ out:
+
+ err2:
+ nft_set_elem_destroy(set, he, true);
++ atomic_dec(&set->nelems);
+ err1:
+ return false;
+ }
+diff --git a/net/rose/rose_timer.c b/net/rose/rose_timer.c
+index b3138fc2e552e..f06ddbed3fed6 100644
+--- a/net/rose/rose_timer.c
++++ b/net/rose/rose_timer.c
+@@ -31,89 +31,89 @@ static void rose_idletimer_expiry(struct timer_list *);
+
+ void rose_start_heartbeat(struct sock *sk)
+ {
+- del_timer(&sk->sk_timer);
++ sk_stop_timer(sk, &sk->sk_timer);
+
+ sk->sk_timer.function = rose_heartbeat_expiry;
+ sk->sk_timer.expires = jiffies + 5 * HZ;
+
+- add_timer(&sk->sk_timer);
++ sk_reset_timer(sk, &sk->sk_timer, sk->sk_timer.expires);
+ }
+
+ void rose_start_t1timer(struct sock *sk)
+ {
+ struct rose_sock *rose = rose_sk(sk);
+
+- del_timer(&rose->timer);
++ sk_stop_timer(sk, &rose->timer);
+
+ rose->timer.function = rose_timer_expiry;
+ rose->timer.expires = jiffies + rose->t1;
+
+- add_timer(&rose->timer);
++ sk_reset_timer(sk, &rose->timer, rose->timer.expires);
+ }
+
+ void rose_start_t2timer(struct sock *sk)
+ {
+ struct rose_sock *rose = rose_sk(sk);
+
+- del_timer(&rose->timer);
++ sk_stop_timer(sk, &rose->timer);
+
+ rose->timer.function = rose_timer_expiry;
+ rose->timer.expires = jiffies + rose->t2;
+
+- add_timer(&rose->timer);
++ sk_reset_timer(sk, &rose->timer, rose->timer.expires);
+ }
+
+ void rose_start_t3timer(struct sock *sk)
+ {
+ struct rose_sock *rose = rose_sk(sk);
+
+- del_timer(&rose->timer);
++ sk_stop_timer(sk, &rose->timer);
+
+ rose->timer.function = rose_timer_expiry;
+ rose->timer.expires = jiffies + rose->t3;
+
+- add_timer(&rose->timer);
++ sk_reset_timer(sk, &rose->timer, rose->timer.expires);
+ }
+
+ void rose_start_hbtimer(struct sock *sk)
+ {
+ struct rose_sock *rose = rose_sk(sk);
+
+- del_timer(&rose->timer);
++ sk_stop_timer(sk, &rose->timer);
+
+ rose->timer.function = rose_timer_expiry;
+ rose->timer.expires = jiffies + rose->hb;
+
+- add_timer(&rose->timer);
++ sk_reset_timer(sk, &rose->timer, rose->timer.expires);
+ }
+
+ void rose_start_idletimer(struct sock *sk)
+ {
+ struct rose_sock *rose = rose_sk(sk);
+
+- del_timer(&rose->idletimer);
++ sk_stop_timer(sk, &rose->idletimer);
+
+ if (rose->idle > 0) {
+ rose->idletimer.function = rose_idletimer_expiry;
+ rose->idletimer.expires = jiffies + rose->idle;
+
+- add_timer(&rose->idletimer);
++ sk_reset_timer(sk, &rose->idletimer, rose->idletimer.expires);
+ }
+ }
+
+ void rose_stop_heartbeat(struct sock *sk)
+ {
+- del_timer(&sk->sk_timer);
++ sk_stop_timer(sk, &sk->sk_timer);
+ }
+
+ void rose_stop_timer(struct sock *sk)
+ {
+- del_timer(&rose_sk(sk)->timer);
++ sk_stop_timer(sk, &rose_sk(sk)->timer);
+ }
+
+ void rose_stop_idletimer(struct sock *sk)
+ {
+- del_timer(&rose_sk(sk)->idletimer);
++ sk_stop_timer(sk, &rose_sk(sk)->idletimer);
+ }
+
+ static void rose_heartbeat_expiry(struct timer_list *t)
+@@ -130,6 +130,7 @@ static void rose_heartbeat_expiry(struct timer_list *t)
+ (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_DEAD))) {
+ bh_unlock_sock(sk);
+ rose_destroy_socket(sk);
++ sock_put(sk);
+ return;
+ }
+ break;
+@@ -152,6 +153,7 @@ static void rose_heartbeat_expiry(struct timer_list *t)
+
+ rose_start_heartbeat(sk);
+ bh_unlock_sock(sk);
++ sock_put(sk);
+ }
+
+ static void rose_timer_expiry(struct timer_list *t)
+@@ -181,6 +183,7 @@ static void rose_timer_expiry(struct timer_list *t)
+ break;
+ }
+ bh_unlock_sock(sk);
++ sock_put(sk);
+ }
+
+ static void rose_idletimer_expiry(struct timer_list *t)
+@@ -205,4 +208,5 @@ static void rose_idletimer_expiry(struct timer_list *t)
+ sock_set_flag(sk, SOCK_DEAD);
+ }
+ bh_unlock_sock(sk);
++ sock_put(sk);
+ }
+diff --git a/net/sched/act_api.c b/net/sched/act_api.c
+index 4f51094da9dab..6fa9e7b1406a4 100644
+--- a/net/sched/act_api.c
++++ b/net/sched/act_api.c
+@@ -588,7 +588,8 @@ static int tcf_idr_release_unsafe(struct tc_action *p)
+ }
+
+ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
+- const struct tc_action_ops *ops)
++ const struct tc_action_ops *ops,
++ struct netlink_ext_ack *extack)
+ {
+ struct nlattr *nest;
+ int n_i = 0;
+@@ -604,20 +605,25 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
+ if (nla_put_string(skb, TCA_KIND, ops->kind))
+ goto nla_put_failure;
+
++ ret = 0;
+ mutex_lock(&idrinfo->lock);
+ idr_for_each_entry_ul(idr, p, tmp, id) {
+ if (IS_ERR(p))
+ continue;
+ ret = tcf_idr_release_unsafe(p);
+- if (ret == ACT_P_DELETED) {
++ if (ret == ACT_P_DELETED)
+ module_put(ops->owner);
+- n_i++;
+- } else if (ret < 0) {
+- mutex_unlock(&idrinfo->lock);
+- goto nla_put_failure;
+- }
++ else if (ret < 0)
++ break;
++ n_i++;
+ }
+ mutex_unlock(&idrinfo->lock);
++ if (ret < 0) {
++ if (n_i)
++ NL_SET_ERR_MSG(extack, "Unable to flush all TC actions");
++ else
++ goto nla_put_failure;
++ }
+
+ ret = nla_put_u32(skb, TCA_FCNT, n_i);
+ if (ret)
+@@ -638,7 +644,7 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
+ struct tcf_idrinfo *idrinfo = tn->idrinfo;
+
+ if (type == RTM_DELACTION) {
+- return tcf_del_walker(idrinfo, skb, ops);
++ return tcf_del_walker(idrinfo, skb, ops, extack);
+ } else if (type == RTM_GETACTION) {
+ return tcf_dump_walker(idrinfo, skb, cb);
+ } else {
+diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
+index b57cf9df4de89..8272427d29ca8 100644
+--- a/net/sunrpc/xdr.c
++++ b/net/sunrpc/xdr.c
+@@ -979,7 +979,7 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
+ */
+ xdr->p = (void *)p + frag2bytes;
+ space_left = xdr->buf->buflen - xdr->buf->len;
+- if (space_left - nbytes >= PAGE_SIZE)
++ if (space_left - frag1bytes >= PAGE_SIZE)
+ xdr->end = (void *)p + PAGE_SIZE;
+ else
+ xdr->end = (void *)p + space_left - frag1bytes;
+diff --git a/net/tipc/node.c b/net/tipc/node.c
+index 6ef95ce565bd3..b48d97cbbe29c 100644
+--- a/net/tipc/node.c
++++ b/net/tipc/node.c
+@@ -472,8 +472,8 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
+ bool preliminary)
+ {
+ struct tipc_net *tn = net_generic(net, tipc_net_id);
++ struct tipc_link *l, *snd_l = tipc_bc_sndlink(net);
+ struct tipc_node *n, *temp_node;
+- struct tipc_link *l;
+ unsigned long intv;
+ int bearer_id;
+ int i;
+@@ -488,6 +488,16 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id,
+ goto exit;
+ /* A preliminary node becomes "real" now, refresh its data */
+ tipc_node_write_lock(n);
++ if (!tipc_link_bc_create(net, tipc_own_addr(net), addr, peer_id, U16_MAX,
++ tipc_link_min_win(snd_l), tipc_link_max_win(snd_l),
++ n->capabilities, &n->bc_entry.inputq1,
++ &n->bc_entry.namedq, snd_l, &n->bc_entry.link)) {
++ pr_warn("Broadcast rcv link refresh failed, no memory\n");
++ tipc_node_write_unlock_fast(n);
++ tipc_node_put(n);
++ n = NULL;
++ goto exit;
++ }
+ n->preliminary = false;
+ n->addr = addr;
+ hlist_del_rcu(&n->hash);
+@@ -567,7 +577,16 @@ update:
+ n->signature = INVALID_NODE_SIG;
+ n->active_links[0] = INVALID_BEARER_ID;
+ n->active_links[1] = INVALID_BEARER_ID;
+- n->bc_entry.link = NULL;
++ if (!preliminary &&
++ !tipc_link_bc_create(net, tipc_own_addr(net), addr, peer_id, U16_MAX,
++ tipc_link_min_win(snd_l), tipc_link_max_win(snd_l),
++ n->capabilities, &n->bc_entry.inputq1,
++ &n->bc_entry.namedq, snd_l, &n->bc_entry.link)) {
++ pr_warn("Broadcast rcv link creation failed, no memory\n");
++ kfree(n);
++ n = NULL;
++ goto exit;
++ }
+ tipc_node_get(n);
+ timer_setup(&n->timer, tipc_node_timeout, 0);
+ /* Start a slow timer anyway, crypto needs it */
+@@ -1155,7 +1174,7 @@ void tipc_node_check_dest(struct net *net, u32 addr,
+ bool *respond, bool *dupl_addr)
+ {
+ struct tipc_node *n;
+- struct tipc_link *l, *snd_l;
++ struct tipc_link *l;
+ struct tipc_link_entry *le;
+ bool addr_match = false;
+ bool sign_match = false;
+@@ -1175,22 +1194,6 @@ void tipc_node_check_dest(struct net *net, u32 addr,
+ return;
+
+ tipc_node_write_lock(n);
+- if (unlikely(!n->bc_entry.link)) {
+- snd_l = tipc_bc_sndlink(net);
+- if (!tipc_link_bc_create(net, tipc_own_addr(net),
+- addr, peer_id, U16_MAX,
+- tipc_link_min_win(snd_l),
+- tipc_link_max_win(snd_l),
+- n->capabilities,
+- &n->bc_entry.inputq1,
+- &n->bc_entry.namedq, snd_l,
+- &n->bc_entry.link)) {
+- pr_warn("Broadcast rcv link creation failed, no mem\n");
+- tipc_node_write_unlock_fast(n);
+- tipc_node_put(n);
+- return;
+- }
+- }
+
+ le = &n->links[b->identity];
+
+diff --git a/tools/testing/selftests/net/bpf/Makefile b/tools/testing/selftests/net/bpf/Makefile
+index 8a69c91fcca07..8ccaf8732eb22 100644
+--- a/tools/testing/selftests/net/bpf/Makefile
++++ b/tools/testing/selftests/net/bpf/Makefile
+@@ -2,7 +2,7 @@
+
+ CLANG ?= clang
+ CCINCLUDE += -I../../bpf
+-CCINCLUDE += -I../../../lib
++CCINCLUDE += -I../../../../lib
+ CCINCLUDE += -I../../../../../usr/include/
+
+ TEST_CUSTOM_PROGS = $(OUTPUT)/bpf/nat6to4.o
+diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
+index ff821025d3096..49dfabded1d44 100755
+--- a/tools/testing/selftests/net/mptcp/diag.sh
++++ b/tools/testing/selftests/net/mptcp/diag.sh
+@@ -61,6 +61,39 @@ chk_msk_nr()
+ __chk_nr "grep -c token:" $*
+ }
+
++wait_msk_nr()
++{
++ local condition="grep -c token:"
++ local expected=$1
++ local timeout=20
++ local msg nr
++ local max=0
++ local i=0
++
++ shift 1
++ msg=$*
++
++ while [ $i -lt $timeout ]; do
++ nr=$(ss -inmHMN $ns | $condition)
++ [ $nr == $expected ] && break;
++ [ $nr -gt $max ] && max=$nr
++ i=$((i + 1))
++ sleep 1
++ done
++
++ printf "%-50s" "$msg"
++ if [ $i -ge $timeout ]; then
++ echo "[ fail ] timeout while expecting $expected max $max last $nr"
++ ret=$test_cnt
++ elif [ $nr != $expected ]; then
++ echo "[ fail ] expected $expected found $nr"
++ ret=$test_cnt
++ else
++ echo "[ ok ]"
++ fi
++ test_cnt=$((test_cnt+1))
++}
++
+ chk_msk_fallback_nr()
+ {
+ __chk_nr "grep -c fallback" $*
+@@ -109,7 +142,7 @@ ip -n $ns link set dev lo up
+ echo "a" | \
+ timeout ${timeout_test} \
+ ip netns exec $ns \
+- ./mptcp_connect -p 10000 -l -t ${timeout_poll} \
++ ./mptcp_connect -p 10000 -l -t ${timeout_poll} -w 20 \
+ 0.0.0.0 >/dev/null &
+ wait_local_port_listen $ns 10000
+ chk_msk_nr 0 "no msk on netns creation"
+@@ -117,7 +150,7 @@ chk_msk_nr 0 "no msk on netns creation"
+ echo "b" | \
+ timeout ${timeout_test} \
+ ip netns exec $ns \
+- ./mptcp_connect -p 10000 -r 0 -t ${timeout_poll} \
++ ./mptcp_connect -p 10000 -r 0 -t ${timeout_poll} -w 20 \
+ 127.0.0.1 >/dev/null &
+ wait_connected $ns 10000
+ chk_msk_nr 2 "after MPC handshake "
+@@ -129,13 +162,13 @@ flush_pids
+ echo "a" | \
+ timeout ${timeout_test} \
+ ip netns exec $ns \
+- ./mptcp_connect -p 10001 -l -s TCP -t ${timeout_poll} \
++ ./mptcp_connect -p 10001 -l -s TCP -t ${timeout_poll} -w 20 \
+ 0.0.0.0 >/dev/null &
+ wait_local_port_listen $ns 10001
+ echo "b" | \
+ timeout ${timeout_test} \
+ ip netns exec $ns \
+- ./mptcp_connect -p 10001 -r 0 -t ${timeout_poll} \
++ ./mptcp_connect -p 10001 -r 0 -t ${timeout_poll} -w 20 \
+ 127.0.0.1 >/dev/null &
+ wait_connected $ns 10001
+ chk_msk_fallback_nr 1 "check fallback"
+@@ -146,7 +179,7 @@ for I in `seq 1 $NR_CLIENTS`; do
+ echo "a" | \
+ timeout ${timeout_test} \
+ ip netns exec $ns \
+- ./mptcp_connect -p $((I+10001)) -l -w 10 \
++ ./mptcp_connect -p $((I+10001)) -l -w 20 \
+ -t ${timeout_poll} 0.0.0.0 >/dev/null &
+ done
+ wait_local_port_listen $ns $((NR_CLIENTS + 10001))
+@@ -155,12 +188,11 @@ for I in `seq 1 $NR_CLIENTS`; do
+ echo "b" | \
+ timeout ${timeout_test} \
+ ip netns exec $ns \
+- ./mptcp_connect -p $((I+10001)) -w 10 \
++ ./mptcp_connect -p $((I+10001)) -w 20 \
+ -t ${timeout_poll} 127.0.0.1 >/dev/null &
+ done
+-sleep 1.5
+
+-chk_msk_nr $((NR_CLIENTS*2)) "many msk socket present"
++wait_msk_nr $((NR_CLIENTS*2)) "many msk socket present"
+ flush_pids
+
+ exit $ret
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
+index 8628aa61b7634..e2ea6c126c99f 100644
+--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
++++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
+@@ -265,7 +265,7 @@ static void sock_test_tcpulp(int sock, int proto, unsigned int line)
+ static int sock_listen_mptcp(const char * const listenaddr,
+ const char * const port)
+ {
+- int sock;
++ int sock = -1;
+ struct addrinfo hints = {
+ .ai_protocol = IPPROTO_TCP,
+ .ai_socktype = SOCK_STREAM,
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_inq.c b/tools/testing/selftests/net/mptcp/mptcp_inq.c
+index 29f75e2a11168..8672d898f8cda 100644
+--- a/tools/testing/selftests/net/mptcp/mptcp_inq.c
++++ b/tools/testing/selftests/net/mptcp/mptcp_inq.c
+@@ -88,7 +88,7 @@ static void xgetaddrinfo(const char *node, const char *service,
+ static int sock_listen_mptcp(const char * const listenaddr,
+ const char * const port)
+ {
+- int sock;
++ int sock = -1;
+ struct addrinfo hints = {
+ .ai_protocol = IPPROTO_TCP,
+ .ai_socktype = SOCK_STREAM,
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
+index ac9a4d9c17646..ae61f39556ca8 100644
+--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
++++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.c
+@@ -136,7 +136,7 @@ static void xgetaddrinfo(const char *node, const char *service,
+ static int sock_listen_mptcp(const char * const listenaddr,
+ const char * const port)
+ {
+- int sock;
++ int sock = -1;
+ struct addrinfo hints = {
+ .ai_protocol = IPPROTO_TCP,
+ .ai_socktype = SOCK_STREAM,
+diff --git a/tools/testing/selftests/net/udpgso_bench.sh b/tools/testing/selftests/net/udpgso_bench.sh
+index 80b5d352702e5..dc932fd653634 100755
+--- a/tools/testing/selftests/net/udpgso_bench.sh
++++ b/tools/testing/selftests/net/udpgso_bench.sh
+@@ -120,7 +120,7 @@ run_all() {
+ run_udp "${ipv4_args}"
+
+ echo "ipv6"
+- run_tcp "${ipv4_args}"
++ run_tcp "${ipv6_args}"
+ run_udp "${ipv6_args}"
+ }
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-07 17:29 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-07-07 17:29 UTC (permalink / raw
To: gentoo-commits
commit: 72d4da40b59f7f9b0c3e8c862379301f1c2b5a4a
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Jul 7 17:29:17 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Jul 7 17:29:17 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=72d4da40
Removal of redundant patch, minor clean-up for BMQ defaults
Removed:
1950_cifs-fix-minor-compile-warning.patch
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 ----
1950_cifs-fix-minor-compile-warning.patch | 33 -------------------------------
5021_BMQ-and-PDS-gentoo-defaults.patch | 9 +++++----
3 files changed, 5 insertions(+), 41 deletions(-)
diff --git a/0000_README b/0000_README
index 557264e1..30a7b58d 100644
--- a/0000_README
+++ b/0000_README
@@ -95,10 +95,6 @@ Patch: 1700_sparc-address-warray-bound-warnings.patch
From: https://github.com/KSPP/linux/issues/109
Desc: Address -Warray-bounds warnings
-Patch: 1950_cifs-fix-minor-compile-warning.patch
-From: https://git.kernel.org/
-Desc: cifs: fix minor compile warning
-
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1950_cifs-fix-minor-compile-warning.patch b/1950_cifs-fix-minor-compile-warning.patch
deleted file mode 100644
index 65787238..00000000
--- a/1950_cifs-fix-minor-compile-warning.patch
+++ /dev/null
@@ -1,33 +0,0 @@
-From 93ed91c020aa4f021600a633f1f87790a5e50b91 Mon Sep 17 00:00:00 2001
-From: Steve French <stfrench@microsoft.com>
-Date: Sun, 22 May 2022 21:25:24 -0500
-Subject: cifs: fix minor compile warning
-
-Add ifdef around nodfs variable from patch:
- "cifs: don't call cifs_dfs_query_info_nonascii_quirk() if nodfs was set"
-which is unused when CONFIG_DFS_UPCALL is not set.
-
-Signed-off-by: Steve French <stfrench@microsoft.com>
----
- fs/cifs/connect.c | 2 ++
- 1 file changed, 2 insertions(+)
-
-(limited to 'fs/cifs/connect.c')
-
-diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
-index 44dc66f21d832..0b08693d1af8f 100644
---- a/fs/cifs/connect.c
-+++ b/fs/cifs/connect.c
-@@ -3433,7 +3433,9 @@ static int is_path_remote(struct mount_ctx *mnt_ctx)
- struct cifs_tcon *tcon = mnt_ctx->tcon;
- struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
- char *full_path;
-+#ifdef CONFIG_CIFS_DFS_UPCALL
- bool nodfs = cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS;
-+#endif
-
- if (!server->ops->is_path_accessible)
- return -EOPNOTSUPP;
---
-cgit 1.2.3-1.el7
-
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
index 7a71c6b6..6b2049da 100644
--- a/5021_BMQ-and-PDS-gentoo-defaults.patch
+++ b/5021_BMQ-and-PDS-gentoo-defaults.patch
@@ -1,7 +1,7 @@
---- a/init/Kconfig 2021-04-27 07:38:30.556467045 -0400
-+++ b/init/Kconfig 2021-04-27 07:39:32.956412800 -0400
-@@ -780,8 +780,9 @@ config GENERIC_SCHED_CLOCK
- menu "Scheduler features"
+--- a/init/Kconfig 2022-07-07 13:22:00.698439887 -0400
++++ b/init/Kconfig 2022-07-07 13:23:45.152333576 -0400
+@@ -874,8 +874,9 @@ config UCLAMP_BUCKETS_COUNT
+ If in doubt, use the default value.
menuconfig SCHED_ALT
+ depends on X86_64
@@ -10,3 +10,4 @@
+ default n
help
This feature enable alternative CPU scheduler"
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-12 15:58 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-07-12 15:58 UTC (permalink / raw
To: gentoo-commits
commit: eca2f342ada520977a8ae1535efc2e3944602e51
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue Jul 12 15:57:14 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue Jul 12 15:57:14 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=eca2f342
Linux patch 5.18.11
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1010_linux-5.18.11.patch | 4955 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 4959 insertions(+)
diff --git a/0000_README b/0000_README
index 30a7b58d..be72894f 100644
--- a/0000_README
+++ b/0000_README
@@ -83,6 +83,10 @@ Patch: 1009_linux-5.18.10.patch
From: http://www.kernel.org
Desc: Linux 5.18.10
+Patch: 1010_linux-5.18.11.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.11
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1010_linux-5.18.11.patch b/1010_linux-5.18.11.patch
new file mode 100644
index 00000000..1ff85521
--- /dev/null
+++ b/1010_linux-5.18.11.patch
@@ -0,0 +1,4955 @@
+diff --git a/Documentation/devicetree/bindings/dma/allwinner,sun50i-a64-dma.yaml b/Documentation/devicetree/bindings/dma/allwinner,sun50i-a64-dma.yaml
+index b6e1ebfaf3666..bb3cbc30d9121 100644
+--- a/Documentation/devicetree/bindings/dma/allwinner,sun50i-a64-dma.yaml
++++ b/Documentation/devicetree/bindings/dma/allwinner,sun50i-a64-dma.yaml
+@@ -64,7 +64,7 @@ if:
+ then:
+ properties:
+ clocks:
+- maxItems: 2
++ minItems: 2
+
+ required:
+ - clock-names
+diff --git a/MAINTAINERS b/MAINTAINERS
+index 8e6622ed6de69..2b70e2d214057 100644
+--- a/MAINTAINERS
++++ b/MAINTAINERS
+@@ -426,7 +426,6 @@ F: drivers/acpi/*thermal*
+ ACPI VIOT DRIVER
+ M: Jean-Philippe Brucker <jean-philippe@linaro.org>
+ L: linux-acpi@vger.kernel.org
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Maintained
+ F: drivers/acpi/viot.c
+@@ -960,7 +959,6 @@ F: drivers/video/fbdev/geode/
+ AMD IOMMU (AMD-VI)
+ M: Joerg Roedel <joro@8bytes.org>
+ R: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Maintained
+ T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
+@@ -5899,7 +5897,6 @@ DMA MAPPING HELPERS
+ M: Christoph Hellwig <hch@lst.de>
+ M: Marek Szyprowski <m.szyprowski@samsung.com>
+ R: Robin Murphy <robin.murphy@arm.com>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Supported
+ W: http://git.infradead.org/users/hch/dma-mapping.git
+@@ -5912,7 +5909,6 @@ F: kernel/dma/
+
+ DMA MAPPING BENCHMARK
+ M: Xiang Chen <chenxiang66@hisilicon.com>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ F: kernel/dma/map_benchmark.c
+ F: tools/testing/selftests/dma/
+@@ -7479,7 +7475,6 @@ F: drivers/gpu/drm/exynos/exynos_dp*
+
+ EXYNOS SYSMMU (IOMMU) driver
+ M: Marek Szyprowski <m.szyprowski@samsung.com>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Maintained
+ F: drivers/iommu/exynos-iommu.c
+@@ -9879,7 +9874,6 @@ F: drivers/hid/intel-ish-hid/
+ INTEL IOMMU (VT-d)
+ M: David Woodhouse <dwmw2@infradead.org>
+ M: Lu Baolu <baolu.lu@linux.intel.com>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Supported
+ T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
+@@ -10258,7 +10252,6 @@ F: include/linux/iomap.h
+ IOMMU DRIVERS
+ M: Joerg Roedel <joro@8bytes.org>
+ M: Will Deacon <will@kernel.org>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Maintained
+ T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git
+@@ -12375,7 +12368,6 @@ F: drivers/i2c/busses/i2c-mt65xx.c
+
+ MEDIATEK IOMMU DRIVER
+ M: Yong Wu <yong.wu@mediatek.com>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
+ S: Supported
+@@ -16361,7 +16353,6 @@ F: drivers/i2c/busses/i2c-qcom-cci.c
+
+ QUALCOMM IOMMU
+ M: Rob Clark <robdclark@gmail.com>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ L: linux-arm-msm@vger.kernel.org
+ S: Maintained
+@@ -18947,7 +18938,6 @@ F: arch/x86/boot/video*
+
+ SWIOTLB SUBSYSTEM
+ M: Christoph Hellwig <hch@infradead.org>
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Supported
+ W: http://git.infradead.org/users/hch/dma-mapping.git
+@@ -21618,7 +21608,6 @@ XEN SWIOTLB SUBSYSTEM
+ M: Juergen Gross <jgross@suse.com>
+ M: Stefano Stabellini <sstabellini@kernel.org>
+ L: xen-devel@lists.xenproject.org (moderated for non-subscribers)
+-L: iommu@lists.linux-foundation.org
+ L: iommu@lists.linux.dev
+ S: Supported
+ F: arch/x86/xen/*swiotlb*
+diff --git a/Makefile b/Makefile
+index 088b84f99203c..323032d60ac34 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 10
++SUBLEVEL = 11
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm/boot/dts/at91-sam9x60ek.dts b/arch/arm/boot/dts/at91-sam9x60ek.dts
+index 7719ea3d4933c..81ccb0636a009 100644
+--- a/arch/arm/boot/dts/at91-sam9x60ek.dts
++++ b/arch/arm/boot/dts/at91-sam9x60ek.dts
+@@ -233,10 +233,9 @@
+ status = "okay";
+
+ eeprom@53 {
+- compatible = "atmel,24c32";
++ compatible = "atmel,24c02";
+ reg = <0x53>;
+ pagesize = <16>;
+- size = <128>;
+ status = "okay";
+ };
+ };
+diff --git a/arch/arm/boot/dts/at91-sama5d2_icp.dts b/arch/arm/boot/dts/at91-sama5d2_icp.dts
+index 806eb1d911d7c..164201a8fbf2d 100644
+--- a/arch/arm/boot/dts/at91-sama5d2_icp.dts
++++ b/arch/arm/boot/dts/at91-sama5d2_icp.dts
+@@ -329,21 +329,21 @@
+ status = "okay";
+
+ eeprom@50 {
+- compatible = "atmel,24c32";
++ compatible = "atmel,24c02";
+ reg = <0x50>;
+ pagesize = <16>;
+ status = "okay";
+ };
+
+ eeprom@52 {
+- compatible = "atmel,24c32";
++ compatible = "atmel,24c02";
+ reg = <0x52>;
+ pagesize = <16>;
+ status = "disabled";
+ };
+
+ eeprom@53 {
+- compatible = "atmel,24c32";
++ compatible = "atmel,24c02";
+ reg = <0x53>;
+ pagesize = <16>;
+ status = "disabled";
+diff --git a/arch/arm/boot/dts/stm32mp151.dtsi b/arch/arm/boot/dts/stm32mp151.dtsi
+index f9aa9af31efdc..9c2bbf115f4cc 100644
+--- a/arch/arm/boot/dts/stm32mp151.dtsi
++++ b/arch/arm/boot/dts/stm32mp151.dtsi
+@@ -1474,7 +1474,7 @@
+ usbh_ohci: usb@5800c000 {
+ compatible = "generic-ohci";
+ reg = <0x5800c000 0x1000>;
+- clocks = <&rcc USBH>, <&usbphyc>;
++ clocks = <&usbphyc>, <&rcc USBH>;
+ resets = <&rcc USBH_R>;
+ interrupts = <GIC_SPI 74 IRQ_TYPE_LEVEL_HIGH>;
+ status = "disabled";
+@@ -1483,7 +1483,7 @@
+ usbh_ehci: usb@5800d000 {
+ compatible = "generic-ehci";
+ reg = <0x5800d000 0x1000>;
+- clocks = <&rcc USBH>;
++ clocks = <&usbphyc>, <&rcc USBH>;
+ resets = <&rcc USBH_R>;
+ interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>;
+ companion = <&usbh_ohci>;
+diff --git a/arch/arm/configs/mxs_defconfig b/arch/arm/configs/mxs_defconfig
+index ca32446b187f5..f53086ddc48b0 100644
+--- a/arch/arm/configs/mxs_defconfig
++++ b/arch/arm/configs/mxs_defconfig
+@@ -93,6 +93,7 @@ CONFIG_REGULATOR_FIXED_VOLTAGE=y
+ CONFIG_DRM=y
+ CONFIG_DRM_PANEL_SEIKO_43WVF1G=y
+ CONFIG_DRM_MXSFB=y
++CONFIG_FB=y
+ CONFIG_FB_MODE_HELPERS=y
+ CONFIG_LCD_CLASS_DEVICE=y
+ CONFIG_BACKLIGHT_CLASS_DEVICE=y
+diff --git a/arch/arm/mach-at91/pm.c b/arch/arm/mach-at91/pm.c
+index 0fd609e26615a..32f224f723a5d 100644
+--- a/arch/arm/mach-at91/pm.c
++++ b/arch/arm/mach-at91/pm.c
+@@ -146,7 +146,7 @@ static const struct wakeup_source_info ws_info[] = {
+
+ static const struct of_device_id sama5d2_ws_ids[] = {
+ { .compatible = "atmel,sama5d2-gem", .data = &ws_info[0] },
+- { .compatible = "atmel,at91rm9200-rtc", .data = &ws_info[1] },
++ { .compatible = "atmel,sama5d2-rtc", .data = &ws_info[1] },
+ { .compatible = "atmel,sama5d3-udc", .data = &ws_info[2] },
+ { .compatible = "atmel,at91rm9200-ohci", .data = &ws_info[2] },
+ { .compatible = "usb-ohci", .data = &ws_info[2] },
+@@ -157,24 +157,24 @@ static const struct of_device_id sama5d2_ws_ids[] = {
+ };
+
+ static const struct of_device_id sam9x60_ws_ids[] = {
+- { .compatible = "atmel,at91sam9x5-rtc", .data = &ws_info[1] },
++ { .compatible = "microchip,sam9x60-rtc", .data = &ws_info[1] },
+ { .compatible = "atmel,at91rm9200-ohci", .data = &ws_info[2] },
+ { .compatible = "usb-ohci", .data = &ws_info[2] },
+ { .compatible = "atmel,at91sam9g45-ehci", .data = &ws_info[2] },
+ { .compatible = "usb-ehci", .data = &ws_info[2] },
+- { .compatible = "atmel,at91sam9260-rtt", .data = &ws_info[4] },
++ { .compatible = "microchip,sam9x60-rtt", .data = &ws_info[4] },
+ { .compatible = "cdns,sam9x60-macb", .data = &ws_info[5] },
+ { /* sentinel */ }
+ };
+
+ static const struct of_device_id sama7g5_ws_ids[] = {
+- { .compatible = "atmel,at91sam9x5-rtc", .data = &ws_info[1] },
++ { .compatible = "microchip,sama7g5-rtc", .data = &ws_info[1] },
+ { .compatible = "microchip,sama7g5-ohci", .data = &ws_info[2] },
+ { .compatible = "usb-ohci", .data = &ws_info[2] },
+ { .compatible = "atmel,at91sam9g45-ehci", .data = &ws_info[2] },
+ { .compatible = "usb-ehci", .data = &ws_info[2] },
+ { .compatible = "microchip,sama7g5-sdhci", .data = &ws_info[3] },
+- { .compatible = "atmel,at91sam9260-rtt", .data = &ws_info[4] },
++ { .compatible = "microchip,sama7g5-rtt", .data = &ws_info[4] },
+ { /* sentinel */ }
+ };
+
+diff --git a/arch/arm/mach-meson/platsmp.c b/arch/arm/mach-meson/platsmp.c
+index 4b8ad728bb42a..32ac60b89fdcc 100644
+--- a/arch/arm/mach-meson/platsmp.c
++++ b/arch/arm/mach-meson/platsmp.c
+@@ -71,6 +71,7 @@ static void __init meson_smp_prepare_cpus(const char *scu_compatible,
+ }
+
+ sram_base = of_iomap(node, 0);
++ of_node_put(node);
+ if (!sram_base) {
+ pr_err("Couldn't map SRAM registers\n");
+ return;
+@@ -91,6 +92,7 @@ static void __init meson_smp_prepare_cpus(const char *scu_compatible,
+ }
+
+ scu_base = of_iomap(node, 0);
++ of_node_put(node);
+ if (!scu_base) {
+ pr_err("Couldn't map SCU registers\n");
+ return;
+diff --git a/arch/arm64/boot/dts/freescale/imx8mp-evk.dts b/arch/arm64/boot/dts/freescale/imx8mp-evk.dts
+index 4c3ac4214a2cd..5744fa76e9b2e 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mp-evk.dts
++++ b/arch/arm64/boot/dts/freescale/imx8mp-evk.dts
+@@ -395,21 +395,21 @@
+ &iomuxc {
+ pinctrl_eqos: eqosgrp {
+ fsl,pins = <
+- MX8MP_IOMUXC_ENET_MDC__ENET_QOS_MDC 0x3
+- MX8MP_IOMUXC_ENET_MDIO__ENET_QOS_MDIO 0x3
+- MX8MP_IOMUXC_ENET_RD0__ENET_QOS_RGMII_RD0 0x91
+- MX8MP_IOMUXC_ENET_RD1__ENET_QOS_RGMII_RD1 0x91
+- MX8MP_IOMUXC_ENET_RD2__ENET_QOS_RGMII_RD2 0x91
+- MX8MP_IOMUXC_ENET_RD3__ENET_QOS_RGMII_RD3 0x91
+- MX8MP_IOMUXC_ENET_RXC__CCM_ENET_QOS_CLOCK_GENERATE_RX_CLK 0x91
+- MX8MP_IOMUXC_ENET_RX_CTL__ENET_QOS_RGMII_RX_CTL 0x91
+- MX8MP_IOMUXC_ENET_TD0__ENET_QOS_RGMII_TD0 0x1f
+- MX8MP_IOMUXC_ENET_TD1__ENET_QOS_RGMII_TD1 0x1f
+- MX8MP_IOMUXC_ENET_TD2__ENET_QOS_RGMII_TD2 0x1f
+- MX8MP_IOMUXC_ENET_TD3__ENET_QOS_RGMII_TD3 0x1f
+- MX8MP_IOMUXC_ENET_TX_CTL__ENET_QOS_RGMII_TX_CTL 0x1f
+- MX8MP_IOMUXC_ENET_TXC__CCM_ENET_QOS_CLOCK_GENERATE_TX_CLK 0x1f
+- MX8MP_IOMUXC_SAI2_RXC__GPIO4_IO22 0x19
++ MX8MP_IOMUXC_ENET_MDC__ENET_QOS_MDC 0x2
++ MX8MP_IOMUXC_ENET_MDIO__ENET_QOS_MDIO 0x2
++ MX8MP_IOMUXC_ENET_RD0__ENET_QOS_RGMII_RD0 0x90
++ MX8MP_IOMUXC_ENET_RD1__ENET_QOS_RGMII_RD1 0x90
++ MX8MP_IOMUXC_ENET_RD2__ENET_QOS_RGMII_RD2 0x90
++ MX8MP_IOMUXC_ENET_RD3__ENET_QOS_RGMII_RD3 0x90
++ MX8MP_IOMUXC_ENET_RXC__CCM_ENET_QOS_CLOCK_GENERATE_RX_CLK 0x90
++ MX8MP_IOMUXC_ENET_RX_CTL__ENET_QOS_RGMII_RX_CTL 0x90
++ MX8MP_IOMUXC_ENET_TD0__ENET_QOS_RGMII_TD0 0x16
++ MX8MP_IOMUXC_ENET_TD1__ENET_QOS_RGMII_TD1 0x16
++ MX8MP_IOMUXC_ENET_TD2__ENET_QOS_RGMII_TD2 0x16
++ MX8MP_IOMUXC_ENET_TD3__ENET_QOS_RGMII_TD3 0x16
++ MX8MP_IOMUXC_ENET_TX_CTL__ENET_QOS_RGMII_TX_CTL 0x16
++ MX8MP_IOMUXC_ENET_TXC__CCM_ENET_QOS_CLOCK_GENERATE_TX_CLK 0x16
++ MX8MP_IOMUXC_SAI2_RXC__GPIO4_IO22 0x10
+ >;
+ };
+
+@@ -461,28 +461,28 @@
+
+ pinctrl_gpio_led: gpioledgrp {
+ fsl,pins = <
+- MX8MP_IOMUXC_NAND_READY_B__GPIO3_IO16 0x19
++ MX8MP_IOMUXC_NAND_READY_B__GPIO3_IO16 0x140
+ >;
+ };
+
+ pinctrl_i2c1: i2c1grp {
+ fsl,pins = <
+- MX8MP_IOMUXC_I2C1_SCL__I2C1_SCL 0x400001c3
+- MX8MP_IOMUXC_I2C1_SDA__I2C1_SDA 0x400001c3
++ MX8MP_IOMUXC_I2C1_SCL__I2C1_SCL 0x400001c2
++ MX8MP_IOMUXC_I2C1_SDA__I2C1_SDA 0x400001c2
+ >;
+ };
+
+ pinctrl_i2c3: i2c3grp {
+ fsl,pins = <
+- MX8MP_IOMUXC_I2C3_SCL__I2C3_SCL 0x400001c3
+- MX8MP_IOMUXC_I2C3_SDA__I2C3_SDA 0x400001c3
++ MX8MP_IOMUXC_I2C3_SCL__I2C3_SCL 0x400001c2
++ MX8MP_IOMUXC_I2C3_SDA__I2C3_SDA 0x400001c2
+ >;
+ };
+
+ pinctrl_i2c5: i2c5grp {
+ fsl,pins = <
+- MX8MP_IOMUXC_SPDIF_RX__I2C5_SDA 0x400001c3
+- MX8MP_IOMUXC_SPDIF_TX__I2C5_SCL 0x400001c3
++ MX8MP_IOMUXC_SPDIF_RX__I2C5_SDA 0x400001c2
++ MX8MP_IOMUXC_SPDIF_TX__I2C5_SCL 0x400001c2
+ >;
+ };
+
+@@ -500,20 +500,20 @@
+
+ pinctrl_reg_usdhc2_vmmc: regusdhc2vmmcgrp {
+ fsl,pins = <
+- MX8MP_IOMUXC_SD2_RESET_B__GPIO2_IO19 0x41
++ MX8MP_IOMUXC_SD2_RESET_B__GPIO2_IO19 0x40
+ >;
+ };
+
+ pinctrl_uart2: uart2grp {
+ fsl,pins = <
+- MX8MP_IOMUXC_UART2_RXD__UART2_DCE_RX 0x49
+- MX8MP_IOMUXC_UART2_TXD__UART2_DCE_TX 0x49
++ MX8MP_IOMUXC_UART2_RXD__UART2_DCE_RX 0x140
++ MX8MP_IOMUXC_UART2_TXD__UART2_DCE_TX 0x140
+ >;
+ };
+
+ pinctrl_usb1_vbus: usb1grp {
+ fsl,pins = <
+- MX8MP_IOMUXC_GPIO1_IO14__USB2_OTG_PWR 0x19
++ MX8MP_IOMUXC_GPIO1_IO14__USB2_OTG_PWR 0x10
+ >;
+ };
+
+@@ -525,7 +525,7 @@
+ MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d0
+ MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d0
+ MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d0
+- MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc1
++ MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
+ >;
+ };
+
+@@ -537,7 +537,7 @@
+ MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d4
+ MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d4
+ MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d4
+- MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc1
++ MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
+ >;
+ };
+
+@@ -549,7 +549,7 @@
+ MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d6
+ MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d6
+ MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d6
+- MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc1
++ MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
+ >;
+ };
+
+diff --git a/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts b/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts
+index 984a6b9ded8d7..6aa720bafe289 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts
++++ b/arch/arm64/boot/dts/freescale/imx8mp-phyboard-pollux-rdk.dts
+@@ -116,48 +116,48 @@
+ &iomuxc {
+ pinctrl_eqos: eqosgrp {
+ fsl,pins = <
+- MX8MP_IOMUXC_ENET_MDC__ENET_QOS_MDC 0x3
+- MX8MP_IOMUXC_ENET_MDIO__ENET_QOS_MDIO 0x3
+- MX8MP_IOMUXC_ENET_RD0__ENET_QOS_RGMII_RD0 0x91
+- MX8MP_IOMUXC_ENET_RD1__ENET_QOS_RGMII_RD1 0x91
+- MX8MP_IOMUXC_ENET_RD2__ENET_QOS_RGMII_RD2 0x91
+- MX8MP_IOMUXC_ENET_RD3__ENET_QOS_RGMII_RD3 0x91
+- MX8MP_IOMUXC_ENET_RXC__CCM_ENET_QOS_CLOCK_GENERATE_RX_CLK 0x91
+- MX8MP_IOMUXC_ENET_RX_CTL__ENET_QOS_RGMII_RX_CTL 0x91
+- MX8MP_IOMUXC_ENET_TD0__ENET_QOS_RGMII_TD0 0x1f
+- MX8MP_IOMUXC_ENET_TD1__ENET_QOS_RGMII_TD1 0x1f
+- MX8MP_IOMUXC_ENET_TD2__ENET_QOS_RGMII_TD2 0x1f
+- MX8MP_IOMUXC_ENET_TD3__ENET_QOS_RGMII_TD3 0x1f
+- MX8MP_IOMUXC_ENET_TX_CTL__ENET_QOS_RGMII_TX_CTL 0x1f
+- MX8MP_IOMUXC_ENET_TXC__CCM_ENET_QOS_CLOCK_GENERATE_TX_CLK 0x1f
++ MX8MP_IOMUXC_ENET_MDC__ENET_QOS_MDC 0x2
++ MX8MP_IOMUXC_ENET_MDIO__ENET_QOS_MDIO 0x2
++ MX8MP_IOMUXC_ENET_RD0__ENET_QOS_RGMII_RD0 0x90
++ MX8MP_IOMUXC_ENET_RD1__ENET_QOS_RGMII_RD1 0x90
++ MX8MP_IOMUXC_ENET_RD2__ENET_QOS_RGMII_RD2 0x90
++ MX8MP_IOMUXC_ENET_RD3__ENET_QOS_RGMII_RD3 0x90
++ MX8MP_IOMUXC_ENET_RXC__CCM_ENET_QOS_CLOCK_GENERATE_RX_CLK 0x90
++ MX8MP_IOMUXC_ENET_RX_CTL__ENET_QOS_RGMII_RX_CTL 0x90
++ MX8MP_IOMUXC_ENET_TD0__ENET_QOS_RGMII_TD0 0x16
++ MX8MP_IOMUXC_ENET_TD1__ENET_QOS_RGMII_TD1 0x16
++ MX8MP_IOMUXC_ENET_TD2__ENET_QOS_RGMII_TD2 0x16
++ MX8MP_IOMUXC_ENET_TD3__ENET_QOS_RGMII_TD3 0x16
++ MX8MP_IOMUXC_ENET_TX_CTL__ENET_QOS_RGMII_TX_CTL 0x16
++ MX8MP_IOMUXC_ENET_TXC__CCM_ENET_QOS_CLOCK_GENERATE_TX_CLK 0x16
+ MX8MP_IOMUXC_SAI1_MCLK__GPIO4_IO20 0x10
+ >;
+ };
+
+ pinctrl_i2c2: i2c2grp {
+ fsl,pins = <
+- MX8MP_IOMUXC_I2C2_SCL__I2C2_SCL 0x400001c3
+- MX8MP_IOMUXC_I2C2_SDA__I2C2_SDA 0x400001c3
++ MX8MP_IOMUXC_I2C2_SCL__I2C2_SCL 0x400001c2
++ MX8MP_IOMUXC_I2C2_SDA__I2C2_SDA 0x400001c2
+ >;
+ };
+
+ pinctrl_i2c2_gpio: i2c2gpiogrp {
+ fsl,pins = <
+- MX8MP_IOMUXC_I2C2_SCL__GPIO5_IO16 0x1e3
+- MX8MP_IOMUXC_I2C2_SDA__GPIO5_IO17 0x1e3
++ MX8MP_IOMUXC_I2C2_SCL__GPIO5_IO16 0x1e2
++ MX8MP_IOMUXC_I2C2_SDA__GPIO5_IO17 0x1e2
+ >;
+ };
+
+ pinctrl_reg_usdhc2_vmmc: regusdhc2vmmcgrp {
+ fsl,pins = <
+- MX8MP_IOMUXC_SD2_RESET_B__GPIO2_IO19 0x41
++ MX8MP_IOMUXC_SD2_RESET_B__GPIO2_IO19 0x40
+ >;
+ };
+
+ pinctrl_uart1: uart1grp {
+ fsl,pins = <
+- MX8MP_IOMUXC_UART1_RXD__UART1_DCE_RX 0x49
+- MX8MP_IOMUXC_UART1_TXD__UART1_DCE_TX 0x49
++ MX8MP_IOMUXC_UART1_RXD__UART1_DCE_RX 0x40
++ MX8MP_IOMUXC_UART1_TXD__UART1_DCE_TX 0x40
+ >;
+ };
+
+@@ -175,7 +175,7 @@
+ MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d0
+ MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d0
+ MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d0
+- MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc1
++ MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
+ >;
+ };
+
+@@ -187,7 +187,7 @@
+ MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d4
+ MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d4
+ MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d4
+- MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc1
++ MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
+ >;
+ };
+
+@@ -199,7 +199,7 @@
+ MX8MP_IOMUXC_SD2_DATA1__USDHC2_DATA1 0x1d6
+ MX8MP_IOMUXC_SD2_DATA2__USDHC2_DATA2 0x1d6
+ MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x1d6
+- MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc1
++ MX8MP_IOMUXC_GPIO1_IO04__USDHC2_VSELECT 0xc0
+ >;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/msm8992-lg-bullhead.dtsi b/arch/arm64/boot/dts/qcom/msm8992-lg-bullhead.dtsi
+index 3b0cc85d66742..71e373b11de9d 100644
+--- a/arch/arm64/boot/dts/qcom/msm8992-lg-bullhead.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8992-lg-bullhead.dtsi
+@@ -74,7 +74,7 @@
+ vdd_l17_29-supply = <&vph_pwr>;
+ vdd_l20_21-supply = <&vph_pwr>;
+ vdd_l25-supply = <&pm8994_s5>;
+- vdd_lvs1_2 = <&pm8994_s4>;
++ vdd_lvs1_2-supply = <&pm8994_s4>;
+
+ /* S1, S2, S6 and S12 are managed by RPMPD */
+
+diff --git a/arch/arm64/boot/dts/qcom/msm8992-xiaomi-libra.dts b/arch/arm64/boot/dts/qcom/msm8992-xiaomi-libra.dts
+index 84558ab5fe86b..ae882bfbf48dc 100644
+--- a/arch/arm64/boot/dts/qcom/msm8992-xiaomi-libra.dts
++++ b/arch/arm64/boot/dts/qcom/msm8992-xiaomi-libra.dts
+@@ -143,7 +143,7 @@
+ vdd_l17_29-supply = <&vph_pwr>;
+ vdd_l20_21-supply = <&vph_pwr>;
+ vdd_l25-supply = <&pm8994_s5>;
+- vdd_lvs1_2 = <&pm8994_s4>;
++ vdd_lvs1_2-supply = <&pm8994_s4>;
+
+ /* S1, S2, S6 and S12 are managed by RPMPD */
+
+diff --git a/arch/arm64/boot/dts/qcom/msm8994.dtsi b/arch/arm64/boot/dts/qcom/msm8994.dtsi
+index b1e595cb4b901..6b76321288d01 100644
+--- a/arch/arm64/boot/dts/qcom/msm8994.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8994.dtsi
+@@ -93,7 +93,7 @@
+ CPU6: cpu@102 {
+ device_type = "cpu";
+ compatible = "arm,cortex-a57";
+- reg = <0x0 0x101>;
++ reg = <0x0 0x102>;
+ enable-method = "psci";
+ next-level-cache = <&L2_1>;
+ };
+@@ -101,7 +101,7 @@
+ CPU7: cpu@103 {
+ device_type = "cpu";
+ compatible = "arm,cortex-a57";
+- reg = <0x0 0x101>;
++ reg = <0x0 0x103>;
+ enable-method = "psci";
+ next-level-cache = <&L2_1>;
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+index b31bf62e86809..ad21cf465c986 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+@@ -4238,7 +4238,7 @@
+
+ power-domains = <&dispcc MDSS_GDSC>;
+
+- clocks = <&gcc GCC_DISP_AHB_CLK>,
++ clocks = <&dispcc DISP_CC_MDSS_AHB_CLK>,
+ <&dispcc DISP_CC_MDSS_MDP_CLK>;
+ clock-names = "iface", "core";
+
+diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
+index e63b7b0458cf8..7a14eb89e4ca0 100644
+--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
+@@ -1383,8 +1383,8 @@
+
+ iommus = <&apps_smmu 0xe0 0x0>;
+
+- interconnects = <&aggre1_noc MASTER_UFS_MEM &mc_virt SLAVE_EBI1>,
+- <&gem_noc MASTER_APPSS_PROC &config_noc SLAVE_UFS_MEM_CFG>;
++ interconnects = <&aggre1_noc MASTER_UFS_MEM 0 &mc_virt SLAVE_EBI1 0>,
++ <&gem_noc MASTER_APPSS_PROC 0 &config_noc SLAVE_UFS_MEM_CFG 0>;
+ interconnect-names = "ufs-ddr", "cpu-ufs";
+ clock-names =
+ "core_clk",
+diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
+index 463c78c52cc5d..3805ad13b8f3d 100644
+--- a/arch/powerpc/platforms/powernv/rng.c
++++ b/arch/powerpc/platforms/powernv/rng.c
+@@ -176,12 +176,8 @@ static int __init pnv_get_random_long_early(unsigned long *v)
+ NULL) != pnv_get_random_long_early)
+ return 0;
+
+- for_each_compatible_node(dn, NULL, "ibm,power-rng") {
+- if (rng_create(dn))
+- continue;
+- /* Create devices for hwrng driver */
+- of_platform_device_create(dn, NULL, NULL);
+- }
++ for_each_compatible_node(dn, NULL, "ibm,power-rng")
++ rng_create(dn);
+
+ if (!ppc_md.get_random_seed)
+ return 0;
+@@ -205,10 +201,18 @@ void __init pnv_rng_init(void)
+
+ static int __init pnv_rng_late_init(void)
+ {
++ struct device_node *dn;
+ unsigned long v;
++
+ /* In case it wasn't called during init for some other reason. */
+ if (ppc_md.get_random_seed == pnv_get_random_long_early)
+ pnv_get_random_long_early(&v);
++
++ if (ppc_md.get_random_seed == powernv_get_random_long) {
++ for_each_compatible_node(dn, NULL, "ibm,power-rng")
++ of_platform_device_create(dn, NULL, NULL);
++ }
++
+ return 0;
+ }
+ machine_subsys_initcall(powernv, pnv_rng_late_init);
+diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
+index df1644d9b3b66..3677df836e910 100644
+--- a/arch/x86/kernel/acpi/cppc.c
++++ b/arch/x86/kernel/acpi/cppc.c
+@@ -11,6 +11,16 @@
+
+ /* Refer to drivers/acpi/cppc_acpi.c for the description of functions */
+
++bool cpc_supported_by_cpu(void)
++{
++ switch (boot_cpu_data.x86_vendor) {
++ case X86_VENDOR_AMD:
++ case X86_VENDOR_HYGON:
++ return boot_cpu_has(X86_FEATURE_CPPC);
++ }
++ return false;
++}
++
+ bool cpc_ffh_supported(void)
+ {
+ return true;
+diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
+index 3e58b613a2c41..6c735cfa7d433 100644
+--- a/drivers/acpi/bus.c
++++ b/drivers/acpi/bus.c
+@@ -278,13 +278,27 @@ bool osc_sb_apei_support_acked;
+ bool osc_pc_lpi_support_confirmed;
+ EXPORT_SYMBOL_GPL(osc_pc_lpi_support_confirmed);
+
++/*
++ * ACPI 6.2 Section 6.2.11.2 'Platform-Wide OSPM Capabilities':
++ * Starting with ACPI Specification 6.2, all _CPC registers can be in
++ * PCC, System Memory, System IO, or Functional Fixed Hardware address
++ * spaces. OSPM support for this more flexible register space scheme is
++ * indicated by the “Flexible Address Space for CPPC Registers” _OSC bit.
++ *
++ * Otherwise (cf ACPI 6.1, s8.4.7.1.1.X), _CPC registers must be in:
++ * - PCC or Functional Fixed Hardware address space if defined
++ * - SystemMemory address space (NULL register) if not defined
++ */
++bool osc_cpc_flexible_adr_space_confirmed;
++EXPORT_SYMBOL_GPL(osc_cpc_flexible_adr_space_confirmed);
++
+ /*
+ * ACPI 6.4 Operating System Capabilities for USB.
+ */
+ bool osc_sb_native_usb4_support_confirmed;
+ EXPORT_SYMBOL_GPL(osc_sb_native_usb4_support_confirmed);
+
+-bool osc_sb_cppc_not_supported;
++bool osc_sb_cppc2_support_acked;
+
+ static u8 sb_uuid_str[] = "0811B06E-4A27-44F9-8D60-3CBBC22E7B48";
+ static void acpi_bus_osc_negotiate_platform_control(void)
+@@ -315,12 +329,15 @@ static void acpi_bus_osc_negotiate_platform_control(void)
+ #endif
+ #ifdef CONFIG_X86
+ capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_GENERIC_INITIATOR_SUPPORT;
+- if (boot_cpu_has(X86_FEATURE_HWP)) {
+- capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_CPC_SUPPORT;
+- capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_CPCV2_SUPPORT;
+- }
+ #endif
+
++#ifdef CONFIG_ACPI_CPPC_LIB
++ capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_CPC_SUPPORT;
++ capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_CPCV2_SUPPORT;
++#endif
++
++ capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_CPC_FLEXIBLE_ADR_SPACE;
++
+ if (IS_ENABLED(CONFIG_SCHED_MC_PRIO))
+ capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_CPC_DIVERSE_HIGH_SUPPORT;
+
+@@ -341,12 +358,6 @@ static void acpi_bus_osc_negotiate_platform_control(void)
+ return;
+ }
+
+-#ifdef CONFIG_X86
+- if (boot_cpu_has(X86_FEATURE_HWP))
+- osc_sb_cppc_not_supported = !(capbuf_ret[OSC_SUPPORT_DWORD] &
+- (OSC_SB_CPC_SUPPORT | OSC_SB_CPCV2_SUPPORT));
+-#endif
+-
+ /*
+ * Now run _OSC again with query flag clear and with the caps
+ * supported by both the OS and the platform.
+@@ -360,12 +371,18 @@ static void acpi_bus_osc_negotiate_platform_control(void)
+
+ capbuf_ret = context.ret.pointer;
+ if (context.ret.length > OSC_SUPPORT_DWORD) {
++#ifdef CONFIG_ACPI_CPPC_LIB
++ osc_sb_cppc2_support_acked = capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_CPCV2_SUPPORT;
++#endif
++
+ osc_sb_apei_support_acked =
+ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT;
+ osc_pc_lpi_support_confirmed =
+ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_PCLPI_SUPPORT;
+ osc_sb_native_usb4_support_confirmed =
+ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_NATIVE_USB4_SUPPORT;
++ osc_cpc_flexible_adr_space_confirmed =
++ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_CPC_FLEXIBLE_ADR_SPACE;
+ }
+
+ kfree(context.ret.pointer);
+diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
+index 34576ab0e2e1d..57ca7aa0e169a 100644
+--- a/drivers/acpi/cppc_acpi.c
++++ b/drivers/acpi/cppc_acpi.c
+@@ -559,6 +559,19 @@ bool __weak cpc_ffh_supported(void)
+ return false;
+ }
+
++/**
++ * cpc_supported_by_cpu() - check if CPPC is supported by CPU
++ *
++ * Check if the architectural support for CPPC is present even
++ * if the _OSC hasn't prescribed it
++ *
++ * Return: true for supported, false for not supported
++ */
++bool __weak cpc_supported_by_cpu(void)
++{
++ return false;
++}
++
+ /**
+ * pcc_data_alloc() - Allocate the pcc_data memory for pcc subspace
+ *
+@@ -666,8 +679,11 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+ acpi_status status;
+ int ret = -ENODATA;
+
+- if (osc_sb_cppc_not_supported)
+- return -ENODEV;
++ if (!osc_sb_cppc2_support_acked) {
++ pr_debug("CPPC v2 _OSC not acked\n");
++ if (!cpc_supported_by_cpu())
++ return -ENODEV;
++ }
+
+ /* Parse the ACPI _CPC table for this CPU. */
+ status = acpi_evaluate_object_typed(handle, "_CPC", NULL, &output,
+@@ -746,6 +762,11 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+ if (gas_t->address) {
+ void __iomem *addr;
+
++ if (!osc_cpc_flexible_adr_space_confirmed) {
++ pr_debug("Flexible address space capability not supported\n");
++ goto out_free;
++ }
++
+ addr = ioremap(gas_t->address, gas_t->bit_width/8);
+ if (!addr)
+ goto out_free;
+@@ -768,6 +789,10 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+ gas_t->address);
+ goto out_free;
+ }
++ if (!osc_cpc_flexible_adr_space_confirmed) {
++ pr_debug("Flexible address space capability not supported\n");
++ goto out_free;
++ }
+ } else {
+ if (gas_t->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE || !cpc_ffh_supported()) {
+ /* Support only PCC, SystemMemory, SystemIO, and FFH type regs. */
+diff --git a/drivers/base/core.c b/drivers/base/core.c
+index 3d6430eb0c6a1..9dc0ca3db61d1 100644
+--- a/drivers/base/core.c
++++ b/drivers/base/core.c
+@@ -485,7 +485,18 @@ static void device_link_release_fn(struct work_struct *work)
+ /* Ensure that all references to the link object have been dropped. */
+ device_link_synchronize_removal();
+
+- pm_runtime_release_supplier(link, true);
++ pm_runtime_release_supplier(link);
++ /*
++ * If supplier_preactivated is set, the link has been dropped between
++ * the pm_runtime_get_suppliers() and pm_runtime_put_suppliers() calls
++ * in __driver_probe_device(). In that case, drop the supplier's
++ * PM-runtime usage counter to remove the reference taken by
++ * pm_runtime_get_suppliers().
++ */
++ if (link->supplier_preactivated)
++ pm_runtime_put_noidle(link->supplier);
++
++ pm_request_idle(link->supplier);
+
+ put_device(link->consumer);
+ put_device(link->supplier);
+diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
+index d4059e6ffeaec..2e3aa9a2e33c9 100644
+--- a/drivers/base/power/runtime.c
++++ b/drivers/base/power/runtime.c
+@@ -308,13 +308,10 @@ static int rpm_get_suppliers(struct device *dev)
+ /**
+ * pm_runtime_release_supplier - Drop references to device link's supplier.
+ * @link: Target device link.
+- * @check_idle: Whether or not to check if the supplier device is idle.
+ *
+- * Drop all runtime PM references associated with @link to its supplier device
+- * and if @check_idle is set, check if that device is idle (and so it can be
+- * suspended).
++ * Drop all runtime PM references associated with @link to its supplier device.
+ */
+-void pm_runtime_release_supplier(struct device_link *link, bool check_idle)
++void pm_runtime_release_supplier(struct device_link *link)
+ {
+ struct device *supplier = link->supplier;
+
+@@ -327,9 +324,6 @@ void pm_runtime_release_supplier(struct device_link *link, bool check_idle)
+ while (refcount_dec_not_one(&link->rpm_active) &&
+ atomic_read(&supplier->power.usage_count) > 0)
+ pm_runtime_put_noidle(supplier);
+-
+- if (check_idle)
+- pm_request_idle(supplier);
+ }
+
+ static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
+@@ -337,8 +331,11 @@ static void __rpm_put_suppliers(struct device *dev, bool try_to_suspend)
+ struct device_link *link;
+
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+- device_links_read_lock_held())
+- pm_runtime_release_supplier(link, try_to_suspend);
++ device_links_read_lock_held()) {
++ pm_runtime_release_supplier(link);
++ if (try_to_suspend)
++ pm_request_idle(link->supplier);
++ }
+ }
+
+ static void rpm_put_suppliers(struct device *dev)
+@@ -1740,7 +1737,6 @@ void pm_runtime_get_suppliers(struct device *dev)
+ if (link->flags & DL_FLAG_PM_RUNTIME) {
+ link->supplier_preactivated = true;
+ pm_runtime_get_sync(link->supplier);
+- refcount_inc(&link->rpm_active);
+ }
+
+ device_links_read_unlock(idx);
+@@ -1760,19 +1756,8 @@ void pm_runtime_put_suppliers(struct device *dev)
+ list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
+ device_links_read_lock_held())
+ if (link->supplier_preactivated) {
+- bool put;
+-
+ link->supplier_preactivated = false;
+-
+- spin_lock_irq(&dev->power.lock);
+-
+- put = pm_runtime_status_suspended(dev) &&
+- refcount_dec_not_one(&link->rpm_active);
+-
+- spin_unlock_irq(&dev->power.lock);
+-
+- if (put)
+- pm_runtime_put(link->supplier);
++ pm_runtime_put(link->supplier);
+ }
+
+ device_links_read_unlock(idx);
+@@ -1807,7 +1792,8 @@ void pm_runtime_drop_link(struct device_link *link)
+ return;
+
+ pm_runtime_drop_link_count(link->consumer);
+- pm_runtime_release_supplier(link, true);
++ pm_runtime_release_supplier(link);
++ pm_request_idle(link->supplier);
+ }
+
+ static bool pm_runtime_need_not_resume(struct device *dev)
+diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
+index 5d33ce24fe09f..ef68ee664b674 100644
+--- a/drivers/cxl/cxlmem.h
++++ b/drivers/cxl/cxlmem.h
+@@ -252,13 +252,13 @@ struct cxl_mbox_identify {
+ } __packed;
+
+ struct cxl_mbox_get_lsa {
+- u32 offset;
+- u32 length;
++ __le32 offset;
++ __le32 length;
+ } __packed;
+
+ struct cxl_mbox_set_lsa {
+- u32 offset;
+- u32 reserved;
++ __le32 offset;
++ __le32 reserved;
+ u8 data[];
+ } __packed;
+
+diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
+index 44e899f06094c..cb111bcfe8c6c 100644
+--- a/drivers/cxl/mem.c
++++ b/drivers/cxl/mem.c
+@@ -46,6 +46,7 @@ static int create_endpoint(struct cxl_memdev *cxlmd,
+ {
+ struct cxl_dev_state *cxlds = cxlmd->cxlds;
+ struct cxl_port *endpoint;
++ int rc;
+
+ endpoint = devm_cxl_add_port(&parent_port->dev, &cxlmd->dev,
+ cxlds->component_reg_phys, parent_port);
+@@ -54,13 +55,17 @@ static int create_endpoint(struct cxl_memdev *cxlmd,
+
+ dev_dbg(&cxlmd->dev, "add: %s\n", dev_name(&endpoint->dev));
+
++ rc = cxl_endpoint_autoremove(cxlmd, endpoint);
++ if (rc)
++ return rc;
++
+ if (!endpoint->dev.driver) {
+ dev_err(&cxlmd->dev, "%s failed probe\n",
+ dev_name(&endpoint->dev));
+ return -ENXIO;
+ }
+
+- return cxl_endpoint_autoremove(cxlmd, endpoint);
++ return 0;
+ }
+
+ /**
+diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c
+index 15ad666ab03e3..6d85b54747abd 100644
+--- a/drivers/cxl/pmem.c
++++ b/drivers/cxl/pmem.c
+@@ -108,8 +108,8 @@ static int cxl_pmem_get_config_data(struct cxl_dev_state *cxlds,
+ return -EINVAL;
+
+ get_lsa = (struct cxl_mbox_get_lsa) {
+- .offset = cmd->in_offset,
+- .length = cmd->in_length,
++ .offset = cpu_to_le32(cmd->in_offset),
++ .length = cpu_to_le32(cmd->in_length),
+ };
+
+ rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_LSA, &get_lsa,
+@@ -139,7 +139,7 @@ static int cxl_pmem_set_config_data(struct cxl_dev_state *cxlds,
+ return -ENOMEM;
+
+ *set_lsa = (struct cxl_mbox_set_lsa) {
+- .offset = cmd->in_offset,
++ .offset = cpu_to_le32(cmd->in_offset),
+ };
+ memcpy(set_lsa->data, cmd->in_buf, cmd->in_length);
+
+diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c
+index def564d1e8faf..678a3d4c6ec63 100644
+--- a/drivers/dma/at_xdmac.c
++++ b/drivers/dma/at_xdmac.c
+@@ -1893,6 +1893,11 @@ static int at_xdmac_alloc_chan_resources(struct dma_chan *chan)
+ for (i = 0; i < init_nr_desc_per_channel; i++) {
+ desc = at_xdmac_alloc_desc(chan, GFP_KERNEL);
+ if (!desc) {
++ if (i == 0) {
++ dev_warn(chan2dev(chan),
++ "can't allocate any descriptors\n");
++ return -EIO;
++ }
+ dev_warn(chan2dev(chan),
+ "only %d descriptors have been allocated\n", i);
+ break;
+diff --git a/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c b/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c
+index e9c9bcb1f5c20..c741da02b67e9 100644
+--- a/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c
++++ b/drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c
+@@ -1164,8 +1164,9 @@ static int dma_chan_pause(struct dma_chan *dchan)
+ BIT(chan->id) << DMAC_CHAN_SUSP_WE_SHIFT;
+ axi_dma_iowrite32(chan->chip, DMAC_CHEN, val);
+ } else {
+- val = BIT(chan->id) << DMAC_CHAN_SUSP2_SHIFT |
+- BIT(chan->id) << DMAC_CHAN_SUSP2_WE_SHIFT;
++ val = axi_dma_ioread32(chan->chip, DMAC_CHSUSPREG);
++ val |= BIT(chan->id) << DMAC_CHAN_SUSP2_SHIFT |
++ BIT(chan->id) << DMAC_CHAN_SUSP2_WE_SHIFT;
+ axi_dma_iowrite32(chan->chip, DMAC_CHSUSPREG, val);
+ }
+
+@@ -1190,12 +1191,13 @@ static inline void axi_chan_resume(struct axi_dma_chan *chan)
+ {
+ u32 val;
+
+- val = axi_dma_ioread32(chan->chip, DMAC_CHEN);
+ if (chan->chip->dw->hdata->reg_map_8_channels) {
++ val = axi_dma_ioread32(chan->chip, DMAC_CHEN);
+ val &= ~(BIT(chan->id) << DMAC_CHAN_SUSP_SHIFT);
+ val |= (BIT(chan->id) << DMAC_CHAN_SUSP_WE_SHIFT);
+ axi_dma_iowrite32(chan->chip, DMAC_CHEN, val);
+ } else {
++ val = axi_dma_ioread32(chan->chip, DMAC_CHSUSPREG);
+ val &= ~(BIT(chan->id) << DMAC_CHAN_SUSP2_SHIFT);
+ val |= (BIT(chan->id) << DMAC_CHAN_SUSP2_WE_SHIFT);
+ axi_dma_iowrite32(chan->chip, DMAC_CHSUSPREG, val);
+diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
+index f652da6ab47df..58490289efc35 100644
+--- a/drivers/dma/idxd/device.c
++++ b/drivers/dma/idxd/device.c
+@@ -698,10 +698,7 @@ static void idxd_device_wqs_clear_state(struct idxd_device *idxd)
+ for (i = 0; i < idxd->max_wqs; i++) {
+ struct idxd_wq *wq = idxd->wqs[i];
+
+- if (wq->state == IDXD_WQ_ENABLED) {
+- idxd_wq_disable_cleanup(wq);
+- wq->state = IDXD_WQ_DISABLED;
+- }
++ idxd_wq_disable_cleanup(wq);
+ idxd_wq_device_reset_cleanup(wq);
+ }
+ }
+diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
+index 6196a7b3956b1..89694c704bd58 100644
+--- a/drivers/dma/imx-sdma.c
++++ b/drivers/dma/imx-sdma.c
+@@ -872,7 +872,7 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
+ * SDMA stops cyclic channel when DMA request triggers a channel and no SDMA
+ * owned buffer is available (i.e. BD_DONE was set too late).
+ */
+- if (!is_sdma_channel_enabled(sdmac->sdma, sdmac->channel)) {
++ if (sdmac->desc && !is_sdma_channel_enabled(sdmac->sdma, sdmac->channel)) {
+ dev_warn(sdmac->sdma->dev, "restart cyclic channel %d\n", sdmac->channel);
+ sdma_enable_channel(sdmac->sdma, sdmac->channel);
+ }
+@@ -2280,7 +2280,7 @@ MODULE_DESCRIPTION("i.MX SDMA driver");
+ #if IS_ENABLED(CONFIG_SOC_IMX6Q)
+ MODULE_FIRMWARE("imx/sdma/sdma-imx6q.bin");
+ #endif
+-#if IS_ENABLED(CONFIG_SOC_IMX7D)
++#if IS_ENABLED(CONFIG_SOC_IMX7D) || IS_ENABLED(CONFIG_SOC_IMX8M)
+ MODULE_FIRMWARE("imx/sdma/sdma-imx7d.bin");
+ #endif
+ MODULE_LICENSE("GPL");
+diff --git a/drivers/dma/lgm/lgm-dma.c b/drivers/dma/lgm/lgm-dma.c
+index efe8bd3a0e2aa..9b9184f964be3 100644
+--- a/drivers/dma/lgm/lgm-dma.c
++++ b/drivers/dma/lgm/lgm-dma.c
+@@ -1593,11 +1593,12 @@ static int intel_ldma_probe(struct platform_device *pdev)
+ d->core_clk = devm_clk_get_optional(dev, NULL);
+ if (IS_ERR(d->core_clk))
+ return PTR_ERR(d->core_clk);
+- clk_prepare_enable(d->core_clk);
+
+ d->rst = devm_reset_control_get_optional(dev, NULL);
+ if (IS_ERR(d->rst))
+ return PTR_ERR(d->rst);
++
++ clk_prepare_enable(d->core_clk);
+ reset_control_deassert(d->rst);
+
+ ret = devm_add_action_or_reset(dev, ldma_clk_disable, d);
+diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
+index 858400e42ec05..09915a5cba3ea 100644
+--- a/drivers/dma/pl330.c
++++ b/drivers/dma/pl330.c
+@@ -2589,7 +2589,7 @@ static struct dma_pl330_desc *pl330_get_desc(struct dma_pl330_chan *pch)
+
+ /* If the DMAC pool is empty, alloc new */
+ if (!desc) {
+- DEFINE_SPINLOCK(lock);
++ static DEFINE_SPINLOCK(lock);
+ LIST_HEAD(pool);
+
+ if (!add_desc(&pool, &lock, GFP_ATOMIC, 1))
+diff --git a/drivers/dma/qcom/bam_dma.c b/drivers/dma/qcom/bam_dma.c
+index 87f6ca1541cff..2ff787df513e6 100644
+--- a/drivers/dma/qcom/bam_dma.c
++++ b/drivers/dma/qcom/bam_dma.c
+@@ -558,14 +558,6 @@ static int bam_alloc_chan(struct dma_chan *chan)
+ return 0;
+ }
+
+-static int bam_pm_runtime_get_sync(struct device *dev)
+-{
+- if (pm_runtime_enabled(dev))
+- return pm_runtime_get_sync(dev);
+-
+- return 0;
+-}
+-
+ /**
+ * bam_free_chan - Frees dma resources associated with specific channel
+ * @chan: specified channel
+@@ -581,7 +573,7 @@ static void bam_free_chan(struct dma_chan *chan)
+ unsigned long flags;
+ int ret;
+
+- ret = bam_pm_runtime_get_sync(bdev->dev);
++ ret = pm_runtime_get_sync(bdev->dev);
+ if (ret < 0)
+ return;
+
+@@ -784,7 +776,7 @@ static int bam_pause(struct dma_chan *chan)
+ unsigned long flag;
+ int ret;
+
+- ret = bam_pm_runtime_get_sync(bdev->dev);
++ ret = pm_runtime_get_sync(bdev->dev);
+ if (ret < 0)
+ return ret;
+
+@@ -810,7 +802,7 @@ static int bam_resume(struct dma_chan *chan)
+ unsigned long flag;
+ int ret;
+
+- ret = bam_pm_runtime_get_sync(bdev->dev);
++ ret = pm_runtime_get_sync(bdev->dev);
+ if (ret < 0)
+ return ret;
+
+@@ -919,7 +911,7 @@ static irqreturn_t bam_dma_irq(int irq, void *data)
+ if (srcs & P_IRQ)
+ tasklet_schedule(&bdev->task);
+
+- ret = bam_pm_runtime_get_sync(bdev->dev);
++ ret = pm_runtime_get_sync(bdev->dev);
+ if (ret < 0)
+ return IRQ_NONE;
+
+@@ -1037,7 +1029,7 @@ static void bam_start_dma(struct bam_chan *bchan)
+ if (!vd)
+ return;
+
+- ret = bam_pm_runtime_get_sync(bdev->dev);
++ ret = pm_runtime_get_sync(bdev->dev);
+ if (ret < 0)
+ return;
+
+@@ -1374,11 +1366,6 @@ static int bam_dma_probe(struct platform_device *pdev)
+ if (ret)
+ goto err_unregister_dma;
+
+- if (!bdev->bamclk) {
+- pm_runtime_disable(&pdev->dev);
+- return 0;
+- }
+-
+ pm_runtime_irq_safe(&pdev->dev);
+ pm_runtime_set_autosuspend_delay(&pdev->dev, BAM_DMA_AUTOSUSPEND_DELAY);
+ pm_runtime_use_autosuspend(&pdev->dev);
+@@ -1462,10 +1449,8 @@ static int __maybe_unused bam_dma_suspend(struct device *dev)
+ {
+ struct bam_device *bdev = dev_get_drvdata(dev);
+
+- if (bdev->bamclk) {
+- pm_runtime_force_suspend(dev);
+- clk_unprepare(bdev->bamclk);
+- }
++ pm_runtime_force_suspend(dev);
++ clk_unprepare(bdev->bamclk);
+
+ return 0;
+ }
+@@ -1475,13 +1460,11 @@ static int __maybe_unused bam_dma_resume(struct device *dev)
+ struct bam_device *bdev = dev_get_drvdata(dev);
+ int ret;
+
+- if (bdev->bamclk) {
+- ret = clk_prepare(bdev->bamclk);
+- if (ret)
+- return ret;
++ ret = clk_prepare(bdev->bamclk);
++ if (ret)
++ return ret;
+
+- pm_runtime_force_resume(dev);
+- }
++ pm_runtime_force_resume(dev);
+
+ return 0;
+ }
+diff --git a/drivers/dma/ti/dma-crossbar.c b/drivers/dma/ti/dma-crossbar.c
+index 71d24fc07c003..f744ddbbbad7f 100644
+--- a/drivers/dma/ti/dma-crossbar.c
++++ b/drivers/dma/ti/dma-crossbar.c
+@@ -245,6 +245,7 @@ static void *ti_dra7_xbar_route_allocate(struct of_phandle_args *dma_spec,
+ if (dma_spec->args[0] >= xbar->xbar_requests) {
+ dev_err(&pdev->dev, "Invalid XBAR request number: %d\n",
+ dma_spec->args[0]);
++ put_device(&pdev->dev);
+ return ERR_PTR(-EINVAL);
+ }
+
+@@ -252,12 +253,14 @@ static void *ti_dra7_xbar_route_allocate(struct of_phandle_args *dma_spec,
+ dma_spec->np = of_parse_phandle(ofdma->of_node, "dma-masters", 0);
+ if (!dma_spec->np) {
+ dev_err(&pdev->dev, "Can't get DMA master\n");
++ put_device(&pdev->dev);
+ return ERR_PTR(-EINVAL);
+ }
+
+ map = kzalloc(sizeof(*map), GFP_KERNEL);
+ if (!map) {
+ of_node_put(dma_spec->np);
++ put_device(&pdev->dev);
+ return ERR_PTR(-ENOMEM);
+ }
+
+@@ -268,6 +271,8 @@ static void *ti_dra7_xbar_route_allocate(struct of_phandle_args *dma_spec,
+ mutex_unlock(&xbar->mutex);
+ dev_err(&pdev->dev, "Run out of free DMA requests\n");
+ kfree(map);
++ of_node_put(dma_spec->np);
++ put_device(&pdev->dev);
+ return ERR_PTR(-ENOMEM);
+ }
+ set_bit(map->xbar_out, xbar->dma_inuse);
+diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c
+index b4c1ad19cdaec..3d6f8ee355bfc 100644
+--- a/drivers/i2c/busses/i2c-cadence.c
++++ b/drivers/i2c/busses/i2c-cadence.c
+@@ -1338,6 +1338,7 @@ static int cdns_i2c_probe(struct platform_device *pdev)
+ return 0;
+
+ err_clk_dis:
++ clk_notifier_unregister(id->clk, &id->clk_rate_change_nb);
+ clk_disable_unprepare(id->clk);
+ pm_runtime_disable(&pdev->dev);
+ pm_runtime_set_suspended(&pdev->dev);
+diff --git a/drivers/i2c/busses/i2c-piix4.c b/drivers/i2c/busses/i2c-piix4.c
+index ac8e7d60672a1..39cb1b7bb8656 100644
+--- a/drivers/i2c/busses/i2c-piix4.c
++++ b/drivers/i2c/busses/i2c-piix4.c
+@@ -161,7 +161,6 @@ static const char *piix4_aux_port_name_sb800 = " port 1";
+
+ struct sb800_mmio_cfg {
+ void __iomem *addr;
+- struct resource *res;
+ bool use_mmio;
+ };
+
+@@ -179,13 +178,11 @@ static int piix4_sb800_region_request(struct device *dev,
+ struct sb800_mmio_cfg *mmio_cfg)
+ {
+ if (mmio_cfg->use_mmio) {
+- struct resource *res;
+ void __iomem *addr;
+
+- res = request_mem_region_muxed(SB800_PIIX4_FCH_PM_ADDR,
+- SB800_PIIX4_FCH_PM_SIZE,
+- "sb800_piix4_smb");
+- if (!res) {
++ if (!request_mem_region_muxed(SB800_PIIX4_FCH_PM_ADDR,
++ SB800_PIIX4_FCH_PM_SIZE,
++ "sb800_piix4_smb")) {
+ dev_err(dev,
+ "SMBus base address memory region 0x%x already in use.\n",
+ SB800_PIIX4_FCH_PM_ADDR);
+@@ -195,12 +192,12 @@ static int piix4_sb800_region_request(struct device *dev,
+ addr = ioremap(SB800_PIIX4_FCH_PM_ADDR,
+ SB800_PIIX4_FCH_PM_SIZE);
+ if (!addr) {
+- release_resource(res);
++ release_mem_region(SB800_PIIX4_FCH_PM_ADDR,
++ SB800_PIIX4_FCH_PM_SIZE);
+ dev_err(dev, "SMBus base address mapping failed.\n");
+ return -ENOMEM;
+ }
+
+- mmio_cfg->res = res;
+ mmio_cfg->addr = addr;
+
+ return 0;
+@@ -222,7 +219,8 @@ static void piix4_sb800_region_release(struct device *dev,
+ {
+ if (mmio_cfg->use_mmio) {
+ iounmap(mmio_cfg->addr);
+- release_resource(mmio_cfg->res);
++ release_mem_region(SB800_PIIX4_FCH_PM_ADDR,
++ SB800_PIIX4_FCH_PM_SIZE);
+ return;
+ }
+
+diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
+index 4de960834a1b2..497c5bd95caf8 100644
+--- a/drivers/iommu/intel/dmar.c
++++ b/drivers/iommu/intel/dmar.c
+@@ -383,7 +383,7 @@ static int dmar_pci_bus_notifier(struct notifier_block *nb,
+
+ static struct notifier_block dmar_pci_bus_nb = {
+ .notifier_call = dmar_pci_bus_notifier,
+- .priority = INT_MIN,
++ .priority = 1,
+ };
+
+ static struct dmar_drhd_unit *
+diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
+index ba9a63cac47cc..c7ec5177cf78a 100644
+--- a/drivers/iommu/intel/iommu.c
++++ b/drivers/iommu/intel/iommu.c
+@@ -320,30 +320,6 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
+ DEFINE_SPINLOCK(device_domain_lock);
+ static LIST_HEAD(device_domain_list);
+
+-/*
+- * Iterate over elements in device_domain_list and call the specified
+- * callback @fn against each element.
+- */
+-int for_each_device_domain(int (*fn)(struct device_domain_info *info,
+- void *data), void *data)
+-{
+- int ret = 0;
+- unsigned long flags;
+- struct device_domain_info *info;
+-
+- spin_lock_irqsave(&device_domain_lock, flags);
+- list_for_each_entry(info, &device_domain_list, global) {
+- ret = fn(info, data);
+- if (ret) {
+- spin_unlock_irqrestore(&device_domain_lock, flags);
+- return ret;
+- }
+- }
+- spin_unlock_irqrestore(&device_domain_lock, flags);
+-
+- return 0;
+-}
+-
+ const struct iommu_ops intel_iommu_ops;
+
+ static bool translation_pre_enabled(struct intel_iommu *iommu)
+diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
+index f8d215d85695b..723f3cd8fe729 100644
+--- a/drivers/iommu/intel/pasid.c
++++ b/drivers/iommu/intel/pasid.c
+@@ -86,54 +86,6 @@ void vcmd_free_pasid(struct intel_iommu *iommu, u32 pasid)
+ /*
+ * Per device pasid table management:
+ */
+-static inline void
+-device_attach_pasid_table(struct device_domain_info *info,
+- struct pasid_table *pasid_table)
+-{
+- info->pasid_table = pasid_table;
+- list_add(&info->table, &pasid_table->dev);
+-}
+-
+-static inline void
+-device_detach_pasid_table(struct device_domain_info *info,
+- struct pasid_table *pasid_table)
+-{
+- info->pasid_table = NULL;
+- list_del(&info->table);
+-}
+-
+-struct pasid_table_opaque {
+- struct pasid_table **pasid_table;
+- int segment;
+- int bus;
+- int devfn;
+-};
+-
+-static int search_pasid_table(struct device_domain_info *info, void *opaque)
+-{
+- struct pasid_table_opaque *data = opaque;
+-
+- if (info->iommu->segment == data->segment &&
+- info->bus == data->bus &&
+- info->devfn == data->devfn &&
+- info->pasid_table) {
+- *data->pasid_table = info->pasid_table;
+- return 1;
+- }
+-
+- return 0;
+-}
+-
+-static int get_alias_pasid_table(struct pci_dev *pdev, u16 alias, void *opaque)
+-{
+- struct pasid_table_opaque *data = opaque;
+-
+- data->segment = pci_domain_nr(pdev->bus);
+- data->bus = PCI_BUS_NUM(alias);
+- data->devfn = alias & 0xff;
+-
+- return for_each_device_domain(&search_pasid_table, data);
+-}
+
+ /*
+ * Allocate a pasid table for @dev. It should be called in a
+@@ -143,28 +95,18 @@ int intel_pasid_alloc_table(struct device *dev)
+ {
+ struct device_domain_info *info;
+ struct pasid_table *pasid_table;
+- struct pasid_table_opaque data;
+ struct page *pages;
+ u32 max_pasid = 0;
+- int ret, order;
+- int size;
++ int order, size;
+
+ might_sleep();
+ info = dev_iommu_priv_get(dev);
+ if (WARN_ON(!info || !dev_is_pci(dev) || info->pasid_table))
+ return -EINVAL;
+
+- /* DMA alias device already has a pasid table, use it: */
+- data.pasid_table = &pasid_table;
+- ret = pci_for_each_dma_alias(to_pci_dev(dev),
+- &get_alias_pasid_table, &data);
+- if (ret)
+- goto attach_out;
+-
+ pasid_table = kzalloc(sizeof(*pasid_table), GFP_KERNEL);
+ if (!pasid_table)
+ return -ENOMEM;
+- INIT_LIST_HEAD(&pasid_table->dev);
+
+ if (info->pasid_supported)
+ max_pasid = min_t(u32, pci_max_pasids(to_pci_dev(dev)),
+@@ -182,9 +124,7 @@ int intel_pasid_alloc_table(struct device *dev)
+ pasid_table->table = page_address(pages);
+ pasid_table->order = order;
+ pasid_table->max_pasid = 1 << (order + PAGE_SHIFT + 3);
+-
+-attach_out:
+- device_attach_pasid_table(info, pasid_table);
++ info->pasid_table = pasid_table;
+
+ return 0;
+ }
+@@ -202,10 +142,7 @@ void intel_pasid_free_table(struct device *dev)
+ return;
+
+ pasid_table = info->pasid_table;
+- device_detach_pasid_table(info, pasid_table);
+-
+- if (!list_empty(&pasid_table->dev))
+- return;
++ info->pasid_table = NULL;
+
+ /* Free scalable mode PASID directory tables: */
+ dir = pasid_table->table;
+diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
+index ab4408c824a5a..3d62d84aced47 100644
+--- a/drivers/iommu/intel/pasid.h
++++ b/drivers/iommu/intel/pasid.h
+@@ -74,7 +74,6 @@ struct pasid_table {
+ void *table; /* pasid table pointer */
+ int order; /* page order of pasid table */
+ u32 max_pasid; /* max pasid */
+- struct list_head dev; /* device list */
+ };
+
+ /* Get PRESENT bit of a PASID directory entry. */
+diff --git a/drivers/misc/cardreader/rtsx_usb.c b/drivers/misc/cardreader/rtsx_usb.c
+index 1ef9b61077c44..f150d8769f198 100644
+--- a/drivers/misc/cardreader/rtsx_usb.c
++++ b/drivers/misc/cardreader/rtsx_usb.c
+@@ -631,16 +631,20 @@ static int rtsx_usb_probe(struct usb_interface *intf,
+
+ ucr->pusb_dev = usb_dev;
+
+- ucr->iobuf = usb_alloc_coherent(ucr->pusb_dev, IOBUF_SIZE,
+- GFP_KERNEL, &ucr->iobuf_dma);
+- if (!ucr->iobuf)
++ ucr->cmd_buf = kmalloc(IOBUF_SIZE, GFP_KERNEL);
++ if (!ucr->cmd_buf)
+ return -ENOMEM;
+
++ ucr->rsp_buf = kmalloc(IOBUF_SIZE, GFP_KERNEL);
++ if (!ucr->rsp_buf) {
++ ret = -ENOMEM;
++ goto out_free_cmd_buf;
++ }
++
+ usb_set_intfdata(intf, ucr);
+
+ ucr->vendor_id = id->idVendor;
+ ucr->product_id = id->idProduct;
+- ucr->cmd_buf = ucr->rsp_buf = ucr->iobuf;
+
+ mutex_init(&ucr->dev_mutex);
+
+@@ -668,8 +672,11 @@ static int rtsx_usb_probe(struct usb_interface *intf,
+
+ out_init_fail:
+ usb_set_intfdata(ucr->pusb_intf, NULL);
+- usb_free_coherent(ucr->pusb_dev, IOBUF_SIZE, ucr->iobuf,
+- ucr->iobuf_dma);
++ kfree(ucr->rsp_buf);
++ ucr->rsp_buf = NULL;
++out_free_cmd_buf:
++ kfree(ucr->cmd_buf);
++ ucr->cmd_buf = NULL;
+ return ret;
+ }
+
+@@ -682,8 +689,12 @@ static void rtsx_usb_disconnect(struct usb_interface *intf)
+ mfd_remove_devices(&intf->dev);
+
+ usb_set_intfdata(ucr->pusb_intf, NULL);
+- usb_free_coherent(ucr->pusb_dev, IOBUF_SIZE, ucr->iobuf,
+- ucr->iobuf_dma);
++
++ kfree(ucr->cmd_buf);
++ ucr->cmd_buf = NULL;
++
++ kfree(ucr->rsp_buf);
++ ucr->rsp_buf = NULL;
+ }
+
+ #ifdef CONFIG_PM
+diff --git a/drivers/net/can/grcan.c b/drivers/net/can/grcan.c
+index 5215bd9b2c80d..804dd1d480506 100644
+--- a/drivers/net/can/grcan.c
++++ b/drivers/net/can/grcan.c
+@@ -1646,7 +1646,6 @@ static int grcan_probe(struct platform_device *ofdev)
+ */
+ sysid_parent = of_find_node_by_path("/ambapp0");
+ if (sysid_parent) {
+- of_node_get(sysid_parent);
+ err = of_property_read_u32(sysid_parent, "systemid", &sysid);
+ if (!err && ((sysid & GRLIB_VERSION_MASK) >=
+ GRCAN_TXBUG_SAFE_GRLIB_VERSION))
+diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
+index 088bb1bcf1efb..928958b7e6f60 100644
+--- a/drivers/net/can/m_can/m_can.c
++++ b/drivers/net/can/m_can/m_can.c
+@@ -532,7 +532,7 @@ static int m_can_read_fifo(struct net_device *dev, u32 rxfs)
+ /* acknowledge rx fifo 0 */
+ m_can_write(cdev, M_CAN_RXF0A, fgi);
+
+- timestamp = FIELD_GET(RX_BUF_RXTS_MASK, fifo_header.dlc);
++ timestamp = FIELD_GET(RX_BUF_RXTS_MASK, fifo_header.dlc) << 16;
+
+ m_can_receive_skb(cdev, skb, timestamp);
+
+@@ -1036,7 +1036,7 @@ static int m_can_echo_tx_event(struct net_device *dev)
+ }
+
+ msg_mark = FIELD_GET(TX_EVENT_MM_MASK, txe);
+- timestamp = FIELD_GET(TX_EVENT_TXTS_MASK, txe);
++ timestamp = FIELD_GET(TX_EVENT_TXTS_MASK, txe) << 16;
+
+ /* ack txe element */
+ m_can_write(cdev, M_CAN_TXEFA, FIELD_PREP(TXEFA_EFAI_MASK,
+@@ -1360,7 +1360,9 @@ static void m_can_chip_config(struct net_device *dev)
+ /* enable internal timestamp generation, with a prescalar of 16. The
+ * prescalar is applied to the nominal bit timing
+ */
+- m_can_write(cdev, M_CAN_TSCC, FIELD_PREP(TSCC_TCP_MASK, 0xf));
++ m_can_write(cdev, M_CAN_TSCC,
++ FIELD_PREP(TSCC_TCP_MASK, 0xf) |
++ FIELD_PREP(TSCC_TSS_MASK, TSCC_TSS_INTERNAL));
+
+ m_can_config_endisable(cdev, false);
+
+diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
+index 1e121e04208cc..589996cef5db3 100644
+--- a/drivers/net/can/rcar/rcar_canfd.c
++++ b/drivers/net/can/rcar/rcar_canfd.c
+@@ -1334,7 +1334,10 @@ static void rcar_canfd_set_bittiming(struct net_device *dev)
+ cfg = (RCANFD_DCFG_DTSEG1(gpriv, tseg1) | RCANFD_DCFG_DBRP(brp) |
+ RCANFD_DCFG_DSJW(sjw) | RCANFD_DCFG_DTSEG2(gpriv, tseg2));
+
+- rcar_canfd_write(priv->base, RCANFD_F_DCFG(ch), cfg);
++ if (is_v3u(gpriv))
++ rcar_canfd_write(priv->base, RCANFD_V3U_DCFG(ch), cfg);
++ else
++ rcar_canfd_write(priv->base, RCANFD_F_DCFG(ch), cfg);
+ netdev_dbg(priv->ndev, "drate: brp %u, sjw %u, tseg1 %u, tseg2 %u\n",
+ brp, sjw, tseg1, tseg2);
+ } else {
+diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+index f9dd8fdba12bc..1bbb43d77a728 100644
+--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
++++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+@@ -12,6 +12,7 @@
+ // Copyright (c) 2019 Martin Sperl <kernel@martin.sperl.org>
+ //
+
++#include <asm/unaligned.h>
+ #include <linux/bitfield.h>
+ #include <linux/clk.h>
+ #include <linux/device.h>
+@@ -1641,6 +1642,7 @@ static int mcp251xfd_stop(struct net_device *ndev)
+ netif_stop_queue(ndev);
+ set_bit(MCP251XFD_FLAGS_DOWN, priv->flags);
+ hrtimer_cancel(&priv->rx_irq_timer);
++ hrtimer_cancel(&priv->tx_irq_timer);
+ mcp251xfd_chip_interrupts_disable(priv);
+ free_irq(ndev->irq, priv);
+ can_rx_offload_disable(&priv->offload);
+@@ -1768,7 +1770,7 @@ mcp251xfd_register_get_dev_id(const struct mcp251xfd_priv *priv, u32 *dev_id,
+ xfer[0].len = sizeof(buf_tx->cmd);
+ xfer[0].speed_hz = priv->spi_max_speed_hz_slow;
+ xfer[1].rx_buf = buf_rx->data;
+- xfer[1].len = sizeof(dev_id);
++ xfer[1].len = sizeof(*dev_id);
+ xfer[1].speed_hz = priv->spi_max_speed_hz_fast;
+
+ mcp251xfd_spi_cmd_read_nocrc(&buf_tx->cmd, MCP251XFD_REG_DEVID);
+@@ -1777,7 +1779,7 @@ mcp251xfd_register_get_dev_id(const struct mcp251xfd_priv *priv, u32 *dev_id,
+ if (err)
+ goto out_kfree_buf_tx;
+
+- *dev_id = be32_to_cpup((__be32 *)buf_rx->data);
++ *dev_id = get_unaligned_le32(buf_rx->data);
+ *effective_speed_hz_slow = xfer[0].effective_speed_hz;
+ *effective_speed_hz_fast = xfer[1].effective_speed_hz;
+
+diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
+index 217510c12af55..92b7bc7f14b9e 100644
+--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
++++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c
+@@ -334,19 +334,21 @@ mcp251xfd_regmap_crc_read(void *context,
+ * register. It increments once per SYS clock tick,
+ * which is 20 or 40 MHz.
+ *
+- * Observation shows that if the lowest byte (which is
+- * transferred first on the SPI bus) of that register
+- * is 0x00 or 0x80 the calculated CRC doesn't always
+- * match the transferred one.
++ * Observation on the mcp2518fd shows that if the
++ * lowest byte (which is transferred first on the SPI
++ * bus) of that register is 0x00 or 0x80 the
++ * calculated CRC doesn't always match the transferred
++ * one. On the mcp2517fd this problem is not limited
++ * to the first byte being 0x00 or 0x80.
+ *
+ * If the highest bit in the lowest byte is flipped
+ * the transferred CRC matches the calculated one. We
+- * assume for now the CRC calculation in the chip
+- * works on wrong data and the transferred data is
+- * correct.
++ * assume for now the CRC operates on the correct
++ * data.
+ */
+ if (reg == MCP251XFD_REG_TBC &&
+- (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) {
++ ((buf_rx->data[0] & 0xf8) == 0x0 ||
++ (buf_rx->data[0] & 0xf8) == 0x80)) {
+ /* Flip highest bit in lowest byte of le32 */
+ buf_rx->data[0] ^= 0x80;
+
+@@ -356,10 +358,8 @@ mcp251xfd_regmap_crc_read(void *context,
+ val_len);
+ if (!err) {
+ /* If CRC is now correct, assume
+- * transferred data was OK, flip bit
+- * back to original value.
++ * flipped data is OK.
+ */
+- buf_rx->data[0] ^= 0x80;
+ goto out;
+ }
+ }
+diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
+index b29ba9138866b..d3a658b444b5f 100644
+--- a/drivers/net/can/usb/gs_usb.c
++++ b/drivers/net/can/usb/gs_usb.c
+@@ -268,6 +268,8 @@ struct gs_can {
+
+ struct usb_anchor tx_submitted;
+ atomic_t active_tx_urbs;
++ void *rxbuf[GS_MAX_RX_URBS];
++ dma_addr_t rxbuf_dma[GS_MAX_RX_URBS];
+ };
+
+ /* usb interface struct */
+@@ -742,6 +744,7 @@ static int gs_can_open(struct net_device *netdev)
+ for (i = 0; i < GS_MAX_RX_URBS; i++) {
+ struct urb *urb;
+ u8 *buf;
++ dma_addr_t buf_dma;
+
+ /* alloc rx urb */
+ urb = usb_alloc_urb(0, GFP_KERNEL);
+@@ -752,7 +755,7 @@ static int gs_can_open(struct net_device *netdev)
+ buf = usb_alloc_coherent(dev->udev,
+ dev->parent->hf_size_rx,
+ GFP_KERNEL,
+- &urb->transfer_dma);
++ &buf_dma);
+ if (!buf) {
+ netdev_err(netdev,
+ "No memory left for USB buffer\n");
+@@ -760,6 +763,8 @@ static int gs_can_open(struct net_device *netdev)
+ return -ENOMEM;
+ }
+
++ urb->transfer_dma = buf_dma;
++
+ /* fill, anchor, and submit rx urb */
+ usb_fill_bulk_urb(urb,
+ dev->udev,
+@@ -781,10 +786,17 @@ static int gs_can_open(struct net_device *netdev)
+ "usb_submit failed (err=%d)\n", rc);
+
+ usb_unanchor_urb(urb);
++ usb_free_coherent(dev->udev,
++ sizeof(struct gs_host_frame),
++ buf,
++ buf_dma);
+ usb_free_urb(urb);
+ break;
+ }
+
++ dev->rxbuf[i] = buf;
++ dev->rxbuf_dma[i] = buf_dma;
++
+ /* Drop reference,
+ * USB core will take care of freeing it
+ */
+@@ -842,13 +854,20 @@ static int gs_can_close(struct net_device *netdev)
+ int rc;
+ struct gs_can *dev = netdev_priv(netdev);
+ struct gs_usb *parent = dev->parent;
++ unsigned int i;
+
+ netif_stop_queue(netdev);
+
+ /* Stop polling */
+ parent->active_channels--;
+- if (!parent->active_channels)
++ if (!parent->active_channels) {
+ usb_kill_anchored_urbs(&parent->rx_submitted);
++ for (i = 0; i < GS_MAX_RX_URBS; i++)
++ usb_free_coherent(dev->udev,
++ sizeof(struct gs_host_frame),
++ dev->rxbuf[i],
++ dev->rxbuf_dma[i]);
++ }
+
+ /* Stop sending URBs */
+ usb_kill_anchored_urbs(&dev->tx_submitted);
+diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb.h b/drivers/net/can/usb/kvaser_usb/kvaser_usb.h
+index 3a49257f9fa65..eefcbe3aadce7 100644
+--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb.h
++++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb.h
+@@ -35,9 +35,10 @@
+ #define KVASER_USB_RX_BUFFER_SIZE 3072
+ #define KVASER_USB_MAX_NET_DEVICES 5
+
+-/* USB devices features */
+-#define KVASER_USB_HAS_SILENT_MODE BIT(0)
+-#define KVASER_USB_HAS_TXRX_ERRORS BIT(1)
++/* Kvaser USB device quirks */
++#define KVASER_USB_QUIRK_HAS_SILENT_MODE BIT(0)
++#define KVASER_USB_QUIRK_HAS_TXRX_ERRORS BIT(1)
++#define KVASER_USB_QUIRK_IGNORE_CLK_FREQ BIT(2)
+
+ /* Device capabilities */
+ #define KVASER_USB_CAP_BERR_CAP 0x01
+@@ -65,12 +66,7 @@ struct kvaser_usb_dev_card_data_hydra {
+ struct kvaser_usb_dev_card_data {
+ u32 ctrlmode_supported;
+ u32 capabilities;
+- union {
+- struct {
+- enum kvaser_usb_leaf_family family;
+- } leaf;
+- struct kvaser_usb_dev_card_data_hydra hydra;
+- };
++ struct kvaser_usb_dev_card_data_hydra hydra;
+ };
+
+ /* Context for an outstanding, not yet ACKed, transmission */
+@@ -83,7 +79,7 @@ struct kvaser_usb {
+ struct usb_device *udev;
+ struct usb_interface *intf;
+ struct kvaser_usb_net_priv *nets[KVASER_USB_MAX_NET_DEVICES];
+- const struct kvaser_usb_dev_ops *ops;
++ const struct kvaser_usb_driver_info *driver_info;
+ const struct kvaser_usb_dev_cfg *cfg;
+
+ struct usb_endpoint_descriptor *bulk_in, *bulk_out;
+@@ -165,6 +161,12 @@ struct kvaser_usb_dev_ops {
+ u16 transid);
+ };
+
++struct kvaser_usb_driver_info {
++ u32 quirks;
++ enum kvaser_usb_leaf_family family;
++ const struct kvaser_usb_dev_ops *ops;
++};
++
+ struct kvaser_usb_dev_cfg {
+ const struct can_clock clock;
+ const unsigned int timestamp_freq;
+@@ -184,4 +186,7 @@ int kvaser_usb_send_cmd_async(struct kvaser_usb_net_priv *priv, void *cmd,
+ int len);
+
+ int kvaser_usb_can_rx_over_error(struct net_device *netdev);
++
++extern const struct can_bittiming_const kvaser_usb_flexc_bittiming_const;
++
+ #endif /* KVASER_USB_H */
+diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c
+index e67658b53d02f..f211bfcb1d97e 100644
+--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c
++++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c
+@@ -61,8 +61,6 @@
+ #define USB_USBCAN_R_V2_PRODUCT_ID 294
+ #define USB_LEAF_LIGHT_R_V2_PRODUCT_ID 295
+ #define USB_LEAF_LIGHT_HS_V2_OEM2_PRODUCT_ID 296
+-#define USB_LEAF_PRODUCT_ID_END \
+- USB_LEAF_LIGHT_HS_V2_OEM2_PRODUCT_ID
+
+ /* Kvaser USBCan-II devices product ids */
+ #define USB_USBCAN_REVB_PRODUCT_ID 2
+@@ -89,116 +87,153 @@
+ #define USB_USBCAN_PRO_4HS_PRODUCT_ID 276
+ #define USB_HYBRID_CANLIN_PRODUCT_ID 277
+ #define USB_HYBRID_PRO_CANLIN_PRODUCT_ID 278
+-#define USB_HYDRA_PRODUCT_ID_END \
+- USB_HYBRID_PRO_CANLIN_PRODUCT_ID
+
+-static inline bool kvaser_is_leaf(const struct usb_device_id *id)
+-{
+- return (id->idProduct >= USB_LEAF_DEVEL_PRODUCT_ID &&
+- id->idProduct <= USB_CAN_R_PRODUCT_ID) ||
+- (id->idProduct >= USB_LEAF_LITE_V2_PRODUCT_ID &&
+- id->idProduct <= USB_LEAF_PRODUCT_ID_END);
+-}
++static const struct kvaser_usb_driver_info kvaser_usb_driver_info_hydra = {
++ .quirks = 0,
++ .ops = &kvaser_usb_hydra_dev_ops,
++};
+
+-static inline bool kvaser_is_usbcan(const struct usb_device_id *id)
+-{
+- return id->idProduct >= USB_USBCAN_REVB_PRODUCT_ID &&
+- id->idProduct <= USB_MEMORATOR_PRODUCT_ID;
+-}
++static const struct kvaser_usb_driver_info kvaser_usb_driver_info_usbcan = {
++ .quirks = KVASER_USB_QUIRK_HAS_TXRX_ERRORS |
++ KVASER_USB_QUIRK_HAS_SILENT_MODE,
++ .family = KVASER_USBCAN,
++ .ops = &kvaser_usb_leaf_dev_ops,
++};
+
+-static inline bool kvaser_is_hydra(const struct usb_device_id *id)
+-{
+- return id->idProduct >= USB_BLACKBIRD_V2_PRODUCT_ID &&
+- id->idProduct <= USB_HYDRA_PRODUCT_ID_END;
+-}
++static const struct kvaser_usb_driver_info kvaser_usb_driver_info_leaf = {
++ .quirks = KVASER_USB_QUIRK_IGNORE_CLK_FREQ,
++ .family = KVASER_LEAF,
++ .ops = &kvaser_usb_leaf_dev_ops,
++};
++
++static const struct kvaser_usb_driver_info kvaser_usb_driver_info_leaf_err = {
++ .quirks = KVASER_USB_QUIRK_HAS_TXRX_ERRORS |
++ KVASER_USB_QUIRK_IGNORE_CLK_FREQ,
++ .family = KVASER_LEAF,
++ .ops = &kvaser_usb_leaf_dev_ops,
++};
++
++static const struct kvaser_usb_driver_info kvaser_usb_driver_info_leaf_err_listen = {
++ .quirks = KVASER_USB_QUIRK_HAS_TXRX_ERRORS |
++ KVASER_USB_QUIRK_HAS_SILENT_MODE |
++ KVASER_USB_QUIRK_IGNORE_CLK_FREQ,
++ .family = KVASER_LEAF,
++ .ops = &kvaser_usb_leaf_dev_ops,
++};
++
++static const struct kvaser_usb_driver_info kvaser_usb_driver_info_leafimx = {
++ .quirks = 0,
++ .ops = &kvaser_usb_leaf_dev_ops,
++};
+
+ static const struct usb_device_id kvaser_usb_table[] = {
+- /* Leaf USB product IDs */
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID) },
++ /* Leaf M32C USB product IDs */
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_DEVEL_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_SPRO_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_LS_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_SWC_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_LIN_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_SPRO_LS_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_SPRO_SWC_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO2_DEVEL_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO2_HSHS_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_UPRO_HSHS_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_GI_PRODUCT_ID) },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_GI_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_OBDII_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS |
+- KVASER_USB_HAS_SILENT_MODE },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err_listen },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO2_HSLS_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_CH_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_BLACKBIRD_SPRO_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_OEM_MERCURY_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_OEM_LEAF_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_CAN_R_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LIGHT_HS_V2_OEM_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_LIGHT_2HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_2HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_R_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LIGHT_R_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LIGHT_HS_V2_OEM2_PRODUCT_ID) },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leaf_err },
++
++ /* Leaf i.MX28 USB product IDs */
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LITE_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LIGHT_HS_V2_OEM_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_LIGHT_2HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_MINI_PCIE_2HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_R_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LIGHT_R_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_LIGHT_HS_V2_OEM2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_leafimx },
+
+ /* USBCANII USB product IDs */
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN2_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_usbcan },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_REVB_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_usbcan },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMORATOR_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_usbcan },
+ { USB_DEVICE(KVASER_VENDOR_ID, USB_VCI2_PRODUCT_ID),
+- .driver_info = KVASER_USB_HAS_TXRX_ERRORS },
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_usbcan },
+
+ /* Minihydra USB product IDs */
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_BLACKBIRD_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO_PRO_5HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_PRO_5HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_LIGHT_4HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_HS_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_PRO_2HS_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO_2HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO_PRO_2HS_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_2CANLIN_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_ATI_USBCAN_PRO_2HS_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_ATI_MEMO_PRO_2HS_V2_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_PRO_2CANLIN_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_U100_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_U100P_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_U100S_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_PRO_4HS_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_CANLIN_PRODUCT_ID) },
+- { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_PRO_CANLIN_PRODUCT_ID) },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_BLACKBIRD_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO_PRO_5HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_PRO_5HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_LIGHT_4HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_LEAF_PRO_HS_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_PRO_2HS_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO_2HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_MEMO_PRO_2HS_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_2CANLIN_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_ATI_USBCAN_PRO_2HS_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_ATI_MEMO_PRO_2HS_V2_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_PRO_2CANLIN_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_U100_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_U100P_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_U100S_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_USBCAN_PRO_4HS_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_CANLIN_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
++ { USB_DEVICE(KVASER_VENDOR_ID, USB_HYBRID_PRO_CANLIN_PRODUCT_ID),
++ .driver_info = (kernel_ulong_t)&kvaser_usb_driver_info_hydra },
+ { }
+ };
+ MODULE_DEVICE_TABLE(usb, kvaser_usb_table);
+@@ -285,6 +320,7 @@ int kvaser_usb_can_rx_over_error(struct net_device *netdev)
+ static void kvaser_usb_read_bulk_callback(struct urb *urb)
+ {
+ struct kvaser_usb *dev = urb->context;
++ const struct kvaser_usb_dev_ops *ops = dev->driver_info->ops;
+ int err;
+ unsigned int i;
+
+@@ -301,8 +337,8 @@ static void kvaser_usb_read_bulk_callback(struct urb *urb)
+ goto resubmit_urb;
+ }
+
+- dev->ops->dev_read_bulk_callback(dev, urb->transfer_buffer,
+- urb->actual_length);
++ ops->dev_read_bulk_callback(dev, urb->transfer_buffer,
++ urb->actual_length);
+
+ resubmit_urb:
+ usb_fill_bulk_urb(urb, dev->udev,
+@@ -396,6 +432,7 @@ static int kvaser_usb_open(struct net_device *netdev)
+ {
+ struct kvaser_usb_net_priv *priv = netdev_priv(netdev);
+ struct kvaser_usb *dev = priv->dev;
++ const struct kvaser_usb_dev_ops *ops = dev->driver_info->ops;
+ int err;
+
+ err = open_candev(netdev);
+@@ -406,11 +443,11 @@ static int kvaser_usb_open(struct net_device *netdev)
+ if (err)
+ goto error;
+
+- err = dev->ops->dev_set_opt_mode(priv);
++ err = ops->dev_set_opt_mode(priv);
+ if (err)
+ goto error;
+
+- err = dev->ops->dev_start_chip(priv);
++ err = ops->dev_start_chip(priv);
+ if (err) {
+ netdev_warn(netdev, "Cannot start device, error %d\n", err);
+ goto error;
+@@ -467,22 +504,23 @@ static int kvaser_usb_close(struct net_device *netdev)
+ {
+ struct kvaser_usb_net_priv *priv = netdev_priv(netdev);
+ struct kvaser_usb *dev = priv->dev;
++ const struct kvaser_usb_dev_ops *ops = dev->driver_info->ops;
+ int err;
+
+ netif_stop_queue(netdev);
+
+- err = dev->ops->dev_flush_queue(priv);
++ err = ops->dev_flush_queue(priv);
+ if (err)
+ netdev_warn(netdev, "Cannot flush queue, error %d\n", err);
+
+- if (dev->ops->dev_reset_chip) {
+- err = dev->ops->dev_reset_chip(dev, priv->channel);
++ if (ops->dev_reset_chip) {
++ err = ops->dev_reset_chip(dev, priv->channel);
+ if (err)
+ netdev_warn(netdev, "Cannot reset card, error %d\n",
+ err);
+ }
+
+- err = dev->ops->dev_stop_chip(priv);
++ err = ops->dev_stop_chip(priv);
+ if (err)
+ netdev_warn(netdev, "Cannot stop device, error %d\n", err);
+
+@@ -521,6 +559,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
+ {
+ struct kvaser_usb_net_priv *priv = netdev_priv(netdev);
+ struct kvaser_usb *dev = priv->dev;
++ const struct kvaser_usb_dev_ops *ops = dev->driver_info->ops;
+ struct net_device_stats *stats = &netdev->stats;
+ struct kvaser_usb_tx_urb_context *context = NULL;
+ struct urb *urb;
+@@ -563,8 +602,7 @@ static netdev_tx_t kvaser_usb_start_xmit(struct sk_buff *skb,
+ goto freeurb;
+ }
+
+- buf = dev->ops->dev_frame_to_cmd(priv, skb, &cmd_len,
+- context->echo_index);
++ buf = ops->dev_frame_to_cmd(priv, skb, &cmd_len, context->echo_index);
+ if (!buf) {
+ stats->tx_dropped++;
+ dev_kfree_skb(skb);
+@@ -648,15 +686,16 @@ static void kvaser_usb_remove_interfaces(struct kvaser_usb *dev)
+ }
+ }
+
+-static int kvaser_usb_init_one(struct kvaser_usb *dev,
+- const struct usb_device_id *id, int channel)
++static int kvaser_usb_init_one(struct kvaser_usb *dev, int channel)
+ {
+ struct net_device *netdev;
+ struct kvaser_usb_net_priv *priv;
++ const struct kvaser_usb_driver_info *driver_info = dev->driver_info;
++ const struct kvaser_usb_dev_ops *ops = driver_info->ops;
+ int err;
+
+- if (dev->ops->dev_reset_chip) {
+- err = dev->ops->dev_reset_chip(dev, channel);
++ if (ops->dev_reset_chip) {
++ err = ops->dev_reset_chip(dev, channel);
+ if (err)
+ return err;
+ }
+@@ -685,20 +724,19 @@ static int kvaser_usb_init_one(struct kvaser_usb *dev,
+ priv->can.state = CAN_STATE_STOPPED;
+ priv->can.clock.freq = dev->cfg->clock.freq;
+ priv->can.bittiming_const = dev->cfg->bittiming_const;
+- priv->can.do_set_bittiming = dev->ops->dev_set_bittiming;
+- priv->can.do_set_mode = dev->ops->dev_set_mode;
+- if ((id->driver_info & KVASER_USB_HAS_TXRX_ERRORS) ||
++ priv->can.do_set_bittiming = ops->dev_set_bittiming;
++ priv->can.do_set_mode = ops->dev_set_mode;
++ if ((driver_info->quirks & KVASER_USB_QUIRK_HAS_TXRX_ERRORS) ||
+ (priv->dev->card_data.capabilities & KVASER_USB_CAP_BERR_CAP))
+- priv->can.do_get_berr_counter = dev->ops->dev_get_berr_counter;
+- if (id->driver_info & KVASER_USB_HAS_SILENT_MODE)
++ priv->can.do_get_berr_counter = ops->dev_get_berr_counter;
++ if (driver_info->quirks & KVASER_USB_QUIRK_HAS_SILENT_MODE)
+ priv->can.ctrlmode_supported |= CAN_CTRLMODE_LISTENONLY;
+
+ priv->can.ctrlmode_supported |= dev->card_data.ctrlmode_supported;
+
+ if (priv->can.ctrlmode_supported & CAN_CTRLMODE_FD) {
+ priv->can.data_bittiming_const = dev->cfg->data_bittiming_const;
+- priv->can.do_set_data_bittiming =
+- dev->ops->dev_set_data_bittiming;
++ priv->can.do_set_data_bittiming = ops->dev_set_data_bittiming;
+ }
+
+ netdev->flags |= IFF_ECHO;
+@@ -729,29 +767,22 @@ static int kvaser_usb_probe(struct usb_interface *intf,
+ struct kvaser_usb *dev;
+ int err;
+ int i;
++ const struct kvaser_usb_driver_info *driver_info;
++ const struct kvaser_usb_dev_ops *ops;
++
++ driver_info = (const struct kvaser_usb_driver_info *)id->driver_info;
++ if (!driver_info)
++ return -ENODEV;
+
+ dev = devm_kzalloc(&intf->dev, sizeof(*dev), GFP_KERNEL);
+ if (!dev)
+ return -ENOMEM;
+
+- if (kvaser_is_leaf(id)) {
+- dev->card_data.leaf.family = KVASER_LEAF;
+- dev->ops = &kvaser_usb_leaf_dev_ops;
+- } else if (kvaser_is_usbcan(id)) {
+- dev->card_data.leaf.family = KVASER_USBCAN;
+- dev->ops = &kvaser_usb_leaf_dev_ops;
+- } else if (kvaser_is_hydra(id)) {
+- dev->ops = &kvaser_usb_hydra_dev_ops;
+- } else {
+- dev_err(&intf->dev,
+- "Product ID (%d) is not a supported Kvaser USB device\n",
+- id->idProduct);
+- return -ENODEV;
+- }
+-
+ dev->intf = intf;
++ dev->driver_info = driver_info;
++ ops = driver_info->ops;
+
+- err = dev->ops->dev_setup_endpoints(dev);
++ err = ops->dev_setup_endpoints(dev);
+ if (err) {
+ dev_err(&intf->dev, "Cannot get usb endpoint(s)");
+ return err;
+@@ -765,22 +796,22 @@ static int kvaser_usb_probe(struct usb_interface *intf,
+
+ dev->card_data.ctrlmode_supported = 0;
+ dev->card_data.capabilities = 0;
+- err = dev->ops->dev_init_card(dev);
++ err = ops->dev_init_card(dev);
+ if (err) {
+ dev_err(&intf->dev,
+ "Failed to initialize card, error %d\n", err);
+ return err;
+ }
+
+- err = dev->ops->dev_get_software_info(dev);
++ err = ops->dev_get_software_info(dev);
+ if (err) {
+ dev_err(&intf->dev,
+ "Cannot get software info, error %d\n", err);
+ return err;
+ }
+
+- if (dev->ops->dev_get_software_details) {
+- err = dev->ops->dev_get_software_details(dev);
++ if (ops->dev_get_software_details) {
++ err = ops->dev_get_software_details(dev);
+ if (err) {
+ dev_err(&intf->dev,
+ "Cannot get software details, error %d\n", err);
+@@ -798,14 +829,14 @@ static int kvaser_usb_probe(struct usb_interface *intf,
+
+ dev_dbg(&intf->dev, "Max outstanding tx = %d URBs\n", dev->max_tx_urbs);
+
+- err = dev->ops->dev_get_card_info(dev);
++ err = ops->dev_get_card_info(dev);
+ if (err) {
+ dev_err(&intf->dev, "Cannot get card info, error %d\n", err);
+ return err;
+ }
+
+- if (dev->ops->dev_get_capabilities) {
+- err = dev->ops->dev_get_capabilities(dev);
++ if (ops->dev_get_capabilities) {
++ err = ops->dev_get_capabilities(dev);
+ if (err) {
+ dev_err(&intf->dev,
+ "Cannot get capabilities, error %d\n", err);
+@@ -815,7 +846,7 @@ static int kvaser_usb_probe(struct usb_interface *intf,
+ }
+
+ for (i = 0; i < dev->nchannels; i++) {
+- err = kvaser_usb_init_one(dev, id, i);
++ err = kvaser_usb_init_one(dev, i);
+ if (err) {
+ kvaser_usb_remove_interfaces(dev);
+ return err;
+diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
+index a26823c5b62ad..5d70844ac0300 100644
+--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
++++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
+@@ -375,7 +375,7 @@ static const struct can_bittiming_const kvaser_usb_hydra_kcan_bittiming_c = {
+ .brp_inc = 1,
+ };
+
+-static const struct can_bittiming_const kvaser_usb_hydra_flexc_bittiming_c = {
++const struct can_bittiming_const kvaser_usb_flexc_bittiming_const = {
+ .name = "kvaser_usb_flex",
+ .tseg1_min = 4,
+ .tseg1_max = 16,
+@@ -2052,7 +2052,7 @@ static const struct kvaser_usb_dev_cfg kvaser_usb_hydra_dev_cfg_flexc = {
+ .freq = 24 * MEGA /* Hz */,
+ },
+ .timestamp_freq = 1,
+- .bittiming_const = &kvaser_usb_hydra_flexc_bittiming_c,
++ .bittiming_const = &kvaser_usb_flexc_bittiming_const,
+ };
+
+ static const struct kvaser_usb_dev_cfg kvaser_usb_hydra_dev_cfg_rt = {
+diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
+index c805b999c5436..cc809ecd1e622 100644
+--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
++++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
+@@ -101,16 +101,6 @@
+ #define USBCAN_ERROR_STATE_RX_ERROR BIT(1)
+ #define USBCAN_ERROR_STATE_BUSERROR BIT(2)
+
+-/* bittiming parameters */
+-#define KVASER_USB_TSEG1_MIN 1
+-#define KVASER_USB_TSEG1_MAX 16
+-#define KVASER_USB_TSEG2_MIN 1
+-#define KVASER_USB_TSEG2_MAX 8
+-#define KVASER_USB_SJW_MAX 4
+-#define KVASER_USB_BRP_MIN 1
+-#define KVASER_USB_BRP_MAX 64
+-#define KVASER_USB_BRP_INC 1
+-
+ /* ctrl modes */
+ #define KVASER_CTRL_MODE_NORMAL 1
+ #define KVASER_CTRL_MODE_SILENT 2
+@@ -343,48 +333,68 @@ struct kvaser_usb_err_summary {
+ };
+ };
+
+-static const struct can_bittiming_const kvaser_usb_leaf_bittiming_const = {
+- .name = "kvaser_usb",
+- .tseg1_min = KVASER_USB_TSEG1_MIN,
+- .tseg1_max = KVASER_USB_TSEG1_MAX,
+- .tseg2_min = KVASER_USB_TSEG2_MIN,
+- .tseg2_max = KVASER_USB_TSEG2_MAX,
+- .sjw_max = KVASER_USB_SJW_MAX,
+- .brp_min = KVASER_USB_BRP_MIN,
+- .brp_max = KVASER_USB_BRP_MAX,
+- .brp_inc = KVASER_USB_BRP_INC,
++static const struct can_bittiming_const kvaser_usb_leaf_m16c_bittiming_const = {
++ .name = "kvaser_usb_ucii",
++ .tseg1_min = 4,
++ .tseg1_max = 16,
++ .tseg2_min = 2,
++ .tseg2_max = 8,
++ .sjw_max = 4,
++ .brp_min = 1,
++ .brp_max = 16,
++ .brp_inc = 1,
++};
++
++static const struct can_bittiming_const kvaser_usb_leaf_m32c_bittiming_const = {
++ .name = "kvaser_usb_leaf",
++ .tseg1_min = 3,
++ .tseg1_max = 16,
++ .tseg2_min = 2,
++ .tseg2_max = 8,
++ .sjw_max = 4,
++ .brp_min = 2,
++ .brp_max = 128,
++ .brp_inc = 2,
+ };
+
+-static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_dev_cfg_8mhz = {
++static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_usbcan_dev_cfg = {
+ .clock = {
+ .freq = 8 * MEGA /* Hz */,
+ },
+ .timestamp_freq = 1,
+- .bittiming_const = &kvaser_usb_leaf_bittiming_const,
++ .bittiming_const = &kvaser_usb_leaf_m16c_bittiming_const,
++};
++
++static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_m32c_dev_cfg = {
++ .clock = {
++ .freq = 16 * MEGA /* Hz */,
++ },
++ .timestamp_freq = 1,
++ .bittiming_const = &kvaser_usb_leaf_m32c_bittiming_const,
+ };
+
+-static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_dev_cfg_16mhz = {
++static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_imx_dev_cfg_16mhz = {
+ .clock = {
+ .freq = 16 * MEGA /* Hz */,
+ },
+ .timestamp_freq = 1,
+- .bittiming_const = &kvaser_usb_leaf_bittiming_const,
++ .bittiming_const = &kvaser_usb_flexc_bittiming_const,
+ };
+
+-static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_dev_cfg_24mhz = {
++static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_imx_dev_cfg_24mhz = {
+ .clock = {
+ .freq = 24 * MEGA /* Hz */,
+ },
+ .timestamp_freq = 1,
+- .bittiming_const = &kvaser_usb_leaf_bittiming_const,
++ .bittiming_const = &kvaser_usb_flexc_bittiming_const,
+ };
+
+-static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_dev_cfg_32mhz = {
++static const struct kvaser_usb_dev_cfg kvaser_usb_leaf_imx_dev_cfg_32mhz = {
+ .clock = {
+ .freq = 32 * MEGA /* Hz */,
+ },
+ .timestamp_freq = 1,
+- .bittiming_const = &kvaser_usb_leaf_bittiming_const,
++ .bittiming_const = &kvaser_usb_flexc_bittiming_const,
+ };
+
+ static void *
+@@ -404,7 +414,7 @@ kvaser_usb_leaf_frame_to_cmd(const struct kvaser_usb_net_priv *priv,
+ sizeof(struct kvaser_cmd_tx_can);
+ cmd->u.tx_can.channel = priv->channel;
+
+- switch (dev->card_data.leaf.family) {
++ switch (dev->driver_info->family) {
+ case KVASER_LEAF:
+ cmd_tx_can_flags = &cmd->u.tx_can.leaf.flags;
+ break;
+@@ -524,16 +534,23 @@ static void kvaser_usb_leaf_get_software_info_leaf(struct kvaser_usb *dev,
+ dev->fw_version = le32_to_cpu(softinfo->fw_version);
+ dev->max_tx_urbs = le16_to_cpu(softinfo->max_outstanding_tx);
+
+- switch (sw_options & KVASER_USB_LEAF_SWOPTION_FREQ_MASK) {
+- case KVASER_USB_LEAF_SWOPTION_FREQ_16_MHZ_CLK:
+- dev->cfg = &kvaser_usb_leaf_dev_cfg_16mhz;
+- break;
+- case KVASER_USB_LEAF_SWOPTION_FREQ_24_MHZ_CLK:
+- dev->cfg = &kvaser_usb_leaf_dev_cfg_24mhz;
+- break;
+- case KVASER_USB_LEAF_SWOPTION_FREQ_32_MHZ_CLK:
+- dev->cfg = &kvaser_usb_leaf_dev_cfg_32mhz;
+- break;
++ if (dev->driver_info->quirks & KVASER_USB_QUIRK_IGNORE_CLK_FREQ) {
++ /* Firmware expects bittiming parameters calculated for 16MHz
++ * clock, regardless of the actual clock
++ */
++ dev->cfg = &kvaser_usb_leaf_m32c_dev_cfg;
++ } else {
++ switch (sw_options & KVASER_USB_LEAF_SWOPTION_FREQ_MASK) {
++ case KVASER_USB_LEAF_SWOPTION_FREQ_16_MHZ_CLK:
++ dev->cfg = &kvaser_usb_leaf_imx_dev_cfg_16mhz;
++ break;
++ case KVASER_USB_LEAF_SWOPTION_FREQ_24_MHZ_CLK:
++ dev->cfg = &kvaser_usb_leaf_imx_dev_cfg_24mhz;
++ break;
++ case KVASER_USB_LEAF_SWOPTION_FREQ_32_MHZ_CLK:
++ dev->cfg = &kvaser_usb_leaf_imx_dev_cfg_32mhz;
++ break;
++ }
+ }
+ }
+
+@@ -550,7 +567,7 @@ static int kvaser_usb_leaf_get_software_info_inner(struct kvaser_usb *dev)
+ if (err)
+ return err;
+
+- switch (dev->card_data.leaf.family) {
++ switch (dev->driver_info->family) {
+ case KVASER_LEAF:
+ kvaser_usb_leaf_get_software_info_leaf(dev, &cmd.u.leaf.softinfo);
+ break;
+@@ -558,7 +575,7 @@ static int kvaser_usb_leaf_get_software_info_inner(struct kvaser_usb *dev)
+ dev->fw_version = le32_to_cpu(cmd.u.usbcan.softinfo.fw_version);
+ dev->max_tx_urbs =
+ le16_to_cpu(cmd.u.usbcan.softinfo.max_outstanding_tx);
+- dev->cfg = &kvaser_usb_leaf_dev_cfg_8mhz;
++ dev->cfg = &kvaser_usb_leaf_usbcan_dev_cfg;
+ break;
+ }
+
+@@ -597,7 +614,7 @@ static int kvaser_usb_leaf_get_card_info(struct kvaser_usb *dev)
+
+ dev->nchannels = cmd.u.cardinfo.nchannels;
+ if (dev->nchannels > KVASER_USB_MAX_NET_DEVICES ||
+- (dev->card_data.leaf.family == KVASER_USBCAN &&
++ (dev->driver_info->family == KVASER_USBCAN &&
+ dev->nchannels > MAX_USBCAN_NET_DEVICES))
+ return -EINVAL;
+
+@@ -730,7 +747,7 @@ kvaser_usb_leaf_rx_error_update_can_state(struct kvaser_usb_net_priv *priv,
+ new_state < CAN_STATE_BUS_OFF)
+ priv->can.can_stats.restarts++;
+
+- switch (dev->card_data.leaf.family) {
++ switch (dev->driver_info->family) {
+ case KVASER_LEAF:
+ if (es->leaf.error_factor) {
+ priv->can.can_stats.bus_error++;
+@@ -809,7 +826,7 @@ static void kvaser_usb_leaf_rx_error(const struct kvaser_usb *dev,
+ }
+ }
+
+- switch (dev->card_data.leaf.family) {
++ switch (dev->driver_info->family) {
+ case KVASER_LEAF:
+ if (es->leaf.error_factor) {
+ cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
+@@ -999,7 +1016,7 @@ static void kvaser_usb_leaf_rx_can_msg(const struct kvaser_usb *dev,
+ stats = &priv->netdev->stats;
+
+ if ((cmd->u.rx_can_header.flag & MSG_FLAG_ERROR_FRAME) &&
+- (dev->card_data.leaf.family == KVASER_LEAF &&
++ (dev->driver_info->family == KVASER_LEAF &&
+ cmd->id == CMD_LEAF_LOG_MESSAGE)) {
+ kvaser_usb_leaf_leaf_rx_error(dev, cmd);
+ return;
+@@ -1015,7 +1032,7 @@ static void kvaser_usb_leaf_rx_can_msg(const struct kvaser_usb *dev,
+ return;
+ }
+
+- switch (dev->card_data.leaf.family) {
++ switch (dev->driver_info->family) {
+ case KVASER_LEAF:
+ rx_data = cmd->u.leaf.rx_can.data;
+ break;
+@@ -1030,7 +1047,7 @@ static void kvaser_usb_leaf_rx_can_msg(const struct kvaser_usb *dev,
+ return;
+ }
+
+- if (dev->card_data.leaf.family == KVASER_LEAF && cmd->id ==
++ if (dev->driver_info->family == KVASER_LEAF && cmd->id ==
+ CMD_LEAF_LOG_MESSAGE) {
+ cf->can_id = le32_to_cpu(cmd->u.leaf.log_message.id);
+ if (cf->can_id & KVASER_EXTENDED_FRAME)
+@@ -1128,14 +1145,14 @@ static void kvaser_usb_leaf_handle_command(const struct kvaser_usb *dev,
+ break;
+
+ case CMD_LEAF_LOG_MESSAGE:
+- if (dev->card_data.leaf.family != KVASER_LEAF)
++ if (dev->driver_info->family != KVASER_LEAF)
+ goto warn;
+ kvaser_usb_leaf_rx_can_msg(dev, cmd);
+ break;
+
+ case CMD_CHIP_STATE_EVENT:
+ case CMD_CAN_ERROR_EVENT:
+- if (dev->card_data.leaf.family == KVASER_LEAF)
++ if (dev->driver_info->family == KVASER_LEAF)
+ kvaser_usb_leaf_leaf_rx_error(dev, cmd);
+ else
+ kvaser_usb_leaf_usbcan_rx_error(dev, cmd);
+@@ -1147,12 +1164,12 @@ static void kvaser_usb_leaf_handle_command(const struct kvaser_usb *dev,
+
+ /* Ignored commands */
+ case CMD_USBCAN_CLOCK_OVERFLOW_EVENT:
+- if (dev->card_data.leaf.family != KVASER_USBCAN)
++ if (dev->driver_info->family != KVASER_USBCAN)
+ goto warn;
+ break;
+
+ case CMD_FLUSH_QUEUE_REPLY:
+- if (dev->card_data.leaf.family != KVASER_LEAF)
++ if (dev->driver_info->family != KVASER_LEAF)
+ goto warn;
+ break;
+
+diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
+index 22b328bd7cd51..8c93dd710efc5 100644
+--- a/drivers/net/dsa/qca8k.c
++++ b/drivers/net/dsa/qca8k.c
+@@ -2372,7 +2372,7 @@ static int
+ qca8k_port_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
+ {
+ struct qca8k_priv *priv = ds->priv;
+- int i, mtu = 0;
++ int ret, i, mtu = 0;
+
+ priv->port_mtu[port] = new_mtu;
+
+@@ -2380,8 +2380,27 @@ qca8k_port_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
+ if (priv->port_mtu[i] > mtu)
+ mtu = priv->port_mtu[i];
+
++ /* To change the MAX_FRAME_SIZE the cpu ports must be off or
++ * the switch panics.
++ * Turn off both cpu ports before applying the new value to prevent
++ * this.
++ */
++ if (priv->port_sts[0].enabled)
++ qca8k_port_set_status(priv, 0, 0);
++
++ if (priv->port_sts[6].enabled)
++ qca8k_port_set_status(priv, 6, 0);
++
+ /* Include L2 header / FCS length */
+- return qca8k_write(priv, QCA8K_MAX_FRAME_SIZE, mtu + ETH_HLEN + ETH_FCS_LEN);
++ ret = qca8k_write(priv, QCA8K_MAX_FRAME_SIZE, mtu + ETH_HLEN + ETH_FCS_LEN);
++
++ if (priv->port_sts[0].enabled)
++ qca8k_port_set_status(priv, 0, 1);
++
++ if (priv->port_sts[6].enabled)
++ qca8k_port_set_status(priv, 6, 1);
++
++ return ret;
+ }
+
+ static int
+diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
+index 5c5931dba51d7..c4221f89ab188 100644
+--- a/drivers/net/ethernet/ibm/ibmvnic.c
++++ b/drivers/net/ethernet/ibm/ibmvnic.c
+@@ -5774,6 +5774,15 @@ static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset)
+ release_sub_crqs(adapter, 0);
+ rc = init_sub_crqs(adapter);
+ } else {
++ /* no need to reinitialize completely, but we do
++ * need to clean up transmits that were in flight
++ * when we processed the reset. Failure to do so
++ * will confound the upper layer, usually TCP, by
++ * creating the illusion of transmits that are
++ * awaiting completion.
++ */
++ clean_tx_pools(adapter);
++
+ rc = reset_sub_crq_queues(adapter);
+ }
+ } else {
+diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
+index 55c6bce5da61e..615aff0798d3f 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e.h
++++ b/drivers/net/ethernet/intel/i40e/i40e.h
+@@ -37,6 +37,7 @@
+ #include <net/tc_act/tc_mirred.h>
+ #include <net/udp_tunnel.h>
+ #include <net/xdp_sock.h>
++#include <linux/bitfield.h>
+ #include "i40e_type.h"
+ #include "i40e_prototype.h"
+ #include <linux/net/intel/i40e_client.h>
+@@ -1091,6 +1092,21 @@ static inline void i40e_write_fd_input_set(struct i40e_pf *pf,
+ (u32)(val & 0xFFFFFFFFULL));
+ }
+
++/**
++ * i40e_get_pf_count - get PCI PF count.
++ * @hw: pointer to a hw.
++ *
++ * Reports the function number of the highest PCI physical
++ * function plus 1 as it is loaded from the NVM.
++ *
++ * Return: PCI PF count.
++ **/
++static inline u32 i40e_get_pf_count(struct i40e_hw *hw)
++{
++ return FIELD_GET(I40E_GLGEN_PCIFCNCNT_PCIPFCNT_MASK,
++ rd32(hw, I40E_GLGEN_PCIFCNCNT));
++}
++
+ /* needed by i40e_ethtool.c */
+ int i40e_up(struct i40e_vsi *vsi);
+ void i40e_down(struct i40e_vsi *vsi);
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
+index 46bb1169a004d..77eb9c7262053 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
+@@ -549,6 +549,47 @@ void i40e_pf_reset_stats(struct i40e_pf *pf)
+ pf->hw_csum_rx_error = 0;
+ }
+
++/**
++ * i40e_compute_pci_to_hw_id - compute index form PCI function.
++ * @vsi: ptr to the VSI to read from.
++ * @hw: ptr to the hardware info.
++ **/
++static u32 i40e_compute_pci_to_hw_id(struct i40e_vsi *vsi, struct i40e_hw *hw)
++{
++ int pf_count = i40e_get_pf_count(hw);
++
++ if (vsi->type == I40E_VSI_SRIOV)
++ return (hw->port * BIT(7)) / pf_count + vsi->vf_id;
++
++ return hw->port + BIT(7);
++}
++
++/**
++ * i40e_stat_update64 - read and update a 64 bit stat from the chip.
++ * @hw: ptr to the hardware info.
++ * @hireg: the high 32 bit reg to read.
++ * @loreg: the low 32 bit reg to read.
++ * @offset_loaded: has the initial offset been loaded yet.
++ * @offset: ptr to current offset value.
++ * @stat: ptr to the stat.
++ *
++ * Since the device stats are not reset at PFReset, they will not
++ * be zeroed when the driver starts. We'll save the first values read
++ * and use them as offsets to be subtracted from the raw values in order
++ * to report stats that count from zero.
++ **/
++static void i40e_stat_update64(struct i40e_hw *hw, u32 hireg, u32 loreg,
++ bool offset_loaded, u64 *offset, u64 *stat)
++{
++ u64 new_data;
++
++ new_data = rd64(hw, loreg);
++
++ if (!offset_loaded || new_data < *offset)
++ *offset = new_data;
++ *stat = new_data - *offset;
++}
++
+ /**
+ * i40e_stat_update48 - read and update a 48 bit stat from the chip
+ * @hw: ptr to the hardware info
+@@ -620,6 +661,34 @@ static void i40e_stat_update_and_clear32(struct i40e_hw *hw, u32 reg, u64 *stat)
+ *stat += new_data;
+ }
+
++/**
++ * i40e_stats_update_rx_discards - update rx_discards.
++ * @vsi: ptr to the VSI to be updated.
++ * @hw: ptr to the hardware info.
++ * @stat_idx: VSI's stat_counter_idx.
++ * @offset_loaded: ptr to the VSI's stat_offsets_loaded.
++ * @stat_offset: ptr to stat_offset to store first read of specific register.
++ * @stat: ptr to VSI's stat to be updated.
++ **/
++static void
++i40e_stats_update_rx_discards(struct i40e_vsi *vsi, struct i40e_hw *hw,
++ int stat_idx, bool offset_loaded,
++ struct i40e_eth_stats *stat_offset,
++ struct i40e_eth_stats *stat)
++{
++ u64 rx_rdpc, rx_rxerr;
++
++ i40e_stat_update32(hw, I40E_GLV_RDPC(stat_idx), offset_loaded,
++ &stat_offset->rx_discards, &rx_rdpc);
++ i40e_stat_update64(hw,
++ I40E_GL_RXERR1H(i40e_compute_pci_to_hw_id(vsi, hw)),
++ I40E_GL_RXERR1L(i40e_compute_pci_to_hw_id(vsi, hw)),
++ offset_loaded, &stat_offset->rx_discards_other,
++ &rx_rxerr);
++
++ stat->rx_discards = rx_rdpc + rx_rxerr;
++}
++
+ /**
+ * i40e_update_eth_stats - Update VSI-specific ethernet statistics counters.
+ * @vsi: the VSI to be updated
+@@ -679,6 +748,10 @@ void i40e_update_eth_stats(struct i40e_vsi *vsi)
+ I40E_GLV_BPTCL(stat_idx),
+ vsi->stat_offsets_loaded,
+ &oes->tx_broadcast, &es->tx_broadcast);
++
++ i40e_stats_update_rx_discards(vsi, hw, stat_idx,
++ vsi->stat_offsets_loaded, oes, es);
++
+ vsi->stat_offsets_loaded = true;
+ }
+
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_register.h b/drivers/net/ethernet/intel/i40e/i40e_register.h
+index 1908eed4fa5ee..7339003aa17cd 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_register.h
++++ b/drivers/net/ethernet/intel/i40e/i40e_register.h
+@@ -211,6 +211,11 @@
+ #define I40E_GLGEN_MSRWD_MDIWRDATA_SHIFT 0
+ #define I40E_GLGEN_MSRWD_MDIRDDATA_SHIFT 16
+ #define I40E_GLGEN_MSRWD_MDIRDDATA_MASK I40E_MASK(0xFFFF, I40E_GLGEN_MSRWD_MDIRDDATA_SHIFT)
++#define I40E_GLGEN_PCIFCNCNT 0x001C0AB4 /* Reset: PCIR */
++#define I40E_GLGEN_PCIFCNCNT_PCIPFCNT_SHIFT 0
++#define I40E_GLGEN_PCIFCNCNT_PCIPFCNT_MASK I40E_MASK(0x1F, I40E_GLGEN_PCIFCNCNT_PCIPFCNT_SHIFT)
++#define I40E_GLGEN_PCIFCNCNT_PCIVFCNT_SHIFT 16
++#define I40E_GLGEN_PCIFCNCNT_PCIVFCNT_MASK I40E_MASK(0xFF, I40E_GLGEN_PCIFCNCNT_PCIVFCNT_SHIFT)
+ #define I40E_GLGEN_RSTAT 0x000B8188 /* Reset: POR */
+ #define I40E_GLGEN_RSTAT_DEVSTATE_SHIFT 0
+ #define I40E_GLGEN_RSTAT_DEVSTATE_MASK I40E_MASK(0x3, I40E_GLGEN_RSTAT_DEVSTATE_SHIFT)
+@@ -643,6 +648,14 @@
+ #define I40E_VFQF_HKEY1_MAX_INDEX 12
+ #define I40E_VFQF_HLUT1(_i, _VF) (0x00220000 + ((_i) * 1024 + (_VF) * 4)) /* _i=0...15, _VF=0...127 */ /* Reset: CORER */
+ #define I40E_VFQF_HLUT1_MAX_INDEX 15
++#define I40E_GL_RXERR1H(_i) (0x00318004 + ((_i) * 8)) /* _i=0...143 */ /* Reset: CORER */
++#define I40E_GL_RXERR1H_MAX_INDEX 143
++#define I40E_GL_RXERR1H_RXERR1H_SHIFT 0
++#define I40E_GL_RXERR1H_RXERR1H_MASK I40E_MASK(0xFFFFFFFF, I40E_GL_RXERR1H_RXERR1H_SHIFT)
++#define I40E_GL_RXERR1L(_i) (0x00318000 + ((_i) * 8)) /* _i=0...143 */ /* Reset: CORER */
++#define I40E_GL_RXERR1L_MAX_INDEX 143
++#define I40E_GL_RXERR1L_RXERR1L_SHIFT 0
++#define I40E_GL_RXERR1L_RXERR1L_MASK I40E_MASK(0xFFFFFFFF, I40E_GL_RXERR1L_RXERR1L_SHIFT)
+ #define I40E_GLPRT_BPRCH(_i) (0x003005E4 + ((_i) * 8)) /* _i=0...3 */ /* Reset: CORER */
+ #define I40E_GLPRT_BPRCL(_i) (0x003005E0 + ((_i) * 8)) /* _i=0...3 */ /* Reset: CORER */
+ #define I40E_GLPRT_BPTCH(_i) (0x00300A04 + ((_i) * 8)) /* _i=0...3 */ /* Reset: CORER */
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
+index 36a4ca1ffb1a9..7b3f30beb757a 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
++++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
+@@ -1172,6 +1172,7 @@ struct i40e_eth_stats {
+ u64 tx_broadcast; /* bptc */
+ u64 tx_discards; /* tdpc */
+ u64 tx_errors; /* tepc */
++ u64 rx_discards_other; /* rxerr1 */
+ };
+
+ /* Statistics collected per VEB per TC */
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+index 033ea71763e3d..86b0f21287dc8 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+@@ -2147,6 +2147,10 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
+ /* VFs only use TC 0 */
+ vfres->vsi_res[0].qset_handle
+ = le16_to_cpu(vsi->info.qs_handle[0]);
++ if (!(vf->driver_caps & VIRTCHNL_VF_OFFLOAD_USO) && !vf->pf_set_mac) {
++ i40e_del_mac_filter(vsi, vf->default_lan_addr.addr);
++ eth_zero_addr(vf->default_lan_addr.addr);
++ }
+ ether_addr_copy(vfres->vsi_res[0].default_mac_addr,
+ vf->default_lan_addr.addr);
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+index ec2dfecd7f0f1..c016510474480 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+@@ -4503,13 +4503,6 @@ static int mlx5e_policer_validate(const struct flow_action *action,
+ return -EOPNOTSUPP;
+ }
+
+- if (act->police.notexceed.act_id != FLOW_ACTION_PIPE &&
+- act->police.notexceed.act_id != FLOW_ACTION_ACCEPT) {
+- NL_SET_ERR_MSG_MOD(extack,
+- "Offload not supported when conform action is not pipe or ok");
+- return -EOPNOTSUPP;
+- }
+-
+ if (act->police.notexceed.act_id == FLOW_ACTION_ACCEPT &&
+ !flow_action_is_last_entry(action, act)) {
+ NL_SET_ERR_MSG_MOD(extack,
+@@ -4560,6 +4553,12 @@ static int scan_tc_matchall_fdb_actions(struct mlx5e_priv *priv,
+ flow_action_for_each(i, act, flow_action) {
+ switch (act->id) {
+ case FLOW_ACTION_POLICE:
++ if (act->police.notexceed.act_id != FLOW_ACTION_CONTINUE) {
++ NL_SET_ERR_MSG_MOD(extack,
++ "Offload not supported when conform action is not continue");
++ return -EOPNOTSUPP;
++ }
++
+ err = mlx5e_policer_validate(flow_action, act, extack);
+ if (err)
+ return err;
+diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+index f180a157eea49..05d759ba0c5c8 100644
+--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
++++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+@@ -979,7 +979,7 @@ static int lan966x_probe(struct platform_device *pdev)
+ struct fwnode_handle *ports, *portnp;
+ struct lan966x *lan966x;
+ u8 mac_addr[ETH_ALEN];
+- int err, i;
++ int err;
+
+ lan966x = devm_kzalloc(&pdev->dev, sizeof(*lan966x), GFP_KERNEL);
+ if (!lan966x)
+@@ -1010,11 +1010,7 @@ static int lan966x_probe(struct platform_device *pdev)
+ if (err)
+ return dev_err_probe(&pdev->dev, err, "Reset failed");
+
+- i = 0;
+- fwnode_for_each_available_child_node(ports, portnp)
+- ++i;
+-
+- lan966x->num_phys_ports = i;
++ lan966x->num_phys_ports = NUM_PHYS_PORTS;
+ lan966x->ports = devm_kcalloc(&pdev->dev, lan966x->num_phys_ports,
+ sizeof(struct lan966x_port *),
+ GFP_KERNEL);
+diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
+index ae282da1da748..a240f13e052b8 100644
+--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
++++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h
+@@ -31,6 +31,7 @@
+ /* Reserved amount for (SRC, PRIO) at index 8*SRC + PRIO */
+ #define QSYS_Q_RSRV 95
+
++#define NUM_PHYS_PORTS 8
+ #define CPU_PORT 8
+
+ /* Reserved PGIDs */
+diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
+index 33f5c5698ccbf..642e435c70311 100644
+--- a/drivers/net/ethernet/realtek/r8169_main.c
++++ b/drivers/net/ethernet/realtek/r8169_main.c
+@@ -4190,7 +4190,6 @@ static void rtl8169_tso_csum_v1(struct sk_buff *skb, u32 *opts)
+ static bool rtl8169_tso_csum_v2(struct rtl8169_private *tp,
+ struct sk_buff *skb, u32 *opts)
+ {
+- u32 transport_offset = (u32)skb_transport_offset(skb);
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
+ u32 mss = shinfo->gso_size;
+
+@@ -4207,7 +4206,7 @@ static bool rtl8169_tso_csum_v2(struct rtl8169_private *tp,
+ WARN_ON_ONCE(1);
+ }
+
+- opts[0] |= transport_offset << GTTCPHO_SHIFT;
++ opts[0] |= skb_transport_offset(skb) << GTTCPHO_SHIFT;
+ opts[1] |= mss << TD1_MSS_SHIFT;
+ } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ u8 ip_protocol;
+@@ -4235,7 +4234,7 @@ static bool rtl8169_tso_csum_v2(struct rtl8169_private *tp,
+ else
+ WARN_ON_ONCE(1);
+
+- opts[1] |= transport_offset << TCPHO_SHIFT;
++ opts[1] |= skb_transport_offset(skb) << TCPHO_SHIFT;
+ } else {
+ unsigned int padto = rtl_quirk_packet_padto(tp, skb);
+
+@@ -4402,14 +4401,13 @@ static netdev_features_t rtl8169_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+ {
+- int transport_offset = skb_transport_offset(skb);
+ struct rtl8169_private *tp = netdev_priv(dev);
+
+ if (skb_is_gso(skb)) {
+ if (tp->mac_version == RTL_GIGA_MAC_VER_34)
+ features = rtl8168evl_fix_tso(skb, features);
+
+- if (transport_offset > GTTCPHO_MAX &&
++ if (skb_transport_offset(skb) > GTTCPHO_MAX &&
+ rtl_chip_supports_csum_v2(tp))
+ features &= ~NETIF_F_ALL_TSO;
+ } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
+@@ -4420,7 +4418,7 @@ static netdev_features_t rtl8169_features_check(struct sk_buff *skb,
+ if (rtl_quirk_packet_padto(tp, skb))
+ features &= ~NETIF_F_CSUM_MASK;
+
+- if (transport_offset > TCPHO_MAX &&
++ if (skb_transport_offset(skb) > TCPHO_MAX &&
+ rtl_chip_supports_csum_v2(tp))
+ features &= ~NETIF_F_CSUM_MASK;
+ }
+diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
+index 2ea81931543c1..9b2bd09628a3b 100644
+--- a/drivers/net/usb/usbnet.c
++++ b/drivers/net/usb/usbnet.c
+@@ -2137,7 +2137,7 @@ static void usbnet_async_cmd_cb(struct urb *urb)
+ int usbnet_write_cmd_async(struct usbnet *dev, u8 cmd, u8 reqtype,
+ u16 value, u16 index, const void *data, u16 size)
+ {
+- struct usb_ctrlrequest *req = NULL;
++ struct usb_ctrlrequest *req;
+ struct urb *urb;
+ int err = -ENOMEM;
+ void *buf = NULL;
+@@ -2155,7 +2155,7 @@ int usbnet_write_cmd_async(struct usbnet *dev, u8 cmd, u8 reqtype,
+ if (!buf) {
+ netdev_err(dev->net, "Error allocating buffer"
+ " in %s!\n", __func__);
+- goto fail_free;
++ goto fail_free_urb;
+ }
+ }
+
+@@ -2179,14 +2179,21 @@ int usbnet_write_cmd_async(struct usbnet *dev, u8 cmd, u8 reqtype,
+ if (err < 0) {
+ netdev_err(dev->net, "Error submitting the control"
+ " message: status=%d\n", err);
+- goto fail_free;
++ goto fail_free_all;
+ }
+ return 0;
+
++fail_free_all:
++ kfree(req);
+ fail_free_buf:
+ kfree(buf);
+-fail_free:
+- kfree(req);
++ /*
++ * avoid a double free
++ * needed because the flag can be set only
++ * after filling the URB
++ */
++ urb->transfer_flags = 0;
++fail_free_urb:
+ usb_free_urb(urb);
+ fail:
+ return err;
+diff --git a/drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c b/drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c
+index 4ada80317a3bd..b5c1a8f363f32 100644
+--- a/drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c
++++ b/drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c
+@@ -158,26 +158,26 @@ static const struct sunxi_desc_pin sun8i_a83t_pins[] = {
+ SUNXI_PIN(SUNXI_PINCTRL_PIN(C, 14),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+- SUNXI_FUNCTION(0x2, "nand"), /* DQ6 */
++ SUNXI_FUNCTION(0x2, "nand0"), /* DQ6 */
+ SUNXI_FUNCTION(0x3, "mmc2")), /* D6 */
+ SUNXI_PIN(SUNXI_PINCTRL_PIN(C, 15),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+- SUNXI_FUNCTION(0x2, "nand"), /* DQ7 */
++ SUNXI_FUNCTION(0x2, "nand0"), /* DQ7 */
+ SUNXI_FUNCTION(0x3, "mmc2")), /* D7 */
+ SUNXI_PIN(SUNXI_PINCTRL_PIN(C, 16),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+- SUNXI_FUNCTION(0x2, "nand"), /* DQS */
++ SUNXI_FUNCTION(0x2, "nand0"), /* DQS */
+ SUNXI_FUNCTION(0x3, "mmc2")), /* RST */
+ SUNXI_PIN(SUNXI_PINCTRL_PIN(C, 17),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+- SUNXI_FUNCTION(0x2, "nand")), /* CE2 */
++ SUNXI_FUNCTION(0x2, "nand0")), /* CE2 */
+ SUNXI_PIN(SUNXI_PINCTRL_PIN(C, 18),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+ SUNXI_FUNCTION(0x1, "gpio_out"),
+- SUNXI_FUNCTION(0x2, "nand")), /* CE3 */
++ SUNXI_FUNCTION(0x2, "nand0")), /* CE3 */
+ /* Hole */
+ SUNXI_PIN(SUNXI_PINCTRL_PIN(D, 2),
+ SUNXI_FUNCTION(0x0, "gpio_in"),
+diff --git a/drivers/pinctrl/sunxi/pinctrl-sunxi.c b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
+index d9327d7d56eea..dd928402af997 100644
+--- a/drivers/pinctrl/sunxi/pinctrl-sunxi.c
++++ b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
+@@ -544,6 +544,8 @@ static int sunxi_pconf_set(struct pinctrl_dev *pctldev, unsigned pin,
+ struct sunxi_pinctrl *pctl = pinctrl_dev_get_drvdata(pctldev);
+ int i;
+
++ pin -= pctl->desc->pin_base;
++
+ for (i = 0; i < num_configs; i++) {
+ enum pin_config_param param;
+ unsigned long flags;
+diff --git a/drivers/soc/atmel/soc.c b/drivers/soc/atmel/soc.c
+index b2d365ae02823..dae8a2e0f7455 100644
+--- a/drivers/soc/atmel/soc.c
++++ b/drivers/soc/atmel/soc.c
+@@ -91,14 +91,14 @@ static const struct at91_soc socs[] __initconst = {
+ AT91_SOC(SAM9X60_CIDR_MATCH, AT91_CIDR_MATCH_MASK,
+ AT91_CIDR_VERSION_MASK, SAM9X60_EXID_MATCH,
+ "sam9x60", "sam9x60"),
+- AT91_SOC(SAM9X60_CIDR_MATCH, SAM9X60_D5M_EXID_MATCH,
+- AT91_CIDR_VERSION_MASK, SAM9X60_EXID_MATCH,
++ AT91_SOC(SAM9X60_CIDR_MATCH, AT91_CIDR_MATCH_MASK,
++ AT91_CIDR_VERSION_MASK, SAM9X60_D5M_EXID_MATCH,
+ "sam9x60 64MiB DDR2 SiP", "sam9x60"),
+- AT91_SOC(SAM9X60_CIDR_MATCH, SAM9X60_D1G_EXID_MATCH,
+- AT91_CIDR_VERSION_MASK, SAM9X60_EXID_MATCH,
++ AT91_SOC(SAM9X60_CIDR_MATCH, AT91_CIDR_MATCH_MASK,
++ AT91_CIDR_VERSION_MASK, SAM9X60_D1G_EXID_MATCH,
+ "sam9x60 128MiB DDR2 SiP", "sam9x60"),
+- AT91_SOC(SAM9X60_CIDR_MATCH, SAM9X60_D6K_EXID_MATCH,
+- AT91_CIDR_VERSION_MASK, SAM9X60_EXID_MATCH,
++ AT91_SOC(SAM9X60_CIDR_MATCH, AT91_CIDR_MATCH_MASK,
++ AT91_CIDR_VERSION_MASK, SAM9X60_D6K_EXID_MATCH,
+ "sam9x60 8MiB SDRAM SiP", "sam9x60"),
+ #endif
+ #ifdef CONFIG_SOC_SAMA5
+diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
+index 9a8ae6fa6ecbb..526a7d2de4912 100644
+--- a/drivers/video/fbdev/core/fbcon.c
++++ b/drivers/video/fbdev/core/fbcon.c
+@@ -2480,6 +2480,11 @@ static int fbcon_set_font(struct vc_data *vc, struct console_font *font,
+ if (charcount != 256 && charcount != 512)
+ return -EINVAL;
+
++ /* font bigger than screen resolution ? */
++ if (w > FBCON_SWAP(info->var.rotate, info->var.xres, info->var.yres) ||
++ h > FBCON_SWAP(info->var.rotate, info->var.yres, info->var.xres))
++ return -EINVAL;
++
+ /* Make sure drawing engine can handle the font */
+ if (!(info->pixmap.blit_x & (1 << (font->width - 1))) ||
+ !(info->pixmap.blit_y & (1 << (font->height - 1))))
+@@ -2742,6 +2747,34 @@ void fbcon_update_vcs(struct fb_info *info, bool all)
+ }
+ EXPORT_SYMBOL(fbcon_update_vcs);
+
++/* let fbcon check if it supports a new screen resolution */
++int fbcon_modechange_possible(struct fb_info *info, struct fb_var_screeninfo *var)
++{
++ struct fbcon_ops *ops = info->fbcon_par;
++ struct vc_data *vc;
++ unsigned int i;
++
++ WARN_CONSOLE_UNLOCKED();
++
++ if (!ops)
++ return 0;
++
++ /* prevent setting a screen size which is smaller than font size */
++ for (i = first_fb_vc; i <= last_fb_vc; i++) {
++ vc = vc_cons[i].d;
++ if (!vc || vc->vc_mode != KD_TEXT ||
++ registered_fb[con2fb_map[i]] != info)
++ continue;
++
++ if (vc->vc_font.width > FBCON_SWAP(var->rotate, var->xres, var->yres) ||
++ vc->vc_font.height > FBCON_SWAP(var->rotate, var->yres, var->xres))
++ return -EINVAL;
++ }
++
++ return 0;
++}
++EXPORT_SYMBOL_GPL(fbcon_modechange_possible);
++
+ int fbcon_mode_deleted(struct fb_info *info,
+ struct fb_videomode *mode)
+ {
+diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
+index a6bb0e4382167..85de02d0d3aaa 100644
+--- a/drivers/video/fbdev/core/fbmem.c
++++ b/drivers/video/fbdev/core/fbmem.c
+@@ -510,7 +510,7 @@ static int fb_show_logo_line(struct fb_info *info, int rotate,
+
+ while (n && (n * (logo->width + 8) - 8 > xres))
+ --n;
+- image.dx = (xres - n * (logo->width + 8) - 8) / 2;
++ image.dx = (xres - (n * (logo->width + 8) - 8)) / 2;
+ image.dy = y ?: (yres - logo->height) / 2;
+ } else {
+ image.dx = 0;
+@@ -1016,6 +1016,16 @@ fb_set_var(struct fb_info *info, struct fb_var_screeninfo *var)
+ if (ret)
+ return ret;
+
++ /* verify that virtual resolution >= physical resolution */
++ if (var->xres_virtual < var->xres ||
++ var->yres_virtual < var->yres) {
++ pr_warn("WARNING: fbcon: Driver '%s' missed to adjust virtual screen size (%ux%u vs. %ux%u)\n",
++ info->fix.id,
++ var->xres_virtual, var->yres_virtual,
++ var->xres, var->yres);
++ return -EINVAL;
++ }
++
+ if ((var->activate & FB_ACTIVATE_MASK) != FB_ACTIVATE_NOW)
+ return 0;
+
+@@ -1106,7 +1116,9 @@ static long do_fb_ioctl(struct fb_info *info, unsigned int cmd,
+ return -EFAULT;
+ console_lock();
+ lock_fb_info(info);
+- ret = fb_set_var(info, &var);
++ ret = fbcon_modechange_possible(info, &var);
++ if (!ret)
++ ret = fb_set_var(info, &var);
+ if (!ret)
+ fbcon_update_vcs(info, var.activate & FB_ACTIVATE_ALL);
+ unlock_fb_info(info);
+diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
+index 9d3cf01117093..7c3726aa9e149 100644
+--- a/fs/fscache/cookie.c
++++ b/fs/fscache/cookie.c
+@@ -517,7 +517,14 @@ static void fscache_perform_lookup(struct fscache_cookie *cookie)
+ }
+
+ fscache_see_cookie(cookie, fscache_cookie_see_active);
+- fscache_set_cookie_state(cookie, FSCACHE_COOKIE_STATE_ACTIVE);
++ spin_lock(&cookie->lock);
++ if (test_and_clear_bit(FSCACHE_COOKIE_DO_INVALIDATE, &cookie->flags))
++ __fscache_set_cookie_state(cookie,
++ FSCACHE_COOKIE_STATE_INVALIDATING);
++ else
++ __fscache_set_cookie_state(cookie, FSCACHE_COOKIE_STATE_ACTIVE);
++ spin_unlock(&cookie->lock);
++ wake_up_cookie_state(cookie);
+ trace = fscache_access_lookup_cookie_end;
+
+ out:
+@@ -752,6 +759,9 @@ again_locked:
+ spin_lock(&cookie->lock);
+ }
+
++ if (test_and_clear_bit(FSCACHE_COOKIE_DO_INVALIDATE, &cookie->flags))
++ fscache_end_cookie_access(cookie, fscache_access_invalidate_cookie_end);
++
+ switch (state) {
+ case FSCACHE_COOKIE_STATE_RELINQUISHING:
+ fscache_see_cookie(cookie, fscache_cookie_see_relinquish);
+@@ -1048,6 +1058,9 @@ void __fscache_invalidate(struct fscache_cookie *cookie,
+ return;
+
+ case FSCACHE_COOKIE_STATE_LOOKING_UP:
++ __fscache_begin_cookie_access(cookie, fscache_access_invalidate_cookie);
++ set_bit(FSCACHE_COOKIE_DO_INVALIDATE, &cookie->flags);
++ fallthrough;
+ case FSCACHE_COOKIE_STATE_CREATING:
+ spin_unlock(&cookie->lock);
+ _leave(" [look %x]", cookie->inval_counter);
+diff --git a/fs/fscache/volume.c b/fs/fscache/volume.c
+index f2aa7dbad7667..a058e0136bfeb 100644
+--- a/fs/fscache/volume.c
++++ b/fs/fscache/volume.c
+@@ -143,7 +143,7 @@ static void fscache_wait_on_volume_collision(struct fscache_volume *candidate,
+ {
+ wait_var_event_timeout(&candidate->flags,
+ !fscache_is_acquire_pending(candidate), 20 * HZ);
+- if (!fscache_is_acquire_pending(candidate)) {
++ if (fscache_is_acquire_pending(candidate)) {
+ pr_notice("Potential volume collision new=%08x old=%08x",
+ candidate->debug_id, collidee_debug_id);
+ fscache_stat(&fscache_n_volumes_collision);
+@@ -182,7 +182,7 @@ static bool fscache_hash_volume(struct fscache_volume *candidate)
+ hlist_bl_add_head(&candidate->hash_link, h);
+ hlist_bl_unlock(h);
+
+- if (test_bit(FSCACHE_VOLUME_ACQUIRE_PENDING, &candidate->flags))
++ if (fscache_is_acquire_pending(candidate))
+ fscache_wait_on_volume_collision(candidate, collidee_debug_id);
+ return true;
+
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+index 7e8c715052c09..3d97372e811eb 100644
+--- a/fs/io_uring.c
++++ b/fs/io_uring.c
+@@ -3495,6 +3495,13 @@ static ssize_t io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
+ return __io_iov_buffer_select(req, iov, issue_flags);
+ }
+
++static inline bool io_do_buffer_select(struct io_kiocb *req)
++{
++ if (!(req->flags & REQ_F_BUFFER_SELECT))
++ return false;
++ return !(req->flags & REQ_F_BUFFER_SELECTED);
++}
++
+ static struct iovec *__io_import_iovec(int rw, struct io_kiocb *req,
+ struct io_rw_state *s,
+ unsigned int issue_flags)
+@@ -3854,18 +3861,19 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
+ if (unlikely(ret < 0))
+ return ret;
+ } else {
++ rw = req->async_data;
++ s = &rw->s;
++
+ /*
+ * Safe and required to re-import if we're using provided
+ * buffers, as we dropped the selected one before retry.
+ */
+- if (req->flags & REQ_F_BUFFER_SELECT) {
++ if (io_do_buffer_select(req)) {
+ ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
+ if (unlikely(ret < 0))
+ return ret;
+ }
+
+- rw = req->async_data;
+- s = &rw->s;
+ /*
+ * We come here from an earlier attempt, restore our state to
+ * match in case it doesn't. It's cheap enough that we don't
+diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
+index 92b7ea8d8f5e1..181907349b49b 100644
+--- a/include/acpi/cppc_acpi.h
++++ b/include/acpi/cppc_acpi.h
+@@ -144,6 +144,7 @@ extern bool acpi_cpc_valid(void);
+ extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
+ extern unsigned int cppc_get_transition_latency(int cpu);
+ extern bool cpc_ffh_supported(void);
++extern bool cpc_supported_by_cpu(void);
+ extern int cpc_read_ffh(int cpunum, struct cpc_reg *reg, u64 *val);
+ extern int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val);
+ #else /* !CONFIG_ACPI_CPPC_LIB */
+diff --git a/include/linux/acpi.h b/include/linux/acpi.h
+index d7136d13aa442..cf1f770208da3 100644
+--- a/include/linux/acpi.h
++++ b/include/linux/acpi.h
+@@ -574,13 +574,15 @@ acpi_status acpi_run_osc(acpi_handle handle, struct acpi_osc_context *context);
+ #define OSC_SB_OSLPI_SUPPORT 0x00000100
+ #define OSC_SB_CPC_DIVERSE_HIGH_SUPPORT 0x00001000
+ #define OSC_SB_GENERIC_INITIATOR_SUPPORT 0x00002000
++#define OSC_SB_CPC_FLEXIBLE_ADR_SPACE 0x00004000
+ #define OSC_SB_NATIVE_USB4_SUPPORT 0x00040000
+ #define OSC_SB_PRM_SUPPORT 0x00200000
+
+ extern bool osc_sb_apei_support_acked;
+ extern bool osc_pc_lpi_support_confirmed;
+ extern bool osc_sb_native_usb4_support_confirmed;
+-extern bool osc_sb_cppc_not_supported;
++extern bool osc_sb_cppc2_support_acked;
++extern bool osc_cpc_flexible_adr_space_confirmed;
+
+ /* USB4 Capabilities */
+ #define OSC_USB_USB3_TUNNELING 0x00000001
+diff --git a/include/linux/fbcon.h b/include/linux/fbcon.h
+index ff5596dd30f85..2382dec6d6ab8 100644
+--- a/include/linux/fbcon.h
++++ b/include/linux/fbcon.h
+@@ -15,6 +15,8 @@ void fbcon_new_modelist(struct fb_info *info);
+ void fbcon_get_requirement(struct fb_info *info,
+ struct fb_blit_caps *caps);
+ void fbcon_fb_blanked(struct fb_info *info, int blank);
++int fbcon_modechange_possible(struct fb_info *info,
++ struct fb_var_screeninfo *var);
+ void fbcon_update_vcs(struct fb_info *info, bool all);
+ void fbcon_remap_all(struct fb_info *info);
+ int fbcon_set_con2fb_map_ioctl(void __user *argp);
+@@ -33,6 +35,8 @@ static inline void fbcon_new_modelist(struct fb_info *info) {}
+ static inline void fbcon_get_requirement(struct fb_info *info,
+ struct fb_blit_caps *caps) {}
+ static inline void fbcon_fb_blanked(struct fb_info *info, int blank) {}
++static inline int fbcon_modechange_possible(struct fb_info *info,
++ struct fb_var_screeninfo *var) { return 0; }
+ static inline void fbcon_update_vcs(struct fb_info *info, bool all) {}
+ static inline void fbcon_remap_all(struct fb_info *info) {}
+ static inline int fbcon_set_con2fb_map_ioctl(void __user *argp) { return 0; }
+diff --git a/include/linux/fscache.h b/include/linux/fscache.h
+index e25539072463b..a25804f141d31 100644
+--- a/include/linux/fscache.h
++++ b/include/linux/fscache.h
+@@ -129,6 +129,7 @@ struct fscache_cookie {
+ #define FSCACHE_COOKIE_DO_PREP_TO_WRITE 12 /* T if cookie needs write preparation */
+ #define FSCACHE_COOKIE_HAVE_DATA 13 /* T if this cookie has data stored */
+ #define FSCACHE_COOKIE_IS_HASHED 14 /* T if this cookie is hashed */
++#define FSCACHE_COOKIE_DO_INVALIDATE 15 /* T if cookie needs invalidation */
+
+ enum fscache_cookie_state state;
+ u8 advice; /* FSCACHE_ADV_* */
+diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
+index 2f9891cb3d001..82543019ff012 100644
+--- a/include/linux/intel-iommu.h
++++ b/include/linux/intel-iommu.h
+@@ -611,7 +611,6 @@ struct intel_iommu {
+ struct device_domain_info {
+ struct list_head link; /* link to domain siblings */
+ struct list_head global; /* link to global list */
+- struct list_head table; /* link to pasid table */
+ u32 segment; /* PCI segment number */
+ u8 bus; /* PCI bus number */
+ u8 devfn; /* PCI devfn number */
+@@ -728,8 +727,6 @@ extern int dmar_ir_support(void);
+ void *alloc_pgtable_page(int node);
+ void free_pgtable_page(void *vaddr);
+ struct intel_iommu *domain_get_iommu(struct dmar_domain *domain);
+-int for_each_device_domain(int (*fn)(struct device_domain_info *info,
+- void *data), void *data);
+ void iommu_flush_write_buffer(struct intel_iommu *iommu);
+ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev);
+ struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn);
+diff --git a/include/linux/memregion.h b/include/linux/memregion.h
+index e11595256cac0..c04c4fd2e2091 100644
+--- a/include/linux/memregion.h
++++ b/include/linux/memregion.h
+@@ -16,7 +16,7 @@ static inline int memregion_alloc(gfp_t gfp)
+ {
+ return -ENOMEM;
+ }
+-void memregion_free(int id)
++static inline void memregion_free(int id)
+ {
+ }
+ #endif
+diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
+index 2bff6a10095d1..73634b6653a1b 100644
+--- a/include/linux/pm_runtime.h
++++ b/include/linux/pm_runtime.h
+@@ -82,7 +82,7 @@ extern void pm_runtime_get_suppliers(struct device *dev);
+ extern void pm_runtime_put_suppliers(struct device *dev);
+ extern void pm_runtime_new_link(struct device *dev);
+ extern void pm_runtime_drop_link(struct device_link *link);
+-extern void pm_runtime_release_supplier(struct device_link *link, bool check_idle);
++extern void pm_runtime_release_supplier(struct device_link *link);
+
+ extern int devm_pm_runtime_enable(struct device *dev);
+
+@@ -308,8 +308,7 @@ static inline void pm_runtime_get_suppliers(struct device *dev) {}
+ static inline void pm_runtime_put_suppliers(struct device *dev) {}
+ static inline void pm_runtime_new_link(struct device *dev) {}
+ static inline void pm_runtime_drop_link(struct device_link *link) {}
+-static inline void pm_runtime_release_supplier(struct device_link *link,
+- bool check_idle) {}
++static inline void pm_runtime_release_supplier(struct device_link *link) {}
+
+ #endif /* !CONFIG_PM */
+
+diff --git a/include/linux/rtsx_usb.h b/include/linux/rtsx_usb.h
+index 159729cffd8e1..3247ed8e9ff0f 100644
+--- a/include/linux/rtsx_usb.h
++++ b/include/linux/rtsx_usb.h
+@@ -54,8 +54,6 @@ struct rtsx_ucr {
+ struct usb_device *pusb_dev;
+ struct usb_interface *pusb_intf;
+ struct usb_sg_request current_sg;
+- unsigned char *iobuf;
+- dma_addr_t iobuf_dma;
+
+ struct timer_list sg_timer;
+ struct mutex dev_mutex;
+diff --git a/include/net/act_api.h b/include/net/act_api.h
+index 3049cb69c0252..9cf6870b526eb 100644
+--- a/include/net/act_api.h
++++ b/include/net/act_api.h
+@@ -134,7 +134,8 @@ struct tc_action_ops {
+ (*get_psample_group)(const struct tc_action *a,
+ tc_action_priv_destructor *destructor);
+ int (*offload_act_setup)(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind);
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack);
+ };
+
+ struct tc_action_net {
+diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
+index 6484095a8c011..7ac313858037a 100644
+--- a/include/net/flow_offload.h
++++ b/include/net/flow_offload.h
+@@ -152,6 +152,7 @@ enum flow_action_id {
+ FLOW_ACTION_PIPE,
+ FLOW_ACTION_VLAN_PUSH_ETH,
+ FLOW_ACTION_VLAN_POP_ETH,
++ FLOW_ACTION_CONTINUE,
+ NUM_FLOW_ACTIONS,
+ };
+
+diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
+index a3b57a93228a7..8cf001aed858d 100644
+--- a/include/net/pkt_cls.h
++++ b/include/net/pkt_cls.h
+@@ -547,10 +547,12 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
+ }
+
+ int tc_setup_offload_action(struct flow_action *flow_action,
+- const struct tcf_exts *exts);
++ const struct tcf_exts *exts,
++ struct netlink_ext_ack *extack);
+ void tc_cleanup_offload_action(struct flow_action *flow_action);
+ int tc_setup_action(struct flow_action *flow_action,
+- struct tc_action *actions[]);
++ struct tc_action *actions[],
++ struct netlink_ext_ack *extack);
+
+ int tc_setup_cb_call(struct tcf_block *block, enum tc_setup_type type,
+ void *type_data, bool err_stop, bool rtnl_held);
+diff --git a/include/video/of_display_timing.h b/include/video/of_display_timing.h
+index e1126a74882a5..eff166fdd81b9 100644
+--- a/include/video/of_display_timing.h
++++ b/include/video/of_display_timing.h
+@@ -8,6 +8,8 @@
+ #ifndef __LINUX_OF_DISPLAY_TIMING_H
+ #define __LINUX_OF_DISPLAY_TIMING_H
+
++#include <linux/errno.h>
++
+ struct device_node;
+ struct display_timing;
+ struct display_timings;
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index 9c1a02b82ecd0..d04147a5efa55 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -1417,6 +1417,21 @@ static void __reg_bound_offset(struct bpf_reg_state *reg)
+ reg->var_off = tnum_or(tnum_clear_subreg(var64_off), var32_off);
+ }
+
++static void reg_bounds_sync(struct bpf_reg_state *reg)
++{
++ /* We might have learned new bounds from the var_off. */
++ __update_reg_bounds(reg);
++ /* We might have learned something about the sign bit. */
++ __reg_deduce_bounds(reg);
++ /* We might have learned some bits from the bounds. */
++ __reg_bound_offset(reg);
++ /* Intersecting with the old var_off might have improved our bounds
++ * slightly, e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
++ * then new var_off is (0; 0x7f...fc) which improves our umax.
++ */
++ __update_reg_bounds(reg);
++}
++
+ static bool __reg32_bound_s64(s32 a)
+ {
+ return a >= 0 && a <= S32_MAX;
+@@ -1458,16 +1473,8 @@ static void __reg_combine_32_into_64(struct bpf_reg_state *reg)
+ * so they do not impact tnum bounds calculation.
+ */
+ __mark_reg64_unbounded(reg);
+- __update_reg_bounds(reg);
+ }
+-
+- /* Intersecting with the old var_off might have improved our bounds
+- * slightly. e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
+- * then new var_off is (0; 0x7f...fc) which improves our umax.
+- */
+- __reg_deduce_bounds(reg);
+- __reg_bound_offset(reg);
+- __update_reg_bounds(reg);
++ reg_bounds_sync(reg);
+ }
+
+ static bool __reg64_bound_s32(s64 a)
+@@ -1483,7 +1490,6 @@ static bool __reg64_bound_u32(u64 a)
+ static void __reg_combine_64_into_32(struct bpf_reg_state *reg)
+ {
+ __mark_reg32_unbounded(reg);
+-
+ if (__reg64_bound_s32(reg->smin_value) && __reg64_bound_s32(reg->smax_value)) {
+ reg->s32_min_value = (s32)reg->smin_value;
+ reg->s32_max_value = (s32)reg->smax_value;
+@@ -1492,14 +1498,7 @@ static void __reg_combine_64_into_32(struct bpf_reg_state *reg)
+ reg->u32_min_value = (u32)reg->umin_value;
+ reg->u32_max_value = (u32)reg->umax_value;
+ }
+-
+- /* Intersecting with the old var_off might have improved our bounds
+- * slightly. e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
+- * then new var_off is (0; 0x7f...fc) which improves our umax.
+- */
+- __reg_deduce_bounds(reg);
+- __reg_bound_offset(reg);
+- __update_reg_bounds(reg);
++ reg_bounds_sync(reg);
+ }
+
+ /* Mark a register as having a completely unknown (scalar) value. */
+@@ -6485,9 +6484,7 @@ static void do_refine_retval_range(struct bpf_reg_state *regs, int ret_type,
+ ret_reg->s32_max_value = meta->msize_max_value;
+ ret_reg->smin_value = -MAX_ERRNO;
+ ret_reg->s32_min_value = -MAX_ERRNO;
+- __reg_deduce_bounds(ret_reg);
+- __reg_bound_offset(ret_reg);
+- __update_reg_bounds(ret_reg);
++ reg_bounds_sync(ret_reg);
+ }
+
+ static int
+@@ -7693,11 +7690,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
+
+ if (!check_reg_sane_offset(env, dst_reg, ptr_reg->type))
+ return -EINVAL;
+-
+- __update_reg_bounds(dst_reg);
+- __reg_deduce_bounds(dst_reg);
+- __reg_bound_offset(dst_reg);
+-
++ reg_bounds_sync(dst_reg);
+ if (sanitize_check_bounds(env, insn, dst_reg) < 0)
+ return -EACCES;
+ if (sanitize_needed(opcode)) {
+@@ -8435,10 +8428,7 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
+ /* ALU32 ops are zero extended into 64bit register */
+ if (alu32)
+ zext_32_to_64(dst_reg);
+-
+- __update_reg_bounds(dst_reg);
+- __reg_deduce_bounds(dst_reg);
+- __reg_bound_offset(dst_reg);
++ reg_bounds_sync(dst_reg);
+ return 0;
+ }
+
+@@ -8627,10 +8617,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
+ insn->dst_reg);
+ }
+ zext_32_to_64(dst_reg);
+-
+- __update_reg_bounds(dst_reg);
+- __reg_deduce_bounds(dst_reg);
+- __reg_bound_offset(dst_reg);
++ reg_bounds_sync(dst_reg);
+ }
+ } else {
+ /* case: R = imm
+@@ -9068,26 +9055,33 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
+ return;
+
+ switch (opcode) {
++ /* JEQ/JNE comparison doesn't change the register equivalence.
++ *
++ * r1 = r2;
++ * if (r1 == 42) goto label;
++ * ...
++ * label: // here both r1 and r2 are known to be 42.
++ *
++ * Hence when marking register as known preserve it's ID.
++ */
+ case BPF_JEQ:
++ if (is_jmp32) {
++ __mark_reg32_known(true_reg, val32);
++ true_32off = tnum_subreg(true_reg->var_off);
++ } else {
++ ___mark_reg_known(true_reg, val);
++ true_64off = true_reg->var_off;
++ }
++ break;
+ case BPF_JNE:
+- {
+- struct bpf_reg_state *reg =
+- opcode == BPF_JEQ ? true_reg : false_reg;
+-
+- /* JEQ/JNE comparison doesn't change the register equivalence.
+- * r1 = r2;
+- * if (r1 == 42) goto label;
+- * ...
+- * label: // here both r1 and r2 are known to be 42.
+- *
+- * Hence when marking register as known preserve it's ID.
+- */
+- if (is_jmp32)
+- __mark_reg32_known(reg, val32);
+- else
+- ___mark_reg_known(reg, val);
++ if (is_jmp32) {
++ __mark_reg32_known(false_reg, val32);
++ false_32off = tnum_subreg(false_reg->var_off);
++ } else {
++ ___mark_reg_known(false_reg, val);
++ false_64off = false_reg->var_off;
++ }
+ break;
+- }
+ case BPF_JSET:
+ if (is_jmp32) {
+ false_32off = tnum_and(false_32off, tnum_const(~val32));
+@@ -9226,21 +9220,8 @@ static void __reg_combine_min_max(struct bpf_reg_state *src_reg,
+ dst_reg->smax_value);
+ src_reg->var_off = dst_reg->var_off = tnum_intersect(src_reg->var_off,
+ dst_reg->var_off);
+- /* We might have learned new bounds from the var_off. */
+- __update_reg_bounds(src_reg);
+- __update_reg_bounds(dst_reg);
+- /* We might have learned something about the sign bit. */
+- __reg_deduce_bounds(src_reg);
+- __reg_deduce_bounds(dst_reg);
+- /* We might have learned some bits from the bounds. */
+- __reg_bound_offset(src_reg);
+- __reg_bound_offset(dst_reg);
+- /* Intersecting with the old var_off might have improved our bounds
+- * slightly. e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc),
+- * then new var_off is (0; 0x7f...fc) which improves our umax.
+- */
+- __update_reg_bounds(src_reg);
+- __update_reg_bounds(dst_reg);
++ reg_bounds_sync(src_reg);
++ reg_bounds_sync(dst_reg);
+ }
+
+ static void reg_combine_min_max(struct bpf_reg_state *true_src,
+diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
+index 6833d88871816..d30e4db04506a 100644
+--- a/kernel/rcu/srcutree.c
++++ b/kernel/rcu/srcutree.c
+@@ -382,9 +382,11 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
+ return; /* Forgot srcu_barrier(), so just leak it! */
+ }
+ if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
++ WARN_ON(rcu_seq_current(&ssp->srcu_gp_seq) != ssp->srcu_gp_seq_needed) ||
+ WARN_ON(srcu_readers_active(ssp))) {
+- pr_info("%s: Active srcu_struct %p state: %d\n",
+- __func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)));
++ pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
++ __func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)),
++ rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
+ return; /* Caller forgot to stop doing call_srcu()? */
+ }
+ free_percpu(ssp->sda);
+diff --git a/lib/idr.c b/lib/idr.c
+index f4ab4f4aa3c7f..7ecdfdb5309e7 100644
+--- a/lib/idr.c
++++ b/lib/idr.c
+@@ -491,7 +491,8 @@ void ida_free(struct ida *ida, unsigned int id)
+ struct ida_bitmap *bitmap;
+ unsigned long flags;
+
+- BUG_ON((int)id < 0);
++ if ((int)id < 0)
++ return;
+
+ xas_lock_irqsave(&xas, flags);
+ bitmap = xas_load(&xas);
+diff --git a/net/can/bcm.c b/net/can/bcm.c
+index 64c07e650bb41..c3ad310c2109d 100644
+--- a/net/can/bcm.c
++++ b/net/can/bcm.c
+@@ -100,6 +100,7 @@ static inline u64 get_u64(const struct canfd_frame *cp, int offset)
+
+ struct bcm_op {
+ struct list_head list;
++ struct rcu_head rcu;
+ int ifindex;
+ canid_t can_id;
+ u32 flags;
+@@ -718,10 +719,9 @@ static struct bcm_op *bcm_find_op(struct list_head *ops,
+ return NULL;
+ }
+
+-static void bcm_remove_op(struct bcm_op *op)
++static void bcm_free_op_rcu(struct rcu_head *rcu_head)
+ {
+- hrtimer_cancel(&op->timer);
+- hrtimer_cancel(&op->thrtimer);
++ struct bcm_op *op = container_of(rcu_head, struct bcm_op, rcu);
+
+ if ((op->frames) && (op->frames != &op->sframe))
+ kfree(op->frames);
+@@ -732,6 +732,14 @@ static void bcm_remove_op(struct bcm_op *op)
+ kfree(op);
+ }
+
++static void bcm_remove_op(struct bcm_op *op)
++{
++ hrtimer_cancel(&op->timer);
++ hrtimer_cancel(&op->thrtimer);
++
++ call_rcu(&op->rcu, bcm_free_op_rcu);
++}
++
+ static void bcm_rx_unreg(struct net_device *dev, struct bcm_op *op)
+ {
+ if (op->rx_reg_dev == dev) {
+@@ -757,6 +765,9 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh,
+ if ((op->can_id == mh->can_id) && (op->ifindex == ifindex) &&
+ (op->flags & CAN_FD_FRAME) == (mh->flags & CAN_FD_FRAME)) {
+
++ /* disable automatic timer on frame reception */
++ op->flags |= RX_NO_AUTOTIMER;
++
+ /*
+ * Don't care if we're bound or not (due to netdev
+ * problems) can_rx_unregister() is always a save
+@@ -785,7 +796,6 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh,
+ bcm_rx_handler, op);
+
+ list_del(&op->list);
+- synchronize_rcu();
+ bcm_remove_op(op);
+ return 1; /* done */
+ }
+diff --git a/net/mptcp/options.c b/net/mptcp/options.c
+index b548cec86c9d8..48e34b81fa1cc 100644
+--- a/net/mptcp/options.c
++++ b/net/mptcp/options.c
+@@ -1538,6 +1538,9 @@ mp_rst:
+ *ptr++ = mptcp_option(MPTCPOPT_MP_PRIO,
+ TCPOLEN_MPTCP_PRIO,
+ opts->backup, TCPOPT_NOP);
++
++ MPTCP_INC_STATS(sock_net((const struct sock *)tp),
++ MPTCP_MIB_MPPRIOTX);
+ }
+
+ mp_capable_done:
+diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
+index e3dcc5501579f..78345278e4a71 100644
+--- a/net/mptcp/pm_netlink.c
++++ b/net/mptcp/pm_netlink.c
+@@ -720,24 +720,23 @@ static int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk,
+
+ mptcp_for_each_subflow(msk, subflow) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+- struct sock *sk = (struct sock *)msk;
+ struct mptcp_addr_info local;
++ bool slow;
+
+ local_address((struct sock_common *)ssk, &local);
+ if (!addresses_equal(&local, addr, addr->port))
+ continue;
+
++ slow = lock_sock_fast(ssk);
+ if (subflow->backup != bkup)
+ msk->last_snd = NULL;
+ subflow->backup = bkup;
+ subflow->send_mp_prio = 1;
+ subflow->request_bkup = bkup;
+- __MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPPRIOTX);
+
+- spin_unlock_bh(&msk->pm.lock);
+ pr_debug("send ack for mp_prio");
+- mptcp_subflow_send_ack(ssk);
+- spin_lock_bh(&msk->pm.lock);
++ __mptcp_subflow_send_ack(ssk);
++ unlock_sock_fast(ssk, slow);
+
+ return 0;
+ }
+@@ -794,7 +793,8 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
+ removed = true;
+ __MPTCP_INC_STATS(sock_net(sk), rm_type);
+ }
+- __set_bit(rm_list->ids[i], msk->pm.id_avail_bitmap);
++ if (rm_type == MPTCP_MIB_RMSUBFLOW)
++ __set_bit(rm_list->ids[i], msk->pm.id_avail_bitmap);
+ if (!removed)
+ continue;
+
+@@ -1769,8 +1769,10 @@ static void mptcp_pm_nl_fullmesh(struct mptcp_sock *msk,
+
+ list.ids[list.nr++] = addr->id;
+
++ spin_lock_bh(&msk->pm.lock);
+ mptcp_pm_nl_rm_subflow_received(msk, &list);
+ mptcp_pm_create_subflow_or_signal_addr(msk);
++ spin_unlock_bh(&msk->pm.lock);
+ }
+
+ static int mptcp_nl_set_flags(struct net *net,
+@@ -1788,12 +1790,10 @@ static int mptcp_nl_set_flags(struct net *net,
+ goto next;
+
+ lock_sock(sk);
+- spin_lock_bh(&msk->pm.lock);
+ if (changed & MPTCP_PM_ADDR_FLAG_BACKUP)
+ ret = mptcp_pm_nl_mp_prio_send_ack(msk, addr, bkup);
+ if (changed & MPTCP_PM_ADDR_FLAG_FULLMESH)
+ mptcp_pm_nl_fullmesh(msk, addr);
+- spin_unlock_bh(&msk->pm.lock);
+ release_sock(sk);
+
+ next:
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index 713077eef04ac..b0fb1fc0bd4a9 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -506,13 +506,18 @@ static bool tcp_can_send_ack(const struct sock *ssk)
+ (TCPF_SYN_SENT | TCPF_SYN_RECV | TCPF_TIME_WAIT | TCPF_CLOSE | TCPF_LISTEN));
+ }
+
++void __mptcp_subflow_send_ack(struct sock *ssk)
++{
++ if (tcp_can_send_ack(ssk))
++ tcp_send_ack(ssk);
++}
++
+ void mptcp_subflow_send_ack(struct sock *ssk)
+ {
+ bool slow;
+
+ slow = lock_sock_fast(ssk);
+- if (tcp_can_send_ack(ssk))
+- tcp_send_ack(ssk);
++ __mptcp_subflow_send_ack(ssk);
+ unlock_sock_fast(ssk, slow);
+ }
+
+diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
+index 2aab5aff6bcdf..ad36a05aa67df 100644
+--- a/net/mptcp/protocol.h
++++ b/net/mptcp/protocol.h
+@@ -584,6 +584,7 @@ void __init mptcp_subflow_init(void);
+ void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how);
+ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
+ struct mptcp_subflow_context *subflow);
++void __mptcp_subflow_send_ack(struct sock *ssk);
+ void mptcp_subflow_send_ack(struct sock *ssk);
+ void mptcp_subflow_reset(struct sock *ssk);
+ void mptcp_subflow_queue_clean(struct sock *ssk);
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index 81243c834abbe..a136148627e70 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -5213,13 +5213,20 @@ static int nft_setelem_parse_data(struct nft_ctx *ctx, struct nft_set *set,
+ struct nft_data *data,
+ struct nlattr *attr)
+ {
++ u32 dtype;
+ int err;
+
+ err = nft_data_init(ctx, data, NFT_DATA_VALUE_MAXLEN, desc, attr);
+ if (err < 0)
+ return err;
+
+- if (desc->type != NFT_DATA_VERDICT && desc->len != set->dlen) {
++ if (set->dtype == NFT_DATA_VERDICT)
++ dtype = NFT_DATA_VERDICT;
++ else
++ dtype = NFT_DATA_VALUE;
++
++ if (dtype != desc->type ||
++ set->dlen != desc->len) {
+ nft_data_release(data, desc->type);
+ return -EINVAL;
+ }
+diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
+index 2c8051d8cca69..4f9299b9dcddc 100644
+--- a/net/netfilter/nft_set_pipapo.c
++++ b/net/netfilter/nft_set_pipapo.c
+@@ -2124,6 +2124,32 @@ out_scratch:
+ return err;
+ }
+
++/**
++ * nft_set_pipapo_match_destroy() - Destroy elements from key mapping array
++ * @set: nftables API set representation
++ * @m: matching data pointing to key mapping array
++ */
++static void nft_set_pipapo_match_destroy(const struct nft_set *set,
++ struct nft_pipapo_match *m)
++{
++ struct nft_pipapo_field *f;
++ int i, r;
++
++ for (i = 0, f = m->f; i < m->field_count - 1; i++, f++)
++ ;
++
++ for (r = 0; r < f->rules; r++) {
++ struct nft_pipapo_elem *e;
++
++ if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e)
++ continue;
++
++ e = f->mt[r].e;
++
++ nft_set_elem_destroy(set, e, true);
++ }
++}
++
+ /**
+ * nft_pipapo_destroy() - Free private data for set and all committed elements
+ * @set: nftables API set representation
+@@ -2132,26 +2158,13 @@ static void nft_pipapo_destroy(const struct nft_set *set)
+ {
+ struct nft_pipapo *priv = nft_set_priv(set);
+ struct nft_pipapo_match *m;
+- struct nft_pipapo_field *f;
+- int i, r, cpu;
++ int cpu;
+
+ m = rcu_dereference_protected(priv->match, true);
+ if (m) {
+ rcu_barrier();
+
+- for (i = 0, f = m->f; i < m->field_count - 1; i++, f++)
+- ;
+-
+- for (r = 0; r < f->rules; r++) {
+- struct nft_pipapo_elem *e;
+-
+- if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e)
+- continue;
+-
+- e = f->mt[r].e;
+-
+- nft_set_elem_destroy(set, e, true);
+- }
++ nft_set_pipapo_match_destroy(set, m);
+
+ #ifdef NFT_PIPAPO_ALIGN
+ free_percpu(m->scratch_aligned);
+@@ -2165,6 +2178,11 @@ static void nft_pipapo_destroy(const struct nft_set *set)
+ }
+
+ if (priv->clone) {
++ m = priv->clone;
++
++ if (priv->dirty)
++ nft_set_pipapo_match_destroy(set, m);
++
+ #ifdef NFT_PIPAPO_ALIGN
+ free_percpu(priv->clone->scratch_aligned);
+ #endif
+diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c
+index e2e6b6b785789..5e93510fa3d24 100644
+--- a/net/rose/rose_route.c
++++ b/net/rose/rose_route.c
+@@ -227,8 +227,8 @@ static void rose_remove_neigh(struct rose_neigh *rose_neigh)
+ {
+ struct rose_neigh *s;
+
+- rose_stop_ftimer(rose_neigh);
+- rose_stop_t0timer(rose_neigh);
++ del_timer_sync(&rose_neigh->ftimer);
++ del_timer_sync(&rose_neigh->t0timer);
+
+ skb_queue_purge(&rose_neigh->queue);
+
+diff --git a/net/sched/act_api.c b/net/sched/act_api.c
+index 6fa9e7b1406a4..817065aa28331 100644
+--- a/net/sched/act_api.c
++++ b/net/sched/act_api.c
+@@ -195,7 +195,7 @@ static int offload_action_init(struct flow_offload_action *fl_action,
+ if (act->ops->offload_act_setup) {
+ spin_lock_bh(&act->tcfa_lock);
+ err = act->ops->offload_act_setup(act, fl_action, NULL,
+- false);
++ false, extack);
+ spin_unlock_bh(&act->tcfa_lock);
+ return err;
+ }
+@@ -271,7 +271,7 @@ static int tcf_action_offload_add_ex(struct tc_action *action,
+ if (err)
+ goto fl_err;
+
+- err = tc_setup_action(&fl_action->action, actions);
++ err = tc_setup_action(&fl_action->action, actions, extack);
+ if (err) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Failed to setup tc actions for offload");
+diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
+index e0f515b774cad..22847ee009efa 100644
+--- a/net/sched/act_csum.c
++++ b/net/sched/act_csum.c
+@@ -696,7 +696,8 @@ static size_t tcf_csum_get_fill_size(const struct tc_action *act)
+ }
+
+ static int tcf_csum_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
+index b3ca837fd4e82..e013253b10d18 100644
+--- a/net/sched/act_ct.c
++++ b/net/sched/act_ct.c
+@@ -1584,7 +1584,8 @@ static void tcf_stats_update(struct tc_action *a, u64 bytes, u64 packets,
+ }
+
+ static int tcf_ct_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
+index bde6a6c01e64c..db84a0473cc13 100644
+--- a/net/sched/act_gact.c
++++ b/net/sched/act_gact.c
+@@ -253,7 +253,8 @@ static size_t tcf_gact_get_fill_size(const struct tc_action *act)
+ }
+
+ static int tcf_gact_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
+index d56e73843a4b7..fd51552747332 100644
+--- a/net/sched/act_gate.c
++++ b/net/sched/act_gate.c
+@@ -619,7 +619,8 @@ static int tcf_gate_get_entries(struct flow_action_entry *entry,
+ }
+
+ static int tcf_gate_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ int err;
+
+diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
+index 39acd1d186098..70a6a4447e6bd 100644
+--- a/net/sched/act_mirred.c
++++ b/net/sched/act_mirred.c
+@@ -460,7 +460,8 @@ static void tcf_offload_mirred_get_dev(struct flow_action_entry *entry,
+ }
+
+ static int tcf_mirred_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_mpls.c b/net/sched/act_mpls.c
+index b9ff3459fdab9..23fcfa5605dfa 100644
+--- a/net/sched/act_mpls.c
++++ b/net/sched/act_mpls.c
+@@ -385,7 +385,8 @@ static int tcf_mpls_search(struct net *net, struct tc_action **a, u32 index)
+ }
+
+ static int tcf_mpls_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
+index 211c757bfc3c4..8fccc914f4642 100644
+--- a/net/sched/act_pedit.c
++++ b/net/sched/act_pedit.c
+@@ -510,7 +510,8 @@ static int tcf_pedit_search(struct net *net, struct tc_action **a, u32 index)
+ }
+
+ static int tcf_pedit_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_police.c b/net/sched/act_police.c
+index f4d9177052639..b759628a47c20 100644
+--- a/net/sched/act_police.c
++++ b/net/sched/act_police.c
+@@ -419,7 +419,8 @@ static int tcf_police_search(struct net *net, struct tc_action **a, u32 index)
+ return tcf_idr_search(tn, a, index);
+ }
+
+-static int tcf_police_act_to_flow_act(int tc_act, u32 *extval)
++static int tcf_police_act_to_flow_act(int tc_act, u32 *extval,
++ struct netlink_ext_ack *extack)
+ {
+ int act_id = -EOPNOTSUPP;
+
+@@ -430,19 +431,28 @@ static int tcf_police_act_to_flow_act(int tc_act, u32 *extval)
+ act_id = FLOW_ACTION_DROP;
+ else if (tc_act == TC_ACT_PIPE)
+ act_id = FLOW_ACTION_PIPE;
++ else if (tc_act == TC_ACT_RECLASSIFY)
++ NL_SET_ERR_MSG_MOD(extack, "Offload not supported when conform/exceed action is \"reclassify\"");
++ else
++ NL_SET_ERR_MSG_MOD(extack, "Unsupported conform/exceed action offload");
+ } else if (TC_ACT_EXT_CMP(tc_act, TC_ACT_GOTO_CHAIN)) {
+ act_id = FLOW_ACTION_GOTO;
+ *extval = tc_act & TC_ACT_EXT_VAL_MASK;
+ } else if (TC_ACT_EXT_CMP(tc_act, TC_ACT_JUMP)) {
+ act_id = FLOW_ACTION_JUMP;
+ *extval = tc_act & TC_ACT_EXT_VAL_MASK;
++ } else if (tc_act == TC_ACT_UNSPEC) {
++ act_id = FLOW_ACTION_CONTINUE;
++ } else {
++ NL_SET_ERR_MSG_MOD(extack, "Unsupported conform/exceed action offload");
+ }
+
+ return act_id;
+ }
+
+ static int tcf_police_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+@@ -466,14 +476,16 @@ static int tcf_police_offload_act_setup(struct tc_action *act, void *entry_data,
+ entry->police.mtu = tcf_police_tcfp_mtu(act);
+
+ act_id = tcf_police_act_to_flow_act(police->tcf_action,
+- &entry->police.exceed.extval);
++ &entry->police.exceed.extval,
++ extack);
+ if (act_id < 0)
+ return act_id;
+
+ entry->police.exceed.act_id = act_id;
+
+ act_id = tcf_police_act_to_flow_act(p->tcfp_result,
+- &entry->police.notexceed.extval);
++ &entry->police.notexceed.extval,
++ extack);
+ if (act_id < 0)
+ return act_id;
+
+diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
+index 9a22cdda6bbdc..2f7f5e44d28c9 100644
+--- a/net/sched/act_sample.c
++++ b/net/sched/act_sample.c
+@@ -291,7 +291,8 @@ static void tcf_offload_sample_get_group(struct flow_action_entry *entry,
+ }
+
+ static int tcf_sample_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
+index ceba11b198bba..8cd8e506c9c9b 100644
+--- a/net/sched/act_skbedit.c
++++ b/net/sched/act_skbedit.c
+@@ -328,7 +328,8 @@ static size_t tcf_skbedit_get_fill_size(const struct tc_action *act)
+ }
+
+ static int tcf_skbedit_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
+index 23aba03d26a8d..3c6f40478c813 100644
+--- a/net/sched/act_tunnel_key.c
++++ b/net/sched/act_tunnel_key.c
+@@ -808,7 +808,8 @@ static int tcf_tunnel_encap_get_tunnel(struct flow_action_entry *entry,
+ static int tcf_tunnel_key_offload_act_setup(struct tc_action *act,
+ void *entry_data,
+ u32 *index_inc,
+- bool bind)
++ bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ int err;
+
+diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
+index 883454c4f9219..8c89bce99cbd1 100644
+--- a/net/sched/act_vlan.c
++++ b/net/sched/act_vlan.c
+@@ -369,7 +369,8 @@ static size_t tcf_vlan_get_fill_size(const struct tc_action *act)
+ }
+
+ static int tcf_vlan_offload_act_setup(struct tc_action *act, void *entry_data,
+- u32 *index_inc, bool bind)
++ u32 *index_inc, bool bind,
++ struct netlink_ext_ack *extack)
+ {
+ if (bind) {
+ struct flow_action_entry *entry = entry_data;
+diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
+index f0699f39afdb0..2d4dc1468a9a5 100644
+--- a/net/sched/cls_api.c
++++ b/net/sched/cls_api.c
+@@ -3513,11 +3513,13 @@ EXPORT_SYMBOL(tc_cleanup_offload_action);
+
+ static int tc_setup_offload_act(struct tc_action *act,
+ struct flow_action_entry *entry,
+- u32 *index_inc)
++ u32 *index_inc,
++ struct netlink_ext_ack *extack)
+ {
+ #ifdef CONFIG_NET_CLS_ACT
+ if (act->ops->offload_act_setup)
+- return act->ops->offload_act_setup(act, entry, index_inc, true);
++ return act->ops->offload_act_setup(act, entry, index_inc, true,
++ extack);
+ else
+ return -EOPNOTSUPP;
+ #else
+@@ -3526,7 +3528,8 @@ static int tc_setup_offload_act(struct tc_action *act,
+ }
+
+ int tc_setup_action(struct flow_action *flow_action,
+- struct tc_action *actions[])
++ struct tc_action *actions[],
++ struct netlink_ext_ack *extack)
+ {
+ int i, j, index, err = 0;
+ struct tc_action *act;
+@@ -3551,7 +3554,7 @@ int tc_setup_action(struct flow_action *flow_action,
+ entry->hw_stats = tc_act_hw_stats(act->hw_stats);
+ entry->hw_index = act->tcfa_index;
+ index = 0;
+- err = tc_setup_offload_act(act, entry, &index);
++ err = tc_setup_offload_act(act, entry, &index, extack);
+ if (!err)
+ j += index;
+ else
+@@ -3570,13 +3573,14 @@ err_out_locked:
+ }
+
+ int tc_setup_offload_action(struct flow_action *flow_action,
+- const struct tcf_exts *exts)
++ const struct tcf_exts *exts,
++ struct netlink_ext_ack *extack)
+ {
+ #ifdef CONFIG_NET_CLS_ACT
+ if (!exts)
+ return 0;
+
+- return tc_setup_action(flow_action, exts->actions);
++ return tc_setup_action(flow_action, exts->actions, extack);
+ #else
+ return 0;
+ #endif
+diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
+index ed5e6f08e74a8..cddacf49f9e86 100644
+--- a/net/sched/cls_flower.c
++++ b/net/sched/cls_flower.c
+@@ -464,7 +464,8 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
+ cls_flower.rule->match.key = &f->mkey;
+ cls_flower.classid = f->res.classid;
+
+- err = tc_setup_offload_action(&cls_flower.rule->action, &f->exts);
++ err = tc_setup_offload_action(&cls_flower.rule->action, &f->exts,
++ cls_flower.common.extack);
+ if (err) {
+ kfree(cls_flower.rule);
+ if (skip_sw) {
+@@ -2362,7 +2363,8 @@ static int fl_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
+ cls_flower.rule->match.mask = &f->mask->key;
+ cls_flower.rule->match.key = &f->mkey;
+
+- err = tc_setup_offload_action(&cls_flower.rule->action, &f->exts);
++ err = tc_setup_offload_action(&cls_flower.rule->action, &f->exts,
++ cls_flower.common.extack);
+ if (err) {
+ kfree(cls_flower.rule);
+ if (tc_skip_sw(f->flags)) {
+diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
+index ca5670fd5228e..df80c6b185a01 100644
+--- a/net/sched/cls_matchall.c
++++ b/net/sched/cls_matchall.c
+@@ -97,7 +97,8 @@ static int mall_replace_hw_filter(struct tcf_proto *tp,
+ cls_mall.command = TC_CLSMATCHALL_REPLACE;
+ cls_mall.cookie = cookie;
+
+- err = tc_setup_offload_action(&cls_mall.rule->action, &head->exts);
++ err = tc_setup_offload_action(&cls_mall.rule->action, &head->exts,
++ cls_mall.common.extack);
+ if (err) {
+ kfree(cls_mall.rule);
+ mall_destroy_hw_filter(tp, head, cookie, NULL);
+@@ -302,7 +303,8 @@ static int mall_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
+ TC_CLSMATCHALL_REPLACE : TC_CLSMATCHALL_DESTROY;
+ cls_mall.cookie = (unsigned long)head;
+
+- err = tc_setup_offload_action(&cls_mall.rule->action, &head->exts);
++ err = tc_setup_offload_action(&cls_mall.rule->action, &head->exts,
++ cls_mall.common.extack);
+ if (err) {
+ kfree(cls_mall.rule);
+ if (add && tc_skip_sw(head->flags)) {
+diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
+index 87bdd71c7bb66..f70112176b7c1 100644
+--- a/net/xdp/xsk_buff_pool.c
++++ b/net/xdp/xsk_buff_pool.c
+@@ -332,6 +332,7 @@ static void __xp_dma_unmap(struct xsk_dma_map *dma_map, unsigned long attrs)
+ for (i = 0; i < dma_map->dma_pages_cnt; i++) {
+ dma = &dma_map->dma_pages[i];
+ if (*dma) {
++ *dma &= ~XSK_NEXT_PG_CONTIG_MASK;
+ dma_unmap_page_attrs(dma_map->dev, *dma, PAGE_SIZE,
+ DMA_BIDIRECTIONAL, attrs);
+ *dma = 0;
+diff --git a/sound/pci/cs46xx/cs46xx.c b/sound/pci/cs46xx/cs46xx.c
+index bd60308769ff7..8634004a606b6 100644
+--- a/sound/pci/cs46xx/cs46xx.c
++++ b/sound/pci/cs46xx/cs46xx.c
+@@ -74,36 +74,36 @@ static int snd_card_cs46xx_probe(struct pci_dev *pci,
+ err = snd_cs46xx_create(card, pci,
+ external_amp[dev], thinkpad[dev]);
+ if (err < 0)
+- return err;
++ goto error;
+ card->private_data = chip;
+ chip->accept_valid = mmap_valid[dev];
+ err = snd_cs46xx_pcm(chip, 0);
+ if (err < 0)
+- return err;
++ goto error;
+ #ifdef CONFIG_SND_CS46XX_NEW_DSP
+ err = snd_cs46xx_pcm_rear(chip, 1);
+ if (err < 0)
+- return err;
++ goto error;
+ err = snd_cs46xx_pcm_iec958(chip, 2);
+ if (err < 0)
+- return err;
++ goto error;
+ #endif
+ err = snd_cs46xx_mixer(chip, 2);
+ if (err < 0)
+- return err;
++ goto error;
+ #ifdef CONFIG_SND_CS46XX_NEW_DSP
+ if (chip->nr_ac97_codecs ==2) {
+ err = snd_cs46xx_pcm_center_lfe(chip, 3);
+ if (err < 0)
+- return err;
++ goto error;
+ }
+ #endif
+ err = snd_cs46xx_midi(chip, 0);
+ if (err < 0)
+- return err;
++ goto error;
+ err = snd_cs46xx_start_dsp(chip);
+ if (err < 0)
+- return err;
++ goto error;
+
+ snd_cs46xx_gameport(chip);
+
+@@ -117,11 +117,15 @@ static int snd_card_cs46xx_probe(struct pci_dev *pci,
+
+ err = snd_card_register(card);
+ if (err < 0)
+- return err;
++ goto error;
+
+ pci_set_drvdata(pci, card);
+ dev++;
+ return 0;
++
++ error:
++ snd_card_free(card);
++ return err;
+ }
+
+ static struct pci_driver cs46xx_driver = {
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index d3d786de8f4c4..45f5db25a77ed 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -9264,6 +9264,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1558, 0x70f4, "Clevo NH77EPY", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x70f6, "Clevo NH77DPQ-Y", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x7716, "Clevo NS50PU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
++ SND_PCI_QUIRK(0x1558, 0x7718, "Clevo L140PU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x8228, "Clevo NR40BU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x8520, "Clevo NH50D[CD]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x8521, "Clevo NH77D[CD]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+diff --git a/sound/soc/codecs/rt700.c b/sound/soc/codecs/rt700.c
+index e61a8257bf647..360d61a36c354 100644
+--- a/sound/soc/codecs/rt700.c
++++ b/sound/soc/codecs/rt700.c
+@@ -315,17 +315,27 @@ static int rt700_set_jack_detect(struct snd_soc_component *component,
+ struct snd_soc_jack *hs_jack, void *data)
+ {
+ struct rt700_priv *rt700 = snd_soc_component_get_drvdata(component);
++ int ret;
+
+ rt700->hs_jack = hs_jack;
+
+- if (!rt700->hw_init) {
+- dev_dbg(&rt700->slave->dev,
+- "%s hw_init not ready yet\n", __func__);
++ ret = pm_runtime_resume_and_get(component->dev);
++ if (ret < 0) {
++ if (ret != -EACCES) {
++ dev_err(component->dev, "%s: failed to resume %d\n", __func__, ret);
++ return ret;
++ }
++
++ /* pm_runtime not enabled yet */
++ dev_dbg(component->dev, "%s: skipping jack init for now\n", __func__);
+ return 0;
+ }
+
+ rt700_jack_init(rt700);
+
++ pm_runtime_mark_last_busy(component->dev);
++ pm_runtime_put_autosuspend(component->dev);
++
+ return 0;
+ }
+
+diff --git a/sound/soc/codecs/rt711-sdca.c b/sound/soc/codecs/rt711-sdca.c
+index bdb1375f03388..9d59e653b941b 100644
+--- a/sound/soc/codecs/rt711-sdca.c
++++ b/sound/soc/codecs/rt711-sdca.c
+@@ -487,16 +487,27 @@ static int rt711_sdca_set_jack_detect(struct snd_soc_component *component,
+ struct snd_soc_jack *hs_jack, void *data)
+ {
+ struct rt711_sdca_priv *rt711 = snd_soc_component_get_drvdata(component);
++ int ret;
+
+ rt711->hs_jack = hs_jack;
+
+- if (!rt711->hw_init) {
+- dev_dbg(&rt711->slave->dev,
+- "%s hw_init not ready yet\n", __func__);
++ ret = pm_runtime_resume_and_get(component->dev);
++ if (ret < 0) {
++ if (ret != -EACCES) {
++ dev_err(component->dev, "%s: failed to resume %d\n", __func__, ret);
++ return ret;
++ }
++
++ /* pm_runtime not enabled yet */
++ dev_dbg(component->dev, "%s: skipping jack init for now\n", __func__);
+ return 0;
+ }
+
+ rt711_sdca_jack_init(rt711);
++
++ pm_runtime_mark_last_busy(component->dev);
++ pm_runtime_put_autosuspend(component->dev);
++
+ return 0;
+ }
+
+@@ -1190,14 +1201,6 @@ static int rt711_sdca_probe(struct snd_soc_component *component)
+ return 0;
+ }
+
+-static void rt711_sdca_remove(struct snd_soc_component *component)
+-{
+- struct rt711_sdca_priv *rt711 = snd_soc_component_get_drvdata(component);
+-
+- regcache_cache_only(rt711->regmap, true);
+- regcache_cache_only(rt711->mbq_regmap, true);
+-}
+-
+ static const struct snd_soc_component_driver soc_sdca_dev_rt711 = {
+ .probe = rt711_sdca_probe,
+ .controls = rt711_sdca_snd_controls,
+@@ -1207,7 +1210,7 @@ static const struct snd_soc_component_driver soc_sdca_dev_rt711 = {
+ .dapm_routes = rt711_sdca_audio_map,
+ .num_dapm_routes = ARRAY_SIZE(rt711_sdca_audio_map),
+ .set_jack = rt711_sdca_set_jack_detect,
+- .remove = rt711_sdca_remove,
++ .endianness = 1,
+ };
+
+ static int rt711_sdca_set_sdw_stream(struct snd_soc_dai *dai, void *sdw_stream,
+diff --git a/sound/soc/codecs/rt711.c b/sound/soc/codecs/rt711.c
+index ea25fd58d43a9..9958067e80f11 100644
+--- a/sound/soc/codecs/rt711.c
++++ b/sound/soc/codecs/rt711.c
+@@ -457,17 +457,27 @@ static int rt711_set_jack_detect(struct snd_soc_component *component,
+ struct snd_soc_jack *hs_jack, void *data)
+ {
+ struct rt711_priv *rt711 = snd_soc_component_get_drvdata(component);
++ int ret;
+
+ rt711->hs_jack = hs_jack;
+
+- if (!rt711->hw_init) {
+- dev_dbg(&rt711->slave->dev,
+- "%s hw_init not ready yet\n", __func__);
++ ret = pm_runtime_resume_and_get(component->dev);
++ if (ret < 0) {
++ if (ret != -EACCES) {
++ dev_err(component->dev, "%s: failed to resume %d\n", __func__, ret);
++ return ret;
++ }
++
++ /* pm_runtime not enabled yet */
++ dev_dbg(component->dev, "%s: skipping jack init for now\n", __func__);
+ return 0;
+ }
+
+ rt711_jack_init(rt711);
+
++ pm_runtime_mark_last_busy(component->dev);
++ pm_runtime_put_autosuspend(component->dev);
++
+ return 0;
+ }
+
+@@ -932,13 +942,6 @@ static int rt711_probe(struct snd_soc_component *component)
+ return 0;
+ }
+
+-static void rt711_remove(struct snd_soc_component *component)
+-{
+- struct rt711_priv *rt711 = snd_soc_component_get_drvdata(component);
+-
+- regcache_cache_only(rt711->regmap, true);
+-}
+-
+ static const struct snd_soc_component_driver soc_codec_dev_rt711 = {
+ .probe = rt711_probe,
+ .set_bias_level = rt711_set_bias_level,
+@@ -949,7 +952,7 @@ static const struct snd_soc_component_driver soc_codec_dev_rt711 = {
+ .dapm_routes = rt711_audio_map,
+ .num_dapm_routes = ARRAY_SIZE(rt711_audio_map),
+ .set_jack = rt711_set_jack_detect,
+- .remove = rt711_remove,
++ .endianness = 1,
+ };
+
+ static int rt711_set_sdw_stream(struct snd_soc_dai *dai, void *sdw_stream,
+diff --git a/sound/soc/qcom/qdsp6/q6apm-dai.c b/sound/soc/qcom/qdsp6/q6apm-dai.c
+index 19c4a90ec1ea9..ee59ef36b85a6 100644
+--- a/sound/soc/qcom/qdsp6/q6apm-dai.c
++++ b/sound/soc/qcom/qdsp6/q6apm-dai.c
+@@ -147,6 +147,12 @@ static int q6apm_dai_prepare(struct snd_soc_component *component,
+ cfg.num_channels = runtime->channels;
+ cfg.bit_width = prtd->bits_per_sample;
+
++ if (prtd->state) {
++ /* clear the previous setup if any */
++ q6apm_graph_stop(prtd->graph);
++ q6apm_unmap_memory_regions(prtd->graph, substream->stream);
++ }
++
+ prtd->pcm_count = snd_pcm_lib_period_bytes(substream);
+ prtd->pos = 0;
+ /* rate and channels are sent to audio driver */
+diff --git a/sound/soc/sof/intel/hda-pcm.c b/sound/soc/sof/intel/hda-pcm.c
+index dc1f743730c0d..6888e0a4665d2 100644
+--- a/sound/soc/sof/intel/hda-pcm.c
++++ b/sound/soc/sof/intel/hda-pcm.c
+@@ -192,79 +192,7 @@ snd_pcm_uframes_t hda_dsp_pcm_pointer(struct snd_sof_dev *sdev,
+ goto found;
+ }
+
+- switch (sof_hda_position_quirk) {
+- case SOF_HDA_POSITION_QUIRK_USE_SKYLAKE_LEGACY:
+- /*
+- * This legacy code, inherited from the Skylake driver,
+- * mixes DPIB registers and DPIB DDR updates and
+- * does not seem to follow any known hardware recommendations.
+- * It's not clear e.g. why there is a different flow
+- * for capture and playback, the only information that matters is
+- * what traffic class is used, and on all SOF-enabled platforms
+- * only VC0 is supported so the work-around was likely not necessary
+- * and quite possibly wrong.
+- */
+-
+- /* DPIB/posbuf position mode:
+- * For Playback, Use DPIB register from HDA space which
+- * reflects the actual data transferred.
+- * For Capture, Use the position buffer for pointer, as DPIB
+- * is not accurate enough, its update may be completed
+- * earlier than the data written to DDR.
+- */
+- if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
+- pos = snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR,
+- AZX_REG_VS_SDXDPIB_XBASE +
+- (AZX_REG_VS_SDXDPIB_XINTERVAL *
+- hstream->index));
+- } else {
+- /*
+- * For capture stream, we need more workaround to fix the
+- * position incorrect issue:
+- *
+- * 1. Wait at least 20us before reading position buffer after
+- * the interrupt generated(IOC), to make sure position update
+- * happens on frame boundary i.e. 20.833uSec for 48KHz.
+- * 2. Perform a dummy Read to DPIB register to flush DMA
+- * position value.
+- * 3. Read the DMA Position from posbuf. Now the readback
+- * value should be >= period boundary.
+- */
+- usleep_range(20, 21);
+- snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR,
+- AZX_REG_VS_SDXDPIB_XBASE +
+- (AZX_REG_VS_SDXDPIB_XINTERVAL *
+- hstream->index));
+- pos = snd_hdac_stream_get_pos_posbuf(hstream);
+- }
+- break;
+- case SOF_HDA_POSITION_QUIRK_USE_DPIB_REGISTERS:
+- /*
+- * In case VC1 traffic is disabled this is the recommended option
+- */
+- pos = snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR,
+- AZX_REG_VS_SDXDPIB_XBASE +
+- (AZX_REG_VS_SDXDPIB_XINTERVAL *
+- hstream->index));
+- break;
+- case SOF_HDA_POSITION_QUIRK_USE_DPIB_DDR_UPDATE:
+- /*
+- * This is the recommended option when VC1 is enabled.
+- * While this isn't needed for SOF platforms it's added for
+- * consistency and debug.
+- */
+- pos = snd_hdac_stream_get_pos_posbuf(hstream);
+- break;
+- default:
+- dev_err_once(sdev->dev, "hda_position_quirk value %d not supported\n",
+- sof_hda_position_quirk);
+- pos = 0;
+- break;
+- }
+-
+- if (pos >= hstream->bufsize)
+- pos = 0;
+-
++ pos = hda_dsp_stream_get_position(hstream, substream->stream, true);
+ found:
+ pos = bytes_to_frames(substream->runtime, pos);
+
+diff --git a/sound/soc/sof/intel/hda-stream.c b/sound/soc/sof/intel/hda-stream.c
+index daeb64c495e40..d95ae17e81cc4 100644
+--- a/sound/soc/sof/intel/hda-stream.c
++++ b/sound/soc/sof/intel/hda-stream.c
+@@ -707,12 +707,13 @@ bool hda_dsp_check_stream_irq(struct snd_sof_dev *sdev)
+ }
+
+ static void
+-hda_dsp_set_bytes_transferred(struct hdac_stream *hstream, u64 buffer_size)
++hda_dsp_compr_bytes_transferred(struct hdac_stream *hstream, int direction)
+ {
++ u64 buffer_size = hstream->bufsize;
+ u64 prev_pos, pos, num_bytes;
+
+ div64_u64_rem(hstream->curr_pos, buffer_size, &prev_pos);
+- pos = snd_hdac_stream_get_pos_posbuf(hstream);
++ pos = hda_dsp_stream_get_position(hstream, direction, false);
+
+ if (pos < prev_pos)
+ num_bytes = (buffer_size - prev_pos) + pos;
+@@ -748,8 +749,7 @@ static bool hda_dsp_stream_check(struct hdac_bus *bus, u32 status)
+ if (s->substream && sof_hda->no_ipc_position) {
+ snd_sof_pcm_period_elapsed(s->substream);
+ } else if (s->cstream) {
+- hda_dsp_set_bytes_transferred(s,
+- s->cstream->runtime->buffer_size);
++ hda_dsp_compr_bytes_transferred(s, s->cstream->direction);
+ snd_compr_fragment_elapsed(s->cstream);
+ }
+ }
+@@ -1009,3 +1009,89 @@ void hda_dsp_stream_free(struct snd_sof_dev *sdev)
+ devm_kfree(sdev->dev, hda_stream);
+ }
+ }
++
++snd_pcm_uframes_t hda_dsp_stream_get_position(struct hdac_stream *hstream,
++ int direction, bool can_sleep)
++{
++ struct hdac_ext_stream *hext_stream = stream_to_hdac_ext_stream(hstream);
++ struct sof_intel_hda_stream *hda_stream = hstream_to_sof_hda_stream(hext_stream);
++ struct snd_sof_dev *sdev = hda_stream->sdev;
++ snd_pcm_uframes_t pos;
++
++ switch (sof_hda_position_quirk) {
++ case SOF_HDA_POSITION_QUIRK_USE_SKYLAKE_LEGACY:
++ /*
++ * This legacy code, inherited from the Skylake driver,
++ * mixes DPIB registers and DPIB DDR updates and
++ * does not seem to follow any known hardware recommendations.
++ * It's not clear e.g. why there is a different flow
++ * for capture and playback, the only information that matters is
++ * what traffic class is used, and on all SOF-enabled platforms
++ * only VC0 is supported so the work-around was likely not necessary
++ * and quite possibly wrong.
++ */
++
++ /* DPIB/posbuf position mode:
++ * For Playback, Use DPIB register from HDA space which
++ * reflects the actual data transferred.
++ * For Capture, Use the position buffer for pointer, as DPIB
++ * is not accurate enough, its update may be completed
++ * earlier than the data written to DDR.
++ */
++ if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
++ pos = snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR,
++ AZX_REG_VS_SDXDPIB_XBASE +
++ (AZX_REG_VS_SDXDPIB_XINTERVAL *
++ hstream->index));
++ } else {
++ /*
++ * For capture stream, we need more workaround to fix the
++ * position incorrect issue:
++ *
++ * 1. Wait at least 20us before reading position buffer after
++ * the interrupt generated(IOC), to make sure position update
++ * happens on frame boundary i.e. 20.833uSec for 48KHz.
++ * 2. Perform a dummy Read to DPIB register to flush DMA
++ * position value.
++ * 3. Read the DMA Position from posbuf. Now the readback
++ * value should be >= period boundary.
++ */
++ if (can_sleep)
++ usleep_range(20, 21);
++
++ snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR,
++ AZX_REG_VS_SDXDPIB_XBASE +
++ (AZX_REG_VS_SDXDPIB_XINTERVAL *
++ hstream->index));
++ pos = snd_hdac_stream_get_pos_posbuf(hstream);
++ }
++ break;
++ case SOF_HDA_POSITION_QUIRK_USE_DPIB_REGISTERS:
++ /*
++ * In case VC1 traffic is disabled this is the recommended option
++ */
++ pos = snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR,
++ AZX_REG_VS_SDXDPIB_XBASE +
++ (AZX_REG_VS_SDXDPIB_XINTERVAL *
++ hstream->index));
++ break;
++ case SOF_HDA_POSITION_QUIRK_USE_DPIB_DDR_UPDATE:
++ /*
++ * This is the recommended option when VC1 is enabled.
++ * While this isn't needed for SOF platforms it's added for
++ * consistency and debug.
++ */
++ pos = snd_hdac_stream_get_pos_posbuf(hstream);
++ break;
++ default:
++ dev_err_once(sdev->dev, "hda_position_quirk value %d not supported\n",
++ sof_hda_position_quirk);
++ pos = 0;
++ break;
++ }
++
++ if (pos >= hstream->bufsize)
++ pos = 0;
++
++ return pos;
++}
+diff --git a/sound/soc/sof/intel/hda.h b/sound/soc/sof/intel/hda.h
+index 05e5e158614a1..196494ba1245b 100644
+--- a/sound/soc/sof/intel/hda.h
++++ b/sound/soc/sof/intel/hda.h
+@@ -557,6 +557,9 @@ int hda_dsp_stream_setup_bdl(struct snd_sof_dev *sdev,
+ bool hda_dsp_check_ipc_irq(struct snd_sof_dev *sdev);
+ bool hda_dsp_check_stream_irq(struct snd_sof_dev *sdev);
+
++snd_pcm_uframes_t hda_dsp_stream_get_position(struct hdac_stream *hstream,
++ int direction, bool can_sleep);
++
+ struct hdac_ext_stream *
+ hda_dsp_stream_get(struct snd_sof_dev *sdev, int direction, u32 flags);
+ int hda_dsp_stream_put(struct snd_sof_dev *sdev, int direction, int stream_tag);
+diff --git a/sound/soc/sof/ipc3-topology.c b/sound/soc/sof/ipc3-topology.c
+index cdff48c4195f8..80fb82ece38d7 100644
+--- a/sound/soc/sof/ipc3-topology.c
++++ b/sound/soc/sof/ipc3-topology.c
+@@ -1578,24 +1578,23 @@ static int sof_ipc3_control_load_bytes(struct snd_sof_dev *sdev, struct snd_sof_
+ struct sof_ipc_ctrl_data *cdata;
+ int ret;
+
+- scontrol->ipc_control_data = kzalloc(scontrol->max_size, GFP_KERNEL);
+- if (!scontrol->ipc_control_data)
+- return -ENOMEM;
+-
+- if (scontrol->max_size < sizeof(*cdata) ||
+- scontrol->max_size < sizeof(struct sof_abi_hdr)) {
+- ret = -EINVAL;
+- goto err;
++ if (scontrol->max_size < (sizeof(*cdata) + sizeof(struct sof_abi_hdr))) {
++ dev_err(sdev->dev, "%s: insufficient size for a bytes control: %zu.\n",
++ __func__, scontrol->max_size);
++ return -EINVAL;
+ }
+
+- /* init the get/put bytes data */
+ if (scontrol->priv_size > scontrol->max_size - sizeof(*cdata)) {
+- dev_err(sdev->dev, "err: bytes data size %zu exceeds max %zu.\n",
++ dev_err(sdev->dev,
++ "%s: bytes data size %zu exceeds max %zu.\n", __func__,
+ scontrol->priv_size, scontrol->max_size - sizeof(*cdata));
+- ret = -EINVAL;
+- goto err;
++ return -EINVAL;
+ }
+
++ scontrol->ipc_control_data = kzalloc(scontrol->max_size, GFP_KERNEL);
++ if (!scontrol->ipc_control_data)
++ return -ENOMEM;
++
+ scontrol->size = sizeof(struct sof_ipc_ctrl_data) + scontrol->priv_size;
+
+ cdata = scontrol->ipc_control_data;
+diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
+index e8468f9b007d1..12ce69b04f634 100644
+--- a/sound/usb/quirks.c
++++ b/sound/usb/quirks.c
+@@ -1842,6 +1842,10 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER),
+ DEVICE_FLG(0x1395, 0x740a, /* Sennheiser DECT */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
++ DEVICE_FLG(0x1397, 0x0508, /* Behringer UMC204HD */
++ QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB),
++ DEVICE_FLG(0x1397, 0x0509, /* Behringer UMC404HD */
++ QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB),
+ DEVICE_FLG(0x13e5, 0x0001, /* Serato Phono */
+ QUIRK_FLAG_IGNORE_CTL_ERROR),
+ DEVICE_FLG(0x154e, 0x1002, /* Denon DCD-1500RE */
+diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
+index 664b9ecaf2282..063da0d534629 100644
+--- a/tools/testing/selftests/net/forwarding/lib.sh
++++ b/tools/testing/selftests/net/forwarding/lib.sh
+@@ -1176,6 +1176,7 @@ learning_test()
+ # FDB entry was installed.
+ bridge link set dev $br_port1 flood off
+
++ ip link set $host1_if promisc on
+ tc qdisc add dev $host1_if ingress
+ tc filter add dev $host1_if ingress protocol ip pref 1 handle 101 \
+ flower dst_mac $mac action drop
+@@ -1186,7 +1187,7 @@ learning_test()
+ tc -j -s filter show dev $host1_if ingress \
+ | jq -e ".[] | select(.options.handle == 101) \
+ | select(.options.actions[0].stats.packets == 1)" &> /dev/null
+- check_fail $? "Packet reached second host when should not"
++ check_fail $? "Packet reached first host when should not"
+
+ $MZ $host1_if -c 1 -p 64 -a $mac -t ip -q
+ sleep 1
+@@ -1225,6 +1226,7 @@ learning_test()
+
+ tc filter del dev $host1_if ingress protocol ip pref 1 handle 101 flower
+ tc qdisc del dev $host1_if ingress
++ ip link set $host1_if promisc off
+
+ bridge link set dev $br_port1 flood on
+
+@@ -1242,6 +1244,7 @@ flood_test_do()
+
+ # Add an ACL on `host2_if` which will tell us whether the packet
+ # was flooded to it or not.
++ ip link set $host2_if promisc on
+ tc qdisc add dev $host2_if ingress
+ tc filter add dev $host2_if ingress protocol ip pref 1 handle 101 \
+ flower dst_mac $mac action drop
+@@ -1259,6 +1262,7 @@ flood_test_do()
+
+ tc filter del dev $host2_if ingress protocol ip pref 1 handle 101 flower
+ tc qdisc del dev $host2_if ingress
++ ip link set $host2_if promisc off
+
+ return $err
+ }
+diff --git a/tools/testing/selftests/net/udpgro.sh b/tools/testing/selftests/net/udpgro.sh
+index f8a19f548ae9d..ebbd0b2824327 100755
+--- a/tools/testing/selftests/net/udpgro.sh
++++ b/tools/testing/selftests/net/udpgro.sh
+@@ -34,7 +34,7 @@ cfg_veth() {
+ ip -netns "${PEER_NS}" addr add dev veth1 192.168.1.1/24
+ ip -netns "${PEER_NS}" addr add dev veth1 2001:db8::1/64 nodad
+ ip -netns "${PEER_NS}" link set dev veth1 up
+- ip -n "${PEER_NS}" link set veth1 xdp object ../bpf/xdp_dummy.o section xdp_dummy
++ ip -n "${PEER_NS}" link set veth1 xdp object ../bpf/xdp_dummy.o section xdp
+ }
+
+ run_one() {
+diff --git a/tools/testing/selftests/net/udpgro_bench.sh b/tools/testing/selftests/net/udpgro_bench.sh
+index 820bc50f6b687..fad2d1a71cac3 100755
+--- a/tools/testing/selftests/net/udpgro_bench.sh
++++ b/tools/testing/selftests/net/udpgro_bench.sh
+@@ -34,7 +34,7 @@ run_one() {
+ ip -netns "${PEER_NS}" addr add dev veth1 2001:db8::1/64 nodad
+ ip -netns "${PEER_NS}" link set dev veth1 up
+
+- ip -n "${PEER_NS}" link set veth1 xdp object ../bpf/xdp_dummy.o section xdp_dummy
++ ip -n "${PEER_NS}" link set veth1 xdp object ../bpf/xdp_dummy.o section xdp
+ ip netns exec "${PEER_NS}" ./udpgso_bench_rx ${rx_args} -r &
+ ip netns exec "${PEER_NS}" ./udpgso_bench_rx -t ${rx_args} -r &
+
+diff --git a/tools/testing/selftests/net/udpgro_frglist.sh b/tools/testing/selftests/net/udpgro_frglist.sh
+index 807b74c8fd80f..832c738cc3c29 100755
+--- a/tools/testing/selftests/net/udpgro_frglist.sh
++++ b/tools/testing/selftests/net/udpgro_frglist.sh
+@@ -36,7 +36,7 @@ run_one() {
+ ip netns exec "${PEER_NS}" ethtool -K veth1 rx-gro-list on
+
+
+- ip -n "${PEER_NS}" link set veth1 xdp object ../bpf/xdp_dummy.o section xdp_dummy
++ ip -n "${PEER_NS}" link set veth1 xdp object ../bpf/xdp_dummy.o section xdp
+ tc -n "${PEER_NS}" qdisc add dev veth1 clsact
+ tc -n "${PEER_NS}" filter add dev veth1 ingress prio 4 protocol ipv6 bpf object-file ../bpf/nat6to4.o section schedcls/ingress6/nat_6 direct-action
+ tc -n "${PEER_NS}" filter add dev veth1 egress prio 4 protocol ip bpf object-file ../bpf/nat6to4.o section schedcls/egress4/snat4 direct-action
+diff --git a/tools/testing/selftests/net/udpgro_fwd.sh b/tools/testing/selftests/net/udpgro_fwd.sh
+index 6f05e06f67613..1bcd82e1f662e 100755
+--- a/tools/testing/selftests/net/udpgro_fwd.sh
++++ b/tools/testing/selftests/net/udpgro_fwd.sh
+@@ -46,7 +46,7 @@ create_ns() {
+ ip -n $BASE$ns addr add dev veth$ns $BM_NET_V4$ns/24
+ ip -n $BASE$ns addr add dev veth$ns $BM_NET_V6$ns/64 nodad
+ done
+- ip -n $NS_DST link set veth$DST xdp object ../bpf/xdp_dummy.o section xdp_dummy 2>/dev/null
++ ip -n $NS_DST link set veth$DST xdp object ../bpf/xdp_dummy.o section xdp 2>/dev/null
+ }
+
+ create_vxlan_endpoint() {
+diff --git a/tools/testing/selftests/net/veth.sh b/tools/testing/selftests/net/veth.sh
+index 19eac3e44c065..430895d1a2b63 100755
+--- a/tools/testing/selftests/net/veth.sh
++++ b/tools/testing/selftests/net/veth.sh
+@@ -289,14 +289,14 @@ if [ $CPUS -gt 1 ]; then
+ ip netns exec $NS_SRC ethtool -L veth$SRC rx 1 tx 2 2>/dev/null
+ printf "%-60s" "bad setting: XDP with RX nr less than TX"
+ ip -n $NS_DST link set dev veth$DST xdp object ../bpf/xdp_dummy.o \
+- section xdp_dummy 2>/dev/null &&\
++ section xdp 2>/dev/null &&\
+ echo "fail - set operation successful ?!?" || echo " ok "
+
+ # the following tests will run with multiple channels active
+ ip netns exec $NS_SRC ethtool -L veth$SRC rx 2
+ ip netns exec $NS_DST ethtool -L veth$DST rx 2
+ ip -n $NS_DST link set dev veth$DST xdp object ../bpf/xdp_dummy.o \
+- section xdp_dummy 2>/dev/null
++ section xdp 2>/dev/null
+ printf "%-60s" "bad setting: reducing RX nr below peer TX with XDP set"
+ ip netns exec $NS_DST ethtool -L veth$DST rx 1 2>/dev/null &&\
+ echo "fail - set operation successful ?!?" || echo " ok "
+@@ -311,7 +311,7 @@ if [ $CPUS -gt 2 ]; then
+ chk_channels "setting invalid channels nr" $DST 2 2
+ fi
+
+-ip -n $NS_DST link set dev veth$DST xdp object ../bpf/xdp_dummy.o section xdp_dummy 2>/dev/null
++ip -n $NS_DST link set dev veth$DST xdp object ../bpf/xdp_dummy.o section xdp 2>/dev/null
+ chk_gro_flag "with xdp attached - gro flag" $DST on
+ chk_gro_flag " - peer gro flag" $SRC off
+ chk_tso_flag " - tso flag" $SRC off
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-15 8:59 Alice Ferrazzi
0 siblings, 0 replies; 31+ messages in thread
From: Alice Ferrazzi @ 2022-07-15 8:59 UTC (permalink / raw
To: gentoo-commits
commit: ce587fbdb043f9ed6e28155265f2e2530d756a78
Author: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
AuthorDate: Fri Jul 15 08:56:36 2022 +0000
Commit: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
CommitDate: Fri Jul 15 08:57:31 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=ce587fbd
Add 1950_workaround_negprot_bug.patch
Fix incompatibility with older samba server
Signed-off-by: Alice Ferrazzi <alicef <AT> gentoo.org>
0000_README | 4 +++
1950_workaround_negprot_bug.patch | 56 +++++++++++++++++++++++++++++++++++++++
2 files changed, 60 insertions(+)
diff --git a/0000_README b/0000_README
index be72894f..764c25c9 100644
--- a/0000_README
+++ b/0000_README
@@ -99,6 +99,10 @@ Patch: 1700_sparc-address-warray-bound-warnings.patch
From: https://github.com/KSPP/linux/issues/109
Desc: Address -Warray-bounds warnings
+Patch: 1950_workaround_negprot_bug.patch
+From: https://patchwork.kernel.org/project/cifs-client/patch/CAH2r5mtuN-yswT5VTbNPzj02fwiHYOCe2eR8mcgRgRE8Qpkjgw@mail.gmail.com/
+Desc: Fix mount fail to older Samba servers
+
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1950_workaround_negprot_bug.patch b/1950_workaround_negprot_bug.patch
new file mode 100644
index 00000000..9a38ed83
--- /dev/null
+++ b/1950_workaround_negprot_bug.patch
@@ -0,0 +1,56 @@
+From a8d8532e4c335f0a31dd213abe4e31682f34647c Mon Sep 17 00:00:00 2001
+From: Steve French <stfrench@microsoft.com>
+Date: Tue, 12 Jul 2022 00:11:42 -0500
+Subject: [PATCH] smb3: workaround negprot bug in some Samba servers
+
+Mount can now fail to older Samba servers due to a server
+bug handling padding at the end of the last negotiate
+contexts (negotiate contexts typically round up to 8 byte
+lengths by adding padding if needed). This server bug can
+be avoided by switching the order of negotiate contexts,
+placing a negotiate context at the end that does not
+require padding (prior to the recent netname context fix
+this was the case on the client).
+
+Fixes: 73130a7b1ac9 ("smb3: fix empty netname context on secondary channels")
+Reported-by: Julian Sikorski <belegdol@gmail.com>
+Signed-off-by: Steve French <stfrench@microsoft.com>
+---
+ fs/cifs/smb2pdu.c | 13 +++++++------
+ 1 file changed, 7 insertions(+), 6 deletions(-)
+
+diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
+index 12b4dddaedb0..c705de32e225 100644
+--- a/fs/cifs/smb2pdu.c
++++ b/fs/cifs/smb2pdu.c
+@@ -571,10 +571,6 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
+ *total_len += ctxt_len;
+ pneg_ctxt += ctxt_len;
+
+- build_posix_ctxt((struct smb2_posix_neg_context *)pneg_ctxt);
+- *total_len += sizeof(struct smb2_posix_neg_context);
+- pneg_ctxt += sizeof(struct smb2_posix_neg_context);
+-
+ /*
+ * secondary channels don't have the hostname field populated
+ * use the hostname field in the primary channel instead
+@@ -586,9 +582,14 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
+ hostname);
+ *total_len += ctxt_len;
+ pneg_ctxt += ctxt_len;
+- neg_context_count = 4;
+- } else /* second channels do not have a hostname */
+ neg_context_count = 3;
++ } else
++ neg_context_count = 2;
++
++ build_posix_ctxt((struct smb2_posix_neg_context *)pneg_ctxt);
++ *total_len += sizeof(struct smb2_posix_neg_context);
++ pneg_ctxt += sizeof(struct smb2_posix_neg_context);
++ neg_context_count++;
+
+ if (server->compress_algorithm) {
+ build_compression_ctxt((struct smb2_compression_capabilities_context *)
+--
+2.34.1
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-15 10:10 Alice Ferrazzi
0 siblings, 0 replies; 31+ messages in thread
From: Alice Ferrazzi @ 2022-07-15 10:10 UTC (permalink / raw
To: gentoo-commits
commit: ea1cc0f1b2e15fa74a13ebd795a6b677d4af9fc1
Author: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
AuthorDate: Fri Jul 15 10:09:28 2022 +0000
Commit: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
CommitDate: Fri Jul 15 10:09:35 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=ea1cc0f1
Linux patch 5.18.12
Signed-off-by: Alice Ferrazzi <alicef <AT> gentoo.org>
0000_README | 4 ++++
1011_linux-5.18.12.patch | 26 ++++++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/0000_README b/0000_README
index 764c25c9..2ad9190a 100644
--- a/0000_README
+++ b/0000_README
@@ -87,6 +87,10 @@ Patch: 1010_linux-5.18.11.patch
From: http://www.kernel.org
Desc: Linux 5.18.11
+Patch: 1011_linux-5.18.12.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.12
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1011_linux-5.18.12.patch b/1011_linux-5.18.12.patch
new file mode 100644
index 00000000..33801d1a
--- /dev/null
+++ b/1011_linux-5.18.12.patch
@@ -0,0 +1,26 @@
+diff --git a/Makefile b/Makefile
+index 323032d60ac34..f8e2445b60447 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 11
++SUBLEVEL = 12
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+index 375529b7d12e3..44b14c9dc9a73 100644
+--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
++++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+@@ -695,7 +695,7 @@ static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
+ hw->timing0 = BF_GPMI_TIMING0_ADDRESS_SETUP(addr_setup_cycles) |
+ BF_GPMI_TIMING0_DATA_HOLD(data_hold_cycles) |
+ BF_GPMI_TIMING0_DATA_SETUP(data_setup_cycles);
+- hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(DIV_ROUND_UP(busy_timeout_cycles, 4096));
++ hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(busy_timeout_cycles * 4096);
+
+ /*
+ * Derive NFC ideal delay from {3}:
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-22 11:06 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-07-22 11:06 UTC (permalink / raw
To: gentoo-commits
commit: 461dcdbb439f225db2fe828a491ce5946acad2b1
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Fri Jul 22 11:06:30 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Fri Jul 22 11:06:30 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=461dcdbb
Linux patch 5.18.13
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1012_linux-5.18.13.patch | 8901 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 8905 insertions(+)
diff --git a/0000_README b/0000_README
index 2ad9190a..a7d6578d 100644
--- a/0000_README
+++ b/0000_README
@@ -91,6 +91,10 @@ Patch: 1011_linux-5.18.12.patch
From: http://www.kernel.org
Desc: Linux 5.18.12
+Patch: 1012_linux-5.18.13.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.13
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1012_linux-5.18.13.patch b/1012_linux-5.18.13.patch
new file mode 100644
index 00000000..04073fd9
--- /dev/null
+++ b/1012_linux-5.18.13.patch
@@ -0,0 +1,8901 @@
+diff --git a/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.yaml b/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.yaml
+index 2c81efb5fa370..47bb67d43ac29 100644
+--- a/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.yaml
++++ b/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.yaml
+@@ -25,12 +25,12 @@ properties:
+ - qcom,sc7280-lpass-cpu
+
+ reg:
+- minItems: 2
++ minItems: 1
+ maxItems: 6
+ description: LPAIF core registers
+
+ reg-names:
+- minItems: 2
++ minItems: 1
+ maxItems: 6
+
+ clocks:
+@@ -42,12 +42,12 @@ properties:
+ maxItems: 7
+
+ interrupts:
+- minItems: 2
++ minItems: 1
+ maxItems: 4
+ description: LPAIF DMA buffer interrupt
+
+ interrupt-names:
+- minItems: 2
++ minItems: 1
+ maxItems: 4
+
+ qcom,adsp:
+diff --git a/Documentation/driver-api/firmware/other_interfaces.rst b/Documentation/driver-api/firmware/other_interfaces.rst
+index b81794e0cfbb9..06ac89adaafba 100644
+--- a/Documentation/driver-api/firmware/other_interfaces.rst
++++ b/Documentation/driver-api/firmware/other_interfaces.rst
+@@ -13,6 +13,12 @@ EDD Interfaces
+ .. kernel-doc:: drivers/firmware/edd.c
+ :internal:
+
++Generic System Framebuffers Interface
++-------------------------------------
++
++.. kernel-doc:: drivers/firmware/sysfb.c
++ :export:
++
+ Intel Stratix10 SoC Service Layer
+ ---------------------------------
+ Some features of the Intel Stratix10 SoC require a level of privilege
+diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
+index 0483abcafcb01..0542358724f19 100644
+--- a/Documentation/filesystems/netfs_library.rst
++++ b/Documentation/filesystems/netfs_library.rst
+@@ -300,7 +300,7 @@ through which it can issue requests and negotiate::
+ void (*issue_read)(struct netfs_io_subrequest *subreq);
+ bool (*is_still_valid)(struct netfs_io_request *rreq);
+ int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
+- struct folio *folio, void **_fsdata);
++ struct folio **foliop, void **_fsdata);
+ void (*done)(struct netfs_io_request *rreq);
+ void (*cleanup)(struct address_space *mapping, void *netfs_priv);
+ };
+@@ -376,8 +376,10 @@ The operations are as follows:
+ allocated/grabbed the folio to be modified to allow the filesystem to flush
+ conflicting state before allowing it to be modified.
+
+- It should return 0 if everything is now fine, -EAGAIN if the folio should be
+- regrabbed and any other error code to abort the operation.
++ It may unlock and discard the folio it was given and set the caller's folio
++ pointer to NULL. It should return 0 if everything is now fine (``*foliop``
++ left set) or the op should be retried (``*foliop`` cleared) and any other
++ error code to abort the operation.
+
+ * ``done``
+
+diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
+index 66828293d9cb7..8899b474edbfd 100644
+--- a/Documentation/networking/ip-sysctl.rst
++++ b/Documentation/networking/ip-sysctl.rst
+@@ -1085,7 +1085,7 @@ cipso_cache_enable - BOOLEAN
+ cipso_cache_bucket_size - INTEGER
+ The CIPSO label cache consists of a fixed size hash table with each
+ hash bucket containing a number of cache entries. This variable limits
+- the number of entries in each hash bucket; the larger the value the
++ the number of entries in each hash bucket; the larger the value is, the
+ more CIPSO label mappings that can be cached. When the number of
+ entries in a given hash bucket reaches this limit adding new entries
+ causes the oldest entry in the bucket to be removed to make room.
+@@ -1179,7 +1179,7 @@ ip_autobind_reuse - BOOLEAN
+ option should only be set by experts.
+ Default: 0
+
+-ip_dynaddr - BOOLEAN
++ip_dynaddr - INTEGER
+ If set non-zero, enables support for dynamic addresses.
+ If set to a non-zero value larger than 1, a kernel log
+ message will be printed when dynamic address rewriting
+diff --git a/Makefile b/Makefile
+index f8e2445b60447..1f3c753cb28df 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 12
++SUBLEVEL = 13
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm/boot/dts/imx6qdl-ts7970.dtsi b/arch/arm/boot/dts/imx6qdl-ts7970.dtsi
+index fded07f370b39..d6ba4b2a60f6f 100644
+--- a/arch/arm/boot/dts/imx6qdl-ts7970.dtsi
++++ b/arch/arm/boot/dts/imx6qdl-ts7970.dtsi
+@@ -226,7 +226,7 @@
+ reg = <0x28>;
+ #gpio-cells = <2>;
+ gpio-controller;
+- ngpio = <32>;
++ ngpios = <62>;
+ };
+
+ sgtl5000: codec@a {
+diff --git a/arch/arm/boot/dts/sama5d2.dtsi b/arch/arm/boot/dts/sama5d2.dtsi
+index 89c71d419f82a..659a17fc755cf 100644
+--- a/arch/arm/boot/dts/sama5d2.dtsi
++++ b/arch/arm/boot/dts/sama5d2.dtsi
+@@ -1124,7 +1124,7 @@
+ clocks = <&pmc PMC_TYPE_PERIPHERAL 55>, <&pmc PMC_TYPE_GCK 55>;
+ clock-names = "pclk", "gclk";
+ assigned-clocks = <&pmc PMC_TYPE_CORE PMC_I2S1_MUX>;
+- assigned-parrents = <&pmc PMC_TYPE_GCK 55>;
++ assigned-clock-parents = <&pmc PMC_TYPE_GCK 55>;
+ status = "disabled";
+ };
+
+diff --git a/arch/arm/boot/dts/stm32mp151.dtsi b/arch/arm/boot/dts/stm32mp151.dtsi
+index 9c2bbf115f4cc..de4d651f95752 100644
+--- a/arch/arm/boot/dts/stm32mp151.dtsi
++++ b/arch/arm/boot/dts/stm32mp151.dtsi
+@@ -565,7 +565,7 @@
+ compatible = "st,stm32-cec";
+ reg = <0x40016000 0x400>;
+ interrupts = <GIC_SPI 94 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&rcc CEC_K>, <&clk_lse>;
++ clocks = <&rcc CEC_K>, <&rcc CEC>;
+ clock-names = "cec", "hdmi-cec";
+ status = "disabled";
+ };
+diff --git a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
+index f19ed981da9d9..3706216ffb40b 100644
+--- a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
++++ b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
+@@ -169,7 +169,7 @@
+ flash@0 {
+ #address-cells = <1>;
+ #size-cells = <1>;
+- compatible = "mxicy,mx25l1606e", "winbond,w25q128";
++ compatible = "mxicy,mx25l1606e", "jedec,spi-nor";
+ reg = <0>;
+ spi-max-frequency = <40000000>;
+ };
+diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
+index f1d0a7807cd0e..41536feb43921 100644
+--- a/arch/arm/include/asm/domain.h
++++ b/arch/arm/include/asm/domain.h
+@@ -112,19 +112,6 @@ static __always_inline void set_domain(unsigned int val)
+ }
+ #endif
+
+-#ifdef CONFIG_CPU_USE_DOMAINS
+-#define modify_domain(dom,type) \
+- do { \
+- unsigned int domain = get_domain(); \
+- domain &= ~domain_mask(dom); \
+- domain = domain | domain_val(dom, type); \
+- set_domain(domain); \
+- } while (0)
+-
+-#else
+-static inline void modify_domain(unsigned dom, unsigned type) { }
+-#endif
+-
+ /*
+ * Generate the T (user) versions of the LDR/STR and related
+ * instructions (inline assembly)
+diff --git a/arch/arm/include/asm/mach/map.h b/arch/arm/include/asm/mach/map.h
+index 92282558caf7c..2b8970d8e5a2f 100644
+--- a/arch/arm/include/asm/mach/map.h
++++ b/arch/arm/include/asm/mach/map.h
+@@ -27,6 +27,7 @@ enum {
+ MT_HIGH_VECTORS,
+ MT_MEMORY_RWX,
+ MT_MEMORY_RW,
++ MT_MEMORY_RO,
+ MT_ROM,
+ MT_MEMORY_RWX_NONCACHED,
+ MT_MEMORY_RW_DTCM,
+diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
+index 93051e2f402c8..1408a6a15d0e0 100644
+--- a/arch/arm/include/asm/ptrace.h
++++ b/arch/arm/include/asm/ptrace.h
+@@ -163,5 +163,31 @@ static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+ ((current_stack_pointer | (THREAD_SIZE - 1)) - 7) - 1; \
+ })
+
++
++/*
++ * Update ITSTATE after normal execution of an IT block instruction.
++ *
++ * The 8 IT state bits are split into two parts in CPSR:
++ * ITSTATE<1:0> are in CPSR<26:25>
++ * ITSTATE<7:2> are in CPSR<15:10>
++ */
++static inline unsigned long it_advance(unsigned long cpsr)
++{
++ if ((cpsr & 0x06000400) == 0) {
++ /* ITSTATE<2:0> == 0 means end of IT block, so clear IT state */
++ cpsr &= ~PSR_IT_MASK;
++ } else {
++ /* We need to shift left ITSTATE<4:0> */
++ const unsigned long mask = 0x06001c00; /* Mask ITSTATE<4:0> */
++ unsigned long it = cpsr & mask;
++ it <<= 1;
++ it |= it >> (27 - 10); /* Carry ITSTATE<2> to correct place */
++ it &= mask;
++ cpsr &= ~mask;
++ cpsr |= it;
++ }
++ return cpsr;
++}
++
+ #endif /* __ASSEMBLY__ */
+ #endif
+diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
+index d30ee26ccc870..6284914f43610 100644
+--- a/arch/arm/mm/Kconfig
++++ b/arch/arm/mm/Kconfig
+@@ -631,7 +631,11 @@ config CPU_USE_DOMAINS
+ bool
+ help
+ This option enables or disables the use of domain switching
+- via the set_fs() function.
++ using the DACR (domain access control register) to protect memory
++ domains from each other. In Linux we use three domains: kernel, user
++ and IO. The domains are used to protect userspace from kernelspace
++ and to handle IO-space as a special type of memory by assigning
++ manager or client roles to running code (such as a process).
+
+ config CPU_V7M_NUM_IRQ
+ int "Number of external interrupts connected to the NVIC"
+diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
+index 6f499559d1936..f8dd0b3cc8e04 100644
+--- a/arch/arm/mm/alignment.c
++++ b/arch/arm/mm/alignment.c
+@@ -935,6 +935,9 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
+ if (type == TYPE_LDST)
+ do_alignment_finish_ldst(addr, instr, regs, offset);
+
++ if (thumb_mode(regs))
++ regs->ARM_cpsr = it_advance(regs->ARM_cpsr);
++
+ return 0;
+
+ bad_or_fault:
+diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
+index 5e2be37a198e2..cd17e324aa51e 100644
+--- a/arch/arm/mm/mmu.c
++++ b/arch/arm/mm/mmu.c
+@@ -296,6 +296,13 @@ static struct mem_type mem_types[] __ro_after_init = {
+ .prot_sect = PMD_TYPE_SECT | PMD_SECT_AP_WRITE,
+ .domain = DOMAIN_KERNEL,
+ },
++ [MT_MEMORY_RO] = {
++ .prot_pte = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
++ L_PTE_XN | L_PTE_RDONLY,
++ .prot_l1 = PMD_TYPE_TABLE,
++ .prot_sect = PMD_TYPE_SECT,
++ .domain = DOMAIN_KERNEL,
++ },
+ [MT_ROM] = {
+ .prot_sect = PMD_TYPE_SECT,
+ .domain = DOMAIN_KERNEL,
+@@ -489,6 +496,7 @@ static void __init build_mem_type_table(void)
+
+ /* Also setup NX memory mapping */
+ mem_types[MT_MEMORY_RW].prot_sect |= PMD_SECT_XN;
++ mem_types[MT_MEMORY_RO].prot_sect |= PMD_SECT_XN;
+ }
+ if (cpu_arch >= CPU_ARCH_ARMv7 && (cr & CR_TRE)) {
+ /*
+@@ -568,6 +576,7 @@ static void __init build_mem_type_table(void)
+ mem_types[MT_ROM].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
+ mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
+ mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
++ mem_types[MT_MEMORY_RO].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
+ #endif
+
+ /*
+@@ -587,6 +596,8 @@ static void __init build_mem_type_table(void)
+ mem_types[MT_MEMORY_RWX].prot_pte |= L_PTE_SHARED;
+ mem_types[MT_MEMORY_RW].prot_sect |= PMD_SECT_S;
+ mem_types[MT_MEMORY_RW].prot_pte |= L_PTE_SHARED;
++ mem_types[MT_MEMORY_RO].prot_sect |= PMD_SECT_S;
++ mem_types[MT_MEMORY_RO].prot_pte |= L_PTE_SHARED;
+ mem_types[MT_MEMORY_DMA_READY].prot_pte |= L_PTE_SHARED;
+ mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |= PMD_SECT_S;
+ mem_types[MT_MEMORY_RWX_NONCACHED].prot_pte |= L_PTE_SHARED;
+@@ -647,6 +658,8 @@ static void __init build_mem_type_table(void)
+ mem_types[MT_MEMORY_RWX].prot_pte |= kern_pgprot;
+ mem_types[MT_MEMORY_RW].prot_sect |= ecc_mask | cp->pmd;
+ mem_types[MT_MEMORY_RW].prot_pte |= kern_pgprot;
++ mem_types[MT_MEMORY_RO].prot_sect |= ecc_mask | cp->pmd;
++ mem_types[MT_MEMORY_RO].prot_pte |= kern_pgprot;
+ mem_types[MT_MEMORY_DMA_READY].prot_pte |= kern_pgprot;
+ mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |= ecc_mask;
+ mem_types[MT_ROM].prot_sect |= cp->pmd;
+@@ -1360,7 +1373,7 @@ static void __init devicemaps_init(const struct machine_desc *mdesc)
+ map.pfn = __phys_to_pfn(__atags_pointer & SECTION_MASK);
+ map.virtual = FDT_FIXED_BASE;
+ map.length = FDT_FIXED_SIZE;
+- map.type = MT_ROM;
++ map.type = MT_MEMORY_RO;
+ create_mapping(&map);
+ }
+
+diff --git a/arch/arm/mm/proc-v7-bugs.c b/arch/arm/mm/proc-v7-bugs.c
+index fb9f3eb6bf483..8bc7a2d6d6c7f 100644
+--- a/arch/arm/mm/proc-v7-bugs.c
++++ b/arch/arm/mm/proc-v7-bugs.c
+@@ -108,8 +108,7 @@ static unsigned int spectre_v2_install_workaround(unsigned int method)
+ #else
+ static unsigned int spectre_v2_install_workaround(unsigned int method)
+ {
+- pr_info("CPU%u: Spectre V2: workarounds disabled by configuration\n",
+- smp_processor_id());
++ pr_info_once("Spectre V2: workarounds disabled by configuration\n");
+
+ return SPECTRE_VULNERABLE;
+ }
+@@ -209,10 +208,10 @@ static int spectre_bhb_install_workaround(int method)
+ return SPECTRE_VULNERABLE;
+
+ spectre_bhb_method = method;
+- }
+
+- pr_info("CPU%u: Spectre BHB: using %s workaround\n",
+- smp_processor_id(), spectre_bhb_method_name(method));
++ pr_info("CPU%u: Spectre BHB: enabling %s workaround for all CPUs\n",
++ smp_processor_id(), spectre_bhb_method_name(method));
++ }
+
+ return SPECTRE_MITIGATED;
+ }
+diff --git a/arch/arm/probes/decode.h b/arch/arm/probes/decode.h
+index 9731735989921..facc889d05eee 100644
+--- a/arch/arm/probes/decode.h
++++ b/arch/arm/probes/decode.h
+@@ -14,6 +14,7 @@
+ #include <linux/types.h>
+ #include <linux/stddef.h>
+ #include <asm/probes.h>
++#include <asm/ptrace.h>
+ #include <asm/kprobes.h>
+
+ void __init arm_probes_decode_init(void);
+@@ -35,31 +36,6 @@ void __init find_str_pc_offset(void);
+ #endif
+
+
+-/*
+- * Update ITSTATE after normal execution of an IT block instruction.
+- *
+- * The 8 IT state bits are split into two parts in CPSR:
+- * ITSTATE<1:0> are in CPSR<26:25>
+- * ITSTATE<7:2> are in CPSR<15:10>
+- */
+-static inline unsigned long it_advance(unsigned long cpsr)
+- {
+- if ((cpsr & 0x06000400) == 0) {
+- /* ITSTATE<2:0> == 0 means end of IT block, so clear IT state */
+- cpsr &= ~PSR_IT_MASK;
+- } else {
+- /* We need to shift left ITSTATE<4:0> */
+- const unsigned long mask = 0x06001c00; /* Mask ITSTATE<4:0> */
+- unsigned long it = cpsr & mask;
+- it <<= 1;
+- it |= it >> (27 - 10); /* Carry ITSTATE<2> to correct place */
+- it &= mask;
+- cpsr &= ~mask;
+- cpsr |= it;
+- }
+- return cpsr;
+-}
+-
+ static inline void __kprobes bx_write_pc(long pcv, struct pt_regs *regs)
+ {
+ long cpsr = regs->ARM_cpsr;
+diff --git a/arch/arm64/boot/dts/broadcom/bcm4908/bcm4906.dtsi b/arch/arm64/boot/dts/broadcom/bcm4908/bcm4906.dtsi
+index 66023d5535247..d084c33d5ca82 100644
+--- a/arch/arm64/boot/dts/broadcom/bcm4908/bcm4906.dtsi
++++ b/arch/arm64/boot/dts/broadcom/bcm4908/bcm4906.dtsi
+@@ -9,6 +9,14 @@
+ /delete-node/ cpu@3;
+ };
+
++ timer {
++ compatible = "arm,armv8-timer";
++ interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(2) | IRQ_TYPE_LEVEL_LOW)>,
++ <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(2) | IRQ_TYPE_LEVEL_LOW)>,
++ <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(2) | IRQ_TYPE_LEVEL_LOW)>,
++ <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(2) | IRQ_TYPE_LEVEL_LOW)>;
++ };
++
+ pmu {
+ compatible = "arm,cortex-a53-pmu";
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>,
+diff --git a/arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi b/arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi
+index a4be040a00c07..967d2cd3c3cee 100644
+--- a/arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi
++++ b/arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi
+@@ -29,6 +29,8 @@
+ device_type = "cpu";
+ compatible = "brcm,brahma-b53";
+ reg = <0x0>;
++ enable-method = "spin-table";
++ cpu-release-addr = <0x0 0xfff8>;
+ next-level-cache = <&l2>;
+ };
+
+diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
+index 088271d49139c..59b289b52a280 100644
+--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
++++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
+@@ -224,9 +224,12 @@
+ little-endian;
+ };
+
+- efuse@1e80000 {
++ sfp: efuse@1e80000 {
+ compatible = "fsl,ls1028a-sfp";
+ reg = <0x0 0x1e80000 0x0 0x10000>;
++ clocks = <&clockgen QORIQ_CLK_PLATFORM_PLL
++ QORIQ_CLK_PLL_DIV(4)>;
++ clock-names = "sfp";
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
+index 503f544d28e29..b0d36e430dbc4 100644
+--- a/arch/powerpc/sysdev/xive/spapr.c
++++ b/arch/powerpc/sysdev/xive/spapr.c
+@@ -13,6 +13,7 @@
+ #include <linux/of.h>
+ #include <linux/slab.h>
+ #include <linux/spinlock.h>
++#include <linux/bitmap.h>
+ #include <linux/cpumask.h>
+ #include <linux/mm.h>
+ #include <linux/delay.h>
+@@ -55,7 +56,7 @@ static int __init xive_irq_bitmap_add(int base, int count)
+ spin_lock_init(&xibm->lock);
+ xibm->base = base;
+ xibm->count = count;
+- xibm->bitmap = kzalloc(xibm->count, GFP_KERNEL);
++ xibm->bitmap = bitmap_zalloc(xibm->count, GFP_KERNEL);
+ if (!xibm->bitmap) {
+ kfree(xibm);
+ return -ENOMEM;
+@@ -73,7 +74,7 @@ static void xive_irq_bitmap_remove_all(void)
+
+ list_for_each_entry_safe(xibm, tmp, &xive_irq_bitmaps, list) {
+ list_del(&xibm->list);
+- kfree(xibm->bitmap);
++ bitmap_free(xibm->bitmap);
+ kfree(xibm);
+ }
+ }
+diff --git a/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi b/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi
+index f44fce1fe080c..2f75e39d2fdd1 100644
+--- a/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi
++++ b/arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi
+@@ -51,6 +51,7 @@
+ riscv,isa = "rv64imafdc";
+ clocks = <&clkcfg CLK_CPU>;
+ tlb-split;
++ next-level-cache = <&cctrllr>;
+ status = "okay";
+
+ cpu1_intc: interrupt-controller {
+@@ -78,6 +79,7 @@
+ riscv,isa = "rv64imafdc";
+ clocks = <&clkcfg CLK_CPU>;
+ tlb-split;
++ next-level-cache = <&cctrllr>;
+ status = "okay";
+
+ cpu2_intc: interrupt-controller {
+@@ -105,6 +107,7 @@
+ riscv,isa = "rv64imafdc";
+ clocks = <&clkcfg CLK_CPU>;
+ tlb-split;
++ next-level-cache = <&cctrllr>;
+ status = "okay";
+
+ cpu3_intc: interrupt-controller {
+@@ -132,6 +135,7 @@
+ riscv,isa = "rv64imafdc";
+ clocks = <&clkcfg CLK_CPU>;
+ tlb-split;
++ next-level-cache = <&cctrllr>;
+ status = "okay";
+ cpu4_intc: interrupt-controller {
+ #interrupt-cells = <1>;
+diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
+index 7461f964d20a9..3894777bfa87d 100644
+--- a/arch/riscv/kvm/vcpu.c
++++ b/arch/riscv/kvm/vcpu.c
+@@ -673,9 +673,11 @@ static void kvm_riscv_check_vcpu_requests(struct kvm_vcpu *vcpu)
+
+ if (kvm_request_pending(vcpu)) {
+ if (kvm_check_request(KVM_REQ_SLEEP, vcpu)) {
++ kvm_vcpu_srcu_read_unlock(vcpu);
+ rcuwait_wait_event(wait,
+ (!vcpu->arch.power_off) && (!vcpu->arch.pause),
+ TASK_INTERRUPTIBLE);
++ kvm_vcpu_srcu_read_lock(vcpu);
+
+ if (vcpu->arch.power_off || vcpu->arch.pause) {
+ /*
+diff --git a/arch/s390/Makefile b/arch/s390/Makefile
+index eba70d585cb2c..5b7e761b2d507 100644
+--- a/arch/s390/Makefile
++++ b/arch/s390/Makefile
+@@ -80,7 +80,7 @@ endif
+
+ ifdef CONFIG_EXPOLINE
+ ifdef CONFIG_EXPOLINE_EXTERN
+- KBUILD_LDFLAGS_MODULE += arch/s390/lib/expoline.o
++ KBUILD_LDFLAGS_MODULE += arch/s390/lib/expoline/expoline.o
+ CC_FLAGS_EXPOLINE := -mindirect-branch=thunk-extern
+ CC_FLAGS_EXPOLINE += -mfunction-return=thunk-extern
+ else
+@@ -162,6 +162,12 @@ vdso_prepare: prepare0
+ $(Q)$(MAKE) $(build)=arch/s390/kernel/vdso64 include/generated/vdso64-offsets.h
+ $(if $(CONFIG_COMPAT),$(Q)$(MAKE) \
+ $(build)=arch/s390/kernel/vdso32 include/generated/vdso32-offsets.h)
++
++ifdef CONFIG_EXPOLINE_EXTERN
++modules_prepare: expoline_prepare
++expoline_prepare: prepare0
++ $(Q)$(MAKE) $(build)=arch/s390/lib/expoline arch/s390/lib/expoline/expoline.o
++endif
+ endif
+
+ # Don't use tabs in echo arguments
+diff --git a/arch/s390/lib/Makefile b/arch/s390/lib/Makefile
+index 5d415b3db6d14..580d2e3265cb2 100644
+--- a/arch/s390/lib/Makefile
++++ b/arch/s390/lib/Makefile
+@@ -7,7 +7,6 @@ lib-y += delay.o string.o uaccess.o find.o spinlock.o
+ obj-y += mem.o xor.o
+ lib-$(CONFIG_KPROBES) += probes.o
+ lib-$(CONFIG_UPROBES) += probes.o
+-obj-$(CONFIG_EXPOLINE_EXTERN) += expoline.o
+ obj-$(CONFIG_S390_KPROBES_SANITY_TEST) += test_kprobes_s390.o
+ test_kprobes_s390-objs += test_kprobes_asm.o test_kprobes.o
+
+@@ -22,3 +21,5 @@ obj-$(CONFIG_S390_MODULES_SANITY_TEST) += test_modules.o
+ obj-$(CONFIG_S390_MODULES_SANITY_TEST_HELPERS) += test_modules_helpers.o
+
+ lib-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
++
++obj-$(CONFIG_EXPOLINE_EXTERN) += expoline/
+diff --git a/arch/s390/lib/expoline.S b/arch/s390/lib/expoline.S
+deleted file mode 100644
+index 92ed8409a7a44..0000000000000
+--- a/arch/s390/lib/expoline.S
++++ /dev/null
+@@ -1,12 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-
+-#include <asm/nospec-insn.h>
+-#include <linux/linkage.h>
+-
+-.macro GEN_ALL_BR_THUNK_EXTERN
+- .irp r1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
+- GEN_BR_THUNK_EXTERN %r\r1
+- .endr
+-.endm
+-
+-GEN_ALL_BR_THUNK_EXTERN
+diff --git a/arch/s390/lib/expoline/Makefile b/arch/s390/lib/expoline/Makefile
+new file mode 100644
+index 0000000000000..854631d9cb03a
+--- /dev/null
++++ b/arch/s390/lib/expoline/Makefile
+@@ -0,0 +1,3 @@
++# SPDX-License-Identifier: GPL-2.0
++
++obj-y += expoline.o
+diff --git a/arch/s390/lib/expoline/expoline.S b/arch/s390/lib/expoline/expoline.S
+new file mode 100644
+index 0000000000000..92ed8409a7a44
+--- /dev/null
++++ b/arch/s390/lib/expoline/expoline.S
+@@ -0,0 +1,12 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++
++#include <asm/nospec-insn.h>
++#include <linux/linkage.h>
++
++.macro GEN_ALL_BR_THUNK_EXTERN
++ .irp r1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
++ GEN_BR_THUNK_EXTERN %r\r1
++ .endr
++.endm
++
++GEN_ALL_BR_THUNK_EXTERN
+diff --git a/arch/sh/include/asm/io.h b/arch/sh/include/asm/io.h
+index cf9a3ec32406f..fba90e670ed41 100644
+--- a/arch/sh/include/asm/io.h
++++ b/arch/sh/include/asm/io.h
+@@ -271,8 +271,12 @@ static inline void __iomem *ioremap_prot(phys_addr_t offset, unsigned long size,
+ #endif /* CONFIG_HAVE_IOREMAP_PROT */
+
+ #else /* CONFIG_MMU */
+-#define iounmap(addr) do { } while (0)
+-#define ioremap(offset, size) ((void __iomem *)(unsigned long)(offset))
++static inline void __iomem *ioremap(phys_addr_t offset, size_t size)
++{
++ return (void __iomem *)(unsigned long)offset;
++}
++
++static inline void iounmap(volatile void __iomem *addr) { }
+ #endif /* CONFIG_MMU */
+
+ #define ioremap_uc ioremap
+diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
+index 896e48d45828c..bccc84de7ff26 100644
+--- a/arch/x86/include/asm/setup.h
++++ b/arch/x86/include/asm/setup.h
+@@ -132,6 +132,9 @@ void *extend_brk(size_t size, size_t align);
+ }
+
+ extern void probe_roms(void);
++
++void clear_bss(void);
++
+ #ifdef __i386__
+
+ asmlinkage void __init i386_start_kernel(void);
+diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
+index 3677df836e910..0961bf38bdb09 100644
+--- a/arch/x86/kernel/acpi/cppc.c
++++ b/arch/x86/kernel/acpi/cppc.c
+@@ -16,6 +16,12 @@ bool cpc_supported_by_cpu(void)
+ switch (boot_cpu_data.x86_vendor) {
+ case X86_VENDOR_AMD:
+ case X86_VENDOR_HYGON:
++ if (boot_cpu_data.x86 == 0x19 && ((boot_cpu_data.x86_model <= 0x0f) ||
++ (boot_cpu_data.x86_model >= 0x20 && boot_cpu_data.x86_model <= 0x2f)))
++ return true;
++ else if (boot_cpu_data.x86 == 0x17 &&
++ boot_cpu_data.x86_model >= 0x70 && boot_cpu_data.x86_model <= 0x7f)
++ return true;
+ return boot_cpu_has(X86_FEATURE_CPPC);
+ }
+ return false;
+diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
+index 4f5ecbbaae77c..92eae95f1a0b7 100644
+--- a/arch/x86/kernel/head64.c
++++ b/arch/x86/kernel/head64.c
+@@ -421,10 +421,12 @@ void __init do_early_exception(struct pt_regs *regs, int trapnr)
+
+ /* Don't add a printk in there. printk relies on the PDA which is not initialized
+ yet. */
+-static void __init clear_bss(void)
++void __init clear_bss(void)
+ {
+ memset(__bss_start, 0,
+ (unsigned long) __bss_stop - (unsigned long) __bss_start);
++ memset(__brk_base, 0,
++ (unsigned long) __brk_limit - (unsigned long) __brk_base);
+ }
+
+ static unsigned long get_cmd_line_ptr(void)
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 558d1f2ab5b49..828f5cf1af459 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -9074,15 +9074,17 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
+ */
+ static void kvm_pv_kick_cpu_op(struct kvm *kvm, int apicid)
+ {
+- struct kvm_lapic_irq lapic_irq;
+-
+- lapic_irq.shorthand = APIC_DEST_NOSHORT;
+- lapic_irq.dest_mode = APIC_DEST_PHYSICAL;
+- lapic_irq.level = 0;
+- lapic_irq.dest_id = apicid;
+- lapic_irq.msi_redir_hint = false;
++ /*
++ * All other fields are unused for APIC_DM_REMRD, but may be consumed by
++ * common code, e.g. for tracing. Defer initialization to the compiler.
++ */
++ struct kvm_lapic_irq lapic_irq = {
++ .delivery_mode = APIC_DM_REMRD,
++ .dest_mode = APIC_DEST_PHYSICAL,
++ .shorthand = APIC_DEST_NOSHORT,
++ .dest_id = apicid,
++ };
+
+- lapic_irq.delivery_mode = APIC_DM_REMRD;
+ kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
+ }
+
+diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
+index d8cfce221275e..57ba5502aecf9 100644
+--- a/arch/x86/mm/init.c
++++ b/arch/x86/mm/init.c
+@@ -77,10 +77,20 @@ static uint8_t __pte2cachemode_tbl[8] = {
+ [__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
+ };
+
+-/* Check that the write-protect PAT entry is set for write-protect */
++/*
++ * Check that the write-protect PAT entry is set for write-protect.
++ * To do this without making assumptions how PAT has been set up (Xen has
++ * another layout than the kernel), translate the _PAGE_CACHE_MODE_WP cache
++ * mode via the __cachemode2pte_tbl[] into protection bits (those protection
++ * bits will select a cache mode of WP or better), and then translate the
++ * protection bits back into the cache mode using __pte2cm_idx() and the
++ * __pte2cachemode_tbl[] array. This will return the really used cache mode.
++ */
+ bool x86_has_pat_wp(void)
+ {
+- return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
++ uint16_t prot = __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP];
++
++ return __pte2cachemode_tbl[__pte2cm_idx(prot)] == _PAGE_CACHE_MODE_WP;
+ }
+
+ enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
+diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
+index 5038edb79ad51..b55de4ad685ce 100644
+--- a/arch/x86/xen/enlighten_pv.c
++++ b/arch/x86/xen/enlighten_pv.c
+@@ -1183,15 +1183,19 @@ static void __init xen_domu_set_legacy_features(void)
+ extern void early_xen_iret_patch(void);
+
+ /* First C function to be called on Xen boot */
+-asmlinkage __visible void __init xen_start_kernel(void)
++asmlinkage __visible void __init xen_start_kernel(struct start_info *si)
+ {
+ struct physdev_set_iopl set_iopl;
+ unsigned long initrd_start = 0;
+ int rc;
+
+- if (!xen_start_info)
++ if (!si)
+ return;
+
++ clear_bss();
++
++ xen_start_info = si;
++
+ __text_gen_insn(&early_xen_iret_patch,
+ JMP32_INSN_OPCODE, &early_xen_iret_patch, &xen_iret,
+ JMP32_INSN_SIZE);
+diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
+index 3a2cd93bf0590..13af6fe453e3f 100644
+--- a/arch/x86/xen/xen-head.S
++++ b/arch/x86/xen/xen-head.S
+@@ -48,15 +48,6 @@ SYM_CODE_START(startup_xen)
+ ANNOTATE_NOENDBR
+ cld
+
+- /* Clear .bss */
+- xor %eax,%eax
+- mov $__bss_start, %rdi
+- mov $__bss_stop, %rcx
+- sub %rdi, %rcx
+- shr $3, %rcx
+- rep stosq
+-
+- mov %rsi, xen_start_info
+ mov initial_stack(%rip), %rsp
+
+ /* Set up %gs.
+@@ -71,6 +62,7 @@ SYM_CODE_START(startup_xen)
+ cdq
+ wrmsr
+
++ mov %rsi, %rdi
+ call xen_start_kernel
+ SYM_CODE_END(startup_xen)
+ __FINIT
+diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
+index e4ea42b83b512..3bd0de69aa115 100644
+--- a/drivers/acpi/acpi_video.c
++++ b/drivers/acpi/acpi_video.c
+@@ -73,7 +73,7 @@ module_param(device_id_scheme, bool, 0444);
+ static int only_lcd = -1;
+ module_param(only_lcd, int, 0444);
+
+-static bool has_backlight;
++static bool may_report_brightness_keys;
+ static int register_count;
+ static DEFINE_MUTEX(register_count_mutex);
+ static DEFINE_MUTEX(video_list_lock);
+@@ -1224,7 +1224,7 @@ acpi_video_bus_get_one_device(struct acpi_device *device,
+ acpi_video_device_find_cap(data);
+
+ if (data->cap._BCM && data->cap._BCL)
+- has_backlight = true;
++ may_report_brightness_keys = true;
+
+ mutex_lock(&video->device_list_lock);
+ list_add_tail(&data->entry, &video->video_device_list);
+@@ -1693,6 +1693,9 @@ static void acpi_video_device_notify(acpi_handle handle, u32 event, void *data)
+ break;
+ }
+
++ if (keycode)
++ may_report_brightness_keys = true;
++
+ acpi_notifier_call_chain(device, event, 0);
+
+ if (keycode && (report_key_events & REPORT_BRIGHTNESS_KEY_EVENTS)) {
+@@ -2254,7 +2257,7 @@ void acpi_video_unregister(void)
+ if (register_count) {
+ acpi_bus_unregister_driver(&acpi_video_bus);
+ register_count = 0;
+- has_backlight = false;
++ may_report_brightness_keys = false;
+ }
+ mutex_unlock(®ister_count_mutex);
+ }
+@@ -2276,7 +2279,7 @@ void acpi_video_unregister_backlight(void)
+
+ bool acpi_video_handles_brightness_key_presses(void)
+ {
+- return has_backlight &&
++ return may_report_brightness_keys &&
+ (report_key_events & REPORT_BRIGHTNESS_KEY_EVENTS);
+ }
+ EXPORT_SYMBOL(acpi_video_handles_brightness_key_presses);
+diff --git a/drivers/cpufreq/pmac32-cpufreq.c b/drivers/cpufreq/pmac32-cpufreq.c
+index 4f20c6a9108df..8e41fe9ee870d 100644
+--- a/drivers/cpufreq/pmac32-cpufreq.c
++++ b/drivers/cpufreq/pmac32-cpufreq.c
+@@ -470,6 +470,10 @@ static int pmac_cpufreq_init_MacRISC3(struct device_node *cpunode)
+ if (slew_done_gpio_np)
+ slew_done_gpio = read_gpio(slew_done_gpio_np);
+
++ of_node_put(volt_gpio_np);
++ of_node_put(freq_gpio_np);
++ of_node_put(slew_done_gpio_np);
++
+ /* If we use the frequency GPIOs, calculate the min/max speeds based
+ * on the bus frequencies
+ */
+diff --git a/drivers/firmware/sysfb.c b/drivers/firmware/sysfb.c
+index 2bfbb05f7d896..1f276f108cc93 100644
+--- a/drivers/firmware/sysfb.c
++++ b/drivers/firmware/sysfb.c
+@@ -34,21 +34,59 @@
+ #include <linux/screen_info.h>
+ #include <linux/sysfb.h>
+
++static struct platform_device *pd;
++static DEFINE_MUTEX(disable_lock);
++static bool disabled;
++
++static bool sysfb_unregister(void)
++{
++ if (IS_ERR_OR_NULL(pd))
++ return false;
++
++ platform_device_unregister(pd);
++ pd = NULL;
++
++ return true;
++}
++
++/**
++ * sysfb_disable() - disable the Generic System Framebuffers support
++ *
++ * This disables the registration of system framebuffer devices that match the
++ * generic drivers that make use of the system framebuffer set up by firmware.
++ *
++ * It also unregisters a device if this was already registered by sysfb_init().
++ *
++ * Context: The function can sleep. A @disable_lock mutex is acquired to serialize
++ * against sysfb_init(), that registers a system framebuffer device.
++ */
++void sysfb_disable(void)
++{
++ mutex_lock(&disable_lock);
++ sysfb_unregister();
++ disabled = true;
++ mutex_unlock(&disable_lock);
++}
++EXPORT_SYMBOL_GPL(sysfb_disable);
++
+ static __init int sysfb_init(void)
+ {
+ struct screen_info *si = &screen_info;
+ struct simplefb_platform_data mode;
+- struct platform_device *pd;
+ const char *name;
+ bool compatible;
+- int ret;
++ int ret = 0;
++
++ mutex_lock(&disable_lock);
++ if (disabled)
++ goto unlock_mutex;
+
+ /* try to create a simple-framebuffer device */
+ compatible = sysfb_parse_mode(si, &mode);
+ if (compatible) {
+- ret = sysfb_create_simplefb(si, &mode);
+- if (!ret)
+- return 0;
++ pd = sysfb_create_simplefb(si, &mode);
++ if (!IS_ERR(pd))
++ goto unlock_mutex;
+ }
+
+ /* if the FB is incompatible, create a legacy framebuffer device */
+@@ -60,8 +98,10 @@ static __init int sysfb_init(void)
+ name = "platform-framebuffer";
+
+ pd = platform_device_alloc(name, 0);
+- if (!pd)
+- return -ENOMEM;
++ if (!pd) {
++ ret = -ENOMEM;
++ goto unlock_mutex;
++ }
+
+ sysfb_apply_efi_quirks(pd);
+
+@@ -73,9 +113,11 @@ static __init int sysfb_init(void)
+ if (ret)
+ goto err;
+
+- return 0;
++ goto unlock_mutex;
+ err:
+ platform_device_put(pd);
++unlock_mutex:
++ mutex_unlock(&disable_lock);
+ return ret;
+ }
+
+diff --git a/drivers/firmware/sysfb_simplefb.c b/drivers/firmware/sysfb_simplefb.c
+index bda8712bfd8c5..a353e27f83f54 100644
+--- a/drivers/firmware/sysfb_simplefb.c
++++ b/drivers/firmware/sysfb_simplefb.c
+@@ -57,8 +57,8 @@ __init bool sysfb_parse_mode(const struct screen_info *si,
+ return false;
+ }
+
+-__init int sysfb_create_simplefb(const struct screen_info *si,
+- const struct simplefb_platform_data *mode)
++__init struct platform_device *sysfb_create_simplefb(const struct screen_info *si,
++ const struct simplefb_platform_data *mode)
+ {
+ struct platform_device *pd;
+ struct resource res;
+@@ -76,7 +76,7 @@ __init int sysfb_create_simplefb(const struct screen_info *si,
+ base |= (u64)si->ext_lfb_base << 32;
+ if (!base || (u64)(resource_size_t)base != base) {
+ printk(KERN_DEBUG "sysfb: inaccessible VRAM base\n");
+- return -EINVAL;
++ return ERR_PTR(-EINVAL);
+ }
+
+ /*
+@@ -93,7 +93,7 @@ __init int sysfb_create_simplefb(const struct screen_info *si,
+ length = mode->height * mode->stride;
+ if (length > size) {
+ printk(KERN_WARNING "sysfb: VRAM smaller than advertised\n");
+- return -EINVAL;
++ return ERR_PTR(-EINVAL);
+ }
+ length = PAGE_ALIGN(length);
+
+@@ -104,11 +104,11 @@ __init int sysfb_create_simplefb(const struct screen_info *si,
+ res.start = base;
+ res.end = res.start + length - 1;
+ if (res.end <= res.start)
+- return -EINVAL;
++ return ERR_PTR(-EINVAL);
+
+ pd = platform_device_alloc("simple-framebuffer", 0);
+ if (!pd)
+- return -ENOMEM;
++ return ERR_PTR(-ENOMEM);
+
+ sysfb_apply_efi_quirks(pd);
+
+@@ -124,10 +124,10 @@ __init int sysfb_create_simplefb(const struct screen_info *si,
+ if (ret)
+ goto err_put_device;
+
+- return 0;
++ return pd;
+
+ err_put_device:
+ platform_device_put(pd);
+
+- return ret;
++ return ERR_PTR(ret);
+ }
+diff --git a/drivers/gpio/gpio-sim.c b/drivers/gpio/gpio-sim.c
+index 98109839102fb..1020c2feb2496 100644
+--- a/drivers/gpio/gpio-sim.c
++++ b/drivers/gpio/gpio-sim.c
+@@ -991,28 +991,22 @@ static struct configfs_attribute *gpio_sim_device_config_attrs[] = {
+ };
+
+ struct gpio_sim_chip_name_ctx {
+- struct gpio_sim_device *dev;
++ struct fwnode_handle *swnode;
+ char *page;
+ };
+
+ static int gpio_sim_emit_chip_name(struct device *dev, void *data)
+ {
+ struct gpio_sim_chip_name_ctx *ctx = data;
+- struct fwnode_handle *swnode;
+- struct gpio_sim_bank *bank;
+
+ /* This would be the sysfs device exported in /sys/class/gpio. */
+ if (dev->class)
+ return 0;
+
+- swnode = dev_fwnode(dev);
++ if (device_match_fwnode(dev, ctx->swnode))
++ return sprintf(ctx->page, "%s\n", dev_name(dev));
+
+- list_for_each_entry(bank, &ctx->dev->bank_list, siblings) {
+- if (bank->swnode == swnode)
+- return sprintf(ctx->page, "%s\n", dev_name(dev));
+- }
+-
+- return -ENODATA;
++ return 0;
+ }
+
+ static ssize_t gpio_sim_bank_config_chip_name_show(struct config_item *item,
+@@ -1020,7 +1014,7 @@ static ssize_t gpio_sim_bank_config_chip_name_show(struct config_item *item,
+ {
+ struct gpio_sim_bank *bank = to_gpio_sim_bank(item);
+ struct gpio_sim_device *dev = gpio_sim_bank_get_device(bank);
+- struct gpio_sim_chip_name_ctx ctx = { dev, page };
++ struct gpio_sim_chip_name_ctx ctx = { bank->swnode, page };
+ int ret;
+
+ mutex_lock(&dev->lock);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+index fae5c1debfad3..ffb3702745a53 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+@@ -1547,6 +1547,21 @@ bool amdgpu_crtc_get_scanout_position(struct drm_crtc *crtc,
+ stime, etime, mode);
+ }
+
++static bool
++amdgpu_display_robj_is_fb(struct amdgpu_device *adev, struct amdgpu_bo *robj)
++{
++ struct drm_device *dev = adev_to_drm(adev);
++ struct drm_fb_helper *fb_helper = dev->fb_helper;
++
++ if (!fb_helper || !fb_helper->buffer)
++ return false;
++
++ if (gem_to_amdgpu_bo(fb_helper->buffer->gem) != robj)
++ return false;
++
++ return true;
++}
++
+ int amdgpu_display_suspend_helper(struct amdgpu_device *adev)
+ {
+ struct drm_device *dev = adev_to_drm(adev);
+@@ -1582,10 +1597,12 @@ int amdgpu_display_suspend_helper(struct amdgpu_device *adev)
+ continue;
+ }
+ robj = gem_to_amdgpu_bo(fb->obj[0]);
+- r = amdgpu_bo_reserve(robj, true);
+- if (r == 0) {
+- amdgpu_bo_unpin(robj);
+- amdgpu_bo_unreserve(robj);
++ if (!amdgpu_display_robj_is_fb(adev, robj)) {
++ r = amdgpu_bo_reserve(robj, true);
++ if (r == 0) {
++ amdgpu_bo_unpin(robj);
++ amdgpu_bo_unreserve(robj);
++ }
+ }
+ }
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+index 5224d9a39737f..842670d4a12e9 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+@@ -494,7 +494,8 @@ static int amdgpu_vkms_sw_init(void *handle)
+ adev_to_drm(adev)->mode_config.max_height = YRES_MAX;
+
+ adev_to_drm(adev)->mode_config.preferred_depth = 24;
+- adev_to_drm(adev)->mode_config.prefer_shadow = 1;
++ /* disable prefer shadow for now due to hibernation issues */
++ adev_to_drm(adev)->mode_config.prefer_shadow = 0;
+
+ adev_to_drm(adev)->mode_config.fb_base = adev->gmc.aper_base;
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+index 288fce7dc0ed1..9c964cd3b5d4e 100644
+--- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+@@ -2796,7 +2796,8 @@ static int dce_v10_0_sw_init(void *handle)
+ adev_to_drm(adev)->mode_config.max_height = 16384;
+
+ adev_to_drm(adev)->mode_config.preferred_depth = 24;
+- adev_to_drm(adev)->mode_config.prefer_shadow = 1;
++ /* disable prefer shadow for now due to hibernation issues */
++ adev_to_drm(adev)->mode_config.prefer_shadow = 0;
+
+ adev_to_drm(adev)->mode_config.fb_modifiers_not_supported = true;
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+index cbe5250b31cb4..e0ad9f27dc3f9 100644
+--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+@@ -2914,7 +2914,8 @@ static int dce_v11_0_sw_init(void *handle)
+ adev_to_drm(adev)->mode_config.max_height = 16384;
+
+ adev_to_drm(adev)->mode_config.preferred_depth = 24;
+- adev_to_drm(adev)->mode_config.prefer_shadow = 1;
++ /* disable prefer shadow for now due to hibernation issues */
++ adev_to_drm(adev)->mode_config.prefer_shadow = 0;
+
+ adev_to_drm(adev)->mode_config.fb_modifiers_not_supported = true;
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+index 982855e6cf52e..3caf6f386042f 100644
+--- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+@@ -2673,7 +2673,8 @@ static int dce_v6_0_sw_init(void *handle)
+ adev_to_drm(adev)->mode_config.max_width = 16384;
+ adev_to_drm(adev)->mode_config.max_height = 16384;
+ adev_to_drm(adev)->mode_config.preferred_depth = 24;
+- adev_to_drm(adev)->mode_config.prefer_shadow = 1;
++ /* disable prefer shadow for now due to hibernation issues */
++ adev_to_drm(adev)->mode_config.prefer_shadow = 0;
+ adev_to_drm(adev)->mode_config.fb_modifiers_not_supported = true;
+ adev_to_drm(adev)->mode_config.fb_base = adev->gmc.aper_base;
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
+index 84440741c60bc..7c75df5bffed3 100644
+--- a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
+@@ -2693,7 +2693,8 @@ static int dce_v8_0_sw_init(void *handle)
+ adev_to_drm(adev)->mode_config.max_height = 16384;
+
+ adev_to_drm(adev)->mode_config.preferred_depth = 24;
+- adev_to_drm(adev)->mode_config.prefer_shadow = 1;
++ /* disable prefer shadow for now due to hibernation issues */
++ adev_to_drm(adev)->mode_config.prefer_shadow = 0;
+
+ adev_to_drm(adev)->mode_config.fb_modifiers_not_supported = true;
+
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+index 651498bfecc8d..2059c3138410b 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+@@ -158,6 +158,8 @@ static void kfd_device_info_init(struct kfd_dev *kfd,
+ /* Navi2x+, Navi1x+ */
+ if (gc_version == IP_VERSION(10, 3, 6))
+ kfd->device_info.no_atomic_fw_version = 14;
++ else if (gc_version == IP_VERSION(10, 3, 7))
++ kfd->device_info.no_atomic_fw_version = 3;
+ else if (gc_version >= IP_VERSION(10, 3, 0))
+ kfd->device_info.no_atomic_fw_version = 92;
+ else if (gc_version >= IP_VERSION(10, 1, 1))
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 6dc9808760fc8..810965bd06921 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -72,6 +72,7 @@
+ #include <linux/pci.h>
+ #include <linux/firmware.h>
+ #include <linux/component.h>
++#include <linux/dmi.h>
+
+ #include <drm/drm_atomic.h>
+ #include <drm/drm_atomic_uapi.h>
+@@ -463,6 +464,26 @@ static void dm_pflip_high_irq(void *interrupt_params)
+ vrr_active, (int) !e);
+ }
+
++static void dm_crtc_handle_vblank(struct amdgpu_crtc *acrtc)
++{
++ struct drm_crtc *crtc = &acrtc->base;
++ struct drm_device *dev = crtc->dev;
++ unsigned long flags;
++
++ drm_crtc_handle_vblank(crtc);
++
++ spin_lock_irqsave(&dev->event_lock, flags);
++
++ /* Send completion event for cursor-only commits */
++ if (acrtc->event && acrtc->pflip_status != AMDGPU_FLIP_SUBMITTED) {
++ drm_crtc_send_vblank_event(crtc, acrtc->event);
++ drm_crtc_vblank_put(crtc);
++ acrtc->event = NULL;
++ }
++
++ spin_unlock_irqrestore(&dev->event_lock, flags);
++}
++
+ static void dm_vupdate_high_irq(void *interrupt_params)
+ {
+ struct common_irq_params *irq_params = interrupt_params;
+@@ -501,7 +522,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
+ * if a pageflip happened inside front-porch.
+ */
+ if (vrr_active) {
+- drm_crtc_handle_vblank(&acrtc->base);
++ dm_crtc_handle_vblank(acrtc);
+
+ /* BTR processing for pre-DCE12 ASICs */
+ if (acrtc->dm_irq_params.stream &&
+@@ -553,7 +574,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
+ * to dm_vupdate_high_irq after end of front-porch.
+ */
+ if (!vrr_active)
+- drm_crtc_handle_vblank(&acrtc->base);
++ dm_crtc_handle_vblank(acrtc);
+
+ /**
+ * Following stuff must happen at start of vblank, for crc
+@@ -1391,6 +1412,41 @@ static bool dm_should_disable_stutter(struct pci_dev *pdev)
+ return false;
+ }
+
++static const struct dmi_system_id hpd_disconnect_quirk_table[] = {
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3660"),
++ },
++ },
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3260"),
++ },
++ },
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3460"),
++ },
++ },
++ {}
++};
++
++static void retrieve_dmi_info(struct amdgpu_display_manager *dm)
++{
++ const struct dmi_system_id *dmi_id;
++
++ dm->aux_hpd_discon_quirk = false;
++
++ dmi_id = dmi_first_match(hpd_disconnect_quirk_table);
++ if (dmi_id) {
++ dm->aux_hpd_discon_quirk = true;
++ DRM_INFO("aux_hpd_discon_quirk attached\n");
++ }
++}
++
+ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ {
+ struct dc_init_data init_data;
+@@ -1521,6 +1577,9 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ }
+
+ INIT_LIST_HEAD(&adev->dm.da_list);
++
++ retrieve_dmi_info(&adev->dm);
++
+ /* Display Core create. */
+ adev->dm.dc = dc_create(&init_data);
+
+@@ -3847,7 +3906,8 @@ static int amdgpu_dm_mode_config_init(struct amdgpu_device *adev)
+ adev_to_drm(adev)->mode_config.max_height = 16384;
+
+ adev_to_drm(adev)->mode_config.preferred_depth = 24;
+- adev_to_drm(adev)->mode_config.prefer_shadow = 1;
++ /* disable prefer shadow for now due to hibernation issues */
++ adev_to_drm(adev)->mode_config.prefer_shadow = 0;
+ /* indicates support for immediate flip */
+ adev_to_drm(adev)->mode_config.async_page_flip = true;
+
+@@ -9159,6 +9219,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
+ struct amdgpu_bo *abo;
+ uint32_t target_vblank, last_flip_vblank;
+ bool vrr_active = amdgpu_dm_vrr_active(acrtc_state);
++ bool cursor_update = false;
+ bool pflip_present = false;
+ struct {
+ struct dc_surface_update surface_updates[MAX_SURFACES];
+@@ -9194,8 +9255,13 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
+ struct dm_plane_state *dm_new_plane_state = to_dm_plane_state(new_plane_state);
+
+ /* Cursor plane is handled after stream updates */
+- if (plane->type == DRM_PLANE_TYPE_CURSOR)
++ if (plane->type == DRM_PLANE_TYPE_CURSOR) {
++ if ((fb && crtc == pcrtc) ||
++ (old_plane_state->fb && old_plane_state->crtc == pcrtc))
++ cursor_update = true;
++
+ continue;
++ }
+
+ if (!fb || !crtc || pcrtc != crtc)
+ continue;
+@@ -9357,6 +9423,17 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
+ bundle->stream_update.vrr_infopacket =
+ &acrtc_state->stream->vrr_infopacket;
+ }
++ } else if (cursor_update && acrtc_state->active_planes > 0 &&
++ !acrtc_state->force_dpms_off &&
++ acrtc_attach->base.state->event) {
++ drm_crtc_vblank_get(pcrtc);
++
++ spin_lock_irqsave(&pcrtc->dev->event_lock, flags);
++
++ acrtc_attach->event = acrtc_attach->base.state->event;
++ acrtc_attach->base.state->event = NULL;
++
++ spin_unlock_irqrestore(&pcrtc->dev->event_lock, flags);
+ }
+
+ /* Update the planes if changed or disable if we don't have any. */
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+index 7e44b04294488..4844601a5f47a 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+@@ -546,6 +546,14 @@ struct amdgpu_display_manager {
+ * last successfully applied backlight values.
+ */
+ u32 actual_brightness[AMDGPU_DM_MAX_NUM_EDP];
++
++ /**
++ * @aux_hpd_discon_quirk:
++ *
++ * quirk for hpd discon while aux is on-going.
++ * occurred on certain intel platform
++ */
++ bool aux_hpd_discon_quirk;
+ };
+
+ enum dsc_clock_force_state {
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+index 31ac1fce36f84..d864cae1af67d 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+@@ -58,6 +58,8 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
+ ssize_t result = 0;
+ struct aux_payload payload;
+ enum aux_return_code_type operation_result;
++ struct amdgpu_device *adev;
++ struct ddc_service *ddc;
+
+ if (WARN_ON(msg->size > 16))
+ return -E2BIG;
+@@ -76,6 +78,21 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
+ result = dc_link_aux_transfer_raw(TO_DM_AUX(aux)->ddc_service, &payload,
+ &operation_result);
+
++ /*
++ * w/a on certain intel platform where hpd is unexpected to pull low during
++ * 1st sideband message transaction by return AUX_RET_ERROR_HPD_DISCON
++ * aux transaction is succuess in such case, therefore bypass the error
++ */
++ ddc = TO_DM_AUX(aux)->ddc_service;
++ adev = ddc->ctx->driver_context;
++ if (adev->dm.aux_hpd_discon_quirk) {
++ if (msg->address == DP_SIDEBAND_MSG_DOWN_REQ_BASE &&
++ operation_result == AUX_RET_ERROR_HPD_DISCON) {
++ result = 0;
++ operation_result = AUX_RET_SUCCESS;
++ }
++ }
++
+ if (payload.write && result >= 0)
+ result = msg->size;
+
+diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+index d251c3f3a7140..5cdbd2b8aa4d2 100644
+--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+@@ -1113,12 +1113,13 @@ bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
+ * on certain displays, such as the Sharp 4k. 36bpp is needed
+ * to support SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616 and
+ * SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616 with actual > 10 bpc
+- * precision on at least DCN display engines. However, at least
+- * Carrizo with DCE_VERSION_11_0 does not like 36 bpp lb depth,
+- * so use only 30 bpp on DCE_VERSION_11_0. Testing with DCE 11.2 and 8.3
+- * did not show such problems, so this seems to be the exception.
++ * precision on DCN display engines, but apparently not for DCE, as
++ * far as testing on DCE-11.2 and DCE-8 showed. Various DCE parts have
++ * problems: Carrizo with DCE_VERSION_11_0 does not like 36 bpp lb depth,
++ * neither do DCE-8 at 4k resolution, or DCE-11.2 (broken identify pixel
++ * passthrough). Therefore only use 36 bpp on DCN where it is actually needed.
+ */
+- if (plane_state->ctx->dce_version > DCE_VERSION_11_0)
++ if (plane_state->ctx->dce_version > DCE_VERSION_MAX)
+ pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_36BPP;
+ else
+ pipe_ctx->plane_res.scl_data.lb_params.depth = LB_PIXEL_DEPTH_30BPP;
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+index 5f8809f6990dd..2fbd2926a5310 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+@@ -1228,6 +1228,8 @@ int smu_v11_0_set_fan_speed_rpm(struct smu_context *smu,
+ uint32_t crystal_clock_freq = 2500;
+ uint32_t tach_period;
+
++ if (speed == 0)
++ return -EINVAL;
+ /*
+ * To prevent from possible overheat, some ASICs may have requirement
+ * for minimum fan speed:
+diff --git a/drivers/gpu/drm/drm_aperture.c b/drivers/gpu/drm/drm_aperture.c
+index 74bd4a76b253c..059fd71424f6b 100644
+--- a/drivers/gpu/drm/drm_aperture.c
++++ b/drivers/gpu/drm/drm_aperture.c
+@@ -329,7 +329,20 @@ int drm_aperture_remove_conflicting_pci_framebuffers(struct pci_dev *pdev,
+ const struct drm_driver *req_driver)
+ {
+ resource_size_t base, size;
+- int bar, ret = 0;
++ int bar, ret;
++
++ /*
++ * WARNING: Apparently we must kick fbdev drivers before vgacon,
++ * otherwise the vga fbdev driver falls over.
++ */
++#if IS_REACHABLE(CONFIG_FB)
++ ret = remove_conflicting_pci_framebuffers(pdev, req_driver->name);
++ if (ret)
++ return ret;
++#endif
++ ret = vga_remove_vgacon(pdev);
++ if (ret)
++ return ret;
+
+ for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) {
+ if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM))
+@@ -339,15 +352,6 @@ int drm_aperture_remove_conflicting_pci_framebuffers(struct pci_dev *pdev,
+ drm_aperture_detach_drivers(base, size);
+ }
+
+- /*
+- * WARNING: Apparently we must kick fbdev drivers before vgacon,
+- * otherwise the vga fbdev driver falls over.
+- */
+-#if IS_REACHABLE(CONFIG_FB)
+- ret = remove_conflicting_pci_framebuffers(pdev, req_driver->name);
+-#endif
+- if (ret == 0)
+- ret = vga_remove_vgacon(pdev);
+- return ret;
++ return 0;
+ }
+ EXPORT_SYMBOL(drm_aperture_remove_conflicting_pci_framebuffers);
+diff --git a/drivers/gpu/drm/i915/display/intel_dp_mst.c b/drivers/gpu/drm/i915/display/intel_dp_mst.c
+index e30e698aa6843..f7d46ea3afb96 100644
+--- a/drivers/gpu/drm/i915/display/intel_dp_mst.c
++++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c
+@@ -841,6 +841,7 @@ static struct drm_connector *intel_dp_add_mst_connector(struct drm_dp_mst_topolo
+ ret = drm_connector_init(dev, connector, &intel_dp_mst_connector_funcs,
+ DRM_MODE_CONNECTOR_DisplayPort);
+ if (ret) {
++ drm_dp_mst_put_port_malloc(port);
+ intel_connector_free(intel_connector);
+ return NULL;
+ }
+diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
+index 6cf94469d5a84..34cfa88ae0c01 100644
+--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
++++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
+@@ -57,6 +57,8 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
+ if (page_size)
+ default_page_size = page_size;
+
++ /* We should be able to fit a page within an sg entry */
++ GEM_BUG_ON(overflows_type(default_page_size, u32));
+ GEM_BUG_ON(!is_power_of_2_u64(default_page_size));
+ GEM_BUG_ON(default_page_size < PAGE_SIZE);
+
+diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+index 45cc5837ce001..6f393f32ac368 100644
+--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
++++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+@@ -583,10 +583,15 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
+ struct ttm_resource *res)
+ {
+ struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
++ u32 page_alignment;
+
+ if (!i915_ttm_gtt_binds_lmem(res))
+ return i915_ttm_tt_get_st(bo->ttm);
+
++ page_alignment = bo->page_alignment << PAGE_SHIFT;
++ if (!page_alignment)
++ page_alignment = obj->mm.region->min_page_size;
++
+ /*
+ * If CPU mapping differs, we need to add the ttm_tt pages to
+ * the resulting st. Might make sense for GGTT.
+@@ -597,7 +602,8 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
+ struct i915_refct_sgt *rsgt;
+
+ rsgt = intel_region_ttm_resource_to_rsgt(obj->mm.region,
+- res);
++ res,
++ page_alignment);
+ if (IS_ERR(rsgt))
+ return rsgt;
+
+@@ -606,7 +612,8 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
+ return i915_refct_sgt_get(obj->ttm.cached_io_rsgt);
+ }
+
+- return intel_region_ttm_resource_to_rsgt(obj->mm.region, res);
++ return intel_region_ttm_resource_to_rsgt(obj->mm.region, res,
++ page_alignment);
+ }
+
+ static int i915_ttm_truncate(struct drm_i915_gem_object *obj)
+diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
+index 8a2483ccbfb91..f4375479e6f0a 100644
+--- a/drivers/gpu/drm/i915/gt/intel_gt.c
++++ b/drivers/gpu/drm/i915/gt/intel_gt.c
+@@ -1012,6 +1012,20 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
+ mutex_lock(>->tlb_invalidate_lock);
+ intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
+
++ spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
++
++ for_each_engine(engine, gt, id) {
++ struct reg_and_bit rb;
++
++ rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
++ if (!i915_mmio_reg_offset(rb.reg))
++ continue;
++
++ intel_uncore_write_fw(uncore, rb.reg, rb.bit);
++ }
++
++ spin_unlock_irq(&uncore->lock);
++
+ for_each_engine(engine, gt, id) {
+ /*
+ * HW architecture suggest typical invalidation time at 40us,
+@@ -1026,7 +1040,6 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
+ if (!i915_mmio_reg_offset(rb.reg))
+ continue;
+
+- intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+ if (__intel_wait_for_register_fw(uncore,
+ rb.reg, rb.bit, 0,
+ timeout_us, timeout_ms,
+diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
+index b7c6d4462ec55..d57db66ac7ea0 100644
+--- a/drivers/gpu/drm/i915/gt/intel_reset.c
++++ b/drivers/gpu/drm/i915/gt/intel_reset.c
+@@ -299,9 +299,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
+ return err;
+ }
+
+-static int gen6_reset_engines(struct intel_gt *gt,
+- intel_engine_mask_t engine_mask,
+- unsigned int retry)
++static int __gen6_reset_engines(struct intel_gt *gt,
++ intel_engine_mask_t engine_mask,
++ unsigned int retry)
+ {
+ struct intel_engine_cs *engine;
+ u32 hw_mask;
+@@ -320,6 +320,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
+ return gen6_hw_domain_reset(gt, hw_mask);
+ }
+
++static int gen6_reset_engines(struct intel_gt *gt,
++ intel_engine_mask_t engine_mask,
++ unsigned int retry)
++{
++ unsigned long flags;
++ int ret;
++
++ spin_lock_irqsave(>->uncore->lock, flags);
++ ret = __gen6_reset_engines(gt, engine_mask, retry);
++ spin_unlock_irqrestore(>->uncore->lock, flags);
++
++ return ret;
++}
++
+ static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
+ {
+ int vecs_id;
+@@ -486,9 +500,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
+ rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
+ }
+
+-static int gen11_reset_engines(struct intel_gt *gt,
+- intel_engine_mask_t engine_mask,
+- unsigned int retry)
++static int __gen11_reset_engines(struct intel_gt *gt,
++ intel_engine_mask_t engine_mask,
++ unsigned int retry)
+ {
+ struct intel_engine_cs *engine;
+ intel_engine_mask_t tmp;
+@@ -582,8 +596,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
+ struct intel_engine_cs *engine;
+ const bool reset_non_ready = retry >= 1;
+ intel_engine_mask_t tmp;
++ unsigned long flags;
+ int ret;
+
++ spin_lock_irqsave(>->uncore->lock, flags);
++
+ for_each_engine_masked(engine, gt, engine_mask, tmp) {
+ ret = gen8_engine_reset_prepare(engine);
+ if (ret && !reset_non_ready)
+@@ -611,17 +628,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
+ * This is best effort, so ignore any error from the initial reset.
+ */
+ if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
+- gen11_reset_engines(gt, gt->info.engine_mask, 0);
++ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
+
+ if (GRAPHICS_VER(gt->i915) >= 11)
+- ret = gen11_reset_engines(gt, engine_mask, retry);
++ ret = __gen11_reset_engines(gt, engine_mask, retry);
+ else
+- ret = gen6_reset_engines(gt, engine_mask, retry);
++ ret = __gen6_reset_engines(gt, engine_mask, retry);
+
+ skip_reset:
+ for_each_engine_masked(engine, gt, engine_mask, tmp)
+ gen8_engine_reset_cancel(engine);
+
++ spin_unlock_irqrestore(>->uncore->lock, flags);
++
+ return ret;
+ }
+
+diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
+index 21c29d315cc0b..9d42a7c67a8c9 100644
+--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
++++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
+@@ -155,8 +155,8 @@ static int live_lrc_layout(void *arg)
+ continue;
+
+ hw = shmem_pin_map(engine->default_state);
+- if (IS_ERR(hw)) {
+- err = PTR_ERR(hw);
++ if (!hw) {
++ err = -ENOMEM;
+ break;
+ }
+ hw += LRC_STATE_OFFSET / sizeof(*hw);
+@@ -331,8 +331,8 @@ static int live_lrc_fixed(void *arg)
+ continue;
+
+ hw = shmem_pin_map(engine->default_state);
+- if (IS_ERR(hw)) {
+- err = PTR_ERR(hw);
++ if (!hw) {
++ err = -ENOMEM;
+ break;
+ }
+ hw += LRC_STATE_OFFSET / sizeof(*hw);
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+index 9b6fbad476465..097b0c8b85313 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
++++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+@@ -160,6 +160,15 @@ __uc_fw_auto_select(struct drm_i915_private *i915, struct intel_uc_fw *uc_fw)
+ u8 rev = INTEL_REVID(i915);
+ int i;
+
++ /*
++ * The only difference between the ADL GuC FWs is the HWConfig support.
++ * ADL-N does not support HWConfig, so we should use the same binary as
++ * ADL-S, otherwise the GuC might attempt to fetch a config table that
++ * does not exist.
++ */
++ if (IS_ADLP_N(i915))
++ p = INTEL_ALDERLAKE_S;
++
+ GEM_BUG_ON(uc_fw->type >= ARRAY_SIZE(blobs_all));
+ fw_blobs = blobs_all[uc_fw->type].blobs;
+ fw_count = blobs_all[uc_fw->type].count;
+diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
+index 2459213b6c87f..f49c1e8b8df74 100644
+--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
++++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
+@@ -3117,9 +3117,9 @@ void intel_gvt_update_reg_whitelist(struct intel_vgpu *vgpu)
+ continue;
+
+ vaddr = shmem_pin_map(engine->default_state);
+- if (IS_ERR(vaddr)) {
+- gvt_err("failed to map %s->default state, err:%zd\n",
+- engine->name, PTR_ERR(vaddr));
++ if (!vaddr) {
++ gvt_err("failed to map %s->default state\n",
++ engine->name);
+ return;
+ }
+
+diff --git a/drivers/gpu/drm/i915/i915_scatterlist.c b/drivers/gpu/drm/i915/i915_scatterlist.c
+index 159571b9bd247..dcc081874ec8d 100644
+--- a/drivers/gpu/drm/i915/i915_scatterlist.c
++++ b/drivers/gpu/drm/i915/i915_scatterlist.c
+@@ -68,6 +68,7 @@ void i915_refct_sgt_init(struct i915_refct_sgt *rsgt, size_t size)
+ * drm_mm_node
+ * @node: The drm_mm_node.
+ * @region_start: An offset to add to the dma addresses of the sg list.
++ * @page_alignment: Required page alignment for each sg entry. Power of two.
+ *
+ * Create a struct sg_table, initializing it from a struct drm_mm_node,
+ * taking a maximum segment length into account, splitting into segments
+@@ -77,22 +78,25 @@ void i915_refct_sgt_init(struct i915_refct_sgt *rsgt, size_t size)
+ * error code cast to an error pointer on failure.
+ */
+ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
+- u64 region_start)
++ u64 region_start,
++ u32 page_alignment)
+ {
+- const u64 max_segment = SZ_1G; /* Do we have a limit on this? */
+- u64 segment_pages = max_segment >> PAGE_SHIFT;
++ const u32 max_segment = round_down(UINT_MAX, page_alignment);
++ const u32 segment_pages = max_segment >> PAGE_SHIFT;
+ u64 block_size, offset, prev_end;
+ struct i915_refct_sgt *rsgt;
+ struct sg_table *st;
+ struct scatterlist *sg;
+
++ GEM_BUG_ON(!max_segment);
++
+ rsgt = kmalloc(sizeof(*rsgt), GFP_KERNEL);
+ if (!rsgt)
+ return ERR_PTR(-ENOMEM);
+
+ i915_refct_sgt_init(rsgt, node->size << PAGE_SHIFT);
+ st = &rsgt->table;
+- if (sg_alloc_table(st, DIV_ROUND_UP(node->size, segment_pages),
++ if (sg_alloc_table(st, DIV_ROUND_UP_ULL(node->size, segment_pages),
+ GFP_KERNEL)) {
+ i915_refct_sgt_put(rsgt);
+ return ERR_PTR(-ENOMEM);
+@@ -112,12 +116,14 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
+ sg = __sg_next(sg);
+
+ sg_dma_address(sg) = region_start + offset;
++ GEM_BUG_ON(!IS_ALIGNED(sg_dma_address(sg),
++ page_alignment));
+ sg_dma_len(sg) = 0;
+ sg->length = 0;
+ st->nents++;
+ }
+
+- len = min(block_size, max_segment - sg->length);
++ len = min_t(u64, block_size, max_segment - sg->length);
+ sg->length += len;
+ sg_dma_len(sg) += len;
+
+@@ -138,6 +144,7 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
+ * i915_buddy_block list
+ * @res: The struct i915_ttm_buddy_resource.
+ * @region_start: An offset to add to the dma addresses of the sg list.
++ * @page_alignment: Required page alignment for each sg entry. Power of two.
+ *
+ * Create a struct sg_table, initializing it from struct i915_buddy_block list,
+ * taking a maximum segment length into account, splitting into segments
+@@ -147,11 +154,12 @@ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
+ * error code cast to an error pointer on failure.
+ */
+ struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
+- u64 region_start)
++ u64 region_start,
++ u32 page_alignment)
+ {
+ struct i915_ttm_buddy_resource *bman_res = to_ttm_buddy_resource(res);
+ const u64 size = res->num_pages << PAGE_SHIFT;
+- const u64 max_segment = rounddown(UINT_MAX, PAGE_SIZE);
++ const u32 max_segment = round_down(UINT_MAX, page_alignment);
+ struct drm_buddy *mm = bman_res->mm;
+ struct list_head *blocks = &bman_res->blocks;
+ struct drm_buddy_block *block;
+@@ -161,6 +169,7 @@ struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
+ resource_size_t prev_end;
+
+ GEM_BUG_ON(list_empty(blocks));
++ GEM_BUG_ON(!max_segment);
+
+ rsgt = kmalloc(sizeof(*rsgt), GFP_KERNEL);
+ if (!rsgt)
+@@ -191,12 +200,14 @@ struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
+ sg = __sg_next(sg);
+
+ sg_dma_address(sg) = region_start + offset;
++ GEM_BUG_ON(!IS_ALIGNED(sg_dma_address(sg),
++ page_alignment));
+ sg_dma_len(sg) = 0;
+ sg->length = 0;
+ st->nents++;
+ }
+
+- len = min(block_size, max_segment - sg->length);
++ len = min_t(u64, block_size, max_segment - sg->length);
+ sg->length += len;
+ sg_dma_len(sg) += len;
+
+diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h b/drivers/gpu/drm/i915/i915_scatterlist.h
+index 12c6a16840814..9ddb3e743a3e5 100644
+--- a/drivers/gpu/drm/i915/i915_scatterlist.h
++++ b/drivers/gpu/drm/i915/i915_scatterlist.h
+@@ -213,9 +213,11 @@ static inline void __i915_refct_sgt_init(struct i915_refct_sgt *rsgt,
+ void i915_refct_sgt_init(struct i915_refct_sgt *rsgt, size_t size);
+
+ struct i915_refct_sgt *i915_rsgt_from_mm_node(const struct drm_mm_node *node,
+- u64 region_start);
++ u64 region_start,
++ u32 page_alignment);
+
+ struct i915_refct_sgt *i915_rsgt_from_buddy_resource(struct ttm_resource *res,
+- u64 region_start);
++ u64 region_start,
++ u32 page_alignment);
+
+ #endif
+diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c b/drivers/gpu/drm/i915/intel_region_ttm.c
+index 737ef3f4ab54e..d6bc83adef005 100644
+--- a/drivers/gpu/drm/i915/intel_region_ttm.c
++++ b/drivers/gpu/drm/i915/intel_region_ttm.c
+@@ -151,6 +151,7 @@ int intel_region_ttm_fini(struct intel_memory_region *mem)
+ * Convert an opaque TTM resource manager resource to a refcounted sg_table.
+ * @mem: The memory region.
+ * @res: The resource manager resource obtained from the TTM resource manager.
++ * @page_alignment: Required page alignment for each sg entry. Power of two.
+ *
+ * The gem backends typically use sg-tables for operations on the underlying
+ * io_memory. So provide a way for the backends to translate the
+@@ -160,16 +161,19 @@ int intel_region_ttm_fini(struct intel_memory_region *mem)
+ */
+ struct i915_refct_sgt *
+ intel_region_ttm_resource_to_rsgt(struct intel_memory_region *mem,
+- struct ttm_resource *res)
++ struct ttm_resource *res,
++ u32 page_alignment)
+ {
+ if (mem->is_range_manager) {
+ struct ttm_range_mgr_node *range_node =
+ to_ttm_range_mgr_node(res);
+
+ return i915_rsgt_from_mm_node(&range_node->mm_nodes[0],
+- mem->region.start);
++ mem->region.start,
++ page_alignment);
+ } else {
+- return i915_rsgt_from_buddy_resource(res, mem->region.start);
++ return i915_rsgt_from_buddy_resource(res, mem->region.start,
++ page_alignment);
+ }
+ }
+
+diff --git a/drivers/gpu/drm/i915/intel_region_ttm.h b/drivers/gpu/drm/i915/intel_region_ttm.h
+index fdee5e7bd46ca..b44183a771afa 100644
+--- a/drivers/gpu/drm/i915/intel_region_ttm.h
++++ b/drivers/gpu/drm/i915/intel_region_ttm.h
+@@ -24,7 +24,8 @@ int intel_region_ttm_fini(struct intel_memory_region *mem);
+
+ struct i915_refct_sgt *
+ intel_region_ttm_resource_to_rsgt(struct intel_memory_region *mem,
+- struct ttm_resource *res);
++ struct ttm_resource *res,
++ u32 page_alignment);
+
+ void intel_region_ttm_resource_free(struct intel_memory_region *mem,
+ struct ttm_resource *res);
+diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+index ab751192eb3bb..34d1ef0152335 100644
+--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
++++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+@@ -742,7 +742,7 @@ static int pot_hole(struct i915_address_space *vm,
+ u64 addr;
+
+ for (addr = round_up(hole_start + min_alignment, step) - min_alignment;
+- addr <= round_down(hole_end - (2 * min_alignment), step) - min_alignment;
++ hole_end > addr && hole_end - addr >= 2 * min_alignment;
+ addr += step) {
+ err = i915_vma_pin(vma, 0, 0, addr | flags);
+ if (err) {
+diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+index ba32893e0873c..0250a114fe0ac 100644
+--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
++++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+@@ -451,7 +451,6 @@ out_put:
+
+ static int igt_mock_max_segment(void *arg)
+ {
+- const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
+ struct intel_memory_region *mem = arg;
+ struct drm_i915_private *i915 = mem->i915;
+ struct i915_ttm_buddy_resource *res;
+@@ -460,7 +459,10 @@ static int igt_mock_max_segment(void *arg)
+ struct drm_buddy *mm;
+ struct list_head *blocks;
+ struct scatterlist *sg;
++ I915_RND_STATE(prng);
+ LIST_HEAD(objects);
++ unsigned int max_segment;
++ unsigned int ps;
+ u64 size;
+ int err = 0;
+
+@@ -472,7 +474,13 @@ static int igt_mock_max_segment(void *arg)
+ */
+
+ size = SZ_8G;
+- mem = mock_region_create(i915, 0, size, PAGE_SIZE, 0, 0);
++ ps = PAGE_SIZE;
++ if (i915_prandom_u64_state(&prng) & 1)
++ ps = SZ_64K; /* For something like DG2 */
++
++ max_segment = round_down(UINT_MAX, ps);
++
++ mem = mock_region_create(i915, 0, size, ps, 0, 0);
+ if (IS_ERR(mem))
+ return PTR_ERR(mem);
+
+@@ -498,12 +506,21 @@ static int igt_mock_max_segment(void *arg)
+ }
+
+ for (sg = obj->mm.pages->sgl; sg; sg = sg_next(sg)) {
++ dma_addr_t daddr = sg_dma_address(sg);
++
+ if (sg->length > max_segment) {
+ pr_err("%s: Created an oversized scatterlist entry, %u > %u\n",
+ __func__, sg->length, max_segment);
+ err = -EINVAL;
+ goto out_close;
+ }
++
++ if (!IS_ALIGNED(daddr, ps)) {
++ pr_err("%s: Created an unaligned scatterlist entry, addr=%pa, ps=%u\n",
++ __func__, &daddr, ps);
++ err = -EINVAL;
++ goto out_close;
++ }
+ }
+
+ out_close:
+diff --git a/drivers/gpu/drm/i915/selftests/mock_region.c b/drivers/gpu/drm/i915/selftests/mock_region.c
+index f64325491f352..6f7c9820d3e9d 100644
+--- a/drivers/gpu/drm/i915/selftests/mock_region.c
++++ b/drivers/gpu/drm/i915/selftests/mock_region.c
+@@ -32,7 +32,8 @@ static int mock_region_get_pages(struct drm_i915_gem_object *obj)
+ return PTR_ERR(obj->mm.res);
+
+ obj->mm.rsgt = intel_region_ttm_resource_to_rsgt(obj->mm.region,
+- obj->mm.res);
++ obj->mm.res,
++ obj->mm.region->min_page_size);
+ if (IS_ERR(obj->mm.rsgt)) {
+ err = PTR_ERR(obj->mm.rsgt);
+ goto err_free_resource;
+diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
+index 47780fe597f23..601e124842e17 100644
+--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
++++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
+@@ -432,8 +432,8 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
+
+ if (args->retained) {
+ if (args->madv == PANFROST_MADV_DONTNEED)
+- list_add_tail(&bo->base.madv_list,
+- &pfdev->shrinker_list);
++ list_move_tail(&bo->base.madv_list,
++ &pfdev->shrinker_list);
+ else if (args->madv == PANFROST_MADV_WILLNEED)
+ list_del_init(&bo->base.madv_list);
+ }
+diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
+index d3f82b26a631d..b285a8001b1d2 100644
+--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
++++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
+@@ -518,7 +518,7 @@ err_map:
+ err_pages:
+ drm_gem_shmem_put_pages(&bo->base);
+ err_bo:
+- drm_gem_object_put(&bo->base.base);
++ panfrost_gem_mapping_put(bomapping);
+ return ret;
+ }
+
+diff --git a/drivers/irqchip/irq-or1k-pic.c b/drivers/irqchip/irq-or1k-pic.c
+index 49b47e7876448..f289ccd952914 100644
+--- a/drivers/irqchip/irq-or1k-pic.c
++++ b/drivers/irqchip/irq-or1k-pic.c
+@@ -66,7 +66,6 @@ static struct or1k_pic_dev or1k_pic_level = {
+ .name = "or1k-PIC-level",
+ .irq_unmask = or1k_pic_unmask,
+ .irq_mask = or1k_pic_mask,
+- .irq_mask_ack = or1k_pic_mask_ack,
+ },
+ .handle = handle_level_irq,
+ .flags = IRQ_LEVEL | IRQ_NOPROBE,
+diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
+index 43f0c6a064ba1..75b4db4d050b9 100644
+--- a/drivers/net/can/xilinx_can.c
++++ b/drivers/net/can/xilinx_can.c
+@@ -259,7 +259,7 @@ static const struct can_bittiming_const xcan_bittiming_const_canfd2 = {
+ .tseg2_min = 1,
+ .tseg2_max = 128,
+ .sjw_max = 128,
+- .brp_min = 2,
++ .brp_min = 1,
+ .brp_max = 256,
+ .brp_inc = 1,
+ };
+@@ -272,7 +272,7 @@ static const struct can_bittiming_const xcan_data_bittiming_const_canfd2 = {
+ .tseg2_min = 1,
+ .tseg2_max = 16,
+ .sjw_max = 16,
+- .brp_min = 2,
++ .brp_min = 1,
+ .brp_max = 256,
+ .brp_inc = 1,
+ };
+diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
+index 831833911a525..8647125d60aef 100644
+--- a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
++++ b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
+@@ -379,7 +379,7 @@ static void aq_pci_shutdown(struct pci_dev *pdev)
+ }
+ }
+
+-static int aq_suspend_common(struct device *dev, bool deep)
++static int aq_suspend_common(struct device *dev)
+ {
+ struct aq_nic_s *nic = pci_get_drvdata(to_pci_dev(dev));
+
+@@ -392,17 +392,15 @@ static int aq_suspend_common(struct device *dev, bool deep)
+ if (netif_running(nic->ndev))
+ aq_nic_stop(nic);
+
+- if (deep) {
+- aq_nic_deinit(nic, !nic->aq_hw->aq_nic_cfg->wol);
+- aq_nic_set_power(nic);
+- }
++ aq_nic_deinit(nic, !nic->aq_hw->aq_nic_cfg->wol);
++ aq_nic_set_power(nic);
+
+ rtnl_unlock();
+
+ return 0;
+ }
+
+-static int atl_resume_common(struct device *dev, bool deep)
++static int atl_resume_common(struct device *dev)
+ {
+ struct pci_dev *pdev = to_pci_dev(dev);
+ struct aq_nic_s *nic;
+@@ -415,11 +413,6 @@ static int atl_resume_common(struct device *dev, bool deep)
+ pci_set_power_state(pdev, PCI_D0);
+ pci_restore_state(pdev);
+
+- if (deep) {
+- /* Reinitialize Nic/Vecs objects */
+- aq_nic_deinit(nic, !nic->aq_hw->aq_nic_cfg->wol);
+- }
+-
+ if (netif_running(nic->ndev)) {
+ ret = aq_nic_init(nic);
+ if (ret)
+@@ -444,22 +437,22 @@ err_exit:
+
+ static int aq_pm_freeze(struct device *dev)
+ {
+- return aq_suspend_common(dev, true);
++ return aq_suspend_common(dev);
+ }
+
+ static int aq_pm_suspend_poweroff(struct device *dev)
+ {
+- return aq_suspend_common(dev, true);
++ return aq_suspend_common(dev);
+ }
+
+ static int aq_pm_thaw(struct device *dev)
+ {
+- return atl_resume_common(dev, true);
++ return atl_resume_common(dev);
+ }
+
+ static int aq_pm_resume_restore(struct device *dev)
+ {
+- return atl_resume_common(dev, true);
++ return atl_resume_common(dev);
+ }
+
+ static const struct dev_pm_ops aq_pm_ops = {
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+index d5149478a3510..1ceccaed2da0b 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+@@ -7641,7 +7641,7 @@ hwrm_dbg_qcaps_exit:
+
+ static int bnxt_hwrm_queue_qportcfg(struct bnxt *bp);
+
+-static int bnxt_hwrm_func_qcaps(struct bnxt *bp)
++int bnxt_hwrm_func_qcaps(struct bnxt *bp)
+ {
+ int rc;
+
+@@ -9916,7 +9916,8 @@ static int bnxt_hwrm_if_change(struct bnxt *bp, bool up)
+
+ if (flags & FUNC_DRV_IF_CHANGE_RESP_FLAGS_RESC_CHANGE)
+ resc_reinit = true;
+- if (flags & FUNC_DRV_IF_CHANGE_RESP_FLAGS_HOT_FW_RESET_DONE)
++ if (flags & FUNC_DRV_IF_CHANGE_RESP_FLAGS_HOT_FW_RESET_DONE ||
++ test_bit(BNXT_STATE_FW_RESET_DET, &bp->state))
+ fw_reset = true;
+ else
+ bnxt_remap_fw_health_regs(bp);
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+index 98453a78cbd04..4c6ce2b2b3b78 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+@@ -2310,6 +2310,7 @@ int bnxt_cancel_reservations(struct bnxt *bp, bool fw_reset);
+ int bnxt_hwrm_alloc_wol_fltr(struct bnxt *bp);
+ int bnxt_hwrm_free_wol_fltr(struct bnxt *bp);
+ int bnxt_hwrm_func_resc_qcaps(struct bnxt *bp, bool all);
++int bnxt_hwrm_func_qcaps(struct bnxt *bp);
+ int bnxt_hwrm_fw_set_time(struct bnxt *);
+ int bnxt_open_nic(struct bnxt *, bool, bool);
+ int bnxt_half_open_nic(struct bnxt *bp);
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+index 0c17f90d44a25..3a9441fe4fd17 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+@@ -979,9 +979,11 @@ static int bnxt_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
+ if (rc)
+ return rc;
+
+- rc = bnxt_dl_livepatch_info_put(bp, req, BNXT_FW_SRT_PATCH);
+- if (rc)
+- return rc;
++ if (BNXT_CHIP_P5(bp)) {
++ rc = bnxt_dl_livepatch_info_put(bp, req, BNXT_FW_SRT_PATCH);
++ if (rc)
++ return rc;
++ }
+ return bnxt_dl_livepatch_info_put(bp, req, BNXT_FW_CRT_PATCH);
+
+ }
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
+index f9c94e5fe7187..3221911e25fed 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
+@@ -76,14 +76,23 @@ static int bnxt_refclk_read(struct bnxt *bp, struct ptp_system_timestamp *sts,
+ u64 *ns)
+ {
+ struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
++ u32 high_before, high_now, low;
+
+ if (test_bit(BNXT_STATE_IN_FW_RESET, &bp->state))
+ return -EIO;
+
++ high_before = readl(bp->bar0 + ptp->refclk_mapped_regs[1]);
+ ptp_read_system_prets(sts);
+- *ns = readl(bp->bar0 + ptp->refclk_mapped_regs[0]);
++ low = readl(bp->bar0 + ptp->refclk_mapped_regs[0]);
+ ptp_read_system_postts(sts);
+- *ns |= (u64)readl(bp->bar0 + ptp->refclk_mapped_regs[1]) << 32;
++ high_now = readl(bp->bar0 + ptp->refclk_mapped_regs[1]);
++ if (high_now != high_before) {
++ ptp_read_system_prets(sts);
++ low = readl(bp->bar0 + ptp->refclk_mapped_regs[0]);
++ ptp_read_system_postts(sts);
++ }
++ *ns = ((u64)high_now << 32) | low;
++
+ return 0;
+ }
+
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+index ddf2f3963abee..a1a2c7a64fd58 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+@@ -823,8 +823,10 @@ static int bnxt_sriov_enable(struct bnxt *bp, int *num_vfs)
+ goto err_out2;
+
+ rc = pci_enable_sriov(bp->pdev, *num_vfs);
+- if (rc)
++ if (rc) {
++ bnxt_ulp_sriov_cfg(bp, 0);
+ goto err_out2;
++ }
+
+ return 0;
+
+@@ -832,6 +834,9 @@ err_out2:
+ /* Free the resources reserved for various VF's */
+ bnxt_hwrm_func_vf_resource_free(bp, *num_vfs);
+
++ /* Restore the max resources */
++ bnxt_hwrm_func_qcaps(bp);
++
+ err_out1:
+ bnxt_free_vf_resources(bp);
+
+diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+index 4af5561cbfc54..7c760aa655404 100644
+--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
++++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+@@ -1392,7 +1392,7 @@ static void chtls_pass_accept_request(struct sock *sk,
+ th_ecn = tcph->ece && tcph->cwr;
+ if (th_ecn) {
+ ect = !INET_ECN_is_not_ect(ip_dsfield);
+- ecn_ok = sock_net(sk)->ipv4.sysctl_tcp_ecn;
++ ecn_ok = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn);
+ if ((!ect && ecn_ok) || tcp_ca_needs_ecn(sk))
+ inet_rsk(oreq)->ecn_ok = 1;
+ }
+diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
+index 5231818943c6e..c03663785a8d4 100644
+--- a/drivers/net/ethernet/faraday/ftgmac100.c
++++ b/drivers/net/ethernet/faraday/ftgmac100.c
+@@ -1764,6 +1764,19 @@ cleanup_clk:
+ return rc;
+ }
+
++static bool ftgmac100_has_child_node(struct device_node *np, const char *name)
++{
++ struct device_node *child_np = of_get_child_by_name(np, name);
++ bool ret = false;
++
++ if (child_np) {
++ ret = true;
++ of_node_put(child_np);
++ }
++
++ return ret;
++}
++
+ static int ftgmac100_probe(struct platform_device *pdev)
+ {
+ struct resource *res;
+@@ -1883,7 +1896,7 @@ static int ftgmac100_probe(struct platform_device *pdev)
+
+ /* Display what we found */
+ phy_attached_info(phy);
+- } else if (np && !of_get_child_by_name(np, "mdio")) {
++ } else if (np && !ftgmac100_has_child_node(np, "mdio")) {
+ /* Support legacy ASPEED devicetree descriptions that decribe a
+ * MAC with an embedded MDIO controller but have no "mdio"
+ * child node. Automatically scan the MDIO bus for available
+diff --git a/drivers/net/ethernet/intel/ice/ice_devids.h b/drivers/net/ethernet/intel/ice/ice_devids.h
+index 61dd2f18dee8e..b41bc3dc17454 100644
+--- a/drivers/net/ethernet/intel/ice/ice_devids.h
++++ b/drivers/net/ethernet/intel/ice/ice_devids.h
+@@ -5,6 +5,7 @@
+ #define _ICE_DEVIDS_H_
+
+ /* Device IDs */
++#define ICE_DEV_ID_E822_SI_DFLT 0x1888
+ /* Intel(R) Ethernet Connection E823-L for backplane */
+ #define ICE_DEV_ID_E823L_BACKPLANE 0x124C
+ /* Intel(R) Ethernet Connection E823-L for SFP */
+diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c
+index 4a9de59121d85..31836bbdf8133 100644
+--- a/drivers/net/ethernet/intel/ice/ice_devlink.c
++++ b/drivers/net/ethernet/intel/ice/ice_devlink.c
+@@ -792,6 +792,8 @@ void ice_devlink_destroy_vf_port(struct ice_vf *vf)
+ devlink_port_unregister(devlink_port);
+ }
+
++#define ICE_DEVLINK_READ_BLK_SIZE (1024 * 1024)
++
+ /**
+ * ice_devlink_nvm_snapshot - Capture a snapshot of the NVM flash contents
+ * @devlink: the devlink instance
+@@ -818,8 +820,9 @@ static int ice_devlink_nvm_snapshot(struct devlink *devlink,
+ struct ice_pf *pf = devlink_priv(devlink);
+ struct device *dev = ice_pf_to_dev(pf);
+ struct ice_hw *hw = &pf->hw;
+- void *nvm_data;
+- u32 nvm_size;
++ u8 *nvm_data, *tmp, i;
++ u32 nvm_size, left;
++ s8 num_blks;
+ int status;
+
+ nvm_size = hw->flash.flash_size;
+@@ -827,26 +830,44 @@ static int ice_devlink_nvm_snapshot(struct devlink *devlink,
+ if (!nvm_data)
+ return -ENOMEM;
+
+- status = ice_acquire_nvm(hw, ICE_RES_READ);
+- if (status) {
+- dev_dbg(dev, "ice_acquire_nvm failed, err %d aq_err %d\n",
+- status, hw->adminq.sq_last_status);
+- NL_SET_ERR_MSG_MOD(extack, "Failed to acquire NVM semaphore");
+- vfree(nvm_data);
+- return status;
+- }
+
+- status = ice_read_flat_nvm(hw, 0, &nvm_size, nvm_data, false);
+- if (status) {
+- dev_dbg(dev, "ice_read_flat_nvm failed after reading %u bytes, err %d aq_err %d\n",
+- nvm_size, status, hw->adminq.sq_last_status);
+- NL_SET_ERR_MSG_MOD(extack, "Failed to read NVM contents");
++ num_blks = DIV_ROUND_UP(nvm_size, ICE_DEVLINK_READ_BLK_SIZE);
++ tmp = nvm_data;
++ left = nvm_size;
++
++ /* Some systems take longer to read the NVM than others which causes the
++ * FW to reclaim the NVM lock before the entire NVM has been read. Fix
++ * this by breaking the reads of the NVM into smaller chunks that will
++ * probably not take as long. This has some overhead since we are
++ * increasing the number of AQ commands, but it should always work
++ */
++ for (i = 0; i < num_blks; i++) {
++ u32 read_sz = min_t(u32, ICE_DEVLINK_READ_BLK_SIZE, left);
++
++ status = ice_acquire_nvm(hw, ICE_RES_READ);
++ if (status) {
++ dev_dbg(dev, "ice_acquire_nvm failed, err %d aq_err %d\n",
++ status, hw->adminq.sq_last_status);
++ NL_SET_ERR_MSG_MOD(extack, "Failed to acquire NVM semaphore");
++ vfree(nvm_data);
++ return -EIO;
++ }
++
++ status = ice_read_flat_nvm(hw, i * ICE_DEVLINK_READ_BLK_SIZE,
++ &read_sz, tmp, false);
++ if (status) {
++ dev_dbg(dev, "ice_read_flat_nvm failed after reading %u bytes, err %d aq_err %d\n",
++ read_sz, status, hw->adminq.sq_last_status);
++ NL_SET_ERR_MSG_MOD(extack, "Failed to read NVM contents");
++ ice_release_nvm(hw);
++ vfree(nvm_data);
++ return -EIO;
++ }
+ ice_release_nvm(hw);
+- vfree(nvm_data);
+- return status;
+- }
+
+- ice_release_nvm(hw);
++ tmp += read_sz;
++ left -= read_sz;
++ }
+
+ *data = nvm_data;
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.c b/drivers/net/ethernet/intel/ice/ice_fw_update.c
+index 665a344fb9c06..3dc5662d62a6a 100644
+--- a/drivers/net/ethernet/intel/ice/ice_fw_update.c
++++ b/drivers/net/ethernet/intel/ice/ice_fw_update.c
+@@ -736,7 +736,87 @@ static int ice_finalize_update(struct pldmfw *context)
+ return 0;
+ }
+
+-static const struct pldmfw_ops ice_fwu_ops = {
++struct ice_pldm_pci_record_id {
++ u32 vendor;
++ u32 device;
++ u32 subsystem_vendor;
++ u32 subsystem_device;
++};
++
++/**
++ * ice_op_pci_match_record - Check if a PCI device matches the record
++ * @context: PLDM fw update structure
++ * @record: list of records extracted from the PLDM image
++ *
++ * Determine if the PCI device associated with this device matches the record
++ * data provided.
++ *
++ * Searches the descriptor TLVs and extracts the relevant descriptor data into
++ * a pldm_pci_record_id. This is then compared against the PCI device ID
++ * information.
++ *
++ * Returns: true if the device matches the record, false otherwise.
++ */
++static bool
++ice_op_pci_match_record(struct pldmfw *context, struct pldmfw_record *record)
++{
++ struct pci_dev *pdev = to_pci_dev(context->dev);
++ struct ice_pldm_pci_record_id id = {
++ .vendor = PCI_ANY_ID,
++ .device = PCI_ANY_ID,
++ .subsystem_vendor = PCI_ANY_ID,
++ .subsystem_device = PCI_ANY_ID,
++ };
++ struct pldmfw_desc_tlv *desc;
++
++ list_for_each_entry(desc, &record->descs, entry) {
++ u16 value;
++ int *ptr;
++
++ switch (desc->type) {
++ case PLDM_DESC_ID_PCI_VENDOR_ID:
++ ptr = &id.vendor;
++ break;
++ case PLDM_DESC_ID_PCI_DEVICE_ID:
++ ptr = &id.device;
++ break;
++ case PLDM_DESC_ID_PCI_SUBVENDOR_ID:
++ ptr = &id.subsystem_vendor;
++ break;
++ case PLDM_DESC_ID_PCI_SUBDEV_ID:
++ ptr = &id.subsystem_device;
++ break;
++ default:
++ /* Skip unrelated TLVs */
++ continue;
++ }
++
++ value = get_unaligned_le16(desc->data);
++ /* A value of zero for one of the descriptors is sometimes
++ * used when the record should ignore this field when matching
++ * device. For example if the record applies to any subsystem
++ * device or vendor.
++ */
++ if (value)
++ *ptr = value;
++ else
++ *ptr = PCI_ANY_ID;
++ }
++
++ /* the E822 device can have a generic device ID so check for that */
++ if ((id.vendor == PCI_ANY_ID || id.vendor == pdev->vendor) &&
++ (id.device == PCI_ANY_ID || id.device == pdev->device ||
++ id.device == ICE_DEV_ID_E822_SI_DFLT) &&
++ (id.subsystem_vendor == PCI_ANY_ID ||
++ id.subsystem_vendor == pdev->subsystem_vendor) &&
++ (id.subsystem_device == PCI_ANY_ID ||
++ id.subsystem_device == pdev->subsystem_device))
++ return true;
++
++ return false;
++}
++
++static const struct pldmfw_ops ice_fwu_ops_e810 = {
+ .match_record = &pldmfw_op_pci_match_record,
+ .send_package_data = &ice_send_package_data,
+ .send_component_table = &ice_send_component_table,
+@@ -744,6 +824,14 @@ static const struct pldmfw_ops ice_fwu_ops = {
+ .finalize_update = &ice_finalize_update,
+ };
+
++static const struct pldmfw_ops ice_fwu_ops_e822 = {
++ .match_record = &ice_op_pci_match_record,
++ .send_package_data = &ice_send_package_data,
++ .send_component_table = &ice_send_component_table,
++ .flash_component = &ice_flash_component,
++ .finalize_update = &ice_finalize_update,
++};
++
+ /**
+ * ice_get_pending_updates - Check if the component has a pending update
+ * @pf: the PF driver structure
+@@ -921,7 +1009,11 @@ int ice_devlink_flash_update(struct devlink *devlink,
+
+ memset(&priv, 0, sizeof(priv));
+
+- priv.context.ops = &ice_fwu_ops;
++ /* the E822 device needs a slightly different ops */
++ if (hw->mac_type == ICE_MAC_GENERIC)
++ priv.context.ops = &ice_fwu_ops_e822;
++ else
++ priv.context.ops = &ice_fwu_ops_e810;
+ priv.context.dev = dev;
+ priv.extack = extack;
+ priv.pf = pf;
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index d069b19f9bf7f..efb076f71e381 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -5397,6 +5397,7 @@ static const struct pci_device_id ice_pci_tbl[] = {
+ { PCI_VDEVICE(INTEL, ICE_DEV_ID_E823L_10G_BASE_T), 0 },
+ { PCI_VDEVICE(INTEL, ICE_DEV_ID_E823L_1GBE), 0 },
+ { PCI_VDEVICE(INTEL, ICE_DEV_ID_E823L_QSFP), 0 },
++ { PCI_VDEVICE(INTEL, ICE_DEV_ID_E822_SI_DFLT), 0 },
+ /* required last entry */
+ { 0, }
+ };
+diff --git a/drivers/net/ethernet/marvell/prestera/prestera_router.c b/drivers/net/ethernet/marvell/prestera/prestera_router.c
+index 6c5618cf4f08f..97d9012db189f 100644
+--- a/drivers/net/ethernet/marvell/prestera/prestera_router.c
++++ b/drivers/net/ethernet/marvell/prestera/prestera_router.c
+@@ -587,6 +587,7 @@ err_router_lib_init:
+
+ void prestera_router_fini(struct prestera_switch *sw)
+ {
++ unregister_fib_notifier(&init_net, &sw->router->fib_nb);
+ unregister_inetaddr_notifier(&sw->router->inetaddr_nb);
+ unregister_inetaddr_validator_notifier(&sw->router->inetaddr_valid_nb);
+ rhashtable_destroy(&sw->router->kern_fib_cache_ht);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+index 1ff7a07bcd06f..fbcce63e5b80e 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+@@ -66,6 +66,7 @@ struct mlx5_tc_ct_priv {
+ struct mlx5_ct_fs *fs;
+ struct mlx5_ct_fs_ops *fs_ops;
+ spinlock_t ht_lock; /* protects ft entries */
++ struct workqueue_struct *wq;
+ };
+
+ struct mlx5_ct_flow {
+@@ -927,14 +928,11 @@ static void mlx5_tc_ct_entry_del_work(struct work_struct *work)
+ static void
+ __mlx5_tc_ct_entry_put(struct mlx5_ct_entry *entry)
+ {
+- struct mlx5e_priv *priv;
+-
+ if (!refcount_dec_and_test(&entry->refcnt))
+ return;
+
+- priv = netdev_priv(entry->ct_priv->netdev);
+ INIT_WORK(&entry->work, mlx5_tc_ct_entry_del_work);
+- queue_work(priv->wq, &entry->work);
++ queue_work(entry->ct_priv->wq, &entry->work);
+ }
+
+ static struct mlx5_ct_counter *
+@@ -1744,19 +1742,16 @@ mlx5_tc_ct_flush_ft_entry(void *ptr, void *arg)
+ static void
+ mlx5_tc_ct_del_ft_cb(struct mlx5_tc_ct_priv *ct_priv, struct mlx5_ct_ft *ft)
+ {
+- struct mlx5e_priv *priv;
+-
+ if (!refcount_dec_and_test(&ft->refcount))
+ return;
+
++ flush_workqueue(ct_priv->wq);
+ nf_flow_table_offload_del_cb(ft->nf_ft,
+ mlx5_tc_ct_block_flow_offload, ft);
+ rhashtable_remove_fast(&ct_priv->zone_ht, &ft->node, zone_params);
+ rhashtable_free_and_destroy(&ft->ct_entries_ht,
+ mlx5_tc_ct_flush_ft_entry,
+ ct_priv);
+- priv = netdev_priv(ct_priv->netdev);
+- flush_workqueue(priv->wq);
+ mlx5_tc_ct_free_pre_ct_tables(ft);
+ mapping_remove(ct_priv->zone_mapping, ft->zone_restore_id);
+ kfree(ft);
+@@ -2139,6 +2134,12 @@ mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains,
+ if (rhashtable_init(&ct_priv->ct_tuples_nat_ht, &tuples_nat_ht_params))
+ goto err_ct_tuples_nat_ht;
+
++ ct_priv->wq = alloc_ordered_workqueue("mlx5e_ct_priv_wq", 0);
++ if (!ct_priv->wq) {
++ err = -ENOMEM;
++ goto err_wq;
++ }
++
+ err = mlx5_tc_ct_fs_init(ct_priv);
+ if (err)
+ goto err_init_fs;
+@@ -2146,6 +2147,8 @@ mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains,
+ return ct_priv;
+
+ err_init_fs:
++ destroy_workqueue(ct_priv->wq);
++err_wq:
+ rhashtable_destroy(&ct_priv->ct_tuples_nat_ht);
+ err_ct_tuples_nat_ht:
+ rhashtable_destroy(&ct_priv->ct_tuples_ht);
+@@ -2175,6 +2178,7 @@ mlx5_tc_ct_clean(struct mlx5_tc_ct_priv *ct_priv)
+ if (!ct_priv)
+ return;
+
++ destroy_workqueue(ct_priv->wq);
+ chains = ct_priv->chains;
+
+ ct_priv->fs_ops->destroy(ct_priv->fs);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+index 96064a2033f79..f3f2aeb1bc21d 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+@@ -231,8 +231,7 @@ mlx5e_set_ktls_rx_priv_ctx(struct tls_context *tls_ctx,
+ struct mlx5e_ktls_offload_context_rx **ctx =
+ __tls_driver_ctx(tls_ctx, TLS_OFFLOAD_CTX_DIR_RX);
+
+- BUILD_BUG_ON(sizeof(struct mlx5e_ktls_offload_context_rx *) >
+- TLS_OFFLOAD_CONTEXT_SIZE_RX);
++ BUILD_BUG_ON(sizeof(priv_rx) > TLS_DRIVER_STATE_SIZE_RX);
+
+ *ctx = priv_rx;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+index aaf11c66bf4c8..6f12764d8880e 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+@@ -68,8 +68,7 @@ mlx5e_set_ktls_tx_priv_ctx(struct tls_context *tls_ctx,
+ struct mlx5e_ktls_offload_context_tx **ctx =
+ __tls_driver_ctx(tls_ctx, TLS_OFFLOAD_CTX_DIR_TX);
+
+- BUILD_BUG_ON(sizeof(struct mlx5e_ktls_offload_context_tx *) >
+- TLS_OFFLOAD_CONTEXT_SIZE_TX);
++ BUILD_BUG_ON(sizeof(priv_tx) > TLS_DRIVER_STATE_SIZE_TX);
+
+ *ctx = priv_tx;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+index bdc870f9c2f3f..4429c848d4c45 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+@@ -688,7 +688,7 @@ static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vnic_env)
+ u32 in[MLX5_ST_SZ_DW(query_vnic_env_in)] = {};
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+- if (!MLX5_CAP_GEN(priv->mdev, nic_receive_steering_discard))
++ if (!mlx5e_stats_grp_vnic_env_num_stats(priv))
+ return;
+
+ MLX5_SET(query_vnic_env_in, in, opcode, MLX5_CMD_OP_QUERY_VNIC_ENV);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+index 2dc48406cd08d..54a3f866a3457 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+@@ -318,6 +318,26 @@ static void mlx5e_tx_check_stop(struct mlx5e_txqsq *sq)
+ }
+ }
+
++static void mlx5e_tx_flush(struct mlx5e_txqsq *sq)
++{
++ struct mlx5e_tx_wqe_info *wi;
++ struct mlx5e_tx_wqe *wqe;
++ u16 pi;
++
++ /* Must not be called when a MPWQE session is active but empty. */
++ mlx5e_tx_mpwqe_ensure_complete(sq);
++
++ pi = mlx5_wq_cyc_ctr2ix(&sq->wq, sq->pc);
++ wi = &sq->db.wqe_info[pi];
++
++ *wi = (struct mlx5e_tx_wqe_info) {
++ .num_wqebbs = 1,
++ };
++
++ wqe = mlx5e_post_nop(&sq->wq, sq->sqn, &sq->pc);
++ mlx5e_notify_hw(&sq->wq, sq->pc, sq->uar_map, &wqe->ctrl);
++}
++
+ static inline void
+ mlx5e_txwqe_complete(struct mlx5e_txqsq *sq, struct sk_buff *skb,
+ const struct mlx5e_tx_attr *attr,
+@@ -410,6 +430,7 @@ mlx5e_sq_xmit_wqe(struct mlx5e_txqsq *sq, struct sk_buff *skb,
+ err_drop:
+ stats->dropped++;
+ dev_kfree_skb_any(skb);
++ mlx5e_tx_flush(sq);
+ }
+
+ static bool mlx5e_tx_skb_supports_mpwqe(struct sk_buff *skb, struct mlx5e_tx_attr *attr)
+@@ -511,6 +532,13 @@ mlx5e_sq_xmit_mpwqe(struct mlx5e_txqsq *sq, struct sk_buff *skb,
+ struct mlx5_wqe_ctrl_seg *cseg;
+ struct mlx5e_xmit_data txd;
+
++ txd.data = skb->data;
++ txd.len = skb->len;
++
++ txd.dma_addr = dma_map_single(sq->pdev, txd.data, txd.len, DMA_TO_DEVICE);
++ if (unlikely(dma_mapping_error(sq->pdev, txd.dma_addr)))
++ goto err_unmap;
++
+ if (!mlx5e_tx_mpwqe_session_is_active(sq)) {
+ mlx5e_tx_mpwqe_session_start(sq, eseg);
+ } else if (!mlx5e_tx_mpwqe_same_eseg(sq, eseg)) {
+@@ -520,18 +548,9 @@ mlx5e_sq_xmit_mpwqe(struct mlx5e_txqsq *sq, struct sk_buff *skb,
+
+ sq->stats->xmit_more += xmit_more;
+
+- txd.data = skb->data;
+- txd.len = skb->len;
+-
+- txd.dma_addr = dma_map_single(sq->pdev, txd.data, txd.len, DMA_TO_DEVICE);
+- if (unlikely(dma_mapping_error(sq->pdev, txd.dma_addr)))
+- goto err_unmap;
+ mlx5e_dma_push(sq, txd.dma_addr, txd.len, MLX5E_DMA_MAP_SINGLE);
+-
+ mlx5e_skb_fifo_push(&sq->db.skb_fifo, skb);
+-
+ mlx5e_tx_mpwqe_add_dseg(sq, &txd);
+-
+ mlx5e_tx_skb_update_hwts_flags(skb);
+
+ if (unlikely(mlx5e_tx_mpwqe_is_full(&sq->mpwqe, sq->max_sq_mpw_wqebbs))) {
+@@ -553,6 +572,7 @@ err_unmap:
+ mlx5e_dma_unmap_wqe_err(sq, 1);
+ sq->stats->dropped++;
+ dev_kfree_skb_any(skb);
++ mlx5e_tx_flush(sq);
+ }
+
+ void mlx5e_tx_mpwqe_ensure_complete(struct mlx5e_txqsq *sq)
+@@ -935,5 +955,6 @@ void mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
+ err_drop:
+ stats->dropped++;
+ dev_kfree_skb_any(skb);
++ mlx5e_tx_flush(sq);
+ }
+ #endif
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/legacy.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/legacy.c
+index 9d17206d16251..fabe49a35a5c9 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/legacy.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/legacy.c
+@@ -11,6 +11,7 @@
+ #include "mlx5_core.h"
+ #include "eswitch.h"
+ #include "fs_core.h"
++#include "fs_ft_pool.h"
+ #include "esw/qos.h"
+
+ enum {
+@@ -95,8 +96,7 @@ static int esw_create_legacy_fdb_table(struct mlx5_eswitch *esw)
+ if (!flow_group_in)
+ return -ENOMEM;
+
+- table_size = BIT(MLX5_CAP_ESW_FLOWTABLE_FDB(dev, log_max_ft_size));
+- ft_attr.max_fte = table_size;
++ ft_attr.max_fte = POOL_NEXT_SIZE;
+ ft_attr.prio = LEGACY_FDB_PRIO;
+ fdb = mlx5_create_flow_table(root_ns, &ft_attr);
+ if (IS_ERR(fdb)) {
+@@ -105,6 +105,7 @@ static int esw_create_legacy_fdb_table(struct mlx5_eswitch *esw)
+ goto out;
+ }
+ esw->fdb_table.legacy.fdb = fdb;
++ table_size = fdb->max_fte;
+
+ /* Addresses group : Full match unicast/multicast addresses */
+ MLX5_SET(create_flow_group_in, flow_group_in, match_criteria_enable,
+diff --git a/drivers/net/ethernet/mscc/ocelot_fdma.c b/drivers/net/ethernet/mscc/ocelot_fdma.c
+index dffa597bffe6b..6a8f84f325a39 100644
+--- a/drivers/net/ethernet/mscc/ocelot_fdma.c
++++ b/drivers/net/ethernet/mscc/ocelot_fdma.c
+@@ -94,19 +94,18 @@ static void ocelot_fdma_activate_chan(struct ocelot *ocelot, dma_addr_t dma,
+ ocelot_fdma_writel(ocelot, MSCC_FDMA_CH_ACTIVATE, BIT(chan));
+ }
+
++static u32 ocelot_fdma_read_ch_safe(struct ocelot *ocelot)
++{
++ return ocelot_fdma_readl(ocelot, MSCC_FDMA_CH_SAFE);
++}
++
+ static int ocelot_fdma_wait_chan_safe(struct ocelot *ocelot, int chan)
+ {
+- unsigned long timeout;
+ u32 safe;
+
+- timeout = jiffies + usecs_to_jiffies(OCELOT_FDMA_CH_SAFE_TIMEOUT_US);
+- do {
+- safe = ocelot_fdma_readl(ocelot, MSCC_FDMA_CH_SAFE);
+- if (safe & BIT(chan))
+- return 0;
+- } while (time_after(jiffies, timeout));
+-
+- return -ETIMEDOUT;
++ return readx_poll_timeout_atomic(ocelot_fdma_read_ch_safe, ocelot, safe,
++ safe & BIT(chan), 0,
++ OCELOT_FDMA_CH_SAFE_TIMEOUT_US);
+ }
+
+ static void ocelot_fdma_dcb_set_data(struct ocelot_fdma_dcb *dcb,
+diff --git a/drivers/net/ethernet/netronome/nfp/nfdk/dp.c b/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
+index e509d6dcba5cb..805071d64a203 100644
+--- a/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
++++ b/drivers/net/ethernet/netronome/nfp/nfdk/dp.c
+@@ -125,17 +125,18 @@ nfp_nfdk_tx_csum(struct nfp_net_dp *dp, struct nfp_net_r_vector *r_vec,
+
+ static int
+ nfp_nfdk_tx_maybe_close_block(struct nfp_net_tx_ring *tx_ring,
+- unsigned int nr_frags, struct sk_buff *skb)
++ struct sk_buff *skb)
+ {
+ unsigned int n_descs, wr_p, nop_slots;
+ const skb_frag_t *frag, *fend;
+ struct nfp_nfdk_tx_desc *txd;
++ unsigned int nr_frags;
+ unsigned int wr_idx;
+ int err;
+
+ recount_descs:
+ n_descs = nfp_nfdk_headlen_to_segs(skb_headlen(skb));
+-
++ nr_frags = skb_shinfo(skb)->nr_frags;
+ frag = skb_shinfo(skb)->frags;
+ fend = frag + nr_frags;
+ for (; frag < fend; frag++)
+@@ -281,10 +282,13 @@ netdev_tx_t nfp_nfdk_tx(struct sk_buff *skb, struct net_device *netdev)
+ if (unlikely((int)metadata < 0))
+ goto err_flush;
+
+- nr_frags = skb_shinfo(skb)->nr_frags;
+- if (nfp_nfdk_tx_maybe_close_block(tx_ring, nr_frags, skb))
++ if (nfp_nfdk_tx_maybe_close_block(tx_ring, skb))
+ goto err_flush;
+
++ /* nr_frags will change after skb_linearize so we get nr_frags after
++ * nfp_nfdk_tx_maybe_close_block function
++ */
++ nr_frags = skb_shinfo(skb)->nr_frags;
+ /* DMA map all */
+ wr_idx = D_IDX(tx_ring, tx_ring->wr_p);
+ txd = &tx_ring->ktxds[wr_idx];
+@@ -310,7 +314,16 @@ netdev_tx_t nfp_nfdk_tx(struct sk_buff *skb, struct net_device *netdev)
+
+ /* FIELD_PREP() implicitly truncates to chunk */
+ dma_len -= 1;
+- dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN_HEAD, dma_len) |
++
++ /* We will do our best to pass as much data as we can in descriptor
++ * and we need to make sure the first descriptor includes whole head
++ * since there is limitation in firmware side. Sometimes the value of
++ * dma_len bitwise and NFDK_DESC_TX_DMA_LEN_HEAD will less than
++ * headlen.
++ */
++ dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN_HEAD,
++ dma_len > NFDK_DESC_TX_DMA_LEN_HEAD ?
++ NFDK_DESC_TX_DMA_LEN_HEAD : dma_len) |
+ FIELD_PREP(NFDK_DESC_TX_TYPE_HEAD, type);
+
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+@@ -925,7 +938,9 @@ nfp_nfdk_tx_xdp_buf(struct nfp_net_dp *dp, struct nfp_net_rx_ring *rx_ring,
+
+ /* FIELD_PREP() implicitly truncates to chunk */
+ dma_len -= 1;
+- dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN_HEAD, dma_len) |
++ dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN_HEAD,
++ dma_len > NFDK_DESC_TX_DMA_LEN_HEAD ?
++ NFDK_DESC_TX_DMA_LEN_HEAD : dma_len) |
+ FIELD_PREP(NFDK_DESC_TX_TYPE_HEAD, type);
+
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+@@ -1303,7 +1318,7 @@ nfp_nfdk_ctrl_tx_one(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
+ skb_push(skb, 4));
+ }
+
+- if (nfp_nfdk_tx_maybe_close_block(tx_ring, 0, skb))
++ if (nfp_nfdk_tx_maybe_close_block(tx_ring, skb))
+ goto err_free;
+
+ /* DMA map all */
+@@ -1328,7 +1343,9 @@ nfp_nfdk_ctrl_tx_one(struct nfp_net *nn, struct nfp_net_r_vector *r_vec,
+ txbuf++;
+
+ dma_len -= 1;
+- dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN_HEAD, dma_len) |
++ dlen_type = FIELD_PREP(NFDK_DESC_TX_DMA_LEN_HEAD,
++ dma_len > NFDK_DESC_TX_DMA_LEN_HEAD ?
++ NFDK_DESC_TX_DMA_LEN_HEAD : dma_len) |
+ FIELD_PREP(NFDK_DESC_TX_TYPE_HEAD, type);
+
+ txd->dma_len_type = cpu_to_le16(dlen_type);
+diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
+index 186cb28c03bdb..8b62ce21aff3d 100644
+--- a/drivers/net/ethernet/sfc/ef10.c
++++ b/drivers/net/ethernet/sfc/ef10.c
+@@ -1932,7 +1932,10 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
+
+ efx_update_sw_stats(efx, stats);
+ out:
++ /* releasing a DMA coherent buffer with BH disabled can panic */
++ spin_unlock_bh(&efx->stats_lock);
+ efx_nic_free_buffer(efx, &stats_buf);
++ spin_lock_bh(&efx->stats_lock);
+ return rc;
+ }
+
+diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
+index 7f5aa4a8c4512..92550c7e85ce2 100644
+--- a/drivers/net/ethernet/sfc/ef10_sriov.c
++++ b/drivers/net/ethernet/sfc/ef10_sriov.c
+@@ -408,8 +408,9 @@ fail1:
+ static int efx_ef10_pci_sriov_disable(struct efx_nic *efx, bool force)
+ {
+ struct pci_dev *dev = efx->pci_dev;
++ struct efx_ef10_nic_data *nic_data = efx->nic_data;
+ unsigned int vfs_assigned = pci_vfs_assigned(dev);
+- int rc = 0;
++ int i, rc = 0;
+
+ if (vfs_assigned && !force) {
+ netif_info(efx, drv, efx->net_dev, "VFs are assigned to guests; "
+@@ -417,10 +418,13 @@ static int efx_ef10_pci_sriov_disable(struct efx_nic *efx, bool force)
+ return -EBUSY;
+ }
+
+- if (!vfs_assigned)
++ if (!vfs_assigned) {
++ for (i = 0; i < efx->vf_count; i++)
++ nic_data->vf[i].pci_dev = NULL;
+ pci_disable_sriov(dev);
+- else
++ } else {
+ rc = -EBUSY;
++ }
+
+ efx_ef10_sriov_free_vf_vswitching(efx);
+ efx->vf_count = 0;
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-dwc-qos-eth.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-dwc-qos-eth.c
+index bc91fd867dcd4..358fc26f8d1fc 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-dwc-qos-eth.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-dwc-qos-eth.c
+@@ -361,6 +361,7 @@ bypass_clk_reset_gpio:
+ data->fix_mac_speed = tegra_eqos_fix_speed;
+ data->init = tegra_eqos_init;
+ data->bsp_priv = eqos;
++ data->sph_disable = 1;
+
+ err = tegra_eqos_init(pdev, eqos);
+ if (err < 0)
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-ingenic.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-ingenic.c
+index 9a6d819b84aea..378b4dd826bb5 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-ingenic.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-ingenic.c
+@@ -273,7 +273,8 @@ static int ingenic_mac_probe(struct platform_device *pdev)
+ mac->tx_delay = tx_delay_ps * 1000;
+ } else {
+ dev_err(&pdev->dev, "Invalid TX clock delay: %dps\n", tx_delay_ps);
+- return -EINVAL;
++ ret = -EINVAL;
++ goto err_remove_config_dt;
+ }
+ }
+
+@@ -283,7 +284,8 @@ static int ingenic_mac_probe(struct platform_device *pdev)
+ mac->rx_delay = rx_delay_ps * 1000;
+ } else {
+ dev_err(&pdev->dev, "Invalid RX clock delay: %dps\n", rx_delay_ps);
+- return -EINVAL;
++ ret = -EINVAL;
++ goto err_remove_config_dt;
+ }
+ }
+
+diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+index 6d978dbf708f2..2989530534074 100644
+--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
++++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+@@ -2475,7 +2475,6 @@ static int am65_cpsw_nuss_register_devlink(struct am65_cpsw_common *common)
+ port->port_id, ret);
+ goto dl_port_unreg;
+ }
+- devlink_port_type_eth_set(dl_port, port->ndev);
+ }
+ devlink_register(common->devlink);
+ return ret;
+@@ -2519,6 +2518,7 @@ static void am65_cpsw_unregister_devlink(struct am65_cpsw_common *common)
+ static int am65_cpsw_nuss_register_ndevs(struct am65_cpsw_common *common)
+ {
+ struct device *dev = common->dev;
++ struct devlink_port *dl_port;
+ struct am65_cpsw_port *port;
+ int ret = 0, i;
+
+@@ -2535,6 +2535,10 @@ static int am65_cpsw_nuss_register_ndevs(struct am65_cpsw_common *common)
+ return ret;
+ }
+
++ ret = am65_cpsw_nuss_register_devlink(common);
++ if (ret)
++ return ret;
++
+ for (i = 0; i < common->port_num; i++) {
+ port = &common->ports[i];
+
+@@ -2547,25 +2551,24 @@ static int am65_cpsw_nuss_register_ndevs(struct am65_cpsw_common *common)
+ i, ret);
+ goto err_cleanup_ndev;
+ }
++
++ dl_port = &port->devlink_port;
++ devlink_port_type_eth_set(dl_port, port->ndev);
+ }
+
+ ret = am65_cpsw_register_notifiers(common);
+ if (ret)
+ goto err_cleanup_ndev;
+
+- ret = am65_cpsw_nuss_register_devlink(common);
+- if (ret)
+- goto clean_unregister_notifiers;
+-
+ /* can't auto unregister ndev using devm_add_action() due to
+ * devres release sequence in DD core for DMA
+ */
+
+ return 0;
+-clean_unregister_notifiers:
+- am65_cpsw_unregister_notifiers(common);
++
+ err_cleanup_ndev:
+ am65_cpsw_nuss_cleanup_ndev(common);
++ am65_cpsw_unregister_devlink(common);
+
+ return ret;
+ }
+diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
+index 9a5d5a10560fb..e7b0e12cc75bf 100644
+--- a/drivers/net/phy/sfp.c
++++ b/drivers/net/phy/sfp.c
+@@ -2516,7 +2516,7 @@ static int sfp_probe(struct platform_device *pdev)
+
+ platform_set_drvdata(pdev, sfp);
+
+- err = devm_add_action(sfp->dev, sfp_cleanup, sfp);
++ err = devm_add_action_or_reset(sfp->dev, sfp_cleanup, sfp);
+ if (err < 0)
+ return err;
+
+diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
+index dbac4c03d21a1..a0335407be423 100644
+--- a/drivers/net/xen-netback/rx.c
++++ b/drivers/net/xen-netback/rx.c
+@@ -495,6 +495,7 @@ void xenvif_rx_action(struct xenvif_queue *queue)
+ queue->rx_copy.completed = &completed_skbs;
+
+ while (xenvif_rx_ring_slots_available(queue) &&
++ !skb_queue_empty(&queue->rx_queue) &&
+ work_done < RX_BATCH_SIZE) {
+ xenvif_rx_skb(queue);
+ work_done++;
+diff --git a/drivers/nfc/nxp-nci/i2c.c b/drivers/nfc/nxp-nci/i2c.c
+index e8f3b35afbee4..ae2ba08d8ac3f 100644
+--- a/drivers/nfc/nxp-nci/i2c.c
++++ b/drivers/nfc/nxp-nci/i2c.c
+@@ -122,7 +122,9 @@ static int nxp_nci_i2c_fw_read(struct nxp_nci_i2c_phy *phy,
+ skb_put_data(*skb, &header, NXP_NCI_FW_HDR_LEN);
+
+ r = i2c_master_recv(client, skb_put(*skb, frame_len), frame_len);
+- if (r != frame_len) {
++ if (r < 0) {
++ goto fw_read_exit_free_skb;
++ } else if (r != frame_len) {
+ nfc_err(&client->dev,
+ "Invalid frame length: %u (expected %zu)\n",
+ r, frame_len);
+@@ -166,7 +168,9 @@ static int nxp_nci_i2c_nci_read(struct nxp_nci_i2c_phy *phy,
+ return 0;
+
+ r = i2c_master_recv(client, skb_put(*skb, header.plen), header.plen);
+- if (r != header.plen) {
++ if (r < 0) {
++ goto nci_read_exit_free_skb;
++ } else if (r != header.plen) {
+ nfc_err(&client->dev,
+ "Invalid frame payload length: %u (expected %u)\n",
+ r, header.plen);
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index a2862a56fadc4..c9831daafbc61 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -3726,7 +3726,7 @@ static int nvme_add_ns_cdev(struct nvme_ns *ns)
+ }
+
+ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
+- unsigned nsid, struct nvme_ns_ids *ids)
++ unsigned nsid, struct nvme_ns_ids *ids, bool is_shared)
+ {
+ struct nvme_ns_head *head;
+ size_t size = sizeof(*head);
+@@ -3750,6 +3750,7 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
+ head->subsys = ctrl->subsys;
+ head->ns_id = nsid;
+ head->ids = *ids;
++ head->shared = is_shared;
+ kref_init(&head->ref);
+
+ if (head->ids.csi) {
+@@ -3830,12 +3831,11 @@ static int nvme_init_ns_head(struct nvme_ns *ns, unsigned nsid,
+ nsid);
+ goto out_unlock;
+ }
+- head = nvme_alloc_ns_head(ctrl, nsid, ids);
++ head = nvme_alloc_ns_head(ctrl, nsid, ids, is_shared);
+ if (IS_ERR(head)) {
+ ret = PTR_ERR(head);
+ goto out_unlock;
+ }
+- head->shared = is_shared;
+ } else {
+ ret = -EINVAL;
+ if (!is_shared || !head->shared) {
+@@ -4519,6 +4519,8 @@ void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
+ nvme_stop_failfast_work(ctrl);
+ flush_work(&ctrl->async_event_work);
+ cancel_work_sync(&ctrl->fw_act_work);
++ if (ctrl->ops->stop_ctrl)
++ ctrl->ops->stop_ctrl(ctrl);
+ }
+ EXPORT_SYMBOL_GPL(nvme_stop_ctrl);
+
+diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
+index a2b53ca633359..337ae1e3ad252 100644
+--- a/drivers/nvme/host/nvme.h
++++ b/drivers/nvme/host/nvme.h
+@@ -501,6 +501,7 @@ struct nvme_ctrl_ops {
+ void (*free_ctrl)(struct nvme_ctrl *ctrl);
+ void (*submit_async_event)(struct nvme_ctrl *ctrl);
+ void (*delete_ctrl)(struct nvme_ctrl *ctrl);
++ void (*stop_ctrl)(struct nvme_ctrl *ctrl);
+ int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
+ };
+
+diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
+index fe829377c7c2a..ab575fdd8015f 100644
+--- a/drivers/nvme/host/pci.c
++++ b/drivers/nvme/host/pci.c
+@@ -3432,7 +3432,8 @@ static const struct pci_device_id nvme_id_table[] = {
+ NVME_QUIRK_DISABLE_WRITE_ZEROES|
+ NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+ { PCI_DEVICE(0x1987, 0x5016), /* Phison E16 */
+- .driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
++ .driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN |
++ NVME_QUIRK_BOGUS_NID, },
+ { PCI_DEVICE(0x1b4b, 0x1092), /* Lexar 256 GB SSD */
+ .driver_data = NVME_QUIRK_NO_NS_DESC_LIST |
+ NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
+index d9f19d9013139..5aef2b81dbec6 100644
+--- a/drivers/nvme/host/rdma.c
++++ b/drivers/nvme/host/rdma.c
+@@ -1048,6 +1048,14 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
+ }
+ }
+
++static void nvme_rdma_stop_ctrl(struct nvme_ctrl *nctrl)
++{
++ struct nvme_rdma_ctrl *ctrl = to_rdma_ctrl(nctrl);
++
++ cancel_work_sync(&ctrl->err_work);
++ cancel_delayed_work_sync(&ctrl->reconnect_work);
++}
++
+ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
+ {
+ struct nvme_rdma_ctrl *ctrl = to_rdma_ctrl(nctrl);
+@@ -2255,9 +2263,6 @@ static const struct blk_mq_ops nvme_rdma_admin_mq_ops = {
+
+ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
+ {
+- cancel_work_sync(&ctrl->err_work);
+- cancel_delayed_work_sync(&ctrl->reconnect_work);
+-
+ nvme_rdma_teardown_io_queues(ctrl, shutdown);
+ nvme_stop_admin_queue(&ctrl->ctrl);
+ if (shutdown)
+@@ -2307,6 +2312,7 @@ static const struct nvme_ctrl_ops nvme_rdma_ctrl_ops = {
+ .submit_async_event = nvme_rdma_submit_async_event,
+ .delete_ctrl = nvme_rdma_delete_ctrl,
+ .get_address = nvmf_get_address,
++ .stop_ctrl = nvme_rdma_stop_ctrl,
+ };
+
+ /*
+diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
+index ad3a2bf2f1e9b..1fb4f9b1621ea 100644
+--- a/drivers/nvme/host/tcp.c
++++ b/drivers/nvme/host/tcp.c
+@@ -1180,8 +1180,7 @@ done:
+ } else if (ret < 0) {
+ dev_err(queue->ctrl->ctrl.device,
+ "failed to send request %d\n", ret);
+- if (ret != -EPIPE && ret != -ECONNRESET)
+- nvme_tcp_fail_request(queue->request);
++ nvme_tcp_fail_request(queue->request);
+ nvme_tcp_done_send_req(queue);
+ }
+ return ret;
+@@ -2194,9 +2193,6 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
+
+ static void nvme_tcp_teardown_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
+ {
+- cancel_work_sync(&to_tcp_ctrl(ctrl)->err_work);
+- cancel_delayed_work_sync(&to_tcp_ctrl(ctrl)->connect_work);
+-
+ nvme_tcp_teardown_io_queues(ctrl, shutdown);
+ nvme_stop_admin_queue(ctrl);
+ if (shutdown)
+@@ -2236,6 +2232,12 @@ out_fail:
+ nvme_tcp_reconnect_or_remove(ctrl);
+ }
+
++static void nvme_tcp_stop_ctrl(struct nvme_ctrl *ctrl)
++{
++ cancel_work_sync(&to_tcp_ctrl(ctrl)->err_work);
++ cancel_delayed_work_sync(&to_tcp_ctrl(ctrl)->connect_work);
++}
++
+ static void nvme_tcp_free_ctrl(struct nvme_ctrl *nctrl)
+ {
+ struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
+@@ -2560,6 +2562,7 @@ static const struct nvme_ctrl_ops nvme_tcp_ctrl_ops = {
+ .submit_async_event = nvme_tcp_submit_async_event,
+ .delete_ctrl = nvme_tcp_delete_ctrl,
+ .get_address = nvmf_get_address,
++ .stop_ctrl = nvme_tcp_stop_ctrl,
+ };
+
+ static bool
+diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
+index b5f85259461a6..37c7f4c89f92e 100644
+--- a/drivers/nvme/host/trace.h
++++ b/drivers/nvme/host/trace.h
+@@ -69,7 +69,7 @@ TRACE_EVENT(nvme_setup_cmd,
+ __entry->metadata = !!blk_integrity_rq(req);
+ __entry->fctype = cmd->fabrics.fctype;
+ __assign_disk_name(__entry->disk, req->q->disk);
+- memcpy(__entry->cdw10, &cmd->common.cdw10,
++ memcpy(__entry->cdw10, &cmd->common.cdws,
+ sizeof(__entry->cdw10));
+ ),
+ TP_printk("nvme%d: %sqid=%d, cmdid=%u, nsid=%u, flags=0x%x, meta=0x%x, cmd=(%s %s)",
+diff --git a/drivers/pinctrl/aspeed/pinctrl-aspeed.c b/drivers/pinctrl/aspeed/pinctrl-aspeed.c
+index c94e24aadf922..83d47ff1cea8f 100644
+--- a/drivers/pinctrl/aspeed/pinctrl-aspeed.c
++++ b/drivers/pinctrl/aspeed/pinctrl-aspeed.c
+@@ -236,11 +236,11 @@ int aspeed_pinmux_set_mux(struct pinctrl_dev *pctldev, unsigned int function,
+ const struct aspeed_sig_expr **funcs;
+ const struct aspeed_sig_expr ***prios;
+
+- pr_debug("Muxing pin %s for %s\n", pdesc->name, pfunc->name);
+-
+ if (!pdesc)
+ return -EINVAL;
+
++ pr_debug("Muxing pin %s for %s\n", pdesc->name, pfunc->name);
++
+ prios = pdesc->prios;
+
+ if (!prios)
+diff --git a/drivers/pinctrl/freescale/pinctrl-imx93.c b/drivers/pinctrl/freescale/pinctrl-imx93.c
+index c0630f69e9954..417e41b37a6fd 100644
+--- a/drivers/pinctrl/freescale/pinctrl-imx93.c
++++ b/drivers/pinctrl/freescale/pinctrl-imx93.c
+@@ -239,6 +239,7 @@ static const struct pinctrl_pin_desc imx93_pinctrl_pads[] = {
+ static const struct imx_pinctrl_soc_info imx93_pinctrl_info = {
+ .pins = imx93_pinctrl_pads,
+ .npins = ARRAY_SIZE(imx93_pinctrl_pads),
++ .flags = ZERO_OFFSET_VALID,
+ .gpr_compatible = "fsl,imx93-iomuxc-gpr",
+ };
+
+diff --git a/drivers/platform/x86/hp-wmi.c b/drivers/platform/x86/hp-wmi.c
+index 0e6ed75c70f36..c63ec1471b844 100644
+--- a/drivers/platform/x86/hp-wmi.c
++++ b/drivers/platform/x86/hp-wmi.c
+@@ -89,6 +89,7 @@ enum hp_wmi_event_ids {
+ HPWMI_BACKLIT_KB_BRIGHTNESS = 0x0D,
+ HPWMI_PEAKSHIFT_PERIOD = 0x0F,
+ HPWMI_BATTERY_CHARGE_PERIOD = 0x10,
++ HPWMI_SANITIZATION_MODE = 0x17,
+ };
+
+ /*
+@@ -846,6 +847,8 @@ static void hp_wmi_notify(u32 value, void *context)
+ break;
+ case HPWMI_BATTERY_CHARGE_PERIOD:
+ break;
++ case HPWMI_SANITIZATION_MODE:
++ break;
+ default:
+ pr_info("Unknown event_id - %d - 0x%x\n", event_id, event_data);
+ break;
+diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
+index 8ee15a7252c71..c3ec5dc88bbf7 100644
+--- a/drivers/platform/x86/intel/pmc/core.c
++++ b/drivers/platform/x86/intel/pmc/core.c
+@@ -1911,6 +1911,7 @@ static const struct x86_cpu_id intel_pmc_core_ids[] = {
+ X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_L, &icl_reg_map),
+ X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &tgl_reg_map),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &tgl_reg_map),
++ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_N, &tgl_reg_map),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_reg_map),
+ X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, &tgl_reg_map),
+ {}
+diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
+index aa6ffeaa39329..a8b3830515285 100644
+--- a/drivers/platform/x86/thinkpad_acpi.c
++++ b/drivers/platform/x86/thinkpad_acpi.c
+@@ -10300,21 +10300,15 @@ static struct ibm_struct proxsensor_driver_data = {
+ #define DYTC_DISABLE_CQL DYTC_SET_COMMAND(DYTC_FUNCTION_CQL, DYTC_MODE_MMC_BALANCE, 0)
+ #define DYTC_ENABLE_CQL DYTC_SET_COMMAND(DYTC_FUNCTION_CQL, DYTC_MODE_MMC_BALANCE, 1)
+
+-enum dytc_profile_funcmode {
+- DYTC_FUNCMODE_NONE = 0,
+- DYTC_FUNCMODE_MMC,
+- DYTC_FUNCMODE_PSC,
+-};
+-
+-static enum dytc_profile_funcmode dytc_profile_available;
+ static enum platform_profile_option dytc_current_profile;
+ static atomic_t dytc_ignore_event = ATOMIC_INIT(0);
+ static DEFINE_MUTEX(dytc_mutex);
++static int dytc_capabilities;
+ static bool dytc_mmc_get_available;
+
+ static int convert_dytc_to_profile(int dytcmode, enum platform_profile_option *profile)
+ {
+- if (dytc_profile_available == DYTC_FUNCMODE_MMC) {
++ if (dytc_capabilities & BIT(DYTC_FC_MMC)) {
+ switch (dytcmode) {
+ case DYTC_MODE_MMC_LOWPOWER:
+ *profile = PLATFORM_PROFILE_LOW_POWER;
+@@ -10331,7 +10325,7 @@ static int convert_dytc_to_profile(int dytcmode, enum platform_profile_option *p
+ }
+ return 0;
+ }
+- if (dytc_profile_available == DYTC_FUNCMODE_PSC) {
++ if (dytc_capabilities & BIT(DYTC_FC_PSC)) {
+ switch (dytcmode) {
+ case DYTC_MODE_PSC_LOWPOWER:
+ *profile = PLATFORM_PROFILE_LOW_POWER;
+@@ -10353,21 +10347,21 @@ static int convert_profile_to_dytc(enum platform_profile_option profile, int *pe
+ {
+ switch (profile) {
+ case PLATFORM_PROFILE_LOW_POWER:
+- if (dytc_profile_available == DYTC_FUNCMODE_MMC)
++ if (dytc_capabilities & BIT(DYTC_FC_MMC))
+ *perfmode = DYTC_MODE_MMC_LOWPOWER;
+- else if (dytc_profile_available == DYTC_FUNCMODE_PSC)
++ else if (dytc_capabilities & BIT(DYTC_FC_PSC))
+ *perfmode = DYTC_MODE_PSC_LOWPOWER;
+ break;
+ case PLATFORM_PROFILE_BALANCED:
+- if (dytc_profile_available == DYTC_FUNCMODE_MMC)
++ if (dytc_capabilities & BIT(DYTC_FC_MMC))
+ *perfmode = DYTC_MODE_MMC_BALANCE;
+- else if (dytc_profile_available == DYTC_FUNCMODE_PSC)
++ else if (dytc_capabilities & BIT(DYTC_FC_PSC))
+ *perfmode = DYTC_MODE_PSC_BALANCE;
+ break;
+ case PLATFORM_PROFILE_PERFORMANCE:
+- if (dytc_profile_available == DYTC_FUNCMODE_MMC)
++ if (dytc_capabilities & BIT(DYTC_FC_MMC))
+ *perfmode = DYTC_MODE_MMC_PERFORM;
+- else if (dytc_profile_available == DYTC_FUNCMODE_PSC)
++ else if (dytc_capabilities & BIT(DYTC_FC_PSC))
+ *perfmode = DYTC_MODE_PSC_PERFORM;
+ break;
+ default: /* Unknown profile */
+@@ -10446,7 +10440,7 @@ static int dytc_profile_set(struct platform_profile_handler *pprof,
+ if (err)
+ goto unlock;
+
+- if (dytc_profile_available == DYTC_FUNCMODE_MMC) {
++ if (dytc_capabilities & BIT(DYTC_FC_MMC)) {
+ if (profile == PLATFORM_PROFILE_BALANCED) {
+ /*
+ * To get back to balanced mode we need to issue a reset command.
+@@ -10465,7 +10459,7 @@ static int dytc_profile_set(struct platform_profile_handler *pprof,
+ goto unlock;
+ }
+ }
+- if (dytc_profile_available == DYTC_FUNCMODE_PSC) {
++ if (dytc_capabilities & BIT(DYTC_FC_PSC)) {
+ err = dytc_command(DYTC_SET_COMMAND(DYTC_FUNCTION_PSC, perfmode, 1), &output);
+ if (err)
+ goto unlock;
+@@ -10484,12 +10478,12 @@ static void dytc_profile_refresh(void)
+ int perfmode;
+
+ mutex_lock(&dytc_mutex);
+- if (dytc_profile_available == DYTC_FUNCMODE_MMC) {
++ if (dytc_capabilities & BIT(DYTC_FC_MMC)) {
+ if (dytc_mmc_get_available)
+ err = dytc_command(DYTC_CMD_MMC_GET, &output);
+ else
+ err = dytc_cql_command(DYTC_CMD_GET, &output);
+- } else if (dytc_profile_available == DYTC_FUNCMODE_PSC)
++ } else if (dytc_capabilities & BIT(DYTC_FC_PSC))
+ err = dytc_command(DYTC_CMD_GET, &output);
+
+ mutex_unlock(&dytc_mutex);
+@@ -10518,7 +10512,6 @@ static int tpacpi_dytc_profile_init(struct ibm_init_struct *iibm)
+ set_bit(PLATFORM_PROFILE_BALANCED, dytc_profile.choices);
+ set_bit(PLATFORM_PROFILE_PERFORMANCE, dytc_profile.choices);
+
+- dytc_profile_available = DYTC_FUNCMODE_NONE;
+ err = dytc_command(DYTC_CMD_QUERY, &output);
+ if (err)
+ return err;
+@@ -10531,13 +10524,12 @@ static int tpacpi_dytc_profile_init(struct ibm_init_struct *iibm)
+ return -ENODEV;
+
+ /* Check what capabilities are supported */
+- err = dytc_command(DYTC_CMD_FUNC_CAP, &output);
++ err = dytc_command(DYTC_CMD_FUNC_CAP, &dytc_capabilities);
+ if (err)
+ return err;
+
+- if (output & BIT(DYTC_FC_MMC)) { /* MMC MODE */
+- dytc_profile_available = DYTC_FUNCMODE_MMC;
+-
++ if (dytc_capabilities & BIT(DYTC_FC_MMC)) { /* MMC MODE */
++ pr_debug("MMC is supported\n");
+ /*
+ * Check if MMC_GET functionality available
+ * Version > 6 and return success from MMC_GET command
+@@ -10548,8 +10540,13 @@ static int tpacpi_dytc_profile_init(struct ibm_init_struct *iibm)
+ if (!err && ((output & DYTC_ERR_MASK) == DYTC_ERR_SUCCESS))
+ dytc_mmc_get_available = true;
+ }
+- } else if (output & BIT(DYTC_FC_PSC)) { /* PSC MODE */
+- dytc_profile_available = DYTC_FUNCMODE_PSC;
++ } else if (dytc_capabilities & BIT(DYTC_FC_PSC)) { /* PSC MODE */
++ /* Support for this only works on AMD platforms */
++ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) {
++ dbg_printk(TPACPI_DBG_INIT, "PSC not support on Intel platforms\n");
++ return -ENODEV;
++ }
++ pr_debug("PSC is supported\n");
+ } else {
+ dbg_printk(TPACPI_DBG_INIT, "No DYTC support available\n");
+ return -ENODEV;
+@@ -10575,7 +10572,6 @@ static int tpacpi_dytc_profile_init(struct ibm_init_struct *iibm)
+
+ static void dytc_profile_exit(void)
+ {
+- dytc_profile_available = DYTC_FUNCMODE_NONE;
+ platform_profile_remove();
+ }
+
+diff --git a/drivers/power/supply/power_supply_core.c b/drivers/power/supply/power_supply_core.c
+index fad5890c899e2..470253c337c73 100644
+--- a/drivers/power/supply/power_supply_core.c
++++ b/drivers/power/supply/power_supply_core.c
+@@ -846,17 +846,17 @@ int power_supply_temp2resist_simple(struct power_supply_resistance_temp_table *t
+ {
+ int i, high, low;
+
+- /* Break loop at table_len - 1 because that is the highest index */
+- for (i = 0; i < table_len - 1; i++)
++ for (i = 0; i < table_len; i++)
+ if (temp > table[i].temp)
+ break;
+
+ /* The library function will deal with high == low */
+- if ((i == 0) || (i == (table_len - 1)))
+- high = i;
++ if (i == 0)
++ high = low = i;
++ else if (i == table_len)
++ high = low = i - 1;
+ else
+- high = i - 1;
+- low = i;
++ high = (low = i) - 1;
+
+ return fixp_linear_interpolate(table[low].temp,
+ table[low].resistance,
+@@ -958,17 +958,17 @@ int power_supply_ocv2cap_simple(struct power_supply_battery_ocv_table *table,
+ {
+ int i, high, low;
+
+- /* Break loop at table_len - 1 because that is the highest index */
+- for (i = 0; i < table_len - 1; i++)
++ for (i = 0; i < table_len; i++)
+ if (ocv > table[i].ocv)
+ break;
+
+ /* The library function will deal with high == low */
+- if ((i == 0) || (i == (table_len - 1)))
+- high = i - 1;
++ if (i == 0)
++ high = low = i;
++ else if (i == table_len)
++ high = low = i - 1;
+ else
+- high = i; /* i.e. i == 0 */
+- low = i;
++ high = (low = i) - 1;
+
+ return fixp_linear_interpolate(table[low].ocv,
+ table[low].capacity,
+diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
+index fdf16cb708819..eb8645d44a9cf 100644
+--- a/drivers/s390/crypto/ap_bus.c
++++ b/drivers/s390/crypto/ap_bus.c
+@@ -1410,7 +1410,7 @@ static int __verify_queue_reservations(struct device_driver *drv, void *data)
+ if (ap_drv->in_use) {
+ rc = ap_drv->in_use(ap_perms.apm, newaqm);
+ if (rc)
+- return -EBUSY;
++ rc = -EBUSY;
+ }
+
+ /* release the driver's module */
+diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+index 7d819fc0395e4..eb86afb21aabb 100644
+--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
++++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+@@ -2782,6 +2782,7 @@ static int slave_configure_v3_hw(struct scsi_device *sdev)
+ struct hisi_hba *hisi_hba = shost_priv(shost);
+ struct device *dev = hisi_hba->dev;
+ int ret = sas_slave_configure(sdev);
++ unsigned int max_sectors;
+
+ if (ret)
+ return ret;
+@@ -2799,6 +2800,12 @@ static int slave_configure_v3_hw(struct scsi_device *sdev)
+ }
+ }
+
++ /* Set according to IOMMU IOVA caching limit */
++ max_sectors = min_t(size_t, queue_max_hw_sectors(sdev->request_queue),
++ (PAGE_SIZE * 32) >> SECTOR_SHIFT);
++
++ blk_queue_max_hw_sectors(sdev->request_queue, max_sectors);
++
+ return 0;
+ }
+
+diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
+index db6793608447a..f5deb0e561a93 100644
+--- a/drivers/scsi/megaraid/megaraid_sas_base.c
++++ b/drivers/scsi/megaraid/megaraid_sas_base.c
+@@ -3195,6 +3195,9 @@ static int megasas_map_queues(struct Scsi_Host *shost)
+ qoff += map->nr_queues;
+ offset += map->nr_queues;
+
++ /* we never use READ queue, so can't cheat blk-mq */
++ shost->tag_set.map[HCTX_TYPE_READ].nr_queues = 0;
++
+ /* Setup Poll hctx */
+ map = &shost->tag_set.map[HCTX_TYPE_POLL];
+ map->nr_queues = instance->iopoll_q_count;
+diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
+index 4c9eb4be449ca..452ad06120671 100644
+--- a/drivers/scsi/ufs/ufshcd.c
++++ b/drivers/scsi/ufs/ufshcd.c
+@@ -5722,7 +5722,7 @@ int ufshcd_wb_toggle(struct ufs_hba *hba, bool enable)
+ }
+
+ hba->dev_info.wb_enabled = enable;
+- dev_info(hba->dev, "%s Write Booster %s\n",
++ dev_dbg(hba->dev, "%s Write Booster %s\n",
+ __func__, enable ? "enabled" : "disabled");
+
+ return ret;
+diff --git a/drivers/soc/ixp4xx/ixp4xx-npe.c b/drivers/soc/ixp4xx/ixp4xx-npe.c
+index 613935cb6a488..58240e320c132 100644
+--- a/drivers/soc/ixp4xx/ixp4xx-npe.c
++++ b/drivers/soc/ixp4xx/ixp4xx-npe.c
+@@ -758,7 +758,7 @@ static const struct of_device_id ixp4xx_npe_of_match[] = {
+ static struct platform_driver ixp4xx_npe_driver = {
+ .driver = {
+ .name = "ixp4xx-npe",
+- .of_match_table = of_match_ptr(ixp4xx_npe_of_match),
++ .of_match_table = ixp4xx_npe_of_match,
+ },
+ .probe = ixp4xx_npe_probe,
+ .remove = ixp4xx_npe_remove,
+diff --git a/drivers/spi/spi-amd.c b/drivers/spi/spi-amd.c
+index cba6a4486c24c..efdcbe6c4c266 100644
+--- a/drivers/spi/spi-amd.c
++++ b/drivers/spi/spi-amd.c
+@@ -33,6 +33,7 @@
+ #define AMD_SPI_RX_COUNT_REG 0x4B
+ #define AMD_SPI_STATUS_REG 0x4C
+
++#define AMD_SPI_FIFO_SIZE 70
+ #define AMD_SPI_MEM_SIZE 200
+
+ /* M_CMD OP codes for SPI */
+@@ -270,6 +271,11 @@ static int amd_spi_master_transfer(struct spi_master *master,
+ return 0;
+ }
+
++static size_t amd_spi_max_transfer_size(struct spi_device *spi)
++{
++ return AMD_SPI_FIFO_SIZE;
++}
++
+ static int amd_spi_probe(struct platform_device *pdev)
+ {
+ struct device *dev = &pdev->dev;
+@@ -302,6 +308,8 @@ static int amd_spi_probe(struct platform_device *pdev)
+ master->flags = SPI_MASTER_HALF_DUPLEX;
+ master->setup = amd_spi_master_setup;
+ master->transfer_one_message = amd_spi_master_transfer;
++ master->max_transfer_size = amd_spi_max_transfer_size;
++ master->max_message_size = amd_spi_max_transfer_size;
+
+ /* Register the controller with SPI framework */
+ err = devm_spi_register_master(dev, master);
+diff --git a/drivers/tee/tee_core.c b/drivers/tee/tee_core.c
+index 8aa1a4836b92f..cebe4963ad873 100644
+--- a/drivers/tee/tee_core.c
++++ b/drivers/tee/tee_core.c
+@@ -1075,7 +1075,7 @@ EXPORT_SYMBOL_GPL(tee_device_unregister);
+ /**
+ * tee_get_drvdata() - Return driver_data pointer
+ * @teedev: Device containing the driver_data pointer
+- * @returns the driver_data pointer supplied to tee_register().
++ * @returns the driver_data pointer supplied to tee_device_alloc().
+ */
+ void *tee_get_drvdata(struct tee_device *teedev)
+ {
+diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
+index 74bfabe5b4538..752dab3356d72 100644
+--- a/drivers/tty/pty.c
++++ b/drivers/tty/pty.c
+@@ -111,21 +111,11 @@ static void pty_unthrottle(struct tty_struct *tty)
+ static int pty_write(struct tty_struct *tty, const unsigned char *buf, int c)
+ {
+ struct tty_struct *to = tty->link;
+- unsigned long flags;
+
+- if (tty->flow.stopped)
++ if (tty->flow.stopped || !c)
+ return 0;
+
+- if (c > 0) {
+- spin_lock_irqsave(&to->port->lock, flags);
+- /* Stuff the data into the input queue of the other end */
+- c = tty_insert_flip_string(to->port, buf, c);
+- spin_unlock_irqrestore(&to->port->lock, flags);
+- /* And shovel */
+- if (c)
+- tty_flip_buffer_push(to->port);
+- }
+- return c;
++ return tty_insert_flip_string_and_push_buffer(to->port, buf, c);
+ }
+
+ /**
+diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
+index 01d30f6ed8fb5..cbad398f48b25 100644
+--- a/drivers/tty/serial/8250/8250_core.c
++++ b/drivers/tty/serial/8250/8250_core.c
+@@ -23,6 +23,7 @@
+ #include <linux/sysrq.h>
+ #include <linux/delay.h>
+ #include <linux/platform_device.h>
++#include <linux/pm_runtime.h>
+ #include <linux/tty.h>
+ #include <linux/ratelimit.h>
+ #include <linux/tty_flip.h>
+@@ -560,6 +561,9 @@ serial8250_register_ports(struct uart_driver *drv, struct device *dev)
+
+ up->port.dev = dev;
+
++ if (uart_console_enabled(&up->port))
++ pm_runtime_get_sync(up->port.dev);
++
+ serial8250_apply_quirks(up);
+ uart_add_one_port(drv, &up->port);
+ }
+diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
+index 7e295d2701b28..ddaf35daf316b 100644
+--- a/drivers/tty/serial/8250/8250_port.c
++++ b/drivers/tty/serial/8250/8250_port.c
+@@ -2971,8 +2971,10 @@ static int serial8250_request_std_resource(struct uart_8250_port *up)
+ case UPIO_MEM32BE:
+ case UPIO_MEM16:
+ case UPIO_MEM:
+- if (!port->mapbase)
++ if (!port->mapbase) {
++ ret = -EINVAL;
+ break;
++ }
+
+ if (!request_mem_region(port->mapbase, size, "serial")) {
+ ret = -EBUSY;
+diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
+index 4d11a3e547f94..e69700275e073 100644
+--- a/drivers/tty/serial/amba-pl011.c
++++ b/drivers/tty/serial/amba-pl011.c
+@@ -1339,6 +1339,15 @@ static void pl011_stop_rx(struct uart_port *port)
+ pl011_dma_rx_stop(uap);
+ }
+
++static void pl011_throttle_rx(struct uart_port *port)
++{
++ unsigned long flags;
++
++ spin_lock_irqsave(&port->lock, flags);
++ pl011_stop_rx(port);
++ spin_unlock_irqrestore(&port->lock, flags);
++}
++
+ static void pl011_enable_ms(struct uart_port *port)
+ {
+ struct uart_amba_port *uap =
+@@ -1760,9 +1769,10 @@ static int pl011_allocate_irq(struct uart_amba_port *uap)
+ */
+ static void pl011_enable_interrupts(struct uart_amba_port *uap)
+ {
++ unsigned long flags;
+ unsigned int i;
+
+- spin_lock_irq(&uap->port.lock);
++ spin_lock_irqsave(&uap->port.lock, flags);
+
+ /* Clear out any spuriously appearing RX interrupts */
+ pl011_write(UART011_RTIS | UART011_RXIS, uap, REG_ICR);
+@@ -1784,7 +1794,14 @@ static void pl011_enable_interrupts(struct uart_amba_port *uap)
+ if (!pl011_dma_rx_running(uap))
+ uap->im |= UART011_RXIM;
+ pl011_write(uap->im, uap, REG_IMSC);
+- spin_unlock_irq(&uap->port.lock);
++ spin_unlock_irqrestore(&uap->port.lock, flags);
++}
++
++static void pl011_unthrottle_rx(struct uart_port *port)
++{
++ struct uart_amba_port *uap = container_of(port, struct uart_amba_port, port);
++
++ pl011_enable_interrupts(uap);
+ }
+
+ static int pl011_startup(struct uart_port *port)
+@@ -2211,6 +2228,8 @@ static const struct uart_ops amba_pl011_pops = {
+ .stop_tx = pl011_stop_tx,
+ .start_tx = pl011_start_tx,
+ .stop_rx = pl011_stop_rx,
++ .throttle = pl011_throttle_rx,
++ .unthrottle = pl011_unthrottle_rx,
+ .enable_ms = pl011_enable_ms,
+ .break_ctl = pl011_break_ctl,
+ .startup = pl011_startup,
+diff --git a/drivers/tty/serial/mvebu-uart.c b/drivers/tty/serial/mvebu-uart.c
+index 0429c2a542902..93489fe334d0f 100644
+--- a/drivers/tty/serial/mvebu-uart.c
++++ b/drivers/tty/serial/mvebu-uart.c
+@@ -470,14 +470,14 @@ static void mvebu_uart_shutdown(struct uart_port *port)
+ }
+ }
+
+-static int mvebu_uart_baud_rate_set(struct uart_port *port, unsigned int baud)
++static unsigned int mvebu_uart_baud_rate_set(struct uart_port *port, unsigned int baud)
+ {
+ unsigned int d_divisor, m_divisor;
+ unsigned long flags;
+ u32 brdv, osamp;
+
+ if (!port->uartclk)
+- return -EOPNOTSUPP;
++ return 0;
+
+ /*
+ * The baudrate is derived from the UART clock thanks to divisors:
+@@ -548,7 +548,7 @@ static int mvebu_uart_baud_rate_set(struct uart_port *port, unsigned int baud)
+ (m_divisor << 16) | (m_divisor << 24);
+ writel(osamp, port->membase + UART_OSAMP);
+
+- return 0;
++ return DIV_ROUND_CLOSEST(port->uartclk, d_divisor * m_divisor);
+ }
+
+ static void mvebu_uart_set_termios(struct uart_port *port,
+@@ -587,15 +587,11 @@ static void mvebu_uart_set_termios(struct uart_port *port,
+ max_baud = port->uartclk / 80;
+
+ baud = uart_get_baud_rate(port, termios, old, min_baud, max_baud);
+- if (mvebu_uart_baud_rate_set(port, baud)) {
+- /* No clock available, baudrate cannot be changed */
+- if (old)
+- baud = uart_get_baud_rate(port, old, NULL,
+- min_baud, max_baud);
+- } else {
+- tty_termios_encode_baud_rate(termios, baud, baud);
+- uart_update_timeout(port, termios->c_cflag, baud);
+- }
++ baud = mvebu_uart_baud_rate_set(port, baud);
++
++ /* In case baudrate cannot be changed, report previous old value */
++ if (baud == 0 && old)
++ baud = tty_termios_baud_rate(old);
+
+ /* Only the following flag changes are supported */
+ if (old) {
+@@ -606,6 +602,11 @@ static void mvebu_uart_set_termios(struct uart_port *port,
+ termios->c_cflag |= CS8;
+ }
+
++ if (baud != 0) {
++ tty_termios_encode_baud_rate(termios, baud, baud);
++ uart_update_timeout(port, termios->c_cflag, baud);
++ }
++
+ spin_unlock_irqrestore(&port->lock, flags);
+ }
+
+diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
+index e1585fbae909d..817ea46af2915 100644
+--- a/drivers/tty/serial/samsung_tty.c
++++ b/drivers/tty/serial/samsung_tty.c
+@@ -377,8 +377,7 @@ static void enable_tx_dma(struct s3c24xx_uart_port *ourport)
+ /* Enable tx dma mode */
+ ucon = rd_regl(port, S3C2410_UCON);
+ ucon &= ~(S3C64XX_UCON_TXBURST_MASK | S3C64XX_UCON_TXMODE_MASK);
+- ucon |= (dma_get_cache_alignment() >= 16) ?
+- S3C64XX_UCON_TXBURST_16 : S3C64XX_UCON_TXBURST_1;
++ ucon |= S3C64XX_UCON_TXBURST_1;
+ ucon |= S3C64XX_UCON_TXMODE_DMA;
+ wr_regl(port, S3C2410_UCON, ucon);
+
+@@ -674,7 +673,7 @@ static void enable_rx_dma(struct s3c24xx_uart_port *ourport)
+ S3C64XX_UCON_DMASUS_EN |
+ S3C64XX_UCON_TIMEOUT_EN |
+ S3C64XX_UCON_RXMODE_MASK);
+- ucon |= S3C64XX_UCON_RXBURST_16 |
++ ucon |= S3C64XX_UCON_RXBURST_1 |
+ 0xf << S3C64XX_UCON_TIMEOUT_SHIFT |
+ S3C64XX_UCON_EMPTYINT_EN |
+ S3C64XX_UCON_TIMEOUT_EN |
+diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
+index 6a8963caf954a..95d8d1fcd5437 100644
+--- a/drivers/tty/serial/serial_core.c
++++ b/drivers/tty/serial/serial_core.c
+@@ -1904,11 +1904,6 @@ static int uart_proc_show(struct seq_file *m, void *v)
+ }
+ #endif
+
+-static inline bool uart_console_enabled(struct uart_port *port)
+-{
+- return uart_console(port) && (port->cons->flags & CON_ENABLED);
+-}
+-
+ static void uart_port_spin_lock_init(struct uart_port *port)
+ {
+ spin_lock_init(&port->lock);
+diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
+index 3c551fd4f3ff8..0fb535b9505cd 100644
+--- a/drivers/tty/serial/stm32-usart.c
++++ b/drivers/tty/serial/stm32-usart.c
+@@ -71,6 +71,8 @@ static void stm32_usart_config_reg_rs485(u32 *cr1, u32 *cr3, u32 delay_ADE,
+ *cr3 |= USART_CR3_DEM;
+ over8 = *cr1 & USART_CR1_OVER8;
+
++ *cr1 &= ~(USART_CR1_DEDT_MASK | USART_CR1_DEAT_MASK);
++
+ if (over8)
+ rs485_deat_dedt = delay_ADE * baud * 8;
+ else
+diff --git a/drivers/tty/tty.h b/drivers/tty/tty.h
+index b710c5ef89ab2..f310a8274df15 100644
+--- a/drivers/tty/tty.h
++++ b/drivers/tty/tty.h
+@@ -111,4 +111,7 @@ static inline void tty_audit_tiocsti(struct tty_struct *tty, char ch)
+
+ ssize_t redirected_tty_write(struct kiocb *, struct iov_iter *);
+
++int tty_insert_flip_string_and_push_buffer(struct tty_port *port,
++ const unsigned char *chars, size_t cnt);
++
+ #endif
+diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
+index bfa431a8e6902..595d8b49c745d 100644
+--- a/drivers/tty/tty_buffer.c
++++ b/drivers/tty/tty_buffer.c
+@@ -532,6 +532,15 @@ static void flush_to_ldisc(struct work_struct *work)
+
+ }
+
++static inline void tty_flip_buffer_commit(struct tty_buffer *tail)
++{
++ /*
++ * Paired w/ acquire in flush_to_ldisc(); ensures flush_to_ldisc() sees
++ * buffer data.
++ */
++ smp_store_release(&tail->commit, tail->used);
++}
++
+ /**
+ * tty_flip_buffer_push - push terminal buffers
+ * @port: tty port to push
+@@ -546,15 +555,42 @@ void tty_flip_buffer_push(struct tty_port *port)
+ {
+ struct tty_bufhead *buf = &port->buf;
+
+- /*
+- * Paired w/ acquire in flush_to_ldisc(); ensures flush_to_ldisc() sees
+- * buffer data.
+- */
+- smp_store_release(&buf->tail->commit, buf->tail->used);
++ tty_flip_buffer_commit(buf->tail);
+ queue_work(system_unbound_wq, &buf->work);
+ }
+ EXPORT_SYMBOL(tty_flip_buffer_push);
+
++/**
++ * tty_insert_flip_string_and_push_buffer - add characters to the tty buffer and
++ * push
++ * @port: tty port
++ * @chars: characters
++ * @size: size
++ *
++ * The function combines tty_insert_flip_string() and tty_flip_buffer_push()
++ * with the exception of properly holding the @port->lock.
++ *
++ * To be used only internally (by pty currently).
++ *
++ * Returns: the number added.
++ */
++int tty_insert_flip_string_and_push_buffer(struct tty_port *port,
++ const unsigned char *chars, size_t size)
++{
++ struct tty_bufhead *buf = &port->buf;
++ unsigned long flags;
++
++ spin_lock_irqsave(&port->lock, flags);
++ size = tty_insert_flip_string(port, chars, size);
++ if (size)
++ tty_flip_buffer_commit(buf->tail);
++ spin_unlock_irqrestore(&port->lock, flags);
++
++ queue_work(system_unbound_wq, &buf->work);
++
++ return size;
++}
++
+ /**
+ * tty_buffer_init - prepare a tty buffer structure
+ * @port: tty port to initialise
+diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
+index f8c87c4d73995..dfc1f4b445f3b 100644
+--- a/drivers/tty/vt/vt.c
++++ b/drivers/tty/vt/vt.c
+@@ -855,7 +855,7 @@ static void delete_char(struct vc_data *vc, unsigned int nr)
+ unsigned short *p = (unsigned short *) vc->vc_pos;
+
+ vc_uniscr_delete(vc, nr);
+- scr_memcpyw(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
++ scr_memmovew(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
+ scr_memsetw(p + vc->vc_cols - vc->state.x - nr, vc->vc_video_erase_char,
+ nr * 2);
+ vc->vc_need_wrap = 0;
+diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
+index 0a6633e7edce2..f46fc02669a8c 100644
+--- a/drivers/usb/dwc3/gadget.c
++++ b/drivers/usb/dwc3/gadget.c
+@@ -4224,7 +4224,6 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
+ }
+
+ evt->count = 0;
+- evt->flags &= ~DWC3_EVENT_PENDING;
+ ret = IRQ_HANDLED;
+
+ /* Unmask interrupt */
+@@ -4236,6 +4235,9 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
+ dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0), dwc->imod_interval);
+ }
+
++ /* Keep the clearing of DWC3_EVENT_PENDING at the end */
++ evt->flags &= ~DWC3_EVENT_PENDING;
++
+ return ret;
+ }
+
+diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
+index 49c08f07c9690..10d120589cbd6 100644
+--- a/drivers/usb/serial/ftdi_sio.c
++++ b/drivers/usb/serial/ftdi_sio.c
+@@ -1023,6 +1023,9 @@ static const struct usb_device_id id_table_combined[] = {
+ { USB_DEVICE(FTDI_VID, CHETCO_SEASMART_DISPLAY_PID) },
+ { USB_DEVICE(FTDI_VID, CHETCO_SEASMART_LITE_PID) },
+ { USB_DEVICE(FTDI_VID, CHETCO_SEASMART_ANALOG_PID) },
++ /* Belimo Automation devices */
++ { USB_DEVICE(FTDI_VID, BELIMO_ZTH_PID) },
++ { USB_DEVICE(FTDI_VID, BELIMO_ZIP_PID) },
+ /* ICP DAS I-756xU devices */
+ { USB_DEVICE(ICPDAS_VID, ICPDAS_I7560U_PID) },
+ { USB_DEVICE(ICPDAS_VID, ICPDAS_I7561U_PID) },
+diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h
+index d1a9564697a4b..4e92c165c86bf 100644
+--- a/drivers/usb/serial/ftdi_sio_ids.h
++++ b/drivers/usb/serial/ftdi_sio_ids.h
+@@ -1568,6 +1568,12 @@
+ #define CHETCO_SEASMART_LITE_PID 0xA5AE /* SeaSmart Lite USB Adapter */
+ #define CHETCO_SEASMART_ANALOG_PID 0xA5AF /* SeaSmart Analog Adapter */
+
++/*
++ * Belimo Automation
++ */
++#define BELIMO_ZTH_PID 0x8050
++#define BELIMO_ZIP_PID 0xC811
++
+ /*
+ * Unjo AB
+ */
+diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
+index ee0e520707dd7..c4724750c81a5 100644
+--- a/drivers/usb/typec/class.c
++++ b/drivers/usb/typec/class.c
+@@ -1718,6 +1718,7 @@ void typec_set_pwr_opmode(struct typec_port *port,
+ partner->usb_pd = 1;
+ sysfs_notify(&partner_dev->kobj, NULL,
+ "supports_usb_power_delivery");
++ kobject_uevent(&partner_dev->kobj, KOBJ_CHANGE);
+ }
+ put_device(partner_dev);
+ }
+diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+index c290386aa2f37..ae4cfbd927fd2 100644
+--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
++++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+@@ -1965,7 +1965,6 @@ static int verify_driver_features(struct mlx5_vdpa_dev *mvdev, u64 features)
+ static int setup_virtqueues(struct mlx5_vdpa_dev *mvdev)
+ {
+ struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
+- struct mlx5_control_vq *cvq = &mvdev->cvq;
+ int err;
+ int i;
+
+@@ -1975,16 +1974,6 @@ static int setup_virtqueues(struct mlx5_vdpa_dev *mvdev)
+ goto err_vq;
+ }
+
+- if (mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)) {
+- err = vringh_init_iotlb(&cvq->vring, mvdev->actual_features,
+- MLX5_CVQ_MAX_ENT, false,
+- (struct vring_desc *)(uintptr_t)cvq->desc_addr,
+- (struct vring_avail *)(uintptr_t)cvq->driver_addr,
+- (struct vring_used *)(uintptr_t)cvq->device_addr);
+- if (err)
+- goto err_vq;
+- }
+-
+ return 0;
+
+ err_vq:
+@@ -2257,6 +2246,21 @@ static void clear_vqs_ready(struct mlx5_vdpa_net *ndev)
+ ndev->mvdev.cvq.ready = false;
+ }
+
++static int setup_cvq_vring(struct mlx5_vdpa_dev *mvdev)
++{
++ struct mlx5_control_vq *cvq = &mvdev->cvq;
++ int err = 0;
++
++ if (mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
++ err = vringh_init_iotlb(&cvq->vring, mvdev->actual_features,
++ MLX5_CVQ_MAX_ENT, false,
++ (struct vring_desc *)(uintptr_t)cvq->desc_addr,
++ (struct vring_avail *)(uintptr_t)cvq->driver_addr,
++ (struct vring_used *)(uintptr_t)cvq->device_addr);
++
++ return err;
++}
++
+ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
+ {
+ struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+@@ -2269,6 +2273,11 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
+
+ if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
+ if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
++ err = setup_cvq_vring(mvdev);
++ if (err) {
++ mlx5_vdpa_warn(mvdev, "failed to setup control VQ vring\n");
++ goto err_setup;
++ }
+ err = setup_driver(mvdev);
+ if (err) {
+ mlx5_vdpa_warn(mvdev, "failed to setup driver\n");
+diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
+index 160e40d030847..02709f8a78bd2 100644
+--- a/drivers/vdpa/vdpa_user/vduse_dev.c
++++ b/drivers/vdpa/vdpa_user/vduse_dev.c
+@@ -1475,16 +1475,12 @@ static char *vduse_devnode(struct device *dev, umode_t *mode)
+ return kasprintf(GFP_KERNEL, "vduse/%s", dev_name(dev));
+ }
+
+-static void vduse_mgmtdev_release(struct device *dev)
+-{
+-}
+-
+-static struct device vduse_mgmtdev = {
+- .init_name = "vduse",
+- .release = vduse_mgmtdev_release,
++struct vduse_mgmt_dev {
++ struct vdpa_mgmt_dev mgmt_dev;
++ struct device dev;
+ };
+
+-static struct vdpa_mgmt_dev mgmt_dev;
++static struct vduse_mgmt_dev *vduse_mgmt;
+
+ static int vduse_dev_init_vdpa(struct vduse_dev *dev, const char *name)
+ {
+@@ -1509,7 +1505,7 @@ static int vduse_dev_init_vdpa(struct vduse_dev *dev, const char *name)
+ }
+ set_dma_ops(&vdev->vdpa.dev, &vduse_dev_dma_ops);
+ vdev->vdpa.dma_dev = &vdev->vdpa.dev;
+- vdev->vdpa.mdev = &mgmt_dev;
++ vdev->vdpa.mdev = &vduse_mgmt->mgmt_dev;
+
+ return 0;
+ }
+@@ -1555,34 +1551,52 @@ static struct virtio_device_id id_table[] = {
+ { 0 },
+ };
+
+-static struct vdpa_mgmt_dev mgmt_dev = {
+- .device = &vduse_mgmtdev,
+- .id_table = id_table,
+- .ops = &vdpa_dev_mgmtdev_ops,
+-};
++static void vduse_mgmtdev_release(struct device *dev)
++{
++ struct vduse_mgmt_dev *mgmt_dev;
++
++ mgmt_dev = container_of(dev, struct vduse_mgmt_dev, dev);
++ kfree(mgmt_dev);
++}
+
+ static int vduse_mgmtdev_init(void)
+ {
+ int ret;
+
+- ret = device_register(&vduse_mgmtdev);
+- if (ret)
++ vduse_mgmt = kzalloc(sizeof(*vduse_mgmt), GFP_KERNEL);
++ if (!vduse_mgmt)
++ return -ENOMEM;
++
++ ret = dev_set_name(&vduse_mgmt->dev, "vduse");
++ if (ret) {
++ kfree(vduse_mgmt);
+ return ret;
++ }
+
+- ret = vdpa_mgmtdev_register(&mgmt_dev);
++ vduse_mgmt->dev.release = vduse_mgmtdev_release;
++
++ ret = device_register(&vduse_mgmt->dev);
+ if (ret)
+- goto err;
++ goto dev_reg_err;
+
+- return 0;
+-err:
+- device_unregister(&vduse_mgmtdev);
++ vduse_mgmt->mgmt_dev.id_table = id_table;
++ vduse_mgmt->mgmt_dev.ops = &vdpa_dev_mgmtdev_ops;
++ vduse_mgmt->mgmt_dev.device = &vduse_mgmt->dev;
++ ret = vdpa_mgmtdev_register(&vduse_mgmt->mgmt_dev);
++ if (ret)
++ device_unregister(&vduse_mgmt->dev);
++
++ return ret;
++
++dev_reg_err:
++ put_device(&vduse_mgmt->dev);
+ return ret;
+ }
+
+ static void vduse_mgmtdev_exit(void)
+ {
+- vdpa_mgmtdev_unregister(&mgmt_dev);
+- device_unregister(&vduse_mgmtdev);
++ vdpa_mgmtdev_unregister(&vduse_mgmt->mgmt_dev);
++ device_unregister(&vduse_mgmt->dev);
+ }
+
+ static int vduse_init(void)
+diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
+index 85de02d0d3aaa..643383d74edc7 100644
+--- a/drivers/video/fbdev/core/fbmem.c
++++ b/drivers/video/fbdev/core/fbmem.c
+@@ -19,6 +19,7 @@
+ #include <linux/kernel.h>
+ #include <linux/major.h>
+ #include <linux/slab.h>
++#include <linux/sysfb.h>
+ #include <linux/mm.h>
+ #include <linux/mman.h>
+ #include <linux/vt.h>
+@@ -1787,6 +1788,17 @@ int remove_conflicting_framebuffers(struct apertures_struct *a,
+ do_free = true;
+ }
+
++ /*
++ * If a driver asked to unregister a platform device registered by
++ * sysfb, then can be assumed that this is a driver for a display
++ * that is set up by the system firmware and has a generic driver.
++ *
++ * Drivers for devices that don't have a generic driver will never
++ * ask for this, so let's assume that a real driver for the display
++ * was already probed and prevent sysfb to register devices later.
++ */
++ sysfb_disable();
++
+ mutex_lock(®istration_lock);
+ do_remove_conflicting_framebuffers(a, name, primary);
+ mutex_unlock(®istration_lock);
+diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
+index 1dd396d4bebb2..fe696aafaed86 100644
+--- a/drivers/virtio/virtio_mmio.c
++++ b/drivers/virtio/virtio_mmio.c
+@@ -62,6 +62,7 @@
+ #include <linux/list.h>
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
++#include <linux/pm.h>
+ #include <linux/slab.h>
+ #include <linux/spinlock.h>
+ #include <linux/virtio.h>
+@@ -543,6 +544,28 @@ static const struct virtio_config_ops virtio_mmio_config_ops = {
+ .get_shm_region = vm_get_shm_region,
+ };
+
++#ifdef CONFIG_PM_SLEEP
++static int virtio_mmio_freeze(struct device *dev)
++{
++ struct virtio_mmio_device *vm_dev = dev_get_drvdata(dev);
++
++ return virtio_device_freeze(&vm_dev->vdev);
++}
++
++static int virtio_mmio_restore(struct device *dev)
++{
++ struct virtio_mmio_device *vm_dev = dev_get_drvdata(dev);
++
++ if (vm_dev->version == 1)
++ writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
++
++ return virtio_device_restore(&vm_dev->vdev);
++}
++
++static const struct dev_pm_ops virtio_mmio_pm_ops = {
++ SET_SYSTEM_SLEEP_PM_OPS(virtio_mmio_freeze, virtio_mmio_restore)
++};
++#endif
+
+ static void virtio_mmio_release_dev(struct device *_d)
+ {
+@@ -786,6 +809,9 @@ static struct platform_driver virtio_mmio_driver = {
+ .name = "virtio-mmio",
+ .of_match_table = virtio_mmio_match,
+ .acpi_match_table = ACPI_PTR(virtio_mmio_acpi_match),
++#ifdef CONFIG_PM_SLEEP
++ .pm = &virtio_mmio_pm_ops,
++#endif
+ },
+ };
+
+diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
+index 4b56c39f766d4..84b143eef395b 100644
+--- a/drivers/xen/gntdev.c
++++ b/drivers/xen/gntdev.c
+@@ -396,13 +396,15 @@ static void __unmap_grant_pages_done(int result,
+ unsigned int offset = data->unmap_ops - map->unmap_ops;
+
+ for (i = 0; i < data->count; i++) {
+- WARN_ON(map->unmap_ops[offset+i].status);
++ WARN_ON(map->unmap_ops[offset + i].status != GNTST_okay &&
++ map->unmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
+ pr_debug("unmap handle=%d st=%d\n",
+ map->unmap_ops[offset+i].handle,
+ map->unmap_ops[offset+i].status);
+ map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
+ if (use_ptemod) {
+- WARN_ON(map->kunmap_ops[offset+i].status);
++ WARN_ON(map->kunmap_ops[offset + i].status != GNTST_okay &&
++ map->kunmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
+ pr_debug("kunmap handle=%u st=%d\n",
+ map->kunmap_ops[offset+i].handle,
+ map->kunmap_ops[offset+i].status);
+diff --git a/fs/afs/file.c b/fs/afs/file.c
+index fab8324833ba0..a8a5a91dc3750 100644
+--- a/fs/afs/file.c
++++ b/fs/afs/file.c
+@@ -376,7 +376,7 @@ static int afs_begin_cache_operation(struct netfs_io_request *rreq)
+ }
+
+ static int afs_check_write_begin(struct file *file, loff_t pos, unsigned len,
+- struct folio *folio, void **_fsdata)
++ struct folio **foliop, void **_fsdata)
+ {
+ struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
+
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index 861b9748a0d9f..9ae79342631a8 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -7639,7 +7639,19 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
+ if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) ||
+ em->block_start == EXTENT_MAP_INLINE) {
+ free_extent_map(em);
+- ret = -ENOTBLK;
++ /*
++ * If we are in a NOWAIT context, return -EAGAIN in order to
++ * fallback to buffered IO. This is not only because we can
++ * block with buffered IO (no support for NOWAIT semantics at
++ * the moment) but also to avoid returning short reads to user
++ * space - this happens if we were able to read some data from
++ * previous non-compressed extents and then when we fallback to
++ * buffered IO, at btrfs_file_read_iter() by calling
++ * filemap_read(), we fail to fault in pages for the read buffer,
++ * in which case filemap_read() returns a short read (the number
++ * of bytes previously read is > 0, so it does not return -EFAULT).
++ */
++ ret = (flags & IOMAP_NOWAIT) ? -EAGAIN : -ENOTBLK;
+ goto unlock_err;
+ }
+
+diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
+index b5236f5afca7c..5091d679a602c 100644
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -1727,12 +1727,14 @@ static int read_zone_info(struct btrfs_fs_info *fs_info, u64 logical,
+ ret = btrfs_map_sblock(fs_info, BTRFS_MAP_GET_READ_MIRRORS, logical,
+ &mapped_length, &bioc);
+ if (ret || !bioc || mapped_length < PAGE_SIZE) {
+- btrfs_put_bioc(bioc);
+- return -EIO;
++ ret = -EIO;
++ goto out_put_bioc;
+ }
+
+- if (bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK)
+- return -EINVAL;
++ if (bioc->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
++ ret = -EINVAL;
++ goto out_put_bioc;
++ }
+
+ nofs_flag = memalloc_nofs_save();
+ nmirrors = (int)bioc->num_stripes;
+@@ -1751,7 +1753,8 @@ static int read_zone_info(struct btrfs_fs_info *fs_info, u64 logical,
+ break;
+ }
+ memalloc_nofs_restore(nofs_flag);
+-
++out_put_bioc:
++ btrfs_put_bioc(bioc);
+ return ret;
+ }
+
+diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
+index 11dbb1133a21e..ae567fb7f65a0 100644
+--- a/fs/ceph/addr.c
++++ b/fs/ceph/addr.c
+@@ -63,7 +63,7 @@
+ (CONGESTION_ON_THRESH(congestion_kb) >> 2))
+
+ static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
+- struct folio *folio, void **_fsdata);
++ struct folio **foliop, void **_fsdata);
+
+ static inline struct ceph_snap_context *page_snap_context(struct page *page)
+ {
+@@ -1285,18 +1285,19 @@ ceph_find_incompatible(struct page *page)
+ }
+
+ static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
+- struct folio *folio, void **_fsdata)
++ struct folio **foliop, void **_fsdata)
+ {
+ struct inode *inode = file_inode(file);
+ struct ceph_inode_info *ci = ceph_inode(inode);
+ struct ceph_snap_context *snapc;
+
+- snapc = ceph_find_incompatible(folio_page(folio, 0));
++ snapc = ceph_find_incompatible(folio_page(*foliop, 0));
+ if (snapc) {
+ int r;
+
+- folio_unlock(folio);
+- folio_put(folio);
++ folio_unlock(*foliop);
++ folio_put(*foliop);
++ *foliop = NULL;
+ if (IS_ERR(snapc))
+ return PTR_ERR(snapc);
+
+diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
+index 6a8a00f28b192..2e6c0f4d84499 100644
+--- a/fs/cifs/smb2pdu.c
++++ b/fs/cifs/smb2pdu.c
+@@ -571,10 +571,6 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
+ *total_len += ctxt_len;
+ pneg_ctxt += ctxt_len;
+
+- build_posix_ctxt((struct smb2_posix_neg_context *)pneg_ctxt);
+- *total_len += sizeof(struct smb2_posix_neg_context);
+- pneg_ctxt += sizeof(struct smb2_posix_neg_context);
+-
+ /*
+ * secondary channels don't have the hostname field populated
+ * use the hostname field in the primary channel instead
+@@ -586,9 +582,14 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
+ hostname);
+ *total_len += ctxt_len;
+ pneg_ctxt += ctxt_len;
+- neg_context_count = 4;
+- } else /* second channels do not have a hostname */
+ neg_context_count = 3;
++ } else
++ neg_context_count = 2;
++
++ build_posix_ctxt((struct smb2_posix_neg_context *)pneg_ctxt);
++ *total_len += sizeof(struct smb2_posix_neg_context);
++ pneg_ctxt += sizeof(struct smb2_posix_neg_context);
++ neg_context_count++;
+
+ if (server->compress_algorithm) {
+ build_compression_ctxt((struct smb2_compression_capabilities_context *)
+diff --git a/fs/exec.c b/fs/exec.c
+index 75eb6e0ee7b2f..5a75e92b1a0a3 100644
+--- a/fs/exec.c
++++ b/fs/exec.c
+@@ -1297,7 +1297,7 @@ int begin_new_exec(struct linux_binprm * bprm)
+ bprm->mm = NULL;
+
+ #ifdef CONFIG_POSIX_TIMERS
+- exit_itimers(me->signal);
++ exit_itimers(me);
+ flush_itimer_signals();
+ #endif
+
+diff --git a/fs/ksmbd/transport_tcp.c b/fs/ksmbd/transport_tcp.c
+index 8fef9de787d34..143bba4e4db81 100644
+--- a/fs/ksmbd/transport_tcp.c
++++ b/fs/ksmbd/transport_tcp.c
+@@ -230,7 +230,7 @@ static int ksmbd_kthread_fn(void *p)
+ break;
+ }
+ ret = kernel_accept(iface->ksmbd_socket, &client_sk,
+- O_NONBLOCK);
++ SOCK_NONBLOCK);
+ mutex_unlock(&iface->sock_release_lock);
+ if (ret) {
+ if (ret == -EAGAIN)
+diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
+index 0a22a2faf5522..e1c4617de7714 100644
+--- a/fs/lockd/svcsubs.c
++++ b/fs/lockd/svcsubs.c
+@@ -176,7 +176,7 @@ nlm_delete_file(struct nlm_file *file)
+ }
+ }
+
+-static int nlm_unlock_files(struct nlm_file *file)
++static int nlm_unlock_files(struct nlm_file *file, fl_owner_t owner)
+ {
+ struct file_lock lock;
+
+@@ -184,6 +184,7 @@ static int nlm_unlock_files(struct nlm_file *file)
+ lock.fl_type = F_UNLCK;
+ lock.fl_start = 0;
+ lock.fl_end = OFFSET_MAX;
++ lock.fl_owner = owner;
+ if (file->f_file[O_RDONLY] &&
+ vfs_lock_file(file->f_file[O_RDONLY], F_SETLK, &lock, NULL))
+ goto out_err;
+@@ -225,7 +226,7 @@ again:
+ if (match(lockhost, host)) {
+
+ spin_unlock(&flctx->flc_lock);
+- if (nlm_unlock_files(file))
++ if (nlm_unlock_files(file, fl->fl_owner))
+ return 1;
+ goto again;
+ }
+@@ -282,11 +283,10 @@ nlm_file_inuse(struct nlm_file *file)
+
+ static void nlm_close_files(struct nlm_file *file)
+ {
+- struct file *f;
+-
+- for (f = file->f_file[0]; f <= file->f_file[1]; f++)
+- if (f)
+- nlmsvc_ops->fclose(f);
++ if (file->f_file[O_RDONLY])
++ nlmsvc_ops->fclose(file->f_file[O_RDONLY]);
++ if (file->f_file[O_WRONLY])
++ nlmsvc_ops->fclose(file->f_file[O_WRONLY]);
+ }
+
+ /*
+diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
+index e8e3359a4c547..8d03826c2b152 100644
+--- a/fs/netfs/buffered_read.c
++++ b/fs/netfs/buffered_read.c
+@@ -320,8 +320,9 @@ zero_out:
+ * conflicting writes once the folio is grabbed and locked. It is passed a
+ * pointer to the fsdata cookie that gets returned to the VM to be passed to
+ * write_end. It is permitted to sleep. It should return 0 if the request
+- * should go ahead; unlock the folio and return -EAGAIN to cause the folio to
+- * be regot; or return an error.
++ * should go ahead or it may return an error. It may also unlock and put the
++ * folio, provided it sets ``*foliop`` to NULL, in which case a return of 0
++ * will cause the folio to be re-got and the process to be retried.
+ *
+ * The calling netfs must initialise a netfs context contiguous to the vfs
+ * inode before calling this.
+@@ -352,13 +353,13 @@ retry:
+
+ if (ctx->ops->check_write_begin) {
+ /* Allow the netfs (eg. ceph) to flush conflicts. */
+- ret = ctx->ops->check_write_begin(file, pos, len, folio, _fsdata);
++ ret = ctx->ops->check_write_begin(file, pos, len, &folio, _fsdata);
+ if (ret < 0) {
+ trace_netfs_failure(NULL, NULL, ret, netfs_fail_check_write_begin);
+- if (ret == -EAGAIN)
+- goto retry;
+ goto error;
+ }
++ if (!folio)
++ goto retry;
+ }
+
+ if (folio_test_uptodate(folio))
+@@ -420,8 +421,10 @@ have_folio_no_wait:
+ error_put:
+ netfs_put_request(rreq, false, netfs_rreq_trace_put_failed);
+ error:
+- folio_unlock(folio);
+- folio_put(folio);
++ if (folio) {
++ folio_unlock(folio);
++ folio_put(folio);
++ }
+ _leave(" = %d", ret);
+ return ret;
+ }
+diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
+index da92e7d2ab6a1..264c3a4629c9a 100644
+--- a/fs/nfsd/nfs4xdr.c
++++ b/fs/nfsd/nfs4xdr.c
+@@ -470,6 +470,15 @@ nfsd4_decode_fattr4(struct nfsd4_compoundargs *argp, u32 *bmval, u32 bmlen,
+ return nfserr_bad_xdr;
+ }
+ }
++ if (bmval[1] & FATTR4_WORD1_TIME_CREATE) {
++ struct timespec64 ts;
++
++ /* No Linux filesystem supports setting this attribute. */
++ bmval[1] &= ~FATTR4_WORD1_TIME_CREATE;
++ status = nfsd4_decode_nfstime4(argp, &ts);
++ if (status)
++ return status;
++ }
+ if (bmval[1] & FATTR4_WORD1_TIME_MODIFY_SET) {
+ u32 set_it;
+
+diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
+index 4fc1fd639527a..727754d56243d 100644
+--- a/fs/nfsd/nfsd.h
++++ b/fs/nfsd/nfsd.h
+@@ -460,7 +460,8 @@ static inline bool nfsd_attrs_supported(u32 minorversion, const u32 *bmval)
+ (FATTR4_WORD0_SIZE | FATTR4_WORD0_ACL)
+ #define NFSD_WRITEABLE_ATTRS_WORD1 \
+ (FATTR4_WORD1_MODE | FATTR4_WORD1_OWNER | FATTR4_WORD1_OWNER_GROUP \
+- | FATTR4_WORD1_TIME_ACCESS_SET | FATTR4_WORD1_TIME_MODIFY_SET)
++ | FATTR4_WORD1_TIME_ACCESS_SET | FATTR4_WORD1_TIME_CREATE \
++ | FATTR4_WORD1_TIME_MODIFY_SET)
+ #ifdef CONFIG_NFSD_V4_SECURITY_LABEL
+ #define MAYBE_FATTR4_WORD2_SECURITY_LABEL \
+ FATTR4_WORD2_SECURITY_LABEL
+diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
+index 1344f7d475d3c..aecda4fc95f5f 100644
+--- a/fs/nilfs2/nilfs.h
++++ b/fs/nilfs2/nilfs.h
+@@ -198,6 +198,9 @@ static inline int nilfs_acl_chmod(struct inode *inode)
+
+ static inline int nilfs_init_acl(struct inode *inode, struct inode *dir)
+ {
++ if (S_ISLNK(inode->i_mode))
++ return 0;
++
+ inode->i_mode &= ~current_umask();
+ return 0;
+ }
+diff --git a/fs/remap_range.c b/fs/remap_range.c
+index e112b5424cdb9..881a306ee2473 100644
+--- a/fs/remap_range.c
++++ b/fs/remap_range.c
+@@ -71,7 +71,8 @@ static int generic_remap_checks(struct file *file_in, loff_t pos_in,
+ * Otherwise, make sure the count is also block-aligned, having
+ * already confirmed the starting offsets' block alignment.
+ */
+- if (pos_in + count == size_in) {
++ if (pos_in + count == size_in &&
++ (!(remap_flags & REMAP_FILE_DEDUP) || pos_out + count == size_out)) {
+ bcount = ALIGN(size_in, bs) - pos_in;
+ } else {
+ if (!IS_ALIGNED(count, bs))
+diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
+index 1bfcfb1af3524..d4427d0a0e187 100644
+--- a/include/linux/cgroup-defs.h
++++ b/include/linux/cgroup-defs.h
+@@ -264,7 +264,8 @@ struct css_set {
+ * List of csets participating in the on-going migration either as
+ * source or destination. Protected by cgroup_mutex.
+ */
+- struct list_head mg_preload_node;
++ struct list_head mg_src_preload_node;
++ struct list_head mg_dst_preload_node;
+ struct list_head mg_node;
+
+ /*
+diff --git a/include/linux/kexec.h b/include/linux/kexec.h
+index fcd5035209f19..8d573baaab29e 100644
+--- a/include/linux/kexec.h
++++ b/include/linux/kexec.h
+@@ -452,6 +452,12 @@ static inline int kexec_crash_loaded(void) { return 0; }
+ #define kexec_in_progress false
+ #endif /* CONFIG_KEXEC_CORE */
+
++#ifdef CONFIG_KEXEC_SIG
++void set_kexec_sig_enforced(void);
++#else
++static inline void set_kexec_sig_enforced(void) {}
++#endif
++
+ #endif /* !defined(__ASSEBMLY__) */
+
+ #endif /* LINUX_KEXEC_H */
+diff --git a/include/linux/netfs.h b/include/linux/netfs.h
+index a9c6f73877ecb..95dadf0cd4b8e 100644
+--- a/include/linux/netfs.h
++++ b/include/linux/netfs.h
+@@ -211,7 +211,7 @@ struct netfs_request_ops {
+ void (*issue_read)(struct netfs_io_subrequest *subreq);
+ bool (*is_still_valid)(struct netfs_io_request *rreq);
+ int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
+- struct folio *folio, void **_fsdata);
++ struct folio **foliop, void **_fsdata);
+ void (*done)(struct netfs_io_request *rreq);
+ void (*cleanup)(struct address_space *mapping, void *netfs_priv);
+ };
+diff --git a/include/linux/nvme.h b/include/linux/nvme.h
+index f626a445d1a87..99b1b56f0cd39 100644
+--- a/include/linux/nvme.h
++++ b/include/linux/nvme.h
+@@ -867,12 +867,14 @@ struct nvme_common_command {
+ __le32 cdw2[2];
+ __le64 metadata;
+ union nvme_data_ptr dptr;
++ struct_group(cdws,
+ __le32 cdw10;
+ __le32 cdw11;
+ __le32 cdw12;
+ __le32 cdw13;
+ __le32 cdw14;
+ __le32 cdw15;
++ );
+ };
+
+ struct nvme_rw_command {
+diff --git a/include/linux/reset.h b/include/linux/reset.h
+index 8a21b5756c3ef..514ddf003efc7 100644
+--- a/include/linux/reset.h
++++ b/include/linux/reset.h
+@@ -731,7 +731,7 @@ static inline int __must_check
+ devm_reset_control_bulk_get_optional_exclusive(struct device *dev, int num_rstcs,
+ struct reset_control_bulk_data *rstcs)
+ {
+- return __devm_reset_control_bulk_get(dev, num_rstcs, rstcs, true, false, true);
++ return __devm_reset_control_bulk_get(dev, num_rstcs, rstcs, false, true, true);
+ }
+
+ /**
+diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
+index 4492266935dd7..d8d9ce5b3bf74 100644
+--- a/include/linux/sched/task.h
++++ b/include/linux/sched/task.h
+@@ -83,7 +83,7 @@ static inline void exit_thread(struct task_struct *tsk)
+ extern __noreturn void do_group_exit(int);
+
+ extern void exit_files(struct task_struct *);
+-extern void exit_itimers(struct signal_struct *);
++extern void exit_itimers(struct task_struct *);
+
+ extern pid_t kernel_clone(struct kernel_clone_args *kargs);
+ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
+diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
+index d4828e69087a4..ca57d686d4d1e 100644
+--- a/include/linux/serial_core.h
++++ b/include/linux/serial_core.h
+@@ -388,6 +388,11 @@ static const bool earlycon_acpi_spcr_enable EARLYCON_USED_OR_UNUSED;
+ static inline int setup_earlycon(char *buf) { return 0; }
+ #endif
+
++static inline bool uart_console_enabled(struct uart_port *port)
++{
++ return uart_console(port) && (port->cons->flags & CON_ENABLED);
++}
++
+ struct uart_port *uart_get_console(struct uart_port *ports, int nr,
+ struct console *c);
+ int uart_parse_earlycon(char *p, unsigned char *iotype, resource_size_t *addr,
+diff --git a/include/linux/sysfb.h b/include/linux/sysfb.h
+index b0dcfa26d07bd..8ba8b5be55675 100644
+--- a/include/linux/sysfb.h
++++ b/include/linux/sysfb.h
+@@ -55,6 +55,18 @@ struct efifb_dmi_info {
+ int flags;
+ };
+
++#ifdef CONFIG_SYSFB
++
++void sysfb_disable(void);
++
++#else /* CONFIG_SYSFB */
++
++static inline void sysfb_disable(void)
++{
++}
++
++#endif /* CONFIG_SYSFB */
++
+ #ifdef CONFIG_EFI
+
+ extern struct efifb_dmi_info efifb_dmi_list[];
+@@ -72,8 +84,8 @@ static inline void sysfb_apply_efi_quirks(struct platform_device *pd)
+
+ bool sysfb_parse_mode(const struct screen_info *si,
+ struct simplefb_platform_data *mode);
+-int sysfb_create_simplefb(const struct screen_info *si,
+- const struct simplefb_platform_data *mode);
++struct platform_device *sysfb_create_simplefb(const struct screen_info *si,
++ const struct simplefb_platform_data *mode);
+
+ #else /* CONFIG_SYSFB_SIMPLE */
+
+@@ -83,10 +95,10 @@ static inline bool sysfb_parse_mode(const struct screen_info *si,
+ return false;
+ }
+
+-static inline int sysfb_create_simplefb(const struct screen_info *si,
+- const struct simplefb_platform_data *mode)
++static inline struct platform_device *sysfb_create_simplefb(const struct screen_info *si,
++ const struct simplefb_platform_data *mode)
+ {
+- return -EINVAL;
++ return ERR_PTR(-EINVAL);
+ }
+
+ #endif /* CONFIG_SYSFB_SIMPLE */
+diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
+index b08b70989d2cf..28672a9444997 100644
+--- a/include/net/netfilter/nf_conntrack.h
++++ b/include/net/netfilter/nf_conntrack.h
+@@ -43,6 +43,12 @@ union nf_conntrack_expect_proto {
+ /* insert expect proto private data here */
+ };
+
++struct nf_conntrack_net_ecache {
++ struct delayed_work dwork;
++ spinlock_t dying_lock;
++ struct hlist_nulls_head dying_list;
++};
++
+ struct nf_conntrack_net {
+ /* only used when new connection is allocated: */
+ atomic_t count;
+@@ -58,8 +64,7 @@ struct nf_conntrack_net {
+ struct ctl_table_header *sysctl_header;
+ #endif
+ #ifdef CONFIG_NF_CONNTRACK_EVENTS
+- struct delayed_work ecache_dwork;
+- struct netns_ct *ct_net;
++ struct nf_conntrack_net_ecache ecache;
+ #endif
+ };
+
+diff --git a/include/net/netfilter/nf_conntrack_ecache.h b/include/net/netfilter/nf_conntrack_ecache.h
+index 6c4c490a3e342..b57d73785e4d3 100644
+--- a/include/net/netfilter/nf_conntrack_ecache.h
++++ b/include/net/netfilter/nf_conntrack_ecache.h
+@@ -14,7 +14,6 @@
+ #include <net/netfilter/nf_conntrack_extend.h>
+
+ enum nf_ct_ecache_state {
+- NFCT_ECACHE_UNKNOWN, /* destroy event not sent */
+ NFCT_ECACHE_DESTROY_FAIL, /* tried but failed to send destroy event */
+ NFCT_ECACHE_DESTROY_SENT, /* sent destroy event after failure */
+ };
+@@ -23,7 +22,6 @@ struct nf_conntrack_ecache {
+ unsigned long cache; /* bitops want long */
+ u16 ctmask; /* bitmask of ct events to be delivered */
+ u16 expmask; /* bitmask of expect events to be delivered */
+- enum nf_ct_ecache_state state:8;/* ecache state */
+ u32 missed; /* missed events */
+ u32 portid; /* netlink portid of destroyer */
+ };
+@@ -166,6 +164,8 @@ void nf_conntrack_ecache_work(struct net *net, enum nf_ct_ecache_state state);
+ void nf_conntrack_ecache_pernet_init(struct net *net);
+ void nf_conntrack_ecache_pernet_fini(struct net *net);
+
++struct nf_conntrack_net_ecache *nf_conn_pernet_ecache(const struct net *net);
++
+ static inline bool nf_conntrack_ecache_dwork_pending(const struct net *net)
+ {
+ return net->ct.ecache_dwork_pending;
+diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
+index 279ae0fff7adb..64cf655c818cc 100644
+--- a/include/net/netfilter/nf_tables.h
++++ b/include/net/netfilter/nf_tables.h
+@@ -657,18 +657,22 @@ static inline void nft_set_ext_prepare(struct nft_set_ext_tmpl *tmpl)
+ tmpl->len = sizeof(struct nft_set_ext);
+ }
+
+-static inline void nft_set_ext_add_length(struct nft_set_ext_tmpl *tmpl, u8 id,
+- unsigned int len)
++static inline int nft_set_ext_add_length(struct nft_set_ext_tmpl *tmpl, u8 id,
++ unsigned int len)
+ {
+ tmpl->len = ALIGN(tmpl->len, nft_set_ext_types[id].align);
+- BUG_ON(tmpl->len > U8_MAX);
++ if (tmpl->len > U8_MAX)
++ return -EINVAL;
++
+ tmpl->offset[id] = tmpl->len;
+ tmpl->len += nft_set_ext_types[id].len + len;
++
++ return 0;
+ }
+
+-static inline void nft_set_ext_add(struct nft_set_ext_tmpl *tmpl, u8 id)
++static inline int nft_set_ext_add(struct nft_set_ext_tmpl *tmpl, u8 id)
+ {
+- nft_set_ext_add_length(tmpl, id, 0);
++ return nft_set_ext_add_length(tmpl, id, 0);
+ }
+
+ static inline void nft_set_ext_init(struct nft_set_ext *ext,
+@@ -1338,24 +1342,28 @@ void nft_unregister_flowtable_type(struct nf_flowtable_type *type);
+ /**
+ * struct nft_traceinfo - nft tracing information and state
+ *
++ * @trace: other struct members are initialised
++ * @nf_trace: copy of skb->nf_trace before rule evaluation
++ * @type: event type (enum nft_trace_types)
++ * @skbid: hash of skb to be used as trace id
++ * @packet_dumped: packet headers sent in a previous traceinfo message
+ * @pkt: pktinfo currently processed
+ * @basechain: base chain currently processed
+ * @chain: chain currently processed
+ * @rule: rule that was evaluated
+ * @verdict: verdict given by rule
+- * @type: event type (enum nft_trace_types)
+- * @packet_dumped: packet headers sent in a previous traceinfo message
+- * @trace: other struct members are initialised
+ */
+ struct nft_traceinfo {
++ bool trace;
++ bool nf_trace;
++ bool packet_dumped;
++ enum nft_trace_types type:8;
++ u32 skbid;
+ const struct nft_pktinfo *pkt;
+ const struct nft_base_chain *basechain;
+ const struct nft_chain *chain;
+ const struct nft_rule_dp *rule;
+ const struct nft_verdict *verdict;
+- enum nft_trace_types type;
+- bool packet_dumped;
+- bool trace;
+ };
+
+ void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,
+diff --git a/include/net/netns/conntrack.h b/include/net/netns/conntrack.h
+index 0294f3d473af8..e985a3010b89c 100644
+--- a/include/net/netns/conntrack.h
++++ b/include/net/netns/conntrack.h
+@@ -96,7 +96,6 @@ struct nf_ip_net {
+ struct ct_pcpu {
+ spinlock_t lock;
+ struct hlist_nulls_head unconfirmed;
+- struct hlist_nulls_head dying;
+ };
+
+ struct netns_ct {
+diff --git a/include/net/raw.h b/include/net/raw.h
+index 8ad8df5948536..c51a635671a73 100644
+--- a/include/net/raw.h
++++ b/include/net/raw.h
+@@ -75,7 +75,7 @@ static inline bool raw_sk_bound_dev_eq(struct net *net, int bound_dev_if,
+ int dif, int sdif)
+ {
+ #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
+- return inet_bound_dev_eq(!!net->ipv4.sysctl_raw_l3mdev_accept,
++ return inet_bound_dev_eq(READ_ONCE(net->ipv4.sysctl_raw_l3mdev_accept),
+ bound_dev_if, dif, sdif);
+ #else
+ return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
+diff --git a/include/net/sock.h b/include/net/sock.h
+index 3c4fb8f03fd99..6bef0ffb1e7b7 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -1534,7 +1534,7 @@ void __sk_mem_reclaim(struct sock *sk, int amount);
+ /* sysctl_mem values are in pages, we convert them in SK_MEM_QUANTUM units */
+ static inline long sk_prot_mem_limits(const struct sock *sk, int index)
+ {
+- long val = sk->sk_prot->sysctl_mem[index];
++ long val = READ_ONCE(sk->sk_prot->sysctl_mem[index]);
+
+ #if PAGE_SIZE > SK_MEM_QUANTUM
+ val <<= PAGE_SHIFT - SK_MEM_QUANTUM_SHIFT;
+diff --git a/include/net/tls.h b/include/net/tls.h
+index b6968a5b55389..e8764d3da41ac 100644
+--- a/include/net/tls.h
++++ b/include/net/tls.h
+@@ -708,7 +708,7 @@ int tls_sw_fallback_init(struct sock *sk,
+ struct tls_crypto_info *crypto_info);
+
+ #ifdef CONFIG_TLS_DEVICE
+-void tls_device_init(void);
++int tls_device_init(void);
+ void tls_device_cleanup(void);
+ void tls_device_sk_destruct(struct sock *sk);
+ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
+@@ -728,7 +728,7 @@ static inline bool tls_is_sk_rx_device_offloaded(struct sock *sk)
+ return tls_get_ctx(sk)->rx_conf == TLS_HW;
+ }
+ #else
+-static inline void tls_device_init(void) {}
++static inline int tls_device_init(void) { return 0; }
+ static inline void tls_device_cleanup(void) {}
+
+ static inline int
+diff --git a/include/trace/events/sock.h b/include/trace/events/sock.h
+index 12c315782766a..777ee6cbe9330 100644
+--- a/include/trace/events/sock.h
++++ b/include/trace/events/sock.h
+@@ -98,7 +98,7 @@ TRACE_EVENT(sock_exceed_buf_limit,
+
+ TP_STRUCT__entry(
+ __array(char, name, 32)
+- __field(long *, sysctl_mem)
++ __array(long, sysctl_mem, 3)
+ __field(long, allocated)
+ __field(int, sysctl_rmem)
+ __field(int, rmem_alloc)
+@@ -110,7 +110,9 @@ TRACE_EVENT(sock_exceed_buf_limit,
+
+ TP_fast_assign(
+ strncpy(__entry->name, prot->name, 32);
+- __entry->sysctl_mem = prot->sysctl_mem;
++ __entry->sysctl_mem[0] = READ_ONCE(prot->sysctl_mem[0]);
++ __entry->sysctl_mem[1] = READ_ONCE(prot->sysctl_mem[1]);
++ __entry->sysctl_mem[2] = READ_ONCE(prot->sysctl_mem[2]);
+ __entry->allocated = allocated;
+ __entry->sysctl_rmem = sk_get_rmem0(sk, prot);
+ __entry->rmem_alloc = atomic_read(&sk->sk_rmem_alloc);
+diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
+index adb820e98f243..e6cd7e17bd259 100644
+--- a/kernel/cgroup/cgroup.c
++++ b/kernel/cgroup/cgroup.c
+@@ -765,7 +765,8 @@ struct css_set init_css_set = {
+ .task_iters = LIST_HEAD_INIT(init_css_set.task_iters),
+ .threaded_csets = LIST_HEAD_INIT(init_css_set.threaded_csets),
+ .cgrp_links = LIST_HEAD_INIT(init_css_set.cgrp_links),
+- .mg_preload_node = LIST_HEAD_INIT(init_css_set.mg_preload_node),
++ .mg_src_preload_node = LIST_HEAD_INIT(init_css_set.mg_src_preload_node),
++ .mg_dst_preload_node = LIST_HEAD_INIT(init_css_set.mg_dst_preload_node),
+ .mg_node = LIST_HEAD_INIT(init_css_set.mg_node),
+
+ /*
+@@ -1240,7 +1241,8 @@ static struct css_set *find_css_set(struct css_set *old_cset,
+ INIT_LIST_HEAD(&cset->threaded_csets);
+ INIT_HLIST_NODE(&cset->hlist);
+ INIT_LIST_HEAD(&cset->cgrp_links);
+- INIT_LIST_HEAD(&cset->mg_preload_node);
++ INIT_LIST_HEAD(&cset->mg_src_preload_node);
++ INIT_LIST_HEAD(&cset->mg_dst_preload_node);
+ INIT_LIST_HEAD(&cset->mg_node);
+
+ /* Copy the set of subsystem state objects generated in
+@@ -2597,21 +2599,27 @@ int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp)
+ */
+ void cgroup_migrate_finish(struct cgroup_mgctx *mgctx)
+ {
+- LIST_HEAD(preloaded);
+ struct css_set *cset, *tmp_cset;
+
+ lockdep_assert_held(&cgroup_mutex);
+
+ spin_lock_irq(&css_set_lock);
+
+- list_splice_tail_init(&mgctx->preloaded_src_csets, &preloaded);
+- list_splice_tail_init(&mgctx->preloaded_dst_csets, &preloaded);
++ list_for_each_entry_safe(cset, tmp_cset, &mgctx->preloaded_src_csets,
++ mg_src_preload_node) {
++ cset->mg_src_cgrp = NULL;
++ cset->mg_dst_cgrp = NULL;
++ cset->mg_dst_cset = NULL;
++ list_del_init(&cset->mg_src_preload_node);
++ put_css_set_locked(cset);
++ }
+
+- list_for_each_entry_safe(cset, tmp_cset, &preloaded, mg_preload_node) {
++ list_for_each_entry_safe(cset, tmp_cset, &mgctx->preloaded_dst_csets,
++ mg_dst_preload_node) {
+ cset->mg_src_cgrp = NULL;
+ cset->mg_dst_cgrp = NULL;
+ cset->mg_dst_cset = NULL;
+- list_del_init(&cset->mg_preload_node);
++ list_del_init(&cset->mg_dst_preload_node);
+ put_css_set_locked(cset);
+ }
+
+@@ -2651,7 +2659,7 @@ void cgroup_migrate_add_src(struct css_set *src_cset,
+ if (src_cset->dead)
+ return;
+
+- if (!list_empty(&src_cset->mg_preload_node))
++ if (!list_empty(&src_cset->mg_src_preload_node))
+ return;
+
+ src_cgrp = cset_cgroup_from_root(src_cset, dst_cgrp->root);
+@@ -2664,7 +2672,7 @@ void cgroup_migrate_add_src(struct css_set *src_cset,
+ src_cset->mg_src_cgrp = src_cgrp;
+ src_cset->mg_dst_cgrp = dst_cgrp;
+ get_css_set(src_cset);
+- list_add_tail(&src_cset->mg_preload_node, &mgctx->preloaded_src_csets);
++ list_add_tail(&src_cset->mg_src_preload_node, &mgctx->preloaded_src_csets);
+ }
+
+ /**
+@@ -2689,7 +2697,7 @@ int cgroup_migrate_prepare_dst(struct cgroup_mgctx *mgctx)
+
+ /* look up the dst cset for each src cset and link it to src */
+ list_for_each_entry_safe(src_cset, tmp_cset, &mgctx->preloaded_src_csets,
+- mg_preload_node) {
++ mg_src_preload_node) {
+ struct css_set *dst_cset;
+ struct cgroup_subsys *ss;
+ int ssid;
+@@ -2708,7 +2716,7 @@ int cgroup_migrate_prepare_dst(struct cgroup_mgctx *mgctx)
+ if (src_cset == dst_cset) {
+ src_cset->mg_src_cgrp = NULL;
+ src_cset->mg_dst_cgrp = NULL;
+- list_del_init(&src_cset->mg_preload_node);
++ list_del_init(&src_cset->mg_src_preload_node);
+ put_css_set(src_cset);
+ put_css_set(dst_cset);
+ continue;
+@@ -2716,8 +2724,8 @@ int cgroup_migrate_prepare_dst(struct cgroup_mgctx *mgctx)
+
+ src_cset->mg_dst_cset = dst_cset;
+
+- if (list_empty(&dst_cset->mg_preload_node))
+- list_add_tail(&dst_cset->mg_preload_node,
++ if (list_empty(&dst_cset->mg_dst_preload_node))
++ list_add_tail(&dst_cset->mg_dst_preload_node,
+ &mgctx->preloaded_dst_csets);
+ else
+ put_css_set(dst_cset);
+@@ -2963,7 +2971,8 @@ static int cgroup_update_dfl_csses(struct cgroup *cgrp)
+ goto out_finish;
+
+ spin_lock_irq(&css_set_lock);
+- list_for_each_entry(src_cset, &mgctx.preloaded_src_csets, mg_preload_node) {
++ list_for_each_entry(src_cset, &mgctx.preloaded_src_csets,
++ mg_src_preload_node) {
+ struct task_struct *task, *ntask;
+
+ /* all tasks in src_csets need to be migrated */
+diff --git a/kernel/exit.c b/kernel/exit.c
+index f072959fcab7f..64c938ce36fe8 100644
+--- a/kernel/exit.c
++++ b/kernel/exit.c
+@@ -766,7 +766,7 @@ void __noreturn do_exit(long code)
+
+ #ifdef CONFIG_POSIX_TIMERS
+ hrtimer_cancel(&tsk->signal->real_timer);
+- exit_itimers(tsk->signal);
++ exit_itimers(tsk);
+ #endif
+ if (tsk->mm)
+ setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
+diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
+index c108a2a887546..bb0fb63f563cf 100644
+--- a/kernel/kexec_file.c
++++ b/kernel/kexec_file.c
+@@ -29,6 +29,15 @@
+ #include <linux/vmalloc.h>
+ #include "kexec_internal.h"
+
++#ifdef CONFIG_KEXEC_SIG
++static bool sig_enforce = IS_ENABLED(CONFIG_KEXEC_SIG_FORCE);
++
++void set_kexec_sig_enforced(void)
++{
++ sig_enforce = true;
++}
++#endif
++
+ static int kexec_calculate_store_digests(struct kimage *image);
+
+ /*
+@@ -159,7 +168,7 @@ kimage_validate_signature(struct kimage *image)
+ image->kernel_buf_len);
+ if (ret) {
+
+- if (IS_ENABLED(CONFIG_KEXEC_SIG_FORCE)) {
++ if (sig_enforce) {
+ pr_notice("Enforced kernel signature verification failed (%d).\n", ret);
+ return ret;
+ }
+diff --git a/kernel/signal.c b/kernel/signal.c
+index e43bc2a692f5e..75cc2339d83ed 100644
+--- a/kernel/signal.c
++++ b/kernel/signal.c
+@@ -2031,12 +2031,12 @@ bool do_notify_parent(struct task_struct *tsk, int sig)
+ bool autoreap = false;
+ u64 utime, stime;
+
+- BUG_ON(sig == -1);
++ WARN_ON_ONCE(sig == -1);
+
+- /* do_notify_parent_cldstop should have been called instead. */
+- BUG_ON(task_is_stopped_or_traced(tsk));
++ /* do_notify_parent_cldstop should have been called instead. */
++ WARN_ON_ONCE(task_is_stopped_or_traced(tsk));
+
+- BUG_ON(!tsk->ptrace &&
++ WARN_ON_ONCE(!tsk->ptrace &&
+ (tsk->group_leader != tsk || !thread_group_empty(tsk)));
+
+ /* Wake up all pidfd waiters */
+diff --git a/kernel/sysctl.c b/kernel/sysctl.c
+index 830aaf8ca08ee..c42ba2d669dcc 100644
+--- a/kernel/sysctl.c
++++ b/kernel/sysctl.c
+@@ -518,14 +518,14 @@ static int do_proc_dointvec_conv(bool *negp, unsigned long *lvalp,
+ if (*negp) {
+ if (*lvalp > (unsigned long) INT_MAX + 1)
+ return -EINVAL;
+- *valp = -*lvalp;
++ WRITE_ONCE(*valp, -*lvalp);
+ } else {
+ if (*lvalp > (unsigned long) INT_MAX)
+ return -EINVAL;
+- *valp = *lvalp;
++ WRITE_ONCE(*valp, *lvalp);
+ }
+ } else {
+- int val = *valp;
++ int val = READ_ONCE(*valp);
+ if (val < 0) {
+ *negp = true;
+ *lvalp = -(unsigned long)val;
+@@ -544,9 +544,9 @@ static int do_proc_douintvec_conv(unsigned long *lvalp,
+ if (write) {
+ if (*lvalp > UINT_MAX)
+ return -EINVAL;
+- *valp = *lvalp;
++ WRITE_ONCE(*valp, *lvalp);
+ } else {
+- unsigned int val = *valp;
++ unsigned int val = READ_ONCE(*valp);
+ *lvalp = (unsigned long)val;
+ }
+ return 0;
+@@ -929,7 +929,7 @@ static int do_proc_dointvec_minmax_conv(bool *negp, unsigned long *lvalp,
+ if ((param->min && *param->min > tmp) ||
+ (param->max && *param->max < tmp))
+ return -EINVAL;
+- *valp = tmp;
++ WRITE_ONCE(*valp, tmp);
+ }
+
+ return 0;
+@@ -995,7 +995,7 @@ static int do_proc_douintvec_minmax_conv(unsigned long *lvalp,
+ (param->max && *param->max < tmp))
+ return -ERANGE;
+
+- *valp = tmp;
++ WRITE_ONCE(*valp, tmp);
+ }
+
+ return 0;
+@@ -1079,13 +1079,13 @@ int proc_dou8vec_minmax(struct ctl_table *table, int write,
+
+ tmp.maxlen = sizeof(val);
+ tmp.data = &val;
+- val = *data;
++ val = READ_ONCE(*data);
+ res = do_proc_douintvec(&tmp, write, buffer, lenp, ppos,
+ do_proc_douintvec_minmax_conv, ¶m);
+ if (res)
+ return res;
+ if (write)
+- *data = val;
++ WRITE_ONCE(*data, val);
+ return 0;
+ }
+ EXPORT_SYMBOL_GPL(proc_dou8vec_minmax);
+@@ -1162,9 +1162,9 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table,
+ err = -EINVAL;
+ break;
+ }
+- *i = val;
++ WRITE_ONCE(*i, val);
+ } else {
+- val = convdiv * (*i) / convmul;
++ val = convdiv * READ_ONCE(*i) / convmul;
+ if (!first)
+ proc_put_char(&buffer, &left, '\t');
+ proc_put_long(&buffer, &left, val, false);
+@@ -1245,9 +1245,12 @@ static int do_proc_dointvec_jiffies_conv(bool *negp, unsigned long *lvalp,
+ if (write) {
+ if (*lvalp > INT_MAX / HZ)
+ return 1;
+- *valp = *negp ? -(*lvalp*HZ) : (*lvalp*HZ);
++ if (*negp)
++ WRITE_ONCE(*valp, -*lvalp * HZ);
++ else
++ WRITE_ONCE(*valp, *lvalp * HZ);
+ } else {
+- int val = *valp;
++ int val = READ_ONCE(*valp);
+ unsigned long lval;
+ if (val < 0) {
+ *negp = true;
+@@ -1293,9 +1296,9 @@ static int do_proc_dointvec_ms_jiffies_conv(bool *negp, unsigned long *lvalp,
+
+ if (jif > INT_MAX)
+ return 1;
+- *valp = (int)jif;
++ WRITE_ONCE(*valp, (int)jif);
+ } else {
+- int val = *valp;
++ int val = READ_ONCE(*valp);
+ unsigned long lval;
+ if (val < 0) {
+ *negp = true;
+@@ -1363,8 +1366,8 @@ int proc_dointvec_userhz_jiffies(struct ctl_table *table, int write,
+ * @ppos: the current position in the file
+ *
+ * Reads/writes up to table->maxlen/sizeof(unsigned int) integer
+- * values from/to the user buffer, treated as an ASCII string.
+- * The values read are assumed to be in 1/1000 seconds, and
++ * values from/to the user buffer, treated as an ASCII string.
++ * The values read are assumed to be in 1/1000 seconds, and
+ * are converted into jiffies.
+ *
+ * Returns 0 on success.
+@@ -2463,6 +2466,17 @@ static struct ctl_table vm_table[] = {
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_TWO_HUNDRED,
+ },
++#ifdef CONFIG_NUMA
++ {
++ .procname = "numa_stat",
++ .data = &sysctl_vm_numa_stat,
++ .maxlen = sizeof(int),
++ .mode = 0644,
++ .proc_handler = sysctl_vm_numa_stat_handler,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
++ },
++#endif
+ #ifdef CONFIG_HUGETLB_PAGE
+ {
+ .procname = "nr_hugepages",
+@@ -2479,15 +2493,6 @@ static struct ctl_table vm_table[] = {
+ .mode = 0644,
+ .proc_handler = &hugetlb_mempolicy_sysctl_handler,
+ },
+- {
+- .procname = "numa_stat",
+- .data = &sysctl_vm_numa_stat,
+- .maxlen = sizeof(int),
+- .mode = 0644,
+- .proc_handler = sysctl_vm_numa_stat_handler,
+- .extra1 = SYSCTL_ZERO,
+- .extra2 = SYSCTL_ONE,
+- },
+ #endif
+ {
+ .procname = "hugetlb_shm_group",
+diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
+index 1cd10b102c51c..5dead89308b74 100644
+--- a/kernel/time/posix-timers.c
++++ b/kernel/time/posix-timers.c
+@@ -1051,15 +1051,24 @@ retry_delete:
+ }
+
+ /*
+- * This is called by do_exit or de_thread, only when there are no more
+- * references to the shared signal_struct.
++ * This is called by do_exit or de_thread, only when nobody else can
++ * modify the signal->posix_timers list. Yet we need sighand->siglock
++ * to prevent the race with /proc/pid/timers.
+ */
+-void exit_itimers(struct signal_struct *sig)
++void exit_itimers(struct task_struct *tsk)
+ {
++ struct list_head timers;
+ struct k_itimer *tmr;
+
+- while (!list_empty(&sig->posix_timers)) {
+- tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
++ if (list_empty(&tsk->signal->posix_timers))
++ return;
++
++ spin_lock_irq(&tsk->sighand->siglock);
++ list_replace_init(&tsk->signal->posix_timers, &timers);
++ spin_unlock_irq(&tsk->sighand->siglock);
++
++ while (!list_empty(&timers)) {
++ tmr = list_first_entry(&timers, struct k_itimer, list);
+ itimer_delete(tmr);
+ }
+ }
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index 114c31bdf8f97..c0c98b0c86e79 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -9863,6 +9863,12 @@ void trace_init_global_iter(struct trace_iterator *iter)
+ /* Output in nanoseconds only if we are using a clock in nanoseconds. */
+ if (trace_clocks[iter->tr->clock_id].in_ns)
+ iter->iter_flags |= TRACE_FILE_TIME_IN_NS;
++
++ /* Can not use kmalloc for iter.temp and iter.fmt */
++ iter->temp = static_temp_buf;
++ iter->temp_size = STATIC_TEMP_BUF_SIZE;
++ iter->fmt = static_fmt_buf;
++ iter->fmt_size = STATIC_FMT_BUF_SIZE;
+ }
+
+ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)
+@@ -9895,11 +9901,6 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)
+
+ /* Simulate the iterator */
+ trace_init_global_iter(&iter);
+- /* Can not use kmalloc for iter.temp and iter.fmt */
+- iter.temp = static_temp_buf;
+- iter.temp_size = STATIC_TEMP_BUF_SIZE;
+- iter.fmt = static_fmt_buf;
+- iter.fmt_size = STATIC_FMT_BUF_SIZE;
+
+ for_each_tracing_cpu(cpu) {
+ atomic_inc(&per_cpu_ptr(iter.array_buffer->data, cpu)->disabled);
+diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
+index a0e41906d9ce1..217e03e477f62 100644
+--- a/kernel/trace/trace_events_hist.c
++++ b/kernel/trace/trace_events_hist.c
+@@ -4429,6 +4429,8 @@ static int parse_var_defs(struct hist_trigger_data *hist_data)
+
+ s = kstrdup(field_str, GFP_KERNEL);
+ if (!s) {
++ kfree(hist_data->attrs->var_defs.name[n_vars]);
++ hist_data->attrs->var_defs.name[n_vars] = NULL;
+ ret = -ENOMEM;
+ goto free;
+ }
+diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
+index b2ec0aa1ff451..64456e775e8f5 100644
+--- a/mm/damon/vaddr.c
++++ b/mm/damon/vaddr.c
+@@ -407,8 +407,7 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm,
+ if (pte_young(entry)) {
+ referenced = true;
+ entry = pte_mkold(entry);
+- huge_ptep_set_access_flags(vma, addr, pte, entry,
+- vma->vm_flags & VM_WRITE);
++ set_huge_pte_at(mm, addr, pte, entry);
+ }
+
+ #ifdef CONFIG_MMU_NOTIFIER
+diff --git a/mm/memory.c b/mm/memory.c
+index 76e3af9639d93..e176ee386238a 100644
+--- a/mm/memory.c
++++ b/mm/memory.c
+@@ -4524,6 +4524,19 @@ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf)
+
+ static vm_fault_t create_huge_pud(struct vm_fault *vmf)
+ {
++#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
++ defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
++ /* No support for anonymous transparent PUD pages yet */
++ if (vma_is_anonymous(vmf->vma))
++ return VM_FAULT_FALLBACK;
++ if (vmf->vma->vm_ops->huge_fault)
++ return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD);
++#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
++ return VM_FAULT_FALLBACK;
++}
++
++static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
++{
+ #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
+ defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+ /* No support for anonymous transparent PUD pages yet */
+@@ -4538,19 +4551,7 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf)
+ split:
+ /* COW or write-notify not handled on PUD level: split pud.*/
+ __split_huge_pud(vmf->vma, vmf->pud, vmf->address);
+-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+- return VM_FAULT_FALLBACK;
+-}
+-
+-static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
+-{
+-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+- /* No support for anonymous transparent PUD pages yet */
+- if (vma_is_anonymous(vmf->vma))
+- return VM_FAULT_FALLBACK;
+- if (vmf->vma->vm_ops->huge_fault)
+- return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD);
+-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
++#endif /* CONFIG_TRANSPARENT_HUGEPAGE && CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+ return VM_FAULT_FALLBACK;
+ }
+
+diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
+index 8aecd6b3896c7..365e47465ca0b 100644
+--- a/mm/sparse-vmemmap.c
++++ b/mm/sparse-vmemmap.c
+@@ -78,6 +78,14 @@ static int __split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
+
+ spin_lock(&init_mm.page_table_lock);
+ if (likely(pmd_leaf(*pmd))) {
++ /*
++ * Higher order allocations from buddy allocator must be able to
++ * be treated as indepdenent small pages (as they can be freed
++ * individually).
++ */
++ if (!PageReserved(page))
++ split_page(page, get_order(PMD_SIZE));
++
+ /* Make pte visible before pmd. See comment in pmd_install(). */
+ smp_wmb();
+ pmd_populate_kernel(&init_mm, pmd, pgtable);
+diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
+index e9bb6db002aa0..128b17fe98123 100644
+--- a/mm/userfaultfd.c
++++ b/mm/userfaultfd.c
+@@ -231,7 +231,10 @@ static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
+ struct page *page;
+ int ret;
+
+- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
++ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
++ /* Our caller expects us to return -EFAULT if we failed to find page. */
++ if (ret == -ENOENT)
++ ret = -EFAULT;
+ if (ret)
+ goto out;
+ if (!page) {
+diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
+index 53b1955b027f8..214532173536b 100644
+--- a/net/8021q/vlan_netlink.c
++++ b/net/8021q/vlan_netlink.c
+@@ -182,10 +182,14 @@ static int vlan_newlink(struct net *src_net, struct net_device *dev,
+ else if (dev->mtu > max_mtu)
+ return -EINVAL;
+
++ /* Note: If this initial vlan_changelink() fails, we need
++ * to call vlan_dev_free_egress_priority() to free memory.
++ */
+ err = vlan_changelink(dev, tb, data, extack);
+- if (err)
+- return err;
+- err = register_vlan_dev(dev, extack);
++
++ if (!err)
++ err = register_vlan_dev(dev, extack);
++
+ if (err)
+ vlan_dev_free_egress_priority(dev);
+ return err;
+diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
+index 4fd882686b04d..ff47790366497 100644
+--- a/net/bridge/br_netfilter_hooks.c
++++ b/net/bridge/br_netfilter_hooks.c
+@@ -1012,9 +1012,24 @@ int br_nf_hook_thresh(unsigned int hook, struct net *net,
+ return okfn(net, sk, skb);
+
+ ops = nf_hook_entries_get_hook_ops(e);
+- for (i = 0; i < e->num_hook_entries &&
+- ops[i]->priority <= NF_BR_PRI_BRNF; i++)
+- ;
++ for (i = 0; i < e->num_hook_entries; i++) {
++ /* These hooks have already been called */
++ if (ops[i]->priority < NF_BR_PRI_BRNF)
++ continue;
++
++ /* These hooks have not been called yet, run them. */
++ if (ops[i]->priority > NF_BR_PRI_BRNF)
++ break;
++
++ /* take a closer look at NF_BR_PRI_BRNF. */
++ if (ops[i]->hook == br_nf_pre_routing) {
++ /* This hook diverted the skb to this function,
++ * hooks after this have not been run yet.
++ */
++ i++;
++ break;
++ }
++ }
+
+ nf_hook_state_init(&state, hook, NFPROTO_BRIDGE, indev, outdev,
+ sk, net, okfn);
+diff --git a/net/core/filter.c b/net/core/filter.c
+index af1e77f2f24a8..6391c1885bca8 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -6148,7 +6148,6 @@ static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len
+ if (err)
+ return err;
+
+- ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
+ skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+
+ return seg6_lookup_nexthop(skb, NULL, 0);
+diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
+index 72fde2888ad2f..98bc180563d16 100644
+--- a/net/ipv4/af_inet.c
++++ b/net/ipv4/af_inet.c
+@@ -1247,7 +1247,7 @@ static int inet_sk_reselect_saddr(struct sock *sk)
+ if (new_saddr == old_saddr)
+ return 0;
+
+- if (sock_net(sk)->ipv4.sysctl_ip_dynaddr > 1) {
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_ip_dynaddr) > 1) {
+ pr_info("%s(): shifting inet->saddr from %pI4 to %pI4\n",
+ __func__, &old_saddr, &new_saddr);
+ }
+@@ -1302,7 +1302,7 @@ int inet_sk_rebuild_header(struct sock *sk)
+ * Other protocols have to map its equivalent state to TCP_SYN_SENT.
+ * DCCP maps its DCCP_REQUESTING state to TCP_SYN_SENT. -acme
+ */
+- if (!sock_net(sk)->ipv4.sysctl_ip_dynaddr ||
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_ip_dynaddr) ||
+ sk->sk_state != TCP_SYN_SENT ||
+ (sk->sk_userlocks & SOCK_BINDADDR_LOCK) ||
+ (err = inet_sk_reselect_saddr(sk)) != 0)
+diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
+index 62d5f99760aac..6cd3b6c559f05 100644
+--- a/net/ipv4/cipso_ipv4.c
++++ b/net/ipv4/cipso_ipv4.c
+@@ -239,7 +239,7 @@ static int cipso_v4_cache_check(const unsigned char *key,
+ struct cipso_v4_map_cache_entry *prev_entry = NULL;
+ u32 hash;
+
+- if (!cipso_v4_cache_enabled)
++ if (!READ_ONCE(cipso_v4_cache_enabled))
+ return -ENOENT;
+
+ hash = cipso_v4_map_cache_hash(key, key_len);
+@@ -296,13 +296,14 @@ static int cipso_v4_cache_check(const unsigned char *key,
+ int cipso_v4_cache_add(const unsigned char *cipso_ptr,
+ const struct netlbl_lsm_secattr *secattr)
+ {
++ int bkt_size = READ_ONCE(cipso_v4_cache_bucketsize);
+ int ret_val = -EPERM;
+ u32 bkt;
+ struct cipso_v4_map_cache_entry *entry = NULL;
+ struct cipso_v4_map_cache_entry *old_entry = NULL;
+ u32 cipso_ptr_len;
+
+- if (!cipso_v4_cache_enabled || cipso_v4_cache_bucketsize <= 0)
++ if (!READ_ONCE(cipso_v4_cache_enabled) || bkt_size <= 0)
+ return 0;
+
+ cipso_ptr_len = cipso_ptr[1];
+@@ -322,7 +323,7 @@ int cipso_v4_cache_add(const unsigned char *cipso_ptr,
+
+ bkt = entry->hash & (CIPSO_V4_CACHE_BUCKETS - 1);
+ spin_lock_bh(&cipso_v4_cache[bkt].lock);
+- if (cipso_v4_cache[bkt].size < cipso_v4_cache_bucketsize) {
++ if (cipso_v4_cache[bkt].size < bkt_size) {
+ list_add(&entry->list, &cipso_v4_cache[bkt].list);
+ cipso_v4_cache[bkt].size += 1;
+ } else {
+@@ -1199,7 +1200,8 @@ static int cipso_v4_gentag_rbm(const struct cipso_v4_doi *doi_def,
+ /* This will send packets using the "optimized" format when
+ * possible as specified in section 3.4.2.6 of the
+ * CIPSO draft. */
+- if (cipso_v4_rbm_optfmt && ret_val > 0 && ret_val <= 10)
++ if (READ_ONCE(cipso_v4_rbm_optfmt) && ret_val > 0 &&
++ ret_val <= 10)
+ tag_len = 14;
+ else
+ tag_len = 4 + ret_val;
+@@ -1603,7 +1605,7 @@ int cipso_v4_validate(const struct sk_buff *skb, unsigned char **option)
+ * all the CIPSO validations here but it doesn't
+ * really specify _exactly_ what we need to validate
+ * ... so, just make it a sysctl tunable. */
+- if (cipso_v4_rbm_strictvalid) {
++ if (READ_ONCE(cipso_v4_rbm_strictvalid)) {
+ if (cipso_v4_map_lvl_valid(doi_def,
+ tag[3]) < 0) {
+ err_offset = opt_iter + 3;
+diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
+index ccb62038f6a4a..720f65f7bd0b0 100644
+--- a/net/ipv4/fib_semantics.c
++++ b/net/ipv4/fib_semantics.c
+@@ -1230,7 +1230,7 @@ static int fib_check_nh_nongw(struct net *net, struct fib_nh *nh,
+
+ nh->fib_nh_dev = in_dev->dev;
+ dev_hold_track(nh->fib_nh_dev, &nh->fib_nh_dev_tracker, GFP_ATOMIC);
+- nh->fib_nh_scope = RT_SCOPE_HOST;
++ nh->fib_nh_scope = RT_SCOPE_LINK;
+ if (!netif_carrier_ok(nh->fib_nh_dev))
+ nh->fib_nh_flags |= RTNH_F_LINKDOWN;
+ err = 0;
+@@ -1811,7 +1811,7 @@ int fib_dump_info(struct sk_buff *skb, u32 portid, u32 seq, int event,
+ goto nla_put_failure;
+ if (nexthop_is_blackhole(fi->nh))
+ rtm->rtm_type = RTN_BLACKHOLE;
+- if (!fi->fib_net->ipv4.sysctl_nexthop_compat_mode)
++ if (!READ_ONCE(fi->fib_net->ipv4.sysctl_nexthop_compat_mode))
+ goto offload;
+ }
+
+diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
+index fb0e49c36c2e6..43a4962722279 100644
+--- a/net/ipv4/fib_trie.c
++++ b/net/ipv4/fib_trie.c
+@@ -498,7 +498,7 @@ static void tnode_free(struct key_vector *tn)
+ tn = container_of(head, struct tnode, rcu)->kv;
+ }
+
+- if (tnode_free_size >= sysctl_fib_sync_mem) {
++ if (tnode_free_size >= READ_ONCE(sysctl_fib_sync_mem)) {
+ tnode_free_size = 0;
+ synchronize_rcu();
+ }
+diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
+index 72a375c7f4172..c13ceda9ce5d8 100644
+--- a/net/ipv4/icmp.c
++++ b/net/ipv4/icmp.c
+@@ -253,11 +253,12 @@ bool icmp_global_allow(void)
+ spin_lock(&icmp_global.lock);
+ delta = min_t(u32, now - icmp_global.stamp, HZ);
+ if (delta >= HZ / 50) {
+- incr = sysctl_icmp_msgs_per_sec * delta / HZ ;
++ incr = READ_ONCE(sysctl_icmp_msgs_per_sec) * delta / HZ;
+ if (incr)
+ WRITE_ONCE(icmp_global.stamp, now);
+ }
+- credit = min_t(u32, icmp_global.credit + incr, sysctl_icmp_msgs_burst);
++ credit = min_t(u32, icmp_global.credit + incr,
++ READ_ONCE(sysctl_icmp_msgs_burst));
+ if (credit) {
+ /* We want to use a credit of one in average, but need to randomize
+ * it for security reasons.
+@@ -281,7 +282,7 @@ static bool icmpv4_mask_allow(struct net *net, int type, int code)
+ return true;
+
+ /* Limit if icmp type is enabled in ratemask. */
+- if (!((1 << type) & net->ipv4.sysctl_icmp_ratemask))
++ if (!((1 << type) & READ_ONCE(net->ipv4.sysctl_icmp_ratemask)))
+ return true;
+
+ return false;
+@@ -319,7 +320,8 @@ static bool icmpv4_xrlim_allow(struct net *net, struct rtable *rt,
+
+ vif = l3mdev_master_ifindex(dst->dev);
+ peer = inet_getpeer_v4(net->ipv4.peers, fl4->daddr, vif, 1);
+- rc = inet_peer_xrlim_allow(peer, net->ipv4.sysctl_icmp_ratelimit);
++ rc = inet_peer_xrlim_allow(peer,
++ READ_ONCE(net->ipv4.sysctl_icmp_ratelimit));
+ if (peer)
+ inet_putpeer(peer);
+ out:
+@@ -692,7 +694,7 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
+
+ rcu_read_lock();
+ if (rt_is_input_route(rt) &&
+- net->ipv4.sysctl_icmp_errors_use_inbound_ifaddr)
++ READ_ONCE(net->ipv4.sysctl_icmp_errors_use_inbound_ifaddr))
+ dev = dev_get_by_index_rcu(net, inet_iif(skb_in));
+
+ if (dev)
+@@ -929,7 +931,7 @@ static bool icmp_unreach(struct sk_buff *skb)
+ * get the other vendor to fix their kit.
+ */
+
+- if (!net->ipv4.sysctl_icmp_ignore_bogus_error_responses &&
++ if (!READ_ONCE(net->ipv4.sysctl_icmp_ignore_bogus_error_responses) &&
+ inet_addr_type_dev_table(net, skb->dev, iph->daddr) == RTN_BROADCAST) {
+ net_warn_ratelimited("%pI4 sent an invalid ICMP type %u, code %u error to a broadcast: %pI4 on %s\n",
+ &ip_hdr(skb)->saddr,
+@@ -989,7 +991,7 @@ static bool icmp_echo(struct sk_buff *skb)
+
+ net = dev_net(skb_dst(skb)->dev);
+ /* should there be an ICMP stat for ignored echos? */
+- if (net->ipv4.sysctl_icmp_echo_ignore_all)
++ if (READ_ONCE(net->ipv4.sysctl_icmp_echo_ignore_all))
+ return true;
+
+ icmp_param.data.icmph = *icmp_hdr(skb);
+@@ -1024,7 +1026,7 @@ bool icmp_build_probe(struct sk_buff *skb, struct icmphdr *icmphdr)
+ u16 ident_len;
+ u8 status;
+
+- if (!net->ipv4.sysctl_icmp_echo_enable_probe)
++ if (!READ_ONCE(net->ipv4.sysctl_icmp_echo_enable_probe))
+ return false;
+
+ /* We currently only support probing interfaces on the proxy node
+@@ -1238,7 +1240,7 @@ int icmp_rcv(struct sk_buff *skb)
+ */
+ if ((icmph->type == ICMP_ECHO ||
+ icmph->type == ICMP_TIMESTAMP) &&
+- net->ipv4.sysctl_icmp_echo_ignore_broadcasts) {
++ READ_ONCE(net->ipv4.sysctl_icmp_echo_ignore_broadcasts)) {
+ goto error;
+ }
+ if (icmph->type != ICMP_ECHO &&
+diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
+index 0ec501845cb3b..47ccc343c9fb0 100644
+--- a/net/ipv4/inet_timewait_sock.c
++++ b/net/ipv4/inet_timewait_sock.c
+@@ -156,7 +156,8 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk,
+ {
+ struct inet_timewait_sock *tw;
+
+- if (refcount_read(&dr->tw_refcount) - 1 >= dr->sysctl_max_tw_buckets)
++ if (refcount_read(&dr->tw_refcount) - 1 >=
++ READ_ONCE(dr->sysctl_max_tw_buckets))
+ return NULL;
+
+ tw = kmem_cache_alloc(sk->sk_prot_creator->twsk_prot->twsk_slab,
+diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
+index da21dfce24d73..e9fed83e9b3cc 100644
+--- a/net/ipv4/inetpeer.c
++++ b/net/ipv4/inetpeer.c
+@@ -141,16 +141,20 @@ static void inet_peer_gc(struct inet_peer_base *base,
+ struct inet_peer *gc_stack[],
+ unsigned int gc_cnt)
+ {
++ int peer_threshold, peer_maxttl, peer_minttl;
+ struct inet_peer *p;
+ __u32 delta, ttl;
+ int i;
+
+- if (base->total >= inet_peer_threshold)
++ peer_threshold = READ_ONCE(inet_peer_threshold);
++ peer_maxttl = READ_ONCE(inet_peer_maxttl);
++ peer_minttl = READ_ONCE(inet_peer_minttl);
++
++ if (base->total >= peer_threshold)
+ ttl = 0; /* be aggressive */
+ else
+- ttl = inet_peer_maxttl
+- - (inet_peer_maxttl - inet_peer_minttl) / HZ *
+- base->total / inet_peer_threshold * HZ;
++ ttl = peer_maxttl - (peer_maxttl - peer_minttl) / HZ *
++ base->total / peer_threshold * HZ;
+ for (i = 0; i < gc_cnt; i++) {
+ p = gc_stack[i];
+
+diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
+index e459a391e607d..853a75a8fbafc 100644
+--- a/net/ipv4/nexthop.c
++++ b/net/ipv4/nexthop.c
+@@ -1858,7 +1858,7 @@ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh)
+ /* __ip6_del_rt does a release, so do a hold here */
+ fib6_info_hold(f6i);
+ ipv6_stub->ip6_del_rt(net, f6i,
+- !net->ipv4.sysctl_nexthop_compat_mode);
++ !READ_ONCE(net->ipv4.sysctl_nexthop_compat_mode));
+ }
+ }
+
+@@ -2361,7 +2361,8 @@ out:
+ if (!rc) {
+ nh_base_seq_inc(net);
+ nexthop_notify(RTM_NEWNEXTHOP, new_nh, &cfg->nlinfo);
+- if (replace_notify && net->ipv4.sysctl_nexthop_compat_mode)
++ if (replace_notify &&
++ READ_ONCE(net->ipv4.sysctl_nexthop_compat_mode))
+ nexthop_replace_notify(net, new_nh, &cfg->nlinfo);
+ }
+
+diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
+index f33c31dd7366c..b387c48351559 100644
+--- a/net/ipv4/syncookies.c
++++ b/net/ipv4/syncookies.c
+@@ -273,7 +273,7 @@ bool cookie_ecn_ok(const struct tcp_options_received *tcp_opt,
+ if (!ecn_ok)
+ return false;
+
+- if (net->ipv4.sysctl_tcp_ecn)
++ if (READ_ONCE(net->ipv4.sysctl_tcp_ecn))
+ return true;
+
+ return dst_feature(dst, RTAX_FEATURE_ECN);
+diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
+index ad80d180b60b5..ffe0264a51b8c 100644
+--- a/net/ipv4/sysctl_net_ipv4.c
++++ b/net/ipv4/sysctl_net_ipv4.c
+@@ -603,6 +603,8 @@ static struct ctl_table ipv4_net_table[] = {
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE
+ },
+ {
+ .procname = "icmp_echo_enable_probe",
+@@ -619,6 +621,8 @@ static struct ctl_table ipv4_net_table[] = {
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE
+ },
+ {
+ .procname = "icmp_ignore_bogus_error_responses",
+@@ -626,6 +630,8 @@ static struct ctl_table ipv4_net_table[] = {
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE
+ },
+ {
+ .procname = "icmp_errors_use_inbound_ifaddr",
+@@ -633,6 +639,8 @@ static struct ctl_table ipv4_net_table[] = {
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE
+ },
+ {
+ .procname = "icmp_ratelimit",
+@@ -672,6 +680,8 @@ static struct ctl_table ipv4_net_table[] = {
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_TWO,
+ },
+ {
+ .procname = "tcp_ecn_fallback",
+@@ -679,6 +689,8 @@ static struct ctl_table ipv4_net_table[] = {
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "ip_dynaddr",
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index e31cf137c6140..f2fd1779d9251 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -2735,7 +2735,8 @@ static void tcp_orphan_update(struct timer_list *unused)
+
+ static bool tcp_too_many_orphans(int shift)
+ {
+- return READ_ONCE(tcp_orphan_cache) << shift > sysctl_tcp_max_orphans;
++ return READ_ONCE(tcp_orphan_cache) << shift >
++ READ_ONCE(sysctl_tcp_max_orphans);
+ }
+
+ bool tcp_check_oom(struct sock *sk, int shift)
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 6b8fcf79688b5..2d71bcfcc7592 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -6712,7 +6712,7 @@ static void tcp_ecn_create_request(struct request_sock *req,
+
+ ect = !INET_ECN_is_not_ect(TCP_SKB_CB(skb)->ip_dsfield);
+ ecn_ok_dst = dst_feature(dst, DST_FEATURE_ECN_MASK);
+- ecn_ok = net->ipv4.sysctl_tcp_ecn || ecn_ok_dst;
++ ecn_ok = READ_ONCE(net->ipv4.sysctl_tcp_ecn) || ecn_ok_dst;
+
+ if (((!ect || th->res1) && ecn_ok) || tcp_ca_needs_ecn(listen_sk) ||
+ (ecn_ok_dst & DST_FEATURE_ECN_CA) ||
+diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
+index 6b00c17c72aa8..34249469e361f 100644
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -324,7 +324,7 @@ static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb)
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+ bool bpf_needs_ecn = tcp_bpf_ca_needs_ecn(sk);
+- bool use_ecn = sock_net(sk)->ipv4.sysctl_tcp_ecn == 1 ||
++ bool use_ecn = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn) == 1 ||
+ tcp_ca_needs_ecn(sk) || bpf_needs_ecn;
+
+ if (!use_ecn) {
+@@ -346,7 +346,7 @@ static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb)
+
+ static void tcp_ecn_clear_syn(struct sock *sk, struct sk_buff *skb)
+ {
+- if (sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback)
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback))
+ /* tp->ecn_flags are cleared at a later point in time when
+ * SYN ACK is ultimatively being received.
+ */
+diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
+index e6b978ea0e87f..26554aa6fc1b0 100644
+--- a/net/ipv6/icmp.c
++++ b/net/ipv6/icmp.c
+@@ -919,7 +919,7 @@ static int icmpv6_rcv(struct sk_buff *skb)
+ break;
+ case ICMPV6_EXT_ECHO_REQUEST:
+ if (!net->ipv6.sysctl.icmpv6_echo_ignore_all &&
+- net->ipv4.sysctl_icmp_echo_enable_probe)
++ READ_ONCE(net->ipv4.sysctl_icmp_echo_enable_probe))
+ icmpv6_echo_reply(skb);
+ break;
+
+diff --git a/net/ipv6/route.c b/net/ipv6/route.c
+index 83786de847abf..a87b96a256afc 100644
+--- a/net/ipv6/route.c
++++ b/net/ipv6/route.c
+@@ -5737,7 +5737,7 @@ static int rt6_fill_node(struct net *net, struct sk_buff *skb,
+ if (nexthop_is_blackhole(rt->nh))
+ rtm->rtm_type = RTN_BLACKHOLE;
+
+- if (net->ipv4.sysctl_nexthop_compat_mode &&
++ if (READ_ONCE(net->ipv4.sysctl_nexthop_compat_mode) &&
+ rt6_fill_node_nexthop(skb, rt->nh, &nh_flags) < 0)
+ goto nla_put_failure;
+
+diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
+index d64855010948d..e756ba705fd9b 100644
+--- a/net/ipv6/seg6_iptunnel.c
++++ b/net/ipv6/seg6_iptunnel.c
+@@ -189,6 +189,8 @@ int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto)
+ }
+ #endif
+
++ hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
++
+ skb_postpush_rcsum(skb, hdr, tot_len);
+
+ return 0;
+@@ -241,6 +243,8 @@ int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
+ }
+ #endif
+
++ hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
++
+ skb_postpush_rcsum(skb, hdr, sizeof(struct ipv6hdr) + hdrlen);
+
+ return 0;
+@@ -302,7 +306,6 @@ static int seg6_do_srh(struct sk_buff *skb)
+ break;
+ }
+
+- ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
+ skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+
+diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
+index 98a34287439cc..2cd4a8d3b30ad 100644
+--- a/net/ipv6/seg6_local.c
++++ b/net/ipv6/seg6_local.c
+@@ -826,7 +826,6 @@ static int input_action_end_b6(struct sk_buff *skb, struct seg6_local_lwt *slwt)
+ if (err)
+ goto drop;
+
+- ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
+ skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+
+ seg6_lookup_nexthop(skb, NULL, 0);
+@@ -858,7 +857,6 @@ static int input_action_end_b6_encap(struct sk_buff *skb,
+ if (err)
+ goto drop;
+
+- ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
+ skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+
+ seg6_lookup_nexthop(skb, NULL, 0);
+diff --git a/net/mac80211/wme.c b/net/mac80211/wme.c
+index 62c6733e07923..d50480b317505 100644
+--- a/net/mac80211/wme.c
++++ b/net/mac80211/wme.c
+@@ -147,8 +147,8 @@ u16 __ieee80211_select_queue(struct ieee80211_sub_if_data *sdata,
+ bool qos;
+
+ /* all mesh/ocb stations are required to support WME */
+- if (sdata->vif.type == NL80211_IFTYPE_MESH_POINT ||
+- sdata->vif.type == NL80211_IFTYPE_OCB)
++ if (sta && (sdata->vif.type == NL80211_IFTYPE_MESH_POINT ||
++ sdata->vif.type == NL80211_IFTYPE_OCB))
+ qos = true;
+ else if (sta)
+ qos = sta->sta.wme;
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index b0fb1fc0bd4a9..b52fd250cb3a8 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -2840,12 +2840,12 @@ static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk)
+
+ static int mptcp_disconnect(struct sock *sk, int flags)
+ {
+- struct mptcp_subflow_context *subflow;
++ struct mptcp_subflow_context *subflow, *tmp;
+ struct mptcp_sock *msk = mptcp_sk(sk);
+
+ inet_sk_state_store(sk, TCP_CLOSE);
+
+- mptcp_for_each_subflow(msk, subflow) {
++ list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) {
+ struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+
+ __mptcp_close_ssk(sk, ssk, subflow, MPTCP_CF_FASTCLOSE);
+diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
+index 0164e5f522e8e..5a85735512ce3 100644
+--- a/net/netfilter/nf_conntrack_core.c
++++ b/net/netfilter/nf_conntrack_core.c
+@@ -525,21 +525,6 @@ clean_from_lists(struct nf_conn *ct)
+ nf_ct_remove_expectations(ct);
+ }
+
+-/* must be called with local_bh_disable */
+-static void nf_ct_add_to_dying_list(struct nf_conn *ct)
+-{
+- struct ct_pcpu *pcpu;
+-
+- /* add this conntrack to the (per cpu) dying list */
+- ct->cpu = smp_processor_id();
+- pcpu = per_cpu_ptr(nf_ct_net(ct)->ct.pcpu_lists, ct->cpu);
+-
+- spin_lock(&pcpu->lock);
+- hlist_nulls_add_head(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
+- &pcpu->dying);
+- spin_unlock(&pcpu->lock);
+-}
+-
+ /* must be called with local_bh_disable */
+ static void nf_ct_add_to_unconfirmed_list(struct nf_conn *ct)
+ {
+@@ -556,11 +541,11 @@ static void nf_ct_add_to_unconfirmed_list(struct nf_conn *ct)
+ }
+
+ /* must be called with local_bh_disable */
+-static void nf_ct_del_from_dying_or_unconfirmed_list(struct nf_conn *ct)
++static void nf_ct_del_from_unconfirmed_list(struct nf_conn *ct)
+ {
+ struct ct_pcpu *pcpu;
+
+- /* We overload first tuple to link into unconfirmed or dying list.*/
++ /* We overload first tuple to link into unconfirmed list.*/
+ pcpu = per_cpu_ptr(nf_ct_net(ct)->ct.pcpu_lists, ct->cpu);
+
+ spin_lock(&pcpu->lock);
+@@ -648,7 +633,8 @@ void nf_ct_destroy(struct nf_conntrack *nfct)
+ */
+ nf_ct_remove_expectations(ct);
+
+- nf_ct_del_from_dying_or_unconfirmed_list(ct);
++ if (unlikely(!nf_ct_is_confirmed(ct)))
++ nf_ct_del_from_unconfirmed_list(ct);
+
+ local_bh_enable();
+
+@@ -660,15 +646,12 @@ void nf_ct_destroy(struct nf_conntrack *nfct)
+ }
+ EXPORT_SYMBOL(nf_ct_destroy);
+
+-static void nf_ct_delete_from_lists(struct nf_conn *ct)
++static void __nf_ct_delete_from_lists(struct nf_conn *ct)
+ {
+ struct net *net = nf_ct_net(ct);
+ unsigned int hash, reply_hash;
+ unsigned int sequence;
+
+- nf_ct_helper_destroy(ct);
+-
+- local_bh_disable();
+ do {
+ sequence = read_seqcount_begin(&nf_conntrack_generation);
+ hash = hash_conntrack(net,
+@@ -681,12 +664,30 @@ static void nf_ct_delete_from_lists(struct nf_conn *ct)
+
+ clean_from_lists(ct);
+ nf_conntrack_double_unlock(hash, reply_hash);
++}
++
++static void nf_ct_delete_from_lists(struct nf_conn *ct)
++{
++ nf_ct_helper_destroy(ct);
++ local_bh_disable();
+
+- nf_ct_add_to_dying_list(ct);
++ __nf_ct_delete_from_lists(ct);
+
+ local_bh_enable();
+ }
+
++static void nf_ct_add_to_ecache_list(struct nf_conn *ct)
++{
++#ifdef CONFIG_NF_CONNTRACK_EVENTS
++ struct nf_conntrack_net *cnet = nf_ct_pernet(nf_ct_net(ct));
++
++ spin_lock(&cnet->ecache.dying_lock);
++ hlist_nulls_add_head_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode,
++ &cnet->ecache.dying_list);
++ spin_unlock(&cnet->ecache.dying_lock);
++#endif
++}
++
+ bool nf_ct_delete(struct nf_conn *ct, u32 portid, int report)
+ {
+ struct nf_conn_tstamp *tstamp;
+@@ -709,7 +710,12 @@ bool nf_ct_delete(struct nf_conn *ct, u32 portid, int report)
+ /* destroy event was not delivered. nf_ct_put will
+ * be done by event cache worker on redelivery.
+ */
+- nf_ct_delete_from_lists(ct);
++ nf_ct_helper_destroy(ct);
++ local_bh_disable();
++ __nf_ct_delete_from_lists(ct);
++ nf_ct_add_to_ecache_list(ct);
++ local_bh_enable();
++
+ nf_conntrack_ecache_work(nf_ct_net(ct), NFCT_ECACHE_DESTROY_FAIL);
+ return false;
+ }
+@@ -758,6 +764,9 @@ static void nf_ct_gc_expired(struct nf_conn *ct)
+ if (!refcount_inc_not_zero(&ct->ct_general.use))
+ return;
+
++ /* load ->status after refcount increase */
++ smp_acquire__after_ctrl_dep();
++
+ if (nf_ct_should_gc(ct))
+ nf_ct_kill(ct);
+
+@@ -824,6 +833,9 @@ __nf_conntrack_find_get(struct net *net, const struct nf_conntrack_zone *zone,
+ */
+ ct = nf_ct_tuplehash_to_ctrack(h);
+ if (likely(refcount_inc_not_zero(&ct->ct_general.use))) {
++ /* re-check key after refcount */
++ smp_acquire__after_ctrl_dep();
++
+ if (likely(nf_ct_key_equal(h, tuple, zone, net)))
+ goto found;
+
+@@ -972,7 +984,6 @@ static void __nf_conntrack_insert_prepare(struct nf_conn *ct)
+ struct nf_conn_tstamp *tstamp;
+
+ refcount_inc(&ct->ct_general.use);
+- ct->status |= IPS_CONFIRMED;
+
+ /* set conntrack timestamp, if enabled. */
+ tstamp = nf_conn_tstamp_find(ct);
+@@ -1001,7 +1012,6 @@ static int __nf_ct_resolve_clash(struct sk_buff *skb,
+ nf_conntrack_get(&ct->ct_general);
+
+ nf_ct_acct_merge(ct, ctinfo, loser_ct);
+- nf_ct_add_to_dying_list(loser_ct);
+ nf_ct_put(loser_ct);
+ nf_ct_set(skb, ct, ctinfo);
+
+@@ -1134,7 +1144,6 @@ nf_ct_resolve_clash(struct sk_buff *skb, struct nf_conntrack_tuple_hash *h,
+ return ret;
+
+ drop:
+- nf_ct_add_to_dying_list(loser_ct);
+ NF_CT_STAT_INC(net, drop);
+ NF_CT_STAT_INC(net, insert_failed);
+ return NF_DROP;
+@@ -1201,10 +1210,10 @@ __nf_conntrack_confirm(struct sk_buff *skb)
+ * user context, else we insert an already 'dead' hash, blocking
+ * further use of that particular connection -JM.
+ */
+- nf_ct_del_from_dying_or_unconfirmed_list(ct);
++ nf_ct_del_from_unconfirmed_list(ct);
++ ct->status |= IPS_CONFIRMED;
+
+ if (unlikely(nf_ct_is_dying(ct))) {
+- nf_ct_add_to_dying_list(ct);
+ NF_CT_STAT_INC(net, insert_failed);
+ goto dying;
+ }
+@@ -1228,7 +1237,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
+ goto out;
+ if (chainlen++ > max_chainlen) {
+ chaintoolong:
+- nf_ct_add_to_dying_list(ct);
+ NF_CT_STAT_INC(net, chaintoolong);
+ NF_CT_STAT_INC(net, insert_failed);
+ ret = NF_DROP;
+@@ -1367,6 +1375,9 @@ static unsigned int early_drop_list(struct net *net,
+ if (!refcount_inc_not_zero(&tmp->ct_general.use))
+ continue;
+
++ /* load ->ct_net and ->status after refcount increase */
++ smp_acquire__after_ctrl_dep();
++
+ /* kill only if still in same netns -- might have moved due to
+ * SLAB_TYPESAFE_BY_RCU rules.
+ *
+@@ -1516,6 +1527,9 @@ static void gc_worker(struct work_struct *work)
+ if (!refcount_inc_not_zero(&tmp->ct_general.use))
+ continue;
+
++ /* load ->status after refcount increase */
++ smp_acquire__after_ctrl_dep();
++
+ if (gc_worker_skip_ct(tmp)) {
+ nf_ct_put(tmp);
+ continue;
+@@ -1747,6 +1761,16 @@ init_conntrack(struct net *net, struct nf_conn *tmpl,
+ if (!exp)
+ __nf_ct_try_assign_helper(ct, tmpl, GFP_ATOMIC);
+
++ /* Other CPU might have obtained a pointer to this object before it was
++ * released. Because refcount is 0, refcount_inc_not_zero() will fail.
++ *
++ * After refcount_set(1) it will succeed; ensure that zeroing of
++ * ct->status and the correct ct->net pointer are visible; else other
++ * core might observe CONFIRMED bit which means the entry is valid and
++ * in the hash table, but its not (anymore).
++ */
++ smp_wmb();
++
+ /* Now it is inserted into the unconfirmed list, set refcount to 1. */
+ refcount_set(&ct->ct_general.use, 1);
+ nf_ct_add_to_unconfirmed_list(ct);
+@@ -2777,7 +2801,6 @@ void nf_conntrack_init_end(void)
+ * We need to use special "null" values, not used in hash table
+ */
+ #define UNCONFIRMED_NULLS_VAL ((1<<30)+0)
+-#define DYING_NULLS_VAL ((1<<30)+1)
+
+ int nf_conntrack_init_net(struct net *net)
+ {
+@@ -2798,7 +2821,6 @@ int nf_conntrack_init_net(struct net *net)
+
+ spin_lock_init(&pcpu->lock);
+ INIT_HLIST_NULLS_HEAD(&pcpu->unconfirmed, UNCONFIRMED_NULLS_VAL);
+- INIT_HLIST_NULLS_HEAD(&pcpu->dying, DYING_NULLS_VAL);
+ }
+
+ net->ct.stat = alloc_percpu(struct ip_conntrack_stat);
+diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c
+index 07e65b4e92f86..7472c544642fb 100644
+--- a/net/netfilter/nf_conntrack_ecache.c
++++ b/net/netfilter/nf_conntrack_ecache.c
+@@ -16,7 +16,6 @@
+ #include <linux/vmalloc.h>
+ #include <linux/stddef.h>
+ #include <linux/err.h>
+-#include <linux/percpu.h>
+ #include <linux/kernel.h>
+ #include <linux/netdevice.h>
+ #include <linux/slab.h>
+@@ -29,8 +28,9 @@
+
+ static DEFINE_MUTEX(nf_ct_ecache_mutex);
+
+-#define ECACHE_RETRY_WAIT (HZ/10)
+-#define ECACHE_STACK_ALLOC (256 / sizeof(void *))
++#define DYING_NULLS_VAL ((1 << 30) + 1)
++#define ECACHE_MAX_JIFFIES msecs_to_jiffies(10)
++#define ECACHE_RETRY_JIFFIES msecs_to_jiffies(10)
+
+ enum retry_state {
+ STATE_CONGESTED,
+@@ -38,96 +38,90 @@ enum retry_state {
+ STATE_DONE,
+ };
+
+-static enum retry_state ecache_work_evict_list(struct ct_pcpu *pcpu)
++struct nf_conntrack_net_ecache *nf_conn_pernet_ecache(const struct net *net)
+ {
+- struct nf_conn *refs[ECACHE_STACK_ALLOC];
++ struct nf_conntrack_net *cnet = nf_ct_pernet(net);
++
++ return &cnet->ecache;
++}
++#if IS_MODULE(CONFIG_NF_CT_NETLINK)
++EXPORT_SYMBOL_GPL(nf_conn_pernet_ecache);
++#endif
++
++static enum retry_state ecache_work_evict_list(struct nf_conntrack_net *cnet)
++{
++ unsigned long stop = jiffies + ECACHE_MAX_JIFFIES;
++ struct hlist_nulls_head evicted_list;
+ enum retry_state ret = STATE_DONE;
+ struct nf_conntrack_tuple_hash *h;
+ struct hlist_nulls_node *n;
+- unsigned int evicted = 0;
++ unsigned int sent;
++
++ INIT_HLIST_NULLS_HEAD(&evicted_list, DYING_NULLS_VAL);
+
+- spin_lock(&pcpu->lock);
++next:
++ sent = 0;
++ spin_lock_bh(&cnet->ecache.dying_lock);
+
+- hlist_nulls_for_each_entry(h, n, &pcpu->dying, hnnode) {
++ hlist_nulls_for_each_entry_safe(h, n, &cnet->ecache.dying_list, hnnode) {
+ struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+- struct nf_conntrack_ecache *e;
+-
+- if (!nf_ct_is_confirmed(ct))
+- continue;
+-
+- /* This ecache access is safe because the ct is on the
+- * pcpu dying list and we hold the spinlock -- the entry
+- * cannot be free'd until after the lock is released.
+- *
+- * This is true even if ct has a refcount of 0: the
+- * cpu that is about to free the entry must remove it
+- * from the dying list and needs the lock to do so.
+- */
+- e = nf_ct_ecache_find(ct);
+- if (!e || e->state != NFCT_ECACHE_DESTROY_FAIL)
+- continue;
+
+- /* ct is in NFCT_ECACHE_DESTROY_FAIL state, this means
+- * the worker owns this entry: the ct will remain valid
+- * until the worker puts its ct reference.
++ /* The worker owns all entries, ct remains valid until nf_ct_put
++ * in the loop below.
+ */
+ if (nf_conntrack_event(IPCT_DESTROY, ct)) {
+ ret = STATE_CONGESTED;
+ break;
+ }
+
+- e->state = NFCT_ECACHE_DESTROY_SENT;
+- refs[evicted] = ct;
++ hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
++ hlist_nulls_add_head(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode, &evicted_list);
+
+- if (++evicted >= ARRAY_SIZE(refs)) {
++ if (time_after(stop, jiffies)) {
+ ret = STATE_RESTART;
+ break;
+ }
++
++ if (sent++ > 16) {
++ spin_unlock_bh(&cnet->ecache.dying_lock);
++ cond_resched();
++ goto next;
++ }
+ }
+
+- spin_unlock(&pcpu->lock);
++ spin_unlock_bh(&cnet->ecache.dying_lock);
++
++ hlist_nulls_for_each_entry_safe(h, n, &evicted_list, hnnode) {
++ struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
++
++ hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode);
++ nf_ct_put(ct);
+
+- /* can't _put while holding lock */
+- while (evicted)
+- nf_ct_put(refs[--evicted]);
++ cond_resched();
++ }
+
+ return ret;
+ }
+
+ static void ecache_work(struct work_struct *work)
+ {
+- struct nf_conntrack_net *cnet = container_of(work, struct nf_conntrack_net, ecache_dwork.work);
+- struct netns_ct *ctnet = cnet->ct_net;
+- int cpu, delay = -1;
+- struct ct_pcpu *pcpu;
+-
+- local_bh_disable();
+-
+- for_each_possible_cpu(cpu) {
+- enum retry_state ret;
+-
+- pcpu = per_cpu_ptr(ctnet->pcpu_lists, cpu);
+-
+- ret = ecache_work_evict_list(pcpu);
+-
+- switch (ret) {
+- case STATE_CONGESTED:
+- delay = ECACHE_RETRY_WAIT;
+- goto out;
+- case STATE_RESTART:
+- delay = 0;
+- break;
+- case STATE_DONE:
+- break;
+- }
++ struct nf_conntrack_net *cnet = container_of(work, struct nf_conntrack_net, ecache.dwork.work);
++ int ret, delay = -1;
++
++ ret = ecache_work_evict_list(cnet);
++ switch (ret) {
++ case STATE_CONGESTED:
++ delay = ECACHE_RETRY_JIFFIES;
++ break;
++ case STATE_RESTART:
++ delay = 0;
++ break;
++ case STATE_DONE:
++ break;
+ }
+
+- out:
+- local_bh_enable();
+-
+- ctnet->ecache_dwork_pending = delay > 0;
+ if (delay >= 0)
+- schedule_delayed_work(&cnet->ecache_dwork, delay);
++ schedule_delayed_work(&cnet->ecache.dwork, delay);
+ }
+
+ static int __nf_conntrack_eventmask_report(struct nf_conntrack_ecache *e,
+@@ -199,7 +193,6 @@ int nf_conntrack_eventmask_report(unsigned int events, struct nf_conn *ct,
+ */
+ if (e->portid == 0 && portid != 0)
+ e->portid = portid;
+- e->state = NFCT_ECACHE_DESTROY_FAIL;
+ }
+
+ return ret;
+@@ -293,12 +286,14 @@ void nf_conntrack_ecache_work(struct net *net, enum nf_ct_ecache_state state)
+ struct nf_conntrack_net *cnet = nf_ct_pernet(net);
+
+ if (state == NFCT_ECACHE_DESTROY_FAIL &&
+- !delayed_work_pending(&cnet->ecache_dwork)) {
+- schedule_delayed_work(&cnet->ecache_dwork, HZ);
++ !delayed_work_pending(&cnet->ecache.dwork)) {
++ schedule_delayed_work(&cnet->ecache.dwork, HZ);
+ net->ct.ecache_dwork_pending = true;
+ } else if (state == NFCT_ECACHE_DESTROY_SENT) {
+- net->ct.ecache_dwork_pending = false;
+- mod_delayed_work(system_wq, &cnet->ecache_dwork, 0);
++ if (!hlist_nulls_empty(&cnet->ecache.dying_list))
++ mod_delayed_work(system_wq, &cnet->ecache.dwork, 0);
++ else
++ net->ct.ecache_dwork_pending = false;
+ }
+ }
+
+@@ -310,8 +305,10 @@ void nf_conntrack_ecache_pernet_init(struct net *net)
+ struct nf_conntrack_net *cnet = nf_ct_pernet(net);
+
+ net->ct.sysctl_events = nf_ct_events;
+- cnet->ct_net = &net->ct;
+- INIT_DELAYED_WORK(&cnet->ecache_dwork, ecache_work);
++
++ INIT_DELAYED_WORK(&cnet->ecache.dwork, ecache_work);
++ INIT_HLIST_NULLS_HEAD(&cnet->ecache.dying_list, DYING_NULLS_VAL);
++ spin_lock_init(&cnet->ecache.dying_lock);
+
+ BUILD_BUG_ON(__IPCT_MAX >= 16); /* e->ctmask is u16 */
+ }
+@@ -320,5 +317,5 @@ void nf_conntrack_ecache_pernet_fini(struct net *net)
+ {
+ struct nf_conntrack_net *cnet = nf_ct_pernet(net);
+
+- cancel_delayed_work_sync(&cnet->ecache_dwork);
++ cancel_delayed_work_sync(&cnet->ecache.dwork);
+ }
+diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
+index 1ea2ad732d578..431e005ff14d3 100644
+--- a/net/netfilter/nf_conntrack_netlink.c
++++ b/net/netfilter/nf_conntrack_netlink.c
+@@ -1203,6 +1203,7 @@ restart:
+ hnnode) {
+ ct = nf_ct_tuplehash_to_ctrack(h);
+ if (nf_ct_is_expired(ct)) {
++ /* need to defer nf_ct_kill() until lock is released */
+ if (i < ARRAY_SIZE(nf_ct_evict) &&
+ refcount_inc_not_zero(&ct->ct_general.use))
+ nf_ct_evict[i++] = ct;
+@@ -1708,19 +1709,56 @@ static int ctnetlink_done_list(struct netlink_callback *cb)
+ return 0;
+ }
+
++static int ctnetlink_dump_one_entry(struct sk_buff *skb,
++ struct netlink_callback *cb,
++ struct nf_conn *ct,
++ bool dying)
++{
++ struct ctnetlink_list_dump_ctx *ctx = (void *)cb->ctx;
++ struct nfgenmsg *nfmsg = nlmsg_data(cb->nlh);
++ u8 l3proto = nfmsg->nfgen_family;
++ int res;
++
++ if (l3proto && nf_ct_l3num(ct) != l3proto)
++ return 0;
++
++ if (ctx->last) {
++ if (ct != ctx->last)
++ return 0;
++
++ ctx->last = NULL;
++ }
++
++ /* We can't dump extension info for the unconfirmed
++ * list because unconfirmed conntracks can have
++ * ct->ext reallocated (and thus freed).
++ *
++ * In the dying list case ct->ext can't be free'd
++ * until after we drop pcpu->lock.
++ */
++ res = ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).portid,
++ cb->nlh->nlmsg_seq,
++ NFNL_MSG_TYPE(cb->nlh->nlmsg_type),
++ ct, dying, 0);
++ if (res < 0) {
++ if (!refcount_inc_not_zero(&ct->ct_general.use))
++ return 0;
++
++ ctx->last = ct;
++ }
++
++ return res;
++}
++
+ static int
+-ctnetlink_dump_list(struct sk_buff *skb, struct netlink_callback *cb, bool dying)
++ctnetlink_dump_unconfirmed(struct sk_buff *skb, struct netlink_callback *cb)
+ {
+ struct ctnetlink_list_dump_ctx *ctx = (void *)cb->ctx;
+ struct nf_conn *ct, *last;
+ struct nf_conntrack_tuple_hash *h;
+ struct hlist_nulls_node *n;
+- struct nfgenmsg *nfmsg = nlmsg_data(cb->nlh);
+- u_int8_t l3proto = nfmsg->nfgen_family;
+- int res;
+- int cpu;
+- struct hlist_nulls_head *list;
+ struct net *net = sock_net(skb->sk);
++ int res, cpu;
+
+ if (ctx->done)
+ return 0;
+@@ -1735,34 +1773,13 @@ ctnetlink_dump_list(struct sk_buff *skb, struct netlink_callback *cb, bool dying
+
+ pcpu = per_cpu_ptr(net->ct.pcpu_lists, cpu);
+ spin_lock_bh(&pcpu->lock);
+- list = dying ? &pcpu->dying : &pcpu->unconfirmed;
+ restart:
+- hlist_nulls_for_each_entry(h, n, list, hnnode) {
++ hlist_nulls_for_each_entry(h, n, &pcpu->unconfirmed, hnnode) {
+ ct = nf_ct_tuplehash_to_ctrack(h);
+- if (l3proto && nf_ct_l3num(ct) != l3proto)
+- continue;
+- if (ctx->last) {
+- if (ct != last)
+- continue;
+- ctx->last = NULL;
+- }
+
+- /* We can't dump extension info for the unconfirmed
+- * list because unconfirmed conntracks can have
+- * ct->ext reallocated (and thus freed).
+- *
+- * In the dying list case ct->ext can't be free'd
+- * until after we drop pcpu->lock.
+- */
+- res = ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).portid,
+- cb->nlh->nlmsg_seq,
+- NFNL_MSG_TYPE(cb->nlh->nlmsg_type),
+- ct, dying, 0);
++ res = ctnetlink_dump_one_entry(skb, cb, ct, false);
+ if (res < 0) {
+- if (!refcount_inc_not_zero(&ct->ct_general.use))
+- continue;
+ ctx->cpu = cpu;
+- ctx->last = ct;
+ spin_unlock_bh(&pcpu->lock);
+ goto out;
+ }
+@@ -1784,7 +1801,49 @@ out:
+ static int
+ ctnetlink_dump_dying(struct sk_buff *skb, struct netlink_callback *cb)
+ {
+- return ctnetlink_dump_list(skb, cb, true);
++ struct ctnetlink_list_dump_ctx *ctx = (void *)cb->ctx;
++ struct nf_conn *last = ctx->last;
++#ifdef CONFIG_NF_CONNTRACK_EVENTS
++ const struct net *net = sock_net(skb->sk);
++ struct nf_conntrack_net_ecache *ecache_net;
++ struct nf_conntrack_tuple_hash *h;
++ struct hlist_nulls_node *n;
++#endif
++
++ if (ctx->done)
++ return 0;
++
++ ctx->last = NULL;
++
++#ifdef CONFIG_NF_CONNTRACK_EVENTS
++ ecache_net = nf_conn_pernet_ecache(net);
++ spin_lock_bh(&ecache_net->dying_lock);
++
++ hlist_nulls_for_each_entry(h, n, &ecache_net->dying_list, hnnode) {
++ struct nf_conn *ct;
++ int res;
++
++ ct = nf_ct_tuplehash_to_ctrack(h);
++ if (last && last != ct)
++ continue;
++
++ res = ctnetlink_dump_one_entry(skb, cb, ct, true);
++ if (res < 0) {
++ spin_unlock_bh(&ecache_net->dying_lock);
++ nf_ct_put(last);
++ return skb->len;
++ }
++
++ nf_ct_put(last);
++ last = NULL;
++ }
++
++ spin_unlock_bh(&ecache_net->dying_lock);
++#endif
++ ctx->done = true;
++ nf_ct_put(last);
++
++ return skb->len;
+ }
+
+ static int ctnetlink_get_ct_dying(struct sk_buff *skb,
+@@ -1802,12 +1861,6 @@ static int ctnetlink_get_ct_dying(struct sk_buff *skb,
+ return -EOPNOTSUPP;
+ }
+
+-static int
+-ctnetlink_dump_unconfirmed(struct sk_buff *skb, struct netlink_callback *cb)
+-{
+- return ctnetlink_dump_list(skb, cb, false);
+-}
+-
+ static int ctnetlink_get_ct_unconfirmed(struct sk_buff *skb,
+ const struct nfnl_info *info,
+ const struct nlattr * const cda[])
+diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
+index 55aa55b252b20..48812dda273b2 100644
+--- a/net/netfilter/nf_conntrack_standalone.c
++++ b/net/netfilter/nf_conntrack_standalone.c
+@@ -306,6 +306,9 @@ static int ct_seq_show(struct seq_file *s, void *v)
+ if (unlikely(!refcount_inc_not_zero(&ct->ct_general.use)))
+ return 0;
+
++ /* load ->status after refcount increase */
++ smp_acquire__after_ctrl_dep();
++
+ if (nf_ct_should_gc(ct)) {
+ nf_ct_kill(ct);
+ goto release;
+diff --git a/net/netfilter/nf_log_syslog.c b/net/netfilter/nf_log_syslog.c
+index 13234641cdb34..7000e069bc076 100644
+--- a/net/netfilter/nf_log_syslog.c
++++ b/net/netfilter/nf_log_syslog.c
+@@ -61,7 +61,7 @@ dump_arp_packet(struct nf_log_buf *m,
+ unsigned int logflags;
+ struct arphdr _arph;
+
+- ah = skb_header_pointer(skb, 0, sizeof(_arph), &_arph);
++ ah = skb_header_pointer(skb, nhoff, sizeof(_arph), &_arph);
+ if (!ah) {
+ nf_log_buf_add(m, "TRUNCATED");
+ return;
+@@ -90,7 +90,7 @@ dump_arp_packet(struct nf_log_buf *m,
+ ah->ar_pln != sizeof(__be32))
+ return;
+
+- ap = skb_header_pointer(skb, sizeof(_arph), sizeof(_arpp), &_arpp);
++ ap = skb_header_pointer(skb, nhoff + sizeof(_arph), sizeof(_arpp), &_arpp);
+ if (!ap) {
+ nf_log_buf_add(m, " INCOMPLETE [%zu bytes]",
+ skb->len - sizeof(_arph));
+@@ -144,7 +144,7 @@ static void nf_log_arp_packet(struct net *net, u_int8_t pf,
+
+ nf_log_dump_packet_common(m, pf, hooknum, skb, in, out, loginfo,
+ prefix);
+- dump_arp_packet(m, loginfo, skb, 0);
++ dump_arp_packet(m, loginfo, skb, skb_network_offset(skb));
+
+ nf_log_buf_close(m);
+ }
+@@ -829,7 +829,7 @@ static void nf_log_ip_packet(struct net *net, u_int8_t pf,
+ if (in)
+ dump_ipv4_mac_header(m, loginfo, skb);
+
+- dump_ipv4_packet(net, m, loginfo, skb, 0);
++ dump_ipv4_packet(net, m, loginfo, skb, skb_network_offset(skb));
+
+ nf_log_buf_close(m);
+ }
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index a136148627e70..de3dc35ce6095 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -5831,8 +5831,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ if (!nla[NFTA_SET_ELEM_KEY] && !(flags & NFT_SET_ELEM_CATCHALL))
+ return -EINVAL;
+
+- if (flags != 0)
+- nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS);
++ if (flags != 0) {
++ err = nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS);
++ if (err < 0)
++ return err;
++ }
+
+ if (set->flags & NFT_SET_MAP) {
+ if (nla[NFTA_SET_ELEM_DATA] == NULL &&
+@@ -5941,7 +5944,9 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ if (err < 0)
+ goto err_set_elem_expr;
+
+- nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen);
++ err = nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen);
++ if (err < 0)
++ goto err_parse_key;
+ }
+
+ if (nla[NFTA_SET_ELEM_KEY_END]) {
+@@ -5950,22 +5955,31 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ if (err < 0)
+ goto err_parse_key;
+
+- nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen);
++ err = nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen);
++ if (err < 0)
++ goto err_parse_key_end;
+ }
+
+ if (timeout > 0) {
+- nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION);
+- if (timeout != set->timeout)
+- nft_set_ext_add(&tmpl, NFT_SET_EXT_TIMEOUT);
++ err = nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION);
++ if (err < 0)
++ goto err_parse_key_end;
++
++ if (timeout != set->timeout) {
++ err = nft_set_ext_add(&tmpl, NFT_SET_EXT_TIMEOUT);
++ if (err < 0)
++ goto err_parse_key_end;
++ }
+ }
+
+ if (num_exprs) {
+ for (i = 0; i < num_exprs; i++)
+ size += expr_array[i]->ops->size;
+
+- nft_set_ext_add_length(&tmpl, NFT_SET_EXT_EXPRESSIONS,
+- sizeof(struct nft_set_elem_expr) +
+- size);
++ err = nft_set_ext_add_length(&tmpl, NFT_SET_EXT_EXPRESSIONS,
++ sizeof(struct nft_set_elem_expr) + size);
++ if (err < 0)
++ goto err_parse_key_end;
+ }
+
+ if (nla[NFTA_SET_ELEM_OBJREF] != NULL) {
+@@ -5980,7 +5994,9 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ err = PTR_ERR(obj);
+ goto err_parse_key_end;
+ }
+- nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF);
++ err = nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF);
++ if (err < 0)
++ goto err_parse_key_end;
+ }
+
+ if (nla[NFTA_SET_ELEM_DATA] != NULL) {
+@@ -6014,7 +6030,9 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ NFT_VALIDATE_NEED);
+ }
+
+- nft_set_ext_add_length(&tmpl, NFT_SET_EXT_DATA, desc.len);
++ err = nft_set_ext_add_length(&tmpl, NFT_SET_EXT_DATA, desc.len);
++ if (err < 0)
++ goto err_parse_data;
+ }
+
+ /* The full maximum length of userdata can exceed the maximum
+@@ -6024,9 +6042,12 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ ulen = 0;
+ if (nla[NFTA_SET_ELEM_USERDATA] != NULL) {
+ ulen = nla_len(nla[NFTA_SET_ELEM_USERDATA]);
+- if (ulen > 0)
+- nft_set_ext_add_length(&tmpl, NFT_SET_EXT_USERDATA,
+- ulen);
++ if (ulen > 0) {
++ err = nft_set_ext_add_length(&tmpl, NFT_SET_EXT_USERDATA,
++ ulen);
++ if (err < 0)
++ goto err_parse_data;
++ }
+ }
+
+ err = -ENOMEM;
+@@ -6252,8 +6273,11 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set,
+
+ nft_set_ext_prepare(&tmpl);
+
+- if (flags != 0)
+- nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS);
++ if (flags != 0) {
++ err = nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS);
++ if (err < 0)
++ return err;
++ }
+
+ if (nla[NFTA_SET_ELEM_KEY]) {
+ err = nft_setelem_parse_key(ctx, set, &elem.key.val,
+@@ -6261,16 +6285,20 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set,
+ if (err < 0)
+ return err;
+
+- nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen);
++ err = nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen);
++ if (err < 0)
++ goto fail_elem;
+ }
+
+ if (nla[NFTA_SET_ELEM_KEY_END]) {
+ err = nft_setelem_parse_key(ctx, set, &elem.key_end.val,
+ nla[NFTA_SET_ELEM_KEY_END]);
+ if (err < 0)
+- return err;
++ goto fail_elem;
+
+- nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen);
++ err = nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen);
++ if (err < 0)
++ goto fail_elem_key_end;
+ }
+
+ err = -ENOMEM;
+@@ -6278,7 +6306,7 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set,
+ elem.key_end.val.data, NULL, 0, 0,
+ GFP_KERNEL_ACCOUNT);
+ if (elem.priv == NULL)
+- goto fail_elem;
++ goto fail_elem_key_end;
+
+ ext = nft_set_elem_ext(set, elem.priv);
+ if (flags)
+@@ -6302,6 +6330,8 @@ fail_ops:
+ kfree(trans);
+ fail_trans:
+ kfree(elem.priv);
++fail_elem_key_end:
++ nft_data_release(&elem.key_end.val, NFT_DATA_VALUE);
+ fail_elem:
+ nft_data_release(&elem.key.val, NFT_DATA_VALUE);
+ return err;
+diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
+index 53f40e4738557..3ddce24ac76dd 100644
+--- a/net/netfilter/nf_tables_core.c
++++ b/net/netfilter/nf_tables_core.c
+@@ -25,9 +25,7 @@ static noinline void __nft_trace_packet(struct nft_traceinfo *info,
+ const struct nft_chain *chain,
+ enum nft_trace_types type)
+ {
+- const struct nft_pktinfo *pkt = info->pkt;
+-
+- if (!info->trace || !pkt->skb->nf_trace)
++ if (!info->trace || !info->nf_trace)
+ return;
+
+ info->chain = chain;
+@@ -42,11 +40,24 @@ static inline void nft_trace_packet(struct nft_traceinfo *info,
+ enum nft_trace_types type)
+ {
+ if (static_branch_unlikely(&nft_trace_enabled)) {
++ const struct nft_pktinfo *pkt = info->pkt;
++
++ info->nf_trace = pkt->skb->nf_trace;
+ info->rule = rule;
+ __nft_trace_packet(info, chain, type);
+ }
+ }
+
++static inline void nft_trace_copy_nftrace(struct nft_traceinfo *info)
++{
++ if (static_branch_unlikely(&nft_trace_enabled)) {
++ const struct nft_pktinfo *pkt = info->pkt;
++
++ if (info->trace)
++ info->nf_trace = pkt->skb->nf_trace;
++ }
++}
++
+ static void nft_bitwise_fast_eval(const struct nft_expr *expr,
+ struct nft_regs *regs)
+ {
+@@ -85,6 +96,7 @@ static noinline void __nft_trace_verdict(struct nft_traceinfo *info,
+ const struct nft_chain *chain,
+ const struct nft_regs *regs)
+ {
++ const struct nft_pktinfo *pkt = info->pkt;
+ enum nft_trace_types type;
+
+ switch (regs->verdict.code) {
+@@ -92,8 +104,13 @@ static noinline void __nft_trace_verdict(struct nft_traceinfo *info,
+ case NFT_RETURN:
+ type = NFT_TRACETYPE_RETURN;
+ break;
++ case NF_STOLEN:
++ type = NFT_TRACETYPE_RULE;
++ /* can't access skb->nf_trace; use copy */
++ break;
+ default:
+ type = NFT_TRACETYPE_RULE;
++ info->nf_trace = pkt->skb->nf_trace;
+ break;
+ }
+
+@@ -254,6 +271,7 @@ next_rule:
+ switch (regs.verdict.code) {
+ case NFT_BREAK:
+ regs.verdict.code = NFT_CONTINUE;
++ nft_trace_copy_nftrace(&info);
+ continue;
+ case NFT_CONTINUE:
+ nft_trace_packet(&info, chain, rule,
+diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c
+index 5041725423c2d..1163ba9c14011 100644
+--- a/net/netfilter/nf_tables_trace.c
++++ b/net/netfilter/nf_tables_trace.c
+@@ -7,7 +7,7 @@
+ #include <linux/module.h>
+ #include <linux/static_key.h>
+ #include <linux/hash.h>
+-#include <linux/jhash.h>
++#include <linux/siphash.h>
+ #include <linux/if_vlan.h>
+ #include <linux/init.h>
+ #include <linux/skbuff.h>
+@@ -25,22 +25,6 @@
+ DEFINE_STATIC_KEY_FALSE(nft_trace_enabled);
+ EXPORT_SYMBOL_GPL(nft_trace_enabled);
+
+-static int trace_fill_id(struct sk_buff *nlskb, struct sk_buff *skb)
+-{
+- __be32 id;
+-
+- /* using skb address as ID results in a limited number of
+- * values (and quick reuse).
+- *
+- * So we attempt to use as many skb members that will not
+- * change while skb is with netfilter.
+- */
+- id = (__be32)jhash_2words(hash32_ptr(skb), skb_get_hash(skb),
+- skb->skb_iif);
+-
+- return nla_put_be32(nlskb, NFTA_TRACE_ID, id);
+-}
+-
+ static int trace_fill_header(struct sk_buff *nlskb, u16 type,
+ const struct sk_buff *skb,
+ int off, unsigned int len)
+@@ -186,6 +170,7 @@ void nft_trace_notify(struct nft_traceinfo *info)
+ struct nlmsghdr *nlh;
+ struct sk_buff *skb;
+ unsigned int size;
++ u32 mark = 0;
+ u16 event;
+
+ if (!nfnetlink_has_listeners(nft_net(pkt), NFNLGRP_NFTRACE))
+@@ -229,7 +214,7 @@ void nft_trace_notify(struct nft_traceinfo *info)
+ if (nla_put_be32(skb, NFTA_TRACE_TYPE, htonl(info->type)))
+ goto nla_put_failure;
+
+- if (trace_fill_id(skb, pkt->skb))
++ if (nla_put_u32(skb, NFTA_TRACE_ID, info->skbid))
+ goto nla_put_failure;
+
+ if (nla_put_string(skb, NFTA_TRACE_CHAIN, info->chain->name))
+@@ -249,16 +234,24 @@ void nft_trace_notify(struct nft_traceinfo *info)
+ case NFT_TRACETYPE_RULE:
+ if (nft_verdict_dump(skb, NFTA_TRACE_VERDICT, info->verdict))
+ goto nla_put_failure;
++
++ /* pkt->skb undefined iff NF_STOLEN, disable dump */
++ if (info->verdict->code == NF_STOLEN)
++ info->packet_dumped = true;
++ else
++ mark = pkt->skb->mark;
++
+ break;
+ case NFT_TRACETYPE_POLICY:
++ mark = pkt->skb->mark;
++
+ if (nla_put_be32(skb, NFTA_TRACE_POLICY,
+ htonl(info->basechain->policy)))
+ goto nla_put_failure;
+ break;
+ }
+
+- if (pkt->skb->mark &&
+- nla_put_be32(skb, NFTA_TRACE_MARK, htonl(pkt->skb->mark)))
++ if (mark && nla_put_be32(skb, NFTA_TRACE_MARK, htonl(mark)))
+ goto nla_put_failure;
+
+ if (!info->packet_dumped) {
+@@ -283,9 +276,20 @@ void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,
+ const struct nft_verdict *verdict,
+ const struct nft_chain *chain)
+ {
++ static siphash_key_t trace_key __read_mostly;
++ struct sk_buff *skb = pkt->skb;
++
+ info->basechain = nft_base_chain(chain);
+ info->trace = true;
++ info->nf_trace = pkt->skb->nf_trace;
+ info->packet_dumped = false;
+ info->pkt = pkt;
+ info->verdict = verdict;
++
++ net_get_random_once(&trace_key, sizeof(trace_key));
++
++ info->skbid = (u32)siphash_3u32(hash32_ptr(skb),
++ skb_get_hash(skb),
++ skb->skb_iif,
++ &trace_key);
+ }
+diff --git a/net/tipc/socket.c b/net/tipc/socket.c
+index 17f8c523e33b0..43509c7e90fc2 100644
+--- a/net/tipc/socket.c
++++ b/net/tipc/socket.c
+@@ -502,6 +502,7 @@ static int tipc_sk_create(struct net *net, struct socket *sock,
+ sock_init_data(sock, sk);
+ tipc_set_sk_state(sk, TIPC_OPEN);
+ if (tipc_sk_insert(tsk)) {
++ sk_free(sk);
+ pr_warn("Socket create failed; port number exhausted\n");
+ return -EINVAL;
+ }
+diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
+index 3919fe2c58c5c..3a61bb5945441 100644
+--- a/net/tls/tls_device.c
++++ b/net/tls/tls_device.c
+@@ -1394,9 +1394,9 @@ static struct notifier_block tls_dev_notifier = {
+ .notifier_call = tls_dev_event,
+ };
+
+-void __init tls_device_init(void)
++int __init tls_device_init(void)
+ {
+- register_netdevice_notifier(&tls_dev_notifier);
++ return register_netdevice_notifier(&tls_dev_notifier);
+ }
+
+ void __exit tls_device_cleanup(void)
+diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
+index 5c9697840ef70..13058b0ee4cd1 100644
+--- a/net/tls/tls_main.c
++++ b/net/tls/tls_main.c
+@@ -993,7 +993,12 @@ static int __init tls_register(void)
+ if (err)
+ return err;
+
+- tls_device_init();
++ err = tls_device_init();
++ if (err) {
++ unregister_pernet_subsys(&tls_proc_ops);
++ return err;
++ }
++
+ tcp_register_ulp(&tcp_tls_ulp_ops);
+
+ return 0;
+diff --git a/security/integrity/evm/evm_crypto.c b/security/integrity/evm/evm_crypto.c
+index 0450d79afdc8f..b862f0f919bfc 100644
+--- a/security/integrity/evm/evm_crypto.c
++++ b/security/integrity/evm/evm_crypto.c
+@@ -75,7 +75,7 @@ static struct shash_desc *init_desc(char type, uint8_t hash_algo)
+ {
+ long rc;
+ const char *algo;
+- struct crypto_shash **tfm, *tmp_tfm = NULL;
++ struct crypto_shash **tfm, *tmp_tfm;
+ struct shash_desc *desc;
+
+ if (type == EVM_XATTR_HMAC) {
+@@ -120,16 +120,13 @@ unlock:
+ alloc:
+ desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(*tfm),
+ GFP_KERNEL);
+- if (!desc) {
+- crypto_free_shash(tmp_tfm);
++ if (!desc)
+ return ERR_PTR(-ENOMEM);
+- }
+
+ desc->tfm = *tfm;
+
+ rc = crypto_shash_init(desc);
+ if (rc) {
+- crypto_free_shash(tmp_tfm);
+ kfree(desc);
+ return ERR_PTR(rc);
+ }
+diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
+index 17232bbfb9f96..ee6a0f8879e43 100644
+--- a/security/integrity/ima/ima_appraise.c
++++ b/security/integrity/ima/ima_appraise.c
+@@ -408,7 +408,8 @@ int ima_appraise_measurement(enum ima_hooks func,
+ goto out;
+ }
+
+- status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value, rc, iint);
++ status = evm_verifyxattr(dentry, XATTR_NAME_IMA, xattr_value,
++ rc < 0 ? 0 : rc, iint);
+ switch (status) {
+ case INTEGRITY_PASS:
+ case INTEGRITY_PASS_IMMUTABLE:
+diff --git a/security/integrity/ima/ima_crypto.c b/security/integrity/ima/ima_crypto.c
+index a7206cc1d7d19..64499056648ad 100644
+--- a/security/integrity/ima/ima_crypto.c
++++ b/security/integrity/ima/ima_crypto.c
+@@ -205,6 +205,7 @@ out_array:
+
+ crypto_free_shash(ima_algo_array[i].tfm);
+ }
++ kfree(ima_algo_array);
+ out:
+ crypto_free_shash(ima_shash_tfm);
+ return rc;
+diff --git a/security/integrity/ima/ima_efi.c b/security/integrity/ima/ima_efi.c
+index 71786d01946f4..9db66fe310d42 100644
+--- a/security/integrity/ima/ima_efi.c
++++ b/security/integrity/ima/ima_efi.c
+@@ -67,6 +67,8 @@ const char * const *arch_get_ima_policy(void)
+ if (IS_ENABLED(CONFIG_IMA_ARCH_POLICY) && arch_ima_get_secureboot()) {
+ if (IS_ENABLED(CONFIG_MODULE_SIG))
+ set_module_sig_enforced();
++ if (IS_ENABLED(CONFIG_KEXEC_SIG))
++ set_kexec_sig_enforced();
+ return sb_arch_rules;
+ }
+ return NULL;
+diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
+index 0b7d500249f6e..8c68e6e1387ef 100644
+--- a/sound/pci/hda/patch_conexant.c
++++ b/sound/pci/hda/patch_conexant.c
+@@ -944,6 +944,7 @@ static const struct snd_pci_quirk cxt5066_fixups[] = {
+ SND_PCI_QUIRK(0x103c, 0x828c, "HP EliteBook 840 G4", CXT_FIXUP_HP_DOCK),
+ SND_PCI_QUIRK(0x103c, 0x8299, "HP 800 G3 SFF", CXT_FIXUP_HP_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x829a, "HP 800 G3 DM", CXT_FIXUP_HP_MIC_NO_PRESENCE),
++ SND_PCI_QUIRK(0x103c, 0x82b4, "HP ProDesk 600 G3", CXT_FIXUP_HP_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x836e, "HP ProBook 455 G5", CXT_FIXUP_MUTE_LED_GPIO),
+ SND_PCI_QUIRK(0x103c, 0x837f, "HP ProBook 470 G5", CXT_FIXUP_MUTE_LED_GPIO),
+ SND_PCI_QUIRK(0x103c, 0x83b2, "HP EliteBook 840 G5", CXT_FIXUP_HP_DOCK),
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index 45f5db25a77ed..b7f1f2fb60cbc 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -6953,6 +6953,7 @@ enum {
+ ALC298_FIXUP_LENOVO_SPK_VOLUME,
+ ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER,
+ ALC269_FIXUP_ATIV_BOOK_8,
++ ALC221_FIXUP_HP_288PRO_MIC_NO_PRESENCE,
+ ALC221_FIXUP_HP_MIC_NO_PRESENCE,
+ ALC256_FIXUP_ASUS_HEADSET_MODE,
+ ALC256_FIXUP_ASUS_MIC,
+@@ -7889,6 +7890,16 @@ static const struct hda_fixup alc269_fixups[] = {
+ .chained = true,
+ .chain_id = ALC269_FIXUP_NO_SHUTUP
+ },
++ [ALC221_FIXUP_HP_288PRO_MIC_NO_PRESENCE] = {
++ .type = HDA_FIXUP_PINS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x19, 0x01a1913c }, /* use as headset mic, without its own jack detect */
++ { 0x1a, 0x01813030 }, /* use as headphone mic, without its own jack detect */
++ { }
++ },
++ .chained = true,
++ .chain_id = ALC269_FIXUP_HEADSET_MODE
++ },
+ [ALC221_FIXUP_HP_MIC_NO_PRESENCE] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+@@ -8938,6 +8949,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1025, 0x1290, "Acer Veriton Z4860G", ALC286_FIXUP_ACER_AIO_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1025, 0x1291, "Acer Veriton Z4660G", ALC286_FIXUP_ACER_AIO_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1025, 0x129c, "Acer SWIFT SF314-55", ALC256_FIXUP_ACER_HEADSET_MIC),
++ SND_PCI_QUIRK(0x1025, 0x129d, "Acer SWIFT SF313-51", ALC256_FIXUP_ACER_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1025, 0x1300, "Acer SWIFT SF314-56", ALC256_FIXUP_ACER_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1025, 0x1308, "Acer Aspire Z24-890", ALC286_FIXUP_ACER_AIO_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1025, 0x132a, "Acer TravelMate B114-21", ALC233_FIXUP_ACER_HEADSET_MIC),
+@@ -8947,6 +8959,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1025, 0x1430, "Acer TravelMate B311R-31", ALC256_FIXUP_ACER_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1025, 0x1466, "Acer Aspire A515-56", ALC255_FIXUP_ACER_HEADPHONE_AND_MIC),
+ SND_PCI_QUIRK(0x1028, 0x0470, "Dell M101z", ALC269_FIXUP_DELL_M101Z),
++ SND_PCI_QUIRK(0x1028, 0x053c, "Dell Latitude E5430", ALC292_FIXUP_DELL_E7X),
+ SND_PCI_QUIRK(0x1028, 0x054b, "Dell XPS one 2710", ALC275_FIXUP_DELL_XPS),
+ SND_PCI_QUIRK(0x1028, 0x05bd, "Dell Latitude E6440", ALC292_FIXUP_DELL_E7X),
+ SND_PCI_QUIRK(0x1028, 0x05be, "Dell Latitude E6540", ALC292_FIXUP_DELL_E7X),
+@@ -9062,6 +9075,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x2335, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC1),
+ SND_PCI_QUIRK(0x103c, 0x2336, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC1),
+ SND_PCI_QUIRK(0x103c, 0x2337, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC1),
++ SND_PCI_QUIRK(0x103c, 0x2b5e, "HP 288 Pro G2 MT", ALC221_FIXUP_HP_288PRO_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x802e, "HP Z240 SFF", ALC221_FIXUP_HP_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x802f, "HP Z240", ALC221_FIXUP_HP_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x103c, 0x8077, "HP", ALC256_FIXUP_HP_HEADSET_MIC),
+@@ -9148,6 +9162,10 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x89c6, "Zbook Fury 17 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x89ca, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
+ SND_PCI_QUIRK(0x103c, 0x8a78, "HP Dev One", ALC285_FIXUP_HP_LIMIT_INT_MIC_BOOST),
++ SND_PCI_QUIRK(0x103c, 0x8aa0, "HP ProBook 440 G9 (MB 8A9E)", ALC236_FIXUP_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8aa3, "HP ProBook 450 G9 (MB 8AA1)", ALC236_FIXUP_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8aa8, "HP EliteBook 640 G9 (MB 8AA6)", ALC236_FIXUP_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8aab, "HP EliteBook 650 G9 (MB 8AA9)", ALC236_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300),
+ SND_PCI_QUIRK(0x1043, 0x106d, "Asus K53BE", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
+@@ -9407,6 +9425,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1d72, 0x1602, "RedmiBook", ALC255_FIXUP_XIAOMI_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1d72, 0x1701, "XiaomiNotebook Pro", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1d72, 0x1901, "RedmiBook 14", ALC256_FIXUP_ASUS_HEADSET_MIC),
++ SND_PCI_QUIRK(0x1d72, 0x1945, "Redmi G", ALC256_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1d72, 0x1947, "RedmiBook Air", ALC255_FIXUP_XIAOMI_HEADSET_MIC),
+ SND_PCI_QUIRK(0x8086, 0x2074, "Intel NUC 8", ALC233_FIXUP_INTEL_NUC8_DMIC),
+ SND_PCI_QUIRK(0x8086, 0x2080, "Intel NUC 8 Rugged", ALC256_FIXUP_INTEL_NUC8_RUGGED),
+@@ -11269,6 +11288,7 @@ static const struct snd_pci_quirk alc662_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x1632, "HP RP5800", ALC662_FIXUP_HP_RP5800),
+ SND_PCI_QUIRK(0x103c, 0x8719, "HP", ALC897_FIXUP_HP_HSMIC_VERB),
+ SND_PCI_QUIRK(0x103c, 0x873e, "HP", ALC671_FIXUP_HP_HEADSET_MIC2),
++ SND_PCI_QUIRK(0x103c, 0x877e, "HP 288 Pro G6", ALC671_FIXUP_HP_HEADSET_MIC2),
+ SND_PCI_QUIRK(0x103c, 0x885f, "HP 288 Pro G8", ALC671_FIXUP_HP_HEADSET_MIC2),
+ SND_PCI_QUIRK(0x1043, 0x1080, "Asus UX501VW", ALC668_FIXUP_HEADSET_MODE),
+ SND_PCI_QUIRK(0x1043, 0x11cd, "Asus N550", ALC662_FIXUP_ASUS_Nx50),
+diff --git a/sound/soc/codecs/cs35l41-lib.c b/sound/soc/codecs/cs35l41-lib.c
+index 17cf782f39af6..538b5c4d3abf8 100644
+--- a/sound/soc/codecs/cs35l41-lib.c
++++ b/sound/soc/codecs/cs35l41-lib.c
+@@ -36,8 +36,8 @@ static const struct reg_default cs35l41_reg[] = {
+ { CS35L41_DAC_PCM1_SRC, 0x00000008 },
+ { CS35L41_ASP_TX1_SRC, 0x00000018 },
+ { CS35L41_ASP_TX2_SRC, 0x00000019 },
+- { CS35L41_ASP_TX3_SRC, 0x00000020 },
+- { CS35L41_ASP_TX4_SRC, 0x00000021 },
++ { CS35L41_ASP_TX3_SRC, 0x00000000 },
++ { CS35L41_ASP_TX4_SRC, 0x00000000 },
+ { CS35L41_DSP1_RX1_SRC, 0x00000008 },
+ { CS35L41_DSP1_RX2_SRC, 0x00000009 },
+ { CS35L41_DSP1_RX3_SRC, 0x00000018 },
+@@ -643,6 +643,8 @@ static const struct reg_sequence cs35l41_reva0_errata_patch[] = {
+ { CS35L41_DSP1_XM_ACCEL_PL0_PRI, 0x00000000 },
+ { CS35L41_PWR_CTRL2, 0x00000000 },
+ { CS35L41_AMP_GAIN_CTRL, 0x00000000 },
++ { CS35L41_ASP_TX3_SRC, 0x00000000 },
++ { CS35L41_ASP_TX4_SRC, 0x00000000 },
+ };
+
+ static const struct reg_sequence cs35l41_revb0_errata_patch[] = {
+@@ -654,6 +656,8 @@ static const struct reg_sequence cs35l41_revb0_errata_patch[] = {
+ { CS35L41_DSP1_XM_ACCEL_PL0_PRI, 0x00000000 },
+ { CS35L41_PWR_CTRL2, 0x00000000 },
+ { CS35L41_AMP_GAIN_CTRL, 0x00000000 },
++ { CS35L41_ASP_TX3_SRC, 0x00000000 },
++ { CS35L41_ASP_TX4_SRC, 0x00000000 },
+ };
+
+ static const struct reg_sequence cs35l41_revb2_errata_patch[] = {
+@@ -665,6 +669,8 @@ static const struct reg_sequence cs35l41_revb2_errata_patch[] = {
+ { CS35L41_DSP1_XM_ACCEL_PL0_PRI, 0x00000000 },
+ { CS35L41_PWR_CTRL2, 0x00000000 },
+ { CS35L41_AMP_GAIN_CTRL, 0x00000000 },
++ { CS35L41_ASP_TX3_SRC, 0x00000000 },
++ { CS35L41_ASP_TX4_SRC, 0x00000000 },
+ };
+
+ static const struct cs35l41_otp_map_element_t cs35l41_otp_map_map[] = {
+diff --git a/sound/soc/codecs/cs35l41.c b/sound/soc/codecs/cs35l41.c
+index 6b784a62df0ce..20c76a53a5080 100644
+--- a/sound/soc/codecs/cs35l41.c
++++ b/sound/soc/codecs/cs35l41.c
+@@ -392,7 +392,7 @@ static const struct snd_kcontrol_new cs35l41_aud_controls[] = {
+ SOC_SINGLE("HW Noise Gate Enable", CS35L41_NG_CFG, 8, 63, 0),
+ SOC_SINGLE("HW Noise Gate Delay", CS35L41_NG_CFG, 4, 7, 0),
+ SOC_SINGLE("HW Noise Gate Threshold", CS35L41_NG_CFG, 0, 7, 0),
+- SOC_SINGLE("Aux Noise Gate CH1 Enable",
++ SOC_SINGLE("Aux Noise Gate CH1 Switch",
+ CS35L41_MIXER_NGATE_CH1_CFG, 16, 1, 0),
+ SOC_SINGLE("Aux Noise Gate CH1 Entry Delay",
+ CS35L41_MIXER_NGATE_CH1_CFG, 8, 15, 0),
+@@ -400,15 +400,15 @@ static const struct snd_kcontrol_new cs35l41_aud_controls[] = {
+ CS35L41_MIXER_NGATE_CH1_CFG, 0, 7, 0),
+ SOC_SINGLE("Aux Noise Gate CH2 Entry Delay",
+ CS35L41_MIXER_NGATE_CH2_CFG, 8, 15, 0),
+- SOC_SINGLE("Aux Noise Gate CH2 Enable",
++ SOC_SINGLE("Aux Noise Gate CH2 Switch",
+ CS35L41_MIXER_NGATE_CH2_CFG, 16, 1, 0),
+ SOC_SINGLE("Aux Noise Gate CH2 Threshold",
+ CS35L41_MIXER_NGATE_CH2_CFG, 0, 7, 0),
+- SOC_SINGLE("SCLK Force", CS35L41_SP_FORMAT, CS35L41_SCLK_FRC_SHIFT, 1, 0),
+- SOC_SINGLE("LRCLK Force", CS35L41_SP_FORMAT, CS35L41_LRCLK_FRC_SHIFT, 1, 0),
+- SOC_SINGLE("Invert Class D", CS35L41_AMP_DIG_VOL_CTRL,
++ SOC_SINGLE("SCLK Force Switch", CS35L41_SP_FORMAT, CS35L41_SCLK_FRC_SHIFT, 1, 0),
++ SOC_SINGLE("LRCLK Force Switch", CS35L41_SP_FORMAT, CS35L41_LRCLK_FRC_SHIFT, 1, 0),
++ SOC_SINGLE("Invert Class D Switch", CS35L41_AMP_DIG_VOL_CTRL,
+ CS35L41_AMP_INV_PCM_SHIFT, 1, 0),
+- SOC_SINGLE("Amp Gain ZC", CS35L41_AMP_GAIN_CTRL,
++ SOC_SINGLE("Amp Gain ZC Switch", CS35L41_AMP_GAIN_CTRL,
+ CS35L41_AMP_GAIN_ZC_SHIFT, 1, 0),
+ WM_ADSP2_PRELOAD_SWITCH("DSP1", 1),
+ WM_ADSP_FW_CONTROL("DSP1", 0),
+diff --git a/sound/soc/codecs/cs47l15.c b/sound/soc/codecs/cs47l15.c
+index 391fd7da331fd..1c7d52bef8935 100644
+--- a/sound/soc/codecs/cs47l15.c
++++ b/sound/soc/codecs/cs47l15.c
+@@ -122,6 +122,9 @@ static int cs47l15_in1_adc_put(struct snd_kcontrol *kcontrol,
+ snd_soc_kcontrol_component(kcontrol);
+ struct cs47l15 *cs47l15 = snd_soc_component_get_drvdata(component);
+
++ if (!!ucontrol->value.integer.value[0] == cs47l15->in1_lp_mode)
++ return 0;
++
+ switch (ucontrol->value.integer.value[0]) {
+ case 0:
+ /* Set IN1 to normal mode */
+@@ -150,7 +153,7 @@ static int cs47l15_in1_adc_put(struct snd_kcontrol *kcontrol,
+ break;
+ }
+
+- return 0;
++ return 1;
+ }
+
+ static const struct snd_kcontrol_new cs47l15_snd_controls[] = {
+diff --git a/sound/soc/codecs/madera.c b/sound/soc/codecs/madera.c
+index 272041c6236a9..b9f19fbd29114 100644
+--- a/sound/soc/codecs/madera.c
++++ b/sound/soc/codecs/madera.c
+@@ -618,7 +618,13 @@ int madera_out1_demux_put(struct snd_kcontrol *kcontrol,
+ end:
+ snd_soc_dapm_mutex_unlock(dapm);
+
+- return snd_soc_dapm_mux_update_power(dapm, kcontrol, mux, e, NULL);
++ ret = snd_soc_dapm_mux_update_power(dapm, kcontrol, mux, e, NULL);
++ if (ret < 0) {
++ dev_err(madera->dev, "Failed to update demux power state: %d\n", ret);
++ return ret;
++ }
++
++ return change;
+ }
+ EXPORT_SYMBOL_GPL(madera_out1_demux_put);
+
+@@ -893,7 +899,7 @@ static int madera_adsp_rate_put(struct snd_kcontrol *kcontrol,
+ struct soc_enum *e = (struct soc_enum *)kcontrol->private_value;
+ const int adsp_num = e->shift_l;
+ const unsigned int item = ucontrol->value.enumerated.item[0];
+- int ret;
++ int ret = 0;
+
+ if (item >= e->items)
+ return -EINVAL;
+@@ -910,10 +916,10 @@ static int madera_adsp_rate_put(struct snd_kcontrol *kcontrol,
+ "Cannot change '%s' while in use by active audio paths\n",
+ kcontrol->id.name);
+ ret = -EBUSY;
+- } else {
++ } else if (priv->adsp_rate_cache[adsp_num] != e->values[item]) {
+ /* Volatile register so defer until the codec is powered up */
+ priv->adsp_rate_cache[adsp_num] = e->values[item];
+- ret = 0;
++ ret = 1;
+ }
+
+ mutex_unlock(&priv->rate_lock);
+diff --git a/sound/soc/codecs/max98373-sdw.c b/sound/soc/codecs/max98373-sdw.c
+index f47e956d4f55a..97b64477dde67 100644
+--- a/sound/soc/codecs/max98373-sdw.c
++++ b/sound/soc/codecs/max98373-sdw.c
+@@ -862,6 +862,16 @@ static int max98373_sdw_probe(struct sdw_slave *slave,
+ return max98373_init(slave, regmap);
+ }
+
++static int max98373_sdw_remove(struct sdw_slave *slave)
++{
++ struct max98373_priv *max98373 = dev_get_drvdata(&slave->dev);
++
++ if (max98373->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
++ return 0;
++}
++
+ #if defined(CONFIG_OF)
+ static const struct of_device_id max98373_of_match[] = {
+ { .compatible = "maxim,max98373", },
+@@ -893,7 +903,7 @@ static struct sdw_driver max98373_sdw_driver = {
+ .pm = &max98373_pm,
+ },
+ .probe = max98373_sdw_probe,
+- .remove = NULL,
++ .remove = max98373_sdw_remove,
+ .ops = &max98373_slave_ops,
+ .id_table = max98373_id,
+ };
+diff --git a/sound/soc/codecs/rt1308-sdw.c b/sound/soc/codecs/rt1308-sdw.c
+index 1ef836a68a56f..e42a63ee07f4d 100644
+--- a/sound/soc/codecs/rt1308-sdw.c
++++ b/sound/soc/codecs/rt1308-sdw.c
+@@ -690,6 +690,16 @@ static int rt1308_sdw_probe(struct sdw_slave *slave,
+ return 0;
+ }
+
++static int rt1308_sdw_remove(struct sdw_slave *slave)
++{
++ struct rt1308_sdw_priv *rt1308 = dev_get_drvdata(&slave->dev);
++
++ if (rt1308->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
++ return 0;
++}
++
+ static const struct sdw_device_id rt1308_id[] = {
+ SDW_SLAVE_ENTRY_EXT(0x025d, 0x1308, 0x2, 0, 0),
+ {},
+@@ -749,6 +759,7 @@ static struct sdw_driver rt1308_sdw_driver = {
+ .pm = &rt1308_pm,
+ },
+ .probe = rt1308_sdw_probe,
++ .remove = rt1308_sdw_remove,
+ .ops = &rt1308_slave_ops,
+ .id_table = rt1308_id,
+ };
+diff --git a/sound/soc/codecs/rt1316-sdw.c b/sound/soc/codecs/rt1316-sdw.c
+index c66d7b20cb4dd..1e04aa8ab1666 100644
+--- a/sound/soc/codecs/rt1316-sdw.c
++++ b/sound/soc/codecs/rt1316-sdw.c
+@@ -675,6 +675,16 @@ static int rt1316_sdw_probe(struct sdw_slave *slave,
+ return rt1316_sdw_init(&slave->dev, regmap, slave);
+ }
+
++static int rt1316_sdw_remove(struct sdw_slave *slave)
++{
++ struct rt1316_sdw_priv *rt1316 = dev_get_drvdata(&slave->dev);
++
++ if (rt1316->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
++ return 0;
++}
++
+ static const struct sdw_device_id rt1316_id[] = {
+ SDW_SLAVE_ENTRY_EXT(0x025d, 0x1316, 0x3, 0x1, 0),
+ {},
+@@ -734,6 +744,7 @@ static struct sdw_driver rt1316_sdw_driver = {
+ .pm = &rt1316_pm,
+ },
+ .probe = rt1316_sdw_probe,
++ .remove = rt1316_sdw_remove,
+ .ops = &rt1316_slave_ops,
+ .id_table = rt1316_id,
+ };
+diff --git a/sound/soc/codecs/rt5682-sdw.c b/sound/soc/codecs/rt5682-sdw.c
+index 248257a2e4e0f..f04e18c32489d 100644
+--- a/sound/soc/codecs/rt5682-sdw.c
++++ b/sound/soc/codecs/rt5682-sdw.c
+@@ -719,9 +719,12 @@ static int rt5682_sdw_remove(struct sdw_slave *slave)
+ {
+ struct rt5682_priv *rt5682 = dev_get_drvdata(&slave->dev);
+
+- if (rt5682 && rt5682->hw_init)
++ if (rt5682->hw_init)
+ cancel_delayed_work_sync(&rt5682->jack_detect_work);
+
++ if (rt5682->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
+ return 0;
+ }
+
+diff --git a/sound/soc/codecs/rt700-sdw.c b/sound/soc/codecs/rt700-sdw.c
+index bda5948996642..f7439e40ca8b5 100644
+--- a/sound/soc/codecs/rt700-sdw.c
++++ b/sound/soc/codecs/rt700-sdw.c
+@@ -13,6 +13,7 @@
+ #include <linux/soundwire/sdw_type.h>
+ #include <linux/soundwire/sdw_registers.h>
+ #include <linux/module.h>
++#include <linux/pm_runtime.h>
+ #include <linux/regmap.h>
+ #include <sound/soc.h>
+ #include "rt700.h"
+@@ -463,11 +464,14 @@ static int rt700_sdw_remove(struct sdw_slave *slave)
+ {
+ struct rt700_priv *rt700 = dev_get_drvdata(&slave->dev);
+
+- if (rt700 && rt700->hw_init) {
++ if (rt700->hw_init) {
+ cancel_delayed_work_sync(&rt700->jack_detect_work);
+ cancel_delayed_work_sync(&rt700->jack_btn_check_work);
+ }
+
++ if (rt700->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
+ return 0;
+ }
+
+diff --git a/sound/soc/codecs/rt700.c b/sound/soc/codecs/rt700.c
+index 360d61a36c354..3de3406d653e4 100644
+--- a/sound/soc/codecs/rt700.c
++++ b/sound/soc/codecs/rt700.c
+@@ -162,7 +162,7 @@ static void rt700_jack_detect_handler(struct work_struct *work)
+ if (!rt700->hs_jack)
+ return;
+
+- if (!rt700->component->card->instantiated)
++ if (!rt700->component->card || !rt700->component->card->instantiated)
+ return;
+
+ reg = RT700_VERB_GET_PIN_SENSE | RT700_HP_OUT;
+@@ -1124,6 +1124,11 @@ int rt700_init(struct device *dev, struct regmap *sdw_regmap,
+
+ mutex_init(&rt700->disable_irq_lock);
+
++ INIT_DELAYED_WORK(&rt700->jack_detect_work,
++ rt700_jack_detect_handler);
++ INIT_DELAYED_WORK(&rt700->jack_btn_check_work,
++ rt700_btn_check_handler);
++
+ /*
+ * Mark hw_init to false
+ * HW init will be performed when device reports present
+@@ -1218,13 +1223,6 @@ int rt700_io_init(struct device *dev, struct sdw_slave *slave)
+ /* Finish Initial Settings, set power to D3 */
+ regmap_write(rt700->regmap, RT700_SET_AUDIO_POWER_STATE, AC_PWRST_D3);
+
+- if (!rt700->first_hw_init) {
+- INIT_DELAYED_WORK(&rt700->jack_detect_work,
+- rt700_jack_detect_handler);
+- INIT_DELAYED_WORK(&rt700->jack_btn_check_work,
+- rt700_btn_check_handler);
+- }
+-
+ /*
+ * if set_jack callback occurred early than io_init,
+ * we set up the jack detection function now
+diff --git a/sound/soc/codecs/rt711-sdca-sdw.c b/sound/soc/codecs/rt711-sdca-sdw.c
+index aaf5af153d3fe..a085b2f530aa1 100644
+--- a/sound/soc/codecs/rt711-sdca-sdw.c
++++ b/sound/soc/codecs/rt711-sdca-sdw.c
+@@ -11,6 +11,7 @@
+ #include <linux/mod_devicetable.h>
+ #include <linux/soundwire/sdw_registers.h>
+ #include <linux/module.h>
++#include <linux/pm_runtime.h>
+
+ #include "rt711-sdca.h"
+ #include "rt711-sdca-sdw.h"
+@@ -364,11 +365,17 @@ static int rt711_sdca_sdw_remove(struct sdw_slave *slave)
+ {
+ struct rt711_sdca_priv *rt711 = dev_get_drvdata(&slave->dev);
+
+- if (rt711 && rt711->hw_init) {
++ if (rt711->hw_init) {
+ cancel_delayed_work_sync(&rt711->jack_detect_work);
+ cancel_delayed_work_sync(&rt711->jack_btn_check_work);
+ }
+
++ if (rt711->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
++ mutex_destroy(&rt711->calibrate_mutex);
++ mutex_destroy(&rt711->disable_irq_lock);
++
+ return 0;
+ }
+
+diff --git a/sound/soc/codecs/rt711-sdca.c b/sound/soc/codecs/rt711-sdca.c
+index 9d59e653b941b..5ad53bbc85284 100644
+--- a/sound/soc/codecs/rt711-sdca.c
++++ b/sound/soc/codecs/rt711-sdca.c
+@@ -34,7 +34,7 @@ static int rt711_sdca_index_write(struct rt711_sdca_priv *rt711,
+
+ ret = regmap_write(regmap, addr, value);
+ if (ret < 0)
+- dev_err(rt711->component->dev,
++ dev_err(&rt711->slave->dev,
+ "Failed to set private value: %06x <= %04x ret=%d\n",
+ addr, value, ret);
+
+@@ -50,7 +50,7 @@ static int rt711_sdca_index_read(struct rt711_sdca_priv *rt711,
+
+ ret = regmap_read(regmap, addr, value);
+ if (ret < 0)
+- dev_err(rt711->component->dev,
++ dev_err(&rt711->slave->dev,
+ "Failed to get private value: %06x => %04x ret=%d\n",
+ addr, *value, ret);
+
+@@ -294,7 +294,7 @@ static void rt711_sdca_jack_detect_handler(struct work_struct *work)
+ if (!rt711->hs_jack)
+ return;
+
+- if (!rt711->component->card->instantiated)
++ if (!rt711->component->card || !rt711->component->card->instantiated)
+ return;
+
+ /* SDW_SCP_SDCA_INT_SDCA_0 is used for jack detection */
+@@ -1414,8 +1414,12 @@ int rt711_sdca_init(struct device *dev, struct regmap *regmap,
+ rt711->regmap = regmap;
+ rt711->mbq_regmap = mbq_regmap;
+
++ mutex_init(&rt711->calibrate_mutex);
+ mutex_init(&rt711->disable_irq_lock);
+
++ INIT_DELAYED_WORK(&rt711->jack_detect_work, rt711_sdca_jack_detect_handler);
++ INIT_DELAYED_WORK(&rt711->jack_btn_check_work, rt711_sdca_btn_check_handler);
++
+ /*
+ * Mark hw_init to false
+ * HW init will be performed when device reports present
+@@ -1547,14 +1551,6 @@ int rt711_sdca_io_init(struct device *dev, struct sdw_slave *slave)
+ rt711_sdca_index_update_bits(rt711, RT711_VENDOR_HDA_CTL,
+ RT711_PUSH_BTN_INT_CTL0, 0x20, 0x00);
+
+- if (!rt711->first_hw_init) {
+- INIT_DELAYED_WORK(&rt711->jack_detect_work,
+- rt711_sdca_jack_detect_handler);
+- INIT_DELAYED_WORK(&rt711->jack_btn_check_work,
+- rt711_sdca_btn_check_handler);
+- mutex_init(&rt711->calibrate_mutex);
+- }
+-
+ /* calibration */
+ ret = rt711_sdca_calibration(rt711);
+ if (ret < 0)
+diff --git a/sound/soc/codecs/rt711-sdw.c b/sound/soc/codecs/rt711-sdw.c
+index bda2cc9439c98..4fe68bcf2a7c2 100644
+--- a/sound/soc/codecs/rt711-sdw.c
++++ b/sound/soc/codecs/rt711-sdw.c
+@@ -13,6 +13,7 @@
+ #include <linux/soundwire/sdw_type.h>
+ #include <linux/soundwire/sdw_registers.h>
+ #include <linux/module.h>
++#include <linux/pm_runtime.h>
+ #include <linux/regmap.h>
+ #include <sound/soc.h>
+ #include "rt711.h"
+@@ -464,12 +465,18 @@ static int rt711_sdw_remove(struct sdw_slave *slave)
+ {
+ struct rt711_priv *rt711 = dev_get_drvdata(&slave->dev);
+
+- if (rt711 && rt711->hw_init) {
++ if (rt711->hw_init) {
+ cancel_delayed_work_sync(&rt711->jack_detect_work);
+ cancel_delayed_work_sync(&rt711->jack_btn_check_work);
+ cancel_work_sync(&rt711->calibration_work);
+ }
+
++ if (rt711->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
++ mutex_destroy(&rt711->calibrate_mutex);
++ mutex_destroy(&rt711->disable_irq_lock);
++
+ return 0;
+ }
+
+diff --git a/sound/soc/codecs/rt711.c b/sound/soc/codecs/rt711.c
+index 9958067e80f11..9df800abfc2d8 100644
+--- a/sound/soc/codecs/rt711.c
++++ b/sound/soc/codecs/rt711.c
+@@ -242,7 +242,7 @@ static void rt711_jack_detect_handler(struct work_struct *work)
+ if (!rt711->hs_jack)
+ return;
+
+- if (!rt711->component->card->instantiated)
++ if (!rt711->component->card || !rt711->component->card->instantiated)
+ return;
+
+ if (pm_runtime_status_suspended(rt711->slave->dev.parent)) {
+@@ -1206,8 +1206,13 @@ int rt711_init(struct device *dev, struct regmap *sdw_regmap,
+ rt711->sdw_regmap = sdw_regmap;
+ rt711->regmap = regmap;
+
++ mutex_init(&rt711->calibrate_mutex);
+ mutex_init(&rt711->disable_irq_lock);
+
++ INIT_DELAYED_WORK(&rt711->jack_detect_work, rt711_jack_detect_handler);
++ INIT_DELAYED_WORK(&rt711->jack_btn_check_work, rt711_btn_check_handler);
++ INIT_WORK(&rt711->calibration_work, rt711_calibration_work);
++
+ /*
+ * Mark hw_init to false
+ * HW init will be performed when device reports present
+@@ -1315,15 +1320,8 @@ int rt711_io_init(struct device *dev, struct sdw_slave *slave)
+
+ if (rt711->first_hw_init)
+ rt711_calibration(rt711);
+- else {
+- INIT_DELAYED_WORK(&rt711->jack_detect_work,
+- rt711_jack_detect_handler);
+- INIT_DELAYED_WORK(&rt711->jack_btn_check_work,
+- rt711_btn_check_handler);
+- mutex_init(&rt711->calibrate_mutex);
+- INIT_WORK(&rt711->calibration_work, rt711_calibration_work);
++ else
+ schedule_work(&rt711->calibration_work);
+- }
+
+ /*
+ * if set_jack callback occurred early than io_init,
+diff --git a/sound/soc/codecs/rt715-sdca-sdw.c b/sound/soc/codecs/rt715-sdca-sdw.c
+index a5c673f43d824..0f4354eafef25 100644
+--- a/sound/soc/codecs/rt715-sdca-sdw.c
++++ b/sound/soc/codecs/rt715-sdca-sdw.c
+@@ -13,6 +13,7 @@
+ #include <linux/soundwire/sdw_type.h>
+ #include <linux/soundwire/sdw_registers.h>
+ #include <linux/module.h>
++#include <linux/pm_runtime.h>
+ #include <linux/regmap.h>
+ #include <sound/soc.h>
+ #include "rt715-sdca.h"
+@@ -195,6 +196,16 @@ static int rt715_sdca_sdw_probe(struct sdw_slave *slave,
+ return rt715_sdca_init(&slave->dev, mbq_regmap, regmap, slave);
+ }
+
++static int rt715_sdca_sdw_remove(struct sdw_slave *slave)
++{
++ struct rt715_sdca_priv *rt715 = dev_get_drvdata(&slave->dev);
++
++ if (rt715->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
++ return 0;
++}
++
+ static const struct sdw_device_id rt715_sdca_id[] = {
+ SDW_SLAVE_ENTRY_EXT(0x025d, 0x715, 0x3, 0x1, 0),
+ SDW_SLAVE_ENTRY_EXT(0x025d, 0x714, 0x3, 0x1, 0),
+@@ -269,6 +280,7 @@ static struct sdw_driver rt715_sdw_driver = {
+ .pm = &rt715_pm,
+ },
+ .probe = rt715_sdca_sdw_probe,
++ .remove = rt715_sdca_sdw_remove,
+ .ops = &rt715_sdca_slave_ops,
+ .id_table = rt715_sdca_id,
+ };
+diff --git a/sound/soc/codecs/rt715-sdw.c b/sound/soc/codecs/rt715-sdw.c
+index a7b21b03c08bb..b047bf87a100c 100644
+--- a/sound/soc/codecs/rt715-sdw.c
++++ b/sound/soc/codecs/rt715-sdw.c
+@@ -14,6 +14,7 @@
+ #include <linux/soundwire/sdw_type.h>
+ #include <linux/soundwire/sdw_registers.h>
+ #include <linux/module.h>
++#include <linux/pm_runtime.h>
+ #include <linux/of.h>
+ #include <linux/regmap.h>
+ #include <sound/soc.h>
+@@ -514,6 +515,16 @@ static int rt715_sdw_probe(struct sdw_slave *slave,
+ return 0;
+ }
+
++static int rt715_sdw_remove(struct sdw_slave *slave)
++{
++ struct rt715_priv *rt715 = dev_get_drvdata(&slave->dev);
++
++ if (rt715->first_hw_init)
++ pm_runtime_disable(&slave->dev);
++
++ return 0;
++}
++
+ static const struct sdw_device_id rt715_id[] = {
+ SDW_SLAVE_ENTRY_EXT(0x025d, 0x714, 0x2, 0, 0),
+ SDW_SLAVE_ENTRY_EXT(0x025d, 0x715, 0x2, 0, 0),
+@@ -575,6 +586,7 @@ static struct sdw_driver rt715_sdw_driver = {
+ .pm = &rt715_pm,
+ },
+ .probe = rt715_sdw_probe,
++ .remove = rt715_sdw_remove,
+ .ops = &rt715_slave_ops,
+ .id_table = rt715_id,
+ };
+diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c
+index 8eebf27d0ea24..281785a9301bc 100644
+--- a/sound/soc/codecs/sgtl5000.c
++++ b/sound/soc/codecs/sgtl5000.c
+@@ -1796,6 +1796,9 @@ static int sgtl5000_i2c_remove(struct i2c_client *client)
+ {
+ struct sgtl5000_priv *sgtl5000 = i2c_get_clientdata(client);
+
++ regmap_write(sgtl5000->regmap, SGTL5000_CHIP_DIG_POWER, SGTL5000_DIG_POWER_DEFAULT);
++ regmap_write(sgtl5000->regmap, SGTL5000_CHIP_ANA_POWER, SGTL5000_ANA_POWER_DEFAULT);
++
+ clk_disable_unprepare(sgtl5000->mclk);
+ regulator_bulk_disable(sgtl5000->num_supplies, sgtl5000->supplies);
+ regulator_bulk_free(sgtl5000->num_supplies, sgtl5000->supplies);
+@@ -1803,6 +1806,11 @@ static int sgtl5000_i2c_remove(struct i2c_client *client)
+ return 0;
+ }
+
++static void sgtl5000_i2c_shutdown(struct i2c_client *client)
++{
++ sgtl5000_i2c_remove(client);
++}
++
+ static const struct i2c_device_id sgtl5000_id[] = {
+ {"sgtl5000", 0},
+ {},
+@@ -1823,6 +1831,7 @@ static struct i2c_driver sgtl5000_i2c_driver = {
+ },
+ .probe = sgtl5000_i2c_probe,
+ .remove = sgtl5000_i2c_remove,
++ .shutdown = sgtl5000_i2c_shutdown,
+ .id_table = sgtl5000_id,
+ };
+
+diff --git a/sound/soc/codecs/sgtl5000.h b/sound/soc/codecs/sgtl5000.h
+index 56ec5863f2507..3a808c762299e 100644
+--- a/sound/soc/codecs/sgtl5000.h
++++ b/sound/soc/codecs/sgtl5000.h
+@@ -80,6 +80,7 @@
+ /*
+ * SGTL5000_CHIP_DIG_POWER
+ */
++#define SGTL5000_DIG_POWER_DEFAULT 0x0000
+ #define SGTL5000_ADC_EN 0x0040
+ #define SGTL5000_DAC_EN 0x0020
+ #define SGTL5000_DAP_POWERUP 0x0010
+diff --git a/sound/soc/codecs/tas2764.c b/sound/soc/codecs/tas2764.c
+index 9265af41c235d..ec13ba01e5223 100644
+--- a/sound/soc/codecs/tas2764.c
++++ b/sound/soc/codecs/tas2764.c
+@@ -42,10 +42,12 @@ static void tas2764_reset(struct tas2764_priv *tas2764)
+ gpiod_set_value_cansleep(tas2764->reset_gpio, 0);
+ msleep(20);
+ gpiod_set_value_cansleep(tas2764->reset_gpio, 1);
++ usleep_range(1000, 2000);
+ }
+
+ snd_soc_component_write(tas2764->component, TAS2764_SW_RST,
+ TAS2764_RST);
++ usleep_range(1000, 2000);
+ }
+
+ static int tas2764_set_bias_level(struct snd_soc_component *component,
+@@ -107,8 +109,10 @@ static int tas2764_codec_resume(struct snd_soc_component *component)
+ struct tas2764_priv *tas2764 = snd_soc_component_get_drvdata(component);
+ int ret;
+
+- if (tas2764->sdz_gpio)
++ if (tas2764->sdz_gpio) {
+ gpiod_set_value_cansleep(tas2764->sdz_gpio, 1);
++ usleep_range(1000, 2000);
++ }
+
+ ret = snd_soc_component_update_bits(component, TAS2764_PWR_CTRL,
+ TAS2764_PWR_CTRL_MASK,
+@@ -131,7 +135,8 @@ static const char * const tas2764_ASI1_src[] = {
+ };
+
+ static SOC_ENUM_SINGLE_DECL(
+- tas2764_ASI1_src_enum, TAS2764_TDM_CFG2, 4, tas2764_ASI1_src);
++ tas2764_ASI1_src_enum, TAS2764_TDM_CFG2, TAS2764_TDM_CFG2_SCFG_SHIFT,
++ tas2764_ASI1_src);
+
+ static const struct snd_kcontrol_new tas2764_asi1_mux =
+ SOC_DAPM_ENUM("ASI1 Source", tas2764_ASI1_src_enum);
+@@ -329,20 +334,22 @@ static int tas2764_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
+ {
+ struct snd_soc_component *component = dai->component;
+ struct tas2764_priv *tas2764 = snd_soc_component_get_drvdata(component);
+- u8 tdm_rx_start_slot = 0, asi_cfg_1 = 0;
+- int iface;
++ u8 tdm_rx_start_slot = 0, asi_cfg_0 = 0, asi_cfg_1 = 0;
+ int ret;
+
+ switch (fmt & SND_SOC_DAIFMT_INV_MASK) {
++ case SND_SOC_DAIFMT_NB_IF:
++ asi_cfg_0 ^= TAS2764_TDM_CFG0_FRAME_START;
++ fallthrough;
+ case SND_SOC_DAIFMT_NB_NF:
+ asi_cfg_1 = TAS2764_TDM_CFG1_RX_RISING;
+ break;
++ case SND_SOC_DAIFMT_IB_IF:
++ asi_cfg_0 ^= TAS2764_TDM_CFG0_FRAME_START;
++ fallthrough;
+ case SND_SOC_DAIFMT_IB_NF:
+ asi_cfg_1 = TAS2764_TDM_CFG1_RX_FALLING;
+ break;
+- default:
+- dev_err(tas2764->dev, "ASI format Inverse is not found\n");
+- return -EINVAL;
+ }
+
+ ret = snd_soc_component_update_bits(component, TAS2764_TDM_CFG1,
+@@ -353,13 +360,13 @@ static int tas2764_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
+
+ switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
+ case SND_SOC_DAIFMT_I2S:
++ asi_cfg_0 ^= TAS2764_TDM_CFG0_FRAME_START;
++ fallthrough;
+ case SND_SOC_DAIFMT_DSP_A:
+- iface = TAS2764_TDM_CFG2_SCFG_I2S;
+ tdm_rx_start_slot = 1;
+ break;
+ case SND_SOC_DAIFMT_DSP_B:
+ case SND_SOC_DAIFMT_LEFT_J:
+- iface = TAS2764_TDM_CFG2_SCFG_LEFT_J;
+ tdm_rx_start_slot = 0;
+ break;
+ default:
+@@ -368,14 +375,15 @@ static int tas2764_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
+ return -EINVAL;
+ }
+
+- ret = snd_soc_component_update_bits(component, TAS2764_TDM_CFG1,
+- TAS2764_TDM_CFG1_MASK,
+- (tdm_rx_start_slot << TAS2764_TDM_CFG1_51_SHIFT));
++ ret = snd_soc_component_update_bits(component, TAS2764_TDM_CFG0,
++ TAS2764_TDM_CFG0_FRAME_START,
++ asi_cfg_0);
+ if (ret < 0)
+ return ret;
+
+- ret = snd_soc_component_update_bits(component, TAS2764_TDM_CFG2,
+- TAS2764_TDM_CFG2_SCFG_MASK, iface);
++ ret = snd_soc_component_update_bits(component, TAS2764_TDM_CFG1,
++ TAS2764_TDM_CFG1_MASK,
++ (tdm_rx_start_slot << TAS2764_TDM_CFG1_51_SHIFT));
+ if (ret < 0)
+ return ret;
+
+@@ -501,8 +509,10 @@ static int tas2764_codec_probe(struct snd_soc_component *component)
+
+ tas2764->component = component;
+
+- if (tas2764->sdz_gpio)
++ if (tas2764->sdz_gpio) {
+ gpiod_set_value_cansleep(tas2764->sdz_gpio, 1);
++ usleep_range(1000, 2000);
++ }
+
+ tas2764_reset(tas2764);
+
+@@ -526,12 +536,12 @@ static int tas2764_codec_probe(struct snd_soc_component *component)
+ }
+
+ static DECLARE_TLV_DB_SCALE(tas2764_digital_tlv, 1100, 50, 0);
+-static DECLARE_TLV_DB_SCALE(tas2764_playback_volume, -10000, 50, 0);
++static DECLARE_TLV_DB_SCALE(tas2764_playback_volume, -10050, 50, 1);
+
+ static const struct snd_kcontrol_new tas2764_snd_controls[] = {
+ SOC_SINGLE_TLV("Speaker Volume", TAS2764_DVC, 0,
+ TAS2764_DVC_MAX, 1, tas2764_playback_volume),
+- SOC_SINGLE_TLV("Amp Gain Volume", TAS2764_CHNL_0, 0, 0x14, 0,
++ SOC_SINGLE_TLV("Amp Gain Volume", TAS2764_CHNL_0, 1, 0x14, 0,
+ tas2764_digital_tlv),
+ };
+
+@@ -556,7 +566,7 @@ static const struct reg_default tas2764_reg_defaults[] = {
+ { TAS2764_SW_RST, 0x00 },
+ { TAS2764_PWR_CTRL, 0x1a },
+ { TAS2764_DVC, 0x00 },
+- { TAS2764_CHNL_0, 0x00 },
++ { TAS2764_CHNL_0, 0x28 },
+ { TAS2764_TDM_CFG0, 0x09 },
+ { TAS2764_TDM_CFG1, 0x02 },
+ { TAS2764_TDM_CFG2, 0x0a },
+diff --git a/sound/soc/codecs/tas2764.h b/sound/soc/codecs/tas2764.h
+index 67d6fd903c42c..f015f22a083b5 100644
+--- a/sound/soc/codecs/tas2764.h
++++ b/sound/soc/codecs/tas2764.h
+@@ -47,6 +47,7 @@
+ #define TAS2764_TDM_CFG0_MASK GENMASK(3, 1)
+ #define TAS2764_TDM_CFG0_44_1_48KHZ BIT(3)
+ #define TAS2764_TDM_CFG0_88_2_96KHZ (BIT(3) | BIT(1))
++#define TAS2764_TDM_CFG0_FRAME_START BIT(0)
+
+ /* TDM Configuration Reg1 */
+ #define TAS2764_TDM_CFG1 TAS2764_REG(0X0, 0x09)
+@@ -66,10 +67,7 @@
+ #define TAS2764_TDM_CFG2_RXS_16BITS 0x0
+ #define TAS2764_TDM_CFG2_RXS_24BITS BIT(0)
+ #define TAS2764_TDM_CFG2_RXS_32BITS BIT(1)
+-#define TAS2764_TDM_CFG2_SCFG_MASK GENMASK(5, 4)
+-#define TAS2764_TDM_CFG2_SCFG_I2S 0x0
+-#define TAS2764_TDM_CFG2_SCFG_LEFT_J BIT(4)
+-#define TAS2764_TDM_CFG2_SCFG_RIGHT_J BIT(5)
++#define TAS2764_TDM_CFG2_SCFG_SHIFT 4
+
+ /* TDM Configuration Reg3 */
+ #define TAS2764_TDM_CFG3 TAS2764_REG(0X0, 0x0c)
+diff --git a/sound/soc/codecs/wcd9335.c b/sound/soc/codecs/wcd9335.c
+index 1e60db4056ada..aa685980a97b6 100644
+--- a/sound/soc/codecs/wcd9335.c
++++ b/sound/soc/codecs/wcd9335.c
+@@ -1287,11 +1287,17 @@ static int slim_rx_mux_put(struct snd_kcontrol *kc,
+ struct snd_soc_dapm_update *update = NULL;
+ u32 port_id = w->shift;
+
++ if (wcd->rx_port_value[port_id] == ucontrol->value.enumerated.item[0])
++ return 0;
++
+ wcd->rx_port_value[port_id] = ucontrol->value.enumerated.item[0];
+
++ /* Remove channel from any list it's in before adding it to a new one */
++ list_del_init(&wcd->rx_chs[port_id].list);
++
+ switch (wcd->rx_port_value[port_id]) {
+ case 0:
+- list_del_init(&wcd->rx_chs[port_id].list);
++ /* Channel already removed from lists. Nothing to do here */
+ break;
+ case 1:
+ list_add_tail(&wcd->rx_chs[port_id].list,
+diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c
+index 898b2887fa637..088cfda767cc8 100644
+--- a/sound/soc/codecs/wcd938x.c
++++ b/sound/soc/codecs/wcd938x.c
+@@ -2519,6 +2519,9 @@ static int wcd938x_tx_mode_put(struct snd_kcontrol *kcontrol,
+ struct soc_enum *e = (struct soc_enum *)kcontrol->private_value;
+ int path = e->shift_l;
+
++ if (wcd938x->tx_mode[path] == ucontrol->value.enumerated.item[0])
++ return 0;
++
+ wcd938x->tx_mode[path] = ucontrol->value.enumerated.item[0];
+
+ return 1;
+@@ -2541,6 +2544,9 @@ static int wcd938x_rx_hph_mode_put(struct snd_kcontrol *kcontrol,
+ struct snd_soc_component *component = snd_soc_kcontrol_component(kcontrol);
+ struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
+
++ if (wcd938x->hph_mode == ucontrol->value.enumerated.item[0])
++ return 0;
++
+ wcd938x->hph_mode = ucontrol->value.enumerated.item[0];
+
+ return 1;
+@@ -2632,6 +2638,9 @@ static int wcd938x_ldoh_put(struct snd_kcontrol *kcontrol,
+ struct snd_soc_component *component = snd_soc_kcontrol_component(kcontrol);
+ struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
+
++ if (wcd938x->ldoh == ucontrol->value.integer.value[0])
++ return 0;
++
+ wcd938x->ldoh = ucontrol->value.integer.value[0];
+
+ return 1;
+@@ -2654,6 +2663,9 @@ static int wcd938x_bcs_put(struct snd_kcontrol *kcontrol,
+ struct snd_soc_component *component = snd_soc_kcontrol_component(kcontrol);
+ struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
+
++ if (wcd938x->bcs_dis == ucontrol->value.integer.value[0])
++ return 0;
++
+ wcd938x->bcs_dis = ucontrol->value.integer.value[0];
+
+ return 1;
+diff --git a/sound/soc/codecs/wm5110.c b/sound/soc/codecs/wm5110.c
+index 4973ba1ed7791..4ab7a672f8de8 100644
+--- a/sound/soc/codecs/wm5110.c
++++ b/sound/soc/codecs/wm5110.c
+@@ -413,6 +413,7 @@ static int wm5110_put_dre(struct snd_kcontrol *kcontrol,
+ unsigned int rnew = (!!ucontrol->value.integer.value[1]) << mc->rshift;
+ unsigned int lold, rold;
+ unsigned int lena, rena;
++ bool change = false;
+ int ret;
+
+ snd_soc_dapm_mutex_lock(dapm);
+@@ -440,8 +441,8 @@ static int wm5110_put_dre(struct snd_kcontrol *kcontrol,
+ goto err;
+ }
+
+- ret = regmap_update_bits(arizona->regmap, ARIZONA_DRE_ENABLE,
+- mask, lnew | rnew);
++ ret = regmap_update_bits_check(arizona->regmap, ARIZONA_DRE_ENABLE,
++ mask, lnew | rnew, &change);
+ if (ret) {
+ dev_err(arizona->dev, "Failed to set DRE: %d\n", ret);
+ goto err;
+@@ -454,6 +455,9 @@ static int wm5110_put_dre(struct snd_kcontrol *kcontrol,
+ if (!rnew && rold)
+ wm5110_clear_pga_volume(arizona, mc->rshift);
+
++ if (change)
++ ret = 1;
++
+ err:
+ snd_soc_dapm_mutex_unlock(dapm);
+
+diff --git a/sound/soc/codecs/wm_adsp.c b/sound/soc/codecs/wm_adsp.c
+index 9cfd4f18493fb..d3ecff3bdef29 100644
+--- a/sound/soc/codecs/wm_adsp.c
++++ b/sound/soc/codecs/wm_adsp.c
+@@ -997,7 +997,7 @@ int wm_adsp2_preloader_put(struct snd_kcontrol *kcontrol,
+ snd_soc_dapm_sync(dapm);
+ }
+
+- return 0;
++ return 1;
+ }
+ EXPORT_SYMBOL_GPL(wm_adsp2_preloader_put);
+
+diff --git a/sound/soc/intel/boards/bytcr_wm5102.c b/sound/soc/intel/boards/bytcr_wm5102.c
+index 8d8e96e3cd2df..f6d0cef1b28c3 100644
+--- a/sound/soc/intel/boards/bytcr_wm5102.c
++++ b/sound/soc/intel/boards/bytcr_wm5102.c
+@@ -421,8 +421,17 @@ static int snd_byt_wm5102_mc_probe(struct platform_device *pdev)
+ priv->spkvdd_en_gpio = gpiod_get(codec_dev, "wlf,spkvdd-ena", GPIOD_OUT_LOW);
+ put_device(codec_dev);
+
+- if (IS_ERR(priv->spkvdd_en_gpio))
+- return dev_err_probe(dev, PTR_ERR(priv->spkvdd_en_gpio), "getting spkvdd-GPIO\n");
++ if (IS_ERR(priv->spkvdd_en_gpio)) {
++ ret = PTR_ERR(priv->spkvdd_en_gpio);
++ /*
++ * The spkvdd gpio-lookup is registered by: drivers/mfd/arizona-spi.c,
++ * so -ENOENT means that arizona-spi hasn't probed yet.
++ */
++ if (ret == -ENOENT)
++ ret = -EPROBE_DEFER;
++
++ return dev_err_probe(dev, ret, "getting spkvdd-GPIO\n");
++ }
+
+ /* override platform name, if required */
+ byt_wm5102_card.dev = dev;
+diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c
+index 1f00679b42409..ad826ad82d51a 100644
+--- a/sound/soc/intel/boards/sof_sdw.c
++++ b/sound/soc/intel/boards/sof_sdw.c
+@@ -1398,6 +1398,33 @@ static struct snd_soc_card card_sof_sdw = {
+ .late_probe = sof_sdw_card_late_probe,
+ };
+
++static void mc_dailink_exit_loop(struct snd_soc_card *card)
++{
++ struct snd_soc_dai_link *link;
++ int ret;
++ int i, j;
++
++ for (i = 0; i < ARRAY_SIZE(codec_info_list); i++) {
++ if (!codec_info_list[i].exit)
++ continue;
++ /*
++ * We don't need to call .exit function if there is no matched
++ * dai link found.
++ */
++ for_each_card_prelinks(card, j, link) {
++ if (!strcmp(link->codecs[0].dai_name,
++ codec_info_list[i].dai_name)) {
++ ret = codec_info_list[i].exit(card, link);
++ if (ret)
++ dev_warn(card->dev,
++ "codec exit failed %d\n",
++ ret);
++ break;
++ }
++ }
++ }
++}
++
+ static int mc_probe(struct platform_device *pdev)
+ {
+ struct snd_soc_card *card = &card_sof_sdw;
+@@ -1462,6 +1489,7 @@ static int mc_probe(struct platform_device *pdev)
+ ret = devm_snd_soc_register_card(&pdev->dev, card);
+ if (ret) {
+ dev_err(card->dev, "snd_soc_register_card failed %d\n", ret);
++ mc_dailink_exit_loop(card);
+ return ret;
+ }
+
+@@ -1473,29 +1501,8 @@ static int mc_probe(struct platform_device *pdev)
+ static int mc_remove(struct platform_device *pdev)
+ {
+ struct snd_soc_card *card = platform_get_drvdata(pdev);
+- struct snd_soc_dai_link *link;
+- int ret;
+- int i, j;
+
+- for (i = 0; i < ARRAY_SIZE(codec_info_list); i++) {
+- if (!codec_info_list[i].exit)
+- continue;
+- /*
+- * We don't need to call .exit function if there is no matched
+- * dai link found.
+- */
+- for_each_card_prelinks(card, j, link) {
+- if (!strcmp(link->codecs[0].dai_name,
+- codec_info_list[i].dai_name)) {
+- ret = codec_info_list[i].exit(card, link);
+- if (ret)
+- dev_warn(&pdev->dev,
+- "codec exit failed %d\n",
+- ret);
+- break;
+- }
+- }
+- }
++ mc_dailink_exit_loop(card);
+
+ return 0;
+ }
+diff --git a/sound/soc/intel/skylake/skl-nhlt.c b/sound/soc/intel/skylake/skl-nhlt.c
+index 2439a574ac2fa..deb7b820325e7 100644
+--- a/sound/soc/intel/skylake/skl-nhlt.c
++++ b/sound/soc/intel/skylake/skl-nhlt.c
+@@ -99,7 +99,6 @@ static void skl_get_ssp_clks(struct skl_dev *skl, struct skl_ssp_clk *ssp_clks,
+ struct nhlt_fmt_cfg *fmt_cfg;
+ struct wav_fmt_ext *wav_fmt;
+ unsigned long rate;
+- bool present = false;
+ int rate_index = 0;
+ u16 channels, bps;
+ u8 clk_src;
+@@ -112,9 +111,12 @@ static void skl_get_ssp_clks(struct skl_dev *skl, struct skl_ssp_clk *ssp_clks,
+ if (fmt->fmt_count == 0)
+ return;
+
++ fmt_cfg = (struct nhlt_fmt_cfg *)fmt->fmt_config;
+ for (i = 0; i < fmt->fmt_count; i++) {
+- fmt_cfg = &fmt->fmt_config[i];
+- wav_fmt = &fmt_cfg->fmt_ext;
++ struct nhlt_fmt_cfg *saved_fmt_cfg = fmt_cfg;
++ bool present = false;
++
++ wav_fmt = &saved_fmt_cfg->fmt_ext;
+
+ channels = wav_fmt->fmt.channels;
+ bps = wav_fmt->fmt.bits_per_sample;
+@@ -132,12 +134,18 @@ static void skl_get_ssp_clks(struct skl_dev *skl, struct skl_ssp_clk *ssp_clks,
+ * derive the rate.
+ */
+ for (j = i; j < fmt->fmt_count; j++) {
+- fmt_cfg = &fmt->fmt_config[j];
+- wav_fmt = &fmt_cfg->fmt_ext;
++ struct nhlt_fmt_cfg *tmp_fmt_cfg = fmt_cfg;
++
++ wav_fmt = &tmp_fmt_cfg->fmt_ext;
+ if ((fs == wav_fmt->fmt.samples_per_sec) &&
+- (bps == wav_fmt->fmt.bits_per_sample))
++ (bps == wav_fmt->fmt.bits_per_sample)) {
+ channels = max_t(u16, channels,
+ wav_fmt->fmt.channels);
++ saved_fmt_cfg = tmp_fmt_cfg;
++ }
++ /* Move to the next nhlt_fmt_cfg */
++ tmp_fmt_cfg = (struct nhlt_fmt_cfg *)(tmp_fmt_cfg->config.caps +
++ tmp_fmt_cfg->config.size);
+ }
+
+ rate = channels * bps * fs;
+@@ -153,8 +161,11 @@ static void skl_get_ssp_clks(struct skl_dev *skl, struct skl_ssp_clk *ssp_clks,
+
+ /* Fill rate and parent for sclk/sclkfs */
+ if (!present) {
++ struct nhlt_fmt_cfg *first_fmt_cfg;
++
++ first_fmt_cfg = (struct nhlt_fmt_cfg *)fmt->fmt_config;
+ i2s_config_ext = (struct skl_i2s_config_blob_ext *)
+- fmt->fmt_config[0].config.caps;
++ first_fmt_cfg->config.caps;
+
+ /* MCLK Divider Source Select */
+ if (is_legacy_blob(i2s_config_ext->hdr.sig)) {
+@@ -168,6 +179,9 @@ static void skl_get_ssp_clks(struct skl_dev *skl, struct skl_ssp_clk *ssp_clks,
+
+ parent = skl_get_parent_clk(clk_src);
+
++ /* Move to the next nhlt_fmt_cfg */
++ fmt_cfg = (struct nhlt_fmt_cfg *)(fmt_cfg->config.caps +
++ fmt_cfg->config.size);
+ /*
+ * Do not copy the config data if there is no parent
+ * clock available for this clock source select
+@@ -176,9 +190,9 @@ static void skl_get_ssp_clks(struct skl_dev *skl, struct skl_ssp_clk *ssp_clks,
+ continue;
+
+ sclk[id].rate_cfg[rate_index].rate = rate;
+- sclk[id].rate_cfg[rate_index].config = fmt_cfg;
++ sclk[id].rate_cfg[rate_index].config = saved_fmt_cfg;
+ sclkfs[id].rate_cfg[rate_index].rate = rate;
+- sclkfs[id].rate_cfg[rate_index].config = fmt_cfg;
++ sclkfs[id].rate_cfg[rate_index].config = saved_fmt_cfg;
+ sclk[id].parent_name = parent->name;
+ sclkfs[id].parent_name = parent->name;
+
+@@ -192,13 +206,13 @@ static void skl_get_mclk(struct skl_dev *skl, struct skl_ssp_clk *mclk,
+ {
+ struct skl_i2s_config_blob_ext *i2s_config_ext;
+ struct skl_i2s_config_blob_legacy *i2s_config;
+- struct nhlt_specific_cfg *fmt_cfg;
++ struct nhlt_fmt_cfg *fmt_cfg;
+ struct skl_clk_parent_src *parent;
+ u32 clkdiv, div_ratio;
+ u8 clk_src;
+
+- fmt_cfg = &fmt->fmt_config[0].config;
+- i2s_config_ext = (struct skl_i2s_config_blob_ext *)fmt_cfg->caps;
++ fmt_cfg = (struct nhlt_fmt_cfg *)fmt->fmt_config;
++ i2s_config_ext = (struct skl_i2s_config_blob_ext *)fmt_cfg->config.caps;
+
+ /* MCLK Divider Source Select and divider */
+ if (is_legacy_blob(i2s_config_ext->hdr.sig)) {
+@@ -227,7 +241,7 @@ static void skl_get_mclk(struct skl_dev *skl, struct skl_ssp_clk *mclk,
+ return;
+
+ mclk[id].rate_cfg[0].rate = parent->rate/div_ratio;
+- mclk[id].rate_cfg[0].config = &fmt->fmt_config[0];
++ mclk[id].rate_cfg[0].config = fmt_cfg;
+ mclk[id].parent_name = parent->name;
+ }
+
+diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
+index 869c76506b669..a8e842e02cdc2 100644
+--- a/sound/soc/soc-dapm.c
++++ b/sound/soc/soc-dapm.c
+@@ -62,6 +62,8 @@ struct snd_soc_dapm_widget *
+ snd_soc_dapm_new_control_unlocked(struct snd_soc_dapm_context *dapm,
+ const struct snd_soc_dapm_widget *widget);
+
++static unsigned int soc_dapm_read(struct snd_soc_dapm_context *dapm, int reg);
++
+ /* dapm power sequences - make this per codec in the future */
+ static int dapm_up_seq[] = {
+ [snd_soc_dapm_pre] = 1,
+@@ -442,6 +444,9 @@ static int dapm_kcontrol_data_alloc(struct snd_soc_dapm_widget *widget,
+
+ snd_soc_dapm_add_path(widget->dapm, data->widget,
+ widget, NULL, NULL);
++ } else if (e->reg != SND_SOC_NOPM) {
++ data->value = soc_dapm_read(widget->dapm, e->reg) &
++ (e->mask << e->shift_l);
+ }
+ break;
+ default:
+diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
+index e693070f51fe8..d867f449d82db 100644
+--- a/sound/soc/soc-ops.c
++++ b/sound/soc/soc-ops.c
+@@ -526,7 +526,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
+ return -EINVAL;
+ if (mc->platform_max && tmp > mc->platform_max)
+ return -EINVAL;
+- if (tmp > mc->max - mc->min + 1)
++ if (tmp > mc->max - mc->min)
+ return -EINVAL;
+
+ if (invert)
+@@ -547,7 +547,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
+ return -EINVAL;
+ if (mc->platform_max && tmp > mc->platform_max)
+ return -EINVAL;
+- if (tmp > mc->max - mc->min + 1)
++ if (tmp > mc->max - mc->min)
+ return -EINVAL;
+
+ if (invert)
+diff --git a/sound/soc/sof/intel/hda-dsp.c b/sound/soc/sof/intel/hda-dsp.c
+index 8ddde60c56b3f..68a8074c956a7 100644
+--- a/sound/soc/sof/intel/hda-dsp.c
++++ b/sound/soc/sof/intel/hda-dsp.c
+@@ -181,12 +181,20 @@ int hda_dsp_core_run(struct snd_sof_dev *sdev, unsigned int core_mask)
+ * Power Management.
+ */
+
+-static int hda_dsp_core_power_up(struct snd_sof_dev *sdev, unsigned int core_mask)
++int hda_dsp_core_power_up(struct snd_sof_dev *sdev, unsigned int core_mask)
+ {
++ struct sof_intel_hda_dev *hda = sdev->pdata->hw_pdata;
++ const struct sof_intel_dsp_desc *chip = hda->desc;
+ unsigned int cpa;
+ u32 adspcs;
+ int ret;
+
++ /* restrict core_mask to host managed cores mask */
++ core_mask &= chip->host_managed_cores_mask;
++ /* return if core_mask is not valid */
++ if (!core_mask)
++ return 0;
++
+ /* update bits */
+ snd_sof_dsp_update_bits(sdev, HDA_DSP_BAR, HDA_DSP_REG_ADSPCS,
+ HDA_DSP_ADSPCS_SPA_MASK(core_mask),
+diff --git a/sound/soc/sof/intel/hda-loader.c b/sound/soc/sof/intel/hda-loader.c
+index 2ac5d9d0719bc..88d23924e1bf2 100644
+--- a/sound/soc/sof/intel/hda-loader.c
++++ b/sound/soc/sof/intel/hda-loader.c
+@@ -97,9 +97,9 @@ out_put:
+ }
+
+ /*
+- * first boot sequence has some extra steps. core 0 waits for power
+- * status on core 1, so power up core 1 also momentarily, keep it in
+- * reset/stall and then turn it off
++ * first boot sequence has some extra steps.
++ * power on all host managed cores and only unstall/run the boot core to boot the
++ * DSP then turn off all non boot cores (if any) is powered on.
+ */
+ static int cl_dsp_init(struct snd_sof_dev *sdev, int stream_tag)
+ {
+@@ -112,7 +112,7 @@ static int cl_dsp_init(struct snd_sof_dev *sdev, int stream_tag)
+ int ret;
+
+ /* step 1: power up corex */
+- ret = hda_dsp_enable_core(sdev, chip->host_managed_cores_mask);
++ ret = hda_dsp_core_power_up(sdev, chip->host_managed_cores_mask);
+ if (ret < 0) {
+ if (hda->boot_iteration == HDA_FW_BOOT_ATTEMPTS)
+ dev_err(sdev->dev, "error: dsp core 0/1 power up failed\n");
+@@ -127,7 +127,7 @@ static int cl_dsp_init(struct snd_sof_dev *sdev, int stream_tag)
+ ((stream_tag - 1) << 9)));
+
+ /* step 3: unset core 0 reset state & unstall/run core 0 */
+- ret = hda_dsp_core_run(sdev, BIT(0));
++ ret = hda_dsp_core_run(sdev, chip->init_core_mask);
+ if (ret < 0) {
+ if (hda->boot_iteration == HDA_FW_BOOT_ATTEMPTS)
+ dev_err(sdev->dev,
+diff --git a/sound/soc/sof/intel/hda.h b/sound/soc/sof/intel/hda.h
+index 196494ba1245b..db066d094afa9 100644
+--- a/sound/soc/sof/intel/hda.h
++++ b/sound/soc/sof/intel/hda.h
+@@ -490,6 +490,7 @@ struct sof_intel_hda_stream {
+ */
+ int hda_dsp_probe(struct snd_sof_dev *sdev);
+ int hda_dsp_remove(struct snd_sof_dev *sdev);
++int hda_dsp_core_power_up(struct snd_sof_dev *sdev, unsigned int core_mask);
+ int hda_dsp_core_run(struct snd_sof_dev *sdev, unsigned int core_mask);
+ int hda_dsp_enable_core(struct snd_sof_dev *sdev, unsigned int core_mask);
+ int hda_dsp_core_reset_power_down(struct snd_sof_dev *sdev,
+diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h
+index 4f56e1784932a..f93201a830b5a 100644
+--- a/sound/usb/quirks-table.h
++++ b/sound/usb/quirks-table.h
+@@ -3802,6 +3802,54 @@ YAMAHA_DEVICE(0x7010, "UB99"),
+ }
+ },
+
++/*
++ * MacroSilicon MS2100/MS2106 based AV capture cards
++ *
++ * These claim 96kHz 1ch in the descriptors, but are actually 48kHz 2ch.
++ * They also need QUIRK_FLAG_ALIGN_TRANSFER, which makes one wonder if
++ * they pretend to be 96kHz mono as a workaround for stereo being broken
++ * by that...
++ *
++ * They also have an issue with initial stream alignment that causes the
++ * channels to be swapped and out of phase, which is dealt with in quirks.c.
++ */
++{
++ USB_AUDIO_DEVICE(0x534d, 0x0021),
++ .driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) {
++ .vendor_name = "MacroSilicon",
++ .product_name = "MS210x",
++ .ifnum = QUIRK_ANY_INTERFACE,
++ .type = QUIRK_COMPOSITE,
++ .data = &(const struct snd_usb_audio_quirk[]) {
++ {
++ .ifnum = 2,
++ .type = QUIRK_AUDIO_STANDARD_MIXER,
++ },
++ {
++ .ifnum = 3,
++ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
++ .data = &(const struct audioformat) {
++ .formats = SNDRV_PCM_FMTBIT_S16_LE,
++ .channels = 2,
++ .iface = 3,
++ .altsetting = 1,
++ .altset_idx = 1,
++ .attributes = 0,
++ .endpoint = 0x82,
++ .ep_attr = USB_ENDPOINT_XFER_ISOC |
++ USB_ENDPOINT_SYNC_ASYNC,
++ .rates = SNDRV_PCM_RATE_CONTINUOUS,
++ .rate_min = 48000,
++ .rate_max = 48000,
++ }
++ },
++ {
++ .ifnum = -1
++ }
++ }
++ }
++},
++
+ /*
+ * MacroSilicon MS2109 based HDMI capture cards
+ *
+@@ -4119,6 +4167,206 @@ YAMAHA_DEVICE(0x7010, "UB99"),
+ }
+ }
+ },
++{
++ /*
++ * Fiero SC-01 (firmware v1.0.0 @ 48 kHz)
++ */
++ USB_DEVICE(0x2b53, 0x0023),
++ .driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) {
++ .vendor_name = "Fiero",
++ .product_name = "SC-01",
++ .ifnum = QUIRK_ANY_INTERFACE,
++ .type = QUIRK_COMPOSITE,
++ .data = &(const struct snd_usb_audio_quirk[]) {
++ {
++ .ifnum = 0,
++ .type = QUIRK_AUDIO_STANDARD_INTERFACE
++ },
++ /* Playback */
++ {
++ .ifnum = 1,
++ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
++ .data = &(const struct audioformat) {
++ .formats = SNDRV_PCM_FMTBIT_S32_LE,
++ .channels = 2,
++ .fmt_bits = 24,
++ .iface = 1,
++ .altsetting = 1,
++ .altset_idx = 1,
++ .endpoint = 0x01,
++ .ep_attr = USB_ENDPOINT_XFER_ISOC |
++ USB_ENDPOINT_SYNC_ASYNC,
++ .rates = SNDRV_PCM_RATE_48000,
++ .rate_min = 48000,
++ .rate_max = 48000,
++ .nr_rates = 1,
++ .rate_table = (unsigned int[]) { 48000 },
++ .clock = 0x29
++ }
++ },
++ /* Capture */
++ {
++ .ifnum = 2,
++ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
++ .data = &(const struct audioformat) {
++ .formats = SNDRV_PCM_FMTBIT_S32_LE,
++ .channels = 2,
++ .fmt_bits = 24,
++ .iface = 2,
++ .altsetting = 1,
++ .altset_idx = 1,
++ .endpoint = 0x82,
++ .ep_attr = USB_ENDPOINT_XFER_ISOC |
++ USB_ENDPOINT_SYNC_ASYNC |
++ USB_ENDPOINT_USAGE_IMPLICIT_FB,
++ .rates = SNDRV_PCM_RATE_48000,
++ .rate_min = 48000,
++ .rate_max = 48000,
++ .nr_rates = 1,
++ .rate_table = (unsigned int[]) { 48000 },
++ .clock = 0x29
++ }
++ },
++ {
++ .ifnum = -1
++ }
++ }
++ }
++},
++{
++ /*
++ * Fiero SC-01 (firmware v1.0.0 @ 96 kHz)
++ */
++ USB_DEVICE(0x2b53, 0x0024),
++ .driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) {
++ .vendor_name = "Fiero",
++ .product_name = "SC-01",
++ .ifnum = QUIRK_ANY_INTERFACE,
++ .type = QUIRK_COMPOSITE,
++ .data = &(const struct snd_usb_audio_quirk[]) {
++ {
++ .ifnum = 0,
++ .type = QUIRK_AUDIO_STANDARD_INTERFACE
++ },
++ /* Playback */
++ {
++ .ifnum = 1,
++ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
++ .data = &(const struct audioformat) {
++ .formats = SNDRV_PCM_FMTBIT_S32_LE,
++ .channels = 2,
++ .fmt_bits = 24,
++ .iface = 1,
++ .altsetting = 1,
++ .altset_idx = 1,
++ .endpoint = 0x01,
++ .ep_attr = USB_ENDPOINT_XFER_ISOC |
++ USB_ENDPOINT_SYNC_ASYNC,
++ .rates = SNDRV_PCM_RATE_96000,
++ .rate_min = 96000,
++ .rate_max = 96000,
++ .nr_rates = 1,
++ .rate_table = (unsigned int[]) { 96000 },
++ .clock = 0x29
++ }
++ },
++ /* Capture */
++ {
++ .ifnum = 2,
++ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
++ .data = &(const struct audioformat) {
++ .formats = SNDRV_PCM_FMTBIT_S32_LE,
++ .channels = 2,
++ .fmt_bits = 24,
++ .iface = 2,
++ .altsetting = 1,
++ .altset_idx = 1,
++ .endpoint = 0x82,
++ .ep_attr = USB_ENDPOINT_XFER_ISOC |
++ USB_ENDPOINT_SYNC_ASYNC |
++ USB_ENDPOINT_USAGE_IMPLICIT_FB,
++ .rates = SNDRV_PCM_RATE_96000,
++ .rate_min = 96000,
++ .rate_max = 96000,
++ .nr_rates = 1,
++ .rate_table = (unsigned int[]) { 96000 },
++ .clock = 0x29
++ }
++ },
++ {
++ .ifnum = -1
++ }
++ }
++ }
++},
++{
++ /*
++ * Fiero SC-01 (firmware v1.1.0)
++ */
++ USB_DEVICE(0x2b53, 0x0031),
++ .driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) {
++ .vendor_name = "Fiero",
++ .product_name = "SC-01",
++ .ifnum = QUIRK_ANY_INTERFACE,
++ .type = QUIRK_COMPOSITE,
++ .data = &(const struct snd_usb_audio_quirk[]) {
++ {
++ .ifnum = 0,
++ .type = QUIRK_AUDIO_STANDARD_INTERFACE
++ },
++ /* Playback */
++ {
++ .ifnum = 1,
++ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
++ .data = &(const struct audioformat) {
++ .formats = SNDRV_PCM_FMTBIT_S32_LE,
++ .channels = 2,
++ .fmt_bits = 24,
++ .iface = 1,
++ .altsetting = 1,
++ .altset_idx = 1,
++ .endpoint = 0x01,
++ .ep_attr = USB_ENDPOINT_XFER_ISOC |
++ USB_ENDPOINT_SYNC_ASYNC,
++ .rates = SNDRV_PCM_RATE_48000 |
++ SNDRV_PCM_RATE_96000,
++ .rate_min = 48000,
++ .rate_max = 96000,
++ .nr_rates = 2,
++ .rate_table = (unsigned int[]) { 48000, 96000 },
++ .clock = 0x29
++ }
++ },
++ /* Capture */
++ {
++ .ifnum = 2,
++ .type = QUIRK_AUDIO_FIXED_ENDPOINT,
++ .data = &(const struct audioformat) {
++ .formats = SNDRV_PCM_FMTBIT_S32_LE,
++ .channels = 2,
++ .fmt_bits = 24,
++ .iface = 2,
++ .altsetting = 1,
++ .altset_idx = 1,
++ .endpoint = 0x82,
++ .ep_attr = USB_ENDPOINT_XFER_ISOC |
++ USB_ENDPOINT_SYNC_ASYNC |
++ USB_ENDPOINT_USAGE_IMPLICIT_FB,
++ .rates = SNDRV_PCM_RATE_48000 |
++ SNDRV_PCM_RATE_96000,
++ .rate_min = 48000,
++ .rate_max = 96000,
++ .nr_rates = 2,
++ .rate_table = (unsigned int[]) { 48000, 96000 },
++ .clock = 0x29
++ }
++ },
++ {
++ .ifnum = -1
++ }
++ }
++ }
++},
+
+ #undef USB_DEVICE_VENDOR_SPEC
+ #undef USB_AUDIO_DEVICE
+diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
+index 12ce69b04f634..968d90caeefa0 100644
+--- a/sound/usb/quirks.c
++++ b/sound/usb/quirks.c
+@@ -1478,6 +1478,7 @@ void snd_usb_set_format_quirk(struct snd_usb_substream *subs,
+ case USB_ID(0x041e, 0x3f19): /* E-Mu 0204 USB */
+ set_format_emu_quirk(subs, fmt);
+ break;
++ case USB_ID(0x534d, 0x0021): /* MacroSilicon MS2100/MS2106 */
+ case USB_ID(0x534d, 0x2109): /* MacroSilicon MS2109 */
+ subs->stream_offset_adj = 2;
+ break;
+@@ -1908,10 +1909,18 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_IGNORE_CTL_ERROR),
+ DEVICE_FLG(0x413c, 0xa506, /* Dell AE515 sound bar */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
++ DEVICE_FLG(0x534d, 0x0021, /* MacroSilicon MS2100/MS2106 */
++ QUIRK_FLAG_ALIGN_TRANSFER),
+ DEVICE_FLG(0x534d, 0x2109, /* MacroSilicon MS2109 */
+ QUIRK_FLAG_ALIGN_TRANSFER),
+ DEVICE_FLG(0x1224, 0x2a25, /* Jieli Technology USB PHY 2.0 */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
++ DEVICE_FLG(0x2b53, 0x0023, /* Fiero SC-01 (firmware v1.0.0 @ 48 kHz) */
++ QUIRK_FLAG_GENERIC_IMPLICIT_FB),
++ DEVICE_FLG(0x2b53, 0x0024, /* Fiero SC-01 (firmware v1.0.0 @ 96 kHz) */
++ QUIRK_FLAG_GENERIC_IMPLICIT_FB),
++ DEVICE_FLG(0x2b53, 0x0031, /* Fiero SC-01 (firmware v1.1.0) */
++ QUIRK_FLAG_GENERIC_IMPLICIT_FB),
+
+ /* Vendor matches */
+ VENDOR_FLG(0x045e, /* MS Lifecam */
+diff --git a/tools/testing/selftests/wireguard/qemu/Makefile b/tools/testing/selftests/wireguard/qemu/Makefile
+index bca07b93eeb07..fd060c5d6f39a 100644
+--- a/tools/testing/selftests/wireguard/qemu/Makefile
++++ b/tools/testing/selftests/wireguard/qemu/Makefile
+@@ -19,8 +19,6 @@ endif
+ MIRROR := https://download.wireguard.com/qemu-test/distfiles/
+
+ KERNEL_BUILD_PATH := $(BUILD_PATH)/kernel$(if $(findstring yes,$(DEBUG_KERNEL)),-debug)
+-rwildcard=$(foreach d,$(wildcard $1*),$(call rwildcard,$d/,$2) $(filter $(subst *,%,$2),$d))
+-WIREGUARD_SOURCES := $(call rwildcard,$(KERNEL_PATH)/drivers/net/wireguard/,*)
+
+ default: qemu
+
+@@ -324,8 +322,9 @@ $(KERNEL_BUILD_PATH)/.config: $(TOOLCHAIN_PATH)/.installed kernel.config arch/$(
+ cd $(KERNEL_BUILD_PATH) && ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config $(KERNEL_BUILD_PATH)/minimal.config
+ $(if $(findstring yes,$(DEBUG_KERNEL)),cp debug.config $(KERNEL_BUILD_PATH) && cd $(KERNEL_BUILD_PATH) && ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config debug.config,)
+
+-$(KERNEL_BZIMAGE): $(TOOLCHAIN_PATH)/.installed $(KERNEL_BUILD_PATH)/.config $(BUILD_PATH)/init-cpio-spec.txt $(IPERF_PATH)/src/iperf3 $(IPUTILS_PATH)/ping $(BASH_PATH)/bash $(IPROUTE2_PATH)/misc/ss $(IPROUTE2_PATH)/ip/ip $(IPTABLES_PATH)/iptables/xtables-legacy-multi $(NMAP_PATH)/ncat/ncat $(WIREGUARD_TOOLS_PATH)/src/wg $(BUILD_PATH)/init ../netns.sh $(WIREGUARD_SOURCES)
++$(KERNEL_BZIMAGE): $(TOOLCHAIN_PATH)/.installed $(KERNEL_BUILD_PATH)/.config $(BUILD_PATH)/init-cpio-spec.txt $(IPERF_PATH)/src/iperf3 $(IPUTILS_PATH)/ping $(BASH_PATH)/bash $(IPROUTE2_PATH)/misc/ss $(IPROUTE2_PATH)/ip/ip $(IPTABLES_PATH)/iptables/xtables-legacy-multi $(NMAP_PATH)/ncat/ncat $(WIREGUARD_TOOLS_PATH)/src/wg $(BUILD_PATH)/init
+ $(MAKE) -C $(KERNEL_PATH) O=$(KERNEL_BUILD_PATH) ARCH=$(KERNEL_ARCH) CROSS_COMPILE=$(CROSS_COMPILE)
++.PHONY: $(KERNEL_BZIMAGE)
+
+ $(TOOLCHAIN_PATH)/$(CHOST)/include/linux/.installed: | $(KERNEL_BUILD_PATH)/.config $(TOOLCHAIN_PATH)/.installed
+ rm -rf $(TOOLCHAIN_PATH)/$(CHOST)/include/linux
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/arm.config b/tools/testing/selftests/wireguard/qemu/arch/arm.config
+index fc7959bef9c25..0579c66be83ee 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/arm.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/arm.config
+@@ -7,6 +7,7 @@ CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
+ CONFIG_VIRTIO_MENU=y
+ CONFIG_VIRTIO_MMIO=y
+ CONFIG_VIRTIO_CONSOLE=y
++CONFIG_COMPAT_32BIT_TIME=y
+ CONFIG_CMDLINE_BOOL=y
+ CONFIG_CMDLINE="console=ttyAMA0 wg.success=vport0p1 panic_on_warn=1"
+ CONFIG_FRAME_WARN=1024
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/armeb.config b/tools/testing/selftests/wireguard/qemu/arch/armeb.config
+index f3066be81c199..2a3307bbe534f 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/armeb.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/armeb.config
+@@ -7,6 +7,7 @@ CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
+ CONFIG_VIRTIO_MENU=y
+ CONFIG_VIRTIO_MMIO=y
+ CONFIG_VIRTIO_CONSOLE=y
++CONFIG_COMPAT_32BIT_TIME=y
+ CONFIG_CMDLINE_BOOL=y
+ CONFIG_CMDLINE="console=ttyAMA0 wg.success=vport0p1 panic_on_warn=1"
+ CONFIG_CPU_BIG_ENDIAN=y
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/i686.config b/tools/testing/selftests/wireguard/qemu/arch/i686.config
+index 6d90892a85a24..cd864b9be6fb5 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/i686.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/i686.config
+@@ -1,6 +1,7 @@
+ CONFIG_ACPI=y
+ CONFIG_SERIAL_8250=y
+ CONFIG_SERIAL_8250_CONSOLE=y
++CONFIG_COMPAT_32BIT_TIME=y
+ CONFIG_CMDLINE_BOOL=y
+ CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1 panic_on_warn=1"
+ CONFIG_FRAME_WARN=1024
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/m68k.config b/tools/testing/selftests/wireguard/qemu/arch/m68k.config
+index 82c925e49beb7..9639bfe060744 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/m68k.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/m68k.config
+@@ -5,5 +5,6 @@ CONFIG_MAC=y
+ CONFIG_SERIAL_PMACZILOG=y
+ CONFIG_SERIAL_PMACZILOG_TTYS=y
+ CONFIG_SERIAL_PMACZILOG_CONSOLE=y
++CONFIG_COMPAT_32BIT_TIME=y
+ CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1 panic_on_warn=1"
+ CONFIG_FRAME_WARN=1024
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/mips.config b/tools/testing/selftests/wireguard/qemu/arch/mips.config
+index d7ec63c17b30e..2a84402353ab8 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/mips.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/mips.config
+@@ -6,6 +6,7 @@ CONFIG_POWER_RESET=y
+ CONFIG_POWER_RESET_SYSCON=y
+ CONFIG_SERIAL_8250=y
+ CONFIG_SERIAL_8250_CONSOLE=y
++CONFIG_COMPAT_32BIT_TIME=y
+ CONFIG_CMDLINE_BOOL=y
+ CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1 panic_on_warn=1"
+ CONFIG_FRAME_WARN=1024
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/mipsel.config b/tools/testing/selftests/wireguard/qemu/arch/mipsel.config
+index 18a4982937376..56146a101e7e4 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/mipsel.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/mipsel.config
+@@ -7,6 +7,7 @@ CONFIG_POWER_RESET=y
+ CONFIG_POWER_RESET_SYSCON=y
+ CONFIG_SERIAL_8250=y
+ CONFIG_SERIAL_8250_CONSOLE=y
++CONFIG_COMPAT_32BIT_TIME=y
+ CONFIG_CMDLINE_BOOL=y
+ CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1 panic_on_warn=1"
+ CONFIG_FRAME_WARN=1024
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/powerpc.config b/tools/testing/selftests/wireguard/qemu/arch/powerpc.config
+index 5e04882e8e35b..174a9ffe2a362 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/powerpc.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/powerpc.config
+@@ -4,6 +4,7 @@ CONFIG_PPC_85xx=y
+ CONFIG_PHYS_64BIT=y
+ CONFIG_SERIAL_8250=y
+ CONFIG_SERIAL_8250_CONSOLE=y
++CONFIG_COMPAT_32BIT_TIME=y
+ CONFIG_MATH_EMULATION=y
+ CONFIG_CMDLINE_BOOL=y
+ CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1 panic_on_warn=1"
+diff --git a/tools/testing/selftests/wireguard/qemu/init.c b/tools/testing/selftests/wireguard/qemu/init.c
+index 2a0f48fac925a..542c34b00eb06 100644
+--- a/tools/testing/selftests/wireguard/qemu/init.c
++++ b/tools/testing/selftests/wireguard/qemu/init.c
+@@ -11,6 +11,7 @@
+ #include <stdlib.h>
+ #include <stdbool.h>
+ #include <fcntl.h>
++#include <time.h>
+ #include <sys/wait.h>
+ #include <sys/mount.h>
+ #include <sys/stat.h>
+@@ -67,6 +68,15 @@ static void seed_rng(void)
+ close(fd);
+ }
+
++static void set_time(void)
++{
++ if (time(NULL))
++ return;
++ pretty_message("[+] Setting fake time...");
++ if (stime(&(time_t){1433512680}) < 0)
++ panic("settimeofday()");
++}
++
+ static void mount_filesystems(void)
+ {
+ pretty_message("[+] Mounting filesystems...");
+@@ -256,6 +266,7 @@ int main(int argc, char *argv[])
+ print_banner();
+ mount_filesystems();
+ seed_rng();
++ set_time();
+ kmod_selftests();
+ enable_logging();
+ clear_leaks();
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-22 11:11 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-07-22 11:11 UTC (permalink / raw
To: gentoo-commits
commit: e83956c00f3c9876c87a60a248d151aecf71c5d8
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Fri Jul 22 11:10:40 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Fri Jul 22 11:10:40 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=e83956c0
Remove redundant patch: 1950_workaround_negprot_bug.patch
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 ---
1950_workaround_negprot_bug.patch | 56 ---------------------------------------
2 files changed, 60 deletions(-)
diff --git a/0000_README b/0000_README
index a7d6578d..b6f81e34 100644
--- a/0000_README
+++ b/0000_README
@@ -107,10 +107,6 @@ Patch: 1700_sparc-address-warray-bound-warnings.patch
From: https://github.com/KSPP/linux/issues/109
Desc: Address -Warray-bounds warnings
-Patch: 1950_workaround_negprot_bug.patch
-From: https://patchwork.kernel.org/project/cifs-client/patch/CAH2r5mtuN-yswT5VTbNPzj02fwiHYOCe2eR8mcgRgRE8Qpkjgw@mail.gmail.com/
-Desc: Fix mount fail to older Samba servers
-
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1950_workaround_negprot_bug.patch b/1950_workaround_negprot_bug.patch
deleted file mode 100644
index 9a38ed83..00000000
--- a/1950_workaround_negprot_bug.patch
+++ /dev/null
@@ -1,56 +0,0 @@
-From a8d8532e4c335f0a31dd213abe4e31682f34647c Mon Sep 17 00:00:00 2001
-From: Steve French <stfrench@microsoft.com>
-Date: Tue, 12 Jul 2022 00:11:42 -0500
-Subject: [PATCH] smb3: workaround negprot bug in some Samba servers
-
-Mount can now fail to older Samba servers due to a server
-bug handling padding at the end of the last negotiate
-contexts (negotiate contexts typically round up to 8 byte
-lengths by adding padding if needed). This server bug can
-be avoided by switching the order of negotiate contexts,
-placing a negotiate context at the end that does not
-require padding (prior to the recent netname context fix
-this was the case on the client).
-
-Fixes: 73130a7b1ac9 ("smb3: fix empty netname context on secondary channels")
-Reported-by: Julian Sikorski <belegdol@gmail.com>
-Signed-off-by: Steve French <stfrench@microsoft.com>
----
- fs/cifs/smb2pdu.c | 13 +++++++------
- 1 file changed, 7 insertions(+), 6 deletions(-)
-
-diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
-index 12b4dddaedb0..c705de32e225 100644
---- a/fs/cifs/smb2pdu.c
-+++ b/fs/cifs/smb2pdu.c
-@@ -571,10 +571,6 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
- *total_len += ctxt_len;
- pneg_ctxt += ctxt_len;
-
-- build_posix_ctxt((struct smb2_posix_neg_context *)pneg_ctxt);
-- *total_len += sizeof(struct smb2_posix_neg_context);
-- pneg_ctxt += sizeof(struct smb2_posix_neg_context);
--
- /*
- * secondary channels don't have the hostname field populated
- * use the hostname field in the primary channel instead
-@@ -586,9 +582,14 @@ assemble_neg_contexts(struct smb2_negotiate_req *req,
- hostname);
- *total_len += ctxt_len;
- pneg_ctxt += ctxt_len;
-- neg_context_count = 4;
-- } else /* second channels do not have a hostname */
- neg_context_count = 3;
-+ } else
-+ neg_context_count = 2;
-+
-+ build_posix_ctxt((struct smb2_posix_neg_context *)pneg_ctxt);
-+ *total_len += sizeof(struct smb2_posix_neg_context);
-+ pneg_ctxt += sizeof(struct smb2_posix_neg_context);
-+ neg_context_count++;
-
- if (server->compress_algorithm) {
- build_compression_ctxt((struct smb2_compression_capabilities_context *)
---
-2.34.1
-
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-23 11:45 Alice Ferrazzi
0 siblings, 0 replies; 31+ messages in thread
From: Alice Ferrazzi @ 2022-07-23 11:45 UTC (permalink / raw
To: gentoo-commits
commit: 4064a320985b8d7d5888eadd8cf48c385c4586af
Author: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
AuthorDate: Sat Jul 23 11:44:36 2022 +0000
Commit: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
CommitDate: Sat Jul 23 11:44:42 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=4064a320
Linux patch 5.18.14
Signed-off-by: Alice Ferrazzi <alicef <AT> gentoo.org>
0000_README | 4 +
1013_linux-5.18.14.patch | 4685 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 4689 insertions(+)
diff --git a/0000_README b/0000_README
index b6f81e34..32e6ba50 100644
--- a/0000_README
+++ b/0000_README
@@ -95,6 +95,10 @@ Patch: 1012_linux-5.18.13.patch
From: http://www.kernel.org
Desc: Linux 5.18.13
+Patch: 1013_linux-5.18.14.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.14
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1013_linux-5.18.14.patch b/1013_linux-5.18.14.patch
new file mode 100644
index 00000000..aa7f2e2c
--- /dev/null
+++ b/1013_linux-5.18.14.patch
@@ -0,0 +1,4685 @@
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index c4893782055b4..eb92195ca0155 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -5124,6 +5124,30 @@
+
+ retain_initrd [RAM] Keep initrd memory after extraction
+
++ retbleed= [X86] Control mitigation of RETBleed (Arbitrary
++ Speculative Code Execution with Return Instructions)
++ vulnerability.
++
++ off - no mitigation
++ auto - automatically select a migitation
++ auto,nosmt - automatically select a mitigation,
++ disabling SMT if necessary for
++ the full mitigation (only on Zen1
++ and older without STIBP).
++ ibpb - mitigate short speculation windows on
++ basic block boundaries too. Safe, highest
++ perf impact.
++ unret - force enable untrained return thunks,
++ only effective on AMD f15h-f17h
++ based systems.
++ unret,nosmt - like unret, will disable SMT when STIBP
++ is not available.
++
++ Selecting 'auto' will choose a mitigation method at run
++ time according to the CPU.
++
++ Not specifying this option is equivalent to retbleed=auto.
++
+ rfkill.default_state=
+ 0 "airplane mode". All wifi, bluetooth, wimax, gps, fm,
+ etc. communication is blocked by default.
+@@ -5482,6 +5506,7 @@
+ eibrs - enhanced IBRS
+ eibrs,retpoline - enhanced IBRS + Retpolines
+ eibrs,lfence - enhanced IBRS + LFENCE
++ ibrs - use IBRS to protect kernel
+
+ Not specifying this option is equivalent to
+ spectre_v2=auto.
+diff --git a/Makefile b/Makefile
+index 1f3c753cb28df..d3723b2f6d6ca 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 13
++SUBLEVEL = 14
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
+index 0760e24f2ebab..9838967d0b2f1 100644
+--- a/arch/um/kernel/um_arch.c
++++ b/arch/um/kernel/um_arch.c
+@@ -432,6 +432,10 @@ void apply_retpolines(s32 *start, s32 *end)
+ {
+ }
+
++void apply_returns(s32 *start, s32 *end)
++{
++}
++
+ void apply_alternatives(struct alt_instr *start, struct alt_instr *end)
+ {
+ }
+diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
+index b2c65f5733538..4d1d87f76a74f 100644
+--- a/arch/x86/Kconfig
++++ b/arch/x86/Kconfig
+@@ -457,27 +457,6 @@ config GOLDFISH
+ def_bool y
+ depends on X86_GOLDFISH
+
+-config RETPOLINE
+- bool "Avoid speculative indirect branches in kernel"
+- default y
+- help
+- Compile kernel with the retpoline compiler options to guard against
+- kernel-to-user data leaks by avoiding speculative indirect
+- branches. Requires a compiler with -mindirect-branch=thunk-extern
+- support for full protection. The kernel may run slower.
+-
+-config CC_HAS_SLS
+- def_bool $(cc-option,-mharden-sls=all)
+-
+-config SLS
+- bool "Mitigate Straight-Line-Speculation"
+- depends on CC_HAS_SLS && X86_64
+- default n
+- help
+- Compile the kernel with straight-line-speculation options to guard
+- against straight line speculation. The kernel image might be slightly
+- larger.
+-
+ config X86_CPU_RESCTRL
+ bool "x86 CPU resource control support"
+ depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
+@@ -2449,6 +2428,88 @@ source "kernel/livepatch/Kconfig"
+
+ endmenu
+
++config CC_HAS_SLS
++ def_bool $(cc-option,-mharden-sls=all)
++
++config CC_HAS_RETURN_THUNK
++ def_bool $(cc-option,-mfunction-return=thunk-extern)
++
++menuconfig SPECULATION_MITIGATIONS
++ bool "Mitigations for speculative execution vulnerabilities"
++ default y
++ help
++ Say Y here to enable options which enable mitigations for
++ speculative execution hardware vulnerabilities.
++
++ If you say N, all mitigations will be disabled. You really
++ should know what you are doing to say so.
++
++if SPECULATION_MITIGATIONS
++
++config PAGE_TABLE_ISOLATION
++ bool "Remove the kernel mapping in user mode"
++ default y
++ depends on (X86_64 || X86_PAE)
++ help
++ This feature reduces the number of hardware side channels by
++ ensuring that the majority of kernel addresses are not mapped
++ into userspace.
++
++ See Documentation/x86/pti.rst for more details.
++
++config RETPOLINE
++ bool "Avoid speculative indirect branches in kernel"
++ default y
++ help
++ Compile kernel with the retpoline compiler options to guard against
++ kernel-to-user data leaks by avoiding speculative indirect
++ branches. Requires a compiler with -mindirect-branch=thunk-extern
++ support for full protection. The kernel may run slower.
++
++config RETHUNK
++ bool "Enable return-thunks"
++ depends on RETPOLINE && CC_HAS_RETURN_THUNK
++ default y
++ help
++ Compile the kernel with the return-thunks compiler option to guard
++ against kernel-to-user data leaks by avoiding return speculation.
++ Requires a compiler with -mfunction-return=thunk-extern
++ support for full protection. The kernel may run slower.
++
++config CPU_UNRET_ENTRY
++ bool "Enable UNRET on kernel entry"
++ depends on CPU_SUP_AMD && RETHUNK
++ default y
++ help
++ Compile the kernel with support for the retbleed=unret mitigation.
++
++config CPU_IBPB_ENTRY
++ bool "Enable IBPB on kernel entry"
++ depends on CPU_SUP_AMD
++ default y
++ help
++ Compile the kernel with support for the retbleed=ibpb mitigation.
++
++config CPU_IBRS_ENTRY
++ bool "Enable IBRS on kernel entry"
++ depends on CPU_SUP_INTEL
++ default y
++ help
++ Compile the kernel with support for the spectre_v2=ibrs mitigation.
++ This mitigates both spectre_v2 and retbleed at great cost to
++ performance.
++
++config SLS
++ bool "Mitigate Straight-Line-Speculation"
++ depends on CC_HAS_SLS && X86_64
++ default n
++ help
++ Compile the kernel with straight-line-speculation options to guard
++ against straight line speculation. The kernel image might be slightly
++ larger.
++
++endif
++
+ config ARCH_HAS_ADD_PAGES
+ def_bool y
+ depends on ARCH_ENABLE_MEMORY_HOTPLUG
+diff --git a/arch/x86/Makefile b/arch/x86/Makefile
+index 63d50f65b8283..fb0de637411cb 100644
+--- a/arch/x86/Makefile
++++ b/arch/x86/Makefile
+@@ -21,6 +21,12 @@ ifdef CONFIG_CC_IS_CLANG
+ RETPOLINE_CFLAGS := -mretpoline-external-thunk
+ RETPOLINE_VDSO_CFLAGS := -mretpoline
+ endif
++
++ifdef CONFIG_RETHUNK
++RETHUNK_CFLAGS := -mfunction-return=thunk-extern
++RETPOLINE_CFLAGS += $(RETHUNK_CFLAGS)
++endif
++
+ export RETPOLINE_CFLAGS
+ export RETPOLINE_VDSO_CFLAGS
+
+diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
+index 7fec5dcf64386..eeadbd7d92cc5 100644
+--- a/arch/x86/entry/Makefile
++++ b/arch/x86/entry/Makefile
+@@ -11,7 +11,7 @@ CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE)
+
+ CFLAGS_common.o += -fno-stack-protector
+
+-obj-y := entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
++obj-y := entry.o entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
+ obj-y += common.o
+
+ obj-y += vdso/
+diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
+index a4c061fb7c6ea..b00a3a95fbfab 100644
+--- a/arch/x86/entry/calling.h
++++ b/arch/x86/entry/calling.h
+@@ -7,6 +7,8 @@
+ #include <asm/asm-offsets.h>
+ #include <asm/processor-flags.h>
+ #include <asm/ptrace-abi.h>
++#include <asm/msr.h>
++#include <asm/nospec-branch.h>
+
+ /*
+
+@@ -119,27 +121,19 @@ For 32-bit we have the following conventions - kernel is built with
+ CLEAR_REGS
+ .endm
+
+-.macro POP_REGS pop_rdi=1 skip_r11rcx=0
++.macro POP_REGS pop_rdi=1
+ popq %r15
+ popq %r14
+ popq %r13
+ popq %r12
+ popq %rbp
+ popq %rbx
+- .if \skip_r11rcx
+- popq %rsi
+- .else
+ popq %r11
+- .endif
+ popq %r10
+ popq %r9
+ popq %r8
+ popq %rax
+- .if \skip_r11rcx
+- popq %rsi
+- .else
+ popq %rcx
+- .endif
+ popq %rdx
+ popq %rsi
+ .if \pop_rdi
+@@ -289,6 +283,66 @@ For 32-bit we have the following conventions - kernel is built with
+
+ #endif
+
++/*
++ * IBRS kernel mitigation for Spectre_v2.
++ *
++ * Assumes full context is established (PUSH_REGS, CR3 and GS) and it clobbers
++ * the regs it uses (AX, CX, DX). Must be called before the first RET
++ * instruction (NOTE! UNTRAIN_RET includes a RET instruction)
++ *
++ * The optional argument is used to save/restore the current value,
++ * which is used on the paranoid paths.
++ *
++ * Assumes x86_spec_ctrl_{base,current} to have SPEC_CTRL_IBRS set.
++ */
++.macro IBRS_ENTER save_reg
++#ifdef CONFIG_CPU_IBRS_ENTRY
++ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_KERNEL_IBRS
++ movl $MSR_IA32_SPEC_CTRL, %ecx
++
++.ifnb \save_reg
++ rdmsr
++ shl $32, %rdx
++ or %rdx, %rax
++ mov %rax, \save_reg
++ test $SPEC_CTRL_IBRS, %eax
++ jz .Ldo_wrmsr_\@
++ lfence
++ jmp .Lend_\@
++.Ldo_wrmsr_\@:
++.endif
++
++ movq PER_CPU_VAR(x86_spec_ctrl_current), %rdx
++ movl %edx, %eax
++ shr $32, %rdx
++ wrmsr
++.Lend_\@:
++#endif
++.endm
++
++/*
++ * Similar to IBRS_ENTER, requires KERNEL GS,CR3 and clobbers (AX, CX, DX)
++ * regs. Must be called after the last RET.
++ */
++.macro IBRS_EXIT save_reg
++#ifdef CONFIG_CPU_IBRS_ENTRY
++ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_KERNEL_IBRS
++ movl $MSR_IA32_SPEC_CTRL, %ecx
++
++.ifnb \save_reg
++ mov \save_reg, %rdx
++.else
++ movq PER_CPU_VAR(x86_spec_ctrl_current), %rdx
++ andl $(~SPEC_CTRL_IBRS), %edx
++.endif
++
++ movl %edx, %eax
++ shr $32, %rdx
++ wrmsr
++.Lend_\@:
++#endif
++.endm
++
+ /*
+ * Mitigate Spectre v1 for conditional swapgs code paths.
+ *
+diff --git a/arch/x86/entry/entry.S b/arch/x86/entry/entry.S
+new file mode 100644
+index 0000000000000..bfb7bcb362bcf
+--- /dev/null
++++ b/arch/x86/entry/entry.S
+@@ -0,0 +1,22 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++/*
++ * Common place for both 32- and 64-bit entry routines.
++ */
++
++#include <linux/linkage.h>
++#include <asm/export.h>
++#include <asm/msr-index.h>
++
++.pushsection .noinstr.text, "ax"
++
++SYM_FUNC_START(entry_ibpb)
++ movl $MSR_IA32_PRED_CMD, %ecx
++ movl $PRED_CMD_IBPB, %eax
++ xorl %edx, %edx
++ wrmsr
++ RET
++SYM_FUNC_END(entry_ibpb)
++/* For KVM */
++EXPORT_SYMBOL_GPL(entry_ibpb);
++
++.popsection
+diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
+index 8874208440664..e309e71560389 100644
+--- a/arch/x86/entry/entry_32.S
++++ b/arch/x86/entry/entry_32.S
+@@ -698,7 +698,6 @@ SYM_CODE_START(__switch_to_asm)
+ movl %ebx, PER_CPU_VAR(__stack_chk_guard)
+ #endif
+
+-#ifdef CONFIG_RETPOLINE
+ /*
+ * When switching from a shallower to a deeper call stack
+ * the RSB may either underflow or use entries populated
+@@ -707,7 +706,6 @@ SYM_CODE_START(__switch_to_asm)
+ * speculative execution to prevent attack.
+ */
+ FILL_RETURN_BUFFER %ebx, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+-#endif
+
+ /* Restore flags or the incoming task to restore AC state. */
+ popfl
+diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
+index d8376e5fe1afe..2ea185d47cfd1 100644
+--- a/arch/x86/entry/entry_64.S
++++ b/arch/x86/entry/entry_64.S
+@@ -85,7 +85,7 @@
+ */
+
+ SYM_CODE_START(entry_SYSCALL_64)
+- UNWIND_HINT_EMPTY
++ UNWIND_HINT_ENTRY
+ ENDBR
+
+ swapgs
+@@ -112,6 +112,11 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
+ movq %rsp, %rdi
+ /* Sign extend the lower 32bit as syscall numbers are treated as int */
+ movslq %eax, %rsi
++
++ /* clobbers %rax, make sure it is after saving the syscall nr */
++ IBRS_ENTER
++ UNTRAIN_RET
++
+ call do_syscall_64 /* returns with IRQs disabled */
+
+ /*
+@@ -191,8 +196,8 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
+ * perf profiles. Nothing jumps here.
+ */
+ syscall_return_via_sysret:
+- /* rcx and r11 are already restored (see code above) */
+- POP_REGS pop_rdi=0 skip_r11rcx=1
++ IBRS_EXIT
++ POP_REGS pop_rdi=0
+
+ /*
+ * Now all regs are restored except RSP and RDI.
+@@ -245,7 +250,6 @@ SYM_FUNC_START(__switch_to_asm)
+ movq %rbx, PER_CPU_VAR(fixed_percpu_data) + stack_canary_offset
+ #endif
+
+-#ifdef CONFIG_RETPOLINE
+ /*
+ * When switching from a shallower to a deeper call stack
+ * the RSB may either underflow or use entries populated
+@@ -254,7 +258,6 @@ SYM_FUNC_START(__switch_to_asm)
+ * speculative execution to prevent attack.
+ */
+ FILL_RETURN_BUFFER %r12, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+-#endif
+
+ /* restore callee-saved registers */
+ popq %r15
+@@ -318,6 +321,14 @@ SYM_CODE_END(ret_from_fork)
+ #endif
+ .endm
+
++SYM_CODE_START_LOCAL(xen_error_entry)
++ UNWIND_HINT_FUNC
++ PUSH_AND_CLEAR_REGS save_ret=1
++ ENCODE_FRAME_POINTER 8
++ UNTRAIN_RET
++ RET
++SYM_CODE_END(xen_error_entry)
++
+ /**
+ * idtentry_body - Macro to emit code calling the C function
+ * @cfunc: C function to be called
+@@ -325,7 +336,18 @@ SYM_CODE_END(ret_from_fork)
+ */
+ .macro idtentry_body cfunc has_error_code:req
+
+- call error_entry
++ /*
++ * Call error_entry() and switch to the task stack if from userspace.
++ *
++ * When in XENPV, it is already in the task stack, and it can't fault
++ * for native_iret() nor native_load_gs_index() since XENPV uses its
++ * own pvops for IRET and load_gs_index(). And it doesn't need to
++ * switch the CR3. So it can skip invoking error_entry().
++ */
++ ALTERNATIVE "call error_entry; movq %rax, %rsp", \
++ "call xen_error_entry", X86_FEATURE_XENPV
++
++ ENCODE_FRAME_POINTER
+ UNWIND_HINT_REGS
+
+ movq %rsp, %rdi /* pt_regs pointer into 1st argument*/
+@@ -582,6 +604,7 @@ __irqentry_text_end:
+
+ SYM_CODE_START_LOCAL(common_interrupt_return)
+ SYM_INNER_LABEL(swapgs_restore_regs_and_return_to_usermode, SYM_L_GLOBAL)
++ IBRS_EXIT
+ #ifdef CONFIG_DEBUG_ENTRY
+ /* Assert that pt_regs indicates user mode. */
+ testb $3, CS(%rsp)
+@@ -695,6 +718,7 @@ native_irq_return_ldt:
+ pushq %rdi /* Stash user RDI */
+ swapgs /* to kernel GS */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi /* to kernel CR3 */
++ UNTRAIN_RET
+
+ movq PER_CPU_VAR(espfix_waddr), %rdi
+ movq %rax, (0*8)(%rdi) /* user RAX */
+@@ -867,6 +891,9 @@ SYM_CODE_END(xen_failsafe_callback)
+ * 1 -> no SWAPGS on exit
+ *
+ * Y GSBASE value at entry, must be restored in paranoid_exit
++ *
++ * R14 - old CR3
++ * R15 - old SPEC_CTRL
+ */
+ SYM_CODE_START_LOCAL(paranoid_entry)
+ UNWIND_HINT_FUNC
+@@ -911,7 +938,7 @@ SYM_CODE_START_LOCAL(paranoid_entry)
+ * is needed here.
+ */
+ SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx
+- RET
++ jmp .Lparanoid_gsbase_done
+
+ .Lparanoid_entry_checkgs:
+ /* EBX = 1 -> kernel GSBASE active, no restore required */
+@@ -930,8 +957,16 @@ SYM_CODE_START_LOCAL(paranoid_entry)
+ xorl %ebx, %ebx
+ swapgs
+ .Lparanoid_kernel_gsbase:
+-
+ FENCE_SWAPGS_KERNEL_ENTRY
++.Lparanoid_gsbase_done:
++
++ /*
++ * Once we have CR3 and %GS setup save and set SPEC_CTRL. Just like
++ * CR3 above, keep the old value in a callee saved register.
++ */
++ IBRS_ENTER save_reg=%r15
++ UNTRAIN_RET
++
+ RET
+ SYM_CODE_END(paranoid_entry)
+
+@@ -953,9 +988,19 @@ SYM_CODE_END(paranoid_entry)
+ * 1 -> no SWAPGS on exit
+ *
+ * Y User space GSBASE, must be restored unconditionally
++ *
++ * R14 - old CR3
++ * R15 - old SPEC_CTRL
+ */
+ SYM_CODE_START_LOCAL(paranoid_exit)
+ UNWIND_HINT_REGS
++
++ /*
++ * Must restore IBRS state before both CR3 and %GS since we need access
++ * to the per-CPU x86_spec_ctrl_shadow variable.
++ */
++ IBRS_EXIT save_reg=%r15
++
+ /*
+ * The order of operations is important. RESTORE_CR3 requires
+ * kernel GSBASE.
+@@ -984,13 +1029,15 @@ SYM_CODE_START_LOCAL(paranoid_exit)
+ SYM_CODE_END(paranoid_exit)
+
+ /*
+- * Save all registers in pt_regs, and switch GS if needed.
++ * Switch GS and CR3 if needed.
+ */
+ SYM_CODE_START_LOCAL(error_entry)
+ UNWIND_HINT_FUNC
+ cld
++
+ PUSH_AND_CLEAR_REGS save_ret=1
+ ENCODE_FRAME_POINTER 8
++
+ testb $3, CS+8(%rsp)
+ jz .Lerror_kernelspace
+
+@@ -1002,15 +1049,14 @@ SYM_CODE_START_LOCAL(error_entry)
+ FENCE_SWAPGS_USER_ENTRY
+ /* We have user CR3. Change to kernel CR3. */
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
++ IBRS_ENTER
++ UNTRAIN_RET
+
++ leaq 8(%rsp), %rdi /* arg0 = pt_regs pointer */
+ .Lerror_entry_from_usermode_after_swapgs:
++
+ /* Put us onto the real thread stack. */
+- popq %r12 /* save return addr in %12 */
+- movq %rsp, %rdi /* arg0 = pt_regs pointer */
+ call sync_regs
+- movq %rax, %rsp /* switch stack */
+- ENCODE_FRAME_POINTER
+- pushq %r12
+ RET
+
+ /*
+@@ -1042,6 +1088,8 @@ SYM_CODE_START_LOCAL(error_entry)
+ */
+ .Lerror_entry_done_lfence:
+ FENCE_SWAPGS_KERNEL_ENTRY
++ leaq 8(%rsp), %rax /* return pt_regs pointer */
++ ANNOTATE_UNRET_END
+ RET
+
+ .Lbstep_iret:
+@@ -1057,14 +1105,16 @@ SYM_CODE_START_LOCAL(error_entry)
+ SWAPGS
+ FENCE_SWAPGS_USER_ENTRY
+ SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
++ IBRS_ENTER
++ UNTRAIN_RET
+
+ /*
+ * Pretend that the exception came from user mode: set up pt_regs
+ * as if we faulted immediately after IRET.
+ */
+- mov %rsp, %rdi
++ leaq 8(%rsp), %rdi /* arg0 = pt_regs pointer */
+ call fixup_bad_iret
+- mov %rax, %rsp
++ mov %rax, %rdi
+ jmp .Lerror_entry_from_usermode_after_swapgs
+ SYM_CODE_END(error_entry)
+
+@@ -1162,6 +1212,9 @@ SYM_CODE_START(asm_exc_nmi)
+ PUSH_AND_CLEAR_REGS rdx=(%rdx)
+ ENCODE_FRAME_POINTER
+
++ IBRS_ENTER
++ UNTRAIN_RET
++
+ /*
+ * At this point we no longer need to worry about stack damage
+ * due to nesting -- we're on the normal thread stack and we're
+@@ -1386,6 +1439,9 @@ end_repeat_nmi:
+ movq $-1, %rsi
+ call exc_nmi
+
++ /* Always restore stashed SPEC_CTRL value (see paranoid_entry) */
++ IBRS_EXIT save_reg=%r15
++
+ /* Always restore stashed CR3 value (see paranoid_entry) */
+ RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
+
+diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
+index 4fdb007cddbd1..4f479cdc7a408 100644
+--- a/arch/x86/entry/entry_64_compat.S
++++ b/arch/x86/entry/entry_64_compat.S
+@@ -4,7 +4,6 @@
+ *
+ * Copyright 2000-2002 Andi Kleen, SuSE Labs.
+ */
+-#include "calling.h"
+ #include <asm/asm-offsets.h>
+ #include <asm/current.h>
+ #include <asm/errno.h>
+@@ -14,9 +13,12 @@
+ #include <asm/irqflags.h>
+ #include <asm/asm.h>
+ #include <asm/smap.h>
++#include <asm/nospec-branch.h>
+ #include <linux/linkage.h>
+ #include <linux/err.h>
+
++#include "calling.h"
++
+ .section .entry.text, "ax"
+
+ /*
+@@ -47,7 +49,7 @@
+ * 0(%ebp) arg6
+ */
+ SYM_CODE_START(entry_SYSENTER_compat)
+- UNWIND_HINT_EMPTY
++ UNWIND_HINT_ENTRY
+ ENDBR
+ /* Interrupts are off on entry. */
+ SWAPGS
+@@ -113,6 +115,9 @@ SYM_INNER_LABEL(entry_SYSENTER_compat_after_hwframe, SYM_L_GLOBAL)
+
+ cld
+
++ IBRS_ENTER
++ UNTRAIN_RET
++
+ /*
+ * SYSENTER doesn't filter flags, so we need to clear NT and AC
+ * ourselves. To save a few cycles, we can check whether
+@@ -199,7 +204,7 @@ SYM_CODE_END(entry_SYSENTER_compat)
+ * 0(%esp) arg6
+ */
+ SYM_CODE_START(entry_SYSCALL_compat)
+- UNWIND_HINT_EMPTY
++ UNWIND_HINT_ENTRY
+ ENDBR
+ /* Interrupts are off on entry. */
+ swapgs
+@@ -256,6 +261,9 @@ SYM_INNER_LABEL(entry_SYSCALL_compat_after_hwframe, SYM_L_GLOBAL)
+
+ UNWIND_HINT_REGS
+
++ IBRS_ENTER
++ UNTRAIN_RET
++
+ movq %rsp, %rdi
+ call do_fast_syscall_32
+ /* XEN PV guests always use IRET path */
+@@ -270,6 +278,8 @@ sysret32_from_system_call:
+ */
+ STACKLEAK_ERASE
+
++ IBRS_EXIT
++
+ movq RBX(%rsp), %rbx /* pt_regs->rbx */
+ movq RBP(%rsp), %rbp /* pt_regs->rbp */
+ movq EFLAGS(%rsp), %r11 /* pt_regs->flags (in r11) */
+@@ -343,7 +353,7 @@ SYM_CODE_END(entry_SYSCALL_compat)
+ * ebp arg6
+ */
+ SYM_CODE_START(entry_INT80_compat)
+- UNWIND_HINT_EMPTY
++ UNWIND_HINT_ENTRY
+ ENDBR
+ /*
+ * Interrupts are off on entry.
+@@ -414,6 +424,9 @@ SYM_CODE_START(entry_INT80_compat)
+
+ cld
+
++ IBRS_ENTER
++ UNTRAIN_RET
++
+ movq %rsp, %rdi
+ call do_int80_syscall_32
+ jmp swapgs_restore_regs_and_return_to_usermode
+diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
+index 693f8b9031fb8..e893af5aa8f55 100644
+--- a/arch/x86/entry/vdso/Makefile
++++ b/arch/x86/entry/vdso/Makefile
+@@ -92,6 +92,7 @@ endif
+ endif
+
+ $(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
++$(vobjs): KBUILD_AFLAGS += -DBUILD_VDSO
+
+ #
+ # vDSO code runs in userspace and -pg doesn't help with profiling anyway.
+diff --git a/arch/x86/entry/vsyscall/vsyscall_emu_64.S b/arch/x86/entry/vsyscall/vsyscall_emu_64.S
+index 15e35159ebb68..ef2dd18272431 100644
+--- a/arch/x86/entry/vsyscall/vsyscall_emu_64.S
++++ b/arch/x86/entry/vsyscall/vsyscall_emu_64.S
+@@ -19,17 +19,20 @@ __vsyscall_page:
+
+ mov $__NR_gettimeofday, %rax
+ syscall
+- RET
++ ret
++ int3
+
+ .balign 1024, 0xcc
+ mov $__NR_time, %rax
+ syscall
+- RET
++ ret
++ int3
+
+ .balign 1024, 0xcc
+ mov $__NR_getcpu, %rax
+ syscall
+- RET
++ ret
++ int3
+
+ .balign 4096, 0xcc
+
+diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
+index 9b10c8c76087a..9542c582d546b 100644
+--- a/arch/x86/include/asm/alternative.h
++++ b/arch/x86/include/asm/alternative.h
+@@ -76,6 +76,7 @@ extern int alternatives_patched;
+ extern void alternative_instructions(void);
+ extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
+ extern void apply_retpolines(s32 *start, s32 *end);
++extern void apply_returns(s32 *start, s32 *end);
+ extern void apply_ibt_endbr(s32 *start, s32 *end);
+
+ struct module;
+diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
+index e17de69faa543..5d09ded0c491f 100644
+--- a/arch/x86/include/asm/cpufeatures.h
++++ b/arch/x86/include/asm/cpufeatures.h
+@@ -203,8 +203,8 @@
+ #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
+ /* FREE! ( 7*32+10) */
+ #define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */
+-#define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
+-#define X86_FEATURE_RETPOLINE_LFENCE ( 7*32+13) /* "" Use LFENCE for Spectre variant 2 */
++#define X86_FEATURE_KERNEL_IBRS ( 7*32+12) /* "" Set/clear IBRS on kernel entry/exit */
++#define X86_FEATURE_RSB_VMEXIT ( 7*32+13) /* "" Fill RSB on VM-Exit */
+ #define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
+ #define X86_FEATURE_CDP_L2 ( 7*32+15) /* Code and Data Prioritization L2 */
+ #define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */
+@@ -295,6 +295,12 @@
+ #define X86_FEATURE_PER_THREAD_MBA (11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
+ #define X86_FEATURE_SGX1 (11*32+ 8) /* "" Basic SGX */
+ #define X86_FEATURE_SGX2 (11*32+ 9) /* "" SGX Enclave Dynamic Memory Management (EDMM) */
++#define X86_FEATURE_ENTRY_IBPB (11*32+10) /* "" Issue an IBPB on kernel entry */
++#define X86_FEATURE_RRSBA_CTRL (11*32+11) /* "" RET prediction control */
++#define X86_FEATURE_RETPOLINE (11*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
++#define X86_FEATURE_RETPOLINE_LFENCE (11*32+13) /* "" Use LFENCE for Spectre variant 2 */
++#define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */
++#define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
+
+ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
+ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
+@@ -315,6 +321,7 @@
+ #define X86_FEATURE_VIRT_SSBD (13*32+25) /* Virtualized Speculative Store Bypass Disable */
+ #define X86_FEATURE_AMD_SSB_NO (13*32+26) /* "" Speculative Store Bypass is fixed in hardware. */
+ #define X86_FEATURE_CPPC (13*32+27) /* Collaborative Processor Performance Control */
++#define X86_FEATURE_BTC_NO (13*32+29) /* "" Not vulnerable to Branch Type Confusion */
+
+ /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
+ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
+@@ -444,5 +451,6 @@
+ #define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
+ #define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
+ #define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
++#define X86_BUG_RETBLEED X86_BUG(26) /* CPU is affected by RETBleed */
+
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
+index 1231d63f836d8..f7be189e97232 100644
+--- a/arch/x86/include/asm/disabled-features.h
++++ b/arch/x86/include/asm/disabled-features.h
+@@ -56,6 +56,25 @@
+ # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31))
+ #endif
+
++#ifdef CONFIG_RETPOLINE
++# define DISABLE_RETPOLINE 0
++#else
++# define DISABLE_RETPOLINE ((1 << (X86_FEATURE_RETPOLINE & 31)) | \
++ (1 << (X86_FEATURE_RETPOLINE_LFENCE & 31)))
++#endif
++
++#ifdef CONFIG_RETHUNK
++# define DISABLE_RETHUNK 0
++#else
++# define DISABLE_RETHUNK (1 << (X86_FEATURE_RETHUNK & 31))
++#endif
++
++#ifdef CONFIG_CPU_UNRET_ENTRY
++# define DISABLE_UNRET 0
++#else
++# define DISABLE_UNRET (1 << (X86_FEATURE_UNRET & 31))
++#endif
++
+ #ifdef CONFIG_INTEL_IOMMU_SVM
+ # define DISABLE_ENQCMD 0
+ #else
+@@ -82,7 +101,7 @@
+ #define DISABLED_MASK8 0
+ #define DISABLED_MASK9 (DISABLE_SMAP|DISABLE_SGX)
+ #define DISABLED_MASK10 0
+-#define DISABLED_MASK11 0
++#define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET)
+ #define DISABLED_MASK12 0
+ #define DISABLED_MASK13 0
+ #define DISABLED_MASK14 0
+diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h
+index 85865f1645bd3..73ca200498356 100644
+--- a/arch/x86/include/asm/linkage.h
++++ b/arch/x86/include/asm/linkage.h
+@@ -19,19 +19,27 @@
+ #define __ALIGN_STR __stringify(__ALIGN)
+ #endif
+
++#if defined(CONFIG_RETHUNK) && !defined(__DISABLE_EXPORTS) && !defined(BUILD_VDSO)
++#define RET jmp __x86_return_thunk
++#else /* CONFIG_RETPOLINE */
+ #ifdef CONFIG_SLS
+ #define RET ret; int3
+ #else
+ #define RET ret
+ #endif
++#endif /* CONFIG_RETPOLINE */
+
+ #else /* __ASSEMBLY__ */
+
++#if defined(CONFIG_RETHUNK) && !defined(__DISABLE_EXPORTS) && !defined(BUILD_VDSO)
++#define ASM_RET "jmp __x86_return_thunk\n\t"
++#else /* CONFIG_RETPOLINE */
+ #ifdef CONFIG_SLS
+ #define ASM_RET "ret; int3\n\t"
+ #else
+ #define ASM_RET "ret\n\t"
+ #endif
++#endif /* CONFIG_RETPOLINE */
+
+ #endif /* __ASSEMBLY__ */
+
+diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
+index 4425d6773183b..ad084326f24c2 100644
+--- a/arch/x86/include/asm/msr-index.h
++++ b/arch/x86/include/asm/msr-index.h
+@@ -51,6 +51,8 @@
+ #define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
+ #define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
+ #define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
++#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */
++#define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
+
+ #define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
+ #define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */
+@@ -91,6 +93,7 @@
+ #define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
+ #define ARCH_CAP_RDCL_NO BIT(0) /* Not susceptible to Meltdown */
+ #define ARCH_CAP_IBRS_ALL BIT(1) /* Enhanced IBRS support */
++#define ARCH_CAP_RSBA BIT(2) /* RET may use alternative branch predictors */
+ #define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH BIT(3) /* Skip L1D flush on vmentry */
+ #define ARCH_CAP_SSB_NO BIT(4) /*
+ * Not susceptible to Speculative Store Bypass
+@@ -138,6 +141,13 @@
+ * bit available to control VERW
+ * behavior.
+ */
++#define ARCH_CAP_RRSBA BIT(19) /*
++ * Indicates RET may use predictors
++ * other than the RSB. With eIBRS
++ * enabled predictions in kernel mode
++ * are restricted to targets in
++ * kernel.
++ */
+
+ #define MSR_IA32_FLUSH_CMD 0x0000010b
+ #define L1D_FLUSH BIT(0) /*
+@@ -552,6 +562,9 @@
+ /* Fam 17h MSRs */
+ #define MSR_F17H_IRPERF 0xc00000e9
+
++#define MSR_ZEN2_SPECTRAL_CHICKEN 0xc00110e3
++#define MSR_ZEN2_SPECTRAL_CHICKEN_BIT BIT_ULL(1)
++
+ /* Fam 16h MSRs */
+ #define MSR_F16H_L2I_PERF_CTL 0xc0010230
+ #define MSR_F16H_L2I_PERF_CTR 0xc0010231
+diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
+index da251a5645b0e..10a3bfc1eb230 100644
+--- a/arch/x86/include/asm/nospec-branch.h
++++ b/arch/x86/include/asm/nospec-branch.h
+@@ -11,6 +11,7 @@
+ #include <asm/cpufeatures.h>
+ #include <asm/msr-index.h>
+ #include <asm/unwind_hints.h>
++#include <asm/percpu.h>
+
+ #define RETPOLINE_THUNK_SIZE 32
+
+@@ -75,6 +76,23 @@
+ .popsection
+ .endm
+
++/*
++ * (ab)use RETPOLINE_SAFE on RET to annotate away 'bare' RET instructions
++ * vs RETBleed validation.
++ */
++#define ANNOTATE_UNRET_SAFE ANNOTATE_RETPOLINE_SAFE
++
++/*
++ * Abuse ANNOTATE_RETPOLINE_SAFE on a NOP to indicate UNRET_END, should
++ * eventually turn into it's own annotation.
++ */
++.macro ANNOTATE_UNRET_END
++#ifdef CONFIG_DEBUG_ENTRY
++ ANNOTATE_RETPOLINE_SAFE
++ nop
++#endif
++.endm
++
+ /*
+ * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
+ * indirect jmp/call which may be susceptible to the Spectre variant 2
+@@ -105,10 +123,34 @@
+ * monstrosity above, manually.
+ */
+ .macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
+-#ifdef CONFIG_RETPOLINE
+ ALTERNATIVE "jmp .Lskip_rsb_\@", "", \ftr
+ __FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)
+ .Lskip_rsb_\@:
++.endm
++
++#ifdef CONFIG_CPU_UNRET_ENTRY
++#define CALL_ZEN_UNTRAIN_RET "call zen_untrain_ret"
++#else
++#define CALL_ZEN_UNTRAIN_RET ""
++#endif
++
++/*
++ * Mitigate RETBleed for AMD/Hygon Zen uarch. Requires KERNEL CR3 because the
++ * return thunk isn't mapped into the userspace tables (then again, AMD
++ * typically has NO_MELTDOWN).
++ *
++ * While zen_untrain_ret() doesn't clobber anything but requires stack,
++ * entry_ibpb() will clobber AX, CX, DX.
++ *
++ * As such, this must be placed after every *SWITCH_TO_KERNEL_CR3 at a point
++ * where we have a stack but before any RET instruction.
++ */
++.macro UNTRAIN_RET
++#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY)
++ ANNOTATE_UNRET_END
++ ALTERNATIVE_2 "", \
++ CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \
++ "call entry_ibpb", X86_FEATURE_ENTRY_IBPB
+ #endif
+ .endm
+
+@@ -120,17 +162,20 @@
+ _ASM_PTR " 999b\n\t" \
+ ".popsection\n\t"
+
+-#ifdef CONFIG_RETPOLINE
+-
+ typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE];
++extern retpoline_thunk_t __x86_indirect_thunk_array[];
++
++extern void __x86_return_thunk(void);
++extern void zen_untrain_ret(void);
++extern void entry_ibpb(void);
++
++#ifdef CONFIG_RETPOLINE
+
+ #define GEN(reg) \
+ extern retpoline_thunk_t __x86_indirect_thunk_ ## reg;
+ #include <asm/GEN-for-each-reg.h>
+ #undef GEN
+
+-extern retpoline_thunk_t __x86_indirect_thunk_array[];
+-
+ #ifdef CONFIG_X86_64
+
+ /*
+@@ -193,6 +238,7 @@ enum spectre_v2_mitigation {
+ SPECTRE_V2_EIBRS,
+ SPECTRE_V2_EIBRS_RETPOLINE,
+ SPECTRE_V2_EIBRS_LFENCE,
++ SPECTRE_V2_IBRS,
+ };
+
+ /* The indirect branch speculation control variants */
+@@ -235,6 +281,9 @@ static inline void indirect_branch_prediction_barrier(void)
+
+ /* The Intel SPEC CTRL MSR base value cache */
+ extern u64 x86_spec_ctrl_base;
++DECLARE_PER_CPU(u64, x86_spec_ctrl_current);
++extern void write_spec_ctrl_current(u64 val, bool force);
++extern u64 spec_ctrl_current(void);
+
+ /*
+ * With retpoline, we must use IBRS to restrict branch prediction
+@@ -244,18 +293,16 @@ extern u64 x86_spec_ctrl_base;
+ */
+ #define firmware_restrict_branch_speculation_start() \
+ do { \
+- u64 val = x86_spec_ctrl_base | SPEC_CTRL_IBRS; \
+- \
+ preempt_disable(); \
+- alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \
++ alternative_msr_write(MSR_IA32_SPEC_CTRL, \
++ spec_ctrl_current() | SPEC_CTRL_IBRS, \
+ X86_FEATURE_USE_IBRS_FW); \
+ } while (0)
+
+ #define firmware_restrict_branch_speculation_end() \
+ do { \
+- u64 val = x86_spec_ctrl_base; \
+- \
+- alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \
++ alternative_msr_write(MSR_IA32_SPEC_CTRL, \
++ spec_ctrl_current(), \
+ X86_FEATURE_USE_IBRS_FW); \
+ preempt_enable(); \
+ } while (0)
+diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h
+index 2d8dacd026437..343b722ccaf21 100644
+--- a/arch/x86/include/asm/static_call.h
++++ b/arch/x86/include/asm/static_call.h
+@@ -21,6 +21,16 @@
+ * relative displacement across sections.
+ */
+
++/*
++ * The trampoline is 8 bytes and of the general form:
++ *
++ * jmp.d32 \func
++ * ud1 %esp, %ecx
++ *
++ * That trailing #UD provides both a speculation stop and serves as a unique
++ * 3 byte signature identifying static call trampolines. Also see tramp_ud[]
++ * and __static_call_fixup().
++ */
+ #define __ARCH_DEFINE_STATIC_CALL_TRAMP(name, insns) \
+ asm(".pushsection .static_call.text, \"ax\" \n" \
+ ".align 4 \n" \
+@@ -28,7 +38,7 @@
+ STATIC_CALL_TRAMP_STR(name) ": \n" \
+ ANNOTATE_NOENDBR \
+ insns " \n" \
+- ".byte 0x53, 0x43, 0x54 \n" \
++ ".byte 0x0f, 0xb9, 0xcc \n" \
+ ".type " STATIC_CALL_TRAMP_STR(name) ", @function \n" \
+ ".size " STATIC_CALL_TRAMP_STR(name) ", . - " STATIC_CALL_TRAMP_STR(name) " \n" \
+ ".popsection \n")
+@@ -36,8 +46,13 @@
+ #define ARCH_DEFINE_STATIC_CALL_TRAMP(name, func) \
+ __ARCH_DEFINE_STATIC_CALL_TRAMP(name, ".byte 0xe9; .long " #func " - (. + 4)")
+
++#ifdef CONFIG_RETHUNK
++#define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \
++ __ARCH_DEFINE_STATIC_CALL_TRAMP(name, "jmp __x86_return_thunk")
++#else
+ #define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \
+ __ARCH_DEFINE_STATIC_CALL_TRAMP(name, "ret; int3; nop; nop; nop")
++#endif
+
+ #define ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name) \
+ ARCH_DEFINE_STATIC_CALL_TRAMP(name, __static_call_return0)
+@@ -48,4 +63,6 @@
+ ".long " STATIC_CALL_KEY_STR(name) " - . \n" \
+ ".popsection \n")
+
++extern bool __static_call_fixup(void *tramp, u8 op, void *dest);
++
+ #endif /* _ASM_STATIC_CALL_H */
+diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
+index 35317c5c551d9..47ecfff2c83da 100644
+--- a/arch/x86/include/asm/traps.h
++++ b/arch/x86/include/asm/traps.h
+@@ -13,7 +13,7 @@
+ #ifdef CONFIG_X86_64
+ asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs);
+ asmlinkage __visible notrace
+-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s);
++struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
+ void __init trap_init(void);
+ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
+ #endif
+diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
+index 8b33674288ea7..f66fbe6537dd7 100644
+--- a/arch/x86/include/asm/unwind_hints.h
++++ b/arch/x86/include/asm/unwind_hints.h
+@@ -8,7 +8,11 @@
+ #ifdef __ASSEMBLY__
+
+ .macro UNWIND_HINT_EMPTY
+- UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL end=1
++ UNWIND_HINT type=UNWIND_HINT_TYPE_CALL end=1
++.endm
++
++.macro UNWIND_HINT_ENTRY
++ UNWIND_HINT type=UNWIND_HINT_TYPE_ENTRY end=1
+ .endm
+
+ .macro UNWIND_HINT_REGS base=%rsp offset=0 indirect=0 extra=1 partial=0
+@@ -52,6 +56,14 @@
+ UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=8 type=UNWIND_HINT_TYPE_FUNC
+ .endm
+
++.macro UNWIND_HINT_SAVE
++ UNWIND_HINT type=UNWIND_HINT_TYPE_SAVE
++.endm
++
++.macro UNWIND_HINT_RESTORE
++ UNWIND_HINT type=UNWIND_HINT_TYPE_RESTORE
++.endm
++
+ #else
+
+ #define UNWIND_HINT_FUNC \
+diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
+index d374cb3cf024c..46427b785bc89 100644
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -115,6 +115,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len)
+ }
+
+ extern s32 __retpoline_sites[], __retpoline_sites_end[];
++extern s32 __return_sites[], __return_sites_end[];
+ extern s32 __ibt_endbr_seal[], __ibt_endbr_seal_end[];
+ extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
+ extern s32 __smp_locks[], __smp_locks_end[];
+@@ -507,9 +508,76 @@ void __init_or_module noinline apply_retpolines(s32 *start, s32 *end)
+ }
+ }
+
++#ifdef CONFIG_RETHUNK
++/*
++ * Rewrite the compiler generated return thunk tail-calls.
++ *
++ * For example, convert:
++ *
++ * JMP __x86_return_thunk
++ *
++ * into:
++ *
++ * RET
++ */
++static int patch_return(void *addr, struct insn *insn, u8 *bytes)
++{
++ int i = 0;
++
++ if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
++ return -1;
++
++ bytes[i++] = RET_INSN_OPCODE;
++
++ for (; i < insn->length;)
++ bytes[i++] = INT3_INSN_OPCODE;
++
++ return i;
++}
++
++void __init_or_module noinline apply_returns(s32 *start, s32 *end)
++{
++ s32 *s;
++
++ for (s = start; s < end; s++) {
++ void *dest = NULL, *addr = (void *)s + *s;
++ struct insn insn;
++ int len, ret;
++ u8 bytes[16];
++ u8 op;
++
++ ret = insn_decode_kernel(&insn, addr);
++ if (WARN_ON_ONCE(ret < 0))
++ continue;
++
++ op = insn.opcode.bytes[0];
++ if (op == JMP32_INSN_OPCODE)
++ dest = addr + insn.length + insn.immediate.value;
++
++ if (__static_call_fixup(addr, op, dest) ||
++ WARN_ON_ONCE(dest != &__x86_return_thunk))
++ continue;
++
++ DPRINTK("return thunk at: %pS (%px) len: %d to: %pS",
++ addr, addr, insn.length,
++ addr + insn.length + insn.immediate.value);
++
++ len = patch_return(addr, &insn, bytes);
++ if (len == insn.length) {
++ DUMP_BYTES(((u8*)addr), len, "%px: orig: ", addr);
++ DUMP_BYTES(((u8*)bytes), len, "%px: repl: ", addr);
++ text_poke_early(addr, bytes, len);
++ }
++ }
++}
++#else
++void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
++#endif /* CONFIG_RETHUNK */
++
+ #else /* !RETPOLINES || !CONFIG_STACK_VALIDATION */
+
+ void __init_or_module noinline apply_retpolines(s32 *start, s32 *end) { }
++void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
+
+ #endif /* CONFIG_RETPOLINE && CONFIG_STACK_VALIDATION */
+
+@@ -860,6 +928,7 @@ void __init alternative_instructions(void)
+ * those can rewrite the retpoline thunks.
+ */
+ apply_retpolines(__retpoline_sites, __retpoline_sites_end);
++ apply_returns(__return_sites, __return_sites_end);
+
+ /*
+ * Then patch alternatives, such that those paravirt calls that are in
+diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
+index 9fb0a2f8b62a2..6434ea9413485 100644
+--- a/arch/x86/kernel/asm-offsets.c
++++ b/arch/x86/kernel/asm-offsets.c
+@@ -18,6 +18,7 @@
+ #include <asm/bootparam.h>
+ #include <asm/suspend.h>
+ #include <asm/tlbflush.h>
++#include "../kvm/vmx/vmx.h"
+
+ #ifdef CONFIG_XEN
+ #include <xen/interface/xen.h>
+@@ -90,4 +91,9 @@ static void __used common(void)
+ OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
+ OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
+ OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
++
++ if (IS_ENABLED(CONFIG_KVM_INTEL)) {
++ BLANK();
++ OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
++ }
+ }
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index 0c0b09796ced3..35d5288394cb8 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -862,6 +862,28 @@ static void init_amd_bd(struct cpuinfo_x86 *c)
+ clear_rdrand_cpuid_bit(c);
+ }
+
++void init_spectral_chicken(struct cpuinfo_x86 *c)
++{
++#ifdef CONFIG_CPU_UNRET_ENTRY
++ u64 value;
++
++ /*
++ * On Zen2 we offer this chicken (bit) on the altar of Speculation.
++ *
++ * This suppresses speculation from the middle of a basic block, i.e. it
++ * suppresses non-branch predictions.
++ *
++ * We use STIBP as a heuristic to filter out Zen2 from the rest of F17H
++ */
++ if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && cpu_has(c, X86_FEATURE_AMD_STIBP)) {
++ if (!rdmsrl_safe(MSR_ZEN2_SPECTRAL_CHICKEN, &value)) {
++ value |= MSR_ZEN2_SPECTRAL_CHICKEN_BIT;
++ wrmsrl_safe(MSR_ZEN2_SPECTRAL_CHICKEN, value);
++ }
++ }
++#endif
++}
++
+ static void init_amd_zn(struct cpuinfo_x86 *c)
+ {
+ set_cpu_cap(c, X86_FEATURE_ZEN);
+@@ -870,12 +892,21 @@ static void init_amd_zn(struct cpuinfo_x86 *c)
+ node_reclaim_distance = 32;
+ #endif
+
+- /*
+- * Fix erratum 1076: CPB feature bit not being set in CPUID.
+- * Always set it, except when running under a hypervisor.
+- */
+- if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !cpu_has(c, X86_FEATURE_CPB))
+- set_cpu_cap(c, X86_FEATURE_CPB);
++ /* Fix up CPUID bits, but only if not virtualised. */
++ if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) {
++
++ /* Erratum 1076: CPB feature bit not being set in CPUID. */
++ if (!cpu_has(c, X86_FEATURE_CPB))
++ set_cpu_cap(c, X86_FEATURE_CPB);
++
++ /*
++ * Zen3 (Fam19 model < 0x10) parts are not susceptible to
++ * Branch Type Confusion, but predate the allocation of the
++ * BTC_NO bit.
++ */
++ if (c->x86 == 0x19 && !cpu_has(c, X86_FEATURE_BTC_NO))
++ set_cpu_cap(c, X86_FEATURE_BTC_NO);
++ }
+ }
+
+ static void init_amd(struct cpuinfo_x86 *c)
+@@ -907,7 +938,8 @@ static void init_amd(struct cpuinfo_x86 *c)
+ case 0x12: init_amd_ln(c); break;
+ case 0x15: init_amd_bd(c); break;
+ case 0x16: init_amd_jg(c); break;
+- case 0x17: fallthrough;
++ case 0x17: init_spectral_chicken(c);
++ fallthrough;
+ case 0x19: init_amd_zn(c); break;
+ }
+
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index a8a9f64063315..0b64e894b3838 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -38,6 +38,8 @@
+
+ static void __init spectre_v1_select_mitigation(void);
+ static void __init spectre_v2_select_mitigation(void);
++static void __init retbleed_select_mitigation(void);
++static void __init spectre_v2_user_select_mitigation(void);
+ static void __init ssb_select_mitigation(void);
+ static void __init l1tf_select_mitigation(void);
+ static void __init mds_select_mitigation(void);
+@@ -48,16 +50,40 @@ static void __init mmio_select_mitigation(void);
+ static void __init srbds_select_mitigation(void);
+ static void __init l1d_flush_select_mitigation(void);
+
+-/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
++/* The base value of the SPEC_CTRL MSR without task-specific bits set */
+ u64 x86_spec_ctrl_base;
+ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
++
++/* The current value of the SPEC_CTRL MSR with task-specific bits set */
++DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
++EXPORT_SYMBOL_GPL(x86_spec_ctrl_current);
++
+ static DEFINE_MUTEX(spec_ctrl_mutex);
+
+ /*
+- * The vendor and possibly platform specific bits which can be modified in
+- * x86_spec_ctrl_base.
++ * Keep track of the SPEC_CTRL MSR value for the current task, which may differ
++ * from x86_spec_ctrl_base due to STIBP/SSB in __speculation_ctrl_update().
+ */
+-static u64 __ro_after_init x86_spec_ctrl_mask = SPEC_CTRL_IBRS;
++void write_spec_ctrl_current(u64 val, bool force)
++{
++ if (this_cpu_read(x86_spec_ctrl_current) == val)
++ return;
++
++ this_cpu_write(x86_spec_ctrl_current, val);
++
++ /*
++ * When KERNEL_IBRS this MSR is written on return-to-user, unless
++ * forced the update can be delayed until that time.
++ */
++ if (force || !cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS))
++ wrmsrl(MSR_IA32_SPEC_CTRL, val);
++}
++
++u64 spec_ctrl_current(void)
++{
++ return this_cpu_read(x86_spec_ctrl_current);
++}
++EXPORT_SYMBOL_GPL(spec_ctrl_current);
+
+ /*
+ * AMD specific MSR info for Speculative Store Bypass control.
+@@ -114,13 +140,21 @@ void __init check_bugs(void)
+ if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL))
+ rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+
+- /* Allow STIBP in MSR_SPEC_CTRL if supported */
+- if (boot_cpu_has(X86_FEATURE_STIBP))
+- x86_spec_ctrl_mask |= SPEC_CTRL_STIBP;
+-
+ /* Select the proper CPU mitigations before patching alternatives: */
+ spectre_v1_select_mitigation();
+ spectre_v2_select_mitigation();
++ /*
++ * retbleed_select_mitigation() relies on the state set by
++ * spectre_v2_select_mitigation(); specifically it wants to know about
++ * spectre_v2=ibrs.
++ */
++ retbleed_select_mitigation();
++ /*
++ * spectre_v2_user_select_mitigation() relies on the state set by
++ * retbleed_select_mitigation(); specifically the STIBP selection is
++ * forced for UNRET.
++ */
++ spectre_v2_user_select_mitigation();
+ ssb_select_mitigation();
+ l1tf_select_mitigation();
+ md_clear_select_mitigation();
+@@ -161,31 +195,17 @@ void __init check_bugs(void)
+ #endif
+ }
+
++/*
++ * NOTE: This function is *only* called for SVM. VMX spec_ctrl handling is
++ * done in vmenter.S.
++ */
+ void
+ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
+ {
+- u64 msrval, guestval, hostval = x86_spec_ctrl_base;
++ u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
+ struct thread_info *ti = current_thread_info();
+
+- /* Is MSR_SPEC_CTRL implemented ? */
+ if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
+- /*
+- * Restrict guest_spec_ctrl to supported values. Clear the
+- * modifiable bits in the host base value and or the
+- * modifiable bits from the guest value.
+- */
+- guestval = hostval & ~x86_spec_ctrl_mask;
+- guestval |= guest_spec_ctrl & x86_spec_ctrl_mask;
+-
+- /* SSBD controlled in MSR_SPEC_CTRL */
+- if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+- static_cpu_has(X86_FEATURE_AMD_SSBD))
+- hostval |= ssbd_tif_to_spec_ctrl(ti->flags);
+-
+- /* Conditional STIBP enabled? */
+- if (static_branch_unlikely(&switch_to_cond_stibp))
+- hostval |= stibp_tif_to_spec_ctrl(ti->flags);
+-
+ if (hostval != guestval) {
+ msrval = setguest ? guestval : hostval;
+ wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
+@@ -745,12 +765,180 @@ static int __init nospectre_v1_cmdline(char *str)
+ }
+ early_param("nospectre_v1", nospectre_v1_cmdline);
+
+-#undef pr_fmt
+-#define pr_fmt(fmt) "Spectre V2 : " fmt
+-
+ static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
+ SPECTRE_V2_NONE;
+
++#undef pr_fmt
++#define pr_fmt(fmt) "RETBleed: " fmt
++
++enum retbleed_mitigation {
++ RETBLEED_MITIGATION_NONE,
++ RETBLEED_MITIGATION_UNRET,
++ RETBLEED_MITIGATION_IBPB,
++ RETBLEED_MITIGATION_IBRS,
++ RETBLEED_MITIGATION_EIBRS,
++};
++
++enum retbleed_mitigation_cmd {
++ RETBLEED_CMD_OFF,
++ RETBLEED_CMD_AUTO,
++ RETBLEED_CMD_UNRET,
++ RETBLEED_CMD_IBPB,
++};
++
++const char * const retbleed_strings[] = {
++ [RETBLEED_MITIGATION_NONE] = "Vulnerable",
++ [RETBLEED_MITIGATION_UNRET] = "Mitigation: untrained return thunk",
++ [RETBLEED_MITIGATION_IBPB] = "Mitigation: IBPB",
++ [RETBLEED_MITIGATION_IBRS] = "Mitigation: IBRS",
++ [RETBLEED_MITIGATION_EIBRS] = "Mitigation: Enhanced IBRS",
++};
++
++static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
++ RETBLEED_MITIGATION_NONE;
++static enum retbleed_mitigation_cmd retbleed_cmd __ro_after_init =
++ RETBLEED_CMD_AUTO;
++
++static int __ro_after_init retbleed_nosmt = false;
++
++static int __init retbleed_parse_cmdline(char *str)
++{
++ if (!str)
++ return -EINVAL;
++
++ while (str) {
++ char *next = strchr(str, ',');
++ if (next) {
++ *next = 0;
++ next++;
++ }
++
++ if (!strcmp(str, "off")) {
++ retbleed_cmd = RETBLEED_CMD_OFF;
++ } else if (!strcmp(str, "auto")) {
++ retbleed_cmd = RETBLEED_CMD_AUTO;
++ } else if (!strcmp(str, "unret")) {
++ retbleed_cmd = RETBLEED_CMD_UNRET;
++ } else if (!strcmp(str, "ibpb")) {
++ retbleed_cmd = RETBLEED_CMD_IBPB;
++ } else if (!strcmp(str, "nosmt")) {
++ retbleed_nosmt = true;
++ } else {
++ pr_err("Ignoring unknown retbleed option (%s).", str);
++ }
++
++ str = next;
++ }
++
++ return 0;
++}
++early_param("retbleed", retbleed_parse_cmdline);
++
++#define RETBLEED_UNTRAIN_MSG "WARNING: BTB untrained return thunk mitigation is only effective on AMD/Hygon!\n"
++#define RETBLEED_INTEL_MSG "WARNING: Spectre v2 mitigation leaves CPU vulnerable to RETBleed attacks, data leaks possible!\n"
++
++static void __init retbleed_select_mitigation(void)
++{
++ bool mitigate_smt = false;
++
++ if (!boot_cpu_has_bug(X86_BUG_RETBLEED) || cpu_mitigations_off())
++ return;
++
++ switch (retbleed_cmd) {
++ case RETBLEED_CMD_OFF:
++ return;
++
++ case RETBLEED_CMD_UNRET:
++ if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY)) {
++ retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
++ } else {
++ pr_err("WARNING: kernel not compiled with CPU_UNRET_ENTRY.\n");
++ goto do_cmd_auto;
++ }
++ break;
++
++ case RETBLEED_CMD_IBPB:
++ if (!boot_cpu_has(X86_FEATURE_IBPB)) {
++ pr_err("WARNING: CPU does not support IBPB.\n");
++ goto do_cmd_auto;
++ } else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY)) {
++ retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
++ } else {
++ pr_err("WARNING: kernel not compiled with CPU_IBPB_ENTRY.\n");
++ goto do_cmd_auto;
++ }
++ break;
++
++do_cmd_auto:
++ case RETBLEED_CMD_AUTO:
++ default:
++ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
++ boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
++ if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY))
++ retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
++ else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY) && boot_cpu_has(X86_FEATURE_IBPB))
++ retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
++ }
++
++ /*
++ * The Intel mitigation (IBRS or eIBRS) was already selected in
++ * spectre_v2_select_mitigation(). 'retbleed_mitigation' will
++ * be set accordingly below.
++ */
++
++ break;
++ }
++
++ switch (retbleed_mitigation) {
++ case RETBLEED_MITIGATION_UNRET:
++ setup_force_cpu_cap(X86_FEATURE_RETHUNK);
++ setup_force_cpu_cap(X86_FEATURE_UNRET);
++
++ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
++ boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
++ pr_err(RETBLEED_UNTRAIN_MSG);
++
++ mitigate_smt = true;
++ break;
++
++ case RETBLEED_MITIGATION_IBPB:
++ setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
++ mitigate_smt = true;
++ break;
++
++ default:
++ break;
++ }
++
++ if (mitigate_smt && !boot_cpu_has(X86_FEATURE_STIBP) &&
++ (retbleed_nosmt || cpu_mitigations_auto_nosmt()))
++ cpu_smt_disable(false);
++
++ /*
++ * Let IBRS trump all on Intel without affecting the effects of the
++ * retbleed= cmdline option.
++ */
++ if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
++ switch (spectre_v2_enabled) {
++ case SPECTRE_V2_IBRS:
++ retbleed_mitigation = RETBLEED_MITIGATION_IBRS;
++ break;
++ case SPECTRE_V2_EIBRS:
++ case SPECTRE_V2_EIBRS_RETPOLINE:
++ case SPECTRE_V2_EIBRS_LFENCE:
++ retbleed_mitigation = RETBLEED_MITIGATION_EIBRS;
++ break;
++ default:
++ pr_err(RETBLEED_INTEL_MSG);
++ }
++ }
++
++ pr_info("%s\n", retbleed_strings[retbleed_mitigation]);
++}
++
++#undef pr_fmt
++#define pr_fmt(fmt) "Spectre V2 : " fmt
++
+ static enum spectre_v2_user_mitigation spectre_v2_user_stibp __ro_after_init =
+ SPECTRE_V2_USER_NONE;
+ static enum spectre_v2_user_mitigation spectre_v2_user_ibpb __ro_after_init =
+@@ -821,6 +1009,7 @@ enum spectre_v2_mitigation_cmd {
+ SPECTRE_V2_CMD_EIBRS,
+ SPECTRE_V2_CMD_EIBRS_RETPOLINE,
+ SPECTRE_V2_CMD_EIBRS_LFENCE,
++ SPECTRE_V2_CMD_IBRS,
+ };
+
+ enum spectre_v2_user_cmd {
+@@ -861,13 +1050,15 @@ static void __init spec_v2_user_print_cond(const char *reason, bool secure)
+ pr_info("spectre_v2_user=%s forced on command line.\n", reason);
+ }
+
++static __ro_after_init enum spectre_v2_mitigation_cmd spectre_v2_cmd;
++
+ static enum spectre_v2_user_cmd __init
+-spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
++spectre_v2_parse_user_cmdline(void)
+ {
+ char arg[20];
+ int ret, i;
+
+- switch (v2_cmd) {
++ switch (spectre_v2_cmd) {
+ case SPECTRE_V2_CMD_NONE:
+ return SPECTRE_V2_USER_CMD_NONE;
+ case SPECTRE_V2_CMD_FORCE:
+@@ -893,15 +1084,16 @@ spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
+ return SPECTRE_V2_USER_CMD_AUTO;
+ }
+
+-static inline bool spectre_v2_in_eibrs_mode(enum spectre_v2_mitigation mode)
++static inline bool spectre_v2_in_ibrs_mode(enum spectre_v2_mitigation mode)
+ {
+- return (mode == SPECTRE_V2_EIBRS ||
+- mode == SPECTRE_V2_EIBRS_RETPOLINE ||
+- mode == SPECTRE_V2_EIBRS_LFENCE);
++ return mode == SPECTRE_V2_IBRS ||
++ mode == SPECTRE_V2_EIBRS ||
++ mode == SPECTRE_V2_EIBRS_RETPOLINE ||
++ mode == SPECTRE_V2_EIBRS_LFENCE;
+ }
+
+ static void __init
+-spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
++spectre_v2_user_select_mitigation(void)
+ {
+ enum spectre_v2_user_mitigation mode = SPECTRE_V2_USER_NONE;
+ bool smt_possible = IS_ENABLED(CONFIG_SMP);
+@@ -914,7 +1106,7 @@ spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
+ cpu_smt_control == CPU_SMT_NOT_SUPPORTED)
+ smt_possible = false;
+
+- cmd = spectre_v2_parse_user_cmdline(v2_cmd);
++ cmd = spectre_v2_parse_user_cmdline();
+ switch (cmd) {
+ case SPECTRE_V2_USER_CMD_NONE:
+ goto set_mode;
+@@ -962,12 +1154,12 @@ spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
+ }
+
+ /*
+- * If no STIBP, enhanced IBRS is enabled or SMT impossible, STIBP is not
+- * required.
++ * If no STIBP, IBRS or enhanced IBRS is enabled, or SMT impossible,
++ * STIBP is not required.
+ */
+ if (!boot_cpu_has(X86_FEATURE_STIBP) ||
+ !smt_possible ||
+- spectre_v2_in_eibrs_mode(spectre_v2_enabled))
++ spectre_v2_in_ibrs_mode(spectre_v2_enabled))
+ return;
+
+ /*
+@@ -979,6 +1171,13 @@ spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
+ boot_cpu_has(X86_FEATURE_AMD_STIBP_ALWAYS_ON))
+ mode = SPECTRE_V2_USER_STRICT_PREFERRED;
+
++ if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET) {
++ if (mode != SPECTRE_V2_USER_STRICT &&
++ mode != SPECTRE_V2_USER_STRICT_PREFERRED)
++ pr_info("Selecting STIBP always-on mode to complement retbleed mitigation\n");
++ mode = SPECTRE_V2_USER_STRICT_PREFERRED;
++ }
++
+ spectre_v2_user_stibp = mode;
+
+ set_mode:
+@@ -992,6 +1191,7 @@ static const char * const spectre_v2_strings[] = {
+ [SPECTRE_V2_EIBRS] = "Mitigation: Enhanced IBRS",
+ [SPECTRE_V2_EIBRS_LFENCE] = "Mitigation: Enhanced IBRS + LFENCE",
+ [SPECTRE_V2_EIBRS_RETPOLINE] = "Mitigation: Enhanced IBRS + Retpolines",
++ [SPECTRE_V2_IBRS] = "Mitigation: IBRS",
+ };
+
+ static const struct {
+@@ -1009,6 +1209,7 @@ static const struct {
+ { "eibrs,lfence", SPECTRE_V2_CMD_EIBRS_LFENCE, false },
+ { "eibrs,retpoline", SPECTRE_V2_CMD_EIBRS_RETPOLINE, false },
+ { "auto", SPECTRE_V2_CMD_AUTO, false },
++ { "ibrs", SPECTRE_V2_CMD_IBRS, false },
+ };
+
+ static void __init spec_v2_print_cond(const char *reason, bool secure)
+@@ -1071,6 +1272,30 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
+ return SPECTRE_V2_CMD_AUTO;
+ }
+
++ if (cmd == SPECTRE_V2_CMD_IBRS && !IS_ENABLED(CONFIG_CPU_IBRS_ENTRY)) {
++ pr_err("%s selected but not compiled in. Switching to AUTO select\n",
++ mitigation_options[i].option);
++ return SPECTRE_V2_CMD_AUTO;
++ }
++
++ if (cmd == SPECTRE_V2_CMD_IBRS && boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) {
++ pr_err("%s selected but not Intel CPU. Switching to AUTO select\n",
++ mitigation_options[i].option);
++ return SPECTRE_V2_CMD_AUTO;
++ }
++
++ if (cmd == SPECTRE_V2_CMD_IBRS && !boot_cpu_has(X86_FEATURE_IBRS)) {
++ pr_err("%s selected but CPU doesn't have IBRS. Switching to AUTO select\n",
++ mitigation_options[i].option);
++ return SPECTRE_V2_CMD_AUTO;
++ }
++
++ if (cmd == SPECTRE_V2_CMD_IBRS && boot_cpu_has(X86_FEATURE_XENPV)) {
++ pr_err("%s selected but running as XenPV guest. Switching to AUTO select\n",
++ mitigation_options[i].option);
++ return SPECTRE_V2_CMD_AUTO;
++ }
++
+ spec_v2_print_cond(mitigation_options[i].option,
+ mitigation_options[i].secure);
+ return cmd;
+@@ -1086,6 +1311,22 @@ static enum spectre_v2_mitigation __init spectre_v2_select_retpoline(void)
+ return SPECTRE_V2_RETPOLINE;
+ }
+
++/* Disable in-kernel use of non-RSB RET predictors */
++static void __init spec_ctrl_disable_kernel_rrsba(void)
++{
++ u64 ia32_cap;
++
++ if (!boot_cpu_has(X86_FEATURE_RRSBA_CTRL))
++ return;
++
++ ia32_cap = x86_read_arch_cap_msr();
++
++ if (ia32_cap & ARCH_CAP_RRSBA) {
++ x86_spec_ctrl_base |= SPEC_CTRL_RRSBA_DIS_S;
++ write_spec_ctrl_current(x86_spec_ctrl_base, true);
++ }
++}
++
+ static void __init spectre_v2_select_mitigation(void)
+ {
+ enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
+@@ -1110,6 +1351,15 @@ static void __init spectre_v2_select_mitigation(void)
+ break;
+ }
+
++ if (IS_ENABLED(CONFIG_CPU_IBRS_ENTRY) &&
++ boot_cpu_has_bug(X86_BUG_RETBLEED) &&
++ retbleed_cmd != RETBLEED_CMD_OFF &&
++ boot_cpu_has(X86_FEATURE_IBRS) &&
++ boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
++ mode = SPECTRE_V2_IBRS;
++ break;
++ }
++
+ mode = spectre_v2_select_retpoline();
+ break;
+
+@@ -1126,6 +1376,10 @@ static void __init spectre_v2_select_mitigation(void)
+ mode = spectre_v2_select_retpoline();
+ break;
+
++ case SPECTRE_V2_CMD_IBRS:
++ mode = SPECTRE_V2_IBRS;
++ break;
++
+ case SPECTRE_V2_CMD_EIBRS:
+ mode = SPECTRE_V2_EIBRS;
+ break;
+@@ -1142,10 +1396,9 @@ static void __init spectre_v2_select_mitigation(void)
+ if (mode == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled())
+ pr_err(SPECTRE_V2_EIBRS_EBPF_MSG);
+
+- if (spectre_v2_in_eibrs_mode(mode)) {
+- /* Force it so VMEXIT will restore correctly */
++ if (spectre_v2_in_ibrs_mode(mode)) {
+ x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
+- wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
++ write_spec_ctrl_current(x86_spec_ctrl_base, true);
+ }
+
+ switch (mode) {
+@@ -1153,6 +1406,10 @@ static void __init spectre_v2_select_mitigation(void)
+ case SPECTRE_V2_EIBRS:
+ break;
+
++ case SPECTRE_V2_IBRS:
++ setup_force_cpu_cap(X86_FEATURE_KERNEL_IBRS);
++ break;
++
+ case SPECTRE_V2_LFENCE:
+ case SPECTRE_V2_EIBRS_LFENCE:
+ setup_force_cpu_cap(X86_FEATURE_RETPOLINE_LFENCE);
+@@ -1164,43 +1421,107 @@ static void __init spectre_v2_select_mitigation(void)
+ break;
+ }
+
++ /*
++ * Disable alternate RSB predictions in kernel when indirect CALLs and
++ * JMPs gets protection against BHI and Intramode-BTI, but RET
++ * prediction from a non-RSB predictor is still a risk.
++ */
++ if (mode == SPECTRE_V2_EIBRS_LFENCE ||
++ mode == SPECTRE_V2_EIBRS_RETPOLINE ||
++ mode == SPECTRE_V2_RETPOLINE)
++ spec_ctrl_disable_kernel_rrsba();
++
+ spectre_v2_enabled = mode;
+ pr_info("%s\n", spectre_v2_strings[mode]);
+
+ /*
+- * If spectre v2 protection has been enabled, unconditionally fill
+- * RSB during a context switch; this protects against two independent
+- * issues:
++ * If Spectre v2 protection has been enabled, fill the RSB during a
++ * context switch. In general there are two types of RSB attacks
++ * across context switches, for which the CALLs/RETs may be unbalanced.
+ *
+- * - RSB underflow (and switch to BTB) on Skylake+
+- * - SpectreRSB variant of spectre v2 on X86_BUG_SPECTRE_V2 CPUs
++ * 1) RSB underflow
++ *
++ * Some Intel parts have "bottomless RSB". When the RSB is empty,
++ * speculated return targets may come from the branch predictor,
++ * which could have a user-poisoned BTB or BHB entry.
++ *
++ * AMD has it even worse: *all* returns are speculated from the BTB,
++ * regardless of the state of the RSB.
++ *
++ * When IBRS or eIBRS is enabled, the "user -> kernel" attack
++ * scenario is mitigated by the IBRS branch prediction isolation
++ * properties, so the RSB buffer filling wouldn't be necessary to
++ * protect against this type of attack.
++ *
++ * The "user -> user" attack scenario is mitigated by RSB filling.
++ *
++ * 2) Poisoned RSB entry
++ *
++ * If the 'next' in-kernel return stack is shorter than 'prev',
++ * 'next' could be tricked into speculating with a user-poisoned RSB
++ * entry.
++ *
++ * The "user -> kernel" attack scenario is mitigated by SMEP and
++ * eIBRS.
++ *
++ * The "user -> user" scenario, also known as SpectreBHB, requires
++ * RSB clearing.
++ *
++ * So to mitigate all cases, unconditionally fill RSB on context
++ * switches.
++ *
++ * FIXME: Is this pointless for retbleed-affected AMD?
+ */
+ setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+ pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
+
+ /*
+- * Retpoline means the kernel is safe because it has no indirect
+- * branches. Enhanced IBRS protects firmware too, so, enable restricted
+- * speculation around firmware calls only when Enhanced IBRS isn't
+- * supported.
++ * Similar to context switches, there are two types of RSB attacks
++ * after vmexit:
++ *
++ * 1) RSB underflow
++ *
++ * 2) Poisoned RSB entry
++ *
++ * When retpoline is enabled, both are mitigated by filling/clearing
++ * the RSB.
++ *
++ * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
++ * prediction isolation protections, RSB still needs to be cleared
++ * because of #2. Note that SMEP provides no protection here, unlike
++ * user-space-poisoned RSB entries.
++ *
++ * eIBRS, on the other hand, has RSB-poisoning protections, so it
++ * doesn't need RSB clearing after vmexit.
++ */
++ if (boot_cpu_has(X86_FEATURE_RETPOLINE) ||
++ boot_cpu_has(X86_FEATURE_KERNEL_IBRS))
++ setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT);
++
++ /*
++ * Retpoline protects the kernel, but doesn't protect firmware. IBRS
++ * and Enhanced IBRS protect firmware too, so enable IBRS around
++ * firmware calls only when IBRS / Enhanced IBRS aren't otherwise
++ * enabled.
+ *
+ * Use "mode" to check Enhanced IBRS instead of boot_cpu_has(), because
+ * the user might select retpoline on the kernel command line and if
+ * the CPU supports Enhanced IBRS, kernel might un-intentionally not
+ * enable IBRS around firmware calls.
+ */
+- if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_eibrs_mode(mode)) {
++ if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) {
+ setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
+ pr_info("Enabling Restricted Speculation for firmware calls\n");
+ }
+
+ /* Set up IBPB and STIBP depending on the general spectre V2 command */
+- spectre_v2_user_select_mitigation(cmd);
++ spectre_v2_cmd = cmd;
+ }
+
+ static void update_stibp_msr(void * __unused)
+ {
+- wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
++ u64 val = spec_ctrl_current() | (x86_spec_ctrl_base & SPEC_CTRL_STIBP);
++ write_spec_ctrl_current(val, true);
+ }
+
+ /* Update x86_spec_ctrl_base in case SMT state changed. */
+@@ -1416,16 +1737,6 @@ static enum ssb_mitigation __init __ssb_select_mitigation(void)
+ break;
+ }
+
+- /*
+- * If SSBD is controlled by the SPEC_CTRL MSR, then set the proper
+- * bit in the mask to allow guests to use the mitigation even in the
+- * case where the host does not enable it.
+- */
+- if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+- static_cpu_has(X86_FEATURE_AMD_SSBD)) {
+- x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
+- }
+-
+ /*
+ * We have three CPU feature flags that are in play here:
+ * - X86_BUG_SPEC_STORE_BYPASS - CPU is susceptible.
+@@ -1443,7 +1754,7 @@ static enum ssb_mitigation __init __ssb_select_mitigation(void)
+ x86_amd_ssb_disable();
+ } else {
+ x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
+- wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
++ write_spec_ctrl_current(x86_spec_ctrl_base, true);
+ }
+ }
+
+@@ -1694,7 +2005,7 @@ int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
+ void x86_spec_ctrl_setup_ap(void)
+ {
+ if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL))
+- wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
++ write_spec_ctrl_current(x86_spec_ctrl_base, true);
+
+ if (ssb_mode == SPEC_STORE_BYPASS_DISABLE)
+ x86_amd_ssb_disable();
+@@ -1931,7 +2242,7 @@ static ssize_t mmio_stale_data_show_state(char *buf)
+
+ static char *stibp_state(void)
+ {
+- if (spectre_v2_in_eibrs_mode(spectre_v2_enabled))
++ if (spectre_v2_in_ibrs_mode(spectre_v2_enabled))
+ return "";
+
+ switch (spectre_v2_user_stibp) {
+@@ -1987,6 +2298,24 @@ static ssize_t srbds_show_state(char *buf)
+ return sprintf(buf, "%s\n", srbds_strings[srbds_mitigation]);
+ }
+
++static ssize_t retbleed_show_state(char *buf)
++{
++ if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET) {
++ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
++ boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
++ return sprintf(buf, "Vulnerable: untrained return thunk on non-Zen uarch\n");
++
++ return sprintf(buf, "%s; SMT %s\n",
++ retbleed_strings[retbleed_mitigation],
++ !sched_smt_active() ? "disabled" :
++ spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
++ spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ?
++ "enabled with STIBP protection" : "vulnerable");
++ }
++
++ return sprintf(buf, "%s\n", retbleed_strings[retbleed_mitigation]);
++}
++
+ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
+ char *buf, unsigned int bug)
+ {
+@@ -2032,6 +2361,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
+ case X86_BUG_MMIO_STALE_DATA:
+ return mmio_stale_data_show_state(buf);
+
++ case X86_BUG_RETBLEED:
++ return retbleed_show_state(buf);
++
+ default:
+ break;
+ }
+@@ -2088,4 +2420,9 @@ ssize_t cpu_show_mmio_stale_data(struct device *dev, struct device_attribute *at
+ {
+ return cpu_show_common(dev, attr, buf, X86_BUG_MMIO_STALE_DATA);
+ }
++
++ssize_t cpu_show_retbleed(struct device *dev, struct device_attribute *attr, char *buf)
++{
++ return cpu_show_common(dev, attr, buf, X86_BUG_RETBLEED);
++}
+ #endif
+diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
+index af5d0c188f7b8..1f43ddf2ffc36 100644
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -1231,48 +1231,60 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+ {}
+ };
+
++#define VULNBL(vendor, family, model, blacklist) \
++ X86_MATCH_VENDOR_FAM_MODEL(vendor, family, model, blacklist)
++
+ #define VULNBL_INTEL_STEPPINGS(model, steppings, issues) \
+ X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE(INTEL, 6, \
+ INTEL_FAM6_##model, steppings, \
+ X86_FEATURE_ANY, issues)
+
++#define VULNBL_AMD(family, blacklist) \
++ VULNBL(AMD, family, X86_MODEL_ANY, blacklist)
++
++#define VULNBL_HYGON(family, blacklist) \
++ VULNBL(HYGON, family, X86_MODEL_ANY, blacklist)
++
+ #define SRBDS BIT(0)
+ /* CPU is affected by X86_BUG_MMIO_STALE_DATA */
+ #define MMIO BIT(1)
+ /* CPU is affected by Shared Buffers Data Sampling (SBDS), a variant of X86_BUG_MMIO_STALE_DATA */
+ #define MMIO_SBDS BIT(2)
++/* CPU is affected by RETbleed, speculating where you would not expect it */
++#define RETBLEED BIT(3)
+
+ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
+ VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(HASWELL, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(HASWELL_L, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(HASWELL_G, X86_STEPPING_ANY, SRBDS),
+- VULNBL_INTEL_STEPPINGS(HASWELL_X, BIT(2) | BIT(4), MMIO),
+- VULNBL_INTEL_STEPPINGS(BROADWELL_D, X86_STEPPINGS(0x3, 0x5), MMIO),
++ VULNBL_INTEL_STEPPINGS(HASWELL_X, X86_STEPPING_ANY, MMIO),
++ VULNBL_INTEL_STEPPINGS(BROADWELL_D, X86_STEPPING_ANY, MMIO),
+ VULNBL_INTEL_STEPPINGS(BROADWELL_G, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO),
+ VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE_X, BIT(3) | BIT(4) | BIT(6) |
+- BIT(7) | BIT(0xB), MMIO),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x9, 0xC), SRBDS | MMIO),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPINGS(0x0, 0x8), SRBDS),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x9, 0xD), SRBDS | MMIO),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPINGS(0x0, 0x8), SRBDS),
+- VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPINGS(0x5, 0x5), MMIO | MMIO_SBDS),
+- VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPINGS(0x1, 0x1), MMIO),
+- VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPINGS(0x4, 0x6), MMIO),
+- VULNBL_INTEL_STEPPINGS(COMETLAKE, BIT(2) | BIT(3) | BIT(5), MMIO | MMIO_SBDS),
+- VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
+- VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO),
+- VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
+- VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPINGS(0x1, 0x1), MMIO),
+- VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS),
+ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO),
+- VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPINGS(0x0, 0x0), MMIO | MMIO_SBDS),
++ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS),
++
++ VULNBL_AMD(0x15, RETBLEED),
++ VULNBL_AMD(0x16, RETBLEED),
++ VULNBL_AMD(0x17, RETBLEED),
++ VULNBL_HYGON(0x18, RETBLEED),
+ {}
+ };
+
+@@ -1374,6 +1386,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+ !arch_cap_mmio_immune(ia32_cap))
+ setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA);
+
++ if (!cpu_has(c, X86_FEATURE_BTC_NO)) {
++ if (cpu_matches(cpu_vuln_blacklist, RETBLEED) || (ia32_cap & ARCH_CAP_RSBA))
++ setup_force_cpu_bug(X86_BUG_RETBLEED);
++ }
++
+ if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
+ return;
+
+diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
+index 2a8e584fc9913..7c9b5893c30ab 100644
+--- a/arch/x86/kernel/cpu/cpu.h
++++ b/arch/x86/kernel/cpu/cpu.h
+@@ -61,6 +61,8 @@ static inline void tsx_init(void) { }
+ static inline void tsx_ap_init(void) { }
+ #endif /* CONFIG_CPU_SUP_INTEL */
+
++extern void init_spectral_chicken(struct cpuinfo_x86 *c);
++
+ extern void get_cpu_cap(struct cpuinfo_x86 *c);
+ extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
+ extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
+diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
+index 3fcdda4c1e114..21fd425088fe5 100644
+--- a/arch/x86/kernel/cpu/hygon.c
++++ b/arch/x86/kernel/cpu/hygon.c
+@@ -302,6 +302,12 @@ static void init_hygon(struct cpuinfo_x86 *c)
+ /* get apicid instead of initial apic id from cpuid */
+ c->apicid = hard_smp_processor_id();
+
++ /*
++ * XXX someone from Hygon needs to confirm this DTRT
++ *
++ init_spectral_chicken(c);
++ */
++
+ set_cpu_cap(c, X86_FEATURE_ZEN);
+ set_cpu_cap(c, X86_FEATURE_CPB);
+
+diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
+index 4143b1e4c5c6d..fcfb03f5f89b3 100644
+--- a/arch/x86/kernel/cpu/scattered.c
++++ b/arch/x86/kernel/cpu/scattered.c
+@@ -27,6 +27,7 @@ static const struct cpuid_bit cpuid_bits[] = {
+ { X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
+ { X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
+ { X86_FEATURE_INTEL_PPIN, CPUID_EBX, 0, 0x00000007, 1 },
++ { X86_FEATURE_RRSBA_CTRL, CPUID_EDX, 2, 0x00000007, 2 },
+ { X86_FEATURE_CQM_LLC, CPUID_EDX, 1, 0x0000000f, 0 },
+ { X86_FEATURE_CQM_OCCUP_LLC, CPUID_EDX, 0, 0x0000000f, 1 },
+ { X86_FEATURE_CQM_MBM_TOTAL, CPUID_EDX, 1, 0x0000000f, 1 },
+diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
+index 1e31c7d21597b..6892ca67d9c6d 100644
+--- a/arch/x86/kernel/ftrace.c
++++ b/arch/x86/kernel/ftrace.c
+@@ -303,7 +303,7 @@ union ftrace_op_code_union {
+ } __attribute__((packed));
+ };
+
+-#define RET_SIZE 1 + IS_ENABLED(CONFIG_SLS)
++#define RET_SIZE (IS_ENABLED(CONFIG_RETPOLINE) ? 5 : 1 + IS_ENABLED(CONFIG_SLS))
+
+ static unsigned long
+ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
+@@ -359,7 +359,10 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
+ goto fail;
+
+ ip = trampoline + size;
+- memcpy(ip, retq, RET_SIZE);
++ if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
++ __text_gen_insn(ip, JMP32_INSN_OPCODE, ip, &__x86_return_thunk, JMP32_INSN_SIZE);
++ else
++ memcpy(ip, retq, sizeof(retq));
+
+ /* No need to test direct calls on created trampolines */
+ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
+diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
+index eb8656bac99b6..9b7acc9c7874c 100644
+--- a/arch/x86/kernel/head_32.S
++++ b/arch/x86/kernel/head_32.S
+@@ -23,6 +23,7 @@
+ #include <asm/cpufeatures.h>
+ #include <asm/percpu.h>
+ #include <asm/nops.h>
++#include <asm/nospec-branch.h>
+ #include <asm/bootparam.h>
+ #include <asm/export.h>
+ #include <asm/pgtable_32.h>
+diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
+index b8e3019547a5d..3178fd81f93fc 100644
+--- a/arch/x86/kernel/head_64.S
++++ b/arch/x86/kernel/head_64.S
+@@ -334,6 +334,8 @@ SYM_CODE_START_NOALIGN(vc_boot_ghcb)
+ UNWIND_HINT_IRET_REGS offset=8
+ ENDBR
+
++ ANNOTATE_UNRET_END
++
+ /* Build pt_regs */
+ PUSH_AND_CLEAR_REGS
+
+@@ -393,6 +395,7 @@ SYM_CODE_END(early_idt_handler_array)
+
+ SYM_CODE_START_LOCAL(early_idt_handler_common)
+ UNWIND_HINT_IRET_REGS offset=16
++ ANNOTATE_UNRET_END
+ /*
+ * The stack is the hardware frame, an error code or zero, and the
+ * vector number.
+@@ -442,6 +445,8 @@ SYM_CODE_START_NOALIGN(vc_no_ghcb)
+ UNWIND_HINT_IRET_REGS offset=8
+ ENDBR
+
++ ANNOTATE_UNRET_END
++
+ /* Build pt_regs */
+ PUSH_AND_CLEAR_REGS
+
+diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
+index b98ffcf4d250f..67828d9733890 100644
+--- a/arch/x86/kernel/module.c
++++ b/arch/x86/kernel/module.c
+@@ -253,7 +253,7 @@ int module_finalize(const Elf_Ehdr *hdr,
+ {
+ const Elf_Shdr *s, *text = NULL, *alt = NULL, *locks = NULL,
+ *para = NULL, *orc = NULL, *orc_ip = NULL,
+- *retpolines = NULL, *ibt_endbr = NULL;
++ *retpolines = NULL, *returns = NULL, *ibt_endbr = NULL;
+ char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+
+ for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) {
+@@ -271,6 +271,8 @@ int module_finalize(const Elf_Ehdr *hdr,
+ orc_ip = s;
+ if (!strcmp(".retpoline_sites", secstrings + s->sh_name))
+ retpolines = s;
++ if (!strcmp(".return_sites", secstrings + s->sh_name))
++ returns = s;
+ if (!strcmp(".ibt_endbr_seal", secstrings + s->sh_name))
+ ibt_endbr = s;
+ }
+@@ -287,6 +289,10 @@ int module_finalize(const Elf_Ehdr *hdr,
+ void *rseg = (void *)retpolines->sh_addr;
+ apply_retpolines(rseg, rseg + retpolines->sh_size);
+ }
++ if (returns) {
++ void *rseg = (void *)returns->sh_addr;
++ apply_returns(rseg, rseg + returns->sh_size);
++ }
+ if (alt) {
+ /* patch .altinstructions */
+ void *aseg = (void *)alt->sh_addr;
+diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
+index b370767f5b191..622dc3673c37f 100644
+--- a/arch/x86/kernel/process.c
++++ b/arch/x86/kernel/process.c
+@@ -600,7 +600,7 @@ static __always_inline void __speculation_ctrl_update(unsigned long tifp,
+ }
+
+ if (updmsr)
+- wrmsrl(MSR_IA32_SPEC_CTRL, msr);
++ write_spec_ctrl_current(msr, false);
+ }
+
+ static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk)
+diff --git a/arch/x86/kernel/relocate_kernel_32.S b/arch/x86/kernel/relocate_kernel_32.S
+index fcc8a7699103a..c7c4b1917336d 100644
+--- a/arch/x86/kernel/relocate_kernel_32.S
++++ b/arch/x86/kernel/relocate_kernel_32.S
+@@ -7,10 +7,12 @@
+ #include <linux/linkage.h>
+ #include <asm/page_types.h>
+ #include <asm/kexec.h>
++#include <asm/nospec-branch.h>
+ #include <asm/processor-flags.h>
+
+ /*
+- * Must be relocatable PIC code callable as a C function
++ * Must be relocatable PIC code callable as a C function, in particular
++ * there must be a plain RET and not jump to return thunk.
+ */
+
+ #define PTR(x) (x << 2)
+@@ -91,7 +93,9 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
+ movl %edi, %eax
+ addl $(identity_mapped - relocate_kernel), %eax
+ pushl %eax
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(relocate_kernel)
+
+ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
+@@ -159,12 +163,15 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
+ xorl %edx, %edx
+ xorl %esi, %esi
+ xorl %ebp, %ebp
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ 1:
+ popl %edx
+ movl CP_PA_SWAP_PAGE(%edi), %esp
+ addl $PAGE_SIZE, %esp
+ 2:
++ ANNOTATE_RETPOLINE_SAFE
+ call *%edx
+
+ /* get the re-entry point of the peer system */
+@@ -190,7 +197,9 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
+ movl %edi, %eax
+ addl $(virtual_mapped - relocate_kernel), %eax
+ pushl %eax
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(identity_mapped)
+
+ SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped)
+@@ -208,7 +217,9 @@ SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped)
+ popl %edi
+ popl %esi
+ popl %ebx
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(virtual_mapped)
+
+ /* Do the copies */
+@@ -271,7 +282,9 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
+ popl %edi
+ popl %ebx
+ popl %ebp
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(swap_pages)
+
+ .globl kexec_control_code_size
+diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
+index c1d8626c53b66..4809c0dc4eb0c 100644
+--- a/arch/x86/kernel/relocate_kernel_64.S
++++ b/arch/x86/kernel/relocate_kernel_64.S
+@@ -13,7 +13,8 @@
+ #include <asm/unwind_hints.h>
+
+ /*
+- * Must be relocatable PIC code callable as a C function
++ * Must be relocatable PIC code callable as a C function, in particular
++ * there must be a plain RET and not jump to return thunk.
+ */
+
+ #define PTR(x) (x << 3)
+@@ -105,7 +106,9 @@ SYM_CODE_START_NOALIGN(relocate_kernel)
+ /* jump to identity mapped page */
+ addq $(identity_mapped - relocate_kernel), %r8
+ pushq %r8
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(relocate_kernel)
+
+ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
+@@ -200,7 +203,9 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
+ xorl %r14d, %r14d
+ xorl %r15d, %r15d
+
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+
+ 1:
+ popq %rdx
+@@ -219,7 +224,9 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
+ call swap_pages
+ movq $virtual_mapped, %rax
+ pushq %rax
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(identity_mapped)
+
+ SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped)
+@@ -241,7 +248,9 @@ SYM_CODE_START_LOCAL_NOALIGN(virtual_mapped)
+ popq %r12
+ popq %rbp
+ popq %rbx
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(virtual_mapped)
+
+ /* Do the copies */
+@@ -298,7 +307,9 @@ SYM_CODE_START_LOCAL_NOALIGN(swap_pages)
+ lea PAGE_SIZE(%rax), %rsi
+ jmp 0b
+ 3:
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_CODE_END(swap_pages)
+
+ .globl kexec_control_code_size
+diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
+index aa72cefdd5be6..aaaba85d6d7ff 100644
+--- a/arch/x86/kernel/static_call.c
++++ b/arch/x86/kernel/static_call.c
+@@ -11,6 +11,13 @@ enum insn_type {
+ RET = 3, /* tramp / site cond-tail-call */
+ };
+
++/*
++ * ud1 %esp, %ecx - a 3 byte #UD that is unique to trampolines, chosen such
++ * that there is no false-positive trampoline identification while also being a
++ * speculation stop.
++ */
++static const u8 tramp_ud[] = { 0x0f, 0xb9, 0xcc };
++
+ /*
+ * cs cs cs xorl %eax, %eax - a single 5 byte instruction that clears %[er]ax
+ */
+@@ -18,7 +25,8 @@ static const u8 xor5rax[] = { 0x2e, 0x2e, 0x2e, 0x31, 0xc0 };
+
+ static const u8 retinsn[] = { RET_INSN_OPCODE, 0xcc, 0xcc, 0xcc, 0xcc };
+
+-static void __ref __static_call_transform(void *insn, enum insn_type type, void *func)
++static void __ref __static_call_transform(void *insn, enum insn_type type,
++ void *func, bool modinit)
+ {
+ const void *emulate = NULL;
+ int size = CALL_INSN_SIZE;
+@@ -43,14 +51,17 @@ static void __ref __static_call_transform(void *insn, enum insn_type type, void
+ break;
+
+ case RET:
+- code = &retinsn;
++ if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
++ code = text_gen_insn(JMP32_INSN_OPCODE, insn, &__x86_return_thunk);
++ else
++ code = &retinsn;
+ break;
+ }
+
+ if (memcmp(insn, code, size) == 0)
+ return;
+
+- if (unlikely(system_state == SYSTEM_BOOTING))
++ if (system_state == SYSTEM_BOOTING || modinit)
+ return text_poke_early(insn, code, size);
+
+ text_poke_bp(insn, code, size, emulate);
+@@ -60,7 +71,7 @@ static void __static_call_validate(void *insn, bool tail, bool tramp)
+ {
+ u8 opcode = *(u8 *)insn;
+
+- if (tramp && memcmp(insn+5, "SCT", 3)) {
++ if (tramp && memcmp(insn+5, tramp_ud, 3)) {
+ pr_err("trampoline signature fail");
+ BUG();
+ }
+@@ -104,14 +115,42 @@ void arch_static_call_transform(void *site, void *tramp, void *func, bool tail)
+
+ if (tramp) {
+ __static_call_validate(tramp, true, true);
+- __static_call_transform(tramp, __sc_insn(!func, true), func);
++ __static_call_transform(tramp, __sc_insn(!func, true), func, false);
+ }
+
+ if (IS_ENABLED(CONFIG_HAVE_STATIC_CALL_INLINE) && site) {
+ __static_call_validate(site, tail, false);
+- __static_call_transform(site, __sc_insn(!func, tail), func);
++ __static_call_transform(site, __sc_insn(!func, tail), func, false);
+ }
+
+ mutex_unlock(&text_mutex);
+ }
+ EXPORT_SYMBOL_GPL(arch_static_call_transform);
++
++#ifdef CONFIG_RETHUNK
++/*
++ * This is called by apply_returns() to fix up static call trampolines,
++ * specifically ARCH_DEFINE_STATIC_CALL_NULL_TRAMP which is recorded as
++ * having a return trampoline.
++ *
++ * The problem is that static_call() is available before determining
++ * X86_FEATURE_RETHUNK and, by implication, running alternatives.
++ *
++ * This means that __static_call_transform() above can have overwritten the
++ * return trampoline and we now need to fix things up to be consistent.
++ */
++bool __static_call_fixup(void *tramp, u8 op, void *dest)
++{
++ if (memcmp(tramp+5, tramp_ud, 3)) {
++ /* Not a trampoline site, not our problem. */
++ return false;
++ }
++
++ mutex_lock(&text_mutex);
++ if (op == RET_INSN_OPCODE || dest == &__x86_return_thunk)
++ __static_call_transform(tramp, RET, NULL, true);
++ mutex_unlock(&text_mutex);
++
++ return true;
++}
++#endif
+diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
+index 1563fb9950059..4167215333fd8 100644
+--- a/arch/x86/kernel/traps.c
++++ b/arch/x86/kernel/traps.c
+@@ -892,14 +892,10 @@ sync:
+ }
+ #endif
+
+-struct bad_iret_stack {
+- void *error_entry_ret;
+- struct pt_regs regs;
+-};
+-
+-asmlinkage __visible noinstr
+-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
++asmlinkage __visible noinstr struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs)
+ {
++ struct pt_regs tmp, *new_stack;
++
+ /*
+ * This is called from entry_64.S early in handling a fault
+ * caused by a bad iret to user mode. To handle the fault
+@@ -908,19 +904,18 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
+ * just below the IRET frame) and we want to pretend that the
+ * exception came from the IRET target.
+ */
+- struct bad_iret_stack tmp, *new_stack =
+- (struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
++ new_stack = (struct pt_regs *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
+
+ /* Copy the IRET target to the temporary storage. */
+- __memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
++ __memcpy(&tmp.ip, (void *)bad_regs->sp, 5*8);
+
+ /* Copy the remainder of the stack from the current stack. */
+- __memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
++ __memcpy(&tmp, bad_regs, offsetof(struct pt_regs, ip));
+
+ /* Update the entry stack */
+ __memcpy(new_stack, &tmp, sizeof(tmp));
+
+- BUG_ON(!user_mode(&new_stack->regs));
++ BUG_ON(!user_mode(new_stack));
+ return new_stack;
+ }
+ #endif
+diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
+index 7fda7f27e7620..071faf2c8a77c 100644
+--- a/arch/x86/kernel/vmlinux.lds.S
++++ b/arch/x86/kernel/vmlinux.lds.S
+@@ -141,7 +141,7 @@ SECTIONS
+
+ #ifdef CONFIG_RETPOLINE
+ __indirect_thunk_start = .;
+- *(.text.__x86.indirect_thunk)
++ *(.text.__x86.*)
+ __indirect_thunk_end = .;
+ #endif
+ } :text =0xcccc
+@@ -283,6 +283,13 @@ SECTIONS
+ *(.retpoline_sites)
+ __retpoline_sites_end = .;
+ }
++
++ . = ALIGN(8);
++ .return_sites : AT(ADDR(.return_sites) - LOAD_OFFSET) {
++ __return_sites = .;
++ *(.return_sites)
++ __return_sites_end = .;
++ }
+ #endif
+
+ #ifdef CONFIG_X86_KERNEL_IBT
+diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
+index 89b11e7dca8aa..f8382abe22ff8 100644
+--- a/arch/x86/kvm/emulate.c
++++ b/arch/x86/kvm/emulate.c
+@@ -189,9 +189,6 @@
+ #define X8(x...) X4(x), X4(x)
+ #define X16(x...) X8(x), X8(x)
+
+-#define NR_FASTOP (ilog2(sizeof(ulong)) + 1)
+-#define FASTOP_SIZE (8 * (1 + HAS_KERNEL_IBT))
+-
+ struct opcode {
+ u64 flags;
+ u8 intercept;
+@@ -306,9 +303,15 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
+ * Moreover, they are all exactly FASTOP_SIZE bytes long, so functions for
+ * different operand sizes can be reached by calculation, rather than a jump
+ * table (which would be bigger than the code).
++ *
++ * The 16 byte alignment, considering 5 bytes for the RET thunk, 3 for ENDBR
++ * and 1 for the straight line speculation INT3, leaves 7 bytes for the
++ * body of the function. Currently none is larger than 4.
+ */
+ static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop);
+
++#define FASTOP_SIZE 16
++
+ #define __FOP_FUNC(name) \
+ ".align " __stringify(FASTOP_SIZE) " \n\t" \
+ ".type " name ", @function \n\t" \
+@@ -325,13 +328,15 @@ static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop);
+ #define FOP_RET(name) \
+ __FOP_RET(#name)
+
+-#define FOP_START(op) \
++#define __FOP_START(op, align) \
+ extern void em_##op(struct fastop *fake); \
+ asm(".pushsection .text, \"ax\" \n\t" \
+ ".global em_" #op " \n\t" \
+- ".align " __stringify(FASTOP_SIZE) " \n\t" \
++ ".align " __stringify(align) " \n\t" \
+ "em_" #op ":\n\t"
+
++#define FOP_START(op) __FOP_START(op, FASTOP_SIZE)
++
+ #define FOP_END \
+ ".popsection")
+
+@@ -435,17 +440,12 @@ static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop);
+ /*
+ * Depending on .config the SETcc functions look like:
+ *
+- * ENDBR [4 bytes; CONFIG_X86_KERNEL_IBT]
+- * SETcc %al [3 bytes]
+- * RET [1 byte]
+- * INT3 [1 byte; CONFIG_SLS]
+- *
+- * Which gives possible sizes 4, 5, 8 or 9. When rounded up to the
+- * next power-of-two alignment they become 4, 8 or 16 resp.
++ * ENDBR [4 bytes; CONFIG_X86_KERNEL_IBT]
++ * SETcc %al [3 bytes]
++ * RET | JMP __x86_return_thunk [1,5 bytes; CONFIG_RETHUNK]
++ * INT3 [1 byte; CONFIG_SLS]
+ */
+-#define SETCC_LENGTH (ENDBR_INSN_SIZE + 4 + IS_ENABLED(CONFIG_SLS))
+-#define SETCC_ALIGN (4 << IS_ENABLED(CONFIG_SLS) << HAS_KERNEL_IBT)
+-static_assert(SETCC_LENGTH <= SETCC_ALIGN);
++#define SETCC_ALIGN 16
+
+ #define FOP_SETCC(op) \
+ ".align " __stringify(SETCC_ALIGN) " \n\t" \
+@@ -453,9 +453,10 @@ static_assert(SETCC_LENGTH <= SETCC_ALIGN);
+ #op ": \n\t" \
+ ASM_ENDBR \
+ #op " %al \n\t" \
+- __FOP_RET(#op)
++ __FOP_RET(#op) \
++ ".skip " __stringify(SETCC_ALIGN) " - (.-" #op "), 0xcc \n\t"
+
+-FOP_START(setcc)
++__FOP_START(setcc, SETCC_ALIGN)
+ FOP_SETCC(seto)
+ FOP_SETCC(setno)
+ FOP_SETCC(setc)
+diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
+index dfaeb47fcf2a7..723f8534986c3 100644
+--- a/arch/x86/kvm/svm/vmenter.S
++++ b/arch/x86/kvm/svm/vmenter.S
+@@ -110,6 +110,15 @@ SYM_FUNC_START(__svm_vcpu_run)
+ mov %r15, VCPU_R15(%_ASM_AX)
+ #endif
+
++ /*
++ * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
++ * untrained as soon as we exit the VM and are back to the
++ * kernel. This should be done before re-enabling interrupts
++ * because interrupt handlers won't sanitize 'ret' if the return is
++ * from the kernel.
++ */
++ UNTRAIN_RET
++
+ /*
+ * Clear all general purpose registers except RSP and RAX to prevent
+ * speculative use of the guest's values, even those that are reloaded
+@@ -190,6 +199,15 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
+ FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
+ #endif
+
++ /*
++ * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
++ * untrained as soon as we exit the VM and are back to the
++ * kernel. This should be done before re-enabling interrupts
++ * because interrupt handlers won't sanitize RET if the return is
++ * from the kernel.
++ */
++ UNTRAIN_RET
++
+ pop %_ASM_BX
+
+ #ifdef CONFIG_X86_64
+diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
+index 3f430e2183753..c0e24826a86f7 100644
+--- a/arch/x86/kvm/vmx/capabilities.h
++++ b/arch/x86/kvm/vmx/capabilities.h
+@@ -4,8 +4,8 @@
+
+ #include <asm/vmx.h>
+
+-#include "lapic.h"
+-#include "x86.h"
++#include "../lapic.h"
++#include "../x86.h"
+
+ extern bool __read_mostly enable_vpid;
+ extern bool __read_mostly flexpriority_enabled;
+diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
+index ee7df31883cd9..28ccf25c41248 100644
+--- a/arch/x86/kvm/vmx/nested.c
++++ b/arch/x86/kvm/vmx/nested.c
+@@ -3091,7 +3091,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
+ }
+
+ vm_fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
+- vmx->loaded_vmcs->launched);
++ __vmx_vcpu_run_flags(vmx));
+
+ if (vmx->msr_autoload.host.nr)
+ vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
+diff --git a/arch/x86/kvm/vmx/run_flags.h b/arch/x86/kvm/vmx/run_flags.h
+new file mode 100644
+index 0000000000000..edc3f16cc1896
+--- /dev/null
++++ b/arch/x86/kvm/vmx/run_flags.h
+@@ -0,0 +1,8 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef __KVM_X86_VMX_RUN_FLAGS_H
++#define __KVM_X86_VMX_RUN_FLAGS_H
++
++#define VMX_RUN_VMRESUME (1 << 0)
++#define VMX_RUN_SAVE_SPEC_CTRL (1 << 1)
++
++#endif /* __KVM_X86_VMX_RUN_FLAGS_H */
+diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
+index 435c187927c48..4182c7ffc9091 100644
+--- a/arch/x86/kvm/vmx/vmenter.S
++++ b/arch/x86/kvm/vmx/vmenter.S
+@@ -1,10 +1,13 @@
+ /* SPDX-License-Identifier: GPL-2.0 */
+ #include <linux/linkage.h>
+ #include <asm/asm.h>
++#include <asm/asm-offsets.h>
+ #include <asm/bitsperlong.h>
+ #include <asm/kvm_vcpu_regs.h>
+ #include <asm/nospec-branch.h>
++#include <asm/percpu.h>
+ #include <asm/segment.h>
++#include "run_flags.h"
+
+ #define WORD_SIZE (BITS_PER_LONG / 8)
+
+@@ -30,73 +33,12 @@
+
+ .section .noinstr.text, "ax"
+
+-/**
+- * vmx_vmenter - VM-Enter the current loaded VMCS
+- *
+- * %RFLAGS.ZF: !VMCS.LAUNCHED, i.e. controls VMLAUNCH vs. VMRESUME
+- *
+- * Returns:
+- * %RFLAGS.CF is set on VM-Fail Invalid
+- * %RFLAGS.ZF is set on VM-Fail Valid
+- * %RFLAGS.{CF,ZF} are cleared on VM-Success, i.e. VM-Exit
+- *
+- * Note that VMRESUME/VMLAUNCH fall-through and return directly if
+- * they VM-Fail, whereas a successful VM-Enter + VM-Exit will jump
+- * to vmx_vmexit.
+- */
+-SYM_FUNC_START_LOCAL(vmx_vmenter)
+- /* EFLAGS.ZF is set if VMCS.LAUNCHED == 0 */
+- je 2f
+-
+-1: vmresume
+- RET
+-
+-2: vmlaunch
+- RET
+-
+-3: cmpb $0, kvm_rebooting
+- je 4f
+- RET
+-4: ud2
+-
+- _ASM_EXTABLE(1b, 3b)
+- _ASM_EXTABLE(2b, 3b)
+-
+-SYM_FUNC_END(vmx_vmenter)
+-
+-/**
+- * vmx_vmexit - Handle a VMX VM-Exit
+- *
+- * Returns:
+- * %RFLAGS.{CF,ZF} are cleared on VM-Success, i.e. VM-Exit
+- *
+- * This is vmx_vmenter's partner in crime. On a VM-Exit, control will jump
+- * here after hardware loads the host's state, i.e. this is the destination
+- * referred to by VMCS.HOST_RIP.
+- */
+-SYM_FUNC_START(vmx_vmexit)
+-#ifdef CONFIG_RETPOLINE
+- ALTERNATIVE "jmp .Lvmexit_skip_rsb", "", X86_FEATURE_RETPOLINE
+- /* Preserve guest's RAX, it's used to stuff the RSB. */
+- push %_ASM_AX
+-
+- /* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
+- FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
+-
+- /* Clear RFLAGS.CF and RFLAGS.ZF to preserve VM-Exit, i.e. !VM-Fail. */
+- or $1, %_ASM_AX
+-
+- pop %_ASM_AX
+-.Lvmexit_skip_rsb:
+-#endif
+- RET
+-SYM_FUNC_END(vmx_vmexit)
+-
+ /**
+ * __vmx_vcpu_run - Run a vCPU via a transition to VMX guest mode
+- * @vmx: struct vcpu_vmx * (forwarded to vmx_update_host_rsp)
++ * @vmx: struct vcpu_vmx *
+ * @regs: unsigned long * (to guest registers)
+- * @launched: %true if the VMCS has been launched
++ * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
++ * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
+ *
+ * Returns:
+ * 0 on VM-Exit, 1 on VM-Fail
+@@ -115,24 +57,56 @@ SYM_FUNC_START(__vmx_vcpu_run)
+ #endif
+ push %_ASM_BX
+
++ /* Save @vmx for SPEC_CTRL handling */
++ push %_ASM_ARG1
++
++ /* Save @flags for SPEC_CTRL handling */
++ push %_ASM_ARG3
++
+ /*
+ * Save @regs, _ASM_ARG2 may be modified by vmx_update_host_rsp() and
+ * @regs is needed after VM-Exit to save the guest's register values.
+ */
+ push %_ASM_ARG2
+
+- /* Copy @launched to BL, _ASM_ARG3 is volatile. */
++ /* Copy @flags to BL, _ASM_ARG3 is volatile. */
+ mov %_ASM_ARG3B, %bl
+
+- /* Adjust RSP to account for the CALL to vmx_vmenter(). */
+- lea -WORD_SIZE(%_ASM_SP), %_ASM_ARG2
++ lea (%_ASM_SP), %_ASM_ARG2
+ call vmx_update_host_rsp
+
++ ALTERNATIVE "jmp .Lspec_ctrl_done", "", X86_FEATURE_MSR_SPEC_CTRL
++
++ /*
++ * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
++ * host's, write the MSR.
++ *
++ * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
++ * there must not be any returns or indirect branches between this code
++ * and vmentry.
++ */
++ mov 2*WORD_SIZE(%_ASM_SP), %_ASM_DI
++ movl VMX_spec_ctrl(%_ASM_DI), %edi
++ movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
++ cmp %edi, %esi
++ je .Lspec_ctrl_done
++ mov $MSR_IA32_SPEC_CTRL, %ecx
++ xor %edx, %edx
++ mov %edi, %eax
++ wrmsr
++
++.Lspec_ctrl_done:
++
++ /*
++ * Since vmentry is serializing on affected CPUs, there's no need for
++ * an LFENCE to stop speculation from skipping the wrmsr.
++ */
++
+ /* Load @regs to RAX. */
+ mov (%_ASM_SP), %_ASM_AX
+
+ /* Check if vmlaunch or vmresume is needed */
+- testb %bl, %bl
++ testb $VMX_RUN_VMRESUME, %bl
+
+ /* Load guest registers. Don't clobber flags. */
+ mov VCPU_RCX(%_ASM_AX), %_ASM_CX
+@@ -154,11 +128,37 @@ SYM_FUNC_START(__vmx_vcpu_run)
+ /* Load guest RAX. This kills the @regs pointer! */
+ mov VCPU_RAX(%_ASM_AX), %_ASM_AX
+
+- /* Enter guest mode */
+- call vmx_vmenter
++ /* Check EFLAGS.ZF from 'testb' above */
++ jz .Lvmlaunch
++
++ /*
++ * After a successful VMRESUME/VMLAUNCH, control flow "magically"
++ * resumes below at 'vmx_vmexit' due to the VMCS HOST_RIP setting.
++ * So this isn't a typical function and objtool needs to be told to
++ * save the unwind state here and restore it below.
++ */
++ UNWIND_HINT_SAVE
++
++/*
++ * If VMRESUME/VMLAUNCH and corresponding vmexit succeed, execution resumes at
++ * the 'vmx_vmexit' label below.
++ */
++.Lvmresume:
++ vmresume
++ jmp .Lvmfail
++
++.Lvmlaunch:
++ vmlaunch
++ jmp .Lvmfail
+
+- /* Jump on VM-Fail. */
+- jbe 2f
++ _ASM_EXTABLE(.Lvmresume, .Lfixup)
++ _ASM_EXTABLE(.Lvmlaunch, .Lfixup)
++
++SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
++
++ /* Restore unwind state from before the VMRESUME/VMLAUNCH. */
++ UNWIND_HINT_RESTORE
++ ENDBR
+
+ /* Temporarily save guest's RAX. */
+ push %_ASM_AX
+@@ -185,21 +185,23 @@ SYM_FUNC_START(__vmx_vcpu_run)
+ mov %r15, VCPU_R15(%_ASM_AX)
+ #endif
+
+- /* Clear RAX to indicate VM-Exit (as opposed to VM-Fail). */
+- xor %eax, %eax
++ /* Clear return value to indicate VM-Exit (as opposed to VM-Fail). */
++ xor %ebx, %ebx
+
++.Lclear_regs:
+ /*
+- * Clear all general purpose registers except RSP and RAX to prevent
++ * Clear all general purpose registers except RSP and RBX to prevent
+ * speculative use of the guest's values, even those that are reloaded
+ * via the stack. In theory, an L1 cache miss when restoring registers
+ * could lead to speculative execution with the guest's values.
+ * Zeroing XORs are dirt cheap, i.e. the extra paranoia is essentially
+ * free. RSP and RAX are exempt as RSP is restored by hardware during
+- * VM-Exit and RAX is explicitly loaded with 0 or 1 to return VM-Fail.
++ * VM-Exit and RBX is explicitly loaded with 0 or 1 to hold the return
++ * value.
+ */
+-1: xor %ecx, %ecx
++ xor %eax, %eax
++ xor %ecx, %ecx
+ xor %edx, %edx
+- xor %ebx, %ebx
+ xor %ebp, %ebp
+ xor %esi, %esi
+ xor %edi, %edi
+@@ -216,8 +218,30 @@ SYM_FUNC_START(__vmx_vcpu_run)
+
+ /* "POP" @regs. */
+ add $WORD_SIZE, %_ASM_SP
+- pop %_ASM_BX
+
++ /*
++ * IMPORTANT: RSB filling and SPEC_CTRL handling must be done before
++ * the first unbalanced RET after vmexit!
++ *
++ * For retpoline or IBRS, RSB filling is needed to prevent poisoned RSB
++ * entries and (in some cases) RSB underflow.
++ *
++ * eIBRS has its own protection against poisoned RSB, so it doesn't
++ * need the RSB filling sequence. But it does need to be enabled
++ * before the first unbalanced RET.
++ */
++
++ FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
++
++ pop %_ASM_ARG2 /* @flags */
++ pop %_ASM_ARG1 /* @vmx */
++
++ call vmx_spec_ctrl_restore_host
++
++ /* Put return value in AX */
++ mov %_ASM_BX, %_ASM_AX
++
++ pop %_ASM_BX
+ #ifdef CONFIG_X86_64
+ pop %r12
+ pop %r13
+@@ -230,9 +254,15 @@ SYM_FUNC_START(__vmx_vcpu_run)
+ pop %_ASM_BP
+ RET
+
+- /* VM-Fail. Out-of-line to avoid a taken Jcc after VM-Exit. */
+-2: mov $1, %eax
+- jmp 1b
++.Lfixup:
++ cmpb $0, kvm_rebooting
++ jne .Lvmfail
++ ud2
++.Lvmfail:
++ /* VM-Fail: set return value to 1 */
++ mov $1, %_ASM_BX
++ jmp .Lclear_regs
++
+ SYM_FUNC_END(__vmx_vcpu_run)
+
+
+diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
+index 9646ae886b4b5..4b6a0268c78e3 100644
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -383,9 +383,9 @@ static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx)
+ if (!vmx->disable_fb_clear)
+ return;
+
+- rdmsrl(MSR_IA32_MCU_OPT_CTRL, msr);
++ msr = __rdmsr(MSR_IA32_MCU_OPT_CTRL);
+ msr |= FB_CLEAR_DIS;
+- wrmsrl(MSR_IA32_MCU_OPT_CTRL, msr);
++ native_wrmsrl(MSR_IA32_MCU_OPT_CTRL, msr);
+ /* Cache the MSR value to avoid reading it later */
+ vmx->msr_ia32_mcu_opt_ctrl = msr;
+ }
+@@ -396,7 +396,7 @@ static __always_inline void vmx_enable_fb_clear(struct vcpu_vmx *vmx)
+ return;
+
+ vmx->msr_ia32_mcu_opt_ctrl &= ~FB_CLEAR_DIS;
+- wrmsrl(MSR_IA32_MCU_OPT_CTRL, vmx->msr_ia32_mcu_opt_ctrl);
++ native_wrmsrl(MSR_IA32_MCU_OPT_CTRL, vmx->msr_ia32_mcu_opt_ctrl);
+ }
+
+ static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx)
+@@ -839,6 +839,24 @@ static bool msr_write_intercepted(struct vcpu_vmx *vmx, u32 msr)
+ MSR_IA32_SPEC_CTRL);
+ }
+
++unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx)
++{
++ unsigned int flags = 0;
++
++ if (vmx->loaded_vmcs->launched)
++ flags |= VMX_RUN_VMRESUME;
++
++ /*
++ * If writes to the SPEC_CTRL MSR aren't intercepted, the guest is free
++ * to change it directly without causing a vmexit. In that case read
++ * it after vmexit and store it in vmx->spec_ctrl.
++ */
++ if (unlikely(!msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL)))
++ flags |= VMX_RUN_SAVE_SPEC_CTRL;
++
++ return flags;
++}
++
+ static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
+ unsigned long entry, unsigned long exit)
+ {
+@@ -6814,6 +6832,31 @@ void noinstr vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
+ }
+ }
+
++void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
++ unsigned int flags)
++{
++ u64 hostval = this_cpu_read(x86_spec_ctrl_current);
++
++ if (!cpu_feature_enabled(X86_FEATURE_MSR_SPEC_CTRL))
++ return;
++
++ if (flags & VMX_RUN_SAVE_SPEC_CTRL)
++ vmx->spec_ctrl = __rdmsr(MSR_IA32_SPEC_CTRL);
++
++ /*
++ * If the guest/host SPEC_CTRL values differ, restore the host value.
++ *
++ * For legacy IBRS, the IBRS bit always needs to be written after
++ * transitioning from a less privileged predictor mode, regardless of
++ * whether the guest/host values differ.
++ */
++ if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) ||
++ vmx->spec_ctrl != hostval)
++ native_wrmsrl(MSR_IA32_SPEC_CTRL, hostval);
++
++ barrier_nospec();
++}
++
+ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
+ {
+ switch (to_vmx(vcpu)->exit_reason.basic) {
+@@ -6827,7 +6870,8 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
+ }
+
+ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
+- struct vcpu_vmx *vmx)
++ struct vcpu_vmx *vmx,
++ unsigned long flags)
+ {
+ guest_state_enter_irqoff();
+
+@@ -6846,7 +6890,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
+ native_write_cr2(vcpu->arch.cr2);
+
+ vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
+- vmx->loaded_vmcs->launched);
++ flags);
+
+ vcpu->arch.cr2 = native_read_cr2();
+
+@@ -6945,36 +6989,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
+
+ kvm_wait_lapic_expire(vcpu);
+
+- /*
+- * If this vCPU has touched SPEC_CTRL, restore the guest's value if
+- * it's non-zero. Since vmentry is serialising on affected CPUs, there
+- * is no need to worry about the conditional branch over the wrmsr
+- * being speculatively taken.
+- */
+- x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
+-
+ /* The actual VMENTER/EXIT is in the .noinstr.text section. */
+- vmx_vcpu_enter_exit(vcpu, vmx);
+-
+- /*
+- * We do not use IBRS in the kernel. If this vCPU has used the
+- * SPEC_CTRL MSR it may have left it on; save the value and
+- * turn it off. This is much more efficient than blindly adding
+- * it to the atomic save/restore list. Especially as the former
+- * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
+- *
+- * For non-nested case:
+- * If the L01 MSR bitmap does not intercept the MSR, then we need to
+- * save it.
+- *
+- * For nested case:
+- * If the L02 MSR bitmap does not intercept the MSR, then we need to
+- * save it.
+- */
+- if (unlikely(!msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL)))
+- vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
+-
+- x86_spec_ctrl_restore_host(vmx->spec_ctrl, 0);
++ vmx_vcpu_enter_exit(vcpu, vmx, __vmx_vcpu_run_flags(vmx));
+
+ /* All fields are clean at this point */
+ if (static_branch_unlikely(&enable_evmcs)) {
+diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
+index 8d2342ede0c59..1e7f9453894b1 100644
+--- a/arch/x86/kvm/vmx/vmx.h
++++ b/arch/x86/kvm/vmx/vmx.h
+@@ -8,11 +8,12 @@
+ #include <asm/intel_pt.h>
+
+ #include "capabilities.h"
+-#include "kvm_cache_regs.h"
++#include "../kvm_cache_regs.h"
+ #include "posted_intr.h"
+ #include "vmcs.h"
+ #include "vmx_ops.h"
+-#include "cpuid.h"
++#include "../cpuid.h"
++#include "run_flags.h"
+
+ #define MSR_TYPE_R 1
+ #define MSR_TYPE_W 2
+@@ -404,7 +405,10 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
+ struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr);
+ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
+ void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
+-bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs, bool launched);
++void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
++unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
++bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
++ unsigned int flags);
+ int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
+ void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);
+
+diff --git a/arch/x86/kvm/vmx/vmx_ops.h b/arch/x86/kvm/vmx/vmx_ops.h
+index 5e7f412257801..5cfc49ddb1b44 100644
+--- a/arch/x86/kvm/vmx/vmx_ops.h
++++ b/arch/x86/kvm/vmx/vmx_ops.h
+@@ -8,7 +8,7 @@
+
+ #include "evmcs.h"
+ #include "vmcs.h"
+-#include "x86.h"
++#include "../x86.h"
+
+ asmlinkage void vmread_error(unsigned long field, bool fault);
+ __attribute__((regparm(0))) void vmread_error_trampoline(unsigned long field,
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 828f5cf1af459..53b6fdf30c99b 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -12533,9 +12533,9 @@ void kvm_arch_end_assignment(struct kvm *kvm)
+ }
+ EXPORT_SYMBOL_GPL(kvm_arch_end_assignment);
+
+-bool kvm_arch_has_assigned_device(struct kvm *kvm)
++bool noinstr kvm_arch_has_assigned_device(struct kvm *kvm)
+ {
+- return atomic_read(&kvm->arch.assigned_device_count);
++ return arch_atomic_read(&kvm->arch.assigned_device_count);
+ }
+ EXPORT_SYMBOL_GPL(kvm_arch_has_assigned_device);
+
+diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
+index d83cba364e31d..724bbf83eb5b0 100644
+--- a/arch/x86/lib/memmove_64.S
++++ b/arch/x86/lib/memmove_64.S
+@@ -39,7 +39,7 @@ SYM_FUNC_START(__memmove)
+ /* FSRM implies ERMS => no length checks, do the copy directly */
+ .Lmemmove_begin_forward:
+ ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
+- ALTERNATIVE "", __stringify(movq %rdx, %rcx; rep movsb; RET), X86_FEATURE_ERMS
++ ALTERNATIVE "", "jmp .Lmemmove_erms", X86_FEATURE_ERMS
+
+ /*
+ * movsq instruction have many startup latency
+@@ -205,6 +205,11 @@ SYM_FUNC_START(__memmove)
+ movb %r11b, (%rdi)
+ 13:
+ RET
++
++.Lmemmove_erms:
++ movq %rdx, %rcx
++ rep movsb
++ RET
+ SYM_FUNC_END(__memmove)
+ EXPORT_SYMBOL(__memmove)
+
+diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
+index b2b2366885a2b..073289a55f849 100644
+--- a/arch/x86/lib/retpoline.S
++++ b/arch/x86/lib/retpoline.S
+@@ -33,9 +33,9 @@ SYM_INNER_LABEL(__x86_indirect_thunk_\reg, SYM_L_GLOBAL)
+ UNWIND_HINT_EMPTY
+ ANNOTATE_NOENDBR
+
+- ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
+- __stringify(RETPOLINE \reg), X86_FEATURE_RETPOLINE, \
+- __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg; int3), X86_FEATURE_RETPOLINE_LFENCE
++ ALTERNATIVE_2 __stringify(RETPOLINE \reg), \
++ __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg; int3), X86_FEATURE_RETPOLINE_LFENCE, \
++ __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), ALT_NOT(X86_FEATURE_RETPOLINE)
+
+ .endm
+
+@@ -67,3 +67,76 @@ SYM_CODE_END(__x86_indirect_thunk_array)
+ #define GEN(reg) EXPORT_THUNK(reg)
+ #include <asm/GEN-for-each-reg.h>
+ #undef GEN
++
++/*
++ * This function name is magical and is used by -mfunction-return=thunk-extern
++ * for the compiler to generate JMPs to it.
++ */
++#ifdef CONFIG_RETHUNK
++
++ .section .text.__x86.return_thunk
++
++/*
++ * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
++ * 1) The RET at __x86_return_thunk must be on a 64 byte boundary, for
++ * alignment within the BTB.
++ * 2) The instruction at zen_untrain_ret must contain, and not
++ * end with, the 0xc3 byte of the RET.
++ * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
++ * from re-poisioning the BTB prediction.
++ */
++ .align 64
++ .skip 63, 0xcc
++SYM_FUNC_START_NOALIGN(zen_untrain_ret);
++
++ /*
++ * As executed from zen_untrain_ret, this is:
++ *
++ * TEST $0xcc, %bl
++ * LFENCE
++ * JMP __x86_return_thunk
++ *
++ * Executing the TEST instruction has a side effect of evicting any BTB
++ * prediction (potentially attacker controlled) attached to the RET, as
++ * __x86_return_thunk + 1 isn't an instruction boundary at the moment.
++ */
++ .byte 0xf6
++
++ /*
++ * As executed from __x86_return_thunk, this is a plain RET.
++ *
++ * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
++ *
++ * We subsequently jump backwards and architecturally execute the RET.
++ * This creates a correct BTB prediction (type=ret), but in the
++ * meantime we suffer Straight Line Speculation (because the type was
++ * no branch) which is halted by the INT3.
++ *
++ * With SMT enabled and STIBP active, a sibling thread cannot poison
++ * RET's prediction to a type of its choice, but can evict the
++ * prediction due to competitive sharing. If the prediction is
++ * evicted, __x86_return_thunk will suffer Straight Line Speculation
++ * which will be contained safely by the INT3.
++ */
++SYM_INNER_LABEL(__x86_return_thunk, SYM_L_GLOBAL)
++ ret
++ int3
++SYM_CODE_END(__x86_return_thunk)
++
++ /*
++ * Ensure the TEST decoding / BTB invalidation is complete.
++ */
++ lfence
++
++ /*
++ * Jump back and execute the RET in the middle of the TEST instruction.
++ * INT3 is for SLS protection.
++ */
++ jmp __x86_return_thunk
++ int3
++SYM_FUNC_END(zen_untrain_ret)
++__EXPORT_THUNK(zen_untrain_ret)
++
++EXPORT_SYMBOL(__x86_return_thunk)
++
++#endif /* CONFIG_RETHUNK */
+diff --git a/arch/x86/mm/mem_encrypt_boot.S b/arch/x86/mm/mem_encrypt_boot.S
+index 3d1dba05fce4a..9de3d900bc927 100644
+--- a/arch/x86/mm/mem_encrypt_boot.S
++++ b/arch/x86/mm/mem_encrypt_boot.S
+@@ -65,7 +65,10 @@ SYM_FUNC_START(sme_encrypt_execute)
+ movq %rbp, %rsp /* Restore original stack pointer */
+ pop %rbp
+
+- RET
++ /* Offset to __x86_return_thunk would be wrong here */
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ SYM_FUNC_END(sme_encrypt_execute)
+
+ SYM_FUNC_START(__enc_copy)
+@@ -151,6 +154,9 @@ SYM_FUNC_START(__enc_copy)
+ pop %r12
+ pop %r15
+
+- RET
++ /* Offset to __x86_return_thunk would be wrong here */
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+ .L__enc_copy_end:
+ SYM_FUNC_END(__enc_copy)
+diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
+index 4c71fa04e784c..2dab2816b3f7c 100644
+--- a/arch/x86/net/bpf_jit_comp.c
++++ b/arch/x86/net/bpf_jit_comp.c
+@@ -407,16 +407,30 @@ static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
+ {
+ u8 *prog = *pprog;
+
+-#ifdef CONFIG_RETPOLINE
+ if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
+ EMIT_LFENCE();
+ EMIT2(0xFF, 0xE0 + reg);
+ } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
+ OPTIMIZER_HIDE_VAR(reg);
+ emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);
+- } else
+-#endif
+- EMIT2(0xFF, 0xE0 + reg);
++ } else {
++ EMIT2(0xFF, 0xE0 + reg);
++ }
++
++ *pprog = prog;
++}
++
++static void emit_return(u8 **pprog, u8 *ip)
++{
++ u8 *prog = *pprog;
++
++ if (cpu_feature_enabled(X86_FEATURE_RETHUNK)) {
++ emit_jump(&prog, &__x86_return_thunk, ip);
++ } else {
++ EMIT1(0xC3); /* ret */
++ if (IS_ENABLED(CONFIG_SLS))
++ EMIT1(0xCC); /* int3 */
++ }
+
+ *pprog = prog;
+ }
+@@ -1681,7 +1695,7 @@ emit_jmp:
+ ctx->cleanup_addr = proglen;
+ pop_callee_regs(&prog, callee_regs_used);
+ EMIT1(0xC9); /* leave */
+- EMIT1(0xC3); /* ret */
++ emit_return(&prog, image + addrs[i - 1] + (prog - temp));
+ break;
+
+ default:
+@@ -2158,7 +2172,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
+ if (flags & BPF_TRAMP_F_SKIP_FRAME)
+ /* skip our return address and return to parent */
+ EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
+- EMIT1(0xC3); /* ret */
++ emit_return(&prog, prog);
+ /* Make sure the trampoline generation logic doesn't overflow */
+ if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY)) {
+ ret = -EFAULT;
+diff --git a/arch/x86/platform/efi/efi_thunk_64.S b/arch/x86/platform/efi/efi_thunk_64.S
+index 854dd81804b73..bc740a7c438c9 100644
+--- a/arch/x86/platform/efi/efi_thunk_64.S
++++ b/arch/x86/platform/efi/efi_thunk_64.S
+@@ -23,6 +23,7 @@
+ #include <linux/objtool.h>
+ #include <asm/page_types.h>
+ #include <asm/segment.h>
++#include <asm/nospec-branch.h>
+
+ .text
+ .code64
+@@ -75,7 +76,9 @@ STACK_FRAME_NON_STANDARD __efi64_thunk
+ 1: movq 0x20(%rsp), %rsp
+ pop %rbx
+ pop %rbp
+- RET
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
+
+ .code32
+ 2: pushl $__KERNEL_CS
+diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
+index 81aa46f770c54..cfa99e8f054be 100644
+--- a/arch/x86/xen/setup.c
++++ b/arch/x86/xen/setup.c
+@@ -918,7 +918,7 @@ void xen_enable_sysenter(void)
+ if (!boot_cpu_has(sysenter_feature))
+ return;
+
+- ret = register_callback(CALLBACKTYPE_sysenter, xen_sysenter_target);
++ ret = register_callback(CALLBACKTYPE_sysenter, xen_entry_SYSENTER_compat);
+ if(ret != 0)
+ setup_clear_cpu_cap(sysenter_feature);
+ }
+@@ -927,7 +927,7 @@ void xen_enable_syscall(void)
+ {
+ int ret;
+
+- ret = register_callback(CALLBACKTYPE_syscall, xen_syscall_target);
++ ret = register_callback(CALLBACKTYPE_syscall, xen_entry_SYSCALL_64);
+ if (ret != 0) {
+ printk(KERN_ERR "Failed to set syscall callback: %d\n", ret);
+ /* Pretty fatal; 64-bit userspace has no other
+@@ -936,7 +936,7 @@ void xen_enable_syscall(void)
+
+ if (boot_cpu_has(X86_FEATURE_SYSCALL32)) {
+ ret = register_callback(CALLBACKTYPE_syscall32,
+- xen_syscall32_target);
++ xen_entry_SYSCALL_compat);
+ if (ret != 0)
+ setup_clear_cpu_cap(X86_FEATURE_SYSCALL32);
+ }
+diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
+index caa9bc2fa1008..6b4fdf6b95422 100644
+--- a/arch/x86/xen/xen-asm.S
++++ b/arch/x86/xen/xen-asm.S
+@@ -121,7 +121,7 @@ SYM_FUNC_END(xen_read_cr2_direct);
+
+ .macro xen_pv_trap name
+ SYM_CODE_START(xen_\name)
+- UNWIND_HINT_EMPTY
++ UNWIND_HINT_ENTRY
+ ENDBR
+ pop %rcx
+ pop %r11
+@@ -234,8 +234,8 @@ SYM_CODE_END(xenpv_restore_regs_and_return_to_usermode)
+ */
+
+ /* Normal 64-bit system call target */
+-SYM_CODE_START(xen_syscall_target)
+- UNWIND_HINT_EMPTY
++SYM_CODE_START(xen_entry_SYSCALL_64)
++ UNWIND_HINT_ENTRY
+ ENDBR
+ popq %rcx
+ popq %r11
+@@ -249,13 +249,13 @@ SYM_CODE_START(xen_syscall_target)
+ movq $__USER_CS, 1*8(%rsp)
+
+ jmp entry_SYSCALL_64_after_hwframe
+-SYM_CODE_END(xen_syscall_target)
++SYM_CODE_END(xen_entry_SYSCALL_64)
+
+ #ifdef CONFIG_IA32_EMULATION
+
+ /* 32-bit compat syscall target */
+-SYM_CODE_START(xen_syscall32_target)
+- UNWIND_HINT_EMPTY
++SYM_CODE_START(xen_entry_SYSCALL_compat)
++ UNWIND_HINT_ENTRY
+ ENDBR
+ popq %rcx
+ popq %r11
+@@ -269,11 +269,11 @@ SYM_CODE_START(xen_syscall32_target)
+ movq $__USER32_CS, 1*8(%rsp)
+
+ jmp entry_SYSCALL_compat_after_hwframe
+-SYM_CODE_END(xen_syscall32_target)
++SYM_CODE_END(xen_entry_SYSCALL_compat)
+
+ /* 32-bit compat sysenter target */
+-SYM_CODE_START(xen_sysenter_target)
+- UNWIND_HINT_EMPTY
++SYM_CODE_START(xen_entry_SYSENTER_compat)
++ UNWIND_HINT_ENTRY
+ ENDBR
+ /*
+ * NB: Xen is polite and clears TF from EFLAGS for us. This means
+@@ -291,19 +291,19 @@ SYM_CODE_START(xen_sysenter_target)
+ movq $__USER32_CS, 1*8(%rsp)
+
+ jmp entry_SYSENTER_compat_after_hwframe
+-SYM_CODE_END(xen_sysenter_target)
++SYM_CODE_END(xen_entry_SYSENTER_compat)
+
+ #else /* !CONFIG_IA32_EMULATION */
+
+-SYM_CODE_START(xen_syscall32_target)
+-SYM_CODE_START(xen_sysenter_target)
+- UNWIND_HINT_EMPTY
++SYM_CODE_START(xen_entry_SYSCALL_compat)
++SYM_CODE_START(xen_entry_SYSENTER_compat)
++ UNWIND_HINT_ENTRY
+ ENDBR
+ lea 16(%rsp), %rsp /* strip %rcx, %r11 */
+ mov $-ENOSYS, %rax
+ pushq $0
+ jmp hypercall_iret
+-SYM_CODE_END(xen_sysenter_target)
+-SYM_CODE_END(xen_syscall32_target)
++SYM_CODE_END(xen_entry_SYSENTER_compat)
++SYM_CODE_END(xen_entry_SYSCALL_compat)
+
+ #endif /* CONFIG_IA32_EMULATION */
+diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
+index 13af6fe453e3f..ffaa62167f6e8 100644
+--- a/arch/x86/xen/xen-head.S
++++ b/arch/x86/xen/xen-head.S
+@@ -26,6 +26,7 @@ SYM_CODE_START(hypercall_page)
+ .rept (PAGE_SIZE / 32)
+ UNWIND_HINT_FUNC
+ ANNOTATE_NOENDBR
++ ANNOTATE_UNRET_SAFE
+ ret
+ /*
+ * Xen will write the hypercall page, and sort out ENDBR.
+diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
+index fd0fec6e92f4c..9a8bb972193d8 100644
+--- a/arch/x86/xen/xen-ops.h
++++ b/arch/x86/xen/xen-ops.h
+@@ -10,10 +10,10 @@
+ /* These are code, but not functions. Defined in entry.S */
+ extern const char xen_failsafe_callback[];
+
+-void xen_sysenter_target(void);
++void xen_entry_SYSENTER_compat(void);
+ #ifdef CONFIG_X86_64
+-void xen_syscall_target(void);
+-void xen_syscall32_target(void);
++void xen_entry_SYSCALL_64(void);
++void xen_entry_SYSCALL_compat(void);
+ #endif
+
+ extern void *xen_initial_gdt;
+diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
+index a97776ea9d990..4c98849577d4e 100644
+--- a/drivers/base/cpu.c
++++ b/drivers/base/cpu.c
+@@ -570,6 +570,12 @@ ssize_t __weak cpu_show_mmio_stale_data(struct device *dev,
+ return sysfs_emit(buf, "Not affected\n");
+ }
+
++ssize_t __weak cpu_show_retbleed(struct device *dev,
++ struct device_attribute *attr, char *buf)
++{
++ return sysfs_emit(buf, "Not affected\n");
++}
++
+ static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
+ static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
+ static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
+@@ -580,6 +586,7 @@ static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);
+ static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);
+ static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL);
+ static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL);
++static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL);
+
+ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
+ &dev_attr_meltdown.attr,
+@@ -592,6 +599,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
+ &dev_attr_itlb_multihit.attr,
+ &dev_attr_srbds.attr,
+ &dev_attr_mmio_stale_data.attr,
++ &dev_attr_retbleed.attr,
+ NULL
+ };
+
+diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
+index c5a019eab5ec9..b463d85bfb358 100644
+--- a/drivers/idle/intel_idle.c
++++ b/drivers/idle/intel_idle.c
+@@ -47,11 +47,13 @@
+ #include <linux/tick.h>
+ #include <trace/events/power.h>
+ #include <linux/sched.h>
++#include <linux/sched/smt.h>
+ #include <linux/notifier.h>
+ #include <linux/cpu.h>
+ #include <linux/moduleparam.h>
+ #include <asm/cpu_device_id.h>
+ #include <asm/intel-family.h>
++#include <asm/nospec-branch.h>
+ #include <asm/mwait.h>
+ #include <asm/msr.h>
+
+@@ -105,6 +107,12 @@ static unsigned int mwait_substates __initdata;
+ */
+ #define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15)
+
++/*
++ * Disable IBRS across idle (when KERNEL_IBRS), is exclusive vs IRQ_ENABLE
++ * above.
++ */
++#define CPUIDLE_FLAG_IBRS BIT(16)
++
+ /*
+ * MWAIT takes an 8-bit "hint" in EAX "suggesting"
+ * the C-state (top nibble) and sub-state (bottom nibble)
+@@ -159,6 +167,24 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
+ return ret;
+ }
+
++static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
++ struct cpuidle_driver *drv, int index)
++{
++ bool smt_active = sched_smt_active();
++ u64 spec_ctrl = spec_ctrl_current();
++ int ret;
++
++ if (smt_active)
++ wrmsrl(MSR_IA32_SPEC_CTRL, 0);
++
++ ret = __intel_idle(dev, drv, index);
++
++ if (smt_active)
++ wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
++
++ return ret;
++}
++
+ /**
+ * intel_idle_s2idle - Ask the processor to enter the given idle state.
+ * @dev: cpuidle device of the target CPU.
+@@ -680,7 +706,7 @@ static struct cpuidle_state skl_cstates[] __initdata = {
+ {
+ .name = "C6",
+ .desc = "MWAIT 0x20",
+- .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
+ .exit_latency = 85,
+ .target_residency = 200,
+ .enter = &intel_idle,
+@@ -688,7 +714,7 @@ static struct cpuidle_state skl_cstates[] __initdata = {
+ {
+ .name = "C7s",
+ .desc = "MWAIT 0x33",
+- .flags = MWAIT2flg(0x33) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .flags = MWAIT2flg(0x33) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
+ .exit_latency = 124,
+ .target_residency = 800,
+ .enter = &intel_idle,
+@@ -696,7 +722,7 @@ static struct cpuidle_state skl_cstates[] __initdata = {
+ {
+ .name = "C8",
+ .desc = "MWAIT 0x40",
+- .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
+ .exit_latency = 200,
+ .target_residency = 800,
+ .enter = &intel_idle,
+@@ -704,7 +730,7 @@ static struct cpuidle_state skl_cstates[] __initdata = {
+ {
+ .name = "C9",
+ .desc = "MWAIT 0x50",
+- .flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
+ .exit_latency = 480,
+ .target_residency = 5000,
+ .enter = &intel_idle,
+@@ -712,7 +738,7 @@ static struct cpuidle_state skl_cstates[] __initdata = {
+ {
+ .name = "C10",
+ .desc = "MWAIT 0x60",
+- .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
+ .exit_latency = 890,
+ .target_residency = 5000,
+ .enter = &intel_idle,
+@@ -741,7 +767,7 @@ static struct cpuidle_state skx_cstates[] __initdata = {
+ {
+ .name = "C6",
+ .desc = "MWAIT 0x20",
+- .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | CPUIDLE_FLAG_IBRS,
+ .exit_latency = 133,
+ .target_residency = 600,
+ .enter = &intel_idle,
+@@ -1686,6 +1712,12 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
+ if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE)
+ drv->states[drv->state_count].enter = intel_idle_irq;
+
++ if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
++ cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IBRS) {
++ WARN_ON_ONCE(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE);
++ drv->states[drv->state_count].enter = intel_idle_ibrs;
++ }
++
+ if ((disabled_states_mask & BIT(drv->state_count)) ||
+ ((icpu->use_acpi || force_use_acpi) &&
+ intel_idle_off_by_default(mwait_hint) &&
+diff --git a/include/linux/cpu.h b/include/linux/cpu.h
+index 2c74773547444..314802f98b9da 100644
+--- a/include/linux/cpu.h
++++ b/include/linux/cpu.h
+@@ -68,6 +68,8 @@ extern ssize_t cpu_show_srbds(struct device *dev, struct device_attribute *attr,
+ extern ssize_t cpu_show_mmio_stale_data(struct device *dev,
+ struct device_attribute *attr,
+ char *buf);
++extern ssize_t cpu_show_retbleed(struct device *dev,
++ struct device_attribute *attr, char *buf);
+
+ extern __printf(4, 5)
+ struct device *cpu_device_create(struct device *parent, void *drvdata,
+diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
+index 34eed5f85ed60..88d94cf515e13 100644
+--- a/include/linux/kvm_host.h
++++ b/include/linux/kvm_host.h
+@@ -1511,7 +1511,7 @@ static inline void kvm_arch_end_assignment(struct kvm *kvm)
+ {
+ }
+
+-static inline bool kvm_arch_has_assigned_device(struct kvm *kvm)
++static __always_inline bool kvm_arch_has_assigned_device(struct kvm *kvm)
+ {
+ return false;
+ }
+diff --git a/include/linux/objtool.h b/include/linux/objtool.h
+index c81ea2264ad8a..376110ead758e 100644
+--- a/include/linux/objtool.h
++++ b/include/linux/objtool.h
+@@ -32,11 +32,16 @@ struct unwind_hint {
+ *
+ * UNWIND_HINT_FUNC: Generate the unwind metadata of a callable function.
+ * Useful for code which doesn't have an ELF function annotation.
++ *
++ * UNWIND_HINT_ENTRY: machine entry without stack, SYSCALL/SYSENTER etc.
+ */
+ #define UNWIND_HINT_TYPE_CALL 0
+ #define UNWIND_HINT_TYPE_REGS 1
+ #define UNWIND_HINT_TYPE_REGS_PARTIAL 2
+ #define UNWIND_HINT_TYPE_FUNC 3
++#define UNWIND_HINT_TYPE_ENTRY 4
++#define UNWIND_HINT_TYPE_SAVE 5
++#define UNWIND_HINT_TYPE_RESTORE 6
+
+ #ifdef CONFIG_STACK_VALIDATION
+
+@@ -122,7 +127,7 @@ struct unwind_hint {
+ * the debuginfo as necessary. It will also warn if it sees any
+ * inconsistencies.
+ */
+-.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
++.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+ .Lunwind_hint_ip_\@:
+ .pushsection .discard.unwind_hints
+ /* struct unwind_hint */
+@@ -175,7 +180,7 @@ struct unwind_hint {
+ #define ASM_REACHABLE
+ #else
+ #define ANNOTATE_INTRA_FUNCTION_CALL
+-.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
++.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+ .endm
+ .macro STACK_FRAME_NON_STANDARD func:req
+ .endm
+diff --git a/scripts/Makefile.build b/scripts/Makefile.build
+index 33c1ed5815229..2a0521f77e5f0 100644
+--- a/scripts/Makefile.build
++++ b/scripts/Makefile.build
+@@ -233,6 +233,7 @@ objtool_args = \
+ $(if $(CONFIG_FRAME_POINTER),, --no-fp) \
+ $(if $(CONFIG_GCOV_KERNEL), --no-unreachable) \
+ $(if $(CONFIG_RETPOLINE), --retpoline) \
++ $(if $(CONFIG_RETHUNK), --rethunk) \
+ $(if $(CONFIG_X86_SMAP), --uaccess) \
+ $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount) \
+ $(if $(CONFIG_SLS), --sls)
+diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
+index 9361a1ef02c99..d4d028595fb4b 100755
+--- a/scripts/link-vmlinux.sh
++++ b/scripts/link-vmlinux.sh
+@@ -130,6 +130,9 @@ objtool_link()
+
+ if is_enabled CONFIG_VMLINUX_VALIDATION; then
+ objtoolopt="${objtoolopt} --noinstr"
++ if is_enabled CONFIG_CPU_UNRET_ENTRY; then
++ objtoolopt="${objtoolopt} --unret"
++ fi
+ fi
+
+ if [ -n "${objtoolopt}" ]; then
+diff --git a/security/Kconfig b/security/Kconfig
+index 9b2c4925585a3..34e2d7edd0857 100644
+--- a/security/Kconfig
++++ b/security/Kconfig
+@@ -54,17 +54,6 @@ config SECURITY_NETWORK
+ implement socket and networking access controls.
+ If you are unsure how to answer this question, answer N.
+
+-config PAGE_TABLE_ISOLATION
+- bool "Remove the kernel mapping in user mode"
+- default y
+- depends on (X86_64 || X86_PAE) && !UML
+- help
+- This feature reduces the number of hardware side channels by
+- ensuring that the majority of kernel addresses are not mapped
+- into userspace.
+-
+- See Documentation/x86/pti.rst for more details.
+-
+ config SECURITY_INFINIBAND
+ bool "Infiniband Security Hooks"
+ depends on SECURITY && INFINIBAND
+diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
+index e17de69faa543..5d09ded0c491f 100644
+--- a/tools/arch/x86/include/asm/cpufeatures.h
++++ b/tools/arch/x86/include/asm/cpufeatures.h
+@@ -203,8 +203,8 @@
+ #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
+ /* FREE! ( 7*32+10) */
+ #define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */
+-#define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
+-#define X86_FEATURE_RETPOLINE_LFENCE ( 7*32+13) /* "" Use LFENCE for Spectre variant 2 */
++#define X86_FEATURE_KERNEL_IBRS ( 7*32+12) /* "" Set/clear IBRS on kernel entry/exit */
++#define X86_FEATURE_RSB_VMEXIT ( 7*32+13) /* "" Fill RSB on VM-Exit */
+ #define X86_FEATURE_INTEL_PPIN ( 7*32+14) /* Intel Processor Inventory Number */
+ #define X86_FEATURE_CDP_L2 ( 7*32+15) /* Code and Data Prioritization L2 */
+ #define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */
+@@ -295,6 +295,12 @@
+ #define X86_FEATURE_PER_THREAD_MBA (11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
+ #define X86_FEATURE_SGX1 (11*32+ 8) /* "" Basic SGX */
+ #define X86_FEATURE_SGX2 (11*32+ 9) /* "" SGX Enclave Dynamic Memory Management (EDMM) */
++#define X86_FEATURE_ENTRY_IBPB (11*32+10) /* "" Issue an IBPB on kernel entry */
++#define X86_FEATURE_RRSBA_CTRL (11*32+11) /* "" RET prediction control */
++#define X86_FEATURE_RETPOLINE (11*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
++#define X86_FEATURE_RETPOLINE_LFENCE (11*32+13) /* "" Use LFENCE for Spectre variant 2 */
++#define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */
++#define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
+
+ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
+ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
+@@ -315,6 +321,7 @@
+ #define X86_FEATURE_VIRT_SSBD (13*32+25) /* Virtualized Speculative Store Bypass Disable */
+ #define X86_FEATURE_AMD_SSB_NO (13*32+26) /* "" Speculative Store Bypass is fixed in hardware. */
+ #define X86_FEATURE_CPPC (13*32+27) /* Collaborative Processor Performance Control */
++#define X86_FEATURE_BTC_NO (13*32+29) /* "" Not vulnerable to Branch Type Confusion */
+
+ /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
+ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
+@@ -444,5 +451,6 @@
+ #define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
+ #define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
+ #define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
++#define X86_BUG_RETBLEED X86_BUG(26) /* CPU is affected by RETBleed */
+
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
+index 1231d63f836d8..f7be189e97232 100644
+--- a/tools/arch/x86/include/asm/disabled-features.h
++++ b/tools/arch/x86/include/asm/disabled-features.h
+@@ -56,6 +56,25 @@
+ # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31))
+ #endif
+
++#ifdef CONFIG_RETPOLINE
++# define DISABLE_RETPOLINE 0
++#else
++# define DISABLE_RETPOLINE ((1 << (X86_FEATURE_RETPOLINE & 31)) | \
++ (1 << (X86_FEATURE_RETPOLINE_LFENCE & 31)))
++#endif
++
++#ifdef CONFIG_RETHUNK
++# define DISABLE_RETHUNK 0
++#else
++# define DISABLE_RETHUNK (1 << (X86_FEATURE_RETHUNK & 31))
++#endif
++
++#ifdef CONFIG_CPU_UNRET_ENTRY
++# define DISABLE_UNRET 0
++#else
++# define DISABLE_UNRET (1 << (X86_FEATURE_UNRET & 31))
++#endif
++
+ #ifdef CONFIG_INTEL_IOMMU_SVM
+ # define DISABLE_ENQCMD 0
+ #else
+@@ -82,7 +101,7 @@
+ #define DISABLED_MASK8 0
+ #define DISABLED_MASK9 (DISABLE_SMAP|DISABLE_SGX)
+ #define DISABLED_MASK10 0
+-#define DISABLED_MASK11 0
++#define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET)
+ #define DISABLED_MASK12 0
+ #define DISABLED_MASK13 0
+ #define DISABLED_MASK14 0
+diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
+index 4425d6773183b..ad084326f24c2 100644
+--- a/tools/arch/x86/include/asm/msr-index.h
++++ b/tools/arch/x86/include/asm/msr-index.h
+@@ -51,6 +51,8 @@
+ #define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
+ #define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
+ #define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
++#define SPEC_CTRL_RRSBA_DIS_S_SHIFT 6 /* Disable RRSBA behavior */
++#define SPEC_CTRL_RRSBA_DIS_S BIT(SPEC_CTRL_RRSBA_DIS_S_SHIFT)
+
+ #define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
+ #define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */
+@@ -91,6 +93,7 @@
+ #define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
+ #define ARCH_CAP_RDCL_NO BIT(0) /* Not susceptible to Meltdown */
+ #define ARCH_CAP_IBRS_ALL BIT(1) /* Enhanced IBRS support */
++#define ARCH_CAP_RSBA BIT(2) /* RET may use alternative branch predictors */
+ #define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH BIT(3) /* Skip L1D flush on vmentry */
+ #define ARCH_CAP_SSB_NO BIT(4) /*
+ * Not susceptible to Speculative Store Bypass
+@@ -138,6 +141,13 @@
+ * bit available to control VERW
+ * behavior.
+ */
++#define ARCH_CAP_RRSBA BIT(19) /*
++ * Indicates RET may use predictors
++ * other than the RSB. With eIBRS
++ * enabled predictions in kernel mode
++ * are restricted to targets in
++ * kernel.
++ */
+
+ #define MSR_IA32_FLUSH_CMD 0x0000010b
+ #define L1D_FLUSH BIT(0) /*
+@@ -552,6 +562,9 @@
+ /* Fam 17h MSRs */
+ #define MSR_F17H_IRPERF 0xc00000e9
+
++#define MSR_ZEN2_SPECTRAL_CHICKEN 0xc00110e3
++#define MSR_ZEN2_SPECTRAL_CHICKEN_BIT BIT_ULL(1)
++
+ /* Fam 16h MSRs */
+ #define MSR_F16H_L2I_PERF_CTL 0xc0010230
+ #define MSR_F16H_L2I_PERF_CTR 0xc0010231
+diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
+index c81ea2264ad8a..376110ead758e 100644
+--- a/tools/include/linux/objtool.h
++++ b/tools/include/linux/objtool.h
+@@ -32,11 +32,16 @@ struct unwind_hint {
+ *
+ * UNWIND_HINT_FUNC: Generate the unwind metadata of a callable function.
+ * Useful for code which doesn't have an ELF function annotation.
++ *
++ * UNWIND_HINT_ENTRY: machine entry without stack, SYSCALL/SYSENTER etc.
+ */
+ #define UNWIND_HINT_TYPE_CALL 0
+ #define UNWIND_HINT_TYPE_REGS 1
+ #define UNWIND_HINT_TYPE_REGS_PARTIAL 2
+ #define UNWIND_HINT_TYPE_FUNC 3
++#define UNWIND_HINT_TYPE_ENTRY 4
++#define UNWIND_HINT_TYPE_SAVE 5
++#define UNWIND_HINT_TYPE_RESTORE 6
+
+ #ifdef CONFIG_STACK_VALIDATION
+
+@@ -122,7 +127,7 @@ struct unwind_hint {
+ * the debuginfo as necessary. It will also warn if it sees any
+ * inconsistencies.
+ */
+-.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
++.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+ .Lunwind_hint_ip_\@:
+ .pushsection .discard.unwind_hints
+ /* struct unwind_hint */
+@@ -175,7 +180,7 @@ struct unwind_hint {
+ #define ASM_REACHABLE
+ #else
+ #define ANNOTATE_INTRA_FUNCTION_CALL
+-.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
++.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 end=0
+ .endm
+ .macro STACK_FRAME_NON_STANDARD func:req
+ .endm
+diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
+index 943cb41cddf7c..1ecf50bbd5544 100644
+--- a/tools/objtool/arch/x86/decode.c
++++ b/tools/objtool/arch/x86/decode.c
+@@ -787,3 +787,8 @@ bool arch_is_retpoline(struct symbol *sym)
+ {
+ return !strncmp(sym->name, "__x86_indirect_", 15);
+ }
++
++bool arch_is_rethunk(struct symbol *sym)
++{
++ return !strcmp(sym->name, "__x86_return_thunk");
++}
+diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
+index fc6975ab8b06e..cd4bbc98f8c1f 100644
+--- a/tools/objtool/builtin-check.c
++++ b/tools/objtool/builtin-check.c
+@@ -21,7 +21,7 @@
+
+ bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
+ lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
+- ibt;
++ ibt, unret, rethunk;
+
+ static const char * const check_usage[] = {
+ "objtool check [<options>] file.o",
+@@ -37,6 +37,8 @@ const struct option check_options[] = {
+ OPT_BOOLEAN('f', "no-fp", &no_fp, "Skip frame pointer validation"),
+ OPT_BOOLEAN('u', "no-unreachable", &no_unreachable, "Skip 'unreachable instruction' warnings"),
+ OPT_BOOLEAN('r', "retpoline", &retpoline, "Validate retpoline assumptions"),
++ OPT_BOOLEAN(0, "rethunk", &rethunk, "validate and annotate rethunk usage"),
++ OPT_BOOLEAN(0, "unret", &unret, "validate entry unret placement"),
+ OPT_BOOLEAN('m', "module", &module, "Indicates the object will be part of a kernel module"),
+ OPT_BOOLEAN('b', "backtrace", &backtrace, "unwind on error"),
+ OPT_BOOLEAN('a', "uaccess", &uaccess, "enable uaccess checking"),
+diff --git a/tools/objtool/check.c b/tools/objtool/check.c
+index f66e4ac0af948..57b7a68d3b66a 100644
+--- a/tools/objtool/check.c
++++ b/tools/objtool/check.c
+@@ -374,7 +374,8 @@ static int decode_instructions(struct objtool_file *file)
+ sec->text = true;
+
+ if (!strcmp(sec->name, ".noinstr.text") ||
+- !strcmp(sec->name, ".entry.text"))
++ !strcmp(sec->name, ".entry.text") ||
++ !strncmp(sec->name, ".text.__x86.", 12))
+ sec->noinstr = true;
+
+ for (offset = 0; offset < sec->sh.sh_size; offset += insn->len) {
+@@ -747,6 +748,52 @@ static int create_retpoline_sites_sections(struct objtool_file *file)
+ return 0;
+ }
+
++static int create_return_sites_sections(struct objtool_file *file)
++{
++ struct instruction *insn;
++ struct section *sec;
++ int idx;
++
++ sec = find_section_by_name(file->elf, ".return_sites");
++ if (sec) {
++ WARN("file already has .return_sites, skipping");
++ return 0;
++ }
++
++ idx = 0;
++ list_for_each_entry(insn, &file->return_thunk_list, call_node)
++ idx++;
++
++ if (!idx)
++ return 0;
++
++ sec = elf_create_section(file->elf, ".return_sites", 0,
++ sizeof(int), idx);
++ if (!sec) {
++ WARN("elf_create_section: .return_sites");
++ return -1;
++ }
++
++ idx = 0;
++ list_for_each_entry(insn, &file->return_thunk_list, call_node) {
++
++ int *site = (int *)sec->data->d_buf + idx;
++ *site = 0;
++
++ if (elf_add_reloc_to_insn(file->elf, sec,
++ idx * sizeof(int),
++ R_X86_64_PC32,
++ insn->sec, insn->offset)) {
++ WARN("elf_add_reloc_to_insn: .return_sites");
++ return -1;
++ }
++
++ idx++;
++ }
++
++ return 0;
++}
++
+ static int create_ibt_endbr_seal_sections(struct objtool_file *file)
+ {
+ struct instruction *insn;
+@@ -1081,6 +1128,11 @@ __weak bool arch_is_retpoline(struct symbol *sym)
+ return false;
+ }
+
++__weak bool arch_is_rethunk(struct symbol *sym)
++{
++ return false;
++}
++
+ #define NEGATIVE_RELOC ((void *)-1L)
+
+ static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
+@@ -1248,6 +1300,20 @@ static void add_retpoline_call(struct objtool_file *file, struct instruction *in
+ annotate_call_site(file, insn, false);
+ }
+
++static void add_return_call(struct objtool_file *file, struct instruction *insn, bool add)
++{
++ /*
++ * Return thunk tail calls are really just returns in disguise,
++ * so convert them accordingly.
++ */
++ insn->type = INSN_RETURN;
++ insn->retpoline_safe = true;
++
++ /* Skip the non-text sections, specially .discard ones */
++ if (add && insn->sec->text)
++ list_add_tail(&insn->call_node, &file->return_thunk_list);
++}
++
+ static bool same_function(struct instruction *insn1, struct instruction *insn2)
+ {
+ return insn1->func->pfunc == insn2->func->pfunc;
+@@ -1300,6 +1366,9 @@ static int add_jump_destinations(struct objtool_file *file)
+ } else if (reloc->sym->retpoline_thunk) {
+ add_retpoline_call(file, insn);
+ continue;
++ } else if (reloc->sym->return_thunk) {
++ add_return_call(file, insn, true);
++ continue;
+ } else if (insn->func) {
+ /*
+ * External sibling call or internal sibling call with
+@@ -1318,6 +1387,21 @@ static int add_jump_destinations(struct objtool_file *file)
+
+ jump_dest = find_insn(file, dest_sec, dest_off);
+ if (!jump_dest) {
++ struct symbol *sym = find_symbol_by_offset(dest_sec, dest_off);
++
++ /*
++ * This is a special case for zen_untrain_ret().
++ * It jumps to __x86_return_thunk(), but objtool
++ * can't find the thunk's starting RET
++ * instruction, because the RET is also in the
++ * middle of another instruction. Objtool only
++ * knows about the outer instruction.
++ */
++ if (sym && sym->return_thunk) {
++ add_return_call(file, insn, false);
++ continue;
++ }
++
+ WARN_FUNC("can't find jump dest instruction at %s+0x%lx",
+ insn->sec, insn->offset, dest_sec->name,
+ dest_off);
+@@ -1947,16 +2031,35 @@ static int read_unwind_hints(struct objtool_file *file)
+
+ insn->hint = true;
+
+- if (ibt && hint->type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
++ if (hint->type == UNWIND_HINT_TYPE_SAVE) {
++ insn->hint = false;
++ insn->save = true;
++ continue;
++ }
++
++ if (hint->type == UNWIND_HINT_TYPE_RESTORE) {
++ insn->restore = true;
++ continue;
++ }
++
++ if (hint->type == UNWIND_HINT_TYPE_REGS_PARTIAL) {
+ struct symbol *sym = find_symbol_by_offset(insn->sec, insn->offset);
+
+- if (sym && sym->bind == STB_GLOBAL &&
+- insn->type != INSN_ENDBR && !insn->noendbr) {
+- WARN_FUNC("UNWIND_HINT_IRET_REGS without ENDBR",
+- insn->sec, insn->offset);
++ if (sym && sym->bind == STB_GLOBAL) {
++ if (ibt && insn->type != INSN_ENDBR && !insn->noendbr) {
++ WARN_FUNC("UNWIND_HINT_IRET_REGS without ENDBR",
++ insn->sec, insn->offset);
++ }
++
++ insn->entry = 1;
+ }
+ }
+
++ if (hint->type == UNWIND_HINT_TYPE_ENTRY) {
++ hint->type = UNWIND_HINT_TYPE_CALL;
++ insn->entry = 1;
++ }
++
+ if (hint->type == UNWIND_HINT_TYPE_FUNC) {
+ insn->cfi = &func_cfi;
+ continue;
+@@ -2030,8 +2133,10 @@ static int read_retpoline_hints(struct objtool_file *file)
+ }
+
+ if (insn->type != INSN_JUMP_DYNAMIC &&
+- insn->type != INSN_CALL_DYNAMIC) {
+- WARN_FUNC("retpoline_safe hint not an indirect jump/call",
++ insn->type != INSN_CALL_DYNAMIC &&
++ insn->type != INSN_RETURN &&
++ insn->type != INSN_NOP) {
++ WARN_FUNC("retpoline_safe hint not an indirect jump/call/ret/nop",
+ insn->sec, insn->offset);
+ return -1;
+ }
+@@ -2182,6 +2287,9 @@ static int classify_symbols(struct objtool_file *file)
+ if (arch_is_retpoline(func))
+ func->retpoline_thunk = true;
+
++ if (arch_is_rethunk(func))
++ func->return_thunk = true;
++
+ if (!strcmp(func->name, "__fentry__"))
+ func->fentry = true;
+
+@@ -3324,8 +3432,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
+ return 1;
+ }
+
+- visited = 1 << state.uaccess;
+- if (insn->visited) {
++ visited = VISITED_BRANCH << state.uaccess;
++ if (insn->visited & VISITED_BRANCH_MASK) {
+ if (!insn->hint && !insn_cfi_match(insn, &state.cfi))
+ return 1;
+
+@@ -3339,6 +3447,35 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
+ state.instr += insn->instr;
+
+ if (insn->hint) {
++ if (insn->restore) {
++ struct instruction *save_insn, *i;
++
++ i = insn;
++ save_insn = NULL;
++
++ sym_for_each_insn_continue_reverse(file, func, i) {
++ if (i->save) {
++ save_insn = i;
++ break;
++ }
++ }
++
++ if (!save_insn) {
++ WARN_FUNC("no corresponding CFI save for CFI restore",
++ sec, insn->offset);
++ return 1;
++ }
++
++ if (!save_insn->visited) {
++ WARN_FUNC("objtool isn't smart enough to handle this CFI save/restore combo",
++ sec, insn->offset);
++ return 1;
++ }
++
++ insn->cfi = save_insn->cfi;
++ nr_cfi_reused++;
++ }
++
+ state.cfi = *insn->cfi;
+ } else {
+ /* XXX track if we actually changed state.cfi */
+@@ -3554,6 +3691,145 @@ static int validate_unwind_hints(struct objtool_file *file, struct section *sec)
+ return warnings;
+ }
+
++/*
++ * Validate rethunk entry constraint: must untrain RET before the first RET.
++ *
++ * Follow every branch (intra-function) and ensure ANNOTATE_UNRET_END comes
++ * before an actual RET instruction.
++ */
++static int validate_entry(struct objtool_file *file, struct instruction *insn)
++{
++ struct instruction *next, *dest;
++ int ret, warnings = 0;
++
++ for (;;) {
++ next = next_insn_to_validate(file, insn);
++
++ if (insn->visited & VISITED_ENTRY)
++ return 0;
++
++ insn->visited |= VISITED_ENTRY;
++
++ if (!insn->ignore_alts && !list_empty(&insn->alts)) {
++ struct alternative *alt;
++ bool skip_orig = false;
++
++ list_for_each_entry(alt, &insn->alts, list) {
++ if (alt->skip_orig)
++ skip_orig = true;
++
++ ret = validate_entry(file, alt->insn);
++ if (ret) {
++ if (backtrace)
++ BT_FUNC("(alt)", insn);
++ return ret;
++ }
++ }
++
++ if (skip_orig)
++ return 0;
++ }
++
++ switch (insn->type) {
++
++ case INSN_CALL_DYNAMIC:
++ case INSN_JUMP_DYNAMIC:
++ case INSN_JUMP_DYNAMIC_CONDITIONAL:
++ WARN_FUNC("early indirect call", insn->sec, insn->offset);
++ return 1;
++
++ case INSN_JUMP_UNCONDITIONAL:
++ case INSN_JUMP_CONDITIONAL:
++ if (!is_sibling_call(insn)) {
++ if (!insn->jump_dest) {
++ WARN_FUNC("unresolved jump target after linking?!?",
++ insn->sec, insn->offset);
++ return -1;
++ }
++ ret = validate_entry(file, insn->jump_dest);
++ if (ret) {
++ if (backtrace) {
++ BT_FUNC("(branch%s)", insn,
++ insn->type == INSN_JUMP_CONDITIONAL ? "-cond" : "");
++ }
++ return ret;
++ }
++
++ if (insn->type == INSN_JUMP_UNCONDITIONAL)
++ return 0;
++
++ break;
++ }
++
++ /* fallthrough */
++ case INSN_CALL:
++ dest = find_insn(file, insn->call_dest->sec,
++ insn->call_dest->offset);
++ if (!dest) {
++ WARN("Unresolved function after linking!?: %s",
++ insn->call_dest->name);
++ return -1;
++ }
++
++ ret = validate_entry(file, dest);
++ if (ret) {
++ if (backtrace)
++ BT_FUNC("(call)", insn);
++ return ret;
++ }
++ /*
++ * If a call returns without error, it must have seen UNTRAIN_RET.
++ * Therefore any non-error return is a success.
++ */
++ return 0;
++
++ case INSN_RETURN:
++ WARN_FUNC("RET before UNTRAIN", insn->sec, insn->offset);
++ return 1;
++
++ case INSN_NOP:
++ if (insn->retpoline_safe)
++ return 0;
++ break;
++
++ default:
++ break;
++ }
++
++ if (!next) {
++ WARN_FUNC("teh end!", insn->sec, insn->offset);
++ return -1;
++ }
++ insn = next;
++ }
++
++ return warnings;
++}
++
++/*
++ * Validate that all branches starting at 'insn->entry' encounter UNRET_END
++ * before RET.
++ */
++static int validate_unret(struct objtool_file *file)
++{
++ struct instruction *insn;
++ int ret, warnings = 0;
++
++ for_each_insn(file, insn) {
++ if (!insn->entry)
++ continue;
++
++ ret = validate_entry(file, insn);
++ if (ret < 0) {
++ WARN_FUNC("Failed UNRET validation", insn->sec, insn->offset);
++ return ret;
++ }
++ warnings += ret;
++ }
++
++ return warnings;
++}
++
+ static int validate_retpoline(struct objtool_file *file)
+ {
+ struct instruction *insn;
+@@ -3561,7 +3837,8 @@ static int validate_retpoline(struct objtool_file *file)
+
+ for_each_insn(file, insn) {
+ if (insn->type != INSN_JUMP_DYNAMIC &&
+- insn->type != INSN_CALL_DYNAMIC)
++ insn->type != INSN_CALL_DYNAMIC &&
++ insn->type != INSN_RETURN)
+ continue;
+
+ if (insn->retpoline_safe)
+@@ -3576,9 +3853,17 @@ static int validate_retpoline(struct objtool_file *file)
+ if (!strcmp(insn->sec->name, ".init.text") && !module)
+ continue;
+
+- WARN_FUNC("indirect %s found in RETPOLINE build",
+- insn->sec, insn->offset,
+- insn->type == INSN_JUMP_DYNAMIC ? "jump" : "call");
++ if (insn->type == INSN_RETURN) {
++ if (rethunk) {
++ WARN_FUNC("'naked' return found in RETHUNK build",
++ insn->sec, insn->offset);
++ } else
++ continue;
++ } else {
++ WARN_FUNC("indirect %s found in RETPOLINE build",
++ insn->sec, insn->offset,
++ insn->type == INSN_JUMP_DYNAMIC ? "jump" : "call");
++ }
+
+ warnings++;
+ }
+@@ -3911,6 +4196,17 @@ int check(struct objtool_file *file)
+ goto out;
+ warnings += ret;
+
++ if (unret) {
++ /*
++ * Must be after validate_branch() and friends, it plays
++ * further games with insn->visited.
++ */
++ ret = validate_unret(file);
++ if (ret < 0)
++ return ret;
++ warnings += ret;
++ }
++
+ if (ibt) {
+ ret = validate_ibt(file);
+ if (ret < 0)
+@@ -3937,6 +4233,13 @@ int check(struct objtool_file *file)
+ warnings += ret;
+ }
+
++ if (rethunk) {
++ ret = create_return_sites_sections(file);
++ if (ret < 0)
++ goto out;
++ warnings += ret;
++ }
++
+ if (mcount) {
+ ret = create_mcount_loc_sections(file);
+ if (ret < 0)
+diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
+index 9b19cc3041955..beb2f3aa94ffc 100644
+--- a/tools/objtool/include/objtool/arch.h
++++ b/tools/objtool/include/objtool/arch.h
+@@ -89,6 +89,7 @@ const char *arch_ret_insn(int len);
+ int arch_decode_hint_reg(u8 sp_reg, int *base);
+
+ bool arch_is_retpoline(struct symbol *sym);
++bool arch_is_rethunk(struct symbol *sym);
+
+ int arch_rewrite_retpolines(struct objtool_file *file);
+
+diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h
+index c39dbfaef6dcb..b6bb605faf3f7 100644
+--- a/tools/objtool/include/objtool/builtin.h
++++ b/tools/objtool/include/objtool/builtin.h
+@@ -10,7 +10,7 @@
+ extern const struct option check_options[];
+ extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
+ lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
+- ibt;
++ ibt, unret, rethunk;
+
+ extern int cmd_parse_options(int argc, const char **argv, const char * const usage[]);
+
+diff --git a/tools/objtool/include/objtool/check.h b/tools/objtool/include/objtool/check.h
+index f10d7374f388a..036129cebeeef 100644
+--- a/tools/objtool/include/objtool/check.h
++++ b/tools/objtool/include/objtool/check.h
+@@ -46,16 +46,19 @@ struct instruction {
+ enum insn_type type;
+ unsigned long immediate;
+
+- u8 dead_end : 1,
+- ignore : 1,
+- ignore_alts : 1,
+- hint : 1,
+- retpoline_safe : 1,
+- noendbr : 1;
+- /* 2 bit hole */
++ u16 dead_end : 1,
++ ignore : 1,
++ ignore_alts : 1,
++ hint : 1,
++ save : 1,
++ restore : 1,
++ retpoline_safe : 1,
++ noendbr : 1,
++ entry : 1;
++ /* 7 bit hole */
++
+ s8 instr;
+ u8 visited;
+- /* u8 hole */
+
+ struct alt_group *alt_group;
+ struct symbol *call_dest;
+@@ -69,6 +72,11 @@ struct instruction {
+ struct cfi_state *cfi;
+ };
+
++#define VISITED_BRANCH 0x01
++#define VISITED_BRANCH_UACCESS 0x02
++#define VISITED_BRANCH_MASK 0x03
++#define VISITED_ENTRY 0x04
++
+ static inline bool is_static_jump(struct instruction *insn)
+ {
+ return insn->type == INSN_JUMP_CONDITIONAL ||
+diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
+index 82e57eb4b4c5d..94a618e2a79e3 100644
+--- a/tools/objtool/include/objtool/elf.h
++++ b/tools/objtool/include/objtool/elf.h
+@@ -57,6 +57,7 @@ struct symbol {
+ u8 uaccess_safe : 1;
+ u8 static_call_tramp : 1;
+ u8 retpoline_thunk : 1;
++ u8 return_thunk : 1;
+ u8 fentry : 1;
+ u8 profiling_func : 1;
+ struct list_head pv_target;
+diff --git a/tools/objtool/include/objtool/objtool.h b/tools/objtool/include/objtool/objtool.h
+index a6e72d916807d..7f2d1b0953333 100644
+--- a/tools/objtool/include/objtool/objtool.h
++++ b/tools/objtool/include/objtool/objtool.h
+@@ -24,6 +24,7 @@ struct objtool_file {
+ struct list_head insn_list;
+ DECLARE_HASHTABLE(insn_hash, 20);
+ struct list_head retpoline_call_list;
++ struct list_head return_thunk_list;
+ struct list_head static_call_list;
+ struct list_head mcount_loc_list;
+ struct list_head endbr_list;
+diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c
+index 843ff3c2f28e4..983687345d355 100644
+--- a/tools/objtool/objtool.c
++++ b/tools/objtool/objtool.c
+@@ -126,6 +126,7 @@ struct objtool_file *objtool_open_read(const char *_objname)
+ INIT_LIST_HEAD(&file.insn_list);
+ hash_init(file.insn_hash);
+ INIT_LIST_HEAD(&file.retpoline_call_list);
++ INIT_LIST_HEAD(&file.return_thunk_list);
+ INIT_LIST_HEAD(&file.static_call_list);
+ INIT_LIST_HEAD(&file.mcount_loc_list);
+ INIT_LIST_HEAD(&file.endbr_list);
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-07-29 16:39 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-07-29 16:39 UTC (permalink / raw
To: gentoo-commits
commit: e3d4af682ac7e412f6f7e870493e0cbb2b17a0d2
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Fri Jul 29 16:39:00 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Fri Jul 29 16:39:00 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=e3d4af68
Linux patch 5.18.15
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1014_linux-5.18.15.patch | 9072 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 9076 insertions(+)
diff --git a/0000_README b/0000_README
index 32e6ba50..7c448f23 100644
--- a/0000_README
+++ b/0000_README
@@ -99,6 +99,10 @@ Patch: 1013_linux-5.18.14.patch
From: http://www.kernel.org
Desc: Linux 5.18.14
+Patch: 1014_linux-5.18.15.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.15
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1014_linux-5.18.15.patch b/1014_linux-5.18.15.patch
new file mode 100644
index 00000000..01c926b9
--- /dev/null
+++ b/1014_linux-5.18.15.patch
@@ -0,0 +1,9072 @@
+diff --git a/Makefile b/Makefile
+index d3723b2f6d6ca..5957afa296922 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 14
++SUBLEVEL = 15
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
+index c6ca1b9cbf712..8e236a0221564 100644
+--- a/arch/riscv/Makefile
++++ b/arch/riscv/Makefile
+@@ -73,6 +73,7 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
+ endif
+
+ KBUILD_CFLAGS_MODULE += $(call cc-option,-mno-relax)
++KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
+
+ # GCC versions that support the "-mstrict-align" option default to allowing
+ # unaligned accesses. While unaligned accesses are explicitly allowed in the
+diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
+index fe1742c4ca498..1f156098a5bf5 100644
+--- a/arch/x86/events/intel/lbr.c
++++ b/arch/x86/events/intel/lbr.c
+@@ -278,9 +278,9 @@ enum {
+ };
+
+ /*
+- * For formats with LBR_TSX flags (e.g. LBR_FORMAT_EIP_FLAGS2), bits 61:62 in
+- * MSR_LAST_BRANCH_FROM_x are the TSX flags when TSX is supported, but when
+- * TSX is not supported they have no consistent behavior:
++ * For format LBR_FORMAT_EIP_FLAGS2, bits 61:62 in MSR_LAST_BRANCH_FROM_x
++ * are the TSX flags when TSX is supported, but when TSX is not supported
++ * they have no consistent behavior:
+ *
+ * - For wrmsr(), bits 61:62 are considered part of the sign extension.
+ * - For HW updates (branch captures) bits 61:62 are always OFF and are not
+@@ -288,7 +288,7 @@ enum {
+ *
+ * Therefore, if:
+ *
+- * 1) LBR has TSX format
++ * 1) LBR format LBR_FORMAT_EIP_FLAGS2
+ * 2) CPU has no TSX support enabled
+ *
+ * ... then any value passed to wrmsr() must be sign extended to 63 bits and any
+@@ -300,7 +300,7 @@ static inline bool lbr_from_signext_quirk_needed(void)
+ bool tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
+ boot_cpu_has(X86_FEATURE_RTM);
+
+- return !tsx_support && x86_pmu.lbr_has_tsx;
++ return !tsx_support;
+ }
+
+ static DEFINE_STATIC_KEY_FALSE(lbr_from_quirk_key);
+@@ -1611,9 +1611,6 @@ void intel_pmu_lbr_init_hsw(void)
+ x86_pmu.lbr_sel_map = hsw_lbr_sel_map;
+
+ x86_get_pmu(smp_processor_id())->task_ctx_cache = create_lbr_kmem_cache(size, 0);
+-
+- if (lbr_from_signext_quirk_needed())
+- static_branch_enable(&lbr_from_quirk_key);
+ }
+
+ /* skylake */
+@@ -1704,7 +1701,11 @@ void intel_pmu_lbr_init(void)
+ switch (x86_pmu.intel_cap.lbr_format) {
+ case LBR_FORMAT_EIP_FLAGS2:
+ x86_pmu.lbr_has_tsx = 1;
+- fallthrough;
++ x86_pmu.lbr_from_flags = 1;
++ if (lbr_from_signext_quirk_needed())
++ static_branch_enable(&lbr_from_quirk_key);
++ break;
++
+ case LBR_FORMAT_EIP_FLAGS:
+ x86_pmu.lbr_from_flags = 1;
+ break;
+diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
+index 5d09ded0c491f..49889f171e860 100644
+--- a/arch/x86/include/asm/cpufeatures.h
++++ b/arch/x86/include/asm/cpufeatures.h
+@@ -301,6 +301,7 @@
+ #define X86_FEATURE_RETPOLINE_LFENCE (11*32+13) /* "" Use LFENCE for Spectre variant 2 */
+ #define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */
+ #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
++#define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
+
+ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
+ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
+diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
+index 10a3bfc1eb230..38a3e86e665ef 100644
+--- a/arch/x86/include/asm/nospec-branch.h
++++ b/arch/x86/include/asm/nospec-branch.h
+@@ -297,6 +297,8 @@ do { \
+ alternative_msr_write(MSR_IA32_SPEC_CTRL, \
+ spec_ctrl_current() | SPEC_CTRL_IBRS, \
+ X86_FEATURE_USE_IBRS_FW); \
++ alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, \
++ X86_FEATURE_USE_IBPB_FW); \
+ } while (0)
+
+ #define firmware_restrict_branch_speculation_end() \
+diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
+index 46427b785bc89..d440f6726df07 100644
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -555,7 +555,9 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end)
+ dest = addr + insn.length + insn.immediate.value;
+
+ if (__static_call_fixup(addr, op, dest) ||
+- WARN_ON_ONCE(dest != &__x86_return_thunk))
++ WARN_ONCE(dest != &__x86_return_thunk,
++ "missing return thunk: %pS-%pS: %*ph",
++ addr, dest, 5, addr))
+ continue;
+
+ DPRINTK("return thunk at: %pS (%px) len: %d to: %pS",
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index 0b64e894b3838..8179fa4d5004f 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -968,6 +968,7 @@ static inline const char *spectre_v2_module_string(void) { return ""; }
+ #define SPECTRE_V2_LFENCE_MSG "WARNING: LFENCE mitigation is not recommended for this CPU, data leaks possible!\n"
+ #define SPECTRE_V2_EIBRS_EBPF_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!\n"
+ #define SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS+LFENCE mitigation and SMT, data leaks possible via Spectre v2 BHB attacks!\n"
++#define SPECTRE_V2_IBRS_PERF_MSG "WARNING: IBRS mitigation selected on Enhanced IBRS CPU, this may cause unnecessary performance loss\n"
+
+ #ifdef CONFIG_BPF_SYSCALL
+ void unpriv_ebpf_notify(int new_state)
+@@ -1408,6 +1409,8 @@ static void __init spectre_v2_select_mitigation(void)
+
+ case SPECTRE_V2_IBRS:
+ setup_force_cpu_cap(X86_FEATURE_KERNEL_IBRS);
++ if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED))
++ pr_warn(SPECTRE_V2_IBRS_PERF_MSG);
+ break;
+
+ case SPECTRE_V2_LFENCE:
+@@ -1509,7 +1512,16 @@ static void __init spectre_v2_select_mitigation(void)
+ * the CPU supports Enhanced IBRS, kernel might un-intentionally not
+ * enable IBRS around firmware calls.
+ */
+- if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) {
++ if (boot_cpu_has_bug(X86_BUG_RETBLEED) &&
++ (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
++ boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)) {
++
++ if (retbleed_cmd != RETBLEED_CMD_IBPB) {
++ setup_force_cpu_cap(X86_FEATURE_USE_IBPB_FW);
++ pr_info("Enabling Speculation Barrier for firmware calls\n");
++ }
++
++ } else if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) {
+ setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
+ pr_info("Enabling Restricted Speculation for firmware calls\n");
+ }
+diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
+index 57ca7aa0e169a..b8e26b6b55236 100644
+--- a/drivers/acpi/cppc_acpi.c
++++ b/drivers/acpi/cppc_acpi.c
+@@ -764,7 +764,8 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+
+ if (!osc_cpc_flexible_adr_space_confirmed) {
+ pr_debug("Flexible address space capability not supported\n");
+- goto out_free;
++ if (!cpc_supported_by_cpu())
++ goto out_free;
+ }
+
+ addr = ioremap(gas_t->address, gas_t->bit_width/8);
+@@ -791,7 +792,8 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+ }
+ if (!osc_cpc_flexible_adr_space_confirmed) {
+ pr_debug("Flexible address space capability not supported\n");
+- goto out_free;
++ if (!cpc_supported_by_cpu())
++ goto out_free;
+ }
+ } else {
+ if (gas_t->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE || !cpc_ffh_supported()) {
+diff --git a/drivers/bus/mhi/host/pci_generic.c b/drivers/bus/mhi/host/pci_generic.c
+index 541ced27d9412..de1e934a4f7ec 100644
+--- a/drivers/bus/mhi/host/pci_generic.c
++++ b/drivers/bus/mhi/host/pci_generic.c
+@@ -446,14 +446,93 @@ static const struct mhi_pci_dev_info mhi_sierra_em919x_info = {
+ .sideband_wake = false,
+ };
+
++static const struct mhi_channel_config mhi_telit_fn980_hw_v1_channels[] = {
++ MHI_CHANNEL_CONFIG_UL(14, "QMI", 32, 0),
++ MHI_CHANNEL_CONFIG_DL(15, "QMI", 32, 0),
++ MHI_CHANNEL_CONFIG_UL(20, "IPCR", 16, 0),
++ MHI_CHANNEL_CONFIG_DL_AUTOQUEUE(21, "IPCR", 16, 0),
++ MHI_CHANNEL_CONFIG_HW_UL(100, "IP_HW0", 128, 1),
++ MHI_CHANNEL_CONFIG_HW_DL(101, "IP_HW0", 128, 2),
++};
++
++static struct mhi_event_config mhi_telit_fn980_hw_v1_events[] = {
++ MHI_EVENT_CONFIG_CTRL(0, 128),
++ MHI_EVENT_CONFIG_HW_DATA(1, 1024, 100),
++ MHI_EVENT_CONFIG_HW_DATA(2, 2048, 101)
++};
++
++static struct mhi_controller_config modem_telit_fn980_hw_v1_config = {
++ .max_channels = 128,
++ .timeout_ms = 20000,
++ .num_channels = ARRAY_SIZE(mhi_telit_fn980_hw_v1_channels),
++ .ch_cfg = mhi_telit_fn980_hw_v1_channels,
++ .num_events = ARRAY_SIZE(mhi_telit_fn980_hw_v1_events),
++ .event_cfg = mhi_telit_fn980_hw_v1_events,
++};
++
++static const struct mhi_pci_dev_info mhi_telit_fn980_hw_v1_info = {
++ .name = "telit-fn980-hwv1",
++ .fw = "qcom/sdx55m/sbl1.mbn",
++ .edl = "qcom/sdx55m/edl.mbn",
++ .config = &modem_telit_fn980_hw_v1_config,
++ .bar_num = MHI_PCI_DEFAULT_BAR_NUM,
++ .dma_data_width = 32,
++ .mru_default = 32768,
++ .sideband_wake = false,
++};
++
++static const struct mhi_channel_config mhi_telit_fn990_channels[] = {
++ MHI_CHANNEL_CONFIG_UL_SBL(2, "SAHARA", 32, 0),
++ MHI_CHANNEL_CONFIG_DL_SBL(3, "SAHARA", 32, 0),
++ MHI_CHANNEL_CONFIG_UL(4, "DIAG", 64, 1),
++ MHI_CHANNEL_CONFIG_DL(5, "DIAG", 64, 1),
++ MHI_CHANNEL_CONFIG_UL(12, "MBIM", 32, 0),
++ MHI_CHANNEL_CONFIG_DL(13, "MBIM", 32, 0),
++ MHI_CHANNEL_CONFIG_UL(32, "DUN", 32, 0),
++ MHI_CHANNEL_CONFIG_DL(33, "DUN", 32, 0),
++ MHI_CHANNEL_CONFIG_HW_UL(100, "IP_HW0_MBIM", 128, 2),
++ MHI_CHANNEL_CONFIG_HW_DL(101, "IP_HW0_MBIM", 128, 3),
++};
++
++static struct mhi_event_config mhi_telit_fn990_events[] = {
++ MHI_EVENT_CONFIG_CTRL(0, 128),
++ MHI_EVENT_CONFIG_DATA(1, 128),
++ MHI_EVENT_CONFIG_HW_DATA(2, 1024, 100),
++ MHI_EVENT_CONFIG_HW_DATA(3, 2048, 101)
++};
++
++static const struct mhi_controller_config modem_telit_fn990_config = {
++ .max_channels = 128,
++ .timeout_ms = 20000,
++ .num_channels = ARRAY_SIZE(mhi_telit_fn990_channels),
++ .ch_cfg = mhi_telit_fn990_channels,
++ .num_events = ARRAY_SIZE(mhi_telit_fn990_events),
++ .event_cfg = mhi_telit_fn990_events,
++};
++
++static const struct mhi_pci_dev_info mhi_telit_fn990_info = {
++ .name = "telit-fn990",
++ .config = &modem_telit_fn990_config,
++ .bar_num = MHI_PCI_DEFAULT_BAR_NUM,
++ .dma_data_width = 32,
++ .sideband_wake = false,
++ .mru_default = 32768,
++};
++
+ static const struct pci_device_id mhi_pci_id_table[] = {
+ /* EM919x (sdx55), use the same vid:pid as qcom-sdx55m */
+ { PCI_DEVICE_SUB(PCI_VENDOR_ID_QCOM, 0x0306, 0x18d7, 0x0200),
+ .driver_data = (kernel_ulong_t) &mhi_sierra_em919x_info },
++ /* Telit FN980 hardware revision v1 */
++ { PCI_DEVICE_SUB(PCI_VENDOR_ID_QCOM, 0x0306, 0x1C5D, 0x2000),
++ .driver_data = (kernel_ulong_t) &mhi_telit_fn980_hw_v1_info },
+ { PCI_DEVICE(PCI_VENDOR_ID_QCOM, 0x0306),
+ .driver_data = (kernel_ulong_t) &mhi_qcom_sdx55_info },
+ { PCI_DEVICE(PCI_VENDOR_ID_QCOM, 0x0304),
+ .driver_data = (kernel_ulong_t) &mhi_qcom_sdx24_info },
++ /* Telit FN990 */
++ { PCI_DEVICE_SUB(PCI_VENDOR_ID_QCOM, 0x0308, 0x1c5d, 0x2010),
++ .driver_data = (kernel_ulong_t) &mhi_telit_fn990_info },
+ { PCI_DEVICE(0x1eac, 0x1001), /* EM120R-GL (sdx24) */
+ .driver_data = (kernel_ulong_t) &mhi_quectel_em1xx_info },
+ { PCI_DEVICE(0x1eac, 0x1002), /* EM160R-GL (sdx24) */
+diff --git a/drivers/clk/clk-lan966x.c b/drivers/clk/clk-lan966x.c
+index d1535ac13e894..81cb90955d68b 100644
+--- a/drivers/clk/clk-lan966x.c
++++ b/drivers/clk/clk-lan966x.c
+@@ -213,7 +213,7 @@ static int lan966x_gate_clk_register(struct device *dev,
+
+ hw_data->hws[i] =
+ devm_clk_hw_register_gate(dev, clk_gate_desc[idx].name,
+- "lan966x", 0, base,
++ "lan966x", 0, gate_base,
+ clk_gate_desc[idx].bit_idx,
+ 0, &clk_gate_lock);
+
+diff --git a/drivers/crypto/qat/qat_4xxx/adf_drv.c b/drivers/crypto/qat/qat_4xxx/adf_drv.c
+index fa4c350c1bf92..a6c78b9c730bc 100644
+--- a/drivers/crypto/qat/qat_4xxx/adf_drv.c
++++ b/drivers/crypto/qat/qat_4xxx/adf_drv.c
+@@ -75,13 +75,6 @@ static int adf_crypto_dev_config(struct adf_accel_dev *accel_dev)
+ if (ret)
+ goto err;
+
+- /* Temporarily set the number of crypto instances to zero to avoid
+- * registering the crypto algorithms.
+- * This will be removed when the algorithms will support the
+- * CRYPTO_TFM_REQ_MAY_BACKLOG flag
+- */
+- instances = 0;
+-
+ for (i = 0; i < instances; i++) {
+ val = i;
+ bank = i * 2;
+diff --git a/drivers/crypto/qat/qat_common/Makefile b/drivers/crypto/qat/qat_common/Makefile
+index f25a6c8edfc73..04f058acc4d37 100644
+--- a/drivers/crypto/qat/qat_common/Makefile
++++ b/drivers/crypto/qat/qat_common/Makefile
+@@ -16,6 +16,7 @@ intel_qat-objs := adf_cfg.o \
+ qat_crypto.o \
+ qat_algs.o \
+ qat_asym_algs.o \
++ qat_algs_send.o \
+ qat_uclo.o \
+ qat_hal.o
+
+diff --git a/drivers/crypto/qat/qat_common/adf_transport.c b/drivers/crypto/qat/qat_common/adf_transport.c
+index 8ba28409fb74b..630d0483c4e0a 100644
+--- a/drivers/crypto/qat/qat_common/adf_transport.c
++++ b/drivers/crypto/qat/qat_common/adf_transport.c
+@@ -8,6 +8,9 @@
+ #include "adf_cfg.h"
+ #include "adf_common_drv.h"
+
++#define ADF_MAX_RING_THRESHOLD 80
++#define ADF_PERCENT(tot, percent) (((tot) * (percent)) / 100)
++
+ static inline u32 adf_modulo(u32 data, u32 shift)
+ {
+ u32 div = data >> shift;
+@@ -77,6 +80,11 @@ static void adf_disable_ring_irq(struct adf_etr_bank_data *bank, u32 ring)
+ bank->irq_mask);
+ }
+
++bool adf_ring_nearly_full(struct adf_etr_ring_data *ring)
++{
++ return atomic_read(ring->inflights) > ring->threshold;
++}
++
+ int adf_send_message(struct adf_etr_ring_data *ring, u32 *msg)
+ {
+ struct adf_hw_csr_ops *csr_ops = GET_CSR_OPS(ring->bank->accel_dev);
+@@ -217,6 +225,7 @@ int adf_create_ring(struct adf_accel_dev *accel_dev, const char *section,
+ struct adf_etr_bank_data *bank;
+ struct adf_etr_ring_data *ring;
+ char val[ADF_CFG_MAX_VAL_LEN_IN_BYTES];
++ int max_inflights;
+ u32 ring_num;
+ int ret;
+
+@@ -263,6 +272,8 @@ int adf_create_ring(struct adf_accel_dev *accel_dev, const char *section,
+ ring->ring_size = adf_verify_ring_size(msg_size, num_msgs);
+ ring->head = 0;
+ ring->tail = 0;
++ max_inflights = ADF_MAX_INFLIGHTS(ring->ring_size, ring->msg_size);
++ ring->threshold = ADF_PERCENT(max_inflights, ADF_MAX_RING_THRESHOLD);
+ atomic_set(ring->inflights, 0);
+ ret = adf_init_ring(ring);
+ if (ret)
+diff --git a/drivers/crypto/qat/qat_common/adf_transport.h b/drivers/crypto/qat/qat_common/adf_transport.h
+index 2c95f1697c76f..e6ef6f9b76913 100644
+--- a/drivers/crypto/qat/qat_common/adf_transport.h
++++ b/drivers/crypto/qat/qat_common/adf_transport.h
+@@ -14,6 +14,7 @@ int adf_create_ring(struct adf_accel_dev *accel_dev, const char *section,
+ const char *ring_name, adf_callback_fn callback,
+ int poll_mode, struct adf_etr_ring_data **ring_ptr);
+
++bool adf_ring_nearly_full(struct adf_etr_ring_data *ring);
+ int adf_send_message(struct adf_etr_ring_data *ring, u32 *msg);
+ void adf_remove_ring(struct adf_etr_ring_data *ring);
+ #endif
+diff --git a/drivers/crypto/qat/qat_common/adf_transport_internal.h b/drivers/crypto/qat/qat_common/adf_transport_internal.h
+index 501bcf0f1809a..8b2c92ba7ca1f 100644
+--- a/drivers/crypto/qat/qat_common/adf_transport_internal.h
++++ b/drivers/crypto/qat/qat_common/adf_transport_internal.h
+@@ -22,6 +22,7 @@ struct adf_etr_ring_data {
+ spinlock_t lock; /* protects ring data struct */
+ u16 head;
+ u16 tail;
++ u32 threshold;
+ u8 ring_number;
+ u8 ring_size;
+ u8 msg_size;
+diff --git a/drivers/crypto/qat/qat_common/qat_algs.c b/drivers/crypto/qat/qat_common/qat_algs.c
+index f998ed58457c2..873533dc43a74 100644
+--- a/drivers/crypto/qat/qat_common/qat_algs.c
++++ b/drivers/crypto/qat/qat_common/qat_algs.c
+@@ -17,7 +17,7 @@
+ #include <crypto/xts.h>
+ #include <linux/dma-mapping.h>
+ #include "adf_accel_devices.h"
+-#include "adf_transport.h"
++#include "qat_algs_send.h"
+ #include "adf_common_drv.h"
+ #include "qat_crypto.h"
+ #include "icp_qat_hw.h"
+@@ -46,19 +46,6 @@
+ static DEFINE_MUTEX(algs_lock);
+ static unsigned int active_devs;
+
+-struct qat_alg_buf {
+- u32 len;
+- u32 resrvd;
+- u64 addr;
+-} __packed;
+-
+-struct qat_alg_buf_list {
+- u64 resrvd;
+- u32 num_bufs;
+- u32 num_mapped_bufs;
+- struct qat_alg_buf bufers[];
+-} __packed __aligned(64);
+-
+ /* Common content descriptor */
+ struct qat_alg_cd {
+ union {
+@@ -693,7 +680,10 @@ static void qat_alg_free_bufl(struct qat_crypto_instance *inst,
+ bl->bufers[i].len, DMA_BIDIRECTIONAL);
+
+ dma_unmap_single(dev, blp, sz, DMA_TO_DEVICE);
+- kfree(bl);
++
++ if (!qat_req->buf.sgl_src_valid)
++ kfree(bl);
++
+ if (blp != blpout) {
+ /* If out of place operation dma unmap only data */
+ int bufless = blout->num_bufs - blout->num_mapped_bufs;
+@@ -704,7 +694,9 @@ static void qat_alg_free_bufl(struct qat_crypto_instance *inst,
+ DMA_BIDIRECTIONAL);
+ }
+ dma_unmap_single(dev, blpout, sz_out, DMA_TO_DEVICE);
+- kfree(blout);
++
++ if (!qat_req->buf.sgl_dst_valid)
++ kfree(blout);
+ }
+ }
+
+@@ -721,15 +713,24 @@ static int qat_alg_sgl_to_bufl(struct qat_crypto_instance *inst,
+ dma_addr_t blp = DMA_MAPPING_ERROR;
+ dma_addr_t bloutp = DMA_MAPPING_ERROR;
+ struct scatterlist *sg;
+- size_t sz_out, sz = struct_size(bufl, bufers, n + 1);
++ size_t sz_out, sz = struct_size(bufl, bufers, n);
++ int node = dev_to_node(&GET_DEV(inst->accel_dev));
+
+ if (unlikely(!n))
+ return -EINVAL;
+
+- bufl = kzalloc_node(sz, GFP_ATOMIC,
+- dev_to_node(&GET_DEV(inst->accel_dev)));
+- if (unlikely(!bufl))
+- return -ENOMEM;
++ qat_req->buf.sgl_src_valid = false;
++ qat_req->buf.sgl_dst_valid = false;
++
++ if (n > QAT_MAX_BUFF_DESC) {
++ bufl = kzalloc_node(sz, GFP_ATOMIC, node);
++ if (unlikely(!bufl))
++ return -ENOMEM;
++ } else {
++ bufl = &qat_req->buf.sgl_src.sgl_hdr;
++ memset(bufl, 0, sizeof(struct qat_alg_buf_list));
++ qat_req->buf.sgl_src_valid = true;
++ }
+
+ for_each_sg(sgl, sg, n, i)
+ bufl->bufers[i].addr = DMA_MAPPING_ERROR;
+@@ -760,12 +761,18 @@ static int qat_alg_sgl_to_bufl(struct qat_crypto_instance *inst,
+ struct qat_alg_buf *bufers;
+
+ n = sg_nents(sglout);
+- sz_out = struct_size(buflout, bufers, n + 1);
++ sz_out = struct_size(buflout, bufers, n);
+ sg_nctr = 0;
+- buflout = kzalloc_node(sz_out, GFP_ATOMIC,
+- dev_to_node(&GET_DEV(inst->accel_dev)));
+- if (unlikely(!buflout))
+- goto err_in;
++
++ if (n > QAT_MAX_BUFF_DESC) {
++ buflout = kzalloc_node(sz_out, GFP_ATOMIC, node);
++ if (unlikely(!buflout))
++ goto err_in;
++ } else {
++ buflout = &qat_req->buf.sgl_dst.sgl_hdr;
++ memset(buflout, 0, sizeof(struct qat_alg_buf_list));
++ qat_req->buf.sgl_dst_valid = true;
++ }
+
+ bufers = buflout->bufers;
+ for_each_sg(sglout, sg, n, i)
+@@ -810,7 +817,9 @@ err_out:
+ dma_unmap_single(dev, buflout->bufers[i].addr,
+ buflout->bufers[i].len,
+ DMA_BIDIRECTIONAL);
+- kfree(buflout);
++
++ if (!qat_req->buf.sgl_dst_valid)
++ kfree(buflout);
+
+ err_in:
+ if (!dma_mapping_error(dev, blp))
+@@ -823,7 +832,8 @@ err_in:
+ bufl->bufers[i].len,
+ DMA_BIDIRECTIONAL);
+
+- kfree(bufl);
++ if (!qat_req->buf.sgl_src_valid)
++ kfree(bufl);
+
+ dev_err(dev, "Failed to map buf for dma\n");
+ return -ENOMEM;
+@@ -925,8 +935,25 @@ void qat_alg_callback(void *resp)
+ struct icp_qat_fw_la_resp *qat_resp = resp;
+ struct qat_crypto_request *qat_req =
+ (void *)(__force long)qat_resp->opaque_data;
++ struct qat_instance_backlog *backlog = qat_req->alg_req.backlog;
+
+ qat_req->cb(qat_resp, qat_req);
++
++ qat_alg_send_backlog(backlog);
++}
++
++static int qat_alg_send_sym_message(struct qat_crypto_request *qat_req,
++ struct qat_crypto_instance *inst,
++ struct crypto_async_request *base)
++{
++ struct qat_alg_req *alg_req = &qat_req->alg_req;
++
++ alg_req->fw_req = (u32 *)&qat_req->req;
++ alg_req->tx_ring = inst->sym_tx;
++ alg_req->base = base;
++ alg_req->backlog = &inst->backlog;
++
++ return qat_alg_send_message(alg_req);
+ }
+
+ static int qat_alg_aead_dec(struct aead_request *areq)
+@@ -939,7 +966,7 @@ static int qat_alg_aead_dec(struct aead_request *areq)
+ struct icp_qat_fw_la_auth_req_params *auth_param;
+ struct icp_qat_fw_la_bulk_req *msg;
+ int digst_size = crypto_aead_authsize(aead_tfm);
+- int ret, ctr = 0;
++ int ret;
+ u32 cipher_len;
+
+ cipher_len = areq->cryptlen - digst_size;
+@@ -965,15 +992,12 @@ static int qat_alg_aead_dec(struct aead_request *areq)
+ auth_param = (void *)((u8 *)cipher_param + sizeof(*cipher_param));
+ auth_param->auth_off = 0;
+ auth_param->auth_len = areq->assoclen + cipher_param->cipher_length;
+- do {
+- ret = adf_send_message(ctx->inst->sym_tx, (u32 *)msg);
+- } while (ret == -EAGAIN && ctr++ < 10);
+
+- if (ret == -EAGAIN) {
++ ret = qat_alg_send_sym_message(qat_req, ctx->inst, &areq->base);
++ if (ret == -ENOSPC)
+ qat_alg_free_bufl(ctx->inst, qat_req);
+- return -EBUSY;
+- }
+- return -EINPROGRESS;
++
++ return ret;
+ }
+
+ static int qat_alg_aead_enc(struct aead_request *areq)
+@@ -986,7 +1010,7 @@ static int qat_alg_aead_enc(struct aead_request *areq)
+ struct icp_qat_fw_la_auth_req_params *auth_param;
+ struct icp_qat_fw_la_bulk_req *msg;
+ u8 *iv = areq->iv;
+- int ret, ctr = 0;
++ int ret;
+
+ if (areq->cryptlen % AES_BLOCK_SIZE != 0)
+ return -EINVAL;
+@@ -1013,15 +1037,11 @@ static int qat_alg_aead_enc(struct aead_request *areq)
+ auth_param->auth_off = 0;
+ auth_param->auth_len = areq->assoclen + areq->cryptlen;
+
+- do {
+- ret = adf_send_message(ctx->inst->sym_tx, (u32 *)msg);
+- } while (ret == -EAGAIN && ctr++ < 10);
+-
+- if (ret == -EAGAIN) {
++ ret = qat_alg_send_sym_message(qat_req, ctx->inst, &areq->base);
++ if (ret == -ENOSPC)
+ qat_alg_free_bufl(ctx->inst, qat_req);
+- return -EBUSY;
+- }
+- return -EINPROGRESS;
++
++ return ret;
+ }
+
+ static int qat_alg_skcipher_rekey(struct qat_alg_skcipher_ctx *ctx,
+@@ -1174,7 +1194,7 @@ static int qat_alg_skcipher_encrypt(struct skcipher_request *req)
+ struct qat_crypto_request *qat_req = skcipher_request_ctx(req);
+ struct icp_qat_fw_la_cipher_req_params *cipher_param;
+ struct icp_qat_fw_la_bulk_req *msg;
+- int ret, ctr = 0;
++ int ret;
+
+ if (req->cryptlen == 0)
+ return 0;
+@@ -1198,15 +1218,11 @@ static int qat_alg_skcipher_encrypt(struct skcipher_request *req)
+
+ qat_alg_set_req_iv(qat_req);
+
+- do {
+- ret = adf_send_message(ctx->inst->sym_tx, (u32 *)msg);
+- } while (ret == -EAGAIN && ctr++ < 10);
+-
+- if (ret == -EAGAIN) {
++ ret = qat_alg_send_sym_message(qat_req, ctx->inst, &req->base);
++ if (ret == -ENOSPC)
+ qat_alg_free_bufl(ctx->inst, qat_req);
+- return -EBUSY;
+- }
+- return -EINPROGRESS;
++
++ return ret;
+ }
+
+ static int qat_alg_skcipher_blk_encrypt(struct skcipher_request *req)
+@@ -1243,7 +1259,7 @@ static int qat_alg_skcipher_decrypt(struct skcipher_request *req)
+ struct qat_crypto_request *qat_req = skcipher_request_ctx(req);
+ struct icp_qat_fw_la_cipher_req_params *cipher_param;
+ struct icp_qat_fw_la_bulk_req *msg;
+- int ret, ctr = 0;
++ int ret;
+
+ if (req->cryptlen == 0)
+ return 0;
+@@ -1268,15 +1284,11 @@ static int qat_alg_skcipher_decrypt(struct skcipher_request *req)
+ qat_alg_set_req_iv(qat_req);
+ qat_alg_update_iv(qat_req);
+
+- do {
+- ret = adf_send_message(ctx->inst->sym_tx, (u32 *)msg);
+- } while (ret == -EAGAIN && ctr++ < 10);
+-
+- if (ret == -EAGAIN) {
++ ret = qat_alg_send_sym_message(qat_req, ctx->inst, &req->base);
++ if (ret == -ENOSPC)
+ qat_alg_free_bufl(ctx->inst, qat_req);
+- return -EBUSY;
+- }
+- return -EINPROGRESS;
++
++ return ret;
+ }
+
+ static int qat_alg_skcipher_blk_decrypt(struct skcipher_request *req)
+diff --git a/drivers/crypto/qat/qat_common/qat_algs_send.c b/drivers/crypto/qat/qat_common/qat_algs_send.c
+new file mode 100644
+index 0000000000000..ff5b4347f7831
+--- /dev/null
++++ b/drivers/crypto/qat/qat_common/qat_algs_send.c
+@@ -0,0 +1,86 @@
++// SPDX-License-Identifier: (BSD-3-Clause OR GPL-2.0-only)
++/* Copyright(c) 2022 Intel Corporation */
++#include "adf_transport.h"
++#include "qat_algs_send.h"
++#include "qat_crypto.h"
++
++#define ADF_MAX_RETRIES 20
++
++static int qat_alg_send_message_retry(struct qat_alg_req *req)
++{
++ int ret = 0, ctr = 0;
++
++ do {
++ ret = adf_send_message(req->tx_ring, req->fw_req);
++ } while (ret == -EAGAIN && ctr++ < ADF_MAX_RETRIES);
++
++ if (ret == -EAGAIN)
++ return -ENOSPC;
++
++ return -EINPROGRESS;
++}
++
++void qat_alg_send_backlog(struct qat_instance_backlog *backlog)
++{
++ struct qat_alg_req *req, *tmp;
++
++ spin_lock_bh(&backlog->lock);
++ list_for_each_entry_safe(req, tmp, &backlog->list, list) {
++ if (adf_send_message(req->tx_ring, req->fw_req)) {
++ /* The HW ring is full. Do nothing.
++ * qat_alg_send_backlog() will be invoked again by
++ * another callback.
++ */
++ break;
++ }
++ list_del(&req->list);
++ req->base->complete(req->base, -EINPROGRESS);
++ }
++ spin_unlock_bh(&backlog->lock);
++}
++
++static void qat_alg_backlog_req(struct qat_alg_req *req,
++ struct qat_instance_backlog *backlog)
++{
++ INIT_LIST_HEAD(&req->list);
++
++ spin_lock_bh(&backlog->lock);
++ list_add_tail(&req->list, &backlog->list);
++ spin_unlock_bh(&backlog->lock);
++}
++
++static int qat_alg_send_message_maybacklog(struct qat_alg_req *req)
++{
++ struct qat_instance_backlog *backlog = req->backlog;
++ struct adf_etr_ring_data *tx_ring = req->tx_ring;
++ u32 *fw_req = req->fw_req;
++
++ /* If any request is already backlogged, then add to backlog list */
++ if (!list_empty(&backlog->list))
++ goto enqueue;
++
++ /* If ring is nearly full, then add to backlog list */
++ if (adf_ring_nearly_full(tx_ring))
++ goto enqueue;
++
++ /* If adding request to HW ring fails, then add to backlog list */
++ if (adf_send_message(tx_ring, fw_req))
++ goto enqueue;
++
++ return -EINPROGRESS;
++
++enqueue:
++ qat_alg_backlog_req(req, backlog);
++
++ return -EBUSY;
++}
++
++int qat_alg_send_message(struct qat_alg_req *req)
++{
++ u32 flags = req->base->flags;
++
++ if (flags & CRYPTO_TFM_REQ_MAY_BACKLOG)
++ return qat_alg_send_message_maybacklog(req);
++ else
++ return qat_alg_send_message_retry(req);
++}
+diff --git a/drivers/crypto/qat/qat_common/qat_algs_send.h b/drivers/crypto/qat/qat_common/qat_algs_send.h
+new file mode 100644
+index 0000000000000..5ce9f4f69d8ff
+--- /dev/null
++++ b/drivers/crypto/qat/qat_common/qat_algs_send.h
+@@ -0,0 +1,11 @@
++/* SPDX-License-Identifier: (BSD-3-Clause OR GPL-2.0-only) */
++/* Copyright(c) 2022 Intel Corporation */
++#ifndef QAT_ALGS_SEND_H
++#define QAT_ALGS_SEND_H
++
++#include "qat_crypto.h"
++
++int qat_alg_send_message(struct qat_alg_req *req);
++void qat_alg_send_backlog(struct qat_instance_backlog *backlog);
++
++#endif
+diff --git a/drivers/crypto/qat/qat_common/qat_asym_algs.c b/drivers/crypto/qat/qat_common/qat_asym_algs.c
+index b0b78445418bb..7173a2a0a484f 100644
+--- a/drivers/crypto/qat/qat_common/qat_asym_algs.c
++++ b/drivers/crypto/qat/qat_common/qat_asym_algs.c
+@@ -12,6 +12,7 @@
+ #include <crypto/scatterwalk.h>
+ #include "icp_qat_fw_pke.h"
+ #include "adf_accel_devices.h"
++#include "qat_algs_send.h"
+ #include "adf_transport.h"
+ #include "adf_common_drv.h"
+ #include "qat_crypto.h"
+@@ -135,8 +136,23 @@ struct qat_asym_request {
+ } areq;
+ int err;
+ void (*cb)(struct icp_qat_fw_pke_resp *resp);
++ struct qat_alg_req alg_req;
+ } __aligned(64);
+
++static int qat_alg_send_asym_message(struct qat_asym_request *qat_req,
++ struct qat_crypto_instance *inst,
++ struct crypto_async_request *base)
++{
++ struct qat_alg_req *alg_req = &qat_req->alg_req;
++
++ alg_req->fw_req = (u32 *)&qat_req->req;
++ alg_req->tx_ring = inst->pke_tx;
++ alg_req->base = base;
++ alg_req->backlog = &inst->backlog;
++
++ return qat_alg_send_message(alg_req);
++}
++
+ static void qat_dh_cb(struct icp_qat_fw_pke_resp *resp)
+ {
+ struct qat_asym_request *req = (void *)(__force long)resp->opaque;
+@@ -148,26 +164,21 @@ static void qat_dh_cb(struct icp_qat_fw_pke_resp *resp)
+ err = (err == ICP_QAT_FW_COMN_STATUS_FLAG_OK) ? 0 : -EINVAL;
+
+ if (areq->src) {
+- if (req->src_align)
+- dma_free_coherent(dev, req->ctx.dh->p_size,
+- req->src_align, req->in.dh.in.b);
+- else
+- dma_unmap_single(dev, req->in.dh.in.b,
+- req->ctx.dh->p_size, DMA_TO_DEVICE);
++ dma_unmap_single(dev, req->in.dh.in.b, req->ctx.dh->p_size,
++ DMA_TO_DEVICE);
++ kfree_sensitive(req->src_align);
+ }
+
+ areq->dst_len = req->ctx.dh->p_size;
+ if (req->dst_align) {
+ scatterwalk_map_and_copy(req->dst_align, areq->dst, 0,
+ areq->dst_len, 1);
+-
+- dma_free_coherent(dev, req->ctx.dh->p_size, req->dst_align,
+- req->out.dh.r);
+- } else {
+- dma_unmap_single(dev, req->out.dh.r, req->ctx.dh->p_size,
+- DMA_FROM_DEVICE);
++ kfree_sensitive(req->dst_align);
+ }
+
++ dma_unmap_single(dev, req->out.dh.r, req->ctx.dh->p_size,
++ DMA_FROM_DEVICE);
++
+ dma_unmap_single(dev, req->phy_in, sizeof(struct qat_dh_input_params),
+ DMA_TO_DEVICE);
+ dma_unmap_single(dev, req->phy_out,
+@@ -213,8 +224,9 @@ static int qat_dh_compute_value(struct kpp_request *req)
+ struct qat_asym_request *qat_req =
+ PTR_ALIGN(kpp_request_ctx(req), 64);
+ struct icp_qat_fw_pke_request *msg = &qat_req->req;
+- int ret, ctr = 0;
++ int ret;
+ int n_input_params = 0;
++ u8 *vaddr;
+
+ if (unlikely(!ctx->xa))
+ return -EINVAL;
+@@ -223,6 +235,10 @@ static int qat_dh_compute_value(struct kpp_request *req)
+ req->dst_len = ctx->p_size;
+ return -EOVERFLOW;
+ }
++
++ if (req->src_len > ctx->p_size)
++ return -EINVAL;
++
+ memset(msg, '\0', sizeof(*msg));
+ ICP_QAT_FW_PKE_HDR_VALID_FLAG_SET(msg->pke_hdr,
+ ICP_QAT_FW_COMN_REQ_FLAG_SET);
+@@ -271,27 +287,24 @@ static int qat_dh_compute_value(struct kpp_request *req)
+ */
+ if (sg_is_last(req->src) && req->src_len == ctx->p_size) {
+ qat_req->src_align = NULL;
+- qat_req->in.dh.in.b = dma_map_single(dev,
+- sg_virt(req->src),
+- req->src_len,
+- DMA_TO_DEVICE);
+- if (unlikely(dma_mapping_error(dev,
+- qat_req->in.dh.in.b)))
+- return ret;
+-
++ vaddr = sg_virt(req->src);
+ } else {
+ int shift = ctx->p_size - req->src_len;
+
+- qat_req->src_align = dma_alloc_coherent(dev,
+- ctx->p_size,
+- &qat_req->in.dh.in.b,
+- GFP_KERNEL);
++ qat_req->src_align = kzalloc(ctx->p_size, GFP_KERNEL);
+ if (unlikely(!qat_req->src_align))
+ return ret;
+
+ scatterwalk_map_and_copy(qat_req->src_align + shift,
+ req->src, 0, req->src_len, 0);
++
++ vaddr = qat_req->src_align;
+ }
++
++ qat_req->in.dh.in.b = dma_map_single(dev, vaddr, ctx->p_size,
++ DMA_TO_DEVICE);
++ if (unlikely(dma_mapping_error(dev, qat_req->in.dh.in.b)))
++ goto unmap_src;
+ }
+ /*
+ * dst can be of any size in valid range, but HW expects it to be the
+@@ -302,20 +315,18 @@ static int qat_dh_compute_value(struct kpp_request *req)
+ */
+ if (sg_is_last(req->dst) && req->dst_len == ctx->p_size) {
+ qat_req->dst_align = NULL;
+- qat_req->out.dh.r = dma_map_single(dev, sg_virt(req->dst),
+- req->dst_len,
+- DMA_FROM_DEVICE);
+-
+- if (unlikely(dma_mapping_error(dev, qat_req->out.dh.r)))
+- goto unmap_src;
+-
++ vaddr = sg_virt(req->dst);
+ } else {
+- qat_req->dst_align = dma_alloc_coherent(dev, ctx->p_size,
+- &qat_req->out.dh.r,
+- GFP_KERNEL);
++ qat_req->dst_align = kzalloc(ctx->p_size, GFP_KERNEL);
+ if (unlikely(!qat_req->dst_align))
+ goto unmap_src;
++
++ vaddr = qat_req->dst_align;
+ }
++ qat_req->out.dh.r = dma_map_single(dev, vaddr, ctx->p_size,
++ DMA_FROM_DEVICE);
++ if (unlikely(dma_mapping_error(dev, qat_req->out.dh.r)))
++ goto unmap_dst;
+
+ qat_req->in.dh.in_tab[n_input_params] = 0;
+ qat_req->out.dh.out_tab[1] = 0;
+@@ -338,13 +349,13 @@ static int qat_dh_compute_value(struct kpp_request *req)
+ msg->input_param_count = n_input_params;
+ msg->output_param_count = 1;
+
+- do {
+- ret = adf_send_message(ctx->inst->pke_tx, (u32 *)msg);
+- } while (ret == -EBUSY && ctr++ < 100);
++ ret = qat_alg_send_asym_message(qat_req, inst, &req->base);
++ if (ret == -ENOSPC)
++ goto unmap_all;
+
+- if (!ret)
+- return -EINPROGRESS;
++ return ret;
+
++unmap_all:
+ if (!dma_mapping_error(dev, qat_req->phy_out))
+ dma_unmap_single(dev, qat_req->phy_out,
+ sizeof(struct qat_dh_output_params),
+@@ -355,23 +366,17 @@ unmap_in_params:
+ sizeof(struct qat_dh_input_params),
+ DMA_TO_DEVICE);
+ unmap_dst:
+- if (qat_req->dst_align)
+- dma_free_coherent(dev, ctx->p_size, qat_req->dst_align,
+- qat_req->out.dh.r);
+- else
+- if (!dma_mapping_error(dev, qat_req->out.dh.r))
+- dma_unmap_single(dev, qat_req->out.dh.r, ctx->p_size,
+- DMA_FROM_DEVICE);
++ if (!dma_mapping_error(dev, qat_req->out.dh.r))
++ dma_unmap_single(dev, qat_req->out.dh.r, ctx->p_size,
++ DMA_FROM_DEVICE);
++ kfree_sensitive(qat_req->dst_align);
+ unmap_src:
+ if (req->src) {
+- if (qat_req->src_align)
+- dma_free_coherent(dev, ctx->p_size, qat_req->src_align,
+- qat_req->in.dh.in.b);
+- else
+- if (!dma_mapping_error(dev, qat_req->in.dh.in.b))
+- dma_unmap_single(dev, qat_req->in.dh.in.b,
+- ctx->p_size,
+- DMA_TO_DEVICE);
++ if (!dma_mapping_error(dev, qat_req->in.dh.in.b))
++ dma_unmap_single(dev, qat_req->in.dh.in.b,
++ ctx->p_size,
++ DMA_TO_DEVICE);
++ kfree_sensitive(qat_req->src_align);
+ }
+ return ret;
+ }
+@@ -420,14 +425,17 @@ static int qat_dh_set_params(struct qat_dh_ctx *ctx, struct dh *params)
+ static void qat_dh_clear_ctx(struct device *dev, struct qat_dh_ctx *ctx)
+ {
+ if (ctx->g) {
++ memset(ctx->g, 0, ctx->p_size);
+ dma_free_coherent(dev, ctx->p_size, ctx->g, ctx->dma_g);
+ ctx->g = NULL;
+ }
+ if (ctx->xa) {
++ memset(ctx->xa, 0, ctx->p_size);
+ dma_free_coherent(dev, ctx->p_size, ctx->xa, ctx->dma_xa);
+ ctx->xa = NULL;
+ }
+ if (ctx->p) {
++ memset(ctx->p, 0, ctx->p_size);
+ dma_free_coherent(dev, ctx->p_size, ctx->p, ctx->dma_p);
+ ctx->p = NULL;
+ }
+@@ -510,25 +518,22 @@ static void qat_rsa_cb(struct icp_qat_fw_pke_resp *resp)
+
+ err = (err == ICP_QAT_FW_COMN_STATUS_FLAG_OK) ? 0 : -EINVAL;
+
+- if (req->src_align)
+- dma_free_coherent(dev, req->ctx.rsa->key_sz, req->src_align,
+- req->in.rsa.enc.m);
+- else
+- dma_unmap_single(dev, req->in.rsa.enc.m, req->ctx.rsa->key_sz,
+- DMA_TO_DEVICE);
++ kfree_sensitive(req->src_align);
++
++ dma_unmap_single(dev, req->in.rsa.enc.m, req->ctx.rsa->key_sz,
++ DMA_TO_DEVICE);
+
+ areq->dst_len = req->ctx.rsa->key_sz;
+ if (req->dst_align) {
+ scatterwalk_map_and_copy(req->dst_align, areq->dst, 0,
+ areq->dst_len, 1);
+
+- dma_free_coherent(dev, req->ctx.rsa->key_sz, req->dst_align,
+- req->out.rsa.enc.c);
+- } else {
+- dma_unmap_single(dev, req->out.rsa.enc.c, req->ctx.rsa->key_sz,
+- DMA_FROM_DEVICE);
++ kfree_sensitive(req->dst_align);
+ }
+
++ dma_unmap_single(dev, req->out.rsa.enc.c, req->ctx.rsa->key_sz,
++ DMA_FROM_DEVICE);
++
+ dma_unmap_single(dev, req->phy_in, sizeof(struct qat_rsa_input_params),
+ DMA_TO_DEVICE);
+ dma_unmap_single(dev, req->phy_out,
+@@ -542,8 +547,11 @@ void qat_alg_asym_callback(void *_resp)
+ {
+ struct icp_qat_fw_pke_resp *resp = _resp;
+ struct qat_asym_request *areq = (void *)(__force long)resp->opaque;
++ struct qat_instance_backlog *backlog = areq->alg_req.backlog;
+
+ areq->cb(resp);
++
++ qat_alg_send_backlog(backlog);
+ }
+
+ #define PKE_RSA_EP_512 0x1c161b21
+@@ -642,7 +650,8 @@ static int qat_rsa_enc(struct akcipher_request *req)
+ struct qat_asym_request *qat_req =
+ PTR_ALIGN(akcipher_request_ctx(req), 64);
+ struct icp_qat_fw_pke_request *msg = &qat_req->req;
+- int ret, ctr = 0;
++ u8 *vaddr;
++ int ret;
+
+ if (unlikely(!ctx->n || !ctx->e))
+ return -EINVAL;
+@@ -651,6 +660,10 @@ static int qat_rsa_enc(struct akcipher_request *req)
+ req->dst_len = ctx->key_sz;
+ return -EOVERFLOW;
+ }
++
++ if (req->src_len > ctx->key_sz)
++ return -EINVAL;
++
+ memset(msg, '\0', sizeof(*msg));
+ ICP_QAT_FW_PKE_HDR_VALID_FLAG_SET(msg->pke_hdr,
+ ICP_QAT_FW_COMN_REQ_FLAG_SET);
+@@ -679,40 +692,39 @@ static int qat_rsa_enc(struct akcipher_request *req)
+ */
+ if (sg_is_last(req->src) && req->src_len == ctx->key_sz) {
+ qat_req->src_align = NULL;
+- qat_req->in.rsa.enc.m = dma_map_single(dev, sg_virt(req->src),
+- req->src_len, DMA_TO_DEVICE);
+- if (unlikely(dma_mapping_error(dev, qat_req->in.rsa.enc.m)))
+- return ret;
+-
++ vaddr = sg_virt(req->src);
+ } else {
+ int shift = ctx->key_sz - req->src_len;
+
+- qat_req->src_align = dma_alloc_coherent(dev, ctx->key_sz,
+- &qat_req->in.rsa.enc.m,
+- GFP_KERNEL);
++ qat_req->src_align = kzalloc(ctx->key_sz, GFP_KERNEL);
+ if (unlikely(!qat_req->src_align))
+ return ret;
+
+ scatterwalk_map_and_copy(qat_req->src_align + shift, req->src,
+ 0, req->src_len, 0);
++ vaddr = qat_req->src_align;
+ }
+- if (sg_is_last(req->dst) && req->dst_len == ctx->key_sz) {
+- qat_req->dst_align = NULL;
+- qat_req->out.rsa.enc.c = dma_map_single(dev, sg_virt(req->dst),
+- req->dst_len,
+- DMA_FROM_DEVICE);
+
+- if (unlikely(dma_mapping_error(dev, qat_req->out.rsa.enc.c)))
+- goto unmap_src;
++ qat_req->in.rsa.enc.m = dma_map_single(dev, vaddr, ctx->key_sz,
++ DMA_TO_DEVICE);
++ if (unlikely(dma_mapping_error(dev, qat_req->in.rsa.enc.m)))
++ goto unmap_src;
+
++ if (sg_is_last(req->dst) && req->dst_len == ctx->key_sz) {
++ qat_req->dst_align = NULL;
++ vaddr = sg_virt(req->dst);
+ } else {
+- qat_req->dst_align = dma_alloc_coherent(dev, ctx->key_sz,
+- &qat_req->out.rsa.enc.c,
+- GFP_KERNEL);
++ qat_req->dst_align = kzalloc(ctx->key_sz, GFP_KERNEL);
+ if (unlikely(!qat_req->dst_align))
+ goto unmap_src;
+-
++ vaddr = qat_req->dst_align;
+ }
++
++ qat_req->out.rsa.enc.c = dma_map_single(dev, vaddr, ctx->key_sz,
++ DMA_FROM_DEVICE);
++ if (unlikely(dma_mapping_error(dev, qat_req->out.rsa.enc.c)))
++ goto unmap_dst;
++
+ qat_req->in.rsa.in_tab[3] = 0;
+ qat_req->out.rsa.out_tab[1] = 0;
+ qat_req->phy_in = dma_map_single(dev, &qat_req->in.rsa.enc.m,
+@@ -732,13 +744,14 @@ static int qat_rsa_enc(struct akcipher_request *req)
+ msg->pke_mid.opaque = (u64)(__force long)qat_req;
+ msg->input_param_count = 3;
+ msg->output_param_count = 1;
+- do {
+- ret = adf_send_message(ctx->inst->pke_tx, (u32 *)msg);
+- } while (ret == -EBUSY && ctr++ < 100);
+
+- if (!ret)
+- return -EINPROGRESS;
++ ret = qat_alg_send_asym_message(qat_req, inst, &req->base);
++ if (ret == -ENOSPC)
++ goto unmap_all;
+
++ return ret;
++
++unmap_all:
+ if (!dma_mapping_error(dev, qat_req->phy_out))
+ dma_unmap_single(dev, qat_req->phy_out,
+ sizeof(struct qat_rsa_output_params),
+@@ -749,21 +762,15 @@ unmap_in_params:
+ sizeof(struct qat_rsa_input_params),
+ DMA_TO_DEVICE);
+ unmap_dst:
+- if (qat_req->dst_align)
+- dma_free_coherent(dev, ctx->key_sz, qat_req->dst_align,
+- qat_req->out.rsa.enc.c);
+- else
+- if (!dma_mapping_error(dev, qat_req->out.rsa.enc.c))
+- dma_unmap_single(dev, qat_req->out.rsa.enc.c,
+- ctx->key_sz, DMA_FROM_DEVICE);
++ if (!dma_mapping_error(dev, qat_req->out.rsa.enc.c))
++ dma_unmap_single(dev, qat_req->out.rsa.enc.c,
++ ctx->key_sz, DMA_FROM_DEVICE);
++ kfree_sensitive(qat_req->dst_align);
+ unmap_src:
+- if (qat_req->src_align)
+- dma_free_coherent(dev, ctx->key_sz, qat_req->src_align,
+- qat_req->in.rsa.enc.m);
+- else
+- if (!dma_mapping_error(dev, qat_req->in.rsa.enc.m))
+- dma_unmap_single(dev, qat_req->in.rsa.enc.m,
+- ctx->key_sz, DMA_TO_DEVICE);
++ if (!dma_mapping_error(dev, qat_req->in.rsa.enc.m))
++ dma_unmap_single(dev, qat_req->in.rsa.enc.m, ctx->key_sz,
++ DMA_TO_DEVICE);
++ kfree_sensitive(qat_req->src_align);
+ return ret;
+ }
+
+@@ -776,7 +783,8 @@ static int qat_rsa_dec(struct akcipher_request *req)
+ struct qat_asym_request *qat_req =
+ PTR_ALIGN(akcipher_request_ctx(req), 64);
+ struct icp_qat_fw_pke_request *msg = &qat_req->req;
+- int ret, ctr = 0;
++ u8 *vaddr;
++ int ret;
+
+ if (unlikely(!ctx->n || !ctx->d))
+ return -EINVAL;
+@@ -785,6 +793,10 @@ static int qat_rsa_dec(struct akcipher_request *req)
+ req->dst_len = ctx->key_sz;
+ return -EOVERFLOW;
+ }
++
++ if (req->src_len > ctx->key_sz)
++ return -EINVAL;
++
+ memset(msg, '\0', sizeof(*msg));
+ ICP_QAT_FW_PKE_HDR_VALID_FLAG_SET(msg->pke_hdr,
+ ICP_QAT_FW_COMN_REQ_FLAG_SET);
+@@ -823,40 +835,37 @@ static int qat_rsa_dec(struct akcipher_request *req)
+ */
+ if (sg_is_last(req->src) && req->src_len == ctx->key_sz) {
+ qat_req->src_align = NULL;
+- qat_req->in.rsa.dec.c = dma_map_single(dev, sg_virt(req->src),
+- req->dst_len, DMA_TO_DEVICE);
+- if (unlikely(dma_mapping_error(dev, qat_req->in.rsa.dec.c)))
+- return ret;
+-
++ vaddr = sg_virt(req->src);
+ } else {
+ int shift = ctx->key_sz - req->src_len;
+
+- qat_req->src_align = dma_alloc_coherent(dev, ctx->key_sz,
+- &qat_req->in.rsa.dec.c,
+- GFP_KERNEL);
++ qat_req->src_align = kzalloc(ctx->key_sz, GFP_KERNEL);
+ if (unlikely(!qat_req->src_align))
+ return ret;
+
+ scatterwalk_map_and_copy(qat_req->src_align + shift, req->src,
+ 0, req->src_len, 0);
++ vaddr = qat_req->src_align;
+ }
+- if (sg_is_last(req->dst) && req->dst_len == ctx->key_sz) {
+- qat_req->dst_align = NULL;
+- qat_req->out.rsa.dec.m = dma_map_single(dev, sg_virt(req->dst),
+- req->dst_len,
+- DMA_FROM_DEVICE);
+
+- if (unlikely(dma_mapping_error(dev, qat_req->out.rsa.dec.m)))
+- goto unmap_src;
++ qat_req->in.rsa.dec.c = dma_map_single(dev, vaddr, ctx->key_sz,
++ DMA_TO_DEVICE);
++ if (unlikely(dma_mapping_error(dev, qat_req->in.rsa.dec.c)))
++ goto unmap_src;
+
++ if (sg_is_last(req->dst) && req->dst_len == ctx->key_sz) {
++ qat_req->dst_align = NULL;
++ vaddr = sg_virt(req->dst);
+ } else {
+- qat_req->dst_align = dma_alloc_coherent(dev, ctx->key_sz,
+- &qat_req->out.rsa.dec.m,
+- GFP_KERNEL);
++ qat_req->dst_align = kzalloc(ctx->key_sz, GFP_KERNEL);
+ if (unlikely(!qat_req->dst_align))
+ goto unmap_src;
+-
++ vaddr = qat_req->dst_align;
+ }
++ qat_req->out.rsa.dec.m = dma_map_single(dev, vaddr, ctx->key_sz,
++ DMA_FROM_DEVICE);
++ if (unlikely(dma_mapping_error(dev, qat_req->out.rsa.dec.m)))
++ goto unmap_dst;
+
+ if (ctx->crt_mode)
+ qat_req->in.rsa.in_tab[6] = 0;
+@@ -884,13 +893,14 @@ static int qat_rsa_dec(struct akcipher_request *req)
+ msg->input_param_count = 3;
+
+ msg->output_param_count = 1;
+- do {
+- ret = adf_send_message(ctx->inst->pke_tx, (u32 *)msg);
+- } while (ret == -EBUSY && ctr++ < 100);
+
+- if (!ret)
+- return -EINPROGRESS;
++ ret = qat_alg_send_asym_message(qat_req, inst, &req->base);
++ if (ret == -ENOSPC)
++ goto unmap_all;
++
++ return ret;
+
++unmap_all:
+ if (!dma_mapping_error(dev, qat_req->phy_out))
+ dma_unmap_single(dev, qat_req->phy_out,
+ sizeof(struct qat_rsa_output_params),
+@@ -901,21 +911,15 @@ unmap_in_params:
+ sizeof(struct qat_rsa_input_params),
+ DMA_TO_DEVICE);
+ unmap_dst:
+- if (qat_req->dst_align)
+- dma_free_coherent(dev, ctx->key_sz, qat_req->dst_align,
+- qat_req->out.rsa.dec.m);
+- else
+- if (!dma_mapping_error(dev, qat_req->out.rsa.dec.m))
+- dma_unmap_single(dev, qat_req->out.rsa.dec.m,
+- ctx->key_sz, DMA_FROM_DEVICE);
++ if (!dma_mapping_error(dev, qat_req->out.rsa.dec.m))
++ dma_unmap_single(dev, qat_req->out.rsa.dec.m,
++ ctx->key_sz, DMA_FROM_DEVICE);
++ kfree_sensitive(qat_req->dst_align);
+ unmap_src:
+- if (qat_req->src_align)
+- dma_free_coherent(dev, ctx->key_sz, qat_req->src_align,
+- qat_req->in.rsa.dec.c);
+- else
+- if (!dma_mapping_error(dev, qat_req->in.rsa.dec.c))
+- dma_unmap_single(dev, qat_req->in.rsa.dec.c,
+- ctx->key_sz, DMA_TO_DEVICE);
++ if (!dma_mapping_error(dev, qat_req->in.rsa.dec.c))
++ dma_unmap_single(dev, qat_req->in.rsa.dec.c, ctx->key_sz,
++ DMA_TO_DEVICE);
++ kfree_sensitive(qat_req->src_align);
+ return ret;
+ }
+
+@@ -1233,18 +1237,8 @@ static void qat_rsa_exit_tfm(struct crypto_akcipher *tfm)
+ struct qat_rsa_ctx *ctx = akcipher_tfm_ctx(tfm);
+ struct device *dev = &GET_DEV(ctx->inst->accel_dev);
+
+- if (ctx->n)
+- dma_free_coherent(dev, ctx->key_sz, ctx->n, ctx->dma_n);
+- if (ctx->e)
+- dma_free_coherent(dev, ctx->key_sz, ctx->e, ctx->dma_e);
+- if (ctx->d) {
+- memset(ctx->d, '\0', ctx->key_sz);
+- dma_free_coherent(dev, ctx->key_sz, ctx->d, ctx->dma_d);
+- }
++ qat_rsa_clear_ctx(dev, ctx);
+ qat_crypto_put_instance(ctx->inst);
+- ctx->n = NULL;
+- ctx->e = NULL;
+- ctx->d = NULL;
+ }
+
+ static struct akcipher_alg rsa = {
+diff --git a/drivers/crypto/qat/qat_common/qat_crypto.c b/drivers/crypto/qat/qat_common/qat_crypto.c
+index 67c9588e89df9..9341d892533a7 100644
+--- a/drivers/crypto/qat/qat_common/qat_crypto.c
++++ b/drivers/crypto/qat/qat_common/qat_crypto.c
+@@ -161,13 +161,6 @@ int qat_crypto_dev_config(struct adf_accel_dev *accel_dev)
+ if (ret)
+ goto err;
+
+- /* Temporarily set the number of crypto instances to zero to avoid
+- * registering the crypto algorithms.
+- * This will be removed when the algorithms will support the
+- * CRYPTO_TFM_REQ_MAY_BACKLOG flag
+- */
+- instances = 0;
+-
+ for (i = 0; i < instances; i++) {
+ val = i;
+ snprintf(key, sizeof(key), ADF_CY "%d" ADF_RING_ASYM_BANK_NUM, i);
+@@ -353,6 +346,9 @@ static int qat_crypto_create_instances(struct adf_accel_dev *accel_dev)
+ &inst->pke_rx);
+ if (ret)
+ goto err;
++
++ INIT_LIST_HEAD(&inst->backlog.list);
++ spin_lock_init(&inst->backlog.lock);
+ }
+ return 0;
+ err:
+diff --git a/drivers/crypto/qat/qat_common/qat_crypto.h b/drivers/crypto/qat/qat_common/qat_crypto.h
+index b6a4c95ae003f..245b6d9a36507 100644
+--- a/drivers/crypto/qat/qat_common/qat_crypto.h
++++ b/drivers/crypto/qat/qat_common/qat_crypto.h
+@@ -9,6 +9,19 @@
+ #include "adf_accel_devices.h"
+ #include "icp_qat_fw_la.h"
+
++struct qat_instance_backlog {
++ struct list_head list;
++ spinlock_t lock; /* protects backlog list */
++};
++
++struct qat_alg_req {
++ u32 *fw_req;
++ struct adf_etr_ring_data *tx_ring;
++ struct crypto_async_request *base;
++ struct list_head list;
++ struct qat_instance_backlog *backlog;
++};
++
+ struct qat_crypto_instance {
+ struct adf_etr_ring_data *sym_tx;
+ struct adf_etr_ring_data *sym_rx;
+@@ -19,8 +32,29 @@ struct qat_crypto_instance {
+ unsigned long state;
+ int id;
+ atomic_t refctr;
++ struct qat_instance_backlog backlog;
+ };
+
++#define QAT_MAX_BUFF_DESC 4
++
++struct qat_alg_buf {
++ u32 len;
++ u32 resrvd;
++ u64 addr;
++} __packed;
++
++struct qat_alg_buf_list {
++ u64 resrvd;
++ u32 num_bufs;
++ u32 num_mapped_bufs;
++ struct qat_alg_buf bufers[];
++} __packed;
++
++struct qat_alg_fixed_buf_list {
++ struct qat_alg_buf_list sgl_hdr;
++ struct qat_alg_buf descriptors[QAT_MAX_BUFF_DESC];
++} __packed __aligned(64);
++
+ struct qat_crypto_request_buffs {
+ struct qat_alg_buf_list *bl;
+ dma_addr_t blp;
+@@ -28,6 +62,10 @@ struct qat_crypto_request_buffs {
+ dma_addr_t bloutp;
+ size_t sz;
+ size_t sz_out;
++ bool sgl_src_valid;
++ bool sgl_dst_valid;
++ struct qat_alg_fixed_buf_list sgl_src;
++ struct qat_alg_fixed_buf_list sgl_dst;
+ };
+
+ struct qat_crypto_request;
+@@ -53,6 +91,7 @@ struct qat_crypto_request {
+ u8 iv[AES_BLOCK_SIZE];
+ };
+ bool encryption;
++ struct qat_alg_req alg_req;
+ };
+
+ static inline bool adf_hw_dev_has_crypto(struct adf_accel_dev *accel_dev)
+diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
+index 33683295a0bfe..64befd6f702b2 100644
+--- a/drivers/gpio/gpio-pca953x.c
++++ b/drivers/gpio/gpio-pca953x.c
+@@ -351,6 +351,9 @@ static const struct regmap_config pca953x_i2c_regmap = {
+ .reg_bits = 8,
+ .val_bits = 8,
+
++ .use_single_read = true,
++ .use_single_write = true,
++
+ .readable_reg = pca953x_readable_register,
+ .writeable_reg = pca953x_writeable_register,
+ .volatile_reg = pca953x_volatile_register,
+@@ -894,15 +897,18 @@ static int pca953x_irq_setup(struct pca953x_chip *chip,
+ static int device_pca95xx_init(struct pca953x_chip *chip, u32 invert)
+ {
+ DECLARE_BITMAP(val, MAX_LINE);
++ u8 regaddr;
+ int ret;
+
+- ret = regcache_sync_region(chip->regmap, chip->regs->output,
+- chip->regs->output + NBANK(chip));
++ regaddr = pca953x_recalc_addr(chip, chip->regs->output, 0);
++ ret = regcache_sync_region(chip->regmap, regaddr,
++ regaddr + NBANK(chip) - 1);
+ if (ret)
+ goto out;
+
+- ret = regcache_sync_region(chip->regmap, chip->regs->direction,
+- chip->regs->direction + NBANK(chip));
++ regaddr = pca953x_recalc_addr(chip, chip->regs->direction, 0);
++ ret = regcache_sync_region(chip->regmap, regaddr,
++ regaddr + NBANK(chip) - 1);
+ if (ret)
+ goto out;
+
+@@ -1115,14 +1121,14 @@ static int pca953x_regcache_sync(struct device *dev)
+ * sync these registers first and only then sync the rest.
+ */
+ regaddr = pca953x_recalc_addr(chip, chip->regs->direction, 0);
+- ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip));
++ ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip) - 1);
+ if (ret) {
+ dev_err(dev, "Failed to sync GPIO dir registers: %d\n", ret);
+ return ret;
+ }
+
+ regaddr = pca953x_recalc_addr(chip, chip->regs->output, 0);
+- ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip));
++ ret = regcache_sync_region(chip->regmap, regaddr, regaddr + NBANK(chip) - 1);
+ if (ret) {
+ dev_err(dev, "Failed to sync GPIO out registers: %d\n", ret);
+ return ret;
+@@ -1132,7 +1138,7 @@ static int pca953x_regcache_sync(struct device *dev)
+ if (chip->driver_data & PCA_PCAL) {
+ regaddr = pca953x_recalc_addr(chip, PCAL953X_IN_LATCH, 0);
+ ret = regcache_sync_region(chip->regmap, regaddr,
+- regaddr + NBANK(chip));
++ regaddr + NBANK(chip) - 1);
+ if (ret) {
+ dev_err(dev, "Failed to sync INT latch registers: %d\n",
+ ret);
+@@ -1141,7 +1147,7 @@ static int pca953x_regcache_sync(struct device *dev)
+
+ regaddr = pca953x_recalc_addr(chip, PCAL953X_INT_MASK, 0);
+ ret = regcache_sync_region(chip->regmap, regaddr,
+- regaddr + NBANK(chip));
++ regaddr + NBANK(chip) - 1);
+ if (ret) {
+ dev_err(dev, "Failed to sync INT mask registers: %d\n",
+ ret);
+diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
+index b6d3a57e27edc..7f8e2fed29884 100644
+--- a/drivers/gpio/gpio-xilinx.c
++++ b/drivers/gpio/gpio-xilinx.c
+@@ -99,7 +99,7 @@ static inline void xgpio_set_value32(unsigned long *map, int bit, u32 v)
+ const unsigned long offset = (bit % BITS_PER_LONG) & BIT(5);
+
+ map[index] &= ~(0xFFFFFFFFul << offset);
+- map[index] |= v << offset;
++ map[index] |= (unsigned long)v << offset;
+ }
+
+ static inline int xgpio_regoffset(struct xgpio_instance *chip, int ch)
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 810965bd06921..a2575195c4e07 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -1670,7 +1670,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ #if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
+ adev->dm.crc_rd_wrk = amdgpu_dm_crtc_secure_display_create_work();
+ #endif
+- if (dc_enable_dmub_notifications(adev->dm.dc)) {
++ if (dc_is_dmub_outbox_supported(adev->dm.dc)) {
+ init_completion(&adev->dm.dmub_aux_transfer_done);
+ adev->dm.dmub_notify = kzalloc(sizeof(struct dmub_notification), GFP_KERNEL);
+ if (!adev->dm.dmub_notify) {
+@@ -1708,6 +1708,13 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ goto error;
+ }
+
++ /* Enable outbox notification only after IRQ handlers are registered and DMUB is alive.
++ * It is expected that DMUB will resend any pending notifications at this point, for
++ * example HPD from DPIA.
++ */
++ if (dc_is_dmub_outbox_supported(adev->dm.dc))
++ dc_enable_dmub_outbox(adev->dm.dc);
++
+ /* create fake encoders for MST */
+ dm_dp_create_fake_mst_encoders(adev);
+
+@@ -2701,9 +2708,6 @@ static int dm_resume(void *handle)
+ */
+ link_enc_cfg_copy(adev->dm.dc->current_state, dc_state);
+
+- if (dc_enable_dmub_notifications(adev->dm.dc))
+- amdgpu_dm_outbox_init(adev);
+-
+ r = dm_dmub_hw_init(adev);
+ if (r)
+ DRM_ERROR("DMUB interface failed to initialize: status=%d\n", r);
+@@ -2721,6 +2725,11 @@ static int dm_resume(void *handle)
+ }
+ }
+
++ if (dc_is_dmub_outbox_supported(adev->dm.dc)) {
++ amdgpu_dm_outbox_init(adev);
++ dc_enable_dmub_outbox(adev->dm.dc);
++ }
++
+ WARN_ON(!dc_commit_state(dm->dc, dc_state));
+
+ dm_gpureset_commit_state(dm->cached_dc_state, dm);
+@@ -2742,13 +2751,15 @@ static int dm_resume(void *handle)
+ /* TODO: Remove dc_state->dccg, use dc->dccg directly. */
+ dc_resource_state_construct(dm->dc, dm_state->context);
+
+- /* Re-enable outbox interrupts for DPIA. */
+- if (dc_enable_dmub_notifications(adev->dm.dc))
+- amdgpu_dm_outbox_init(adev);
+-
+ /* Before powering on DC we need to re-initialize DMUB. */
+ dm_dmub_hw_resume(adev);
+
++ /* Re-enable outbox interrupts for DPIA. */
++ if (dc_is_dmub_outbox_supported(adev->dm.dc)) {
++ amdgpu_dm_outbox_init(adev);
++ dc_enable_dmub_outbox(adev->dm.dc);
++ }
++
+ /* power on hardware */
+ dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D0);
+
+diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c b/drivers/gpu/drm/drm_gem_ttm_helper.c
+index d5962a34c01d5..e5fc875990c4f 100644
+--- a/drivers/gpu/drm/drm_gem_ttm_helper.c
++++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
+@@ -64,8 +64,13 @@ int drm_gem_ttm_vmap(struct drm_gem_object *gem,
+ struct iosys_map *map)
+ {
+ struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
++ int ret;
++
++ dma_resv_lock(gem->resv, NULL);
++ ret = ttm_bo_vmap(bo, map);
++ dma_resv_unlock(gem->resv);
+
+- return ttm_bo_vmap(bo, map);
++ return ret;
+ }
+ EXPORT_SYMBOL(drm_gem_ttm_vmap);
+
+@@ -82,7 +87,9 @@ void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
+ {
+ struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+
++ dma_resv_lock(gem->resv, NULL);
+ ttm_bo_vunmap(bo, map);
++ dma_resv_unlock(gem->resv);
+ }
+ EXPORT_SYMBOL(drm_gem_ttm_vunmap);
+
+diff --git a/drivers/gpu/drm/imx/dcss/dcss-dev.c b/drivers/gpu/drm/imx/dcss/dcss-dev.c
+index c849533ca83e3..3f5750cc2673e 100644
+--- a/drivers/gpu/drm/imx/dcss/dcss-dev.c
++++ b/drivers/gpu/drm/imx/dcss/dcss-dev.c
+@@ -207,6 +207,7 @@ struct dcss_dev *dcss_dev_create(struct device *dev, bool hdmi_output)
+
+ ret = dcss_submodules_init(dcss);
+ if (ret) {
++ of_node_put(dcss->of_port);
+ dev_err(dev, "submodules initialization failed\n");
+ goto clks_err;
+ }
+@@ -237,6 +238,8 @@ void dcss_dev_destroy(struct dcss_dev *dcss)
+ dcss_clocks_disable(dcss);
+ }
+
++ of_node_put(dcss->of_port);
++
+ pm_runtime_disable(dcss->dev);
+
+ dcss_submodules_stop(dcss);
+diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c
+index f7bfcf63d48ee..701a258d2e111 100644
+--- a/drivers/gpu/drm/panel/panel-edp.c
++++ b/drivers/gpu/drm/panel/panel-edp.c
+@@ -713,7 +713,7 @@ static int generic_edp_panel_probe(struct device *dev, struct panel_edp *panel)
+ of_property_read_u32(dev->of_node, "hpd-reliable-delay-ms", &reliable_ms);
+ desc->delay.hpd_reliable = reliable_ms;
+ of_property_read_u32(dev->of_node, "hpd-absent-delay-ms", &absent_ms);
+- desc->delay.hpd_reliable = absent_ms;
++ desc->delay.hpd_absent = absent_ms;
+
+ /* Power the panel on so we can read the EDID */
+ ret = pm_runtime_get_sync(dev);
+diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
+index 191c56064f196..6b25b2f4f5a30 100644
+--- a/drivers/gpu/drm/scheduler/sched_entity.c
++++ b/drivers/gpu/drm/scheduler/sched_entity.c
+@@ -190,7 +190,7 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
+ }
+ EXPORT_SYMBOL(drm_sched_entity_flush);
+
+-static void drm_sched_entity_kill_jobs_irq_work(struct irq_work *wrk)
++static void drm_sched_entity_kill_jobs_work(struct work_struct *wrk)
+ {
+ struct drm_sched_job *job = container_of(wrk, typeof(*job), work);
+
+@@ -207,8 +207,8 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
+ struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
+ finish_cb);
+
+- init_irq_work(&job->work, drm_sched_entity_kill_jobs_irq_work);
+- irq_work_queue(&job->work);
++ INIT_WORK(&job->work, drm_sched_entity_kill_jobs_work);
++ schedule_work(&job->work);
+ }
+
+ static struct dma_fence *
+diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c
+index 3d6f8ee355bfc..630cfa4ddd468 100644
+--- a/drivers/i2c/busses/i2c-cadence.c
++++ b/drivers/i2c/busses/i2c-cadence.c
+@@ -388,9 +388,9 @@ static irqreturn_t cdns_i2c_slave_isr(void *ptr)
+ */
+ static irqreturn_t cdns_i2c_master_isr(void *ptr)
+ {
+- unsigned int isr_status, avail_bytes, updatetx;
++ unsigned int isr_status, avail_bytes;
+ unsigned int bytes_to_send;
+- bool hold_quirk;
++ bool updatetx;
+ struct cdns_i2c *id = ptr;
+ /* Signal completion only after everything is updated */
+ int done_flag = 0;
+@@ -410,11 +410,7 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr)
+ * Check if transfer size register needs to be updated again for a
+ * large data receive operation.
+ */
+- updatetx = 0;
+- if (id->recv_count > id->curr_recv_count)
+- updatetx = 1;
+-
+- hold_quirk = (id->quirks & CDNS_I2C_BROKEN_HOLD_BIT) && updatetx;
++ updatetx = id->recv_count > id->curr_recv_count;
+
+ /* When receiving, handle data interrupt and completion interrupt */
+ if (id->p_recv_buf &&
+@@ -445,7 +441,7 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr)
+ break;
+ }
+
+- if (cdns_is_holdquirk(id, hold_quirk))
++ if (cdns_is_holdquirk(id, updatetx))
+ break;
+ }
+
+@@ -456,7 +452,7 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr)
+ * maintain transfer size non-zero while performing a large
+ * receive operation.
+ */
+- if (cdns_is_holdquirk(id, hold_quirk)) {
++ if (cdns_is_holdquirk(id, updatetx)) {
+ /* wait while fifo is full */
+ while (cdns_i2c_readreg(CDNS_I2C_XFER_SIZE_OFFSET) !=
+ (id->curr_recv_count - CDNS_I2C_FIFO_DEPTH))
+@@ -478,22 +474,6 @@ static irqreturn_t cdns_i2c_master_isr(void *ptr)
+ CDNS_I2C_XFER_SIZE_OFFSET);
+ id->curr_recv_count = id->recv_count;
+ }
+- } else if (id->recv_count && !hold_quirk &&
+- !id->curr_recv_count) {
+-
+- /* Set the slave address in address register*/
+- cdns_i2c_writereg(id->p_msg->addr & CDNS_I2C_ADDR_MASK,
+- CDNS_I2C_ADDR_OFFSET);
+-
+- if (id->recv_count > CDNS_I2C_TRANSFER_SIZE) {
+- cdns_i2c_writereg(CDNS_I2C_TRANSFER_SIZE,
+- CDNS_I2C_XFER_SIZE_OFFSET);
+- id->curr_recv_count = CDNS_I2C_TRANSFER_SIZE;
+- } else {
+- cdns_i2c_writereg(id->recv_count,
+- CDNS_I2C_XFER_SIZE_OFFSET);
+- id->curr_recv_count = id->recv_count;
+- }
+ }
+
+ /* Clear hold (if not repeated start) and signal completion */
+diff --git a/drivers/i2c/busses/i2c-mlxcpld.c b/drivers/i2c/busses/i2c-mlxcpld.c
+index 56aa424fd71d5..815cc561386b0 100644
+--- a/drivers/i2c/busses/i2c-mlxcpld.c
++++ b/drivers/i2c/busses/i2c-mlxcpld.c
+@@ -49,7 +49,7 @@
+ #define MLXCPLD_LPCI2C_NACK_IND 2
+
+ #define MLXCPLD_I2C_FREQ_1000KHZ_SET 0x04
+-#define MLXCPLD_I2C_FREQ_400KHZ_SET 0x0c
++#define MLXCPLD_I2C_FREQ_400KHZ_SET 0x0e
+ #define MLXCPLD_I2C_FREQ_100KHZ_SET 0x42
+
+ enum mlxcpld_i2c_frequency {
+diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
+index 638bf4a1ed946..646fa86774909 100644
+--- a/drivers/infiniband/hw/irdma/cm.c
++++ b/drivers/infiniband/hw/irdma/cm.c
+@@ -4231,10 +4231,6 @@ void irdma_cm_teardown_connections(struct irdma_device *iwdev, u32 *ipaddr,
+ struct irdma_cm_node *cm_node;
+ struct list_head teardown_list;
+ struct ib_qp_attr attr;
+- struct irdma_sc_vsi *vsi = &iwdev->vsi;
+- struct irdma_sc_qp *sc_qp;
+- struct irdma_qp *qp;
+- int i;
+
+ INIT_LIST_HEAD(&teardown_list);
+
+@@ -4251,52 +4247,6 @@ void irdma_cm_teardown_connections(struct irdma_device *iwdev, u32 *ipaddr,
+ irdma_cm_disconn(cm_node->iwqp);
+ irdma_rem_ref_cm_node(cm_node);
+ }
+- if (!iwdev->roce_mode)
+- return;
+-
+- INIT_LIST_HEAD(&teardown_list);
+- for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+- mutex_lock(&vsi->qos[i].qos_mutex);
+- list_for_each_safe (list_node, list_core_temp,
+- &vsi->qos[i].qplist) {
+- u32 qp_ip[4];
+-
+- sc_qp = container_of(list_node, struct irdma_sc_qp,
+- list);
+- if (sc_qp->qp_uk.qp_type != IRDMA_QP_TYPE_ROCE_RC)
+- continue;
+-
+- qp = sc_qp->qp_uk.back_qp;
+- if (!disconnect_all) {
+- if (nfo->ipv4)
+- qp_ip[0] = qp->udp_info.local_ipaddr[3];
+- else
+- memcpy(qp_ip,
+- &qp->udp_info.local_ipaddr[0],
+- sizeof(qp_ip));
+- }
+-
+- if (disconnect_all ||
+- (nfo->vlan_id == (qp->udp_info.vlan_tag & VLAN_VID_MASK) &&
+- !memcmp(qp_ip, ipaddr, nfo->ipv4 ? 4 : 16))) {
+- spin_lock(&iwdev->rf->qptable_lock);
+- if (iwdev->rf->qp_table[sc_qp->qp_uk.qp_id]) {
+- irdma_qp_add_ref(&qp->ibqp);
+- list_add(&qp->teardown_entry,
+- &teardown_list);
+- }
+- spin_unlock(&iwdev->rf->qptable_lock);
+- }
+- }
+- mutex_unlock(&vsi->qos[i].qos_mutex);
+- }
+-
+- list_for_each_safe (list_node, list_core_temp, &teardown_list) {
+- qp = container_of(list_node, struct irdma_qp, teardown_entry);
+- attr.qp_state = IB_QPS_ERR;
+- irdma_modify_qp_roce(&qp->ibqp, &attr, IB_QP_STATE, NULL);
+- irdma_qp_rem_ref(&qp->ibqp);
+- }
+ }
+
+ /**
+diff --git a/drivers/infiniband/hw/irdma/i40iw_hw.c b/drivers/infiniband/hw/irdma/i40iw_hw.c
+index e46fc110004d0..50299f58b6b31 100644
+--- a/drivers/infiniband/hw/irdma/i40iw_hw.c
++++ b/drivers/infiniband/hw/irdma/i40iw_hw.c
+@@ -201,6 +201,7 @@ void i40iw_init_hw(struct irdma_sc_dev *dev)
+ dev->hw_attrs.uk_attrs.max_hw_read_sges = I40IW_MAX_SGE_RD;
+ dev->hw_attrs.max_hw_device_pages = I40IW_MAX_PUSH_PAGE_COUNT;
+ dev->hw_attrs.uk_attrs.max_hw_inline = I40IW_MAX_INLINE_DATA_SIZE;
++ dev->hw_attrs.page_size_cap = SZ_4K | SZ_2M;
+ dev->hw_attrs.max_hw_ird = I40IW_MAX_IRD_SIZE;
+ dev->hw_attrs.max_hw_ord = I40IW_MAX_ORD_SIZE;
+ dev->hw_attrs.max_hw_wqes = I40IW_MAX_WQ_ENTRIES;
+diff --git a/drivers/infiniband/hw/irdma/icrdma_hw.c b/drivers/infiniband/hw/irdma/icrdma_hw.c
+index cf53b17510cdb..5986fd906308c 100644
+--- a/drivers/infiniband/hw/irdma/icrdma_hw.c
++++ b/drivers/infiniband/hw/irdma/icrdma_hw.c
+@@ -139,6 +139,7 @@ void icrdma_init_hw(struct irdma_sc_dev *dev)
+ dev->cqp_db = dev->hw_regs[IRDMA_CQPDB];
+ dev->cq_ack_db = dev->hw_regs[IRDMA_CQACK];
+ dev->irq_ops = &icrdma_irq_ops;
++ dev->hw_attrs.page_size_cap = SZ_4K | SZ_2M | SZ_1G;
+ dev->hw_attrs.max_hw_ird = ICRDMA_MAX_IRD_SIZE;
+ dev->hw_attrs.max_hw_ord = ICRDMA_MAX_ORD_SIZE;
+ dev->hw_attrs.max_stat_inst = ICRDMA_MAX_STATS_COUNT;
+diff --git a/drivers/infiniband/hw/irdma/irdma.h b/drivers/infiniband/hw/irdma/irdma.h
+index 46c12334c7354..4789e85d717b3 100644
+--- a/drivers/infiniband/hw/irdma/irdma.h
++++ b/drivers/infiniband/hw/irdma/irdma.h
+@@ -127,6 +127,7 @@ struct irdma_hw_attrs {
+ u64 max_hw_outbound_msg_size;
+ u64 max_hw_inbound_msg_size;
+ u64 max_mr_size;
++ u64 page_size_cap;
+ u32 min_hw_qp_id;
+ u32 min_hw_aeq_size;
+ u32 max_hw_aeq_size;
+diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
+index 52f3e88f85695..6daa149dcbda2 100644
+--- a/drivers/infiniband/hw/irdma/verbs.c
++++ b/drivers/infiniband/hw/irdma/verbs.c
+@@ -30,7 +30,7 @@ static int irdma_query_device(struct ib_device *ibdev,
+ props->vendor_part_id = pcidev->device;
+
+ props->hw_ver = rf->pcidev->revision;
+- props->page_size_cap = SZ_4K | SZ_2M | SZ_1G;
++ props->page_size_cap = hw_attrs->page_size_cap;
+ props->max_mr_size = hw_attrs->max_mr_size;
+ props->max_qp = rf->max_qp - rf->used_qps;
+ props->max_qp_wr = hw_attrs->max_qp_wr;
+@@ -2764,7 +2764,7 @@ static struct ib_mr *irdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 len,
+
+ if (req.reg_type == IRDMA_MEMREG_TYPE_MEM) {
+ iwmr->page_size = ib_umem_find_best_pgsz(region,
+- SZ_4K | SZ_2M | SZ_1G,
++ iwdev->rf->sc_dev.hw_attrs.page_size_cap,
+ virt);
+ if (unlikely(!iwmr->page_size)) {
+ kfree(iwmr);
+diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
+index 64e27c2821f99..ada23040cb654 100644
+--- a/drivers/mmc/host/sdhci-omap.c
++++ b/drivers/mmc/host/sdhci-omap.c
+@@ -1303,8 +1303,9 @@ static int sdhci_omap_probe(struct platform_device *pdev)
+ /*
+ * omap_device_pm_domain has callbacks to enable the main
+ * functional clock, interface clock and also configure the
+- * SYSCONFIG register of omap devices. The callback will be invoked
+- * as part of pm_runtime_get_sync.
++ * SYSCONFIG register to clear any boot loader set voltage
++ * capabilities before calling sdhci_setup_host(). The
++ * callback will be invoked as part of pm_runtime_get_sync.
+ */
+ pm_runtime_use_autosuspend(dev);
+ pm_runtime_set_autosuspend_delay(dev, 50);
+@@ -1446,7 +1447,8 @@ static int __maybe_unused sdhci_omap_runtime_suspend(struct device *dev)
+ struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
+ struct sdhci_omap_host *omap_host = sdhci_pltfm_priv(pltfm_host);
+
+- sdhci_runtime_suspend_host(host);
++ if (omap_host->con != -EINVAL)
++ sdhci_runtime_suspend_host(host);
+
+ sdhci_omap_context_save(omap_host);
+
+@@ -1463,10 +1465,10 @@ static int __maybe_unused sdhci_omap_runtime_resume(struct device *dev)
+
+ pinctrl_pm_select_default_state(dev);
+
+- if (omap_host->con != -EINVAL)
++ if (omap_host->con != -EINVAL) {
+ sdhci_omap_context_restore(omap_host);
+-
+- sdhci_runtime_resume_host(host, 0);
++ sdhci_runtime_resume_host(host, 0);
++ }
+
+ return 0;
+ }
+diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+index 44b14c9dc9a73..a626028336d3f 100644
+--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
++++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+@@ -655,9 +655,10 @@ static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
+ unsigned int tRP_ps;
+ bool use_half_period;
+ int sample_delay_ps, sample_delay_factor;
+- u16 busy_timeout_cycles;
++ unsigned int busy_timeout_cycles;
+ u8 wrn_dly_sel;
+ unsigned long clk_rate, min_rate;
++ u64 busy_timeout_ps;
+
+ if (sdr->tRC_min >= 30000) {
+ /* ONFI non-EDO modes [0-3] */
+@@ -690,7 +691,8 @@ static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
+ addr_setup_cycles = TO_CYCLES(sdr->tALS_min, period_ps);
+ data_setup_cycles = TO_CYCLES(sdr->tDS_min, period_ps);
+ data_hold_cycles = TO_CYCLES(sdr->tDH_min, period_ps);
+- busy_timeout_cycles = TO_CYCLES(sdr->tWB_max + sdr->tR_max, period_ps);
++ busy_timeout_ps = max(sdr->tBERS_max, sdr->tPROG_max);
++ busy_timeout_cycles = TO_CYCLES(busy_timeout_ps, period_ps);
+
+ hw->timing0 = BF_GPMI_TIMING0_ADDRESS_SETUP(addr_setup_cycles) |
+ BF_GPMI_TIMING0_DATA_HOLD(data_hold_cycles) |
+diff --git a/drivers/net/amt.c b/drivers/net/amt.c
+index 14fe03dbd9b1d..acf5ea96652f8 100644
+--- a/drivers/net/amt.c
++++ b/drivers/net/amt.c
+@@ -563,7 +563,7 @@ static struct sk_buff *amt_build_igmp_gq(struct amt_dev *amt)
+ ihv3->nsrcs = 0;
+ ihv3->resv = 0;
+ ihv3->suppress = false;
+- ihv3->qrv = amt->net->ipv4.sysctl_igmp_qrv;
++ ihv3->qrv = READ_ONCE(amt->net->ipv4.sysctl_igmp_qrv);
+ ihv3->csum = 0;
+ csum = &ihv3->csum;
+ csum_start = (void *)ihv3;
+@@ -577,14 +577,14 @@ static struct sk_buff *amt_build_igmp_gq(struct amt_dev *amt)
+ return skb;
+ }
+
+-static void __amt_update_gw_status(struct amt_dev *amt, enum amt_status status,
+- bool validate)
++static void amt_update_gw_status(struct amt_dev *amt, enum amt_status status,
++ bool validate)
+ {
+ if (validate && amt->status >= status)
+ return;
+ netdev_dbg(amt->dev, "Update GW status %s -> %s",
+ status_str[amt->status], status_str[status]);
+- amt->status = status;
++ WRITE_ONCE(amt->status, status);
+ }
+
+ static void __amt_update_relay_status(struct amt_tunnel_list *tunnel,
+@@ -600,14 +600,6 @@ static void __amt_update_relay_status(struct amt_tunnel_list *tunnel,
+ tunnel->status = status;
+ }
+
+-static void amt_update_gw_status(struct amt_dev *amt, enum amt_status status,
+- bool validate)
+-{
+- spin_lock_bh(&amt->lock);
+- __amt_update_gw_status(amt, status, validate);
+- spin_unlock_bh(&amt->lock);
+-}
+-
+ static void amt_update_relay_status(struct amt_tunnel_list *tunnel,
+ enum amt_status status, bool validate)
+ {
+@@ -700,9 +692,7 @@ static void amt_send_discovery(struct amt_dev *amt)
+ if (unlikely(net_xmit_eval(err)))
+ amt->dev->stats.tx_errors++;
+
+- spin_lock_bh(&amt->lock);
+- __amt_update_gw_status(amt, AMT_STATUS_SENT_DISCOVERY, true);
+- spin_unlock_bh(&amt->lock);
++ amt_update_gw_status(amt, AMT_STATUS_SENT_DISCOVERY, true);
+ out:
+ rcu_read_unlock();
+ }
+@@ -900,6 +890,28 @@ static void amt_send_mld_gq(struct amt_dev *amt, struct amt_tunnel_list *tunnel)
+ }
+ #endif
+
++static bool amt_queue_event(struct amt_dev *amt, enum amt_event event,
++ struct sk_buff *skb)
++{
++ int index;
++
++ spin_lock_bh(&amt->lock);
++ if (amt->nr_events >= AMT_MAX_EVENTS) {
++ spin_unlock_bh(&amt->lock);
++ return 1;
++ }
++
++ index = (amt->event_idx + amt->nr_events) % AMT_MAX_EVENTS;
++ amt->events[index].event = event;
++ amt->events[index].skb = skb;
++ amt->nr_events++;
++ amt->event_idx %= AMT_MAX_EVENTS;
++ queue_work(amt_wq, &amt->event_wq);
++ spin_unlock_bh(&amt->lock);
++
++ return 0;
++}
++
+ static void amt_secret_work(struct work_struct *work)
+ {
+ struct amt_dev *amt = container_of(to_delayed_work(work),
+@@ -913,58 +925,72 @@ static void amt_secret_work(struct work_struct *work)
+ msecs_to_jiffies(AMT_SECRET_TIMEOUT));
+ }
+
+-static void amt_discovery_work(struct work_struct *work)
++static void amt_event_send_discovery(struct amt_dev *amt)
+ {
+- struct amt_dev *amt = container_of(to_delayed_work(work),
+- struct amt_dev,
+- discovery_wq);
+-
+- spin_lock_bh(&amt->lock);
+ if (amt->status > AMT_STATUS_SENT_DISCOVERY)
+ goto out;
+ get_random_bytes(&amt->nonce, sizeof(__be32));
+- spin_unlock_bh(&amt->lock);
+
+ amt_send_discovery(amt);
+- spin_lock_bh(&amt->lock);
+ out:
+ mod_delayed_work(amt_wq, &amt->discovery_wq,
+ msecs_to_jiffies(AMT_DISCOVERY_TIMEOUT));
+- spin_unlock_bh(&amt->lock);
+ }
+
+-static void amt_req_work(struct work_struct *work)
++static void amt_discovery_work(struct work_struct *work)
+ {
+ struct amt_dev *amt = container_of(to_delayed_work(work),
+ struct amt_dev,
+- req_wq);
++ discovery_wq);
++
++ if (amt_queue_event(amt, AMT_EVENT_SEND_DISCOVERY, NULL))
++ mod_delayed_work(amt_wq, &amt->discovery_wq,
++ msecs_to_jiffies(AMT_DISCOVERY_TIMEOUT));
++}
++
++static void amt_event_send_request(struct amt_dev *amt)
++{
+ u32 exp;
+
+- spin_lock_bh(&amt->lock);
+ if (amt->status < AMT_STATUS_RECEIVED_ADVERTISEMENT)
+ goto out;
+
+ if (amt->req_cnt > AMT_MAX_REQ_COUNT) {
+ netdev_dbg(amt->dev, "Gateway is not ready");
+ amt->qi = AMT_INIT_REQ_TIMEOUT;
+- amt->ready4 = false;
+- amt->ready6 = false;
++ WRITE_ONCE(amt->ready4, false);
++ WRITE_ONCE(amt->ready6, false);
+ amt->remote_ip = 0;
+- __amt_update_gw_status(amt, AMT_STATUS_INIT, false);
++ amt_update_gw_status(amt, AMT_STATUS_INIT, false);
+ amt->req_cnt = 0;
++ amt->nonce = 0;
+ goto out;
+ }
+- spin_unlock_bh(&amt->lock);
++
++ if (!amt->req_cnt) {
++ WRITE_ONCE(amt->ready4, false);
++ WRITE_ONCE(amt->ready6, false);
++ get_random_bytes(&amt->nonce, sizeof(__be32));
++ }
+
+ amt_send_request(amt, false);
+ amt_send_request(amt, true);
+- spin_lock_bh(&amt->lock);
+- __amt_update_gw_status(amt, AMT_STATUS_SENT_REQUEST, true);
++ amt_update_gw_status(amt, AMT_STATUS_SENT_REQUEST, true);
+ amt->req_cnt++;
+ out:
+ exp = min_t(u32, (1 * (1 << amt->req_cnt)), AMT_MAX_REQ_TIMEOUT);
+ mod_delayed_work(amt_wq, &amt->req_wq, msecs_to_jiffies(exp * 1000));
+- spin_unlock_bh(&amt->lock);
++}
++
++static void amt_req_work(struct work_struct *work)
++{
++ struct amt_dev *amt = container_of(to_delayed_work(work),
++ struct amt_dev,
++ req_wq);
++
++ if (amt_queue_event(amt, AMT_EVENT_SEND_REQUEST, NULL))
++ mod_delayed_work(amt_wq, &amt->req_wq,
++ msecs_to_jiffies(100));
+ }
+
+ static bool amt_send_membership_update(struct amt_dev *amt,
+@@ -1220,7 +1246,8 @@ static netdev_tx_t amt_dev_xmit(struct sk_buff *skb, struct net_device *dev)
+ /* Gateway only passes IGMP/MLD packets */
+ if (!report)
+ goto free;
+- if ((!v6 && !amt->ready4) || (v6 && !amt->ready6))
++ if ((!v6 && !READ_ONCE(amt->ready4)) ||
++ (v6 && !READ_ONCE(amt->ready6)))
+ goto free;
+ if (amt_send_membership_update(amt, skb, v6))
+ goto free;
+@@ -2236,6 +2263,10 @@ static bool amt_advertisement_handler(struct amt_dev *amt, struct sk_buff *skb)
+ ipv4_is_zeronet(amta->ip4))
+ return true;
+
++ if (amt->status != AMT_STATUS_SENT_DISCOVERY ||
++ amt->nonce != amta->nonce)
++ return true;
++
+ amt->remote_ip = amta->ip4;
+ netdev_dbg(amt->dev, "advertised remote ip = %pI4\n", &amt->remote_ip);
+ mod_delayed_work(amt_wq, &amt->req_wq, 0);
+@@ -2251,6 +2282,9 @@ static bool amt_multicast_data_handler(struct amt_dev *amt, struct sk_buff *skb)
+ struct ethhdr *eth;
+ struct iphdr *iph;
+
++ if (READ_ONCE(amt->status) != AMT_STATUS_SENT_UPDATE)
++ return true;
++
+ hdr_size = sizeof(*amtmd) + sizeof(struct udphdr);
+ if (!pskb_may_pull(skb, hdr_size))
+ return true;
+@@ -2325,6 +2359,9 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ if (amtmq->reserved || amtmq->version)
+ return true;
+
++ if (amtmq->nonce != amt->nonce)
++ return true;
++
+ hdr_size -= sizeof(*eth);
+ if (iptunnel_pull_header(skb, hdr_size, htons(ETH_P_TEB), false))
+ return true;
+@@ -2339,6 +2376,9 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+
+ iph = ip_hdr(skb);
+ if (iph->version == 4) {
++ if (READ_ONCE(amt->ready4))
++ return true;
++
+ if (!pskb_may_pull(skb, sizeof(*iph) + AMT_IPHDR_OPTS +
+ sizeof(*ihv3)))
+ return true;
+@@ -2349,12 +2389,10 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ ihv3 = skb_pull(skb, sizeof(*iph) + AMT_IPHDR_OPTS);
+ skb_reset_transport_header(skb);
+ skb_push(skb, sizeof(*iph) + AMT_IPHDR_OPTS);
+- spin_lock_bh(&amt->lock);
+- amt->ready4 = true;
++ WRITE_ONCE(amt->ready4, true);
+ amt->mac = amtmq->response_mac;
+ amt->req_cnt = 0;
+ amt->qi = ihv3->qqic;
+- spin_unlock_bh(&amt->lock);
+ skb->protocol = htons(ETH_P_IP);
+ eth->h_proto = htons(ETH_P_IP);
+ ip_eth_mc_map(iph->daddr, eth->h_dest);
+@@ -2363,6 +2401,9 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ struct mld2_query *mld2q;
+ struct ipv6hdr *ip6h;
+
++ if (READ_ONCE(amt->ready6))
++ return true;
++
+ if (!pskb_may_pull(skb, sizeof(*ip6h) + AMT_IP6HDR_OPTS +
+ sizeof(*mld2q)))
+ return true;
+@@ -2374,12 +2415,10 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ mld2q = skb_pull(skb, sizeof(*ip6h) + AMT_IP6HDR_OPTS);
+ skb_reset_transport_header(skb);
+ skb_push(skb, sizeof(*ip6h) + AMT_IP6HDR_OPTS);
+- spin_lock_bh(&amt->lock);
+- amt->ready6 = true;
++ WRITE_ONCE(amt->ready6, true);
+ amt->mac = amtmq->response_mac;
+ amt->req_cnt = 0;
+ amt->qi = mld2q->mld2q_qqic;
+- spin_unlock_bh(&amt->lock);
+ skb->protocol = htons(ETH_P_IPV6);
+ eth->h_proto = htons(ETH_P_IPV6);
+ ipv6_eth_mc_map(&ip6h->daddr, eth->h_dest);
+@@ -2392,12 +2431,14 @@ static bool amt_membership_query_handler(struct amt_dev *amt,
+ skb->pkt_type = PACKET_MULTICAST;
+ skb->ip_summed = CHECKSUM_NONE;
+ len = skb->len;
++ local_bh_disable();
+ if (__netif_rx(skb) == NET_RX_SUCCESS) {
+ amt_update_gw_status(amt, AMT_STATUS_RECEIVED_QUERY, true);
+ dev_sw_netstats_rx_add(amt->dev, len);
+ } else {
+ amt->dev->stats.rx_dropped++;
+ }
++ local_bh_enable();
+
+ return false;
+ }
+@@ -2638,7 +2679,9 @@ static bool amt_request_handler(struct amt_dev *amt, struct sk_buff *skb)
+ if (tunnel->ip4 == iph->saddr)
+ goto send;
+
++ spin_lock_bh(&amt->lock);
+ if (amt->nr_tunnels >= amt->max_tunnels) {
++ spin_unlock_bh(&amt->lock);
+ icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
+ return true;
+ }
+@@ -2646,8 +2689,10 @@ static bool amt_request_handler(struct amt_dev *amt, struct sk_buff *skb)
+ tunnel = kzalloc(sizeof(*tunnel) +
+ (sizeof(struct hlist_head) * amt->hash_buckets),
+ GFP_ATOMIC);
+- if (!tunnel)
++ if (!tunnel) {
++ spin_unlock_bh(&amt->lock);
+ return true;
++ }
+
+ tunnel->source_port = udph->source;
+ tunnel->ip4 = iph->saddr;
+@@ -2660,10 +2705,9 @@ static bool amt_request_handler(struct amt_dev *amt, struct sk_buff *skb)
+
+ INIT_DELAYED_WORK(&tunnel->gc_wq, amt_tunnel_expire);
+
+- spin_lock_bh(&amt->lock);
+ list_add_tail_rcu(&tunnel->list, &amt->tunnel_list);
+ tunnel->key = amt->key;
+- amt_update_relay_status(tunnel, AMT_STATUS_RECEIVED_REQUEST, true);
++ __amt_update_relay_status(tunnel, AMT_STATUS_RECEIVED_REQUEST, true);
+ amt->nr_tunnels++;
+ mod_delayed_work(amt_wq, &tunnel->gc_wq,
+ msecs_to_jiffies(amt_gmi(amt)));
+@@ -2688,6 +2732,38 @@ send:
+ return false;
+ }
+
++static void amt_gw_rcv(struct amt_dev *amt, struct sk_buff *skb)
++{
++ int type = amt_parse_type(skb);
++ int err = 1;
++
++ if (type == -1)
++ goto drop;
++
++ if (amt->mode == AMT_MODE_GATEWAY) {
++ switch (type) {
++ case AMT_MSG_ADVERTISEMENT:
++ err = amt_advertisement_handler(amt, skb);
++ break;
++ case AMT_MSG_MEMBERSHIP_QUERY:
++ err = amt_membership_query_handler(amt, skb);
++ if (!err)
++ return;
++ break;
++ default:
++ netdev_dbg(amt->dev, "Invalid type of Gateway\n");
++ break;
++ }
++ }
++drop:
++ if (err) {
++ amt->dev->stats.rx_dropped++;
++ kfree_skb(skb);
++ } else {
++ consume_skb(skb);
++ }
++}
++
+ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
+ {
+ struct amt_dev *amt;
+@@ -2719,8 +2795,12 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
+ err = true;
+ goto drop;
+ }
+- err = amt_advertisement_handler(amt, skb);
+- break;
++ if (amt_queue_event(amt, AMT_EVENT_RECEIVE, skb)) {
++ netdev_dbg(amt->dev, "AMT Event queue full\n");
++ err = true;
++ goto drop;
++ }
++ goto out;
+ case AMT_MSG_MULTICAST_DATA:
+ if (iph->saddr != amt->remote_ip) {
+ netdev_dbg(amt->dev, "Invalid Relay IP\n");
+@@ -2738,11 +2818,12 @@ static int amt_rcv(struct sock *sk, struct sk_buff *skb)
+ err = true;
+ goto drop;
+ }
+- err = amt_membership_query_handler(amt, skb);
+- if (err)
++ if (amt_queue_event(amt, AMT_EVENT_RECEIVE, skb)) {
++ netdev_dbg(amt->dev, "AMT Event queue full\n");
++ err = true;
+ goto drop;
+- else
+- goto out;
++ }
++ goto out;
+ default:
+ err = true;
+ netdev_dbg(amt->dev, "Invalid type of Gateway\n");
+@@ -2780,6 +2861,46 @@ out:
+ return 0;
+ }
+
++static void amt_event_work(struct work_struct *work)
++{
++ struct amt_dev *amt = container_of(work, struct amt_dev, event_wq);
++ struct sk_buff *skb;
++ u8 event;
++ int i;
++
++ for (i = 0; i < AMT_MAX_EVENTS; i++) {
++ spin_lock_bh(&amt->lock);
++ if (amt->nr_events == 0) {
++ spin_unlock_bh(&amt->lock);
++ return;
++ }
++ event = amt->events[amt->event_idx].event;
++ skb = amt->events[amt->event_idx].skb;
++ amt->events[amt->event_idx].event = AMT_EVENT_NONE;
++ amt->events[amt->event_idx].skb = NULL;
++ amt->nr_events--;
++ amt->event_idx++;
++ amt->event_idx %= AMT_MAX_EVENTS;
++ spin_unlock_bh(&amt->lock);
++
++ switch (event) {
++ case AMT_EVENT_RECEIVE:
++ amt_gw_rcv(amt, skb);
++ break;
++ case AMT_EVENT_SEND_DISCOVERY:
++ amt_event_send_discovery(amt);
++ break;
++ case AMT_EVENT_SEND_REQUEST:
++ amt_event_send_request(amt);
++ break;
++ default:
++ if (skb)
++ kfree_skb(skb);
++ break;
++ }
++ }
++}
++
+ static int amt_err_lookup(struct sock *sk, struct sk_buff *skb)
+ {
+ struct amt_dev *amt;
+@@ -2804,7 +2925,7 @@ static int amt_err_lookup(struct sock *sk, struct sk_buff *skb)
+ break;
+ case AMT_MSG_REQUEST:
+ case AMT_MSG_MEMBERSHIP_UPDATE:
+- if (amt->status >= AMT_STATUS_RECEIVED_ADVERTISEMENT)
++ if (READ_ONCE(amt->status) >= AMT_STATUS_RECEIVED_ADVERTISEMENT)
+ mod_delayed_work(amt_wq, &amt->req_wq, 0);
+ break;
+ default:
+@@ -2867,6 +2988,8 @@ static int amt_dev_open(struct net_device *dev)
+
+ amt->ready4 = false;
+ amt->ready6 = false;
++ amt->event_idx = 0;
++ amt->nr_events = 0;
+
+ err = amt_socket_create(amt);
+ if (err)
+@@ -2874,6 +2997,7 @@ static int amt_dev_open(struct net_device *dev)
+
+ amt->req_cnt = 0;
+ amt->remote_ip = 0;
++ amt->nonce = 0;
+ get_random_bytes(&amt->key, sizeof(siphash_key_t));
+
+ amt->status = AMT_STATUS_INIT;
+@@ -2892,6 +3016,8 @@ static int amt_dev_stop(struct net_device *dev)
+ struct amt_dev *amt = netdev_priv(dev);
+ struct amt_tunnel_list *tunnel, *tmp;
+ struct socket *sock;
++ struct sk_buff *skb;
++ int i;
+
+ cancel_delayed_work_sync(&amt->req_wq);
+ cancel_delayed_work_sync(&amt->discovery_wq);
+@@ -2904,6 +3030,15 @@ static int amt_dev_stop(struct net_device *dev)
+ if (sock)
+ udp_tunnel_sock_release(sock);
+
++ cancel_work_sync(&amt->event_wq);
++ for (i = 0; i < AMT_MAX_EVENTS; i++) {
++ skb = amt->events[i].skb;
++ if (skb)
++ kfree_skb(skb);
++ amt->events[i].event = AMT_EVENT_NONE;
++ amt->events[i].skb = NULL;
++ }
++
+ amt->ready4 = false;
+ amt->ready6 = false;
+ amt->req_cnt = 0;
+@@ -3095,7 +3230,7 @@ static int amt_newlink(struct net *net, struct net_device *dev,
+ goto err;
+ }
+ if (amt->mode == AMT_MODE_RELAY) {
+- amt->qrv = amt->net->ipv4.sysctl_igmp_qrv;
++ amt->qrv = READ_ONCE(amt->net->ipv4.sysctl_igmp_qrv);
+ amt->qri = 10;
+ dev->needed_headroom = amt->stream_dev->needed_headroom +
+ AMT_RELAY_HLEN;
+@@ -3146,8 +3281,8 @@ static int amt_newlink(struct net *net, struct net_device *dev,
+ INIT_DELAYED_WORK(&amt->discovery_wq, amt_discovery_work);
+ INIT_DELAYED_WORK(&amt->req_wq, amt_req_work);
+ INIT_DELAYED_WORK(&amt->secret_wq, amt_secret_work);
++ INIT_WORK(&amt->event_wq, amt_event_work);
+ INIT_LIST_HEAD(&amt->tunnel_list);
+-
+ return 0;
+ err:
+ dev_put(amt->stream_dev);
+@@ -3280,7 +3415,7 @@ static int __init amt_init(void)
+ if (err < 0)
+ goto unregister_notifier;
+
+- amt_wq = alloc_workqueue("amt", WQ_UNBOUND, 1);
++ amt_wq = alloc_workqueue("amt", WQ_UNBOUND, 0);
+ if (!amt_wq) {
+ err = -ENOMEM;
+ goto rtnl_unregister;
+diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
+index 589996cef5db3..8d457d2c3bccb 100644
+--- a/drivers/net/can/rcar/rcar_canfd.c
++++ b/drivers/net/can/rcar/rcar_canfd.c
+@@ -1850,6 +1850,7 @@ static int rcar_canfd_probe(struct platform_device *pdev)
+ of_child = of_get_child_by_name(pdev->dev.of_node, name);
+ if (of_child && of_device_is_available(of_child))
+ channels_mask |= BIT(i);
++ of_node_put(of_child);
+ }
+
+ if (chip_id != RENESAS_RZG2L) {
+diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
+index 8014b18d93914..aa0bcf01e20ac 100644
+--- a/drivers/net/dsa/microchip/ksz_common.c
++++ b/drivers/net/dsa/microchip/ksz_common.c
+@@ -447,18 +447,21 @@ int ksz_switch_register(struct ksz_device *dev,
+ ports = of_get_child_by_name(dev->dev->of_node, "ethernet-ports");
+ if (!ports)
+ ports = of_get_child_by_name(dev->dev->of_node, "ports");
+- if (ports)
++ if (ports) {
+ for_each_available_child_of_node(ports, port) {
+ if (of_property_read_u32(port, "reg",
+ &port_num))
+ continue;
+ if (!(dev->port_mask & BIT(port_num))) {
+ of_node_put(port);
++ of_node_put(ports);
+ return -EINVAL;
+ }
+ of_get_phy_mode(port,
+ &dev->ports[port_num].interface);
+ }
++ of_node_put(ports);
++ }
+ dev->synclko_125 = of_property_read_bool(dev->dev->of_node,
+ "microchip,synclko-125");
+ dev->synclko_disable = of_property_read_bool(dev->dev->of_node,
+diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
+index b33841c6507ae..7734c6b1bacae 100644
+--- a/drivers/net/dsa/sja1105/sja1105_main.c
++++ b/drivers/net/dsa/sja1105/sja1105_main.c
+@@ -3383,12 +3383,28 @@ static const struct of_device_id sja1105_dt_ids[] = {
+ };
+ MODULE_DEVICE_TABLE(of, sja1105_dt_ids);
+
++static const struct spi_device_id sja1105_spi_ids[] = {
++ { "sja1105e" },
++ { "sja1105t" },
++ { "sja1105p" },
++ { "sja1105q" },
++ { "sja1105r" },
++ { "sja1105s" },
++ { "sja1110a" },
++ { "sja1110b" },
++ { "sja1110c" },
++ { "sja1110d" },
++ { },
++};
++MODULE_DEVICE_TABLE(spi, sja1105_spi_ids);
++
+ static struct spi_driver sja1105_driver = {
+ .driver = {
+ .name = "sja1105",
+ .owner = THIS_MODULE,
+ .of_match_table = of_match_ptr(sja1105_dt_ids),
+ },
++ .id_table = sja1105_spi_ids,
+ .probe = sja1105_probe,
+ .remove = sja1105_remove,
+ .shutdown = sja1105_shutdown,
+diff --git a/drivers/net/dsa/vitesse-vsc73xx-spi.c b/drivers/net/dsa/vitesse-vsc73xx-spi.c
+index 3110895358d8d..97a92e6da60d8 100644
+--- a/drivers/net/dsa/vitesse-vsc73xx-spi.c
++++ b/drivers/net/dsa/vitesse-vsc73xx-spi.c
+@@ -205,10 +205,20 @@ static const struct of_device_id vsc73xx_of_match[] = {
+ };
+ MODULE_DEVICE_TABLE(of, vsc73xx_of_match);
+
++static const struct spi_device_id vsc73xx_spi_ids[] = {
++ { "vsc7385" },
++ { "vsc7388" },
++ { "vsc7395" },
++ { "vsc7398" },
++ { },
++};
++MODULE_DEVICE_TABLE(spi, vsc73xx_spi_ids);
++
+ static struct spi_driver vsc73xx_spi_driver = {
+ .probe = vsc73xx_spi_probe,
+ .remove = vsc73xx_spi_remove,
+ .shutdown = vsc73xx_spi_shutdown,
++ .id_table = vsc73xx_spi_ids,
+ .driver = {
+ .name = "vsc73xx-spi",
+ .of_match_table = vsc73xx_of_match,
+diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+index 7c760aa655404..ddfe9208529a5 100644
+--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
++++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+@@ -1236,8 +1236,8 @@ static struct sock *chtls_recv_sock(struct sock *lsk,
+ csk->sndbuf = newsk->sk_sndbuf;
+ csk->smac_idx = ((struct port_info *)netdev_priv(ndev))->smt_idx;
+ RCV_WSCALE(tp) = select_rcv_wscale(tcp_full_space(newsk),
+- sock_net(newsk)->
+- ipv4.sysctl_tcp_window_scaling,
++ READ_ONCE(sock_net(newsk)->
++ ipv4.sysctl_tcp_window_scaling),
+ tp->window_clamp);
+ neigh_release(n);
+ inet_inherit_port(&tcp_hashinfo, lsk, newsk);
+@@ -1384,7 +1384,7 @@ static void chtls_pass_accept_request(struct sock *sk,
+ #endif
+ }
+ if (req->tcpopt.wsf <= 14 &&
+- sock_net(sk)->ipv4.sysctl_tcp_window_scaling) {
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_window_scaling)) {
+ inet_rsk(oreq)->wscale_ok = 1;
+ inet_rsk(oreq)->snd_wscale = req->tcpopt.wsf;
+ }
+diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
+index 528eb0f223b17..b4f5e57d0285c 100644
+--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
++++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
+@@ -2287,7 +2287,7 @@ err:
+
+ /* Uses sync mcc */
+ int be_cmd_read_port_transceiver_data(struct be_adapter *adapter,
+- u8 page_num, u8 *data)
++ u8 page_num, u32 off, u32 len, u8 *data)
+ {
+ struct be_dma_mem cmd;
+ struct be_mcc_wrb *wrb;
+@@ -2321,10 +2321,10 @@ int be_cmd_read_port_transceiver_data(struct be_adapter *adapter,
+ req->port = cpu_to_le32(adapter->hba_port_num);
+ req->page_num = cpu_to_le32(page_num);
+ status = be_mcc_notify_wait(adapter);
+- if (!status) {
++ if (!status && len > 0) {
+ struct be_cmd_resp_port_type *resp = cmd.va;
+
+- memcpy(data, resp->page_data, PAGE_DATA_LEN);
++ memcpy(data, resp->page_data + off, len);
+ }
+ err:
+ mutex_unlock(&adapter->mcc_lock);
+@@ -2415,7 +2415,7 @@ int be_cmd_query_cable_type(struct be_adapter *adapter)
+ int status;
+
+ status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0,
+- page_data);
++ 0, PAGE_DATA_LEN, page_data);
+ if (!status) {
+ switch (adapter->phy.interface_type) {
+ case PHY_TYPE_QSFP:
+@@ -2440,7 +2440,7 @@ int be_cmd_query_sfp_info(struct be_adapter *adapter)
+ int status;
+
+ status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0,
+- page_data);
++ 0, PAGE_DATA_LEN, page_data);
+ if (!status) {
+ strlcpy(adapter->phy.vendor_name, page_data +
+ SFP_VENDOR_NAME_OFFSET, SFP_VENDOR_NAME_LEN - 1);
+diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.h b/drivers/net/ethernet/emulex/benet/be_cmds.h
+index db1f3b908582e..e2085c68c0ee7 100644
+--- a/drivers/net/ethernet/emulex/benet/be_cmds.h
++++ b/drivers/net/ethernet/emulex/benet/be_cmds.h
+@@ -2427,7 +2427,7 @@ int be_cmd_set_beacon_state(struct be_adapter *adapter, u8 port_num, u8 beacon,
+ int be_cmd_get_beacon_state(struct be_adapter *adapter, u8 port_num,
+ u32 *state);
+ int be_cmd_read_port_transceiver_data(struct be_adapter *adapter,
+- u8 page_num, u8 *data);
++ u8 page_num, u32 off, u32 len, u8 *data);
+ int be_cmd_query_cable_type(struct be_adapter *adapter);
+ int be_cmd_query_sfp_info(struct be_adapter *adapter);
+ int lancer_cmd_read_object(struct be_adapter *adapter, struct be_dma_mem *cmd,
+diff --git a/drivers/net/ethernet/emulex/benet/be_ethtool.c b/drivers/net/ethernet/emulex/benet/be_ethtool.c
+index dfa784339781d..bd0df189d8719 100644
+--- a/drivers/net/ethernet/emulex/benet/be_ethtool.c
++++ b/drivers/net/ethernet/emulex/benet/be_ethtool.c
+@@ -1344,7 +1344,7 @@ static int be_get_module_info(struct net_device *netdev,
+ return -EOPNOTSUPP;
+
+ status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0,
+- page_data);
++ 0, PAGE_DATA_LEN, page_data);
+ if (!status) {
+ if (!page_data[SFP_PLUS_SFF_8472_COMP]) {
+ modinfo->type = ETH_MODULE_SFF_8079;
+@@ -1362,25 +1362,32 @@ static int be_get_module_eeprom(struct net_device *netdev,
+ {
+ struct be_adapter *adapter = netdev_priv(netdev);
+ int status;
++ u32 begin, end;
+
+ if (!check_privilege(adapter, MAX_PRIVILEGES))
+ return -EOPNOTSUPP;
+
+- status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0,
+- data);
+- if (status)
+- goto err;
++ begin = eeprom->offset;
++ end = eeprom->offset + eeprom->len;
++
++ if (begin < PAGE_DATA_LEN) {
++ status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A0, begin,
++ min_t(u32, end, PAGE_DATA_LEN) - begin,
++ data);
++ if (status)
++ goto err;
++
++ data += PAGE_DATA_LEN - begin;
++ begin = PAGE_DATA_LEN;
++ }
+
+- if (eeprom->offset + eeprom->len > PAGE_DATA_LEN) {
+- status = be_cmd_read_port_transceiver_data(adapter,
+- TR_PAGE_A2,
+- data +
+- PAGE_DATA_LEN);
++ if (end > PAGE_DATA_LEN) {
++ status = be_cmd_read_port_transceiver_data(adapter, TR_PAGE_A2,
++ begin - PAGE_DATA_LEN,
++ end - begin, data);
+ if (status)
+ goto err;
+ }
+- if (eeprom->offset)
+- memcpy(data, data + eeprom->offset, eeprom->len);
+ err:
+ return be_cmd_status(status);
+ }
+diff --git a/drivers/net/ethernet/intel/e1000e/hw.h b/drivers/net/ethernet/intel/e1000e/hw.h
+index 13382df2f2eff..bcf680e838113 100644
+--- a/drivers/net/ethernet/intel/e1000e/hw.h
++++ b/drivers/net/ethernet/intel/e1000e/hw.h
+@@ -630,7 +630,6 @@ struct e1000_phy_info {
+ bool disable_polarity_correction;
+ bool is_mdix;
+ bool polarity_correction;
+- bool reset_disable;
+ bool speed_downgraded;
+ bool autoneg_wait_to_complete;
+ };
+diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
+index e6c8e6d5234f8..9466f65a6da77 100644
+--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
++++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
+@@ -2050,10 +2050,6 @@ static s32 e1000_check_reset_block_ich8lan(struct e1000_hw *hw)
+ bool blocked = false;
+ int i = 0;
+
+- /* Check the PHY (LCD) reset flag */
+- if (hw->phy.reset_disable)
+- return true;
+-
+ while ((blocked = !(er32(FWSM) & E1000_ICH_FWSM_RSPCIPHY)) &&
+ (i++ < 30))
+ usleep_range(10000, 11000);
+diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.h b/drivers/net/ethernet/intel/e1000e/ich8lan.h
+index 638a3ddd7ada8..2504b11c3169f 100644
+--- a/drivers/net/ethernet/intel/e1000e/ich8lan.h
++++ b/drivers/net/ethernet/intel/e1000e/ich8lan.h
+@@ -271,7 +271,6 @@
+ #define I217_CGFREG_ENABLE_MTA_RESET 0x0002
+ #define I217_MEMPWR PHY_REG(772, 26)
+ #define I217_MEMPWR_DISABLE_SMB_RELEASE 0x0010
+-#define I217_MEMPWR_MOEM 0x1000
+
+ /* Receive Address Initial CRC Calculation */
+ #define E1000_PCH_RAICC(_n) (0x05F50 + ((_n) * 4))
+diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
+index fa06f68c8c803..f1729940e46ce 100644
+--- a/drivers/net/ethernet/intel/e1000e/netdev.c
++++ b/drivers/net/ethernet/intel/e1000e/netdev.c
+@@ -6494,6 +6494,10 @@ static void e1000e_s0ix_exit_flow(struct e1000_adapter *adapter)
+
+ if (er32(FWSM) & E1000_ICH_FWSM_FW_VALID &&
+ hw->mac.type >= e1000_pch_adp) {
++ /* Keep the GPT clock enabled for CSME */
++ mac_data = er32(FEXTNVM);
++ mac_data |= BIT(3);
++ ew32(FEXTNVM, mac_data);
+ /* Request ME unconfigure the device from S0ix */
+ mac_data = er32(H2ME);
+ mac_data &= ~E1000_H2ME_START_DPG;
+@@ -6987,21 +6991,8 @@ static __maybe_unused int e1000e_pm_suspend(struct device *dev)
+ struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
+ struct e1000_adapter *adapter = netdev_priv(netdev);
+ struct pci_dev *pdev = to_pci_dev(dev);
+- struct e1000_hw *hw = &adapter->hw;
+- u16 phy_data;
+ int rc;
+
+- if (er32(FWSM) & E1000_ICH_FWSM_FW_VALID &&
+- hw->mac.type >= e1000_pch_adp) {
+- /* Mask OEM Bits / Gig Disable / Restart AN (772_26[12] = 1) */
+- e1e_rphy(hw, I217_MEMPWR, &phy_data);
+- phy_data |= I217_MEMPWR_MOEM;
+- e1e_wphy(hw, I217_MEMPWR, phy_data);
+-
+- /* Disable LCD reset */
+- hw->phy.reset_disable = true;
+- }
+-
+ e1000e_flush_lpic(pdev);
+
+ e1000e_pm_freeze(dev);
+@@ -7023,8 +7014,6 @@ static __maybe_unused int e1000e_pm_resume(struct device *dev)
+ struct net_device *netdev = pci_get_drvdata(to_pci_dev(dev));
+ struct e1000_adapter *adapter = netdev_priv(netdev);
+ struct pci_dev *pdev = to_pci_dev(dev);
+- struct e1000_hw *hw = &adapter->hw;
+- u16 phy_data;
+ int rc;
+
+ /* Introduce S0ix implementation */
+@@ -7035,17 +7024,6 @@ static __maybe_unused int e1000e_pm_resume(struct device *dev)
+ if (rc)
+ return rc;
+
+- if (er32(FWSM) & E1000_ICH_FWSM_FW_VALID &&
+- hw->mac.type >= e1000_pch_adp) {
+- /* Unmask OEM Bits / Gig Disable / Restart AN 772_26[12] = 0 */
+- e1e_rphy(hw, I217_MEMPWR, &phy_data);
+- phy_data &= ~I217_MEMPWR_MOEM;
+- e1e_wphy(hw, I217_MEMPWR, phy_data);
+-
+- /* Enable LCD reset */
+- hw->phy.reset_disable = false;
+- }
+-
+ return e1000e_pm_thaw(dev);
+ }
+
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
+index 77eb9c7262053..6f01bffd7e5c2 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
+@@ -10645,7 +10645,7 @@ static int i40e_reset(struct i40e_pf *pf)
+ **/
+ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
+ {
+- int old_recovery_mode_bit = test_bit(__I40E_RECOVERY_MODE, pf->state);
++ const bool is_recovery_mode_reported = i40e_check_recovery_mode(pf);
+ struct i40e_vsi *vsi = pf->vsi[pf->lan_vsi];
+ struct i40e_hw *hw = &pf->hw;
+ i40e_status ret;
+@@ -10653,13 +10653,11 @@ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
+ int v;
+
+ if (test_bit(__I40E_EMP_RESET_INTR_RECEIVED, pf->state) &&
+- i40e_check_recovery_mode(pf)) {
++ is_recovery_mode_reported)
+ i40e_set_ethtool_ops(pf->vsi[pf->lan_vsi]->netdev);
+- }
+
+ if (test_bit(__I40E_DOWN, pf->state) &&
+- !test_bit(__I40E_RECOVERY_MODE, pf->state) &&
+- !old_recovery_mode_bit)
++ !test_bit(__I40E_RECOVERY_MODE, pf->state))
+ goto clear_recovery;
+ dev_dbg(&pf->pdev->dev, "Rebuilding internal switch\n");
+
+@@ -10686,13 +10684,12 @@ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
+ * accordingly with regard to resources initialization
+ * and deinitialization
+ */
+- if (test_bit(__I40E_RECOVERY_MODE, pf->state) ||
+- old_recovery_mode_bit) {
++ if (test_bit(__I40E_RECOVERY_MODE, pf->state)) {
+ if (i40e_get_capabilities(pf,
+ i40e_aqc_opc_list_func_capabilities))
+ goto end_unlock;
+
+- if (test_bit(__I40E_RECOVERY_MODE, pf->state)) {
++ if (is_recovery_mode_reported) {
+ /* we're staying in recovery mode so we'll reinitialize
+ * misc vector here
+ */
+diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
+index 49aed3e506a66..0ea0361cd86b1 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf.h
++++ b/drivers/net/ethernet/intel/iavf/iavf.h
+@@ -64,7 +64,6 @@ struct iavf_vsi {
+ u16 id;
+ DECLARE_BITMAP(state, __IAVF_VSI_STATE_SIZE__);
+ int base_vector;
+- u16 work_limit;
+ u16 qs_handle;
+ void *priv; /* client driver data reference. */
+ };
+@@ -159,8 +158,12 @@ struct iavf_vlan {
+ struct iavf_vlan_filter {
+ struct list_head list;
+ struct iavf_vlan vlan;
+- bool remove; /* filter needs to be removed */
+- bool add; /* filter needs to be added */
++ struct {
++ u8 is_new_vlan:1; /* filter is new, wait for PF answer */
++ u8 remove:1; /* filter needs to be removed */
++ u8 add:1; /* filter needs to be added */
++ u8 padding:5;
++ };
+ };
+
+ #define IAVF_MAX_TRAFFIC_CLASS 4
+@@ -461,6 +464,10 @@ static inline const char *iavf_state_str(enum iavf_state_t state)
+ return "__IAVF_INIT_VERSION_CHECK";
+ case __IAVF_INIT_GET_RESOURCES:
+ return "__IAVF_INIT_GET_RESOURCES";
++ case __IAVF_INIT_EXTENDED_CAPS:
++ return "__IAVF_INIT_EXTENDED_CAPS";
++ case __IAVF_INIT_CONFIG_ADAPTER:
++ return "__IAVF_INIT_CONFIG_ADAPTER";
+ case __IAVF_INIT_SW:
+ return "__IAVF_INIT_SW";
+ case __IAVF_INIT_FAILED:
+@@ -520,6 +527,7 @@ int iavf_get_vf_config(struct iavf_adapter *adapter);
+ int iavf_get_vf_vlan_v2_caps(struct iavf_adapter *adapter);
+ int iavf_send_vf_offload_vlan_v2_msg(struct iavf_adapter *adapter);
+ void iavf_set_queue_vlan_tag_loc(struct iavf_adapter *adapter);
++u16 iavf_get_num_vlans_added(struct iavf_adapter *adapter);
+ void iavf_irq_enable(struct iavf_adapter *adapter, bool flush);
+ void iavf_configure_queues(struct iavf_adapter *adapter);
+ void iavf_deconfigure_queues(struct iavf_adapter *adapter);
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+index 3bb56714beb03..e535d4c3da49d 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+@@ -692,12 +692,8 @@ static int __iavf_get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec, int queue)
+ {
+ struct iavf_adapter *adapter = netdev_priv(netdev);
+- struct iavf_vsi *vsi = &adapter->vsi;
+ struct iavf_ring *rx_ring, *tx_ring;
+
+- ec->tx_max_coalesced_frames = vsi->work_limit;
+- ec->rx_max_coalesced_frames = vsi->work_limit;
+-
+ /* Rx and Tx usecs per queue value. If user doesn't specify the
+ * queue, return queue 0's value to represent.
+ */
+@@ -825,12 +821,8 @@ static int __iavf_set_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec, int queue)
+ {
+ struct iavf_adapter *adapter = netdev_priv(netdev);
+- struct iavf_vsi *vsi = &adapter->vsi;
+ int i;
+
+- if (ec->tx_max_coalesced_frames_irq || ec->rx_max_coalesced_frames_irq)
+- vsi->work_limit = ec->tx_max_coalesced_frames_irq;
+-
+ if (ec->rx_coalesce_usecs == 0) {
+ if (ec->use_adaptive_rx_coalesce)
+ netif_info(adapter, drv, netdev, "rx-usecs=0, need to disable adaptive-rx for a complete disable\n");
+@@ -1969,8 +1961,6 @@ static int iavf_set_rxfh(struct net_device *netdev, const u32 *indir,
+
+ static const struct ethtool_ops iavf_ethtool_ops = {
+ .supported_coalesce_params = ETHTOOL_COALESCE_USECS |
+- ETHTOOL_COALESCE_MAX_FRAMES |
+- ETHTOOL_COALESCE_MAX_FRAMES_IRQ |
+ ETHTOOL_COALESCE_USE_ADAPTIVE,
+ .get_drvinfo = iavf_get_drvinfo,
+ .get_link = ethtool_op_get_link,
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
+index f3ecb3bca33dd..2e2c153ce46a3 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
+@@ -843,7 +843,7 @@ static void iavf_restore_filters(struct iavf_adapter *adapter)
+ * iavf_get_num_vlans_added - get number of VLANs added
+ * @adapter: board private structure
+ */
+-static u16 iavf_get_num_vlans_added(struct iavf_adapter *adapter)
++u16 iavf_get_num_vlans_added(struct iavf_adapter *adapter)
+ {
+ return bitmap_weight(adapter->vsi.active_cvlans, VLAN_N_VID) +
+ bitmap_weight(adapter->vsi.active_svlans, VLAN_N_VID);
+@@ -906,11 +906,6 @@ static int iavf_vlan_rx_add_vid(struct net_device *netdev,
+ if (!iavf_add_vlan(adapter, IAVF_VLAN(vid, be16_to_cpu(proto))))
+ return -ENOMEM;
+
+- if (proto == cpu_to_be16(ETH_P_8021Q))
+- set_bit(vid, adapter->vsi.active_cvlans);
+- else
+- set_bit(vid, adapter->vsi.active_svlans);
+-
+ return 0;
+ }
+
+@@ -2245,7 +2240,6 @@ int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter)
+
+ adapter->vsi.back = adapter;
+ adapter->vsi.base_vector = 1;
+- adapter->vsi.work_limit = IAVF_DEFAULT_IRQ_WORK;
+ vsi->netdev = adapter->netdev;
+ vsi->qs_handle = adapter->vsi_res->qset_handle;
+ if (adapter->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_RSS_PF) {
+@@ -2956,6 +2950,9 @@ continue_reset:
+ adapter->aq_required |= IAVF_FLAG_AQ_ADD_CLOUD_FILTER;
+ iavf_misc_irq_enable(adapter);
+
++ bitmap_clear(adapter->vsi.active_cvlans, 0, VLAN_N_VID);
++ bitmap_clear(adapter->vsi.active_svlans, 0, VLAN_N_VID);
++
+ mod_delayed_work(iavf_wq, &adapter->watchdog_task, 2);
+
+ /* We were running when the reset started, so we need to restore some
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+index 978f651c6b093..06d18797d25a2 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+@@ -194,7 +194,7 @@ static bool iavf_clean_tx_irq(struct iavf_vsi *vsi,
+ struct iavf_tx_buffer *tx_buf;
+ struct iavf_tx_desc *tx_desc;
+ unsigned int total_bytes = 0, total_packets = 0;
+- unsigned int budget = vsi->work_limit;
++ unsigned int budget = IAVF_DEFAULT_IRQ_WORK;
+
+ tx_buf = &tx_ring->tx_bi[i];
+ tx_desc = IAVF_TX_DESC(tx_ring, i);
+@@ -1285,11 +1285,10 @@ static struct iavf_rx_buffer *iavf_get_rx_buffer(struct iavf_ring *rx_ring,
+ {
+ struct iavf_rx_buffer *rx_buffer;
+
+- if (!size)
+- return NULL;
+-
+ rx_buffer = &rx_ring->rx_bi[rx_ring->next_to_clean];
+ prefetchw(rx_buffer->page);
++ if (!size)
++ return rx_buffer;
+
+ /* we are reusing so sync this buffer for CPU use */
+ dma_sync_single_range_for_cpu(rx_ring->dev,
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
+index 782450d5c12fc..1603e99bae4af 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
+@@ -626,6 +626,33 @@ static void iavf_mac_add_reject(struct iavf_adapter *adapter)
+ spin_unlock_bh(&adapter->mac_vlan_list_lock);
+ }
+
++/**
++ * iavf_vlan_add_reject
++ * @adapter: adapter structure
++ *
++ * Remove VLAN filters from list based on PF response.
++ **/
++static void iavf_vlan_add_reject(struct iavf_adapter *adapter)
++{
++ struct iavf_vlan_filter *f, *ftmp;
++
++ spin_lock_bh(&adapter->mac_vlan_list_lock);
++ list_for_each_entry_safe(f, ftmp, &adapter->vlan_filter_list, list) {
++ if (f->is_new_vlan) {
++ if (f->vlan.tpid == ETH_P_8021Q)
++ clear_bit(f->vlan.vid,
++ adapter->vsi.active_cvlans);
++ else
++ clear_bit(f->vlan.vid,
++ adapter->vsi.active_svlans);
++
++ list_del(&f->list);
++ kfree(f);
++ }
++ }
++ spin_unlock_bh(&adapter->mac_vlan_list_lock);
++}
++
+ /**
+ * iavf_add_vlans
+ * @adapter: adapter structure
+@@ -683,6 +710,7 @@ void iavf_add_vlans(struct iavf_adapter *adapter)
+ vvfl->vlan_id[i] = f->vlan.vid;
+ i++;
+ f->add = false;
++ f->is_new_vlan = true;
+ if (i == count)
+ break;
+ }
+@@ -695,10 +723,18 @@ void iavf_add_vlans(struct iavf_adapter *adapter)
+ iavf_send_pf_msg(adapter, VIRTCHNL_OP_ADD_VLAN, (u8 *)vvfl, len);
+ kfree(vvfl);
+ } else {
++ u16 max_vlans = adapter->vlan_v2_caps.filtering.max_filters;
++ u16 current_vlans = iavf_get_num_vlans_added(adapter);
+ struct virtchnl_vlan_filter_list_v2 *vvfl_v2;
+
+ adapter->current_op = VIRTCHNL_OP_ADD_VLAN_V2;
+
++ if ((count + current_vlans) > max_vlans &&
++ current_vlans < max_vlans) {
++ count = max_vlans - iavf_get_num_vlans_added(adapter);
++ more = true;
++ }
++
+ len = sizeof(*vvfl_v2) + ((count - 1) *
+ sizeof(struct virtchnl_vlan_filter));
+ if (len > IAVF_MAX_AQ_BUF_SIZE) {
+@@ -725,6 +761,9 @@ void iavf_add_vlans(struct iavf_adapter *adapter)
+ &adapter->vlan_v2_caps.filtering.filtering_support;
+ struct virtchnl_vlan *vlan;
+
++ if (i == count)
++ break;
++
+ /* give priority over outer if it's enabled */
+ if (filtering_support->outer)
+ vlan = &vvfl_v2->filters[i].outer;
+@@ -736,8 +775,7 @@ void iavf_add_vlans(struct iavf_adapter *adapter)
+
+ i++;
+ f->add = false;
+- if (i == count)
+- break;
++ f->is_new_vlan = true;
+ }
+ }
+
+@@ -2080,6 +2118,11 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
+ */
+ iavf_netdev_features_vlan_strip_set(netdev, true);
+ break;
++ case VIRTCHNL_OP_ADD_VLAN_V2:
++ iavf_vlan_add_reject(adapter);
++ dev_warn(&adapter->pdev->dev, "Failed to add VLAN filter, error %s\n",
++ iavf_stat_str(&adapter->hw, v_retval));
++ break;
+ default:
+ dev_err(&adapter->pdev->dev, "PF returned error %d (%s) to our request %d\n",
+ v_retval, iavf_stat_str(&adapter->hw, v_retval),
+@@ -2332,6 +2375,24 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
+ spin_unlock_bh(&adapter->adv_rss_lock);
+ }
+ break;
++ case VIRTCHNL_OP_ADD_VLAN_V2: {
++ struct iavf_vlan_filter *f;
++
++ spin_lock_bh(&adapter->mac_vlan_list_lock);
++ list_for_each_entry(f, &adapter->vlan_filter_list, list) {
++ if (f->is_new_vlan) {
++ f->is_new_vlan = false;
++ if (f->vlan.tpid == ETH_P_8021Q)
++ set_bit(f->vlan.vid,
++ adapter->vsi.active_cvlans);
++ else
++ set_bit(f->vlan.vid,
++ adapter->vsi.active_svlans);
++ }
++ }
++ spin_unlock_bh(&adapter->mac_vlan_list_lock);
++ }
++ break;
+ case VIRTCHNL_OP_ENABLE_VLAN_STRIPPING:
+ /* PF enabled vlan strip on this VF.
+ * Update netdev->features if needed to be in sync with ethtool.
+diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
+index 74b2c590ed5d0..38e46e9ba8bb8 100644
+--- a/drivers/net/ethernet/intel/igc/igc_main.c
++++ b/drivers/net/ethernet/intel/igc/igc_main.c
+@@ -6171,6 +6171,9 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
+ u8 __iomem *hw_addr = READ_ONCE(hw->hw_addr);
+ u32 value = 0;
+
++ if (IGC_REMOVED(hw_addr))
++ return ~value;
++
+ value = readl(&hw_addr[reg]);
+
+ /* reads should not return all F's */
+diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h
+index e197a33d93a03..026c3b65fc37a 100644
+--- a/drivers/net/ethernet/intel/igc/igc_regs.h
++++ b/drivers/net/ethernet/intel/igc/igc_regs.h
+@@ -306,7 +306,8 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg);
+ #define wr32(reg, val) \
+ do { \
+ u8 __iomem *hw_addr = READ_ONCE((hw)->hw_addr); \
+- writel((val), &hw_addr[(reg)]); \
++ if (!IGC_REMOVED(hw_addr)) \
++ writel((val), &hw_addr[(reg)]); \
+ } while (0)
+
+ #define rd32(reg) (igc_rd32(hw, reg))
+@@ -318,4 +319,6 @@ do { \
+
+ #define array_rd32(reg, offset) (igc_rd32(hw, (reg) + ((offset) << 2)))
+
++#define IGC_REMOVED(h) unlikely(!(h))
++
+ #endif
+diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+index 921a4d977d651..8813b4dd6872f 100644
+--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+@@ -779,6 +779,7 @@ struct ixgbe_adapter {
+ #ifdef CONFIG_IXGBE_IPSEC
+ struct ixgbe_ipsec *ipsec;
+ #endif /* CONFIG_IXGBE_IPSEC */
++ spinlock_t vfs_lock;
+ };
+
+ static inline int ixgbe_determine_xdp_q_idx(int cpu)
+diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+index c4a4954aa3177..6c403f112d294 100644
+--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+@@ -6402,6 +6402,9 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter,
+ /* n-tuple support exists, always init our spinlock */
+ spin_lock_init(&adapter->fdir_perfect_lock);
+
++ /* init spinlock to avoid concurrency of VF resources */
++ spin_lock_init(&adapter->vfs_lock);
++
+ #ifdef CONFIG_IXGBE_DCB
+ ixgbe_init_dcb(adapter);
+ #endif
+diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+index d4e63f0644c36..a1e69c7348632 100644
+--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+@@ -205,10 +205,13 @@ void ixgbe_enable_sriov(struct ixgbe_adapter *adapter, unsigned int max_vfs)
+ int ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
+ {
+ unsigned int num_vfs = adapter->num_vfs, vf;
++ unsigned long flags;
+ int rss;
+
++ spin_lock_irqsave(&adapter->vfs_lock, flags);
+ /* set num VFs to 0 to prevent access to vfinfo */
+ adapter->num_vfs = 0;
++ spin_unlock_irqrestore(&adapter->vfs_lock, flags);
+
+ /* put the reference to all of the vf devices */
+ for (vf = 0; vf < num_vfs; ++vf) {
+@@ -1355,8 +1358,10 @@ static void ixgbe_rcv_ack_from_vf(struct ixgbe_adapter *adapter, u32 vf)
+ void ixgbe_msg_task(struct ixgbe_adapter *adapter)
+ {
+ struct ixgbe_hw *hw = &adapter->hw;
++ unsigned long flags;
+ u32 vf;
+
++ spin_lock_irqsave(&adapter->vfs_lock, flags);
+ for (vf = 0; vf < adapter->num_vfs; vf++) {
+ /* process any reset requests */
+ if (!ixgbe_check_for_rst(hw, vf))
+@@ -1370,6 +1375,7 @@ void ixgbe_msg_task(struct ixgbe_adapter *adapter)
+ if (!ixgbe_check_for_ack(hw, vf))
+ ixgbe_rcv_ack_from_vf(adapter, vf);
+ }
++ spin_unlock_irqrestore(&adapter->vfs_lock, flags);
+ }
+
+ static inline void ixgbe_ping_vf(struct ixgbe_adapter *adapter, int vf)
+diff --git a/drivers/net/ethernet/marvell/prestera/prestera_flower.c b/drivers/net/ethernet/marvell/prestera/prestera_flower.c
+index 921959a980ee4..d8cfa4a7de0f2 100644
+--- a/drivers/net/ethernet/marvell/prestera/prestera_flower.c
++++ b/drivers/net/ethernet/marvell/prestera/prestera_flower.c
+@@ -139,12 +139,12 @@ static int prestera_flower_parse_meta(struct prestera_acl_rule *rule,
+ }
+ port = netdev_priv(ingress_dev);
+
+- mask = htons(0x1FFF);
+- key = htons(port->hw_id);
++ mask = htons(0x1FFF << 3);
++ key = htons(port->hw_id << 3);
+ rule_match_set(r_match->key, SYS_PORT, key);
+ rule_match_set(r_match->mask, SYS_PORT, mask);
+
+- mask = htons(0x1FF);
++ mask = htons(0x3FF);
+ key = htons(port->dev_id);
+ rule_match_set(r_match->key, SYS_DEV, key);
+ rule_match_set(r_match->mask, SYS_DEV, mask);
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+index 7ad663c5b1ab7..c00d6c4ed37c3 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+@@ -5387,7 +5387,7 @@ static bool mlxsw_sp_fi_is_gateway(const struct mlxsw_sp *mlxsw_sp,
+ {
+ const struct fib_nh *nh = fib_info_nh(fi, 0);
+
+- return nh->fib_nh_scope == RT_SCOPE_LINK ||
++ return nh->fib_nh_gw_family ||
+ mlxsw_sp_nexthop4_ipip_type(mlxsw_sp, nh, NULL);
+ }
+
+@@ -10263,7 +10263,7 @@ static void mlxsw_sp_mp4_hash_init(struct mlxsw_sp *mlxsw_sp,
+ unsigned long *fields = config->fields;
+ u32 hash_fields;
+
+- switch (net->ipv4.sysctl_fib_multipath_hash_policy) {
++ switch (READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_policy)) {
+ case 0:
+ mlxsw_sp_mp4_hash_outer_addr(config);
+ break;
+@@ -10281,7 +10281,7 @@ static void mlxsw_sp_mp4_hash_init(struct mlxsw_sp *mlxsw_sp,
+ mlxsw_sp_mp_hash_inner_l3(config);
+ break;
+ case 3:
+- hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
++ hash_fields = READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_fields);
+ /* Outer */
+ MLXSW_SP_MP_HASH_HEADER_SET(headers, IPV4_EN_NOT_TCP_NOT_UDP);
+ MLXSW_SP_MP_HASH_HEADER_SET(headers, IPV4_EN_TCP_UDP);
+@@ -10462,13 +10462,14 @@ static int mlxsw_sp_dscp_init(struct mlxsw_sp *mlxsw_sp)
+ static int __mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
+ {
+ struct net *net = mlxsw_sp_net(mlxsw_sp);
+- bool usp = net->ipv4.sysctl_ip_fwd_update_priority;
+ char rgcr_pl[MLXSW_REG_RGCR_LEN];
+ u64 max_rifs;
++ bool usp;
+
+ if (!MLXSW_CORE_RES_VALID(mlxsw_sp->core, MAX_RIFS))
+ return -EIO;
+ max_rifs = MLXSW_CORE_RES_GET(mlxsw_sp->core, MAX_RIFS);
++ usp = READ_ONCE(net->ipv4.sysctl_ip_fwd_update_priority);
+
+ mlxsw_reg_rgcr_pack(rgcr_pl, true, true);
+ mlxsw_reg_rgcr_max_router_interfaces_set(rgcr_pl, max_rifs);
+diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
+index 005e56ea5da12..5893770bfd946 100644
+--- a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
++++ b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c
+@@ -75,6 +75,9 @@ static int __lan966x_mac_learn(struct lan966x *lan966x, int pgid,
+ unsigned int vid,
+ enum macaccess_entry_type type)
+ {
++ int ret;
++
++ spin_lock(&lan966x->mac_lock);
+ lan966x_mac_select(lan966x, mac, vid);
+
+ /* Issue a write command */
+@@ -86,7 +89,10 @@ static int __lan966x_mac_learn(struct lan966x *lan966x, int pgid,
+ ANA_MACACCESS_MAC_TABLE_CMD_SET(MACACCESS_CMD_LEARN),
+ lan966x, ANA_MACACCESS);
+
+- return lan966x_mac_wait_for_completion(lan966x);
++ ret = lan966x_mac_wait_for_completion(lan966x);
++ spin_unlock(&lan966x->mac_lock);
++
++ return ret;
+ }
+
+ /* The mask of the front ports is encoded inside the mac parameter via a call
+@@ -113,11 +119,13 @@ int lan966x_mac_learn(struct lan966x *lan966x, int port,
+ return __lan966x_mac_learn(lan966x, port, false, mac, vid, type);
+ }
+
+-int lan966x_mac_forget(struct lan966x *lan966x,
+- const unsigned char mac[ETH_ALEN],
+- unsigned int vid,
+- enum macaccess_entry_type type)
++static int lan966x_mac_forget_locked(struct lan966x *lan966x,
++ const unsigned char mac[ETH_ALEN],
++ unsigned int vid,
++ enum macaccess_entry_type type)
+ {
++ lockdep_assert_held(&lan966x->mac_lock);
++
+ lan966x_mac_select(lan966x, mac, vid);
+
+ /* Issue a forget command */
+@@ -128,6 +136,20 @@ int lan966x_mac_forget(struct lan966x *lan966x,
+ return lan966x_mac_wait_for_completion(lan966x);
+ }
+
++int lan966x_mac_forget(struct lan966x *lan966x,
++ const unsigned char mac[ETH_ALEN],
++ unsigned int vid,
++ enum macaccess_entry_type type)
++{
++ int ret;
++
++ spin_lock(&lan966x->mac_lock);
++ ret = lan966x_mac_forget_locked(lan966x, mac, vid, type);
++ spin_unlock(&lan966x->mac_lock);
++
++ return ret;
++}
++
+ int lan966x_mac_cpu_learn(struct lan966x *lan966x, const char *addr, u16 vid)
+ {
+ return lan966x_mac_learn(lan966x, PGID_CPU, addr, vid, ENTRYTYPE_LOCKED);
+@@ -161,7 +183,7 @@ static struct lan966x_mac_entry *lan966x_mac_alloc_entry(const unsigned char *ma
+ {
+ struct lan966x_mac_entry *mac_entry;
+
+- mac_entry = kzalloc(sizeof(*mac_entry), GFP_KERNEL);
++ mac_entry = kzalloc(sizeof(*mac_entry), GFP_ATOMIC);
+ if (!mac_entry)
+ return NULL;
+
+@@ -179,7 +201,6 @@ static struct lan966x_mac_entry *lan966x_mac_find_entry(struct lan966x *lan966x,
+ struct lan966x_mac_entry *res = NULL;
+ struct lan966x_mac_entry *mac_entry;
+
+- spin_lock(&lan966x->mac_lock);
+ list_for_each_entry(mac_entry, &lan966x->mac_entries, list) {
+ if (mac_entry->vid == vid &&
+ ether_addr_equal(mac, mac_entry->mac) &&
+@@ -188,7 +209,6 @@ static struct lan966x_mac_entry *lan966x_mac_find_entry(struct lan966x *lan966x,
+ break;
+ }
+ }
+- spin_unlock(&lan966x->mac_lock);
+
+ return res;
+ }
+@@ -231,8 +251,11 @@ int lan966x_mac_add_entry(struct lan966x *lan966x, struct lan966x_port *port,
+ {
+ struct lan966x_mac_entry *mac_entry;
+
+- if (lan966x_mac_lookup(lan966x, addr, vid, ENTRYTYPE_NORMAL))
++ spin_lock(&lan966x->mac_lock);
++ if (lan966x_mac_lookup(lan966x, addr, vid, ENTRYTYPE_NORMAL)) {
++ spin_unlock(&lan966x->mac_lock);
+ return 0;
++ }
+
+ /* In case the entry already exists, don't add it again to SW,
+ * just update HW, but we need to look in the actual HW because
+@@ -241,21 +264,25 @@ int lan966x_mac_add_entry(struct lan966x *lan966x, struct lan966x_port *port,
+ * add the entry but without the extern_learn flag.
+ */
+ mac_entry = lan966x_mac_find_entry(lan966x, addr, vid, port->chip_port);
+- if (mac_entry)
+- return lan966x_mac_learn(lan966x, port->chip_port,
+- addr, vid, ENTRYTYPE_LOCKED);
++ if (mac_entry) {
++ spin_unlock(&lan966x->mac_lock);
++ goto mac_learn;
++ }
+
+ mac_entry = lan966x_mac_alloc_entry(addr, vid, port->chip_port);
+- if (!mac_entry)
++ if (!mac_entry) {
++ spin_unlock(&lan966x->mac_lock);
+ return -ENOMEM;
++ }
+
+- spin_lock(&lan966x->mac_lock);
+ list_add_tail(&mac_entry->list, &lan966x->mac_entries);
+ spin_unlock(&lan966x->mac_lock);
+
+- lan966x_mac_learn(lan966x, port->chip_port, addr, vid, ENTRYTYPE_LOCKED);
+ lan966x_fdb_call_notifiers(SWITCHDEV_FDB_OFFLOADED, addr, vid, port->dev);
+
++mac_learn:
++ lan966x_mac_learn(lan966x, port->chip_port, addr, vid, ENTRYTYPE_LOCKED);
++
+ return 0;
+ }
+
+@@ -269,8 +296,9 @@ int lan966x_mac_del_entry(struct lan966x *lan966x, const unsigned char *addr,
+ list) {
+ if (mac_entry->vid == vid &&
+ ether_addr_equal(addr, mac_entry->mac)) {
+- lan966x_mac_forget(lan966x, mac_entry->mac, mac_entry->vid,
+- ENTRYTYPE_LOCKED);
++ lan966x_mac_forget_locked(lan966x, mac_entry->mac,
++ mac_entry->vid,
++ ENTRYTYPE_LOCKED);
+
+ list_del(&mac_entry->list);
+ kfree(mac_entry);
+@@ -288,8 +316,8 @@ void lan966x_mac_purge_entries(struct lan966x *lan966x)
+ spin_lock(&lan966x->mac_lock);
+ list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries,
+ list) {
+- lan966x_mac_forget(lan966x, mac_entry->mac, mac_entry->vid,
+- ENTRYTYPE_LOCKED);
++ lan966x_mac_forget_locked(lan966x, mac_entry->mac,
++ mac_entry->vid, ENTRYTYPE_LOCKED);
+
+ list_del(&mac_entry->list);
+ kfree(mac_entry);
+@@ -325,10 +353,13 @@ static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
+ {
+ struct lan966x_mac_entry *mac_entry, *tmp;
+ unsigned char mac[ETH_ALEN] __aligned(2);
++ struct list_head mac_deleted_entries;
+ u32 dest_idx;
+ u32 column;
+ u16 vid;
+
++ INIT_LIST_HEAD(&mac_deleted_entries);
++
+ spin_lock(&lan966x->mac_lock);
+ list_for_each_entry_safe(mac_entry, tmp, &lan966x->mac_entries, list) {
+ bool found = false;
+@@ -362,20 +393,26 @@ static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
+ }
+
+ if (!found) {
+- /* Notify the bridge that the entry doesn't exist
+- * anymore in the HW and remove the entry from the SW
+- * list
+- */
+- lan966x_mac_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
+- mac_entry->mac, mac_entry->vid,
+- lan966x->ports[mac_entry->port_index]->dev);
+-
+ list_del(&mac_entry->list);
+- kfree(mac_entry);
++ /* Move the entry from SW list to a tmp list such that
++ * it would be deleted later
++ */
++ list_add_tail(&mac_entry->list, &mac_deleted_entries);
+ }
+ }
+ spin_unlock(&lan966x->mac_lock);
+
++ list_for_each_entry_safe(mac_entry, tmp, &mac_deleted_entries, list) {
++ /* Notify the bridge that the entry doesn't exist
++ * anymore in the HW
++ */
++ lan966x_mac_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE,
++ mac_entry->mac, mac_entry->vid,
++ lan966x->ports[mac_entry->port_index]->dev);
++ list_del(&mac_entry->list);
++ kfree(mac_entry);
++ }
++
+ /* Now go to the list of columns and see if any entry was not in the SW
+ * list, then that means that the entry is new so it needs to notify the
+ * bridge.
+@@ -396,13 +433,20 @@ static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
+ if (WARN_ON(dest_idx >= lan966x->num_phys_ports))
+ continue;
+
++ spin_lock(&lan966x->mac_lock);
++ mac_entry = lan966x_mac_find_entry(lan966x, mac, vid, dest_idx);
++ if (mac_entry) {
++ spin_unlock(&lan966x->mac_lock);
++ continue;
++ }
++
+ mac_entry = lan966x_mac_alloc_entry(mac, vid, dest_idx);
+- if (!mac_entry)
++ if (!mac_entry) {
++ spin_unlock(&lan966x->mac_lock);
+ return;
++ }
+
+ mac_entry->row = row;
+-
+- spin_lock(&lan966x->mac_lock);
+ list_add_tail(&mac_entry->list, &lan966x->mac_entries);
+ spin_unlock(&lan966x->mac_lock);
+
+@@ -424,6 +468,7 @@ irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x)
+ lan966x, ANA_MACTINDX);
+
+ while (1) {
++ spin_lock(&lan966x->mac_lock);
+ lan_rmw(ANA_MACACCESS_MAC_TABLE_CMD_SET(MACACCESS_CMD_SYNC_GET_NEXT),
+ ANA_MACACCESS_MAC_TABLE_CMD,
+ lan966x, ANA_MACACCESS);
+@@ -447,12 +492,15 @@ irqreturn_t lan966x_mac_irq_handler(struct lan966x *lan966x)
+ stop = false;
+
+ if (column == LAN966X_MAC_COLUMNS - 1 &&
+- index == 0 && stop)
++ index == 0 && stop) {
++ spin_unlock(&lan966x->mac_lock);
+ break;
++ }
+
+ entry[column].mach = lan_rd(lan966x, ANA_MACHDATA);
+ entry[column].macl = lan_rd(lan966x, ANA_MACLDATA);
+ entry[column].maca = lan_rd(lan966x, ANA_MACACCESS);
++ spin_unlock(&lan966x->mac_lock);
+
+ /* Once all the columns are read process them */
+ if (column == LAN966X_MAC_COLUMNS - 1) {
+diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
+index 1b9421e844a95..79036767c99d1 100644
+--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
++++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
+@@ -473,7 +473,7 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
+ set_tun->ttl = ip4_dst_hoplimit(&rt->dst);
+ ip_rt_put(rt);
+ } else {
+- set_tun->ttl = net->ipv4.sysctl_ip_default_ttl;
++ set_tun->ttl = READ_ONCE(net->ipv4.sysctl_ip_default_ttl);
+ }
+ }
+
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
+index 6ff88df587673..ca8ab290013ce 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
+@@ -576,32 +576,7 @@ static int mediatek_dwmac_init(struct platform_device *pdev, void *priv)
+ }
+ }
+
+- ret = clk_bulk_prepare_enable(variant->num_clks, plat->clks);
+- if (ret) {
+- dev_err(plat->dev, "failed to enable clks, err = %d\n", ret);
+- return ret;
+- }
+-
+- ret = clk_prepare_enable(plat->rmii_internal_clk);
+- if (ret) {
+- dev_err(plat->dev, "failed to enable rmii internal clk, err = %d\n", ret);
+- goto err_clk;
+- }
+-
+ return 0;
+-
+-err_clk:
+- clk_bulk_disable_unprepare(variant->num_clks, plat->clks);
+- return ret;
+-}
+-
+-static void mediatek_dwmac_exit(struct platform_device *pdev, void *priv)
+-{
+- struct mediatek_dwmac_plat_data *plat = priv;
+- const struct mediatek_dwmac_variant *variant = plat->variant;
+-
+- clk_disable_unprepare(plat->rmii_internal_clk);
+- clk_bulk_disable_unprepare(variant->num_clks, plat->clks);
+ }
+
+ static int mediatek_dwmac_clks_config(void *priv, bool enabled)
+@@ -643,7 +618,6 @@ static int mediatek_dwmac_common_data(struct platform_device *pdev,
+ plat->addr64 = priv_plat->variant->dma_bit_mask;
+ plat->bsp_priv = priv_plat;
+ plat->init = mediatek_dwmac_init;
+- plat->exit = mediatek_dwmac_exit;
+ plat->clks_config = mediatek_dwmac_clks_config;
+ if (priv_plat->variant->dwmac_fix_mac_speed)
+ plat->fix_mac_speed = priv_plat->variant->dwmac_fix_mac_speed;
+@@ -712,13 +686,32 @@ static int mediatek_dwmac_probe(struct platform_device *pdev)
+ mediatek_dwmac_common_data(pdev, plat_dat, priv_plat);
+ mediatek_dwmac_init(pdev, priv_plat);
+
++ ret = mediatek_dwmac_clks_config(priv_plat, true);
++ if (ret)
++ return ret;
++
+ ret = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res);
+ if (ret) {
+ stmmac_remove_config_dt(pdev, plat_dat);
+- return ret;
++ goto err_drv_probe;
+ }
+
+ return 0;
++
++err_drv_probe:
++ mediatek_dwmac_clks_config(priv_plat, false);
++ return ret;
++}
++
++static int mediatek_dwmac_remove(struct platform_device *pdev)
++{
++ struct mediatek_dwmac_plat_data *priv_plat = get_stmmac_bsp_priv(&pdev->dev);
++ int ret;
++
++ ret = stmmac_pltfr_remove(pdev);
++ mediatek_dwmac_clks_config(priv_plat, false);
++
++ return ret;
+ }
+
+ static const struct of_device_id mediatek_dwmac_match[] = {
+@@ -733,7 +726,7 @@ MODULE_DEVICE_TABLE(of, mediatek_dwmac_match);
+
+ static struct platform_driver mediatek_dwmac_driver = {
+ .probe = mediatek_dwmac_probe,
+- .remove = stmmac_pltfr_remove,
++ .remove = mediatek_dwmac_remove,
+ .driver = {
+ .name = "dwmac-mediatek",
+ .pm = &stmmac_pltfr_pm_ops,
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+index fd41db65fe1df..af33390411346 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+@@ -219,6 +219,9 @@ static void dwmac4_map_mtl_dma(struct mac_device_info *hw, u32 queue, u32 chan)
+ if (queue == 0 || queue == 4) {
+ value &= ~MTL_RXQ_DMA_Q04MDMACH_MASK;
+ value |= MTL_RXQ_DMA_Q04MDMACH(chan);
++ } else if (queue > 4) {
++ value &= ~MTL_RXQ_DMA_QXMDMACH_MASK(queue - 4);
++ value |= MTL_RXQ_DMA_QXMDMACH(chan, queue - 4);
+ } else {
+ value &= ~MTL_RXQ_DMA_QXMDMACH_MASK(queue);
+ value |= MTL_RXQ_DMA_QXMDMACH(chan, queue);
+diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+index abfb3cd5958df..9c3055ee26085 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+@@ -803,14 +803,6 @@ static int stmmac_ethtool_op_set_eee(struct net_device *dev,
+ netdev_warn(priv->dev,
+ "Setting EEE tx-lpi is not supported\n");
+
+- if (priv->hw->xpcs) {
+- ret = xpcs_config_eee(priv->hw->xpcs,
+- priv->plat->mult_fact_100ns,
+- edata->eee_enabled);
+- if (ret)
+- return ret;
+- }
+-
+ if (!edata->eee_enabled)
+ stmmac_disable_eee_mode(priv);
+
+diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+index 2525a80353b70..6a7f63a58aef8 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+@@ -834,19 +834,10 @@ int stmmac_init_tstamp_counter(struct stmmac_priv *priv, u32 systime_flags)
+ struct timespec64 now;
+ u32 sec_inc = 0;
+ u64 temp = 0;
+- int ret;
+
+ if (!(priv->dma_cap.time_stamp || priv->dma_cap.atime_stamp))
+ return -EOPNOTSUPP;
+
+- ret = clk_prepare_enable(priv->plat->clk_ptp_ref);
+- if (ret < 0) {
+- netdev_warn(priv->dev,
+- "failed to enable PTP reference clock: %pe\n",
+- ERR_PTR(ret));
+- return ret;
+- }
+-
+ stmmac_config_hw_tstamping(priv, priv->ptpaddr, systime_flags);
+ priv->systime_flags = systime_flags;
+
+@@ -3270,6 +3261,14 @@ static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
+
+ stmmac_mmc_setup(priv);
+
++ if (ptp_register) {
++ ret = clk_prepare_enable(priv->plat->clk_ptp_ref);
++ if (ret < 0)
++ netdev_warn(priv->dev,
++ "failed to enable PTP reference clock: %pe\n",
++ ERR_PTR(ret));
++ }
++
+ ret = stmmac_init_ptp(priv);
+ if (ret == -EOPNOTSUPP)
+ netdev_info(priv->dev, "PTP not supported by HW\n");
+@@ -7220,8 +7219,6 @@ int stmmac_dvr_remove(struct device *dev)
+ netdev_info(priv->dev, "%s: removing driver", __func__);
+
+ pm_runtime_get_sync(dev);
+- pm_runtime_disable(dev);
+- pm_runtime_put_noidle(dev);
+
+ stmmac_stop_all_dma(priv);
+ stmmac_mac_set(priv, priv->ioaddr, false);
+@@ -7248,6 +7245,9 @@ int stmmac_dvr_remove(struct device *dev)
+ mutex_destroy(&priv->lock);
+ bitmap_free(priv->af_xdp_zc_qps);
+
++ pm_runtime_disable(dev);
++ pm_runtime_put_noidle(dev);
++
+ return 0;
+ }
+ EXPORT_SYMBOL_GPL(stmmac_dvr_remove);
+diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+index 11e1055e8260f..9f5cac4000da6 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+@@ -815,7 +815,13 @@ static int __maybe_unused stmmac_pltfr_noirq_resume(struct device *dev)
+ if (ret)
+ return ret;
+
+- stmmac_init_tstamp_counter(priv, priv->systime_flags);
++ ret = clk_prepare_enable(priv->plat->clk_ptp_ref);
++ if (ret < 0) {
++ netdev_warn(priv->dev,
++ "failed to enable PTP reference clock: %pe\n",
++ ERR_PTR(ret));
++ return ret;
++ }
+ }
+
+ return 0;
+diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
+index 873f6deabbd1a..dc1f6d8444ad0 100644
+--- a/drivers/net/usb/ax88179_178a.c
++++ b/drivers/net/usb/ax88179_178a.c
+@@ -1801,7 +1801,7 @@ static const struct driver_info ax88179_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1814,7 +1814,7 @@ static const struct driver_info ax88178a_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1827,7 +1827,7 @@ static const struct driver_info cypress_GX3_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1840,7 +1840,7 @@ static const struct driver_info dlink_dub1312_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1853,7 +1853,7 @@ static const struct driver_info sitecom_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1866,7 +1866,7 @@ static const struct driver_info samsung_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1879,7 +1879,7 @@ static const struct driver_info lenovo_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1892,7 +1892,7 @@ static const struct driver_info belkin_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1905,7 +1905,7 @@ static const struct driver_info toshiba_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1918,7 +1918,7 @@ static const struct driver_info mct_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1931,7 +1931,7 @@ static const struct driver_info at_umc2000_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1944,7 +1944,7 @@ static const struct driver_info at_umc200_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1957,7 +1957,7 @@ static const struct driver_info at_umc2000sp_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
+index ee41088c52518..6b4efae11e57c 100644
+--- a/drivers/net/usb/r8152.c
++++ b/drivers/net/usb/r8152.c
+@@ -32,7 +32,7 @@
+ #define NETNEXT_VERSION "12"
+
+ /* Information for net */
+-#define NET_VERSION "12"
++#define NET_VERSION "13"
+
+ #define DRIVER_VERSION "v1." NETNEXT_VERSION "." NET_VERSION
+ #define DRIVER_AUTHOR "Realtek linux nic maintainers <nic_swsd@realtek.com>"
+@@ -5915,7 +5915,8 @@ static void r8153_enter_oob(struct r8152 *tp)
+
+ wait_oob_link_list_ready(tp);
+
+- ocp_write_word(tp, MCU_TYPE_PLA, PLA_RMS, mtu_to_size(tp->netdev->mtu));
++ ocp_write_word(tp, MCU_TYPE_PLA, PLA_RMS, 1522);
++ ocp_write_byte(tp, MCU_TYPE_PLA, PLA_MTPS, MTPS_DEFAULT);
+
+ switch (tp->version) {
+ case RTL_VER_03:
+@@ -5951,6 +5952,10 @@ static void r8153_enter_oob(struct r8152 *tp)
+ ocp_data |= NOW_IS_OOB | DIS_MCU_CLROOB;
+ ocp_write_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL, ocp_data);
+
++ ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_SFF_STS_7);
++ ocp_data |= MCU_BORW_EN;
++ ocp_write_word(tp, MCU_TYPE_PLA, PLA_SFF_STS_7, ocp_data);
++
+ rxdy_gated_en(tp, false);
+
+ ocp_data = ocp_read_dword(tp, MCU_TYPE_PLA, PLA_RCR);
+@@ -6553,6 +6558,9 @@ static void rtl8156_down(struct r8152 *tp)
+ rtl_disable(tp);
+ rtl_reset_bmu(tp);
+
++ ocp_write_word(tp, MCU_TYPE_PLA, PLA_RMS, 1522);
++ ocp_write_byte(tp, MCU_TYPE_PLA, PLA_MTPS, MTPS_DEFAULT);
++
+ /* Clear teredo wake event. bit[15:8] is the teredo wakeup
+ * type. Set it to zero. bits[7:0] are the W1C bits about
+ * the events. Set them to all 1 to clear them.
+@@ -6563,6 +6571,10 @@ static void rtl8156_down(struct r8152 *tp)
+ ocp_data |= NOW_IS_OOB;
+ ocp_write_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL, ocp_data);
+
++ ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_SFF_STS_7);
++ ocp_data |= MCU_BORW_EN;
++ ocp_write_word(tp, MCU_TYPE_PLA, PLA_SFF_STS_7, ocp_data);
++
+ rtl_rx_vlan_en(tp, true);
+ rxdy_gated_en(tp, false);
+
+diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
+index d270a204324e9..fb185cb052583 100644
+--- a/drivers/pci/controller/pci-hyperv.c
++++ b/drivers/pci/controller/pci-hyperv.c
+@@ -604,17 +604,19 @@ static unsigned int hv_msi_get_int_vector(struct irq_data *data)
+ return cfg->vector;
+ }
+
+-static void hv_set_msi_entry_from_desc(union hv_msi_entry *msi_entry,
+- struct msi_desc *msi_desc)
+-{
+- msi_entry->address.as_uint32 = msi_desc->msg.address_lo;
+- msi_entry->data.as_uint32 = msi_desc->msg.data;
+-}
+-
+ static int hv_msi_prepare(struct irq_domain *domain, struct device *dev,
+ int nvec, msi_alloc_info_t *info)
+ {
+- return pci_msi_prepare(domain, dev, nvec, info);
++ int ret = pci_msi_prepare(domain, dev, nvec, info);
++
++ /*
++ * By using the interrupt remapper in the hypervisor IOMMU, contiguous
++ * CPU vectors is not needed for multi-MSI
++ */
++ if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI)
++ info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS;
++
++ return ret;
+ }
+
+ /**
+@@ -631,6 +633,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
+ {
+ struct msi_desc *msi_desc = irq_data_get_msi_desc(data);
+ struct hv_retarget_device_interrupt *params;
++ struct tran_int_desc *int_desc;
+ struct hv_pcibus_device *hbus;
+ struct cpumask *dest;
+ cpumask_var_t tmp;
+@@ -645,6 +648,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
+ pdev = msi_desc_to_pci_dev(msi_desc);
+ pbus = pdev->bus;
+ hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata);
++ int_desc = data->chip_data;
+
+ spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags);
+
+@@ -652,7 +656,8 @@ static void hv_arch_irq_unmask(struct irq_data *data)
+ memset(params, 0, sizeof(*params));
+ params->partition_id = HV_PARTITION_ID_SELF;
+ params->int_entry.source = HV_INTERRUPT_SOURCE_MSI;
+- hv_set_msi_entry_from_desc(¶ms->int_entry.msi_entry, msi_desc);
++ params->int_entry.msi_entry.address.as_uint32 = int_desc->address & 0xffffffff;
++ params->int_entry.msi_entry.data.as_uint32 = int_desc->data;
+ params->device_id = (hbus->hdev->dev_instance.b[5] << 24) |
+ (hbus->hdev->dev_instance.b[4] << 16) |
+ (hbus->hdev->dev_instance.b[7] << 8) |
+@@ -1513,6 +1518,10 @@ static void hv_int_desc_free(struct hv_pci_dev *hpdev,
+ u8 buffer[sizeof(struct pci_delete_interrupt)];
+ } ctxt;
+
++ if (!int_desc->vector_count) {
++ kfree(int_desc);
++ return;
++ }
+ memset(&ctxt, 0, sizeof(ctxt));
+ int_pkt = (struct pci_delete_interrupt *)&ctxt.pkt.message;
+ int_pkt->message_type.type =
+@@ -1597,12 +1606,12 @@ static void hv_pci_compose_compl(void *context, struct pci_response *resp,
+
+ static u32 hv_compose_msi_req_v1(
+ struct pci_create_interrupt *int_pkt, struct cpumask *affinity,
+- u32 slot, u8 vector)
++ u32 slot, u8 vector, u8 vector_count)
+ {
+ int_pkt->message_type.type = PCI_CREATE_INTERRUPT_MESSAGE;
+ int_pkt->wslot.slot = slot;
+ int_pkt->int_desc.vector = vector;
+- int_pkt->int_desc.vector_count = 1;
++ int_pkt->int_desc.vector_count = vector_count;
+ int_pkt->int_desc.delivery_mode = DELIVERY_MODE;
+
+ /*
+@@ -1625,14 +1634,14 @@ static int hv_compose_msi_req_get_cpu(struct cpumask *affinity)
+
+ static u32 hv_compose_msi_req_v2(
+ struct pci_create_interrupt2 *int_pkt, struct cpumask *affinity,
+- u32 slot, u8 vector)
++ u32 slot, u8 vector, u8 vector_count)
+ {
+ int cpu;
+
+ int_pkt->message_type.type = PCI_CREATE_INTERRUPT_MESSAGE2;
+ int_pkt->wslot.slot = slot;
+ int_pkt->int_desc.vector = vector;
+- int_pkt->int_desc.vector_count = 1;
++ int_pkt->int_desc.vector_count = vector_count;
+ int_pkt->int_desc.delivery_mode = DELIVERY_MODE;
+ cpu = hv_compose_msi_req_get_cpu(affinity);
+ int_pkt->int_desc.processor_array[0] =
+@@ -1644,7 +1653,7 @@ static u32 hv_compose_msi_req_v2(
+
+ static u32 hv_compose_msi_req_v3(
+ struct pci_create_interrupt3 *int_pkt, struct cpumask *affinity,
+- u32 slot, u32 vector)
++ u32 slot, u32 vector, u8 vector_count)
+ {
+ int cpu;
+
+@@ -1652,7 +1661,7 @@ static u32 hv_compose_msi_req_v3(
+ int_pkt->wslot.slot = slot;
+ int_pkt->int_desc.vector = vector;
+ int_pkt->int_desc.reserved = 0;
+- int_pkt->int_desc.vector_count = 1;
++ int_pkt->int_desc.vector_count = vector_count;
+ int_pkt->int_desc.delivery_mode = DELIVERY_MODE;
+ cpu = hv_compose_msi_req_get_cpu(affinity);
+ int_pkt->int_desc.processor_array[0] =
+@@ -1683,6 +1692,8 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+ struct cpumask *dest;
+ struct compose_comp_ctxt comp;
+ struct tran_int_desc *int_desc;
++ struct msi_desc *msi_desc;
++ u8 vector, vector_count;
+ struct {
+ struct pci_packet pci_pkt;
+ union {
+@@ -1695,7 +1706,17 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+ u32 size;
+ int ret;
+
+- pdev = msi_desc_to_pci_dev(irq_data_get_msi_desc(data));
++ /* Reuse the previous allocation */
++ if (data->chip_data) {
++ int_desc = data->chip_data;
++ msg->address_hi = int_desc->address >> 32;
++ msg->address_lo = int_desc->address & 0xffffffff;
++ msg->data = int_desc->data;
++ return;
++ }
++
++ msi_desc = irq_data_get_msi_desc(data);
++ pdev = msi_desc_to_pci_dev(msi_desc);
+ dest = irq_data_get_effective_affinity_mask(data);
+ pbus = pdev->bus;
+ hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata);
+@@ -1704,17 +1725,40 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+ if (!hpdev)
+ goto return_null_message;
+
+- /* Free any previous message that might have already been composed. */
+- if (data->chip_data) {
+- int_desc = data->chip_data;
+- data->chip_data = NULL;
+- hv_int_desc_free(hpdev, int_desc);
+- }
+-
+ int_desc = kzalloc(sizeof(*int_desc), GFP_ATOMIC);
+ if (!int_desc)
+ goto drop_reference;
+
++ if (!msi_desc->pci.msi_attrib.is_msix && msi_desc->nvec_used > 1) {
++ /*
++ * If this is not the first MSI of Multi MSI, we already have
++ * a mapping. Can exit early.
++ */
++ if (msi_desc->irq != data->irq) {
++ data->chip_data = int_desc;
++ int_desc->address = msi_desc->msg.address_lo |
++ (u64)msi_desc->msg.address_hi << 32;
++ int_desc->data = msi_desc->msg.data +
++ (data->irq - msi_desc->irq);
++ msg->address_hi = msi_desc->msg.address_hi;
++ msg->address_lo = msi_desc->msg.address_lo;
++ msg->data = int_desc->data;
++ put_pcichild(hpdev);
++ return;
++ }
++ /*
++ * The vector we select here is a dummy value. The correct
++ * value gets sent to the hypervisor in unmask(). This needs
++ * to be aligned with the count, and also not zero. Multi-msi
++ * is powers of 2 up to 32, so 32 will always work here.
++ */
++ vector = 32;
++ vector_count = msi_desc->nvec_used;
++ } else {
++ vector = hv_msi_get_int_vector(data);
++ vector_count = 1;
++ }
++
+ memset(&ctxt, 0, sizeof(ctxt));
+ init_completion(&comp.comp_pkt.host_event);
+ ctxt.pci_pkt.completion_func = hv_pci_compose_compl;
+@@ -1725,7 +1769,8 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+ size = hv_compose_msi_req_v1(&ctxt.int_pkts.v1,
+ dest,
+ hpdev->desc.win_slot.slot,
+- hv_msi_get_int_vector(data));
++ vector,
++ vector_count);
+ break;
+
+ case PCI_PROTOCOL_VERSION_1_2:
+@@ -1733,14 +1778,16 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+ size = hv_compose_msi_req_v2(&ctxt.int_pkts.v2,
+ dest,
+ hpdev->desc.win_slot.slot,
+- hv_msi_get_int_vector(data));
++ vector,
++ vector_count);
+ break;
+
+ case PCI_PROTOCOL_VERSION_1_4:
+ size = hv_compose_msi_req_v3(&ctxt.int_pkts.v3,
+ dest,
+ hpdev->desc.win_slot.slot,
+- hv_msi_get_int_vector(data));
++ vector,
++ vector_count);
+ break;
+
+ default:
+diff --git a/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c b/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c
+index adccf03b3e5af..b920dd5237c75 100644
+--- a/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c
++++ b/drivers/pinctrl/mvebu/pinctrl-armada-37xx.c
+@@ -101,7 +101,7 @@ struct armada_37xx_pinctrl {
+ struct device *dev;
+ struct gpio_chip gpio_chip;
+ struct irq_chip irq_chip;
+- spinlock_t irq_lock;
++ raw_spinlock_t irq_lock;
+ struct pinctrl_desc pctl;
+ struct pinctrl_dev *pctl_dev;
+ struct armada_37xx_pin_group *groups;
+@@ -522,9 +522,9 @@ static void armada_37xx_irq_ack(struct irq_data *d)
+ unsigned long flags;
+
+ armada_37xx_irq_update_reg(®, d);
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ writel(d->mask, info->base + reg);
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+ }
+
+ static void armada_37xx_irq_mask(struct irq_data *d)
+@@ -535,10 +535,10 @@ static void armada_37xx_irq_mask(struct irq_data *d)
+ unsigned long flags;
+
+ armada_37xx_irq_update_reg(®, d);
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ val = readl(info->base + reg);
+ writel(val & ~d->mask, info->base + reg);
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+ }
+
+ static void armada_37xx_irq_unmask(struct irq_data *d)
+@@ -549,10 +549,10 @@ static void armada_37xx_irq_unmask(struct irq_data *d)
+ unsigned long flags;
+
+ armada_37xx_irq_update_reg(®, d);
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ val = readl(info->base + reg);
+ writel(val | d->mask, info->base + reg);
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+ }
+
+ static int armada_37xx_irq_set_wake(struct irq_data *d, unsigned int on)
+@@ -563,14 +563,14 @@ static int armada_37xx_irq_set_wake(struct irq_data *d, unsigned int on)
+ unsigned long flags;
+
+ armada_37xx_irq_update_reg(®, d);
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ val = readl(info->base + reg);
+ if (on)
+ val |= (BIT(d->hwirq % GPIO_PER_REG));
+ else
+ val &= ~(BIT(d->hwirq % GPIO_PER_REG));
+ writel(val, info->base + reg);
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+
+ return 0;
+ }
+@@ -582,7 +582,7 @@ static int armada_37xx_irq_set_type(struct irq_data *d, unsigned int type)
+ u32 val, reg = IRQ_POL;
+ unsigned long flags;
+
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ armada_37xx_irq_update_reg(®, d);
+ val = readl(info->base + reg);
+ switch (type) {
+@@ -606,11 +606,11 @@ static int armada_37xx_irq_set_type(struct irq_data *d, unsigned int type)
+ break;
+ }
+ default:
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+ return -EINVAL;
+ }
+ writel(val, info->base + reg);
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+
+ return 0;
+ }
+@@ -625,7 +625,7 @@ static int armada_37xx_edge_both_irq_swap_pol(struct armada_37xx_pinctrl *info,
+
+ regmap_read(info->regmap, INPUT_VAL + 4*reg_idx, &l);
+
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ p = readl(info->base + IRQ_POL + 4 * reg_idx);
+ if ((p ^ l) & (1 << bit_num)) {
+ /*
+@@ -646,7 +646,7 @@ static int armada_37xx_edge_both_irq_swap_pol(struct armada_37xx_pinctrl *info,
+ ret = -1;
+ }
+
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+ return ret;
+ }
+
+@@ -663,11 +663,11 @@ static void armada_37xx_irq_handler(struct irq_desc *desc)
+ u32 status;
+ unsigned long flags;
+
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ status = readl_relaxed(info->base + IRQ_STATUS + 4 * i);
+ /* Manage only the interrupt that was enabled */
+ status &= readl_relaxed(info->base + IRQ_EN + 4 * i);
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+ while (status) {
+ u32 hwirq = ffs(status) - 1;
+ u32 virq = irq_find_mapping(d, hwirq +
+@@ -694,12 +694,12 @@ static void armada_37xx_irq_handler(struct irq_desc *desc)
+
+ update_status:
+ /* Update status in case a new IRQ appears */
+- spin_lock_irqsave(&info->irq_lock, flags);
++ raw_spin_lock_irqsave(&info->irq_lock, flags);
+ status = readl_relaxed(info->base +
+ IRQ_STATUS + 4 * i);
+ /* Manage only the interrupt that was enabled */
+ status &= readl_relaxed(info->base + IRQ_EN + 4 * i);
+- spin_unlock_irqrestore(&info->irq_lock, flags);
++ raw_spin_unlock_irqrestore(&info->irq_lock, flags);
+ }
+ }
+ chained_irq_exit(chip, desc);
+@@ -726,23 +726,13 @@ static int armada_37xx_irqchip_register(struct platform_device *pdev,
+ struct gpio_chip *gc = &info->gpio_chip;
+ struct irq_chip *irqchip = &info->irq_chip;
+ struct gpio_irq_chip *girq = &gc->irq;
++ struct device_node *np = to_of_node(gc->fwnode);
+ struct device *dev = &pdev->dev;
+- struct device_node *np;
+- int ret = -ENODEV, i, nr_irq_parent;
++ unsigned int i, nr_irq_parent;
+
+- /* Check if we have at least one gpio-controller child node */
+- for_each_child_of_node(dev->of_node, np) {
+- if (of_property_read_bool(np, "gpio-controller")) {
+- ret = 0;
+- break;
+- }
+- }
+- if (ret)
+- return dev_err_probe(dev, ret, "no gpio-controller child node\n");
++ raw_spin_lock_init(&info->irq_lock);
+
+ nr_irq_parent = of_irq_count(np);
+- spin_lock_init(&info->irq_lock);
+-
+ if (!nr_irq_parent) {
+ dev_err(dev, "invalid or no IRQ\n");
+ return 0;
+@@ -1121,25 +1111,40 @@ static const struct of_device_id armada_37xx_pinctrl_of_match[] = {
+ { },
+ };
+
++static const struct regmap_config armada_37xx_pinctrl_regmap_config = {
++ .reg_bits = 32,
++ .val_bits = 32,
++ .reg_stride = 4,
++ .use_raw_spinlock = true,
++};
++
+ static int __init armada_37xx_pinctrl_probe(struct platform_device *pdev)
+ {
+ struct armada_37xx_pinctrl *info;
+ struct device *dev = &pdev->dev;
+- struct device_node *np = dev->of_node;
+ struct regmap *regmap;
++ void __iomem *base;
+ int ret;
+
++ base = devm_platform_get_and_ioremap_resource(pdev, 0, NULL);
++ if (IS_ERR(base)) {
++ dev_err(dev, "failed to ioremap base address: %pe\n", base);
++ return PTR_ERR(base);
++ }
++
++ regmap = devm_regmap_init_mmio(dev, base,
++ &armada_37xx_pinctrl_regmap_config);
++ if (IS_ERR(regmap)) {
++ dev_err(dev, "failed to create regmap: %pe\n", regmap);
++ return PTR_ERR(regmap);
++ }
++
+ info = devm_kzalloc(dev, sizeof(*info), GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ info->dev = dev;
+-
+- regmap = syscon_node_to_regmap(np);
+- if (IS_ERR(regmap))
+- return dev_err_probe(dev, PTR_ERR(regmap), "cannot get regmap\n");
+ info->regmap = regmap;
+-
+ info->data = of_device_get_match_data(dev);
+
+ ret = armada_37xx_pinctrl_register(pdev, info);
+diff --git a/drivers/pinctrl/pinctrl-ocelot.c b/drivers/pinctrl/pinctrl-ocelot.c
+index 6a956ee94494f..6ee9f0de8ede3 100644
+--- a/drivers/pinctrl/pinctrl-ocelot.c
++++ b/drivers/pinctrl/pinctrl-ocelot.c
+@@ -28,19 +28,12 @@
+ #define ocelot_clrsetbits(addr, clear, set) \
+ writel((readl(addr) & ~(clear)) | (set), (addr))
+
+-/* PINCONFIG bits (sparx5 only) */
+ enum {
+ PINCONF_BIAS,
+ PINCONF_SCHMITT,
+ PINCONF_DRIVE_STRENGTH,
+ };
+
+-#define BIAS_PD_BIT BIT(4)
+-#define BIAS_PU_BIT BIT(3)
+-#define BIAS_BITS (BIAS_PD_BIT|BIAS_PU_BIT)
+-#define SCHMITT_BIT BIT(2)
+-#define DRIVE_BITS GENMASK(1, 0)
+-
+ /* GPIO standard registers */
+ #define OCELOT_GPIO_OUT_SET 0x0
+ #define OCELOT_GPIO_OUT_CLR 0x4
+@@ -314,6 +307,13 @@ struct ocelot_pin_caps {
+ unsigned char a_functions[OCELOT_FUNC_PER_PIN]; /* Additional functions */
+ };
+
++struct ocelot_pincfg_data {
++ u8 pd_bit;
++ u8 pu_bit;
++ u8 drive_bits;
++ u8 schmitt_bit;
++};
++
+ struct ocelot_pinctrl {
+ struct device *dev;
+ struct pinctrl_dev *pctl;
+@@ -321,10 +321,16 @@ struct ocelot_pinctrl {
+ struct regmap *map;
+ struct regmap *pincfg;
+ struct pinctrl_desc *desc;
++ const struct ocelot_pincfg_data *pincfg_data;
+ struct ocelot_pmx_func func[FUNC_MAX];
+ u8 stride;
+ };
+
++struct ocelot_match_data {
++ struct pinctrl_desc desc;
++ struct ocelot_pincfg_data pincfg_data;
++};
++
+ #define LUTON_P(p, f0, f1) \
+ static struct ocelot_pin_caps luton_pin_##p = { \
+ .pin = p, \
+@@ -1318,24 +1324,27 @@ static int ocelot_hw_get_value(struct ocelot_pinctrl *info,
+ int ret = -EOPNOTSUPP;
+
+ if (info->pincfg) {
++ const struct ocelot_pincfg_data *opd = info->pincfg_data;
+ u32 regcfg;
+
+- ret = regmap_read(info->pincfg, pin, ®cfg);
++ ret = regmap_read(info->pincfg,
++ pin * regmap_get_reg_stride(info->pincfg),
++ ®cfg);
+ if (ret)
+ return ret;
+
+ ret = 0;
+ switch (reg) {
+ case PINCONF_BIAS:
+- *val = regcfg & BIAS_BITS;
++ *val = regcfg & (opd->pd_bit | opd->pu_bit);
+ break;
+
+ case PINCONF_SCHMITT:
+- *val = regcfg & SCHMITT_BIT;
++ *val = regcfg & opd->schmitt_bit;
+ break;
+
+ case PINCONF_DRIVE_STRENGTH:
+- *val = regcfg & DRIVE_BITS;
++ *val = regcfg & opd->drive_bits;
+ break;
+
+ default:
+@@ -1352,14 +1361,18 @@ static int ocelot_pincfg_clrsetbits(struct ocelot_pinctrl *info, u32 regaddr,
+ u32 val;
+ int ret;
+
+- ret = regmap_read(info->pincfg, regaddr, &val);
++ ret = regmap_read(info->pincfg,
++ regaddr * regmap_get_reg_stride(info->pincfg),
++ &val);
+ if (ret)
+ return ret;
+
+ val &= ~clrbits;
+ val |= setbits;
+
+- ret = regmap_write(info->pincfg, regaddr, val);
++ ret = regmap_write(info->pincfg,
++ regaddr * regmap_get_reg_stride(info->pincfg),
++ val);
+
+ return ret;
+ }
+@@ -1372,23 +1385,27 @@ static int ocelot_hw_set_value(struct ocelot_pinctrl *info,
+ int ret = -EOPNOTSUPP;
+
+ if (info->pincfg) {
++ const struct ocelot_pincfg_data *opd = info->pincfg_data;
+
+ ret = 0;
+ switch (reg) {
+ case PINCONF_BIAS:
+- ret = ocelot_pincfg_clrsetbits(info, pin, BIAS_BITS,
++ ret = ocelot_pincfg_clrsetbits(info, pin,
++ opd->pd_bit | opd->pu_bit,
+ val);
+ break;
+
+ case PINCONF_SCHMITT:
+- ret = ocelot_pincfg_clrsetbits(info, pin, SCHMITT_BIT,
++ ret = ocelot_pincfg_clrsetbits(info, pin,
++ opd->schmitt_bit,
+ val);
+ break;
+
+ case PINCONF_DRIVE_STRENGTH:
+ if (val <= 3)
+ ret = ocelot_pincfg_clrsetbits(info, pin,
+- DRIVE_BITS, val);
++ opd->drive_bits,
++ val);
+ else
+ ret = -EINVAL;
+ break;
+@@ -1418,17 +1435,20 @@ static int ocelot_pinconf_get(struct pinctrl_dev *pctldev,
+ if (param == PIN_CONFIG_BIAS_DISABLE)
+ val = (val == 0);
+ else if (param == PIN_CONFIG_BIAS_PULL_DOWN)
+- val = (val & BIAS_PD_BIT ? true : false);
++ val = !!(val & info->pincfg_data->pd_bit);
+ else /* PIN_CONFIG_BIAS_PULL_UP */
+- val = (val & BIAS_PU_BIT ? true : false);
++ val = !!(val & info->pincfg_data->pu_bit);
+ break;
+
+ case PIN_CONFIG_INPUT_SCHMITT_ENABLE:
++ if (!info->pincfg_data->schmitt_bit)
++ return -EOPNOTSUPP;
++
+ err = ocelot_hw_get_value(info, pin, PINCONF_SCHMITT, &val);
+ if (err)
+ return err;
+
+- val = (val & SCHMITT_BIT ? true : false);
++ val = !!(val & info->pincfg_data->schmitt_bit);
+ break;
+
+ case PIN_CONFIG_DRIVE_STRENGTH:
+@@ -1472,6 +1492,7 @@ static int ocelot_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+ unsigned long *configs, unsigned int num_configs)
+ {
+ struct ocelot_pinctrl *info = pinctrl_dev_get_drvdata(pctldev);
++ const struct ocelot_pincfg_data *opd = info->pincfg_data;
+ u32 param, arg, p;
+ int cfg, err = 0;
+
+@@ -1484,8 +1505,8 @@ static int ocelot_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+ case PIN_CONFIG_BIAS_PULL_UP:
+ case PIN_CONFIG_BIAS_PULL_DOWN:
+ arg = (param == PIN_CONFIG_BIAS_DISABLE) ? 0 :
+- (param == PIN_CONFIG_BIAS_PULL_UP) ? BIAS_PU_BIT :
+- BIAS_PD_BIT;
++ (param == PIN_CONFIG_BIAS_PULL_UP) ?
++ opd->pu_bit : opd->pd_bit;
+
+ err = ocelot_hw_set_value(info, pin, PINCONF_BIAS, arg);
+ if (err)
+@@ -1494,7 +1515,10 @@ static int ocelot_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+ break;
+
+ case PIN_CONFIG_INPUT_SCHMITT_ENABLE:
+- arg = arg ? SCHMITT_BIT : 0;
++ if (!opd->schmitt_bit)
++ return -EOPNOTSUPP;
++
++ arg = arg ? opd->schmitt_bit : 0;
+ err = ocelot_hw_set_value(info, pin, PINCONF_SCHMITT,
+ arg);
+ if (err)
+@@ -1555,69 +1579,94 @@ static const struct pinctrl_ops ocelot_pctl_ops = {
+ .dt_free_map = pinconf_generic_dt_free_map,
+ };
+
+-static struct pinctrl_desc luton_desc = {
+- .name = "luton-pinctrl",
+- .pins = luton_pins,
+- .npins = ARRAY_SIZE(luton_pins),
+- .pctlops = &ocelot_pctl_ops,
+- .pmxops = &ocelot_pmx_ops,
+- .owner = THIS_MODULE,
++static struct ocelot_match_data luton_desc = {
++ .desc = {
++ .name = "luton-pinctrl",
++ .pins = luton_pins,
++ .npins = ARRAY_SIZE(luton_pins),
++ .pctlops = &ocelot_pctl_ops,
++ .pmxops = &ocelot_pmx_ops,
++ .owner = THIS_MODULE,
++ },
+ };
+
+-static struct pinctrl_desc serval_desc = {
+- .name = "serval-pinctrl",
+- .pins = serval_pins,
+- .npins = ARRAY_SIZE(serval_pins),
+- .pctlops = &ocelot_pctl_ops,
+- .pmxops = &ocelot_pmx_ops,
+- .owner = THIS_MODULE,
++static struct ocelot_match_data serval_desc = {
++ .desc = {
++ .name = "serval-pinctrl",
++ .pins = serval_pins,
++ .npins = ARRAY_SIZE(serval_pins),
++ .pctlops = &ocelot_pctl_ops,
++ .pmxops = &ocelot_pmx_ops,
++ .owner = THIS_MODULE,
++ },
+ };
+
+-static struct pinctrl_desc ocelot_desc = {
+- .name = "ocelot-pinctrl",
+- .pins = ocelot_pins,
+- .npins = ARRAY_SIZE(ocelot_pins),
+- .pctlops = &ocelot_pctl_ops,
+- .pmxops = &ocelot_pmx_ops,
+- .owner = THIS_MODULE,
++static struct ocelot_match_data ocelot_desc = {
++ .desc = {
++ .name = "ocelot-pinctrl",
++ .pins = ocelot_pins,
++ .npins = ARRAY_SIZE(ocelot_pins),
++ .pctlops = &ocelot_pctl_ops,
++ .pmxops = &ocelot_pmx_ops,
++ .owner = THIS_MODULE,
++ },
+ };
+
+-static struct pinctrl_desc jaguar2_desc = {
+- .name = "jaguar2-pinctrl",
+- .pins = jaguar2_pins,
+- .npins = ARRAY_SIZE(jaguar2_pins),
+- .pctlops = &ocelot_pctl_ops,
+- .pmxops = &ocelot_pmx_ops,
+- .owner = THIS_MODULE,
++static struct ocelot_match_data jaguar2_desc = {
++ .desc = {
++ .name = "jaguar2-pinctrl",
++ .pins = jaguar2_pins,
++ .npins = ARRAY_SIZE(jaguar2_pins),
++ .pctlops = &ocelot_pctl_ops,
++ .pmxops = &ocelot_pmx_ops,
++ .owner = THIS_MODULE,
++ },
+ };
+
+-static struct pinctrl_desc servalt_desc = {
+- .name = "servalt-pinctrl",
+- .pins = servalt_pins,
+- .npins = ARRAY_SIZE(servalt_pins),
+- .pctlops = &ocelot_pctl_ops,
+- .pmxops = &ocelot_pmx_ops,
+- .owner = THIS_MODULE,
++static struct ocelot_match_data servalt_desc = {
++ .desc = {
++ .name = "servalt-pinctrl",
++ .pins = servalt_pins,
++ .npins = ARRAY_SIZE(servalt_pins),
++ .pctlops = &ocelot_pctl_ops,
++ .pmxops = &ocelot_pmx_ops,
++ .owner = THIS_MODULE,
++ },
+ };
+
+-static struct pinctrl_desc sparx5_desc = {
+- .name = "sparx5-pinctrl",
+- .pins = sparx5_pins,
+- .npins = ARRAY_SIZE(sparx5_pins),
+- .pctlops = &ocelot_pctl_ops,
+- .pmxops = &ocelot_pmx_ops,
+- .confops = &ocelot_confops,
+- .owner = THIS_MODULE,
++static struct ocelot_match_data sparx5_desc = {
++ .desc = {
++ .name = "sparx5-pinctrl",
++ .pins = sparx5_pins,
++ .npins = ARRAY_SIZE(sparx5_pins),
++ .pctlops = &ocelot_pctl_ops,
++ .pmxops = &ocelot_pmx_ops,
++ .confops = &ocelot_confops,
++ .owner = THIS_MODULE,
++ },
++ .pincfg_data = {
++ .pd_bit = BIT(4),
++ .pu_bit = BIT(3),
++ .drive_bits = GENMASK(1, 0),
++ .schmitt_bit = BIT(2),
++ },
+ };
+
+-static struct pinctrl_desc lan966x_desc = {
+- .name = "lan966x-pinctrl",
+- .pins = lan966x_pins,
+- .npins = ARRAY_SIZE(lan966x_pins),
+- .pctlops = &ocelot_pctl_ops,
+- .pmxops = &lan966x_pmx_ops,
+- .confops = &ocelot_confops,
+- .owner = THIS_MODULE,
++static struct ocelot_match_data lan966x_desc = {
++ .desc = {
++ .name = "lan966x-pinctrl",
++ .pins = lan966x_pins,
++ .npins = ARRAY_SIZE(lan966x_pins),
++ .pctlops = &ocelot_pctl_ops,
++ .pmxops = &lan966x_pmx_ops,
++ .confops = &ocelot_confops,
++ .owner = THIS_MODULE,
++ },
++ .pincfg_data = {
++ .pd_bit = BIT(3),
++ .pu_bit = BIT(2),
++ .drive_bits = GENMASK(1, 0),
++ },
+ };
+
+ static int ocelot_create_group_func_map(struct device *dev,
+@@ -1883,7 +1932,8 @@ static const struct of_device_id ocelot_pinctrl_of_match[] = {
+ {},
+ };
+
+-static struct regmap *ocelot_pinctrl_create_pincfg(struct platform_device *pdev)
++static struct regmap *ocelot_pinctrl_create_pincfg(struct platform_device *pdev,
++ const struct ocelot_pinctrl *info)
+ {
+ void __iomem *base;
+
+@@ -1891,7 +1941,7 @@ static struct regmap *ocelot_pinctrl_create_pincfg(struct platform_device *pdev)
+ .reg_bits = 32,
+ .val_bits = 32,
+ .reg_stride = 4,
+- .max_register = 32,
++ .max_register = info->desc->npins * 4,
+ .name = "pincfg",
+ };
+
+@@ -1906,6 +1956,7 @@ static struct regmap *ocelot_pinctrl_create_pincfg(struct platform_device *pdev)
+
+ static int ocelot_pinctrl_probe(struct platform_device *pdev)
+ {
++ const struct ocelot_match_data *data;
+ struct device *dev = &pdev->dev;
+ struct ocelot_pinctrl *info;
+ struct regmap *pincfg;
+@@ -1921,7 +1972,16 @@ static int ocelot_pinctrl_probe(struct platform_device *pdev)
+ if (!info)
+ return -ENOMEM;
+
+- info->desc = (struct pinctrl_desc *)device_get_match_data(dev);
++ data = device_get_match_data(dev);
++ if (!data)
++ return -EINVAL;
++
++ info->desc = devm_kmemdup(dev, &data->desc, sizeof(*info->desc),
++ GFP_KERNEL);
++ if (!info->desc)
++ return -ENOMEM;
++
++ info->pincfg_data = &data->pincfg_data;
+
+ base = devm_ioremap_resource(dev,
+ platform_get_resource(pdev, IORESOURCE_MEM, 0));
+@@ -1942,7 +2002,7 @@ static int ocelot_pinctrl_probe(struct platform_device *pdev)
+
+ /* Pinconf registers */
+ if (info->desc->confops) {
+- pincfg = ocelot_pinctrl_create_pincfg(pdev);
++ pincfg = ocelot_pinctrl_create_pincfg(pdev, info);
+ if (IS_ERR(pincfg))
+ dev_dbg(dev, "Failed to create pincfg regmap\n");
+ else
+diff --git a/drivers/pinctrl/ralink/Kconfig b/drivers/pinctrl/ralink/Kconfig
+index a76ee3deb8c31..d0f0a8f2b9b7d 100644
+--- a/drivers/pinctrl/ralink/Kconfig
++++ b/drivers/pinctrl/ralink/Kconfig
+@@ -3,37 +3,33 @@ menu "Ralink pinctrl drivers"
+ depends on RALINK
+
+ config PINCTRL_RALINK
+- bool "Ralink pin control support"
+- default y if RALINK
+-
+-config PINCTRL_RT2880
+- bool "RT2880 pinctrl driver for RALINK/Mediatek SOCs"
++ bool "Ralink pinctrl driver"
+ select PINMUX
+ select GENERIC_PINCONF
+
+ config PINCTRL_MT7620
+ bool "mt7620 pinctrl driver for RALINK/Mediatek SOCs"
+ depends on RALINK && SOC_MT7620
+- select PINCTRL_RT2880
++ select PINCTRL_RALINK
+
+ config PINCTRL_MT7621
+ bool "mt7621 pinctrl driver for RALINK/Mediatek SOCs"
+ depends on RALINK && SOC_MT7621
+- select PINCTRL_RT2880
++ select PINCTRL_RALINK
+
+ config PINCTRL_RT288X
+ bool "RT288X pinctrl driver for RALINK/Mediatek SOCs"
+ depends on RALINK && SOC_RT288X
+- select PINCTRL_RT2880
++ select PINCTRL_RALINK
+
+ config PINCTRL_RT305X
+ bool "RT305X pinctrl driver for RALINK/Mediatek SOCs"
+ depends on RALINK && SOC_RT305X
+- select PINCTRL_RT2880
++ select PINCTRL_RALINK
+
+ config PINCTRL_RT3883
+ bool "RT3883 pinctrl driver for RALINK/Mediatek SOCs"
+ depends on RALINK && SOC_RT3883
+- select PINCTRL_RT2880
++ select PINCTRL_RALINK
+
+ endmenu
+diff --git a/drivers/pinctrl/ralink/Makefile b/drivers/pinctrl/ralink/Makefile
+index a15610206ced4..2c1323b74e96f 100644
+--- a/drivers/pinctrl/ralink/Makefile
++++ b/drivers/pinctrl/ralink/Makefile
+@@ -1,5 +1,5 @@
+ # SPDX-License-Identifier: GPL-2.0
+-obj-$(CONFIG_PINCTRL_RT2880) += pinctrl-rt2880.o
++obj-$(CONFIG_PINCTRL_RALINK) += pinctrl-ralink.o
+
+ obj-$(CONFIG_PINCTRL_MT7620) += pinctrl-mt7620.o
+ obj-$(CONFIG_PINCTRL_MT7621) += pinctrl-mt7621.o
+diff --git a/drivers/pinctrl/ralink/pinctrl-mt7620.c b/drivers/pinctrl/ralink/pinctrl-mt7620.c
+index 6853b5b8b0fe7..51b863d85c51e 100644
+--- a/drivers/pinctrl/ralink/pinctrl-mt7620.c
++++ b/drivers/pinctrl/ralink/pinctrl-mt7620.c
+@@ -5,7 +5,7 @@
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include <linux/of.h>
+-#include "pinmux.h"
++#include "pinctrl-ralink.h"
+
+ #define MT7620_GPIO_MODE_UART0_SHIFT 2
+ #define MT7620_GPIO_MODE_UART0_MASK 0x7
+@@ -54,20 +54,20 @@
+ #define MT7620_GPIO_MODE_EPHY 15
+ #define MT7620_GPIO_MODE_PA 20
+
+-static struct rt2880_pmx_func i2c_grp[] = { FUNC("i2c", 0, 1, 2) };
+-static struct rt2880_pmx_func spi_grp[] = { FUNC("spi", 0, 3, 4) };
+-static struct rt2880_pmx_func uartlite_grp[] = { FUNC("uartlite", 0, 15, 2) };
+-static struct rt2880_pmx_func mdio_grp[] = {
++static struct ralink_pmx_func i2c_grp[] = { FUNC("i2c", 0, 1, 2) };
++static struct ralink_pmx_func spi_grp[] = { FUNC("spi", 0, 3, 4) };
++static struct ralink_pmx_func uartlite_grp[] = { FUNC("uartlite", 0, 15, 2) };
++static struct ralink_pmx_func mdio_grp[] = {
+ FUNC("mdio", MT7620_GPIO_MODE_MDIO, 22, 2),
+ FUNC("refclk", MT7620_GPIO_MODE_MDIO_REFCLK, 22, 2),
+ };
+-static struct rt2880_pmx_func rgmii1_grp[] = { FUNC("rgmii1", 0, 24, 12) };
+-static struct rt2880_pmx_func refclk_grp[] = { FUNC("spi refclk", 0, 37, 3) };
+-static struct rt2880_pmx_func ephy_grp[] = { FUNC("ephy", 0, 40, 5) };
+-static struct rt2880_pmx_func rgmii2_grp[] = { FUNC("rgmii2", 0, 60, 12) };
+-static struct rt2880_pmx_func wled_grp[] = { FUNC("wled", 0, 72, 1) };
+-static struct rt2880_pmx_func pa_grp[] = { FUNC("pa", 0, 18, 4) };
+-static struct rt2880_pmx_func uartf_grp[] = {
++static struct ralink_pmx_func rgmii1_grp[] = { FUNC("rgmii1", 0, 24, 12) };
++static struct ralink_pmx_func refclk_grp[] = { FUNC("spi refclk", 0, 37, 3) };
++static struct ralink_pmx_func ephy_grp[] = { FUNC("ephy", 0, 40, 5) };
++static struct ralink_pmx_func rgmii2_grp[] = { FUNC("rgmii2", 0, 60, 12) };
++static struct ralink_pmx_func wled_grp[] = { FUNC("wled", 0, 72, 1) };
++static struct ralink_pmx_func pa_grp[] = { FUNC("pa", 0, 18, 4) };
++static struct ralink_pmx_func uartf_grp[] = {
+ FUNC("uartf", MT7620_GPIO_MODE_UARTF, 7, 8),
+ FUNC("pcm uartf", MT7620_GPIO_MODE_PCM_UARTF, 7, 8),
+ FUNC("pcm i2s", MT7620_GPIO_MODE_PCM_I2S, 7, 8),
+@@ -76,20 +76,20 @@ static struct rt2880_pmx_func uartf_grp[] = {
+ FUNC("gpio uartf", MT7620_GPIO_MODE_GPIO_UARTF, 7, 4),
+ FUNC("gpio i2s", MT7620_GPIO_MODE_GPIO_I2S, 7, 4),
+ };
+-static struct rt2880_pmx_func wdt_grp[] = {
++static struct ralink_pmx_func wdt_grp[] = {
+ FUNC("wdt rst", 0, 17, 1),
+ FUNC("wdt refclk", 0, 17, 1),
+ };
+-static struct rt2880_pmx_func pcie_rst_grp[] = {
++static struct ralink_pmx_func pcie_rst_grp[] = {
+ FUNC("pcie rst", MT7620_GPIO_MODE_PCIE_RST, 36, 1),
+ FUNC("pcie refclk", MT7620_GPIO_MODE_PCIE_REF, 36, 1)
+ };
+-static struct rt2880_pmx_func nd_sd_grp[] = {
++static struct ralink_pmx_func nd_sd_grp[] = {
+ FUNC("nand", MT7620_GPIO_MODE_NAND, 45, 15),
+ FUNC("sd", MT7620_GPIO_MODE_SD, 47, 13)
+ };
+
+-static struct rt2880_pmx_group mt7620a_pinmux_data[] = {
++static struct ralink_pmx_group mt7620a_pinmux_data[] = {
+ GRP("i2c", i2c_grp, 1, MT7620_GPIO_MODE_I2C),
+ GRP("uartf", uartf_grp, MT7620_GPIO_MODE_UART0_MASK,
+ MT7620_GPIO_MODE_UART0_SHIFT),
+@@ -112,262 +112,262 @@ static struct rt2880_pmx_group mt7620a_pinmux_data[] = {
+ { 0 }
+ };
+
+-static struct rt2880_pmx_func pwm1_grp_mt7628[] = {
++static struct ralink_pmx_func pwm1_grp_mt76x8[] = {
+ FUNC("sdxc d6", 3, 19, 1),
+ FUNC("utif", 2, 19, 1),
+ FUNC("gpio", 1, 19, 1),
+ FUNC("pwm1", 0, 19, 1),
+ };
+
+-static struct rt2880_pmx_func pwm0_grp_mt7628[] = {
++static struct ralink_pmx_func pwm0_grp_mt76x8[] = {
+ FUNC("sdxc d7", 3, 18, 1),
+ FUNC("utif", 2, 18, 1),
+ FUNC("gpio", 1, 18, 1),
+ FUNC("pwm0", 0, 18, 1),
+ };
+
+-static struct rt2880_pmx_func uart2_grp_mt7628[] = {
++static struct ralink_pmx_func uart2_grp_mt76x8[] = {
+ FUNC("sdxc d5 d4", 3, 20, 2),
+ FUNC("pwm", 2, 20, 2),
+ FUNC("gpio", 1, 20, 2),
+ FUNC("uart2", 0, 20, 2),
+ };
+
+-static struct rt2880_pmx_func uart1_grp_mt7628[] = {
++static struct ralink_pmx_func uart1_grp_mt76x8[] = {
+ FUNC("sw_r", 3, 45, 2),
+ FUNC("pwm", 2, 45, 2),
+ FUNC("gpio", 1, 45, 2),
+ FUNC("uart1", 0, 45, 2),
+ };
+
+-static struct rt2880_pmx_func i2c_grp_mt7628[] = {
++static struct ralink_pmx_func i2c_grp_mt76x8[] = {
+ FUNC("-", 3, 4, 2),
+ FUNC("debug", 2, 4, 2),
+ FUNC("gpio", 1, 4, 2),
+ FUNC("i2c", 0, 4, 2),
+ };
+
+-static struct rt2880_pmx_func refclk_grp_mt7628[] = { FUNC("refclk", 0, 37, 1) };
+-static struct rt2880_pmx_func perst_grp_mt7628[] = { FUNC("perst", 0, 36, 1) };
+-static struct rt2880_pmx_func wdt_grp_mt7628[] = { FUNC("wdt", 0, 38, 1) };
+-static struct rt2880_pmx_func spi_grp_mt7628[] = { FUNC("spi", 0, 7, 4) };
++static struct ralink_pmx_func refclk_grp_mt76x8[] = { FUNC("refclk", 0, 37, 1) };
++static struct ralink_pmx_func perst_grp_mt76x8[] = { FUNC("perst", 0, 36, 1) };
++static struct ralink_pmx_func wdt_grp_mt76x8[] = { FUNC("wdt", 0, 38, 1) };
++static struct ralink_pmx_func spi_grp_mt76x8[] = { FUNC("spi", 0, 7, 4) };
+
+-static struct rt2880_pmx_func sd_mode_grp_mt7628[] = {
++static struct ralink_pmx_func sd_mode_grp_mt76x8[] = {
+ FUNC("jtag", 3, 22, 8),
+ FUNC("utif", 2, 22, 8),
+ FUNC("gpio", 1, 22, 8),
+ FUNC("sdxc", 0, 22, 8),
+ };
+
+-static struct rt2880_pmx_func uart0_grp_mt7628[] = {
++static struct ralink_pmx_func uart0_grp_mt76x8[] = {
+ FUNC("-", 3, 12, 2),
+ FUNC("-", 2, 12, 2),
+ FUNC("gpio", 1, 12, 2),
+ FUNC("uart0", 0, 12, 2),
+ };
+
+-static struct rt2880_pmx_func i2s_grp_mt7628[] = {
++static struct ralink_pmx_func i2s_grp_mt76x8[] = {
+ FUNC("antenna", 3, 0, 4),
+ FUNC("pcm", 2, 0, 4),
+ FUNC("gpio", 1, 0, 4),
+ FUNC("i2s", 0, 0, 4),
+ };
+
+-static struct rt2880_pmx_func spi_cs1_grp_mt7628[] = {
++static struct ralink_pmx_func spi_cs1_grp_mt76x8[] = {
+ FUNC("-", 3, 6, 1),
+ FUNC("refclk", 2, 6, 1),
+ FUNC("gpio", 1, 6, 1),
+ FUNC("spi cs1", 0, 6, 1),
+ };
+
+-static struct rt2880_pmx_func spis_grp_mt7628[] = {
++static struct ralink_pmx_func spis_grp_mt76x8[] = {
+ FUNC("pwm_uart2", 3, 14, 4),
+ FUNC("utif", 2, 14, 4),
+ FUNC("gpio", 1, 14, 4),
+ FUNC("spis", 0, 14, 4),
+ };
+
+-static struct rt2880_pmx_func gpio_grp_mt7628[] = {
++static struct ralink_pmx_func gpio_grp_mt76x8[] = {
+ FUNC("pcie", 3, 11, 1),
+ FUNC("refclk", 2, 11, 1),
+ FUNC("gpio", 1, 11, 1),
+ FUNC("gpio", 0, 11, 1),
+ };
+
+-static struct rt2880_pmx_func p4led_kn_grp_mt7628[] = {
++static struct ralink_pmx_func p4led_kn_grp_mt76x8[] = {
+ FUNC("jtag", 3, 30, 1),
+ FUNC("utif", 2, 30, 1),
+ FUNC("gpio", 1, 30, 1),
+ FUNC("p4led_kn", 0, 30, 1),
+ };
+
+-static struct rt2880_pmx_func p3led_kn_grp_mt7628[] = {
++static struct ralink_pmx_func p3led_kn_grp_mt76x8[] = {
+ FUNC("jtag", 3, 31, 1),
+ FUNC("utif", 2, 31, 1),
+ FUNC("gpio", 1, 31, 1),
+ FUNC("p3led_kn", 0, 31, 1),
+ };
+
+-static struct rt2880_pmx_func p2led_kn_grp_mt7628[] = {
++static struct ralink_pmx_func p2led_kn_grp_mt76x8[] = {
+ FUNC("jtag", 3, 32, 1),
+ FUNC("utif", 2, 32, 1),
+ FUNC("gpio", 1, 32, 1),
+ FUNC("p2led_kn", 0, 32, 1),
+ };
+
+-static struct rt2880_pmx_func p1led_kn_grp_mt7628[] = {
++static struct ralink_pmx_func p1led_kn_grp_mt76x8[] = {
+ FUNC("jtag", 3, 33, 1),
+ FUNC("utif", 2, 33, 1),
+ FUNC("gpio", 1, 33, 1),
+ FUNC("p1led_kn", 0, 33, 1),
+ };
+
+-static struct rt2880_pmx_func p0led_kn_grp_mt7628[] = {
++static struct ralink_pmx_func p0led_kn_grp_mt76x8[] = {
+ FUNC("jtag", 3, 34, 1),
+ FUNC("rsvd", 2, 34, 1),
+ FUNC("gpio", 1, 34, 1),
+ FUNC("p0led_kn", 0, 34, 1),
+ };
+
+-static struct rt2880_pmx_func wled_kn_grp_mt7628[] = {
++static struct ralink_pmx_func wled_kn_grp_mt76x8[] = {
+ FUNC("rsvd", 3, 35, 1),
+ FUNC("rsvd", 2, 35, 1),
+ FUNC("gpio", 1, 35, 1),
+ FUNC("wled_kn", 0, 35, 1),
+ };
+
+-static struct rt2880_pmx_func p4led_an_grp_mt7628[] = {
++static struct ralink_pmx_func p4led_an_grp_mt76x8[] = {
+ FUNC("jtag", 3, 39, 1),
+ FUNC("utif", 2, 39, 1),
+ FUNC("gpio", 1, 39, 1),
+ FUNC("p4led_an", 0, 39, 1),
+ };
+
+-static struct rt2880_pmx_func p3led_an_grp_mt7628[] = {
++static struct ralink_pmx_func p3led_an_grp_mt76x8[] = {
+ FUNC("jtag", 3, 40, 1),
+ FUNC("utif", 2, 40, 1),
+ FUNC("gpio", 1, 40, 1),
+ FUNC("p3led_an", 0, 40, 1),
+ };
+
+-static struct rt2880_pmx_func p2led_an_grp_mt7628[] = {
++static struct ralink_pmx_func p2led_an_grp_mt76x8[] = {
+ FUNC("jtag", 3, 41, 1),
+ FUNC("utif", 2, 41, 1),
+ FUNC("gpio", 1, 41, 1),
+ FUNC("p2led_an", 0, 41, 1),
+ };
+
+-static struct rt2880_pmx_func p1led_an_grp_mt7628[] = {
++static struct ralink_pmx_func p1led_an_grp_mt76x8[] = {
+ FUNC("jtag", 3, 42, 1),
+ FUNC("utif", 2, 42, 1),
+ FUNC("gpio", 1, 42, 1),
+ FUNC("p1led_an", 0, 42, 1),
+ };
+
+-static struct rt2880_pmx_func p0led_an_grp_mt7628[] = {
++static struct ralink_pmx_func p0led_an_grp_mt76x8[] = {
+ FUNC("jtag", 3, 43, 1),
+ FUNC("rsvd", 2, 43, 1),
+ FUNC("gpio", 1, 43, 1),
+ FUNC("p0led_an", 0, 43, 1),
+ };
+
+-static struct rt2880_pmx_func wled_an_grp_mt7628[] = {
++static struct ralink_pmx_func wled_an_grp_mt76x8[] = {
+ FUNC("rsvd", 3, 44, 1),
+ FUNC("rsvd", 2, 44, 1),
+ FUNC("gpio", 1, 44, 1),
+ FUNC("wled_an", 0, 44, 1),
+ };
+
+-#define MT7628_GPIO_MODE_MASK 0x3
+-
+-#define MT7628_GPIO_MODE_P4LED_KN 58
+-#define MT7628_GPIO_MODE_P3LED_KN 56
+-#define MT7628_GPIO_MODE_P2LED_KN 54
+-#define MT7628_GPIO_MODE_P1LED_KN 52
+-#define MT7628_GPIO_MODE_P0LED_KN 50
+-#define MT7628_GPIO_MODE_WLED_KN 48
+-#define MT7628_GPIO_MODE_P4LED_AN 42
+-#define MT7628_GPIO_MODE_P3LED_AN 40
+-#define MT7628_GPIO_MODE_P2LED_AN 38
+-#define MT7628_GPIO_MODE_P1LED_AN 36
+-#define MT7628_GPIO_MODE_P0LED_AN 34
+-#define MT7628_GPIO_MODE_WLED_AN 32
+-#define MT7628_GPIO_MODE_PWM1 30
+-#define MT7628_GPIO_MODE_PWM0 28
+-#define MT7628_GPIO_MODE_UART2 26
+-#define MT7628_GPIO_MODE_UART1 24
+-#define MT7628_GPIO_MODE_I2C 20
+-#define MT7628_GPIO_MODE_REFCLK 18
+-#define MT7628_GPIO_MODE_PERST 16
+-#define MT7628_GPIO_MODE_WDT 14
+-#define MT7628_GPIO_MODE_SPI 12
+-#define MT7628_GPIO_MODE_SDMODE 10
+-#define MT7628_GPIO_MODE_UART0 8
+-#define MT7628_GPIO_MODE_I2S 6
+-#define MT7628_GPIO_MODE_CS1 4
+-#define MT7628_GPIO_MODE_SPIS 2
+-#define MT7628_GPIO_MODE_GPIO 0
+-
+-static struct rt2880_pmx_group mt7628an_pinmux_data[] = {
+- GRP_G("pwm1", pwm1_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_PWM1),
+- GRP_G("pwm0", pwm0_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_PWM0),
+- GRP_G("uart2", uart2_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_UART2),
+- GRP_G("uart1", uart1_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_UART1),
+- GRP_G("i2c", i2c_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_I2C),
+- GRP("refclk", refclk_grp_mt7628, 1, MT7628_GPIO_MODE_REFCLK),
+- GRP("perst", perst_grp_mt7628, 1, MT7628_GPIO_MODE_PERST),
+- GRP("wdt", wdt_grp_mt7628, 1, MT7628_GPIO_MODE_WDT),
+- GRP("spi", spi_grp_mt7628, 1, MT7628_GPIO_MODE_SPI),
+- GRP_G("sdmode", sd_mode_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_SDMODE),
+- GRP_G("uart0", uart0_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_UART0),
+- GRP_G("i2s", i2s_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_I2S),
+- GRP_G("spi cs1", spi_cs1_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_CS1),
+- GRP_G("spis", spis_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_SPIS),
+- GRP_G("gpio", gpio_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_GPIO),
+- GRP_G("wled_an", wled_an_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_WLED_AN),
+- GRP_G("p0led_an", p0led_an_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P0LED_AN),
+- GRP_G("p1led_an", p1led_an_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P1LED_AN),
+- GRP_G("p2led_an", p2led_an_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P2LED_AN),
+- GRP_G("p3led_an", p3led_an_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P3LED_AN),
+- GRP_G("p4led_an", p4led_an_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P4LED_AN),
+- GRP_G("wled_kn", wled_kn_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_WLED_KN),
+- GRP_G("p0led_kn", p0led_kn_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P0LED_KN),
+- GRP_G("p1led_kn", p1led_kn_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P1LED_KN),
+- GRP_G("p2led_kn", p2led_kn_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P2LED_KN),
+- GRP_G("p3led_kn", p3led_kn_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P3LED_KN),
+- GRP_G("p4led_kn", p4led_kn_grp_mt7628, MT7628_GPIO_MODE_MASK,
+- 1, MT7628_GPIO_MODE_P4LED_KN),
++#define MT76X8_GPIO_MODE_MASK 0x3
++
++#define MT76X8_GPIO_MODE_P4LED_KN 58
++#define MT76X8_GPIO_MODE_P3LED_KN 56
++#define MT76X8_GPIO_MODE_P2LED_KN 54
++#define MT76X8_GPIO_MODE_P1LED_KN 52
++#define MT76X8_GPIO_MODE_P0LED_KN 50
++#define MT76X8_GPIO_MODE_WLED_KN 48
++#define MT76X8_GPIO_MODE_P4LED_AN 42
++#define MT76X8_GPIO_MODE_P3LED_AN 40
++#define MT76X8_GPIO_MODE_P2LED_AN 38
++#define MT76X8_GPIO_MODE_P1LED_AN 36
++#define MT76X8_GPIO_MODE_P0LED_AN 34
++#define MT76X8_GPIO_MODE_WLED_AN 32
++#define MT76X8_GPIO_MODE_PWM1 30
++#define MT76X8_GPIO_MODE_PWM0 28
++#define MT76X8_GPIO_MODE_UART2 26
++#define MT76X8_GPIO_MODE_UART1 24
++#define MT76X8_GPIO_MODE_I2C 20
++#define MT76X8_GPIO_MODE_REFCLK 18
++#define MT76X8_GPIO_MODE_PERST 16
++#define MT76X8_GPIO_MODE_WDT 14
++#define MT76X8_GPIO_MODE_SPI 12
++#define MT76X8_GPIO_MODE_SDMODE 10
++#define MT76X8_GPIO_MODE_UART0 8
++#define MT76X8_GPIO_MODE_I2S 6
++#define MT76X8_GPIO_MODE_CS1 4
++#define MT76X8_GPIO_MODE_SPIS 2
++#define MT76X8_GPIO_MODE_GPIO 0
++
++static struct ralink_pmx_group mt76x8_pinmux_data[] = {
++ GRP_G("pwm1", pwm1_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_PWM1),
++ GRP_G("pwm0", pwm0_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_PWM0),
++ GRP_G("uart2", uart2_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_UART2),
++ GRP_G("uart1", uart1_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_UART1),
++ GRP_G("i2c", i2c_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_I2C),
++ GRP("refclk", refclk_grp_mt76x8, 1, MT76X8_GPIO_MODE_REFCLK),
++ GRP("perst", perst_grp_mt76x8, 1, MT76X8_GPIO_MODE_PERST),
++ GRP("wdt", wdt_grp_mt76x8, 1, MT76X8_GPIO_MODE_WDT),
++ GRP("spi", spi_grp_mt76x8, 1, MT76X8_GPIO_MODE_SPI),
++ GRP_G("sdmode", sd_mode_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_SDMODE),
++ GRP_G("uart0", uart0_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_UART0),
++ GRP_G("i2s", i2s_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_I2S),
++ GRP_G("spi cs1", spi_cs1_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_CS1),
++ GRP_G("spis", spis_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_SPIS),
++ GRP_G("gpio", gpio_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_GPIO),
++ GRP_G("wled_an", wled_an_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_WLED_AN),
++ GRP_G("p0led_an", p0led_an_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P0LED_AN),
++ GRP_G("p1led_an", p1led_an_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P1LED_AN),
++ GRP_G("p2led_an", p2led_an_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P2LED_AN),
++ GRP_G("p3led_an", p3led_an_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P3LED_AN),
++ GRP_G("p4led_an", p4led_an_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P4LED_AN),
++ GRP_G("wled_kn", wled_kn_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_WLED_KN),
++ GRP_G("p0led_kn", p0led_kn_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P0LED_KN),
++ GRP_G("p1led_kn", p1led_kn_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P1LED_KN),
++ GRP_G("p2led_kn", p2led_kn_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P2LED_KN),
++ GRP_G("p3led_kn", p3led_kn_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P3LED_KN),
++ GRP_G("p4led_kn", p4led_kn_grp_mt76x8, MT76X8_GPIO_MODE_MASK,
++ 1, MT76X8_GPIO_MODE_P4LED_KN),
+ { 0 }
+ };
+
+ static int mt7620_pinmux_probe(struct platform_device *pdev)
+ {
+ if (is_mt76x8())
+- return rt2880_pinmux_init(pdev, mt7628an_pinmux_data);
++ return ralink_pinmux_init(pdev, mt76x8_pinmux_data);
+ else
+- return rt2880_pinmux_init(pdev, mt7620a_pinmux_data);
++ return ralink_pinmux_init(pdev, mt7620a_pinmux_data);
+ }
+
+ static const struct of_device_id mt7620_pinmux_match[] = {
+diff --git a/drivers/pinctrl/ralink/pinctrl-mt7621.c b/drivers/pinctrl/ralink/pinctrl-mt7621.c
+index 7d96144c474e7..14b89cb43d4cb 100644
+--- a/drivers/pinctrl/ralink/pinctrl-mt7621.c
++++ b/drivers/pinctrl/ralink/pinctrl-mt7621.c
+@@ -3,7 +3,7 @@
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include <linux/of.h>
+-#include "pinmux.h"
++#include "pinctrl-ralink.h"
+
+ #define MT7621_GPIO_MODE_UART1 1
+ #define MT7621_GPIO_MODE_I2C 2
+@@ -34,40 +34,40 @@
+ #define MT7621_GPIO_MODE_SDHCI_SHIFT 18
+ #define MT7621_GPIO_MODE_SDHCI_GPIO 1
+
+-static struct rt2880_pmx_func uart1_grp[] = { FUNC("uart1", 0, 1, 2) };
+-static struct rt2880_pmx_func i2c_grp[] = { FUNC("i2c", 0, 3, 2) };
+-static struct rt2880_pmx_func uart3_grp[] = {
++static struct ralink_pmx_func uart1_grp[] = { FUNC("uart1", 0, 1, 2) };
++static struct ralink_pmx_func i2c_grp[] = { FUNC("i2c", 0, 3, 2) };
++static struct ralink_pmx_func uart3_grp[] = {
+ FUNC("uart3", 0, 5, 4),
+ FUNC("i2s", 2, 5, 4),
+ FUNC("spdif3", 3, 5, 4),
+ };
+-static struct rt2880_pmx_func uart2_grp[] = {
++static struct ralink_pmx_func uart2_grp[] = {
+ FUNC("uart2", 0, 9, 4),
+ FUNC("pcm", 2, 9, 4),
+ FUNC("spdif2", 3, 9, 4),
+ };
+-static struct rt2880_pmx_func jtag_grp[] = { FUNC("jtag", 0, 13, 5) };
+-static struct rt2880_pmx_func wdt_grp[] = {
++static struct ralink_pmx_func jtag_grp[] = { FUNC("jtag", 0, 13, 5) };
++static struct ralink_pmx_func wdt_grp[] = {
+ FUNC("wdt rst", 0, 18, 1),
+ FUNC("wdt refclk", 2, 18, 1),
+ };
+-static struct rt2880_pmx_func pcie_rst_grp[] = {
++static struct ralink_pmx_func pcie_rst_grp[] = {
+ FUNC("pcie rst", MT7621_GPIO_MODE_PCIE_RST, 19, 1),
+ FUNC("pcie refclk", MT7621_GPIO_MODE_PCIE_REF, 19, 1)
+ };
+-static struct rt2880_pmx_func mdio_grp[] = { FUNC("mdio", 0, 20, 2) };
+-static struct rt2880_pmx_func rgmii2_grp[] = { FUNC("rgmii2", 0, 22, 12) };
+-static struct rt2880_pmx_func spi_grp[] = {
++static struct ralink_pmx_func mdio_grp[] = { FUNC("mdio", 0, 20, 2) };
++static struct ralink_pmx_func rgmii2_grp[] = { FUNC("rgmii2", 0, 22, 12) };
++static struct ralink_pmx_func spi_grp[] = {
+ FUNC("spi", 0, 34, 7),
+ FUNC("nand1", 2, 34, 7),
+ };
+-static struct rt2880_pmx_func sdhci_grp[] = {
++static struct ralink_pmx_func sdhci_grp[] = {
+ FUNC("sdhci", 0, 41, 8),
+ FUNC("nand2", 2, 41, 8),
+ };
+-static struct rt2880_pmx_func rgmii1_grp[] = { FUNC("rgmii1", 0, 49, 12) };
++static struct ralink_pmx_func rgmii1_grp[] = { FUNC("rgmii1", 0, 49, 12) };
+
+-static struct rt2880_pmx_group mt7621_pinmux_data[] = {
++static struct ralink_pmx_group mt7621_pinmux_data[] = {
+ GRP("uart1", uart1_grp, 1, MT7621_GPIO_MODE_UART1),
+ GRP("i2c", i2c_grp, 1, MT7621_GPIO_MODE_I2C),
+ GRP_G("uart3", uart3_grp, MT7621_GPIO_MODE_UART3_MASK,
+@@ -92,7 +92,7 @@ static struct rt2880_pmx_group mt7621_pinmux_data[] = {
+
+ static int mt7621_pinmux_probe(struct platform_device *pdev)
+ {
+- return rt2880_pinmux_init(pdev, mt7621_pinmux_data);
++ return ralink_pinmux_init(pdev, mt7621_pinmux_data);
+ }
+
+ static const struct of_device_id mt7621_pinmux_match[] = {
+diff --git a/drivers/pinctrl/ralink/pinctrl-ralink.c b/drivers/pinctrl/ralink/pinctrl-ralink.c
+new file mode 100644
+index 0000000000000..3a8268a43d74a
+--- /dev/null
++++ b/drivers/pinctrl/ralink/pinctrl-ralink.c
+@@ -0,0 +1,351 @@
++// SPDX-License-Identifier: GPL-2.0
++/*
++ * Copyright (C) 2013 John Crispin <blogic@openwrt.org>
++ */
++
++#include <linux/module.h>
++#include <linux/device.h>
++#include <linux/io.h>
++#include <linux/platform_device.h>
++#include <linux/slab.h>
++#include <linux/of.h>
++#include <linux/pinctrl/pinctrl.h>
++#include <linux/pinctrl/pinconf.h>
++#include <linux/pinctrl/pinconf-generic.h>
++#include <linux/pinctrl/pinmux.h>
++#include <linux/pinctrl/consumer.h>
++#include <linux/pinctrl/machine.h>
++
++#include <asm/mach-ralink/ralink_regs.h>
++#include <asm/mach-ralink/mt7620.h>
++
++#include "pinctrl-ralink.h"
++#include "../core.h"
++#include "../pinctrl-utils.h"
++
++#define SYSC_REG_GPIO_MODE 0x60
++#define SYSC_REG_GPIO_MODE2 0x64
++
++struct ralink_priv {
++ struct device *dev;
++
++ struct pinctrl_pin_desc *pads;
++ struct pinctrl_desc *desc;
++
++ struct ralink_pmx_func **func;
++ int func_count;
++
++ struct ralink_pmx_group *groups;
++ const char **group_names;
++ int group_count;
++
++ u8 *gpio;
++ int max_pins;
++};
++
++static int ralink_get_group_count(struct pinctrl_dev *pctrldev)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++
++ return p->group_count;
++}
++
++static const char *ralink_get_group_name(struct pinctrl_dev *pctrldev,
++ unsigned int group)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++
++ return (group >= p->group_count) ? NULL : p->group_names[group];
++}
++
++static int ralink_get_group_pins(struct pinctrl_dev *pctrldev,
++ unsigned int group,
++ const unsigned int **pins,
++ unsigned int *num_pins)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++
++ if (group >= p->group_count)
++ return -EINVAL;
++
++ *pins = p->groups[group].func[0].pins;
++ *num_pins = p->groups[group].func[0].pin_count;
++
++ return 0;
++}
++
++static const struct pinctrl_ops ralink_pctrl_ops = {
++ .get_groups_count = ralink_get_group_count,
++ .get_group_name = ralink_get_group_name,
++ .get_group_pins = ralink_get_group_pins,
++ .dt_node_to_map = pinconf_generic_dt_node_to_map_all,
++ .dt_free_map = pinconf_generic_dt_free_map,
++};
++
++static int ralink_pmx_func_count(struct pinctrl_dev *pctrldev)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++
++ return p->func_count;
++}
++
++static const char *ralink_pmx_func_name(struct pinctrl_dev *pctrldev,
++ unsigned int func)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++
++ return p->func[func]->name;
++}
++
++static int ralink_pmx_group_get_groups(struct pinctrl_dev *pctrldev,
++ unsigned int func,
++ const char * const **groups,
++ unsigned int * const num_groups)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++
++ if (p->func[func]->group_count == 1)
++ *groups = &p->group_names[p->func[func]->groups[0]];
++ else
++ *groups = p->group_names;
++
++ *num_groups = p->func[func]->group_count;
++
++ return 0;
++}
++
++static int ralink_pmx_group_enable(struct pinctrl_dev *pctrldev,
++ unsigned int func, unsigned int group)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++ u32 mode = 0;
++ u32 reg = SYSC_REG_GPIO_MODE;
++ int i;
++ int shift;
++
++ /* dont allow double use */
++ if (p->groups[group].enabled) {
++ dev_err(p->dev, "%s is already enabled\n",
++ p->groups[group].name);
++ return 0;
++ }
++
++ p->groups[group].enabled = 1;
++ p->func[func]->enabled = 1;
++
++ shift = p->groups[group].shift;
++ if (shift >= 32) {
++ shift -= 32;
++ reg = SYSC_REG_GPIO_MODE2;
++ }
++ mode = rt_sysc_r32(reg);
++ mode &= ~(p->groups[group].mask << shift);
++
++ /* mark the pins as gpio */
++ for (i = 0; i < p->groups[group].func[0].pin_count; i++)
++ p->gpio[p->groups[group].func[0].pins[i]] = 1;
++
++ /* function 0 is gpio and needs special handling */
++ if (func == 0) {
++ mode |= p->groups[group].gpio << shift;
++ } else {
++ for (i = 0; i < p->func[func]->pin_count; i++)
++ p->gpio[p->func[func]->pins[i]] = 0;
++ mode |= p->func[func]->value << shift;
++ }
++ rt_sysc_w32(mode, reg);
++
++ return 0;
++}
++
++static int ralink_pmx_group_gpio_request_enable(struct pinctrl_dev *pctrldev,
++ struct pinctrl_gpio_range *range,
++ unsigned int pin)
++{
++ struct ralink_priv *p = pinctrl_dev_get_drvdata(pctrldev);
++
++ if (!p->gpio[pin]) {
++ dev_err(p->dev, "pin %d is not set to gpio mux\n", pin);
++ return -EINVAL;
++ }
++
++ return 0;
++}
++
++static const struct pinmux_ops ralink_pmx_group_ops = {
++ .get_functions_count = ralink_pmx_func_count,
++ .get_function_name = ralink_pmx_func_name,
++ .get_function_groups = ralink_pmx_group_get_groups,
++ .set_mux = ralink_pmx_group_enable,
++ .gpio_request_enable = ralink_pmx_group_gpio_request_enable,
++};
++
++static struct pinctrl_desc ralink_pctrl_desc = {
++ .owner = THIS_MODULE,
++ .name = "ralink-pinmux",
++ .pctlops = &ralink_pctrl_ops,
++ .pmxops = &ralink_pmx_group_ops,
++};
++
++static struct ralink_pmx_func gpio_func = {
++ .name = "gpio",
++};
++
++static int ralink_pinmux_index(struct ralink_priv *p)
++{
++ struct ralink_pmx_group *mux = p->groups;
++ int i, j, c = 0;
++
++ /* count the mux functions */
++ while (mux->name) {
++ p->group_count++;
++ mux++;
++ }
++
++ /* allocate the group names array needed by the gpio function */
++ p->group_names = devm_kcalloc(p->dev, p->group_count,
++ sizeof(char *), GFP_KERNEL);
++ if (!p->group_names)
++ return -ENOMEM;
++
++ for (i = 0; i < p->group_count; i++) {
++ p->group_names[i] = p->groups[i].name;
++ p->func_count += p->groups[i].func_count;
++ }
++
++ /* we have a dummy function[0] for gpio */
++ p->func_count++;
++
++ /* allocate our function and group mapping index buffers */
++ p->func = devm_kcalloc(p->dev, p->func_count,
++ sizeof(*p->func), GFP_KERNEL);
++ gpio_func.groups = devm_kcalloc(p->dev, p->group_count, sizeof(int),
++ GFP_KERNEL);
++ if (!p->func || !gpio_func.groups)
++ return -ENOMEM;
++
++ /* add a backpointer to the function so it knows its group */
++ gpio_func.group_count = p->group_count;
++ for (i = 0; i < gpio_func.group_count; i++)
++ gpio_func.groups[i] = i;
++
++ p->func[c] = &gpio_func;
++ c++;
++
++ /* add remaining functions */
++ for (i = 0; i < p->group_count; i++) {
++ for (j = 0; j < p->groups[i].func_count; j++) {
++ p->func[c] = &p->groups[i].func[j];
++ p->func[c]->groups = devm_kzalloc(p->dev, sizeof(int),
++ GFP_KERNEL);
++ if (!p->func[c]->groups)
++ return -ENOMEM;
++ p->func[c]->groups[0] = i;
++ p->func[c]->group_count = 1;
++ c++;
++ }
++ }
++ return 0;
++}
++
++static int ralink_pinmux_pins(struct ralink_priv *p)
++{
++ int i, j;
++
++ /*
++ * loop over the functions and initialize the pins array.
++ * also work out the highest pin used.
++ */
++ for (i = 0; i < p->func_count; i++) {
++ int pin;
++
++ if (!p->func[i]->pin_count)
++ continue;
++
++ p->func[i]->pins = devm_kcalloc(p->dev,
++ p->func[i]->pin_count,
++ sizeof(int),
++ GFP_KERNEL);
++ if (!p->func[i]->pins)
++ return -ENOMEM;
++ for (j = 0; j < p->func[i]->pin_count; j++)
++ p->func[i]->pins[j] = p->func[i]->pin_first + j;
++
++ pin = p->func[i]->pin_first + p->func[i]->pin_count;
++ if (pin > p->max_pins)
++ p->max_pins = pin;
++ }
++
++ /* the buffer that tells us which pins are gpio */
++ p->gpio = devm_kcalloc(p->dev, p->max_pins, sizeof(u8), GFP_KERNEL);
++ /* the pads needed to tell pinctrl about our pins */
++ p->pads = devm_kcalloc(p->dev, p->max_pins,
++ sizeof(struct pinctrl_pin_desc), GFP_KERNEL);
++ if (!p->pads || !p->gpio)
++ return -ENOMEM;
++
++ memset(p->gpio, 1, sizeof(u8) * p->max_pins);
++ for (i = 0; i < p->func_count; i++) {
++ if (!p->func[i]->pin_count)
++ continue;
++
++ for (j = 0; j < p->func[i]->pin_count; j++)
++ p->gpio[p->func[i]->pins[j]] = 0;
++ }
++
++ /* pin 0 is always a gpio */
++ p->gpio[0] = 1;
++
++ /* set the pads */
++ for (i = 0; i < p->max_pins; i++) {
++ /* strlen("ioXY") + 1 = 5 */
++ char *name = devm_kzalloc(p->dev, 5, GFP_KERNEL);
++
++ if (!name)
++ return -ENOMEM;
++ snprintf(name, 5, "io%d", i);
++ p->pads[i].number = i;
++ p->pads[i].name = name;
++ }
++ p->desc->pins = p->pads;
++ p->desc->npins = p->max_pins;
++
++ return 0;
++}
++
++int ralink_pinmux_init(struct platform_device *pdev,
++ struct ralink_pmx_group *data)
++{
++ struct ralink_priv *p;
++ struct pinctrl_dev *dev;
++ int err;
++
++ if (!data)
++ return -ENOTSUPP;
++
++ /* setup the private data */
++ p = devm_kzalloc(&pdev->dev, sizeof(struct ralink_priv), GFP_KERNEL);
++ if (!p)
++ return -ENOMEM;
++
++ p->dev = &pdev->dev;
++ p->desc = &ralink_pctrl_desc;
++ p->groups = data;
++ platform_set_drvdata(pdev, p);
++
++ /* init the device */
++ err = ralink_pinmux_index(p);
++ if (err) {
++ dev_err(&pdev->dev, "failed to load index\n");
++ return err;
++ }
++
++ err = ralink_pinmux_pins(p);
++ if (err) {
++ dev_err(&pdev->dev, "failed to load pins\n");
++ return err;
++ }
++ dev = pinctrl_register(p->desc, &pdev->dev, p);
++
++ return PTR_ERR_OR_ZERO(dev);
++}
+diff --git a/drivers/pinctrl/ralink/pinctrl-ralink.h b/drivers/pinctrl/ralink/pinctrl-ralink.h
+new file mode 100644
+index 0000000000000..1349694095852
+--- /dev/null
++++ b/drivers/pinctrl/ralink/pinctrl-ralink.h
+@@ -0,0 +1,53 @@
++/* SPDX-License-Identifier: GPL-2.0-only */
++/*
++ * Copyright (C) 2012 John Crispin <john@phrozen.org>
++ */
++
++#ifndef _PINCTRL_RALINK_H__
++#define _PINCTRL_RALINK_H__
++
++#define FUNC(name, value, pin_first, pin_count) \
++ { name, value, pin_first, pin_count }
++
++#define GRP(_name, _func, _mask, _shift) \
++ { .name = _name, .mask = _mask, .shift = _shift, \
++ .func = _func, .gpio = _mask, \
++ .func_count = ARRAY_SIZE(_func) }
++
++#define GRP_G(_name, _func, _mask, _gpio, _shift) \
++ { .name = _name, .mask = _mask, .shift = _shift, \
++ .func = _func, .gpio = _gpio, \
++ .func_count = ARRAY_SIZE(_func) }
++
++struct ralink_pmx_group;
++
++struct ralink_pmx_func {
++ const char *name;
++ const char value;
++
++ int pin_first;
++ int pin_count;
++ int *pins;
++
++ int *groups;
++ int group_count;
++
++ int enabled;
++};
++
++struct ralink_pmx_group {
++ const char *name;
++ int enabled;
++
++ const u32 shift;
++ const char mask;
++ const char gpio;
++
++ struct ralink_pmx_func *func;
++ int func_count;
++};
++
++int ralink_pinmux_init(struct platform_device *pdev,
++ struct ralink_pmx_group *data);
++
++#endif
+diff --git a/drivers/pinctrl/ralink/pinctrl-rt2880.c b/drivers/pinctrl/ralink/pinctrl-rt2880.c
+deleted file mode 100644
+index 96fc06d1b8b92..0000000000000
+--- a/drivers/pinctrl/ralink/pinctrl-rt2880.c
++++ /dev/null
+@@ -1,349 +0,0 @@
+-// SPDX-License-Identifier: GPL-2.0
+-/*
+- * Copyright (C) 2013 John Crispin <blogic@openwrt.org>
+- */
+-
+-#include <linux/module.h>
+-#include <linux/device.h>
+-#include <linux/io.h>
+-#include <linux/platform_device.h>
+-#include <linux/slab.h>
+-#include <linux/of.h>
+-#include <linux/pinctrl/pinctrl.h>
+-#include <linux/pinctrl/pinconf.h>
+-#include <linux/pinctrl/pinconf-generic.h>
+-#include <linux/pinctrl/pinmux.h>
+-#include <linux/pinctrl/consumer.h>
+-#include <linux/pinctrl/machine.h>
+-
+-#include <asm/mach-ralink/ralink_regs.h>
+-#include <asm/mach-ralink/mt7620.h>
+-
+-#include "pinmux.h"
+-#include "../core.h"
+-#include "../pinctrl-utils.h"
+-
+-#define SYSC_REG_GPIO_MODE 0x60
+-#define SYSC_REG_GPIO_MODE2 0x64
+-
+-struct rt2880_priv {
+- struct device *dev;
+-
+- struct pinctrl_pin_desc *pads;
+- struct pinctrl_desc *desc;
+-
+- struct rt2880_pmx_func **func;
+- int func_count;
+-
+- struct rt2880_pmx_group *groups;
+- const char **group_names;
+- int group_count;
+-
+- u8 *gpio;
+- int max_pins;
+-};
+-
+-static int rt2880_get_group_count(struct pinctrl_dev *pctrldev)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+-
+- return p->group_count;
+-}
+-
+-static const char *rt2880_get_group_name(struct pinctrl_dev *pctrldev,
+- unsigned int group)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+-
+- return (group >= p->group_count) ? NULL : p->group_names[group];
+-}
+-
+-static int rt2880_get_group_pins(struct pinctrl_dev *pctrldev,
+- unsigned int group,
+- const unsigned int **pins,
+- unsigned int *num_pins)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+-
+- if (group >= p->group_count)
+- return -EINVAL;
+-
+- *pins = p->groups[group].func[0].pins;
+- *num_pins = p->groups[group].func[0].pin_count;
+-
+- return 0;
+-}
+-
+-static const struct pinctrl_ops rt2880_pctrl_ops = {
+- .get_groups_count = rt2880_get_group_count,
+- .get_group_name = rt2880_get_group_name,
+- .get_group_pins = rt2880_get_group_pins,
+- .dt_node_to_map = pinconf_generic_dt_node_to_map_all,
+- .dt_free_map = pinconf_generic_dt_free_map,
+-};
+-
+-static int rt2880_pmx_func_count(struct pinctrl_dev *pctrldev)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+-
+- return p->func_count;
+-}
+-
+-static const char *rt2880_pmx_func_name(struct pinctrl_dev *pctrldev,
+- unsigned int func)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+-
+- return p->func[func]->name;
+-}
+-
+-static int rt2880_pmx_group_get_groups(struct pinctrl_dev *pctrldev,
+- unsigned int func,
+- const char * const **groups,
+- unsigned int * const num_groups)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+-
+- if (p->func[func]->group_count == 1)
+- *groups = &p->group_names[p->func[func]->groups[0]];
+- else
+- *groups = p->group_names;
+-
+- *num_groups = p->func[func]->group_count;
+-
+- return 0;
+-}
+-
+-static int rt2880_pmx_group_enable(struct pinctrl_dev *pctrldev,
+- unsigned int func, unsigned int group)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+- u32 mode = 0;
+- u32 reg = SYSC_REG_GPIO_MODE;
+- int i;
+- int shift;
+-
+- /* dont allow double use */
+- if (p->groups[group].enabled) {
+- dev_err(p->dev, "%s is already enabled\n",
+- p->groups[group].name);
+- return 0;
+- }
+-
+- p->groups[group].enabled = 1;
+- p->func[func]->enabled = 1;
+-
+- shift = p->groups[group].shift;
+- if (shift >= 32) {
+- shift -= 32;
+- reg = SYSC_REG_GPIO_MODE2;
+- }
+- mode = rt_sysc_r32(reg);
+- mode &= ~(p->groups[group].mask << shift);
+-
+- /* mark the pins as gpio */
+- for (i = 0; i < p->groups[group].func[0].pin_count; i++)
+- p->gpio[p->groups[group].func[0].pins[i]] = 1;
+-
+- /* function 0 is gpio and needs special handling */
+- if (func == 0) {
+- mode |= p->groups[group].gpio << shift;
+- } else {
+- for (i = 0; i < p->func[func]->pin_count; i++)
+- p->gpio[p->func[func]->pins[i]] = 0;
+- mode |= p->func[func]->value << shift;
+- }
+- rt_sysc_w32(mode, reg);
+-
+- return 0;
+-}
+-
+-static int rt2880_pmx_group_gpio_request_enable(struct pinctrl_dev *pctrldev,
+- struct pinctrl_gpio_range *range,
+- unsigned int pin)
+-{
+- struct rt2880_priv *p = pinctrl_dev_get_drvdata(pctrldev);
+-
+- if (!p->gpio[pin]) {
+- dev_err(p->dev, "pin %d is not set to gpio mux\n", pin);
+- return -EINVAL;
+- }
+-
+- return 0;
+-}
+-
+-static const struct pinmux_ops rt2880_pmx_group_ops = {
+- .get_functions_count = rt2880_pmx_func_count,
+- .get_function_name = rt2880_pmx_func_name,
+- .get_function_groups = rt2880_pmx_group_get_groups,
+- .set_mux = rt2880_pmx_group_enable,
+- .gpio_request_enable = rt2880_pmx_group_gpio_request_enable,
+-};
+-
+-static struct pinctrl_desc rt2880_pctrl_desc = {
+- .owner = THIS_MODULE,
+- .name = "rt2880-pinmux",
+- .pctlops = &rt2880_pctrl_ops,
+- .pmxops = &rt2880_pmx_group_ops,
+-};
+-
+-static struct rt2880_pmx_func gpio_func = {
+- .name = "gpio",
+-};
+-
+-static int rt2880_pinmux_index(struct rt2880_priv *p)
+-{
+- struct rt2880_pmx_group *mux = p->groups;
+- int i, j, c = 0;
+-
+- /* count the mux functions */
+- while (mux->name) {
+- p->group_count++;
+- mux++;
+- }
+-
+- /* allocate the group names array needed by the gpio function */
+- p->group_names = devm_kcalloc(p->dev, p->group_count,
+- sizeof(char *), GFP_KERNEL);
+- if (!p->group_names)
+- return -ENOMEM;
+-
+- for (i = 0; i < p->group_count; i++) {
+- p->group_names[i] = p->groups[i].name;
+- p->func_count += p->groups[i].func_count;
+- }
+-
+- /* we have a dummy function[0] for gpio */
+- p->func_count++;
+-
+- /* allocate our function and group mapping index buffers */
+- p->func = devm_kcalloc(p->dev, p->func_count,
+- sizeof(*p->func), GFP_KERNEL);
+- gpio_func.groups = devm_kcalloc(p->dev, p->group_count, sizeof(int),
+- GFP_KERNEL);
+- if (!p->func || !gpio_func.groups)
+- return -ENOMEM;
+-
+- /* add a backpointer to the function so it knows its group */
+- gpio_func.group_count = p->group_count;
+- for (i = 0; i < gpio_func.group_count; i++)
+- gpio_func.groups[i] = i;
+-
+- p->func[c] = &gpio_func;
+- c++;
+-
+- /* add remaining functions */
+- for (i = 0; i < p->group_count; i++) {
+- for (j = 0; j < p->groups[i].func_count; j++) {
+- p->func[c] = &p->groups[i].func[j];
+- p->func[c]->groups = devm_kzalloc(p->dev, sizeof(int),
+- GFP_KERNEL);
+- if (!p->func[c]->groups)
+- return -ENOMEM;
+- p->func[c]->groups[0] = i;
+- p->func[c]->group_count = 1;
+- c++;
+- }
+- }
+- return 0;
+-}
+-
+-static int rt2880_pinmux_pins(struct rt2880_priv *p)
+-{
+- int i, j;
+-
+- /*
+- * loop over the functions and initialize the pins array.
+- * also work out the highest pin used.
+- */
+- for (i = 0; i < p->func_count; i++) {
+- int pin;
+-
+- if (!p->func[i]->pin_count)
+- continue;
+-
+- p->func[i]->pins = devm_kcalloc(p->dev,
+- p->func[i]->pin_count,
+- sizeof(int),
+- GFP_KERNEL);
+- for (j = 0; j < p->func[i]->pin_count; j++)
+- p->func[i]->pins[j] = p->func[i]->pin_first + j;
+-
+- pin = p->func[i]->pin_first + p->func[i]->pin_count;
+- if (pin > p->max_pins)
+- p->max_pins = pin;
+- }
+-
+- /* the buffer that tells us which pins are gpio */
+- p->gpio = devm_kcalloc(p->dev, p->max_pins, sizeof(u8), GFP_KERNEL);
+- /* the pads needed to tell pinctrl about our pins */
+- p->pads = devm_kcalloc(p->dev, p->max_pins,
+- sizeof(struct pinctrl_pin_desc), GFP_KERNEL);
+- if (!p->pads || !p->gpio)
+- return -ENOMEM;
+-
+- memset(p->gpio, 1, sizeof(u8) * p->max_pins);
+- for (i = 0; i < p->func_count; i++) {
+- if (!p->func[i]->pin_count)
+- continue;
+-
+- for (j = 0; j < p->func[i]->pin_count; j++)
+- p->gpio[p->func[i]->pins[j]] = 0;
+- }
+-
+- /* pin 0 is always a gpio */
+- p->gpio[0] = 1;
+-
+- /* set the pads */
+- for (i = 0; i < p->max_pins; i++) {
+- /* strlen("ioXY") + 1 = 5 */
+- char *name = devm_kzalloc(p->dev, 5, GFP_KERNEL);
+-
+- if (!name)
+- return -ENOMEM;
+- snprintf(name, 5, "io%d", i);
+- p->pads[i].number = i;
+- p->pads[i].name = name;
+- }
+- p->desc->pins = p->pads;
+- p->desc->npins = p->max_pins;
+-
+- return 0;
+-}
+-
+-int rt2880_pinmux_init(struct platform_device *pdev,
+- struct rt2880_pmx_group *data)
+-{
+- struct rt2880_priv *p;
+- struct pinctrl_dev *dev;
+- int err;
+-
+- if (!data)
+- return -ENOTSUPP;
+-
+- /* setup the private data */
+- p = devm_kzalloc(&pdev->dev, sizeof(struct rt2880_priv), GFP_KERNEL);
+- if (!p)
+- return -ENOMEM;
+-
+- p->dev = &pdev->dev;
+- p->desc = &rt2880_pctrl_desc;
+- p->groups = data;
+- platform_set_drvdata(pdev, p);
+-
+- /* init the device */
+- err = rt2880_pinmux_index(p);
+- if (err) {
+- dev_err(&pdev->dev, "failed to load index\n");
+- return err;
+- }
+-
+- err = rt2880_pinmux_pins(p);
+- if (err) {
+- dev_err(&pdev->dev, "failed to load pins\n");
+- return err;
+- }
+- dev = pinctrl_register(p->desc, &pdev->dev, p);
+-
+- return PTR_ERR_OR_ZERO(dev);
+-}
+diff --git a/drivers/pinctrl/ralink/pinctrl-rt288x.c b/drivers/pinctrl/ralink/pinctrl-rt288x.c
+index 0744aebbace52..40c45140ff8a3 100644
+--- a/drivers/pinctrl/ralink/pinctrl-rt288x.c
++++ b/drivers/pinctrl/ralink/pinctrl-rt288x.c
+@@ -4,7 +4,7 @@
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include <linux/of.h>
+-#include "pinmux.h"
++#include "pinctrl-ralink.h"
+
+ #define RT2880_GPIO_MODE_I2C BIT(0)
+ #define RT2880_GPIO_MODE_UART0 BIT(1)
+@@ -15,15 +15,15 @@
+ #define RT2880_GPIO_MODE_SDRAM BIT(6)
+ #define RT2880_GPIO_MODE_PCI BIT(7)
+
+-static struct rt2880_pmx_func i2c_func[] = { FUNC("i2c", 0, 1, 2) };
+-static struct rt2880_pmx_func spi_func[] = { FUNC("spi", 0, 3, 4) };
+-static struct rt2880_pmx_func uartlite_func[] = { FUNC("uartlite", 0, 7, 8) };
+-static struct rt2880_pmx_func jtag_func[] = { FUNC("jtag", 0, 17, 5) };
+-static struct rt2880_pmx_func mdio_func[] = { FUNC("mdio", 0, 22, 2) };
+-static struct rt2880_pmx_func sdram_func[] = { FUNC("sdram", 0, 24, 16) };
+-static struct rt2880_pmx_func pci_func[] = { FUNC("pci", 0, 40, 32) };
++static struct ralink_pmx_func i2c_func[] = { FUNC("i2c", 0, 1, 2) };
++static struct ralink_pmx_func spi_func[] = { FUNC("spi", 0, 3, 4) };
++static struct ralink_pmx_func uartlite_func[] = { FUNC("uartlite", 0, 7, 8) };
++static struct ralink_pmx_func jtag_func[] = { FUNC("jtag", 0, 17, 5) };
++static struct ralink_pmx_func mdio_func[] = { FUNC("mdio", 0, 22, 2) };
++static struct ralink_pmx_func sdram_func[] = { FUNC("sdram", 0, 24, 16) };
++static struct ralink_pmx_func pci_func[] = { FUNC("pci", 0, 40, 32) };
+
+-static struct rt2880_pmx_group rt2880_pinmux_data_act[] = {
++static struct ralink_pmx_group rt2880_pinmux_data_act[] = {
+ GRP("i2c", i2c_func, 1, RT2880_GPIO_MODE_I2C),
+ GRP("spi", spi_func, 1, RT2880_GPIO_MODE_SPI),
+ GRP("uartlite", uartlite_func, 1, RT2880_GPIO_MODE_UART0),
+@@ -36,7 +36,7 @@ static struct rt2880_pmx_group rt2880_pinmux_data_act[] = {
+
+ static int rt288x_pinmux_probe(struct platform_device *pdev)
+ {
+- return rt2880_pinmux_init(pdev, rt2880_pinmux_data_act);
++ return ralink_pinmux_init(pdev, rt2880_pinmux_data_act);
+ }
+
+ static const struct of_device_id rt288x_pinmux_match[] = {
+diff --git a/drivers/pinctrl/ralink/pinctrl-rt305x.c b/drivers/pinctrl/ralink/pinctrl-rt305x.c
+index 5d8fa156c0037..25527ca1ccaae 100644
+--- a/drivers/pinctrl/ralink/pinctrl-rt305x.c
++++ b/drivers/pinctrl/ralink/pinctrl-rt305x.c
+@@ -5,7 +5,7 @@
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include <linux/of.h>
+-#include "pinmux.h"
++#include "pinctrl-ralink.h"
+
+ #define RT305X_GPIO_MODE_UART0_SHIFT 2
+ #define RT305X_GPIO_MODE_UART0_MASK 0x7
+@@ -31,9 +31,9 @@
+ #define RT3352_GPIO_MODE_LNA 18
+ #define RT3352_GPIO_MODE_PA 20
+
+-static struct rt2880_pmx_func i2c_func[] = { FUNC("i2c", 0, 1, 2) };
+-static struct rt2880_pmx_func spi_func[] = { FUNC("spi", 0, 3, 4) };
+-static struct rt2880_pmx_func uartf_func[] = {
++static struct ralink_pmx_func i2c_func[] = { FUNC("i2c", 0, 1, 2) };
++static struct ralink_pmx_func spi_func[] = { FUNC("spi", 0, 3, 4) };
++static struct ralink_pmx_func uartf_func[] = {
+ FUNC("uartf", RT305X_GPIO_MODE_UARTF, 7, 8),
+ FUNC("pcm uartf", RT305X_GPIO_MODE_PCM_UARTF, 7, 8),
+ FUNC("pcm i2s", RT305X_GPIO_MODE_PCM_I2S, 7, 8),
+@@ -42,28 +42,28 @@ static struct rt2880_pmx_func uartf_func[] = {
+ FUNC("gpio uartf", RT305X_GPIO_MODE_GPIO_UARTF, 7, 4),
+ FUNC("gpio i2s", RT305X_GPIO_MODE_GPIO_I2S, 7, 4),
+ };
+-static struct rt2880_pmx_func uartlite_func[] = { FUNC("uartlite", 0, 15, 2) };
+-static struct rt2880_pmx_func jtag_func[] = { FUNC("jtag", 0, 17, 5) };
+-static struct rt2880_pmx_func mdio_func[] = { FUNC("mdio", 0, 22, 2) };
+-static struct rt2880_pmx_func rt5350_led_func[] = { FUNC("led", 0, 22, 5) };
+-static struct rt2880_pmx_func rt5350_cs1_func[] = {
++static struct ralink_pmx_func uartlite_func[] = { FUNC("uartlite", 0, 15, 2) };
++static struct ralink_pmx_func jtag_func[] = { FUNC("jtag", 0, 17, 5) };
++static struct ralink_pmx_func mdio_func[] = { FUNC("mdio", 0, 22, 2) };
++static struct ralink_pmx_func rt5350_led_func[] = { FUNC("led", 0, 22, 5) };
++static struct ralink_pmx_func rt5350_cs1_func[] = {
+ FUNC("spi_cs1", 0, 27, 1),
+ FUNC("wdg_cs1", 1, 27, 1),
+ };
+-static struct rt2880_pmx_func sdram_func[] = { FUNC("sdram", 0, 24, 16) };
+-static struct rt2880_pmx_func rt3352_rgmii_func[] = {
++static struct ralink_pmx_func sdram_func[] = { FUNC("sdram", 0, 24, 16) };
++static struct ralink_pmx_func rt3352_rgmii_func[] = {
+ FUNC("rgmii", 0, 24, 12)
+ };
+-static struct rt2880_pmx_func rgmii_func[] = { FUNC("rgmii", 0, 40, 12) };
+-static struct rt2880_pmx_func rt3352_lna_func[] = { FUNC("lna", 0, 36, 2) };
+-static struct rt2880_pmx_func rt3352_pa_func[] = { FUNC("pa", 0, 38, 2) };
+-static struct rt2880_pmx_func rt3352_led_func[] = { FUNC("led", 0, 40, 5) };
+-static struct rt2880_pmx_func rt3352_cs1_func[] = {
++static struct ralink_pmx_func rgmii_func[] = { FUNC("rgmii", 0, 40, 12) };
++static struct ralink_pmx_func rt3352_lna_func[] = { FUNC("lna", 0, 36, 2) };
++static struct ralink_pmx_func rt3352_pa_func[] = { FUNC("pa", 0, 38, 2) };
++static struct ralink_pmx_func rt3352_led_func[] = { FUNC("led", 0, 40, 5) };
++static struct ralink_pmx_func rt3352_cs1_func[] = {
+ FUNC("spi_cs1", 0, 45, 1),
+ FUNC("wdg_cs1", 1, 45, 1),
+ };
+
+-static struct rt2880_pmx_group rt3050_pinmux_data[] = {
++static struct ralink_pmx_group rt3050_pinmux_data[] = {
+ GRP("i2c", i2c_func, 1, RT305X_GPIO_MODE_I2C),
+ GRP("spi", spi_func, 1, RT305X_GPIO_MODE_SPI),
+ GRP("uartf", uartf_func, RT305X_GPIO_MODE_UART0_MASK,
+@@ -76,7 +76,7 @@ static struct rt2880_pmx_group rt3050_pinmux_data[] = {
+ { 0 }
+ };
+
+-static struct rt2880_pmx_group rt3352_pinmux_data[] = {
++static struct ralink_pmx_group rt3352_pinmux_data[] = {
+ GRP("i2c", i2c_func, 1, RT305X_GPIO_MODE_I2C),
+ GRP("spi", spi_func, 1, RT305X_GPIO_MODE_SPI),
+ GRP("uartf", uartf_func, RT305X_GPIO_MODE_UART0_MASK,
+@@ -92,7 +92,7 @@ static struct rt2880_pmx_group rt3352_pinmux_data[] = {
+ { 0 }
+ };
+
+-static struct rt2880_pmx_group rt5350_pinmux_data[] = {
++static struct ralink_pmx_group rt5350_pinmux_data[] = {
+ GRP("i2c", i2c_func, 1, RT305X_GPIO_MODE_I2C),
+ GRP("spi", spi_func, 1, RT305X_GPIO_MODE_SPI),
+ GRP("uartf", uartf_func, RT305X_GPIO_MODE_UART0_MASK,
+@@ -107,11 +107,11 @@ static struct rt2880_pmx_group rt5350_pinmux_data[] = {
+ static int rt305x_pinmux_probe(struct platform_device *pdev)
+ {
+ if (soc_is_rt5350())
+- return rt2880_pinmux_init(pdev, rt5350_pinmux_data);
++ return ralink_pinmux_init(pdev, rt5350_pinmux_data);
+ else if (soc_is_rt305x() || soc_is_rt3350())
+- return rt2880_pinmux_init(pdev, rt3050_pinmux_data);
++ return ralink_pinmux_init(pdev, rt3050_pinmux_data);
+ else if (soc_is_rt3352())
+- return rt2880_pinmux_init(pdev, rt3352_pinmux_data);
++ return ralink_pinmux_init(pdev, rt3352_pinmux_data);
+ else
+ return -EINVAL;
+ }
+diff --git a/drivers/pinctrl/ralink/pinctrl-rt3883.c b/drivers/pinctrl/ralink/pinctrl-rt3883.c
+index 3e0e1b4caa647..0b8674dbe1880 100644
+--- a/drivers/pinctrl/ralink/pinctrl-rt3883.c
++++ b/drivers/pinctrl/ralink/pinctrl-rt3883.c
+@@ -3,7 +3,7 @@
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include <linux/of.h>
+-#include "pinmux.h"
++#include "pinctrl-ralink.h"
+
+ #define RT3883_GPIO_MODE_UART0_SHIFT 2
+ #define RT3883_GPIO_MODE_UART0_MASK 0x7
+@@ -39,9 +39,9 @@
+ #define RT3883_GPIO_MODE_LNA_G_GPIO 0x3
+ #define RT3883_GPIO_MODE_LNA_G _RT3883_GPIO_MODE_LNA_G(RT3883_GPIO_MODE_LNA_G_MASK)
+
+-static struct rt2880_pmx_func i2c_func[] = { FUNC("i2c", 0, 1, 2) };
+-static struct rt2880_pmx_func spi_func[] = { FUNC("spi", 0, 3, 4) };
+-static struct rt2880_pmx_func uartf_func[] = {
++static struct ralink_pmx_func i2c_func[] = { FUNC("i2c", 0, 1, 2) };
++static struct ralink_pmx_func spi_func[] = { FUNC("spi", 0, 3, 4) };
++static struct ralink_pmx_func uartf_func[] = {
+ FUNC("uartf", RT3883_GPIO_MODE_UARTF, 7, 8),
+ FUNC("pcm uartf", RT3883_GPIO_MODE_PCM_UARTF, 7, 8),
+ FUNC("pcm i2s", RT3883_GPIO_MODE_PCM_I2S, 7, 8),
+@@ -50,21 +50,21 @@ static struct rt2880_pmx_func uartf_func[] = {
+ FUNC("gpio uartf", RT3883_GPIO_MODE_GPIO_UARTF, 7, 4),
+ FUNC("gpio i2s", RT3883_GPIO_MODE_GPIO_I2S, 7, 4),
+ };
+-static struct rt2880_pmx_func uartlite_func[] = { FUNC("uartlite", 0, 15, 2) };
+-static struct rt2880_pmx_func jtag_func[] = { FUNC("jtag", 0, 17, 5) };
+-static struct rt2880_pmx_func mdio_func[] = { FUNC("mdio", 0, 22, 2) };
+-static struct rt2880_pmx_func lna_a_func[] = { FUNC("lna a", 0, 32, 3) };
+-static struct rt2880_pmx_func lna_g_func[] = { FUNC("lna g", 0, 35, 3) };
+-static struct rt2880_pmx_func pci_func[] = {
++static struct ralink_pmx_func uartlite_func[] = { FUNC("uartlite", 0, 15, 2) };
++static struct ralink_pmx_func jtag_func[] = { FUNC("jtag", 0, 17, 5) };
++static struct ralink_pmx_func mdio_func[] = { FUNC("mdio", 0, 22, 2) };
++static struct ralink_pmx_func lna_a_func[] = { FUNC("lna a", 0, 32, 3) };
++static struct ralink_pmx_func lna_g_func[] = { FUNC("lna g", 0, 35, 3) };
++static struct ralink_pmx_func pci_func[] = {
+ FUNC("pci-dev", 0, 40, 32),
+ FUNC("pci-host2", 1, 40, 32),
+ FUNC("pci-host1", 2, 40, 32),
+ FUNC("pci-fnc", 3, 40, 32)
+ };
+-static struct rt2880_pmx_func ge1_func[] = { FUNC("ge1", 0, 72, 12) };
+-static struct rt2880_pmx_func ge2_func[] = { FUNC("ge2", 0, 84, 12) };
++static struct ralink_pmx_func ge1_func[] = { FUNC("ge1", 0, 72, 12) };
++static struct ralink_pmx_func ge2_func[] = { FUNC("ge2", 0, 84, 12) };
+
+-static struct rt2880_pmx_group rt3883_pinmux_data[] = {
++static struct ralink_pmx_group rt3883_pinmux_data[] = {
+ GRP("i2c", i2c_func, 1, RT3883_GPIO_MODE_I2C),
+ GRP("spi", spi_func, 1, RT3883_GPIO_MODE_SPI),
+ GRP("uartf", uartf_func, RT3883_GPIO_MODE_UART0_MASK,
+@@ -83,7 +83,7 @@ static struct rt2880_pmx_group rt3883_pinmux_data[] = {
+
+ static int rt3883_pinmux_probe(struct platform_device *pdev)
+ {
+- return rt2880_pinmux_init(pdev, rt3883_pinmux_data);
++ return ralink_pinmux_init(pdev, rt3883_pinmux_data);
+ }
+
+ static const struct of_device_id rt3883_pinmux_match[] = {
+diff --git a/drivers/pinctrl/ralink/pinmux.h b/drivers/pinctrl/ralink/pinmux.h
+deleted file mode 100644
+index 0046abe3bcc79..0000000000000
+--- a/drivers/pinctrl/ralink/pinmux.h
++++ /dev/null
+@@ -1,53 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0-only */
+-/*
+- * Copyright (C) 2012 John Crispin <john@phrozen.org>
+- */
+-
+-#ifndef _RT288X_PINMUX_H__
+-#define _RT288X_PINMUX_H__
+-
+-#define FUNC(name, value, pin_first, pin_count) \
+- { name, value, pin_first, pin_count }
+-
+-#define GRP(_name, _func, _mask, _shift) \
+- { .name = _name, .mask = _mask, .shift = _shift, \
+- .func = _func, .gpio = _mask, \
+- .func_count = ARRAY_SIZE(_func) }
+-
+-#define GRP_G(_name, _func, _mask, _gpio, _shift) \
+- { .name = _name, .mask = _mask, .shift = _shift, \
+- .func = _func, .gpio = _gpio, \
+- .func_count = ARRAY_SIZE(_func) }
+-
+-struct rt2880_pmx_group;
+-
+-struct rt2880_pmx_func {
+- const char *name;
+- const char value;
+-
+- int pin_first;
+- int pin_count;
+- int *pins;
+-
+- int *groups;
+- int group_count;
+-
+- int enabled;
+-};
+-
+-struct rt2880_pmx_group {
+- const char *name;
+- int enabled;
+-
+- const u32 shift;
+- const char mask;
+- const char gpio;
+-
+- struct rt2880_pmx_func *func;
+- int func_count;
+-};
+-
+-int rt2880_pinmux_init(struct platform_device *pdev,
+- struct rt2880_pmx_group *data);
+-
+-#endif
+diff --git a/drivers/pinctrl/stm32/pinctrl-stm32.c b/drivers/pinctrl/stm32/pinctrl-stm32.c
+index f7c9459f66283..edd0d0af5c147 100644
+--- a/drivers/pinctrl/stm32/pinctrl-stm32.c
++++ b/drivers/pinctrl/stm32/pinctrl-stm32.c
+@@ -1299,15 +1299,17 @@ static int stm32_gpiolib_register_bank(struct stm32_pinctrl *pctl,
+ bank->bank_ioport_nr = bank_ioport_nr;
+ spin_lock_init(&bank->lock);
+
+- /* create irq hierarchical domain */
+- bank->fwnode = of_node_to_fwnode(np);
++ if (pctl->domain) {
++ /* create irq hierarchical domain */
++ bank->fwnode = of_node_to_fwnode(np);
+
+- bank->domain = irq_domain_create_hierarchy(pctl->domain, 0,
+- STM32_GPIO_IRQ_LINE, bank->fwnode,
+- &stm32_gpio_domain_ops, bank);
++ bank->domain = irq_domain_create_hierarchy(pctl->domain, 0, STM32_GPIO_IRQ_LINE,
++ bank->fwnode, &stm32_gpio_domain_ops,
++ bank);
+
+- if (!bank->domain)
+- return -ENODEV;
++ if (!bank->domain)
++ return -ENODEV;
++ }
+
+ err = gpiochip_add_data(&bank->gpio_chip, bank);
+ if (err) {
+@@ -1466,6 +1468,8 @@ int stm32_pctl_probe(struct platform_device *pdev)
+ pctl->domain = stm32_pctrl_get_irq_domain(np);
+ if (IS_ERR(pctl->domain))
+ return PTR_ERR(pctl->domain);
++ if (!pctl->domain)
++ dev_warn(dev, "pinctrl without interrupt support\n");
+
+ /* hwspinlock is optional */
+ hwlock_id = of_hwspin_lock_get_id(pdev->dev.of_node, 0);
+diff --git a/drivers/pinctrl/sunplus/sppctl.c b/drivers/pinctrl/sunplus/sppctl.c
+index 3ba47040ac423..2b3335ab56c66 100644
+--- a/drivers/pinctrl/sunplus/sppctl.c
++++ b/drivers/pinctrl/sunplus/sppctl.c
+@@ -871,6 +871,9 @@ static int sppctl_dt_node_to_map(struct pinctrl_dev *pctldev, struct device_node
+ }
+
+ *map = kcalloc(*num_maps + nmG, sizeof(**map), GFP_KERNEL);
++ if (*map == NULL)
++ return -ENOMEM;
++
+ for (i = 0; i < (*num_maps); i++) {
+ dt_pin = be32_to_cpu(list[i]);
+ pin_num = FIELD_GET(GENMASK(31, 24), dt_pin);
+diff --git a/drivers/power/reset/arm-versatile-reboot.c b/drivers/power/reset/arm-versatile-reboot.c
+index 08d0a07b58ef2..c7624d7611a7e 100644
+--- a/drivers/power/reset/arm-versatile-reboot.c
++++ b/drivers/power/reset/arm-versatile-reboot.c
+@@ -146,6 +146,7 @@ static int __init versatile_reboot_probe(void)
+ versatile_reboot_type = (enum versatile_reboot)reboot_id->data;
+
+ syscon_regmap = syscon_node_to_regmap(np);
++ of_node_put(np);
+ if (IS_ERR(syscon_regmap))
+ return PTR_ERR(syscon_regmap);
+
+diff --git a/drivers/power/supply/ab8500_fg.c b/drivers/power/supply/ab8500_fg.c
+index ec8a404d71b44..4339fa9ff0099 100644
+--- a/drivers/power/supply/ab8500_fg.c
++++ b/drivers/power/supply/ab8500_fg.c
+@@ -3148,6 +3148,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
+ ret = ab8500_fg_init_hw_registers(di);
+ if (ret) {
+ dev_err(dev, "failed to initialize registers\n");
++ destroy_workqueue(di->fg_wq);
+ return ret;
+ }
+
+@@ -3159,6 +3160,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
+ di->fg_psy = devm_power_supply_register(dev, &ab8500_fg_desc, &psy_cfg);
+ if (IS_ERR(di->fg_psy)) {
+ dev_err(dev, "failed to register FG psy\n");
++ destroy_workqueue(di->fg_wq);
+ return PTR_ERR(di->fg_psy);
+ }
+
+@@ -3174,8 +3176,10 @@ static int ab8500_fg_probe(struct platform_device *pdev)
+ /* Register primary interrupt handlers */
+ for (i = 0; i < ARRAY_SIZE(ab8500_fg_irq); i++) {
+ irq = platform_get_irq_byname(pdev, ab8500_fg_irq[i].name);
+- if (irq < 0)
++ if (irq < 0) {
++ destroy_workqueue(di->fg_wq);
+ return irq;
++ }
+
+ ret = devm_request_threaded_irq(dev, irq, NULL,
+ ab8500_fg_irq[i].isr,
+@@ -3185,6 +3189,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
+ if (ret != 0) {
+ dev_err(dev, "failed to request %s IRQ %d: %d\n",
+ ab8500_fg_irq[i].name, irq, ret);
++ destroy_workqueue(di->fg_wq);
+ return ret;
+ }
+ dev_dbg(dev, "Requested %s IRQ %d: %d\n",
+@@ -3200,6 +3205,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
+ ret = ab8500_fg_sysfs_init(di);
+ if (ret) {
+ dev_err(dev, "failed to create sysfs entry\n");
++ destroy_workqueue(di->fg_wq);
+ return ret;
+ }
+
+@@ -3207,6 +3213,7 @@ static int ab8500_fg_probe(struct platform_device *pdev)
+ if (ret) {
+ dev_err(dev, "failed to create FG psy\n");
+ ab8500_fg_sysfs_exit(di);
++ destroy_workqueue(di->fg_wq);
+ return ret;
+ }
+
+diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
+index 775c0bf2f923d..0933948d7df3d 100644
+--- a/drivers/spi/spi-bcm2835.c
++++ b/drivers/spi/spi-bcm2835.c
+@@ -1138,10 +1138,14 @@ static void bcm2835_spi_handle_err(struct spi_controller *ctlr,
+ struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr);
+
+ /* if an error occurred and we have an active dma, then terminate */
+- dmaengine_terminate_sync(ctlr->dma_tx);
+- bs->tx_dma_active = false;
+- dmaengine_terminate_sync(ctlr->dma_rx);
+- bs->rx_dma_active = false;
++ if (ctlr->dma_tx) {
++ dmaengine_terminate_sync(ctlr->dma_tx);
++ bs->tx_dma_active = false;
++ }
++ if (ctlr->dma_rx) {
++ dmaengine_terminate_sync(ctlr->dma_rx);
++ bs->rx_dma_active = false;
++ }
+ bcm2835_spi_undo_prologue(bs);
+
+ /* and reset */
+diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
+index 5b485cd96c931..5298a3a43bc71 100644
+--- a/fs/dlm/lock.c
++++ b/fs/dlm/lock.c
+@@ -4085,13 +4085,14 @@ static void send_repeat_remove(struct dlm_ls *ls, char *ms_name, int len)
+ rv = _create_message(ls, sizeof(struct dlm_message) + len,
+ dir_nodeid, DLM_MSG_REMOVE, &ms, &mh);
+ if (rv)
+- return;
++ goto out;
+
+ memcpy(ms->m_extra, name, len);
+ ms->m_hash = hash;
+
+ send_message(mh, ms);
+
++out:
+ spin_lock(&ls->ls_remove_spin);
+ ls->ls_remove_len = 0;
+ memset(ls->ls_remove_name, 0, DLM_RESNAME_MAXLEN);
+diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c
+index a02a04a993bfa..c6eaf7e9ea743 100644
+--- a/fs/exfat/namei.c
++++ b/fs/exfat/namei.c
+@@ -1080,6 +1080,7 @@ static int exfat_rename_file(struct inode *inode, struct exfat_chain *p_dir,
+
+ exfat_remove_entries(inode, p_dir, oldentry, 0,
+ num_old_entries);
++ ei->dir = *p_dir;
+ ei->entry = newentry;
+ } else {
+ if (exfat_get_entry_type(epold) == TYPE_FILE) {
+@@ -1167,28 +1168,6 @@ static int exfat_move_file(struct inode *inode, struct exfat_chain *p_olddir,
+ return 0;
+ }
+
+-static void exfat_update_parent_info(struct exfat_inode_info *ei,
+- struct inode *parent_inode)
+-{
+- struct exfat_sb_info *sbi = EXFAT_SB(parent_inode->i_sb);
+- struct exfat_inode_info *parent_ei = EXFAT_I(parent_inode);
+- loff_t parent_isize = i_size_read(parent_inode);
+-
+- /*
+- * the problem that struct exfat_inode_info caches wrong parent info.
+- *
+- * because of flag-mismatch of ei->dir,
+- * there is abnormal traversing cluster chain.
+- */
+- if (unlikely(parent_ei->flags != ei->dir.flags ||
+- parent_isize != EXFAT_CLU_TO_B(ei->dir.size, sbi) ||
+- parent_ei->start_clu != ei->dir.dir)) {
+- exfat_chain_set(&ei->dir, parent_ei->start_clu,
+- EXFAT_B_TO_CLU_ROUND_UP(parent_isize, sbi),
+- parent_ei->flags);
+- }
+-}
+-
+ /* rename or move a old file into a new file */
+ static int __exfat_rename(struct inode *old_parent_inode,
+ struct exfat_inode_info *ei, struct inode *new_parent_inode,
+@@ -1219,9 +1198,9 @@ static int __exfat_rename(struct inode *old_parent_inode,
+ return -ENOENT;
+ }
+
+- exfat_update_parent_info(ei, old_parent_inode);
+-
+- exfat_chain_dup(&olddir, &ei->dir);
++ exfat_chain_set(&olddir, EXFAT_I(old_parent_inode)->start_clu,
++ EXFAT_B_TO_CLU_ROUND_UP(i_size_read(old_parent_inode), sbi),
++ EXFAT_I(old_parent_inode)->flags);
+ dentry = ei->entry;
+
+ ep = exfat_get_dentry(sb, &olddir, dentry, &old_bh);
+@@ -1241,8 +1220,6 @@ static int __exfat_rename(struct inode *old_parent_inode,
+ goto out;
+ }
+
+- exfat_update_parent_info(new_ei, new_parent_inode);
+-
+ p_dir = &(new_ei->dir);
+ new_entry = new_ei->entry;
+ ep = exfat_get_dentry(sb, p_dir, new_entry, &new_bh);
+diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
+index 944f83ef9f2ef..dcd15e0249767 100644
+--- a/include/drm/gpu_scheduler.h
++++ b/include/drm/gpu_scheduler.h
+@@ -28,7 +28,7 @@
+ #include <linux/dma-fence.h>
+ #include <linux/completion.h>
+ #include <linux/xarray.h>
+-#include <linux/irq_work.h>
++#include <linux/workqueue.h>
+
+ #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
+
+@@ -294,7 +294,7 @@ struct drm_sched_job {
+ */
+ union {
+ struct dma_fence_cb finish_cb;
+- struct irq_work work;
++ struct work_struct work;
+ };
+
+ uint64_t id;
+diff --git a/include/net/amt.h b/include/net/amt.h
+index 7a4db8b903eed..44acadf3a69e3 100644
+--- a/include/net/amt.h
++++ b/include/net/amt.h
+@@ -78,6 +78,15 @@ enum amt_status {
+
+ #define AMT_STATUS_MAX (__AMT_STATUS_MAX - 1)
+
++/* Gateway events only */
++enum amt_event {
++ AMT_EVENT_NONE,
++ AMT_EVENT_RECEIVE,
++ AMT_EVENT_SEND_DISCOVERY,
++ AMT_EVENT_SEND_REQUEST,
++ __AMT_EVENT_MAX,
++};
++
+ struct amt_header {
+ #if defined(__LITTLE_ENDIAN_BITFIELD)
+ u8 type:4,
+@@ -292,6 +301,12 @@ struct amt_group_node {
+ struct hlist_head sources[];
+ };
+
++#define AMT_MAX_EVENTS 16
++struct amt_events {
++ enum amt_event event;
++ struct sk_buff *skb;
++};
++
+ struct amt_dev {
+ struct net_device *dev;
+ struct net_device *stream_dev;
+@@ -308,6 +323,7 @@ struct amt_dev {
+ struct delayed_work req_wq;
+ /* Protected by RTNL */
+ struct delayed_work secret_wq;
++ struct work_struct event_wq;
+ /* AMT status */
+ enum amt_status status;
+ /* Generated key */
+@@ -345,6 +361,10 @@ struct amt_dev {
+ /* Used only in gateway mode */
+ u64 mac:48,
+ reserved:16;
++ /* AMT gateway side message handler queue */
++ struct amt_events events[AMT_MAX_EVENTS];
++ u8 event_idx;
++ u8 nr_events;
+ };
+
+ #define AMT_TOS 0xc0
+diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
+index 98e1ec1a14f03..749bb1e460871 100644
+--- a/include/net/inet_hashtables.h
++++ b/include/net/inet_hashtables.h
+@@ -207,7 +207,7 @@ static inline bool inet_sk_bound_dev_eq(struct net *net, int bound_dev_if,
+ int dif, int sdif)
+ {
+ #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
+- return inet_bound_dev_eq(!!net->ipv4.sysctl_tcp_l3mdev_accept,
++ return inet_bound_dev_eq(!!READ_ONCE(net->ipv4.sysctl_tcp_l3mdev_accept),
+ bound_dev_if, dif, sdif);
+ #else
+ return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
+diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
+index 48e4c59d85e24..6395f6b9a5d29 100644
+--- a/include/net/inet_sock.h
++++ b/include/net/inet_sock.h
+@@ -107,7 +107,8 @@ static inline struct inet_request_sock *inet_rsk(const struct request_sock *sk)
+
+ static inline u32 inet_request_mark(const struct sock *sk, struct sk_buff *skb)
+ {
+- if (!sk->sk_mark && sock_net(sk)->ipv4.sysctl_tcp_fwmark_accept)
++ if (!sk->sk_mark &&
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fwmark_accept))
+ return skb->mark;
+
+ return sk->sk_mark;
+@@ -116,14 +117,15 @@ static inline u32 inet_request_mark(const struct sock *sk, struct sk_buff *skb)
+ static inline int inet_request_bound_dev_if(const struct sock *sk,
+ struct sk_buff *skb)
+ {
++ int bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
+ #ifdef CONFIG_NET_L3_MASTER_DEV
+ struct net *net = sock_net(sk);
+
+- if (!sk->sk_bound_dev_if && net->ipv4.sysctl_tcp_l3mdev_accept)
++ if (!bound_dev_if && READ_ONCE(net->ipv4.sysctl_tcp_l3mdev_accept))
+ return l3mdev_master_ifindex_by_index(net, skb->skb_iif);
+ #endif
+
+- return sk->sk_bound_dev_if;
++ return bound_dev_if;
+ }
+
+ static inline int inet_sk_bound_l3mdev(const struct sock *sk)
+@@ -131,7 +133,7 @@ static inline int inet_sk_bound_l3mdev(const struct sock *sk)
+ #ifdef CONFIG_NET_L3_MASTER_DEV
+ struct net *net = sock_net(sk);
+
+- if (!net->ipv4.sysctl_tcp_l3mdev_accept)
++ if (!READ_ONCE(net->ipv4.sysctl_tcp_l3mdev_accept))
+ return l3mdev_master_ifindex_by_index(net,
+ sk->sk_bound_dev_if);
+ #endif
+@@ -373,7 +375,7 @@ static inline bool inet_get_convert_csum(struct sock *sk)
+ static inline bool inet_can_nonlocal_bind(struct net *net,
+ struct inet_sock *inet)
+ {
+- return net->ipv4.sysctl_ip_nonlocal_bind ||
++ return READ_ONCE(net->ipv4.sysctl_ip_nonlocal_bind) ||
+ inet->freebind || inet->transparent;
+ }
+
+diff --git a/include/net/ip.h b/include/net/ip.h
+index 26fffda78cca4..1c979fd1904ce 100644
+--- a/include/net/ip.h
++++ b/include/net/ip.h
+@@ -357,7 +357,7 @@ static inline bool sysctl_dev_name_is_allowed(const char *name)
+
+ static inline bool inet_port_requires_bind_service(struct net *net, unsigned short port)
+ {
+- return port < net->ipv4.sysctl_ip_prot_sock;
++ return port < READ_ONCE(net->ipv4.sysctl_ip_prot_sock);
+ }
+
+ #else
+@@ -384,7 +384,7 @@ void ipfrag_init(void);
+ void ip_static_sysctl_init(void);
+
+ #define IP4_REPLY_MARK(net, mark) \
+- ((net)->ipv4.sysctl_fwmark_reflect ? (mark) : 0)
++ (READ_ONCE((net)->ipv4.sysctl_fwmark_reflect) ? (mark) : 0)
+
+ static inline bool ip_is_fragment(const struct iphdr *iph)
+ {
+@@ -446,7 +446,7 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
+ struct net *net = dev_net(dst->dev);
+ unsigned int mtu;
+
+- if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
++ if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
+ ip_mtu_locked(dst) ||
+ !forwarding) {
+ mtu = rt->rt_pmtu;
+diff --git a/include/net/protocol.h b/include/net/protocol.h
+index f51c06ae365f5..6aef8cb11cc8c 100644
+--- a/include/net/protocol.h
++++ b/include/net/protocol.h
+@@ -35,8 +35,6 @@
+
+ /* This is used to register protocols. */
+ struct net_protocol {
+- int (*early_demux)(struct sk_buff *skb);
+- int (*early_demux_handler)(struct sk_buff *skb);
+ int (*handler)(struct sk_buff *skb);
+
+ /* This returns an error if we weren't able to handle the error. */
+@@ -52,8 +50,6 @@ struct net_protocol {
+
+ #if IS_ENABLED(CONFIG_IPV6)
+ struct inet6_protocol {
+- void (*early_demux)(struct sk_buff *skb);
+- void (*early_demux_handler)(struct sk_buff *skb);
+ int (*handler)(struct sk_buff *skb);
+
+ /* This returns an error if we weren't able to handle the error. */
+diff --git a/include/net/route.h b/include/net/route.h
+index 25404fc2b4837..08df794364853 100644
+--- a/include/net/route.h
++++ b/include/net/route.h
+@@ -361,7 +361,7 @@ static inline int ip4_dst_hoplimit(const struct dst_entry *dst)
+ struct net *net = dev_net(dst->dev);
+
+ if (hoplimit == 0)
+- hoplimit = net->ipv4.sysctl_ip_default_ttl;
++ hoplimit = READ_ONCE(net->ipv4.sysctl_ip_default_ttl);
+ return hoplimit;
+ }
+
+diff --git a/include/net/tcp.h b/include/net/tcp.h
+index 2d9a78b3beaa9..4f5de382e1927 100644
+--- a/include/net/tcp.h
++++ b/include/net/tcp.h
+@@ -932,7 +932,7 @@ extern const struct inet_connection_sock_af_ops ipv6_specific;
+
+ INDIRECT_CALLABLE_DECLARE(void tcp_v6_send_check(struct sock *sk, struct sk_buff *skb));
+ INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *skb));
+-INDIRECT_CALLABLE_DECLARE(void tcp_v6_early_demux(struct sk_buff *skb));
++void tcp_v6_early_demux(struct sk_buff *skb);
+
+ #endif
+
+@@ -1421,8 +1421,8 @@ static inline void tcp_slow_start_after_idle_check(struct sock *sk)
+ struct tcp_sock *tp = tcp_sk(sk);
+ s32 delta;
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle || tp->packets_out ||
+- ca_ops->cong_control)
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle) ||
++ tp->packets_out || ca_ops->cong_control)
+ return;
+ delta = tcp_jiffies32 - tp->lsndtime;
+ if (delta > inet_csk(sk)->icsk_rto)
+@@ -1511,21 +1511,24 @@ static inline int keepalive_intvl_when(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
+
+- return tp->keepalive_intvl ? : net->ipv4.sysctl_tcp_keepalive_intvl;
++ return tp->keepalive_intvl ? :
++ READ_ONCE(net->ipv4.sysctl_tcp_keepalive_intvl);
+ }
+
+ static inline int keepalive_time_when(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
+
+- return tp->keepalive_time ? : net->ipv4.sysctl_tcp_keepalive_time;
++ return tp->keepalive_time ? :
++ READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time);
+ }
+
+ static inline int keepalive_probes(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
+
+- return tp->keepalive_probes ? : net->ipv4.sysctl_tcp_keepalive_probes;
++ return tp->keepalive_probes ? :
++ READ_ONCE(net->ipv4.sysctl_tcp_keepalive_probes);
+ }
+
+ static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp)
+@@ -1538,7 +1541,8 @@ static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp)
+
+ static inline int tcp_fin_time(const struct sock *sk)
+ {
+- int fin_timeout = tcp_sk(sk)->linger2 ? : sock_net(sk)->ipv4.sysctl_tcp_fin_timeout;
++ int fin_timeout = tcp_sk(sk)->linger2 ? :
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fin_timeout);
+ const int rto = inet_csk(sk)->icsk_rto;
+
+ if (fin_timeout < (rto << 2) - (rto >> 1))
+@@ -2041,7 +2045,7 @@ void __tcp_v4_send_check(struct sk_buff *skb, __be32 saddr, __be32 daddr);
+ static inline u32 tcp_notsent_lowat(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
+- return tp->notsent_lowat ?: net->ipv4.sysctl_tcp_notsent_lowat;
++ return tp->notsent_lowat ?: READ_ONCE(net->ipv4.sysctl_tcp_notsent_lowat);
+ }
+
+ bool tcp_stream_memory_free(const struct sock *sk, int wake);
+diff --git a/include/net/udp.h b/include/net/udp.h
+index f1c2a88c9005a..abe91ab9030df 100644
+--- a/include/net/udp.h
++++ b/include/net/udp.h
+@@ -167,7 +167,7 @@ static inline void udp_csum_pull_header(struct sk_buff *skb)
+ typedef struct sock *(*udp_lookup_t)(const struct sk_buff *skb, __be16 sport,
+ __be16 dport);
+
+-INDIRECT_CALLABLE_DECLARE(void udp_v6_early_demux(struct sk_buff *));
++void udp_v6_early_demux(struct sk_buff *skb);
+ INDIRECT_CALLABLE_DECLARE(int udpv6_rcv(struct sk_buff *));
+
+ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
+@@ -238,7 +238,7 @@ static inline bool udp_sk_bound_dev_eq(struct net *net, int bound_dev_if,
+ int dif, int sdif)
+ {
+ #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
+- return inet_bound_dev_eq(!!net->ipv4.sysctl_udp_l3mdev_accept,
++ return inet_bound_dev_eq(!!READ_ONCE(net->ipv4.sysctl_udp_l3mdev_accept),
+ bound_dev_if, dif, sdif);
+ #else
+ return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
+diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
+index 1e92b52fc8146..3adff3831c047 100644
+--- a/kernel/bpf/core.c
++++ b/kernel/bpf/core.c
+@@ -68,11 +68,13 @@ void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, uns
+ {
+ u8 *ptr = NULL;
+
+- if (k >= SKF_NET_OFF)
++ if (k >= SKF_NET_OFF) {
+ ptr = skb_network_header(skb) + k - SKF_NET_OFF;
+- else if (k >= SKF_LL_OFF)
++ } else if (k >= SKF_LL_OFF) {
++ if (unlikely(!skb_mac_header_was_set(skb)))
++ return NULL;
+ ptr = skb_mac_header(skb) + k - SKF_LL_OFF;
+-
++ }
+ if (ptr >= skb->head && ptr + size <= skb_tail_pointer(skb))
+ return ptr;
+
+diff --git a/kernel/events/core.c b/kernel/events/core.c
+index 950b25c3f2103..82238406f5f55 100644
+--- a/kernel/events/core.c
++++ b/kernel/events/core.c
+@@ -6254,10 +6254,10 @@ again:
+
+ if (!atomic_inc_not_zero(&event->rb->mmap_count)) {
+ /*
+- * Raced against perf_mmap_close() through
+- * perf_event_set_output(). Try again, hope for better
+- * luck.
++ * Raced against perf_mmap_close(); remove the
++ * event and try again.
+ */
++ ring_buffer_attach(event, NULL);
+ mutex_unlock(&event->mmap_mutex);
+ goto again;
+ }
+@@ -11826,14 +11826,25 @@ err_size:
+ goto out;
+ }
+
++static void mutex_lock_double(struct mutex *a, struct mutex *b)
++{
++ if (b < a)
++ swap(a, b);
++
++ mutex_lock(a);
++ mutex_lock_nested(b, SINGLE_DEPTH_NESTING);
++}
++
+ static int
+ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
+ {
+ struct perf_buffer *rb = NULL;
+ int ret = -EINVAL;
+
+- if (!output_event)
++ if (!output_event) {
++ mutex_lock(&event->mmap_mutex);
+ goto set;
++ }
+
+ /* don't allow circular references */
+ if (event == output_event)
+@@ -11871,8 +11882,15 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
+ event->pmu != output_event->pmu)
+ goto out;
+
++ /*
++ * Hold both mmap_mutex to serialize against perf_mmap_close(). Since
++ * output_event is already on rb->event_list, and the list iteration
++ * restarts after every removal, it is guaranteed this new event is
++ * observed *OR* if output_event is already removed, it's guaranteed we
++ * observe !rb->mmap_count.
++ */
++ mutex_lock_double(&event->mmap_mutex, &output_event->mmap_mutex);
+ set:
+- mutex_lock(&event->mmap_mutex);
+ /* Can't redirect output if we've got an active mmap() */
+ if (atomic_read(&event->mmap_count))
+ goto unlock;
+@@ -11882,6 +11900,12 @@ set:
+ rb = ring_buffer_get(output_event);
+ if (!rb)
+ goto unlock;
++
++ /* did we race against perf_mmap_close() */
++ if (!atomic_read(&rb->mmap_count)) {
++ ring_buffer_put(rb);
++ goto unlock;
++ }
+ }
+
+ ring_buffer_attach(event, rb);
+@@ -11889,20 +11913,13 @@ set:
+ ret = 0;
+ unlock:
+ mutex_unlock(&event->mmap_mutex);
++ if (output_event)
++ mutex_unlock(&output_event->mmap_mutex);
+
+ out:
+ return ret;
+ }
+
+-static void mutex_lock_double(struct mutex *a, struct mutex *b)
+-{
+- if (b < a)
+- swap(a, b);
+-
+- mutex_lock(a);
+- mutex_lock_nested(b, SINGLE_DEPTH_NESTING);
+-}
+-
+ static int perf_event_set_clock(struct perf_event *event, clockid_t clk_id)
+ {
+ bool nmi_safe = false;
+diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
+index b61281d104584..10a916ec64826 100644
+--- a/kernel/sched/deadline.c
++++ b/kernel/sched/deadline.c
+@@ -1669,7 +1669,10 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
+ * the throttle.
+ */
+ p->dl.dl_throttled = 0;
+- BUG_ON(!is_dl_boosted(&p->dl) || flags != ENQUEUE_REPLENISH);
++ if (!(flags & ENQUEUE_REPLENISH))
++ printk_deferred_once("sched: DL de-boosted task PID %d: REPLENISH flag missing\n",
++ task_pid_nr(p));
++
+ return;
+ }
+
+diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
+index 230038d4f9081..bb9962b33f95c 100644
+--- a/kernel/watch_queue.c
++++ b/kernel/watch_queue.c
+@@ -34,6 +34,27 @@ MODULE_LICENSE("GPL");
+ #define WATCH_QUEUE_NOTE_SIZE 128
+ #define WATCH_QUEUE_NOTES_PER_PAGE (PAGE_SIZE / WATCH_QUEUE_NOTE_SIZE)
+
++/*
++ * This must be called under the RCU read-lock, which makes
++ * sure that the wqueue still exists. It can then take the lock,
++ * and check that the wqueue hasn't been destroyed, which in
++ * turn makes sure that the notification pipe still exists.
++ */
++static inline bool lock_wqueue(struct watch_queue *wqueue)
++{
++ spin_lock_bh(&wqueue->lock);
++ if (unlikely(wqueue->defunct)) {
++ spin_unlock_bh(&wqueue->lock);
++ return false;
++ }
++ return true;
++}
++
++static inline void unlock_wqueue(struct watch_queue *wqueue)
++{
++ spin_unlock_bh(&wqueue->lock);
++}
++
+ static void watch_queue_pipe_buf_release(struct pipe_inode_info *pipe,
+ struct pipe_buffer *buf)
+ {
+@@ -69,6 +90,10 @@ static const struct pipe_buf_operations watch_queue_pipe_buf_ops = {
+
+ /*
+ * Post a notification to a watch queue.
++ *
++ * Must be called with the RCU lock for reading, and the
++ * watch_queue lock held, which guarantees that the pipe
++ * hasn't been released.
+ */
+ static bool post_one_notification(struct watch_queue *wqueue,
+ struct watch_notification *n)
+@@ -85,9 +110,6 @@ static bool post_one_notification(struct watch_queue *wqueue,
+
+ spin_lock_irq(&pipe->rd_wait.lock);
+
+- if (wqueue->defunct)
+- goto out;
+-
+ mask = pipe->ring_size - 1;
+ head = pipe->head;
+ tail = pipe->tail;
+@@ -203,7 +225,10 @@ void __post_watch_notification(struct watch_list *wlist,
+ if (security_post_notification(watch->cred, cred, n) < 0)
+ continue;
+
+- post_one_notification(wqueue, n);
++ if (lock_wqueue(wqueue)) {
++ post_one_notification(wqueue, n);
++ unlock_wqueue(wqueue);
++ }
+ }
+
+ rcu_read_unlock();
+@@ -462,11 +487,12 @@ int add_watch_to_object(struct watch *watch, struct watch_list *wlist)
+ return -EAGAIN;
+ }
+
+- spin_lock_bh(&wqueue->lock);
+- kref_get(&wqueue->usage);
+- kref_get(&watch->usage);
+- hlist_add_head(&watch->queue_node, &wqueue->watches);
+- spin_unlock_bh(&wqueue->lock);
++ if (lock_wqueue(wqueue)) {
++ kref_get(&wqueue->usage);
++ kref_get(&watch->usage);
++ hlist_add_head(&watch->queue_node, &wqueue->watches);
++ unlock_wqueue(wqueue);
++ }
+
+ hlist_add_head(&watch->list_node, &wlist->watchers);
+ return 0;
+@@ -520,20 +546,15 @@ found:
+
+ wqueue = rcu_dereference(watch->queue);
+
+- /* We don't need the watch list lock for the next bit as RCU is
+- * protecting *wqueue from deallocation.
+- */
+- if (wqueue) {
++ if (lock_wqueue(wqueue)) {
+ post_one_notification(wqueue, &n.watch);
+
+- spin_lock_bh(&wqueue->lock);
+-
+ if (!hlist_unhashed(&watch->queue_node)) {
+ hlist_del_init_rcu(&watch->queue_node);
+ put_watch(watch);
+ }
+
+- spin_unlock_bh(&wqueue->lock);
++ unlock_wqueue(wqueue);
+ }
+
+ if (wlist->release_watch) {
+diff --git a/mm/mempolicy.c b/mm/mempolicy.c
+index 8c74107a2b15e..ea6dee61bc9dc 100644
+--- a/mm/mempolicy.c
++++ b/mm/mempolicy.c
+@@ -350,7 +350,7 @@ static void mpol_rebind_preferred(struct mempolicy *pol,
+ */
+ static void mpol_rebind_policy(struct mempolicy *pol, const nodemask_t *newmask)
+ {
+- if (!pol)
++ if (!pol || pol->mode == MPOL_LOCAL)
+ return;
+ if (!mpol_store_user_nodemask(pol) &&
+ nodes_equal(pol->w.cpuset_mems_allowed, *newmask))
+diff --git a/net/core/filter.c b/net/core/filter.c
+index 6391c1885bca8..d0b0c163d3f34 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -7031,7 +7031,7 @@ BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len
+ if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN)
+ return -EINVAL;
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies)
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies))
+ return -EINVAL;
+
+ if (!th->ack || th->rst || th->syn)
+@@ -7106,7 +7106,7 @@ BPF_CALL_5(bpf_tcp_gen_syncookie, struct sock *, sk, void *, iph, u32, iph_len,
+ if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN)
+ return -EINVAL;
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies)
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies))
+ return -ENOENT;
+
+ if (!th->syn || th->ack || th->fin || th->rst)
+diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
+index 5f85e01d4093b..b0ff6153be623 100644
+--- a/net/core/secure_seq.c
++++ b/net/core/secure_seq.c
+@@ -64,7 +64,7 @@ u32 secure_tcpv6_ts_off(const struct net *net,
+ .daddr = *(struct in6_addr *)daddr,
+ };
+
+- if (net->ipv4.sysctl_tcp_timestamps != 1)
++ if (READ_ONCE(net->ipv4.sysctl_tcp_timestamps) != 1)
+ return 0;
+
+ ts_secret_init();
+@@ -120,7 +120,7 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
+ #ifdef CONFIG_INET
+ u32 secure_tcp_ts_off(const struct net *net, __be32 saddr, __be32 daddr)
+ {
+- if (net->ipv4.sysctl_tcp_timestamps != 1)
++ if (READ_ONCE(net->ipv4.sysctl_tcp_timestamps) != 1)
+ return 0;
+
+ ts_secret_init();
+diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
+index 3f00a28fe762a..5daa1fa542490 100644
+--- a/net/core/sock_reuseport.c
++++ b/net/core/sock_reuseport.c
+@@ -387,7 +387,7 @@ void reuseport_stop_listen_sock(struct sock *sk)
+ prog = rcu_dereference_protected(reuse->prog,
+ lockdep_is_held(&reuseport_lock));
+
+- if (sock_net(sk)->ipv4.sysctl_tcp_migrate_req ||
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_migrate_req) ||
+ (prog && prog->expected_attach_type == BPF_SK_REUSEPORT_SELECT_OR_MIGRATE)) {
+ /* Migration capable, move sk from the listening section
+ * to the closed section.
+@@ -545,7 +545,7 @@ struct sock *reuseport_migrate_sock(struct sock *sk,
+ hash = migrating_sk->sk_hash;
+ prog = rcu_dereference(reuse->prog);
+ if (!prog || prog->expected_attach_type != BPF_SK_REUSEPORT_SELECT_OR_MIGRATE) {
+- if (sock_net(sk)->ipv4.sysctl_tcp_migrate_req)
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_migrate_req))
+ goto select_by_hash;
+ goto failure;
+ }
+diff --git a/net/dsa/port.c b/net/dsa/port.c
+index bdccb613285db..7bc79e28d48ee 100644
+--- a/net/dsa/port.c
++++ b/net/dsa/port.c
+@@ -242,6 +242,60 @@ void dsa_port_disable(struct dsa_port *dp)
+ rtnl_unlock();
+ }
+
++static void dsa_port_reset_vlan_filtering(struct dsa_port *dp,
++ struct dsa_bridge bridge)
++{
++ struct netlink_ext_ack extack = {0};
++ bool change_vlan_filtering = false;
++ struct dsa_switch *ds = dp->ds;
++ struct dsa_port *other_dp;
++ bool vlan_filtering;
++ int err;
++
++ if (ds->needs_standalone_vlan_filtering &&
++ !br_vlan_enabled(bridge.dev)) {
++ change_vlan_filtering = true;
++ vlan_filtering = true;
++ } else if (!ds->needs_standalone_vlan_filtering &&
++ br_vlan_enabled(bridge.dev)) {
++ change_vlan_filtering = true;
++ vlan_filtering = false;
++ }
++
++ /* If the bridge was vlan_filtering, the bridge core doesn't trigger an
++ * event for changing vlan_filtering setting upon slave ports leaving
++ * it. That is a good thing, because that lets us handle it and also
++ * handle the case where the switch's vlan_filtering setting is global
++ * (not per port). When that happens, the correct moment to trigger the
++ * vlan_filtering callback is only when the last port leaves the last
++ * VLAN-aware bridge.
++ */
++ if (change_vlan_filtering && ds->vlan_filtering_is_global) {
++ dsa_switch_for_each_port(other_dp, ds) {
++ struct net_device *br = dsa_port_bridge_dev_get(other_dp);
++
++ if (br && br_vlan_enabled(br)) {
++ change_vlan_filtering = false;
++ break;
++ }
++ }
++ }
++
++ if (!change_vlan_filtering)
++ return;
++
++ err = dsa_port_vlan_filtering(dp, vlan_filtering, &extack);
++ if (extack._msg) {
++ dev_err(ds->dev, "port %d: %s\n", dp->index,
++ extack._msg);
++ }
++ if (err && err != -EOPNOTSUPP) {
++ dev_err(ds->dev,
++ "port %d failed to reset VLAN filtering to %d: %pe\n",
++ dp->index, vlan_filtering, ERR_PTR(err));
++ }
++}
++
+ static int dsa_port_inherit_brport_flags(struct dsa_port *dp,
+ struct netlink_ext_ack *extack)
+ {
+@@ -313,7 +367,8 @@ static int dsa_port_switchdev_sync_attrs(struct dsa_port *dp,
+ return 0;
+ }
+
+-static void dsa_port_switchdev_unsync_attrs(struct dsa_port *dp)
++static void dsa_port_switchdev_unsync_attrs(struct dsa_port *dp,
++ struct dsa_bridge bridge)
+ {
+ /* Configure the port for standalone mode (no address learning,
+ * flood everything).
+@@ -333,7 +388,7 @@ static void dsa_port_switchdev_unsync_attrs(struct dsa_port *dp)
+ */
+ dsa_port_set_state_now(dp, BR_STATE_FORWARDING, true);
+
+- /* VLAN filtering is handled by dsa_switch_bridge_leave */
++ dsa_port_reset_vlan_filtering(dp, bridge);
+
+ /* Ageing time may be global to the switch chip, so don't change it
+ * here because we have no good reason (or value) to change it to.
+@@ -502,7 +557,7 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
+ "port %d failed to notify DSA_NOTIFIER_BRIDGE_LEAVE: %pe\n",
+ dp->index, ERR_PTR(err));
+
+- dsa_port_switchdev_unsync_attrs(dp);
++ dsa_port_switchdev_unsync_attrs(dp, info.bridge);
+ }
+
+ int dsa_port_lag_change(struct dsa_port *dp,
+@@ -752,7 +807,7 @@ int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
+ ds->vlan_filtering = vlan_filtering;
+
+ dsa_switch_for_each_user_port(other_dp, ds) {
+- struct net_device *slave = dp->slave;
++ struct net_device *slave = other_dp->slave;
+
+ /* We might be called in the unbind path, so not
+ * all slave devices might still be registered.
+diff --git a/net/dsa/switch.c b/net/dsa/switch.c
+index d25cd1da3eb35..d8a80cf9742c0 100644
+--- a/net/dsa/switch.c
++++ b/net/dsa/switch.c
+@@ -115,62 +115,10 @@ static int dsa_switch_bridge_join(struct dsa_switch *ds,
+ return 0;
+ }
+
+-static int dsa_switch_sync_vlan_filtering(struct dsa_switch *ds,
+- struct dsa_notifier_bridge_info *info)
+-{
+- struct netlink_ext_ack extack = {0};
+- bool change_vlan_filtering = false;
+- bool vlan_filtering;
+- struct dsa_port *dp;
+- int err;
+-
+- if (ds->needs_standalone_vlan_filtering &&
+- !br_vlan_enabled(info->bridge.dev)) {
+- change_vlan_filtering = true;
+- vlan_filtering = true;
+- } else if (!ds->needs_standalone_vlan_filtering &&
+- br_vlan_enabled(info->bridge.dev)) {
+- change_vlan_filtering = true;
+- vlan_filtering = false;
+- }
+-
+- /* If the bridge was vlan_filtering, the bridge core doesn't trigger an
+- * event for changing vlan_filtering setting upon slave ports leaving
+- * it. That is a good thing, because that lets us handle it and also
+- * handle the case where the switch's vlan_filtering setting is global
+- * (not per port). When that happens, the correct moment to trigger the
+- * vlan_filtering callback is only when the last port leaves the last
+- * VLAN-aware bridge.
+- */
+- if (change_vlan_filtering && ds->vlan_filtering_is_global) {
+- dsa_switch_for_each_port(dp, ds) {
+- struct net_device *br = dsa_port_bridge_dev_get(dp);
+-
+- if (br && br_vlan_enabled(br)) {
+- change_vlan_filtering = false;
+- break;
+- }
+- }
+- }
+-
+- if (change_vlan_filtering) {
+- err = dsa_port_vlan_filtering(dsa_to_port(ds, info->port),
+- vlan_filtering, &extack);
+- if (extack._msg)
+- dev_err(ds->dev, "port %d: %s\n", info->port,
+- extack._msg);
+- if (err && err != -EOPNOTSUPP)
+- return err;
+- }
+-
+- return 0;
+-}
+-
+ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
+ struct dsa_notifier_bridge_info *info)
+ {
+ struct dsa_switch_tree *dst = ds->dst;
+- int err;
+
+ if (dst->index == info->tree_index && ds->index == info->sw_index &&
+ ds->ops->port_bridge_leave)
+@@ -182,12 +130,6 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
+ info->sw_index, info->port,
+ info->bridge);
+
+- if (ds->dst->index == info->tree_index && ds->index == info->sw_index) {
+- err = dsa_switch_sync_vlan_filtering(ds, info);
+- if (err)
+- return err;
+- }
+-
+ return 0;
+ }
+
+diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
+index 98bc180563d16..5c207367b3b4f 100644
+--- a/net/ipv4/af_inet.c
++++ b/net/ipv4/af_inet.c
+@@ -217,7 +217,7 @@ int inet_listen(struct socket *sock, int backlog)
+ * because the socket was in TCP_LISTEN state previously but
+ * was shutdown() rather than close().
+ */
+- tcp_fastopen = sock_net(sk)->ipv4.sysctl_tcp_fastopen;
++ tcp_fastopen = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen);
+ if ((tcp_fastopen & TFO_SERVER_WO_SOCKOPT1) &&
+ (tcp_fastopen & TFO_SERVER_ENABLE) &&
+ !inet_csk(sk)->icsk_accept_queue.fastopenq.max_qlen) {
+@@ -335,7 +335,7 @@ lookup_protocol:
+ inet->hdrincl = 1;
+ }
+
+- if (net->ipv4.sysctl_ip_no_pmtu_disc)
++ if (READ_ONCE(net->ipv4.sysctl_ip_no_pmtu_disc))
+ inet->pmtudisc = IP_PMTUDISC_DONT;
+ else
+ inet->pmtudisc = IP_PMTUDISC_WANT;
+@@ -1711,24 +1711,14 @@ static const struct net_protocol igmp_protocol = {
+ };
+ #endif
+
+-/* thinking of making this const? Don't.
+- * early_demux can change based on sysctl.
+- */
+-static struct net_protocol tcp_protocol = {
+- .early_demux = tcp_v4_early_demux,
+- .early_demux_handler = tcp_v4_early_demux,
++static const struct net_protocol tcp_protocol = {
+ .handler = tcp_v4_rcv,
+ .err_handler = tcp_v4_err,
+ .no_policy = 1,
+ .icmp_strict_tag_validation = 1,
+ };
+
+-/* thinking of making this const? Don't.
+- * early_demux can change based on sysctl.
+- */
+-static struct net_protocol udp_protocol = {
+- .early_demux = udp_v4_early_demux,
+- .early_demux_handler = udp_v4_early_demux,
++static const struct net_protocol udp_protocol = {
+ .handler = udp_rcv,
+ .err_handler = udp_err,
+ .no_policy = 1,
+diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
+index 720f65f7bd0b0..9f5c1c26c8f26 100644
+--- a/net/ipv4/fib_semantics.c
++++ b/net/ipv4/fib_semantics.c
+@@ -2216,7 +2216,7 @@ void fib_select_multipath(struct fib_result *res, int hash)
+ }
+
+ change_nexthops(fi) {
+- if (net->ipv4.sysctl_fib_multipath_use_neigh) {
++ if (READ_ONCE(net->ipv4.sysctl_fib_multipath_use_neigh)) {
+ if (!fib_good_nh(nexthop_nh))
+ continue;
+ if (!first) {
+diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
+index c13ceda9ce5d8..d8cfa6241c04b 100644
+--- a/net/ipv4/icmp.c
++++ b/net/ipv4/icmp.c
+@@ -878,7 +878,7 @@ static bool icmp_unreach(struct sk_buff *skb)
+ * values please see
+ * Documentation/networking/ip-sysctl.rst
+ */
+- switch (net->ipv4.sysctl_ip_no_pmtu_disc) {
++ switch (READ_ONCE(net->ipv4.sysctl_ip_no_pmtu_disc)) {
+ default:
+ net_dbg_ratelimited("%pI4: fragmentation needed and DF set\n",
+ &iph->daddr);
+diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
+index 1d9e6d5e9a76c..0a0010f896274 100644
+--- a/net/ipv4/igmp.c
++++ b/net/ipv4/igmp.c
+@@ -467,7 +467,8 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ip_mc_list *pmc,
+
+ if (pmc->multiaddr == IGMP_ALL_HOSTS)
+ return skb;
+- if (ipv4_is_local_multicast(pmc->multiaddr) && !net->ipv4.sysctl_igmp_llm_reports)
++ if (ipv4_is_local_multicast(pmc->multiaddr) &&
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ return skb;
+
+ mtu = READ_ONCE(dev->mtu);
+@@ -593,7 +594,7 @@ static int igmpv3_send_report(struct in_device *in_dev, struct ip_mc_list *pmc)
+ if (pmc->multiaddr == IGMP_ALL_HOSTS)
+ continue;
+ if (ipv4_is_local_multicast(pmc->multiaddr) &&
+- !net->ipv4.sysctl_igmp_llm_reports)
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ continue;
+ spin_lock_bh(&pmc->lock);
+ if (pmc->sfcount[MCAST_EXCLUDE])
+@@ -736,7 +737,8 @@ static int igmp_send_report(struct in_device *in_dev, struct ip_mc_list *pmc,
+ if (type == IGMPV3_HOST_MEMBERSHIP_REPORT)
+ return igmpv3_send_report(in_dev, pmc);
+
+- if (ipv4_is_local_multicast(group) && !net->ipv4.sysctl_igmp_llm_reports)
++ if (ipv4_is_local_multicast(group) &&
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ return 0;
+
+ if (type == IGMP_HOST_LEAVE_MESSAGE)
+@@ -825,7 +827,7 @@ static void igmp_ifc_event(struct in_device *in_dev)
+ struct net *net = dev_net(in_dev->dev);
+ if (IGMP_V1_SEEN(in_dev) || IGMP_V2_SEEN(in_dev))
+ return;
+- WRITE_ONCE(in_dev->mr_ifc_count, in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv);
++ WRITE_ONCE(in_dev->mr_ifc_count, in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv));
+ igmp_ifc_start_timer(in_dev, 1);
+ }
+
+@@ -920,7 +922,8 @@ static bool igmp_heard_report(struct in_device *in_dev, __be32 group)
+
+ if (group == IGMP_ALL_HOSTS)
+ return false;
+- if (ipv4_is_local_multicast(group) && !net->ipv4.sysctl_igmp_llm_reports)
++ if (ipv4_is_local_multicast(group) &&
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ return false;
+
+ rcu_read_lock();
+@@ -1006,7 +1009,7 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb,
+ * received value was zero, use the default or statically
+ * configured value.
+ */
+- in_dev->mr_qrv = ih3->qrv ?: net->ipv4.sysctl_igmp_qrv;
++ in_dev->mr_qrv = ih3->qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ in_dev->mr_qi = IGMPV3_QQIC(ih3->qqic)*HZ ?: IGMP_QUERY_INTERVAL;
+
+ /* RFC3376, 8.3. Query Response Interval:
+@@ -1045,7 +1048,7 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb,
+ if (im->multiaddr == IGMP_ALL_HOSTS)
+ continue;
+ if (ipv4_is_local_multicast(im->multiaddr) &&
+- !net->ipv4.sysctl_igmp_llm_reports)
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ continue;
+ spin_lock_bh(&im->lock);
+ if (im->tm_running)
+@@ -1186,7 +1189,7 @@ static void igmpv3_add_delrec(struct in_device *in_dev, struct ip_mc_list *im,
+ pmc->interface = im->interface;
+ in_dev_hold(in_dev);
+ pmc->multiaddr = im->multiaddr;
+- pmc->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv;
++ pmc->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ pmc->sfmode = im->sfmode;
+ if (pmc->sfmode == MCAST_INCLUDE) {
+ struct ip_sf_list *psf;
+@@ -1237,9 +1240,11 @@ static void igmpv3_del_delrec(struct in_device *in_dev, struct ip_mc_list *im)
+ swap(im->tomb, pmc->tomb);
+ swap(im->sources, pmc->sources);
+ for (psf = im->sources; psf; psf = psf->sf_next)
+- psf->sf_crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv;
++ psf->sf_crcount = in_dev->mr_qrv ?:
++ READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ } else {
+- im->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv;
++ im->crcount = in_dev->mr_qrv ?:
++ READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ }
+ in_dev_put(pmc->interface);
+ kfree_pmc(pmc);
+@@ -1296,7 +1301,8 @@ static void __igmp_group_dropped(struct ip_mc_list *im, gfp_t gfp)
+ #ifdef CONFIG_IP_MULTICAST
+ if (im->multiaddr == IGMP_ALL_HOSTS)
+ return;
+- if (ipv4_is_local_multicast(im->multiaddr) && !net->ipv4.sysctl_igmp_llm_reports)
++ if (ipv4_is_local_multicast(im->multiaddr) &&
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ return;
+
+ reporter = im->reporter;
+@@ -1338,13 +1344,14 @@ static void igmp_group_added(struct ip_mc_list *im)
+ #ifdef CONFIG_IP_MULTICAST
+ if (im->multiaddr == IGMP_ALL_HOSTS)
+ return;
+- if (ipv4_is_local_multicast(im->multiaddr) && !net->ipv4.sysctl_igmp_llm_reports)
++ if (ipv4_is_local_multicast(im->multiaddr) &&
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ return;
+
+ if (in_dev->dead)
+ return;
+
+- im->unsolicit_count = net->ipv4.sysctl_igmp_qrv;
++ im->unsolicit_count = READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ if (IGMP_V1_SEEN(in_dev) || IGMP_V2_SEEN(in_dev)) {
+ spin_lock_bh(&im->lock);
+ igmp_start_timer(im, IGMP_INITIAL_REPORT_DELAY);
+@@ -1358,7 +1365,7 @@ static void igmp_group_added(struct ip_mc_list *im)
+ * IN() to IN(A).
+ */
+ if (im->sfmode == MCAST_EXCLUDE)
+- im->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv;
++ im->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+
+ igmp_ifc_event(in_dev);
+ #endif
+@@ -1642,7 +1649,7 @@ static void ip_mc_rejoin_groups(struct in_device *in_dev)
+ if (im->multiaddr == IGMP_ALL_HOSTS)
+ continue;
+ if (ipv4_is_local_multicast(im->multiaddr) &&
+- !net->ipv4.sysctl_igmp_llm_reports)
++ !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
+ continue;
+
+ /* a failover is happening and switches
+@@ -1749,7 +1756,7 @@ static void ip_mc_reset(struct in_device *in_dev)
+
+ in_dev->mr_qi = IGMP_QUERY_INTERVAL;
+ in_dev->mr_qri = IGMP_QUERY_RESPONSE_INTERVAL;
+- in_dev->mr_qrv = net->ipv4.sysctl_igmp_qrv;
++ in_dev->mr_qrv = READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ }
+ #else
+ static void ip_mc_reset(struct in_device *in_dev)
+@@ -1883,7 +1890,7 @@ static int ip_mc_del1_src(struct ip_mc_list *pmc, int sfmode,
+ #ifdef CONFIG_IP_MULTICAST
+ if (psf->sf_oldin &&
+ !IGMP_V1_SEEN(in_dev) && !IGMP_V2_SEEN(in_dev)) {
+- psf->sf_crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv;
++ psf->sf_crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ psf->sf_next = pmc->tomb;
+ pmc->tomb = psf;
+ rv = 1;
+@@ -1947,7 +1954,7 @@ static int ip_mc_del_src(struct in_device *in_dev, __be32 *pmca, int sfmode,
+ /* filter mode change */
+ pmc->sfmode = MCAST_INCLUDE;
+ #ifdef CONFIG_IP_MULTICAST
+- pmc->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv;
++ pmc->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ WRITE_ONCE(in_dev->mr_ifc_count, pmc->crcount);
+ for (psf = pmc->sources; psf; psf = psf->sf_next)
+ psf->sf_crcount = 0;
+@@ -2126,7 +2133,7 @@ static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode,
+ #ifdef CONFIG_IP_MULTICAST
+ /* else no filters; keep old mode for reports */
+
+- pmc->crcount = in_dev->mr_qrv ?: net->ipv4.sysctl_igmp_qrv;
++ pmc->crcount = in_dev->mr_qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);
+ WRITE_ONCE(in_dev->mr_ifc_count, pmc->crcount);
+ for (psf = pmc->sources; psf; psf = psf->sf_next)
+ psf->sf_crcount = 0;
+@@ -2192,7 +2199,7 @@ static int __ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr,
+ count++;
+ }
+ err = -ENOBUFS;
+- if (count >= net->ipv4.sysctl_igmp_max_memberships)
++ if (count >= READ_ONCE(net->ipv4.sysctl_igmp_max_memberships))
+ goto done;
+ iml = sock_kmalloc(sk, sizeof(*iml), GFP_KERNEL);
+ if (!iml)
+@@ -2379,7 +2386,7 @@ int ip_mc_source(int add, int omode, struct sock *sk, struct
+ }
+ /* else, add a new source to the filter */
+
+- if (psl && psl->sl_count >= net->ipv4.sysctl_igmp_max_msf) {
++ if (psl && psl->sl_count >= READ_ONCE(net->ipv4.sysctl_igmp_max_msf)) {
+ err = -ENOBUFS;
+ goto done;
+ }
+diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
+index 1e5b53c2bb267..cdc750ced5255 100644
+--- a/net/ipv4/inet_connection_sock.c
++++ b/net/ipv4/inet_connection_sock.c
+@@ -259,7 +259,7 @@ next_port:
+ goto other_half_scan;
+ }
+
+- if (net->ipv4.sysctl_ip_autobind_reuse && !relax) {
++ if (READ_ONCE(net->ipv4.sysctl_ip_autobind_reuse) && !relax) {
+ /* We still have a chance to connect to different destinations */
+ relax = true;
+ goto ports_exhausted;
+@@ -829,7 +829,8 @@ static void reqsk_timer_handler(struct timer_list *t)
+
+ icsk = inet_csk(sk_listener);
+ net = sock_net(sk_listener);
+- max_syn_ack_retries = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_synack_retries;
++ max_syn_ack_retries = icsk->icsk_syn_retries ? :
++ READ_ONCE(net->ipv4.sysctl_tcp_synack_retries);
+ /* Normally all the openreqs are young and become mature
+ * (i.e. converted to established socket) for first timeout.
+ * If synack was not acknowledged for 1 second, it means
+diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c
+index 92ba3350274bc..03bb7c51b6182 100644
+--- a/net/ipv4/ip_forward.c
++++ b/net/ipv4/ip_forward.c
+@@ -151,7 +151,7 @@ int ip_forward(struct sk_buff *skb)
+ !skb_sec_path(skb))
+ ip_rt_send_redirect(skb);
+
+- if (net->ipv4.sysctl_ip_fwd_update_priority)
++ if (READ_ONCE(net->ipv4.sysctl_ip_fwd_update_priority))
+ skb->priority = rt_tos2priority(iph->tos);
+
+ return NF_HOOK(NFPROTO_IPV4, NF_INET_FORWARD,
+diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
+index 95f7bb052784e..f3fd6c3983090 100644
+--- a/net/ipv4/ip_input.c
++++ b/net/ipv4/ip_input.c
+@@ -312,14 +312,13 @@ static bool ip_can_use_hint(const struct sk_buff *skb, const struct iphdr *iph,
+ ip_hdr(hint)->tos == iph->tos;
+ }
+
+-INDIRECT_CALLABLE_DECLARE(int udp_v4_early_demux(struct sk_buff *));
+-INDIRECT_CALLABLE_DECLARE(int tcp_v4_early_demux(struct sk_buff *));
++int tcp_v4_early_demux(struct sk_buff *skb);
++int udp_v4_early_demux(struct sk_buff *skb);
+ static int ip_rcv_finish_core(struct net *net, struct sock *sk,
+ struct sk_buff *skb, struct net_device *dev,
+ const struct sk_buff *hint)
+ {
+ const struct iphdr *iph = ip_hdr(skb);
+- int (*edemux)(struct sk_buff *skb);
+ int err, drop_reason;
+ struct rtable *rt;
+
+@@ -332,21 +331,29 @@ static int ip_rcv_finish_core(struct net *net, struct sock *sk,
+ goto drop_error;
+ }
+
+- if (net->ipv4.sysctl_ip_early_demux &&
++ if (READ_ONCE(net->ipv4.sysctl_ip_early_demux) &&
+ !skb_dst(skb) &&
+ !skb->sk &&
+ !ip_is_fragment(iph)) {
+- const struct net_protocol *ipprot;
+- int protocol = iph->protocol;
+-
+- ipprot = rcu_dereference(inet_protos[protocol]);
+- if (ipprot && (edemux = READ_ONCE(ipprot->early_demux))) {
+- err = INDIRECT_CALL_2(edemux, tcp_v4_early_demux,
+- udp_v4_early_demux, skb);
+- if (unlikely(err))
+- goto drop_error;
+- /* must reload iph, skb->head might have changed */
+- iph = ip_hdr(skb);
++ switch (iph->protocol) {
++ case IPPROTO_TCP:
++ if (READ_ONCE(net->ipv4.sysctl_tcp_early_demux)) {
++ tcp_v4_early_demux(skb);
++
++ /* must reload iph, skb->head might have changed */
++ iph = ip_hdr(skb);
++ }
++ break;
++ case IPPROTO_UDP:
++ if (READ_ONCE(net->ipv4.sysctl_udp_early_demux)) {
++ err = udp_v4_early_demux(skb);
++ if (unlikely(err))
++ goto drop_error;
++
++ /* must reload iph, skb->head might have changed */
++ iph = ip_hdr(skb);
++ }
++ break;
+ }
+ }
+
+diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
+index 445a9ecaefa19..a8a323ecbb54b 100644
+--- a/net/ipv4/ip_sockglue.c
++++ b/net/ipv4/ip_sockglue.c
+@@ -782,7 +782,7 @@ static int ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval, int optlen)
+ /* numsrc >= (4G-140)/128 overflow in 32 bits */
+ err = -ENOBUFS;
+ if (gsf->gf_numsrc >= 0x1ffffff ||
+- gsf->gf_numsrc > sock_net(sk)->ipv4.sysctl_igmp_max_msf)
++ gsf->gf_numsrc > READ_ONCE(sock_net(sk)->ipv4.sysctl_igmp_max_msf))
+ goto out_free_gsf;
+
+ err = -EINVAL;
+@@ -832,7 +832,7 @@ static int compat_ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval,
+
+ /* numsrc >= (4G-140)/128 overflow in 32 bits */
+ err = -ENOBUFS;
+- if (n > sock_net(sk)->ipv4.sysctl_igmp_max_msf)
++ if (n > READ_ONCE(sock_net(sk)->ipv4.sysctl_igmp_max_msf))
+ goto out_free_gsf;
+ err = set_mcast_msfilter(sk, gf32->gf_interface, n, gf32->gf_fmode,
+ &gf32->gf_group, gf32->gf_slist_flex);
+@@ -1244,7 +1244,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname,
+ }
+ /* numsrc >= (1G-4) overflow in 32 bits */
+ if (msf->imsf_numsrc >= 0x3ffffffcU ||
+- msf->imsf_numsrc > net->ipv4.sysctl_igmp_max_msf) {
++ msf->imsf_numsrc > READ_ONCE(net->ipv4.sysctl_igmp_max_msf)) {
+ kfree(msf);
+ err = -ENOBUFS;
+ break;
+@@ -1606,7 +1606,7 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname,
+ {
+ struct net *net = sock_net(sk);
+ val = (inet->uc_ttl == -1 ?
+- net->ipv4.sysctl_ip_default_ttl :
++ READ_ONCE(net->ipv4.sysctl_ip_default_ttl) :
+ inet->uc_ttl);
+ break;
+ }
+diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
+index 4eed5afca392e..f2edb40c0db00 100644
+--- a/net/ipv4/netfilter/nf_reject_ipv4.c
++++ b/net/ipv4/netfilter/nf_reject_ipv4.c
+@@ -62,7 +62,7 @@ struct sk_buff *nf_reject_skb_v4_tcp_reset(struct net *net,
+
+ skb_reserve(nskb, LL_MAX_HEADER);
+ niph = nf_reject_iphdr_put(nskb, oldskb, IPPROTO_TCP,
+- net->ipv4.sysctl_ip_default_ttl);
++ READ_ONCE(net->ipv4.sysctl_ip_default_ttl));
+ nf_reject_ip_tcphdr_put(nskb, oldskb, oth);
+ niph->tot_len = htons(nskb->len);
+ ip_send_check(niph);
+@@ -115,7 +115,7 @@ struct sk_buff *nf_reject_skb_v4_unreach(struct net *net,
+
+ skb_reserve(nskb, LL_MAX_HEADER);
+ niph = nf_reject_iphdr_put(nskb, oldskb, IPPROTO_ICMP,
+- net->ipv4.sysctl_ip_default_ttl);
++ READ_ONCE(net->ipv4.sysctl_ip_default_ttl));
+
+ skb_reset_transport_header(nskb);
+ icmph = skb_put_zero(nskb, sizeof(struct icmphdr));
+diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
+index 28836071f0a69..0088a4c64d77e 100644
+--- a/net/ipv4/proc.c
++++ b/net/ipv4/proc.c
+@@ -387,7 +387,7 @@ static int snmp_seq_show_ipstats(struct seq_file *seq, void *v)
+
+ seq_printf(seq, "\nIp: %d %d",
+ IPV4_DEVCONF_ALL(net, FORWARDING) ? 1 : 2,
+- net->ipv4.sysctl_ip_default_ttl);
++ READ_ONCE(net->ipv4.sysctl_ip_default_ttl));
+
+ BUILD_BUG_ON(offsetof(struct ipstats_mib, mibs) != 0);
+ snmp_get_cpu_field64_batch(buff64, snmp4_ipstats_list,
+diff --git a/net/ipv4/route.c b/net/ipv4/route.c
+index ed01063d8f303..02a0a397a2f38 100644
+--- a/net/ipv4/route.c
++++ b/net/ipv4/route.c
+@@ -1397,7 +1397,7 @@ u32 ip_mtu_from_fib_result(struct fib_result *res, __be32 daddr)
+ struct fib_info *fi = res->fi;
+ u32 mtu = 0;
+
+- if (dev_net(dev)->ipv4.sysctl_ip_fwd_use_pmtu ||
++ if (READ_ONCE(dev_net(dev)->ipv4.sysctl_ip_fwd_use_pmtu) ||
+ fi->fib_metrics->metrics[RTAX_LOCK - 1] & (1 << RTAX_MTU))
+ mtu = fi->fib_mtu;
+
+@@ -1928,7 +1928,7 @@ static u32 fib_multipath_custom_hash_outer(const struct net *net,
+ const struct sk_buff *skb,
+ bool *p_has_inner)
+ {
+- u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
++ u32 hash_fields = READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_fields);
+ struct flow_keys keys, hash_keys;
+
+ if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
+@@ -1957,7 +1957,7 @@ static u32 fib_multipath_custom_hash_inner(const struct net *net,
+ const struct sk_buff *skb,
+ bool has_inner)
+ {
+- u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
++ u32 hash_fields = READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_fields);
+ struct flow_keys keys, hash_keys;
+
+ /* We assume the packet carries an encapsulation, but if none was
+@@ -2017,7 +2017,7 @@ static u32 fib_multipath_custom_hash_skb(const struct net *net,
+ static u32 fib_multipath_custom_hash_fl4(const struct net *net,
+ const struct flowi4 *fl4)
+ {
+- u32 hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields;
++ u32 hash_fields = READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_fields);
+ struct flow_keys hash_keys;
+
+ if (!(hash_fields & FIB_MULTIPATH_HASH_FIELD_OUTER_MASK))
+@@ -2047,7 +2047,7 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
+ struct flow_keys hash_keys;
+ u32 mhash = 0;
+
+- switch (net->ipv4.sysctl_fib_multipath_hash_policy) {
++ switch (READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_policy)) {
+ case 0:
+ memset(&hash_keys, 0, sizeof(hash_keys));
+ hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
+index b387c48351559..942d2dfa11151 100644
+--- a/net/ipv4/syncookies.c
++++ b/net/ipv4/syncookies.c
+@@ -247,12 +247,12 @@ bool cookie_timestamp_decode(const struct net *net,
+ return true;
+ }
+
+- if (!net->ipv4.sysctl_tcp_timestamps)
++ if (!READ_ONCE(net->ipv4.sysctl_tcp_timestamps))
+ return false;
+
+ tcp_opt->sack_ok = (options & TS_OPT_SACK) ? TCP_SACK_SEEN : 0;
+
+- if (tcp_opt->sack_ok && !net->ipv4.sysctl_tcp_sack)
++ if (tcp_opt->sack_ok && !READ_ONCE(net->ipv4.sysctl_tcp_sack))
+ return false;
+
+ if ((options & TS_OPT_WSCALE_MASK) == TS_OPT_WSCALE_MASK)
+@@ -261,7 +261,7 @@ bool cookie_timestamp_decode(const struct net *net,
+ tcp_opt->wscale_ok = 1;
+ tcp_opt->snd_wscale = options & TS_OPT_WSCALE_MASK;
+
+- return net->ipv4.sysctl_tcp_window_scaling != 0;
++ return READ_ONCE(net->ipv4.sysctl_tcp_window_scaling) != 0;
+ }
+ EXPORT_SYMBOL(cookie_timestamp_decode);
+
+@@ -340,7 +340,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
+ struct flowi4 fl4;
+ u32 tsoff = 0;
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies || !th->ack || th->rst)
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) ||
++ !th->ack || th->rst)
+ goto out;
+
+ if (tcp_synq_no_recent_overflow(sk))
+diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
+index ffe0264a51b8c..344cdcd5a7d5c 100644
+--- a/net/ipv4/sysctl_net_ipv4.c
++++ b/net/ipv4/sysctl_net_ipv4.c
+@@ -88,7 +88,7 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
+ * port limit.
+ */
+ if ((range[1] < range[0]) ||
+- (range[0] < net->ipv4.sysctl_ip_prot_sock))
++ (range[0] < READ_ONCE(net->ipv4.sysctl_ip_prot_sock)))
+ ret = -EINVAL;
+ else
+ set_local_port_range(net, range);
+@@ -114,7 +114,7 @@ static int ipv4_privileged_ports(struct ctl_table *table, int write,
+ .extra2 = &ip_privileged_port_max,
+ };
+
+- pports = net->ipv4.sysctl_ip_prot_sock;
++ pports = READ_ONCE(net->ipv4.sysctl_ip_prot_sock);
+
+ ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+
+@@ -126,7 +126,7 @@ static int ipv4_privileged_ports(struct ctl_table *table, int write,
+ if (range[0] < pports)
+ ret = -EINVAL;
+ else
+- net->ipv4.sysctl_ip_prot_sock = pports;
++ WRITE_ONCE(net->ipv4.sysctl_ip_prot_sock, pports);
+ }
+
+ return ret;
+@@ -354,61 +354,6 @@ bad_key:
+ return ret;
+ }
+
+-static void proc_configure_early_demux(int enabled, int protocol)
+-{
+- struct net_protocol *ipprot;
+-#if IS_ENABLED(CONFIG_IPV6)
+- struct inet6_protocol *ip6prot;
+-#endif
+-
+- rcu_read_lock();
+-
+- ipprot = rcu_dereference(inet_protos[protocol]);
+- if (ipprot)
+- ipprot->early_demux = enabled ? ipprot->early_demux_handler :
+- NULL;
+-
+-#if IS_ENABLED(CONFIG_IPV6)
+- ip6prot = rcu_dereference(inet6_protos[protocol]);
+- if (ip6prot)
+- ip6prot->early_demux = enabled ? ip6prot->early_demux_handler :
+- NULL;
+-#endif
+- rcu_read_unlock();
+-}
+-
+-static int proc_tcp_early_demux(struct ctl_table *table, int write,
+- void *buffer, size_t *lenp, loff_t *ppos)
+-{
+- int ret = 0;
+-
+- ret = proc_dou8vec_minmax(table, write, buffer, lenp, ppos);
+-
+- if (write && !ret) {
+- int enabled = init_net.ipv4.sysctl_tcp_early_demux;
+-
+- proc_configure_early_demux(enabled, IPPROTO_TCP);
+- }
+-
+- return ret;
+-}
+-
+-static int proc_udp_early_demux(struct ctl_table *table, int write,
+- void *buffer, size_t *lenp, loff_t *ppos)
+-{
+- int ret = 0;
+-
+- ret = proc_dou8vec_minmax(table, write, buffer, lenp, ppos);
+-
+- if (write && !ret) {
+- int enabled = init_net.ipv4.sysctl_udp_early_demux;
+-
+- proc_configure_early_demux(enabled, IPPROTO_UDP);
+- }
+-
+- return ret;
+-}
+-
+ static int proc_tfo_blackhole_detect_timeout(struct ctl_table *table,
+ int write, void *buffer,
+ size_t *lenp, loff_t *ppos)
+@@ -711,14 +656,14 @@ static struct ctl_table ipv4_net_table[] = {
+ .data = &init_net.ipv4.sysctl_udp_early_demux,
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+- .proc_handler = proc_udp_early_demux
++ .proc_handler = proc_dou8vec_minmax,
+ },
+ {
+ .procname = "tcp_early_demux",
+ .data = &init_net.ipv4.sysctl_tcp_early_demux,
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+- .proc_handler = proc_tcp_early_demux
++ .proc_handler = proc_dou8vec_minmax,
+ },
+ {
+ .procname = "nexthop_compat_mode",
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index f2fd1779d9251..28db838e604ad 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -441,7 +441,7 @@ void tcp_init_sock(struct sock *sk)
+ tp->snd_cwnd_clamp = ~0;
+ tp->mss_cache = TCP_MSS_DEFAULT;
+
+- tp->reordering = sock_net(sk)->ipv4.sysctl_tcp_reordering;
++ tp->reordering = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reordering);
+ tcp_assign_congestion_control(sk);
+
+ tp->tsoffset = 0;
+@@ -1151,7 +1151,8 @@ static int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg,
+ struct sockaddr *uaddr = msg->msg_name;
+ int err, flags;
+
+- if (!(sock_net(sk)->ipv4.sysctl_tcp_fastopen & TFO_CLIENT_ENABLE) ||
++ if (!(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen) &
++ TFO_CLIENT_ENABLE) ||
+ (uaddr && msg->msg_namelen >= sizeof(uaddr->sa_family) &&
+ uaddr->sa_family == AF_UNSPEC))
+ return -EOPNOTSUPP;
+@@ -3638,7 +3639,8 @@ static int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+ case TCP_FASTOPEN_CONNECT:
+ if (val > 1 || val < 0) {
+ err = -EINVAL;
+- } else if (net->ipv4.sysctl_tcp_fastopen & TFO_CLIENT_ENABLE) {
++ } else if (READ_ONCE(net->ipv4.sysctl_tcp_fastopen) &
++ TFO_CLIENT_ENABLE) {
+ if (sk->sk_state == TCP_CLOSE)
+ tp->fastopen_connect = val;
+ else
+@@ -3988,12 +3990,13 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
+ val = keepalive_probes(tp);
+ break;
+ case TCP_SYNCNT:
+- val = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries;
++ val = icsk->icsk_syn_retries ? :
++ READ_ONCE(net->ipv4.sysctl_tcp_syn_retries);
+ break;
+ case TCP_LINGER2:
+ val = tp->linger2;
+ if (val >= 0)
+- val = (val ? : net->ipv4.sysctl_tcp_fin_timeout) / HZ;
++ val = (val ? : READ_ONCE(net->ipv4.sysctl_tcp_fin_timeout)) / HZ;
+ break;
+ case TCP_DEFER_ACCEPT:
+ val = retrans_to_secs(icsk->icsk_accept_queue.rskq_defer_accept,
+diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
+index fdbcf2a6d08ef..825b216d11f52 100644
+--- a/net/ipv4/tcp_fastopen.c
++++ b/net/ipv4/tcp_fastopen.c
+@@ -332,7 +332,7 @@ static bool tcp_fastopen_no_cookie(const struct sock *sk,
+ const struct dst_entry *dst,
+ int flag)
+ {
+- return (sock_net(sk)->ipv4.sysctl_tcp_fastopen & flag) ||
++ return (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen) & flag) ||
+ tcp_sk(sk)->fastopen_no_cookie ||
+ (dst && dst_metric(dst, RTAX_FASTOPEN_NO_COOKIE));
+ }
+@@ -347,7 +347,7 @@ struct sock *tcp_try_fastopen(struct sock *sk, struct sk_buff *skb,
+ const struct dst_entry *dst)
+ {
+ bool syn_data = TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq + 1;
+- int tcp_fastopen = sock_net(sk)->ipv4.sysctl_tcp_fastopen;
++ int tcp_fastopen = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen);
+ struct tcp_fastopen_cookie valid_foc = { .len = -1 };
+ struct sock *child;
+ int ret = 0;
+@@ -489,7 +489,7 @@ void tcp_fastopen_active_disable(struct sock *sk)
+ {
+ struct net *net = sock_net(sk);
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout)
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout))
+ return;
+
+ /* Paired with READ_ONCE() in tcp_fastopen_active_should_disable() */
+@@ -510,7 +510,8 @@ void tcp_fastopen_active_disable(struct sock *sk)
+ */
+ bool tcp_fastopen_active_should_disable(struct sock *sk)
+ {
+- unsigned int tfo_bh_timeout = sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout;
++ unsigned int tfo_bh_timeout =
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout);
+ unsigned long timeout;
+ int tfo_da_times;
+ int multiplier;
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 2d71bcfcc7592..19b186a4a8e8c 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -1051,7 +1051,7 @@ static void tcp_check_sack_reordering(struct sock *sk, const u32 low_seq,
+ tp->undo_marker ? tp->undo_retrans : 0);
+ #endif
+ tp->reordering = min_t(u32, (metric + mss - 1) / mss,
+- sock_net(sk)->ipv4.sysctl_tcp_max_reordering);
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_max_reordering));
+ }
+
+ /* This exciting event is worth to be remembered. 8) */
+@@ -2030,7 +2030,7 @@ static void tcp_check_reno_reordering(struct sock *sk, const int addend)
+ return;
+
+ tp->reordering = min_t(u32, tp->packets_out + addend,
+- sock_net(sk)->ipv4.sysctl_tcp_max_reordering);
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_max_reordering));
+ tp->reord_seen++;
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRENOREORDER);
+ }
+@@ -2095,7 +2095,8 @@ static inline void tcp_init_undo(struct tcp_sock *tp)
+
+ static bool tcp_is_rack(const struct sock *sk)
+ {
+- return sock_net(sk)->ipv4.sysctl_tcp_recovery & TCP_RACK_LOSS_DETECTION;
++ return READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_recovery) &
++ TCP_RACK_LOSS_DETECTION;
+ }
+
+ /* If we detect SACK reneging, forget all SACK information
+@@ -2139,6 +2140,7 @@ void tcp_enter_loss(struct sock *sk)
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct net *net = sock_net(sk);
+ bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery;
++ u8 reordering;
+
+ tcp_timeout_mark_lost(sk);
+
+@@ -2159,10 +2161,12 @@ void tcp_enter_loss(struct sock *sk)
+ /* Timeout in disordered state after receiving substantial DUPACKs
+ * suggests that the degree of reordering is over-estimated.
+ */
++ reordering = READ_ONCE(net->ipv4.sysctl_tcp_reordering);
+ if (icsk->icsk_ca_state <= TCP_CA_Disorder &&
+- tp->sacked_out >= net->ipv4.sysctl_tcp_reordering)
++ tp->sacked_out >= reordering)
+ tp->reordering = min_t(unsigned int, tp->reordering,
+- net->ipv4.sysctl_tcp_reordering);
++ reordering);
++
+ tcp_set_ca_state(sk, TCP_CA_Loss);
+ tp->high_seq = tp->snd_nxt;
+ tcp_ecn_queue_cwr(tp);
+@@ -3464,7 +3468,8 @@ static inline bool tcp_may_raise_cwnd(const struct sock *sk, const int flag)
+ * new SACK or ECE mark may first advance cwnd here and later reduce
+ * cwnd in tcp_fastretrans_alert() based on more states.
+ */
+- if (tcp_sk(sk)->reordering > sock_net(sk)->ipv4.sysctl_tcp_reordering)
++ if (tcp_sk(sk)->reordering >
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reordering))
+ return flag & FLAG_FORWARD_PROGRESS;
+
+ return flag & FLAG_DATA_ACKED;
+@@ -4056,7 +4061,7 @@ void tcp_parse_options(const struct net *net,
+ break;
+ case TCPOPT_WINDOW:
+ if (opsize == TCPOLEN_WINDOW && th->syn &&
+- !estab && net->ipv4.sysctl_tcp_window_scaling) {
++ !estab && READ_ONCE(net->ipv4.sysctl_tcp_window_scaling)) {
+ __u8 snd_wscale = *(__u8 *)ptr;
+ opt_rx->wscale_ok = 1;
+ if (snd_wscale > TCP_MAX_WSCALE) {
+@@ -4072,7 +4077,7 @@ void tcp_parse_options(const struct net *net,
+ case TCPOPT_TIMESTAMP:
+ if ((opsize == TCPOLEN_TIMESTAMP) &&
+ ((estab && opt_rx->tstamp_ok) ||
+- (!estab && net->ipv4.sysctl_tcp_timestamps))) {
++ (!estab && READ_ONCE(net->ipv4.sysctl_tcp_timestamps)))) {
+ opt_rx->saw_tstamp = 1;
+ opt_rx->rcv_tsval = get_unaligned_be32(ptr);
+ opt_rx->rcv_tsecr = get_unaligned_be32(ptr + 4);
+@@ -4080,7 +4085,7 @@ void tcp_parse_options(const struct net *net,
+ break;
+ case TCPOPT_SACK_PERM:
+ if (opsize == TCPOLEN_SACK_PERM && th->syn &&
+- !estab && net->ipv4.sysctl_tcp_sack) {
++ !estab && READ_ONCE(net->ipv4.sysctl_tcp_sack)) {
+ opt_rx->sack_ok = TCP_SACK_SEEN;
+ tcp_sack_reset(opt_rx);
+ }
+@@ -5571,7 +5576,7 @@ static void tcp_check_urg(struct sock *sk, const struct tcphdr *th)
+ struct tcp_sock *tp = tcp_sk(sk);
+ u32 ptr = ntohs(th->urg_ptr);
+
+- if (ptr && !sock_net(sk)->ipv4.sysctl_tcp_stdurg)
++ if (ptr && !READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_stdurg))
+ ptr--;
+ ptr += ntohl(th->seq);
+
+@@ -6780,11 +6785,14 @@ static bool tcp_syn_flood_action(const struct sock *sk, const char *proto)
+ {
+ struct request_sock_queue *queue = &inet_csk(sk)->icsk_accept_queue;
+ const char *msg = "Dropping request";
+- bool want_cookie = false;
+ struct net *net = sock_net(sk);
++ bool want_cookie = false;
++ u8 syncookies;
++
++ syncookies = READ_ONCE(net->ipv4.sysctl_tcp_syncookies);
+
+ #ifdef CONFIG_SYN_COOKIES
+- if (net->ipv4.sysctl_tcp_syncookies) {
++ if (syncookies) {
+ msg = "Sending cookies";
+ want_cookie = true;
+ __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPREQQFULLDOCOOKIES);
+@@ -6792,8 +6800,7 @@ static bool tcp_syn_flood_action(const struct sock *sk, const char *proto)
+ #endif
+ __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPREQQFULLDROP);
+
+- if (!queue->synflood_warned &&
+- net->ipv4.sysctl_tcp_syncookies != 2 &&
++ if (!queue->synflood_warned && syncookies != 2 &&
+ xchg(&queue->synflood_warned, 1) == 0)
+ net_info_ratelimited("%s: Possible SYN flooding on port %d. %s. Check SNMP counters.\n",
+ proto, sk->sk_num, msg);
+@@ -6842,7 +6849,7 @@ u16 tcp_get_syncookie_mss(struct request_sock_ops *rsk_ops,
+ struct tcp_sock *tp = tcp_sk(sk);
+ u16 mss;
+
+- if (sock_net(sk)->ipv4.sysctl_tcp_syncookies != 2 &&
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) != 2 &&
+ !inet_csk_reqsk_queue_is_full(sk))
+ return 0;
+
+@@ -6876,13 +6883,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
+ bool want_cookie = false;
+ struct dst_entry *dst;
+ struct flowi fl;
++ u8 syncookies;
++
++ syncookies = READ_ONCE(net->ipv4.sysctl_tcp_syncookies);
+
+ /* TW buckets are converted to open requests without
+ * limitations, they conserve resources and peer is
+ * evidently real one.
+ */
+- if ((net->ipv4.sysctl_tcp_syncookies == 2 ||
+- inet_csk_reqsk_queue_is_full(sk)) && !isn) {
++ if ((syncookies == 2 || inet_csk_reqsk_queue_is_full(sk)) && !isn) {
+ want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name);
+ if (!want_cookie)
+ goto drop;
+@@ -6931,10 +6940,12 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
+ tcp_rsk(req)->ts_off = af_ops->init_ts_off(net, skb);
+
+ if (!want_cookie && !isn) {
++ int max_syn_backlog = READ_ONCE(net->ipv4.sysctl_max_syn_backlog);
++
+ /* Kill the following clause, if you dislike this way. */
+- if (!net->ipv4.sysctl_tcp_syncookies &&
+- (net->ipv4.sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) <
+- (net->ipv4.sysctl_max_syn_backlog >> 2)) &&
++ if (!syncookies &&
++ (max_syn_backlog - inet_csk_reqsk_queue_len(sk) <
++ (max_syn_backlog >> 2)) &&
+ !tcp_peer_is_proven(req, dst)) {
+ /* Without syncookies last quarter of
+ * backlog is filled with destinations,
+diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
+index cd78b4fc334f7..a57f96b868741 100644
+--- a/net/ipv4/tcp_ipv4.c
++++ b/net/ipv4/tcp_ipv4.c
+@@ -108,10 +108,10 @@ static u32 tcp_v4_init_ts_off(const struct net *net, const struct sk_buff *skb)
+
+ int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
+ {
++ int reuse = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_tw_reuse);
+ const struct inet_timewait_sock *tw = inet_twsk(sktw);
+ const struct tcp_timewait_sock *tcptw = tcp_twsk(sktw);
+ struct tcp_sock *tp = tcp_sk(sk);
+- int reuse = sock_net(sk)->ipv4.sysctl_tcp_tw_reuse;
+
+ if (reuse == 2) {
+ /* Still does not detect *everything* that goes through
+diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
+index 7029b0e98edb2..a501150deaa3b 100644
+--- a/net/ipv4/tcp_metrics.c
++++ b/net/ipv4/tcp_metrics.c
+@@ -428,7 +428,8 @@ void tcp_update_metrics(struct sock *sk)
+ if (!tcp_metric_locked(tm, TCP_METRIC_REORDERING)) {
+ val = tcp_metric_get(tm, TCP_METRIC_REORDERING);
+ if (val < tp->reordering &&
+- tp->reordering != net->ipv4.sysctl_tcp_reordering)
++ tp->reordering !=
++ READ_ONCE(net->ipv4.sysctl_tcp_reordering))
+ tcp_metric_set(tm, TCP_METRIC_REORDERING,
+ tp->reordering);
+ }
+diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
+index 6854bb1fb32b2..cb95d88497aea 100644
+--- a/net/ipv4/tcp_minisocks.c
++++ b/net/ipv4/tcp_minisocks.c
+@@ -173,7 +173,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
+ * Oh well... nobody has a sufficient solution to this
+ * protocol bug yet.
+ */
+- if (twsk_net(tw)->ipv4.sysctl_tcp_rfc1337 == 0) {
++ if (!READ_ONCE(twsk_net(tw)->ipv4.sysctl_tcp_rfc1337)) {
+ kill:
+ inet_twsk_deschedule_put(tw);
+ return TCP_TW_SUCCESS;
+@@ -781,7 +781,7 @@ listen_overflow:
+ if (sk != req->rsk_listener)
+ __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMIGRATEREQFAILURE);
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_abort_on_overflow) {
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_abort_on_overflow)) {
+ inet_rsk(req)->acked = 1;
+ return NULL;
+ }
+diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
+index 34249469e361f..3554a4c1e1b82 100644
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -789,18 +789,18 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
+ opts->mss = tcp_advertise_mss(sk);
+ remaining -= TCPOLEN_MSS_ALIGNED;
+
+- if (likely(sock_net(sk)->ipv4.sysctl_tcp_timestamps && !*md5)) {
++ if (likely(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps) && !*md5)) {
+ opts->options |= OPTION_TS;
+ opts->tsval = tcp_skb_timestamp(skb) + tp->tsoffset;
+ opts->tsecr = tp->rx_opt.ts_recent;
+ remaining -= TCPOLEN_TSTAMP_ALIGNED;
+ }
+- if (likely(sock_net(sk)->ipv4.sysctl_tcp_window_scaling)) {
++ if (likely(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_window_scaling))) {
+ opts->ws = tp->rx_opt.rcv_wscale;
+ opts->options |= OPTION_WSCALE;
+ remaining -= TCPOLEN_WSCALE_ALIGNED;
+ }
+- if (likely(sock_net(sk)->ipv4.sysctl_tcp_sack)) {
++ if (likely(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_sack))) {
+ opts->options |= OPTION_SACK_ADVERTISE;
+ if (unlikely(!(OPTION_TS & opts->options)))
+ remaining -= TCPOLEN_SACKPERM_ALIGNED;
+@@ -1717,7 +1717,8 @@ static inline int __tcp_mtu_to_mss(struct sock *sk, int pmtu)
+ mss_now -= icsk->icsk_ext_hdr_len;
+
+ /* Then reserve room for full set of TCP options and 8 bytes of data */
+- mss_now = max(mss_now, sock_net(sk)->ipv4.sysctl_tcp_min_snd_mss);
++ mss_now = max(mss_now,
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_min_snd_mss));
+ return mss_now;
+ }
+
+@@ -1760,10 +1761,10 @@ void tcp_mtup_init(struct sock *sk)
+ struct inet_connection_sock *icsk = inet_csk(sk);
+ struct net *net = sock_net(sk);
+
+- icsk->icsk_mtup.enabled = net->ipv4.sysctl_tcp_mtu_probing > 1;
++ icsk->icsk_mtup.enabled = READ_ONCE(net->ipv4.sysctl_tcp_mtu_probing) > 1;
+ icsk->icsk_mtup.search_high = tp->rx_opt.mss_clamp + sizeof(struct tcphdr) +
+ icsk->icsk_af_ops->net_header_len;
+- icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, net->ipv4.sysctl_tcp_base_mss);
++ icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, READ_ONCE(net->ipv4.sysctl_tcp_base_mss));
+ icsk->icsk_mtup.probe_size = 0;
+ if (icsk->icsk_mtup.enabled)
+ icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
+@@ -1895,7 +1896,7 @@ static void tcp_cwnd_validate(struct sock *sk, bool is_cwnd_limited)
+ if (tp->packets_out > tp->snd_cwnd_used)
+ tp->snd_cwnd_used = tp->packets_out;
+
+- if (sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle &&
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_slow_start_after_idle) &&
+ (s32)(tcp_jiffies32 - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto &&
+ !ca_ops->cong_control)
+ tcp_cwnd_application_limited(sk);
+@@ -2280,7 +2281,7 @@ static inline void tcp_mtu_check_reprobe(struct sock *sk)
+ u32 interval;
+ s32 delta;
+
+- interval = net->ipv4.sysctl_tcp_probe_interval;
++ interval = READ_ONCE(net->ipv4.sysctl_tcp_probe_interval);
+ delta = tcp_jiffies32 - icsk->icsk_mtup.probe_timestamp;
+ if (unlikely(delta >= interval * HZ)) {
+ int mss = tcp_current_mss(sk);
+@@ -2364,7 +2365,7 @@ static int tcp_mtu_probe(struct sock *sk)
+ * probing process by not resetting search range to its orignal.
+ */
+ if (probe_size > tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_high) ||
+- interval < net->ipv4.sysctl_tcp_probe_threshold) {
++ interval < READ_ONCE(net->ipv4.sysctl_tcp_probe_threshold)) {
+ /* Check whether enough time has elaplased for
+ * another round of probing.
+ */
+@@ -2738,7 +2739,7 @@ bool tcp_schedule_loss_probe(struct sock *sk, bool advancing_rto)
+ if (rcu_access_pointer(tp->fastopen_rsk))
+ return false;
+
+- early_retrans = sock_net(sk)->ipv4.sysctl_tcp_early_retrans;
++ early_retrans = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_early_retrans);
+ /* Schedule a loss probe in 2*RTT for SACK capable connections
+ * not in loss recovery, that are either limited by cwnd or application.
+ */
+@@ -3102,7 +3103,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
+ struct sk_buff *skb = to, *tmp;
+ bool first = true;
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_retrans_collapse)
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_retrans_collapse))
+ return;
+ if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)
+ return;
+@@ -3644,7 +3645,7 @@ static void tcp_connect_init(struct sock *sk)
+ * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT.
+ */
+ tp->tcp_header_len = sizeof(struct tcphdr);
+- if (sock_net(sk)->ipv4.sysctl_tcp_timestamps)
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps))
+ tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED;
+
+ #ifdef CONFIG_TCP_MD5SIG
+@@ -3680,7 +3681,7 @@ static void tcp_connect_init(struct sock *sk)
+ tp->advmss - (tp->rx_opt.ts_recent_stamp ? tp->tcp_header_len - sizeof(struct tcphdr) : 0),
+ &tp->rcv_wnd,
+ &tp->window_clamp,
+- sock_net(sk)->ipv4.sysctl_tcp_window_scaling,
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_window_scaling),
+ &rcv_wscale,
+ rcv_wnd);
+
+@@ -4087,7 +4088,7 @@ void tcp_send_probe0(struct sock *sk)
+
+ icsk->icsk_probes_out++;
+ if (err <= 0) {
+- if (icsk->icsk_backoff < net->ipv4.sysctl_tcp_retries2)
++ if (icsk->icsk_backoff < READ_ONCE(net->ipv4.sysctl_tcp_retries2))
+ icsk->icsk_backoff++;
+ timeout = tcp_probe0_when(sk, TCP_RTO_MAX);
+ } else {
+diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
+index fd113f6226efc..ac14216f6204f 100644
+--- a/net/ipv4/tcp_recovery.c
++++ b/net/ipv4/tcp_recovery.c
+@@ -19,7 +19,8 @@ static u32 tcp_rack_reo_wnd(const struct sock *sk)
+ return 0;
+
+ if (tp->sacked_out >= tp->reordering &&
+- !(sock_net(sk)->ipv4.sysctl_tcp_recovery & TCP_RACK_NO_DUPTHRESH))
++ !(READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_recovery) &
++ TCP_RACK_NO_DUPTHRESH))
+ return 0;
+ }
+
+@@ -192,7 +193,8 @@ void tcp_rack_update_reo_wnd(struct sock *sk, struct rate_sample *rs)
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+
+- if (sock_net(sk)->ipv4.sysctl_tcp_recovery & TCP_RACK_STATIC_REO_WND ||
++ if ((READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_recovery) &
++ TCP_RACK_STATIC_REO_WND) ||
+ !rs->prior_delivered)
+ return;
+
+diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
+index 20cf4a98c69d8..50bba370486e8 100644
+--- a/net/ipv4/tcp_timer.c
++++ b/net/ipv4/tcp_timer.c
+@@ -143,7 +143,7 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
+ */
+ static int tcp_orphan_retries(struct sock *sk, bool alive)
+ {
+- int retries = sock_net(sk)->ipv4.sysctl_tcp_orphan_retries; /* May be zero. */
++ int retries = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_orphan_retries); /* May be zero. */
+
+ /* We know from an ICMP that something is wrong. */
+ if (sk->sk_err_soft && !alive)
+@@ -163,7 +163,7 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk)
+ int mss;
+
+ /* Black hole detection */
+- if (!net->ipv4.sysctl_tcp_mtu_probing)
++ if (!READ_ONCE(net->ipv4.sysctl_tcp_mtu_probing))
+ return;
+
+ if (!icsk->icsk_mtup.enabled) {
+@@ -171,9 +171,9 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk)
+ icsk->icsk_mtup.probe_timestamp = tcp_jiffies32;
+ } else {
+ mss = tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low) >> 1;
+- mss = min(net->ipv4.sysctl_tcp_base_mss, mss);
+- mss = max(mss, net->ipv4.sysctl_tcp_mtu_probe_floor);
+- mss = max(mss, net->ipv4.sysctl_tcp_min_snd_mss);
++ mss = min(READ_ONCE(net->ipv4.sysctl_tcp_base_mss), mss);
++ mss = max(mss, READ_ONCE(net->ipv4.sysctl_tcp_mtu_probe_floor));
++ mss = max(mss, READ_ONCE(net->ipv4.sysctl_tcp_min_snd_mss));
+ icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
+ }
+ tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
+@@ -239,17 +239,18 @@ static int tcp_write_timeout(struct sock *sk)
+ if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
+ if (icsk->icsk_retransmits)
+ __dst_negative_advice(sk);
+- retry_until = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries;
++ retry_until = icsk->icsk_syn_retries ? :
++ READ_ONCE(net->ipv4.sysctl_tcp_syn_retries);
+ expired = icsk->icsk_retransmits >= retry_until;
+ } else {
+- if (retransmits_timed_out(sk, net->ipv4.sysctl_tcp_retries1, 0)) {
++ if (retransmits_timed_out(sk, READ_ONCE(net->ipv4.sysctl_tcp_retries1), 0)) {
+ /* Black hole detection */
+ tcp_mtu_probing(icsk, sk);
+
+ __dst_negative_advice(sk);
+ }
+
+- retry_until = net->ipv4.sysctl_tcp_retries2;
++ retry_until = READ_ONCE(net->ipv4.sysctl_tcp_retries2);
+ if (sock_flag(sk, SOCK_DEAD)) {
+ const bool alive = icsk->icsk_rto < TCP_RTO_MAX;
+
+@@ -380,7 +381,7 @@ static void tcp_probe_timer(struct sock *sk)
+ msecs_to_jiffies(icsk->icsk_user_timeout))
+ goto abort;
+
+- max_probes = sock_net(sk)->ipv4.sysctl_tcp_retries2;
++ max_probes = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_retries2);
+ if (sock_flag(sk, SOCK_DEAD)) {
+ const bool alive = inet_csk_rto_backoff(icsk, TCP_RTO_MAX) < TCP_RTO_MAX;
+
+@@ -406,12 +407,15 @@ abort: tcp_write_err(sk);
+ static void tcp_fastopen_synack_timer(struct sock *sk, struct request_sock *req)
+ {
+ struct inet_connection_sock *icsk = inet_csk(sk);
+- int max_retries = icsk->icsk_syn_retries ? :
+- sock_net(sk)->ipv4.sysctl_tcp_synack_retries + 1; /* add one more retry for fastopen */
+ struct tcp_sock *tp = tcp_sk(sk);
++ int max_retries;
+
+ req->rsk_ops->syn_ack_timeout(req);
+
++ /* add one more retry for fastopen */
++ max_retries = icsk->icsk_syn_retries ? :
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_synack_retries) + 1;
++
+ if (req->num_timeout >= max_retries) {
+ tcp_write_err(sk);
+ return;
+@@ -574,7 +578,7 @@ out_reset_timer:
+ * linear-timeout retransmissions into a black hole
+ */
+ if (sk->sk_state == TCP_ESTABLISHED &&
+- (tp->thin_lto || net->ipv4.sysctl_tcp_thin_linear_timeouts) &&
++ (tp->thin_lto || READ_ONCE(net->ipv4.sysctl_tcp_thin_linear_timeouts)) &&
+ tcp_stream_is_thin(tp) &&
+ icsk->icsk_retransmits <= TCP_THIN_LINEAR_RETRIES) {
+ icsk->icsk_backoff = 0;
+@@ -585,7 +589,7 @@ out_reset_timer:
+ }
+ inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
+ tcp_clamp_rto_to_user_timeout(sk), TCP_RTO_MAX);
+- if (retransmits_timed_out(sk, net->ipv4.sysctl_tcp_retries1 + 1, 0))
++ if (retransmits_timed_out(sk, READ_ONCE(net->ipv4.sysctl_tcp_retries1) + 1, 0))
+ __sk_dst_reset(sk);
+
+ out:;
+diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
+index 7d7b7523d1265..ef1e6545d8690 100644
+--- a/net/ipv6/af_inet6.c
++++ b/net/ipv6/af_inet6.c
+@@ -226,7 +226,7 @@ lookup_protocol:
+ RCU_INIT_POINTER(inet->mc_list, NULL);
+ inet->rcv_tos = 0;
+
+- if (net->ipv4.sysctl_ip_no_pmtu_disc)
++ if (READ_ONCE(net->ipv4.sysctl_ip_no_pmtu_disc))
+ inet->pmtudisc = IP_PMTUDISC_DONT;
+ else
+ inet->pmtudisc = IP_PMTUDISC_WANT;
+diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
+index 5b5ea35635f9f..caba03e551ef6 100644
+--- a/net/ipv6/ip6_input.c
++++ b/net/ipv6/ip6_input.c
+@@ -45,20 +45,23 @@
+ #include <net/inet_ecn.h>
+ #include <net/dst_metadata.h>
+
+-INDIRECT_CALLABLE_DECLARE(void tcp_v6_early_demux(struct sk_buff *));
+ static void ip6_rcv_finish_core(struct net *net, struct sock *sk,
+ struct sk_buff *skb)
+ {
+- void (*edemux)(struct sk_buff *skb);
+-
+- if (net->ipv4.sysctl_ip_early_demux && !skb_dst(skb) && skb->sk == NULL) {
+- const struct inet6_protocol *ipprot;
+-
+- ipprot = rcu_dereference(inet6_protos[ipv6_hdr(skb)->nexthdr]);
+- if (ipprot && (edemux = READ_ONCE(ipprot->early_demux)))
+- INDIRECT_CALL_2(edemux, tcp_v6_early_demux,
+- udp_v6_early_demux, skb);
++ if (READ_ONCE(net->ipv4.sysctl_ip_early_demux) &&
++ !skb_dst(skb) && !skb->sk) {
++ switch (ipv6_hdr(skb)->nexthdr) {
++ case IPPROTO_TCP:
++ if (READ_ONCE(net->ipv4.sysctl_tcp_early_demux))
++ tcp_v6_early_demux(skb);
++ break;
++ case IPPROTO_UDP:
++ if (READ_ONCE(net->ipv4.sysctl_udp_early_demux))
++ udp_v6_early_demux(skb);
++ break;
++ }
+ }
++
+ if (!skb_valid_dst(skb))
+ ip6_route_input(skb);
+ }
+diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
+index 9cc123f000fbc..5014aa6634527 100644
+--- a/net/ipv6/syncookies.c
++++ b/net/ipv6/syncookies.c
+@@ -141,7 +141,8 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
+ __u8 rcv_wscale;
+ u32 tsoff = 0;
+
+- if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies || !th->ack || th->rst)
++ if (!READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_syncookies) ||
++ !th->ack || th->rst)
+ goto out;
+
+ if (tcp_synq_no_recent_overflow(sk))
+diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
+index cbc5fff3d8466..5185c11dc4447 100644
+--- a/net/ipv6/tcp_ipv6.c
++++ b/net/ipv6/tcp_ipv6.c
+@@ -1822,7 +1822,7 @@ do_time_wait:
+ goto discard_it;
+ }
+
+-INDIRECT_CALLABLE_SCOPE void tcp_v6_early_demux(struct sk_buff *skb)
++void tcp_v6_early_demux(struct sk_buff *skb)
+ {
+ const struct ipv6hdr *hdr;
+ const struct tcphdr *th;
+@@ -2176,12 +2176,7 @@ struct proto tcpv6_prot = {
+ };
+ EXPORT_SYMBOL_GPL(tcpv6_prot);
+
+-/* thinking of making this const? Don't.
+- * early_demux can change based on sysctl.
+- */
+-static struct inet6_protocol tcpv6_protocol = {
+- .early_demux = tcp_v6_early_demux,
+- .early_demux_handler = tcp_v6_early_demux,
++static const struct inet6_protocol tcpv6_protocol = {
+ .handler = tcp_v6_rcv,
+ .err_handler = tcp_v6_err,
+ .flags = INET6_PROTO_NOPOLICY|INET6_PROTO_FINAL,
+diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
+index a535c3f2e4af4..aea28bf701be4 100644
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -1052,7 +1052,7 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
+ return NULL;
+ }
+
+-INDIRECT_CALLABLE_SCOPE void udp_v6_early_demux(struct sk_buff *skb)
++void udp_v6_early_demux(struct sk_buff *skb)
+ {
+ struct net *net = dev_net(skb->dev);
+ const struct udphdr *uh;
+@@ -1660,12 +1660,7 @@ int udpv6_getsockopt(struct sock *sk, int level, int optname,
+ return ipv6_getsockopt(sk, level, optname, optval, optlen);
+ }
+
+-/* thinking of making this const? Don't.
+- * early_demux can change based on sysctl.
+- */
+-static struct inet6_protocol udpv6_protocol = {
+- .early_demux = udp_v6_early_demux,
+- .early_demux_handler = udp_v6_early_demux,
++static const struct inet6_protocol udpv6_protocol = {
+ .handler = udpv6_rcv,
+ .err_handler = udpv6_err,
+ .flags = INET6_PROTO_NOPOLICY|INET6_PROTO_FINAL,
+diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c
+index e479dd0561c54..16915f8eef2b1 100644
+--- a/net/netfilter/nf_synproxy_core.c
++++ b/net/netfilter/nf_synproxy_core.c
+@@ -405,7 +405,7 @@ synproxy_build_ip(struct net *net, struct sk_buff *skb, __be32 saddr,
+ iph->tos = 0;
+ iph->id = 0;
+ iph->frag_off = htons(IP_DF);
+- iph->ttl = net->ipv4.sysctl_ip_default_ttl;
++ iph->ttl = READ_ONCE(net->ipv4.sysctl_ip_default_ttl);
+ iph->protocol = IPPROTO_TCP;
+ iph->check = 0;
+ iph->saddr = saddr;
+diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
+index 2d4dc1468a9a5..6fd33c75d6bb2 100644
+--- a/net/sched/cls_api.c
++++ b/net/sched/cls_api.c
+@@ -3531,7 +3531,7 @@ int tc_setup_action(struct flow_action *flow_action,
+ struct tc_action *actions[],
+ struct netlink_ext_ack *extack)
+ {
+- int i, j, index, err = 0;
++ int i, j, k, index, err = 0;
+ struct tc_action *act;
+
+ BUILD_BUG_ON(TCA_ACT_HW_STATS_ANY != FLOW_ACTION_HW_STATS_ANY);
+@@ -3551,14 +3551,18 @@ int tc_setup_action(struct flow_action *flow_action,
+ if (err)
+ goto err_out_locked;
+
+- entry->hw_stats = tc_act_hw_stats(act->hw_stats);
+- entry->hw_index = act->tcfa_index;
+ index = 0;
+ err = tc_setup_offload_act(act, entry, &index, extack);
+- if (!err)
+- j += index;
+- else
++ if (err)
+ goto err_out_locked;
++
++ for (k = 0; k < index ; k++) {
++ entry[k].hw_stats = tc_act_hw_stats(act->hw_stats);
++ entry[k].hw_index = act->tcfa_index;
++ }
++
++ j += index;
++
+ spin_unlock_bh(&act->tcfa_lock);
+ }
+
+diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
+index 35928fefae332..1a094b087d88b 100644
+--- a/net/sctp/protocol.c
++++ b/net/sctp/protocol.c
+@@ -358,7 +358,7 @@ static int sctp_v4_available(union sctp_addr *addr, struct sctp_sock *sp)
+ if (addr->v4.sin_addr.s_addr != htonl(INADDR_ANY) &&
+ ret != RTN_LOCAL &&
+ !sp->inet.freebind &&
+- !net->ipv4.sysctl_ip_nonlocal_bind)
++ !READ_ONCE(net->ipv4.sysctl_ip_nonlocal_bind))
+ return 0;
+
+ if (ipv6_only_sock(sctp_opt2sk(sp)))
+diff --git a/net/smc/smc_llc.c b/net/smc/smc_llc.c
+index c4d057b2941d5..0bde36b564727 100644
+--- a/net/smc/smc_llc.c
++++ b/net/smc/smc_llc.c
+@@ -2122,7 +2122,7 @@ void smc_llc_lgr_init(struct smc_link_group *lgr, struct smc_sock *smc)
+ init_waitqueue_head(&lgr->llc_flow_waiter);
+ init_waitqueue_head(&lgr->llc_msg_waiter);
+ mutex_init(&lgr->llc_conf_mutex);
+- lgr->llc_testlink_time = net->ipv4.sysctl_tcp_keepalive_time;
++ lgr->llc_testlink_time = READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time);
+ }
+
+ /* called after lgr was removed from lgr_list */
+diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
+index 3a61bb5945441..9c3933781ad47 100644
+--- a/net/tls/tls_device.c
++++ b/net/tls/tls_device.c
+@@ -97,13 +97,16 @@ static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
+ unsigned long flags;
+
+ spin_lock_irqsave(&tls_device_lock, flags);
++ if (unlikely(!refcount_dec_and_test(&ctx->refcount)))
++ goto unlock;
++
+ list_move_tail(&ctx->list, &tls_device_gc_list);
+
+ /* schedule_work inside the spinlock
+ * to make sure tls_device_down waits for that work.
+ */
+ schedule_work(&tls_device_gc_work);
+-
++unlock:
+ spin_unlock_irqrestore(&tls_device_lock, flags);
+ }
+
+@@ -194,8 +197,7 @@ void tls_device_sk_destruct(struct sock *sk)
+ clean_acked_data_disable(inet_csk(sk));
+ }
+
+- if (refcount_dec_and_test(&tls_ctx->refcount))
+- tls_device_queue_ctx_destruction(tls_ctx);
++ tls_device_queue_ctx_destruction(tls_ctx);
+ }
+ EXPORT_SYMBOL_GPL(tls_device_sk_destruct);
+
+diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
+index f1876ea61fdce..f1a0bab920a55 100644
+--- a/net/xfrm/xfrm_policy.c
++++ b/net/xfrm/xfrm_policy.c
+@@ -2678,8 +2678,10 @@ static int xfrm_expand_policies(const struct flowi *fl, u16 family,
+ *num_xfrms = 0;
+ return 0;
+ }
+- if (IS_ERR(pols[0]))
++ if (IS_ERR(pols[0])) {
++ *num_pols = 0;
+ return PTR_ERR(pols[0]);
++ }
+
+ *num_xfrms = pols[0]->xfrm_nr;
+
+@@ -2694,6 +2696,7 @@ static int xfrm_expand_policies(const struct flowi *fl, u16 family,
+ if (pols[1]) {
+ if (IS_ERR(pols[1])) {
+ xfrm_pols_put(pols, *num_pols);
++ *num_pols = 0;
+ return PTR_ERR(pols[1]);
+ }
+ (*num_pols)++;
+diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
+index b749935152ba5..b4ce16a934a28 100644
+--- a/net/xfrm/xfrm_state.c
++++ b/net/xfrm/xfrm_state.c
+@@ -2620,7 +2620,7 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload)
+ int err;
+
+ if (family == AF_INET &&
+- xs_net(x)->ipv4.sysctl_ip_no_pmtu_disc)
++ READ_ONCE(xs_net(x)->ipv4.sysctl_ip_no_pmtu_disc))
+ x->props.flags |= XFRM_STATE_NOPMTUDISC;
+
+ err = -EPROTONOSUPPORT;
+diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
+index eea6e92500b8e..50ebad1e3abb1 100644
+--- a/security/integrity/ima/ima_policy.c
++++ b/security/integrity/ima/ima_policy.c
+@@ -2181,6 +2181,10 @@ bool ima_appraise_signature(enum kernel_read_file_id id)
+ if (id >= READING_MAX_ID)
+ return false;
+
++ if (id == READING_KEXEC_IMAGE && !(ima_appraise & IMA_APPRAISE_ENFORCE)
++ && security_locked_down(LOCKDOWN_KEXEC))
++ return false;
++
+ func = read_idmap[id] ?: FILE_CHECK;
+
+ rcu_read_lock();
+diff --git a/sound/soc/sof/intel/hda-loader.c b/sound/soc/sof/intel/hda-loader.c
+index 88d23924e1bf2..87313145d14f9 100644
+--- a/sound/soc/sof/intel/hda-loader.c
++++ b/sound/soc/sof/intel/hda-loader.c
+@@ -397,7 +397,8 @@ int hda_dsp_cl_boot_firmware(struct snd_sof_dev *sdev)
+ struct firmware stripped_firmware;
+ int ret, ret1, i;
+
+- if ((sdev->fw_ready.flags & SOF_IPC_INFO_D3_PERSISTENT) &&
++ if ((sdev->system_suspend_target < SOF_SUSPEND_S4) &&
++ (sdev->fw_ready.flags & SOF_IPC_INFO_D3_PERSISTENT) &&
+ !(sof_debug_check_flag(SOF_DBG_IGNORE_D3_PERSISTENT)) &&
+ !sdev->first_boot) {
+ dev_dbg(sdev->dev, "IMR restore supported, booting from IMR directly\n");
+diff --git a/sound/soc/sof/pm.c b/sound/soc/sof/pm.c
+index 1c319582ca6f0..76351f7f3243d 100644
+--- a/sound/soc/sof/pm.c
++++ b/sound/soc/sof/pm.c
+@@ -23,6 +23,9 @@ static u32 snd_sof_dsp_power_target(struct snd_sof_dev *sdev)
+ u32 target_dsp_state;
+
+ switch (sdev->system_suspend_target) {
++ case SOF_SUSPEND_S5:
++ case SOF_SUSPEND_S4:
++ /* DSP should be in D3 if the system is suspending to S3+ */
+ case SOF_SUSPEND_S3:
+ /* DSP should be in D3 if the system is suspending to S3 */
+ target_dsp_state = SOF_DSP_PM_D3;
+@@ -327,8 +330,24 @@ int snd_sof_prepare(struct device *dev)
+ return 0;
+
+ #if defined(CONFIG_ACPI)
+- if (acpi_target_system_state() == ACPI_STATE_S0)
++ switch (acpi_target_system_state()) {
++ case ACPI_STATE_S0:
+ sdev->system_suspend_target = SOF_SUSPEND_S0IX;
++ break;
++ case ACPI_STATE_S1:
++ case ACPI_STATE_S2:
++ case ACPI_STATE_S3:
++ sdev->system_suspend_target = SOF_SUSPEND_S3;
++ break;
++ case ACPI_STATE_S4:
++ sdev->system_suspend_target = SOF_SUSPEND_S4;
++ break;
++ case ACPI_STATE_S5:
++ sdev->system_suspend_target = SOF_SUSPEND_S5;
++ break;
++ default:
++ break;
++ }
+ #endif
+
+ return 0;
+diff --git a/sound/soc/sof/sof-priv.h b/sound/soc/sof/sof-priv.h
+index 0d9b640ae24cd..c856f0d84e495 100644
+--- a/sound/soc/sof/sof-priv.h
++++ b/sound/soc/sof/sof-priv.h
+@@ -85,6 +85,8 @@ enum sof_system_suspend_state {
+ SOF_SUSPEND_NONE = 0,
+ SOF_SUSPEND_S0IX,
+ SOF_SUSPEND_S3,
++ SOF_SUSPEND_S4,
++ SOF_SUSPEND_S5,
+ };
+
+ enum sof_dfsentry_type {
+diff --git a/tools/perf/tests/perf-time-to-tsc.c b/tools/perf/tests/perf-time-to-tsc.c
+index 4ad0dfbc8b21f..7c7d20fc503ad 100644
+--- a/tools/perf/tests/perf-time-to-tsc.c
++++ b/tools/perf/tests/perf-time-to-tsc.c
+@@ -20,8 +20,6 @@
+ #include "tsc.h"
+ #include "mmap.h"
+ #include "tests.h"
+-#include "pmu.h"
+-#include "pmu-hybrid.h"
+
+ /*
+ * Except x86_64/i386 and Arm64, other archs don't support TSC in perf. Just
+@@ -106,28 +104,21 @@ static int test__perf_time_to_tsc(struct test_suite *test __maybe_unused, int su
+
+ evlist__config(evlist, &opts, NULL);
+
+- evsel = evlist__first(evlist);
+-
+- evsel->core.attr.comm = 1;
+- evsel->core.attr.disabled = 1;
+- evsel->core.attr.enable_on_exec = 0;
+-
+- /*
+- * For hybrid "cycles:u", it creates two events.
+- * Init the second evsel here.
+- */
+- if (perf_pmu__has_hybrid() && perf_pmu__hybrid_mounted("cpu_atom")) {
+- evsel = evsel__next(evsel);
++ /* For hybrid "cycles:u", it creates two events */
++ evlist__for_each_entry(evlist, evsel) {
+ evsel->core.attr.comm = 1;
+ evsel->core.attr.disabled = 1;
+ evsel->core.attr.enable_on_exec = 0;
+ }
+
+- if (evlist__open(evlist) == -ENOENT) {
+- err = TEST_SKIP;
++ ret = evlist__open(evlist);
++ if (ret < 0) {
++ if (ret == -ENOENT)
++ err = TEST_SKIP;
++ else
++ pr_debug("evlist__open() failed\n");
+ goto out_err;
+ }
+- CHECK__(evlist__open(evlist));
+
+ CHECK__(evlist__mmap(evlist, UINT_MAX));
+
+@@ -167,10 +158,12 @@ static int test__perf_time_to_tsc(struct test_suite *test __maybe_unused, int su
+ goto next_event;
+
+ if (strcmp(event->comm.comm, comm1) == 0) {
++ CHECK_NOT_NULL__(evsel = evlist__event2evsel(evlist, event));
+ CHECK__(evsel__parse_sample(evsel, event, &sample));
+ comm1_time = sample.time;
+ }
+ if (strcmp(event->comm.comm, comm2) == 0) {
++ CHECK_NOT_NULL__(evsel = evlist__event2evsel(evlist, event));
+ CHECK__(evsel__parse_sample(evsel, event, &sample));
+ comm2_time = sample.time;
+ }
+diff --git a/tools/testing/selftests/gpio/Makefile b/tools/testing/selftests/gpio/Makefile
+index 71b3066023685..616ed40196554 100644
+--- a/tools/testing/selftests/gpio/Makefile
++++ b/tools/testing/selftests/gpio/Makefile
+@@ -3,6 +3,6 @@
+ TEST_PROGS := gpio-mockup.sh gpio-sim.sh
+ TEST_FILES := gpio-mockup-sysfs.sh
+ TEST_GEN_PROGS_EXTENDED := gpio-mockup-cdev gpio-chip-info gpio-line-name
+-CFLAGS += -O2 -g -Wall -I../../../../usr/include/
++CFLAGS += -O2 -g -Wall -I../../../../usr/include/ $(KHDR_INCLUDES)
+
+ include ../lib.mk
+diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
+index 4158da0da2bba..2237d1aac8014 100644
+--- a/tools/testing/selftests/kvm/rseq_test.c
++++ b/tools/testing/selftests/kvm/rseq_test.c
+@@ -82,8 +82,9 @@ static int next_cpu(int cpu)
+ return cpu;
+ }
+
+-static void *migration_worker(void *ign)
++static void *migration_worker(void *__rseq_tid)
+ {
++ pid_t rseq_tid = (pid_t)(unsigned long)__rseq_tid;
+ cpu_set_t allowed_mask;
+ int r, i, cpu;
+
+@@ -106,7 +107,7 @@ static void *migration_worker(void *ign)
+ * stable, i.e. while changing affinity is in-progress.
+ */
+ smp_wmb();
+- r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
++ r = sched_setaffinity(rseq_tid, sizeof(allowed_mask), &allowed_mask);
+ TEST_ASSERT(!r, "sched_setaffinity failed, errno = %d (%s)",
+ errno, strerror(errno));
+ smp_wmb();
+@@ -231,7 +232,8 @@ int main(int argc, char *argv[])
+ vm = vm_create_default(VCPU_ID, 0, guest_code);
+ ucall_init(vm, NULL);
+
+- pthread_create(&migration_thread, NULL, migration_worker, 0);
++ pthread_create(&migration_thread, NULL, migration_worker,
++ (void *)(unsigned long)gettid());
+
+ for (i = 0; !done; i++) {
+ vcpu_run(vm, VCPU_ID);
+diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
+index 5ab12214e18dd..24cb37d19c638 100644
+--- a/virt/kvm/kvm_main.c
++++ b/virt/kvm/kvm_main.c
+@@ -4299,8 +4299,11 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
+ kvm_put_kvm_no_destroy(kvm);
+ mutex_lock(&kvm->lock);
+ list_del(&dev->vm_node);
++ if (ops->release)
++ ops->release(dev);
+ mutex_unlock(&kvm->lock);
+- ops->destroy(dev);
++ if (ops->destroy)
++ ops->destroy(dev);
+ return ret;
+ }
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-08-03 14:22 Alice Ferrazzi
0 siblings, 0 replies; 31+ messages in thread
From: Alice Ferrazzi @ 2022-08-03 14:22 UTC (permalink / raw
To: gentoo-commits
commit: 2fa036fe7ac5f3ac44227817160b9a6f4608e626
Author: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 3 14:08:37 2022 +0000
Commit: Alice Ferrazzi <alicef <AT> gentoo <DOT> org>
CommitDate: Wed Aug 3 14:08:46 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=2fa036fe
Linux patch 5.18.16
Signed-off-by: Alice Ferrazzi <alicef <AT> gentoo.org>
0000_README | 4 +
1015_linux-5.18.16.patch | 3168 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 3172 insertions(+)
diff --git a/0000_README b/0000_README
index 7c448f23..efa0b25e 100644
--- a/0000_README
+++ b/0000_README
@@ -103,6 +103,10 @@ Patch: 1014_linux-5.18.15.patch
From: http://www.kernel.org
Desc: Linux 5.18.15
+Patch: 1015_linux-5.18.16.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.16
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1015_linux-5.18.16.patch b/1015_linux-5.18.16.patch
new file mode 100644
index 00000000..79bc2313
--- /dev/null
+++ b/1015_linux-5.18.16.patch
@@ -0,0 +1,3168 @@
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index eb92195ca0155..334544c893ddf 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -3106,6 +3106,7 @@
+ no_entry_flush [PPC]
+ no_uaccess_flush [PPC]
+ mmio_stale_data=off [X86]
++ retbleed=off [X86]
+
+ Exceptions:
+ This does not have any effect on
+@@ -3128,6 +3129,7 @@
+ mds=full,nosmt [X86]
+ tsx_async_abort=full,nosmt [X86]
+ mmio_stale_data=full,nosmt [X86]
++ retbleed=auto,nosmt [X86]
+
+ mminit_loglevel=
+ [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
+diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
+index 8899b474edbfd..e29017d4d7a25 100644
+--- a/Documentation/networking/ip-sysctl.rst
++++ b/Documentation/networking/ip-sysctl.rst
+@@ -2848,7 +2848,14 @@ sctp_rmem - vector of 3 INTEGERs: min, default, max
+ Default: 4K
+
+ sctp_wmem - vector of 3 INTEGERs: min, default, max
+- Currently this tunable has no effect.
++ Only the first value ("min") is used, "default" and "max" are
++ ignored.
++
++ min: Minimum size of send buffer that can be used by SCTP sockets.
++ It is guaranteed to each SCTP socket (but not association) even
++ under moderate memory pressure.
++
++ Default: 4K
+
+ addr_scope_policy - INTEGER
+ Control IPv4 address scoping - draft-stewart-tsvwg-sctp-ipv4-00
+diff --git a/Makefile b/Makefile
+index 5957afa296922..18bcbcd037f0a 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 15
++SUBLEVEL = 16
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm/boot/dts/lan966x.dtsi b/arch/arm/boot/dts/lan966x.dtsi
+index 5e9cbc8cdcbce..a99ffb4cfb8a6 100644
+--- a/arch/arm/boot/dts/lan966x.dtsi
++++ b/arch/arm/boot/dts/lan966x.dtsi
+@@ -38,7 +38,7 @@
+ sys_clk: sys_clk {
+ compatible = "fixed-clock";
+ #clock-cells = <0>;
+- clock-frequency = <162500000>;
++ clock-frequency = <165625000>;
+ };
+
+ cpu_clk: cpu_clk {
+diff --git a/arch/arm/include/asm/dma.h b/arch/arm/include/asm/dma.h
+index a81dda65c5762..45180a2cc47cb 100644
+--- a/arch/arm/include/asm/dma.h
++++ b/arch/arm/include/asm/dma.h
+@@ -10,7 +10,7 @@
+ #else
+ #define MAX_DMA_ADDRESS ({ \
+ extern phys_addr_t arm_dma_zone_size; \
+- arm_dma_zone_size && arm_dma_zone_size < (0x10000000 - PAGE_OFFSET) ? \
++ arm_dma_zone_size && arm_dma_zone_size < (0x100000000ULL - PAGE_OFFSET) ? \
+ (PAGE_OFFSET + arm_dma_zone_size) : 0xffffffffUL; })
+ #endif
+
+diff --git a/arch/arm/mach-pxa/corgi.c b/arch/arm/mach-pxa/corgi.c
+index 44659fbc37bab..036f3aacd0e19 100644
+--- a/arch/arm/mach-pxa/corgi.c
++++ b/arch/arm/mach-pxa/corgi.c
+@@ -531,7 +531,7 @@ static struct pxa2xx_spi_controller corgi_spi_info = {
+ };
+
+ static struct gpiod_lookup_table corgi_spi_gpio_table = {
+- .dev_id = "pxa2xx-spi.1",
++ .dev_id = "spi1",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", CORGI_GPIO_ADS7846_CS, "cs", 0, GPIO_ACTIVE_LOW),
+ GPIO_LOOKUP_IDX("gpio-pxa", CORGI_GPIO_LCDCON_CS, "cs", 1, GPIO_ACTIVE_LOW),
+diff --git a/arch/arm/mach-pxa/hx4700.c b/arch/arm/mach-pxa/hx4700.c
+index e1870fbb19e7e..188707316e6ec 100644
+--- a/arch/arm/mach-pxa/hx4700.c
++++ b/arch/arm/mach-pxa/hx4700.c
+@@ -635,7 +635,7 @@ static struct pxa2xx_spi_controller pxa_ssp2_master_info = {
+ };
+
+ static struct gpiod_lookup_table pxa_ssp2_gpio_table = {
+- .dev_id = "pxa2xx-spi.2",
++ .dev_id = "spi2",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", GPIO88_HX4700_TSC2046_CS, "cs", 0, GPIO_ACTIVE_LOW),
+ { },
+diff --git a/arch/arm/mach-pxa/icontrol.c b/arch/arm/mach-pxa/icontrol.c
+index 753fe166ab681..624088257cfc8 100644
+--- a/arch/arm/mach-pxa/icontrol.c
++++ b/arch/arm/mach-pxa/icontrol.c
+@@ -140,7 +140,7 @@ struct platform_device pxa_spi_ssp4 = {
+ };
+
+ static struct gpiod_lookup_table pxa_ssp3_gpio_table = {
+- .dev_id = "pxa2xx-spi.3",
++ .dev_id = "spi3",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS1, "cs", 0, GPIO_ACTIVE_LOW),
+ GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS2, "cs", 1, GPIO_ACTIVE_LOW),
+@@ -149,7 +149,7 @@ static struct gpiod_lookup_table pxa_ssp3_gpio_table = {
+ };
+
+ static struct gpiod_lookup_table pxa_ssp4_gpio_table = {
+- .dev_id = "pxa2xx-spi.4",
++ .dev_id = "spi4",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS3, "cs", 0, GPIO_ACTIVE_LOW),
+ GPIO_LOOKUP_IDX("gpio-pxa", ICONTROL_MCP251x_nCS4, "cs", 1, GPIO_ACTIVE_LOW),
+diff --git a/arch/arm/mach-pxa/littleton.c b/arch/arm/mach-pxa/littleton.c
+index 73f5953b3bb6b..3d04912d44d41 100644
+--- a/arch/arm/mach-pxa/littleton.c
++++ b/arch/arm/mach-pxa/littleton.c
+@@ -208,7 +208,7 @@ static struct spi_board_info littleton_spi_devices[] __initdata = {
+ };
+
+ static struct gpiod_lookup_table littleton_spi_gpio_table = {
+- .dev_id = "pxa2xx-spi.2",
++ .dev_id = "spi2",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", LITTLETON_GPIO_LCD_CS, "cs", 0, GPIO_ACTIVE_LOW),
+ { },
+diff --git a/arch/arm/mach-pxa/magician.c b/arch/arm/mach-pxa/magician.c
+index fcced6499faee..828d6e1cd0387 100644
+--- a/arch/arm/mach-pxa/magician.c
++++ b/arch/arm/mach-pxa/magician.c
+@@ -946,7 +946,7 @@ static struct pxa2xx_spi_controller magician_spi_info = {
+ };
+
+ static struct gpiod_lookup_table magician_spi_gpio_table = {
+- .dev_id = "pxa2xx-spi.2",
++ .dev_id = "spi2",
+ .table = {
+ /* NOTICE must be GPIO, incompatibility with hw PXA SPI framing */
+ GPIO_LOOKUP_IDX("gpio-pxa", GPIO14_MAGICIAN_TSC2046_CS, "cs", 0, GPIO_ACTIVE_LOW),
+diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c
+index a648e7094e84e..36aee8de18729 100644
+--- a/arch/arm/mach-pxa/spitz.c
++++ b/arch/arm/mach-pxa/spitz.c
+@@ -578,7 +578,7 @@ static struct pxa2xx_spi_controller spitz_spi_info = {
+ };
+
+ static struct gpiod_lookup_table spitz_spi_gpio_table = {
+- .dev_id = "pxa2xx-spi.2",
++ .dev_id = "spi2",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", SPITZ_GPIO_ADS7846_CS, "cs", 0, GPIO_ACTIVE_LOW),
+ GPIO_LOOKUP_IDX("gpio-pxa", SPITZ_GPIO_LCDCON_CS, "cs", 1, GPIO_ACTIVE_LOW),
+diff --git a/arch/arm/mach-pxa/z2.c b/arch/arm/mach-pxa/z2.c
+index 7eaeda2699270..7b18d1f90309e 100644
+--- a/arch/arm/mach-pxa/z2.c
++++ b/arch/arm/mach-pxa/z2.c
+@@ -623,7 +623,7 @@ static struct pxa2xx_spi_controller pxa_ssp2_master_info = {
+ };
+
+ static struct gpiod_lookup_table pxa_ssp1_gpio_table = {
+- .dev_id = "pxa2xx-spi.1",
++ .dev_id = "spi1",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", GPIO24_ZIPITZ2_WIFI_CS, "cs", 0, GPIO_ACTIVE_LOW),
+ { },
+@@ -631,7 +631,7 @@ static struct gpiod_lookup_table pxa_ssp1_gpio_table = {
+ };
+
+ static struct gpiod_lookup_table pxa_ssp2_gpio_table = {
+- .dev_id = "pxa2xx-spi.2",
++ .dev_id = "spi2",
+ .table = {
+ GPIO_LOOKUP_IDX("gpio-pxa", GPIO88_ZIPITZ2_LCD_CS, "cs", 0, GPIO_ACTIVE_LOW),
+ { },
+diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
+index 2c6e1c6ecbe78..4120c428dc378 100644
+--- a/arch/s390/include/asm/archrandom.h
++++ b/arch/s390/include/asm/archrandom.h
+@@ -2,7 +2,7 @@
+ /*
+ * Kernel interface for the s390 arch_random_* functions
+ *
+- * Copyright IBM Corp. 2017, 2020
++ * Copyright IBM Corp. 2017, 2022
+ *
+ * Author: Harald Freudenberger <freude@de.ibm.com>
+ *
+@@ -14,6 +14,7 @@
+ #ifdef CONFIG_ARCH_RANDOM
+
+ #include <linux/static_key.h>
++#include <linux/preempt.h>
+ #include <linux/atomic.h>
+ #include <asm/cpacf.h>
+
+@@ -32,7 +33,8 @@ static inline bool __must_check arch_get_random_int(unsigned int *v)
+
+ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
+ {
+- if (static_branch_likely(&s390_arch_random_available)) {
++ if (static_branch_likely(&s390_arch_random_available) &&
++ in_task()) {
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
+@@ -42,7 +44,8 @@ static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
+
+ static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
+ {
+- if (static_branch_likely(&s390_arch_random_available)) {
++ if (static_branch_likely(&s390_arch_random_available) &&
++ in_task()) {
+ cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v));
+ atomic64_add(sizeof(*v), &s390_arch_random_counter);
+ return true;
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index 8179fa4d5004f..fd986a8ba2bd7 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -1513,6 +1513,7 @@ static void __init spectre_v2_select_mitigation(void)
+ * enable IBRS around firmware calls.
+ */
+ if (boot_cpu_has_bug(X86_BUG_RETBLEED) &&
++ boot_cpu_has(X86_FEATURE_IBPB) &&
+ (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
+ boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)) {
+
+diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
+index 6d1ddecbf0da3..d0a9ccf640c4b 100644
+--- a/drivers/edac/ghes_edac.c
++++ b/drivers/edac/ghes_edac.c
+@@ -101,9 +101,14 @@ static void dimm_setup_label(struct dimm_info *dimm, u16 handle)
+
+ dmi_memdev_name(handle, &bank, &device);
+
+- /* both strings must be non-zero */
+- if (bank && *bank && device && *device)
+- snprintf(dimm->label, sizeof(dimm->label), "%s %s", bank, device);
++ /*
++ * Set to a NULL string when both bank and device are zero. In this case,
++ * the label assigned by default will be preserved.
++ */
++ snprintf(dimm->label, sizeof(dimm->label), "%s%s%s",
++ (bank && *bank) ? bank : "",
++ (bank && *bank && device && *device) ? " " : "",
++ (device && *device) ? device : "");
+ }
+
+ static void assign_dmi_dimm_info(struct dimm_info *dimm, struct memdev_dmi_entry *entry)
+diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
+index 40b1abeca8562..a14baeca64004 100644
+--- a/drivers/edac/synopsys_edac.c
++++ b/drivers/edac/synopsys_edac.c
+@@ -527,6 +527,28 @@ static void handle_error(struct mem_ctl_info *mci, struct synps_ecc_status *p)
+ memset(p, 0, sizeof(*p));
+ }
+
++static void enable_intr(struct synps_edac_priv *priv)
++{
++ /* Enable UE/CE Interrupts */
++ if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
++ writel(DDR_UE_MASK | DDR_CE_MASK,
++ priv->baseaddr + ECC_CLR_OFST);
++ else
++ writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
++ priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
++
++}
++
++static void disable_intr(struct synps_edac_priv *priv)
++{
++ /* Disable UE/CE Interrupts */
++ if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
++ writel(0x0, priv->baseaddr + ECC_CLR_OFST);
++ else
++ writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
++ priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
++}
++
+ /**
+ * intr_handler - Interrupt Handler for ECC interrupts.
+ * @irq: IRQ number.
+@@ -568,6 +590,9 @@ static irqreturn_t intr_handler(int irq, void *dev_id)
+ /* v3.0 of the controller does not have this register */
+ if (!(priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR))
+ writel(regval, priv->baseaddr + DDR_QOS_IRQ_STAT_OFST);
++ else
++ enable_intr(priv);
++
+ return IRQ_HANDLED;
+ }
+
+@@ -850,25 +875,6 @@ static void mc_init(struct mem_ctl_info *mci, struct platform_device *pdev)
+ init_csrows(mci);
+ }
+
+-static void enable_intr(struct synps_edac_priv *priv)
+-{
+- /* Enable UE/CE Interrupts */
+- if (priv->p_data->quirks & DDR_ECC_INTR_SELF_CLEAR)
+- writel(DDR_UE_MASK | DDR_CE_MASK,
+- priv->baseaddr + ECC_CLR_OFST);
+- else
+- writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
+- priv->baseaddr + DDR_QOS_IRQ_EN_OFST);
+-
+-}
+-
+-static void disable_intr(struct synps_edac_priv *priv)
+-{
+- /* Disable UE/CE Interrupts */
+- writel(DDR_QOSUE_MASK | DDR_QOSCE_MASK,
+- priv->baseaddr + DDR_QOS_IRQ_DB_OFST);
+-}
+-
+ static int setup_irq(struct mem_ctl_info *mci,
+ struct platform_device *pdev)
+ {
+diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
+index 7ba66ad68a8a1..16356611b5b95 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
++++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
+@@ -680,7 +680,11 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
+ goto out_free_dma;
+
+ for (i = 0; i < npages; i += max) {
+- args.end = start + (max << PAGE_SHIFT);
++ if (args.start + (max << PAGE_SHIFT) > end)
++ args.end = end;
++ else
++ args.end = args.start + (max << PAGE_SHIFT);
++
+ ret = migrate_vma_setup(&args);
+ if (ret)
+ goto out_free_pfns;
+diff --git a/drivers/gpu/drm/tiny/simpledrm.c b/drivers/gpu/drm/tiny/simpledrm.c
+index f5b8e864a5cd9..b1a88675bd473 100644
+--- a/drivers/gpu/drm/tiny/simpledrm.c
++++ b/drivers/gpu/drm/tiny/simpledrm.c
+@@ -627,7 +627,7 @@ static const struct drm_connector_funcs simpledrm_connector_funcs = {
+ .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+ };
+
+-static int
++static enum drm_mode_status
+ simpledrm_simple_display_pipe_mode_valid(struct drm_simple_display_pipe *pipe,
+ const struct drm_display_mode *mode)
+ {
+diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
+index b463d85bfb358..47b68c6071bef 100644
+--- a/drivers/idle/intel_idle.c
++++ b/drivers/idle/intel_idle.c
+@@ -162,7 +162,13 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
+
+ raw_local_irq_enable();
+ ret = __intel_idle(dev, drv, index);
+- raw_local_irq_disable();
++
++ /*
++ * The lockdep hardirqs state may be changed to 'on' with timer
++ * tick interrupt followed by __do_softirq(). Use local_irq_disable()
++ * to keep the hardirqs state correct.
++ */
++ local_irq_disable();
+
+ return ret;
+ }
+diff --git a/drivers/net/ethernet/fungible/funeth/funeth_rx.c b/drivers/net/ethernet/fungible/funeth/funeth_rx.c
+index 0f6a549b9f679..29a6c2ede43a6 100644
+--- a/drivers/net/ethernet/fungible/funeth/funeth_rx.c
++++ b/drivers/net/ethernet/fungible/funeth/funeth_rx.c
+@@ -142,6 +142,7 @@ static void *fun_run_xdp(struct funeth_rxq *q, skb_frag_t *frags, void *buf_va,
+ int ref_ok, struct funeth_txq *xdp_q)
+ {
+ struct bpf_prog *xdp_prog;
++ struct xdp_frame *xdpf;
+ struct xdp_buff xdp;
+ u32 act;
+
+@@ -163,7 +164,9 @@ static void *fun_run_xdp(struct funeth_rxq *q, skb_frag_t *frags, void *buf_va,
+ case XDP_TX:
+ if (unlikely(!ref_ok))
+ goto pass;
+- if (!fun_xdp_tx(xdp_q, xdp.data, xdp.data_end - xdp.data))
++
++ xdpf = xdp_convert_buff_to_frame(&xdp);
++ if (!xdpf || !fun_xdp_tx(xdp_q, xdpf))
+ goto xdp_error;
+ FUN_QSTAT_INC(q, xdp_tx);
+ q->xdp_flush |= FUN_XDP_FLUSH_TX;
+diff --git a/drivers/net/ethernet/fungible/funeth/funeth_tx.c b/drivers/net/ethernet/fungible/funeth/funeth_tx.c
+index ff6e292372535..2f6698b98b034 100644
+--- a/drivers/net/ethernet/fungible/funeth/funeth_tx.c
++++ b/drivers/net/ethernet/fungible/funeth/funeth_tx.c
+@@ -466,7 +466,7 @@ static unsigned int fun_xdpq_clean(struct funeth_txq *q, unsigned int budget)
+
+ do {
+ fun_xdp_unmap(q, reclaim_idx);
+- page_frag_free(q->info[reclaim_idx].vaddr);
++ xdp_return_frame(q->info[reclaim_idx].xdpf);
+
+ trace_funeth_tx_free(q, reclaim_idx, 1, head);
+
+@@ -479,11 +479,11 @@ static unsigned int fun_xdpq_clean(struct funeth_txq *q, unsigned int budget)
+ return npkts;
+ }
+
+-bool fun_xdp_tx(struct funeth_txq *q, void *data, unsigned int len)
++bool fun_xdp_tx(struct funeth_txq *q, struct xdp_frame *xdpf)
+ {
+ struct fun_eth_tx_req *req;
+ struct fun_dataop_gl *gle;
+- unsigned int idx;
++ unsigned int idx, len;
+ dma_addr_t dma;
+
+ if (fun_txq_avail(q) < FUN_XDP_CLEAN_THRES)
+@@ -494,7 +494,8 @@ bool fun_xdp_tx(struct funeth_txq *q, void *data, unsigned int len)
+ return false;
+ }
+
+- dma = dma_map_single(q->dma_dev, data, len, DMA_TO_DEVICE);
++ len = xdpf->len;
++ dma = dma_map_single(q->dma_dev, xdpf->data, len, DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(q->dma_dev, dma))) {
+ FUN_QSTAT_INC(q, tx_map_err);
+ return false;
+@@ -514,7 +515,7 @@ bool fun_xdp_tx(struct funeth_txq *q, void *data, unsigned int len)
+ gle = (struct fun_dataop_gl *)req->dataop.imm;
+ fun_dataop_gl_init(gle, 0, 0, len, dma);
+
+- q->info[idx].vaddr = data;
++ q->info[idx].xdpf = xdpf;
+
+ u64_stats_update_begin(&q->syncp);
+ q->stats.tx_bytes += len;
+@@ -545,12 +546,9 @@ int fun_xdp_xmit_frames(struct net_device *dev, int n,
+ if (unlikely(q_idx >= fp->num_xdpqs))
+ return -ENXIO;
+
+- for (q = xdpqs[q_idx], i = 0; i < n; i++) {
+- const struct xdp_frame *xdpf = frames[i];
+-
+- if (!fun_xdp_tx(q, xdpf->data, xdpf->len))
++ for (q = xdpqs[q_idx], i = 0; i < n; i++)
++ if (!fun_xdp_tx(q, frames[i]))
+ break;
+- }
+
+ if (unlikely(flags & XDP_XMIT_FLUSH))
+ fun_txq_wr_db(q);
+@@ -577,7 +575,7 @@ static void fun_xdpq_purge(struct funeth_txq *q)
+ unsigned int idx = q->cons_cnt & q->mask;
+
+ fun_xdp_unmap(q, idx);
+- page_frag_free(q->info[idx].vaddr);
++ xdp_return_frame(q->info[idx].xdpf);
+ q->cons_cnt++;
+ }
+ }
+diff --git a/drivers/net/ethernet/fungible/funeth/funeth_txrx.h b/drivers/net/ethernet/fungible/funeth/funeth_txrx.h
+index 04c9f91b7489b..8708e2895946d 100644
+--- a/drivers/net/ethernet/fungible/funeth/funeth_txrx.h
++++ b/drivers/net/ethernet/fungible/funeth/funeth_txrx.h
+@@ -95,8 +95,8 @@ struct funeth_txq_stats { /* per Tx queue SW counters */
+
+ struct funeth_tx_info { /* per Tx descriptor state */
+ union {
+- struct sk_buff *skb; /* associated packet */
+- void *vaddr; /* start address for XDP */
++ struct sk_buff *skb; /* associated packet (sk_buff path) */
++ struct xdp_frame *xdpf; /* associated XDP frame (XDP path) */
+ };
+ };
+
+@@ -245,7 +245,7 @@ static inline int fun_irq_node(const struct fun_irq *p)
+ int fun_rxq_napi_poll(struct napi_struct *napi, int budget);
+ int fun_txq_napi_poll(struct napi_struct *napi, int budget);
+ netdev_tx_t fun_start_xmit(struct sk_buff *skb, struct net_device *netdev);
+-bool fun_xdp_tx(struct funeth_txq *q, void *data, unsigned int len);
++bool fun_xdp_tx(struct funeth_txq *q, struct xdp_frame *xdpf);
+ int fun_xdp_xmit_frames(struct net_device *dev, int n,
+ struct xdp_frame **frames, u32 flags);
+
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
+index 6f01bffd7e5c2..9471f47089b26 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
+@@ -1920,11 +1920,15 @@ static void i40e_vsi_setup_queue_map(struct i40e_vsi *vsi,
+ * non-zero req_queue_pairs says that user requested a new
+ * queue count via ethtool's set_channels, so use this
+ * value for queues distribution across traffic classes
++ * We need at least one queue pair for the interface
++ * to be usable as we see in else statement.
+ */
+ if (vsi->req_queue_pairs > 0)
+ vsi->num_queue_pairs = vsi->req_queue_pairs;
+ else if (pf->flags & I40E_FLAG_MSIX_ENABLED)
+ vsi->num_queue_pairs = pf->num_lan_msix;
++ else
++ vsi->num_queue_pairs = 1;
+ }
+
+ /* Number of queues per enabled TC */
+diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
+index 8aee4ae4cc8c9..74350a95a6e9a 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
++++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
+@@ -660,7 +660,8 @@ static int ice_lbtest_receive_frames(struct ice_rx_ring *rx_ring)
+ rx_desc = ICE_RX_DESC(rx_ring, i);
+
+ if (!(rx_desc->wb.status_error0 &
+- cpu_to_le16(ICE_TX_DESC_CMD_EOP | ICE_TX_DESC_CMD_RS)))
++ (cpu_to_le16(BIT(ICE_RX_FLEX_DESC_STATUS0_DD_S)) |
++ cpu_to_le16(BIT(ICE_RX_FLEX_DESC_STATUS0_EOF_S)))))
+ continue;
+
+ rx_buf = &rx_ring->rx_buf[i];
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index efb076f71e381..522462f41067a 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -4640,6 +4640,8 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
+ ice_set_safe_mode_caps(hw);
+ }
+
++ hw->ucast_shared = true;
++
+ err = ice_init_pf(pf);
+ if (err) {
+ dev_err(dev, "ice_init_pf failed: %d\n", err);
+@@ -5994,10 +5996,12 @@ int ice_vsi_cfg(struct ice_vsi *vsi)
+ if (vsi->netdev) {
+ ice_set_rx_mode(vsi->netdev);
+
+- err = ice_vsi_vlan_setup(vsi);
++ if (vsi->type != ICE_VSI_LB) {
++ err = ice_vsi_vlan_setup(vsi);
+
+- if (err)
+- return err;
++ if (err)
++ return err;
++ }
+ }
+ ice_vsi_cfg_dcb_rings(vsi);
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
+index bb1721f1321db..f4907a3c2d193 100644
+--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
++++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
+@@ -1309,39 +1309,6 @@ out_put_vf:
+ return ret;
+ }
+
+-/**
+- * ice_unicast_mac_exists - check if the unicast MAC exists on the PF's switch
+- * @pf: PF used to reference the switch's rules
+- * @umac: unicast MAC to compare against existing switch rules
+- *
+- * Return true on the first/any match, else return false
+- */
+-static bool ice_unicast_mac_exists(struct ice_pf *pf, u8 *umac)
+-{
+- struct ice_sw_recipe *mac_recipe_list =
+- &pf->hw.switch_info->recp_list[ICE_SW_LKUP_MAC];
+- struct ice_fltr_mgmt_list_entry *list_itr;
+- struct list_head *rule_head;
+- struct mutex *rule_lock; /* protect MAC filter list access */
+-
+- rule_head = &mac_recipe_list->filt_rules;
+- rule_lock = &mac_recipe_list->filt_rule_lock;
+-
+- mutex_lock(rule_lock);
+- list_for_each_entry(list_itr, rule_head, list_entry) {
+- u8 *existing_mac = &list_itr->fltr_info.l_data.mac.mac_addr[0];
+-
+- if (ether_addr_equal(existing_mac, umac)) {
+- mutex_unlock(rule_lock);
+- return true;
+- }
+- }
+-
+- mutex_unlock(rule_lock);
+-
+- return false;
+-}
+-
+ /**
+ * ice_set_vf_mac
+ * @netdev: network interface device structure
+@@ -1376,13 +1343,6 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
+ if (ret)
+ goto out_put_vf;
+
+- if (ice_unicast_mac_exists(pf, mac)) {
+- netdev_err(netdev, "Unicast MAC %pM already exists on this PF. Preventing setting VF %u unicast MAC address to %pM\n",
+- mac, vf_id, mac);
+- ret = -EINVAL;
+- goto out_put_vf;
+- }
+-
+ mutex_lock(&vf->cfg_lock);
+
+ /* VF is notified of its new MAC via the PF's response to the
+diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+index da7c5ce15be0d..3143c2abf7753 100644
+--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c
++++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+@@ -2966,7 +2966,8 @@ ice_vc_validate_add_vlan_filter_list(struct ice_vsi *vsi,
+ struct virtchnl_vlan_filtering_caps *vfc,
+ struct virtchnl_vlan_filter_list_v2 *vfl)
+ {
+- u16 num_requested_filters = vsi->num_vlan + vfl->num_elements;
++ u16 num_requested_filters = ice_vsi_num_non_zero_vlans(vsi) +
++ vfl->num_elements;
+
+ if (num_requested_filters > vfc->max_filters)
+ return false;
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c
+index 28b19945d716c..e64318c110fdd 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c
+@@ -28,6 +28,9 @@
+ #define MAX_RATE_EXPONENT 0x0FULL
+ #define MAX_RATE_MANTISSA 0xFFULL
+
++#define CN10K_MAX_BURST_MANTISSA 0x7FFFULL
++#define CN10K_MAX_BURST_SIZE 8453888ULL
++
+ /* Bitfields in NIX_TLX_PIR register */
+ #define TLX_RATE_MANTISSA GENMASK_ULL(8, 1)
+ #define TLX_RATE_EXPONENT GENMASK_ULL(12, 9)
+@@ -35,6 +38,9 @@
+ #define TLX_BURST_MANTISSA GENMASK_ULL(36, 29)
+ #define TLX_BURST_EXPONENT GENMASK_ULL(40, 37)
+
++#define CN10K_TLX_BURST_MANTISSA GENMASK_ULL(43, 29)
++#define CN10K_TLX_BURST_EXPONENT GENMASK_ULL(47, 44)
++
+ struct otx2_tc_flow_stats {
+ u64 bytes;
+ u64 pkts;
+@@ -77,33 +83,42 @@ int otx2_tc_alloc_ent_bitmap(struct otx2_nic *nic)
+ }
+ EXPORT_SYMBOL(otx2_tc_alloc_ent_bitmap);
+
+-static void otx2_get_egress_burst_cfg(u32 burst, u32 *burst_exp,
+- u32 *burst_mantissa)
++static void otx2_get_egress_burst_cfg(struct otx2_nic *nic, u32 burst,
++ u32 *burst_exp, u32 *burst_mantissa)
+ {
++ int max_burst, max_mantissa;
+ unsigned int tmp;
+
++ if (is_dev_otx2(nic->pdev)) {
++ max_burst = MAX_BURST_SIZE;
++ max_mantissa = MAX_BURST_MANTISSA;
++ } else {
++ max_burst = CN10K_MAX_BURST_SIZE;
++ max_mantissa = CN10K_MAX_BURST_MANTISSA;
++ }
++
+ /* Burst is calculated as
+ * ((256 + BURST_MANTISSA) << (1 + BURST_EXPONENT)) / 256
+ * Max supported burst size is 130,816 bytes.
+ */
+- burst = min_t(u32, burst, MAX_BURST_SIZE);
++ burst = min_t(u32, burst, max_burst);
+ if (burst) {
+ *burst_exp = ilog2(burst) ? ilog2(burst) - 1 : 0;
+ tmp = burst - rounddown_pow_of_two(burst);
+- if (burst < MAX_BURST_MANTISSA)
++ if (burst < max_mantissa)
+ *burst_mantissa = tmp * 2;
+ else
+ *burst_mantissa = tmp / (1ULL << (*burst_exp - 7));
+ } else {
+ *burst_exp = MAX_BURST_EXPONENT;
+- *burst_mantissa = MAX_BURST_MANTISSA;
++ *burst_mantissa = max_mantissa;
+ }
+ }
+
+-static void otx2_get_egress_rate_cfg(u32 maxrate, u32 *exp,
++static void otx2_get_egress_rate_cfg(u64 maxrate, u32 *exp,
+ u32 *mantissa, u32 *div_exp)
+ {
+- unsigned int tmp;
++ u64 tmp;
+
+ /* Rate calculation by hardware
+ *
+@@ -132,21 +147,44 @@ static void otx2_get_egress_rate_cfg(u32 maxrate, u32 *exp,
+ }
+ }
+
+-static int otx2_set_matchall_egress_rate(struct otx2_nic *nic, u32 burst, u32 maxrate)
++static u64 otx2_get_txschq_rate_regval(struct otx2_nic *nic,
++ u64 maxrate, u32 burst)
+ {
+- struct otx2_hw *hw = &nic->hw;
+- struct nix_txschq_config *req;
+ u32 burst_exp, burst_mantissa;
+ u32 exp, mantissa, div_exp;
++ u64 regval = 0;
++
++ /* Get exponent and mantissa values from the desired rate */
++ otx2_get_egress_burst_cfg(nic, burst, &burst_exp, &burst_mantissa);
++ otx2_get_egress_rate_cfg(maxrate, &exp, &mantissa, &div_exp);
++
++ if (is_dev_otx2(nic->pdev)) {
++ regval = FIELD_PREP(TLX_BURST_EXPONENT, (u64)burst_exp) |
++ FIELD_PREP(TLX_BURST_MANTISSA, (u64)burst_mantissa) |
++ FIELD_PREP(TLX_RATE_DIVIDER_EXPONENT, div_exp) |
++ FIELD_PREP(TLX_RATE_EXPONENT, exp) |
++ FIELD_PREP(TLX_RATE_MANTISSA, mantissa) | BIT_ULL(0);
++ } else {
++ regval = FIELD_PREP(CN10K_TLX_BURST_EXPONENT, (u64)burst_exp) |
++ FIELD_PREP(CN10K_TLX_BURST_MANTISSA, (u64)burst_mantissa) |
++ FIELD_PREP(TLX_RATE_DIVIDER_EXPONENT, div_exp) |
++ FIELD_PREP(TLX_RATE_EXPONENT, exp) |
++ FIELD_PREP(TLX_RATE_MANTISSA, mantissa) | BIT_ULL(0);
++ }
++
++ return regval;
++}
++
++static int otx2_set_matchall_egress_rate(struct otx2_nic *nic,
++ u32 burst, u64 maxrate)
++{
++ struct otx2_hw *hw = &nic->hw;
++ struct nix_txschq_config *req;
+ int txschq, err;
+
+ /* All SQs share the same TL4, so pick the first scheduler */
+ txschq = hw->txschq_list[NIX_TXSCH_LVL_TL4][0];
+
+- /* Get exponent and mantissa values from the desired rate */
+- otx2_get_egress_burst_cfg(burst, &burst_exp, &burst_mantissa);
+- otx2_get_egress_rate_cfg(maxrate, &exp, &mantissa, &div_exp);
+-
+ mutex_lock(&nic->mbox.lock);
+ req = otx2_mbox_alloc_msg_nix_txschq_cfg(&nic->mbox);
+ if (!req) {
+@@ -157,11 +195,7 @@ static int otx2_set_matchall_egress_rate(struct otx2_nic *nic, u32 burst, u32 ma
+ req->lvl = NIX_TXSCH_LVL_TL4;
+ req->num_regs = 1;
+ req->reg[0] = NIX_AF_TL4X_PIR(txschq);
+- req->regval[0] = FIELD_PREP(TLX_BURST_EXPONENT, burst_exp) |
+- FIELD_PREP(TLX_BURST_MANTISSA, burst_mantissa) |
+- FIELD_PREP(TLX_RATE_DIVIDER_EXPONENT, div_exp) |
+- FIELD_PREP(TLX_RATE_EXPONENT, exp) |
+- FIELD_PREP(TLX_RATE_MANTISSA, mantissa) | BIT_ULL(0);
++ req->regval[0] = otx2_get_txschq_rate_regval(nic, maxrate, burst);
+
+ err = otx2_sync_mbox_msg(&nic->mbox);
+ mutex_unlock(&nic->mbox.lock);
+@@ -230,7 +264,7 @@ static int otx2_tc_egress_matchall_install(struct otx2_nic *nic,
+ struct netlink_ext_ack *extack = cls->common.extack;
+ struct flow_action *actions = &cls->rule->action;
+ struct flow_action_entry *entry;
+- u32 rate;
++ u64 rate;
+ int err;
+
+ err = otx2_tc_validate_flow(nic, actions, extack);
+@@ -256,7 +290,7 @@ static int otx2_tc_egress_matchall_install(struct otx2_nic *nic,
+ }
+ /* Convert bytes per second to Mbps */
+ rate = entry->police.rate_bytes_ps * 8;
+- rate = max_t(u32, rate / 1000000, 1);
++ rate = max_t(u64, rate / 1000000, 1);
+ err = otx2_set_matchall_egress_rate(nic, entry->police.burst, rate);
+ if (err)
+ return err;
+@@ -614,21 +648,27 @@ static int otx2_tc_prepare_flow(struct otx2_nic *nic, struct otx2_tc_flow *node,
+
+ flow_spec->dport = match.key->dst;
+ flow_mask->dport = match.mask->dst;
+- if (ip_proto == IPPROTO_UDP)
+- req->features |= BIT_ULL(NPC_DPORT_UDP);
+- else if (ip_proto == IPPROTO_TCP)
+- req->features |= BIT_ULL(NPC_DPORT_TCP);
+- else if (ip_proto == IPPROTO_SCTP)
+- req->features |= BIT_ULL(NPC_DPORT_SCTP);
++
++ if (flow_mask->dport) {
++ if (ip_proto == IPPROTO_UDP)
++ req->features |= BIT_ULL(NPC_DPORT_UDP);
++ else if (ip_proto == IPPROTO_TCP)
++ req->features |= BIT_ULL(NPC_DPORT_TCP);
++ else if (ip_proto == IPPROTO_SCTP)
++ req->features |= BIT_ULL(NPC_DPORT_SCTP);
++ }
+
+ flow_spec->sport = match.key->src;
+ flow_mask->sport = match.mask->src;
+- if (ip_proto == IPPROTO_UDP)
+- req->features |= BIT_ULL(NPC_SPORT_UDP);
+- else if (ip_proto == IPPROTO_TCP)
+- req->features |= BIT_ULL(NPC_SPORT_TCP);
+- else if (ip_proto == IPPROTO_SCTP)
+- req->features |= BIT_ULL(NPC_SPORT_SCTP);
++
++ if (flow_mask->sport) {
++ if (ip_proto == IPPROTO_UDP)
++ req->features |= BIT_ULL(NPC_SPORT_UDP);
++ else if (ip_proto == IPPROTO_TCP)
++ req->features |= BIT_ULL(NPC_SPORT_TCP);
++ else if (ip_proto == IPPROTO_SCTP)
++ req->features |= BIT_ULL(NPC_SPORT_SCTP);
++ }
+ }
+
+ return otx2_tc_parse_actions(nic, &rule->action, req, f, node);
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+index c00d6c4ed37c3..245d36696486a 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+@@ -7022,7 +7022,7 @@ mlxsw_sp_fib6_entry_nexthop_add(struct mlxsw_sp *mlxsw_sp,
+ mlxsw_sp_rt6 = mlxsw_sp_rt6_create(rt_arr[i]);
+ if (IS_ERR(mlxsw_sp_rt6)) {
+ err = PTR_ERR(mlxsw_sp_rt6);
+- goto err_rt6_create;
++ goto err_rt6_unwind;
+ }
+
+ list_add_tail(&mlxsw_sp_rt6->list, &fib6_entry->rt6_list);
+@@ -7031,14 +7031,12 @@ mlxsw_sp_fib6_entry_nexthop_add(struct mlxsw_sp *mlxsw_sp,
+
+ err = mlxsw_sp_nexthop6_group_update(mlxsw_sp, op_ctx, fib6_entry);
+ if (err)
+- goto err_nexthop6_group_update;
++ goto err_rt6_unwind;
+
+ return 0;
+
+-err_nexthop6_group_update:
+- i = nrt6;
+-err_rt6_create:
+- for (i--; i >= 0; i--) {
++err_rt6_unwind:
++ for (; i > 0; i--) {
+ fib6_entry->nrt6--;
+ mlxsw_sp_rt6 = list_last_entry(&fib6_entry->rt6_list,
+ struct mlxsw_sp_rt6, list);
+@@ -7166,7 +7164,7 @@ mlxsw_sp_fib6_entry_create(struct mlxsw_sp *mlxsw_sp,
+ mlxsw_sp_rt6 = mlxsw_sp_rt6_create(rt_arr[i]);
+ if (IS_ERR(mlxsw_sp_rt6)) {
+ err = PTR_ERR(mlxsw_sp_rt6);
+- goto err_rt6_create;
++ goto err_rt6_unwind;
+ }
+ list_add_tail(&mlxsw_sp_rt6->list, &fib6_entry->rt6_list);
+ fib6_entry->nrt6++;
+@@ -7174,7 +7172,7 @@ mlxsw_sp_fib6_entry_create(struct mlxsw_sp *mlxsw_sp,
+
+ err = mlxsw_sp_nexthop6_group_get(mlxsw_sp, fib6_entry);
+ if (err)
+- goto err_nexthop6_group_get;
++ goto err_rt6_unwind;
+
+ err = mlxsw_sp_nexthop_group_vr_link(fib_entry->nh_group,
+ fib_node->fib);
+@@ -7193,10 +7191,8 @@ err_fib6_entry_type_set:
+ mlxsw_sp_nexthop_group_vr_unlink(fib_entry->nh_group, fib_node->fib);
+ err_nexthop_group_vr_link:
+ mlxsw_sp_nexthop6_group_put(mlxsw_sp, fib_entry);
+-err_nexthop6_group_get:
+- i = nrt6;
+-err_rt6_create:
+- for (i--; i >= 0; i--) {
++err_rt6_unwind:
++ for (; i > 0; i--) {
+ fib6_entry->nrt6--;
+ mlxsw_sp_rt6 = list_last_entry(&fib6_entry->rt6_list,
+ struct mlxsw_sp_rt6, list);
+diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
+index 4625f85acab2e..10ad0b93d283b 100644
+--- a/drivers/net/ethernet/sfc/ptp.c
++++ b/drivers/net/ethernet/sfc/ptp.c
+@@ -1100,7 +1100,29 @@ static void efx_ptp_xmit_skb_queue(struct efx_nic *efx, struct sk_buff *skb)
+
+ tx_queue = efx_channel_get_tx_queue(ptp_data->channel, type);
+ if (tx_queue && tx_queue->timestamping) {
++ /* This code invokes normal driver TX code which is always
++ * protected from softirqs when called from generic TX code,
++ * which in turn disables preemption. Look at __dev_queue_xmit
++ * which uses rcu_read_lock_bh disabling preemption for RCU
++ * plus disabling softirqs. We do not need RCU reader
++ * protection here.
++ *
++ * Although it is theoretically safe for current PTP TX/RX code
++ * running without disabling softirqs, there are three good
++ * reasond for doing so:
++ *
++ * 1) The code invoked is mainly implemented for non-PTP
++ * packets and it is always executed with softirqs
++ * disabled.
++ * 2) This being a single PTP packet, better to not
++ * interrupt its processing by softirqs which can lead
++ * to high latencies.
++ * 3) netdev_xmit_more checks preemption is disabled and
++ * triggers a BUG_ON if not.
++ */
++ local_bh_disable();
+ efx_enqueue_skb(tx_queue, skb);
++ local_bh_enable();
+ } else {
+ WARN_ONCE(1, "PTP channel has no timestamped tx queue\n");
+ dev_kfree_skb_any(skb);
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
+index ca8ab290013ce..d42e1afb65213 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.c
+@@ -688,18 +688,19 @@ static int mediatek_dwmac_probe(struct platform_device *pdev)
+
+ ret = mediatek_dwmac_clks_config(priv_plat, true);
+ if (ret)
+- return ret;
++ goto err_remove_config_dt;
+
+ ret = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res);
+- if (ret) {
+- stmmac_remove_config_dt(pdev, plat_dat);
++ if (ret)
+ goto err_drv_probe;
+- }
+
+ return 0;
+
+ err_drv_probe:
+ mediatek_dwmac_clks_config(priv_plat, false);
++err_remove_config_dt:
++ stmmac_remove_config_dt(pdev, plat_dat);
++
+ return ret;
+ }
+
+diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
+index 817577e713d70..f354fad05714a 100644
+--- a/drivers/net/macsec.c
++++ b/drivers/net/macsec.c
+@@ -243,6 +243,7 @@ static struct macsec_cb *macsec_skb_cb(struct sk_buff *skb)
+ #define DEFAULT_SEND_SCI true
+ #define DEFAULT_ENCRYPT false
+ #define DEFAULT_ENCODING_SA 0
++#define MACSEC_XPN_MAX_REPLAY_WINDOW (((1 << 30) - 1))
+
+ static bool send_sci(const struct macsec_secy *secy)
+ {
+@@ -1697,7 +1698,7 @@ static bool validate_add_rxsa(struct nlattr **attrs)
+ return false;
+
+ if (attrs[MACSEC_SA_ATTR_PN] &&
+- *(u64 *)nla_data(attrs[MACSEC_SA_ATTR_PN]) == 0)
++ nla_get_u64(attrs[MACSEC_SA_ATTR_PN]) == 0)
+ return false;
+
+ if (attrs[MACSEC_SA_ATTR_ACTIVE]) {
+@@ -1753,7 +1754,8 @@ static int macsec_add_rxsa(struct sk_buff *skb, struct genl_info *info)
+ }
+
+ pn_len = secy->xpn ? MACSEC_XPN_PN_LEN : MACSEC_DEFAULT_PN_LEN;
+- if (nla_len(tb_sa[MACSEC_SA_ATTR_PN]) != pn_len) {
++ if (tb_sa[MACSEC_SA_ATTR_PN] &&
++ nla_len(tb_sa[MACSEC_SA_ATTR_PN]) != pn_len) {
+ pr_notice("macsec: nl: add_rxsa: bad pn length: %d != %d\n",
+ nla_len(tb_sa[MACSEC_SA_ATTR_PN]), pn_len);
+ rtnl_unlock();
+@@ -1769,7 +1771,7 @@ static int macsec_add_rxsa(struct sk_buff *skb, struct genl_info *info)
+ if (nla_len(tb_sa[MACSEC_SA_ATTR_SALT]) != MACSEC_SALT_LEN) {
+ pr_notice("macsec: nl: add_rxsa: bad salt length: %d != %d\n",
+ nla_len(tb_sa[MACSEC_SA_ATTR_SALT]),
+- MACSEC_SA_ATTR_SALT);
++ MACSEC_SALT_LEN);
+ rtnl_unlock();
+ return -EINVAL;
+ }
+@@ -1842,7 +1844,7 @@ static int macsec_add_rxsa(struct sk_buff *skb, struct genl_info *info)
+ return 0;
+
+ cleanup:
+- kfree(rx_sa);
++ macsec_rxsa_put(rx_sa);
+ rtnl_unlock();
+ return err;
+ }
+@@ -1939,7 +1941,7 @@ static bool validate_add_txsa(struct nlattr **attrs)
+ if (nla_get_u8(attrs[MACSEC_SA_ATTR_AN]) >= MACSEC_NUM_AN)
+ return false;
+
+- if (nla_get_u32(attrs[MACSEC_SA_ATTR_PN]) == 0)
++ if (nla_get_u64(attrs[MACSEC_SA_ATTR_PN]) == 0)
+ return false;
+
+ if (attrs[MACSEC_SA_ATTR_ACTIVE]) {
+@@ -2011,7 +2013,7 @@ static int macsec_add_txsa(struct sk_buff *skb, struct genl_info *info)
+ if (nla_len(tb_sa[MACSEC_SA_ATTR_SALT]) != MACSEC_SALT_LEN) {
+ pr_notice("macsec: nl: add_txsa: bad salt length: %d != %d\n",
+ nla_len(tb_sa[MACSEC_SA_ATTR_SALT]),
+- MACSEC_SA_ATTR_SALT);
++ MACSEC_SALT_LEN);
+ rtnl_unlock();
+ return -EINVAL;
+ }
+@@ -2085,7 +2087,7 @@ static int macsec_add_txsa(struct sk_buff *skb, struct genl_info *info)
+
+ cleanup:
+ secy->operational = was_operational;
+- kfree(tx_sa);
++ macsec_txsa_put(tx_sa);
+ rtnl_unlock();
+ return err;
+ }
+@@ -2293,7 +2295,7 @@ static bool validate_upd_sa(struct nlattr **attrs)
+ if (nla_get_u8(attrs[MACSEC_SA_ATTR_AN]) >= MACSEC_NUM_AN)
+ return false;
+
+- if (attrs[MACSEC_SA_ATTR_PN] && nla_get_u32(attrs[MACSEC_SA_ATTR_PN]) == 0)
++ if (attrs[MACSEC_SA_ATTR_PN] && nla_get_u64(attrs[MACSEC_SA_ATTR_PN]) == 0)
+ return false;
+
+ if (attrs[MACSEC_SA_ATTR_ACTIVE]) {
+@@ -3745,9 +3747,6 @@ static int macsec_changelink_common(struct net_device *dev,
+ secy->operational = tx_sa && tx_sa->active;
+ }
+
+- if (data[IFLA_MACSEC_WINDOW])
+- secy->replay_window = nla_get_u32(data[IFLA_MACSEC_WINDOW]);
+-
+ if (data[IFLA_MACSEC_ENCRYPT])
+ tx_sc->encrypt = !!nla_get_u8(data[IFLA_MACSEC_ENCRYPT]);
+
+@@ -3793,6 +3792,16 @@ static int macsec_changelink_common(struct net_device *dev,
+ }
+ }
+
++ if (data[IFLA_MACSEC_WINDOW]) {
++ secy->replay_window = nla_get_u32(data[IFLA_MACSEC_WINDOW]);
++
++ /* IEEE 802.1AEbw-2013 10.7.8 - maximum replay window
++ * for XPN cipher suites */
++ if (secy->xpn &&
++ secy->replay_window > MACSEC_XPN_MAX_REPLAY_WINDOW)
++ return -EINVAL;
++ }
++
+ return 0;
+ }
+
+@@ -3822,7 +3831,7 @@ static int macsec_changelink(struct net_device *dev, struct nlattr *tb[],
+
+ ret = macsec_changelink_common(dev, data);
+ if (ret)
+- return ret;
++ goto cleanup;
+
+ /* If h/w offloading is available, propagate to the device */
+ if (macsec_is_offloaded(macsec)) {
+diff --git a/drivers/net/pcs/pcs-xpcs.c b/drivers/net/pcs/pcs-xpcs.c
+index 61418d4dc0cd2..8768f6e34846f 100644
+--- a/drivers/net/pcs/pcs-xpcs.c
++++ b/drivers/net/pcs/pcs-xpcs.c
+@@ -898,7 +898,7 @@ static int xpcs_get_state_c37_sgmii(struct dw_xpcs *xpcs,
+ */
+ ret = xpcs_read(xpcs, MDIO_MMD_VEND2, DW_VR_MII_AN_INTR_STS);
+ if (ret < 0)
+- return false;
++ return ret;
+
+ if (ret & DW_VR_MII_C37_ANSGM_SP_LNKSTS) {
+ int speed_value;
+diff --git a/drivers/net/sungem_phy.c b/drivers/net/sungem_phy.c
+index 4daac5fda073c..0d40d265b6886 100644
+--- a/drivers/net/sungem_phy.c
++++ b/drivers/net/sungem_phy.c
+@@ -454,6 +454,7 @@ static int bcm5421_init(struct mii_phy* phy)
+ int can_low_power = 1;
+ if (np == NULL || of_get_property(np, "no-autolowpower", NULL))
+ can_low_power = 0;
++ of_node_put(np);
+ if (can_low_power) {
+ /* Enable automatic low-power */
+ sungem_phy_write(phy, 0x1c, 0x9002);
+diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
+index c7804fce204cc..206904e60784b 100644
+--- a/drivers/net/virtio_net.c
++++ b/drivers/net/virtio_net.c
+@@ -242,9 +242,15 @@ struct virtnet_info {
+ /* Packet virtio header size */
+ u8 hdr_len;
+
+- /* Work struct for refilling if we run low on memory. */
++ /* Work struct for delayed refilling if we run low on memory. */
+ struct delayed_work refill;
+
++ /* Is delayed refill enabled? */
++ bool refill_enabled;
++
++ /* The lock to synchronize the access to refill_enabled */
++ spinlock_t refill_lock;
++
+ /* Work struct for config space updates */
+ struct work_struct config_work;
+
+@@ -348,6 +354,20 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
+ return p;
+ }
+
++static void enable_delayed_refill(struct virtnet_info *vi)
++{
++ spin_lock_bh(&vi->refill_lock);
++ vi->refill_enabled = true;
++ spin_unlock_bh(&vi->refill_lock);
++}
++
++static void disable_delayed_refill(struct virtnet_info *vi)
++{
++ spin_lock_bh(&vi->refill_lock);
++ vi->refill_enabled = false;
++ spin_unlock_bh(&vi->refill_lock);
++}
++
+ static void virtqueue_napi_schedule(struct napi_struct *napi,
+ struct virtqueue *vq)
+ {
+@@ -1527,8 +1547,12 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
+ }
+
+ if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) {
+- if (!try_fill_recv(vi, rq, GFP_ATOMIC))
+- schedule_delayed_work(&vi->refill, 0);
++ if (!try_fill_recv(vi, rq, GFP_ATOMIC)) {
++ spin_lock(&vi->refill_lock);
++ if (vi->refill_enabled)
++ schedule_delayed_work(&vi->refill, 0);
++ spin_unlock(&vi->refill_lock);
++ }
+ }
+
+ u64_stats_update_begin(&rq->stats.syncp);
+@@ -1651,6 +1675,8 @@ static int virtnet_open(struct net_device *dev)
+ struct virtnet_info *vi = netdev_priv(dev);
+ int i, err;
+
++ enable_delayed_refill(vi);
++
+ for (i = 0; i < vi->max_queue_pairs; i++) {
+ if (i < vi->curr_queue_pairs)
+ /* Make sure we have some buffers: if oom use wq. */
+@@ -2033,6 +2059,8 @@ static int virtnet_close(struct net_device *dev)
+ struct virtnet_info *vi = netdev_priv(dev);
+ int i;
+
++ /* Make sure NAPI doesn't schedule refill work */
++ disable_delayed_refill(vi);
+ /* Make sure refill_work doesn't re-enable napi! */
+ cancel_delayed_work_sync(&vi->refill);
+
+@@ -2792,6 +2820,8 @@ static int virtnet_restore_up(struct virtio_device *vdev)
+
+ virtio_device_ready(vdev);
+
++ enable_delayed_refill(vi);
++
+ if (netif_running(vi->dev)) {
+ err = virtnet_open(vi->dev);
+ if (err)
+@@ -3534,6 +3564,7 @@ static int virtnet_probe(struct virtio_device *vdev)
+ vdev->priv = vi;
+
+ INIT_WORK(&vi->config_work, virtnet_config_changed_work);
++ spin_lock_init(&vi->refill_lock);
+
+ /* If we can receive ANY GSO packets, we must allocate large ones. */
+ if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
+diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+index 7e476f50935b8..f9e4b01cd0f5c 100644
+--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
++++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+@@ -11386,6 +11386,7 @@ scsih_shutdown(struct pci_dev *pdev)
+ _scsih_ir_shutdown(ioc);
+ _scsih_nvme_shutdown(ioc);
+ mpt3sas_base_mask_interrupts(ioc);
++ mpt3sas_base_stop_watchdog(ioc);
+ ioc->shost_recovery = 1;
+ mpt3sas_base_make_ioc_ready(ioc, SOFT_RESET);
+ ioc->shost_recovery = 0;
+diff --git a/drivers/scsi/scsi_ioctl.c b/drivers/scsi/scsi_ioctl.c
+index a480c4d589f5f..729e309e60346 100644
+--- a/drivers/scsi/scsi_ioctl.c
++++ b/drivers/scsi/scsi_ioctl.c
+@@ -450,7 +450,7 @@ static int sg_io(struct scsi_device *sdev, struct sg_io_hdr *hdr, fmode_t mode)
+ goto out_put_request;
+
+ ret = 0;
+- if (hdr->iovec_count) {
++ if (hdr->iovec_count && hdr->dxfer_len) {
+ struct iov_iter i;
+ struct iovec *iov = NULL;
+
+diff --git a/drivers/scsi/ufs/ufshcd-pltfrm.c b/drivers/scsi/ufs/ufshcd-pltfrm.c
+index 87975d1a21c8b..adc302b1a57ae 100644
+--- a/drivers/scsi/ufs/ufshcd-pltfrm.c
++++ b/drivers/scsi/ufs/ufshcd-pltfrm.c
+@@ -107,9 +107,20 @@ out:
+ return ret;
+ }
+
++static bool phandle_exists(const struct device_node *np,
++ const char *phandle_name, int index)
++{
++ struct device_node *parse_np = of_parse_phandle(np, phandle_name, index);
++
++ if (parse_np)
++ of_node_put(parse_np);
++
++ return parse_np != NULL;
++}
++
+ #define MAX_PROP_SIZE 32
+ static int ufshcd_populate_vreg(struct device *dev, const char *name,
+- struct ufs_vreg **out_vreg)
++ struct ufs_vreg **out_vreg)
+ {
+ char prop_name[MAX_PROP_SIZE];
+ struct ufs_vreg *vreg = NULL;
+@@ -121,7 +132,7 @@ static int ufshcd_populate_vreg(struct device *dev, const char *name,
+ }
+
+ snprintf(prop_name, MAX_PROP_SIZE, "%s-supply", name);
+- if (!of_parse_phandle(np, prop_name, 0)) {
++ if (!phandle_exists(np, prop_name, 0)) {
+ dev_info(dev, "%s: Unable to find %s regulator, assuming enabled\n",
+ __func__, prop_name);
+ goto out;
+diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
+index 452ad06120671..874490f7f5e7f 100644
+--- a/drivers/scsi/ufs/ufshcd.c
++++ b/drivers/scsi/ufs/ufshcd.c
+@@ -734,17 +734,28 @@ static enum utp_ocs ufshcd_get_tr_ocs(struct ufshcd_lrb *lrbp)
+ }
+
+ /**
+- * ufshcd_utrl_clear - Clear a bit in UTRLCLR register
++ * ufshcd_utrl_clear() - Clear requests from the controller request list.
+ * @hba: per adapter instance
+- * @pos: position of the bit to be cleared
++ * @mask: mask with one bit set for each request to be cleared
+ */
+-static inline void ufshcd_utrl_clear(struct ufs_hba *hba, u32 pos)
++static inline void ufshcd_utrl_clear(struct ufs_hba *hba, u32 mask)
+ {
+ if (hba->quirks & UFSHCI_QUIRK_BROKEN_REQ_LIST_CLR)
+- ufshcd_writel(hba, (1 << pos), REG_UTP_TRANSFER_REQ_LIST_CLEAR);
+- else
+- ufshcd_writel(hba, ~(1 << pos),
+- REG_UTP_TRANSFER_REQ_LIST_CLEAR);
++ mask = ~mask;
++ /*
++ * From the UFSHCI specification: "UTP Transfer Request List CLear
++ * Register (UTRLCLR): This field is bit significant. Each bit
++ * corresponds to a slot in the UTP Transfer Request List, where bit 0
++ * corresponds to request slot 0. A bit in this field is set to ‘0’
++ * by host software to indicate to the host controller that a transfer
++ * request slot is cleared. The host controller
++ * shall free up any resources associated to the request slot
++ * immediately, and shall set the associated bit in UTRLDBR to ‘0’. The
++ * host software indicates no change to request slots by setting the
++ * associated bits in this field to ‘1’. Bits in this field shall only
++ * be set ‘1’ or ‘0’ by host software when UTRLRSR is set to ‘1’."
++ */
++ ufshcd_writel(hba, ~mask, REG_UTP_TRANSFER_REQ_LIST_CLEAR);
+ }
+
+ /**
+@@ -2853,16 +2864,19 @@ static int ufshcd_compose_dev_cmd(struct ufs_hba *hba,
+ return ufshcd_compose_devman_upiu(hba, lrbp);
+ }
+
+-static int
+-ufshcd_clear_cmd(struct ufs_hba *hba, int tag)
++/*
++ * Clear all the requests from the controller for which a bit has been set in
++ * @mask and wait until the controller confirms that these requests have been
++ * cleared.
++ */
++static int ufshcd_clear_cmds(struct ufs_hba *hba, u32 mask)
+ {
+ int err = 0;
+ unsigned long flags;
+- u32 mask = 1 << tag;
+
+ /* clear outstanding transaction before retry */
+ spin_lock_irqsave(hba->host->host_lock, flags);
+- ufshcd_utrl_clear(hba, tag);
++ ufshcd_utrl_clear(hba, mask);
+ spin_unlock_irqrestore(hba->host->host_lock, flags);
+
+ /*
+@@ -2933,37 +2947,59 @@ ufshcd_dev_cmd_completion(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
+ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba,
+ struct ufshcd_lrb *lrbp, int max_timeout)
+ {
+- int err = 0;
+- unsigned long time_left;
++ unsigned long time_left = msecs_to_jiffies(max_timeout);
+ unsigned long flags;
++ bool pending;
++ int err;
+
++retry:
+ time_left = wait_for_completion_timeout(hba->dev_cmd.complete,
+- msecs_to_jiffies(max_timeout));
++ time_left);
+
+- spin_lock_irqsave(hba->host->host_lock, flags);
+- hba->dev_cmd.complete = NULL;
+ if (likely(time_left)) {
++ /*
++ * The completion handler called complete() and the caller of
++ * this function still owns the @lrbp tag so the code below does
++ * not trigger any race conditions.
++ */
++ hba->dev_cmd.complete = NULL;
+ err = ufshcd_get_tr_ocs(lrbp);
+ if (!err)
+ err = ufshcd_dev_cmd_completion(hba, lrbp);
+- }
+- spin_unlock_irqrestore(hba->host->host_lock, flags);
+-
+- if (!time_left) {
++ } else {
+ err = -ETIMEDOUT;
+ dev_dbg(hba->dev, "%s: dev_cmd request timedout, tag %d\n",
+ __func__, lrbp->task_tag);
+- if (!ufshcd_clear_cmd(hba, lrbp->task_tag))
++ if (ufshcd_clear_cmds(hba, 1U << lrbp->task_tag) == 0) {
+ /* successfully cleared the command, retry if needed */
+ err = -EAGAIN;
+- /*
+- * in case of an error, after clearing the doorbell,
+- * we also need to clear the outstanding_request
+- * field in hba
+- */
+- spin_lock_irqsave(&hba->outstanding_lock, flags);
+- __clear_bit(lrbp->task_tag, &hba->outstanding_reqs);
+- spin_unlock_irqrestore(&hba->outstanding_lock, flags);
++ /*
++ * Since clearing the command succeeded we also need to
++ * clear the task tag bit from the outstanding_reqs
++ * variable.
++ */
++ spin_lock_irqsave(&hba->outstanding_lock, flags);
++ pending = test_bit(lrbp->task_tag,
++ &hba->outstanding_reqs);
++ if (pending) {
++ hba->dev_cmd.complete = NULL;
++ __clear_bit(lrbp->task_tag,
++ &hba->outstanding_reqs);
++ }
++ spin_unlock_irqrestore(&hba->outstanding_lock, flags);
++
++ if (!pending) {
++ /*
++ * The completion handler ran while we tried to
++ * clear the command.
++ */
++ time_left = 1;
++ goto retry;
++ }
++ } else {
++ dev_err(hba->dev, "%s: failed to clear tag %d\n",
++ __func__, lrbp->task_tag);
++ }
+ }
+
+ return err;
+@@ -6988,7 +7024,7 @@ static int ufshcd_eh_device_reset_handler(struct scsi_cmnd *cmd)
+ /* clear the commands that were pending for corresponding LUN */
+ for_each_set_bit(pos, &hba->outstanding_reqs, hba->nutrs) {
+ if (hba->lrb[pos].lun == lun) {
+- err = ufshcd_clear_cmd(hba, pos);
++ err = ufshcd_clear_cmds(hba, 1U << pos);
+ if (err)
+ break;
+ __ufshcd_transfer_req_compl(hba, 1U << pos);
+@@ -7090,7 +7126,7 @@ static int ufshcd_try_to_abort_task(struct ufs_hba *hba, int tag)
+ goto out;
+ }
+
+- err = ufshcd_clear_cmd(hba, tag);
++ err = ufshcd_clear_cmds(hba, 1U << tag);
+ if (err)
+ dev_err(hba->dev, "%s: Failed clearing cmd at tag %d, err %d\n",
+ __func__, tag, err);
+diff --git a/fs/ntfs/attrib.c b/fs/ntfs/attrib.c
+index 2911c04a33e01..080333bda45eb 100644
+--- a/fs/ntfs/attrib.c
++++ b/fs/ntfs/attrib.c
+@@ -592,8 +592,12 @@ static int ntfs_attr_find(const ATTR_TYPE type, const ntfschar *name,
+ a = (ATTR_RECORD*)((u8*)ctx->attr +
+ le32_to_cpu(ctx->attr->length));
+ for (;; a = (ATTR_RECORD*)((u8*)a + le32_to_cpu(a->length))) {
+- if ((u8*)a < (u8*)ctx->mrec || (u8*)a > (u8*)ctx->mrec +
+- le32_to_cpu(ctx->mrec->bytes_allocated))
++ u8 *mrec_end = (u8 *)ctx->mrec +
++ le32_to_cpu(ctx->mrec->bytes_allocated);
++ u8 *name_end = (u8 *)a + le16_to_cpu(a->name_offset) +
++ a->name_length * sizeof(ntfschar);
++ if ((u8*)a < (u8*)ctx->mrec || (u8*)a > mrec_end ||
++ name_end > mrec_end)
+ break;
+ ctx->attr = a;
+ if (unlikely(le32_to_cpu(a->type) > le32_to_cpu(type) ||
+diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
+index 3375275714612..740b642383127 100644
+--- a/fs/ocfs2/ocfs2.h
++++ b/fs/ocfs2/ocfs2.h
+@@ -277,7 +277,6 @@ enum ocfs2_mount_options
+ OCFS2_MOUNT_JOURNAL_ASYNC_COMMIT = 1 << 15, /* Journal Async Commit */
+ OCFS2_MOUNT_ERRORS_CONT = 1 << 16, /* Return EIO to the calling process on error */
+ OCFS2_MOUNT_ERRORS_ROFS = 1 << 17, /* Change filesystem to read-only on error */
+- OCFS2_MOUNT_NOCLUSTER = 1 << 18, /* No cluster aware filesystem mount */
+ };
+
+ #define OCFS2_OSB_SOFT_RO 0x0001
+@@ -673,8 +672,7 @@ static inline int ocfs2_cluster_o2cb_global_heartbeat(struct ocfs2_super *osb)
+
+ static inline int ocfs2_mount_local(struct ocfs2_super *osb)
+ {
+- return ((osb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT)
+- || (osb->s_mount_opt & OCFS2_MOUNT_NOCLUSTER));
++ return (osb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT);
+ }
+
+ static inline int ocfs2_uses_extended_slot_map(struct ocfs2_super *osb)
+diff --git a/fs/ocfs2/slot_map.c b/fs/ocfs2/slot_map.c
+index 0b0ae3ebb0cf5..da7718cef735e 100644
+--- a/fs/ocfs2/slot_map.c
++++ b/fs/ocfs2/slot_map.c
+@@ -252,16 +252,14 @@ static int __ocfs2_find_empty_slot(struct ocfs2_slot_info *si,
+ int i, ret = -ENOSPC;
+
+ if ((preferred >= 0) && (preferred < si->si_num_slots)) {
+- if (!si->si_slots[preferred].sl_valid ||
+- !si->si_slots[preferred].sl_node_num) {
++ if (!si->si_slots[preferred].sl_valid) {
+ ret = preferred;
+ goto out;
+ }
+ }
+
+ for(i = 0; i < si->si_num_slots; i++) {
+- if (!si->si_slots[i].sl_valid ||
+- !si->si_slots[i].sl_node_num) {
++ if (!si->si_slots[i].sl_valid) {
+ ret = i;
+ break;
+ }
+@@ -456,30 +454,24 @@ int ocfs2_find_slot(struct ocfs2_super *osb)
+ spin_lock(&osb->osb_lock);
+ ocfs2_update_slot_info(si);
+
+- if (ocfs2_mount_local(osb))
+- /* use slot 0 directly in local mode */
+- slot = 0;
+- else {
+- /* search for ourselves first and take the slot if it already
+- * exists. Perhaps we need to mark this in a variable for our
+- * own journal recovery? Possibly not, though we certainly
+- * need to warn to the user */
+- slot = __ocfs2_node_num_to_slot(si, osb->node_num);
++ /* search for ourselves first and take the slot if it already
++ * exists. Perhaps we need to mark this in a variable for our
++ * own journal recovery? Possibly not, though we certainly
++ * need to warn to the user */
++ slot = __ocfs2_node_num_to_slot(si, osb->node_num);
++ if (slot < 0) {
++ /* if no slot yet, then just take 1st available
++ * one. */
++ slot = __ocfs2_find_empty_slot(si, osb->preferred_slot);
+ if (slot < 0) {
+- /* if no slot yet, then just take 1st available
+- * one. */
+- slot = __ocfs2_find_empty_slot(si, osb->preferred_slot);
+- if (slot < 0) {
+- spin_unlock(&osb->osb_lock);
+- mlog(ML_ERROR, "no free slots available!\n");
+- status = -EINVAL;
+- goto bail;
+- }
+- } else
+- printk(KERN_INFO "ocfs2: Slot %d on device (%s) was "
+- "already allocated to this node!\n",
+- slot, osb->dev_str);
+- }
++ spin_unlock(&osb->osb_lock);
++ mlog(ML_ERROR, "no free slots available!\n");
++ status = -EINVAL;
++ goto bail;
++ }
++ } else
++ printk(KERN_INFO "ocfs2: Slot %d on device (%s) was already "
++ "allocated to this node!\n", slot, osb->dev_str);
+
+ ocfs2_set_slot(si, slot, osb->node_num);
+ osb->slot_num = slot;
+diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
+index 311433c69a3f5..0535acd2389aa 100644
+--- a/fs/ocfs2/super.c
++++ b/fs/ocfs2/super.c
+@@ -172,7 +172,6 @@ enum {
+ Opt_dir_resv_level,
+ Opt_journal_async_commit,
+ Opt_err_cont,
+- Opt_nocluster,
+ Opt_err,
+ };
+
+@@ -206,7 +205,6 @@ static const match_table_t tokens = {
+ {Opt_dir_resv_level, "dir_resv_level=%u"},
+ {Opt_journal_async_commit, "journal_async_commit"},
+ {Opt_err_cont, "errors=continue"},
+- {Opt_nocluster, "nocluster"},
+ {Opt_err, NULL}
+ };
+
+@@ -618,13 +616,6 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
+ goto out;
+ }
+
+- tmp = OCFS2_MOUNT_NOCLUSTER;
+- if ((osb->s_mount_opt & tmp) != (parsed_options.mount_opt & tmp)) {
+- ret = -EINVAL;
+- mlog(ML_ERROR, "Cannot change nocluster option on remount\n");
+- goto out;
+- }
+-
+ tmp = OCFS2_MOUNT_HB_LOCAL | OCFS2_MOUNT_HB_GLOBAL |
+ OCFS2_MOUNT_HB_NONE;
+ if ((osb->s_mount_opt & tmp) != (parsed_options.mount_opt & tmp)) {
+@@ -865,7 +856,6 @@ static int ocfs2_verify_userspace_stack(struct ocfs2_super *osb,
+ }
+
+ if (ocfs2_userspace_stack(osb) &&
+- !(osb->s_mount_opt & OCFS2_MOUNT_NOCLUSTER) &&
+ strncmp(osb->osb_cluster_stack, mopt->cluster_stack,
+ OCFS2_STACK_LABEL_LEN)) {
+ mlog(ML_ERROR,
+@@ -1144,11 +1134,6 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
+ osb->s_mount_opt & OCFS2_MOUNT_DATA_WRITEBACK ? "writeback" :
+ "ordered");
+
+- if ((osb->s_mount_opt & OCFS2_MOUNT_NOCLUSTER) &&
+- !(osb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT))
+- printk(KERN_NOTICE "ocfs2: The shared device (%s) is mounted "
+- "without cluster aware mode.\n", osb->dev_str);
+-
+ atomic_set(&osb->vol_state, VOLUME_MOUNTED);
+ wake_up(&osb->osb_mount_event);
+
+@@ -1455,9 +1440,6 @@ static int ocfs2_parse_options(struct super_block *sb,
+ case Opt_journal_async_commit:
+ mopt->mount_opt |= OCFS2_MOUNT_JOURNAL_ASYNC_COMMIT;
+ break;
+- case Opt_nocluster:
+- mopt->mount_opt |= OCFS2_MOUNT_NOCLUSTER;
+- break;
+ default:
+ mlog(ML_ERROR,
+ "Unrecognized mount option \"%s\" "
+@@ -1569,9 +1551,6 @@ static int ocfs2_show_options(struct seq_file *s, struct dentry *root)
+ if (opts & OCFS2_MOUNT_JOURNAL_ASYNC_COMMIT)
+ seq_printf(s, ",journal_async_commit");
+
+- if (opts & OCFS2_MOUNT_NOCLUSTER)
+- seq_printf(s, ",nocluster");
+-
+ return 0;
+ }
+
+diff --git a/fs/read_write.c b/fs/read_write.c
+index 671f47d5984ce..aca85a5bbb0f0 100644
+--- a/fs/read_write.c
++++ b/fs/read_write.c
+@@ -1247,6 +1247,9 @@ static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
+ count, fl);
+ file_end_write(out.file);
+ } else {
++ if (out.file->f_flags & O_NONBLOCK)
++ fl |= SPLICE_F_NONBLOCK;
++
+ retval = splice_file_to_pipe(in.file, opipe, &pos, count, fl);
+ }
+
+diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
+index aa0c47cb0d165..694096653ea79 100644
+--- a/fs/userfaultfd.c
++++ b/fs/userfaultfd.c
+@@ -191,17 +191,19 @@ static inline void msg_init(struct uffd_msg *msg)
+ }
+
+ static inline struct uffd_msg userfault_msg(unsigned long address,
++ unsigned long real_address,
+ unsigned int flags,
+ unsigned long reason,
+ unsigned int features)
+ {
+ struct uffd_msg msg;
++
+ msg_init(&msg);
+ msg.event = UFFD_EVENT_PAGEFAULT;
+
+- if (!(features & UFFD_FEATURE_EXACT_ADDRESS))
+- address &= PAGE_MASK;
+- msg.arg.pagefault.address = address;
++ msg.arg.pagefault.address = (features & UFFD_FEATURE_EXACT_ADDRESS) ?
++ real_address : address;
++
+ /*
+ * These flags indicate why the userfault occurred:
+ * - UFFD_PAGEFAULT_FLAG_WP indicates a write protect fault.
+@@ -485,8 +487,8 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
+
+ init_waitqueue_func_entry(&uwq.wq, userfaultfd_wake_function);
+ uwq.wq.private = current;
+- uwq.msg = userfault_msg(vmf->real_address, vmf->flags, reason,
+- ctx->features);
++ uwq.msg = userfault_msg(vmf->address, vmf->real_address, vmf->flags,
++ reason, ctx->features);
+ uwq.ctx = ctx;
+ uwq.waken = false;
+
+diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
+index 7ce93aaf69f8d..98954dda57344 100644
+--- a/include/asm-generic/io.h
++++ b/include/asm-generic/io.h
+@@ -1125,9 +1125,7 @@ static inline void memcpy_toio(volatile void __iomem *addr, const void *buffer,
+ }
+ #endif
+
+-#ifndef CONFIG_GENERIC_DEVMEM_IS_ALLOWED
+ extern int devmem_is_allowed(unsigned long pfn);
+-#endif
+
+ #endif /* __KERNEL__ */
+
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index da08cce2a9fa8..b5a115e9bcd5a 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -1130,23 +1130,27 @@ static inline bool is_zone_movable_page(const struct page *page)
+ #if defined(CONFIG_ZONE_DEVICE) && defined(CONFIG_FS_DAX)
+ DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
+
+-bool __put_devmap_managed_page(struct page *page);
+-static inline bool put_devmap_managed_page(struct page *page)
++bool __put_devmap_managed_page_refs(struct page *page, int refs);
++static inline bool put_devmap_managed_page_refs(struct page *page, int refs)
+ {
+ if (!static_branch_unlikely(&devmap_managed_key))
+ return false;
+ if (!is_zone_device_page(page))
+ return false;
+- return __put_devmap_managed_page(page);
++ return __put_devmap_managed_page_refs(page, refs);
+ }
+-
+ #else /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */
+-static inline bool put_devmap_managed_page(struct page *page)
++static inline bool put_devmap_managed_page_refs(struct page *page, int refs)
+ {
+ return false;
+ }
+ #endif /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */
+
++static inline bool put_devmap_managed_page(struct page *page)
++{
++ return put_devmap_managed_page_refs(page, 1);
++}
++
+ /* 127: arbitrary random number, small enough to assemble well */
+ #define folio_ref_zero_or_close_to_overflow(folio) \
+ ((unsigned int) folio_ref_count(folio) + 127u <= 127u)
+diff --git a/include/net/addrconf.h b/include/net/addrconf.h
+index f7506f08e505a..c04f359655b86 100644
+--- a/include/net/addrconf.h
++++ b/include/net/addrconf.h
+@@ -405,6 +405,9 @@ static inline bool ip6_ignore_linkdown(const struct net_device *dev)
+ {
+ const struct inet6_dev *idev = __in6_dev_get(dev);
+
++ if (unlikely(!idev))
++ return true;
++
+ return !!idev->cnf.ignore_routes_with_linkdown;
+ }
+
+diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
+index 3c4f550e5a8b7..2f766e3437ce2 100644
+--- a/include/net/bluetooth/l2cap.h
++++ b/include/net/bluetooth/l2cap.h
+@@ -847,6 +847,7 @@ enum {
+ };
+
+ void l2cap_chan_hold(struct l2cap_chan *c);
++struct l2cap_chan *l2cap_chan_hold_unless_zero(struct l2cap_chan *c);
+ void l2cap_chan_put(struct l2cap_chan *c);
+
+ static inline void l2cap_chan_lock(struct l2cap_chan *chan)
+diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
+index 3908296d103fd..d24719972900a 100644
+--- a/include/net/inet_connection_sock.h
++++ b/include/net/inet_connection_sock.h
+@@ -323,7 +323,7 @@ void inet_csk_update_fastreuse(struct inet_bind_bucket *tb,
+
+ struct dst_entry *inet_csk_update_pmtu(struct sock *sk, u32 mtu);
+
+-#define TCP_PINGPONG_THRESH 3
++#define TCP_PINGPONG_THRESH 1
+
+ static inline void inet_csk_enter_pingpong_mode(struct sock *sk)
+ {
+@@ -340,14 +340,6 @@ static inline bool inet_csk_in_pingpong_mode(struct sock *sk)
+ return inet_csk(sk)->icsk_ack.pingpong >= TCP_PINGPONG_THRESH;
+ }
+
+-static inline void inet_csk_inc_pingpong_cnt(struct sock *sk)
+-{
+- struct inet_connection_sock *icsk = inet_csk(sk);
+-
+- if (icsk->icsk_ack.pingpong < U8_MAX)
+- icsk->icsk_ack.pingpong++;
+-}
+-
+ static inline bool inet_csk_has_ulp(struct sock *sk)
+ {
+ return inet_sk(sk)->is_icsk && !!inet_csk(sk)->icsk_ulp_ops;
+diff --git a/include/net/sock.h b/include/net/sock.h
+index 6bef0ffb1e7b7..9563a093fdfc1 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -2834,18 +2834,18 @@ static inline int sk_get_wmem0(const struct sock *sk, const struct proto *proto)
+ {
+ /* Does this proto have per netns sysctl_wmem ? */
+ if (proto->sysctl_wmem_offset)
+- return *(int *)((void *)sock_net(sk) + proto->sysctl_wmem_offset);
++ return READ_ONCE(*(int *)((void *)sock_net(sk) + proto->sysctl_wmem_offset));
+
+- return *proto->sysctl_wmem;
++ return READ_ONCE(*proto->sysctl_wmem);
+ }
+
+ static inline int sk_get_rmem0(const struct sock *sk, const struct proto *proto)
+ {
+ /* Does this proto have per netns sysctl_rmem ? */
+ if (proto->sysctl_rmem_offset)
+- return *(int *)((void *)sock_net(sk) + proto->sysctl_rmem_offset);
++ return READ_ONCE(*(int *)((void *)sock_net(sk) + proto->sysctl_rmem_offset));
+
+- return *proto->sysctl_rmem;
++ return READ_ONCE(*proto->sysctl_rmem);
+ }
+
+ /* Default TCP Small queue budget is ~1 ms of data (1sec >> 10)
+diff --git a/include/net/tcp.h b/include/net/tcp.h
+index 4f5de382e1927..f618d6a52324c 100644
+--- a/include/net/tcp.h
++++ b/include/net/tcp.h
+@@ -1437,7 +1437,7 @@ void tcp_select_initial_window(const struct sock *sk, int __space,
+
+ static inline int tcp_win_from_space(const struct sock *sk, int space)
+ {
+- int tcp_adv_win_scale = sock_net(sk)->ipv4.sysctl_tcp_adv_win_scale;
++ int tcp_adv_win_scale = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_adv_win_scale);
+
+ return tcp_adv_win_scale <= 0 ?
+ (space>>(-tcp_adv_win_scale)) :
+diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
+index acde5d6f12546..13a78b2b7b323 100644
+--- a/kernel/locking/rwsem.c
++++ b/kernel/locking/rwsem.c
+@@ -334,8 +334,6 @@ struct rwsem_waiter {
+ struct task_struct *task;
+ enum rwsem_waiter_type type;
+ unsigned long timeout;
+-
+- /* Writer only, not initialized in reader */
+ bool handoff_set;
+ };
+ #define rwsem_first_waiter(sem) \
+@@ -455,10 +453,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
+ * to give up the lock), request a HANDOFF to
+ * force the issue.
+ */
+- if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
+- time_after(jiffies, waiter->timeout)) {
+- adjustment -= RWSEM_FLAG_HANDOFF;
+- lockevent_inc(rwsem_rlock_handoff);
++ if (time_after(jiffies, waiter->timeout)) {
++ if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
++ adjustment -= RWSEM_FLAG_HANDOFF;
++ lockevent_inc(rwsem_rlock_handoff);
++ }
++ waiter->handoff_set = true;
+ }
+
+ atomic_long_add(-adjustment, &sem->count);
+@@ -568,7 +568,7 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
+ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
+ struct rwsem_waiter *waiter)
+ {
+- bool first = rwsem_first_waiter(sem) == waiter;
++ struct rwsem_waiter *first = rwsem_first_waiter(sem);
+ long count, new;
+
+ lockdep_assert_held(&sem->wait_lock);
+@@ -578,11 +578,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
+ bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
+
+ if (has_handoff) {
+- if (!first)
++ /*
++ * Honor handoff bit and yield only when the first
++ * waiter is the one that set it. Otherwisee, we
++ * still try to acquire the rwsem.
++ */
++ if (first->handoff_set && (waiter != first))
+ return false;
+
+- /* First waiter inherits a previously set handoff bit */
+- waiter->handoff_set = true;
++ /*
++ * First waiter can inherit a previously set handoff
++ * bit and spin on rwsem if lock acquisition fails.
++ */
++ if (waiter == first)
++ waiter->handoff_set = true;
+ }
+
+ new = count;
+@@ -972,6 +981,7 @@ queue:
+ waiter.task = current;
+ waiter.type = RWSEM_WAITING_FOR_READ;
+ waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
++ waiter.handoff_set = false;
+
+ raw_spin_lock_irq(&sem->wait_lock);
+ if (list_empty(&sem->wait_list)) {
+diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
+index bb9962b33f95c..59ddb00d69447 100644
+--- a/kernel/watch_queue.c
++++ b/kernel/watch_queue.c
+@@ -454,6 +454,33 @@ void init_watch(struct watch *watch, struct watch_queue *wqueue)
+ rcu_assign_pointer(watch->queue, wqueue);
+ }
+
++static int add_one_watch(struct watch *watch, struct watch_list *wlist, struct watch_queue *wqueue)
++{
++ const struct cred *cred;
++ struct watch *w;
++
++ hlist_for_each_entry(w, &wlist->watchers, list_node) {
++ struct watch_queue *wq = rcu_access_pointer(w->queue);
++ if (wqueue == wq && watch->id == w->id)
++ return -EBUSY;
++ }
++
++ cred = current_cred();
++ if (atomic_inc_return(&cred->user->nr_watches) > task_rlimit(current, RLIMIT_NOFILE)) {
++ atomic_dec(&cred->user->nr_watches);
++ return -EAGAIN;
++ }
++
++ watch->cred = get_cred(cred);
++ rcu_assign_pointer(watch->watch_list, wlist);
++
++ kref_get(&wqueue->usage);
++ kref_get(&watch->usage);
++ hlist_add_head(&watch->queue_node, &wqueue->watches);
++ hlist_add_head_rcu(&watch->list_node, &wlist->watchers);
++ return 0;
++}
++
+ /**
+ * add_watch_to_object - Add a watch on an object to a watch list
+ * @watch: The watch to add
+@@ -468,34 +495,21 @@ void init_watch(struct watch *watch, struct watch_queue *wqueue)
+ */
+ int add_watch_to_object(struct watch *watch, struct watch_list *wlist)
+ {
+- struct watch_queue *wqueue = rcu_access_pointer(watch->queue);
+- struct watch *w;
+-
+- hlist_for_each_entry(w, &wlist->watchers, list_node) {
+- struct watch_queue *wq = rcu_access_pointer(w->queue);
+- if (wqueue == wq && watch->id == w->id)
+- return -EBUSY;
+- }
+-
+- watch->cred = get_current_cred();
+- rcu_assign_pointer(watch->watch_list, wlist);
++ struct watch_queue *wqueue;
++ int ret = -ENOENT;
+
+- if (atomic_inc_return(&watch->cred->user->nr_watches) >
+- task_rlimit(current, RLIMIT_NOFILE)) {
+- atomic_dec(&watch->cred->user->nr_watches);
+- put_cred(watch->cred);
+- return -EAGAIN;
+- }
++ rcu_read_lock();
+
++ wqueue = rcu_access_pointer(watch->queue);
+ if (lock_wqueue(wqueue)) {
+- kref_get(&wqueue->usage);
+- kref_get(&watch->usage);
+- hlist_add_head(&watch->queue_node, &wqueue->watches);
++ spin_lock(&wlist->lock);
++ ret = add_one_watch(watch, wlist, wqueue);
++ spin_unlock(&wlist->lock);
+ unlock_wqueue(wqueue);
+ }
+
+- hlist_add_head(&watch->list_node, &wlist->watchers);
+- return 0;
++ rcu_read_unlock();
++ return ret;
+ }
+ EXPORT_SYMBOL(add_watch_to_object);
+
+diff --git a/mm/gup.c b/mm/gup.c
+index f598a037eb04f..c5d076d43d9be 100644
+--- a/mm/gup.c
++++ b/mm/gup.c
+@@ -54,7 +54,8 @@ retry:
+ * belongs to this folio.
+ */
+ if (unlikely(page_folio(page) != folio)) {
+- folio_put_refs(folio, refs);
++ if (!put_devmap_managed_page_refs(&folio->page, refs))
++ folio_put_refs(folio, refs);
+ goto retry;
+ }
+
+@@ -143,7 +144,8 @@ static void gup_put_folio(struct folio *folio, int refs, unsigned int flags)
+ refs *= GUP_PIN_COUNTING_BIAS;
+ }
+
+- folio_put_refs(folio, refs);
++ if (!put_devmap_managed_page_refs(&folio->page, refs))
++ folio_put_refs(folio, refs);
+ }
+
+ /**
+diff --git a/mm/hmm.c b/mm/hmm.c
+index af71aac3140e4..6ec5ea76f31b1 100644
+--- a/mm/hmm.c
++++ b/mm/hmm.c
+@@ -212,14 +212,6 @@ int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
+ unsigned long end, unsigned long hmm_pfns[], pmd_t pmd);
+ #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+-static inline bool hmm_is_device_private_entry(struct hmm_range *range,
+- swp_entry_t entry)
+-{
+- return is_device_private_entry(entry) &&
+- pfn_swap_entry_to_page(entry)->pgmap->owner ==
+- range->dev_private_owner;
+-}
+-
+ static inline unsigned long pte_to_hmm_pfn_flags(struct hmm_range *range,
+ pte_t pte)
+ {
+@@ -252,10 +244,12 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
+ swp_entry_t entry = pte_to_swp_entry(pte);
+
+ /*
+- * Never fault in device private pages, but just report
+- * the PFN even if not present.
++ * Don't fault in device private pages owned by the caller,
++ * just report the PFN.
+ */
+- if (hmm_is_device_private_entry(range, entry)) {
++ if (is_device_private_entry(entry) &&
++ pfn_swap_entry_to_page(entry)->pgmap->owner ==
++ range->dev_private_owner) {
+ cpu_flags = HMM_PFN_VALID;
+ if (is_writable_device_private_entry(entry))
+ cpu_flags |= HMM_PFN_WRITE;
+@@ -273,6 +267,9 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
+ if (!non_swap_entry(entry))
+ goto fault;
+
++ if (is_device_private_entry(entry))
++ goto fault;
++
+ if (is_device_exclusive_entry(entry))
+ goto fault;
+
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 410bbb0aee321..859cfcaecddbc 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -5822,6 +5822,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
+
+ page = alloc_huge_page(dst_vma, dst_addr, 0);
+ if (IS_ERR(page)) {
++ put_page(*pagep);
+ ret = -ENOMEM;
+ *pagep = NULL;
+ goto out;
+diff --git a/mm/memory.c b/mm/memory.c
+index e176ee386238a..4ef55c26e114e 100644
+--- a/mm/memory.c
++++ b/mm/memory.c
+@@ -4108,9 +4108,12 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
+ return VM_FAULT_OOM;
+ }
+
+- /* See comment in handle_pte_fault() */
++ /*
++ * See comment in handle_pte_fault() for how this scenario happens, we
++ * need to return NOPAGE so that we drop this page.
++ */
+ if (pmd_devmap_trans_unstable(vmf->pmd))
+- return 0;
++ return VM_FAULT_NOPAGE;
+
+ vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
+ vmf->address, &vmf->ptl);
+diff --git a/mm/memremap.c b/mm/memremap.c
+index 2554a6b07007f..e11653fd348cc 100644
+--- a/mm/memremap.c
++++ b/mm/memremap.c
+@@ -489,7 +489,7 @@ void free_zone_device_page(struct page *page)
+ }
+
+ #ifdef CONFIG_FS_DAX
+-bool __put_devmap_managed_page(struct page *page)
++bool __put_devmap_managed_page_refs(struct page *page, int refs)
+ {
+ if (page->pgmap->type != MEMORY_DEVICE_FS_DAX)
+ return false;
+@@ -499,9 +499,9 @@ bool __put_devmap_managed_page(struct page *page)
+ * refcount is 1, then the page is free and the refcount is
+ * stable because nobody holds a reference on the page.
+ */
+- if (page_ref_dec_return(page) == 1)
++ if (page_ref_sub_return(page, refs) == 1)
+ wake_up_var(&page->_refcount);
+ return true;
+ }
+-EXPORT_SYMBOL(__put_devmap_managed_page);
++EXPORT_SYMBOL(__put_devmap_managed_page_refs);
+ #endif /* CONFIG_FS_DAX */
+diff --git a/mm/page_alloc.c b/mm/page_alloc.c
+index 5ced6cb260ed1..135a081edb82c 100644
+--- a/mm/page_alloc.c
++++ b/mm/page_alloc.c
+@@ -3953,11 +3953,15 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
+ * need to be calculated.
+ */
+ if (!order) {
+- long fast_free;
++ long usable_free;
++ long reserved;
+
+- fast_free = free_pages;
+- fast_free -= __zone_watermark_unusable_free(z, 0, alloc_flags);
+- if (fast_free > mark + z->lowmem_reserve[highest_zoneidx])
++ usable_free = free_pages;
++ reserved = __zone_watermark_unusable_free(z, 0, alloc_flags);
++
++ /* reserved may over estimate high-atomic reserves. */
++ usable_free -= min(usable_free, reserved);
++ if (usable_free > mark + z->lowmem_reserve[highest_zoneidx])
+ return true;
+ }
+
+diff --git a/mm/secretmem.c b/mm/secretmem.c
+index 3b3cf2892b6ae..81ff3037bd551 100644
+--- a/mm/secretmem.c
++++ b/mm/secretmem.c
+@@ -55,22 +55,28 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
+ gfp_t gfp = vmf->gfp_mask;
+ unsigned long addr;
+ struct page *page;
++ vm_fault_t ret;
+ int err;
+
+ if (((loff_t)vmf->pgoff << PAGE_SHIFT) >= i_size_read(inode))
+ return vmf_error(-EINVAL);
+
++ filemap_invalidate_lock_shared(mapping);
++
+ retry:
+ page = find_lock_page(mapping, offset);
+ if (!page) {
+ page = alloc_page(gfp | __GFP_ZERO);
+- if (!page)
+- return VM_FAULT_OOM;
++ if (!page) {
++ ret = VM_FAULT_OOM;
++ goto out;
++ }
+
+ err = set_direct_map_invalid_noflush(page);
+ if (err) {
+ put_page(page);
+- return vmf_error(err);
++ ret = vmf_error(err);
++ goto out;
+ }
+
+ __SetPageUptodate(page);
+@@ -86,7 +92,8 @@ retry:
+ if (err == -EEXIST)
+ goto retry;
+
+- return vmf_error(err);
++ ret = vmf_error(err);
++ goto out;
+ }
+
+ addr = (unsigned long)page_address(page);
+@@ -94,7 +101,11 @@ retry:
+ }
+
+ vmf->page = page;
+- return VM_FAULT_LOCKED;
++ ret = VM_FAULT_LOCKED;
++
++out:
++ filemap_invalidate_unlock_shared(mapping);
++ return ret;
+ }
+
+ static const struct vm_operations_struct secretmem_vm_ops = {
+@@ -162,12 +173,20 @@ static int secretmem_setattr(struct user_namespace *mnt_userns,
+ struct dentry *dentry, struct iattr *iattr)
+ {
+ struct inode *inode = d_inode(dentry);
++ struct address_space *mapping = inode->i_mapping;
+ unsigned int ia_valid = iattr->ia_valid;
++ int ret;
++
++ filemap_invalidate_lock(mapping);
+
+ if ((ia_valid & ATTR_SIZE) && inode->i_size)
+- return -EINVAL;
++ ret = -EINVAL;
++ else
++ ret = simple_setattr(mnt_userns, dentry, iattr);
+
+- return simple_setattr(mnt_userns, dentry, iattr);
++ filemap_invalidate_unlock(mapping);
++
++ return ret;
+ }
+
+ static const struct inode_operations secretmem_iops = {
+diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
+index 351c2390164d0..9e2a42299fc09 100644
+--- a/net/bluetooth/hci_sync.c
++++ b/net/bluetooth/hci_sync.c
+@@ -4942,6 +4942,9 @@ int hci_suspend_sync(struct hci_dev *hdev)
+ return err;
+ }
+
++ /* Update event mask so only the allowed event can wakeup the host */
++ hci_set_event_mask_sync(hdev);
++
+ /* Only configure accept list if disconnect succeeded and wake
+ * isn't being prevented.
+ */
+@@ -4953,9 +4956,6 @@ int hci_suspend_sync(struct hci_dev *hdev)
+ /* Unpause to take care of updating scanning params */
+ hdev->scanning_paused = false;
+
+- /* Update event mask so only the allowed event can wakeup the host */
+- hci_set_event_mask_sync(hdev);
+-
+ /* Enable event filter for paired devices */
+ hci_update_event_filter_sync(hdev);
+
+diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
+index ae78490ecd3d4..52668662ae8de 100644
+--- a/net/bluetooth/l2cap_core.c
++++ b/net/bluetooth/l2cap_core.c
+@@ -111,7 +111,8 @@ static struct l2cap_chan *__l2cap_get_chan_by_scid(struct l2cap_conn *conn,
+ }
+
+ /* Find channel with given SCID.
+- * Returns locked channel. */
++ * Returns a reference locked channel.
++ */
+ static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
+ u16 cid)
+ {
+@@ -119,15 +120,19 @@ static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
+
+ mutex_lock(&conn->chan_lock);
+ c = __l2cap_get_chan_by_scid(conn, cid);
+- if (c)
+- l2cap_chan_lock(c);
++ if (c) {
++ /* Only lock if chan reference is not 0 */
++ c = l2cap_chan_hold_unless_zero(c);
++ if (c)
++ l2cap_chan_lock(c);
++ }
+ mutex_unlock(&conn->chan_lock);
+
+ return c;
+ }
+
+ /* Find channel with given DCID.
+- * Returns locked channel.
++ * Returns a reference locked channel.
+ */
+ static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
+ u16 cid)
+@@ -136,8 +141,12 @@ static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
+
+ mutex_lock(&conn->chan_lock);
+ c = __l2cap_get_chan_by_dcid(conn, cid);
+- if (c)
+- l2cap_chan_lock(c);
++ if (c) {
++ /* Only lock if chan reference is not 0 */
++ c = l2cap_chan_hold_unless_zero(c);
++ if (c)
++ l2cap_chan_lock(c);
++ }
+ mutex_unlock(&conn->chan_lock);
+
+ return c;
+@@ -162,8 +171,12 @@ static struct l2cap_chan *l2cap_get_chan_by_ident(struct l2cap_conn *conn,
+
+ mutex_lock(&conn->chan_lock);
+ c = __l2cap_get_chan_by_ident(conn, ident);
+- if (c)
+- l2cap_chan_lock(c);
++ if (c) {
++ /* Only lock if chan reference is not 0 */
++ c = l2cap_chan_hold_unless_zero(c);
++ if (c)
++ l2cap_chan_lock(c);
++ }
+ mutex_unlock(&conn->chan_lock);
+
+ return c;
+@@ -497,6 +510,16 @@ void l2cap_chan_hold(struct l2cap_chan *c)
+ kref_get(&c->kref);
+ }
+
++struct l2cap_chan *l2cap_chan_hold_unless_zero(struct l2cap_chan *c)
++{
++ BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
++
++ if (!kref_get_unless_zero(&c->kref))
++ return NULL;
++
++ return c;
++}
++
+ void l2cap_chan_put(struct l2cap_chan *c)
+ {
+ BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
+@@ -1968,7 +1991,10 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
+ src_match = !bacmp(&c->src, src);
+ dst_match = !bacmp(&c->dst, dst);
+ if (src_match && dst_match) {
+- l2cap_chan_hold(c);
++ c = l2cap_chan_hold_unless_zero(c);
++ if (!c)
++ continue;
++
+ read_unlock(&chan_list_lock);
+ return c;
+ }
+@@ -1983,7 +2009,7 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
+ }
+
+ if (c1)
+- l2cap_chan_hold(c1);
++ c1 = l2cap_chan_hold_unless_zero(c1);
+
+ read_unlock(&chan_list_lock);
+
+@@ -4463,6 +4489,7 @@ static inline int l2cap_config_req(struct l2cap_conn *conn,
+
+ unlock:
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+ return err;
+ }
+
+@@ -4577,6 +4604,7 @@ static inline int l2cap_config_rsp(struct l2cap_conn *conn,
+
+ done:
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+ return err;
+ }
+
+@@ -5304,6 +5332,7 @@ send_move_response:
+ l2cap_send_move_chan_rsp(chan, result);
+
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+
+ return 0;
+ }
+@@ -5396,6 +5425,7 @@ static void l2cap_move_continue(struct l2cap_conn *conn, u16 icid, u16 result)
+ }
+
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+ }
+
+ static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
+@@ -5425,6 +5455,7 @@ static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
+ l2cap_send_move_chan_cfm(chan, L2CAP_MC_UNCONFIRMED);
+
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+ }
+
+ static int l2cap_move_channel_rsp(struct l2cap_conn *conn,
+@@ -5488,6 +5519,7 @@ static int l2cap_move_channel_confirm(struct l2cap_conn *conn,
+ l2cap_send_move_chan_cfm_rsp(conn, cmd->ident, icid);
+
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+
+ return 0;
+ }
+@@ -5523,6 +5555,7 @@ static inline int l2cap_move_channel_confirm_rsp(struct l2cap_conn *conn,
+ }
+
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+
+ return 0;
+ }
+@@ -5895,12 +5928,11 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
+ if (credits > max_credits) {
+ BT_ERR("LE credits overflow");
+ l2cap_send_disconn_req(chan, ECONNRESET);
+- l2cap_chan_unlock(chan);
+
+ /* Return 0 so that we don't trigger an unnecessary
+ * command reject packet.
+ */
+- return 0;
++ goto unlock;
+ }
+
+ chan->tx_credits += credits;
+@@ -5911,7 +5943,9 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
+ if (chan->tx_credits)
+ chan->ops->resume(chan);
+
++unlock:
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+
+ return 0;
+ }
+@@ -7597,6 +7631,7 @@ drop:
+
+ done:
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+ }
+
+ static void l2cap_conless_channel(struct l2cap_conn *conn, __le16 psm,
+@@ -8085,7 +8120,7 @@ static struct l2cap_chan *l2cap_global_fixed_chan(struct l2cap_chan *c,
+ if (src_type != c->src_type)
+ continue;
+
+- l2cap_chan_hold(c);
++ c = l2cap_chan_hold_unless_zero(c);
+ read_unlock(&chan_list_lock);
+ return c;
+ }
+diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
+index 200ad05b296fc..52abc46e88412 100644
+--- a/net/bridge/br_netlink.c
++++ b/net/bridge/br_netlink.c
+@@ -589,9 +589,13 @@ static int br_fill_ifinfo(struct sk_buff *skb,
+ }
+
+ done:
++ if (af) {
++ if (nlmsg_get_pos(skb) - (void *)af > nla_attr_size(0))
++ nla_nest_end(skb, af);
++ else
++ nla_nest_cancel(skb, af);
++ }
+
+- if (af)
+- nla_nest_end(skb, af);
+ nlmsg_end(skb, nlh);
+ return 0;
+
+diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
+index dc92a67baea39..7d542eb461729 100644
+--- a/net/decnet/af_decnet.c
++++ b/net/decnet/af_decnet.c
+@@ -480,8 +480,8 @@ static struct sock *dn_alloc_sock(struct net *net, struct socket *sock, gfp_t gf
+ sk->sk_family = PF_DECnet;
+ sk->sk_protocol = 0;
+ sk->sk_allocation = gfp;
+- sk->sk_sndbuf = sysctl_decnet_wmem[1];
+- sk->sk_rcvbuf = sysctl_decnet_rmem[1];
++ sk->sk_sndbuf = READ_ONCE(sysctl_decnet_wmem[1]);
++ sk->sk_rcvbuf = READ_ONCE(sysctl_decnet_rmem[1]);
+
+ /* Initialization of DECnet Session Control Port */
+ scp = DN_SK(sk);
+diff --git a/net/dsa/switch.c b/net/dsa/switch.c
+index d8a80cf9742c0..52f84ea349d29 100644
+--- a/net/dsa/switch.c
++++ b/net/dsa/switch.c
+@@ -363,6 +363,7 @@ static int dsa_switch_do_lag_fdb_add(struct dsa_switch *ds, struct dsa_lag *lag,
+
+ ether_addr_copy(a->addr, addr);
+ a->vid = vid;
++ a->db = db;
+ refcount_set(&a->refcount, 1);
+ list_add_tail(&a->list, &lag->fdbs);
+
+diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
+index 43a4962722279..c1b53854047b6 100644
+--- a/net/ipv4/fib_trie.c
++++ b/net/ipv4/fib_trie.c
+@@ -1042,6 +1042,7 @@ fib_find_matching_alias(struct net *net, const struct fib_rt_info *fri)
+
+ void fib_alias_hw_flags_set(struct net *net, const struct fib_rt_info *fri)
+ {
++ u8 fib_notify_on_flag_change;
+ struct fib_alias *fa_match;
+ struct sk_buff *skb;
+ int err;
+@@ -1063,14 +1064,16 @@ void fib_alias_hw_flags_set(struct net *net, const struct fib_rt_info *fri)
+ WRITE_ONCE(fa_match->offload, fri->offload);
+ WRITE_ONCE(fa_match->trap, fri->trap);
+
++ fib_notify_on_flag_change = READ_ONCE(net->ipv4.sysctl_fib_notify_on_flag_change);
++
+ /* 2 means send notifications only if offload_failed was changed. */
+- if (net->ipv4.sysctl_fib_notify_on_flag_change == 2 &&
++ if (fib_notify_on_flag_change == 2 &&
+ READ_ONCE(fa_match->offload_failed) == fri->offload_failed)
+ goto out;
+
+ WRITE_ONCE(fa_match->offload_failed, fri->offload_failed);
+
+- if (!net->ipv4.sysctl_fib_notify_on_flag_change)
++ if (!fib_notify_on_flag_change)
+ goto out;
+
+ skb = nlmsg_new(fib_nlmsg_size(fa_match->fa_info), GFP_ATOMIC);
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index 28db838e604ad..91735d631a282 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -452,8 +452,8 @@ void tcp_init_sock(struct sock *sk)
+
+ icsk->icsk_sync_mss = tcp_sync_mss;
+
+- WRITE_ONCE(sk->sk_sndbuf, sock_net(sk)->ipv4.sysctl_tcp_wmem[1]);
+- WRITE_ONCE(sk->sk_rcvbuf, sock_net(sk)->ipv4.sysctl_tcp_rmem[1]);
++ WRITE_ONCE(sk->sk_sndbuf, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_wmem[1]));
++ WRITE_ONCE(sk->sk_rcvbuf, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[1]));
+
+ sk_sockets_allocated_inc(sk);
+ }
+@@ -686,7 +686,7 @@ static bool tcp_should_autocork(struct sock *sk, struct sk_buff *skb,
+ int size_goal)
+ {
+ return skb->len < size_goal &&
+- sock_net(sk)->ipv4.sysctl_tcp_autocorking &&
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_autocorking) &&
+ !tcp_rtx_queue_empty(sk) &&
+ refcount_read(&sk->sk_wmem_alloc) > skb->truesize &&
+ tcp_skb_can_collapse_to(skb);
+@@ -1743,7 +1743,7 @@ int tcp_set_rcvlowat(struct sock *sk, int val)
+ if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
+ cap = sk->sk_rcvbuf >> 1;
+ else
+- cap = sock_net(sk)->ipv4.sysctl_tcp_rmem[2] >> 1;
++ cap = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]) >> 1;
+ val = min(val, cap);
+ WRITE_ONCE(sk->sk_rcvlowat, val ? : 1);
+
+@@ -4481,9 +4481,18 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
+ return SKB_DROP_REASON_TCP_MD5UNEXPECTED;
+ }
+
+- /* check the signature */
+- genhash = tp->af_specific->calc_md5_hash(newhash, hash_expected,
+- NULL, skb);
++ /* Check the signature.
++ * To support dual stack listeners, we need to handle
++ * IPv4-mapped case.
++ */
++ if (family == AF_INET)
++ genhash = tcp_v4_md5_hash_skb(newhash,
++ hash_expected,
++ NULL, skb);
++ else
++ genhash = tp->af_specific->calc_md5_hash(newhash,
++ hash_expected,
++ NULL, skb);
+
+ if (genhash || memcmp(hash_location, newhash, 16) != 0) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE);
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 19b186a4a8e8c..9221c8c7b9a97 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -426,7 +426,7 @@ static void tcp_sndbuf_expand(struct sock *sk)
+
+ if (sk->sk_sndbuf < sndmem)
+ WRITE_ONCE(sk->sk_sndbuf,
+- min(sndmem, sock_net(sk)->ipv4.sysctl_tcp_wmem[2]));
++ min(sndmem, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_wmem[2])));
+ }
+
+ /* 2. Tuning advertised window (window_clamp, rcv_ssthresh)
+@@ -461,7 +461,7 @@ static int __tcp_grow_window(const struct sock *sk, const struct sk_buff *skb,
+ struct tcp_sock *tp = tcp_sk(sk);
+ /* Optimize this! */
+ int truesize = tcp_win_from_space(sk, skbtruesize) >> 1;
+- int window = tcp_win_from_space(sk, sock_net(sk)->ipv4.sysctl_tcp_rmem[2]) >> 1;
++ int window = tcp_win_from_space(sk, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2])) >> 1;
+
+ while (tp->rcv_ssthresh <= window) {
+ if (truesize <= skb->len)
+@@ -534,7 +534,7 @@ static void tcp_grow_window(struct sock *sk, const struct sk_buff *skb,
+ */
+ static void tcp_init_buffer_space(struct sock *sk)
+ {
+- int tcp_app_win = sock_net(sk)->ipv4.sysctl_tcp_app_win;
++ int tcp_app_win = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_app_win);
+ struct tcp_sock *tp = tcp_sk(sk);
+ int maxwin;
+
+@@ -574,16 +574,17 @@ static void tcp_clamp_window(struct sock *sk)
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct inet_connection_sock *icsk = inet_csk(sk);
+ struct net *net = sock_net(sk);
++ int rmem2;
+
+ icsk->icsk_ack.quick = 0;
++ rmem2 = READ_ONCE(net->ipv4.sysctl_tcp_rmem[2]);
+
+- if (sk->sk_rcvbuf < net->ipv4.sysctl_tcp_rmem[2] &&
++ if (sk->sk_rcvbuf < rmem2 &&
+ !(sk->sk_userlocks & SOCK_RCVBUF_LOCK) &&
+ !tcp_under_memory_pressure(sk) &&
+ sk_memory_allocated(sk) < sk_prot_mem_limits(sk, 0)) {
+ WRITE_ONCE(sk->sk_rcvbuf,
+- min(atomic_read(&sk->sk_rmem_alloc),
+- net->ipv4.sysctl_tcp_rmem[2]));
++ min(atomic_read(&sk->sk_rmem_alloc), rmem2));
+ }
+ if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+ tp->rcv_ssthresh = min(tp->window_clamp, 2U * tp->advmss);
+@@ -724,7 +725,7 @@ void tcp_rcv_space_adjust(struct sock *sk)
+ * <prev RTT . ><current RTT .. ><next RTT .... >
+ */
+
+- if (sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf &&
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf) &&
+ !(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
+ int rcvmem, rcvbuf;
+ u64 rcvwin, grow;
+@@ -745,7 +746,7 @@ void tcp_rcv_space_adjust(struct sock *sk)
+
+ do_div(rcvwin, tp->advmss);
+ rcvbuf = min_t(u64, rcvwin * rcvmem,
+- sock_net(sk)->ipv4.sysctl_tcp_rmem[2]);
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
+ if (rcvbuf > sk->sk_rcvbuf) {
+ WRITE_ONCE(sk->sk_rcvbuf, rcvbuf);
+
+@@ -910,9 +911,9 @@ static void tcp_update_pacing_rate(struct sock *sk)
+ * end of slow start and should slow down.
+ */
+ if (tcp_snd_cwnd(tp) < tp->snd_ssthresh / 2)
+- rate *= sock_net(sk)->ipv4.sysctl_tcp_pacing_ss_ratio;
++ rate *= READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_pacing_ss_ratio);
+ else
+- rate *= sock_net(sk)->ipv4.sysctl_tcp_pacing_ca_ratio;
++ rate *= READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_pacing_ca_ratio);
+
+ rate *= max(tcp_snd_cwnd(tp), tp->packets_out);
+
+@@ -2175,7 +2176,7 @@ void tcp_enter_loss(struct sock *sk)
+ * loss recovery is underway except recurring timeout(s) on
+ * the same SND.UNA (sec 3.2). Disable F-RTO on path MTU probing
+ */
+- tp->frto = net->ipv4.sysctl_tcp_frto &&
++ tp->frto = READ_ONCE(net->ipv4.sysctl_tcp_frto) &&
+ (new_recovery || icsk->icsk_retransmits) &&
+ !inet_csk(sk)->icsk_mtup.probe_size;
+ }
+@@ -3058,7 +3059,7 @@ static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
+
+ static void tcp_update_rtt_min(struct sock *sk, u32 rtt_us, const int flag)
+ {
+- u32 wlen = sock_net(sk)->ipv4.sysctl_tcp_min_rtt_wlen * HZ;
++ u32 wlen = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_min_rtt_wlen) * HZ;
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ if ((flag & FLAG_ACK_MAYBE_DELAYED) && rtt_us > tcp_min_rtt(tp)) {
+@@ -3581,7 +3582,8 @@ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx,
+ if (*last_oow_ack_time) {
+ s32 elapsed = (s32)(tcp_jiffies32 - *last_oow_ack_time);
+
+- if (0 <= elapsed && elapsed < net->ipv4.sysctl_tcp_invalid_ratelimit) {
++ if (0 <= elapsed &&
++ elapsed < READ_ONCE(net->ipv4.sysctl_tcp_invalid_ratelimit)) {
+ NET_INC_STATS(net, mib_idx);
+ return true; /* rate-limited: don't send yet! */
+ }
+@@ -3629,7 +3631,7 @@ static void tcp_send_challenge_ack(struct sock *sk)
+ /* Then check host-wide RFC 5961 rate limit. */
+ now = jiffies / HZ;
+ if (now != challenge_timestamp) {
+- u32 ack_limit = net->ipv4.sysctl_tcp_challenge_ack_limit;
++ u32 ack_limit = READ_ONCE(net->ipv4.sysctl_tcp_challenge_ack_limit);
+ u32 half = (ack_limit + 1) >> 1;
+
+ challenge_timestamp = now;
+@@ -4426,7 +4428,7 @@ static void tcp_dsack_set(struct sock *sk, u32 seq, u32 end_seq)
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+
+- if (tcp_is_sack(tp) && sock_net(sk)->ipv4.sysctl_tcp_dsack) {
++ if (tcp_is_sack(tp) && READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_dsack)) {
+ int mib_idx;
+
+ if (before(seq, tp->rcv_nxt))
+@@ -4473,7 +4475,7 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
+ tcp_enter_quickack_mode(sk, TCP_MAX_QUICKACKS);
+
+- if (tcp_is_sack(tp) && sock_net(sk)->ipv4.sysctl_tcp_dsack) {
++ if (tcp_is_sack(tp) && READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_dsack)) {
+ u32 end_seq = TCP_SKB_CB(skb)->end_seq;
+
+ tcp_rcv_spurious_retrans(sk, skb);
+@@ -5523,7 +5525,7 @@ send_now:
+ }
+
+ if (!tcp_is_sack(tp) ||
+- tp->compressed_ack >= sock_net(sk)->ipv4.sysctl_tcp_comp_sack_nr)
++ tp->compressed_ack >= READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_comp_sack_nr))
+ goto send_now;
+
+ if (tp->compressed_ack_rcv_nxt != tp->rcv_nxt) {
+@@ -5544,11 +5546,12 @@ send_now:
+ if (tp->srtt_us && tp->srtt_us < rtt)
+ rtt = tp->srtt_us;
+
+- delay = min_t(unsigned long, sock_net(sk)->ipv4.sysctl_tcp_comp_sack_delay_ns,
++ delay = min_t(unsigned long,
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_comp_sack_delay_ns),
+ rtt * (NSEC_PER_USEC >> 3)/20);
+ sock_hold(sk);
+ hrtimer_start_range_ns(&tp->compressed_ack_timer, ns_to_ktime(delay),
+- sock_net(sk)->ipv4.sysctl_tcp_comp_sack_slack_ns,
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_comp_sack_slack_ns),
+ HRTIMER_MODE_REL_PINNED_SOFT);
+ }
+
+diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
+index a57f96b868741..1db9938163c43 100644
+--- a/net/ipv4/tcp_ipv4.c
++++ b/net/ipv4/tcp_ipv4.c
+@@ -1007,7 +1007,7 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst,
+ if (skb) {
+ __tcp_v4_send_check(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr);
+
+- tos = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
++ tos = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reflect_tos) ?
+ (tcp_rsk(req)->syn_tos & ~INET_ECN_MASK) |
+ (inet_sk(sk)->tos & INET_ECN_MASK) :
+ inet_sk(sk)->tos;
+@@ -1527,7 +1527,7 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
+ /* Set ToS of the new socket based upon the value of incoming SYN.
+ * ECT bits are set later in tcp_init_transfer().
+ */
+- if (sock_net(sk)->ipv4.sysctl_tcp_reflect_tos)
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reflect_tos))
+ newinet->tos = tcp_rsk(req)->syn_tos & ~INET_ECN_MASK;
+
+ if (!dst) {
+diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
+index a501150deaa3b..d58e672be31c7 100644
+--- a/net/ipv4/tcp_metrics.c
++++ b/net/ipv4/tcp_metrics.c
+@@ -329,7 +329,7 @@ void tcp_update_metrics(struct sock *sk)
+ int m;
+
+ sk_dst_confirm(sk);
+- if (net->ipv4.sysctl_tcp_nometrics_save || !dst)
++ if (READ_ONCE(net->ipv4.sysctl_tcp_nometrics_save) || !dst)
+ return;
+
+ rcu_read_lock();
+@@ -385,7 +385,7 @@ void tcp_update_metrics(struct sock *sk)
+
+ if (tcp_in_initial_slowstart(tp)) {
+ /* Slow start still did not finish. */
+- if (!net->ipv4.sysctl_tcp_no_ssthresh_metrics_save &&
++ if (!READ_ONCE(net->ipv4.sysctl_tcp_no_ssthresh_metrics_save) &&
+ !tcp_metric_locked(tm, TCP_METRIC_SSTHRESH)) {
+ val = tcp_metric_get(tm, TCP_METRIC_SSTHRESH);
+ if (val && (tcp_snd_cwnd(tp) >> 1) > val)
+@@ -401,7 +401,7 @@ void tcp_update_metrics(struct sock *sk)
+ } else if (!tcp_in_slow_start(tp) &&
+ icsk->icsk_ca_state == TCP_CA_Open) {
+ /* Cong. avoidance phase, cwnd is reliable. */
+- if (!net->ipv4.sysctl_tcp_no_ssthresh_metrics_save &&
++ if (!READ_ONCE(net->ipv4.sysctl_tcp_no_ssthresh_metrics_save) &&
+ !tcp_metric_locked(tm, TCP_METRIC_SSTHRESH))
+ tcp_metric_set(tm, TCP_METRIC_SSTHRESH,
+ max(tcp_snd_cwnd(tp) >> 1, tp->snd_ssthresh));
+@@ -418,7 +418,7 @@ void tcp_update_metrics(struct sock *sk)
+ tcp_metric_set(tm, TCP_METRIC_CWND,
+ (val + tp->snd_ssthresh) >> 1);
+ }
+- if (!net->ipv4.sysctl_tcp_no_ssthresh_metrics_save &&
++ if (!READ_ONCE(net->ipv4.sysctl_tcp_no_ssthresh_metrics_save) &&
+ !tcp_metric_locked(tm, TCP_METRIC_SSTHRESH)) {
+ val = tcp_metric_get(tm, TCP_METRIC_SSTHRESH);
+ if (val && tp->snd_ssthresh > val)
+@@ -463,7 +463,7 @@ void tcp_init_metrics(struct sock *sk)
+ if (tcp_metric_locked(tm, TCP_METRIC_CWND))
+ tp->snd_cwnd_clamp = tcp_metric_get(tm, TCP_METRIC_CWND);
+
+- val = net->ipv4.sysctl_tcp_no_ssthresh_metrics_save ?
++ val = READ_ONCE(net->ipv4.sysctl_tcp_no_ssthresh_metrics_save) ?
+ 0 : tcp_metric_get(tm, TCP_METRIC_SSTHRESH);
+ if (val) {
+ tp->snd_ssthresh = val;
+diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
+index 3554a4c1e1b82..a7f0a1f0c2a34 100644
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -167,16 +167,13 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
+ if (tcp_packets_in_flight(tp) == 0)
+ tcp_ca_event(sk, CA_EVENT_TX_START);
+
+- /* If this is the first data packet sent in response to the
+- * previous received data,
+- * and it is a reply for ato after last received packet,
+- * increase pingpong count.
+- */
+- if (before(tp->lsndtime, icsk->icsk_ack.lrcvtime) &&
+- (u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato)
+- inet_csk_inc_pingpong_cnt(sk);
+-
+ tp->lsndtime = now;
++
++ /* If it is a reply for ato after last received
++ * packet, enter pingpong mode.
++ */
++ if ((u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato)
++ inet_csk_enter_pingpong_mode(sk);
+ }
+
+ /* Account for an ACK we sent. */
+@@ -230,7 +227,7 @@ void tcp_select_initial_window(const struct sock *sk, int __space, __u32 mss,
+ * which we interpret as a sign the remote TCP is not
+ * misinterpreting the window field as a signed quantity.
+ */
+- if (sock_net(sk)->ipv4.sysctl_tcp_workaround_signed_windows)
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_workaround_signed_windows))
+ (*rcv_wnd) = min(space, MAX_TCP_WINDOW);
+ else
+ (*rcv_wnd) = min_t(u32, space, U16_MAX);
+@@ -241,7 +238,7 @@ void tcp_select_initial_window(const struct sock *sk, int __space, __u32 mss,
+ *rcv_wscale = 0;
+ if (wscale_ok) {
+ /* Set window scaling on max possible window */
+- space = max_t(u32, space, sock_net(sk)->ipv4.sysctl_tcp_rmem[2]);
++ space = max_t(u32, space, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
+ space = max_t(u32, space, sysctl_rmem_max);
+ space = min_t(u32, space, *window_clamp);
+ *rcv_wscale = clamp_t(int, ilog2(space) - 15,
+@@ -285,7 +282,7 @@ static u16 tcp_select_window(struct sock *sk)
+ * scaled window.
+ */
+ if (!tp->rx_opt.rcv_wscale &&
+- sock_net(sk)->ipv4.sysctl_tcp_workaround_signed_windows)
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_workaround_signed_windows))
+ new_win = min(new_win, MAX_TCP_WINDOW);
+ else
+ new_win = min(new_win, (65535U << tp->rx_opt.rcv_wscale));
+@@ -1974,7 +1971,7 @@ static u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now,
+
+ bytes = sk->sk_pacing_rate >> READ_ONCE(sk->sk_pacing_shift);
+
+- r = tcp_min_rtt(tcp_sk(sk)) >> sock_net(sk)->ipv4.sysctl_tcp_tso_rtt_log;
++ r = tcp_min_rtt(tcp_sk(sk)) >> READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_tso_rtt_log);
+ if (r < BITS_PER_TYPE(sk->sk_gso_max_size))
+ bytes += sk->sk_gso_max_size >> r;
+
+@@ -1993,7 +1990,7 @@ static u32 tcp_tso_segs(struct sock *sk, unsigned int mss_now)
+
+ min_tso = ca_ops->min_tso_segs ?
+ ca_ops->min_tso_segs(sk) :
+- sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs;
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs);
+
+ tso_segs = tcp_tso_autosize(sk, mss_now, min_tso);
+ return min_t(u32, tso_segs, sk->sk_gso_max_segs);
+@@ -2505,7 +2502,7 @@ static bool tcp_small_queue_check(struct sock *sk, const struct sk_buff *skb,
+ sk->sk_pacing_rate >> READ_ONCE(sk->sk_pacing_shift));
+ if (sk->sk_pacing_status == SK_PACING_NONE)
+ limit = min_t(unsigned long, limit,
+- sock_net(sk)->ipv4.sysctl_tcp_limit_output_bytes);
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_limit_output_bytes));
+ limit <<= factor;
+
+ if (static_branch_unlikely(&tcp_tx_delay_enabled) &&
+diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
+index 7f695c39d9a8c..87c699d57b366 100644
+--- a/net/ipv6/mcast.c
++++ b/net/ipv6/mcast.c
+@@ -1522,7 +1522,6 @@ static void mld_query_work(struct work_struct *work)
+
+ if (++cnt >= MLD_MAX_QUEUE) {
+ rework = true;
+- schedule_delayed_work(&idev->mc_query_work, 0);
+ break;
+ }
+ }
+@@ -1533,8 +1532,10 @@ static void mld_query_work(struct work_struct *work)
+ __mld_query_work(skb);
+ mutex_unlock(&idev->mc_lock);
+
+- if (!rework)
+- in6_dev_put(idev);
++ if (rework && queue_delayed_work(mld_wq, &idev->mc_query_work, 0))
++ return;
++
++ in6_dev_put(idev);
+ }
+
+ /* called with rcu_read_lock() */
+@@ -1624,7 +1625,6 @@ static void mld_report_work(struct work_struct *work)
+
+ if (++cnt >= MLD_MAX_QUEUE) {
+ rework = true;
+- schedule_delayed_work(&idev->mc_report_work, 0);
+ break;
+ }
+ }
+@@ -1635,8 +1635,10 @@ static void mld_report_work(struct work_struct *work)
+ __mld_report_work(skb);
+ mutex_unlock(&idev->mc_lock);
+
+- if (!rework)
+- in6_dev_put(idev);
++ if (rework && queue_delayed_work(mld_wq, &idev->mc_report_work, 0))
++ return;
++
++ in6_dev_put(idev);
+ }
+
+ static bool is_in(struct ifmcaddr6 *pmc, struct ip6_sf_list *psf, int type,
+diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
+index ecf3a553a0dc4..8c6c2d82c1cd6 100644
+--- a/net/ipv6/ping.c
++++ b/net/ipv6/ping.c
+@@ -22,6 +22,11 @@
+ #include <linux/proc_fs.h>
+ #include <net/ping.h>
+
++static void ping_v6_destroy(struct sock *sk)
++{
++ inet6_destroy_sock(sk);
++}
++
+ /* Compatibility glue so we can support IPv6 when it's compiled as a module */
+ static int dummy_ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len,
+ int *addr_len)
+@@ -181,6 +186,7 @@ struct proto pingv6_prot = {
+ .owner = THIS_MODULE,
+ .init = ping_init_sock,
+ .close = ping_close,
++ .destroy = ping_v6_destroy,
+ .connect = ip6_datagram_connect_v6_only,
+ .disconnect = __udp_disconnect,
+ .setsockopt = ipv6_setsockopt,
+diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
+index 5185c11dc4447..979e0d7b21195 100644
+--- a/net/ipv6/tcp_ipv6.c
++++ b/net/ipv6/tcp_ipv6.c
+@@ -546,7 +546,7 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst,
+ if (np->repflow && ireq->pktopts)
+ fl6->flowlabel = ip6_flowlabel(ipv6_hdr(ireq->pktopts));
+
+- tclass = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
++ tclass = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reflect_tos) ?
+ (tcp_rsk(req)->syn_tos & ~INET_ECN_MASK) |
+ (np->tclass & INET_ECN_MASK) :
+ np->tclass;
+@@ -1314,7 +1314,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
+ /* Set ToS of the new socket based upon the value of incoming SYN.
+ * ECT bits are set later in tcp_init_transfer().
+ */
+- if (sock_net(sk)->ipv4.sysctl_tcp_reflect_tos)
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_reflect_tos))
+ newnp->tclass = tcp_rsk(req)->syn_tos & ~INET_ECN_MASK;
+
+ /* Clone native IPv6 options from listening socket (if any)
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index b52fd250cb3a8..07b5a2044cab4 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -1882,7 +1882,7 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
+ if (msk->rcvq_space.copied <= msk->rcvq_space.space)
+ goto new_measure;
+
+- if (sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf &&
++ if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf) &&
+ !(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
+ int rcvmem, rcvbuf;
+ u64 rcvwin, grow;
+@@ -1900,7 +1900,7 @@ static void mptcp_rcv_space_adjust(struct mptcp_sock *msk, int copied)
+
+ do_div(rcvwin, advmss);
+ rcvbuf = min_t(u64, rcvwin * rcvmem,
+- sock_net(sk)->ipv4.sysctl_tcp_rmem[2]);
++ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
+
+ if (rcvbuf > sk->sk_rcvbuf) {
+ u32 window_clamp;
+@@ -2597,8 +2597,8 @@ static int mptcp_init_sock(struct sock *sk)
+ mptcp_ca_reset(sk);
+
+ sk_sockets_allocated_inc(sk);
+- sk->sk_rcvbuf = sock_net(sk)->ipv4.sysctl_tcp_rmem[1];
+- sk->sk_sndbuf = sock_net(sk)->ipv4.sysctl_tcp_wmem[1];
++ sk->sk_rcvbuf = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[1]);
++ sk->sk_sndbuf = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_wmem[1]);
+
+ return 0;
+ }
+diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
+index a364f8e5e698f..87a9009d5234d 100644
+--- a/net/netfilter/nfnetlink_queue.c
++++ b/net/netfilter/nfnetlink_queue.c
+@@ -843,11 +843,16 @@ nfqnl_enqueue_packet(struct nf_queue_entry *entry, unsigned int queuenum)
+ }
+
+ static int
+-nfqnl_mangle(void *data, int data_len, struct nf_queue_entry *e, int diff)
++nfqnl_mangle(void *data, unsigned int data_len, struct nf_queue_entry *e, int diff)
+ {
+ struct sk_buff *nskb;
+
+ if (diff < 0) {
++ unsigned int min_len = skb_transport_offset(e->skb);
++
++ if (data_len < min_len)
++ return -EINVAL;
++
+ if (pskb_trim(e->skb, data_len))
+ return -ENOMEM;
+ } else if (diff > 0) {
+diff --git a/net/sctp/associola.c b/net/sctp/associola.c
+index be29da09cc7ab..3460abceba443 100644
+--- a/net/sctp/associola.c
++++ b/net/sctp/associola.c
+@@ -229,9 +229,8 @@ static struct sctp_association *sctp_association_init(
+ if (!sctp_ulpq_init(&asoc->ulpq, asoc))
+ goto fail_init;
+
+- if (sctp_stream_init(&asoc->stream, asoc->c.sinit_num_ostreams,
+- 0, gfp))
+- goto fail_init;
++ if (sctp_stream_init(&asoc->stream, asoc->c.sinit_num_ostreams, 0, gfp))
++ goto stream_free;
+
+ /* Initialize default path MTU. */
+ asoc->pathmtu = sp->pathmtu;
+diff --git a/net/sctp/stream.c b/net/sctp/stream.c
+index 6dc95dcc0ff4f..ef9fceadef8d5 100644
+--- a/net/sctp/stream.c
++++ b/net/sctp/stream.c
+@@ -137,7 +137,7 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
+
+ ret = sctp_stream_alloc_out(stream, outcnt, gfp);
+ if (ret)
+- goto out_err;
++ return ret;
+
+ for (i = 0; i < stream->outcnt; i++)
+ SCTP_SO(stream, i)->state = SCTP_STREAM_OPEN;
+@@ -145,22 +145,9 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
+ handle_in:
+ sctp_stream_interleave_init(stream);
+ if (!incnt)
+- goto out;
+-
+- ret = sctp_stream_alloc_in(stream, incnt, gfp);
+- if (ret)
+- goto in_err;
+-
+- goto out;
++ return 0;
+
+-in_err:
+- sched->free(stream);
+- genradix_free(&stream->in);
+-out_err:
+- genradix_free(&stream->out);
+- stream->outcnt = 0;
+-out:
+- return ret;
++ return sctp_stream_alloc_in(stream, incnt, gfp);
+ }
+
+ int sctp_stream_init_ext(struct sctp_stream *stream, __u16 sid)
+diff --git a/net/sctp/stream_sched.c b/net/sctp/stream_sched.c
+index 99e5f69fbb742..a2e1d34f52c5b 100644
+--- a/net/sctp/stream_sched.c
++++ b/net/sctp/stream_sched.c
+@@ -163,7 +163,7 @@ int sctp_sched_set_sched(struct sctp_association *asoc,
+ if (!SCTP_SO(&asoc->stream, i)->ext)
+ continue;
+
+- ret = n->init_sid(&asoc->stream, i, GFP_KERNEL);
++ ret = n->init_sid(&asoc->stream, i, GFP_ATOMIC);
+ if (ret)
+ goto err;
+ }
+diff --git a/net/tipc/socket.c b/net/tipc/socket.c
+index 43509c7e90fc2..f1c3b8eb4b3d3 100644
+--- a/net/tipc/socket.c
++++ b/net/tipc/socket.c
+@@ -517,7 +517,7 @@ static int tipc_sk_create(struct net *net, struct socket *sock,
+ timer_setup(&sk->sk_timer, tipc_sk_timeout, 0);
+ sk->sk_shutdown = 0;
+ sk->sk_backlog_rcv = tipc_sk_backlog_rcv;
+- sk->sk_rcvbuf = sysctl_tipc_rmem[1];
++ sk->sk_rcvbuf = READ_ONCE(sysctl_tipc_rmem[1]);
+ sk->sk_data_ready = tipc_data_ready;
+ sk->sk_write_space = tipc_write_space;
+ sk->sk_destruct = tipc_sock_destruct;
+diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
+index 9c3933781ad47..b1f3d336cdaec 100644
+--- a/net/tls/tls_device.c
++++ b/net/tls/tls_device.c
+@@ -1351,8 +1351,13 @@ static int tls_device_down(struct net_device *netdev)
+ * by tls_device_free_ctx. rx_conf and tx_conf stay in TLS_HW.
+ * Now release the ref taken above.
+ */
+- if (refcount_dec_and_test(&ctx->refcount))
++ if (refcount_dec_and_test(&ctx->refcount)) {
++ /* sk_destruct ran after tls_device_down took a ref, and
++ * it returned early. Complete the destruction here.
++ */
++ list_del(&ctx->list);
+ tls_device_free_ctx(ctx);
++ }
+ }
+
+ up_write(&device_offload_lock);
+diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
+index ecd377938eea8..ef6ced5c5746a 100644
+--- a/tools/perf/util/symbol-elf.c
++++ b/tools/perf/util/symbol-elf.c
+@@ -233,6 +233,33 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
+ return NULL;
+ }
+
++static int elf_read_program_header(Elf *elf, u64 vaddr, GElf_Phdr *phdr)
++{
++ size_t i, phdrnum;
++ u64 sz;
++
++ if (elf_getphdrnum(elf, &phdrnum))
++ return -1;
++
++ for (i = 0; i < phdrnum; i++) {
++ if (gelf_getphdr(elf, i, phdr) == NULL)
++ return -1;
++
++ if (phdr->p_type != PT_LOAD)
++ continue;
++
++ sz = max(phdr->p_memsz, phdr->p_filesz);
++ if (!sz)
++ continue;
++
++ if (vaddr >= phdr->p_vaddr && (vaddr < phdr->p_vaddr + sz))
++ return 0;
++ }
++
++ /* Not found any valid program header */
++ return -1;
++}
++
+ static bool want_demangle(bool is_kernel_sym)
+ {
+ return is_kernel_sym ? symbol_conf.demangle_kernel : symbol_conf.demangle;
+@@ -1209,6 +1236,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
+ sym.st_value);
+ used_opd = true;
+ }
++
+ /*
+ * When loading symbols in a data mapping, ABS symbols (which
+ * has a value of SHN_ABS in its st_shndx) failed at
+@@ -1262,11 +1290,20 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
+ goto out_elf_end;
+ } else if ((used_opd && runtime_ss->adjust_symbols) ||
+ (!used_opd && syms_ss->adjust_symbols)) {
++ GElf_Phdr phdr;
++
++ if (elf_read_program_header(syms_ss->elf,
++ (u64)sym.st_value, &phdr)) {
++ pr_warning("%s: failed to find program header for "
++ "symbol: %s st_value: %#" PRIx64 "\n",
++ __func__, elf_name, (u64)sym.st_value);
++ continue;
++ }
+ pr_debug4("%s: adjusting symbol: st_value: %#" PRIx64 " "
+- "sh_addr: %#" PRIx64 " sh_offset: %#" PRIx64 "\n", __func__,
+- (u64)sym.st_value, (u64)shdr.sh_addr,
+- (u64)shdr.sh_offset);
+- sym.st_value -= shdr.sh_addr - shdr.sh_offset;
++ "p_vaddr: %#" PRIx64 " p_offset: %#" PRIx64 "\n",
++ __func__, (u64)sym.st_value, (u64)phdr.p_vaddr,
++ (u64)phdr.p_offset);
++ sym.st_value -= phdr.p_vaddr - phdr.p_offset;
+ }
+
+ demangled = demangle_sym(dso, kmodule, elf_name);
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-08-11 12:32 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-08-11 12:32 UTC (permalink / raw
To: gentoo-commits
commit: 7aaaf24efa9606d739c15aa28d8118b093cc315e
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Aug 11 12:32:45 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Aug 11 12:32:45 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=7aaaf24e
Linux patch 5.18.17
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1016_linux-5.18.17.patch | 1418 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 1422 insertions(+)
diff --git a/0000_README b/0000_README
index efa0b25e..e0f23579 100644
--- a/0000_README
+++ b/0000_README
@@ -107,6 +107,10 @@ Patch: 1015_linux-5.18.16.patch
From: http://www.kernel.org
Desc: Linux 5.18.16
+Patch: 1016_linux-5.18.17.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.17
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1016_linux-5.18.17.patch b/1016_linux-5.18.17.patch
new file mode 100644
index 00000000..94fc8829
--- /dev/null
+++ b/1016_linux-5.18.17.patch
@@ -0,0 +1,1418 @@
+diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
+index 9e9556826450b..2ce2a38cdd556 100644
+--- a/Documentation/admin-guide/hw-vuln/spectre.rst
++++ b/Documentation/admin-guide/hw-vuln/spectre.rst
+@@ -422,6 +422,14 @@ The possible values in this file are:
+ 'RSB filling' Protection of RSB on context switch enabled
+ ============= ===========================================
+
++ - EIBRS Post-barrier Return Stack Buffer (PBRSB) protection status:
++
++ =========================== =======================================================
++ 'PBRSB-eIBRS: SW sequence' CPU is affected and protection of RSB on VMEXIT enabled
++ 'PBRSB-eIBRS: Vulnerable' CPU is vulnerable
++ 'PBRSB-eIBRS: Not affected' CPU is not affected by PBRSB
++ =========================== =======================================================
++
+ Full mitigation might require a microcode update from the CPU
+ vendor. When the necessary microcode is not available, the kernel will
+ report vulnerability.
+diff --git a/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml b/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml
+index 5aac094fd2172..58ecafc1b7f90 100644
+--- a/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml
++++ b/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml
+@@ -23,6 +23,7 @@ properties:
+ - brcm,bcm4345c5
+ - brcm,bcm43540-bt
+ - brcm,bcm4335a0
++ - brcm,bcm4349-bt
+
+ shutdown-gpios:
+ maxItems: 1
+diff --git a/Makefile b/Makefile
+index 18bcbcd037f0a..ef8c18e5c161c 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 16
++SUBLEVEL = 17
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c
+index 9c3d86e397bf3..1fae18ba11ed1 100644
+--- a/arch/arm64/crypto/poly1305-glue.c
++++ b/arch/arm64/crypto/poly1305-glue.c
+@@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src,
+ {
+ if (unlikely(!dctx->sset)) {
+ if (!dctx->rset) {
+- poly1305_init_arch(dctx, src);
++ poly1305_init_arm64(&dctx->h, src);
+ src += POLY1305_BLOCK_SIZE;
+ len -= POLY1305_BLOCK_SIZE;
+ dctx->rset = 1;
+diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
+index 96dc0f7da258d..a971d462f531c 100644
+--- a/arch/arm64/include/asm/kernel-pgtable.h
++++ b/arch/arm64/include/asm/kernel-pgtable.h
+@@ -103,8 +103,8 @@
+ /*
+ * Initial memory map attributes.
+ */
+-#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
+-#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
++#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED | PTE_UXN)
++#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S | PMD_SECT_UXN)
+
+ #if ARM64_KERNEL_USES_PMD_MAPS
+ #define SWAPPER_MM_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
+diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
+index 6a98f1a38c29a..8a93a0a7489b2 100644
+--- a/arch/arm64/kernel/head.S
++++ b/arch/arm64/kernel/head.S
+@@ -285,7 +285,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
+ subs x1, x1, #64
+ b.ne 1b
+
+- mov x7, SWAPPER_MM_MMUFLAGS
++ mov_q x7, SWAPPER_MM_MMUFLAGS
+
+ /*
+ * Create the identity mapping.
+diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
+index 4d1d87f76a74f..ce1f5a876cfea 100644
+--- a/arch/x86/Kconfig
++++ b/arch/x86/Kconfig
+@@ -2469,7 +2469,7 @@ config RETPOLINE
+ config RETHUNK
+ bool "Enable return-thunks"
+ depends on RETPOLINE && CC_HAS_RETURN_THUNK
+- default y
++ default y if X86_64
+ help
+ Compile the kernel with the return-thunks compiler option to guard
+ against kernel-to-user data leaks by avoiding return speculation.
+@@ -2478,21 +2478,21 @@ config RETHUNK
+
+ config CPU_UNRET_ENTRY
+ bool "Enable UNRET on kernel entry"
+- depends on CPU_SUP_AMD && RETHUNK
++ depends on CPU_SUP_AMD && RETHUNK && X86_64
+ default y
+ help
+ Compile the kernel with support for the retbleed=unret mitigation.
+
+ config CPU_IBPB_ENTRY
+ bool "Enable IBPB on kernel entry"
+- depends on CPU_SUP_AMD
++ depends on CPU_SUP_AMD && X86_64
+ default y
+ help
+ Compile the kernel with support for the retbleed=ibpb mitigation.
+
+ config CPU_IBRS_ENTRY
+ bool "Enable IBRS on kernel entry"
+- depends on CPU_SUP_INTEL
++ depends on CPU_SUP_INTEL && X86_64
+ default y
+ help
+ Compile the kernel with support for the spectre_v2=ibrs mitigation.
+diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
+index 49889f171e860..e82da174d28c3 100644
+--- a/arch/x86/include/asm/cpufeatures.h
++++ b/arch/x86/include/asm/cpufeatures.h
+@@ -302,6 +302,7 @@
+ #define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */
+ #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
+ #define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
++#define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
+
+ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
+ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
+@@ -453,5 +454,6 @@
+ #define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
+ #define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */
+ #define X86_BUG_RETBLEED X86_BUG(26) /* CPU is affected by RETBleed */
++#define X86_BUG_EIBRS_PBRSB X86_BUG(27) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
+
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
+index 4ff36610af6ab..9fdaa847d4b66 100644
+--- a/arch/x86/include/asm/kvm_host.h
++++ b/arch/x86/include/asm/kvm_host.h
+@@ -651,6 +651,7 @@ struct kvm_vcpu_arch {
+ u64 ia32_misc_enable_msr;
+ u64 smbase;
+ u64 smi_count;
++ bool at_instruction_boundary;
+ bool tpr_access_reporting;
+ bool xsaves_enabled;
+ bool xfd_no_write_intercept;
+@@ -1289,6 +1290,8 @@ struct kvm_vcpu_stat {
+ u64 nested_run;
+ u64 directed_yield_attempted;
+ u64 directed_yield_successful;
++ u64 preemption_reported;
++ u64 preemption_other;
+ u64 guest_mode;
+ };
+
+diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
+index ad084326f24c2..f951147cc7fdc 100644
+--- a/arch/x86/include/asm/msr-index.h
++++ b/arch/x86/include/asm/msr-index.h
+@@ -148,6 +148,10 @@
+ * are restricted to targets in
+ * kernel.
+ */
++#define ARCH_CAP_PBRSB_NO BIT(24) /*
++ * Not susceptible to Post-Barrier
++ * Return Stack Buffer Predictions.
++ */
+
+ #define MSR_IA32_FLUSH_CMD 0x0000010b
+ #define L1D_FLUSH BIT(0) /*
+diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
+index 38a3e86e665ef..d3a3cc6772ee1 100644
+--- a/arch/x86/include/asm/nospec-branch.h
++++ b/arch/x86/include/asm/nospec-branch.h
+@@ -60,7 +60,9 @@
+ 774: \
+ add $(BITS_PER_LONG/8) * 2, sp; \
+ dec reg; \
+- jnz 771b;
++ jnz 771b; \
++ /* barrier for jnz misprediction */ \
++ lfence;
+
+ #ifdef __ASSEMBLY__
+
+@@ -118,13 +120,28 @@
+ #endif
+ .endm
+
++.macro ISSUE_UNBALANCED_RET_GUARD
++ ANNOTATE_INTRA_FUNCTION_CALL
++ call .Lunbalanced_ret_guard_\@
++ int3
++.Lunbalanced_ret_guard_\@:
++ add $(BITS_PER_LONG/8), %_ASM_SP
++ lfence
++.endm
++
+ /*
+ * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
+ * monstrosity above, manually.
+ */
+-.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
++.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2
++.ifb \ftr2
+ ALTERNATIVE "jmp .Lskip_rsb_\@", "", \ftr
++.else
++ ALTERNATIVE_2 "jmp .Lskip_rsb_\@", "", \ftr, "jmp .Lunbalanced_\@", \ftr2
++.endif
+ __FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)
++.Lunbalanced_\@:
++ ISSUE_UNBALANCED_RET_GUARD
+ .Lskip_rsb_\@:
+ .endm
+
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index fd986a8ba2bd7..fa625b2a8a939 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -1328,6 +1328,53 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
+ }
+ }
+
++static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
++{
++ /*
++ * Similar to context switches, there are two types of RSB attacks
++ * after VM exit:
++ *
++ * 1) RSB underflow
++ *
++ * 2) Poisoned RSB entry
++ *
++ * When retpoline is enabled, both are mitigated by filling/clearing
++ * the RSB.
++ *
++ * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
++ * prediction isolation protections, RSB still needs to be cleared
++ * because of #2. Note that SMEP provides no protection here, unlike
++ * user-space-poisoned RSB entries.
++ *
++ * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
++ * bug is present then a LITE version of RSB protection is required,
++ * just a single call needs to retire before a RET is executed.
++ */
++ switch (mode) {
++ case SPECTRE_V2_NONE:
++ return;
++
++ case SPECTRE_V2_EIBRS_LFENCE:
++ case SPECTRE_V2_EIBRS:
++ if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
++ setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
++ pr_info("Spectre v2 / PBRSB-eIBRS: Retire a single CALL on VMEXIT\n");
++ }
++ return;
++
++ case SPECTRE_V2_EIBRS_RETPOLINE:
++ case SPECTRE_V2_RETPOLINE:
++ case SPECTRE_V2_LFENCE:
++ case SPECTRE_V2_IBRS:
++ setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT);
++ pr_info("Spectre v2 / SpectreRSB : Filling RSB on VMEXIT\n");
++ return;
++ }
++
++ pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation at VM exit");
++ dump_stack();
++}
++
+ static void __init spectre_v2_select_mitigation(void)
+ {
+ enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
+@@ -1478,28 +1525,7 @@ static void __init spectre_v2_select_mitigation(void)
+ setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+ pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
+
+- /*
+- * Similar to context switches, there are two types of RSB attacks
+- * after vmexit:
+- *
+- * 1) RSB underflow
+- *
+- * 2) Poisoned RSB entry
+- *
+- * When retpoline is enabled, both are mitigated by filling/clearing
+- * the RSB.
+- *
+- * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
+- * prediction isolation protections, RSB still needs to be cleared
+- * because of #2. Note that SMEP provides no protection here, unlike
+- * user-space-poisoned RSB entries.
+- *
+- * eIBRS, on the other hand, has RSB-poisoning protections, so it
+- * doesn't need RSB clearing after vmexit.
+- */
+- if (boot_cpu_has(X86_FEATURE_RETPOLINE) ||
+- boot_cpu_has(X86_FEATURE_KERNEL_IBRS))
+- setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT);
++ spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
+
+ /*
+ * Retpoline protects the kernel, but doesn't protect firmware. IBRS
+@@ -2285,6 +2311,19 @@ static char *ibpb_state(void)
+ return "";
+ }
+
++static char *pbrsb_eibrs_state(void)
++{
++ if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
++ if (boot_cpu_has(X86_FEATURE_RSB_VMEXIT_LITE) ||
++ boot_cpu_has(X86_FEATURE_RSB_VMEXIT))
++ return ", PBRSB-eIBRS: SW sequence";
++ else
++ return ", PBRSB-eIBRS: Vulnerable";
++ } else {
++ return ", PBRSB-eIBRS: Not affected";
++ }
++}
++
+ static ssize_t spectre_v2_show_state(char *buf)
+ {
+ if (spectre_v2_enabled == SPECTRE_V2_LFENCE)
+@@ -2297,12 +2336,13 @@ static ssize_t spectre_v2_show_state(char *buf)
+ spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
+ return sprintf(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
+
+- return sprintf(buf, "%s%s%s%s%s%s\n",
++ return sprintf(buf, "%s%s%s%s%s%s%s\n",
+ spectre_v2_strings[spectre_v2_enabled],
+ ibpb_state(),
+ boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
+ stibp_state(),
+ boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
++ pbrsb_eibrs_state(),
+ spectre_v2_module_string());
+ }
+
+diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
+index 1f43ddf2ffc36..d47e20e305cd2 100644
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -1161,6 +1161,7 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
+ #define NO_SWAPGS BIT(6)
+ #define NO_ITLB_MULTIHIT BIT(7)
+ #define NO_SPECTRE_V2 BIT(8)
++#define NO_EIBRS_PBRSB BIT(9)
+
+ #define VULNWL(vendor, family, model, whitelist) \
+ X86_MATCH_VENDOR_FAM_MODEL(vendor, family, model, whitelist)
+@@ -1203,7 +1204,7 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_GOLDMONT_D, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+- VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
++ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT | NO_EIBRS_PBRSB),
+
+ /*
+ * Technically, swapgs isn't serializing on AMD (despite it previously
+@@ -1213,7 +1214,9 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+ * good enough for our purposes.
+ */
+
+- VULNWL_INTEL(ATOM_TREMONT_D, NO_ITLB_MULTIHIT),
++ VULNWL_INTEL(ATOM_TREMONT, NO_EIBRS_PBRSB),
++ VULNWL_INTEL(ATOM_TREMONT_L, NO_EIBRS_PBRSB),
++ VULNWL_INTEL(ATOM_TREMONT_D, NO_ITLB_MULTIHIT | NO_EIBRS_PBRSB),
+
+ /* AMD Family 0xf - 0x12 */
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+@@ -1391,6 +1394,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+ setup_force_cpu_bug(X86_BUG_RETBLEED);
+ }
+
++ if (cpu_has(c, X86_FEATURE_IBRS_ENHANCED) &&
++ !cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
++ !(ia32_cap & ARCH_CAP_PBRSB_NO))
++ setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
++
+ if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
+ return;
+
+diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
+index 6d3b3e5a5533b..ee4802d7b36cd 100644
+--- a/arch/x86/kvm/mmu/tdp_iter.c
++++ b/arch/x86/kvm/mmu/tdp_iter.c
+@@ -145,6 +145,15 @@ static bool try_step_up(struct tdp_iter *iter)
+ return true;
+ }
+
++/*
++ * Step the iterator back up a level in the paging structure. Should only be
++ * used when the iterator is below the root level.
++ */
++void tdp_iter_step_up(struct tdp_iter *iter)
++{
++ WARN_ON(!try_step_up(iter));
++}
++
+ /*
+ * Step to the next SPTE in a pre-order traversal of the paging structure.
+ * To get to the next SPTE, the iterator either steps down towards the goal
+diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
+index f0af385c56e03..adfca0cf94d3a 100644
+--- a/arch/x86/kvm/mmu/tdp_iter.h
++++ b/arch/x86/kvm/mmu/tdp_iter.h
+@@ -114,5 +114,6 @@ void tdp_iter_start(struct tdp_iter *iter, struct kvm_mmu_page *root,
+ int min_level, gfn_t next_last_level_gfn);
+ void tdp_iter_next(struct tdp_iter *iter);
+ void tdp_iter_restart(struct tdp_iter *iter);
++void tdp_iter_step_up(struct tdp_iter *iter);
+
+ #endif /* __KVM_X86_MMU_TDP_ITER_H */
+diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
+index 922b06bf4b948..b61a11d462ccb 100644
+--- a/arch/x86/kvm/mmu/tdp_mmu.c
++++ b/arch/x86/kvm/mmu/tdp_mmu.c
+@@ -1748,12 +1748,12 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
+ gfn_t start = slot->base_gfn;
+ gfn_t end = start + slot->npages;
+ struct tdp_iter iter;
++ int max_mapping_level;
+ kvm_pfn_t pfn;
+
+ rcu_read_lock();
+
+ tdp_root_for_each_pte(iter, root, start, end) {
+-retry:
+ if (tdp_mmu_iter_cond_resched(kvm, &iter, false, true))
+ continue;
+
+@@ -1761,15 +1761,41 @@ retry:
+ !is_last_spte(iter.old_spte, iter.level))
+ continue;
+
++ /*
++ * This is a leaf SPTE. Check if the PFN it maps can
++ * be mapped at a higher level.
++ */
+ pfn = spte_to_pfn(iter.old_spte);
+- if (kvm_is_reserved_pfn(pfn) ||
+- iter.level >= kvm_mmu_max_mapping_level(kvm, slot, iter.gfn,
+- pfn, PG_LEVEL_NUM))
++
++ if (kvm_is_reserved_pfn(pfn))
+ continue;
+
++ max_mapping_level = kvm_mmu_max_mapping_level(kvm, slot,
++ iter.gfn, pfn, PG_LEVEL_NUM);
++
++ WARN_ON(max_mapping_level < iter.level);
++
++ /*
++ * If this page is already mapped at the highest
++ * viable level, there's nothing more to do.
++ */
++ if (max_mapping_level == iter.level)
++ continue;
++
++ /*
++ * The page can be remapped at a higher level, so step
++ * up to zap the parent SPTE.
++ */
++ while (max_mapping_level > iter.level)
++ tdp_iter_step_up(&iter);
++
+ /* Note, a successful atomic zap also does a remote TLB flush. */
+- if (tdp_mmu_zap_spte_atomic(kvm, &iter))
+- goto retry;
++ tdp_mmu_zap_spte_atomic(kvm, &iter);
++
++ /*
++ * If the atomic zap fails, the iter will recurse back into
++ * the same subtree to retry.
++ */
+ }
+
+ rcu_read_unlock();
+diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
+index 76e9e6eb71d63..7aa1ce34a5204 100644
+--- a/arch/x86/kvm/svm/sev.c
++++ b/arch/x86/kvm/svm/sev.c
+@@ -844,7 +844,7 @@ static int __sev_dbg_encrypt_user(struct kvm *kvm, unsigned long paddr,
+
+ /* If source buffer is not aligned then use an intermediate buffer */
+ if (!IS_ALIGNED((unsigned long)vaddr, 16)) {
+- src_tpage = alloc_page(GFP_KERNEL);
++ src_tpage = alloc_page(GFP_KERNEL_ACCOUNT);
+ if (!src_tpage)
+ return -ENOMEM;
+
+@@ -865,7 +865,7 @@ static int __sev_dbg_encrypt_user(struct kvm *kvm, unsigned long paddr,
+ if (!IS_ALIGNED((unsigned long)dst_vaddr, 16) || !IS_ALIGNED(size, 16)) {
+ int dst_offset;
+
+- dst_tpage = alloc_page(GFP_KERNEL);
++ dst_tpage = alloc_page(GFP_KERNEL_ACCOUNT);
+ if (!dst_tpage) {
+ ret = -ENOMEM;
+ goto e_free;
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index 6bfb0b0e66bd3..c667214c630b1 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -4166,6 +4166,8 @@ out:
+
+ static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu)
+ {
++ if (to_svm(vcpu)->vmcb->control.exit_code == SVM_EXIT_INTR)
++ vcpu->arch.at_instruction_boundary = true;
+ }
+
+ static void svm_sched_in(struct kvm_vcpu *vcpu, int cpu)
+diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
+index 4182c7ffc9091..6de96b9438044 100644
+--- a/arch/x86/kvm/vmx/vmenter.S
++++ b/arch/x86/kvm/vmx/vmenter.S
+@@ -227,11 +227,13 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
+ * entries and (in some cases) RSB underflow.
+ *
+ * eIBRS has its own protection against poisoned RSB, so it doesn't
+- * need the RSB filling sequence. But it does need to be enabled
+- * before the first unbalanced RET.
++ * need the RSB filling sequence. But it does need to be enabled, and a
++ * single call to retire, before the first unbalanced RET.
+ */
+
+- FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
++ FILL_RETURN_BUFFER %_ASM_CX, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT,\
++ X86_FEATURE_RSB_VMEXIT_LITE
++
+
+ pop %_ASM_ARG2 /* @flags */
+ pop %_ASM_ARG1 /* @vmx */
+diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
+index 4b6a0268c78e3..597c3c08da501 100644
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -6630,6 +6630,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
+ return;
+
+ handle_interrupt_nmi_irqoff(vcpu, gate_offset(desc));
++ vcpu->arch.at_instruction_boundary = true;
+ }
+
+ static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 53b6fdf30c99b..65b0ec28bd52b 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -291,6 +291,8 @@ const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
+ STATS_DESC_COUNTER(VCPU, nested_run),
+ STATS_DESC_COUNTER(VCPU, directed_yield_attempted),
+ STATS_DESC_COUNTER(VCPU, directed_yield_successful),
++ STATS_DESC_COUNTER(VCPU, preemption_reported),
++ STATS_DESC_COUNTER(VCPU, preemption_other),
+ STATS_DESC_ICOUNTER(VCPU, guest_mode)
+ };
+
+@@ -4607,6 +4609,19 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
+ struct kvm_memslots *slots;
+ static const u8 preempted = KVM_VCPU_PREEMPTED;
+
++ /*
++ * The vCPU can be marked preempted if and only if the VM-Exit was on
++ * an instruction boundary and will not trigger guest emulation of any
++ * kind (see vcpu_run). Vendor specific code controls (conservatively)
++ * when this is true, for example allowing the vCPU to be marked
++ * preempted if and only if the VM-Exit was due to a host interrupt.
++ */
++ if (!vcpu->arch.at_instruction_boundary) {
++ vcpu->stat.preemption_other++;
++ return;
++ }
++
++ vcpu->stat.preemption_reported++;
+ if (!(vcpu->arch.st.msr_val & KVM_MSR_ENABLED))
+ return;
+
+@@ -4636,19 +4651,21 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
+ {
+ int idx;
+
+- if (vcpu->preempted && !vcpu->arch.guest_state_protected)
+- vcpu->arch.preempted_in_kernel = !static_call(kvm_x86_get_cpl)(vcpu);
++ if (vcpu->preempted) {
++ if (!vcpu->arch.guest_state_protected)
++ vcpu->arch.preempted_in_kernel = !static_call(kvm_x86_get_cpl)(vcpu);
+
+- /*
+- * Take the srcu lock as memslots will be accessed to check the gfn
+- * cache generation against the memslots generation.
+- */
+- idx = srcu_read_lock(&vcpu->kvm->srcu);
+- if (kvm_xen_msr_enabled(vcpu->kvm))
+- kvm_xen_runstate_set_preempted(vcpu);
+- else
+- kvm_steal_time_set_preempted(vcpu);
+- srcu_read_unlock(&vcpu->kvm->srcu, idx);
++ /*
++ * Take the srcu lock as memslots will be accessed to check the gfn
++ * cache generation against the memslots generation.
++ */
++ idx = srcu_read_lock(&vcpu->kvm->srcu);
++ if (kvm_xen_msr_enabled(vcpu->kvm))
++ kvm_xen_runstate_set_preempted(vcpu);
++ else
++ kvm_steal_time_set_preempted(vcpu);
++ srcu_read_unlock(&vcpu->kvm->srcu, idx);
++ }
+
+ static_call(kvm_x86_vcpu_put)(vcpu);
+ vcpu->arch.last_host_tsc = rdtsc();
+@@ -9767,6 +9784,7 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
+ return;
+
+ down_read(&vcpu->kvm->arch.apicv_update_lock);
++ preempt_disable();
+
+ activate = kvm_apicv_activated(vcpu->kvm);
+ if (vcpu->arch.apicv_active == activate)
+@@ -9786,6 +9804,7 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
+ kvm_make_request(KVM_REQ_EVENT, vcpu);
+
+ out:
++ preempt_enable();
+ up_read(&vcpu->kvm->arch.apicv_update_lock);
+ }
+ EXPORT_SYMBOL_GPL(kvm_vcpu_update_apicv);
+@@ -10363,6 +10382,13 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
+ vcpu->arch.l1tf_flush_l1d = true;
+
+ for (;;) {
++ /*
++ * If another guest vCPU requests a PV TLB flush in the middle
++ * of instruction emulation, the rest of the emulation could
++ * use a stale page translation. Assume that any code after
++ * this point can start executing an instruction.
++ */
++ vcpu->arch.at_instruction_boundary = false;
+ if (kvm_vcpu_running(vcpu)) {
+ r = vcpu_enter_guest(vcpu);
+ } else {
+diff --git a/arch/x86/kvm/xen.h b/arch/x86/kvm/xen.h
+index adbcc9ed59dbc..fda1413f8af95 100644
+--- a/arch/x86/kvm/xen.h
++++ b/arch/x86/kvm/xen.h
+@@ -103,8 +103,10 @@ static inline void kvm_xen_runstate_set_preempted(struct kvm_vcpu *vcpu)
+ * behalf of the vCPU. Only if the VMM does actually block
+ * does it need to enter RUNSTATE_blocked.
+ */
+- if (vcpu->preempted)
+- kvm_xen_update_runstate_guest(vcpu, RUNSTATE_runnable);
++ if (WARN_ON_ONCE(!vcpu->preempted))
++ return;
++
++ kvm_xen_update_runstate_guest(vcpu, RUNSTATE_runnable);
+ }
+
+ /* 32-bit compatibility definitions, also used natively in 32-bit build */
+diff --git a/block/blk-ioc.c b/block/blk-ioc.c
+index df9cfe4ca5328..63fc020424082 100644
+--- a/block/blk-ioc.c
++++ b/block/blk-ioc.c
+@@ -247,6 +247,8 @@ static struct io_context *alloc_io_context(gfp_t gfp_flags, int node)
+ INIT_HLIST_HEAD(&ioc->icq_list);
+ INIT_WORK(&ioc->release_work, ioc_release_fn);
+ #endif
++ ioc->ioprio = IOPRIO_DEFAULT;
++
+ return ioc;
+ }
+
+diff --git a/block/ioprio.c b/block/ioprio.c
+index 2fe068fcaad58..2a34cbca18aed 100644
+--- a/block/ioprio.c
++++ b/block/ioprio.c
+@@ -157,9 +157,9 @@ out:
+ int ioprio_best(unsigned short aprio, unsigned short bprio)
+ {
+ if (!ioprio_valid(aprio))
+- aprio = IOPRIO_DEFAULT;
++ aprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM);
+ if (!ioprio_valid(bprio))
+- bprio = IOPRIO_DEFAULT;
++ bprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM);
+
+ return min(aprio, bprio);
+ }
+diff --git a/drivers/acpi/apei/bert.c b/drivers/acpi/apei/bert.c
+index 598fd19b65fa4..45973aa6e06d4 100644
+--- a/drivers/acpi/apei/bert.c
++++ b/drivers/acpi/apei/bert.c
+@@ -29,16 +29,26 @@
+
+ #undef pr_fmt
+ #define pr_fmt(fmt) "BERT: " fmt
++
++#define ACPI_BERT_PRINT_MAX_RECORDS 5
+ #define ACPI_BERT_PRINT_MAX_LEN 1024
+
+ static int bert_disable;
+
++/*
++ * Print "all" the error records in the BERT table, but avoid huge spam to
++ * the console if the BIOS included oversize records, or too many records.
++ * Skipping some records here does not lose anything because the full
++ * data is available to user tools in:
++ * /sys/firmware/acpi/tables/data/BERT
++ */
+ static void __init bert_print_all(struct acpi_bert_region *region,
+ unsigned int region_len)
+ {
+ struct acpi_hest_generic_status *estatus =
+ (struct acpi_hest_generic_status *)region;
+ int remain = region_len;
++ int printed = 0, skipped = 0;
+ u32 estatus_len;
+
+ while (remain >= sizeof(struct acpi_bert_region)) {
+@@ -46,24 +56,26 @@ static void __init bert_print_all(struct acpi_bert_region *region,
+ if (remain < estatus_len) {
+ pr_err(FW_BUG "Truncated status block (length: %u).\n",
+ estatus_len);
+- return;
++ break;
+ }
+
+ /* No more error records. */
+ if (!estatus->block_status)
+- return;
++ break;
+
+ if (cper_estatus_check(estatus)) {
+ pr_err(FW_BUG "Invalid error record.\n");
+- return;
++ break;
+ }
+
+- pr_info_once("Error records from previous boot:\n");
+- if (region_len < ACPI_BERT_PRINT_MAX_LEN)
++ if (estatus_len < ACPI_BERT_PRINT_MAX_LEN &&
++ printed < ACPI_BERT_PRINT_MAX_RECORDS) {
++ pr_info_once("Error records from previous boot:\n");
+ cper_estatus_print(KERN_INFO HW_ERR, estatus);
+- else
+- pr_info_once("Max print length exceeded, table data is available at:\n"
+- "/sys/firmware/acpi/tables/data/BERT");
++ printed++;
++ } else {
++ skipped++;
++ }
+
+ /*
+ * Because the boot error source is "one-time polled" type,
+@@ -75,6 +87,9 @@ static void __init bert_print_all(struct acpi_bert_region *region,
+ estatus = (void *)estatus + estatus_len;
+ remain -= estatus_len;
+ }
++
++ if (skipped)
++ pr_info(HW_ERR "Skipped %d error records\n", skipped);
+ }
+
+ static int __init setup_bert_disable(char *str)
+diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
+index becc198e4c224..6615f59ab7fd2 100644
+--- a/drivers/acpi/video_detect.c
++++ b/drivers/acpi/video_detect.c
+@@ -430,7 +430,6 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
+ .callback = video_detect_force_native,
+ .ident = "Clevo NL5xRU",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+ DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
+ },
+ },
+@@ -438,59 +437,75 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
+ .callback = video_detect_force_native,
+ .ident = "Clevo NL5xRU",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "SchenkerTechnologiesGmbH"),
+- DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
++ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
++ DMI_MATCH(DMI_BOARD_NAME, "AURA1501"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+ .ident = "Clevo NL5xRU",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "Notebook"),
+- DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
++ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
++ DMI_MATCH(DMI_BOARD_NAME, "EDUBOOK1502"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+- .ident = "Clevo NL5xRU",
++ .ident = "Clevo NL5xNU",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+- DMI_MATCH(DMI_BOARD_NAME, "AURA1501"),
++ DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
+ },
+ },
++ /*
++ * The TongFang PF5PU1G, PF4NU1F, PF5NU1G, and PF5LUXG/TUXEDO BA15 Gen10,
++ * Pulse 14/15 Gen1, and Pulse 15 Gen2 have the same problem as the Clevo
++ * NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2. See the description
++ * above.
++ */
+ {
+ .callback = video_detect_force_native,
+- .ident = "Clevo NL5xRU",
++ .ident = "TongFang PF5PU1G",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+- DMI_MATCH(DMI_BOARD_NAME, "EDUBOOK1502"),
++ DMI_MATCH(DMI_BOARD_NAME, "PF5PU1G"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+- .ident = "Clevo NL5xNU",
++ .ident = "TongFang PF4NU1F",
++ .matches = {
++ DMI_MATCH(DMI_BOARD_NAME, "PF4NU1F"),
++ },
++ },
++ {
++ .callback = video_detect_force_native,
++ .ident = "TongFang PF4NU1F",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
+- DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
++ DMI_MATCH(DMI_BOARD_NAME, "PULSE1401"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+- .ident = "Clevo NL5xNU",
++ .ident = "TongFang PF5NU1G",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "SchenkerTechnologiesGmbH"),
+- DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
++ DMI_MATCH(DMI_BOARD_NAME, "PF5NU1G"),
+ },
+ },
+ {
+ .callback = video_detect_force_native,
+- .ident = "Clevo NL5xNU",
++ .ident = "TongFang PF5NU1G",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "Notebook"),
+- DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
++ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
++ DMI_MATCH(DMI_BOARD_NAME, "PULSE1501"),
++ },
++ },
++ {
++ .callback = video_detect_force_native,
++ .ident = "TongFang PF5LUXG",
++ .matches = {
++ DMI_MATCH(DMI_BOARD_NAME, "PF5LUXG"),
+ },
+ },
+-
+ /*
+ * Desktops which falsely report a backlight and which our heuristics
+ * for this do not catch.
+diff --git a/drivers/bluetooth/btbcm.c b/drivers/bluetooth/btbcm.c
+index d9ceca7a7935c..a18f289d73466 100644
+--- a/drivers/bluetooth/btbcm.c
++++ b/drivers/bluetooth/btbcm.c
+@@ -453,6 +453,8 @@ static const struct bcm_subver_table bcm_uart_subver_table[] = {
+ { 0x6606, "BCM4345C5" }, /* 003.006.006 */
+ { 0x230f, "BCM4356A2" }, /* 001.003.015 */
+ { 0x220e, "BCM20702A1" }, /* 001.002.014 */
++ { 0x420d, "BCM4349B1" }, /* 002.002.013 */
++ { 0x420e, "BCM4349B1" }, /* 002.002.014 */
+ { 0x4217, "BCM4329B1" }, /* 002.002.023 */
+ { 0x6106, "BCM4359C0" }, /* 003.001.006 */
+ { 0x4106, "BCM4335A0" }, /* 002.001.006 */
+diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
+index e48c3ad069bb4..d789c077d95dc 100644
+--- a/drivers/bluetooth/btusb.c
++++ b/drivers/bluetooth/btusb.c
+@@ -422,6 +422,18 @@ static const struct usb_device_id blacklist_table[] = {
+ { USB_DEVICE(0x04ca, 0x4006), .driver_info = BTUSB_REALTEK |
+ BTUSB_WIDEBAND_SPEECH },
+
++ /* Realtek 8852CE Bluetooth devices */
++ { USB_DEVICE(0x04ca, 0x4007), .driver_info = BTUSB_REALTEK |
++ BTUSB_WIDEBAND_SPEECH },
++ { USB_DEVICE(0x04c5, 0x1675), .driver_info = BTUSB_REALTEK |
++ BTUSB_WIDEBAND_SPEECH },
++ { USB_DEVICE(0x0cb8, 0xc558), .driver_info = BTUSB_REALTEK |
++ BTUSB_WIDEBAND_SPEECH },
++ { USB_DEVICE(0x13d3, 0x3587), .driver_info = BTUSB_REALTEK |
++ BTUSB_WIDEBAND_SPEECH },
++ { USB_DEVICE(0x13d3, 0x3586), .driver_info = BTUSB_REALTEK |
++ BTUSB_WIDEBAND_SPEECH },
++
+ /* Realtek Bluetooth devices */
+ { USB_VENDOR_AND_INTERFACE_INFO(0x0bda, 0xe0, 0x01, 0x01),
+ .driver_info = BTUSB_REALTEK },
+@@ -469,6 +481,9 @@ static const struct usb_device_id blacklist_table[] = {
+ { USB_DEVICE(0x0489, 0xe0d9), .driver_info = BTUSB_MEDIATEK |
+ BTUSB_WIDEBAND_SPEECH |
+ BTUSB_VALID_LE_STATES },
++ { USB_DEVICE(0x13d3, 0x3568), .driver_info = BTUSB_MEDIATEK |
++ BTUSB_WIDEBAND_SPEECH |
++ BTUSB_VALID_LE_STATES },
+
+ /* Additional Realtek 8723AE Bluetooth devices */
+ { USB_DEVICE(0x0930, 0x021d), .driver_info = BTUSB_REALTEK },
+diff --git a/drivers/bluetooth/hci_bcm.c b/drivers/bluetooth/hci_bcm.c
+index 785f445dd60d5..49bed66b8c84e 100644
+--- a/drivers/bluetooth/hci_bcm.c
++++ b/drivers/bluetooth/hci_bcm.c
+@@ -1544,8 +1544,10 @@ static const struct of_device_id bcm_bluetooth_of_match[] = {
+ { .compatible = "brcm,bcm43430a0-bt" },
+ { .compatible = "brcm,bcm43430a1-bt" },
+ { .compatible = "brcm,bcm43438-bt", .data = &bcm43438_device_data },
++ { .compatible = "brcm,bcm4349-bt", .data = &bcm43438_device_data },
+ { .compatible = "brcm,bcm43540-bt", .data = &bcm4354_device_data },
+ { .compatible = "brcm,bcm4335a0" },
++ { .compatible = "infineon,cyw55572-bt" },
+ { },
+ };
+ MODULE_DEVICE_TABLE(of, bcm_bluetooth_of_match);
+diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
+index eab34e24d9446..8df11016fd51b 100644
+--- a/drivers/bluetooth/hci_qca.c
++++ b/drivers/bluetooth/hci_qca.c
+@@ -1588,7 +1588,7 @@ static bool qca_wakeup(struct hci_dev *hdev)
+ wakeup = device_may_wakeup(hu->serdev->ctrl->dev.parent);
+ bt_dev_dbg(hu->hdev, "wakeup status : %d", wakeup);
+
+- return !wakeup;
++ return wakeup;
+ }
+
+ static int qca_regulator_init(struct hci_uart *hu)
+diff --git a/drivers/macintosh/adb.c b/drivers/macintosh/adb.c
+index 73b3961890397..afb0942ccc293 100644
+--- a/drivers/macintosh/adb.c
++++ b/drivers/macintosh/adb.c
+@@ -647,7 +647,7 @@ do_adb_query(struct adb_request *req)
+
+ switch(req->data[1]) {
+ case ADB_QUERY_GETDEVINFO:
+- if (req->nbytes < 3)
++ if (req->nbytes < 3 || req->data[2] >= 16)
+ break;
+ mutex_lock(&adb_handler_mutex);
+ req->reply[0] = adb_handler[req->data[2]].original_address;
+diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
+index 19db5693175fe..2a0ead57db71c 100644
+--- a/fs/btrfs/block-group.h
++++ b/fs/btrfs/block-group.h
+@@ -104,6 +104,7 @@ struct btrfs_block_group {
+ unsigned int relocating_repair:1;
+ unsigned int chunk_item_inserted:1;
+ unsigned int zone_is_active:1;
++ unsigned int zoned_data_reloc_ongoing:1;
+
+ int disk_cache_state;
+
+diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
+index 6aa92f84f4654..f45ecd939a2cb 100644
+--- a/fs/btrfs/extent-tree.c
++++ b/fs/btrfs/extent-tree.c
+@@ -3836,7 +3836,7 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
+ block_group->start == fs_info->data_reloc_bg ||
+ fs_info->data_reloc_bg == 0);
+
+- if (block_group->ro) {
++ if (block_group->ro || block_group->zoned_data_reloc_ongoing) {
+ ret = 1;
+ goto out;
+ }
+@@ -3898,8 +3898,24 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
+ out:
+ if (ret && ffe_ctl->for_treelog)
+ fs_info->treelog_bg = 0;
+- if (ret && ffe_ctl->for_data_reloc)
++ if (ret && ffe_ctl->for_data_reloc &&
++ fs_info->data_reloc_bg == block_group->start) {
++ /*
++ * Do not allow further allocations from this block group.
++ * Compared to increasing the ->ro, setting the
++ * ->zoned_data_reloc_ongoing flag still allows nocow
++ * writers to come in. See btrfs_inc_nocow_writers().
++ *
++ * We need to disable an allocation to avoid an allocation of
++ * regular (non-relocation data) extent. With mix of relocation
++ * extents and regular extents, we can dispatch WRITE commands
++ * (for relocation extents) and ZONE APPEND commands (for
++ * regular extents) at the same time to the same zone, which
++ * easily break the write pointer.
++ */
++ block_group->zoned_data_reloc_ongoing = 1;
+ fs_info->data_reloc_bg = 0;
++ }
+ spin_unlock(&fs_info->relocation_bg_lock);
+ spin_unlock(&fs_info->treelog_bg_lock);
+ spin_unlock(&block_group->lock);
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index a23a42ba88cae..68ddd90685d9d 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -5214,13 +5214,14 @@ int extent_writepages(struct address_space *mapping,
+ */
+ btrfs_zoned_data_reloc_lock(BTRFS_I(inode));
+ ret = extent_write_cache_pages(mapping, wbc, &epd);
+- btrfs_zoned_data_reloc_unlock(BTRFS_I(inode));
+ ASSERT(ret <= 0);
+ if (ret < 0) {
++ btrfs_zoned_data_reloc_unlock(BTRFS_I(inode));
+ end_write_bio(&epd, ret);
+ return ret;
+ }
+ ret = flush_write_bio(&epd);
++ btrfs_zoned_data_reloc_unlock(BTRFS_I(inode));
+ return ret;
+ }
+
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index 9ae79342631a8..5d15e374d0326 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -3102,6 +3102,8 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent)
+ ordered_extent->file_offset,
+ ordered_extent->file_offset +
+ logical_len);
++ btrfs_zoned_release_data_reloc_bg(fs_info, ordered_extent->disk_bytenr,
++ ordered_extent->disk_num_bytes);
+ } else {
+ BUG_ON(root == fs_info->tree_root);
+ ret = insert_ordered_extent_file_extent(trans, ordered_extent);
+diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
+index 5091d679a602c..84b6d39509bd3 100644
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -2005,6 +2005,7 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
+ struct btrfs_device *device;
+ u64 min_alloc_bytes;
+ u64 physical;
++ int i;
+
+ if (!btrfs_is_zoned(fs_info))
+ return;
+@@ -2039,13 +2040,25 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical, u64 len
+ spin_unlock(&block_group->lock);
+
+ map = block_group->physical_map;
+- device = map->stripes[0].dev;
+- physical = map->stripes[0].physical;
++ for (i = 0; i < map->num_stripes; i++) {
++ int ret;
+
+- if (!device->zone_info->max_active_zones)
+- goto out;
++ device = map->stripes[i].dev;
++ physical = map->stripes[i].physical;
++
++ if (device->zone_info->max_active_zones == 0)
++ continue;
+
+- btrfs_dev_clear_active_zone(device, physical);
++ ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
++ physical >> SECTOR_SHIFT,
++ device->zone_info->zone_size >> SECTOR_SHIFT,
++ GFP_NOFS);
++
++ if (ret)
++ return;
++
++ btrfs_dev_clear_active_zone(device, physical);
++ }
+
+ spin_lock(&fs_info->zone_active_bgs_lock);
+ ASSERT(!list_empty(&block_group->active_bg_list));
+@@ -2116,3 +2129,30 @@ void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info)
+ }
+ mutex_unlock(&fs_devices->device_list_mutex);
+ }
++
++void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info, u64 logical,
++ u64 length)
++{
++ struct btrfs_block_group *block_group;
++
++ if (!btrfs_is_zoned(fs_info))
++ return;
++
++ block_group = btrfs_lookup_block_group(fs_info, logical);
++ /* It should be called on a previous data relocation block group. */
++ ASSERT(block_group && (block_group->flags & BTRFS_BLOCK_GROUP_DATA));
++
++ spin_lock(&block_group->lock);
++ if (!block_group->zoned_data_reloc_ongoing)
++ goto out;
++
++ /* All relocation extents are written. */
++ if (block_group->start + block_group->alloc_offset == logical + length) {
++ /* Now, release this block group for further allocations. */
++ block_group->zoned_data_reloc_ongoing = 0;
++ }
++
++out:
++ spin_unlock(&block_group->lock);
++ btrfs_put_block_group(block_group);
++}
+diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
+index 2d898970aec5f..cf6320feef464 100644
+--- a/fs/btrfs/zoned.h
++++ b/fs/btrfs/zoned.h
+@@ -80,6 +80,8 @@ void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+ struct extent_buffer *eb);
+ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
+ void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
++void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info, u64 logical,
++ u64 length);
+ #else /* CONFIG_BLK_DEV_ZONED */
+ static inline int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos,
+ struct blk_zone *zone)
+@@ -241,6 +243,9 @@ static inline void btrfs_schedule_zone_finish_bg(struct btrfs_block_group *bg,
+ static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
+
+ static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
++
++static inline void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info,
++ u64 logical, u64 length) { }
+ #endif
+
+ static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)
+diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h
+index 3f53bc27a19bf..3d088a88f8320 100644
+--- a/include/linux/ioprio.h
++++ b/include/linux/ioprio.h
+@@ -11,7 +11,7 @@
+ /*
+ * Default IO priority.
+ */
+-#define IOPRIO_DEFAULT IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM)
++#define IOPRIO_DEFAULT IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0)
+
+ /*
+ * Check that a priority value has a valid class.
+diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
+index 9d09f489b60e0..2e0f75bcb7fd1 100644
+--- a/kernel/entry/kvm.c
++++ b/kernel/entry/kvm.c
+@@ -9,12 +9,6 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
+ int ret;
+
+ if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL)) {
+- clear_notify_signal();
+- if (task_work_pending(current))
+- task_work_run();
+- }
+-
+- if (ti_work & _TIF_SIGPENDING) {
+ kvm_handle_signal_exit(vcpu);
+ return -EINTR;
+ }
+diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
+index 5d09ded0c491f..04b7e3654ff77 100644
+--- a/tools/arch/x86/include/asm/cpufeatures.h
++++ b/tools/arch/x86/include/asm/cpufeatures.h
+@@ -301,6 +301,7 @@
+ #define X86_FEATURE_RETPOLINE_LFENCE (11*32+13) /* "" Use LFENCE for Spectre variant 2 */
+ #define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */
+ #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
++#define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM-Exit when EIBRS is enabled */
+
+ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
+ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
+diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
+index ad084326f24c2..f951147cc7fdc 100644
+--- a/tools/arch/x86/include/asm/msr-index.h
++++ b/tools/arch/x86/include/asm/msr-index.h
+@@ -148,6 +148,10 @@
+ * are restricted to targets in
+ * kernel.
+ */
++#define ARCH_CAP_PBRSB_NO BIT(24) /*
++ * Not susceptible to Post-Barrier
++ * Return Stack Buffer Predictions.
++ */
+
+ #define MSR_IA32_FLUSH_CMD 0x0000010b
+ #define L1D_FLUSH BIT(0) /*
+diff --git a/tools/kvm/kvm_stat/kvm_stat b/tools/kvm/kvm_stat/kvm_stat
+index 5a5bd74f55bd5..9c366b3a676db 100755
+--- a/tools/kvm/kvm_stat/kvm_stat
++++ b/tools/kvm/kvm_stat/kvm_stat
+@@ -1646,7 +1646,8 @@ Press any other key to refresh statistics immediately.
+ .format(values))
+ if len(pids) > 1:
+ sys.exit('Error: Multiple processes found (pids: {}). Use "-p"'
+- ' to specify the desired pid'.format(" ".join(pids)))
++ ' to specify the desired pid'
++ .format(" ".join(map(str, pids))))
+ namespace.pid = pids[0]
+
+ argparser = argparse.ArgumentParser(description=description_text,
+diff --git a/tools/testing/selftests/kvm/lib/aarch64/ucall.c b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
+index e0b0164e9af85..be1d9728c4cea 100644
+--- a/tools/testing/selftests/kvm/lib/aarch64/ucall.c
++++ b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
+@@ -73,20 +73,19 @@ void ucall_uninit(struct kvm_vm *vm)
+
+ void ucall(uint64_t cmd, int nargs, ...)
+ {
+- struct ucall uc = {
+- .cmd = cmd,
+- };
++ struct ucall uc = {};
+ va_list va;
+ int i;
+
++ WRITE_ONCE(uc.cmd, cmd);
+ nargs = nargs <= UCALL_MAX_ARGS ? nargs : UCALL_MAX_ARGS;
+
+ va_start(va, nargs);
+ for (i = 0; i < nargs; ++i)
+- uc.args[i] = va_arg(va, uint64_t);
++ WRITE_ONCE(uc.args[i], va_arg(va, uint64_t));
+ va_end(va);
+
+- *ucall_exit_mmio_addr = (vm_vaddr_t)&uc;
++ WRITE_ONCE(*ucall_exit_mmio_addr, (vm_vaddr_t)&uc);
+ }
+
+ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
+diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
+index 722df3a28791c..ddd68ba0c99fc 100644
+--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
++++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
+@@ -110,6 +110,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int vcpus,
+ struct kvm_vm *vm;
+ uint64_t guest_num_pages;
+ uint64_t backing_src_pagesz = get_backing_src_pagesz(backing_src);
++ uint64_t region_end_gfn;
+ int i;
+
+ pr_info("Testing guest mode: %s\n", vm_guest_mode_string(mode));
+@@ -144,18 +145,29 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int vcpus,
+
+ pta->vm = vm;
+
++ /* Put the test region at the top guest physical memory. */
++ region_end_gfn = vm_get_max_gfn(vm) + 1;
++
++#ifdef __x86_64__
++ /*
++ * When running vCPUs in L2, restrict the test region to 48 bits to
++ * avoid needing 5-level page tables to identity map L2.
++ */
++ if (pta->nested)
++ region_end_gfn = min(region_end_gfn, (1UL << 48) / pta->guest_page_size);
++#endif
+ /*
+ * If there should be more memory in the guest test region than there
+ * can be pages in the guest, it will definitely cause problems.
+ */
+- TEST_ASSERT(guest_num_pages < vm_get_max_gfn(vm),
++ TEST_ASSERT(guest_num_pages < region_end_gfn,
+ "Requested more guest memory than address space allows.\n"
+ " guest pages: %" PRIx64 " max gfn: %" PRIx64
+ " vcpus: %d wss: %" PRIx64 "]\n",
+- guest_num_pages, vm_get_max_gfn(vm), vcpus,
++ guest_num_pages, region_end_gfn - 1, vcpus,
+ vcpu_memory_bytes);
+
+- pta->gpa = (vm_get_max_gfn(vm) - guest_num_pages) * pta->guest_page_size;
++ pta->gpa = (region_end_gfn - guest_num_pages) * pta->guest_page_size;
+ pta->gpa = align_down(pta->gpa, backing_src_pagesz);
+ #ifdef __s390x__
+ /* Align to 1M (segment size) */
+diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
+index e0b2bb1339b16..3330fb183c680 100644
+--- a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
++++ b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
+@@ -44,7 +44,7 @@ static inline void nop_loop(void)
+ {
+ int i;
+
+- for (i = 0; i < 1000000; i++)
++ for (i = 0; i < 100000000; i++)
+ asm volatile("nop");
+ }
+
+@@ -56,12 +56,14 @@ static inline void check_tsc_msr_rdtsc(void)
+ tsc_freq = rdmsr(HV_X64_MSR_TSC_FREQUENCY);
+ GUEST_ASSERT(tsc_freq > 0);
+
+- /* First, check MSR-based clocksource */
++ /* For increased accuracy, take mean rdtsc() before and afrer rdmsr() */
+ r1 = rdtsc();
+ t1 = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
++ r1 = (r1 + rdtsc()) / 2;
+ nop_loop();
+ r2 = rdtsc();
+ t2 = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
++ r2 = (r2 + rdtsc()) / 2;
+
+ GUEST_ASSERT(r2 > r1 && t2 > t1);
+
+@@ -181,12 +183,14 @@ static void host_check_tsc_msr_rdtsc(struct kvm_vm *vm)
+ tsc_freq = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TSC_FREQUENCY);
+ TEST_ASSERT(tsc_freq > 0, "TSC frequency must be nonzero");
+
+- /* First, check MSR-based clocksource */
++ /* For increased accuracy, take mean rdtsc() before and afrer ioctl */
+ r1 = rdtsc();
+ t1 = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TIME_REF_COUNT);
++ r1 = (r1 + rdtsc()) / 2;
+ nop_loop();
+ r2 = rdtsc();
+ t2 = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TIME_REF_COUNT);
++ r2 = (r2 + rdtsc()) / 2;
+
+ TEST_ASSERT(t2 > t1, "Time reference MSR is not monotonic (%ld <= %ld)", t1, t2);
+
+diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
+index 9b68658b6bb85..5b98f3ee58a58 100644
+--- a/tools/vm/slabinfo.c
++++ b/tools/vm/slabinfo.c
+@@ -233,6 +233,24 @@ static unsigned long read_slab_obj(struct slabinfo *s, const char *name)
+ return l;
+ }
+
++static unsigned long read_debug_slab_obj(struct slabinfo *s, const char *name)
++{
++ char x[128];
++ FILE *f;
++ size_t l;
++
++ snprintf(x, 128, "/sys/kernel/debug/slab/%s/%s", s->name, name);
++ f = fopen(x, "r");
++ if (!f) {
++ buffer[0] = 0;
++ l = 0;
++ } else {
++ l = fread(buffer, 1, sizeof(buffer), f);
++ buffer[l] = 0;
++ fclose(f);
++ }
++ return l;
++}
+
+ /*
+ * Put a size string together
+@@ -409,14 +427,18 @@ static void show_tracking(struct slabinfo *s)
+ {
+ printf("\n%s: Kernel object allocation\n", s->name);
+ printf("-----------------------------------------------------------------------\n");
+- if (read_slab_obj(s, "alloc_calls"))
++ if (read_debug_slab_obj(s, "alloc_traces"))
++ printf("%s", buffer);
++ else if (read_slab_obj(s, "alloc_calls"))
+ printf("%s", buffer);
+ else
+ printf("No Data\n");
+
+ printf("\n%s: Kernel object freeing\n", s->name);
+ printf("------------------------------------------------------------------------\n");
+- if (read_slab_obj(s, "free_calls"))
++ if (read_debug_slab_obj(s, "free_traces"))
++ printf("%s", buffer);
++ else if (read_slab_obj(s, "free_calls"))
+ printf("%s", buffer);
+ else
+ printf("No Data\n");
+diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
+index 24cb37d19c638..7f1d19689701b 100644
+--- a/virt/kvm/kvm_main.c
++++ b/virt/kvm/kvm_main.c
+@@ -3327,9 +3327,11 @@ bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
+
+ vcpu->stat.generic.blocking = 1;
+
++ preempt_disable();
+ kvm_arch_vcpu_blocking(vcpu);
+-
+ prepare_to_rcuwait(wait);
++ preempt_enable();
++
+ for (;;) {
+ set_current_state(TASK_INTERRUPTIBLE);
+
+@@ -3339,9 +3341,11 @@ bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
+ waited = true;
+ schedule();
+ }
+- finish_rcuwait(wait);
+
++ preempt_disable();
++ finish_rcuwait(wait);
+ kvm_arch_vcpu_unblocking(vcpu);
++ preempt_enable();
+
+ vcpu->stat.generic.blocking = 0;
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-08-17 14:31 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-08-17 14:31 UTC (permalink / raw
To: gentoo-commits
commit: 826fdf3e3dc58be7aa2de584cfd1db7dc87f8b78
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 17 14:31:06 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Aug 17 14:31:06 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=826fdf3e
Linux patch 5.18.18
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1017_linux-5.18.18.patch | 85236 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 85240 insertions(+)
diff --git a/0000_README b/0000_README
index e0f23579..c9212ac7 100644
--- a/0000_README
+++ b/0000_README
@@ -111,6 +111,10 @@ Patch: 1016_linux-5.18.17.patch
From: http://www.kernel.org
Desc: Linux 5.18.17
+Patch: 1017_linux-5.18.18.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.18
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1017_linux-5.18.18.patch b/1017_linux-5.18.18.patch
new file mode 100644
index 00000000..09fe43be
--- /dev/null
+++ b/1017_linux-5.18.18.patch
@@ -0,0 +1,85236 @@
+diff --git a/Documentation/ABI/testing/sysfs-driver-xen-blkback b/Documentation/ABI/testing/sysfs-driver-xen-blkback
+index a74dfe52dd76d..ca7f12aeb2dad 100644
+--- a/Documentation/ABI/testing/sysfs-driver-xen-blkback
++++ b/Documentation/ABI/testing/sysfs-driver-xen-blkback
+@@ -42,5 +42,5 @@ KernelVersion: 5.10
+ Contact: SeongJae Park <sj@kernel.org>
+ Description:
+ Whether to enable the persistent grants feature or not. Note
+- that this option only takes effect on newly created backends.
++ that this option only takes effect on newly connected backends.
+ The default is Y (enable).
+diff --git a/Documentation/ABI/testing/sysfs-driver-xen-blkfront b/Documentation/ABI/testing/sysfs-driver-xen-blkfront
+index 61fd173fabfe3..98b41c3b60027 100644
+--- a/Documentation/ABI/testing/sysfs-driver-xen-blkfront
++++ b/Documentation/ABI/testing/sysfs-driver-xen-blkfront
+@@ -15,5 +15,5 @@ KernelVersion: 5.10
+ Contact: SeongJae Park <sj@kernel.org>
+ Description:
+ Whether to enable the persistent grants feature or not. Note
+- that this option only takes effect on newly created frontends.
++ that this option only takes effect on newly connected frontends.
+ The default is Y (enable).
+diff --git a/Documentation/admin-guide/device-mapper/writecache.rst b/Documentation/admin-guide/device-mapper/writecache.rst
+index 10429779a91ab..724e028d1858b 100644
+--- a/Documentation/admin-guide/device-mapper/writecache.rst
++++ b/Documentation/admin-guide/device-mapper/writecache.rst
+@@ -78,16 +78,16 @@ Status:
+ 2. the number of blocks
+ 3. the number of free blocks
+ 4. the number of blocks under writeback
+-5. the number of read requests
+-6. the number of read requests that hit the cache
+-7. the number of write requests
+-8. the number of write requests that hit uncommitted block
+-9. the number of write requests that hit committed block
+-10. the number of write requests that bypass the cache
+-11. the number of write requests that are allocated in the cache
++5. the number of read blocks
++6. the number of read blocks that hit the cache
++7. the number of write blocks
++8. the number of write blocks that hit uncommitted block
++9. the number of write blocks that hit committed block
++10. the number of write blocks that bypass the cache
++11. the number of write blocks that are allocated in the cache
+ 12. the number of write requests that are blocked on the freelist
+ 13. the number of flush requests
+-14. the number of discard requests
++14. the number of discarded blocks
+
+ Messages:
+ flush
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 334544c893ddf..c9a272acfb5b2 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -5130,20 +5130,33 @@
+ Speculative Code Execution with Return Instructions)
+ vulnerability.
+
++ AMD-based UNRET and IBPB mitigations alone do not stop
++ sibling threads from influencing the predictions of other
++ sibling threads. For that reason, STIBP is used on pro-
++ cessors that support it, and mitigate SMT on processors
++ that don't.
++
+ off - no mitigation
+ auto - automatically select a migitation
+ auto,nosmt - automatically select a mitigation,
+ disabling SMT if necessary for
+ the full mitigation (only on Zen1
+ and older without STIBP).
+- ibpb - mitigate short speculation windows on
+- basic block boundaries too. Safe, highest
+- perf impact.
+- unret - force enable untrained return thunks,
+- only effective on AMD f15h-f17h
+- based systems.
+- unret,nosmt - like unret, will disable SMT when STIBP
+- is not available.
++ ibpb - On AMD, mitigate short speculation
++ windows on basic block boundaries too.
++ Safe, highest perf impact. It also
++ enables STIBP if present. Not suitable
++ on Intel.
++ ibpb,nosmt - Like "ibpb" above but will disable SMT
++ when STIBP is not available. This is
++ the alternative for systems which do not
++ have STIBP.
++ unret - Force enable untrained return thunks,
++ only effective on AMD f15h-f17h based
++ systems.
++ unret,nosmt - Like unret, but will disable SMT when STIBP
++ is not available. This is the alternative for
++ systems which do not have STIBP.
+
+ Selecting 'auto' will choose a mitigation method at run
+ time according to the CPU.
+diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
+index aec2cd2aaea73..19754beb5a4e6 100644
+--- a/Documentation/admin-guide/pm/cpuidle.rst
++++ b/Documentation/admin-guide/pm/cpuidle.rst
+@@ -612,8 +612,8 @@ the ``menu`` governor to be used on the systems that use the ``ladder`` governor
+ by default this way, for example.
+
+ The other kernel command line parameters controlling CPU idle time management
+-described below are only relevant for the *x86* architecture and some of
+-them affect Intel processors only.
++described below are only relevant for the *x86* architecture and references
++to ``intel_idle`` affect Intel processors only.
+
+ The *x86* architecture support code recognizes three kernel command line
+ options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
+@@ -635,10 +635,13 @@ idle, so it very well may hurt single-thread computations performance as well as
+ energy-efficiency. Thus using it for performance reasons may not be a good idea
+ at all.]
+
+-The ``idle=nomwait`` option disables the ``intel_idle`` driver and causes
+-``acpi_idle`` to be used (as long as all of the information needed by it is
+-there in the system's ACPI tables), but it is not allowed to use the
+-``MWAIT`` instruction of the CPUs to ask the hardware to enter idle states.
++The ``idle=nomwait`` option prevents the use of ``MWAIT`` instruction of
++the CPU to enter idle states. When this option is used, the ``acpi_idle``
++driver will use the ``HLT`` instruction instead of ``MWAIT``. On systems
++running Intel processors, this option disables the ``intel_idle`` driver
++and forces the use of the ``acpi_idle`` driver instead. Note that in either
++case, ``acpi_idle`` driver will function only if all the information needed
++by it is in the system's ACPI tables.
+
+ In addition to the architecture-level kernel command line options affecting CPU
+ idle time management, there are parameters affecting individual ``CPUIdle``
+diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
+index d27db84d585ed..0b4235b1f8c46 100644
+--- a/Documentation/arm64/silicon-errata.rst
++++ b/Documentation/arm64/silicon-errata.rst
+@@ -82,10 +82,14 @@ stable kernels.
+ +----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Cortex-A57 | #1319537 | ARM64_ERRATUM_1319367 |
+ +----------------+-----------------+-----------------+-----------------------------+
++| ARM | Cortex-A57 | #1742098 | ARM64_ERRATUM_1742098 |
+++----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Cortex-A72 | #853709 | N/A |
+ +----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Cortex-A72 | #1319367 | ARM64_ERRATUM_1319367 |
+ +----------------+-----------------+-----------------+-----------------------------+
++| ARM | Cortex-A72 | #1655431 | ARM64_ERRATUM_1742098 |
+++----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
+ +----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Cortex-A76 | #1188873,1418040| ARM64_ERRATUM_1418040 |
+diff --git a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
+index e2d330bd4608a..69cdab18d6294 100644
+--- a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
++++ b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
+@@ -46,7 +46,7 @@ properties:
+ const: 2
+
+ cache-sets:
+- const: 1024
++ enum: [1024, 2048]
+
+ cache-size:
+ const: 2097152
+@@ -84,6 +84,8 @@ then:
+ description: |
+ Must contain entries for DirError, DataError and DataFail signals.
+ maxItems: 3
++ cache-sets:
++ const: 1024
+
+ else:
+ properties:
+@@ -91,6 +93,8 @@ else:
+ description: |
+ Must contain entries for DirError, DataError, DataFail, DirFail signals.
+ minItems: 4
++ cache-sets:
++ const: 2048
+
+ additionalProperties: false
+
+diff --git a/Documentation/tty/device_drivers/oxsemi-tornado.rst b/Documentation/tty/device_drivers/oxsemi-tornado.rst
+new file mode 100644
+index 0000000000000..0180d8bb08818
+--- /dev/null
++++ b/Documentation/tty/device_drivers/oxsemi-tornado.rst
+@@ -0,0 +1,129 @@
++.. SPDX-License-Identifier: GPL-2.0
++
++====================================================================
++Notes on Oxford Semiconductor PCIe (Tornado) 950 serial port devices
++====================================================================
++
++Oxford Semiconductor PCIe (Tornado) 950 serial port devices are driven
++by a fixed 62.5MHz clock input derived from the 100MHz PCI Express clock.
++
++The baud rate produced by the baud generator is obtained from this input
++frequency by dividing it by the clock prescaler, which can be set to any
++value from 1 to 63.875 in increments of 0.125, and then the usual 16-bit
++divisor is used as with the original 8250, to divide the frequency by a
++value from 1 to 65535. Finally a programmable oversampling rate is used
++that can take any value from 4 to 16 to divide the frequency further and
++determine the actual baud rate used. Baud rates from 15625000bps down
++to 0.933bps can be obtained this way.
++
++By default the oversampling rate is set to 16 and the clock prescaler is
++set to 33.875, meaning that the frequency to be used as the reference
++for the usual 16-bit divisor is 115313.653, which is close enough to the
++frequency of 115200 used by the original 8250 for the same values to be
++used for the divisor to obtain the requested baud rates by software that
++is unaware of the extra clock controls available.
++
++The oversampling rate is programmed with the TCR register and the clock
++prescaler is programmed with the CPR/CPR2 register pair[1][2][3][4].
++To switch away from the default value of 33.875 for the prescaler the
++the enhanced mode has to be explicitly enabled though, by setting bit 4
++of the EFR. In that mode setting bit 7 in the MCR enables the prescaler
++or otherwise it is bypassed as if the value of 1 was used. Additionally
++writing any value to CPR clears CPR2 for compatibility with old software
++written for older conventional PCI Oxford Semiconductor devices that do
++not have the extra prescaler's 9th bit in CPR2, so the CPR/CPR2 register
++pair has to be programmed in the right order.
++
++By using these parameters rates from 15625000bps down to 1bps can be
++obtained, with either exact or highly-accurate actual bit rates for
++standard and many non-standard rates.
++
++Here are the figures for the standard and some non-standard baud rates
++(including those quoted in Oxford Semiconductor documentation), giving
++the requested rate (r), the actual rate yielded (a) and its deviation
++from the requested rate (d), and the values of the oversampling rate
++(tcr), the clock prescaler (cpr) and the divisor (div) produced by the
++new `get_divisor' handler:
++
++r: 15625000, a: 15625000.00, d: 0.0000%, tcr: 4, cpr: 1.000, div: 1
++r: 12500000, a: 12500000.00, d: 0.0000%, tcr: 5, cpr: 1.000, div: 1
++r: 10416666, a: 10416666.67, d: 0.0000%, tcr: 6, cpr: 1.000, div: 1
++r: 8928571, a: 8928571.43, d: 0.0000%, tcr: 7, cpr: 1.000, div: 1
++r: 7812500, a: 7812500.00, d: 0.0000%, tcr: 8, cpr: 1.000, div: 1
++r: 4000000, a: 4000000.00, d: 0.0000%, tcr: 5, cpr: 3.125, div: 1
++r: 3686400, a: 3676470.59, d: -0.2694%, tcr: 8, cpr: 2.125, div: 1
++r: 3500000, a: 3496503.50, d: -0.0999%, tcr: 13, cpr: 1.375, div: 1
++r: 3000000, a: 2976190.48, d: -0.7937%, tcr: 14, cpr: 1.500, div: 1
++r: 2500000, a: 2500000.00, d: 0.0000%, tcr: 10, cpr: 2.500, div: 1
++r: 2000000, a: 2000000.00, d: 0.0000%, tcr: 10, cpr: 3.125, div: 1
++r: 1843200, a: 1838235.29, d: -0.2694%, tcr: 16, cpr: 2.125, div: 1
++r: 1500000, a: 1492537.31, d: -0.4975%, tcr: 5, cpr: 8.375, div: 1
++r: 1152000, a: 1152073.73, d: 0.0064%, tcr: 14, cpr: 3.875, div: 1
++r: 921600, a: 919117.65, d: -0.2694%, tcr: 16, cpr: 2.125, div: 2
++r: 576000, a: 576036.87, d: 0.0064%, tcr: 14, cpr: 3.875, div: 2
++r: 460800, a: 460829.49, d: 0.0064%, tcr: 7, cpr: 3.875, div: 5
++r: 230400, a: 230414.75, d: 0.0064%, tcr: 14, cpr: 3.875, div: 5
++r: 115200, a: 115207.37, d: 0.0064%, tcr: 14, cpr: 1.250, div: 31
++r: 57600, a: 57603.69, d: 0.0064%, tcr: 8, cpr: 3.875, div: 35
++r: 38400, a: 38402.46, d: 0.0064%, tcr: 14, cpr: 3.875, div: 30
++r: 19200, a: 19201.23, d: 0.0064%, tcr: 8, cpr: 3.875, div: 105
++r: 9600, a: 9600.06, d: 0.0006%, tcr: 9, cpr: 1.125, div: 643
++r: 4800, a: 4799.98, d: -0.0004%, tcr: 7, cpr: 2.875, div: 647
++r: 2400, a: 2400.02, d: 0.0008%, tcr: 9, cpr: 2.250, div: 1286
++r: 1200, a: 1200.00, d: 0.0000%, tcr: 14, cpr: 2.875, div: 1294
++r: 300, a: 300.00, d: 0.0000%, tcr: 11, cpr: 2.625, div: 7215
++r: 200, a: 200.00, d: 0.0000%, tcr: 16, cpr: 1.250, div: 15625
++r: 150, a: 150.00, d: 0.0000%, tcr: 13, cpr: 2.250, div: 14245
++r: 134, a: 134.00, d: 0.0000%, tcr: 11, cpr: 2.625, div: 16153
++r: 110, a: 110.00, d: 0.0000%, tcr: 12, cpr: 1.000, div: 47348
++r: 75, a: 75.00, d: 0.0000%, tcr: 4, cpr: 5.875, div: 35461
++r: 50, a: 50.00, d: 0.0000%, tcr: 16, cpr: 1.250, div: 62500
++r: 25, a: 25.00, d: 0.0000%, tcr: 16, cpr: 2.500, div: 62500
++r: 4, a: 4.00, d: 0.0000%, tcr: 16, cpr: 20.000, div: 48828
++r: 2, a: 2.00, d: 0.0000%, tcr: 16, cpr: 40.000, div: 48828
++r: 1, a: 1.00, d: 0.0000%, tcr: 16, cpr: 63.875, div: 61154
++
++With the baud base set to 15625000 and the unsigned 16-bit UART_DIV_MAX
++limitation imposed by `serial8250_get_baud_rate' standard baud rates
++below 300bps become unavailable in the regular way, e.g. the rate of
++200bps requires the baud base to be divided by 78125 and that is beyond
++the unsigned 16-bit range. The historic spd_cust feature can still be
++used by encoding the values for, the prescaler, the oversampling rate
++and the clock divisor (DLM/DLL) as follows to obtain such rates if so
++required:
++
++ 31 29 28 20 19 16 15 0
+++-----+-----------------+-------+-------------------------------+
++|0 0 0| CPR2:CPR | TCR | DLM:DLL |
+++-----+-----------------+-------+-------------------------------+
++
++Use a value such encoded for the `custom_divisor' field along with the
++ASYNC_SPD_CUST flag set in the `flags' field in `struct serial_struct'
++passed with the TIOCSSERIAL ioctl(2), such as with the setserial(8)
++utility and its `divisor' and `spd_cust' parameters, and the select
++the baud rate of 38400bps. Note that the value of 0 in TCR sets the
++oversampling rate to 16 and prescaler values below 1 in CPR2/CPR are
++clamped by the driver to 1.
++
++For example the value of 0x1f4004e2 will set CPR2/CPR, TCR and DLM/DLL
++respectively to 0x1f4, 0x0 and 0x04e2, choosing the prescaler value,
++the oversampling rate and the clock divisor of 62.500, 16 and 1250
++respectively. These parameters will set the baud rate for the serial
++port to 62500000 / 62.500 / 1250 / 16 = 50bps.
++
++References:
++
++[1] "OXPCIe200 PCI Express Multi-Port Bridge", Oxford Semiconductor,
++ Inc., DS-0045, 10 Nov 2008, Section "950 Mode", pp. 64-65
++
++[2] "OXPCIe952 PCI Express Bridge to Dual Serial & Parallel Port",
++ Oxford Semiconductor, Inc., DS-0046, Mar 06 08, Section "950 Mode",
++ p. 20
++
++[3] "OXPCIe954 PCI Express Bridge to Quad Serial Port", Oxford
++ Semiconductor, Inc., DS-0047, Feb 08, Section "950 Mode", p. 20
++
++[4] "OXPCIe958 PCI Express Bridge to Octal Serial Port", Oxford
++ Semiconductor, Inc., DS-0048, Feb 08, Section "950 Mode", p. 20
++
++Maciej W. Rozycki <macro@orcam.me.uk>
+diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
+index 4cd7c541fc307..bcdc14144105d 100644
+--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
++++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
+@@ -2975,7 +2975,7 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
+ * - __u8
+ - ``colour_plane_id``
+ -
+- * - __u16
++ * - __s32
+ - ``slice_pic_order_cnt``
+ -
+ * - __u8
+diff --git a/MAINTAINERS b/MAINTAINERS
+index 2b70e2d214057..c7c7a96b62a87 100644
+--- a/MAINTAINERS
++++ b/MAINTAINERS
+@@ -7599,9 +7599,6 @@ F: include/linux/fs.h
+ F: include/linux/fs_types.h
+ F: include/uapi/linux/fs.h
+ F: include/uapi/linux/openat2.h
+-X: fs/io-wq.c
+-X: fs/io-wq.h
+-X: fs/io_uring.c
+
+ FINTEK F75375S HARDWARE MONITOR AND FAN CONTROLLER DRIVER
+ M: Riku Voipio <riku.voipio@iki.fi>
+@@ -10277,9 +10274,7 @@ L: io-uring@vger.kernel.org
+ S: Maintained
+ T: git git://git.kernel.dk/linux-block
+ T: git git://git.kernel.dk/liburing
+-F: fs/io-wq.c
+-F: fs/io-wq.h
+-F: fs/io_uring.c
++F: io_uring/
+ F: include/linux/io_uring.h
+ F: include/uapi/linux/io_uring.h
+ F: tools/io_uring/
+diff --git a/Makefile b/Makefile
+index ef8c18e5c161c..23162e2bdf140 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 17
++SUBLEVEL = 18
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+@@ -1031,6 +1031,11 @@ KBUILD_CFLAGS += $(KCFLAGS)
+ KBUILD_LDFLAGS_MODULE += --build-id=sha1
+ LDFLAGS_vmlinux += --build-id=sha1
+
++KBUILD_LDFLAGS += -z noexecstack
++ifeq ($(CONFIG_LD_IS_BFD),y)
++KBUILD_LDFLAGS += $(call ld-option,--no-warn-rwx-segments)
++endif
++
+ ifeq ($(CONFIG_STRIP_ASM_SYMS),y)
+ LDFLAGS_vmlinux += $(call ld-option, -X,)
+ endif
+@@ -1095,6 +1100,7 @@ export MODULES_NSDEPS := $(extmod_prefix)modules.nsdeps
+ ifeq ($(KBUILD_EXTMOD),)
+ core-y += kernel/ certs/ mm/ fs/ ipc/ security/ crypto/
+ core-$(CONFIG_BLOCK) += block/
++core-$(CONFIG_IO_URING) += io_uring/
+
+ vmlinux-dirs := $(patsubst %/,%,$(filter %/, \
+ $(core-y) $(core-m) $(drivers-y) $(drivers-m) \
+diff --git a/arch/Kconfig b/arch/Kconfig
+index 31c4fdc4a4baa..ab45e0f6c21bf 100644
+--- a/arch/Kconfig
++++ b/arch/Kconfig
+@@ -214,6 +214,9 @@ config HAVE_FUNCTION_DESCRIPTORS
+ config TRACE_IRQFLAGS_SUPPORT
+ bool
+
++config TRACE_IRQFLAGS_NMI_SUPPORT
++ bool
++
+ #
+ # An arch should select this if it provides all these things:
+ #
+diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
+index 7c16f8a2b738d..13d788d1e6dc4 100644
+--- a/arch/arm/boot/dts/Makefile
++++ b/arch/arm/boot/dts/Makefile
+@@ -133,6 +133,7 @@ dtb-$(CONFIG_ARCH_BCM_5301X) += \
+ bcm47094-luxul-xwr-3150-v1.dtb \
+ bcm47094-netgear-r8500.dtb \
+ bcm47094-phicomm-k3.dtb \
++ bcm53015-meraki-mr26.dtb \
+ bcm53016-meraki-mr32.dtb \
+ bcm94708.dtb \
+ bcm94709.dtb \
+diff --git a/arch/arm/boot/dts/aspeed-ast2500-evb.dts b/arch/arm/boot/dts/aspeed-ast2500-evb.dts
+index 1d24b394ea4c3..a497dd135491b 100644
+--- a/arch/arm/boot/dts/aspeed-ast2500-evb.dts
++++ b/arch/arm/boot/dts/aspeed-ast2500-evb.dts
+@@ -5,7 +5,7 @@
+
+ / {
+ model = "AST2500 EVB";
+- compatible = "aspeed,ast2500";
++ compatible = "aspeed,ast2500-evb", "aspeed,ast2500";
+
+ aliases {
+ serial4 = &uart5;
+diff --git a/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts b/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts
+index dd7148060c4a3..d0a5c2ff0fec4 100644
+--- a/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts
++++ b/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts
+@@ -5,6 +5,7 @@
+
+ / {
+ model = "AST2600 A1 EVB";
++ compatible = "aspeed,ast2600-evb-a1", "aspeed,ast2600";
+
+ /delete-node/regulator-vcc-sdhci0;
+ /delete-node/regulator-vcc-sdhci1;
+diff --git a/arch/arm/boot/dts/aspeed-ast2600-evb.dts b/arch/arm/boot/dts/aspeed-ast2600-evb.dts
+index 788448cdd6b3f..b8e55bf167aa8 100644
+--- a/arch/arm/boot/dts/aspeed-ast2600-evb.dts
++++ b/arch/arm/boot/dts/aspeed-ast2600-evb.dts
+@@ -8,7 +8,7 @@
+
+ / {
+ model = "AST2600 EVB";
+- compatible = "aspeed,ast2600";
++ compatible = "aspeed,ast2600-evb-a1", "aspeed,ast2600";
+
+ aliases {
+ serial4 = &uart5;
+diff --git a/arch/arm/boot/dts/bcm53015-meraki-mr26.dts b/arch/arm/boot/dts/bcm53015-meraki-mr26.dts
+new file mode 100644
+index 0000000000000..14f58033efeb9
+--- /dev/null
++++ b/arch/arm/boot/dts/bcm53015-meraki-mr26.dts
+@@ -0,0 +1,166 @@
++// SPDX-License-Identifier: GPL-2.0-or-later OR MIT
++/*
++ * Broadcom BCM470X / BCM5301X ARM platform code.
++ * DTS for Meraki MR26 / Codename: Venom
++ *
++ * Copyright (C) 2022 Christian Lamparter <chunkeey@gmail.com>
++ */
++
++/dts-v1/;
++
++#include "bcm4708.dtsi"
++#include "bcm5301x-nand-cs0-bch8.dtsi"
++#include <dt-bindings/leds/common.h>
++
++/ {
++ compatible = "meraki,mr26", "brcm,bcm53015", "brcm,bcm4708";
++ model = "Meraki MR26";
++
++ memory@0 {
++ reg = <0x00000000 0x08000000>;
++ device_type = "memory";
++ };
++
++ leds {
++ compatible = "gpio-leds";
++
++ led-0 {
++ function = LED_FUNCTION_FAULT;
++ color = <LED_COLOR_ID_AMBER>;
++ gpios = <&chipcommon 13 GPIO_ACTIVE_HIGH>;
++ panic-indicator;
++ };
++ led-1 {
++ function = LED_FUNCTION_INDICATOR;
++ color = <LED_COLOR_ID_WHITE>;
++ gpios = <&chipcommon 12 GPIO_ACTIVE_HIGH>;
++ };
++ };
++
++ keys {
++ compatible = "gpio-keys";
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ key-restart {
++ label = "Reset";
++ linux,code = <KEY_RESTART>;
++ gpios = <&chipcommon 11 GPIO_ACTIVE_LOW>;
++ };
++ };
++};
++
++&uart0 {
++ clock-frequency = <50000000>;
++ /delete-property/ clocks;
++};
++
++&uart1 {
++ status = "disabled";
++};
++
++&gmac0 {
++ status = "okay";
++};
++
++&gmac1 {
++ status = "disabled";
++};
++&gmac2 {
++ status = "disabled";
++};
++&gmac3 {
++ status = "disabled";
++};
++
++&nandcs {
++ nand-ecc-algo = "hw";
++
++ partitions {
++ compatible = "fixed-partitions";
++ #address-cells = <0x1>;
++ #size-cells = <0x1>;
++
++ partition@0 {
++ label = "u-boot";
++ reg = <0x0 0x200000>;
++ read-only;
++ };
++
++ partition@200000 {
++ label = "u-boot-env";
++ reg = <0x200000 0x200000>;
++ /* empty */
++ };
++
++ partition@400000 {
++ label = "u-boot-backup";
++ reg = <0x400000 0x200000>;
++ /* empty */
++ };
++
++ partition@600000 {
++ label = "u-boot-env-backup";
++ reg = <0x600000 0x200000>;
++ /* empty */
++ };
++
++ partition@800000 {
++ label = "ubi";
++ reg = <0x800000 0x7780000>;
++ };
++ };
++};
++
++&srab {
++ status = "okay";
++
++ ports {
++ port@0 {
++ reg = <0>;
++ label = "poe";
++ };
++
++ port@5 {
++ reg = <5>;
++ label = "cpu";
++ ethernet = <&gmac0>;
++
++ fixed-link {
++ speed = <1000>;
++ duplex-full;
++ };
++ };
++ };
++};
++
++&i2c0 {
++ status = "okay";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&pinmux_i2c>;
++
++ clock-frequency = <100000>;
++
++ ina219@40 {
++ compatible = "ti,ina219"; /* PoE power */
++ reg = <0x40>;
++ shunt-resistor = <60000>; /* = 60 mOhms */
++ };
++
++ eeprom@56 {
++ compatible = "atmel,24c64";
++ reg = <0x56>;
++ pagesize = <32>;
++ read-only;
++ #address-cells = <1>;
++ #size-cells = <1>;
++
++ /* it's empty */
++ };
++};
++
++&thermal {
++ status = "disabled";
++ /* does not work, reads 418 degree Celsius */
++};
+diff --git a/arch/arm/boot/dts/imx6qdl-apalis.dtsi b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
+index bd763bae596b0..da919d0544a80 100644
+--- a/arch/arm/boot/dts/imx6qdl-apalis.dtsi
++++ b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
+@@ -315,7 +315,7 @@
+ /* ADC conversion time: 80 clocks */
+ st,sample-time = <4>;
+
+- stmpe_touchscreen: stmpe-touchscreen {
++ stmpe_touchscreen: stmpe_touchscreen {
+ compatible = "st,stmpe-ts";
+ /* 8 sample average control */
+ st,ave-ctrl = <3>;
+@@ -332,7 +332,7 @@
+ st,touch-det-delay = <5>;
+ };
+
+- stmpe_adc: stmpe-adc {
++ stmpe_adc: stmpe_adc {
+ compatible = "st,stmpe-adc";
+ /* forbid to use ADC channels 3-0 (touch) */
+ st,norequest-mask = <0x0F>;
+diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
+index afeec01f65228..eca8bf89ab88f 100644
+--- a/arch/arm/boot/dts/imx6ul.dtsi
++++ b/arch/arm/boot/dts/imx6ul.dtsi
+@@ -64,20 +64,18 @@
+ clock-frequency = <696000000>;
+ clock-latency = <61036>; /* two CLK32 periods */
+ #cooling-cells = <2>;
+- operating-points = <
++ operating-points =
+ /* kHz uV */
+- 696000 1275000
+- 528000 1175000
+- 396000 1025000
+- 198000 950000
+- >;
+- fsl,soc-operating-points = <
++ <696000 1275000>,
++ <528000 1175000>,
++ <396000 1025000>,
++ <198000 950000>;
++ fsl,soc-operating-points =
+ /* KHz uV */
+- 696000 1275000
+- 528000 1175000
+- 396000 1175000
+- 198000 1175000
+- >;
++ <696000 1275000>,
++ <528000 1175000>,
++ <396000 1175000>,
++ <198000 1175000>;
+ clocks = <&clks IMX6UL_CLK_ARM>,
+ <&clks IMX6UL_CLK_PLL2_BUS>,
+ <&clks IMX6UL_CLK_PLL2_PFD2>,
+@@ -149,6 +147,9 @@
+ ocram: sram@900000 {
+ compatible = "mmio-sram";
+ reg = <0x00900000 0x20000>;
++ ranges = <0 0x00900000 0x20000>;
++ #address-cells = <1>;
++ #size-cells = <1>;
+ };
+
+ intc: interrupt-controller@a01000 {
+@@ -543,7 +544,7 @@
+ };
+
+ kpp: keypad@20b8000 {
+- compatible = "fsl,imx6ul-kpp", "fsl,imx6q-kpp", "fsl,imx21-kpp";
++ compatible = "fsl,imx6ul-kpp", "fsl,imx21-kpp";
+ reg = <0x020b8000 0x4000>;
+ interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&clks IMX6UL_CLK_KPP>;
+@@ -998,7 +999,7 @@
+ };
+
+ csi: csi@21c4000 {
+- compatible = "fsl,imx6ul-csi", "fsl,imx7-csi";
++ compatible = "fsl,imx6ul-csi";
+ reg = <0x021c4000 0x4000>;
+ interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&clks IMX6UL_CLK_CSI>;
+@@ -1007,7 +1008,7 @@
+ };
+
+ lcdif: lcdif@21c8000 {
+- compatible = "fsl,imx6ul-lcdif", "fsl,imx28-lcdif";
++ compatible = "fsl,imx6ul-lcdif", "fsl,imx6sx-lcdif";
+ reg = <0x021c8000 0x4000>;
+ interrupts = <GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&clks IMX6UL_CLK_LCDIF_PIX>,
+@@ -1028,7 +1029,7 @@
+ qspi: spi@21e0000 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+- compatible = "fsl,imx6ul-qspi", "fsl,imx6sx-qspi";
++ compatible = "fsl,imx6ul-qspi";
+ reg = <0x021e0000 0x4000>, <0x60000000 0x10000000>;
+ reg-names = "QuadSPI", "QuadSPI-memory";
+ interrupts = <GIC_SPI 107 IRQ_TYPE_LEVEL_HIGH>;
+diff --git a/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi b/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi
+index af39e5370fa12..045e4413d3390 100644
+--- a/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi
++++ b/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi
+@@ -13,6 +13,10 @@
+ };
+ };
+
++&cpu1 {
++ cpu-supply = <®_DCDC2>;
++};
++
+ &gpio6 {
+ gpio-line-names = "",
+ "",
+diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi
+index a1c8ae516d217..33a4d3441959a 100644
+--- a/arch/arm/boot/dts/qcom-apq8064.dtsi
++++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
+@@ -227,7 +227,7 @@
+ smd {
+ compatible = "qcom,smd";
+
+- modem@0 {
++ modem-edge {
+ interrupts = <0 37 IRQ_TYPE_EDGE_RISING>;
+
+ qcom,ipc = <&l2cc 8 3>;
+@@ -236,7 +236,7 @@
+ status = "disabled";
+ };
+
+- q6@1 {
++ q6-edge {
+ interrupts = <0 90 IRQ_TYPE_EDGE_RISING>;
+
+ qcom,ipc = <&l2cc 8 15>;
+@@ -245,7 +245,7 @@
+ status = "disabled";
+ };
+
+- dsps@3 {
++ dsps-edge {
+ interrupts = <0 138 IRQ_TYPE_EDGE_RISING>;
+
+ qcom,ipc = <&sps_sic_non_secure 0x4080 0>;
+@@ -254,7 +254,7 @@
+ status = "disabled";
+ };
+
+- riva@6 {
++ riva-edge {
+ interrupts = <0 198 IRQ_TYPE_EDGE_RISING>;
+
+ qcom,ipc = <&l2cc 8 25>;
+diff --git a/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts b/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
+index 83793b835d40b..f47020cf7a905 100644
+--- a/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
++++ b/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
+@@ -16,331 +16,320 @@
+ chosen {
+ stdout-path = "serial0:115200n8";
+ };
++};
++
++&blsp1_uart2 {
++ status = "okay";
++};
++
++&blsp2_i2c5 {
++ status = "okay";
++ clock-frequency = <200000>;
+
+- soc {
+- serial@f991e000 {
++ pinctrl-0 = <&i2c11_pins>;
++ pinctrl-names = "default";
++
++ eeprom: eeprom@52 {
++ compatible = "atmel,24c128";
++ reg = <0x52>;
++ pagesize = <32>;
++ read-only;
++ };
++};
++
++&otg {
++ status = "okay";
++
++ phys = <&usb_hs2_phy>;
++ phy-select = <&tcsr 0xb000 1>;
++ extcon = <&smbb>, <&usb_id>;
++ vbus-supply = <&chg_otg>;
++ hnp-disable;
++ srp-disable;
++ adp-disable;
++
++ ulpi {
++ phy@b {
+ status = "okay";
++ v3p3-supply = <&pm8941_l24>;
++ v1p8-supply = <&pm8941_l6>;
++ extcon = <&smbb>;
++ qcom,init-seq = /bits/ 8 <0x1 0x63>;
+ };
++ };
++};
+
+- sdhci@f9824900 {
+- bus-width = <8>;
+- non-removable;
+- status = "okay";
++&rpm_requests {
++ pm8841-regulators {
++ compatible = "qcom,rpm-pm8841-regulators";
+
+- vmmc-supply = <&pm8941_l20>;
+- vqmmc-supply = <&pm8941_s3>;
++ pm8841_s1: s1 {
++ regulator-min-microvolt = <675000>;
++ regulator-max-microvolt = <1050000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc1_pin_a>;
++ pm8841_s2: s2 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
+ };
+
+- sdhci@f98a4900 {
+- cd-gpios = <&msmgpio 62 0x1>;
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
+- bus-width = <4>;
+- status = "okay";
++ pm8841_s3: s3 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
++ };
+
+- vmmc-supply = <&pm8941_l21>;
+- vqmmc-supply = <&pm8941_l13>;
++ pm8841_s4: s4 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
+ };
++ };
+
+- usb@f9a55000 {
+- status = "okay";
+- phys = <&usb_hs2_phy>;
+- phy-select = <&tcsr 0xb000 1>;
+- extcon = <&smbb>, <&usb_id>;
+- vbus-supply = <&chg_otg>;
+- hnp-disable;
+- srp-disable;
+- adp-disable;
+- ulpi {
+- phy@b {
+- status = "okay";
+- v3p3-supply = <&pm8941_l24>;
+- v1p8-supply = <&pm8941_l6>;
+- extcon = <&smbb>;
+- qcom,init-seq = /bits/ 8 <0x1 0x63>;
+- };
+- };
+- };
+-
+-
+- pinctrl@fd510000 {
+- i2c11_pins: i2c11 {
+- mux {
+- pins = "gpio83", "gpio84";
+- function = "blsp_i2c11";
+- };
+- };
+-
+- spi8_default: spi8_default {
+- mosi {
+- pins = "gpio45";
+- function = "blsp_spi8";
+- };
+- miso {
+- pins = "gpio46";
+- function = "blsp_spi8";
+- };
+- cs {
+- pins = "gpio47";
+- function = "blsp_spi8";
+- };
+- clk {
+- pins = "gpio48";
+- function = "blsp_spi8";
+- };
+- };
+-
+- sdhc1_pin_a: sdhc1-pin-active {
+- clk {
+- pins = "sdc1_clk";
+- drive-strength = <16>;
+- bias-disable;
+- };
+-
+- cmd-data {
+- pins = "sdc1_cmd", "sdc1_data";
+- drive-strength = <10>;
+- bias-pull-up;
+- };
+- };
+-
+- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
+- pins = "gpio62";
+- function = "gpio";
+-
+- drive-strength = <2>;
+- bias-disable;
+- };
+-
+- sdhc2_pin_a: sdhc2-pin-active {
+- clk {
+- pins = "sdc2_clk";
+- drive-strength = <10>;
+- bias-disable;
+- };
+-
+- cmd-data {
+- pins = "sdc2_cmd", "sdc2_data";
+- drive-strength = <6>;
+- bias-pull-up;
+- };
+- };
+- };
+-
+- i2c@f9967000 {
+- status = "okay";
+- clock-frequency = <200000>;
+- pinctrl-0 = <&i2c11_pins>;
+- pinctrl-names = "default";
++ pm8941-regulators {
++ compatible = "qcom,rpm-pm8941-regulators";
++
++ vdd_l1_l3-supply = <&pm8941_s1>;
++ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
++ vdd_l4_l11-supply = <&pm8941_s1>;
++ vdd_l5_l7-supply = <&pm8941_s2>;
++ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
++ vin_5vs-supply = <&pm8941_5v>;
++
++ pm8941_s1: s1 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_s2: s2 {
++ regulator-min-microvolt = <2150000>;
++ regulator-max-microvolt = <2150000>;
++ regulator-boot-on;
++ };
+
+- eeprom: eeprom@52 {
+- compatible = "atmel,24c128";
+- reg = <0x52>;
+- pagesize = <32>;
+- read-only;
+- };
++ pm8941_s3: s3 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
+ };
++
++ pm8941_l1: l1 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_l2: l2 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
++
++ pm8941_l3: l3 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
++
++ pm8941_l4: l4 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
++
++ pm8941_l5: l5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l6: l6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l7: l7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l8: l8 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l9: l9 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
++
++ pm8941_l10: l10 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ };
++
++ pm8941_l11: l11 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ };
++
++ pm8941_l12: l12 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_l13: l13 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l14: l14 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l15: l15 {
++ regulator-min-microvolt = <2050000>;
++ regulator-max-microvolt = <2050000>;
++ };
++
++ pm8941_l16: l16 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
++
++ pm8941_l17: l17 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
++
++ pm8941_l18: l18 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++
++ pm8941_l19: l19 {
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
++ regulator-always-on;
++ };
++
++ pm8941_l20: l20 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-system-load = <200000>;
++ regulator-allow-set-load;
++ regulator-boot-on;
++ };
++
++ pm8941_l21: l21 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l22: l22 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3000000>;
++ };
++
++ pm8941_l23: l23 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3000000>;
++ };
++
++ pm8941_l24: l24 {
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
++ regulator-boot-on;
++ };
++ };
++};
++
++&sdhc_1 {
++ status = "okay";
++
++ vmmc-supply = <&pm8941_l20>;
++ vqmmc-supply = <&pm8941_s3>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc1_pin_a>;
++};
++
++&sdhc_2 {
++ status = "okay";
++ cd-gpios = <&tlmm 62 0x1>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
++
++ vmmc-supply = <&pm8941_l21>;
++ vqmmc-supply = <&pm8941_l13>;
++};
++
++&tlmm {
++ i2c11_pins: i2c11 {
++ mux {
++ pins = "gpio83", "gpio84";
++ function = "blsp_i2c11";
++ };
++ };
++
++ spi8_default: spi8_default {
++ mosi {
++ pins = "gpio45";
++ function = "blsp_spi8";
++ };
++ miso {
++ pins = "gpio46";
++ function = "blsp_spi8";
++ };
++ cs {
++ pins = "gpio47";
++ function = "blsp_spi8";
++ };
++ clk {
++ pins = "gpio48";
++ function = "blsp_spi8";
++ };
++ };
++
++ sdhc1_pin_a: sdhc1-pin-active {
++ clk {
++ pins = "sdc1_clk";
++ drive-strength = <16>;
++ bias-disable;
++ };
++
++ cmd-data {
++ pins = "sdc1_cmd", "sdc1_data";
++ drive-strength = <10>;
++ bias-pull-up;
++ };
++ };
++
++ sdhc2_cd_pin_a: sdhc2-cd-pin-active {
++ pins = "gpio62";
++ function = "gpio";
++
++ drive-strength = <2>;
++ bias-disable;
+ };
+
+- smd {
+- rpm {
+- rpm_requests {
+- pm8841-regulators {
+- s1 {
+- regulator-min-microvolt = <675000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+- };
+-
+- pm8941-regulators {
+- vdd_l1_l3-supply = <&pm8941_s1>;
+- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+- vdd_l4_l11-supply = <&pm8941_s1>;
+- vdd_l5_l7-supply = <&pm8941_s2>;
+- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+- vin_5vs-supply = <&pm8941_5v>;
+-
+- s1 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <2150000>;
+- regulator-max-microvolt = <2150000>;
+- regulator-boot-on;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l3 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-always-on;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l19 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- regulator-always-on;
+- };
+-
+- l20 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-allow-set-load;
+- regulator-boot-on;
+- regulator-system-load = <200000>;
+- };
+-
+- l21 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l22 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l23 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l24 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+-
+- regulator-boot-on;
+- };
+- };
+- };
++ sdhc2_pin_a: sdhc2-pin-active {
++ clk {
++ pins = "sdc2_clk";
++ drive-strength = <10>;
++ bias-disable;
++ };
++
++ cmd-data {
++ pins = "sdc2_cmd", "sdc2_data";
++ drive-strength = <6>;
++ bias-pull-up;
+ };
+ };
+ };
+diff --git a/arch/arm/boot/dts/qcom-apq8084.dtsi b/arch/arm/boot/dts/qcom-apq8084.dtsi
+index 52240fc7a1a69..da50a1a0197f9 100644
+--- a/arch/arm/boot/dts/qcom-apq8084.dtsi
++++ b/arch/arm/boot/dts/qcom-apq8084.dtsi
+@@ -470,7 +470,7 @@
+ qcom,ipc = <&apcs 8 0>;
+ qcom,smd-edge = <15>;
+
+- rpm_requests {
++ rpm-requests {
+ compatible = "qcom,rpm-apq8084";
+ qcom,smd-channels = "rpm_requests";
+
+diff --git a/arch/arm/boot/dts/qcom-mdm9615.dtsi b/arch/arm/boot/dts/qcom-mdm9615.dtsi
+index 4d4f37cebf219..96a66862875c0 100644
+--- a/arch/arm/boot/dts/qcom-mdm9615.dtsi
++++ b/arch/arm/boot/dts/qcom-mdm9615.dtsi
+@@ -321,6 +321,7 @@
+
+ pmicgpio: gpio@150 {
+ compatible = "qcom,pm8018-gpio", "qcom,ssbi-gpio";
++ reg = <0x150>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ gpio-controller;
+diff --git a/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts b/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts
+index 6d77e0f8ca4d4..0855911835921 100644
+--- a/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts
++++ b/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts
+@@ -5,7 +5,6 @@
+ #include <dt-bindings/input/input.h>
+ #include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
+
+-
+ / {
+ model = "Fairphone 2";
+ compatible = "fairphone,fp2", "qcom,msm8974";
+@@ -51,359 +50,358 @@
+
+ vibrator {
+ compatible = "gpio-vibrator";
+- enable-gpios = <&msmgpio 86 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 86 GPIO_ACTIVE_HIGH>;
+ vcc-supply = <&pm8941_l18>;
+ };
++};
++
++&blsp1_uart2 {
++ status = "okay";
++};
++
++&imem {
++ status = "okay";
++
++ reboot-mode {
++ mode-normal = <0x77665501>;
++ mode-bootloader = <0x77665500>;
++ mode-recovery = <0x77665502>;
++ };
++};
++
++&otg {
++ status = "okay";
++
++ phys = <&usb_hs1_phy>;
++ phy-select = <&tcsr 0xb000 0>;
++ extcon = <&smbb>, <&usb_id>;
++ vbus-supply = <&chg_otg>;
++
++ hnp-disable;
++ srp-disable;
++ adp-disable;
+
+- smd {
+- rpm {
+- rpm_requests {
+- pm8841-regulators {
+- s1 {
+- regulator-min-microvolt = <675000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1050000>;
+- regulator-max-microvolt = <1050000>;
+- };
+- };
+-
+- pm8941-regulators {
+- vdd_l1_l3-supply = <&pm8941_s1>;
+- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+- vdd_l4_l11-supply = <&pm8941_s1>;
+- vdd_l5_l7-supply = <&pm8941_s2>;
+- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+- vdd_l21-supply = <&vreg_boost>;
+-
+- s1 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <2150000>;
+- regulator-max-microvolt = <2150000>;
+-
+- regulator-boot-on;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l3 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1350000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l19 {
+- regulator-min-microvolt = <2900000>;
+- regulator-max-microvolt = <3350000>;
+- };
+-
+- l20 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- regulator-system-load = <200000>;
+- regulator-allow-set-load;
+- };
+-
+- l21 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l22 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l23 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l24 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+-
+- regulator-boot-on;
+- };
+- };
+- };
++ ulpi {
++ phy@a {
++ status = "okay";
++
++ v1p8-supply = <&pm8941_l6>;
++ v3p3-supply = <&pm8941_l24>;
++
++ extcon = <&smbb>;
++ qcom,init-seq = /bits/ 8 <0x1 0x64>;
+ };
+ };
+ };
+
+-&soc {
+- serial@f991e000 {
+- status = "okay";
++&pm8941_gpios {
++ gpio_keys_pin_a: gpio-keys-active {
++ pins = "gpio1", "gpio2", "gpio5";
++ function = "normal";
++
++ bias-pull-up;
++ power-source = <PM8941_GPIO_S3>;
+ };
++};
+
+- remoteproc@fb21b000 {
+- status = "okay";
++&pronto {
++ status = "okay";
+
+- vddmx-supply = <&pm8841_s1>;
+- vddcx-supply = <&pm8841_s2>;
++ vddmx-supply = <&pm8841_s1>;
++ vddcx-supply = <&pm8841_s2>;
++ vddpx-supply = <&pm8941_s3>;
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&wcnss_pin_a>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&wcnss_pin_a>;
+
+- smd-edge {
+- qcom,remote-pid = <4>;
+- label = "pronto";
++ iris {
++ vddxo-supply = <&pm8941_l6>;
++ vddrfa-supply = <&pm8941_l11>;
++ vddpa-supply = <&pm8941_l19>;
++ vdddig-supply = <&pm8941_s3>;
++ };
++
++ smd-edge {
++ qcom,remote-pid = <4>;
++ label = "pronto";
+
+- wcnss {
+- status = "okay";
+- };
++ wcnss {
++ status = "okay";
+ };
+ };
++};
+
+- pinctrl@fd510000 {
+- sdhc1_pin_a: sdhc1-pin-active {
+- clk {
+- pins = "sdc1_clk";
+- drive-strength = <16>;
+- bias-disable;
+- };
++&remoteproc_adsp {
++ status = "okay";
++ cx-supply = <&pm8841_s2>;
++};
+
+- cmd-data {
+- pins = "sdc1_cmd", "sdc1_data";
+- drive-strength = <10>;
+- bias-pull-up;
+- };
++&remoteproc_mss {
++ status = "okay";
++ cx-supply = <&pm8841_s2>;
++ mss-supply = <&pm8841_s3>;
++ mx-supply = <&pm8841_s1>;
++ pll-supply = <&pm8941_l12>;
++};
++
++&rpm_requests {
++ pm8841-regulators {
++ compatible = "qcom,rpm-pm8841-regulators";
++
++ pm8841_s1: s1 {
++ regulator-min-microvolt = <675000>;
++ regulator-max-microvolt = <1050000>;
+ };
+
+- sdhc2_pin_a: sdhc2-pin-active {
+- clk {
+- pins = "sdc2_clk";
+- drive-strength = <10>;
+- bias-disable;
+- };
++ pm8841_s2: s2 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
++ };
+
+- cmd-data {
+- pins = "sdc2_cmd", "sdc2_data";
+- drive-strength = <6>;
+- bias-pull-up;
+- };
++ pm8841_s3: s3 {
++ regulator-min-microvolt = <1050000>;
++ regulator-max-microvolt = <1050000>;
+ };
++ };
+
+- wcnss_pin_a: wcnss-pin-active {
+- wlan {
+- pins = "gpio36", "gpio37", "gpio38", "gpio39", "gpio40";
+- function = "wlan";
++ pm8941-regulators {
++ compatible = "qcom,rpm-pm8941-regulators";
++
++ vdd_l1_l3-supply = <&pm8941_s1>;
++ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
++ vdd_l4_l11-supply = <&pm8941_s1>;
++ vdd_l5_l7-supply = <&pm8941_s2>;
++ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
++ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
++ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
++ vdd_l21-supply = <&vreg_boost>;
++
++ pm8941_s1: s1 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
+
+- drive-strength = <6>;
+- bias-pull-down;
+- };
++ pm8941_s2: s2 {
++ regulator-min-microvolt = <2150000>;
++ regulator-max-microvolt = <2150000>;
++ regulator-boot-on;
++ };
+
+- bt {
+- pins = "gpio35", "gpio43", "gpio44";
+- function = "bt";
++ pm8941_s3: s3 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
+
+- drive-strength = <2>;
+- bias-pull-down;
+- };
++ pm8941_l1: l1 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
+
+- fm {
+- pins = "gpio41", "gpio42";
+- function = "fm";
++ pm8941_l2: l2 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
+
+- drive-strength = <2>;
+- bias-pull-down;
+- };
++ pm8941_l3: l3 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
+ };
+- };
+
+- sdhci@f9824900 {
+- status = "okay";
++ pm8941_l4: l4 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
+
+- vmmc-supply = <&pm8941_l20>;
+- vqmmc-supply = <&pm8941_s3>;
++ pm8941_l5: l5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- bus-width = <8>;
+- non-removable;
++ pm8941_l6: l6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc1_pin_a>;
+- };
++ pm8941_l7: l7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
+
+- sdhci@f98a4900 {
+- status = "okay";
++ pm8941_l8: l8 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- vmmc-supply = <&pm8941_l21>;
+- vqmmc-supply = <&pm8941_l13>;
++ pm8941_l9: l9 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- bus-width = <4>;
++ pm8941_l10: l10 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc2_pin_a>;
++ pm8941_l11: l11 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1350000>;
++ };
++
++ pm8941_l12: l12 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_l13: l13 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l14: l14 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l15: l15 {
++ regulator-min-microvolt = <2050000>;
++ regulator-max-microvolt = <2050000>;
++ };
++
++ pm8941_l16: l16 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
++
++ pm8941_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++
++ pm8941_l18: l18 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++
++ pm8941_l19: l19 {
++ regulator-min-microvolt = <2900000>;
++ regulator-max-microvolt = <3350000>;
++ };
++
++ pm8941_l20: l20 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-system-load = <200000>;
++ regulator-allow-set-load;
++ regulator-boot-on;
++ };
++
++ pm8941_l21: l21 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l22: l22 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3300000>;
++ };
++
++ pm8941_l23: l23 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3000000>;
++ };
++
++ pm8941_l24: l24 {
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
++ regulator-boot-on;
++ };
+ };
++};
++
++&sdhc_1 {
++ status = "okay";
+
+- usb@f9a55000 {
+- status = "okay";
++ vmmc-supply = <&pm8941_l20>;
++ vqmmc-supply = <&pm8941_s3>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc1_pin_a>;
++};
+
+- phys = <&usb_hs1_phy>;
+- phy-select = <&tcsr 0xb000 0>;
+- extcon = <&smbb>, <&usb_id>;
+- vbus-supply = <&chg_otg>;
++&sdhc_2 {
++ status = "okay";
+
+- hnp-disable;
+- srp-disable;
+- adp-disable;
++ vmmc-supply = <&pm8941_l21>;
++ vqmmc-supply = <&pm8941_l13>;
+
+- ulpi {
+- phy@a {
+- status = "okay";
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc2_pin_a>;
++};
+
+- v1p8-supply = <&pm8941_l6>;
+- v3p3-supply = <&pm8941_l24>;
++&tlmm {
++ sdhc1_pin_a: sdhc1-pin-active {
++ clk {
++ pins = "sdc1_clk";
++ drive-strength = <16>;
++ bias-disable;
++ };
+
+- extcon = <&smbb>;
+- qcom,init-seq = /bits/ 8 <0x1 0x64>;
+- };
++ cmd-data {
++ pins = "sdc1_cmd", "sdc1_data";
++ drive-strength = <10>;
++ bias-pull-up;
+ };
+ };
+
+- imem@fe805000 {
+- status = "okay";
++ sdhc2_pin_a: sdhc2-pin-active {
++ clk {
++ pins = "sdc2_clk";
++ drive-strength = <10>;
++ bias-disable;
++ };
+
+- reboot-mode {
+- mode-normal = <0x77665501>;
+- mode-bootloader = <0x77665500>;
+- mode-recovery = <0x77665502>;
++ cmd-data {
++ pins = "sdc2_cmd", "sdc2_data";
++ drive-strength = <6>;
++ bias-pull-up;
+ };
+ };
+-};
+
+-&spmi_bus {
+- pm8941@0 {
+- gpios@c000 {
+- gpio_keys_pin_a: gpio-keys-active {
+- pins = "gpio1", "gpio2", "gpio5";
+- function = "normal";
++ wcnss_pin_a: wcnss-pin-active {
++ wlan {
++ pins = "gpio36", "gpio37", "gpio38", "gpio39", "gpio40";
++ function = "wlan";
++
++ drive-strength = <6>;
++ bias-pull-down;
++ };
++
++ bt {
++ pins = "gpio35", "gpio43", "gpio44";
++ function = "bt";
++
++ drive-strength = <2>;
++ bias-pull-down;
++ };
++
++ fm {
++ pins = "gpio41", "gpio42";
++ function = "fm";
+
+- bias-pull-up;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ drive-strength = <2>;
++ bias-pull-down;
+ };
+ };
+ };
+diff --git a/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts b/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts
+index 0691361701985..6537950c30bae 100644
+--- a/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts
++++ b/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts
+@@ -12,213 +12,31 @@
+
+ aliases {
+ serial0 = &blsp1_uart1;
+- serial1 = &blsp2_uart10;
++ serial1 = &blsp2_uart4;
+ };
+
+ chosen {
+ stdout-path = "serial0:115200n8";
+ };
+
+- smd {
+- rpm {
+- rpm_requests {
+- pm8841-regulators {
+- s1 {
+- regulator-min-microvolt = <675000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1050000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <815000>;
+- regulator-max-microvolt = <900000>;
+- };
+- };
+-
+- pm8941-regulators {
+- vdd_l1_l3-supply = <&pm8941_s1>;
+- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+- vdd_l4_l11-supply = <&pm8941_s1>;
+- vdd_l5_l7-supply = <&pm8941_s2>;
+- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+- vdd_l8_l16_l18_l19-supply = <&vreg_vph_pwr>;
+- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+- vdd_l21-supply = <&vreg_boost>;
+-
+- s1 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <2150000>;
+- regulator-max-microvolt = <2150000>;
+-
+- regulator-boot-on;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l3 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l19 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l20 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- regulator-system-load = <200000>;
+- regulator-allow-set-load;
+- };
+-
+- l21 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l22 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l23 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l24 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+-
+- regulator-boot-on;
+- };
+- };
+- };
++ gpio-keys {
++ compatible = "gpio-keys";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&gpio_keys_pin_a>;
++
++ volume-up {
++ label = "volume_up";
++ gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_VOLUMEUP>;
++ };
++
++ volume-down {
++ label = "volume_down";
++ gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_VOLUMEDOWN>;
+ };
+ };
+
+@@ -229,7 +47,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 26 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 26 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -237,525 +55,676 @@
+ };
+ };
+
+-&soc {
+- serial@f991d000 {
+- status = "okay";
++&blsp1_i2c1 {
++ status = "okay";
++ clock-frequency = <100000>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c1_pins>;
++
++ charger: bq24192@6b {
++ compatible = "ti,bq24192";
++ reg = <0x6b>;
++ interrupts-extended = <&spmi_bus 0 0xd5 0 IRQ_TYPE_EDGE_FALLING>;
++
++ omit-battery-class;
++
++ usb_otg_vbus: usb-otg-vbus { };
+ };
+
+- pinctrl@fd510000 {
+- sdhc1_pin_a: sdhc1-pin-active {
+- clk {
+- pins = "sdc1_clk";
+- drive-strength = <16>;
+- bias-disable;
+- };
++ fuelgauge: max17048@36 {
++ compatible = "maxim,max17048";
++ reg = <0x36>;
+
+- cmd-data {
+- pins = "sdc1_cmd", "sdc1_data";
+- drive-strength = <10>;
+- bias-pull-up;
+- };
+- };
++ maxim,double-soc;
++ maxim,rcomp = /bits/ 8 <0x4d>;
+
+- sdhc2_pin_a: sdhc2-pin-active {
+- clk {
+- pins = "sdc2_clk";
+- drive-strength = <6>;
+- bias-disable;
+- };
++ interrupt-parent = <&tlmm>;
++ interrupts = <9 IRQ_TYPE_LEVEL_LOW>;
+
+- cmd-data {
+- pins = "sdc2_cmd", "sdc2_data";
+- drive-strength = <6>;
+- bias-pull-up;
+- };
+- };
++ pinctrl-names = "default";
++ pinctrl-0 = <&fuelgauge_pin>;
+
+- i2c1_pins: i2c1 {
+- mux {
+- pins = "gpio2", "gpio3";
+- function = "blsp_i2c1";
++ maxim,alert-low-soc-level = <2>;
++ };
++};
+
+- drive-strength = <2>;
+- bias-disable;
+- };
+- };
++&blsp1_i2c2 {
++ status = "okay";
++ clock-frequency = <355000>;
+
+- i2c2_pins: i2c2 {
+- mux {
+- pins = "gpio6", "gpio7";
+- function = "blsp_i2c2";
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c2_pins>;
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ synaptics@70 {
++ compatible = "syna,rmi4-i2c";
++ reg = <0x70>;
++
++ interrupts-extended = <&tlmm 5 IRQ_TYPE_EDGE_FALLING>;
++ vdd-supply = <&pm8941_l22>;
++ vio-supply = <&pm8941_lvs3>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&touch_pin>;
++
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ rmi4-f01@1 {
++ reg = <0x1>;
++ syna,nosleep-mode = <1>;
+ };
+
+- i2c3_pins: i2c3 {
+- mux {
+- pins = "gpio10", "gpio11";
+- function = "blsp_i2c3";
+- drive-strength = <2>;
+- bias-disable;
+- };
++ rmi4-f12@12 {
++ reg = <0x12>;
++ syna,sensor-type = <1>;
+ };
++ };
++};
+
+- i2c11_pins: i2c11 {
+- mux {
+- pins = "gpio83", "gpio84";
+- function = "blsp_i2c11";
++&blsp1_i2c3 {
++ status = "okay";
++ clock-frequency = <100000>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c3_pins>;
++
++ avago_apds993@39 {
++ compatible = "avago,apds9930";
++ reg = <0x39>;
++ interrupts-extended = <&tlmm 61 IRQ_TYPE_EDGE_FALLING>;
++ vdd-supply = <&pm8941_l17>;
++ vddio-supply = <&pm8941_lvs1>;
++ led-max-microamp = <100000>;
++ amstaos,proximity-diodes = <0>;
++ };
++};
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++&blsp2_i2c5 {
++ status = "okay";
++ clock-frequency = <355000>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c11_pins>;
++
++ led-controller@38 {
++ compatible = "ti,lm3630a";
++ status = "okay";
++ reg = <0x38>;
++
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ led@0 {
++ reg = <0>;
++ led-sources = <0 1>;
++ label = "lcd-backlight";
++ default-brightness = <200>;
+ };
++ };
++};
++
++&blsp2_i2c6 {
++ status = "okay";
++ clock-frequency = <100000>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c12_pins>;
++
++ mpu6515@68 {
++ compatible = "invensense,mpu6515";
++ reg = <0x68>;
++ interrupts-extended = <&tlmm 73 IRQ_TYPE_EDGE_FALLING>;
++ vddio-supply = <&pm8941_lvs1>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&mpu6515_pin>;
+
+- i2c12_pins: i2c12 {
+- mux {
+- pins = "gpio87", "gpio88";
+- function = "blsp_i2c12";
+- drive-strength = <2>;
+- bias-disable;
++ mount-matrix = "0", "-1", "0",
++ "-1", "0", "0",
++ "0", "0", "1";
++
++ i2c-gate {
++ #address-cells = <1>;
++ #size-cells = <0>;
++ ak8963@f {
++ compatible = "asahi-kasei,ak8963";
++ reg = <0x0f>;
++ gpios = <&tlmm 67 0>;
++ vid-supply = <&pm8941_lvs1>;
++ vdd-supply = <&pm8941_l17>;
+ };
+- };
+
+- mpu6515_pin: mpu6515 {
+- irq {
+- pins = "gpio73";
+- function = "gpio";
+- bias-disable;
+- input-enable;
++ bmp280@76 {
++ compatible = "bosch,bmp280";
++ reg = <0x76>;
++ vdda-supply = <&pm8941_lvs1>;
++ vddd-supply = <&pm8941_l17>;
+ };
+ };
++ };
++};
++
++&blsp1_uart1 {
++ status = "okay";
++};
+
+- touch_pin: touch {
+- int {
+- pins = "gpio5";
+- function = "gpio";
++&blsp2_uart4 {
++ status = "okay";
+
+- drive-strength = <2>;
+- bias-disable;
+- input-enable;
+- };
++ pinctrl-names = "default";
++ pinctrl-0 = <&blsp2_uart4_pin_a>;
+
+- reset {
+- pins = "gpio8";
+- function = "gpio";
++ bluetooth {
++ compatible = "brcm,bcm43438-bt";
++ max-speed = <3000000>;
+
+- drive-strength = <2>;
+- bias-pull-up;
+- };
+- };
++ pinctrl-names = "default";
++ pinctrl-0 = <&bt_pin>;
++
++ host-wakeup-gpios = <&tlmm 42 GPIO_ACTIVE_HIGH>;
++ device-wakeup-gpios = <&tlmm 62 GPIO_ACTIVE_HIGH>;
++ shutdown-gpios = <&tlmm 41 GPIO_ACTIVE_HIGH>;
++ };
++};
+
+- panel_pin: panel {
+- te {
+- pins = "gpio12";
+- function = "mdp_vsync";
++&dsi0 {
++ status = "okay";
+
+- drive-strength = <2>;
+- bias-disable;
+- };
+- };
++ vdda-supply = <&pm8941_l2>;
++ vdd-supply = <&pm8941_lvs3>;
++ vddio-supply = <&pm8941_l12>;
+
+- bt_pin: bt {
+- hostwake {
+- pins = "gpio42";
+- function = "gpio";
+- };
++ panel: panel@0 {
++ reg = <0>;
++ compatible = "lg,acx467akm-7";
+
+- devwake {
+- pins = "gpio62";
+- function = "gpio";
+- };
++ pinctrl-names = "default";
++ pinctrl-0 = <&panel_pin>;
+
+- shutdown {
+- pins = "gpio41";
+- function = "gpio";
++ port {
++ panel_in: endpoint {
++ remote-endpoint = <&dsi0_out>;
+ };
+ };
++ };
++};
+
+- blsp2_uart10_pin_a: blsp2-uart10-pin-active {
+- tx {
+- pins = "gpio53";
+- function = "blsp_uart10";
++&dsi0_out {
++ remote-endpoint = <&panel_in>;
++ data-lanes = <0 1 2 3>;
++};
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++&dsi0_phy {
++ status = "okay";
+
+- rx {
+- pins = "gpio54";
+- function = "blsp_uart10";
++ vddio-supply = <&pm8941_l12>;
++};
+
+- drive-strength = <2>;
+- bias-pull-up;
+- };
++&mdss {
++ status = "okay";
++};
+
+- cts {
+- pins = "gpio55";
+- function = "blsp_uart10";
++&otg {
++ status = "okay";
+
+- drive-strength = <2>;
+- bias-pull-up;
+- };
++ phys = <&usb_hs1_phy>;
++ phy-select = <&tcsr 0xb000 0>;
+
+- rts {
+- pins = "gpio56";
+- function = "blsp_uart10";
++ extcon = <&charger>, <&usb_id>;
++ vbus-supply = <&usb_otg_vbus>;
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ hnp-disable;
++ srp-disable;
++ adp-disable;
++
++ ulpi {
++ phy@a {
++ status = "okay";
++
++ v1p8-supply = <&pm8941_l6>;
++ v3p3-supply = <&pm8941_l24>;
++
++ qcom,init-seq = /bits/ 8 <0x1 0x64>;
+ };
+ };
++};
+
+- sdhci@f9824900 {
+- status = "okay";
++&pm8941_gpios {
++ gpio_keys_pin_a: gpio-keys-active {
++ pins = "gpio2", "gpio3";
++ function = "normal";
+
+- vmmc-supply = <&pm8941_l20>;
+- vqmmc-supply = <&pm8941_s3>;
++ bias-pull-up;
++ power-source = <PM8941_GPIO_S3>;
++ };
+
+- bus-width = <8>;
+- non-removable;
++ fuelgauge_pin: fuelgauge-int {
++ pins = "gpio9";
++ function = "normal";
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc1_pin_a>;
++ bias-disable;
++ input-enable;
++ power-source = <PM8941_GPIO_S3>;
+ };
+
+- sdhci@f98a4900 {
+- status = "okay";
++ wlan_sleep_clk_pin: wl-sleep-clk {
++ pins = "gpio16";
++ function = "func2";
+
+- max-frequency = <100000000>;
+- bus-width = <4>;
+- non-removable;
+- vmmc-supply = <&vreg_wlan>;
+- vqmmc-supply = <&pm8941_s3>;
++ output-high;
++ power-source = <PM8941_GPIO_S3>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc2_pin_a>;
++ wlan_regulator_pin: wl-reg-active {
++ pins = "gpio17";
++ function = "normal";
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++ bias-disable;
++ power-source = <PM8941_GPIO_S3>;
++ };
++
++ otg {
++ gpio-hog;
++ gpios = <35 GPIO_ACTIVE_HIGH>;
++ output-high;
++ line-name = "otg-gpio";
++ };
++};
+
+- bcrmf@1 {
+- compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
+- reg = <1>;
++&rpm_requests {
++ pm8841-regulators {
++ compatible = "qcom,rpm-pm8841-regulators";
+
+- brcm,drive-strength = <10>;
++ pm8841_s1: s1 {
++ regulator-min-microvolt = <675000>;
++ regulator-max-microvolt = <1050000>;
++ };
++
++ pm8841_s2: s2 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&wlan_sleep_clk_pin>;
++ pm8841_s3: s3 {
++ regulator-min-microvolt = <1050000>;
++ regulator-max-microvolt = <1050000>;
++ };
++
++ pm8841_s4: s4 {
++ regulator-min-microvolt = <815000>;
++ regulator-max-microvolt = <900000>;
+ };
+ };
+
+- gpio-keys {
+- compatible = "gpio-keys";
++ pm8941-regulators {
++ compatible = "qcom,rpm-pm8941-regulators";
++
++ vdd_l1_l3-supply = <&pm8941_s1>;
++ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
++ vdd_l4_l11-supply = <&pm8941_s1>;
++ vdd_l5_l7-supply = <&pm8941_s2>;
++ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
++ vdd_l8_l16_l18_l19-supply = <&vreg_vph_pwr>;
++ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
++ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
++ vdd_l21-supply = <&vreg_boost>;
++
++ pm8941_s1: s1 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&gpio_keys_pin_a>;
++ pm8941_s2: s2 {
++ regulator-min-microvolt = <2150000>;
++ regulator-max-microvolt = <2150000>;
++ regulator-boot-on;
++ };
+
+- volume-up {
+- label = "volume_up";
+- gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEUP>;
++ pm8941_s3: s3 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
+ };
+
+- volume-down {
+- label = "volume_down";
+- gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEDOWN>;
++ pm8941_l1: l1 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ regulator-always-on;
++ regulator-boot-on;
+ };
+- };
+
+- serial@f9960000 {
+- status = "okay";
++ pm8941_l2: l2 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&blsp2_uart10_pin_a>;
++ pm8941_l3: l3 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
+
+- bluetooth {
+- compatible = "brcm,bcm43438-bt";
+- max-speed = <3000000>;
++ pm8941_l4: l4 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&bt_pin>;
++ pm8941_l5: l5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- host-wakeup-gpios = <&msmgpio 42 GPIO_ACTIVE_HIGH>;
+- device-wakeup-gpios = <&msmgpio 62 GPIO_ACTIVE_HIGH>;
+- shutdown-gpios = <&msmgpio 41 GPIO_ACTIVE_HIGH>;
++ pm8941_l6: l6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
+ };
+- };
+
+- i2c@f9967000 {
+- status = "okay";
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c11_pins>;
+- clock-frequency = <355000>;
+- qcom,src-freq = <50000000>;
++ pm8941_l7: l7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
+
+- led-controller@38 {
+- compatible = "ti,lm3630a";
+- status = "okay";
+- reg = <0x38>;
++ pm8941_l8: l8 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++ pm8941_l9: l9 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- led@0 {
+- reg = <0>;
+- led-sources = <0 1>;
+- label = "lcd-backlight";
+- default-brightness = <200>;
+- };
++ pm8941_l10: l10 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
+ };
+- };
+
+- i2c@f9968000 {
+- status = "okay";
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c12_pins>;
+- clock-frequency = <100000>;
+- qcom,src-freq = <50000000>;
+-
+- mpu6515@68 {
+- compatible = "invensense,mpu6515";
+- reg = <0x68>;
+- interrupts-extended = <&msmgpio 73 IRQ_TYPE_EDGE_FALLING>;
+- vddio-supply = <&pm8941_lvs1>;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&mpu6515_pin>;
+-
+- mount-matrix = "0", "-1", "0",
+- "-1", "0", "0",
+- "0", "0", "1";
+-
+- i2c-gate {
+- #address-cells = <1>;
+- #size-cells = <0>;
+- ak8963@f {
+- compatible = "asahi-kasei,ak8963";
+- reg = <0x0f>;
+- gpios = <&msmgpio 67 0>;
+- vid-supply = <&pm8941_lvs1>;
+- vdd-supply = <&pm8941_l17>;
+- };
+-
+- bmp280@76 {
+- compatible = "bosch,bmp280";
+- reg = <0x76>;
+- vdda-supply = <&pm8941_lvs1>;
+- vddd-supply = <&pm8941_l17>;
+- };
+- };
++ pm8941_l11: l11 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
+ };
+- };
+
+- i2c@f9923000 {
+- status = "okay";
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c1_pins>;
+- clock-frequency = <100000>;
+- qcom,src-freq = <50000000>;
++ pm8941_l12: l12 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_l13: l13 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l14: l14 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l15: l15 {
++ regulator-min-microvolt = <2050000>;
++ regulator-max-microvolt = <2050000>;
++ };
+
+- charger: bq24192@6b {
+- compatible = "ti,bq24192";
+- reg = <0x6b>;
+- interrupts-extended = <&spmi_bus 0 0xd5 0 IRQ_TYPE_EDGE_FALLING>;
++ pm8941_l16: l16 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
+
+- omit-battery-class;
++ pm8941_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
+
+- usb_otg_vbus: usb-otg-vbus { };
++ pm8941_l18: l18 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
+ };
+
+- fuelgauge: max17048@36 {
+- compatible = "maxim,max17048";
+- reg = <0x36>;
++ pm8941_l19: l19 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3300000>;
++ };
+
+- maxim,double-soc;
+- maxim,rcomp = /bits/ 8 <0x4d>;
++ pm8941_l20: l20 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-system-load = <200000>;
++ regulator-allow-set-load;
++ regulator-boot-on;
++ };
+
+- interrupt-parent = <&msmgpio>;
+- interrupts = <9 IRQ_TYPE_LEVEL_LOW>;
++ pm8941_l21: l21 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&fuelgauge_pin>;
++ pm8941_l22: l22 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3300000>;
++ };
+
+- maxim,alert-low-soc-level = <2>;
++ pm8941_l23: l23 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3000000>;
+ };
++
++ pm8941_l24: l24 {
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
++ regulator-boot-on;
++ };
++
++ pm8941_lvs1: lvs1 {};
++ pm8941_lvs3: lvs3 {};
+ };
++};
+
+- i2c@f9924000 {
+- status = "okay";
++&sdhc_1 {
++ status = "okay";
+
+- clock-frequency = <355000>;
+- qcom,src-freq = <50000000>;
++ vmmc-supply = <&pm8941_l20>;
++ vqmmc-supply = <&pm8941_s3>;
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c2_pins>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc1_pin_a>;
++};
+
+- synaptics@70 {
+- compatible = "syna,rmi4-i2c";
+- reg = <0x70>;
++&sdhc_2 {
++ status = "okay";
+
+- interrupts-extended = <&msmgpio 5 IRQ_TYPE_EDGE_FALLING>;
+- vdd-supply = <&pm8941_l22>;
+- vio-supply = <&pm8941_lvs3>;
++ max-frequency = <100000000>;
++ vmmc-supply = <&vreg_wlan>;
++ vqmmc-supply = <&pm8941_s3>;
++ non-removable;
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&touch_pin>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc2_pin_a>;
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++ #address-cells = <1>;
++ #size-cells = <0>;
+
+- rmi4-f01@1 {
+- reg = <0x1>;
+- syna,nosleep-mode = <1>;
+- };
++ bcrmf@1 {
++ compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
++ reg = <1>;
+
+- rmi4-f12@12 {
+- reg = <0x12>;
+- syna,sensor-type = <1>;
+- };
++ brcm,drive-strength = <10>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&wlan_sleep_clk_pin>;
++ };
++};
++
++&tlmm {
++ sdhc1_pin_a: sdhc1-pin-active {
++ clk {
++ pins = "sdc1_clk";
++ drive-strength = <16>;
++ bias-disable;
++ };
++
++ cmd-data {
++ pins = "sdc1_cmd", "sdc1_data";
++ drive-strength = <10>;
++ bias-pull-up;
+ };
+ };
+
+- i2c@f9925000 {
+- status = "okay";
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c3_pins>;
+- clock-frequency = <100000>;
+- qcom,src-freq = <50000000>;
++ sdhc2_pin_a: sdhc2-pin-active {
++ clk {
++ pins = "sdc2_clk";
++ drive-strength = <6>;
++ bias-disable;
++ };
+
+- avago_apds993@39 {
+- compatible = "avago,apds9930";
+- reg = <0x39>;
+- interrupts-extended = <&msmgpio 61 IRQ_TYPE_EDGE_FALLING>;
+- vdd-supply = <&pm8941_l17>;
+- vddio-supply = <&pm8941_lvs1>;
+- led-max-microamp = <100000>;
+- amstaos,proximity-diodes = <0>;
++ cmd-data {
++ pins = "sdc2_cmd", "sdc2_data";
++ drive-strength = <6>;
++ bias-pull-up;
+ };
+ };
+
+- usb@f9a55000 {
+- status = "okay";
++ i2c1_pins: i2c1 {
++ mux {
++ pins = "gpio2", "gpio3";
++ function = "blsp_i2c1";
+
+- phys = <&usb_hs1_phy>;
+- phy-select = <&tcsr 0xb000 0>;
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
+
+- extcon = <&charger>, <&usb_id>;
+- vbus-supply = <&usb_otg_vbus>;
++ i2c2_pins: i2c2 {
++ mux {
++ pins = "gpio6", "gpio7";
++ function = "blsp_i2c2";
+
+- hnp-disable;
+- srp-disable;
+- adp-disable;
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
+
+- ulpi {
+- phy@a {
+- status = "okay";
++ i2c3_pins: i2c3 {
++ mux {
++ pins = "gpio10", "gpio11";
++ function = "blsp_i2c3";
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
+
+- v1p8-supply = <&pm8941_l6>;
+- v3p3-supply = <&pm8941_l24>;
++ i2c11_pins: i2c11 {
++ mux {
++ pins = "gpio83", "gpio84";
++ function = "blsp_i2c11";
+
+- qcom,init-seq = /bits/ 8 <0x1 0x64>;
+- };
++ drive-strength = <2>;
++ bias-disable;
+ };
+ };
+
+- mdss@fd900000 {
+- status = "okay";
++ i2c12_pins: i2c12 {
++ mux {
++ pins = "gpio87", "gpio88";
++ function = "blsp_i2c12";
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
+
+- mdp@fd900000 {
+- status = "okay";
++ mpu6515_pin: mpu6515 {
++ irq {
++ pins = "gpio73";
++ function = "gpio";
++ bias-disable;
++ input-enable;
+ };
++ };
+
+- dsi@fd922800 {
+- status = "okay";
++ touch_pin: touch {
++ int {
++ pins = "gpio5";
++ function = "gpio";
+
+- vdda-supply = <&pm8941_l2>;
+- vdd-supply = <&pm8941_lvs3>;
+- vddio-supply = <&pm8941_l12>;
++ drive-strength = <2>;
++ bias-disable;
++ input-enable;
++ };
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++ reset {
++ pins = "gpio8";
++ function = "gpio";
+
+- ports {
+- port@1 {
+- endpoint {
+- remote-endpoint = <&panel_in>;
+- data-lanes = <0 1 2 3>;
+- };
+- };
+- };
++ drive-strength = <2>;
++ bias-pull-up;
++ };
++ };
+
+- panel: panel@0 {
+- reg = <0>;
+- compatible = "lg,acx467akm-7";
++ panel_pin: panel {
++ te {
++ pins = "gpio12";
++ function = "mdp_vsync";
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&panel_pin>;
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
+
+- port {
+- panel_in: endpoint {
+- remote-endpoint = <&dsi0_out>;
+- };
+- };
+- };
++ bt_pin: bt {
++ hostwake {
++ pins = "gpio42";
++ function = "gpio";
+ };
+
+- dsi-phy@fd922a00 {
+- status = "okay";
++ devwake {
++ pins = "gpio62";
++ function = "gpio";
++ };
+
+- vddio-supply = <&pm8941_l12>;
++ shutdown {
++ pins = "gpio41";
++ function = "gpio";
+ };
+ };
+-};
+
+-&spmi_bus {
+- pm8941@0 {
+- gpios@c000 {
+- gpio_keys_pin_a: gpio-keys-active {
+- pins = "gpio2", "gpio3";
+- function = "normal";
++ blsp2_uart4_pin_a: blsp2-uart4-pin-active {
++ tx {
++ pins = "gpio53";
++ function = "blsp_uart10";
+
+- bias-pull-up;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ drive-strength = <2>;
++ bias-disable;
++ };
+
+- fuelgauge_pin: fuelgauge-int {
+- pins = "gpio9";
+- function = "normal";
++ rx {
++ pins = "gpio54";
++ function = "blsp_uart10";
+
+- bias-disable;
+- input-enable;
+- power-source = <PM8941_GPIO_S3>;
+- };
+-
+- wlan_sleep_clk_pin: wl-sleep-clk {
+- pins = "gpio16";
+- function = "func2";
++ drive-strength = <2>;
++ bias-pull-up;
++ };
+
+- output-high;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ cts {
++ pins = "gpio55";
++ function = "blsp_uart10";
+
+- wlan_regulator_pin: wl-reg-active {
+- pins = "gpio17";
+- function = "normal";
++ drive-strength = <2>;
++ bias-pull-up;
++ };
+
+- bias-disable;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ rts {
++ pins = "gpio56";
++ function = "blsp_uart10";
+
+- otg {
+- gpio-hog;
+- gpios = <35 GPIO_ACTIVE_HIGH>;
+- output-high;
+- line-name = "otg-gpio";
+- };
++ drive-strength = <2>;
++ bias-disable;
+ };
+ };
+ };
+diff --git a/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts b/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts
+index 96e1c978b878c..9ef5a68747f14 100644
+--- a/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts
++++ b/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts
+@@ -13,201 +13,42 @@
+ aliases {
+ serial0 = &blsp1_uart1;
+ mmc0 = &sdhc_1; /* SDC1 eMMC slot */
+- mmc1 = &sdhc_2; /* SDC2 SD card slot */
++ mmc1 = &sdhc_3; /* SDC2 SD card slot */
+ };
+
+ chosen {
+ stdout-path = "serial0:115200n8";
+ };
+
+- smd {
+- rpm {
+- rpm_requests {
+- pma8084-regulators {
+- compatible = "qcom,rpm-pma8084-regulators";
+- status = "okay";
+-
+- pma8084_s1: s1 {
+- regulator-min-microvolt = <675000>;
+- regulator-max-microvolt = <1050000>;
+- regulator-always-on;
+- };
+-
+- pma8084_s2: s2 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- pma8084_s3: s3 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- pma8084_s4: s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- pma8084_s5: s5 {
+- regulator-min-microvolt = <2150000>;
+- regulator-max-microvolt = <2150000>;
+- };
+-
+- pma8084_s6: s6 {
+- regulator-min-microvolt = <1050000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- pma8084_l1: l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- pma8084_l2: l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- pma8084_l3: l3 {
+- regulator-min-microvolt = <1050000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- pma8084_l4: l4 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- pma8084_l5: l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- pma8084_l6: l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- pma8084_l7: l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- pma8084_l8: l8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- pma8084_l9: l9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- pma8084_l10: l10 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- pma8084_l11: l11 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- pma8084_l12: l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-always-on;
+- };
+-
+- pma8084_l13: l13 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- pma8084_l14: l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- pma8084_l15: l15 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- pma8084_l16: l16 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- pma8084_l17: l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- pma8084_l18: l18 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- pma8084_l19: l19 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- pma8084_l20: l20 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- pma8084_l21: l21 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- pma8084_l22: l22 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- pma8084_l23: l23 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- pma8084_l24: l24 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- pma8084_l25: l25 {
+- regulator-min-microvolt = <2100000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- pma8084_l26: l26 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- pma8084_l27: l27 {
+- regulator-min-microvolt = <1000000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- pma8084_lvs1: lvs1 {};
+- pma8084_lvs2: lvs2 {};
+- pma8084_lvs3: lvs3 {};
+- pma8084_lvs4: lvs4 {};
+-
+- pma8084_5vs1: 5vs1 {};
+- };
+- };
++ gpio-keys {
++ compatible = "gpio-keys";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&gpio_keys_pin_a>;
++
++ volume-down {
++ label = "volume_down";
++ gpios = <&pma8084_gpios 2 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_VOLUMEDOWN>;
++ debounce-interval = <15>;
++ };
++
++ home-key {
++ label = "home_key";
++ gpios = <&pma8084_gpios 3 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_HOMEPAGE>;
++ wakeup-source;
++ debounce-interval = <15>;
++ };
++
++ volume-up {
++ label = "volume_up";
++ gpios = <&pma8084_gpios 5 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_VOLUMEUP>;
++ debounce-interval = <15>;
+ };
+ };
+
+@@ -215,8 +56,8 @@
+ compatible = "i2c-gpio";
+ #address-cells = <1>;
+ #size-cells = <0>;
+- sda-gpios = <&msmgpio 95 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 96 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 95 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 96 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c_touchkey_pins>;
+
+@@ -240,8 +81,8 @@
+ compatible = "i2c-gpio";
+ #address-cells = <1>;
+ #size-cells = <0>;
+- scl-gpios = <&msmgpio 121 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+- sda-gpios = <&msmgpio 120 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 121 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 120 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c_led_gpioex_pins>;
+
+@@ -259,7 +100,7 @@
+ pinctrl-names = "default";
+ pinctrl-0 = <&gpioex_pin>;
+
+- reset-gpios = <&msmgpio 145 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 145 GPIO_ACTIVE_LOW>;
+ };
+
+ led-controller@30 {
+@@ -315,594 +156,719 @@
+ };
+
+ /delete-node/ vreg-boost;
+-
+- adsp-pil {
+- cx-supply = <&pma8084_s2>;
+- };
+ };
+
+-&soc {
+- serial@f991e000 {
+- status = "okay";
+- };
++&blsp1_i2c2 {
++ status = "okay";
+
+- /* blsp2_uart8 */
+- serial@f995e000 {
+- status = "okay";
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c2_pins>;
+
+- pinctrl-names = "default", "sleep";
+- pinctrl-0 = <&blsp2_uart8_pins_active>;
+- pinctrl-1 = <&blsp2_uart8_pins_sleep>;
++ touchscreen@20 {
++ compatible = "syna,rmi4-i2c";
++ reg = <0x20>;
+
+- bluetooth {
+- compatible = "brcm,bcm43540-bt";
+- max-speed = <3000000>;
+- pinctrl-names = "default";
+- pinctrl-0 = <&bt_pins>;
+- device-wakeup-gpios = <&msmgpio 91 GPIO_ACTIVE_HIGH>;
+- shutdown-gpios = <&gpio_expander 9 GPIO_ACTIVE_HIGH>;
+- interrupt-parent = <&msmgpio>;
+- interrupts = <75 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "host-wakeup";
+- };
+- };
++ interrupt-parent = <&pma8084_gpios>;
++ interrupts = <8 IRQ_TYPE_EDGE_FALLING>;
+
+- gpio-keys {
+- compatible = "gpio-keys";
++ vdd-supply = <&max77826_ldo13>;
++ vio-supply = <&pma8084_lvs2>;
+
+ pinctrl-names = "default";
+- pinctrl-0 = <&gpio_keys_pin_a>;
++ pinctrl-0 = <&touch_pin>;
+
+- volume-down {
+- label = "volume_down";
+- gpios = <&pma8084_gpios 2 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEDOWN>;
+- debounce-interval = <15>;
+- };
++ syna,startup-delay-ms = <100>;
+
+- home-key {
+- label = "home_key";
+- gpios = <&pma8084_gpios 3 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_HOMEPAGE>;
+- wakeup-source;
+- debounce-interval = <15>;
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ rmi4-f01@1 {
++ reg = <0x1>;
++ syna,nosleep-mode = <1>;
+ };
+
+- volume-up {
+- label = "volume_up";
+- gpios = <&pma8084_gpios 5 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEUP>;
+- debounce-interval = <15>;
++ rmi4-f12@12 {
++ reg = <0x12>;
++ syna,sensor-type = <1>;
+ };
+ };
++};
+
+- pinctrl@fd510000 {
+- blsp2_uart8_pins_active: blsp2-uart8-pins-active {
+- pins = "gpio45", "gpio46", "gpio47", "gpio48";
+- function = "blsp_uart8";
+- drive-strength = <8>;
+- bias-disable;
+- };
++&blsp1_i2c6 {
++ status = "okay";
+
+- blsp2_uart8_pins_sleep: blsp2-uart8-pins-sleep {
+- pins = "gpio45", "gpio46", "gpio47", "gpio48";
+- function = "gpio";
+- drive-strength = <2>;
+- bias-pull-down;
+- };
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c6_pins>;
++
++ pmic@60 {
++ reg = <0x60>;
++ compatible = "maxim,max77826";
+
+- bt_pins: bt-pins {
+- hostwake {
+- pins = "gpio75";
+- function = "gpio";
+- drive-strength = <16>;
+- input-enable;
++ regulators {
++ max77826_ldo1: LDO1 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
+ };
+
+- devwake {
+- pins = "gpio91";
+- function = "gpio";
+- drive-strength = <2>;
++ max77826_ldo2: LDO2 {
++ regulator-min-microvolt = <1000000>;
++ regulator-max-microvolt = <1000000>;
+ };
+- };
+
+- sdhc1_pin_a: sdhc1-pin-active {
+- clk {
+- pins = "sdc1_clk";
+- drive-strength = <4>;
+- bias-disable;
++ max77826_ldo3: LDO3 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
+ };
+
+- cmd-data {
+- pins = "sdc1_cmd", "sdc1_data";
+- drive-strength = <4>;
+- bias-pull-up;
++ max77826_ldo4: LDO4 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+- };
+
+- sdhc2_pin_a: sdhc2-pin-active {
+- clk-cmd-data {
+- pins = "gpio35", "gpio36", "gpio37", "gpio38",
+- "gpio39", "gpio40";
+- function = "sdc3";
+- drive-strength = <8>;
+- bias-disable;
++ max77826_ldo5: LDO5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+- };
+
+- sdhc2_cd_pin: sdhc2-cd {
+- pins = "gpio62";
+- function = "gpio";
++ max77826_ldo6: LDO6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <3300000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ max77826_ldo7: LDO7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- sdhc3_pin_a: sdhc3-pin-active {
+- clk {
+- pins = "sdc2_clk";
+- drive-strength = <6>;
+- bias-disable;
++ max77826_ldo8: LDO8 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <3300000>;
+ };
+
+- cmd-data {
+- pins = "sdc2_cmd", "sdc2_data";
+- drive-strength = <6>;
+- bias-pull-up;
++ max77826_ldo9: LDO9 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+- };
+
+- i2c2_pins: i2c2 {
+- mux {
+- pins = "gpio6", "gpio7";
+- function = "blsp_i2c2";
++ max77826_ldo10: LDO10 {
++ regulator-min-microvolt = <2800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
++ max77826_ldo11: LDO11 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2950000>;
+ };
+- };
+
+- i2c6_pins: i2c6 {
+- mux {
+- pins = "gpio29", "gpio30";
+- function = "blsp_i2c6";
++ max77826_ldo12: LDO12 {
++ regulator-min-microvolt = <2500000>;
++ regulator-max-microvolt = <3300000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
++ max77826_ldo13: LDO13 {
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
+ };
+- };
+
+- i2c12_pins: i2c12 {
+- mux {
+- pins = "gpio87", "gpio88";
+- function = "blsp_i2c12";
++ max77826_ldo14: LDO14 {
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
++ max77826_ldo15: LDO15 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+- };
+
+- i2c_touchkey_pins: i2c-touchkey {
+- mux {
+- pins = "gpio95", "gpio96";
+- function = "gpio";
+- input-enable;
+- bias-pull-up;
++ max77826_buck: BUCK {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
+ };
+- };
+
+- i2c_led_gpioex_pins: i2c-led-gpioex {
+- mux {
+- pins = "gpio120", "gpio121";
+- function = "gpio";
+- input-enable;
+- bias-pull-down;
++ max77826_buckboost: BUCKBOOST {
++ regulator-min-microvolt = <3400000>;
++ regulator-max-microvolt = <3400000>;
+ };
+ };
++ };
++};
+
+- gpioex_pin: gpioex {
+- res {
+- pins = "gpio145";
+- function = "gpio";
++&blsp1_uart2 {
++ status = "okay";
++};
+
+- bias-pull-up;
+- drive-strength = <2>;
+- };
+- };
++&blsp2_i2c6 {
++ status = "okay";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c12_pins>;
++
++ fuelgauge@36 {
++ compatible = "maxim,max17048";
++ reg = <0x36>;
++
++ maxim,double-soc;
++ maxim,rcomp = /bits/ 8 <0x56>;
++
++ interrupt-parent = <&pma8084_gpios>;
++ interrupts = <21 IRQ_TYPE_LEVEL_LOW>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&fuelgauge_pin>;
++ };
++};
++
++&blsp2_uart2 {
++ status = "okay";
+
+- wifi_pin: wifi {
+- int {
+- pins = "gpio92";
+- function = "gpio";
++ pinctrl-names = "default", "sleep";
++ pinctrl-0 = <&blsp2_uart2_pins_active>;
++ pinctrl-1 = <&blsp2_uart2_pins_sleep>;
+
+- input-enable;
+- bias-pull-down;
++ bluetooth {
++ compatible = "brcm,bcm43540-bt";
++ max-speed = <3000000>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&bt_pins>;
++ device-wakeup-gpios = <&tlmm 91 GPIO_ACTIVE_HIGH>;
++ shutdown-gpios = <&gpio_expander 9 GPIO_ACTIVE_HIGH>;
++ interrupt-parent = <&tlmm>;
++ interrupts = <75 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "host-wakeup";
++ };
++};
++
++&dsi0 {
++ status = "okay";
++
++ vdda-supply = <&pma8084_l2>;
++ vdd-supply = <&pma8084_l22>;
++ vddio-supply = <&pma8084_l12>;
++
++ panel: panel@0 {
++ reg = <0>;
++ compatible = "samsung,s6e3fa2";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&panel_te_pin &panel_rst_pin>;
++
++ iovdd-supply = <&pma8084_lvs4>;
++ vddr-supply = <&vreg_panel>;
++
++ reset-gpios = <&pma8084_gpios 17 GPIO_ACTIVE_LOW>;
++ te-gpios = <&tlmm 12 GPIO_ACTIVE_HIGH>;
++
++ port {
++ panel_in: endpoint {
++ remote-endpoint = <&dsi0_out>;
+ };
+ };
++ };
++};
+
+- panel_te_pin: panel {
+- te {
+- pins = "gpio12";
+- function = "mdp_vsync";
++&dsi0_out {
++ remote-endpoint = <&panel_in>;
++ data-lanes = <0 1 2 3>;
++};
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++&dsi0_phy {
++ status = "okay";
++
++ vddio-supply = <&pma8084_l12>;
++};
++
++&gpu {
++ status = "okay";
++};
++
++&mdss {
++ status = "okay";
++};
++
++&otg {
++ status = "okay";
++
++ phys = <&usb_hs1_phy>;
++ phy-select = <&tcsr 0xb000 0>;
++
++ hnp-disable;
++ srp-disable;
++ adp-disable;
++
++ ulpi {
++ phy@a {
++ status = "okay";
++
++ v1p8-supply = <&pma8084_l6>;
++ v3p3-supply = <&pma8084_l24>;
++
++ qcom,init-seq = /bits/ 8 <0x1 0x64>;
+ };
+ };
++};
+
+- sdhc_1: sdhci@f9824900 {
+- status = "okay";
++&pma8084_gpios {
++ gpio_keys_pin_a: gpio-keys-active {
++ pins = "gpio2", "gpio3", "gpio5";
++ function = "normal";
+
+- vmmc-supply = <&pma8084_l20>;
+- vqmmc-supply = <&pma8084_s4>;
++ bias-pull-up;
++ power-source = <PMA8084_GPIO_S4>;
++ };
+
+- bus-width = <8>;
+- non-removable;
++ touchkey_pin: touchkey-int-pin {
++ pins = "gpio6";
++ function = "normal";
++ bias-disable;
++ input-enable;
++ power-source = <PMA8084_GPIO_S4>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc1_pin_a>;
++ touch_pin: touchscreen-int-pin {
++ pins = "gpio8";
++ function = "normal";
++ bias-disable;
++ input-enable;
++ power-source = <PMA8084_GPIO_S4>;
+ };
+
+- sdhc_2: sdhci@f9864900 {
+- status = "okay";
++ panel_en_pin: panel-en-pin {
++ pins = "gpio14";
++ function = "normal";
++ bias-pull-up;
++ power-source = <PMA8084_GPIO_S4>;
++ qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
++ };
+
+- max-frequency = <100000000>;
++ wlan_sleep_clk_pin: wlan-sleep-clk-pin {
++ pins = "gpio16";
++ function = "func2";
+
+- vmmc-supply = <&pma8084_l21>;
+- vqmmc-supply = <&pma8084_l13>;
++ output-high;
++ power-source = <PMA8084_GPIO_S4>;
++ qcom,drive-strength = <PMIC_GPIO_STRENGTH_HIGH>;
++ };
+
+- bus-width = <4>;
++ panel_rst_pin: panel-rst-pin {
++ pins = "gpio17";
++ function = "normal";
++ bias-disable;
++ power-source = <PMA8084_GPIO_S4>;
++ qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
++ };
+
+- /* cd-gpio is intentionally disabled. If enabled, an SD card
+- * present during boot is not initialized correctly. Without
+- * cd-gpios the driver resorts to polling, so hotplug works.
+- */
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc2_pin_a /* &sdhc2_cd_pin */>;
+- // cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
++ fuelgauge_pin: fuelgauge-int-pin {
++ pins = "gpio21";
++ function = "normal";
++ bias-disable;
++ input-enable;
++ power-source = <PMA8084_GPIO_S4>;
+ };
++};
+
+- sdhci@f98a4900 {
+- status = "okay";
++&remoteproc_adsp {
++ status = "okay";
++ cx-supply = <&pma8084_s2>;
++};
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++&remoteproc_mss {
++ status = "okay";
++ cx-supply = <&pma8084_s2>;
++ mss-supply = <&pma8084_s6>;
++ mx-supply = <&pma8084_s1>;
++ pll-supply = <&pma8084_l12>;
++};
+
+- max-frequency = <100000000>;
++&rpm_requests {
++ pma8084-regulators {
++ compatible = "qcom,rpm-pma8084-regulators";
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc3_pin_a>;
++ pma8084_s1: s1 {
++ regulator-min-microvolt = <675000>;
++ regulator-max-microvolt = <1050000>;
++ regulator-always-on;
++ };
+
+- vmmc-supply = <&vreg_wlan>;
+- vqmmc-supply = <&pma8084_s4>;
++ pma8084_s2: s2 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
++ };
+
+- bus-width = <4>;
+- non-removable;
++ pma8084_s3: s3 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ };
+
+- wifi@1 {
+- reg = <1>;
+- compatible = "brcm,bcm4329-fmac";
++ pma8084_s4: s4 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- interrupt-parent = <&msmgpio>;
+- interrupts = <92 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "host-wake";
++ pma8084_s5: s5 {
++ regulator-min-microvolt = <2150000>;
++ regulator-max-microvolt = <2150000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&wlan_sleep_clk_pin &wifi_pin>;
++ pma8084_s6: s6 {
++ regulator-min-microvolt = <1050000>;
++ regulator-max-microvolt = <1050000>;
+ };
+- };
+
+- usb@f9a55000 {
+- status = "okay";
++ pma8084_l1: l1 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
+
+- phys = <&usb_hs1_phy>;
+- phy-select = <&tcsr 0xb000 0>;
+- /*extcon = <&smbb>, <&usb_id>;*/
+- /*vbus-supply = <&chg_otg>;*/
++ pma8084_l2: l2 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
+
+- hnp-disable;
+- srp-disable;
+- adp-disable;
++ pma8084_l3: l3 {
++ regulator-min-microvolt = <1050000>;
++ regulator-max-microvolt = <1200000>;
++ };
+
+- ulpi {
+- phy@a {
+- status = "okay";
++ pma8084_l4: l4 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1225000>;
++ };
+
+- v1p8-supply = <&pma8084_l6>;
+- v3p3-supply = <&pma8084_l24>;
++ pma8084_l5: l5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- /*extcon = <&smbb>;*/
+- qcom,init-seq = /bits/ 8 <0x1 0x64>;
+- };
++ pma8084_l6: l6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+- };
+
+- i2c@f9924000 {
+- status = "okay";
++ pma8084_l7: l7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c2_pins>;
++ pma8084_l8: l8 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- touchscreen@20 {
+- compatible = "syna,rmi4-i2c";
+- reg = <0x20>;
++ pma8084_l9: l9 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- interrupt-parent = <&pma8084_gpios>;
+- interrupts = <8 IRQ_TYPE_EDGE_FALLING>;
++ pma8084_l10: l10 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- vdd-supply = <&max77826_ldo13>;
+- vio-supply = <&pma8084_lvs2>;
++ pma8084_l11: l11 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&touch_pin>;
++ pma8084_l12: l12 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ };
+
+- syna,startup-delay-ms = <100>;
++ pma8084_l13: l13 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++ pma8084_l14: l14 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- rmi4-f01@1 {
+- reg = <0x1>;
+- syna,nosleep-mode = <1>;
+- };
++ pma8084_l15: l15 {
++ regulator-min-microvolt = <2050000>;
++ regulator-max-microvolt = <2050000>;
++ };
+
+- rmi4-f12@12 {
+- reg = <0x12>;
+- syna,sensor-type = <1>;
+- };
++ pma8084_l16: l16 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
+ };
+- };
+
+- i2c@f9928000 {
+- status = "okay";
++ pma8084_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c6_pins>;
+-
+- pmic@60 {
+- reg = <0x60>;
+- compatible = "maxim,max77826";
+-
+- regulators {
+- max77826_ldo1: LDO1 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- max77826_ldo2: LDO2 {
+- regulator-min-microvolt = <1000000>;
+- regulator-max-microvolt = <1000000>;
+- };
+-
+- max77826_ldo3: LDO3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- max77826_ldo4: LDO4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- max77826_ldo5: LDO5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- max77826_ldo6: LDO6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- max77826_ldo7: LDO7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- max77826_ldo8: LDO8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- max77826_ldo9: LDO9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- max77826_ldo10: LDO10 {
+- regulator-min-microvolt = <2800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- max77826_ldo11: LDO11 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- max77826_ldo12: LDO12 {
+- regulator-min-microvolt = <2500000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- max77826_ldo13: LDO13 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- max77826_ldo14: LDO14 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- max77826_ldo15: LDO15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- max77826_buck: BUCK {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- max77826_buckboost: BUCKBOOST {
+- regulator-min-microvolt = <3400000>;
+- regulator-max-microvolt = <3400000>;
+- };
+- };
++ pma8084_l18: l18 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
+ };
+- };
+
+- i2c@f9968000 {
+- status = "okay";
++ pma8084_l19: l19 {
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c12_pins>;
++ pma8084_l20: l20 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-system-load = <200000>;
++ regulator-allow-set-load;
++ };
+
+- fuelgauge@36 {
+- compatible = "maxim,max17048";
+- reg = <0x36>;
++ pma8084_l21: l21 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-system-load = <200000>;
++ regulator-allow-set-load;
++ };
+
+- maxim,double-soc;
+- maxim,rcomp = /bits/ 8 <0x56>;
++ pma8084_l22: l22 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3300000>;
++ };
+
+- interrupt-parent = <&pma8084_gpios>;
+- interrupts = <21 IRQ_TYPE_LEVEL_LOW>;
++ pma8084_l23: l23 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3000000>;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&fuelgauge_pin>;
++ pma8084_l24: l24 {
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
+ };
++
++ pma8084_l25: l25 {
++ regulator-min-microvolt = <2100000>;
++ regulator-max-microvolt = <2100000>;
++ };
++
++ pma8084_l26: l26 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2050000>;
++ };
++
++ pma8084_l27: l27 {
++ regulator-min-microvolt = <1000000>;
++ regulator-max-microvolt = <1225000>;
++ };
++
++ pma8084_lvs1: lvs1 {};
++ pma8084_lvs2: lvs2 {};
++ pma8084_lvs3: lvs3 {};
++ pma8084_lvs4: lvs4 {};
++
++ pma8084_5vs1: 5vs1 {};
+ };
++};
++
++&sdhc_1 {
++ status = "okay";
++
++ vmmc-supply = <&pma8084_l20>;
++ vqmmc-supply = <&pma8084_s4>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc1_pin_a>;
++};
++
++&sdhc_2 {
++ status = "okay";
++ max-frequency = <100000000>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc3_pin_a>;
++
++ vmmc-supply = <&vreg_wlan>;
++ vqmmc-supply = <&pma8084_s4>;
++
++ non-removable;
+
+- adreno@fdb00000 {
+- status = "ok";
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ wifi@1 {
++ reg = <1>;
++ compatible = "brcm,bcm4329-fmac";
++
++ interrupt-parent = <&tlmm>;
++ interrupts = <92 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "host-wake";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&wlan_sleep_clk_pin &wifi_pin>;
+ };
++};
+
+- mdss@fd900000 {
+- status = "ok";
++&sdhc_3 {
++ status = "okay";
++ max-frequency = <100000000>;
++
++ vmmc-supply = <&pma8084_l21>;
++ vqmmc-supply = <&pma8084_l13>;
++
++ /*
++ * cd-gpio is intentionally disabled. If enabled, an SD card
++ * present during boot is not initialized correctly. Without
++ * cd-gpios the driver resorts to polling, so hotplug works.
++ */
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc2_pin_a /* &sdhc2_cd_pin */>;
++ /* cd-gpios = <&tlmm 62 GPIO_ACTIVE_LOW>; */
++};
+
+- mdp@fd900000 {
+- status = "ok";
+- };
++&tlmm {
++ blsp2_uart2_pins_active: blsp2-uart2-pins-active {
++ pins = "gpio45", "gpio46", "gpio47", "gpio48";
++ function = "blsp_uart8";
++ drive-strength = <8>;
++ bias-disable;
++ };
+
+- dsi@fd922800 {
+- status = "ok";
++ blsp2_uart2_pins_sleep: blsp2-uart2-pins-sleep {
++ pins = "gpio45", "gpio46", "gpio47", "gpio48";
++ function = "gpio";
++ drive-strength = <2>;
++ bias-pull-down;
++ };
+
+- vdda-supply = <&pma8084_l2>;
+- vdd-supply = <&pma8084_l22>;
+- vddio-supply = <&pma8084_l12>;
++ bt_pins: bt-pins {
++ hostwake {
++ pins = "gpio75";
++ function = "gpio";
++ drive-strength = <16>;
++ input-enable;
++ };
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++ devwake {
++ pins = "gpio91";
++ function = "gpio";
++ drive-strength = <2>;
++ };
++ };
+
+- ports {
+- port@1 {
+- endpoint {
+- remote-endpoint = <&panel_in>;
+- data-lanes = <0 1 2 3>;
+- };
+- };
+- };
++ sdhc1_pin_a: sdhc1-pin-active {
++ clk {
++ pins = "sdc1_clk";
++ drive-strength = <4>;
++ bias-disable;
++ };
+
+- panel: panel@0 {
+- reg = <0>;
+- compatible = "samsung,s6e3fa2";
++ cmd-data {
++ pins = "sdc1_cmd", "sdc1_data";
++ drive-strength = <4>;
++ bias-pull-up;
++ };
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&panel_te_pin &panel_rst_pin>;
++ sdhc2_pin_a: sdhc2-pin-active {
++ clk-cmd-data {
++ pins = "gpio35", "gpio36", "gpio37", "gpio38",
++ "gpio39", "gpio40";
++ function = "sdc3";
++ drive-strength = <8>;
++ bias-disable;
++ };
++ };
+
+- iovdd-supply = <&pma8084_lvs4>;
+- vddr-supply = <&vreg_panel>;
++ sdhc2_cd_pin: sdhc2-cd {
++ pins = "gpio62";
++ function = "gpio";
+
+- reset-gpios = <&pma8084_gpios 17 GPIO_ACTIVE_LOW>;
+- te-gpios = <&msmgpio 12 GPIO_ACTIVE_HIGH>;
++ drive-strength = <2>;
++ bias-disable;
++ };
+
+- port {
+- panel_in: endpoint {
+- remote-endpoint = <&dsi0_out>;
+- };
+- };
+- };
++ sdhc3_pin_a: sdhc3-pin-active {
++ clk {
++ pins = "sdc2_clk";
++ drive-strength = <6>;
++ bias-disable;
++ };
++
++ cmd-data {
++ pins = "sdc2_cmd", "sdc2_data";
++ drive-strength = <6>;
++ bias-pull-up;
+ };
++ };
+
+- dsi-phy@fd922a00 {
+- status = "ok";
++ i2c2_pins: i2c2 {
++ mux {
++ pins = "gpio6", "gpio7";
++ function = "blsp_i2c2";
+
+- vddio-supply = <&pma8084_l12>;
++ drive-strength = <2>;
++ bias-disable;
+ };
+ };
+
+- remoteproc@fc880000 {
+- cx-supply = <&pma8084_s2>;
+- mss-supply = <&pma8084_s6>;
+- mx-supply = <&pma8084_s1>;
+- pll-supply = <&pma8084_l12>;
++ i2c6_pins: i2c6 {
++ mux {
++ pins = "gpio29", "gpio30";
++ function = "blsp_i2c6";
++
++ drive-strength = <2>;
++ bias-disable;
++ };
+ };
+-};
+
+-&spmi_bus {
+- pma8084@0 {
+- gpios@c000 {
+- gpio_keys_pin_a: gpio-keys-active {
+- pins = "gpio2", "gpio3", "gpio5";
+- function = "normal";
++ i2c12_pins: i2c12 {
++ mux {
++ pins = "gpio87", "gpio88";
++ function = "blsp_i2c12";
+
+- bias-pull-up;
+- power-source = <PMA8084_GPIO_S4>;
+- };
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
+
+- touchkey_pin: touchkey-int-pin {
+- pins = "gpio6";
+- function = "normal";
+- bias-disable;
+- input-enable;
+- power-source = <PMA8084_GPIO_S4>;
+- };
++ i2c_touchkey_pins: i2c-touchkey {
++ mux {
++ pins = "gpio95", "gpio96";
++ function = "gpio";
++ input-enable;
++ bias-pull-up;
++ };
++ };
+
+- touch_pin: touchscreen-int-pin {
+- pins = "gpio8";
+- function = "normal";
+- bias-disable;
+- input-enable;
+- power-source = <PMA8084_GPIO_S4>;
+- };
++ i2c_led_gpioex_pins: i2c-led-gpioex {
++ mux {
++ pins = "gpio120", "gpio121";
++ function = "gpio";
++ input-enable;
++ bias-pull-down;
++ };
++ };
+
+- panel_en_pin: panel-en-pin {
+- pins = "gpio14";
+- function = "normal";
+- bias-pull-up;
+- power-source = <PMA8084_GPIO_S4>;
+- qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
+- };
++ gpioex_pin: gpioex {
++ res {
++ pins = "gpio145";
++ function = "gpio";
+
+- wlan_sleep_clk_pin: wlan-sleep-clk-pin {
+- pins = "gpio16";
+- function = "func2";
++ bias-pull-up;
++ drive-strength = <2>;
++ };
++ };
+
+- output-high;
+- power-source = <PMA8084_GPIO_S4>;
+- qcom,drive-strength = <PMIC_GPIO_STRENGTH_HIGH>;
+- };
++ wifi_pin: wifi {
++ int {
++ pins = "gpio92";
++ function = "gpio";
+
+- panel_rst_pin: panel-rst-pin {
+- pins = "gpio17";
+- function = "normal";
+- bias-disable;
+- power-source = <PMA8084_GPIO_S4>;
+- qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
+- };
++ input-enable;
++ bias-pull-down;
++ };
++ };
+
++ panel_te_pin: panel {
++ te {
++ pins = "gpio12";
++ function = "mdp_vsync";
+
+- fuelgauge_pin: fuelgauge-int-pin {
+- pins = "gpio21";
+- function = "normal";
+- bias-disable;
+- input-enable;
+- power-source = <PMA8084_GPIO_S4>;
+- };
++ drive-strength = <2>;
++ bias-disable;
+ };
+ };
+ };
+diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts
+index 79e2cfbbb1ba2..68d5626bf4918 100644
+--- a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts
++++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts
+@@ -1,435 +1,13 @@
+ // SPDX-License-Identifier: GPL-2.0
+-#include "qcom-msm8974.dtsi"
+-#include "qcom-pm8841.dtsi"
+-#include "qcom-pm8941.dtsi"
+-#include <dt-bindings/gpio/gpio.h>
+-#include <dt-bindings/input/input.h>
+-#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
++#include "qcom-msm8974-sony-xperia-rhine.dtsi"
+
+ / {
+ model = "Sony Xperia Z1 Compact";
+ compatible = "sony,xperia-amami", "qcom,msm8974";
+-
+- aliases {
+- serial0 = &blsp1_uart2;
+- };
+-
+- chosen {
+- stdout-path = "serial0:115200n8";
+- };
+-
+- gpio-keys {
+- compatible = "gpio-keys";
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&gpio_keys_pin_a>;
+-
+- volume-down {
+- label = "volume_down";
+- gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEDOWN>;
+- };
+-
+- camera-snapshot {
+- label = "camera_snapshot";
+- gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_CAMERA>;
+- };
+-
+- camera-focus {
+- label = "camera_focus";
+- gpios = <&pm8941_gpios 4 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_CAMERA_FOCUS>;
+- };
+-
+- volume-up {
+- label = "volume_up";
+- gpios = <&pm8941_gpios 5 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEUP>;
+- };
+- };
+-
+- memory@0 {
+- reg = <0 0x40000000>, <0x40000000 0x40000000>;
+- device_type = "memory";
+- };
+-
+- smd {
+- rpm {
+- rpm_requests {
+- pm8841-regulators {
+- s1 {
+- regulator-min-microvolt = <675000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+- };
+-
+- pm8941-regulators {
+- vdd_l1_l3-supply = <&pm8941_s1>;
+- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+- vdd_l4_l11-supply = <&pm8941_s1>;
+- vdd_l5_l7-supply = <&pm8941_s2>;
+- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+- vdd_l21-supply = <&vreg_boost>;
+-
+- s1 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <2150000>;
+- regulator-max-microvolt = <2150000>;
+- regulator-boot-on;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <5000000>;
+- regulator-max-microvolt = <5000000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1350000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l19 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l20 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-allow-set-load;
+- regulator-boot-on;
+- regulator-system-load = <200000>;
+- };
+-
+- l21 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l22 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l23 {
+- regulator-min-microvolt = <2800000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l24 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+-
+- regulator-boot-on;
+- };
+- };
+- };
+- };
+- };
+ };
+
+-&soc {
+- sdhci@f9824900 {
+- status = "okay";
+-
+- vmmc-supply = <&pm8941_l20>;
+- vqmmc-supply = <&pm8941_s3>;
+-
+- bus-width = <8>;
+- non-removable;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc1_pin_a>;
+- };
+-
+- sdhci@f98a4900 {
+- status = "okay";
+-
+- bus-width = <4>;
+-
+- vmmc-supply = <&pm8941_l21>;
+- vqmmc-supply = <&pm8941_l13>;
+-
+- cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
+- };
+-
+- serial@f991e000 {
+- status = "okay";
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&blsp1_uart2_pin_a>;
+- };
+-
+-
+- pinctrl@fd510000 {
+- blsp1_uart2_pin_a: blsp1-uart2-pin-active {
+- rx {
+- pins = "gpio5";
+- function = "blsp_uart2";
+-
+- drive-strength = <2>;
+- bias-pull-up;
+- };
+-
+- tx {
+- pins = "gpio4";
+- function = "blsp_uart2";
+-
+- drive-strength = <4>;
+- bias-disable;
+- };
+- };
+-
+- i2c2_pins: i2c2 {
+- mux {
+- pins = "gpio6", "gpio7";
+- function = "blsp_i2c2";
+-
+- drive-strength = <2>;
+- bias-disable;
+- };
+- };
+-
+- sdhc1_pin_a: sdhc1-pin-active {
+- clk {
+- pins = "sdc1_clk";
+- drive-strength = <16>;
+- bias-disable;
+- };
+-
+- cmd-data {
+- pins = "sdc1_cmd", "sdc1_data";
+- drive-strength = <10>;
+- bias-pull-up;
+- };
+- };
+-
+- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
+- pins = "gpio62";
+- function = "gpio";
+-
+- drive-strength = <2>;
+- bias-disable;
+- };
+-
+- sdhc2_pin_a: sdhc2-pin-active {
+- clk {
+- pins = "sdc2_clk";
+- drive-strength = <10>;
+- bias-disable;
+- };
+-
+- cmd-data {
+- pins = "sdc2_cmd", "sdc2_data";
+- drive-strength = <6>;
+- bias-pull-up;
+- };
+- };
+- };
+-
+- dma-controller@f9944000 {
+- qcom,controlled-remotely;
+- };
+-
+- usb@f9a55000 {
+- status = "okay";
+-
+- phys = <&usb_hs1_phy>;
+- phy-select = <&tcsr 0xb000 0>;
+- extcon = <&smbb>, <&usb_id>;
+- vbus-supply = <&chg_otg>;
+-
+- hnp-disable;
+- srp-disable;
+- adp-disable;
+-
+- ulpi {
+- phy@a {
+- status = "okay";
+-
+- v1p8-supply = <&pm8941_l6>;
+- v3p3-supply = <&pm8941_l24>;
+-
+- extcon = <&smbb>;
+- qcom,init-seq = /bits/ 8 <0x1 0x64>;
+- };
+- };
+- };
+-};
+-
+-&spmi_bus {
+- pm8941@0 {
+- charger@1000 {
+- qcom,fast-charge-safe-current = <1300000>;
+- qcom,fast-charge-current-limit = <1300000>;
+- qcom,dc-current-limit = <1300000>;
+- qcom,fast-charge-safe-voltage = <4400000>;
+- qcom,fast-charge-high-threshold-voltage = <4350000>;
+- qcom,fast-charge-low-threshold-voltage = <3400000>;
+- qcom,auto-recharge-threshold-voltage = <4200000>;
+- qcom,minimum-input-voltage = <4300000>;
+- };
+-
+- gpios@c000 {
+- gpio_keys_pin_a: gpio-keys-active {
+- pins = "gpio2", "gpio3", "gpio4", "gpio5";
+- function = "normal";
+-
+- bias-pull-up;
+- power-source = <PM8941_GPIO_S3>;
+- };
+- };
+-
+- coincell@2800 {
+- status = "okay";
+- qcom,rset-ohms = <2100>;
+- qcom,vset-millivolts = <3000>;
+- };
+- };
+-
+- pm8941@1 {
+- wled@d800 {
+- status = "okay";
+-
+- qcom,cs-out;
+- qcom,current-limit = <20>;
+- qcom,current-boost-limit = <805>;
+- qcom,switching-freq = <1600>;
+- qcom,ovp = <29>;
+- qcom,num-strings = <2>;
+- };
+- };
++&smbb {
++ qcom,fast-charge-safe-current = <1300000>;
++ qcom,fast-charge-current-limit = <1300000>;
++ qcom,dc-current-limit = <1300000>;
+ };
+diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts
+index e66937e3f7dd4..b8b6447b3217c 100644
+--- a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts
++++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts
+@@ -11,7 +11,7 @@
+
+ aliases {
+ serial0 = &blsp1_uart2;
+- serial1 = &blsp2_uart7;
++ serial1 = &blsp2_uart1;
+ };
+
+ chosen {
+@@ -53,193 +53,13 @@
+ };
+ };
+
+- smd {
+- rpm {
+- rpm_requests {
+- pm8941-regulators {
+- vdd_l1_l3-supply = <&pm8941_s1>;
+- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+- vdd_l4_l11-supply = <&pm8941_s1>;
+- vdd_l5_l7-supply = <&pm8941_s2>;
+- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+- vdd_l21-supply = <&vreg_boost>;
+-
+- s1 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <2150000>;
+- regulator-max-microvolt = <2150000>;
+- regulator-boot-on;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-always-on;
+- regulator-boot-on;
+-
+- regulator-system-load = <154000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <5000000>;
+- regulator-max-microvolt = <5000000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1350000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l19 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l20 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-allow-set-load;
+- regulator-boot-on;
+- regulator-allow-set-load;
+- regulator-system-load = <500000>;
+- };
+-
+- l21 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l22 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l23 {
+- regulator-min-microvolt = <2800000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l24 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+-
+- regulator-boot-on;
+- };
+- };
+- };
+- };
+- };
+-
+ vreg_bl_vddio: lcd-backlight-vddio {
+ compatible = "regulator-fixed";
+ regulator-name = "vreg_bl_vddio";
+ regulator-min-microvolt = <3150000>;
+ regulator-max-microvolt = <3150000>;
+
+- gpio = <&msmgpio 69 0>;
++ gpio = <&tlmm 69 0>;
+ enable-active-high;
+
+ vin-supply = <&pm8941_s3>;
+@@ -277,447 +97,605 @@
+ };
+ };
+
+-&soc {
+- sdhci@f9824900 {
+- status = "okay";
++&blsp1_uart2 {
++ status = "okay";
+
+- vmmc-supply = <&pm8941_l20>;
+- vqmmc-supply = <&pm8941_s3>;
+-
+- bus-width = <8>;
+- non-removable;
++ pinctrl-names = "default";
++ pinctrl-0 = <&blsp1_uart2_pin_a>;
++};
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc1_pin_a>;
+- };
++&blsp2_i2c2 {
++ status = "okay";
++ clock-frequency = <355000>;
+
+- sdhci@f9864900 {
+- status = "okay";
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c8_pins>;
+
+- max-frequency = <100000000>;
+- non-removable;
+- vmmc-supply = <&vreg_wlan>;
++ synaptics@2c {
++ compatible = "syna,rmi4-i2c";
++ reg = <0x2c>;
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc3_pin_a>;
++ interrupt-parent = <&tlmm>;
++ interrupts = <86 IRQ_TYPE_EDGE_FALLING>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+- bcrmf@1 {
+- compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
+- reg = <1>;
++ vdd-supply = <&pm8941_l22>;
++ vio-supply = <&pm8941_lvs3>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&ts_int_pin>;
+
+- brcm,drive-strength = <10>;
++ syna,startup-delay-ms = <10>;
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&wlan_sleep_clk_pin>;
++ rmi-f01@1 {
++ reg = <0x1>;
++ syna,nosleep = <1>;
+ };
+- };
+
+- sdhci@f98a4900 {
+- status = "okay";
++ rmi-f11@11 {
++ reg = <0x11>;
++ syna,f11-flip-x = <1>;
++ syna,sensor-type = <1>;
++ };
++ };
++};
+
+- bus-width = <4>;
++&blsp2_i2c5 {
++ status = "okay";
++ clock-frequency = <355000>;
+
+- vmmc-supply = <&pm8941_l21>;
+- vqmmc-supply = <&pm8941_l13>;
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c11_pins>;
+
+- cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
++ lp8566_wled: backlight@2c {
++ compatible = "ti,lp8556";
++ reg = <0x2c>;
++ power-supply = <&vreg_bl_vddio>;
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
++ bl-name = "backlight";
++ dev-ctrl = /bits/ 8 <0x05>;
++ init-brt = /bits/ 8 <0x3f>;
++ rom_a0h {
++ rom-addr = /bits/ 8 <0xa0>;
++ rom-val = /bits/ 8 <0xff>;
++ };
++ rom_a1h {
++ rom-addr = /bits/ 8 <0xa1>;
++ rom-val = /bits/ 8 <0x3f>;
++ };
++ rom_a2h {
++ rom-addr = /bits/ 8 <0xa2>;
++ rom-val = /bits/ 8 <0x20>;
++ };
++ rom_a3h {
++ rom-addr = /bits/ 8 <0xa3>;
++ rom-val = /bits/ 8 <0x5e>;
++ };
++ rom_a4h {
++ rom-addr = /bits/ 8 <0xa4>;
++ rom-val = /bits/ 8 <0x02>;
++ };
++ rom_a5h {
++ rom-addr = /bits/ 8 <0xa5>;
++ rom-val = /bits/ 8 <0x04>;
++ };
++ rom_a6h {
++ rom-addr = /bits/ 8 <0xa6>;
++ rom-val = /bits/ 8 <0x80>;
++ };
++ rom_a7h {
++ rom-addr = /bits/ 8 <0xa7>;
++ rom-val = /bits/ 8 <0xf7>;
++ };
++ rom_a9h {
++ rom-addr = /bits/ 8 <0xa9>;
++ rom-val = /bits/ 8 <0x80>;
++ };
++ rom_aah {
++ rom-addr = /bits/ 8 <0xaa>;
++ rom-val = /bits/ 8 <0x0f>;
++ };
++ rom_aeh {
++ rom-addr = /bits/ 8 <0xae>;
++ rom-val = /bits/ 8 <0x0f>;
++ };
+ };
++};
++
++&blsp2_uart1 {
++ status = "okay";
+
+- serial@f991e000 {
+- status = "okay";
++ pinctrl-names = "default";
++ pinctrl-0 = <&blsp2_uart7_pin_a>;
++
++ bluetooth {
++ compatible = "brcm,bcm43438-bt";
++ max-speed = <3000000>;
+
+ pinctrl-names = "default";
+- pinctrl-0 = <&blsp1_uart2_pin_a>;
++ pinctrl-0 = <&bt_host_wake_pin>,
++ <&bt_dev_wake_pin>,
++ <&bt_reg_on_pin>;
++
++ host-wakeup-gpios = <&tlmm 95 GPIO_ACTIVE_HIGH>;
++ device-wakeup-gpios = <&tlmm 96 GPIO_ACTIVE_HIGH>;
++ shutdown-gpios = <&pm8941_gpios 16 GPIO_ACTIVE_HIGH>;
+ };
++};
+
+- serial@f995d000 {
+- status = "ok";
++&otg {
++ status = "okay";
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&blsp2_uart7_pin_a>;
++ phys = <&usb_hs1_phy>;
++ phy-select = <&tcsr 0xb000 0>;
++ extcon = <&smbb>, <&usb_id>;
++ vbus-supply = <&chg_otg>;
+
+- bluetooth {
+- compatible = "brcm,bcm43438-bt";
+- max-speed = <3000000>;
++ hnp-disable;
++ srp-disable;
++ adp-disable;
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&bt_host_wake_pin>,
+- <&bt_dev_wake_pin>,
+- <&bt_reg_on_pin>;
++ ulpi {
++ phy@a {
++ status = "okay";
+
+- host-wakeup-gpios = <&msmgpio 95 GPIO_ACTIVE_HIGH>;
+- device-wakeup-gpios = <&msmgpio 96 GPIO_ACTIVE_HIGH>;
+- shutdown-gpios = <&pm8941_gpios 16 GPIO_ACTIVE_HIGH>;
++ v1p8-supply = <&pm8941_l6>;
++ v3p3-supply = <&pm8941_l24>;
++
++ extcon = <&smbb>;
++ qcom,init-seq = /bits/ 8 <0x1 0x64>;
+ };
+ };
++};
+
+- usb@f9a55000 {
+- status = "okay";
++&pm8941_coincell {
++ status = "okay";
+
+- phys = <&usb_hs1_phy>;
+- phy-select = <&tcsr 0xb000 0>;
+- extcon = <&smbb>, <&usb_id>;
+- vbus-supply = <&chg_otg>;
++ qcom,rset-ohms = <2100>;
++ qcom,vset-millivolts = <3000>;
++};
+
+- hnp-disable;
+- srp-disable;
+- adp-disable;
++&pm8941_gpios {
++ gpio_keys_pin_a: gpio-keys-active {
++ pins = "gpio2", "gpio5";
++ function = "normal";
+
+- ulpi {
+- phy@a {
+- status = "okay";
++ bias-pull-up;
++ power-source = <PM8941_GPIO_S3>;
++ };
+
+- v1p8-supply = <&pm8941_l6>;
+- v3p3-supply = <&pm8941_l24>;
++ bt_reg_on_pin: bt-reg-on {
++ pins = "gpio16";
++ function = "normal";
+
+- extcon = <&smbb>;
+- qcom,init-seq = /bits/ 8 <0x1 0x64>;
+- };
+- };
++ output-low;
++ power-source = <PM8941_GPIO_S3>;
+ };
+
+- pinctrl@fd510000 {
+- blsp1_uart2_pin_a: blsp1-uart2-pin-active {
+- rx {
+- pins = "gpio5";
+- function = "blsp_uart2";
++ wlan_sleep_clk_pin: wl-sleep-clk {
++ pins = "gpio17";
++ function = "func2";
+
+- drive-strength = <2>;
+- bias-pull-up;
+- };
++ output-high;
++ power-source = <PM8941_GPIO_S3>;
++ };
+
+- tx {
+- pins = "gpio4";
+- function = "blsp_uart2";
++ wlan_regulator_pin: wl-reg-active {
++ pins = "gpio18";
++ function = "normal";
+
+- drive-strength = <4>;
+- bias-disable;
+- };
+- };
++ bias-disable;
++ power-source = <PM8941_GPIO_S3>;
++ };
+
+- blsp2_uart7_pin_a: blsp2-uart7-pin-active {
+- tx {
+- pins = "gpio41";
+- function = "blsp_uart7";
++ lcd_dcdc_en_pin_a: lcd-dcdc-en-active {
++ pins = "gpio20";
++ function = "normal";
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ bias-disable;
++ power-source = <PM8941_GPIO_S3>;
++ input-disable;
++ output-low;
++ };
+
+- rx {
+- pins = "gpio42";
+- function = "blsp_uart7";
++};
+
+- drive-strength = <2>;
+- bias-pull-up;
+- };
++&rpm_requests {
++ pm8941-regulators {
++ compatible = "qcom,rpm-pm8941-regulators";
++
++ vdd_l1_l3-supply = <&pm8941_s1>;
++ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
++ vdd_l4_l11-supply = <&pm8941_s1>;
++ vdd_l5_l7-supply = <&pm8941_s2>;
++ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
++ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
++ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
++ vdd_l21-supply = <&vreg_boost>;
++
++ pm8941_s1: s1 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
+
+- cts {
+- pins = "gpio43";
+- function = "blsp_uart7";
++ pm8941_s2: s2 {
++ regulator-min-microvolt = <2150000>;
++ regulator-max-microvolt = <2150000>;
++ regulator-boot-on;
++ };
+
+- drive-strength = <2>;
+- bias-pull-up;
+- };
++ pm8941_s3: s3 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-system-load = <154000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
+
+- rts {
+- pins = "gpio44";
+- function = "blsp_uart7";
++ pm8941_s4: s4 {
++ regulator-min-microvolt = <5000000>;
++ regulator-max-microvolt = <5000000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ pm8941_l1: l1 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ regulator-always-on;
++ regulator-boot-on;
+ };
+
+- i2c8_pins: i2c8 {
+- mux {
+- pins = "gpio47", "gpio48";
+- function = "blsp_i2c8";
++ pm8941_l2: l2 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ pm8941_l3: l3 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
+ };
+
+- i2c11_pins: i2c11 {
+- mux {
+- pins = "gpio83", "gpio84";
+- function = "blsp_i2c11";
++ pm8941_l4: l4 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ pm8941_l5: l5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+
+- lcd_backlight_en_pin_a: lcd-backlight-vddio {
+- pins = "gpio69";
+- drive-strength = <10>;
+- output-low;
+- bias-disable;
++ pm8941_l6: l6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
+ };
+
+- sdhc1_pin_a: sdhc1-pin-active {
+- clk {
+- pins = "sdc1_clk";
+- drive-strength = <16>;
+- bias-disable;
+- };
++ pm8941_l7: l7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
+
+- cmd-data {
+- pins = "sdc1_cmd", "sdc1_data";
+- drive-strength = <10>;
+- bias-pull-up;
+- };
++ pm8941_l8: l8 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+
+- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
+- pins = "gpio62";
+- function = "gpio";
++ pm8941_l9: l9 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
+- };
++ pm8941_l11: l11 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1350000>;
++ };
+
+- sdhc2_pin_a: sdhc2-pin-active {
+- clk {
+- pins = "sdc2_clk";
+- drive-strength = <6>;
+- bias-disable;
+- };
++ pm8941_l12: l12 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
+
+- cmd-data {
+- pins = "sdc2_cmd", "sdc2_data";
+- drive-strength = <6>;
+- bias-pull-up;
+- };
++ pm8941_l13: l13 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
+ };
+
+- sdhc3_pin_a: sdhc3-pin-active {
+- clk {
+- pins = "gpio40";
+- function = "sdc3";
++ pm8941_l14: l14 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
+
+- drive-strength = <10>;
+- bias-disable;
+- };
++ pm8941_l15: l15 {
++ regulator-min-microvolt = <2050000>;
++ regulator-max-microvolt = <2050000>;
++ };
+
+- cmd {
+- pins = "gpio39";
+- function = "sdc3";
++ pm8941_l16: l16 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
+
+- drive-strength = <10>;
+- bias-pull-up;
+- };
++ pm8941_l17: l17 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
+
+- data {
+- pins = "gpio35", "gpio36", "gpio37", "gpio38";
+- function = "sdc3";
++ pm8941_l18: l18 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
+
+- drive-strength = <10>;
+- bias-pull-up;
+- };
++ pm8941_l19: l19 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
+ };
+
+- ts_int_pin: synaptics {
+- pin {
+- pins = "gpio86";
+- function = "gpio";
++ pm8941_l20: l20 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-system-load = <500000>;
++ regulator-allow-set-load;
++ regulator-boot-on;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
+- input-enable;
+- };
++ pm8941_l21: l21 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
+ };
+
+- bt_host_wake_pin: bt-host-wake {
+- pins = "gpio95";
+- function = "gpio";
++ pm8941_l22: l22 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3000000>;
++ };
+
+- drive-strength = <2>;
+- bias-disable;
+- output-low;
++ pm8941_l23: l23 {
++ regulator-min-microvolt = <2800000>;
++ regulator-max-microvolt = <2800000>;
+ };
+
+- bt_dev_wake_pin: bt-dev-wake {
+- pins = "gpio96";
+- function = "gpio";
++ pm8941_l24: l24 {
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
++ regulator-boot-on;
++ };
++
++ pm8941_lvs3: lvs3 {};
++ };
++};
++
++&sdhc_1 {
++ status = "okay";
++
++ vmmc-supply = <&pm8941_l20>;
++ vqmmc-supply = <&pm8941_s3>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc1_pin_a>;
++};
++
++&sdhc_2 {
++ status = "okay";
++
++ vmmc-supply = <&pm8941_l21>;
++ vqmmc-supply = <&pm8941_l13>;
++
++ cd-gpios = <&tlmm 62 GPIO_ACTIVE_LOW>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
++};
++
++&sdhc_3 {
++ status = "okay";
++
++ max-frequency = <100000000>;
++ vmmc-supply = <&vreg_wlan>;
++ non-removable;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc3_pin_a>;
++
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ bcrmf@1 {
++ compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
++ reg = <1>;
++
++ brcm,drive-strength = <10>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&wlan_sleep_clk_pin>;
++ };
++};
++
++&smbb {
++ qcom,fast-charge-safe-current = <1500000>;
++ qcom,fast-charge-current-limit = <1500000>;
++ qcom,dc-current-limit = <1800000>;
++ qcom,fast-charge-safe-voltage = <4400000>;
++ qcom,fast-charge-high-threshold-voltage = <4350000>;
++ qcom,fast-charge-low-threshold-voltage = <3400000>;
++ qcom,auto-recharge-threshold-voltage = <4200000>;
++ qcom,minimum-input-voltage = <4300000>;
++};
++
++&tlmm {
++ blsp1_uart2_pin_a: blsp1-uart2-pin-active {
++ rx {
++ pins = "gpio5";
++ function = "blsp_uart2";
+
+ drive-strength = <2>;
++ bias-pull-up;
++ };
++
++ tx {
++ pins = "gpio4";
++ function = "blsp_uart2";
++
++ drive-strength = <4>;
+ bias-disable;
+ };
+ };
+
+- i2c@f9964000 {
+- status = "okay";
++ blsp2_uart7_pin_a: blsp2-uart7-pin-active {
++ tx {
++ pins = "gpio41";
++ function = "blsp_uart7";
+
+- clock-frequency = <355000>;
+- qcom,src-freq = <50000000>;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c8_pins>;
++ drive-strength = <2>;
++ bias-disable;
++ };
+
+- synaptics@2c {
+- compatible = "syna,rmi4-i2c";
+- reg = <0x2c>;
++ rx {
++ pins = "gpio42";
++ function = "blsp_uart7";
+
+- interrupt-parent = <&msmgpio>;
+- interrupts = <86 IRQ_TYPE_EDGE_FALLING>;
++ drive-strength = <2>;
++ bias-pull-up;
++ };
+
+- #address-cells = <1>;
+- #size-cells = <0>;
++ cts {
++ pins = "gpio43";
++ function = "blsp_uart7";
+
+- vdd-supply = <&pm8941_l22>;
+- vio-supply = <&pm8941_lvs3>;
++ drive-strength = <2>;
++ bias-pull-up;
++ };
+
+- pinctrl-names = "default";
+- pinctrl-0 = <&ts_int_pin>;
++ rts {
++ pins = "gpio44";
++ function = "blsp_uart7";
+
+- syna,startup-delay-ms = <10>;
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
+
+- rmi-f01@1 {
+- reg = <0x1>;
+- syna,nosleep = <1>;
+- };
++ i2c8_pins: i2c8 {
++ mux {
++ pins = "gpio47", "gpio48";
++ function = "blsp_i2c8";
+
+- rmi-f11@11 {
+- reg = <0x11>;
+- syna,f11-flip-x = <1>;
+- syna,sensor-type = <1>;
+- };
++ drive-strength = <2>;
++ bias-disable;
+ };
+ };
+
+- i2c@f9967000 {
+- status = "okay";
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c11_pins>;
+- clock-frequency = <355000>;
+- qcom,src-freq = <50000000>;
+-
+- lp8566_wled: backlight@2c {
+- compatible = "ti,lp8556";
+- reg = <0x2c>;
+- power-supply = <&vreg_bl_vddio>;
+-
+- bl-name = "backlight";
+- dev-ctrl = /bits/ 8 <0x05>;
+- init-brt = /bits/ 8 <0x3f>;
+- rom_a0h {
+- rom-addr = /bits/ 8 <0xa0>;
+- rom-val = /bits/ 8 <0xff>;
+- };
+- rom_a1h {
+- rom-addr = /bits/ 8 <0xa1>;
+- rom-val = /bits/ 8 <0x3f>;
+- };
+- rom_a2h {
+- rom-addr = /bits/ 8 <0xa2>;
+- rom-val = /bits/ 8 <0x20>;
+- };
+- rom_a3h {
+- rom-addr = /bits/ 8 <0xa3>;
+- rom-val = /bits/ 8 <0x5e>;
+- };
+- rom_a4h {
+- rom-addr = /bits/ 8 <0xa4>;
+- rom-val = /bits/ 8 <0x02>;
+- };
+- rom_a5h {
+- rom-addr = /bits/ 8 <0xa5>;
+- rom-val = /bits/ 8 <0x04>;
+- };
+- rom_a6h {
+- rom-addr = /bits/ 8 <0xa6>;
+- rom-val = /bits/ 8 <0x80>;
+- };
+- rom_a7h {
+- rom-addr = /bits/ 8 <0xa7>;
+- rom-val = /bits/ 8 <0xf7>;
+- };
+- rom_a9h {
+- rom-addr = /bits/ 8 <0xa9>;
+- rom-val = /bits/ 8 <0x80>;
+- };
+- rom_aah {
+- rom-addr = /bits/ 8 <0xaa>;
+- rom-val = /bits/ 8 <0x0f>;
+- };
+- rom_aeh {
+- rom-addr = /bits/ 8 <0xae>;
+- rom-val = /bits/ 8 <0x0f>;
+- };
++ i2c11_pins: i2c11 {
++ mux {
++ pins = "gpio83", "gpio84";
++ function = "blsp_i2c11";
++
++ drive-strength = <2>;
++ bias-disable;
+ };
+ };
+-};
+
+-&spmi_bus {
+- pm8941@0 {
+- charger@1000 {
+- qcom,fast-charge-safe-current = <1500000>;
+- qcom,fast-charge-current-limit = <1500000>;
+- qcom,dc-current-limit = <1800000>;
+- qcom,fast-charge-safe-voltage = <4400000>;
+- qcom,fast-charge-high-threshold-voltage = <4350000>;
+- qcom,fast-charge-low-threshold-voltage = <3400000>;
+- qcom,auto-recharge-threshold-voltage = <4200000>;
+- qcom,minimum-input-voltage = <4300000>;
++ lcd_backlight_en_pin_a: lcd-backlight-vddio {
++ pins = "gpio69";
++ drive-strength = <10>;
++ output-low;
++ bias-disable;
++ };
++
++ sdhc1_pin_a: sdhc1-pin-active {
++ clk {
++ pins = "sdc1_clk";
++ drive-strength = <16>;
++ bias-disable;
+ };
+
+- gpios@c000 {
+- gpio_keys_pin_a: gpio-keys-active {
+- pins = "gpio2", "gpio5";
+- function = "normal";
++ cmd-data {
++ pins = "sdc1_cmd", "sdc1_data";
++ drive-strength = <10>;
++ bias-pull-up;
++ };
++ };
+
+- bias-pull-up;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ sdhc2_cd_pin_a: sdhc2-cd-pin-active {
++ pins = "gpio62";
++ function = "gpio";
+
+- bt_reg_on_pin: bt-reg-on {
+- pins = "gpio16";
+- function = "normal";
++ drive-strength = <2>;
++ bias-disable;
++ };
+
+- output-low;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ sdhc2_pin_a: sdhc2-pin-active {
++ clk {
++ pins = "sdc2_clk";
++ drive-strength = <6>;
++ bias-disable;
++ };
+
+- wlan_sleep_clk_pin: wl-sleep-clk {
+- pins = "gpio17";
+- function = "func2";
++ cmd-data {
++ pins = "sdc2_cmd", "sdc2_data";
++ drive-strength = <6>;
++ bias-pull-up;
++ };
++ };
+
+- output-high;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ sdhc3_pin_a: sdhc3-pin-active {
++ clk {
++ pins = "gpio40";
++ function = "sdc3";
+
+- wlan_regulator_pin: wl-reg-active {
+- pins = "gpio18";
+- function = "normal";
++ drive-strength = <10>;
++ bias-disable;
++ };
+
+- bias-disable;
+- power-source = <PM8941_GPIO_S3>;
+- };
++ cmd {
++ pins = "gpio39";
++ function = "sdc3";
+
+- lcd_dcdc_en_pin_a: lcd-dcdc-en-active {
+- pins = "gpio20";
+- function = "normal";
++ drive-strength = <10>;
++ bias-pull-up;
++ };
+
+- bias-disable;
+- power-source = <PM8941_GPIO_S3>;
+- input-disable;
+- output-low;
+- };
++ data {
++ pins = "gpio35", "gpio36", "gpio37", "gpio38";
++ function = "sdc3";
+
++ drive-strength = <10>;
++ bias-pull-up;
+ };
++ };
+
+- coincell@2800 {
+- status = "okay";
+- qcom,rset-ohms = <2100>;
+- qcom,vset-millivolts = <3000>;
++ ts_int_pin: synaptics {
++ pin {
++ pins = "gpio86";
++ function = "gpio";
++
++ drive-strength = <2>;
++ bias-disable;
++ input-enable;
+ };
+ };
++
++ bt_host_wake_pin: bt-host-wake {
++ pins = "gpio95";
++ function = "gpio";
++
++ drive-strength = <2>;
++ bias-disable;
++ output-low;
++ };
++
++ bt_dev_wake_pin: bt-dev-wake {
++ pins = "gpio96";
++ function = "gpio";
++
++ drive-strength = <2>;
++ bias-disable;
++ };
+ };
+diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts
+index a62e5c25b23c0..ea6a941d8f8cd 100644
+--- a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts
++++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts
+@@ -1,484 +1,7 @@
+ // SPDX-License-Identifier: GPL-2.0
+-#include "qcom-msm8974.dtsi"
+-#include "qcom-pm8841.dtsi"
+-#include "qcom-pm8941.dtsi"
+-#include <dt-bindings/gpio/gpio.h>
+-#include <dt-bindings/input/input.h>
+-#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
++#include "qcom-msm8974-sony-xperia-rhine.dtsi"
+
+ / {
+ model = "Sony Xperia Z1";
+ compatible = "sony,xperia-honami", "qcom,msm8974";
+-
+- aliases {
+- serial0 = &blsp1_uart2;
+- };
+-
+- chosen {
+- stdout-path = "serial0:115200n8";
+- };
+-
+- gpio-keys {
+- compatible = "gpio-keys";
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&gpio_keys_pin_a>;
+-
+- volume-down {
+- label = "volume_down";
+- gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEDOWN>;
+- };
+-
+- camera-snapshot {
+- label = "camera_snapshot";
+- gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_CAMERA>;
+- };
+-
+- camera-focus {
+- label = "camera_focus";
+- gpios = <&pm8941_gpios 4 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_CAMERA_FOCUS>;
+- };
+-
+- volume-up {
+- label = "volume_up";
+- gpios = <&pm8941_gpios 5 GPIO_ACTIVE_LOW>;
+- linux,input-type = <1>;
+- linux,code = <KEY_VOLUMEUP>;
+- };
+- };
+-
+- memory@0 {
+- reg = <0 0x40000000>, <0x40000000 0x40000000>;
+- device_type = "memory";
+- };
+-
+- smd {
+- rpm {
+- rpm_requests {
+- pm8841-regulators {
+- s1 {
+- regulator-min-microvolt = <675000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <500000>;
+- regulator-max-microvolt = <1050000>;
+- };
+- };
+-
+- pm8941-regulators {
+- vdd_l1_l3-supply = <&pm8941_s1>;
+- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+- vdd_l4_l11-supply = <&pm8941_s1>;
+- vdd_l5_l7-supply = <&pm8941_s2>;
+- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+- vdd_l21-supply = <&vreg_boost>;
+-
+- s1 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1300000>;
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s2 {
+- regulator-min-microvolt = <2150000>;
+- regulator-max-microvolt = <2150000>;
+- regulator-boot-on;
+- };
+-
+- s3 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <5000000>;
+- regulator-max-microvolt = <5000000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-boot-on;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1300000>;
+- regulator-max-microvolt = <1350000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l19 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l20 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-allow-set-load;
+- regulator-boot-on;
+- regulator-system-load = <200000>;
+- };
+-
+- l21 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+-
+- regulator-boot-on;
+- };
+-
+- l22 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l23 {
+- regulator-min-microvolt = <2800000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l24 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+-
+- regulator-boot-on;
+- };
+- };
+- };
+- };
+- };
+-};
+-
+-&soc {
+- usb@f9a55000 {
+- status = "okay";
+-
+- phys = <&usb_hs1_phy>;
+- phy-select = <&tcsr 0xb000 0>;
+- extcon = <&smbb>, <&usb_id>;
+- vbus-supply = <&chg_otg>;
+-
+- hnp-disable;
+- srp-disable;
+- adp-disable;
+-
+- ulpi {
+- phy@a {
+- status = "okay";
+-
+- v1p8-supply = <&pm8941_l6>;
+- v3p3-supply = <&pm8941_l24>;
+-
+- extcon = <&smbb>;
+- qcom,init-seq = /bits/ 8 <0x1 0x64>;
+- };
+- };
+- };
+-
+- sdhci@f9824900 {
+- status = "okay";
+-
+- vmmc-supply = <&pm8941_l20>;
+- vqmmc-supply = <&pm8941_s3>;
+-
+- bus-width = <8>;
+- non-removable;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc1_pin_a>;
+- };
+-
+- sdhci@f98a4900 {
+- status = "okay";
+-
+- bus-width = <4>;
+-
+- vmmc-supply = <&pm8941_l21>;
+- vqmmc-supply = <&pm8941_l13>;
+-
+- cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
+- };
+-
+- serial@f991e000 {
+- status = "okay";
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&blsp1_uart2_pin_a>;
+- };
+-
+- i2c@f9924000 {
+- status = "okay";
+-
+- clock-frequency = <355000>;
+- qcom,src-freq = <50000000>;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&i2c2_pins>;
+-
+- synaptics@2c {
+- compatible = "syna,rmi4-i2c";
+- reg = <0x2c>;
+-
+- interrupts-extended = <&msmgpio 61 IRQ_TYPE_EDGE_FALLING>;
+-
+- #address-cells = <1>;
+- #size-cells = <0>;
+-
+- vdd-supply = <&pm8941_l22>;
+- vio-supply = <&pm8941_lvs3>;
+-
+- pinctrl-names = "default";
+- pinctrl-0 = <&ts_int_pin>;
+-
+- syna,startup-delay-ms = <10>;
+-
+- rmi4-f01@1 {
+- reg = <0x1>;
+- syna,nosleep-mode = <1>;
+- };
+-
+- rmi4-f11@11 {
+- reg = <0x11>;
+- touchscreen-inverted-x;
+- syna,sensor-type = <1>;
+- };
+- };
+- };
+-
+- pinctrl@fd510000 {
+- blsp1_uart2_pin_a: blsp1-uart2-pin-active {
+- rx {
+- pins = "gpio5";
+- function = "blsp_uart2";
+-
+- drive-strength = <2>;
+- bias-pull-up;
+- };
+-
+- tx {
+- pins = "gpio4";
+- function = "blsp_uart2";
+-
+- drive-strength = <4>;
+- bias-disable;
+- };
+- };
+-
+- i2c2_pins: i2c2 {
+- mux {
+- pins = "gpio6", "gpio7";
+- function = "blsp_i2c2";
+-
+- drive-strength = <2>;
+- bias-disable;
+- };
+- };
+-
+- sdhc1_pin_a: sdhc1-pin-active {
+- clk {
+- pins = "sdc1_clk";
+- drive-strength = <16>;
+- bias-disable;
+- };
+-
+- cmd-data {
+- pins = "sdc1_cmd", "sdc1_data";
+- drive-strength = <10>;
+- bias-pull-up;
+- };
+- };
+-
+- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
+- pins = "gpio62";
+- function = "gpio";
+-
+- drive-strength = <2>;
+- bias-disable;
+- };
+-
+- sdhc2_pin_a: sdhc2-pin-active {
+- clk {
+- pins = "sdc2_clk";
+- drive-strength = <10>;
+- bias-disable;
+- };
+-
+- cmd-data {
+- pins = "sdc2_cmd", "sdc2_data";
+- drive-strength = <6>;
+- bias-pull-up;
+- };
+- };
+-
+- ts_int_pin: touch-int {
+- pin {
+- pins = "gpio61";
+- function = "gpio";
+-
+- drive-strength = <2>;
+- bias-disable;
+- input-enable;
+- };
+- };
+- };
+-
+- dma-controller@f9944000 {
+- qcom,controlled-remotely;
+- };
+-};
+-
+-&spmi_bus {
+- pm8941@0 {
+- charger@1000 {
+- qcom,fast-charge-safe-current = <1500000>;
+- qcom,fast-charge-current-limit = <1500000>;
+- qcom,dc-current-limit = <1800000>;
+- qcom,fast-charge-safe-voltage = <4400000>;
+- qcom,fast-charge-high-threshold-voltage = <4350000>;
+- qcom,fast-charge-low-threshold-voltage = <3400000>;
+- qcom,auto-recharge-threshold-voltage = <4200000>;
+- qcom,minimum-input-voltage = <4300000>;
+- };
+-
+- gpios@c000 {
+- gpio_keys_pin_a: gpio-keys-active {
+- pins = "gpio2", "gpio3", "gpio4", "gpio5";
+- function = "normal";
+-
+- bias-pull-up;
+- power-source = <PM8941_GPIO_S3>;
+- };
+- };
+-
+- coincell@2800 {
+- status = "okay";
+- qcom,rset-ohms = <2100>;
+- qcom,vset-millivolts = <3000>;
+- };
+- };
+-
+- pm8941@1 {
+- wled@d800 {
+- status = "okay";
+-
+- qcom,cs-out;
+- qcom,current-limit = <20>;
+- qcom,current-boost-limit = <805>;
+- qcom,switching-freq = <1600>;
+- qcom,ovp = <29>;
+- qcom,num-strings = <2>;
+- };
+- };
+ };
+diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-rhine.dtsi b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-rhine.dtsi
+new file mode 100644
+index 0000000000000..870e0aeb4d057
+--- /dev/null
++++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-rhine.dtsi
+@@ -0,0 +1,455 @@
++// SPDX-License-Identifier: GPL-2.0
++#include "qcom-msm8974.dtsi"
++#include "qcom-pm8841.dtsi"
++#include "qcom-pm8941.dtsi"
++#include <dt-bindings/gpio/gpio.h>
++#include <dt-bindings/input/input.h>
++#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
++
++/ {
++ aliases {
++ serial0 = &blsp1_uart2;
++ };
++
++ chosen {
++ stdout-path = "serial0:115200n8";
++ };
++
++ gpio-keys {
++ compatible = "gpio-keys";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&gpio_keys_pin_a>;
++
++ volume-down {
++ label = "volume_down";
++ gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_VOLUMEDOWN>;
++ };
++
++ camera-snapshot {
++ label = "camera_snapshot";
++ gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_CAMERA>;
++ };
++
++ camera-focus {
++ label = "camera_focus";
++ gpios = <&pm8941_gpios 4 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_CAMERA_FOCUS>;
++ };
++
++ volume-up {
++ label = "volume_up";
++ gpios = <&pm8941_gpios 5 GPIO_ACTIVE_LOW>;
++ linux,input-type = <1>;
++ linux,code = <KEY_VOLUMEUP>;
++ };
++ };
++};
++
++&blsp1_i2c2 {
++ status = "okay";
++ clock-frequency = <355000>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&i2c2_pins>;
++
++ synaptics@2c {
++ compatible = "syna,rmi4-i2c";
++ reg = <0x2c>;
++
++ interrupts-extended = <&tlmm 61 IRQ_TYPE_EDGE_FALLING>;
++
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ vdd-supply = <&pm8941_l22>;
++ vio-supply = <&pm8941_lvs3>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&ts_int_pin>;
++
++ syna,startup-delay-ms = <10>;
++
++ rmi4-f01@1 {
++ reg = <0x1>;
++ syna,nosleep-mode = <1>;
++ };
++
++ rmi4-f11@11 {
++ reg = <0x11>;
++ touchscreen-inverted-x;
++ syna,sensor-type = <1>;
++ };
++ };
++};
++
++&blsp1_uart2 {
++ status = "okay";
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&blsp1_uart2_pin_a>;
++};
++
++&blsp2_dma {
++ qcom,controlled-remotely;
++};
++
++&otg {
++ status = "okay";
++
++ phys = <&usb_hs1_phy>;
++ phy-select = <&tcsr 0xb000 0>;
++ extcon = <&smbb>, <&usb_id>;
++ vbus-supply = <&chg_otg>;
++
++ hnp-disable;
++ srp-disable;
++ adp-disable;
++
++ ulpi {
++ phy@a {
++ status = "okay";
++
++ v1p8-supply = <&pm8941_l6>;
++ v3p3-supply = <&pm8941_l24>;
++
++ extcon = <&smbb>;
++ qcom,init-seq = /bits/ 8 <0x1 0x64>;
++ };
++ };
++};
++
++&pm8941_coincell {
++ status = "okay";
++ qcom,rset-ohms = <2100>;
++ qcom,vset-millivolts = <3000>;
++};
++
++&pm8941_gpios {
++ gpio_keys_pin_a: gpio-keys-active {
++ pins = "gpio2", "gpio3", "gpio4", "gpio5";
++ function = "normal";
++
++ bias-pull-up;
++ power-source = <PM8941_GPIO_S3>;
++ };
++};
++
++&pm8941_wled {
++ status = "okay";
++
++ qcom,cs-out;
++ qcom,current-limit = <20>;
++ qcom,current-boost-limit = <805>;
++ qcom,switching-freq = <1600>;
++ qcom,ovp = <29>;
++ qcom,num-strings = <2>;
++};
++
++&rpm_requests {
++ pm8841-regulators {
++ compatible = "qcom,rpm-pm8841-regulators";
++
++ pm8841_s1: s1 {
++ regulator-min-microvolt = <675000>;
++ regulator-max-microvolt = <1050000>;
++ };
++
++ pm8841_s2: s2 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
++ };
++
++ pm8841_s3: s3 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
++ };
++
++ pm8841_s4: s4 {
++ regulator-min-microvolt = <500000>;
++ regulator-max-microvolt = <1050000>;
++ };
++ };
++
++ pm8941-regulators {
++ compatible = "qcom,rpm-pm8941-regulators";
++
++ vdd_l1_l3-supply = <&pm8941_s1>;
++ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
++ vdd_l4_l11-supply = <&pm8941_s1>;
++ vdd_l5_l7-supply = <&pm8941_s2>;
++ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
++ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
++ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
++ vdd_l21-supply = <&vreg_boost>;
++
++ pm8941_s1: s1 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1300000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_s2: s2 {
++ regulator-min-microvolt = <2150000>;
++ regulator-max-microvolt = <2150000>;
++ regulator-boot-on;
++ };
++
++ pm8941_s3: s3 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_s4: s4 {
++ regulator-min-microvolt = <5000000>;
++ regulator-max-microvolt = <5000000>;
++ };
++
++ pm8941_l1: l1 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_l2: l2 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
++
++ pm8941_l3: l3 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
++
++ pm8941_l4: l4 {
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
++ };
++
++ pm8941_l5: l5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l6: l6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l7: l7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l8: l8 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l9: l9 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
++
++ pm8941_l11: l11 {
++ regulator-min-microvolt = <1300000>;
++ regulator-max-microvolt = <1350000>;
++ };
++
++ pm8941_l12: l12 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-always-on;
++ regulator-boot-on;
++ };
++
++ pm8941_l13: l13 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l14: l14 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8941_l15: l15 {
++ regulator-min-microvolt = <2050000>;
++ regulator-max-microvolt = <2050000>;
++ };
++
++ pm8941_l16: l16 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
++
++ pm8941_l17: l17 {
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
++ };
++
++ pm8941_l18: l18 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++
++ pm8941_l19: l19 {
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
++ };
++
++ pm8941_l20: l20 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-system-load = <200000>;
++ regulator-allow-set-load;
++ regulator-boot-on;
++ };
++
++ pm8941_l21: l21 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-boot-on;
++ };
++
++ pm8941_l22: l22 {
++ regulator-min-microvolt = <3000000>;
++ regulator-max-microvolt = <3000000>;
++ };
++
++ pm8941_l23: l23 {
++ regulator-min-microvolt = <2800000>;
++ regulator-max-microvolt = <2800000>;
++ };
++
++ pm8941_l24: l24 {
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
++ regulator-boot-on;
++ };
++
++ pm8941_lvs3: lvs3 {};
++ };
++};
++
++&sdhc_1 {
++ status = "okay";
++
++ vmmc-supply = <&pm8941_l20>;
++ vqmmc-supply = <&pm8941_s3>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc1_pin_a>;
++};
++
++&sdhc_2 {
++ status = "okay";
++
++ vmmc-supply = <&pm8941_l21>;
++ vqmmc-supply = <&pm8941_l13>;
++
++ cd-gpios = <&tlmm 62 GPIO_ACTIVE_LOW>;
++
++ pinctrl-names = "default";
++ pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
++};
++
++&smbb {
++ qcom,fast-charge-safe-current = <1500000>;
++ qcom,fast-charge-current-limit = <1500000>;
++ qcom,dc-current-limit = <1800000>;
++ qcom,fast-charge-safe-voltage = <4400000>;
++ qcom,fast-charge-high-threshold-voltage = <4350000>;
++ qcom,fast-charge-low-threshold-voltage = <3400000>;
++ qcom,auto-recharge-threshold-voltage = <4200000>;
++ qcom,minimum-input-voltage = <4300000>;
++};
++
++&tlmm {
++ ts_int_pin: touch-int {
++ pin {
++ pins = "gpio61";
++ function = "gpio";
++
++ drive-strength = <2>;
++ bias-disable;
++ input-enable;
++ };
++ };
++
++ blsp1_uart2_pin_a: blsp1-uart2-pin-active {
++ rx {
++ pins = "gpio5";
++ function = "blsp_uart2";
++
++ drive-strength = <2>;
++ bias-pull-up;
++ };
++
++ tx {
++ pins = "gpio4";
++ function = "blsp_uart2";
++
++ drive-strength = <4>;
++ bias-disable;
++ };
++ };
++
++ i2c2_pins: i2c2 {
++ mux {
++ pins = "gpio6", "gpio7";
++ function = "blsp_i2c2";
++
++ drive-strength = <2>;
++ bias-disable;
++ };
++ };
++
++ sdhc1_pin_a: sdhc1-pin-active {
++ clk {
++ pins = "sdc1_clk";
++ drive-strength = <16>;
++ bias-disable;
++ };
++
++ cmd-data {
++ pins = "sdc1_cmd", "sdc1_data";
++ drive-strength = <10>;
++ bias-pull-up;
++ };
++ };
++
++ sdhc2_cd_pin_a: sdhc2-cd-pin-active {
++ pins = "gpio62";
++ function = "gpio";
++
++ drive-strength = <2>;
++ bias-disable;
++ };
++
++ sdhc2_pin_a: sdhc2-pin-active {
++ clk {
++ pins = "sdc2_clk";
++ drive-strength = <10>;
++ bias-disable;
++ };
++
++ cmd-data {
++ pins = "sdc2_cmd", "sdc2_data";
++ drive-strength = <6>;
++ bias-pull-up;
++ };
++ };
++};
+diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
+index 412d94736c354..05a36566bd52d 100644
+--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
++++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
+@@ -16,57 +16,17 @@
+ compatible = "qcom,msm8974";
+ interrupt-parent = <&intc>;
+
+- reserved-memory {
+- #address-cells = <1>;
+- #size-cells = <1>;
+- ranges;
+-
+- mpss_region: mpss@8000000 {
+- reg = <0x08000000 0x5100000>;
+- no-map;
+- };
+-
+- mba_region: mba@d100000 {
+- reg = <0x0d100000 0x100000>;
+- no-map;
+- };
+-
+- wcnss_region: wcnss@d200000 {
+- reg = <0x0d200000 0xa00000>;
+- no-map;
+- };
+-
+- adsp_region: adsp@dc00000 {
+- reg = <0x0dc00000 0x1900000>;
+- no-map;
+- };
+-
+- venus@f500000 {
+- reg = <0x0f500000 0x500000>;
+- no-map;
+- };
+-
+- smem_region: smem@fa00000 {
+- reg = <0xfa00000 0x200000>;
+- no-map;
+- };
+-
+- tz@fc00000 {
+- reg = <0x0fc00000 0x160000>;
+- no-map;
+- };
+-
+- rfsa@fd60000 {
+- reg = <0x0fd60000 0x20000>;
+- no-map;
++ clocks {
++ xo_board: xo_board {
++ compatible = "fixed-clock";
++ #clock-cells = <0>;
++ clock-frequency = <19200000>;
+ };
+
+- rmtfs@fd80000 {
+- compatible = "qcom,rmtfs-mem";
+- reg = <0x0fd80000 0x180000>;
+- no-map;
+-
+- qcom,client-id = <1>;
++ sleep_clk: sleep_clk {
++ compatible = "fixed-clock";
++ #clock-cells = <0>;
++ clock-frequency = <32768>;
+ };
+ };
+
+@@ -136,283 +96,120 @@
+ };
+ };
+
++ firmware {
++ scm {
++ compatible = "qcom,scm";
++ clocks = <&gcc GCC_CE1_CLK>, <&gcc GCC_CE1_AXI_CLK>, <&gcc GCC_CE1_AHB_CLK>;
++ clock-names = "core", "bus", "iface";
++ };
++ };
++
+ memory {
+ device_type = "memory";
+ reg = <0x0 0x0>;
+ };
+
+- thermal-zones {
+- cpu0-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
++ pmu {
++ compatible = "qcom,krait-pmu";
++ interrupts = <GIC_PPI 7 0xf04>;
++ };
+
+- thermal-sensors = <&tsens 5>;
++ reserved-memory {
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges;
+
+- trips {
+- cpu_alert0: trip0 {
+- temperature = <75000>;
+- hysteresis = <2000>;
+- type = "passive";
+- };
+- cpu_crit0: trip1 {
+- temperature = <110000>;
+- hysteresis = <2000>;
+- type = "critical";
+- };
+- };
++ mpss_region: mpss@8000000 {
++ reg = <0x08000000 0x5100000>;
++ no-map;
+ };
+
+- cpu1-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
++ mba_region: mba@d100000 {
++ reg = <0x0d100000 0x100000>;
++ no-map;
++ };
+
+- thermal-sensors = <&tsens 6>;
++ wcnss_region: wcnss@d200000 {
++ reg = <0x0d200000 0xa00000>;
++ no-map;
++ };
+
+- trips {
+- cpu_alert1: trip0 {
+- temperature = <75000>;
+- hysteresis = <2000>;
+- type = "passive";
+- };
+- cpu_crit1: trip1 {
+- temperature = <110000>;
+- hysteresis = <2000>;
+- type = "critical";
+- };
+- };
++ adsp_region: adsp@dc00000 {
++ reg = <0x0dc00000 0x1900000>;
++ no-map;
+ };
+
+- cpu2-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
++ venus_region: memory@f500000 {
++ reg = <0x0f500000 0x500000>;
++ no-map;
++ };
+
+- thermal-sensors = <&tsens 7>;
++ smem_region: smem@fa00000 {
++ reg = <0xfa00000 0x200000>;
++ no-map;
++ };
+
+- trips {
+- cpu_alert2: trip0 {
+- temperature = <75000>;
+- hysteresis = <2000>;
+- type = "passive";
+- };
+- cpu_crit2: trip1 {
+- temperature = <110000>;
+- hysteresis = <2000>;
+- type = "critical";
+- };
+- };
++ tz_region: memory@fc00000 {
++ reg = <0x0fc00000 0x160000>;
++ no-map;
+ };
+
+- cpu3-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
++ rfsa_mem: memory@fd60000 {
++ reg = <0x0fd60000 0x20000>;
++ no-map;
++ };
+
+- thermal-sensors = <&tsens 8>;
++ rmtfs@fd80000 {
++ compatible = "qcom,rmtfs-mem";
++ reg = <0x0fd80000 0x180000>;
++ no-map;
+
+- trips {
+- cpu_alert3: trip0 {
+- temperature = <75000>;
+- hysteresis = <2000>;
+- type = "passive";
+- };
+- cpu_crit3: trip1 {
+- temperature = <110000>;
+- hysteresis = <2000>;
+- type = "critical";
+- };
+- };
++ qcom,client-id = <1>;
+ };
++ };
+
+- q6-dsp-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
+-
+- thermal-sensors = <&tsens 1>;
++ smem {
++ compatible = "qcom,smem";
+
+- trips {
+- q6_dsp_alert0: trip-point0 {
+- temperature = <90000>;
+- hysteresis = <2000>;
+- type = "hot";
+- };
+- };
+- };
++ memory-region = <&smem_region>;
++ qcom,rpm-msg-ram = <&rpm_msg_ram>;
+
+- modemtx-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
++ hwlocks = <&tcsr_mutex 3>;
++ };
+
+- thermal-sensors = <&tsens 2>;
++ smp2p-adsp {
++ compatible = "qcom,smp2p";
++ qcom,smem = <443>, <429>;
+
+- trips {
+- modemtx_alert0: trip-point0 {
+- temperature = <90000>;
+- hysteresis = <2000>;
+- type = "hot";
+- };
+- };
+- };
++ interrupt-parent = <&intc>;
++ interrupts = <GIC_SPI 158 IRQ_TYPE_EDGE_RISING>;
+
+- video-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
++ qcom,ipc = <&apcs 8 10>;
+
+- thermal-sensors = <&tsens 3>;
++ qcom,local-pid = <0>;
++ qcom,remote-pid = <2>;
+
+- trips {
+- video_alert0: trip-point0 {
+- temperature = <95000>;
+- hysteresis = <2000>;
+- type = "hot";
+- };
+- };
++ adsp_smp2p_out: master-kernel {
++ qcom,entry-name = "master-kernel";
++ #qcom,smem-state-cells = <1>;
+ };
+
+- wlan-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
+-
+- thermal-sensors = <&tsens 4>;
++ adsp_smp2p_in: slave-kernel {
++ qcom,entry-name = "slave-kernel";
+
+- trips {
+- wlan_alert0: trip-point0 {
+- temperature = <105000>;
+- hysteresis = <2000>;
+- type = "hot";
+- };
+- };
++ interrupt-controller;
++ #interrupt-cells = <2>;
+ };
++ };
+
+- gpu-top-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
++ smp2p-modem {
++ compatible = "qcom,smp2p";
++ qcom,smem = <435>, <428>;
+
+- thermal-sensors = <&tsens 9>;
++ interrupt-parent = <&intc>;
++ interrupts = <GIC_SPI 27 IRQ_TYPE_EDGE_RISING>;
+
+- trips {
+- gpu1_alert0: trip-point0 {
+- temperature = <90000>;
+- hysteresis = <2000>;
+- type = "hot";
+- };
+- };
+- };
+-
+- gpu-bottom-thermal {
+- polling-delay-passive = <250>;
+- polling-delay = <1000>;
+-
+- thermal-sensors = <&tsens 10>;
+-
+- trips {
+- gpu2_alert0: trip-point0 {
+- temperature = <90000>;
+- hysteresis = <2000>;
+- type = "hot";
+- };
+- };
+- };
+- };
+-
+- cpu-pmu {
+- compatible = "qcom,krait-pmu";
+- interrupts = <GIC_PPI 7 0xf04>;
+- };
+-
+- clocks {
+- xo_board: xo_board {
+- compatible = "fixed-clock";
+- #clock-cells = <0>;
+- clock-frequency = <19200000>;
+- };
+-
+- sleep_clk: sleep_clk {
+- compatible = "fixed-clock";
+- #clock-cells = <0>;
+- clock-frequency = <32768>;
+- };
+- };
+-
+- timer {
+- compatible = "arm,armv7-timer";
+- interrupts = <GIC_PPI 2 0xf08>,
+- <GIC_PPI 3 0xf08>,
+- <GIC_PPI 4 0xf08>,
+- <GIC_PPI 1 0xf08>;
+- clock-frequency = <19200000>;
+- };
+-
+- adsp-pil {
+- compatible = "qcom,msm8974-adsp-pil";
+-
+- interrupts-extended = <&intc GIC_SPI 162 IRQ_TYPE_EDGE_RISING>,
+- <&adsp_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+- <&adsp_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
+- <&adsp_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
+- <&adsp_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
+- interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
+-
+- cx-supply = <&pm8841_s2>;
+-
+- clocks = <&xo_board>;
+- clock-names = "xo";
+-
+- memory-region = <&adsp_region>;
+-
+- qcom,smem-states = <&adsp_smp2p_out 0>;
+- qcom,smem-state-names = "stop";
+-
+- smd-edge {
+- interrupts = <GIC_SPI 156 IRQ_TYPE_EDGE_RISING>;
+-
+- qcom,ipc = <&apcs 8 8>;
+- qcom,smd-edge = <1>;
+-
+- label = "lpass";
+- };
+- };
+-
+- smem {
+- compatible = "qcom,smem";
+-
+- memory-region = <&smem_region>;
+- qcom,rpm-msg-ram = <&rpm_msg_ram>;
+-
+- hwlocks = <&tcsr_mutex 3>;
+- };
+-
+- smp2p-adsp {
+- compatible = "qcom,smp2p";
+- qcom,smem = <443>, <429>;
+-
+- interrupt-parent = <&intc>;
+- interrupts = <GIC_SPI 158 IRQ_TYPE_EDGE_RISING>;
+-
+- qcom,ipc = <&apcs 8 10>;
+-
+- qcom,local-pid = <0>;
+- qcom,remote-pid = <2>;
+-
+- adsp_smp2p_out: master-kernel {
+- qcom,entry-name = "master-kernel";
+- #qcom,smem-state-cells = <1>;
+- };
+-
+- adsp_smp2p_in: slave-kernel {
+- qcom,entry-name = "slave-kernel";
+-
+- interrupt-controller;
+- #interrupt-cells = <2>;
+- };
+- };
+-
+- smp2p-modem {
+- compatible = "qcom,smp2p";
+- qcom,smem = <435>, <428>;
+-
+- interrupt-parent = <&intc>;
+- interrupts = <GIC_SPI 27 IRQ_TYPE_EDGE_RISING>;
+-
+- qcom,ipc = <&apcs 8 14>;
++ qcom,ipc = <&apcs 8 14>;
+
+ qcom,local-pid = <0>;
+ qcom,remote-pid = <1>;
+@@ -497,11 +294,23 @@
+ };
+ };
+
+- firmware {
+- scm {
+- compatible = "qcom,scm";
+- clocks = <&gcc GCC_CE1_CLK>, <&gcc GCC_CE1_AXI_CLK>, <&gcc GCC_CE1_AHB_CLK>;
+- clock-names = "core", "bus", "iface";
++ smd {
++ compatible = "qcom,smd";
++
++ rpm {
++ interrupts = <GIC_SPI 168 IRQ_TYPE_EDGE_RISING>;
++ qcom,ipc = <&apcs 8 0>;
++ qcom,smd-edge = <15>;
++
++ rpm_requests: rpm_requests {
++ compatible = "qcom,rpm-msm8974";
++ qcom,smd-channels = "rpm_requests";
++
++ rpmcc: clock-controller {
++ compatible = "qcom,rpmcc-msm8974", "qcom,rpmcc";
++ #clock-cells = <1>;
++ };
++ };
+ };
+ };
+
+@@ -524,31 +333,6 @@
+ reg = <0xf9011000 0x1000>;
+ };
+
+- qfprom: qfprom@fc4bc000 {
+- #address-cells = <1>;
+- #size-cells = <1>;
+- compatible = "qcom,qfprom";
+- reg = <0xfc4bc000 0x1000>;
+- tsens_calib: calib@d0 {
+- reg = <0xd0 0x18>;
+- };
+- tsens_backup: backup@440 {
+- reg = <0x440 0x10>;
+- };
+- };
+-
+- tsens: thermal-sensor@fc4a9000 {
+- compatible = "qcom,msm8974-tsens";
+- reg = <0xfc4a9000 0x1000>, /* TM */
+- <0xfc4a8000 0x1000>; /* SROT */
+- nvmem-cells = <&tsens_calib>, <&tsens_backup>;
+- nvmem-cell-names = "calib", "calib_backup";
+- #qcom,sensors = <11>;
+- interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "uplow";
+- #thermal-sensor-cells = <1>;
+- };
+-
+ timer@f9020000 {
+ #address-cells = <1>;
+ #size-cells = <1>;
+@@ -654,47 +438,53 @@
+ reg = <0xf90b8000 0x1000>, <0xf9008000 0x1000>;
+ };
+
+- restart@fc4ab000 {
+- compatible = "qcom,pshold";
+- reg = <0xfc4ab000 0x4>;
+- };
+-
+- gcc: clock-controller@fc400000 {
+- compatible = "qcom,gcc-msm8974";
+- #clock-cells = <1>;
+- #reset-cells = <1>;
+- #power-domain-cells = <1>;
+- reg = <0xfc400000 0x4000>;
+- };
++ sdhc_1: sdhci@f9824900 {
++ compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
++ reg = <0xf9824900 0x11c>, <0xf9824000 0x800>;
++ reg-names = "hc_mem", "core_mem";
++ interrupts = <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>,
++ <GIC_SPI 138 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "hc_irq", "pwr_irq";
++ clocks = <&gcc GCC_SDCC1_APPS_CLK>,
++ <&gcc GCC_SDCC1_AHB_CLK>,
++ <&xo_board>;
++ clock-names = "core", "iface", "xo";
++ bus-width = <8>;
++ non-removable;
+
+- tcsr: syscon@fd4a0000 {
+- compatible = "syscon";
+- reg = <0xfd4a0000 0x10000>;
++ status = "disabled";
+ };
+
+- tcsr_mutex_block: syscon@fd484000 {
+- compatible = "syscon";
+- reg = <0xfd484000 0x2000>;
+- };
++ sdhc_3: sdhci@f9864900 {
++ compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
++ reg = <0xf9864900 0x11c>, <0xf9864000 0x800>;
++ reg-names = "hc_mem", "core_mem";
++ interrupts = <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>,
++ <GIC_SPI 224 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "hc_irq", "pwr_irq";
++ clocks = <&gcc GCC_SDCC3_APPS_CLK>,
++ <&gcc GCC_SDCC3_AHB_CLK>,
++ <&xo_board>;
++ clock-names = "core", "iface", "xo";
++ bus-width = <4>;
+
+- mmcc: clock-controller@fd8c0000 {
+- compatible = "qcom,mmcc-msm8974";
+- #clock-cells = <1>;
+- #reset-cells = <1>;
+- #power-domain-cells = <1>;
+- reg = <0xfd8c0000 0x6000>;
++ status = "disabled";
+ };
+
+- tcsr_mutex: tcsr-mutex {
+- compatible = "qcom,tcsr-mutex";
+- syscon = <&tcsr_mutex_block 0 0x80>;
+-
+- #hwlock-cells = <1>;
+- };
++ sdhc_2: sdhci@f98a4900 {
++ compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
++ reg = <0xf98a4900 0x11c>, <0xf98a4000 0x800>;
++ reg-names = "hc_mem", "core_mem";
++ interrupts = <GIC_SPI 125 IRQ_TYPE_LEVEL_HIGH>,
++ <GIC_SPI 221 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "hc_irq", "pwr_irq";
++ clocks = <&gcc GCC_SDCC2_APPS_CLK>,
++ <&gcc GCC_SDCC2_AHB_CLK>,
++ <&xo_board>;
++ clock-names = "core", "iface", "xo";
++ bus-width = <4>;
+
+- rpm_msg_ram: memory@fc428000 {
+- compatible = "qcom,rpm-msg-ram";
+- reg = <0xfc428000 0x4000>;
++ status = "disabled";
+ };
+
+ blsp1_uart1: serial@f991d000 {
+@@ -715,16 +505,70 @@
+ status = "disabled";
+ };
+
+- blsp2_uart7: serial@f995d000 {
++ blsp1_i2c1: i2c@f9923000 {
++ status = "disabled";
++ compatible = "qcom,i2c-qup-v2.1.1";
++ reg = <0xf9923000 0x1000>;
++ interrupts = <0 95 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP1_QUP1_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
++ clock-names = "core", "iface";
++ #address-cells = <1>;
++ #size-cells = <0>;
++ };
++
++ blsp1_i2c2: i2c@f9924000 {
++ status = "disabled";
++ compatible = "qcom,i2c-qup-v2.1.1";
++ reg = <0xf9924000 0x1000>;
++ interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP1_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
++ clock-names = "core", "iface";
++ #address-cells = <1>;
++ #size-cells = <0>;
++ };
++
++ blsp1_i2c3: i2c@f9925000 {
++ status = "disabled";
++ compatible = "qcom,i2c-qup-v2.1.1";
++ reg = <0xf9925000 0x1000>;
++ interrupts = <0 97 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP1_QUP3_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
++ clock-names = "core", "iface";
++ #address-cells = <1>;
++ #size-cells = <0>;
++ };
++
++ blsp1_i2c6: i2c@f9928000 {
++ status = "disabled";
++ compatible = "qcom,i2c-qup-v2.1.1";
++ reg = <0xf9928000 0x1000>;
++ interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP1_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
++ clock-names = "core", "iface";
++ #address-cells = <1>;
++ #size-cells = <0>;
++ };
++
++ blsp2_dma: dma-controller@f9944000 {
++ compatible = "qcom,bam-v1.4.0";
++ reg = <0xf9944000 0x19000>;
++ interrupts = <GIC_SPI 239 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP2_AHB_CLK>;
++ clock-names = "bam_clk";
++ #dma-cells = <1>;
++ qcom,ee = <0>;
++ };
++
++ blsp2_uart1: serial@f995d000 {
+ compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
+ reg = <0xf995d000 0x1000>;
+- interrupts = <GIC_SPI 113 IRQ_TYPE_NONE>;
++ interrupts = <GIC_SPI 113 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP2_UART1_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
+ clock-names = "core", "iface";
+ status = "disabled";
+ };
+
+- blsp2_uart8: serial@f995e000 {
++ blsp2_uart2: serial@f995e000 {
+ compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
+ reg = <0xf995e000 0x1000>;
+ interrupts = <GIC_SPI 114 IRQ_TYPE_LEVEL_HIGH>;
+@@ -733,7 +577,7 @@
+ status = "disabled";
+ };
+
+- blsp2_uart10: serial@f9960000 {
++ blsp2_uart4: serial@f9960000 {
+ compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
+ reg = <0xf9960000 0x1000>;
+ interrupts = <GIC_SPI 116 IRQ_TYPE_LEVEL_HIGH>;
+@@ -742,46 +586,39 @@
+ status = "disabled";
+ };
+
+- sdhci@f9824900 {
+- compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
+- reg = <0xf9824900 0x11c>, <0xf9824000 0x800>;
+- reg-names = "hc_mem", "core_mem";
+- interrupts = <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>,
+- <GIC_SPI 138 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "hc_irq", "pwr_irq";
+- clocks = <&gcc GCC_SDCC1_APPS_CLK>,
+- <&gcc GCC_SDCC1_AHB_CLK>,
+- <&xo_board>;
+- clock-names = "core", "iface", "xo";
++ blsp2_i2c2: i2c@f9964000 {
+ status = "disabled";
++ compatible = "qcom,i2c-qup-v2.1.1";
++ reg = <0xf9964000 0x1000>;
++ interrupts = <GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP2_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
++ clock-names = "core", "iface";
++ #address-cells = <1>;
++ #size-cells = <0>;
+ };
+
+- sdhci@f9864900 {
+- compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
+- reg = <0xf9864900 0x11c>, <0xf9864000 0x800>;
+- reg-names = "hc_mem", "core_mem";
+- interrupts = <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>,
+- <GIC_SPI 224 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "hc_irq", "pwr_irq";
+- clocks = <&gcc GCC_SDCC3_APPS_CLK>,
+- <&gcc GCC_SDCC3_AHB_CLK>,
+- <&xo_board>;
+- clock-names = "core", "iface", "xo";
++ blsp2_i2c5: i2c@f9967000 {
+ status = "disabled";
++ compatible = "qcom,i2c-qup-v2.1.1";
++ reg = <0xf9967000 0x1000>;
++ interrupts = <GIC_SPI 105 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP2_QUP5_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
++ clock-names = "core", "iface";
++ #address-cells = <1>;
++ #size-cells = <0>;
++ dmas = <&blsp2_dma 20>, <&blsp2_dma 21>;
++ dma-names = "tx", "rx";
+ };
+
+- sdhci@f98a4900 {
+- compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
+- reg = <0xf98a4900 0x11c>, <0xf98a4000 0x800>;
+- reg-names = "hc_mem", "core_mem";
+- interrupts = <GIC_SPI 125 IRQ_TYPE_LEVEL_HIGH>,
+- <GIC_SPI 221 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "hc_irq", "pwr_irq";
+- clocks = <&gcc GCC_SDCC2_APPS_CLK>,
+- <&gcc GCC_SDCC2_AHB_CLK>,
+- <&xo_board>;
+- clock-names = "core", "iface", "xo";
++ blsp2_i2c6: i2c@f9968000 {
+ status = "disabled";
++ compatible = "qcom,i2c-qup-v2.1.1";
++ reg = <0xf9968000 0x1000>;
++ interrupts = <0 106 IRQ_TYPE_LEVEL_HIGH>;
++ clocks = <&gcc GCC_BLSP2_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
++ clock-names = "core", "iface";
++ #address-cells = <1>;
++ #size-cells = <0>;
+ };
+
+ otg: usb@f9a55000 {
+@@ -835,55 +672,6 @@
+ clock-names = "core";
+ };
+
+- remoteproc@fc880000 {
+- compatible = "qcom,msm8974-mss-pil";
+- reg = <0xfc880000 0x100>, <0xfc820000 0x020>;
+- reg-names = "qdsp6", "rmb";
+-
+- interrupts-extended = <&intc GIC_SPI 24 IRQ_TYPE_EDGE_RISING>,
+- <&modem_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+- <&modem_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
+- <&modem_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
+- <&modem_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
+- interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
+-
+- clocks = <&gcc GCC_MSS_Q6_BIMC_AXI_CLK>,
+- <&gcc GCC_MSS_CFG_AHB_CLK>,
+- <&gcc GCC_BOOT_ROM_AHB_CLK>,
+- <&xo_board>;
+- clock-names = "iface", "bus", "mem", "xo";
+-
+- resets = <&gcc GCC_MSS_RESTART>;
+- reset-names = "mss_restart";
+-
+- cx-supply = <&pm8841_s2>;
+- mss-supply = <&pm8841_s3>;
+- mx-supply = <&pm8841_s1>;
+- pll-supply = <&pm8941_l12>;
+-
+- qcom,halt-regs = <&tcsr_mutex_block 0x1180 0x1200 0x1280>;
+-
+- qcom,smem-states = <&modem_smp2p_out 0>;
+- qcom,smem-state-names = "stop";
+-
+- mba {
+- memory-region = <&mba_region>;
+- };
+-
+- mpss {
+- memory-region = <&mpss_region>;
+- };
+-
+- smd-edge {
+- interrupts = <GIC_SPI 25 IRQ_TYPE_EDGE_RISING>;
+-
+- qcom,ipc = <&apcs 8 12>;
+- qcom,smd-edge = <0>;
+-
+- label = "modem";
+- };
+- };
+-
+ pronto: remoteproc@fb21b000 {
+ compatible = "qcom,pronto-v2-pil", "qcom,pronto";
+ reg = <0xfb204000 0x2000>, <0xfb202000 0x1000>, <0xfb21b000 0x3000>;
+@@ -898,8 +686,6 @@
+ <&wcnss_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
+ interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
+
+- vddpx-supply = <&pm8941_s3>;
+-
+ qcom,smem-states = <&wcnss_smp2p_out 0>;
+ qcom,smem-state-names = "stop";
+
+@@ -910,11 +696,6 @@
+
+ clocks = <&rpmcc RPM_SMD_CXO_A2>;
+ clock-names = "xo";
+-
+- vddxo-supply = <&pm8941_l6>;
+- vddrfa-supply = <&pm8941_l11>;
+- vddpa-supply = <&pm8941_l19>;
+- vdddig-supply = <&pm8941_s3>;
+ };
+
+ smd-edge {
+@@ -942,139 +723,32 @@
+ interrupt-names = "tx", "rx";
+
+ qcom,smem-states = <&apps_smsm 10>, <&apps_smsm 9>;
+- qcom,smem-state-names = "tx-enable", "tx-rings-empty";
++ qcom,smem-state-names = "tx-enable",
++ "tx-rings-empty";
+ };
+ };
+ };
+ };
+
+- msmgpio: pinctrl@fd510000 {
+- compatible = "qcom,msm8974-pinctrl";
+- reg = <0xfd510000 0x4000>;
+- gpio-controller;
+- gpio-ranges = <&msmgpio 0 0 146>;
+- #gpio-cells = <2>;
+- interrupt-controller;
+- #interrupt-cells = <2>;
+- interrupts = <GIC_SPI 208 IRQ_TYPE_LEVEL_HIGH>;
+- };
+-
+- i2c@f9923000 {
+- status = "disabled";
+- compatible = "qcom,i2c-qup-v2.1.1";
+- reg = <0xf9923000 0x1000>;
+- interrupts = <0 95 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP1_QUP1_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+- clock-names = "core", "iface";
+- #address-cells = <1>;
+- #size-cells = <0>;
+- };
+-
+- i2c@f9924000 {
+- status = "disabled";
+- compatible = "qcom,i2c-qup-v2.1.1";
+- reg = <0xf9924000 0x1000>;
+- interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP1_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+- clock-names = "core", "iface";
+- #address-cells = <1>;
+- #size-cells = <0>;
+- };
+-
+- blsp_i2c3: i2c@f9925000 {
+- status = "disabled";
+- compatible = "qcom,i2c-qup-v2.1.1";
+- reg = <0xf9925000 0x1000>;
+- interrupts = <0 97 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP1_QUP3_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+- clock-names = "core", "iface";
+- #address-cells = <1>;
+- #size-cells = <0>;
+- };
+-
+- blsp_i2c6: i2c@f9928000 {
+- status = "disabled";
+- compatible = "qcom,i2c-qup-v2.1.1";
+- reg = <0xf9928000 0x1000>;
+- interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP1_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+- clock-names = "core", "iface";
+- #address-cells = <1>;
+- #size-cells = <0>;
+- };
+-
+- blsp_i2c8: i2c@f9964000 {
+- status = "disabled";
+- compatible = "qcom,i2c-qup-v2.1.1";
+- reg = <0xf9964000 0x1000>;
+- interrupts = <GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP2_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
+- clock-names = "core", "iface";
+- #address-cells = <1>;
+- #size-cells = <0>;
+- };
+-
+- blsp_i2c11: i2c@f9967000 {
+- status = "disabled";
+- compatible = "qcom,i2c-qup-v2.1.1";
+- reg = <0xf9967000 0x1000>;
+- interrupts = <GIC_SPI 105 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP2_QUP5_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
+- clock-names = "core", "iface";
+- #address-cells = <1>;
+- #size-cells = <0>;
+- dmas = <&blsp2_dma 20>, <&blsp2_dma 21>;
+- dma-names = "tx", "rx";
+- };
+-
+- blsp_i2c12: i2c@f9968000 {
+- status = "disabled";
+- compatible = "qcom,i2c-qup-v2.1.1";
+- reg = <0xf9968000 0x1000>;
+- interrupts = <0 106 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP2_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
+- clock-names = "core", "iface";
+- #address-cells = <1>;
+- #size-cells = <0>;
+- };
+-
+- spmi_bus: spmi@fc4cf000 {
+- compatible = "qcom,spmi-pmic-arb";
+- reg-names = "core", "intr", "cnfg";
+- reg = <0xfc4cf000 0x1000>,
+- <0xfc4cb000 0x1000>,
+- <0xfc4ca000 0x1000>;
+- interrupt-names = "periph_irq";
+- interrupts = <GIC_SPI 190 IRQ_TYPE_LEVEL_HIGH>;
+- qcom,ee = <0>;
+- qcom,channel = <0>;
+- #address-cells = <2>;
+- #size-cells = <0>;
+- interrupt-controller;
+- #interrupt-cells = <4>;
+- };
+-
+- blsp2_dma: dma-controller@f9944000 {
+- compatible = "qcom,bam-v1.4.0";
+- reg = <0xf9944000 0x19000>;
+- interrupts = <GIC_SPI 239 IRQ_TYPE_LEVEL_HIGH>;
+- clocks = <&gcc GCC_BLSP2_AHB_CLK>;
+- clock-names = "bam_clk";
+- #dma-cells = <1>;
+- qcom,ee = <0>;
+- };
+-
+- etr@fc322000 {
++ etf@fc307000 {
+ compatible = "arm,coresight-tmc", "arm,primecell";
+- reg = <0xfc322000 0x1000>;
++ reg = <0xfc307000 0x1000>;
+
+ clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
+ clock-names = "apb_pclk", "atclk";
+
++ out-ports {
++ port {
++ etf_out: endpoint {
++ remote-endpoint = <&replicator_in>;
++ };
++ };
++ };
++
+ in-ports {
+ port {
+- etr_in: endpoint {
+- remote-endpoint = <&replicator_out0>;
++ etf_in: endpoint {
++ remote-endpoint = <&merger_out>;
+ };
+ };
+ };
+@@ -1096,59 +770,39 @@
+ };
+ };
+
+- replicator@fc31c000 {
+- compatible = "arm,coresight-dynamic-replicator", "arm,primecell";
+- reg = <0xfc31c000 0x1000>;
++ funnel@fc31a000 {
++ compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
++ reg = <0xfc31a000 0x1000>;
+
+ clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
+ clock-names = "apb_pclk", "atclk";
+
+- out-ports {
++ in-ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+- port@0 {
+- reg = <0>;
+- replicator_out0: endpoint {
+- remote-endpoint = <&etr_in>;
+- };
+- };
+- port@1 {
+- reg = <1>;
+- replicator_out1: endpoint {
+- remote-endpoint = <&tpiu_in>;
++ /*
++ * Not described input ports:
++ * 0 - not-connected
++ * 1 - connected trought funnel to Multimedia CPU
++ * 2 - connected to Wireless CPU
++ * 3 - not-connected
++ * 4 - not-connected
++ * 6 - not-connected
++ * 7 - connected to STM
++ */
++ port@5 {
++ reg = <5>;
++ funnel1_in5: endpoint {
++ remote-endpoint = <&kpss_out>;
+ };
+ };
+ };
+
+- in-ports {
++ out-ports {
+ port {
+- replicator_in: endpoint {
+- remote-endpoint = <&etf_out>;
+- };
+- };
+- };
+- };
+-
+- etf@fc307000 {
+- compatible = "arm,coresight-tmc", "arm,primecell";
+- reg = <0xfc307000 0x1000>;
+-
+- clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
+- clock-names = "apb_pclk", "atclk";
+-
+- out-ports {
+- port {
+- etf_out: endpoint {
+- remote-endpoint = <&replicator_in>;
+- };
+- };
+- };
+-
+- in-ports {
+- port {
+- etf_in: endpoint {
+- remote-endpoint = <&merger_out>;
++ funnel1_out: endpoint {
++ remote-endpoint = <&merger_in1>;
+ };
+ };
+ };
+@@ -1188,85 +842,51 @@
+ };
+ };
+
+- funnel@fc31a000 {
+- compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
+- reg = <0xfc31a000 0x1000>;
++ replicator@fc31c000 {
++ compatible = "arm,coresight-dynamic-replicator", "arm,primecell";
++ reg = <0xfc31c000 0x1000>;
+
+ clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
+ clock-names = "apb_pclk", "atclk";
+
+- in-ports {
++ out-ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+- /*
+- * Not described input ports:
+- * 0 - not-connected
+- * 1 - connected trought funnel to Multimedia CPU
+- * 2 - connected to Wireless CPU
+- * 3 - not-connected
+- * 4 - not-connected
+- * 6 - not-connected
+- * 7 - connected to STM
+- */
+- port@5 {
+- reg = <5>;
+- funnel1_in5: endpoint {
+- remote-endpoint = <&kpss_out>;
++ port@0 {
++ reg = <0>;
++ replicator_out0: endpoint {
++ remote-endpoint = <&etr_in>;
++ };
++ };
++ port@1 {
++ reg = <1>;
++ replicator_out1: endpoint {
++ remote-endpoint = <&tpiu_in>;
+ };
+ };
+ };
+
+- out-ports {
++ in-ports {
+ port {
+- funnel1_out: endpoint {
+- remote-endpoint = <&merger_in1>;
++ replicator_in: endpoint {
++ remote-endpoint = <&etf_out>;
+ };
+ };
+ };
+ };
+
+- funnel@fc345000 { /* KPSS funnel only 4 inputs are used */
+- compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
+- reg = <0xfc345000 0x1000>;
++ etr@fc322000 {
++ compatible = "arm,coresight-tmc", "arm,primecell";
++ reg = <0xfc322000 0x1000>;
+
+ clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
+ clock-names = "apb_pclk", "atclk";
+
+ in-ports {
+- #address-cells = <1>;
+- #size-cells = <0>;
+-
+- port@0 {
+- reg = <0>;
+- kpss_in0: endpoint {
+- remote-endpoint = <&etm0_out>;
+- };
+- };
+- port@1 {
+- reg = <1>;
+- kpss_in1: endpoint {
+- remote-endpoint = <&etm1_out>;
+- };
+- };
+- port@2 {
+- reg = <2>;
+- kpss_in2: endpoint {
+- remote-endpoint = <&etm2_out>;
+- };
+- };
+- port@3 {
+- reg = <3>;
+- kpss_in3: endpoint {
+- remote-endpoint = <&etm3_out>;
+- };
+- };
+- };
+-
+- out-ports {
+ port {
+- kpss_out: endpoint {
+- remote-endpoint = <&funnel1_in5>;
++ etr_in: endpoint {
++ remote-endpoint = <&replicator_out0>;
+ };
+ };
+ };
+@@ -1344,25 +964,66 @@
+ };
+ };
+
+- ocmem@fdd00000 {
+- compatible = "qcom,msm8974-ocmem";
+- reg = <0xfdd00000 0x2000>,
+- <0xfec00000 0x180000>;
+- reg-names = "ctrl",
+- "mem";
+- clocks = <&rpmcc RPM_SMD_OCMEMGX_CLK>,
+- <&mmcc OCMEMCX_OCMEMNOC_CLK>;
+- clock-names = "core",
+- "iface";
++ /* KPSS funnel, only 4 inputs are used */
++ funnel@fc345000 {
++ compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
++ reg = <0xfc345000 0x1000>;
+
+- #address-cells = <1>;
+- #size-cells = <1>;
++ clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
++ clock-names = "apb_pclk", "atclk";
+
+- gmu_sram: gmu-sram@0 {
+- reg = <0x0 0x100000>;
++ in-ports {
++ #address-cells = <1>;
++ #size-cells = <0>;
++
++ port@0 {
++ reg = <0>;
++ kpss_in0: endpoint {
++ remote-endpoint = <&etm0_out>;
++ };
++ };
++ port@1 {
++ reg = <1>;
++ kpss_in1: endpoint {
++ remote-endpoint = <&etm1_out>;
++ };
++ };
++ port@2 {
++ reg = <2>;
++ kpss_in2: endpoint {
++ remote-endpoint = <&etm2_out>;
++ };
++ };
++ port@3 {
++ reg = <3>;
++ kpss_in3: endpoint {
++ remote-endpoint = <&etm3_out>;
++ };
++ };
++ };
++
++ out-ports {
++ port {
++ kpss_out: endpoint {
++ remote-endpoint = <&funnel1_in5>;
++ };
++ };
+ };
+ };
+
++ gcc: clock-controller@fc400000 {
++ compatible = "qcom,gcc-msm8974";
++ #clock-cells = <1>;
++ #reset-cells = <1>;
++ #power-domain-cells = <1>;
++ reg = <0xfc400000 0x4000>;
++ };
++
++ rpm_msg_ram: memory@fc428000 {
++ compatible = "qcom,rpm-msg-ram";
++ reg = <0xfc428000 0x4000>;
++ };
++
+ bimc: interconnect@fc380000 {
+ reg = <0xfc380000 0x6a000>;
+ compatible = "qcom,msm8974-bimc";
+@@ -1417,79 +1078,151 @@
+ <&rpmcc RPM_SMD_CNOC_A_CLK>;
+ };
+
+- gpu: adreno@fdb00000 {
+- status = "disabled";
+-
+- compatible = "qcom,adreno-330.1",
+- "qcom,adreno";
+- reg = <0xfdb00000 0x10000>;
+- reg-names = "kgsl_3d0_reg_memory";
+- interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>;
+- interrupt-names = "kgsl_3d0_irq";
+- clock-names = "core",
+- "iface",
+- "mem_iface";
+- clocks = <&mmcc OXILI_GFX3D_CLK>,
+- <&mmcc OXILICX_AHB_CLK>,
+- <&mmcc OXILICX_AXI_CLK>;
+- sram = <&gmu_sram>;
+- power-domains = <&mmcc OXILICX_GDSC>;
+- operating-points-v2 = <&gpu_opp_table>;
+-
+- interconnects = <&mmssnoc MNOC_MAS_GRAPHICS_3D &bimc BIMC_SLV_EBI_CH0>,
+- <&ocmemnoc OCMEM_VNOC_MAS_GFX3D &ocmemnoc OCMEM_SLV_OCMEM>;
+- interconnect-names = "gfx-mem",
+- "ocmem";
+-
+- // iommus = <&gpu_iommu 0>;
+-
+- gpu_opp_table: opp_table {
+- compatible = "operating-points-v2";
+-
+- opp-320000000 {
+- opp-hz = /bits/ 64 <320000000>;
+- };
++ tsens: thermal-sensor@fc4a9000 {
++ compatible = "qcom,msm8974-tsens";
++ reg = <0xfc4a9000 0x1000>, /* TM */
++ <0xfc4a8000 0x1000>; /* SROT */
++ nvmem-cells = <&tsens_calib>, <&tsens_backup>;
++ nvmem-cell-names = "calib", "calib_backup";
++ #qcom,sensors = <11>;
++ interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "uplow";
++ #thermal-sensor-cells = <1>;
++ };
+
+- opp-200000000 {
+- opp-hz = /bits/ 64 <200000000>;
+- };
++ restart@fc4ab000 {
++ compatible = "qcom,pshold";
++ reg = <0xfc4ab000 0x4>;
++ };
+
+- opp-27000000 {
+- opp-hz = /bits/ 64 <27000000>;
+- };
++ qfprom: qfprom@fc4bc000 {
++ #address-cells = <1>;
++ #size-cells = <1>;
++ compatible = "qcom,qfprom";
++ reg = <0xfc4bc000 0x1000>;
++ tsens_calib: calib@d0 {
++ reg = <0xd0 0x18>;
++ };
++ tsens_backup: backup@440 {
++ reg = <0x440 0x10>;
+ };
+ };
+
+- mdss: mdss@fd900000 {
+- status = "disabled";
++ spmi_bus: spmi@fc4cf000 {
++ compatible = "qcom,spmi-pmic-arb";
++ reg-names = "core", "intr", "cnfg";
++ reg = <0xfc4cf000 0x1000>,
++ <0xfc4cb000 0x1000>,
++ <0xfc4ca000 0x1000>;
++ interrupt-names = "periph_irq";
++ interrupts = <GIC_SPI 190 IRQ_TYPE_LEVEL_HIGH>;
++ qcom,ee = <0>;
++ qcom,channel = <0>;
++ #address-cells = <2>;
++ #size-cells = <0>;
++ interrupt-controller;
++ #interrupt-cells = <4>;
++ };
+
+- compatible = "qcom,mdss";
+- reg = <0xfd900000 0x100>,
+- <0xfd924000 0x1000>;
+- reg-names = "mdss_phys",
+- "vbif_phys";
++ remoteproc_mss: remoteproc@fc880000 {
++ compatible = "qcom,msm8974-mss-pil";
++ reg = <0xfc880000 0x100>, <0xfc820000 0x020>;
++ reg-names = "qdsp6", "rmb";
+
+- power-domains = <&mmcc MDSS_GDSC>;
++ interrupts-extended = <&intc GIC_SPI 24 IRQ_TYPE_EDGE_RISING>,
++ <&modem_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
++ <&modem_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
++ <&modem_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
++ <&modem_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
++ interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
+
+- clocks = <&mmcc MDSS_AHB_CLK>,
+- <&mmcc MDSS_AXI_CLK>,
+- <&mmcc MDSS_VSYNC_CLK>;
+- clock-names = "iface",
+- "bus",
+- "vsync";
++ clocks = <&gcc GCC_MSS_Q6_BIMC_AXI_CLK>,
++ <&gcc GCC_MSS_CFG_AHB_CLK>,
++ <&gcc GCC_BOOT_ROM_AHB_CLK>,
++ <&xo_board>;
++ clock-names = "iface", "bus", "mem", "xo";
++
++ resets = <&gcc GCC_MSS_RESTART>;
++ reset-names = "mss_restart";
++
++ qcom,halt-regs = <&tcsr_mutex_block 0x1180 0x1200 0x1280>;
++
++ qcom,smem-states = <&modem_smp2p_out 0>;
++ qcom,smem-state-names = "stop";
++
++ status = "disabled";
++
++ mba {
++ memory-region = <&mba_region>;
++ };
++
++ mpss {
++ memory-region = <&mpss_region>;
++ };
++
++ smd-edge {
++ interrupts = <GIC_SPI 25 IRQ_TYPE_EDGE_RISING>;
++
++ qcom,ipc = <&apcs 8 12>;
++ qcom,smd-edge = <0>;
++
++ label = "modem";
++ };
++ };
++
++ tcsr_mutex_block: syscon@fd484000 {
++ compatible = "syscon";
++ reg = <0xfd484000 0x2000>;
++ };
++
++ tcsr: syscon@fd4a0000 {
++ compatible = "syscon";
++ reg = <0xfd4a0000 0x10000>;
++ };
++
++ tlmm: pinctrl@fd510000 {
++ compatible = "qcom,msm8974-pinctrl";
++ reg = <0xfd510000 0x4000>;
++ gpio-controller;
++ gpio-ranges = <&tlmm 0 0 146>;
++ #gpio-cells = <2>;
++ interrupt-controller;
++ #interrupt-cells = <2>;
++ interrupts = <GIC_SPI 208 IRQ_TYPE_LEVEL_HIGH>;
++ };
++
++ mmcc: clock-controller@fd8c0000 {
++ compatible = "qcom,mmcc-msm8974";
++ #clock-cells = <1>;
++ #reset-cells = <1>;
++ #power-domain-cells = <1>;
++ reg = <0xfd8c0000 0x6000>;
++ };
++
++ mdss: mdss@fd900000 {
++ compatible = "qcom,mdss";
++ reg = <0xfd900000 0x100>, <0xfd924000 0x1000>;
++ reg-names = "mdss_phys", "vbif_phys";
++
++ power-domains = <&mmcc MDSS_GDSC>;
++
++ clocks = <&mmcc MDSS_AHB_CLK>,
++ <&mmcc MDSS_AXI_CLK>,
++ <&mmcc MDSS_VSYNC_CLK>;
++ clock-names = "iface", "bus", "vsync";
+
+ interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
+
+ interrupt-controller;
+ #interrupt-cells = <1>;
+
++ status = "disabled";
++
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges;
+
+ mdp: mdp@fd900000 {
+- status = "disabled";
+-
+ compatible = "qcom,mdp5";
+ reg = <0xfd900100 0x22000>;
+ reg-names = "mdp_phys";
+@@ -1501,10 +1234,7 @@
+ <&mmcc MDSS_AXI_CLK>,
+ <&mmcc MDSS_MDP_CLK>,
+ <&mmcc MDSS_VSYNC_CLK>;
+- clock-names = "iface",
+- "bus",
+- "core",
+- "vsync";
++ clock-names = "iface", "bus", "core", "vsync";
+
+ interconnects = <&mmssnoc MNOC_MAS_MDP_PORT0 &bimc BIMC_SLV_EBI_CH0>;
+ interconnect-names = "mdp0-mem";
+@@ -1523,8 +1253,6 @@
+ };
+
+ dsi0: dsi@fd922800 {
+- status = "disabled";
+-
+ compatible = "qcom,mdss-dsi-ctrl";
+ reg = <0xfd922800 0x1f8>;
+ reg-names = "dsi_ctrl";
+@@ -1532,29 +1260,32 @@
+ interrupt-parent = <&mdss>;
+ interrupts = <4 IRQ_TYPE_LEVEL_HIGH>;
+
+- assigned-clocks = <&mmcc BYTE0_CLK_SRC>,
+- <&mmcc PCLK0_CLK_SRC>;
+- assigned-clock-parents = <&dsi_phy0 0>,
+- <&dsi_phy0 1>;
++ assigned-clocks = <&mmcc BYTE0_CLK_SRC>, <&mmcc PCLK0_CLK_SRC>;
++ assigned-clock-parents = <&dsi0_phy 0>, <&dsi0_phy 1>;
+
+ clocks = <&mmcc MDSS_MDP_CLK>,
+- <&mmcc MDSS_AHB_CLK>,
+- <&mmcc MDSS_AXI_CLK>,
+- <&mmcc MDSS_BYTE0_CLK>,
+- <&mmcc MDSS_PCLK0_CLK>,
+- <&mmcc MDSS_ESC0_CLK>,
+- <&mmcc MMSS_MISC_AHB_CLK>;
++ <&mmcc MDSS_AHB_CLK>,
++ <&mmcc MDSS_AXI_CLK>,
++ <&mmcc MDSS_BYTE0_CLK>,
++ <&mmcc MDSS_PCLK0_CLK>,
++ <&mmcc MDSS_ESC0_CLK>,
++ <&mmcc MMSS_MISC_AHB_CLK>;
+ clock-names = "mdp_core",
+- "iface",
+- "bus",
+- "byte",
+- "pixel",
+- "core",
+- "core_mmss";
+-
+- phys = <&dsi_phy0>;
++ "iface",
++ "bus",
++ "byte",
++ "pixel",
++ "core",
++ "core_mmss";
++
++ phys = <&dsi0_phy>;
+ phy-names = "dsi-phy";
+
++ status = "disabled";
++
++ #address-cells = <1>;
++ #size-cells = <0>;
++
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+@@ -1574,27 +1305,118 @@
+ };
+ };
+
+- dsi_phy0: dsi-phy@fd922a00 {
+- status = "disabled";
+-
++ dsi0_phy: dsi-phy@fd922a00 {
+ compatible = "qcom,dsi-phy-28nm-hpm";
+ reg = <0xfd922a00 0xd4>,
+ <0xfd922b00 0x280>,
+ <0xfd922d80 0x30>;
+ reg-names = "dsi_pll",
+- "dsi_phy",
+- "dsi_phy_regulator";
++ "dsi_phy",
++ "dsi_phy_regulator";
+
+ #clock-cells = <1>;
+ #phy-cells = <0>;
+- qcom,dsi-phy-index = <0>;
+
+ clocks = <&mmcc MDSS_AHB_CLK>, <&xo_board>;
+ clock-names = "iface", "ref";
++
++ status = "disabled";
++ };
++ };
++
++ gpu: adreno@fdb00000 {
++ compatible = "qcom,adreno-330.1", "qcom,adreno";
++ reg = <0xfdb00000 0x10000>;
++ reg-names = "kgsl_3d0_reg_memory";
++
++ interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "kgsl_3d0_irq";
++
++ clocks = <&mmcc OXILI_GFX3D_CLK>,
++ <&mmcc OXILICX_AHB_CLK>,
++ <&mmcc OXILICX_AXI_CLK>;
++ clock-names = "core", "iface", "mem_iface";
++
++ sram = <&gmu_sram>;
++ power-domains = <&mmcc OXILICX_GDSC>;
++ operating-points-v2 = <&gpu_opp_table>;
++
++ interconnects = <&mmssnoc MNOC_MAS_GRAPHICS_3D &bimc BIMC_SLV_EBI_CH0>,
++ <&ocmemnoc OCMEM_VNOC_MAS_GFX3D &ocmemnoc OCMEM_SLV_OCMEM>;
++ interconnect-names = "gfx-mem", "ocmem";
++
++ // iommus = <&gpu_iommu 0>;
++
++ status = "disabled";
++
++ gpu_opp_table: opp_table {
++ compatible = "operating-points-v2";
++
++ opp-320000000 {
++ opp-hz = /bits/ 64 <320000000>;
++ };
++
++ opp-200000000 {
++ opp-hz = /bits/ 64 <200000000>;
++ };
++
++ opp-27000000 {
++ opp-hz = /bits/ 64 <27000000>;
++ };
+ };
+ };
+
+- imem@fe805000 {
++ ocmem@fdd00000 {
++ compatible = "qcom,msm8974-ocmem";
++ reg = <0xfdd00000 0x2000>,
++ <0xfec00000 0x180000>;
++ reg-names = "ctrl", "mem";
++ ranges = <0 0xfec00000 0x180000>;
++ clocks = <&rpmcc RPM_SMD_OCMEMGX_CLK>,
++ <&mmcc OCMEMCX_OCMEMNOC_CLK>;
++ clock-names = "core", "iface";
++
++ #address-cells = <1>;
++ #size-cells = <1>;
++
++ gmu_sram: gmu-sram@0 {
++ reg = <0x0 0x100000>;
++ };
++ };
++
++ remoteproc_adsp: remoteproc@fe200000 {
++ compatible = "qcom,msm8974-adsp-pil";
++ reg = <0xfe200000 0x100>;
++
++ interrupts-extended = <&intc GIC_SPI 162 IRQ_TYPE_EDGE_RISING>,
++ <&adsp_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
++ <&adsp_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
++ <&adsp_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
++ <&adsp_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
++ interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
++
++ clocks = <&xo_board>;
++ clock-names = "xo";
++
++ memory-region = <&adsp_region>;
++
++ qcom,smem-states = <&adsp_smp2p_out 0>;
++ qcom,smem-state-names = "stop";
++
++ status = "disabled";
++
++ smd-edge {
++ interrupts = <GIC_SPI 156 IRQ_TYPE_EDGE_RISING>;
++
++ qcom,ipc = <&apcs 8 8>;
++ qcom,smd-edge = <1>;
++ label = "lpass";
++ #address-cells = <1>;
++ #size-cells = <0>;
++ };
++ };
++
++ imem: imem@fe805000 {
+ status = "disabled";
+ compatible = "syscon", "simple-mfd";
+ reg = <0xfe805000 0x1000>;
+@@ -1606,76 +1428,194 @@
+ };
+ };
+
+- smd {
+- compatible = "qcom,smd";
++ tcsr_mutex: tcsr-mutex {
++ compatible = "qcom,tcsr-mutex";
++ syscon = <&tcsr_mutex_block 0 0x80>;
+
+- rpm {
+- interrupts = <GIC_SPI 168 IRQ_TYPE_EDGE_RISING>;
+- qcom,ipc = <&apcs 8 0>;
+- qcom,smd-edge = <15>;
++ #hwlock-cells = <1>;
++ };
+
+- rpm_requests {
+- compatible = "qcom,rpm-msm8974";
+- qcom,smd-channels = "rpm_requests";
++ thermal-zones {
++ cpu0-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
+
+- rpmcc: clock-controller {
+- compatible = "qcom,rpmcc-msm8974", "qcom,rpmcc";
+- #clock-cells = <1>;
++ thermal-sensors = <&tsens 5>;
++
++ trips {
++ cpu_alert0: trip0 {
++ temperature = <75000>;
++ hysteresis = <2000>;
++ type = "passive";
++ };
++ cpu_crit0: trip1 {
++ temperature = <110000>;
++ hysteresis = <2000>;
++ type = "critical";
++ };
++ };
++ };
++
++ cpu1-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 6>;
++
++ trips {
++ cpu_alert1: trip0 {
++ temperature = <75000>;
++ hysteresis = <2000>;
++ type = "passive";
++ };
++ cpu_crit1: trip1 {
++ temperature = <110000>;
++ hysteresis = <2000>;
++ type = "critical";
++ };
++ };
++ };
++
++ cpu2-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 7>;
++
++ trips {
++ cpu_alert2: trip0 {
++ temperature = <75000>;
++ hysteresis = <2000>;
++ type = "passive";
++ };
++ cpu_crit2: trip1 {
++ temperature = <110000>;
++ hysteresis = <2000>;
++ type = "critical";
++ };
++ };
++ };
++
++ cpu3-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 8>;
++
++ trips {
++ cpu_alert3: trip0 {
++ temperature = <75000>;
++ hysteresis = <2000>;
++ type = "passive";
++ };
++ cpu_crit3: trip1 {
++ temperature = <110000>;
++ hysteresis = <2000>;
++ type = "critical";
++ };
++ };
++ };
++
++ q6-dsp-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 1>;
++
++ trips {
++ q6_dsp_alert0: trip-point0 {
++ temperature = <90000>;
++ hysteresis = <2000>;
++ type = "hot";
+ };
++ };
++ };
+
+- pm8841-regulators {
+- compatible = "qcom,rpm-pm8841-regulators";
+-
+- pm8841_s1: s1 {};
+- pm8841_s2: s2 {};
+- pm8841_s3: s3 {};
+- pm8841_s4: s4 {};
+- pm8841_s5: s5 {};
+- pm8841_s6: s6 {};
+- pm8841_s7: s7 {};
+- pm8841_s8: s8 {};
++ modemtx-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 2>;
++
++ trips {
++ modemtx_alert0: trip-point0 {
++ temperature = <90000>;
++ hysteresis = <2000>;
++ type = "hot";
+ };
++ };
++ };
++
++ video-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
+
+- pm8941-regulators {
+- compatible = "qcom,rpm-pm8941-regulators";
+-
+- pm8941_s1: s1 {};
+- pm8941_s2: s2 {};
+- pm8941_s3: s3 {};
+-
+- pm8941_l1: l1 {};
+- pm8941_l2: l2 {};
+- pm8941_l3: l3 {};
+- pm8941_l4: l4 {};
+- pm8941_l5: l5 {};
+- pm8941_l6: l6 {};
+- pm8941_l7: l7 {};
+- pm8941_l8: l8 {};
+- pm8941_l9: l9 {};
+- pm8941_l10: l10 {};
+- pm8941_l11: l11 {};
+- pm8941_l12: l12 {};
+- pm8941_l13: l13 {};
+- pm8941_l14: l14 {};
+- pm8941_l15: l15 {};
+- pm8941_l16: l16 {};
+- pm8941_l17: l17 {};
+- pm8941_l18: l18 {};
+- pm8941_l19: l19 {};
+- pm8941_l20: l20 {};
+- pm8941_l21: l21 {};
+- pm8941_l22: l22 {};
+- pm8941_l23: l23 {};
+- pm8941_l24: l24 {};
+-
+- pm8941_lvs1: lvs1 {};
+- pm8941_lvs2: lvs2 {};
+- pm8941_lvs3: lvs3 {};
++ thermal-sensors = <&tsens 3>;
++
++ trips {
++ video_alert0: trip-point0 {
++ temperature = <95000>;
++ hysteresis = <2000>;
++ type = "hot";
++ };
++ };
++ };
++
++ wlan-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 4>;
++
++ trips {
++ wlan_alert0: trip-point0 {
++ temperature = <105000>;
++ hysteresis = <2000>;
++ type = "hot";
++ };
++ };
++ };
++
++ gpu-top-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 9>;
++
++ trips {
++ gpu1_alert0: trip-point0 {
++ temperature = <90000>;
++ hysteresis = <2000>;
++ type = "hot";
++ };
++ };
++ };
++
++ gpu-bottom-thermal {
++ polling-delay-passive = <250>;
++ polling-delay = <1000>;
++
++ thermal-sensors = <&tsens 10>;
++
++ trips {
++ gpu2_alert0: trip-point0 {
++ temperature = <90000>;
++ hysteresis = <2000>;
++ type = "hot";
+ };
+ };
+ };
+ };
+
++ timer {
++ compatible = "arm,armv7-timer";
++ interrupts = <GIC_PPI 2 0xf08>,
++ <GIC_PPI 3 0xf08>,
++ <GIC_PPI 4 0xf08>,
++ <GIC_PPI 1 0xf08>;
++ clock-frequency = <19200000>;
++ };
++
+ vreg_boost: vreg-boost {
+ compatible = "regulator-fixed";
+
+@@ -1692,6 +1632,7 @@
+ pinctrl-names = "default";
+ pinctrl-0 = <&boost_bypass_n_pin>;
+ };
++
+ vreg_vph_pwr: vreg-vph-pwr {
+ compatible = "regulator-fixed";
+ regulator-name = "vph-pwr";
+diff --git a/arch/arm/boot/dts/qcom-pm8841.dtsi b/arch/arm/boot/dts/qcom-pm8841.dtsi
+index 2caf71eacb520..b5cdde034d188 100644
+--- a/arch/arm/boot/dts/qcom-pm8841.dtsi
++++ b/arch/arm/boot/dts/qcom-pm8841.dtsi
+@@ -24,6 +24,7 @@
+ compatible = "qcom,spmi-temp-alarm";
+ reg = <0x2400>;
+ interrupts = <4 0x24 0 IRQ_TYPE_EDGE_RISING>;
++ #thermal-sensor-cells = <0>;
+ };
+ };
+
+diff --git a/arch/arm/boot/dts/qcom-pm8941.dtsi b/arch/arm/boot/dts/qcom-pm8941.dtsi
+index da00b8f5eecd0..cdd2bdb77b329 100644
+--- a/arch/arm/boot/dts/qcom-pm8941.dtsi
++++ b/arch/arm/boot/dts/qcom-pm8941.dtsi
+@@ -131,7 +131,7 @@
+ qcom,external-resistor-micro-ohms = <10000>;
+ };
+
+- coincell@2800 {
++ pm8941_coincell: coincell@2800 {
+ compatible = "qcom,pm8941-coincell";
+ reg = <0x2800>;
+ status = "disabled";
+diff --git a/arch/arm/boot/dts/qcom-sdx55.dtsi b/arch/arm/boot/dts/qcom-sdx55.dtsi
+index d455795da44c8..b75e672c239d8 100644
+--- a/arch/arm/boot/dts/qcom-sdx55.dtsi
++++ b/arch/arm/boot/dts/qcom-sdx55.dtsi
+@@ -206,7 +206,7 @@
+ blsp1_uart3: serial@831000 {
+ compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
+ reg = <0x00831000 0x200>;
+- interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_LOW>;
++ interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc 30>,
+ <&gcc 9>;
+ clock-names = "core", "iface";
+diff --git a/arch/arm/boot/dts/ste-ux500-samsung-codina.dts b/arch/arm/boot/dts/ste-ux500-samsung-codina.dts
+index 1c1725d31c7cd..0a10fb0a3741e 100644
+--- a/arch/arm/boot/dts/ste-ux500-samsung-codina.dts
++++ b/arch/arm/boot/dts/ste-ux500-samsung-codina.dts
+@@ -566,8 +566,8 @@
+ reg = <0x19>;
+ vdd-supply = <&ab8500_ldo_aux1_reg>; // 3V
+ vddio-supply = <&ab8500_ldo_aux2_reg>; // 1.8V
+- mount-matrix = "0", "-1", "0",
+- "1", "0", "0",
++ mount-matrix = "0", "1", "0",
++ "-1", "0", "0",
+ "0", "0", "1";
+ };
+ };
+diff --git a/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts b/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts
+index fd170974765fb..149838b5f9c1a 100644
+--- a/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts
++++ b/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts
+@@ -523,8 +523,8 @@
+ accelerometer@18 {
+ compatible = "bosch,bma222e";
+ reg = <0x18>;
+- mount-matrix = "0", "1", "0",
+- "-1", "0", "0",
++ mount-matrix = "0", "-1", "0",
++ "1", "0", "0",
+ "0", "0", "1";
+ vddio-supply = <&ab8500_ldo_aux2_reg>; // 1.8V
+ vdd-supply = <&ab8500_ldo_aux1_reg>; // 3V
+diff --git a/arch/arm/boot/dts/ste-ux500-samsung-janice.dts b/arch/arm/boot/dts/ste-ux500-samsung-janice.dts
+index 42762bfcd8781..2069efb252a7c 100644
+--- a/arch/arm/boot/dts/ste-ux500-samsung-janice.dts
++++ b/arch/arm/boot/dts/ste-ux500-samsung-janice.dts
+@@ -610,8 +610,8 @@
+ accelerometer@8 {
+ compatible = "bosch,bma222";
+ reg = <0x08>;
+- mount-matrix = "0", "1", "0",
+- "-1", "0", "0",
++ mount-matrix = "0", "-1", "0",
++ "1", "0", "0",
+ "0", "0", "1";
+ vddio-supply = <&ab8500_ldo_aux2_reg>; // 1.8V
+ vdd-supply = <&ab8500_ldo_aux1_reg>; // 3V
+diff --git a/arch/arm/boot/dts/uniphier-pxs2.dtsi b/arch/arm/boot/dts/uniphier-pxs2.dtsi
+index e81e5937a60ae..03301ddb3403a 100644
+--- a/arch/arm/boot/dts/uniphier-pxs2.dtsi
++++ b/arch/arm/boot/dts/uniphier-pxs2.dtsi
+@@ -597,8 +597,8 @@
+ compatible = "socionext,uniphier-dwc3", "snps,dwc3";
+ status = "disabled";
+ reg = <0x65a00000 0xcd00>;
+- interrupt-names = "host", "peripheral";
+- interrupts = <0 134 4>, <0 135 4>;
++ interrupt-names = "dwc_usb3";
++ interrupts = <0 134 4>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_usb0>, <&pinctrl_usb2>;
+ clock-names = "ref", "bus_early", "suspend";
+@@ -693,8 +693,8 @@
+ compatible = "socionext,uniphier-dwc3", "snps,dwc3";
+ status = "disabled";
+ reg = <0x65c00000 0xcd00>;
+- interrupt-names = "host", "peripheral";
+- interrupts = <0 137 4>, <0 138 4>;
++ interrupt-names = "dwc_usb3";
++ interrupts = <0 137 4>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_usb1>, <&pinctrl_usb3>;
+ clock-names = "ref", "bus_early", "suspend";
+diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
+index e4dba5461cb3e..149a5bd6b88c1 100644
+--- a/arch/arm/crypto/Kconfig
++++ b/arch/arm/crypto/Kconfig
+@@ -63,7 +63,7 @@ config CRYPTO_SHA512_ARM
+ using optimized ARM assembler and NEON, when available.
+
+ config CRYPTO_BLAKE2S_ARM
+- tristate "BLAKE2s digest algorithm (ARM)"
++ bool "BLAKE2s digest algorithm (ARM)"
+ select CRYPTO_ARCH_HAVE_LIB_BLAKE2S
+ help
+ BLAKE2s digest algorithm optimized with ARM scalar instructions. This
+diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
+index 0274f81cc8ea0..971e74546fb1b 100644
+--- a/arch/arm/crypto/Makefile
++++ b/arch/arm/crypto/Makefile
+@@ -9,8 +9,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
+ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
+ obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
+ obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
+-obj-$(CONFIG_CRYPTO_BLAKE2S_ARM) += blake2s-arm.o
+-obj-$(if $(CONFIG_CRYPTO_BLAKE2S_ARM),y) += libblake2s-arm.o
++obj-$(CONFIG_CRYPTO_BLAKE2S_ARM) += libblake2s-arm.o
+ obj-$(CONFIG_CRYPTO_BLAKE2B_NEON) += blake2b-neon.o
+ obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
+ obj-$(CONFIG_CRYPTO_POLY1305_ARM) += poly1305-arm.o
+@@ -32,7 +31,6 @@ sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o
+ sha256-arm-y := sha256-core.o sha256_glue.o $(sha256-arm-neon-y)
+ sha512-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha512-neon-glue.o
+ sha512-arm-y := sha512-core.o sha512-glue.o $(sha512-arm-neon-y)
+-blake2s-arm-y := blake2s-shash.o
+ libblake2s-arm-y:= blake2s-core.o blake2s-glue.o
+ blake2b-neon-y := blake2b-neon-core.o blake2b-neon-glue.o
+ sha1-arm-ce-y := sha1-ce-core.o sha1-ce-glue.o
+diff --git a/arch/arm/crypto/blake2s-shash.c b/arch/arm/crypto/blake2s-shash.c
+deleted file mode 100644
+index 763c73beea2d0..0000000000000
+--- a/arch/arm/crypto/blake2s-shash.c
++++ /dev/null
+@@ -1,75 +0,0 @@
+-// SPDX-License-Identifier: GPL-2.0-or-later
+-/*
+- * BLAKE2s digest algorithm, ARM scalar implementation
+- *
+- * Copyright 2020 Google LLC
+- */
+-
+-#include <crypto/internal/blake2s.h>
+-#include <crypto/internal/hash.h>
+-
+-#include <linux/module.h>
+-
+-static int crypto_blake2s_update_arm(struct shash_desc *desc,
+- const u8 *in, unsigned int inlen)
+-{
+- return crypto_blake2s_update(desc, in, inlen, false);
+-}
+-
+-static int crypto_blake2s_final_arm(struct shash_desc *desc, u8 *out)
+-{
+- return crypto_blake2s_final(desc, out, false);
+-}
+-
+-#define BLAKE2S_ALG(name, driver_name, digest_size) \
+- { \
+- .base.cra_name = name, \
+- .base.cra_driver_name = driver_name, \
+- .base.cra_priority = 200, \
+- .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
+- .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
+- .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
+- .base.cra_module = THIS_MODULE, \
+- .digestsize = digest_size, \
+- .setkey = crypto_blake2s_setkey, \
+- .init = crypto_blake2s_init, \
+- .update = crypto_blake2s_update_arm, \
+- .final = crypto_blake2s_final_arm, \
+- .descsize = sizeof(struct blake2s_state), \
+- }
+-
+-static struct shash_alg blake2s_arm_algs[] = {
+- BLAKE2S_ALG("blake2s-128", "blake2s-128-arm", BLAKE2S_128_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-160", "blake2s-160-arm", BLAKE2S_160_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-224", "blake2s-224-arm", BLAKE2S_224_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-256", "blake2s-256-arm", BLAKE2S_256_HASH_SIZE),
+-};
+-
+-static int __init blake2s_arm_mod_init(void)
+-{
+- return IS_REACHABLE(CONFIG_CRYPTO_HASH) ?
+- crypto_register_shashes(blake2s_arm_algs,
+- ARRAY_SIZE(blake2s_arm_algs)) : 0;
+-}
+-
+-static void __exit blake2s_arm_mod_exit(void)
+-{
+- if (IS_REACHABLE(CONFIG_CRYPTO_HASH))
+- crypto_unregister_shashes(blake2s_arm_algs,
+- ARRAY_SIZE(blake2s_arm_algs));
+-}
+-
+-module_init(blake2s_arm_mod_init);
+-module_exit(blake2s_arm_mod_exit);
+-
+-MODULE_DESCRIPTION("BLAKE2s digest algorithm, ARM scalar implementation");
+-MODULE_LICENSE("GPL");
+-MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
+-MODULE_ALIAS_CRYPTO("blake2s-128");
+-MODULE_ALIAS_CRYPTO("blake2s-128-arm");
+-MODULE_ALIAS_CRYPTO("blake2s-160");
+-MODULE_ALIAS_CRYPTO("blake2s-160-arm");
+-MODULE_ALIAS_CRYPTO("blake2s-224");
+-MODULE_ALIAS_CRYPTO("blake2s-224-arm");
+-MODULE_ALIAS_CRYPTO("blake2s-256");
+-MODULE_ALIAS_CRYPTO("blake2s-256-arm");
+diff --git a/arch/arm/lib/findbit.S b/arch/arm/lib/findbit.S
+index b5e8b9ae4c7d4..7fd3600db8efd 100644
+--- a/arch/arm/lib/findbit.S
++++ b/arch/arm/lib/findbit.S
+@@ -40,8 +40,8 @@ ENDPROC(_find_first_zero_bit_le)
+ * Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
+ */
+ ENTRY(_find_next_zero_bit_le)
+- teq r1, #0
+- beq 3b
++ cmp r2, r1
++ bhs 3b
+ ands ip, r2, #7
+ beq 1b @ If new byte, goto old routine
+ ARM( ldrb r3, [r0, r2, lsr #3] )
+@@ -81,8 +81,8 @@ ENDPROC(_find_first_bit_le)
+ * Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
+ */
+ ENTRY(_find_next_bit_le)
+- teq r1, #0
+- beq 3b
++ cmp r2, r1
++ bhs 3b
+ ands ip, r2, #7
+ beq 1b @ If new byte, goto old routine
+ ARM( ldrb r3, [r0, r2, lsr #3] )
+@@ -115,8 +115,8 @@ ENTRY(_find_first_zero_bit_be)
+ ENDPROC(_find_first_zero_bit_be)
+
+ ENTRY(_find_next_zero_bit_be)
+- teq r1, #0
+- beq 3b
++ cmp r2, r1
++ bhs 3b
+ ands ip, r2, #7
+ beq 1b @ If new byte, goto old routine
+ eor r3, r2, #0x18 @ big endian byte ordering
+@@ -149,8 +149,8 @@ ENTRY(_find_first_bit_be)
+ ENDPROC(_find_first_bit_be)
+
+ ENTRY(_find_next_bit_be)
+- teq r1, #0
+- beq 3b
++ cmp r2, r1
++ bhs 3b
+ ands ip, r2, #7
+ beq 1b @ If new byte, goto old routine
+ eor r3, r2, #0x18 @ big endian byte ordering
+diff --git a/arch/arm/mach-bcm/bcm_kona_smc.c b/arch/arm/mach-bcm/bcm_kona_smc.c
+index 43829e49ad93f..347bfb7f03e2c 100644
+--- a/arch/arm/mach-bcm/bcm_kona_smc.c
++++ b/arch/arm/mach-bcm/bcm_kona_smc.c
+@@ -52,6 +52,7 @@ int __init bcm_kona_smc_init(void)
+ return -ENODEV;
+
+ prop_val = of_get_address(node, 0, &prop_size, NULL);
++ of_node_put(node);
+ if (!prop_val)
+ return -EINVAL;
+
+diff --git a/arch/arm/mach-omap2/display.c b/arch/arm/mach-omap2/display.c
+index 21413a9b7b6c6..8d829f3dafe76 100644
+--- a/arch/arm/mach-omap2/display.c
++++ b/arch/arm/mach-omap2/display.c
+@@ -211,6 +211,7 @@ static int __init omapdss_init_fbdev(void)
+ node = of_find_node_by_name(NULL, "omap4_padconf_global");
+ if (node)
+ omap4_dsi_mux_syscon = syscon_node_to_regmap(node);
++ of_node_put(node);
+
+ return 0;
+ }
+@@ -259,11 +260,13 @@ static int __init omapdss_init_of(void)
+
+ if (!pdev) {
+ pr_err("Unable to find DSS platform device\n");
++ of_node_put(node);
+ return -ENODEV;
+ }
+
+ r = of_platform_populate(node, NULL, NULL, &pdev->dev);
+ put_device(&pdev->dev);
++ of_node_put(node);
+ if (r) {
+ pr_err("Unable to populate DSS submodule devices\n");
+ return r;
+diff --git a/arch/arm/mach-omap2/pdata-quirks.c b/arch/arm/mach-omap2/pdata-quirks.c
+index e7fd29a502a02..af9d8c432af08 100644
+--- a/arch/arm/mach-omap2/pdata-quirks.c
++++ b/arch/arm/mach-omap2/pdata-quirks.c
+@@ -551,6 +551,8 @@ pdata_quirks_init_clocks(const struct of_device_id *omap_dt_match_table)
+
+ of_platform_populate(np, omap_dt_match_table,
+ omap_auxdata_lookup, NULL);
++
++ of_node_put(np);
+ }
+ }
+
+diff --git a/arch/arm/mach-omap2/prm3xxx.c b/arch/arm/mach-omap2/prm3xxx.c
+index 1b442b1285693..63e73e9b82bc6 100644
+--- a/arch/arm/mach-omap2/prm3xxx.c
++++ b/arch/arm/mach-omap2/prm3xxx.c
+@@ -708,6 +708,7 @@ static int omap3xxx_prm_late_init(void)
+ }
+
+ irq_num = of_irq_get(np, 0);
++ of_node_put(np);
+ if (irq_num == -EPROBE_DEFER)
+ return irq_num;
+
+diff --git a/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c b/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
+index 09ef73b99dd86..ba44cec5e59ac 100644
+--- a/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
++++ b/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
+@@ -125,6 +125,7 @@ remove:
+
+ list_for_each_entry_safe(pos, tmp, &quirk_list, list) {
+ list_del(&pos->list);
++ of_node_put(pos->np);
+ kfree(pos);
+ }
+
+@@ -174,11 +175,12 @@ static int __init rcar_gen2_regulator_quirk(void)
+ memcpy(&quirk->i2c_msg, id->data, sizeof(quirk->i2c_msg));
+
+ quirk->id = id;
+- quirk->np = np;
++ quirk->np = of_node_get(np);
+ quirk->i2c_msg.addr = addr;
+
+ ret = of_irq_parse_one(np, 0, argsa);
+ if (ret) { /* Skip invalid entry and continue */
++ of_node_put(np);
+ kfree(quirk);
+ continue;
+ }
+@@ -225,6 +227,7 @@ err_free:
+ err_mem:
+ list_for_each_entry_safe(pos, tmp, &quirk_list, list) {
+ list_del(&pos->list);
++ of_node_put(pos->np);
+ kfree(pos);
+ }
+
+diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
+index e1ca6a5732d27..15e8a321a713b 100644
+--- a/arch/arm/mach-zynq/common.c
++++ b/arch/arm/mach-zynq/common.c
+@@ -77,6 +77,7 @@ static int __init zynq_get_revision(void)
+ }
+
+ zynq_devcfg_base = of_iomap(np, 0);
++ of_node_put(np);
+ if (!zynq_devcfg_base) {
+ pr_err("%s: Unable to map I/O memory\n", __func__);
+ return -1;
+diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
+index 20ea89d9ac2fa..54cf6faf339c1 100644
+--- a/arch/arm64/Kconfig
++++ b/arch/arm64/Kconfig
+@@ -223,6 +223,7 @@ config ARM64
+ select THREAD_INFO_IN_TASK
+ select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
+ select TRACE_IRQFLAGS_SUPPORT
++ select TRACE_IRQFLAGS_NMI_SUPPORT
+ help
+ ARM 64-bit (AArch64) Linux support.
+
+@@ -500,6 +501,22 @@ config ARM64_ERRATUM_834220
+
+ If unsure, say Y.
+
++config ARM64_ERRATUM_1742098
++ bool "Cortex-A57/A72: 1742098: ELR recorded incorrectly on interrupt taken between cryptographic instructions in a sequence"
++ depends on COMPAT
++ default y
++ help
++ This option removes the AES hwcap for aarch32 user-space to
++ workaround erratum 1742098 on Cortex-A57 and Cortex-A72.
++
++ Affected parts may corrupt the AES state if an interrupt is
++ taken between a pair of AES instructions. These instructions
++ are only present if the cryptography extensions are present.
++ All software should have a fallback implementation for CPUs
++ that don't implement the cryptography extensions.
++
++ If unsure, say Y.
++
+ config ARM64_ERRATUM_845719
+ bool "Cortex-A53: 845719: a load might read incorrect data"
+ depends on COMPAT
+diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts
+index c519d9fa6967c..3d2c68d58f49c 100644
+--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts
++++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts
+@@ -40,7 +40,7 @@
+ leds {
+ compatible = "gpio-leds";
+
+- status {
++ led-0 {
+ label = "orangepi:green:status";
+ gpios = <&pio 7 11 GPIO_ACTIVE_HIGH>; /* PH11 */
+ };
+diff --git a/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi b/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi
+index ef0349d1c3d09..68f4a0fae7cf5 100644
+--- a/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi
++++ b/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi
+@@ -1089,21 +1089,21 @@
+
+ /* PERIC1 USI11_SPI */
+ spi11_bus: spi11-pins {
+- samsung,pins = "gpp3-6", "gpp3-5", "gpp3-4";
++ samsung,pins = "gpp5-6", "gpp5-5", "gpp5-4";
+ samsung,pin-function = <EXYNOS_PIN_FUNC_2>;
+ samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
+ samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
+ };
+
+ spi11_cs: spi11-cs-pins {
+- samsung,pins = "gpp3-7";
++ samsung,pins = "gpp5-7";
+ samsung,pin-function = <EXYNOS_PIN_FUNC_OUTPUT>;
+ samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
+ samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
+ };
+
+ spi11_cs_func: spi11-cs-func-pins {
+- samsung,pins = "gpp3-7";
++ samsung,pins = "gpp5-7";
+ samsung,pin-function = <EXYNOS_PIN_FUNC_2>;
+ samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
+ samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
+diff --git a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
+index 2b9bf8dd14ecc..7538918c7a829 100644
+--- a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
++++ b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
+@@ -49,7 +49,7 @@
+ wps {
+ label = "wps";
+ linux,code = <KEY_WPS_BUTTON>;
+- gpios = <&pio 102 GPIO_ACTIVE_HIGH>;
++ gpios = <&pio 102 GPIO_ACTIVE_LOW>;
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+index bcecc74844535..96439a4bce095 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+@@ -41,7 +41,7 @@
+ reg = <0x000>;
+ enable-method = "psci";
+ clock-frequency = <1701000000>;
+- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
++ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
+ next-level-cache = <&l2_0>;
+ capacity-dmips-mhz = <530>;
+ };
+@@ -52,7 +52,7 @@
+ reg = <0x100>;
+ enable-method = "psci";
+ clock-frequency = <1701000000>;
+- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
++ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
+ next-level-cache = <&l2_0>;
+ capacity-dmips-mhz = <530>;
+ };
+@@ -63,7 +63,7 @@
+ reg = <0x200>;
+ enable-method = "psci";
+ clock-frequency = <1701000000>;
+- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
++ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
+ next-level-cache = <&l2_0>;
+ capacity-dmips-mhz = <530>;
+ };
+@@ -74,7 +74,7 @@
+ reg = <0x300>;
+ enable-method = "psci";
+ clock-frequency = <1701000000>;
+- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
++ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
+ next-level-cache = <&l2_0>;
+ capacity-dmips-mhz = <530>;
+ };
+@@ -85,7 +85,7 @@
+ reg = <0x400>;
+ enable-method = "psci";
+ clock-frequency = <2171000000>;
+- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
++ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
+ next-level-cache = <&l2_1>;
+ capacity-dmips-mhz = <1024>;
+ };
+@@ -96,7 +96,7 @@
+ reg = <0x500>;
+ enable-method = "psci";
+ clock-frequency = <2171000000>;
+- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
++ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
+ next-level-cache = <&l2_1>;
+ capacity-dmips-mhz = <1024>;
+ };
+@@ -107,7 +107,7 @@
+ reg = <0x600>;
+ enable-method = "psci";
+ clock-frequency = <2171000000>;
+- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
++ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
+ next-level-cache = <&l2_1>;
+ capacity-dmips-mhz = <1024>;
+ };
+@@ -118,7 +118,7 @@
+ reg = <0x700>;
+ enable-method = "psci";
+ clock-frequency = <2171000000>;
+- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
++ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
+ next-level-cache = <&l2_1>;
+ capacity-dmips-mhz = <1024>;
+ };
+@@ -170,8 +170,8 @@
+ };
+
+ idle-states {
+- entry-method = "arm,psci";
+- cpuoff_l: cpuoff_l {
++ entry-method = "psci";
++ cpu_sleep_l: cpu-sleep-l {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x00010001>;
+ local-timer-stop;
+@@ -179,7 +179,7 @@
+ exit-latency-us = <140>;
+ min-residency-us = <780>;
+ };
+- cpuoff_b: cpuoff_b {
++ cpu_sleep_b: cpu-sleep-b {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x00010001>;
+ local-timer-stop;
+@@ -187,7 +187,7 @@
+ exit-latency-us = <145>;
+ min-residency-us = <720>;
+ };
+- clusteroff_l: clusteroff_l {
++ cluster_sleep_l: cluster-sleep-l {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x01010002>;
+ local-timer-stop;
+@@ -195,7 +195,7 @@
+ exit-latency-us = <155>;
+ min-residency-us = <860>;
+ };
+- clusteroff_b: clusteroff_b {
++ cluster_sleep_b: cluster-sleep-b {
+ compatible = "arm,idle-state";
+ arm,psci-suspend-param = <0x01010002>;
+ local-timer-stop;
+diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+index e9b40f5d79ec9..77c597a6386fd 100644
+--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
++++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+@@ -1807,6 +1807,7 @@
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0x0 0x0 0x30000000 0x50000>;
++ no-memory-wc;
+
+ cpu_bpmp_tx: sram@4e000 {
+ reg = <0x4e000 0x1000>;
+diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
+index a7d7cfd66379f..b0f9393dd39cc 100644
+--- a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
++++ b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
+@@ -75,7 +75,7 @@
+
+ /* SDMMC1 (SD/MMC) */
+ mmc@3400000 {
+- cd-gpios = <&gpio TEGRA194_MAIN_GPIO(A, 0) GPIO_ACTIVE_LOW>;
++ cd-gpios = <&gpio TEGRA194_MAIN_GPIO(G, 7) GPIO_ACTIVE_LOW>;
+ };
+
+ /* SDMMC4 (eMMC) */
+diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+index 751ebe5e95068..61465eb6cccd0 100644
+--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
++++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+@@ -2648,6 +2648,7 @@
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0x0 0x0 0x40000000 0x50000>;
++ no-memory-wc;
+
+ cpu_bpmp_tx: sram@4e000 {
+ reg = <0x4e000 0x1000>;
+diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+index aaace605bdaaf..9916b87fa83fc 100644
+--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
++++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+@@ -1264,6 +1264,7 @@
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0x0 0x0 0x40000000 0x80000>;
++ no-memory-wc;
+
+ cpu_bpmp_tx: sram@70000 {
+ reg = <0x70000 0x1000>;
+diff --git a/arch/arm64/boot/dts/qcom/ipq6018.dtsi b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
+index aac56575e30d6..5cdec104d8993 100644
+--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi
++++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
+@@ -525,9 +525,9 @@
+ };
+
+ timer@b120000 {
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x10000000>;
+ compatible = "arm,armv7-timer-mem";
+ reg = <0x0 0x0b120000 0x0 0x1000>;
+
+@@ -535,49 +535,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x0b121000 0x0 0x1000>,
+- <0x0 0x0b122000 0x0 0x1000>;
++ reg = <0x0b121000 0x1000>,
++ <0x0b122000 0x1000>;
+ };
+
+ frame@b123000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0xb123000 0x0 0x1000>;
++ reg = <0x0b123000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@b124000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x0b124000 0x0 0x1000>;
++ reg = <0x0b124000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@b125000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x0b125000 0x0 0x1000>;
++ reg = <0x0b125000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@b126000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x0b126000 0x0 0x1000>;
++ reg = <0x0b126000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@b127000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x0b127000 0x0 0x1000>;
++ reg = <0x0b127000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@b128000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x0b128000 0x0 0x1000>;
++ reg = <0x0b128000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/ipq8074.dtsi b/arch/arm64/boot/dts/qcom/ipq8074.dtsi
+index 8d4e0e1934396..664fba3632b17 100644
+--- a/arch/arm64/boot/dts/qcom/ipq8074.dtsi
++++ b/arch/arm64/boot/dts/qcom/ipq8074.dtsi
+@@ -534,7 +534,7 @@
+ status = "disabled";
+ };
+
+- qpic_nand: nand@79b0000 {
++ qpic_nand: nand-controller@79b0000 {
+ compatible = "qcom,ipq8074-nand";
+ reg = <0x079b0000 0x10000>;
+ #address-cells = <1>;
+diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+index e34963505e070..7ecd747dc6248 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+@@ -1758,8 +1758,8 @@
+ <&rpmpd MSM8916_VDDMX>;
+ power-domain-names = "cx", "mx";
+
+- qcom,state = <&wcnss_smp2p_out 0>;
+- qcom,state-names = "stop";
++ qcom,smem-states = <&wcnss_smp2p_out 0>;
++ qcom,smem-state-names = "stop";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&wcnss_pin_a>;
+diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
+index b9a48cfd760fa..078edf46de2bf 100644
+--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
+@@ -603,7 +603,7 @@
+ <0x00035400 0x1dc>;
+ #phy-cells = <0>;
+
+- #clock-cells = <1>;
++ #clock-cells = <0>;
+ clock-output-names = "pcie_0_pipe_clk_src";
+ clocks = <&gcc GCC_PCIE_0_PIPE_CLK>;
+ clock-names = "pipe0";
+@@ -617,6 +617,7 @@
+ <0x00036400 0x1dc>;
+ #phy-cells = <0>;
+
++ #clock-cells = <0>;
+ clock-output-names = "pcie_1_pipe_clk_src";
+ clocks = <&gcc GCC_PCIE_1_PIPE_CLK>;
+ clock-names = "pipe1";
+@@ -630,6 +631,7 @@
+ <0x00037400 0x1dc>;
+ #phy-cells = <0>;
+
++ #clock-cells = <0>;
+ clock-output-names = "pcie_2_pipe_clk_src";
+ clocks = <&gcc GCC_PCIE_2_PIPE_CLK>;
+ clock-names = "pipe2";
+@@ -2659,7 +2661,7 @@
+ <0x07410600 0x1a8>;
+ #phy-cells = <0>;
+
+- #clock-cells = <1>;
++ #clock-cells = <0>;
+ clock-output-names = "usb3_phy_pipe_clk_src";
+ clocks = <&gcc GCC_USB3_PHY_PIPE_CLK>;
+ clock-names = "pipe0";
+diff --git a/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts b/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts
+index 4a1f98a210319..c21333aa73c29 100644
+--- a/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts
++++ b/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts
+@@ -26,11 +26,13 @@
+ };
+
+ &vreg_l18a_2p85 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
++ /* Note: Round-down from 2850000 to be a multiple of PLDO step-size 8000 */
++ regulator-min-microvolt = <2848000>;
++ regulator-max-microvolt = <2848000>;
+ };
+
+ &vreg_l22a_2p85 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
++ /* Note: Round-down from 2700000 to be a multiple of PLDO step-size 8000 */
++ regulator-min-microvolt = <2696000>;
++ regulator-max-microvolt = <2696000>;
+ };
+diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi
+index 3f06f7cd3cf2d..65d71adee345c 100644
+--- a/arch/arm64/boot/dts/qcom/qcs404.dtsi
++++ b/arch/arm64/boot/dts/qcom/qcs404.dtsi
+@@ -548,7 +548,7 @@
+ compatible = "snps,dwc3";
+ reg = <0x07580000 0xcd00>;
+ interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>;
+- phys = <&usb2_phy_sec>, <&usb3_phy>;
++ phys = <&usb2_phy_prim>, <&usb3_phy>;
+ phy-names = "usb2-phy", "usb3-phy";
+ snps,has-lpm-erratum;
+ snps,hird-threshold = /bits/ 8 <0x10>;
+@@ -577,7 +577,7 @@
+ compatible = "snps,dwc3";
+ reg = <0x078c0000 0xcc00>;
+ interrupts = <GIC_SPI 44 IRQ_TYPE_LEVEL_HIGH>;
+- phys = <&usb2_phy_prim>;
++ phys = <&usb2_phy_sec>;
+ phy-names = "usb2-phy";
+ snps,has-lpm-erratum;
+ snps,hird-threshold = /bits/ 8 <0x10>;
+diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
+index 732e1181af488..262224504921d 100644
+--- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
++++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
+@@ -42,6 +42,7 @@
+ */
+
+ /delete-node/ &hyp_mem;
++/delete-node/ &ipa_fw_mem;
+ /delete-node/ &xbl_mem;
+ /delete-node/ &aop_mem;
+ /delete-node/ &sec_apps_mem;
+diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi
+index e1c46b80f14a0..120db2d0a3097 100644
+--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
++++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
+@@ -3219,7 +3219,7 @@
+ };
+
+ aoss_qmp: power-controller@c300000 {
+- compatible = "qcom,sc7180-aoss-qmp";
++ compatible = "qcom,sc7180-aoss-qmp", "qcom,aoss-qmp";
+ reg = <0 0x0c300000 0 0x400>;
+ interrupts = <GIC_SPI 389 IRQ_TYPE_EDGE_RISING>;
+ mboxes = <&apss_shared 0>;
+@@ -3388,9 +3388,9 @@
+ };
+
+ timer@17c20000{
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+ compatible = "arm,armv7-timer-mem";
+ reg = <0 0x17c20000 0 0x1000>;
+
+@@ -3398,49 +3398,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c21000 0 0x1000>,
+- <0 0x17c22000 0 0x1000>;
++ reg = <0x17c21000 0x1000>,
++ <0x17c22000 0x1000>;
+ };
+
+ frame@17c23000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c23000 0 0x1000>;
++ reg = <0x17c23000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c25000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c25000 0 0x1000>;
++ reg = <0x17c25000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c27000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c27000 0 0x1000>;
++ reg = <0x17c27000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c29000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c29000 0 0x1000>;
++ reg = <0x17c29000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2b000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c2b000 0 0x1000>;
++ reg = <0x17c2b000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2d000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c2d000 0 0x1000>;
++ reg = <0x17c2d000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi b/arch/arm64/boot/dts/qcom/sc7280.dtsi
+index f0b64be63c21d..793ac9715da22 100644
+--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
++++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
+@@ -810,7 +810,7 @@
+ reg = <0 0x00100000 0 0x1f0000>;
+ clocks = <&rpmhcc RPMH_CXO_CLK>,
+ <&rpmhcc RPMH_CXO_CLK_A>, <&sleep_clk>,
+- <0>, <&pcie1_lane 0>,
++ <0>, <&pcie1_lane>,
+ <0>, <0>, <0>, <0>;
+ clock-names = "bi_tcxo", "bi_tcxo_ao", "sleep_clk",
+ "pcie_0_pipe_clk", "pcie_1_pipe_clk",
+@@ -1839,7 +1839,7 @@
+
+ clocks = <&gcc GCC_PCIE_1_PIPE_CLK>,
+ <&gcc GCC_PCIE_1_PIPE_CLK_SRC>,
+- <&pcie1_lane 0>,
++ <&pcie1_lane>,
+ <&rpmhcc RPMH_CXO_CLK>,
+ <&gcc GCC_PCIE_1_AUX_CLK>,
+ <&gcc GCC_PCIE_1_CFG_AHB_CLK>,
+@@ -1914,7 +1914,7 @@
+ clock-names = "pipe0";
+
+ #phy-cells = <0>;
+- #clock-cells = <1>;
++ #clock-cells = <0>;
+ clock-output-names = "pcie_1_pipe_clk";
+ };
+ };
+@@ -3467,7 +3467,7 @@
+ };
+
+ aoss_qmp: power-controller@c300000 {
+- compatible = "qcom,sc7280-aoss-qmp";
++ compatible = "qcom,sc7280-aoss-qmp", "qcom,aoss-qmp";
+ reg = <0 0x0c300000 0 0x400>;
+ interrupts-extended = <&ipcc IPCC_CLIENT_AOP
+ IPCC_MPROC_SIGNAL_GLINK_QMP
+@@ -4395,9 +4395,9 @@
+ };
+
+ timer@17c20000 {
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+ compatible = "arm,armv7-timer-mem";
+ reg = <0 0x17c20000 0 0x1000>;
+
+@@ -4405,49 +4405,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c21000 0 0x1000>,
+- <0 0x17c22000 0 0x1000>;
++ reg = <0x17c21000 0x1000>,
++ <0x17c22000 0x1000>;
+ };
+
+ frame@17c23000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c23000 0 0x1000>;
++ reg = <0x17c23000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c25000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c25000 0 0x1000>;
++ reg = <0x17c25000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c27000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c27000 0 0x1000>;
++ reg = <0x17c27000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c29000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c29000 0 0x1000>;
++ reg = <0x17c29000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2b000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c2b000 0 0x1000>;
++ reg = <0x17c2b000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2d000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17c2d000 0 0x1000>;
++ reg = <0x17c2d000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sdm630.dtsi b/arch/arm64/boot/dts/qcom/sdm630.dtsi
+index 240293592ef9e..a33638a1cb44a 100644
+--- a/arch/arm64/boot/dts/qcom/sdm630.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm630.dtsi
+@@ -8,6 +8,7 @@
+ #include <dt-bindings/clock/qcom,gpucc-sdm660.h>
+ #include <dt-bindings/clock/qcom,mmcc-sdm660.h>
+ #include <dt-bindings/clock/qcom,rpmcc.h>
++#include <dt-bindings/interconnect/qcom,sdm660.h>
+ #include <dt-bindings/power/qcom-rpmpd.h>
+ #include <dt-bindings/gpio/gpio.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+@@ -1045,11 +1046,13 @@
+ nvmem-cells = <&gpu_speed_bin>;
+ nvmem-cell-names = "speed_bin";
+
+- interconnects = <&gnoc 1 &bimc 5>;
++ interconnects = <&bimc MASTER_OXILI &bimc SLAVE_EBI>;
+ interconnect-names = "gfx-mem";
+
+ operating-points-v2 = <&gpu_sdm630_opp_table>;
+
++ status = "disabled";
++
+ gpu_sdm630_opp_table: opp-table {
+ compatible = "operating-points-v2";
+ opp-775000000 {
+@@ -1260,7 +1263,7 @@
+ #phy-cells = <0>;
+
+ clocks = <&gcc GCC_USB_PHY_CFG_AHB2PHY_CLK>,
+- <&gcc GCC_RX1_USB2_CLKREF_CLK>;
++ <&gcc GCC_RX0_USB2_CLKREF_CLK>;
+ clock-names = "cfg_ahb", "ref";
+
+ resets = <&gcc GCC_QUSB2PHY_PRIM_BCR>;
+diff --git a/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts b/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts
+index b96da53f2f1ee..58f687fc49e04 100644
+--- a/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts
++++ b/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts
+@@ -19,7 +19,7 @@
+ };
+
+ &sdc2_state_on {
+- pinconf-clk {
++ clk {
+ drive-strength = <14>;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts b/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts
+index 8a0d94e7f5985..2f5e12deaadab 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts
++++ b/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts
+@@ -19,8 +19,9 @@
+ };
+
+ &vreg_l22a_2p8 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
++ /* Note: Round-down from 2700000 to be a multiple of PLDO step-size 8000 */
++ regulator-min-microvolt = <2696000>;
++ regulator-max-microvolt = <2696000>;
+ };
+
+ &vreg_l28a_2p8 {
+diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+index ad21cf465c986..cd0029ff82467 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+@@ -4942,9 +4942,9 @@
+ };
+
+ timer@17c90000 {
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+ compatible = "arm,armv7-timer-mem";
+ reg = <0 0x17c90000 0 0x1000>;
+
+@@ -4952,49 +4952,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17ca0000 0 0x1000>,
+- <0 0x17cb0000 0 0x1000>;
++ reg = <0x17ca0000 0x1000>,
++ <0x17cb0000 0x1000>;
+ };
+
+ frame@17cc0000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17cc0000 0 0x1000>;
++ reg = <0x17cc0000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17cd0000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17cd0000 0 0x1000>;
++ reg = <0x17cd0000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17ce0000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17ce0000 0 0x1000>;
++ reg = <0x17ce0000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17cf0000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17cf0000 0 0x1000>;
++ reg = <0x17cf0000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17d00000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17d00000 0 0x1000>;
++ reg = <0x17d00000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17d10000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0 0x17d10000 0 0x1000>;
++ reg = <0x17d10000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts b/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts
+index 871ccbba445bb..038970c0b68e2 100644
+--- a/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts
++++ b/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts
+@@ -88,11 +88,19 @@
+ status = "okay";
+ };
+
+-&sdc2_state_off {
++&sdc2_off_state {
+ sd-cd {
+ pins = "gpio98";
++ drive-strength = <2>;
+ bias-disable;
++ };
++};
++
++&sdc2_on_state {
++ sd-cd {
++ pins = "gpio98";
+ drive-strength = <2>;
++ bias-pull-up;
+ };
+ };
+
+@@ -102,32 +110,6 @@
+
+ &tlmm {
+ gpio-reserved-ranges = <22 2>, <28 6>;
+-
+- sdc2_state_on: sdc2-on {
+- clk {
+- pins = "sdc2_clk";
+- bias-disable;
+- drive-strength = <16>;
+- };
+-
+- cmd {
+- pins = "sdc2_cmd";
+- bias-pull-up;
+- drive-strength = <10>;
+- };
+-
+- data {
+- pins = "sdc2_data";
+- bias-pull-up;
+- drive-strength = <10>;
+- };
+-
+- sd-cd {
+- pins = "gpio98";
+- bias-pull-up;
+- drive-strength = <2>;
+- };
+- };
+ };
+
+ &usb3 {
+diff --git a/arch/arm64/boot/dts/qcom/sm6125.dtsi b/arch/arm64/boot/dts/qcom/sm6125.dtsi
+index e81b2a7794fb7..e601b9bfdc04e 100644
+--- a/arch/arm64/boot/dts/qcom/sm6125.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm6125.dtsi
+@@ -386,23 +386,43 @@
+ interrupt-controller;
+ #interrupt-cells = <2>;
+
+- sdc2_state_off: sdc2-off {
++ sdc2_off_state: sdc2-off-state {
+ clk {
+ pins = "sdc2_clk";
+- bias-disable;
+ drive-strength = <2>;
++ bias-disable;
+ };
+
+ cmd {
+ pins = "sdc2_cmd";
++ drive-strength = <2>;
+ bias-pull-up;
++ };
++
++ data {
++ pins = "sdc2_data";
+ drive-strength = <2>;
++ bias-pull-up;
++ };
++ };
++
++ sdc2_on_state: sdc2-on-state {
++ clk {
++ pins = "sdc2_clk";
++ drive-strength = <16>;
++ bias-disable;
++ };
++
++ cmd {
++ pins = "sdc2_cmd";
++ drive-strength = <10>;
++ bias-pull-up;
+ };
+
+ data {
+ pins = "sdc2_data";
++ drive-strength = <10>;
+ bias-pull-up;
+- drive-strength = <2>;
+ };
+ };
+ };
+@@ -470,8 +490,8 @@
+ <&xo_board>;
+ clock-names = "iface", "core", "xo";
+
+- pinctrl-0 = <&sdc2_state_on>;
+- pinctrl-1 = <&sdc2_state_off>;
++ pinctrl-0 = <&sdc2_on_state>;
++ pinctrl-1 = <&sdc2_off_state>;
+ pinctrl-names = "default", "sleep";
+
+ power-domains = <&rpmpd SM6125_VDDCX>;
+diff --git a/arch/arm64/boot/dts/qcom/sm6350.dtsi b/arch/arm64/boot/dts/qcom/sm6350.dtsi
+index d7c9edff19f77..c31fe27a46f2b 100644
+--- a/arch/arm64/boot/dts/qcom/sm6350.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm6350.dtsi
+@@ -1090,57 +1090,57 @@
+ compatible = "arm,armv7-timer-mem";
+ reg = <0x0 0x17c20000 0x0 0x1000>;
+ clock-frequency = <19200000>;
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+
+ frame@17c21000 {
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c21000 0x0 0x1000>,
+- <0x0 0x17c22000 0x0 0x1000>;
++ reg = <0x17c21000 0x1000>,
++ <0x17c22000 0x1000>;
+ };
+
+ frame@17c23000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c23000 0x0 0x1000>;
++ reg = <0x17c23000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c25000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c25000 0x0 0x1000>;
++ reg = <0x17c25000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c27000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c27000 0x0 0x1000>;
++ reg = <0x17c27000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c29000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c29000 0x0 0x1000>;
++ reg = <0x17c29000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2b000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2b000 0x0 0x1000>;
++ reg = <0x17c2b000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2d000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2d000 0x0 0x1000>;
++ reg = <0x17c2d000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
+index 15f3bf2e7ea0c..6edb3986a473d 100644
+--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
+@@ -3382,7 +3382,7 @@
+ };
+
+ aoss_qmp: power-controller@c300000 {
+- compatible = "qcom,sm8150-aoss-qmp";
++ compatible = "qcom,sm8150-aoss-qmp", "qcom,aoss-qmp";
+ reg = <0x0 0x0c300000 0x0 0x400>;
+ interrupts = <GIC_SPI 389 IRQ_TYPE_EDGE_RISING>;
+ mboxes = <&apss_shared 0>;
+@@ -3608,9 +3608,9 @@
+ };
+
+ timer@17c20000 {
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+ compatible = "arm,armv7-timer-mem";
+ reg = <0x0 0x17c20000 0x0 0x1000>;
+ clock-frequency = <19200000>;
+@@ -3619,49 +3619,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c21000 0x0 0x1000>,
+- <0x0 0x17c22000 0x0 0x1000>;
++ reg = <0x17c21000 0x1000>,
++ <0x17c22000 0x1000>;
+ };
+
+ frame@17c23000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c23000 0x0 0x1000>;
++ reg = <0x17c23000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c25000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c25000 0x0 0x1000>;
++ reg = <0x17c25000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c27000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c26000 0x0 0x1000>;
++ reg = <0x17c26000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c29000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c29000 0x0 0x1000>;
++ reg = <0x17c29000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2b000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2b000 0x0 0x1000>;
++ reg = <0x17c2b000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2d000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2d000 0x0 0x1000>;
++ reg = <0x17c2d000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
+index 1304b86af1a00..9afd8b3da3a11 100644
+--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
+@@ -1883,6 +1883,8 @@
+ clock-names = "pipe0";
+
+ #phy-cells = <0>;
++
++ #clock-cells = <0>;
+ clock-output-names = "pcie_0_pipe_clk";
+ };
+ };
+@@ -1989,6 +1991,8 @@
+ clock-names = "pipe0";
+
+ #phy-cells = <0>;
++
++ #clock-cells = <0>;
+ clock-output-names = "pcie_1_pipe_clk";
+ };
+ };
+@@ -2095,6 +2099,8 @@
+ clock-names = "pipe0";
+
+ #phy-cells = <0>;
++
++ #clock-cells = <0>;
+ clock-output-names = "pcie_2_pipe_clk";
+ };
+ };
+@@ -3475,7 +3481,7 @@
+ };
+
+ aoss_qmp: power-controller@c300000 {
+- compatible = "qcom,sm8250-aoss-qmp";
++ compatible = "qcom,sm8250-aoss-qmp", "qcom,aoss-qmp";
+ reg = <0 0x0c300000 0 0x400>;
+ interrupts-extended = <&ipcc IPCC_CLIENT_AOP
+ IPCC_MPROC_SIGNAL_GLINK_QMP
+@@ -4528,9 +4534,9 @@
+ };
+
+ timer@17c20000 {
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+ compatible = "arm,armv7-timer-mem";
+ reg = <0x0 0x17c20000 0x0 0x1000>;
+ clock-frequency = <19200000>;
+@@ -4539,49 +4545,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c21000 0x0 0x1000>,
+- <0x0 0x17c22000 0x0 0x1000>;
++ reg = <0x17c21000 0x1000>,
++ <0x17c22000 0x1000>;
+ };
+
+ frame@17c23000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c23000 0x0 0x1000>;
++ reg = <0x17c23000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c25000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c25000 0x0 0x1000>;
++ reg = <0x17c25000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c27000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c27000 0x0 0x1000>;
++ reg = <0x17c27000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c29000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c29000 0x0 0x1000>;
++ reg = <0x17c29000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2b000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2b000 0x0 0x1000>;
++ reg = <0x17c2b000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2d000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2d000 0x0 0x1000>;
++ reg = <0x17c2d000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm8350.dtsi b/arch/arm64/boot/dts/qcom/sm8350.dtsi
+index 20f850b941586..ee6c202ab68cc 100644
+--- a/arch/arm64/boot/dts/qcom/sm8350.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8350.dtsi
+@@ -1537,7 +1537,7 @@
+ };
+
+ aoss_qmp: power-controller@c300000 {
+- compatible = "qcom,sm8350-aoss-qmp";
++ compatible = "qcom,sm8350-aoss-qmp", "qcom,aoss-qmp";
+ reg = <0 0x0c300000 0 0x400>;
+ interrupts-extended = <&ipcc IPCC_CLIENT_AOP IPCC_MPROC_SIGNAL_GLINK_QMP
+ IRQ_TYPE_EDGE_RISING>;
+@@ -1752,9 +1752,9 @@
+
+ timer@17c20000 {
+ compatible = "arm,armv7-timer-mem";
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+ reg = <0x0 0x17c20000 0x0 0x1000>;
+ clock-frequency = <19200000>;
+
+@@ -1762,49 +1762,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c21000 0x0 0x1000>,
+- <0x0 0x17c22000 0x0 0x1000>;
++ reg = <0x17c21000 0x1000>,
++ <0x17c22000 0x1000>;
+ };
+
+ frame@17c23000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c23000 0x0 0x1000>;
++ reg = <0x17c23000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c25000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c25000 0x0 0x1000>;
++ reg = <0x17c25000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c27000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c27000 0x0 0x1000>;
++ reg = <0x17c27000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c29000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c29000 0x0 0x1000>;
++ reg = <0x17c29000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2b000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2b000 0x0 0x1000>;
++ reg = <0x17c2b000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17c2d000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17c2d000 0x0 0x1000>;
++ reg = <0x17c2d000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
+index 7a14eb89e4ca0..e5fe694b7be64 100644
+--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
+@@ -1203,9 +1203,9 @@
+
+ timer@17420000 {
+ compatible = "arm,armv7-timer-mem";
+- #address-cells = <2>;
+- #size-cells = <2>;
+- ranges;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ ranges = <0 0 0 0x20000000>;
+ reg = <0x0 0x17420000 0x0 0x1000>;
+ clock-frequency = <19200000>;
+
+@@ -1213,49 +1213,49 @@
+ frame-number = <0>;
+ interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17421000 0x0 0x1000>,
+- <0x0 0x17422000 0x0 0x1000>;
++ reg = <0x17421000 0x1000>,
++ <0x17422000 0x1000>;
+ };
+
+ frame@17423000 {
+ frame-number = <1>;
+ interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17423000 0x0 0x1000>;
++ reg = <0x17423000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17425000 {
+ frame-number = <2>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17425000 0x0 0x1000>;
++ reg = <0x17425000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17427000 {
+ frame-number = <3>;
+ interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17427000 0x0 0x1000>;
++ reg = <0x17427000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@17429000 {
+ frame-number = <4>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x17429000 0x0 0x1000>;
++ reg = <0x17429000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@1742b000 {
+ frame-number = <5>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x1742b000 0x0 0x1000>;
++ reg = <0x1742b000 0x1000>;
+ status = "disabled";
+ };
+
+ frame@1742d000 {
+ frame-number = <6>;
+ interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
+- reg = <0x0 0x1742d000 0x0 0x1000>;
++ reg = <0x1742d000 0x1000>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi b/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi
+index 5ad6cd1864c10..c289d2bb51226 100644
+--- a/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi
++++ b/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi
+@@ -146,7 +146,7 @@
+ };
+ };
+
+- reg_audio: regulator_audio {
++ reg_audio: regulator-audio {
+ compatible = "regulator-fixed";
+ regulator-name = "audio-1.8V";
+ regulator-min-microvolt = <1800000>;
+@@ -174,7 +174,7 @@
+ vin-supply = <®_lcd>;
+ };
+
+- reg_cam0: regulator_camera {
++ reg_cam0: regulator-cam0 {
+ compatible = "regulator-fixed";
+ regulator-name = "reg_cam0";
+ regulator-min-microvolt = <1800000>;
+@@ -183,7 +183,7 @@
+ enable-active-high;
+ };
+
+- reg_cam1: regulator_camera {
++ reg_cam1: regulator-cam1 {
+ compatible = "regulator-fixed";
+ regulator-name = "reg_cam1";
+ regulator-min-microvolt = <1800000>;
+diff --git a/arch/arm64/boot/dts/renesas/r8a774c0.dtsi b/arch/arm64/boot/dts/renesas/r8a774c0.dtsi
+index e123c8d1bab93..337efbeda5a91 100644
+--- a/arch/arm64/boot/dts/renesas/r8a774c0.dtsi
++++ b/arch/arm64/boot/dts/renesas/r8a774c0.dtsi
+@@ -1956,7 +1956,7 @@
+ cpu-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <0>;
+- thermal-sensors = <&thermal 0>;
++ thermal-sensors = <&thermal>;
+ sustainable-power = <717>;
+
+ cooling-maps {
+diff --git a/arch/arm64/boot/dts/renesas/r8a77990.dtsi b/arch/arm64/boot/dts/renesas/r8a77990.dtsi
+index 7e0f1aab21352..155589f67e1be 100644
+--- a/arch/arm64/boot/dts/renesas/r8a77990.dtsi
++++ b/arch/arm64/boot/dts/renesas/r8a77990.dtsi
+@@ -2117,7 +2117,7 @@
+ cpu-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <0>;
+- thermal-sensors = <&thermal 0>;
++ thermal-sensors = <&thermal>;
+ sustainable-power = <717>;
+
+ cooling-maps {
+diff --git a/arch/arm64/boot/dts/renesas/r8a779m8.dtsi b/arch/arm64/boot/dts/renesas/r8a779m8.dtsi
+index 752440b0c40f7..750bd8ccdb7f1 100644
+--- a/arch/arm64/boot/dts/renesas/r8a779m8.dtsi
++++ b/arch/arm64/boot/dts/renesas/r8a779m8.dtsi
+@@ -10,3 +10,8 @@
+ / {
+ compatible = "renesas,r8a779m8", "renesas,r8a7795";
+ };
++
++&cluster0_opp {
++ /delete-node/ opp-1600000000;
++ /delete-node/ opp-1700000000;
++};
+diff --git a/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts b/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts
+index fc334b4c2aa42..d2848cbfc6c0d 100644
+--- a/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts
++++ b/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts
+@@ -1,6 +1,6 @@
+ // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+ /*
+- * Device Tree Source for the RZ/G2L SMARC EVK board
++ * Device Tree Source for the RZ/V2L SMARC EVK board
+ *
+ * Copyright (C) 2021 Renesas Electronics Corp.
+ */
+diff --git a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
+index be97da1322580..ba75adedbf79b 100644
+--- a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
++++ b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
+@@ -599,8 +599,8 @@
+ compatible = "socionext,uniphier-dwc3", "snps,dwc3";
+ status = "disabled";
+ reg = <0x65a00000 0xcd00>;
+- interrupt-names = "host", "peripheral";
+- interrupts = <0 134 4>, <0 135 4>;
++ interrupt-names = "dwc_usb3";
++ interrupts = <0 134 4>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_usb0>, <&pinctrl_usb2>;
+ clock-names = "ref", "bus_early", "suspend";
+@@ -701,8 +701,8 @@
+ compatible = "socionext,uniphier-dwc3", "snps,dwc3";
+ status = "disabled";
+ reg = <0x65c00000 0xcd00>;
+- interrupt-names = "host", "peripheral";
+- interrupts = <0 137 4>, <0 138 4>;
++ interrupt-names = "dwc_usb3";
++ interrupts = <0 137 4>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_usb1>, <&pinctrl_usb3>;
+ clock-names = "ref", "bus_early", "suspend";
+diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
+index 2a965aa0188dd..bdca34283c060 100644
+--- a/arch/arm64/crypto/Kconfig
++++ b/arch/arm64/crypto/Kconfig
+@@ -59,6 +59,7 @@ config CRYPTO_GHASH_ARM64_CE
+ select CRYPTO_HASH
+ select CRYPTO_GF128MUL
+ select CRYPTO_LIB_AES
++ select CRYPTO_AEAD
+
+ config CRYPTO_CRCT10DIF_ARM64_CE
+ tristate "CRCT10DIF digest algorithm using PMULL instructions"
+diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
+index d52a0b269ee80..65c2201b11b20 100644
+--- a/arch/arm64/include/asm/esr.h
++++ b/arch/arm64/include/asm/esr.h
+@@ -133,7 +133,8 @@
+ #define ESR_ELx_CV (UL(1) << 24)
+ #define ESR_ELx_COND_SHIFT (20)
+ #define ESR_ELx_COND_MASK (UL(0xF) << ESR_ELx_COND_SHIFT)
+-#define ESR_ELx_WFx_ISS_TI (UL(1) << 0)
++#define ESR_ELx_WFx_ISS_TI (UL(3) << 0)
++#define ESR_ELx_WFx_ISS_WFxT (UL(2) << 0)
+ #define ESR_ELx_WFx_ISS_WFI (UL(0) << 0)
+ #define ESR_ELx_WFx_ISS_WFE (UL(1) << 0)
+ #define ESR_ELx_xVC_IMM_MASK ((1UL << 16) - 1)
+@@ -146,7 +147,8 @@
+ #define DISR_EL1_ESR_MASK (ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
+
+ /* ESR value templates for specific events */
+-#define ESR_ELx_WFx_MASK (ESR_ELx_EC_MASK | ESR_ELx_WFx_ISS_TI)
++#define ESR_ELx_WFx_MASK (ESR_ELx_EC_MASK | \
++ (ESR_ELx_WFx_ISS_TI & ~ESR_ELx_WFx_ISS_WFxT))
+ #define ESR_ELx_WFx_WFI_VAL ((ESR_ELx_EC_WFx << ESR_ELx_EC_SHIFT) | \
+ ESR_ELx_WFx_ISS_WFI)
+
+diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
+index 9839bfc163d71..78d272b26ebd1 100644
+--- a/arch/arm64/include/asm/kexec.h
++++ b/arch/arm64/include/asm/kexec.h
+@@ -115,7 +115,9 @@ extern const struct kexec_file_ops kexec_image_ops;
+
+ struct kimage;
+
+-extern int arch_kimage_file_post_load_cleanup(struct kimage *image);
++int arch_kimage_file_post_load_cleanup(struct kimage *image);
++#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
++
+ extern int load_other_segments(struct kimage *image,
+ unsigned long kernel_load_addr, unsigned long kernel_size,
+ char *initrd, unsigned long initrd_len,
+diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
+index 6b1a12c23fe77..7058bf047fa50 100644
+--- a/arch/arm64/include/asm/processor.h
++++ b/arch/arm64/include/asm/processor.h
+@@ -250,8 +250,9 @@ void tls_preserve_current_state(void);
+
+ static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
+ {
++ s32 previous_syscall = regs->syscallno;
+ memset(regs, 0, sizeof(*regs));
+- forget_syscall(regs);
++ regs->syscallno = previous_syscall;
+ regs->pc = pc;
+
+ if (system_uses_irq_prio_masking())
+diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
+index 6875a16b09d29..fb0e7c7b2e209 100644
+--- a/arch/arm64/kernel/armv8_deprecated.c
++++ b/arch/arm64/kernel/armv8_deprecated.c
+@@ -59,6 +59,7 @@ struct insn_emulation {
+ static LIST_HEAD(insn_emulation);
+ static int nr_insn_emulated __initdata;
+ static DEFINE_RAW_SPINLOCK(insn_emulation_lock);
++static DEFINE_MUTEX(insn_emulation_mutex);
+
+ static void register_emulation_hooks(struct insn_emulation_ops *ops)
+ {
+@@ -207,10 +208,10 @@ static int emulation_proc_handler(struct ctl_table *table, int write,
+ loff_t *ppos)
+ {
+ int ret = 0;
+- struct insn_emulation *insn = (struct insn_emulation *) table->data;
++ struct insn_emulation *insn = container_of(table->data, struct insn_emulation, current_mode);
+ enum insn_emulation_mode prev_mode = insn->current_mode;
+
+- table->data = &insn->current_mode;
++ mutex_lock(&insn_emulation_mutex);
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+ if (ret || !write || prev_mode == insn->current_mode)
+@@ -223,7 +224,7 @@ static int emulation_proc_handler(struct ctl_table *table, int write,
+ update_insn_emulation_mode(insn, INSN_UNDEF);
+ }
+ ret:
+- table->data = insn;
++ mutex_unlock(&insn_emulation_mutex);
+ return ret;
+ }
+
+@@ -247,7 +248,7 @@ static void __init register_insn_emulation_sysctl(void)
+ sysctl->maxlen = sizeof(int);
+
+ sysctl->procname = insn->ops->name;
+- sysctl->data = insn;
++ sysctl->data = &insn->current_mode;
+ sysctl->extra1 = &insn->min;
+ sysctl->extra2 = &insn->max;
+ sysctl->proc_handler = emulation_proc_handler;
+diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
+index a0f3d0aaa3c53..27baa41c46ca2 100644
+--- a/arch/arm64/kernel/cpu_errata.c
++++ b/arch/arm64/kernel/cpu_errata.c
+@@ -395,6 +395,14 @@ static struct midr_range trbe_write_out_of_range_cpus[] = {
+ };
+ #endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */
+
++#ifdef CONFIG_ARM64_ERRATUM_1742098
++static struct midr_range broken_aarch32_aes[] = {
++ MIDR_RANGE(MIDR_CORTEX_A57, 0, 1, 0xf, 0xf),
++ MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
++ {},
++};
++#endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */
++
+ const struct arm64_cpu_capabilities arm64_errata[] = {
+ #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
+ {
+@@ -657,6 +665,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
+ /* Cortex-A510 r0p0 - r0p1 */
+ ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 1)
+ },
++#endif
++#ifdef CONFIG_ARM64_ERRATUM_1742098
++ {
++ .desc = "ARM erratum 1742098",
++ .capability = ARM64_WORKAROUND_1742098,
++ CAP_MIDR_RANGE_LIST(broken_aarch32_aes),
++ .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
++ },
+ #endif
+ {
+ }
+diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
+index 2cb9cc9e0eff1..a54f34aa36e01 100644
+--- a/arch/arm64/kernel/cpufeature.c
++++ b/arch/arm64/kernel/cpufeature.c
+@@ -79,6 +79,7 @@
+ #include <asm/cpufeature.h>
+ #include <asm/cpu_ops.h>
+ #include <asm/fpsimd.h>
++#include <asm/hwcap.h>
+ #include <asm/insn.h>
+ #include <asm/kvm_host.h>
+ #include <asm/mmu_context.h>
+@@ -540,7 +541,7 @@ static const struct arm64_ftr_bits ftr_id_pfr2[] = {
+
+ static const struct arm64_ftr_bits ftr_id_dfr0[] = {
+ /* [31:28] TraceFilt */
+- S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_PERFMON_SHIFT, 4, 0xf),
++ S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_DFR0_PERFMON_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MPROFDBG_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MMAPTRC_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_COPTRC_SHIFT, 4, 0),
+@@ -1921,6 +1922,14 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
+ }
+ #endif /* CONFIG_ARM64_MTE */
+
++static void elf_hwcap_fixup(void)
++{
++#ifdef CONFIG_ARM64_ERRATUM_1742098
++ if (cpus_have_const_cap(ARM64_WORKAROUND_1742098))
++ compat_elf_hwcap2 &= ~COMPAT_HWCAP2_AES;
++#endif /* ARM64_ERRATUM_1742098 */
++}
++
+ #ifdef CONFIG_KVM
+ static bool is_kvm_protected_mode(const struct arm64_cpu_capabilities *entry, int __unused)
+ {
+@@ -3033,8 +3042,10 @@ void __init setup_cpu_features(void)
+ setup_system_capabilities();
+ setup_elf_hwcaps(arm64_elf_hwcaps);
+
+- if (system_supports_32bit_el0())
++ if (system_supports_32bit_el0()) {
+ setup_elf_hwcaps(compat_elf_hwcaps);
++ elf_hwcap_fixup();
++ }
+
+ if (system_uses_ttbr0_pan())
+ pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n");
+@@ -3086,6 +3097,7 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
+ cpu_active_mask);
+ get_cpu_device(lucky_winner)->offline_disabled = true;
+ setup_elf_hwcaps(compat_elf_hwcaps);
++ elf_hwcap_fixup();
+ pr_info("Asymmetric 32-bit EL0 support detected on CPU %u; CPU hot-unplug disabled on CPU %u\n",
+ cpu, lucky_winner);
+ return 0;
+diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
+index 6328308be2726..7754ef328657e 100644
+--- a/arch/arm64/kernel/hibernate.c
++++ b/arch/arm64/kernel/hibernate.c
+@@ -300,11 +300,6 @@ static void swsusp_mte_restore_tags(void)
+ unsigned long pfn = xa_state.xa_index;
+ struct page *page = pfn_to_online_page(pfn);
+
+- /*
+- * It is not required to invoke page_kasan_tag_reset(page)
+- * at this point since the tags stored in page->flags are
+- * already restored.
+- */
+ mte_restore_page_tags(page_address(page), tags);
+
+ mte_free_tag_storage(tags);
+diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
+index d502703e8373a..d565ae25e48f4 100644
+--- a/arch/arm64/kernel/mte.c
++++ b/arch/arm64/kernel/mte.c
+@@ -47,15 +47,6 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
+ if (!pte_is_tagged)
+ return;
+
+- page_kasan_tag_reset(page);
+- /*
+- * We need smp_wmb() in between setting the flags and clearing the
+- * tags because if another thread reads page->flags and builds a
+- * tagged address out of it, there is an actual dependency to the
+- * memory access, but on the current thread we do not guarantee that
+- * the new page->flags are visible before the tags were updated.
+- */
+- smp_wmb();
+ mte_clear_page_tags(page_address(page));
+ }
+
+diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
+index 6410d21d86957..858e5be48791a 100644
+--- a/arch/arm64/kvm/hyp/nvhe/switch.c
++++ b/arch/arm64/kvm/hyp/nvhe/switch.c
+@@ -371,5 +371,5 @@ void __noreturn hyp_panic(void)
+
+ asmlinkage void kvm_unexpected_el2_exception(void)
+ {
+- return __kvm_unexpected_el2_exception();
++ __kvm_unexpected_el2_exception();
+ }
+diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
+index 262dfe03134da..51ea9ac712103 100644
+--- a/arch/arm64/kvm/hyp/vhe/switch.c
++++ b/arch/arm64/kvm/hyp/vhe/switch.c
+@@ -240,5 +240,5 @@ void __noreturn hyp_panic(void)
+
+ asmlinkage void kvm_unexpected_el2_exception(void)
+ {
+- return __kvm_unexpected_el2_exception();
++ __kvm_unexpected_el2_exception();
+ }
+diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
+index 0dea80bf6de46..24913271e898c 100644
+--- a/arch/arm64/mm/copypage.c
++++ b/arch/arm64/mm/copypage.c
+@@ -23,15 +23,6 @@ void copy_highpage(struct page *to, struct page *from)
+
+ if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
+ set_bit(PG_mte_tagged, &to->flags);
+- page_kasan_tag_reset(to);
+- /*
+- * We need smp_wmb() in between setting the flags and clearing the
+- * tags because if another thread reads page->flags and builds a
+- * tagged address out of it, there is an actual dependency to the
+- * memory access, but on the current thread we do not guarantee that
+- * the new page->flags are visible before the tags were updated.
+- */
+- smp_wmb();
+ mte_copy_page_tags(kto, kfrom);
+ }
+ }
+diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
+index a9e50e930484a..4334dec93bd44 100644
+--- a/arch/arm64/mm/mteswap.c
++++ b/arch/arm64/mm/mteswap.c
+@@ -53,15 +53,6 @@ bool mte_restore_tags(swp_entry_t entry, struct page *page)
+ if (!tags)
+ return false;
+
+- page_kasan_tag_reset(page);
+- /*
+- * We need smp_wmb() in between setting the flags and clearing the
+- * tags because if another thread reads page->flags and builds a
+- * tagged address out of it, there is an actual dependency to the
+- * memory access, but on the current thread we do not guarantee that
+- * the new page->flags are visible before the tags were updated.
+- */
+- smp_wmb();
+ mte_restore_page_tags(page_address(page), tags);
+
+ return true;
+diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
+index 3ed418f70e3bd..8cd6088f8875e 100644
+--- a/arch/arm64/tools/cpucaps
++++ b/arch/arm64/tools/cpucaps
+@@ -58,6 +58,7 @@ WORKAROUND_1418040
+ WORKAROUND_1463225
+ WORKAROUND_1508412
+ WORKAROUND_1542419
++WORKAROUND_1742098
+ WORKAROUND_1902691
+ WORKAROUND_2038923
+ WORKAROUND_2064142
+diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
+index 7cbce290f4e5a..757c2f6d8d4b8 100644
+--- a/arch/ia64/include/asm/processor.h
++++ b/arch/ia64/include/asm/processor.h
+@@ -538,7 +538,7 @@ ia64_get_irr(unsigned int vector)
+ {
+ unsigned int reg = vector / 64;
+ unsigned int bit = vector % 64;
+- u64 irr;
++ unsigned long irr;
+
+ switch (reg) {
+ case 0: irr = ia64_getreg(_IA64_REG_CR_IRR0); break;
+diff --git a/arch/ia64/kernel/palinfo.c b/arch/ia64/kernel/palinfo.c
+index 64189f04c1a49..b9ae093bfe376 100644
+--- a/arch/ia64/kernel/palinfo.c
++++ b/arch/ia64/kernel/palinfo.c
+@@ -120,7 +120,7 @@ static const char *mem_attrib[]={
+ * Input:
+ * - a pointer to a buffer to hold the string
+ * - a 64-bit vector
+- * Ouput:
++ * Output:
+ * - a pointer to the end of the buffer
+ *
+ */
+diff --git a/arch/ia64/kernel/traps.c b/arch/ia64/kernel/traps.c
+index 753642366e12e..53735b1d1be3a 100644
+--- a/arch/ia64/kernel/traps.c
++++ b/arch/ia64/kernel/traps.c
+@@ -309,7 +309,7 @@ handle_fpu_swa (int fp_fault, struct pt_regs *regs, unsigned long isr)
+ /*
+ * Lower 4 bits are used as a count. Upper bits are a sequence
+ * number that is updated when count is reset. The cmpxchg will
+- * fail is seqno has changed. This minimizes mutiple cpus
++ * fail is seqno has changed. This minimizes multiple cpus
+ * resetting the count.
+ */
+ if (current_jiffies > last.time)
+diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
+index 5d165607bf354..7ae1244ed8ec2 100644
+--- a/arch/ia64/mm/init.c
++++ b/arch/ia64/mm/init.c
+@@ -451,7 +451,7 @@ mem_init (void)
+ memblock_free_all();
+
+ /*
+- * For fsyscall entrpoints with no light-weight handler, use the ordinary
++ * For fsyscall entrypoints with no light-weight handler, use the ordinary
+ * (heavy-weight) handler, but mark it by setting bit 0, so the fsyscall entry
+ * code can tell them apart.
+ */
+diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c
+index 135b5135cace2..ca060e7a2a466 100644
+--- a/arch/ia64/mm/tlb.c
++++ b/arch/ia64/mm/tlb.c
+@@ -174,7 +174,7 @@ __setup("nptcg=", set_nptcg);
+ * override table (in which case we should ignore the value from
+ * PAL_VM_SUMMARY).
+ *
+- * Kernel parameter "nptcg=" overrides maximum number of simultanesous ptc.g
++ * Kernel parameter "nptcg=" overrides maximum number of simultaneous ptc.g
+ * purges defined in either PAL_VM_SUMMARY or PAL override table. In this case,
+ * we should ignore the value from either PAL_VM_SUMMARY or PAL override table.
+ *
+@@ -516,7 +516,7 @@ found:
+ if (i >= per_cpu(ia64_tr_num, cpu))
+ return -EBUSY;
+
+- /*Record tr info for mca hander use!*/
++ /*Record tr info for mca handler use!*/
+ if (i > per_cpu(ia64_tr_used, cpu))
+ per_cpu(ia64_tr_used, cpu) = i;
+
+diff --git a/arch/mips/kernel/proc.c b/arch/mips/kernel/proc.c
+index bb43bf850314a..8eba5a1ed664c 100644
+--- a/arch/mips/kernel/proc.c
++++ b/arch/mips/kernel/proc.c
+@@ -311,7 +311,7 @@ static void *c_start(struct seq_file *m, loff_t *pos)
+ {
+ unsigned long i = *pos;
+
+- return i < NR_CPUS ? (void *) (i + 1) : NULL;
++ return i < nr_cpu_ids ? (void *) (i + 1) : NULL;
+ }
+
+ static void *c_next(struct seq_file *m, void *v, loff_t *pos)
+diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
+index 3d0cf471f2fe1..b2cc2c2dd4bfc 100644
+--- a/arch/mips/kernel/vdso.c
++++ b/arch/mips/kernel/vdso.c
+@@ -159,7 +159,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
+ /* Map GIC user page. */
+ if (gic_size) {
+ gic_base = (unsigned long)mips_gic_base + MIPS_GIC_USER_OFS;
+- gic_pfn = virt_to_phys((void *)gic_base) >> PAGE_SHIFT;
++ gic_pfn = PFN_DOWN(__pa(gic_base));
+
+ ret = io_remap_pfn_range(vma, base, gic_pfn, gic_size,
+ pgprot_noncached(vma->vm_page_prot));
+diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
+index 69a533148efdd..8f61e93c0c5bc 100644
+--- a/arch/mips/loongson64/numa.c
++++ b/arch/mips/loongson64/numa.c
+@@ -196,7 +196,6 @@ void __init prom_init_numa_memory(void)
+ pr_info("CP0_PageGrain: CP0 5.1 (0x%x)\n", read_c0_pagegrain());
+ prom_meminit();
+ }
+-EXPORT_SYMBOL(prom_init_numa_memory);
+
+ pg_data_t * __init arch_alloc_nodedata(int nid)
+ {
+diff --git a/arch/mips/mm/physaddr.c b/arch/mips/mm/physaddr.c
+index a1ced5e449511..f9b8c85e98433 100644
+--- a/arch/mips/mm/physaddr.c
++++ b/arch/mips/mm/physaddr.c
+@@ -5,6 +5,7 @@
+ #include <linux/mmdebug.h>
+ #include <linux/mm.h>
+
++#include <asm/addrspace.h>
+ #include <asm/sections.h>
+ #include <asm/io.h>
+ #include <asm/page.h>
+@@ -12,15 +13,6 @@
+
+ static inline bool __debug_virt_addr_valid(unsigned long x)
+ {
+- /* high_memory does not get immediately defined, and there
+- * are early callers of __pa() against PAGE_OFFSET
+- */
+- if (!high_memory && x >= PAGE_OFFSET)
+- return true;
+-
+- if (high_memory && x >= PAGE_OFFSET && x < (unsigned long)high_memory)
+- return true;
+-
+ /*
+ * MAX_DMA_ADDRESS is a virtual address that may not correspond to an
+ * actual physical address. Enough code relies on
+@@ -30,7 +22,9 @@ static inline bool __debug_virt_addr_valid(unsigned long x)
+ if (x == MAX_DMA_ADDRESS)
+ return true;
+
+- return false;
++ return x >= PAGE_OFFSET && (KSEGX(x) < KSEG2 ||
++ IS_ENABLED(CONFIG_EVA) ||
++ !IS_ENABLED(CONFIG_HIGHMEM));
+ }
+
+ phys_addr_t __virt_to_phys(volatile const void *x)
+diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
+index a20c1c47b7808..5930ad5a3ad3e 100644
+--- a/arch/parisc/kernel/cache.c
++++ b/arch/parisc/kernel/cache.c
+@@ -50,9 +50,6 @@ void flush_instruction_cache_local(void); /* flushes local code-cache only */
+ */
+ DEFINE_SPINLOCK(pa_tlb_flush_lock);
+
+-/* Swapper page setup lock. */
+-DEFINE_SPINLOCK(pa_swapper_pg_lock);
+-
+ #if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
+ int pa_serialize_tlb_flushes __ro_after_init;
+ #endif
+diff --git a/arch/parisc/kernel/drivers.c b/arch/parisc/kernel/drivers.c
+index 776d624a7207b..d126e78e101ae 100644
+--- a/arch/parisc/kernel/drivers.c
++++ b/arch/parisc/kernel/drivers.c
+@@ -520,7 +520,6 @@ alloc_pa_dev(unsigned long hpa, struct hardware_path *mod_path)
+ dev->id.hversion_rev = iodc_data[1] & 0x0f;
+ dev->id.sversion = ((iodc_data[4] & 0x0f) << 16) |
+ (iodc_data[5] << 8) | iodc_data[6];
+- dev->hpa.name = parisc_pathname(dev);
+ dev->hpa.start = hpa;
+ /* This is awkward. The STI spec says that gfx devices may occupy
+ * 32MB or 64MB. Unfortunately, we don't know how to tell whether
+@@ -534,10 +533,10 @@ alloc_pa_dev(unsigned long hpa, struct hardware_path *mod_path)
+ dev->hpa.end = hpa + 0xfff;
+ }
+ dev->hpa.flags = IORESOURCE_MEM;
+- name = parisc_hardware_description(&dev->id);
+- if (name) {
+- strlcpy(dev->name, name, sizeof(dev->name));
+- }
++ dev->hpa.name = dev->name;
++ name = parisc_hardware_description(&dev->id) ? : "unknown";
++ snprintf(dev->name, sizeof(dev->name), "%s [%s]",
++ name, parisc_pathname(dev));
+
+ /* Silently fail things like mouse ports which are subsumed within
+ * the keyboard controller
+diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
+index 68b46fe2f17c5..8a99c998da9bb 100644
+--- a/arch/parisc/kernel/syscalls/syscall.tbl
++++ b/arch/parisc/kernel/syscalls/syscall.tbl
+@@ -413,7 +413,7 @@
+ 412 32 utimensat_time64 sys_utimensat sys_utimensat
+ 413 32 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64
+ 414 32 ppoll_time64 sys_ppoll compat_sys_ppoll_time64
+-416 32 io_pgetevents_time64 sys_io_pgetevents sys_io_pgetevents
++416 32 io_pgetevents_time64 sys_io_pgetevents compat_sys_io_pgetevents_time64
+ 417 32 recvmmsg_time64 sys_recvmmsg compat_sys_recvmmsg_time64
+ 418 32 mq_timedsend_time64 sys_mq_timedsend sys_mq_timedsend
+ 419 32 mq_timedreceive_time64 sys_mq_timedreceive sys_mq_timedreceive
+diff --git a/arch/powerpc/boot/cuboot-hotfoot.c b/arch/powerpc/boot/cuboot-hotfoot.c
+index 888a6b9bfead2..0e5532f855d61 100644
+--- a/arch/powerpc/boot/cuboot-hotfoot.c
++++ b/arch/powerpc/boot/cuboot-hotfoot.c
+@@ -70,7 +70,7 @@ static void hotfoot_fixups(void)
+
+ printf("Fixing devtree for 4M Flash\n");
+
+- /* First fix up the base addresse */
++ /* First fix up the base address */
+ getprop(devp, "reg", regs, sizeof(regs));
+ regs[0] = 0;
+ regs[1] = 0xffc00000;
+diff --git a/arch/powerpc/configs/44x/akebono_defconfig b/arch/powerpc/configs/44x/akebono_defconfig
+index 4bc549c6edc5a..fde4824f235ef 100644
+--- a/arch/powerpc/configs/44x/akebono_defconfig
++++ b/arch/powerpc/configs/44x/akebono_defconfig
+@@ -118,7 +118,7 @@ CONFIG_CRAMFS=y
+ CONFIG_NLS_DEFAULT="n"
+ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ISO8859_1=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+ CONFIG_XMON=y
+diff --git a/arch/powerpc/configs/44x/currituck_defconfig b/arch/powerpc/configs/44x/currituck_defconfig
+index 7178272199217..7283b7d4a1a57 100644
+--- a/arch/powerpc/configs/44x/currituck_defconfig
++++ b/arch/powerpc/configs/44x/currituck_defconfig
+@@ -73,7 +73,7 @@ CONFIG_NFS_FS=y
+ CONFIG_NFS_V3_ACL=y
+ CONFIG_NFS_V4=y
+ CONFIG_NLS_DEFAULT="n"
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+ CONFIG_XMON=y
+diff --git a/arch/powerpc/configs/44x/fsp2_defconfig b/arch/powerpc/configs/44x/fsp2_defconfig
+index 8da316e61a08c..3fdfbb29b8548 100644
+--- a/arch/powerpc/configs/44x/fsp2_defconfig
++++ b/arch/powerpc/configs/44x/fsp2_defconfig
+@@ -110,7 +110,7 @@ CONFIG_XZ_DEC=y
+ CONFIG_PRINTK_TIME=y
+ CONFIG_MESSAGE_LOGLEVEL_DEFAULT=3
+ CONFIG_DYNAMIC_DEBUG=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+ CONFIG_CRYPTO_CBC=y
+diff --git a/arch/powerpc/configs/44x/iss476-smp_defconfig b/arch/powerpc/configs/44x/iss476-smp_defconfig
+index c11e777b2f3d6..0f6380e1e6125 100644
+--- a/arch/powerpc/configs/44x/iss476-smp_defconfig
++++ b/arch/powerpc/configs/44x/iss476-smp_defconfig
+@@ -56,7 +56,7 @@ CONFIG_PROC_KCORE=y
+ CONFIG_TMPFS=y
+ CONFIG_CRAMFS=y
+ # CONFIG_NETWORK_FILESYSTEMS is not set
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+ CONFIG_PPC_EARLY_DEBUG=y
+diff --git a/arch/powerpc/configs/44x/warp_defconfig b/arch/powerpc/configs/44x/warp_defconfig
+index 47252c2d7669a..20891c413149c 100644
+--- a/arch/powerpc/configs/44x/warp_defconfig
++++ b/arch/powerpc/configs/44x/warp_defconfig
+@@ -88,7 +88,7 @@ CONFIG_NLS_UTF8=y
+ CONFIG_CRC_CCITT=y
+ CONFIG_CRC_T10DIF=y
+ CONFIG_PRINTK_TIME=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DEBUG_FS=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+diff --git a/arch/powerpc/configs/52xx/lite5200b_defconfig b/arch/powerpc/configs/52xx/lite5200b_defconfig
+index 63368e6775064..7db479dcbc0c4 100644
+--- a/arch/powerpc/configs/52xx/lite5200b_defconfig
++++ b/arch/powerpc/configs/52xx/lite5200b_defconfig
+@@ -58,6 +58,6 @@ CONFIG_NFS_FS=y
+ CONFIG_NFS_V4=y
+ CONFIG_ROOT_NFS=y
+ CONFIG_PRINTK_TIME=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DETECT_HUNG_TASK=y
+ # CONFIG_DEBUG_BUGVERBOSE is not set
+diff --git a/arch/powerpc/configs/52xx/motionpro_defconfig b/arch/powerpc/configs/52xx/motionpro_defconfig
+index 72762da94846f..6186ead1e1056 100644
+--- a/arch/powerpc/configs/52xx/motionpro_defconfig
++++ b/arch/powerpc/configs/52xx/motionpro_defconfig
+@@ -84,7 +84,7 @@ CONFIG_ROOT_NFS=y
+ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_PRINTK_TIME=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DETECT_HUNG_TASK=y
+ # CONFIG_DEBUG_BUGVERBOSE is not set
+ CONFIG_CRYPTO_ECB=y
+diff --git a/arch/powerpc/configs/52xx/tqm5200_defconfig b/arch/powerpc/configs/52xx/tqm5200_defconfig
+index a3c8ca74032c4..e6735b945327e 100644
+--- a/arch/powerpc/configs/52xx/tqm5200_defconfig
++++ b/arch/powerpc/configs/52xx/tqm5200_defconfig
+@@ -85,7 +85,7 @@ CONFIG_ROOT_NFS=y
+ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_PRINTK_TIME=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DETECT_HUNG_TASK=y
+ # CONFIG_DEBUG_BUGVERBOSE is not set
+ CONFIG_CRYPTO_ECB=y
+diff --git a/arch/powerpc/configs/adder875_defconfig b/arch/powerpc/configs/adder875_defconfig
+index 5326bc7392790..7f35d5bc12299 100644
+--- a/arch/powerpc/configs/adder875_defconfig
++++ b/arch/powerpc/configs/adder875_defconfig
+@@ -45,7 +45,7 @@ CONFIG_CRAMFS=y
+ CONFIG_NFS_FS=y
+ CONFIG_ROOT_NFS=y
+ CONFIG_CRC32_SLICEBY4=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DEBUG_FS=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+diff --git a/arch/powerpc/configs/ep8248e_defconfig b/arch/powerpc/configs/ep8248e_defconfig
+index 00d69965f898b..8df6d3a293e3c 100644
+--- a/arch/powerpc/configs/ep8248e_defconfig
++++ b/arch/powerpc/configs/ep8248e_defconfig
+@@ -59,7 +59,7 @@ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ASCII=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_NLS_UTF8=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ # CONFIG_SCHED_DEBUG is not set
+ CONFIG_BDI_SWITCH=y
+diff --git a/arch/powerpc/configs/ep88xc_defconfig b/arch/powerpc/configs/ep88xc_defconfig
+index f5c3e72da7196..a98ef6a4abef6 100644
+--- a/arch/powerpc/configs/ep88xc_defconfig
++++ b/arch/powerpc/configs/ep88xc_defconfig
+@@ -48,6 +48,6 @@ CONFIG_CRAMFS=y
+ CONFIG_NFS_FS=y
+ CONFIG_ROOT_NFS=y
+ CONFIG_CRC32_SLICEBY4=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+diff --git a/arch/powerpc/configs/fsl-emb-nonhw.config b/arch/powerpc/configs/fsl-emb-nonhw.config
+index df37efed0aec3..f14c6dbd7346c 100644
+--- a/arch/powerpc/configs/fsl-emb-nonhw.config
++++ b/arch/powerpc/configs/fsl-emb-nonhw.config
+@@ -24,7 +24,7 @@ CONFIG_CRYPTO_PCBC=m
+ CONFIG_CRYPTO_SHA256=y
+ CONFIG_CRYPTO_SHA512=y
+ CONFIG_DEBUG_FS=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DEBUG_KERNEL=y
+ CONFIG_DEBUG_SHIRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+diff --git a/arch/powerpc/configs/mgcoge_defconfig b/arch/powerpc/configs/mgcoge_defconfig
+index dcc8dccf54f3b..498d35db78331 100644
+--- a/arch/powerpc/configs/mgcoge_defconfig
++++ b/arch/powerpc/configs/mgcoge_defconfig
+@@ -73,7 +73,7 @@ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ASCII=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_NLS_UTF8=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DEBUG_FS=y
+ CONFIG_MAGIC_SYSRQ=y
+ # CONFIG_SCHED_DEBUG is not set
+diff --git a/arch/powerpc/configs/mpc5200_defconfig b/arch/powerpc/configs/mpc5200_defconfig
+index 83d801307178d..c0fe5e76604a0 100644
+--- a/arch/powerpc/configs/mpc5200_defconfig
++++ b/arch/powerpc/configs/mpc5200_defconfig
+@@ -122,6 +122,6 @@ CONFIG_ROOT_NFS=y
+ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_PRINTK_TIME=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_DEBUG_KERNEL=y
+ CONFIG_DETECT_HUNG_TASK=y
+diff --git a/arch/powerpc/configs/mpc8272_ads_defconfig b/arch/powerpc/configs/mpc8272_ads_defconfig
+index 00a4d2bf43b2a..4145ef5689caa 100644
+--- a/arch/powerpc/configs/mpc8272_ads_defconfig
++++ b/arch/powerpc/configs/mpc8272_ads_defconfig
+@@ -67,7 +67,7 @@ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ASCII=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_NLS_UTF8=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+ CONFIG_BDI_SWITCH=y
+diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig
+index c74dc76b1d0d1..700115d85d6fb 100644
+--- a/arch/powerpc/configs/mpc885_ads_defconfig
++++ b/arch/powerpc/configs/mpc885_ads_defconfig
+@@ -71,7 +71,7 @@ CONFIG_ROOT_NFS=y
+ CONFIG_CRYPTO=y
+ CONFIG_CRYPTO_DEV_TALITOS=y
+ CONFIG_CRC32_SLICEBY4=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DEBUG_FS=y
+ CONFIG_DEBUG_VM_PGTABLE=y
+diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig
+index bb549cb1c3e33..6d7ae407648fa 100644
+--- a/arch/powerpc/configs/ppc6xx_defconfig
++++ b/arch/powerpc/configs/ppc6xx_defconfig
+@@ -1066,7 +1066,7 @@ CONFIG_NLS_ISO8859_14=m
+ CONFIG_NLS_ISO8859_15=m
+ CONFIG_NLS_KOI8_R=m
+ CONFIG_NLS_KOI8_U=m
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_HEADERS_INSTALL=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DEBUG_KERNEL=y
+diff --git a/arch/powerpc/configs/pq2fads_defconfig b/arch/powerpc/configs/pq2fads_defconfig
+index 9d8a76857c6fc..9d63e2e652115 100644
+--- a/arch/powerpc/configs/pq2fads_defconfig
++++ b/arch/powerpc/configs/pq2fads_defconfig
+@@ -68,7 +68,7 @@ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ASCII=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_NLS_UTF8=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+ # CONFIG_SCHED_DEBUG is not set
+diff --git a/arch/powerpc/configs/ps3_defconfig b/arch/powerpc/configs/ps3_defconfig
+index 7c95fab4b9206..2d9ac233da685 100644
+--- a/arch/powerpc/configs/ps3_defconfig
++++ b/arch/powerpc/configs/ps3_defconfig
+@@ -153,7 +153,7 @@ CONFIG_NLS_CODEPAGE_437=y
+ CONFIG_NLS_ISO8859_1=y
+ CONFIG_CRC_CCITT=m
+ CONFIG_CRC_T10DIF=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DEBUG_MEMORY_INIT=y
+ CONFIG_DEBUG_STACKOVERFLOW=y
+diff --git a/arch/powerpc/configs/tqm8xx_defconfig b/arch/powerpc/configs/tqm8xx_defconfig
+index 77857d5130223..083c2e57520a0 100644
+--- a/arch/powerpc/configs/tqm8xx_defconfig
++++ b/arch/powerpc/configs/tqm8xx_defconfig
+@@ -55,6 +55,6 @@ CONFIG_CRAMFS=y
+ CONFIG_NFS_FS=y
+ CONFIG_ROOT_NFS=y
+ CONFIG_CRC32_SLICEBY4=y
+-CONFIG_DEBUG_INFO=y
++CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
+ CONFIG_MAGIC_SYSRQ=y
+ CONFIG_DETECT_HUNG_TASK=y
+diff --git a/arch/powerpc/crypto/aes-spe-glue.c b/arch/powerpc/crypto/aes-spe-glue.c
+index c2b23b69d7b1d..e8dfe9fb02668 100644
+--- a/arch/powerpc/crypto/aes-spe-glue.c
++++ b/arch/powerpc/crypto/aes-spe-glue.c
+@@ -404,7 +404,7 @@ static int ppc_xts_decrypt(struct skcipher_request *req)
+
+ /*
+ * Algorithm definitions. Disabling alignment (cra_alignmask=0) was chosen
+- * because the e500 platform can handle unaligned reads/writes very efficently.
++ * because the e500 platform can handle unaligned reads/writes very efficiently.
+ * This improves IPsec thoughput by another few percent. Additionally we assume
+ * that AES context is always aligned to at least 8 bytes because it is created
+ * with kmalloc() in the crypto infrastructure
+diff --git a/arch/powerpc/include/asm/archrandom.h b/arch/powerpc/include/asm/archrandom.h
+index 9a53e29680f41..258174304904b 100644
+--- a/arch/powerpc/include/asm/archrandom.h
++++ b/arch/powerpc/include/asm/archrandom.h
+@@ -38,12 +38,7 @@ static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
+ #endif /* CONFIG_ARCH_RANDOM */
+
+ #ifdef CONFIG_PPC_POWERNV
+-int powernv_hwrng_present(void);
+ int powernv_get_random_long(unsigned long *v);
+-int powernv_get_random_real_mode(unsigned long *v);
+-#else
+-static inline int powernv_hwrng_present(void) { return 0; }
+-static inline int powernv_get_random_real_mode(unsigned long *v) { return 0; }
+ #endif
+
+ #endif /* _ASM_POWERPC_ARCHRANDOM_H */
+diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
+index 2aefe14e14422..1e5e9b6ec78d9 100644
+--- a/arch/powerpc/include/asm/kexec.h
++++ b/arch/powerpc/include/asm/kexec.h
+@@ -120,6 +120,15 @@ int setup_purgatory(struct kimage *image, const void *slave_code,
+ #ifdef CONFIG_PPC64
+ struct kexec_buf;
+
++int arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigned long buf_len);
++#define arch_kexec_kernel_image_probe arch_kexec_kernel_image_probe
++
++int arch_kimage_file_post_load_cleanup(struct kimage *image);
++#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
++
++int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
++#define arch_kexec_locate_mem_hole arch_kexec_locate_mem_hole
++
+ int load_crashdump_segments_ppc64(struct kimage *image,
+ struct kexec_buf *kbuf);
+ int setup_purgatory_ppc64(struct kimage *image, const void *slave_code,
+diff --git a/arch/powerpc/include/asm/simple_spinlock.h b/arch/powerpc/include/asm/simple_spinlock.h
+index 7ae6aeef8464e..9dcc7e9993b90 100644
+--- a/arch/powerpc/include/asm/simple_spinlock.h
++++ b/arch/powerpc/include/asm/simple_spinlock.h
+@@ -48,10 +48,11 @@ static inline int arch_spin_is_locked(arch_spinlock_t *lock)
+ static inline unsigned long __arch_spin_trylock(arch_spinlock_t *lock)
+ {
+ unsigned long tmp, token;
++ unsigned int eh = IS_ENABLED(CONFIG_PPC64);
+
+ token = LOCK_TOKEN;
+ __asm__ __volatile__(
+-"1: lwarx %0,0,%2,1\n\
++"1: lwarx %0,0,%2,%[eh]\n\
+ cmpwi 0,%0,0\n\
+ bne- 2f\n\
+ stwcx. %1,0,%2\n\
+@@ -59,7 +60,7 @@ static inline unsigned long __arch_spin_trylock(arch_spinlock_t *lock)
+ PPC_ACQUIRE_BARRIER
+ "2:"
+ : "=&r" (tmp)
+- : "r" (token), "r" (&lock->slock)
++ : "r" (token), "r" (&lock->slock), [eh] "n" (eh)
+ : "cr0", "memory");
+
+ return tmp;
+@@ -156,9 +157,10 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
+ static inline long __arch_read_trylock(arch_rwlock_t *rw)
+ {
+ long tmp;
++ unsigned int eh = IS_ENABLED(CONFIG_PPC64);
+
+ __asm__ __volatile__(
+-"1: lwarx %0,0,%1,1\n"
++"1: lwarx %0,0,%1,%[eh]\n"
+ __DO_SIGN_EXTEND
+ " addic. %0,%0,1\n\
+ ble- 2f\n"
+@@ -166,7 +168,7 @@ static inline long __arch_read_trylock(arch_rwlock_t *rw)
+ bne- 1b\n"
+ PPC_ACQUIRE_BARRIER
+ "2:" : "=&r" (tmp)
+- : "r" (&rw->lock)
++ : "r" (&rw->lock), [eh] "n" (eh)
+ : "cr0", "xer", "memory");
+
+ return tmp;
+@@ -179,17 +181,18 @@ static inline long __arch_read_trylock(arch_rwlock_t *rw)
+ static inline long __arch_write_trylock(arch_rwlock_t *rw)
+ {
+ long tmp, token;
++ unsigned int eh = IS_ENABLED(CONFIG_PPC64);
+
+ token = WRLOCK_TOKEN;
+ __asm__ __volatile__(
+-"1: lwarx %0,0,%2,1\n\
++"1: lwarx %0,0,%2,%[eh]\n\
+ cmpwi 0,%0,0\n\
+ bne- 2f\n"
+ " stwcx. %1,0,%2\n\
+ bne- 1b\n"
+ PPC_ACQUIRE_BARRIER
+ "2:" : "=&r" (tmp)
+- : "r" (token), "r" (&rw->lock)
++ : "r" (token), "r" (&rw->lock), [eh] "n" (eh)
+ : "cr0", "memory");
+
+ return tmp;
+diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
+index 4ddd161aef324..63c384c3e6d4a 100644
+--- a/arch/powerpc/kernel/Makefile
++++ b/arch/powerpc/kernel/Makefile
+@@ -20,6 +20,7 @@ CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
+ CFLAGS_prom_init.o += -fno-stack-protector
+ CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
+ CFLAGS_prom_init.o += -ffreestanding
++CFLAGS_prom_init.o += $(call cc-option, -ftrivial-auto-var-init=uninitialized)
+
+ ifdef CONFIG_FUNCTION_TRACER
+ # Do not trace early boot code
+diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
+index ae0fdef0ac115..2a271a6d69249 100644
+--- a/arch/powerpc/kernel/cputable.c
++++ b/arch/powerpc/kernel/cputable.c
+@@ -2025,7 +2025,7 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
+ * oprofile_cpu_type already has a value, then we are
+ * possibly overriding a real PVR with a logical one,
+ * and, in that case, keep the current value for
+- * oprofile_cpu_type. Futhermore, let's ensure that the
++ * oprofile_cpu_type. Furthermore, let's ensure that the
+ * fix for the PMAO bug is enabled on compatibility mode.
+ */
+ if (old.oprofile_cpu_type != NULL) {
+diff --git a/arch/powerpc/kernel/dawr.c b/arch/powerpc/kernel/dawr.c
+index 64e423d2fe0f1..30d4eca88d174 100644
+--- a/arch/powerpc/kernel/dawr.c
++++ b/arch/powerpc/kernel/dawr.c
+@@ -27,7 +27,7 @@ int set_dawr(int nr, struct arch_hw_breakpoint *brk)
+ dawrx |= (brk->type & (HW_BRK_TYPE_PRIV_ALL)) >> 3;
+ /*
+ * DAWR length is stored in field MDR bits 48:53. Matches range in
+- * doublewords (64 bits) baised by -1 eg. 0b000000=1DW and
++ * doublewords (64 bits) biased by -1 eg. 0b000000=1DW and
+ * 0b111111=64DW.
+ * brk->hw_len is in bytes.
+ * This aligns up to double word size, shifts and does the bias.
+diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
+index 28bb1e7263a6c..ab316e155ea9f 100644
+--- a/arch/powerpc/kernel/eeh.c
++++ b/arch/powerpc/kernel/eeh.c
+@@ -1329,7 +1329,7 @@ int eeh_pe_set_option(struct eeh_pe *pe, int option)
+
+ /*
+ * EEH functionality could possibly be disabled, just
+- * return error for the case. And the EEH functinality
++ * return error for the case. And the EEH functionality
+ * isn't expected to be disabled on one specific PE.
+ */
+ switch (option) {
+@@ -1804,7 +1804,7 @@ static int eeh_debugfs_break_device(struct pci_dev *pdev)
+ * PE freeze. Using the in_8() accessor skips the eeh detection hook
+ * so the freeze hook so the EEH Detection machinery won't be
+ * triggered here. This is to match the usual behaviour of EEH
+- * where the HW will asyncronously freeze a PE and it's up to
++ * where the HW will asynchronously freeze a PE and it's up to
+ * the kernel to notice and deal with it.
+ *
+ * 3. Turn Memory space back on. This is more important for VFs
+diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
+index a7a8dc182efb9..c23a454af08a8 100644
+--- a/arch/powerpc/kernel/eeh_event.c
++++ b/arch/powerpc/kernel/eeh_event.c
+@@ -143,7 +143,7 @@ int __eeh_send_failure_event(struct eeh_pe *pe)
+ int eeh_send_failure_event(struct eeh_pe *pe)
+ {
+ /*
+- * If we've manually supressed recovery events via debugfs
++ * If we've manually suppressed recovery events via debugfs
+ * then just drop it on the floor.
+ */
+ if (eeh_debugfs_no_recover) {
+diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
+index dc2350b288cfe..aa29b9b7920f9 100644
+--- a/arch/powerpc/kernel/fadump.c
++++ b/arch/powerpc/kernel/fadump.c
+@@ -1665,8 +1665,8 @@ int __init setup_fadump(void)
+ }
+ /*
+ * Use subsys_initcall_sync() here because there is dependency with
+- * crash_save_vmcoreinfo_init(), which mush run first to ensure vmcoreinfo initialization
+- * is done before regisering with f/w.
++ * crash_save_vmcoreinfo_init(), which must run first to ensure vmcoreinfo initialization
++ * is done before registering with f/w.
+ */
+ subsys_initcall_sync(setup_fadump);
+ #else /* !CONFIG_PRESERVE_FA_DUMP */
+diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
+index 07093b7cdcb9a..a67fd54ccc573 100644
+--- a/arch/powerpc/kernel/iommu.c
++++ b/arch/powerpc/kernel/iommu.c
+@@ -776,6 +776,11 @@ bool iommu_table_in_use(struct iommu_table *tbl)
+ /* ignore reserved bit0 */
+ if (tbl->it_offset == 0)
+ start = 1;
++
++ /* Simple case with no reserved MMIO32 region */
++ if (!tbl->it_reserved_start && !tbl->it_reserved_end)
++ return find_next_bit(tbl->it_map, tbl->it_size, start) != tbl->it_size;
++
+ end = tbl->it_reserved_start - tbl->it_offset;
+ if (find_next_bit(tbl->it_map, end, start) != end)
+ return true;
+diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
+index a0432ef46967e..e25b796682cc0 100644
+--- a/arch/powerpc/kernel/module_32.c
++++ b/arch/powerpc/kernel/module_32.c
+@@ -99,7 +99,7 @@ static unsigned long get_plt_size(const Elf32_Ehdr *hdr,
+
+ /* Sort the relocation information based on a symbol and
+ * addend key. This is a stable O(n*log n) complexity
+- * alogrithm but it will reduce the complexity of
++ * algorithm but it will reduce the complexity of
+ * count_relocs() to linear complexity O(n)
+ */
+ sort((void *)hdr + sechdrs[i].sh_offset,
+diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
+index 7947205304427..2cce576edbc54 100644
+--- a/arch/powerpc/kernel/module_64.c
++++ b/arch/powerpc/kernel/module_64.c
+@@ -194,7 +194,7 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr,
+
+ /* Sort the relocation information based on a symbol and
+ * addend key. This is a stable O(n*log n) complexity
+- * alogrithm but it will reduce the complexity of
++ * algorithm but it will reduce the complexity of
+ * count_relocs() to linear complexity O(n)
+ */
+ sort((void *)sechdrs[i].sh_addr,
+@@ -361,7 +361,7 @@ static inline int create_ftrace_stub(struct ppc64_stub_entry *entry,
+ entry->jump[1] |= PPC_HA(reladdr);
+ entry->jump[2] |= PPC_LO(reladdr);
+
+- /* Eventhough we don't use funcdata in the stub, it's needed elsewhere. */
++ /* Even though we don't use funcdata in the stub, it's needed elsewhere. */
+ entry->funcdata = func_desc(addr);
+ entry->magic = STUB_MAGIC;
+
+diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
+index 8bc9cf62cd93d..aa37faeb80ee6 100644
+--- a/arch/powerpc/kernel/pci-common.c
++++ b/arch/powerpc/kernel/pci-common.c
+@@ -74,16 +74,32 @@ void __init set_pci_dma_ops(const struct dma_map_ops *dma_ops)
+ static int get_phb_number(struct device_node *dn)
+ {
+ int ret, phb_id = -1;
+- u32 prop_32;
+ u64 prop;
+
+ /*
+ * Try fixed PHB numbering first, by checking archs and reading
+- * the respective device-tree properties. Firstly, try powernv by
+- * reading "ibm,opal-phbid", only present in OPAL environment.
++ * the respective device-tree properties. Firstly, try reading
++ * standard "linux,pci-domain", then try reading "ibm,opal-phbid"
++ * (only present in powernv OPAL environment), then try device-tree
++ * alias and as the last try to use lower bits of "reg" property.
+ */
+- ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop);
++ ret = of_get_pci_domain_nr(dn);
++ if (ret >= 0) {
++ prop = ret;
++ ret = 0;
++ }
++ if (ret)
++ ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop);
++
+ if (ret) {
++ ret = of_alias_get_id(dn, "pci");
++ if (ret >= 0) {
++ prop = ret;
++ ret = 0;
++ }
++ }
++ if (ret) {
++ u32 prop_32;
+ ret = of_property_read_u32_index(dn, "reg", 1, &prop_32);
+ prop = prop_32;
+ }
+@@ -95,10 +111,7 @@ static int get_phb_number(struct device_node *dn)
+ if ((phb_id >= 0) && !test_and_set_bit(phb_id, phb_bitmap))
+ return phb_id;
+
+- /*
+- * If not pseries nor powernv, or if fixed PHB numbering tried to add
+- * the same PHB number twice, then fallback to dynamic PHB numbering.
+- */
++ /* If everything fails then fallback to dynamic PHB numbering. */
+ phb_id = find_first_zero_bit(phb_bitmap, MAX_PHBS);
+ BUG_ON(phb_id >= MAX_PHBS);
+ set_bit(phb_id, phb_bitmap);
+@@ -1688,7 +1701,7 @@ EXPORT_SYMBOL_GPL(pcibios_scan_phb);
+ static void fixup_hide_host_resource_fsl(struct pci_dev *dev)
+ {
+ int i, class = dev->class >> 8;
+- /* When configured as agent, programing interface = 1 */
++ /* When configured as agent, programming interface = 1 */
+ int prog_if = dev->class & 0xf;
+
+ if ((class == PCI_CLASS_PROCESSOR_POWERPC ||
+diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
+index c3024f1047654..6f2b0cc1ddd6f 100644
+--- a/arch/powerpc/kernel/pci_of_scan.c
++++ b/arch/powerpc/kernel/pci_of_scan.c
+@@ -244,7 +244,7 @@ EXPORT_SYMBOL(of_create_pci_dev);
+ * @dev: pci_dev structure for the bridge
+ *
+ * of_scan_bus() calls this routine for each PCI bridge that it finds, and
+- * this routine in turn call of_scan_bus() recusively to scan for more child
++ * this routine in turn call of_scan_bus() recursively to scan for more child
+ * devices.
+ */
+ void of_scan_pci_bridge(struct pci_dev *dev)
+diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
+index 9be279469a851..3940db48db771 100644
+--- a/arch/powerpc/kernel/process.c
++++ b/arch/powerpc/kernel/process.c
+@@ -307,7 +307,7 @@ static void __giveup_vsx(struct task_struct *tsk)
+ unsigned long msr = tsk->thread.regs->msr;
+
+ /*
+- * We should never be ssetting MSR_VSX without also setting
++ * We should never be setting MSR_VSX without also setting
+ * MSR_FP and MSR_VEC
+ */
+ WARN_ON((msr & MSR_VSX) && !((msr & MSR_FP) && (msr & MSR_VEC)));
+@@ -645,7 +645,7 @@ static void do_break_handler(struct pt_regs *regs)
+ return;
+ }
+
+- /* Otherwise findout which DAWR caused exception and disable it. */
++ /* Otherwise find out which DAWR caused exception and disable it. */
+ wp_get_instr_detail(regs, &instr, &type, &size, &ea);
+
+ for (i = 0; i < nr_wp_slots(); i++) {
+diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
+index 0ac5faacc909c..ace861ec4c4c6 100644
+--- a/arch/powerpc/kernel/prom_init.c
++++ b/arch/powerpc/kernel/prom_init.c
+@@ -3416,7 +3416,7 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4,
+ *
+ * PowerMacs use a different mechanism to spin CPUs
+ *
+- * (This must be done after instanciating RTAS)
++ * (This must be done after instantiating RTAS)
+ */
+ if (of_platform != PLATFORM_POWERMAC)
+ prom_hold_cpus();
+diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
+index f15bc78caf718..076d867412c70 100644
+--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
++++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
+@@ -174,7 +174,7 @@ int ptrace_get_reg(struct task_struct *task, int regno, unsigned long *data)
+
+ /*
+ * softe copies paca->irq_soft_mask variable state. Since irq_soft_mask is
+- * no more used as a flag, lets force usr to alway see the softe value as 1
++ * no more used as a flag, lets force usr to always see the softe value as 1
+ * which means interrupts are not soft disabled.
+ */
+ if (IS_ENABLED(CONFIG_PPC64) && regno == PT_SOFTE) {
+diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
+index a99179d835382..bc817a5619d64 100644
+--- a/arch/powerpc/kernel/rtas_flash.c
++++ b/arch/powerpc/kernel/rtas_flash.c
+@@ -120,7 +120,7 @@ static struct kmem_cache *flash_block_cache = NULL;
+ /*
+ * Local copy of the flash block list.
+ *
+- * The rtas_firmware_flash_list varable will be
++ * The rtas_firmware_flash_list variable will be
+ * set once the data is fully read.
+ *
+ * For convenience as we build the list we use virtual addrs,
+diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
+index 518ae5aa94109..3acf2782acdf2 100644
+--- a/arch/powerpc/kernel/setup-common.c
++++ b/arch/powerpc/kernel/setup-common.c
+@@ -279,7 +279,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
+ proc_freq / 1000000, proc_freq % 1000000);
+
+ /* If we are a Freescale core do a simple check so
+- * we dont have to keep adding cases in the future */
++ * we don't have to keep adding cases in the future */
+ if (PVR_VER(pvr) & 0x8000) {
+ switch (PVR_VER(pvr)) {
+ case 0x8000: /* 7441/7450/7451, Voyager */
+diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
+index 73d483b07ff39..858fc13b8c519 100644
+--- a/arch/powerpc/kernel/signal_64.c
++++ b/arch/powerpc/kernel/signal_64.c
+@@ -123,7 +123,7 @@ static long notrace __unsafe_setup_sigcontext(struct sigcontext __user *sc,
+ #endif
+ struct pt_regs *regs = tsk->thread.regs;
+ unsigned long msr = regs->msr;
+- /* Force usr to alway see softe as 1 (interrupts enabled) */
++ /* Force usr to always see softe as 1 (interrupts enabled) */
+ unsigned long softe = 0x1;
+
+ BUG_ON(tsk != current);
+diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
+index de0f6f09a5ddc..a69df557e2b71 100644
+--- a/arch/powerpc/kernel/smp.c
++++ b/arch/powerpc/kernel/smp.c
+@@ -1102,7 +1102,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
+ DBG("smp_prepare_cpus\n");
+
+ /*
+- * setup_cpu may need to be called on the boot cpu. We havent
++ * setup_cpu may need to be called on the boot cpu. We haven't
+ * spun any cpus up but lets be paranoid.
+ */
+ BUG_ON(boot_cpuid != smp_processor_id());
+diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
+index f80cce0e38994..4bf757ebe13d0 100644
+--- a/arch/powerpc/kernel/time.c
++++ b/arch/powerpc/kernel/time.c
+@@ -829,7 +829,7 @@ static void __read_persistent_clock(struct timespec64 *ts)
+ static int first = 1;
+
+ ts->tv_nsec = 0;
+- /* XXX this is a litle fragile but will work okay in the short term */
++ /* XXX this is a little fragile but will work okay in the short term */
+ if (first) {
+ first = 0;
+ if (ppc_md.time_init)
+@@ -974,7 +974,7 @@ void secondary_cpu_time_init(void)
+ */
+ start_cpu_decrementer();
+
+- /* FIME: Should make unrelatred change to move snapshot_timebase
++ /* FIME: Should make unrelated change to move snapshot_timebase
+ * call here ! */
+ register_decrementer_clockevent(smp_processor_id());
+ }
+diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
+index bfc27496fe7e2..7d28b95536540 100644
+--- a/arch/powerpc/kernel/watchdog.c
++++ b/arch/powerpc/kernel/watchdog.c
+@@ -56,7 +56,7 @@
+ * solved by also having a SMP watchdog where all CPUs check all other
+ * CPUs heartbeat.
+ *
+- * The SMP checker can detect lockups on other CPUs. A gobal "pending"
++ * The SMP checker can detect lockups on other CPUs. A global "pending"
+ * cpumask is kept, containing all CPUs which enable the watchdog. Each
+ * CPU clears their pending bit in their heartbeat timer. When the bitmask
+ * becomes empty, the last CPU to clear its pending bit updates a global
+diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
+index 6cc7793b8420c..c29c639551fe0 100644
+--- a/arch/powerpc/kexec/core_64.c
++++ b/arch/powerpc/kexec/core_64.c
+@@ -406,7 +406,7 @@ static int __init export_htab_values(void)
+ if (!node)
+ return -ENODEV;
+
+- /* remove any stale propertys so ours can be found */
++ /* remove any stale properties so ours can be found */
+ of_remove_property(node, of_find_property(node, htab_base_prop.name, NULL));
+ of_remove_property(node, of_find_property(node, htab_size_prop.name, NULL));
+
+diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
+index b4981b651d9aa..349a781cea0b3 100644
+--- a/arch/powerpc/kexec/file_load_64.c
++++ b/arch/powerpc/kexec/file_load_64.c
+@@ -23,6 +23,7 @@
+ #include <linux/vmalloc.h>
+ #include <asm/setup.h>
+ #include <asm/drmem.h>
++#include <asm/firmware.h>
+ #include <asm/kexec_ranges.h>
+ #include <asm/crashdump-ppc64.h>
+
+@@ -1038,6 +1039,48 @@ out:
+ return ret;
+ }
+
++static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
++ const char *propname)
++{
++ const void *prop, *fdtprop;
++ int len = 0, fdtlen = 0;
++
++ prop = of_get_property(dn, propname, &len);
++ fdtprop = fdt_getprop(fdt, node_offset, propname, &fdtlen);
++
++ if (fdtprop && !prop)
++ return fdt_delprop(fdt, node_offset, propname);
++ else if (prop)
++ return fdt_setprop(fdt, node_offset, propname, prop, len);
++ else
++ return -FDT_ERR_NOTFOUND;
++}
++
++static int update_pci_dma_nodes(void *fdt, const char *dmapropname)
++{
++ struct device_node *dn;
++ int pci_offset, root_offset, ret = 0;
++
++ if (!firmware_has_feature(FW_FEATURE_LPAR))
++ return 0;
++
++ root_offset = fdt_path_offset(fdt, "/");
++ for_each_node_with_property(dn, dmapropname) {
++ pci_offset = fdt_subnode_offset(fdt, root_offset, of_node_full_name(dn));
++ if (pci_offset < 0)
++ continue;
++
++ ret = copy_property(fdt, pci_offset, dn, "ibm,dma-window");
++ if (ret < 0)
++ break;
++ ret = copy_property(fdt, pci_offset, dn, dmapropname);
++ if (ret < 0)
++ break;
++ }
++
++ return ret;
++}
++
+ /**
+ * setup_new_fdt_ppc64 - Update the flattend device-tree of the kernel
+ * being loaded.
+@@ -1099,6 +1142,18 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt,
+ if (ret < 0)
+ goto out;
+
++#define DIRECT64_PROPNAME "linux,direct64-ddr-window-info"
++#define DMA64_PROPNAME "linux,dma64-ddr-window-info"
++ ret = update_pci_dma_nodes(fdt, DIRECT64_PROPNAME);
++ if (ret < 0)
++ goto out;
++
++ ret = update_pci_dma_nodes(fdt, DMA64_PROPNAME);
++ if (ret < 0)
++ goto out;
++#undef DMA64_PROPNAME
++#undef DIRECT64_PROPNAME
++
+ /* Update memory reserve map */
+ ret = get_reserved_memory_ranges(&rmem);
+ if (ret)
+diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
+index 0aeb51738ca9b..1137c4df726c3 100644
+--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
++++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
+@@ -58,7 +58,7 @@ struct kvm_resize_hpt {
+ /* Possible values and their usage:
+ * <0 an error occurred during allocation,
+ * -EBUSY allocation is in the progress,
+- * 0 allocation made successfuly.
++ * 0 allocation made successfully.
+ */
+ int error;
+
+diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
+index fdeda6a9cff44..fdcc7b287dd8f 100644
+--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
++++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
+@@ -453,7 +453,7 @@ static long kvmppc_rm_ua_to_hpa(struct kvm_vcpu *vcpu, unsigned long mmu_seq,
+ * we are doing this on secondary cpus and current task there
+ * is not the hypervisor. Also this is safe against THP in the
+ * host, because an IPI to primary thread will wait for the secondary
+- * to exit which will agains result in the below page table walk
++ * to exit which will again result in the below page table walk
+ * to finish.
+ */
+ /* an rmap lock won't make it safe. because that just ensure hash
+diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
+index fdb57be71aa65..5bbfb2eed1277 100644
+--- a/arch/powerpc/kvm/book3s_emulate.c
++++ b/arch/powerpc/kvm/book3s_emulate.c
+@@ -268,7 +268,7 @@ int kvmppc_core_emulate_op_pr(struct kvm_vcpu *vcpu,
+
+ /*
+ * add rules to fit in ISA specification regarding TM
+- * state transistion in TM disable/Suspended state,
++ * state transition in TM disable/Suspended state,
+ * and target TM state is TM inactive(00) state. (the
+ * change should be suppressed).
+ */
+diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
+index 7e52d0beee776..5e4251b76e758 100644
+--- a/arch/powerpc/kvm/book3s_hv_builtin.c
++++ b/arch/powerpc/kvm/book3s_hv_builtin.c
+@@ -19,7 +19,7 @@
+ #include <asm/interrupt.h>
+ #include <asm/kvm_ppc.h>
+ #include <asm/kvm_book3s.h>
+-#include <asm/archrandom.h>
++#include <asm/machdep.h>
+ #include <asm/xics.h>
+ #include <asm/xive.h>
+ #include <asm/dbell.h>
+@@ -176,13 +176,14 @@ EXPORT_SYMBOL_GPL(kvmppc_hcall_impl_hv_realmode);
+
+ int kvmppc_hwrng_present(void)
+ {
+- return powernv_hwrng_present();
++ return ppc_md.get_random_seed != NULL;
+ }
+ EXPORT_SYMBOL_GPL(kvmppc_hwrng_present);
+
+ long kvmppc_rm_h_random(struct kvm_vcpu *vcpu)
+ {
+- if (powernv_get_random_real_mode(&vcpu->arch.regs.gpr[4]))
++ if (ppc_md.get_random_seed &&
++ ppc_md.get_random_seed(&vcpu->arch.regs.gpr[4]))
+ return H_SUCCESS;
+
+ return H_HARDWARE;
+diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
+index a28e5b3daabdf..ac38c1cad378d 100644
+--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
++++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
+@@ -379,7 +379,7 @@ void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
+ {
+ /*
+ * current->thread.xxx registers must all be restored to host
+- * values before a potential context switch, othrewise the context
++ * values before a potential context switch, otherwise the context
+ * switch itself will overwrite current->thread.xxx with the values
+ * from the guest SPRs.
+ */
+diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
+index 36f2314c58e5f..5980063016207 100644
+--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
++++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
+@@ -120,7 +120,7 @@ static DEFINE_SPINLOCK(kvmppc_uvmem_bitmap_lock);
+ * content is un-encrypted.
+ *
+ * (c) Normal - The GFN is a normal. The GFN is associated with
+- * a normal VM. The contents of the GFN is accesible to
++ * a normal VM. The contents of the GFN is accessible to
+ * the Hypervisor. Its content is never encrypted.
+ *
+ * States of a VM.
+diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
+index 7bf9e6ca5c2df..d6abed6e51e69 100644
+--- a/arch/powerpc/kvm/book3s_pr.c
++++ b/arch/powerpc/kvm/book3s_pr.c
+@@ -1287,7 +1287,7 @@ int kvmppc_handle_exit_pr(struct kvm_vcpu *vcpu, unsigned int exit_nr)
+
+ /* Get last sc for papr */
+ if (vcpu->arch.papr_enabled) {
+- /* The sc instuction points SRR0 to the next inst */
++ /* The sc instruction points SRR0 to the next inst */
+ emul = kvmppc_get_last_inst(vcpu, INST_SC, &last_sc);
+ if (emul != EMULATE_DONE) {
+ kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) - 4);
+diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
+index ab6d37d78c62d..589a8f2571201 100644
+--- a/arch/powerpc/kvm/book3s_xics.c
++++ b/arch/powerpc/kvm/book3s_xics.c
+@@ -462,7 +462,7 @@ static void icp_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp,
+ * new guy. We cannot assume that the rejected interrupt is less
+ * favored than the new one, and thus doesn't need to be delivered,
+ * because by the time we exit icp_try_to_deliver() the target
+- * processor may well have alrady consumed & completed it, and thus
++ * processor may well have already consumed & completed it, and thus
+ * the rejected interrupt might actually be already acceptable.
+ */
+ if (icp_try_to_deliver(icp, new_irq, state->priority, &reject)) {
+diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
+index c0ce5531d9bcb..24d434f1f012c 100644
+--- a/arch/powerpc/kvm/book3s_xive.c
++++ b/arch/powerpc/kvm/book3s_xive.c
+@@ -124,7 +124,7 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu)
+ * interrupt might have fired and be on its way to the
+ * host queue while we mask it, and if we unmask it
+ * early enough (re-cede right away), there is a
+- * theorical possibility that it fires again, thus
++ * theoretical possibility that it fires again, thus
+ * landing in the target queue more than once which is
+ * a big no-no.
+ *
+@@ -622,7 +622,7 @@ static int xive_target_interrupt(struct kvm *kvm,
+
+ /*
+ * Targetting rules: In order to avoid losing track of
+- * pending interrupts accross mask and unmask, which would
++ * pending interrupts across mask and unmask, which would
+ * allow queue overflows, we implement the following rules:
+ *
+ * - Unless it was never enabled (or we run out of capacity)
+@@ -1073,7 +1073,7 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
+ /*
+ * If old_p is set, the interrupt is pending, we switch it to
+ * PQ=11. This will force a resend in the host so the interrupt
+- * isn't lost to whatver host driver may pick it up
++ * isn't lost to whatever host driver may pick it up
+ */
+ if (state->old_p)
+ xive_vm_esb_load(state->pt_data, XIVE_ESB_SET_PQ_11);
+diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
+index fa0d8dbbe4841..4ff1372e48d40 100644
+--- a/arch/powerpc/kvm/e500mc.c
++++ b/arch/powerpc/kvm/e500mc.c
+@@ -309,7 +309,7 @@ static int kvmppc_core_vcpu_create_e500mc(struct kvm_vcpu *vcpu)
+ BUILD_BUG_ON(offsetof(struct kvmppc_vcpu_e500, vcpu) != 0);
+ vcpu_e500 = to_e500(vcpu);
+
+- /* Invalid PIR value -- this LPID dosn't have valid state on any cpu */
++ /* Invalid PIR value -- this LPID doesn't have valid state on any cpu */
+ vcpu->arch.oldpir = 0xffffffff;
+
+ err = kvmppc_e500_tlb_init(vcpu_e500);
+diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
+index 7ce8914992e3f..2e0cad5817baf 100644
+--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
++++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
+@@ -377,7 +377,7 @@ int hash__has_transparent_hugepage(void)
+ if (mmu_psize_defs[MMU_PAGE_16M].shift != PMD_SHIFT)
+ return 0;
+ /*
+- * We need to make sure that we support 16MB hugepage in a segement
++ * We need to make sure that we support 16MB hugepage in a segment
+ * with base page size 64K or 4K. We only enable THP with a PAGE_SIZE
+ * of 64K.
+ */
+diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
+index 985cabdd7f679..5b69b271707e8 100644
+--- a/arch/powerpc/mm/book3s64/hash_utils.c
++++ b/arch/powerpc/mm/book3s64/hash_utils.c
+@@ -1343,7 +1343,7 @@ static int subpage_protection(struct mm_struct *mm, unsigned long ea)
+ spp >>= 30 - 2 * ((ea >> 12) & 0xf);
+
+ /*
+- * 0 -> full premission
++ * 0 -> full permission
+ * 1 -> Read only
+ * 2 -> no access.
+ * We return the flag that need to be cleared.
+@@ -1664,7 +1664,7 @@ DEFINE_INTERRUPT_HANDLER(do_hash_fault)
+
+ err = hash_page_mm(mm, ea, access, TRAP(regs), flags);
+ if (unlikely(err < 0)) {
+- // failed to instert a hash PTE due to an hypervisor error
++ // failed to insert a hash PTE due to an hypervisor error
+ if (user_mode(regs)) {
+ if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
+ _exception(SIGSEGV, regs, SEGV_ACCERR, ea);
+diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
+index 052e6590f84fb..071bb66c3ad9e 100644
+--- a/arch/powerpc/mm/book3s64/pgtable.c
++++ b/arch/powerpc/mm/book3s64/pgtable.c
+@@ -331,7 +331,7 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm)
+ spin_lock(&mm->page_table_lock);
+ /*
+ * If we find pgtable_page set, we return
+- * the allocated page with single fragement
++ * the allocated page with single fragment
+ * count.
+ */
+ if (likely(!mm->context.pmd_frag)) {
+diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
+index def04631a74d5..db2f3d1934481 100644
+--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
++++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
+@@ -359,7 +359,7 @@ static void __init radix_init_pgtable(void)
+ if (!cpu_has_feature(CPU_FTR_HVMODE) &&
+ cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG)) {
+ /*
+- * Older versions of KVM on these machines perfer if the
++ * Older versions of KVM on these machines prefer if the
+ * guest only uses the low 19 PID bits.
+ */
+ mmu_pid_bits = 19;
+diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
+index 7724af19ed7e6..dda51fef2d2ec 100644
+--- a/arch/powerpc/mm/book3s64/radix_tlb.c
++++ b/arch/powerpc/mm/book3s64/radix_tlb.c
+@@ -397,7 +397,7 @@ static inline void _tlbie_pid(unsigned long pid, unsigned long ric)
+
+ /*
+ * Workaround the fact that the "ric" argument to __tlbie_pid
+- * must be a compile-time contraint to match the "i" constraint
++ * must be a compile-time constraint to match the "i" constraint
+ * in the asm statement.
+ */
+ switch (ric) {
+diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
+index 81091b9587f62..6956f637a38c1 100644
+--- a/arch/powerpc/mm/book3s64/slb.c
++++ b/arch/powerpc/mm/book3s64/slb.c
+@@ -347,7 +347,7 @@ void slb_setup_new_exec(void)
+ /*
+ * We have no good place to clear the slb preload cache on exec,
+ * flush_thread is about the earliest arch hook but that happens
+- * after we switch to the mm and have aleady preloaded the SLBEs.
++ * after we switch to the mm and have already preloaded the SLBEs.
+ *
+ * For the most part that's probably okay to use entries from the
+ * previous exec, they will age out if unused. It may turn out to
+@@ -615,7 +615,7 @@ static void slb_cache_update(unsigned long esid_data)
+ } else {
+ /*
+ * Our cache is full and the current cache content strictly
+- * doesn't indicate the active SLB conents. Bump the ptr
++ * doesn't indicate the active SLB contents. Bump the ptr
+ * so that switch_slb() will ignore the cache.
+ */
+ local_paca->slb_cache_ptr = SLB_CACHE_ENTRIES + 1;
+diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
+index 83c0ee9fbf05b..2e11952057f8c 100644
+--- a/arch/powerpc/mm/init_64.c
++++ b/arch/powerpc/mm/init_64.c
+@@ -111,7 +111,7 @@ static int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_m
+ }
+
+ /*
+- * vmemmap virtual address space management does not have a traditonal page
++ * vmemmap virtual address space management does not have a traditional page
+ * table to track which virtual struct pages are backed by physical mapping.
+ * The virtual to physical mappings are tracked in a simple linked list
+ * format. 'vmemmap_list' maintains the entire vmemmap physical mapping at
+@@ -128,7 +128,7 @@ static struct vmemmap_backing *next;
+
+ /*
+ * The same pointer 'next' tracks individual chunks inside the allocated
+- * full page during the boot time and again tracks the freeed nodes during
++ * full page during the boot time and again tracks the freed nodes during
+ * runtime. It is racy but it does not happen as they are separated by the
+ * boot process. Will create problem if some how we have memory hotplug
+ * operation during boot !!
+diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
+index f3e4d069e0ba7..a70828a6d9357 100644
+--- a/arch/powerpc/mm/kasan/kasan_init_32.c
++++ b/arch/powerpc/mm/kasan/kasan_init_32.c
+@@ -25,7 +25,7 @@ static void __init kasan_populate_pte(pte_t *ptep, pgprot_t prot)
+ int i;
+
+ for (i = 0; i < PTRS_PER_PTE; i++, ptep++)
+- __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot), 0);
++ __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot), 1);
+ }
+
+ int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned long k_end)
+diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
+index 27f9186ae3740..1ee08c3efe5b6 100644
+--- a/arch/powerpc/mm/nohash/8xx.c
++++ b/arch/powerpc/mm/nohash/8xx.c
+@@ -179,8 +179,8 @@ void mmu_mark_initmem_nx(void)
+ unsigned long boundary = strict_kernel_rwx_enabled() ? sinittext : etext8;
+ unsigned long einittext8 = ALIGN(__pa(_einittext), SZ_8M);
+
+- mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
+- mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
++ if (!debug_pagealloc_enabled_or_kfence())
++ mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
+
+ mmu_pin_tlb(block_mapped_ram, false);
+ }
+diff --git a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
+index 8b88be91b622a..307ca919d3930 100644
+--- a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
++++ b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
+@@ -142,7 +142,7 @@ book3e_hugetlb_preload(struct vm_area_struct *vma, unsigned long ea, pte_t pte)
+ tsize = shift - 10;
+ /*
+ * We can't be interrupted while we're setting up the MAS
+- * regusters or after we've confirmed that no tlb exists.
++ * registers or after we've confirmed that no tlb exists.
+ */
+ local_irq_save(flags);
+
+diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c b/arch/powerpc/mm/nohash/kaslr_booke.c
+index 5f81c076621f2..37eb8d80bd536 100644
+--- a/arch/powerpc/mm/nohash/kaslr_booke.c
++++ b/arch/powerpc/mm/nohash/kaslr_booke.c
+@@ -311,7 +311,7 @@ static unsigned long __init kaslr_choose_location(void *dt_ptr, phys_addr_t size
+ ram = map_mem_in_cams(ram, CONFIG_LOWMEM_CAM_NUM, true, true);
+ linear_sz = min_t(unsigned long, ram, SZ_512M);
+
+- /* If the linear size is smaller than 64M, do not randmize */
++ /* If the linear size is smaller than 64M, do not randomize */
+ if (linear_sz < SZ_64M)
+ return 0;
+
+diff --git a/arch/powerpc/mm/nohash/tlb_low_64e.S b/arch/powerpc/mm/nohash/tlb_low_64e.S
+index 8b97c4acfebfa..9e9ab3803fb2f 100644
+--- a/arch/powerpc/mm/nohash/tlb_low_64e.S
++++ b/arch/powerpc/mm/nohash/tlb_low_64e.S
+@@ -583,7 +583,7 @@ itlb_miss_fault_e6500:
+ */
+ rlwimi r11,r14,32-19,27,27
+ rlwimi r11,r14,32-16,19,19
+- beq normal_tlb_miss
++ beq normal_tlb_miss_user
+ /* XXX replace the RMW cycles with immediate loads + writes */
+ 1: mfspr r10,SPRN_MAS1
+ cmpldi cr0,r15,8 /* Check for vmalloc region */
+@@ -626,7 +626,7 @@ itlb_miss_fault_e6500:
+
+ cmpldi cr0,r15,0 /* Check for user region */
+ std r14,EX_TLB_ESR(r12) /* write crazy -1 to frame */
+- beq normal_tlb_miss
++ beq normal_tlb_miss_user
+
+ li r11,_PAGE_PRESENT|_PAGE_BAP_SX /* Base perm */
+ oris r11,r11,_PAGE_ACCESSED@h
+@@ -653,6 +653,12 @@ itlb_miss_fault_e6500:
+ * r11 = PTE permission mask
+ * r10 = crap (free to use)
+ */
++normal_tlb_miss_user:
++#ifdef CONFIG_PPC_KUAP
++ mfspr r14,SPRN_MAS1
++ rlwinm. r14,r14,0,0x3fff0000
++ beq- normal_tlb_miss_access_fault /* KUAP fault */
++#endif
+ normal_tlb_miss:
+ /* So we first construct the page table address. We do that by
+ * shifting the bottom of the address (not the region ID) by
+@@ -683,11 +689,6 @@ finish_normal_tlb_miss:
+ /* Check if required permissions are met */
+ andc. r15,r11,r14
+ bne- normal_tlb_miss_access_fault
+-#ifdef CONFIG_PPC_KUAP
+- mfspr r11,SPRN_MAS1
+- rlwinm. r10,r11,0,0x3fff0000
+- beq- normal_tlb_miss_access_fault /* KUAP fault */
+-#endif
+
+ /* Now we build the MAS:
+ *
+@@ -709,9 +710,7 @@ finish_normal_tlb_miss:
+ rldicl r10,r14,64-8,64-8
+ cmpldi cr0,r10,BOOK3E_PAGESZ_4K
+ beq- 1f
+-#ifndef CONFIG_PPC_KUAP
+ mfspr r11,SPRN_MAS1
+-#endif
+ rlwimi r11,r14,31,21,24
+ rlwinm r11,r11,0,21,19
+ mtspr SPRN_MAS1,r11
+diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
+index 97ae4935da799..20652daa1d7e3 100644
+--- a/arch/powerpc/mm/pgtable-frag.c
++++ b/arch/powerpc/mm/pgtable-frag.c
+@@ -83,7 +83,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel)
+ spin_lock(&mm->page_table_lock);
+ /*
+ * If we find pgtable_page set, we return
+- * the allocated page with single fragement
++ * the allocated page with single fragment
+ * count.
+ */
+ if (likely(!pte_frag_get(&mm->context))) {
+diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
+index a56ade39dc68a..3ac73f9fb5d59 100644
+--- a/arch/powerpc/mm/pgtable_32.c
++++ b/arch/powerpc/mm/pgtable_32.c
+@@ -135,9 +135,9 @@ void mark_initmem_nx(void)
+ unsigned long numpages = PFN_UP((unsigned long)_einittext) -
+ PFN_DOWN((unsigned long)_sinittext);
+
+- if (v_block_mapped((unsigned long)_sinittext)) {
+- mmu_mark_initmem_nx();
+- } else {
++ mmu_mark_initmem_nx();
++
++ if (!v_block_mapped((unsigned long)_sinittext)) {
+ set_memory_nx((unsigned long)_sinittext, numpages);
+ set_memory_rw((unsigned long)_sinittext, numpages);
+ }
+diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
+index 03607ab90c66f..f884760ca5cfe 100644
+--- a/arch/powerpc/mm/ptdump/shared.c
++++ b/arch/powerpc/mm/ptdump/shared.c
+@@ -17,9 +17,9 @@ static const struct flag_info flag_array[] = {
+ .clear = " ",
+ }, {
+ .mask = _PAGE_RW,
+- .val = _PAGE_RW,
+- .set = "rw",
+- .clear = "r ",
++ .val = 0,
++ .set = "r ",
++ .clear = "rw",
+ }, {
+ .mask = _PAGE_EXEC,
+ .val = _PAGE_EXEC,
+diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
+index 4738c4dbf5676..308a2e40d7be9 100644
+--- a/arch/powerpc/perf/8xx-pmu.c
++++ b/arch/powerpc/perf/8xx-pmu.c
+@@ -157,7 +157,7 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags)
+
+ mpc8xx_pmu_read(event);
+
+- /* If it was the last user, stop counting to avoid useles overhead */
++ /* If it was the last user, stop counting to avoid useless overhead */
+ switch (event_type(event)) {
+ case PERF_8xx_ID_CPU_CYCLES:
+ break;
+diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
+index b5b42cf0a7039..03c64a0195df2 100644
+--- a/arch/powerpc/perf/core-book3s.c
++++ b/arch/powerpc/perf/core-book3s.c
+@@ -1142,7 +1142,7 @@ static u64 check_and_compute_delta(u64 prev, u64 val)
+ /*
+ * POWER7 can roll back counter values, if the new value is smaller
+ * than the previous value it will cause the delta and the counter to
+- * have bogus values unless we rolled a counter over. If a coutner is
++ * have bogus values unless we rolled a counter over. If a counter is
+ * rolled back, it will be smaller, but within 256, which is the maximum
+ * number of events to rollback at once. If we detect a rollback
+ * return 0. This can lead to a small lack of precision in the
+@@ -1349,27 +1349,22 @@ static void power_pmu_disable(struct pmu *pmu)
+ * a PMI happens during interrupt replay and perf counter
+ * values are cleared by PMU callbacks before replay.
+ *
+- * If any PMC corresponding to the active PMU events are
+- * overflown, disable the interrupt by clearing the paca
+- * bit for PMI since we are disabling the PMU now.
+- * Otherwise provide a warning if there is PMI pending, but
+- * no counter is found overflown.
++ * Disable the interrupt by clearing the paca bit for PMI
++ * since we are disabling the PMU now. Otherwise provide a
++ * warning if there is PMI pending, but no counter is found
++ * overflown.
++ *
++ * Since power_pmu_disable runs under local_irq_save, it
++ * could happen that code hits a PMC overflow without PMI
++ * pending in paca. Hence only clear PMI pending if it was
++ * set.
++ *
++ * If a PMI is pending, then MSR[EE] must be disabled (because
++ * the masked PMI handler disabling EE). So it is safe to
++ * call clear_pmi_irq_pending().
+ */
+- if (any_pmc_overflown(cpuhw)) {
+- /*
+- * Since power_pmu_disable runs under local_irq_save, it
+- * could happen that code hits a PMC overflow without PMI
+- * pending in paca. Hence only clear PMI pending if it was
+- * set.
+- *
+- * If a PMI is pending, then MSR[EE] must be disabled (because
+- * the masked PMI handler disabling EE). So it is safe to
+- * call clear_pmi_irq_pending().
+- */
+- if (pmi_irq_pending())
+- clear_pmi_irq_pending();
+- } else
+- WARN_ON(pmi_irq_pending());
++ if (pmi_irq_pending())
++ clear_pmi_irq_pending();
+
+ val = mmcra = cpuhw->mmcr.mmcra;
+
+@@ -2057,7 +2052,7 @@ static int power_pmu_event_init(struct perf_event *event)
+ /*
+ * PMU config registers have fields that are
+ * reserved and some specific values for bit fields are reserved.
+- * For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
++ * For ex., MMCRA[61:62] is Random Sampling Mode (SM)
+ * and value of 0b11 to this field is reserved.
+ * Check for invalid values in attr.config.
+ */
+@@ -2447,7 +2442,7 @@ static void __perf_event_interrupt(struct pt_regs *regs)
+ }
+
+ /*
+- * During system wide profling or while specific CPU is monitored for an
++ * During system wide profiling or while specific CPU is monitored for an
+ * event, some corner cases could cause PMC to overflow in idle path. This
+ * will trigger a PMI after waking up from idle. Since counter values are _not_
+ * saved/restored in idle path, can lead to below "Can't find PMC" message.
+diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
+index 526d4b767534c..498f1a2f7658c 100644
+--- a/arch/powerpc/perf/imc-pmu.c
++++ b/arch/powerpc/perf/imc-pmu.c
+@@ -521,7 +521,7 @@ static int nest_imc_event_init(struct perf_event *event)
+
+ /*
+ * Nest HW counter memory resides in a per-chip reserve-memory (HOMER).
+- * Get the base memory addresss for this cpu.
++ * Get the base memory address for this cpu.
+ */
+ chip_id = cpu_to_chip_id(event->cpu);
+
+@@ -674,7 +674,7 @@ static int ppc_core_imc_cpu_offline(unsigned int cpu)
+ /*
+ * Check whether core_imc is registered. We could end up here
+ * if the cpuhotplug callback registration fails. i.e, callback
+- * invokes the offline path for all sucessfully registered cpus.
++ * invokes the offline path for all successfully registered cpus.
+ * At this stage, core_imc pmu will not be registered and we
+ * should return here.
+ *
+diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
+index bb5d64862bc99..42abbcfc73da1 100644
+--- a/arch/powerpc/perf/isa207-common.c
++++ b/arch/powerpc/perf/isa207-common.c
+@@ -82,11 +82,11 @@ static unsigned long sdar_mod_val(u64 event)
+ static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
+ {
+ /*
+- * MMCRA[SDAR_MODE] specifices how the SDAR should be updated in
+- * continous sampling mode.
++ * MMCRA[SDAR_MODE] specifies how the SDAR should be updated in
++ * continuous sampling mode.
+ *
+ * Incase of Power8:
+- * MMCRA[SDAR_MODE] will be programmed as "0b01" for continous sampling
++ * MMCRA[SDAR_MODE] will be programmed as "0b01" for continuous sampling
+ * mode and will be un-changed when setting MMCRA[63] (Marked events).
+ *
+ * Incase of Power9/power10:
+diff --git a/arch/powerpc/platforms/512x/clock-commonclk.c b/arch/powerpc/platforms/512x/clock-commonclk.c
+index 0b03d812baae5..0652c7e692256 100644
+--- a/arch/powerpc/platforms/512x/clock-commonclk.c
++++ b/arch/powerpc/platforms/512x/clock-commonclk.c
+@@ -663,7 +663,7 @@ static void __init mpc512x_clk_setup_mclk(struct mclk_setup_data *entry, size_t
+ * the PSC/MSCAN/SPDIF (serial drivers et al) need the MCLK
+ * for their bitrate
+ * - in the absence of "aliases" for clocks we need to create
+- * individial 'struct clk' items for whatever might get
++ * individual 'struct clk' items for whatever might get
+ * referenced or looked up, even if several of those items are
+ * identical from the logical POV (their rate value)
+ * - for easier future maintenance and for better reflection of
+diff --git a/arch/powerpc/platforms/512x/mpc512x_shared.c b/arch/powerpc/platforms/512x/mpc512x_shared.c
+index e3411663edadb..96d9cf49560d9 100644
+--- a/arch/powerpc/platforms/512x/mpc512x_shared.c
++++ b/arch/powerpc/platforms/512x/mpc512x_shared.c
+@@ -289,7 +289,7 @@ static void __init mpc512x_setup_diu(void)
+
+ /*
+ * We do not allocate and configure new area for bitmap buffer
+- * because it would requere copying bitmap data (splash image)
++ * because it would require copying bitmap data (splash image)
+ * and so negatively affect boot time. Instead we reserve the
+ * already configured frame buffer area so that it won't be
+ * destroyed. The starting address of the area to reserve and
+diff --git a/arch/powerpc/platforms/52xx/mpc52xx_common.c b/arch/powerpc/platforms/52xx/mpc52xx_common.c
+index 565e3a83dc9ee..60aa6015e284b 100644
+--- a/arch/powerpc/platforms/52xx/mpc52xx_common.c
++++ b/arch/powerpc/platforms/52xx/mpc52xx_common.c
+@@ -308,7 +308,7 @@ int mpc5200_psc_ac97_gpio_reset(int psc_number)
+
+ spin_lock_irqsave(&gpio_lock, flags);
+
+- /* Reconfiure pin-muxing to gpio */
++ /* Reconfigure pin-muxing to gpio */
+ mux = in_be32(&simple_gpio->port_config);
+ out_be32(&simple_gpio->port_config, mux & (~gpio));
+
+diff --git a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
+index f862b48b48242..7252d992ca9fe 100644
+--- a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
++++ b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
+@@ -398,7 +398,7 @@ static int mpc52xx_gpt_do_start(struct mpc52xx_gpt_priv *gpt, u64 period,
+ set |= MPC52xx_GPT_MODE_CONTINUOUS;
+
+ /* Determine the number of clocks in the requested period. 64 bit
+- * arithmatic is done here to preserve the precision until the value
++ * arithmetic is done here to preserve the precision until the value
+ * is scaled back down into the u32 range. Period is in 'ns', bus
+ * frequency is in Hz. */
+ clocks = period * (u64)gpt->ipb_freq;
+diff --git a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
+index b91ebebd9ff20..54dfc63809d31 100644
+--- a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
++++ b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
+@@ -104,7 +104,7 @@ static void mpc52xx_lpbfifo_kick(struct mpc52xx_lpbfifo_request *req)
+ *
+ * Configure the watermarks so DMA will always complete correctly.
+ * It may be worth experimenting with the ALARM value to see if
+- * there is a performance impacit. However, if it is wrong there
++ * there is a performance impact. However, if it is wrong there
+ * is a risk of DMA not transferring the last chunk of data
+ */
+ if (write) {
+diff --git a/arch/powerpc/platforms/85xx/mpc85xx_cds.c b/arch/powerpc/platforms/85xx/mpc85xx_cds.c
+index 5bd4870302564..ad0b88ecfd0ca 100644
+--- a/arch/powerpc/platforms/85xx/mpc85xx_cds.c
++++ b/arch/powerpc/platforms/85xx/mpc85xx_cds.c
+@@ -151,7 +151,7 @@ static void __init mpc85xx_cds_pci_irq_fixup(struct pci_dev *dev)
+ */
+ case PCI_DEVICE_ID_VIA_82C586_2:
+ /* There are two USB controllers.
+- * Identify them by functon number
++ * Identify them by function number
+ */
+ if (PCI_FUNC(dev->devfn) == 3)
+ dev->irq = 11;
+diff --git a/arch/powerpc/platforms/86xx/gef_ppc9a.c b/arch/powerpc/platforms/86xx/gef_ppc9a.c
+index 44bbbc535e1df..884da08806ce3 100644
+--- a/arch/powerpc/platforms/86xx/gef_ppc9a.c
++++ b/arch/powerpc/platforms/86xx/gef_ppc9a.c
+@@ -180,7 +180,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB,
+ *
+ * This function is called to determine whether the BSP is compatible with the
+ * supplied device-tree, which is assumed to be the correct one for the actual
+- * board. It is expected thati, in the future, a kernel may support multiple
++ * board. It is expected that, in the future, a kernel may support multiple
+ * boards.
+ */
+ static int __init gef_ppc9a_probe(void)
+diff --git a/arch/powerpc/platforms/86xx/gef_sbc310.c b/arch/powerpc/platforms/86xx/gef_sbc310.c
+index 46d6d3d4957a4..baaf1ab070165 100644
+--- a/arch/powerpc/platforms/86xx/gef_sbc310.c
++++ b/arch/powerpc/platforms/86xx/gef_sbc310.c
+@@ -167,7 +167,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB,
+ *
+ * This function is called to determine whether the BSP is compatible with the
+ * supplied device-tree, which is assumed to be the correct one for the actual
+- * board. It is expected thati, in the future, a kernel may support multiple
++ * board. It is expected that, in the future, a kernel may support multiple
+ * boards.
+ */
+ static int __init gef_sbc310_probe(void)
+diff --git a/arch/powerpc/platforms/86xx/gef_sbc610.c b/arch/powerpc/platforms/86xx/gef_sbc610.c
+index acf2c6c3c1eb7..120caf6af71d2 100644
+--- a/arch/powerpc/platforms/86xx/gef_sbc610.c
++++ b/arch/powerpc/platforms/86xx/gef_sbc610.c
+@@ -157,7 +157,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB,
+ *
+ * This function is called to determine whether the BSP is compatible with the
+ * supplied device-tree, which is assumed to be the correct one for the actual
+- * board. It is expected thati, in the future, a kernel may support multiple
++ * board. It is expected that, in the future, a kernel may support multiple
+ * boards.
+ */
+ static int __init gef_sbc610_probe(void)
+diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
+index e2e1fec91c6ed..660309559bd8c 100644
+--- a/arch/powerpc/platforms/Kconfig.cputype
++++ b/arch/powerpc/platforms/Kconfig.cputype
+@@ -173,11 +173,11 @@ config POWER9_CPU
+
+ config E5500_CPU
+ bool "Freescale e5500"
+- depends on E500
++ depends on PPC64 && E500
+
+ config E6500_CPU
+ bool "Freescale e6500"
+- depends on E500
++ depends on PPC64 && E500
+
+ config 860_CPU
+ bool "8xx family"
+diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
+index f9a1615b74dae..c0799fb26b6d6 100644
+--- a/arch/powerpc/platforms/book3s/vas-api.c
++++ b/arch/powerpc/platforms/book3s/vas-api.c
+@@ -30,7 +30,7 @@
+ *
+ * where "vas_copy" and "vas_paste" are defined in copy-paste.h.
+ * copy/paste returns to the user space directly. So refer NX hardware
+- * documententation for exact copy/paste usage and completion / error
++ * documentation for exact copy/paste usage and completion / error
+ * conditions.
+ */
+
+diff --git a/arch/powerpc/platforms/cell/axon_msi.c b/arch/powerpc/platforms/cell/axon_msi.c
+index 354a58c1e6f2d..dba34f4d51627 100644
+--- a/arch/powerpc/platforms/cell/axon_msi.c
++++ b/arch/powerpc/platforms/cell/axon_msi.c
+@@ -223,6 +223,7 @@ static int setup_msi_msg_address(struct pci_dev *dev, struct msi_msg *msg)
+ if (!prop) {
+ dev_dbg(&dev->dev,
+ "axon_msi: no msi-address-(32|64) properties found\n");
++ of_node_put(dn);
+ return -ENOENT;
+ }
+
+diff --git a/arch/powerpc/platforms/cell/cbe_regs.c b/arch/powerpc/platforms/cell/cbe_regs.c
+index 1c4c53bec66c1..03512a41bd7e2 100644
+--- a/arch/powerpc/platforms/cell/cbe_regs.c
++++ b/arch/powerpc/platforms/cell/cbe_regs.c
+@@ -23,7 +23,7 @@
+ * Current implementation uses "cpu" nodes. We build our own mapping
+ * array of cpu numbers to cpu nodes locally for now to allow interrupt
+ * time code to have a fast path rather than call of_get_cpu_node(). If
+- * we implement cpu hotplug, we'll have to install an appropriate norifier
++ * we implement cpu hotplug, we'll have to install an appropriate notifier
+ * in order to release references to the cpu going away
+ */
+ static struct cbe_regs_map
+diff --git a/arch/powerpc/platforms/cell/iommu.c b/arch/powerpc/platforms/cell/iommu.c
+index 25e726bf01727..3f141cf5e580d 100644
+--- a/arch/powerpc/platforms/cell/iommu.c
++++ b/arch/powerpc/platforms/cell/iommu.c
+@@ -582,7 +582,7 @@ static int cell_of_bus_notify(struct notifier_block *nb, unsigned long action,
+ {
+ struct device *dev = data;
+
+- /* We are only intereted in device addition */
++ /* We are only interested in device addition */
+ if (action != BUS_NOTIFY_ADD_DEVICE)
+ return 0;
+
+diff --git a/arch/powerpc/platforms/cell/spider-pci.c b/arch/powerpc/platforms/cell/spider-pci.c
+index a1c293f42a1fb..3a2ea8376e322 100644
+--- a/arch/powerpc/platforms/cell/spider-pci.c
++++ b/arch/powerpc/platforms/cell/spider-pci.c
+@@ -81,7 +81,7 @@ static int __init spiderpci_pci_setup_chip(struct pci_controller *phb,
+ /*
+ * On CellBlade, we can't know that which XDR memory is used by
+ * kmalloc() to allocate dummy_page_va.
+- * In order to imporve the performance, the XDR which is used to
++ * In order to improve the performance, the XDR which is used to
+ * allocate dummy_page_va is the nearest the spider-pci.
+ * We have to select the CBE which is the nearest the spider-pci
+ * to allocate memory from the best XDR, but I don't know that
+diff --git a/arch/powerpc/platforms/cell/spu_manage.c b/arch/powerpc/platforms/cell/spu_manage.c
+index ddf8742f09a3e..080ed2d2c6829 100644
+--- a/arch/powerpc/platforms/cell/spu_manage.c
++++ b/arch/powerpc/platforms/cell/spu_manage.c
+@@ -457,7 +457,7 @@ static void __init init_affinity_node(int cbe)
+
+ /*
+ * Walk through each phandle in vicinity property of the spu
+- * (tipically two vicinity phandles per spe node)
++ * (typically two vicinity phandles per spe node)
+ */
+ for (i = 0; i < (lenp / sizeof(phandle)); i++) {
+ if (vic_handles[i] == avoid_ph)
+diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
+index 4c702192412f1..96f5e2e430180 100644
+--- a/arch/powerpc/platforms/cell/spufs/inode.c
++++ b/arch/powerpc/platforms/cell/spufs/inode.c
+@@ -660,6 +660,7 @@ spufs_init_isolated_loader(void)
+ return;
+
+ loader = of_get_property(dn, "loader", &size);
++ of_node_put(dn);
+ if (!loader)
+ return;
+
+diff --git a/arch/powerpc/platforms/powermac/low_i2c.c b/arch/powerpc/platforms/powermac/low_i2c.c
+index df89d916236d9..9aded0188ce8b 100644
+--- a/arch/powerpc/platforms/powermac/low_i2c.c
++++ b/arch/powerpc/platforms/powermac/low_i2c.c
+@@ -1472,7 +1472,7 @@ int __init pmac_i2c_init(void)
+ smu_i2c_probe();
+ #endif
+
+- /* Now add plaform functions for some known devices */
++ /* Now add platform functions for some known devices */
+ pmac_i2c_devscan(pmac_i2c_dev_create);
+
+ return 0;
+diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
+index 89e22c460ebf9..33f7b959c810f 100644
+--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
++++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
+@@ -390,7 +390,7 @@ static struct eeh_dev *pnv_eeh_probe(struct pci_dev *pdev)
+ * should be blocked until PE reset. MMIO access is dropped
+ * by hardware certainly. In order to drop PCI config requests,
+ * one more flag (EEH_PE_CFG_RESTRICTED) is introduced, which
+- * will be checked in the backend for PE state retrival. If
++ * will be checked in the backend for PE state retrieval. If
+ * the PE becomes frozen for the first time and the flag has
+ * been set for the PE, we will set EEH_PE_CFG_BLOCKED for
+ * that PE to block its config space.
+@@ -981,7 +981,7 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
+ case EEH_RESET_FUNDAMENTAL:
+ /*
+ * Wait for Transaction Pending bit to clear. A word-aligned
+- * test is used, so we use the conrol offset rather than status
++ * test is used, so we use the control offset rather than status
+ * and shift the test bit to match.
+ */
+ pnv_eeh_wait_for_pending(pdn, "AF",
+@@ -1048,7 +1048,7 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
+ * frozen state during PE reset. However, the good idea here from
+ * benh is to keep frozen state before we get PE reset done completely
+ * (until BAR restore). With the frozen state, HW drops illegal IO
+- * or MMIO access, which can incur recrusive frozen PE during PE
++ * or MMIO access, which can incur recursive frozen PE during PE
+ * reset. The side effect is that EEH core has to clear the frozen
+ * state explicitly after BAR restore.
+ */
+@@ -1095,8 +1095,8 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
+ * bus is behind a hotplug slot and it will use the slot provided
+ * reset methods to prevent spurious hotplug events during the reset.
+ *
+- * Fundemental resets need to be handled internally to EEH since the
+- * PCI core doesn't really have a concept of a fundemental reset,
++ * Fundamental resets need to be handled internally to EEH since the
++ * PCI core doesn't really have a concept of a fundamental reset,
+ * mainly because there's no standard way to generate one. Only a
+ * few devices require an FRESET so it should be fine.
+ */
+diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
+index a6677a111aca1..6f94b808dd39a 100644
+--- a/arch/powerpc/platforms/powernv/idle.c
++++ b/arch/powerpc/platforms/powernv/idle.c
+@@ -112,7 +112,7 @@ static int __init pnv_save_sprs_for_deep_states(void)
+ if (rc != 0)
+ return rc;
+
+- /* Only p8 needs to set extra HID regiters */
++ /* Only p8 needs to set extra HID registers */
+ if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
+ uint64_t hid1_val = mfspr(SPRN_HID1);
+ uint64_t hid4_val = mfspr(SPRN_HID4);
+@@ -1204,7 +1204,7 @@ static void __init pnv_arch300_idle_init(void)
+ * The idle code does not deal with TB loss occurring
+ * in a shallower state than SPR loss, so force it to
+ * behave like SPRs are lost if TB is lost. POWER9 would
+- * never encouter this, but a POWER8 core would if it
++ * never encounter this, but a POWER8 core would if it
+ * implemented the stop instruction. So this is for forward
+ * compatibility.
+ */
+diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
+index 28b009b46464a..27c936075031a 100644
+--- a/arch/powerpc/platforms/powernv/ocxl.c
++++ b/arch/powerpc/platforms/powernv/ocxl.c
+@@ -289,7 +289,7 @@ int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count)
+ * be used by a function depends on how many functions exist
+ * on the device. The NPU needs to be configured to know how
+ * many bits are available to PASIDs and how many are to be
+- * used by the function BDF indentifier.
++ * used by the function BDF identifier.
+ *
+ * We only support one AFU-carrying function for now.
+ */
+diff --git a/arch/powerpc/platforms/powernv/opal-fadump.c b/arch/powerpc/platforms/powernv/opal-fadump.c
+index 9d74d3950a523..c1bcd2d4826e0 100644
+--- a/arch/powerpc/platforms/powernv/opal-fadump.c
++++ b/arch/powerpc/platforms/powernv/opal-fadump.c
+@@ -206,7 +206,7 @@ static u64 opal_fadump_init_mem_struct(struct fw_dump *fadump_conf)
+ opal_fdm->region_cnt = cpu_to_be16(reg_cnt);
+
+ /*
+- * Kernel metadata is passed to f/w and retrieved in capture kerenl.
++ * Kernel metadata is passed to f/w and retrieved in capture kernel.
+ * So, use it to save fadump header address instead of calculating it.
+ */
+ opal_fdm->fadumphdr_addr = cpu_to_be64(be64_to_cpu(opal_fdm->rgn[0].dest) +
+diff --git a/arch/powerpc/platforms/powernv/opal-lpc.c b/arch/powerpc/platforms/powernv/opal-lpc.c
+index 5390c888db162..d129d6d45a500 100644
+--- a/arch/powerpc/platforms/powernv/opal-lpc.c
++++ b/arch/powerpc/platforms/powernv/opal-lpc.c
+@@ -197,7 +197,7 @@ static ssize_t lpc_debug_read(struct file *filp, char __user *ubuf,
+
+ /*
+ * Select access size based on count and alignment and
+- * access type. IO and MEM only support byte acceses,
++ * access type. IO and MEM only support byte accesses,
+ * FW supports all 3.
+ */
+ len = 1;
+diff --git a/arch/powerpc/platforms/powernv/opal-memory-errors.c b/arch/powerpc/platforms/powernv/opal-memory-errors.c
+index 1e8e17df9ce80..a1754a28265dd 100644
+--- a/arch/powerpc/platforms/powernv/opal-memory-errors.c
++++ b/arch/powerpc/platforms/powernv/opal-memory-errors.c
+@@ -82,7 +82,7 @@ static DECLARE_WORK(mem_error_work, mem_error_handler);
+
+ /*
+ * opal_memory_err_event - notifier handler that queues up the opal message
+- * to be preocessed later.
++ * to be processed later.
+ */
+ static int opal_memory_err_event(struct notifier_block *nb,
+ unsigned long msg_type, void *msg)
+diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
+index 04155aaaadb1d..fe3d111b881c2 100644
+--- a/arch/powerpc/platforms/powernv/pci-sriov.c
++++ b/arch/powerpc/platforms/powernv/pci-sriov.c
+@@ -699,7 +699,7 @@ static int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
+ return -ENOSPC;
+ }
+
+- /* allocate a contigious block of PEs for our VFs */
++ /* allocate a contiguous block of PEs for our VFs */
+ base_pe = pnv_ioda_alloc_pe(phb, num_vfs);
+ if (!base_pe) {
+ pci_err(pdev, "Unable to allocate PEs for %d VFs\n", num_vfs);
+diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
+index 3805ad13b8f3d..d19305292e1e3 100644
+--- a/arch/powerpc/platforms/powernv/rng.c
++++ b/arch/powerpc/platforms/powernv/rng.c
+@@ -29,15 +29,6 @@ struct powernv_rng {
+
+ static DEFINE_PER_CPU(struct powernv_rng *, powernv_rng);
+
+-int powernv_hwrng_present(void)
+-{
+- struct powernv_rng *rng;
+-
+- rng = get_cpu_var(powernv_rng);
+- put_cpu_var(rng);
+- return rng != NULL;
+-}
+-
+ static unsigned long rng_whiten(struct powernv_rng *rng, unsigned long val)
+ {
+ unsigned long parity;
+@@ -58,17 +49,6 @@ static unsigned long rng_whiten(struct powernv_rng *rng, unsigned long val)
+ return val;
+ }
+
+-int powernv_get_random_real_mode(unsigned long *v)
+-{
+- struct powernv_rng *rng;
+-
+- rng = raw_cpu_read(powernv_rng);
+-
+- *v = rng_whiten(rng, __raw_rm_readq(rng->regs_real));
+-
+- return 1;
+-}
+-
+ static int powernv_get_random_darn(unsigned long *v)
+ {
+ unsigned long val;
+@@ -105,12 +85,14 @@ int powernv_get_random_long(unsigned long *v)
+ {
+ struct powernv_rng *rng;
+
+- rng = get_cpu_var(powernv_rng);
+-
+- *v = rng_whiten(rng, in_be64(rng->regs));
+-
+- put_cpu_var(rng);
+-
++ if (mfmsr() & MSR_DR) {
++ rng = get_cpu_var(powernv_rng);
++ *v = rng_whiten(rng, in_be64(rng->regs));
++ put_cpu_var(rng);
++ } else {
++ rng = raw_cpu_read(powernv_rng);
++ *v = rng_whiten(rng, __raw_rm_readq(rng->regs_real));
++ }
+ return 1;
+ }
+ EXPORT_SYMBOL_GPL(powernv_get_random_long);
+diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
+index 5ce924611b941..63ef61ed7597f 100644
+--- a/arch/powerpc/platforms/ps3/mm.c
++++ b/arch/powerpc/platforms/ps3/mm.c
+@@ -364,7 +364,7 @@ static void __maybe_unused _dma_dump_region(const struct ps3_dma_region *r,
+ * @bus_addr: Starting ioc bus address of the area to map.
+ * @len: Length in bytes of the area to map.
+ * @link: A struct list_head used with struct ps3_dma_region.chunk_list, the
+- * list of all chuncks owned by the region.
++ * list of all chunks owned by the region.
+ *
+ * This implementation uses a very simple dma page manager
+ * based on the dma_chunk structure. This scheme assumes
+diff --git a/arch/powerpc/platforms/ps3/system-bus.c b/arch/powerpc/platforms/ps3/system-bus.c
+index b637bf2920474..2502e9b17df4a 100644
+--- a/arch/powerpc/platforms/ps3/system-bus.c
++++ b/arch/powerpc/platforms/ps3/system-bus.c
+@@ -601,7 +601,7 @@ static dma_addr_t ps3_ioc0_map_page(struct device *_dev, struct page *page,
+ iopte_flag |= CBE_IOPTE_PP_W | CBE_IOPTE_SO_RW;
+ break;
+ default:
+- /* not happned */
++ /* not happened */
+ BUG();
+ }
+ result = ps3_dma_map(dev->d_region, (unsigned long)ptr, size,
+diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
+index 09fafcf2d3a06..f0b2bca750dad 100644
+--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
++++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
+@@ -510,7 +510,7 @@ static int pseries_eeh_set_option(struct eeh_pe *pe, int option)
+ int ret = 0;
+
+ /*
+- * When we're enabling or disabling EEH functioality on
++ * When we're enabling or disabling EEH functionality on
+ * the particular PE, the PE config address is possibly
+ * unavailable. Therefore, we have to figure it out from
+ * the FDT node.
+diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
+index 4d991cf840d91..a90024697c119 100644
+--- a/arch/powerpc/platforms/pseries/iommu.c
++++ b/arch/powerpc/platforms/pseries/iommu.c
+@@ -701,6 +701,33 @@ struct iommu_table_ops iommu_table_lpar_multi_ops = {
+ .get = tce_get_pSeriesLP
+ };
+
++/*
++ * Find nearest ibm,dma-window (default DMA window) or direct DMA window or
++ * dynamic 64bit DMA window, walking up the device tree.
++ */
++static struct device_node *pci_dma_find(struct device_node *dn,
++ const __be32 **dma_window)
++{
++ const __be32 *dw = NULL;
++
++ for ( ; dn && PCI_DN(dn); dn = dn->parent) {
++ dw = of_get_property(dn, "ibm,dma-window", NULL);
++ if (dw) {
++ if (dma_window)
++ *dma_window = dw;
++ return dn;
++ }
++ dw = of_get_property(dn, DIRECT64_PROPNAME, NULL);
++ if (dw)
++ return dn;
++ dw = of_get_property(dn, DMA64_PROPNAME, NULL);
++ if (dw)
++ return dn;
++ }
++
++ return NULL;
++}
++
+ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
+ {
+ struct iommu_table *tbl;
+@@ -713,20 +740,10 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
+ pr_debug("pci_dma_bus_setup_pSeriesLP: setting up bus %pOF\n",
+ dn);
+
+- /*
+- * Find nearest ibm,dma-window (default DMA window), walking up the
+- * device tree
+- */
+- for (pdn = dn; pdn != NULL; pdn = pdn->parent) {
+- dma_window = of_get_property(pdn, "ibm,dma-window", NULL);
+- if (dma_window != NULL)
+- break;
+- }
++ pdn = pci_dma_find(dn, &dma_window);
+
+- if (dma_window == NULL) {
++ if (dma_window == NULL)
+ pr_debug(" no ibm,dma-window property !\n");
+- return;
+- }
+
+ ppci = PCI_DN(pdn);
+
+@@ -736,11 +753,13 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
+ if (!ppci->table_group) {
+ ppci->table_group = iommu_pseries_alloc_group(ppci->phb->node);
+ tbl = ppci->table_group->tables[0];
+- iommu_table_setparms_lpar(ppci->phb, pdn, tbl,
+- ppci->table_group, dma_window);
++ if (dma_window) {
++ iommu_table_setparms_lpar(ppci->phb, pdn, tbl,
++ ppci->table_group, dma_window);
+
+- if (!iommu_init_table(tbl, ppci->phb->node, 0, 0))
+- panic("Failed to initialize iommu table");
++ if (!iommu_init_table(tbl, ppci->phb->node, 0, 0))
++ panic("Failed to initialize iommu table");
++ }
+ iommu_register_group(ppci->table_group,
+ pci_domain_nr(bus), 0);
+ pr_debug(" created table: %p\n", ppci->table_group);
+@@ -1233,7 +1252,7 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
+ bool default_win_removed = false, direct_mapping = false;
+ bool pmem_present;
+ struct pci_dn *pci = PCI_DN(pdn);
+- struct iommu_table *tbl = pci->table_group->tables[0];
++ struct property *default_win = NULL;
+
+ dn = of_find_node_by_type(NULL, "ibm,pmemory");
+ pmem_present = dn != NULL;
+@@ -1290,11 +1309,10 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
+ * for extensions presence.
+ */
+ if (query.windows_available == 0) {
+- struct property *default_win;
+ int reset_win_ext;
+
+ /* DDW + IOMMU on single window may fail if there is any allocation */
+- if (iommu_table_in_use(tbl)) {
++ if (iommu_table_in_use(pci->table_group->tables[0])) {
+ dev_warn(&dev->dev, "current IOMMU table in use, can't be replaced.\n");
+ goto out_failed;
+ }
+@@ -1430,16 +1448,18 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
+
+ pci->table_group->tables[1] = newtbl;
+
+- /* Keep default DMA window stuct if removed */
+- if (default_win_removed) {
+- tbl->it_size = 0;
+- vfree(tbl->it_map);
+- tbl->it_map = NULL;
+- }
+-
+ set_iommu_table_base(&dev->dev, newtbl);
+ }
+
++ if (default_win_removed) {
++ iommu_tce_table_put(pci->table_group->tables[0]);
++ pci->table_group->tables[0] = NULL;
++
++ /* default_win is valid here because default_win_removed == true */
++ of_remove_property(pdn, default_win);
++ dev_info(&dev->dev, "Removed default DMA window for %pOF\n", pdn);
++ }
++
+ spin_lock(&dma_win_list_lock);
+ list_add(&window->list, &dma_win_list);
+ spin_unlock(&dma_win_list_lock);
+@@ -1504,13 +1524,7 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
+ dn = pci_device_to_OF_node(dev);
+ pr_debug(" node is %pOF\n", dn);
+
+- for (pdn = dn; pdn && PCI_DN(pdn) && !PCI_DN(pdn)->table_group;
+- pdn = pdn->parent) {
+- dma_window = of_get_property(pdn, "ibm,dma-window", NULL);
+- if (dma_window)
+- break;
+- }
+-
++ pdn = pci_dma_find(dn, &dma_window);
+ if (!pdn || !PCI_DN(pdn)) {
+ printk(KERN_WARNING "pci_dma_dev_setup_pSeriesLP: "
+ "no DMA window found for pci dev=%s dn=%pOF\n",
+@@ -1541,7 +1555,6 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
+ static bool iommu_bypass_supported_pSeriesLP(struct pci_dev *pdev, u64 dma_mask)
+ {
+ struct device_node *dn = pci_device_to_OF_node(pdev), *pdn;
+- const __be32 *dma_window = NULL;
+
+ /* only attempt to use a new window if 64-bit DMA is requested */
+ if (dma_mask < DMA_BIT_MASK(64))
+@@ -1555,13 +1568,7 @@ static bool iommu_bypass_supported_pSeriesLP(struct pci_dev *pdev, u64 dma_mask)
+ * search upwards in the tree until we either hit a dma-window
+ * property, OR find a parent with a table already allocated.
+ */
+- for (pdn = dn; pdn && PCI_DN(pdn) && !PCI_DN(pdn)->table_group;
+- pdn = pdn->parent) {
+- dma_window = of_get_property(pdn, "ibm,dma-window", NULL);
+- if (dma_window)
+- break;
+- }
+-
++ pdn = pci_dma_find(dn, NULL);
+ if (pdn && PCI_DN(pdn))
+ return enable_ddw(pdev, pdn);
+
+diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
+index f27735f623bae..27bed0dd866ef 100644
+--- a/arch/powerpc/platforms/pseries/setup.c
++++ b/arch/powerpc/platforms/pseries/setup.c
+@@ -658,7 +658,7 @@ static resource_size_t pseries_get_iov_fw_value(struct pci_dev *dev, int resno,
+ */
+ num_res = of_read_number(&indexes[NUM_RES_PROPERTY], 1);
+ if (resno >= num_res)
+- return 0; /* or an errror */
++ return 0; /* or an error */
+
+ i = START_OF_ENTRIES + NEXT_ENTRY * resno;
+ switch (value) {
+@@ -762,7 +762,7 @@ static void pseries_pci_fixup_iov_resources(struct pci_dev *pdev)
+
+ if (!pdev->is_physfn)
+ return;
+- /*Firmware must support open sriov otherwise dont configure*/
++ /*Firmware must support open sriov otherwise don't configure*/
+ indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL);
+ if (indexes)
+ of_pci_parse_iov_addrs(pdev, indexes);
+diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c
+index ec65586cbeb39..241c84374045a 100644
+--- a/arch/powerpc/platforms/pseries/vas-sysfs.c
++++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
+@@ -76,7 +76,7 @@ struct vas_sysfs_entry {
+ * Create sysfs interface:
+ * /sys/devices/vas/vas0/gzip/default_capabilities
+ * This directory contains the following VAS GZIP capabilities
+- * for the defaule credit type.
++ * for the default credit type.
+ * /sys/devices/vas/vas0/gzip/default_capabilities/nr_total_credits
+ * Total number of default credits assigned to the LPAR which
+ * can be changed with DLPAR operation.
+diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
+index ec643bbdb67fc..500a1fc4a1d7d 100644
+--- a/arch/powerpc/platforms/pseries/vas.c
++++ b/arch/powerpc/platforms/pseries/vas.c
+@@ -801,7 +801,7 @@ int vas_reconfig_capabilties(u8 type, int new_nr_creds)
+ atomic_set(&caps->nr_total_credits, new_nr_creds);
+ /*
+ * The total number of available credits may be decreased or
+- * inceased with DLPAR operation. Means some windows have to be
++ * increased with DLPAR operation. Means some windows have to be
+ * closed / reopened. Hold the vas_pseries_mutex so that the
+ * the user space can not open new windows.
+ */
+diff --git a/arch/powerpc/sysdev/fsl_lbc.c b/arch/powerpc/sysdev/fsl_lbc.c
+index 1985e067e952e..18acfb4e82af8 100644
+--- a/arch/powerpc/sysdev/fsl_lbc.c
++++ b/arch/powerpc/sysdev/fsl_lbc.c
+@@ -37,7 +37,7 @@ EXPORT_SYMBOL(fsl_lbc_ctrl_dev);
+ *
+ * This function converts a base address of lbc into the right format for the
+ * BR register. If the SOC has eLBC then it returns 32bit physical address
+- * else it convers a 34bit local bus physical address to correct format of
++ * else it converts a 34bit local bus physical address to correct format of
+ * 32bit address for BR register (Example: MPC8641).
+ */
+ u32 fsl_lbc_addr(phys_addr_t addr_base)
+diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
+index a97ce602394e5..3c430a6a6a4e6 100644
+--- a/arch/powerpc/sysdev/fsl_pci.c
++++ b/arch/powerpc/sysdev/fsl_pci.c
+@@ -218,7 +218,7 @@ static void setup_pci_atmu(struct pci_controller *hose)
+ * windows have implemented the default target value as 0xf
+ * for CCSR space.In all Freescale legacy devices the target
+ * of 0xf is reserved for local memory space. 9132 Rev1.0
+- * now has local mempry space mapped to target 0x0 instead of
++ * now has local memory space mapped to target 0x0 instead of
+ * 0xf. Hence adding a workaround to remove the target 0xf
+ * defined for memory space from Inbound window attributes.
+ */
+@@ -520,6 +520,7 @@ int fsl_add_bridge(struct platform_device *pdev, int is_primary)
+ struct resource rsrc;
+ const int *bus_range;
+ u8 hdr_type, progif;
++ u32 class_code;
+ struct device_node *dev;
+ struct ccsr_pci __iomem *pci;
+ u16 temp;
+@@ -593,6 +594,13 @@ int fsl_add_bridge(struct platform_device *pdev, int is_primary)
+ PPC_INDIRECT_TYPE_SURPRESS_PRIMARY_BUS;
+ if (fsl_pcie_check_link(hose))
+ hose->indirect_type |= PPC_INDIRECT_TYPE_NO_PCIE_LINK;
++ /* Fix Class Code to PCI_CLASS_BRIDGE_PCI_NORMAL for pre-3.0 controller */
++ if (in_be32(&pci->block_rev1) < PCIE_IP_REV_3_0) {
++ early_read_config_dword(hose, 0, 0, PCIE_FSL_CSR_CLASSCODE, &class_code);
++ class_code &= 0xff;
++ class_code |= PCI_CLASS_BRIDGE_PCI_NORMAL << 8;
++ early_write_config_dword(hose, 0, 0, PCIE_FSL_CSR_CLASSCODE, class_code);
++ }
+ } else {
+ /*
+ * Set PBFR(PCI Bus Function Register)[10] = 1 to
+diff --git a/arch/powerpc/sysdev/fsl_pci.h b/arch/powerpc/sysdev/fsl_pci.h
+index cdbde2e0c96ef..093a875d7d1ec 100644
+--- a/arch/powerpc/sysdev/fsl_pci.h
++++ b/arch/powerpc/sysdev/fsl_pci.h
+@@ -18,6 +18,7 @@ struct platform_device;
+
+ #define PCIE_LTSSM 0x0404 /* PCIE Link Training and Status */
+ #define PCIE_LTSSM_L0 0x16 /* L0 state */
++#define PCIE_FSL_CSR_CLASSCODE 0x474 /* FSL GPEX CSR */
+ #define PCIE_IP_REV_2_2 0x02080202 /* PCIE IP block version Rev2.2 */
+ #define PCIE_IP_REV_3_0 0x02080300 /* PCIE IP block version Rev3.0 */
+ #define PIWAR_EN 0x80000000 /* Enable */
+diff --git a/arch/powerpc/sysdev/ge/ge_pic.c b/arch/powerpc/sysdev/ge/ge_pic.c
+index 02553a8ce1919..413b375c4d28c 100644
+--- a/arch/powerpc/sysdev/ge/ge_pic.c
++++ b/arch/powerpc/sysdev/ge/ge_pic.c
+@@ -150,7 +150,7 @@ static struct irq_chip gef_pic_chip = {
+ };
+
+
+-/* When an interrupt is being configured, this call allows some flexibilty
++/* When an interrupt is being configured, this call allows some flexibility
+ * in deciding which irq_chip structure is used
+ */
+ static int gef_pic_host_map(struct irq_domain *h, unsigned int virq,
+diff --git a/arch/powerpc/sysdev/mpic_msgr.c b/arch/powerpc/sysdev/mpic_msgr.c
+index 36ec0bdd8b63c..a25413826b63e 100644
+--- a/arch/powerpc/sysdev/mpic_msgr.c
++++ b/arch/powerpc/sysdev/mpic_msgr.c
+@@ -99,7 +99,7 @@ void mpic_msgr_disable(struct mpic_msgr *msgr)
+ EXPORT_SYMBOL_GPL(mpic_msgr_disable);
+
+ /* The following three functions are used to compute the order and number of
+- * the message register blocks. They are clearly very inefficent. However,
++ * the message register blocks. They are clearly very inefficient. However,
+ * they are called *only* a few times during device initialization.
+ */
+ static unsigned int mpic_msgr_number_of_blocks(void)
+diff --git a/arch/powerpc/sysdev/mpic_msi.c b/arch/powerpc/sysdev/mpic_msi.c
+index f412d6ad0b660..9936c014ac7df 100644
+--- a/arch/powerpc/sysdev/mpic_msi.c
++++ b/arch/powerpc/sysdev/mpic_msi.c
+@@ -37,7 +37,7 @@ static int __init mpic_msi_reserve_u3_hwirqs(struct mpic *mpic)
+ /* Reserve source numbers we know are reserved in the HW.
+ *
+ * This is a bit of a mix of U3 and U4 reserves but that's going
+- * to work fine, we have plenty enugh numbers left so let's just
++ * to work fine, we have plenty enough numbers left so let's just
+ * mark anything we don't like reserved.
+ */
+ for (i = 0; i < 8; i++)
+diff --git a/arch/powerpc/sysdev/mpic_timer.c b/arch/powerpc/sysdev/mpic_timer.c
+index 444e9ce42d0a5..b2f0a73e8f930 100644
+--- a/arch/powerpc/sysdev/mpic_timer.c
++++ b/arch/powerpc/sysdev/mpic_timer.c
+@@ -255,7 +255,7 @@ EXPORT_SYMBOL(mpic_start_timer);
+
+ /**
+ * mpic_stop_timer - stop hardware timer
+- * @handle: the timer to be stoped
++ * @handle: the timer to be stopped
+ *
+ * The timer periodically generates an interrupt. Unless user stops the timer.
+ */
+diff --git a/arch/powerpc/sysdev/mpic_u3msi.c b/arch/powerpc/sysdev/mpic_u3msi.c
+index 3f4841dfefb5f..73d1295940782 100644
+--- a/arch/powerpc/sysdev/mpic_u3msi.c
++++ b/arch/powerpc/sysdev/mpic_u3msi.c
+@@ -78,7 +78,7 @@ static u64 find_u4_magic_addr(struct pci_dev *pdev, unsigned int hwirq)
+
+ /* U4 PCIe MSIs need to write to the special register in
+ * the bridge that generates interrupts. There should be
+- * theorically a register at 0xf8005000 where you just write
++ * theoretically a register at 0xf8005000 where you just write
+ * the MSI number and that triggers the right interrupt, but
+ * unfortunately, this is busted in HW, the bridge endian swaps
+ * the value and hits the wrong nibble in the register.
+diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
+index f940428ad13fe..45f72fc715fc7 100644
+--- a/arch/powerpc/sysdev/xive/native.c
++++ b/arch/powerpc/sysdev/xive/native.c
+@@ -617,7 +617,7 @@ bool __init xive_native_init(void)
+
+ xive_tima_os = r.start;
+
+- /* Grab size of provisionning pages */
++ /* Grab size of provisioning pages */
+ xive_parse_provisioning(np);
+
+ /* Switch the XIVE to exploitation mode */
+diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
+index b0d36e430dbc4..013446f5cdbbe 100644
+--- a/arch/powerpc/sysdev/xive/spapr.c
++++ b/arch/powerpc/sysdev/xive/spapr.c
+@@ -716,6 +716,7 @@ static bool __init xive_get_max_prio(u8 *max_prio)
+ }
+
+ reg = of_get_property(rootdn, "ibm,plat-res-int-priorities", &len);
++ of_node_put(rootdn);
+ if (!reg) {
+ pr_err("Failed to read 'ibm,plat-res-int-priorities' property\n");
+ return false;
+diff --git a/arch/powerpc/xmon/ppc-opc.c b/arch/powerpc/xmon/ppc-opc.c
+index dfb80810b16cb..0774d711453ef 100644
+--- a/arch/powerpc/xmon/ppc-opc.c
++++ b/arch/powerpc/xmon/ppc-opc.c
+@@ -408,7 +408,7 @@ const struct powerpc_operand powerpc_operands[] =
+ #define FXM4 FXM + 1
+ { 0xff, 12, insert_fxm, extract_fxm,
+ PPC_OPERAND_OPTIONAL | PPC_OPERAND_OPTIONAL_VALUE},
+- /* If the FXM4 operand is ommitted, use the sentinel value -1. */
++ /* If the FXM4 operand is omitted, use the sentinel value -1. */
+ { -1, -1, NULL, NULL, 0},
+
+ /* The IMM20 field in an LI instruction. */
+diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
+index fd72753e8ad50..27da7d5c20241 100644
+--- a/arch/powerpc/xmon/xmon.c
++++ b/arch/powerpc/xmon/xmon.c
+@@ -2024,7 +2024,7 @@ static void dump_206_sprs(void)
+ if (!cpu_has_feature(CPU_FTR_ARCH_206))
+ return;
+
+- /* Actually some of these pre-date 2.06, but whatevs */
++ /* Actually some of these pre-date 2.06, but whatever */
+
+ printf("srr0 = %.16lx srr1 = %.16lx dsisr = %.8lx\n",
+ mfspr(SPRN_SRR0), mfspr(SPRN_SRR1), mfspr(SPRN_DSISR));
+diff --git a/arch/riscv/boot/dts/starfive/jh7100.dtsi b/arch/riscv/boot/dts/starfive/jh7100.dtsi
+index 69f22f9aad9db..f48e232a72a74 100644
+--- a/arch/riscv/boot/dts/starfive/jh7100.dtsi
++++ b/arch/riscv/boot/dts/starfive/jh7100.dtsi
+@@ -118,7 +118,7 @@
+ interrupt-controller;
+ #address-cells = <0>;
+ #interrupt-cells = <1>;
+- riscv,ndev = <127>;
++ riscv,ndev = <133>;
+ };
+
+ clkgen: clock-controller@11800000 {
+diff --git a/arch/riscv/include/asm/cpu_ops.h b/arch/riscv/include/asm/cpu_ops.h
+index 134590f1b8435..aa128466c4d4e 100644
+--- a/arch/riscv/include/asm/cpu_ops.h
++++ b/arch/riscv/include/asm/cpu_ops.h
+@@ -38,6 +38,7 @@ struct cpu_operations {
+ #endif
+ };
+
++extern const struct cpu_operations cpu_ops_spinwait;
+ extern const struct cpu_operations *cpu_ops[NR_CPUS];
+ void __init cpu_set_ops(int cpu);
+
+diff --git a/arch/riscv/kernel/cpu_ops.c b/arch/riscv/kernel/cpu_ops.c
+index 170d07e577215..f92c0e6eddb16 100644
+--- a/arch/riscv/kernel/cpu_ops.c
++++ b/arch/riscv/kernel/cpu_ops.c
+@@ -15,9 +15,7 @@
+ const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init;
+
+ extern const struct cpu_operations cpu_ops_sbi;
+-#ifdef CONFIG_RISCV_BOOT_SPINWAIT
+-extern const struct cpu_operations cpu_ops_spinwait;
+-#else
++#ifndef CONFIG_RISCV_BOOT_SPINWAIT
+ const struct cpu_operations cpu_ops_spinwait = {
+ .name = "",
+ .cpu_prepare = NULL,
+diff --git a/arch/riscv/kernel/cpu_ops_spinwait.c b/arch/riscv/kernel/cpu_ops_spinwait.c
+index 346847f6c41c8..d98d19226b5f5 100644
+--- a/arch/riscv/kernel/cpu_ops_spinwait.c
++++ b/arch/riscv/kernel/cpu_ops_spinwait.c
+@@ -11,6 +11,8 @@
+ #include <asm/sbi.h>
+ #include <asm/smp.h>
+
++#include "head.h"
++
+ const struct cpu_operations cpu_ops_spinwait;
+ void *__cpu_spinwait_stack_pointer[NR_CPUS] __section(".data");
+ void *__cpu_spinwait_task_pointer[NR_CPUS] __section(".data");
+@@ -18,7 +20,7 @@ void *__cpu_spinwait_task_pointer[NR_CPUS] __section(".data");
+ static void cpu_update_secondary_bootdata(unsigned int cpuid,
+ struct task_struct *tidle)
+ {
+- int hartid = cpuid_to_hartid_map(cpuid);
++ unsigned long hartid = cpuid_to_hartid_map(cpuid);
+
+ /*
+ * The hartid must be less than NR_CPUS to avoid out-of-bound access
+@@ -27,7 +29,7 @@ static void cpu_update_secondary_bootdata(unsigned int cpuid,
+ * spinwait booting is not the recommended approach for any platforms
+ * booting Linux in S-mode and can be disabled in the future.
+ */
+- if (hartid == INVALID_HARTID || hartid >= NR_CPUS)
++ if (hartid == INVALID_HARTID || hartid >= (unsigned long) NR_CPUS)
+ return;
+
+ /* Make sure tidle is updated */
+diff --git a/arch/riscv/kernel/crash_save_regs.S b/arch/riscv/kernel/crash_save_regs.S
+index 7832fb763abac..b2a1908c0463e 100644
+--- a/arch/riscv/kernel/crash_save_regs.S
++++ b/arch/riscv/kernel/crash_save_regs.S
+@@ -44,7 +44,7 @@ SYM_CODE_START(riscv_crash_save_regs)
+ REG_S t6, PT_T6(a0) /* x31 */
+
+ csrr t1, CSR_STATUS
+- csrr t2, CSR_EPC
++ auipc t2, 0x0
+ csrr t3, CSR_TVAL
+ csrr t4, CSR_CAUSE
+
+diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
+index df8e24559035c..ee79e6839b863 100644
+--- a/arch/riscv/kernel/machine_kexec.c
++++ b/arch/riscv/kernel/machine_kexec.c
+@@ -138,19 +138,37 @@ void machine_shutdown(void)
+ #endif
+ }
+
++/* Override the weak function in kernel/panic.c */
++void crash_smp_send_stop(void)
++{
++ static int cpus_stopped;
++
++ /*
++ * This function can be called twice in panic path, but obviously
++ * we execute this only once.
++ */
++ if (cpus_stopped)
++ return;
++
++ smp_send_stop();
++ cpus_stopped = 1;
++}
++
+ /*
+ * machine_crash_shutdown - Prepare to kexec after a kernel crash
+ *
+ * This function is called by crash_kexec just before machine_kexec
+- * below and its goal is similar to machine_shutdown, but in case of
+- * a kernel crash. Since we don't handle such cases yet, this function
+- * is empty.
++ * and its goal is to shutdown non-crashing cpus and save registers.
+ */
+ void
+ machine_crash_shutdown(struct pt_regs *regs)
+ {
++ local_irq_disable();
++
++ /* shutdown non-crashing cpus */
++ crash_smp_send_stop();
++
+ crash_save_cpu(regs, smp_processor_id());
+- machine_shutdown();
+ pr_info("Starting crashdump kernel...\n");
+ }
+
+@@ -171,7 +189,7 @@ machine_kexec(struct kimage *image)
+ struct kimage_arch *internal = &image->arch;
+ unsigned long jump_addr = (unsigned long) image->start;
+ unsigned long first_ind_entry = (unsigned long) &image->head;
+- unsigned long this_cpu_id = smp_processor_id();
++ unsigned long this_cpu_id = __smp_processor_id();
+ unsigned long this_hart_id = cpuid_to_hartid_map(this_cpu_id);
+ unsigned long fdt_addr = internal->fdt_addr;
+ void *control_code_buffer = page_address(image->control_code_page);
+diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c
+index 7a057b5f0adc7..c976a21cd4bd5 100644
+--- a/arch/riscv/kernel/probes/uprobes.c
++++ b/arch/riscv/kernel/probes/uprobes.c
+@@ -59,8 +59,6 @@ int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+
+ instruction_pointer_set(regs, utask->xol_vaddr);
+
+- regs->status &= ~SR_SPIE;
+-
+ return 0;
+ }
+
+@@ -72,8 +70,6 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+
+ instruction_pointer_set(regs, utask->vaddr + auprobe->insn_size);
+
+- regs->status |= SR_SPIE;
+-
+ return 0;
+ }
+
+@@ -111,8 +107,6 @@ void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+ * address.
+ */
+ instruction_pointer_set(regs, utask->vaddr);
+-
+- regs->status &= ~SR_SPIE;
+ }
+
+ bool arch_uretprobe_is_alive(struct return_instance *ret, enum rp_check ctx,
+diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
+index 8c475f4da3084..ec486e5369d9b 100644
+--- a/arch/riscv/lib/uaccess.S
++++ b/arch/riscv/lib/uaccess.S
+@@ -175,7 +175,7 @@ ENTRY(__asm_copy_from_user)
+ /* Exception fixup code */
+ 10:
+ /* Disable access to user memory */
+- csrs CSR_STATUS, t6
++ csrc CSR_STATUS, t6
+ mv a0, t5
+ ret
+ ENDPROC(__asm_copy_to_user)
+@@ -227,7 +227,7 @@ ENTRY(__clear_user)
+ /* Exception fixup code */
+ 11:
+ /* Disable access to user memory */
+- csrs CSR_STATUS, t6
++ csrc CSR_STATUS, t6
+ mv a0, a1
+ ret
+ ENDPROC(__clear_user)
+diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
+index 39e2e1d0e94f4..80b4a407b3f12 100644
+--- a/arch/riscv/mm/init.c
++++ b/arch/riscv/mm/init.c
+@@ -99,6 +99,10 @@ static void __init print_vm_layout(void)
+ (unsigned long)VMEMMAP_END);
+ print_mlm("vmalloc", (unsigned long)VMALLOC_START,
+ (unsigned long)VMALLOC_END);
++#ifdef CONFIG_64BIT
++ print_mlm("modules", (unsigned long)MODULES_VADDR,
++ (unsigned long)MODULES_END);
++#endif
+ print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
+ (unsigned long)high_memory);
+ if (IS_ENABLED(CONFIG_64BIT)) {
+diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
+index 40264f60b0da9..f4073106e1f39 100644
+--- a/arch/s390/include/asm/gmap.h
++++ b/arch/s390/include/asm/gmap.h
+@@ -148,4 +148,6 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],
+ unsigned long gaddr, unsigned long vmaddr);
+ int gmap_mark_unmergeable(void);
+ void s390_reset_acc(struct mm_struct *mm);
++void s390_unlist_old_asce(struct gmap *gmap);
++int s390_replace_asce(struct gmap *gmap);
+ #endif /* _ASM_S390_GMAP_H */
+diff --git a/arch/s390/include/asm/kexec.h b/arch/s390/include/asm/kexec.h
+index 63098df81c9f2..d13bd221cd37f 100644
+--- a/arch/s390/include/asm/kexec.h
++++ b/arch/s390/include/asm/kexec.h
+@@ -92,5 +92,8 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
+ const Elf_Shdr *relsec,
+ const Elf_Shdr *symtab);
+ #define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
++
++int arch_kimage_file_post_load_cleanup(struct kimage *image);
++#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
+ #endif
+ #endif /*_S390_KEXEC_H */
+diff --git a/arch/s390/include/asm/unwind.h b/arch/s390/include/asm/unwind.h
+index 0bf06f1682d81..02462e7100c1c 100644
+--- a/arch/s390/include/asm/unwind.h
++++ b/arch/s390/include/asm/unwind.h
+@@ -47,7 +47,7 @@ struct unwind_state {
+ static inline unsigned long unwind_recover_ret_addr(struct unwind_state *state,
+ unsigned long ip)
+ {
+- ip = ftrace_graph_ret_addr(state->task, &state->graph_idx, ip, NULL);
++ ip = ftrace_graph_ret_addr(state->task, &state->graph_idx, ip, (void *)state->sp);
+ if (is_kretprobe_trampoline(ip))
+ ip = kretprobe_find_ret_addr(state->task, (void *)state->sp, &state->kr_cur);
+ return ip;
+diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
+index 28124d0fa1d5e..f8ebdd70dd317 100644
+--- a/arch/s390/kernel/crash_dump.c
++++ b/arch/s390/kernel/crash_dump.c
+@@ -199,7 +199,7 @@ static int copy_oldmem_user(void __user *dst, unsigned long src, size_t count)
+ } else {
+ len = count;
+ }
+- rc = copy_to_user_real(dst, src, count);
++ rc = copy_to_user_real(dst, src, len);
+ if (rc)
+ return rc;
+ }
+diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c
+index 8f43575a4dd32..fc6d5f58debeb 100644
+--- a/arch/s390/kernel/machine_kexec_file.c
++++ b/arch/s390/kernel/machine_kexec_file.c
+@@ -31,6 +31,7 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len)
+ const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
+ struct module_signature *ms;
+ unsigned long sig_len;
++ int ret;
+
+ /* Skip signature verification when not secure IPLed. */
+ if (!ipl_secure_flag)
+@@ -65,11 +66,18 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len)
+ return -EBADMSG;
+ }
+
+- return verify_pkcs7_signature(kernel, kernel_len,
+- kernel + kernel_len, sig_len,
+- VERIFY_USE_PLATFORM_KEYRING,
+- VERIFYING_MODULE_SIGNATURE,
+- NULL, NULL);
++ ret = verify_pkcs7_signature(kernel, kernel_len,
++ kernel + kernel_len, sig_len,
++ VERIFY_USE_SECONDARY_KEYRING,
++ VERIFYING_MODULE_SIGNATURE,
++ NULL, NULL);
++ if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING))
++ ret = verify_pkcs7_signature(kernel, kernel_len,
++ kernel + kernel_len, sig_len,
++ VERIFY_USE_PLATFORM_KEYRING,
++ VERIFYING_MODULE_SIGNATURE,
++ NULL, NULL);
++ return ret;
+ }
+ #endif /* CONFIG_KEXEC_SIG */
+
+diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
+index 8bd42a20d924e..88112065d9411 100644
+--- a/arch/s390/kvm/intercept.c
++++ b/arch/s390/kvm/intercept.c
+@@ -528,12 +528,27 @@ static int handle_pv_uvc(struct kvm_vcpu *vcpu)
+
+ static int handle_pv_notification(struct kvm_vcpu *vcpu)
+ {
++ int ret;
++
+ if (vcpu->arch.sie_block->ipa == 0xb210)
+ return handle_pv_spx(vcpu);
+ if (vcpu->arch.sie_block->ipa == 0xb220)
+ return handle_pv_sclp(vcpu);
+ if (vcpu->arch.sie_block->ipa == 0xb9a4)
+ return handle_pv_uvc(vcpu);
++ if (vcpu->arch.sie_block->ipa >> 8 == 0xae) {
++ /*
++ * Besides external call, other SIGP orders also cause a
++ * 108 (pv notify) intercept. In contrast to external call,
++ * these orders need to be emulated and hence the appropriate
++ * place to handle them is in handle_instruction().
++ * So first try kvm_s390_handle_sigp_pei() and if that isn't
++ * successful, go on with handle_instruction().
++ */
++ ret = kvm_s390_handle_sigp_pei(vcpu);
++ if (!ret)
++ return ret;
++ }
+
+ return handle_instruction(vcpu);
+ }
+diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
+index cc7c9599f43ee..8eee3fc414e5b 100644
+--- a/arch/s390/kvm/pv.c
++++ b/arch/s390/kvm/pv.c
+@@ -161,10 +161,13 @@ int kvm_s390_pv_deinit_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
+ atomic_set(&kvm->mm->context.is_protected, 0);
+ KVM_UV_EVENT(kvm, 3, "PROTVIRT DESTROY VM: rc %x rrc %x", *rc, *rrc);
+ WARN_ONCE(cc, "protvirt destroy vm failed rc %x rrc %x", *rc, *rrc);
+- /* Inteded memory leak on "impossible" error */
+- if (!cc)
++ /* Intended memory leak on "impossible" error */
++ if (!cc) {
+ kvm_s390_pv_dealloc_vm(kvm);
+- return cc ? -EIO : 0;
++ return 0;
++ }
++ s390_replace_asce(kvm->arch.gmap);
++ return -EIO;
+ }
+
+ int kvm_s390_pv_init_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
+diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
+index 8aaee2892ec35..cb747bf6c7982 100644
+--- a/arch/s390/kvm/sigp.c
++++ b/arch/s390/kvm/sigp.c
+@@ -480,9 +480,9 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu)
+ struct kvm_vcpu *dest_vcpu;
+ u8 order_code = kvm_s390_get_base_disp_rs(vcpu, NULL);
+
+- trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr);
+-
+ if (order_code == SIGP_EXTERNAL_CALL) {
++ trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr);
++
+ dest_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, cpu_addr);
+ BUG_ON(dest_vcpu == NULL);
+
+diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
+index b8ae4a4aa2ba4..85cab61d87a96 100644
+--- a/arch/s390/mm/gmap.c
++++ b/arch/s390/mm/gmap.c
+@@ -2735,3 +2735,89 @@ void s390_reset_acc(struct mm_struct *mm)
+ mmput(mm);
+ }
+ EXPORT_SYMBOL_GPL(s390_reset_acc);
++
++/**
++ * s390_unlist_old_asce - Remove the topmost level of page tables from the
++ * list of page tables of the gmap.
++ * @gmap: the gmap whose table is to be removed
++ *
++ * On s390x, KVM keeps a list of all pages containing the page tables of the
++ * gmap (the CRST list). This list is used at tear down time to free all
++ * pages that are now not needed anymore.
++ *
++ * This function removes the topmost page of the tree (the one pointed to by
++ * the ASCE) from the CRST list.
++ *
++ * This means that it will not be freed when the VM is torn down, and needs
++ * to be handled separately by the caller, unless a leak is actually
++ * intended. Notice that this function will only remove the page from the
++ * list, the page will still be used as a top level page table (and ASCE).
++ */
++void s390_unlist_old_asce(struct gmap *gmap)
++{
++ struct page *old;
++
++ old = virt_to_page(gmap->table);
++ spin_lock(&gmap->guest_table_lock);
++ list_del(&old->lru);
++ /*
++ * Sometimes the topmost page might need to be "removed" multiple
++ * times, for example if the VM is rebooted into secure mode several
++ * times concurrently, or if s390_replace_asce fails after calling
++ * s390_remove_old_asce and is attempted again later. In that case
++ * the old asce has been removed from the list, and therefore it
++ * will not be freed when the VM terminates, but the ASCE is still
++ * in use and still pointed to.
++ * A subsequent call to replace_asce will follow the pointer and try
++ * to remove the same page from the list again.
++ * Therefore it's necessary that the page of the ASCE has valid
++ * pointers, so list_del can work (and do nothing) without
++ * dereferencing stale or invalid pointers.
++ */
++ INIT_LIST_HEAD(&old->lru);
++ spin_unlock(&gmap->guest_table_lock);
++}
++EXPORT_SYMBOL_GPL(s390_unlist_old_asce);
++
++/**
++ * s390_replace_asce - Try to replace the current ASCE of a gmap with a copy
++ * @gmap: the gmap whose ASCE needs to be replaced
++ *
++ * If the allocation of the new top level page table fails, the ASCE is not
++ * replaced.
++ * In any case, the old ASCE is always removed from the gmap CRST list.
++ * Therefore the caller has to make sure to save a pointer to it
++ * beforehand, unless a leak is actually intended.
++ */
++int s390_replace_asce(struct gmap *gmap)
++{
++ unsigned long asce;
++ struct page *page;
++ void *table;
++
++ s390_unlist_old_asce(gmap);
++
++ page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER);
++ if (!page)
++ return -ENOMEM;
++ table = page_to_virt(page);
++ memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));
++
++ /*
++ * The caller has to deal with the old ASCE, but here we make sure
++ * the new one is properly added to the CRST list, so that
++ * it will be freed when the VM is torn down.
++ */
++ spin_lock(&gmap->guest_table_lock);
++ list_add(&page->lru, &gmap->crst_list);
++ spin_unlock(&gmap->guest_table_lock);
++
++ /* Set new table origin while preserving existing ASCE control bits */
++ asce = (gmap->asce & ~_ASCE_ORIGIN) | __pa(table);
++ WRITE_ONCE(gmap->asce, asce);
++ WRITE_ONCE(gmap->mm->context.gmap_asce, asce);
++ WRITE_ONCE(gmap->table, table);
++
++ return 0;
++}
++EXPORT_SYMBOL_GPL(s390_replace_asce);
+diff --git a/arch/um/drivers/random.c b/arch/um/drivers/random.c
+index 433a3f8f2ef3e..32b3341fe9707 100644
+--- a/arch/um/drivers/random.c
++++ b/arch/um/drivers/random.c
+@@ -28,7 +28,7 @@
+ * protects against a module being loaded twice at the same time.
+ */
+ static int random_fd = -1;
+-static struct hwrng hwrng = { 0, };
++static struct hwrng hwrng;
+ static DECLARE_COMPLETION(have_data);
+
+ static int rng_dev_read(struct hwrng *rng, void *buf, size_t max, bool block)
+diff --git a/arch/um/include/asm/archrandom.h b/arch/um/include/asm/archrandom.h
+new file mode 100644
+index 0000000000000..2f24cb96391d7
+--- /dev/null
++++ b/arch/um/include/asm/archrandom.h
+@@ -0,0 +1,30 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef __ASM_UM_ARCHRANDOM_H__
++#define __ASM_UM_ARCHRANDOM_H__
++
++#include <linux/types.h>
++
++/* This is from <os.h>, but better not to #include that in a global header here. */
++ssize_t os_getrandom(void *buf, size_t len, unsigned int flags);
++
++static inline bool __must_check arch_get_random_long(unsigned long *v)
++{
++ return os_getrandom(v, sizeof(*v), 0) == sizeof(*v);
++}
++
++static inline bool __must_check arch_get_random_int(unsigned int *v)
++{
++ return os_getrandom(v, sizeof(*v), 0) == sizeof(*v);
++}
++
++static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
++{
++ return false;
++}
++
++static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
++{
++ return false;
++}
++
++#endif
+diff --git a/arch/um/include/asm/xor.h b/arch/um/include/asm/xor.h
+index 22b39de73c246..647fae200c5d3 100644
+--- a/arch/um/include/asm/xor.h
++++ b/arch/um/include/asm/xor.h
+@@ -18,7 +18,7 @@
+ #undef XOR_SELECT_TEMPLATE
+ /* pick an arbitrary one - measuring isn't possible with inf-cpu */
+ #define XOR_SELECT_TEMPLATE(x) \
+- (time_travel_mode == TT_MODE_INFCPU ? TT_CPU_INF_XOR_DEFAULT : x))
++ (time_travel_mode == TT_MODE_INFCPU ? TT_CPU_INF_XOR_DEFAULT : x)
+ #endif
+
+ #endif
+diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
+index fafde1d5416ed..0df646c6651ea 100644
+--- a/arch/um/include/shared/os.h
++++ b/arch/um/include/shared/os.h
+@@ -11,6 +11,12 @@
+ #include <irq_user.h>
+ #include <longjmp.h>
+ #include <mm_id.h>
++/* This is to get size_t */
++#ifndef __UM_HOST__
++#include <linux/types.h>
++#else
++#include <sys/types.h>
++#endif
+
+ #define CATCH_EINTR(expr) while ((errno = 0, ((expr) < 0)) && (errno == EINTR))
+
+@@ -243,6 +249,7 @@ extern void stack_protections(unsigned long address);
+ extern int raw(int fd);
+ extern void setup_machinename(char *machine_out);
+ extern void setup_hostinfo(char *buf, int len);
++extern ssize_t os_getrandom(void *buf, size_t len, unsigned int flags);
+ extern void os_dump_core(void) __attribute__ ((noreturn));
+ extern void um_early_printk(const char *s, unsigned int n);
+ extern void os_fix_helper_signals(void);
+diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
+index 9838967d0b2f1..e0de60e503b98 100644
+--- a/arch/um/kernel/um_arch.c
++++ b/arch/um/kernel/um_arch.c
+@@ -16,6 +16,7 @@
+ #include <linux/sched/task.h>
+ #include <linux/kmsg_dump.h>
+ #include <linux/suspend.h>
++#include <linux/random.h>
+
+ #include <asm/processor.h>
+ #include <asm/cpufeature.h>
+@@ -406,6 +407,8 @@ int __init __weak read_initrd(void)
+
+ void __init setup_arch(char **cmdline_p)
+ {
++ u8 rng_seed[32];
++
+ stack_protections((unsigned long) &init_thread_info);
+ setup_physmem(uml_physmem, uml_reserved, physmem_size, highmem);
+ mem_total_pages(physmem_size, iomem_size, highmem);
+@@ -416,6 +419,11 @@ void __init setup_arch(char **cmdline_p)
+ strlcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
+ *cmdline_p = command_line;
+ setup_hostinfo(host_info, sizeof host_info);
++
++ if (os_getrandom(rng_seed, sizeof(rng_seed), 0) == sizeof(rng_seed)) {
++ add_bootloader_randomness(rng_seed, sizeof(rng_seed));
++ memzero_explicit(rng_seed, sizeof(rng_seed));
++ }
+ }
+
+ void __init check_bugs(void)
+diff --git a/arch/um/os-Linux/util.c b/arch/um/os-Linux/util.c
+index 41297ec404bf9..fc0f2a9dee5af 100644
+--- a/arch/um/os-Linux/util.c
++++ b/arch/um/os-Linux/util.c
+@@ -14,6 +14,7 @@
+ #include <sys/wait.h>
+ #include <sys/mman.h>
+ #include <sys/utsname.h>
++#include <sys/random.h>
+ #include <init.h>
+ #include <os.h>
+
+@@ -96,6 +97,11 @@ static inline void __attribute__ ((noreturn)) uml_abort(void)
+ exit(127);
+ }
+
++ssize_t os_getrandom(void *buf, size_t len, unsigned int flags)
++{
++ return getrandom(buf, len, flags);
++}
++
+ /*
+ * UML helper threads must not handle SIGWINCH/INT/TERM
+ */
+diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
+index ce1f5a876cfea..04a9865dbdbf4 100644
+--- a/arch/x86/Kconfig
++++ b/arch/x86/Kconfig
+@@ -272,6 +272,7 @@ config X86
+ select SYSCTL_EXCEPTION_TRACE
+ select THREAD_INFO_IN_TASK
+ select TRACE_IRQFLAGS_SUPPORT
++ select TRACE_IRQFLAGS_NMI_SUPPORT
+ select USER_STACKTRACE_SUPPORT
+ select VIRT_TO_BUS
+ select HAVE_ARCH_KCSAN if X86_64
+diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
+index d3a6f74a94bdf..d4d6db4dde220 100644
+--- a/arch/x86/Kconfig.debug
++++ b/arch/x86/Kconfig.debug
+@@ -1,8 +1,5 @@
+ # SPDX-License-Identifier: GPL-2.0
+
+-config TRACE_IRQFLAGS_NMI_SUPPORT
+- def_bool y
+-
+ config EARLY_PRINTK_USB
+ bool
+
+diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
+index b5aecb524a8aa..ffec8bb01ba8c 100644
+--- a/arch/x86/boot/Makefile
++++ b/arch/x86/boot/Makefile
+@@ -103,7 +103,7 @@ $(obj)/zoffset.h: $(obj)/compressed/vmlinux FORCE
+ AFLAGS_header.o += -I$(objtree)/$(obj)
+ $(obj)/header.o: $(obj)/zoffset.h
+
+-LDFLAGS_setup.elf := -m elf_i386 -T
++LDFLAGS_setup.elf := -m elf_i386 -z noexecstack -T
+ $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
+ $(call if_changed,ld)
+
+diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
+index 6115274fe10fc..c66ea96936de4 100644
+--- a/arch/x86/boot/compressed/Makefile
++++ b/arch/x86/boot/compressed/Makefile
+@@ -69,6 +69,10 @@ LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
+ ifdef CONFIG_LD_ORPHAN_WARN
+ LDFLAGS_vmlinux += --orphan-handling=warn
+ endif
++LDFLAGS_vmlinux += -z noexecstack
++ifeq ($(CONFIG_LD_IS_BFD),y)
++LDFLAGS_vmlinux += $(call ld-option,--no-warn-rwx-segments)
++endif
+ LDFLAGS_vmlinux += -T
+
+ hostprogs := mkpiggy
+diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
+index 2831685adf6fb..8ed4597fdf6a0 100644
+--- a/arch/x86/crypto/Makefile
++++ b/arch/x86/crypto/Makefile
+@@ -61,9 +61,7 @@ sha256-ssse3-$(CONFIG_AS_SHA256_NI) += sha256_ni_asm.o
+ obj-$(CONFIG_CRYPTO_SHA512_SSSE3) += sha512-ssse3.o
+ sha512-ssse3-y := sha512-ssse3-asm.o sha512-avx-asm.o sha512-avx2-asm.o sha512_ssse3_glue.o
+
+-obj-$(CONFIG_CRYPTO_BLAKE2S_X86) += blake2s-x86_64.o
+-blake2s-x86_64-y := blake2s-shash.o
+-obj-$(if $(CONFIG_CRYPTO_BLAKE2S_X86),y) += libblake2s-x86_64.o
++obj-$(CONFIG_CRYPTO_BLAKE2S_X86) += libblake2s-x86_64.o
+ libblake2s-x86_64-y := blake2s-core.o blake2s-glue.o
+
+ obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o
+diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
+index 69853c13e8fb0..aaba212305288 100644
+--- a/arch/x86/crypto/blake2s-glue.c
++++ b/arch/x86/crypto/blake2s-glue.c
+@@ -4,7 +4,6 @@
+ */
+
+ #include <crypto/internal/blake2s.h>
+-#include <crypto/internal/simd.h>
+
+ #include <linux/types.h>
+ #include <linux/jump_label.h>
+@@ -33,7 +32,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
+ /* SIMD disables preemption, so relax after processing each page. */
+ BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8);
+
+- if (!static_branch_likely(&blake2s_use_ssse3) || !crypto_simd_usable()) {
++ if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) {
+ blake2s_compress_generic(state, block, nblocks, inc);
+ return;
+ }
+diff --git a/arch/x86/crypto/blake2s-shash.c b/arch/x86/crypto/blake2s-shash.c
+deleted file mode 100644
+index 59ae28abe35cc..0000000000000
+--- a/arch/x86/crypto/blake2s-shash.c
++++ /dev/null
+@@ -1,77 +0,0 @@
+-// SPDX-License-Identifier: GPL-2.0 OR MIT
+-/*
+- * Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+- */
+-
+-#include <crypto/internal/blake2s.h>
+-#include <crypto/internal/simd.h>
+-#include <crypto/internal/hash.h>
+-
+-#include <linux/types.h>
+-#include <linux/kernel.h>
+-#include <linux/module.h>
+-#include <linux/sizes.h>
+-
+-#include <asm/cpufeature.h>
+-#include <asm/processor.h>
+-
+-static int crypto_blake2s_update_x86(struct shash_desc *desc,
+- const u8 *in, unsigned int inlen)
+-{
+- return crypto_blake2s_update(desc, in, inlen, false);
+-}
+-
+-static int crypto_blake2s_final_x86(struct shash_desc *desc, u8 *out)
+-{
+- return crypto_blake2s_final(desc, out, false);
+-}
+-
+-#define BLAKE2S_ALG(name, driver_name, digest_size) \
+- { \
+- .base.cra_name = name, \
+- .base.cra_driver_name = driver_name, \
+- .base.cra_priority = 200, \
+- .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
+- .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
+- .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
+- .base.cra_module = THIS_MODULE, \
+- .digestsize = digest_size, \
+- .setkey = crypto_blake2s_setkey, \
+- .init = crypto_blake2s_init, \
+- .update = crypto_blake2s_update_x86, \
+- .final = crypto_blake2s_final_x86, \
+- .descsize = sizeof(struct blake2s_state), \
+- }
+-
+-static struct shash_alg blake2s_algs[] = {
+- BLAKE2S_ALG("blake2s-128", "blake2s-128-x86", BLAKE2S_128_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-160", "blake2s-160-x86", BLAKE2S_160_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-224", "blake2s-224-x86", BLAKE2S_224_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-256", "blake2s-256-x86", BLAKE2S_256_HASH_SIZE),
+-};
+-
+-static int __init blake2s_mod_init(void)
+-{
+- if (IS_REACHABLE(CONFIG_CRYPTO_HASH) && boot_cpu_has(X86_FEATURE_SSSE3))
+- return crypto_register_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
+- return 0;
+-}
+-
+-static void __exit blake2s_mod_exit(void)
+-{
+- if (IS_REACHABLE(CONFIG_CRYPTO_HASH) && boot_cpu_has(X86_FEATURE_SSSE3))
+- crypto_unregister_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
+-}
+-
+-module_init(blake2s_mod_init);
+-module_exit(blake2s_mod_exit);
+-
+-MODULE_ALIAS_CRYPTO("blake2s-128");
+-MODULE_ALIAS_CRYPTO("blake2s-128-x86");
+-MODULE_ALIAS_CRYPTO("blake2s-160");
+-MODULE_ALIAS_CRYPTO("blake2s-160-x86");
+-MODULE_ALIAS_CRYPTO("blake2s-224");
+-MODULE_ALIAS_CRYPTO("blake2s-224-x86");
+-MODULE_ALIAS_CRYPTO("blake2s-256");
+-MODULE_ALIAS_CRYPTO("blake2s-256-x86");
+-MODULE_LICENSE("GPL v2");
+diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
+index eeadbd7d92cc5..ca2fe186994b0 100644
+--- a/arch/x86/entry/Makefile
++++ b/arch/x86/entry/Makefile
+@@ -11,12 +11,13 @@ CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE)
+
+ CFLAGS_common.o += -fno-stack-protector
+
+-obj-y := entry.o entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
++obj-y := entry.o entry_$(BITS).o syscall_$(BITS).o
+ obj-y += common.o
+
+ obj-y += vdso/
+ obj-y += vsyscall/
+
++obj-$(CONFIG_PREEMPTION) += thunk_$(BITS).o
+ obj-$(CONFIG_IA32_EMULATION) += entry_64_compat.o syscall_32.o
+ obj-$(CONFIG_X86_X32_ABI) += syscall_x32.o
+
+diff --git a/arch/x86/entry/thunk_32.S b/arch/x86/entry/thunk_32.S
+index 7591bab060f70..ff6e7003da974 100644
+--- a/arch/x86/entry/thunk_32.S
++++ b/arch/x86/entry/thunk_32.S
+@@ -29,10 +29,8 @@ SYM_CODE_START_NOALIGN(\name)
+ SYM_CODE_END(\name)
+ .endm
+
+-#ifdef CONFIG_PREEMPTION
+ THUNK preempt_schedule_thunk, preempt_schedule
+ THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
+ EXPORT_SYMBOL(preempt_schedule_thunk)
+ EXPORT_SYMBOL(preempt_schedule_notrace_thunk)
+-#endif
+
+diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
+index 505b488fcc655..f38b07d2768bb 100644
+--- a/arch/x86/entry/thunk_64.S
++++ b/arch/x86/entry/thunk_64.S
+@@ -31,14 +31,11 @@ SYM_FUNC_END(\name)
+ _ASM_NOKPROBE(\name)
+ .endm
+
+-#ifdef CONFIG_PREEMPTION
+ THUNK preempt_schedule_thunk, preempt_schedule
+ THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
+ EXPORT_SYMBOL(preempt_schedule_thunk)
+ EXPORT_SYMBOL(preempt_schedule_notrace_thunk)
+-#endif
+
+-#ifdef CONFIG_PREEMPTION
+ SYM_CODE_START_LOCAL_NOALIGN(__thunk_restore)
+ popq %r11
+ popq %r10
+@@ -53,4 +50,3 @@ SYM_CODE_START_LOCAL_NOALIGN(__thunk_restore)
+ RET
+ _ASM_NOKPROBE(__thunk_restore)
+ SYM_CODE_END(__thunk_restore)
+-#endif
+diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
+index e893af5aa8f55..d80a133093a23 100644
+--- a/arch/x86/entry/vdso/Makefile
++++ b/arch/x86/entry/vdso/Makefile
+@@ -179,7 +179,7 @@ quiet_cmd_vdso = VDSO $@
+ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@'
+
+ VDSO_LDFLAGS = -shared --hash-style=both --build-id=sha1 \
+- $(call ld-option, --eh-frame-hdr) -Bsymbolic
++ $(call ld-option, --eh-frame-hdr) -Bsymbolic -z noexecstack
+ GCOV_PROFILE := n
+
+ quiet_cmd_vdso_and_check = VDSO $@
+diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
+index 1f156098a5bf5..4f70fb6c2c1eb 100644
+--- a/arch/x86/events/intel/lbr.c
++++ b/arch/x86/events/intel/lbr.c
+@@ -769,6 +769,7 @@ void intel_pmu_lbr_disable_all(void)
+ void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
+ {
+ unsigned long mask = x86_pmu.lbr_nr - 1;
++ struct perf_branch_entry *br = cpuc->lbr_entries;
+ u64 tos = intel_pmu_lbr_tos();
+ int i;
+
+@@ -784,15 +785,11 @@ void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
+
+ rdmsrl(x86_pmu.lbr_from + lbr_idx, msr_lastbranch.lbr);
+
+- cpuc->lbr_entries[i].from = msr_lastbranch.from;
+- cpuc->lbr_entries[i].to = msr_lastbranch.to;
+- cpuc->lbr_entries[i].mispred = 0;
+- cpuc->lbr_entries[i].predicted = 0;
+- cpuc->lbr_entries[i].in_tx = 0;
+- cpuc->lbr_entries[i].abort = 0;
+- cpuc->lbr_entries[i].cycles = 0;
+- cpuc->lbr_entries[i].type = 0;
+- cpuc->lbr_entries[i].reserved = 0;
++ perf_clear_branch_entry_bitfields(br);
++
++ br->from = msr_lastbranch.from;
++ br->to = msr_lastbranch.to;
++ br++;
+ }
+ cpuc->lbr_stack.nr = i;
+ cpuc->lbr_stack.hw_idx = tos;
+@@ -807,6 +804,7 @@ void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
+ {
+ bool need_info = false, call_stack = false;
+ unsigned long mask = x86_pmu.lbr_nr - 1;
++ struct perf_branch_entry *br = cpuc->lbr_entries;
+ u64 tos = intel_pmu_lbr_tos();
+ int i;
+ int out = 0;
+@@ -878,15 +876,14 @@ void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
+ if (abort && x86_pmu.lbr_double_abort && out > 0)
+ out--;
+
+- cpuc->lbr_entries[out].from = from;
+- cpuc->lbr_entries[out].to = to;
+- cpuc->lbr_entries[out].mispred = mis;
+- cpuc->lbr_entries[out].predicted = pred;
+- cpuc->lbr_entries[out].in_tx = in_tx;
+- cpuc->lbr_entries[out].abort = abort;
+- cpuc->lbr_entries[out].cycles = cycles;
+- cpuc->lbr_entries[out].type = 0;
+- cpuc->lbr_entries[out].reserved = 0;
++ perf_clear_branch_entry_bitfields(br+out);
++ br[out].from = from;
++ br[out].to = to;
++ br[out].mispred = mis;
++ br[out].predicted = pred;
++ br[out].in_tx = in_tx;
++ br[out].abort = abort;
++ br[out].cycles = cycles;
+ out++;
+ }
+ cpuc->lbr_stack.nr = out;
+@@ -951,6 +948,8 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
+ to = rdlbr_to(i, lbr);
+ info = rdlbr_info(i, lbr);
+
++ perf_clear_branch_entry_bitfields(e);
++
+ e->from = from;
+ e->to = to;
+ e->mispred = get_lbr_mispred(info);
+@@ -959,7 +958,6 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
+ e->abort = !!(info & LBR_INFO_ABORT);
+ e->cycles = get_lbr_cycles(info);
+ e->type = get_lbr_br_type(info);
+- e->reserved = 0;
+ }
+
+ cpuc->lbr_stack.nr = i;
+diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
+index 6ad8d946cd3eb..5ec359c1b50cb 100644
+--- a/arch/x86/include/asm/kexec.h
++++ b/arch/x86/include/asm/kexec.h
+@@ -193,6 +193,12 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
+ const Elf_Shdr *relsec,
+ const Elf_Shdr *symtab);
+ #define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
++
++void *arch_kexec_kernel_image_load(struct kimage *image);
++#define arch_kexec_kernel_image_load arch_kexec_kernel_image_load
++
++int arch_kimage_file_post_load_cleanup(struct kimage *image);
++#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
+ #endif
+ #endif
+
+diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
+index 9fdaa847d4b66..35c7a1fce8eab 100644
+--- a/arch/x86/include/asm/kvm_host.h
++++ b/arch/x86/include/asm/kvm_host.h
+@@ -508,6 +508,7 @@ struct kvm_pmu {
+ unsigned nr_arch_fixed_counters;
+ unsigned available_event_types;
+ u64 fixed_ctr_ctrl;
++ u64 fixed_ctr_ctrl_mask;
+ u64 global_ctrl;
+ u64 global_status;
+ u64 counter_bitmask[2];
+@@ -1588,7 +1589,7 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
+ #define kvm_arch_pmi_in_guest(vcpu) \
+ ((vcpu) && (vcpu)->arch.handling_intr_from_guest)
+
+-void kvm_mmu_x86_module_init(void);
++void __init kvm_mmu_x86_module_init(void);
+ int kvm_mmu_vendor_module_init(void);
+ void kvm_mmu_vendor_module_exit(void);
+
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index fa625b2a8a939..a80329bd00040 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -152,7 +152,7 @@ void __init check_bugs(void)
+ /*
+ * spectre_v2_user_select_mitigation() relies on the state set by
+ * retbleed_select_mitigation(); specifically the STIBP selection is
+- * forced for UNRET.
++ * forced for UNRET or IBPB.
+ */
+ spectre_v2_user_select_mitigation();
+ ssb_select_mitigation();
+@@ -1172,7 +1172,8 @@ spectre_v2_user_select_mitigation(void)
+ boot_cpu_has(X86_FEATURE_AMD_STIBP_ALWAYS_ON))
+ mode = SPECTRE_V2_USER_STRICT_PREFERRED;
+
+- if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET) {
++ if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET ||
++ retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
+ if (mode != SPECTRE_V2_USER_STRICT &&
+ mode != SPECTRE_V2_USER_STRICT_PREFERRED)
+ pr_info("Selecting STIBP always-on mode to complement retbleed mitigation\n");
+@@ -2353,10 +2354,11 @@ static ssize_t srbds_show_state(char *buf)
+
+ static ssize_t retbleed_show_state(char *buf)
+ {
+- if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET) {
++ if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET ||
++ retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
+ boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
+- return sprintf(buf, "Vulnerable: untrained return thunk on non-Zen uarch\n");
++ return sprintf(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n");
+
+ return sprintf(buf, "%s; SMT %s\n",
+ retbleed_strings[retbleed_mitigation],
+diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
+index 2c87d62f191e2..ae7d4c85f4f43 100644
+--- a/arch/x86/kernel/cpu/intel.c
++++ b/arch/x86/kernel/cpu/intel.c
+@@ -1145,22 +1145,23 @@ static void bus_lock_init(void)
+ {
+ u64 val;
+
+- /*
+- * Warn and fatal are handled by #AC for split lock if #AC for
+- * split lock is supported.
+- */
+- if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT) ||
+- (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
+- (sld_state == sld_warn || sld_state == sld_fatal)) ||
+- sld_state == sld_off)
++ if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
+ return;
+
+- /*
+- * Enable #DB for bus lock. All bus locks are handled in #DB except
+- * split locks are handled in #AC in the fatal case.
+- */
+ rdmsrl(MSR_IA32_DEBUGCTLMSR, val);
+- val |= DEBUGCTLMSR_BUS_LOCK_DETECT;
++
++ if ((boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
++ (sld_state == sld_warn || sld_state == sld_fatal)) ||
++ sld_state == sld_off) {
++ /*
++ * Warn and fatal are handled by #AC for split lock if #AC for
++ * split lock is supported.
++ */
++ val &= ~DEBUGCTLMSR_BUS_LOCK_DETECT;
++ } else {
++ val |= DEBUGCTLMSR_BUS_LOCK_DETECT;
++ }
++
+ wrmsrl(MSR_IA32_DEBUGCTLMSR, val);
+ }
+
+diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
+index 6892ca67d9c6d..b6d7ece7bf512 100644
+--- a/arch/x86/kernel/ftrace.c
++++ b/arch/x86/kernel/ftrace.c
+@@ -93,6 +93,7 @@ static int ftrace_verify_code(unsigned long ip, const char *old_code)
+
+ /* Make sure it is what we expect it to be */
+ if (memcmp(cur_code, old_code, MCOUNT_INSN_SIZE) != 0) {
++ ftrace_expected = old_code;
+ WARN_ON(1);
+ return -EINVAL;
+ }
+diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
+index 7c4ab8870da44..74167dc5f55ec 100644
+--- a/arch/x86/kernel/kprobes/core.c
++++ b/arch/x86/kernel/kprobes/core.c
+@@ -814,16 +814,20 @@ set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
+ static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
+ {
+- if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
+- kcb->kprobe_status = KPROBE_HIT_SSDONE;
+- cur->post_handler(cur, regs, 0);
+- }
+-
+ /* Restore back the original saved kprobes variables and continue. */
+- if (kcb->kprobe_status == KPROBE_REENTER)
++ if (kcb->kprobe_status == KPROBE_REENTER) {
++ /* This will restore both kcb and current_kprobe */
+ restore_previous_kprobe(kcb);
+- else
++ } else {
++ /*
++ * Always update the kcb status because
++ * reset_curent_kprobe() doesn't update kcb.
++ */
++ kcb->kprobe_status = KPROBE_HIT_SSDONE;
++ if (cur->post_handler)
++ cur->post_handler(cur, regs, 0);
+ reset_current_kprobe();
++ }
+ }
+ NOKPROBE_SYMBOL(kprobe_post_process);
+
+diff --git a/arch/x86/kernel/pmem.c b/arch/x86/kernel/pmem.c
+index 6b07faaa15798..23154d24b1173 100644
+--- a/arch/x86/kernel/pmem.c
++++ b/arch/x86/kernel/pmem.c
+@@ -27,6 +27,11 @@ static __init int register_e820_pmem(void)
+ * simply here to trigger the module to load on demand.
+ */
+ pdev = platform_device_alloc("e820_pmem", -1);
+- return platform_device_add(pdev);
++
++ rc = platform_device_add(pdev);
++ if (rc)
++ platform_device_put(pdev);
++
++ return rc;
+ }
+ device_initcall(register_e820_pmem);
+diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
+index 622dc3673c37f..8011536ba5c44 100644
+--- a/arch/x86/kernel/process.c
++++ b/arch/x86/kernel/process.c
+@@ -824,6 +824,10 @@ static void amd_e400_idle(void)
+ */
+ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+ {
++ /* User has disallowed the use of MWAIT. Fallback to HALT */
++ if (boot_option_idle_override == IDLE_NOMWAIT)
++ return 0;
++
+ if (c->x86_vendor != X86_VENDOR_INTEL)
+ return 0;
+
+@@ -932,9 +936,8 @@ static int __init idle_setup(char *str)
+ } else if (!strcmp(str, "nomwait")) {
+ /*
+ * If the boot option of "idle=nomwait" is added,
+- * it means that mwait will be disabled for CPU C2/C3
+- * states. In such case it won't touch the variable
+- * of boot_option_idle_override.
++ * it means that mwait will be disabled for CPU C1/C2/C3
++ * states.
+ */
+ boot_option_idle_override = IDLE_NOMWAIT;
+ } else
+diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
+index f8382abe22ff8..aa907cec09187 100644
+--- a/arch/x86/kvm/emulate.c
++++ b/arch/x86/kvm/emulate.c
+@@ -1687,16 +1687,6 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
+ case VCPU_SREG_TR:
+ if (seg_desc.s || (seg_desc.type != 1 && seg_desc.type != 9))
+ goto exception;
+- if (!seg_desc.p) {
+- err_vec = NP_VECTOR;
+- goto exception;
+- }
+- old_desc = seg_desc;
+- seg_desc.type |= 2; /* busy */
+- ret = ctxt->ops->cmpxchg_emulated(ctxt, desc_addr, &old_desc, &seg_desc,
+- sizeof(seg_desc), &ctxt->exception);
+- if (ret != X86EMUL_CONTINUE)
+- return ret;
+ break;
+ case VCPU_SREG_LDTR:
+ if (seg_desc.s || seg_desc.type != 2)
+@@ -1734,8 +1724,17 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
+ if (ret != X86EMUL_CONTINUE)
+ return ret;
+ if (emul_is_noncanonical_address(get_desc_base(&seg_desc) |
+- ((u64)base3 << 32), ctxt))
+- return emulate_gp(ctxt, 0);
++ ((u64)base3 << 32), ctxt))
++ return emulate_gp(ctxt, err_code);
++ }
++
++ if (seg == VCPU_SREG_TR) {
++ old_desc = seg_desc;
++ seg_desc.type |= 2; /* busy */
++ ret = ctxt->ops->cmpxchg_emulated(ctxt, desc_addr, &old_desc, &seg_desc,
++ sizeof(seg_desc), &ctxt->exception);
++ if (ret != X86EMUL_CONTINUE)
++ return ret;
+ }
+ load:
+ ctxt->ops->set_segment(ctxt, selector, &seg_desc, base3, seg);
+diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
+index 7f9c3b0fcd0b5..da7a25d41a97f 100644
+--- a/arch/x86/kvm/mmu/mmu.c
++++ b/arch/x86/kvm/mmu/mmu.c
+@@ -6264,7 +6264,7 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
+ * nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as
+ * its default value of -1 is technically undefined behavior for a boolean.
+ */
+-void kvm_mmu_x86_module_init(void)
++void __init kvm_mmu_x86_module_init(void)
+ {
+ if (nx_huge_pages == -1)
+ __set_nx_huge_pages(get_nx_auto_mode());
+diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
+index beb3ce8d94eb3..43f6a882615f0 100644
+--- a/arch/x86/kvm/mmu/paging_tmpl.h
++++ b/arch/x86/kvm/mmu/paging_tmpl.h
+@@ -1044,7 +1044,14 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+ if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access))
+ continue;
+
+- if (gfn != sp->gfns[i]) {
++ /*
++ * Drop the SPTE if the new protections would result in a RWX=0
++ * SPTE or if the gfn is changing. The RWX=0 case only affects
++ * EPT with execute-only support, i.e. EPT without an effective
++ * "present" bit, as all other paging modes will create a
++ * read-only SPTE if pte_access is zero.
++ */
++ if ((!pte_access && !shadow_present_mask) || gfn != sp->gfns[i]) {
+ drop_spte(vcpu->kvm, &sp->spt[i]);
+ flush = true;
+ continue;
+diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
+index e5c0b6db6f2ca..8223a80802e7f 100644
+--- a/arch/x86/kvm/mmu/spte.c
++++ b/arch/x86/kvm/mmu/spte.c
+@@ -128,6 +128,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
+ u64 spte = SPTE_MMU_PRESENT_MASK;
+ bool wrprot = false;
+
++ WARN_ON_ONCE(!pte_access && !shadow_present_mask);
++
+ if (sp->role.ad_disabled)
+ spte |= SPTE_TDP_AD_DISABLED_MASK;
+ else if (kvm_mmu_page_ad_need_write_protect(sp))
+diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
+index 9a5963bff0f95..83e7325f7df41 100644
+--- a/arch/x86/kvm/svm/nested.c
++++ b/arch/x86/kvm/svm/nested.c
+@@ -292,7 +292,8 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
+ return false;
+ }
+
+- if (CC(!kvm_is_valid_cr4(vcpu, save->cr4)))
++ /* Note, SVM doesn't have any additional restrictions on CR4. */
++ if (CC(!__kvm_is_valid_cr4(vcpu, save->cr4)))
+ return false;
+
+ if (CC(!kvm_valid_efer(vcpu, save->efer)))
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index c667214c630b1..e5a154f0d2aba 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -390,6 +390,10 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
+ */
+ (void)svm_skip_emulated_instruction(vcpu);
+ rip = kvm_rip_read(vcpu);
++
++ if (boot_cpu_has(X86_FEATURE_NRIPS))
++ svm->vmcb->control.next_rip = rip;
++
+ svm->int3_rip = rip + svm->vmcb->save.cs.base;
+ svm->int3_injected = rip - old_rip;
+ }
+@@ -3305,8 +3309,6 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu)
+ {
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+- BUG_ON(!(gif_set(svm)));
+-
+ trace_kvm_inj_virq(vcpu->arch.interrupt.nr);
+ ++vcpu->stat.irq_injections;
+
+@@ -3615,6 +3617,18 @@ static void svm_complete_interrupts(struct kvm_vcpu *vcpu)
+ vector = exitintinfo & SVM_EXITINTINFO_VEC_MASK;
+ type = exitintinfo & SVM_EXITINTINFO_TYPE_MASK;
+
++ /*
++ * If NextRIP isn't enabled, KVM must manually advance RIP prior to
++ * injecting the soft exception/interrupt. That advancement needs to
++ * be unwound if vectoring didn't complete. Note, the new event may
++ * not be the injected event, e.g. if KVM injected an INTn, the INTn
++ * hit a #NP in the guest, and the #NP encountered a #PF, the #NP will
++ * be the reported vectored event, but RIP still needs to be unwound.
++ */
++ if (int3_injected && type == SVM_EXITINTINFO_TYPE_EXEPT &&
++ kvm_is_linear_rip(vcpu, svm->int3_rip))
++ kvm_rip_write(vcpu, kvm_rip_read(vcpu) - int3_injected);
++
+ switch (type) {
+ case SVM_EXITINTINFO_TYPE_NMI:
+ vcpu->arch.nmi_injected = true;
+@@ -3628,16 +3642,11 @@ static void svm_complete_interrupts(struct kvm_vcpu *vcpu)
+
+ /*
+ * In case of software exceptions, do not reinject the vector,
+- * but re-execute the instruction instead. Rewind RIP first
+- * if we emulated INT3 before.
++ * but re-execute the instruction instead.
+ */
+- if (kvm_exception_is_soft(vector)) {
+- if (vector == BP_VECTOR && int3_injected &&
+- kvm_is_linear_rip(vcpu, svm->int3_rip))
+- kvm_rip_write(vcpu,
+- kvm_rip_read(vcpu) - int3_injected);
++ if (kvm_exception_is_soft(vector))
+ break;
+- }
++
+ if (exitintinfo & SVM_EXITINTINFO_VALID_ERR) {
+ u32 err = svm->vmcb->control.exit_int_info_err;
+ kvm_requeue_exception_e(vcpu, vector, err);
+diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
+index 28ccf25c41248..5c62e552082ab 100644
+--- a/arch/x86/kvm/vmx/nested.c
++++ b/arch/x86/kvm/vmx/nested.c
+@@ -1224,7 +1224,7 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data)
+ BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) |
+ /* reserved */
+ BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 56);
+- u64 vmx_basic = vmx->nested.msrs.basic;
++ u64 vmx_basic = vmcs_config.nested.basic;
+
+ if (!is_bitwise_subset(vmx_basic, data, feature_and_reserved))
+ return -EINVAL;
+@@ -1247,36 +1247,42 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data)
+ return 0;
+ }
+
+-static int
+-vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
++static void vmx_get_control_msr(struct nested_vmx_msrs *msrs, u32 msr_index,
++ u32 **low, u32 **high)
+ {
+- u64 supported;
+- u32 *lowp, *highp;
+-
+ switch (msr_index) {
+ case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
+- lowp = &vmx->nested.msrs.pinbased_ctls_low;
+- highp = &vmx->nested.msrs.pinbased_ctls_high;
++ *low = &msrs->pinbased_ctls_low;
++ *high = &msrs->pinbased_ctls_high;
+ break;
+ case MSR_IA32_VMX_TRUE_PROCBASED_CTLS:
+- lowp = &vmx->nested.msrs.procbased_ctls_low;
+- highp = &vmx->nested.msrs.procbased_ctls_high;
++ *low = &msrs->procbased_ctls_low;
++ *high = &msrs->procbased_ctls_high;
+ break;
+ case MSR_IA32_VMX_TRUE_EXIT_CTLS:
+- lowp = &vmx->nested.msrs.exit_ctls_low;
+- highp = &vmx->nested.msrs.exit_ctls_high;
++ *low = &msrs->exit_ctls_low;
++ *high = &msrs->exit_ctls_high;
+ break;
+ case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
+- lowp = &vmx->nested.msrs.entry_ctls_low;
+- highp = &vmx->nested.msrs.entry_ctls_high;
++ *low = &msrs->entry_ctls_low;
++ *high = &msrs->entry_ctls_high;
+ break;
+ case MSR_IA32_VMX_PROCBASED_CTLS2:
+- lowp = &vmx->nested.msrs.secondary_ctls_low;
+- highp = &vmx->nested.msrs.secondary_ctls_high;
++ *low = &msrs->secondary_ctls_low;
++ *high = &msrs->secondary_ctls_high;
+ break;
+ default:
+ BUG();
+ }
++}
++
++static int
++vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
++{
++ u32 *lowp, *highp;
++ u64 supported;
++
++ vmx_get_control_msr(&vmcs_config.nested, msr_index, &lowp, &highp);
+
+ supported = vmx_control_msr(*lowp, *highp);
+
+@@ -1288,6 +1294,7 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
+ if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
+ return -EINVAL;
+
++ vmx_get_control_msr(&vmx->nested.msrs, msr_index, &lowp, &highp);
+ *lowp = data;
+ *highp = data >> 32;
+ return 0;
+@@ -1301,10 +1308,8 @@ static int vmx_restore_vmx_misc(struct vcpu_vmx *vmx, u64 data)
+ BIT_ULL(28) | BIT_ULL(29) | BIT_ULL(30) |
+ /* reserved */
+ GENMASK_ULL(13, 9) | BIT_ULL(31);
+- u64 vmx_misc;
+-
+- vmx_misc = vmx_control_msr(vmx->nested.msrs.misc_low,
+- vmx->nested.msrs.misc_high);
++ u64 vmx_misc = vmx_control_msr(vmcs_config.nested.misc_low,
++ vmcs_config.nested.misc_high);
+
+ if (!is_bitwise_subset(vmx_misc, data, feature_and_reserved_bits))
+ return -EINVAL;
+@@ -1332,10 +1337,8 @@ static int vmx_restore_vmx_misc(struct vcpu_vmx *vmx, u64 data)
+
+ static int vmx_restore_vmx_ept_vpid_cap(struct vcpu_vmx *vmx, u64 data)
+ {
+- u64 vmx_ept_vpid_cap;
+-
+- vmx_ept_vpid_cap = vmx_control_msr(vmx->nested.msrs.ept_caps,
+- vmx->nested.msrs.vpid_caps);
++ u64 vmx_ept_vpid_cap = vmx_control_msr(vmcs_config.nested.ept_caps,
++ vmcs_config.nested.vpid_caps);
+
+ /* Every bit is either reserved or a feature bit. */
+ if (!is_bitwise_subset(vmx_ept_vpid_cap, data, -1ULL))
+@@ -1346,20 +1349,21 @@ static int vmx_restore_vmx_ept_vpid_cap(struct vcpu_vmx *vmx, u64 data)
+ return 0;
+ }
+
+-static int vmx_restore_fixed0_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
++static u64 *vmx_get_fixed0_msr(struct nested_vmx_msrs *msrs, u32 msr_index)
+ {
+- u64 *msr;
+-
+ switch (msr_index) {
+ case MSR_IA32_VMX_CR0_FIXED0:
+- msr = &vmx->nested.msrs.cr0_fixed0;
+- break;
++ return &msrs->cr0_fixed0;
+ case MSR_IA32_VMX_CR4_FIXED0:
+- msr = &vmx->nested.msrs.cr4_fixed0;
+- break;
++ return &msrs->cr4_fixed0;
+ default:
+ BUG();
+ }
++}
++
++static int vmx_restore_fixed0_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
++{
++ const u64 *msr = vmx_get_fixed0_msr(&vmcs_config.nested, msr_index);
+
+ /*
+ * 1 bits (which indicates bits which "must-be-1" during VMX operation)
+@@ -1368,7 +1372,7 @@ static int vmx_restore_fixed0_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
+ if (!is_bitwise_subset(data, *msr, -1ULL))
+ return -EINVAL;
+
+- *msr = data;
++ *vmx_get_fixed0_msr(&vmx->nested.msrs, msr_index) = data;
+ return 0;
+ }
+
+@@ -1429,7 +1433,7 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
+ vmx->nested.msrs.vmcs_enum = data;
+ return 0;
+ case MSR_IA32_VMX_VMFUNC:
+- if (data & ~vmx->nested.msrs.vmfunc_controls)
++ if (data & ~vmcs_config.nested.vmfunc_controls)
+ return -EINVAL;
+ vmx->nested.msrs.vmfunc_controls = data;
+ return 0;
+@@ -2279,7 +2283,6 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
+ SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
+ SECONDARY_EXEC_APIC_REGISTER_VIRT |
+ SECONDARY_EXEC_ENABLE_VMFUNC |
+- SECONDARY_EXEC_TSC_SCALING |
+ SECONDARY_EXEC_DESC);
+
+ if (nested_cpu_has(vmcs12,
+@@ -2618,6 +2621,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
+ vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
+
+ if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) &&
++ intel_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu)) &&
+ WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
+ vmcs12->guest_ia32_perf_global_ctrl))) {
+ *entry_failure_code = ENTRY_FAIL_DEFAULT;
+@@ -3378,10 +3382,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
+ if (likely(!evaluate_pending_interrupts) && kvm_vcpu_apicv_active(vcpu))
+ evaluate_pending_interrupts |= vmx_has_apicv_interrupt(vcpu);
+
+- if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
++ if (!vmx->nested.nested_run_pending ||
++ !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
+ vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
+ if (kvm_mpx_supported() &&
+- !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
++ (!vmx->nested.nested_run_pending ||
++ !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
+ vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
+
+ /*
+@@ -4341,7 +4347,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
+ vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat);
+ vcpu->arch.pat = vmcs12->host_ia32_pat;
+ }
+- if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
++ if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL) &&
++ intel_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu)))
+ WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
+ vmcs12->host_ia32_perf_global_ctrl));
+
+@@ -4967,20 +4974,25 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
+ | FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
+
+ /*
+- * The Intel VMX Instruction Reference lists a bunch of bits that are
+- * prerequisite to running VMXON, most notably cr4.VMXE must be set to
+- * 1 (see vmx_is_valid_cr4() for when we allow the guest to set this).
+- * Otherwise, we should fail with #UD. But most faulting conditions
+- * have already been checked by hardware, prior to the VM-exit for
+- * VMXON. We do test guest cr4.VMXE because processor CR4 always has
+- * that bit set to 1 in non-root mode.
++ * Note, KVM cannot rely on hardware to perform the CR0/CR4 #UD checks
++ * that have higher priority than VM-Exit (see Intel SDM's pseudocode
++ * for VMXON), as KVM must load valid CR0/CR4 values into hardware while
++ * running the guest, i.e. KVM needs to check the _guest_ values.
++ *
++ * Rely on hardware for the other two pre-VM-Exit checks, !VM86 and
++ * !COMPATIBILITY modes. KVM may run the guest in VM86 to emulate Real
++ * Mode, but KVM will never take the guest out of those modes.
+ */
+- if (!kvm_read_cr4_bits(vcpu, X86_CR4_VMXE)) {
++ if (!nested_host_cr0_valid(vcpu, kvm_read_cr0(vcpu)) ||
++ !nested_host_cr4_valid(vcpu, kvm_read_cr4(vcpu))) {
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return 1;
+ }
+
+- /* CPL=0 must be checked manually. */
++ /*
++ * CPL=0 and all other checks that are lower priority than VM-Exit must
++ * be checked manually.
++ */
+ if (vmx_get_cpl(vcpu)) {
+ kvm_inject_gp(vcpu, 0);
+ return 1;
+@@ -6780,6 +6792,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
+ rdmsrl(MSR_IA32_VMX_CR0_FIXED1, msrs->cr0_fixed1);
+ rdmsrl(MSR_IA32_VMX_CR4_FIXED1, msrs->cr4_fixed1);
+
++ if (vmx_umip_emulated())
++ msrs->cr4_fixed1 |= X86_CR4_UMIP;
++
+ msrs->vmcs_enum = nested_vmx_calc_vmcs_enum_msr();
+ }
+
+diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
+index c92cea0b8cccf..129ae4e01f7c1 100644
+--- a/arch/x86/kvm/vmx/nested.h
++++ b/arch/x86/kvm/vmx/nested.h
+@@ -281,7 +281,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
+ u64 fixed0 = to_vmx(vcpu)->nested.msrs.cr4_fixed0;
+ u64 fixed1 = to_vmx(vcpu)->nested.msrs.cr4_fixed1;
+
+- return fixed_bits_valid(val, fixed0, fixed1);
++ return fixed_bits_valid(val, fixed0, fixed1) &&
++ __kvm_is_valid_cr4(vcpu, val);
+ }
+
+ /* No difference in the restrictions on guest and host CR4 in VMX operation. */
+diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
+index b82b6709d7a81..8bd154f8c9663 100644
+--- a/arch/x86/kvm/vmx/pmu_intel.c
++++ b/arch/x86/kvm/vmx/pmu_intel.c
+@@ -98,6 +98,9 @@ static bool intel_pmc_is_enabled(struct kvm_pmc *pmc)
+ {
+ struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
++ if (!intel_pmu_has_perf_global_ctrl(pmu))
++ return true;
++
+ return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
+ }
+
+@@ -212,7 +215,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
+ case MSR_CORE_PERF_GLOBAL_STATUS:
+ case MSR_CORE_PERF_GLOBAL_CTRL:
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+- ret = pmu->version > 1;
++ return intel_pmu_has_perf_global_ctrl(pmu);
+ break;
+ default:
+ ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
+@@ -395,7 +398,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
+ case MSR_CORE_PERF_FIXED_CTR_CTRL:
+ if (pmu->fixed_ctr_ctrl == data)
+ return 0;
+- if (!(data & 0xfffffffffffff444ull)) {
++ if (!(data & pmu->fixed_ctr_ctrl_mask)) {
+ reprogram_fixed_counters(pmu, data);
+ return 0;
+ }
+@@ -479,6 +482,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
+ struct kvm_cpuid_entry2 *entry;
+ union cpuid10_eax eax;
+ union cpuid10_edx edx;
++ int i;
+
+ pmu->nr_arch_gp_counters = 0;
+ pmu->nr_arch_fixed_counters = 0;
+@@ -487,6 +491,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
+ pmu->version = 0;
+ pmu->reserved_bits = 0xffffffff00200000ull;
+ pmu->raw_event_mask = X86_RAW_EVENT_MASK;
++ pmu->global_ctrl_mask = ~0ull;
++ pmu->global_ovf_ctrl_mask = ~0ull;
++ pmu->fixed_ctr_ctrl_mask = ~0ull;
+
+ entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
+ if (!entry || !vcpu->kvm->arch.enable_pmu)
+@@ -522,6 +529,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
+ setup_fixed_pmc_eventsel(pmu);
+ }
+
++ for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
++ pmu->fixed_ctr_ctrl_mask &= ~(0xbull << (i * 4));
+ pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
+ (((1ull << pmu->nr_arch_fixed_counters) - 1) << INTEL_PMC_IDX_FIXED);
+ pmu->global_ctrl_mask = ~pmu->global_ctrl;
+diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
+index 597c3c08da501..3cfbb0413cd8c 100644
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -3230,8 +3230,8 @@ static bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+ {
+ /*
+ * We operate under the default treatment of SMM, so VMX cannot be
+- * enabled under SMM. Note, whether or not VMXE is allowed at all is
+- * handled by kvm_is_valid_cr4().
++ * enabled under SMM. Note, whether or not VMXE is allowed at all,
++ * i.e. is a reserved bit, is handled by common x86 code.
+ */
+ if ((cr4 & X86_CR4_VMXE) && is_smm(vcpu))
+ return false;
+diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
+index 1e7f9453894b1..93aa1f3ea01e5 100644
+--- a/arch/x86/kvm/vmx/vmx.h
++++ b/arch/x86/kvm/vmx/vmx.h
+@@ -92,6 +92,18 @@ union vmx_exit_reason {
+ u32 full;
+ };
+
++static inline bool intel_pmu_has_perf_global_ctrl(struct kvm_pmu *pmu)
++{
++ /*
++ * Architecturally, Intel's SDM states that IA32_PERF_GLOBAL_CTRL is
++ * supported if "CPUID.0AH: EAX[7:0] > 0", i.e. if the PMU version is
++ * greater than zero. However, KVM only exposes and emulates the MSR
++ * to/for the guest if the guest PMU supports at least "Architectural
++ * Performance Monitoring Version 2".
++ */
++ return pmu->version > 1;
++}
++
+ #define vcpu_to_lbr_desc(vcpu) (&to_vmx(vcpu)->lbr_desc)
+ #define vcpu_to_lbr_records(vcpu) (&to_vmx(vcpu)->lbr_desc.records)
+
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 65b0ec28bd52b..0d6cea0d33a9e 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -1066,7 +1066,7 @@ int kvm_emulate_xsetbv(struct kvm_vcpu *vcpu)
+ }
+ EXPORT_SYMBOL_GPL(kvm_emulate_xsetbv);
+
+-bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
++bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+ {
+ if (cr4 & cr4_reserved_bits)
+ return false;
+@@ -1074,9 +1074,15 @@ bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+ if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
+ return false;
+
+- return static_call(kvm_x86_is_valid_cr4)(vcpu, cr4);
++ return true;
++}
++EXPORT_SYMBOL_GPL(__kvm_is_valid_cr4);
++
++static bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
++{
++ return __kvm_is_valid_cr4(vcpu, cr4) &&
++ static_call(kvm_x86_is_valid_cr4)(vcpu, cr4);
+ }
+-EXPORT_SYMBOL_GPL(kvm_is_valid_cr4);
+
+ void kvm_post_set_cr4(struct kvm_vcpu *vcpu, unsigned long old_cr4, unsigned long cr4)
+ {
+@@ -3220,17 +3226,20 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
+ /* only 0 or all 1s can be written to IA32_MCi_CTL
+ * some Linux kernels though clear bit 10 in bank 4 to
+ * workaround a BIOS/GART TBL issue on AMD K8s, ignore
+- * this to avoid an uncatched #GP in the guest
++ * this to avoid an uncatched #GP in the guest.
++ *
++ * UNIXWARE clears bit 0 of MC1_CTL to ignore
++ * correctable, single-bit ECC data errors.
+ */
+ if ((offset & 0x3) == 0 &&
+- data != 0 && (data | (1 << 10)) != ~(u64)0)
+- return -1;
++ data != 0 && (data | (1 << 10) | 1) != ~(u64)0)
++ return 1;
+
+ /* MCi_STATUS */
+ if (!msr_info->host_initiated &&
+ (offset & 0x3) == 1 && data != 0) {
+ if (!can_set_mci_status(vcpu))
+- return -1;
++ return 1;
+ }
+
+ vcpu->arch.mce_banks[offset] = data;
+@@ -3361,6 +3370,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
+ struct gfn_to_hva_cache *ghc = &vcpu->arch.st.cache;
+ struct kvm_steal_time __user *st;
+ struct kvm_memslots *slots;
++ gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
+ u64 steal;
+ u32 version;
+
+@@ -3378,13 +3388,12 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
+ slots = kvm_memslots(vcpu->kvm);
+
+ if (unlikely(slots->generation != ghc->generation ||
++ gpa != ghc->gpa ||
+ kvm_is_error_hva(ghc->hva) || !ghc->memslot)) {
+- gfn_t gfn = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
+-
+ /* We rely on the fact that it fits in a single page. */
+ BUILD_BUG_ON((sizeof(*st) - 1) & KVM_STEAL_VALID_BITS);
+
+- if (kvm_gfn_to_hva_cache_init(vcpu->kvm, ghc, gfn, sizeof(*st)) ||
++ if (kvm_gfn_to_hva_cache_init(vcpu->kvm, ghc, gpa, sizeof(*st)) ||
+ kvm_is_error_hva(ghc->hva) || !ghc->memslot)
+ return;
+ }
+@@ -4371,10 +4380,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
+ if (r < sizeof(struct kvm_xsave))
+ r = sizeof(struct kvm_xsave);
+ break;
++ }
+ case KVM_CAP_PMU_CAPABILITY:
+ r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
+ break;
+- }
+ case KVM_CAP_DISABLE_QUIRKS2:
+ r = KVM_X86_VALID_QUIRKS;
+ break;
+@@ -4608,6 +4617,7 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
+ struct kvm_steal_time __user *st;
+ struct kvm_memslots *slots;
+ static const u8 preempted = KVM_VCPU_PREEMPTED;
++ gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
+
+ /*
+ * The vCPU can be marked preempted if and only if the VM-Exit was on
+@@ -4635,6 +4645,7 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
+ slots = kvm_memslots(vcpu->kvm);
+
+ if (unlikely(slots->generation != ghc->generation ||
++ gpa != ghc->gpa ||
+ kvm_is_error_hva(ghc->hva) || !ghc->memslot))
+ return;
+
+diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
+index 588792f003345..80417761fe4ac 100644
+--- a/arch/x86/kvm/x86.h
++++ b/arch/x86/kvm/x86.h
+@@ -407,7 +407,7 @@ static inline void kvm_machine_check(void)
+ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
+ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
+ int kvm_spec_ctrl_test_value(u64 value);
+-bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
++bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
+ int kvm_handle_memory_failure(struct kvm_vcpu *vcpu, int r,
+ struct x86_exception *e);
+ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva);
+diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
+index dba2197c05c30..331310c293492 100644
+--- a/arch/x86/mm/extable.c
++++ b/arch/x86/mm/extable.c
+@@ -94,16 +94,18 @@ static bool ex_handler_copy(const struct exception_table_entry *fixup,
+ static bool ex_handler_msr(const struct exception_table_entry *fixup,
+ struct pt_regs *regs, bool wrmsr, bool safe, int reg)
+ {
+- if (!safe && wrmsr &&
+- pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n",
+- (unsigned int)regs->cx, (unsigned int)regs->dx,
+- (unsigned int)regs->ax, regs->ip, (void *)regs->ip))
++ if (__ONCE_LITE_IF(!safe && wrmsr)) {
++ pr_warn("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n",
++ (unsigned int)regs->cx, (unsigned int)regs->dx,
++ (unsigned int)regs->ax, regs->ip, (void *)regs->ip);
+ show_stack_regs(regs);
++ }
+
+- if (!safe && !wrmsr &&
+- pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n",
+- (unsigned int)regs->cx, regs->ip, (void *)regs->ip))
++ if (__ONCE_LITE_IF(!safe && !wrmsr)) {
++ pr_warn("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n",
++ (unsigned int)regs->cx, regs->ip, (void *)regs->ip);
+ show_stack_regs(regs);
++ }
+
+ if (!wrmsr) {
+ /* Pretend that the read succeeded and returned 0. */
+diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
+index e8b061557887d..2aadb2019b4f2 100644
+--- a/arch/x86/mm/numa.c
++++ b/arch/x86/mm/numa.c
+@@ -867,7 +867,7 @@ void debug_cpumask_set_cpu(int cpu, int node, bool enable)
+ return;
+ }
+ mask = node_to_cpumask_map[node];
+- if (!mask) {
++ if (!cpumask_available(mask)) {
+ pr_err("node_to_cpumask_map[%i] NULL\n", node);
+ dump_stack();
+ return;
+@@ -913,7 +913,7 @@ const struct cpumask *cpumask_of_node(int node)
+ dump_stack();
+ return cpu_none_mask;
+ }
+- if (node_to_cpumask_map[node] == NULL) {
++ if (!cpumask_available(node_to_cpumask_map[node])) {
+ printk(KERN_WARNING
+ "cpumask_of_node(%d): no node_to_cpumask_map!\n",
+ node);
+diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
+index 2dab2816b3f7c..400117f630b81 100644
+--- a/arch/x86/net/bpf_jit_comp.c
++++ b/arch/x86/net/bpf_jit_comp.c
+@@ -1777,10 +1777,12 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
+ }
+
+ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
+- struct bpf_prog *p, int stack_size, bool save_ret)
++ struct bpf_tramp_link *l, int stack_size,
++ bool save_ret)
+ {
+ u8 *prog = *pprog;
+ u8 *jmp_insn;
++ struct bpf_prog *p = l->link.prog;
+
+ /* arg1: mov rdi, progs[i] */
+ emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
+@@ -1865,14 +1867,14 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
+ }
+
+ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
+- struct bpf_tramp_progs *tp, int stack_size,
++ struct bpf_tramp_links *tl, int stack_size,
+ bool save_ret)
+ {
+ int i;
+ u8 *prog = *pprog;
+
+- for (i = 0; i < tp->nr_progs; i++) {
+- if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size,
++ for (i = 0; i < tl->nr_links; i++) {
++ if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
+ save_ret))
+ return -EINVAL;
+ }
+@@ -1881,7 +1883,7 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
+ }
+
+ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
+- struct bpf_tramp_progs *tp, int stack_size,
++ struct bpf_tramp_links *tl, int stack_size,
+ u8 **branches)
+ {
+ u8 *prog = *pprog;
+@@ -1892,8 +1894,8 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
+ */
+ emit_mov_imm32(&prog, false, BPF_REG_0, 0);
+ emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
+- for (i = 0; i < tp->nr_progs; i++) {
+- if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size, true))
++ for (i = 0; i < tl->nr_links; i++) {
++ if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size, true))
+ return -EINVAL;
+
+ /* mod_ret prog stored return value into [rbp - 8]. Emit:
+@@ -1995,14 +1997,14 @@ static bool is_valid_bpf_tramp_flags(unsigned int flags)
+ */
+ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *image_end,
+ const struct btf_func_model *m, u32 flags,
+- struct bpf_tramp_progs *tprogs,
++ struct bpf_tramp_links *tlinks,
+ void *orig_call)
+ {
+ int ret, i, nr_args = m->nr_args;
+ int regs_off, ip_off, args_off, stack_size = nr_args * 8;
+- struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
+- struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
+- struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
++ struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
++ struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
++ struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+ u8 **branches = NULL;
+ u8 *prog;
+ bool save_ret;
+@@ -2093,13 +2095,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
+ }
+ }
+
+- if (fentry->nr_progs)
++ if (fentry->nr_links)
+ if (invoke_bpf(m, &prog, fentry, regs_off,
+ flags & BPF_TRAMP_F_RET_FENTRY_RET))
+ return -EINVAL;
+
+- if (fmod_ret->nr_progs) {
+- branches = kcalloc(fmod_ret->nr_progs, sizeof(u8 *),
++ if (fmod_ret->nr_links) {
++ branches = kcalloc(fmod_ret->nr_links, sizeof(u8 *),
+ GFP_KERNEL);
+ if (!branches)
+ return -ENOMEM;
+@@ -2126,7 +2128,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
+ prog += X86_PATCH_SIZE;
+ }
+
+- if (fmod_ret->nr_progs) {
++ if (fmod_ret->nr_links) {
+ /* From Intel 64 and IA-32 Architectures Optimization
+ * Reference Manual, 3.4.1.4 Code Alignment, Assembly/Compiler
+ * Coding Rule 11: All branch targets should be 16-byte
+@@ -2136,12 +2138,12 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
+ /* Update the branches saved in invoke_bpf_mod_ret with the
+ * aligned address of do_fexit.
+ */
+- for (i = 0; i < fmod_ret->nr_progs; i++)
++ for (i = 0; i < fmod_ret->nr_links; i++)
+ emit_cond_near_jump(&branches[i], prog, branches[i],
+ X86_JNE);
+ }
+
+- if (fexit->nr_progs)
++ if (fexit->nr_links)
+ if (invoke_bpf(m, &prog, fexit, regs_off, false)) {
+ ret = -EINVAL;
+ goto cleanup;
+@@ -2475,3 +2477,34 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
+ return ERR_PTR(-EINVAL);
+ return dst;
+ }
++
++/* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
++bool bpf_jit_supports_subprog_tailcalls(void)
++{
++ return true;
++}
++
++void bpf_jit_free(struct bpf_prog *prog)
++{
++ if (prog->jited) {
++ struct x64_jit_data *jit_data = prog->aux->jit_data;
++ struct bpf_binary_header *hdr;
++
++ /*
++ * If we fail the final pass of JIT (from jit_subprogs),
++ * the program may not be finalized yet. Call finalize here
++ * before freeing it.
++ */
++ if (jit_data) {
++ bpf_jit_binary_pack_finalize(prog, jit_data->header,
++ jit_data->rw_header);
++ kvfree(jit_data->addrs);
++ kfree(jit_data);
++ }
++ hdr = bpf_jit_binary_pack_hdr(prog);
++ bpf_jit_binary_pack_free(hdr, NULL);
++ WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(prog));
++ }
++
++ bpf_prog_unlock_free(prog);
++}
+diff --git a/arch/x86/platform/olpc/olpc-xo1-sci.c b/arch/x86/platform/olpc/olpc-xo1-sci.c
+index f03a6883dcc6d..89f25af4b3c33 100644
+--- a/arch/x86/platform/olpc/olpc-xo1-sci.c
++++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
+@@ -80,7 +80,7 @@ static void send_ebook_state(void)
+ return;
+ }
+
+- if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
++ if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
+ return; /* Nothing new to report. */
+
+ input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
+diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
+index ba5789c358094..a8cde4e8ab114 100644
+--- a/arch/x86/um/Makefile
++++ b/arch/x86/um/Makefile
+@@ -28,7 +28,8 @@ else
+
+ obj-y += syscalls_64.o vdso/
+
+-subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o
++subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o
++subarch-$(CONFIG_PREEMPTION) += ../entry/thunk_64.o
+
+ endif
+
+diff --git a/arch/xtensa/platforms/iss/network.c b/arch/xtensa/platforms/iss/network.c
+index be3aaaad8bee0..eee389187807b 100644
+--- a/arch/xtensa/platforms/iss/network.c
++++ b/arch/xtensa/platforms/iss/network.c
+@@ -503,16 +503,24 @@ static const struct net_device_ops iss_netdev_ops = {
+ .ndo_set_rx_mode = iss_net_set_multicast_list,
+ };
+
+-static int iss_net_configure(int index, char *init)
++static void iss_net_pdev_release(struct device *dev)
++{
++ struct platform_device *pdev = to_platform_device(dev);
++ struct iss_net_private *lp =
++ container_of(pdev, struct iss_net_private, pdev);
++
++ free_netdev(lp->dev);
++}
++
++static void iss_net_configure(int index, char *init)
+ {
+ struct net_device *dev;
+ struct iss_net_private *lp;
+- int err;
+
+ dev = alloc_etherdev(sizeof(*lp));
+ if (dev == NULL) {
+ pr_err("eth_configure: failed to allocate device\n");
+- return 1;
++ return;
+ }
+
+ /* Initialize private element. */
+@@ -541,7 +549,7 @@ static int iss_net_configure(int index, char *init)
+ if (!tuntap_probe(lp, index, init)) {
+ pr_err("%s: invalid arguments. Skipping device!\n",
+ dev->name);
+- goto errout;
++ goto err_free_netdev;
+ }
+
+ pr_info("Netdevice %d (%pM)\n", index, dev->dev_addr);
+@@ -549,7 +557,8 @@ static int iss_net_configure(int index, char *init)
+ /* sysfs register */
+
+ if (!driver_registered) {
+- platform_driver_register(&iss_net_driver);
++ if (platform_driver_register(&iss_net_driver))
++ goto err_free_netdev;
+ driver_registered = 1;
+ }
+
+@@ -559,7 +568,9 @@ static int iss_net_configure(int index, char *init)
+
+ lp->pdev.id = index;
+ lp->pdev.name = DRIVER_NAME;
+- platform_device_register(&lp->pdev);
++ lp->pdev.dev.release = iss_net_pdev_release;
++ if (platform_device_register(&lp->pdev))
++ goto err_free_netdev;
+ SET_NETDEV_DEV(dev, &lp->pdev.dev);
+
+ dev->netdev_ops = &iss_netdev_ops;
+@@ -568,23 +579,20 @@ static int iss_net_configure(int index, char *init)
+ dev->irq = -1;
+
+ rtnl_lock();
+- err = register_netdevice(dev);
+- rtnl_unlock();
+-
+- if (err) {
++ if (register_netdevice(dev)) {
++ rtnl_unlock();
+ pr_err("%s: error registering net device!\n", dev->name);
+- /* XXX: should we call ->remove() here? */
+- free_netdev(dev);
+- return 1;
++ platform_device_unregister(&lp->pdev);
++ return;
+ }
++ rtnl_unlock();
+
+ timer_setup(&lp->tl, iss_net_user_timer_expire, 0);
+
+- return 0;
++ return;
+
+-errout:
+- /* FIXME: unregister; free, etc.. */
+- return -EIO;
++err_free_netdev:
++ free_netdev(dev);
+ }
+
+ /* ------------------------------------------------------------------------- */
+diff --git a/block/bio.c b/block/bio.c
+index d3ca79c3ebdff..7d4d5723350bf 100644
+--- a/block/bio.c
++++ b/block/bio.c
+@@ -1129,6 +1129,37 @@ static void bio_put_pages(struct page **pages, size_t size, size_t off)
+ put_page(pages[i]);
+ }
+
++static int bio_iov_add_page(struct bio *bio, struct page *page,
++ unsigned int len, unsigned int offset)
++{
++ bool same_page = false;
++
++ if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
++ if (WARN_ON_ONCE(bio_full(bio, len)))
++ return -EINVAL;
++ __bio_add_page(bio, page, len, offset);
++ return 0;
++ }
++
++ if (same_page)
++ put_page(page);
++ return 0;
++}
++
++static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
++ unsigned int len, unsigned int offset)
++{
++ struct request_queue *q = bdev_get_queue(bio->bi_bdev);
++ bool same_page = false;
++
++ if (bio_add_hw_page(q, bio, page, len, offset,
++ queue_max_zone_append_sectors(q), &same_page) != len)
++ return -EINVAL;
++ if (same_page)
++ put_page(page);
++ return 0;
++}
++
+ #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
+
+ /**
+@@ -1147,61 +1178,11 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+ unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
+ struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
+ struct page **pages = (struct page **)bv;
+- bool same_page = false;
+- ssize_t size, left;
+- unsigned len, i;
+- size_t offset;
+-
+- /*
+- * Move page array up in the allocated memory for the bio vecs as far as
+- * possible so that we can start filling biovecs from the beginning
+- * without overwriting the temporary page array.
+- */
+- BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
+- pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
+-
+- size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
+- if (unlikely(size <= 0))
+- return size ? size : -EFAULT;
+-
+- for (left = size, i = 0; left > 0; left -= len, i++) {
+- struct page *page = pages[i];
+-
+- len = min_t(size_t, PAGE_SIZE - offset, left);
+-
+- if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
+- if (same_page)
+- put_page(page);
+- } else {
+- if (WARN_ON_ONCE(bio_full(bio, len))) {
+- bio_put_pages(pages + i, left, offset);
+- return -EINVAL;
+- }
+- __bio_add_page(bio, page, len, offset);
+- }
+- offset = 0;
+- }
+-
+- iov_iter_advance(iter, size);
+- return 0;
+-}
+-
+-static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
+-{
+- unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
+- unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
+- struct request_queue *q = bdev_get_queue(bio->bi_bdev);
+- unsigned int max_append_sectors = queue_max_zone_append_sectors(q);
+- struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
+- struct page **pages = (struct page **)bv;
+ ssize_t size, left;
+ unsigned len, i;
+ size_t offset;
+ int ret = 0;
+
+- if (WARN_ON_ONCE(!max_append_sectors))
+- return 0;
+-
+ /*
+ * Move page array up in the allocated memory for the bio vecs as far as
+ * possible so that we can start filling biovecs from the beginning
+@@ -1216,17 +1197,18 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
+
+ for (left = size, i = 0; left > 0; left -= len, i++) {
+ struct page *page = pages[i];
+- bool same_page = false;
+
+ len = min_t(size_t, PAGE_SIZE - offset, left);
+- if (bio_add_hw_page(q, bio, page, len, offset,
+- max_append_sectors, &same_page) != len) {
++ if (bio_op(bio) == REQ_OP_ZONE_APPEND)
++ ret = bio_iov_add_zone_append_page(bio, page, len,
++ offset);
++ else
++ ret = bio_iov_add_page(bio, page, len, offset);
++
++ if (ret) {
+ bio_put_pages(pages + i, left, offset);
+- ret = -EINVAL;
+ break;
+ }
+- if (same_page)
+- put_page(page);
+ offset = 0;
+ }
+
+@@ -1268,10 +1250,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+ }
+
+ do {
+- if (bio_op(bio) == REQ_OP_ZONE_APPEND)
+- ret = __bio_iov_append_get_pages(bio, iter);
+- else
+- ret = __bio_iov_iter_get_pages(bio, iter);
++ ret = __bio_iov_iter_get_pages(bio, iter);
+ } while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
+
+ /* don't account direct I/O as memory stall */
+diff --git a/block/blk-iocost.c b/block/blk-iocost.c
+index 16705fbd06991..a19f2db4eeb2e 100644
+--- a/block/blk-iocost.c
++++ b/block/blk-iocost.c
+@@ -2893,15 +2893,21 @@ static int blk_iocost_init(struct request_queue *q)
+ * called before policy activation completion, can't assume that the
+ * target bio has an iocg associated and need to test for NULL iocg.
+ */
+- rq_qos_add(q, rqos);
++ ret = rq_qos_add(q, rqos);
++ if (ret)
++ goto err_free_ioc;
++
+ ret = blkcg_activate_policy(q, &blkcg_policy_iocost);
+- if (ret) {
+- rq_qos_del(q, rqos);
+- free_percpu(ioc->pcpu_stat);
+- kfree(ioc);
+- return ret;
+- }
++ if (ret)
++ goto err_del_qos;
+ return 0;
++
++err_del_qos:
++ rq_qos_del(q, rqos);
++err_free_ioc:
++ free_percpu(ioc->pcpu_stat);
++ kfree(ioc);
++ return ret;
+ }
+
+ static struct blkcg_policy_data *ioc_cpd_alloc(gfp_t gfp)
+diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
+index 9568bf8dfe82b..7845dca5fcfdb 100644
+--- a/block/blk-iolatency.c
++++ b/block/blk-iolatency.c
+@@ -773,19 +773,23 @@ int blk_iolatency_init(struct request_queue *q)
+ rqos->ops = &blkcg_iolatency_ops;
+ rqos->q = q;
+
+- rq_qos_add(q, rqos);
+-
++ ret = rq_qos_add(q, rqos);
++ if (ret)
++ goto err_free;
+ ret = blkcg_activate_policy(q, &blkcg_policy_iolatency);
+- if (ret) {
+- rq_qos_del(q, rqos);
+- kfree(blkiolat);
+- return ret;
+- }
++ if (ret)
++ goto err_qos_del;
+
+ timer_setup(&blkiolat->timer, blkiolatency_timer_fn, 0);
+ INIT_WORK(&blkiolat->enable_work, blkiolatency_enable_work_fn);
+
+ return 0;
++
++err_qos_del:
++ rq_qos_del(q, rqos);
++err_free:
++ kfree(blkiolat);
++ return ret;
+ }
+
+ static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val)
+diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
+index aa0349e9f083b..d491b6eb0ab97 100644
+--- a/block/blk-mq-debugfs.c
++++ b/block/blk-mq-debugfs.c
+@@ -713,11 +713,6 @@ void blk_mq_debugfs_register(struct request_queue *q)
+ }
+ }
+
+-void blk_mq_debugfs_unregister(struct request_queue *q)
+-{
+- q->sched_debugfs_dir = NULL;
+-}
+-
+ static void blk_mq_debugfs_register_ctx(struct blk_mq_hw_ctx *hctx,
+ struct blk_mq_ctx *ctx)
+ {
+@@ -737,6 +732,9 @@ void blk_mq_debugfs_register_hctx(struct request_queue *q,
+ char name[20];
+ int i;
+
++ if (!q->debugfs_dir)
++ return;
++
+ snprintf(name, sizeof(name), "hctx%u", hctx->queue_num);
+ hctx->debugfs_dir = debugfs_create_dir(name, q->debugfs_dir);
+
+@@ -748,6 +746,8 @@ void blk_mq_debugfs_register_hctx(struct request_queue *q,
+
+ void blk_mq_debugfs_unregister_hctx(struct blk_mq_hw_ctx *hctx)
+ {
++ if (!hctx->queue->debugfs_dir)
++ return;
+ debugfs_remove_recursive(hctx->debugfs_dir);
+ hctx->sched_debugfs_dir = NULL;
+ hctx->debugfs_dir = NULL;
+@@ -775,6 +775,8 @@ void blk_mq_debugfs_register_sched(struct request_queue *q)
+ {
+ struct elevator_type *e = q->elevator->type;
+
++ lockdep_assert_held(&q->debugfs_mutex);
++
+ /*
+ * If the parent directory has not been created yet, return, we will be
+ * called again later on and the directory/files will be created then.
+@@ -792,6 +794,8 @@ void blk_mq_debugfs_register_sched(struct request_queue *q)
+
+ void blk_mq_debugfs_unregister_sched(struct request_queue *q)
+ {
++ lockdep_assert_held(&q->debugfs_mutex);
++
+ debugfs_remove_recursive(q->sched_debugfs_dir);
+ q->sched_debugfs_dir = NULL;
+ }
+@@ -813,6 +817,10 @@ static const char *rq_qos_id_to_name(enum rq_qos_id id)
+
+ void blk_mq_debugfs_unregister_rqos(struct rq_qos *rqos)
+ {
++ lockdep_assert_held(&rqos->q->debugfs_mutex);
++
++ if (!rqos->q->debugfs_dir)
++ return;
+ debugfs_remove_recursive(rqos->debugfs_dir);
+ rqos->debugfs_dir = NULL;
+ }
+@@ -822,6 +830,8 @@ void blk_mq_debugfs_register_rqos(struct rq_qos *rqos)
+ struct request_queue *q = rqos->q;
+ const char *dir_name = rq_qos_id_to_name(rqos->id);
+
++ lockdep_assert_held(&q->debugfs_mutex);
++
+ if (rqos->debugfs_dir || !rqos->ops->debugfs_attrs)
+ return;
+
+@@ -837,6 +847,8 @@ void blk_mq_debugfs_register_rqos(struct rq_qos *rqos)
+
+ void blk_mq_debugfs_unregister_queue_rqos(struct request_queue *q)
+ {
++ lockdep_assert_held(&q->debugfs_mutex);
++
+ debugfs_remove_recursive(q->rqos_debugfs_dir);
+ q->rqos_debugfs_dir = NULL;
+ }
+@@ -846,6 +858,8 @@ void blk_mq_debugfs_register_sched_hctx(struct request_queue *q,
+ {
+ struct elevator_type *e = q->elevator->type;
+
++ lockdep_assert_held(&q->debugfs_mutex);
++
+ /*
+ * If the parent debugfs directory has not been created yet, return;
+ * We will be called again later on with appropriate parent debugfs
+@@ -865,6 +879,10 @@ void blk_mq_debugfs_register_sched_hctx(struct request_queue *q,
+
+ void blk_mq_debugfs_unregister_sched_hctx(struct blk_mq_hw_ctx *hctx)
+ {
++ lockdep_assert_held(&hctx->queue->debugfs_mutex);
++
++ if (!hctx->queue->debugfs_dir)
++ return;
+ debugfs_remove_recursive(hctx->sched_debugfs_dir);
+ hctx->sched_debugfs_dir = NULL;
+ }
+diff --git a/block/blk-mq-debugfs.h b/block/blk-mq-debugfs.h
+index 69918f4170d69..771d458328788 100644
+--- a/block/blk-mq-debugfs.h
++++ b/block/blk-mq-debugfs.h
+@@ -21,7 +21,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq);
+ int blk_mq_debugfs_rq_show(struct seq_file *m, void *v);
+
+ void blk_mq_debugfs_register(struct request_queue *q);
+-void blk_mq_debugfs_unregister(struct request_queue *q);
+ void blk_mq_debugfs_register_hctx(struct request_queue *q,
+ struct blk_mq_hw_ctx *hctx);
+ void blk_mq_debugfs_unregister_hctx(struct blk_mq_hw_ctx *hctx);
+@@ -42,10 +41,6 @@ static inline void blk_mq_debugfs_register(struct request_queue *q)
+ {
+ }
+
+-static inline void blk_mq_debugfs_unregister(struct request_queue *q)
+-{
+-}
+-
+ static inline void blk_mq_debugfs_register_hctx(struct request_queue *q,
+ struct blk_mq_hw_ctx *hctx)
+ {
+diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
+index 9e56a69422b65..e84bec39fd3ad 100644
+--- a/block/blk-mq-sched.c
++++ b/block/blk-mq-sched.c
+@@ -593,7 +593,9 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
+ if (ret)
+ goto err_free_map_and_rqs;
+
++ mutex_lock(&q->debugfs_mutex);
+ blk_mq_debugfs_register_sched(q);
++ mutex_unlock(&q->debugfs_mutex);
+
+ queue_for_each_hw_ctx(q, hctx, i) {
+ if (e->ops.init_hctx) {
+@@ -606,7 +608,9 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
+ return ret;
+ }
+ }
++ mutex_lock(&q->debugfs_mutex);
+ blk_mq_debugfs_register_sched_hctx(q, hctx);
++ mutex_unlock(&q->debugfs_mutex);
+ }
+
+ return 0;
+@@ -647,14 +651,21 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e)
+ unsigned int flags = 0;
+
+ queue_for_each_hw_ctx(q, hctx, i) {
++ mutex_lock(&q->debugfs_mutex);
+ blk_mq_debugfs_unregister_sched_hctx(hctx);
++ mutex_unlock(&q->debugfs_mutex);
++
+ if (e->type->ops.exit_hctx && hctx->sched_data) {
+ e->type->ops.exit_hctx(hctx, i);
+ hctx->sched_data = NULL;
+ }
+ flags = hctx->flags;
+ }
++
++ mutex_lock(&q->debugfs_mutex);
+ blk_mq_debugfs_unregister_sched(q);
++ mutex_unlock(&q->debugfs_mutex);
++
+ if (e->type->ops.exit_sched)
+ e->type->ops.exit_sched(e);
+ blk_mq_sched_tags_teardown(q, flags);
+diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
+index e83af7bc75919..249a6f05dd3bd 100644
+--- a/block/blk-rq-qos.c
++++ b/block/blk-rq-qos.c
+@@ -294,7 +294,9 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
+
+ void rq_qos_exit(struct request_queue *q)
+ {
++ mutex_lock(&q->debugfs_mutex);
+ blk_mq_debugfs_unregister_queue_rqos(q);
++ mutex_unlock(&q->debugfs_mutex);
+
+ while (q->rq_qos) {
+ struct rq_qos *rqos = q->rq_qos;
+diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h
+index 68267007da1c6..08b856570ad10 100644
+--- a/block/blk-rq-qos.h
++++ b/block/blk-rq-qos.h
+@@ -86,7 +86,7 @@ static inline void rq_wait_init(struct rq_wait *rq_wait)
+ init_waitqueue_head(&rq_wait->wait);
+ }
+
+-static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
++static inline int rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
+ {
+ /*
+ * No IO can be in-flight when adding rqos, so freeze queue, which
+@@ -98,14 +98,26 @@ static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
+ blk_mq_freeze_queue(q);
+
+ spin_lock_irq(&q->queue_lock);
++ if (rq_qos_id(q, rqos->id))
++ goto ebusy;
+ rqos->next = q->rq_qos;
+ q->rq_qos = rqos;
+ spin_unlock_irq(&q->queue_lock);
+
+ blk_mq_unfreeze_queue(q);
+
+- if (rqos->ops->debugfs_attrs)
++ if (rqos->ops->debugfs_attrs) {
++ mutex_lock(&q->debugfs_mutex);
+ blk_mq_debugfs_register_rqos(rqos);
++ mutex_unlock(&q->debugfs_mutex);
++ }
++
++ return 0;
++ebusy:
++ spin_unlock_irq(&q->queue_lock);
++ blk_mq_unfreeze_queue(q);
++ return -EBUSY;
++
+ }
+
+ static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos)
+@@ -129,7 +141,9 @@ static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos)
+
+ blk_mq_unfreeze_queue(q);
+
++ mutex_lock(&q->debugfs_mutex);
+ blk_mq_debugfs_unregister_rqos(rqos);
++ mutex_unlock(&q->debugfs_mutex);
+ }
+
+ typedef bool (acquire_inflight_cb_t)(struct rq_wait *rqw, void *private_data);
+diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
+index 88bd41d4cb593..6e4801b217a79 100644
+--- a/block/blk-sysfs.c
++++ b/block/blk-sysfs.c
+@@ -779,14 +779,13 @@ static void blk_release_queue(struct kobject *kobj)
+ if (queue_is_mq(q))
+ blk_mq_release(q);
+
+- blk_trace_shutdown(q);
+ mutex_lock(&q->debugfs_mutex);
++ blk_trace_shutdown(q);
+ debugfs_remove_recursive(q->debugfs_dir);
++ q->debugfs_dir = NULL;
++ q->sched_debugfs_dir = NULL;
+ mutex_unlock(&q->debugfs_mutex);
+
+- if (queue_is_mq(q))
+- blk_mq_debugfs_unregister(q);
+-
+ bioset_exit(&q->bio_split);
+
+ if (blk_queue_has_srcu(q))
+@@ -836,17 +835,16 @@ int blk_register_queue(struct gendisk *disk)
+ goto unlock;
+ }
+
++ if (queue_is_mq(q))
++ __blk_mq_register_dev(dev, q);
++ mutex_lock(&q->sysfs_lock);
++
+ mutex_lock(&q->debugfs_mutex);
+ q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent),
+ blk_debugfs_root);
+- mutex_unlock(&q->debugfs_mutex);
+-
+- if (queue_is_mq(q)) {
+- __blk_mq_register_dev(dev, q);
++ if (queue_is_mq(q))
+ blk_mq_debugfs_register(q);
+- }
+-
+- mutex_lock(&q->sysfs_lock);
++ mutex_unlock(&q->debugfs_mutex);
+
+ ret = disk_register_independent_access_ranges(disk, NULL);
+ if (ret)
+diff --git a/block/blk-wbt.c b/block/blk-wbt.c
+index 0c119be0e8133..ae6ea0b545799 100644
+--- a/block/blk-wbt.c
++++ b/block/blk-wbt.c
+@@ -820,6 +820,7 @@ int wbt_init(struct request_queue *q)
+ {
+ struct rq_wb *rwb;
+ int i;
++ int ret;
+
+ rwb = kzalloc(sizeof(*rwb), GFP_KERNEL);
+ if (!rwb)
+@@ -846,7 +847,10 @@ int wbt_init(struct request_queue *q)
+ /*
+ * Assign rwb and add the stats callback.
+ */
+- rq_qos_add(q, &rwb->rqos);
++ ret = rq_qos_add(q, &rwb->rqos);
++ if (ret)
++ goto err_free;
++
+ blk_stat_add_callback(q, rwb->cb);
+
+ rwb->min_lat_nsec = wbt_default_latency_nsec(q);
+@@ -855,4 +859,10 @@ int wbt_init(struct request_queue *q)
+ wbt_set_write_cache(q, test_bit(QUEUE_FLAG_WC, &q->queue_flags));
+
+ return 0;
++
++err_free:
++ blk_stat_free_callback(rwb->cb);
++ kfree(rwb);
++ return ret;
++
+ }
+diff --git a/crypto/Kconfig b/crypto/Kconfig
+index b4e00a7a046b9..38601a072b994 100644
+--- a/crypto/Kconfig
++++ b/crypto/Kconfig
+@@ -692,26 +692,8 @@ config CRYPTO_BLAKE2B
+
+ See https://blake2.net for further information.
+
+-config CRYPTO_BLAKE2S
+- tristate "BLAKE2s digest algorithm"
+- select CRYPTO_LIB_BLAKE2S_GENERIC
+- select CRYPTO_HASH
+- help
+- Implementation of cryptographic hash function BLAKE2s
+- optimized for 8-32bit platforms and can produce digests of any size
+- between 1 to 32. The keyed hash is also implemented.
+-
+- This module provides the following algorithms:
+-
+- - blake2s-128
+- - blake2s-160
+- - blake2s-224
+- - blake2s-256
+-
+- See https://blake2.net for further information.
+-
+ config CRYPTO_BLAKE2S_X86
+- tristate "BLAKE2s digest algorithm (x86 accelerated version)"
++ bool "BLAKE2s digest algorithm (x86 accelerated version)"
+ depends on X86 && 64BIT
+ select CRYPTO_LIB_BLAKE2S_GENERIC
+ select CRYPTO_ARCH_HAVE_LIB_BLAKE2S
+diff --git a/crypto/Makefile b/crypto/Makefile
+index a40e6d5fb2c83..dbfa53567c92f 100644
+--- a/crypto/Makefile
++++ b/crypto/Makefile
+@@ -83,7 +83,6 @@ obj-$(CONFIG_CRYPTO_STREEBOG) += streebog_generic.o
+ obj-$(CONFIG_CRYPTO_WP512) += wp512.o
+ CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
+ obj-$(CONFIG_CRYPTO_BLAKE2B) += blake2b_generic.o
+-obj-$(CONFIG_CRYPTO_BLAKE2S) += blake2s_generic.o
+ obj-$(CONFIG_CRYPTO_GF128MUL) += gf128mul.o
+ obj-$(CONFIG_CRYPTO_ECB) += ecb.o
+ obj-$(CONFIG_CRYPTO_CBC) += cbc.o
+diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
+index 7c9e6be35c30c..2f8352e888602 100644
+--- a/crypto/asymmetric_keys/public_key.c
++++ b/crypto/asymmetric_keys/public_key.c
+@@ -304,6 +304,10 @@ static int cert_sig_digest_update(const struct public_key_signature *sig,
+
+ BUG_ON(!sig->data);
+
++ /* SM2 signatures always use the SM3 hash algorithm */
++ if (!sig->hash_algo || strcmp(sig->hash_algo, "sm3") != 0)
++ return -EINVAL;
++
+ ret = sm2_compute_z_digest(tfm_pkey, SM2_DEFAULT_USERID,
+ SM2_DEFAULT_USERID_LEN, dgst);
+ if (ret)
+@@ -414,8 +418,7 @@ int public_key_verify_signature(const struct public_key *pkey,
+ if (ret)
+ goto error_free_key;
+
+- if (sig->pkey_algo && strcmp(sig->pkey_algo, "sm2") == 0 &&
+- sig->data_size) {
++ if (strcmp(pkey->pkey_algo, "sm2") == 0 && sig->data_size) {
+ ret = cert_sig_digest_update(sig, tfm);
+ if (ret)
+ goto error_free_key;
+diff --git a/crypto/blake2s_generic.c b/crypto/blake2s_generic.c
+deleted file mode 100644
+index 5f96a21f87883..0000000000000
+--- a/crypto/blake2s_generic.c
++++ /dev/null
+@@ -1,75 +0,0 @@
+-// SPDX-License-Identifier: GPL-2.0 OR MIT
+-/*
+- * shash interface to the generic implementation of BLAKE2s
+- *
+- * Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+- */
+-
+-#include <crypto/internal/blake2s.h>
+-#include <crypto/internal/hash.h>
+-
+-#include <linux/types.h>
+-#include <linux/kernel.h>
+-#include <linux/module.h>
+-
+-static int crypto_blake2s_update_generic(struct shash_desc *desc,
+- const u8 *in, unsigned int inlen)
+-{
+- return crypto_blake2s_update(desc, in, inlen, true);
+-}
+-
+-static int crypto_blake2s_final_generic(struct shash_desc *desc, u8 *out)
+-{
+- return crypto_blake2s_final(desc, out, true);
+-}
+-
+-#define BLAKE2S_ALG(name, driver_name, digest_size) \
+- { \
+- .base.cra_name = name, \
+- .base.cra_driver_name = driver_name, \
+- .base.cra_priority = 100, \
+- .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
+- .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
+- .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
+- .base.cra_module = THIS_MODULE, \
+- .digestsize = digest_size, \
+- .setkey = crypto_blake2s_setkey, \
+- .init = crypto_blake2s_init, \
+- .update = crypto_blake2s_update_generic, \
+- .final = crypto_blake2s_final_generic, \
+- .descsize = sizeof(struct blake2s_state), \
+- }
+-
+-static struct shash_alg blake2s_algs[] = {
+- BLAKE2S_ALG("blake2s-128", "blake2s-128-generic",
+- BLAKE2S_128_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-160", "blake2s-160-generic",
+- BLAKE2S_160_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-224", "blake2s-224-generic",
+- BLAKE2S_224_HASH_SIZE),
+- BLAKE2S_ALG("blake2s-256", "blake2s-256-generic",
+- BLAKE2S_256_HASH_SIZE),
+-};
+-
+-static int __init blake2s_mod_init(void)
+-{
+- return crypto_register_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
+-}
+-
+-static void __exit blake2s_mod_exit(void)
+-{
+- crypto_unregister_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
+-}
+-
+-subsys_initcall(blake2s_mod_init);
+-module_exit(blake2s_mod_exit);
+-
+-MODULE_ALIAS_CRYPTO("blake2s-128");
+-MODULE_ALIAS_CRYPTO("blake2s-128-generic");
+-MODULE_ALIAS_CRYPTO("blake2s-160");
+-MODULE_ALIAS_CRYPTO("blake2s-160-generic");
+-MODULE_ALIAS_CRYPTO("blake2s-224");
+-MODULE_ALIAS_CRYPTO("blake2s-224-generic");
+-MODULE_ALIAS_CRYPTO("blake2s-256");
+-MODULE_ALIAS_CRYPTO("blake2s-256-generic");
+-MODULE_LICENSE("GPL v2");
+diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
+index 2bacf8384f59f..66b7ca1ccb23c 100644
+--- a/crypto/tcrypt.c
++++ b/crypto/tcrypt.c
+@@ -1669,10 +1669,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
+ ret += tcrypt_test("rmd160");
+ break;
+
+- case 41:
+- ret += tcrypt_test("blake2s-256");
+- break;
+-
+ case 42:
+ ret += tcrypt_test("blake2b-512");
+ break;
+@@ -2240,10 +2236,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
+ test_hash_speed("rmd160", sec, generic_hash_speed_template);
+ if (mode > 300 && mode < 400) break;
+ fallthrough;
+- case 316:
+- test_hash_speed("blake2s-256", sec, generic_hash_speed_template);
+- if (mode > 300 && mode < 400) break;
+- fallthrough;
+ case 317:
+ test_hash_speed("blake2b-512", sec, generic_hash_speed_template);
+ if (mode > 300 && mode < 400) break;
+@@ -2352,10 +2344,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
+ test_ahash_speed("rmd160", sec, generic_hash_speed_template);
+ if (mode > 400 && mode < 500) break;
+ fallthrough;
+- case 416:
+- test_ahash_speed("blake2s-256", sec, generic_hash_speed_template);
+- if (mode > 400 && mode < 500) break;
+- fallthrough;
+ case 417:
+ test_ahash_speed("blake2b-512", sec, generic_hash_speed_template);
+ if (mode > 400 && mode < 500) break;
+diff --git a/crypto/testmgr.c b/crypto/testmgr.c
+index 4948201065cc4..56facdb638430 100644
+--- a/crypto/testmgr.c
++++ b/crypto/testmgr.c
+@@ -4324,30 +4324,6 @@ static const struct alg_test_desc alg_test_descs[] = {
+ .suite = {
+ .hash = __VECS(blake2b_512_tv_template)
+ }
+- }, {
+- .alg = "blake2s-128",
+- .test = alg_test_hash,
+- .suite = {
+- .hash = __VECS(blakes2s_128_tv_template)
+- }
+- }, {
+- .alg = "blake2s-160",
+- .test = alg_test_hash,
+- .suite = {
+- .hash = __VECS(blakes2s_160_tv_template)
+- }
+- }, {
+- .alg = "blake2s-224",
+- .test = alg_test_hash,
+- .suite = {
+- .hash = __VECS(blakes2s_224_tv_template)
+- }
+- }, {
+- .alg = "blake2s-256",
+- .test = alg_test_hash,
+- .suite = {
+- .hash = __VECS(blakes2s_256_tv_template)
+- }
+ }, {
+ .alg = "cbc(aes)",
+ .test = alg_test_skcipher,
+diff --git a/crypto/testmgr.h b/crypto/testmgr.h
+index 4d7449fc6a655..c29658337d963 100644
+--- a/crypto/testmgr.h
++++ b/crypto/testmgr.h
+@@ -34034,221 +34034,4 @@ static const struct hash_testvec blake2b_512_tv_template[] = {{
+ 0xae, 0x15, 0x81, 0x15, 0xd0, 0x88, 0xa0, 0x3c, },
+ }};
+
+-static const struct hash_testvec blakes2s_128_tv_template[] = {{
+- .digest = (u8[]){ 0x64, 0x55, 0x0d, 0x6f, 0xfe, 0x2c, 0x0a, 0x01,
+- 0xa1, 0x4a, 0xba, 0x1e, 0xad, 0xe0, 0x20, 0x0c, },
+-}, {
+- .plaintext = blake2_ordered_sequence,
+- .psize = 64,
+- .digest = (u8[]){ 0xdc, 0x66, 0xca, 0x8f, 0x03, 0x86, 0x58, 0x01,
+- 0xb0, 0xff, 0xe0, 0x6e, 0xd8, 0xa1, 0xa9, 0x0e, },
+-}, {
+- .ksize = 16,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 1,
+- .digest = (u8[]){ 0x88, 0x1e, 0x42, 0xe7, 0xbb, 0x35, 0x80, 0x82,
+- 0x63, 0x7c, 0x0a, 0x0f, 0xd7, 0xec, 0x6c, 0x2f, },
+-}, {
+- .ksize = 32,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 7,
+- .digest = (u8[]){ 0xcf, 0x9e, 0x07, 0x2a, 0xd5, 0x22, 0xf2, 0xcd,
+- 0xa2, 0xd8, 0x25, 0x21, 0x80, 0x86, 0x73, 0x1c, },
+-}, {
+- .ksize = 1,
+- .key = "B",
+- .plaintext = blake2_ordered_sequence,
+- .psize = 15,
+- .digest = (u8[]){ 0xf6, 0x33, 0x5a, 0x2c, 0x22, 0xa0, 0x64, 0xb2,
+- 0xb6, 0x3f, 0xeb, 0xbc, 0xd1, 0xc3, 0xe5, 0xb2, },
+-}, {
+- .ksize = 16,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 247,
+- .digest = (u8[]){ 0x72, 0x66, 0x49, 0x60, 0xf9, 0x4a, 0xea, 0xbe,
+- 0x1f, 0xf4, 0x60, 0xce, 0xb7, 0x81, 0xcb, 0x09, },
+-}, {
+- .ksize = 32,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 256,
+- .digest = (u8[]){ 0xd5, 0xa4, 0x0e, 0xc3, 0x16, 0xc7, 0x51, 0xa6,
+- 0x3c, 0xd0, 0xd9, 0x11, 0x57, 0xfa, 0x1e, 0xbb, },
+-}};
+-
+-static const struct hash_testvec blakes2s_160_tv_template[] = {{
+- .plaintext = blake2_ordered_sequence,
+- .psize = 7,
+- .digest = (u8[]){ 0xb4, 0xf2, 0x03, 0x49, 0x37, 0xed, 0xb1, 0x3e,
+- 0x5b, 0x2a, 0xca, 0x64, 0x82, 0x74, 0xf6, 0x62,
+- 0xe3, 0xf2, 0x84, 0xff, },
+-}, {
+- .plaintext = blake2_ordered_sequence,
+- .psize = 256,
+- .digest = (u8[]){ 0xaa, 0x56, 0x9b, 0xdc, 0x98, 0x17, 0x75, 0xf2,
+- 0xb3, 0x68, 0x83, 0xb7, 0x9b, 0x8d, 0x48, 0xb1,
+- 0x9b, 0x2d, 0x35, 0x05, },
+-}, {
+- .ksize = 1,
+- .key = "B",
+- .digest = (u8[]){ 0x50, 0x16, 0xe7, 0x0c, 0x01, 0xd0, 0xd3, 0xc3,
+- 0xf4, 0x3e, 0xb1, 0x6e, 0x97, 0xa9, 0x4e, 0xd1,
+- 0x79, 0x65, 0x32, 0x93, },
+-}, {
+- .ksize = 32,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 1,
+- .digest = (u8[]){ 0x1c, 0x2b, 0xcd, 0x9a, 0x68, 0xca, 0x8c, 0x71,
+- 0x90, 0x29, 0x6c, 0x54, 0xfa, 0x56, 0x4a, 0xef,
+- 0xa2, 0x3a, 0x56, 0x9c, },
+-}, {
+- .ksize = 16,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 15,
+- .digest = (u8[]){ 0x36, 0xc3, 0x5f, 0x9a, 0xdc, 0x7e, 0xbf, 0x19,
+- 0x68, 0xaa, 0xca, 0xd8, 0x81, 0xbf, 0x09, 0x34,
+- 0x83, 0x39, 0x0f, 0x30, },
+-}, {
+- .ksize = 1,
+- .key = "B",
+- .plaintext = blake2_ordered_sequence,
+- .psize = 64,
+- .digest = (u8[]){ 0x86, 0x80, 0x78, 0xa4, 0x14, 0xec, 0x03, 0xe5,
+- 0xb6, 0x9a, 0x52, 0x0e, 0x42, 0xee, 0x39, 0x9d,
+- 0xac, 0xa6, 0x81, 0x63, },
+-}, {
+- .ksize = 32,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 247,
+- .digest = (u8[]){ 0x2d, 0xd8, 0xd2, 0x53, 0x66, 0xfa, 0xa9, 0x01,
+- 0x1c, 0x9c, 0xaf, 0xa3, 0xe2, 0x9d, 0x9b, 0x10,
+- 0x0a, 0xf6, 0x73, 0xe8, },
+-}};
+-
+-static const struct hash_testvec blakes2s_224_tv_template[] = {{
+- .plaintext = blake2_ordered_sequence,
+- .psize = 1,
+- .digest = (u8[]){ 0x61, 0xb9, 0x4e, 0xc9, 0x46, 0x22, 0xa3, 0x91,
+- 0xd2, 0xae, 0x42, 0xe6, 0x45, 0x6c, 0x90, 0x12,
+- 0xd5, 0x80, 0x07, 0x97, 0xb8, 0x86, 0x5a, 0xfc,
+- 0x48, 0x21, 0x97, 0xbb, },
+-}, {
+- .plaintext = blake2_ordered_sequence,
+- .psize = 247,
+- .digest = (u8[]){ 0x9e, 0xda, 0xc7, 0x20, 0x2c, 0xd8, 0x48, 0x2e,
+- 0x31, 0x94, 0xab, 0x46, 0x6d, 0x94, 0xd8, 0xb4,
+- 0x69, 0xcd, 0xae, 0x19, 0x6d, 0x9e, 0x41, 0xcc,
+- 0x2b, 0xa4, 0xd5, 0xf6, },
+-}, {
+- .ksize = 16,
+- .key = blake2_ordered_sequence,
+- .digest = (u8[]){ 0x32, 0xc0, 0xac, 0xf4, 0x3b, 0xd3, 0x07, 0x9f,
+- 0xbe, 0xfb, 0xfa, 0x4d, 0x6b, 0x4e, 0x56, 0xb3,
+- 0xaa, 0xd3, 0x27, 0xf6, 0x14, 0xbf, 0xb9, 0x32,
+- 0xa7, 0x19, 0xfc, 0xb8, },
+-}, {
+- .ksize = 1,
+- .key = "B",
+- .plaintext = blake2_ordered_sequence,
+- .psize = 7,
+- .digest = (u8[]){ 0x73, 0xad, 0x5e, 0x6d, 0xb9, 0x02, 0x8e, 0x76,
+- 0xf2, 0x66, 0x42, 0x4b, 0x4c, 0xfa, 0x1f, 0xe6,
+- 0x2e, 0x56, 0x40, 0xe5, 0xa2, 0xb0, 0x3c, 0xe8,
+- 0x7b, 0x45, 0xfe, 0x05, },
+-}, {
+- .ksize = 32,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 15,
+- .digest = (u8[]){ 0x16, 0x60, 0xfb, 0x92, 0x54, 0xb3, 0x6e, 0x36,
+- 0x81, 0xf4, 0x16, 0x41, 0xc3, 0x3d, 0xd3, 0x43,
+- 0x84, 0xed, 0x10, 0x6f, 0x65, 0x80, 0x7a, 0x3e,
+- 0x25, 0xab, 0xc5, 0x02, },
+-}, {
+- .ksize = 16,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 64,
+- .digest = (u8[]){ 0xca, 0xaa, 0x39, 0x67, 0x9c, 0xf7, 0x6b, 0xc7,
+- 0xb6, 0x82, 0xca, 0x0e, 0x65, 0x36, 0x5b, 0x7c,
+- 0x24, 0x00, 0xfa, 0x5f, 0xda, 0x06, 0x91, 0x93,
+- 0x6a, 0x31, 0x83, 0xb5, },
+-}, {
+- .ksize = 1,
+- .key = "B",
+- .plaintext = blake2_ordered_sequence,
+- .psize = 256,
+- .digest = (u8[]){ 0x90, 0x02, 0x26, 0xb5, 0x06, 0x9c, 0x36, 0x86,
+- 0x94, 0x91, 0x90, 0x1e, 0x7d, 0x2a, 0x71, 0xb2,
+- 0x48, 0xb5, 0xe8, 0x16, 0xfd, 0x64, 0x33, 0x45,
+- 0xb3, 0xd7, 0xec, 0xcc, },
+-}};
+-
+-static const struct hash_testvec blakes2s_256_tv_template[] = {{
+- .plaintext = blake2_ordered_sequence,
+- .psize = 15,
+- .digest = (u8[]){ 0xd9, 0x7c, 0x82, 0x8d, 0x81, 0x82, 0xa7, 0x21,
+- 0x80, 0xa0, 0x6a, 0x78, 0x26, 0x83, 0x30, 0x67,
+- 0x3f, 0x7c, 0x4e, 0x06, 0x35, 0x94, 0x7c, 0x04,
+- 0xc0, 0x23, 0x23, 0xfd, 0x45, 0xc0, 0xa5, 0x2d, },
+-}, {
+- .ksize = 32,
+- .key = blake2_ordered_sequence,
+- .digest = (u8[]){ 0x48, 0xa8, 0x99, 0x7d, 0xa4, 0x07, 0x87, 0x6b,
+- 0x3d, 0x79, 0xc0, 0xd9, 0x23, 0x25, 0xad, 0x3b,
+- 0x89, 0xcb, 0xb7, 0x54, 0xd8, 0x6a, 0xb7, 0x1a,
+- 0xee, 0x04, 0x7a, 0xd3, 0x45, 0xfd, 0x2c, 0x49, },
+-}, {
+- .ksize = 1,
+- .key = "B",
+- .plaintext = blake2_ordered_sequence,
+- .psize = 1,
+- .digest = (u8[]){ 0x22, 0x27, 0xae, 0xaa, 0x6e, 0x81, 0x56, 0x03,
+- 0xa7, 0xe3, 0xa1, 0x18, 0xa5, 0x9a, 0x2c, 0x18,
+- 0xf4, 0x63, 0xbc, 0x16, 0x70, 0xf1, 0xe7, 0x4b,
+- 0x00, 0x6d, 0x66, 0x16, 0xae, 0x9e, 0x74, 0x4e, },
+-}, {
+- .ksize = 16,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 7,
+- .digest = (u8[]){ 0x58, 0x5d, 0xa8, 0x60, 0x1c, 0xa4, 0xd8, 0x03,
+- 0x86, 0x86, 0x84, 0x64, 0xd7, 0xa0, 0x8e, 0x15,
+- 0x2f, 0x05, 0xa2, 0x1b, 0xbc, 0xef, 0x7a, 0x34,
+- 0xb3, 0xc5, 0xbc, 0x4b, 0xf0, 0x32, 0xeb, 0x12, },
+-}, {
+- .ksize = 32,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 64,
+- .digest = (u8[]){ 0x89, 0x75, 0xb0, 0x57, 0x7f, 0xd3, 0x55, 0x66,
+- 0xd7, 0x50, 0xb3, 0x62, 0xb0, 0x89, 0x7a, 0x26,
+- 0xc3, 0x99, 0x13, 0x6d, 0xf0, 0x7b, 0xab, 0xab,
+- 0xbd, 0xe6, 0x20, 0x3f, 0xf2, 0x95, 0x4e, 0xd4, },
+-}, {
+- .ksize = 1,
+- .key = "B",
+- .plaintext = blake2_ordered_sequence,
+- .psize = 247,
+- .digest = (u8[]){ 0x2e, 0x74, 0x1c, 0x1d, 0x03, 0xf4, 0x9d, 0x84,
+- 0x6f, 0xfc, 0x86, 0x32, 0x92, 0x49, 0x7e, 0x66,
+- 0xd7, 0xc3, 0x10, 0x88, 0xfe, 0x28, 0xb3, 0xe0,
+- 0xbf, 0x50, 0x75, 0xad, 0x8e, 0xa4, 0xe6, 0xb2, },
+-}, {
+- .ksize = 16,
+- .key = blake2_ordered_sequence,
+- .plaintext = blake2_ordered_sequence,
+- .psize = 256,
+- .digest = (u8[]){ 0xb9, 0xd2, 0x81, 0x0e, 0x3a, 0xb1, 0x62, 0x9b,
+- 0xad, 0x44, 0x05, 0xf4, 0x92, 0x2e, 0x99, 0xc1,
+- 0x4a, 0x47, 0xbb, 0x5b, 0x6f, 0xb2, 0x96, 0xed,
+- 0xd5, 0x06, 0xb5, 0x3a, 0x7c, 0x7a, 0x65, 0x1d, },
+-}};
+-
+ #endif /* _CRYPTO_TESTMGR_H */
+diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c
+index fbe0756259c5a..c4d4d21391d7b 100644
+--- a/drivers/acpi/acpi_lpss.c
++++ b/drivers/acpi/acpi_lpss.c
+@@ -422,6 +422,9 @@ static int register_device_clock(struct acpi_device *adev,
+ if (!lpss_clk_dev)
+ lpt_register_clock_device();
+
++ if (IS_ERR(lpss_clk_dev))
++ return PTR_ERR(lpss_clk_dev);
++
+ clk_data = platform_get_drvdata(lpss_clk_dev);
+ if (!clk_data)
+ return -ENODEV;
+diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
+index 95cc2a9f3e058..49b5e317e9168 100644
+--- a/drivers/acpi/apei/einj.c
++++ b/drivers/acpi/apei/einj.c
+@@ -546,6 +546,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
+ != REGION_INTERSECTS) &&
+ (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY)
+ != REGION_INTERSECTS) &&
++ (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_SOFT_RESERVED)
++ != REGION_INTERSECTS) &&
+ !arch_is_platform_page(base_addr)))
+ return -EINVAL;
+
+diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
+index 6c735cfa7d433..ef7858393a3ce 100644
+--- a/drivers/acpi/bus.c
++++ b/drivers/acpi/bus.c
+@@ -1373,6 +1373,7 @@ static int __init acpi_init(void)
+
+ pci_mmcfg_late_init();
+ acpi_iort_init();
++ acpi_viot_early_init();
+ acpi_hest_init();
+ acpi_ghes_init();
+ acpi_scan_init();
+diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
+index b8e26b6b55236..35d894674eba3 100644
+--- a/drivers/acpi/cppc_acpi.c
++++ b/drivers/acpi/cppc_acpi.c
+@@ -600,33 +600,6 @@ static int pcc_data_alloc(int pcc_ss_id)
+ return 0;
+ }
+
+-/* Check if CPPC revision + num_ent combination is supported */
+-static bool is_cppc_supported(int revision, int num_ent)
+-{
+- int expected_num_ent;
+-
+- switch (revision) {
+- case CPPC_V2_REV:
+- expected_num_ent = CPPC_V2_NUM_ENT;
+- break;
+- case CPPC_V3_REV:
+- expected_num_ent = CPPC_V3_NUM_ENT;
+- break;
+- default:
+- pr_debug("Firmware exports unsupported CPPC revision: %d\n",
+- revision);
+- return false;
+- }
+-
+- if (expected_num_ent != num_ent) {
+- pr_debug("Firmware exports %d entries. Expected: %d for CPPC rev:%d\n",
+- num_ent, expected_num_ent, revision);
+- return false;
+- }
+-
+- return true;
+-}
+-
+ /*
+ * An example CPC table looks like the following.
+ *
+@@ -715,7 +688,6 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+ cpc_obj->type, pr->id);
+ goto out_free;
+ }
+- cpc_ptr->num_entries = num_ent;
+
+ /* Second entry should be revision. */
+ cpc_obj = &out_obj->package.elements[1];
+@@ -726,10 +698,32 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+ cpc_obj->type, pr->id);
+ goto out_free;
+ }
+- cpc_ptr->version = cpc_rev;
+
+- if (!is_cppc_supported(cpc_rev, num_ent))
++ if (cpc_rev < CPPC_V2_REV) {
++ pr_debug("Unsupported _CPC Revision (%d) for CPU:%d\n", cpc_rev,
++ pr->id);
++ goto out_free;
++ }
++
++ /*
++ * Disregard _CPC if the number of entries in the return pachage is not
++ * as expected, but support future revisions being proper supersets of
++ * the v3 and only causing more entries to be returned by _CPC.
++ */
++ if ((cpc_rev == CPPC_V2_REV && num_ent != CPPC_V2_NUM_ENT) ||
++ (cpc_rev == CPPC_V3_REV && num_ent != CPPC_V3_NUM_ENT) ||
++ (cpc_rev > CPPC_V3_REV && num_ent <= CPPC_V3_NUM_ENT)) {
++ pr_debug("Unexpected number of _CPC return package entries (%d) for CPU:%d\n",
++ num_ent, pr->id);
+ goto out_free;
++ }
++ if (cpc_rev > CPPC_V3_REV) {
++ num_ent = CPPC_V3_NUM_ENT;
++ cpc_rev = CPPC_V3_REV;
++ }
++
++ cpc_ptr->num_entries = num_ent;
++ cpc_ptr->version = cpc_rev;
+
+ /* Iterate through remaining entries in _CPC */
+ for (i = 2; i < num_ent; i++) {
+diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
+index a1b871a418f87..488c9ec0da0bc 100644
+--- a/drivers/acpi/ec.c
++++ b/drivers/acpi/ec.c
+@@ -180,7 +180,6 @@ static struct workqueue_struct *ec_wq;
+ static struct workqueue_struct *ec_query_wq;
+
+ static int EC_FLAGS_CORRECT_ECDT; /* Needs ECDT port address correction */
+-static int EC_FLAGS_IGNORE_DSDT_GPE; /* Needs ECDT GPE as correction setting */
+ static int EC_FLAGS_TRUST_DSDT_GPE; /* Needs DSDT GPE as correction setting */
+ static int EC_FLAGS_CLEAR_ON_RESUME; /* Needs acpi_ec_clear() on boot/resume */
+
+@@ -1407,24 +1406,16 @@ ec_parse_device(acpi_handle handle, u32 Level, void *context, void **retval)
+ if (ec->data_addr == 0 || ec->command_addr == 0)
+ return AE_OK;
+
+- if (boot_ec && boot_ec_is_ecdt && EC_FLAGS_IGNORE_DSDT_GPE) {
+- /*
+- * Always inherit the GPE number setting from the ECDT
+- * EC.
+- */
+- ec->gpe = boot_ec->gpe;
+- } else {
+- /* Get GPE bit assignment (EC events). */
+- /* TODO: Add support for _GPE returning a package */
+- status = acpi_evaluate_integer(handle, "_GPE", NULL, &tmp);
+- if (ACPI_SUCCESS(status))
+- ec->gpe = tmp;
++ /* Get GPE bit assignment (EC events). */
++ /* TODO: Add support for _GPE returning a package */
++ status = acpi_evaluate_integer(handle, "_GPE", NULL, &tmp);
++ if (ACPI_SUCCESS(status))
++ ec->gpe = tmp;
++ /*
++ * Errors are non-fatal, allowing for ACPI Reduced Hardware
++ * platforms which use GpioInt instead of GPE.
++ */
+
+- /*
+- * Errors are non-fatal, allowing for ACPI Reduced Hardware
+- * platforms which use GpioInt instead of GPE.
+- */
+- }
+ /* Use the global lock for all EC transactions? */
+ tmp = 0;
+ acpi_evaluate_integer(handle, "_GLK", NULL, &tmp);
+@@ -1862,60 +1853,12 @@ static int ec_honor_dsdt_gpe(const struct dmi_system_id *id)
+ return 0;
+ }
+
+-/*
+- * Some DSDTs contain wrong GPE setting.
+- * Asus FX502VD/VE, GL702VMK, X550VXK, X580VD
+- * https://bugzilla.kernel.org/show_bug.cgi?id=195651
+- */
+-static int ec_honor_ecdt_gpe(const struct dmi_system_id *id)
+-{
+- pr_debug("Detected system needing ignore DSDT GPE setting.\n");
+- EC_FLAGS_IGNORE_DSDT_GPE = 1;
+- return 0;
+-}
+-
+ static const struct dmi_system_id ec_dmi_table[] __initconst = {
+ {
+ ec_correct_ecdt, "MSI MS-171F", {
+ DMI_MATCH(DMI_SYS_VENDOR, "Micro-Star"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "MS-171F"),}, NULL},
+ {
+- ec_honor_ecdt_gpe, "ASUS FX502VD", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "FX502VD"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUS FX502VE", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "FX502VE"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUS GL702VMK", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "GL702VMK"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X505BA", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "X505BA"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X505BP", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "X505BP"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X542BA", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "X542BA"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X542BP", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "X542BP"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUS X550VXK", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "X550VXK"),}, NULL},
+- {
+- ec_honor_ecdt_gpe, "ASUS X580VD", {
+- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+- DMI_MATCH(DMI_PRODUCT_NAME, "X580VD"),}, NULL},
+- {
+ /* https://bugzilla.kernel.org/show_bug.cgi?id=209989 */
+ ec_honor_dsdt_gpe, "HP Pavilion Gaming Laptop 15-cx0xxx", {
+ DMI_MATCH(DMI_SYS_VENDOR, "HP"),
+@@ -2207,13 +2150,6 @@ static const struct dmi_system_id acpi_ec_no_wakeup[] = {
+ DMI_MATCH(DMI_PRODUCT_FAMILY, "Thinkpad X1 Carbon 6th"),
+ },
+ },
+- {
+- .ident = "ThinkPad X1 Carbon 6th",
+- .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_FAMILY, "ThinkPad X1 Carbon 6th"),
+- },
+- },
+ {
+ .ident = "ThinkPad X1 Yoga 3rd",
+ .matches = {
+diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
+index eb95e188d62bc..573ba7f617e8a 100644
+--- a/drivers/acpi/processor_idle.c
++++ b/drivers/acpi/processor_idle.c
+@@ -602,7 +602,7 @@ static DEFINE_RAW_SPINLOCK(c3_lock);
+ * @cx: Target state context
+ * @index: index of target state
+ */
+-static int acpi_idle_enter_bm(struct cpuidle_driver *drv,
++static int __cpuidle acpi_idle_enter_bm(struct cpuidle_driver *drv,
+ struct acpi_processor *pr,
+ struct acpi_processor_cx *cx,
+ int index)
+@@ -659,7 +659,7 @@ static int acpi_idle_enter_bm(struct cpuidle_driver *drv,
+ return index;
+ }
+
+-static int acpi_idle_enter(struct cpuidle_device *dev,
++static int __cpuidle acpi_idle_enter(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index)
+ {
+ struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
+@@ -688,7 +688,7 @@ static int acpi_idle_enter(struct cpuidle_device *dev,
+ return index;
+ }
+
+-static int acpi_idle_enter_s2idle(struct cpuidle_device *dev,
++static int __cpuidle acpi_idle_enter_s2idle(struct cpuidle_device *dev,
+ struct cpuidle_driver *drv, int index)
+ {
+ struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
+diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
+index 3147702710afe..1ec3238e2cdc9 100644
+--- a/drivers/acpi/sleep.c
++++ b/drivers/acpi/sleep.c
+@@ -360,6 +360,14 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "80E3"),
+ },
+ },
++ {
++ .callback = init_nvs_save_s3,
++ .ident = "Lenovo G40-45",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "80E1"),
++ },
++ },
+ /*
+ * ThinkPad X1 Tablet(2016) cannot do suspend-to-idle using
+ * the Low Power S0 Idle firmware interface (see
+diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
+index 6615f59ab7fd2..5d7f38016a243 100644
+--- a/drivers/acpi/video_detect.c
++++ b/drivers/acpi/video_detect.c
+@@ -347,6 +347,14 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "MacBookPro12,1"),
+ },
+ },
++ {
++ .callback = video_detect_force_native,
++ /* Dell Inspiron N4010 */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron N4010"),
++ },
++ },
+ {
+ .callback = video_detect_force_native,
+ /* Dell Vostro V131 */
+diff --git a/drivers/acpi/viot.c b/drivers/acpi/viot.c
+index d2256326c73ae..647f11cf165d7 100644
+--- a/drivers/acpi/viot.c
++++ b/drivers/acpi/viot.c
+@@ -248,6 +248,26 @@ err_free:
+ return ret;
+ }
+
++/**
++ * acpi_viot_early_init - Test the presence of VIOT and enable ACS
++ *
++ * If the VIOT does exist, ACS must be enabled. This cannot be
++ * done in acpi_viot_init() which is called after the bus scan
++ */
++void __init acpi_viot_early_init(void)
++{
++#ifdef CONFIG_PCI
++ acpi_status status;
++ struct acpi_table_header *hdr;
++
++ status = acpi_get_table(ACPI_SIG_VIOT, 0, &hdr);
++ if (ACPI_FAILURE(status))
++ return;
++ pci_request_acs();
++ acpi_put_table(hdr);
++#endif
++}
++
+ /**
+ * acpi_viot_init - Parse the VIOT table
+ *
+@@ -319,12 +339,6 @@ static int viot_pci_dev_iommu_init(struct pci_dev *pdev, u16 dev_id, void *data)
+ epid = ((domain_nr - ep->segment_start) << 16) +
+ dev_id - ep->bdf_start + ep->endpoint_id;
+
+- /*
+- * If we found a PCI range managed by the viommu, we're
+- * the one that has to request ACS.
+- */
+- pci_request_acs();
+-
+ return viot_dev_iommu_init(&pdev->dev, ep->viommu,
+ epid);
+ }
+diff --git a/drivers/android/binder.c b/drivers/android/binder.c
+index f3b639e89dd88..5243fe0eb4029 100644
+--- a/drivers/android/binder.c
++++ b/drivers/android/binder.c
+@@ -170,8 +170,32 @@ static inline void binder_stats_created(enum binder_stat_types type)
+ atomic_inc(&binder_stats.obj_created[type]);
+ }
+
+-struct binder_transaction_log binder_transaction_log;
+-struct binder_transaction_log binder_transaction_log_failed;
++struct binder_transaction_log_entry {
++ int debug_id;
++ int debug_id_done;
++ int call_type;
++ int from_proc;
++ int from_thread;
++ int target_handle;
++ int to_proc;
++ int to_thread;
++ int to_node;
++ int data_size;
++ int offsets_size;
++ int return_error_line;
++ uint32_t return_error;
++ uint32_t return_error_param;
++ char context_name[BINDERFS_MAX_NAME + 1];
++};
++
++struct binder_transaction_log {
++ atomic_t cur;
++ bool full;
++ struct binder_transaction_log_entry entry[32];
++};
++
++static struct binder_transaction_log binder_transaction_log;
++static struct binder_transaction_log binder_transaction_log_failed;
+
+ static struct binder_transaction_log_entry *binder_transaction_log_add(
+ struct binder_transaction_log *log)
+@@ -6084,8 +6108,7 @@ static void print_binder_proc_stats(struct seq_file *m,
+ print_binder_stats(m, " ", &proc->stats);
+ }
+
+-
+-int binder_state_show(struct seq_file *m, void *unused)
++static int state_show(struct seq_file *m, void *unused)
+ {
+ struct binder_proc *proc;
+ struct binder_node *node;
+@@ -6124,7 +6147,7 @@ int binder_state_show(struct seq_file *m, void *unused)
+ return 0;
+ }
+
+-int binder_stats_show(struct seq_file *m, void *unused)
++static int stats_show(struct seq_file *m, void *unused)
+ {
+ struct binder_proc *proc;
+
+@@ -6140,7 +6163,7 @@ int binder_stats_show(struct seq_file *m, void *unused)
+ return 0;
+ }
+
+-int binder_transactions_show(struct seq_file *m, void *unused)
++static int transactions_show(struct seq_file *m, void *unused)
+ {
+ struct binder_proc *proc;
+
+@@ -6196,7 +6219,7 @@ static void print_binder_transaction_log_entry(struct seq_file *m,
+ "\n" : " (incomplete)\n");
+ }
+
+-int binder_transaction_log_show(struct seq_file *m, void *unused)
++static int transaction_log_show(struct seq_file *m, void *unused)
+ {
+ struct binder_transaction_log *log = m->private;
+ unsigned int log_cur = atomic_read(&log->cur);
+@@ -6228,6 +6251,45 @@ const struct file_operations binder_fops = {
+ .release = binder_release,
+ };
+
++DEFINE_SHOW_ATTRIBUTE(state);
++DEFINE_SHOW_ATTRIBUTE(stats);
++DEFINE_SHOW_ATTRIBUTE(transactions);
++DEFINE_SHOW_ATTRIBUTE(transaction_log);
++
++const struct binder_debugfs_entry binder_debugfs_entries[] = {
++ {
++ .name = "state",
++ .mode = 0444,
++ .fops = &state_fops,
++ .data = NULL,
++ },
++ {
++ .name = "stats",
++ .mode = 0444,
++ .fops = &stats_fops,
++ .data = NULL,
++ },
++ {
++ .name = "transactions",
++ .mode = 0444,
++ .fops = &transactions_fops,
++ .data = NULL,
++ },
++ {
++ .name = "transaction_log",
++ .mode = 0444,
++ .fops = &transaction_log_fops,
++ .data = &binder_transaction_log,
++ },
++ {
++ .name = "failed_transaction_log",
++ .mode = 0444,
++ .fops = &transaction_log_fops,
++ .data = &binder_transaction_log_failed,
++ },
++ {} /* terminator */
++};
++
+ static int __init init_binder_device(const char *name)
+ {
+ int ret;
+@@ -6273,36 +6335,18 @@ static int __init binder_init(void)
+ atomic_set(&binder_transaction_log_failed.cur, ~0U);
+
+ binder_debugfs_dir_entry_root = debugfs_create_dir("binder", NULL);
+- if (binder_debugfs_dir_entry_root)
++ if (binder_debugfs_dir_entry_root) {
++ const struct binder_debugfs_entry *db_entry;
++
++ binder_for_each_debugfs_entry(db_entry)
++ debugfs_create_file(db_entry->name,
++ db_entry->mode,
++ binder_debugfs_dir_entry_root,
++ db_entry->data,
++ db_entry->fops);
++
+ binder_debugfs_dir_entry_proc = debugfs_create_dir("proc",
+ binder_debugfs_dir_entry_root);
+-
+- if (binder_debugfs_dir_entry_root) {
+- debugfs_create_file("state",
+- 0444,
+- binder_debugfs_dir_entry_root,
+- NULL,
+- &binder_state_fops);
+- debugfs_create_file("stats",
+- 0444,
+- binder_debugfs_dir_entry_root,
+- NULL,
+- &binder_stats_fops);
+- debugfs_create_file("transactions",
+- 0444,
+- binder_debugfs_dir_entry_root,
+- NULL,
+- &binder_transactions_fops);
+- debugfs_create_file("transaction_log",
+- 0444,
+- binder_debugfs_dir_entry_root,
+- &binder_transaction_log,
+- &binder_transaction_log_fops);
+- debugfs_create_file("failed_transaction_log",
+- 0444,
+- binder_debugfs_dir_entry_root,
+- &binder_transaction_log_failed,
+- &binder_transaction_log_fops);
+ }
+
+ if (!IS_ENABLED(CONFIG_ANDROID_BINDERFS) &&
+diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
+index 2ac1008a5f396..a93a7d2a8caff 100644
+--- a/drivers/android/binder_alloc.c
++++ b/drivers/android/binder_alloc.c
+@@ -213,7 +213,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
+
+ if (mm) {
+ mmap_read_lock(mm);
+- vma = alloc->vma;
++ vma = vma_lookup(mm, alloc->vma_addr);
+ }
+
+ if (!vma && need_mm) {
+@@ -313,16 +313,15 @@ err_no_vma:
+ static inline void binder_alloc_set_vma(struct binder_alloc *alloc,
+ struct vm_area_struct *vma)
+ {
+- if (vma)
++ unsigned long vm_start = 0;
++
++ if (vma) {
++ vm_start = vma->vm_start;
+ alloc->vma_vm_mm = vma->vm_mm;
+- /*
+- * If we see alloc->vma is not NULL, buffer data structures set up
+- * completely. Look at smp_rmb side binder_alloc_get_vma.
+- * We also want to guarantee new alloc->vma_vm_mm is always visible
+- * if alloc->vma is set.
+- */
+- smp_wmb();
+- alloc->vma = vma;
++ }
++
++ mmap_assert_write_locked(alloc->vma_vm_mm);
++ alloc->vma_addr = vm_start;
+ }
+
+ static inline struct vm_area_struct *binder_alloc_get_vma(
+@@ -330,11 +329,9 @@ static inline struct vm_area_struct *binder_alloc_get_vma(
+ {
+ struct vm_area_struct *vma = NULL;
+
+- if (alloc->vma) {
+- /* Look at description in binder_alloc_set_vma */
+- smp_rmb();
+- vma = alloc->vma;
+- }
++ if (alloc->vma_addr)
++ vma = vma_lookup(alloc->vma_vm_mm, alloc->vma_addr);
++
+ return vma;
+ }
+
+@@ -817,7 +814,8 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
+
+ buffers = 0;
+ mutex_lock(&alloc->mutex);
+- BUG_ON(alloc->vma);
++ BUG_ON(alloc->vma_addr &&
++ vma_lookup(alloc->vma_vm_mm, alloc->vma_addr));
+
+ while ((n = rb_first(&alloc->allocated_buffers))) {
+ buffer = rb_entry(n, struct binder_buffer, rb_node);
+diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
+index 7dea57a84c79b..1e4fd37af5e03 100644
+--- a/drivers/android/binder_alloc.h
++++ b/drivers/android/binder_alloc.h
+@@ -100,7 +100,7 @@ struct binder_lru_page {
+ */
+ struct binder_alloc {
+ struct mutex mutex;
+- struct vm_area_struct *vma;
++ unsigned long vma_addr;
+ struct mm_struct *vma_vm_mm;
+ void __user *buffer;
+ struct list_head buffers;
+diff --git a/drivers/android/binder_alloc_selftest.c b/drivers/android/binder_alloc_selftest.c
+index c2b323bc3b3a5..43a881073a428 100644
+--- a/drivers/android/binder_alloc_selftest.c
++++ b/drivers/android/binder_alloc_selftest.c
+@@ -287,7 +287,7 @@ void binder_selftest_alloc(struct binder_alloc *alloc)
+ if (!binder_selftest_run)
+ return;
+ mutex_lock(&binder_selftest_lock);
+- if (!binder_selftest_run || !alloc->vma)
++ if (!binder_selftest_run || !alloc->vma_addr)
+ goto done;
+ pr_info("STARTED\n");
+ binder_selftest_alloc_offset(alloc, end_offset, 0);
+diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
+index d6b6b8cb73465..1ade9799c8d58 100644
+--- a/drivers/android/binder_internal.h
++++ b/drivers/android/binder_internal.h
+@@ -107,41 +107,19 @@ static inline int __init init_binderfs(void)
+ }
+ #endif
+
+-int binder_stats_show(struct seq_file *m, void *unused);
+-DEFINE_SHOW_ATTRIBUTE(binder_stats);
+-
+-int binder_state_show(struct seq_file *m, void *unused);
+-DEFINE_SHOW_ATTRIBUTE(binder_state);
+-
+-int binder_transactions_show(struct seq_file *m, void *unused);
+-DEFINE_SHOW_ATTRIBUTE(binder_transactions);
+-
+-int binder_transaction_log_show(struct seq_file *m, void *unused);
+-DEFINE_SHOW_ATTRIBUTE(binder_transaction_log);
+-
+-struct binder_transaction_log_entry {
+- int debug_id;
+- int debug_id_done;
+- int call_type;
+- int from_proc;
+- int from_thread;
+- int target_handle;
+- int to_proc;
+- int to_thread;
+- int to_node;
+- int data_size;
+- int offsets_size;
+- int return_error_line;
+- uint32_t return_error;
+- uint32_t return_error_param;
+- char context_name[BINDERFS_MAX_NAME + 1];
++struct binder_debugfs_entry {
++ const char *name;
++ umode_t mode;
++ const struct file_operations *fops;
++ void *data;
+ };
+
+-struct binder_transaction_log {
+- atomic_t cur;
+- bool full;
+- struct binder_transaction_log_entry entry[32];
+-};
++extern const struct binder_debugfs_entry binder_debugfs_entries[];
++
++#define binder_for_each_debugfs_entry(entry) \
++ for ((entry) = binder_debugfs_entries; \
++ (entry)->name; \
++ (entry)++)
+
+ enum binder_stat_types {
+ BINDER_STAT_PROC,
+@@ -575,6 +553,4 @@ struct binder_object {
+ };
+ };
+
+-extern struct binder_transaction_log binder_transaction_log;
+-extern struct binder_transaction_log binder_transaction_log_failed;
+ #endif /* _LINUX_BINDER_INTERNAL_H */
+diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
+index e3605cdd43357..6d717ed76766e 100644
+--- a/drivers/android/binderfs.c
++++ b/drivers/android/binderfs.c
+@@ -621,6 +621,7 @@ static int init_binder_features(struct super_block *sb)
+ static int init_binder_logs(struct super_block *sb)
+ {
+ struct dentry *binder_logs_root_dir, *dentry, *proc_log_dir;
++ const struct binder_debugfs_entry *db_entry;
+ struct binderfs_info *info;
+ int ret = 0;
+
+@@ -631,43 +632,15 @@ static int init_binder_logs(struct super_block *sb)
+ goto out;
+ }
+
+- dentry = binderfs_create_file(binder_logs_root_dir, "stats",
+- &binder_stats_fops, NULL);
+- if (IS_ERR(dentry)) {
+- ret = PTR_ERR(dentry);
+- goto out;
+- }
+-
+- dentry = binderfs_create_file(binder_logs_root_dir, "state",
+- &binder_state_fops, NULL);
+- if (IS_ERR(dentry)) {
+- ret = PTR_ERR(dentry);
+- goto out;
+- }
+-
+- dentry = binderfs_create_file(binder_logs_root_dir, "transactions",
+- &binder_transactions_fops, NULL);
+- if (IS_ERR(dentry)) {
+- ret = PTR_ERR(dentry);
+- goto out;
+- }
+-
+- dentry = binderfs_create_file(binder_logs_root_dir,
+- "transaction_log",
+- &binder_transaction_log_fops,
+- &binder_transaction_log);
+- if (IS_ERR(dentry)) {
+- ret = PTR_ERR(dentry);
+- goto out;
+- }
+-
+- dentry = binderfs_create_file(binder_logs_root_dir,
+- "failed_transaction_log",
+- &binder_transaction_log_fops,
+- &binder_transaction_log_failed);
+- if (IS_ERR(dentry)) {
+- ret = PTR_ERR(dentry);
+- goto out;
++ binder_for_each_debugfs_entry(db_entry) {
++ dentry = binderfs_create_file(binder_logs_root_dir,
++ db_entry->name,
++ db_entry->fops,
++ db_entry->data);
++ if (IS_ERR(dentry)) {
++ ret = PTR_ERR(dentry);
++ goto out;
++ }
+ }
+
+ proc_log_dir = binderfs_create_dir(binder_logs_root_dir, "proc");
+diff --git a/drivers/base/dd.c b/drivers/base/dd.c
+index d6980f33afc44..822dfa6561c29 100644
+--- a/drivers/base/dd.c
++++ b/drivers/base/dd.c
+@@ -1091,6 +1091,7 @@ static void __driver_attach_async_helper(void *_dev, async_cookie_t cookie)
+ static int __driver_attach(struct device *dev, void *data)
+ {
+ struct device_driver *drv = data;
++ bool async = false;
+ int ret;
+
+ /*
+@@ -1129,9 +1130,11 @@ static int __driver_attach(struct device *dev, void *data)
+ if (!dev->driver) {
+ get_device(dev);
+ dev->p->async_driver = drv;
+- async_schedule_dev(__driver_attach_async_helper, dev);
++ async = true;
+ }
+ device_unlock(dev);
++ if (async)
++ async_schedule_dev(__driver_attach_async_helper, dev);
+ return 0;
+ }
+
+diff --git a/drivers/base/node.c b/drivers/base/node.c
+index 0ac6376ef7a10..eb0f43784c2b3 100644
+--- a/drivers/base/node.c
++++ b/drivers/base/node.c
+@@ -45,7 +45,7 @@ static inline ssize_t cpumap_read(struct file *file, struct kobject *kobj,
+ return n;
+ }
+
+-static BIN_ATTR_RO(cpumap, 0);
++static BIN_ATTR_RO(cpumap, CPUMAP_FILE_MAX_BYTES);
+
+ static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
+ struct bin_attribute *attr, char *buf,
+@@ -66,7 +66,7 @@ static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
+ return n;
+ }
+
+-static BIN_ATTR_RO(cpulist, 0);
++static BIN_ATTR_RO(cpulist, CPULIST_FILE_MAX_BYTES);
+
+ /**
+ * struct node_access_nodes - Access class device to hold user visible
+diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
+index f0e4b0ea93e8c..59a706a126e5e 100644
+--- a/drivers/base/power/domain.c
++++ b/drivers/base/power/domain.c
+@@ -219,6 +219,9 @@ static void genpd_debug_remove(struct generic_pm_domain *genpd)
+ {
+ struct dentry *d;
+
++ if (!genpd_debugfs_dir)
++ return;
++
+ d = debugfs_lookup(genpd->name, genpd_debugfs_dir);
+ debugfs_remove(d);
+ }
+diff --git a/drivers/base/topology.c b/drivers/base/topology.c
+index ac6ad9ab67f94..89f98be5c5b99 100644
+--- a/drivers/base/topology.c
++++ b/drivers/base/topology.c
+@@ -62,47 +62,47 @@ define_id_show_func(ppin, "0x%llx");
+ static DEVICE_ATTR_ADMIN_RO(ppin);
+
+ define_siblings_read_func(thread_siblings, sibling_cpumask);
+-static BIN_ATTR_RO(thread_siblings, 0);
+-static BIN_ATTR_RO(thread_siblings_list, 0);
++static BIN_ATTR_RO(thread_siblings, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(thread_siblings_list, CPULIST_FILE_MAX_BYTES);
+
+ define_siblings_read_func(core_cpus, sibling_cpumask);
+-static BIN_ATTR_RO(core_cpus, 0);
+-static BIN_ATTR_RO(core_cpus_list, 0);
++static BIN_ATTR_RO(core_cpus, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(core_cpus_list, CPULIST_FILE_MAX_BYTES);
+
+ define_siblings_read_func(core_siblings, core_cpumask);
+-static BIN_ATTR_RO(core_siblings, 0);
+-static BIN_ATTR_RO(core_siblings_list, 0);
++static BIN_ATTR_RO(core_siblings, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(core_siblings_list, CPULIST_FILE_MAX_BYTES);
+
+ #ifdef TOPOLOGY_CLUSTER_SYSFS
+ define_siblings_read_func(cluster_cpus, cluster_cpumask);
+-static BIN_ATTR_RO(cluster_cpus, 0);
+-static BIN_ATTR_RO(cluster_cpus_list, 0);
++static BIN_ATTR_RO(cluster_cpus, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(cluster_cpus_list, CPULIST_FILE_MAX_BYTES);
+ #endif
+
+ #ifdef TOPOLOGY_DIE_SYSFS
+ define_siblings_read_func(die_cpus, die_cpumask);
+-static BIN_ATTR_RO(die_cpus, 0);
+-static BIN_ATTR_RO(die_cpus_list, 0);
++static BIN_ATTR_RO(die_cpus, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(die_cpus_list, CPULIST_FILE_MAX_BYTES);
+ #endif
+
+ define_siblings_read_func(package_cpus, core_cpumask);
+-static BIN_ATTR_RO(package_cpus, 0);
+-static BIN_ATTR_RO(package_cpus_list, 0);
++static BIN_ATTR_RO(package_cpus, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(package_cpus_list, CPULIST_FILE_MAX_BYTES);
+
+ #ifdef TOPOLOGY_BOOK_SYSFS
+ define_id_show_func(book_id, "%d");
+ static DEVICE_ATTR_RO(book_id);
+ define_siblings_read_func(book_siblings, book_cpumask);
+-static BIN_ATTR_RO(book_siblings, 0);
+-static BIN_ATTR_RO(book_siblings_list, 0);
++static BIN_ATTR_RO(book_siblings, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(book_siblings_list, CPULIST_FILE_MAX_BYTES);
+ #endif
+
+ #ifdef TOPOLOGY_DRAWER_SYSFS
+ define_id_show_func(drawer_id, "%d");
+ static DEVICE_ATTR_RO(drawer_id);
+ define_siblings_read_func(drawer_siblings, drawer_cpumask);
+-static BIN_ATTR_RO(drawer_siblings, 0);
+-static BIN_ATTR_RO(drawer_siblings_list, 0);
++static BIN_ATTR_RO(drawer_siblings, CPUMAP_FILE_MAX_BYTES);
++static BIN_ATTR_RO(drawer_siblings_list, CPULIST_FILE_MAX_BYTES);
+ #endif
+
+ static struct bin_attribute *bin_attrs[] = {
+diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
+index c441a4972064e..3e59e824ed1ff 100644
+--- a/drivers/block/null_blk/main.c
++++ b/drivers/block/null_blk/main.c
+@@ -2044,8 +2044,13 @@ static int null_add_dev(struct nullb_device *dev)
+ blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, nullb->q);
+
+ mutex_lock(&lock);
+- nullb->index = ida_simple_get(&nullb_indexes, 0, 0, GFP_KERNEL);
+- dev->index = nullb->index;
++ rv = ida_simple_get(&nullb_indexes, 0, 0, GFP_KERNEL);
++ if (rv < 0) {
++ mutex_unlock(&lock);
++ goto out_cleanup_zone;
++ }
++ nullb->index = rv;
++ dev->index = rv;
+ mutex_unlock(&lock);
+
+ blk_queue_logical_block_size(nullb->q, dev->blocksize);
+@@ -2065,13 +2070,16 @@ static int null_add_dev(struct nullb_device *dev)
+
+ rv = null_gendisk_register(nullb);
+ if (rv)
+- goto out_cleanup_zone;
++ goto out_ida_free;
+
+ mutex_lock(&lock);
+ list_add_tail(&nullb->list, &nullb_list);
+ mutex_unlock(&lock);
+
+ return 0;
++
++out_ida_free:
++ ida_free(&nullb_indexes, nullb->index);
+ out_cleanup_zone:
+ null_free_zoned_dev(dev);
+ out_cleanup_disk:
+diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
+index f04df6294650b..963df70ea0266 100644
+--- a/drivers/block/rnbd/rnbd-srv.c
++++ b/drivers/block/rnbd/rnbd-srv.c
+@@ -323,10 +323,11 @@ void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev,
+ {
+ struct rnbd_srv_session *sess = sess_dev->sess;
+
+- sess_dev->keep_id = true;
+ /* It is already started to close by client's close message. */
+ if (!mutex_trylock(&sess->lock))
+ return;
++
++ sess_dev->keep_id = true;
+ /* first remove sysfs itself to avoid deadlock */
+ sysfs_remove_file_self(&sess_dev->kobj, &attr->attr);
+ rnbd_srv_destroy_dev_session_sysfs(sess_dev);
+diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
+index f09040435e2e5..7e0d1eb29d8fb 100644
+--- a/drivers/block/xen-blkback/xenbus.c
++++ b/drivers/block/xen-blkback/xenbus.c
+@@ -157,6 +157,11 @@ static int xen_blkif_alloc_rings(struct xen_blkif *blkif)
+ return 0;
+ }
+
++/* Enable the persistent grants feature. */
++static bool feature_persistent = true;
++module_param(feature_persistent, bool, 0644);
++MODULE_PARM_DESC(feature_persistent, "Enables the persistent grants feature");
++
+ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
+ {
+ struct xen_blkif *blkif;
+@@ -472,12 +477,6 @@ static void xen_vbd_free(struct xen_vbd *vbd)
+ vbd->bdev = NULL;
+ }
+
+-/* Enable the persistent grants feature. */
+-static bool feature_persistent = true;
+-module_param(feature_persistent, bool, 0644);
+-MODULE_PARM_DESC(feature_persistent,
+- "Enables the persistent grants feature");
+-
+ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
+ unsigned major, unsigned minor, int readonly,
+ int cdrom)
+@@ -523,8 +522,6 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
+ if (q && blk_queue_secure_erase(q))
+ vbd->discard_secure = true;
+
+- vbd->feature_gnt_persistent = feature_persistent;
+-
+ pr_debug("Successful creation of handle=%04x (dom=%u)\n",
+ handle, blkif->domid);
+ return 0;
+@@ -1091,10 +1088,9 @@ static int connect_ring(struct backend_info *be)
+ xenbus_dev_fatal(dev, err, "unknown fe protocol %s", protocol);
+ return -ENOSYS;
+ }
+- if (blkif->vbd.feature_gnt_persistent)
+- blkif->vbd.feature_gnt_persistent =
+- xenbus_read_unsigned(dev->otherend,
+- "feature-persistent", 0);
++
++ blkif->vbd.feature_gnt_persistent = feature_persistent &&
++ xenbus_read_unsigned(dev->otherend, "feature-persistent", 0);
+
+ blkif->vbd.overflow_max_grants = 0;
+
+diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
+index cf9cfc40a0283..6aa1bcca8645d 100644
+--- a/drivers/block/xen-blkfront.c
++++ b/drivers/block/xen-blkfront.c
+@@ -2011,8 +2011,6 @@ static int blkfront_probe(struct xenbus_device *dev,
+ info->vdevice = vdevice;
+ info->connected = BLKIF_STATE_DISCONNECTED;
+
+- info->feature_persistent = feature_persistent;
+-
+ /* Front end dir is a number, which is used as the id. */
+ info->handle = simple_strtoul(strrchr(dev->nodename, '/')+1, NULL, 0);
+ dev_set_drvdata(&dev->dev, info);
+@@ -2306,7 +2304,7 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
+ if (xenbus_read_unsigned(info->xbdev->otherend, "feature-discard", 0))
+ blkfront_setup_discard(info);
+
+- if (info->feature_persistent)
++ if (feature_persistent)
+ info->feature_persistent =
+ !!xenbus_read_unsigned(info->xbdev->otherend,
+ "feature-persistent", 0);
+diff --git a/drivers/bluetooth/hci_intel.c b/drivers/bluetooth/hci_intel.c
+index 7249b91d9b91a..78afb9a348e70 100644
+--- a/drivers/bluetooth/hci_intel.c
++++ b/drivers/bluetooth/hci_intel.c
+@@ -1217,7 +1217,11 @@ static struct platform_driver intel_driver = {
+
+ int __init intel_init(void)
+ {
+- platform_driver_register(&intel_driver);
++ int err;
++
++ err = platform_driver_register(&intel_driver);
++ if (err)
++ return err;
+
+ return hci_uart_register_proto(&intel_proto);
+ }
+diff --git a/drivers/bluetooth/hci_serdev.c b/drivers/bluetooth/hci_serdev.c
+index 4cda890ce6470..c0e5f42ec6b7d 100644
+--- a/drivers/bluetooth/hci_serdev.c
++++ b/drivers/bluetooth/hci_serdev.c
+@@ -231,6 +231,15 @@ static int hci_uart_setup(struct hci_dev *hdev)
+ return 0;
+ }
+
++/* Check if the device is wakeable */
++static bool hci_uart_wakeup(struct hci_dev *hdev)
++{
++ /* HCI UART devices are assumed to be wakeable by default.
++ * Implement wakeup callback to override this behavior.
++ */
++ return true;
++}
++
+ /** hci_uart_write_wakeup - transmit buffer wakeup
+ * @serdev: serial device
+ *
+@@ -342,6 +351,8 @@ int hci_uart_register_device(struct hci_uart *hu,
+ hdev->flush = hci_uart_flush;
+ hdev->send = hci_uart_send_frame;
+ hdev->setup = hci_uart_setup;
++ if (!hdev->wakeup)
++ hdev->wakeup = hci_uart_wakeup;
+ SET_HCIDEV_DEV(hdev, &hu->serdev->dev);
+
+ if (test_bit(HCI_UART_NO_SUSPEND_NOTIFIER, &hu->flags))
+diff --git a/drivers/bus/hisi_lpc.c b/drivers/bus/hisi_lpc.c
+index 378f5d62a9912..e7eaa8784fee0 100644
+--- a/drivers/bus/hisi_lpc.c
++++ b/drivers/bus/hisi_lpc.c
+@@ -503,13 +503,13 @@ static int hisi_lpc_acpi_probe(struct device *hostdev)
+ {
+ struct acpi_device *adev = ACPI_COMPANION(hostdev);
+ struct acpi_device *child;
++ struct platform_device *pdev;
+ int ret;
+
+ /* Only consider the children of the host */
+ list_for_each_entry(child, &adev->children, node) {
+ const char *hid = acpi_device_hid(child);
+ const struct hisi_lpc_acpi_cell *cell;
+- struct platform_device *pdev;
+ const struct resource *res;
+ bool found = false;
+ int num_res;
+@@ -571,22 +571,24 @@ static int hisi_lpc_acpi_probe(struct device *hostdev)
+
+ ret = platform_device_add_resources(pdev, res, num_res);
+ if (ret)
+- goto fail;
++ goto fail_put_device;
+
+ ret = platform_device_add_data(pdev, cell->pdata,
+ cell->pdata_size);
+ if (ret)
+- goto fail;
++ goto fail_put_device;
+
+ ret = platform_device_add(pdev);
+ if (ret)
+- goto fail;
++ goto fail_put_device;
+
+ acpi_device_set_enumerated(child);
+ }
+
+ return 0;
+
++fail_put_device:
++ platform_device_put(pdev);
+ fail:
+ hisi_lpc_acpi_remove(hostdev);
+ return ret;
+diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
+index 04a3e23a4afc7..4419593d95315 100644
+--- a/drivers/char/tpm/tpm2-cmd.c
++++ b/drivers/char/tpm/tpm2-cmd.c
+@@ -752,6 +752,12 @@ int tpm2_auto_startup(struct tpm_chip *chip)
+ }
+
+ rc = tpm2_get_cc_attrs_tbl(chip);
++ if (rc == TPM2_RC_FAILURE || (rc < 0 && rc != -ENOMEM)) {
++ dev_info(&chip->dev,
++ "TPM in field failure mode, requires firmware upgrade\n");
++ chip->flags |= TPM_CHIP_FLAG_FIRMWARE_UPGRADE;
++ rc = 0;
++ }
+
+ out:
+ if (rc == TPM2_RC_UPGRADE) {
+diff --git a/drivers/clk/imx/clk-fracn-gppll.c b/drivers/clk/imx/clk-fracn-gppll.c
+index 71c102d950ab0..025b73229cddd 100644
+--- a/drivers/clk/imx/clk-fracn-gppll.c
++++ b/drivers/clk/imx/clk-fracn-gppll.c
+@@ -64,10 +64,10 @@ struct clk_fracn_gppll {
+ * Fout = Fvco / (rdiv * odiv)
+ */
+ static const struct imx_fracn_gppll_rate_table fracn_tbl[] = {
+- PLL_FRACN_GP(650000000U, 81, 0, 0, 0, 3),
+- PLL_FRACN_GP(594000000U, 198, 0, 0, 0, 8),
+- PLL_FRACN_GP(560000000U, 70, 0, 0, 0, 3),
+- PLL_FRACN_GP(400000000U, 50, 0, 0, 0, 3),
++ PLL_FRACN_GP(650000000U, 81, 0, 1, 0, 3),
++ PLL_FRACN_GP(594000000U, 198, 0, 1, 0, 8),
++ PLL_FRACN_GP(560000000U, 70, 0, 1, 0, 3),
++ PLL_FRACN_GP(400000000U, 50, 0, 1, 0, 3),
+ PLL_FRACN_GP(393216000U, 81, 92, 100, 0, 5)
+ };
+
+@@ -131,18 +131,7 @@ static unsigned long clk_fracn_gppll_recalc_rate(struct clk_hw *hw, unsigned lon
+ mfi = FIELD_GET(PLL_MFI_MASK, pll_div);
+
+ rdiv = FIELD_GET(PLL_RDIV_MASK, pll_div);
+- rdiv = rdiv + 1;
+ odiv = FIELD_GET(PLL_ODIV_MASK, pll_div);
+- switch (odiv) {
+- case 0:
+- odiv = 2;
+- break;
+- case 1:
+- odiv = 3;
+- break;
+- default:
+- break;
+- }
+
+ /*
+ * Sometimes, the recalculated rate has deviation due to
+@@ -160,6 +149,20 @@ static unsigned long clk_fracn_gppll_recalc_rate(struct clk_hw *hw, unsigned lon
+ if (rate)
+ return (unsigned long)rate;
+
++ if (!rdiv)
++ rdiv = rdiv + 1;
++
++ switch (odiv) {
++ case 0:
++ odiv = 2;
++ break;
++ case 1:
++ odiv = 3;
++ break;
++ default:
++ break;
++ }
++
+ /* Fvco = Fref * (MFI + MFN / MFD) */
+ fvco = fvco * mfi * mfd + fvco * mfn;
+ do_div(fvco, mfd * rdiv * odiv);
+diff --git a/drivers/clk/imx/clk-imx93.c b/drivers/clk/imx/clk-imx93.c
+index edcc87661d1f6..26885bd3971c4 100644
+--- a/drivers/clk/imx/clk-imx93.c
++++ b/drivers/clk/imx/clk-imx93.c
+@@ -150,7 +150,7 @@ static const struct imx93_clk_ccgr {
+ { IMX93_CLK_A55_GATE, "a55", "a55_root", 0x8000, },
+ /* M33 critical clk for system run */
+ { IMX93_CLK_CM33_GATE, "cm33", "m33_root", 0x8040, CLK_IS_CRITICAL },
+- { IMX93_CLK_ADC1_GATE, "adc1", "osc_24m", 0x82c0, },
++ { IMX93_CLK_ADC1_GATE, "adc1", "adc_root", 0x82c0, },
+ { IMX93_CLK_WDOG1_GATE, "wdog1", "osc_24m", 0x8300, },
+ { IMX93_CLK_WDOG2_GATE, "wdog2", "osc_24m", 0x8340, },
+ { IMX93_CLK_WDOG3_GATE, "wdog3", "osc_24m", 0x8380, },
+@@ -219,7 +219,7 @@ static const struct imx93_clk_ccgr {
+ { IMX93_CLK_LCDIF_GATE, "lcdif", "media_apb_root", 0x9640, },
+ { IMX93_CLK_PXP_GATE, "pxp", "media_apb_root", 0x9680, },
+ { IMX93_CLK_ISI_GATE, "isi", "media_apb_root", 0x96c0, },
+- { IMX93_CLK_NIC_MEDIA_GATE, "nic_media", "media_apb_root", 0x9700, },
++ { IMX93_CLK_NIC_MEDIA_GATE, "nic_media", "media_axi_root", 0x9700, },
+ { IMX93_CLK_USB_CONTROLLER_GATE, "usb_controller", "hsio_root", 0x9a00, },
+ { IMX93_CLK_USB_TEST_60M_GATE, "usb_test_60m", "hsio_usb_test_60m_root", 0x9a40, },
+ { IMX93_CLK_HSIO_TROUT_24M_GATE, "hsio_trout_24m", "osc_24m", 0x9a80, },
+diff --git a/drivers/clk/mediatek/reset.c b/drivers/clk/mediatek/reset.c
+index bcec4b89f449a..834d26e9bdfde 100644
+--- a/drivers/clk/mediatek/reset.c
++++ b/drivers/clk/mediatek/reset.c
+@@ -25,7 +25,7 @@ static int mtk_reset_assert_set_clr(struct reset_controller_dev *rcdev,
+ struct mtk_reset *data = container_of(rcdev, struct mtk_reset, rcdev);
+ unsigned int reg = data->regofs + ((id / 32) << 4);
+
+- return regmap_write(data->regmap, reg, 1);
++ return regmap_write(data->regmap, reg, BIT(id % 32));
+ }
+
+ static int mtk_reset_deassert_set_clr(struct reset_controller_dev *rcdev,
+@@ -34,7 +34,7 @@ static int mtk_reset_deassert_set_clr(struct reset_controller_dev *rcdev,
+ struct mtk_reset *data = container_of(rcdev, struct mtk_reset, rcdev);
+ unsigned int reg = data->regofs + ((id / 32) << 4) + 0x4;
+
+- return regmap_write(data->regmap, reg, 1);
++ return regmap_write(data->regmap, reg, BIT(id % 32));
+ }
+
+ static int mtk_reset_assert(struct reset_controller_dev *rcdev,
+diff --git a/drivers/clk/qcom/camcc-sdm845.c b/drivers/clk/qcom/camcc-sdm845.c
+index be3f953269657..27d44188a7abb 100644
+--- a/drivers/clk/qcom/camcc-sdm845.c
++++ b/drivers/clk/qcom/camcc-sdm845.c
+@@ -1534,6 +1534,8 @@ static struct clk_branch cam_cc_sys_tmr_clk = {
+ },
+ };
+
++static struct gdsc titan_top_gdsc;
++
+ static struct gdsc bps_gdsc = {
+ .gdscr = 0x6004,
+ .pd = {
+@@ -1567,6 +1569,7 @@ static struct gdsc ife_0_gdsc = {
+ .name = "ife_0_gdsc",
+ },
+ .flags = POLL_CFG_GDSCR,
++ .parent = &titan_top_gdsc.pd,
+ .pwrsts = PWRSTS_OFF_ON,
+ };
+
+@@ -1576,6 +1579,7 @@ static struct gdsc ife_1_gdsc = {
+ .name = "ife_1_gdsc",
+ },
+ .flags = POLL_CFG_GDSCR,
++ .parent = &titan_top_gdsc.pd,
+ .pwrsts = PWRSTS_OFF_ON,
+ };
+
+diff --git a/drivers/clk/qcom/camcc-sm8250.c b/drivers/clk/qcom/camcc-sm8250.c
+index 439eaafdcc862..9b32c56a5bc5a 100644
+--- a/drivers/clk/qcom/camcc-sm8250.c
++++ b/drivers/clk/qcom/camcc-sm8250.c
+@@ -2205,6 +2205,8 @@ static struct clk_branch cam_cc_sleep_clk = {
+ },
+ };
+
++static struct gdsc titan_top_gdsc;
++
+ static struct gdsc bps_gdsc = {
+ .gdscr = 0x7004,
+ .pd = {
+@@ -2238,6 +2240,7 @@ static struct gdsc ife_0_gdsc = {
+ .name = "ife_0_gdsc",
+ },
+ .flags = POLL_CFG_GDSCR,
++ .parent = &titan_top_gdsc.pd,
+ .pwrsts = PWRSTS_OFF_ON,
+ };
+
+@@ -2247,6 +2250,7 @@ static struct gdsc ife_1_gdsc = {
+ .name = "ife_1_gdsc",
+ },
+ .flags = POLL_CFG_GDSCR,
++ .parent = &titan_top_gdsc.pd,
+ .pwrsts = PWRSTS_OFF_ON,
+ };
+
+@@ -2440,17 +2444,7 @@ static struct platform_driver cam_cc_sm8250_driver = {
+ },
+ };
+
+-static int __init cam_cc_sm8250_init(void)
+-{
+- return platform_driver_register(&cam_cc_sm8250_driver);
+-}
+-subsys_initcall(cam_cc_sm8250_init);
+-
+-static void __exit cam_cc_sm8250_exit(void)
+-{
+- platform_driver_unregister(&cam_cc_sm8250_driver);
+-}
+-module_exit(cam_cc_sm8250_exit);
++module_platform_driver(cam_cc_sm8250_driver);
+
+ MODULE_DESCRIPTION("QTI CAMCC SM8250 Driver");
+ MODULE_LICENSE("GPL v2");
+diff --git a/drivers/clk/qcom/clk-krait.c b/drivers/clk/qcom/clk-krait.c
+index 59f1af415b580..90046428693c2 100644
+--- a/drivers/clk/qcom/clk-krait.c
++++ b/drivers/clk/qcom/clk-krait.c
+@@ -32,11 +32,16 @@ static void __krait_mux_set_sel(struct krait_mux_clk *mux, int sel)
+ regval |= (sel & mux->mask) << (mux->shift + LPL_SHIFT);
+ }
+ krait_set_l2_indirect_reg(mux->offset, regval);
+- spin_unlock_irqrestore(&krait_clock_reg_lock, flags);
+
+ /* Wait for switch to complete. */
+ mb();
+ udelay(1);
++
++ /*
++ * Unlock now to make sure the mux register is not
++ * modified while switching to the new parent.
++ */
++ spin_unlock_irqrestore(&krait_clock_reg_lock, flags);
+ }
+
+ static int krait_mux_set_parent(struct clk_hw *hw, u8 index)
+diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
+index e9c357309fd9f..caafc35eecfd2 100644
+--- a/drivers/clk/qcom/clk-rcg2.c
++++ b/drivers/clk/qcom/clk-rcg2.c
+@@ -13,6 +13,7 @@
+ #include <linux/rational.h>
+ #include <linux/regmap.h>
+ #include <linux/math64.h>
++#include <linux/minmax.h>
+ #include <linux/slab.h>
+
+ #include <asm/div64.h>
+@@ -405,7 +406,7 @@ static int clk_rcg2_get_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
+ static int clk_rcg2_set_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
+ {
+ struct clk_rcg2 *rcg = to_clk_rcg2(hw);
+- u32 notn_m, n, m, d, not2d, mask, duty_per;
++ u32 notn_m, n, m, d, not2d, mask, duty_per, cfg;
+ int ret;
+
+ /* Duty-cycle cannot be modified for non-MND RCGs */
+@@ -416,6 +417,11 @@ static int clk_rcg2_set_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
+
+ regmap_read(rcg->clkr.regmap, RCG_N_OFFSET(rcg), ¬n_m);
+ regmap_read(rcg->clkr.regmap, RCG_M_OFFSET(rcg), &m);
++ regmap_read(rcg->clkr.regmap, RCG_CFG_OFFSET(rcg), &cfg);
++
++ /* Duty-cycle cannot be modified if MND divider is in bypass mode. */
++ if (!(cfg & CFG_MODE_MASK))
++ return -EINVAL;
+
+ n = (~(notn_m) + m) & mask;
+
+@@ -424,9 +430,11 @@ static int clk_rcg2_set_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
+ /* Calculate 2d value */
+ d = DIV_ROUND_CLOSEST(n * duty_per * 2, 100);
+
+- /* Check bit widths of 2d. If D is too big reduce duty cycle. */
+- if (d > mask)
+- d = mask;
++ /*
++ * Check bit widths of 2d. If D is too big reduce duty cycle.
++ * Also make sure it is never zero.
++ */
++ d = clamp_val(d, 1, mask);
+
+ if ((d / 2) > (n - m))
+ d = (n - m) * 2;
+diff --git a/drivers/clk/qcom/dispcc-sm8250.c b/drivers/clk/qcom/dispcc-sm8250.c
+index db9379634fb22..f646fdfe6f154 100644
+--- a/drivers/clk/qcom/dispcc-sm8250.c
++++ b/drivers/clk/qcom/dispcc-sm8250.c
+@@ -1134,7 +1134,6 @@ static struct gdsc mdss_gdsc = {
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+ .flags = HW_CTRL,
+- .supply = "mmcx",
+ };
+
+ static struct clk_regmap *disp_cc_sm8250_clocks[] = {
+diff --git a/drivers/clk/qcom/gcc-ipq8074.c b/drivers/clk/qcom/gcc-ipq8074.c
+index 541016db3c4bb..2c2ecfc5e61f5 100644
+--- a/drivers/clk/qcom/gcc-ipq8074.c
++++ b/drivers/clk/qcom/gcc-ipq8074.c
+@@ -1788,8 +1788,10 @@ static struct clk_regmap_div nss_port4_tx_div_clk_src = {
+ static const struct freq_tbl ftbl_nss_port5_rx_clk_src[] = {
+ F(19200000, P_XO, 1, 0, 0),
+ F(25000000, P_UNIPHY1_RX, 12.5, 0, 0),
++ F(25000000, P_UNIPHY0_RX, 5, 0, 0),
+ F(78125000, P_UNIPHY1_RX, 4, 0, 0),
+ F(125000000, P_UNIPHY1_RX, 2.5, 0, 0),
++ F(125000000, P_UNIPHY0_RX, 1, 0, 0),
+ F(156250000, P_UNIPHY1_RX, 2, 0, 0),
+ F(312500000, P_UNIPHY1_RX, 1, 0, 0),
+ { }
+@@ -1828,8 +1830,10 @@ static struct clk_regmap_div nss_port5_rx_div_clk_src = {
+ static const struct freq_tbl ftbl_nss_port5_tx_clk_src[] = {
+ F(19200000, P_XO, 1, 0, 0),
+ F(25000000, P_UNIPHY1_TX, 12.5, 0, 0),
++ F(25000000, P_UNIPHY0_TX, 5, 0, 0),
+ F(78125000, P_UNIPHY1_TX, 4, 0, 0),
+ F(125000000, P_UNIPHY1_TX, 2.5, 0, 0),
++ F(125000000, P_UNIPHY0_TX, 1, 0, 0),
+ F(156250000, P_UNIPHY1_TX, 2, 0, 0),
+ F(312500000, P_UNIPHY1_TX, 1, 0, 0),
+ { }
+@@ -1867,8 +1871,10 @@ static struct clk_regmap_div nss_port5_tx_div_clk_src = {
+
+ static const struct freq_tbl ftbl_nss_port6_rx_clk_src[] = {
+ F(19200000, P_XO, 1, 0, 0),
++ F(25000000, P_UNIPHY2_RX, 5, 0, 0),
+ F(25000000, P_UNIPHY2_RX, 12.5, 0, 0),
+ F(78125000, P_UNIPHY2_RX, 4, 0, 0),
++ F(125000000, P_UNIPHY2_RX, 1, 0, 0),
+ F(125000000, P_UNIPHY2_RX, 2.5, 0, 0),
+ F(156250000, P_UNIPHY2_RX, 2, 0, 0),
+ F(312500000, P_UNIPHY2_RX, 1, 0, 0),
+@@ -1907,8 +1913,10 @@ static struct clk_regmap_div nss_port6_rx_div_clk_src = {
+
+ static const struct freq_tbl ftbl_nss_port6_tx_clk_src[] = {
+ F(19200000, P_XO, 1, 0, 0),
++ F(25000000, P_UNIPHY2_TX, 5, 0, 0),
+ F(25000000, P_UNIPHY2_TX, 12.5, 0, 0),
+ F(78125000, P_UNIPHY2_TX, 4, 0, 0),
++ F(125000000, P_UNIPHY2_TX, 1, 0, 0),
+ F(125000000, P_UNIPHY2_TX, 2.5, 0, 0),
+ F(156250000, P_UNIPHY2_TX, 2, 0, 0),
+ F(312500000, P_UNIPHY2_TX, 1, 0, 0),
+@@ -3346,6 +3354,7 @@ static struct clk_branch gcc_nssnoc_ubi1_ahb_clk = {
+
+ static struct clk_branch gcc_ubi0_ahb_clk = {
+ .halt_reg = 0x6820c,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x6820c,
+ .enable_mask = BIT(0),
+@@ -3363,6 +3372,7 @@ static struct clk_branch gcc_ubi0_ahb_clk = {
+
+ static struct clk_branch gcc_ubi0_axi_clk = {
+ .halt_reg = 0x68200,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68200,
+ .enable_mask = BIT(0),
+@@ -3380,6 +3390,7 @@ static struct clk_branch gcc_ubi0_axi_clk = {
+
+ static struct clk_branch gcc_ubi0_nc_axi_clk = {
+ .halt_reg = 0x68204,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68204,
+ .enable_mask = BIT(0),
+@@ -3397,6 +3408,7 @@ static struct clk_branch gcc_ubi0_nc_axi_clk = {
+
+ static struct clk_branch gcc_ubi0_core_clk = {
+ .halt_reg = 0x68210,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68210,
+ .enable_mask = BIT(0),
+@@ -3414,6 +3426,7 @@ static struct clk_branch gcc_ubi0_core_clk = {
+
+ static struct clk_branch gcc_ubi0_mpt_clk = {
+ .halt_reg = 0x68208,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68208,
+ .enable_mask = BIT(0),
+@@ -3431,6 +3444,7 @@ static struct clk_branch gcc_ubi0_mpt_clk = {
+
+ static struct clk_branch gcc_ubi1_ahb_clk = {
+ .halt_reg = 0x6822c,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x6822c,
+ .enable_mask = BIT(0),
+@@ -3448,6 +3462,7 @@ static struct clk_branch gcc_ubi1_ahb_clk = {
+
+ static struct clk_branch gcc_ubi1_axi_clk = {
+ .halt_reg = 0x68220,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68220,
+ .enable_mask = BIT(0),
+@@ -3465,6 +3480,7 @@ static struct clk_branch gcc_ubi1_axi_clk = {
+
+ static struct clk_branch gcc_ubi1_nc_axi_clk = {
+ .halt_reg = 0x68224,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68224,
+ .enable_mask = BIT(0),
+@@ -3482,6 +3498,7 @@ static struct clk_branch gcc_ubi1_nc_axi_clk = {
+
+ static struct clk_branch gcc_ubi1_core_clk = {
+ .halt_reg = 0x68230,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68230,
+ .enable_mask = BIT(0),
+@@ -3499,6 +3516,7 @@ static struct clk_branch gcc_ubi1_core_clk = {
+
+ static struct clk_branch gcc_ubi1_mpt_clk = {
+ .halt_reg = 0x68228,
++ .halt_check = BRANCH_HALT_DELAY,
+ .clkr = {
+ .enable_reg = 0x68228,
+ .enable_mask = BIT(0),
+@@ -4371,6 +4389,33 @@ static struct clk_branch gcc_pcie0_axi_s_bridge_clk = {
+ },
+ };
+
++static const struct alpha_pll_config ubi32_pll_config = {
++ .l = 0x4e,
++ .config_ctl_val = 0x200d4aa8,
++ .config_ctl_hi_val = 0x3c2,
++ .main_output_mask = BIT(0),
++ .aux_output_mask = BIT(1),
++ .pre_div_val = 0x0,
++ .pre_div_mask = BIT(12),
++ .post_div_val = 0x0,
++ .post_div_mask = GENMASK(9, 8),
++};
++
++static const struct alpha_pll_config nss_crypto_pll_config = {
++ .l = 0x3e,
++ .alpha = 0x0,
++ .alpha_hi = 0x80,
++ .config_ctl_val = 0x4001055b,
++ .main_output_mask = BIT(0),
++ .pre_div_val = 0x0,
++ .pre_div_mask = GENMASK(14, 12),
++ .post_div_val = 0x1 << 8,
++ .post_div_mask = GENMASK(11, 8),
++ .vco_mask = GENMASK(21, 20),
++ .vco_val = 0x0,
++ .alpha_en_mask = BIT(24),
++};
++
+ static struct clk_hw *gcc_ipq8074_hws[] = {
+ &gpll0_out_main_div2.hw,
+ &gpll6_out_main_div2.hw,
+@@ -4772,7 +4817,20 @@ static const struct qcom_cc_desc gcc_ipq8074_desc = {
+
+ static int gcc_ipq8074_probe(struct platform_device *pdev)
+ {
+- return qcom_cc_probe(pdev, &gcc_ipq8074_desc);
++ struct regmap *regmap;
++
++ regmap = qcom_cc_map(pdev, &gcc_ipq8074_desc);
++ if (IS_ERR(regmap))
++ return PTR_ERR(regmap);
++
++ /* SW Workaround for UBI32 Huayra PLL */
++ regmap_update_bits(regmap, 0x2501c, BIT(26), BIT(26));
++
++ clk_alpha_pll_configure(&ubi32_pll_main, regmap, &ubi32_pll_config);
++ clk_alpha_pll_configure(&nss_crypto_pll_main, regmap,
++ &nss_crypto_pll_config);
++
++ return qcom_cc_really_probe(pdev, &gcc_ipq8074_desc, regmap);
+ }
+
+ static struct platform_driver gcc_ipq8074_driver = {
+diff --git a/drivers/clk/qcom/gcc-msm8939.c b/drivers/clk/qcom/gcc-msm8939.c
+index 39ebb443ae3d5..de0022e5450de 100644
+--- a/drivers/clk/qcom/gcc-msm8939.c
++++ b/drivers/clk/qcom/gcc-msm8939.c
+@@ -632,7 +632,7 @@ static struct clk_rcg2 system_noc_bfdcd_clk_src = {
+ };
+
+ static struct clk_rcg2 bimc_ddr_clk_src = {
+- .cmd_rcgr = 0x32004,
++ .cmd_rcgr = 0x32024,
+ .hid_width = 5,
+ .parent_map = gcc_xo_gpll0_bimc_map,
+ .clkr.hw.init = &(struct clk_init_data){
+@@ -644,6 +644,18 @@ static struct clk_rcg2 bimc_ddr_clk_src = {
+ },
+ };
+
++static struct clk_rcg2 system_mm_noc_bfdcd_clk_src = {
++ .cmd_rcgr = 0x2600c,
++ .hid_width = 5,
++ .parent_map = gcc_xo_gpll0_gpll6a_map,
++ .clkr.hw.init = &(struct clk_init_data){
++ .name = "system_mm_noc_bfdcd_clk_src",
++ .parent_data = gcc_xo_gpll0_gpll6a_parent_data,
++ .num_parents = 3,
++ .ops = &clk_rcg2_ops,
++ },
++};
++
+ static const struct freq_tbl ftbl_gcc_camss_ahb_clk[] = {
+ F(40000000, P_GPLL0, 10, 1, 2),
+ F(80000000, P_GPLL0, 10, 0, 0),
+@@ -1002,7 +1014,7 @@ static struct clk_rcg2 blsp1_uart2_apps_clk_src = {
+ };
+
+ static const struct freq_tbl ftbl_gcc_camss_cci_clk[] = {
+- F(19200000, P_XO, 1, 0, 0),
++ F(19200000, P_XO, 1, 0, 0),
+ { }
+ };
+
+@@ -2441,7 +2453,7 @@ static struct clk_branch gcc_camss_jpeg_axi_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_camss_jpeg_axi_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -2645,7 +2657,7 @@ static struct clk_branch gcc_camss_vfe_axi_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_camss_vfe_axi_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -2801,7 +2813,7 @@ static struct clk_branch gcc_mdss_axi_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_mdss_axi_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -3193,7 +3205,7 @@ static struct clk_branch gcc_mdp_tbu_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_mdp_tbu_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -3211,7 +3223,7 @@ static struct clk_branch gcc_venus_tbu_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_venus_tbu_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -3229,7 +3241,7 @@ static struct clk_branch gcc_vfe_tbu_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_vfe_tbu_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -3247,7 +3259,7 @@ static struct clk_branch gcc_jpeg_tbu_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_jpeg_tbu_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -3484,7 +3496,7 @@ static struct clk_branch gcc_venus0_axi_clk = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gcc_venus0_axi_clk",
+ .parent_data = &(const struct clk_parent_data){
+- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
++ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
+ },
+ .num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
+@@ -3623,6 +3635,7 @@ static struct clk_regmap *gcc_msm8939_clocks[] = {
+ [GPLL2_VOTE] = &gpll2_vote,
+ [PCNOC_BFDCD_CLK_SRC] = &pcnoc_bfdcd_clk_src.clkr,
+ [SYSTEM_NOC_BFDCD_CLK_SRC] = &system_noc_bfdcd_clk_src.clkr,
++ [SYSTEM_MM_NOC_BFDCD_CLK_SRC] = &system_mm_noc_bfdcd_clk_src.clkr,
+ [CAMSS_AHB_CLK_SRC] = &camss_ahb_clk_src.clkr,
+ [APSS_AHB_CLK_SRC] = &apss_ahb_clk_src.clkr,
+ [CSI0_CLK_SRC] = &csi0_clk_src.clkr,
+diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
+index 44520efc6c72b..2db0938f8dd3f 100644
+--- a/drivers/clk/qcom/gdsc.c
++++ b/drivers/clk/qcom/gdsc.c
+@@ -420,6 +420,14 @@ static int gdsc_init(struct gdsc *sc)
+ return ret;
+ }
+
++ /* ...and the power-domain */
++ ret = gdsc_pm_runtime_get(sc);
++ if (ret) {
++ if (sc->rsupply)
++ regulator_disable(sc->rsupply);
++ return ret;
++ }
++
+ /*
+ * Votable GDSCs can be ON due to Vote from other masters.
+ * If a Votable GDSC is ON, make sure we have a Vote.
+diff --git a/drivers/clk/qcom/videocc-sm8250.c b/drivers/clk/qcom/videocc-sm8250.c
+index 8617454e4a77c..f28f2cb051d72 100644
+--- a/drivers/clk/qcom/videocc-sm8250.c
++++ b/drivers/clk/qcom/videocc-sm8250.c
+@@ -277,7 +277,6 @@ static struct gdsc mvs0c_gdsc = {
+ },
+ .flags = 0,
+ .pwrsts = PWRSTS_OFF_ON,
+- .supply = "mmcx",
+ };
+
+ static struct gdsc mvs1c_gdsc = {
+@@ -287,7 +286,6 @@ static struct gdsc mvs1c_gdsc = {
+ },
+ .flags = 0,
+ .pwrsts = PWRSTS_OFF_ON,
+- .supply = "mmcx",
+ };
+
+ static struct gdsc mvs0_gdsc = {
+@@ -297,7 +295,6 @@ static struct gdsc mvs0_gdsc = {
+ },
+ .flags = HW_CTRL,
+ .pwrsts = PWRSTS_OFF_ON,
+- .supply = "mmcx",
+ };
+
+ static struct gdsc mvs1_gdsc = {
+@@ -307,7 +304,6 @@ static struct gdsc mvs1_gdsc = {
+ },
+ .flags = HW_CTRL,
+ .pwrsts = PWRSTS_OFF_ON,
+- .supply = "mmcx",
+ };
+
+ static struct clk_regmap *video_cc_sm8250_clocks[] = {
+diff --git a/drivers/clk/renesas/r9a06g032-clocks.c b/drivers/clk/renesas/r9a06g032-clocks.c
+index c99942f0e4d4c..abc0891fd96db 100644
+--- a/drivers/clk/renesas/r9a06g032-clocks.c
++++ b/drivers/clk/renesas/r9a06g032-clocks.c
+@@ -286,8 +286,8 @@ static const struct r9a06g032_clkdesc r9a06g032_clocks[] = {
+ .name = "uart_group_012",
+ .type = K_BITSEL,
+ .source = 1 + R9A06G032_DIV_UART,
+- /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG1_PR2 */
+- .dual.sel = ((0xec / 4) << 5) | 24,
++ /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG0_0 */
++ .dual.sel = ((0x34 / 4) << 5) | 30,
+ .dual.group = 0,
+ },
+ {
+@@ -295,8 +295,8 @@ static const struct r9a06g032_clkdesc r9a06g032_clocks[] = {
+ .name = "uart_group_34567",
+ .type = K_BITSEL,
+ .source = 1 + R9A06G032_DIV_P2_PG,
+- /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG0_0 */
+- .dual.sel = ((0x34 / 4) << 5) | 30,
++ /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG1_PR2 */
++ .dual.sel = ((0xec / 4) << 5) | 24,
+ .dual.group = 1,
+ },
+ D_UGATE(CLK_UART0, "clk_uart0", UART_GROUP_012, 0, 0, 0x1b2, 0x1b3, 0x1b4, 0x1b5),
+diff --git a/drivers/clk/renesas/rzg2l-cpg.c b/drivers/clk/renesas/rzg2l-cpg.c
+index 486d0656c58ac..1068058a3865e 100644
+--- a/drivers/clk/renesas/rzg2l-cpg.c
++++ b/drivers/clk/renesas/rzg2l-cpg.c
+@@ -744,7 +744,7 @@ static int rzg2l_cpg_status(struct reset_controller_dev *rcdev,
+ unsigned int reg = info->resets[id].off;
+ u32 bitmask = BIT(info->resets[id].bit);
+
+- return !(readl(priv->base + CLK_MRST_R(reg)) & bitmask);
++ return !!(readl(priv->base + CLK_MRST_R(reg)) & bitmask);
+ }
+
+ static const struct reset_control_ops rzg2l_cpg_reset_ops = {
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+index 70e2e6e373897..3c46ad8c3a1c5 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+@@ -151,6 +151,7 @@ dma_iv_error:
+ while (i >= 0) {
+ dma_unmap_single(ss->dev, rctx->p_iv[i], ivsize, DMA_TO_DEVICE);
+ memzero_explicit(sf->iv[i], ivsize);
++ i--;
+ }
+ return err;
+ }
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
+index 6575305786436..47b5828e35c34 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
+@@ -476,14 +476,32 @@ static int allocate_flows(struct sun8i_ss_dev *ss)
+
+ ss->flows[i].biv = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
+ GFP_KERNEL | GFP_DMA);
+- if (!ss->flows[i].biv)
++ if (!ss->flows[i].biv) {
++ err = -ENOMEM;
+ goto error_engine;
++ }
+
+ for (j = 0; j < MAX_SG; j++) {
+ ss->flows[i].iv[j] = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
+ GFP_KERNEL | GFP_DMA);
+- if (!ss->flows[i].iv[j])
++ if (!ss->flows[i].iv[j]) {
++ err = -ENOMEM;
+ goto error_engine;
++ }
++ }
++
++ /* the padding could be up to two block. */
++ ss->flows[i].pad = devm_kmalloc(ss->dev, SHA256_BLOCK_SIZE * 2,
++ GFP_KERNEL | GFP_DMA);
++ if (!ss->flows[i].pad) {
++ err = -ENOMEM;
++ goto error_engine;
++ }
++ ss->flows[i].result = devm_kmalloc(ss->dev, SHA256_DIGEST_SIZE,
++ GFP_KERNEL | GFP_DMA);
++ if (!ss->flows[i].result) {
++ err = -ENOMEM;
++ goto error_engine;
+ }
+
+ ss->flows[i].engine = crypto_engine_alloc_init(ss->dev, true);
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
+index ca4f280af35d2..f89a580618aaa 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
+@@ -342,18 +342,11 @@ int sun8i_ss_hash_run(struct crypto_engine *engine, void *breq)
+ if (digestsize == SHA224_DIGEST_SIZE)
+ digestsize = SHA256_DIGEST_SIZE;
+
+- /* the padding could be up to two block. */
+- pad = kzalloc(algt->alg.hash.halg.base.cra_blocksize * 2, GFP_KERNEL | GFP_DMA);
+- if (!pad)
+- return -ENOMEM;
++ result = ss->flows[rctx->flow].result;
++ pad = ss->flows[rctx->flow].pad;
++ memset(pad, 0, algt->alg.hash.halg.base.cra_blocksize * 2);
+ bf = (__le32 *)pad;
+
+- result = kzalloc(digestsize, GFP_KERNEL | GFP_DMA);
+- if (!result) {
+- kfree(pad);
+- return -ENOMEM;
+- }
+-
+ for (i = 0; i < MAX_SG; i++) {
+ rctx->t_dst[i].addr = 0;
+ rctx->t_dst[i].len = 0;
+@@ -449,8 +442,6 @@ int sun8i_ss_hash_run(struct crypto_engine *engine, void *breq)
+
+ memcpy(areq->result, result, algt->alg.hash.halg.digestsize);
+ theend:
+- kfree(pad);
+- kfree(result);
+ local_bh_disable();
+ crypto_finalize_hash_request(engine, breq, err);
+ local_bh_enable();
+diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
+index 57ada86538550..eb82ee5345ae1 100644
+--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
++++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
+@@ -123,6 +123,8 @@ struct sginfo {
+ * @stat_req: number of request done by this flow
+ * @iv: list of IV to use for each step
+ * @biv: buffer which contain the backuped IV
++ * @pad: padding buffer for hash operations
++ * @result: buffer for storing the result of hash operations
+ */
+ struct sun8i_ss_flow {
+ struct crypto_engine *engine;
+@@ -130,6 +132,8 @@ struct sun8i_ss_flow {
+ int status;
+ u8 *iv[MAX_SG];
+ u8 *biv;
++ void *pad;
++ void *result;
+ #ifdef CONFIG_CRYPTO_DEV_SUN8I_SS_DEBUG
+ unsigned long stat_req;
+ #endif
+diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
+index 3aefb177715e9..95dbab0ffab2f 100644
+--- a/drivers/crypto/ccp/sev-dev.c
++++ b/drivers/crypto/ccp/sev-dev.c
+@@ -503,7 +503,7 @@ static int __sev_platform_shutdown_locked(int *error)
+ struct sev_device *sev = psp_master->sev_data;
+ int ret;
+
+- if (sev->state == SEV_STATE_UNINIT)
++ if (!sev || sev->state == SEV_STATE_UNINIT)
+ return 0;
+
+ ret = __sev_do_cmd_locked(SEV_CMD_SHUTDOWN, NULL, error);
+@@ -577,6 +577,8 @@ static int sev_ioctl_do_platform_status(struct sev_issue_cmd *argp)
+ struct sev_user_data_status data;
+ int ret;
+
++ memset(&data, 0, sizeof(data));
++
+ ret = __sev_do_cmd_locked(SEV_CMD_PLATFORM_STATUS, &data, &argp->error);
+ if (ret)
+ return ret;
+@@ -630,7 +632,7 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd *argp, bool writable)
+ if (input.length > SEV_FW_BLOB_MAX_SIZE)
+ return -EFAULT;
+
+- blob = kmalloc(input.length, GFP_KERNEL);
++ blob = kzalloc(input.length, GFP_KERNEL);
+ if (!blob)
+ return -ENOMEM;
+
+@@ -854,7 +856,7 @@ static int sev_ioctl_do_get_id2(struct sev_issue_cmd *argp)
+ input_address = (void __user *)input.address;
+
+ if (input.address && input.length) {
+- id_blob = kmalloc(input.length, GFP_KERNEL);
++ id_blob = kzalloc(input.length, GFP_KERNEL);
+ if (!id_blob)
+ return -ENOMEM;
+
+@@ -973,14 +975,14 @@ static int sev_ioctl_do_pdh_export(struct sev_issue_cmd *argp, bool writable)
+ if (input.cert_chain_len > SEV_FW_BLOB_MAX_SIZE)
+ return -EFAULT;
+
+- pdh_blob = kmalloc(input.pdh_cert_len, GFP_KERNEL);
++ pdh_blob = kzalloc(input.pdh_cert_len, GFP_KERNEL);
+ if (!pdh_blob)
+ return -ENOMEM;
+
+ data.pdh_cert_address = __psp_pa(pdh_blob);
+ data.pdh_cert_len = input.pdh_cert_len;
+
+- cert_blob = kmalloc(input.cert_chain_len, GFP_KERNEL);
++ cert_blob = kzalloc(input.cert_chain_len, GFP_KERNEL);
+ if (!cert_blob) {
+ ret = -ENOMEM;
+ goto e_free_pdh;
+diff --git a/drivers/crypto/hisilicon/hpre/hpre_crypto.c b/drivers/crypto/hisilicon/hpre/hpre_crypto.c
+index 97d54c1465c2b..3ba6f15deafc6 100644
+--- a/drivers/crypto/hisilicon/hpre/hpre_crypto.c
++++ b/drivers/crypto/hisilicon/hpre/hpre_crypto.c
+@@ -252,7 +252,7 @@ static int hpre_prepare_dma_buf(struct hpre_asym_request *hpre_req,
+ if (unlikely(shift < 0))
+ return -EINVAL;
+
+- ptr = dma_alloc_coherent(dev, ctx->key_sz, tmp, GFP_KERNEL);
++ ptr = dma_alloc_coherent(dev, ctx->key_sz, tmp, GFP_ATOMIC);
+ if (unlikely(!ptr))
+ return -ENOMEM;
+
+diff --git a/drivers/crypto/hisilicon/sec/sec_algs.c b/drivers/crypto/hisilicon/sec/sec_algs.c
+index 0a3c8f019b025..490e1542305e1 100644
+--- a/drivers/crypto/hisilicon/sec/sec_algs.c
++++ b/drivers/crypto/hisilicon/sec/sec_algs.c
+@@ -449,7 +449,7 @@ static void sec_skcipher_alg_callback(struct sec_bd_info *sec_resp,
+ */
+ }
+
+- mutex_lock(&ctx->queue->queuelock);
++ spin_lock_bh(&ctx->queue->queuelock);
+ /* Put the IV in place for chained cases */
+ switch (ctx->cipher_alg) {
+ case SEC_C_AES_CBC_128:
+@@ -509,7 +509,7 @@ static void sec_skcipher_alg_callback(struct sec_bd_info *sec_resp,
+ list_del(&backlog_req->backlog_head);
+ }
+ }
+- mutex_unlock(&ctx->queue->queuelock);
++ spin_unlock_bh(&ctx->queue->queuelock);
+
+ mutex_lock(&sec_req->lock);
+ list_del(&sec_req_el->head);
+@@ -798,7 +798,7 @@ static int sec_alg_skcipher_crypto(struct skcipher_request *skreq,
+ */
+
+ /* Grab a big lock for a long time to avoid concurrency issues */
+- mutex_lock(&queue->queuelock);
++ spin_lock_bh(&queue->queuelock);
+
+ /*
+ * Can go on to queue if we have space in either:
+@@ -814,15 +814,15 @@ static int sec_alg_skcipher_crypto(struct skcipher_request *skreq,
+ ret = -EBUSY;
+ if ((skreq->base.flags & CRYPTO_TFM_REQ_MAY_BACKLOG)) {
+ list_add_tail(&sec_req->backlog_head, &ctx->backlog);
+- mutex_unlock(&queue->queuelock);
++ spin_unlock_bh(&queue->queuelock);
+ goto out;
+ }
+
+- mutex_unlock(&queue->queuelock);
++ spin_unlock_bh(&queue->queuelock);
+ goto err_free_elements;
+ }
+ ret = sec_send_request(sec_req, queue);
+- mutex_unlock(&queue->queuelock);
++ spin_unlock_bh(&queue->queuelock);
+ if (ret)
+ goto err_free_elements;
+
+@@ -881,7 +881,7 @@ static int sec_alg_skcipher_init(struct crypto_skcipher *tfm)
+ if (IS_ERR(ctx->queue))
+ return PTR_ERR(ctx->queue);
+
+- mutex_init(&ctx->queue->queuelock);
++ spin_lock_init(&ctx->queue->queuelock);
+ ctx->queue->havesoftqueue = false;
+
+ return 0;
+diff --git a/drivers/crypto/hisilicon/sec/sec_drv.h b/drivers/crypto/hisilicon/sec/sec_drv.h
+index 179a8250d691c..e2a50bf2234b9 100644
+--- a/drivers/crypto/hisilicon/sec/sec_drv.h
++++ b/drivers/crypto/hisilicon/sec/sec_drv.h
+@@ -347,7 +347,7 @@ struct sec_queue {
+ DECLARE_BITMAP(unprocessed, SEC_QUEUE_LEN);
+ DECLARE_KFIFO_PTR(softqueue, typeof(struct sec_request_el *));
+ bool havesoftqueue;
+- struct mutex queuelock;
++ spinlock_t queuelock;
+ void *shadow[SEC_QUEUE_LEN];
+ };
+
+diff --git a/drivers/crypto/hisilicon/sec2/sec.h b/drivers/crypto/hisilicon/sec2/sec.h
+index c2e9b01187a74..a44c8dba3cda6 100644
+--- a/drivers/crypto/hisilicon/sec2/sec.h
++++ b/drivers/crypto/hisilicon/sec2/sec.h
+@@ -119,7 +119,7 @@ struct sec_qp_ctx {
+ struct idr req_idr;
+ struct sec_alg_res res[QM_Q_DEPTH];
+ struct sec_ctx *ctx;
+- struct mutex req_lock;
++ spinlock_t req_lock;
+ struct list_head backlog;
+ struct hisi_acc_sgl_pool *c_in_pool;
+ struct hisi_acc_sgl_pool *c_out_pool;
+diff --git a/drivers/crypto/hisilicon/sec2/sec_crypto.c b/drivers/crypto/hisilicon/sec2/sec_crypto.c
+index a91635c348b5e..dbaa6c918cfd6 100644
+--- a/drivers/crypto/hisilicon/sec2/sec_crypto.c
++++ b/drivers/crypto/hisilicon/sec2/sec_crypto.c
+@@ -127,11 +127,11 @@ static int sec_alloc_req_id(struct sec_req *req, struct sec_qp_ctx *qp_ctx)
+ {
+ int req_id;
+
+- mutex_lock(&qp_ctx->req_lock);
++ spin_lock_bh(&qp_ctx->req_lock);
+
+ req_id = idr_alloc_cyclic(&qp_ctx->req_idr, NULL,
+ 0, QM_Q_DEPTH, GFP_ATOMIC);
+- mutex_unlock(&qp_ctx->req_lock);
++ spin_unlock_bh(&qp_ctx->req_lock);
+ if (unlikely(req_id < 0)) {
+ dev_err(req->ctx->dev, "alloc req id fail!\n");
+ return req_id;
+@@ -156,9 +156,9 @@ static void sec_free_req_id(struct sec_req *req)
+ qp_ctx->req_list[req_id] = NULL;
+ req->qp_ctx = NULL;
+
+- mutex_lock(&qp_ctx->req_lock);
++ spin_lock_bh(&qp_ctx->req_lock);
+ idr_remove(&qp_ctx->req_idr, req_id);
+- mutex_unlock(&qp_ctx->req_lock);
++ spin_unlock_bh(&qp_ctx->req_lock);
+ }
+
+ static u8 pre_parse_finished_bd(struct bd_status *status, void *resp)
+@@ -273,7 +273,7 @@ static int sec_bd_send(struct sec_ctx *ctx, struct sec_req *req)
+ !(req->flag & CRYPTO_TFM_REQ_MAY_BACKLOG))
+ return -EBUSY;
+
+- mutex_lock(&qp_ctx->req_lock);
++ spin_lock_bh(&qp_ctx->req_lock);
+ ret = hisi_qp_send(qp_ctx->qp, &req->sec_sqe);
+
+ if (ctx->fake_req_limit <=
+@@ -281,10 +281,10 @@ static int sec_bd_send(struct sec_ctx *ctx, struct sec_req *req)
+ list_add_tail(&req->backlog_head, &qp_ctx->backlog);
+ atomic64_inc(&ctx->sec->debug.dfx.send_cnt);
+ atomic64_inc(&ctx->sec->debug.dfx.send_busy_cnt);
+- mutex_unlock(&qp_ctx->req_lock);
++ spin_unlock_bh(&qp_ctx->req_lock);
+ return -EBUSY;
+ }
+- mutex_unlock(&qp_ctx->req_lock);
++ spin_unlock_bh(&qp_ctx->req_lock);
+
+ if (unlikely(ret == -EBUSY))
+ return -ENOBUFS;
+@@ -487,7 +487,7 @@ static int sec_create_qp_ctx(struct hisi_qm *qm, struct sec_ctx *ctx,
+
+ qp->req_cb = sec_req_cb;
+
+- mutex_init(&qp_ctx->req_lock);
++ spin_lock_init(&qp_ctx->req_lock);
+ idr_init(&qp_ctx->req_idr);
+ INIT_LIST_HEAD(&qp_ctx->backlog);
+
+@@ -620,7 +620,7 @@ static int sec_auth_init(struct sec_ctx *ctx)
+ {
+ struct sec_auth_ctx *a_ctx = &ctx->a_ctx;
+
+- a_ctx->a_key = dma_alloc_coherent(ctx->dev, SEC_MAX_KEY_SIZE,
++ a_ctx->a_key = dma_alloc_coherent(ctx->dev, SEC_MAX_AKEY_SIZE,
+ &a_ctx->a_key_dma, GFP_KERNEL);
+ if (!a_ctx->a_key)
+ return -ENOMEM;
+@@ -632,8 +632,8 @@ static void sec_auth_uninit(struct sec_ctx *ctx)
+ {
+ struct sec_auth_ctx *a_ctx = &ctx->a_ctx;
+
+- memzero_explicit(a_ctx->a_key, SEC_MAX_KEY_SIZE);
+- dma_free_coherent(ctx->dev, SEC_MAX_KEY_SIZE,
++ memzero_explicit(a_ctx->a_key, SEC_MAX_AKEY_SIZE);
++ dma_free_coherent(ctx->dev, SEC_MAX_AKEY_SIZE,
+ a_ctx->a_key, a_ctx->a_key_dma);
+ }
+
+@@ -1382,7 +1382,7 @@ static struct sec_req *sec_back_req_clear(struct sec_ctx *ctx,
+ {
+ struct sec_req *backlog_req = NULL;
+
+- mutex_lock(&qp_ctx->req_lock);
++ spin_lock_bh(&qp_ctx->req_lock);
+ if (ctx->fake_req_limit >=
+ atomic_read(&qp_ctx->qp->qp_status.used) &&
+ !list_empty(&qp_ctx->backlog)) {
+@@ -1390,7 +1390,7 @@ static struct sec_req *sec_back_req_clear(struct sec_ctx *ctx,
+ typeof(*backlog_req), backlog_head);
+ list_del(&backlog_req->backlog_head);
+ }
+- mutex_unlock(&qp_ctx->req_lock);
++ spin_unlock_bh(&qp_ctx->req_lock);
+
+ return backlog_req;
+ }
+diff --git a/drivers/crypto/hisilicon/sec2/sec_crypto.h b/drivers/crypto/hisilicon/sec2/sec_crypto.h
+index 5e039b50e9d4c..d033f63b583f8 100644
+--- a/drivers/crypto/hisilicon/sec2/sec_crypto.h
++++ b/drivers/crypto/hisilicon/sec2/sec_crypto.h
+@@ -7,6 +7,7 @@
+ #define SEC_AIV_SIZE 12
+ #define SEC_IV_SIZE 24
+ #define SEC_MAX_KEY_SIZE 64
++#define SEC_MAX_AKEY_SIZE 128
+ #define SEC_COMM_SCENE 0
+ #define SEC_MIN_BLOCK_SZ 1
+
+diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
+index 9ff885d50edfc..389a7b51f1f38 100644
+--- a/drivers/crypto/inside-secure/safexcel.c
++++ b/drivers/crypto/inside-secure/safexcel.c
+@@ -1831,6 +1831,8 @@ static const struct of_device_id safexcel_of_match_table[] = {
+ {},
+ };
+
++MODULE_DEVICE_TABLE(of, safexcel_of_match_table);
++
+ static struct platform_driver crypto_safexcel = {
+ .probe = safexcel_probe,
+ .remove = safexcel_remove,
+diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
+index 468d1097a1ece..f23569e4b0bde 100644
+--- a/drivers/dma/dw-edma/dw-edma-core.c
++++ b/drivers/dma/dw-edma/dw-edma-core.c
+@@ -423,7 +423,7 @@ dw_edma_device_transfer(struct dw_edma_transfer *xfer)
+ chunk->ll_region.sz += burst->sz;
+ desc->alloc_sz += burst->sz;
+
+- if (chan->dir == EDMA_DIR_WRITE) {
++ if (dir == DMA_DEV_TO_MEM) {
+ burst->sar = src_addr;
+ if (xfer->type == EDMA_XFER_CYCLIC) {
+ burst->dar = xfer->xfer.cyclic.paddr;
+diff --git a/drivers/dma/imx-dma.c b/drivers/dma/imx-dma.c
+index 2ddc31e64db03..da31e73d24d4c 100644
+--- a/drivers/dma/imx-dma.c
++++ b/drivers/dma/imx-dma.c
+@@ -1047,7 +1047,7 @@ static int __init imxdma_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ imxdma->dev = &pdev->dev;
+- imxdma->devtype = (enum imx_dma_type)of_device_get_match_data(&pdev->dev);
++ imxdma->devtype = (uintptr_t)of_device_get_match_data(&pdev->dev);
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ imxdma->base = devm_ioremap_resource(&pdev->dev, res);
+diff --git a/drivers/dma/sf-pdma/sf-pdma.c b/drivers/dma/sf-pdma/sf-pdma.c
+index f12606aeff87c..ab0ad7a2f2015 100644
+--- a/drivers/dma/sf-pdma/sf-pdma.c
++++ b/drivers/dma/sf-pdma/sf-pdma.c
+@@ -52,16 +52,6 @@ static inline struct sf_pdma_desc *to_sf_pdma_desc(struct virt_dma_desc *vd)
+ static struct sf_pdma_desc *sf_pdma_alloc_desc(struct sf_pdma_chan *chan)
+ {
+ struct sf_pdma_desc *desc;
+- unsigned long flags;
+-
+- spin_lock_irqsave(&chan->lock, flags);
+-
+- if (chan->desc && !chan->desc->in_use) {
+- spin_unlock_irqrestore(&chan->lock, flags);
+- return chan->desc;
+- }
+-
+- spin_unlock_irqrestore(&chan->lock, flags);
+
+ desc = kzalloc(sizeof(*desc), GFP_NOWAIT);
+ if (!desc)
+@@ -111,7 +101,6 @@ sf_pdma_prep_dma_memcpy(struct dma_chan *dchan, dma_addr_t dest, dma_addr_t src,
+ desc->async_tx = vchan_tx_prep(&chan->vchan, &desc->vdesc, flags);
+
+ spin_lock_irqsave(&chan->vchan.lock, iflags);
+- chan->desc = desc;
+ sf_pdma_fill_desc(desc, dest, src, len);
+ spin_unlock_irqrestore(&chan->vchan.lock, iflags);
+
+@@ -170,11 +159,17 @@ static size_t sf_pdma_desc_residue(struct sf_pdma_chan *chan,
+ unsigned long flags;
+ u64 residue = 0;
+ struct sf_pdma_desc *desc;
+- struct dma_async_tx_descriptor *tx;
++ struct dma_async_tx_descriptor *tx = NULL;
+
+ spin_lock_irqsave(&chan->vchan.lock, flags);
+
+- tx = &chan->desc->vdesc.tx;
++ list_for_each_entry(vd, &chan->vchan.desc_submitted, node)
++ if (vd->tx.cookie == cookie)
++ tx = &vd->tx;
++
++ if (!tx)
++ goto out;
++
+ if (cookie == tx->chan->completed_cookie)
+ goto out;
+
+@@ -241,6 +236,19 @@ static void sf_pdma_enable_request(struct sf_pdma_chan *chan)
+ writel(v, regs->ctrl);
+ }
+
++static struct sf_pdma_desc *sf_pdma_get_first_pending_desc(struct sf_pdma_chan *chan)
++{
++ struct virt_dma_chan *vchan = &chan->vchan;
++ struct virt_dma_desc *vdesc;
++
++ if (list_empty(&vchan->desc_issued))
++ return NULL;
++
++ vdesc = list_first_entry(&vchan->desc_issued, struct virt_dma_desc, node);
++
++ return container_of(vdesc, struct sf_pdma_desc, vdesc);
++}
++
+ static void sf_pdma_xfer_desc(struct sf_pdma_chan *chan)
+ {
+ struct sf_pdma_desc *desc = chan->desc;
+@@ -268,8 +276,11 @@ static void sf_pdma_issue_pending(struct dma_chan *dchan)
+
+ spin_lock_irqsave(&chan->vchan.lock, flags);
+
+- if (vchan_issue_pending(&chan->vchan) && chan->desc)
++ if (!chan->desc && vchan_issue_pending(&chan->vchan)) {
++ /* vchan_issue_pending has made a check that desc in not NULL */
++ chan->desc = sf_pdma_get_first_pending_desc(chan);
+ sf_pdma_xfer_desc(chan);
++ }
+
+ spin_unlock_irqrestore(&chan->vchan.lock, flags);
+ }
+@@ -298,6 +309,11 @@ static void sf_pdma_donebh_tasklet(struct tasklet_struct *t)
+ spin_lock_irqsave(&chan->vchan.lock, flags);
+ list_del(&chan->desc->vdesc.node);
+ vchan_cookie_complete(&chan->desc->vdesc);
++
++ chan->desc = sf_pdma_get_first_pending_desc(chan);
++ if (chan->desc)
++ sf_pdma_xfer_desc(chan);
++
+ spin_unlock_irqrestore(&chan->vchan.lock, flags);
+ }
+
+diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
+index ddf0b9ff9e15c..435d0e2658a42 100644
+--- a/drivers/firmware/arm_scpi.c
++++ b/drivers/firmware/arm_scpi.c
+@@ -815,7 +815,7 @@ static int scpi_init_versions(struct scpi_drvinfo *info)
+ info->firmware_version = le32_to_cpu(caps.platform_version);
+ }
+ /* Ignore error if not implemented */
+- if (scpi_info->is_legacy && ret == -EOPNOTSUPP)
++ if (info->is_legacy && ret == -EOPNOTSUPP)
+ return 0;
+
+ return ret;
+@@ -913,13 +913,14 @@ static int scpi_probe(struct platform_device *pdev)
+ struct resource res;
+ struct device *dev = &pdev->dev;
+ struct device_node *np = dev->of_node;
++ struct scpi_drvinfo *scpi_drvinfo;
+
+- scpi_info = devm_kzalloc(dev, sizeof(*scpi_info), GFP_KERNEL);
+- if (!scpi_info)
++ scpi_drvinfo = devm_kzalloc(dev, sizeof(*scpi_drvinfo), GFP_KERNEL);
++ if (!scpi_drvinfo)
+ return -ENOMEM;
+
+ if (of_match_device(legacy_scpi_of_match, &pdev->dev))
+- scpi_info->is_legacy = true;
++ scpi_drvinfo->is_legacy = true;
+
+ count = of_count_phandle_with_args(np, "mboxes", "#mbox-cells");
+ if (count < 0) {
+@@ -927,19 +928,19 @@ static int scpi_probe(struct platform_device *pdev)
+ return -ENODEV;
+ }
+
+- scpi_info->channels = devm_kcalloc(dev, count, sizeof(struct scpi_chan),
+- GFP_KERNEL);
+- if (!scpi_info->channels)
++ scpi_drvinfo->channels =
++ devm_kcalloc(dev, count, sizeof(struct scpi_chan), GFP_KERNEL);
++ if (!scpi_drvinfo->channels)
+ return -ENOMEM;
+
+- ret = devm_add_action(dev, scpi_free_channels, scpi_info);
++ ret = devm_add_action(dev, scpi_free_channels, scpi_drvinfo);
+ if (ret)
+ return ret;
+
+- for (; scpi_info->num_chans < count; scpi_info->num_chans++) {
++ for (; scpi_drvinfo->num_chans < count; scpi_drvinfo->num_chans++) {
+ resource_size_t size;
+- int idx = scpi_info->num_chans;
+- struct scpi_chan *pchan = scpi_info->channels + idx;
++ int idx = scpi_drvinfo->num_chans;
++ struct scpi_chan *pchan = scpi_drvinfo->channels + idx;
+ struct mbox_client *cl = &pchan->cl;
+ struct device_node *shmem = of_parse_phandle(np, "shmem", idx);
+
+@@ -986,45 +987,53 @@ static int scpi_probe(struct platform_device *pdev)
+ return ret;
+ }
+
+- scpi_info->commands = scpi_std_commands;
++ scpi_drvinfo->commands = scpi_std_commands;
+
+- platform_set_drvdata(pdev, scpi_info);
++ platform_set_drvdata(pdev, scpi_drvinfo);
+
+- if (scpi_info->is_legacy) {
++ if (scpi_drvinfo->is_legacy) {
+ /* Replace with legacy variants */
+ scpi_ops.clk_set_val = legacy_scpi_clk_set_val;
+- scpi_info->commands = scpi_legacy_commands;
++ scpi_drvinfo->commands = scpi_legacy_commands;
+
+ /* Fill priority bitmap */
+ for (idx = 0; idx < ARRAY_SIZE(legacy_hpriority_cmds); idx++)
+ set_bit(legacy_hpriority_cmds[idx],
+- scpi_info->cmd_priority);
++ scpi_drvinfo->cmd_priority);
+ }
+
+- ret = scpi_init_versions(scpi_info);
++ scpi_info = scpi_drvinfo;
++
++ ret = scpi_init_versions(scpi_drvinfo);
+ if (ret) {
+ dev_err(dev, "incorrect or no SCP firmware found\n");
++ scpi_info = NULL;
+ return ret;
+ }
+
+- if (scpi_info->is_legacy && !scpi_info->protocol_version &&
+- !scpi_info->firmware_version)
++ if (scpi_drvinfo->is_legacy && !scpi_drvinfo->protocol_version &&
++ !scpi_drvinfo->firmware_version)
+ dev_info(dev, "SCP Protocol legacy pre-1.0 firmware\n");
+ else
+ dev_info(dev, "SCP Protocol %lu.%lu Firmware %lu.%lu.%lu version\n",
+ FIELD_GET(PROTO_REV_MAJOR_MASK,
+- scpi_info->protocol_version),
++ scpi_drvinfo->protocol_version),
+ FIELD_GET(PROTO_REV_MINOR_MASK,
+- scpi_info->protocol_version),
++ scpi_drvinfo->protocol_version),
+ FIELD_GET(FW_REV_MAJOR_MASK,
+- scpi_info->firmware_version),
++ scpi_drvinfo->firmware_version),
+ FIELD_GET(FW_REV_MINOR_MASK,
+- scpi_info->firmware_version),
++ scpi_drvinfo->firmware_version),
+ FIELD_GET(FW_REV_PATCH_MASK,
+- scpi_info->firmware_version));
+- scpi_info->scpi_ops = &scpi_ops;
++ scpi_drvinfo->firmware_version));
++
++ scpi_drvinfo->scpi_ops = &scpi_ops;
+
+- return devm_of_platform_populate(dev);
++ ret = devm_of_platform_populate(dev);
++ if (ret)
++ scpi_info = NULL;
++
++ return ret;
+ }
+
+ static const struct of_device_id scpi_of_match[] = {
+diff --git a/drivers/firmware/tegra/bpmp-debugfs.c b/drivers/firmware/tegra/bpmp-debugfs.c
+index fd89899aeeed9..0c440afd52247 100644
+--- a/drivers/firmware/tegra/bpmp-debugfs.c
++++ b/drivers/firmware/tegra/bpmp-debugfs.c
+@@ -474,7 +474,7 @@ static int bpmp_populate_debugfs_inband(struct tegra_bpmp *bpmp,
+ mode |= attrs & DEBUGFS_S_IWUSR ? 0200 : 0;
+ dentry = debugfs_create_file(name, mode, parent, bpmp,
+ &bpmp_debug_fops);
+- if (!dentry) {
++ if (IS_ERR(dentry)) {
+ err = -ENOMEM;
+ goto out;
+ }
+@@ -725,7 +725,7 @@ static int bpmp_populate_dir(struct tegra_bpmp *bpmp, struct seqbuf *seqbuf,
+
+ if (t & DEBUGFS_S_ISDIR) {
+ dentry = debugfs_create_dir(name, parent);
+- if (!dentry)
++ if (IS_ERR(dentry))
+ return -ENOMEM;
+ err = bpmp_populate_dir(bpmp, seqbuf, dentry, depth+1);
+ if (err < 0)
+@@ -738,7 +738,7 @@ static int bpmp_populate_dir(struct tegra_bpmp *bpmp, struct seqbuf *seqbuf,
+ dentry = debugfs_create_file(name, mode,
+ parent, bpmp,
+ &debugfs_fops);
+- if (!dentry)
++ if (IS_ERR(dentry))
+ return -ENOMEM;
+ }
+ }
+@@ -788,11 +788,11 @@ int tegra_bpmp_init_debugfs(struct tegra_bpmp *bpmp)
+ return 0;
+
+ root = debugfs_create_dir("bpmp", NULL);
+- if (!root)
++ if (IS_ERR(root))
+ return -ENOMEM;
+
+ bpmp->debugfs_mirror = debugfs_create_dir("debug", root);
+- if (!bpmp->debugfs_mirror) {
++ if (IS_ERR(bpmp->debugfs_mirror)) {
+ err = -ENOMEM;
+ goto out;
+ }
+diff --git a/drivers/fpga/altera-pr-ip-core.c b/drivers/fpga/altera-pr-ip-core.c
+index be0667968d33b..df8671af4a92a 100644
+--- a/drivers/fpga/altera-pr-ip-core.c
++++ b/drivers/fpga/altera-pr-ip-core.c
+@@ -108,7 +108,7 @@ static int alt_pr_fpga_write(struct fpga_manager *mgr, const char *buf,
+ u32 *buffer_32 = (u32 *)buf;
+ size_t i = 0;
+
+- if (count <= 0)
++ if (!count)
+ return -EINVAL;
+
+ /* Write out the complete 32-bit chunks */
+diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
+index 6dec81b1f24be..5cabb9a9c272d 100644
+--- a/drivers/gpio/gpiolib-of.c
++++ b/drivers/gpio/gpiolib-of.c
+@@ -861,7 +861,8 @@ int of_mm_gpiochip_add_data(struct device_node *np,
+ if (mm_gc->save_regs)
+ mm_gc->save_regs(mm_gc);
+
+- mm_gc->gc.of_node = np;
++ of_node_put(mm_gc->gc.of_node);
++ mm_gc->gc.of_node = of_node_get(np);
+
+ ret = gpiochip_add_data(gc, data);
+ if (ret)
+@@ -869,6 +870,7 @@ int of_mm_gpiochip_add_data(struct device_node *np,
+
+ return 0;
+ err2:
++ of_node_put(np);
+ iounmap(mm_gc->regs);
+ err1:
+ kfree(gc->label);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+index f4509656ea8c8..e9a8eb070766c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+@@ -1401,16 +1401,10 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
+ struct amdgpu_vm *vm)
+ {
+ struct amdkfd_process_info *process_info = vm->process_info;
+- struct amdgpu_bo *pd = vm->root.bo;
+
+ if (!process_info)
+ return;
+
+- /* Release eviction fence from PD */
+- amdgpu_bo_reserve(pd, false);
+- amdgpu_bo_fence(pd, NULL, false);
+- amdgpu_bo_unreserve(pd);
+-
+ /* Update process info */
+ mutex_lock(&process_info->lock);
+ process_info->n_vms--;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+index 2019622191b59..ee0cbc6ccbfb3 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+@@ -1260,7 +1260,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
+
+ p->fence = dma_fence_get(&job->base.s_fence->finished);
+
+- amdgpu_ctx_add_fence(p->ctx, entity, p->fence, &seq);
++ seq = amdgpu_ctx_add_fence(p->ctx, entity, p->fence);
+ amdgpu_cs_post_dependencies(p);
+
+ if ((job->preamble_status & AMDGPU_PREAMBLE_IB_PRESENT) &&
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+index c317078d1afd0..95ed528e6ec5c 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+@@ -135,9 +135,9 @@ static enum amdgpu_ring_priority_level amdgpu_ctx_sched_prio_to_ring_prio(int32_
+
+ static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
+ {
+- struct amdgpu_device *adev = ctx->adev;
+- int32_t ctx_prio;
++ struct amdgpu_device *adev = ctx->mgr->adev;
+ unsigned int hw_prio;
++ int32_t ctx_prio;
+
+ ctx_prio = (ctx->override_priority == AMDGPU_CTX_PRIORITY_UNSET) ?
+ ctx->init_priority : ctx->override_priority;
+@@ -166,7 +166,7 @@ static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
+ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
+ const u32 ring)
+ {
+- struct amdgpu_device *adev = ctx->adev;
++ struct amdgpu_device *adev = ctx->mgr->adev;
+ struct amdgpu_ctx_entity *entity;
+ struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
+ unsigned num_scheds = 0;
+@@ -220,35 +220,6 @@ error_free_entity:
+ return r;
+ }
+
+-static int amdgpu_ctx_init(struct amdgpu_device *adev,
+- int32_t priority,
+- struct drm_file *filp,
+- struct amdgpu_ctx *ctx)
+-{
+- int r;
+-
+- r = amdgpu_ctx_priority_permit(filp, priority);
+- if (r)
+- return r;
+-
+- memset(ctx, 0, sizeof(*ctx));
+-
+- ctx->adev = adev;
+-
+- kref_init(&ctx->refcount);
+- spin_lock_init(&ctx->ring_lock);
+- mutex_init(&ctx->lock);
+-
+- ctx->reset_counter = atomic_read(&adev->gpu_reset_counter);
+- ctx->reset_counter_query = ctx->reset_counter;
+- ctx->vram_lost_counter = atomic_read(&adev->vram_lost_counter);
+- ctx->init_priority = priority;
+- ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET;
+- ctx->stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE;
+-
+- return 0;
+-}
+-
+ static void amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity)
+ {
+
+@@ -266,7 +237,7 @@ static void amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity)
+ static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx,
+ u32 *stable_pstate)
+ {
+- struct amdgpu_device *adev = ctx->adev;
++ struct amdgpu_device *adev = ctx->mgr->adev;
+ enum amd_dpm_forced_level current_level;
+
+ current_level = amdgpu_dpm_get_performance_level(adev);
+@@ -291,10 +262,42 @@ static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx,
+ return 0;
+ }
+
++static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, int32_t priority,
++ struct drm_file *filp, struct amdgpu_ctx *ctx)
++{
++ u32 current_stable_pstate;
++ int r;
++
++ r = amdgpu_ctx_priority_permit(filp, priority);
++ if (r)
++ return r;
++
++ memset(ctx, 0, sizeof(*ctx));
++
++ kref_init(&ctx->refcount);
++ ctx->mgr = mgr;
++ spin_lock_init(&ctx->ring_lock);
++ mutex_init(&ctx->lock);
++
++ ctx->reset_counter = atomic_read(&mgr->adev->gpu_reset_counter);
++ ctx->reset_counter_query = ctx->reset_counter;
++ ctx->vram_lost_counter = atomic_read(&mgr->adev->vram_lost_counter);
++ ctx->init_priority = priority;
++ ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET;
++
++ r = amdgpu_ctx_get_stable_pstate(ctx, ¤t_stable_pstate);
++ if (r)
++ return r;
++
++ ctx->stable_pstate = current_stable_pstate;
++
++ return 0;
++}
++
+ static int amdgpu_ctx_set_stable_pstate(struct amdgpu_ctx *ctx,
+ u32 stable_pstate)
+ {
+- struct amdgpu_device *adev = ctx->adev;
++ struct amdgpu_device *adev = ctx->mgr->adev;
+ enum amd_dpm_forced_level level;
+ u32 current_stable_pstate;
+ int r;
+@@ -345,7 +348,8 @@ done:
+ static void amdgpu_ctx_fini(struct kref *ref)
+ {
+ struct amdgpu_ctx *ctx = container_of(ref, struct amdgpu_ctx, refcount);
+- struct amdgpu_device *adev = ctx->adev;
++ struct amdgpu_ctx_mgr *mgr = ctx->mgr;
++ struct amdgpu_device *adev = mgr->adev;
+ unsigned i, j, idx;
+
+ if (!adev)
+@@ -359,7 +363,7 @@ static void amdgpu_ctx_fini(struct kref *ref)
+ }
+
+ if (drm_dev_enter(&adev->ddev, &idx)) {
+- amdgpu_ctx_set_stable_pstate(ctx, AMDGPU_CTX_STABLE_PSTATE_NONE);
++ amdgpu_ctx_set_stable_pstate(ctx, ctx->stable_pstate);
+ drm_dev_exit(idx);
+ }
+
+@@ -421,7 +425,7 @@ static int amdgpu_ctx_alloc(struct amdgpu_device *adev,
+ }
+
+ *id = (uint32_t)r;
+- r = amdgpu_ctx_init(adev, priority, filp, ctx);
++ r = amdgpu_ctx_init(mgr, priority, filp, ctx);
+ if (r) {
+ idr_remove(&mgr->ctx_handles, *id);
+ *id = 0;
+@@ -671,9 +675,9 @@ int amdgpu_ctx_put(struct amdgpu_ctx *ctx)
+ return 0;
+ }
+
+-void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
+- struct drm_sched_entity *entity,
+- struct dma_fence *fence, uint64_t *handle)
++uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
++ struct drm_sched_entity *entity,
++ struct dma_fence *fence)
+ {
+ struct amdgpu_ctx_entity *centity = to_amdgpu_ctx_entity(entity);
+ uint64_t seq = centity->sequence;
+@@ -682,8 +686,7 @@ void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
+
+ idx = seq & (amdgpu_sched_jobs - 1);
+ other = centity->fences[idx];
+- if (other)
+- BUG_ON(!dma_fence_is_signaled(other));
++ WARN_ON(other && !dma_fence_is_signaled(other));
+
+ dma_fence_get(fence);
+
+@@ -693,8 +696,7 @@ void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
+ spin_unlock(&ctx->ring_lock);
+
+ dma_fence_put(other);
+- if (handle)
+- *handle = seq;
++ return seq;
+ }
+
+ struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
+@@ -731,7 +733,7 @@ static void amdgpu_ctx_set_entity_priority(struct amdgpu_ctx *ctx,
+ int hw_ip,
+ int32_t priority)
+ {
+- struct amdgpu_device *adev = ctx->adev;
++ struct amdgpu_device *adev = ctx->mgr->adev;
+ unsigned int hw_prio;
+ struct drm_gpu_scheduler **scheds = NULL;
+ unsigned num_scheds;
+@@ -796,8 +798,10 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx,
+ return r;
+ }
+
+-void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr)
++void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr,
++ struct amdgpu_device *adev)
+ {
++ mgr->adev = adev;
+ mutex_init(&mgr->lock);
+ idr_init(&mgr->ctx_handles);
+ }
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+index 142f2f87d44ce..681050bc828c3 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+@@ -40,7 +40,7 @@ struct amdgpu_ctx_entity {
+
+ struct amdgpu_ctx {
+ struct kref refcount;
+- struct amdgpu_device *adev;
++ struct amdgpu_ctx_mgr *mgr;
+ unsigned reset_counter;
+ unsigned reset_counter_query;
+ uint32_t vram_lost_counter;
+@@ -70,9 +70,9 @@ int amdgpu_ctx_put(struct amdgpu_ctx *ctx);
+
+ int amdgpu_ctx_get_entity(struct amdgpu_ctx *ctx, u32 hw_ip, u32 instance,
+ u32 ring, struct drm_sched_entity **entity);
+-void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
+- struct drm_sched_entity *entity,
+- struct dma_fence *fence, uint64_t *seq);
++uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
++ struct drm_sched_entity *entity,
++ struct dma_fence *fence);
+ struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
+ struct drm_sched_entity *entity,
+ uint64_t seq);
+@@ -85,7 +85,8 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
+ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx,
+ struct drm_sched_entity *entity);
+
+-void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr);
++void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr,
++ struct amdgpu_device *adev);
+ void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr *mgr);
+ long amdgpu_ctx_mgr_entity_flush(struct amdgpu_ctx_mgr *mgr, long timeout);
+ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+index e4fcbb385a62b..d68db66d969b9 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+@@ -1891,12 +1891,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ break;
+ case IP_VERSION(7, 4, 0):
+ case IP_VERSION(7, 4, 1):
+- adev->nbio.funcs = &nbio_v7_4_funcs;
+- adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg;
+- break;
+ case IP_VERSION(7, 4, 4):
+ adev->nbio.funcs = &nbio_v7_4_funcs;
+- adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg_ald;
++ adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg;
+ break;
+ case IP_VERSION(7, 2, 0):
+ case IP_VERSION(7, 2, 1):
+@@ -1910,15 +1907,12 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
+ case IP_VERSION(2, 3, 0):
+ case IP_VERSION(2, 3, 1):
+ case IP_VERSION(2, 3, 2):
+- adev->nbio.funcs = &nbio_v2_3_funcs;
+- adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
+- break;
+ case IP_VERSION(3, 3, 0):
+ case IP_VERSION(3, 3, 1):
+ case IP_VERSION(3, 3, 2):
+ case IP_VERSION(3, 3, 3):
+ adev->nbio.funcs = &nbio_v2_3_funcs;
+- adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg_sc;
++ adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
+ break;
+ default:
+ break;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+index 49c55d82cba80..20a432a774c15 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+@@ -1142,7 +1142,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
+ mutex_init(&fpriv->bo_list_lock);
+ idr_init(&fpriv->bo_list_handles);
+
+- amdgpu_ctx_mgr_init(&fpriv->ctx_mgr);
++ amdgpu_ctx_mgr_init(&fpriv->ctx_mgr, adev);
+
+ file_priv->driver_priv = fpriv;
+ goto out_suspend;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+index 940752488330f..916e2396d263f 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+@@ -882,6 +882,10 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
+ if (WARN_ON_ONCE(min_offset > max_offset))
+ return -EINVAL;
+
++ /* Check domain to be pinned to against preferred domains */
++ if (bo->preferred_domains & domain)
++ domain = bo->preferred_domains & domain;
++
+ /* A shared bo cannot be migrated to VRAM */
+ if (bo->tbo.base.import_attach) {
+ if (domain & AMDGPU_GEM_DOMAIN_GTT)
+diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+index ee7cab37dfd58..dae78a1e88e63 100644
+--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
++++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+@@ -328,27 +328,6 @@ const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg = {
+ .ref_and_mask_sdma1 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__SDMA1_MASK,
+ };
+
+-const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc = {
+- .ref_and_mask_cp0 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP0_MASK,
+- .ref_and_mask_cp1 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP1_MASK,
+- .ref_and_mask_cp2 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP2_MASK,
+- .ref_and_mask_cp3 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP3_MASK,
+- .ref_and_mask_cp4 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP4_MASK,
+- .ref_and_mask_cp5 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP5_MASK,
+- .ref_and_mask_cp6 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP6_MASK,
+- .ref_and_mask_cp7 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP7_MASK,
+- .ref_and_mask_cp8 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP8_MASK,
+- .ref_and_mask_cp9 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP9_MASK,
+- .ref_and_mask_sdma0 = GPU_HDP_FLUSH_DONE__RSVD_ENG1_MASK,
+- .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__RSVD_ENG2_MASK,
+- .ref_and_mask_sdma2 = GPU_HDP_FLUSH_DONE__RSVD_ENG3_MASK,
+- .ref_and_mask_sdma3 = GPU_HDP_FLUSH_DONE__RSVD_ENG4_MASK,
+- .ref_and_mask_sdma4 = GPU_HDP_FLUSH_DONE__RSVD_ENG5_MASK,
+- .ref_and_mask_sdma5 = GPU_HDP_FLUSH_DONE__RSVD_ENG6_MASK,
+- .ref_and_mask_sdma6 = GPU_HDP_FLUSH_DONE__RSVD_ENG7_MASK,
+- .ref_and_mask_sdma7 = GPU_HDP_FLUSH_DONE__RSVD_ENG8_MASK,
+-};
+-
+ static void nbio_v2_3_init_registers(struct amdgpu_device *adev)
+ {
+ uint32_t def, data;
+diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
+index 6074dd3a1ed8f..a43b60acf7f63 100644
+--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
++++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
+@@ -27,7 +27,6 @@
+ #include "soc15_common.h"
+
+ extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg;
+-extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc;
+ extern const struct amdgpu_nbio_funcs nbio_v2_3_funcs;
+
+ #endif
+diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+index c2357e83a8c41..09c0def356a18 100644
+--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
++++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+@@ -339,27 +339,6 @@ const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg = {
+ .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__SDMA1_MASK,
+ };
+
+-const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg_ald = {
+- .ref_and_mask_cp0 = GPU_HDP_FLUSH_DONE__CP0_MASK,
+- .ref_and_mask_cp1 = GPU_HDP_FLUSH_DONE__CP1_MASK,
+- .ref_and_mask_cp2 = GPU_HDP_FLUSH_DONE__CP2_MASK,
+- .ref_and_mask_cp3 = GPU_HDP_FLUSH_DONE__CP3_MASK,
+- .ref_and_mask_cp4 = GPU_HDP_FLUSH_DONE__CP4_MASK,
+- .ref_and_mask_cp5 = GPU_HDP_FLUSH_DONE__CP5_MASK,
+- .ref_and_mask_cp6 = GPU_HDP_FLUSH_DONE__CP6_MASK,
+- .ref_and_mask_cp7 = GPU_HDP_FLUSH_DONE__CP7_MASK,
+- .ref_and_mask_cp8 = GPU_HDP_FLUSH_DONE__CP8_MASK,
+- .ref_and_mask_cp9 = GPU_HDP_FLUSH_DONE__CP9_MASK,
+- .ref_and_mask_sdma0 = GPU_HDP_FLUSH_DONE__RSVD_ENG1_MASK,
+- .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__RSVD_ENG2_MASK,
+- .ref_and_mask_sdma2 = GPU_HDP_FLUSH_DONE__RSVD_ENG3_MASK,
+- .ref_and_mask_sdma3 = GPU_HDP_FLUSH_DONE__RSVD_ENG4_MASK,
+- .ref_and_mask_sdma4 = GPU_HDP_FLUSH_DONE__RSVD_ENG5_MASK,
+- .ref_and_mask_sdma5 = GPU_HDP_FLUSH_DONE__RSVD_ENG6_MASK,
+- .ref_and_mask_sdma6 = GPU_HDP_FLUSH_DONE__RSVD_ENG7_MASK,
+- .ref_and_mask_sdma7 = GPU_HDP_FLUSH_DONE__RSVD_ENG8_MASK,
+-};
+-
+ static void nbio_v7_4_init_registers(struct amdgpu_device *adev)
+ {
+ uint32_t baco_cntl;
+diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
+index 7490022d79d4f..f27c417288224 100644
+--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
++++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
+@@ -27,7 +27,6 @@
+ #include "soc15_common.h"
+
+ extern const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg;
+-extern const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg_ald;
+ extern const struct amdgpu_nbio_funcs nbio_v7_4_funcs;
+ extern struct amdgpu_nbio_ras nbio_v7_4_ras;
+
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+index f5f39984702f2..a3282ddbff86e 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+@@ -571,7 +571,7 @@ static bool execute_synaptics_rc_command(struct drm_dp_aux *aux,
+ unsigned char rc_cmd = 0;
+ unsigned char rc_result = 0xFF;
+ unsigned char i = 0;
+- uint8_t ret = 0;
++ int ret;
+
+ if (is_write_cmd) {
+ // write rc data
+diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
+index be2fc4791c1da..7d152cc6f8376 100644
+--- a/drivers/gpu/drm/bridge/Kconfig
++++ b/drivers/gpu/drm/bridge/Kconfig
+@@ -82,6 +82,8 @@ config DRM_ITE_IT6505
+ select DRM_KMS_HELPER
+ select DRM_DP_HELPER
+ select EXTCON
++ select CRYPTO
++ select CRYPTO_HASH
+ help
+ ITE IT6505 DisplayPort bridge chip driver.
+
+diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+index 668dcefbae17a..f4d6359f6219e 100644
+--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
++++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+@@ -1060,6 +1060,10 @@ static int adv7511_init_cec_regmap(struct adv7511 *adv)
+ ADV7511_CEC_I2C_ADDR_DEFAULT);
+ if (IS_ERR(adv->i2c_cec))
+ return PTR_ERR(adv->i2c_cec);
++
++ regmap_write(adv->regmap, ADV7511_REG_CEC_I2C_ADDR,
++ adv->i2c_cec->addr << 1);
++
+ i2c_set_clientdata(adv->i2c_cec, adv);
+
+ adv->regmap_cec = devm_regmap_init_i2c(adv->i2c_cec,
+@@ -1264,9 +1268,6 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id)
+ if (ret)
+ goto err_i2c_unregister_packet;
+
+- regmap_write(adv7511->regmap, ADV7511_REG_CEC_I2C_ADDR,
+- adv7511->i2c_cec->addr << 1);
+-
+ INIT_WORK(&adv7511->hpd_work, adv7511_hpd_work);
+
+ if (i2c->irq) {
+@@ -1383,10 +1384,21 @@ static struct i2c_driver adv7511_driver = {
+
+ static int __init adv7511_init(void)
+ {
+- if (IS_ENABLED(CONFIG_DRM_MIPI_DSI))
+- mipi_dsi_driver_register(&adv7533_dsi_driver);
++ int ret;
+
+- return i2c_add_driver(&adv7511_driver);
++ if (IS_ENABLED(CONFIG_DRM_MIPI_DSI)) {
++ ret = mipi_dsi_driver_register(&adv7533_dsi_driver);
++ if (ret)
++ return ret;
++ }
++
++ ret = i2c_add_driver(&adv7511_driver);
++ if (ret) {
++ if (IS_ENABLED(CONFIG_DRM_MIPI_DSI))
++ mipi_dsi_driver_unregister(&adv7533_dsi_driver);
++ }
++
++ return ret;
+ }
+ module_init(adv7511_init);
+
+diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c
+index 060849f8ad8b4..c527a3a02eac7 100644
+--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
++++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
+@@ -2651,14 +2651,6 @@ static int anx7625_i2c_probe(struct i2c_client *client,
+ platform->aux.dev = dev;
+ platform->aux.transfer = anx7625_aux_transfer;
+ drm_dp_aux_init(&platform->aux);
+- devm_of_dp_aux_populate_ep_devices(&platform->aux);
+-
+- ret = anx7625_parse_dt(dev, pdata);
+- if (ret) {
+- if (ret != -EPROBE_DEFER)
+- DRM_DEV_ERROR(dev, "fail to parse DT : %d\n", ret);
+- goto free_wq;
+- }
+
+ if (anx7625_register_i2c_dummy_clients(platform, client) != 0) {
+ ret = -ENOMEM;
+@@ -2674,6 +2666,15 @@ static int anx7625_i2c_probe(struct i2c_client *client,
+ if (ret)
+ goto free_wq;
+
++ devm_of_dp_aux_populate_ep_devices(&platform->aux);
++
++ ret = anx7625_parse_dt(dev, pdata);
++ if (ret) {
++ if (ret != -EPROBE_DEFER)
++ DRM_DEV_ERROR(dev, "fail to parse DT : %d\n", ret);
++ goto free_wq;
++ }
++
+ if (!platform->pdata.low_power_mode) {
+ anx7625_disable_pd_protocol(platform);
+ pm_runtime_get_sync(dev);
+diff --git a/drivers/gpu/drm/bridge/lontium-lt9611.c b/drivers/gpu/drm/bridge/lontium-lt9611.c
+index 63df2e8a8abc1..21d1e14659818 100644
+--- a/drivers/gpu/drm/bridge/lontium-lt9611.c
++++ b/drivers/gpu/drm/bridge/lontium-lt9611.c
+@@ -586,7 +586,7 @@ lt9611_connector_detect(struct drm_connector *connector, bool force)
+ int connected = 0;
+
+ regmap_read(lt9611->regmap, 0x825e, ®_val);
+- connected = (reg_val & BIT(0));
++ connected = (reg_val & (BIT(2) | BIT(0)));
+
+ lt9611->status = connected ? connector_status_connected :
+ connector_status_disconnected;
+diff --git a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
+index 3d62e6bf68926..310b3b1944919 100644
+--- a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
++++ b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
+@@ -982,7 +982,7 @@ static int lt9611uxc_remove(struct i2c_client *client)
+ struct lt9611uxc *lt9611uxc = i2c_get_clientdata(client);
+
+ disable_irq(client->irq);
+- flush_scheduled_work();
++ cancel_work_sync(<9611uxc->work);
+ lt9611uxc_audio_exit(lt9611uxc);
+ drm_bridge_remove(<9611uxc->bridge);
+
+diff --git a/drivers/gpu/drm/bridge/sil-sii8620.c b/drivers/gpu/drm/bridge/sil-sii8620.c
+index ec7745c31da07..ab0bce4a988c5 100644
+--- a/drivers/gpu/drm/bridge/sil-sii8620.c
++++ b/drivers/gpu/drm/bridge/sil-sii8620.c
+@@ -605,7 +605,7 @@ static void *sii8620_burst_get_tx_buf(struct sii8620 *ctx, int len)
+ u8 *buf = &ctx->burst.tx_buf[ctx->burst.tx_count];
+ int size = len + 2;
+
+- if (ctx->burst.tx_count + size > ARRAY_SIZE(ctx->burst.tx_buf)) {
++ if (ctx->burst.tx_count + size >= ARRAY_SIZE(ctx->burst.tx_buf)) {
+ dev_err(ctx->dev, "TX-BLK buffer exhausted\n");
+ ctx->error = -EINVAL;
+ return NULL;
+@@ -622,7 +622,7 @@ static u8 *sii8620_burst_get_rx_buf(struct sii8620 *ctx, int len)
+ u8 *buf = &ctx->burst.rx_buf[ctx->burst.rx_count];
+ int size = len + 1;
+
+- if (ctx->burst.tx_count + size > ARRAY_SIZE(ctx->burst.tx_buf)) {
++ if (ctx->burst.rx_count + size >= ARRAY_SIZE(ctx->burst.rx_buf)) {
+ dev_err(ctx->dev, "RX-BLK buffer exhausted\n");
+ ctx->error = -EINVAL;
+ return NULL;
+diff --git a/drivers/gpu/drm/bridge/tc358767.c b/drivers/gpu/drm/bridge/tc358767.c
+index c23e0abc65e8f..da59b041fec58 100644
+--- a/drivers/gpu/drm/bridge/tc358767.c
++++ b/drivers/gpu/drm/bridge/tc358767.c
+@@ -1549,19 +1549,12 @@ static irqreturn_t tc_irq_handler(int irq, void *arg)
+ return IRQ_HANDLED;
+ }
+
+-static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
++static int tc_probe_edp_bridge_endpoint(struct tc_data *tc)
+ {
+- struct device *dev = &client->dev;
++ struct device *dev = tc->dev;
+ struct drm_panel *panel;
+- struct tc_data *tc;
+ int ret;
+
+- tc = devm_kzalloc(dev, sizeof(*tc), GFP_KERNEL);
+- if (!tc)
+- return -ENOMEM;
+-
+- tc->dev = dev;
+-
+ /* port@2 is the output port */
+ ret = drm_of_find_panel_or_bridge(dev->of_node, 2, 0, &panel, NULL);
+ if (ret && ret != -ENODEV)
+@@ -1580,6 +1573,50 @@ static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
+ tc->bridge.type = DRM_MODE_CONNECTOR_DisplayPort;
+ }
+
++ return 0;
++}
++
++static void tc_clk_disable(void *data)
++{
++ struct clk *refclk = data;
++
++ clk_disable_unprepare(refclk);
++}
++
++static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
++{
++ struct device *dev = &client->dev;
++ struct tc_data *tc;
++ int ret;
++
++ tc = devm_kzalloc(dev, sizeof(*tc), GFP_KERNEL);
++ if (!tc)
++ return -ENOMEM;
++
++ tc->dev = dev;
++
++ ret = tc_probe_edp_bridge_endpoint(tc);
++ if (ret)
++ return ret;
++
++ tc->refclk = devm_clk_get(dev, "ref");
++ if (IS_ERR(tc->refclk)) {
++ ret = PTR_ERR(tc->refclk);
++ dev_err(dev, "Failed to get refclk: %d\n", ret);
++ return ret;
++ }
++
++ ret = clk_prepare_enable(tc->refclk);
++ if (ret)
++ return ret;
++
++ ret = devm_add_action_or_reset(dev, tc_clk_disable, tc->refclk);
++ if (ret)
++ return ret;
++
++ /* tRSTW = 100 cycles , at 13 MHz that is ~7.69 us */
++ usleep_range(10, 15);
++
+ /* Shut down GPIO is optional */
+ tc->sd_gpio = devm_gpiod_get_optional(dev, "shutdown", GPIOD_OUT_HIGH);
+ if (IS_ERR(tc->sd_gpio))
+@@ -1600,13 +1637,6 @@ static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
+ usleep_range(5000, 10000);
+ }
+
+- tc->refclk = devm_clk_get(dev, "ref");
+- if (IS_ERR(tc->refclk)) {
+- ret = PTR_ERR(tc->refclk);
+- dev_err(dev, "Failed to get refclk: %d\n", ret);
+- return ret;
+- }
+-
+ tc->regmap = devm_regmap_init_i2c(client, &tc_regmap_config);
+ if (IS_ERR(tc->regmap)) {
+ ret = PTR_ERR(tc->regmap);
+diff --git a/drivers/gpu/drm/dp/drm_dp_aux_bus.c b/drivers/gpu/drm/dp/drm_dp_aux_bus.c
+index 415afce3cf96c..5f06bc0a34aca 100644
+--- a/drivers/gpu/drm/dp/drm_dp_aux_bus.c
++++ b/drivers/gpu/drm/dp/drm_dp_aux_bus.c
+@@ -66,7 +66,6 @@ static int dp_aux_ep_probe(struct device *dev)
+ * @dev: The device to remove.
+ *
+ * Calls through to the endpoint driver remove.
+- *
+ */
+ static void dp_aux_ep_remove(struct device *dev)
+ {
+@@ -120,8 +119,6 @@ ATTRIBUTE_GROUPS(dp_aux_ep_dev);
+ /**
+ * dp_aux_ep_dev_release() - Free memory for the dp_aux_ep device
+ * @dev: The device to free.
+- *
+- * Return: 0 if no error or negative error code.
+ */
+ static void dp_aux_ep_dev_release(struct device *dev)
+ {
+@@ -256,6 +253,7 @@ int of_dp_aux_populate_ep_devices(struct drm_dp_aux *aux)
+
+ return 0;
+ }
++EXPORT_SYMBOL_GPL(of_dp_aux_populate_ep_devices);
+
+ static void of_dp_aux_depopulate_ep_devices_void(void *data)
+ {
+diff --git a/drivers/gpu/drm/dp/drm_dp_mst_topology.c b/drivers/gpu/drm/dp/drm_dp_mst_topology.c
+index 7a7cc44686f97..96869875390f3 100644
+--- a/drivers/gpu/drm/dp/drm_dp_mst_topology.c
++++ b/drivers/gpu/drm/dp/drm_dp_mst_topology.c
+@@ -3861,9 +3861,7 @@ int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr,
+ if (!mgr->mst_primary)
+ goto out_fail;
+
+- ret = drm_dp_dpcd_read(mgr->aux, DP_DPCD_REV, mgr->dpcd,
+- DP_RECEIVER_CAP_SIZE);
+- if (ret != DP_RECEIVER_CAP_SIZE) {
++ if (drm_dp_read_dpcd_caps(mgr->aux, mgr->dpcd) < 0) {
+ drm_dbg_kms(mgr->dev, "dpcd read failed - undocked during suspend?\n");
+ goto out_fail;
+ }
+@@ -4912,8 +4910,7 @@ void drm_dp_mst_dump_topology(struct seq_file *m,
+ u8 buf[DP_PAYLOAD_TABLE_SIZE];
+ int ret;
+
+- ret = drm_dp_dpcd_read(mgr->aux, DP_DPCD_REV, buf, DP_RECEIVER_CAP_SIZE);
+- if (ret) {
++ if (drm_dp_read_dpcd_caps(mgr->aux, buf) < 0) {
+ seq_printf(m, "dpcd read failed\n");
+ goto out;
+ }
+diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
+index 56fb878851465..e2a810cfce45c 100644
+--- a/drivers/gpu/drm/drm_gem.c
++++ b/drivers/gpu/drm/drm_gem.c
+@@ -1225,7 +1225,7 @@ retry:
+ ret = dma_resv_lock_slow_interruptible(obj->resv,
+ acquire_ctx);
+ if (ret) {
+- ww_acquire_done(acquire_ctx);
++ ww_acquire_fini(acquire_ctx);
+ return ret;
+ }
+ }
+@@ -1250,7 +1250,7 @@ retry:
+ goto retry;
+ }
+
+- ww_acquire_done(acquire_ctx);
++ ww_acquire_fini(acquire_ctx);
+ return ret;
+ }
+ }
+diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
+index 8ad0e02991ca0..904fc893c905b 100644
+--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
++++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
+@@ -302,6 +302,7 @@ static int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
+ ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
+ if (!ret) {
+ if (WARN_ON(map->is_iomem)) {
++ dma_buf_vunmap(obj->import_attach->dmabuf, map);
+ ret = -EIO;
+ goto err_put_pages;
+ }
+diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
+index 9314f2ead79fe..09e4edb5a9927 100644
+--- a/drivers/gpu/drm/drm_mipi_dbi.c
++++ b/drivers/gpu/drm/drm_mipi_dbi.c
+@@ -1199,6 +1199,13 @@ int mipi_dbi_spi_transfer(struct spi_device *spi, u32 speed_hz,
+ size_t chunk;
+ int ret;
+
++ /* In __spi_validate, there's a validation that no partial transfers
++ * are accepted (xfer->len % w_size must be zero).
++ * Here we align max_chunk to multiple of 2 (16bits),
++ * to prevent transfers from being rejected.
++ */
++ max_chunk = ALIGN_DOWN(max_chunk, 2);
++
+ spi_message_init_with_transfers(&m, &tr, 1);
+
+ while (len) {
+diff --git a/drivers/gpu/drm/exynos/exynos7_drm_decon.c b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
+index c04264f70ad17..3c31405600f0b 100644
+--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c
++++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
+@@ -800,31 +800,40 @@ static int exynos7_decon_resume(struct device *dev)
+ if (ret < 0) {
+ DRM_DEV_ERROR(dev, "Failed to prepare_enable the pclk [%d]\n",
+ ret);
+- return ret;
++ goto err_pclk_enable;
+ }
+
+ ret = clk_prepare_enable(ctx->aclk);
+ if (ret < 0) {
+ DRM_DEV_ERROR(dev, "Failed to prepare_enable the aclk [%d]\n",
+ ret);
+- return ret;
++ goto err_aclk_enable;
+ }
+
+ ret = clk_prepare_enable(ctx->eclk);
+ if (ret < 0) {
+ DRM_DEV_ERROR(dev, "Failed to prepare_enable the eclk [%d]\n",
+ ret);
+- return ret;
++ goto err_eclk_enable;
+ }
+
+ ret = clk_prepare_enable(ctx->vclk);
+ if (ret < 0) {
+ DRM_DEV_ERROR(dev, "Failed to prepare_enable the vclk [%d]\n",
+ ret);
+- return ret;
++ goto err_vclk_enable;
+ }
+
+ return 0;
++
++err_vclk_enable:
++ clk_disable_unprepare(ctx->eclk);
++err_eclk_enable:
++ clk_disable_unprepare(ctx->aclk);
++err_aclk_enable:
++ clk_disable_unprepare(ctx->pclk);
++err_pclk_enable:
++ return ret;
+ }
+ #endif
+
+diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
+index e82b815f83a65..5a9358dcd9e50 100644
+--- a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
++++ b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
+@@ -7,9 +7,11 @@
+
+ #include <drm/drm_damage_helper.h>
+ #include <drm/drm_drv.h>
++#include <drm/drm_edid.h>
+ #include <drm/drm_fb_helper.h>
+ #include <drm/drm_format_helper.h>
+ #include <drm/drm_fourcc.h>
++#include <drm/drm_framebuffer.h>
+ #include <drm/drm_gem_atomic_helper.h>
+ #include <drm/drm_gem_framebuffer_helper.h>
+ #include <drm/drm_gem_shmem_helper.h>
+diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+index ac52b49bf9016..ca9068bac7481 100644
+--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
++++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+@@ -69,6 +69,7 @@ struct jz_soc_info {
+ bool map_noncoherent;
+ bool use_extended_hwdesc;
+ bool plane_f0_not_working;
++ u32 max_burst;
+ unsigned int max_width, max_height;
+ const u32 *formats_f0, *formats_f1;
+ unsigned int num_formats_f0, num_formats_f1;
+@@ -308,8 +309,9 @@ static void ingenic_drm_crtc_update_timings(struct ingenic_drm *priv,
+ regmap_write(priv->map, JZ_REG_LCD_REV, mode->htotal << 16);
+ }
+
+- regmap_set_bits(priv->map, JZ_REG_LCD_CTRL,
+- JZ_LCD_CTRL_OFUP | JZ_LCD_CTRL_BURST_16);
++ regmap_update_bits(priv->map, JZ_REG_LCD_CTRL,
++ JZ_LCD_CTRL_OFUP | JZ_LCD_CTRL_BURST_MASK,
++ JZ_LCD_CTRL_OFUP | priv->soc_info->max_burst);
+
+ /*
+ * IPU restart - specify how much time the LCDC will wait before
+@@ -1480,6 +1482,7 @@ static const struct jz_soc_info jz4740_soc_info = {
+ .map_noncoherent = false,
+ .max_width = 800,
+ .max_height = 600,
++ .max_burst = JZ_LCD_CTRL_BURST_16,
+ .formats_f1 = jz4740_formats,
+ .num_formats_f1 = ARRAY_SIZE(jz4740_formats),
+ /* JZ4740 has only one plane */
+@@ -1491,6 +1494,7 @@ static const struct jz_soc_info jz4725b_soc_info = {
+ .map_noncoherent = false,
+ .max_width = 800,
+ .max_height = 600,
++ .max_burst = JZ_LCD_CTRL_BURST_16,
+ .formats_f1 = jz4725b_formats_f1,
+ .num_formats_f1 = ARRAY_SIZE(jz4725b_formats_f1),
+ .formats_f0 = jz4725b_formats_f0,
+@@ -1503,6 +1507,7 @@ static const struct jz_soc_info jz4770_soc_info = {
+ .map_noncoherent = true,
+ .max_width = 1280,
+ .max_height = 720,
++ .max_burst = JZ_LCD_CTRL_BURST_64,
+ .formats_f1 = jz4770_formats_f1,
+ .num_formats_f1 = ARRAY_SIZE(jz4770_formats_f1),
+ .formats_f0 = jz4770_formats_f0,
+@@ -1517,6 +1522,7 @@ static const struct jz_soc_info jz4780_soc_info = {
+ .plane_f0_not_working = true, /* REVISIT */
+ .max_width = 4096,
+ .max_height = 2048,
++ .max_burst = JZ_LCD_CTRL_BURST_64,
+ .formats_f1 = jz4770_formats_f1,
+ .num_formats_f1 = ARRAY_SIZE(jz4770_formats_f1),
+ .formats_f0 = jz4770_formats_f0,
+diff --git a/drivers/gpu/drm/ingenic/ingenic-drm.h b/drivers/gpu/drm/ingenic/ingenic-drm.h
+index cb1d09b625881..e5bd007ea93d8 100644
+--- a/drivers/gpu/drm/ingenic/ingenic-drm.h
++++ b/drivers/gpu/drm/ingenic/ingenic-drm.h
+@@ -106,6 +106,9 @@
+ #define JZ_LCD_CTRL_BURST_4 (0x0 << 28)
+ #define JZ_LCD_CTRL_BURST_8 (0x1 << 28)
+ #define JZ_LCD_CTRL_BURST_16 (0x2 << 28)
++#define JZ_LCD_CTRL_BURST_32 (0x3 << 28)
++#define JZ_LCD_CTRL_BURST_64 (0x4 << 28)
++#define JZ_LCD_CTRL_BURST_MASK (0x7 << 28)
+ #define JZ_LCD_CTRL_RGB555 BIT(27)
+ #define JZ_LCD_CTRL_OFUP BIT(26)
+ #define JZ_LCD_CTRL_FRC_GRAYSCALE_16 (0x0 << 24)
+diff --git a/drivers/gpu/drm/mcde/mcde_dsi.c b/drivers/gpu/drm/mcde/mcde_dsi.c
+index 5651734ce977f..9f9ac8699310d 100644
+--- a/drivers/gpu/drm/mcde/mcde_dsi.c
++++ b/drivers/gpu/drm/mcde/mcde_dsi.c
+@@ -1111,6 +1111,7 @@ static int mcde_dsi_bind(struct device *dev, struct device *master,
+ bridge = of_drm_find_bridge(child);
+ if (!bridge) {
+ dev_err(dev, "failed to find bridge\n");
++ of_node_put(child);
+ return -EINVAL;
+ }
+ }
+diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c
+index e61cd67b978ff..41c783349321e 100644
+--- a/drivers/gpu/drm/mediatek/mtk_dpi.c
++++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
+@@ -54,13 +54,7 @@ enum mtk_dpi_out_channel_swap {
+ };
+
+ enum mtk_dpi_out_color_format {
+- MTK_DPI_COLOR_FORMAT_RGB,
+- MTK_DPI_COLOR_FORMAT_RGB_FULL,
+- MTK_DPI_COLOR_FORMAT_YCBCR_444,
+- MTK_DPI_COLOR_FORMAT_YCBCR_422,
+- MTK_DPI_COLOR_FORMAT_XV_YCC,
+- MTK_DPI_COLOR_FORMAT_YCBCR_444_FULL,
+- MTK_DPI_COLOR_FORMAT_YCBCR_422_FULL
++ MTK_DPI_COLOR_FORMAT_RGB
+ };
+
+ struct mtk_dpi {
+@@ -364,24 +358,11 @@ static void mtk_dpi_config_disable_edge(struct mtk_dpi *dpi)
+ static void mtk_dpi_config_color_format(struct mtk_dpi *dpi,
+ enum mtk_dpi_out_color_format format)
+ {
+- if ((format == MTK_DPI_COLOR_FORMAT_YCBCR_444) ||
+- (format == MTK_DPI_COLOR_FORMAT_YCBCR_444_FULL)) {
+- mtk_dpi_config_yuv422_enable(dpi, false);
+- mtk_dpi_config_csc_enable(dpi, true);
+- mtk_dpi_config_swap_input(dpi, false);
+- mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_BGR);
+- } else if ((format == MTK_DPI_COLOR_FORMAT_YCBCR_422) ||
+- (format == MTK_DPI_COLOR_FORMAT_YCBCR_422_FULL)) {
+- mtk_dpi_config_yuv422_enable(dpi, true);
+- mtk_dpi_config_csc_enable(dpi, true);
+- mtk_dpi_config_swap_input(dpi, true);
+- mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_RGB);
+- } else {
+- mtk_dpi_config_yuv422_enable(dpi, false);
+- mtk_dpi_config_csc_enable(dpi, false);
+- mtk_dpi_config_swap_input(dpi, false);
+- mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_RGB);
+- }
++ /* only support RGB888 */
++ mtk_dpi_config_yuv422_enable(dpi, false);
++ mtk_dpi_config_csc_enable(dpi, false);
++ mtk_dpi_config_swap_input(dpi, false);
++ mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_RGB);
+ }
+
+ static void mtk_dpi_dual_edge(struct mtk_dpi *dpi)
+@@ -436,7 +417,6 @@ static int mtk_dpi_power_on(struct mtk_dpi *dpi)
+ if (dpi->pinctrl && dpi->pins_dpi)
+ pinctrl_select_state(dpi->pinctrl, dpi->pins_dpi);
+
+- mtk_dpi_enable(dpi);
+ return 0;
+
+ err_pixel:
+@@ -658,6 +638,7 @@ static void mtk_dpi_bridge_enable(struct drm_bridge *bridge)
+
+ mtk_dpi_power_on(dpi);
+ mtk_dpi_set_display_mode(dpi, &dpi->mode);
++ mtk_dpi_enable(dpi);
+ }
+
+ static enum drm_mode_status
+diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
+index ccb0511b9cd5c..e0a2d5ea40af7 100644
+--- a/drivers/gpu/drm/mediatek/mtk_dsi.c
++++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
+@@ -203,6 +203,7 @@ struct mtk_dsi {
+ struct mtk_phy_timing phy_timing;
+ int refcount;
+ bool enabled;
++ bool lanes_ready;
+ u32 irq_data;
+ wait_queue_head_t irq_wait_queue;
+ const struct mtk_dsi_driver_data *driver_data;
+@@ -649,18 +650,11 @@ static int mtk_dsi_poweron(struct mtk_dsi *dsi)
+ mtk_dsi_reset_engine(dsi);
+ mtk_dsi_phy_timconfig(dsi);
+
+- mtk_dsi_rxtx_control(dsi);
+- usleep_range(30, 100);
+- mtk_dsi_reset_dphy(dsi);
+ mtk_dsi_ps_control_vact(dsi);
+ mtk_dsi_set_vm_cmd(dsi);
+ mtk_dsi_config_vdo_timing(dsi);
+ mtk_dsi_set_interrupt_enable(dsi);
+
+- mtk_dsi_clk_ulp_mode_leave(dsi);
+- mtk_dsi_lane0_ulp_mode_leave(dsi);
+- mtk_dsi_clk_hs_mode(dsi, 0);
+-
+ return 0;
+ err_disable_engine_clk:
+ clk_disable_unprepare(dsi->engine_clk);
+@@ -679,19 +673,11 @@ static void mtk_dsi_poweroff(struct mtk_dsi *dsi)
+ if (--dsi->refcount != 0)
+ return;
+
+- /*
+- * mtk_dsi_stop() and mtk_dsi_start() is asymmetric, since
+- * mtk_dsi_stop() should be called after mtk_drm_crtc_atomic_disable(),
+- * which needs irq for vblank, and mtk_dsi_stop() will disable irq.
+- * mtk_dsi_start() needs to be called in mtk_output_dsi_enable(),
+- * after dsi is fully set.
+- */
+- mtk_dsi_stop(dsi);
+-
+- mtk_dsi_switch_to_cmd_mode(dsi, VM_DONE_INT_FLAG, 500);
+ mtk_dsi_reset_engine(dsi);
+ mtk_dsi_lane0_ulp_mode_enter(dsi);
+ mtk_dsi_clk_ulp_mode_enter(dsi);
++ /* set the lane number as 0 to pull down mipi */
++ writel(0, dsi->regs + DSI_TXRX_CTRL);
+
+ mtk_dsi_disable(dsi);
+
+@@ -699,21 +685,31 @@ static void mtk_dsi_poweroff(struct mtk_dsi *dsi)
+ clk_disable_unprepare(dsi->digital_clk);
+
+ phy_power_off(dsi->phy);
++
++ dsi->lanes_ready = false;
+ }
+
+-static void mtk_output_dsi_enable(struct mtk_dsi *dsi)
++static void mtk_dsi_lane_ready(struct mtk_dsi *dsi)
+ {
+- int ret;
++ if (!dsi->lanes_ready) {
++ dsi->lanes_ready = true;
++ mtk_dsi_rxtx_control(dsi);
++ usleep_range(30, 100);
++ mtk_dsi_reset_dphy(dsi);
++ mtk_dsi_clk_ulp_mode_leave(dsi);
++ mtk_dsi_lane0_ulp_mode_leave(dsi);
++ mtk_dsi_clk_hs_mode(dsi, 0);
++ msleep(20);
++ /* The reaction time after pulling up the mipi signal for dsi_rx */
++ }
++}
+
++static void mtk_output_dsi_enable(struct mtk_dsi *dsi)
++{
+ if (dsi->enabled)
+ return;
+
+- ret = mtk_dsi_poweron(dsi);
+- if (ret < 0) {
+- DRM_ERROR("failed to power on dsi\n");
+- return;
+- }
+-
++ mtk_dsi_lane_ready(dsi);
+ mtk_dsi_set_mode(dsi);
+ mtk_dsi_clk_hs_mode(dsi, 1);
+
+@@ -727,7 +723,16 @@ static void mtk_output_dsi_disable(struct mtk_dsi *dsi)
+ if (!dsi->enabled)
+ return;
+
+- mtk_dsi_poweroff(dsi);
++ /*
++ * mtk_dsi_stop() and mtk_dsi_start() is asymmetric, since
++ * mtk_dsi_stop() should be called after mtk_drm_crtc_atomic_disable(),
++ * which needs irq for vblank, and mtk_dsi_stop() will disable irq.
++ * mtk_dsi_start() needs to be called in mtk_output_dsi_enable(),
++ * after dsi is fully set.
++ */
++ mtk_dsi_stop(dsi);
++
++ mtk_dsi_switch_to_cmd_mode(dsi, VM_DONE_INT_FLAG, 500);
+
+ dsi->enabled = false;
+ }
+@@ -751,24 +756,50 @@ static void mtk_dsi_bridge_mode_set(struct drm_bridge *bridge,
+ drm_display_mode_to_videomode(adjusted, &dsi->vm);
+ }
+
+-static void mtk_dsi_bridge_disable(struct drm_bridge *bridge)
++static void mtk_dsi_bridge_atomic_disable(struct drm_bridge *bridge,
++ struct drm_bridge_state *old_bridge_state)
+ {
+ struct mtk_dsi *dsi = bridge_to_dsi(bridge);
+
+ mtk_output_dsi_disable(dsi);
+ }
+
+-static void mtk_dsi_bridge_enable(struct drm_bridge *bridge)
++static void mtk_dsi_bridge_atomic_enable(struct drm_bridge *bridge,
++ struct drm_bridge_state *old_bridge_state)
+ {
+ struct mtk_dsi *dsi = bridge_to_dsi(bridge);
+
++ if (dsi->refcount == 0)
++ return;
++
+ mtk_output_dsi_enable(dsi);
+ }
+
++static void mtk_dsi_bridge_atomic_pre_enable(struct drm_bridge *bridge,
++ struct drm_bridge_state *old_bridge_state)
++{
++ struct mtk_dsi *dsi = bridge_to_dsi(bridge);
++ int ret;
++
++ ret = mtk_dsi_poweron(dsi);
++ if (ret < 0)
++ DRM_ERROR("failed to power on dsi\n");
++}
++
++static void mtk_dsi_bridge_atomic_post_disable(struct drm_bridge *bridge,
++ struct drm_bridge_state *old_bridge_state)
++{
++ struct mtk_dsi *dsi = bridge_to_dsi(bridge);
++
++ mtk_dsi_poweroff(dsi);
++}
++
+ static const struct drm_bridge_funcs mtk_dsi_bridge_funcs = {
+ .attach = mtk_dsi_bridge_attach,
+- .disable = mtk_dsi_bridge_disable,
+- .enable = mtk_dsi_bridge_enable,
++ .atomic_disable = mtk_dsi_bridge_atomic_disable,
++ .atomic_enable = mtk_dsi_bridge_atomic_enable,
++ .atomic_pre_enable = mtk_dsi_bridge_atomic_pre_enable,
++ .atomic_post_disable = mtk_dsi_bridge_atomic_post_disable,
+ .mode_set = mtk_dsi_bridge_mode_set,
+ };
+
+@@ -988,6 +1019,8 @@ static ssize_t mtk_dsi_host_transfer(struct mipi_dsi_host *host,
+ if (MTK_DSI_HOST_IS_READ(msg->type))
+ irq_flag |= LPRX_RD_RDY_INT_FLAG;
+
++ mtk_dsi_lane_ready(dsi);
++
+ ret = mtk_dsi_host_send_cmd(dsi, msg, irq_flag);
+ if (ret)
+ goto restore_dsi_mode;
+diff --git a/drivers/gpu/drm/meson/meson_encoder_cvbs.c b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
+index fd8db97ba8ba2..8110a6e39320f 100644
+--- a/drivers/gpu/drm/meson/meson_encoder_cvbs.c
++++ b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
+@@ -238,6 +238,7 @@ int meson_encoder_cvbs_init(struct meson_drm *priv)
+ }
+
+ meson_encoder_cvbs->next_bridge = of_drm_find_bridge(remote);
++ of_node_put(remote);
+ if (!meson_encoder_cvbs->next_bridge) {
+ dev_err(priv->dev, "Failed to find CVBS Connector bridge\n");
+ return -EPROBE_DEFER;
+diff --git a/drivers/gpu/drm/meson/meson_encoder_hdmi.c b/drivers/gpu/drm/meson/meson_encoder_hdmi.c
+index 5e306de6f4853..a7692584487cc 100644
+--- a/drivers/gpu/drm/meson/meson_encoder_hdmi.c
++++ b/drivers/gpu/drm/meson/meson_encoder_hdmi.c
+@@ -365,7 +365,8 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
+ meson_encoder_hdmi->next_bridge = of_drm_find_bridge(remote);
+ if (!meson_encoder_hdmi->next_bridge) {
+ dev_err(priv->dev, "Failed to find HDMI transceiver bridge\n");
+- return -EPROBE_DEFER;
++ ret = -EPROBE_DEFER;
++ goto err_put_node;
+ }
+
+ /* HDMI Encoder Bridge */
+@@ -383,7 +384,7 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
+ DRM_MODE_ENCODER_TMDS);
+ if (ret) {
+ dev_err(priv->dev, "Failed to init HDMI encoder: %d\n", ret);
+- return ret;
++ goto err_put_node;
+ }
+
+ meson_encoder_hdmi->encoder.possible_crtcs = BIT(0);
+@@ -393,7 +394,7 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
+ DRM_BRIDGE_ATTACH_NO_CONNECTOR);
+ if (ret) {
+ dev_err(priv->dev, "Failed to attach bridge: %d\n", ret);
+- return ret;
++ goto err_put_node;
+ }
+
+ /* Initialize & attach Bridge Connector */
+@@ -401,7 +402,8 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
+ &meson_encoder_hdmi->encoder);
+ if (IS_ERR(meson_encoder_hdmi->connector)) {
+ dev_err(priv->dev, "Unable to create HDMI bridge connector\n");
+- return PTR_ERR(meson_encoder_hdmi->connector);
++ ret = PTR_ERR(meson_encoder_hdmi->connector);
++ goto err_put_node;
+ }
+ drm_connector_attach_encoder(meson_encoder_hdmi->connector,
+ &meson_encoder_hdmi->encoder);
+@@ -428,6 +430,7 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
+ meson_encoder_hdmi->connector->ycbcr_420_allowed = true;
+
+ pdev = of_find_device_by_node(remote);
++ of_node_put(remote);
+ if (pdev) {
+ struct cec_connector_info conn_info;
+ struct cec_notifier *notifier;
+@@ -435,8 +438,10 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
+ cec_fill_conn_info_from_drm(&conn_info, meson_encoder_hdmi->connector);
+
+ notifier = cec_notifier_conn_register(&pdev->dev, NULL, &conn_info);
+- if (!notifier)
++ if (!notifier) {
++ put_device(&pdev->dev);
+ return -ENOMEM;
++ }
+
+ meson_encoder_hdmi->cec_notifier = notifier;
+ }
+@@ -444,4 +449,8 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
+ dev_dbg(priv->dev, "HDMI encoder initialized\n");
+
+ return 0;
++
++err_put_node:
++ of_node_put(remote);
++ return ret;
+ }
+diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+index 217615e0e8507..213f75fc98e80 100644
+--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+@@ -1666,18 +1666,10 @@ static u64 a5xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
+ {
+ u64 busy_cycles;
+
+- /* Only read the gpu busy if the hardware is already active */
+- if (pm_runtime_get_if_in_use(&gpu->pdev->dev) == 0) {
+- *out_sample_rate = 1;
+- return 0;
+- }
+-
+ busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
+ REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
+ *out_sample_rate = clk_get_rate(gpu->core_clk);
+
+- pm_runtime_put(&gpu->pdev->dev);
+-
+ return busy_cycles;
+ }
+
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+index 3e325e2a2b1b6..1863908694dc7 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+@@ -102,7 +102,8 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
+ A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
+ }
+
+-void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
++void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
++ bool suspended)
+ {
+ struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+ struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+@@ -127,15 +128,16 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
+
+ /*
+ * This can get called from devfreq while the hardware is idle. Don't
+- * bring up the power if it isn't already active
++ * bring up the power if it isn't already active. All we're doing here
++ * is updating the frequency so that when we come back online we're at
++ * the right rate.
+ */
+- if (pm_runtime_get_if_in_use(gmu->dev) == 0)
++ if (suspended)
+ return;
+
+ if (!gmu->legacy) {
+ a6xx_hfi_set_freq(gmu, perf_index);
+ dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
+- pm_runtime_put(gmu->dev);
+ return;
+ }
+
+@@ -159,7 +161,6 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
+ dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret);
+
+ dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
+- pm_runtime_put(gmu->dev);
+ }
+
+ unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu)
+@@ -895,7 +896,7 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu)
+ return;
+
+ gmu->freq = 0; /* so a6xx_gmu_set_freq() doesn't exit early */
+- a6xx_gmu_set_freq(gpu, gpu_opp);
++ a6xx_gmu_set_freq(gpu, gpu_opp, false);
+ dev_pm_opp_put(gpu_opp);
+ }
+
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+index 40fb92becc78a..a8a0d798d17fe 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+@@ -1658,27 +1658,21 @@ static u64 a6xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
+ /* 19.2MHz */
+ *out_sample_rate = 19200000;
+
+- /* Only read the gpu busy if the hardware is already active */
+- if (pm_runtime_get_if_in_use(a6xx_gpu->gmu.dev) == 0)
+- return 0;
+-
+ busy_cycles = gmu_read64(&a6xx_gpu->gmu,
+ REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_L,
+ REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_H);
+
+-
+- pm_runtime_put(a6xx_gpu->gmu.dev);
+-
+ return busy_cycles;
+ }
+
+-static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
++static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
++ bool suspended)
+ {
+ struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+ struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+
+ mutex_lock(&a6xx_gpu->gmu.lock);
+- a6xx_gmu_set_freq(gpu, opp);
++ a6xx_gmu_set_freq(gpu, opp, suspended);
+ mutex_unlock(&a6xx_gpu->gmu.lock);
+ }
+
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+index 86e0a7c3fe6df..ab853f61db632 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+@@ -77,7 +77,8 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);
+ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
+ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);
+
+-void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp);
++void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
++ bool suspended);
+ unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu);
+
+ void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+index 16ba9f9b9a787..32715399a7c19 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+@@ -361,6 +361,9 @@ static void _dpu_crtc_blend_setup_mixer(struct drm_crtc *crtc,
+ if (!state)
+ continue;
+
++ if (!state->visible)
++ continue;
++
+ pstate = to_dpu_plane_state(state);
+ fb = state->fb;
+
+@@ -1125,6 +1128,9 @@ static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
+ if (cnt >= DPU_STAGE_MAX * 4)
+ continue;
+
++ if (!pstate->visible)
++ continue;
++
+ pstates[cnt].dpu_pstate = dpu_pstate;
+ pstates[cnt].drm_pstate = pstate;
+ pstates[cnt].stage = pstate->normalized_zpos;
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
+index a4f5cb90f3e80..e4b8a789835a4 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
+@@ -123,12 +123,13 @@ int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe)
+ {
+ struct msm_drm_private *priv = s->dev->dev_private;
+ struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(priv->kms));
+- struct mdp5_global_state *state = mdp5_get_global_state(s);
++ struct mdp5_global_state *state;
+ struct mdp5_hw_pipe_state *new_state;
+
+ if (!hwpipe)
+ return 0;
+
++ state = mdp5_get_global_state(s);
+ if (IS_ERR(state))
+ return PTR_ERR(state);
+
+diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
+index f6229262dcb05..4ce0b4c41e49f 100644
+--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
++++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
+@@ -180,6 +180,9 @@ static struct hdmi *msm_hdmi_init(struct platform_device *pdev)
+ goto fail;
+ }
+
++ for (i = 0; i < config->pwr_reg_cnt; i++)
++ hdmi->pwr_regs[i].supply = config->pwr_reg_names[i];
++
+ ret = devm_regulator_bulk_get(&pdev->dev, config->pwr_reg_cnt, hdmi->pwr_regs);
+ if (ret) {
+ DRM_DEV_ERROR(&pdev->dev, "failed to get pwr regulator: %d\n", ret);
+diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
+index 143c56f5185b8..a4d3abc4f3e13 100644
+--- a/drivers/gpu/drm/msm/msm_gpu.h
++++ b/drivers/gpu/drm/msm/msm_gpu.h
+@@ -63,11 +63,14 @@ struct msm_gpu_funcs {
+ /* for generation specific debugfs: */
+ void (*debugfs_init)(struct msm_gpu *gpu, struct drm_minor *minor);
+ #endif
++ /* note: gpu_busy() can assume that we have been pm_resumed */
+ u64 (*gpu_busy)(struct msm_gpu *gpu, unsigned long *out_sample_rate);
+ struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu);
+ int (*gpu_state_put)(struct msm_gpu_state *state);
+ unsigned long (*gpu_get_freq)(struct msm_gpu *gpu);
+- void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp);
++ /* note: gpu_set_freq() can assume that we have been pm_resumed */
++ void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp,
++ bool suspended);
+ struct msm_gem_address_space *(*create_address_space)
+ (struct msm_gpu *gpu, struct platform_device *pdev);
+ struct msm_gem_address_space *(*create_private_address_space)
+@@ -91,6 +94,9 @@ struct msm_gpu_devfreq {
+ /** devfreq: devfreq instance */
+ struct devfreq *devfreq;
+
++ /** lock: lock for "suspended", "busy_cycles", and "time" */
++ struct mutex lock;
++
+ /**
+ * idle_constraint:
+ *
+@@ -134,6 +140,9 @@ struct msm_gpu_devfreq {
+ * elapsed
+ */
+ struct msm_hrtimer_work boost_work;
++
++ /** suspended: tracks if we're suspended */
++ bool suspended;
+ };
+
+ struct msm_gpu {
+diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
+index c7dbaa4b19264..01248d3401240 100644
+--- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
++++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
+@@ -20,6 +20,7 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
+ u32 flags)
+ {
+ struct msm_gpu *gpu = dev_to_gpu(dev);
++ struct msm_gpu_devfreq *df = &gpu->devfreq;
+ struct dev_pm_opp *opp;
+
+ /*
+@@ -32,10 +33,13 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
+
+ trace_msm_gpu_freq_change(dev_pm_opp_get_freq(opp));
+
+- if (gpu->funcs->gpu_set_freq)
+- gpu->funcs->gpu_set_freq(gpu, opp);
+- else
++ if (gpu->funcs->gpu_set_freq) {
++ mutex_lock(&df->lock);
++ gpu->funcs->gpu_set_freq(gpu, opp, df->suspended);
++ mutex_unlock(&df->lock);
++ } else {
+ clk_set_rate(gpu->core_clk, *freq);
++ }
+
+ dev_pm_opp_put(opp);
+
+@@ -58,15 +62,24 @@ static void get_raw_dev_status(struct msm_gpu *gpu,
+ unsigned long sample_rate;
+ ktime_t time;
+
++ mutex_lock(&df->lock);
++
+ status->current_frequency = get_freq(gpu);
+- busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
+ time = ktime_get();
+-
+- busy_time = busy_cycles - df->busy_cycles;
+ status->total_time = ktime_us_delta(time, df->time);
++ df->time = time;
+
++ if (df->suspended) {
++ mutex_unlock(&df->lock);
++ status->busy_time = 0;
++ return;
++ }
++
++ busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
++ busy_time = busy_cycles - df->busy_cycles;
+ df->busy_cycles = busy_cycles;
+- df->time = time;
++
++ mutex_unlock(&df->lock);
+
+ busy_time *= USEC_PER_SEC;
+ do_div(busy_time, sample_rate);
+@@ -175,6 +188,8 @@ void msm_devfreq_init(struct msm_gpu *gpu)
+ if (!gpu->funcs->gpu_busy)
+ return;
+
++ mutex_init(&df->lock);
++
+ dev_pm_qos_add_request(&gpu->pdev->dev, &df->idle_freq,
+ DEV_PM_QOS_MAX_FREQUENCY,
+ PM_QOS_MAX_FREQUENCY_DEFAULT_VALUE);
+@@ -244,12 +259,16 @@ void msm_devfreq_cleanup(struct msm_gpu *gpu)
+ void msm_devfreq_resume(struct msm_gpu *gpu)
+ {
+ struct msm_gpu_devfreq *df = &gpu->devfreq;
++ unsigned long sample_rate;
+
+ if (!has_devfreq(gpu))
+ return;
+
+- df->busy_cycles = 0;
++ mutex_lock(&df->lock);
++ df->busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
+ df->time = ktime_get();
++ df->suspended = false;
++ mutex_unlock(&df->lock);
+
+ devfreq_resume_device(df->devfreq);
+ }
+@@ -261,6 +280,10 @@ void msm_devfreq_suspend(struct msm_gpu *gpu)
+ if (!has_devfreq(gpu))
+ return;
+
++ mutex_lock(&df->lock);
++ df->suspended = true;
++ mutex_unlock(&df->lock);
++
+ devfreq_suspend_device(df->devfreq);
+
+ cancel_idle_work(df);
+diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
+index 22b83a6577eb0..df83c4654e269 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
++++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
+@@ -1361,13 +1361,11 @@ nouveau_connector_create(struct drm_device *dev,
+ snprintf(aux_name, sizeof(aux_name), "sor-%04x-%04x",
+ dcbe->hasht, dcbe->hashm);
+ nv_connector->aux.name = kstrdup(aux_name, GFP_KERNEL);
+- drm_dp_aux_init(&nv_connector->aux);
+- if (ret) {
+- NV_ERROR(drm, "Failed to init AUX adapter for sor-%04x-%04x: %d\n",
+- dcbe->hasht, dcbe->hashm, ret);
++ if (!nv_connector->aux.name) {
+ kfree(nv_connector);
+- return ERR_PTR(ret);
++ return ERR_PTR(-ENOMEM);
+ }
++ drm_dp_aux_init(&nv_connector->aux);
+ fallthrough;
+ default:
+ funcs = &nouveau_connector_funcs;
+diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
+index 2cd0932b3d687..a2f5df568ca54 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_display.c
++++ b/drivers/gpu/drm/nouveau/nouveau_display.c
+@@ -515,7 +515,7 @@ nouveau_display_hpd_work(struct work_struct *work)
+
+ pm_runtime_mark_last_busy(drm->dev->dev);
+ noop:
+- pm_runtime_put_sync(drm->dev->dev);
++ pm_runtime_put_autosuspend(dev->dev);
+ }
+
+ #ifdef CONFIG_ACPI
+@@ -537,7 +537,7 @@ nouveau_display_acpi_ntfy(struct notifier_block *nb, unsigned long val,
+ * it's own hotplug events.
+ */
+ pm_runtime_put_autosuspend(drm->dev->dev);
+- } else if (ret == 0) {
++ } else if (ret == 0 || ret == -EINPROGRESS) {
+ /* We've started resuming the GPU already, so
+ * it will handle scheduling a full reprobe
+ * itself
+diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+index 4f9b3aa5deda9..20ac1ce2c0f14 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
++++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+@@ -466,7 +466,7 @@ nouveau_fbcon_set_suspend_work(struct work_struct *work)
+ if (state == FBINFO_STATE_RUNNING) {
+ nouveau_fbcon_hotplug_resume(drm->fbcon);
+ pm_runtime_mark_last_busy(drm->dev->dev);
+- pm_runtime_put_sync(drm->dev->dev);
++ pm_runtime_put_autosuspend(drm->dev->dev);
+ }
+ }
+
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c
+index 64e423dddd9e7..6c318e41bde04 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c
+@@ -33,7 +33,7 @@ nvbios_addr(struct nvkm_bios *bios, u32 *addr, u8 size)
+ {
+ u32 p = *addr;
+
+- if (*addr > bios->image0_size && bios->imaged_addr) {
++ if (*addr >= bios->image0_size && bios->imaged_addr) {
+ *addr -= bios->image0_size;
+ *addr += bios->imaged_addr;
+ }
+diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
+index ddf5f38e87311..abc5271af4eb9 100644
+--- a/drivers/gpu/drm/panel/Kconfig
++++ b/drivers/gpu/drm/panel/Kconfig
+@@ -428,6 +428,8 @@ config DRM_PANEL_SAMSUNG_ATNA33XC20
+ depends on OF
+ depends on BACKLIGHT_CLASS_DEVICE
+ depends on PM
++ select DRM_DISPLAY_DP_HELPER
++ select DRM_DISPLAY_HELPER
+ select DRM_DP_AUX_BUS
+ help
+ DRM panel driver for the Samsung ATNA33XC20 panel. This panel can't
+diff --git a/drivers/gpu/drm/radeon/.gitignore b/drivers/gpu/drm/radeon/.gitignore
+index 9c1a941539836..d8777383a64aa 100644
+--- a/drivers/gpu/drm/radeon/.gitignore
++++ b/drivers/gpu/drm/radeon/.gitignore
+@@ -1,4 +1,4 @@
+-# SPDX-License-Identifier: GPL-2.0-only
++# SPDX-License-Identifier: MIT
+ mkregtable
+ *_reg_safe.h
+
+diff --git a/drivers/gpu/drm/radeon/Kconfig b/drivers/gpu/drm/radeon/Kconfig
+index 6f60f4840cc58..52819e7f1fca1 100644
+--- a/drivers/gpu/drm/radeon/Kconfig
++++ b/drivers/gpu/drm/radeon/Kconfig
+@@ -1,4 +1,4 @@
+-# SPDX-License-Identifier: GPL-2.0-only
++# SPDX-License-Identifier: MIT
+ config DRM_RADEON_USERPTR
+ bool "Always enable userptr support"
+ depends on DRM_RADEON
+diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
+index 11c97edde54dd..3d502f1bbfcbe 100644
+--- a/drivers/gpu/drm/radeon/Makefile
++++ b/drivers/gpu/drm/radeon/Makefile
+@@ -1,4 +1,4 @@
+-# SPDX-License-Identifier: GPL-2.0
++# SPDX-License-Identifier: MIT
+ #
+ # Makefile for the drm device driver. This driver provides support for the
+ # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
+diff --git a/drivers/gpu/drm/radeon/ni_dpm.c b/drivers/gpu/drm/radeon/ni_dpm.c
+index 769f666335ac4..672d2239293e0 100644
+--- a/drivers/gpu/drm/radeon/ni_dpm.c
++++ b/drivers/gpu/drm/radeon/ni_dpm.c
+@@ -2741,10 +2741,10 @@ static int ni_set_mc_special_registers(struct radeon_device *rdev,
+ table->mc_reg_table_entry[k].mc_data[j] |= 0x100;
+ }
+ j++;
+- if (j > SMC_NISLANDS_MC_REGISTER_ARRAY_SIZE)
+- return -EINVAL;
+ break;
+ case MC_SEQ_RESERVE_M >> 2:
++ if (j >= SMC_NISLANDS_MC_REGISTER_ARRAY_SIZE)
++ return -EINVAL;
+ temp_reg = RREG32(MC_PMG_CMD_MRS1);
+ table->mc_reg_address[j].s1 = MC_PMG_CMD_MRS1 >> 2;
+ table->mc_reg_address[j].s0 = MC_SEQ_PMG_CMD_MRS1_LP >> 2;
+@@ -2753,8 +2753,6 @@ static int ni_set_mc_special_registers(struct radeon_device *rdev,
+ (temp_reg & 0xffff0000) |
+ (table->mc_reg_table_entry[k].mc_data[i] & 0x0000ffff);
+ j++;
+- if (j > SMC_NISLANDS_MC_REGISTER_ARRAY_SIZE)
+- return -EINVAL;
+ break;
+ default:
+ break;
+diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
+index 15692cb241fc0..429644d5ddc69 100644
+--- a/drivers/gpu/drm/radeon/radeon_device.c
++++ b/drivers/gpu/drm/radeon/radeon_device.c
+@@ -1113,7 +1113,7 @@ static int radeon_gart_size_auto(enum radeon_family family)
+ static void radeon_check_arguments(struct radeon_device *rdev)
+ {
+ /* vramlimit must be a power of two */
+- if (!is_power_of_2(radeon_vram_limit)) {
++ if (radeon_vram_limit != 0 && !is_power_of_2(radeon_vram_limit)) {
+ dev_warn(rdev->dev, "vram limit (%d) must be a power of 2\n",
+ radeon_vram_limit);
+ radeon_vram_limit = 0;
+diff --git a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
+index c82901d9a9ccd..8a9bb618f872b 100644
+--- a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
++++ b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
+@@ -398,7 +398,15 @@ static int rockchip_dp_probe(struct platform_device *pdev)
+ if (IS_ERR(dp->adp))
+ return PTR_ERR(dp->adp);
+
+- return component_add(dev, &rockchip_dp_component_ops);
++ ret = component_add(dev, &rockchip_dp_component_ops);
++ if (ret)
++ goto err_dp_remove;
++
++ return 0;
++
++err_dp_remove:
++ analogix_dp_remove(dp->adp);
++ return ret;
+ }
+
+ static int rockchip_dp_remove(struct platform_device *pdev)
+diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+index d53037531f407..26e24f62f75f8 100644
+--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
++++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+@@ -1552,6 +1552,9 @@ static struct drm_crtc_state *vop_crtc_duplicate_state(struct drm_crtc *crtc)
+ {
+ struct rockchip_crtc_state *rockchip_state;
+
++ if (WARN_ON(!crtc->state))
++ return NULL;
++
+ rockchip_state = kzalloc(sizeof(*rockchip_state), GFP_KERNEL);
+ if (!rockchip_state)
+ return NULL;
+diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
+index 7c7dd84e6db84..81991090adcc9 100644
+--- a/drivers/gpu/drm/tegra/gem.c
++++ b/drivers/gpu/drm/tegra/gem.c
+@@ -704,14 +704,23 @@ static int tegra_gem_prime_vmap(struct dma_buf *buf, struct iosys_map *map)
+ {
+ struct drm_gem_object *gem = buf->priv;
+ struct tegra_bo *bo = to_tegra_bo(gem);
++ void *vaddr;
+
+- iosys_map_set_vaddr(map, bo->vaddr);
++ vaddr = tegra_bo_mmap(&bo->base);
++ if (IS_ERR(vaddr))
++ return PTR_ERR(vaddr);
++
++ iosys_map_set_vaddr(map, vaddr);
+
+ return 0;
+ }
+
+ static void tegra_gem_prime_vunmap(struct dma_buf *buf, struct iosys_map *map)
+ {
++ struct drm_gem_object *gem = buf->priv;
++ struct tegra_bo *bo = to_tegra_bo(gem);
++
++ tegra_bo_munmap(&bo->base, map->vaddr);
+ }
+
+ static const struct dma_buf_ops tegra_gem_prime_dmabuf_ops = {
+diff --git a/drivers/gpu/drm/tiny/st7735r.c b/drivers/gpu/drm/tiny/st7735r.c
+index 29d618093e946..e0f02d367d880 100644
+--- a/drivers/gpu/drm/tiny/st7735r.c
++++ b/drivers/gpu/drm/tiny/st7735r.c
+@@ -174,6 +174,7 @@ MODULE_DEVICE_TABLE(of, st7735r_of_match);
+
+ static const struct spi_device_id st7735r_id[] = {
+ { "jd-t18003-t01", (uintptr_t)&jd_t18003_t01_cfg },
++ { "rh128128t", (uintptr_t)&rh128128t_cfg },
+ { },
+ };
+ MODULE_DEVICE_TABLE(spi, st7735r_id);
+diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
+index 477b3c5ad089c..18e2a246f7c17 100644
+--- a/drivers/gpu/drm/vc4/vc4_crtc.c
++++ b/drivers/gpu/drm/vc4/vc4_crtc.c
+@@ -314,10 +314,13 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
+ struct drm_crtc_state *crtc_state = crtc->state;
+ struct drm_display_mode *mode = &crtc_state->adjusted_mode;
+ bool interlace = mode->flags & DRM_MODE_FLAG_INTERLACE;
+- u32 pixel_rep = (mode->flags & DRM_MODE_FLAG_DBLCLK) ? 2 : 1;
++ bool is_hdmi = vc4_encoder->type == VC4_ENCODER_TYPE_HDMI0 ||
++ vc4_encoder->type == VC4_ENCODER_TYPE_HDMI1;
++ u32 pixel_rep = ((mode->flags & DRM_MODE_FLAG_DBLCLK) && !is_hdmi) ? 2 : 1;
+ bool is_dsi = (vc4_encoder->type == VC4_ENCODER_TYPE_DSI0 ||
+ vc4_encoder->type == VC4_ENCODER_TYPE_DSI1);
+- u32 format = is_dsi ? PV_CONTROL_FORMAT_DSIV_24 : PV_CONTROL_FORMAT_24;
++ bool is_dsi1 = vc4_encoder->type == VC4_ENCODER_TYPE_DSI1;
++ u32 format = is_dsi1 ? PV_CONTROL_FORMAT_DSIV_24 : PV_CONTROL_FORMAT_24;
+ u8 ppc = pv_data->pixels_per_clock;
+ bool debug_dump_regs = false;
+
+@@ -343,7 +346,8 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
+ PV_HORZB_HACTIVE));
+
+ CRTC_WRITE(PV_VERTA,
+- VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end,
++ VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end +
++ interlace,
+ PV_VERTA_VBP) |
+ VC4_SET_FIELD(mode->crtc_vsync_end - mode->crtc_vsync_start,
+ PV_VERTA_VSYNC));
+@@ -355,7 +359,7 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
+ if (interlace) {
+ CRTC_WRITE(PV_VERTA_EVEN,
+ VC4_SET_FIELD(mode->crtc_vtotal -
+- mode->crtc_vsync_end - 1,
++ mode->crtc_vsync_end,
+ PV_VERTA_VBP) |
+ VC4_SET_FIELD(mode->crtc_vsync_end -
+ mode->crtc_vsync_start,
+@@ -375,7 +379,7 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
+ PV_VCONTROL_CONTINUOUS |
+ (is_dsi ? PV_VCONTROL_DSI : 0) |
+ PV_VCONTROL_INTERLACE |
+- VC4_SET_FIELD(mode->htotal * pixel_rep / 2,
++ VC4_SET_FIELD(mode->htotal * pixel_rep / (2 * ppc),
+ PV_VCONTROL_ODD_DELAY));
+ CRTC_WRITE(PV_VSYNCD_EVEN, 0);
+ } else {
+diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
+index 162bc18e7497f..680eaef7287f5 100644
+--- a/drivers/gpu/drm/vc4/vc4_drv.c
++++ b/drivers/gpu/drm/vc4/vc4_drv.c
+@@ -209,6 +209,15 @@ static void vc4_match_add_drivers(struct device *dev,
+ }
+ }
+
++static const struct of_device_id vc4_dma_range_matches[] = {
++ { .compatible = "brcm,bcm2711-hvs" },
++ { .compatible = "brcm,bcm2835-hvs" },
++ { .compatible = "brcm,bcm2835-v3d" },
++ { .compatible = "brcm,cygnus-v3d" },
++ { .compatible = "brcm,vc4-v3d" },
++ {}
++};
++
+ static int vc4_drm_bind(struct device *dev)
+ {
+ struct platform_device *pdev = to_platform_device(dev);
+@@ -227,6 +236,16 @@ static int vc4_drm_bind(struct device *dev)
+ vc4_drm_driver.driver_features &= ~DRIVER_RENDER;
+ of_node_put(node);
+
++ node = of_find_matching_node_and_match(NULL, vc4_dma_range_matches,
++ NULL);
++ if (node) {
++ ret = of_dma_configure(dev, node, true);
++ of_node_put(node);
++
++ if (ret)
++ return ret;
++ }
++
+ vc4 = devm_drm_dev_alloc(dev, &vc4_drm_driver, struct vc4_dev, base);
+ if (IS_ERR(vc4))
+ return PTR_ERR(vc4);
+diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c
+index 98308a17e4ed7..b7b2c76770dc6 100644
+--- a/drivers/gpu/drm/vc4/vc4_dsi.c
++++ b/drivers/gpu/drm/vc4/vc4_dsi.c
+@@ -181,8 +181,50 @@
+
+ #define DSI0_TXPKT_PIX_FIFO 0x20 /* AKA PIX_FIFO */
+
+-#define DSI0_INT_STAT 0x24
+-#define DSI0_INT_EN 0x28
++#define DSI0_INT_STAT 0x24
++#define DSI0_INT_EN 0x28
++# define DSI0_INT_FIFO_ERR BIT(25)
++# define DSI0_INT_CMDC_DONE_MASK VC4_MASK(24, 23)
++# define DSI0_INT_CMDC_DONE_SHIFT 23
++# define DSI0_INT_CMDC_DONE_NO_REPEAT 1
++# define DSI0_INT_CMDC_DONE_REPEAT 3
++# define DSI0_INT_PHY_DIR_RTF BIT(22)
++# define DSI0_INT_PHY_D1_ULPS BIT(21)
++# define DSI0_INT_PHY_D1_STOP BIT(20)
++# define DSI0_INT_PHY_RXLPDT BIT(19)
++# define DSI0_INT_PHY_RXTRIG BIT(18)
++# define DSI0_INT_PHY_D0_ULPS BIT(17)
++# define DSI0_INT_PHY_D0_LPDT BIT(16)
++# define DSI0_INT_PHY_D0_FTR BIT(15)
++# define DSI0_INT_PHY_D0_STOP BIT(14)
++/* Signaled when the clock lane enters the given state. */
++# define DSI0_INT_PHY_CLK_ULPS BIT(13)
++# define DSI0_INT_PHY_CLK_HS BIT(12)
++# define DSI0_INT_PHY_CLK_FTR BIT(11)
++/* Signaled on timeouts */
++# define DSI0_INT_PR_TO BIT(10)
++# define DSI0_INT_TA_TO BIT(9)
++# define DSI0_INT_LPRX_TO BIT(8)
++# define DSI0_INT_HSTX_TO BIT(7)
++/* Contention on a line when trying to drive the line low */
++# define DSI0_INT_ERR_CONT_LP1 BIT(6)
++# define DSI0_INT_ERR_CONT_LP0 BIT(5)
++/* Control error: incorrect line state sequence on data lane 0. */
++# define DSI0_INT_ERR_CONTROL BIT(4)
++# define DSI0_INT_ERR_SYNC_ESC BIT(3)
++# define DSI0_INT_RX2_PKT BIT(2)
++# define DSI0_INT_RX1_PKT BIT(1)
++# define DSI0_INT_CMD_PKT BIT(0)
++
++#define DSI0_INTERRUPTS_ALWAYS_ENABLED (DSI0_INT_ERR_SYNC_ESC | \
++ DSI0_INT_ERR_CONTROL | \
++ DSI0_INT_ERR_CONT_LP0 | \
++ DSI0_INT_ERR_CONT_LP1 | \
++ DSI0_INT_HSTX_TO | \
++ DSI0_INT_LPRX_TO | \
++ DSI0_INT_TA_TO | \
++ DSI0_INT_PR_TO)
++
+ # define DSI1_INT_PHY_D3_ULPS BIT(30)
+ # define DSI1_INT_PHY_D3_STOP BIT(29)
+ # define DSI1_INT_PHY_D2_ULPS BIT(28)
+@@ -761,6 +803,9 @@ static void vc4_dsi_encoder_disable(struct drm_encoder *encoder)
+ list_for_each_entry_reverse(iter, &dsi->bridge_chain, chain_node) {
+ if (iter->funcs->disable)
+ iter->funcs->disable(iter);
++
++ if (iter == dsi->bridge)
++ break;
+ }
+
+ vc4_dsi_ulps(dsi, true);
+@@ -805,11 +850,9 @@ static bool vc4_dsi_encoder_mode_fixup(struct drm_encoder *encoder,
+ /* Find what divider gets us a faster clock than the requested
+ * pixel clock.
+ */
+- for (divider = 1; divider < 8; divider++) {
+- if (parent_rate / divider < pll_clock) {
+- divider--;
++ for (divider = 1; divider < 255; divider++) {
++ if (parent_rate / (divider + 1) < pll_clock)
+ break;
+- }
+ }
+
+ /* Now that we've picked a PLL divider, calculate back to its
+@@ -894,6 +937,9 @@ static void vc4_dsi_encoder_enable(struct drm_encoder *encoder)
+
+ DSI_PORT_WRITE(PHY_AFEC0, afec0);
+
++ /* AFEC reset hold time */
++ mdelay(1);
++
+ DSI_PORT_WRITE(PHY_AFEC1,
+ VC4_SET_FIELD(6, DSI0_PHY_AFEC1_IDR_DLANE1) |
+ VC4_SET_FIELD(6, DSI0_PHY_AFEC1_IDR_DLANE0) |
+@@ -1060,12 +1106,9 @@ static void vc4_dsi_encoder_enable(struct drm_encoder *encoder)
+ DSI_PORT_WRITE(CTRL, DSI_PORT_READ(CTRL) | DSI1_CTRL_EN);
+
+ /* Bring AFE out of reset. */
+- if (dsi->variant->port == 0) {
+- } else {
+- DSI_PORT_WRITE(PHY_AFEC0,
+- DSI_PORT_READ(PHY_AFEC0) &
+- ~DSI1_PHY_AFEC0_RESET);
+- }
++ DSI_PORT_WRITE(PHY_AFEC0,
++ DSI_PORT_READ(PHY_AFEC0) &
++ ~DSI_PORT_BIT(PHY_AFEC0_RESET));
+
+ vc4_dsi_ulps(dsi, false);
+
+@@ -1184,13 +1227,28 @@ static ssize_t vc4_dsi_host_transfer(struct mipi_dsi_host *host,
+ /* Enable the appropriate interrupt for the transfer completion. */
+ dsi->xfer_result = 0;
+ reinit_completion(&dsi->xfer_completion);
+- DSI_PORT_WRITE(INT_STAT, DSI1_INT_TXPKT1_DONE | DSI1_INT_PHY_DIR_RTF);
+- if (msg->rx_len) {
+- DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
+- DSI1_INT_PHY_DIR_RTF));
++ if (dsi->variant->port == 0) {
++ DSI_PORT_WRITE(INT_STAT,
++ DSI0_INT_CMDC_DONE_MASK | DSI1_INT_PHY_DIR_RTF);
++ if (msg->rx_len) {
++ DSI_PORT_WRITE(INT_EN, (DSI0_INTERRUPTS_ALWAYS_ENABLED |
++ DSI0_INT_PHY_DIR_RTF));
++ } else {
++ DSI_PORT_WRITE(INT_EN,
++ (DSI0_INTERRUPTS_ALWAYS_ENABLED |
++ VC4_SET_FIELD(DSI0_INT_CMDC_DONE_NO_REPEAT,
++ DSI0_INT_CMDC_DONE)));
++ }
+ } else {
+- DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
+- DSI1_INT_TXPKT1_DONE));
++ DSI_PORT_WRITE(INT_STAT,
++ DSI1_INT_TXPKT1_DONE | DSI1_INT_PHY_DIR_RTF);
++ if (msg->rx_len) {
++ DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
++ DSI1_INT_PHY_DIR_RTF));
++ } else {
++ DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
++ DSI1_INT_TXPKT1_DONE));
++ }
+ }
+
+ /* Send the packet. */
+@@ -1207,7 +1265,7 @@ static ssize_t vc4_dsi_host_transfer(struct mipi_dsi_host *host,
+ ret = dsi->xfer_result;
+ }
+
+- DSI_PORT_WRITE(INT_EN, DSI1_INTERRUPTS_ALWAYS_ENABLED);
++ DSI_PORT_WRITE(INT_EN, DSI_PORT_BIT(INTERRUPTS_ALWAYS_ENABLED));
+
+ if (ret)
+ goto reset_fifo_and_return;
+@@ -1253,7 +1311,7 @@ reset_fifo_and_return:
+ DSI_PORT_BIT(CTRL_RESET_FIFOS));
+
+ DSI_PORT_WRITE(TXPKT1C, 0);
+- DSI_PORT_WRITE(INT_EN, DSI1_INTERRUPTS_ALWAYS_ENABLED);
++ DSI_PORT_WRITE(INT_EN, DSI_PORT_BIT(INTERRUPTS_ALWAYS_ENABLED));
+ return ret;
+ }
+
+@@ -1390,26 +1448,28 @@ static irqreturn_t vc4_dsi_irq_handler(int irq, void *data)
+ DSI_PORT_WRITE(INT_STAT, stat);
+
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_ERR_SYNC_ESC, "LPDT sync");
++ DSI_PORT_BIT(INT_ERR_SYNC_ESC), "LPDT sync");
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_ERR_CONTROL, "data lane 0 sequence");
++ DSI_PORT_BIT(INT_ERR_CONTROL), "data lane 0 sequence");
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_ERR_CONT_LP0, "LP0 contention");
++ DSI_PORT_BIT(INT_ERR_CONT_LP0), "LP0 contention");
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_ERR_CONT_LP1, "LP1 contention");
++ DSI_PORT_BIT(INT_ERR_CONT_LP1), "LP1 contention");
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_HSTX_TO, "HSTX timeout");
++ DSI_PORT_BIT(INT_HSTX_TO), "HSTX timeout");
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_LPRX_TO, "LPRX timeout");
++ DSI_PORT_BIT(INT_LPRX_TO), "LPRX timeout");
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_TA_TO, "turnaround timeout");
++ DSI_PORT_BIT(INT_TA_TO), "turnaround timeout");
+ dsi_handle_error(dsi, &ret, stat,
+- DSI1_INT_PR_TO, "peripheral reset timeout");
++ DSI_PORT_BIT(INT_PR_TO), "peripheral reset timeout");
+
+- if (stat & (DSI1_INT_TXPKT1_DONE | DSI1_INT_PHY_DIR_RTF)) {
++ if (stat & ((dsi->variant->port ? DSI1_INT_TXPKT1_DONE :
++ DSI0_INT_CMDC_DONE_MASK) |
++ DSI_PORT_BIT(INT_PHY_DIR_RTF))) {
+ complete(&dsi->xfer_completion);
+ ret = IRQ_HANDLED;
+- } else if (stat & DSI1_INT_HSTX_TO) {
++ } else if (stat & DSI_PORT_BIT(INT_HSTX_TO)) {
+ complete(&dsi->xfer_completion);
+ dsi->xfer_result = -ETIMEDOUT;
+ ret = IRQ_HANDLED;
+@@ -1487,13 +1547,29 @@ vc4_dsi_init_phy_clocks(struct vc4_dsi *dsi)
+ dsi->clk_onecell);
+ }
+
++static void vc4_dsi_dma_mem_release(void *ptr)
++{
++ struct vc4_dsi *dsi = ptr;
++ struct device *dev = &dsi->pdev->dev;
++
++ dma_free_coherent(dev, 4, dsi->reg_dma_mem, dsi->reg_dma_paddr);
++ dsi->reg_dma_mem = NULL;
++}
++
++static void vc4_dsi_dma_chan_release(void *ptr)
++{
++ struct vc4_dsi *dsi = ptr;
++
++ dma_release_channel(dsi->reg_dma_chan);
++ dsi->reg_dma_chan = NULL;
++}
++
+ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
+ {
+ struct platform_device *pdev = to_platform_device(dev);
+ struct drm_device *drm = dev_get_drvdata(master);
+ struct vc4_dsi *dsi = dev_get_drvdata(dev);
+ struct vc4_dsi_encoder *vc4_dsi_encoder;
+- dma_cap_mask_t dma_mask;
+ int ret;
+
+ dsi->variant = of_device_get_match_data(dev);
+@@ -1504,7 +1580,8 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&dsi->bridge_chain);
+- vc4_dsi_encoder->base.type = VC4_ENCODER_TYPE_DSI1;
++ vc4_dsi_encoder->base.type = dsi->variant->port ?
++ VC4_ENCODER_TYPE_DSI1 : VC4_ENCODER_TYPE_DSI0;
+ vc4_dsi_encoder->dsi = dsi;
+ dsi->encoder = &vc4_dsi_encoder->base.base;
+
+@@ -1527,6 +1604,8 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
+ * so set up a channel for talking to it.
+ */
+ if (dsi->variant->broken_axi_workaround) {
++ dma_cap_mask_t dma_mask;
++
+ dsi->reg_dma_mem = dma_alloc_coherent(dev, 4,
+ &dsi->reg_dma_paddr,
+ GFP_KERNEL);
+@@ -1535,8 +1614,13 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
+ return -ENOMEM;
+ }
+
++ ret = devm_add_action_or_reset(dev, vc4_dsi_dma_mem_release, dsi);
++ if (ret)
++ return ret;
++
+ dma_cap_zero(dma_mask);
+ dma_cap_set(DMA_MEMCPY, dma_mask);
++
+ dsi->reg_dma_chan = dma_request_chan_by_mask(&dma_mask);
+ if (IS_ERR(dsi->reg_dma_chan)) {
+ ret = PTR_ERR(dsi->reg_dma_chan);
+@@ -1546,6 +1630,10 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
+ return ret;
+ }
+
++ ret = devm_add_action_or_reset(dev, vc4_dsi_dma_chan_release, dsi);
++ if (ret)
++ return ret;
++
+ /* Get the physical address of the device's registers. The
+ * struct resource for the regs gives us the bus address
+ * instead.
+diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
+index 98b78ec6b37d6..8b1e521450820 100644
+--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
++++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
+@@ -79,6 +79,11 @@
+ #define VC5_HDMI_VERTB_VSPO_SHIFT 16
+ #define VC5_HDMI_VERTB_VSPO_MASK VC4_MASK(29, 16)
+
++#define VC4_HDMI_MISC_CONTROL_PIXEL_REP_SHIFT 0
++#define VC4_HDMI_MISC_CONTROL_PIXEL_REP_MASK VC4_MASK(3, 0)
++#define VC5_HDMI_MISC_CONTROL_PIXEL_REP_SHIFT 0
++#define VC5_HDMI_MISC_CONTROL_PIXEL_REP_MASK VC4_MASK(3, 0)
++
+ #define VC5_HDMI_SCRAMBLER_CTL_ENABLE BIT(0)
+
+ #define VC5_HDMI_DEEP_COLOR_CONFIG_1_INIT_PACK_PHASE_SHIFT 8
+@@ -122,6 +127,12 @@ static int vc4_hdmi_debugfs_regs(struct seq_file *m, void *unused)
+
+ drm_print_regset32(&p, &vc4_hdmi->hdmi_regset);
+ drm_print_regset32(&p, &vc4_hdmi->hd_regset);
++ drm_print_regset32(&p, &vc4_hdmi->cec_regset);
++ drm_print_regset32(&p, &vc4_hdmi->csc_regset);
++ drm_print_regset32(&p, &vc4_hdmi->dvp_regset);
++ drm_print_regset32(&p, &vc4_hdmi->phy_regset);
++ drm_print_regset32(&p, &vc4_hdmi->ram_regset);
++ drm_print_regset32(&p, &vc4_hdmi->rm_regset);
+
+ return 0;
+ }
+@@ -433,9 +444,11 @@ static void vc4_hdmi_write_infoframe(struct drm_encoder *encoder,
+ const struct vc4_hdmi_register *ram_packet_start =
+ &vc4_hdmi->variant->registers[HDMI_RAM_PACKET_START];
+ u32 packet_reg = ram_packet_start->offset + VC4_HDMI_PACKET_STRIDE * packet_id;
++ u32 packet_reg_next = ram_packet_start->offset +
++ VC4_HDMI_PACKET_STRIDE * (packet_id + 1);
+ void __iomem *base = __vc4_hdmi_get_field_base(vc4_hdmi,
+ ram_packet_start->reg);
+- uint8_t buffer[VC4_HDMI_PACKET_STRIDE];
++ uint8_t buffer[VC4_HDMI_PACKET_STRIDE] = {};
+ unsigned long flags;
+ ssize_t len, i;
+ int ret;
+@@ -471,6 +484,13 @@ static void vc4_hdmi_write_infoframe(struct drm_encoder *encoder,
+ packet_reg += 4;
+ }
+
++ /*
++ * clear remainder of packet ram as it's included in the
++ * infoframe and triggers a checksum error on hdmi analyser
++ */
++ for (; packet_reg < packet_reg_next; packet_reg += 4)
++ writel(0, base + packet_reg);
++
+ HDMI_WRITE(HDMI_RAM_PACKET_CONFIG,
+ HDMI_READ(HDMI_RAM_PACKET_CONFIG) | BIT(packet_id));
+
+@@ -852,14 +872,15 @@ static void vc4_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
+ VC4_HDMI_VERTA_VFP) |
+ VC4_SET_FIELD(mode->crtc_vdisplay, VC4_HDMI_VERTA_VAL));
+ u32 vertb = (VC4_SET_FIELD(0, VC4_HDMI_VERTB_VSPO) |
+- VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end,
++ VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end +
++ interlaced,
+ VC4_HDMI_VERTB_VBP));
+ u32 vertb_even = (VC4_SET_FIELD(0, VC4_HDMI_VERTB_VSPO) |
+ VC4_SET_FIELD(mode->crtc_vtotal -
+- mode->crtc_vsync_end -
+- interlaced,
++ mode->crtc_vsync_end,
+ VC4_HDMI_VERTB_VBP));
+ unsigned long flags;
++ u32 reg;
+
+ spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
+
+@@ -886,6 +907,11 @@ static void vc4_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
+ HDMI_WRITE(HDMI_VERTB0, vertb_even);
+ HDMI_WRITE(HDMI_VERTB1, vertb);
+
++ reg = HDMI_READ(HDMI_MISC_CONTROL);
++ reg &= ~VC4_HDMI_MISC_CONTROL_PIXEL_REP_MASK;
++ reg |= VC4_SET_FIELD(pixel_rep - 1, VC4_HDMI_MISC_CONTROL_PIXEL_REP);
++ HDMI_WRITE(HDMI_MISC_CONTROL, reg);
++
+ spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+ }
+
+@@ -902,13 +928,13 @@ static void vc5_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
+ VC4_SET_FIELD(mode->crtc_vsync_start - mode->crtc_vdisplay,
+ VC5_HDMI_VERTA_VFP) |
+ VC4_SET_FIELD(mode->crtc_vdisplay, VC5_HDMI_VERTA_VAL));
+- u32 vertb = (VC4_SET_FIELD(0, VC5_HDMI_VERTB_VSPO) |
++ u32 vertb = (VC4_SET_FIELD(mode->htotal >> (2 - pixel_rep),
++ VC5_HDMI_VERTB_VSPO) |
+ VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end,
+ VC4_HDMI_VERTB_VBP));
+ u32 vertb_even = (VC4_SET_FIELD(0, VC5_HDMI_VERTB_VSPO) |
+ VC4_SET_FIELD(mode->crtc_vtotal -
+- mode->crtc_vsync_end -
+- interlaced,
++ mode->crtc_vsync_end - interlaced,
+ VC4_HDMI_VERTB_VBP));
+ unsigned long flags;
+ unsigned char gcp;
+@@ -973,6 +999,11 @@ static void vc5_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
+ reg |= gcp_en ? VC5_HDMI_GCP_CONFIG_GCP_ENABLE : 0;
+ HDMI_WRITE(HDMI_GCP_CONFIG, reg);
+
++ reg = HDMI_READ(HDMI_MISC_CONTROL);
++ reg &= ~VC5_HDMI_MISC_CONTROL_PIXEL_REP_MASK;
++ reg |= VC4_SET_FIELD(pixel_rep - 1, VC5_HDMI_MISC_CONTROL_PIXEL_REP);
++ HDMI_WRITE(HDMI_MISC_CONTROL, reg);
++
+ HDMI_WRITE(HDMI_CLOCK_STOP, 0);
+
+ spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+@@ -1253,11 +1284,25 @@ static int vc4_hdmi_encoder_atomic_check(struct drm_encoder *encoder,
+ unsigned long long pixel_rate = mode->clock * 1000;
+ unsigned long long tmds_rate;
+
+- if (vc4_hdmi->variant->unsupported_odd_h_timings &&
+- !(mode->flags & DRM_MODE_FLAG_DBLCLK) &&
+- ((mode->hdisplay % 2) || (mode->hsync_start % 2) ||
+- (mode->hsync_end % 2) || (mode->htotal % 2)))
+- return -EINVAL;
++ if (vc4_hdmi->variant->unsupported_odd_h_timings) {
++ if (mode->flags & DRM_MODE_FLAG_DBLCLK) {
++ /* Only try to fixup DBLCLK modes to get 480i and 576i
++ * working.
++ * A generic solution for all modes with odd horizontal
++ * timing values seems impossible based on trying to
++ * solve it for 1366x768 monitors.
++ */
++ if ((mode->hsync_start - mode->hdisplay) & 1)
++ mode->hsync_start--;
++ if ((mode->hsync_end - mode->hsync_start) & 1)
++ mode->hsync_end--;
++ }
++
++ /* Now check whether we still have odd values remaining */
++ if ((mode->hdisplay % 2) || (mode->hsync_start % 2) ||
++ (mode->hsync_end % 2) || (mode->htotal % 2))
++ return -EINVAL;
++ }
+
+ /*
+ * The 1440p@60 pixel rate is in the same range than the first
+@@ -1611,10 +1656,10 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
+
+ /* Set the MAI threshold */
+ HDMI_WRITE(HDMI_MAI_THR,
+- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_PANICHIGH) |
+- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_PANICLOW) |
+- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_DREQHIGH) |
+- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_DREQLOW));
++ VC4_SET_FIELD(0x08, VC4_HD_MAI_THR_PANICHIGH) |
++ VC4_SET_FIELD(0x08, VC4_HD_MAI_THR_PANICLOW) |
++ VC4_SET_FIELD(0x06, VC4_HD_MAI_THR_DREQHIGH) |
++ VC4_SET_FIELD(0x08, VC4_HD_MAI_THR_DREQLOW));
+
+ HDMI_WRITE(HDMI_MAI_CONFIG,
+ VC4_HDMI_MAI_CONFIG_BIT_REVERSE |
+@@ -1705,12 +1750,12 @@ static int vc4_hdmi_audio_init(struct vc4_hdmi *vc4_hdmi)
+ struct device *dev = &vc4_hdmi->pdev->dev;
+ struct platform_device *codec_pdev;
+ const __be32 *addr;
+- int index;
++ int index, len;
+ int ret;
+
+- if (!of_find_property(dev->of_node, "dmas", NULL)) {
++ if (!of_find_property(dev->of_node, "dmas", &len) || !len) {
+ dev_warn(dev,
+- "'dmas' DT property is missing, no HDMI audio\n");
++ "'dmas' DT property is missing or empty, no HDMI audio\n");
+ return 0;
+ }
+
+@@ -2191,8 +2236,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
+ struct cec_connector_info conn_info;
+ struct platform_device *pdev = vc4_hdmi->pdev;
+ struct device *dev = &pdev->dev;
+- unsigned long flags;
+- u32 value;
+ int ret;
+
+ if (!of_find_property(dev->of_node, "interrupts", NULL)) {
+@@ -2211,15 +2254,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
+ cec_fill_conn_info_from_drm(&conn_info, &vc4_hdmi->connector);
+ cec_s_conn_info(vc4_hdmi->cec_adap, &conn_info);
+
+- spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
+- value = HDMI_READ(HDMI_CEC_CNTRL_1);
+- /* Set the logical address to Unregistered */
+- value |= VC4_HDMI_CEC_ADDR_MASK;
+- HDMI_WRITE(HDMI_CEC_CNTRL_1, value);
+- spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+-
+- vc4_hdmi_cec_update_clk_div(vc4_hdmi);
+-
+ if (vc4_hdmi->variant->external_irq_controller) {
+ ret = request_threaded_irq(platform_get_irq_byname(pdev, "cec-rx"),
+ vc4_cec_irq_handler_rx_bare,
+@@ -2235,10 +2269,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
+ if (ret)
+ goto err_remove_cec_rx_handler;
+ } else {
+- spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
+- HDMI_WRITE(HDMI_CEC_CPU_MASK_SET, 0xffffffff);
+- spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+-
+ ret = request_threaded_irq(platform_get_irq(pdev, 0),
+ vc4_cec_irq_handler,
+ vc4_cec_irq_handler_thread, 0,
+@@ -2289,7 +2319,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
+ }
+
+ static void vc4_hdmi_cec_exit(struct vc4_hdmi *vc4_hdmi) {};
+-
+ #endif
+
+ static int vc4_hdmi_build_regset(struct vc4_hdmi *vc4_hdmi,
+@@ -2374,6 +2403,7 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi *vc4_hdmi)
+ struct platform_device *pdev = vc4_hdmi->pdev;
+ struct device *dev = &pdev->dev;
+ struct resource *res;
++ int ret;
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "hdmi");
+ if (!res)
+@@ -2470,6 +2500,38 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi *vc4_hdmi)
+ return PTR_ERR(vc4_hdmi->reset);
+ }
+
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->hdmi_regset, VC4_HDMI);
++ if (ret)
++ return ret;
++
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->hd_regset, VC4_HD);
++ if (ret)
++ return ret;
++
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->cec_regset, VC5_CEC);
++ if (ret)
++ return ret;
++
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->csc_regset, VC5_CSC);
++ if (ret)
++ return ret;
++
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->dvp_regset, VC5_DVP);
++ if (ret)
++ return ret;
++
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->phy_regset, VC5_PHY);
++ if (ret)
++ return ret;
++
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->ram_regset, VC5_RAM);
++ if (ret)
++ return ret;
++
++ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->rm_regset, VC5_RM);
++ if (ret)
++ return ret;
++
+ return 0;
+ }
+
+@@ -2485,12 +2547,34 @@ static int __maybe_unused vc4_hdmi_runtime_suspend(struct device *dev)
+ static int vc4_hdmi_runtime_resume(struct device *dev)
+ {
+ struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
++ unsigned long __maybe_unused flags;
++ u32 __maybe_unused value;
+ int ret;
+
+ ret = clk_prepare_enable(vc4_hdmi->hsm_clock);
+ if (ret)
+ return ret;
+
++ if (vc4_hdmi->variant->reset)
++ vc4_hdmi->variant->reset(vc4_hdmi);
++
++#ifdef CONFIG_DRM_VC4_HDMI_CEC
++ spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
++ value = HDMI_READ(HDMI_CEC_CNTRL_1);
++ /* Set the logical address to Unregistered */
++ value |= VC4_HDMI_CEC_ADDR_MASK;
++ HDMI_WRITE(HDMI_CEC_CNTRL_1, value);
++ spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
++
++ vc4_hdmi_cec_update_clk_div(vc4_hdmi);
++
++ if (!vc4_hdmi->variant->external_irq_controller) {
++ spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
++ HDMI_WRITE(HDMI_CEC_CPU_MASK_SET, 0xffffffff);
++ spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
++ }
++#endif
++
+ return 0;
+ }
+
+@@ -2593,9 +2677,6 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
+ pm_runtime_set_active(dev);
+ pm_runtime_enable(dev);
+
+- if (vc4_hdmi->variant->reset)
+- vc4_hdmi->variant->reset(vc4_hdmi);
+-
+ if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||
+ of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1")) &&
+ HDMI_READ(HDMI_VID_CTL) & VC4_HD_VID_CTL_ENABLE) {
+diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
+index 1076faeab6163..2b9f5ca15a40d 100644
+--- a/drivers/gpu/drm/vc4/vc4_hdmi.h
++++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
+@@ -184,6 +184,14 @@ struct vc4_hdmi {
+ struct debugfs_regset32 hdmi_regset;
+ struct debugfs_regset32 hd_regset;
+
++ /* VC5 only */
++ struct debugfs_regset32 cec_regset;
++ struct debugfs_regset32 csc_regset;
++ struct debugfs_regset32 dvp_regset;
++ struct debugfs_regset32 phy_regset;
++ struct debugfs_regset32 ram_regset;
++ struct debugfs_regset32 rm_regset;
++
+ /**
+ * @hw_lock: Spinlock protecting device register access.
+ */
+diff --git a/drivers/gpu/drm/vc4/vc4_hdmi_regs.h b/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
+index fc971506bd4f9..72b7694124829 100644
+--- a/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
++++ b/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
+@@ -125,6 +125,7 @@ enum vc4_hdmi_field {
+ HDMI_VERTB0,
+ HDMI_VERTB1,
+ HDMI_VID_CTL,
++ HDMI_MISC_CONTROL,
+ };
+
+ struct vc4_hdmi_register {
+@@ -235,6 +236,7 @@ static const struct vc4_hdmi_register __maybe_unused vc5_hdmi_hdmi0_fields[] = {
+ VC4_HDMI_REG(HDMI_VERTB0, 0x0f0),
+ VC4_HDMI_REG(HDMI_VERTA1, 0x0f4),
+ VC4_HDMI_REG(HDMI_VERTB1, 0x0f8),
++ VC4_HDMI_REG(HDMI_MISC_CONTROL, 0x100),
+ VC4_HDMI_REG(HDMI_MAI_CHANNEL_MAP, 0x09c),
+ VC4_HDMI_REG(HDMI_MAI_CONFIG, 0x0a0),
+ VC4_HDMI_REG(HDMI_DEEP_COLOR_CONFIG_1, 0x170),
+@@ -315,6 +317,7 @@ static const struct vc4_hdmi_register __maybe_unused vc5_hdmi_hdmi1_fields[] = {
+ VC4_HDMI_REG(HDMI_VERTB0, 0x0f0),
+ VC4_HDMI_REG(HDMI_VERTA1, 0x0f4),
+ VC4_HDMI_REG(HDMI_VERTB1, 0x0f8),
++ VC4_HDMI_REG(HDMI_MISC_CONTROL, 0x100),
+ VC4_HDMI_REG(HDMI_MAI_CHANNEL_MAP, 0x09c),
+ VC4_HDMI_REG(HDMI_MAI_CONFIG, 0x0a0),
+ VC4_HDMI_REG(HDMI_DEEP_COLOR_CONFIG_1, 0x170),
+@@ -414,7 +417,7 @@ static inline u32 vc4_hdmi_read(struct vc4_hdmi *hdmi,
+ const struct vc4_hdmi_variant *variant = hdmi->variant;
+ void __iomem *base;
+
+- WARN_ON(!pm_runtime_active(&hdmi->pdev->dev));
++ WARN_ON(pm_runtime_status_suspended(&hdmi->pdev->dev));
+
+ if (reg >= variant->num_registers) {
+ dev_warn(&hdmi->pdev->dev,
+@@ -444,7 +447,7 @@ static inline void vc4_hdmi_write(struct vc4_hdmi *hdmi,
+
+ lockdep_assert_held(&hdmi->hw_lock);
+
+- WARN_ON(!pm_runtime_active(&hdmi->pdev->dev));
++ WARN_ON(pm_runtime_status_suspended(&hdmi->pdev->dev));
+
+ if (reg >= variant->num_registers) {
+ dev_warn(&hdmi->pdev->dev,
+diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
+index 992d6a2400029..dba23ae2e65eb 100644
+--- a/drivers/gpu/drm/vc4/vc4_kms.c
++++ b/drivers/gpu/drm/vc4/vc4_kms.c
+@@ -890,7 +890,9 @@ vc4_core_clock_atomic_check(struct drm_atomic_state *state)
+ continue;
+
+ num_outputs++;
+- cob_rate += hvs_new_state->fifo_state[i].fifo_load;
++ cob_rate = max_t(unsigned long,
++ hvs_new_state->fifo_state[i].fifo_load,
++ cob_rate);
+ }
+
+ pixel_rate = load_state->hvs_load;
+diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
+index 920a9eefe4261..a82a0b1190ebb 100644
+--- a/drivers/gpu/drm/vc4/vc4_plane.c
++++ b/drivers/gpu/drm/vc4/vc4_plane.c
+@@ -310,16 +310,16 @@ static int vc4_plane_margins_adj(struct drm_plane_state *pstate)
+ adjhdisplay,
+ crtc_state->mode.hdisplay);
+ vc4_pstate->crtc_x += left;
+- if (vc4_pstate->crtc_x > crtc_state->mode.hdisplay - left)
+- vc4_pstate->crtc_x = crtc_state->mode.hdisplay - left;
++ if (vc4_pstate->crtc_x > crtc_state->mode.hdisplay - right)
++ vc4_pstate->crtc_x = crtc_state->mode.hdisplay - right;
+
+ adjvdisplay = crtc_state->mode.vdisplay - (top + bottom);
+ vc4_pstate->crtc_y = DIV_ROUND_CLOSEST(vc4_pstate->crtc_y *
+ adjvdisplay,
+ crtc_state->mode.vdisplay);
+ vc4_pstate->crtc_y += top;
+- if (vc4_pstate->crtc_y > crtc_state->mode.vdisplay - top)
+- vc4_pstate->crtc_y = crtc_state->mode.vdisplay - top;
++ if (vc4_pstate->crtc_y > crtc_state->mode.vdisplay - bottom)
++ vc4_pstate->crtc_y = crtc_state->mode.vdisplay - bottom;
+
+ vc4_pstate->crtc_w = DIV_ROUND_CLOSEST(vc4_pstate->crtc_w *
+ adjhdisplay,
+@@ -339,7 +339,6 @@ static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state)
+ struct vc4_plane_state *vc4_state = to_vc4_plane_state(state);
+ struct drm_framebuffer *fb = state->fb;
+ struct drm_gem_cma_object *bo = drm_fb_cma_get_gem_obj(fb, 0);
+- u32 subpixel_src_mask = (1 << 16) - 1;
+ int num_planes = fb->format->num_planes;
+ struct drm_crtc_state *crtc_state;
+ u32 h_subsample = fb->format->hsub;
+@@ -361,18 +360,15 @@ static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state)
+ for (i = 0; i < num_planes; i++)
+ vc4_state->offsets[i] = bo->paddr + fb->offsets[i];
+
+- /* We don't support subpixel source positioning for scaling. */
+- if ((state->src.x1 & subpixel_src_mask) ||
+- (state->src.x2 & subpixel_src_mask) ||
+- (state->src.y1 & subpixel_src_mask) ||
+- (state->src.y2 & subpixel_src_mask)) {
+- return -EINVAL;
+- }
+-
+- vc4_state->src_x = state->src.x1 >> 16;
+- vc4_state->src_y = state->src.y1 >> 16;
+- vc4_state->src_w[0] = (state->src.x2 - state->src.x1) >> 16;
+- vc4_state->src_h[0] = (state->src.y2 - state->src.y1) >> 16;
++ /*
++ * We don't support subpixel source positioning for scaling,
++ * but fractional coordinates can be generated by clipping
++ * so just round for now
++ */
++ vc4_state->src_x = DIV_ROUND_CLOSEST(state->src.x1, 1 << 16);
++ vc4_state->src_y = DIV_ROUND_CLOSEST(state->src.y1, 1 << 16);
++ vc4_state->src_w[0] = DIV_ROUND_CLOSEST(state->src.x2, 1 << 16) - vc4_state->src_x;
++ vc4_state->src_h[0] = DIV_ROUND_CLOSEST(state->src.y2, 1 << 16) - vc4_state->src_y;
+
+ vc4_state->crtc_x = state->dst.x1;
+ vc4_state->crtc_y = state->dst.y1;
+diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+index c708bab555c6b..10f080060923b 100644
+--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
++++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+@@ -579,8 +579,10 @@ static int virtio_gpu_get_caps_ioctl(struct drm_device *dev,
+ spin_unlock(&vgdev->display_info_lock);
+
+ /* not in cache - need to talk to hw */
+- virtio_gpu_cmd_get_capset(vgdev, found_valid, args->cap_set_ver,
+- &cache_ent);
++ ret = virtio_gpu_cmd_get_capset(vgdev, found_valid, args->cap_set_ver,
++ &cache_ent);
++ if (ret)
++ return ret;
+ virtio_gpu_notify(vgdev);
+
+ copy_exit:
+diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c
+index f293e6ad52daf..1cc8f3fc8e4ba 100644
+--- a/drivers/gpu/drm/virtio/virtgpu_object.c
++++ b/drivers/gpu/drm/virtio/virtgpu_object.c
+@@ -168,9 +168,9 @@ static int virtio_gpu_object_shmem_init(struct virtio_gpu_device *vgdev,
+ * since virtio_gpu doesn't support dma-buf import from other devices.
+ */
+ shmem->pages = drm_gem_shmem_get_sg_table(&bo->base);
+- if (!shmem->pages) {
++ if (IS_ERR(shmem->pages)) {
+ drm_gem_shmem_unpin(&bo->base);
+- return -EINVAL;
++ return PTR_ERR(shmem->pages);
+ }
+
+ if (use_dma_api) {
+diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
+index c6a1036bf2ea7..b47ac170108ca 100644
+--- a/drivers/gpu/drm/vkms/vkms_composer.c
++++ b/drivers/gpu/drm/vkms/vkms_composer.c
+@@ -157,7 +157,7 @@ static void compose_plane(struct vkms_composer *primary_composer,
+ void *vaddr;
+ void (*pixel_blend)(const u8 *p_src, u8 *p_dst);
+
+- if (WARN_ON(iosys_map_is_null(&primary_composer->map[0])))
++ if (WARN_ON(iosys_map_is_null(&plane_composer->map[0])))
+ return;
+
+ vaddr = plane_composer->map[0].vaddr;
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+index 444acd9e2cd6a..8be33cf3b6c23 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+@@ -155,6 +155,8 @@ int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
+ dev = &privdata->pdev->dev;
+
+ cl_data->num_hid_devices = amd_mp2_get_sensor_num(privdata, &cl_data->sensor_idx[0]);
++ if (cl_data->num_hid_devices == 0)
++ return -ENODEV;
+
+ INIT_DELAYED_WORK(&cl_data->work, amd_sfh_work);
+ INIT_DELAYED_WORK(&cl_data->work_buffer, amd_sfh_work_buffer);
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
+index e2a9679e32be8..54fe224050f95 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
+@@ -100,11 +100,15 @@ static int amdtp_wait_for_response(struct hid_device *hid)
+
+ void amdtp_hid_wakeup(struct hid_device *hid)
+ {
+- struct amdtp_hid_data *hid_data = hid->driver_data;
+- struct amdtp_cl_data *cli_data = hid_data->cli_data;
++ struct amdtp_hid_data *hid_data;
++ struct amdtp_cl_data *cli_data;
+
+- cli_data->request_done[cli_data->cur_hid_dev] = true;
+- wake_up_interruptible(&hid_data->hid_wait);
++ if (hid) {
++ hid_data = hid->driver_data;
++ cli_data = hid_data->cli_data;
++ cli_data->request_done[cli_data->cur_hid_dev] = true;
++ wake_up_interruptible(&hid_data->hid_wait);
++ }
+ }
+
+ static struct hid_ll_driver amdtp_hid_ll_driver = {
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
+index e18a4efd8839e..390298a6fc85b 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
+@@ -327,7 +327,8 @@ static int amd_mp2_pci_probe(struct pci_dev *pdev, const struct pci_device_id *i
+ rc = amd_sfh_hid_client_init(privdata);
+ if (rc) {
+ amd_sfh_clear_intr(privdata);
+- dev_err(&pdev->dev, "amd_sfh_hid_client_init failed\n");
++ if (rc != -EOPNOTSUPP)
++ dev_err(&pdev->dev, "amd_sfh_hid_client_init failed\n");
+ return rc;
+ }
+
+diff --git a/drivers/hid/hid-alps.c b/drivers/hid/hid-alps.c
+index 2b986d0dbde46..db146d0f7937e 100644
+--- a/drivers/hid/hid-alps.c
++++ b/drivers/hid/hid-alps.c
+@@ -830,6 +830,8 @@ static const struct hid_device_id alps_id[] = {
+ USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_U1_DUAL) },
+ { HID_DEVICE(HID_BUS_ANY, HID_GROUP_ANY,
+ USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_U1) },
++ { HID_DEVICE(HID_BUS_ANY, HID_GROUP_ANY,
++ USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_U1_UNICORN_LEGACY) },
+ { HID_DEVICE(HID_BUS_ANY, HID_GROUP_ANY,
+ USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_T4_BTNLESS) },
+ { }
+diff --git a/drivers/hid/hid-cp2112.c b/drivers/hid/hid-cp2112.c
+index ece147d1a2789..1e16b0fa310d1 100644
+--- a/drivers/hid/hid-cp2112.c
++++ b/drivers/hid/hid-cp2112.c
+@@ -790,6 +790,11 @@ static int cp2112_xfer(struct i2c_adapter *adap, u16 addr,
+ data->word = le16_to_cpup((__le16 *)buf);
+ break;
+ case I2C_SMBUS_I2C_BLOCK_DATA:
++ if (read_length > I2C_SMBUS_BLOCK_MAX) {
++ ret = -EINVAL;
++ goto power_normal;
++ }
++
+ memcpy(data->block + 1, buf, read_length);
+ break;
+ case I2C_SMBUS_BLOCK_DATA:
+diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
+index c297c63f3ec5c..e233726d9a748 100644
+--- a/drivers/hid/hid-ids.h
++++ b/drivers/hid/hid-ids.h
+@@ -413,6 +413,7 @@
+ #define USB_DEVICE_ID_ASUS_UX550VE_TOUCHSCREEN 0x2544
+ #define USB_DEVICE_ID_ASUS_UX550_TOUCHSCREEN 0x2706
+ #define I2C_DEVICE_ID_SURFACE_GO_TOUCHSCREEN 0x261A
++#define I2C_DEVICE_ID_SURFACE_GO2_TOUCHSCREEN 0x2A1C
+
+ #define USB_VENDOR_ID_ELECOM 0x056e
+ #define USB_DEVICE_ID_ELECOM_BM084 0x0061
+diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
+index c6b27aab90414..48c1c02c69f4e 100644
+--- a/drivers/hid/hid-input.c
++++ b/drivers/hid/hid-input.c
+@@ -381,6 +381,8 @@ static const struct hid_device_id hid_battery_quirks[] = {
+ HID_BATTERY_QUIRK_IGNORE },
+ { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, I2C_DEVICE_ID_SURFACE_GO_TOUCHSCREEN),
+ HID_BATTERY_QUIRK_IGNORE },
++ { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, I2C_DEVICE_ID_SURFACE_GO2_TOUCHSCREEN),
++ HID_BATTERY_QUIRK_IGNORE },
+ {}
+ };
+
+diff --git a/drivers/hid/hid-mcp2221.c b/drivers/hid/hid-mcp2221.c
+index 4211b9839209b..de52e9f7bb8cb 100644
+--- a/drivers/hid/hid-mcp2221.c
++++ b/drivers/hid/hid-mcp2221.c
+@@ -385,6 +385,9 @@ static int mcp_smbus_write(struct mcp2221 *mcp, u16 addr,
+ data_len = 7;
+ break;
+ default:
++ if (len > I2C_SMBUS_BLOCK_MAX)
++ return -EINVAL;
++
+ memcpy(&mcp->txbuf[5], buf, len);
+ data_len = len + 5;
+ }
+diff --git a/drivers/hid/hid-nintendo.c b/drivers/hid/hid-nintendo.c
+index 2204de889739f..4b1173957c17c 100644
+--- a/drivers/hid/hid-nintendo.c
++++ b/drivers/hid/hid-nintendo.c
+@@ -1586,6 +1586,7 @@ static const unsigned int joycon_button_inputs_r[] = {
+ /* We report joy-con d-pad inputs as buttons and pro controller as a hat. */
+ static const unsigned int joycon_dpad_inputs_jc[] = {
+ BTN_DPAD_UP, BTN_DPAD_DOWN, BTN_DPAD_LEFT, BTN_DPAD_RIGHT,
++ 0 /* 0 signals end of array */
+ };
+
+ static int joycon_input_create(struct joycon_ctlr *ctlr)
+diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c
+index 066c567dbaa22..9a81e63c330ea 100644
+--- a/drivers/hid/wacom_sys.c
++++ b/drivers/hid/wacom_sys.c
+@@ -2121,7 +2121,7 @@ static int wacom_register_inputs(struct wacom *wacom)
+
+ error = wacom_setup_pad_input_capabilities(pad_input_dev, wacom_wac);
+ if (error) {
+- /* no pad in use on this interface */
++ /* no pad events using this interface */
+ input_free_device(pad_input_dev);
+ wacom_wac->pad_input = NULL;
+ pad_input_dev = NULL;
+diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
+index a7176fc0635dd..c454231afec89 100644
+--- a/drivers/hid/wacom_wac.c
++++ b/drivers/hid/wacom_wac.c
+@@ -638,9 +638,26 @@ static int wacom_intuos_id_mangle(int tool_id)
+ return (tool_id & ~0xFFF) << 4 | (tool_id & 0xFFF);
+ }
+
++static bool wacom_is_art_pen(int tool_id)
++{
++ bool is_art_pen = false;
++
++ switch (tool_id) {
++ case 0x885: /* Intuos3 Marker Pen */
++ case 0x804: /* Intuos4/5 13HD/24HD Marker Pen */
++ case 0x10804: /* Intuos4/5 13HD/24HD Art Pen */
++ is_art_pen = true;
++ break;
++ }
++ return is_art_pen;
++}
++
+ static int wacom_intuos_get_tool_type(int tool_id)
+ {
+- int tool_type;
++ int tool_type = BTN_TOOL_PEN;
++
++ if (wacom_is_art_pen(tool_id))
++ return tool_type;
+
+ switch (tool_id) {
+ case 0x812: /* Inking pen */
+@@ -655,12 +672,9 @@ static int wacom_intuos_get_tool_type(int tool_id)
+ case 0x852:
+ case 0x823: /* Intuos3 Grip Pen */
+ case 0x813: /* Intuos3 Classic Pen */
+- case 0x885: /* Intuos3 Marker Pen */
+ case 0x802: /* Intuos4/5 13HD/24HD General Pen */
+- case 0x804: /* Intuos4/5 13HD/24HD Marker Pen */
+ case 0x8e2: /* IntuosHT2 pen */
+ case 0x022:
+- case 0x10804: /* Intuos4/5 13HD/24HD Art Pen */
+ case 0x10842: /* MobileStudio Pro Pro Pen slim */
+ case 0x14802: /* Intuos4/5 13HD/24HD Classic Pen */
+ case 0x16802: /* Cintiq 13HD Pro Pen */
+@@ -718,10 +732,6 @@ static int wacom_intuos_get_tool_type(int tool_id)
+ case 0x10902: /* Intuos4/5 13HD/24HD Airbrush */
+ tool_type = BTN_TOOL_AIRBRUSH;
+ break;
+-
+- default: /* Unknown tool */
+- tool_type = BTN_TOOL_PEN;
+- break;
+ }
+ return tool_type;
+ }
+@@ -2007,7 +2017,6 @@ static void wacom_wac_pad_usage_mapping(struct hid_device *hdev,
+ wacom_wac->has_mute_touch_switch = true;
+ usage->type = EV_SW;
+ usage->code = SW_MUTE_DEVICE;
+- features->device_type |= WACOM_DEVICETYPE_PAD;
+ break;
+ case WACOM_HID_WD_TOUCHSTRIP:
+ wacom_map_usage(input, usage, field, EV_ABS, ABS_RX, 0);
+@@ -2087,6 +2096,30 @@ static void wacom_wac_pad_event(struct hid_device *hdev, struct hid_field *field
+ wacom_wac->hid_data.inrange_state |= value;
+ }
+
++ /* Process touch switch state first since it is reported through touch interface,
++ * which is indepentent of pad interface. In the case when there are no other pad
++ * events, the pad interface will not even be created.
++ */
++ if ((equivalent_usage == WACOM_HID_WD_MUTE_DEVICE) ||
++ (equivalent_usage == WACOM_HID_WD_TOUCHONOFF)) {
++ if (wacom_wac->shared->touch_input) {
++ bool *is_touch_on = &wacom_wac->shared->is_touch_on;
++
++ if (equivalent_usage == WACOM_HID_WD_MUTE_DEVICE && value)
++ *is_touch_on = !(*is_touch_on);
++ else if (equivalent_usage == WACOM_HID_WD_TOUCHONOFF)
++ *is_touch_on = value;
++
++ input_report_switch(wacom_wac->shared->touch_input,
++ SW_MUTE_DEVICE, !(*is_touch_on));
++ input_sync(wacom_wac->shared->touch_input);
++ }
++ return;
++ }
++
++ if (!input)
++ return;
++
+ switch (equivalent_usage) {
+ case WACOM_HID_WD_TOUCHRING:
+ /*
+@@ -2122,22 +2155,6 @@ static void wacom_wac_pad_event(struct hid_device *hdev, struct hid_field *field
+ input_event(input, usage->type, usage->code, 0);
+ break;
+
+- case WACOM_HID_WD_MUTE_DEVICE:
+- case WACOM_HID_WD_TOUCHONOFF:
+- if (wacom_wac->shared->touch_input) {
+- bool *is_touch_on = &wacom_wac->shared->is_touch_on;
+-
+- if (equivalent_usage == WACOM_HID_WD_MUTE_DEVICE && value)
+- *is_touch_on = !(*is_touch_on);
+- else if (equivalent_usage == WACOM_HID_WD_TOUCHONOFF)
+- *is_touch_on = value;
+-
+- input_report_switch(wacom_wac->shared->touch_input,
+- SW_MUTE_DEVICE, !(*is_touch_on));
+- input_sync(wacom_wac->shared->touch_input);
+- }
+- break;
+-
+ case WACOM_HID_WD_MODE_CHANGE:
+ if (wacom_wac->is_direct_mode != value) {
+ wacom_wac->is_direct_mode = value;
+@@ -2323,6 +2340,9 @@ static void wacom_wac_pen_event(struct hid_device *hdev, struct hid_field *field
+ }
+ return;
+ case HID_DG_TWIST:
++ /* don't modify the value if the pen doesn't support the feature */
++ if (!wacom_is_art_pen(wacom_wac->id[0])) return;
++
+ /*
+ * Userspace expects pen twist to have its zero point when
+ * the buttons/finger is on the tablet's left. HID values
+@@ -2795,7 +2815,7 @@ void wacom_wac_event(struct hid_device *hdev, struct hid_field *field,
+ /* usage tests must precede field tests */
+ if (WACOM_BATTERY_USAGE(usage))
+ wacom_wac_battery_event(hdev, field, usage, value);
+- else if (WACOM_PAD_FIELD(field) && wacom->wacom_wac.pad_input)
++ else if (WACOM_PAD_FIELD(field))
+ wacom_wac_pad_event(hdev, field, usage, value);
+ else if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input)
+ wacom_wac_pen_event(hdev, field, usage, value);
+diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
+index 84cb1ede7bc0b..a0df9a0abeb92 100644
+--- a/drivers/hwmon/dell-smm-hwmon.c
++++ b/drivers/hwmon/dell-smm-hwmon.c
+@@ -1262,6 +1262,14 @@ static const struct dmi_system_id i8k_whitelist_fan_control[] __initconst = {
+ },
+ .driver_data = (void *)&i8k_fan_control_data[I8K_FAN_34A3_35A3],
+ },
++ {
++ .ident = "Dell XPS 13 7390",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
++ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "XPS 13 7390"),
++ },
++ .driver_data = (void *)&i8k_fan_control_data[I8K_FAN_34A3_35A3],
++ },
+ { }
+ };
+
+diff --git a/drivers/hwmon/drivetemp.c b/drivers/hwmon/drivetemp.c
+index 1eb37106a220b..5bac2b0fc7bb6 100644
+--- a/drivers/hwmon/drivetemp.c
++++ b/drivers/hwmon/drivetemp.c
+@@ -621,3 +621,4 @@ module_exit(drivetemp_exit);
+ MODULE_AUTHOR("Guenter Roeck <linus@roeck-us.net>");
+ MODULE_DESCRIPTION("Hard drive temperature monitor");
+ MODULE_LICENSE("GPL");
++MODULE_ALIAS("platform:drivetemp");
+diff --git a/drivers/hwmon/sch56xx-common.c b/drivers/hwmon/sch56xx-common.c
+index 3ece53adabd62..de3a0886c2f72 100644
+--- a/drivers/hwmon/sch56xx-common.c
++++ b/drivers/hwmon/sch56xx-common.c
+@@ -523,6 +523,28 @@ static int __init sch56xx_device_add(int address, const char *name)
+ return PTR_ERR_OR_ZERO(sch56xx_pdev);
+ }
+
++static const struct dmi_system_id sch56xx_dmi_override_table[] __initconst = {
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "CELSIUS W380"),
++ },
++ },
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "ESPRIMO P710"),
++ },
++ },
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "ESPRIMO E9900"),
++ },
++ },
++ { }
++};
++
+ /* For autoloading only */
+ static const struct dmi_system_id sch56xx_dmi_table[] __initconst = {
+ {
+@@ -543,16 +565,18 @@ static int __init sch56xx_init(void)
+ if (!dmi_check_system(sch56xx_dmi_table))
+ return -ENODEV;
+
+- /*
+- * Some machines like the Esprimo P720 and Esprimo C700 have
+- * onboard devices named " Antiope"/" Theseus" instead of
+- * "Antiope"/"Theseus", so we need to check for both.
+- */
+- if (!dmi_find_device(DMI_DEV_TYPE_OTHER, "Antiope", NULL) &&
+- !dmi_find_device(DMI_DEV_TYPE_OTHER, " Antiope", NULL) &&
+- !dmi_find_device(DMI_DEV_TYPE_OTHER, "Theseus", NULL) &&
+- !dmi_find_device(DMI_DEV_TYPE_OTHER, " Theseus", NULL))
+- return -ENODEV;
++ if (!dmi_check_system(sch56xx_dmi_override_table)) {
++ /*
++ * Some machines like the Esprimo P720 and Esprimo C700 have
++ * onboard devices named " Antiope"/" Theseus" instead of
++ * "Antiope"/"Theseus", so we need to check for both.
++ */
++ if (!dmi_find_device(DMI_DEV_TYPE_OTHER, "Antiope", NULL) &&
++ !dmi_find_device(DMI_DEV_TYPE_OTHER, " Antiope", NULL) &&
++ !dmi_find_device(DMI_DEV_TYPE_OTHER, "Theseus", NULL) &&
++ !dmi_find_device(DMI_DEV_TYPE_OTHER, " Theseus", NULL))
++ return -ENODEV;
++ }
+ }
+
+ /*
+diff --git a/drivers/hwmon/sht15.c b/drivers/hwmon/sht15.c
+index 7f4a639597306..ae4d14257a11d 100644
+--- a/drivers/hwmon/sht15.c
++++ b/drivers/hwmon/sht15.c
+@@ -1020,25 +1020,20 @@ err_release_reg:
+ static int sht15_remove(struct platform_device *pdev)
+ {
+ struct sht15_data *data = platform_get_drvdata(pdev);
++ int ret;
+
+- /*
+- * Make sure any reads from the device are done and
+- * prevent new ones beginning
+- */
+- mutex_lock(&data->read_lock);
+- if (sht15_soft_reset(data)) {
+- mutex_unlock(&data->read_lock);
+- return -EFAULT;
+- }
+ hwmon_device_unregister(data->hwmon_dev);
+ sysfs_remove_group(&pdev->dev.kobj, &sht15_attr_group);
++
++ ret = sht15_soft_reset(data);
++ if (ret)
++ dev_err(&pdev->dev, "Failed to reset device (%pe)\n", ERR_PTR(ret));
++
+ if (!IS_ERR(data->reg)) {
+ regulator_unregister_notifier(data->reg, &data->nb);
+ regulator_disable(data->reg);
+ }
+
+- mutex_unlock(&data->read_lock);
+-
+ return 0;
+ }
+
+diff --git a/drivers/hwtracing/coresight/coresight-config.h b/drivers/hwtracing/coresight/coresight-config.h
+index 2e1670523461c..6ba0139757418 100644
+--- a/drivers/hwtracing/coresight/coresight-config.h
++++ b/drivers/hwtracing/coresight/coresight-config.h
+@@ -134,6 +134,7 @@ struct cscfg_feature_desc {
+ * @active_cnt: ref count for activate on this configuration.
+ * @load_owner: handle to load owner for dynamic load and unload of configs.
+ * @fs_group: reference to configfs group for dynamic unload.
++ * @available: config can be activated - multi-stage load sets true on completion.
+ */
+ struct cscfg_config_desc {
+ const char *name;
+@@ -148,6 +149,7 @@ struct cscfg_config_desc {
+ atomic_t active_cnt;
+ void *load_owner;
+ struct config_group *fs_group;
++ bool available;
+ };
+
+ /**
+diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
+index ee6ce92ab4c31..1edfec1e9d18e 100644
+--- a/drivers/hwtracing/coresight/coresight-core.c
++++ b/drivers/hwtracing/coresight/coresight-core.c
+@@ -1424,6 +1424,7 @@ static int coresight_remove_match(struct device *dev, void *data)
+ * platform data.
+ */
+ fwnode_handle_put(conn->child_fwnode);
++ conn->child_fwnode = NULL;
+ /* No need to continue */
+ break;
+ }
+diff --git a/drivers/hwtracing/coresight/coresight-syscfg.c b/drivers/hwtracing/coresight/coresight-syscfg.c
+index 11850fd8c3b5b..11138a9762b01 100644
+--- a/drivers/hwtracing/coresight/coresight-syscfg.c
++++ b/drivers/hwtracing/coresight/coresight-syscfg.c
+@@ -414,6 +414,27 @@ static void cscfg_remove_owned_csdev_features(struct coresight_device *csdev, vo
+ }
+ }
+
++/*
++ * Unregister all configuration and features from configfs owned by load_owner.
++ * Although this is called without the list mutex being held, it is in the
++ * context of an unload operation which are strictly serialised,
++ * so the lists cannot change during this call.
++ */
++static void cscfg_fs_unregister_cfgs_feats(void *load_owner)
++{
++ struct cscfg_config_desc *config_desc;
++ struct cscfg_feature_desc *feat_desc;
++
++ list_for_each_entry(config_desc, &cscfg_mgr->config_desc_list, item) {
++ if (config_desc->load_owner == load_owner)
++ cscfg_configfs_del_config(config_desc);
++ }
++ list_for_each_entry(feat_desc, &cscfg_mgr->feat_desc_list, item) {
++ if (feat_desc->load_owner == load_owner)
++ cscfg_configfs_del_feature(feat_desc);
++ }
++}
++
+ /*
+ * removal is relatively easy - just remove from all lists, anything that
+ * matches the owner. Memory for the descriptors will be managed by the owner,
+@@ -426,6 +447,8 @@ static void cscfg_unload_owned_cfgs_feats(void *load_owner)
+ struct cscfg_feature_desc *feat_desc, *feat_tmp;
+ struct cscfg_registered_csdev *csdev_item;
+
++ lockdep_assert_held(&cscfg_mutex);
++
+ /* remove from each csdev instance feature and config lists */
+ list_for_each_entry(csdev_item, &cscfg_mgr->csdev_desc_list, item) {
+ /*
+@@ -439,7 +462,6 @@ static void cscfg_unload_owned_cfgs_feats(void *load_owner)
+ /* remove from the config descriptor lists */
+ list_for_each_entry_safe(config_desc, cfg_tmp, &cscfg_mgr->config_desc_list, item) {
+ if (config_desc->load_owner == load_owner) {
+- cscfg_configfs_del_config(config_desc);
+ etm_perf_del_symlink_cscfg(config_desc);
+ list_del(&config_desc->item);
+ }
+@@ -448,12 +470,90 @@ static void cscfg_unload_owned_cfgs_feats(void *load_owner)
+ /* remove from the feature descriptor lists */
+ list_for_each_entry_safe(feat_desc, feat_tmp, &cscfg_mgr->feat_desc_list, item) {
+ if (feat_desc->load_owner == load_owner) {
+- cscfg_configfs_del_feature(feat_desc);
+ list_del(&feat_desc->item);
+ }
+ }
+ }
+
++/*
++ * load the features and configs to the lists - called with list mutex held
++ */
++static int cscfg_load_owned_cfgs_feats(struct cscfg_config_desc **config_descs,
++ struct cscfg_feature_desc **feat_descs,
++ struct cscfg_load_owner_info *owner_info)
++{
++ int i, err;
++
++ lockdep_assert_held(&cscfg_mutex);
++
++ /* load features first */
++ if (feat_descs) {
++ for (i = 0; feat_descs[i]; i++) {
++ err = cscfg_load_feat(feat_descs[i]);
++ if (err) {
++ pr_err("coresight-syscfg: Failed to load feature %s\n",
++ feat_descs[i]->name);
++ return err;
++ }
++ feat_descs[i]->load_owner = owner_info;
++ }
++ }
++
++ /* next any configurations to check feature dependencies */
++ if (config_descs) {
++ for (i = 0; config_descs[i]; i++) {
++ err = cscfg_load_config(config_descs[i]);
++ if (err) {
++ pr_err("coresight-syscfg: Failed to load configuration %s\n",
++ config_descs[i]->name);
++ return err;
++ }
++ config_descs[i]->load_owner = owner_info;
++ config_descs[i]->available = false;
++ }
++ }
++ return 0;
++}
++
++/* set configurations as available to activate at the end of the load process */
++static void cscfg_set_configs_available(struct cscfg_config_desc **config_descs)
++{
++ int i;
++
++ lockdep_assert_held(&cscfg_mutex);
++
++ if (config_descs) {
++ for (i = 0; config_descs[i]; i++)
++ config_descs[i]->available = true;
++ }
++}
++
++/*
++ * Create and register each of the configurations and features with configfs.
++ * Called without mutex being held.
++ */
++static int cscfg_fs_register_cfgs_feats(struct cscfg_config_desc **config_descs,
++ struct cscfg_feature_desc **feat_descs)
++{
++ int i, err;
++
++ if (feat_descs) {
++ for (i = 0; feat_descs[i]; i++) {
++ err = cscfg_configfs_add_feature(feat_descs[i]);
++ if (err)
++ return err;
++ }
++ }
++ if (config_descs) {
++ for (i = 0; config_descs[i]; i++) {
++ err = cscfg_configfs_add_config(config_descs[i]);
++ if (err)
++ return err;
++ }
++ }
++ return 0;
++}
++
+ /**
+ * cscfg_load_config_sets - API function to load feature and config sets.
+ *
+@@ -476,57 +576,63 @@ int cscfg_load_config_sets(struct cscfg_config_desc **config_descs,
+ struct cscfg_feature_desc **feat_descs,
+ struct cscfg_load_owner_info *owner_info)
+ {
+- int err = 0, i = 0;
++ int err = 0;
+
+ mutex_lock(&cscfg_mutex);
+-
+- /* load features first */
+- if (feat_descs) {
+- while (feat_descs[i]) {
+- err = cscfg_load_feat(feat_descs[i]);
+- if (!err)
+- err = cscfg_configfs_add_feature(feat_descs[i]);
+- if (err) {
+- pr_err("coresight-syscfg: Failed to load feature %s\n",
+- feat_descs[i]->name);
+- cscfg_unload_owned_cfgs_feats(owner_info);
+- goto exit_unlock;
+- }
+- feat_descs[i]->load_owner = owner_info;
+- i++;
+- }
++ if (cscfg_mgr->load_state != CSCFG_NONE) {
++ mutex_unlock(&cscfg_mutex);
++ return -EBUSY;
+ }
++ cscfg_mgr->load_state = CSCFG_LOAD;
+
+- /* next any configurations to check feature dependencies */
+- i = 0;
+- if (config_descs) {
+- while (config_descs[i]) {
+- err = cscfg_load_config(config_descs[i]);
+- if (!err)
+- err = cscfg_configfs_add_config(config_descs[i]);
+- if (err) {
+- pr_err("coresight-syscfg: Failed to load configuration %s\n",
+- config_descs[i]->name);
+- cscfg_unload_owned_cfgs_feats(owner_info);
+- goto exit_unlock;
+- }
+- config_descs[i]->load_owner = owner_info;
+- i++;
+- }
+- }
++ /* first load and add to the lists */
++ err = cscfg_load_owned_cfgs_feats(config_descs, feat_descs, owner_info);
++ if (err)
++ goto err_clean_load;
+
+ /* add the load owner to the load order list */
+ list_add_tail(&owner_info->item, &cscfg_mgr->load_order_list);
+ if (!list_is_singular(&cscfg_mgr->load_order_list)) {
+ /* lock previous item in load order list */
+ err = cscfg_owner_get(list_prev_entry(owner_info, item));
+- if (err) {
+- cscfg_unload_owned_cfgs_feats(owner_info);
+- list_del(&owner_info->item);
+- }
++ if (err)
++ goto err_clean_owner_list;
+ }
+
++ /*
++ * make visible to configfs - configfs manipulation must occur outside
++ * the list mutex lock to avoid circular lockdep issues with configfs
++ * built in mutexes and semaphores. This is safe as it is not possible
++ * to start a new load/unload operation till the current one is done.
++ */
++ mutex_unlock(&cscfg_mutex);
++
++ /* create the configfs elements */
++ err = cscfg_fs_register_cfgs_feats(config_descs, feat_descs);
++ mutex_lock(&cscfg_mutex);
++
++ if (err)
++ goto err_clean_cfs;
++
++ /* mark any new configs as available for activation */
++ cscfg_set_configs_available(config_descs);
++ goto exit_unlock;
++
++err_clean_cfs:
++ /* cleanup after error registering with configfs */
++ cscfg_fs_unregister_cfgs_feats(owner_info);
++
++ if (!list_is_singular(&cscfg_mgr->load_order_list))
++ cscfg_owner_put(list_prev_entry(owner_info, item));
++
++err_clean_owner_list:
++ list_del(&owner_info->item);
++
++err_clean_load:
++ cscfg_unload_owned_cfgs_feats(owner_info);
++
+ exit_unlock:
++ cscfg_mgr->load_state = CSCFG_NONE;
+ mutex_unlock(&cscfg_mutex);
+ return err;
+ }
+@@ -543,6 +649,9 @@ EXPORT_SYMBOL_GPL(cscfg_load_config_sets);
+ * 1) no configurations are active.
+ * 2) the set being unloaded was the last to be loaded to maintain dependencies.
+ *
++ * Once the unload operation commences, we disallow any configuration being
++ * made active until it is complete.
++ *
+ * @owner_info: Information on owner for set being unloaded.
+ */
+ int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
+@@ -551,6 +660,13 @@ int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
+ struct cscfg_load_owner_info *load_list_item = NULL;
+
+ mutex_lock(&cscfg_mutex);
++ if (cscfg_mgr->load_state != CSCFG_NONE) {
++ mutex_unlock(&cscfg_mutex);
++ return -EBUSY;
++ }
++
++ /* unload op in progress also prevents activation of any config */
++ cscfg_mgr->load_state = CSCFG_UNLOAD;
+
+ /* cannot unload if anything is active */
+ if (atomic_read(&cscfg_mgr->sys_active_cnt)) {
+@@ -571,7 +687,12 @@ int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
+ goto exit_unlock;
+ }
+
+- /* unload all belonging to load_owner */
++ /* remove from configfs - again outside the scope of the list mutex */
++ mutex_unlock(&cscfg_mutex);
++ cscfg_fs_unregister_cfgs_feats(owner_info);
++ mutex_lock(&cscfg_mutex);
++
++ /* unload everything from lists belonging to load_owner */
+ cscfg_unload_owned_cfgs_feats(owner_info);
+
+ /* remove from load order list */
+@@ -582,6 +703,7 @@ int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
+ list_del(&owner_info->item);
+
+ exit_unlock:
++ cscfg_mgr->load_state = CSCFG_NONE;
+ mutex_unlock(&cscfg_mutex);
+ return err;
+ }
+@@ -759,8 +881,15 @@ static int _cscfg_activate_config(unsigned long cfg_hash)
+ struct cscfg_config_desc *config_desc;
+ int err = -EINVAL;
+
++ if (cscfg_mgr->load_state == CSCFG_UNLOAD)
++ return -EBUSY;
++
+ list_for_each_entry(config_desc, &cscfg_mgr->config_desc_list, item) {
+ if ((unsigned long)config_desc->event_ea->var == cfg_hash) {
++ /* if we happen upon a partly loaded config, can't use it */
++ if (config_desc->available == false)
++ return -EBUSY;
++
+ /* must ensure that config cannot be unloaded in use */
+ err = cscfg_owner_get(config_desc->load_owner);
+ if (err)
+@@ -1022,8 +1151,10 @@ struct device *cscfg_device(void)
+ /* Must have a release function or the kernel will complain on module unload */
+ static void cscfg_dev_release(struct device *dev)
+ {
++ mutex_lock(&cscfg_mutex);
+ kfree(cscfg_mgr);
+ cscfg_mgr = NULL;
++ mutex_unlock(&cscfg_mutex);
+ }
+
+ /* a device is needed to "own" some kernel elements such as sysfs entries. */
+@@ -1042,6 +1173,14 @@ static int cscfg_create_device(void)
+ if (!cscfg_mgr)
+ goto create_dev_exit_unlock;
+
++ /* initialise the cscfg_mgr structure */
++ INIT_LIST_HEAD(&cscfg_mgr->csdev_desc_list);
++ INIT_LIST_HEAD(&cscfg_mgr->feat_desc_list);
++ INIT_LIST_HEAD(&cscfg_mgr->config_desc_list);
++ INIT_LIST_HEAD(&cscfg_mgr->load_order_list);
++ atomic_set(&cscfg_mgr->sys_active_cnt, 0);
++ cscfg_mgr->load_state = CSCFG_NONE;
++
+ /* setup the device */
+ dev = cscfg_device();
+ dev->release = cscfg_dev_release;
+@@ -1056,17 +1195,73 @@ create_dev_exit_unlock:
+ return err;
+ }
+
+-static void cscfg_clear_device(void)
++/*
++ * Loading and unloading is generally on user discretion.
++ * If exiting due to coresight module unload, we need to unload any configurations that remain,
++ * before we unregister the configfs intrastructure.
++ *
++ * Do this by walking the load_owner list and taking appropriate action, depending on the load
++ * owner type.
++ */
++static void cscfg_unload_cfgs_on_exit(void)
+ {
+- struct cscfg_config_desc *cfg_desc;
++ struct cscfg_load_owner_info *owner_info = NULL;
+
++ /*
++ * grab the mutex - even though we are exiting, some configfs files
++ * may still be live till we dump them, so ensure list data is
++ * protected from a race condition.
++ */
+ mutex_lock(&cscfg_mutex);
+- list_for_each_entry(cfg_desc, &cscfg_mgr->config_desc_list, item) {
+- etm_perf_del_symlink_cscfg(cfg_desc);
++ while (!list_empty(&cscfg_mgr->load_order_list)) {
++
++ /* remove in reverse order of loading */
++ owner_info = list_last_entry(&cscfg_mgr->load_order_list,
++ struct cscfg_load_owner_info, item);
++
++ /* action according to type */
++ switch (owner_info->type) {
++ case CSCFG_OWNER_PRELOAD:
++ /*
++ * preloaded descriptors are statically allocated in
++ * this module - just need to unload dynamic items from
++ * csdev lists, and remove from configfs directories.
++ */
++ pr_info("cscfg: unloading preloaded configurations\n");
++ break;
++
++ case CSCFG_OWNER_MODULE:
++ /*
++ * this is an error - the loadable module must have been unloaded prior
++ * to the coresight module unload. Therefore that module has not
++ * correctly unloaded configs in its own exit code.
++ * Nothing to do other than emit an error string as the static descriptor
++ * references we need to unload will have disappeared with the module.
++ */
++ pr_err("cscfg: ERROR: prior module failed to unload configuration\n");
++ goto list_remove;
++ }
++
++ /* remove from configfs - outside the scope of the list mutex */
++ mutex_unlock(&cscfg_mutex);
++ cscfg_fs_unregister_cfgs_feats(owner_info);
++ mutex_lock(&cscfg_mutex);
++
++ /* Next unload from csdev lists. */
++ cscfg_unload_owned_cfgs_feats(owner_info);
++
++list_remove:
++ /* remove from load order list */
++ list_del(&owner_info->item);
+ }
++ mutex_unlock(&cscfg_mutex);
++}
++
++static void cscfg_clear_device(void)
++{
++ cscfg_unload_cfgs_on_exit();
+ cscfg_configfs_release(cscfg_mgr);
+ device_unregister(cscfg_device());
+- mutex_unlock(&cscfg_mutex);
+ }
+
+ /* Initialise system config management API device */
+@@ -1074,20 +1269,16 @@ int __init cscfg_init(void)
+ {
+ int err = 0;
+
++ /* create the device and init cscfg_mgr */
+ err = cscfg_create_device();
+ if (err)
+ return err;
+
++ /* initialise configfs subsystem */
+ err = cscfg_configfs_init(cscfg_mgr);
+ if (err)
+ goto exit_err;
+
+- INIT_LIST_HEAD(&cscfg_mgr->csdev_desc_list);
+- INIT_LIST_HEAD(&cscfg_mgr->feat_desc_list);
+- INIT_LIST_HEAD(&cscfg_mgr->config_desc_list);
+- INIT_LIST_HEAD(&cscfg_mgr->load_order_list);
+- atomic_set(&cscfg_mgr->sys_active_cnt, 0);
+-
+ /* preload built-in configurations */
+ err = cscfg_preload(THIS_MODULE);
+ if (err)
+diff --git a/drivers/hwtracing/coresight/coresight-syscfg.h b/drivers/hwtracing/coresight/coresight-syscfg.h
+index 9106ffab48337..66e2db890d820 100644
+--- a/drivers/hwtracing/coresight/coresight-syscfg.h
++++ b/drivers/hwtracing/coresight/coresight-syscfg.h
+@@ -12,6 +12,17 @@
+
+ #include "coresight-config.h"
+
++/*
++ * Load operation types.
++ * When loading or unloading, another load operation cannot be run.
++ * When unloading configurations cannot be activated.
++ */
++enum cscfg_load_ops {
++ CSCFG_NONE,
++ CSCFG_LOAD,
++ CSCFG_UNLOAD
++};
++
+ /**
+ * System configuration manager device.
+ *
+@@ -30,6 +41,7 @@
+ * @cfgfs_subsys: configfs subsystem used to manage configurations.
+ * @sysfs_active_config:Active config hash used if CoreSight controlled from sysfs.
+ * @sysfs_active_preset:Active preset index used if CoreSight controlled from sysfs.
++ * @load_state: A multi-stage load/unload operation is in progress.
+ */
+ struct cscfg_manager {
+ struct device dev;
+@@ -41,6 +53,7 @@ struct cscfg_manager {
+ struct configfs_subsystem cfgfs_subsys;
+ u32 sysfs_active_config;
+ int sysfs_active_preset;
++ enum cscfg_load_ops load_state;
+ };
+
+ /* get reference to dev in cscfg_manager */
+diff --git a/drivers/hwtracing/intel_th/msu-sink.c b/drivers/hwtracing/intel_th/msu-sink.c
+index 2c7f5116be126..891b28ea25fe6 100644
+--- a/drivers/hwtracing/intel_th/msu-sink.c
++++ b/drivers/hwtracing/intel_th/msu-sink.c
+@@ -71,6 +71,9 @@ static int msu_sink_alloc_window(void *data, struct sg_table **sgt, size_t size)
+ block = dma_alloc_coherent(priv->dev->parent->parent,
+ PAGE_SIZE, &sg_dma_address(sg_ptr),
+ GFP_KERNEL);
++ if (!block)
++ return -ENOMEM;
++
+ sg_set_buf(sg_ptr, block, PAGE_SIZE);
+ }
+
+diff --git a/drivers/hwtracing/intel_th/msu.c b/drivers/hwtracing/intel_th/msu.c
+index 70a07b4e99673..6c8215a47a601 100644
+--- a/drivers/hwtracing/intel_th/msu.c
++++ b/drivers/hwtracing/intel_th/msu.c
+@@ -1067,6 +1067,16 @@ msc_buffer_set_uc(struct msc *msc) {}
+ static inline void msc_buffer_set_wb(struct msc *msc) {}
+ #endif /* CONFIG_X86 */
+
++static struct page *msc_sg_page(struct scatterlist *sg)
++{
++ void *addr = sg_virt(sg);
++
++ if (is_vmalloc_addr(addr))
++ return vmalloc_to_page(addr);
++
++ return sg_page(sg);
++}
++
+ /**
+ * msc_buffer_win_alloc() - alloc a window for a multiblock mode
+ * @msc: MSC device
+@@ -1137,7 +1147,7 @@ static void __msc_buffer_win_free(struct msc *msc, struct msc_window *win)
+ int i;
+
+ for_each_sg(win->sgt->sgl, sg, win->nr_segs, i) {
+- struct page *page = sg_page(sg);
++ struct page *page = msc_sg_page(sg);
+
+ page->mapping = NULL;
+ dma_free_coherent(msc_dev(win->msc)->parent->parent, PAGE_SIZE,
+@@ -1401,7 +1411,7 @@ found:
+ pgoff -= win->pgoff;
+
+ for_each_sg(win->sgt->sgl, sg, win->nr_segs, blk) {
+- struct page *page = sg_page(sg);
++ struct page *page = msc_sg_page(sg);
+ size_t pgsz = PFN_DOWN(sg->length);
+
+ if (pgoff < pgsz)
+diff --git a/drivers/hwtracing/intel_th/pci.c b/drivers/hwtracing/intel_th/pci.c
+index 7da4f298ed01e..147d338c191e7 100644
+--- a/drivers/hwtracing/intel_th/pci.c
++++ b/drivers/hwtracing/intel_th/pci.c
+@@ -100,8 +100,10 @@ static int intel_th_pci_probe(struct pci_dev *pdev,
+ }
+
+ th = intel_th_alloc(&pdev->dev, drvdata, resource, r);
+- if (IS_ERR(th))
+- return PTR_ERR(th);
++ if (IS_ERR(th)) {
++ err = PTR_ERR(th);
++ goto err_free_irq;
++ }
+
+ th->activate = intel_th_pci_activate;
+ th->deactivate = intel_th_pci_deactivate;
+@@ -109,6 +111,10 @@ static int intel_th_pci_probe(struct pci_dev *pdev,
+ pci_set_master(pdev);
+
+ return 0;
++
++err_free_irq:
++ pci_free_irq_vectors(pdev);
++ return err;
+ }
+
+ static void intel_th_pci_remove(struct pci_dev *pdev)
+@@ -278,6 +284,21 @@ static const struct pci_device_id intel_th_pci_id_table[] = {
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x54a6),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
++ {
++ /* Meteor Lake-P */
++ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7e24),
++ .driver_data = (kernel_ulong_t)&intel_th_2x,
++ },
++ {
++ /* Raptor Lake-S */
++ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7a26),
++ .driver_data = (kernel_ulong_t)&intel_th_2x,
++ },
++ {
++ /* Raptor Lake-S CPU */
++ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xa76f),
++ .driver_data = (kernel_ulong_t)&intel_th_2x,
++ },
+ {
+ /* Alder Lake CPU */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f),
+diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c
+index 630cfa4ddd468..33f5588a50c07 100644
+--- a/drivers/i2c/busses/i2c-cadence.c
++++ b/drivers/i2c/busses/i2c-cadence.c
+@@ -573,8 +573,13 @@ static void cdns_i2c_mrecv(struct cdns_i2c *id)
+ ctrl_reg = cdns_i2c_readreg(CDNS_I2C_CR_OFFSET);
+ ctrl_reg |= CDNS_I2C_CR_RW | CDNS_I2C_CR_CLR_FIFO;
+
++ /*
++ * Receive up to I2C_SMBUS_BLOCK_MAX data bytes, plus one message length
++ * byte, plus one checksum byte if PEC is enabled. p_msg->len will be 2 if
++ * PEC is enabled, otherwise 1.
++ */
+ if (id->p_msg->flags & I2C_M_RECV_LEN)
+- id->recv_count = I2C_SMBUS_BLOCK_MAX + 1;
++ id->recv_count = I2C_SMBUS_BLOCK_MAX + id->p_msg->len;
+
+ id->curr_recv_count = id->recv_count;
+
+@@ -789,6 +794,9 @@ static int cdns_i2c_process_msg(struct cdns_i2c *id, struct i2c_msg *msg,
+ if (id->err_status & CDNS_I2C_IXR_ARB_LOST)
+ return -EAGAIN;
+
++ if (msg->flags & I2C_M_RECV_LEN)
++ msg->len += min_t(unsigned int, msg->buf[0], I2C_SMBUS_BLOCK_MAX);
++
+ return 0;
+ }
+
+diff --git a/drivers/i2c/busses/i2c-mxs.c b/drivers/i2c/busses/i2c-mxs.c
+index 864a3f1bd4e14..68f67d084c63a 100644
+--- a/drivers/i2c/busses/i2c-mxs.c
++++ b/drivers/i2c/busses/i2c-mxs.c
+@@ -799,7 +799,7 @@ static int mxs_i2c_probe(struct platform_device *pdev)
+ if (!i2c)
+ return -ENOMEM;
+
+- i2c->dev_type = (enum mxs_i2c_devtype)of_device_get_match_data(&pdev->dev);
++ i2c->dev_type = (uintptr_t)of_device_get_match_data(&pdev->dev);
+
+ i2c->regs = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(i2c->regs))
+diff --git a/drivers/i2c/busses/i2c-npcm7xx.c b/drivers/i2c/busses/i2c-npcm7xx.c
+index 743ac20a405c2..9f0462fb2504c 100644
+--- a/drivers/i2c/busses/i2c-npcm7xx.c
++++ b/drivers/i2c/busses/i2c-npcm7xx.c
+@@ -123,11 +123,11 @@ enum i2c_addr {
+ * Since the addr regs are sprinkled all over the address space,
+ * use this array to get the address or each register.
+ */
+-#define I2C_NUM_OWN_ADDR 10
++#define I2C_NUM_OWN_ADDR 2
++#define I2C_NUM_OWN_ADDR_SUPPORTED 2
++
+ static const int npcm_i2caddr[I2C_NUM_OWN_ADDR] = {
+- NPCM_I2CADDR1, NPCM_I2CADDR2, NPCM_I2CADDR3, NPCM_I2CADDR4,
+- NPCM_I2CADDR5, NPCM_I2CADDR6, NPCM_I2CADDR7, NPCM_I2CADDR8,
+- NPCM_I2CADDR9, NPCM_I2CADDR10,
++ NPCM_I2CADDR1, NPCM_I2CADDR2,
+ };
+ #endif
+
+@@ -391,14 +391,10 @@ static void npcm_i2c_disable(struct npcm_i2c *bus)
+ #if IS_ENABLED(CONFIG_I2C_SLAVE)
+ int i;
+
+- /* select bank 0 for I2C addresses */
+- npcm_i2c_select_bank(bus, I2C_BANK_0);
+-
+ /* Slave addresses removal */
+- for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR; i++)
++ for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR_SUPPORTED; i++)
+ iowrite8(0, bus->reg + npcm_i2caddr[i]);
+
+- npcm_i2c_select_bank(bus, I2C_BANK_1);
+ #endif
+ /* Disable module */
+ i2cctl2 = ioread8(bus->reg + NPCM_I2CCTL2);
+@@ -603,8 +599,7 @@ static int npcm_i2c_slave_enable(struct npcm_i2c *bus, enum i2c_addr addr_type,
+ i2cctl1 &= ~NPCM_I2CCTL1_GCMEN;
+ iowrite8(i2cctl1, bus->reg + NPCM_I2CCTL1);
+ return 0;
+- }
+- if (addr_type == I2C_ARP_ADDR) {
++ } else if (addr_type == I2C_ARP_ADDR) {
+ i2cctl3 = ioread8(bus->reg + NPCM_I2CCTL3);
+ if (enable)
+ i2cctl3 |= I2CCTL3_ARPMEN;
+@@ -613,16 +608,16 @@ static int npcm_i2c_slave_enable(struct npcm_i2c *bus, enum i2c_addr addr_type,
+ iowrite8(i2cctl3, bus->reg + NPCM_I2CCTL3);
+ return 0;
+ }
++ if (addr_type > I2C_SLAVE_ADDR2 && addr_type <= I2C_SLAVE_ADDR10)
++ dev_err(bus->dev, "try to enable more than 2 SA not supported\n");
++
+ if (addr_type >= I2C_ARP_ADDR)
+ return -EFAULT;
+- /* select bank 0 for address 3 to 10 */
+- if (addr_type > I2C_SLAVE_ADDR2)
+- npcm_i2c_select_bank(bus, I2C_BANK_0);
++
+ /* Set and enable the address */
+ iowrite8(sa_reg, bus->reg + npcm_i2caddr[addr_type]);
+ npcm_i2c_slave_int_enable(bus, enable);
+- if (addr_type > I2C_SLAVE_ADDR2)
+- npcm_i2c_select_bank(bus, I2C_BANK_1);
++
+ return 0;
+ }
+ #endif
+@@ -843,15 +838,11 @@ static u8 npcm_i2c_get_slave_addr(struct npcm_i2c *bus, enum i2c_addr addr_type)
+ {
+ u8 slave_add;
+
+- /* select bank 0 for address 3 to 10 */
+- if (addr_type > I2C_SLAVE_ADDR2)
+- npcm_i2c_select_bank(bus, I2C_BANK_0);
++ if (addr_type > I2C_SLAVE_ADDR2 && addr_type <= I2C_SLAVE_ADDR10)
++ dev_err(bus->dev, "get slave: try to use more than 2 SA not supported\n");
+
+ slave_add = ioread8(bus->reg + npcm_i2caddr[(int)addr_type]);
+
+- if (addr_type > I2C_SLAVE_ADDR2)
+- npcm_i2c_select_bank(bus, I2C_BANK_1);
+-
+ return slave_add;
+ }
+
+@@ -861,12 +852,12 @@ static int npcm_i2c_remove_slave_addr(struct npcm_i2c *bus, u8 slave_add)
+
+ /* Set the enable bit */
+ slave_add |= 0x80;
+- npcm_i2c_select_bank(bus, I2C_BANK_0);
+- for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR; i++) {
++
++ for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR_SUPPORTED; i++) {
+ if (ioread8(bus->reg + npcm_i2caddr[i]) == slave_add)
+ iowrite8(0, bus->reg + npcm_i2caddr[i]);
+ }
+- npcm_i2c_select_bank(bus, I2C_BANK_1);
++
+ return 0;
+ }
+
+@@ -921,11 +912,15 @@ static int npcm_i2c_slave_get_wr_buf(struct npcm_i2c *bus)
+ for (i = 0; i < I2C_HW_FIFO_SIZE; i++) {
+ if (bus->slv_wr_size >= I2C_HW_FIFO_SIZE)
+ break;
+- i2c_slave_event(bus->slave, I2C_SLAVE_READ_REQUESTED, &value);
++ if (bus->state == I2C_SLAVE_MATCH) {
++ i2c_slave_event(bus->slave, I2C_SLAVE_READ_REQUESTED, &value);
++ bus->state = I2C_OPER_STARTED;
++ } else {
++ i2c_slave_event(bus->slave, I2C_SLAVE_READ_PROCESSED, &value);
++ }
+ ind = (bus->slv_wr_ind + bus->slv_wr_size) % I2C_HW_FIFO_SIZE;
+ bus->slv_wr_buf[ind] = value;
+ bus->slv_wr_size++;
+- i2c_slave_event(bus->slave, I2C_SLAVE_READ_PROCESSED, &value);
+ }
+ return I2C_HW_FIFO_SIZE - ret;
+ }
+@@ -973,7 +968,6 @@ static void npcm_i2c_slave_xmit(struct npcm_i2c *bus, u16 nwrite,
+ if (nwrite == 0)
+ return;
+
+- bus->state = I2C_OPER_STARTED;
+ bus->operation = I2C_WRITE_OPER;
+
+ /* get the next buffer */
+diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
+index 5b920f0fc7dd7..a4eef8de054ce 100644
+--- a/drivers/i2c/busses/i2c-qcom-geni.c
++++ b/drivers/i2c/busses/i2c-qcom-geni.c
+@@ -688,7 +688,7 @@ static int geni_i2c_xfer(struct i2c_adapter *adap,
+ pm_runtime_put_autosuspend(gi2c->se.dev);
+ gi2c->cur = NULL;
+ gi2c->err = 0;
+- return num;
++ return ret;
+ }
+
+ static u32 geni_i2c_func(struct i2c_adapter *adap)
+diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
+index d43db2c3876e7..19a317fdcf5bf 100644
+--- a/drivers/i2c/i2c-core-base.c
++++ b/drivers/i2c/i2c-core-base.c
+@@ -2467,8 +2467,9 @@ void i2c_put_adapter(struct i2c_adapter *adap)
+ if (!adap)
+ return;
+
+- put_device(&adap->dev);
+ module_put(adap->owner);
++ /* Should be last, otherwise we risk use-after-free with 'adap' */
++ put_device(&adap->dev);
+ }
+ EXPORT_SYMBOL(i2c_put_adapter);
+
+diff --git a/drivers/i2c/muxes/i2c-mux-gpmux.c b/drivers/i2c/muxes/i2c-mux-gpmux.c
+index d3acd8d66c323..33024acaac02b 100644
+--- a/drivers/i2c/muxes/i2c-mux-gpmux.c
++++ b/drivers/i2c/muxes/i2c-mux-gpmux.c
+@@ -134,6 +134,7 @@ static int i2c_mux_probe(struct platform_device *pdev)
+ return 0;
+
+ err_children:
++ of_node_put(child);
+ i2c_mux_del_adapters(muxc);
+ err_parent:
+ i2c_put_adapter(parent);
+diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
+index 47b68c6071bef..9515a3146dc97 100644
+--- a/drivers/idle/intel_idle.c
++++ b/drivers/idle/intel_idle.c
+@@ -812,15 +812,105 @@ static struct cpuidle_state icx_cstates[] __initdata = {
+ };
+
+ /*
+- * On Sapphire Rapids Xeon C1 has to be disabled if C1E is enabled, and vice
+- * versa. On SPR C1E is enabled only if "C1E promotion" bit is set in
+- * MSR_IA32_POWER_CTL. But in this case there effectively no C1, because C1
+- * requests are promoted to C1E. If the "C1E promotion" bit is cleared, then
+- * both C1 and C1E requests end up with C1, so there is effectively no C1E.
++ * On AlderLake C1 has to be disabled if C1E is enabled, and vice versa.
++ * C1E is enabled only if "C1E promotion" bit is set in MSR_IA32_POWER_CTL.
++ * But in this case there is effectively no C1, because C1 requests are
++ * promoted to C1E. If the "C1E promotion" bit is cleared, then both C1
++ * and C1E requests end up with C1, so there is effectively no C1E.
+ *
+- * By default we enable C1 and disable C1E by marking it with
++ * By default we enable C1E and disable C1 by marking it with
+ * 'CPUIDLE_FLAG_UNUSABLE'.
+ */
++static struct cpuidle_state adl_cstates[] __initdata = {
++ {
++ .name = "C1",
++ .desc = "MWAIT 0x00",
++ .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_UNUSABLE,
++ .exit_latency = 1,
++ .target_residency = 1,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C1E",
++ .desc = "MWAIT 0x01",
++ .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
++ .exit_latency = 2,
++ .target_residency = 4,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C6",
++ .desc = "MWAIT 0x20",
++ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .exit_latency = 220,
++ .target_residency = 600,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C8",
++ .desc = "MWAIT 0x40",
++ .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .exit_latency = 280,
++ .target_residency = 800,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C10",
++ .desc = "MWAIT 0x60",
++ .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .exit_latency = 680,
++ .target_residency = 2000,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .enter = NULL }
++};
++
++static struct cpuidle_state adl_l_cstates[] __initdata = {
++ {
++ .name = "C1",
++ .desc = "MWAIT 0x00",
++ .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_UNUSABLE,
++ .exit_latency = 1,
++ .target_residency = 1,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C1E",
++ .desc = "MWAIT 0x01",
++ .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
++ .exit_latency = 2,
++ .target_residency = 4,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C6",
++ .desc = "MWAIT 0x20",
++ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .exit_latency = 170,
++ .target_residency = 500,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C8",
++ .desc = "MWAIT 0x40",
++ .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .exit_latency = 200,
++ .target_residency = 600,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .name = "C10",
++ .desc = "MWAIT 0x60",
++ .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
++ .exit_latency = 230,
++ .target_residency = 700,
++ .enter = &intel_idle,
++ .enter_s2idle = intel_idle_s2idle, },
++ {
++ .enter = NULL }
++};
++
+ static struct cpuidle_state spr_cstates[] __initdata = {
+ {
+ .name = "C1",
+@@ -833,8 +923,7 @@ static struct cpuidle_state spr_cstates[] __initdata = {
+ {
+ .name = "C1E",
+ .desc = "MWAIT 0x01",
+- .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE |
+- CPUIDLE_FLAG_UNUSABLE,
++ .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
+ .exit_latency = 2,
+ .target_residency = 4,
+ .enter = &intel_idle,
+@@ -1194,6 +1283,14 @@ static const struct idle_cpu idle_cpu_icx __initconst = {
+ .use_acpi = true,
+ };
+
++static const struct idle_cpu idle_cpu_adl __initconst = {
++ .state_table = adl_cstates,
++};
++
++static const struct idle_cpu idle_cpu_adl_l __initconst = {
++ .state_table = adl_l_cstates,
++};
++
+ static const struct idle_cpu idle_cpu_spr __initconst = {
+ .state_table = spr_cstates,
+ .disable_promotion_to_c1e = true,
+@@ -1262,6 +1359,8 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
+ X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
+ X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &idle_cpu_icx),
+ X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &idle_cpu_icx),
++ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &idle_cpu_adl),
++ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &idle_cpu_adl_l),
+ X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &idle_cpu_spr),
+ X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &idle_cpu_knl),
+ X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &idle_cpu_knl),
+@@ -1620,6 +1719,25 @@ static void __init skx_idle_state_table_update(void)
+ }
+ }
+
++/**
++ * adl_idle_state_table_update - Adjust AlderLake idle states table.
++ */
++static void __init adl_idle_state_table_update(void)
++{
++ /* Check if user prefers C1 over C1E. */
++ if (preferred_states_mask & BIT(1) && !(preferred_states_mask & BIT(2))) {
++ cpuidle_state_table[0].flags &= ~CPUIDLE_FLAG_UNUSABLE;
++ cpuidle_state_table[1].flags |= CPUIDLE_FLAG_UNUSABLE;
++
++ /* Disable C1E by clearing the "C1E promotion" bit. */
++ c1e_promotion = C1E_PROMOTION_DISABLE;
++ return;
++ }
++
++ /* Make sure C1E is enabled by default */
++ c1e_promotion = C1E_PROMOTION_ENABLE;
++}
++
+ /**
+ * spr_idle_state_table_update - Adjust Sapphire Rapids idle states table.
+ */
+@@ -1627,17 +1745,6 @@ static void __init spr_idle_state_table_update(void)
+ {
+ unsigned long long msr;
+
+- /* Check if user prefers C1E over C1. */
+- if ((preferred_states_mask & BIT(2)) &&
+- !(preferred_states_mask & BIT(1))) {
+- /* Disable C1 and enable C1E. */
+- spr_cstates[0].flags |= CPUIDLE_FLAG_UNUSABLE;
+- spr_cstates[1].flags &= ~CPUIDLE_FLAG_UNUSABLE;
+-
+- /* Enable C1E using the "C1E promotion" bit. */
+- c1e_promotion = C1E_PROMOTION_ENABLE;
+- }
+-
+ /*
+ * By default, the C6 state assumes the worst-case scenario of package
+ * C6. However, if PC6 is disabled, we update the numbers to match
+@@ -1689,6 +1796,10 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
+ case INTEL_FAM6_SAPPHIRERAPIDS_X:
+ spr_idle_state_table_update();
+ break;
++ case INTEL_FAM6_ALDERLAKE:
++ case INTEL_FAM6_ALDERLAKE_L:
++ adl_idle_state_table_update();
++ break;
+ }
+
+ for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
+diff --git a/drivers/iio/accel/adxl313_core.c b/drivers/iio/accel/adxl313_core.c
+index 9e4193e64765f..afeef779e1d08 100644
+--- a/drivers/iio/accel/adxl313_core.c
++++ b/drivers/iio/accel/adxl313_core.c
+@@ -46,7 +46,7 @@ EXPORT_SYMBOL_NS_GPL(adxl313_writable_regs_table, IIO_ADXL313);
+ struct adxl313_data {
+ struct regmap *regmap;
+ struct mutex lock; /* lock to protect transf_buf */
+- __le16 transf_buf ____cacheline_aligned;
++ __le16 transf_buf __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const int adxl313_odr_freqs[][2] = {
+diff --git a/drivers/iio/accel/adxl355_core.c b/drivers/iio/accel/adxl355_core.c
+index e9c10c8c32f09..2dfe780d61444 100644
+--- a/drivers/iio/accel/adxl355_core.c
++++ b/drivers/iio/accel/adxl355_core.c
+@@ -177,7 +177,7 @@ struct adxl355_data {
+ u8 buf[14];
+ s64 ts;
+ } buffer;
+- } ____cacheline_aligned;
++ } __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int adxl355_set_op_mode(struct adxl355_data *data,
+diff --git a/drivers/iio/accel/adxl367.c b/drivers/iio/accel/adxl367.c
+index 62960134ea195..d680bec05efc2 100644
+--- a/drivers/iio/accel/adxl367.c
++++ b/drivers/iio/accel/adxl367.c
+@@ -179,7 +179,7 @@ struct adxl367_state {
+ unsigned int fifo_set_size;
+ unsigned int fifo_watermark;
+
+- __be16 fifo_buf[ADXL367_FIFO_SIZE] ____cacheline_aligned;
++ __be16 fifo_buf[ADXL367_FIFO_SIZE] __aligned(IIO_DMA_MINALIGN);
+ __be16 sample_buf;
+ u8 act_threshold_buf[2];
+ u8 inact_time_buf[2];
+diff --git a/drivers/iio/accel/adxl367_spi.c b/drivers/iio/accel/adxl367_spi.c
+index 26dfc821ebbe0..118c894015a57 100644
+--- a/drivers/iio/accel/adxl367_spi.c
++++ b/drivers/iio/accel/adxl367_spi.c
+@@ -9,6 +9,8 @@
+ #include <linux/regmap.h>
+ #include <linux/spi/spi.h>
+
++#include <linux/iio/iio.h>
++
+ #include "adxl367.h"
+
+ #define ADXL367_SPI_WRITE_COMMAND 0x0A
+@@ -28,10 +30,10 @@ struct adxl367_spi_state {
+ struct spi_transfer fifo_xfer[2];
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
+- * transfer buffers to live in their own cache lines.
++ * DMA (thus cache coherency maintenance) may require the
++ * transfer buffers live in their own cache lines.
+ */
+- u8 reg_write_tx_buf[1] ____cacheline_aligned;
++ u8 reg_write_tx_buf[1] __aligned(IIO_DMA_MINALIGN);
+ u8 reg_read_tx_buf[2];
+ u8 fifo_tx_buf[1];
+ };
+diff --git a/drivers/iio/accel/bma220_spi.c b/drivers/iio/accel/bma220_spi.c
+index 74024d7ce5ac2..b6d9ab8e2054e 100644
+--- a/drivers/iio/accel/bma220_spi.c
++++ b/drivers/iio/accel/bma220_spi.c
+@@ -67,7 +67,7 @@ struct bma220_data {
+ /* Ensure timestamp is naturally aligned. */
+ s64 timestamp __aligned(8);
+ } scan;
+- u8 tx_buf[2] ____cacheline_aligned;
++ u8 tx_buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct iio_chan_spec bma220_channels[] = {
+diff --git a/drivers/iio/accel/bma400.h b/drivers/iio/accel/bma400.h
+index c4c8d74155c2a..1c8c47a9a317c 100644
+--- a/drivers/iio/accel/bma400.h
++++ b/drivers/iio/accel/bma400.h
+@@ -83,8 +83,27 @@
+ #define BMA400_ACC_ODR_MIN_WHOLE_HZ 25
+ #define BMA400_ACC_ODR_MIN_HZ 12
+
+-#define BMA400_SCALE_MIN 38357
+-#define BMA400_SCALE_MAX 306864
++/*
++ * BMA400_SCALE_MIN macro value represents m/s^2 for 1 LSB before
++ * converting to micro values for +-2g range.
++ *
++ * For +-2g - 1 LSB = 0.976562 milli g = 0.009576 m/s^2
++ * For +-4g - 1 LSB = 1.953125 milli g = 0.019153 m/s^2
++ * For +-16g - 1 LSB = 7.8125 milli g = 0.076614 m/s^2
++ *
++ * The raw value which is used to select the different ranges is determined
++ * by the first bit set position from the scale value, so BMA400_SCALE_MIN
++ * should be odd.
++ *
++ * Scale values for +-2g, +-4g, +-8g and +-16g are populated into bma400_scales
++ * array by left shifting BMA400_SCALE_MIN.
++ * e.g.:
++ * To select +-2g = 9577 << 0 = raw value to write is 0.
++ * To select +-8g = 9577 << 2 = raw value to write is 2.
++ * To select +-16g = 9577 << 3 = raw value to write is 3.
++ */
++#define BMA400_SCALE_MIN 9577
++#define BMA400_SCALE_MAX 76617
+
+ #define BMA400_NUM_REGULATORS 2
+ #define BMA400_VDD_REGULATOR 0
+@@ -94,6 +113,4 @@ extern const struct regmap_config bma400_regmap_config;
+
+ int bma400_probe(struct device *dev, struct regmap *regmap, const char *name);
+
+-void bma400_remove(struct device *dev);
+-
+ #endif
+diff --git a/drivers/iio/accel/bma400_core.c b/drivers/iio/accel/bma400_core.c
+index 043002fe6f633..07674d89d9782 100644
+--- a/drivers/iio/accel/bma400_core.c
++++ b/drivers/iio/accel/bma400_core.c
+@@ -13,14 +13,14 @@
+
+ #include <linux/bitops.h>
+ #include <linux/device.h>
+-#include <linux/iio/iio.h>
+-#include <linux/iio/sysfs.h>
+ #include <linux/kernel.h>
+ #include <linux/module.h>
+ #include <linux/mutex.h>
+ #include <linux/regmap.h>
+ #include <linux/regulator/consumer.h>
+
++#include <linux/iio/iio.h>
++
+ #include "bma400.h"
+
+ /*
+@@ -560,6 +560,26 @@ static void bma400_init_tables(void)
+ }
+ }
+
++static void bma400_regulators_disable(void *data_ptr)
++{
++ struct bma400_data *data = data_ptr;
++
++ regulator_bulk_disable(ARRAY_SIZE(data->regulators), data->regulators);
++}
++
++static void bma400_power_disable(void *data_ptr)
++{
++ struct bma400_data *data = data_ptr;
++ int ret;
++
++ mutex_lock(&data->mutex);
++ ret = bma400_set_power_mode(data, POWER_MODE_SLEEP);
++ mutex_unlock(&data->mutex);
++ if (ret)
++ dev_warn(data->dev, "Failed to put device into sleep mode (%pe)\n",
++ ERR_PTR(ret));
++}
++
+ static int bma400_init(struct bma400_data *data)
+ {
+ unsigned int val;
+@@ -569,13 +589,12 @@ static int bma400_init(struct bma400_data *data)
+ ret = regmap_read(data->regmap, BMA400_CHIP_ID_REG, &val);
+ if (ret) {
+ dev_err(data->dev, "Failed to read chip id register\n");
+- goto out;
++ return ret;
+ }
+
+ if (val != BMA400_ID_REG_VAL) {
+ dev_err(data->dev, "Chip ID mismatch\n");
+- ret = -ENODEV;
+- goto out;
++ return -ENODEV;
+ }
+
+ data->regulators[BMA400_VDD_REGULATOR].supply = "vdd";
+@@ -589,27 +608,31 @@ static int bma400_init(struct bma400_data *data)
+ "Failed to get regulators: %d\n",
+ ret);
+
+- goto out;
++ return ret;
+ }
+ ret = regulator_bulk_enable(ARRAY_SIZE(data->regulators),
+ data->regulators);
+ if (ret) {
+ dev_err(data->dev, "Failed to enable regulators: %d\n",
+ ret);
+- goto out;
++ return ret;
+ }
+
++ ret = devm_add_action_or_reset(data->dev, bma400_regulators_disable, data);
++ if (ret)
++ return ret;
++
+ ret = bma400_get_power_mode(data);
+ if (ret) {
+ dev_err(data->dev, "Failed to get the initial power-mode\n");
+- goto err_reg_disable;
++ return ret;
+ }
+
+ if (data->power_mode != POWER_MODE_NORMAL) {
+ ret = bma400_set_power_mode(data, POWER_MODE_NORMAL);
+ if (ret) {
+ dev_err(data->dev, "Failed to wake up the device\n");
+- goto err_reg_disable;
++ return ret;
+ }
+ /*
+ * TODO: The datasheet waits 1500us here in the example, but
+@@ -618,19 +641,23 @@ static int bma400_init(struct bma400_data *data)
+ usleep_range(1500, 2000);
+ }
+
++ ret = devm_add_action_or_reset(data->dev, bma400_power_disable, data);
++ if (ret)
++ return ret;
++
+ bma400_init_tables();
+
+ ret = bma400_get_accel_output_data_rate(data);
+ if (ret)
+- goto err_reg_disable;
++ return ret;
+
+ ret = bma400_get_accel_oversampling_ratio(data);
+ if (ret)
+- goto err_reg_disable;
++ return ret;
+
+ ret = bma400_get_accel_scale(data);
+ if (ret)
+- goto err_reg_disable;
++ return ret;
+
+ /*
+ * Once the interrupt engine is supported we might use the
+@@ -639,12 +666,6 @@ static int bma400_init(struct bma400_data *data)
+ * channel.
+ */
+ return regmap_write(data->regmap, BMA400_ACC_CONFIG2_REG, 0x00);
+-
+-err_reg_disable:
+- regulator_bulk_disable(ARRAY_SIZE(data->regulators),
+- data->regulators);
+-out:
+- return ret;
+ }
+
+ static int bma400_read_raw(struct iio_dev *indio_dev,
+@@ -822,32 +843,10 @@ int bma400_probe(struct device *dev, struct regmap *regmap, const char *name)
+ indio_dev->num_channels = ARRAY_SIZE(bma400_channels);
+ indio_dev->modes = INDIO_DIRECT_MODE;
+
+- dev_set_drvdata(dev, indio_dev);
+-
+- return iio_device_register(indio_dev);
++ return devm_iio_device_register(dev, indio_dev);
+ }
+ EXPORT_SYMBOL_NS(bma400_probe, IIO_BMA400);
+
+-void bma400_remove(struct device *dev)
+-{
+- struct iio_dev *indio_dev = dev_get_drvdata(dev);
+- struct bma400_data *data = iio_priv(indio_dev);
+- int ret;
+-
+- mutex_lock(&data->mutex);
+- ret = bma400_set_power_mode(data, POWER_MODE_SLEEP);
+- mutex_unlock(&data->mutex);
+-
+- if (ret)
+- dev_warn(dev, "Failed to put device into sleep mode (%pe)\n", ERR_PTR(ret));
+-
+- regulator_bulk_disable(ARRAY_SIZE(data->regulators),
+- data->regulators);
+-
+- iio_device_unregister(indio_dev);
+-}
+-EXPORT_SYMBOL_NS(bma400_remove, IIO_BMA400);
+-
+ MODULE_AUTHOR("Dan Robertson <dan@dlrobertson.com>");
+ MODULE_DESCRIPTION("Bosch BMA400 triaxial acceleration sensor core");
+ MODULE_LICENSE("GPL");
+diff --git a/drivers/iio/accel/bma400_i2c.c b/drivers/iio/accel/bma400_i2c.c
+index da104ffd3fe07..4f6e01a3b3a1f 100644
+--- a/drivers/iio/accel/bma400_i2c.c
++++ b/drivers/iio/accel/bma400_i2c.c
+@@ -27,13 +27,6 @@ static int bma400_i2c_probe(struct i2c_client *client,
+ return bma400_probe(&client->dev, regmap, id->name);
+ }
+
+-static int bma400_i2c_remove(struct i2c_client *client)
+-{
+- bma400_remove(&client->dev);
+-
+- return 0;
+-}
+-
+ static const struct i2c_device_id bma400_i2c_ids[] = {
+ { "bma400", 0 },
+ { }
+@@ -52,7 +45,6 @@ static struct i2c_driver bma400_i2c_driver = {
+ .of_match_table = bma400_of_i2c_match,
+ },
+ .probe = bma400_i2c_probe,
+- .remove = bma400_i2c_remove,
+ .id_table = bma400_i2c_ids,
+ };
+
+diff --git a/drivers/iio/accel/bma400_spi.c b/drivers/iio/accel/bma400_spi.c
+index 51f23bdc0ea5f..28e240400a3fe 100644
+--- a/drivers/iio/accel/bma400_spi.c
++++ b/drivers/iio/accel/bma400_spi.c
+@@ -87,11 +87,6 @@ static int bma400_spi_probe(struct spi_device *spi)
+ return bma400_probe(&spi->dev, regmap, id->name);
+ }
+
+-static void bma400_spi_remove(struct spi_device *spi)
+-{
+- bma400_remove(&spi->dev);
+-}
+-
+ static const struct spi_device_id bma400_spi_ids[] = {
+ { "bma400", 0 },
+ { }
+@@ -110,7 +105,6 @@ static struct spi_driver bma400_spi_driver = {
+ .of_match_table = bma400_of_spi_match,
+ },
+ .probe = bma400_spi_probe,
+- .remove = bma400_spi_remove,
+ .id_table = bma400_spi_ids,
+ };
+
+diff --git a/drivers/iio/accel/cros_ec_accel_legacy.c b/drivers/iio/accel/cros_ec_accel_legacy.c
+index b6f3471b62dcf..3b77fded2dc07 100644
+--- a/drivers/iio/accel/cros_ec_accel_legacy.c
++++ b/drivers/iio/accel/cros_ec_accel_legacy.c
+@@ -215,7 +215,7 @@ static int cros_ec_accel_legacy_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
+- cros_ec_sensors_capture, NULL);
++ cros_ec_sensors_capture);
+ if (ret)
+ return ret;
+
+@@ -235,7 +235,7 @@ static int cros_ec_accel_legacy_probe(struct platform_device *pdev)
+ state->sign[CROS_EC_SENSOR_Z] = -1;
+ }
+
+- return devm_iio_device_register(dev, indio_dev);
++ return cros_ec_sensors_core_register(dev, indio_dev, NULL);
+ }
+
+ static struct platform_driver cros_ec_accel_platform_driver = {
+diff --git a/drivers/iio/accel/sca3000.c b/drivers/iio/accel/sca3000.c
+index 83c81072511ee..2eecd2ab72dd6 100644
+--- a/drivers/iio/accel/sca3000.c
++++ b/drivers/iio/accel/sca3000.c
+@@ -167,8 +167,8 @@ struct sca3000_state {
+ int mo_det_use_count;
+ struct mutex lock;
+ /* Can these share a cacheline ? */
+- u8 rx[384] ____cacheline_aligned;
+- u8 tx[6] ____cacheline_aligned;
++ u8 rx[384] __aligned(IIO_DMA_MINALIGN);
++ u8 tx[6] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ /**
+diff --git a/drivers/iio/accel/sca3300.c b/drivers/iio/accel/sca3300.c
+index f7ef8ecfd34a6..39e0c24364aec 100644
+--- a/drivers/iio/accel/sca3300.c
++++ b/drivers/iio/accel/sca3300.c
+@@ -115,7 +115,7 @@ struct sca3300_data {
+ s16 channels[4];
+ s64 ts __aligned(sizeof(s64));
+ } scan;
+- u8 txbuf[4] ____cacheline_aligned;
++ u8 txbuf[4] __aligned(IIO_DMA_MINALIGN);
+ u8 rxbuf[4];
+ };
+
+diff --git a/drivers/iio/adc/ad7266.c b/drivers/iio/adc/ad7266.c
+index c17d9b5fbaf64..53c83e04dde53 100644
+--- a/drivers/iio/adc/ad7266.c
++++ b/drivers/iio/adc/ad7266.c
+@@ -37,7 +37,7 @@ struct ad7266_state {
+ struct gpio_desc *gpios[3];
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * The buffer needs to be large enough to hold two samples (4 bytes) and
+ * the naturally aligned timestamp (8 bytes).
+@@ -45,7 +45,7 @@ struct ad7266_state {
+ struct {
+ __be16 sample[2];
+ s64 timestamp;
+- } data ____cacheline_aligned;
++ } data __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad7266_wakeup(struct ad7266_state *st)
+diff --git a/drivers/iio/adc/ad7280a.c b/drivers/iio/adc/ad7280a.c
+index ec9acbf12b9a5..047a9d0dbcf75 100644
+--- a/drivers/iio/adc/ad7280a.c
++++ b/drivers/iio/adc/ad7280a.c
+@@ -183,7 +183,7 @@ struct ad7280_state {
+ unsigned char cb_mask[AD7280A_MAX_CHAIN];
+ struct mutex lock; /* protect sensor state */
+
+- __be32 tx ____cacheline_aligned;
++ __be32 tx __aligned(IIO_DMA_MINALIGN);
+ __be32 rx;
+ };
+
+diff --git a/drivers/iio/adc/ad7292.c b/drivers/iio/adc/ad7292.c
+index 3271a31afde1c..92c68d467c505 100644
+--- a/drivers/iio/adc/ad7292.c
++++ b/drivers/iio/adc/ad7292.c
+@@ -80,7 +80,7 @@ struct ad7292_state {
+ struct regulator *reg;
+ unsigned short vref_mv;
+
+- __be16 d16 ____cacheline_aligned;
++ __be16 d16 __aligned(IIO_DMA_MINALIGN);
+ u8 d8[2];
+ };
+
+diff --git a/drivers/iio/adc/ad7298.c b/drivers/iio/adc/ad7298.c
+index 3f4e73f7d35a0..c0430f71f592a 100644
+--- a/drivers/iio/adc/ad7298.c
++++ b/drivers/iio/adc/ad7298.c
+@@ -49,7 +49,7 @@ struct ad7298_state {
+ * DMA (thus cache coherency maintenance) requires the
+ * transfer buffers to live in their own cache lines.
+ */
+- __be16 rx_buf[12] ____cacheline_aligned;
++ __be16 rx_buf[12] __aligned(IIO_DMA_MINALIGN);
+ __be16 tx_buf[2];
+ };
+
+diff --git a/drivers/iio/adc/ad7476.c b/drivers/iio/adc/ad7476.c
+index a1e8b32671cf6..94776f6962907 100644
+--- a/drivers/iio/adc/ad7476.c
++++ b/drivers/iio/adc/ad7476.c
+@@ -44,13 +44,12 @@ struct ad7476_state {
+ struct spi_transfer xfer;
+ struct spi_message msg;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * Make the buffer large enough for one 16 bit sample and one 64 bit
+ * aligned 64 bit timestamp.
+ */
+- unsigned char data[ALIGN(2, sizeof(s64)) + sizeof(s64)]
+- ____cacheline_aligned;
++ unsigned char data[ALIGN(2, sizeof(s64)) + sizeof(s64)] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad7476_supported_device_ids {
+diff --git a/drivers/iio/adc/ad7606.h b/drivers/iio/adc/ad7606.h
+index 4f82d7c9acfde..2dc4f599f9df9 100644
+--- a/drivers/iio/adc/ad7606.h
++++ b/drivers/iio/adc/ad7606.h
+@@ -116,11 +116,11 @@ struct ad7606_state {
+ struct completion completion;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * 16 * 16-bit samples + 64-bit timestamp
+ */
+- unsigned short data[20] ____cacheline_aligned;
++ unsigned short data[20] __aligned(IIO_DMA_MINALIGN);
+ __be16 d16[2];
+ };
+
+diff --git a/drivers/iio/adc/ad7766.c b/drivers/iio/adc/ad7766.c
+index 51ee9482e0df9..3079a0872947e 100644
+--- a/drivers/iio/adc/ad7766.c
++++ b/drivers/iio/adc/ad7766.c
+@@ -45,13 +45,12 @@ struct ad7766 {
+ struct spi_message msg;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * Make the buffer large enough for one 24 bit sample and one 64 bit
+ * aligned 64 bit timestamp.
+ */
+- unsigned char data[ALIGN(3, sizeof(s64)) + sizeof(s64)]
+- ____cacheline_aligned;
++ unsigned char data[ALIGN(3, sizeof(s64)) + sizeof(s64)] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ /*
+diff --git a/drivers/iio/adc/ad7768-1.c b/drivers/iio/adc/ad7768-1.c
+index aa42ba759fa1a..60f394da46401 100644
+--- a/drivers/iio/adc/ad7768-1.c
++++ b/drivers/iio/adc/ad7768-1.c
+@@ -163,7 +163,7 @@ struct ad7768_state {
+ struct gpio_desc *gpio_sync_in;
+ const char *labels[ARRAY_SIZE(ad7768_channels)];
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ union {
+@@ -173,7 +173,7 @@ struct ad7768_state {
+ } scan;
+ __be32 d32;
+ u8 d8[2];
+- } data ____cacheline_aligned;
++ } data __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad7768_spi_reg_read(struct ad7768_state *st, unsigned int addr,
+diff --git a/drivers/iio/adc/ad7887.c b/drivers/iio/adc/ad7887.c
+index f64999714a4da..965bdc8aa6961 100644
+--- a/drivers/iio/adc/ad7887.c
++++ b/drivers/iio/adc/ad7887.c
+@@ -66,13 +66,12 @@ struct ad7887_state {
+ unsigned char tx_cmd_buf[4];
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * Buffer needs to be large enough to hold two 16 bit samples and a
+ * 64 bit aligned 64 bit timestamp.
+ */
+- unsigned char data[ALIGN(4, sizeof(s64)) + sizeof(s64)]
+- ____cacheline_aligned;
++ unsigned char data[ALIGN(4, sizeof(s64)) + sizeof(s64)] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad7887_supported_device_ids {
+diff --git a/drivers/iio/adc/ad7923.c b/drivers/iio/adc/ad7923.c
+index 069b561ee7689..edad1f30121dd 100644
+--- a/drivers/iio/adc/ad7923.c
++++ b/drivers/iio/adc/ad7923.c
+@@ -57,12 +57,12 @@ struct ad7923_state {
+ unsigned int settings;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * Ensure rx_buf can be directly used in iio_push_to_buffers_with_timetamp
+ * Length = 8 channels + 4 extra for 8 byte timestamp
+ */
+- __be16 rx_buf[12] ____cacheline_aligned;
++ __be16 rx_buf[12] __aligned(IIO_DMA_MINALIGN);
+ __be16 tx_buf[4];
+ };
+
+diff --git a/drivers/iio/adc/ad7949.c b/drivers/iio/adc/ad7949.c
+index 44bb5fde83de0..ed4c1656ca75d 100644
+--- a/drivers/iio/adc/ad7949.c
++++ b/drivers/iio/adc/ad7949.c
+@@ -86,7 +86,7 @@ struct ad7949_adc_chip {
+ u8 resolution;
+ u16 cfg;
+ unsigned int current_channel;
+- u16 buffer ____cacheline_aligned;
++ u16 buffer __aligned(IIO_DMA_MINALIGN);
+ __be16 buf8b;
+ };
+
+diff --git a/drivers/iio/adc/adi-axi-adc.c b/drivers/iio/adc/adi-axi-adc.c
+index a9e655e69eaa2..8ffabdaf841ea 100644
+--- a/drivers/iio/adc/adi-axi-adc.c
++++ b/drivers/iio/adc/adi-axi-adc.c
+@@ -84,7 +84,8 @@ void *adi_axi_adc_conv_priv(struct adi_axi_adc_conv *conv)
+ {
+ struct adi_axi_adc_client *cl = conv_to_client(conv);
+
+- return (char *)cl + ALIGN(sizeof(struct adi_axi_adc_client), IIO_ALIGN);
++ return (char *)cl + ALIGN(sizeof(struct adi_axi_adc_client),
++ IIO_DMA_MINALIGN);
+ }
+ EXPORT_SYMBOL_GPL(adi_axi_adc_conv_priv);
+
+@@ -169,9 +170,9 @@ static struct adi_axi_adc_conv *adi_axi_adc_conv_register(struct device *dev,
+ struct adi_axi_adc_client *cl;
+ size_t alloc_size;
+
+- alloc_size = ALIGN(sizeof(struct adi_axi_adc_client), IIO_ALIGN);
++ alloc_size = ALIGN(sizeof(struct adi_axi_adc_client), IIO_DMA_MINALIGN);
+ if (sizeof_priv)
+- alloc_size += ALIGN(sizeof_priv, IIO_ALIGN);
++ alloc_size += ALIGN(sizeof_priv, IIO_DMA_MINALIGN);
+
+ cl = kzalloc(alloc_size, GFP_KERNEL);
+ if (!cl)
+diff --git a/drivers/iio/adc/hi8435.c b/drivers/iio/adc/hi8435.c
+index 8eb0140df133a..771fa12bdc026 100644
+--- a/drivers/iio/adc/hi8435.c
++++ b/drivers/iio/adc/hi8435.c
+@@ -49,7 +49,7 @@ struct hi8435_priv {
+
+ unsigned threshold_lo[2]; /* GND-Open and Supply-Open thresholds */
+ unsigned threshold_hi[2]; /* GND-Open and Supply-Open thresholds */
+- u8 reg_buffer[3] ____cacheline_aligned;
++ u8 reg_buffer[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int hi8435_readb(struct hi8435_priv *priv, u8 reg, u8 *val)
+diff --git a/drivers/iio/adc/ltc2496.c b/drivers/iio/adc/ltc2496.c
+index 5a55f79f25749..dfb3bb5997e57 100644
+--- a/drivers/iio/adc/ltc2496.c
++++ b/drivers/iio/adc/ltc2496.c
+@@ -24,10 +24,10 @@ struct ltc2496_driverdata {
+ struct spi_device *spi;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- unsigned char rxbuf[3] ____cacheline_aligned;
++ unsigned char rxbuf[3] __aligned(IIO_DMA_MINALIGN);
+ unsigned char txbuf[3];
+ };
+
+diff --git a/drivers/iio/adc/ltc2497.c b/drivers/iio/adc/ltc2497.c
+index 1adddf5a88a94..f7c786f37ceb1 100644
+--- a/drivers/iio/adc/ltc2497.c
++++ b/drivers/iio/adc/ltc2497.c
+@@ -20,10 +20,10 @@ struct ltc2497_driverdata {
+ struct ltc2497core_driverdata common_ddata;
+ struct i2c_client *client;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- __be32 buf ____cacheline_aligned;
++ __be32 buf __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ltc2497_result_and_measure(struct ltc2497core_driverdata *ddata,
+diff --git a/drivers/iio/adc/max1027.c b/drivers/iio/adc/max1027.c
+index 4daf1d576c4ee..136fcf753837c 100644
+--- a/drivers/iio/adc/max1027.c
++++ b/drivers/iio/adc/max1027.c
+@@ -272,7 +272,7 @@ struct max1027_state {
+ struct mutex lock;
+ struct completion complete;
+
+- u8 reg ____cacheline_aligned;
++ u8 reg __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int max1027_wait_eoc(struct iio_dev *indio_dev)
+@@ -349,8 +349,7 @@ static int max1027_read_single_value(struct iio_dev *indio_dev,
+ if (ret < 0) {
+ dev_err(&indio_dev->dev,
+ "Failed to configure conversion register\n");
+- iio_device_release_direct_mode(indio_dev);
+- return ret;
++ goto release;
+ }
+
+ /*
+@@ -360,11 +359,12 @@ static int max1027_read_single_value(struct iio_dev *indio_dev,
+ */
+ ret = max1027_wait_eoc(indio_dev);
+ if (ret)
+- return ret;
++ goto release;
+
+ /* Read result */
+ ret = spi_read(st->spi, st->buffer, (chan->type == IIO_TEMP) ? 4 : 2);
+
++release:
+ iio_device_release_direct_mode(indio_dev);
+
+ if (ret < 0)
+diff --git a/drivers/iio/adc/max11100.c b/drivers/iio/adc/max11100.c
+index eb1ce6a0315c5..49e38dca8fe2c 100644
+--- a/drivers/iio/adc/max11100.c
++++ b/drivers/iio/adc/max11100.c
+@@ -33,10 +33,10 @@ struct max11100_state {
+ struct spi_device *spi;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- u8 buffer[3] ____cacheline_aligned;
++ u8 buffer[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct iio_chan_spec max11100_channels[] = {
+diff --git a/drivers/iio/adc/max1118.c b/drivers/iio/adc/max1118.c
+index a41bc570be210..75ab57d9aef74 100644
+--- a/drivers/iio/adc/max1118.c
++++ b/drivers/iio/adc/max1118.c
+@@ -42,7 +42,7 @@ struct max1118 {
+ s64 ts __aligned(8);
+ } scan;
+
+- u8 data ____cacheline_aligned;
++ u8 data __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define MAX1118_CHANNEL(ch) \
+diff --git a/drivers/iio/adc/max1241.c b/drivers/iio/adc/max1241.c
+index a5afd84af58b9..a815ad1f6913b 100644
+--- a/drivers/iio/adc/max1241.c
++++ b/drivers/iio/adc/max1241.c
+@@ -26,7 +26,7 @@ struct max1241 {
+ struct regulator *vref;
+ struct gpio_desc *shutdown;
+
+- __be16 data ____cacheline_aligned;
++ __be16 data __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct iio_chan_spec max1241_channels[] = {
+diff --git a/drivers/iio/adc/mcp320x.c b/drivers/iio/adc/mcp320x.c
+index b4c69acb33e34..f3b81798b3c93 100644
+--- a/drivers/iio/adc/mcp320x.c
++++ b/drivers/iio/adc/mcp320x.c
+@@ -92,7 +92,7 @@ struct mcp320x {
+ struct mutex lock;
+ const struct mcp320x_chip_info *chip_info;
+
+- u8 tx_buf ____cacheline_aligned;
++ u8 tx_buf __aligned(IIO_DMA_MINALIGN);
+ u8 rx_buf[4];
+ };
+
+diff --git a/drivers/iio/adc/ti-adc0832.c b/drivers/iio/adc/ti-adc0832.c
+index fb5e72600b968..b11ce555ba3b9 100644
+--- a/drivers/iio/adc/ti-adc0832.c
++++ b/drivers/iio/adc/ti-adc0832.c
+@@ -36,7 +36,7 @@ struct adc0832 {
+ */
+ u8 data[24] __aligned(8);
+
+- u8 tx_buf[2] ____cacheline_aligned;
++ u8 tx_buf[2] __aligned(IIO_DMA_MINALIGN);
+ u8 rx_buf[2];
+ };
+
+diff --git a/drivers/iio/adc/ti-adc084s021.c b/drivers/iio/adc/ti-adc084s021.c
+index c9b5d9aec3dc4..1f6e53832e062 100644
+--- a/drivers/iio/adc/ti-adc084s021.c
++++ b/drivers/iio/adc/ti-adc084s021.c
+@@ -32,10 +32,10 @@ struct adc084s021 {
+ s64 ts __aligned(8);
+ } scan;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache line.
+ */
+- u16 tx_buf[4] ____cacheline_aligned;
++ u16 tx_buf[4] __aligned(IIO_DMA_MINALIGN);
+ __be16 rx_buf[5]; /* First 16-bits are trash */
+ };
+
+diff --git a/drivers/iio/adc/ti-adc108s102.c b/drivers/iio/adc/ti-adc108s102.c
+index c8e48881c37f9..c82a161630e1d 100644
+--- a/drivers/iio/adc/ti-adc108s102.c
++++ b/drivers/iio/adc/ti-adc108s102.c
+@@ -77,8 +77,8 @@ struct adc108s102_state {
+ * tx_buf: 8 channel read commands, plus 1 dummy command
+ * rx_buf: 1 dummy response, 8 channel responses
+ */
+- __be16 rx_buf[9] ____cacheline_aligned;
+- __be16 tx_buf[9] ____cacheline_aligned;
++ __be16 rx_buf[9] __aligned(IIO_DMA_MINALIGN);
++ __be16 tx_buf[9] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define ADC108S102_V_CHAN(index) \
+diff --git a/drivers/iio/adc/ti-adc12138.c b/drivers/iio/adc/ti-adc12138.c
+index 59d75d09604f3..c0a72d72f3a99 100644
+--- a/drivers/iio/adc/ti-adc12138.c
++++ b/drivers/iio/adc/ti-adc12138.c
+@@ -55,7 +55,7 @@ struct adc12138 {
+ */
+ __be16 data[20] __aligned(8);
+
+- u8 tx_buf[2] ____cacheline_aligned;
++ u8 tx_buf[2] __aligned(IIO_DMA_MINALIGN);
+ u8 rx_buf[2];
+ };
+
+diff --git a/drivers/iio/adc/ti-adc128s052.c b/drivers/iio/adc/ti-adc128s052.c
+index 8e7adec877555..622fd384983c7 100644
+--- a/drivers/iio/adc/ti-adc128s052.c
++++ b/drivers/iio/adc/ti-adc128s052.c
+@@ -29,7 +29,7 @@ struct adc128 {
+ struct regulator *reg;
+ struct mutex lock;
+
+- u8 buffer[2] ____cacheline_aligned;
++ u8 buffer[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int adc128_adc_conversion(struct adc128 *adc, u8 channel)
+diff --git a/drivers/iio/adc/ti-adc161s626.c b/drivers/iio/adc/ti-adc161s626.c
+index 75ca7f1c87264..b789891dcf490 100644
+--- a/drivers/iio/adc/ti-adc161s626.c
++++ b/drivers/iio/adc/ti-adc161s626.c
+@@ -71,7 +71,7 @@ struct ti_adc_data {
+ u8 read_size;
+ u8 shift;
+
+- u8 buffer[16] ____cacheline_aligned;
++ u8 buffer[16] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ti_adc_read_measurement(struct ti_adc_data *data,
+diff --git a/drivers/iio/adc/ti-ads124s08.c b/drivers/iio/adc/ti-ads124s08.c
+index 767b3b6348092..64833156c1998 100644
+--- a/drivers/iio/adc/ti-ads124s08.c
++++ b/drivers/iio/adc/ti-ads124s08.c
+@@ -106,7 +106,7 @@ struct ads124s_private {
+ * timestamp is maintained.
+ */
+ u32 buffer[ADS124S08_MAX_CHANNELS + sizeof(s64)/sizeof(u32)] __aligned(8);
+- u8 data[5] ____cacheline_aligned;
++ u8 data[5] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define ADS124S08_CHAN(index) \
+diff --git a/drivers/iio/adc/ti-ads131e08.c b/drivers/iio/adc/ti-ads131e08.c
+index 80a09817c1194..32237cacc9a37 100644
+--- a/drivers/iio/adc/ti-ads131e08.c
++++ b/drivers/iio/adc/ti-ads131e08.c
+@@ -105,7 +105,7 @@ struct ads131e08_state {
+ s64 ts __aligned(8);
+ } tmp_buf;
+
+- u8 tx_buf[3] ____cacheline_aligned;
++ u8 tx_buf[3] __aligned(IIO_DMA_MINALIGN);
+ /*
+ * Add extra one padding byte to be able to access the last channel
+ * value using u32 pointer
+diff --git a/drivers/iio/adc/ti-ads7950.c b/drivers/iio/adc/ti-ads7950.c
+index e3658b969c5bf..2cc9a9bd9db60 100644
+--- a/drivers/iio/adc/ti-ads7950.c
++++ b/drivers/iio/adc/ti-ads7950.c
+@@ -102,11 +102,11 @@ struct ti_ads7950_state {
+ unsigned int gpio_cmd_settings_bitmask;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ u16 rx_buf[TI_ADS7950_MAX_CHAN + 2 + TI_ADS7950_TIMESTAMP_SIZE]
+- ____cacheline_aligned;
++ __aligned(IIO_DMA_MINALIGN);
+ u16 tx_buf[TI_ADS7950_MAX_CHAN + 2];
+ u16 single_tx;
+ u16 single_rx;
+diff --git a/drivers/iio/adc/ti-ads8344.c b/drivers/iio/adc/ti-ads8344.c
+index c96d2a9ba9247..bbd85cb47f816 100644
+--- a/drivers/iio/adc/ti-ads8344.c
++++ b/drivers/iio/adc/ti-ads8344.c
+@@ -28,7 +28,7 @@ struct ads8344 {
+ */
+ struct mutex lock;
+
+- u8 tx_buf ____cacheline_aligned;
++ u8 tx_buf __aligned(IIO_DMA_MINALIGN);
+ u8 rx_buf[3];
+ };
+
+diff --git a/drivers/iio/adc/ti-ads8688.c b/drivers/iio/adc/ti-ads8688.c
+index 22c2583eedd0f..7ee14c6d9f463 100644
+--- a/drivers/iio/adc/ti-ads8688.c
++++ b/drivers/iio/adc/ti-ads8688.c
+@@ -71,7 +71,7 @@ struct ads8688_state {
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[2] ____cacheline_aligned;
++ } data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ads8688_id {
+diff --git a/drivers/iio/adc/ti-tlc4541.c b/drivers/iio/adc/ti-tlc4541.c
+index 2406eda9dfc6a..30f629a553a14 100644
+--- a/drivers/iio/adc/ti-tlc4541.c
++++ b/drivers/iio/adc/ti-tlc4541.c
+@@ -37,12 +37,12 @@ struct tlc4541_state {
+ struct spi_message scan_single_msg;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * 2 bytes data + 6 bytes padding + 8 bytes timestamp when
+ * call iio_push_to_buffers_with_timestamp.
+ */
+- __be16 rx_buf[8] ____cacheline_aligned;
++ __be16 rx_buf[8] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ struct tlc4541_chip_info {
+diff --git a/drivers/iio/addac/ad74413r.c b/drivers/iio/addac/ad74413r.c
+index acd230a6af35a..6a66d7a65db79 100644
+--- a/drivers/iio/addac/ad74413r.c
++++ b/drivers/iio/addac/ad74413r.c
+@@ -77,13 +77,13 @@ struct ad74413r_state {
+ struct spi_transfer adc_samples_xfer[AD74413R_CHANNEL_MAX + 1];
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ struct {
+ u8 rx_buf[AD74413R_FRAME_SIZE * AD74413R_CHANNEL_MAX];
+ s64 timestamp;
+- } adc_samples_buf ____cacheline_aligned;
++ } adc_samples_buf __aligned(IIO_DMA_MINALIGN);
+
+ u8 adc_samples_tx_buf[AD74413R_FRAME_SIZE * AD74413R_CHANNEL_MAX];
+ u8 reg_tx_buf[AD74413R_FRAME_SIZE];
+diff --git a/drivers/iio/amplifiers/ad8366.c b/drivers/iio/amplifiers/ad8366.c
+index 1134ae12e5319..f2c2ea79a07f3 100644
+--- a/drivers/iio/amplifiers/ad8366.c
++++ b/drivers/iio/amplifiers/ad8366.c
+@@ -45,10 +45,10 @@ struct ad8366_state {
+ enum ad8366_type type;
+ struct ad8366_info *info;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- unsigned char data[2] ____cacheline_aligned;
++ unsigned char data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static struct ad8366_info ad8366_infos[] = {
+diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c b/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c
+index af801e203623e..02d3cf36acb0c 100644
+--- a/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c
++++ b/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c
+@@ -97,7 +97,7 @@ static int cros_ec_lid_angle_probe(struct platform_device *pdev)
+ if (!indio_dev)
+ return -ENOMEM;
+
+- ret = cros_ec_sensors_core_init(pdev, indio_dev, false, NULL, NULL);
++ ret = cros_ec_sensors_core_init(pdev, indio_dev, false, NULL);
+ if (ret)
+ return ret;
+
+@@ -113,7 +113,7 @@ static int cros_ec_lid_angle_probe(struct platform_device *pdev)
+ if (ret)
+ return ret;
+
+- return devm_iio_device_register(dev, indio_dev);
++ return cros_ec_sensors_core_register(dev, indio_dev, NULL);
+ }
+
+ static const struct platform_device_id cros_ec_lid_angle_ids[] = {
+diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c
+index 376a5b30010ae..5cce34fdff022 100644
+--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c
++++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c
+@@ -235,8 +235,7 @@ static int cros_ec_sensors_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
+- cros_ec_sensors_capture,
+- cros_ec_sensors_push_data);
++ cros_ec_sensors_capture);
+ if (ret)
+ return ret;
+
+@@ -297,7 +296,8 @@ static int cros_ec_sensors_probe(struct platform_device *pdev)
+ else
+ state->core.read_ec_sensors_data = cros_ec_sensors_read_cmd;
+
+- return devm_iio_device_register(dev, indio_dev);
++ return cros_ec_sensors_core_register(dev, indio_dev,
++ cros_ec_sensors_push_data);
+ }
+
+ static const struct platform_device_id cros_ec_sensors_ids[] = {
+diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
+index b2725c6adc7f1..ed97168321617 100644
+--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
++++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
+@@ -234,21 +234,18 @@ static void cros_ec_sensors_core_clean(void *arg)
+
+ /**
+ * cros_ec_sensors_core_init() - basic initialization of the core structure
+- * @pdev: platform device created for the sensors
++ * @pdev: platform device created for the sensor
+ * @indio_dev: iio device structure of the device
+ * @physical_device: true if the device refers to a physical device
+ * @trigger_capture: function pointer to call buffer is triggered,
+ * for backward compatibility.
+- * @push_data: function to call when cros_ec_sensorhub receives
+- * a sample for that sensor.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+ int cros_ec_sensors_core_init(struct platform_device *pdev,
+ struct iio_dev *indio_dev,
+ bool physical_device,
+- cros_ec_sensors_capture_t trigger_capture,
+- cros_ec_sensorhub_push_data_cb_t push_data)
++ cros_ec_sensors_capture_t trigger_capture)
+ {
+ struct device *dev = &pdev->dev;
+ struct cros_ec_sensors_core_state *state = iio_priv(indio_dev);
+@@ -339,17 +336,6 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
+ if (ret)
+ return ret;
+
+- ret = cros_ec_sensorhub_register_push_data(
+- sensor_hub, sensor_platform->sensor_num,
+- indio_dev, push_data);
+- if (ret)
+- return ret;
+-
+- ret = devm_add_action_or_reset(
+- dev, cros_ec_sensors_core_clean, pdev);
+- if (ret)
+- return ret;
+-
+ /* Timestamp coming from FIFO are in ns since boot. */
+ ret = iio_device_set_clock(indio_dev, CLOCK_BOOTTIME);
+ if (ret)
+@@ -371,6 +357,46 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
+ }
+ EXPORT_SYMBOL_GPL(cros_ec_sensors_core_init);
+
++/**
++ * cros_ec_sensors_core_register() - Register callback to FIFO and IIO when
++ * sensor is ready.
++ * It must be called at the end of the sensor probe routine.
++ * @dev: device created for the sensor
++ * @indio_dev: iio device structure of the device
++ * @push_data: function to call when cros_ec_sensorhub receives
++ * a sample for that sensor.
++ *
++ * Return: 0 on success, -errno on failure.
++ */
++int cros_ec_sensors_core_register(struct device *dev,
++ struct iio_dev *indio_dev,
++ cros_ec_sensorhub_push_data_cb_t push_data)
++{
++ struct cros_ec_sensor_platform *sensor_platform = dev_get_platdata(dev);
++ struct cros_ec_sensorhub *sensor_hub = dev_get_drvdata(dev->parent);
++ struct platform_device *pdev = to_platform_device(dev);
++ struct cros_ec_dev *ec = sensor_hub->ec;
++ int ret;
++
++ ret = devm_iio_device_register(dev, indio_dev);
++ if (ret)
++ return ret;
++
++ if (!push_data ||
++ !cros_ec_check_features(ec, EC_FEATURE_MOTION_SENSE_FIFO))
++ return 0;
++
++ ret = cros_ec_sensorhub_register_push_data(
++ sensor_hub, sensor_platform->sensor_num,
++ indio_dev, push_data);
++ if (ret)
++ return ret;
++
++ return devm_add_action_or_reset(
++ dev, cros_ec_sensors_core_clean, pdev);
++}
++EXPORT_SYMBOL_GPL(cros_ec_sensors_core_register);
++
+ /**
+ * cros_ec_motion_send_host_cmd() - send motion sense host command
+ * @state: pointer to state information for device
+diff --git a/drivers/iio/common/ssp_sensors/ssp.h b/drivers/iio/common/ssp_sensors/ssp.h
+index abb8327956194..f649cdecc2774 100644
+--- a/drivers/iio/common/ssp_sensors/ssp.h
++++ b/drivers/iio/common/ssp_sensors/ssp.h
+@@ -221,8 +221,7 @@ struct ssp_data {
+ struct iio_dev *sensor_devs[SSP_SENSOR_MAX];
+ atomic_t enable_refcount;
+
+- __le16 header_buffer[SSP_HEADER_BUFFER_SIZE / sizeof(__le16)]
+- ____cacheline_aligned;
++ __le16 header_buffer[SSP_HEADER_BUFFER_SIZE / sizeof(__le16)] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ void ssp_clean_pending_list(struct ssp_data *data);
+diff --git a/drivers/iio/dac/ad5064.c b/drivers/iio/dac/ad5064.c
+index 27ee2c63c5d45..e3b1add34f462 100644
+--- a/drivers/iio/dac/ad5064.c
++++ b/drivers/iio/dac/ad5064.c
+@@ -115,13 +115,13 @@ struct ad5064_state {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ union {
+ u8 i2c[3];
+ __be32 spi;
+- } data ____cacheline_aligned;
++ } data __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad5064_type {
+diff --git a/drivers/iio/dac/ad5360.c b/drivers/iio/dac/ad5360.c
+index ecbc6a51d60fa..1bde696a572c6 100644
+--- a/drivers/iio/dac/ad5360.c
++++ b/drivers/iio/dac/ad5360.c
+@@ -79,13 +79,13 @@ struct ad5360_state {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[2] ____cacheline_aligned;
++ } data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad5360_type {
+diff --git a/drivers/iio/dac/ad5421.c b/drivers/iio/dac/ad5421.c
+index eedf661d32b2d..7644acfd879e0 100644
+--- a/drivers/iio/dac/ad5421.c
++++ b/drivers/iio/dac/ad5421.c
+@@ -72,13 +72,13 @@ struct ad5421_state {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[2] ____cacheline_aligned;
++ } data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct iio_event_spec ad5421_current_event[] = {
+diff --git a/drivers/iio/dac/ad5449.c b/drivers/iio/dac/ad5449.c
+index bad9bdaafa94d..4572d6f49275f 100644
+--- a/drivers/iio/dac/ad5449.c
++++ b/drivers/iio/dac/ad5449.c
+@@ -68,10 +68,10 @@ struct ad5449 {
+ uint16_t dac_cache[AD5449_MAX_CHANNELS];
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- __be16 data[2] ____cacheline_aligned;
++ __be16 data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad5449_type {
+diff --git a/drivers/iio/dac/ad5504.c b/drivers/iio/dac/ad5504.c
+index 8507573aa13e9..e472c9564edf7 100644
+--- a/drivers/iio/dac/ad5504.c
++++ b/drivers/iio/dac/ad5504.c
+@@ -54,7 +54,7 @@ struct ad5504_state {
+ unsigned pwr_down_mask;
+ unsigned pwr_down_mode;
+
+- __be16 data[2] ____cacheline_aligned;
++ __be16 data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ /*
+diff --git a/drivers/iio/dac/ad5592r-base.h b/drivers/iio/dac/ad5592r-base.h
+index 2a22ef6919965..cc7be426cbc88 100644
+--- a/drivers/iio/dac/ad5592r-base.h
++++ b/drivers/iio/dac/ad5592r-base.h
+@@ -14,6 +14,8 @@
+ #include <linux/mutex.h>
+ #include <linux/gpio/driver.h>
+
++#include <linux/iio/iio.h>
++
+ struct device;
+ struct ad5592r_state;
+
+@@ -65,7 +67,7 @@ struct ad5592r_state {
+ u8 gpio_in;
+ u8 gpio_val;
+
+- __be16 spi_msg ____cacheline_aligned;
++ __be16 spi_msg __aligned(IIO_DMA_MINALIGN);
+ __be16 spi_msg_nop;
+ };
+
+diff --git a/drivers/iio/dac/ad5686.h b/drivers/iio/dac/ad5686.h
+index cd5fff9e9d537..b7ade3a6b9b6c 100644
+--- a/drivers/iio/dac/ad5686.h
++++ b/drivers/iio/dac/ad5686.h
+@@ -13,6 +13,8 @@
+ #include <linux/mutex.h>
+ #include <linux/kernel.h>
+
++#include <linux/iio/iio.h>
++
+ #define AD5310_CMD(x) ((x) << 12)
+
+ #define AD5683_DATA(x) ((x) << 4)
+@@ -137,7 +139,7 @@ struct ad5686_state {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+
+@@ -145,7 +147,7 @@ struct ad5686_state {
+ __be32 d32;
+ __be16 d16;
+ u8 d8[4];
+- } data[3] ____cacheline_aligned;
++ } data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+
+diff --git a/drivers/iio/dac/ad5755.c b/drivers/iio/dac/ad5755.c
+index 7a62e6e1d5f15..bbb345323b691 100644
+--- a/drivers/iio/dac/ad5755.c
++++ b/drivers/iio/dac/ad5755.c
+@@ -189,14 +189,14 @@ struct ad5755_state {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[2] ____cacheline_aligned;
++ } data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad5755_type {
+diff --git a/drivers/iio/dac/ad5761.c b/drivers/iio/dac/ad5761.c
+index 4cb8471db81e0..6aa1a068adb06 100644
+--- a/drivers/iio/dac/ad5761.c
++++ b/drivers/iio/dac/ad5761.c
+@@ -70,13 +70,13 @@ struct ad5761_state {
+ enum ad5761_voltage_range range;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[3] ____cacheline_aligned;
++ } data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct ad5761_range_params ad5761_range_params[] = {
+diff --git a/drivers/iio/dac/ad5764.c b/drivers/iio/dac/ad5764.c
+index d235a8047ba0c..26c049d5b73a5 100644
+--- a/drivers/iio/dac/ad5764.c
++++ b/drivers/iio/dac/ad5764.c
+@@ -56,13 +56,13 @@ struct ad5764_state {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[2] ____cacheline_aligned;
++ } data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad5764_type {
+diff --git a/drivers/iio/dac/ad5766.c b/drivers/iio/dac/ad5766.c
+index 43189af2fb1f3..899894523752f 100644
+--- a/drivers/iio/dac/ad5766.c
++++ b/drivers/iio/dac/ad5766.c
+@@ -123,7 +123,7 @@ struct ad5766_state {
+ u32 d32;
+ u16 w16[2];
+ u8 b8[4];
+- } data[3] ____cacheline_aligned;
++ } data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ struct ad5766_span_tbl {
+diff --git a/drivers/iio/dac/ad5770r.c b/drivers/iio/dac/ad5770r.c
+index 7e2fd32e993a6..f66d67402e436 100644
+--- a/drivers/iio/dac/ad5770r.c
++++ b/drivers/iio/dac/ad5770r.c
+@@ -140,7 +140,7 @@ struct ad5770r_state {
+ bool ch_pwr_down[AD5770R_MAX_CHANNELS];
+ bool internal_ref;
+ bool external_res;
+- u8 transf_buf[2] ____cacheline_aligned;
++ u8 transf_buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct regmap_config ad5770r_spi_regmap_config = {
+diff --git a/drivers/iio/dac/ad5791.c b/drivers/iio/dac/ad5791.c
+index 2b14914b40500..ee7e89774d9d4 100644
+--- a/drivers/iio/dac/ad5791.c
++++ b/drivers/iio/dac/ad5791.c
+@@ -95,7 +95,7 @@ struct ad5791_state {
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[3] ____cacheline_aligned;
++ } data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum ad5791_supported_device_ids {
+diff --git a/drivers/iio/dac/ad7293.c b/drivers/iio/dac/ad7293.c
+index 59a38ca4c3c77..06f05750d9216 100644
+--- a/drivers/iio/dac/ad7293.c
++++ b/drivers/iio/dac/ad7293.c
+@@ -144,7 +144,7 @@ struct ad7293_state {
+ struct regulator *reg_avdd;
+ struct regulator *reg_vdrive;
+ u8 page_select;
+- u8 data[3] ____cacheline_aligned;
++ u8 data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad7293_page_select(struct ad7293_state *st, unsigned int reg)
+diff --git a/drivers/iio/dac/ad7303.c b/drivers/iio/dac/ad7303.c
+index 91eaaf793b3e7..558af7926a890 100644
+--- a/drivers/iio/dac/ad7303.c
++++ b/drivers/iio/dac/ad7303.c
+@@ -44,10 +44,10 @@ struct ad7303_state {
+
+ struct mutex lock;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- __be16 data ____cacheline_aligned;
++ __be16 data __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad7303_write(struct ad7303_state *st, unsigned int chan,
+diff --git a/drivers/iio/dac/ad8801.c b/drivers/iio/dac/ad8801.c
+index 6be35c92d435a..919e8c8806973 100644
+--- a/drivers/iio/dac/ad8801.c
++++ b/drivers/iio/dac/ad8801.c
+@@ -26,7 +26,7 @@ struct ad8801_state {
+ struct regulator *vrefh_reg;
+ struct regulator *vrefl_reg;
+
+- __be16 data ____cacheline_aligned;
++ __be16 data __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad8801_spi_write(struct ad8801_state *state,
+diff --git a/drivers/iio/dac/ltc2688.c b/drivers/iio/dac/ltc2688.c
+index 2f9c384885f4d..04ec38b1cad1b 100644
+--- a/drivers/iio/dac/ltc2688.c
++++ b/drivers/iio/dac/ltc2688.c
+@@ -91,10 +91,10 @@ struct ltc2688_state {
+ struct mutex lock;
+ int vref;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- u8 tx_data[6] ____cacheline_aligned;
++ u8 tx_data[6] __aligned(IIO_DMA_MINALIGN);
+ u8 rx_data[3];
+ };
+
+diff --git a/drivers/iio/dac/mcp4922.c b/drivers/iio/dac/mcp4922.c
+index cb9e60e71b915..6c0e31032c570 100644
+--- a/drivers/iio/dac/mcp4922.c
++++ b/drivers/iio/dac/mcp4922.c
+@@ -29,7 +29,7 @@ struct mcp4922_state {
+ unsigned int value[MCP4922_NUM_CHANNELS];
+ unsigned int vref_mv;
+ struct regulator *vref_reg;
+- u8 mosi[2] ____cacheline_aligned;
++ u8 mosi[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define MCP4922_CHAN(chan, bits) { \
+diff --git a/drivers/iio/dac/ti-dac082s085.c b/drivers/iio/dac/ti-dac082s085.c
+index 4e1156e6deb2d..f52ad8fa77474 100644
+--- a/drivers/iio/dac/ti-dac082s085.c
++++ b/drivers/iio/dac/ti-dac082s085.c
+@@ -55,7 +55,7 @@ struct ti_dac_chip {
+ bool powerdown;
+ u8 powerdown_mode;
+ u8 resolution;
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define WRITE_NOT_UPDATE(chan) (0x00 | (chan) << 6)
+diff --git a/drivers/iio/dac/ti-dac5571.c b/drivers/iio/dac/ti-dac5571.c
+index 0b775f943db3e..ba68ea1e3455d 100644
+--- a/drivers/iio/dac/ti-dac5571.c
++++ b/drivers/iio/dac/ti-dac5571.c
+@@ -52,7 +52,7 @@ struct dac5571_data {
+ struct dac5571_spec const *spec;
+ int (*dac5571_cmd)(struct dac5571_data *data, int channel, u16 val);
+ int (*dac5571_pwrdwn)(struct dac5571_data *data, int channel, u8 pwrdwn);
+- u8 buf[3] ____cacheline_aligned;
++ u8 buf[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define DAC5571_POWERDOWN(mode) ((mode) + 1)
+diff --git a/drivers/iio/dac/ti-dac7311.c b/drivers/iio/dac/ti-dac7311.c
+index e10d17e60ed39..dd0a7b77838da 100644
+--- a/drivers/iio/dac/ti-dac7311.c
++++ b/drivers/iio/dac/ti-dac7311.c
+@@ -52,7 +52,7 @@ struct ti_dac_chip {
+ bool powerdown;
+ u8 powerdown_mode;
+ u8 resolution;
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static u8 ti_dac_get_power(struct ti_dac_chip *ti_dac, bool powerdown)
+diff --git a/drivers/iio/dac/ti-dac7612.c b/drivers/iio/dac/ti-dac7612.c
+index 4c0f4b5e9ff44..8195815de26fe 100644
+--- a/drivers/iio/dac/ti-dac7612.c
++++ b/drivers/iio/dac/ti-dac7612.c
+@@ -31,10 +31,10 @@ struct dac7612 {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- uint8_t data[2] ____cacheline_aligned;
++ uint8_t data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int dac7612_cmd_single(struct dac7612 *priv, int channel, u16 val)
+diff --git a/drivers/iio/frequency/ad9523.c b/drivers/iio/frequency/ad9523.c
+index a0f92c336fc40..31c97f9f2c1be 100644
+--- a/drivers/iio/frequency/ad9523.c
++++ b/drivers/iio/frequency/ad9523.c
+@@ -287,13 +287,13 @@ struct ad9523_state {
+ struct mutex lock;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
+- * transfer buffers to live in their own cache lines.
++ * DMA (thus cache coherency maintenance) may require that
++ * transfer buffers live in their own cache lines.
+ */
+ union {
+ __be32 d32;
+ u8 d8[4];
+- } data[2] ____cacheline_aligned;
++ } data[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad9523_read(struct iio_dev *indio_dev, unsigned int addr)
+diff --git a/drivers/iio/frequency/adf4350.c b/drivers/iio/frequency/adf4350.c
+index be1218d862919..85e289700c3c5 100644
+--- a/drivers/iio/frequency/adf4350.c
++++ b/drivers/iio/frequency/adf4350.c
+@@ -56,10 +56,10 @@ struct adf4350_state {
+ */
+ struct mutex lock;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
+- * transfer buffers to live in their own cache lines.
++ * DMA (thus cache coherency maintenance) may require that
++ * transfer buffers live in their own cache lines.
+ */
+- __be32 val ____cacheline_aligned;
++ __be32 val __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static struct adf4350_platform_data default_pdata = {
+diff --git a/drivers/iio/frequency/adf4371.c b/drivers/iio/frequency/adf4371.c
+index ecd5e18995adc..135c8cedc33dc 100644
+--- a/drivers/iio/frequency/adf4371.c
++++ b/drivers/iio/frequency/adf4371.c
+@@ -175,7 +175,7 @@ struct adf4371_state {
+ unsigned int mod2;
+ unsigned int rf_div_sel;
+ unsigned int ref_div_factor;
+- u8 buf[10] ____cacheline_aligned;
++ u8 buf[10] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static unsigned long long adf4371_pll_fract_n_get_rate(struct adf4371_state *st,
+diff --git a/drivers/iio/frequency/admv1013.c b/drivers/iio/frequency/admv1013.c
+index b0e1f6571afba..ed81672713586 100644
+--- a/drivers/iio/frequency/admv1013.c
++++ b/drivers/iio/frequency/admv1013.c
+@@ -100,7 +100,7 @@ struct admv1013_state {
+ unsigned int input_mode;
+ unsigned int quad_se_mode;
+ bool det_en;
+- u8 data[3] ____cacheline_aligned;
++ u8 data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int __admv1013_spi_read(struct admv1013_state *st, unsigned int reg,
+diff --git a/drivers/iio/frequency/admv1014.c b/drivers/iio/frequency/admv1014.c
+index a7994f8e6b9ba..d1ccaa7ed5fee 100644
+--- a/drivers/iio/frequency/admv1014.c
++++ b/drivers/iio/frequency/admv1014.c
+@@ -127,7 +127,7 @@ struct admv1014_state {
+ unsigned int quad_se_mode;
+ unsigned int p1db_comp;
+ bool det_en;
+- u8 data[3] ____cacheline_aligned;
++ u8 data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const int mixer_vgate_table[] = {106, 107, 108, 110, 111, 112, 113, 114,
+diff --git a/drivers/iio/frequency/admv4420.c b/drivers/iio/frequency/admv4420.c
+index 51134aee85109..863ba8e98c95e 100644
+--- a/drivers/iio/frequency/admv4420.c
++++ b/drivers/iio/frequency/admv4420.c
+@@ -113,7 +113,7 @@ struct admv4420_state {
+ struct admv4420_n_counter n_counter;
+ enum admv4420_mux_sel mux_sel;
+ struct mutex lock;
+- u8 transf_buf[4] ____cacheline_aligned;
++ u8 transf_buf[4] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct regmap_config admv4420_regmap_config = {
+diff --git a/drivers/iio/frequency/adrf6780.c b/drivers/iio/frequency/adrf6780.c
+index 8255ffd174f6a..21878bad09097 100644
+--- a/drivers/iio/frequency/adrf6780.c
++++ b/drivers/iio/frequency/adrf6780.c
+@@ -86,7 +86,7 @@ struct adrf6780_state {
+ bool uc_bias_en;
+ bool lo_sideband;
+ bool vdet_out_en;
+- u8 data[3] ____cacheline_aligned;
++ u8 data[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int __adrf6780_spi_read(struct adrf6780_state *st, unsigned int reg,
+diff --git a/drivers/iio/gyro/adis16080.c b/drivers/iio/gyro/adis16080.c
+index acef59d822b10..14b3abf6dce95 100644
+--- a/drivers/iio/gyro/adis16080.c
++++ b/drivers/iio/gyro/adis16080.c
+@@ -45,7 +45,7 @@ struct adis16080_state {
+ const struct adis16080_chip_info *info;
+ struct mutex lock;
+
+- __be16 buf ____cacheline_aligned;
++ __be16 buf __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int adis16080_read_sample(struct iio_dev *indio_dev,
+diff --git a/drivers/iio/gyro/adis16130.c b/drivers/iio/gyro/adis16130.c
+index b9c952e65b553..33cde9e6fca59 100644
+--- a/drivers/iio/gyro/adis16130.c
++++ b/drivers/iio/gyro/adis16130.c
+@@ -41,7 +41,7 @@
+ struct adis16130_state {
+ struct spi_device *us;
+ struct mutex buf_lock;
+- u8 buf[4] ____cacheline_aligned;
++ u8 buf[4] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int adis16130_spi_read(struct iio_dev *indio_dev, u8 reg_addr, u32 *val)
+diff --git a/drivers/iio/gyro/adxrs450.c b/drivers/iio/gyro/adxrs450.c
+index 04f3500252152..f84438e0c42c5 100644
+--- a/drivers/iio/gyro/adxrs450.c
++++ b/drivers/iio/gyro/adxrs450.c
+@@ -73,7 +73,7 @@ enum {
+ struct adxrs450_state {
+ struct spi_device *us;
+ struct mutex buf_lock;
+- __be32 tx ____cacheline_aligned;
++ __be32 tx __aligned(IIO_DMA_MINALIGN);
+ __be32 rx;
+
+ };
+diff --git a/drivers/iio/gyro/fxas21002c_core.c b/drivers/iio/gyro/fxas21002c_core.c
+index 410e5e9f2672e..7a459a823f6e1 100644
+--- a/drivers/iio/gyro/fxas21002c_core.c
++++ b/drivers/iio/gyro/fxas21002c_core.c
+@@ -150,10 +150,10 @@ struct fxas21002c_data {
+ struct regulator *vddio;
+
+ /*
+- * DMA (thus cache coherency maintenance) requires the
+- * transfer buffers to live in their own cache lines.
++ * DMA (thus cache coherency maintenance) may require the
++ * transfer buffers live in their own cache lines.
+ */
+- s16 buffer[8] ____cacheline_aligned;
++ s16 buffer[8] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ enum fxas21002c_channel_index {
+diff --git a/drivers/iio/imu/fxos8700_core.c b/drivers/iio/imu/fxos8700_core.c
+index ab288186f36e4..423cfe526f2a1 100644
+--- a/drivers/iio/imu/fxos8700_core.c
++++ b/drivers/iio/imu/fxos8700_core.c
+@@ -167,7 +167,7 @@
+ struct fxos8700_data {
+ struct regmap *regmap;
+ struct iio_trigger *trig;
+- __be16 buf[FXOS8700_DATA_BUF_SIZE] ____cacheline_aligned;
++ __be16 buf[FXOS8700_DATA_BUF_SIZE] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ /* Regmap info */
+diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600.h b/drivers/iio/imu/inv_icm42600/inv_icm42600.h
+index 995a9dc06521d..3d91469beccbb 100644
+--- a/drivers/iio/imu/inv_icm42600/inv_icm42600.h
++++ b/drivers/iio/imu/inv_icm42600/inv_icm42600.h
+@@ -141,7 +141,7 @@ struct inv_icm42600_state {
+ struct inv_icm42600_suspended suspended;
+ struct iio_dev *indio_gyro;
+ struct iio_dev *indio_accel;
+- uint8_t buffer[2] ____cacheline_aligned;
++ uint8_t buffer[2] __aligned(IIO_DMA_MINALIGN);
+ struct inv_icm42600_fifo fifo;
+ struct {
+ int64_t gyro;
+diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h b/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h
+index de2a3949dcc7d..8b85ee333bf8f 100644
+--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h
++++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h
+@@ -39,7 +39,7 @@ struct inv_icm42600_fifo {
+ size_t accel;
+ size_t total;
+ } nb;
+- uint8_t data[2080] ____cacheline_aligned;
++ uint8_t data[2080] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ /* FIFO data packet */
+diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
+index c6aa36ee966a4..32b58b797d573 100644
+--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
++++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
+@@ -203,7 +203,7 @@ struct inv_mpu6050_state {
+ s32 magn_raw_to_gauss[3];
+ struct iio_mount_matrix magn_orient;
+ unsigned int suspended_sensors;
+- u8 data[INV_MPU6050_OUTPUT_DATA_SIZE] ____cacheline_aligned;
++ u8 data[INV_MPU6050_OUTPUT_DATA_SIZE] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ /*register and associated bit definition*/
+diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
+index b078eb2f3c9de..01369973329aa 100644
+--- a/drivers/iio/industrialio-buffer.c
++++ b/drivers/iio/industrialio-buffer.c
+@@ -915,7 +915,7 @@ static int iio_verify_update(struct iio_dev *indio_dev,
+ if (scan_mask == NULL)
+ return -EINVAL;
+ } else {
+- scan_mask = compound_mask;
++ scan_mask = compound_mask;
+ }
+
+ config->scan_bytes = iio_compute_scan_bytes(indio_dev,
+@@ -1649,7 +1649,7 @@ static int __iio_buffer_alloc_sysfs_and_mask(struct iio_buffer *buffer,
+ }
+
+ attrn = buffer_attrcount + scan_el_attrcount + ARRAY_SIZE(iio_buffer_attrs);
+- attr = kcalloc(attrn + 1, sizeof(* attr), GFP_KERNEL);
++ attr = kcalloc(attrn + 1, sizeof(*attr), GFP_KERNEL);
+ if (!attr) {
+ ret = -ENOMEM;
+ goto error_free_scan_mask;
+diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
+index e1ed44dec2abe..e5aed96de8f33 100644
+--- a/drivers/iio/industrialio-core.c
++++ b/drivers/iio/industrialio-core.c
+@@ -821,7 +821,23 @@ static ssize_t iio_format_avail_list(char *buf, const int *vals,
+
+ static ssize_t iio_format_avail_range(char *buf, const int *vals, int type)
+ {
+- return iio_format_list(buf, vals, type, 3, "[", "]");
++ int length;
++
++ /*
++ * length refers to the array size , not the number of elements.
++ * The purpose is to print the range [min , step ,max] so length should
++ * be 3 in case of int, and 6 for other types.
++ */
++ switch (type) {
++ case IIO_VAL_INT:
++ length = 3;
++ break;
++ default:
++ length = 6;
++ break;
++ }
++
++ return iio_format_list(buf, vals, type, length, "[", "]");
+ }
+
+ static ssize_t iio_read_channel_info_avail(struct device *dev,
+@@ -892,8 +908,7 @@ static int __iio_str_to_fixpoint(const char *str, int fract_mult,
+ } else if (*str == '\n') {
+ if (*(str + 1) == '\0')
+ break;
+- else
+- return -EINVAL;
++ return -EINVAL;
+ } else if (!strncmp(str, " dB", sizeof(" dB") - 1) && scale_db) {
+ /* Ignore the dB suffix */
+ str += sizeof(" dB") - 1;
+@@ -1640,7 +1655,7 @@ struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv)
+
+ alloc_size = sizeof(struct iio_dev_opaque);
+ if (sizeof_priv) {
+- alloc_size = ALIGN(alloc_size, IIO_ALIGN);
++ alloc_size = ALIGN(alloc_size, IIO_DMA_MINALIGN);
+ alloc_size += sizeof_priv;
+ }
+
+@@ -1650,7 +1665,7 @@ struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv)
+
+ indio_dev = &iio_dev_opaque->indio_dev;
+ indio_dev->priv = (char *)iio_dev_opaque +
+- ALIGN(sizeof(struct iio_dev_opaque), IIO_ALIGN);
++ ALIGN(sizeof(struct iio_dev_opaque), IIO_DMA_MINALIGN);
+
+ indio_dev->dev.parent = parent;
+ indio_dev->dev.type = &iio_device_type;
+diff --git a/drivers/iio/light/cros_ec_light_prox.c b/drivers/iio/light/cros_ec_light_prox.c
+index de472f23d1cba..16b893bae3881 100644
+--- a/drivers/iio/light/cros_ec_light_prox.c
++++ b/drivers/iio/light/cros_ec_light_prox.c
+@@ -181,8 +181,7 @@ static int cros_ec_light_prox_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
+- cros_ec_sensors_capture,
+- cros_ec_sensors_push_data);
++ cros_ec_sensors_capture);
+ if (ret)
+ return ret;
+
+@@ -240,7 +239,8 @@ static int cros_ec_light_prox_probe(struct platform_device *pdev)
+
+ state->core.read_ec_sensors_data = cros_ec_sensors_read_cmd;
+
+- return devm_iio_device_register(dev, indio_dev);
++ return cros_ec_sensors_core_register(dev, indio_dev,
++ cros_ec_sensors_push_data);
+ }
+
+ static const struct platform_device_id cros_ec_light_prox_ids[] = {
+diff --git a/drivers/iio/light/isl29028.c b/drivers/iio/light/isl29028.c
+index 9de3262aa6883..a62787f5d5e7b 100644
+--- a/drivers/iio/light/isl29028.c
++++ b/drivers/iio/light/isl29028.c
+@@ -625,7 +625,7 @@ static int isl29028_probe(struct i2c_client *client,
+ ISL29028_POWER_OFF_DELAY_MS);
+ pm_runtime_use_autosuspend(&client->dev);
+
+- ret = devm_iio_device_register(indio_dev->dev.parent, indio_dev);
++ ret = iio_device_register(indio_dev);
+ if (ret < 0) {
+ dev_err(&client->dev,
+ "%s(): iio registration failed with error %d\n",
+diff --git a/drivers/iio/potentiometer/ad5110.c b/drivers/iio/potentiometer/ad5110.c
+index d4eeedae56e5a..8fbcce4829898 100644
+--- a/drivers/iio/potentiometer/ad5110.c
++++ b/drivers/iio/potentiometer/ad5110.c
+@@ -63,10 +63,10 @@ struct ad5110_data {
+ struct mutex lock;
+ const struct ad5110_cfg *cfg;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ */
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct iio_chan_spec ad5110_channels[] = {
+diff --git a/drivers/iio/potentiometer/ad5272.c b/drivers/iio/potentiometer/ad5272.c
+index d8cbd170262f8..ed5fc0b50fe97 100644
+--- a/drivers/iio/potentiometer/ad5272.c
++++ b/drivers/iio/potentiometer/ad5272.c
+@@ -50,7 +50,7 @@ struct ad5272_data {
+ struct i2c_client *client;
+ struct mutex lock;
+ const struct ad5272_cfg *cfg;
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct iio_chan_spec ad5272_channel = {
+diff --git a/drivers/iio/potentiometer/max5481.c b/drivers/iio/potentiometer/max5481.c
+index 098d144a8fddc..b40e5ac218d73 100644
+--- a/drivers/iio/potentiometer/max5481.c
++++ b/drivers/iio/potentiometer/max5481.c
+@@ -44,7 +44,7 @@ static const struct max5481_cfg max5481_cfg[] = {
+ struct max5481_data {
+ struct spi_device *spi;
+ const struct max5481_cfg *cfg;
+- u8 msg[3] ____cacheline_aligned;
++ u8 msg[3] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define MAX5481_CHANNEL { \
+diff --git a/drivers/iio/potentiometer/mcp41010.c b/drivers/iio/potentiometer/mcp41010.c
+index 30a4594d4e115..2b73c75402094 100644
+--- a/drivers/iio/potentiometer/mcp41010.c
++++ b/drivers/iio/potentiometer/mcp41010.c
+@@ -60,7 +60,7 @@ struct mcp41010_data {
+ const struct mcp41010_cfg *cfg;
+ struct mutex lock; /* Protect write sequences */
+ unsigned int value[MCP41010_MAX_WIPERS]; /* Cache wiper values */
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define MCP41010_CHANNEL(ch) { \
+diff --git a/drivers/iio/potentiometer/mcp4131.c b/drivers/iio/potentiometer/mcp4131.c
+index 7c8c18ab87649..7890c0993ec48 100644
+--- a/drivers/iio/potentiometer/mcp4131.c
++++ b/drivers/iio/potentiometer/mcp4131.c
+@@ -129,7 +129,7 @@ struct mcp4131_data {
+ struct spi_device *spi;
+ const struct mcp4131_cfg *cfg;
+ struct mutex lock;
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ #define MCP4131_CHANNEL(ch) { \
+diff --git a/drivers/iio/pressure/cros_ec_baro.c b/drivers/iio/pressure/cros_ec_baro.c
+index 2f882e1094232..0511edbf868d7 100644
+--- a/drivers/iio/pressure/cros_ec_baro.c
++++ b/drivers/iio/pressure/cros_ec_baro.c
+@@ -138,8 +138,7 @@ static int cros_ec_baro_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
+- cros_ec_sensors_capture,
+- cros_ec_sensors_push_data);
++ cros_ec_sensors_capture);
+ if (ret)
+ return ret;
+
+@@ -186,7 +185,8 @@ static int cros_ec_baro_probe(struct platform_device *pdev)
+
+ state->core.read_ec_sensors_data = cros_ec_sensors_read_cmd;
+
+- return devm_iio_device_register(dev, indio_dev);
++ return cros_ec_sensors_core_register(dev, indio_dev,
++ cros_ec_sensors_push_data);
+ }
+
+ static const struct platform_device_id cros_ec_baro_ids[] = {
+diff --git a/drivers/iio/proximity/as3935.c b/drivers/iio/proximity/as3935.c
+index 67891ce2bd095..ebc95cf8f5f42 100644
+--- a/drivers/iio/proximity/as3935.c
++++ b/drivers/iio/proximity/as3935.c
+@@ -65,7 +65,7 @@ struct as3935_state {
+ u8 chan;
+ s64 timestamp __aligned(8);
+ } scan;
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static const struct iio_chan_spec as3935_channels[] = {
+diff --git a/drivers/iio/proximity/sx9324.c b/drivers/iio/proximity/sx9324.c
+index 63fbcaa4cac81..a30ac8007a3d3 100644
+--- a/drivers/iio/proximity/sx9324.c
++++ b/drivers/iio/proximity/sx9324.c
+@@ -93,7 +93,7 @@
+ #define SX9324_REG_PROX_CTRL4_AVGNEGFILT_MASK GENMASK(5, 3)
+ #define SX9324_REG_PROX_CTRL4_AVGNEG_FILT_2 0x08
+ #define SX9324_REG_PROX_CTRL4_AVGPOSFILT_MASK GENMASK(2, 0)
+-#define SX9324_REG_PROX_CTRL3_AVGPOS_FILT_256 0x04
++#define SX9324_REG_PROX_CTRL4_AVGPOS_FILT_256 0x04
+ #define SX9324_REG_PROX_CTRL5 0x35
+ #define SX9324_REG_PROX_CTRL5_HYST_MASK GENMASK(5, 4)
+ #define SX9324_REG_PROX_CTRL5_CLOSE_DEBOUNCE_MASK GENMASK(3, 2)
+@@ -810,7 +810,7 @@ static const struct sx_common_reg_default sx9324_default_regs[] = {
+ { SX9324_REG_PROX_CTRL3, SX9324_REG_PROX_CTRL3_AVGDEB_2SAMPLES |
+ SX9324_REG_PROX_CTRL3_AVGPOS_THRESH_16K },
+ { SX9324_REG_PROX_CTRL4, SX9324_REG_PROX_CTRL4_AVGNEG_FILT_2 |
+- SX9324_REG_PROX_CTRL3_AVGPOS_FILT_256 },
++ SX9324_REG_PROX_CTRL4_AVGPOS_FILT_256 },
+ { SX9324_REG_PROX_CTRL5, 0x00 },
+ { SX9324_REG_PROX_CTRL6, SX9324_REG_PROX_CTRL6_PROXTHRESH_32 },
+ { SX9324_REG_PROX_CTRL7, SX9324_REG_PROX_CTRL6_PROXTHRESH_32 },
+diff --git a/drivers/iio/resolver/ad2s1200.c b/drivers/iio/resolver/ad2s1200.c
+index 9746bd9356285..9d95241bdf8f2 100644
+--- a/drivers/iio/resolver/ad2s1200.c
++++ b/drivers/iio/resolver/ad2s1200.c
+@@ -41,7 +41,7 @@ struct ad2s1200_state {
+ struct spi_device *sdev;
+ struct gpio_desc *sample;
+ struct gpio_desc *rdvel;
+- __be16 rx ____cacheline_aligned;
++ __be16 rx __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad2s1200_read_raw(struct iio_dev *indio_dev,
+diff --git a/drivers/iio/resolver/ad2s90.c b/drivers/iio/resolver/ad2s90.c
+index d6a91f137e134..be6836e55376f 100644
+--- a/drivers/iio/resolver/ad2s90.c
++++ b/drivers/iio/resolver/ad2s90.c
+@@ -24,7 +24,7 @@
+ struct ad2s90_state {
+ struct mutex lock; /* lock to protect rx buffer */
+ struct spi_device *sdev;
+- u8 rx[2] ____cacheline_aligned;
++ u8 rx[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int ad2s90_read_raw(struct iio_dev *indio_dev,
+diff --git a/drivers/iio/temperature/ltc2983.c b/drivers/iio/temperature/ltc2983.c
+index 301c3f13fb26c..1b8252d86889e 100644
+--- a/drivers/iio/temperature/ltc2983.c
++++ b/drivers/iio/temperature/ltc2983.c
+@@ -200,11 +200,11 @@ struct ltc2983_data {
+ u8 num_channels;
+ u8 iio_channels;
+ /*
+- * DMA (thus cache coherency maintenance) requires the
++ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers to live in their own cache lines.
+ * Holds the converted temperature
+ */
+- __be32 temp ____cacheline_aligned;
++ __be32 temp __aligned(IIO_DMA_MINALIGN);
+ };
+
+ struct ltc2983_sensor {
+diff --git a/drivers/iio/temperature/max31865.c b/drivers/iio/temperature/max31865.c
+index 86c3f3509a26f..f0079e51d8d2d 100644
+--- a/drivers/iio/temperature/max31865.c
++++ b/drivers/iio/temperature/max31865.c
+@@ -53,7 +53,7 @@ struct max31865_data {
+ struct mutex lock;
+ bool filter_50hz;
+ bool three_wire;
+- u8 buf[2] ____cacheline_aligned;
++ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
+ };
+
+ static int max31865_read(struct max31865_data *data, u8 reg,
+diff --git a/drivers/iio/temperature/maxim_thermocouple.c b/drivers/iio/temperature/maxim_thermocouple.c
+index 98c41cddc6f00..c28a7a6dea5f1 100644
+--- a/drivers/iio/temperature/maxim_thermocouple.c
++++ b/drivers/iio/temperature/maxim_thermocouple.c
+@@ -122,7 +122,7 @@ struct maxim_thermocouple_data {
+ struct spi_device *spi;
+ const struct maxim_thermocouple_chip *chip;
+
+- u8 buffer[16] ____cacheline_aligned;
++ u8 buffer[16] __aligned(IIO_DMA_MINALIGN);
+ char tc_type;
+ };
+
+diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
+index 3ebdd42fec362..686d170a5947e 100644
+--- a/drivers/infiniband/hw/hfi1/file_ops.c
++++ b/drivers/infiniband/hw/hfi1/file_ops.c
+@@ -1179,8 +1179,10 @@ static int setup_base_ctxt(struct hfi1_filedata *fd,
+ goto done;
+
+ ret = init_user_ctxt(fd, uctxt);
+- if (ret)
++ if (ret) {
++ hfi1_free_ctxt_rcv_groups(uctxt);
+ goto done;
++ }
+
+ user_init(uctxt);
+
+diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+index 86f6a4aae1e5a..6e7dadc3386cc 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
++++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+@@ -6125,8 +6125,8 @@ static irqreturn_t hns_roce_v2_msix_interrupt_abn(int irq, void *dev_id)
+
+ dev_err(dev, "AEQ overflow!\n");
+
+- int_st |= 1 << HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S;
+- roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG, int_st);
++ roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG,
++ 1 << HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S);
+
+ /* Set reset level for reset_event() */
+ if (ops->set_default_reset_request)
+diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
+index 646fa86774909..7b086fe63a245 100644
+--- a/drivers/infiniband/hw/irdma/cm.c
++++ b/drivers/infiniband/hw/irdma/cm.c
+@@ -1477,12 +1477,13 @@ irdma_find_listener(struct irdma_cm_core *cm_core, u32 *dst_addr, u16 dst_port,
+ list_for_each_entry (listen_node, &cm_core->listen_list, list) {
+ memcpy(listen_addr, listen_node->loc_addr, sizeof(listen_addr));
+ listen_port = listen_node->loc_port;
++ if (listen_port != dst_port ||
++ !(listener_state & listen_node->listener_state))
++ continue;
+ /* compare node pair, return node handle if a match */
+- if ((!memcmp(listen_addr, dst_addr, sizeof(listen_addr)) ||
+- !memcmp(listen_addr, ip_zero, sizeof(listen_addr))) &&
+- listen_port == dst_port &&
+- vlan_id == listen_node->vlan_id &&
+- (listener_state & listen_node->listener_state)) {
++ if (!memcmp(listen_addr, ip_zero, sizeof(listen_addr)) ||
++ (!memcmp(listen_addr, dst_addr, sizeof(listen_addr)) &&
++ vlan_id == listen_node->vlan_id)) {
+ refcount_inc(&listen_node->refcnt);
+ spin_unlock_irqrestore(&cm_core->listen_list_lock,
+ flags);
+diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
+index 3dc9b5801da15..c1b402a06e17f 100644
+--- a/drivers/infiniband/hw/irdma/hw.c
++++ b/drivers/infiniband/hw/irdma/hw.c
+@@ -257,10 +257,6 @@ static void irdma_process_aeq(struct irdma_pci_f *rf)
+ iwqp->last_aeq = info->ae_id;
+ spin_unlock_irqrestore(&iwqp->lock, flags);
+ ctx_info = &iwqp->ctx_info;
+- if (rdma_protocol_roce(&iwqp->iwdev->ibdev, 1))
+- ctx_info->roce_info->err_rq_idx_valid = true;
+- else
+- ctx_info->iwarp_info->err_rq_idx_valid = true;
+ } else {
+ if (info->ae_id != IRDMA_AE_CQ_OPERATION_ERROR)
+ continue;
+@@ -370,16 +366,12 @@ static void irdma_process_aeq(struct irdma_pci_f *rf)
+ case IRDMA_AE_LCE_FUNCTION_CATASTROPHIC:
+ case IRDMA_AE_LCE_CQ_CATASTROPHIC:
+ case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG:
+- if (rdma_protocol_roce(&iwdev->ibdev, 1))
+- ctx_info->roce_info->err_rq_idx_valid = false;
+- else
+- ctx_info->iwarp_info->err_rq_idx_valid = false;
+- fallthrough;
+ default:
+ ibdev_err(&iwdev->ibdev, "abnormal ae_id = 0x%x bool qp=%d qp_id = %d\n",
+ info->ae_id, info->qp, info->qp_cq_id);
+ if (rdma_protocol_roce(&iwdev->ibdev, 1)) {
+- if (!info->sq && ctx_info->roce_info->err_rq_idx_valid) {
++ ctx_info->roce_info->err_rq_idx_valid = info->rq;
++ if (info->rq) {
+ ctx_info->roce_info->err_rq_idx = info->wqe_idx;
+ irdma_sc_qp_setctx_roce(&iwqp->sc_qp, iwqp->host_ctx.va,
+ ctx_info);
+@@ -388,7 +380,8 @@ static void irdma_process_aeq(struct irdma_pci_f *rf)
+ irdma_cm_disconn(iwqp);
+ break;
+ }
+- if (!info->sq && ctx_info->iwarp_info->err_rq_idx_valid) {
++ ctx_info->iwarp_info->err_rq_idx_valid = info->rq;
++ if (info->rq) {
+ ctx_info->iwarp_info->err_rq_idx = info->wqe_idx;
+ ctx_info->tcp_info_valid = false;
+ ctx_info->iwarp_info_valid = true;
+diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
+index 6daa149dcbda2..b29631f6659a5 100644
+--- a/drivers/infiniband/hw/irdma/verbs.c
++++ b/drivers/infiniband/hw/irdma/verbs.c
+@@ -1760,11 +1760,11 @@ static int irdma_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+ spin_unlock_irqrestore(&iwcq->lock, flags);
+
+ irdma_cq_wq_destroy(iwdev->rf, cq);
+- irdma_cq_free_rsrc(iwdev->rf, iwcq);
+
+ spin_lock_irqsave(&iwceq->ce_lock, flags);
+ irdma_sc_cleanup_ceqes(cq, ceq);
+ spin_unlock_irqrestore(&iwceq->ce_lock, flags);
++ irdma_cq_free_rsrc(iwdev->rf, iwcq);
+
+ return 0;
+ }
+diff --git a/drivers/infiniband/hw/mlx5/fs.c b/drivers/infiniband/hw/mlx5/fs.c
+index 661ed2b44508d..6092118de6723 100644
+--- a/drivers/infiniband/hw/mlx5/fs.c
++++ b/drivers/infiniband/hw/mlx5/fs.c
+@@ -2265,12 +2265,10 @@ static int mlx5_ib_matcher_ns(struct uverbs_attr_bundle *attrs,
+ if (err)
+ return err;
+
+- if (flags) {
+- mlx5_ib_ft_type_to_namespace(
++ if (flags)
++ return mlx5_ib_ft_type_to_namespace(
+ MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX,
+ &obj->ns_type);
+- return 0;
+- }
+ }
+
+ obj->ns_type = MLX5_FLOW_NAMESPACE_BYPASS;
+diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
+index df4d7970c1ad5..cc99b293f0be8 100644
+--- a/drivers/infiniband/hw/qedr/verbs.c
++++ b/drivers/infiniband/hw/qedr/verbs.c
+@@ -3083,7 +3083,7 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
+ else
+ DP_ERR(dev, "roce alloc tid returned error %d\n", rc);
+
+- goto err0;
++ goto err1;
+ }
+
+ /* Index only, 18 bit long, lkey = itid << 8 | key */
+@@ -3107,7 +3107,7 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
+ rc = dev->ops->rdma_register_tid(dev->rdma_ctx, &mr->hw_mr);
+ if (rc) {
+ DP_ERR(dev, "roce register tid returned an error %d\n", rc);
+- goto err1;
++ goto err2;
+ }
+
+ mr->ibmr.lkey = mr->hw_mr.itid << 8 | mr->hw_mr.key;
+@@ -3116,8 +3116,10 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
+ DP_DEBUG(dev, QEDR_MSG_MR, "alloc frmr: %x\n", mr->ibmr.lkey);
+ return mr;
+
+-err1:
++err2:
+ dev->ops->rdma_free_tid(dev->rdma_ctx, mr->hw_mr.itid);
++err1:
++ qedr_free_pbl(dev, &mr->info.pbl_info, mr->info.pbl_table);
+ err0:
+ kfree(mr);
+ return ERR_PTR(rc);
+diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
+index 138b3e7d3a5f0..ec671e171f139 100644
+--- a/drivers/infiniband/sw/rxe/rxe_comp.c
++++ b/drivers/infiniband/sw/rxe/rxe_comp.c
+@@ -114,6 +114,8 @@ void retransmit_timer(struct timer_list *t)
+ {
+ struct rxe_qp *qp = from_timer(qp, t, retrans_timer);
+
++ pr_debug("%s: fired for qp#%d\n", __func__, qp->elem.index);
++
+ if (qp->valid) {
+ qp->comp.timeout = 1;
+ rxe_run_task(&qp->comp.task, 1);
+@@ -729,11 +731,15 @@ int rxe_completer(void *arg)
+ break;
+
+ case COMPST_RNR_RETRY:
++ /* we come here if we received an RNR NAK */
+ if (qp->comp.rnr_retry > 0) {
+ if (qp->comp.rnr_retry != 7)
+ qp->comp.rnr_retry--;
+
+- qp->req.need_retry = 1;
++ /* don't start a retry flow until the
++ * rnr timer has fired
++ */
++ qp->req.wait_for_rnr_timer = 1;
+ pr_debug("qp#%d set rnr nak timer\n",
+ qp_num(qp));
+ mod_timer(&qp->rnr_nak_timer,
+diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
+index 2ffbe3390668e..2b0edf3604740 100644
+--- a/drivers/infiniband/sw/rxe/rxe_loc.h
++++ b/drivers/infiniband/sw/rxe/rxe_loc.h
+@@ -77,7 +77,7 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
+ enum rxe_mr_lookup_type type);
+ int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
+ int advance_dma_data(struct rxe_dma_info *dma, unsigned int length);
+-int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey);
++int rxe_invalidate_mr(struct rxe_qp *qp, u32 key);
+ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe);
+ int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr);
+ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
+diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
+index 60a31b7187740..76d6498e83f95 100644
+--- a/drivers/infiniband/sw/rxe/rxe_mr.c
++++ b/drivers/infiniband/sw/rxe/rxe_mr.c
+@@ -576,22 +576,22 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
+ return mr;
+ }
+
+-int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
++int rxe_invalidate_mr(struct rxe_qp *qp, u32 key)
+ {
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ struct rxe_mr *mr;
+ int ret;
+
+- mr = rxe_pool_get_index(&rxe->mr_pool, rkey >> 8);
++ mr = rxe_pool_get_index(&rxe->mr_pool, key >> 8);
+ if (!mr) {
+- pr_err("%s: No MR for rkey %#x\n", __func__, rkey);
++ pr_err("%s: No MR for key %#x\n", __func__, key);
+ ret = -EINVAL;
+ goto err;
+ }
+
+- if (rkey != mr->rkey) {
+- pr_err("%s: rkey (%#x) doesn't match mr->rkey (%#x)\n",
+- __func__, rkey, mr->rkey);
++ if (mr->rkey ? (key != mr->rkey) : (key != mr->lkey)) {
++ pr_err("%s: wr key (%#x) doesn't match mr key (%#x)\n",
++ __func__, key, (mr->rkey ? mr->rkey : mr->lkey));
+ ret = -EINVAL;
+ goto err_drop_ref;
+ }
+diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
+index c86b2efd58f25..fc1416fc7feb5 100644
+--- a/drivers/infiniband/sw/rxe/rxe_mw.c
++++ b/drivers/infiniband/sw/rxe/rxe_mw.c
+@@ -69,8 +69,6 @@ int rxe_dealloc_mw(struct ib_mw *ibmw)
+ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ struct rxe_mw *mw, struct rxe_mr *mr)
+ {
+- u32 key = wqe->wr.wr.mw.rkey & 0xff;
+-
+ if (mw->ibmw.type == IB_MW_TYPE_1) {
+ if (unlikely(mw->state != RXE_MW_STATE_VALID)) {
+ pr_err_once(
+@@ -108,11 +106,6 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ }
+ }
+
+- if (unlikely(key == (mw->rkey & 0xff))) {
+- pr_err_once("attempt to bind MW with same key\n");
+- return -EINVAL;
+- }
+-
+ /* remaining checks only apply to a nonzero MR */
+ if (!mr)
+ return 0;
+diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
+index 87066d04ed186..69db289445679 100644
+--- a/drivers/infiniband/sw/rxe/rxe_pool.c
++++ b/drivers/infiniband/sw/rxe/rxe_pool.c
+@@ -140,7 +140,7 @@ void *rxe_alloc(struct rxe_pool *pool)
+
+ err = xa_alloc_cyclic(&pool->xa, &elem->index, elem, pool->limit,
+ &pool->next, GFP_KERNEL);
+- if (err)
++ if (err < 0)
+ goto err_free;
+
+ return obj;
+@@ -168,7 +168,7 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_elem *elem)
+
+ err = xa_alloc_cyclic(&pool->xa, &elem->index, elem, pool->limit,
+ &pool->next, GFP_KERNEL);
+- if (err)
++ if (err < 0)
+ goto err_cnt;
+
+ return 0;
+diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
+index 62acf890af6c1..4889dcae0cc8b 100644
+--- a/drivers/infiniband/sw/rxe/rxe_qp.c
++++ b/drivers/infiniband/sw/rxe/rxe_qp.c
+@@ -186,6 +186,14 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
+
+ spin_lock_init(&qp->state_lock);
+
++ spin_lock_init(&qp->req.task.state_lock);
++ spin_lock_init(&qp->resp.task.state_lock);
++ spin_lock_init(&qp->comp.task.state_lock);
++
++ spin_lock_init(&qp->sq.sq_lock);
++ spin_lock_init(&qp->rq.producer_lock);
++ spin_lock_init(&qp->rq.consumer_lock);
++
+ atomic_set(&qp->ssn, 0);
+ atomic_set(&qp->skb_out, 0);
+ }
+@@ -245,7 +253,6 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
+ qp->req.opcode = -1;
+ qp->comp.opcode = -1;
+
+- spin_lock_init(&qp->sq.sq_lock);
+ skb_queue_head_init(&qp->req_pkts);
+
+ rxe_init_task(rxe, &qp->req.task, qp,
+@@ -296,9 +303,6 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
+ }
+ }
+
+- spin_lock_init(&qp->rq.producer_lock);
+- spin_lock_init(&qp->rq.consumer_lock);
+-
+ skb_queue_head_init(&qp->resp_pkts);
+
+ rxe_init_task(rxe, &qp->resp.task, qp,
+@@ -513,6 +517,7 @@ static void rxe_qp_reset(struct rxe_qp *qp)
+ atomic_set(&qp->ssn, 0);
+ qp->req.opcode = -1;
+ qp->req.need_retry = 0;
++ qp->req.wait_for_rnr_timer = 0;
+ qp->req.noack_pkts = 0;
+ qp->resp.msn = 0;
+ qp->resp.opcode = -1;
+diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
+index 8a1cff80a68e7..90669b3c56afb 100644
+--- a/drivers/infiniband/sw/rxe/rxe_req.c
++++ b/drivers/infiniband/sw/rxe/rxe_req.c
+@@ -103,7 +103,11 @@ void rnr_nak_timer(struct timer_list *t)
+ {
+ struct rxe_qp *qp = from_timer(qp, t, rnr_nak_timer);
+
+- pr_debug("qp#%d rnr nak timer fired\n", qp_num(qp));
++ pr_debug("%s: fired for qp#%d\n", __func__, qp_num(qp));
++
++ /* request a send queue retry */
++ qp->req.need_retry = 1;
++ qp->req.wait_for_rnr_timer = 0;
+ rxe_run_task(&qp->req.task, 1);
+ }
+
+@@ -586,9 +590,11 @@ static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+ wqe->status = IB_WC_SUCCESS;
+ qp->req.wqe_index = queue_next_index(qp->sq.queue, qp->req.wqe_index);
+
+- if ((wqe->wr.send_flags & IB_SEND_SIGNALED) ||
+- qp->sq_sig_type == IB_SIGNAL_ALL_WR)
+- rxe_run_task(&qp->comp.task, 1);
++ /* There is no ack coming for local work requests
++ * which can lead to a deadlock. So go ahead and complete
++ * it now.
++ */
++ rxe_run_task(&qp->comp.task, 1);
+
+ return 0;
+ }
+@@ -624,10 +630,17 @@ next_wqe:
+ qp->req.need_rd_atomic = 0;
+ qp->req.wait_psn = 0;
+ qp->req.need_retry = 0;
++ qp->req.wait_for_rnr_timer = 0;
+ goto exit;
+ }
+
+- if (unlikely(qp->req.need_retry)) {
++ /* we come here if the retransmot timer has fired
++ * or if the rnr timer has fired. If the retransmit
++ * timer fires while we are processing an RNR NAK wait
++ * until the rnr timer has fired before starting the
++ * retry flow
++ */
++ if (unlikely(qp->req.need_retry && !qp->req.wait_for_rnr_timer)) {
+ req_retry(qp);
+ qp->req.need_retry = 0;
+ }
+diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
+index e7eff1ca75e9b..33e8d0547553f 100644
+--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
++++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
+@@ -123,6 +123,7 @@ struct rxe_req_info {
+ int need_rd_atomic;
+ int wait_psn;
+ int need_retry;
++ int wait_for_rnr_timer;
+ int noack_pkts;
+ struct rxe_task task;
+ };
+diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
+index 17f34d584cd9e..f88d2971c2c63 100644
+--- a/drivers/infiniband/sw/siw/siw_cm.c
++++ b/drivers/infiniband/sw/siw/siw_cm.c
+@@ -725,11 +725,11 @@ static int siw_proc_mpareply(struct siw_cep *cep)
+ enum mpa_v2_ctrl mpa_p2p_mode = MPA_V2_RDMA_NO_RTR;
+
+ rv = siw_recv_mpa_rr(cep);
+- if (rv != -EAGAIN)
+- siw_cancel_mpatimer(cep);
+ if (rv)
+ goto out_err;
+
++ siw_cancel_mpatimer(cep);
++
+ rep = &cep->mpa.hdr;
+
+ if (__mpa_rr_revision(rep->params.bits) > MPA_REVISION_2) {
+@@ -895,7 +895,8 @@ static int siw_proc_mpareply(struct siw_cep *cep)
+ }
+
+ out_err:
+- siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
++ if (rv != -EAGAIN)
++ siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
+
+ return rv;
+ }
+diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
+index f8d0bab4424cf..e36036b8f3867 100644
+--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
++++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
+@@ -568,7 +568,7 @@ static void iscsi_iser_session_destroy(struct iscsi_cls_session *cls_session)
+ struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
+
+ iscsi_session_teardown(cls_session);
+- iscsi_host_remove(shost);
++ iscsi_host_remove(shost, false);
+ iscsi_host_free(shost);
+ }
+
+@@ -685,7 +685,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
+ return cls_session;
+
+ remove_host:
+- iscsi_host_remove(shost);
++ iscsi_host_remove(shost, false);
+ free_host:
+ iscsi_host_free(shost);
+ return NULL;
+diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+index c2c860d0c56e9..65bad1dd917b6 100644
+--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
++++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+@@ -740,25 +740,25 @@ struct path_it {
+ struct rtrs_clt_path *(*next_path)(struct path_it *it);
+ };
+
+-/**
+- * list_next_or_null_rr_rcu - get next list element in round-robin fashion.
++/*
++ * rtrs_clt_get_next_path_or_null - get clt path from the list or return NULL
+ * @head: the head for the list.
+- * @ptr: the list head to take the next element from.
+- * @type: the type of the struct this is embedded in.
+- * @memb: the name of the list_head within the struct.
++ * @clt_path: The element to take the next clt_path from.
+ *
+- * Next element returned in round-robin fashion, i.e. head will be skipped,
++ * Next clt path returned in round-robin fashion, i.e. head will be skipped,
+ * but if list is observed as empty, NULL will be returned.
+ *
+- * This primitive may safely run concurrently with the _rcu list-mutation
++ * This function may safely run concurrently with the _rcu list-mutation
+ * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
+ */
+-#define list_next_or_null_rr_rcu(head, ptr, type, memb) \
+-({ \
+- list_next_or_null_rcu(head, ptr, type, memb) ?: \
+- list_next_or_null_rcu(head, READ_ONCE((ptr)->next), \
+- type, memb); \
+-})
++static inline struct rtrs_clt_path *
++rtrs_clt_get_next_path_or_null(struct list_head *head, struct rtrs_clt_path *clt_path)
++{
++ return list_next_or_null_rcu(head, &clt_path->s.entry, typeof(*clt_path), s.entry) ?:
++ list_next_or_null_rcu(head,
++ READ_ONCE((&clt_path->s.entry)->next),
++ typeof(*clt_path), s.entry);
++}
+
+ /**
+ * get_next_path_rr() - Returns path in round-robin fashion.
+@@ -789,10 +789,8 @@ static struct rtrs_clt_path *get_next_path_rr(struct path_it *it)
+ path = list_first_or_null_rcu(&clt->paths_list,
+ typeof(*path), s.entry);
+ else
+- path = list_next_or_null_rr_rcu(&clt->paths_list,
+- &path->s.entry,
+- typeof(*path),
+- s.entry);
++ path = rtrs_clt_get_next_path_or_null(&clt->paths_list, path);
++
+ rcu_assign_pointer(*ppcpu_path, path);
+
+ return path;
+@@ -2277,8 +2275,7 @@ static void rtrs_clt_remove_path_from_arr(struct rtrs_clt_path *clt_path)
+ * removed. If @sess is the last element, then @next is NULL.
+ */
+ rcu_read_lock();
+- next = list_next_or_null_rr_rcu(&clt->paths_list, &clt_path->s.entry,
+- typeof(*next), s.entry);
++ next = rtrs_clt_get_next_path_or_null(&clt->paths_list, clt_path);
+ rcu_read_unlock();
+
+ /*
+diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+index 9a1e5c2ae55c0..ac0df734eba8c 100644
+--- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
++++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+@@ -23,6 +23,17 @@
+ #define RTRS_PROTO_VER_STRING __stringify(RTRS_PROTO_VER_MAJOR) "." \
+ __stringify(RTRS_PROTO_VER_MINOR)
+
++/*
++ * Max IB immediate data size is 2^28 (MAX_IMM_PAYL_BITS)
++ * and the minimum chunk size is 4096 (2^12).
++ * So the maximum sess_queue_depth is 65536 (2^16) in theory.
++ * But mempool_create, create_qp and ib_post_send fail with
++ * "cannot allocate memory" error if sess_queue_depth is too big.
++ * Therefore the pratical max value of sess_queue_depth is
++ * somewhere between 1 and 65534 and it depends on the system.
++ */
++#define MAX_SESS_QUEUE_DEPTH 65535
++
+ enum rtrs_imm_const {
+ MAX_IMM_TYPE_BITS = 4,
+ MAX_IMM_TYPE_MASK = ((1 << MAX_IMM_TYPE_BITS) - 1),
+@@ -46,16 +57,6 @@ enum {
+
+ MAX_PATHS_NUM = 128,
+
+- /*
+- * Max IB immediate data size is 2^28 (MAX_IMM_PAYL_BITS)
+- * and the minimum chunk size is 4096 (2^12).
+- * So the maximum sess_queue_depth is 65536 (2^16) in theory.
+- * But mempool_create, create_qp and ib_post_send fail with
+- * "cannot allocate memory" error if sess_queue_depth is too big.
+- * Therefore the pratical max value of sess_queue_depth is
+- * somewhere between 1 and 65534 and it depends on the system.
+- */
+- MAX_SESS_QUEUE_DEPTH = 65535,
+ MIN_CHUNK_SIZE = 8192,
+
+ RTRS_HB_INTERVAL_MS = 5000,
+diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
+index f86ee1c4b970a..c3036aeac89ed 100644
+--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
++++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
+@@ -565,12 +565,9 @@ static int srpt_refresh_port(struct srpt_port *sport)
+ if (ret)
+ return ret;
+
+- sport->port_guid_id.wwn.priv = sport;
+- srpt_format_guid(sport->port_guid_id.name,
+- sizeof(sport->port_guid_id.name),
++ srpt_format_guid(sport->guid_name, ARRAY_SIZE(sport->guid_name),
+ &sport->gid.global.interface_id);
+- sport->port_gid_id.wwn.priv = sport;
+- snprintf(sport->port_gid_id.name, sizeof(sport->port_gid_id.name),
++ snprintf(sport->gid_name, ARRAY_SIZE(sport->gid_name),
+ "0x%016llx%016llx",
+ be64_to_cpu(sport->gid.global.subnet_prefix),
+ be64_to_cpu(sport->gid.global.interface_id));
+@@ -2314,31 +2311,35 @@ static int srpt_cm_req_recv(struct srpt_device *const sdev,
+ tag_num = ch->rq_size;
+ tag_size = 1; /* ib_srpt does not use se_sess->sess_cmd_map */
+
+- mutex_lock(&sport->port_guid_id.mutex);
+- list_for_each_entry(stpg, &sport->port_guid_id.tpg_list, entry) {
+- if (!IS_ERR_OR_NULL(ch->sess))
+- break;
+- ch->sess = target_setup_session(&stpg->tpg, tag_num,
++ if (sport->guid_id) {
++ mutex_lock(&sport->guid_id->mutex);
++ list_for_each_entry(stpg, &sport->guid_id->tpg_list, entry) {
++ if (!IS_ERR_OR_NULL(ch->sess))
++ break;
++ ch->sess = target_setup_session(&stpg->tpg, tag_num,
+ tag_size, TARGET_PROT_NORMAL,
+ ch->sess_name, ch, NULL);
++ }
++ mutex_unlock(&sport->guid_id->mutex);
+ }
+- mutex_unlock(&sport->port_guid_id.mutex);
+
+- mutex_lock(&sport->port_gid_id.mutex);
+- list_for_each_entry(stpg, &sport->port_gid_id.tpg_list, entry) {
+- if (!IS_ERR_OR_NULL(ch->sess))
+- break;
+- ch->sess = target_setup_session(&stpg->tpg, tag_num,
++ if (sport->gid_id) {
++ mutex_lock(&sport->gid_id->mutex);
++ list_for_each_entry(stpg, &sport->gid_id->tpg_list, entry) {
++ if (!IS_ERR_OR_NULL(ch->sess))
++ break;
++ ch->sess = target_setup_session(&stpg->tpg, tag_num,
+ tag_size, TARGET_PROT_NORMAL, i_port_id,
+ ch, NULL);
+- if (!IS_ERR_OR_NULL(ch->sess))
+- break;
+- /* Retry without leading "0x" */
+- ch->sess = target_setup_session(&stpg->tpg, tag_num,
++ if (!IS_ERR_OR_NULL(ch->sess))
++ break;
++ /* Retry without leading "0x" */
++ ch->sess = target_setup_session(&stpg->tpg, tag_num,
+ tag_size, TARGET_PROT_NORMAL,
+ i_port_id + 2, ch, NULL);
++ }
++ mutex_unlock(&sport->gid_id->mutex);
+ }
+- mutex_unlock(&sport->port_gid_id.mutex);
+
+ if (IS_ERR_OR_NULL(ch->sess)) {
+ WARN_ON_ONCE(ch->sess == NULL);
+@@ -2983,7 +2984,12 @@ static int srpt_release_sport(struct srpt_port *sport)
+ return 0;
+ }
+
+-static struct se_wwn *__srpt_lookup_wwn(const char *name)
++struct port_and_port_id {
++ struct srpt_port *sport;
++ struct srpt_port_id **port_id;
++};
++
++static struct port_and_port_id __srpt_lookup_port(const char *name)
+ {
+ struct ib_device *dev;
+ struct srpt_device *sdev;
+@@ -2998,25 +3004,38 @@ static struct se_wwn *__srpt_lookup_wwn(const char *name)
+ for (i = 0; i < dev->phys_port_cnt; i++) {
+ sport = &sdev->port[i];
+
+- if (strcmp(sport->port_guid_id.name, name) == 0)
+- return &sport->port_guid_id.wwn;
+- if (strcmp(sport->port_gid_id.name, name) == 0)
+- return &sport->port_gid_id.wwn;
++ if (strcmp(sport->guid_name, name) == 0) {
++ kref_get(&sdev->refcnt);
++ return (struct port_and_port_id){
++ sport, &sport->guid_id};
++ }
++ if (strcmp(sport->gid_name, name) == 0) {
++ kref_get(&sdev->refcnt);
++ return (struct port_and_port_id){
++ sport, &sport->gid_id};
++ }
+ }
+ }
+
+- return NULL;
++ return (struct port_and_port_id){};
+ }
+
+-static struct se_wwn *srpt_lookup_wwn(const char *name)
++/**
++ * srpt_lookup_port() - Look up an RDMA port by name
++ * @name: ASCII port name
++ *
++ * Increments the RDMA port reference count if an RDMA port pointer is returned.
++ * The caller must drop that reference count by calling srpt_port_put_ref().
++ */
++static struct port_and_port_id srpt_lookup_port(const char *name)
+ {
+- struct se_wwn *wwn;
++ struct port_and_port_id papi;
+
+ spin_lock(&srpt_dev_lock);
+- wwn = __srpt_lookup_wwn(name);
++ papi = __srpt_lookup_port(name);
+ spin_unlock(&srpt_dev_lock);
+
+- return wwn;
++ return papi;
+ }
+
+ static void srpt_free_srq(struct srpt_device *sdev)
+@@ -3101,6 +3120,18 @@ static int srpt_use_srq(struct srpt_device *sdev, bool use_srq)
+ return ret;
+ }
+
++static void srpt_free_sdev(struct kref *refcnt)
++{
++ struct srpt_device *sdev = container_of(refcnt, typeof(*sdev), refcnt);
++
++ kfree(sdev);
++}
++
++static void srpt_sdev_put(struct srpt_device *sdev)
++{
++ kref_put(&sdev->refcnt, srpt_free_sdev);
++}
++
+ /**
+ * srpt_add_one - InfiniBand device addition callback function
+ * @device: Describes a HCA.
+@@ -3119,6 +3150,7 @@ static int srpt_add_one(struct ib_device *device)
+ if (!sdev)
+ return -ENOMEM;
+
++ kref_init(&sdev->refcnt);
+ sdev->device = device;
+ mutex_init(&sdev->sdev_mutex);
+
+@@ -3182,10 +3214,6 @@ static int srpt_add_one(struct ib_device *device)
+ sport->port_attrib.srp_sq_size = DEF_SRPT_SQ_SIZE;
+ sport->port_attrib.use_srq = false;
+ INIT_WORK(&sport->work, srpt_refresh_port_work);
+- mutex_init(&sport->port_guid_id.mutex);
+- INIT_LIST_HEAD(&sport->port_guid_id.tpg_list);
+- mutex_init(&sport->port_gid_id.mutex);
+- INIT_LIST_HEAD(&sport->port_gid_id.tpg_list);
+
+ ret = srpt_refresh_port(sport);
+ if (ret) {
+@@ -3214,7 +3242,7 @@ err_ring:
+ srpt_free_srq(sdev);
+ ib_dealloc_pd(sdev->pd);
+ free_dev:
+- kfree(sdev);
++ srpt_sdev_put(sdev);
+ pr_info("%s(%s) failed.\n", __func__, dev_name(&device->dev));
+ return ret;
+ }
+@@ -3258,7 +3286,7 @@ static void srpt_remove_one(struct ib_device *device, void *client_data)
+
+ ib_dealloc_pd(sdev->pd);
+
+- kfree(sdev);
++ srpt_sdev_put(sdev);
+ }
+
+ static struct ib_client srpt_client = {
+@@ -3286,10 +3314,10 @@ static struct srpt_port_id *srpt_wwn_to_sport_id(struct se_wwn *wwn)
+ {
+ struct srpt_port *sport = wwn->priv;
+
+- if (wwn == &sport->port_guid_id.wwn)
+- return &sport->port_guid_id;
+- if (wwn == &sport->port_gid_id.wwn)
+- return &sport->port_gid_id;
++ if (sport->guid_id && &sport->guid_id->wwn == wwn)
++ return sport->guid_id;
++ if (sport->gid_id && &sport->gid_id->wwn == wwn)
++ return sport->gid_id;
+ WARN_ON_ONCE(true);
+ return NULL;
+ }
+@@ -3774,7 +3802,31 @@ static struct se_wwn *srpt_make_tport(struct target_fabric_configfs *tf,
+ struct config_group *group,
+ const char *name)
+ {
+- return srpt_lookup_wwn(name) ? : ERR_PTR(-EINVAL);
++ struct port_and_port_id papi = srpt_lookup_port(name);
++ struct srpt_port *sport = papi.sport;
++ struct srpt_port_id *port_id;
++
++ if (!papi.port_id)
++ return ERR_PTR(-EINVAL);
++ if (*papi.port_id) {
++ /* Attempt to create a directory that already exists. */
++ WARN_ON_ONCE(true);
++ return &(*papi.port_id)->wwn;
++ }
++ port_id = kzalloc(sizeof(*port_id), GFP_KERNEL);
++ if (!port_id) {
++ srpt_sdev_put(sport->sdev);
++ return ERR_PTR(-ENOMEM);
++ }
++ mutex_init(&port_id->mutex);
++ INIT_LIST_HEAD(&port_id->tpg_list);
++ port_id->wwn.priv = sport;
++ memcpy(port_id->name, port_id == sport->guid_id ? sport->guid_name :
++ sport->gid_name, ARRAY_SIZE(port_id->name));
++
++ *papi.port_id = port_id;
++
++ return &port_id->wwn;
+ }
+
+ /**
+@@ -3783,6 +3835,18 @@ static struct se_wwn *srpt_make_tport(struct target_fabric_configfs *tf,
+ */
+ static void srpt_drop_tport(struct se_wwn *wwn)
+ {
++ struct srpt_port_id *port_id = container_of(wwn, typeof(*port_id), wwn);
++ struct srpt_port *sport = wwn->priv;
++
++ if (sport->guid_id == port_id)
++ sport->guid_id = NULL;
++ else if (sport->gid_id == port_id)
++ sport->gid_id = NULL;
++ else
++ WARN_ON_ONCE(true);
++
++ srpt_sdev_put(sport->sdev);
++ kfree(port_id);
+ }
+
+ static ssize_t srpt_wwn_version_show(struct config_item *item, char *buf)
+diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h
+index 76e66f630c17a..4c46b301eea18 100644
+--- a/drivers/infiniband/ulp/srpt/ib_srpt.h
++++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
+@@ -376,7 +376,7 @@ struct srpt_tpg {
+ };
+
+ /**
+- * struct srpt_port_id - information about an RDMA port name
++ * struct srpt_port_id - LIO RDMA port information
+ * @mutex: Protects @tpg_list changes.
+ * @tpg_list: TPGs associated with the RDMA port name.
+ * @wwn: WWN associated with the RDMA port name.
+@@ -393,7 +393,7 @@ struct srpt_port_id {
+ };
+
+ /**
+- * struct srpt_port - information associated by SRPT with a single IB port
++ * struct srpt_port - SRPT RDMA port information
+ * @sdev: backpointer to the HCA information.
+ * @mad_agent: per-port management datagram processing information.
+ * @enabled: Whether or not this target port is enabled.
+@@ -402,8 +402,10 @@ struct srpt_port_id {
+ * @lid: cached value of the port's lid.
+ * @gid: cached value of the port's gid.
+ * @work: work structure for refreshing the aforementioned cached values.
+- * @port_guid_id: target port GUID
+- * @port_gid_id: target port GID
++ * @guid_name: port name in GUID format.
++ * @guid_id: LIO target port information for the port name in GUID format.
++ * @gid_name: port name in GID format.
++ * @gid_id: LIO target port information for the port name in GID format.
+ * @port_attrib: Port attributes that can be accessed through configfs.
+ * @refcount: Number of objects associated with this port.
+ * @freed_channels: Completion that will be signaled once @refcount becomes 0.
+@@ -419,8 +421,10 @@ struct srpt_port {
+ u32 lid;
+ union ib_gid gid;
+ struct work_struct work;
+- struct srpt_port_id port_guid_id;
+- struct srpt_port_id port_gid_id;
++ char guid_name[64];
++ struct srpt_port_id *guid_id;
++ char gid_name[64];
++ struct srpt_port_id *gid_id;
+ struct srpt_port_attrib port_attrib;
+ atomic_t refcount;
+ struct completion *freed_channels;
+@@ -430,6 +434,7 @@ struct srpt_port {
+
+ /**
+ * struct srpt_device - information associated by SRPT with a single HCA
++ * @refcnt: Reference count for this device.
+ * @device: Backpointer to the struct ib_device managed by the IB core.
+ * @pd: IB protection domain.
+ * @lkey: L_Key (local key) with write access to all local memory.
+@@ -445,6 +450,7 @@ struct srpt_port {
+ * @port: Information about the ports owned by this HCA.
+ */
+ struct srpt_device {
++ struct kref refcnt;
+ struct ib_device *device;
+ struct ib_pd *pd;
+ u32 lkey;
+diff --git a/drivers/input/serio/gscps2.c b/drivers/input/serio/gscps2.c
+index a9065c6ab5508..da2c67cb86422 100644
+--- a/drivers/input/serio/gscps2.c
++++ b/drivers/input/serio/gscps2.c
+@@ -350,6 +350,10 @@ static int __init gscps2_probe(struct parisc_device *dev)
+ ps2port->port = serio;
+ ps2port->padev = dev;
+ ps2port->addr = ioremap(hpa, GSC_STATUS + 4);
++ if (!ps2port->addr) {
++ ret = -ENOMEM;
++ goto fail_nomem;
++ }
+ spin_lock_init(&ps2port->lock);
+
+ gscps2_reset(ps2port);
+diff --git a/drivers/interconnect/imx/imx.c b/drivers/interconnect/imx/imx.c
+index 249ca25d1d556..4406ec45fa90f 100644
+--- a/drivers/interconnect/imx/imx.c
++++ b/drivers/interconnect/imx/imx.c
+@@ -234,16 +234,16 @@ int imx_icc_register(struct platform_device *pdev,
+ struct device *dev = &pdev->dev;
+ struct icc_onecell_data *data;
+ struct icc_provider *provider;
+- int max_node_id;
++ int num_nodes;
+ int ret;
+
+ /* icc_onecell_data is indexed by node_id, unlike nodes param */
+- max_node_id = get_max_node_id(nodes, nodes_count);
+- data = devm_kzalloc(dev, struct_size(data, nodes, max_node_id),
++ num_nodes = get_max_node_id(nodes, nodes_count) + 1;
++ data = devm_kzalloc(dev, struct_size(data, nodes, num_nodes),
+ GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+- data->num_nodes = max_node_id;
++ data->num_nodes = num_nodes;
+
+ provider = devm_kzalloc(dev, sizeof(*provider), GFP_KERNEL);
+ if (!provider)
+diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+index 4c077c38fbd64..c11d2c2cbb620 100644
+--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
++++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+@@ -750,9 +750,12 @@ static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+ {
+ struct device_node *child;
+
+- for_each_child_of_node(qcom_iommu->dev->of_node, child)
+- if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
++ for_each_child_of_node(qcom_iommu->dev->of_node, child) {
++ if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec")) {
++ of_node_put(child);
+ return true;
++ }
++ }
+
+ return false;
+ }
+diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
+index 71f2018e23fe9..cd4b889d55379 100644
+--- a/drivers/iommu/exynos-iommu.c
++++ b/drivers/iommu/exynos-iommu.c
+@@ -630,7 +630,7 @@ static int exynos_sysmmu_probe(struct platform_device *pdev)
+
+ ret = iommu_device_register(&data->iommu, &exynos_iommu_ops, dev);
+ if (ret)
+- return ret;
++ goto err_iommu_register;
+
+ platform_set_drvdata(pdev, data);
+
+@@ -657,6 +657,10 @@ static int exynos_sysmmu_probe(struct platform_device *pdev)
+ pm_runtime_enable(dev);
+
+ return 0;
++
++err_iommu_register:
++ iommu_device_sysfs_remove(&data->iommu);
++ return ret;
+ }
+
+ static int __maybe_unused exynos_sysmmu_suspend(struct device *dev)
+diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
+index 497c5bd95caf8..2a10c9b540646 100644
+--- a/drivers/iommu/intel/dmar.c
++++ b/drivers/iommu/intel/dmar.c
+@@ -495,7 +495,7 @@ static int dmar_parse_one_rhsa(struct acpi_dmar_header *header, void *arg)
+ if (drhd->reg_base_addr == rhsa->base_address) {
+ int node = pxm_to_node(rhsa->proximity_domain);
+
+- if (!node_online(node))
++ if (node != NUMA_NO_NODE && !node_online(node))
+ node = NUMA_NO_NODE;
+ drhd->iommu->node = node;
+ return 0;
+diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
+index 15edb9a6fcae0..1af26c409cb1d 100644
+--- a/drivers/irqchip/Kconfig
++++ b/drivers/irqchip/Kconfig
+@@ -177,7 +177,7 @@ config MADERA_IRQ
+ config IRQ_MIPS_CPU
+ bool
+ select GENERIC_IRQ_CHIP
+- select GENERIC_IRQ_IPI if SYS_SUPPORTS_MULTITHREADING
++ select GENERIC_IRQ_IPI if SMP && SYS_SUPPORTS_MULTITHREADING
+ select IRQ_DOMAIN
+ select GENERIC_IRQ_EFFECTIVE_AFF_MASK
+
+@@ -310,7 +310,8 @@ config KEYSTONE_IRQ
+
+ config MIPS_GIC
+ bool
+- select GENERIC_IRQ_IPI
++ select GENERIC_IRQ_IPI if SMP
++ select IRQ_DOMAIN_HIERARCHY
+ select MIPS_CM
+
+ config INGENIC_IRQ
+diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
+index ff89b36267dd4..1ba0f1555c805 100644
+--- a/drivers/irqchip/irq-mips-gic.c
++++ b/drivers/irqchip/irq-mips-gic.c
+@@ -52,13 +52,15 @@ static DEFINE_PER_CPU_READ_MOSTLY(unsigned long[GIC_MAX_LONGS], pcpu_masks);
+
+ static DEFINE_SPINLOCK(gic_lock);
+ static struct irq_domain *gic_irq_domain;
+-static struct irq_domain *gic_ipi_domain;
+ static int gic_shared_intrs;
+ static unsigned int gic_cpu_pin;
+ static unsigned int timer_cpu_pin;
+ static struct irq_chip gic_level_irq_controller, gic_edge_irq_controller;
++
++#ifdef CONFIG_GENERIC_IRQ_IPI
+ static DECLARE_BITMAP(ipi_resrv, GIC_MAX_INTRS);
+ static DECLARE_BITMAP(ipi_available, GIC_MAX_INTRS);
++#endif /* CONFIG_GENERIC_IRQ_IPI */
+
+ static struct gic_all_vpes_chip_data {
+ u32 map;
+@@ -472,9 +474,11 @@ static int gic_irq_domain_map(struct irq_domain *d, unsigned int virq,
+ u32 map;
+
+ if (hwirq >= GIC_SHARED_HWIRQ_BASE) {
++#ifdef CONFIG_GENERIC_IRQ_IPI
+ /* verify that shared irqs don't conflict with an IPI irq */
+ if (test_bit(GIC_HWIRQ_TO_SHARED(hwirq), ipi_resrv))
+ return -EBUSY;
++#endif /* CONFIG_GENERIC_IRQ_IPI */
+
+ err = irq_domain_set_hwirq_and_chip(d, virq, hwirq,
+ &gic_level_irq_controller,
+@@ -567,6 +571,8 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
+ .map = gic_irq_domain_map,
+ };
+
++#ifdef CONFIG_GENERIC_IRQ_IPI
++
+ static int gic_ipi_domain_xlate(struct irq_domain *d, struct device_node *ctrlr,
+ const u32 *intspec, unsigned int intsize,
+ irq_hw_number_t *out_hwirq,
+@@ -670,6 +676,48 @@ static const struct irq_domain_ops gic_ipi_domain_ops = {
+ .match = gic_ipi_domain_match,
+ };
+
++static int gic_register_ipi_domain(struct device_node *node)
++{
++ struct irq_domain *gic_ipi_domain;
++ unsigned int v[2], num_ipis;
++
++ gic_ipi_domain = irq_domain_add_hierarchy(gic_irq_domain,
++ IRQ_DOMAIN_FLAG_IPI_PER_CPU,
++ GIC_NUM_LOCAL_INTRS + gic_shared_intrs,
++ node, &gic_ipi_domain_ops, NULL);
++ if (!gic_ipi_domain) {
++ pr_err("Failed to add IPI domain");
++ return -ENXIO;
++ }
++
++ irq_domain_update_bus_token(gic_ipi_domain, DOMAIN_BUS_IPI);
++
++ if (node &&
++ !of_property_read_u32_array(node, "mti,reserved-ipi-vectors", v, 2)) {
++ bitmap_set(ipi_resrv, v[0], v[1]);
++ } else {
++ /*
++ * Reserve 2 interrupts per possible CPU/VP for use as IPIs,
++ * meeting the requirements of arch/mips SMP.
++ */
++ num_ipis = 2 * num_possible_cpus();
++ bitmap_set(ipi_resrv, gic_shared_intrs - num_ipis, num_ipis);
++ }
++
++ bitmap_copy(ipi_available, ipi_resrv, GIC_MAX_INTRS);
++
++ return 0;
++}
++
++#else /* !CONFIG_GENERIC_IRQ_IPI */
++
++static inline int gic_register_ipi_domain(struct device_node *node)
++{
++ return 0;
++}
++
++#endif /* !CONFIG_GENERIC_IRQ_IPI */
++
+ static int gic_cpu_startup(unsigned int cpu)
+ {
+ /* Enable or disable EIC */
+@@ -688,11 +736,12 @@ static int gic_cpu_startup(unsigned int cpu)
+ static int __init gic_of_init(struct device_node *node,
+ struct device_node *parent)
+ {
+- unsigned int cpu_vec, i, gicconfig, v[2], num_ipis;
++ unsigned int cpu_vec, i, gicconfig;
+ unsigned long reserved;
+ phys_addr_t gic_base;
+ struct resource res;
+ size_t gic_len;
++ int ret;
+
+ /* Find the first available CPU vector. */
+ i = 0;
+@@ -734,6 +783,10 @@ static int __init gic_of_init(struct device_node *node,
+ }
+
+ mips_gic_base = ioremap(gic_base, gic_len);
++ if (!mips_gic_base) {
++ pr_err("Failed to ioremap gic_base\n");
++ return -ENOMEM;
++ }
+
+ gicconfig = read_gic_config();
+ gic_shared_intrs = FIELD_GET(GIC_CONFIG_NUMINTERRUPTS, gicconfig);
+@@ -780,30 +833,9 @@ static int __init gic_of_init(struct device_node *node,
+ return -ENXIO;
+ }
+
+- gic_ipi_domain = irq_domain_add_hierarchy(gic_irq_domain,
+- IRQ_DOMAIN_FLAG_IPI_PER_CPU,
+- GIC_NUM_LOCAL_INTRS + gic_shared_intrs,
+- node, &gic_ipi_domain_ops, NULL);
+- if (!gic_ipi_domain) {
+- pr_err("Failed to add IPI domain");
+- return -ENXIO;
+- }
+-
+- irq_domain_update_bus_token(gic_ipi_domain, DOMAIN_BUS_IPI);
+-
+- if (node &&
+- !of_property_read_u32_array(node, "mti,reserved-ipi-vectors", v, 2)) {
+- bitmap_set(ipi_resrv, v[0], v[1]);
+- } else {
+- /*
+- * Reserve 2 interrupts per possible CPU/VP for use as IPIs,
+- * meeting the requirements of arch/mips SMP.
+- */
+- num_ipis = 2 * num_possible_cpus();
+- bitmap_set(ipi_resrv, gic_shared_intrs - num_ipis, num_ipis);
+- }
+-
+- bitmap_copy(ipi_available, ipi_resrv, GIC_MAX_INTRS);
++ ret = gic_register_ipi_domain(node);
++ if (ret)
++ return ret;
+
+ board_bind_eic_interrupt = &gic_bind_eic_interrupt;
+
+diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
+index e362a7471512d..a55d6f6f294bf 100644
+--- a/drivers/md/dm-raid.c
++++ b/drivers/md/dm-raid.c
+@@ -3514,7 +3514,7 @@ static void raid_status(struct dm_target *ti, status_type_t type,
+ {
+ struct raid_set *rs = ti->private;
+ struct mddev *mddev = &rs->md;
+- struct r5conf *conf = mddev->private;
++ struct r5conf *conf = rs_is_raid456(rs) ? mddev->private : NULL;
+ int i, max_nr_stripes = conf ? conf->max_nr_stripes : 0;
+ unsigned long recovery;
+ unsigned int raid_param_cnt = 1; /* at least 1 for chunksize */
+@@ -3824,7 +3824,7 @@ static void attempt_restore_of_faulty_devices(struct raid_set *rs)
+
+ memset(cleared_failed_devices, 0, sizeof(cleared_failed_devices));
+
+- for (i = 0; i < mddev->raid_disks; i++) {
++ for (i = 0; i < rs->raid_disks; i++) {
+ r = &rs->dev[i].rdev;
+ /* HM FIXME: enhance journal device recovery processing */
+ if (test_bit(Journal, &r->flags))
+diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
+index 2db7030aba00b..a27395c8621ff 100644
+--- a/drivers/md/dm-thin-metadata.c
++++ b/drivers/md/dm-thin-metadata.c
+@@ -2045,10 +2045,13 @@ int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd,
+ dm_sm_threshold_fn fn,
+ void *context)
+ {
+- int r;
++ int r = -EINVAL;
+
+ pmd_write_lock_in_core(pmd);
+- r = dm_sm_register_threshold_callback(pmd->metadata_sm, threshold, fn, context);
++ if (!pmd->fail_io) {
++ r = dm_sm_register_threshold_callback(pmd->metadata_sm,
++ threshold, fn, context);
++ }
+ pmd_write_unlock(pmd);
+
+ return r;
+diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
+index 4d25d0e270313..53ac6ae870aca 100644
+--- a/drivers/md/dm-thin.c
++++ b/drivers/md/dm-thin.c
+@@ -3382,8 +3382,10 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
+ calc_metadata_threshold(pt),
+ metadata_low_callback,
+ pool);
+- if (r)
++ if (r) {
++ ti->error = "Error registering metadata threshold";
+ goto out_flags_changed;
++ }
+
+ dm_pool_register_pre_commit_callback(pool->pmd,
+ metadata_pre_commit_callback, pool);
+diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
+index 5630b470ba429..27557b852c944 100644
+--- a/drivers/md/dm-writecache.c
++++ b/drivers/md/dm-writecache.c
+@@ -22,7 +22,7 @@
+
+ #define HIGH_WATERMARK 50
+ #define LOW_WATERMARK 45
+-#define MAX_WRITEBACK_JOBS 0
++#define MAX_WRITEBACK_JOBS min(0x10000000 / PAGE_SIZE, totalram_pages() / 16)
+ #define ENDIO_LATENCY 16
+ #define WRITEBACK_LATENCY 64
+ #define AUTOCOMMIT_BLOCKS_SSD 65536
+@@ -1328,8 +1328,8 @@ enum wc_map_op {
+ WC_MAP_ERROR,
+ };
+
+-static enum wc_map_op writecache_map_remap_origin(struct dm_writecache *wc, struct bio *bio,
+- struct wc_entry *e)
++static void writecache_map_remap_origin(struct dm_writecache *wc, struct bio *bio,
++ struct wc_entry *e)
+ {
+ if (e) {
+ sector_t next_boundary =
+@@ -1337,8 +1337,6 @@ static enum wc_map_op writecache_map_remap_origin(struct dm_writecache *wc, stru
+ if (next_boundary < bio->bi_iter.bi_size >> SECTOR_SHIFT)
+ dm_accept_partial_bio(bio, next_boundary);
+ }
+-
+- return WC_MAP_REMAP_ORIGIN;
+ }
+
+ static enum wc_map_op writecache_map_read(struct dm_writecache *wc, struct bio *bio)
+@@ -1365,14 +1363,16 @@ read_next_block:
+ map_op = WC_MAP_REMAP;
+ }
+ } else {
+- map_op = writecache_map_remap_origin(wc, bio, e);
++ writecache_map_remap_origin(wc, bio, e);
++ wc->stats.reads += (bio->bi_iter.bi_size - wc->block_size) >> wc->block_size_bits;
++ map_op = WC_MAP_REMAP_ORIGIN;
+ }
+
+ return map_op;
+ }
+
+-static enum wc_map_op writecache_bio_copy_ssd(struct dm_writecache *wc, struct bio *bio,
+- struct wc_entry *e, bool search_used)
++static void writecache_bio_copy_ssd(struct dm_writecache *wc, struct bio *bio,
++ struct wc_entry *e, bool search_used)
+ {
+ unsigned bio_size = wc->block_size;
+ sector_t start_cache_sec = cache_sector(wc, e);
+@@ -1412,14 +1412,15 @@ static enum wc_map_op writecache_bio_copy_ssd(struct dm_writecache *wc, struct b
+ bio->bi_iter.bi_sector = start_cache_sec;
+ dm_accept_partial_bio(bio, bio_size >> SECTOR_SHIFT);
+
++ wc->stats.writes += bio->bi_iter.bi_size >> wc->block_size_bits;
++ wc->stats.writes_allocate += (bio->bi_iter.bi_size - wc->block_size) >> wc->block_size_bits;
++
+ if (unlikely(wc->uncommitted_blocks >= wc->autocommit_blocks)) {
+ wc->uncommitted_blocks = 0;
+ queue_work(wc->writeback_wq, &wc->flush_work);
+ } else {
+ writecache_schedule_autocommit(wc);
+ }
+-
+- return WC_MAP_REMAP;
+ }
+
+ static enum wc_map_op writecache_map_write(struct dm_writecache *wc, struct bio *bio)
+@@ -1429,9 +1430,10 @@ static enum wc_map_op writecache_map_write(struct dm_writecache *wc, struct bio
+ do {
+ bool found_entry = false;
+ bool search_used = false;
+- wc->stats.writes++;
+- if (writecache_has_error(wc))
++ if (writecache_has_error(wc)) {
++ wc->stats.writes += bio->bi_iter.bi_size >> wc->block_size_bits;
+ return WC_MAP_ERROR;
++ }
+ e = writecache_find_entry(wc, bio->bi_iter.bi_sector, 0);
+ if (e) {
+ if (!writecache_entry_is_committed(wc, e)) {
+@@ -1455,9 +1457,11 @@ static enum wc_map_op writecache_map_write(struct dm_writecache *wc, struct bio
+ if (unlikely(!e)) {
+ if (!WC_MODE_PMEM(wc) && !found_entry) {
+ direct_write:
+- wc->stats.writes_around++;
+ e = writecache_find_entry(wc, bio->bi_iter.bi_sector, WFE_RETURN_FOLLOWING);
+- return writecache_map_remap_origin(wc, bio, e);
++ writecache_map_remap_origin(wc, bio, e);
++ wc->stats.writes_around += bio->bi_iter.bi_size >> wc->block_size_bits;
++ wc->stats.writes += bio->bi_iter.bi_size >> wc->block_size_bits;
++ return WC_MAP_REMAP_ORIGIN;
+ }
+ wc->stats.writes_blocked_on_freelist++;
+ writecache_wait_on_freelist(wc);
+@@ -1468,10 +1472,13 @@ direct_write:
+ wc->uncommitted_blocks++;
+ wc->stats.writes_allocate++;
+ bio_copy:
+- if (WC_MODE_PMEM(wc))
++ if (WC_MODE_PMEM(wc)) {
+ bio_copy_block(wc, bio, memory_data(wc, e));
+- else
+- return writecache_bio_copy_ssd(wc, bio, e, search_used);
++ wc->stats.writes++;
++ } else {
++ writecache_bio_copy_ssd(wc, bio, e, search_used);
++ return WC_MAP_REMAP;
++ }
+ } while (bio->bi_iter.bi_size);
+
+ if (unlikely(bio->bi_opf & REQ_FUA || wc->uncommitted_blocks >= wc->autocommit_blocks))
+@@ -1506,7 +1513,7 @@ static enum wc_map_op writecache_map_flush(struct dm_writecache *wc, struct bio
+
+ static enum wc_map_op writecache_map_discard(struct dm_writecache *wc, struct bio *bio)
+ {
+- wc->stats.discards++;
++ wc->stats.discards += bio->bi_iter.bi_size >> wc->block_size_bits;
+
+ if (writecache_has_error(wc))
+ return WC_MAP_ERROR;
+diff --git a/drivers/md/dm.c b/drivers/md/dm.c
+index f01d33bc36136..e0e28a3203f82 100644
+--- a/drivers/md/dm.c
++++ b/drivers/md/dm.c
+@@ -2955,6 +2955,11 @@ static int dm_call_pr(struct block_device *bdev, iterate_devices_callout_fn fn,
+ goto out;
+ ti = dm_table_get_target(table, 0);
+
++ if (dm_suspended_md(md)) {
++ ret = -EAGAIN;
++ goto out;
++ }
++
+ ret = -EINVAL;
+ if (!ti->type->iterate_devices)
+ goto out;
+diff --git a/drivers/md/md.c b/drivers/md/md.c
+index f79cab8c77009..134c12e481a57 100644
+--- a/drivers/md/md.c
++++ b/drivers/md/md.c
+@@ -6259,11 +6259,11 @@ static void mddev_detach(struct mddev *mddev)
+ static void __md_stop(struct mddev *mddev)
+ {
+ struct md_personality *pers = mddev->pers;
+- md_bitmap_destroy(mddev);
+ mddev_detach(mddev);
+ /* Ensure ->event_work is done */
+ if (mddev->event_work.func)
+ flush_workqueue(md_misc_wq);
++ md_bitmap_destroy(mddev);
+ spin_lock(&mddev->lock);
+ mddev->pers = NULL;
+ spin_unlock(&mddev->lock);
+diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
+index dfe7d62d3fbdd..e9648d643dd18 100644
+--- a/drivers/md/raid10.c
++++ b/drivers/md/raid10.c
+@@ -2157,9 +2157,12 @@ static int raid10_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
+ int err = 0;
+ int number = rdev->raid_disk;
+ struct md_rdev **rdevp;
+- struct raid10_info *p = conf->mirrors + number;
++ struct raid10_info *p;
+
+ print_conf(conf);
++ if (unlikely(number >= mddev->raid_disks))
++ return 0;
++ p = conf->mirrors + number;
+ if (rdev == p->rdev)
+ rdevp = &p->rdev;
+ else if (rdev == p->replacement)
+diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
+index 2b20aa6c37b1b..c926e5d43820c 100644
+--- a/drivers/media/i2c/Kconfig
++++ b/drivers/media/i2c/Kconfig
+@@ -1178,6 +1178,7 @@ config VIDEO_ISL7998X
+ depends on OF_GPIO
+ select MEDIA_CONTROLLER
+ select VIDEO_V4L2_SUBDEV_API
++ select V4L2_FWNODE
+ help
+ Support for Intersil ISL7998x analog to MIPI-CSI2 or
+ BT.656 decoder.
+diff --git a/drivers/media/pci/sta2x11/Kconfig b/drivers/media/pci/sta2x11/Kconfig
+index a96e170ab04ef..118b922c08c35 100644
+--- a/drivers/media/pci/sta2x11/Kconfig
++++ b/drivers/media/pci/sta2x11/Kconfig
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0-only
+ config STA2X11_VIP
+ tristate "STA2X11 VIP Video For Linux"
+- depends on PCI && VIDEO_DEV && VIRT_TO_BUS && I2C
++ depends on PCI && VIDEO_DEV && I2C
+ depends on STA2X11 || COMPILE_TEST
+ select GPIOLIB if MEDIA_SUBDRV_AUTOSELECT
+ select VIDEO_ADV7180 if MEDIA_SUBDRV_AUTOSELECT
+diff --git a/drivers/media/pci/tw686x/tw686x-core.c b/drivers/media/pci/tw686x/tw686x-core.c
+index 6676e069b515d..384d38754a4b1 100644
+--- a/drivers/media/pci/tw686x/tw686x-core.c
++++ b/drivers/media/pci/tw686x/tw686x-core.c
+@@ -315,13 +315,6 @@ static int tw686x_probe(struct pci_dev *pci_dev,
+
+ spin_lock_init(&dev->lock);
+
+- err = request_irq(pci_dev->irq, tw686x_irq, IRQF_SHARED,
+- dev->name, dev);
+- if (err < 0) {
+- dev_err(&pci_dev->dev, "unable to request interrupt\n");
+- goto iounmap;
+- }
+-
+ timer_setup(&dev->dma_delay_timer, tw686x_dma_delay, 0);
+
+ /*
+@@ -333,18 +326,23 @@ static int tw686x_probe(struct pci_dev *pci_dev,
+ err = tw686x_video_init(dev);
+ if (err) {
+ dev_err(&pci_dev->dev, "can't register video\n");
+- goto free_irq;
++ goto iounmap;
+ }
+
+ err = tw686x_audio_init(dev);
+ if (err)
+ dev_warn(&pci_dev->dev, "can't register audio\n");
+
++ err = request_irq(pci_dev->irq, tw686x_irq, IRQF_SHARED,
++ dev->name, dev);
++ if (err < 0) {
++ dev_err(&pci_dev->dev, "unable to request interrupt\n");
++ goto iounmap;
++ }
++
+ pci_set_drvdata(pci_dev, dev);
+ return 0;
+
+-free_irq:
+- free_irq(pci_dev->irq, dev);
+ iounmap:
+ pci_iounmap(pci_dev, dev->mmio);
+ free_region:
+diff --git a/drivers/media/pci/tw686x/tw686x-video.c b/drivers/media/pci/tw686x/tw686x-video.c
+index b227e9e78ebd0..37a20fe24241f 100644
+--- a/drivers/media/pci/tw686x/tw686x-video.c
++++ b/drivers/media/pci/tw686x/tw686x-video.c
+@@ -1282,8 +1282,10 @@ int tw686x_video_init(struct tw686x_dev *dev)
+ video_set_drvdata(vdev, vc);
+
+ err = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
+- if (err < 0)
++ if (err < 0) {
++ video_device_release(vdev);
+ goto error;
++ }
+ vc->num = vdev->num;
+ }
+
+diff --git a/drivers/media/platform/amphion/vdec.c b/drivers/media/platform/amphion/vdec.c
+index c0dfede11ab74..36450a2e85c96 100644
+--- a/drivers/media/platform/amphion/vdec.c
++++ b/drivers/media/platform/amphion/vdec.c
+@@ -26,8 +26,8 @@
+ #include "vpu_cmds.h"
+ #include "vpu_rpc.h"
+
+-#define VDEC_FRAME_DEPTH 256
+ #define VDEC_MIN_BUFFER_CAP 8
++#define VDEC_MIN_BUFFER_OUT 8
+
+ struct vdec_fs_info {
+ char name[8];
+@@ -63,8 +63,7 @@ struct vdec_t {
+ bool is_source_changed;
+ u32 source_change;
+ u32 drain;
+- u32 ts_pre_count;
+- u32 frame_depth;
++ bool aborting;
+ };
+
+ static const struct vpu_format vdec_formats[] = {
+@@ -106,7 +105,6 @@ static const struct vpu_format vdec_formats[] = {
+ .pixfmt = V4L2_PIX_FMT_VC1_ANNEX_L,
+ .num_planes = 1,
+ .type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE,
+- .flags = V4L2_FMT_FLAG_DYN_RESOLUTION
+ },
+ {
+ .pixfmt = V4L2_PIX_FMT_MPEG2,
+@@ -174,16 +172,6 @@ static int vdec_ctrl_init(struct vpu_inst *inst)
+ return 0;
+ }
+
+-static void vdec_set_last_buffer_dequeued(struct vpu_inst *inst)
+-{
+- struct vdec_t *vdec = inst->priv;
+-
+- if (vdec->eos_received) {
+- if (!vpu_set_last_buffer_dequeued(inst))
+- vdec->eos_received--;
+- }
+-}
+-
+ static void vdec_handle_resolution_change(struct vpu_inst *inst)
+ {
+ struct vdec_t *vdec = inst->priv;
+@@ -230,6 +218,21 @@ static int vdec_update_state(struct vpu_inst *inst, enum vpu_codec_state state,
+ return 0;
+ }
+
++static void vdec_set_last_buffer_dequeued(struct vpu_inst *inst)
++{
++ struct vdec_t *vdec = inst->priv;
++
++ if (inst->state == VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE)
++ return;
++
++ if (vdec->eos_received) {
++ if (!vpu_set_last_buffer_dequeued(inst)) {
++ vdec->eos_received--;
++ vdec_update_state(inst, VPU_CODEC_STATE_DRAIN, 0);
++ }
++ }
++}
++
+ static int vdec_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
+ {
+ strscpy(cap->driver, "amphion-vpu", sizeof(cap->driver));
+@@ -470,7 +473,7 @@ static int vdec_drain(struct vpu_inst *inst)
+ if (!vdec->drain)
+ return 0;
+
+- if (v4l2_m2m_num_src_bufs_ready(inst->fh.m2m_ctx))
++ if (!vpu_is_source_empty(inst))
+ return 0;
+
+ if (!vdec->params.frame_count) {
+@@ -489,6 +492,8 @@ static int vdec_drain(struct vpu_inst *inst)
+
+ static int vdec_cmd_start(struct vpu_inst *inst)
+ {
++ struct vdec_t *vdec = inst->priv;
++
+ switch (inst->state) {
+ case VPU_CODEC_STATE_STARTED:
+ case VPU_CODEC_STATE_DRAIN:
+@@ -499,6 +504,8 @@ static int vdec_cmd_start(struct vpu_inst *inst)
+ break;
+ }
+ vpu_process_capture_buffer(inst);
++ if (vdec->eos_received)
++ vdec_set_last_buffer_dequeued(inst);
+ return 0;
+ }
+
+@@ -589,11 +596,8 @@ static bool vdec_check_ready(struct vpu_inst *inst, unsigned int type)
+ {
+ struct vdec_t *vdec = inst->priv;
+
+- if (V4L2_TYPE_IS_OUTPUT(type)) {
+- if (vdec->ts_pre_count >= vdec->frame_depth)
+- return false;
++ if (V4L2_TYPE_IS_OUTPUT(type))
+ return true;
+- }
+
+ if (vdec->req_frame_count)
+ return true;
+@@ -601,12 +605,21 @@ static bool vdec_check_ready(struct vpu_inst *inst, unsigned int type)
+ return false;
+ }
+
++static struct vb2_v4l2_buffer *vdec_get_src_buffer(struct vpu_inst *inst, u32 count)
++{
++ if (count > 1)
++ vpu_skip_frame(inst, count - 1);
++
++ return vpu_next_src_buf(inst);
++}
++
+ static int vdec_frame_decoded(struct vpu_inst *inst, void *arg)
+ {
+ struct vdec_t *vdec = inst->priv;
+ struct vpu_dec_pic_info *info = arg;
+ struct vpu_vb2_buffer *vpu_buf;
+ struct vb2_v4l2_buffer *vbuf;
++ struct vb2_v4l2_buffer *src_buf;
+ int ret = 0;
+
+ if (!info || info->id >= ARRAY_SIZE(vdec->slots))
+@@ -620,14 +633,21 @@ static int vdec_frame_decoded(struct vpu_inst *inst, void *arg)
+ goto exit;
+ }
+ vbuf = &vpu_buf->m2m_buf.vb;
++ src_buf = vdec_get_src_buffer(inst, info->consumed_count);
++ if (src_buf) {
++ v4l2_m2m_buf_copy_metadata(src_buf, vbuf, true);
++ if (info->consumed_count) {
++ v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx);
++ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE);
++ v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
++ } else {
++ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_DECODED);
++ }
++ }
+ if (vpu_get_buffer_state(vbuf) == VPU_BUF_STATE_DECODED)
+ dev_info(inst->dev, "[%d] buf[%d] has been decoded\n", inst->id, info->id);
+ vpu_set_buffer_state(vbuf, VPU_BUF_STATE_DECODED);
+ vdec->decoded_frame_count++;
+- if (vdec->ts_pre_count >= info->consumed_count)
+- vdec->ts_pre_count -= info->consumed_count;
+- else
+- vdec->ts_pre_count = 0;
+ exit:
+ vpu_inst_unlock(inst);
+
+@@ -683,10 +703,9 @@ static void vdec_buf_done(struct vpu_inst *inst, struct vpu_frame_info *frame)
+ vpu_set_buffer_state(vbuf, VPU_BUF_STATE_READY);
+ vb2_set_plane_payload(&vbuf->vb2_buf, 0, inst->cap_format.sizeimage[0]);
+ vb2_set_plane_payload(&vbuf->vb2_buf, 1, inst->cap_format.sizeimage[1]);
+- vbuf->vb2_buf.timestamp = frame->timestamp;
+ vbuf->field = inst->cap_format.field;
+ vbuf->sequence = sequence;
+- dev_dbg(inst->dev, "[%d][OUTPUT TS]%32lld\n", inst->id, frame->timestamp);
++ dev_dbg(inst->dev, "[%d][OUTPUT TS]%32lld\n", inst->id, vbuf->vb2_buf.timestamp);
+
+ v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
+ vpu_inst_lock(inst);
+@@ -708,7 +727,6 @@ static void vdec_stop_done(struct vpu_inst *inst)
+ vdec->fixed_fmt = false;
+ vdec->params.end_flag = 0;
+ vdec->drain = 0;
+- vdec->ts_pre_count = 0;
+ vdec->params.frame_count = 0;
+ vdec->decoded_frame_count = 0;
+ vdec->display_frame_count = 0;
+@@ -716,6 +734,7 @@ static void vdec_stop_done(struct vpu_inst *inst)
+ vdec->eos_received = 0;
+ vdec->is_source_changed = false;
+ vdec->source_change = 0;
++ inst->total_input_count = 0;
+ vpu_inst_unlock(inst);
+ }
+
+@@ -924,6 +943,9 @@ static int vdec_response_frame(struct vpu_inst *inst, struct vb2_v4l2_buffer *vb
+ if (inst->state != VPU_CODEC_STATE_ACTIVE)
+ return -EINVAL;
+
++ if (vdec->aborting)
++ return -EINVAL;
++
+ if (!vdec->req_frame_count)
+ return -EINVAL;
+
+@@ -1033,6 +1055,8 @@ static void vdec_clear_slots(struct vpu_inst *inst)
+ vpu_buf = vdec->slots[i];
+ vbuf = &vpu_buf->m2m_buf.vb;
+
++ vpu_trace(inst->dev, "clear slot %d\n", i);
++ vdec_response_fs_release(inst, i, vpu_buf->tag);
+ vdec_recycle_buffer(inst, vbuf);
+ vdec->slots[i]->state = VPU_BUF_STATE_IDLE;
+ vdec->slots[i] = NULL;
+@@ -1188,7 +1212,6 @@ static void vdec_event_eos(struct vpu_inst *inst)
+ vdec->eos_received++;
+ vdec->fixed_fmt = false;
+ inst->min_buffer_cap = VDEC_MIN_BUFFER_CAP;
+- vdec_update_state(inst, VPU_CODEC_STATE_DRAIN, 0);
+ vdec_set_last_buffer_dequeued(inst);
+ vpu_inst_unlock(inst);
+ }
+@@ -1244,18 +1267,14 @@ static int vdec_process_output(struct vpu_inst *inst, struct vb2_buffer *vb)
+ if (free_space < vb2_get_plane_payload(vb, 0) + 0x40000)
+ return -ENOMEM;
+
++ vpu_set_buffer_state(vbuf, VPU_BUF_STATE_INUSE);
+ ret = vpu_iface_input_frame(inst, vb);
+ if (ret < 0)
+ return -ENOMEM;
+
+ dev_dbg(inst->dev, "[%d][INPUT TS]%32lld\n", inst->id, vb->timestamp);
+- vdec->ts_pre_count++;
+ vdec->params.frame_count++;
+
+- v4l2_m2m_src_buf_remove_by_buf(inst->fh.m2m_ctx, vbuf);
+- vpu_set_buffer_state(vbuf, VPU_BUF_STATE_IDLE);
+- v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
+-
+ if (vdec->drain)
+ vdec_drain(inst);
+
+@@ -1299,6 +1318,8 @@ static void vdec_abort(struct vpu_inst *inst)
+ int ret;
+
+ vpu_trace(inst->dev, "[%d] state = %d\n", inst->id, inst->state);
++
++ vdec->aborting = true;
+ vpu_iface_add_scode(inst, SCODE_PADDING_ABORT);
+ vdec->params.end_flag = 1;
+ vpu_iface_set_decode_params(inst, &vdec->params, 1);
+@@ -1318,11 +1339,11 @@ static void vdec_abort(struct vpu_inst *inst)
+ vdec->sequence);
+ vdec->params.end_flag = 0;
+ vdec->drain = 0;
+- vdec->ts_pre_count = 0;
+ vdec->params.frame_count = 0;
+ vdec->decoded_frame_count = 0;
+ vdec->display_frame_count = 0;
+ vdec->sequence = 0;
++ vdec->aborting = false;
+ }
+
+ static void vdec_stop(struct vpu_inst *inst, bool free)
+@@ -1470,10 +1491,10 @@ static int vdec_stop_session(struct vpu_inst *inst, u32 type)
+ vdec_update_state(inst, VPU_CODEC_STATE_SEEK, 0);
+ vdec->drain = 0;
+ } else {
+- if (inst->state != VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE)
++ if (inst->state != VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE) {
+ vdec_abort(inst);
+-
+- vdec->eos_received = 0;
++ vdec->eos_received = 0;
++ }
+ vdec_clear_slots(inst);
+ }
+
+@@ -1525,10 +1546,6 @@ static int vdec_get_debug_info(struct vpu_inst *inst, char *str, u32 size, u32 i
+ vdec->drain, vdec->eos_received, vdec->source_change);
+ break;
+ case 8:
+- num = scnprintf(str, size, "ts_pre_count = %d, frame_depth = %d\n",
+- vdec->ts_pre_count, vdec->frame_depth);
+- break;
+- case 9:
+ num = scnprintf(str, size, "fps = %d/%d\n",
+ vdec->codec_info.frame_rate.numerator,
+ vdec->codec_info.frame_rate.denominator);
+@@ -1562,12 +1579,8 @@ static struct vpu_inst_ops vdec_inst_ops = {
+ static void vdec_init(struct file *file)
+ {
+ struct vpu_inst *inst = to_inst(file);
+- struct vdec_t *vdec;
+ struct v4l2_format f;
+
+- vdec = inst->priv;
+- vdec->frame_depth = VDEC_FRAME_DEPTH;
+-
+ memset(&f, 0, sizeof(f));
+ f.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
+ f.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_H264;
+@@ -1612,36 +1625,18 @@ static int vdec_open(struct file *file)
+
+ vdec->fixed_fmt = false;
+ inst->min_buffer_cap = VDEC_MIN_BUFFER_CAP;
++ inst->min_buffer_out = VDEC_MIN_BUFFER_OUT;
+ vdec_init(file);
+
+ return 0;
+ }
+
+-static __poll_t vdec_poll(struct file *file, poll_table *wait)
+-{
+- struct vpu_inst *inst = to_inst(file);
+- struct vb2_queue *src_q, *dst_q;
+- __poll_t ret;
+-
+- ret = v4l2_m2m_fop_poll(file, wait);
+- src_q = v4l2_m2m_get_src_vq(inst->fh.m2m_ctx);
+- dst_q = v4l2_m2m_get_dst_vq(inst->fh.m2m_ctx);
+- if (vb2_is_streaming(src_q) && !vb2_is_streaming(dst_q))
+- ret &= (~EPOLLERR);
+- if (!src_q->error && !dst_q->error &&
+- (vb2_is_streaming(src_q) && list_empty(&src_q->queued_list)) &&
+- (vb2_is_streaming(dst_q) && list_empty(&dst_q->queued_list)))
+- ret &= (~EPOLLERR);
+-
+- return ret;
+-}
+-
+ static const struct v4l2_file_operations vdec_fops = {
+ .owner = THIS_MODULE,
+ .open = vdec_open,
+ .release = vpu_v4l2_close,
+ .unlocked_ioctl = video_ioctl2,
+- .poll = vdec_poll,
++ .poll = v4l2_m2m_fop_poll,
+ .mmap = v4l2_m2m_fop_mmap,
+ };
+
+diff --git a/drivers/media/platform/amphion/vpu.h b/drivers/media/platform/amphion/vpu.h
+index e56b96a7e5d3f..f914de6ed81e9 100644
+--- a/drivers/media/platform/amphion/vpu.h
++++ b/drivers/media/platform/amphion/vpu.h
+@@ -258,6 +258,7 @@ struct vpu_inst {
+ struct vpu_format cap_format;
+ u32 min_buffer_cap;
+ u32 min_buffer_out;
++ u32 total_input_count;
+
+ struct v4l2_rect crop;
+ u32 colorspace;
+diff --git a/drivers/media/platform/amphion/vpu_core.c b/drivers/media/platform/amphion/vpu_core.c
+index 68ad183925fdb..51a764713159a 100644
+--- a/drivers/media/platform/amphion/vpu_core.c
++++ b/drivers/media/platform/amphion/vpu_core.c
+@@ -455,8 +455,13 @@ int vpu_inst_unregister(struct vpu_inst *inst)
+ }
+ vpu_core_check_hang(core);
+ if (core->state == VPU_CORE_HANG && !core->instance_mask) {
++ int err;
++
+ dev_info(core->dev, "reset hang core\n");
+- if (!vpu_core_sw_reset(core)) {
++ mutex_unlock(&core->lock);
++ err = vpu_core_sw_reset(core);
++ mutex_lock(&core->lock);
++ if (!err) {
+ core->state = VPU_CORE_ACTIVE;
+ core->hang_mask = 0;
+ }
+diff --git a/drivers/media/platform/amphion/vpu_malone.c b/drivers/media/platform/amphion/vpu_malone.c
+index 446a9de0cc119..7f591805c48c9 100644
+--- a/drivers/media/platform/amphion/vpu_malone.c
++++ b/drivers/media/platform/amphion/vpu_malone.c
+@@ -609,6 +609,8 @@ static int vpu_malone_set_params(struct vpu_shared_addr *shared,
+ enum vpu_malone_format malone_format;
+
+ malone_format = vpu_malone_format_remap(params->codec_format);
++ if (WARN_ON(malone_format == MALONE_FMT_NULL))
++ return -EINVAL;
+ iface->udata_buffer[instance].base = params->udata.base;
+ iface->udata_buffer[instance].slot_size = params->udata.size;
+
+@@ -1294,6 +1296,8 @@ static int vpu_malone_insert_scode_vc1_l_seq(struct malone_scode_t *scode)
+ int size = 0;
+ u8 rcv_seqhdr[MALONE_VC1_RCV_SEQ_HEADER_LEN];
+
++ if (scode->inst->total_input_count)
++ return 0;
+ scode->need_data = 0;
+
+ ret = vpu_malone_insert_scode_seq(scode, MALONE_CODEC_ID_VC1_SIMPLE, sizeof(rcv_seqhdr));
+@@ -1556,7 +1560,7 @@ int vpu_malone_input_frame(struct vpu_shared_addr *shared,
+ * merge the data to next frame
+ */
+ vbuf = to_vb2_v4l2_buffer(vb);
+- if (vpu_vb_is_codecconfig(vbuf) && (s64)vb->timestamp < 0) {
++ if (vpu_vb_is_codecconfig(vbuf)) {
+ inst->extra_size += size;
+ return 0;
+ }
+diff --git a/drivers/media/platform/amphion/vpu_msgs.c b/drivers/media/platform/amphion/vpu_msgs.c
+index 58502c51ddb37..077644bc1d7c6 100644
+--- a/drivers/media/platform/amphion/vpu_msgs.c
++++ b/drivers/media/platform/amphion/vpu_msgs.c
+@@ -150,7 +150,12 @@ static void vpu_session_handle_eos(struct vpu_inst *inst, struct vpu_rpc_event *
+
+ static void vpu_session_handle_error(struct vpu_inst *inst, struct vpu_rpc_event *pkt)
+ {
+- dev_err(inst->dev, "unsupported stream\n");
++ char *str = (char *)pkt->data;
++
++ if (strlen(str))
++ dev_err(inst->dev, "instance %d firmware error : %s\n", inst->id, str);
++ else
++ dev_err(inst->dev, "instance %d is unsupported stream\n", inst->id);
+ call_void_vop(inst, event_notify, VPU_MSG_ID_UNSUPPORTED, NULL);
+ vpu_v4l2_set_error(inst);
+ }
+diff --git a/drivers/media/platform/amphion/vpu_rpc.h b/drivers/media/platform/amphion/vpu_rpc.h
+index 25119e5e807e1..7eb6f01e6ab5d 100644
+--- a/drivers/media/platform/amphion/vpu_rpc.h
++++ b/drivers/media/platform/amphion/vpu_rpc.h
+@@ -312,11 +312,16 @@ static inline int vpu_iface_input_frame(struct vpu_inst *inst,
+ struct vb2_buffer *vb)
+ {
+ struct vpu_iface_ops *ops = vpu_core_get_iface(inst->core);
++ int ret;
+
+ if (!ops || !ops->input_frame)
+ return -EINVAL;
+
+- return ops->input_frame(inst->core->iface, inst, vb);
++ ret = ops->input_frame(inst->core->iface, inst, vb);
++ if (ret < 0)
++ return ret;
++ inst->total_input_count++;
++ return ret;
+ }
+
+ static inline int vpu_iface_config_memory_resource(struct vpu_inst *inst,
+diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c
+index 9c0704cd57661..0f3726c80967b 100644
+--- a/drivers/media/platform/amphion/vpu_v4l2.c
++++ b/drivers/media/platform/amphion/vpu_v4l2.c
+@@ -127,6 +127,19 @@ int vpu_set_last_buffer_dequeued(struct vpu_inst *inst)
+ return 0;
+ }
+
++bool vpu_is_source_empty(struct vpu_inst *inst)
++{
++ struct v4l2_m2m_buffer *buf = NULL;
++
++ if (!inst->fh.m2m_ctx)
++ return true;
++ v4l2_m2m_for_each_src_buf(inst->fh.m2m_ctx, buf) {
++ if (vpu_get_buffer_state(&buf->vb) == VPU_BUF_STATE_IDLE)
++ return false;
++ }
++ return true;
++}
++
+ const struct vpu_format *vpu_try_fmt_common(struct vpu_inst *inst, struct v4l2_format *f)
+ {
+ struct v4l2_pix_format_mplane *pixmp = &f->fmt.pix_mp;
+@@ -234,6 +247,49 @@ int vpu_process_capture_buffer(struct vpu_inst *inst)
+ return call_vop(inst, process_capture, &vbuf->vb2_buf);
+ }
+
++struct vb2_v4l2_buffer *vpu_next_src_buf(struct vpu_inst *inst)
++{
++ struct vb2_v4l2_buffer *src_buf = v4l2_m2m_next_src_buf(inst->fh.m2m_ctx);
++
++ if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE)
++ return NULL;
++
++ while (vpu_vb_is_codecconfig(src_buf)) {
++ v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx);
++ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE);
++ v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
++
++ src_buf = v4l2_m2m_next_src_buf(inst->fh.m2m_ctx);
++ if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE)
++ return NULL;
++ }
++
++ return src_buf;
++}
++
++void vpu_skip_frame(struct vpu_inst *inst, int count)
++{
++ struct vb2_v4l2_buffer *src_buf;
++ enum vb2_buffer_state state;
++ int i = 0;
++
++ if (count <= 0)
++ return;
++
++ while (i < count) {
++ src_buf = v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx);
++ if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE)
++ return;
++ if (vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_DECODED)
++ state = VB2_BUF_STATE_DONE;
++ else
++ state = VB2_BUF_STATE_ERROR;
++ i++;
++ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE);
++ v4l2_m2m_buf_done(src_buf, state);
++ }
++}
++
+ struct vb2_v4l2_buffer *vpu_find_buf_by_sequence(struct vpu_inst *inst, u32 type, u32 sequence)
+ {
+ struct v4l2_m2m_buffer *buf = NULL;
+@@ -440,10 +496,12 @@ static int vpu_vb2_start_streaming(struct vb2_queue *q, unsigned int count)
+ fmt->sizeimage[1], fmt->bytesperline[1],
+ fmt->sizeimage[2], fmt->bytesperline[2],
+ q->num_buffers);
+- call_void_vop(inst, start, q->type);
+ vb2_clear_last_buffer_dequeued(q);
++ ret = call_vop(inst, start, q->type);
++ if (ret)
++ vpu_vb2_buffers_return(inst, q->type, VB2_BUF_STATE_QUEUED);
+
+- return 0;
++ return ret;
+ }
+
+ static void vpu_vb2_stop_streaming(struct vb2_queue *q)
+diff --git a/drivers/media/platform/amphion/vpu_v4l2.h b/drivers/media/platform/amphion/vpu_v4l2.h
+index 90fa7ea67495b..795ca33a6a507 100644
+--- a/drivers/media/platform/amphion/vpu_v4l2.h
++++ b/drivers/media/platform/amphion/vpu_v4l2.h
+@@ -19,6 +19,8 @@ int vpu_v4l2_close(struct file *file);
+ const struct vpu_format *vpu_try_fmt_common(struct vpu_inst *inst, struct v4l2_format *f);
+ int vpu_process_output_buffer(struct vpu_inst *inst);
+ int vpu_process_capture_buffer(struct vpu_inst *inst);
++struct vb2_v4l2_buffer *vpu_next_src_buf(struct vpu_inst *inst);
++void vpu_skip_frame(struct vpu_inst *inst, int count);
+ struct vb2_v4l2_buffer *vpu_find_buf_by_sequence(struct vpu_inst *inst, u32 type, u32 sequence);
+ struct vb2_v4l2_buffer *vpu_find_buf_by_idx(struct vpu_inst *inst, u32 type, u32 idx);
+ void vpu_v4l2_set_error(struct vpu_inst *inst);
+@@ -27,6 +29,7 @@ int vpu_notify_source_change(struct vpu_inst *inst);
+ int vpu_set_last_buffer_dequeued(struct vpu_inst *inst);
+ void vpu_vb2_buffers_return(struct vpu_inst *inst, unsigned int type, enum vb2_buffer_state state);
+ int vpu_get_num_buffers(struct vpu_inst *inst, u32 type);
++bool vpu_is_source_empty(struct vpu_inst *inst);
+
+ dma_addr_t vpu_get_vb_phy_addr(struct vb2_buffer *vb, u32 plane_no);
+ unsigned int vpu_get_vb_length(struct vb2_buffer *vb, u32 plane_no);
+diff --git a/drivers/media/platform/atmel/atmel-sama7g5-isc.c b/drivers/media/platform/atmel/atmel-sama7g5-isc.c
+index 07a80b08bc545..e292a8d790a38 100644
+--- a/drivers/media/platform/atmel/atmel-sama7g5-isc.c
++++ b/drivers/media/platform/atmel/atmel-sama7g5-isc.c
+@@ -612,11 +612,13 @@ static const struct dev_pm_ops microchip_xisc_dev_pm_ops = {
+ SET_RUNTIME_PM_OPS(xisc_runtime_suspend, xisc_runtime_resume, NULL)
+ };
+
++#if IS_ENABLED(CONFIG_OF)
+ static const struct of_device_id microchip_xisc_of_match[] = {
+ { .compatible = "microchip,sama7g5-isc" },
+ { }
+ };
+ MODULE_DEVICE_TABLE(of, microchip_xisc_of_match);
++#endif
+
+ static struct platform_driver microchip_xisc_driver = {
+ .probe = microchip_xisc_probe,
+diff --git a/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h b/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h
+index 2cb8cecb30771..b810c96695c83 100644
+--- a/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h
++++ b/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h
+@@ -40,12 +40,14 @@ struct mdp_ipi_init {
+ * @ipi_id : IPI_MDP
+ * @ap_inst : AP mtk_mdp_vpu address
+ * @vpu_inst_addr : VPU MDP instance address
++ * @padding : Alignment padding
+ */
+ struct mdp_ipi_comm {
+ uint32_t msg_id;
+ uint32_t ipi_id;
+ uint64_t ap_inst;
+ uint32_t vpu_inst_addr;
++ uint32_t padding;
+ };
+
+ /**
+diff --git a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
+index c8ee5e2b4f699..54489f64533e7 100644
+--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
++++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
+@@ -112,8 +112,6 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
+ {
+ struct mtk_q_data *q_data;
+
+- ctx->dev->vdec_pdata->init_vdec_params(ctx);
+-
+ ctx->m2m_ctx->q_lock = &ctx->dev->dev_mutex;
+ ctx->fh.m2m_ctx = ctx->m2m_ctx;
+ ctx->fh.ctrl_handler = &ctx->ctrl_hdl;
+@@ -196,6 +194,11 @@ static int vidioc_vdec_querycap(struct file *file, void *priv,
+ static int vidioc_vdec_subscribe_evt(struct v4l2_fh *fh,
+ const struct v4l2_event_subscription *sub)
+ {
++ struct mtk_vcodec_ctx *ctx = fh_to_ctx(fh);
++
++ if (ctx->dev->vdec_pdata->uses_stateless_api)
++ return v4l2_ctrl_subscribe_event(fh, sub);
++
+ switch (sub->type) {
+ case V4L2_EVENT_EOS:
+ return v4l2_event_subscribe(fh, sub, 2, NULL);
+diff --git a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
+index fe7b2f1739b15..ff8f2a4829a32 100644
+--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
++++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
+@@ -211,9 +211,12 @@ static int fops_vcodec_open(struct file *file)
+
+ dev->dec_capability =
+ mtk_vcodec_fw_get_vdec_capa(dev->fw_handler);
++
+ mtk_v4l2_debug(0, "decoder capability %x", dev->dec_capability);
+ }
+
++ ctx->dev->vdec_pdata->init_vdec_params(ctx);
++
+ list_add(&ctx->list, &dev->ctx_list);
+
+ mutex_unlock(&dev->dev_mutex);
+@@ -391,6 +394,8 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
+ mtk_v4l2_err("Main device of_platform_populate failed.");
+ goto err_reg_cont;
+ }
++ } else {
++ set_bit(MTK_VDEC_CORE, dev->subdev_bitmap);
+ }
+
+ ret = video_register_device(vfd_dec, VFL_TYPE_VIDEO, -1);
+diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c
+index 29c604b1b1790..718b7b08f93e0 100644
+--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c
++++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c
+@@ -79,6 +79,11 @@ void mxc_jpeg_enable_irq(void __iomem *reg, int slot)
+ writel(0xFFFFFFFF, reg + MXC_SLOT_OFFSET(slot, SLOT_IRQ_EN));
+ }
+
++void mxc_jpeg_disable_irq(void __iomem *reg, int slot)
++{
++ writel(0x0, reg + MXC_SLOT_OFFSET(slot, SLOT_IRQ_EN));
++}
++
+ void mxc_jpeg_sw_reset(void __iomem *reg)
+ {
+ /*
+diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
+index ae70d3a0dc243..bf4e1973a0661 100644
+--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
++++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
+@@ -53,10 +53,10 @@
+ #define CAST_REC_REGS_SEL CAST_STATUS4
+ #define CAST_LUMTH CAST_STATUS5
+ #define CAST_CHRTH CAST_STATUS6
+-#define CAST_NOMFRSIZE_LO CAST_STATUS7
+-#define CAST_NOMFRSIZE_HI CAST_STATUS8
+-#define CAST_OFBSIZE_LO CAST_STATUS9
+-#define CAST_OFBSIZE_HI CAST_STATUS10
++#define CAST_NOMFRSIZE_LO CAST_STATUS16
++#define CAST_NOMFRSIZE_HI CAST_STATUS17
++#define CAST_OFBSIZE_LO CAST_STATUS18
++#define CAST_OFBSIZE_HI CAST_STATUS19
+
+ #define MXC_MAX_SLOTS 1 /* TODO use all 4 slots*/
+ /* JPEG-Decoder Wrapper Slot Registers 0..3 */
+@@ -125,6 +125,7 @@ u32 mxc_jpeg_get_offset(void __iomem *reg, int slot);
+ void mxc_jpeg_enable_slot(void __iomem *reg, int slot);
+ void mxc_jpeg_set_l_endian(void __iomem *reg, int le);
+ void mxc_jpeg_enable_irq(void __iomem *reg, int slot);
++void mxc_jpeg_disable_irq(void __iomem *reg, int slot);
+ int mxc_jpeg_set_input(void __iomem *reg, u32 in_buf, u32 bufsize);
+ int mxc_jpeg_set_output(void __iomem *reg, u16 out_pitch, u32 out_buf,
+ u16 w, u16 h);
+diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
+index d1ec1f4b506b8..55c4b29d3e4eb 100644
+--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
++++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
+@@ -82,6 +82,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
+ .h_align = 3,
+ .v_align = 3,
+ .flags = MXC_JPEG_FMT_TYPE_RAW,
++ .precision = 8,
+ },
+ {
+ .name = "ARGB", /* ARGBARGB packed format */
+@@ -93,6 +94,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
+ .h_align = 3,
+ .v_align = 3,
+ .flags = MXC_JPEG_FMT_TYPE_RAW,
++ .precision = 8,
+ },
+ {
+ .name = "YUV420", /* 1st plane = Y, 2nd plane = UV */
+@@ -104,6 +106,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
+ .h_align = 4,
+ .v_align = 4,
+ .flags = MXC_JPEG_FMT_TYPE_RAW,
++ .precision = 8,
+ },
+ {
+ .name = "YUV422", /* YUYV */
+@@ -115,6 +118,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
+ .h_align = 4,
+ .v_align = 3,
+ .flags = MXC_JPEG_FMT_TYPE_RAW,
++ .precision = 8,
+ },
+ {
+ .name = "YUV444", /* YUVYUV */
+@@ -126,6 +130,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
+ .h_align = 3,
+ .v_align = 3,
+ .flags = MXC_JPEG_FMT_TYPE_RAW,
++ .precision = 8,
+ },
+ {
+ .name = "Gray", /* Gray (Y8/Y12) or Single Comp */
+@@ -137,6 +142,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
+ .h_align = 3,
+ .v_align = 3,
+ .flags = MXC_JPEG_FMT_TYPE_RAW,
++ .precision = 8,
+ },
+ };
+
+@@ -309,6 +315,9 @@ struct mxc_jpeg_src_buf {
+ /* mxc-jpeg specific */
+ bool dht_needed;
+ bool jpeg_parse_error;
++ const struct mxc_jpeg_fmt *fmt;
++ int w;
++ int h;
+ };
+
+ static inline struct mxc_jpeg_src_buf *vb2_to_mxc_buf(struct vb2_buffer *vb)
+@@ -321,6 +330,9 @@ static unsigned int debug;
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Debug level (0-3)");
+
++static void mxc_jpeg_bytesperline(struct mxc_jpeg_q_data *q, u32 precision);
++static void mxc_jpeg_sizeimage(struct mxc_jpeg_q_data *q);
++
+ static void _bswap16(u16 *a)
+ {
+ *a = ((*a & 0x00FF) << 8) | ((*a & 0xFF00) >> 8);
+@@ -508,6 +520,7 @@ static bool mxc_jpeg_alloc_slot_data(struct mxc_jpeg_dev *jpeg,
+ GFP_ATOMIC);
+ if (!cfg_stm)
+ goto err;
++ memset(cfg_stm, 0, MXC_JPEG_MAX_CFG_STREAM);
+ jpeg->slot_data[slot].cfg_stream_vaddr = cfg_stm;
+
+ skip_alloc:
+@@ -546,6 +559,18 @@ static void mxc_jpeg_free_slot_data(struct mxc_jpeg_dev *jpeg,
+ jpeg->slot_data[slot].used = false;
+ }
+
++static void mxc_jpeg_check_and_set_last_buffer(struct mxc_jpeg_ctx *ctx,
++ struct vb2_v4l2_buffer *src_buf,
++ struct vb2_v4l2_buffer *dst_buf)
++{
++ if (v4l2_m2m_is_last_draining_src_buf(ctx->fh.m2m_ctx, src_buf)) {
++ dst_buf->flags |= V4L2_BUF_FLAG_LAST;
++ v4l2_m2m_mark_stopped(ctx->fh.m2m_ctx);
++ notify_eos(ctx);
++ ctx->header_parsed = false;
++ }
++}
++
+ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
+ {
+ struct mxc_jpeg_dev *jpeg = priv;
+@@ -568,15 +593,8 @@ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
+ dev_dbg(dev, "Irq %d on slot %d.\n", irq, slot);
+
+ ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev);
+- if (!ctx) {
+- dev_err(dev,
+- "Instance released before the end of transaction.\n");
+- /* soft reset only resets internal state, not registers */
+- mxc_jpeg_sw_reset(reg);
+- /* clear all interrupts */
+- writel(0xFFFFFFFF, reg + MXC_SLOT_OFFSET(slot, SLOT_STATUS));
++ if (WARN_ON(!ctx))
+ goto job_unlock;
+- }
+
+ if (slot != ctx->slot) {
+ /* TODO investigate when adding multi-instance support */
+@@ -620,6 +638,7 @@ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
+ dev_dbg(dev, "Decoder DHT cfg finished. Start decoding...\n");
+ goto job_unlock;
+ }
++
+ if (jpeg->mode == MXC_JPEG_ENCODE) {
+ payload = readl(reg + MXC_SLOT_OFFSET(slot, SLOT_BUF_PTR));
+ vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload);
+@@ -647,7 +666,9 @@ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
+ buf_state = VB2_BUF_STATE_DONE;
+
+ buffers_done:
++ mxc_jpeg_disable_irq(reg, ctx->slot);
+ jpeg->slot_data[slot].used = false; /* unused, but don't free */
++ mxc_jpeg_check_and_set_last_buffer(ctx, src_buf, dst_buf);
+ v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+ v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+ v4l2_m2m_buf_done(src_buf, buf_state);
+@@ -743,7 +764,13 @@ static unsigned int mxc_jpeg_setup_cfg_stream(void *cfg_stream_vaddr,
+ u32 fourcc,
+ u16 w, u16 h)
+ {
+- unsigned int offset = 0;
++ /*
++ * There is a hardware issue that first 128 bytes of configuration data
++ * can't be loaded correctly.
++ * To avoid this issue, we need to write the configuration from
++ * an offset which should be no less than 0x80 (128 bytes).
++ */
++ unsigned int offset = 0x80;
+ u8 *cfg = (u8 *)cfg_stream_vaddr;
+ struct mxc_jpeg_sof *sof;
+ struct mxc_jpeg_sos *sos;
+@@ -875,8 +902,8 @@ static void mxc_jpeg_config_enc_desc(struct vb2_buffer *out_buf,
+ jpeg->slot_data[slot].cfg_stream_size =
+ mxc_jpeg_setup_cfg_stream(cfg_stream_vaddr,
+ q_data->fmt->fourcc,
+- q_data->w_adjusted,
+- q_data->h_adjusted);
++ q_data->w,
++ q_data->h);
+
+ /* chain the config descriptor with the encoding descriptor */
+ cfg_desc->next_descpt_ptr = desc_handle | MXC_NXT_DESCPT_EN;
+@@ -916,6 +943,67 @@ static void mxc_jpeg_config_enc_desc(struct vb2_buffer *out_buf,
+ mxc_jpeg_set_desc(cfg_desc_handle, reg, slot);
+ }
+
++static bool mxc_jpeg_source_change(struct mxc_jpeg_ctx *ctx,
++ struct mxc_jpeg_src_buf *jpeg_src_buf)
++{
++ struct device *dev = ctx->mxc_jpeg->dev;
++ struct mxc_jpeg_q_data *q_data_cap;
++
++ if (!jpeg_src_buf->fmt)
++ return false;
++
++ q_data_cap = mxc_jpeg_get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
++ if (q_data_cap->fmt != jpeg_src_buf->fmt ||
++ q_data_cap->w != jpeg_src_buf->w ||
++ q_data_cap->h != jpeg_src_buf->h) {
++ dev_dbg(dev, "Detected jpeg res=(%dx%d)->(%dx%d), pixfmt=%c%c%c%c\n",
++ q_data_cap->w, q_data_cap->h,
++ jpeg_src_buf->w, jpeg_src_buf->h,
++ (jpeg_src_buf->fmt->fourcc & 0xff),
++ (jpeg_src_buf->fmt->fourcc >> 8) & 0xff,
++ (jpeg_src_buf->fmt->fourcc >> 16) & 0xff,
++ (jpeg_src_buf->fmt->fourcc >> 24) & 0xff);
++
++ /*
++ * set-up the capture queue with the pixelformat and resolution
++ * detected from the jpeg output stream
++ */
++ q_data_cap->w = jpeg_src_buf->w;
++ q_data_cap->h = jpeg_src_buf->h;
++ q_data_cap->fmt = jpeg_src_buf->fmt;
++ q_data_cap->w_adjusted = q_data_cap->w;
++ q_data_cap->h_adjusted = q_data_cap->h;
++
++ /*
++ * align up the resolution for CAST IP,
++ * but leave the buffer resolution unchanged
++ */
++ v4l_bound_align_image(&q_data_cap->w_adjusted,
++ q_data_cap->w_adjusted, /* adjust up */
++ MXC_JPEG_MAX_WIDTH,
++ q_data_cap->fmt->h_align,
++ &q_data_cap->h_adjusted,
++ q_data_cap->h_adjusted, /* adjust up */
++ MXC_JPEG_MAX_HEIGHT,
++ 0,
++ 0);
++
++ /* setup bytesperline/sizeimage for capture queue */
++ mxc_jpeg_bytesperline(q_data_cap, jpeg_src_buf->fmt->precision);
++ mxc_jpeg_sizeimage(q_data_cap);
++ notify_src_chg(ctx);
++ ctx->source_change = 1;
++ }
++ return ctx->source_change ? true : false;
++}
++
++static int mxc_jpeg_job_ready(void *priv)
++{
++ struct mxc_jpeg_ctx *ctx = priv;
++
++ return ctx->source_change ? 0 : 1;
++}
++
+ static void mxc_jpeg_device_run(void *priv)
+ {
+ struct mxc_jpeg_ctx *ctx = priv;
+@@ -954,6 +1042,7 @@ static void mxc_jpeg_device_run(void *priv)
+ jpeg_src_buf->jpeg_parse_error = true;
+ }
+ if (jpeg_src_buf->jpeg_parse_error) {
++ mxc_jpeg_check_and_set_last_buffer(ctx, src_buf, dst_buf);
+ v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+ v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+ v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
+@@ -963,6 +1052,13 @@ static void mxc_jpeg_device_run(void *priv)
+
+ return;
+ }
++ if (ctx->mxc_jpeg->mode == MXC_JPEG_DECODE) {
++ if (ctx->source_change || mxc_jpeg_source_change(ctx, jpeg_src_buf)) {
++ spin_unlock_irqrestore(&ctx->mxc_jpeg->hw_lock, flags);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ return;
++ }
++ }
+
+ mxc_jpeg_enable(reg);
+ mxc_jpeg_set_l_endian(reg, 1);
+@@ -997,44 +1093,33 @@ end:
+ spin_unlock_irqrestore(&ctx->mxc_jpeg->hw_lock, flags);
+ }
+
+-static void mxc_jpeg_set_last_buffer_dequeued(struct mxc_jpeg_ctx *ctx)
+-{
+- struct vb2_queue *q;
+-
+- ctx->stopped = 1;
+- q = v4l2_m2m_get_dst_vq(ctx->fh.m2m_ctx);
+- if (!list_empty(&q->done_list))
+- return;
+-
+- q->last_buffer_dequeued = true;
+- wake_up(&q->done_wq);
+- ctx->stopped = 0;
+-}
+-
+ static int mxc_jpeg_decoder_cmd(struct file *file, void *priv,
+ struct v4l2_decoder_cmd *cmd)
+ {
+ struct v4l2_fh *fh = file->private_data;
+ struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(fh);
+- struct device *dev = ctx->mxc_jpeg->dev;
+ int ret;
+
+ ret = v4l2_m2m_ioctl_try_decoder_cmd(file, fh, cmd);
+ if (ret < 0)
+ return ret;
+
+- if (cmd->cmd == V4L2_DEC_CMD_STOP) {
+- dev_dbg(dev, "Received V4L2_DEC_CMD_STOP");
+- if (v4l2_m2m_num_src_bufs_ready(fh->m2m_ctx) == 0) {
+- /* No more src bufs, notify app EOS */
+- notify_eos(ctx);
+- mxc_jpeg_set_last_buffer_dequeued(ctx);
+- } else {
+- /* will send EOS later*/
+- ctx->stopping = 1;
+- }
++ if (!vb2_is_streaming(v4l2_m2m_get_src_vq(fh->m2m_ctx)))
++ return 0;
++
++ ret = v4l2_m2m_ioctl_decoder_cmd(file, priv, cmd);
++ if (ret < 0)
++ return ret;
++
++ if (cmd->cmd == V4L2_DEC_CMD_STOP &&
++ v4l2_m2m_has_stopped(fh->m2m_ctx)) {
++ notify_eos(ctx);
++ ctx->header_parsed = false;
+ }
+
++ if (cmd->cmd == V4L2_DEC_CMD_START &&
++ v4l2_m2m_has_stopped(fh->m2m_ctx))
++ vb2_clear_last_buffer_dequeued(&fh->m2m_ctx->cap_q_ctx.q);
+ return 0;
+ }
+
+@@ -1043,24 +1128,27 @@ static int mxc_jpeg_encoder_cmd(struct file *file, void *priv,
+ {
+ struct v4l2_fh *fh = file->private_data;
+ struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(fh);
+- struct device *dev = ctx->mxc_jpeg->dev;
+ int ret;
+
+ ret = v4l2_m2m_ioctl_try_encoder_cmd(file, fh, cmd);
+ if (ret < 0)
+ return ret;
+
+- if (cmd->cmd == V4L2_ENC_CMD_STOP) {
+- dev_dbg(dev, "Received V4L2_ENC_CMD_STOP");
+- if (v4l2_m2m_num_src_bufs_ready(fh->m2m_ctx) == 0) {
+- /* No more src bufs, notify app EOS */
+- notify_eos(ctx);
+- mxc_jpeg_set_last_buffer_dequeued(ctx);
+- } else {
+- /* will send EOS later*/
+- ctx->stopping = 1;
+- }
+- }
++ if (!vb2_is_streaming(v4l2_m2m_get_src_vq(fh->m2m_ctx)) ||
++ !vb2_is_streaming(v4l2_m2m_get_dst_vq(fh->m2m_ctx)))
++ return 0;
++
++ ret = v4l2_m2m_ioctl_encoder_cmd(file, fh, cmd);
++ if (ret < 0)
++ return 0;
++
++ if (cmd->cmd == V4L2_ENC_CMD_STOP &&
++ v4l2_m2m_has_stopped(fh->m2m_ctx))
++ notify_eos(ctx);
++
++ if (cmd->cmd == V4L2_ENC_CMD_START &&
++ v4l2_m2m_has_stopped(fh->m2m_ctx))
++ vb2_clear_last_buffer_dequeued(&fh->m2m_ctx->cap_q_ctx.q);
+
+ return 0;
+ }
+@@ -1073,16 +1161,28 @@ static int mxc_jpeg_queue_setup(struct vb2_queue *q,
+ {
+ struct mxc_jpeg_ctx *ctx = vb2_get_drv_priv(q);
+ struct mxc_jpeg_q_data *q_data = NULL;
++ struct mxc_jpeg_q_data tmp_q;
+ int i;
+
+ q_data = mxc_jpeg_get_q_data(ctx, q->type);
+ if (!q_data)
+ return -EINVAL;
+
++ tmp_q.fmt = q_data->fmt;
++ tmp_q.w = q_data->w_adjusted;
++ tmp_q.h = q_data->h_adjusted;
++ for (i = 0; i < MXC_JPEG_MAX_PLANES; i++) {
++ tmp_q.bytesperline[i] = q_data->bytesperline[i];
++ tmp_q.sizeimage[i] = q_data->sizeimage[i];
++ }
++ mxc_jpeg_sizeimage(&tmp_q);
++ for (i = 0; i < MXC_JPEG_MAX_PLANES; i++)
++ tmp_q.sizeimage[i] = max(tmp_q.sizeimage[i], q_data->sizeimage[i]);
++
+ /* Handle CREATE_BUFS situation - *nplanes != 0 */
+ if (*nplanes) {
+ for (i = 0; i < *nplanes; i++) {
+- if (sizes[i] < q_data->sizeimage[i])
++ if (sizes[i] < tmp_q.sizeimage[i])
+ return -EINVAL;
+ }
+ return 0;
+@@ -1091,7 +1191,7 @@ static int mxc_jpeg_queue_setup(struct vb2_queue *q,
+ /* Handle REQBUFS situation */
+ *nplanes = q_data->fmt->colplanes;
+ for (i = 0; i < *nplanes; i++)
+- sizes[i] = q_data->sizeimage[i];
++ sizes[i] = tmp_q.sizeimage[i];
+
+ return 0;
+ }
+@@ -1102,6 +1202,10 @@ static int mxc_jpeg_start_streaming(struct vb2_queue *q, unsigned int count)
+ struct mxc_jpeg_q_data *q_data = mxc_jpeg_get_q_data(ctx, q->type);
+ int ret;
+
++ v4l2_m2m_update_start_streaming_state(ctx->fh.m2m_ctx, q);
++
++ if (ctx->mxc_jpeg->mode == MXC_JPEG_DECODE && V4L2_TYPE_IS_CAPTURE(q->type))
++ ctx->source_change = 0;
+ dev_dbg(ctx->mxc_jpeg->dev, "Start streaming ctx=%p", ctx);
+ q_data->sequence = 0;
+
+@@ -1131,11 +1235,15 @@ static void mxc_jpeg_stop_streaming(struct vb2_queue *q)
+ break;
+ v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
+ }
+- pm_runtime_put_sync(&ctx->mxc_jpeg->pdev->dev);
+- if (V4L2_TYPE_IS_OUTPUT(q->type)) {
+- ctx->stopping = 0;
+- ctx->stopped = 0;
++
++ v4l2_m2m_update_stop_streaming_state(ctx->fh.m2m_ctx, q);
++ if (V4L2_TYPE_IS_OUTPUT(q->type) &&
++ v4l2_m2m_has_stopped(ctx->fh.m2m_ctx)) {
++ notify_eos(ctx);
++ ctx->header_parsed = false;
+ }
++
++ pm_runtime_put_sync(&ctx->mxc_jpeg->pdev->dev);
+ }
+
+ static int mxc_jpeg_valid_comp_id(struct device *dev,
+@@ -1175,14 +1283,17 @@ static u32 mxc_jpeg_get_image_format(struct device *dev,
+
+ for (i = 0; i < MXC_JPEG_NUM_FORMATS; i++)
+ if (mxc_formats[i].subsampling == header->frame.subsampling &&
+- mxc_formats[i].nc == header->frame.num_components) {
++ mxc_formats[i].nc == header->frame.num_components &&
++ mxc_formats[i].precision == header->frame.precision) {
+ fourcc = mxc_formats[i].fourcc;
+ break;
+ }
+ if (fourcc == 0) {
+- dev_err(dev, "Could not identify image format nc=%d, subsampling=%d\n",
++ dev_err(dev,
++ "Could not identify image format nc=%d, subsampling=%d, precision=%d\n",
+ header->frame.num_components,
+- header->frame.subsampling);
++ header->frame.subsampling,
++ header->frame.precision);
+ return fourcc;
+ }
+ /*
+@@ -1200,26 +1311,29 @@ static u32 mxc_jpeg_get_image_format(struct device *dev,
+ return fourcc;
+ }
+
+-static void mxc_jpeg_bytesperline(struct mxc_jpeg_q_data *q,
+- u32 precision)
++static void mxc_jpeg_bytesperline(struct mxc_jpeg_q_data *q, u32 precision)
+ {
+ /* Bytes distance between the leftmost pixels in two adjacent lines */
+ if (q->fmt->fourcc == V4L2_PIX_FMT_JPEG) {
+ /* bytesperline unused for compressed formats */
+ q->bytesperline[0] = 0;
+ q->bytesperline[1] = 0;
+- } else if (q->fmt->fourcc == V4L2_PIX_FMT_NV12M) {
++ } else if (q->fmt->subsampling == V4L2_JPEG_CHROMA_SUBSAMPLING_420) {
+ /* When the image format is planar the bytesperline value
+ * applies to the first plane and is divided by the same factor
+ * as the width field for the other planes
+ */
+- q->bytesperline[0] = q->w * (precision / 8) *
+- (q->fmt->depth / 8);
++ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8);
+ q->bytesperline[1] = q->bytesperline[0];
++ } else if (q->fmt->subsampling == V4L2_JPEG_CHROMA_SUBSAMPLING_422) {
++ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8) * 2;
++ q->bytesperline[1] = 0;
++ } else if (q->fmt->subsampling == V4L2_JPEG_CHROMA_SUBSAMPLING_444) {
++ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8) * q->fmt->nc;
++ q->bytesperline[1] = 0;
+ } else {
+- /* single plane formats */
+- q->bytesperline[0] = q->w * (precision / 8) *
+- (q->fmt->depth / 8);
++ /* grayscale */
++ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8);
+ q->bytesperline[1] = 0;
+ }
+ }
+@@ -1245,17 +1359,17 @@ static void mxc_jpeg_sizeimage(struct mxc_jpeg_q_data *q)
+ }
+ }
+
+-static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
+- u8 *src_addr, u32 size, bool *dht_needed)
++static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx, struct vb2_buffer *vb)
+ {
+ struct device *dev = ctx->mxc_jpeg->dev;
+- struct mxc_jpeg_q_data *q_data_out, *q_data_cap;
+- enum v4l2_buf_type cap_type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
+- bool src_chg = false;
++ struct mxc_jpeg_q_data *q_data_out;
+ u32 fourcc;
+ struct v4l2_jpeg_header header;
+ struct mxc_jpeg_sof *psof = NULL;
+ struct mxc_jpeg_sos *psos = NULL;
++ struct mxc_jpeg_src_buf *jpeg_src_buf = vb2_to_mxc_buf(vb);
++ u8 *src_addr = (u8 *)vb2_plane_vaddr(vb, 0);
++ u32 size = vb2_get_plane_payload(vb, 0);
+ int ret;
+
+ memset(&header, 0, sizeof(header));
+@@ -1266,7 +1380,7 @@ static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
+ }
+
+ /* if DHT marker present, no need to inject default one */
+- *dht_needed = (header.num_dht == 0);
++ jpeg_src_buf->dht_needed = (header.num_dht == 0);
+
+ q_data_out = mxc_jpeg_get_q_data(ctx,
+ V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
+@@ -1274,21 +1388,15 @@ static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
+ dev_warn(dev, "Invalid user resolution 0x0");
+ dev_warn(dev, "Keeping resolution from JPEG: %dx%d",
+ header.frame.width, header.frame.height);
+- q_data_out->w = header.frame.width;
+- q_data_out->h = header.frame.height;
+ } else if (header.frame.width != q_data_out->w ||
+ header.frame.height != q_data_out->h) {
+ dev_err(dev,
+ "Resolution mismatch: %dx%d (JPEG) versus %dx%d(user)",
+ header.frame.width, header.frame.height,
+ q_data_out->w, q_data_out->h);
+- return -EINVAL;
+- }
+- if (header.frame.width % 8 != 0 || header.frame.height % 8 != 0) {
+- dev_err(dev, "JPEG width or height not multiple of 8: %dx%d\n",
+- header.frame.width, header.frame.height);
+- return -EINVAL;
+ }
++ q_data_out->w = header.frame.width;
++ q_data_out->h = header.frame.height;
+ if (header.frame.width > MXC_JPEG_MAX_WIDTH ||
+ header.frame.height > MXC_JPEG_MAX_HEIGHT) {
+ dev_err(dev, "JPEG width or height should be <= 8192: %dx%d\n",
+@@ -1316,51 +1424,13 @@ static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
+ if (fourcc == 0)
+ return -EINVAL;
+
+- /*
+- * set-up the capture queue with the pixelformat and resolution
+- * detected from the jpeg output stream
+- */
+- q_data_cap = mxc_jpeg_get_q_data(ctx, cap_type);
+- if (q_data_cap->w != header.frame.width ||
+- q_data_cap->h != header.frame.height)
+- src_chg = true;
+- q_data_cap->w = header.frame.width;
+- q_data_cap->h = header.frame.height;
+- q_data_cap->fmt = mxc_jpeg_find_format(ctx, fourcc);
+- q_data_cap->w_adjusted = q_data_cap->w;
+- q_data_cap->h_adjusted = q_data_cap->h;
+- /*
+- * align up the resolution for CAST IP,
+- * but leave the buffer resolution unchanged
+- */
+- v4l_bound_align_image(&q_data_cap->w_adjusted,
+- q_data_cap->w_adjusted, /* adjust up */
+- MXC_JPEG_MAX_WIDTH,
+- q_data_cap->fmt->h_align,
+- &q_data_cap->h_adjusted,
+- q_data_cap->h_adjusted, /* adjust up */
+- MXC_JPEG_MAX_HEIGHT,
+- q_data_cap->fmt->v_align,
+- 0);
+- dev_dbg(dev, "Detected jpeg res=(%dx%d)->(%dx%d), pixfmt=%c%c%c%c\n",
+- q_data_cap->w, q_data_cap->h,
+- q_data_cap->w_adjusted, q_data_cap->h_adjusted,
+- (fourcc & 0xff),
+- (fourcc >> 8) & 0xff,
+- (fourcc >> 16) & 0xff,
+- (fourcc >> 24) & 0xff);
+-
+- /* setup bytesperline/sizeimage for capture queue */
+- mxc_jpeg_bytesperline(q_data_cap, header.frame.precision);
+- mxc_jpeg_sizeimage(q_data_cap);
++ jpeg_src_buf->fmt = mxc_jpeg_find_format(ctx, fourcc);
++ jpeg_src_buf->w = header.frame.width;
++ jpeg_src_buf->h = header.frame.height;
++ ctx->header_parsed = true;
+
+- /*
+- * if the CAPTURE format was updated with new values, regardless of
+- * whether they match the values set by the client or not, signal
+- * a source change event
+- */
+- if (src_chg)
+- notify_src_chg(ctx);
++ if (!v4l2_m2m_num_src_bufs_ready(ctx->fh.m2m_ctx))
++ mxc_jpeg_source_change(ctx, jpeg_src_buf);
+
+ return 0;
+ }
+@@ -1372,6 +1442,20 @@ static void mxc_jpeg_buf_queue(struct vb2_buffer *vb)
+ struct mxc_jpeg_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+ struct mxc_jpeg_src_buf *jpeg_src_buf;
+
++ if (V4L2_TYPE_IS_CAPTURE(vb->vb2_queue->type) &&
++ vb2_is_streaming(vb->vb2_queue) &&
++ v4l2_m2m_dst_buf_is_last(ctx->fh.m2m_ctx)) {
++ struct mxc_jpeg_q_data *q_data;
++
++ q_data = mxc_jpeg_get_q_data(ctx, vb->vb2_queue->type);
++ vbuf->field = V4L2_FIELD_NONE;
++ vbuf->sequence = q_data->sequence++;
++ v4l2_m2m_last_buffer_done(ctx->fh.m2m_ctx, vbuf);
++ notify_eos(ctx);
++ ctx->header_parsed = false;
++ return;
++ }
++
+ if (vb->vb2_queue->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
+ goto end;
+
+@@ -1381,10 +1465,7 @@ static void mxc_jpeg_buf_queue(struct vb2_buffer *vb)
+
+ jpeg_src_buf = vb2_to_mxc_buf(vb);
+ jpeg_src_buf->jpeg_parse_error = false;
+- ret = mxc_jpeg_parse(ctx,
+- (u8 *)vb2_plane_vaddr(vb, 0),
+- vb2_get_plane_payload(vb, 0),
+- &jpeg_src_buf->dht_needed);
++ ret = mxc_jpeg_parse(ctx, vb);
+ if (ret)
+ jpeg_src_buf->jpeg_parse_error = true;
+
+@@ -1424,23 +1505,11 @@ static int mxc_jpeg_buf_prepare(struct vb2_buffer *vb)
+ }
+ vb2_set_plane_payload(vb, i, sizeimage);
+ }
+- return 0;
+-}
+-
+-static void mxc_jpeg_buf_finish(struct vb2_buffer *vb)
+-{
+- struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
+- struct mxc_jpeg_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
+- struct vb2_queue *q = vb->vb2_queue;
+-
+- if (V4L2_TYPE_IS_OUTPUT(vb->type))
+- return;
+- if (!ctx->stopped)
+- return;
+- if (list_empty(&q->done_list)) {
+- vbuf->flags |= V4L2_BUF_FLAG_LAST;
+- ctx->stopped = 0;
++ if (V4L2_TYPE_IS_CAPTURE(vb->vb2_queue->type)) {
++ vb2_set_plane_payload(vb, 0, 0);
++ vb2_set_plane_payload(vb, 1, 0);
+ }
++ return 0;
+ }
+
+ static const struct vb2_ops mxc_jpeg_qops = {
+@@ -1449,7 +1518,6 @@ static const struct vb2_ops mxc_jpeg_qops = {
+ .wait_finish = vb2_ops_wait_finish,
+ .buf_out_validate = mxc_jpeg_buf_out_validate,
+ .buf_prepare = mxc_jpeg_buf_prepare,
+- .buf_finish = mxc_jpeg_buf_finish,
+ .start_streaming = mxc_jpeg_start_streaming,
+ .stop_streaming = mxc_jpeg_stop_streaming,
+ .buf_queue = mxc_jpeg_buf_queue,
+@@ -1510,7 +1578,7 @@ static void mxc_jpeg_set_default_params(struct mxc_jpeg_ctx *ctx)
+ q[i]->h = MXC_JPEG_DEFAULT_HEIGHT;
+ q[i]->w_adjusted = MXC_JPEG_DEFAULT_WIDTH;
+ q[i]->h_adjusted = MXC_JPEG_DEFAULT_HEIGHT;
+- mxc_jpeg_bytesperline(q[i], 8);
++ mxc_jpeg_bytesperline(q[i], q[i]->fmt->precision);
+ mxc_jpeg_sizeimage(q[i]);
+ }
+ }
+@@ -1585,26 +1653,42 @@ static int mxc_jpeg_enum_fmt_vid_cap(struct file *file, void *priv,
+ struct v4l2_fmtdesc *f)
+ {
+ struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(priv);
++ struct mxc_jpeg_q_data *q_data = mxc_jpeg_get_q_data(ctx, f->type);
+
+- if (ctx->mxc_jpeg->mode == MXC_JPEG_ENCODE)
++ if (ctx->mxc_jpeg->mode == MXC_JPEG_ENCODE) {
+ return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
+ MXC_JPEG_FMT_TYPE_ENC);
+- else
++ } else if (!ctx->header_parsed) {
+ return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
+ MXC_JPEG_FMT_TYPE_RAW);
++ } else {
++ /* For the decoder CAPTURE queue, only enumerate the raw formats
++ * supported for the format currently active on OUTPUT
++ * (more precisely what was propagated on capture queue
++ * after jpeg parse on the output buffer)
++ */
++ if (f->index)
++ return -EINVAL;
++ f->pixelformat = q_data->fmt->fourcc;
++ strscpy(f->description, q_data->fmt->name, sizeof(f->description));
++ return 0;
++ }
+ }
+
+ static int mxc_jpeg_enum_fmt_vid_out(struct file *file, void *priv,
+ struct v4l2_fmtdesc *f)
+ {
+ struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(priv);
++ u32 type = ctx->mxc_jpeg->mode == MXC_JPEG_DECODE ? MXC_JPEG_FMT_TYPE_ENC :
++ MXC_JPEG_FMT_TYPE_RAW;
++ int ret;
+
++ ret = enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f, type);
++ if (ret)
++ return ret;
+ if (ctx->mxc_jpeg->mode == MXC_JPEG_DECODE)
+- return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
+- MXC_JPEG_FMT_TYPE_ENC);
+- else
+- return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
+- MXC_JPEG_FMT_TYPE_RAW);
++ f->flags = V4L2_FMT_FLAG_DYN_RESOLUTION;
++ return 0;
+ }
+
+ static int mxc_jpeg_try_fmt(struct v4l2_format *f, const struct mxc_jpeg_fmt *fmt,
+@@ -1624,22 +1708,17 @@ static int mxc_jpeg_try_fmt(struct v4l2_format *f, const struct mxc_jpeg_fmt *fm
+ pix_mp->num_planes = fmt->colplanes;
+ pix_mp->pixelformat = fmt->fourcc;
+
+- /*
+- * use MXC_JPEG_H_ALIGN instead of fmt->v_align, for vertical
+- * alignment, to loosen up the alignment to multiple of 8,
+- * otherwise NV12-1080p fails as 1080 is not a multiple of 16
+- */
++ pix_mp->width = w;
++ pix_mp->height = h;
+ v4l_bound_align_image(&w,
+- MXC_JPEG_MIN_WIDTH,
+- w, /* adjust downwards*/
++ w, /* adjust upwards*/
++ MXC_JPEG_MAX_WIDTH,
+ fmt->h_align,
+ &h,
+- MXC_JPEG_MIN_HEIGHT,
+- h, /* adjust downwards*/
+- MXC_JPEG_H_ALIGN,
++ h, /* adjust upwards*/
++ MXC_JPEG_MAX_HEIGHT,
++ 0,
+ 0);
+- pix_mp->width = w; /* negotiate the width */
+- pix_mp->height = h; /* negotiate the height */
+
+ /* get user input into the tmp_q */
+ tmp_q.w = w;
+@@ -1652,7 +1731,7 @@ static int mxc_jpeg_try_fmt(struct v4l2_format *f, const struct mxc_jpeg_fmt *fm
+ }
+
+ /* calculate bytesperline & sizeimage into the tmp_q */
+- mxc_jpeg_bytesperline(&tmp_q, 8);
++ mxc_jpeg_bytesperline(&tmp_q, fmt->precision);
+ mxc_jpeg_sizeimage(&tmp_q);
+
+ /* adjust user format according to our calculations */
+@@ -1765,35 +1844,19 @@ static int mxc_jpeg_s_fmt(struct mxc_jpeg_ctx *ctx,
+
+ q_data->w_adjusted = q_data->w;
+ q_data->h_adjusted = q_data->h;
+- if (jpeg->mode == MXC_JPEG_DECODE) {
+- /*
+- * align up the resolution for CAST IP,
+- * but leave the buffer resolution unchanged
+- */
+- v4l_bound_align_image(&q_data->w_adjusted,
+- q_data->w_adjusted, /* adjust upwards */
+- MXC_JPEG_MAX_WIDTH,
+- q_data->fmt->h_align,
+- &q_data->h_adjusted,
+- q_data->h_adjusted, /* adjust upwards */
+- MXC_JPEG_MAX_HEIGHT,
+- q_data->fmt->v_align,
+- 0);
+- } else {
+- /*
+- * align down the resolution for CAST IP,
+- * but leave the buffer resolution unchanged
+- */
+- v4l_bound_align_image(&q_data->w_adjusted,
+- MXC_JPEG_MIN_WIDTH,
+- q_data->w_adjusted, /* adjust downwards*/
+- q_data->fmt->h_align,
+- &q_data->h_adjusted,
+- MXC_JPEG_MIN_HEIGHT,
+- q_data->h_adjusted, /* adjust downwards*/
+- q_data->fmt->v_align,
+- 0);
+- }
++ /*
++ * align up the resolution for CAST IP,
++ * but leave the buffer resolution unchanged
++ */
++ v4l_bound_align_image(&q_data->w_adjusted,
++ q_data->w_adjusted, /* adjust upwards */
++ MXC_JPEG_MAX_WIDTH,
++ q_data->fmt->h_align,
++ &q_data->h_adjusted,
++ q_data->h_adjusted, /* adjust upwards */
++ MXC_JPEG_MAX_HEIGHT,
++ q_data->fmt->v_align,
++ 0);
+
+ for (i = 0; i < pix_mp->num_planes; i++) {
+ q_data->bytesperline[i] = pix_mp->plane_fmt[i].bytesperline;
+@@ -1875,27 +1938,6 @@ static int mxc_jpeg_subscribe_event(struct v4l2_fh *fh,
+ }
+ }
+
+-static int mxc_jpeg_dqbuf(struct file *file, void *priv,
+- struct v4l2_buffer *buf)
+-{
+- struct v4l2_fh *fh = file->private_data;
+- struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(priv);
+- struct device *dev = ctx->mxc_jpeg->dev;
+- int num_src_ready = v4l2_m2m_num_src_bufs_ready(fh->m2m_ctx);
+- int ret;
+-
+- dev_dbg(dev, "DQBUF type=%d, index=%d", buf->type, buf->index);
+- if (ctx->stopping == 1 && num_src_ready == 0) {
+- /* No more src bufs, notify app EOS */
+- notify_eos(ctx);
+- ctx->stopping = 0;
+- mxc_jpeg_set_last_buffer_dequeued(ctx);
+- }
+-
+- ret = v4l2_m2m_dqbuf(file, fh->m2m_ctx, buf);
+- return ret;
+-}
+-
+ static const struct v4l2_ioctl_ops mxc_jpeg_ioctl_ops = {
+ .vidioc_querycap = mxc_jpeg_querycap,
+ .vidioc_enum_fmt_vid_cap = mxc_jpeg_enum_fmt_vid_cap,
+@@ -1919,7 +1961,7 @@ static const struct v4l2_ioctl_ops mxc_jpeg_ioctl_ops = {
+ .vidioc_encoder_cmd = mxc_jpeg_encoder_cmd,
+
+ .vidioc_qbuf = v4l2_m2m_ioctl_qbuf,
+- .vidioc_dqbuf = mxc_jpeg_dqbuf,
++ .vidioc_dqbuf = v4l2_m2m_ioctl_dqbuf,
+
+ .vidioc_create_bufs = v4l2_m2m_ioctl_create_bufs,
+ .vidioc_prepare_buf = v4l2_m2m_ioctl_prepare_buf,
+@@ -1962,6 +2004,7 @@ static const struct v4l2_file_operations mxc_jpeg_fops = {
+ };
+
+ static const struct v4l2_m2m_ops mxc_jpeg_m2m_ops = {
++ .job_ready = mxc_jpeg_job_ready,
+ .device_run = mxc_jpeg_device_run,
+ };
+
+@@ -2078,12 +2121,14 @@ static int mxc_jpeg_probe(struct platform_device *pdev)
+ jpeg->clk_ipg = devm_clk_get(dev, "ipg");
+ if (IS_ERR(jpeg->clk_ipg)) {
+ dev_err(dev, "failed to get clock: ipg\n");
++ ret = PTR_ERR(jpeg->clk_ipg);
+ goto err_clk;
+ }
+
+ jpeg->clk_per = devm_clk_get(dev, "per");
+ if (IS_ERR(jpeg->clk_per)) {
+ dev_err(dev, "failed to get clock: per\n");
++ ret = PTR_ERR(jpeg->clk_per);
+ goto err_clk;
+ }
+
+diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
+index f53f004ba851d..542993eb8d5b0 100644
+--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
++++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
+@@ -49,6 +49,7 @@ enum mxc_jpeg_mode {
+ * @h_align: horizontal alignment order (align to 2^h_align)
+ * @v_align: vertical alignment order (align to 2^v_align)
+ * @flags: flags describing format applicability
++ * @precision: jpeg sample precision
+ */
+ struct mxc_jpeg_fmt {
+ const char *name;
+@@ -60,6 +61,7 @@ struct mxc_jpeg_fmt {
+ int h_align;
+ int v_align;
+ u32 flags;
++ u8 precision;
+ };
+
+ struct mxc_jpeg_desc {
+@@ -90,9 +92,9 @@ struct mxc_jpeg_ctx {
+ struct mxc_jpeg_q_data cap_q;
+ struct v4l2_fh fh;
+ enum mxc_jpeg_enc_state enc_state;
+- unsigned int stopping;
+- unsigned int stopped;
+ unsigned int slot;
++ unsigned int source_change;
++ bool header_parsed;
+ };
+
+ struct mxc_jpeg_slot_data {
+diff --git a/drivers/media/platform/qcom/camss/camss-csid.c b/drivers/media/platform/qcom/camss/camss-csid.c
+index f993f349b66bf..80628801cf09f 100644
+--- a/drivers/media/platform/qcom/camss/camss-csid.c
++++ b/drivers/media/platform/qcom/camss/camss-csid.c
+@@ -666,7 +666,7 @@ int msm_csid_subdev_init(struct camss *camss, struct csid_device *csid,
+ if (csid->num_supplies) {
+ csid->supplies = devm_kmalloc_array(camss->dev,
+ csid->num_supplies,
+- sizeof(csid->supplies),
++ sizeof(*csid->supplies),
+ GFP_KERNEL);
+ if (!csid->supplies)
+ return -ENOMEM;
+diff --git a/drivers/media/platform/renesas/rcar-vin/rcar-core.c b/drivers/media/platform/renesas/rcar-vin/rcar-core.c
+index 64cb05b3907c2..ed3ddf36f9068 100644
+--- a/drivers/media/platform/renesas/rcar-vin/rcar-core.c
++++ b/drivers/media/platform/renesas/rcar-vin/rcar-core.c
+@@ -1264,7 +1264,7 @@ static const struct rvin_info rcar_info_r8a77980 = {
+ };
+
+ static const struct rvin_group_route rcar_info_r8a77990_routes[] = {
+- { .master = 0, .csi = RVIN_CSI40, .chsel = 0x03 },
++ { .master = 4, .csi = RVIN_CSI40, .chsel = 0x03 },
+ { /* Sentinel */ }
+ };
+
+diff --git a/drivers/media/usb/hdpvr/hdpvr-video.c b/drivers/media/usb/hdpvr/hdpvr-video.c
+index 60e57e0f19272..fd7d2a9d0449a 100644
+--- a/drivers/media/usb/hdpvr/hdpvr-video.c
++++ b/drivers/media/usb/hdpvr/hdpvr-video.c
+@@ -409,7 +409,7 @@ static ssize_t hdpvr_read(struct file *file, char __user *buffer, size_t count,
+ struct hdpvr_device *dev = video_drvdata(file);
+ struct hdpvr_buffer *buf = NULL;
+ struct urb *urb;
+- unsigned int ret = 0;
++ int ret = 0;
+ int rem, cnt;
+
+ if (*pos)
+diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
+index 675e22895ebe6..094e1815209b3 100644
+--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
++++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
+@@ -924,7 +924,7 @@ static __poll_t v4l2_m2m_poll_for_data(struct file *file,
+ if ((!src_q->streaming || src_q->error ||
+ list_empty(&src_q->queued_list)) &&
+ (!dst_q->streaming || dst_q->error ||
+- list_empty(&dst_q->queued_list)))
++ (list_empty(&dst_q->queued_list) && !dst_q->last_buffer_dequeued)))
+ return EPOLLERR;
+
+ spin_lock_irqsave(&src_q->done_lock, flags);
+diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c
+index 3993bdd4b519c..f8fdf88fb240c 100644
+--- a/drivers/memstick/core/ms_block.c
++++ b/drivers/memstick/core/ms_block.c
+@@ -1341,17 +1341,17 @@ static int msb_ftl_initialize(struct msb_data *msb)
+ msb->zone_count = msb->block_count / MS_BLOCKS_IN_ZONE;
+ msb->logical_block_count = msb->zone_count * 496 - 2;
+
+- msb->used_blocks_bitmap = kzalloc(msb->block_count / 8, GFP_KERNEL);
+- msb->erased_blocks_bitmap = kzalloc(msb->block_count / 8, GFP_KERNEL);
++ msb->used_blocks_bitmap = bitmap_zalloc(msb->block_count, GFP_KERNEL);
++ msb->erased_blocks_bitmap = bitmap_zalloc(msb->block_count, GFP_KERNEL);
+ msb->lba_to_pba_table =
+ kmalloc_array(msb->logical_block_count, sizeof(u16),
+ GFP_KERNEL);
+
+ if (!msb->used_blocks_bitmap || !msb->lba_to_pba_table ||
+ !msb->erased_blocks_bitmap) {
+- kfree(msb->used_blocks_bitmap);
++ bitmap_free(msb->used_blocks_bitmap);
++ bitmap_free(msb->erased_blocks_bitmap);
+ kfree(msb->lba_to_pba_table);
+- kfree(msb->erased_blocks_bitmap);
+ return -ENOMEM;
+ }
+
+@@ -1946,7 +1946,8 @@ static DEFINE_MUTEX(msb_disk_lock); /* protects against races in open/release */
+ static void msb_data_clear(struct msb_data *msb)
+ {
+ kfree(msb->boot_page);
+- kfree(msb->used_blocks_bitmap);
++ bitmap_free(msb->used_blocks_bitmap);
++ bitmap_free(msb->erased_blocks_bitmap);
+ kfree(msb->lba_to_pba_table);
+ kfree(msb->cache);
+ msb->card = NULL;
+diff --git a/drivers/mfd/max77620.c b/drivers/mfd/max77620.c
+index fec2096474ad1..a6661e07035ba 100644
+--- a/drivers/mfd/max77620.c
++++ b/drivers/mfd/max77620.c
+@@ -419,9 +419,11 @@ static int max77620_initialise_fps(struct max77620_chip *chip)
+ ret = max77620_config_fps(chip, fps_child);
+ if (ret < 0) {
+ of_node_put(fps_child);
++ of_node_put(fps_np);
+ return ret;
+ }
+ }
++ of_node_put(fps_np);
+
+ config = chip->enable_global_lpm ? MAX77620_ONOFFCNFG2_SLP_LPM_MSK : 0;
+ ret = regmap_update_bits(chip->rmap, MAX77620_REG_ONOFFCNFG2,
+diff --git a/drivers/mfd/t7l66xb.c b/drivers/mfd/t7l66xb.c
+index 5369c67e3280d..663ffd4b85706 100644
+--- a/drivers/mfd/t7l66xb.c
++++ b/drivers/mfd/t7l66xb.c
+@@ -397,11 +397,8 @@ err_noirq:
+
+ static int t7l66xb_remove(struct platform_device *dev)
+ {
+- struct t7l66xb_platform_data *pdata = dev_get_platdata(&dev->dev);
+ struct t7l66xb *t7l66xb = platform_get_drvdata(dev);
+- int ret;
+
+- ret = pdata->disable(dev);
+ clk_disable_unprepare(t7l66xb->clk48m);
+ clk_put(t7l66xb->clk48m);
+ clk_disable_unprepare(t7l66xb->clk32k);
+@@ -412,8 +409,7 @@ static int t7l66xb_remove(struct platform_device *dev)
+ mfd_remove_devices(&dev->dev);
+ kfree(t7l66xb);
+
+- return ret;
+-
++ return 0;
+ }
+
+ static struct platform_driver t7l66xb_platform_driver = {
+diff --git a/drivers/misc/cardreader/rtsx_pcr.c b/drivers/misc/cardreader/rtsx_pcr.c
+index 2a2619e3c72cc..f001d99bf366b 100644
+--- a/drivers/misc/cardreader/rtsx_pcr.c
++++ b/drivers/misc/cardreader/rtsx_pcr.c
+@@ -1507,7 +1507,7 @@ static int rtsx_pci_probe(struct pci_dev *pcidev,
+ pcr->remap_addr = ioremap(base, len);
+ if (!pcr->remap_addr) {
+ ret = -ENOMEM;
+- goto free_handle;
++ goto free_idr;
+ }
+
+ pcr->rtsx_resv_buf = dma_alloc_coherent(&(pcidev->dev),
+@@ -1570,6 +1570,10 @@ disable_msi:
+ pcr->rtsx_resv_buf, pcr->rtsx_resv_buf_addr);
+ unmap:
+ iounmap(pcr->remap_addr);
++free_idr:
++ spin_lock(&rtsx_pci_lock);
++ idr_remove(&rtsx_pci_idr, pcr->id);
++ spin_unlock(&rtsx_pci_lock);
+ free_handle:
+ kfree(handle);
+ free_pcr:
+diff --git a/drivers/misc/eeprom/idt_89hpesx.c b/drivers/misc/eeprom/idt_89hpesx.c
+index b0cff4b152da8..7f430742ce2b8 100644
+--- a/drivers/misc/eeprom/idt_89hpesx.c
++++ b/drivers/misc/eeprom/idt_89hpesx.c
+@@ -909,14 +909,18 @@ static ssize_t idt_dbgfs_csr_write(struct file *filep, const char __user *ubuf,
+ u32 csraddr, csrval;
+ char *buf;
+
++ if (*offp)
++ return 0;
++
+ /* Copy data from User-space */
+ buf = kmalloc(count + 1, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+- ret = simple_write_to_buffer(buf, count, offp, ubuf, count);
+- if (ret < 0)
++ if (copy_from_user(buf, ubuf, count)) {
++ ret = -EFAULT;
+ goto free_buf;
++ }
+ buf[count] = 0;
+
+ /* Find position of colon in the buffer */
+diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
+index 23fcb7a166057..0b8d3d25ba7ae 100644
+--- a/drivers/mmc/core/block.c
++++ b/drivers/mmc/core/block.c
+@@ -175,7 +175,7 @@ static inline int mmc_blk_part_switch(struct mmc_card *card,
+ unsigned int part_type);
+ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+ struct mmc_card *card,
+- int disable_multi,
++ int recovery_mode,
+ struct mmc_queue *mq);
+ static void mmc_blk_hsq_req_done(struct mmc_request *mrq);
+
+@@ -1285,7 +1285,7 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
+ }
+
+ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
+- int disable_multi, bool *do_rel_wr_p,
++ int recovery_mode, bool *do_rel_wr_p,
+ bool *do_data_tag_p)
+ {
+ struct mmc_blk_data *md = mq->blkdata;
+@@ -1351,12 +1351,12 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
+ brq->data.blocks--;
+
+ /*
+- * After a read error, we redo the request one sector
++ * After a read error, we redo the request one (native) sector
+ * at a time in order to accurately determine which
+ * sectors can be read successfully.
+ */
+- if (disable_multi)
+- brq->data.blocks = 1;
++ if (recovery_mode)
++ brq->data.blocks = queue_physical_block_size(mq->queue) >> 9;
+
+ /*
+ * Some controllers have HW issues while operating
+@@ -1573,7 +1573,7 @@ static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+
+ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+ struct mmc_card *card,
+- int disable_multi,
++ int recovery_mode,
+ struct mmc_queue *mq)
+ {
+ u32 readcmd, writecmd;
+@@ -1582,7 +1582,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+ struct mmc_blk_data *md = mq->blkdata;
+ bool do_rel_wr, do_data_tag;
+
+- mmc_blk_data_prep(mq, mqrq, disable_multi, &do_rel_wr, &do_data_tag);
++ mmc_blk_data_prep(mq, mqrq, recovery_mode, &do_rel_wr, &do_data_tag);
+
+ brq->mrq.cmd = &brq->cmd;
+
+@@ -1673,7 +1673,7 @@ static int mmc_blk_fix_state(struct mmc_card *card, struct request *req)
+
+ #define MMC_READ_SINGLE_RETRIES 2
+
+-/* Single sector read during recovery */
++/* Single (native) sector read during recovery */
+ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
+ {
+ struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+@@ -1681,6 +1681,7 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
+ struct mmc_card *card = mq->card;
+ struct mmc_host *host = card->host;
+ blk_status_t error = BLK_STS_OK;
++ size_t bytes_per_read = queue_physical_block_size(mq->queue);
+
+ do {
+ u32 status;
+@@ -1715,13 +1716,13 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
+ else
+ error = BLK_STS_OK;
+
+- } while (blk_update_request(req, error, 512));
++ } while (blk_update_request(req, error, bytes_per_read));
+
+ return;
+
+ error_exit:
+ mrq->data->bytes_xfered = 0;
+- blk_update_request(req, BLK_STS_IOERR, 512);
++ blk_update_request(req, BLK_STS_IOERR, bytes_per_read);
+ /* Let it try the remaining request again */
+ if (mqrq->retries > MMC_MAX_RETRIES - 1)
+ mqrq->retries = MMC_MAX_RETRIES - 1;
+@@ -1862,10 +1863,9 @@ static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
+ return;
+ }
+
+- /* FIXME: Missing single sector read for large sector size */
+- if (!mmc_large_sector(card) && rq_data_dir(req) == READ &&
+- brq->data.blocks > 1) {
+- /* Read one sector at a time */
++ if (rq_data_dir(req) == READ && brq->data.blocks >
++ queue_physical_block_size(mq->queue) >> 9) {
++ /* Read one (native) sector at a time */
+ mmc_blk_read_single(mq, req);
+ return;
+ }
+diff --git a/drivers/mmc/core/quirks.h b/drivers/mmc/core/quirks.h
+index f879dc63d9364..be43939880868 100644
+--- a/drivers/mmc/core/quirks.h
++++ b/drivers/mmc/core/quirks.h
+@@ -163,8 +163,10 @@ static inline bool mmc_fixup_of_compatible_match(struct mmc_card *card,
+ struct device_node *np;
+
+ for_each_child_of_node(mmc_dev(card->host)->of_node, np) {
+- if (of_device_is_compatible(np, compatible))
++ if (of_device_is_compatible(np, compatible)) {
++ of_node_put(np);
+ return true;
++ }
+ }
+
+ return false;
+diff --git a/drivers/mmc/host/cavium-octeon.c b/drivers/mmc/host/cavium-octeon.c
+index 2c4b2df52adb1..12dca91a8ef61 100644
+--- a/drivers/mmc/host/cavium-octeon.c
++++ b/drivers/mmc/host/cavium-octeon.c
+@@ -277,6 +277,7 @@ static int octeon_mmc_probe(struct platform_device *pdev)
+ if (ret) {
+ dev_err(&pdev->dev, "Error populating slots\n");
+ octeon_mmc_set_shared_power(host, 0);
++ of_node_put(cn);
+ goto error;
+ }
+ i++;
+diff --git a/drivers/mmc/host/cavium-thunderx.c b/drivers/mmc/host/cavium-thunderx.c
+index 76013bbbcff30..202b1d6da678c 100644
+--- a/drivers/mmc/host/cavium-thunderx.c
++++ b/drivers/mmc/host/cavium-thunderx.c
+@@ -142,8 +142,10 @@ static int thunder_mmc_probe(struct pci_dev *pdev,
+ continue;
+
+ ret = cvm_mmc_of_slot_probe(&host->slot_pdev[i]->dev, host);
+- if (ret)
++ if (ret) {
++ of_node_put(child_node);
+ goto error;
++ }
+ }
+ i++;
+ }
+diff --git a/drivers/mmc/host/mxcmmc.c b/drivers/mmc/host/mxcmmc.c
+index 40b6878bea6cb..b5940083a1082 100644
+--- a/drivers/mmc/host/mxcmmc.c
++++ b/drivers/mmc/host/mxcmmc.c
+@@ -1025,7 +1025,7 @@ static int mxcmci_probe(struct platform_device *pdev)
+ mmc->max_req_size = mmc->max_blk_size * mmc->max_blk_count;
+ mmc->max_seg_size = mmc->max_req_size;
+
+- host->devtype = (enum mxcmci_type)of_device_get_match_data(&pdev->dev);
++ host->devtype = (uintptr_t)of_device_get_match_data(&pdev->dev);
+
+ /* adjust max_segs after devtype detection */
+ if (!is_mpc512x_mmc(host))
+diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
+index ddb5ca2f559e2..4fb49306275e4 100644
+--- a/drivers/mmc/host/renesas_sdhi_core.c
++++ b/drivers/mmc/host/renesas_sdhi_core.c
+@@ -940,6 +940,10 @@ int renesas_sdhi_probe(struct platform_device *pdev,
+ if (IS_ERR(priv->clk_cd))
+ return dev_err_probe(&pdev->dev, PTR_ERR(priv->clk_cd), "cannot get cd clock");
+
++ priv->rstc = devm_reset_control_get_optional_exclusive(&pdev->dev, NULL);
++ if (IS_ERR(priv->rstc))
++ return PTR_ERR(priv->rstc);
++
+ priv->pinctrl = devm_pinctrl_get(&pdev->dev);
+ if (!IS_ERR(priv->pinctrl)) {
+ priv->pins_default = pinctrl_lookup_state(priv->pinctrl,
+@@ -1032,10 +1036,6 @@ int renesas_sdhi_probe(struct platform_device *pdev,
+ if (ret)
+ goto efree;
+
+- priv->rstc = devm_reset_control_get_optional_exclusive(&pdev->dev, NULL);
+- if (IS_ERR(priv->rstc))
+- return PTR_ERR(priv->rstc);
+-
+ ver = sd_ctrl_read16(host, CTL_VERSION);
+ /* GEN2_SDR104 is first known SDHI to use 32bit block count */
+ if (ver < SDHI_VER_GEN2_SDR104 && mmc_data->max_blk_count > U16_MAX)
+diff --git a/drivers/mmc/host/sdhci-of-at91.c b/drivers/mmc/host/sdhci-of-at91.c
+index 10fb4cb2c731e..cd0134580a901 100644
+--- a/drivers/mmc/host/sdhci-of-at91.c
++++ b/drivers/mmc/host/sdhci-of-at91.c
+@@ -100,8 +100,13 @@ static void sdhci_at91_set_clock(struct sdhci_host *host, unsigned int clock)
+ static void sdhci_at91_set_uhs_signaling(struct sdhci_host *host,
+ unsigned int timing)
+ {
+- if (timing == MMC_TIMING_MMC_DDR52)
+- sdhci_writeb(host, SDMMC_MC1R_DDR, SDMMC_MC1R);
++ u8 mc1r;
++
++ if (timing == MMC_TIMING_MMC_DDR52) {
++ mc1r = sdhci_readb(host, SDMMC_MC1R);
++ mc1r |= SDMMC_MC1R_DDR;
++ sdhci_writeb(host, mc1r, SDMMC_MC1R);
++ }
+ sdhci_set_uhs_signaling(host, timing);
+ }
+
+diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
+index d9dc41143bb35..8b3d8119f3880 100644
+--- a/drivers/mmc/host/sdhci-of-esdhc.c
++++ b/drivers/mmc/host/sdhci-of-esdhc.c
+@@ -904,6 +904,7 @@ static int esdhc_signal_voltage_switch(struct mmc_host *mmc,
+ scfg_node = of_find_matching_node(NULL, scfg_device_ids);
+ if (scfg_node)
+ scfg_base = of_iomap(scfg_node, 0);
++ of_node_put(scfg_node);
+ if (scfg_base) {
+ sdhciovselcr = SDHCIOVSELCR_TGLEN |
+ SDHCIOVSELCR_VSELVAL;
+diff --git a/drivers/mtd/devices/mtd_dataflash.c b/drivers/mtd/devices/mtd_dataflash.c
+index 134e273285974..25bad43183052 100644
+--- a/drivers/mtd/devices/mtd_dataflash.c
++++ b/drivers/mtd/devices/mtd_dataflash.c
+@@ -112,6 +112,13 @@ static const struct of_device_id dataflash_dt_ids[] = {
+ MODULE_DEVICE_TABLE(of, dataflash_dt_ids);
+ #endif
+
++static const struct spi_device_id dataflash_spi_ids[] = {
++ { .name = "at45", },
++ { .name = "dataflash", },
++ { /* sentinel */ }
++};
++MODULE_DEVICE_TABLE(spi, dataflash_spi_ids);
++
+ /* ......................................................................... */
+
+ /*
+@@ -936,6 +943,7 @@ static struct spi_driver dataflash_driver = {
+
+ .probe = dataflash_probe,
+ .remove = dataflash_remove,
++ .id_table = dataflash_spi_ids,
+
+ /* FIXME: investigate suspend and resume... */
+ };
+diff --git a/drivers/mtd/devices/st_spi_fsm.c b/drivers/mtd/devices/st_spi_fsm.c
+index 983999c020d66..48bda2dd1bb55 100644
+--- a/drivers/mtd/devices/st_spi_fsm.c
++++ b/drivers/mtd/devices/st_spi_fsm.c
+@@ -2115,10 +2115,12 @@ static int stfsm_probe(struct platform_device *pdev)
+ (long long)fsm->mtd.size, (long long)(fsm->mtd.size >> 20),
+ fsm->mtd.erasesize, (fsm->mtd.erasesize >> 10));
+
+- return mtd_device_register(&fsm->mtd, NULL, 0);
+-
++ ret = mtd_device_register(&fsm->mtd, NULL, 0);
++ if (ret) {
+ err_clk_unprepare:
+- clk_disable_unprepare(fsm->clk);
++ clk_disable_unprepare(fsm->clk);
++ }
++
+ return ret;
+ }
+
+diff --git a/drivers/mtd/hyperbus/rpc-if.c b/drivers/mtd/hyperbus/rpc-if.c
+index 6e08ec1d4f098..b70d259e48a7c 100644
+--- a/drivers/mtd/hyperbus/rpc-if.c
++++ b/drivers/mtd/hyperbus/rpc-if.c
+@@ -134,7 +134,7 @@ static int rpcif_hb_probe(struct platform_device *pdev)
+
+ error = rpcif_hw_init(&hyperbus->rpc, true);
+ if (error)
+- return error;
++ goto out_disable_rpm;
+
+ hyperbus->hbdev.map.size = hyperbus->rpc.size;
+ hyperbus->hbdev.map.virt = hyperbus->rpc.dirmap;
+@@ -145,8 +145,12 @@ static int rpcif_hb_probe(struct platform_device *pdev)
+ hyperbus->hbdev.np = of_get_next_child(pdev->dev.parent->of_node, NULL);
+ error = hyperbus_register_device(&hyperbus->hbdev);
+ if (error)
+- rpcif_disable_rpm(&hyperbus->rpc);
++ goto out_disable_rpm;
++
++ return 0;
+
++out_disable_rpm:
++ rpcif_disable_rpm(&hyperbus->rpc);
+ return error;
+ }
+
+diff --git a/drivers/mtd/maps/physmap-versatile.c b/drivers/mtd/maps/physmap-versatile.c
+index ad7cd9cfaee04..a1b8b7b25f88b 100644
+--- a/drivers/mtd/maps/physmap-versatile.c
++++ b/drivers/mtd/maps/physmap-versatile.c
+@@ -93,6 +93,7 @@ static int ap_flash_init(struct platform_device *pdev)
+ return -ENODEV;
+ }
+ ebi_base = of_iomap(ebi, 0);
++ of_node_put(ebi);
+ if (!ebi_base)
+ return -ENODEV;
+
+@@ -207,6 +208,7 @@ int of_flash_probe_versatile(struct platform_device *pdev,
+
+ versatile_flashprot = (enum versatile_flashprot)devid->data;
+ rmap = syscon_node_to_regmap(sysnp);
++ of_node_put(sysnp);
+ if (IS_ERR(rmap))
+ return PTR_ERR(rmap);
+
+diff --git a/drivers/mtd/nand/raw/arasan-nand-controller.c b/drivers/mtd/nand/raw/arasan-nand-controller.c
+index 53bd10738418b..296fb16c8dc3c 100644
+--- a/drivers/mtd/nand/raw/arasan-nand-controller.c
++++ b/drivers/mtd/nand/raw/arasan-nand-controller.c
+@@ -347,17 +347,17 @@ static int anfc_select_target(struct nand_chip *chip, int target)
+
+ /* Update clock frequency */
+ if (nfc->cur_clk != anand->clk) {
+- clk_disable_unprepare(nfc->controller_clk);
+- ret = clk_set_rate(nfc->controller_clk, anand->clk);
++ clk_disable_unprepare(nfc->bus_clk);
++ ret = clk_set_rate(nfc->bus_clk, anand->clk);
+ if (ret) {
+ dev_err(nfc->dev, "Failed to change clock rate\n");
+ return ret;
+ }
+
+- ret = clk_prepare_enable(nfc->controller_clk);
++ ret = clk_prepare_enable(nfc->bus_clk);
+ if (ret) {
+ dev_err(nfc->dev,
+- "Failed to re-enable the controller clock\n");
++ "Failed to re-enable the bus clock\n");
+ return ret;
+ }
+
+@@ -1043,7 +1043,13 @@ static int anfc_setup_interface(struct nand_chip *chip, int target,
+ DQS_BUFF_SEL_OUT(dqs_mode);
+ }
+
+- anand->clk = ANFC_XLNX_SDR_DFLT_CORE_CLK;
++ if (nand_interface_is_sdr(conf)) {
++ anand->clk = ANFC_XLNX_SDR_DFLT_CORE_CLK;
++ } else {
++ /* ONFI timings are defined in picoseconds */
++ anand->clk = div_u64((u64)NSEC_PER_SEC * 1000,
++ conf->timings.nvddr.tCK_min);
++ }
+
+ /*
+ * Due to a hardware bug in the ZynqMP SoC, SDR timing modes 0-1 work
+diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
+index ac3be92872d06..0321801833393 100644
+--- a/drivers/mtd/nand/raw/meson_nand.c
++++ b/drivers/mtd/nand/raw/meson_nand.c
+@@ -1307,7 +1307,6 @@ static int meson_nfc_nand_chip_cleanup(struct meson_nfc *nfc)
+ if (ret)
+ return ret;
+
+- meson_nfc_free_buffer(&meson_chip->nand);
+ nand_cleanup(&meson_chip->nand);
+ list_del(&meson_chip->node);
+ }
+diff --git a/drivers/mtd/parsers/ofpart_bcm4908.c b/drivers/mtd/parsers/ofpart_bcm4908.c
+index 0eddef4c198ec..bb072a0940e48 100644
+--- a/drivers/mtd/parsers/ofpart_bcm4908.c
++++ b/drivers/mtd/parsers/ofpart_bcm4908.c
+@@ -35,12 +35,15 @@ static long long bcm4908_partitions_fw_offset(void)
+ err = kstrtoul(s + len + 1, 0, &offset);
+ if (err) {
+ pr_err("failed to parse %s\n", s + len + 1);
++ of_node_put(root);
+ return err;
+ }
+
++ of_node_put(root);
+ return offset << 10;
+ }
+
++ of_node_put(root);
+ return -ENOENT;
+ }
+
+diff --git a/drivers/mtd/parsers/redboot.c b/drivers/mtd/parsers/redboot.c
+index feb44a573d447..a16b42a885816 100644
+--- a/drivers/mtd/parsers/redboot.c
++++ b/drivers/mtd/parsers/redboot.c
+@@ -58,6 +58,7 @@ static void parse_redboot_of(struct mtd_info *master)
+ return;
+
+ ret = of_property_read_u32(npart, "fis-index-block", &dirblock);
++ of_node_put(npart);
+ if (ret)
+ return;
+
+diff --git a/drivers/mtd/sm_ftl.c b/drivers/mtd/sm_ftl.c
+index 0cff2cda1b5a0..7f955fade8383 100644
+--- a/drivers/mtd/sm_ftl.c
++++ b/drivers/mtd/sm_ftl.c
+@@ -1111,9 +1111,9 @@ static void sm_release(struct mtd_blktrans_dev *dev)
+ {
+ struct sm_ftl *ftl = dev->priv;
+
+- mutex_lock(&ftl->mutex);
+ del_timer_sync(&ftl->timer);
+ cancel_work_sync(&ftl->flush_work);
++ mutex_lock(&ftl->mutex);
+ sm_cache_flush(ftl);
+ mutex_unlock(&ftl->mutex);
+ }
+diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
+index c1630131c7342..170182eb431e4 100644
+--- a/drivers/mtd/spi-nor/core.c
++++ b/drivers/mtd/spi-nor/core.c
+@@ -177,7 +177,7 @@ int spi_nor_controller_ops_write_reg(struct spi_nor *nor, u8 opcode,
+
+ static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
+ {
+- if (spi_nor_protocol_is_dtr(nor->write_proto))
++ if (spi_nor_protocol_is_dtr(nor->reg_proto))
+ return -EOPNOTSUPP;
+
+ return nor->controller_ops->erase(nor, offs);
+@@ -976,7 +976,7 @@ static int spi_nor_erase_chip(struct spi_nor *nor)
+ SPI_MEM_OP_NO_DUMMY,
+ SPI_MEM_OP_NO_DATA);
+
+- spi_nor_spimem_setup_op(nor, &op, nor->write_proto);
++ spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
+
+ ret = spi_mem_exec_op(nor->spimem, &op);
+ } else {
+@@ -1121,7 +1121,7 @@ int spi_nor_erase_sector(struct spi_nor *nor, u32 addr)
+ SPI_MEM_OP_NO_DUMMY,
+ SPI_MEM_OP_NO_DATA);
+
+- spi_nor_spimem_setup_op(nor, &op, nor->write_proto);
++ spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
+
+ return spi_mem_exec_op(nor->spimem, &op);
+ } else if (nor->controller_ops->erase) {
+diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c
+index 7633d98e39121..037824011266e 100644
+--- a/drivers/net/can/dev/netlink.c
++++ b/drivers/net/can/dev/netlink.c
+@@ -176,7 +176,8 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[],
+ * directly via do_set_bitrate(). Bail out if neither
+ * is given.
+ */
+- if (!priv->bittiming_const && !priv->do_set_bittiming)
++ if (!priv->bittiming_const && !priv->do_set_bittiming &&
++ !priv->bitrate_const)
+ return -EOPNOTSUPP;
+
+ memcpy(&bt, nla_data(data[IFLA_CAN_BITTIMING]), sizeof(bt));
+@@ -278,7 +279,8 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[],
+ * directly via do_set_bitrate(). Bail out if neither
+ * is given.
+ */
+- if (!priv->data_bittiming_const && !priv->do_set_data_bittiming)
++ if (!priv->data_bittiming_const && !priv->do_set_data_bittiming &&
++ !priv->data_bitrate_const)
+ return -EOPNOTSUPP;
+
+ memcpy(&dbt, nla_data(data[IFLA_CAN_DATA_BITTIMING]),
+diff --git a/drivers/net/can/pch_can.c b/drivers/net/can/pch_can.c
+index 888bef03de09f..17f8d67ddb181 100644
+--- a/drivers/net/can/pch_can.c
++++ b/drivers/net/can/pch_can.c
+@@ -489,6 +489,7 @@ static void pch_can_error(struct net_device *ndev, u32 status)
+ if (!skb)
+ return;
+
++ errc = ioread32(&priv->regs->errc);
+ if (status & PCH_BUS_OFF) {
+ pch_can_set_tx_all(priv, 0);
+ pch_can_set_rx_all(priv, 0);
+@@ -496,9 +497,11 @@ static void pch_can_error(struct net_device *ndev, u32 status)
+ cf->can_id |= CAN_ERR_BUSOFF;
+ priv->can.can_stats.bus_off++;
+ can_bus_off(ndev);
++ } else {
++ cf->data[6] = errc & PCH_TEC;
++ cf->data[7] = (errc & PCH_REC) >> 8;
+ }
+
+- errc = ioread32(&priv->regs->errc);
+ /* Warning interrupt. */
+ if (status & PCH_EWARN) {
+ state = CAN_STATE_ERROR_WARNING;
+@@ -556,9 +559,6 @@ static void pch_can_error(struct net_device *ndev, u32 status)
+ break;
+ }
+
+- cf->data[6] = errc & PCH_TEC;
+- cf->data[7] = (errc & PCH_REC) >> 8;
+-
+ priv->can.state = state;
+ netif_receive_skb(skb);
+ }
+diff --git a/drivers/net/can/rcar/rcar_can.c b/drivers/net/can/rcar/rcar_can.c
+index 33e37395379d7..4ca5b09f1e5da 100644
+--- a/drivers/net/can/rcar/rcar_can.c
++++ b/drivers/net/can/rcar/rcar_can.c
+@@ -233,11 +233,8 @@ static void rcar_can_error(struct net_device *ndev)
+ if (eifr & (RCAR_CAN_EIFR_EWIF | RCAR_CAN_EIFR_EPIF)) {
+ txerr = readb(&priv->regs->tecr);
+ rxerr = readb(&priv->regs->recr);
+- if (skb) {
++ if (skb)
+ cf->can_id |= CAN_ERR_CRTL;
+- cf->data[6] = txerr;
+- cf->data[7] = rxerr;
+- }
+ }
+ if (eifr & RCAR_CAN_EIFR_BEIF) {
+ int rx_errors = 0, tx_errors = 0;
+@@ -337,6 +334,9 @@ static void rcar_can_error(struct net_device *ndev)
+ can_bus_off(ndev);
+ if (skb)
+ cf->can_id |= CAN_ERR_BUSOFF;
++ } else if (skb) {
++ cf->data[6] = txerr;
++ cf->data[7] = rxerr;
+ }
+ if (eifr & RCAR_CAN_EIFR_ORIF) {
+ netdev_dbg(priv->ndev, "Receive overrun error interrupt\n");
+diff --git a/drivers/net/can/sja1000/sja1000.c b/drivers/net/can/sja1000/sja1000.c
+index 966316479485d..366a369080efa 100644
+--- a/drivers/net/can/sja1000/sja1000.c
++++ b/drivers/net/can/sja1000/sja1000.c
+@@ -405,9 +405,6 @@ static int sja1000_err(struct net_device *dev, uint8_t isrc, uint8_t status)
+ txerr = priv->read_reg(priv, SJA1000_TXERR);
+ rxerr = priv->read_reg(priv, SJA1000_RXERR);
+
+- cf->data[6] = txerr;
+- cf->data[7] = rxerr;
+-
+ if (isrc & IRQ_DOI) {
+ /* data overrun interrupt */
+ netdev_dbg(dev, "data overrun interrupt\n");
+@@ -429,6 +426,10 @@ static int sja1000_err(struct net_device *dev, uint8_t isrc, uint8_t status)
+ else
+ state = CAN_STATE_ERROR_ACTIVE;
+ }
++ if (state != CAN_STATE_BUS_OFF) {
++ cf->data[6] = txerr;
++ cf->data[7] = rxerr;
++ }
+ if (isrc & IRQ_BEI) {
+ /* bus error interrupt */
+ priv->can.can_stats.bus_error++;
+diff --git a/drivers/net/can/spi/hi311x.c b/drivers/net/can/spi/hi311x.c
+index a5b2952b8d0ff..7dc88a9eca0f7 100644
+--- a/drivers/net/can/spi/hi311x.c
++++ b/drivers/net/can/spi/hi311x.c
+@@ -672,8 +672,6 @@ static irqreturn_t hi3110_can_ist(int irq, void *dev_id)
+
+ txerr = hi3110_read(spi, HI3110_READ_TEC);
+ rxerr = hi3110_read(spi, HI3110_READ_REC);
+- cf->data[6] = txerr;
+- cf->data[7] = rxerr;
+ tx_state = txerr >= rxerr ? new_state : 0;
+ rx_state = txerr <= rxerr ? new_state : 0;
+ can_change_state(net, cf, tx_state, rx_state);
+@@ -686,6 +684,9 @@ static irqreturn_t hi3110_can_ist(int irq, void *dev_id)
+ hi3110_hw_sleep(spi);
+ break;
+ }
++ } else {
++ cf->data[6] = txerr;
++ cf->data[7] = rxerr;
+ }
+ }
+
+diff --git a/drivers/net/can/sun4i_can.c b/drivers/net/can/sun4i_can.c
+index 25d6d81ab4f44..f3ad585e766fb 100644
+--- a/drivers/net/can/sun4i_can.c
++++ b/drivers/net/can/sun4i_can.c
+@@ -538,11 +538,6 @@ static int sun4i_can_err(struct net_device *dev, u8 isrc, u8 status)
+ rxerr = (errc >> 16) & 0xFF;
+ txerr = errc & 0xFF;
+
+- if (skb) {
+- cf->data[6] = txerr;
+- cf->data[7] = rxerr;
+- }
+-
+ if (isrc & SUN4I_INT_DATA_OR) {
+ /* data overrun interrupt */
+ netdev_dbg(dev, "data overrun interrupt\n");
+@@ -573,6 +568,10 @@ static int sun4i_can_err(struct net_device *dev, u8 isrc, u8 status)
+ else
+ state = CAN_STATE_ERROR_ACTIVE;
+ }
++ if (skb && state != CAN_STATE_BUS_OFF) {
++ cf->data[6] = txerr;
++ cf->data[7] = rxerr;
++ }
+ if (isrc & SUN4I_INT_BUS_ERR) {
+ /* bus error interrupt */
+ netdev_dbg(dev, "bus error interrupt\n");
+diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
+index 5d70844ac0300..404093468b2f1 100644
+--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
++++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
+@@ -917,8 +917,10 @@ static void kvaser_usb_hydra_update_state(struct kvaser_usb_net_priv *priv,
+ new_state < CAN_STATE_BUS_OFF)
+ priv->can.can_stats.restarts++;
+
+- cf->data[6] = bec->txerr;
+- cf->data[7] = bec->rxerr;
++ if (new_state != CAN_STATE_BUS_OFF) {
++ cf->data[6] = bec->txerr;
++ cf->data[7] = bec->rxerr;
++ }
+
+ netif_rx(skb);
+ }
+@@ -1069,8 +1071,10 @@ kvaser_usb_hydra_error_frame(struct kvaser_usb_net_priv *priv,
+ shhwtstamps->hwtstamp = hwtstamp;
+
+ cf->can_id |= CAN_ERR_BUSERROR;
+- cf->data[6] = bec.txerr;
+- cf->data[7] = bec.rxerr;
++ if (new_state != CAN_STATE_BUS_OFF) {
++ cf->data[6] = bec.txerr;
++ cf->data[7] = bec.rxerr;
++ }
+
+ netif_rx(skb);
+
+diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
+index cc809ecd1e622..f551fde16a709 100644
+--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
++++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
+@@ -853,8 +853,10 @@ static void kvaser_usb_leaf_rx_error(const struct kvaser_usb *dev,
+ break;
+ }
+
+- cf->data[6] = es->txerr;
+- cf->data[7] = es->rxerr;
++ if (new_state != CAN_STATE_BUS_OFF) {
++ cf->data[6] = es->txerr;
++ cf->data[7] = es->rxerr;
++ }
+
+ netif_rx(skb);
+ }
+diff --git a/drivers/net/can/usb/usb_8dev.c b/drivers/net/can/usb/usb_8dev.c
+index b638604bf1eef..ea63dc687fdb7 100644
+--- a/drivers/net/can/usb/usb_8dev.c
++++ b/drivers/net/can/usb/usb_8dev.c
+@@ -439,9 +439,10 @@ static void usb_8dev_rx_err_msg(struct usb_8dev_priv *priv,
+
+ if (rx_errors)
+ stats->rx_errors++;
+-
+- cf->data[6] = txerr;
+- cf->data[7] = rxerr;
++ if (priv->can.state != CAN_STATE_BUS_OFF) {
++ cf->data[6] = txerr;
++ cf->data[7] = rxerr;
++ }
+
+ priv->bec.txerr = txerr;
+ priv->bec.rxerr = rxerr;
+diff --git a/drivers/net/dsa/ocelot/Kconfig b/drivers/net/dsa/ocelot/Kconfig
+index 220b0b027b555..08db9cf768180 100644
+--- a/drivers/net/dsa/ocelot/Kconfig
++++ b/drivers/net/dsa/ocelot/Kconfig
+@@ -6,6 +6,7 @@ config NET_DSA_MSCC_FELIX
+ depends on NET_VENDOR_FREESCALE
+ depends on HAS_IOMEM
+ depends on PTP_1588_CLOCK_OPTIONAL
++ depends on NET_SCH_TAPRIO || NET_SCH_TAPRIO=n
+ select MSCC_OCELOT_SWITCH_LIB
+ select NET_DSA_TAG_OCELOT_8021Q
+ select NET_DSA_TAG_OCELOT
+diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
+index faccfb3f01583..d95c2c0a64bd9 100644
+--- a/drivers/net/dsa/ocelot/felix.c
++++ b/drivers/net/dsa/ocelot/felix.c
+@@ -1613,9 +1613,18 @@ static void felix_txtstamp(struct dsa_switch *ds, int port,
+ static int felix_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
+ {
+ struct ocelot *ocelot = ds->priv;
++ struct ocelot_port *ocelot_port = ocelot->ports[port];
++ struct felix *felix = ocelot_to_felix(ocelot);
+
+ ocelot_port_set_maxlen(ocelot, port, new_mtu);
+
++ mutex_lock(&ocelot->tas_lock);
++
++ if (ocelot_port->taprio && felix->info->tas_guard_bands_update)
++ felix->info->tas_guard_bands_update(ocelot, port);
++
++ mutex_unlock(&ocelot->tas_lock);
++
+ return 0;
+ }
+
+diff --git a/drivers/net/dsa/ocelot/felix.h b/drivers/net/dsa/ocelot/felix.h
+index f083b06fdfe93..b0e4635a8cc20 100644
+--- a/drivers/net/dsa/ocelot/felix.h
++++ b/drivers/net/dsa/ocelot/felix.h
+@@ -53,6 +53,7 @@ struct felix_info {
+ struct phylink_link_state *state);
+ int (*port_setup_tc)(struct dsa_switch *ds, int port,
+ enum tc_setup_type type, void *type_data);
++ void (*tas_guard_bands_update)(struct ocelot *ocelot, int port);
+ void (*port_sched_speed_set)(struct ocelot *ocelot, int port,
+ u32 speed);
+ struct regmap *(*init_regmap)(struct ocelot *ocelot,
+diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
+index 4a071f96ea283..d6e8549c12c46 100644
+--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
++++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
+@@ -1124,9 +1124,212 @@ static void vsc9959_mdio_bus_free(struct ocelot *ocelot)
+ mdiobus_free(felix->imdio);
+ }
+
++/* Extract shortest continuous gate open intervals in ns for each traffic class
++ * of a cyclic tc-taprio schedule. If a gate is always open, the duration is
++ * considered U64_MAX. If the gate is always closed, it is considered 0.
++ */
++static void vsc9959_tas_min_gate_lengths(struct tc_taprio_qopt_offload *taprio,
++ u64 min_gate_len[OCELOT_NUM_TC])
++{
++ struct tc_taprio_sched_entry *entry;
++ u64 gate_len[OCELOT_NUM_TC];
++ u8 gates_ever_opened = 0;
++ int tc, i, n;
++
++ /* Initialize arrays */
++ for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
++ min_gate_len[tc] = U64_MAX;
++ gate_len[tc] = 0;
++ }
++
++ /* If we don't have taprio, consider all gates as permanently open */
++ if (!taprio)
++ return;
++
++ n = taprio->num_entries;
++
++ /* Walk through the gate list twice to determine the length
++ * of consecutively open gates for a traffic class, including
++ * open gates that wrap around. We are just interested in the
++ * minimum window size, and this doesn't change what the
++ * minimum is (if the gate never closes, min_gate_len will
++ * remain U64_MAX).
++ */
++ for (i = 0; i < 2 * n; i++) {
++ entry = &taprio->entries[i % n];
++
++ for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
++ if (entry->gate_mask & BIT(tc)) {
++ gate_len[tc] += entry->interval;
++ gates_ever_opened |= BIT(tc);
++ } else {
++ /* Gate closes now, record a potential new
++ * minimum and reinitialize length
++ */
++ if (min_gate_len[tc] > gate_len[tc] &&
++ gate_len[tc])
++ min_gate_len[tc] = gate_len[tc];
++ gate_len[tc] = 0;
++ }
++ }
++ }
++
++ /* min_gate_len[tc] actually tracks minimum *open* gate time, so for
++ * permanently closed gates, min_gate_len[tc] will still be U64_MAX.
++ * Therefore they are currently indistinguishable from permanently
++ * open gates. Overwrite the gate len with 0 when we know they're
++ * actually permanently closed, i.e. after the loop above.
++ */
++ for (tc = 0; tc < OCELOT_NUM_TC; tc++)
++ if (!(gates_ever_opened & BIT(tc)))
++ min_gate_len[tc] = 0;
++}
++
++/* Update QSYS_PORT_MAX_SDU to make sure the static guard bands added by the
++ * switch (see the ALWAYS_GUARD_BAND_SCH_Q comment) are correct at all MTU
++ * values (the default value is 1518). Also, for traffic class windows smaller
++ * than one MTU sized frame, update QSYS_QMAXSDU_CFG to enable oversized frame
++ * dropping, such that these won't hang the port, as they will never be sent.
++ */
++static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
++{
++ struct ocelot_port *ocelot_port = ocelot->ports[port];
++ u64 min_gate_len[OCELOT_NUM_TC];
++ int speed, picos_per_byte;
++ u64 needed_bit_time_ps;
++ u32 val, maxlen;
++ u8 tas_speed;
++ int tc;
++
++ lockdep_assert_held(&ocelot->tas_lock);
++
++ val = ocelot_read_rix(ocelot, QSYS_TAG_CONFIG, port);
++ tas_speed = QSYS_TAG_CONFIG_LINK_SPEED_X(val);
++
++ switch (tas_speed) {
++ case OCELOT_SPEED_10:
++ speed = SPEED_10;
++ break;
++ case OCELOT_SPEED_100:
++ speed = SPEED_100;
++ break;
++ case OCELOT_SPEED_1000:
++ speed = SPEED_1000;
++ break;
++ case OCELOT_SPEED_2500:
++ speed = SPEED_2500;
++ break;
++ default:
++ return;
++ }
++
++ picos_per_byte = (USEC_PER_SEC * 8) / speed;
++
++ val = ocelot_port_readl(ocelot_port, DEV_MAC_MAXLEN_CFG);
++ /* MAXLEN_CFG accounts automatically for VLAN. We need to include it
++ * manually in the bit time calculation, plus the preamble and SFD.
++ */
++ maxlen = val + 2 * VLAN_HLEN;
++ /* Consider the standard Ethernet overhead of 8 octets preamble+SFD,
++ * 4 octets FCS, 12 octets IFG.
++ */
++ needed_bit_time_ps = (maxlen + 24) * picos_per_byte;
++
++ dev_dbg(ocelot->dev,
++ "port %d: max frame size %d needs %llu ps at speed %d\n",
++ port, maxlen, needed_bit_time_ps, speed);
++
++ vsc9959_tas_min_gate_lengths(ocelot_port->taprio, min_gate_len);
++
++ for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
++ u32 max_sdu;
++
++ if (min_gate_len[tc] == U64_MAX /* Gate always open */ ||
++ min_gate_len[tc] * 1000 > needed_bit_time_ps) {
++ /* Setting QMAXSDU_CFG to 0 disables oversized frame
++ * dropping.
++ */
++ max_sdu = 0;
++ dev_dbg(ocelot->dev,
++ "port %d tc %d min gate len %llu"
++ ", sending all frames\n",
++ port, tc, min_gate_len[tc]);
++ } else {
++ /* If traffic class doesn't support a full MTU sized
++ * frame, make sure to enable oversize frame dropping
++ * for frames larger than the smallest that would fit.
++ */
++ max_sdu = div_u64(min_gate_len[tc] * 1000,
++ picos_per_byte);
++ /* A TC gate may be completely closed, which is a
++ * special case where all packets are oversized.
++ * Any limit smaller than 64 octets accomplishes this
++ */
++ if (!max_sdu)
++ max_sdu = 1;
++ /* Take L1 overhead into account, but just don't allow
++ * max_sdu to go negative or to 0. Here we use 20
++ * because QSYS_MAXSDU_CFG_* already counts the 4 FCS
++ * octets as part of packet size.
++ */
++ if (max_sdu > 20)
++ max_sdu -= 20;
++ dev_info(ocelot->dev,
++ "port %d tc %d min gate length %llu"
++ " ns not enough for max frame size %d at %d"
++ " Mbps, dropping frames over %d"
++ " octets including FCS\n",
++ port, tc, min_gate_len[tc], maxlen, speed,
++ max_sdu);
++ }
++
++ /* ocelot_write_rix is a macro that concatenates
++ * QSYS_MAXSDU_CFG_* with _RSZ, so we need to spell out
++ * the writes to each traffic class
++ */
++ switch (tc) {
++ case 0:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_0,
++ port);
++ break;
++ case 1:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_1,
++ port);
++ break;
++ case 2:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_2,
++ port);
++ break;
++ case 3:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_3,
++ port);
++ break;
++ case 4:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_4,
++ port);
++ break;
++ case 5:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_5,
++ port);
++ break;
++ case 6:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_6,
++ port);
++ break;
++ case 7:
++ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_7,
++ port);
++ break;
++ }
++ }
++
++ ocelot_write_rix(ocelot, maxlen, QSYS_PORT_MAX_SDU, port);
++}
++
+ static void vsc9959_sched_speed_set(struct ocelot *ocelot, int port,
+ u32 speed)
+ {
++ struct ocelot_port *ocelot_port = ocelot->ports[port];
+ u8 tas_speed;
+
+ switch (speed) {
+@@ -1151,6 +1354,13 @@ static void vsc9959_sched_speed_set(struct ocelot *ocelot, int port,
+ QSYS_TAG_CONFIG_LINK_SPEED(tas_speed),
+ QSYS_TAG_CONFIG_LINK_SPEED_M,
+ QSYS_TAG_CONFIG, port);
++
++ mutex_lock(&ocelot->tas_lock);
++
++ if (ocelot_port->taprio)
++ vsc9959_tas_guard_bands_update(ocelot, port);
++
++ mutex_unlock(&ocelot->tas_lock);
+ }
+
+ static void vsc9959_new_base_time(struct ocelot *ocelot, ktime_t base_time,
+@@ -1193,10 +1403,13 @@ static void vsc9959_tas_gcl_set(struct ocelot *ocelot, const u32 gcl_ix,
+ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
+ struct tc_taprio_qopt_offload *taprio)
+ {
++ struct ocelot_port *ocelot_port = ocelot->ports[port];
+ struct timespec64 base_ts;
+ int ret, i;
+ u32 val;
+
++ mutex_lock(&ocelot->tas_lock);
++
+ if (!taprio->enable) {
+ ocelot_rmw_rix(ocelot,
+ QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF),
+@@ -1204,15 +1417,25 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
+ QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
+ QSYS_TAG_CONFIG, port);
+
++ taprio_offload_free(ocelot_port->taprio);
++ ocelot_port->taprio = NULL;
++
++ vsc9959_tas_guard_bands_update(ocelot, port);
++
++ mutex_unlock(&ocelot->tas_lock);
+ return 0;
+ }
+
+ if (taprio->cycle_time > NSEC_PER_SEC ||
+- taprio->cycle_time_extension >= NSEC_PER_SEC)
+- return -EINVAL;
++ taprio->cycle_time_extension >= NSEC_PER_SEC) {
++ ret = -EINVAL;
++ goto err;
++ }
+
+- if (taprio->num_entries > VSC9959_TAS_GCL_ENTRY_MAX)
+- return -ERANGE;
++ if (taprio->num_entries > VSC9959_TAS_GCL_ENTRY_MAX) {
++ ret = -ERANGE;
++ goto err;
++ }
+
+ /* Enable guard band. The switch will schedule frames without taking
+ * their length into account. Thus we'll always need to enable the
+@@ -1233,8 +1456,10 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
+ * config is pending, need reset the TAS module
+ */
+ val = ocelot_read(ocelot, QSYS_PARAM_STATUS_REG_8);
+- if (val & QSYS_PARAM_STATUS_REG_8_CONFIG_PENDING)
+- return -EBUSY;
++ if (val & QSYS_PARAM_STATUS_REG_8_CONFIG_PENDING) {
++ ret = -EBUSY;
++ goto err;
++ }
+
+ ocelot_rmw_rix(ocelot,
+ QSYS_TAG_CONFIG_ENABLE |
+@@ -1267,10 +1492,71 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
+ ret = readx_poll_timeout(vsc9959_tas_read_cfg_status, ocelot, val,
+ !(val & QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE),
+ 10, 100000);
++ if (ret)
++ goto err;
++
++ ocelot_port->taprio = taprio_offload_get(taprio);
++ vsc9959_tas_guard_bands_update(ocelot, port);
++
++err:
++ mutex_unlock(&ocelot->tas_lock);
+
+ return ret;
+ }
+
++static void vsc9959_tas_clock_adjust(struct ocelot *ocelot)
++{
++ struct tc_taprio_qopt_offload *taprio;
++ struct ocelot_port *ocelot_port;
++ struct timespec64 base_ts;
++ int port;
++ u32 val;
++
++ mutex_lock(&ocelot->tas_lock);
++
++ for (port = 0; port < ocelot->num_phys_ports; port++) {
++ ocelot_port = ocelot->ports[port];
++ taprio = ocelot_port->taprio;
++ if (!taprio)
++ continue;
++
++ ocelot_rmw(ocelot,
++ QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM(port),
++ QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM_M,
++ QSYS_TAS_PARAM_CFG_CTRL);
++
++ ocelot_rmw_rix(ocelot,
++ QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF),
++ QSYS_TAG_CONFIG_ENABLE |
++ QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
++ QSYS_TAG_CONFIG, port);
++
++ vsc9959_new_base_time(ocelot, taprio->base_time,
++ taprio->cycle_time, &base_ts);
++
++ ocelot_write(ocelot, base_ts.tv_nsec, QSYS_PARAM_CFG_REG_1);
++ ocelot_write(ocelot, lower_32_bits(base_ts.tv_sec),
++ QSYS_PARAM_CFG_REG_2);
++ val = upper_32_bits(base_ts.tv_sec);
++ ocelot_rmw(ocelot,
++ QSYS_PARAM_CFG_REG_3_BASE_TIME_SEC_MSB(val),
++ QSYS_PARAM_CFG_REG_3_BASE_TIME_SEC_MSB_M,
++ QSYS_PARAM_CFG_REG_3);
++
++ ocelot_rmw(ocelot, QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE,
++ QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE,
++ QSYS_TAS_PARAM_CFG_CTRL);
++
++ ocelot_rmw_rix(ocelot,
++ QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) |
++ QSYS_TAG_CONFIG_ENABLE,
++ QSYS_TAG_CONFIG_ENABLE |
++ QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
++ QSYS_TAG_CONFIG, port);
++ }
++ mutex_unlock(&ocelot->tas_lock);
++}
++
+ static int vsc9959_qos_port_cbs_set(struct dsa_switch *ds, int port,
+ struct tc_cbs_qopt_offload *cbs_qopt)
+ {
+@@ -2210,6 +2496,7 @@ static const struct ocelot_ops vsc9959_ops = {
+ .psfp_filter_del = vsc9959_psfp_filter_del,
+ .psfp_stats_get = vsc9959_psfp_stats_get,
+ .cut_through_fwd = vsc9959_cut_through_fwd,
++ .tas_clock_adjust = vsc9959_tas_clock_adjust,
+ };
+
+ static const struct felix_info felix_info_vsc9959 = {
+@@ -2237,6 +2524,7 @@ static const struct felix_info felix_info_vsc9959 = {
+ .port_modes = vsc9959_port_modes,
+ .port_setup_tc = vsc9959_port_setup_tc,
+ .port_sched_speed_set = vsc9959_sched_speed_set,
++ .tas_guard_bands_update = vsc9959_tas_guard_bands_update,
+ .init_regmap = ocelot_regmap_init,
+ };
+
+diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
+index ec167af0e3b2d..5041265f42262 100644
+--- a/drivers/net/ethernet/atheros/ag71xx.c
++++ b/drivers/net/ethernet/atheros/ag71xx.c
+@@ -946,7 +946,7 @@ static unsigned int ag71xx_max_frame_len(unsigned int mtu)
+ return ETH_HLEN + VLAN_HLEN + mtu + ETH_FCS_LEN;
+ }
+
+-static void ag71xx_hw_set_macaddr(struct ag71xx *ag, unsigned char *mac)
++static void ag71xx_hw_set_macaddr(struct ag71xx *ag, const unsigned char *mac)
+ {
+ u32 t;
+
+diff --git a/drivers/net/ethernet/huawei/hinic/hinic_dev.h b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
+index fb3e89141a0d9..a4fbf44f944cd 100644
+--- a/drivers/net/ethernet/huawei/hinic/hinic_dev.h
++++ b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
+@@ -95,9 +95,6 @@ struct hinic_dev {
+ u16 sq_depth;
+ u16 rq_depth;
+
+- struct hinic_txq_stats tx_stats;
+- struct hinic_rxq_stats rx_stats;
+-
+ u8 rss_tmpl_idx;
+ u8 rss_hash_engine;
+ u16 num_rss;
+diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c
+index 05329292d940f..c23ee2ddbce3e 100644
+--- a/drivers/net/ethernet/huawei/hinic/hinic_main.c
++++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
+@@ -62,8 +62,6 @@ MODULE_PARM_DESC(rx_weight, "Number Rx packets for NAPI budget (default=64)");
+
+ #define HINIC_LRO_RX_TIMER_DEFAULT 16
+
+-#define VLAN_BITMAP_SIZE(nic_dev) (ALIGN(VLAN_N_VID, 8) / 8)
+-
+ #define work_to_rx_mode_work(work) \
+ container_of(work, struct hinic_rx_mode_work, work)
+
+@@ -82,56 +80,44 @@ static int set_features(struct hinic_dev *nic_dev,
+ netdev_features_t pre_features,
+ netdev_features_t features, bool force_change);
+
+-static void update_rx_stats(struct hinic_dev *nic_dev, struct hinic_rxq *rxq)
++static void gather_rx_stats(struct hinic_rxq_stats *nic_rx_stats, struct hinic_rxq *rxq)
+ {
+- struct hinic_rxq_stats *nic_rx_stats = &nic_dev->rx_stats;
+ struct hinic_rxq_stats rx_stats;
+
+- u64_stats_init(&rx_stats.syncp);
+-
+ hinic_rxq_get_stats(rxq, &rx_stats);
+
+- u64_stats_update_begin(&nic_rx_stats->syncp);
+ nic_rx_stats->bytes += rx_stats.bytes;
+ nic_rx_stats->pkts += rx_stats.pkts;
+ nic_rx_stats->errors += rx_stats.errors;
+ nic_rx_stats->csum_errors += rx_stats.csum_errors;
+ nic_rx_stats->other_errors += rx_stats.other_errors;
+- u64_stats_update_end(&nic_rx_stats->syncp);
+-
+- hinic_rxq_clean_stats(rxq);
+ }
+
+-static void update_tx_stats(struct hinic_dev *nic_dev, struct hinic_txq *txq)
++static void gather_tx_stats(struct hinic_txq_stats *nic_tx_stats, struct hinic_txq *txq)
+ {
+- struct hinic_txq_stats *nic_tx_stats = &nic_dev->tx_stats;
+ struct hinic_txq_stats tx_stats;
+
+- u64_stats_init(&tx_stats.syncp);
+-
+ hinic_txq_get_stats(txq, &tx_stats);
+
+- u64_stats_update_begin(&nic_tx_stats->syncp);
+ nic_tx_stats->bytes += tx_stats.bytes;
+ nic_tx_stats->pkts += tx_stats.pkts;
+ nic_tx_stats->tx_busy += tx_stats.tx_busy;
+ nic_tx_stats->tx_wake += tx_stats.tx_wake;
+ nic_tx_stats->tx_dropped += tx_stats.tx_dropped;
+ nic_tx_stats->big_frags_pkts += tx_stats.big_frags_pkts;
+- u64_stats_update_end(&nic_tx_stats->syncp);
+-
+- hinic_txq_clean_stats(txq);
+ }
+
+-static void update_nic_stats(struct hinic_dev *nic_dev)
++static void gather_nic_stats(struct hinic_dev *nic_dev,
++ struct hinic_rxq_stats *nic_rx_stats,
++ struct hinic_txq_stats *nic_tx_stats)
+ {
+ int i, num_qps = hinic_hwdev_num_qps(nic_dev->hwdev);
+
+ for (i = 0; i < num_qps; i++)
+- update_rx_stats(nic_dev, &nic_dev->rxqs[i]);
++ gather_rx_stats(nic_rx_stats, &nic_dev->rxqs[i]);
+
+ for (i = 0; i < num_qps; i++)
+- update_tx_stats(nic_dev, &nic_dev->txqs[i]);
++ gather_tx_stats(nic_tx_stats, &nic_dev->txqs[i]);
+ }
+
+ /**
+@@ -560,8 +546,6 @@ int hinic_close(struct net_device *netdev)
+ netif_carrier_off(netdev);
+ netif_tx_disable(netdev);
+
+- update_nic_stats(nic_dev);
+-
+ up(&nic_dev->mgmt_lock);
+
+ if (!HINIC_IS_VF(nic_dev->hwdev->hwif))
+@@ -855,26 +839,19 @@ static void hinic_get_stats64(struct net_device *netdev,
+ struct rtnl_link_stats64 *stats)
+ {
+ struct hinic_dev *nic_dev = netdev_priv(netdev);
+- struct hinic_rxq_stats *nic_rx_stats;
+- struct hinic_txq_stats *nic_tx_stats;
+-
+- nic_rx_stats = &nic_dev->rx_stats;
+- nic_tx_stats = &nic_dev->tx_stats;
+-
+- down(&nic_dev->mgmt_lock);
++ struct hinic_rxq_stats nic_rx_stats = {};
++ struct hinic_txq_stats nic_tx_stats = {};
+
+ if (nic_dev->flags & HINIC_INTF_UP)
+- update_nic_stats(nic_dev);
+-
+- up(&nic_dev->mgmt_lock);
++ gather_nic_stats(nic_dev, &nic_rx_stats, &nic_tx_stats);
+
+- stats->rx_bytes = nic_rx_stats->bytes;
+- stats->rx_packets = nic_rx_stats->pkts;
+- stats->rx_errors = nic_rx_stats->errors;
++ stats->rx_bytes = nic_rx_stats.bytes;
++ stats->rx_packets = nic_rx_stats.pkts;
++ stats->rx_errors = nic_rx_stats.errors;
+
+- stats->tx_bytes = nic_tx_stats->bytes;
+- stats->tx_packets = nic_tx_stats->pkts;
+- stats->tx_errors = nic_tx_stats->tx_dropped;
++ stats->tx_bytes = nic_tx_stats.bytes;
++ stats->tx_packets = nic_tx_stats.pkts;
++ stats->tx_errors = nic_tx_stats.tx_dropped;
+ }
+
+ static int hinic_set_features(struct net_device *netdev,
+@@ -1173,8 +1150,6 @@ static void hinic_free_intr_coalesce(struct hinic_dev *nic_dev)
+ static int nic_dev_init(struct pci_dev *pdev)
+ {
+ struct hinic_rx_mode_work *rx_mode_work;
+- struct hinic_txq_stats *tx_stats;
+- struct hinic_rxq_stats *rx_stats;
+ struct hinic_dev *nic_dev;
+ struct net_device *netdev;
+ struct hinic_hwdev *hwdev;
+@@ -1236,15 +1211,8 @@ static int nic_dev_init(struct pci_dev *pdev)
+
+ sema_init(&nic_dev->mgmt_lock, 1);
+
+- tx_stats = &nic_dev->tx_stats;
+- rx_stats = &nic_dev->rx_stats;
+-
+- u64_stats_init(&tx_stats->syncp);
+- u64_stats_init(&rx_stats->syncp);
+-
+- nic_dev->vlan_bitmap = devm_kzalloc(&pdev->dev,
+- VLAN_BITMAP_SIZE(nic_dev),
+- GFP_KERNEL);
++ nic_dev->vlan_bitmap = devm_bitmap_zalloc(&pdev->dev, VLAN_N_VID,
++ GFP_KERNEL);
+ if (!nic_dev->vlan_bitmap) {
+ err = -ENOMEM;
+ goto err_vlan_bitmap;
+diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+index b33ed4d92b71b..b6bce622a6a82 100644
+--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
++++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+@@ -73,7 +73,6 @@ void hinic_rxq_get_stats(struct hinic_rxq *rxq, struct hinic_rxq_stats *stats)
+ struct hinic_rxq_stats *rxq_stats = &rxq->rxq_stats;
+ unsigned int start;
+
+- u64_stats_update_begin(&stats->syncp);
+ do {
+ start = u64_stats_fetch_begin(&rxq_stats->syncp);
+ stats->pkts = rxq_stats->pkts;
+@@ -83,7 +82,6 @@ void hinic_rxq_get_stats(struct hinic_rxq *rxq, struct hinic_rxq_stats *stats)
+ stats->csum_errors = rxq_stats->csum_errors;
+ stats->other_errors = rxq_stats->other_errors;
+ } while (u64_stats_fetch_retry(&rxq_stats->syncp, start));
+- u64_stats_update_end(&stats->syncp);
+ }
+
+ /**
+diff --git a/drivers/net/ethernet/huawei/hinic/hinic_tx.c b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
+index 8d59babbf476c..082eb2a5088de 100644
+--- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c
++++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
+@@ -98,7 +98,6 @@ void hinic_txq_get_stats(struct hinic_txq *txq, struct hinic_txq_stats *stats)
+ struct hinic_txq_stats *txq_stats = &txq->txq_stats;
+ unsigned int start;
+
+- u64_stats_update_begin(&stats->syncp);
+ do {
+ start = u64_stats_fetch_begin(&txq_stats->syncp);
+ stats->pkts = txq_stats->pkts;
+@@ -108,7 +107,6 @@ void hinic_txq_get_stats(struct hinic_txq *txq, struct hinic_txq_stats *stats)
+ stats->tx_dropped = txq_stats->tx_dropped;
+ stats->big_frags_pkts = txq_stats->big_frags_pkts;
+ } while (u64_stats_fetch_retry(&txq_stats->syncp, start));
+- u64_stats_update_end(&stats->syncp);
+ }
+
+ /**
+diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
+index 0ea0361cd86b1..a988c08e906f1 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf.h
++++ b/drivers/net/ethernet/intel/iavf/iavf.h
+@@ -92,6 +92,7 @@ struct iavf_vsi {
+ #define IAVF_HKEY_ARRAY_SIZE ((IAVF_VFQF_HKEY_MAX_INDEX + 1) * 4)
+ #define IAVF_HLUT_ARRAY_SIZE ((IAVF_VFQF_HLUT_MAX_INDEX + 1) * 4)
+ #define IAVF_MBPS_DIVISOR 125000 /* divisor to convert to Mbps */
++#define IAVF_MBPS_QUANTA 50
+
+ #define IAVF_VIRTCHNL_VF_RESOURCE_SIZE (sizeof(struct virtchnl_vf_resource) + \
+ (IAVF_MAX_VF_VSI * \
+@@ -430,6 +431,11 @@ struct iavf_adapter {
+ /* lock to protect access to the cloud filter list */
+ spinlock_t cloud_filter_list_lock;
+ u16 num_cloud_filters;
++ /* snapshot of "num_active_queues" before setup_tc for qdisc add
++ * is invoked. This information is useful during qdisc del flow,
++ * to restore correct number of queues
++ */
++ int orig_num_active_queues;
+
+ #define IAVF_MAX_FDIR_FILTERS 128 /* max allowed Flow Director filters */
+ u16 fdir_active_fltr;
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
+index 2e2c153ce46a3..3dbfaead2ac74 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
+@@ -3322,6 +3322,7 @@ static int iavf_validate_ch_config(struct iavf_adapter *adapter,
+ struct tc_mqprio_qopt_offload *mqprio_qopt)
+ {
+ u64 total_max_rate = 0;
++ u32 tx_rate_rem = 0;
+ int i, num_qps = 0;
+ u64 tx_rate = 0;
+ int ret = 0;
+@@ -3336,12 +3337,32 @@ static int iavf_validate_ch_config(struct iavf_adapter *adapter,
+ return -EINVAL;
+ if (mqprio_qopt->min_rate[i]) {
+ dev_err(&adapter->pdev->dev,
+- "Invalid min tx rate (greater than 0) specified\n");
++ "Invalid min tx rate (greater than 0) specified for TC%d\n",
++ i);
+ return -EINVAL;
+ }
+- /*convert to Mbps */
++
++ /* convert to Mbps */
+ tx_rate = div_u64(mqprio_qopt->max_rate[i],
+ IAVF_MBPS_DIVISOR);
++
++ if (mqprio_qopt->max_rate[i] &&
++ tx_rate < IAVF_MBPS_QUANTA) {
++ dev_err(&adapter->pdev->dev,
++ "Invalid max tx rate for TC%d, minimum %dMbps\n",
++ i, IAVF_MBPS_QUANTA);
++ return -EINVAL;
++ }
++
++ (void)div_u64_rem(tx_rate, IAVF_MBPS_QUANTA, &tx_rate_rem);
++
++ if (tx_rate_rem != 0) {
++ dev_err(&adapter->pdev->dev,
++ "Invalid max tx rate for TC%d, not divisible by %d\n",
++ i, IAVF_MBPS_QUANTA);
++ return -EINVAL;
++ }
++
+ total_max_rate += tx_rate;
+ num_qps += mqprio_qopt->qopt.count[i];
+ }
+@@ -3408,6 +3429,7 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
+ netif_tx_disable(netdev);
+ iavf_del_all_cloud_filters(adapter);
+ adapter->aq_required = IAVF_FLAG_AQ_DISABLE_CHANNELS;
++ total_qps = adapter->orig_num_active_queues;
+ goto exit;
+ } else {
+ return -EINVAL;
+@@ -3451,7 +3473,21 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
+ adapter->ch_config.ch_info[i].offset = 0;
+ }
+ }
++
++ /* Take snapshot of original config such as "num_active_queues"
++ * It is used later when delete ADQ flow is exercised, so that
++ * once delete ADQ flow completes, VF shall go back to its
++ * original queue configuration
++ */
++
++ adapter->orig_num_active_queues = adapter->num_active_queues;
++
++ /* Store queue info based on TC so that VF gets configured
++ * with correct number of queues when VF completes ADQ config
++ * flow
++ */
+ adapter->ch_config.total_qps = total_qps;
++
+ netif_tx_stop_all_queues(netdev);
+ netif_tx_disable(netdev);
+ adapter->aq_required |= IAVF_FLAG_AQ_ENABLE_CHANNELS;
+@@ -3468,6 +3504,12 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
+ }
+ }
+ exit:
++ if (test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
++ return 0;
++
++ netif_set_real_num_rx_queues(netdev, total_qps);
++ netif_set_real_num_tx_queues(netdev, total_qps);
++
+ return ret;
+ }
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index 522462f41067a..f48938af960c1 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -419,7 +419,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
+ IFF_PROMISC;
+ goto out_promisc;
+ }
+- if (vsi->current_netdev_flags &
++ if (vsi->netdev->features &
+ NETIF_F_HW_VLAN_CTAG_FILTER)
+ vlan_ops->ena_rx_filtering(vsi);
+ }
+diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c
+index 25b8f6f726eba..73960c8f9dba6 100644
+--- a/drivers/net/ethernet/intel/ice/ice_switch.c
++++ b/drivers/net/ethernet/intel/ice/ice_switch.c
+@@ -4874,7 +4874,7 @@ ice_find_free_recp_res_idx(struct ice_hw *hw, const unsigned long *profiles,
+ bitmap_zero(recipes, ICE_MAX_NUM_RECIPES);
+ bitmap_zero(used_idx, ICE_MAX_FV_WORDS);
+
+- bitmap_set(possible_idx, 0, ICE_MAX_FV_WORDS);
++ bitmap_fill(possible_idx, ICE_MAX_FV_WORDS);
+
+ /* For each profile we are going to associate the recipe with, add the
+ * recipes that are associated with that profile. This will give us
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
+index d0d14325a0d95..3ddf76aea3f10 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
+@@ -109,7 +109,7 @@ struct page_pool;
+ #define MLX5E_REQUIRED_WQE_MTTS (MLX5_ALIGN_MTTS(MLX5_MPWRQ_PAGES_PER_WQE + 1))
+ #define MLX5E_REQUIRED_MTTS(wqes) (wqes * MLX5E_REQUIRED_WQE_MTTS)
+ #define MLX5E_MAX_RQ_NUM_MTTS \
+- ((1 << 16) * 2) /* So that MLX5_MTT_OCTW(num_mtts) fits into u16 */
++ (ALIGN_DOWN(U16_MAX, 4) * 2) /* So that MLX5_MTT_OCTW(num_mtts) fits into u16 */
+ #define MLX5E_ORDER2_MAX_PACKET_MTU (order_base_2(10 * 1024))
+ #define MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE_MPW \
+ (ilog2(MLX5E_MAX_RQ_NUM_MTTS / MLX5E_REQUIRED_WQE_MTTS))
+@@ -174,8 +174,8 @@ struct page_pool;
+ ALIGN_DOWN(MLX5E_KLM_MAX_ENTRIES_PER_WQE(wqe_size), MLX5_UMR_KLM_ALIGNMENT)
+
+ #define MLX5E_MAX_KLM_PER_WQE(mdev) \
+- MLX5E_KLM_ENTRIES_PER_WQE(mlx5e_get_sw_max_sq_mpw_wqebbs(mlx5e_get_max_sq_wqebbs(mdev)) \
+- << MLX5_MKEY_BSF_OCTO_SIZE)
++ MLX5E_KLM_ENTRIES_PER_WQE(MLX5_SEND_WQE_BB * \
++ mlx5e_get_sw_max_sq_mpw_wqebbs(mlx5e_get_max_sq_wqebbs(mdev)))
+
+ #define MLX5E_MSG_LEVEL NETIF_MSG_LINK
+
+@@ -233,7 +233,7 @@ static inline u16 mlx5e_get_max_sq_wqebbs(struct mlx5_core_dev *mdev)
+ MLX5_CAP_GEN(mdev, max_wqe_sz_sq) / MLX5_SEND_WQE_BB);
+ }
+
+-static inline u16 mlx5e_get_sw_max_sq_mpw_wqebbs(u16 max_sq_wqebbs)
++static inline u8 mlx5e_get_sw_max_sq_mpw_wqebbs(u8 max_sq_wqebbs)
+ {
+ /* The return value will be multiplied by MLX5_SEND_WQEBB_NUM_DS.
+ * Since max_sq_wqebbs may be up to MLX5_SEND_WQE_MAX_WQEBBS == 16,
+@@ -242,11 +242,12 @@ static inline u16 mlx5e_get_sw_max_sq_mpw_wqebbs(u16 max_sq_wqebbs)
+ * than MLX5_SEND_WQE_MAX_WQEBBS to let a full-session WQE be
+ * cache-aligned.
+ */
+-#if L1_CACHE_BYTES < 128
+- return min_t(u16, max_sq_wqebbs, MLX5_SEND_WQE_MAX_WQEBBS - 1);
+-#else
+- return min_t(u16, max_sq_wqebbs, MLX5_SEND_WQE_MAX_WQEBBS - 2);
++ u8 wqebbs = min_t(u8, max_sq_wqebbs, MLX5_SEND_WQE_MAX_WQEBBS - 1);
++
++#if L1_CACHE_BYTES >= 128
++ wqebbs = ALIGN_DOWN(wqebbs, 2);
+ #endif
++ return wqebbs;
+ }
+
+ struct mlx5e_tx_wqe {
+@@ -456,7 +457,7 @@ struct mlx5e_txqsq {
+ struct netdev_queue *txq;
+ u32 sqn;
+ u16 stop_room;
+- u16 max_sq_mpw_wqebbs;
++ u8 max_sq_mpw_wqebbs;
+ u8 min_inline_mode;
+ struct device *pdev;
+ __be32 mkey_be;
+@@ -571,7 +572,7 @@ struct mlx5e_xdpsq {
+ struct device *pdev;
+ __be32 mkey_be;
+ u16 stop_room;
+- u16 max_sq_mpw_wqebbs;
++ u8 max_sq_mpw_wqebbs;
+ u8 min_inline_mode;
+ unsigned long state;
+ unsigned int hw_mtu;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+index 08fd1370a8b0a..75da12e1b0c7b 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+@@ -797,8 +797,20 @@ static u8 mlx5e_build_icosq_log_wq_sz(struct mlx5_core_dev *mdev,
+ return MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
+
+ wqebbs = MLX5E_UMR_WQEBBS * BIT(mlx5e_get_rq_log_wq_sz(rqp->rqc));
++
++ /* If XDP program is attached, XSK may be turned on at any time without
++ * restarting the channel. ICOSQ must be big enough to fit UMR WQEs of
++ * both regular RQ and XSK RQ.
++ * Although mlx5e_mpwqe_get_log_rq_size accepts mlx5e_xsk_param, it
++ * doesn't affect its return value, as long as params->xdp_prog != NULL,
++ * so we can just multiply by 2.
++ */
++ if (params->xdp_prog)
++ wqebbs *= 2;
++
+ if (params->packet_merge.type == MLX5E_PACKET_MERGE_SHAMPO)
+ wqebbs += mlx5e_shampo_icosq_sz(mdev, params, rqp);
++
+ return max_t(u8, MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE, order_base_2(wqebbs));
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
+index dea137dd744b4..2b64dd557b5d1 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
+@@ -128,6 +128,7 @@ mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *at
+ post_attr->inner_match_level = MLX5_MATCH_NONE;
+ post_attr->outer_match_level = MLX5_MATCH_NONE;
+ post_attr->action &= ~MLX5_FLOW_CONTEXT_ACTION_DECAP;
++ post_attr->flags |= MLX5_ATTR_FLAG_NO_IN_PORT;
+
+ handle->ns_type = post_act->ns_type;
+ /* Splits were handled before post action */
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
+index 7f88ccf67fdde..8b56cb8b47433 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
+@@ -7,6 +7,8 @@
+ #include "en.h"
+ #include <net/xdp_sock_drv.h>
+
++#define MLX5E_MTT_PTAG_MASK 0xfffffffffffffff8ULL
++
+ /* RX data path */
+
+ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
+@@ -22,6 +24,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
+ static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
+ struct mlx5e_dma_info *dma_info)
+ {
++retry:
+ dma_info->xsk = xsk_buff_alloc(rq->xsk_pool);
+ if (!dma_info->xsk)
+ return -ENOMEM;
+@@ -33,6 +36,17 @@ static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
+ */
+ dma_info->addr = xsk_buff_xdp_get_frame_dma(dma_info->xsk);
+
++ /* MTT page mapping has alignment requirements. If they are not
++ * satisfied, leak the descriptor so that it won't come again, and try
++ * to allocate a new one.
++ */
++ if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
++ if (unlikely(dma_info->addr & ~MLX5E_MTT_PTAG_MASK)) {
++ xsk_buff_discard(dma_info->xsk);
++ goto retry;
++ }
++ }
++
+ return 0;
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
+index d93aadbf10da8..90ea78239d402 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
+@@ -16,7 +16,7 @@ static int mlx5e_ktls_add(struct net_device *netdev, struct sock *sk,
+ struct mlx5_core_dev *mdev = priv->mdev;
+ int err;
+
+- if (WARN_ON(!mlx5e_ktls_type_check(mdev, crypto_info)))
++ if (!mlx5e_ktls_type_check(mdev, crypto_info))
+ return -EOPNOTSUPP;
+
+ if (direction == TLS_OFFLOAD_CTX_DIR_TX)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+index 796d97bcf1aa0..c6546613f7d8c 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+@@ -230,10 +230,8 @@ esw_setup_ft_dest(struct mlx5_flow_destination *dest,
+ }
+
+ static void
+-esw_setup_slow_path_dest(struct mlx5_flow_destination *dest,
+- struct mlx5_flow_act *flow_act,
+- struct mlx5_fs_chains *chains,
+- int i)
++esw_setup_accept_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *flow_act,
++ struct mlx5_fs_chains *chains, int i)
+ {
+ if (mlx5_chains_ignore_flow_level_supported(chains))
+ flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
+@@ -241,6 +239,16 @@ esw_setup_slow_path_dest(struct mlx5_flow_destination *dest,
+ dest[i].ft = mlx5_chains_get_tc_end_ft(chains);
+ }
+
++static void
++esw_setup_slow_path_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *flow_act,
++ struct mlx5_eswitch *esw, int i)
++{
++ if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, ignore_flow_level))
++ flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
++ dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
++ dest[i].ft = esw->fdb_table.offloads.slow_fdb;
++}
++
+ static int
+ esw_setup_chain_dest(struct mlx5_flow_destination *dest,
+ struct mlx5_flow_act *flow_act,
+@@ -473,8 +481,11 @@ esw_setup_dests(struct mlx5_flow_destination *dest,
+ } else if (attr->dest_ft) {
+ esw_setup_ft_dest(dest, flow_act, esw, attr, spec, *i);
+ (*i)++;
+- } else if (mlx5e_tc_attr_flags_skip(attr->flags)) {
+- esw_setup_slow_path_dest(dest, flow_act, chains, *i);
++ } else if (attr->flags & MLX5_ATTR_FLAG_SLOW_PATH) {
++ esw_setup_slow_path_dest(dest, flow_act, esw, *i);
++ (*i)++;
++ } else if (attr->flags & MLX5_ATTR_FLAG_ACCEPT) {
++ esw_setup_accept_dest(dest, flow_act, chains, *i);
+ (*i)++;
+ } else if (attr->dest_chain) {
+ err = esw_setup_chain_dest(dest, flow_act, chains, attr->dest_chain,
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
+index d758848d34d0c..696e45e2bd06d 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
+@@ -32,20 +32,17 @@ static void tout_set(struct mlx5_core_dev *dev, u64 val, enum mlx5_timeouts_type
+ dev->timeouts->to[type] = val;
+ }
+
+-void mlx5_tout_set_def_val(struct mlx5_core_dev *dev)
++int mlx5_tout_init(struct mlx5_core_dev *dev)
+ {
+ int i;
+
+- for (i = 0; i < MAX_TIMEOUT_TYPES; i++)
+- tout_set(dev, tout_def_sw_val[i], i);
+-}
+-
+-int mlx5_tout_init(struct mlx5_core_dev *dev)
+-{
+ dev->timeouts = kmalloc(sizeof(*dev->timeouts), GFP_KERNEL);
+ if (!dev->timeouts)
+ return -ENOMEM;
+
++ for (i = 0; i < MAX_TIMEOUT_TYPES; i++)
++ tout_set(dev, tout_def_sw_val[i], i);
++
+ return 0;
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
+index 257c03eeab365..bc9e9aeda8478 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
+@@ -35,7 +35,6 @@ int mlx5_tout_init(struct mlx5_core_dev *dev);
+ void mlx5_tout_cleanup(struct mlx5_core_dev *dev);
+ void mlx5_tout_query_iseg(struct mlx5_core_dev *dev);
+ int mlx5_tout_query_dtor(struct mlx5_core_dev *dev);
+-void mlx5_tout_set_def_val(struct mlx5_core_dev *dev);
+ u64 _mlx5_tout_ms(struct mlx5_core_dev *dev, enum mlx5_timeouts_types type);
+
+ #define mlx5_tout_ms(dev, type) _mlx5_tout_ms(dev, MLX5_TO_##type##_MS)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+index 8b52636999943..75d216246955d 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+@@ -526,7 +526,7 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
+
+ /* Check log_max_qp from HCA caps to set in current profile */
+ if (prof->log_max_qp == LOG_MAX_SUPPORTED_QPS) {
+- prof->log_max_qp = min_t(u8, 17, MLX5_CAP_GEN_MAX(dev, log_max_qp));
++ prof->log_max_qp = min_t(u8, 18, MLX5_CAP_GEN_MAX(dev, log_max_qp));
+ } else if (MLX5_CAP_GEN_MAX(dev, log_max_qp) < prof->log_max_qp) {
+ mlx5_core_warn(dev, "log_max_qp value in current profile is %d, changing it to HCA capability limit (%d)\n",
+ prof->log_max_qp,
+@@ -1025,8 +1025,6 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, u64 timeout)
+ if (mlx5_core_is_pf(dev))
+ pcie_print_link_status(dev->pdev);
+
+- mlx5_tout_set_def_val(dev);
+-
+ /* wait for firmware to accept initialization segments configurations
+ */
+ err = wait_fw_init(dev, timeout,
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+index 887ee0f729d12..2935614f6fa9d 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+@@ -87,6 +87,11 @@ static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
+ enable_vfs_hca:
+ num_msix_count = mlx5_get_default_msix_vec_count(dev, num_vfs);
+ for (vf = 0; vf < num_vfs; vf++) {
++ /* Notify the VF before its enablement to let it set
++ * some stuff.
++ */
++ blocking_notifier_call_chain(&sriov->vfs_ctx[vf].notifier,
++ MLX5_PF_NOTIFY_ENABLE_VF, dev);
+ err = mlx5_core_enable_hca(dev, vf + 1);
+ if (err) {
+ mlx5_core_warn(dev, "failed to enable VF %d (%d)\n", vf, err);
+@@ -127,6 +132,11 @@ mlx5_device_disable_sriov(struct mlx5_core_dev *dev, int num_vfs, bool clear_vf)
+ for (vf = num_vfs - 1; vf >= 0; vf--) {
+ if (!sriov->vfs_ctx[vf].enabled)
+ continue;
++ /* Notify the VF before its disablement to let it clean
++ * some resources.
++ */
++ blocking_notifier_call_chain(&sriov->vfs_ctx[vf].notifier,
++ MLX5_PF_NOTIFY_DISABLE_VF, dev);
+ err = mlx5_core_disable_hca(dev, vf + 1);
+ if (err) {
+ mlx5_core_warn(dev, "failed to disable VF %d\n", vf);
+@@ -257,7 +267,7 @@ int mlx5_sriov_init(struct mlx5_core_dev *dev)
+ {
+ struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+ struct pci_dev *pdev = dev->pdev;
+- int total_vfs;
++ int total_vfs, i;
+
+ if (!mlx5_core_is_pf(dev))
+ return 0;
+@@ -269,6 +279,9 @@ int mlx5_sriov_init(struct mlx5_core_dev *dev)
+ if (!sriov->vfs_ctx)
+ return -ENOMEM;
+
++ for (i = 0; i < total_vfs; i++)
++ BLOCKING_INIT_NOTIFIER_HEAD(&sriov->vfs_ctx[i].notifier);
++
+ return 0;
+ }
+
+@@ -281,3 +294,53 @@ void mlx5_sriov_cleanup(struct mlx5_core_dev *dev)
+
+ kfree(sriov->vfs_ctx);
+ }
++
++/**
++ * mlx5_sriov_blocking_notifier_unregister - Unregister a VF from
++ * a notification block chain.
++ *
++ * @mdev: The mlx5 core device.
++ * @vf_id: The VF id.
++ * @nb: The notifier block to be unregistered.
++ */
++void mlx5_sriov_blocking_notifier_unregister(struct mlx5_core_dev *mdev,
++ int vf_id,
++ struct notifier_block *nb)
++{
++ struct mlx5_vf_context *vfs_ctx;
++ struct mlx5_core_sriov *sriov;
++
++ sriov = &mdev->priv.sriov;
++ if (WARN_ON(vf_id < 0 || vf_id >= sriov->num_vfs))
++ return;
++
++ vfs_ctx = &sriov->vfs_ctx[vf_id];
++ blocking_notifier_chain_unregister(&vfs_ctx->notifier, nb);
++}
++EXPORT_SYMBOL(mlx5_sriov_blocking_notifier_unregister);
++
++/**
++ * mlx5_sriov_blocking_notifier_register - Register a VF notification
++ * block chain.
++ *
++ * @mdev: The mlx5 core device.
++ * @vf_id: The VF id.
++ * @nb: The notifier block to be called upon the VF events.
++ *
++ * Returns 0 on success or an error code.
++ */
++int mlx5_sriov_blocking_notifier_register(struct mlx5_core_dev *mdev,
++ int vf_id,
++ struct notifier_block *nb)
++{
++ struct mlx5_vf_context *vfs_ctx;
++ struct mlx5_core_sriov *sriov;
++
++ sriov = &mdev->priv.sriov;
++ if (vf_id < 0 || vf_id >= sriov->num_vfs)
++ return -EINVAL;
++
++ vfs_ctx = &sriov->vfs_ctx[vf_id];
++ return blocking_notifier_chain_register(&vfs_ctx->notifier, nb);
++}
++EXPORT_SYMBOL(mlx5_sriov_blocking_notifier_register);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
+index d5998ef59be47..7adcf0eec13be 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
+@@ -21,10 +21,11 @@ enum dr_dump_rec_type {
+ DR_DUMP_REC_TYPE_TABLE_TX = 3102,
+
+ DR_DUMP_REC_TYPE_MATCHER = 3200,
+- DR_DUMP_REC_TYPE_MATCHER_MASK = 3201,
++ DR_DUMP_REC_TYPE_MATCHER_MASK_DEPRECATED = 3201,
+ DR_DUMP_REC_TYPE_MATCHER_RX = 3202,
+ DR_DUMP_REC_TYPE_MATCHER_TX = 3203,
+ DR_DUMP_REC_TYPE_MATCHER_BUILDER = 3204,
++ DR_DUMP_REC_TYPE_MATCHER_MASK = 3205,
+
+ DR_DUMP_REC_TYPE_RULE = 3300,
+ DR_DUMP_REC_TYPE_RULE_RX_ENTRY_V0 = 3301,
+@@ -114,13 +115,15 @@ dr_dump_rule_action_mem(struct seq_file *file, const u64 rule_id,
+ break;
+ case DR_ACTION_TYP_FT:
+ if (action->dest_tbl->is_fw_tbl)
+- seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n",
++ seq_printf(file, "%d,0x%llx,0x%llx,0x%x,0x%x\n",
+ DR_DUMP_REC_TYPE_ACTION_FT, action_id,
+- rule_id, action->dest_tbl->fw_tbl.id);
++ rule_id, action->dest_tbl->fw_tbl.id,
++ -1);
+ else
+- seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n",
++ seq_printf(file, "%d,0x%llx,0x%llx,0x%x,0x%llx\n",
+ DR_DUMP_REC_TYPE_ACTION_FT, action_id,
+- rule_id, action->dest_tbl->tbl->table_id);
++ rule_id, action->dest_tbl->tbl->table_id,
++ DR_DBG_PTR_TO_ID(action->dest_tbl->tbl));
+
+ break;
+ case DR_ACTION_TYP_CTR:
+diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
+index 20ceac81a2c2c..4970777269168 100644
+--- a/drivers/net/ethernet/mscc/ocelot.c
++++ b/drivers/net/ethernet/mscc/ocelot.c
+@@ -3245,6 +3245,7 @@ int ocelot_init(struct ocelot *ocelot)
+ mutex_init(&ocelot->ptp_lock);
+ mutex_init(&ocelot->mact_lock);
+ mutex_init(&ocelot->fwd_domain_lock);
++ mutex_init(&ocelot->tas_lock);
+ spin_lock_init(&ocelot->ptp_clock_lock);
+ spin_lock_init(&ocelot->ts_id_lock);
+ snprintf(queue_name, sizeof(queue_name), "%s-stats",
+diff --git a/drivers/net/ethernet/mscc/ocelot_ptp.c b/drivers/net/ethernet/mscc/ocelot_ptp.c
+index 87ad2137ba065..09c703efe946c 100644
+--- a/drivers/net/ethernet/mscc/ocelot_ptp.c
++++ b/drivers/net/ethernet/mscc/ocelot_ptp.c
+@@ -72,6 +72,10 @@ int ocelot_ptp_settime64(struct ptp_clock_info *ptp,
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
++
++ if (ocelot->ops->tas_clock_adjust)
++ ocelot->ops->tas_clock_adjust(ocelot);
++
+ return 0;
+ }
+ EXPORT_SYMBOL(ocelot_ptp_settime64);
+@@ -105,6 +109,9 @@ int ocelot_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
++
++ if (ocelot->ops->tas_clock_adjust)
++ ocelot->ops->tas_clock_adjust(ocelot);
+ } else {
+ /* Fall back using ocelot_ptp_settime64 which is not exact. */
+ struct timespec64 ts;
+@@ -117,6 +124,7 @@ int ocelot_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
+
+ ocelot_ptp_settime64(ptp, &ts);
+ }
++
+ return 0;
+ }
+ EXPORT_SYMBOL(ocelot_ptp_adjtime);
+diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+index f3568901eb916..1443f788ee37c 100644
+--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
++++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+@@ -1437,7 +1437,7 @@ static int ionic_set_nic_features(struct ionic_lif *lif,
+ if ((old_hw_features ^ lif->hw_features) & IONIC_ETH_HW_RX_HASH)
+ ionic_lif_rss_config(lif, lif->rss_types, NULL, NULL);
+
+- if ((vlan_flags & features) &&
++ if ((vlan_flags & le64_to_cpu(ctx.cmd.lif_setattr.features)) &&
+ !(vlan_flags & le64_to_cpu(ctx.comp.lif_setattr.features)))
+ dev_info_once(lif->ionic->dev, "NIC is not supporting vlan offload, likely in SmartNIC mode\n");
+
+diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c
+index a438202129323..50854265864d1 100644
+--- a/drivers/net/netdevsim/bpf.c
++++ b/drivers/net/netdevsim/bpf.c
+@@ -351,10 +351,12 @@ nsim_map_alloc_elem(struct bpf_offloaded_map *offmap, unsigned int idx)
+ {
+ struct nsim_bpf_bound_map *nmap = offmap->dev_priv;
+
+- nmap->entry[idx].key = kmalloc(offmap->map.key_size, GFP_USER);
++ nmap->entry[idx].key = kmalloc(offmap->map.key_size,
++ GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
+ if (!nmap->entry[idx].key)
+ return -ENOMEM;
+- nmap->entry[idx].value = kmalloc(offmap->map.value_size, GFP_USER);
++ nmap->entry[idx].value = kmalloc(offmap->map.value_size,
++ GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
+ if (!nmap->entry[idx].value) {
+ kfree(nmap->entry[idx].key);
+ nmap->entry[idx].key = NULL;
+@@ -496,7 +498,7 @@ nsim_bpf_map_alloc(struct netdevsim *ns, struct bpf_offloaded_map *offmap)
+ if (offmap->map.map_flags)
+ return -EINVAL;
+
+- nmap = kzalloc(sizeof(*nmap), GFP_USER);
++ nmap = kzalloc(sizeof(*nmap), GFP_KERNEL_ACCOUNT);
+ if (!nmap)
+ return -ENOMEM;
+
+diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
+index 378ee779061c3..14787d17f703f 100644
+--- a/drivers/net/netdevsim/fib.c
++++ b/drivers/net/netdevsim/fib.c
+@@ -53,6 +53,7 @@ struct nsim_fib_data {
+ struct rhashtable nexthop_ht;
+ struct devlink *devlink;
+ struct work_struct fib_event_work;
++ struct work_struct fib_flush_work;
+ struct list_head fib_event_queue;
+ spinlock_t fib_event_queue_lock; /* Protects fib event queue list */
+ struct mutex nh_lock; /* Protects NH HT */
+@@ -977,7 +978,7 @@ static int nsim_fib_event_schedule_work(struct nsim_fib_data *data,
+
+ fib_event = kzalloc(sizeof(*fib_event), GFP_ATOMIC);
+ if (!fib_event)
+- return NOTIFY_BAD;
++ goto err_fib_event_alloc;
+
+ fib_event->data = data;
+ fib_event->event = event;
+@@ -1005,6 +1006,9 @@ static int nsim_fib_event_schedule_work(struct nsim_fib_data *data,
+
+ err_fib_prepare_event:
+ kfree(fib_event);
++err_fib_event_alloc:
++ if (event == FIB_EVENT_ENTRY_DEL)
++ schedule_work(&data->fib_flush_work);
+ return NOTIFY_BAD;
+ }
+
+@@ -1482,6 +1486,24 @@ static void nsim_fib_event_work(struct work_struct *work)
+ mutex_unlock(&data->fib_lock);
+ }
+
++static void nsim_fib_flush_work(struct work_struct *work)
++{
++ struct nsim_fib_data *data = container_of(work, struct nsim_fib_data,
++ fib_flush_work);
++ struct nsim_fib_rt *fib_rt, *fib_rt_tmp;
++
++ /* Process pending work. */
++ flush_work(&data->fib_event_work);
++
++ mutex_lock(&data->fib_lock);
++ list_for_each_entry_safe(fib_rt, fib_rt_tmp, &data->fib_rt_list, list) {
++ rhashtable_remove_fast(&data->fib_rt_ht, &fib_rt->ht_node,
++ nsim_fib_rt_ht_params);
++ nsim_fib_rt_free(fib_rt, data);
++ }
++ mutex_unlock(&data->fib_lock);
++}
++
+ static int
+ nsim_fib_debugfs_init(struct nsim_fib_data *data, struct nsim_dev *nsim_dev)
+ {
+@@ -1540,6 +1562,7 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
+ goto err_rhashtable_nexthop_destroy;
+
+ INIT_WORK(&data->fib_event_work, nsim_fib_event_work);
++ INIT_WORK(&data->fib_flush_work, nsim_fib_flush_work);
+ INIT_LIST_HEAD(&data->fib_event_queue);
+ spin_lock_init(&data->fib_event_queue_lock);
+
+@@ -1586,6 +1609,7 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
+ err_nexthop_nb_unregister:
+ unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
+ err_rhashtable_fib_destroy:
++ cancel_work_sync(&data->fib_flush_work);
+ flush_work(&data->fib_event_work);
+ rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
+ data);
+@@ -1615,6 +1639,7 @@ void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
+ NSIM_RESOURCE_IPV4_FIB);
+ unregister_fib_notifier(devlink_net(devlink), &data->fib_nb);
+ unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
++ cancel_work_sync(&data->fib_flush_work);
+ flush_work(&data->fib_event_work);
+ rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
+ data);
+diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
+index d8cac02a79b95..636b0907a5987 100644
+--- a/drivers/net/phy/smsc.c
++++ b/drivers/net/phy/smsc.c
+@@ -110,7 +110,7 @@ static int smsc_phy_config_init(struct phy_device *phydev)
+ struct smsc_phy_priv *priv = phydev->priv;
+ int rc;
+
+- if (!priv->energy_enable)
++ if (!priv->energy_enable || phydev->irq != PHY_POLL)
+ return 0;
+
+ rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
+@@ -210,6 +210,8 @@ static int lan95xx_config_aneg_ext(struct phy_device *phydev)
+ * response on link pulses to detect presence of plugged Ethernet cable.
+ * The Energy Detect Power-Down mode is enabled again in the end of procedure to
+ * save approximately 220 mW of power if cable is unplugged.
++ * The workaround is only applicable to poll mode. Energy Detect Power-Down may
++ * not be used in interrupt mode lest link change detection becomes unreliable.
+ */
+ static int lan87xx_read_status(struct phy_device *phydev)
+ {
+@@ -217,7 +219,7 @@ static int lan87xx_read_status(struct phy_device *phydev)
+
+ int err = genphy_read_status(phydev);
+
+- if (!phydev->link && priv->energy_enable) {
++ if (!phydev->link && priv->energy_enable && phydev->irq == PHY_POLL) {
+ /* Disable EDPD to wake up PHY */
+ int rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
+ if (rc < 0)
+diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig
+index e62fc4f2aee0d..76659c1c525a2 100644
+--- a/drivers/net/usb/Kconfig
++++ b/drivers/net/usb/Kconfig
+@@ -637,8 +637,9 @@ config USB_NET_AQC111
+ * Aquantia AQtion USB to 5GbE
+
+ config USB_RTL8153_ECM
+- tristate "RTL8153 ECM support"
++ tristate
+ depends on USB_NET_CDCETHER && (USB_RTL8152 || USB_RTL8152=n)
++ default y
+ help
+ This option supports ECM mode for RTL8153 ethernet adapter, when
+ CONFIG_USB_RTL8152 is not set, or the RTL8153 device is not
+diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
+index dc1f6d8444ad0..873f6deabbd1a 100644
+--- a/drivers/net/usb/ax88179_178a.c
++++ b/drivers/net/usb/ax88179_178a.c
+@@ -1801,7 +1801,7 @@ static const struct driver_info ax88179_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1814,7 +1814,7 @@ static const struct driver_info ax88178a_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1827,7 +1827,7 @@ static const struct driver_info cypress_GX3_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1840,7 +1840,7 @@ static const struct driver_info dlink_dub1312_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1853,7 +1853,7 @@ static const struct driver_info sitecom_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1866,7 +1866,7 @@ static const struct driver_info samsung_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1879,7 +1879,7 @@ static const struct driver_info lenovo_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1892,7 +1892,7 @@ static const struct driver_info belkin_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1905,7 +1905,7 @@ static const struct driver_info toshiba_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1918,7 +1918,7 @@ static const struct driver_info mct_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1931,7 +1931,7 @@ static const struct driver_info at_umc2000_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1944,7 +1944,7 @@ static const struct driver_info at_umc200_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+@@ -1957,7 +1957,7 @@ static const struct driver_info at_umc2000sp_info = {
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
++ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+ };
+diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
+index edf0492ad489a..515363d740784 100644
+--- a/drivers/net/usb/smsc95xx.c
++++ b/drivers/net/usb/smsc95xx.c
+@@ -18,6 +18,8 @@
+ #include <linux/usb/usbnet.h>
+ #include <linux/slab.h>
+ #include <linux/of_net.h>
++#include <linux/irq.h>
++#include <linux/irqdomain.h>
+ #include <linux/mdio.h>
+ #include <linux/phy.h>
+ #include <net/selftests.h>
+@@ -53,6 +55,9 @@
+ #define SUSPEND_ALLMODES (SUSPEND_SUSPEND0 | SUSPEND_SUSPEND1 | \
+ SUSPEND_SUSPEND2 | SUSPEND_SUSPEND3)
+
++#define SMSC95XX_NR_IRQS (1) /* raise to 12 for GPIOs */
++#define PHY_HWIRQ (SMSC95XX_NR_IRQS - 1)
++
+ struct smsc95xx_priv {
+ u32 mac_cr;
+ u32 hash_hi;
+@@ -61,8 +66,12 @@ struct smsc95xx_priv {
+ spinlock_t mac_cr_lock;
+ u8 features;
+ u8 suspend_flags;
++ struct irq_chip irqchip;
++ struct irq_domain *irqdomain;
++ struct fwnode_handle *irqfwnode;
+ struct mii_bus *mdiobus;
+ struct phy_device *phydev;
++ struct task_struct *pm_task;
+ };
+
+ static bool turbo_mode = true;
+@@ -72,13 +81,14 @@ MODULE_PARM_DESC(turbo_mode, "Enable multiple frames per Rx transaction");
+ static int __must_check __smsc95xx_read_reg(struct usbnet *dev, u32 index,
+ u32 *data, int in_pm)
+ {
++ struct smsc95xx_priv *pdata = dev->driver_priv;
+ u32 buf;
+ int ret;
+ int (*fn)(struct usbnet *, u8, u8, u16, u16, void *, u16);
+
+ BUG_ON(!dev);
+
+- if (!in_pm)
++ if (current != pdata->pm_task)
+ fn = usbnet_read_cmd;
+ else
+ fn = usbnet_read_cmd_nopm;
+@@ -102,13 +112,14 @@ static int __must_check __smsc95xx_read_reg(struct usbnet *dev, u32 index,
+ static int __must_check __smsc95xx_write_reg(struct usbnet *dev, u32 index,
+ u32 data, int in_pm)
+ {
++ struct smsc95xx_priv *pdata = dev->driver_priv;
+ u32 buf;
+ int ret;
+ int (*fn)(struct usbnet *, u8, u8, u16, u16, const void *, u16);
+
+ BUG_ON(!dev);
+
+- if (!in_pm)
++ if (current != pdata->pm_task)
+ fn = usbnet_write_cmd;
+ else
+ fn = usbnet_write_cmd_nopm;
+@@ -566,16 +577,12 @@ static int smsc95xx_phy_update_flowcontrol(struct usbnet *dev)
+ return smsc95xx_write_reg(dev, AFC_CFG, afc_cfg);
+ }
+
+-static int smsc95xx_link_reset(struct usbnet *dev)
++static void smsc95xx_mac_update_fullduplex(struct usbnet *dev)
+ {
+ struct smsc95xx_priv *pdata = dev->driver_priv;
+ unsigned long flags;
+ int ret;
+
+- ret = smsc95xx_write_reg(dev, INT_STS, INT_STS_CLEAR_ALL_);
+- if (ret < 0)
+- return ret;
+-
+ spin_lock_irqsave(&pdata->mac_cr_lock, flags);
+ if (pdata->phydev->duplex != DUPLEX_FULL) {
+ pdata->mac_cr &= ~MAC_CR_FDPX_;
+@@ -587,18 +594,22 @@ static int smsc95xx_link_reset(struct usbnet *dev)
+ spin_unlock_irqrestore(&pdata->mac_cr_lock, flags);
+
+ ret = smsc95xx_write_reg(dev, MAC_CR, pdata->mac_cr);
+- if (ret < 0)
+- return ret;
++ if (ret < 0) {
++ if (ret != -ENODEV)
++ netdev_warn(dev->net,
++ "Error updating MAC full duplex mode\n");
++ return;
++ }
+
+ ret = smsc95xx_phy_update_flowcontrol(dev);
+ if (ret < 0)
+ netdev_warn(dev->net, "Error updating PHY flow control\n");
+-
+- return ret;
+ }
+
+ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
+ {
++ struct smsc95xx_priv *pdata = dev->driver_priv;
++ unsigned long flags;
+ u32 intdata;
+
+ if (urb->actual_length != 4) {
+@@ -610,11 +621,15 @@ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
+ intdata = get_unaligned_le32(urb->transfer_buffer);
+ netif_dbg(dev, link, dev->net, "intdata: 0x%08X\n", intdata);
+
++ local_irq_save(flags);
++
+ if (intdata & INT_ENP_PHY_INT_)
+- usbnet_defer_kevent(dev, EVENT_LINK_RESET);
++ generic_handle_domain_irq(pdata->irqdomain, PHY_HWIRQ);
+ else
+ netdev_warn(dev->net, "unexpected interrupt, intdata=0x%08X\n",
+ intdata);
++
++ local_irq_restore(flags);
+ }
+
+ /* Enable or disable Tx & Rx checksum offload engines */
+@@ -1092,6 +1107,7 @@ static void smsc95xx_handle_link_change(struct net_device *net)
+ struct usbnet *dev = netdev_priv(net);
+
+ phy_print_status(net->phydev);
++ smsc95xx_mac_update_fullduplex(dev);
+ usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
+ }
+
+@@ -1099,8 +1115,9 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
+ {
+ struct smsc95xx_priv *pdata;
+ bool is_internal_phy;
++ char usb_path[64];
++ int ret, phy_irq;
+ u32 val;
+- int ret;
+
+ printk(KERN_INFO SMSC_CHIPNAME " v" SMSC_DRIVER_VERSION "\n");
+
+@@ -1140,10 +1157,38 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
+ if (ret)
+ goto free_pdata;
+
++ /* create irq domain for use by PHY driver and GPIO consumers */
++ usb_make_path(dev->udev, usb_path, sizeof(usb_path));
++ pdata->irqfwnode = irq_domain_alloc_named_fwnode(usb_path);
++ if (!pdata->irqfwnode) {
++ ret = -ENOMEM;
++ goto free_pdata;
++ }
++
++ pdata->irqdomain = irq_domain_create_linear(pdata->irqfwnode,
++ SMSC95XX_NR_IRQS,
++ &irq_domain_simple_ops,
++ pdata);
++ if (!pdata->irqdomain) {
++ ret = -ENOMEM;
++ goto free_irqfwnode;
++ }
++
++ phy_irq = irq_create_mapping(pdata->irqdomain, PHY_HWIRQ);
++ if (!phy_irq) {
++ ret = -ENOENT;
++ goto remove_irqdomain;
++ }
++
++ pdata->irqchip = dummy_irq_chip;
++ pdata->irqchip.name = SMSC_CHIPNAME;
++ irq_set_chip_and_handler_name(phy_irq, &pdata->irqchip,
++ handle_simple_irq, "phy");
++
+ pdata->mdiobus = mdiobus_alloc();
+ if (!pdata->mdiobus) {
+ ret = -ENOMEM;
+- goto free_pdata;
++ goto dispose_irq;
+ }
+
+ ret = smsc95xx_read_reg(dev, HW_CFG, &val);
+@@ -1176,6 +1221,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
+ goto unregister_mdio;
+ }
+
++ pdata->phydev->irq = phy_irq;
+ pdata->phydev->is_internal = is_internal_phy;
+
+ /* detect device revision as different features may be available */
+@@ -1218,6 +1264,15 @@ unregister_mdio:
+ free_mdio:
+ mdiobus_free(pdata->mdiobus);
+
++dispose_irq:
++ irq_dispose_mapping(phy_irq);
++
++remove_irqdomain:
++ irq_domain_remove(pdata->irqdomain);
++
++free_irqfwnode:
++ irq_domain_free_fwnode(pdata->irqfwnode);
++
+ free_pdata:
+ kfree(pdata);
+ return ret;
+@@ -1230,6 +1285,9 @@ static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
+ phy_disconnect(dev->net->phydev);
+ mdiobus_unregister(pdata->mdiobus);
+ mdiobus_free(pdata->mdiobus);
++ irq_dispose_mapping(irq_find_mapping(pdata->irqdomain, PHY_HWIRQ));
++ irq_domain_remove(pdata->irqdomain);
++ irq_domain_free_fwnode(pdata->irqfwnode);
+ netif_dbg(dev, ifdown, dev->net, "free pdata\n");
+ kfree(pdata);
+ }
+@@ -1254,29 +1312,6 @@ static u32 smsc_crc(const u8 *buffer, size_t len, int filter)
+ return crc << ((filter % 2) * 16);
+ }
+
+-static int smsc95xx_enable_phy_wakeup_interrupts(struct usbnet *dev, u16 mask)
+-{
+- int ret;
+-
+- netdev_dbg(dev->net, "enabling PHY wakeup interrupts\n");
+-
+- /* read to clear */
+- ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_SRC);
+- if (ret < 0)
+- return ret;
+-
+- /* enable interrupt source */
+- ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_MASK);
+- if (ret < 0)
+- return ret;
+-
+- ret |= mask;
+-
+- smsc95xx_mdio_write_nopm(dev, PHY_INT_MASK, ret);
+-
+- return 0;
+-}
+-
+ static int smsc95xx_link_ok_nopm(struct usbnet *dev)
+ {
+ int ret;
+@@ -1443,7 +1478,6 @@ static int smsc95xx_enter_suspend3(struct usbnet *dev)
+ static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
+ {
+ struct smsc95xx_priv *pdata = dev->driver_priv;
+- int ret;
+
+ if (!netif_running(dev->net)) {
+ /* interface is ifconfig down so fully power down hw */
+@@ -1462,27 +1496,10 @@ static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
+ }
+
+ netdev_dbg(dev->net, "autosuspend entering SUSPEND1\n");
+-
+- /* enable PHY wakeup events for if cable is attached */
+- ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
+- PHY_INT_MASK_ANEG_COMP_);
+- if (ret < 0) {
+- netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
+- return ret;
+- }
+-
+ netdev_info(dev->net, "entering SUSPEND1 mode\n");
+ return smsc95xx_enter_suspend1(dev);
+ }
+
+- /* enable PHY wakeup events so we remote wakeup if cable is pulled */
+- ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
+- PHY_INT_MASK_LINK_DOWN_);
+- if (ret < 0) {
+- netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
+- return ret;
+- }
+-
+ netdev_dbg(dev->net, "autosuspend entering SUSPEND3\n");
+ return smsc95xx_enter_suspend3(dev);
+ }
+@@ -1494,9 +1511,12 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
+ u32 val, link_up;
+ int ret;
+
++ pdata->pm_task = current;
++
+ ret = usbnet_suspend(intf, message);
+ if (ret < 0) {
+ netdev_warn(dev->net, "usbnet_suspend error\n");
++ pdata->pm_task = NULL;
+ return ret;
+ }
+
+@@ -1548,13 +1568,6 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
+ }
+
+ if (pdata->wolopts & WAKE_PHY) {
+- ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
+- (PHY_INT_MASK_ANEG_COMP_ | PHY_INT_MASK_LINK_DOWN_));
+- if (ret < 0) {
+- netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
+- goto done;
+- }
+-
+ /* if link is down then configure EDPD and enter SUSPEND1,
+ * otherwise enter SUSPEND0 below
+ */
+@@ -1743,6 +1756,7 @@ done:
+ if (ret && PMSG_IS_AUTO(message))
+ usbnet_resume(intf);
+
++ pdata->pm_task = NULL;
+ return ret;
+ }
+
+@@ -1763,45 +1777,53 @@ static int smsc95xx_resume(struct usb_interface *intf)
+ /* do this first to ensure it's cleared even in error case */
+ pdata->suspend_flags = 0;
+
++ pdata->pm_task = current;
++
+ if (suspend_flags & SUSPEND_ALLMODES) {
+ /* clear wake-up sources */
+ ret = smsc95xx_read_reg_nopm(dev, WUCSR, &val);
+ if (ret < 0)
+- return ret;
++ goto done;
+
+ val &= ~(WUCSR_WAKE_EN_ | WUCSR_MPEN_);
+
+ ret = smsc95xx_write_reg_nopm(dev, WUCSR, val);
+ if (ret < 0)
+- return ret;
++ goto done;
+
+ /* clear wake-up status */
+ ret = smsc95xx_read_reg_nopm(dev, PM_CTRL, &val);
+ if (ret < 0)
+- return ret;
++ goto done;
+
+ val &= ~PM_CTL_WOL_EN_;
+ val |= PM_CTL_WUPS_;
+
+ ret = smsc95xx_write_reg_nopm(dev, PM_CTRL, val);
+ if (ret < 0)
+- return ret;
++ goto done;
+ }
+
++ phy_init_hw(pdata->phydev);
++
+ ret = usbnet_resume(intf);
+ if (ret < 0)
+ netdev_warn(dev->net, "usbnet_resume error\n");
+
+- phy_init_hw(pdata->phydev);
++done:
++ pdata->pm_task = NULL;
+ return ret;
+ }
+
+ static int smsc95xx_reset_resume(struct usb_interface *intf)
+ {
+ struct usbnet *dev = usb_get_intfdata(intf);
++ struct smsc95xx_priv *pdata = dev->driver_priv;
+ int ret;
+
++ pdata->pm_task = current;
+ ret = smsc95xx_reset(dev);
++ pdata->pm_task = NULL;
+ if (ret < 0)
+ return ret;
+
+@@ -1997,7 +2019,6 @@ static const struct driver_info smsc95xx_info = {
+ .description = "smsc95xx USB 2.0 Ethernet",
+ .bind = smsc95xx_bind,
+ .unbind = smsc95xx_unbind,
+- .link_reset = smsc95xx_link_reset,
+ .reset = smsc95xx_reset,
+ .check_connect = smsc95xx_start_phy,
+ .stop = smsc95xx_stop,
+diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
+index 9b2bd09628a3b..502e6abbc8e20 100644
+--- a/drivers/net/usb/usbnet.c
++++ b/drivers/net/usb/usbnet.c
+@@ -849,13 +849,11 @@ int usbnet_stop (struct net_device *net)
+
+ mpn = !test_and_clear_bit(EVENT_NO_RUNTIME_PM, &dev->flags);
+
+- /* deferred work (task, timer, softirq) must also stop.
+- * can't flush_scheduled_work() until we drop rtnl (later),
+- * else workers could deadlock; so make workers a NOP.
+- */
++ /* deferred work (timer, softirq, task) must also stop */
+ dev->flags = 0;
+ del_timer_sync (&dev->delay);
+ tasklet_kill (&dev->bh);
++ cancel_work_sync(&dev->kevent);
+ if (!pm)
+ usb_autopm_put_interface(dev->intf);
+
+@@ -1619,8 +1617,6 @@ void usbnet_disconnect (struct usb_interface *intf)
+ net = dev->net;
+ unregister_netdev (net);
+
+- cancel_work_sync(&dev->kevent);
+-
+ usb_scuttle_anchored_urbs(&dev->deferred);
+
+ if (dev->driver_info->unbind)
+diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c
+index 9a4c8ff32d9dd..5bf7822c53f18 100644
+--- a/drivers/net/wireguard/allowedips.c
++++ b/drivers/net/wireguard/allowedips.c
+@@ -6,6 +6,8 @@
+ #include "allowedips.h"
+ #include "peer.h"
+
++enum { MAX_ALLOWEDIPS_BITS = 128 };
++
+ static struct kmem_cache *node_cache;
+
+ static void swap_endian(u8 *dst, const u8 *src, u8 bits)
+@@ -40,7 +42,8 @@ static void push_rcu(struct allowedips_node **stack,
+ struct allowedips_node __rcu *p, unsigned int *len)
+ {
+ if (rcu_access_pointer(p)) {
+- WARN_ON(IS_ENABLED(DEBUG) && *len >= 128);
++ if (WARN_ON(IS_ENABLED(DEBUG) && *len >= MAX_ALLOWEDIPS_BITS))
++ return;
+ stack[(*len)++] = rcu_dereference_raw(p);
+ }
+ }
+@@ -52,7 +55,7 @@ static void node_free_rcu(struct rcu_head *rcu)
+
+ static void root_free_rcu(struct rcu_head *rcu)
+ {
+- struct allowedips_node *node, *stack[128] = {
++ struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = {
+ container_of(rcu, struct allowedips_node, rcu) };
+ unsigned int len = 1;
+
+@@ -65,7 +68,7 @@ static void root_free_rcu(struct rcu_head *rcu)
+
+ static void root_remove_peer_lists(struct allowedips_node *root)
+ {
+- struct allowedips_node *node, *stack[128] = { root };
++ struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = { root };
+ unsigned int len = 1;
+
+ while (len > 0 && (node = stack[--len])) {
+diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c
+index e173204ae7d78..41db10f9be498 100644
+--- a/drivers/net/wireguard/selftest/allowedips.c
++++ b/drivers/net/wireguard/selftest/allowedips.c
+@@ -593,10 +593,10 @@ bool __init wg_allowedips_selftest(void)
+ wg_allowedips_remove_by_peer(&t, a, &mutex);
+ test_negative(4, a, 192, 168, 0, 1);
+
+- /* These will hit the WARN_ON(len >= 128) in free_node if something
+- * goes wrong.
++ /* These will hit the WARN_ON(len >= MAX_ALLOWEDIPS_BITS) in free_node
++ * if something goes wrong.
+ */
+- for (i = 0; i < 128; ++i) {
++ for (i = 0; i < MAX_ALLOWEDIPS_BITS; ++i) {
+ part = cpu_to_be64(~(1LLU << (i % 64)));
+ memset(&ip, 0xff, 16);
+ memcpy((u8 *)&ip + (i < 64) * 8, &part, 8);
+diff --git a/drivers/net/wireguard/selftest/ratelimiter.c b/drivers/net/wireguard/selftest/ratelimiter.c
+index 007cd4457c5f6..ba87d294604fe 100644
+--- a/drivers/net/wireguard/selftest/ratelimiter.c
++++ b/drivers/net/wireguard/selftest/ratelimiter.c
+@@ -6,28 +6,29 @@
+ #ifdef DEBUG
+
+ #include <linux/jiffies.h>
++#include <linux/hrtimer.h>
+
+ static const struct {
+ bool result;
+- unsigned int msec_to_sleep_before;
++ u64 nsec_to_sleep_before;
+ } expected_results[] __initconst = {
+ [0 ... PACKETS_BURSTABLE - 1] = { true, 0 },
+ [PACKETS_BURSTABLE] = { false, 0 },
+- [PACKETS_BURSTABLE + 1] = { true, MSEC_PER_SEC / PACKETS_PER_SECOND },
++ [PACKETS_BURSTABLE + 1] = { true, NSEC_PER_SEC / PACKETS_PER_SECOND },
+ [PACKETS_BURSTABLE + 2] = { false, 0 },
+- [PACKETS_BURSTABLE + 3] = { true, (MSEC_PER_SEC / PACKETS_PER_SECOND) * 2 },
++ [PACKETS_BURSTABLE + 3] = { true, (NSEC_PER_SEC / PACKETS_PER_SECOND) * 2 },
+ [PACKETS_BURSTABLE + 4] = { true, 0 },
+ [PACKETS_BURSTABLE + 5] = { false, 0 }
+ };
+
+ static __init unsigned int maximum_jiffies_at_index(int index)
+ {
+- unsigned int total_msecs = 2 * MSEC_PER_SEC / PACKETS_PER_SECOND / 3;
++ u64 total_nsecs = 2 * NSEC_PER_SEC / PACKETS_PER_SECOND / 3;
+ int i;
+
+ for (i = 0; i <= index; ++i)
+- total_msecs += expected_results[i].msec_to_sleep_before;
+- return msecs_to_jiffies(total_msecs);
++ total_nsecs += expected_results[i].nsec_to_sleep_before;
++ return nsecs_to_jiffies(total_nsecs);
+ }
+
+ static __init int timings_test(struct sk_buff *skb4, struct iphdr *hdr4,
+@@ -42,8 +43,12 @@ static __init int timings_test(struct sk_buff *skb4, struct iphdr *hdr4,
+ loop_start_time = jiffies;
+
+ for (i = 0; i < ARRAY_SIZE(expected_results); ++i) {
+- if (expected_results[i].msec_to_sleep_before)
+- msleep(expected_results[i].msec_to_sleep_before);
++ if (expected_results[i].nsec_to_sleep_before) {
++ ktime_t timeout = ktime_add(ktime_add_ns(ktime_get_coarse_boottime(), TICK_NSEC * 4 / 3),
++ ns_to_ktime(expected_results[i].nsec_to_sleep_before));
++ set_current_state(TASK_UNINTERRUPTIBLE);
++ schedule_hrtimeout_range_clock(&timeout, 0, HRTIMER_MODE_ABS, CLOCK_BOOTTIME);
++ }
+
+ if (time_is_before_jiffies(loop_start_time +
+ maximum_jiffies_at_index(i)))
+@@ -127,7 +132,7 @@ bool __init wg_ratelimiter_selftest(void)
+ if (IS_ENABLED(CONFIG_KASAN) || IS_ENABLED(CONFIG_UBSAN))
+ return true;
+
+- BUILD_BUG_ON(MSEC_PER_SEC % PACKETS_PER_SECOND != 0);
++ BUILD_BUG_ON(NSEC_PER_SEC % PACKETS_PER_SECOND != 0);
+
+ if (wg_ratelimiter_init())
+ goto out;
+@@ -176,7 +181,6 @@ bool __init wg_ratelimiter_selftest(void)
+ test += test_count;
+ goto err;
+ }
+- msleep(500);
+ continue;
+ } else if (ret < 0) {
+ test += test_count;
+@@ -195,7 +199,6 @@ bool __init wg_ratelimiter_selftest(void)
+ test += test_count;
+ goto err;
+ }
+- msleep(50);
+ continue;
+ }
+ test += test_count;
+diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
+index 8328966a0471c..603dc7bebc0c4 100644
+--- a/drivers/net/wireless/ath/ath10k/snoc.c
++++ b/drivers/net/wireless/ath/ath10k/snoc.c
+@@ -1249,13 +1249,12 @@ static void ath10k_snoc_init_napi(struct ath10k *ar)
+ static int ath10k_snoc_request_irq(struct ath10k *ar)
+ {
+ struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
+- int irqflags = IRQF_TRIGGER_RISING;
+ int ret, id;
+
+ for (id = 0; id < CE_COUNT_MAX; id++) {
+ ret = request_irq(ar_snoc->ce_irqs[id].irq_line,
+- ath10k_snoc_per_engine_handler,
+- irqflags, ce_name[id], ar);
++ ath10k_snoc_per_engine_handler, 0,
++ ce_name[id], ar);
+ if (ret) {
+ ath10k_err(ar,
+ "failed to register IRQ handler for CE %d: %d\n",
+diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
+index 90a5df1fbdbd2..0106e529f3c5a 100644
+--- a/drivers/net/wireless/ath/ath11k/core.c
++++ b/drivers/net/wireless/ath/ath11k/core.c
+@@ -903,23 +903,23 @@ static int ath11k_core_pdev_create(struct ath11k_base *ab)
+ return ret;
+ }
+
+- ret = ath11k_mac_register(ab);
++ ret = ath11k_dp_pdev_alloc(ab);
+ if (ret) {
+- ath11k_err(ab, "failed register the radio with mac80211: %d\n", ret);
++ ath11k_err(ab, "failed to attach DP pdev: %d\n", ret);
+ goto err_pdev_debug;
+ }
+
+- ret = ath11k_dp_pdev_alloc(ab);
++ ret = ath11k_mac_register(ab);
+ if (ret) {
+- ath11k_err(ab, "failed to attach DP pdev: %d\n", ret);
+- goto err_mac_unregister;
++ ath11k_err(ab, "failed register the radio with mac80211: %d\n", ret);
++ goto err_dp_pdev_free;
+ }
+
+ ret = ath11k_thermal_register(ab);
+ if (ret) {
+ ath11k_err(ab, "could not register thermal device: %d\n",
+ ret);
+- goto err_dp_pdev_free;
++ goto err_mac_unregister;
+ }
+
+ ret = ath11k_spectral_init(ab);
+@@ -932,10 +932,10 @@ static int ath11k_core_pdev_create(struct ath11k_base *ab)
+
+ err_thermal_unregister:
+ ath11k_thermal_unregister(ab);
+-err_dp_pdev_free:
+- ath11k_dp_pdev_free(ab);
+ err_mac_unregister:
+ ath11k_mac_unregister(ab);
++err_dp_pdev_free:
++ ath11k_dp_pdev_free(ab);
+ err_pdev_debug:
+ ath11k_debugfs_pdev_destroy(ab);
+
+diff --git a/drivers/net/wireless/ath/ath11k/debug.h b/drivers/net/wireless/ath/ath11k/debug.h
+index fbbd5fe02aa83..91545640c47b2 100644
+--- a/drivers/net/wireless/ath/ath11k/debug.h
++++ b/drivers/net/wireless/ath/ath11k/debug.h
+@@ -23,8 +23,8 @@ enum ath11k_debug_mask {
+ ATH11K_DBG_TESTMODE = 0x00000400,
+ ATH11k_DBG_HAL = 0x00000800,
+ ATH11K_DBG_PCI = 0x00001000,
+- ATH11K_DBG_DP_TX = 0x00001000,
+- ATH11K_DBG_DP_RX = 0x00002000,
++ ATH11K_DBG_DP_TX = 0x00002000,
++ ATH11K_DBG_DP_RX = 0x00004000,
+ ATH11K_DBG_ANY = 0xffffffff,
+ };
+
+diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c
+index 049774cc158cc..b3e133add1ce5 100644
+--- a/drivers/net/wireless/ath/ath11k/dp_rx.c
++++ b/drivers/net/wireless/ath/ath11k/dp_rx.c
+@@ -835,8 +835,9 @@ void ath11k_peer_rx_tid_delete(struct ath11k *ar,
+ HAL_REO_CMD_UPDATE_RX_QUEUE, &cmd,
+ ath11k_dp_rx_tid_del_func);
+ if (ret) {
+- ath11k_err(ar->ab, "failed to send HAL_REO_CMD_UPDATE_RX_QUEUE cmd, tid %d (%d)\n",
+- tid, ret);
++ if (ret != -ESHUTDOWN)
++ ath11k_err(ar->ab, "failed to send HAL_REO_CMD_UPDATE_RX_QUEUE cmd, tid %d (%d)\n",
++ tid, ret);
+ dma_unmap_single(ar->ab->dev, rx_tid->paddr, rx_tid->size,
+ DMA_BIDIRECTIONAL);
+ kfree(rx_tid->vaddr);
+diff --git a/drivers/net/wireless/ath/ath11k/htc.c b/drivers/net/wireless/ath/ath11k/htc.c
+index 6913b7494b9bf..2de1e953a5396 100644
+--- a/drivers/net/wireless/ath/ath11k/htc.c
++++ b/drivers/net/wireless/ath/ath11k/htc.c
+@@ -258,8 +258,10 @@ void ath11k_htc_tx_completion_handler(struct ath11k_base *ab,
+ u8 eid;
+
+ eid = ATH11K_SKB_CB(skb)->eid;
+- if (eid >= ATH11K_HTC_EP_COUNT)
++ if (eid >= ATH11K_HTC_EP_COUNT) {
++ dev_kfree_skb_any(skb);
+ return;
++ }
+
+ ep = &htc->endpoint[eid];
+ spin_lock_bh(&htc->tx_lock);
+diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
+index 8a3ff12057e89..e2382c8595b6a 100644
+--- a/drivers/net/wireless/ath/ath11k/pci.c
++++ b/drivers/net/wireless/ath/ath11k/pci.c
+@@ -1566,7 +1566,9 @@ qmi_fail:
+ static void ath11k_pci_shutdown(struct pci_dev *pdev)
+ {
+ struct ath11k_base *ab = pci_get_drvdata(pdev);
++ struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);
+
++ ath11k_pci_set_irq_affinity_hint(ab_pci, NULL);
+ ath11k_pci_power_down(ab);
+ }
+
+diff --git a/drivers/net/wireless/ath/ath9k/htc.h b/drivers/net/wireless/ath/ath9k/htc.h
+index 6b45e63fae4ba..e3d546ef71ddc 100644
+--- a/drivers/net/wireless/ath/ath9k/htc.h
++++ b/drivers/net/wireless/ath/ath9k/htc.h
+@@ -327,11 +327,11 @@ static inline struct ath9k_htc_tx_ctl *HTC_SKB_CB(struct sk_buff *skb)
+ }
+
+ #ifdef CONFIG_ATH9K_HTC_DEBUGFS
+-
+-#define TX_STAT_INC(c) (hif_dev->htc_handle->drv_priv->debug.tx_stats.c++)
+-#define TX_STAT_ADD(c, a) (hif_dev->htc_handle->drv_priv->debug.tx_stats.c += a)
+-#define RX_STAT_INC(c) (hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c++)
+-#define RX_STAT_ADD(c, a) (hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c += a)
++#define __STAT_SAFE(expr) (hif_dev->htc_handle->drv_priv ? (expr) : 0)
++#define TX_STAT_INC(c) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.tx_stats.c++)
++#define TX_STAT_ADD(c, a) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.tx_stats.c += a)
++#define RX_STAT_INC(c) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c++)
++#define RX_STAT_ADD(c, a) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c += a)
+ #define CAB_STAT_INC priv->debug.tx_stats.cab_queued++
+
+ #define TX_QSTAT_INC(q) (priv->debug.tx_stats.queue_stats[q]++)
+diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_init.c b/drivers/net/wireless/ath/ath9k/htc_drv_init.c
+index ff61ae34ecdf0..07ac88fb1c577 100644
+--- a/drivers/net/wireless/ath/ath9k/htc_drv_init.c
++++ b/drivers/net/wireless/ath/ath9k/htc_drv_init.c
+@@ -944,7 +944,6 @@ int ath9k_htc_probe_device(struct htc_target *htc_handle, struct device *dev,
+ priv->hw = hw;
+ priv->htc = htc_handle;
+ priv->dev = dev;
+- htc_handle->drv_priv = priv;
+ SET_IEEE80211_DEV(hw, priv->dev);
+
+ ret = ath9k_htc_wait_for_target(priv);
+@@ -965,6 +964,8 @@ int ath9k_htc_probe_device(struct htc_target *htc_handle, struct device *dev,
+ if (ret)
+ goto err_init;
+
++ htc_handle->drv_priv = priv;
++
+ return 0;
+
+ err_init:
+diff --git a/drivers/net/wireless/ath/wil6210/debugfs.c b/drivers/net/wireless/ath/wil6210/debugfs.c
+index 4c944e595978b..ac7787e1a7f61 100644
+--- a/drivers/net/wireless/ath/wil6210/debugfs.c
++++ b/drivers/net/wireless/ath/wil6210/debugfs.c
+@@ -1010,20 +1010,14 @@ static ssize_t wil_write_file_wmi(struct file *file, const char __user *buf,
+ void *cmd;
+ int cmdlen = len - sizeof(struct wmi_cmd_hdr);
+ u16 cmdid;
+- int rc, rc1;
++ int rc1;
+
+- if (cmdlen < 0)
++ if (cmdlen < 0 || *ppos != 0)
+ return -EINVAL;
+
+- wmi = kmalloc(len, GFP_KERNEL);
+- if (!wmi)
+- return -ENOMEM;
+-
+- rc = simple_write_to_buffer(wmi, len, ppos, buf, len);
+- if (rc < 0) {
+- kfree(wmi);
+- return rc;
+- }
++ wmi = memdup_user(buf, len);
++ if (IS_ERR(wmi))
++ return PTR_ERR(wmi);
+
+ cmd = (cmdlen > 0) ? &wmi[1] : NULL;
+ cmdid = le16_to_cpu(wmi->command_id);
+@@ -1033,7 +1027,7 @@ static ssize_t wil_write_file_wmi(struct file *file, const char __user *buf,
+
+ wil_info(wil, "0x%04x[%d] -> %d\n", cmdid, cmdlen, rc1);
+
+- return rc;
++ return len;
+ }
+
+ static const struct file_operations fops_wmi = {
+diff --git a/drivers/net/wireless/intel/iwlegacy/4965-rs.c b/drivers/net/wireless/intel/iwlegacy/4965-rs.c
+index 9a491e5db75bd..532e3b91777d9 100644
+--- a/drivers/net/wireless/intel/iwlegacy/4965-rs.c
++++ b/drivers/net/wireless/intel/iwlegacy/4965-rs.c
+@@ -2403,7 +2403,7 @@ il4965_rs_fill_link_cmd(struct il_priv *il, struct il_lq_sta *lq_sta,
+ /* Repeat initial/next rate.
+ * For legacy IL_NUMBER_TRY == 1, this loop will not execute.
+ * For HT IL_HT_NUMBER_TRY == 3, this executes twice. */
+- while (repeat_rate > 0 && idx < LINK_QUAL_MAX_RETRY_NUM) {
++ while (repeat_rate > 0) {
+ if (is_legacy(tbl_type.lq_type)) {
+ if (ant_toggle_cnt < NUM_TRY_BEFORE_ANT_TOGGLE)
+ ant_toggle_cnt++;
+@@ -2422,6 +2422,8 @@ il4965_rs_fill_link_cmd(struct il_priv *il, struct il_lq_sta *lq_sta,
+ cpu_to_le32(new_rate);
+ repeat_rate--;
+ idx++;
++ if (idx >= LINK_QUAL_MAX_RETRY_NUM)
++ goto out;
+ }
+
+ il4965_rs_get_tbl_info_from_mcs(new_rate, lq_sta->band,
+@@ -2466,6 +2468,7 @@ il4965_rs_fill_link_cmd(struct il_priv *il, struct il_lq_sta *lq_sta,
+ repeat_rate--;
+ }
+
++out:
+ lq_cmd->agg_params.agg_frame_cnt_limit = LINK_QUAL_AGG_FRAME_LIMIT_DEF;
+ lq_cmd->agg_params.agg_dis_start_th = LINK_QUAL_AGG_DISABLE_START_DEF;
+
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+index c7f9d3870f218..8a38d1bfe9b3c 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+@@ -1862,6 +1862,7 @@ static void iwl_mvm_disable_sta_queues(struct iwl_mvm *mvm,
+ iwl_mvm_txq_from_mac80211(sta->txq[i]);
+
+ mvmtxq->txq_id = IWL_MVM_INVALID_QUEUE;
++ list_del_init(&mvmtxq->list);
+ }
+ }
+
+diff --git a/drivers/net/wireless/intersil/p54/main.c b/drivers/net/wireless/intersil/p54/main.c
+index a3ca6620dc0c6..8fa3ec71603e3 100644
+--- a/drivers/net/wireless/intersil/p54/main.c
++++ b/drivers/net/wireless/intersil/p54/main.c
+@@ -682,7 +682,7 @@ static void p54_flush(struct ieee80211_hw *dev, struct ieee80211_vif *vif,
+ * queues have already been stopped and no new frames can sneak
+ * up from behind.
+ */
+- while ((total = p54_flush_count(priv) && i--)) {
++ while ((total = p54_flush_count(priv)) && i--) {
+ /* waste time */
+ msleep(20);
+ }
+diff --git a/drivers/net/wireless/intersil/p54/p54spi.c b/drivers/net/wireless/intersil/p54/p54spi.c
+index f99b7ba69fc3d..19152fd449ba7 100644
+--- a/drivers/net/wireless/intersil/p54/p54spi.c
++++ b/drivers/net/wireless/intersil/p54/p54spi.c
+@@ -164,7 +164,7 @@ static int p54spi_request_firmware(struct ieee80211_hw *dev)
+
+ ret = p54_parse_firmware(dev, priv->firmware);
+ if (ret) {
+- release_firmware(priv->firmware);
++ /* the firmware is released by the caller */
+ return ret;
+ }
+
+@@ -659,6 +659,7 @@ static int p54spi_probe(struct spi_device *spi)
+ return 0;
+
+ err_free_common:
++ release_firmware(priv->firmware);
+ free_irq(gpio_to_irq(p54spi_gpio_irq), spi);
+ err_free_gpio_irq:
+ gpio_free(p54spi_gpio_irq);
+diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
+index e9ec63e0e395b..1f39f2ed633d2 100644
+--- a/drivers/net/wireless/mac80211_hwsim.c
++++ b/drivers/net/wireless/mac80211_hwsim.c
+@@ -680,7 +680,7 @@ struct mac80211_hwsim_data {
+ bool ps_poll_pending;
+ struct dentry *debugfs;
+
+- uintptr_t pending_cookie;
++ atomic_t pending_cookie;
+ struct sk_buff_head pending; /* packets pending */
+ /*
+ * Only radios in the same group can communicate together (the
+@@ -1416,8 +1416,7 @@ static void mac80211_hwsim_tx_frame_nl(struct ieee80211_hw *hw,
+ goto nla_put_failure;
+
+ /* We create a cookie to identify this skb */
+- data->pending_cookie++;
+- cookie = data->pending_cookie;
++ cookie = atomic_inc_return(&data->pending_cookie);
+ info->rate_driver_data[0] = (void *)cookie;
+ if (nla_put_u64_64bit(skb, HWSIM_ATTR_COOKIE, cookie, HWSIM_ATTR_PAD))
+ goto nla_put_failure;
+@@ -4080,6 +4079,7 @@ static int hwsim_tx_info_frame_received_nl(struct sk_buff *skb_2,
+ const u8 *src;
+ unsigned int hwsim_flags;
+ int i;
++ unsigned long flags;
+ bool found = false;
+
+ if (!info->attrs[HWSIM_ATTR_ADDR_TRANSMITTER] ||
+@@ -4107,18 +4107,20 @@ static int hwsim_tx_info_frame_received_nl(struct sk_buff *skb_2,
+ }
+
+ /* look for the skb matching the cookie passed back from user */
++ spin_lock_irqsave(&data2->pending.lock, flags);
+ skb_queue_walk_safe(&data2->pending, skb, tmp) {
+- u64 skb_cookie;
++ uintptr_t skb_cookie;
+
+ txi = IEEE80211_SKB_CB(skb);
+- skb_cookie = (u64)(uintptr_t)txi->rate_driver_data[0];
++ skb_cookie = (uintptr_t)txi->rate_driver_data[0];
+
+ if (skb_cookie == ret_skb_cookie) {
+- skb_unlink(skb, &data2->pending);
++ __skb_unlink(skb, &data2->pending);
+ found = true;
+ break;
+ }
+ }
++ spin_unlock_irqrestore(&data2->pending.lock, flags);
+
+ /* not found */
+ if (!found)
+diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c
+index 5d6dc1dd050d4..32fdc4150b605 100644
+--- a/drivers/net/wireless/marvell/libertas/if_usb.c
++++ b/drivers/net/wireless/marvell/libertas/if_usb.c
+@@ -287,6 +287,7 @@ static int if_usb_probe(struct usb_interface *intf,
+ return 0;
+
+ err_get_fw:
++ usb_put_dev(udev);
+ lbs_remove_card(priv);
+ err_add_card:
+ if_usb_reset_device(cardp);
+diff --git a/drivers/net/wireless/mediatek/mt76/eeprom.c b/drivers/net/wireless/mediatek/mt76/eeprom.c
+index a499861918fa3..9bc8758573fcc 100644
+--- a/drivers/net/wireless/mediatek/mt76/eeprom.c
++++ b/drivers/net/wireless/mediatek/mt76/eeprom.c
+@@ -162,10 +162,13 @@ mt76_find_power_limits_node(struct mt76_dev *dev)
+ }
+
+ if (mt76_string_prop_find(country, dev->alpha2) ||
+- mt76_string_prop_find(regd, region_name))
++ mt76_string_prop_find(regd, region_name)) {
++ of_node_put(np);
+ return cur;
++ }
+ }
+
++ of_node_put(np);
+ return fallback;
+ }
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
+index 8a2fedbb1451c..2cd7b3f1db648 100644
+--- a/drivers/net/wireless/mediatek/mt76/mac80211.c
++++ b/drivers/net/wireless/mediatek/mt76/mac80211.c
+@@ -210,6 +210,7 @@ static int mt76_led_init(struct mt76_dev *dev)
+ if (!of_property_read_u32(np, "led-sources", &led_pin))
+ dev->led_pin = led_pin;
+ dev->led_al = of_property_read_bool(np, "led-active-low");
++ of_node_put(np);
+ }
+
+ return led_classdev_register(dev->dev, &dev->led_cdev);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
+index bd687f7de6289..9e832b27170fe 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
+@@ -2282,6 +2282,7 @@ mt7615_dfs_init_radar_specs(struct mt7615_phy *phy)
+
+ int mt7615_dfs_init_radar_detector(struct mt7615_phy *phy)
+ {
++ struct cfg80211_chan_def *chandef = &phy->mt76->chandef;
+ struct mt7615_dev *dev = phy->dev;
+ bool ext_phy = phy != &dev->phy;
+ enum mt76_dfs_state dfs_state, prev_state;
+@@ -2292,13 +2293,13 @@ int mt7615_dfs_init_radar_detector(struct mt7615_phy *phy)
+
+ prev_state = phy->mt76->dfs_state;
+ dfs_state = mt76_phy_dfs_state(phy->mt76);
++ if ((chandef->chan->flags & IEEE80211_CHAN_RADAR) &&
++ dfs_state < MT_DFS_STATE_CAC)
++ dfs_state = MT_DFS_STATE_ACTIVE;
+
+ if (prev_state == dfs_state)
+ return 0;
+
+- if (prev_state == MT_DFS_STATE_UNKNOWN)
+- mt7615_dfs_stop_radar_detector(phy);
+-
+ if (dfs_state == MT_DFS_STATE_DISABLED)
+ goto stop;
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/main.c b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
+index 6b8e3e7ae4a26..36990637a8a2c 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7615/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
+@@ -282,26 +282,6 @@ static void mt7615_remove_interface(struct ieee80211_hw *hw,
+ mt76_packet_id_flush(&dev->mt76, &mvif->sta.wcid);
+ }
+
+-static void mt7615_init_dfs_state(struct mt7615_phy *phy)
+-{
+- struct mt76_phy *mphy = phy->mt76;
+- struct ieee80211_hw *hw = mphy->hw;
+- struct cfg80211_chan_def *chandef = &hw->conf.chandef;
+-
+- if (hw->conf.flags & IEEE80211_CONF_OFFCHANNEL)
+- return;
+-
+- if (!(chandef->chan->flags & IEEE80211_CHAN_RADAR) &&
+- !(mphy->chandef.chan->flags & IEEE80211_CHAN_RADAR))
+- return;
+-
+- if (mphy->chandef.chan->center_freq == chandef->chan->center_freq &&
+- mphy->chandef.width == chandef->width)
+- return;
+-
+- phy->dfs_state = -1;
+-}
+-
+ int mt7615_set_channel(struct mt7615_phy *phy)
+ {
+ struct mt7615_dev *dev = phy->dev;
+@@ -314,7 +294,6 @@ int mt7615_set_channel(struct mt7615_phy *phy)
+
+ set_bit(MT76_RESET, &phy->mt76->state);
+
+- mt7615_init_dfs_state(phy);
+ mt76_set_channel(phy->mt76);
+
+ if (is_mt7615(&dev->mt76) && dev->flash_eeprom) {
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
+index 97e2a85cb7284..7127d6007ae08 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
+@@ -350,10 +350,11 @@ static int mt7615_mcu_fw_pmctrl(struct mt7615_dev *dev)
+ }
+
+ mt7622_trigger_hif_int(dev, false);
+-
+- pm->stats.last_doze_event = jiffies;
+- pm->stats.awake_time += pm->stats.last_doze_event -
+- pm->stats.last_wake_event;
++ if (!err) {
++ pm->stats.last_doze_event = jiffies;
++ pm->stats.awake_time += pm->stats.last_doze_event -
++ pm->stats.last_wake_event;
++ }
+ out:
+ mutex_unlock(&pm->mutex);
+
+@@ -402,6 +403,9 @@ mt7615_mcu_rx_radar_detected(struct mt7615_dev *dev, struct sk_buff *skb)
+ if (r->band_idx && dev->mt76.phy2)
+ mphy = dev->mt76.phy2;
+
++ if (mt76_phy_dfs_state(mphy) < MT_DFS_STATE_CAC)
++ return;
++
+ ieee80211_radar_detected(mphy->hw);
+ dev->hw_pattern++;
+ }
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h b/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
+index 2e91f6a27d0ff..082c73b571ae7 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
++++ b/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
+@@ -177,7 +177,6 @@ struct mt7615_phy {
+
+ u8 chfreq;
+ u8 rdd_state;
+- int dfs_state;
+
+ u32 rx_ampdu_ts;
+ u32 ampdu_ref;
+diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c b/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
+index 2953df7d8388d..c6c16fe8ee859 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
+@@ -108,7 +108,7 @@ __mt76x02u_mcu_send_msg(struct mt76_dev *dev, struct sk_buff *skb,
+ ret = mt76u_bulk_msg(dev, skb->data, skb->len, NULL, 500,
+ MT_EP_OUT_INBAND_CMD);
+ if (ret)
+- return ret;
++ goto out;
+
+ if (wait_resp)
+ ret = mt76x02u_mcu_wait_resp(dev, seq);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/init.c b/drivers/net/wireless/mediatek/mt76/mt7921/init.c
+index 91fc41922d959..37453e1c136f3 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/init.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/init.c
+@@ -49,8 +49,8 @@ mt7921_init_wiphy(struct ieee80211_hw *hw)
+ struct wiphy *wiphy = hw->wiphy;
+
+ hw->queues = 4;
+- hw->max_rx_aggregation_subframes = 64;
+- hw->max_tx_aggregation_subframes = 128;
++ hw->max_rx_aggregation_subframes = IEEE80211_MAX_AMPDU_BUF_HE;
++ hw->max_tx_aggregation_subframes = IEEE80211_MAX_AMPDU_BUF_HE;
+ hw->netdev_features = NETIF_F_RXCSUM;
+
+ hw->radiotap_timestamp.units_pos =
+@@ -291,7 +291,7 @@ int mt7921_register_device(struct mt7921_dev *dev)
+ IEEE80211_HT_CAP_LDPC_CODING |
+ IEEE80211_HT_CAP_MAX_AMSDU;
+ dev->mphy.sband_5g.sband.vht_cap.cap |=
+- IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_7991 |
++ IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454 |
+ IEEE80211_VHT_CAP_MAX_A_MPDU_LENGTH_EXPONENT_MASK |
+ IEEE80211_VHT_CAP_SU_BEAMFORMEE_CAPABLE |
+ IEEE80211_VHT_CAP_MU_BEAMFORMEE_CAPABLE |
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+index da2be050ed7c4..2a609b25561c7 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+@@ -538,13 +538,6 @@ static int mt7921_load_patch(struct mt7921_dev *dev)
+ if (ret)
+ dev_err(dev->mt76.dev, "Failed to start patch\n");
+
+- if (mt76_is_sdio(&dev->mt76)) {
+- /* activate again */
+- ret = __mt7921_mcu_fw_pmctrl(dev);
+- if (!ret)
+- ret = __mt7921_mcu_drv_pmctrl(dev);
+- }
+-
+ out:
+ sem = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false);
+ switch (sem) {
+@@ -555,6 +548,14 @@ out:
+ dev_err(dev->mt76.dev, "Failed to release patch semaphore\n");
+ break;
+ }
++
++ if (!ret && mt76_is_sdio(&dev->mt76)) {
++ /* activate again */
++ ret = __mt7921_mcu_fw_pmctrl(dev);
++ if (!ret)
++ ret = __mt7921_mcu_drv_pmctrl(dev);
++ }
++
+ release_firmware(fw);
+
+ return ret;
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c
+index 36669e5aeef39..a1ab5f878f81a 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c
+@@ -102,7 +102,7 @@ int mt7921e_mcu_fw_pmctrl(struct mt7921_dev *dev)
+ {
+ struct mt76_phy *mphy = &dev->mt76.phy;
+ struct mt76_connac_pm *pm = &dev->pm;
+- int i, err = 0;
++ int i;
+
+ for (i = 0; i < MT7921_DRV_OWN_RETRY_COUNT; i++) {
+ mt76_wr(dev, MT_CONN_ON_LPCTL, PCIE_LPCR_HOST_SET_OWN);
+@@ -114,12 +114,12 @@ int mt7921e_mcu_fw_pmctrl(struct mt7921_dev *dev)
+ if (i == MT7921_DRV_OWN_RETRY_COUNT) {
+ dev_err(dev->mt76.dev, "firmware own failed\n");
+ clear_bit(MT76_STATE_PM, &mphy->state);
+- err = -EIO;
++ return -EIO;
+ }
+
+ pm->stats.last_doze_event = jiffies;
+ pm->stats.awake_time += pm->stats.last_doze_event -
+ pm->stats.last_wake_event;
+
+- return err;
++ return 0;
+ }
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c
+index 54a5c712a3c3e..c572a3107b8b7 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c
+@@ -136,8 +136,8 @@ int mt7921s_mcu_fw_pmctrl(struct mt7921_dev *dev)
+ struct sdio_func *func = dev->mt76.sdio.func;
+ struct mt76_phy *mphy = &dev->mt76.phy;
+ struct mt76_connac_pm *pm = &dev->pm;
+- int err = 0;
+ u32 status;
++ int err;
+
+ sdio_claim_host(func);
+
+@@ -148,7 +148,7 @@ int mt7921s_mcu_fw_pmctrl(struct mt7921_dev *dev)
+ 2000, 1000000);
+ if (err < 0) {
+ dev_err(dev->mt76.dev, "mailbox ACK not cleared\n");
+- goto err;
++ goto out;
+ }
+ }
+
+@@ -156,18 +156,18 @@ int mt7921s_mcu_fw_pmctrl(struct mt7921_dev *dev)
+
+ err = readx_poll_timeout(mt76s_read_pcr, &dev->mt76, status,
+ !(status & WHLPCR_IS_DRIVER_OWN), 2000, 1000000);
++out:
+ sdio_release_host(func);
+
+-err:
+ if (err < 0) {
+ dev_err(dev->mt76.dev, "firmware own failed\n");
+ clear_bit(MT76_STATE_PM, &mphy->state);
+- err = -EIO;
++ return -EIO;
+ }
+
+ pm->stats.last_doze_event = jiffies;
+ pm->stats.awake_time += pm->stats.last_doze_event -
+ pm->stats.last_wake_event;
+
+- return err;
++ return 0;
+ }
+diff --git a/drivers/net/wireless/microchip/wilc1000/spi.c b/drivers/net/wireless/microchip/wilc1000/spi.c
+index 18420e954402f..2ae8dd3411aca 100644
+--- a/drivers/net/wireless/microchip/wilc1000/spi.c
++++ b/drivers/net/wireless/microchip/wilc1000/spi.c
+@@ -191,11 +191,11 @@ static void wilc_wlan_power(struct wilc *wilc, bool on)
+ /* assert ENABLE: */
+ gpiod_set_value(gpios->enable, 1);
+ mdelay(5);
+- /* deassert RESET: */
+- gpiod_set_value(gpios->reset, 0);
+- } else {
+ /* assert RESET: */
+ gpiod_set_value(gpios->reset, 1);
++ } else {
++ /* deassert RESET: */
++ gpiod_set_value(gpios->reset, 0);
+ /* deassert ENABLE: */
+ gpiod_set_value(gpios->enable, 0);
+ }
+diff --git a/drivers/net/wireless/realtek/rtlwifi/debug.c b/drivers/net/wireless/realtek/rtlwifi/debug.c
+index 901cdfe3723cf..0b1bc04cb6adb 100644
+--- a/drivers/net/wireless/realtek/rtlwifi/debug.c
++++ b/drivers/net/wireless/realtek/rtlwifi/debug.c
+@@ -329,8 +329,8 @@ static ssize_t rtl_debugfs_set_write_h2c(struct file *filp,
+
+ tmp_len = (count > sizeof(tmp) - 1 ? sizeof(tmp) - 1 : count);
+
+- if (!buffer || copy_from_user(tmp, buffer, tmp_len))
+- return count;
++ if (copy_from_user(tmp, buffer, tmp_len))
++ return -EFAULT;
+
+ tmp[tmp_len] = '\0';
+
+@@ -340,8 +340,8 @@ static ssize_t rtl_debugfs_set_write_h2c(struct file *filp,
+ &h2c_data[4], &h2c_data[5],
+ &h2c_data[6], &h2c_data[7]);
+
+- if (h2c_len <= 0)
+- return count;
++ if (h2c_len == 0)
++ return -EINVAL;
+
+ for (i = 0; i < h2c_len; i++)
+ h2c_data_packed[i] = (u8)h2c_data[i];
+diff --git a/drivers/net/wireless/realtek/rtw88/main.c b/drivers/net/wireless/realtek/rtw88/main.c
+index 8b9899e41b0bb..ded952913ae66 100644
+--- a/drivers/net/wireless/realtek/rtw88/main.c
++++ b/drivers/net/wireless/realtek/rtw88/main.c
+@@ -1974,6 +1974,10 @@ int rtw_core_init(struct rtw_dev *rtwdev)
+ timer_setup(&rtwdev->tx_report.purge_timer,
+ rtw_tx_report_purge_timer, 0);
+ rtwdev->tx_wq = alloc_workqueue("rtw_tx_wq", WQ_UNBOUND | WQ_HIGHPRI, 0);
++ if (!rtwdev->tx_wq) {
++ rtw_warn(rtwdev, "alloc_workqueue rtw_tx_wq failed\n");
++ return -ENOMEM;
++ }
+
+ INIT_DELAYED_WORK(&rtwdev->watch_dog_work, rtw_watch_dog_work);
+ INIT_DELAYED_WORK(&coex->bt_relink_work, rtw_coex_bt_relink_work);
+diff --git a/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c b/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c
+index ad272854c442f..8e4869eb3a245 100644
+--- a/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c
++++ b/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c
+@@ -2330,8 +2330,8 @@ static u8 _dpk_pas_read(struct rtw89_dev *rtwdev, bool is_check)
+ val2_q = abs(sign_extend32(val2_q, 11));
+
+ rtw89_debug(rtwdev, RTW89_DBG_RFK, "[DPK] PAS_delta = 0x%x\n",
+- (val1_i * val1_i + val1_q * val1_q) /
+- (val2_i * val2_i + val2_q * val2_q));
++ phy_div(val1_i * val1_i + val1_q * val1_q,
++ val2_i * val2_i + val2_q * val2_q));
+
+ } else {
+ for (i = 0; i < 32; i++) {
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index c9831daafbc61..a58a69999dbc5 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -1897,8 +1897,10 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
+
+ if (ns->head->ids.csi == NVME_CSI_ZNS) {
+ ret = nvme_update_zone_info(ns, lbaf);
+- if (ret)
+- goto out_unfreeze;
++ if (ret) {
++ blk_mq_unfreeze_queue(ns->disk->queue);
++ goto out;
++ }
+ }
+
+ set_disk_ro(ns->disk, (id->nsattr & NVME_NS_ATTR_RO) ||
+@@ -1909,7 +1911,7 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
+ if (blk_queue_is_zoned(ns->queue)) {
+ ret = nvme_revalidate_zones(ns);
+ if (ret && !nvme_first_scan(ns->disk))
+- return ret;
++ goto out;
+ }
+
+ if (nvme_ns_head_multipath(ns->head)) {
+@@ -1924,9 +1926,9 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
+ disk_update_readahead(ns->head->disk);
+ blk_mq_unfreeze_queue(ns->head->disk->queue);
+ }
+- return 0;
+
+-out_unfreeze:
++ ret = 0;
++out:
+ /*
+ * If probing fails due an unsupported feature, hide the block device,
+ * but still allow other access.
+@@ -1936,7 +1938,6 @@ out_unfreeze:
+ set_bit(NVME_NS_READY, &ns->flags);
+ ret = 0;
+ }
+- blk_mq_unfreeze_queue(ns->disk->queue);
+ return ret;
+ }
+
+@@ -2093,6 +2094,7 @@ static int nvme_report_zones(struct gendisk *disk, sector_t sector,
+ static const struct block_device_operations nvme_bdev_ops = {
+ .owner = THIS_MODULE,
+ .ioctl = nvme_ioctl,
++ .compat_ioctl = blkdev_compat_ptr_ioctl,
+ .open = nvme_open,
+ .release = nvme_release,
+ .getgeo = nvme_getgeo,
+diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
+index 080f85f4105f3..05f9da251758f 100644
+--- a/drivers/nvme/host/fc.c
++++ b/drivers/nvme/host/fc.c
+@@ -1899,6 +1899,24 @@ nvme_fc_ctrl_ioerr_work(struct work_struct *work)
+ nvme_fc_error_recovery(ctrl, "transport detected io error");
+ }
+
++/*
++ * nvme_fc_io_getuuid - Routine called to get the appid field
++ * associated with request by the lldd
++ * @req:IO request from nvme fc to driver
++ * Returns: UUID if there is an appid associated with VM or
++ * NULL if the user/libvirt has not set the appid to VM
++ */
++char *nvme_fc_io_getuuid(struct nvmefc_fcp_req *req)
++{
++ struct nvme_fc_fcp_op *op = fcp_req_to_fcp_op(req);
++ struct request *rq = op->rq;
++
++ if (!IS_ENABLED(CONFIG_BLK_CGROUP_FC_APPID) || !rq->bio)
++ return NULL;
++ return blkcg_get_fc_appid(rq->bio);
++}
++EXPORT_SYMBOL_GPL(nvme_fc_io_getuuid);
++
+ static void
+ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
+ {
+diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
+index d464fdf978fba..b0fe23439c4a9 100644
+--- a/drivers/nvme/host/multipath.c
++++ b/drivers/nvme/host/multipath.c
+@@ -408,6 +408,7 @@ const struct block_device_operations nvme_ns_head_ops = {
+ .open = nvme_ns_head_open,
+ .release = nvme_ns_head_release,
+ .ioctl = nvme_ns_head_ioctl,
++ .compat_ioctl = blkdev_compat_ptr_ioctl,
+ .getgeo = nvme_getgeo,
+ .report_zones = nvme_ns_head_report_zones,
+ .pr_ops = &nvme_pr_ops,
+diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
+index 37c7f4c89f92e..6f0eaf6a15282 100644
+--- a/drivers/nvme/host/trace.h
++++ b/drivers/nvme/host/trace.h
+@@ -98,7 +98,7 @@ TRACE_EVENT(nvme_complete_rq,
+ TP_fast_assign(
+ __entry->ctrl_id = nvme_req(req)->ctrl->instance;
+ __entry->qid = nvme_req_qid(req);
+- __entry->cid = req->tag;
++ __entry->cid = nvme_req(req)->cmd->common.command_id;
+ __entry->result = le64_to_cpu(nvme_req(req)->result.u64);
+ __entry->retries = nvme_req(req)->retries;
+ __entry->flags = nvme_req(req)->flags;
+diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c
+index e34718b095504..82b61acf7a72b 100644
+--- a/drivers/nvme/target/zns.c
++++ b/drivers/nvme/target/zns.c
+@@ -34,8 +34,7 @@ static int validate_conv_zones_cb(struct blk_zone *z,
+
+ bool nvmet_bdev_zns_enable(struct nvmet_ns *ns)
+ {
+- struct request_queue *q = ns->bdev->bd_disk->queue;
+- u8 zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
++ u8 zasl = nvmet_zasl(bdev_max_zone_append_sectors(ns->bdev));
+ struct gendisk *bd_disk = ns->bdev->bd_disk;
+ int ret;
+
+diff --git a/drivers/of/device.c b/drivers/of/device.c
+index 874f031442dc7..75b6cbffa7558 100644
+--- a/drivers/of/device.c
++++ b/drivers/of/device.c
+@@ -81,8 +81,11 @@ of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
+ * restricted-dma-pool region is allowed.
+ */
+ if (of_device_is_compatible(node, "restricted-dma-pool") &&
+- of_device_is_available(node))
++ of_device_is_available(node)) {
++ of_node_put(node);
+ break;
++ }
++ of_node_put(node);
+ }
+
+ /*
+diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
+index 0f30496ce80bf..d624e8b185aac 100644
+--- a/drivers/of/fdt.c
++++ b/drivers/of/fdt.c
+@@ -246,7 +246,7 @@ static int populate_node(const void *blob,
+ }
+
+ *pnp = np;
+- return true;
++ return 0;
+ }
+
+ static void reverse_nodes(struct device_node *parent)
+diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
+index 8d374cc552be5..91b04b04eec45 100644
+--- a/drivers/of/kexec.c
++++ b/drivers/of/kexec.c
+@@ -126,6 +126,7 @@ int ima_get_kexec_buffer(void **addr, size_t *size)
+ {
+ int ret, len;
+ unsigned long tmp_addr;
++ unsigned long start_pfn, end_pfn;
+ size_t tmp_size;
+ const void *prop;
+
+@@ -140,6 +141,22 @@ int ima_get_kexec_buffer(void **addr, size_t *size)
+ if (ret)
+ return ret;
+
++ /* Do some sanity on the returned size for the ima-kexec buffer */
++ if (!tmp_size)
++ return -ENOENT;
++
++ /*
++ * Calculate the PFNs for the buffer and ensure
++ * they are with in addressable memory.
++ */
++ start_pfn = PHYS_PFN(tmp_addr);
++ end_pfn = PHYS_PFN(tmp_addr + tmp_size - 1);
++ if (!page_is_ram(start_pfn) || !page_is_ram(end_pfn)) {
++ pr_warn("IMA buffer at 0x%lx, size = 0x%zx beyond memory\n",
++ tmp_addr, tmp_size);
++ return -EINVAL;
++ }
++
+ *addr = __va(tmp_addr);
+ *size = tmp_size;
+
+diff --git a/drivers/opp/core.c b/drivers/opp/core.c
+index 7404072522982..e154b18ec4b1c 100644
+--- a/drivers/opp/core.c
++++ b/drivers/opp/core.c
+@@ -2413,8 +2413,8 @@ struct opp_table *dev_pm_opp_attach_genpd(struct device *dev,
+ }
+
+ virt_dev = dev_pm_domain_attach_by_name(dev, *name);
+- if (IS_ERR(virt_dev)) {
+- ret = PTR_ERR(virt_dev);
++ if (IS_ERR_OR_NULL(virt_dev)) {
++ ret = PTR_ERR(virt_dev) ? : -ENODEV;
+ dev_err(dev, "Couldn't attach to pm_domain: %d\n", ret);
+ goto err;
+ }
+diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c
+index 732b516c7bf84..afc6e66ddc31c 100644
+--- a/drivers/parisc/lba_pci.c
++++ b/drivers/parisc/lba_pci.c
+@@ -1476,9 +1476,13 @@ lba_driver_probe(struct parisc_device *dev)
+ u32 func_class;
+ void *tmp_obj;
+ char *version;
+- void __iomem *addr = ioremap(dev->hpa.start, 4096);
++ void __iomem *addr;
+ int max;
+
++ addr = ioremap(dev->hpa.start, 4096);
++ if (addr == NULL)
++ return -ENOMEM;
++
+ /* Read HW Rev First */
+ func_class = READ_REG32(addr + LBA_FCLASS);
+
+diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
+index 0eda8236c125a..13c2e73f0eaf8 100644
+--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
++++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
+@@ -780,8 +780,9 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
+ ep->msi_mem = pci_epc_mem_alloc_addr(epc, &ep->msi_mem_phys,
+ epc->mem->window.page_size);
+ if (!ep->msi_mem) {
++ ret = -ENOMEM;
+ dev_err(dev, "Failed to reserve memory for MSI/MSI-X\n");
+- return -ENOMEM;
++ goto err_exit_epc_mem;
+ }
+
+ if (ep->ops->get_features) {
+@@ -790,6 +791,19 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
+ return 0;
+ }
+
+- return dw_pcie_ep_init_complete(ep);
++ ret = dw_pcie_ep_init_complete(ep);
++ if (ret)
++ goto err_free_epc_mem;
++
++ return 0;
++
++err_free_epc_mem:
++ pci_epc_mem_free_addr(epc, ep->msi_mem_phys, ep->msi_mem,
++ epc->mem->window.page_size);
++
++err_exit_epc_mem:
++ pci_epc_mem_exit(epc);
++
++ return ret;
+ }
+ EXPORT_SYMBOL_GPL(dw_pcie_ep_init);
+diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
+index 9979302532b72..d0d768f22ac30 100644
+--- a/drivers/pci/controller/dwc/pcie-designware-host.c
++++ b/drivers/pci/controller/dwc/pcie-designware-host.c
+@@ -421,8 +421,14 @@ int dw_pcie_host_init(struct pcie_port *pp)
+ bridge->sysdata = pp;
+
+ ret = pci_host_probe(bridge);
+- if (!ret)
+- return 0;
++ if (ret)
++ goto err_stop_link;
++
++ return 0;
++
++err_stop_link:
++ if (pci->ops && pci->ops->stop_link)
++ pci->ops->stop_link(pci);
+
+ err_free_msi:
+ if (pp->has_msi_ctrl)
+@@ -433,8 +439,14 @@ EXPORT_SYMBOL_GPL(dw_pcie_host_init);
+
+ void dw_pcie_host_deinit(struct pcie_port *pp)
+ {
++ struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
++
+ pci_stop_root_bus(pp->bridge->bus);
+ pci_remove_root_bus(pp->bridge->bus);
++
++ if (pci->ops && pci->ops->stop_link)
++ pci->ops->stop_link(pci);
++
+ if (pp->has_msi_ctrl)
+ dw_pcie_free_msi(pp);
+ }
+@@ -531,7 +543,6 @@ static struct pci_ops dw_pcie_ops = {
+
+ void dw_pcie_setup_rc(struct pcie_port *pp)
+ {
+- int i;
+ u32 val, ctrl, num_ctrls;
+ struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+
+@@ -582,19 +593,22 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
+ PCI_COMMAND_MASTER | PCI_COMMAND_SERR;
+ dw_pcie_writel_dbi(pci, PCI_COMMAND, val);
+
+- /* Ensure all outbound windows are disabled so there are multiple matches */
+- for (i = 0; i < pci->num_ob_windows; i++)
+- dw_pcie_disable_atu(pci, i, DW_PCIE_REGION_OUTBOUND);
+-
+ /*
+ * If the platform provides its own child bus config accesses, it means
+ * the platform uses its own address translation component rather than
+ * ATU, so we should not program the ATU here.
+ */
+ if (pp->bridge->child_ops == &dw_child_pcie_ops) {
+- int atu_idx = 0;
++ int i, atu_idx = 0;
+ struct resource_entry *entry;
+
++ /*
++ * Disable all outbound windows to make sure a transaction
++ * can't match multiple windows.
++ */
++ for (i = 0; i < pci->num_ob_windows; i++)
++ dw_pcie_disable_atu(pci, i, DW_PCIE_REGION_OUTBOUND);
++
+ /* Get last memory resource entry */
+ resource_list_for_each_entry(entry, &pp->bridge->windows) {
+ if (resource_type(entry->res) != IORESOURCE_MEM)
+diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
+index d92c8a25094fa..5848cc520b52e 100644
+--- a/drivers/pci/controller/dwc/pcie-designware.c
++++ b/drivers/pci/controller/dwc/pcie-designware.c
+@@ -287,8 +287,8 @@ static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, u8 func_no,
+ dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_UPPER_TARGET,
+ upper_32_bits(pci_addr));
+ val = type | PCIE_ATU_FUNC_NUM(func_no);
+- val = upper_32_bits(size - 1) ?
+- val | PCIE_ATU_INCREASE_REGION_SIZE : val;
++ if (upper_32_bits(limit_addr) > upper_32_bits(cpu_addr))
++ val |= PCIE_ATU_INCREASE_REGION_SIZE;
+ if (pci->version == 0x490A)
+ val = dw_pcie_enable_ecrc(val);
+ dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL1, val);
+@@ -315,6 +315,7 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
+ u64 pci_addr, u64 size)
+ {
+ u32 retries, val;
++ u64 limit_addr;
+
+ if (pci->ops && pci->ops->cpu_addr_fixup)
+ cpu_addr = pci->ops->cpu_addr_fixup(pci, cpu_addr);
+@@ -325,6 +326,8 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
+ return;
+ }
+
++ limit_addr = cpu_addr + size - 1;
++
+ dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT,
+ PCIE_ATU_REGION_OUTBOUND | index);
+ dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_BASE,
+@@ -332,17 +335,18 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
+ dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_BASE,
+ upper_32_bits(cpu_addr));
+ dw_pcie_writel_dbi(pci, PCIE_ATU_LIMIT,
+- lower_32_bits(cpu_addr + size - 1));
++ lower_32_bits(limit_addr));
+ if (pci->version >= 0x460A)
+ dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_LIMIT,
+- upper_32_bits(cpu_addr + size - 1));
++ upper_32_bits(limit_addr));
+ dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_TARGET,
+ lower_32_bits(pci_addr));
+ dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_TARGET,
+ upper_32_bits(pci_addr));
+ val = type | PCIE_ATU_FUNC_NUM(func_no);
+- val = ((upper_32_bits(size - 1)) && (pci->version >= 0x460A)) ?
+- val | PCIE_ATU_INCREASE_REGION_SIZE : val;
++ if (upper_32_bits(limit_addr) > upper_32_bits(cpu_addr) &&
++ pci->version >= 0x460A)
++ val |= PCIE_ATU_INCREASE_REGION_SIZE;
+ if (pci->version == 0x490A)
+ val = dw_pcie_enable_ecrc(val);
+ dw_pcie_writel_dbi(pci, PCIE_ATU_CR1, val);
+@@ -491,7 +495,7 @@ int dw_pcie_prog_inbound_atu(struct dw_pcie *pci, u8 func_no, int index,
+ void dw_pcie_disable_atu(struct dw_pcie *pci, int index,
+ enum dw_pcie_region_type type)
+ {
+- int region;
++ u32 region;
+
+ switch (type) {
+ case DW_PCIE_REGION_INBOUND:
+@@ -504,8 +508,18 @@ void dw_pcie_disable_atu(struct dw_pcie *pci, int index,
+ return;
+ }
+
+- dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT, region | index);
+- dw_pcie_writel_dbi(pci, PCIE_ATU_CR2, ~(u32)PCIE_ATU_ENABLE);
++ if (pci->iatu_unroll_enabled) {
++ if (region == PCIE_ATU_REGION_INBOUND) {
++ dw_pcie_writel_ib_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL2,
++ ~(u32)PCIE_ATU_ENABLE);
++ } else {
++ dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL2,
++ ~(u32)PCIE_ATU_ENABLE);
++ }
++ } else {
++ dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT, region | index);
++ dw_pcie_writel_dbi(pci, PCIE_ATU_CR2, ~(u32)PCIE_ATU_ENABLE);
++ }
+ }
+
+ int dw_pcie_wait_for_link(struct dw_pcie *pci)
+@@ -726,6 +740,13 @@ void dw_pcie_setup(struct dw_pcie *pci)
+ val |= PORT_LINK_DLL_LINK_EN;
+ dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val);
+
++ if (of_property_read_bool(np, "snps,enable-cdm-check")) {
++ val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
++ val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS |
++ PCIE_PL_CHK_REG_CHK_REG_START;
++ dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
++ }
++
+ of_property_read_u32(np, "num-lanes", &pci->num_lanes);
+ if (!pci->num_lanes) {
+ dev_dbg(pci->dev, "Using h/w default number of lanes\n");
+@@ -772,11 +793,4 @@ void dw_pcie_setup(struct dw_pcie *pci)
+ break;
+ }
+ dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val);
+-
+- if (of_property_read_bool(np, "snps,enable-cdm-check")) {
+- val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
+- val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS |
+- PCIE_PL_CHK_REG_CHK_REG_START;
+- dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
+- }
+ }
+diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
+index ed55421eb9ba9..340542aab8a57 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom.c
++++ b/drivers/pci/controller/dwc/pcie-qcom.c
+@@ -337,8 +337,6 @@ static int qcom_pcie_init_2_1_0(struct qcom_pcie *pcie)
+ reset_control_assert(res->ext_reset);
+ reset_control_assert(res->phy_reset);
+
+- writel(1, pcie->parf + PCIE20_PARF_PHY_CTRL);
+-
+ ret = regulator_bulk_enable(ARRAY_SIZE(res->supplies), res->supplies);
+ if (ret < 0) {
+ dev_err(dev, "cannot enable regulators\n");
+@@ -381,15 +379,15 @@ static int qcom_pcie_init_2_1_0(struct qcom_pcie *pcie)
+ goto err_deassert_axi;
+ }
+
+- ret = clk_bulk_prepare_enable(ARRAY_SIZE(res->clks), res->clks);
+- if (ret)
+- goto err_clks;
+-
+ /* enable PCIe clocks and resets */
+ val = readl(pcie->parf + PCIE20_PARF_PHY_CTRL);
+ val &= ~BIT(0);
+ writel(val, pcie->parf + PCIE20_PARF_PHY_CTRL);
+
++ ret = clk_bulk_prepare_enable(ARRAY_SIZE(res->clks), res->clks);
++ if (ret)
++ goto err_clks;
++
+ if (of_device_is_compatible(node, "qcom,pcie-ipq8064") ||
+ of_device_is_compatible(node, "qcom,pcie-ipq8064-v2")) {
+ writel(PCS_DEEMPH_TX_DEEMPH_GEN1(24) |
+@@ -1038,9 +1036,7 @@ static int qcom_pcie_init_2_3_3(struct qcom_pcie *pcie)
+ struct qcom_pcie_resources_2_3_3 *res = &pcie->res.v2_3_3;
+ struct dw_pcie *pci = pcie->pci;
+ struct device *dev = pci->dev;
+- u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+ int i, ret;
+- u32 val;
+
+ for (i = 0; i < ARRAY_SIZE(res->rst); i++) {
+ ret = reset_control_assert(res->rst[i]);
+@@ -1097,6 +1093,33 @@ static int qcom_pcie_init_2_3_3(struct qcom_pcie *pcie)
+ goto err_clk_aux;
+ }
+
++ return 0;
++
++err_clk_aux:
++ clk_disable_unprepare(res->ahb_clk);
++err_clk_ahb:
++ clk_disable_unprepare(res->axi_s_clk);
++err_clk_axi_s:
++ clk_disable_unprepare(res->axi_m_clk);
++err_clk_axi_m:
++ clk_disable_unprepare(res->iface);
++err_clk_iface:
++ /*
++ * Not checking for failure, will anyway return
++ * the original failure in 'ret'.
++ */
++ for (i = 0; i < ARRAY_SIZE(res->rst); i++)
++ reset_control_assert(res->rst[i]);
++
++ return ret;
++}
++
++static int qcom_pcie_post_init_2_3_3(struct qcom_pcie *pcie)
++{
++ struct dw_pcie *pci = pcie->pci;
++ u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
++ u32 val;
++
+ writel(SLV_ADDR_SPACE_SZ,
+ pcie->parf + PCIE20_v3_PARF_SLV_ADDR_SPACE_SIZE);
+
+@@ -1124,24 +1147,6 @@ static int qcom_pcie_init_2_3_3(struct qcom_pcie *pcie)
+ PCI_EXP_DEVCTL2);
+
+ return 0;
+-
+-err_clk_aux:
+- clk_disable_unprepare(res->ahb_clk);
+-err_clk_ahb:
+- clk_disable_unprepare(res->axi_s_clk);
+-err_clk_axi_s:
+- clk_disable_unprepare(res->axi_m_clk);
+-err_clk_axi_m:
+- clk_disable_unprepare(res->iface);
+-err_clk_iface:
+- /*
+- * Not checking for failure, will anyway return
+- * the original failure in 'ret'.
+- */
+- for (i = 0; i < ARRAY_SIZE(res->rst); i++)
+- reset_control_assert(res->rst[i]);
+-
+- return ret;
+ }
+
+ static int qcom_pcie_get_resources_2_7_0(struct qcom_pcie *pcie)
+@@ -1467,6 +1472,7 @@ static const struct qcom_pcie_ops ops_2_4_0 = {
+ static const struct qcom_pcie_ops ops_2_3_3 = {
+ .get_resources = qcom_pcie_get_resources_2_3_3,
+ .init = qcom_pcie_init_2_3_3,
++ .post_init = qcom_pcie_post_init_2_3_3,
+ .deinit = qcom_pcie_deinit_2_3_3,
+ .ltssm_enable = qcom_pcie_2_3_2_ltssm_enable,
+ };
+diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
+index b1b5f836a8064..a479e58cc8fb2 100644
+--- a/drivers/pci/controller/dwc/pcie-tegra194.c
++++ b/drivers/pci/controller/dwc/pcie-tegra194.c
+@@ -352,15 +352,14 @@ static irqreturn_t tegra_pcie_rp_irq_handler(int irq, void *arg)
+ struct tegra194_pcie *pcie = arg;
+ struct dw_pcie *pci = &pcie->pci;
+ struct pcie_port *pp = &pci->pp;
+- u32 val, tmp;
++ u32 val, status_l0, status_l1;
+ u16 val_w;
+
+- val = appl_readl(pcie, APPL_INTR_STATUS_L0);
+- if (val & APPL_INTR_STATUS_L0_LINK_STATE_INT) {
+- val = appl_readl(pcie, APPL_INTR_STATUS_L1_0_0);
+- if (val & APPL_INTR_STATUS_L1_0_0_LINK_REQ_RST_NOT_CHGED) {
+- appl_writel(pcie, val, APPL_INTR_STATUS_L1_0_0);
+-
++ status_l0 = appl_readl(pcie, APPL_INTR_STATUS_L0);
++ if (status_l0 & APPL_INTR_STATUS_L0_LINK_STATE_INT) {
++ status_l1 = appl_readl(pcie, APPL_INTR_STATUS_L1_0_0);
++ appl_writel(pcie, status_l1, APPL_INTR_STATUS_L1_0_0);
++ if (status_l1 & APPL_INTR_STATUS_L1_0_0_LINK_REQ_RST_NOT_CHGED) {
+ /* SBR & Surprise Link Down WAR */
+ val = appl_readl(pcie, APPL_CAR_RESET_OVRD);
+ val &= ~APPL_CAR_RESET_OVRD_CYA_OVERRIDE_CORE_RST_N;
+@@ -376,15 +375,15 @@ static irqreturn_t tegra_pcie_rp_irq_handler(int irq, void *arg)
+ }
+ }
+
+- if (val & APPL_INTR_STATUS_L0_INT_INT) {
+- val = appl_readl(pcie, APPL_INTR_STATUS_L1_8_0);
+- if (val & APPL_INTR_STATUS_L1_8_0_AUTO_BW_INT_STS) {
++ if (status_l0 & APPL_INTR_STATUS_L0_INT_INT) {
++ status_l1 = appl_readl(pcie, APPL_INTR_STATUS_L1_8_0);
++ if (status_l1 & APPL_INTR_STATUS_L1_8_0_AUTO_BW_INT_STS) {
+ appl_writel(pcie,
+ APPL_INTR_STATUS_L1_8_0_AUTO_BW_INT_STS,
+ APPL_INTR_STATUS_L1_8_0);
+ apply_bad_link_workaround(pp);
+ }
+- if (val & APPL_INTR_STATUS_L1_8_0_BW_MGT_INT_STS) {
++ if (status_l1 & APPL_INTR_STATUS_L1_8_0_BW_MGT_INT_STS) {
+ appl_writel(pcie,
+ APPL_INTR_STATUS_L1_8_0_BW_MGT_INT_STS,
+ APPL_INTR_STATUS_L1_8_0);
+@@ -396,25 +395,24 @@ static irqreturn_t tegra_pcie_rp_irq_handler(int irq, void *arg)
+ }
+ }
+
+- val = appl_readl(pcie, APPL_INTR_STATUS_L0);
+- if (val & APPL_INTR_STATUS_L0_CDM_REG_CHK_INT) {
+- val = appl_readl(pcie, APPL_INTR_STATUS_L1_18);
+- tmp = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
+- if (val & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMPLT) {
++ if (status_l0 & APPL_INTR_STATUS_L0_CDM_REG_CHK_INT) {
++ status_l1 = appl_readl(pcie, APPL_INTR_STATUS_L1_18);
++ val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
++ if (status_l1 & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMPLT) {
+ dev_info(pci->dev, "CDM check complete\n");
+- tmp |= PCIE_PL_CHK_REG_CHK_REG_COMPLETE;
++ val |= PCIE_PL_CHK_REG_CHK_REG_COMPLETE;
+ }
+- if (val & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMP_ERR) {
++ if (status_l1 & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMP_ERR) {
+ dev_err(pci->dev, "CDM comparison mismatch\n");
+- tmp |= PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR;
++ val |= PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR;
+ }
+- if (val & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_LOGIC_ERR) {
++ if (status_l1 & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_LOGIC_ERR) {
+ dev_err(pci->dev, "CDM Logic error\n");
+- tmp |= PCIE_PL_CHK_REG_CHK_REG_LOGIC_ERROR;
++ val |= PCIE_PL_CHK_REG_CHK_REG_LOGIC_ERROR;
+ }
+- dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, tmp);
+- tmp = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_ERR_ADDR);
+- dev_err(pci->dev, "CDM Error Address Offset = 0x%08X\n", tmp);
++ dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
++ val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_ERR_ADDR);
++ dev_err(pci->dev, "CDM Error Address Offset = 0x%08X\n", val);
+ }
+
+ return IRQ_HANDLED;
+@@ -980,7 +978,7 @@ retry_link:
+ offset = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_DLF);
+ val = dw_pcie_readl_dbi(pci, offset + PCI_DLF_CAP);
+ val &= ~PCI_DLF_EXCHANGE_ENABLE;
+- dw_pcie_writel_dbi(pci, offset, val);
++ dw_pcie_writel_dbi(pci, offset + PCI_DLF_CAP, val);
+
+ tegra194_pcie_host_init(pp);
+ dw_pcie_setup_rc(pp);
+@@ -1951,6 +1949,7 @@ static int tegra_pcie_config_ep(struct tegra194_pcie *pcie,
+ if (ret) {
+ dev_err(dev, "Failed to initialize DWC Endpoint subsystem: %d\n",
+ ret);
++ pm_runtime_disable(dev);
+ return ret;
+ }
+
+diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
+index 5d9fd36b02d18..a02c466a597cd 100644
+--- a/drivers/pci/controller/pcie-mediatek-gen3.c
++++ b/drivers/pci/controller/pcie-mediatek-gen3.c
+@@ -600,7 +600,8 @@ static int mtk_pcie_init_irq_domains(struct mtk_gen3_pcie *pcie)
+ &intx_domain_ops, pcie);
+ if (!pcie->intx_domain) {
+ dev_err(dev, "failed to create INTx IRQ domain\n");
+- return -ENODEV;
++ ret = -ENODEV;
++ goto out_put_node;
+ }
+
+ /* Setup MSI */
+@@ -623,13 +624,15 @@ static int mtk_pcie_init_irq_domains(struct mtk_gen3_pcie *pcie)
+ goto err_msi_domain;
+ }
+
++ of_node_put(intc_node);
+ return 0;
+
+ err_msi_domain:
+ irq_domain_remove(pcie->msi_bottom_domain);
+ err_msi_bottom_domain:
+ irq_domain_remove(pcie->intx_domain);
+-
++out_put_node:
++ of_node_put(intc_node);
+ return ret;
+ }
+
+diff --git a/drivers/pci/controller/pcie-microchip-host.c b/drivers/pci/controller/pcie-microchip-host.c
+index 2c52a8cef7260..932b1c182149f 100644
+--- a/drivers/pci/controller/pcie-microchip-host.c
++++ b/drivers/pci/controller/pcie-microchip-host.c
+@@ -904,6 +904,7 @@ static int mc_pcie_init_irq_domains(struct mc_pcie *port)
+ &event_domain_ops, port);
+ if (!port->event_domain) {
+ dev_err(dev, "failed to get event domain\n");
++ of_node_put(pcie_intc_node);
+ return -ENOMEM;
+ }
+
+@@ -913,6 +914,7 @@ static int mc_pcie_init_irq_domains(struct mc_pcie *port)
+ &intx_domain_ops, port);
+ if (!port->intx_domain) {
+ dev_err(dev, "failed to get an INTx IRQ domain\n");
++ of_node_put(pcie_intc_node);
+ return -ENOMEM;
+ }
+
+diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
+index 5b833f00e9800..a5ed779b0a512 100644
+--- a/drivers/pci/endpoint/functions/pci-epf-test.c
++++ b/drivers/pci/endpoint/functions/pci-epf-test.c
+@@ -627,7 +627,6 @@ static void pci_epf_test_unbind(struct pci_epf *epf)
+
+ cancel_delayed_work(&epf_test->cmd_handler);
+ pci_epf_test_clean_dma_chan(epf_test);
+- pci_epc_stop(epc);
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+ epf_bar = &epf->bar[bar];
+
+diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
+index 7952e5efd6cf3..a1e38ca93cd96 100644
+--- a/drivers/pci/pcie/aer.c
++++ b/drivers/pci/pcie/aer.c
+@@ -538,7 +538,7 @@ static const char *aer_agent_string[] = {
+ u64 *stats = pdev->aer_stats->stats_array; \
+ size_t len = 0; \
+ \
+- for (i = 0; i < ARRAY_SIZE(strings_array); i++) { \
++ for (i = 0; i < ARRAY_SIZE(pdev->aer_stats->stats_array); i++) {\
+ if (strings_array[i]) \
+ len += sysfs_emit_at(buf, len, "%s %llu\n", \
+ strings_array[i], \
+@@ -1347,6 +1347,11 @@ static int aer_probe(struct pcie_device *dev)
+ struct device *device = &dev->device;
+ struct pci_dev *port = dev->port;
+
++ BUILD_BUG_ON(ARRAY_SIZE(aer_correctable_error_string) <
++ AER_MAX_TYPEOF_COR_ERRS);
++ BUILD_BUG_ON(ARRAY_SIZE(aer_uncorrectable_error_string) <
++ AER_MAX_TYPEOF_UNCOR_ERRS);
++
+ /* Limit to Root Ports or Root Complex Event Collectors */
+ if ((pci_pcie_type(port) != PCI_EXP_TYPE_RC_EC) &&
+ (pci_pcie_type(port) != PCI_EXP_TYPE_ROOT_PORT))
+diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
+index 604feeb84ee40..1ac7fec47d6fb 100644
+--- a/drivers/pci/pcie/portdrv_core.c
++++ b/drivers/pci/pcie/portdrv_core.c
+@@ -222,15 +222,8 @@ static int get_port_device_capability(struct pci_dev *dev)
+
+ #ifdef CONFIG_PCIEAER
+ if (dev->aer_cap && pci_aer_available() &&
+- (pcie_ports_native || host->native_aer)) {
++ (pcie_ports_native || host->native_aer))
+ services |= PCIE_PORT_SERVICE_AER;
+-
+- /*
+- * Disable AER on this port in case it's been enabled by the
+- * BIOS (the AER service driver will enable it when necessary).
+- */
+- pci_disable_pcie_error_reporting(dev);
+- }
+ #endif
+
+ /* Root Ports and Root Complex Event Collectors may generate PMEs */
+diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
+index d44bcc29d99c8..cd5945e17fdf7 100644
+--- a/drivers/perf/arm_spe_pmu.c
++++ b/drivers/perf/arm_spe_pmu.c
+@@ -39,6 +39,24 @@
+ #include <asm/mmu.h>
+ #include <asm/sysreg.h>
+
++/*
++ * Cache if the event is allowed to trace Context information.
++ * This allows us to perform the check, i.e, perfmon_capable(),
++ * in the context of the event owner, once, during the event_init().
++ */
++#define SPE_PMU_HW_FLAGS_CX BIT(0)
++
++static void set_spe_event_has_cx(struct perf_event *event)
++{
++ if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
++ event->hw.flags |= SPE_PMU_HW_FLAGS_CX;
++}
++
++static bool get_spe_event_has_cx(struct perf_event *event)
++{
++ return !!(event->hw.flags & SPE_PMU_HW_FLAGS_CX);
++}
++
+ #define ARM_SPE_BUF_PAD_BYTE 0
+
+ struct arm_spe_pmu_buf {
+@@ -272,7 +290,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
+ if (!attr->exclude_kernel)
+ reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
+
+- if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
++ if (get_spe_event_has_cx(event))
+ reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
+
+ return reg;
+@@ -709,10 +727,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
+ !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT))
+ return -EOPNOTSUPP;
+
++ set_spe_event_has_cx(event);
+ reg = arm_spe_event_to_pmscr(event);
+ if (!perfmon_capable() &&
+ (reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
+- BIT(SYS_PMSCR_EL1_CX_SHIFT) |
+ BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
+ return -EACCES;
+
+diff --git a/drivers/perf/riscv_pmu.c b/drivers/perf/riscv_pmu.c
+index b2b8d2074ed0d..130b9f1a40e08 100644
+--- a/drivers/perf/riscv_pmu.c
++++ b/drivers/perf/riscv_pmu.c
+@@ -170,7 +170,6 @@ int riscv_pmu_event_set_period(struct perf_event *event)
+ left = (max_period >> 1);
+
+ local64_set(&hwc->prev_count, (u64)-left);
+- perf_event_update_userpage(event);
+
+ return overflow;
+ }
+diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
+index a1317a4835127..77b826d6c05b4 100644
+--- a/drivers/perf/riscv_pmu_sbi.c
++++ b/drivers/perf/riscv_pmu_sbi.c
+@@ -274,8 +274,13 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
+ cflags |= SBI_PMU_CFG_FLAG_SET_UINH;
+
+ /* retrieve the available counter index */
++#if defined(CONFIG_32BIT)
++ ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, cmask,
++ cflags, hwc->event_base, hwc->config, hwc->config >> 32);
++#else
+ ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, cmask,
+ cflags, hwc->event_base, hwc->config, 0);
++#endif
+ if (ret.error) {
+ pr_debug("Not able to find a counter for event %lx config %llx\n",
+ hwc->event_base, hwc->config);
+@@ -417,8 +422,13 @@ static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
+ struct hw_perf_event *hwc = &event->hw;
+ unsigned long flag = SBI_PMU_START_FLAG_SET_INIT_VALUE;
+
++#if defined(CONFIG_32BIT)
+ ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, hwc->idx,
+ 1, flag, ival, ival >> 32, 0);
++#else
++ ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, hwc->idx,
++ 1, flag, ival, 0, 0);
++#endif
+ if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED))
+ pr_err("Starting counter idx %d failed with error %d\n",
+ hwc->idx, sbi_err_map_linux_errno(ret.error));
+@@ -525,8 +535,14 @@ static inline void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
+ hwc = &event->hw;
+ max_period = riscv_pmu_ctr_get_width_mask(event);
+ init_val = local64_read(&hwc->prev_count) & max_period;
++#if defined(CONFIG_32BIT)
++ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
++ flag, init_val, init_val >> 32, 0);
++#else
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
+ flag, init_val, 0, 0);
++#endif
++ perf_event_update_userpage(event);
+ }
+ ctr_ovf_mask = ctr_ovf_mask >> 1;
+ idx++;
+@@ -666,12 +682,15 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde
+ child = of_get_compatible_child(cpu, "riscv,cpu-intc");
+ if (!child) {
+ pr_err("Failed to find INTC node\n");
++ of_node_put(cpu);
+ return -ENODEV;
+ }
+ domain = irq_find_host(child);
+ of_node_put(child);
+- if (domain)
++ if (domain) {
++ of_node_put(cpu);
+ break;
++ }
+ }
+ if (!domain) {
+ pr_err("Failed to find INTC IRQ root domain\n");
+diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.h b/drivers/phy/qualcomm/phy-qcom-qmp.h
+index 06b2556ed93a5..b9a91520439cb 100644
+--- a/drivers/phy/qualcomm/phy-qcom-qmp.h
++++ b/drivers/phy/qualcomm/phy-qcom-qmp.h
+@@ -1116,7 +1116,8 @@
+ #define QSERDES_V5_COM_CORE_CLK_EN 0x174
+ #define QSERDES_V5_COM_CMN_CONFIG 0x17c
+ #define QSERDES_V5_COM_CMN_MISC1 0x19c
+-#define QSERDES_V5_COM_CMN_MODE 0x1a4
++#define QSERDES_V5_COM_CMN_MODE 0x1a0
++#define QSERDES_V5_COM_CMN_MODE_CONTD 0x1a4
+ #define QSERDES_V5_COM_VCO_DC_LEVEL_CTRL 0x1a8
+ #define QSERDES_V5_COM_BIN_VCOCAL_CMP_CODE1_MODE0 0x1ac
+ #define QSERDES_V5_COM_BIN_VCOCAL_CMP_CODE2_MODE0 0x1b0
+diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
+index cba5c32cbaeed..0c6548ac9d8a9 100644
+--- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
++++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
+@@ -942,7 +942,9 @@ static irqreturn_t rockchip_usb2phy_irq(int irq, void *data)
+
+ switch (rport->port_id) {
+ case USB2PHY_PORT_OTG:
+- ret |= rockchip_usb2phy_otg_mux_irq(irq, rport);
++ if (rport->mode != USB_DR_MODE_HOST &&
++ rport->mode != USB_DR_MODE_UNKNOWN)
++ ret |= rockchip_usb2phy_otg_mux_irq(irq, rport);
+ break;
+ case USB2PHY_PORT_HOST:
+ ret |= rockchip_usb2phy_linestate_irq(irq, rport);
+diff --git a/drivers/phy/samsung/phy-exynosautov9-ufs.c b/drivers/phy/samsung/phy-exynosautov9-ufs.c
+index 36398a15c2db7..d043dfdb598a2 100644
+--- a/drivers/phy/samsung/phy-exynosautov9-ufs.c
++++ b/drivers/phy/samsung/phy-exynosautov9-ufs.c
+@@ -31,22 +31,22 @@ static const struct samsung_ufs_phy_cfg exynosautov9_pre_init_cfg[] = {
+ PHY_COMN_REG_CFG(0x023, 0xc0, PWR_MODE_ANY),
+ PHY_COMN_REG_CFG(0x023, 0x00, PWR_MODE_ANY),
+
+- PHY_TRSV_REG_CFG(0x042, 0x5d, PWR_MODE_ANY),
+- PHY_TRSV_REG_CFG(0x043, 0x80, PWR_MODE_ANY),
++ PHY_TRSV_REG_CFG_AUTOV9(0x042, 0x5d, PWR_MODE_ANY),
++ PHY_TRSV_REG_CFG_AUTOV9(0x043, 0x80, PWR_MODE_ANY),
+
+ END_UFS_PHY_CFG,
+ };
+
+ /* Calibration for HS mode series A/B */
+ static const struct samsung_ufs_phy_cfg exynosautov9_pre_pwr_hs_cfg[] = {
+- PHY_TRSV_REG_CFG(0x032, 0xbc, PWR_MODE_HS_ANY),
+- PHY_TRSV_REG_CFG(0x03c, 0x7f, PWR_MODE_HS_ANY),
+- PHY_TRSV_REG_CFG(0x048, 0xc0, PWR_MODE_HS_ANY),
++ PHY_TRSV_REG_CFG_AUTOV9(0x032, 0xbc, PWR_MODE_HS_ANY),
++ PHY_TRSV_REG_CFG_AUTOV9(0x03c, 0x7f, PWR_MODE_HS_ANY),
++ PHY_TRSV_REG_CFG_AUTOV9(0x048, 0xc0, PWR_MODE_HS_ANY),
+
+- PHY_TRSV_REG_CFG(0x04a, 0x00, PWR_MODE_HS_G3_SER_B),
+- PHY_TRSV_REG_CFG(0x04b, 0x10, PWR_MODE_HS_G1_SER_B |
+- PWR_MODE_HS_G3_SER_B),
+- PHY_TRSV_REG_CFG(0x04d, 0x63, PWR_MODE_HS_G3_SER_B),
++ PHY_TRSV_REG_CFG_AUTOV9(0x04a, 0x00, PWR_MODE_HS_G3_SER_B),
++ PHY_TRSV_REG_CFG_AUTOV9(0x04b, 0x10, PWR_MODE_HS_G1_SER_B |
++ PWR_MODE_HS_G3_SER_B),
++ PHY_TRSV_REG_CFG_AUTOV9(0x04d, 0x63, PWR_MODE_HS_G3_SER_B),
+
+ END_UFS_PHY_CFG,
+ };
+diff --git a/drivers/phy/st/phy-stm32-usbphyc.c b/drivers/phy/st/phy-stm32-usbphyc.c
+index 007a23c78d562..a98c911cc37ae 100644
+--- a/drivers/phy/st/phy-stm32-usbphyc.c
++++ b/drivers/phy/st/phy-stm32-usbphyc.c
+@@ -358,7 +358,9 @@ static int stm32_usbphyc_phy_init(struct phy *phy)
+ return 0;
+
+ pll_disable:
+- return stm32_usbphyc_pll_disable(usbphyc);
++ stm32_usbphyc_pll_disable(usbphyc);
++
++ return ret;
+ }
+
+ static int stm32_usbphyc_phy_exit(struct phy *phy)
+diff --git a/drivers/phy/ti/phy-tusb1210.c b/drivers/phy/ti/phy-tusb1210.c
+index c3ab4b69ea680..669c13d6e402f 100644
+--- a/drivers/phy/ti/phy-tusb1210.c
++++ b/drivers/phy/ti/phy-tusb1210.c
+@@ -105,8 +105,9 @@ static int tusb1210_power_on(struct phy *phy)
+ msleep(TUSB1210_RESET_TIME_MS);
+
+ /* Restore the optional eye diagram optimization value */
+- return tusb1210_ulpi_write(tusb, TUSB1210_VENDOR_SPECIFIC2,
+- tusb->vendor_specific2);
++ tusb1210_ulpi_write(tusb, TUSB1210_VENDOR_SPECIFIC2, tusb->vendor_specific2);
++
++ return 0;
+ }
+
+ static int tusb1210_power_off(struct phy *phy)
+diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig
+index f52960d2dfbe8..bff144c97e66e 100644
+--- a/drivers/pinctrl/Kconfig
++++ b/drivers/pinctrl/Kconfig
+@@ -32,7 +32,7 @@ config DEBUG_PINCTRL
+ Say Y here to add some extra checks and diagnostics to PINCTRL calls.
+
+ config PINCTRL_AMD
+- tristate "AMD GPIO pin control"
++ bool "AMD GPIO pin control"
+ depends on HAS_IOMEM
+ depends on ACPI || COMPILE_TEST
+ select GPIOLIB
+diff --git a/drivers/platform/chrome/cros_ec.c b/drivers/platform/chrome/cros_ec.c
+index a5cc8f24299eb..4168d7f5f273f 100644
+--- a/drivers/platform/chrome/cros_ec.c
++++ b/drivers/platform/chrome/cros_ec.c
+@@ -135,16 +135,16 @@ static int cros_ec_sleep_event(struct cros_ec_device *ec_dev, u8 sleep_event)
+ buf.msg.command = EC_CMD_HOST_SLEEP_EVENT;
+
+ ret = cros_ec_cmd_xfer_status(ec_dev, &buf.msg);
+-
+- /* For now, report failure to transition to S0ix with a warning. */
++ /* Report failure to transition to system wide suspend with a warning. */
+ if (ret >= 0 && ec_dev->host_sleep_v1 &&
+- (sleep_event == HOST_SLEEP_EVENT_S0IX_RESUME)) {
++ (sleep_event == HOST_SLEEP_EVENT_S0IX_RESUME ||
++ sleep_event == HOST_SLEEP_EVENT_S3_RESUME)) {
+ ec_dev->last_resume_result =
+ buf.u.resp1.resume_response.sleep_transitions;
+
+ WARN_ONCE(buf.u.resp1.resume_response.sleep_transitions &
+ EC_HOST_RESUME_SLEEP_TIMEOUT,
+- "EC detected sleep transition timeout. Total slp_s0 transitions: %d",
++ "EC detected sleep transition timeout. Total sleep transitions: %d",
+ buf.u.resp1.resume_response.sleep_transitions &
+ EC_HOST_RESUME_SLEEP_TRANSITIONS_MASK);
+ }
+diff --git a/drivers/platform/mellanox/mlxreg-lc.c b/drivers/platform/mellanox/mlxreg-lc.c
+index c897a2f158404..55834ccb4ac7c 100644
+--- a/drivers/platform/mellanox/mlxreg-lc.c
++++ b/drivers/platform/mellanox/mlxreg-lc.c
+@@ -716,8 +716,12 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
+ switch (regval) {
+ case MLXREG_LC_SN4800_C16:
+ err = mlxreg_lc_sn4800_c16_config_init(mlxreg_lc, regmap, data);
+- if (err)
++ if (err) {
++ dev_err(dev, "Failed to config client %s at bus %d at addr 0x%02x\n",
++ data->hpdev.brdinfo->type, data->hpdev.nr,
++ data->hpdev.brdinfo->addr);
+ return err;
++ }
+ break;
+ default:
+ return -ENODEV;
+@@ -730,8 +734,11 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
+ mlxreg_lc->mux = platform_device_register_resndata(dev, "i2c-mux-mlxcpld", data->hpdev.nr,
+ NULL, 0, mlxreg_lc->mux_data,
+ sizeof(*mlxreg_lc->mux_data));
+- if (IS_ERR(mlxreg_lc->mux))
++ if (IS_ERR(mlxreg_lc->mux)) {
++ dev_err(dev, "Failed to create mux infra for client %s at bus %d at addr 0x%02x\n",
++ data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
+ return PTR_ERR(mlxreg_lc->mux);
++ }
+
+ /* Register IO access driver. */
+ if (mlxreg_lc->io_data) {
+@@ -740,6 +747,9 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
+ platform_device_register_resndata(dev, "mlxreg-io", data->hpdev.nr, NULL, 0,
+ mlxreg_lc->io_data, sizeof(*mlxreg_lc->io_data));
+ if (IS_ERR(mlxreg_lc->io_regs)) {
++ dev_err(dev, "Failed to create regio for client %s at bus %d at addr 0x%02x\n",
++ data->hpdev.brdinfo->type, data->hpdev.nr,
++ data->hpdev.brdinfo->addr);
+ err = PTR_ERR(mlxreg_lc->io_regs);
+ goto fail_register_io;
+ }
+@@ -753,6 +763,9 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
+ mlxreg_lc->led_data,
+ sizeof(*mlxreg_lc->led_data));
+ if (IS_ERR(mlxreg_lc->led)) {
++ dev_err(dev, "Failed to create LED objects for client %s at bus %d at addr 0x%02x\n",
++ data->hpdev.brdinfo->type, data->hpdev.nr,
++ data->hpdev.brdinfo->addr);
+ err = PTR_ERR(mlxreg_lc->led);
+ goto fail_register_led;
+ }
+@@ -809,7 +822,8 @@ static int mlxreg_lc_probe(struct platform_device *pdev)
+ if (!data->hpdev.adapter) {
+ dev_err(&pdev->dev, "Failed to get adapter for bus %d\n",
+ data->hpdev.nr);
+- return -EFAULT;
++ err = -EFAULT;
++ goto i2c_get_adapter_fail;
+ }
+
+ /* Create device at the top of line card I2C tree.*/
+@@ -818,32 +832,40 @@ static int mlxreg_lc_probe(struct platform_device *pdev)
+ if (IS_ERR(data->hpdev.client)) {
+ dev_err(&pdev->dev, "Failed to create client %s at bus %d at addr 0x%02x\n",
+ data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
+-
+- i2c_put_adapter(data->hpdev.adapter);
+- data->hpdev.adapter = NULL;
+- return PTR_ERR(data->hpdev.client);
++ err = PTR_ERR(data->hpdev.client);
++ goto i2c_new_device_fail;
+ }
+
+ regmap = devm_regmap_init_i2c(data->hpdev.client,
+ &mlxreg_lc_regmap_conf);
+ if (IS_ERR(regmap)) {
++ dev_err(&pdev->dev, "Failed to create regmap for client %s at bus %d at addr 0x%02x\n",
++ data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
+ err = PTR_ERR(regmap);
+- goto mlxreg_lc_probe_fail;
++ goto devm_regmap_init_i2c_fail;
+ }
+
+ /* Set default registers. */
+ for (i = 0; i < mlxreg_lc_regmap_conf.num_reg_defaults; i++) {
+ err = regmap_write(regmap, mlxreg_lc_regmap_default[i].reg,
+ mlxreg_lc_regmap_default[i].def);
+- if (err)
+- goto mlxreg_lc_probe_fail;
++ if (err) {
++ dev_err(&pdev->dev, "Failed to set default regmap %d for client %s at bus %d at addr 0x%02x\n",
++ i, data->hpdev.brdinfo->type, data->hpdev.nr,
++ data->hpdev.brdinfo->addr);
++ goto regmap_write_fail;
++ }
+ }
+
+ /* Sync registers with hardware. */
+ regcache_mark_dirty(regmap);
+ err = regcache_sync(regmap);
+- if (err)
+- goto mlxreg_lc_probe_fail;
++ if (err) {
++ dev_err(&pdev->dev, "Failed to sync regmap for client %s at bus %d at addr 0x%02x\n",
++ data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
++ err = PTR_ERR(regmap);
++ goto regcache_sync_fail;
++ }
+
+ par_pdata = data->hpdev.brdinfo->platform_data;
+ mlxreg_lc->par_regmap = par_pdata->regmap;
+@@ -854,12 +876,27 @@ static int mlxreg_lc_probe(struct platform_device *pdev)
+ /* Configure line card. */
+ err = mlxreg_lc_config_init(mlxreg_lc, regmap, data);
+ if (err)
+- goto mlxreg_lc_probe_fail;
++ goto mlxreg_lc_config_init_fail;
+
+ return err;
+
+-mlxreg_lc_probe_fail:
++mlxreg_lc_config_init_fail:
++regcache_sync_fail:
++regmap_write_fail:
++devm_regmap_init_i2c_fail:
++ if (data->hpdev.client) {
++ i2c_unregister_device(data->hpdev.client);
++ data->hpdev.client = NULL;
++ }
++i2c_new_device_fail:
+ i2c_put_adapter(data->hpdev.adapter);
++ data->hpdev.adapter = NULL;
++i2c_get_adapter_fail:
++ /* Clear event notification callback and handle. */
++ if (data->notifier) {
++ data->notifier->user_handler = NULL;
++ data->notifier->handle = NULL;
++ }
+ return err;
+ }
+
+@@ -868,11 +905,18 @@ static int mlxreg_lc_remove(struct platform_device *pdev)
+ struct mlxreg_core_data *data = dev_get_platdata(&pdev->dev);
+ struct mlxreg_lc *mlxreg_lc = platform_get_drvdata(pdev);
+
+- /* Clear event notification callback. */
+- if (data->notifier) {
+- data->notifier->user_handler = NULL;
+- data->notifier->handle = NULL;
+- }
++ /*
++ * Probing and removing are invoked by hotplug events raised upon line card insertion and
++ * removing. If probing procedure fails all data is cleared. However, hotplug event still
++ * will be raised on line card removing and activate removing procedure. In this case there
++ * is nothing to remove.
++ */
++ if (!data->notifier || !data->notifier->handle)
++ return 0;
++
++ /* Clear event notification callback and handle. */
++ data->notifier->user_handler = NULL;
++ data->notifier->handle = NULL;
+
+ /* Destroy static I2C device feeding by main power. */
+ mlxreg_lc_destroy_static_devices(mlxreg_lc, mlxreg_lc->main_devs,
+diff --git a/drivers/platform/olpc/olpc-ec.c b/drivers/platform/olpc/olpc-ec.c
+index 4ff5c3a12991c..921520475ff68 100644
+--- a/drivers/platform/olpc/olpc-ec.c
++++ b/drivers/platform/olpc/olpc-ec.c
+@@ -264,7 +264,7 @@ static ssize_t ec_dbgfs_cmd_write(struct file *file, const char __user *buf,
+ int i, m;
+ unsigned char ec_cmd[EC_MAX_CMD_ARGS];
+ unsigned int ec_cmd_int[EC_MAX_CMD_ARGS];
+- char cmdbuf[64];
++ char cmdbuf[64] = "";
+ int ec_cmd_bytes;
+
+ mutex_lock(&ec_dbgfs_lock);
+diff --git a/drivers/platform/x86/pmc_atom.c b/drivers/platform/x86/pmc_atom.c
+index a40fae6edc841..f24ab24f2927e 100644
+--- a/drivers/platform/x86/pmc_atom.c
++++ b/drivers/platform/x86/pmc_atom.c
+@@ -402,21 +402,16 @@ static const struct dmi_system_id critclk_systems[] = {
+ },
+ },
+ {
+- /* pmc_plt_clk0 - 3 are used for the 4 ethernet controllers */
+- .ident = "Lex 3I380D",
++ /*
++ * Lex System / Lex Computech Co. makes a lot of Bay Trail
++ * based embedded boards which often come with multiple
++ * ethernet controllers using multiple pmc_plt_clks. See:
++ * https://www.lex.com.tw/products/embedded-ipc-board/
++ */
++ .ident = "Lex BayTrail",
+ .callback = dmi_callback,
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Lex BayTrail"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "3I380D"),
+- },
+- },
+- {
+- /* pmc_plt_clk* - are used for ethernet controllers */
+- .ident = "Lex 2I385SW",
+- .callback = dmi_callback,
+- .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "Lex BayTrail"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "2I385SW"),
+ },
+ },
+ {
+diff --git a/drivers/pwm/pwm-lpc18xx-sct.c b/drivers/pwm/pwm-lpc18xx-sct.c
+index b909096dba2fd..43b5509dde513 100644
+--- a/drivers/pwm/pwm-lpc18xx-sct.c
++++ b/drivers/pwm/pwm-lpc18xx-sct.c
+@@ -98,7 +98,7 @@ struct lpc18xx_pwm_chip {
+ unsigned long clk_rate;
+ unsigned int period_ns;
+ unsigned int min_period_ns;
+- unsigned int max_period_ns;
++ u64 max_period_ns;
+ unsigned int period_event;
+ unsigned long event_map;
+ struct mutex res_lock;
+@@ -145,40 +145,48 @@ static void lpc18xx_pwm_set_conflict_res(struct lpc18xx_pwm_chip *lpc18xx_pwm,
+ mutex_unlock(&lpc18xx_pwm->res_lock);
+ }
+
+-static void lpc18xx_pwm_config_period(struct pwm_chip *chip, int period_ns)
++static void lpc18xx_pwm_config_period(struct pwm_chip *chip, u64 period_ns)
+ {
+ struct lpc18xx_pwm_chip *lpc18xx_pwm = to_lpc18xx_pwm_chip(chip);
+- u64 val;
++ u32 val;
+
+- val = (u64)period_ns * lpc18xx_pwm->clk_rate;
+- do_div(val, NSEC_PER_SEC);
++ /*
++ * With clk_rate < NSEC_PER_SEC this cannot overflow.
++ * With period_ns < max_period_ns this also fits into an u32.
++ * As period_ns >= min_period_ns = DIV_ROUND_UP(NSEC_PER_SEC, lpc18xx_pwm->clk_rate);
++ * we have val >= 1.
++ */
++ val = mul_u64_u64_div_u64(period_ns, lpc18xx_pwm->clk_rate, NSEC_PER_SEC);
+
+ lpc18xx_pwm_writel(lpc18xx_pwm,
+ LPC18XX_PWM_MATCH(lpc18xx_pwm->period_event),
+- (u32)val - 1);
++ val - 1);
+
+ lpc18xx_pwm_writel(lpc18xx_pwm,
+ LPC18XX_PWM_MATCHREL(lpc18xx_pwm->period_event),
+- (u32)val - 1);
++ val - 1);
+ }
+
+ static void lpc18xx_pwm_config_duty(struct pwm_chip *chip,
+- struct pwm_device *pwm, int duty_ns)
++ struct pwm_device *pwm, u64 duty_ns)
+ {
+ struct lpc18xx_pwm_chip *lpc18xx_pwm = to_lpc18xx_pwm_chip(chip);
+ struct lpc18xx_pwm_data *lpc18xx_data = &lpc18xx_pwm->channeldata[pwm->hwpwm];
+- u64 val;
++ u32 val;
+
+- val = (u64)duty_ns * lpc18xx_pwm->clk_rate;
+- do_div(val, NSEC_PER_SEC);
++ /*
++ * With clk_rate < NSEC_PER_SEC this cannot overflow.
++ * With duty_ns <= period_ns < max_period_ns this also fits into an u32.
++ */
++ val = mul_u64_u64_div_u64(duty_ns, lpc18xx_pwm->clk_rate, NSEC_PER_SEC);
+
+ lpc18xx_pwm_writel(lpc18xx_pwm,
+ LPC18XX_PWM_MATCH(lpc18xx_data->duty_event),
+- (u32)val);
++ val);
+
+ lpc18xx_pwm_writel(lpc18xx_pwm,
+ LPC18XX_PWM_MATCHREL(lpc18xx_data->duty_event),
+- (u32)val);
++ val);
+ }
+
+ static int lpc18xx_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
+@@ -360,12 +368,27 @@ static int lpc18xx_pwm_probe(struct platform_device *pdev)
+ goto disable_pwmclk;
+ }
+
++ /*
++ * If clkrate is too fast, the calculations in .apply() might overflow.
++ */
++ if (lpc18xx_pwm->clk_rate > NSEC_PER_SEC) {
++ ret = dev_err_probe(&pdev->dev, -EINVAL, "pwm clock to fast\n");
++ goto disable_pwmclk;
++ }
++
++ /*
++ * If clkrate is too fast, the calculations in .apply() might overflow.
++ */
++ if (lpc18xx_pwm->clk_rate > NSEC_PER_SEC) {
++ ret = dev_err_probe(&pdev->dev, -EINVAL, "pwm clock to fast\n");
++ goto disable_pwmclk;
++ }
++
+ mutex_init(&lpc18xx_pwm->res_lock);
+ mutex_init(&lpc18xx_pwm->period_lock);
+
+- val = (u64)NSEC_PER_SEC * LPC18XX_PWM_TIMER_MAX;
+- do_div(val, lpc18xx_pwm->clk_rate);
+- lpc18xx_pwm->max_period_ns = val;
++ lpc18xx_pwm->max_period_ns =
++ mul_u64_u64_div_u64(NSEC_PER_SEC, LPC18XX_PWM_TIMER_MAX, lpc18xx_pwm->clk_rate);
+
+ lpc18xx_pwm->min_period_ns = DIV_ROUND_UP(NSEC_PER_SEC,
+ lpc18xx_pwm->clk_rate);
+diff --git a/drivers/pwm/pwm-sifive.c b/drivers/pwm/pwm-sifive.c
+index 253c4a17d2553..58347fcd48125 100644
+--- a/drivers/pwm/pwm-sifive.c
++++ b/drivers/pwm/pwm-sifive.c
+@@ -23,7 +23,7 @@
+ #define PWM_SIFIVE_PWMCFG 0x0
+ #define PWM_SIFIVE_PWMCOUNT 0x8
+ #define PWM_SIFIVE_PWMS 0x10
+-#define PWM_SIFIVE_PWMCMP0 0x20
++#define PWM_SIFIVE_PWMCMP(i) (0x20 + 4 * (i))
+
+ /* PWMCFG fields */
+ #define PWM_SIFIVE_PWMCFG_SCALE GENMASK(3, 0)
+@@ -36,8 +36,6 @@
+ #define PWM_SIFIVE_PWMCFG_GANG BIT(24)
+ #define PWM_SIFIVE_PWMCFG_IP BIT(28)
+
+-/* PWM_SIFIVE_SIZE_PWMCMP is used to calculate offset for pwmcmpX registers */
+-#define PWM_SIFIVE_SIZE_PWMCMP 4
+ #define PWM_SIFIVE_CMPWIDTH 16
+ #define PWM_SIFIVE_DEFAULT_PERIOD 10000000
+
+@@ -112,8 +110,7 @@ static void pwm_sifive_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
+ struct pwm_sifive_ddata *ddata = pwm_sifive_chip_to_ddata(chip);
+ u32 duty, val;
+
+- duty = readl(ddata->regs + PWM_SIFIVE_PWMCMP0 +
+- pwm->hwpwm * PWM_SIFIVE_SIZE_PWMCMP);
++ duty = readl(ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
+
+ state->enabled = duty > 0;
+
+@@ -194,8 +191,7 @@ static int pwm_sifive_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+ pwm_sifive_update_clock(ddata, clk_get_rate(ddata->clk));
+ }
+
+- writel(frac, ddata->regs + PWM_SIFIVE_PWMCMP0 +
+- pwm->hwpwm * PWM_SIFIVE_SIZE_PWMCMP);
++ writel(frac, ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
+
+ if (state->enabled != enabled)
+ pwm_sifive_enable(chip, state->enabled);
+@@ -233,6 +229,8 @@ static int pwm_sifive_probe(struct platform_device *pdev)
+ struct pwm_sifive_ddata *ddata;
+ struct pwm_chip *chip;
+ int ret;
++ u32 val;
++ unsigned int enabled_pwms = 0, enabled_clks = 1;
+
+ ddata = devm_kzalloc(dev, sizeof(*ddata), GFP_KERNEL);
+ if (!ddata)
+@@ -259,6 +257,33 @@ static int pwm_sifive_probe(struct platform_device *pdev)
+ return ret;
+ }
+
++ val = readl(ddata->regs + PWM_SIFIVE_PWMCFG);
++ if (val & PWM_SIFIVE_PWMCFG_EN_ALWAYS) {
++ unsigned int i;
++
++ for (i = 0; i < chip->npwm; ++i) {
++ val = readl(ddata->regs + PWM_SIFIVE_PWMCMP(i));
++ if (val > 0)
++ ++enabled_pwms;
++ }
++ }
++
++ /* The clk should be on once for each running PWM. */
++ if (enabled_pwms) {
++ while (enabled_clks < enabled_pwms) {
++ /* This is not expected to fail as the clk is already on */
++ ret = clk_enable(ddata->clk);
++ if (unlikely(ret)) {
++ dev_err_probe(dev, ret, "Failed to enable clk\n");
++ goto disable_clk;
++ }
++ ++enabled_clks;
++ }
++ } else {
++ clk_disable(ddata->clk);
++ enabled_clks = 0;
++ }
++
+ /* Watch for changes to underlying clock frequency */
+ ddata->notifier.notifier_call = pwm_sifive_clock_notifier;
+ ret = clk_notifier_register(ddata->clk, &ddata->notifier);
+@@ -281,7 +306,11 @@ static int pwm_sifive_probe(struct platform_device *pdev)
+ unregister_clk:
+ clk_notifier_unregister(ddata->clk, &ddata->notifier);
+ disable_clk:
+- clk_disable_unprepare(ddata->clk);
++ while (enabled_clks) {
++ clk_disable(ddata->clk);
++ --enabled_clks;
++ }
++ clk_unprepare(ddata->clk);
+
+ return ret;
+ }
+@@ -289,23 +318,19 @@ disable_clk:
+ static int pwm_sifive_remove(struct platform_device *dev)
+ {
+ struct pwm_sifive_ddata *ddata = platform_get_drvdata(dev);
+- bool is_enabled = false;
+ struct pwm_device *pwm;
+ int ch;
+
++ pwmchip_remove(&ddata->chip);
++ clk_notifier_unregister(ddata->clk, &ddata->notifier);
++
+ for (ch = 0; ch < ddata->chip.npwm; ch++) {
+ pwm = &ddata->chip.pwms[ch];
+- if (pwm->state.enabled) {
+- is_enabled = true;
+- break;
+- }
++ if (pwm->state.enabled)
++ clk_disable(ddata->clk);
+ }
+- if (is_enabled)
+- clk_disable(ddata->clk);
+
+- clk_disable_unprepare(ddata->clk);
+- pwmchip_remove(&ddata->chip);
+- clk_notifier_unregister(ddata->clk, &ddata->notifier);
++ clk_unprepare(ddata->clk);
+
+ return 0;
+ }
+diff --git a/drivers/regulator/of_regulator.c b/drivers/regulator/of_regulator.c
+index f54d4f176882a..e12b681c72e5e 100644
+--- a/drivers/regulator/of_regulator.c
++++ b/drivers/regulator/of_regulator.c
+@@ -264,8 +264,12 @@ static int of_get_regulation_constraints(struct device *dev,
+ }
+
+ suspend_np = of_get_child_by_name(np, regulator_states[i]);
+- if (!suspend_np || !suspend_state)
++ if (!suspend_np)
+ continue;
++ if (!suspend_state) {
++ of_node_put(suspend_np);
++ continue;
++ }
+
+ if (!of_property_read_u32(suspend_np, "regulator-mode",
+ &pval)) {
+diff --git a/drivers/regulator/qcom_smd-regulator.c b/drivers/regulator/qcom_smd-regulator.c
+index 7dff94a2eb7e9..0af8286e1b106 100644
+--- a/drivers/regulator/qcom_smd-regulator.c
++++ b/drivers/regulator/qcom_smd-regulator.c
+@@ -357,10 +357,10 @@ static const struct regulator_desc pm8941_switch = {
+
+ static const struct regulator_desc pm8916_pldo = {
+ .linear_ranges = (struct linear_range[]) {
+- REGULATOR_LINEAR_RANGE(750000, 0, 208, 12500),
++ REGULATOR_LINEAR_RANGE(1750000, 0, 127, 12500),
+ },
+ .n_linear_ranges = 1,
+- .n_voltages = 209,
++ .n_voltages = 128,
+ .ops = &rpm_smps_ldo_ops,
+ };
+
+diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
+index 91eb037089ef8..f17bb41a65510 100644
+--- a/drivers/remoteproc/imx_rproc.c
++++ b/drivers/remoteproc/imx_rproc.c
+@@ -562,16 +562,17 @@ static int imx_rproc_addr_init(struct imx_rproc *priv,
+
+ node = of_parse_phandle(np, "memory-region", a);
+ /* Not map vdevbuffer, vdevring region */
+- if (!strncmp(node->name, "vdev", strlen("vdev")))
++ if (!strncmp(node->name, "vdev", strlen("vdev"))) {
++ of_node_put(node);
+ continue;
++ }
+ err = of_address_to_resource(node, 0, &res);
++ of_node_put(node);
+ if (err) {
+ dev_err(dev, "unable to resolve memory region\n");
+ return err;
+ }
+
+- of_node_put(node);
+-
+ if (b >= IMX_RPROC_MEM_MAX)
+ break;
+
+diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
+index 1ae47cc153e55..e7765bbe17a0e 100644
+--- a/drivers/remoteproc/qcom_q6v5_pas.c
++++ b/drivers/remoteproc/qcom_q6v5_pas.c
+@@ -87,6 +87,9 @@ static void adsp_minidump(struct rproc *rproc)
+ {
+ struct qcom_adsp *adsp = rproc->priv;
+
++ if (rproc->dump_conf == RPROC_COREDUMP_DISABLED)
++ return;
++
+ qcom_minidump(rproc, adsp->minidump_id);
+ }
+
+diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
+index 9fca814928635..a9f04dd83ab68 100644
+--- a/drivers/remoteproc/qcom_sysmon.c
++++ b/drivers/remoteproc/qcom_sysmon.c
+@@ -41,6 +41,7 @@ struct qcom_sysmon {
+ struct completion comp;
+ struct completion ind_comp;
+ struct completion shutdown_comp;
++ struct completion ssctl_comp;
+ struct mutex lock;
+
+ bool ssr_ack;
+@@ -445,6 +446,8 @@ static int ssctl_new_server(struct qmi_handle *qmi, struct qmi_service *svc)
+
+ svc->priv = sysmon;
+
++ complete(&sysmon->ssctl_comp);
++
+ return 0;
+ }
+
+@@ -501,6 +504,7 @@ static int sysmon_start(struct rproc_subdev *subdev)
+ .ssr_event = SSCTL_SSR_EVENT_AFTER_POWERUP
+ };
+
++ reinit_completion(&sysmon->ssctl_comp);
+ mutex_lock(&sysmon->state_lock);
+ sysmon->state = SSCTL_SSR_EVENT_AFTER_POWERUP;
+ blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+@@ -545,6 +549,11 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
+ if (crashed)
+ return;
+
++ if (sysmon->ssctl_instance) {
++ if (!wait_for_completion_timeout(&sysmon->ssctl_comp, HZ / 2))
++ dev_err(sysmon->dev, "timeout waiting for ssctl service\n");
++ }
++
+ if (sysmon->ssctl_version)
+ sysmon->shutdown_acked = ssctl_request_shutdown(sysmon);
+ else if (sysmon->ept)
+@@ -631,6 +640,7 @@ struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
+ init_completion(&sysmon->comp);
+ init_completion(&sysmon->ind_comp);
+ init_completion(&sysmon->shutdown_comp);
++ init_completion(&sysmon->ssctl_comp);
+ mutex_init(&sysmon->lock);
+ mutex_init(&sysmon->state_lock);
+
+diff --git a/drivers/remoteproc/qcom_wcnss.c b/drivers/remoteproc/qcom_wcnss.c
+index 9a223d394087f..68f37296b1516 100644
+--- a/drivers/remoteproc/qcom_wcnss.c
++++ b/drivers/remoteproc/qcom_wcnss.c
+@@ -467,6 +467,7 @@ static int wcnss_request_irq(struct qcom_wcnss *wcnss,
+ irq_handler_t thread_fn)
+ {
+ int ret;
++ int irq_number;
+
+ ret = platform_get_irq_byname(pdev, name);
+ if (ret < 0 && optional) {
+@@ -477,14 +478,19 @@ static int wcnss_request_irq(struct qcom_wcnss *wcnss,
+ return ret;
+ }
+
++ irq_number = ret;
++
+ ret = devm_request_threaded_irq(&pdev->dev, ret,
+ NULL, thread_fn,
+ IRQF_TRIGGER_RISING | IRQF_ONESHOT,
+ "wcnss", wcnss);
+- if (ret)
++ if (ret) {
+ dev_err(&pdev->dev, "request %s IRQ failed\n", name);
++ return ret;
++ }
+
+- return ret;
++ /* Return the IRQ number if the IRQ was successfully acquired */
++ return irq_number;
+ }
+
+ static int wcnss_alloc_memory_region(struct qcom_wcnss *wcnss)
+diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
+index 4840ad906018e..0481926c69752 100644
+--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
++++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
+@@ -1655,6 +1655,7 @@ static int k3_r5_cluster_of_init(struct platform_device *pdev)
+ if (!cpdev) {
+ ret = -ENODEV;
+ dev_err(dev, "could not get R5 core platform device\n");
++ of_node_put(child);
+ goto fail;
+ }
+
+@@ -1663,6 +1664,7 @@ static int k3_r5_cluster_of_init(struct platform_device *pdev)
+ dev_err(dev, "k3_r5_core_of_init failed, ret = %d\n",
+ ret);
+ put_device(&cpdev->dev);
++ of_node_put(child);
+ goto fail;
+ }
+
+diff --git a/drivers/rpmsg/mtk_rpmsg.c b/drivers/rpmsg/mtk_rpmsg.c
+index 5b4404b8be4c7..d1213c33da204 100644
+--- a/drivers/rpmsg/mtk_rpmsg.c
++++ b/drivers/rpmsg/mtk_rpmsg.c
+@@ -234,7 +234,9 @@ static void mtk_register_device_work_function(struct work_struct *register_work)
+ if (info->registered)
+ continue;
+
++ mutex_unlock(&subdev->channels_lock);
+ ret = mtk_rpmsg_register_device(subdev, &info->info);
++ mutex_lock(&subdev->channels_lock);
+ if (ret) {
+ dev_err(&pdev->dev, "Can't create rpmsg_device\n");
+ continue;
+diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
+index 1957b27c4cf37..f7af53891ef92 100644
+--- a/drivers/rpmsg/qcom_smd.c
++++ b/drivers/rpmsg/qcom_smd.c
+@@ -1383,6 +1383,7 @@ static int qcom_smd_parse_edge(struct device *dev,
+ }
+
+ edge->ipc_regmap = syscon_node_to_regmap(syscon_np);
++ of_node_put(syscon_np);
+ if (IS_ERR(edge->ipc_regmap)) {
+ ret = PTR_ERR(edge->ipc_regmap);
+ goto put_node;
+diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
+index b6183d4f62a22..4f2189111494a 100644
+--- a/drivers/rpmsg/rpmsg_char.c
++++ b/drivers/rpmsg/rpmsg_char.c
+@@ -120,8 +120,11 @@ static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
+ struct rpmsg_device *rpdev = eptdev->rpdev;
+ struct device *dev = &eptdev->dev;
+
+- if (eptdev->ept)
++ mutex_lock(&eptdev->ept_lock);
++ if (eptdev->ept) {
++ mutex_unlock(&eptdev->ept_lock);
+ return -EBUSY;
++ }
+
+ get_device(dev);
+
+@@ -137,11 +140,13 @@ static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
+ if (!ept) {
+ dev_err(dev, "failed to open %s\n", eptdev->chinfo.name);
+ put_device(dev);
++ mutex_unlock(&eptdev->ept_lock);
+ return -EINVAL;
+ }
+
+ eptdev->ept = ept;
+ filp->private_data = eptdev;
++ mutex_unlock(&eptdev->ept_lock);
+
+ return 0;
+ }
+diff --git a/drivers/rtc/rtc-rx8025.c b/drivers/rtc/rtc-rx8025.c
+index 5bfdd34a72fff..753c79892aa67 100644
+--- a/drivers/rtc/rtc-rx8025.c
++++ b/drivers/rtc/rtc-rx8025.c
+@@ -55,6 +55,8 @@
+ #define RX8025_BIT_CTRL2_XST BIT(5)
+ #define RX8025_BIT_CTRL2_VDET BIT(6)
+
++#define RX8035_BIT_HOUR_1224 BIT(7)
++
+ /* Clock precision adjustment */
+ #define RX8025_ADJ_RESOLUTION 3050 /* in ppb */
+ #define RX8025_ADJ_DATA_MAX 62
+@@ -78,6 +80,7 @@ struct rx8025_data {
+ struct rtc_device *rtc;
+ enum rx_model model;
+ u8 ctrl1;
++ int is_24;
+ };
+
+ static s32 rx8025_read_reg(const struct i2c_client *client, u8 number)
+@@ -226,7 +229,7 @@ static int rx8025_get_time(struct device *dev, struct rtc_time *dt)
+
+ dt->tm_sec = bcd2bin(date[RX8025_REG_SEC] & 0x7f);
+ dt->tm_min = bcd2bin(date[RX8025_REG_MIN] & 0x7f);
+- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
++ if (rx8025->is_24)
+ dt->tm_hour = bcd2bin(date[RX8025_REG_HOUR] & 0x3f);
+ else
+ dt->tm_hour = bcd2bin(date[RX8025_REG_HOUR] & 0x1f) % 12
+@@ -254,7 +257,7 @@ static int rx8025_set_time(struct device *dev, struct rtc_time *dt)
+ */
+ date[RX8025_REG_SEC] = bin2bcd(dt->tm_sec);
+ date[RX8025_REG_MIN] = bin2bcd(dt->tm_min);
+- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
++ if (rx8025->is_24)
+ date[RX8025_REG_HOUR] = bin2bcd(dt->tm_hour);
+ else
+ date[RX8025_REG_HOUR] = (dt->tm_hour >= 12 ? 0x20 : 0)
+@@ -279,6 +282,7 @@ static int rx8025_init_client(struct i2c_client *client)
+ struct rx8025_data *rx8025 = i2c_get_clientdata(client);
+ u8 ctrl[2], ctrl2;
+ int need_clear = 0;
++ int hour_reg;
+ int err;
+
+ err = rx8025_read_regs(client, RX8025_REG_CTRL1, 2, ctrl);
+@@ -303,6 +307,16 @@ static int rx8025_init_client(struct i2c_client *client)
+
+ err = rx8025_write_reg(client, RX8025_REG_CTRL2, ctrl2);
+ }
++
++ if (rx8025->model == model_rx_8035) {
++ /* In RX-8035, 12/24 flag is in the hour register */
++ hour_reg = rx8025_read_reg(client, RX8025_REG_HOUR);
++ if (hour_reg < 0)
++ return hour_reg;
++ rx8025->is_24 = (hour_reg & RX8035_BIT_HOUR_1224);
++ } else {
++ rx8025->is_24 = (ctrl[1] & RX8025_BIT_CTRL1_1224);
++ }
+ out:
+ return err;
+ }
+@@ -329,7 +343,7 @@ static int rx8025_read_alarm(struct device *dev, struct rtc_wkalrm *t)
+ /* Hardware alarms precision is 1 minute! */
+ t->time.tm_sec = 0;
+ t->time.tm_min = bcd2bin(ald[0] & 0x7f);
+- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
++ if (rx8025->is_24)
+ t->time.tm_hour = bcd2bin(ald[1] & 0x3f);
+ else
+ t->time.tm_hour = bcd2bin(ald[1] & 0x1f) % 12
+@@ -350,7 +364,7 @@ static int rx8025_set_alarm(struct device *dev, struct rtc_wkalrm *t)
+ int err;
+
+ ald[0] = bin2bcd(t->time.tm_min);
+- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
++ if (rx8025->is_24)
+ ald[1] = bin2bcd(t->time.tm_hour);
+ else
+ ald[1] = (t->time.tm_hour >= 12 ? 0x20 : 0)
+diff --git a/drivers/s390/char/zcore.c b/drivers/s390/char/zcore.c
+index 516783ba950f8..92b32ce645b95 100644
+--- a/drivers/s390/char/zcore.c
++++ b/drivers/s390/char/zcore.c
+@@ -50,6 +50,7 @@ static struct dentry *zcore_reipl_file;
+ static struct dentry *zcore_hsa_file;
+ static struct ipl_parameter_block *zcore_ipl_block;
+
++static DEFINE_MUTEX(hsa_buf_mutex);
+ static char hsa_buf[PAGE_SIZE] __aligned(PAGE_SIZE);
+
+ /*
+@@ -66,19 +67,24 @@ int memcpy_hsa_user(void __user *dest, unsigned long src, size_t count)
+ if (!hsa_available)
+ return -ENODATA;
+
++ mutex_lock(&hsa_buf_mutex);
+ while (count) {
+ if (sclp_sdias_copy(hsa_buf, src / PAGE_SIZE + 2, 1)) {
+ TRACE("sclp_sdias_copy() failed\n");
++ mutex_unlock(&hsa_buf_mutex);
+ return -EIO;
+ }
+ offset = src % PAGE_SIZE;
+ bytes = min(PAGE_SIZE - offset, count);
+- if (copy_to_user(dest, hsa_buf + offset, bytes))
++ if (copy_to_user(dest, hsa_buf + offset, bytes)) {
++ mutex_unlock(&hsa_buf_mutex);
+ return -EFAULT;
++ }
+ src += bytes;
+ dest += bytes;
+ count -= bytes;
+ }
++ mutex_unlock(&hsa_buf_mutex);
+ return 0;
+ }
+
+@@ -96,9 +102,11 @@ int memcpy_hsa_kernel(void *dest, unsigned long src, size_t count)
+ if (!hsa_available)
+ return -ENODATA;
+
++ mutex_lock(&hsa_buf_mutex);
+ while (count) {
+ if (sclp_sdias_copy(hsa_buf, src / PAGE_SIZE + 2, 1)) {
+ TRACE("sclp_sdias_copy() failed\n");
++ mutex_unlock(&hsa_buf_mutex);
+ return -EIO;
+ }
+ offset = src % PAGE_SIZE;
+@@ -108,6 +116,7 @@ int memcpy_hsa_kernel(void *dest, unsigned long src, size_t count)
+ dest += bytes;
+ count -= bytes;
+ }
++ mutex_unlock(&hsa_buf_mutex);
+ return 0;
+ }
+
+diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
+index ee182cfb467d1..e76a4a3cf1ce6 100644
+--- a/drivers/s390/cio/vfio_ccw_drv.c
++++ b/drivers/s390/cio/vfio_ccw_drv.c
+@@ -107,9 +107,10 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
+ /*
+ * Reset to IDLE only if processing of a channel program
+ * has finished. Do not overwrite a possible processing
+- * state if the final interrupt was for HSCH or CSCH.
++ * state if the interrupt was unsolicited, or if the final
++ * interrupt was for HSCH or CSCH.
+ */
+- if (private->mdev && cp_is_finished)
++ if (cp_is_finished)
+ private->state = VFIO_CCW_STATE_IDLE;
+
+ if (private->io_trigger)
+@@ -301,19 +302,11 @@ static int vfio_ccw_sch_event(struct subchannel *sch, int process)
+ if (work_pending(&sch->todo_work))
+ goto out_unlock;
+
+- if (cio_update_schib(sch)) {
+- vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_NOT_OPER);
+- rc = 0;
+- goto out_unlock;
+- }
+-
+- private = dev_get_drvdata(&sch->dev);
+- if (private->state == VFIO_CCW_STATE_NOT_OPER) {
+- private->state = private->mdev ? VFIO_CCW_STATE_IDLE :
+- VFIO_CCW_STATE_STANDBY;
+- }
+ rc = 0;
+
++ if (cio_update_schib(sch))
++ vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_NOT_OPER);
++
+ out_unlock:
+ spin_unlock_irqrestore(sch->lock, flags);
+
+diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
+index d8589afac272f..8d76f5b26e2b6 100644
+--- a/drivers/s390/cio/vfio_ccw_ops.c
++++ b/drivers/s390/cio/vfio_ccw_ops.c
+@@ -146,7 +146,7 @@ err_atomic:
+ vfio_uninit_group_dev(&private->vdev);
+ atomic_inc(&private->avail);
+ private->mdev = NULL;
+- private->state = VFIO_CCW_STATE_IDLE;
++ private->state = VFIO_CCW_STATE_STANDBY;
+ return ret;
+ }
+
+diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
+index 511bf8e0a436c..b61acbb09be3b 100644
+--- a/drivers/s390/scsi/zfcp_fc.c
++++ b/drivers/s390/scsi/zfcp_fc.c
+@@ -145,27 +145,33 @@ void zfcp_fc_enqueue_event(struct zfcp_adapter *adapter,
+
+ static int zfcp_fc_wka_port_get(struct zfcp_fc_wka_port *wka_port)
+ {
++ int ret = -EIO;
++
+ if (mutex_lock_interruptible(&wka_port->mutex))
+ return -ERESTARTSYS;
+
+ if (wka_port->status == ZFCP_FC_WKA_PORT_OFFLINE ||
+ wka_port->status == ZFCP_FC_WKA_PORT_CLOSING) {
+ wka_port->status = ZFCP_FC_WKA_PORT_OPENING;
+- if (zfcp_fsf_open_wka_port(wka_port))
++ if (zfcp_fsf_open_wka_port(wka_port)) {
++ /* could not even send request, nothing to wait for */
+ wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
++ goto out;
++ }
+ }
+
+- mutex_unlock(&wka_port->mutex);
+-
+- wait_event(wka_port->completion_wq,
++ wait_event(wka_port->opened,
+ wka_port->status == ZFCP_FC_WKA_PORT_ONLINE ||
+ wka_port->status == ZFCP_FC_WKA_PORT_OFFLINE);
+
+ if (wka_port->status == ZFCP_FC_WKA_PORT_ONLINE) {
+ atomic_inc(&wka_port->refcount);
+- return 0;
++ ret = 0;
++ goto out;
+ }
+- return -EIO;
++out:
++ mutex_unlock(&wka_port->mutex);
++ return ret;
+ }
+
+ static void zfcp_fc_wka_port_offline(struct work_struct *work)
+@@ -181,9 +187,12 @@ static void zfcp_fc_wka_port_offline(struct work_struct *work)
+
+ wka_port->status = ZFCP_FC_WKA_PORT_CLOSING;
+ if (zfcp_fsf_close_wka_port(wka_port)) {
++ /* could not even send request, nothing to wait for */
+ wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
+- wake_up(&wka_port->completion_wq);
++ goto out;
+ }
++ wait_event(wka_port->closed,
++ wka_port->status == ZFCP_FC_WKA_PORT_OFFLINE);
+ out:
+ mutex_unlock(&wka_port->mutex);
+ }
+@@ -193,13 +202,15 @@ static void zfcp_fc_wka_port_put(struct zfcp_fc_wka_port *wka_port)
+ if (atomic_dec_return(&wka_port->refcount) != 0)
+ return;
+ /* wait 10 milliseconds, other reqs might pop in */
+- schedule_delayed_work(&wka_port->work, HZ / 100);
++ queue_delayed_work(wka_port->adapter->work_queue, &wka_port->work,
++ msecs_to_jiffies(10));
+ }
+
+ static void zfcp_fc_wka_port_init(struct zfcp_fc_wka_port *wka_port, u32 d_id,
+ struct zfcp_adapter *adapter)
+ {
+- init_waitqueue_head(&wka_port->completion_wq);
++ init_waitqueue_head(&wka_port->opened);
++ init_waitqueue_head(&wka_port->closed);
+
+ wka_port->adapter = adapter;
+ wka_port->d_id = d_id;
+diff --git a/drivers/s390/scsi/zfcp_fc.h b/drivers/s390/scsi/zfcp_fc.h
+index 8aaf409ce9cba..97755407ce1b5 100644
+--- a/drivers/s390/scsi/zfcp_fc.h
++++ b/drivers/s390/scsi/zfcp_fc.h
+@@ -185,7 +185,8 @@ enum zfcp_fc_wka_status {
+ /**
+ * struct zfcp_fc_wka_port - representation of well-known-address (WKA) FC port
+ * @adapter: Pointer to adapter structure this WKA port belongs to
+- * @completion_wq: Wait for completion of open/close command
++ * @opened: Wait for completion of open command
++ * @closed: Wait for completion of close command
+ * @status: Current status of WKA port
+ * @refcount: Reference count to keep port open as long as it is in use
+ * @d_id: FC destination id or well-known-address
+@@ -195,7 +196,8 @@ enum zfcp_fc_wka_status {
+ */
+ struct zfcp_fc_wka_port {
+ struct zfcp_adapter *adapter;
+- wait_queue_head_t completion_wq;
++ wait_queue_head_t opened;
++ wait_queue_head_t closed;
+ enum zfcp_fc_wka_status status;
+ atomic_t refcount;
+ u32 d_id;
+diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
+index 4f1e4385ce58a..19223b0755686 100644
+--- a/drivers/s390/scsi/zfcp_fsf.c
++++ b/drivers/s390/scsi/zfcp_fsf.c
+@@ -1907,7 +1907,7 @@ static void zfcp_fsf_open_wka_port_handler(struct zfcp_fsf_req *req)
+ wka_port->status = ZFCP_FC_WKA_PORT_ONLINE;
+ }
+ out:
+- wake_up(&wka_port->completion_wq);
++ wake_up(&wka_port->opened);
+ }
+
+ /**
+@@ -1966,7 +1966,7 @@ static void zfcp_fsf_close_wka_port_handler(struct zfcp_fsf_req *req)
+ }
+
+ wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
+- wake_up(&wka_port->completion_wq);
++ wake_up(&wka_port->closed);
+ }
+
+ /**
+diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
+index 3bb0adefbe06f..02026476c39c9 100644
+--- a/drivers/scsi/be2iscsi/be_main.c
++++ b/drivers/scsi/be2iscsi/be_main.c
+@@ -5745,7 +5745,7 @@ static void beiscsi_remove(struct pci_dev *pcidev)
+ cancel_work_sync(&phba->sess_work);
+
+ beiscsi_iface_destroy_default(phba);
+- iscsi_host_remove(phba->shost);
++ iscsi_host_remove(phba->shost, false);
+ beiscsi_disable_port(phba, 1);
+
+ /* after cancelling boot_work */
+diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
+index 15fbd09baa943..a3c800e04a2e8 100644
+--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
++++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
+@@ -909,7 +909,7 @@ void bnx2i_free_hba(struct bnx2i_hba *hba)
+ {
+ struct Scsi_Host *shost = hba->shost;
+
+- iscsi_host_remove(shost);
++ iscsi_host_remove(shost, false);
+ INIT_LIST_HEAD(&hba->ep_ofld_list);
+ INIT_LIST_HEAD(&hba->ep_active_list);
+ INIT_LIST_HEAD(&hba->ep_destroy_list);
+diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
+index 4365d52c6430e..32abdf0fa9aab 100644
+--- a/drivers/scsi/cxgbi/libcxgbi.c
++++ b/drivers/scsi/cxgbi/libcxgbi.c
+@@ -328,7 +328,7 @@ void cxgbi_hbas_remove(struct cxgbi_device *cdev)
+ chba = cdev->hbas[i];
+ if (chba) {
+ cdev->hbas[i] = NULL;
+- iscsi_host_remove(chba->shost);
++ iscsi_host_remove(chba->shost, false);
+ pci_dev_put(cdev->pdev);
+ iscsi_host_free(chba->shost);
+ }
+diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
+index 9fee70d6434a8..52c6f70d60ec4 100644
+--- a/drivers/scsi/iscsi_tcp.c
++++ b/drivers/scsi/iscsi_tcp.c
+@@ -898,7 +898,7 @@ iscsi_sw_tcp_session_create(struct iscsi_endpoint *ep, uint16_t cmds_max,
+ remove_session:
+ iscsi_session_teardown(cls_session);
+ remove_host:
+- iscsi_host_remove(shost);
++ iscsi_host_remove(shost, false);
+ free_host:
+ iscsi_host_free(shost);
+ return NULL;
+@@ -915,7 +915,7 @@ static void iscsi_sw_tcp_session_destroy(struct iscsi_cls_session *cls_session)
+ iscsi_tcp_r2tpool_free(cls_session->dd_data);
+ iscsi_session_teardown(cls_session);
+
+- iscsi_host_remove(shost);
++ iscsi_host_remove(shost, false);
+ iscsi_host_free(shost);
+ }
+
+diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
+index 797abf4f53995..3ddb701cd29c7 100644
+--- a/drivers/scsi/libiscsi.c
++++ b/drivers/scsi/libiscsi.c
+@@ -2828,11 +2828,12 @@ static void iscsi_notify_host_removed(struct iscsi_cls_session *cls_session)
+ /**
+ * iscsi_host_remove - remove host and sessions
+ * @shost: scsi host
++ * @is_shutdown: true if called from a driver shutdown callout
+ *
+ * If there are any sessions left, this will initiate the removal and wait
+ * for the completion.
+ */
+-void iscsi_host_remove(struct Scsi_Host *shost)
++void iscsi_host_remove(struct Scsi_Host *shost, bool is_shutdown)
+ {
+ struct iscsi_host *ihost = shost_priv(shost);
+ unsigned long flags;
+@@ -2841,7 +2842,11 @@ void iscsi_host_remove(struct Scsi_Host *shost)
+ ihost->state = ISCSI_HOST_REMOVED;
+ spin_unlock_irqrestore(&ihost->lock, flags);
+
+- iscsi_host_for_each_session(shost, iscsi_notify_host_removed);
++ if (!is_shutdown)
++ iscsi_host_for_each_session(shost, iscsi_notify_host_removed);
++ else
++ iscsi_host_for_each_session(shost, iscsi_force_destroy_session);
++
+ wait_event_interruptible(ihost->session_removal_wq,
+ ihost->num_sessions == 0);
+ if (signal_pending(current))
+diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
+index add238dc599bc..baed4f0146f54 100644
+--- a/drivers/scsi/lpfc/lpfc_scsi.c
++++ b/drivers/scsi/lpfc/lpfc_scsi.c
+@@ -5710,7 +5710,6 @@ lpfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
+ cur_iocbq->cmd_flag |= LPFC_IO_VMID;
+ }
+ }
+- atomic_inc(&ndlp->cmd_pending);
+
+ #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
+ if (unlikely(phba->hdwqstat_on & LPFC_CHECK_SCSI_IO))
+diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
+index 83ffba7f51da1..780d975c85b5b 100644
+--- a/drivers/scsi/qedi/qedi_main.c
++++ b/drivers/scsi/qedi/qedi_main.c
+@@ -2414,9 +2414,12 @@ static void __qedi_remove(struct pci_dev *pdev, int mode)
+ int rval;
+ u16 retry = 10;
+
+- if (mode == QEDI_MODE_NORMAL || mode == QEDI_MODE_SHUTDOWN) {
+- iscsi_host_remove(qedi->shost);
++ if (mode == QEDI_MODE_NORMAL)
++ iscsi_host_remove(qedi->shost, false);
++ else if (mode == QEDI_MODE_SHUTDOWN)
++ iscsi_host_remove(qedi->shost, true);
+
++ if (mode == QEDI_MODE_NORMAL || mode == QEDI_MODE_SHUTDOWN) {
+ if (qedi->tmf_thread) {
+ destroy_workqueue(qedi->tmf_thread);
+ qedi->tmf_thread = NULL;
+@@ -2791,7 +2794,7 @@ remove_host:
+ #ifdef CONFIG_DEBUG_FS
+ qedi_dbg_host_exit(&qedi->dbg_ctx);
+ #endif
+- iscsi_host_remove(qedi->shost);
++ iscsi_host_remove(qedi->shost, false);
+ stop_iscsi_func:
+ qedi_ops->stop(qedi->cdev);
+ stop_slowpath:
+diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
+index 3b3e4234f37a0..412ad888bdc17 100644
+--- a/drivers/scsi/qla2xxx/qla_attr.c
++++ b/drivers/scsi/qla2xxx/qla_attr.c
+@@ -2716,17 +2716,24 @@ qla2x00_dev_loss_tmo_callbk(struct fc_rport *rport)
+ if (!fcport)
+ return;
+
+- /* Now that the rport has been deleted, set the fcport state to
+- FCS_DEVICE_DEAD */
+- qla2x00_set_fcport_state(fcport, FCS_DEVICE_DEAD);
++
++ /*
++ * Now that the rport has been deleted, set the fcport state to
++ * FCS_DEVICE_DEAD, if the fcport is still lost.
++ */
++ if (fcport->scan_state != QLA_FCPORT_FOUND)
++ qla2x00_set_fcport_state(fcport, FCS_DEVICE_DEAD);
+
+ /*
+ * Transport has effectively 'deleted' the rport, clear
+ * all local references.
+ */
+ spin_lock_irqsave(host->host_lock, flags);
+- fcport->rport = fcport->drport = NULL;
+- *((fc_port_t **)rport->dd_data) = NULL;
++ /* Confirm port has not reappeared before clearing pointers. */
++ if (rport->port_state != FC_PORTSTATE_ONLINE) {
++ fcport->rport = fcport->drport = NULL;
++ *((fc_port_t **)rport->dd_data) = NULL;
++ }
+ spin_unlock_irqrestore(host->host_lock, flags);
+
+ if (test_bit(ABORT_ISP_ACTIVE, &fcport->vha->dpc_flags))
+@@ -2759,9 +2766,12 @@ qla2x00_terminate_rport_io(struct fc_rport *rport)
+ /*
+ * At this point all fcport's software-states are cleared. Perform any
+ * final cleanup of firmware resources (PCBs and XCBs).
++ *
++ * Attempt to cleanup only lost devices.
+ */
+ if (fcport->loop_id != FC_NO_LOOP_ID) {
+- if (IS_FWI2_CAPABLE(fcport->vha->hw)) {
++ if (IS_FWI2_CAPABLE(fcport->vha->hw) &&
++ fcport->scan_state != QLA_FCPORT_FOUND) {
+ if (fcport->loop_id != FC_NO_LOOP_ID)
+ fcport->logout_on_delete = 1;
+
+@@ -2771,7 +2781,7 @@ qla2x00_terminate_rport_io(struct fc_rport *rport)
+ __LINE__);
+ qlt_schedule_sess_for_deletion(fcport);
+ }
+- } else {
++ } else if (!IS_FWI2_CAPABLE(fcport->vha->hw)) {
+ qla2x00_port_logout(fcport->vha, fcport);
+ }
+ }
+diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c
+index c2f00f076f799..726af9e405728 100644
+--- a/drivers/scsi/qla2xxx/qla_bsg.c
++++ b/drivers/scsi/qla2xxx/qla_bsg.c
+@@ -2975,6 +2975,13 @@ qla24xx_bsg_timeout(struct bsg_job *bsg_job)
+
+ ql_log(ql_log_info, vha, 0x708b, "%s CMD timeout. bsg ptr %p.\n",
+ __func__, bsg_job);
++
++ if (qla2x00_isp_reg_stat(ha)) {
++ ql_log(ql_log_info, vha, 0x9007,
++ "PCI/Register disconnect.\n");
++ qla_pci_set_eeh_busy(vha);
++ }
++
+ /* find the bsg job from the active list of commands */
+ spin_lock_irqsave(&ha->hardware_lock, flags);
+ for (que = 0; que < ha->max_req_queues; que++) {
+@@ -2992,7 +2999,8 @@ qla24xx_bsg_timeout(struct bsg_job *bsg_job)
+ sp->u.bsg_job == bsg_job) {
+ req->outstanding_cmds[cnt] = NULL;
+ spin_unlock_irqrestore(&ha->hardware_lock, flags);
+- if (ha->isp_ops->abort_command(sp)) {
++
++ if (!ha->flags.eeh_busy && ha->isp_ops->abort_command(sp)) {
+ ql_log(ql_log_warn, vha, 0x7089,
+ "mbx abort_command failed.\n");
+ bsg_reply->result = -EIO;
+diff --git a/drivers/scsi/qla2xxx/qla_dbg.h b/drivers/scsi/qla2xxx/qla_dbg.h
+index f1f6c740bdcd8..feeb1666227f1 100644
+--- a/drivers/scsi/qla2xxx/qla_dbg.h
++++ b/drivers/scsi/qla2xxx/qla_dbg.h
+@@ -383,5 +383,5 @@ ql_mask_match(uint level)
+ if (ql2xextended_error_logging == 1)
+ ql2xextended_error_logging = QL_DBG_DEFAULT1_MASK;
+
+- return (level & ql2xextended_error_logging) == level;
++ return level && ((level & ql2xextended_error_logging) == level);
+ }
+diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
+index e8f69c486be10..01cdd5f8723c7 100644
+--- a/drivers/scsi/qla2xxx/qla_def.h
++++ b/drivers/scsi/qla2xxx/qla_def.h
+@@ -2158,6 +2158,11 @@ typedef struct {
+ #define CS_IOCB_ERROR 0x31 /* Generic error for IOCB request
+ failure */
+ #define CS_REJECT_RECEIVED 0x4E /* Reject received */
++#define CS_EDIF_AUTH_ERROR 0x63 /* decrypt error */
++#define CS_EDIF_PAD_LEN_ERROR 0x65 /* pad > frame size, not 4byte align */
++#define CS_EDIF_INV_REQ 0x66 /* invalid request */
++#define CS_EDIF_SPI_ERROR 0x67 /* rx frame unable to locate sa */
++#define CS_EDIF_HDR_ERROR 0x69 /* data frame != expected len */
+ #define CS_BAD_PAYLOAD 0x80 /* Driver defined */
+ #define CS_UNKNOWN 0x81 /* Driver defined */
+ #define CS_RETRY 0x82 /* Driver defined */
+@@ -2626,7 +2631,6 @@ typedef struct fc_port {
+ struct {
+ uint32_t enable:1; /* device is edif enabled/req'd */
+ uint32_t app_stop:2;
+- uint32_t app_started:1;
+ uint32_t aes_gmac:1;
+ uint32_t app_sess_online:1;
+ uint32_t tx_sa_set:1;
+@@ -2637,6 +2641,7 @@ typedef struct fc_port {
+ uint32_t rx_rekey_cnt;
+ uint64_t tx_bytes;
+ uint64_t rx_bytes;
++ uint8_t sess_down_acked;
+ uint8_t auth_state;
+ uint16_t authok:1;
+ uint16_t rekey_cnt;
+@@ -3204,6 +3209,8 @@ struct ct_sns_rsp {
+ #define GFF_NVME_OFFSET 23 /* type = 28h */
+ struct {
+ uint8_t fc4_features[128];
++#define FC4_FF_TARGET BIT_0
++#define FC4_FF_INITIATOR BIT_1
+ } gff_id;
+ struct {
+ uint8_t reserved;
+@@ -3975,6 +3982,7 @@ struct qla_hw_data {
+ /* SRB cache. */
+ #define SRB_MIN_REQ 128
+ mempool_t *srb_mempool;
++ u8 port_name[WWN_SIZE];
+
+ volatile struct {
+ uint32_t mbox_int :1;
+@@ -4040,6 +4048,9 @@ struct qla_hw_data {
+ uint32_t n2n_fw_acc_sec:1;
+ uint32_t plogi_template_valid:1;
+ uint32_t port_isolated:1;
++ uint32_t eeh_flush:2;
++#define EEH_FLUSH_RDY 1
++#define EEH_FLUSH_DONE 2
+ } flags;
+
+ uint16_t max_exchg;
+@@ -4074,6 +4085,7 @@ struct qla_hw_data {
+ uint32_t rsp_que_len;
+ uint32_t req_que_off;
+ uint32_t rsp_que_off;
++ unsigned long eeh_jif;
+
+ /* Multi queue data structs */
+ device_reg_t *mqiobase;
+@@ -4256,8 +4268,8 @@ struct qla_hw_data {
+ #define IS_OEM_001(ha) ((ha)->device_type & DT_OEM_001)
+ #define HAS_EXTENDED_IDS(ha) ((ha)->device_type & DT_EXTENDED_IDS)
+ #define IS_CT6_SUPPORTED(ha) ((ha)->device_type & DT_CT6_SUPPORTED)
+-#define IS_MQUE_CAPABLE(ha) ((ha)->mqenable || IS_QLA83XX(ha) || \
+- IS_QLA27XX(ha) || IS_QLA28XX(ha))
++#define IS_MQUE_CAPABLE(ha) (IS_QLA83XX(ha) || IS_QLA27XX(ha) || \
++ IS_QLA28XX(ha))
+ #define IS_BIDI_CAPABLE(ha) \
+ (IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) || IS_QLA28XX(ha))
+ /* Bit 21 of fw_attributes decides the MCTP capabilities */
+diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
+index 0628633c7c7e9..dbe8ef887a9d3 100644
+--- a/drivers/scsi/qla2xxx/qla_edif.c
++++ b/drivers/scsi/qla2xxx/qla_edif.c
+@@ -52,6 +52,31 @@ const char *sc_to_str(uint16_t cmd)
+ return "unknown";
+ }
+
++static struct edb_node *qla_edb_getnext(scsi_qla_host_t *vha)
++{
++ unsigned long flags;
++ struct edb_node *edbnode = NULL;
++
++ spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
++
++ /* db nodes are fifo - no qualifications done */
++ if (!list_empty(&vha->e_dbell.head)) {
++ edbnode = list_first_entry(&vha->e_dbell.head,
++ struct edb_node, list);
++ list_del_init(&edbnode->list);
++ }
++
++ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
++
++ return edbnode;
++}
++
++static void qla_edb_node_free(scsi_qla_host_t *vha, struct edb_node *node)
++{
++ list_del_init(&node->list);
++ kfree(node);
++}
++
+ static struct edif_list_entry *qla_edif_list_find_sa_index(fc_port_t *fcport,
+ uint16_t handle)
+ {
+@@ -257,14 +282,8 @@ qla2x00_find_fcport_by_pid(scsi_qla_host_t *vha, port_id_t *id)
+
+ f = NULL;
+ list_for_each_entry_safe(f, tf, &vha->vp_fcports, list) {
+- if ((f->flags & FCF_FCSP_DEVICE)) {
+- ql_dbg(ql_dbg_edif + ql_dbg_verbose, vha, 0x2058,
+- "Found secure fcport - nn %8phN pn %8phN portid=0x%x, 0x%x.\n",
+- f->node_name, f->port_name,
+- f->d_id.b24, id->b24);
+- if (f->d_id.b24 == id->b24)
+- return f;
+- }
++ if (f->d_id.b24 == id->b24)
++ return f;
+ }
+ return NULL;
+ }
+@@ -280,14 +299,19 @@ qla_edif_app_check(scsi_qla_host_t *vha, struct app_id appid)
+ {
+ /* check that the app is allow/known to the driver */
+
+- if (appid.app_vid == EDIF_APP_ID) {
+- ql_dbg(ql_dbg_edif + ql_dbg_verbose, vha, 0x911d, "%s app id ok\n", __func__);
+- return true;
++ if (appid.app_vid != EDIF_APP_ID) {
++ ql_dbg(ql_dbg_edif, vha, 0x911d, "%s app id not ok (%x)",
++ __func__, appid.app_vid);
++ return false;
++ }
++
++ if (appid.version != EDIF_VERSION1) {
++ ql_dbg(ql_dbg_edif, vha, 0x911d, "%s app version is not ok (%x)",
++ __func__, appid.version);
++ return false;
+ }
+- ql_dbg(ql_dbg_edif, vha, 0x911d, "%s app id not ok (%x)",
+- __func__, appid.app_vid);
+
+- return false;
++ return true;
+ }
+
+ static void
+@@ -486,16 +510,35 @@ qla_edif_app_start(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ /* mark doorbell as active since an app is now present */
+ vha->e_dbell.db_flags |= EDB_ACTIVE;
+ } else {
+- ql_dbg(ql_dbg_edif, vha, 0x911e, "%s doorbell already active\n",
+- __func__);
++ goto out;
+ }
+
+ if (N2N_TOPO(vha->hw)) {
+- if (vha->hw->flags.n2n_fw_acc_sec)
+- set_bit(N2N_LINK_RESET, &vha->dpc_flags);
+- else
++ list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list)
++ fcport->n2n_link_reset_cnt = 0;
++
++ if (vha->hw->flags.n2n_fw_acc_sec) {
++ list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list)
++ qla_edif_sa_ctl_init(vha, fcport);
++
++ /*
++ * While authentication app was not running, remote device
++ * could still try to login with this local port. Let's
++ * clear the state and try again.
++ */
++ qla2x00_wait_for_sess_deletion(vha);
++
++ /* bounce the link to get the other guy to relogin */
++ if (!vha->hw->flags.n2n_bigger) {
++ set_bit(N2N_LINK_RESET, &vha->dpc_flags);
++ qla2xxx_wake_dpc(vha);
++ }
++ } else {
++ qla2x00_wait_for_hba_online(vha);
+ set_bit(ISP_ABORT_NEEDED, &vha->dpc_flags);
+- qla2xxx_wake_dpc(vha);
++ qla2xxx_wake_dpc(vha);
++ qla2x00_wait_for_hba_online(vha);
++ }
+ } else {
+ list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
+ ql_dbg(ql_dbg_edif, vha, 0x2058,
+@@ -517,19 +560,31 @@ qla_edif_app_start(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ if (atomic_read(&vha->loop_state) == LOOP_DOWN)
+ break;
+
+- fcport->edif.app_started = 1;
+ fcport->login_retry = vha->hw->login_retry_count;
+
+- /* no activity */
+ fcport->edif.app_stop = 0;
++ fcport->edif.app_sess_online = 0;
++
++ if (fcport->scan_state != QLA_FCPORT_FOUND)
++ continue;
++
++ if (fcport->port_type == FCT_UNKNOWN &&
++ !fcport->fc4_features)
++ rval = qla24xx_async_gffid(vha, fcport, true);
++
++ if (!rval && !(fcport->fc4_features & FC4_FF_TARGET ||
++ fcport->port_type & (FCT_TARGET|FCT_NVME_TARGET)))
++ continue;
++
++ rval = 0;
+
+ ql_dbg(ql_dbg_edif, vha, 0x911e,
+ "%s wwpn %8phC calling qla_edif_reset_auth_wait\n",
+ __func__, fcport->port_name);
+- fcport->edif.app_sess_online = 0;
+ qlt_schedule_sess_for_deletion(fcport);
+ qla_edif_sa_ctl_init(vha, fcport);
+ }
++ set_bit(RELOGIN_NEEDED, &vha->dpc_flags);
+ }
+
+ if (vha->pur_cinfo.enode_flags != ENODE_ACTIVE) {
+@@ -540,9 +595,11 @@ qla_edif_app_start(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ __func__);
+ }
+
++out:
+ appreply.host_support_edif = vha->hw->flags.edif_enabled;
+ appreply.edif_enode_active = vha->pur_cinfo.enode_flags;
+ appreply.edif_edb_active = vha->e_dbell.db_flags;
++ appreply.version = EDIF_VERSION1;
+
+ bsg_job->reply_len = sizeof(struct fc_bsg_reply);
+
+@@ -610,9 +667,6 @@ qla_edif_app_stop(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+
+ fcport->send_els_logo = 1;
+ qlt_schedule_sess_for_deletion(fcport);
+-
+- /* qla_edif_flush_sa_ctl_lists(fcport); */
+- fcport->edif.app_started = 0;
+ }
+ }
+
+@@ -673,6 +727,7 @@ qla_edif_app_authok(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ portid.b.area = appplogiok.u.d_id.b.area;
+ portid.b.al_pa = appplogiok.u.d_id.b.al_pa;
+
++ appplogireply.version = EDIF_VERSION1;
+ switch (appplogiok.type) {
+ case PL_TYPE_WWPN:
+ fcport = qla2x00_find_fcport_by_wwpn(vha,
+@@ -865,6 +920,8 @@ qla_edif_app_getfcinfo(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ } else {
+ struct fc_port *fcport = NULL, *tf;
+
++ app_reply->version = EDIF_VERSION1;
++
+ list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
+ if (!(fcport->flags & FCF_FCSP_DEVICE))
+ continue;
+@@ -881,9 +938,25 @@ qla_edif_app_getfcinfo(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ if (tdid.b24 != 0 && tdid.b24 != fcport->d_id.b24)
+ continue;
+
+- app_reply->ports[pcnt].rekey_count =
+- fcport->edif.rekey_cnt;
++ if (!N2N_TOPO(vha->hw)) {
++ if (fcport->scan_state != QLA_FCPORT_FOUND)
++ continue;
++
++ if (fcport->port_type == FCT_UNKNOWN &&
++ !fcport->fc4_features)
++ rval = qla24xx_async_gffid(vha, fcport,
++ true);
++
++ if (!rval &&
++ !(fcport->fc4_features & FC4_FF_TARGET ||
++ fcport->port_type &
++ (FCT_TARGET | FCT_NVME_TARGET)))
++ continue;
++ }
++
++ rval = 0;
+
++ app_reply->ports[pcnt].version = EDIF_VERSION1;
+ app_reply->ports[pcnt].remote_type =
+ VND_CMD_RTYPE_UNKNOWN;
+ if (fcport->port_type & (FCT_NVME_TARGET | FCT_TARGET))
+@@ -980,6 +1053,8 @@ qla_edif_app_getstats(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ } else {
+ struct fc_port *fcport = NULL, *tf;
+
++ app_reply->version = EDIF_VERSION1;
++
+ list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
+ if (fcport->edif.enable) {
+ if (pcnt > app_req.num_ports)
+@@ -1013,6 +1088,164 @@ qla_edif_app_getstats(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ return rval;
+ }
+
++static int32_t
++qla_edif_ack(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
++{
++ struct fc_port *fcport;
++ struct aen_complete_cmd ack;
++ struct fc_bsg_reply *bsg_reply = bsg_job->reply;
++
++ sg_copy_to_buffer(bsg_job->request_payload.sg_list,
++ bsg_job->request_payload.sg_cnt, &ack, sizeof(ack));
++
++ ql_dbg(ql_dbg_edif, vha, 0x70cf,
++ "%s: %06x event_code %x\n",
++ __func__, ack.port_id.b24, ack.event_code);
++
++ fcport = qla2x00_find_fcport_by_pid(vha, &ack.port_id);
++ SET_DID_STATUS(bsg_reply->result, DID_OK);
++
++ if (!fcport) {
++ ql_dbg(ql_dbg_edif, vha, 0x70cf,
++ "%s: unable to find fcport %06x \n",
++ __func__, ack.port_id.b24);
++ return 0;
++ }
++
++ switch (ack.event_code) {
++ case VND_CMD_AUTH_STATE_SESSION_SHUTDOWN:
++ fcport->edif.sess_down_acked = 1;
++ break;
++ default:
++ break;
++ }
++ return 0;
++}
++
++static int qla_edif_consume_dbell(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
++{
++ struct fc_bsg_reply *bsg_reply = bsg_job->reply;
++ u32 sg_skip, reply_payload_len;
++ bool keep;
++ struct edb_node *dbnode = NULL;
++ struct edif_app_dbell ap;
++ int dat_size = 0;
++
++ sg_skip = 0;
++ reply_payload_len = bsg_job->reply_payload.payload_len;
++
++ while ((reply_payload_len - sg_skip) >= sizeof(struct edb_node)) {
++ dbnode = qla_edb_getnext(vha);
++ if (dbnode) {
++ keep = true;
++ dat_size = 0;
++ ap.event_code = dbnode->ntype;
++ switch (dbnode->ntype) {
++ case VND_CMD_AUTH_STATE_SESSION_SHUTDOWN:
++ case VND_CMD_AUTH_STATE_NEEDED:
++ ap.port_id = dbnode->u.plogi_did;
++ dat_size += sizeof(ap.port_id);
++ break;
++ case VND_CMD_AUTH_STATE_ELS_RCVD:
++ ap.port_id = dbnode->u.els_sid;
++ dat_size += sizeof(ap.port_id);
++ break;
++ case VND_CMD_AUTH_STATE_SAUPDATE_COMPL:
++ ap.port_id = dbnode->u.sa_aen.port_id;
++ memcpy(&ap.event_data, &dbnode->u,
++ sizeof(struct edif_sa_update_aen));
++ dat_size += sizeof(struct edif_sa_update_aen);
++ break;
++ default:
++ keep = false;
++ ql_log(ql_log_warn, vha, 0x09102,
++ "%s unknown DB type=%d %p\n",
++ __func__, dbnode->ntype, dbnode);
++ break;
++ }
++ ap.event_data_size = dat_size;
++ /* 8 = sizeof(ap.event_code + ap.event_data_size) */
++ dat_size += 8;
++ if (keep)
++ sg_skip += sg_copy_buffer(bsg_job->reply_payload.sg_list,
++ bsg_job->reply_payload.sg_cnt,
++ &ap, dat_size, sg_skip, false);
++
++ ql_dbg(ql_dbg_edif, vha, 0x09102,
++ "%s Doorbell consumed : type=%d %p\n",
++ __func__, dbnode->ntype, dbnode);
++
++ kfree(dbnode);
++ } else {
++ break;
++ }
++ }
++
++ SET_DID_STATUS(bsg_reply->result, DID_OK);
++ bsg_reply->reply_payload_rcv_len = sg_skip;
++ bsg_job->reply_len = sizeof(struct fc_bsg_reply);
++
++ return 0;
++}
++
++static void __qla_edif_dbell_bsg_done(scsi_qla_host_t *vha, struct bsg_job *bsg_job,
++ u32 delay)
++{
++ struct fc_bsg_reply *bsg_reply = bsg_job->reply;
++
++ /* small sleep for doorbell events to accumulate */
++ if (delay)
++ msleep(delay);
++
++ qla_edif_consume_dbell(vha, bsg_job);
++
++ bsg_job_done(bsg_job, bsg_reply->result, bsg_reply->reply_payload_rcv_len);
++}
++
++static void qla_edif_dbell_bsg_done(scsi_qla_host_t *vha)
++{
++ unsigned long flags;
++ struct bsg_job *prev_bsg_job = NULL;
++
++ spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
++ if (vha->e_dbell.dbell_bsg_job) {
++ prev_bsg_job = vha->e_dbell.dbell_bsg_job;
++ vha->e_dbell.dbell_bsg_job = NULL;
++ }
++ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
++
++ if (prev_bsg_job)
++ __qla_edif_dbell_bsg_done(vha, prev_bsg_job, 0);
++}
++
++static int
++qla_edif_dbell_bsg(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
++{
++ unsigned long flags;
++ bool return_bsg = false;
++
++ /* flush previous dbell bsg */
++ qla_edif_dbell_bsg_done(vha);
++
++ spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
++ if (list_empty(&vha->e_dbell.head) && DBELL_ACTIVE(vha)) {
++ /*
++ * when the next db event happens, bsg_job will return.
++ * Otherwise, timer will return it.
++ */
++ vha->e_dbell.dbell_bsg_job = bsg_job;
++ vha->e_dbell.bsg_expire = jiffies + 10 * HZ;
++ } else {
++ return_bsg = true;
++ }
++ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
++
++ if (return_bsg)
++ __qla_edif_dbell_bsg_done(vha, bsg_job, 1);
++
++ return 0;
++}
++
+ int32_t
+ qla_edif_app_mgmt(struct bsg_job *bsg_job)
+ {
+@@ -1024,8 +1257,13 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
+ bool done = true;
+ int32_t rval = 0;
+ uint32_t vnd_sc = bsg_request->rqst_data.h_vendor.vendor_cmd[1];
++ u32 level = ql_dbg_edif;
++
++ /* doorbell is high traffic */
++ if (vnd_sc == QL_VND_SC_READ_DBELL)
++ level = 0;
+
+- ql_dbg(ql_dbg_edif, vha, 0x911d, "%s vnd subcmd=%x\n",
++ ql_dbg(level, vha, 0x911d, "%s vnd subcmd=%x\n",
+ __func__, vnd_sc);
+
+ sg_copy_to_buffer(bsg_job->request_payload.sg_list,
+@@ -1034,7 +1272,7 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
+
+ if (!vha->hw->flags.edif_enabled ||
+ test_bit(VPORT_DELETE, &vha->dpc_flags)) {
+- ql_dbg(ql_dbg_edif, vha, 0x911d,
++ ql_dbg(level, vha, 0x911d,
+ "%s edif not enabled or vp delete. bsg ptr done %p. dpc_flags %lx\n",
+ __func__, bsg_job, vha->dpc_flags);
+
+@@ -1043,7 +1281,7 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
+ }
+
+ if (!qla_edif_app_check(vha, appcheck)) {
+- ql_dbg(ql_dbg_edif, vha, 0x911d,
++ ql_dbg(level, vha, 0x911d,
+ "%s app checked failed.\n",
+ __func__);
+
+@@ -1075,6 +1313,13 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
+ case QL_VND_SC_GET_STATS:
+ rval = qla_edif_app_getstats(vha, bsg_job);
+ break;
++ case QL_VND_SC_AEN_COMPLETE:
++ rval = qla_edif_ack(vha, bsg_job);
++ break;
++ case QL_VND_SC_READ_DBELL:
++ rval = qla_edif_dbell_bsg(vha, bsg_job);
++ done = false;
++ break;
+ default:
+ ql_dbg(ql_dbg_edif, vha, 0x911d, "%s unknown cmd=%x\n",
+ __func__,
+@@ -1086,7 +1331,7 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
+
+ done:
+ if (done) {
+- ql_dbg(ql_dbg_user, vha, 0x7009,
++ ql_dbg(level, vha, 0x7009,
+ "%s: %d bsg ptr done %p\n", __func__, __LINE__, bsg_job);
+ bsg_job_done(bsg_job, bsg_reply->result,
+ bsg_reply->reply_payload_rcv_len);
+@@ -1248,6 +1493,8 @@ qla24xx_check_sadb_avail_slot(struct bsg_job *bsg_job, fc_port_t *fcport,
+
+ #define QLA_SA_UPDATE_FLAGS_RX_KEY 0x0
+ #define QLA_SA_UPDATE_FLAGS_TX_KEY 0x2
++#define EDIF_MSLEEP_INTERVAL 100
++#define EDIF_RETRY_COUNT 50
+
+ int
+ qla24xx_sadb_update(struct bsg_job *bsg_job)
+@@ -1260,7 +1507,7 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
+ struct edif_list_entry *edif_entry = NULL;
+ int found = 0;
+ int rval = 0;
+- int result = 0;
++ int result = 0, cnt;
+ struct qla_sa_update_frame sa_frame;
+ struct srb_iocb *iocb_cmd;
+ port_id_t portid;
+@@ -1501,11 +1748,23 @@ force_rx_delete:
+ sp->done = qla2x00_bsg_job_done;
+ iocb_cmd = &sp->u.iocb_cmd;
+ iocb_cmd->u.sa_update.sa_frame = sa_frame;
+-
++ cnt = 0;
++retry:
+ rval = qla2x00_start_sp(sp);
+- if (rval != QLA_SUCCESS) {
++ switch (rval) {
++ case QLA_SUCCESS:
++ break;
++ case EAGAIN:
++ msleep(EDIF_MSLEEP_INTERVAL);
++ cnt++;
++ if (cnt < EDIF_RETRY_COUNT)
++ goto retry;
++
++ fallthrough;
++ default:
+ ql_log(ql_dbg_edif, vha, 0x70e3,
+- "qla2x00_start_sp failed=%d.\n", rval);
++ "%s qla2x00_start_sp failed=%d.\n",
++ __func__, rval);
+
+ qla2x00_rel_sp(sp);
+ rval = -EIO;
+@@ -1798,30 +2057,6 @@ qla_edb_init(scsi_qla_host_t *vha)
+ /* initialize lock which protects doorbell & init list */
+ spin_lock_init(&vha->e_dbell.db_lock);
+ INIT_LIST_HEAD(&vha->e_dbell.head);
+-
+- /* create and initialize doorbell */
+- init_completion(&vha->e_dbell.dbell);
+-}
+-
+-static void
+-qla_edb_node_free(scsi_qla_host_t *vha, struct edb_node *node)
+-{
+- /*
+- * releases the space held by this edb node entry
+- * this function does _not_ free the edb node itself
+- * NB: the edb node entry passed should not be on any list
+- *
+- * currently for doorbell there's no additional cleanup
+- * needed, but here as a placeholder for furture use.
+- */
+-
+- if (!node) {
+- ql_dbg(ql_dbg_edif, vha, 0x09122,
+- "%s error - no valid node passed\n", __func__);
+- return;
+- }
+-
+- node->ntype = N_UNDEF;
+ }
+
+ static void qla_edb_clear(scsi_qla_host_t *vha, port_id_t portid)
+@@ -1868,11 +2103,8 @@ static void qla_edb_clear(scsi_qla_host_t *vha, port_id_t portid)
+ }
+ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
+
+- list_for_each_entry_safe(e, tmp, &edb_list, list) {
++ list_for_each_entry_safe(e, tmp, &edb_list, list)
+ qla_edb_node_free(vha, e);
+- list_del_init(&e->list);
+- kfree(e);
+- }
+ }
+
+ /* function called when app is stopping */
+@@ -1900,14 +2132,10 @@ qla_edb_stop(scsi_qla_host_t *vha)
+ "%s freeing edb_node type=%x\n",
+ __func__, node->ntype);
+ qla_edb_node_free(vha, node);
+- list_del(&node->list);
+-
+- kfree(node);
+ }
+ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
+
+- /* wake up doorbell waiters - they'll be dismissed with error code */
+- complete_all(&vha->e_dbell.dbell);
++ qla_edif_dbell_bsg_done(vha);
+ }
+
+ static struct edb_node *
+@@ -1945,9 +2173,6 @@ qla_edb_node_add(scsi_qla_host_t *vha, struct edb_node *ptr)
+ list_add_tail(&ptr->list, &vha->e_dbell.head);
+ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
+
+- /* ring doorbell for waiters */
+- complete(&vha->e_dbell.dbell);
+-
+ return true;
+ }
+
+@@ -2011,47 +2236,29 @@ qla_edb_eventcreate(scsi_qla_host_t *vha, uint32_t dbtype,
+ edbnode->u.sa_aen.port_id = fcport->d_id;
+ edbnode->u.sa_aen.status = data;
+ edbnode->u.sa_aen.key_type = data2;
++ edbnode->u.sa_aen.version = EDIF_VERSION1;
+ break;
+ default:
+ ql_dbg(ql_dbg_edif, vha, 0x09102,
+ "%s unknown type: %x\n", __func__, dbtype);
+- qla_edb_node_free(vha, edbnode);
+ kfree(edbnode);
+ edbnode = NULL;
+ break;
+ }
+
+- if (edbnode && (!qla_edb_node_add(vha, edbnode))) {
++ if (edbnode) {
++ if (!qla_edb_node_add(vha, edbnode)) {
++ ql_dbg(ql_dbg_edif, vha, 0x09102,
++ "%s unable to add dbnode\n", __func__);
++ kfree(edbnode);
++ return;
++ }
+ ql_dbg(ql_dbg_edif, vha, 0x09102,
+- "%s unable to add dbnode\n", __func__);
+- qla_edb_node_free(vha, edbnode);
+- kfree(edbnode);
+- return;
+- }
+- if (edbnode && fcport)
+- fcport->edif.auth_state = dbtype;
+- ql_dbg(ql_dbg_edif, vha, 0x09102,
+- "%s Doorbell produced : type=%d %p\n", __func__, dbtype, edbnode);
+-}
+-
+-static struct edb_node *
+-qla_edb_getnext(scsi_qla_host_t *vha)
+-{
+- unsigned long flags;
+- struct edb_node *edbnode = NULL;
+-
+- spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
+-
+- /* db nodes are fifo - no qualifications done */
+- if (!list_empty(&vha->e_dbell.head)) {
+- edbnode = list_first_entry(&vha->e_dbell.head,
+- struct edb_node, list);
+- list_del(&edbnode->list);
++ "%s Doorbell produced : type=%d %p\n", __func__, dbtype, edbnode);
++ qla_edif_dbell_bsg_done(vha);
++ if (fcport)
++ fcport->edif.auth_state = dbtype;
+ }
+-
+- spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
+-
+- return edbnode;
+ }
+
+ void
+@@ -2079,6 +2286,9 @@ qla_edif_timer(scsi_qla_host_t *vha)
+ ha->edif_post_stop_cnt_down = 60;
+ }
+ }
++
++ if (vha->e_dbell.dbell_bsg_job && time_after_eq(jiffies, vha->e_dbell.bsg_expire))
++ qla_edif_dbell_bsg_done(vha);
+ }
+
+ /*
+@@ -2146,7 +2356,6 @@ edif_doorbell_show(struct device *dev, struct device_attribute *attr,
+ "%s Doorbell consumed : type=%d %p\n",
+ __func__, dbnode->ntype, dbnode);
+ /* we're done with the db node, so free it up */
+- qla_edb_node_free(vha, dbnode);
+ kfree(dbnode);
+ } else {
+ break;
+@@ -2162,6 +2371,7 @@ edif_doorbell_show(struct device *dev, struct device_attribute *attr,
+
+ static void qla_noop_sp_done(srb_t *sp, int res)
+ {
++ sp->fcport->flags &= ~(FCF_ASYNC_SENT | FCF_ASYNC_ACTIVE);
+ /* ref: INIT */
+ kref_put(&sp->cmd_kref, qla2x00_sp_release);
+ }
+@@ -2186,7 +2396,8 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha, struct qla_work_evt *e)
+ if (!sa_ctl) {
+ ql_dbg(ql_dbg_edif, vha, 0x70e6,
+ "sa_ctl allocation failed\n");
+- return -ENOMEM;
++ rval = -ENOMEM;
++ goto done;
+ }
+
+ fcport = sa_ctl->fcport;
+@@ -2196,7 +2407,8 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha, struct qla_work_evt *e)
+ if (!sp) {
+ ql_dbg(ql_dbg_edif, vha, 0x70e6,
+ "SRB allocation failed\n");
+- return -ENOMEM;
++ rval = -ENOMEM;
++ goto done;
+ }
+
+ fcport->flags |= FCF_ASYNC_SENT;
+@@ -2225,9 +2437,16 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha, struct qla_work_evt *e)
+
+ rval = qla2x00_start_sp(sp);
+
+- if (rval != QLA_SUCCESS)
+- rval = QLA_FUNCTION_FAILED;
++ if (rval != QLA_SUCCESS) {
++ goto done_free_sp;
++ }
+
++ return rval;
++done_free_sp:
++ kref_put(&sp->cmd_kref, qla2x00_sp_release);
++ fcport->flags &= ~FCF_ASYNC_SENT;
++done:
++ fcport->flags &= ~FCF_ASYNC_ACTIVE;
+ return rval;
+ }
+
+@@ -2447,8 +2666,7 @@ void qla24xx_auth_els(scsi_qla_host_t *vha, void **pkt, struct rsp_que **rsp)
+
+ fcport = qla2x00_find_fcport_by_pid(host, &purex->pur_info.pur_sid);
+
+- if (DBELL_INACTIVE(vha) ||
+- (fcport && EDIF_SESSION_DOWN(fcport))) {
++ if (DBELL_INACTIVE(vha)) {
+ ql_dbg(ql_dbg_edif, host, 0x0910c, "%s e_dbell.db_flags =%x %06x\n",
+ __func__, host->e_dbell.db_flags,
+ fcport ? fcport->d_id.b24 : 0);
+@@ -2458,6 +2676,22 @@ void qla24xx_auth_els(scsi_qla_host_t *vha, void **pkt, struct rsp_que **rsp)
+ return;
+ }
+
++ if (fcport && EDIF_SESSION_DOWN(fcport)) {
++ ql_dbg(ql_dbg_edif, host, 0x13b6,
++ "%s terminate exchange. Send logo to 0x%x\n",
++ __func__, a.did.b24);
++
++ a.tx_byte_count = a.tx_len = 0;
++ a.tx_addr = 0;
++ a.control_flags = EPD_RX_XCHG; /* EPD_RX_XCHG = terminate cmd */
++ qla_els_reject_iocb(host, (*rsp)->qpair, &a);
++ qla_enode_free(host, ptr);
++ /* send logo to let remote port knows to tear down session */
++ fcport->send_els_logo = 1;
++ qlt_schedule_sess_for_deletion(fcport);
++ return;
++ }
++
+ /* add the local enode to the list */
+ qla_enode_add(host, ptr);
+
+@@ -3350,10 +3584,14 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ fc_port_t *fcport = NULL;
+ struct qla_hw_data *ha = vha->hw;
+ srb_t *sp;
+- int rval = (DID_ERROR << 16);
++ int rval = (DID_ERROR << 16), cnt;
+ port_id_t d_id;
+ struct qla_bsg_auth_els_request *p =
+ (struct qla_bsg_auth_els_request *)bsg_job->request;
++ struct qla_bsg_auth_els_reply *rpl =
++ (struct qla_bsg_auth_els_reply *)bsg_job->reply;
++
++ rpl->version = EDIF_VERSION1;
+
+ d_id.b.al_pa = bsg_request->rqst_data.h_els.port_id[2];
+ d_id.b.area = bsg_request->rqst_data.h_els.port_id[1];
+@@ -3372,7 +3610,7 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ if (qla_bsg_check(vha, bsg_job, fcport))
+ return 0;
+
+- if (fcport->loop_id == FC_NO_LOOP_ID) {
++ if (EDIF_SESS_DELETE(fcport)) {
+ ql_dbg(ql_dbg_edif, vha, 0x910d,
+ "%s ELS code %x, no loop id.\n", __func__,
+ bsg_request->rqst_data.r_els.els_code);
+@@ -3441,17 +3679,26 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+ sp->free = qla2x00_bsg_sp_free;
+ sp->done = qla2x00_bsg_job_done;
+
++ cnt = 0;
++retry:
+ rval = qla2x00_start_sp(sp);
+-
+- ql_dbg(ql_dbg_edif, vha, 0x700a,
+- "%s %s %8phN xchg %x ctlflag %x hdl %x reqlen %xh bsg ptr %p\n",
+- __func__, sc_to_str(p->e.sub_cmd), fcport->port_name,
+- p->e.extra_rx_xchg_address, p->e.extra_control_flags,
+- sp->handle, sp->remap.req.len, bsg_job);
+-
+- if (rval != QLA_SUCCESS) {
++ switch (rval) {
++ case QLA_SUCCESS:
++ ql_dbg(ql_dbg_edif, vha, 0x700a,
++ "%s %s %8phN xchg %x ctlflag %x hdl %x reqlen %xh bsg ptr %p\n",
++ __func__, sc_to_str(p->e.sub_cmd), fcport->port_name,
++ p->e.extra_rx_xchg_address, p->e.extra_control_flags,
++ sp->handle, sp->remap.req.len, bsg_job);
++ break;
++ case EAGAIN:
++ msleep(EDIF_MSLEEP_INTERVAL);
++ cnt++;
++ if (cnt < EDIF_RETRY_COUNT)
++ goto retry;
++ fallthrough;
++ default:
+ ql_log(ql_log_warn, vha, 0x700e,
+- "qla2x00_start_sp failed = %d\n", rval);
++ "%s qla2x00_start_sp failed = %d\n", __func__, rval);
+ SET_DID_STATUS(bsg_reply->result, DID_IMM_RETRY);
+ rval = -EIO;
+ goto done_free_remap_rsp;
+@@ -3473,14 +3720,29 @@ done:
+
+ void qla_edif_sess_down(struct scsi_qla_host *vha, struct fc_port *sess)
+ {
++ u16 cnt = 0;
++
+ if (sess->edif.app_sess_online && DBELL_ACTIVE(vha)) {
+ ql_dbg(ql_dbg_disc, vha, 0xf09c,
+ "%s: sess %8phN send port_offline event\n",
+ __func__, sess->port_name);
+ sess->edif.app_sess_online = 0;
++ sess->edif.sess_down_acked = 0;
+ qla_edb_eventcreate(vha, VND_CMD_AUTH_STATE_SESSION_SHUTDOWN,
+ sess->d_id.b24, 0, sess);
+ qla2x00_post_aen_work(vha, FCH_EVT_PORT_OFFLINE, sess->d_id.b24);
++
++ while (!READ_ONCE(sess->edif.sess_down_acked) &&
++ !test_bit(VPORT_DELETE, &vha->dpc_flags)) {
++ msleep(100);
++ cnt++;
++ if (cnt > 100)
++ break;
++ }
++ sess->edif.sess_down_acked = 0;
++ ql_dbg(ql_dbg_disc, vha, 0xf09c,
++ "%s: sess %8phN port_offline event completed\n",
++ __func__, sess->port_name);
+ }
+ }
+
+diff --git a/drivers/scsi/qla2xxx/qla_edif.h b/drivers/scsi/qla2xxx/qla_edif.h
+index a965ca8e47ce7..7cdb89ccdc6ea 100644
+--- a/drivers/scsi/qla2xxx/qla_edif.h
++++ b/drivers/scsi/qla2xxx/qla_edif.h
+@@ -51,7 +51,8 @@ struct edif_dbell {
+ enum db_flags_t db_flags;
+ spinlock_t db_lock;
+ struct list_head head;
+- struct completion dbell;
++ struct bsg_job *dbell_bsg_job;
++ unsigned long bsg_expire;
+ };
+
+ #define SA_UPDATE_IOCB_TYPE 0x71 /* Security Association Update IOCB entry */
+@@ -140,4 +141,8 @@ struct enode {
+ (DBELL_ACTIVE(_fcport->vha) && \
+ (_fcport->disc_state == DSC_LOGIN_AUTH_PEND))
+
++#define EDIF_SESS_DELETE(_s) \
++ (qla_ini_mode_enabled(_s->vha) && (_s->disc_state == DSC_DELETE_PEND || \
++ _s->disc_state == DSC_DELETED))
++
+ #endif /* __QLA_EDIF_H */
+diff --git a/drivers/scsi/qla2xxx/qla_edif_bsg.h b/drivers/scsi/qla2xxx/qla_edif_bsg.h
+index 5a26c77157da2..0931f4e4e127a 100644
+--- a/drivers/scsi/qla2xxx/qla_edif_bsg.h
++++ b/drivers/scsi/qla2xxx/qla_edif_bsg.h
+@@ -7,13 +7,15 @@
+ #ifndef __QLA_EDIF_BSG_H
+ #define __QLA_EDIF_BSG_H
+
++#define EDIF_VERSION1 1
++
+ /* BSG Vendor specific commands */
+ #define ELS_MAX_PAYLOAD 2112
+ #ifndef WWN_SIZE
+ #define WWN_SIZE 8
+ #endif
+-#define VND_CMD_APP_RESERVED_SIZE 32
+-
++#define VND_CMD_APP_RESERVED_SIZE 28
++#define VND_CMD_PAD_SIZE 3
+ enum auth_els_sub_cmd {
+ SEND_ELS = 0,
+ SEND_ELS_REPLY,
+@@ -28,7 +30,9 @@ struct extra_auth_els {
+ #define BSG_CTL_FLAG_LS_ACC 1
+ #define BSG_CTL_FLAG_LS_RJT 2
+ #define BSG_CTL_FLAG_TRM 3
+- uint8_t extra_rsvd[3];
++ uint8_t version;
++ uint8_t pad[2];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ struct qla_bsg_auth_els_request {
+@@ -39,51 +43,46 @@ struct qla_bsg_auth_els_request {
+ struct qla_bsg_auth_els_reply {
+ struct fc_bsg_reply r;
+ uint32_t rx_xchg_address;
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ };
+
+ struct app_id {
+ int app_vid;
+- uint8_t app_key[32];
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ struct app_start_reply {
+ uint32_t host_support_edif;
+ uint32_t edif_enode_active;
+ uint32_t edif_edb_active;
+- uint32_t reserved[VND_CMD_APP_RESERVED_SIZE];
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ struct app_start {
+ struct app_id app_info;
+- uint32_t prli_to;
+- uint32_t key_shred;
+ uint8_t app_start_flags;
+- uint8_t reserved[VND_CMD_APP_RESERVED_SIZE - 1];
++ uint8_t version;
++ uint8_t pad[2];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ struct app_stop {
+ struct app_id app_info;
+- char buf[16];
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ struct app_plogi_reply {
+ uint32_t prli_status;
+- uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+-} __packed;
+-
+-#define RECFG_TIME 1
+-#define RECFG_BYTES 2
+-
+-struct app_rekey_cfg {
+- struct app_id app_info;
+- uint8_t rekey_mode;
+- port_id_t d_id;
+- uint8_t force;
+- union {
+- int64_t bytes;
+- int64_t time;
+- } rky_units;
+-
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+@@ -91,7 +90,9 @@ struct app_pinfo_req {
+ struct app_id app_info;
+ uint8_t num_ports;
+ port_id_t remote_pid;
+- uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ struct app_pinfo {
+@@ -103,11 +104,8 @@ struct app_pinfo {
+ #define VND_CMD_RTYPE_INITIATOR 2
+ uint8_t remote_state;
+ uint8_t auth_state;
+- uint8_t rekey_mode;
+- int64_t rekey_count;
+- int64_t rekey_config_value;
+- int64_t rekey_consumed_value;
+-
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+@@ -120,6 +118,8 @@ struct app_pinfo {
+
+ struct app_pinfo_reply {
+ uint8_t port_count;
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ struct app_pinfo ports[];
+ } __packed;
+@@ -127,6 +127,8 @@ struct app_pinfo_reply {
+ struct app_sinfo_req {
+ struct app_id app_info;
+ uint8_t num_ports;
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+@@ -140,6 +142,9 @@ struct app_sinfo {
+
+ struct app_stats_reply {
+ uint8_t elem_count;
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ struct app_sinfo elem[];
+ } __packed;
+
+@@ -163,9 +168,11 @@ struct qla_sa_update_frame {
+ uint8_t node_name[WWN_SIZE];
+ uint8_t port_name[WWN_SIZE];
+ port_id_t port_id;
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved2[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+-// used for edif mgmt bsg interface
+ #define QL_VND_SC_UNDEF 0
+ #define QL_VND_SC_SA_UPDATE 1
+ #define QL_VND_SC_APP_START 2
+@@ -175,6 +182,22 @@ struct qla_sa_update_frame {
+ #define QL_VND_SC_REKEY_CONFIG 6
+ #define QL_VND_SC_GET_FCINFO 7
+ #define QL_VND_SC_GET_STATS 8
++#define QL_VND_SC_AEN_COMPLETE 9
++#define QL_VND_SC_READ_DBELL 10
++
++/*
++ * bsg caller to provide empty buffer for doorbell events.
++ *
++ * sg_io_v4.din_xferp = empty buffer for door bell events
++ * sg_io_v4.dout_xferp = struct edif_read_dbell *buf
++ */
++struct edif_read_dbell {
++ struct app_id app_info;
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
++};
++
+
+ /* Application interface data structure for rtn data */
+ #define EXT_DEF_EVENT_DATA_SIZE 64
+@@ -191,7 +214,9 @@ struct edif_sa_update_aen {
+ port_id_t port_id;
+ uint32_t key_type; /* Tx (1) or RX (2) */
+ uint32_t status; /* 0 succes, 1 failed, 2 timeout , 3 error */
+- uint8_t reserved[16];
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ #define QL_VND_SA_STAT_SUCCESS 0
+@@ -212,9 +237,22 @@ struct auth_complete_cmd {
+ uint8_t wwpn[WWN_SIZE];
+ port_id_t d_id;
+ } u;
+- uint32_t reserved[VND_CMD_APP_RESERVED_SIZE];
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
++} __packed;
++
++struct aen_complete_cmd {
++ struct app_id app_info;
++ port_id_t port_id;
++ uint32_t event_code;
++ uint8_t version;
++ uint8_t pad[VND_CMD_PAD_SIZE];
++ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ } __packed;
+
+ #define RX_DELAY_DELETE_TIMEOUT 20
+
++#define FCH_EVT_VENDOR_UNIQUE_VPORT_DOWN 1
++
+ #endif /* QLA_EDIF_BSG_H */
+diff --git a/drivers/scsi/qla2xxx/qla_fw.h b/drivers/scsi/qla2xxx/qla_fw.h
+index 0bb1d562f0bfc..361015b5763ef 100644
+--- a/drivers/scsi/qla2xxx/qla_fw.h
++++ b/drivers/scsi/qla2xxx/qla_fw.h
+@@ -807,7 +807,7 @@ struct els_entry_24xx {
+ #define EPD_ELS_COMMAND (0 << 13)
+ #define EPD_ELS_ACC (1 << 13)
+ #define EPD_ELS_RJT (2 << 13)
+-#define EPD_RX_XCHG (3 << 13)
++#define EPD_RX_XCHG (3 << 13) /* terminate exchange */
+ #define ECF_CLR_PASSTHRU_PEND BIT_12
+ #define ECF_INCL_FRAME_HDR BIT_11
+ #define ECF_SEC_LOGIN BIT_3
+diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
+index dac27b5ff0ac7..2e5b65072b757 100644
+--- a/drivers/scsi/qla2xxx/qla_gbl.h
++++ b/drivers/scsi/qla2xxx/qla_gbl.h
+@@ -335,6 +335,7 @@ extern int qla24xx_configure_prot_mode(srb_t *, uint16_t *);
+ extern int qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha,
+ struct qla_work_evt *e);
+ void qla2x00_sp_release(struct kref *kref);
++void qla2x00_els_dcmd2_iocb_timeout(void *data);
+
+ /*
+ * Global Function Prototypes in qla_mbx.c source file.
+@@ -433,7 +434,8 @@ extern int
+ qla2x00_get_resource_cnts(scsi_qla_host_t *);
+
+ extern int
+-qla2x00_get_fcal_position_map(scsi_qla_host_t *ha, char *pos_map);
++qla2x00_get_fcal_position_map(scsi_qla_host_t *ha, char *pos_map,
++ u8 *num_entries);
+
+ extern int
+ qla2x00_get_link_status(scsi_qla_host_t *, uint16_t, struct link_statistics *,
+@@ -727,7 +729,7 @@ int qla24xx_async_gpsc(scsi_qla_host_t *, fc_port_t *);
+ void qla24xx_handle_gpsc_event(scsi_qla_host_t *, struct event_arg *);
+ int qla2x00_mgmt_svr_login(scsi_qla_host_t *);
+ void qla24xx_handle_gffid_event(scsi_qla_host_t *vha, struct event_arg *ea);
+-int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport);
++int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport, bool);
+ int qla24xx_async_gpnft(scsi_qla_host_t *, u8, srb_t *);
+ void qla24xx_async_gpnft_done(scsi_qla_host_t *, srb_t *);
+ void qla24xx_async_gnnft_done(scsi_qla_host_t *, srb_t *);
+diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
+index e811de2f6a25f..7ca7343370005 100644
+--- a/drivers/scsi/qla2xxx/qla_gs.c
++++ b/drivers/scsi/qla2xxx/qla_gs.c
+@@ -1596,7 +1596,6 @@ qla2x00_hba_attributes(scsi_qla_host_t *vha, void *entries,
+ unsigned int callopt)
+ {
+ struct qla_hw_data *ha = vha->hw;
+- struct init_cb_24xx *icb24 = (void *)ha->init_cb;
+ struct new_utsname *p_sysid = utsname();
+ struct ct_fdmi_hba_attr *eiter;
+ uint16_t alen;
+@@ -1758,8 +1757,8 @@ qla2x00_hba_attributes(scsi_qla_host_t *vha, void *entries,
+ /* MAX CT Payload Length */
+ eiter = entries + size;
+ eiter->type = cpu_to_be16(FDMI_HBA_MAXIMUM_CT_PAYLOAD_LENGTH);
+- eiter->a.max_ct_len = cpu_to_be32(le16_to_cpu(IS_FWI2_CAPABLE(ha) ?
+- icb24->frame_payload_size : ha->init_cb->frame_payload_size));
++ eiter->a.max_ct_len = cpu_to_be32(ha->frame_payload_size >> 2);
++
+ alen = sizeof(eiter->a.max_ct_len);
+ alen += FDMI_ATTR_TYPELEN(eiter);
+ eiter->len = cpu_to_be16(alen);
+@@ -1851,7 +1850,6 @@ qla2x00_port_attributes(scsi_qla_host_t *vha, void *entries,
+ unsigned int callopt)
+ {
+ struct qla_hw_data *ha = vha->hw;
+- struct init_cb_24xx *icb24 = (void *)ha->init_cb;
+ struct new_utsname *p_sysid = utsname();
+ char *hostname = p_sysid ?
+ p_sysid->nodename : fc_host_system_hostname(vha->host);
+@@ -1903,8 +1901,7 @@ qla2x00_port_attributes(scsi_qla_host_t *vha, void *entries,
+ /* Max frame size. */
+ eiter = entries + size;
+ eiter->type = cpu_to_be16(FDMI_PORT_MAX_FRAME_SIZE);
+- eiter->a.max_frame_size = cpu_to_be32(le16_to_cpu(IS_FWI2_CAPABLE(ha) ?
+- icb24->frame_payload_size : ha->init_cb->frame_payload_size));
++ eiter->a.max_frame_size = cpu_to_be32(ha->frame_payload_size);
+ alen = sizeof(eiter->a.max_frame_size);
+ alen += FDMI_ATTR_TYPELEN(eiter);
+ eiter->len = cpu_to_be16(alen);
+@@ -3280,19 +3277,12 @@ done:
+ return rval;
+ }
+
+-void qla24xx_handle_gffid_event(scsi_qla_host_t *vha, struct event_arg *ea)
+-{
+- fc_port_t *fcport = ea->fcport;
+-
+- qla24xx_post_gnl_work(vha, fcport);
+-}
+
+ void qla24xx_async_gffid_sp_done(srb_t *sp, int res)
+ {
+ struct scsi_qla_host *vha = sp->vha;
+ fc_port_t *fcport = sp->fcport;
+ struct ct_sns_rsp *ct_rsp;
+- struct event_arg ea;
+ uint8_t fc4_scsi_feat;
+ uint8_t fc4_nvme_feat;
+
+@@ -3300,10 +3290,10 @@ void qla24xx_async_gffid_sp_done(srb_t *sp, int res)
+ "Async done-%s res %x ID %x. %8phC\n",
+ sp->name, res, fcport->d_id.b24, fcport->port_name);
+
+- fcport->flags &= ~FCF_ASYNC_SENT;
+- ct_rsp = &fcport->ct_desc.ct_sns->p.rsp;
++ ct_rsp = sp->u.iocb_cmd.u.ctarg.rsp;
+ fc4_scsi_feat = ct_rsp->rsp.gff_id.fc4_features[GFF_FCP_SCSI_OFFSET];
+ fc4_nvme_feat = ct_rsp->rsp.gff_id.fc4_features[GFF_NVME_OFFSET];
++ sp->rc = res;
+
+ /*
+ * FC-GS-7, 5.2.3.12 FC-4 Features - format
+@@ -3324,24 +3314,42 @@ void qla24xx_async_gffid_sp_done(srb_t *sp, int res)
+ }
+ }
+
+- memset(&ea, 0, sizeof(ea));
+- ea.sp = sp;
+- ea.fcport = sp->fcport;
+- ea.rc = res;
++ if (sp->flags & SRB_WAKEUP_ON_COMP) {
++ complete(sp->comp);
++ } else {
++ if (sp->u.iocb_cmd.u.ctarg.req) {
++ dma_free_coherent(&vha->hw->pdev->dev,
++ sp->u.iocb_cmd.u.ctarg.req_allocated_size,
++ sp->u.iocb_cmd.u.ctarg.req,
++ sp->u.iocb_cmd.u.ctarg.req_dma);
++ sp->u.iocb_cmd.u.ctarg.req = NULL;
++ }
+
+- qla24xx_handle_gffid_event(vha, &ea);
+- /* ref: INIT */
+- kref_put(&sp->cmd_kref, qla2x00_sp_release);
++ if (sp->u.iocb_cmd.u.ctarg.rsp) {
++ dma_free_coherent(&vha->hw->pdev->dev,
++ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size,
++ sp->u.iocb_cmd.u.ctarg.rsp,
++ sp->u.iocb_cmd.u.ctarg.rsp_dma);
++ sp->u.iocb_cmd.u.ctarg.rsp = NULL;
++ }
++
++ /* ref: INIT */
++ kref_put(&sp->cmd_kref, qla2x00_sp_release);
++ /* we should not be here */
++ dump_stack();
++ }
+ }
+
+ /* Get FC4 Feature with Nport ID. */
+-int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport)
++int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport, bool wait)
+ {
+ int rval = QLA_FUNCTION_FAILED;
+ struct ct_sns_req *ct_req;
+ srb_t *sp;
++ DECLARE_COMPLETION_ONSTACK(comp);
+
+- if (!vha->flags.online || (fcport->flags & FCF_ASYNC_SENT))
++ /* this routine does not have handling for no wait */
++ if (!vha->flags.online || !wait)
+ return rval;
+
+ /* ref: INIT */
+@@ -3349,43 +3357,86 @@ int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport)
+ if (!sp)
+ return rval;
+
+- fcport->flags |= FCF_ASYNC_SENT;
+ sp->type = SRB_CT_PTHRU_CMD;
+ sp->name = "gffid";
+ sp->gen1 = fcport->rscn_gen;
+ sp->gen2 = fcport->login_gen;
+ qla2x00_init_async_sp(sp, qla2x00_get_async_timeout(vha) + 2,
+ qla24xx_async_gffid_sp_done);
++ sp->comp = ∁
++ sp->u.iocb_cmd.timeout = qla2x00_els_dcmd2_iocb_timeout;
++
++ if (wait)
++ sp->flags = SRB_WAKEUP_ON_COMP;
++
++ sp->u.iocb_cmd.u.ctarg.req_allocated_size = sizeof(struct ct_sns_pkt);
++ sp->u.iocb_cmd.u.ctarg.req = dma_alloc_coherent(&vha->hw->pdev->dev,
++ sp->u.iocb_cmd.u.ctarg.req_allocated_size,
++ &sp->u.iocb_cmd.u.ctarg.req_dma,
++ GFP_KERNEL);
++ if (!sp->u.iocb_cmd.u.ctarg.req) {
++ ql_log(ql_log_warn, vha, 0xd041,
++ "%s: Failed to allocate ct_sns request.\n",
++ __func__);
++ goto done_free_sp;
++ }
++
++ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size = sizeof(struct ct_sns_pkt);
++ sp->u.iocb_cmd.u.ctarg.rsp = dma_alloc_coherent(&vha->hw->pdev->dev,
++ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size,
++ &sp->u.iocb_cmd.u.ctarg.rsp_dma,
++ GFP_KERNEL);
++ if (!sp->u.iocb_cmd.u.ctarg.rsp) {
++ ql_log(ql_log_warn, vha, 0xd041,
++ "%s: Failed to allocate ct_sns response.\n",
++ __func__);
++ goto done_free_sp;
++ }
+
+ /* CT_IU preamble */
+- ct_req = qla2x00_prep_ct_req(fcport->ct_desc.ct_sns, GFF_ID_CMD,
+- GFF_ID_RSP_SIZE);
++ ct_req = qla2x00_prep_ct_req(sp->u.iocb_cmd.u.ctarg.req, GFF_ID_CMD, GFF_ID_RSP_SIZE);
+
+ ct_req->req.gff_id.port_id[0] = fcport->d_id.b.domain;
+ ct_req->req.gff_id.port_id[1] = fcport->d_id.b.area;
+ ct_req->req.gff_id.port_id[2] = fcport->d_id.b.al_pa;
+
+- sp->u.iocb_cmd.u.ctarg.req = fcport->ct_desc.ct_sns;
+- sp->u.iocb_cmd.u.ctarg.req_dma = fcport->ct_desc.ct_sns_dma;
+- sp->u.iocb_cmd.u.ctarg.rsp = fcport->ct_desc.ct_sns;
+- sp->u.iocb_cmd.u.ctarg.rsp_dma = fcport->ct_desc.ct_sns_dma;
+ sp->u.iocb_cmd.u.ctarg.req_size = GFF_ID_REQ_SIZE;
+ sp->u.iocb_cmd.u.ctarg.rsp_size = GFF_ID_RSP_SIZE;
+ sp->u.iocb_cmd.u.ctarg.nport_handle = NPH_SNS;
+
+- ql_dbg(ql_dbg_disc, vha, 0x2132,
+- "Async-%s hdl=%x %8phC.\n", sp->name,
+- sp->handle, fcport->port_name);
+-
+ rval = qla2x00_start_sp(sp);
+- if (rval != QLA_SUCCESS)
++
++ if (rval != QLA_SUCCESS) {
++ rval = QLA_FUNCTION_FAILED;
+ goto done_free_sp;
++ } else {
++ ql_dbg(ql_dbg_disc, vha, 0x3074,
++ "Async-%s hdl=%x portid %06x\n",
++ sp->name, sp->handle, fcport->d_id.b24);
++ }
++
++ wait_for_completion(sp->comp);
++ rval = sp->rc;
+
+- return rval;
+ done_free_sp:
++ if (sp->u.iocb_cmd.u.ctarg.req) {
++ dma_free_coherent(&vha->hw->pdev->dev,
++ sp->u.iocb_cmd.u.ctarg.req_allocated_size,
++ sp->u.iocb_cmd.u.ctarg.req,
++ sp->u.iocb_cmd.u.ctarg.req_dma);
++ sp->u.iocb_cmd.u.ctarg.req = NULL;
++ }
++
++ if (sp->u.iocb_cmd.u.ctarg.rsp) {
++ dma_free_coherent(&vha->hw->pdev->dev,
++ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size,
++ sp->u.iocb_cmd.u.ctarg.rsp,
++ sp->u.iocb_cmd.u.ctarg.rsp_dma);
++ sp->u.iocb_cmd.u.ctarg.rsp = NULL;
++ }
++
+ /* ref: INIT */
+ kref_put(&sp->cmd_kref, qla2x00_sp_release);
+- fcport->flags &= ~FCF_ASYNC_SENT;
+ return rval;
+ }
+
+@@ -3578,7 +3629,7 @@ login_logout:
+ do_delete) {
+ if (fcport->loop_id != FC_NO_LOOP_ID) {
+ if (fcport->flags & FCF_FCP2_DEVICE)
+- fcport->logout_on_delete = 0;
++ continue;
+
+ ql_log(ql_log_warn, vha, 0x20f0,
+ "%s %d %8phC post del sess\n",
+diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
+index 3f3417a3e8911..51503a316b10f 100644
+--- a/drivers/scsi/qla2xxx/qla_init.c
++++ b/drivers/scsi/qla2xxx/qla_init.c
+@@ -47,6 +47,7 @@ qla2x00_sp_timeout(struct timer_list *t)
+ {
+ srb_t *sp = from_timer(sp, t, u.iocb_cmd.timer);
+ struct srb_iocb *iocb;
++ scsi_qla_host_t *vha = sp->vha;
+
+ WARN_ON(irqs_disabled());
+ iocb = &sp->u.iocb_cmd;
+@@ -54,6 +55,12 @@ qla2x00_sp_timeout(struct timer_list *t)
+
+ /* ref: TMR */
+ kref_put(&sp->cmd_kref, qla2x00_sp_release);
++
++ if (vha && qla2x00_isp_reg_stat(vha->hw)) {
++ ql_log(ql_log_info, vha, 0x9008,
++ "PCI/Register disconnect.\n");
++ qla_pci_set_eeh_busy(vha);
++ }
+ }
+
+ void qla2x00_sp_free(srb_t *sp)
+@@ -161,6 +168,7 @@ int qla24xx_async_abort_cmd(srb_t *cmd_sp, bool wait)
+ struct srb_iocb *abt_iocb;
+ srb_t *sp;
+ int rval = QLA_FUNCTION_FAILED;
++ uint8_t bail;
+
+ /* ref: INIT for ABTS command */
+ sp = qla2xxx_get_qpair_sp(cmd_sp->vha, cmd_sp->qpair, cmd_sp->fcport,
+@@ -168,6 +176,7 @@ int qla24xx_async_abort_cmd(srb_t *cmd_sp, bool wait)
+ if (!sp)
+ return QLA_MEMORY_ALLOC_FAILED;
+
++ QLA_VHA_MARK_BUSY(vha, bail);
+ abt_iocb = &sp->u.iocb_cmd;
+ sp->type = SRB_ABT_CMD;
+ sp->name = "abort";
+@@ -1480,7 +1489,6 @@ static int qla_chk_secure_login(scsi_qla_host_t *vha, fc_port_t *fcport,
+ ql_dbg(ql_dbg_disc, vha, 0x20ef,
+ "%s %d %8phC EDIF: post DB_AUTH: AUTH needed\n",
+ __func__, __LINE__, fcport->port_name);
+- fcport->edif.app_started = 1;
+ fcport->edif.app_sess_online = 1;
+
+ qla_edb_eventcreate(vha, VND_CMD_AUTH_STATE_NEEDED,
+@@ -1763,8 +1771,16 @@ int qla24xx_fcport_handle_login(struct scsi_qla_host *vha, fc_port_t *fcport)
+ break;
+
+ case DSC_LOGIN_PEND:
+- if (fcport->fw_login_state == DSC_LS_PLOGI_COMP)
++ if (vha->hw->flags.edif_enabled)
++ break;
++
++ if (fcport->fw_login_state == DSC_LS_PLOGI_COMP) {
++ ql_dbg(ql_dbg_disc, vha, 0x2118,
++ "%s %d %8phC post %s PRLI\n",
++ __func__, __LINE__, fcport->port_name,
++ NVME_TARGET(vha->hw, fcport) ? "NVME" : "FC");
+ qla24xx_post_prli_work(vha, fcport);
++ }
+ break;
+
+ case DSC_UPD_FCPORT:
+@@ -1818,7 +1834,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
+ case RSCN_PORT_ADDR:
+ fcport = qla2x00_find_fcport_by_nportid(vha, &ea->id, 1);
+ if (fcport) {
+- if (fcport->flags & FCF_FCP2_DEVICE) {
++ if (fcport->flags & FCF_FCP2_DEVICE &&
++ atomic_read(&fcport->state) == FCS_ONLINE) {
+ ql_dbg(ql_dbg_disc, vha, 0x2115,
+ "Delaying session delete for FCP2 portid=%06x %8phC ",
+ fcport->d_id.b24, fcport->port_name);
+@@ -1850,7 +1867,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
+ break;
+ case RSCN_AREA_ADDR:
+ list_for_each_entry(fcport, &vha->vp_fcports, list) {
+- if (fcport->flags & FCF_FCP2_DEVICE)
++ if (fcport->flags & FCF_FCP2_DEVICE &&
++ atomic_read(&fcport->state) == FCS_ONLINE)
+ continue;
+
+ if ((ea->id.b24 & 0xffff00) == (fcport->d_id.b24 & 0xffff00)) {
+@@ -1861,7 +1879,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
+ break;
+ case RSCN_DOM_ADDR:
+ list_for_each_entry(fcport, &vha->vp_fcports, list) {
+- if (fcport->flags & FCF_FCP2_DEVICE)
++ if (fcport->flags & FCF_FCP2_DEVICE &&
++ atomic_read(&fcport->state) == FCS_ONLINE)
+ continue;
+
+ if ((ea->id.b24 & 0xff0000) == (fcport->d_id.b24 & 0xff0000)) {
+@@ -1873,7 +1892,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
+ case RSCN_FAB_ADDR:
+ default:
+ list_for_each_entry(fcport, &vha->vp_fcports, list) {
+- if (fcport->flags & FCF_FCP2_DEVICE)
++ if (fcport->flags & FCF_FCP2_DEVICE &&
++ atomic_read(&fcport->state) == FCS_ONLINE)
+ continue;
+
+ fcport->scan_needed = 1;
+@@ -2000,12 +2020,14 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint32_t lun,
+ struct srb_iocb *tm_iocb;
+ srb_t *sp;
+ int rval = QLA_FUNCTION_FAILED;
++ uint8_t bail;
+
+ /* ref: INIT */
+ sp = qla2x00_get_sp(vha, fcport, GFP_KERNEL);
+ if (!sp)
+ goto done;
+
++ QLA_VHA_MARK_BUSY(vha, bail);
+ sp->type = SRB_TM_CMD;
+ sp->name = "tmf";
+ qla2x00_init_async_sp(sp, qla2x00_get_async_timeout(vha),
+@@ -2124,6 +2146,13 @@ qla24xx_handle_prli_done_event(struct scsi_qla_host *vha, struct event_arg *ea)
+ }
+
+ if (N2N_TOPO(vha->hw)) {
++ if (ea->fcport->n2n_link_reset_cnt ==
++ vha->hw->login_retry_count &&
++ ea->fcport->flags & FCF_FCSP_DEVICE) {
++ /* remote authentication app just started */
++ ea->fcport->n2n_link_reset_cnt = 0;
++ }
++
+ if (ea->fcport->n2n_link_reset_cnt <
+ vha->hw->login_retry_count) {
+ ea->fcport->n2n_link_reset_cnt++;
+@@ -4509,6 +4538,8 @@ qla2x00_init_rings(scsi_qla_host_t *vha)
+ BIT_6) != 0;
+ ql_dbg(ql_dbg_init, vha, 0x00bc, "FA-WWPN Support: %s.\n",
+ (ha->flags.fawwpn_enabled) ? "enabled" : "disabled");
++ /* Init_cb will be reused for other command(s). Save a backup copy of port_name */
++ memcpy(ha->port_name, ha->init_cb->port_name, WWN_SIZE);
+ }
+
+ /* ELS pass through payload is limit by frame size. */
+@@ -5273,9 +5304,6 @@ qla2x00_alloc_fcport(scsi_qla_host_t *vha, gfp_t flags)
+ INIT_LIST_HEAD(&fcport->edif.tx_sa_list);
+ INIT_LIST_HEAD(&fcport->edif.rx_sa_list);
+
+- if (vha->e_dbell.db_flags == EDB_ACTIVE)
+- fcport->edif.app_started = 1;
+-
+ spin_lock_init(&fcport->edif.indx_list_lock);
+ INIT_LIST_HEAD(&fcport->edif.edif_indx_list);
+
+@@ -5488,6 +5516,22 @@ static int qla2x00_configure_n2n_loop(scsi_qla_host_t *vha)
+ return QLA_FUNCTION_FAILED;
+ }
+
++static void
++qla_reinitialize_link(scsi_qla_host_t *vha)
++{
++ int rval;
++
++ atomic_set(&vha->loop_state, LOOP_DOWN);
++ atomic_set(&vha->loop_down_timer, LOOP_DOWN_TIME);
++ rval = qla2x00_full_login_lip(vha);
++ if (rval == QLA_SUCCESS) {
++ ql_dbg(ql_dbg_disc, vha, 0xd050, "Link reinitialized\n");
++ } else {
++ ql_dbg(ql_dbg_disc, vha, 0xd051,
++ "Link reinitialization failed (%d)\n", rval);
++ }
++}
++
+ /*
+ * qla2x00_configure_local_loop
+ * Updates Fibre Channel Device Database with local loop devices.
+@@ -5539,6 +5583,19 @@ qla2x00_configure_local_loop(scsi_qla_host_t *vha)
+ spin_unlock_irqrestore(&vha->work_lock, flags);
+
+ if (vha->scan.scan_retry < MAX_SCAN_RETRIES) {
++ u8 loop_map_entries = 0;
++ int rc;
++
++ rc = qla2x00_get_fcal_position_map(vha, NULL,
++ &loop_map_entries);
++ if (rc == QLA_SUCCESS && loop_map_entries > 1) {
++ /*
++ * There are devices that are still not logged
++ * in. Reinitialize to give them a chance.
++ */
++ qla_reinitialize_link(vha);
++ return QLA_FUNCTION_FAILED;
++ }
+ set_bit(LOCAL_LOOP_UPDATE, &vha->dpc_flags);
+ set_bit(LOOP_RESYNC_NEEDED, &vha->dpc_flags);
+ }
+@@ -5767,8 +5824,6 @@ qla2x00_reg_remote_port(scsi_qla_host_t *vha, fc_port_t *fcport)
+ if (atomic_read(&fcport->state) == FCS_ONLINE)
+ return;
+
+- qla2x00_set_fcport_state(fcport, FCS_ONLINE);
+-
+ rport_ids.node_name = wwn_to_u64(fcport->node_name);
+ rport_ids.port_name = wwn_to_u64(fcport->port_name);
+ rport_ids.port_id = fcport->d_id.b.domain << 16 |
+@@ -5869,7 +5924,6 @@ qla2x00_update_fcport(scsi_qla_host_t *vha, fc_port_t *fcport)
+ qla2x00_reg_remote_port(vha, fcport);
+ break;
+ case MODE_TARGET:
+- qla2x00_set_fcport_state(fcport, FCS_ONLINE);
+ if (!vha->vha_tgt.qla_tgt->tgt_stop &&
+ !vha->vha_tgt.qla_tgt->tgt_stopped)
+ qlt_fc_port_added(vha, fcport);
+@@ -5887,6 +5941,8 @@ qla2x00_update_fcport(scsi_qla_host_t *vha, fc_port_t *fcport)
+ if (NVME_TARGET(vha->hw, fcport))
+ qla_nvme_register_remote(vha, fcport);
+
++ qla2x00_set_fcport_state(fcport, FCS_ONLINE);
++
+ if (IS_IIDMA_CAPABLE(vha->hw) && vha->hw->flags.gpsc_supported) {
+ if (fcport->id_changed) {
+ fcport->id_changed = 0;
+@@ -9657,6 +9713,12 @@ int qla2xxx_disable_port(struct Scsi_Host *host)
+
+ vha->hw->flags.port_isolated = 1;
+
++ if (qla2x00_isp_reg_stat(vha->hw)) {
++ ql_log(ql_log_info, vha, 0x9006,
++ "PCI/Register disconnect, exiting.\n");
++ qla_pci_set_eeh_busy(vha);
++ return FAILED;
++ }
+ if (qla2x00_chip_is_down(vha))
+ return 0;
+
+@@ -9672,6 +9734,13 @@ int qla2xxx_enable_port(struct Scsi_Host *host)
+ {
+ scsi_qla_host_t *vha = shost_priv(host);
+
++ if (qla2x00_isp_reg_stat(vha->hw)) {
++ ql_log(ql_log_info, vha, 0x9001,
++ "PCI/Register disconnect, exiting.\n");
++ qla_pci_set_eeh_busy(vha);
++ return FAILED;
++ }
++
+ vha->hw->flags.port_isolated = 0;
+ /* Set the flag to 1, so that isp_abort can proceed */
+ vha->flags.online = 1;
+diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
+index e0fe9ddb4bd2c..42ce4e1fe7441 100644
+--- a/drivers/scsi/qla2xxx/qla_iocb.c
++++ b/drivers/scsi/qla2xxx/qla_iocb.c
+@@ -2819,7 +2819,7 @@ qla24xx_els_logo_iocb(srb_t *sp, struct els_entry_24xx *els_iocb)
+ sp->vha->qla_stats.control_requests++;
+ }
+
+-static void
++void
+ qla2x00_els_dcmd2_iocb_timeout(void *data)
+ {
+ srb_t *sp = data;
+@@ -2882,6 +2882,9 @@ static void qla2x00_els_dcmd2_sp_done(srb_t *sp, int res)
+ sp->name, res, sp->handle, fcport->d_id.b24, fcport->port_name);
+
+ fcport->flags &= ~(FCF_ASYNC_SENT|FCF_ASYNC_ACTIVE);
++ /* For edif, set logout on delete to ensure any residual key from FW is flushed.*/
++ fcport->logout_on_delete = 1;
++ fcport->chip_reset = vha->hw->base_qpair->chip_reset;
+
+ if (sp->flags & SRB_WAKEUP_ON_COMP)
+ complete(&lio->u.els_plogi.comp);
+diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
+index 21b31d6359c8a..de348628aa535 100644
+--- a/drivers/scsi/qla2xxx/qla_isr.c
++++ b/drivers/scsi/qla2xxx/qla_isr.c
+@@ -1354,9 +1354,7 @@ skip_rio:
+ if (!vha->vp_idx) {
+ if (ha->flags.fawwpn_enabled &&
+ (ha->current_topology == ISP_CFG_F)) {
+- void *wwpn = ha->init_cb->port_name;
+-
+- memcpy(vha->port_name, wwpn, WWN_SIZE);
++ memcpy(vha->port_name, ha->port_name, WWN_SIZE);
+ fc_host_port_name(vha->host) =
+ wwn_to_u64(vha->port_name);
+ ql_dbg(ql_dbg_init + ql_dbg_verbose,
+@@ -2639,7 +2637,7 @@ static void qla24xx_nvme_iocb_entry(scsi_qla_host_t *vha, struct req_que *req,
+ }
+
+ if (unlikely(logit))
+- ql_log(ql_dbg_io, fcport->vha, 0x5060,
++ ql_dbg(ql_dbg_io, fcport->vha, 0x5060,
+ "NVME-%s ERR Handling - hdl=%x status(%x) tr_len:%x resid=%x ox_id=%x\n",
+ sp->name, sp->handle, comp_status,
+ fd->transferred_length, le32_to_cpu(sts->residual_len),
+@@ -3426,6 +3424,7 @@ check_scsi_status:
+ case CS_PORT_UNAVAILABLE:
+ case CS_TIMEOUT:
+ case CS_RESET:
++ case CS_EDIF_INV_REQ:
+
+ /*
+ * We are going to have the fc class block the rport
+@@ -3496,7 +3495,7 @@ check_scsi_status:
+
+ out:
+ if (logit)
+- ql_log(ql_dbg_io, fcport->vha, 0x3022,
++ ql_dbg(ql_dbg_io, fcport->vha, 0x3022,
+ "FCP command status: 0x%x-0x%x (0x%x) nexus=%ld:%d:%llu portid=%02x%02x%02x oxid=0x%x cdb=%10phN len=0x%x rsp_info=0x%x resid=0x%x fw_resid=0x%x sp=%p cp=%p.\n",
+ comp_status, scsi_status, res, vha->host_no,
+ cp->device->id, cp->device->lun, fcport->d_id.b.domain,
+@@ -4420,16 +4419,12 @@ msix_register_fail:
+ }
+
+ /* Enable MSI-X vector for response queue update for queue 0 */
+- if (IS_QLA83XX(ha) || IS_QLA27XX(ha) || IS_QLA28XX(ha)) {
+- if (ha->msixbase && ha->mqiobase &&
+- (ha->max_rsp_queues > 1 || ha->max_req_queues > 1 ||
+- ql2xmqsupport))
+- ha->mqenable = 1;
+- } else
+- if (ha->mqiobase &&
+- (ha->max_rsp_queues > 1 || ha->max_req_queues > 1 ||
+- ql2xmqsupport))
+- ha->mqenable = 1;
++ if (IS_MQUE_CAPABLE(ha) &&
++ (ha->msixbase && ha->mqiobase && ha->max_qpairs))
++ ha->mqenable = 1;
++ else
++ ha->mqenable = 0;
++
+ ql_dbg(ql_dbg_multiq, vha, 0xc005,
+ "mqiobase=%p, max_rsp_queues=%d, max_req_queues=%d.\n",
+ ha->mqiobase, ha->max_rsp_queues, ha->max_req_queues);
+diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
+index 892caf2475dff..86d8c455c07ab 100644
+--- a/drivers/scsi/qla2xxx/qla_mbx.c
++++ b/drivers/scsi/qla2xxx/qla_mbx.c
+@@ -238,6 +238,8 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
+ ql_dbg(ql_dbg_mbx, vha, 0x1112,
+ "mbox[%d]<-0x%04x\n", cnt, *iptr);
+ wrt_reg_word(optr, *iptr);
++ } else {
++ wrt_reg_word(optr, 0);
+ }
+
+ mboxes >>= 1;
+@@ -274,6 +276,12 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
+ atomic_inc(&ha->num_pend_mbx_stage3);
+ if (!wait_for_completion_timeout(&ha->mbx_intr_comp,
+ mcp->tov * HZ)) {
++ ql_dbg(ql_dbg_mbx, vha, 0x117a,
++ "cmd=%x Timeout.\n", command);
++ spin_lock_irqsave(&ha->hardware_lock, flags);
++ clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags);
++ spin_unlock_irqrestore(&ha->hardware_lock, flags);
++
+ if (chip_reset != ha->chip_reset) {
+ eeh_delay = ha->flags.eeh_busy ? 1 : 0;
+
+@@ -286,12 +294,6 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
+ rval = QLA_ABORTED;
+ goto premature_exit;
+ }
+- ql_dbg(ql_dbg_mbx, vha, 0x117a,
+- "cmd=%x Timeout.\n", command);
+- spin_lock_irqsave(&ha->hardware_lock, flags);
+- clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags);
+- spin_unlock_irqrestore(&ha->hardware_lock, flags);
+-
+ } else if (ha->flags.purge_mbox ||
+ chip_reset != ha->chip_reset) {
+ eeh_delay = ha->flags.eeh_busy ? 1 : 0;
+@@ -3066,7 +3068,8 @@ qla2x00_get_resource_cnts(scsi_qla_host_t *vha)
+ * Kernel context.
+ */
+ int
+-qla2x00_get_fcal_position_map(scsi_qla_host_t *vha, char *pos_map)
++qla2x00_get_fcal_position_map(scsi_qla_host_t *vha, char *pos_map,
++ u8 *num_entries)
+ {
+ int rval;
+ mbx_cmd_t mc;
+@@ -3106,6 +3109,8 @@ qla2x00_get_fcal_position_map(scsi_qla_host_t *vha, char *pos_map)
+
+ if (pos_map)
+ memcpy(pos_map, pmap, FCAL_MAP_SIZE);
++ if (num_entries)
++ *num_entries = pmap[0];
+ }
+ dma_pool_free(ha->s_dma_pool, pmap, pmap_dma);
+
+diff --git a/drivers/scsi/qla2xxx/qla_mid.c b/drivers/scsi/qla2xxx/qla_mid.c
+index e6b5c4ccce97b..eb43a5f1b3992 100644
+--- a/drivers/scsi/qla2xxx/qla_mid.c
++++ b/drivers/scsi/qla2xxx/qla_mid.c
+@@ -166,9 +166,13 @@ qla24xx_disable_vp(scsi_qla_host_t *vha)
+ int ret = QLA_SUCCESS;
+ fc_port_t *fcport;
+
+- if (vha->hw->flags.edif_enabled)
++ if (vha->hw->flags.edif_enabled) {
++ if (DBELL_ACTIVE(vha))
++ qla2x00_post_aen_work(vha, FCH_EVT_VENDOR_UNIQUE,
++ FCH_EVT_VENDOR_UNIQUE_VPORT_DOWN);
+ /* delete sessions and flush sa_indexes */
+ qla2x00_wait_for_sess_deletion(vha);
++ }
+
+ if (vha->hw->flags.fw_started)
+ ret = qla24xx_control_vp(vha, VCE_COMMAND_DISABLE_VPS_LOGO_ALL);
+diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
+index 87c9404aa4018..7450c3458be7e 100644
+--- a/drivers/scsi/qla2xxx/qla_nvme.c
++++ b/drivers/scsi/qla2xxx/qla_nvme.c
+@@ -37,11 +37,6 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport)
+ (fcport->nvme_flag & NVME_FLAG_REGISTERED))
+ return 0;
+
+- if (atomic_read(&fcport->state) == FCS_ONLINE)
+- return 0;
+-
+- qla2x00_set_fcport_state(fcport, FCS_ONLINE);
+-
+ fcport->nvme_flag &= ~NVME_FLAG_RESETTING;
+
+ memset(&req, 0, sizeof(struct nvme_fc_port_info));
+diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
+index 762229d495a8c..f9ad0847782d3 100644
+--- a/drivers/scsi/qla2xxx/qla_os.c
++++ b/drivers/scsi/qla2xxx/qla_os.c
+@@ -333,6 +333,11 @@ MODULE_PARM_DESC(ql2xabts_wait_nvme,
+ "To wait for ABTS response on I/O timeouts for NVMe. (default: 1)");
+
+
++u32 ql2xdelay_before_pci_error_handling = 5;
++module_param(ql2xdelay_before_pci_error_handling, uint, 0644);
++MODULE_PARM_DESC(ql2xdelay_before_pci_error_handling,
++ "Number of seconds delayed before qla begin PCI error self-handling (default: 5).\n");
++
+ static void qla2x00_clear_drv_active(struct qla_hw_data *);
+ static void qla2x00_free_device(scsi_qla_host_t *);
+ static int qla2xxx_map_queues(struct Scsi_Host *shost);
+@@ -1337,21 +1342,20 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
+ /*
+ * Returns: QLA_SUCCESS or QLA_FUNCTION_FAILED.
+ */
+-int
+-qla2x00_eh_wait_for_pending_commands(scsi_qla_host_t *vha, unsigned int t,
+- uint64_t l, enum nexus_wait_type type)
++static int
++__qla2x00_eh_wait_for_pending_commands(struct qla_qpair *qpair, unsigned int t,
++ uint64_t l, enum nexus_wait_type type)
+ {
+ int cnt, match, status;
+ unsigned long flags;
+- struct qla_hw_data *ha = vha->hw;
+- struct req_que *req;
++ scsi_qla_host_t *vha = qpair->vha;
++ struct req_que *req = qpair->req;
+ srb_t *sp;
+ struct scsi_cmnd *cmd;
+
+ status = QLA_SUCCESS;
+
+- spin_lock_irqsave(&ha->hardware_lock, flags);
+- req = vha->req;
++ spin_lock_irqsave(qpair->qp_lock_ptr, flags);
+ for (cnt = 1; status == QLA_SUCCESS &&
+ cnt < req->num_outstanding_cmds; cnt++) {
+ sp = req->outstanding_cmds[cnt];
+@@ -1378,15 +1382,35 @@ qla2x00_eh_wait_for_pending_commands(scsi_qla_host_t *vha, unsigned int t,
+ if (!match)
+ continue;
+
+- spin_unlock_irqrestore(&ha->hardware_lock, flags);
++ spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
+ status = qla2x00_eh_wait_on_command(cmd);
+- spin_lock_irqsave(&ha->hardware_lock, flags);
++ spin_lock_irqsave(qpair->qp_lock_ptr, flags);
+ }
+- spin_unlock_irqrestore(&ha->hardware_lock, flags);
++ spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
+
+ return status;
+ }
+
++int
++qla2x00_eh_wait_for_pending_commands(scsi_qla_host_t *vha, unsigned int t,
++ uint64_t l, enum nexus_wait_type type)
++{
++ struct qla_qpair *qpair;
++ struct qla_hw_data *ha = vha->hw;
++ int i, status = QLA_SUCCESS;
++
++ status = __qla2x00_eh_wait_for_pending_commands(ha->base_qpair, t, l,
++ type);
++ for (i = 0; status == QLA_SUCCESS && i < ha->max_qpairs; i++) {
++ qpair = ha->queue_pair_map[i];
++ if (!qpair)
++ continue;
++ status = __qla2x00_eh_wait_for_pending_commands(qpair, t, l,
++ type);
++ }
++ return status;
++}
++
+ static char *reset_errors[] = {
+ "HBA not online",
+ "HBA not ready",
+@@ -1420,7 +1444,7 @@ qla2xxx_eh_device_reset(struct scsi_cmnd *cmd)
+ return err;
+
+ if (fcport->deleted)
+- return SUCCESS;
++ return FAILED;
+
+ ql_log(ql_log_info, vha, 0x8009,
+ "DEVICE RESET ISSUED nexus=%ld:%d:%llu cmd=%p.\n", vha->host_no,
+@@ -1488,7 +1512,7 @@ qla2xxx_eh_target_reset(struct scsi_cmnd *cmd)
+ return err;
+
+ if (fcport->deleted)
+- return SUCCESS;
++ return FAILED;
+
+ ql_log(ql_log_info, vha, 0x8009,
+ "TARGET RESET ISSUED nexus=%ld:%d cmd=%p.\n", vha->host_no,
+@@ -5473,7 +5497,7 @@ qla2x00_do_work(struct scsi_qla_host *vha)
+ e->u.fcport.fcport, false);
+ break;
+ case QLA_EVT_SA_REPLACE:
+- qla24xx_issue_sa_replace_iocb(vha, e);
++ rc = qla24xx_issue_sa_replace_iocb(vha, e);
+ break;
+ }
+
+@@ -7239,6 +7263,44 @@ static void qla_heart_beat(struct scsi_qla_host *vha, u16 dpc_started)
+ }
+ }
+
++static void qla_wind_down_chip(scsi_qla_host_t *vha)
++{
++ struct qla_hw_data *ha = vha->hw;
++
++ if (!ha->flags.eeh_busy)
++ return;
++ if (ha->pci_error_state)
++ /* system is trying to recover */
++ return;
++
++ /*
++ * Current system is not handling PCIE error. At this point, this is
++ * best effort to wind down the adapter.
++ */
++ if (time_after_eq(jiffies, ha->eeh_jif + ql2xdelay_before_pci_error_handling * HZ) &&
++ !ha->flags.eeh_flush) {
++ ql_log(ql_log_info, vha, 0x9009,
++ "PCI Error detected, attempting to reset hardware.\n");
++
++ ha->isp_ops->reset_chip(vha);
++ ha->isp_ops->disable_intrs(ha);
++
++ ha->flags.eeh_flush = EEH_FLUSH_RDY;
++ ha->eeh_jif = jiffies;
++
++ } else if (ha->flags.eeh_flush == EEH_FLUSH_RDY &&
++ time_after_eq(jiffies, ha->eeh_jif + 5 * HZ)) {
++ pci_clear_master(ha->pdev);
++
++ /* flush all command */
++ qla2x00_abort_isp_cleanup(vha);
++ ha->flags.eeh_flush = EEH_FLUSH_DONE;
++
++ ql_log(ql_log_info, vha, 0x900a,
++ "PCI Error handling complete, all IOs aborted.\n");
++ }
++}
++
+ /**************************************************************************
+ * qla2x00_timer
+ *
+@@ -7262,6 +7324,8 @@ qla2x00_timer(struct timer_list *t)
+ fc_port_t *fcport = NULL;
+
+ if (ha->flags.eeh_busy) {
++ qla_wind_down_chip(vha);
++
+ ql_dbg(ql_dbg_timer, vha, 0x6000,
+ "EEH = %d, restarting timer.\n",
+ ha->flags.eeh_busy);
+@@ -7842,6 +7906,9 @@ void qla_pci_set_eeh_busy(struct scsi_qla_host *vha)
+
+ spin_lock_irqsave(&base_vha->work_lock, flags);
+ if (!ha->flags.eeh_busy) {
++ ha->eeh_jif = jiffies;
++ ha->flags.eeh_flush = 0;
++
+ ha->flags.eeh_busy = 1;
+ do_cleanup = true;
+ }
+diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
+index 6dfcfd8e73371..34b85c80233fe 100644
+--- a/drivers/scsi/qla2xxx/qla_target.c
++++ b/drivers/scsi/qla2xxx/qla_target.c
+@@ -988,22 +988,6 @@ void qlt_free_session_done(struct work_struct *work)
+ sess->send_els_logo);
+
+ if (!IS_SW_RESV_ADDR(sess->d_id)) {
+- if (ha->flags.edif_enabled &&
+- (!own || own->iocb.u.isp24.status_subcode == ELS_PLOGI)) {
+- sess->edif.authok = 0;
+- if (!ha->flags.host_shutting_down) {
+- ql_dbg(ql_dbg_edif, vha, 0x911e,
+- "%s wwpn %8phC calling qla2x00_release_all_sadb\n",
+- __func__, sess->port_name);
+- qla2x00_release_all_sadb(vha, sess);
+- } else {
+- ql_dbg(ql_dbg_edif, vha, 0x911e,
+- "%s bypassing release_all_sadb\n",
+- __func__);
+- }
+- qla_edif_clear_appdata(vha, sess);
+- qla_edif_sess_down(vha, sess);
+- }
+ qla2x00_mark_device_lost(vha, sess, 0);
+
+ if (sess->send_els_logo) {
+@@ -1049,6 +1033,25 @@ void qlt_free_session_done(struct work_struct *work)
+ sess->nvme_flag |= NVME_FLAG_DELETING;
+ qla_nvme_unregister_remote_port(sess);
+ }
++
++ if (ha->flags.edif_enabled &&
++ (!own || (own &&
++ own->iocb.u.isp24.status_subcode == ELS_PLOGI))) {
++ sess->edif.authok = 0;
++ if (!ha->flags.host_shutting_down) {
++ ql_dbg(ql_dbg_edif, vha, 0x911e,
++ "%s wwpn %8phC calling qla2x00_release_all_sadb\n",
++ __func__, sess->port_name);
++ qla2x00_release_all_sadb(vha, sess);
++ } else {
++ ql_dbg(ql_dbg_edif, vha, 0x911e,
++ "%s bypassing release_all_sadb\n",
++ __func__);
++ }
++
++ qla_edif_clear_appdata(vha, sess);
++ qla_edif_sess_down(vha, sess);
++ }
+ }
+
+ /*
+diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
+index 5d21f07456c6d..2a38cd2d24eff 100644
+--- a/drivers/scsi/scsi_transport_iscsi.c
++++ b/drivers/scsi/scsi_transport_iscsi.c
+@@ -2264,16 +2264,8 @@ static void iscsi_if_disconnect_bound_ep(struct iscsi_cls_conn *conn,
+ }
+ }
+
+-static int iscsi_if_stop_conn(struct iscsi_transport *transport,
+- struct iscsi_uevent *ev)
++static int iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag)
+ {
+- int flag = ev->u.stop_conn.flag;
+- struct iscsi_cls_conn *conn;
+-
+- conn = iscsi_conn_lookup(ev->u.stop_conn.sid, ev->u.stop_conn.cid);
+- if (!conn)
+- return -EINVAL;
+-
+ ISCSI_DBG_TRANS_CONN(conn, "iscsi if conn stop.\n");
+ /*
+ * If this is a termination we have to call stop_conn with that flag
+@@ -2349,6 +2341,55 @@ static void iscsi_cleanup_conn_work_fn(struct work_struct *work)
+ ISCSI_DBG_TRANS_CONN(conn, "cleanup done.\n");
+ }
+
++static int iscsi_iter_force_destroy_conn_fn(struct device *dev, void *data)
++{
++ struct iscsi_transport *transport;
++ struct iscsi_cls_conn *conn;
++
++ if (!iscsi_is_conn_dev(dev))
++ return 0;
++
++ conn = iscsi_dev_to_conn(dev);
++ transport = conn->transport;
++
++ if (READ_ONCE(conn->state) != ISCSI_CONN_DOWN)
++ iscsi_if_stop_conn(conn, STOP_CONN_TERM);
++
++ transport->destroy_conn(conn);
++ return 0;
++}
++
++/**
++ * iscsi_force_destroy_session - destroy a session from the kernel
++ * @session: session to destroy
++ *
++ * Force the destruction of a session from the kernel. This should only be
++ * used when userspace is no longer running during system shutdown.
++ */
++void iscsi_force_destroy_session(struct iscsi_cls_session *session)
++{
++ struct iscsi_transport *transport = session->transport;
++ unsigned long flags;
++
++ WARN_ON_ONCE(system_state == SYSTEM_RUNNING);
++
++ spin_lock_irqsave(&sesslock, flags);
++ if (list_empty(&session->sess_list)) {
++ spin_unlock_irqrestore(&sesslock, flags);
++ /*
++ * Conn/ep is already freed. Session is being torn down via
++ * async path. For shutdown we don't care about it so return.
++ */
++ return;
++ }
++ spin_unlock_irqrestore(&sesslock, flags);
++
++ device_for_each_child(&session->dev, NULL,
++ iscsi_iter_force_destroy_conn_fn);
++ transport->destroy_session(session);
++}
++EXPORT_SYMBOL_GPL(iscsi_force_destroy_session);
++
+ void iscsi_free_session(struct iscsi_cls_session *session)
+ {
+ ISCSI_DBG_TRANS_SESSION(session, "Freeing session\n");
+@@ -3720,7 +3761,12 @@ static int iscsi_if_transport_conn(struct iscsi_transport *transport,
+ case ISCSI_UEVENT_DESTROY_CONN:
+ return iscsi_if_destroy_conn(transport, ev);
+ case ISCSI_UEVENT_STOP_CONN:
+- return iscsi_if_stop_conn(transport, ev);
++ conn = iscsi_conn_lookup(ev->u.stop_conn.sid,
++ ev->u.stop_conn.cid);
++ if (!conn)
++ return -EINVAL;
++
++ return iscsi_if_stop_conn(conn, ev->u.stop_conn.flag);
+ }
+
+ /*
+diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
+index cbffa712b9f3e..508cb855386dc 100644
+--- a/drivers/scsi/sg.c
++++ b/drivers/scsi/sg.c
+@@ -195,7 +195,7 @@ static void sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size);
+ static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp);
+ static Sg_fd *sg_add_sfp(Sg_device * sdp);
+ static void sg_remove_sfp(struct kref *);
+-static Sg_request *sg_get_rq_mark(Sg_fd * sfp, int pack_id);
++static Sg_request *sg_get_rq_mark(Sg_fd * sfp, int pack_id, bool *busy);
+ static Sg_request *sg_add_request(Sg_fd * sfp);
+ static int sg_remove_request(Sg_fd * sfp, Sg_request * srp);
+ static Sg_device *sg_get_dev(int dev);
+@@ -444,6 +444,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
+ Sg_fd *sfp;
+ Sg_request *srp;
+ int req_pack_id = -1;
++ bool busy;
+ sg_io_hdr_t *hp;
+ struct sg_header *old_hdr;
+ int retval;
+@@ -466,20 +467,16 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
+ if (retval)
+ return retval;
+
+- srp = sg_get_rq_mark(sfp, req_pack_id);
++ srp = sg_get_rq_mark(sfp, req_pack_id, &busy);
+ if (!srp) { /* now wait on packet to arrive */
+- if (atomic_read(&sdp->detaching))
+- return -ENODEV;
+ if (filp->f_flags & O_NONBLOCK)
+ return -EAGAIN;
+ retval = wait_event_interruptible(sfp->read_wait,
+- (atomic_read(&sdp->detaching) ||
+- (srp = sg_get_rq_mark(sfp, req_pack_id))));
+- if (atomic_read(&sdp->detaching))
+- return -ENODEV;
+- if (retval)
+- /* -ERESTARTSYS as signal hit process */
+- return retval;
++ ((srp = sg_get_rq_mark(sfp, req_pack_id, &busy)) ||
++ (!busy && atomic_read(&sdp->detaching))));
++ if (!srp)
++ /* signal or detaching */
++ return retval ? retval : -ENODEV;
+ }
+ if (srp->header.interface_id != '\0')
+ return sg_new_read(sfp, buf, count, srp);
+@@ -939,9 +936,7 @@ sg_ioctl_common(struct file *filp, Sg_device *sdp, Sg_fd *sfp,
+ if (result < 0)
+ return result;
+ result = wait_event_interruptible(sfp->read_wait,
+- (srp_done(sfp, srp) || atomic_read(&sdp->detaching)));
+- if (atomic_read(&sdp->detaching))
+- return -ENODEV;
++ srp_done(sfp, srp));
+ write_lock_irq(&sfp->rq_list_lock);
+ if (srp->done) {
+ srp->done = 2;
+@@ -2078,19 +2073,28 @@ sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp)
+ }
+
+ static Sg_request *
+-sg_get_rq_mark(Sg_fd * sfp, int pack_id)
++sg_get_rq_mark(Sg_fd * sfp, int pack_id, bool *busy)
+ {
+ Sg_request *resp;
+ unsigned long iflags;
+
++ *busy = false;
+ write_lock_irqsave(&sfp->rq_list_lock, iflags);
+ list_for_each_entry(resp, &sfp->rq_list, entry) {
+- /* look for requests that are ready + not SG_IO owned */
+- if ((1 == resp->done) && (!resp->sg_io_owned) &&
++ /* look for requests that are not SG_IO owned */
++ if ((!resp->sg_io_owned) &&
+ ((-1 == pack_id) || (resp->header.pack_id == pack_id))) {
+- resp->done = 2; /* guard against other readers */
+- write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
+- return resp;
++ switch (resp->done) {
++ case 0: /* request active */
++ *busy = true;
++ break;
++ case 1: /* request done; response ready to return */
++ resp->done = 2; /* guard against other readers */
++ write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
++ return resp;
++ case 2: /* response already being returned */
++ break;
++ }
+ }
+ }
+ write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
+@@ -2144,6 +2148,15 @@ sg_remove_request(Sg_fd * sfp, Sg_request * srp)
+ res = 1;
+ }
+ write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
++
++ /*
++ * If the device is detaching, wakeup any readers in case we just
++ * removed the last response, which would leave nothing for them to
++ * return other than -ENODEV.
++ */
++ if (unlikely(atomic_read(&sfp->parentdp->detaching)))
++ wake_up_interruptible_all(&sfp->read_wait);
++
+ return res;
+ }
+
+diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
+index 7c0d069a31583..e1fc6f5b96124 100644
+--- a/drivers/scsi/smartpqi/smartpqi_init.c
++++ b/drivers/scsi/smartpqi/smartpqi_init.c
+@@ -5484,10 +5484,10 @@ static int pqi_raid_submit_scsi_cmd_with_io_request(
+ }
+
+ switch (scmd->sc_data_direction) {
+- case DMA_TO_DEVICE:
++ case DMA_FROM_DEVICE:
+ request->data_direction = SOP_READ_FLAG;
+ break;
+- case DMA_FROM_DEVICE:
++ case DMA_TO_DEVICE:
+ request->data_direction = SOP_WRITE_FLAG;
+ break;
+ case DMA_NONE:
+diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
+index 874490f7f5e7f..9e6f45aea20e9 100644
+--- a/drivers/scsi/ufs/ufshcd.c
++++ b/drivers/scsi/ufs/ufshcd.c
+@@ -9500,12 +9500,8 @@ EXPORT_SYMBOL(ufshcd_runtime_resume);
+ int ufshcd_shutdown(struct ufs_hba *hba)
+ {
+ if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
+- goto out;
+-
+- pm_runtime_get_sync(hba->dev);
++ ufshcd_suspend(hba);
+
+- ufshcd_suspend(hba);
+-out:
+ hba->is_powered = false;
+ /* allow force shutdown even in case of errors */
+ return 0;
+diff --git a/drivers/soc/amlogic/meson-mx-socinfo.c b/drivers/soc/amlogic/meson-mx-socinfo.c
+index 78f0f1aeca578..92125dd65f338 100644
+--- a/drivers/soc/amlogic/meson-mx-socinfo.c
++++ b/drivers/soc/amlogic/meson-mx-socinfo.c
+@@ -126,6 +126,7 @@ static int __init meson_mx_socinfo_init(void)
+ np = of_find_matching_node(NULL, meson_mx_socinfo_analog_top_ids);
+ if (np) {
+ analog_top_regmap = syscon_node_to_regmap(np);
++ of_node_put(np);
+ if (IS_ERR(analog_top_regmap))
+ return PTR_ERR(analog_top_regmap);
+
+diff --git a/drivers/soc/amlogic/meson-secure-pwrc.c b/drivers/soc/amlogic/meson-secure-pwrc.c
+index a10a417a87db8..e935187635267 100644
+--- a/drivers/soc/amlogic/meson-secure-pwrc.c
++++ b/drivers/soc/amlogic/meson-secure-pwrc.c
+@@ -152,8 +152,10 @@ static int meson_secure_pwrc_probe(struct platform_device *pdev)
+ }
+
+ pwrc = devm_kzalloc(&pdev->dev, sizeof(*pwrc), GFP_KERNEL);
+- if (!pwrc)
++ if (!pwrc) {
++ of_node_put(sm_np);
+ return -ENOMEM;
++ }
+
+ pwrc->fw = meson_sm_get(sm_np);
+ of_node_put(sm_np);
+diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
+index 5ed2fc1c53a0e..be18d46c7b0fb 100644
+--- a/drivers/soc/fsl/guts.c
++++ b/drivers/soc/fsl/guts.c
+@@ -140,7 +140,7 @@ static int fsl_guts_probe(struct platform_device *pdev)
+ struct device_node *root, *np = pdev->dev.of_node;
+ struct device *dev = &pdev->dev;
+ const struct fsl_soc_die_attr *soc_die;
+- const char *machine;
++ const char *machine = NULL;
+ u32 svr;
+
+ /* Initialize guts */
+diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
+index e718b87354444..4472fb22ba045 100644
+--- a/drivers/soc/qcom/Kconfig
++++ b/drivers/soc/qcom/Kconfig
+@@ -129,6 +129,7 @@ config QCOM_RPMHPD
+
+ config QCOM_RPMPD
+ tristate "Qualcomm RPM Power domain driver"
++ depends on PM
+ depends on QCOM_SMD_RPM
+ help
+ QCOM RPM Power domain driver to support power-domains with
+diff --git a/drivers/soc/qcom/ocmem.c b/drivers/soc/qcom/ocmem.c
+index 97fd24c178f8d..c92d26b73e6fc 100644
+--- a/drivers/soc/qcom/ocmem.c
++++ b/drivers/soc/qcom/ocmem.c
+@@ -194,14 +194,17 @@ struct ocmem *of_get_ocmem(struct device *dev)
+ devnode = of_parse_phandle(dev->of_node, "sram", 0);
+ if (!devnode || !devnode->parent) {
+ dev_err(dev, "Cannot look up sram phandle\n");
++ of_node_put(devnode);
+ return ERR_PTR(-ENODEV);
+ }
+
+ pdev = of_find_device_by_node(devnode->parent);
+ if (!pdev) {
+ dev_err(dev, "Cannot find device node %s\n", devnode->name);
++ of_node_put(devnode);
+ return ERR_PTR(-EPROBE_DEFER);
+ }
++ of_node_put(devnode);
+
+ ocmem = platform_get_drvdata(pdev);
+ if (!ocmem) {
+diff --git a/drivers/soc/qcom/qcom_aoss.c b/drivers/soc/qcom/qcom_aoss.c
+index a59bb34e5ebaf..18c856056475c 100644
+--- a/drivers/soc/qcom/qcom_aoss.c
++++ b/drivers/soc/qcom/qcom_aoss.c
+@@ -399,8 +399,10 @@ static int qmp_cooling_devices_register(struct qmp *qmp)
+ continue;
+ ret = qmp_cooling_device_add(qmp, &qmp->cooling_devs[count++],
+ child);
+- if (ret)
++ if (ret) {
++ of_node_put(child);
+ goto unroll;
++ }
+ }
+
+ if (!count)
+diff --git a/drivers/soc/qcom/socinfo.c b/drivers/soc/qcom/socinfo.c
+index 8b38d134720aa..f6879983da693 100644
+--- a/drivers/soc/qcom/socinfo.c
++++ b/drivers/soc/qcom/socinfo.c
+@@ -328,7 +328,8 @@ static const struct soc_id soc_id[] = {
+ { 455, "QRB5165" },
+ { 457, "SM8450" },
+ { 459, "SM7225" },
+- { 460, "SA8540P" },
++ { 460, "SA8295P" },
++ { 461, "SA8540P" },
+ { 480, "SM8450" },
+ };
+
+diff --git a/drivers/soc/renesas/r8a779a0-sysc.c b/drivers/soc/renesas/r8a779a0-sysc.c
+index fdfc857df3349..04f1bc322ae7b 100644
+--- a/drivers/soc/renesas/r8a779a0-sysc.c
++++ b/drivers/soc/renesas/r8a779a0-sysc.c
+@@ -57,11 +57,11 @@ static struct rcar_gen4_sysc_area r8a779a0_areas[] __initdata = {
+ { "a2cv6", R8A779A0_PD_A2CV6, R8A779A0_PD_A3IR },
+ { "a2cn2", R8A779A0_PD_A2CN2, R8A779A0_PD_A3IR },
+ { "a2imp23", R8A779A0_PD_A2IMP23, R8A779A0_PD_A3IR },
+- { "a2dp1", R8A779A0_PD_A2DP0, R8A779A0_PD_A3IR },
+- { "a2cv2", R8A779A0_PD_A2CV0, R8A779A0_PD_A3IR },
+- { "a2cv3", R8A779A0_PD_A2CV1, R8A779A0_PD_A3IR },
+- { "a2cv5", R8A779A0_PD_A2CV4, R8A779A0_PD_A3IR },
+- { "a2cv7", R8A779A0_PD_A2CV6, R8A779A0_PD_A3IR },
++ { "a2dp1", R8A779A0_PD_A2DP1, R8A779A0_PD_A3IR },
++ { "a2cv2", R8A779A0_PD_A2CV2, R8A779A0_PD_A3IR },
++ { "a2cv3", R8A779A0_PD_A2CV3, R8A779A0_PD_A3IR },
++ { "a2cv5", R8A779A0_PD_A2CV5, R8A779A0_PD_A3IR },
++ { "a2cv7", R8A779A0_PD_A2CV7, R8A779A0_PD_A3IR },
+ { "a2cn1", R8A779A0_PD_A2CN1, R8A779A0_PD_A3IR },
+ { "a1cnn0", R8A779A0_PD_A1CNN0, R8A779A0_PD_A2CN0 },
+ { "a1cnn2", R8A779A0_PD_A1CNN2, R8A779A0_PD_A2CN2 },
+diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
+index 354d3f89366f6..f1cf78c6d4778 100644
+--- a/drivers/soundwire/bus.c
++++ b/drivers/soundwire/bus.c
+@@ -7,6 +7,7 @@
+ #include <linux/pm_runtime.h>
+ #include <linux/soundwire/sdw_registers.h>
+ #include <linux/soundwire/sdw.h>
++#include <linux/soundwire/sdw_type.h>
+ #include "bus.h"
+ #include "sysfs_local.h"
+
+@@ -846,15 +847,21 @@ static int sdw_slave_clk_stop_callback(struct sdw_slave *slave,
+ enum sdw_clk_stop_mode mode,
+ enum sdw_clk_stop_type type)
+ {
+- int ret;
++ int ret = 0;
+
+- if (slave->ops && slave->ops->clk_stop) {
+- ret = slave->ops->clk_stop(slave, mode, type);
+- if (ret < 0)
+- return ret;
++ mutex_lock(&slave->sdw_dev_lock);
++
++ if (slave->probed) {
++ struct device *dev = &slave->dev;
++ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
++
++ if (drv->ops && drv->ops->clk_stop)
++ ret = drv->ops->clk_stop(slave, mode, type);
+ }
+
+- return 0;
++ mutex_unlock(&slave->sdw_dev_lock);
++
++ return ret;
+ }
+
+ static int sdw_slave_clk_stop_prepare(struct sdw_slave *slave,
+@@ -1616,14 +1623,24 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)
+ }
+
+ /* Update the Slave driver */
+- if (slave_notify && slave->ops &&
+- slave->ops->interrupt_callback) {
+- slave_intr.sdca_cascade = sdca_cascade;
+- slave_intr.control_port = clear;
+- memcpy(slave_intr.port, &port_status,
+- sizeof(slave_intr.port));
+-
+- slave->ops->interrupt_callback(slave, &slave_intr);
++ if (slave_notify) {
++ mutex_lock(&slave->sdw_dev_lock);
++
++ if (slave->probed) {
++ struct device *dev = &slave->dev;
++ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
++
++ if (drv->ops && drv->ops->interrupt_callback) {
++ slave_intr.sdca_cascade = sdca_cascade;
++ slave_intr.control_port = clear;
++ memcpy(slave_intr.port, &port_status,
++ sizeof(slave_intr.port));
++
++ drv->ops->interrupt_callback(slave, &slave_intr);
++ }
++ }
++
++ mutex_unlock(&slave->sdw_dev_lock);
+ }
+
+ /* Ack interrupt */
+@@ -1697,29 +1714,21 @@ io_err:
+ static int sdw_update_slave_status(struct sdw_slave *slave,
+ enum sdw_slave_status status)
+ {
+- unsigned long time;
++ int ret = 0;
+
+- if (!slave->probed) {
+- /*
+- * the slave status update is typically handled in an
+- * interrupt thread, which can race with the driver
+- * probe, e.g. when a module needs to be loaded.
+- *
+- * make sure the probe is complete before updating
+- * status.
+- */
+- time = wait_for_completion_timeout(&slave->probe_complete,
+- msecs_to_jiffies(DEFAULT_PROBE_TIMEOUT));
+- if (!time) {
+- dev_err(&slave->dev, "Probe not complete, timed out\n");
+- return -ETIMEDOUT;
+- }
++ mutex_lock(&slave->sdw_dev_lock);
++
++ if (slave->probed) {
++ struct device *dev = &slave->dev;
++ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
++
++ if (drv->ops && drv->ops->update_status)
++ ret = drv->ops->update_status(slave, status);
+ }
+
+- if (!slave->ops || !slave->ops->update_status)
+- return 0;
++ mutex_unlock(&slave->sdw_dev_lock);
+
+- return slave->ops->update_status(slave, status);
++ return ret;
+ }
+
+ /**
+diff --git a/drivers/soundwire/bus_type.c b/drivers/soundwire/bus_type.c
+index 893296f3fe395..04b3529f89293 100644
+--- a/drivers/soundwire/bus_type.c
++++ b/drivers/soundwire/bus_type.c
+@@ -98,8 +98,6 @@ static int sdw_drv_probe(struct device *dev)
+ if (!id)
+ return -ENODEV;
+
+- slave->ops = drv->ops;
+-
+ /*
+ * attach to power domain but don't turn on (last arg)
+ */
+@@ -107,19 +105,23 @@ static int sdw_drv_probe(struct device *dev)
+ if (ret)
+ return ret;
+
++ mutex_lock(&slave->sdw_dev_lock);
++
+ ret = drv->probe(slave, id);
+ if (ret) {
+ name = drv->name;
+ if (!name)
+ name = drv->driver.name;
++ mutex_unlock(&slave->sdw_dev_lock);
++
+ dev_err(dev, "Probe of %s failed: %d\n", name, ret);
+ dev_pm_domain_detach(dev, false);
+ return ret;
+ }
+
+ /* device is probed so let's read the properties now */
+- if (slave->ops && slave->ops->read_prop)
+- slave->ops->read_prop(slave);
++ if (drv->ops && drv->ops->read_prop)
++ drv->ops->read_prop(slave);
+
+ /* init the sysfs as we have properties now */
+ ret = sdw_slave_sysfs_init(slave);
+@@ -139,7 +141,19 @@ static int sdw_drv_probe(struct device *dev)
+ slave->prop.clk_stop_timeout);
+
+ slave->probed = true;
+- complete(&slave->probe_complete);
++
++ /*
++ * if the probe happened after the bus was started, notify the codec driver
++ * of the current hardware status to e.g. start the initialization.
++ * Errors are only logged as warnings to avoid failing the probe.
++ */
++ if (drv->ops && drv->ops->update_status) {
++ ret = drv->ops->update_status(slave, slave->status);
++ if (ret < 0)
++ dev_warn(dev, "%s: update_status failed with status %d\n", __func__, ret);
++ }
++
++ mutex_unlock(&slave->sdw_dev_lock);
+
+ dev_dbg(dev, "probe complete\n");
+
+@@ -152,9 +166,15 @@ static int sdw_drv_remove(struct device *dev)
+ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
+ int ret = 0;
+
++ mutex_lock(&slave->sdw_dev_lock);
++
++ slave->probed = false;
++
+ if (drv->remove)
+ ret = drv->remove(slave);
+
++ mutex_unlock(&slave->sdw_dev_lock);
++
+ dev_pm_domain_detach(dev, false);
+
+ return ret;
+@@ -193,12 +213,8 @@ int __sdw_register_driver(struct sdw_driver *drv, struct module *owner)
+
+ drv->driver.owner = owner;
+ drv->driver.probe = sdw_drv_probe;
+-
+- if (drv->remove)
+- drv->driver.remove = sdw_drv_remove;
+-
+- if (drv->shutdown)
+- drv->driver.shutdown = sdw_drv_shutdown;
++ drv->driver.remove = sdw_drv_remove;
++ drv->driver.shutdown = sdw_drv_shutdown;
+
+ return driver_register(&drv->driver);
+ }
+diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
+index 7b8ef45abee44..eea9d0f338f59 100644
+--- a/drivers/soundwire/qcom.c
++++ b/drivers/soundwire/qcom.c
+@@ -471,6 +471,10 @@ static int qcom_swrm_enumerate(struct sdw_bus *bus)
+ char *buf1 = (char *)&val1, *buf2 = (char *)&val2;
+
+ for (i = 1; i <= SDW_MAX_DEVICES; i++) {
++ /* do not continue if the status is Not Present */
++ if (!ctrl->status[i])
++ continue;
++
+ /*SCP_Devid5 - Devid 4*/
+ ctrl->reg_read(ctrl, SWRM_ENUMERATOR_SLAVE_DEV_ID_1(i), &val1);
+
+diff --git a/drivers/soundwire/slave.c b/drivers/soundwire/slave.c
+index 669d7573320b7..25e76b5d4a1a3 100644
+--- a/drivers/soundwire/slave.c
++++ b/drivers/soundwire/slave.c
+@@ -12,6 +12,7 @@ static void sdw_slave_release(struct device *dev)
+ {
+ struct sdw_slave *slave = dev_to_sdw_dev(dev);
+
++ mutex_destroy(&slave->sdw_dev_lock);
+ kfree(slave);
+ }
+
+@@ -58,9 +59,9 @@ int sdw_slave_add(struct sdw_bus *bus,
+ init_completion(&slave->enumeration_complete);
+ init_completion(&slave->initialization_complete);
+ slave->dev_num = 0;
+- init_completion(&slave->probe_complete);
+ slave->probed = false;
+ slave->first_interrupt_done = false;
++ mutex_init(&slave->sdw_dev_lock);
+
+ for (i = 0; i < SDW_MAX_PORTS; i++)
+ init_completion(&slave->port_ready[i]);
+diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
+index f273459b20234..6963e5f7ea6d4 100644
+--- a/drivers/soundwire/stream.c
++++ b/drivers/soundwire/stream.c
+@@ -13,6 +13,7 @@
+ #include <linux/slab.h>
+ #include <linux/soundwire/sdw_registers.h>
+ #include <linux/soundwire/sdw.h>
++#include <linux/soundwire/sdw_type.h>
+ #include <sound/soc.h>
+ #include "bus.h"
+
+@@ -401,20 +402,26 @@ static int sdw_do_port_prep(struct sdw_slave_runtime *s_rt,
+ struct sdw_prepare_ch prep_ch,
+ enum sdw_port_prep_ops cmd)
+ {
+- const struct sdw_slave_ops *ops = s_rt->slave->ops;
+- int ret;
++ int ret = 0;
++ struct sdw_slave *slave = s_rt->slave;
+
+- if (ops->port_prep) {
+- ret = ops->port_prep(s_rt->slave, &prep_ch, cmd);
+- if (ret < 0) {
+- dev_err(&s_rt->slave->dev,
+- "Slave Port Prep cmd %d failed: %d\n",
+- cmd, ret);
+- return ret;
++ mutex_lock(&slave->sdw_dev_lock);
++
++ if (slave->probed) {
++ struct device *dev = &slave->dev;
++ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
++
++ if (drv->ops && drv->ops->port_prep) {
++ ret = drv->ops->port_prep(slave, &prep_ch, cmd);
++ if (ret < 0)
++ dev_err(dev, "Slave Port Prep cmd %d failed: %d\n",
++ cmd, ret);
+ }
+ }
+
+- return 0;
++ mutex_unlock(&slave->sdw_dev_lock);
++
++ return ret;
+ }
+
+ static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
+@@ -578,7 +585,7 @@ static int sdw_notify_config(struct sdw_master_runtime *m_rt)
+ struct sdw_slave_runtime *s_rt;
+ struct sdw_bus *bus = m_rt->bus;
+ struct sdw_slave *slave;
+- int ret = 0;
++ int ret;
+
+ if (bus->ops->set_bus_conf) {
+ ret = bus->ops->set_bus_conf(bus, &bus->params);
+@@ -589,17 +596,27 @@ static int sdw_notify_config(struct sdw_master_runtime *m_rt)
+ list_for_each_entry(s_rt, &m_rt->slave_rt_list, m_rt_node) {
+ slave = s_rt->slave;
+
+- if (slave->ops->bus_config) {
+- ret = slave->ops->bus_config(slave, &bus->params);
+- if (ret < 0) {
+- dev_err(bus->dev, "Notify Slave: %d failed\n",
+- slave->dev_num);
+- return ret;
++ mutex_lock(&slave->sdw_dev_lock);
++
++ if (slave->probed) {
++ struct device *dev = &slave->dev;
++ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
++
++ if (drv->ops && drv->ops->bus_config) {
++ ret = drv->ops->bus_config(slave, &bus->params);
++ if (ret < 0) {
++ dev_err(dev, "Notify Slave: %d failed\n",
++ slave->dev_num);
++ mutex_unlock(&slave->sdw_dev_lock);
++ return ret;
++ }
+ }
+ }
++
++ mutex_unlock(&slave->sdw_dev_lock);
+ }
+
+- return ret;
++ return 0;
+ }
+
+ /**
+diff --git a/drivers/spi/spi-altera-dfl.c b/drivers/spi/spi-altera-dfl.c
+index ca40923258af3..596e181ae1368 100644
+--- a/drivers/spi/spi-altera-dfl.c
++++ b/drivers/spi/spi-altera-dfl.c
+@@ -128,9 +128,9 @@ static int dfl_spi_altera_probe(struct dfl_device *dfl_dev)
+ struct spi_master *master;
+ struct altera_spi *hw;
+ void __iomem *base;
+- int err = -ENODEV;
++ int err;
+
+- master = spi_alloc_master(dev, sizeof(struct altera_spi));
++ master = devm_spi_alloc_master(dev, sizeof(struct altera_spi));
+ if (!master)
+ return -ENOMEM;
+
+@@ -159,10 +159,9 @@ static int dfl_spi_altera_probe(struct dfl_device *dfl_dev)
+ altera_spi_init_master(master);
+
+ err = devm_spi_register_master(dev, master);
+- if (err) {
+- dev_err(dev, "%s failed to register spi master %d\n", __func__, err);
+- goto exit;
+- }
++ if (err)
++ return dev_err_probe(dev, err, "%s failed to register spi master\n",
++ __func__);
+
+ if (dfl_dev->revision == FME_FEATURE_REV_MAX10_SPI_N5010)
+ strscpy(board_info.modalias, "m10-n5010", SPI_NAME_SIZE);
+@@ -179,9 +178,6 @@ static int dfl_spi_altera_probe(struct dfl_device *dfl_dev)
+ }
+
+ return 0;
+-exit:
+- spi_master_put(master);
+- return err;
+ }
+
+ static const struct dfl_device_id dfl_spi_altera_ids[] = {
+diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
+index d5ee5130601e1..79d853f6d1920 100644
+--- a/drivers/spi/spi-dw.h
++++ b/drivers/spi/spi-dw.h
+@@ -23,7 +23,7 @@
+ ((_dws)->ip == DW_ ## _ip ## _ID)
+
+ #define __dw_spi_ver_cmp(_dws, _ip, _ver, _op) \
+- (dw_spi_ip_is(_dws, _ip) && (_dws)->ver _op DW_ ## _ip ## _ver)
++ (dw_spi_ip_is(_dws, _ip) && (_dws)->ver _op DW_ ## _ip ## _ ## _ver)
+
+ #define dw_spi_ver_is(_dws, _ip, _ver) __dw_spi_ver_cmp(_dws, _ip, _ver, ==)
+
+diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
+index 7a014eeec2d0d..411b1307b7fd8 100644
+--- a/drivers/spi/spi-rspi.c
++++ b/drivers/spi/spi-rspi.c
+@@ -613,6 +613,10 @@ static int rspi_dma_transfer(struct rspi_data *rspi, struct sg_table *tx,
+ rspi->dma_callbacked, HZ);
+ if (ret > 0 && rspi->dma_callbacked) {
+ ret = 0;
++ if (tx)
++ dmaengine_synchronize(rspi->ctlr->dma_tx);
++ if (rx)
++ dmaengine_synchronize(rspi->ctlr->dma_rx);
+ } else {
+ if (!ret) {
+ dev_err(&rspi->ctlr->dev, "DMA timeout\n");
+diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c
+index c26440e9058d7..8fa21afc6a35b 100644
+--- a/drivers/spi/spi-s3c64xx.c
++++ b/drivers/spi/spi-s3c64xx.c
+@@ -1413,7 +1413,7 @@ static const struct s3c64xx_spi_port_config exynos5433_spi_port_config = {
+ .quirks = S3C64XX_SPI_QUIRK_CS_AUTO,
+ };
+
+-static struct s3c64xx_spi_port_config fsd_spi_port_config = {
++static const struct s3c64xx_spi_port_config fsd_spi_port_config = {
+ .fifo_lvl_mask = { 0x7f, 0x7f, 0x7f, 0x7f, 0x7f},
+ .rx_lvl_offset = 15,
+ .tx_st_done = 25,
+diff --git a/drivers/spi/spi-synquacer.c b/drivers/spi/spi-synquacer.c
+index ea706d9629cb1..47cbe73137c23 100644
+--- a/drivers/spi/spi-synquacer.c
++++ b/drivers/spi/spi-synquacer.c
+@@ -783,6 +783,7 @@ static int __maybe_unused synquacer_spi_resume(struct device *dev)
+
+ ret = synquacer_spi_enable(master);
+ if (ret) {
++ clk_disable_unprepare(sspi->clk);
+ dev_err(dev, "failed to enable spi (%d)\n", ret);
+ return ret;
+ }
+diff --git a/drivers/spi/spi-tegra20-slink.c b/drivers/spi/spi-tegra20-slink.c
+index 80c3787deea9d..c91c226be69eb 100644
+--- a/drivers/spi/spi-tegra20-slink.c
++++ b/drivers/spi/spi-tegra20-slink.c
+@@ -1137,7 +1137,7 @@ exit_free_master:
+
+ static int tegra_slink_remove(struct platform_device *pdev)
+ {
+- struct spi_master *master = platform_get_drvdata(pdev);
++ struct spi_master *master = spi_master_get(platform_get_drvdata(pdev));
+ struct tegra_slink_data *tspi = spi_master_get_devdata(master);
+
+ spi_unregister_master(master);
+@@ -1152,6 +1152,7 @@ static int tegra_slink_remove(struct platform_device *pdev)
+ if (tspi->rx_dma_chan)
+ tegra_slink_deinit_dma_param(tspi, true);
+
++ spi_master_put(master);
+ return 0;
+ }
+
+diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
+index 2e6d6bbeb7842..6a8850e620785 100644
+--- a/drivers/spi/spi.c
++++ b/drivers/spi/spi.c
+@@ -2416,7 +2416,7 @@ static int acpi_spi_add_resource(struct acpi_resource *ares, void *data)
+
+ ctlr = acpi_spi_find_controller_by_adev(adev);
+ if (!ctlr)
+- return -ENODEV;
++ return -EPROBE_DEFER;
+
+ lookup->ctlr = ctlr;
+ }
+@@ -3068,9 +3068,9 @@ free_bus_id:
+ }
+ EXPORT_SYMBOL_GPL(spi_register_controller);
+
+-static void devm_spi_unregister(void *ctlr)
++static void devm_spi_unregister(struct device *dev, void *res)
+ {
+- spi_unregister_controller(ctlr);
++ spi_unregister_controller(*(struct spi_controller **)res);
+ }
+
+ /**
+@@ -3089,13 +3089,22 @@ static void devm_spi_unregister(void *ctlr)
+ int devm_spi_register_controller(struct device *dev,
+ struct spi_controller *ctlr)
+ {
++ struct spi_controller **ptr;
+ int ret;
+
++ ptr = devres_alloc(devm_spi_unregister, sizeof(*ptr), GFP_KERNEL);
++ if (!ptr)
++ return -ENOMEM;
++
+ ret = spi_register_controller(ctlr);
+- if (ret)
+- return ret;
++ if (!ret) {
++ *ptr = ctlr;
++ devres_add(dev, ptr);
++ } else {
++ devres_free(ptr);
++ }
+
+- return devm_add_action_or_reset(dev, devm_spi_unregister, ctlr);
++ return ret;
+ }
+ EXPORT_SYMBOL_GPL(devm_spi_register_controller);
+
+diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c
+index 9c4d797e7ae46..4137c1a51e1b9 100644
+--- a/drivers/staging/fbtft/fbtft-core.c
++++ b/drivers/staging/fbtft/fbtft-core.c
+@@ -656,7 +656,6 @@ struct fb_info *fbtft_framebuffer_alloc(struct fbtft_display *display,
+ fbdefio->delay = HZ / fps;
+ fbdefio->sort_pagelist = true;
+ fbdefio->deferred_io = fbtft_deferred_io;
+- fb_deferred_io_init(info);
+
+ snprintf(info->fix.id, sizeof(info->fix.id), "%s", dev->driver->name);
+ info->fix.type = FB_TYPE_PACKED_PIXELS;
+@@ -667,6 +666,7 @@ struct fb_info *fbtft_framebuffer_alloc(struct fbtft_display *display,
+ info->fix.line_length = width * bpp / 8;
+ info->fix.accel = FB_ACCEL_NONE;
+ info->fix.smem_len = vmem_size;
++ fb_deferred_io_init(info);
+
+ info->var.rotate = pdata->rotate;
+ info->var.xres = width;
+diff --git a/drivers/staging/media/atomisp/pci/atomisp_cmd.c b/drivers/staging/media/atomisp/pci/atomisp_cmd.c
+index 97d5a528969b8..0da0b69a46375 100644
+--- a/drivers/staging/media/atomisp/pci/atomisp_cmd.c
++++ b/drivers/staging/media/atomisp/pci/atomisp_cmd.c
+@@ -901,9 +901,9 @@ void atomisp_buf_done(struct atomisp_sub_device *asd, int error,
+ int err;
+ unsigned long irqflags;
+ struct ia_css_frame *frame = NULL;
+- struct atomisp_s3a_buf *s3a_buf = NULL, *_s3a_buf_tmp;
+- struct atomisp_dis_buf *dis_buf = NULL, *_dis_buf_tmp;
+- struct atomisp_metadata_buf *md_buf = NULL, *_md_buf_tmp;
++ struct atomisp_s3a_buf *s3a_buf = NULL, *_s3a_buf_tmp, *s3a_iter;
++ struct atomisp_dis_buf *dis_buf = NULL, *_dis_buf_tmp, *dis_iter;
++ struct atomisp_metadata_buf *md_buf = NULL, *_md_buf_tmp, *md_iter;
+ enum atomisp_metadata_type md_type;
+ struct atomisp_device *isp = asd->isp;
+ struct v4l2_control ctrl;
+@@ -942,60 +942,75 @@ void atomisp_buf_done(struct atomisp_sub_device *asd, int error,
+
+ switch (buf_type) {
+ case IA_CSS_BUFFER_TYPE_3A_STATISTICS:
+- list_for_each_entry_safe(s3a_buf, _s3a_buf_tmp,
++ list_for_each_entry_safe(s3a_iter, _s3a_buf_tmp,
+ &asd->s3a_stats_in_css, list) {
+- if (s3a_buf->s3a_data ==
++ if (s3a_iter->s3a_data ==
+ buffer.css_buffer.data.stats_3a) {
+- list_del_init(&s3a_buf->list);
+- list_add_tail(&s3a_buf->list,
++ list_del_init(&s3a_iter->list);
++ list_add_tail(&s3a_iter->list,
+ &asd->s3a_stats_ready);
++ s3a_buf = s3a_iter;
+ break;
+ }
+ }
+
+ asd->s3a_bufs_in_css[css_pipe_id]--;
+ atomisp_3a_stats_ready_event(asd, buffer.css_buffer.exp_id);
+- dev_dbg(isp->dev, "%s: s3a stat with exp_id %d is ready\n",
+- __func__, s3a_buf->s3a_data->exp_id);
++ if (s3a_buf)
++ dev_dbg(isp->dev, "%s: s3a stat with exp_id %d is ready\n",
++ __func__, s3a_buf->s3a_data->exp_id);
++ else
++ dev_dbg(isp->dev, "%s: s3a stat is ready with no exp_id found\n",
++ __func__);
+ break;
+ case IA_CSS_BUFFER_TYPE_METADATA:
+ if (error)
+ break;
+
+ md_type = atomisp_get_metadata_type(asd, css_pipe_id);
+- list_for_each_entry_safe(md_buf, _md_buf_tmp,
++ list_for_each_entry_safe(md_iter, _md_buf_tmp,
+ &asd->metadata_in_css[md_type], list) {
+- if (md_buf->metadata ==
++ if (md_iter->metadata ==
+ buffer.css_buffer.data.metadata) {
+- list_del_init(&md_buf->list);
+- list_add_tail(&md_buf->list,
++ list_del_init(&md_iter->list);
++ list_add_tail(&md_iter->list,
+ &asd->metadata_ready[md_type]);
++ md_buf = md_iter;
+ break;
+ }
+ }
+ asd->metadata_bufs_in_css[stream_id][css_pipe_id]--;
+ atomisp_metadata_ready_event(asd, md_type);
+- dev_dbg(isp->dev, "%s: metadata with exp_id %d is ready\n",
+- __func__, md_buf->metadata->exp_id);
++ if (md_buf)
++ dev_dbg(isp->dev, "%s: metadata with exp_id %d is ready\n",
++ __func__, md_buf->metadata->exp_id);
++ else
++ dev_dbg(isp->dev, "%s: metadata is ready with no exp_id found\n",
++ __func__);
+ break;
+ case IA_CSS_BUFFER_TYPE_DIS_STATISTICS:
+- list_for_each_entry_safe(dis_buf, _dis_buf_tmp,
++ list_for_each_entry_safe(dis_iter, _dis_buf_tmp,
+ &asd->dis_stats_in_css, list) {
+- if (dis_buf->dis_data ==
++ if (dis_iter->dis_data ==
+ buffer.css_buffer.data.stats_dvs) {
+ spin_lock_irqsave(&asd->dis_stats_lock,
+ irqflags);
+- list_del_init(&dis_buf->list);
+- list_add(&dis_buf->list, &asd->dis_stats);
++ list_del_init(&dis_iter->list);
++ list_add(&dis_iter->list, &asd->dis_stats);
+ asd->params.dis_proj_data_valid = true;
+ spin_unlock_irqrestore(&asd->dis_stats_lock,
+ irqflags);
++ dis_buf = dis_iter;
+ break;
+ }
+ }
+ asd->dis_bufs_in_css--;
+- dev_dbg(isp->dev, "%s: dis stat with exp_id %d is ready\n",
+- __func__, dis_buf->dis_data->exp_id);
++ if (dis_buf)
++ dev_dbg(isp->dev, "%s: dis stat with exp_id %d is ready\n",
++ __func__, dis_buf->dis_data->exp_id);
++ else
++ dev_dbg(isp->dev, "%s: dis stat is ready with no exp_id found\n",
++ __func__);
+ break;
+ case IA_CSS_BUFFER_TYPE_VF_OUTPUT_FRAME:
+ case IA_CSS_BUFFER_TYPE_SEC_VF_OUTPUT_FRAME:
+diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
+index bd7d11032c94a..ac232b5f7825c 100644
+--- a/drivers/staging/media/hantro/hantro_drv.c
++++ b/drivers/staging/media/hantro/hantro_drv.c
+@@ -638,6 +638,7 @@ static const struct of_device_id of_hantro_match[] = {
+ { .compatible = "rockchip,rk3288-vpu", .data = &rk3288_vpu_variant, },
+ { .compatible = "rockchip,rk3328-vpu", .data = &rk3328_vpu_variant, },
+ { .compatible = "rockchip,rk3399-vpu", .data = &rk3399_vpu_variant, },
++ { .compatible = "rockchip,rk3568-vpu", .data = &rk3568_vpu_variant, },
+ #endif
+ #ifdef CONFIG_VIDEO_HANTRO_IMX8M
+ { .compatible = "nxp,imx8mm-vpu-g1", .data = &imx8mm_vpu_g1_variant, },
+diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
+index 5f3178bac9c80..d28653d04d20f 100644
+--- a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
++++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
+@@ -8,6 +8,20 @@
+ #include "hantro_hw.h"
+ #include "hantro_g2_regs.h"
+
++#define G2_ALIGN 16
++
++static size_t hantro_hevc_chroma_offset(struct hantro_ctx *ctx)
++{
++ return ctx->dst_fmt.width * ctx->dst_fmt.height;
++}
++
++static size_t hantro_hevc_motion_vectors_offset(struct hantro_ctx *ctx)
++{
++ size_t cr_offset = hantro_hevc_chroma_offset(ctx);
++
++ return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
++}
++
+ static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
+ {
+ struct hantro_dev *vpu = ctx->dev;
+@@ -330,7 +344,6 @@ static void set_ref_pic_list(struct hantro_ctx *ctx)
+ static int set_ref(struct hantro_ctx *ctx)
+ {
+ const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
+- const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
+ const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
+ const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
+ const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
+@@ -338,8 +351,8 @@ static int set_ref(struct hantro_ctx *ctx)
+ struct hantro_dev *vpu = ctx->dev;
+ struct vb2_v4l2_buffer *vb2_dst;
+ struct hantro_decoded_buffer *dst;
+- size_t cr_offset = hantro_hevc_chroma_offset(sps);
+- size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
++ size_t cr_offset = hantro_hevc_chroma_offset(ctx);
++ size_t mv_offset = hantro_hevc_motion_vectors_offset(ctx);
+ u32 max_ref_frames;
+ u16 dpb_longterm_e;
+ static const struct hantro_reg cur_poc[] = {
+@@ -377,11 +390,10 @@ static int set_ref(struct hantro_ctx *ctx)
+ !!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
+
+ /*
+- * Write POC count diff from current pic. For frame decoding only compute
+- * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
++ * Write POC count diff from current pic.
+ */
+ for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
+- char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
++ char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt_val;
+
+ hantro_reg_write(vpu, &cur_poc[i], poc_diff);
+ }
+@@ -401,14 +413,14 @@ static int set_ref(struct hantro_ctx *ctx)
+
+ set_ref_pic_list(ctx);
+
+- /* We will only keep the references picture that are still used */
+- ctx->hevc_dec.ref_bufs_used = 0;
++ /* We will only keep the reference pictures that are still used */
++ hantro_hevc_ref_init(ctx);
+
+ /* Set up addresses of DPB buffers */
+ dpb_longterm_e = 0;
+ for (i = 0; i < decode_params->num_active_dpb_entries &&
+ i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
+- luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
++ luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt_val);
+ if (!luma_addr)
+ return -ENOMEM;
+
+@@ -443,8 +455,6 @@ static int set_ref(struct hantro_ctx *ctx)
+ hantro_write_addr(vpu, G2_OUT_CHROMA_ADDR, chroma_addr);
+ hantro_write_addr(vpu, G2_OUT_MV_ADDR, mv_addr);
+
+- hantro_hevc_ref_remove_unused(ctx);
+-
+ for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
+ hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), 0);
+ hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), 0);
+diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
+index b7c6f9877b9d5..0f43516d08051 100644
+--- a/drivers/staging/media/hantro/hantro_g2_regs.h
++++ b/drivers/staging/media/hantro/hantro_g2_regs.h
+@@ -107,7 +107,7 @@
+
+ #define g2_start_code_e G2_DEC_REG(10, 31, 0x1)
+ #define g2_init_qp_old G2_DEC_REG(10, 25, 0x3f)
+-#define g2_init_qp G2_DEC_REG(10, 24, 0x3f)
++#define g2_init_qp G2_DEC_REG(10, 24, 0x7f)
+ #define g2_num_tile_cols_old G2_DEC_REG(10, 20, 0x1f)
+ #define g2_num_tile_cols G2_DEC_REG(10, 19, 0x1f)
+ #define g2_num_tile_rows_old G2_DEC_REG(10, 15, 0x1f)
+diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
+index b49a41d7ae912..df1f81952bba1 100644
+--- a/drivers/staging/media/hantro/hantro_hevc.c
++++ b/drivers/staging/media/hantro/hantro_hevc.c
+@@ -25,41 +25,20 @@
+ #define MAX_TILE_COLS 20
+ #define MAX_TILE_ROWS 22
+
+-#define UNUSED_REF -1
+-
+-#define G2_ALIGN 16
+-
+-size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
+-{
+- int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
+-
+- return sps->pic_width_in_luma_samples *
+- sps->pic_height_in_luma_samples * bytes_per_pixel;
+-}
+-
+-size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
+-{
+- size_t cr_offset = hantro_hevc_chroma_offset(sps);
+-
+- return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
+-}
+-
+-static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
++void hantro_hevc_ref_init(struct hantro_ctx *ctx)
+ {
+ struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+- int i;
+
+- for (i = 0; i < NUM_REF_PICTURES; i++)
+- hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
++ hevc_dec->ref_bufs_used = 0;
+ }
+
+ dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
+- int poc)
++ s32 poc)
+ {
+ struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+ int i;
+
+- /* Find the reference buffer in already know ones */
++ /* Find the reference buffer in already known ones */
+ for (i = 0; i < NUM_REF_PICTURES; i++) {
+ if (hevc_dec->ref_bufs_poc[i] == poc) {
+ hevc_dec->ref_bufs_used |= 1 << i;
+@@ -77,7 +56,7 @@ int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr)
+
+ /* Add a new reference buffer */
+ for (i = 0; i < NUM_REF_PICTURES; i++) {
+- if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
++ if (!(hevc_dec->ref_bufs_used & 1 << i)) {
+ hevc_dec->ref_bufs_used |= 1 << i;
+ hevc_dec->ref_bufs_poc[i] = poc;
+ hevc_dec->ref_bufs[i].dma = addr;
+@@ -88,23 +67,6 @@ int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr)
+ return -EINVAL;
+ }
+
+-void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx)
+-{
+- struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
+- int i;
+-
+- /* Just tag buffer as unused, do not free them */
+- for (i = 0; i < NUM_REF_PICTURES; i++) {
+- if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF)
+- continue;
+-
+- if (hevc_dec->ref_bufs_used & (1 << i))
+- continue;
+-
+- hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+- }
+-}
+-
+ static int tile_buffer_reallocate(struct hantro_ctx *ctx)
+ {
+ struct hantro_dev *vpu = ctx->dev;
+@@ -192,6 +154,25 @@ err_free_tile_buffers:
+ return -ENOMEM;
+ }
+
++static int hantro_hevc_validate_sps(struct hantro_ctx *ctx, const struct v4l2_ctrl_hevc_sps *sps)
++{
++ /*
++ * for tile pixel format check if the width and height match
++ * hardware constraints
++ */
++ if (ctx->vpu_dst_fmt->fourcc == V4L2_PIX_FMT_NV12_4L4) {
++ if (ctx->dst_fmt.width !=
++ ALIGN(sps->pic_width_in_luma_samples, ctx->vpu_dst_fmt->frmsize.step_width))
++ return -EINVAL;
++
++ if (ctx->dst_fmt.height !=
++ ALIGN(sps->pic_height_in_luma_samples, ctx->vpu_dst_fmt->frmsize.step_height))
++ return -EINVAL;
++ }
++
++ return 0;
++}
++
+ int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx)
+ {
+ struct hantro_hevc_dec_hw_ctx *hevc_ctx = &ctx->hevc_dec;
+@@ -215,6 +196,10 @@ int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx)
+ if (WARN_ON(!ctrls->sps))
+ return -EINVAL;
+
++ ret = hantro_hevc_validate_sps(ctx, ctrls->sps);
++ if (ret)
++ return ret;
++
+ ctrls->pps =
+ hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_PPS);
+ if (WARN_ON(!ctrls->pps))
+diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
+index ed018e293ba07..68c313864b065 100644
+--- a/drivers/staging/media/hantro/hantro_hw.h
++++ b/drivers/staging/media/hantro/hantro_hw.h
+@@ -18,9 +18,21 @@
+ #define DEC_8190_ALIGN_MASK 0x07U
+
+ #define MB_DIM 16
++#define TILE_MB_DIM 4
+ #define MB_WIDTH(w) DIV_ROUND_UP(w, MB_DIM)
+ #define MB_HEIGHT(h) DIV_ROUND_UP(h, MB_DIM)
+
++#define FMT_MIN_WIDTH 48
++#define FMT_MIN_HEIGHT 48
++#define FMT_HD_WIDTH 1280
++#define FMT_HD_HEIGHT 720
++#define FMT_FHD_WIDTH 1920
++#define FMT_FHD_HEIGHT 1088
++#define FMT_UHD_WIDTH 3840
++#define FMT_UHD_HEIGHT 2160
++#define FMT_4K_WIDTH 4096
++#define FMT_4K_HEIGHT 2304
++
+ #define NUM_REF_PICTURES (V4L2_HEVC_DPB_ENTRIES_NUM_MAX + 1)
+
+ struct hantro_dev;
+@@ -131,7 +143,7 @@ struct hantro_hevc_dec_hw_ctx {
+ struct hantro_aux_buf tile_bsd;
+ struct hantro_aux_buf ref_bufs[NUM_REF_PICTURES];
+ struct hantro_aux_buf scaling_lists;
+- int ref_bufs_poc[NUM_REF_PICTURES];
++ s32 ref_bufs_poc[NUM_REF_PICTURES];
+ u32 ref_bufs_used;
+ struct hantro_hevc_dec_ctrls ctrls;
+ unsigned int num_tile_cols_allocated;
+@@ -300,6 +312,7 @@ extern const struct hantro_variant rk3066_vpu_variant;
+ extern const struct hantro_variant rk3288_vpu_variant;
+ extern const struct hantro_variant rk3328_vpu_variant;
+ extern const struct hantro_variant rk3399_vpu_variant;
++extern const struct hantro_variant rk3568_vpu_variant;
+ extern const struct hantro_variant sama5d4_vdec_variant;
+ extern const struct hantro_variant sunxi_vpu_variant;
+
+@@ -337,11 +350,10 @@ int hantro_hevc_dec_init(struct hantro_ctx *ctx);
+ void hantro_hevc_dec_exit(struct hantro_ctx *ctx);
+ int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx);
+ int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx);
+-dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, int poc);
++void hantro_hevc_ref_init(struct hantro_ctx *ctx);
++dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, s32 poc);
+ int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr);
+-void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx);
+-size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps);
+-size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps);
++
+
+ static inline unsigned short hantro_vp9_num_sbs(unsigned short dimension)
+ {
+diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
+index 71a6279750bf7..93d0dcf69f4a0 100644
+--- a/drivers/staging/media/hantro/hantro_v4l2.c
++++ b/drivers/staging/media/hantro/hantro_v4l2.c
+@@ -260,7 +260,7 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
+ } else if (ctx->is_encoder) {
+ vpu_fmt = ctx->vpu_dst_fmt;
+ } else {
+- vpu_fmt = ctx->vpu_src_fmt;
++ vpu_fmt = fmt;
+ /*
+ * Width/height on the CAPTURE end of a decoder are ignored and
+ * replaced by the OUTPUT ones.
+diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
+index 9802508bade27..77f574fdfa77b 100644
+--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
++++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
+@@ -83,6 +83,14 @@ static const struct hantro_fmt imx8m_vpu_postproc_fmts[] = {
+ .fourcc = V4L2_PIX_FMT_YUYV,
+ .codec_mode = HANTRO_MODE_NONE,
+ .postprocessed = true,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ };
+
+@@ -90,17 +98,25 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
+ .codec_mode = HANTRO_MODE_MPEG2_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1920,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 1088,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -109,11 +125,11 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_VP8_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 3840,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 2160,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -122,11 +138,11 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_H264_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 3840,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 2160,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -137,6 +153,14 @@ static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
+ .postprocessed = true,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ };
+
+@@ -144,18 +168,26 @@ static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12_4L4,
+ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = TILE_MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = TILE_MB_DIM,
++ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_HEVC_SLICE,
+ .codec_mode = HANTRO_MODE_HEVC_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 3840,
+- .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 2160,
+- .step_height = MB_DIM,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = TILE_MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = TILE_MB_DIM,
+ },
+ },
+ {
+@@ -163,12 +195,12 @@ static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_VP9_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 3840,
+- .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 2160,
+- .step_height = MB_DIM,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = TILE_MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = TILE_MB_DIM,
+ },
+ },
+ };
+diff --git a/drivers/staging/media/hantro/rockchip_vpu_hw.c b/drivers/staging/media/hantro/rockchip_vpu_hw.c
+index 163cf92eafca8..26e16b5a6a703 100644
+--- a/drivers/staging/media/hantro/rockchip_vpu_hw.c
++++ b/drivers/staging/media/hantro/rockchip_vpu_hw.c
+@@ -63,6 +63,14 @@ static const struct hantro_fmt rockchip_vpu1_postproc_fmts[] = {
+ .fourcc = V4L2_PIX_FMT_YUYV,
+ .codec_mode = HANTRO_MODE_NONE,
+ .postprocessed = true,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ };
+
+@@ -70,17 +78,25 @@ static const struct hantro_fmt rk3066_vpu_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_H264_SLICE,
+ .codec_mode = HANTRO_MODE_H264_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1920,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 1088,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -89,11 +105,11 @@ static const struct hantro_fmt rk3066_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_MPEG2_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1920,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 1088,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -102,11 +118,11 @@ static const struct hantro_fmt rk3066_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_VP8_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1920,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 1088,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -116,17 +132,25 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_4K_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_4K_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_H264_SLICE,
+ .codec_mode = HANTRO_MODE_H264_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 4096,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_4K_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 2304,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_4K_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -135,11 +159,11 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_MPEG2_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1920,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 1088,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -148,31 +172,80 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_VP8_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 3840,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 2160,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+ };
+
+-static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
++static const struct hantro_fmt rockchip_vdpu2_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_H264_SLICE,
+ .codec_mode = HANTRO_MODE_H264_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1920,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
++ },
++ {
++ .fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
++ .codec_mode = HANTRO_MODE_MPEG2_DEC,
++ .max_depth = 2,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
++ },
++ {
++ .fourcc = V4L2_PIX_FMT_VP8_FRAME,
++ .codec_mode = HANTRO_MODE_VP8_DEC,
++ .max_depth = 2,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = MB_DIM,
++ },
++ },
++};
++
++static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
++ {
++ .fourcc = V4L2_PIX_FMT_NV12,
++ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 1088,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -181,11 +254,11 @@ static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_MPEG2_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1920,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 1088,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -194,11 +267,11 @@ static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
+ .codec_mode = HANTRO_MODE_VP8_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 3840,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 2160,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -516,8 +589,8 @@ const struct hantro_variant rk3288_vpu_variant = {
+
+ const struct hantro_variant rk3328_vpu_variant = {
+ .dec_offset = 0x400,
+- .dec_fmts = rk3399_vpu_dec_fmts,
+- .num_dec_fmts = ARRAY_SIZE(rk3399_vpu_dec_fmts),
++ .dec_fmts = rockchip_vdpu2_dec_fmts,
++ .num_dec_fmts = ARRAY_SIZE(rockchip_vdpu2_dec_fmts),
+ .codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER |
+ HANTRO_H264_DECODER,
+ .codec_ops = rk3399_vpu_codec_ops,
+@@ -528,6 +601,11 @@ const struct hantro_variant rk3328_vpu_variant = {
+ .num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names),
+ };
+
++/*
++ * H.264 decoding explicitly disabled in RK3399.
++ * This ensures userspace applications use the Rockchip VDEC core,
++ * which has better performance.
++ */
+ const struct hantro_variant rk3399_vpu_variant = {
+ .enc_offset = 0x0,
+ .enc_fmts = rockchip_vpu_enc_fmts,
+@@ -545,13 +623,27 @@ const struct hantro_variant rk3399_vpu_variant = {
+ .num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names)
+ };
+
++const struct hantro_variant rk3568_vpu_variant = {
++ .dec_offset = 0x400,
++ .dec_fmts = rockchip_vdpu2_dec_fmts,
++ .num_dec_fmts = ARRAY_SIZE(rockchip_vdpu2_dec_fmts),
++ .codec = HANTRO_MPEG2_DECODER |
++ HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
++ .codec_ops = rk3399_vpu_codec_ops,
++ .irqs = rockchip_vdpu2_irqs,
++ .num_irqs = ARRAY_SIZE(rockchip_vdpu2_irqs),
++ .init = rockchip_vpu_hw_init,
++ .clk_names = rockchip_vpu_clk_names,
++ .num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names)
++};
++
+ const struct hantro_variant px30_vpu_variant = {
+ .enc_offset = 0x0,
+ .enc_fmts = rockchip_vpu_enc_fmts,
+ .num_enc_fmts = ARRAY_SIZE(rockchip_vpu_enc_fmts),
+ .dec_offset = 0x400,
+- .dec_fmts = rk3399_vpu_dec_fmts,
+- .num_dec_fmts = ARRAY_SIZE(rk3399_vpu_dec_fmts),
++ .dec_fmts = rockchip_vdpu2_dec_fmts,
++ .num_dec_fmts = ARRAY_SIZE(rockchip_vdpu2_dec_fmts),
+ .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
+ HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
+ .codec_ops = rk3399_vpu_codec_ops,
+diff --git a/drivers/staging/media/hantro/sama5d4_vdec_hw.c b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
+index b2fc1c5613e19..b205e2db5b04d 100644
+--- a/drivers/staging/media/hantro/sama5d4_vdec_hw.c
++++ b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
+@@ -16,6 +16,14 @@ static const struct hantro_fmt sama5d4_vdec_postproc_fmts[] = {
+ .fourcc = V4L2_PIX_FMT_YUYV,
+ .codec_mode = HANTRO_MODE_NONE,
+ .postprocessed = true,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_HD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_HD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ };
+
+@@ -23,17 +31,25 @@ static const struct hantro_fmt sama5d4_vdec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_HD_WIDTH,
++ .step_width = MB_DIM,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_HD_HEIGHT,
++ .step_height = MB_DIM,
++ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
+ .codec_mode = HANTRO_MODE_MPEG2_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1280,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_HD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 720,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_HD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -42,11 +58,11 @@ static const struct hantro_fmt sama5d4_vdec_fmts[] = {
+ .codec_mode = HANTRO_MODE_VP8_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1280,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_HD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 720,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_HD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+@@ -55,11 +71,11 @@ static const struct hantro_fmt sama5d4_vdec_fmts[] = {
+ .codec_mode = HANTRO_MODE_H264_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 1280,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_HD_WIDTH,
+ .step_width = MB_DIM,
+- .min_height = 48,
+- .max_height = 720,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_HD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+diff --git a/drivers/staging/media/hantro/sunxi_vpu_hw.c b/drivers/staging/media/hantro/sunxi_vpu_hw.c
+index c0edd5856a0c8..fbeac81e59e13 100644
+--- a/drivers/staging/media/hantro/sunxi_vpu_hw.c
++++ b/drivers/staging/media/hantro/sunxi_vpu_hw.c
+@@ -14,6 +14,14 @@ static const struct hantro_fmt sunxi_vpu_postproc_fmts[] = {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
+ .postprocessed = true,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = 32,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = 32,
++ },
+ },
+ };
+
+@@ -21,17 +29,25 @@ static const struct hantro_fmt sunxi_vpu_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12_4L4,
+ .codec_mode = HANTRO_MODE_NONE,
++ .frmsize = {
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
++ .step_width = 32,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
++ .step_height = 32,
++ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_VP9_FRAME,
+ .codec_mode = HANTRO_MODE_VP9_DEC,
+ .max_depth = 2,
+ .frmsize = {
+- .min_width = 48,
+- .max_width = 3840,
++ .min_width = FMT_MIN_WIDTH,
++ .max_width = FMT_UHD_WIDTH,
+ .step_width = 32,
+- .min_height = 48,
+- .max_height = 2160,
++ .min_height = FMT_MIN_HEIGHT,
++ .max_height = FMT_UHD_HEIGHT,
+ .step_height = 32,
+ },
+ },
+diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+index 44f385be9f6c6..04419381ea56b 100644
+--- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
++++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+@@ -143,10 +143,13 @@ static void cedrus_h265_frame_info_write_dpb(struct cedrus_ctx *ctx,
+ for (i = 0; i < num_active_dpb_entries; i++) {
+ int buffer_index = vb2_find_timestamp(vq, dpb[i].timestamp, 0);
+ u32 pic_order_cnt[2] = {
+- dpb[i].pic_order_cnt[0],
+- dpb[i].pic_order_cnt[1]
++ dpb[i].pic_order_cnt_val,
++ dpb[i].pic_order_cnt_val
+ };
+
++ if (buffer_index < 0)
++ continue;
++
+ cedrus_h265_frame_info_write_single(ctx, i, dpb[i].field_pic,
+ pic_order_cnt,
+ buffer_index);
+@@ -301,6 +304,31 @@ static void cedrus_h265_write_scaling_list(struct cedrus_ctx *ctx,
+ }
+ }
+
++static int cedrus_h265_is_low_delay(struct cedrus_run *run)
++{
++ const struct v4l2_ctrl_hevc_slice_params *slice_params;
++ const struct v4l2_hevc_dpb_entry *dpb;
++ s32 poc;
++ int i;
++
++ slice_params = run->h265.slice_params;
++ poc = run->h265.decode_params->pic_order_cnt_val;
++ dpb = run->h265.decode_params->dpb;
++
++ for (i = 0; i < slice_params->num_ref_idx_l0_active_minus1 + 1; i++)
++ if (dpb[slice_params->ref_idx_l0[i]].pic_order_cnt_val > poc)
++ return 1;
++
++ if (slice_params->slice_type != V4L2_HEVC_SLICE_TYPE_B)
++ return 0;
++
++ for (i = 0; i < slice_params->num_ref_idx_l1_active_minus1 + 1; i++)
++ if (dpb[slice_params->ref_idx_l1[i]].pic_order_cnt_val > poc)
++ return 1;
++
++ return 0;
++}
++
+ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
+ struct cedrus_run *run)
+ {
+@@ -559,7 +587,6 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
+
+ reg = VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_TC_OFFSET_DIV2(slice_params->slice_tc_offset_div2) |
+ VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_BETA_OFFSET_DIV2(slice_params->slice_beta_offset_div2) |
+- VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(decode_params->num_poc_st_curr_after == 0) |
+ VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CR_QP_OFFSET(slice_params->slice_cr_qp_offset) |
+ VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CB_QP_OFFSET(slice_params->slice_cb_qp_offset) |
+ VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_QP_DELTA(slice_params->slice_qp_delta);
+@@ -572,6 +599,9 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
+ V4L2_HEVC_SLICE_PARAMS_FLAG_SLICE_LOOP_FILTER_ACROSS_SLICES_ENABLED,
+ slice_params->flags);
+
++ if (slice_params->slice_type != V4L2_HEVC_SLICE_TYPE_I && !cedrus_h265_is_low_delay(run))
++ reg |= VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_NOT_LOW_DELAY;
++
+ cedrus_write(dev, VE_DEC_H265_DEC_SLICE_HDR_INFO1, reg);
+
+ chroma_log2_weight_denom = pred_weight_table->luma_log2_weight_denom +
+diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
+index bdb062ad86823..d81f7513ade0d 100644
+--- a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
++++ b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
+@@ -377,13 +377,12 @@
+
+ #define VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_DEBLOCKING_FILTER_DISABLED BIT(23)
+ #define VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_LOOP_FILTER_ACROSS_SLICES_ENABLED BIT(22)
++#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_NOT_LOW_DELAY BIT(21)
+
+ #define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_TC_OFFSET_DIV2(v) \
+ SHIFT_AND_MASK_BITS(v, 31, 28)
+ #define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_BETA_OFFSET_DIV2(v) \
+ SHIFT_AND_MASK_BITS(v, 27, 24)
+-#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(v) \
+- ((v) ? BIT(21) : 0)
+ #define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CR_QP_OFFSET(v) \
+ SHIFT_AND_MASK_BITS(v, 20, 16)
+ #define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CB_QP_OFFSET(v) \
+diff --git a/drivers/staging/rtl8192u/r8192U.h b/drivers/staging/rtl8192u/r8192U.h
+index 14ca00a2789b0..1942cb8493748 100644
+--- a/drivers/staging/rtl8192u/r8192U.h
++++ b/drivers/staging/rtl8192u/r8192U.h
+@@ -1013,7 +1013,7 @@ typedef struct r8192_priv {
+ bool bis_any_nonbepkts;
+ bool bcurrent_turbo_EDCA;
+ bool bis_cur_rdlstate;
+- struct timer_list fsync_timer;
++ struct delayed_work fsync_work;
+ bool bfsync_processing; /* 500ms Fsync timer is active or not */
+ u32 rate_record;
+ u32 rateCountDiffRecord;
+diff --git a/drivers/staging/rtl8192u/r8192U_dm.c b/drivers/staging/rtl8192u/r8192U_dm.c
+index 725bf5ca9e34d..0fcfcaa6500bf 100644
+--- a/drivers/staging/rtl8192u/r8192U_dm.c
++++ b/drivers/staging/rtl8192u/r8192U_dm.c
+@@ -2578,19 +2578,20 @@ static void dm_init_fsync(struct net_device *dev)
+ priv->ieee80211->fsync_seconddiff_ratethreshold = 200;
+ priv->ieee80211->fsync_state = Default_Fsync;
+ priv->framesyncMonitor = 1; /* current default 0xc38 monitor on */
+- timer_setup(&priv->fsync_timer, dm_fsync_timer_callback, 0);
++ INIT_DELAYED_WORK(&priv->fsync_work, dm_fsync_work_callback);
+ }
+
+ static void dm_deInit_fsync(struct net_device *dev)
+ {
+ struct r8192_priv *priv = ieee80211_priv(dev);
+
+- del_timer_sync(&priv->fsync_timer);
++ cancel_delayed_work_sync(&priv->fsync_work);
+ }
+
+-void dm_fsync_timer_callback(struct timer_list *t)
++void dm_fsync_work_callback(struct work_struct *work)
+ {
+- struct r8192_priv *priv = from_timer(priv, t, fsync_timer);
++ struct r8192_priv *priv =
++ container_of(work, struct r8192_priv, fsync_work.work);
+ struct net_device *dev = priv->ieee80211->dev;
+ u32 rate_index, rate_count = 0, rate_count_diff = 0;
+ bool bSwitchFromCountDiff = false;
+@@ -2657,17 +2658,16 @@ void dm_fsync_timer_callback(struct timer_list *t)
+ }
+ }
+ if (bDoubleTimeInterval) {
+- if (timer_pending(&priv->fsync_timer))
+- del_timer_sync(&priv->fsync_timer);
+- priv->fsync_timer.expires = jiffies +
+- msecs_to_jiffies(priv->ieee80211->fsync_time_interval*priv->ieee80211->fsync_multiple_timeinterval);
+- add_timer(&priv->fsync_timer);
++ cancel_delayed_work_sync(&priv->fsync_work);
++ schedule_delayed_work(&priv->fsync_work,
++ msecs_to_jiffies(priv
++ ->ieee80211->fsync_time_interval *
++ priv->ieee80211->fsync_multiple_timeinterval));
+ } else {
+- if (timer_pending(&priv->fsync_timer))
+- del_timer_sync(&priv->fsync_timer);
+- priv->fsync_timer.expires = jiffies +
+- msecs_to_jiffies(priv->ieee80211->fsync_time_interval);
+- add_timer(&priv->fsync_timer);
++ cancel_delayed_work_sync(&priv->fsync_work);
++ schedule_delayed_work(&priv->fsync_work,
++ msecs_to_jiffies(priv
++ ->ieee80211->fsync_time_interval));
+ }
+ } else {
+ /* Let Register return to default value; */
+@@ -2695,7 +2695,7 @@ static void dm_EndSWFsync(struct net_device *dev)
+ struct r8192_priv *priv = ieee80211_priv(dev);
+
+ RT_TRACE(COMP_HALDM, "%s\n", __func__);
+- del_timer_sync(&(priv->fsync_timer));
++ cancel_delayed_work_sync(&priv->fsync_work);
+
+ /* Let Register return to default value; */
+ if (priv->bswitch_fsync) {
+@@ -2736,11 +2736,9 @@ static void dm_StartSWFsync(struct net_device *dev)
+ if (priv->ieee80211->fsync_rate_bitmap & rateBitmap)
+ priv->rate_record += priv->stats.received_rate_histogram[1][rateIndex];
+ }
+- if (timer_pending(&priv->fsync_timer))
+- del_timer_sync(&priv->fsync_timer);
+- priv->fsync_timer.expires = jiffies +
+- msecs_to_jiffies(priv->ieee80211->fsync_time_interval);
+- add_timer(&priv->fsync_timer);
++ cancel_delayed_work_sync(&priv->fsync_work);
++ schedule_delayed_work(&priv->fsync_work,
++ msecs_to_jiffies(priv->ieee80211->fsync_time_interval));
+
+ write_nic_dword(dev, rOFDM0_RxDetector2, 0x465c12cd);
+ }
+diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h
+index 0b2a1c688597c..2159018b4e38f 100644
+--- a/drivers/staging/rtl8192u/r8192U_dm.h
++++ b/drivers/staging/rtl8192u/r8192U_dm.h
+@@ -166,7 +166,7 @@ void dm_force_tx_fw_info(struct net_device *dev,
+ void dm_init_edca_turbo(struct net_device *dev);
+ void dm_rf_operation_test_callback(unsigned long data);
+ void dm_rf_pathcheck_workitemcallback(struct work_struct *work);
+-void dm_fsync_timer_callback(struct timer_list *t);
++void dm_fsync_work_callback(struct work_struct *work);
+ void dm_cck_txpower_adjust(struct net_device *dev, bool binch14);
+ void dm_shadow_init(struct net_device *dev);
+ void dm_initialize_txpower_tracking(struct net_device *dev);
+diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
+index 1c4aac8464a70..1e5a78131aba9 100644
+--- a/drivers/thermal/thermal_sysfs.c
++++ b/drivers/thermal/thermal_sysfs.c
+@@ -813,12 +813,13 @@ static const struct attribute_group cooling_device_stats_attr_group = {
+
+ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
+ {
++ const struct attribute_group *stats_attr_group = NULL;
+ struct cooling_dev_stats *stats;
+ unsigned long states;
+ int var;
+
+ if (cdev->ops->get_max_state(cdev, &states))
+- return;
++ goto out;
+
+ states++; /* Total number of states is highest state + 1 */
+
+@@ -828,7 +829,7 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
+
+ stats = kzalloc(var, GFP_KERNEL);
+ if (!stats)
+- return;
++ goto out;
+
+ stats->time_in_state = (ktime_t *)(stats + 1);
+ stats->trans_table = (unsigned int *)(stats->time_in_state + states);
+@@ -838,9 +839,12 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
+
+ spin_lock_init(&stats->lock);
+
++ stats_attr_group = &cooling_device_stats_attr_group;
++
++out:
+ /* Fill the empty slot left in cooling_device_attr_groups */
+ var = ARRAY_SIZE(cooling_device_attr_groups) - 2;
+- cooling_device_attr_groups[var] = &cooling_device_stats_attr_group;
++ cooling_device_attr_groups[var] = stats_attr_group;
+ }
+
+ static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev)
+diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
+index 9f2a8c0e1e334..3dd71ede4cd44 100644
+--- a/drivers/tty/n_gsm.c
++++ b/drivers/tty/n_gsm.c
+@@ -5,6 +5,14 @@
+ *
+ * * THIS IS A DEVELOPMENT SNAPSHOT IT IS NOT A FINAL RELEASE *
+ *
++ * Outgoing path:
++ * tty -> DLCI fifo -> scheduler -> GSM MUX data queue ---o-> ldisc
++ * control message -> GSM MUX control queue --´
++ *
++ * Incoming path:
++ * ldisc -> gsm_queue() -o--> tty
++ * `-> gsm_control_response()
++ *
+ * TO DO:
+ * Mostly done: ioctls for setting modes/timing
+ * Partly done: hooks so you can pull off frames to non tty devs
+@@ -210,6 +218,9 @@ struct gsm_mux {
+ /* Events on the GSM channel */
+ wait_queue_head_t event;
+
++ /* ldisc send work */
++ struct work_struct tx_work;
++
+ /* Bits for GSM mode decoding */
+
+ /* Framing Layer */
+@@ -235,14 +246,17 @@ struct gsm_mux {
+ struct gsm_dlci *dlci[NUM_DLCI];
+ int old_c_iflag; /* termios c_iflag value before attach */
+ bool constipated; /* Asked by remote to shut up */
++ bool has_devices; /* Devices were registered */
+
+ spinlock_t tx_lock;
+ unsigned int tx_bytes; /* TX data outstanding */
+ #define TX_THRESH_HI 8192
+ #define TX_THRESH_LO 2048
+- struct list_head tx_list; /* Pending data packets */
++ struct list_head tx_ctrl_list; /* Pending control packets */
++ struct list_head tx_data_list; /* Pending data packets */
+
+ /* Control messages */
++ struct timer_list kick_timer; /* Kick TX queuing on timeout */
+ struct timer_list t2_timer; /* Retransmit timer for commands */
+ int cretries; /* Command retry counter */
+ struct gsm_control *pending_cmd;/* Our current pending command */
+@@ -369,6 +383,11 @@ static const u8 gsm_fcs8[256] = {
+
+ static int gsmld_output(struct gsm_mux *gsm, u8 *data, int len);
+ static int gsm_modem_update(struct gsm_dlci *dlci, u8 brk);
++static struct gsm_msg *gsm_data_alloc(struct gsm_mux *gsm, u8 addr, int len,
++ u8 ctrl);
++static int gsm_send_packet(struct gsm_mux *gsm, struct gsm_msg *msg);
++static void gsmld_write_trigger(struct gsm_mux *gsm);
++static void gsmld_write_task(struct work_struct *work);
+
+ /**
+ * gsm_fcs_add - update FCS
+@@ -419,6 +438,27 @@ static int gsm_read_ea(unsigned int *val, u8 c)
+ return c & EA;
+ }
+
++/**
++ * gsm_read_ea_val - read a value until EA
++ * @val: variable holding value
++ * @data: buffer of data
++ * @dlen: length of data
++ *
++ * Processes an EA value. Updates the passed variable and
++ * returns the processed data length.
++ */
++static unsigned int gsm_read_ea_val(unsigned int *val, const u8 *data, int dlen)
++{
++ unsigned int len = 0;
++
++ for (; dlen > 0; dlen--) {
++ len++;
++ if (gsm_read_ea(val, *data++))
++ break;
++ }
++ return len;
++}
++
+ /**
+ * gsm_encode_modem - encode modem data bits
+ * @dlci: DLCI to encode from
+@@ -463,6 +503,68 @@ static void gsm_hex_dump_bytes(const char *fname, const u8 *data,
+ kfree(prefix);
+ }
+
++/**
++ * gsm_register_devices - register all tty devices for a given mux index
++ *
++ * @driver: the tty driver that describes the tty devices
++ * @index: the mux number is used to calculate the minor numbers of the
++ * ttys for this mux and may differ from the position in the
++ * mux array.
++ */
++static int gsm_register_devices(struct tty_driver *driver, unsigned int index)
++{
++ struct device *dev;
++ int i;
++ unsigned int base;
++
++ if (!driver || index >= MAX_MUX)
++ return -EINVAL;
++
++ base = index * NUM_DLCI; /* first minor for this index */
++ for (i = 1; i < NUM_DLCI; i++) {
++ /* Don't register device 0 - this is the control channel
++ * and not a usable tty interface
++ */
++ dev = tty_register_device(gsm_tty_driver, base + i, NULL);
++ if (IS_ERR(dev)) {
++ if (debug & 8)
++ pr_info("%s failed to register device minor %u",
++ __func__, base + i);
++ for (i--; i >= 1; i--)
++ tty_unregister_device(gsm_tty_driver, base + i);
++ return PTR_ERR(dev);
++ }
++ }
++
++ return 0;
++}
++
++/**
++ * gsm_unregister_devices - unregister all tty devices for a given mux index
++ *
++ * @driver: the tty driver that describes the tty devices
++ * @index: the mux number is used to calculate the minor numbers of the
++ * ttys for this mux and may differ from the position in the
++ * mux array.
++ */
++static void gsm_unregister_devices(struct tty_driver *driver,
++ unsigned int index)
++{
++ int i;
++ unsigned int base;
++
++ if (!driver || index >= MAX_MUX)
++ return;
++
++ base = index * NUM_DLCI; /* first minor for this index */
++ for (i = 1; i < NUM_DLCI; i++) {
++ /* Don't unregister device 0 - this is the control
++ * channel and not a usable tty interface
++ */
++ tty_unregister_device(gsm_tty_driver, base + i);
++ }
++}
++
+ /**
+ * gsm_print_packet - display a frame for debug
+ * @hdr: header to print before decode
+@@ -570,57 +672,73 @@ static int gsm_stuff_frame(const u8 *input, u8 *output, int len)
+ * @cr: command/response bit seen as initiator
+ * @control: control byte including PF bit
+ *
+- * Format up and transmit a control frame. These do not go via the
+- * queueing logic as they should be transmitted ahead of data when
+- * they are needed.
+- *
+- * FIXME: Lock versus data TX path
++ * Format up and transmit a control frame. These should be transmitted
++ * ahead of data when they are needed.
+ */
+-
+-static void gsm_send(struct gsm_mux *gsm, int addr, int cr, int control)
++static int gsm_send(struct gsm_mux *gsm, int addr, int cr, int control)
+ {
+- int len;
+- u8 cbuf[10];
+- u8 ibuf[3];
++ struct gsm_msg *msg;
++ u8 *dp;
+ int ocr;
++ unsigned long flags;
++
++ msg = gsm_data_alloc(gsm, addr, 0, control);
++ if (!msg)
++ return -ENOMEM;
+
+ /* toggle C/R coding if not initiator */
+ ocr = cr ^ (gsm->initiator ? 0 : 1);
+
+- switch (gsm->encoding) {
+- case 0:
+- cbuf[0] = GSM0_SOF;
+- cbuf[1] = (addr << 2) | (ocr << 1) | EA;
+- cbuf[2] = control;
+- cbuf[3] = EA; /* Length of data = 0 */
+- cbuf[4] = 0xFF - gsm_fcs_add_block(INIT_FCS, cbuf + 1, 3);
+- cbuf[5] = GSM0_SOF;
+- len = 6;
+- break;
+- case 1:
+- case 2:
+- /* Control frame + packing (but not frame stuffing) in mode 1 */
+- ibuf[0] = (addr << 2) | (ocr << 1) | EA;
+- ibuf[1] = control;
+- ibuf[2] = 0xFF - gsm_fcs_add_block(INIT_FCS, ibuf, 2);
+- /* Stuffing may double the size worst case */
+- len = gsm_stuff_frame(ibuf, cbuf + 1, 3);
+- /* Now add the SOF markers */
+- cbuf[0] = GSM1_SOF;
+- cbuf[len + 1] = GSM1_SOF;
+- /* FIXME: we can omit the lead one in many cases */
+- len += 2;
+- break;
+- default:
+- WARN_ON(1);
+- return;
+- }
+- gsmld_output(gsm, cbuf, len);
+- if (!gsm->initiator) {
+- cr = cr & gsm->initiator;
+- control = control & ~PF;
++ msg->data -= 3;
++ dp = msg->data;
++ *dp++ = (addr << 2) | (ocr << 1) | EA;
++ *dp++ = control;
++
++ if (gsm->encoding == 0)
++ *dp++ = EA; /* Length of data = 0 */
++
++ *dp = 0xFF - gsm_fcs_add_block(INIT_FCS, msg->data, dp - msg->data);
++ msg->len = (dp - msg->data) + 1;
++
++ gsm_print_packet("Q->", addr, cr, control, NULL, 0);
++
++ spin_lock_irqsave(&gsm->tx_lock, flags);
++ list_add_tail(&msg->list, &gsm->tx_ctrl_list);
++ gsm->tx_bytes += msg->len;
++ spin_unlock_irqrestore(&gsm->tx_lock, flags);
++ gsmld_write_trigger(gsm);
++
++ return 0;
++}
++
++/**
++ * gsm_dlci_clear_queues - remove outstanding data for a DLCI
++ * @gsm: mux
++ * @dlci: clear for this DLCI
++ *
++ * Clears the data queues for a given DLCI.
++ */
++static void gsm_dlci_clear_queues(struct gsm_mux *gsm, struct gsm_dlci *dlci)
++{
++ struct gsm_msg *msg, *nmsg;
++ int addr = dlci->addr;
++ unsigned long flags;
++
++ /* Clear DLCI write fifo first */
++ spin_lock_irqsave(&dlci->lock, flags);
++ kfifo_reset(&dlci->fifo);
++ spin_unlock_irqrestore(&dlci->lock, flags);
++
++ /* Clear data packets in MUX write queue */
++ spin_lock_irqsave(&gsm->tx_lock, flags);
++ list_for_each_entry_safe(msg, nmsg, &gsm->tx_data_list, list) {
++ if (msg->addr != addr)
++ continue;
++ gsm->tx_bytes -= msg->len;
++ list_del(&msg->list);
++ kfree(msg);
+ }
+- gsm_print_packet("-->", addr, cr, control, NULL, 0);
++ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+ }
+
+ /**
+@@ -683,59 +801,151 @@ static struct gsm_msg *gsm_data_alloc(struct gsm_mux *gsm, u8 addr, int len,
+ }
+
+ /**
+- * gsm_data_kick - poke the queue
++ * gsm_send_packet - sends a single packet
+ * @gsm: GSM Mux
+- * @dlci: DLCI sending the data
++ * @msg: packet to send
+ *
+- * The tty device has called us to indicate that room has appeared in
+- * the transmit queue. Ram more data into the pipe if we have any
+- * If we have been flow-stopped by a CMD_FCOFF, then we can only
+- * send messages on DLCI0 until CMD_FCON
++ * The given packet is encoded and sent out. No memory is freed.
++ * The caller must hold the gsm tx lock.
++ */
++static int gsm_send_packet(struct gsm_mux *gsm, struct gsm_msg *msg)
++{
++ int len, ret;
++
++
++ if (gsm->encoding == 0) {
++ gsm->txframe[0] = GSM0_SOF;
++ memcpy(gsm->txframe + 1, msg->data, msg->len);
++ gsm->txframe[msg->len + 1] = GSM0_SOF;
++ len = msg->len + 2;
++ } else {
++ gsm->txframe[0] = GSM1_SOF;
++ len = gsm_stuff_frame(msg->data, gsm->txframe + 1, msg->len);
++ gsm->txframe[len + 1] = GSM1_SOF;
++ len += 2;
++ }
++
++ if (debug & 4)
++ gsm_hex_dump_bytes(__func__, gsm->txframe, len);
++ gsm_print_packet("-->", msg->addr, gsm->initiator, msg->ctrl, msg->data,
++ msg->len);
++
++ ret = gsmld_output(gsm, gsm->txframe, len);
++ if (ret <= 0)
++ return ret;
++ /* FIXME: Can eliminate one SOF in many more cases */
++ gsm->tx_bytes -= msg->len;
++
++ return 0;
++}
++
++/**
++ * gsm_is_flow_ctrl_msg - checks if flow control message
++ * @msg: message to check
+ *
+- * FIXME: lock against link layer control transmissions
++ * Returns true if the given message is a flow control command of the
++ * control channel. False is returned in any other case.
+ */
++static bool gsm_is_flow_ctrl_msg(struct gsm_msg *msg)
++{
++ unsigned int cmd;
++
++ if (msg->addr > 0)
++ return false;
++
++ switch (msg->ctrl & ~PF) {
++ case UI:
++ case UIH:
++ cmd = 0;
++ if (gsm_read_ea_val(&cmd, msg->data + 2, msg->len - 2) < 1)
++ break;
++ switch (cmd & ~PF) {
++ case CMD_FCOFF:
++ case CMD_FCON:
++ return true;
++ }
++ break;
++ }
++
++ return false;
++}
+
+-static void gsm_data_kick(struct gsm_mux *gsm, struct gsm_dlci *dlci)
++/**
++ * gsm_data_kick - poke the queue
++ * @gsm: GSM Mux
++ *
++ * The tty device has called us to indicate that room has appeared in
++ * the transmit queue. Ram more data into the pipe if we have any.
++ * If we have been flow-stopped by a CMD_FCOFF, then we can only
++ * send messages on DLCI0 until CMD_FCON. The caller must hold
++ * the gsm tx lock.
++ */
++static int gsm_data_kick(struct gsm_mux *gsm)
+ {
+ struct gsm_msg *msg, *nmsg;
+- int len;
++ struct gsm_dlci *dlci;
++ int ret;
+
+- list_for_each_entry_safe(msg, nmsg, &gsm->tx_list, list) {
+- if (gsm->constipated && msg->addr)
+- continue;
+- if (gsm->encoding != 0) {
+- gsm->txframe[0] = GSM1_SOF;
+- len = gsm_stuff_frame(msg->data,
+- gsm->txframe + 1, msg->len);
+- gsm->txframe[len + 1] = GSM1_SOF;
+- len += 2;
+- } else {
+- gsm->txframe[0] = GSM0_SOF;
+- memcpy(gsm->txframe + 1 , msg->data, msg->len);
+- gsm->txframe[msg->len + 1] = GSM0_SOF;
+- len = msg->len + 2;
+- }
++ clear_bit(TTY_DO_WRITE_WAKEUP, &gsm->tty->flags);
+
+- if (debug & 4)
+- gsm_hex_dump_bytes(__func__, gsm->txframe, len);
+- if (gsmld_output(gsm, gsm->txframe, len) <= 0)
++ /* Serialize control messages and control channel messages first */
++ list_for_each_entry_safe(msg, nmsg, &gsm->tx_ctrl_list, list) {
++ if (gsm->constipated && !gsm_is_flow_ctrl_msg(msg))
++ continue;
++ ret = gsm_send_packet(gsm, msg);
++ switch (ret) {
++ case -ENOSPC:
++ return -ENOSPC;
++ case -ENODEV:
++ /* ldisc not open */
++ gsm->tx_bytes -= msg->len;
++ list_del(&msg->list);
++ kfree(msg);
++ continue;
++ default:
++ if (ret >= 0) {
++ list_del(&msg->list);
++ kfree(msg);
++ }
+ break;
+- /* FIXME: Can eliminate one SOF in many more cases */
+- gsm->tx_bytes -= msg->len;
+-
+- list_del(&msg->list);
+- kfree(msg);
++ }
++ }
+
+- if (dlci) {
+- tty_port_tty_wakeup(&dlci->port);
+- } else {
+- int i = 0;
++ if (gsm->constipated)
++ return -EAGAIN;
+
+- for (i = 0; i < NUM_DLCI; i++)
+- if (gsm->dlci[i])
+- tty_port_tty_wakeup(&gsm->dlci[i]->port);
++ /* Serialize other channels */
++ if (list_empty(&gsm->tx_data_list))
++ return 0;
++ list_for_each_entry_safe(msg, nmsg, &gsm->tx_data_list, list) {
++ dlci = gsm->dlci[msg->addr];
++ /* Send only messages for DLCIs with valid state */
++ if (dlci->state != DLCI_OPEN) {
++ gsm->tx_bytes -= msg->len;
++ list_del(&msg->list);
++ kfree(msg);
++ continue;
++ }
++ ret = gsm_send_packet(gsm, msg);
++ switch (ret) {
++ case -ENOSPC:
++ return -ENOSPC;
++ case -ENODEV:
++ /* ldisc not open */
++ gsm->tx_bytes -= msg->len;
++ list_del(&msg->list);
++ kfree(msg);
++ continue;
++ default:
++ if (ret >= 0) {
++ list_del(&msg->list);
++ kfree(msg);
++ }
++ break;
+ }
+ }
++
++ return 1;
+ }
+
+ /**
+@@ -784,9 +994,22 @@ static void __gsm_data_queue(struct gsm_dlci *dlci, struct gsm_msg *msg)
+ msg->data = dp;
+
+ /* Add to the actual output queue */
+- list_add_tail(&msg->list, &gsm->tx_list);
++ switch (msg->ctrl & ~PF) {
++ case UI:
++ case UIH:
++ if (msg->addr > 0) {
++ list_add_tail(&msg->list, &gsm->tx_data_list);
++ break;
++ }
++ fallthrough;
++ default:
++ list_add_tail(&msg->list, &gsm->tx_ctrl_list);
++ break;
++ }
+ gsm->tx_bytes += msg->len;
+- gsm_data_kick(gsm, dlci);
++
++ gsmld_write_trigger(gsm);
++ mod_timer(&gsm->kick_timer, jiffies + 10 * gsm->t1 * HZ / 100);
+ }
+
+ /**
+@@ -823,41 +1046,48 @@ static int gsm_dlci_data_output(struct gsm_mux *gsm, struct gsm_dlci *dlci)
+ {
+ struct gsm_msg *msg;
+ u8 *dp;
+- int len, total_size, size;
+- int h = dlci->adaption - 1;
++ int h, len, size;
+
+- total_size = 0;
+- while (1) {
+- len = kfifo_len(&dlci->fifo);
+- if (len == 0)
+- return total_size;
+-
+- /* MTU/MRU count only the data bits */
+- if (len > gsm->mtu)
+- len = gsm->mtu;
+-
+- size = len + h;
+-
+- msg = gsm_data_alloc(gsm, dlci->addr, size, gsm->ftype);
+- /* FIXME: need a timer or something to kick this so it can't
+- get stuck with no work outstanding and no buffer free */
+- if (msg == NULL)
+- return -ENOMEM;
+- dp = msg->data;
+- switch (dlci->adaption) {
+- case 1: /* Unstructured */
+- break;
+- case 2: /* Unstructed with modem bits.
+- Always one byte as we never send inline break data */
+- *dp++ = (gsm_encode_modem(dlci) << 1) | EA;
+- break;
+- }
+- WARN_ON(kfifo_out_locked(&dlci->fifo, dp , len, &dlci->lock) != len);
+- __gsm_data_queue(dlci, msg);
+- total_size += size;
++ /* for modem bits without break data */
++ h = ((dlci->adaption == 1) ? 0 : 1);
++
++ len = kfifo_len(&dlci->fifo);
++ if (len == 0)
++ return 0;
++
++ /* MTU/MRU count only the data bits but watch adaption mode */
++ if ((len + h) > gsm->mtu)
++ len = gsm->mtu - h;
++
++ size = len + h;
++
++ msg = gsm_data_alloc(gsm, dlci->addr, size, gsm->ftype);
++ if (!msg)
++ return -ENOMEM;
++ dp = msg->data;
++ switch (dlci->adaption) {
++ case 1: /* Unstructured */
++ break;
++ case 2: /* Unstructured with modem bits.
++ * Always one byte as we never send inline break data
++ */
++ *dp++ = (gsm_encode_modem(dlci) << 1) | EA;
++ break;
++ default:
++ pr_err("%s: unsupported adaption %d\n", __func__,
++ dlci->adaption);
++ break;
+ }
++
++ WARN_ON(len != kfifo_out_locked(&dlci->fifo, dp, len,
++ &dlci->lock));
++
++ /* Notify upper layer about available send space. */
++ tty_port_tty_wakeup(&dlci->port);
++
++ __gsm_data_queue(dlci, msg);
+ /* Bytes of data we used up */
+- return total_size;
++ return size;
+ }
+
+ /**
+@@ -908,9 +1138,6 @@ static int gsm_dlci_data_output_framed(struct gsm_mux *gsm,
+
+ size = len + overhead;
+ msg = gsm_data_alloc(gsm, dlci->addr, size, gsm->ftype);
+-
+- /* FIXME: need a timer or something to kick this so it can't
+- get stuck with no work outstanding and no buffer free */
+ if (msg == NULL) {
+ skb_queue_tail(&dlci->skb_list, dlci->skb);
+ dlci->skb = NULL;
+@@ -1006,32 +1233,43 @@ static int gsm_dlci_modem_output(struct gsm_mux *gsm, struct gsm_dlci *dlci,
+ * renegotiate DLCI priorities with optional stuff. Needs optimising.
+ */
+
+-static void gsm_dlci_data_sweep(struct gsm_mux *gsm)
++static int gsm_dlci_data_sweep(struct gsm_mux *gsm)
+ {
+- int len;
+ /* Priority ordering: We should do priority with RR of the groups */
+- int i = 1;
+-
+- while (i < NUM_DLCI) {
+- struct gsm_dlci *dlci;
++ int i, len, ret = 0;
++ bool sent;
++ struct gsm_dlci *dlci;
+
+- if (gsm->tx_bytes > TX_THRESH_HI)
+- break;
+- dlci = gsm->dlci[i];
+- if (dlci == NULL || dlci->constipated) {
+- i++;
+- continue;
++ while (gsm->tx_bytes < TX_THRESH_HI) {
++ for (sent = false, i = 1; i < NUM_DLCI; i++) {
++ dlci = gsm->dlci[i];
++ /* skip unused or blocked channel */
++ if (!dlci || dlci->constipated)
++ continue;
++ /* skip channels with invalid state */
++ if (dlci->state != DLCI_OPEN)
++ continue;
++ /* count the sent data per adaption */
++ if (dlci->adaption < 3 && !dlci->net)
++ len = gsm_dlci_data_output(gsm, dlci);
++ else
++ len = gsm_dlci_data_output_framed(gsm, dlci);
++ /* on error exit */
++ if (len < 0)
++ return ret;
++ if (len > 0) {
++ ret++;
++ sent = true;
++ /* The lower DLCs can starve the higher DLCs! */
++ break;
++ }
++ /* try next */
+ }
+- if (dlci->adaption < 3 && !dlci->net)
+- len = gsm_dlci_data_output(gsm, dlci);
+- else
+- len = gsm_dlci_data_output_framed(gsm, dlci);
+- if (len < 0)
++ if (!sent)
+ break;
+- /* DLCI empty - try the next */
+- if (len == 0)
+- i++;
+- }
++ };
++
++ return ret;
+ }
+
+ /**
+@@ -1277,7 +1515,6 @@ static void gsm_control_message(struct gsm_mux *gsm, unsigned int command,
+ const u8 *data, int clen)
+ {
+ u8 buf[1];
+- unsigned long flags;
+
+ switch (command) {
+ case CMD_CLD: {
+@@ -1299,9 +1536,7 @@ static void gsm_control_message(struct gsm_mux *gsm, unsigned int command,
+ gsm->constipated = false;
+ gsm_control_reply(gsm, CMD_FCON, NULL, 0);
+ /* Kick the link in case it is idling */
+- spin_lock_irqsave(&gsm->tx_lock, flags);
+- gsm_data_kick(gsm, NULL);
+- spin_unlock_irqrestore(&gsm->tx_lock, flags);
++ gsmld_write_trigger(gsm);
+ break;
+ case CMD_FCOFF:
+ /* Modem wants us to STFU */
+@@ -1407,7 +1642,7 @@ static void gsm_control_retransmit(struct timer_list *t)
+ spin_lock_irqsave(&gsm->control_lock, flags);
+ ctrl = gsm->pending_cmd;
+ if (ctrl) {
+- if (gsm->cretries == 0) {
++ if (gsm->cretries == 0 || !gsm->dlci[0] || gsm->dlci[0]->dead) {
+ gsm->pending_cmd = NULL;
+ ctrl->error = -ETIMEDOUT;
+ ctrl->done = 1;
+@@ -1504,25 +1739,24 @@ static int gsm_control_wait(struct gsm_mux *gsm, struct gsm_control *control)
+
+ static void gsm_dlci_close(struct gsm_dlci *dlci)
+ {
+- unsigned long flags;
+-
+ del_timer(&dlci->t1);
+ if (debug & 8)
+ pr_debug("DLCI %d goes closed.\n", dlci->addr);
+ dlci->state = DLCI_CLOSED;
++ /* Prevent us from sending data before the link is up again */
++ dlci->constipated = true;
+ if (dlci->addr != 0) {
+ tty_port_tty_hangup(&dlci->port, false);
+- spin_lock_irqsave(&dlci->lock, flags);
+- kfifo_reset(&dlci->fifo);
+- spin_unlock_irqrestore(&dlci->lock, flags);
++ gsm_dlci_clear_queues(dlci->gsm, dlci);
+ /* Ensure that gsmtty_open() can return. */
+ tty_port_set_initialized(&dlci->port, 0);
+ wake_up_interruptible(&dlci->port.open_wait);
+ } else
+ dlci->gsm->dead = true;
+- wake_up(&dlci->gsm->event);
+ /* A DLCI 0 close is a MUX termination so we need to kick that
+ back to userspace somehow */
++ gsm_dlci_data_kick(dlci);
++ wake_up(&dlci->gsm->event);
+ }
+
+ /**
+@@ -1539,11 +1773,13 @@ static void gsm_dlci_open(struct gsm_dlci *dlci)
+ del_timer(&dlci->t1);
+ /* This will let a tty open continue */
+ dlci->state = DLCI_OPEN;
++ dlci->constipated = false;
+ if (debug & 8)
+ pr_debug("DLCI %d goes open.\n", dlci->addr);
+ /* Send current modem state */
+ if (dlci->addr)
+ gsm_modem_update(dlci, 0);
++ gsm_dlci_data_kick(dlci);
+ wake_up(&dlci->gsm->event);
+ }
+
+@@ -1569,8 +1805,8 @@ static void gsm_dlci_t1(struct timer_list *t)
+
+ switch (dlci->state) {
+ case DLCI_OPENING:
+- dlci->retries--;
+ if (dlci->retries) {
++ dlci->retries--;
+ gsm_command(dlci->gsm, dlci->addr, SABM|PF);
+ mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
+ } else if (!dlci->addr && gsm->control == (DM | PF)) {
+@@ -1585,8 +1821,8 @@ static void gsm_dlci_t1(struct timer_list *t)
+
+ break;
+ case DLCI_CLOSING:
+- dlci->retries--;
+ if (dlci->retries) {
++ dlci->retries--;
+ gsm_command(dlci->gsm, dlci->addr, DISC|PF);
+ mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
+ } else
+@@ -1619,6 +1855,25 @@ static void gsm_dlci_begin_open(struct gsm_dlci *dlci)
+ mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
+ }
+
++/**
++ * gsm_dlci_set_opening - change state to opening
++ * @dlci: DLCI to open
++ *
++ * Change internal state to wait for DLCI open from initiator side.
++ * We set off timers and responses upon reception of an SABM.
++ */
++static void gsm_dlci_set_opening(struct gsm_dlci *dlci)
++{
++ switch (dlci->state) {
++ case DLCI_CLOSED:
++ case DLCI_CLOSING:
++ dlci->state = DLCI_OPENING;
++ break;
++ default:
++ break;
++ }
++}
++
+ /**
+ * gsm_dlci_begin_close - start channel open procedure
+ * @dlci: DLCI to open
+@@ -1728,6 +1983,30 @@ static void gsm_dlci_command(struct gsm_dlci *dlci, const u8 *data, int len)
+ }
+ }
+
++/**
++ * gsm_kick_timer - transmit if possible
++ * @t: timer contained in our gsm object
++ *
++ * Transmit data from DLCIs if the queue is empty. We can't rely on
++ * a tty wakeup except when we filled the pipe so we need to fire off
++ * new data ourselves in other cases.
++ */
++static void gsm_kick_timer(struct timer_list *t)
++{
++ struct gsm_mux *gsm = from_timer(gsm, t, kick_timer);
++ unsigned long flags;
++ int sent = 0;
++
++ spin_lock_irqsave(&gsm->tx_lock, flags);
++ /* If we have nothing running then we need to fire up */
++ if (gsm->tx_bytes < TX_THRESH_LO)
++ sent = gsm_dlci_data_sweep(gsm);
++ spin_unlock_irqrestore(&gsm->tx_lock, flags);
++
++ if (sent && debug & 4)
++ pr_info("%s TX queue stalled\n", __func__);
++}
++
+ /*
+ * Allocate/Free DLCI channels
+ */
+@@ -1762,10 +2041,13 @@ static struct gsm_dlci *gsm_dlci_alloc(struct gsm_mux *gsm, int addr)
+ dlci->addr = addr;
+ dlci->adaption = gsm->adaption;
+ dlci->state = DLCI_CLOSED;
+- if (addr)
++ if (addr) {
+ dlci->data = gsm_dlci_data;
+- else
++ /* Prevent us from sending data before the link is up */
++ dlci->constipated = true;
++ } else {
+ dlci->data = gsm_dlci_command;
++ }
+ gsm->dlci[addr] = dlci;
+ return dlci;
+ }
+@@ -1929,7 +2211,7 @@ static void gsm_queue(struct gsm_mux *gsm)
+ goto invalid;
+ #endif
+ if (dlci == NULL || dlci->state != DLCI_OPEN) {
+- gsm_command(gsm, address, DM|PF);
++ gsm_response(gsm, address, DM|PF);
+ return;
+ }
+ dlci->data(dlci, gsm->buf, gsm->len);
+@@ -2052,7 +2334,7 @@ static void gsm1_receive(struct gsm_mux *gsm, unsigned char c)
+ } else if ((c & ISO_IEC_646_MASK) == XOFF) {
+ gsm->constipated = false;
+ /* Kick the link in case it is idling */
+- gsm_data_kick(gsm, NULL);
++ gsmld_write_trigger(gsm);
+ return;
+ }
+ if (c == GSM1_SOF) {
+@@ -2180,18 +2462,29 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc)
+ }
+
+ /* Finish outstanding timers, making sure they are done */
++ del_timer_sync(&gsm->kick_timer);
+ del_timer_sync(&gsm->t2_timer);
+
++ /* Finish writing to ldisc */
++ flush_work(&gsm->tx_work);
++
+ /* Free up any link layer users and finally the control channel */
++ if (gsm->has_devices) {
++ gsm_unregister_devices(gsm_tty_driver, gsm->num);
++ gsm->has_devices = false;
++ }
+ for (i = NUM_DLCI - 1; i >= 0; i--)
+ if (gsm->dlci[i])
+ gsm_dlci_release(gsm->dlci[i]);
+ mutex_unlock(&gsm->mutex);
+ /* Now wipe the queues */
+ tty_ldisc_flush(gsm->tty);
+- list_for_each_entry_safe(txq, ntxq, &gsm->tx_list, list)
++ list_for_each_entry_safe(txq, ntxq, &gsm->tx_ctrl_list, list)
++ kfree(txq);
++ INIT_LIST_HEAD(&gsm->tx_ctrl_list);
++ list_for_each_entry_safe(txq, ntxq, &gsm->tx_data_list, list)
+ kfree(txq);
+- INIT_LIST_HEAD(&gsm->tx_list);
++ INIT_LIST_HEAD(&gsm->tx_data_list);
+ }
+
+ /**
+@@ -2206,8 +2499,15 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc)
+ static int gsm_activate_mux(struct gsm_mux *gsm)
+ {
+ struct gsm_dlci *dlci;
++ int ret;
++
++ dlci = gsm_dlci_alloc(gsm, 0);
++ if (dlci == NULL)
++ return -ENOMEM;
+
++ timer_setup(&gsm->kick_timer, gsm_kick_timer, 0);
+ timer_setup(&gsm->t2_timer, gsm_control_retransmit, 0);
++ INIT_WORK(&gsm->tx_work, gsmld_write_task);
+ init_waitqueue_head(&gsm->event);
+ spin_lock_init(&gsm->control_lock);
+ spin_lock_init(&gsm->tx_lock);
+@@ -2217,9 +2517,11 @@ static int gsm_activate_mux(struct gsm_mux *gsm)
+ else
+ gsm->receive = gsm1_receive;
+
+- dlci = gsm_dlci_alloc(gsm, 0);
+- if (dlci == NULL)
+- return -ENOMEM;
++ ret = gsm_register_devices(gsm_tty_driver, gsm->num);
++ if (ret)
++ return ret;
++
++ gsm->has_devices = true;
+ gsm->dead = false; /* Tty opens are now permissible */
+ return 0;
+ }
+@@ -2312,7 +2614,8 @@ static struct gsm_mux *gsm_alloc_mux(void)
+ spin_lock_init(&gsm->lock);
+ mutex_init(&gsm->mutex);
+ kref_init(&gsm->ref);
+- INIT_LIST_HEAD(&gsm->tx_list);
++ INIT_LIST_HEAD(&gsm->tx_ctrl_list);
++ INIT_LIST_HEAD(&gsm->tx_data_list);
+
+ gsm->t1 = T1;
+ gsm->t2 = T2;
+@@ -2469,6 +2772,47 @@ static int gsmld_output(struct gsm_mux *gsm, u8 *data, int len)
+ return gsm->tty->ops->write(gsm->tty, data, len);
+ }
+
++
++/**
++ * gsmld_write_trigger - schedule ldisc write task
++ * @gsm: our mux
++ */
++static void gsmld_write_trigger(struct gsm_mux *gsm)
++{
++ if (!gsm || !gsm->dlci[0] || gsm->dlci[0]->dead)
++ return;
++ schedule_work(&gsm->tx_work);
++}
++
++
++/**
++ * gsmld_write_task - ldisc write task
++ * @work: our tx write work
++ *
++ * Writes out data to the ldisc if possible. We are doing this here to
++ * avoid dead-locking. This returns if no space or data is left for output.
++ */
++static void gsmld_write_task(struct work_struct *work)
++{
++ struct gsm_mux *gsm = container_of(work, struct gsm_mux, tx_work);
++ unsigned long flags;
++ int i, ret;
++
++ /* All outstanding control channel and control messages and one data
++ * frame is sent.
++ */
++ ret = -ENODEV;
++ spin_lock_irqsave(&gsm->tx_lock, flags);
++ if (gsm->tty)
++ ret = gsm_data_kick(gsm);
++ spin_unlock_irqrestore(&gsm->tx_lock, flags);
++
++ if (ret >= 0)
++ for (i = 0; i < NUM_DLCI; i++)
++ if (gsm->dlci[i])
++ tty_port_tty_wakeup(&gsm->dlci[i]->port);
++}
++
+ /**
+ * gsmld_attach_gsm - mode set up
+ * @tty: our tty structure
+@@ -2479,39 +2823,14 @@ static int gsmld_output(struct gsm_mux *gsm, u8 *data, int len)
+ * will need moving to an ioctl path.
+ */
+
+-static int gsmld_attach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
++static void gsmld_attach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
+ {
+- unsigned int base;
+- int ret, i;
+-
+ gsm->tty = tty_kref_get(tty);
+ /* Turn off tty XON/XOFF handling to handle it explicitly. */
+ gsm->old_c_iflag = tty->termios.c_iflag;
+ tty->termios.c_iflag &= (IXON | IXOFF);
+- ret = gsm_activate_mux(gsm);
+- if (ret != 0)
+- tty_kref_put(gsm->tty);
+- else {
+- /* Don't register device 0 - this is the control channel and not
+- a usable tty interface */
+- base = mux_num_to_base(gsm); /* Base for this MUX */
+- for (i = 1; i < NUM_DLCI; i++) {
+- struct device *dev;
+-
+- dev = tty_register_device(gsm_tty_driver,
+- base + i, NULL);
+- if (IS_ERR(dev)) {
+- for (i--; i >= 1; i--)
+- tty_unregister_device(gsm_tty_driver,
+- base + i);
+- return PTR_ERR(dev);
+- }
+- }
+- }
+- return ret;
+ }
+
+-
+ /**
+ * gsmld_detach_gsm - stop doing 0710 mux
+ * @tty: tty attached to the mux
+@@ -2522,12 +2841,7 @@ static int gsmld_attach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
+
+ static void gsmld_detach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
+ {
+- unsigned int base = mux_num_to_base(gsm); /* Base for this MUX */
+- int i;
+-
+ WARN_ON(tty != gsm->tty);
+- for (i = 1; i < NUM_DLCI; i++)
+- tty_unregister_device(gsm_tty_driver, base + i);
+ /* Restore tty XON/XOFF handling. */
+ gsm->tty->termios.c_iflag = gsm->old_c_iflag;
+ tty_kref_put(gsm->tty);
+@@ -2619,7 +2933,6 @@ static void gsmld_close(struct tty_struct *tty)
+ static int gsmld_open(struct tty_struct *tty)
+ {
+ struct gsm_mux *gsm;
+- int ret;
+
+ if (tty->ops->write == NULL)
+ return -EINVAL;
+@@ -2635,12 +2948,13 @@ static int gsmld_open(struct tty_struct *tty)
+ /* Attach the initial passive connection */
+ gsm->encoding = 1;
+
+- ret = gsmld_attach_gsm(tty, gsm);
+- if (ret != 0) {
+- gsm_cleanup_mux(gsm, false);
+- mux_put(gsm);
+- }
+- return ret;
++ gsmld_attach_gsm(tty, gsm);
++
++ timer_setup(&gsm->kick_timer, gsm_kick_timer, 0);
++ timer_setup(&gsm->t2_timer, gsm_control_retransmit, 0);
++ INIT_WORK(&gsm->tx_work, gsmld_write_task);
++
++ return 0;
+ }
+
+ /**
+@@ -2655,16 +2969,9 @@ static int gsmld_open(struct tty_struct *tty)
+ static void gsmld_write_wakeup(struct tty_struct *tty)
+ {
+ struct gsm_mux *gsm = tty->disc_data;
+- unsigned long flags;
+
+ /* Queue poll */
+- clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+- spin_lock_irqsave(&gsm->tx_lock, flags);
+- gsm_data_kick(gsm, NULL);
+- if (gsm->tx_bytes < TX_THRESH_LO) {
+- gsm_dlci_data_sweep(gsm);
+- }
+- spin_unlock_irqrestore(&gsm->tx_lock, flags);
++ gsmld_write_trigger(gsm);
+ }
+
+ /**
+@@ -2708,11 +3015,24 @@ static ssize_t gsmld_read(struct tty_struct *tty, struct file *file,
+ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file,
+ const unsigned char *buf, size_t nr)
+ {
+- int space = tty_write_room(tty);
++ struct gsm_mux *gsm = tty->disc_data;
++ unsigned long flags;
++ int space;
++ int ret;
++
++ if (!gsm)
++ return -ENODEV;
++
++ ret = -ENOBUFS;
++ spin_lock_irqsave(&gsm->tx_lock, flags);
++ space = tty_write_room(tty);
+ if (space >= nr)
+- return tty->ops->write(tty, buf, nr);
+- set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+- return -ENOBUFS;
++ ret = tty->ops->write(tty, buf, nr);
++ else
++ set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
++ spin_unlock_irqrestore(&gsm->tx_lock, flags);
++
++ return ret;
+ }
+
+ /**
+@@ -2737,12 +3057,15 @@ static __poll_t gsmld_poll(struct tty_struct *tty, struct file *file,
+
+ poll_wait(file, &tty->read_wait, wait);
+ poll_wait(file, &tty->write_wait, wait);
++
++ if (gsm->dead)
++ mask |= EPOLLHUP;
+ if (tty_hung_up_p(file))
+ mask |= EPOLLHUP;
++ if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
++ mask |= EPOLLHUP;
+ if (!tty_is_writelocked(tty) && tty_write_room(tty) > 0)
+ mask |= EPOLLOUT | EPOLLWRNORM;
+- if (gsm->dead)
+- mask |= EPOLLHUP;
+ return mask;
+ }
+
+@@ -3178,6 +3501,8 @@ static int gsmtty_open(struct tty_struct *tty, struct file *filp)
+ /* Start sending off SABM messages */
+ if (gsm->initiator)
+ gsm_dlci_begin_open(dlci);
++ else
++ gsm_dlci_set_opening(dlci);
+ /* And wait for virtual carrier */
+ return tty_port_block_til_ready(port, tty, filp);
+ }
+diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h
+index db784ace25d83..467372534d1cd 100644
+--- a/drivers/tty/serial/8250/8250.h
++++ b/drivers/tty/serial/8250/8250.h
+@@ -120,6 +120,28 @@ static inline void serial_out(struct uart_8250_port *up, int offset, int value)
+ up->port.serial_out(&up->port, offset, value);
+ }
+
++/*
++ * For the 16C950
++ */
++static void serial_icr_write(struct uart_8250_port *up, int offset, int value)
++{
++ serial_out(up, UART_SCR, offset);
++ serial_out(up, UART_ICR, value);
++}
++
++static unsigned int __maybe_unused serial_icr_read(struct uart_8250_port *up,
++ int offset)
++{
++ unsigned int value;
++
++ serial_icr_write(up, UART_ACR, up->acr | UART_ACR_ICRRD);
++ serial_out(up, UART_SCR, offset);
++ value = serial_in(up, UART_ICR);
++ serial_icr_write(up, UART_ACR, up->acr);
++
++ return value;
++}
++
+ void serial8250_clear_and_reinit_fifos(struct uart_8250_port *p);
+
+ static inline int serial_dl_read(struct uart_8250_port *up)
+diff --git a/drivers/tty/serial/8250/8250_bcm2835aux.c b/drivers/tty/serial/8250/8250_bcm2835aux.c
+index 2a1226a78a0c2..21939bb44613e 100644
+--- a/drivers/tty/serial/8250/8250_bcm2835aux.c
++++ b/drivers/tty/serial/8250/8250_bcm2835aux.c
+@@ -166,8 +166,10 @@ static int bcm2835aux_serial_probe(struct platform_device *pdev)
+ uartclk = clk_get_rate(data->clk);
+ if (!uartclk) {
+ ret = device_property_read_u32(&pdev->dev, "clock-frequency", &uartclk);
+- if (ret)
+- return dev_err_probe(&pdev->dev, ret, "could not get clk rate\n");
++ if (ret) {
++ dev_err_probe(&pdev->dev, ret, "could not get clk rate\n");
++ goto dis_clk;
++ }
+ }
+
+ /* the HW-clock divider for bcm2835aux is 8,
+diff --git a/drivers/tty/serial/8250/8250_bcm7271.c b/drivers/tty/serial/8250/8250_bcm7271.c
+index 9b878d023dac8..8efdc271eb75f 100644
+--- a/drivers/tty/serial/8250/8250_bcm7271.c
++++ b/drivers/tty/serial/8250/8250_bcm7271.c
+@@ -1139,16 +1139,19 @@ static int __maybe_unused brcmuart_suspend(struct device *dev)
+ struct brcmuart_priv *priv = dev_get_drvdata(dev);
+ struct uart_8250_port *up = serial8250_get_port(priv->line);
+ struct uart_port *port = &up->port;
+-
+- serial8250_suspend_port(priv->line);
+- clk_disable_unprepare(priv->baud_mux_clk);
++ unsigned long flags;
+
+ /*
+ * This will prevent resume from enabling RTS before the
+- * baud rate has been resored.
++ * baud rate has been restored.
+ */
++ spin_lock_irqsave(&port->lock, flags);
+ priv->saved_mctrl = port->mctrl;
+- port->mctrl = 0;
++ port->mctrl &= ~TIOCM_RTS;
++ spin_unlock_irqrestore(&port->lock, flags);
++
++ serial8250_suspend_port(priv->line);
++ clk_disable_unprepare(priv->baud_mux_clk);
+
+ return 0;
+ }
+@@ -1158,6 +1161,7 @@ static int __maybe_unused brcmuart_resume(struct device *dev)
+ struct brcmuart_priv *priv = dev_get_drvdata(dev);
+ struct uart_8250_port *up = serial8250_get_port(priv->line);
+ struct uart_port *port = &up->port;
++ unsigned long flags;
+ int ret;
+
+ ret = clk_prepare_enable(priv->baud_mux_clk);
+@@ -1180,7 +1184,15 @@ static int __maybe_unused brcmuart_resume(struct device *dev)
+ start_rx_dma(serial8250_get_port(priv->line));
+ }
+ serial8250_resume_port(priv->line);
+- port->mctrl = priv->saved_mctrl;
++
++ if (priv->saved_mctrl & TIOCM_RTS) {
++ /* Restore RTS */
++ spin_lock_irqsave(&port->lock, flags);
++ port->mctrl |= TIOCM_RTS;
++ port->ops->set_mctrl(port, port->mctrl);
++ spin_unlock_irqrestore(&port->lock, flags);
++ }
++
+ return 0;
+ }
+
+diff --git a/drivers/tty/serial/8250/8250_fsl.c b/drivers/tty/serial/8250/8250_fsl.c
+index 9c01c531349df..71ce436857977 100644
+--- a/drivers/tty/serial/8250/8250_fsl.c
++++ b/drivers/tty/serial/8250/8250_fsl.c
+@@ -77,7 +77,7 @@ int fsl8250_handle_irq(struct uart_port *port)
+ if ((lsr & UART_LSR_THRE) && (up->ier & UART_IER_THRI))
+ serial8250_tx_chars(up);
+
+- up->lsr_saved_flags = orig_lsr;
++ up->lsr_saved_flags |= orig_lsr & UART_LSR_BI;
+
+ uart_unlock_and_check_sysrq_irqrestore(&up->port, flags);
+
+diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
+index a293e9f107d0f..aeac20f7cbb2e 100644
+--- a/drivers/tty/serial/8250/8250_pci.c
++++ b/drivers/tty/serial/8250/8250_pci.c
+@@ -11,6 +11,7 @@
+ #include <linux/pci.h>
+ #include <linux/string.h>
+ #include <linux/kernel.h>
++#include <linux/math.h>
+ #include <linux/slab.h>
+ #include <linux/delay.h>
+ #include <linux/tty.h>
+@@ -994,41 +995,29 @@ static void pci_ite887x_exit(struct pci_dev *dev)
+ }
+
+ /*
+- * EndRun Technologies.
+- * Determine the number of ports available on the device.
++ * Oxford Semiconductor Inc.
++ * Check if an OxSemi device is part of the Tornado range of devices.
+ */
+ #define PCI_VENDOR_ID_ENDRUN 0x7401
+ #define PCI_DEVICE_ID_ENDRUN_1588 0xe100
+
+-static int pci_endrun_init(struct pci_dev *dev)
++static bool pci_oxsemi_tornado_p(struct pci_dev *dev)
+ {
+- u8 __iomem *p;
+- unsigned long deviceID;
+- unsigned int number_uarts = 0;
++ /* OxSemi Tornado devices are all 0xCxxx */
++ if (dev->vendor == PCI_VENDOR_ID_OXSEMI &&
++ (dev->device & 0xf000) != 0xc000)
++ return false;
+
+- /* EndRun device is all 0xexxx */
++ /* EndRun devices are all 0xExxx */
+ if (dev->vendor == PCI_VENDOR_ID_ENDRUN &&
+- (dev->device & 0xf000) != 0xe000)
+- return 0;
+-
+- p = pci_iomap(dev, 0, 5);
+- if (p == NULL)
+- return -ENOMEM;
++ (dev->device & 0xf000) != 0xe000)
++ return false;
+
+- deviceID = ioread32(p);
+- /* EndRun device */
+- if (deviceID == 0x07000200) {
+- number_uarts = ioread8(p + 4);
+- pci_dbg(dev, "%d ports detected on EndRun PCI Express device\n", number_uarts);
+- }
+- pci_iounmap(dev, p);
+- return number_uarts;
++ return true;
+ }
+
+ /*
+- * Oxford Semiconductor Inc.
+- * Check that device is part of the Tornado range of devices, then determine
+- * the number of ports available on the device.
++ * Determine the number of ports available on a Tornado device.
+ */
+ static int pci_oxsemi_tornado_init(struct pci_dev *dev)
+ {
+@@ -1036,9 +1025,7 @@ static int pci_oxsemi_tornado_init(struct pci_dev *dev)
+ unsigned long deviceID;
+ unsigned int number_uarts = 0;
+
+- /* OxSemi Tornado devices are all 0xCxxx */
+- if (dev->vendor == PCI_VENDOR_ID_OXSEMI &&
+- (dev->device & 0xF000) != 0xC000)
++ if (!pci_oxsemi_tornado_p(dev))
+ return 0;
+
+ p = pci_iomap(dev, 0, 5);
+@@ -1049,12 +1036,217 @@ static int pci_oxsemi_tornado_init(struct pci_dev *dev)
+ /* Tornado device */
+ if (deviceID == 0x07000200) {
+ number_uarts = ioread8(p + 4);
+- pci_dbg(dev, "%d ports detected on Oxford PCI Express device\n", number_uarts);
++ pci_dbg(dev, "%d ports detected on %s PCI Express device\n",
++ number_uarts,
++ dev->vendor == PCI_VENDOR_ID_ENDRUN ?
++ "EndRun" : "Oxford");
+ }
+ pci_iounmap(dev, p);
+ return number_uarts;
+ }
+
++/* Tornado-specific constants for the TCR and CPR registers; see below. */
++#define OXSEMI_TORNADO_TCR_MASK 0xf
++#define OXSEMI_TORNADO_CPR_MASK 0x1ff
++#define OXSEMI_TORNADO_CPR_MIN 0x008
++#define OXSEMI_TORNADO_CPR_DEF 0x10f
++
++/*
++ * Determine the oversampling rate, the clock prescaler, and the clock
++ * divisor for the requested baud rate. The clock rate is 62.5 MHz,
++ * which is four times the baud base, and the prescaler increments in
++ * steps of 1/8. Therefore to make calculations on integers we need
++ * to use a scaled clock rate, which is the baud base multiplied by 32
++ * (or our assumed UART clock rate multiplied by 2).
++ *
++ * The allowed oversampling rates are from 4 up to 16 inclusive (values
++ * from 0 to 3 inclusive map to 16). Likewise the clock prescaler allows
++ * values between 1.000 and 63.875 inclusive (operation for values from
++ * 0.000 to 0.875 has not been specified). The clock divisor is the usual
++ * unsigned 16-bit integer.
++ *
++ * For the most accurate baud rate we use a table of predetermined
++ * oversampling rates and clock prescalers that records all possible
++ * products of the two parameters in the range from 4 up to 255 inclusive,
++ * and additionally 335 for the 1500000bps rate, with the prescaler scaled
++ * by 8. The table is sorted by the decreasing value of the oversampling
++ * rate and ties are resolved by sorting by the decreasing value of the
++ * product. This way preference is given to higher oversampling rates.
++ *
++ * We iterate over the table and choose the product of an oversampling
++ * rate and a clock prescaler that gives the lowest integer division
++ * result deviation, or if an exact integer divider is found we stop
++ * looking for it right away. We do some fixup if the resulting clock
++ * divisor required would be out of its unsigned 16-bit integer range.
++ *
++ * Finally we abuse the supposed fractional part returned to encode the
++ * 4-bit value of the oversampling rate and the 9-bit value of the clock
++ * prescaler which will end up in the TCR and CPR/CPR2 registers.
++ */
++static unsigned int pci_oxsemi_tornado_get_divisor(struct uart_port *port,
++ unsigned int baud,
++ unsigned int *frac)
++{
++ static u8 p[][2] = {
++ { 16, 14, }, { 16, 13, }, { 16, 12, }, { 16, 11, },
++ { 16, 10, }, { 16, 9, }, { 16, 8, }, { 15, 17, },
++ { 15, 16, }, { 15, 15, }, { 15, 14, }, { 15, 13, },
++ { 15, 12, }, { 15, 11, }, { 15, 10, }, { 15, 9, },
++ { 15, 8, }, { 14, 18, }, { 14, 17, }, { 14, 14, },
++ { 14, 13, }, { 14, 12, }, { 14, 11, }, { 14, 10, },
++ { 14, 9, }, { 14, 8, }, { 13, 19, }, { 13, 18, },
++ { 13, 17, }, { 13, 13, }, { 13, 12, }, { 13, 11, },
++ { 13, 10, }, { 13, 9, }, { 13, 8, }, { 12, 19, },
++ { 12, 18, }, { 12, 17, }, { 12, 11, }, { 12, 9, },
++ { 12, 8, }, { 11, 23, }, { 11, 22, }, { 11, 21, },
++ { 11, 20, }, { 11, 19, }, { 11, 18, }, { 11, 17, },
++ { 11, 11, }, { 11, 10, }, { 11, 9, }, { 11, 8, },
++ { 10, 25, }, { 10, 23, }, { 10, 20, }, { 10, 19, },
++ { 10, 17, }, { 10, 10, }, { 10, 9, }, { 10, 8, },
++ { 9, 27, }, { 9, 23, }, { 9, 21, }, { 9, 19, },
++ { 9, 18, }, { 9, 17, }, { 9, 9, }, { 9, 8, },
++ { 8, 31, }, { 8, 29, }, { 8, 23, }, { 8, 19, },
++ { 8, 17, }, { 8, 8, }, { 7, 35, }, { 7, 31, },
++ { 7, 29, }, { 7, 25, }, { 7, 23, }, { 7, 21, },
++ { 7, 19, }, { 7, 17, }, { 7, 15, }, { 7, 14, },
++ { 7, 13, }, { 7, 12, }, { 7, 11, }, { 7, 10, },
++ { 7, 9, }, { 7, 8, }, { 6, 41, }, { 6, 37, },
++ { 6, 31, }, { 6, 29, }, { 6, 23, }, { 6, 19, },
++ { 6, 17, }, { 6, 13, }, { 6, 11, }, { 6, 10, },
++ { 6, 9, }, { 6, 8, }, { 5, 67, }, { 5, 47, },
++ { 5, 43, }, { 5, 41, }, { 5, 37, }, { 5, 31, },
++ { 5, 29, }, { 5, 25, }, { 5, 23, }, { 5, 19, },
++ { 5, 17, }, { 5, 15, }, { 5, 13, }, { 5, 11, },
++ { 5, 10, }, { 5, 9, }, { 5, 8, }, { 4, 61, },
++ { 4, 59, }, { 4, 53, }, { 4, 47, }, { 4, 43, },
++ { 4, 41, }, { 4, 37, }, { 4, 31, }, { 4, 29, },
++ { 4, 23, }, { 4, 19, }, { 4, 17, }, { 4, 13, },
++ { 4, 9, }, { 4, 8, },
++ };
++ /* Scale the quotient for comparison to get the fractional part. */
++ const unsigned int quot_scale = 65536;
++ unsigned int sclk = port->uartclk * 2;
++ unsigned int sdiv = DIV_ROUND_CLOSEST(sclk, baud);
++ unsigned int best_squot;
++ unsigned int squot;
++ unsigned int quot;
++ u16 cpr;
++ u8 tcr;
++ int i;
++
++ /* Old custom speed handling. */
++ if (baud == 38400 && (port->flags & UPF_SPD_MASK) == UPF_SPD_CUST) {
++ unsigned int cust_div = port->custom_divisor;
++
++ quot = cust_div & UART_DIV_MAX;
++ tcr = (cust_div >> 16) & OXSEMI_TORNADO_TCR_MASK;
++ cpr = (cust_div >> 20) & OXSEMI_TORNADO_CPR_MASK;
++ if (cpr < OXSEMI_TORNADO_CPR_MIN)
++ cpr = OXSEMI_TORNADO_CPR_DEF;
++ } else {
++ best_squot = quot_scale;
++ for (i = 0; i < ARRAY_SIZE(p); i++) {
++ unsigned int spre;
++ unsigned int srem;
++ u8 cp;
++ u8 tc;
++
++ tc = p[i][0];
++ cp = p[i][1];
++ spre = tc * cp;
++
++ srem = sdiv % spre;
++ if (srem > spre / 2)
++ srem = spre - srem;
++ squot = DIV_ROUND_CLOSEST(srem * quot_scale, spre);
++
++ if (srem == 0) {
++ tcr = tc;
++ cpr = cp;
++ quot = sdiv / spre;
++ break;
++ } else if (squot < best_squot) {
++ best_squot = squot;
++ tcr = tc;
++ cpr = cp;
++ quot = DIV_ROUND_CLOSEST(sdiv, spre);
++ }
++ }
++ while (tcr <= (OXSEMI_TORNADO_TCR_MASK + 1) >> 1 &&
++ quot % 2 == 0) {
++ quot >>= 1;
++ tcr <<= 1;
++ }
++ while (quot > UART_DIV_MAX) {
++ if (tcr <= (OXSEMI_TORNADO_TCR_MASK + 1) >> 1) {
++ quot >>= 1;
++ tcr <<= 1;
++ } else if (cpr <= OXSEMI_TORNADO_CPR_MASK >> 1) {
++ quot >>= 1;
++ cpr <<= 1;
++ } else {
++ quot = quot * cpr / OXSEMI_TORNADO_CPR_MASK;
++ cpr = OXSEMI_TORNADO_CPR_MASK;
++ }
++ }
++ }
++
++ *frac = (cpr << 8) | (tcr & OXSEMI_TORNADO_TCR_MASK);
++ return quot;
++}
++
++/*
++ * Set the oversampling rate in the transmitter clock cycle register (TCR),
++ * the clock prescaler in the clock prescaler register (CPR and CPR2), and
++ * the clock divisor in the divisor latch (DLL and DLM). Note that for
++ * backwards compatibility any write to CPR clears CPR2 and therefore CPR
++ * has to be written first, followed by CPR2, which occupies the location
++ * of CKS used with earlier UART designs.
++ */
++static void pci_oxsemi_tornado_set_divisor(struct uart_port *port,
++ unsigned int baud,
++ unsigned int quot,
++ unsigned int quot_frac)
++{
++ struct uart_8250_port *up = up_to_u8250p(port);
++ u8 cpr2 = quot_frac >> 16;
++ u8 cpr = quot_frac >> 8;
++ u8 tcr = quot_frac;
++
++ serial_icr_write(up, UART_TCR, tcr);
++ serial_icr_write(up, UART_CPR, cpr);
++ serial_icr_write(up, UART_CKS, cpr2);
++ serial8250_do_set_divisor(port, baud, quot, 0);
++}
++
++/*
++ * For Tornado devices we force MCR[7] set for the Divide-by-M N/8 baud rate
++ * generator prescaler (CPR and CPR2). Otherwise no prescaler would be used.
++ */
++static void pci_oxsemi_tornado_set_mctrl(struct uart_port *port,
++ unsigned int mctrl)
++{
++ struct uart_8250_port *up = up_to_u8250p(port);
++
++ up->mcr |= UART_MCR_CLKSEL;
++ serial8250_do_set_mctrl(port, mctrl);
++}
++
++static int pci_oxsemi_tornado_setup(struct serial_private *priv,
++ const struct pciserial_board *board,
++ struct uart_8250_port *up, int idx)
++{
++ struct pci_dev *dev = priv->dev;
++
++ if (pci_oxsemi_tornado_p(dev)) {
++ up->port.get_divisor = pci_oxsemi_tornado_get_divisor;
++ up->port.set_divisor = pci_oxsemi_tornado_set_divisor;
++ up->port.set_mctrl = pci_oxsemi_tornado_set_mctrl;
++ }
++
++ return pci_default_setup(priv, board, up, idx);
++}
++
+ static int pci_asix_setup(struct serial_private *priv,
+ const struct pciserial_board *board,
+ struct uart_8250_port *port, int idx)
+@@ -2244,7 +2436,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
+ .device = PCI_ANY_ID,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+- .init = pci_endrun_init,
++ .init = pci_oxsemi_tornado_init,
+ .setup = pci_default_setup,
+ },
+ /*
+@@ -2256,7 +2448,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .init = pci_oxsemi_tornado_init,
+- .setup = pci_default_setup,
++ .setup = pci_oxsemi_tornado_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_MAINPINE,
+@@ -2264,7 +2456,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .init = pci_oxsemi_tornado_init,
+- .setup = pci_default_setup,
++ .setup = pci_oxsemi_tornado_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_DIGI,
+@@ -2272,7 +2464,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
+ .subvendor = PCI_SUBVENDOR_ID_IBM,
+ .subdevice = PCI_ANY_ID,
+ .init = pci_oxsemi_tornado_init,
+- .setup = pci_default_setup,
++ .setup = pci_oxsemi_tornado_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_INTEL,
+@@ -2589,7 +2781,7 @@ enum pci_board_num_t {
+ pbn_b0_2_1843200,
+ pbn_b0_4_1843200,
+
+- pbn_b0_1_3906250,
++ pbn_b0_1_15625000,
+
+ pbn_b0_bt_1_115200,
+ pbn_b0_bt_2_115200,
+@@ -2667,12 +2859,11 @@ enum pci_board_num_t {
+ pbn_panacom2,
+ pbn_panacom4,
+ pbn_plx_romulus,
+- pbn_endrun_2_3906250,
+ pbn_oxsemi,
+- pbn_oxsemi_1_3906250,
+- pbn_oxsemi_2_3906250,
+- pbn_oxsemi_4_3906250,
+- pbn_oxsemi_8_3906250,
++ pbn_oxsemi_1_15625000,
++ pbn_oxsemi_2_15625000,
++ pbn_oxsemi_4_15625000,
++ pbn_oxsemi_8_15625000,
+ pbn_intel_i960,
+ pbn_sgi_ioc3,
+ pbn_computone_4,
+@@ -2815,10 +3006,10 @@ static struct pciserial_board pci_boards[] = {
+ .uart_offset = 8,
+ },
+
+- [pbn_b0_1_3906250] = {
++ [pbn_b0_1_15625000] = {
+ .flags = FL_BASE0,
+ .num_ports = 1,
+- .base_baud = 3906250,
++ .base_baud = 15625000,
+ .uart_offset = 8,
+ },
+
+@@ -3189,20 +3380,6 @@ static struct pciserial_board pci_boards[] = {
+ .first_offset = 0x03,
+ },
+
+- /*
+- * EndRun Technologies
+- * Uses the size of PCI Base region 0 to
+- * signal now many ports are available
+- * 2 port 952 Uart support
+- */
+- [pbn_endrun_2_3906250] = {
+- .flags = FL_BASE0,
+- .num_ports = 2,
+- .base_baud = 3906250,
+- .uart_offset = 0x200,
+- .first_offset = 0x1000,
+- },
+-
+ /*
+ * This board uses the size of PCI Base region 0 to
+ * signal now many ports are available
+@@ -3213,31 +3390,31 @@ static struct pciserial_board pci_boards[] = {
+ .base_baud = 115200,
+ .uart_offset = 8,
+ },
+- [pbn_oxsemi_1_3906250] = {
++ [pbn_oxsemi_1_15625000] = {
+ .flags = FL_BASE0,
+ .num_ports = 1,
+- .base_baud = 3906250,
++ .base_baud = 15625000,
+ .uart_offset = 0x200,
+ .first_offset = 0x1000,
+ },
+- [pbn_oxsemi_2_3906250] = {
++ [pbn_oxsemi_2_15625000] = {
+ .flags = FL_BASE0,
+ .num_ports = 2,
+- .base_baud = 3906250,
++ .base_baud = 15625000,
+ .uart_offset = 0x200,
+ .first_offset = 0x1000,
+ },
+- [pbn_oxsemi_4_3906250] = {
++ [pbn_oxsemi_4_15625000] = {
+ .flags = FL_BASE0,
+ .num_ports = 4,
+- .base_baud = 3906250,
++ .base_baud = 15625000,
+ .uart_offset = 0x200,
+ .first_offset = 0x1000,
+ },
+- [pbn_oxsemi_8_3906250] = {
++ [pbn_oxsemi_8_15625000] = {
+ .flags = FL_BASE0,
+ .num_ports = 8,
+- .base_baud = 3906250,
++ .base_baud = 15625000,
+ .uart_offset = 0x200,
+ .first_offset = 0x1000,
+ },
+@@ -4109,13 +4286,6 @@ static const struct pci_device_id serial_pci_tbl[] = {
+ { PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_ROMULUS,
+ 0x10b5, 0x106a, 0, 0,
+ pbn_plx_romulus },
+- /*
+- * EndRun Technologies. PCI express device range.
+- * EndRun PTP/1588 has 2 Native UARTs.
+- */
+- { PCI_VENDOR_ID_ENDRUN, PCI_DEVICE_ID_ENDRUN_1588,
+- PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_endrun_2_3906250 },
+ /*
+ * Quatech cards. These actually have configurable clocks but for
+ * now we just use the default.
+@@ -4225,158 +4395,165 @@ static const struct pci_device_id serial_pci_tbl[] = {
+ */
+ { PCI_VENDOR_ID_OXSEMI, 0xc101, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc105, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc11b, /* OXPCIe952 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc11f, /* OXPCIe952 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc120, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc124, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc138, /* OXPCIe952 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc13d, /* OXPCIe952 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc140, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc141, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc144, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc145, /* OXPCIe952 1 Legacy UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_b0_1_3906250 },
++ pbn_b0_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc158, /* OXPCIe952 2 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_2_3906250 },
++ pbn_oxsemi_2_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc15d, /* OXPCIe952 2 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_2_3906250 },
++ pbn_oxsemi_2_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc208, /* OXPCIe954 4 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_4_3906250 },
++ pbn_oxsemi_4_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc20d, /* OXPCIe954 4 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_4_3906250 },
++ pbn_oxsemi_4_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc308, /* OXPCIe958 8 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_8_3906250 },
++ pbn_oxsemi_8_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc30d, /* OXPCIe958 8 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_8_3906250 },
++ pbn_oxsemi_8_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc40b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc40f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc41b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc41f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc42b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc42f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc43b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc43f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc44b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc44f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc45b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc45f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc46b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc46f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc47b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc47f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc48b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc48f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc49b, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc49f, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc4ab, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc4af, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc4bb, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc4bf, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc4cb, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_OXSEMI, 0xc4cf, /* OXPCIe200 1 Native UART */
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ /*
+ * Mainpine Inc. IQ Express "Rev3" utilizing OxSemi Tornado
+ */
+ { PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 1 Port V.34 Super-G3 Fax */
+ PCI_VENDOR_ID_MAINPINE, 0x4001, 0, 0,
+- pbn_oxsemi_1_3906250 },
++ pbn_oxsemi_1_15625000 },
+ { PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 2 Port V.34 Super-G3 Fax */
+ PCI_VENDOR_ID_MAINPINE, 0x4002, 0, 0,
+- pbn_oxsemi_2_3906250 },
++ pbn_oxsemi_2_15625000 },
+ { PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 4 Port V.34 Super-G3 Fax */
+ PCI_VENDOR_ID_MAINPINE, 0x4004, 0, 0,
+- pbn_oxsemi_4_3906250 },
++ pbn_oxsemi_4_15625000 },
+ { PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 8 Port V.34 Super-G3 Fax */
+ PCI_VENDOR_ID_MAINPINE, 0x4008, 0, 0,
+- pbn_oxsemi_8_3906250 },
++ pbn_oxsemi_8_15625000 },
+
+ /*
+ * Digi/IBM PCIe 2-port Async EIA-232 Adapter utilizing OxSemi Tornado
+ */
+ { PCI_VENDOR_ID_DIGI, PCIE_DEVICE_ID_NEO_2_OX_IBM,
+ PCI_SUBVENDOR_ID_IBM, PCI_ANY_ID, 0, 0,
+- pbn_oxsemi_2_3906250 },
++ pbn_oxsemi_2_15625000 },
++ /*
++ * EndRun Technologies. PCI express device range.
++ * EndRun PTP/1588 has 2 Native UARTs utilizing OxSemi 952.
++ */
++ { PCI_VENDOR_ID_ENDRUN, PCI_DEVICE_ID_ENDRUN_1588,
++ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
++ pbn_oxsemi_2_15625000 },
+
+ /*
+ * SBS Technologies, Inc. P-Octal and PMC-OCTPRO cards,
+@@ -4886,6 +5063,115 @@ static const struct pci_device_id serial_pci_tbl[] = {
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_b2_4_115200 },
++ /*
++ * Brainboxes PX-101
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x4005,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_b0_2_115200 },
++ { PCI_VENDOR_ID_INTASHIELD, 0x4019,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_2_15625000 },
++ /*
++ * Brainboxes PX-235/246
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x4004,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_b0_1_115200 },
++ { PCI_VENDOR_ID_INTASHIELD, 0x4016,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_1_15625000 },
++ /*
++ * Brainboxes PX-203/PX-257
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x4006,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_b0_2_115200 },
++ { PCI_VENDOR_ID_INTASHIELD, 0x4015,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_4_15625000 },
++ /*
++ * Brainboxes PX-260/PX-701
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x400A,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_4_15625000 },
++ /*
++ * Brainboxes PX-310
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x400E,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_2_15625000 },
++ /*
++ * Brainboxes PX-313
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x400C,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_2_15625000 },
++ /*
++ * Brainboxes PX-320/324/PX-376/PX-387
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x400B,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_1_15625000 },
++ /*
++ * Brainboxes PX-335/346
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x400F,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_4_15625000 },
++ /*
++ * Brainboxes PX-368
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x4010,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_4_15625000 },
++ /*
++ * Brainboxes PX-420
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x4000,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_b0_4_115200 },
++ { PCI_VENDOR_ID_INTASHIELD, 0x4011,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_4_15625000 },
++ /*
++ * Brainboxes PX-803
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x4009,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_b0_1_115200 },
++ { PCI_VENDOR_ID_INTASHIELD, 0x401E,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_1_15625000 },
++ /*
++ * Brainboxes PX-846
++ */
++ { PCI_VENDOR_ID_INTASHIELD, 0x4008,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_b0_1_115200 },
++ { PCI_VENDOR_ID_INTASHIELD, 0x4017,
++ PCI_ANY_ID, PCI_ANY_ID,
++ 0, 0,
++ pbn_oxsemi_1_15625000 },
++
+ /*
+ * Perle PCI-RAS cards
+ */
+diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
+index ddaf35daf316b..291fd99bd7f1b 100644
+--- a/drivers/tty/serial/8250/8250_port.c
++++ b/drivers/tty/serial/8250/8250_port.c
+@@ -537,27 +537,6 @@ serial_port_out_sync(struct uart_port *p, int offset, int value)
+ }
+ }
+
+-/*
+- * For the 16C950
+- */
+-static void serial_icr_write(struct uart_8250_port *up, int offset, int value)
+-{
+- serial_out(up, UART_SCR, offset);
+- serial_out(up, UART_ICR, value);
+-}
+-
+-static unsigned int serial_icr_read(struct uart_8250_port *up, int offset)
+-{
+- unsigned int value;
+-
+- serial_icr_write(up, UART_ACR, up->acr | UART_ACR_ICRRD);
+- serial_out(up, UART_SCR, offset);
+- value = serial_in(up, UART_ICR);
+- serial_icr_write(up, UART_ACR, up->acr);
+-
+- return value;
+-}
+-
+ /*
+ * FIFO support.
+ */
+diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
+index 2cb89491dd09f..65b76adf107ce 100644
+--- a/drivers/tty/serial/fsl_lpuart.c
++++ b/drivers/tty/serial/fsl_lpuart.c
+@@ -990,12 +990,12 @@ static void lpuart32_rxint(struct lpuart_port *sport)
+
+ if (sr & (UARTSTAT_PE | UARTSTAT_OR | UARTSTAT_FE)) {
+ if (sr & UARTSTAT_PE) {
++ sport->port.icount.parity++;
++ } else if (sr & UARTSTAT_FE) {
+ if (is_break)
+ sport->port.icount.brk++;
+ else
+- sport->port.icount.parity++;
+- } else if (sr & UARTSTAT_FE) {
+- sport->port.icount.frame++;
++ sport->port.icount.frame++;
+ }
+
+ if (sr & UARTSTAT_OR)
+@@ -1010,12 +1010,12 @@ static void lpuart32_rxint(struct lpuart_port *sport)
+ sr &= sport->port.read_status_mask;
+
+ if (sr & UARTSTAT_PE) {
++ flg = TTY_PARITY;
++ } else if (sr & UARTSTAT_FE) {
+ if (is_break)
+ flg = TTY_BREAK;
+ else
+- flg = TTY_PARITY;
+- } else if (sr & UARTSTAT_FE) {
+- flg = TTY_FRAME;
++ flg = TTY_FRAME;
+ }
+
+ if (sr & UARTSTAT_OR)
+diff --git a/drivers/tty/serial/mvebu-uart.c b/drivers/tty/serial/mvebu-uart.c
+index 93489fe334d0f..65eaecd10b7ca 100644
+--- a/drivers/tty/serial/mvebu-uart.c
++++ b/drivers/tty/serial/mvebu-uart.c
+@@ -265,6 +265,7 @@ static void mvebu_uart_rx_chars(struct uart_port *port, unsigned int status)
+ struct tty_port *tport = &port->state->port;
+ unsigned char ch = 0;
+ char flag = 0;
++ int ret;
+
+ do {
+ if (status & STAT_RX_RDY(port)) {
+@@ -277,6 +278,16 @@ static void mvebu_uart_rx_chars(struct uart_port *port, unsigned int status)
+ port->icount.parity++;
+ }
+
++ /*
++ * For UART2, error bits are not cleared on buffer read.
++ * This causes interrupt loop and system hang.
++ */
++ if (IS_EXTENDED(port) && (status & STAT_BRK_ERR)) {
++ ret = readl(port->membase + UART_STAT);
++ ret |= STAT_BRK_ERR;
++ writel(ret, port->membase + UART_STAT);
++ }
++
+ if (status & STAT_BRK_DET) {
+ port->icount.brk++;
+ status &= ~(STAT_FRM_ERR | STAT_PAR_ERR);
+diff --git a/drivers/tty/serial/pic32_uart.c b/drivers/tty/serial/pic32_uart.c
+index b7a3a1b959b19..a967b586a0a07 100644
+--- a/drivers/tty/serial/pic32_uart.c
++++ b/drivers/tty/serial/pic32_uart.c
+@@ -427,7 +427,7 @@ static int pic32_uart_startup(struct uart_port *port)
+ if (!sport->irq_fault_name) {
+ dev_err(port->dev, "%s: kasprintf err!", __func__);
+ ret = -ENOMEM;
+- goto out_done;
++ goto out_disable_clk;
+ }
+ irq_set_status_flags(sport->irq_fault, IRQ_NOAUTOEN);
+ ret = request_irq(sport->irq_fault, pic32_uart_fault_interrupt,
+@@ -493,14 +493,16 @@ static int pic32_uart_startup(struct uart_port *port)
+ return 0;
+
+ out_t:
+- kfree(sport->irq_tx_name);
+ free_irq(sport->irq_tx, port);
++ kfree(sport->irq_tx_name);
+ out_r:
+- kfree(sport->irq_rx_name);
+ free_irq(sport->irq_rx, port);
++ kfree(sport->irq_rx_name);
+ out_f:
+- kfree(sport->irq_fault_name);
+ free_irq(sport->irq_fault, port);
++ kfree(sport->irq_fault_name);
++out_disable_clk:
++ clk_disable_unprepare(sport->clk);
+ out_done:
+ return ret;
+ }
+@@ -519,8 +521,11 @@ static void pic32_uart_shutdown(struct uart_port *port)
+
+ /* free all 3 interrupts for this UART */
+ free_irq(sport->irq_fault, port);
++ kfree(sport->irq_fault_name);
+ free_irq(sport->irq_tx, port);
++ kfree(sport->irq_tx_name);
+ free_irq(sport->irq_rx, port);
++ kfree(sport->irq_rx_name);
+ }
+
+ /* serial core request to change current uart setting */
+diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
+index dfc1f4b445f3b..6eaf8eb846619 100644
+--- a/drivers/tty/vt/vt.c
++++ b/drivers/tty/vt/vt.c
+@@ -344,7 +344,7 @@ static struct uni_screen *vc_uniscr_alloc(unsigned int cols, unsigned int rows)
+ /* allocate everything in one go */
+ memsize = cols * rows * sizeof(char32_t);
+ memsize += rows * sizeof(char32_t *);
+- p = vmalloc(memsize);
++ p = vzalloc(memsize);
+ if (!p)
+ return NULL;
+
+diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
+index d6d515d598dc0..ae049eb28b93c 100644
+--- a/drivers/usb/cdns3/cdns3-gadget.c
++++ b/drivers/usb/cdns3/cdns3-gadget.c
+@@ -2280,11 +2280,16 @@ static int cdns3_gadget_ep_enable(struct usb_ep *ep,
+ int ret = 0;
+ int val;
+
++ if (!ep) {
++ pr_debug("usbss: ep not configured?\n");
++ return -EINVAL;
++ }
++
+ priv_ep = ep_to_cdns3_ep(ep);
+ priv_dev = priv_ep->cdns3_dev;
+ comp_desc = priv_ep->endpoint.comp_desc;
+
+- if (!ep || !desc || desc->bDescriptorType != USB_DT_ENDPOINT) {
++ if (!desc || desc->bDescriptorType != USB_DT_ENDPOINT) {
+ dev_dbg(priv_dev->dev, "usbss: invalid parameters\n");
+ return -EINVAL;
+ }
+@@ -2596,7 +2601,7 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
+ struct usb_request *request)
+ {
+ struct cdns3_endpoint *priv_ep = ep_to_cdns3_ep(ep);
+- struct cdns3_device *priv_dev = priv_ep->cdns3_dev;
++ struct cdns3_device *priv_dev;
+ struct usb_request *req, *req_temp;
+ struct cdns3_request *priv_req;
+ struct cdns3_trb *link_trb;
+@@ -2607,6 +2612,8 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
+ if (!ep || !request || !ep->desc)
+ return -EINVAL;
+
++ priv_dev = priv_ep->cdns3_dev;
++
+ spin_lock_irqsave(&priv_dev->lock, flags);
+
+ priv_req = to_cdns3_request(request);
+diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
+index 06eea8848ccc2..11c8ea0cccc83 100644
+--- a/drivers/usb/core/hcd.c
++++ b/drivers/usb/core/hcd.c
+@@ -1691,7 +1691,6 @@ static void usb_giveback_urb_bh(struct tasklet_struct *t)
+
+ spin_lock_irq(&bh->lock);
+ bh->running = true;
+- restart:
+ list_replace_init(&bh->head, &local_list);
+ spin_unlock_irq(&bh->lock);
+
+@@ -1705,10 +1704,17 @@ static void usb_giveback_urb_bh(struct tasklet_struct *t)
+ bh->completing_ep = NULL;
+ }
+
+- /* check if there are new URBs to giveback */
++ /*
++ * giveback new URBs next time to prevent this function
++ * from not exiting for a long time.
++ */
+ spin_lock_irq(&bh->lock);
+- if (!list_empty(&bh->head))
+- goto restart;
++ if (!list_empty(&bh->head)) {
++ if (bh->high_prio)
++ tasklet_hi_schedule(&bh->bh);
++ else
++ tasklet_schedule(&bh->bh);
++ }
+ bh->running = false;
+ spin_unlock_irq(&bh->lock);
+ }
+@@ -1737,7 +1743,7 @@ static void usb_giveback_urb_bh(struct tasklet_struct *t)
+ void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
+ {
+ struct giveback_urb_bh *bh;
+- bool running, high_prio_bh;
++ bool running;
+
+ /* pass status to tasklet via unlinked */
+ if (likely(!urb->unlinked))
+@@ -1748,13 +1754,10 @@ void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
+ return;
+ }
+
+- if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
++ if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe))
+ bh = &hcd->high_prio_bh;
+- high_prio_bh = true;
+- } else {
++ else
+ bh = &hcd->low_prio_bh;
+- high_prio_bh = false;
+- }
+
+ spin_lock(&bh->lock);
+ list_add_tail(&urb->urb_list, &bh->head);
+@@ -1763,7 +1766,7 @@ void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
+
+ if (running)
+ ;
+- else if (high_prio_bh)
++ else if (bh->high_prio)
+ tasklet_hi_schedule(&bh->bh);
+ else
+ tasklet_schedule(&bh->bh);
+@@ -2959,6 +2962,7 @@ int usb_add_hcd(struct usb_hcd *hcd,
+
+ /* initialize tasklets */
+ init_giveback_urb_bh(&hcd->high_prio_bh);
++ hcd->high_prio_bh.high_prio = true;
+ init_giveback_urb_bh(&hcd->low_prio_bh);
+
+ /* enable irqs just before we start the controller,
+diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
+index d28cd1a6709bb..ecec9cdfa0a69 100644
+--- a/drivers/usb/dwc3/core.c
++++ b/drivers/usb/dwc3/core.c
+@@ -157,8 +157,13 @@ static void __dwc3_set_mode(struct work_struct *work)
+ break;
+ }
+
+- /* For DRD host or device mode only */
+- if (dwc->desired_dr_role != DWC3_GCTL_PRTCAP_OTG) {
++ /*
++ * When current_dr_role is not set, there's no role switching.
++ * Only perform GCTL.CoreSoftReset when there's DRD role switching.
++ */
++ if (dwc->current_dr_role && ((DWC3_IP_IS(DWC3) ||
++ DWC3_VER_IS_PRIOR(DWC31, 190A)) &&
++ dwc->desired_dr_role != DWC3_GCTL_PRTCAP_OTG)) {
+ reg = dwc3_readl(dwc->regs, DWC3_GCTL);
+ reg |= DWC3_GCTL_CORESOFTRESET;
+ dwc3_writel(dwc->regs, DWC3_GCTL, reg);
+diff --git a/drivers/usb/dwc3/dwc3-qcom.c b/drivers/usb/dwc3/dwc3-qcom.c
+index 6cba990da32ef..3582fd6dfa141 100644
+--- a/drivers/usb/dwc3/dwc3-qcom.c
++++ b/drivers/usb/dwc3/dwc3-qcom.c
+@@ -443,9 +443,9 @@ static int dwc3_qcom_get_irq(struct platform_device *pdev,
+ int ret;
+
+ if (np)
+- ret = platform_get_irq_byname(pdev_irq, name);
++ ret = platform_get_irq_byname_optional(pdev_irq, name);
+ else
+- ret = platform_get_irq(pdev_irq, num);
++ ret = platform_get_irq_optional(pdev_irq, num);
+
+ return ret;
+ }
+diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
+index f46fc02669a8c..d3be2b424244d 100644
+--- a/drivers/usb/dwc3/gadget.c
++++ b/drivers/usb/dwc3/gadget.c
+@@ -1181,17 +1181,49 @@ static u32 dwc3_calc_trbs_left(struct dwc3_ep *dep)
+ return trbs_left;
+ }
+
+-static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
+- dma_addr_t dma, unsigned int length, unsigned int chain,
+- unsigned int node, unsigned int stream_id,
+- unsigned int short_not_ok, unsigned int no_interrupt,
+- unsigned int is_last, bool must_interrupt)
++/**
++ * dwc3_prepare_one_trb - setup one TRB from one request
++ * @dep: endpoint for which this request is prepared
++ * @req: dwc3_request pointer
++ * @trb_length: buffer size of the TRB
++ * @chain: should this TRB be chained to the next?
++ * @node: only for isochronous endpoints. First TRB needs different type.
++ * @use_bounce_buffer: set to use bounce buffer
++ * @must_interrupt: set to interrupt on TRB completion
++ */
++static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
++ struct dwc3_request *req, unsigned int trb_length,
++ unsigned int chain, unsigned int node, bool use_bounce_buffer,
++ bool must_interrupt)
+ {
++ struct dwc3_trb *trb;
++ dma_addr_t dma;
++ unsigned int stream_id = req->request.stream_id;
++ unsigned int short_not_ok = req->request.short_not_ok;
++ unsigned int no_interrupt = req->request.no_interrupt;
++ unsigned int is_last = req->request.is_last;
+ struct dwc3 *dwc = dep->dwc;
+ struct usb_gadget *gadget = dwc->gadget;
+ enum usb_device_speed speed = gadget->speed;
+
+- trb->size = DWC3_TRB_SIZE_LENGTH(length);
++ if (use_bounce_buffer)
++ dma = dep->dwc->bounce_addr;
++ else if (req->request.num_sgs > 0)
++ dma = sg_dma_address(req->start_sg);
++ else
++ dma = req->request.dma;
++
++ trb = &dep->trb_pool[dep->trb_enqueue];
++
++ if (!req->trb) {
++ dwc3_gadget_move_started_request(req);
++ req->trb = trb;
++ req->trb_dma = dwc3_trb_dma_offset(dep, trb);
++ }
++
++ req->num_trbs++;
++
++ trb->size = DWC3_TRB_SIZE_LENGTH(trb_length);
+ trb->bpl = lower_32_bits(dma);
+ trb->bph = upper_32_bits(dma);
+
+@@ -1231,10 +1263,10 @@ static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
+ unsigned int mult = 2;
+ unsigned int maxp = usb_endpoint_maxp(ep->desc);
+
+- if (length <= (2 * maxp))
++ if (req->request.length <= (2 * maxp))
+ mult--;
+
+- if (length <= maxp)
++ if (req->request.length <= maxp)
+ mult--;
+
+ trb->size |= DWC3_TRB_SIZE_PCM1(mult);
+@@ -1308,50 +1340,6 @@ static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
+ trace_dwc3_prepare_trb(dep, trb);
+ }
+
+-/**
+- * dwc3_prepare_one_trb - setup one TRB from one request
+- * @dep: endpoint for which this request is prepared
+- * @req: dwc3_request pointer
+- * @trb_length: buffer size of the TRB
+- * @chain: should this TRB be chained to the next?
+- * @node: only for isochronous endpoints. First TRB needs different type.
+- * @use_bounce_buffer: set to use bounce buffer
+- * @must_interrupt: set to interrupt on TRB completion
+- */
+-static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
+- struct dwc3_request *req, unsigned int trb_length,
+- unsigned int chain, unsigned int node, bool use_bounce_buffer,
+- bool must_interrupt)
+-{
+- struct dwc3_trb *trb;
+- dma_addr_t dma;
+- unsigned int stream_id = req->request.stream_id;
+- unsigned int short_not_ok = req->request.short_not_ok;
+- unsigned int no_interrupt = req->request.no_interrupt;
+- unsigned int is_last = req->request.is_last;
+-
+- if (use_bounce_buffer)
+- dma = dep->dwc->bounce_addr;
+- else if (req->request.num_sgs > 0)
+- dma = sg_dma_address(req->start_sg);
+- else
+- dma = req->request.dma;
+-
+- trb = &dep->trb_pool[dep->trb_enqueue];
+-
+- if (!req->trb) {
+- dwc3_gadget_move_started_request(req);
+- req->trb = trb;
+- req->trb_dma = dwc3_trb_dma_offset(dep, trb);
+- }
+-
+- req->num_trbs++;
+-
+- __dwc3_prepare_one_trb(dep, trb, dma, trb_length, chain, node,
+- stream_id, short_not_ok, no_interrupt, is_last,
+- must_interrupt);
+-}
+-
+ static bool dwc3_needs_extra_trb(struct dwc3_ep *dep, struct dwc3_request *req)
+ {
+ unsigned int maxp = usb_endpoint_maxp(dep->endpoint.desc);
+diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
+index 3a77bca0ebe1c..e884f295504f6 100644
+--- a/drivers/usb/gadget/function/f_mass_storage.c
++++ b/drivers/usb/gadget/function/f_mass_storage.c
+@@ -1192,13 +1192,14 @@ static int do_read_toc(struct fsg_common *common, struct fsg_buffhd *bh)
+ u8 format;
+ int i, len;
+
++ format = common->cmnd[2] & 0xf;
++
+ if ((common->cmnd[1] & ~0x02) != 0 || /* Mask away MSF */
+- start_track > 1) {
++ (start_track > 1 && format != 0x1)) {
+ curlun->sense_data = SS_INVALID_FIELD_IN_CDB;
+ return -EINVAL;
+ }
+
+- format = common->cmnd[2] & 0xf;
+ /*
+ * Check if CDB is old style SFF-8020i
+ * i.e. format is in 2 MSBs of byte 9
+@@ -1208,8 +1209,8 @@ static int do_read_toc(struct fsg_common *common, struct fsg_buffhd *bh)
+ format = (common->cmnd[9] >> 6) & 0x3;
+
+ switch (format) {
+- case 0:
+- /* Formatted TOC */
++ case 0: /* Formatted TOC */
++ case 1: /* Multi-session info */
+ len = 4 + 2*8; /* 4 byte header + 2 descriptors */
+ memset(buf, 0, len);
+ buf[1] = len - 2; /* TOC Length excludes length field */
+@@ -1250,7 +1251,7 @@ static int do_read_toc(struct fsg_common *common, struct fsg_buffhd *bh)
+ return len;
+
+ default:
+- /* Multi-session, PMA, ATIP, CD-TEXT not supported/required */
++ /* PMA, ATIP, CD-TEXT not supported/required */
+ curlun->sense_data = SS_INVALID_FIELD_IN_CDB;
+ return -EINVAL;
+ }
+diff --git a/drivers/usb/gadget/udc/Kconfig b/drivers/usb/gadget/udc/Kconfig
+index 69394dc1cdfb6..2cdd37be165a4 100644
+--- a/drivers/usb/gadget/udc/Kconfig
++++ b/drivers/usb/gadget/udc/Kconfig
+@@ -311,7 +311,7 @@ source "drivers/usb/gadget/udc/bdc/Kconfig"
+
+ config USB_AMD5536UDC
+ tristate "AMD5536 UDC"
+- depends on USB_PCI
++ depends on USB_PCI && HAS_DMA
+ select USB_SNP_CORE
+ help
+ The AMD5536 UDC is part of the AMD Geode CS5536, an x86 southbridge.
+diff --git a/drivers/usb/gadget/udc/aspeed-vhub/hub.c b/drivers/usb/gadget/udc/aspeed-vhub/hub.c
+index 65cd4e46f031f..e2207d0146204 100644
+--- a/drivers/usb/gadget/udc/aspeed-vhub/hub.c
++++ b/drivers/usb/gadget/udc/aspeed-vhub/hub.c
+@@ -1059,8 +1059,10 @@ static int ast_vhub_init_desc(struct ast_vhub *vhub)
+ /* Initialize vhub String Descriptors. */
+ INIT_LIST_HEAD(&vhub->vhub_str_desc);
+ desc_np = of_get_child_by_name(vhub_np, "vhub-strings");
+- if (desc_np)
++ if (desc_np) {
+ ret = ast_vhub_of_parse_str_desc(vhub, desc_np);
++ of_node_put(desc_np);
++ }
+ else
+ ret = ast_vhub_str_alloc_add(vhub, &ast_vhub_strings);
+
+diff --git a/drivers/usb/gadget/udc/tegra-xudc.c b/drivers/usb/gadget/udc/tegra-xudc.c
+index d9c406bdb680a..c257656286251 100644
+--- a/drivers/usb/gadget/udc/tegra-xudc.c
++++ b/drivers/usb/gadget/udc/tegra-xudc.c
+@@ -3691,15 +3691,15 @@ static int tegra_xudc_powerdomain_init(struct tegra_xudc *xudc)
+ int err;
+
+ xudc->genpd_dev_device = dev_pm_domain_attach_by_name(dev, "dev");
+- if (IS_ERR(xudc->genpd_dev_device)) {
+- err = PTR_ERR(xudc->genpd_dev_device);
++ if (IS_ERR_OR_NULL(xudc->genpd_dev_device)) {
++ err = PTR_ERR(xudc->genpd_dev_device) ? : -ENODATA;
+ dev_err(dev, "failed to get device power domain: %d\n", err);
+ return err;
+ }
+
+ xudc->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "ss");
+- if (IS_ERR(xudc->genpd_dev_ss)) {
+- err = PTR_ERR(xudc->genpd_dev_ss);
++ if (IS_ERR_OR_NULL(xudc->genpd_dev_ss)) {
++ err = PTR_ERR(xudc->genpd_dev_ss) ? : -ENODATA;
+ dev_err(dev, "failed to get SuperSpeed power domain: %d\n", err);
+ return err;
+ }
+diff --git a/drivers/usb/host/ehci-ppc-of.c b/drivers/usb/host/ehci-ppc-of.c
+index 6bbaee74f7e7d..28a19693c19fe 100644
+--- a/drivers/usb/host/ehci-ppc-of.c
++++ b/drivers/usb/host/ehci-ppc-of.c
+@@ -148,6 +148,7 @@ static int ehci_hcd_ppc_of_probe(struct platform_device *op)
+ } else {
+ ehci->has_amcc_usb23 = 1;
+ }
++ of_node_put(np);
+ }
+
+ if (of_get_property(dn, "big-endian", NULL)) {
+diff --git a/drivers/usb/host/ohci-nxp.c b/drivers/usb/host/ohci-nxp.c
+index 85878e8ad3311..106a6bcefb087 100644
+--- a/drivers/usb/host/ohci-nxp.c
++++ b/drivers/usb/host/ohci-nxp.c
+@@ -164,6 +164,7 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev)
+ }
+
+ isp1301_i2c_client = isp1301_get_client(isp1301_node);
++ of_node_put(isp1301_node);
+ if (!isp1301_i2c_client)
+ return -EPROBE_DEFER;
+
+diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
+index 996958a6565c3..bdb776553826b 100644
+--- a/drivers/usb/host/xhci-tegra.c
++++ b/drivers/usb/host/xhci-tegra.c
+@@ -1010,15 +1010,15 @@ static int tegra_xusb_powerdomain_init(struct device *dev,
+ int err;
+
+ tegra->genpd_dev_host = dev_pm_domain_attach_by_name(dev, "xusb_host");
+- if (IS_ERR(tegra->genpd_dev_host)) {
+- err = PTR_ERR(tegra->genpd_dev_host);
++ if (IS_ERR_OR_NULL(tegra->genpd_dev_host)) {
++ err = PTR_ERR(tegra->genpd_dev_host) ? : -ENODATA;
+ dev_err(dev, "failed to get host pm-domain: %d\n", err);
+ return err;
+ }
+
+ tegra->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "xusb_ss");
+- if (IS_ERR(tegra->genpd_dev_ss)) {
+- err = PTR_ERR(tegra->genpd_dev_ss);
++ if (IS_ERR_OR_NULL(tegra->genpd_dev_ss)) {
++ err = PTR_ERR(tegra->genpd_dev_ss) ? : -ENODATA;
+ dev_err(dev, "failed to get superspeed pm-domain: %d\n", err);
+ return err;
+ }
+diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
+index 1f3f311d9951e..1d60e62752f33 100644
+--- a/drivers/usb/host/xhci.h
++++ b/drivers/usb/host/xhci.h
+@@ -2393,7 +2393,7 @@ static inline const char *xhci_decode_trb(char *str, size_t size,
+ field3 & TRB_CYCLE ? 'C' : 'c');
+ break;
+ case TRB_STOP_RING:
+- sprintf(str,
++ snprintf(str, size,
+ "%s: slot %d sp %d ep %d flags %c",
+ xhci_trb_type_string(type),
+ TRB_TO_SLOT_ID(field3),
+diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
+index 9d56138133a97..ef6a2891f290c 100644
+--- a/drivers/usb/serial/sierra.c
++++ b/drivers/usb/serial/sierra.c
+@@ -737,7 +737,8 @@ static void sierra_close(struct usb_serial_port *port)
+
+ /*
+ * Need to take susp_lock to make sure port is not already being
+- * resumed, but no need to hold it due to initialized
++ * resumed, but no need to hold it due to the tty-port initialized
++ * flag.
+ */
+ spin_lock_irq(&intfdata->susp_lock);
+ if (--intfdata->open_ports == 0)
+diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c
+index 24101bd7fcad2..e35bea2235c1c 100644
+--- a/drivers/usb/serial/usb-serial.c
++++ b/drivers/usb/serial/usb-serial.c
+@@ -295,7 +295,7 @@ static int serial_open(struct tty_struct *tty, struct file *filp)
+ *
+ * Shut down a USB serial port. Serialized against activate by the
+ * tport mutex and kept to matching open/close pairs
+- * of calls by the initialized flag.
++ * of calls by the tty-port initialized flag.
+ *
+ * Not called if tty is console.
+ */
+diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
+index dab38b63eaf7f..cc81ab7ef4da1 100644
+--- a/drivers/usb/serial/usb_wwan.c
++++ b/drivers/usb/serial/usb_wwan.c
+@@ -388,7 +388,8 @@ void usb_wwan_close(struct usb_serial_port *port)
+
+ /*
+ * Need to take susp_lock to make sure port is not already being
+- * resumed, but no need to hold it due to initialized
++ * resumed, but no need to hold it due to the tty-port initialized
++ * flag.
+ */
+ spin_lock_irq(&intfdata->susp_lock);
+ if (--intfdata->open_ports == 0)
+diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
+index a6045aef0d04f..668880c3a4723 100644
+--- a/drivers/usb/typec/ucsi/ucsi.c
++++ b/drivers/usb/typec/ucsi/ucsi.c
+@@ -76,6 +76,10 @@ static int ucsi_read_error(struct ucsi *ucsi)
+ if (ret)
+ return ret;
+
++ ret = ucsi_acknowledge_command(ucsi);
++ if (ret)
++ return ret;
++
+ switch (error) {
+ case UCSI_ERROR_INCOMPATIBLE_PARTNER:
+ return -EOPNOTSUPP;
+diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+index 767b5d47631a4..e92376837b29e 100644
+--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
++++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+@@ -337,6 +337,14 @@ static int vf_qm_cache_wb(struct hisi_qm *qm)
+ return 0;
+ }
+
++static struct hisi_acc_vf_core_device *hssi_acc_drvdata(struct pci_dev *pdev)
++{
++ struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
++
++ return container_of(core_device, struct hisi_acc_vf_core_device,
++ core_device);
++}
++
+ static void vf_qm_fun_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev,
+ struct hisi_qm *qm)
+ {
+@@ -962,7 +970,7 @@ hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
+
+ static void hisi_acc_vf_pci_aer_reset_done(struct pci_dev *pdev)
+ {
+- struct hisi_acc_vf_core_device *hisi_acc_vdev = dev_get_drvdata(&pdev->dev);
++ struct hisi_acc_vf_core_device *hisi_acc_vdev = hssi_acc_drvdata(pdev);
+
+ if (hisi_acc_vdev->core_device.vdev.migration_flags !=
+ VFIO_MIGRATION_STOP_COPY)
+@@ -1274,11 +1282,10 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
+ &hisi_acc_vfio_pci_ops);
+ }
+
++ dev_set_drvdata(&pdev->dev, &hisi_acc_vdev->core_device);
+ ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
+ if (ret)
+ goto out_free;
+-
+- dev_set_drvdata(&pdev->dev, hisi_acc_vdev);
+ return 0;
+
+ out_free:
+@@ -1289,7 +1296,7 @@ out_free:
+
+ static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
+ {
+- struct hisi_acc_vf_core_device *hisi_acc_vdev = dev_get_drvdata(&pdev->dev);
++ struct hisi_acc_vf_core_device *hisi_acc_vdev = hssi_acc_drvdata(pdev);
+
+ vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
+ vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
+diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
+index bbec5d288fee9..9f59f5807b8ab 100644
+--- a/drivers/vfio/pci/mlx5/main.c
++++ b/drivers/vfio/pci/mlx5/main.c
+@@ -39,6 +39,14 @@ struct mlx5vf_pci_core_device {
+ struct mlx5_vf_migration_file *saving_migf;
+ };
+
++static struct mlx5vf_pci_core_device *mlx5vf_drvdata(struct pci_dev *pdev)
++{
++ struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
++
++ return container_of(core_device, struct mlx5vf_pci_core_device,
++ core_device);
++}
++
+ static struct page *
+ mlx5vf_get_migration_page(struct mlx5_vf_migration_file *migf,
+ unsigned long offset)
+@@ -505,7 +513,7 @@ static int mlx5vf_pci_get_device_state(struct vfio_device *vdev,
+
+ static void mlx5vf_pci_aer_reset_done(struct pci_dev *pdev)
+ {
+- struct mlx5vf_pci_core_device *mvdev = dev_get_drvdata(&pdev->dev);
++ struct mlx5vf_pci_core_device *mvdev = mlx5vf_drvdata(pdev);
+
+ if (!mvdev->migrate_cap)
+ return;
+@@ -614,11 +622,10 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev,
+ }
+ }
+
++ dev_set_drvdata(&pdev->dev, &mvdev->core_device);
+ ret = vfio_pci_core_register_device(&mvdev->core_device);
+ if (ret)
+ goto out_free;
+-
+- dev_set_drvdata(&pdev->dev, mvdev);
+ return 0;
+
+ out_free:
+@@ -629,7 +636,7 @@ out_free:
+
+ static void mlx5vf_pci_remove(struct pci_dev *pdev)
+ {
+- struct mlx5vf_pci_core_device *mvdev = dev_get_drvdata(&pdev->dev);
++ struct mlx5vf_pci_core_device *mvdev = mlx5vf_drvdata(pdev);
+
+ vfio_pci_core_unregister_device(&mvdev->core_device);
+ vfio_pci_core_uninit_device(&mvdev->core_device);
+diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
+index 2b047469e02fe..8c990a1a7def9 100644
+--- a/drivers/vfio/pci/vfio_pci.c
++++ b/drivers/vfio/pci/vfio_pci.c
+@@ -151,10 +151,10 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ return -ENOMEM;
+ vfio_pci_core_init_device(vdev, pdev, &vfio_pci_ops);
+
++ dev_set_drvdata(&pdev->dev, vdev);
+ ret = vfio_pci_core_register_device(vdev);
+ if (ret)
+ goto out_free;
+- dev_set_drvdata(&pdev->dev, vdev);
+ return 0;
+
+ out_free:
+diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
+index 06b6f3594a131..65587fd5c021b 100644
+--- a/drivers/vfio/pci/vfio_pci_core.c
++++ b/drivers/vfio/pci/vfio_pci_core.c
+@@ -1821,6 +1821,10 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
+ struct pci_dev *pdev = vdev->pdev;
+ int ret;
+
++ /* Drivers must set the vfio_pci_core_device to their drvdata */
++ if (WARN_ON(vdev != dev_get_drvdata(&vdev->pdev->dev)))
++ return -EINVAL;
++
+ if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
+ return -EINVAL;
+
+diff --git a/drivers/video/fbdev/amba-clcd.c b/drivers/video/fbdev/amba-clcd.c
+index 8080116aea844..f65c96d1394d3 100644
+--- a/drivers/video/fbdev/amba-clcd.c
++++ b/drivers/video/fbdev/amba-clcd.c
+@@ -698,16 +698,18 @@ static int clcdfb_of_init_display(struct clcd_fb *fb)
+ return -ENODEV;
+
+ panel = of_graph_get_remote_port_parent(endpoint);
+- if (!panel)
+- return -ENODEV;
++ if (!panel) {
++ err = -ENODEV;
++ goto out_endpoint_put;
++ }
+
+ err = clcdfb_of_get_backlight(&fb->dev->dev, fb->panel);
+ if (err)
+- return err;
++ goto out_panel_put;
+
+ err = clcdfb_of_get_mode(&fb->dev->dev, panel, fb->panel);
+ if (err)
+- return err;
++ goto out_panel_put;
+
+ err = of_property_read_u32(fb->dev->dev.of_node, "max-memory-bandwidth",
+ &max_bandwidth);
+@@ -736,11 +738,21 @@ static int clcdfb_of_init_display(struct clcd_fb *fb)
+
+ if (of_property_read_u32_array(endpoint,
+ "arm,pl11x,tft-r0g0b0-pads",
+- tft_r0b0g0, ARRAY_SIZE(tft_r0b0g0)) != 0)
+- return -ENOENT;
++ tft_r0b0g0, ARRAY_SIZE(tft_r0b0g0)) != 0) {
++ err = -ENOENT;
++ goto out_panel_put;
++ }
++
++ of_node_put(panel);
++ of_node_put(endpoint);
+
+ return clcdfb_of_init_tft_panel(fb, tft_r0b0g0[0],
+ tft_r0b0g0[1], tft_r0b0g0[2]);
++out_panel_put:
++ of_node_put(panel);
++out_endpoint_put:
++ of_node_put(endpoint);
++ return err;
+ }
+
+ static int clcdfb_of_vram_setup(struct clcd_fb *fb)
+diff --git a/drivers/video/fbdev/arkfb.c b/drivers/video/fbdev/arkfb.c
+index eb3e47c58c5f7..a2a381631628e 100644
+--- a/drivers/video/fbdev/arkfb.c
++++ b/drivers/video/fbdev/arkfb.c
+@@ -781,7 +781,12 @@ static int arkfb_set_par(struct fb_info *info)
+ return -EINVAL;
+ }
+
+- ark_set_pixclock(info, (hdiv * info->var.pixclock) / hmul);
++ value = (hdiv * info->var.pixclock) / hmul;
++ if (!value) {
++ fb_dbg(info, "invalid pixclock\n");
++ value = 1;
++ }
++ ark_set_pixclock(info, value);
+ svga_set_timings(par->state.vgabase, &ark_timing_regs, &(info->var), hmul, hdiv,
+ (info->var.vmode & FB_VMODE_DOUBLE) ? 2 : 1,
+ (info->var.vmode & FB_VMODE_INTERLACED) ? 2 : 1,
+@@ -792,6 +797,8 @@ static int arkfb_set_par(struct fb_info *info)
+ value = ((value * hmul / hdiv) / 8) - 5;
+ vga_wcrt(par->state.vgabase, 0x42, (value + 1) / 2);
+
++ if (screen_size > info->screen_size)
++ screen_size = info->screen_size;
+ memset_io(info->screen_base, 0x00, screen_size);
+ /* Device and screen back on */
+ svga_wcrt_mask(par->state.vgabase, 0x17, 0x80, 0x80);
+diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
+index 526a7d2de4912..d597f0c31578b 100644
+--- a/drivers/video/fbdev/core/fbcon.c
++++ b/drivers/video/fbdev/core/fbcon.c
+@@ -115,8 +115,8 @@ static int logo_lines;
+ enums. */
+ static int logo_shown = FBCON_LOGO_CANSHOW;
+ /* console mappings */
+-static int first_fb_vc;
+-static int last_fb_vc = MAX_NR_CONSOLES - 1;
++static unsigned int first_fb_vc;
++static unsigned int last_fb_vc = MAX_NR_CONSOLES - 1;
+ static int fbcon_is_default = 1;
+ static int primary_device = -1;
+ static int fbcon_has_console_bind;
+@@ -464,10 +464,12 @@ static int __init fb_console_setup(char *this_opt)
+ options += 3;
+ if (*options)
+ first_fb_vc = simple_strtoul(options, &options, 10) - 1;
+- if (first_fb_vc < 0)
++ if (first_fb_vc >= MAX_NR_CONSOLES)
+ first_fb_vc = 0;
+ if (*options++ == '-')
+ last_fb_vc = simple_strtoul(options, &options, 10) - 1;
++ if (last_fb_vc < first_fb_vc || last_fb_vc >= MAX_NR_CONSOLES)
++ last_fb_vc = MAX_NR_CONSOLES - 1;
+ fbcon_is_default = 0;
+ continue;
+ }
+@@ -1704,8 +1706,6 @@ static bool fbcon_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
+ case SM_UP:
+ if (count > vc->vc_rows) /* Maximum realistic size */
+ count = vc->vc_rows;
+- if (logo_shown >= 0)
+- goto redraw_up;
+ switch (fb_scrollmode(p)) {
+ case SCROLL_MOVE:
+ fbcon_redraw_blit(vc, info, p, t, b - t - count,
+@@ -1794,8 +1794,6 @@ static bool fbcon_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
+ case SM_DOWN:
+ if (count > vc->vc_rows) /* Maximum realistic size */
+ count = vc->vc_rows;
+- if (logo_shown >= 0)
+- goto redraw_down;
+ switch (fb_scrollmode(p)) {
+ case SCROLL_MOVE:
+ fbcon_redraw_blit(vc, info, p, b - 1, b - t - count,
+diff --git a/drivers/video/fbdev/s3fb.c b/drivers/video/fbdev/s3fb.c
+index b93c8eb023369..5069f6f67923f 100644
+--- a/drivers/video/fbdev/s3fb.c
++++ b/drivers/video/fbdev/s3fb.c
+@@ -905,6 +905,8 @@ static int s3fb_set_par(struct fb_info *info)
+ value = clamp((htotal + hsstart + 1) / 2 + 2, hsstart + 4, htotal + 1);
+ svga_wcrt_multi(par->state.vgabase, s3_dtpc_regs, value);
+
++ if (screen_size > info->screen_size)
++ screen_size = info->screen_size;
+ memset_io(info->screen_base, 0x00, screen_size);
+ /* Device and screen back on */
+ svga_wcrt_mask(par->state.vgabase, 0x17, 0x80, 0x80);
+diff --git a/drivers/video/fbdev/sis/init.c b/drivers/video/fbdev/sis/init.c
+index b568c646a76c2..2ba91d62af92e 100644
+--- a/drivers/video/fbdev/sis/init.c
++++ b/drivers/video/fbdev/sis/init.c
+@@ -355,12 +355,12 @@ SiS_GetModeID(int VGAEngine, unsigned int VBFlags, int HDisplay, int VDisplay,
+ }
+ break;
+ case 400:
+- if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 800) && (LCDwidth >= 600))) {
++ if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 800) && (LCDheight >= 600))) {
+ if(VDisplay == 300) ModeIndex = ModeIndex_400x300[Depth];
+ }
+ break;
+ case 512:
+- if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 1024) && (LCDwidth >= 768))) {
++ if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 1024) && (LCDheight >= 768))) {
+ if(VDisplay == 384) ModeIndex = ModeIndex_512x384[Depth];
+ }
+ break;
+diff --git a/drivers/video/fbdev/vt8623fb.c b/drivers/video/fbdev/vt8623fb.c
+index a92a8c670cf0f..4274c6efb2490 100644
+--- a/drivers/video/fbdev/vt8623fb.c
++++ b/drivers/video/fbdev/vt8623fb.c
+@@ -507,6 +507,8 @@ static int vt8623fb_set_par(struct fb_info *info)
+ (info->var.vmode & FB_VMODE_DOUBLE) ? 2 : 1, 1,
+ 1, info->node);
+
++ if (screen_size > info->screen_size)
++ screen_size = info->screen_size;
+ memset_io(info->screen_base, 0x00, screen_size);
+
+ /* Device and screen back on */
+diff --git a/drivers/watchdog/armada_37xx_wdt.c b/drivers/watchdog/armada_37xx_wdt.c
+index 1635f421ef2c3..854b1cc723cb6 100644
+--- a/drivers/watchdog/armada_37xx_wdt.c
++++ b/drivers/watchdog/armada_37xx_wdt.c
+@@ -274,6 +274,8 @@ static int armada_37xx_wdt_probe(struct platform_device *pdev)
+ if (!res)
+ return -ENODEV;
+ dev->reg = devm_ioremap(&pdev->dev, res->start, resource_size(res));
++ if (!dev->reg)
++ return -ENOMEM;
+
+ /* init clock */
+ dev->clk = devm_clk_get(&pdev->dev, NULL);
+diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c
+index 7f59c680de253..6a16d3d0bb1e6 100644
+--- a/drivers/watchdog/f71808e_wdt.c
++++ b/drivers/watchdog/f71808e_wdt.c
+@@ -634,7 +634,9 @@ static int __init fintek_wdt_init(void)
+
+ pdata.type = ret;
+
+- platform_driver_register(&fintek_wdt_driver);
++ ret = platform_driver_register(&fintek_wdt_driver);
++ if (ret)
++ return ret;
+
+ wdt_res.name = "superio port";
+ wdt_res.flags = IORESOURCE_IO;
+diff --git a/drivers/watchdog/sp5100_tco.c b/drivers/watchdog/sp5100_tco.c
+index 86ffb58fbc854..ae54dd33e2336 100644
+--- a/drivers/watchdog/sp5100_tco.c
++++ b/drivers/watchdog/sp5100_tco.c
+@@ -402,6 +402,7 @@ out:
+ iounmap(addr);
+
+ release_resource(res);
++ kfree(res);
+
+ return ret;
+ }
+diff --git a/fs/Makefile b/fs/Makefile
+index 208a74e0b00e1..93b80529f8e82 100644
+--- a/fs/Makefile
++++ b/fs/Makefile
+@@ -34,8 +34,6 @@ obj-$(CONFIG_TIMERFD) += timerfd.o
+ obj-$(CONFIG_EVENTFD) += eventfd.o
+ obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
+ obj-$(CONFIG_AIO) += aio.o
+-obj-$(CONFIG_IO_URING) += io_uring.o
+-obj-$(CONFIG_IO_WQ) += io-wq.o
+ obj-$(CONFIG_FS_DAX) += dax.o
+ obj-$(CONFIG_FS_ENCRYPTION) += crypto/
+ obj-$(CONFIG_FS_VERITY) += verity/
+diff --git a/fs/attr.c b/fs/attr.c
+index dbe996b0dedfc..f581c4d008971 100644
+--- a/fs/attr.c
++++ b/fs/attr.c
+@@ -184,6 +184,8 @@ EXPORT_SYMBOL(setattr_prepare);
+ */
+ int inode_newsize_ok(const struct inode *inode, loff_t offset)
+ {
++ if (offset < 0)
++ return -EINVAL;
+ if (inode->i_size < offset) {
+ unsigned long limit;
+
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index 667b7025d5030..0c7fe3142d7c9 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -1033,8 +1033,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
+ < block_group->zone_unusable);
+ WARN_ON(block_group->space_info->disk_total
+ < block_group->length * factor);
++ WARN_ON(block_group->zone_is_active &&
++ block_group->space_info->active_total_bytes
++ < block_group->length);
+ }
+ block_group->space_info->total_bytes -= block_group->length;
++ if (block_group->zone_is_active)
++ block_group->space_info->active_total_bytes -= block_group->length;
+ block_group->space_info->bytes_readonly -=
+ (block_group->length - block_group->zone_unusable);
+ block_group->space_info->bytes_zone_unusable -=
+@@ -2102,7 +2107,8 @@ static int read_one_block_group(struct btrfs_fs_info *info,
+ trace_btrfs_add_block_group(info, cache, 0);
+ btrfs_update_space_info(info, cache->flags, cache->length,
+ cache->used, cache->bytes_super,
+- cache->zone_unusable, &space_info);
++ cache->zone_unusable, cache->zone_is_active,
++ &space_info);
+
+ cache->space_info = space_info;
+
+@@ -2172,7 +2178,7 @@ static int fill_dummy_bgs(struct btrfs_fs_info *fs_info)
+ }
+
+ btrfs_update_space_info(fs_info, bg->flags, em->len, em->len,
+- 0, 0, &space_info);
++ 0, 0, false, &space_info);
+ bg->space_info = space_info;
+ link_block_group(bg);
+
+@@ -2553,7 +2559,7 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
+ trace_btrfs_add_block_group(fs_info, cache, 1);
+ btrfs_update_space_info(fs_info, cache->flags, size, bytes_used,
+ cache->bytes_super, cache->zone_unusable,
+- &cache->space_info);
++ cache->zone_is_active, &cache->space_info);
+ btrfs_update_global_block_rsv(fs_info);
+
+ link_block_group(cache);
+@@ -2653,6 +2659,14 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
+ ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
+ if (ret < 0)
+ goto out;
++ /*
++ * We have allocated a new chunk. We also need to activate that chunk to
++ * grant metadata tickets for zoned filesystem.
++ */
++ ret = btrfs_zoned_activate_one_bg(fs_info, cache->space_info, true);
++ if (ret < 0)
++ goto out;
++
+ ret = inc_block_group_ro(cache, 0);
+ if (ret == -ETXTBSY)
+ goto unlock_out;
+@@ -3724,6 +3738,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags,
+ * attempt.
+ */
+ wait_for_alloc = true;
++ force = CHUNK_ALLOC_NO_FORCE;
+ spin_unlock(&space_info->lock);
+ mutex_lock(&fs_info->chunk_mutex);
+ mutex_unlock(&fs_info->chunk_mutex);
+@@ -3846,6 +3861,14 @@ static void reserve_chunk_space(struct btrfs_trans_handle *trans,
+ if (IS_ERR(bg)) {
+ ret = PTR_ERR(bg);
+ } else {
++ /*
++ * We have a new chunk. We also need to activate it for
++ * zoned filesystem.
++ */
++ ret = btrfs_zoned_activate_one_bg(fs_info, info, true);
++ if (ret < 0)
++ return;
++
+ /*
+ * If we fail to add the chunk item here, we end up
+ * trying again at phase 2 of chunk allocation, at
+diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
+index 077c95e9baa50..4ce0977ec9e12 100644
+--- a/fs/btrfs/ctree.h
++++ b/fs/btrfs/ctree.h
+@@ -107,14 +107,6 @@ struct btrfs_ioctl_encoded_io_args;
+ #define BTRFS_STAT_CURR 0
+ #define BTRFS_STAT_PREV 1
+
+-/*
+- * Count how many BTRFS_MAX_EXTENT_SIZE cover the @size
+- */
+-static inline u32 count_max_extents(u64 size)
+-{
+- return div_u64(size + BTRFS_MAX_EXTENT_SIZE - 1, BTRFS_MAX_EXTENT_SIZE);
+-}
+-
+ static inline unsigned long btrfs_chunk_item_size(int num_stripes)
+ {
+ BUG_ON(num_stripes == 0);
+@@ -635,6 +627,9 @@ enum {
+ /* Indicate we have half completed snapshot deletions pending. */
+ BTRFS_FS_UNFINISHED_DROPS,
+
++ /* Indicate we have to finish a zone to do next allocation. */
++ BTRFS_FS_NEED_ZONE_FINISH,
++
+ #if BITS_PER_LONG == 32
+ /* Indicate if we have error/warn message printed on 32bit systems */
+ BTRFS_FS_32BIT_ERROR,
+@@ -1032,6 +1027,12 @@ struct btrfs_fs_info {
+ u32 csums_per_leaf;
+ u32 stripesize;
+
++ /*
++ * Maximum size of an extent. BTRFS_MAX_EXTENT_SIZE on regular
++ * filesystem, on zoned it depends on the device constraints.
++ */
++ u64 max_extent_size;
++
+ /* Block groups and devices containing active swapfiles. */
+ spinlock_t swapfile_pins_lock;
+ struct rb_root swapfile_pins;
+@@ -1050,6 +1051,8 @@ struct btrfs_fs_info {
+ u64 zoned;
+ };
+
++ /* Max size to emit ZONE_APPEND write command */
++ u64 max_zone_append_size;
+ struct mutex zoned_meta_io_lock;
+ spinlock_t treelog_bg_lock;
+ u64 treelog_bg;
+@@ -1066,6 +1069,8 @@ struct btrfs_fs_info {
+
+ spinlock_t zone_active_bgs_lock;
+ struct list_head zone_active_bgs;
++ /* Waiters when BTRFS_FS_NEED_ZONE_FINISH is set */
++ wait_queue_head_t zone_finish_wait;
+
+ #ifdef CONFIG_BTRFS_FS_REF_VERIFY
+ spinlock_t ref_verify_lock;
+@@ -3932,6 +3937,19 @@ static inline bool btrfs_is_zoned(const struct btrfs_fs_info *fs_info)
+ return fs_info->zoned != 0;
+ }
+
++/*
++ * Count how many fs_info->max_extent_size cover the @size
++ */
++static inline u32 count_max_extents(struct btrfs_fs_info *fs_info, u64 size)
++{
++#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
++ if (!fs_info)
++ return div_u64(size + BTRFS_MAX_EXTENT_SIZE - 1, BTRFS_MAX_EXTENT_SIZE);
++#endif
++
++ return div_u64(size + fs_info->max_extent_size - 1, fs_info->max_extent_size);
++}
++
+ static inline bool btrfs_is_data_reloc_root(const struct btrfs_root *root)
+ {
+ return root->root_key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID;
+diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
+index bd8267c4687d9..a224f7e52bf72 100644
+--- a/fs/btrfs/delalloc-space.c
++++ b/fs/btrfs/delalloc-space.c
+@@ -273,7 +273,7 @@ static void calc_inode_reservations(struct btrfs_fs_info *fs_info,
+ u64 num_bytes, u64 disk_num_bytes,
+ u64 *meta_reserve, u64 *qgroup_reserve)
+ {
+- u64 nr_extents = count_max_extents(num_bytes);
++ u64 nr_extents = count_max_extents(fs_info, num_bytes);
+ u64 csum_leaves = btrfs_csum_bytes_to_leaves(fs_info, disk_num_bytes);
+ u64 inode_update = btrfs_calc_metadata_size(fs_info, 1);
+
+@@ -349,7 +349,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes,
+ * needs to free the reservation we just made.
+ */
+ spin_lock(&inode->lock);
+- nr_extents = count_max_extents(num_bytes);
++ nr_extents = count_max_extents(fs_info, num_bytes);
+ btrfs_mod_outstanding_extents(inode, nr_extents);
+ inode->csum_bytes += disk_num_bytes;
+ btrfs_calculate_inode_block_rsv_size(fs_info, inode);
+@@ -412,7 +412,7 @@ void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes)
+ unsigned num_extents;
+
+ spin_lock(&inode->lock);
+- num_extents = count_max_extents(num_bytes);
++ num_extents = count_max_extents(fs_info, num_bytes);
+ btrfs_mod_outstanding_extents(inode, -num_extents);
+ btrfs_calculate_inode_block_rsv_size(fs_info, inode);
+ spin_unlock(&inode->lock);
+diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
+index 6f30413ed9a9b..59fa7bf3a2e55 100644
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -3239,6 +3239,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
+ init_waitqueue_head(&fs_info->transaction_blocked_wait);
+ init_waitqueue_head(&fs_info->async_submit_wait);
+ init_waitqueue_head(&fs_info->delayed_iputs_wait);
++ init_waitqueue_head(&fs_info->zone_finish_wait);
+
+ /* Usable values until the real ones are cached from the superblock */
+ fs_info->nodesize = 4096;
+@@ -3246,6 +3247,8 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
+ fs_info->sectorsize_bits = ilog2(4096);
+ fs_info->stripesize = 4096;
+
++ fs_info->max_extent_size = BTRFS_MAX_EXTENT_SIZE;
++
+ spin_lock_init(&fs_info->swapfile_pins_lock);
+ fs_info->swapfile_pins = RB_ROOT;
+
+@@ -3577,16 +3580,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
+ */
+ fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
+
+- /*
+- * Flag our filesystem as having big metadata blocks if they are bigger
+- * than the page size.
+- */
+- if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
+- if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
+- btrfs_info(fs_info,
+- "flagging fs with big metadata feature");
+- features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
+- }
+
+ /* Set up fs_info before parsing mount options */
+ nodesize = btrfs_super_nodesize(disk_super);
+@@ -3627,6 +3620,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
+ if (features & BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
+ btrfs_info(fs_info, "has skinny extents");
+
++ /*
++ * Flag our filesystem as having big metadata blocks if they are bigger
++ * than the page size.
++ */
++ if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
++ if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
++ btrfs_info(fs_info,
++ "flagging fs with big metadata feature");
++ features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
++ }
++
+ /*
+ * mixed block groups end up with duplicate but slightly offset
+ * extent buffers for the same range. It leads to corruptions
+@@ -3654,6 +3658,20 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
+ err = -EINVAL;
+ goto fail_alloc;
+ }
++ /*
++ * We have unsupported RO compat features, although RO mounted, we
++ * should not cause any metadata write, including log replay.
++ * Or we could screw up whatever the new feature requires.
++ */
++ if (unlikely(features && btrfs_super_log_root(disk_super) &&
++ !btrfs_test_opt(fs_info, NOLOGREPLAY))) {
++ btrfs_err(fs_info,
++"cannot replay dirty log with unsupported compat_ro features (0x%llx), try rescue=nologreplay",
++ features);
++ err = -EINVAL;
++ goto fail_alloc;
++ }
++
+
+ if (sectorsize < PAGE_SIZE) {
+ struct btrfs_subpage_info *subpage_info;
+diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
+index f45ecd939a2cb..eee68a6f2be77 100644
+--- a/fs/btrfs/extent-tree.c
++++ b/fs/btrfs/extent-tree.c
+@@ -3803,8 +3803,7 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
+
+ /* Check RO and no space case before trying to activate it */
+ spin_lock(&block_group->lock);
+- if (block_group->ro ||
+- block_group->alloc_offset == block_group->zone_capacity) {
++ if (block_group->ro || btrfs_zoned_bg_is_full(block_group)) {
+ ret = 1;
+ /*
+ * May need to clear fs_info->{treelog,data_reloc}_bg.
+@@ -3985,23 +3984,63 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
+ }
+ }
+
+-static bool can_allocate_chunk(struct btrfs_fs_info *fs_info,
+- struct find_free_extent_ctl *ffe_ctl)
++static int can_allocate_chunk_zoned(struct btrfs_fs_info *fs_info,
++ struct find_free_extent_ctl *ffe_ctl)
++{
++ /* If we can activate new zone, just allocate a chunk and use it */
++ if (btrfs_can_activate_zone(fs_info->fs_devices, ffe_ctl->flags))
++ return 0;
++
++ /*
++ * We already reached the max active zones. Try to finish one block
++ * group to make a room for a new block group. This is only possible
++ * for a data block group because btrfs_zone_finish() may need to wait
++ * for a running transaction which can cause a deadlock for metadata
++ * allocation.
++ */
++ if (ffe_ctl->flags & BTRFS_BLOCK_GROUP_DATA) {
++ int ret = btrfs_zone_finish_one_bg(fs_info);
++
++ if (ret == 1)
++ return 0;
++ else if (ret < 0)
++ return ret;
++ }
++
++ /*
++ * If we have enough free space left in an already active block group
++ * and we can't activate any other zone now, do not allow allocating a
++ * new chunk and let find_free_extent() retry with a smaller size.
++ */
++ if (ffe_ctl->max_extent_size >= ffe_ctl->min_alloc_size)
++ return -ENOSPC;
++
++ /*
++ * Even min_alloc_size is not left in any block groups. Since we cannot
++ * activate a new block group, allocating it may not help. Let's tell a
++ * caller to try again and hope it progress something by writing some
++ * parts of the region. That is only possible for data block groups,
++ * where a part of the region can be written.
++ */
++ if (ffe_ctl->flags & BTRFS_BLOCK_GROUP_DATA)
++ return -EAGAIN;
++
++ /*
++ * We cannot activate a new block group and no enough space left in any
++ * block groups. So, allocating a new block group may not help. But,
++ * there is nothing to do anyway, so let's go with it.
++ */
++ return 0;
++}
++
++static int can_allocate_chunk(struct btrfs_fs_info *fs_info,
++ struct find_free_extent_ctl *ffe_ctl)
+ {
+ switch (ffe_ctl->policy) {
+ case BTRFS_EXTENT_ALLOC_CLUSTERED:
+- return true;
++ return 0;
+ case BTRFS_EXTENT_ALLOC_ZONED:
+- /*
+- * If we have enough free space left in an already
+- * active block group and we can't activate any other
+- * zone now, do not allow allocating a new chunk and
+- * let find_free_extent() retry with a smaller size.
+- */
+- if (ffe_ctl->max_extent_size >= ffe_ctl->min_alloc_size &&
+- !btrfs_can_activate_zone(fs_info->fs_devices, ffe_ctl->flags))
+- return false;
+- return true;
++ return can_allocate_chunk_zoned(fs_info, ffe_ctl);
+ default:
+ BUG();
+ }
+@@ -4083,8 +4122,9 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
+ int exist = 0;
+
+ /*Check if allocation policy allows to create a new chunk */
+- if (!can_allocate_chunk(fs_info, ffe_ctl))
+- return -ENOSPC;
++ ret = can_allocate_chunk(fs_info, ffe_ctl);
++ if (ret)
++ return ret;
+
+ trans = current->journal_info;
+ if (trans)
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index 68ddd90685d9d..bfc7d5b31156e 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -1992,10 +1992,12 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
+ struct page *locked_page, u64 *start,
+ u64 *end)
+ {
++ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+ struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
+ const u64 orig_start = *start;
+ const u64 orig_end = *end;
+- u64 max_bytes = BTRFS_MAX_EXTENT_SIZE;
++ /* The sanity tests may not set a valid fs_info. */
++ u64 max_bytes = fs_info ? fs_info->max_extent_size : BTRFS_MAX_EXTENT_SIZE;
+ u64 delalloc_start;
+ u64 delalloc_end;
+ bool found;
+diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
+index 153920acd2269..2d24f2dcc0ea4 100644
+--- a/fs/btrfs/file.c
++++ b/fs/btrfs/file.c
+@@ -2344,7 +2344,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
+ btrfs_release_log_ctx_extents(&ctx);
+ if (ret < 0) {
+ /* Fallthrough and commit/free transaction. */
+- ret = 1;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ }
+
+ /* we've logged all the items and now have a consistent
+diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
+index 01a408db56833..ef84bc5030cd8 100644
+--- a/fs/btrfs/free-space-cache.c
++++ b/fs/btrfs/free-space-cache.c
+@@ -2630,16 +2630,19 @@ out:
+ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
+ u64 bytenr, u64 size, bool used)
+ {
+- struct btrfs_fs_info *fs_info = block_group->fs_info;
++ struct btrfs_space_info *sinfo = block_group->space_info;
+ struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl;
+ u64 offset = bytenr - block_group->start;
+ u64 to_free, to_unusable;
+- const int bg_reclaim_threshold = READ_ONCE(fs_info->bg_reclaim_threshold);
++ int bg_reclaim_threshold = 0;
+ bool initial = (size == block_group->length);
+ u64 reclaimable_unusable;
+
+ WARN_ON(!initial && offset + size > block_group->zone_capacity);
+
++ if (!initial)
++ bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
++
+ spin_lock(&ctl->tree_lock);
+ if (!used)
+ to_free = size;
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index 5d15e374d0326..6cf4b31466678 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -92,7 +92,8 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent);
+ static noinline int cow_file_range(struct btrfs_inode *inode,
+ struct page *locked_page,
+ u64 start, u64 end, int *page_started,
+- unsigned long *nr_written, int unlock);
++ unsigned long *nr_written, int unlock,
++ u64 *done_offset);
+ static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start,
+ u64 len, u64 orig_start, u64 block_start,
+ u64 block_len, u64 orig_block_len,
+@@ -884,15 +885,25 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
+ * can directly submit them without interruption.
+ */
+ ret = cow_file_range(inode, locked_page, start, end, &page_started,
+- &nr_written, 0);
++ &nr_written, 0, NULL);
+ /* Inline extent inserted, page gets unlocked and everything is done */
+ if (page_started) {
+ ret = 0;
+ goto out;
+ }
+ if (ret < 0) {
+- if (locked_page)
++ btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
++ if (locked_page) {
++ const u64 page_start = page_offset(locked_page);
++ const u64 page_end = page_start + PAGE_SIZE - 1;
++
++ btrfs_page_set_error(inode->root->fs_info, locked_page,
++ page_start, PAGE_SIZE);
++ set_page_writeback(locked_page);
++ end_page_writeback(locked_page);
++ end_extent_writepage(locked_page, ret, page_start, page_end);
+ unlock_page(locked_page);
++ }
+ goto out;
+ }
+
+@@ -1097,15 +1108,39 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
+ * *page_started is set to one if we unlock locked_page and do everything
+ * required to start IO on it. It may be clean and already done with
+ * IO when we return.
++ *
++ * When unlock == 1, we unlock the pages in successfully allocated regions.
++ * When unlock == 0, we leave them locked for writing them out.
++ *
++ * However, we unlock all the pages except @locked_page in case of failure.
++ *
++ * In summary, page locking state will be as follow:
++ *
++ * - page_started == 1 (return value)
++ * - All the pages are unlocked. IO is started.
++ * - Note that this can happen only on success
++ * - unlock == 1
++ * - All the pages except @locked_page are unlocked in any case
++ * - unlock == 0
++ * - On success, all the pages are locked for writing out them
++ * - On failure, all the pages except @locked_page are unlocked
++ *
++ * When a failure happens in the second or later iteration of the
++ * while-loop, the ordered extents created in previous iterations are kept
++ * intact. So, the caller must clean them up by calling
++ * btrfs_cleanup_ordered_extents(). See btrfs_run_delalloc_range() for
++ * example.
+ */
+ static noinline int cow_file_range(struct btrfs_inode *inode,
+ struct page *locked_page,
+ u64 start, u64 end, int *page_started,
+- unsigned long *nr_written, int unlock)
++ unsigned long *nr_written, int unlock,
++ u64 *done_offset)
+ {
+ struct btrfs_root *root = inode->root;
+ struct btrfs_fs_info *fs_info = root->fs_info;
+ u64 alloc_hint = 0;
++ u64 orig_start = start;
+ u64 num_bytes;
+ unsigned long ram_size;
+ u64 cur_alloc_size = 0;
+@@ -1293,18 +1328,62 @@ out_reserve:
+ btrfs_dec_block_group_reservations(fs_info, ins.objectid);
+ btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1);
+ out_unlock:
++ /*
++ * If done_offset is non-NULL and ret == -EAGAIN, we expect the
++ * caller to write out the successfully allocated region and retry.
++ */
++ if (done_offset && ret == -EAGAIN) {
++ if (orig_start < start)
++ *done_offset = start - 1;
++ else
++ *done_offset = start;
++ return ret;
++ } else if (ret == -EAGAIN) {
++ /* Convert to -ENOSPC since the caller cannot retry. */
++ ret = -ENOSPC;
++ }
++
++ /*
++ * Now, we have three regions to clean up:
++ *
++ * |-------(1)----|---(2)---|-------------(3)----------|
++ * `- orig_start `- start `- start + cur_alloc_size `- end
++ *
++ * We process each region below.
++ */
++
+ clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
+ EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV;
+ page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
++
+ /*
+- * If we reserved an extent for our delalloc range (or a subrange) and
+- * failed to create the respective ordered extent, then it means that
+- * when we reserved the extent we decremented the extent's size from
+- * the data space_info's bytes_may_use counter and incremented the
+- * space_info's bytes_reserved counter by the same amount. We must make
+- * sure extent_clear_unlock_delalloc() does not try to decrement again
+- * the data space_info's bytes_may_use counter, therefore we do not pass
+- * it the flag EXTENT_CLEAR_DATA_RESV.
++ * For the range (1). We have already instantiated the ordered extents
++ * for this region. They are cleaned up by
++ * btrfs_cleanup_ordered_extents() in e.g,
++ * btrfs_run_delalloc_range(). EXTENT_LOCKED | EXTENT_DELALLOC are
++ * already cleared in the above loop. And, EXTENT_DELALLOC_NEW |
++ * EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV are handled by the cleanup
++ * function.
++ *
++ * However, in case of unlock == 0, we still need to unlock the pages
++ * (except @locked_page) to ensure all the pages are unlocked.
++ */
++ if (!unlock && orig_start < start) {
++ if (!locked_page)
++ mapping_set_error(inode->vfs_inode.i_mapping, ret);
++ extent_clear_unlock_delalloc(inode, orig_start, start - 1,
++ locked_page, 0, page_ops);
++ }
++
++ /*
++ * For the range (2). If we reserved an extent for our delalloc range
++ * (or a subrange) and failed to create the respective ordered extent,
++ * then it means that when we reserved the extent we decremented the
++ * extent's size from the data space_info's bytes_may_use counter and
++ * incremented the space_info's bytes_reserved counter by the same
++ * amount. We must make sure extent_clear_unlock_delalloc() does not try
++ * to decrement again the data space_info's bytes_may_use counter,
++ * therefore we do not pass it the flag EXTENT_CLEAR_DATA_RESV.
+ */
+ if (extent_reserved) {
+ extent_clear_unlock_delalloc(inode, start,
+@@ -1316,6 +1395,13 @@ out_unlock:
+ if (start >= end)
+ goto out;
+ }
++
++ /*
++ * For the range (3). We never touched the region. In addition to the
++ * clear_bits above, we add EXTENT_CLEAR_DATA_RESV to release the data
++ * space_info's bytes_may_use counter, reserved in
++ * btrfs_check_data_free_space().
++ */
+ extent_clear_unlock_delalloc(inode, start, end, locked_page,
+ clear_bits | EXTENT_CLEAR_DATA_RESV,
+ page_ops);
+@@ -1502,19 +1588,42 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
+ u64 end, int *page_started,
+ unsigned long *nr_written)
+ {
++ u64 done_offset = end;
+ int ret;
++ bool locked_page_done = false;
+
+- ret = cow_file_range(inode, locked_page, start, end, page_started,
+- nr_written, 0);
+- if (ret)
+- return ret;
++ while (start <= end) {
++ ret = cow_file_range(inode, locked_page, start, end, page_started,
++ nr_written, 0, &done_offset);
++ if (ret && ret != -EAGAIN)
++ return ret;
+
+- if (*page_started)
+- return 0;
++ if (*page_started) {
++ ASSERT(ret == 0);
++ return 0;
++ }
++
++ if (ret == 0)
++ done_offset = end;
++
++ if (done_offset == start) {
++ struct btrfs_fs_info *info = inode->root->fs_info;
++
++ wait_var_event(&info->zone_finish_wait,
++ !test_bit(BTRFS_FS_NEED_ZONE_FINISH, &info->flags));
++ continue;
++ }
++
++ if (!locked_page_done) {
++ __set_page_dirty_nobuffers(locked_page);
++ account_page_redirty(locked_page);
++ }
++ locked_page_done = true;
++ extent_write_locked_range(&inode->vfs_inode, start, done_offset);
++
++ start = done_offset + 1;
++ }
+
+- __set_page_dirty_nobuffers(locked_page);
+- account_page_redirty(locked_page);
+- extent_write_locked_range(&inode->vfs_inode, start, end);
+ *page_started = 1;
+
+ return 0;
+@@ -1606,7 +1715,7 @@ static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page,
+ }
+
+ return cow_file_range(inode, locked_page, start, end, page_started,
+- nr_written, 1);
++ nr_written, 1, NULL);
+ }
+
+ /*
+@@ -2017,7 +2126,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
+ page_started, nr_written);
+ else
+ ret = cow_file_range(inode, locked_page, start, end,
+- page_started, nr_written, 1);
++ page_started, nr_written, 1, NULL);
+ } else {
+ set_bit(BTRFS_INODE_HAS_ASYNC_EXTENT, &inode->runtime_flags);
+ ret = cow_file_range_async(inode, wbc, locked_page, start, end,
+@@ -2033,6 +2142,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
+ void btrfs_split_delalloc_extent(struct inode *inode,
+ struct extent_state *orig, u64 split)
+ {
++ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+ u64 size;
+
+ /* not delalloc, ignore it */
+@@ -2040,7 +2150,7 @@ void btrfs_split_delalloc_extent(struct inode *inode,
+ return;
+
+ size = orig->end - orig->start + 1;
+- if (size > BTRFS_MAX_EXTENT_SIZE) {
++ if (size > fs_info->max_extent_size) {
+ u32 num_extents;
+ u64 new_size;
+
+@@ -2049,10 +2159,10 @@ void btrfs_split_delalloc_extent(struct inode *inode,
+ * applies here, just in reverse.
+ */
+ new_size = orig->end - split + 1;
+- num_extents = count_max_extents(new_size);
++ num_extents = count_max_extents(fs_info, new_size);
+ new_size = split - orig->start;
+- num_extents += count_max_extents(new_size);
+- if (count_max_extents(size) >= num_extents)
++ num_extents += count_max_extents(fs_info, new_size);
++ if (count_max_extents(fs_info, size) >= num_extents)
+ return;
+ }
+
+@@ -2069,6 +2179,7 @@ void btrfs_split_delalloc_extent(struct inode *inode,
+ void btrfs_merge_delalloc_extent(struct inode *inode, struct extent_state *new,
+ struct extent_state *other)
+ {
++ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+ u64 new_size, old_size;
+ u32 num_extents;
+
+@@ -2082,7 +2193,7 @@ void btrfs_merge_delalloc_extent(struct inode *inode, struct extent_state *new,
+ new_size = other->end - new->start + 1;
+
+ /* we're not bigger than the max, unreserve the space and go */
+- if (new_size <= BTRFS_MAX_EXTENT_SIZE) {
++ if (new_size <= fs_info->max_extent_size) {
+ spin_lock(&BTRFS_I(inode)->lock);
+ btrfs_mod_outstanding_extents(BTRFS_I(inode), -1);
+ spin_unlock(&BTRFS_I(inode)->lock);
+@@ -2108,10 +2219,10 @@ void btrfs_merge_delalloc_extent(struct inode *inode, struct extent_state *new,
+ * this case.
+ */
+ old_size = other->end - other->start + 1;
+- num_extents = count_max_extents(old_size);
++ num_extents = count_max_extents(fs_info, old_size);
+ old_size = new->end - new->start + 1;
+- num_extents += count_max_extents(old_size);
+- if (count_max_extents(new_size) >= num_extents)
++ num_extents += count_max_extents(fs_info, old_size);
++ if (count_max_extents(fs_info, new_size) >= num_extents)
+ return;
+
+ spin_lock(&BTRFS_I(inode)->lock);
+@@ -2190,7 +2301,7 @@ void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
+ if (!(state->state & EXTENT_DELALLOC) && (*bits & EXTENT_DELALLOC)) {
+ struct btrfs_root *root = BTRFS_I(inode)->root;
+ u64 len = state->end + 1 - state->start;
+- u32 num_extents = count_max_extents(len);
++ u32 num_extents = count_max_extents(fs_info, len);
+ bool do_list = !btrfs_is_free_space_inode(BTRFS_I(inode));
+
+ spin_lock(&BTRFS_I(inode)->lock);
+@@ -2232,7 +2343,7 @@ void btrfs_clear_delalloc_extent(struct inode *vfs_inode,
+ struct btrfs_inode *inode = BTRFS_I(vfs_inode);
+ struct btrfs_fs_info *fs_info = btrfs_sb(vfs_inode->i_sb);
+ u64 len = state->end + 1 - state->start;
+- u32 num_extents = count_max_extents(len);
++ u32 num_extents = count_max_extents(fs_info, len);
+
+ if ((state->state & EXTENT_DEFRAG) && (*bits & EXTENT_DEFRAG)) {
+ spin_lock(&inode->lock);
+diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
+index b87931a458ebb..104cbc901c0e6 100644
+--- a/fs/btrfs/space-info.c
++++ b/fs/btrfs/space-info.c
+@@ -9,6 +9,7 @@
+ #include "ordered-data.h"
+ #include "transaction.h"
+ #include "block-group.h"
++#include "zoned.h"
+
+ /*
+ * HOW DOES SPACE RESERVATION WORK
+@@ -181,6 +182,43 @@ void btrfs_clear_space_info_full(struct btrfs_fs_info *info)
+ found->full = 0;
+ }
+
++/*
++ * Block groups with more than this value (percents) of unusable space will be
++ * scheduled for background reclaim.
++ */
++#define BTRFS_DEFAULT_ZONED_RECLAIM_THRESH (75)
++
++/*
++ * Calculate chunk size depending on volume type (regular or zoned).
++ */
++static u64 calc_chunk_size(const struct btrfs_fs_info *fs_info, u64 flags)
++{
++ if (btrfs_is_zoned(fs_info))
++ return fs_info->zone_size;
++
++ ASSERT(flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
++
++ if (flags & BTRFS_BLOCK_GROUP_DATA)
++ return SZ_1G;
++ else if (flags & BTRFS_BLOCK_GROUP_SYSTEM)
++ return SZ_32M;
++
++ /* Handle BTRFS_BLOCK_GROUP_METADATA */
++ if (fs_info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
++ return SZ_1G;
++
++ return SZ_256M;
++}
++
++/*
++ * Update default chunk size.
++ */
++void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
++ u64 chunk_size)
++{
++ WRITE_ONCE(space_info->chunk_size, chunk_size);
++}
++
+ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
+ {
+
+@@ -202,6 +240,10 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
+ INIT_LIST_HEAD(&space_info->tickets);
+ INIT_LIST_HEAD(&space_info->priority_tickets);
+ space_info->clamp = 1;
++ btrfs_update_space_info_chunk_size(space_info, calc_chunk_size(info, flags));
++
++ if (btrfs_is_zoned(info))
++ space_info->bg_reclaim_threshold = BTRFS_DEFAULT_ZONED_RECLAIM_THRESH;
+
+ ret = btrfs_sysfs_add_space_info_type(info, space_info);
+ if (ret)
+@@ -254,7 +296,7 @@ out:
+ void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
+ u64 total_bytes, u64 bytes_used,
+ u64 bytes_readonly, u64 bytes_zone_unusable,
+- struct btrfs_space_info **space_info)
++ bool active, struct btrfs_space_info **space_info)
+ {
+ struct btrfs_space_info *found;
+ int factor;
+@@ -265,6 +307,8 @@ void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
+ ASSERT(found);
+ spin_lock(&found->lock);
+ found->total_bytes += total_bytes;
++ if (active)
++ found->active_total_bytes += total_bytes;
+ found->disk_total += total_bytes * factor;
+ found->bytes_used += bytes_used;
+ found->disk_used += bytes_used * factor;
+@@ -328,6 +372,22 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
+ return avail;
+ }
+
++static inline u64 writable_total_bytes(struct btrfs_fs_info *fs_info,
++ struct btrfs_space_info *space_info)
++{
++ /*
++ * On regular filesystem, all total_bytes are always writable. On zoned
++ * filesystem, there may be a limitation imposed by max_active_zones.
++ * For metadata allocation, we cannot finish an existing active block
++ * group to avoid a deadlock. Thus, we need to consider only the active
++ * groups to be writable for metadata space.
++ */
++ if (!btrfs_is_zoned(fs_info) || (space_info->flags & BTRFS_BLOCK_GROUP_DATA))
++ return space_info->total_bytes;
++
++ return space_info->active_total_bytes;
++}
++
+ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info,
+ struct btrfs_space_info *space_info, u64 bytes,
+ enum btrfs_reserve_flush_enum flush)
+@@ -340,9 +400,12 @@ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info,
+ return 0;
+
+ used = btrfs_space_info_used(space_info, true);
+- avail = calc_available_free_space(fs_info, space_info, flush);
++ if (btrfs_is_zoned(fs_info) && (space_info->flags & BTRFS_BLOCK_GROUP_METADATA))
++ avail = 0;
++ else
++ avail = calc_available_free_space(fs_info, space_info, flush);
+
+- if (used + bytes < space_info->total_bytes + avail)
++ if (used + bytes < writable_total_bytes(fs_info, space_info) + avail)
+ return 1;
+ return 0;
+ }
+@@ -378,7 +441,7 @@ again:
+ ticket = list_first_entry(head, struct reserve_ticket, list);
+
+ /* Check and see if our ticket can be satisfied now. */
+- if ((used + ticket->bytes <= space_info->total_bytes) ||
++ if ((used + ticket->bytes <= writable_total_bytes(fs_info, space_info)) ||
+ btrfs_can_overcommit(fs_info, space_info, ticket->bytes,
+ flush)) {
+ btrfs_space_info_update_bytes_may_use(fs_info,
+@@ -662,6 +725,18 @@ static void flush_space(struct btrfs_fs_info *fs_info,
+ break;
+ case ALLOC_CHUNK:
+ case ALLOC_CHUNK_FORCE:
++ /*
++ * For metadata space on zoned filesystem, reaching here means we
++ * don't have enough space left in active_total_bytes. Try to
++ * activate a block group first, because we may have inactive
++ * block group already allocated.
++ */
++ ret = btrfs_zoned_activate_one_bg(fs_info, space_info, false);
++ if (ret < 0)
++ break;
++ else if (ret == 1)
++ break;
++
+ trans = btrfs_join_transaction(root);
+ if (IS_ERR(trans)) {
+ ret = PTR_ERR(trans);
+@@ -672,6 +747,23 @@ static void flush_space(struct btrfs_fs_info *fs_info,
+ (state == ALLOC_CHUNK) ? CHUNK_ALLOC_NO_FORCE :
+ CHUNK_ALLOC_FORCE);
+ btrfs_end_transaction(trans);
++
++ /*
++ * For metadata space on zoned filesystem, allocating a new chunk
++ * is not enough. We still need to activate the block * group.
++ * Active the newly allocated block group by (maybe) finishing
++ * a block group.
++ */
++ if (ret == 1) {
++ ret = btrfs_zoned_activate_one_bg(fs_info, space_info, true);
++ /*
++ * Revert to the original ret regardless we could finish
++ * one block group or not.
++ */
++ if (ret >= 0)
++ ret = 1;
++ }
++
+ if (ret > 0 || ret == -ENOSPC)
+ ret = 0;
+ break;
+@@ -709,6 +801,7 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info,
+ {
+ u64 used;
+ u64 avail;
++ u64 total;
+ u64 to_reclaim = space_info->reclaim_size;
+
+ lockdep_assert_held(&space_info->lock);
+@@ -723,8 +816,9 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info,
+ * space. If that's the case add in our overage so we make sure to put
+ * appropriate pressure on the flushing state machine.
+ */
+- if (space_info->total_bytes + avail < used)
+- to_reclaim += used - (space_info->total_bytes + avail);
++ total = writable_total_bytes(fs_info, space_info);
++ if (total + avail < used)
++ to_reclaim += used - (total + avail);
+
+ return to_reclaim;
+ }
+@@ -734,9 +828,12 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
+ {
+ u64 global_rsv_size = fs_info->global_block_rsv.reserved;
+ u64 ordered, delalloc;
+- u64 thresh = div_factor_fine(space_info->total_bytes, 90);
++ u64 total = writable_total_bytes(fs_info, space_info);
++ u64 thresh;
+ u64 used;
+
++ thresh = div_factor_fine(total, 90);
++
+ lockdep_assert_held(&space_info->lock);
+
+ /* If we're just plain full then async reclaim just slows us down. */
+@@ -798,8 +895,8 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
+ BTRFS_RESERVE_FLUSH_ALL);
+ used = space_info->bytes_used + space_info->bytes_reserved +
+ space_info->bytes_readonly + global_rsv_size;
+- if (used < space_info->total_bytes)
+- thresh += space_info->total_bytes - used;
++ if (used < total)
++ thresh += total - used;
+ thresh >>= space_info->clamp;
+
+ used = space_info->bytes_pinned;
+@@ -1516,7 +1613,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
+ * can_overcommit() to ensure we can overcommit to continue.
+ */
+ if (!pending_tickets &&
+- ((used + orig_bytes <= space_info->total_bytes) ||
++ ((used + orig_bytes <= writable_total_bytes(fs_info, space_info)) ||
+ btrfs_can_overcommit(fs_info, space_info, orig_bytes, flush))) {
+ btrfs_space_info_update_bytes_may_use(fs_info, space_info,
+ orig_bytes);
+diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
+index d841fed734923..b8cee27df2134 100644
+--- a/fs/btrfs/space-info.h
++++ b/fs/btrfs/space-info.h
+@@ -17,12 +17,22 @@ struct btrfs_space_info {
+ u64 bytes_may_use; /* number of bytes that may be used for
+ delalloc/allocations */
+ u64 bytes_readonly; /* total bytes that are read only */
++ /* Total bytes in the space, but only accounts active block groups. */
++ u64 active_total_bytes;
+ u64 bytes_zone_unusable; /* total bytes that are unusable until
+ resetting the device zone */
+
+ u64 max_extent_size; /* This will hold the maximum extent size of
+ the space info if we had an ENOSPC in the
+ allocator. */
++ /* Chunk size in bytes */
++ u64 chunk_size;
++
++ /*
++ * Once a block group drops below this threshold (percents) we'll
++ * schedule it for reclaim.
++ */
++ int bg_reclaim_threshold;
+
+ int clamp; /* Used to scale our threshold for preemptive
+ flushing. The value is >> clamp, so turns
+@@ -114,7 +124,9 @@ int btrfs_init_space_info(struct btrfs_fs_info *fs_info);
+ void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
+ u64 total_bytes, u64 bytes_used,
+ u64 bytes_readonly, u64 bytes_zone_unusable,
+- struct btrfs_space_info **space_info);
++ bool active, struct btrfs_space_info **space_info);
++void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
++ u64 chunk_size);
+ struct btrfs_space_info *btrfs_find_space_info(struct btrfs_fs_info *info,
+ u64 flags);
+ u64 __pure btrfs_space_info_used(struct btrfs_space_info *s_info,
+diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
+index ba78ca5aabbb2..43845cae0c742 100644
+--- a/fs/btrfs/sysfs.c
++++ b/fs/btrfs/sysfs.c
+@@ -722,6 +722,42 @@ SPACE_INFO_ATTR(bytes_zone_unusable);
+ SPACE_INFO_ATTR(disk_used);
+ SPACE_INFO_ATTR(disk_total);
+
++static ssize_t btrfs_sinfo_bg_reclaim_threshold_show(struct kobject *kobj,
++ struct kobj_attribute *a,
++ char *buf)
++{
++ struct btrfs_space_info *space_info = to_space_info(kobj);
++ ssize_t ret;
++
++ ret = sysfs_emit(buf, "%d\n", READ_ONCE(space_info->bg_reclaim_threshold));
++
++ return ret;
++}
++
++static ssize_t btrfs_sinfo_bg_reclaim_threshold_store(struct kobject *kobj,
++ struct kobj_attribute *a,
++ const char *buf, size_t len)
++{
++ struct btrfs_space_info *space_info = to_space_info(kobj);
++ int thresh;
++ int ret;
++
++ ret = kstrtoint(buf, 10, &thresh);
++ if (ret)
++ return ret;
++
++ if (thresh != 0 && (thresh <= 50 || thresh > 100))
++ return -EINVAL;
++
++ WRITE_ONCE(space_info->bg_reclaim_threshold, thresh);
++
++ return len;
++}
++
++BTRFS_ATTR_RW(space_info, bg_reclaim_threshold,
++ btrfs_sinfo_bg_reclaim_threshold_show,
++ btrfs_sinfo_bg_reclaim_threshold_store);
++
+ /*
+ * Allocation information about block group types.
+ *
+@@ -738,6 +774,7 @@ static struct attribute *space_info_attrs[] = {
+ BTRFS_ATTR_PTR(space_info, bytes_zone_unusable),
+ BTRFS_ATTR_PTR(space_info, disk_used),
+ BTRFS_ATTR_PTR(space_info, disk_total),
++ BTRFS_ATTR_PTR(space_info, bg_reclaim_threshold),
+ NULL,
+ };
+ ATTRIBUTE_GROUPS(space_info);
+diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
+index e65633686378c..2cc376775ae47 100644
+--- a/fs/btrfs/tree-log.c
++++ b/fs/btrfs/tree-log.c
+@@ -171,7 +171,7 @@ again:
+ int index = (root->log_transid + 1) % 2;
+
+ if (btrfs_need_log_full_commit(trans)) {
+- ret = -EAGAIN;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ goto out;
+ }
+
+@@ -194,7 +194,7 @@ again:
+ * writing.
+ */
+ if (zoned && !created) {
+- ret = -EAGAIN;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ goto out;
+ }
+
+@@ -3122,7 +3122,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
+
+ /* bail out if we need to do a full commit */
+ if (btrfs_need_log_full_commit(trans)) {
+- ret = -EAGAIN;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ mutex_unlock(&root->log_mutex);
+ goto out;
+ }
+@@ -3223,7 +3223,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
+ }
+ btrfs_wait_tree_log_extents(log, mark);
+ mutex_unlock(&log_root_tree->log_mutex);
+- ret = -EAGAIN;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ goto out;
+ }
+
+@@ -3262,7 +3262,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
+ blk_finish_plug(&plug);
+ btrfs_wait_tree_log_extents(log, mark);
+ mutex_unlock(&log_root_tree->log_mutex);
+- ret = -EAGAIN;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ goto out_wake_log_root;
+ }
+
+@@ -5849,7 +5849,7 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
+ inode_only == LOG_INODE_ALL &&
+ inode->last_unlink_trans >= trans->transid) {
+ btrfs_set_log_full_commit(trans);
+- ret = 1;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ goto out_unlock;
+ }
+
+@@ -6563,12 +6563,12 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
+ bool log_dentries = false;
+
+ if (btrfs_test_opt(fs_info, NOTREELOG)) {
+- ret = 1;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ goto end_no_trans;
+ }
+
+ if (btrfs_root_refs(&root->root_item) == 0) {
+- ret = 1;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ goto end_no_trans;
+ }
+
+@@ -6666,7 +6666,7 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
+ end_trans:
+ if (ret < 0) {
+ btrfs_set_log_full_commit(trans);
+- ret = 1;
++ ret = BTRFS_LOG_FORCE_COMMIT;
+ }
+
+ if (ret)
+@@ -7030,8 +7030,15 @@ void btrfs_log_new_name(struct btrfs_trans_handle *trans,
+ * anyone from syncing the log until we have updated both inodes
+ * in the log.
+ */
++ ret = join_running_log_trans(root);
++ /*
++ * At least one of the inodes was logged before, so this should
++ * not fail, but if it does, it's not serious, just bail out and
++ * mark the log for a full commit.
++ */
++ if (WARN_ON_ONCE(ret < 0))
++ goto out;
+ log_pinned = true;
+- btrfs_pin_log_trans(root);
+
+ path = btrfs_alloc_path();
+ if (!path) {
+diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
+index 1620f8170629e..57ab5f3b8dc77 100644
+--- a/fs/btrfs/tree-log.h
++++ b/fs/btrfs/tree-log.h
+@@ -12,6 +12,9 @@
+ /* return value for btrfs_log_dentry_safe that means we don't need to log it at all */
+ #define BTRFS_NO_LOG_SYNC 256
+
++/* We can't use the tree log for whatever reason, force a transaction commit */
++#define BTRFS_LOG_FORCE_COMMIT (1)
++
+ struct btrfs_log_ctx {
+ int log_ret;
+ int log_transid;
+diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
+index 659575526e9fe..4bc97e7d8e460 100644
+--- a/fs/btrfs/volumes.c
++++ b/fs/btrfs/volumes.c
+@@ -5091,26 +5091,16 @@ static void init_alloc_chunk_ctl_policy_regular(
+ struct btrfs_fs_devices *fs_devices,
+ struct alloc_chunk_ctl *ctl)
+ {
+- u64 type = ctl->type;
++ struct btrfs_space_info *space_info;
+
+- if (type & BTRFS_BLOCK_GROUP_DATA) {
+- ctl->max_stripe_size = SZ_1G;
+- ctl->max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
+- } else if (type & BTRFS_BLOCK_GROUP_METADATA) {
+- /* For larger filesystems, use larger metadata chunks */
+- if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
+- ctl->max_stripe_size = SZ_1G;
+- else
+- ctl->max_stripe_size = SZ_256M;
+- ctl->max_chunk_size = ctl->max_stripe_size;
+- } else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
+- ctl->max_stripe_size = SZ_32M;
+- ctl->max_chunk_size = 2 * ctl->max_stripe_size;
+- ctl->devs_max = min_t(int, ctl->devs_max,
+- BTRFS_MAX_DEVS_SYS_CHUNK);
+- } else {
+- BUG();
+- }
++ space_info = btrfs_find_space_info(fs_devices->fs_info, ctl->type);
++ ASSERT(space_info);
++
++ ctl->max_chunk_size = READ_ONCE(space_info->chunk_size);
++ ctl->max_stripe_size = ctl->max_chunk_size;
++
++ if (ctl->type & BTRFS_BLOCK_GROUP_SYSTEM)
++ ctl->devs_max = min_t(int, ctl->devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
+
+ /* We don't want a chunk larger than 10% of writable space */
+ ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
+diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
+index 84b6d39509bd3..45e29b8c705c1 100644
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -407,6 +407,16 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
+ nr_sectors = bdev_nr_sectors(bdev);
+ zone_info->zone_size_shift = ilog2(zone_info->zone_size);
+ zone_info->nr_zones = nr_sectors >> ilog2(zone_sectors);
++ /*
++ * We limit max_zone_append_size also by max_segments *
++ * PAGE_SIZE. Technically, we can have multiple pages per segment. But,
++ * since btrfs adds the pages one by one to a bio, and btrfs cannot
++ * increase the metadata reservation even if it increases the number of
++ * extents, it is safe to stick with the limit.
++ */
++ zone_info->max_zone_append_size =
++ min_t(u64, (u64)bdev_max_zone_append_sectors(bdev) << SECTOR_SHIFT,
++ (u64)bdev_max_segments(bdev) << PAGE_SHIFT);
+ if (!IS_ALIGNED(nr_sectors, zone_sectors))
+ zone_info->nr_zones++;
+
+@@ -632,6 +642,7 @@ int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info)
+ u64 zoned_devices = 0;
+ u64 nr_devices = 0;
+ u64 zone_size = 0;
++ u64 max_zone_append_size = 0;
+ const bool incompat_zoned = btrfs_fs_incompat(fs_info, ZONED);
+ int ret = 0;
+
+@@ -666,6 +677,11 @@ int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info)
+ ret = -EINVAL;
+ goto out;
+ }
++ if (!max_zone_append_size ||
++ (zone_info->max_zone_append_size &&
++ zone_info->max_zone_append_size < max_zone_append_size))
++ max_zone_append_size =
++ zone_info->max_zone_append_size;
+ }
+ nr_devices++;
+ }
+@@ -715,7 +731,11 @@ int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info)
+ }
+
+ fs_info->zone_size = zone_size;
++ fs_info->max_zone_append_size = ALIGN_DOWN(max_zone_append_size,
++ fs_info->sectorsize);
+ fs_info->fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_ZONED;
++ if (fs_info->max_zone_append_size < fs_info->max_extent_size)
++ fs_info->max_extent_size = fs_info->max_zone_append_size;
+
+ /*
+ * Check mount options here, because we might change fs_info->zoned
+@@ -1821,6 +1841,7 @@ struct btrfs_device *btrfs_zoned_get_device(struct btrfs_fs_info *fs_info,
+ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
+ {
+ struct btrfs_fs_info *fs_info = block_group->fs_info;
++ struct btrfs_space_info *space_info = block_group->space_info;
+ struct map_lookup *map;
+ struct btrfs_device *device;
+ u64 physical;
+@@ -1832,6 +1853,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
+
+ map = block_group->physical_map;
+
++ spin_lock(&space_info->lock);
+ spin_lock(&block_group->lock);
+ if (block_group->zone_is_active) {
+ ret = true;
+@@ -1839,7 +1861,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
+ }
+
+ /* No space left */
+- if (block_group->alloc_offset == block_group->zone_capacity) {
++ if (btrfs_zoned_bg_is_full(block_group)) {
+ ret = false;
+ goto out_unlock;
+ }
+@@ -1860,7 +1882,10 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
+
+ /* Successfully activated all the zones */
+ block_group->zone_is_active = 1;
++ space_info->active_total_bytes += block_group->length;
+ spin_unlock(&block_group->lock);
++ btrfs_try_granting_tickets(fs_info, space_info);
++ spin_unlock(&space_info->lock);
+
+ /* For the active block group list */
+ btrfs_get_block_group(block_group);
+@@ -1873,6 +1898,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
+
+ out_unlock:
+ spin_unlock(&block_group->lock);
++ spin_unlock(&space_info->lock);
+ return ret;
+ }
+
+@@ -1967,6 +1993,9 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
+ /* For active_bg_list */
+ btrfs_put_block_group(block_group);
+
++ clear_bit(BTRFS_FS_NEED_ZONE_FINISH, &fs_info->flags);
++ wake_up_all(&fs_info->zone_finish_wait);
++
+ return 0;
+ }
+
+@@ -1995,6 +2024,9 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
+ }
+ mutex_unlock(&fs_info->chunk_mutex);
+
++ if (!ret)
++ set_bit(BTRFS_FS_NEED_ZONE_FINISH, &fs_info->flags);
++
+ return ret;
+ }
+
+@@ -2156,3 +2188,96 @@ out:
+ spin_unlock(&block_group->lock);
+ btrfs_put_block_group(block_group);
+ }
++
++int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
++{
++ struct btrfs_block_group *block_group;
++ struct btrfs_block_group *min_bg = NULL;
++ u64 min_avail = U64_MAX;
++ int ret;
++
++ spin_lock(&fs_info->zone_active_bgs_lock);
++ list_for_each_entry(block_group, &fs_info->zone_active_bgs,
++ active_bg_list) {
++ u64 avail;
++
++ spin_lock(&block_group->lock);
++ if (block_group->reserved ||
++ (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) {
++ spin_unlock(&block_group->lock);
++ continue;
++ }
++
++ avail = block_group->zone_capacity - block_group->alloc_offset;
++ if (min_avail > avail) {
++ if (min_bg)
++ btrfs_put_block_group(min_bg);
++ min_bg = block_group;
++ min_avail = avail;
++ btrfs_get_block_group(min_bg);
++ }
++ spin_unlock(&block_group->lock);
++ }
++ spin_unlock(&fs_info->zone_active_bgs_lock);
++
++ if (!min_bg)
++ return 0;
++
++ ret = btrfs_zone_finish(min_bg);
++ btrfs_put_block_group(min_bg);
++
++ return ret < 0 ? ret : 1;
++}
++
++int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info,
++ struct btrfs_space_info *space_info,
++ bool do_finish)
++{
++ struct btrfs_block_group *bg;
++ int index;
++
++ if (!btrfs_is_zoned(fs_info) || (space_info->flags & BTRFS_BLOCK_GROUP_DATA))
++ return 0;
++
++ /* No more block groups to activate */
++ if (space_info->active_total_bytes == space_info->total_bytes)
++ return 0;
++
++ for (;;) {
++ int ret;
++ bool need_finish = false;
++
++ down_read(&space_info->groups_sem);
++ for (index = 0; index < BTRFS_NR_RAID_TYPES; index++) {
++ list_for_each_entry(bg, &space_info->block_groups[index],
++ list) {
++ if (!spin_trylock(&bg->lock))
++ continue;
++ if (btrfs_zoned_bg_is_full(bg) || bg->zone_is_active) {
++ spin_unlock(&bg->lock);
++ continue;
++ }
++ spin_unlock(&bg->lock);
++
++ if (btrfs_zone_activate(bg)) {
++ up_read(&space_info->groups_sem);
++ return 1;
++ }
++
++ need_finish = true;
++ }
++ }
++ up_read(&space_info->groups_sem);
++
++ if (!do_finish || !need_finish)
++ break;
++
++ ret = btrfs_zone_finish_one_bg(fs_info);
++ if (ret == 0)
++ break;
++ if (ret < 0)
++ return ret;
++ }
++
++ return 0;
++}
+diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
+index cf6320feef464..1cac322662764 100644
+--- a/fs/btrfs/zoned.h
++++ b/fs/btrfs/zoned.h
+@@ -10,11 +10,7 @@
+ #include "block-group.h"
+ #include "btrfs_inode.h"
+
+-/*
+- * Block groups with more than this value (percents) of unusable space will be
+- * scheduled for background reclaim.
+- */
+-#define BTRFS_DEFAULT_RECLAIM_THRESH 75
++#define BTRFS_DEFAULT_RECLAIM_THRESH (75)
+
+ struct btrfs_zoned_device_info {
+ /*
+@@ -23,6 +19,7 @@ struct btrfs_zoned_device_info {
+ */
+ u64 zone_size;
+ u8 zone_size_shift;
++ u64 max_zone_append_size;
+ u32 nr_zones;
+ unsigned int max_active_zones;
+ atomic_t active_zones_left;
+@@ -82,6 +79,9 @@ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
+ void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
+ void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info, u64 logical,
+ u64 length);
++int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info);
++int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info,
++ struct btrfs_space_info *space_info, bool do_finish);
+ #else /* CONFIG_BLK_DEV_ZONED */
+ static inline int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos,
+ struct blk_zone *zone)
+@@ -246,6 +246,20 @@ static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
+
+ static inline void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info,
+ u64 logical, u64 length) { }
++
++static inline int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
++{
++ return 1;
++}
++
++static inline int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info,
++ struct btrfs_space_info *space_info,
++ bool do_finish)
++{
++ /* Consider all the block groups are active */
++ return 0;
++}
++
+ #endif
+
+ static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)
+@@ -380,4 +394,10 @@ static inline void btrfs_zoned_data_reloc_unlock(struct btrfs_inode *inode)
+ mutex_unlock(&root->fs_info->zoned_data_reloc_io_lock);
+ }
+
++static inline bool btrfs_zoned_bg_is_full(const struct btrfs_block_group *bg)
++{
++ ASSERT(btrfs_is_zoned(bg->fs_info));
++ return (bg->alloc_offset == bg->zone_capacity);
++}
++
+ #endif
+diff --git a/fs/cifs/file.c b/fs/cifs/file.c
+index 58dce567ceaf1..6aeca5b301c8b 100644
+--- a/fs/cifs/file.c
++++ b/fs/cifs/file.c
+@@ -4456,10 +4456,10 @@ static void cifs_readahead(struct readahead_control *ractl)
+ * TODO: Send a whole batch of pages to be read
+ * by the cache.
+ */
+- page = readahead_page(ractl);
+- last_batch_size = 1 << thp_order(page);
++ struct folio *folio = readahead_folio(ractl);
++ last_batch_size = folio_nr_pages(folio);
+ if (cifs_readpage_from_fscache(ractl->mapping->host,
+- page) < 0) {
++ &folio->page) < 0) {
+ /*
+ * TODO: Deal with cache read failure
+ * here, but for the moment, delegate
+@@ -4467,7 +4467,7 @@ static void cifs_readahead(struct readahead_control *ractl)
+ */
+ caching = false;
+ }
+- unlock_page(page);
++ folio_unlock(folio);
+ next_cached++;
+ cache_nr_pages--;
+ if (cache_nr_pages == 0)
+@@ -4807,8 +4807,6 @@ void cifs_oplock_break(struct work_struct *work)
+ struct TCP_Server_Info *server = tcon->ses->server;
+ int rc = 0;
+ bool purge_cache = false;
+- bool is_deferred = false;
+- struct cifs_deferred_close *dclose;
+
+ wait_on_bit(&cinode->flags, CIFS_INODE_PENDING_WRITERS,
+ TASK_UNINTERRUPTIBLE);
+@@ -4844,22 +4842,6 @@ void cifs_oplock_break(struct work_struct *work)
+ cifs_dbg(VFS, "Push locks rc = %d\n", rc);
+
+ oplock_break_ack:
+- /*
+- * When oplock break is received and there are no active
+- * file handles but cached, then schedule deferred close immediately.
+- * So, new open will not use cached handle.
+- */
+- spin_lock(&CIFS_I(inode)->deferred_lock);
+- is_deferred = cifs_is_deferred_close(cfile, &dclose);
+- spin_unlock(&CIFS_I(inode)->deferred_lock);
+- if (is_deferred &&
+- cfile->deferred_close_scheduled &&
+- delayed_work_pending(&cfile->deferred)) {
+- if (cancel_delayed_work(&cfile->deferred)) {
+- _cifsFileInfo_put(cfile, false, false);
+- goto oplock_break_done;
+- }
+- }
+ /*
+ * releasing stale oplock after recent reconnect of smb session using
+ * a now incorrect file handle is not a data integrity issue but do
+@@ -4871,7 +4853,7 @@ oplock_break_ack:
+ cinode);
+ cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
+ }
+-oplock_break_done:
++
+ _cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
+ cifs_done_oplock_break(cinode);
+ }
+diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
+index 0e0d1fc0f1301..9bbd9a59426b6 100644
+--- a/fs/erofs/decompressor.c
++++ b/fs/erofs/decompressor.c
+@@ -93,14 +93,18 @@ static int z_erofs_lz4_prepare_dstpages(struct z_erofs_lz4_decompress_ctx *ctx,
+
+ if (page) {
+ __clear_bit(j, bounced);
+- if (kaddr) {
+- if (kaddr + PAGE_SIZE == page_address(page))
++ if (!PageHighMem(page)) {
++ if (!i) {
++ kaddr = page_address(page);
++ continue;
++ }
++ if (kaddr &&
++ kaddr + PAGE_SIZE == page_address(page)) {
+ kaddr += PAGE_SIZE;
+- else
+- kaddr = NULL;
+- } else if (!i) {
+- kaddr = page_address(page);
++ continue;
++ }
+ }
++ kaddr = NULL;
+ continue;
+ }
+ kaddr = NULL;
+diff --git a/fs/erofs/decompressor_lzma.c b/fs/erofs/decompressor_lzma.c
+index 05a3063cf2bc1..5e59b3f523eb6 100644
+--- a/fs/erofs/decompressor_lzma.c
++++ b/fs/erofs/decompressor_lzma.c
+@@ -143,6 +143,7 @@ again:
+ DBG_BUGON(z_erofs_lzma_head);
+ z_erofs_lzma_head = head;
+ spin_unlock(&z_erofs_lzma_lock);
++ wake_up_all(&z_erofs_lzma_wq);
+
+ z_erofs_lzma_max_dictsize = dict_size;
+ mutex_unlock(&lzma_resize_mutex);
+diff --git a/fs/eventpoll.c b/fs/eventpoll.c
+index e2daa940ebce7..8b56b94e2f56f 100644
+--- a/fs/eventpoll.c
++++ b/fs/eventpoll.c
+@@ -1747,6 +1747,21 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
+ return to;
+ }
+
++/*
++ * autoremove_wake_function, but remove even on failure to wake up, because we
++ * know that default_wake_function/ttwu will only fail if the thread is already
++ * woken, and in that case the ep_poll loop will remove the entry anyways, not
++ * try to reuse it.
++ */
++static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
++ unsigned int mode, int sync, void *key)
++{
++ int ret = default_wake_function(wq_entry, mode, sync, key);
++
++ list_del_init(&wq_entry->entry);
++ return ret;
++}
++
+ /**
+ * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
+ * event buffer.
+@@ -1828,8 +1843,15 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
+ * normal wakeup path no need to call __remove_wait_queue()
+ * explicitly, thus ep->lock is not taken, which halts the
+ * event delivery.
++ *
++ * In fact, we now use an even more aggressive function that
++ * unconditionally removes, because we don't reuse the wait
++ * entry between loop iterations. This lets us also avoid the
++ * performance issue if a process is killed, causing all of its
++ * threads to wake up without being removed normally.
+ */
+ init_wait(&wait);
++ wait.func = ep_autoremove_wake_function;
+
+ write_lock_irq(&ep->lock);
+ /*
+diff --git a/fs/exec.c b/fs/exec.c
+index 5a75e92b1a0a3..a9f5acf8f0ecb 100644
+--- a/fs/exec.c
++++ b/fs/exec.c
+@@ -1297,6 +1297,9 @@ int begin_new_exec(struct linux_binprm * bprm)
+ bprm->mm = NULL;
+
+ #ifdef CONFIG_POSIX_TIMERS
++ spin_lock_irq(&me->sighand->siglock);
++ posix_cpu_timers_exit(me);
++ spin_unlock_irq(&me->sighand->siglock);
+ exit_itimers(me);
+ flush_itimer_signals();
+ #endif
+diff --git a/fs/ext2/super.c b/fs/ext2/super.c
+index f6a19f6d9f6d5..cdffa2a041af8 100644
+--- a/fs/ext2/super.c
++++ b/fs/ext2/super.c
+@@ -1059,9 +1059,10 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
+ sbi->s_frags_per_group);
+ goto failed_mount;
+ }
+- if (sbi->s_inodes_per_group > sb->s_blocksize * 8) {
++ if (sbi->s_inodes_per_group < sbi->s_inodes_per_block ||
++ sbi->s_inodes_per_group > sb->s_blocksize * 8) {
+ ext2_msg(sb, KERN_ERR,
+- "error: #inodes per group too big: %lu",
++ "error: invalid #inodes per group: %lu",
+ sbi->s_inodes_per_group);
+ goto failed_mount;
+ }
+@@ -1071,6 +1072,13 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
+ sbi->s_groups_count = ((le32_to_cpu(es->s_blocks_count) -
+ le32_to_cpu(es->s_first_data_block) - 1)
+ / EXT2_BLOCKS_PER_GROUP(sb)) + 1;
++ if ((u64)sbi->s_groups_count * sbi->s_inodes_per_group !=
++ le32_to_cpu(es->s_inodes_count)) {
++ ext2_msg(sb, KERN_ERR, "error: invalid #inodes: %u vs computed %llu",
++ le32_to_cpu(es->s_inodes_count),
++ (u64)sbi->s_groups_count * sbi->s_inodes_per_group);
++ goto failed_mount;
++ }
+ db_count = (sbi->s_groups_count + EXT2_DESC_PER_BLOCK(sb) - 1) /
+ EXT2_DESC_PER_BLOCK(sb);
+ sbi->s_group_desc = kmalloc_array(db_count,
+diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
+index e9ef5cf309694..84fcd06a8e8ab 100644
+--- a/fs/ext4/inline.c
++++ b/fs/ext4/inline.c
+@@ -35,6 +35,9 @@ static int get_max_inline_xattr_value_size(struct inode *inode,
+ struct ext4_inode *raw_inode;
+ int free, min_offs;
+
++ if (!EXT4_INODE_HAS_XATTR_SPACE(inode))
++ return 0;
++
+ min_offs = EXT4_SB(inode->i_sb)->s_inode_size -
+ EXT4_GOOD_OLD_INODE_SIZE -
+ EXT4_I(inode)->i_extra_isize -
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index beed9e32571c0..e94ec798dce17 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -178,6 +178,8 @@ void ext4_evict_inode(struct inode *inode)
+
+ trace_ext4_evict_inode(inode);
+
++ if (EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)
++ ext4_evict_ea_inode(inode);
+ if (inode->i_nlink) {
+ /*
+ * When journalling data dirty buffers are tracked only in the
+@@ -1559,7 +1561,14 @@ static void mpage_release_unused_pages(struct mpage_da_data *mpd,
+ ext4_lblk_t start, last;
+ start = index << (PAGE_SHIFT - inode->i_blkbits);
+ last = end << (PAGE_SHIFT - inode->i_blkbits);
++
++ /*
++ * avoid racing with extent status tree scans made by
++ * ext4_insert_delayed_block()
++ */
++ down_write(&EXT4_I(inode)->i_data_sem);
+ ext4_es_remove_extent(inode, start, last - start + 1);
++ up_write(&EXT4_I(inode)->i_data_sem);
+ }
+
+ pagevec_init(&pvec);
+@@ -3130,13 +3139,15 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
+ {
+ struct inode *inode = mapping->host;
+ journal_t *journal;
++ sector_t ret = 0;
+ int err;
+
++ inode_lock_shared(inode);
+ /*
+ * We can get here for an inline file via the FIBMAP ioctl
+ */
+ if (ext4_has_inline_data(inode))
+- return 0;
++ goto out;
+
+ if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
+ test_opt(inode->i_sb, DELALLOC)) {
+@@ -3175,10 +3186,14 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
+ jbd2_journal_unlock_updates(journal);
+
+ if (err)
+- return 0;
++ goto out;
+ }
+
+- return iomap_bmap(mapping, block, &ext4_iomap_ops);
++ ret = iomap_bmap(mapping, block, &ext4_iomap_ops);
++
++out:
++ inode_unlock_shared(inode);
++ return ret;
+ }
+
+ static int ext4_readpage(struct file *file, struct page *page)
+@@ -4674,8 +4689,7 @@ static inline int ext4_iget_extra_inode(struct inode *inode,
+ __le32 *magic = (void *)raw_inode +
+ EXT4_GOOD_OLD_INODE_SIZE + ei->i_extra_isize;
+
+- if (EXT4_GOOD_OLD_INODE_SIZE + ei->i_extra_isize + sizeof(__le32) <=
+- EXT4_INODE_SIZE(inode->i_sb) &&
++ if (EXT4_INODE_HAS_XATTR_SPACE(inode) &&
+ *magic == cpu_to_le32(EXT4_XATTR_MAGIC)) {
+ ext4_set_inode_state(inode, EXT4_STATE_XATTR);
+ return ext4_find_inline_data_nolock(inode);
+diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
+index 7a5353a8cfd7b..f2c4d9eee475e 100644
+--- a/fs/ext4/migrate.c
++++ b/fs/ext4/migrate.c
+@@ -417,7 +417,7 @@ int ext4_ext_migrate(struct inode *inode)
+ struct inode *tmp_inode = NULL;
+ struct migrate_struct lb;
+ unsigned long max_entries;
+- __u32 goal;
++ __u32 goal, tmp_csum_seed;
+ uid_t owner[2];
+
+ /*
+@@ -465,6 +465,7 @@ int ext4_ext_migrate(struct inode *inode)
+ * the migration.
+ */
+ ei = EXT4_I(inode);
++ tmp_csum_seed = EXT4_I(tmp_inode)->i_csum_seed;
+ EXT4_I(tmp_inode)->i_csum_seed = ei->i_csum_seed;
+ i_size_write(tmp_inode, i_size_read(inode));
+ /*
+@@ -575,6 +576,7 @@ err_out:
+ * the inode is not visible to user space.
+ */
+ tmp_inode->i_blocks = 0;
++ EXT4_I(tmp_inode)->i_csum_seed = tmp_csum_seed;
+
+ /* Reset the extent details */
+ ext4_ext_tree_init(handle, tmp_inode);
+diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
+index 4f0420b1ff3ec..13b6265848c22 100644
+--- a/fs/ext4/namei.c
++++ b/fs/ext4/namei.c
+@@ -54,6 +54,7 @@ static struct buffer_head *ext4_append(handle_t *handle,
+ struct inode *inode,
+ ext4_lblk_t *block)
+ {
++ struct ext4_map_blocks map;
+ struct buffer_head *bh;
+ int err;
+
+@@ -63,6 +64,21 @@ static struct buffer_head *ext4_append(handle_t *handle,
+ return ERR_PTR(-ENOSPC);
+
+ *block = inode->i_size >> inode->i_sb->s_blocksize_bits;
++ map.m_lblk = *block;
++ map.m_len = 1;
++
++ /*
++ * We're appending new directory block. Make sure the block is not
++ * allocated yet, otherwise we will end up corrupting the
++ * directory.
++ */
++ err = ext4_map_blocks(NULL, inode, &map, 0);
++ if (err < 0)
++ return ERR_PTR(err);
++ if (err) {
++ EXT4_ERROR_INODE(inode, "Logical block already allocated");
++ return ERR_PTR(-EFSCORRUPTED);
++ }
+
+ bh = ext4_bread(handle, inode, *block, EXT4_GET_BLOCKS_CREATE);
+ if (IS_ERR(bh))
+@@ -110,6 +126,13 @@ static struct buffer_head *__ext4_read_dirblock(struct inode *inode,
+ struct ext4_dir_entry *dirent;
+ int is_dx_block = 0;
+
++ if (block >= inode->i_size) {
++ ext4_error_inode(inode, func, line, block,
++ "Attempting to read directory block (%u) that is past i_size (%llu)",
++ block, inode->i_size);
++ return ERR_PTR(-EFSCORRUPTED);
++ }
++
+ if (ext4_simulate_fail(inode->i_sb, EXT4_SIM_DIRBLOCK_EIO))
+ bh = ERR_PTR(-EIO);
+ else
+diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
+index 8b70a47012931..e5c2713aa11ad 100644
+--- a/fs/ext4/resize.c
++++ b/fs/ext4/resize.c
+@@ -1484,6 +1484,7 @@ static void ext4_update_super(struct super_block *sb,
+ * Update the fs overhead information
+ */
+ ext4_calculate_overhead(sb);
++ es->s_overhead_clusters = cpu_to_le32(sbi->s_overhead);
+
+ if (test_opt(sb, DEBUG))
+ printk(KERN_DEBUG "EXT4-fs: added group %u:"
+diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
+index 0423253490986..533216e80fa2b 100644
+--- a/fs/ext4/xattr.c
++++ b/fs/ext4/xattr.c
+@@ -436,6 +436,21 @@ error:
+ return err;
+ }
+
++/* Remove entry from mbcache when EA inode is getting evicted */
++void ext4_evict_ea_inode(struct inode *inode)
++{
++ struct mb_cache_entry *oe;
++
++ if (!EA_INODE_CACHE(inode))
++ return;
++ /* Wait for entry to get unused so that we can remove it */
++ while ((oe = mb_cache_entry_delete_or_get(EA_INODE_CACHE(inode),
++ ext4_xattr_inode_get_hash(inode), inode->i_ino))) {
++ mb_cache_entry_wait_unused(oe);
++ mb_cache_entry_put(EA_INODE_CACHE(inode), oe);
++ }
++}
++
+ static int
+ ext4_xattr_inode_verify_hashes(struct inode *ea_inode,
+ struct ext4_xattr_entry *entry, void *buffer,
+@@ -976,10 +991,8 @@ int __ext4_xattr_set_credits(struct super_block *sb, struct inode *inode,
+ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
+ int ref_change)
+ {
+- struct mb_cache *ea_inode_cache = EA_INODE_CACHE(ea_inode);
+ struct ext4_iloc iloc;
+ s64 ref_count;
+- u32 hash;
+ int ret;
+
+ inode_lock(ea_inode);
+@@ -1002,14 +1015,6 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
+
+ set_nlink(ea_inode, 1);
+ ext4_orphan_del(handle, ea_inode);
+-
+- if (ea_inode_cache) {
+- hash = ext4_xattr_inode_get_hash(ea_inode);
+- mb_cache_entry_create(ea_inode_cache,
+- GFP_NOFS, hash,
+- ea_inode->i_ino,
+- true /* reusable */);
+- }
+ }
+ } else {
+ WARN_ONCE(ref_count < 0, "EA inode %lu ref_count=%lld",
+@@ -1022,12 +1027,6 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
+
+ clear_nlink(ea_inode);
+ ext4_orphan_add(handle, ea_inode);
+-
+- if (ea_inode_cache) {
+- hash = ext4_xattr_inode_get_hash(ea_inode);
+- mb_cache_entry_delete(ea_inode_cache, hash,
+- ea_inode->i_ino);
+- }
+ }
+ }
+
+@@ -1237,6 +1236,7 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
+ if (error)
+ goto out;
+
++retry_ref:
+ lock_buffer(bh);
+ hash = le32_to_cpu(BHDR(bh)->h_hash);
+ ref = le32_to_cpu(BHDR(bh)->h_refcount);
+@@ -1246,9 +1246,18 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
+ * This must happen under buffer lock for
+ * ext4_xattr_block_set() to reliably detect freed block
+ */
+- if (ea_block_cache)
+- mb_cache_entry_delete(ea_block_cache, hash,
+- bh->b_blocknr);
++ if (ea_block_cache) {
++ struct mb_cache_entry *oe;
++
++ oe = mb_cache_entry_delete_or_get(ea_block_cache, hash,
++ bh->b_blocknr);
++ if (oe) {
++ unlock_buffer(bh);
++ mb_cache_entry_wait_unused(oe);
++ mb_cache_entry_put(ea_block_cache, oe);
++ goto retry_ref;
++ }
++ }
+ get_bh(bh);
+ unlock_buffer(bh);
+
+@@ -1858,6 +1867,8 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
+ #define header(x) ((struct ext4_xattr_header *)(x))
+
+ if (s->base) {
++ int offset = (char *)s->here - bs->bh->b_data;
++
+ BUFFER_TRACE(bs->bh, "get_write_access");
+ error = ext4_journal_get_write_access(handle, sb, bs->bh,
+ EXT4_JTR_NONE);
+@@ -1873,9 +1884,20 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
+ * ext4_xattr_block_set() to reliably detect modified
+ * block
+ */
+- if (ea_block_cache)
+- mb_cache_entry_delete(ea_block_cache, hash,
+- bs->bh->b_blocknr);
++ if (ea_block_cache) {
++ struct mb_cache_entry *oe;
++
++ oe = mb_cache_entry_delete_or_get(ea_block_cache,
++ hash, bs->bh->b_blocknr);
++ if (oe) {
++ /*
++ * Xattr block is getting reused. Leave
++ * it alone.
++ */
++ mb_cache_entry_put(ea_block_cache, oe);
++ goto clone_block;
++ }
++ }
+ ea_bdebug(bs->bh, "modifying in-place");
+ error = ext4_xattr_set_entry(i, s, handle, inode,
+ true /* is_block */);
+@@ -1890,50 +1912,47 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
+ if (error)
+ goto cleanup;
+ goto inserted;
+- } else {
+- int offset = (char *)s->here - bs->bh->b_data;
++ }
++clone_block:
++ unlock_buffer(bs->bh);
++ ea_bdebug(bs->bh, "cloning");
++ s->base = kmemdup(BHDR(bs->bh), bs->bh->b_size, GFP_NOFS);
++ error = -ENOMEM;
++ if (s->base == NULL)
++ goto cleanup;
++ s->first = ENTRY(header(s->base)+1);
++ header(s->base)->h_refcount = cpu_to_le32(1);
++ s->here = ENTRY(s->base + offset);
++ s->end = s->base + bs->bh->b_size;
+
+- unlock_buffer(bs->bh);
+- ea_bdebug(bs->bh, "cloning");
+- s->base = kmalloc(bs->bh->b_size, GFP_NOFS);
+- error = -ENOMEM;
+- if (s->base == NULL)
++ /*
++ * If existing entry points to an xattr inode, we need
++ * to prevent ext4_xattr_set_entry() from decrementing
++ * ref count on it because the reference belongs to the
++ * original block. In this case, make the entry look
++ * like it has an empty value.
++ */
++ if (!s->not_found && s->here->e_value_inum) {
++ ea_ino = le32_to_cpu(s->here->e_value_inum);
++ error = ext4_xattr_inode_iget(inode, ea_ino,
++ le32_to_cpu(s->here->e_hash),
++ &tmp_inode);
++ if (error)
+ goto cleanup;
+- memcpy(s->base, BHDR(bs->bh), bs->bh->b_size);
+- s->first = ENTRY(header(s->base)+1);
+- header(s->base)->h_refcount = cpu_to_le32(1);
+- s->here = ENTRY(s->base + offset);
+- s->end = s->base + bs->bh->b_size;
+-
+- /*
+- * If existing entry points to an xattr inode, we need
+- * to prevent ext4_xattr_set_entry() from decrementing
+- * ref count on it because the reference belongs to the
+- * original block. In this case, make the entry look
+- * like it has an empty value.
+- */
+- if (!s->not_found && s->here->e_value_inum) {
+- ea_ino = le32_to_cpu(s->here->e_value_inum);
+- error = ext4_xattr_inode_iget(inode, ea_ino,
+- le32_to_cpu(s->here->e_hash),
+- &tmp_inode);
+- if (error)
+- goto cleanup;
+-
+- if (!ext4_test_inode_state(tmp_inode,
+- EXT4_STATE_LUSTRE_EA_INODE)) {
+- /*
+- * Defer quota free call for previous
+- * inode until success is guaranteed.
+- */
+- old_ea_inode_quota = le32_to_cpu(
+- s->here->e_value_size);
+- }
+- iput(tmp_inode);
+
+- s->here->e_value_inum = 0;
+- s->here->e_value_size = 0;
++ if (!ext4_test_inode_state(tmp_inode,
++ EXT4_STATE_LUSTRE_EA_INODE)) {
++ /*
++ * Defer quota free call for previous
++ * inode until success is guaranteed.
++ */
++ old_ea_inode_quota = le32_to_cpu(
++ s->here->e_value_size);
+ }
++ iput(tmp_inode);
++
++ s->here->e_value_inum = 0;
++ s->here->e_value_size = 0;
+ }
+ } else {
+ /* Allocate a buffer where we construct the new block. */
+@@ -2000,18 +2019,13 @@ inserted:
+ lock_buffer(new_bh);
+ /*
+ * We have to be careful about races with
+- * freeing, rehashing or adding references to
+- * xattr block. Once we hold buffer lock xattr
+- * block's state is stable so we can check
+- * whether the block got freed / rehashed or
+- * not. Since we unhash mbcache entry under
+- * buffer lock when freeing / rehashing xattr
+- * block, checking whether entry is still
+- * hashed is reliable. Same rules hold for
+- * e_reusable handling.
++ * adding references to xattr block. Once we
++ * hold buffer lock xattr block's state is
++ * stable so we can check the additional
++ * reference fits.
+ */
+- if (hlist_bl_unhashed(&ce->e_hash_list) ||
+- !ce->e_reusable) {
++ ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
++ if (ref > EXT4_XATTR_REFCOUNT_MAX) {
+ /*
+ * Undo everything and check mbcache
+ * again.
+@@ -2026,9 +2040,8 @@ inserted:
+ new_bh = NULL;
+ goto inserted;
+ }
+- ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
+ BHDR(new_bh)->h_refcount = cpu_to_le32(ref);
+- if (ref >= EXT4_XATTR_REFCOUNT_MAX)
++ if (ref == EXT4_XATTR_REFCOUNT_MAX)
+ ce->e_reusable = 0;
+ ea_bdebug(new_bh, "reusing; refcount now=%d",
+ ref);
+@@ -2176,8 +2189,9 @@ int ext4_xattr_ibody_find(struct inode *inode, struct ext4_xattr_info *i,
+ struct ext4_inode *raw_inode;
+ int error;
+
+- if (EXT4_I(inode)->i_extra_isize == 0)
++ if (!EXT4_INODE_HAS_XATTR_SPACE(inode))
+ return 0;
++
+ raw_inode = ext4_raw_inode(&is->iloc);
+ header = IHDR(inode, raw_inode);
+ is->s.base = is->s.first = IFIRST(header);
+@@ -2205,8 +2219,9 @@ int ext4_xattr_ibody_set(handle_t *handle, struct inode *inode,
+ struct ext4_xattr_search *s = &is->s;
+ int error;
+
+- if (EXT4_I(inode)->i_extra_isize == 0)
++ if (!EXT4_INODE_HAS_XATTR_SPACE(inode))
+ return -ENOSPC;
++
+ error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */);
+ if (error)
+ return error;
+diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h
+index 77efb9a627ad2..e5e36bd11f055 100644
+--- a/fs/ext4/xattr.h
++++ b/fs/ext4/xattr.h
+@@ -95,6 +95,19 @@ struct ext4_xattr_entry {
+
+ #define EXT4_ZERO_XATTR_VALUE ((void *)-1)
+
++/*
++ * If we want to add an xattr to the inode, we should make sure that
++ * i_extra_isize is not 0 and that the inode size is not less than
++ * EXT4_GOOD_OLD_INODE_SIZE + extra_isize + pad.
++ * EXT4_GOOD_OLD_INODE_SIZE extra_isize header entry pad data
++ * |--------------------------|------------|------|---------|---|-------|
++ */
++#define EXT4_INODE_HAS_XATTR_SPACE(inode) \
++ ((EXT4_I(inode)->i_extra_isize != 0) && \
++ (EXT4_GOOD_OLD_INODE_SIZE + EXT4_I(inode)->i_extra_isize + \
++ sizeof(struct ext4_xattr_ibody_header) + EXT4_XATTR_PAD <= \
++ EXT4_INODE_SIZE((inode)->i_sb)))
++
+ struct ext4_xattr_info {
+ const char *name;
+ const void *value;
+@@ -178,6 +191,7 @@ extern void ext4_xattr_inode_array_free(struct ext4_xattr_inode_array *array);
+
+ extern int ext4_expand_extra_isize_ea(struct inode *inode, int new_extra_isize,
+ struct ext4_inode *raw_inode, handle_t *handle);
++extern void ext4_evict_ea_inode(struct inode *inode);
+
+ extern const struct xattr_handler *ext4_xattr_handlers[];
+
+diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
+index beceac9885c3e..63ab8c9d674e6 100644
+--- a/fs/f2fs/checkpoint.c
++++ b/fs/f2fs/checkpoint.c
+@@ -1004,9 +1004,7 @@ static void __add_dirty_inode(struct inode *inode, enum inode_type type)
+ return;
+
+ set_inode_flag(inode, flag);
+- if (!f2fs_is_volatile_file(inode))
+- list_add_tail(&F2FS_I(inode)->dirty_list,
+- &sbi->inode_list[type]);
++ list_add_tail(&F2FS_I(inode)->dirty_list, &sbi->inode_list[type]);
+ stat_inc_dirty_inode(sbi, type);
+ }
+
+diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
+index 9a1a526f20920..07522e82c7c42 100644
+--- a/fs/f2fs/data.c
++++ b/fs/f2fs/data.c
+@@ -69,8 +69,7 @@ static bool __is_cp_guaranteed(struct page *page)
+
+ if (f2fs_is_compressed_page(page))
+ return false;
+- if ((S_ISREG(inode->i_mode) &&
+- (f2fs_is_atomic_file(inode) || IS_NOQUOTA(inode))) ||
++ if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) ||
+ page_private_gcing(page))
+ return true;
+ return false;
+@@ -1436,9 +1435,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
+ *map->m_next_extent = pgofs + map->m_len;
+
+ /* for hardware encryption, but to avoid potential issue in future */
+- if (flag == F2FS_GET_BLOCK_DIO)
++ if (flag == F2FS_GET_BLOCK_DIO) {
+ f2fs_wait_on_block_writeback_range(inode,
+ map->m_pblk, map->m_len);
++ invalidate_mapping_pages(META_MAPPING(sbi),
++ map->m_pblk, map->m_pblk + map->m_len - 1);
++ }
+
+ if (map->m_multidev_dio) {
+ block_t blk_addr = map->m_pblk;
+@@ -1655,7 +1657,7 @@ sync_out:
+ f2fs_wait_on_block_writeback_range(inode,
+ map->m_pblk, map->m_len);
+ invalidate_mapping_pages(META_MAPPING(sbi),
+- map->m_pblk, map->m_pblk);
++ map->m_pblk, map->m_pblk + map->m_len - 1);
+
+ if (map->m_multidev_dio) {
+ block_t blk_addr = map->m_pblk;
+@@ -2563,7 +2565,12 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
+ bool ipu_force = false;
+ int err = 0;
+
+- set_new_dnode(&dn, inode, NULL, NULL, 0);
++ /* Use COW inode to make dnode_of_data for atomic write */
++ if (f2fs_is_atomic_file(inode))
++ set_new_dnode(&dn, F2FS_I(inode)->cow_inode, NULL, NULL, 0);
++ else
++ set_new_dnode(&dn, inode, NULL, NULL, 0);
++
+ if (need_inplace_update(fio) &&
+ f2fs_lookup_extent_cache(inode, page->index, &ei)) {
+ fio->old_blkaddr = ei.blk + page->index - ei.fofs;
+@@ -2600,6 +2607,7 @@ got_it:
+ err = -EFSCORRUPTED;
+ goto out_writepage;
+ }
++
+ /*
+ * If current allocation needs SSR,
+ * it had better in-place writes for updated data.
+@@ -2736,11 +2744,6 @@ int f2fs_write_single_data_page(struct page *page, int *submitted,
+ write:
+ if (f2fs_is_drop_cache(inode))
+ goto out;
+- /* we should not write 0'th page having journal header */
+- if (f2fs_is_volatile_file(inode) && (!page->index ||
+- (!wbc->for_reclaim &&
+- f2fs_available_free_memory(sbi, BASE_CHECK))))
+- goto redirty_out;
+
+ /* Dentry/quota blocks are controlled by checkpoint */
+ if (S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) {
+@@ -3313,6 +3316,100 @@ unlock_out:
+ return err;
+ }
+
++static int __find_data_block(struct inode *inode, pgoff_t index,
++ block_t *blk_addr)
++{
++ struct dnode_of_data dn;
++ struct page *ipage;
++ struct extent_info ei = {0, };
++ int err = 0;
++
++ ipage = f2fs_get_node_page(F2FS_I_SB(inode), inode->i_ino);
++ if (IS_ERR(ipage))
++ return PTR_ERR(ipage);
++
++ set_new_dnode(&dn, inode, ipage, ipage, 0);
++
++ if (f2fs_lookup_extent_cache(inode, index, &ei)) {
++ dn.data_blkaddr = ei.blk + index - ei.fofs;
++ } else {
++ /* hole case */
++ err = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE);
++ if (err) {
++ dn.data_blkaddr = NULL_ADDR;
++ err = 0;
++ }
++ }
++ *blk_addr = dn.data_blkaddr;
++ f2fs_put_dnode(&dn);
++ return err;
++}
++
++static int __reserve_data_block(struct inode *inode, pgoff_t index,
++ block_t *blk_addr, bool *node_changed)
++{
++ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
++ struct dnode_of_data dn;
++ struct page *ipage;
++ int err = 0;
++
++ f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true);
++
++ ipage = f2fs_get_node_page(sbi, inode->i_ino);
++ if (IS_ERR(ipage)) {
++ err = PTR_ERR(ipage);
++ goto unlock_out;
++ }
++ set_new_dnode(&dn, inode, ipage, ipage, 0);
++
++ err = f2fs_get_block(&dn, index);
++
++ *blk_addr = dn.data_blkaddr;
++ *node_changed = dn.node_changed;
++ f2fs_put_dnode(&dn);
++
++unlock_out:
++ f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false);
++ return err;
++}
++
++static int prepare_atomic_write_begin(struct f2fs_sb_info *sbi,
++ struct page *page, loff_t pos, unsigned int len,
++ block_t *blk_addr, bool *node_changed)
++{
++ struct inode *inode = page->mapping->host;
++ struct inode *cow_inode = F2FS_I(inode)->cow_inode;
++ pgoff_t index = page->index;
++ int err = 0;
++ block_t ori_blk_addr;
++
++ /* If pos is beyond the end of file, reserve a new block in COW inode */
++ if ((pos & PAGE_MASK) >= i_size_read(inode))
++ return __reserve_data_block(cow_inode, index, blk_addr,
++ node_changed);
++
++ /* Look for the block in COW inode first */
++ err = __find_data_block(cow_inode, index, blk_addr);
++ if (err)
++ return err;
++ else if (*blk_addr != NULL_ADDR)
++ return 0;
++
++ /* Look for the block in the original inode */
++ err = __find_data_block(inode, index, &ori_blk_addr);
++ if (err)
++ return err;
++
++ /* Finally, we should reserve a new block in COW inode for the update */
++ err = __reserve_data_block(cow_inode, index, blk_addr, node_changed);
++ if (err)
++ return err;
++
++ if (ori_blk_addr != NULL_ADDR)
++ *blk_addr = ori_blk_addr;
++ return 0;
++}
++
+ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned flags,
+ struct page **pagep, void **fsdata)
+@@ -3321,7 +3418,7 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct page *page = NULL;
+ pgoff_t index = ((unsigned long long) pos) >> PAGE_SHIFT;
+- bool need_balance = false, drop_atomic = false;
++ bool need_balance = false;
+ block_t blkaddr = NULL_ADDR;
+ int err = 0;
+
+@@ -3332,14 +3429,6 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
+ goto fail;
+ }
+
+- if ((f2fs_is_atomic_file(inode) &&
+- !f2fs_available_free_memory(sbi, INMEM_PAGES)) ||
+- is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
+- err = -ENOMEM;
+- drop_atomic = true;
+- goto fail;
+- }
+-
+ /*
+ * We should check this at this moment to avoid deadlock on inode page
+ * and #0 page. The locking rule for inline_data conversion should be:
+@@ -3387,7 +3476,11 @@ repeat:
+
+ *pagep = page;
+
+- err = prepare_write_begin(sbi, page, pos, len,
++ if (f2fs_is_atomic_file(inode))
++ err = prepare_atomic_write_begin(sbi, page, pos, len,
++ &blkaddr, &need_balance);
++ else
++ err = prepare_write_begin(sbi, page, pos, len,
+ &blkaddr, &need_balance);
+ if (err)
+ goto fail;
+@@ -3443,8 +3536,6 @@ repeat:
+ fail:
+ f2fs_put_page(page, 1);
+ f2fs_write_failed(inode, pos + len);
+- if (drop_atomic)
+- f2fs_drop_inmem_pages_all(sbi, false);
+ return err;
+ }
+
+@@ -3488,8 +3579,12 @@ static int f2fs_write_end(struct file *file,
+ set_page_dirty(page);
+
+ if (pos + copied > i_size_read(inode) &&
+- !f2fs_verity_in_progress(inode))
++ !f2fs_verity_in_progress(inode)) {
+ f2fs_i_size_write(inode, pos + copied);
++ if (f2fs_is_atomic_file(inode))
++ f2fs_i_size_write(F2FS_I(inode)->cow_inode,
++ pos + copied);
++ }
+ unlock_out:
+ f2fs_put_page(page, 1);
+ f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
+@@ -3522,9 +3617,6 @@ void f2fs_invalidate_folio(struct folio *folio, size_t offset, size_t length)
+ inode->i_ino == F2FS_COMPRESS_INO(sbi))
+ clear_page_private_data(&folio->page);
+
+- if (page_private_atomic(&folio->page))
+- return f2fs_drop_inmem_page(inode, &folio->page);
+-
+ folio_detach_private(folio);
+ }
+
+@@ -3534,10 +3626,6 @@ int f2fs_release_page(struct page *page, gfp_t wait)
+ if (PageDirty(page))
+ return 0;
+
+- /* This is atomic written page, keep Private */
+- if (page_private_atomic(page))
+- return 0;
+-
+ if (test_opt(F2FS_P_SB(page), COMPRESS_CACHE)) {
+ struct inode *inode = page->mapping->host;
+
+@@ -3563,18 +3651,6 @@ static bool f2fs_dirty_data_folio(struct address_space *mapping,
+ folio_mark_uptodate(folio);
+ BUG_ON(folio_test_swapcache(folio));
+
+- if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
+- if (!page_private_atomic(&folio->page)) {
+- f2fs_register_inmem_page(inode, &folio->page);
+- return true;
+- }
+- /*
+- * Previously, this page has been registered, we just
+- * return here.
+- */
+- return false;
+- }
+-
+ if (!folio_test_dirty(folio)) {
+ filemap_dirty_folio(mapping, folio);
+ f2fs_update_dirty_folio(inode, folio);
+@@ -3654,42 +3730,14 @@ out:
+ int f2fs_migrate_page(struct address_space *mapping,
+ struct page *newpage, struct page *page, enum migrate_mode mode)
+ {
+- int rc, extra_count;
+- struct f2fs_inode_info *fi = F2FS_I(mapping->host);
+- bool atomic_written = page_private_atomic(page);
++ int rc, extra_count = 0;
+
+ BUG_ON(PageWriteback(page));
+
+- /* migrating an atomic written page is safe with the inmem_lock hold */
+- if (atomic_written) {
+- if (mode != MIGRATE_SYNC)
+- return -EBUSY;
+- if (!mutex_trylock(&fi->inmem_lock))
+- return -EAGAIN;
+- }
+-
+- /* one extra reference was held for atomic_write page */
+- extra_count = atomic_written ? 1 : 0;
+ rc = migrate_page_move_mapping(mapping, newpage,
+ page, extra_count);
+- if (rc != MIGRATEPAGE_SUCCESS) {
+- if (atomic_written)
+- mutex_unlock(&fi->inmem_lock);
++ if (rc != MIGRATEPAGE_SUCCESS)
+ return rc;
+- }
+-
+- if (atomic_written) {
+- struct inmem_pages *cur;
+-
+- list_for_each_entry(cur, &fi->inmem_pages, list)
+- if (cur->page == page) {
+- cur->page = newpage;
+- break;
+- }
+- mutex_unlock(&fi->inmem_lock);
+- put_page(page);
+- get_page(newpage);
+- }
+
+ /* guarantee to start from no stale private field */
+ set_page_private(newpage, 0);
+diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
+index fcdf253cd211e..c92625ef16d0b 100644
+--- a/fs/f2fs/debug.c
++++ b/fs/f2fs/debug.c
+@@ -91,11 +91,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
+ si->ndirty_files = sbi->ndirty_inode[FILE_INODE];
+ si->nquota_files = sbi->nquota_files;
+ si->ndirty_all = sbi->ndirty_inode[DIRTY_META];
+- si->inmem_pages = get_pages(sbi, F2FS_INMEM_PAGES);
+ si->aw_cnt = sbi->atomic_files;
+- si->vw_cnt = atomic_read(&sbi->vw_cnt);
+ si->max_aw_cnt = atomic_read(&sbi->max_aw_cnt);
+- si->max_vw_cnt = atomic_read(&sbi->max_vw_cnt);
+ si->nr_dio_read = get_pages(sbi, F2FS_DIO_READ);
+ si->nr_dio_write = get_pages(sbi, F2FS_DIO_WRITE);
+ si->nr_wb_cp_data = get_pages(sbi, F2FS_WB_CP_DATA);
+@@ -167,8 +164,6 @@ static void update_general_status(struct f2fs_sb_info *sbi)
+ si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
+ si->io_skip_bggc = sbi->io_skip_bggc;
+ si->other_skip_bggc = sbi->other_skip_bggc;
+- si->skipped_atomic_files[BG_GC] = sbi->skipped_atomic_files[BG_GC];
+- si->skipped_atomic_files[FG_GC] = sbi->skipped_atomic_files[FG_GC];
+ si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
+ * 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
+ / 2;
+@@ -296,7 +291,6 @@ get_cache:
+ sizeof(struct nat_entry);
+ si->cache_mem += NM_I(sbi)->nat_cnt[DIRTY_NAT] *
+ sizeof(struct nat_entry_set);
+- si->cache_mem += si->inmem_pages * sizeof(struct inmem_pages);
+ for (i = 0; i < MAX_INO_ENTRY; i++)
+ si->cache_mem += sbi->im[i].ino_num * sizeof(struct ino_entry);
+ si->cache_mem += atomic_read(&sbi->total_ext_tree) *
+@@ -491,10 +485,6 @@ static int stat_show(struct seq_file *s, void *v)
+ si->bg_data_blks);
+ seq_printf(s, " - node blocks : %d (%d)\n", si->node_blks,
+ si->bg_node_blks);
+- seq_printf(s, "Skipped : atomic write %llu (%llu)\n",
+- si->skipped_atomic_files[BG_GC] +
+- si->skipped_atomic_files[FG_GC],
+- si->skipped_atomic_files[BG_GC]);
+ seq_printf(s, "BG skip : IO: %u, Other: %u\n",
+ si->io_skip_bggc, si->other_skip_bggc);
+ seq_puts(s, "\nExtent Cache:\n");
+@@ -519,10 +509,8 @@ static int stat_show(struct seq_file *s, void *v)
+ si->flush_list_empty,
+ si->nr_discarding, si->nr_discarded,
+ si->nr_discard_cmd, si->undiscard_blks);
+- seq_printf(s, " - inmem: %4d, atomic IO: %4d (Max. %4d), "
+- "volatile IO: %4d (Max. %4d)\n",
+- si->inmem_pages, si->aw_cnt, si->max_aw_cnt,
+- si->vw_cnt, si->max_vw_cnt);
++ seq_printf(s, " - atomic IO: %4d (Max. %4d)\n",
++ si->aw_cnt, si->max_aw_cnt);
+ seq_printf(s, " - compress: %4d, hit:%8d\n", si->compress_pages, si->compress_page_hit);
+ seq_printf(s, " - nodes: %4d in %4d\n",
+ si->ndirty_node, si->node_pages);
+@@ -623,9 +611,7 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
+ for (i = META_CP; i < META_MAX; i++)
+ atomic_set(&sbi->meta_count[i], 0);
+
+- atomic_set(&sbi->vw_cnt, 0);
+ atomic_set(&sbi->max_aw_cnt, 0);
+- atomic_set(&sbi->max_vw_cnt, 0);
+
+ raw_spin_lock_irqsave(&f2fs_stat_lock, flags);
+ list_add_tail(&si->stat_list, &f2fs_stat_list);
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index 9b89f26af1f3f..e2022fad15747 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -716,7 +716,6 @@ enum {
+
+ enum {
+ GC_FAILURE_PIN,
+- GC_FAILURE_ATOMIC,
+ MAX_GC_FAILURE
+ };
+
+@@ -738,8 +737,6 @@ enum {
+ FI_UPDATE_WRITE, /* inode has in-place-update data */
+ FI_NEED_IPU, /* used for ipu per file */
+ FI_ATOMIC_FILE, /* indicate atomic file */
+- FI_ATOMIC_COMMIT, /* indicate the state of atomical committing */
+- FI_VOLATILE_FILE, /* indicate volatile file */
+ FI_FIRST_BLOCK_WRITTEN, /* indicate #0 data block was written */
+ FI_DROP_CACHE, /* drop dirty page cache */
+ FI_DATA_EXIST, /* indicate data exists */
+@@ -752,7 +749,6 @@ enum {
+ FI_EXTRA_ATTR, /* indicate file has extra attribute */
+ FI_PROJ_INHERIT, /* indicate file inherits projectid */
+ FI_PIN_FILE, /* indicate file should not be gced */
+- FI_ATOMIC_REVOKE_REQUEST, /* request to drop atomic data */
+ FI_VERITY_IN_PROGRESS, /* building fs-verity Merkle tree */
+ FI_COMPRESSED_FILE, /* indicate file's data can be compressed */
+ FI_COMPRESS_CORRUPT, /* indicate compressed cluster is corrupted */
+@@ -760,6 +756,7 @@ enum {
+ FI_ENABLE_COMPRESS, /* enable compression in "user" compression mode */
+ FI_COMPRESS_RELEASED, /* compressed blocks were released */
+ FI_ALIGNED_WRITE, /* enable aligned write */
++ FI_COW_FILE, /* indicate COW file */
+ FI_MAX, /* max flag, never be used */
+ };
+
+@@ -794,11 +791,9 @@ struct f2fs_inode_info {
+ #endif
+ struct list_head dirty_list; /* dirty list for dirs and files */
+ struct list_head gdirty_list; /* linked in global dirty list */
+- struct list_head inmem_ilist; /* list for inmem inodes */
+- struct list_head inmem_pages; /* inmemory pages managed by f2fs */
+- struct task_struct *inmem_task; /* store inmemory task */
+- struct mutex inmem_lock; /* lock for inmemory pages */
++ struct task_struct *atomic_write_task; /* store atomic write task */
+ struct extent_tree *extent_tree; /* cached extent_tree entry */
++ struct inode *cow_inode; /* copy-on-write inode for atomic write */
+
+ /* avoid racing between foreground op and gc */
+ struct f2fs_rwsem i_gc_rwsem[2];
+@@ -1092,7 +1087,6 @@ enum count_type {
+ F2FS_DIRTY_QDATA,
+ F2FS_DIRTY_NODES,
+ F2FS_DIRTY_META,
+- F2FS_INMEM_PAGES,
+ F2FS_DIRTY_IMETA,
+ F2FS_WB_CP_DATA,
+ F2FS_WB_DATA,
+@@ -1122,11 +1116,7 @@ enum page_type {
+ META,
+ NR_PAGE_TYPE,
+ META_FLUSH,
+- INMEM, /* the below types are used by tracepoints only. */
+- INMEM_DROP,
+- INMEM_INVALIDATE,
+- INMEM_REVOKE,
+- IPU,
++ IPU, /* the below types are used by tracepoints only. */
+ OPU,
+ };
+
+@@ -1718,7 +1708,6 @@ struct f2fs_sb_info {
+
+ /* for skip statistic */
+ unsigned int atomic_files; /* # of opened atomic file */
+- unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
+ unsigned long long skipped_gc_rwsem; /* FG_GC only */
+
+ /* threshold for gc trials on pinned files */
+@@ -1749,9 +1738,7 @@ struct f2fs_sb_info {
+ atomic_t inline_dir; /* # of inline_dentry inodes */
+ atomic_t compr_inode; /* # of compressed inodes */
+ atomic64_t compr_blocks; /* # of compressed blocks */
+- atomic_t vw_cnt; /* # of volatile writes */
+ atomic_t max_aw_cnt; /* max # of atomic writes */
+- atomic_t max_vw_cnt; /* max # of volatile writes */
+ unsigned int io_skip_bggc; /* skip background gc for in-flight IO */
+ unsigned int other_skip_bggc; /* skip background gc for other reasons */
+ unsigned int ndirty_inode[NR_INODE_TYPE]; /* # of dirty inodes */
+@@ -3202,14 +3189,9 @@ static inline bool f2fs_is_atomic_file(struct inode *inode)
+ return is_inode_flag_set(inode, FI_ATOMIC_FILE);
+ }
+
+-static inline bool f2fs_is_commit_atomic_write(struct inode *inode)
++static inline bool f2fs_is_cow_file(struct inode *inode)
+ {
+- return is_inode_flag_set(inode, FI_ATOMIC_COMMIT);
+-}
+-
+-static inline bool f2fs_is_volatile_file(struct inode *inode)
+-{
+- return is_inode_flag_set(inode, FI_VOLATILE_FILE);
++ return is_inode_flag_set(inode, FI_COW_FILE);
+ }
+
+ static inline bool f2fs_is_first_block_written(struct inode *inode)
+@@ -3444,6 +3426,8 @@ void f2fs_handle_failed_inode(struct inode *inode);
+ int f2fs_update_extension_list(struct f2fs_sb_info *sbi, const char *name,
+ bool hot, bool set);
+ struct dentry *f2fs_get_parent(struct dentry *child);
++int f2fs_get_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
++ struct inode **new_inode);
+
+ /*
+ * dir.c
+@@ -3579,11 +3563,8 @@ void f2fs_destroy_node_manager_caches(void);
+ * segment.c
+ */
+ bool f2fs_need_SSR(struct f2fs_sb_info *sbi);
+-void f2fs_register_inmem_page(struct inode *inode, struct page *page);
+-void f2fs_drop_inmem_pages_all(struct f2fs_sb_info *sbi, bool gc_failure);
+-void f2fs_drop_inmem_pages(struct inode *inode);
+-void f2fs_drop_inmem_page(struct inode *inode, struct page *page);
+-int f2fs_commit_inmem_pages(struct inode *inode);
++int f2fs_commit_atomic_write(struct inode *inode);
++void f2fs_abort_atomic_write(struct inode *inode, bool clean);
+ void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need);
+ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi, bool from_bg);
+ int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino);
+@@ -3815,7 +3796,6 @@ struct f2fs_stat_info {
+ int ext_tree, zombie_tree, ext_node;
+ int ndirty_node, ndirty_dent, ndirty_meta, ndirty_imeta;
+ int ndirty_data, ndirty_qdata;
+- int inmem_pages;
+ unsigned int ndirty_dirs, ndirty_files, nquota_files, ndirty_all;
+ int nats, dirty_nats, sits, dirty_sits;
+ int free_nids, avail_nids, alloc_nids;
+@@ -3833,7 +3813,7 @@ struct f2fs_stat_info {
+ int inline_xattr, inline_inode, inline_dir, append, update, orphans;
+ int compr_inode;
+ unsigned long long compr_blocks;
+- int aw_cnt, max_aw_cnt, vw_cnt, max_vw_cnt;
++ int aw_cnt, max_aw_cnt;
+ unsigned int valid_count, valid_node_count, valid_inode_count, discard_blks;
+ unsigned int bimodal, avg_vblocks;
+ int util_free, util_valid, util_invalid;
+@@ -3845,7 +3825,6 @@ struct f2fs_stat_info {
+ int bg_node_segs, bg_data_segs;
+ int tot_blks, data_blks, node_blks;
+ int bg_data_blks, bg_node_blks;
+- unsigned long long skipped_atomic_files[2];
+ int curseg[NR_CURSEG_TYPE];
+ int cursec[NR_CURSEG_TYPE];
+ int curzone[NR_CURSEG_TYPE];
+@@ -3945,17 +3924,6 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
+ if (cur > max) \
+ atomic_set(&F2FS_I_SB(inode)->max_aw_cnt, cur); \
+ } while (0)
+-#define stat_inc_volatile_write(inode) \
+- (atomic_inc(&F2FS_I_SB(inode)->vw_cnt))
+-#define stat_dec_volatile_write(inode) \
+- (atomic_dec(&F2FS_I_SB(inode)->vw_cnt))
+-#define stat_update_max_volatile_write(inode) \
+- do { \
+- int cur = atomic_read(&F2FS_I_SB(inode)->vw_cnt); \
+- int max = atomic_read(&F2FS_I_SB(inode)->max_vw_cnt); \
+- if (cur > max) \
+- atomic_set(&F2FS_I_SB(inode)->max_vw_cnt, cur); \
+- } while (0)
+ #define stat_inc_seg_count(sbi, type, gc_type) \
+ do { \
+ struct f2fs_stat_info *si = F2FS_STAT(sbi); \
+@@ -4017,9 +3985,6 @@ void f2fs_update_sit_info(struct f2fs_sb_info *sbi);
+ #define stat_add_compr_blocks(inode, blocks) do { } while (0)
+ #define stat_sub_compr_blocks(inode, blocks) do { } while (0)
+ #define stat_update_max_atomic_write(inode) do { } while (0)
+-#define stat_inc_volatile_write(inode) do { } while (0)
+-#define stat_dec_volatile_write(inode) do { } while (0)
+-#define stat_update_max_volatile_write(inode) do { } while (0)
+ #define stat_inc_meta_count(sbi, blkaddr) do { } while (0)
+ #define stat_inc_seg_type(sbi, curseg) do { } while (0)
+ #define stat_inc_block_count(sbi, curseg) do { } while (0)
+@@ -4423,8 +4388,7 @@ static inline bool f2fs_lfs_mode(struct f2fs_sb_info *sbi)
+ static inline bool f2fs_may_compress(struct inode *inode)
+ {
+ if (IS_SWAPFILE(inode) || f2fs_is_pinned_file(inode) ||
+- f2fs_is_atomic_file(inode) ||
+- f2fs_is_volatile_file(inode))
++ f2fs_is_atomic_file(inode) || f2fs_has_inline_data(inode))
+ return false;
+ return S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode);
+ }
+diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
+index 5d1b97e852e74..d3415aca5736f 100644
+--- a/fs/f2fs/file.c
++++ b/fs/f2fs/file.c
+@@ -1816,16 +1816,8 @@ static int f2fs_release_file(struct inode *inode, struct file *filp)
+ atomic_read(&inode->i_writecount) != 1)
+ return 0;
+
+- /* some remained atomic pages should discarded */
+ if (f2fs_is_atomic_file(inode))
+- f2fs_drop_inmem_pages(inode);
+- if (f2fs_is_volatile_file(inode)) {
+- set_inode_flag(inode, FI_DROP_CACHE);
+- filemap_fdatawrite(inode->i_mapping);
+- clear_inode_flag(inode, FI_DROP_CACHE);
+- clear_inode_flag(inode, FI_VOLATILE_FILE);
+- stat_dec_volatile_write(inode);
+- }
++ f2fs_abort_atomic_write(inode, true);
+ return 0;
+ }
+
+@@ -1840,8 +1832,8 @@ static int f2fs_file_flush(struct file *file, fl_owner_t id)
+ * before dropping file lock, it needs to do in ->flush.
+ */
+ if (f2fs_is_atomic_file(inode) &&
+- F2FS_I(inode)->inmem_task == current)
+- f2fs_drop_inmem_pages(inode);
++ F2FS_I(inode)->atomic_write_task == current)
++ f2fs_abort_atomic_write(inode, true);
+ return 0;
+ }
+
+@@ -1875,10 +1867,7 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask)
+ if (masked_flags & F2FS_COMPR_FL) {
+ if (!f2fs_disable_compressed_file(inode))
+ return -EINVAL;
+- }
+- if (iflags & F2FS_NOCOMP_FL)
+- return -EINVAL;
+- if (iflags & F2FS_COMPR_FL) {
++ } else {
+ if (!f2fs_may_compress(inode))
+ return -EINVAL;
+ if (S_ISREG(inode->i_mode) && inode->i_size)
+@@ -1887,10 +1876,6 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask)
+ set_compress_context(inode);
+ }
+ }
+- if ((iflags ^ masked_flags) & F2FS_NOCOMP_FL) {
+- if (masked_flags & F2FS_COMPR_FL)
+- return -EINVAL;
+- }
+
+ fi->i_flags = iflags | (fi->i_flags & ~mask);
+ f2fs_bug_on(F2FS_I_SB(inode), (fi->i_flags & F2FS_COMPR_FL) &&
+@@ -2004,6 +1989,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
+ struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
+ struct f2fs_inode_info *fi = F2FS_I(inode);
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
++ struct inode *pinode;
+ int ret;
+
+ if (!inode_owner_or_capable(mnt_userns, inode))
+@@ -2026,11 +2012,8 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
+ goto out;
+ }
+
+- if (f2fs_is_atomic_file(inode)) {
+- if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST))
+- ret = -EINVAL;
++ if (f2fs_is_atomic_file(inode))
+ goto out;
+- }
+
+ ret = f2fs_convert_inline_inode(inode);
+ if (ret)
+@@ -2051,19 +2034,33 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
+ goto out;
+ }
+
++ /* Create a COW inode for atomic write */
++ pinode = f2fs_iget(inode->i_sb, fi->i_pino);
++ if (IS_ERR(pinode)) {
++ f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
++ ret = PTR_ERR(pinode);
++ goto out;
++ }
++
++ ret = f2fs_get_tmpfile(mnt_userns, pinode, &fi->cow_inode);
++ iput(pinode);
++ if (ret) {
++ f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
++ goto out;
++ }
++ f2fs_i_size_write(fi->cow_inode, i_size_read(inode));
++
+ spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+- if (list_empty(&fi->inmem_ilist))
+- list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
+ sbi->atomic_files++;
+ spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+
+- /* add inode in inmem_list first and set atomic_file */
+ set_inode_flag(inode, FI_ATOMIC_FILE);
+- clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
++ set_inode_flag(fi->cow_inode, FI_COW_FILE);
++ clear_inode_flag(fi->cow_inode, FI_INLINE_DATA);
+ f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+
+ f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
+- F2FS_I(inode)->inmem_task = current;
++ F2FS_I(inode)->atomic_write_task = current;
+ stat_update_max_atomic_write(inode);
+ out:
+ inode_unlock(inode);
+@@ -2088,99 +2085,24 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
+
+ inode_lock(inode);
+
+- if (f2fs_is_volatile_file(inode)) {
+- ret = -EINVAL;
+- goto err_out;
+- }
+-
+ if (f2fs_is_atomic_file(inode)) {
+- ret = f2fs_commit_inmem_pages(inode);
++ ret = f2fs_commit_atomic_write(inode);
+ if (ret)
+- goto err_out;
++ goto unlock_out;
+
+ ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
+ if (!ret)
+- f2fs_drop_inmem_pages(inode);
++ f2fs_abort_atomic_write(inode, false);
+ } else {
+ ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
+ }
+-err_out:
+- if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
+- clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
+- ret = -EINVAL;
+- }
++unlock_out:
+ inode_unlock(inode);
+ mnt_drop_write_file(filp);
+ return ret;
+ }
+
+-static int f2fs_ioc_start_volatile_write(struct file *filp)
+-{
+- struct inode *inode = file_inode(filp);
+- struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
+- int ret;
+-
+- if (!inode_owner_or_capable(mnt_userns, inode))
+- return -EACCES;
+-
+- if (!S_ISREG(inode->i_mode))
+- return -EINVAL;
+-
+- ret = mnt_want_write_file(filp);
+- if (ret)
+- return ret;
+-
+- inode_lock(inode);
+-
+- if (f2fs_is_volatile_file(inode))
+- goto out;
+-
+- ret = f2fs_convert_inline_inode(inode);
+- if (ret)
+- goto out;
+-
+- stat_inc_volatile_write(inode);
+- stat_update_max_volatile_write(inode);
+-
+- set_inode_flag(inode, FI_VOLATILE_FILE);
+- f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
+-out:
+- inode_unlock(inode);
+- mnt_drop_write_file(filp);
+- return ret;
+-}
+-
+-static int f2fs_ioc_release_volatile_write(struct file *filp)
+-{
+- struct inode *inode = file_inode(filp);
+- struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
+- int ret;
+-
+- if (!inode_owner_or_capable(mnt_userns, inode))
+- return -EACCES;
+-
+- ret = mnt_want_write_file(filp);
+- if (ret)
+- return ret;
+-
+- inode_lock(inode);
+-
+- if (!f2fs_is_volatile_file(inode))
+- goto out;
+-
+- if (!f2fs_is_first_block_written(inode)) {
+- ret = truncate_partial_data_page(inode, 0, true);
+- goto out;
+- }
+-
+- ret = punch_hole(inode, 0, F2FS_BLKSIZE);
+-out:
+- inode_unlock(inode);
+- mnt_drop_write_file(filp);
+- return ret;
+-}
+-
+-static int f2fs_ioc_abort_volatile_write(struct file *filp)
++static int f2fs_ioc_abort_atomic_write(struct file *filp)
+ {
+ struct inode *inode = file_inode(filp);
+ struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
+@@ -2196,14 +2118,7 @@ static int f2fs_ioc_abort_volatile_write(struct file *filp)
+ inode_lock(inode);
+
+ if (f2fs_is_atomic_file(inode))
+- f2fs_drop_inmem_pages(inode);
+- if (f2fs_is_volatile_file(inode)) {
+- clear_inode_flag(inode, FI_VOLATILE_FILE);
+- stat_dec_volatile_write(inode);
+- ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
+- }
+-
+- clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
++ f2fs_abort_atomic_write(inode, true);
+
+ inode_unlock(inode);
+
+@@ -4024,8 +3939,8 @@ static int f2fs_ioc_decompress_file(struct file *filp, unsigned long arg)
+ goto out;
+ }
+
+- if (f2fs_is_mmap_file(inode)) {
+- ret = -EBUSY;
++ if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) {
++ ret = -EINVAL;
+ goto out;
+ }
+
+@@ -4096,8 +4011,8 @@ static int f2fs_ioc_compress_file(struct file *filp, unsigned long arg)
+ goto out;
+ }
+
+- if (f2fs_is_mmap_file(inode)) {
+- ret = -EBUSY;
++ if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) {
++ ret = -EINVAL;
+ goto out;
+ }
+
+@@ -4149,12 +4064,11 @@ static long __f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+ return f2fs_ioc_start_atomic_write(filp);
+ case F2FS_IOC_COMMIT_ATOMIC_WRITE:
+ return f2fs_ioc_commit_atomic_write(filp);
++ case F2FS_IOC_ABORT_ATOMIC_WRITE:
++ return f2fs_ioc_abort_atomic_write(filp);
+ case F2FS_IOC_START_VOLATILE_WRITE:
+- return f2fs_ioc_start_volatile_write(filp);
+ case F2FS_IOC_RELEASE_VOLATILE_WRITE:
+- return f2fs_ioc_release_volatile_write(filp);
+- case F2FS_IOC_ABORT_VOLATILE_WRITE:
+- return f2fs_ioc_abort_volatile_write(filp);
++ return -EOPNOTSUPP;
+ case F2FS_IOC_SHUTDOWN:
+ return f2fs_ioc_shutdown(filp, arg);
+ case FITRIM:
+@@ -4778,7 +4692,7 @@ long f2fs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+ case F2FS_IOC_COMMIT_ATOMIC_WRITE:
+ case F2FS_IOC_START_VOLATILE_WRITE:
+ case F2FS_IOC_RELEASE_VOLATILE_WRITE:
+- case F2FS_IOC_ABORT_VOLATILE_WRITE:
++ case F2FS_IOC_ABORT_ATOMIC_WRITE:
+ case F2FS_IOC_SHUTDOWN:
+ case FITRIM:
+ case FS_IOC_SET_ENCRYPTION_POLICY:
+diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
+index ea5b93b689cd5..ba8e93e517be2 100644
+--- a/fs/f2fs/gc.c
++++ b/fs/f2fs/gc.c
+@@ -646,6 +646,54 @@ static void release_victim_entry(struct f2fs_sb_info *sbi)
+ f2fs_bug_on(sbi, !list_empty(&am->victim_list));
+ }
+
++static bool f2fs_pin_section(struct f2fs_sb_info *sbi, unsigned int segno)
++{
++ struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
++ unsigned int secno = GET_SEC_FROM_SEG(sbi, segno);
++
++ if (!dirty_i->enable_pin_section)
++ return false;
++ if (!test_and_set_bit(secno, dirty_i->pinned_secmap))
++ dirty_i->pinned_secmap_cnt++;
++ return true;
++}
++
++static bool f2fs_pinned_section_exists(struct dirty_seglist_info *dirty_i)
++{
++ return dirty_i->pinned_secmap_cnt;
++}
++
++static bool f2fs_section_is_pinned(struct dirty_seglist_info *dirty_i,
++ unsigned int secno)
++{
++ return dirty_i->enable_pin_section &&
++ f2fs_pinned_section_exists(dirty_i) &&
++ test_bit(secno, dirty_i->pinned_secmap);
++}
++
++static void f2fs_unpin_all_sections(struct f2fs_sb_info *sbi, bool enable)
++{
++ unsigned int bitmap_size = f2fs_bitmap_size(MAIN_SECS(sbi));
++
++ if (f2fs_pinned_section_exists(DIRTY_I(sbi))) {
++ memset(DIRTY_I(sbi)->pinned_secmap, 0, bitmap_size);
++ DIRTY_I(sbi)->pinned_secmap_cnt = 0;
++ }
++ DIRTY_I(sbi)->enable_pin_section = enable;
++}
++
++static int f2fs_gc_pinned_control(struct inode *inode, int gc_type,
++ unsigned int segno)
++{
++ if (!f2fs_is_pinned_file(inode))
++ return 0;
++ if (gc_type != FG_GC)
++ return -EBUSY;
++ if (!f2fs_pin_section(F2FS_I_SB(inode), segno))
++ f2fs_pin_file_control(inode, true);
++ return -EAGAIN;
++}
++
+ /*
+ * This function is called from two paths.
+ * One is garbage collection and the other is SSR segment selection.
+@@ -787,6 +835,9 @@ retry:
+ if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
+ goto next;
+
++ if (gc_type == FG_GC && f2fs_section_is_pinned(dirty_i, secno))
++ goto next;
++
+ if (is_atgc) {
+ add_victim_entry(sbi, &p, segno);
+ goto next;
+@@ -1194,18 +1245,9 @@ static int move_data_block(struct inode *inode, block_t bidx,
+ goto out;
+ }
+
+- if (f2fs_is_atomic_file(inode)) {
+- F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC]++;
+- F2FS_I_SB(inode)->skipped_atomic_files[gc_type]++;
+- err = -EAGAIN;
+- goto out;
+- }
+-
+- if (f2fs_is_pinned_file(inode)) {
+- f2fs_pin_file_control(inode, true);
+- err = -EAGAIN;
++ err = f2fs_gc_pinned_control(inode, gc_type, segno);
++ if (err)
+ goto out;
+- }
+
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+ err = f2fs_get_dnode_of_data(&dn, bidx, LOOKUP_NODE);
+@@ -1344,18 +1386,9 @@ static int move_data_page(struct inode *inode, block_t bidx, int gc_type,
+ goto out;
+ }
+
+- if (f2fs_is_atomic_file(inode)) {
+- F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC]++;
+- F2FS_I_SB(inode)->skipped_atomic_files[gc_type]++;
+- err = -EAGAIN;
+- goto out;
+- }
+- if (f2fs_is_pinned_file(inode)) {
+- if (gc_type == FG_GC)
+- f2fs_pin_file_control(inode, true);
+- err = -EAGAIN;
++ err = f2fs_gc_pinned_control(inode, gc_type, segno);
++ if (err)
+ goto out;
+- }
+
+ if (gc_type == BG_GC) {
+ if (PageWriteback(page)) {
+@@ -1475,11 +1508,19 @@ next_step:
+ ofs_in_node = le16_to_cpu(entry->ofs_in_node);
+
+ if (phase == 3) {
++ int err;
++
+ inode = f2fs_iget(sb, dni.ino);
+ if (IS_ERR(inode) || is_bad_inode(inode) ||
+ special_file(inode->i_mode))
+ continue;
+
++ err = f2fs_gc_pinned_control(inode, gc_type, segno);
++ if (err == -EAGAIN) {
++ iput(inode);
++ return submitted;
++ }
++
+ if (!f2fs_down_write_trylock(
+ &F2FS_I(inode)->i_gc_rwsem[WRITE])) {
+ iput(inode);
+@@ -1711,8 +1752,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
+ .ilist = LIST_HEAD_INIT(gc_list.ilist),
+ .iroot = RADIX_TREE_INIT(gc_list.iroot, GFP_NOFS),
+ };
+- unsigned long long last_skipped = sbi->skipped_atomic_files[FG_GC];
+- unsigned long long first_skipped;
+ unsigned int skipped_round = 0, round = 0;
+
+ trace_f2fs_gc_begin(sbi->sb, sync, background,
+@@ -1726,7 +1765,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
+
+ cpc.reason = __get_cp_reason(sbi);
+ sbi->skipped_gc_rwsem = 0;
+- first_skipped = last_skipped;
+ gc_more:
+ if (unlikely(!(sbi->sb->s_flags & SB_ACTIVE))) {
+ ret = -EINVAL;
+@@ -1758,9 +1796,17 @@ gc_more:
+ ret = -EINVAL;
+ goto stop;
+ }
++retry:
+ ret = __get_victim(sbi, &segno, gc_type);
+- if (ret)
++ if (ret) {
++ /* allow to search victim from sections has pinned data */
++ if (ret == -ENODATA && gc_type == FG_GC &&
++ f2fs_pinned_section_exists(DIRTY_I(sbi))) {
++ f2fs_unpin_all_sections(sbi, false);
++ goto retry;
++ }
+ goto stop;
++ }
+
+ seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type, force);
+ if (gc_type == FG_GC &&
+@@ -1769,10 +1815,8 @@ gc_more:
+ total_freed += seg_freed;
+
+ if (gc_type == FG_GC) {
+- if (sbi->skipped_atomic_files[FG_GC] > last_skipped ||
+- sbi->skipped_gc_rwsem)
++ if (sbi->skipped_gc_rwsem)
+ skipped_round++;
+- last_skipped = sbi->skipped_atomic_files[FG_GC];
+ round++;
+ }
+
+@@ -1782,27 +1826,31 @@ gc_more:
+ if (sync)
+ goto stop;
+
+- if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
+- if (skipped_round <= MAX_SKIP_GC_COUNT ||
+- skipped_round * 2 < round) {
+- segno = NULL_SEGNO;
+- goto gc_more;
+- }
++ if (!has_not_enough_free_secs(sbi, sec_freed, 0))
++ goto stop;
+
+- if (first_skipped < last_skipped &&
+- (last_skipped - first_skipped) >
+- sbi->skipped_gc_rwsem) {
+- f2fs_drop_inmem_pages_all(sbi, true);
+- segno = NULL_SEGNO;
+- goto gc_more;
+- }
+- if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
++ if (skipped_round <= MAX_SKIP_GC_COUNT || skipped_round * 2 < round) {
++
++ /* Write checkpoint to reclaim prefree segments */
++ if (free_sections(sbi) < NR_CURSEG_PERSIST_TYPE &&
++ prefree_segments(sbi) &&
++ !is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
+ ret = f2fs_write_checkpoint(sbi, &cpc);
++ if (ret)
++ goto stop;
++ }
++ segno = NULL_SEGNO;
++ goto gc_more;
+ }
++ if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
++ ret = f2fs_write_checkpoint(sbi, &cpc);
+ stop:
+ SIT_I(sbi)->last_victim[ALLOC_NEXT] = 0;
+ SIT_I(sbi)->last_victim[FLUSH_DEVICE] = init_segno;
+
++ if (gc_type == FG_GC)
++ f2fs_unpin_all_sections(sbi, true);
++
+ trace_f2fs_gc_end(sbi->sb, ret, total_freed, sec_freed,
+ get_pages(sbi, F2FS_DIRTY_NODES),
+ get_pages(sbi, F2FS_DIRTY_DENTS),
+diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
+index e9818723103c6..938961a9084e3 100644
+--- a/fs/f2fs/inode.c
++++ b/fs/f2fs/inode.c
+@@ -744,9 +744,8 @@ void f2fs_evict_inode(struct inode *inode)
+ nid_t xnid = F2FS_I(inode)->i_xattr_nid;
+ int err = 0;
+
+- /* some remained atomic pages should discarded */
+ if (f2fs_is_atomic_file(inode))
+- f2fs_drop_inmem_pages(inode);
++ f2fs_abort_atomic_write(inode, true);
+
+ trace_f2fs_evict_inode(inode);
+ truncate_inode_pages_final(&inode->i_data);
+diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
+index 3764e12f19db0..28d5981443a73 100644
+--- a/fs/f2fs/namei.c
++++ b/fs/f2fs/namei.c
+@@ -848,8 +848,8 @@ out:
+ }
+
+ static int __f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
+- struct dentry *dentry, umode_t mode,
+- struct inode **whiteout)
++ struct dentry *dentry, umode_t mode, bool is_whiteout,
++ struct inode **new_inode)
+ {
+ struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
+ struct inode *inode;
+@@ -863,7 +863,7 @@ static int __f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
+ if (IS_ERR(inode))
+ return PTR_ERR(inode);
+
+- if (whiteout) {
++ if (is_whiteout) {
+ init_special_inode(inode, inode->i_mode, WHITEOUT_DEV);
+ inode->i_op = &f2fs_special_inode_operations;
+ } else {
+@@ -888,21 +888,25 @@ static int __f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
+ f2fs_add_orphan_inode(inode);
+ f2fs_alloc_nid_done(sbi, inode->i_ino);
+
+- if (whiteout) {
++ if (is_whiteout) {
+ f2fs_i_links_write(inode, false);
+
+ spin_lock(&inode->i_lock);
+ inode->i_state |= I_LINKABLE;
+ spin_unlock(&inode->i_lock);
+-
+- *whiteout = inode;
+ } else {
+- d_tmpfile(dentry, inode);
++ if (dentry)
++ d_tmpfile(dentry, inode);
++ else
++ f2fs_i_links_write(inode, false);
+ }
+ /* link_count was changed by d_tmpfile as well. */
+ f2fs_unlock_op(sbi);
+ unlock_new_inode(inode);
+
++ if (new_inode)
++ *new_inode = inode;
++
+ f2fs_balance_fs(sbi, true);
+ return 0;
+
+@@ -923,7 +927,7 @@ static int f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
+ if (!f2fs_is_checkpoint_ready(sbi))
+ return -ENOSPC;
+
+- return __f2fs_tmpfile(mnt_userns, dir, dentry, mode, NULL);
++ return __f2fs_tmpfile(mnt_userns, dir, dentry, mode, false, NULL);
+ }
+
+ static int f2fs_create_whiteout(struct user_namespace *mnt_userns,
+@@ -933,7 +937,13 @@ static int f2fs_create_whiteout(struct user_namespace *mnt_userns,
+ return -EIO;
+
+ return __f2fs_tmpfile(mnt_userns, dir, NULL,
+- S_IFCHR | WHITEOUT_MODE, whiteout);
++ S_IFCHR | WHITEOUT_MODE, true, whiteout);
++}
++
++int f2fs_get_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
++ struct inode **new_inode)
++{
++ return __f2fs_tmpfile(mnt_userns, dir, NULL, S_IFREG, false, new_inode);
+ }
+
+ static int f2fs_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
+diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
+index aedc3d334113b..9dabf99eedc5d 100644
+--- a/fs/f2fs/node.c
++++ b/fs/f2fs/node.c
+@@ -90,10 +90,6 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type)
+ atomic_read(&sbi->total_ext_node) *
+ sizeof(struct extent_node)) >> PAGE_SHIFT;
+ res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1);
+- } else if (type == INMEM_PAGES) {
+- /* it allows 20% / total_ram for inmemory pages */
+- mem_size = get_pages(sbi, F2FS_INMEM_PAGES);
+- res = mem_size < (val.totalram / 5);
+ } else if (type == DISCARD_CACHE) {
+ mem_size = (atomic_read(&dcc->discard_cmd_cnt) *
+ sizeof(struct discard_cmd)) >> PAGE_SHIFT;
+diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
+index 4c1d34bfea781..3c09cae058b0a 100644
+--- a/fs/f2fs/node.h
++++ b/fs/f2fs/node.h
+@@ -147,7 +147,6 @@ enum mem_type {
+ DIRTY_DENTS, /* indicates dirty dentry pages */
+ INO_ENTRIES, /* indicates inode entries */
+ EXTENT_CACHE, /* indicates extent cache */
+- INMEM_PAGES, /* indicates inmemory pages */
+ DISCARD_CACHE, /* indicates memory of cached discard cmds */
+ COMPRESS_PAGE, /* indicates memory of cached compressed pages */
+ BASE_CHECK, /* check kernel status */
+diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
+index aa0162664a1ec..331ad46722b29 100644
+--- a/fs/f2fs/segment.c
++++ b/fs/f2fs/segment.c
+@@ -30,7 +30,7 @@
+ static struct kmem_cache *discard_entry_slab;
+ static struct kmem_cache *discard_cmd_slab;
+ static struct kmem_cache *sit_entry_set_slab;
+-static struct kmem_cache *inmem_entry_slab;
++static struct kmem_cache *revoke_entry_slab;
+
+ static unsigned long __reverse_ulong(unsigned char *str)
+ {
+@@ -185,304 +185,180 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
+ SM_I(sbi)->min_ssr_sections + reserved_sections(sbi));
+ }
+
+-void f2fs_register_inmem_page(struct inode *inode, struct page *page)
++void f2fs_abort_atomic_write(struct inode *inode, bool clean)
+ {
+- struct inmem_pages *new;
+-
+- set_page_private_atomic(page);
+-
+- new = f2fs_kmem_cache_alloc(inmem_entry_slab,
+- GFP_NOFS, true, NULL);
+-
+- /* add atomic page indices to the list */
+- new->page = page;
+- INIT_LIST_HEAD(&new->list);
++ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
++ struct f2fs_inode_info *fi = F2FS_I(inode);
+
+- /* increase reference count with clean state */
+- get_page(page);
+- mutex_lock(&F2FS_I(inode)->inmem_lock);
+- list_add_tail(&new->list, &F2FS_I(inode)->inmem_pages);
+- inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
+- mutex_unlock(&F2FS_I(inode)->inmem_lock);
++ if (f2fs_is_atomic_file(inode)) {
++ if (clean)
++ truncate_inode_pages_final(inode->i_mapping);
++ clear_inode_flag(fi->cow_inode, FI_COW_FILE);
++ iput(fi->cow_inode);
++ fi->cow_inode = NULL;
++ clear_inode_flag(inode, FI_ATOMIC_FILE);
+
+- trace_f2fs_register_inmem_page(page, INMEM);
++ spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
++ sbi->atomic_files--;
++ spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
++ }
+ }
+
+-static int __revoke_inmem_pages(struct inode *inode,
+- struct list_head *head, bool drop, bool recover,
+- bool trylock)
++static int __replace_atomic_write_block(struct inode *inode, pgoff_t index,
++ block_t new_addr, block_t *old_addr, bool recover)
+ {
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+- struct inmem_pages *cur, *tmp;
+- int err = 0;
+-
+- list_for_each_entry_safe(cur, tmp, head, list) {
+- struct page *page = cur->page;
+-
+- if (drop)
+- trace_f2fs_commit_inmem_page(page, INMEM_DROP);
+-
+- if (trylock) {
+- /*
+- * to avoid deadlock in between page lock and
+- * inmem_lock.
+- */
+- if (!trylock_page(page))
+- continue;
+- } else {
+- lock_page(page);
+- }
+-
+- f2fs_wait_on_page_writeback(page, DATA, true, true);
+-
+- if (recover) {
+- struct dnode_of_data dn;
+- struct node_info ni;
++ struct dnode_of_data dn;
++ struct node_info ni;
++ int err;
+
+- trace_f2fs_commit_inmem_page(page, INMEM_REVOKE);
+ retry:
+- set_new_dnode(&dn, inode, NULL, NULL, 0);
+- err = f2fs_get_dnode_of_data(&dn, page->index,
+- LOOKUP_NODE);
+- if (err) {
+- if (err == -ENOMEM) {
+- memalloc_retry_wait(GFP_NOFS);
+- goto retry;
+- }
+- err = -EAGAIN;
+- goto next;
+- }
+-
+- err = f2fs_get_node_info(sbi, dn.nid, &ni, false);
+- if (err) {
+- f2fs_put_dnode(&dn);
+- return err;
+- }
+-
+- if (cur->old_addr == NEW_ADDR) {
+- f2fs_invalidate_blocks(sbi, dn.data_blkaddr);
+- f2fs_update_data_blkaddr(&dn, NEW_ADDR);
+- } else
+- f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
+- cur->old_addr, ni.version, true, true);
+- f2fs_put_dnode(&dn);
+- }
+-next:
+- /* we don't need to invalidate this in the sccessful status */
+- if (drop || recover) {
+- ClearPageUptodate(page);
+- clear_page_private_gcing(page);
++ set_new_dnode(&dn, inode, NULL, NULL, 0);
++ err = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE_RA);
++ if (err) {
++ if (err == -ENOMEM) {
++ f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT);
++ goto retry;
+ }
+- detach_page_private(page);
+- set_page_private(page, 0);
+- f2fs_put_page(page, 1);
+-
+- list_del(&cur->list);
+- kmem_cache_free(inmem_entry_slab, cur);
+- dec_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
++ return err;
+ }
+- return err;
+-}
+
+-void f2fs_drop_inmem_pages_all(struct f2fs_sb_info *sbi, bool gc_failure)
+-{
+- struct list_head *head = &sbi->inode_list[ATOMIC_FILE];
+- struct inode *inode;
+- struct f2fs_inode_info *fi;
+- unsigned int count = sbi->atomic_files;
+- unsigned int looped = 0;
+-next:
+- spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+- if (list_empty(head)) {
+- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+- return;
++ err = f2fs_get_node_info(sbi, dn.nid, &ni, false);
++ if (err) {
++ f2fs_put_dnode(&dn);
++ return err;
+ }
+- fi = list_first_entry(head, struct f2fs_inode_info, inmem_ilist);
+- inode = igrab(&fi->vfs_inode);
+- if (inode)
+- list_move_tail(&fi->inmem_ilist, head);
+- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+
+- if (inode) {
+- if (gc_failure) {
+- if (!fi->i_gc_failures[GC_FAILURE_ATOMIC])
+- goto skip;
++ if (recover) {
++ /* dn.data_blkaddr is always valid */
++ if (!__is_valid_data_blkaddr(new_addr)) {
++ if (new_addr == NULL_ADDR)
++ dec_valid_block_count(sbi, inode, 1);
++ f2fs_invalidate_blocks(sbi, dn.data_blkaddr);
++ f2fs_update_data_blkaddr(&dn, new_addr);
++ } else {
++ f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
++ new_addr, ni.version, true, true);
+ }
+- set_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
+- f2fs_drop_inmem_pages(inode);
+-skip:
+- iput(inode);
+- }
+- f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT);
+- if (gc_failure) {
+- if (++looped >= count)
+- return;
+- }
+- goto next;
+-}
+-
+-void f2fs_drop_inmem_pages(struct inode *inode)
+-{
+- struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+- struct f2fs_inode_info *fi = F2FS_I(inode);
++ } else {
++ blkcnt_t count = 1;
+
+- do {
+- mutex_lock(&fi->inmem_lock);
+- if (list_empty(&fi->inmem_pages)) {
+- fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
+-
+- spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+- if (!list_empty(&fi->inmem_ilist))
+- list_del_init(&fi->inmem_ilist);
+- if (f2fs_is_atomic_file(inode)) {
+- clear_inode_flag(inode, FI_ATOMIC_FILE);
+- sbi->atomic_files--;
+- }
+- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
++ *old_addr = dn.data_blkaddr;
++ f2fs_truncate_data_blocks_range(&dn, 1);
++ dec_valid_block_count(sbi, F2FS_I(inode)->cow_inode, count);
++ inc_valid_block_count(sbi, inode, &count);
++ f2fs_replace_block(sbi, &dn, dn.data_blkaddr, new_addr,
++ ni.version, true, false);
++ }
+
+- mutex_unlock(&fi->inmem_lock);
+- break;
+- }
+- __revoke_inmem_pages(inode, &fi->inmem_pages,
+- true, false, true);
+- mutex_unlock(&fi->inmem_lock);
+- } while (1);
++ f2fs_put_dnode(&dn);
++ return 0;
+ }
+
+-void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
++static void __complete_revoke_list(struct inode *inode, struct list_head *head,
++ bool revoke)
+ {
+- struct f2fs_inode_info *fi = F2FS_I(inode);
+- struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+- struct list_head *head = &fi->inmem_pages;
+- struct inmem_pages *cur = NULL;
+- struct inmem_pages *tmp;
+-
+- f2fs_bug_on(sbi, !page_private_atomic(page));
++ struct revoke_entry *cur, *tmp;
+
+- mutex_lock(&fi->inmem_lock);
+- list_for_each_entry(tmp, head, list) {
+- if (tmp->page == page) {
+- cur = tmp;
+- break;
+- }
++ list_for_each_entry_safe(cur, tmp, head, list) {
++ if (revoke)
++ __replace_atomic_write_block(inode, cur->index,
++ cur->old_addr, NULL, true);
++ list_del(&cur->list);
++ kmem_cache_free(revoke_entry_slab, cur);
+ }
+-
+- f2fs_bug_on(sbi, !cur);
+- list_del(&cur->list);
+- mutex_unlock(&fi->inmem_lock);
+-
+- dec_page_count(sbi, F2FS_INMEM_PAGES);
+- kmem_cache_free(inmem_entry_slab, cur);
+-
+- ClearPageUptodate(page);
+- clear_page_private_atomic(page);
+- f2fs_put_page(page, 0);
+-
+- detach_page_private(page);
+- set_page_private(page, 0);
+-
+- trace_f2fs_commit_inmem_page(page, INMEM_INVALIDATE);
+ }
+
+-static int __f2fs_commit_inmem_pages(struct inode *inode)
++static int __f2fs_commit_atomic_write(struct inode *inode)
+ {
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct f2fs_inode_info *fi = F2FS_I(inode);
+- struct inmem_pages *cur, *tmp;
+- struct f2fs_io_info fio = {
+- .sbi = sbi,
+- .ino = inode->i_ino,
+- .type = DATA,
+- .op = REQ_OP_WRITE,
+- .op_flags = REQ_SYNC | REQ_PRIO,
+- .io_type = FS_DATA_IO,
+- };
++ struct inode *cow_inode = fi->cow_inode;
++ struct revoke_entry *new;
+ struct list_head revoke_list;
+- bool submit_bio = false;
+- int err = 0;
++ block_t blkaddr;
++ struct dnode_of_data dn;
++ pgoff_t len = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
++ pgoff_t off = 0, blen, index;
++ int ret = 0, i;
+
+ INIT_LIST_HEAD(&revoke_list);
+
+- list_for_each_entry_safe(cur, tmp, &fi->inmem_pages, list) {
+- struct page *page = cur->page;
++ while (len) {
++ blen = min_t(pgoff_t, ADDRS_PER_BLOCK(cow_inode), len);
+
+- lock_page(page);
+- if (page->mapping == inode->i_mapping) {
+- trace_f2fs_commit_inmem_page(page, INMEM);
++ set_new_dnode(&dn, cow_inode, NULL, NULL, 0);
++ ret = f2fs_get_dnode_of_data(&dn, off, LOOKUP_NODE_RA);
++ if (ret && ret != -ENOENT) {
++ goto out;
++ } else if (ret == -ENOENT) {
++ ret = 0;
++ if (dn.max_level == 0)
++ goto out;
++ goto next;
++ }
+
+- f2fs_wait_on_page_writeback(page, DATA, true, true);
++ blen = min((pgoff_t)ADDRS_PER_PAGE(dn.node_page, cow_inode),
++ len);
++ index = off;
++ for (i = 0; i < blen; i++, dn.ofs_in_node++, index++) {
++ blkaddr = f2fs_data_blkaddr(&dn);
+
+- set_page_dirty(page);
+- if (clear_page_dirty_for_io(page)) {
+- inode_dec_dirty_pages(inode);
+- f2fs_remove_dirty_inode(inode);
+- }
+-retry:
+- fio.page = page;
+- fio.old_blkaddr = NULL_ADDR;
+- fio.encrypted_page = NULL;
+- fio.need_lock = LOCK_DONE;
+- err = f2fs_do_write_data_page(&fio);
+- if (err) {
+- if (err == -ENOMEM) {
+- memalloc_retry_wait(GFP_NOFS);
+- goto retry;
+- }
+- unlock_page(page);
+- break;
++ if (!__is_valid_data_blkaddr(blkaddr)) {
++ continue;
++ } else if (!f2fs_is_valid_blkaddr(sbi, blkaddr,
++ DATA_GENERIC_ENHANCE)) {
++ f2fs_put_dnode(&dn);
++ ret = -EFSCORRUPTED;
++ goto out;
+ }
+- /* record old blkaddr for revoking */
+- cur->old_addr = fio.old_blkaddr;
+- submit_bio = true;
+- }
+- unlock_page(page);
+- list_move_tail(&cur->list, &revoke_list);
+- }
+
+- if (submit_bio)
+- f2fs_submit_merged_write_cond(sbi, inode, NULL, 0, DATA);
++ new = f2fs_kmem_cache_alloc(revoke_entry_slab, GFP_NOFS,
++ true, NULL);
++ if (!new) {
++ f2fs_put_dnode(&dn);
++ ret = -ENOMEM;
++ goto out;
++ }
+
+- if (err) {
+- /*
+- * try to revoke all committed pages, but still we could fail
+- * due to no memory or other reason, if that happened, EAGAIN
+- * will be returned, which means in such case, transaction is
+- * already not integrity, caller should use journal to do the
+- * recovery or rewrite & commit last transaction. For other
+- * error number, revoking was done by filesystem itself.
+- */
+- err = __revoke_inmem_pages(inode, &revoke_list,
+- false, true, false);
++ ret = __replace_atomic_write_block(inode, index, blkaddr,
++ &new->old_addr, false);
++ if (ret) {
++ f2fs_put_dnode(&dn);
++ kmem_cache_free(revoke_entry_slab, new);
++ goto out;
++ }
+
+- /* drop all uncommitted pages */
+- __revoke_inmem_pages(inode, &fi->inmem_pages,
+- true, false, false);
+- } else {
+- __revoke_inmem_pages(inode, &revoke_list,
+- false, false, false);
++ f2fs_update_data_blkaddr(&dn, NULL_ADDR);
++ new->index = index;
++ list_add_tail(&new->list, &revoke_list);
++ }
++ f2fs_put_dnode(&dn);
++next:
++ off += blen;
++ len -= blen;
+ }
+
+- return err;
++out:
++ __complete_revoke_list(inode, &revoke_list, ret ? true : false);
++
++ return ret;
+ }
+
+-int f2fs_commit_inmem_pages(struct inode *inode)
++int f2fs_commit_atomic_write(struct inode *inode)
+ {
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct f2fs_inode_info *fi = F2FS_I(inode);
+ int err;
+
+- f2fs_balance_fs(sbi, true);
++ err = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX);
++ if (err)
++ return err;
+
+ f2fs_down_write(&fi->i_gc_rwsem[WRITE]);
+-
+ f2fs_lock_op(sbi);
+- set_inode_flag(inode, FI_ATOMIC_COMMIT);
+-
+- mutex_lock(&fi->inmem_lock);
+- err = __f2fs_commit_inmem_pages(inode);
+- mutex_unlock(&fi->inmem_lock);
+
+- clear_inode_flag(inode, FI_ATOMIC_COMMIT);
++ err = __f2fs_commit_atomic_write(inode);
+
+ f2fs_unlock_op(sbi);
+ f2fs_up_write(&fi->i_gc_rwsem[WRITE]);
+@@ -3291,8 +3167,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
+ return CURSEG_COLD_DATA;
+ if (file_is_hot(inode) ||
+ is_inode_flag_set(inode, FI_HOT_DATA) ||
+- f2fs_is_atomic_file(inode) ||
+- f2fs_is_volatile_file(inode))
++ f2fs_is_cow_file(inode))
+ return CURSEG_HOT_DATA;
+ return f2fs_rw_hint_to_seg_type(inode->i_write_hint);
+ } else {
+@@ -4653,6 +4528,13 @@ static int init_victim_secmap(struct f2fs_sb_info *sbi)
+ dirty_i->victim_secmap = f2fs_kvzalloc(sbi, bitmap_size, GFP_KERNEL);
+ if (!dirty_i->victim_secmap)
+ return -ENOMEM;
++
++ dirty_i->pinned_secmap = f2fs_kvzalloc(sbi, bitmap_size, GFP_KERNEL);
++ if (!dirty_i->pinned_secmap)
++ return -ENOMEM;
++
++ dirty_i->pinned_secmap_cnt = 0;
++ dirty_i->enable_pin_section = true;
+ return 0;
+ }
+
+@@ -5241,6 +5123,7 @@ static void destroy_victim_secmap(struct f2fs_sb_info *sbi)
+ {
+ struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
+
++ kvfree(dirty_i->pinned_secmap);
+ kvfree(dirty_i->victim_secmap);
+ }
+
+@@ -5351,9 +5234,9 @@ int __init f2fs_create_segment_manager_caches(void)
+ if (!sit_entry_set_slab)
+ goto destroy_discard_cmd;
+
+- inmem_entry_slab = f2fs_kmem_cache_create("f2fs_inmem_page_entry",
+- sizeof(struct inmem_pages));
+- if (!inmem_entry_slab)
++ revoke_entry_slab = f2fs_kmem_cache_create("f2fs_revoke_entry",
++ sizeof(struct revoke_entry));
++ if (!revoke_entry_slab)
+ goto destroy_sit_entry_set;
+ return 0;
+
+@@ -5372,5 +5255,5 @@ void f2fs_destroy_segment_manager_caches(void)
+ kmem_cache_destroy(sit_entry_set_slab);
+ kmem_cache_destroy(discard_cmd_slab);
+ kmem_cache_destroy(discard_entry_slab);
+- kmem_cache_destroy(inmem_entry_slab);
++ kmem_cache_destroy(revoke_entry_slab);
+ }
+diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
+index 1fa26a9603cb8..3f277dfcb1311 100644
+--- a/fs/f2fs/segment.h
++++ b/fs/f2fs/segment.h
+@@ -225,10 +225,10 @@ struct segment_allocation {
+
+ #define MAX_SKIP_GC_COUNT 16
+
+-struct inmem_pages {
++struct revoke_entry {
+ struct list_head list;
+- struct page *page;
+ block_t old_addr; /* for revoking when fail to commit */
++ pgoff_t index;
+ };
+
+ struct sit_info {
+@@ -295,6 +295,9 @@ struct dirty_seglist_info {
+ struct mutex seglist_lock; /* lock for segment bitmaps */
+ int nr_dirty[NR_DIRTY_TYPE]; /* # of dirty segments */
+ unsigned long *victim_secmap; /* background GC victims */
++ unsigned long *pinned_secmap; /* pinned victims from foreground GC */
++ unsigned int pinned_secmap_cnt; /* count of victims which has pinned data */
++ bool enable_pin_section; /* enable pinning section */
+ };
+
+ /* victim selection function for cleaning and SSR */
+diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
+index 1b59f95606c72..e3539c027a6cf 100644
+--- a/fs/f2fs/super.c
++++ b/fs/f2fs/super.c
+@@ -1339,9 +1339,6 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
+ spin_lock_init(&fi->i_size_lock);
+ INIT_LIST_HEAD(&fi->dirty_list);
+ INIT_LIST_HEAD(&fi->gdirty_list);
+- INIT_LIST_HEAD(&fi->inmem_ilist);
+- INIT_LIST_HEAD(&fi->inmem_pages);
+- mutex_init(&fi->inmem_lock);
+ init_f2fs_rwsem(&fi->i_gc_rwsem[READ]);
+ init_f2fs_rwsem(&fi->i_gc_rwsem[WRITE]);
+ init_f2fs_rwsem(&fi->i_xattr_sem);
+@@ -1382,9 +1379,8 @@ static int f2fs_drop_inode(struct inode *inode)
+ atomic_inc(&inode->i_count);
+ spin_unlock(&inode->i_lock);
+
+- /* some remained atomic pages should discarded */
+ if (f2fs_is_atomic_file(inode))
+- f2fs_drop_inmem_pages(inode);
++ f2fs_abort_atomic_write(inode, true);
+
+ /* should remain fi->extent_tree for writepage */
+ f2fs_destroy_extent_node(inode);
+diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
+index 3d793202cc9fe..5ac7e756a1bb3 100644
+--- a/fs/f2fs/verity.c
++++ b/fs/f2fs/verity.c
+@@ -128,7 +128,7 @@ static int f2fs_begin_enable_verity(struct file *filp)
+ if (f2fs_verity_in_progress(inode))
+ return -EBUSY;
+
+- if (f2fs_is_atomic_file(inode) || f2fs_is_volatile_file(inode))
++ if (f2fs_is_atomic_file(inode))
+ return -EOPNOTSUPP;
+
+ /*
+diff --git a/fs/fuse/control.c b/fs/fuse/control.c
+index 7cede9a3bc962..247ef4f767612 100644
+--- a/fs/fuse/control.c
++++ b/fs/fuse/control.c
+@@ -258,7 +258,7 @@ int fuse_ctl_add_conn(struct fuse_conn *fc)
+ struct dentry *parent;
+ char name[32];
+
+- if (!fuse_control_sb)
++ if (!fuse_control_sb || fc->no_control)
+ return 0;
+
+ parent = fuse_control_sb->s_root;
+@@ -296,7 +296,7 @@ void fuse_ctl_remove_conn(struct fuse_conn *fc)
+ {
+ int i;
+
+- if (!fuse_control_sb)
++ if (!fuse_control_sb || fc->no_control)
+ return;
+
+ for (i = fc->ctl_ndents - 1; i >= 0; i--) {
+diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
+index 9ff27b8a9782c..9dfee44e97ad9 100644
+--- a/fs/fuse/dir.c
++++ b/fs/fuse/dir.c
+@@ -537,6 +537,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
+ struct fuse_file *ff;
+ void *security_ctx = NULL;
+ u32 security_ctxlen;
++ bool trunc = flags & O_TRUNC;
+
+ /* Userspace expects S_IFREG in create mode */
+ BUG_ON((mode & S_IFMT) != S_IFREG);
+@@ -561,7 +562,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
+ inarg.mode = mode;
+ inarg.umask = current_umask();
+
+- if (fm->fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
++ if (fm->fc->handle_killpriv_v2 && trunc &&
+ !(flags & O_EXCL) && !capable(CAP_FSETID)) {
+ inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
+ }
+@@ -623,6 +624,10 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
+ } else {
+ file->private_data = ff;
+ fuse_finish_open(inode, file);
++ if (fm->fc->atomic_o_trunc && trunc)
++ truncate_pagecache(inode, 0);
++ else if (!(ff->open_flags & FOPEN_KEEP_CACHE))
++ invalidate_inode_pages2(inode->i_mapping);
+ }
+ return err;
+
+diff --git a/fs/fuse/file.c b/fs/fuse/file.c
+index f18d14d5fea19..9b64e2ff1c966 100644
+--- a/fs/fuse/file.c
++++ b/fs/fuse/file.c
+@@ -210,13 +210,9 @@ void fuse_finish_open(struct inode *inode, struct file *file)
+ fi->attr_version = atomic64_inc_return(&fc->attr_version);
+ i_size_write(inode, 0);
+ spin_unlock(&fi->lock);
+- truncate_pagecache(inode, 0);
+ file_update_time(file);
+ fuse_invalidate_attr_mask(inode, FUSE_STATX_MODSIZE);
+- } else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) {
+- invalidate_inode_pages2(inode->i_mapping);
+ }
+-
+ if ((file->f_mode & FMODE_WRITE) && fc->writeback_cache)
+ fuse_link_write_file(file);
+ }
+@@ -239,30 +235,38 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir)
+ if (err)
+ return err;
+
+- if (is_wb_truncate || dax_truncate) {
++ if (is_wb_truncate || dax_truncate)
+ inode_lock(inode);
+- fuse_set_nowrite(inode);
+- }
+
+ if (dax_truncate) {
+ filemap_invalidate_lock(inode->i_mapping);
+ err = fuse_dax_break_layouts(inode, 0, 0);
+ if (err)
+- goto out;
++ goto out_inode_unlock;
+ }
+
++ if (is_wb_truncate || dax_truncate)
++ fuse_set_nowrite(inode);
++
+ err = fuse_do_open(fm, get_node_id(inode), file, isdir);
+ if (!err)
+ fuse_finish_open(inode, file);
+
+-out:
++ if (is_wb_truncate || dax_truncate)
++ fuse_release_nowrite(inode);
++ if (!err) {
++ struct fuse_file *ff = file->private_data;
++
++ if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC))
++ truncate_pagecache(inode, 0);
++ else if (!(ff->open_flags & FOPEN_KEEP_CACHE))
++ invalidate_inode_pages2(inode->i_mapping);
++ }
+ if (dax_truncate)
+ filemap_invalidate_unlock(inode->i_mapping);
+-
+- if (is_wb_truncate | dax_truncate) {
+- fuse_release_nowrite(inode);
++out_inode_unlock:
++ if (is_wb_truncate || dax_truncate)
+ inode_unlock(inode);
+- }
+
+ return err;
+ }
+@@ -338,6 +342,15 @@ static int fuse_open(struct inode *inode, struct file *file)
+
+ static int fuse_release(struct inode *inode, struct file *file)
+ {
++ struct fuse_conn *fc = get_fuse_conn(inode);
++
++ /*
++ * Dirty pages might remain despite write_inode_now() call from
++ * fuse_flush() due to writes racing with the close.
++ */
++ if (fc->writeback_cache)
++ write_inode_now(inode, 1);
++
+ fuse_release_common(file, false);
+
+ /* return value is ignored by VFS */
+diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
+index 8c0665c5dff88..7c290089e693e 100644
+--- a/fs/fuse/inode.c
++++ b/fs/fuse/inode.c
+@@ -180,6 +180,12 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
+ inode->i_uid = make_kuid(fc->user_ns, attr->uid);
+ inode->i_gid = make_kgid(fc->user_ns, attr->gid);
+ inode->i_blocks = attr->blocks;
++
++ /* Sanitize nsecs */
++ attr->atimensec = min_t(u32, attr->atimensec, NSEC_PER_SEC - 1);
++ attr->mtimensec = min_t(u32, attr->mtimensec, NSEC_PER_SEC - 1);
++ attr->ctimensec = min_t(u32, attr->ctimensec, NSEC_PER_SEC - 1);
++
+ inode->i_atime.tv_sec = attr->atime;
+ inode->i_atime.tv_nsec = attr->atimensec;
+ /* mtime from server may be stale due to local buffered write */
+diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
+index 33cde4bbccdc1..61d8afcb10a3f 100644
+--- a/fs/fuse/ioctl.c
++++ b/fs/fuse/ioctl.c
+@@ -9,6 +9,17 @@
+ #include <linux/compat.h>
+ #include <linux/fileattr.h>
+
++static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args)
++{
++ ssize_t ret = fuse_simple_request(fm, args);
++
++ /* Translate ENOSYS, which shouldn't be returned from fs */
++ if (ret == -ENOSYS)
++ ret = -ENOTTY;
++
++ return ret;
++}
++
+ /*
+ * CUSE servers compiled on 32bit broke on 64bit kernels because the
+ * ABI was defined to be 'struct iovec' which is different on 32bit
+@@ -259,7 +270,7 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
+ ap.args.out_pages = true;
+ ap.args.out_argvar = true;
+
+- transferred = fuse_simple_request(fm, &ap.args);
++ transferred = fuse_send_ioctl(fm, &ap.args);
+ err = transferred;
+ if (transferred < 0)
+ goto out;
+@@ -393,7 +404,7 @@ static int fuse_priv_ioctl(struct inode *inode, struct fuse_file *ff,
+ args.out_args[1].size = inarg.out_size;
+ args.out_args[1].value = ptr;
+
+- err = fuse_simple_request(fm, &args);
++ err = fuse_send_ioctl(fm, &args);
+ if (!err) {
+ if (outarg.result < 0)
+ err = outarg.result;
+diff --git a/fs/io-wq.c b/fs/io-wq.c
+deleted file mode 100644
+index 32aeb2c581c58..0000000000000
+--- a/fs/io-wq.c
++++ /dev/null
+@@ -1,1424 +0,0 @@
+-// SPDX-License-Identifier: GPL-2.0
+-/*
+- * Basic worker thread pool for io_uring
+- *
+- * Copyright (C) 2019 Jens Axboe
+- *
+- */
+-#include <linux/kernel.h>
+-#include <linux/init.h>
+-#include <linux/errno.h>
+-#include <linux/sched/signal.h>
+-#include <linux/percpu.h>
+-#include <linux/slab.h>
+-#include <linux/rculist_nulls.h>
+-#include <linux/cpu.h>
+-#include <linux/task_work.h>
+-#include <linux/audit.h>
+-#include <uapi/linux/io_uring.h>
+-
+-#include "io-wq.h"
+-
+-#define WORKER_IDLE_TIMEOUT (5 * HZ)
+-
+-enum {
+- IO_WORKER_F_UP = 1, /* up and active */
+- IO_WORKER_F_RUNNING = 2, /* account as running */
+- IO_WORKER_F_FREE = 4, /* worker on free list */
+- IO_WORKER_F_BOUND = 8, /* is doing bounded work */
+-};
+-
+-enum {
+- IO_WQ_BIT_EXIT = 0, /* wq exiting */
+-};
+-
+-enum {
+- IO_ACCT_STALLED_BIT = 0, /* stalled on hash */
+-};
+-
+-/*
+- * One for each thread in a wqe pool
+- */
+-struct io_worker {
+- refcount_t ref;
+- unsigned flags;
+- struct hlist_nulls_node nulls_node;
+- struct list_head all_list;
+- struct task_struct *task;
+- struct io_wqe *wqe;
+-
+- struct io_wq_work *cur_work;
+- struct io_wq_work *next_work;
+- raw_spinlock_t lock;
+-
+- struct completion ref_done;
+-
+- unsigned long create_state;
+- struct callback_head create_work;
+- int create_index;
+-
+- union {
+- struct rcu_head rcu;
+- struct work_struct work;
+- };
+-};
+-
+-#if BITS_PER_LONG == 64
+-#define IO_WQ_HASH_ORDER 6
+-#else
+-#define IO_WQ_HASH_ORDER 5
+-#endif
+-
+-#define IO_WQ_NR_HASH_BUCKETS (1u << IO_WQ_HASH_ORDER)
+-
+-struct io_wqe_acct {
+- unsigned nr_workers;
+- unsigned max_workers;
+- int index;
+- atomic_t nr_running;
+- raw_spinlock_t lock;
+- struct io_wq_work_list work_list;
+- unsigned long flags;
+-};
+-
+-enum {
+- IO_WQ_ACCT_BOUND,
+- IO_WQ_ACCT_UNBOUND,
+- IO_WQ_ACCT_NR,
+-};
+-
+-/*
+- * Per-node worker thread pool
+- */
+-struct io_wqe {
+- raw_spinlock_t lock;
+- struct io_wqe_acct acct[IO_WQ_ACCT_NR];
+-
+- int node;
+-
+- struct hlist_nulls_head free_list;
+- struct list_head all_list;
+-
+- struct wait_queue_entry wait;
+-
+- struct io_wq *wq;
+- struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS];
+-
+- cpumask_var_t cpu_mask;
+-};
+-
+-/*
+- * Per io_wq state
+- */
+-struct io_wq {
+- unsigned long state;
+-
+- free_work_fn *free_work;
+- io_wq_work_fn *do_work;
+-
+- struct io_wq_hash *hash;
+-
+- atomic_t worker_refs;
+- struct completion worker_done;
+-
+- struct hlist_node cpuhp_node;
+-
+- struct task_struct *task;
+-
+- struct io_wqe *wqes[];
+-};
+-
+-static enum cpuhp_state io_wq_online;
+-
+-struct io_cb_cancel_data {
+- work_cancel_fn *fn;
+- void *data;
+- int nr_running;
+- int nr_pending;
+- bool cancel_all;
+-};
+-
+-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index);
+-static void io_wqe_dec_running(struct io_worker *worker);
+-static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
+- struct io_wqe_acct *acct,
+- struct io_cb_cancel_data *match);
+-static void create_worker_cb(struct callback_head *cb);
+-static void io_wq_cancel_tw_create(struct io_wq *wq);
+-
+-static bool io_worker_get(struct io_worker *worker)
+-{
+- return refcount_inc_not_zero(&worker->ref);
+-}
+-
+-static void io_worker_release(struct io_worker *worker)
+-{
+- if (refcount_dec_and_test(&worker->ref))
+- complete(&worker->ref_done);
+-}
+-
+-static inline struct io_wqe_acct *io_get_acct(struct io_wqe *wqe, bool bound)
+-{
+- return &wqe->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND];
+-}
+-
+-static inline struct io_wqe_acct *io_work_get_acct(struct io_wqe *wqe,
+- struct io_wq_work *work)
+-{
+- return io_get_acct(wqe, !(work->flags & IO_WQ_WORK_UNBOUND));
+-}
+-
+-static inline struct io_wqe_acct *io_wqe_get_acct(struct io_worker *worker)
+-{
+- return io_get_acct(worker->wqe, worker->flags & IO_WORKER_F_BOUND);
+-}
+-
+-static void io_worker_ref_put(struct io_wq *wq)
+-{
+- if (atomic_dec_and_test(&wq->worker_refs))
+- complete(&wq->worker_done);
+-}
+-
+-static void io_worker_cancel_cb(struct io_worker *worker)
+-{
+- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+- struct io_wqe *wqe = worker->wqe;
+- struct io_wq *wq = wqe->wq;
+-
+- atomic_dec(&acct->nr_running);
+- raw_spin_lock(&worker->wqe->lock);
+- acct->nr_workers--;
+- raw_spin_unlock(&worker->wqe->lock);
+- io_worker_ref_put(wq);
+- clear_bit_unlock(0, &worker->create_state);
+- io_worker_release(worker);
+-}
+-
+-static bool io_task_worker_match(struct callback_head *cb, void *data)
+-{
+- struct io_worker *worker;
+-
+- if (cb->func != create_worker_cb)
+- return false;
+- worker = container_of(cb, struct io_worker, create_work);
+- return worker == data;
+-}
+-
+-static void io_worker_exit(struct io_worker *worker)
+-{
+- struct io_wqe *wqe = worker->wqe;
+- struct io_wq *wq = wqe->wq;
+-
+- while (1) {
+- struct callback_head *cb = task_work_cancel_match(wq->task,
+- io_task_worker_match, worker);
+-
+- if (!cb)
+- break;
+- io_worker_cancel_cb(worker);
+- }
+-
+- io_worker_release(worker);
+- wait_for_completion(&worker->ref_done);
+-
+- raw_spin_lock(&wqe->lock);
+- if (worker->flags & IO_WORKER_F_FREE)
+- hlist_nulls_del_rcu(&worker->nulls_node);
+- list_del_rcu(&worker->all_list);
+- raw_spin_unlock(&wqe->lock);
+- io_wqe_dec_running(worker);
+- worker->flags = 0;
+- preempt_disable();
+- current->flags &= ~PF_IO_WORKER;
+- preempt_enable();
+-
+- kfree_rcu(worker, rcu);
+- io_worker_ref_put(wqe->wq);
+- do_exit(0);
+-}
+-
+-static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
+-{
+- bool ret = false;
+-
+- raw_spin_lock(&acct->lock);
+- if (!wq_list_empty(&acct->work_list) &&
+- !test_bit(IO_ACCT_STALLED_BIT, &acct->flags))
+- ret = true;
+- raw_spin_unlock(&acct->lock);
+-
+- return ret;
+-}
+-
+-/*
+- * Check head of free list for an available worker. If one isn't available,
+- * caller must create one.
+- */
+-static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
+- struct io_wqe_acct *acct)
+- __must_hold(RCU)
+-{
+- struct hlist_nulls_node *n;
+- struct io_worker *worker;
+-
+- /*
+- * Iterate free_list and see if we can find an idle worker to
+- * activate. If a given worker is on the free_list but in the process
+- * of exiting, keep trying.
+- */
+- hlist_nulls_for_each_entry_rcu(worker, n, &wqe->free_list, nulls_node) {
+- if (!io_worker_get(worker))
+- continue;
+- if (io_wqe_get_acct(worker) != acct) {
+- io_worker_release(worker);
+- continue;
+- }
+- if (wake_up_process(worker->task)) {
+- io_worker_release(worker);
+- return true;
+- }
+- io_worker_release(worker);
+- }
+-
+- return false;
+-}
+-
+-/*
+- * We need a worker. If we find a free one, we're good. If not, and we're
+- * below the max number of workers, create one.
+- */
+-static bool io_wqe_create_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
+-{
+- /*
+- * Most likely an attempt to queue unbounded work on an io_wq that
+- * wasn't setup with any unbounded workers.
+- */
+- if (unlikely(!acct->max_workers))
+- pr_warn_once("io-wq is not configured for unbound workers");
+-
+- raw_spin_lock(&wqe->lock);
+- if (acct->nr_workers >= acct->max_workers) {
+- raw_spin_unlock(&wqe->lock);
+- return true;
+- }
+- acct->nr_workers++;
+- raw_spin_unlock(&wqe->lock);
+- atomic_inc(&acct->nr_running);
+- atomic_inc(&wqe->wq->worker_refs);
+- return create_io_worker(wqe->wq, wqe, acct->index);
+-}
+-
+-static void io_wqe_inc_running(struct io_worker *worker)
+-{
+- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+-
+- atomic_inc(&acct->nr_running);
+-}
+-
+-static void create_worker_cb(struct callback_head *cb)
+-{
+- struct io_worker *worker;
+- struct io_wq *wq;
+- struct io_wqe *wqe;
+- struct io_wqe_acct *acct;
+- bool do_create = false;
+-
+- worker = container_of(cb, struct io_worker, create_work);
+- wqe = worker->wqe;
+- wq = wqe->wq;
+- acct = &wqe->acct[worker->create_index];
+- raw_spin_lock(&wqe->lock);
+- if (acct->nr_workers < acct->max_workers) {
+- acct->nr_workers++;
+- do_create = true;
+- }
+- raw_spin_unlock(&wqe->lock);
+- if (do_create) {
+- create_io_worker(wq, wqe, worker->create_index);
+- } else {
+- atomic_dec(&acct->nr_running);
+- io_worker_ref_put(wq);
+- }
+- clear_bit_unlock(0, &worker->create_state);
+- io_worker_release(worker);
+-}
+-
+-static bool io_queue_worker_create(struct io_worker *worker,
+- struct io_wqe_acct *acct,
+- task_work_func_t func)
+-{
+- struct io_wqe *wqe = worker->wqe;
+- struct io_wq *wq = wqe->wq;
+-
+- /* raced with exit, just ignore create call */
+- if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
+- goto fail;
+- if (!io_worker_get(worker))
+- goto fail;
+- /*
+- * create_state manages ownership of create_work/index. We should
+- * only need one entry per worker, as the worker going to sleep
+- * will trigger the condition, and waking will clear it once it
+- * runs the task_work.
+- */
+- if (test_bit(0, &worker->create_state) ||
+- test_and_set_bit_lock(0, &worker->create_state))
+- goto fail_release;
+-
+- atomic_inc(&wq->worker_refs);
+- init_task_work(&worker->create_work, func);
+- worker->create_index = acct->index;
+- if (!task_work_add(wq->task, &worker->create_work, TWA_SIGNAL)) {
+- /*
+- * EXIT may have been set after checking it above, check after
+- * adding the task_work and remove any creation item if it is
+- * now set. wq exit does that too, but we can have added this
+- * work item after we canceled in io_wq_exit_workers().
+- */
+- if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
+- io_wq_cancel_tw_create(wq);
+- io_worker_ref_put(wq);
+- return true;
+- }
+- io_worker_ref_put(wq);
+- clear_bit_unlock(0, &worker->create_state);
+-fail_release:
+- io_worker_release(worker);
+-fail:
+- atomic_dec(&acct->nr_running);
+- io_worker_ref_put(wq);
+- return false;
+-}
+-
+-static void io_wqe_dec_running(struct io_worker *worker)
+-{
+- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+- struct io_wqe *wqe = worker->wqe;
+-
+- if (!(worker->flags & IO_WORKER_F_UP))
+- return;
+-
+- if (!atomic_dec_and_test(&acct->nr_running))
+- return;
+- if (!io_acct_run_queue(acct))
+- return;
+-
+- atomic_inc(&acct->nr_running);
+- atomic_inc(&wqe->wq->worker_refs);
+- io_queue_worker_create(worker, acct, create_worker_cb);
+-}
+-
+-/*
+- * Worker will start processing some work. Move it to the busy list, if
+- * it's currently on the freelist
+- */
+-static void __io_worker_busy(struct io_wqe *wqe, struct io_worker *worker)
+-{
+- if (worker->flags & IO_WORKER_F_FREE) {
+- worker->flags &= ~IO_WORKER_F_FREE;
+- raw_spin_lock(&wqe->lock);
+- hlist_nulls_del_init_rcu(&worker->nulls_node);
+- raw_spin_unlock(&wqe->lock);
+- }
+-}
+-
+-/*
+- * No work, worker going to sleep. Move to freelist, and unuse mm if we
+- * have one attached. Dropping the mm may potentially sleep, so we drop
+- * the lock in that case and return success. Since the caller has to
+- * retry the loop in that case (we changed task state), we don't regrab
+- * the lock if we return success.
+- */
+-static void __io_worker_idle(struct io_wqe *wqe, struct io_worker *worker)
+- __must_hold(wqe->lock)
+-{
+- if (!(worker->flags & IO_WORKER_F_FREE)) {
+- worker->flags |= IO_WORKER_F_FREE;
+- hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
+- }
+-}
+-
+-static inline unsigned int io_get_work_hash(struct io_wq_work *work)
+-{
+- return work->flags >> IO_WQ_HASH_SHIFT;
+-}
+-
+-static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
+-{
+- struct io_wq *wq = wqe->wq;
+- bool ret = false;
+-
+- spin_lock_irq(&wq->hash->wait.lock);
+- if (list_empty(&wqe->wait.entry)) {
+- __add_wait_queue(&wq->hash->wait, &wqe->wait);
+- if (!test_bit(hash, &wq->hash->map)) {
+- __set_current_state(TASK_RUNNING);
+- list_del_init(&wqe->wait.entry);
+- ret = true;
+- }
+- }
+- spin_unlock_irq(&wq->hash->wait.lock);
+- return ret;
+-}
+-
+-static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
+- struct io_worker *worker)
+- __must_hold(acct->lock)
+-{
+- struct io_wq_work_node *node, *prev;
+- struct io_wq_work *work, *tail;
+- unsigned int stall_hash = -1U;
+- struct io_wqe *wqe = worker->wqe;
+-
+- wq_list_for_each(node, prev, &acct->work_list) {
+- unsigned int hash;
+-
+- work = container_of(node, struct io_wq_work, list);
+-
+- /* not hashed, can run anytime */
+- if (!io_wq_is_hashed(work)) {
+- wq_list_del(&acct->work_list, node, prev);
+- return work;
+- }
+-
+- hash = io_get_work_hash(work);
+- /* all items with this hash lie in [work, tail] */
+- tail = wqe->hash_tail[hash];
+-
+- /* hashed, can run if not already running */
+- if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
+- wqe->hash_tail[hash] = NULL;
+- wq_list_cut(&acct->work_list, &tail->list, prev);
+- return work;
+- }
+- if (stall_hash == -1U)
+- stall_hash = hash;
+- /* fast forward to a next hash, for-each will fix up @prev */
+- node = &tail->list;
+- }
+-
+- if (stall_hash != -1U) {
+- bool unstalled;
+-
+- /*
+- * Set this before dropping the lock to avoid racing with new
+- * work being added and clearing the stalled bit.
+- */
+- set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+- raw_spin_unlock(&acct->lock);
+- unstalled = io_wait_on_hash(wqe, stall_hash);
+- raw_spin_lock(&acct->lock);
+- if (unstalled) {
+- clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+- if (wq_has_sleeper(&wqe->wq->hash->wait))
+- wake_up(&wqe->wq->hash->wait);
+- }
+- }
+-
+- return NULL;
+-}
+-
+-static bool io_flush_signals(void)
+-{
+- if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) {
+- __set_current_state(TASK_RUNNING);
+- clear_notify_signal();
+- if (task_work_pending(current))
+- task_work_run();
+- return true;
+- }
+- return false;
+-}
+-
+-static void io_assign_current_work(struct io_worker *worker,
+- struct io_wq_work *work)
+-{
+- if (work) {
+- io_flush_signals();
+- cond_resched();
+- }
+-
+- raw_spin_lock(&worker->lock);
+- worker->cur_work = work;
+- worker->next_work = NULL;
+- raw_spin_unlock(&worker->lock);
+-}
+-
+-static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
+-
+-static void io_worker_handle_work(struct io_worker *worker)
+-{
+- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+- struct io_wqe *wqe = worker->wqe;
+- struct io_wq *wq = wqe->wq;
+- bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
+-
+- do {
+- struct io_wq_work *work;
+-
+- /*
+- * If we got some work, mark us as busy. If we didn't, but
+- * the list isn't empty, it means we stalled on hashed work.
+- * Mark us stalled so we don't keep looking for work when we
+- * can't make progress, any work completion or insertion will
+- * clear the stalled flag.
+- */
+- raw_spin_lock(&acct->lock);
+- work = io_get_next_work(acct, worker);
+- raw_spin_unlock(&acct->lock);
+- if (work) {
+- __io_worker_busy(wqe, worker);
+-
+- /*
+- * Make sure cancelation can find this, even before
+- * it becomes the active work. That avoids a window
+- * where the work has been removed from our general
+- * work list, but isn't yet discoverable as the
+- * current work item for this worker.
+- */
+- raw_spin_lock(&worker->lock);
+- worker->next_work = work;
+- raw_spin_unlock(&worker->lock);
+- } else {
+- break;
+- }
+- io_assign_current_work(worker, work);
+- __set_current_state(TASK_RUNNING);
+-
+- /* handle a whole dependent link */
+- do {
+- struct io_wq_work *next_hashed, *linked;
+- unsigned int hash = io_get_work_hash(work);
+-
+- next_hashed = wq_next_work(work);
+-
+- if (unlikely(do_kill) && (work->flags & IO_WQ_WORK_UNBOUND))
+- work->flags |= IO_WQ_WORK_CANCEL;
+- wq->do_work(work);
+- io_assign_current_work(worker, NULL);
+-
+- linked = wq->free_work(work);
+- work = next_hashed;
+- if (!work && linked && !io_wq_is_hashed(linked)) {
+- work = linked;
+- linked = NULL;
+- }
+- io_assign_current_work(worker, work);
+- if (linked)
+- io_wqe_enqueue(wqe, linked);
+-
+- if (hash != -1U && !next_hashed) {
+- /* serialize hash clear with wake_up() */
+- spin_lock_irq(&wq->hash->wait.lock);
+- clear_bit(hash, &wq->hash->map);
+- clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+- spin_unlock_irq(&wq->hash->wait.lock);
+- if (wq_has_sleeper(&wq->hash->wait))
+- wake_up(&wq->hash->wait);
+- }
+- } while (work);
+- } while (1);
+-}
+-
+-static int io_wqe_worker(void *data)
+-{
+- struct io_worker *worker = data;
+- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+- struct io_wqe *wqe = worker->wqe;
+- struct io_wq *wq = wqe->wq;
+- bool last_timeout = false;
+- char buf[TASK_COMM_LEN];
+-
+- worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
+-
+- snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid);
+- set_task_comm(current, buf);
+-
+- audit_alloc_kernel(current);
+-
+- while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
+- long ret;
+-
+- set_current_state(TASK_INTERRUPTIBLE);
+- while (io_acct_run_queue(acct))
+- io_worker_handle_work(worker);
+-
+- raw_spin_lock(&wqe->lock);
+- /* timed out, exit unless we're the last worker */
+- if (last_timeout && acct->nr_workers > 1) {
+- acct->nr_workers--;
+- raw_spin_unlock(&wqe->lock);
+- __set_current_state(TASK_RUNNING);
+- break;
+- }
+- last_timeout = false;
+- __io_worker_idle(wqe, worker);
+- raw_spin_unlock(&wqe->lock);
+- if (io_flush_signals())
+- continue;
+- ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
+- if (signal_pending(current)) {
+- struct ksignal ksig;
+-
+- if (!get_signal(&ksig))
+- continue;
+- break;
+- }
+- last_timeout = !ret;
+- }
+-
+- if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
+- io_worker_handle_work(worker);
+-
+- audit_free(current);
+- io_worker_exit(worker);
+- return 0;
+-}
+-
+-/*
+- * Called when a worker is scheduled in. Mark us as currently running.
+- */
+-void io_wq_worker_running(struct task_struct *tsk)
+-{
+- struct io_worker *worker = tsk->worker_private;
+-
+- if (!worker)
+- return;
+- if (!(worker->flags & IO_WORKER_F_UP))
+- return;
+- if (worker->flags & IO_WORKER_F_RUNNING)
+- return;
+- worker->flags |= IO_WORKER_F_RUNNING;
+- io_wqe_inc_running(worker);
+-}
+-
+-/*
+- * Called when worker is going to sleep. If there are no workers currently
+- * running and we have work pending, wake up a free one or create a new one.
+- */
+-void io_wq_worker_sleeping(struct task_struct *tsk)
+-{
+- struct io_worker *worker = tsk->worker_private;
+-
+- if (!worker)
+- return;
+- if (!(worker->flags & IO_WORKER_F_UP))
+- return;
+- if (!(worker->flags & IO_WORKER_F_RUNNING))
+- return;
+-
+- worker->flags &= ~IO_WORKER_F_RUNNING;
+- io_wqe_dec_running(worker);
+-}
+-
+-static void io_init_new_worker(struct io_wqe *wqe, struct io_worker *worker,
+- struct task_struct *tsk)
+-{
+- tsk->worker_private = worker;
+- worker->task = tsk;
+- set_cpus_allowed_ptr(tsk, wqe->cpu_mask);
+- tsk->flags |= PF_NO_SETAFFINITY;
+-
+- raw_spin_lock(&wqe->lock);
+- hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
+- list_add_tail_rcu(&worker->all_list, &wqe->all_list);
+- worker->flags |= IO_WORKER_F_FREE;
+- raw_spin_unlock(&wqe->lock);
+- wake_up_new_task(tsk);
+-}
+-
+-static bool io_wq_work_match_all(struct io_wq_work *work, void *data)
+-{
+- return true;
+-}
+-
+-static inline bool io_should_retry_thread(long err)
+-{
+- /*
+- * Prevent perpetual task_work retry, if the task (or its group) is
+- * exiting.
+- */
+- if (fatal_signal_pending(current))
+- return false;
+-
+- switch (err) {
+- case -EAGAIN:
+- case -ERESTARTSYS:
+- case -ERESTARTNOINTR:
+- case -ERESTARTNOHAND:
+- return true;
+- default:
+- return false;
+- }
+-}
+-
+-static void create_worker_cont(struct callback_head *cb)
+-{
+- struct io_worker *worker;
+- struct task_struct *tsk;
+- struct io_wqe *wqe;
+-
+- worker = container_of(cb, struct io_worker, create_work);
+- clear_bit_unlock(0, &worker->create_state);
+- wqe = worker->wqe;
+- tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
+- if (!IS_ERR(tsk)) {
+- io_init_new_worker(wqe, worker, tsk);
+- io_worker_release(worker);
+- return;
+- } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+-
+- atomic_dec(&acct->nr_running);
+- raw_spin_lock(&wqe->lock);
+- acct->nr_workers--;
+- if (!acct->nr_workers) {
+- struct io_cb_cancel_data match = {
+- .fn = io_wq_work_match_all,
+- .cancel_all = true,
+- };
+-
+- raw_spin_unlock(&wqe->lock);
+- while (io_acct_cancel_pending_work(wqe, acct, &match))
+- ;
+- } else {
+- raw_spin_unlock(&wqe->lock);
+- }
+- io_worker_ref_put(wqe->wq);
+- kfree(worker);
+- return;
+- }
+-
+- /* re-create attempts grab a new worker ref, drop the existing one */
+- io_worker_release(worker);
+- schedule_work(&worker->work);
+-}
+-
+-static void io_workqueue_create(struct work_struct *work)
+-{
+- struct io_worker *worker = container_of(work, struct io_worker, work);
+- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+-
+- if (!io_queue_worker_create(worker, acct, create_worker_cont))
+- kfree(worker);
+-}
+-
+-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
+-{
+- struct io_wqe_acct *acct = &wqe->acct[index];
+- struct io_worker *worker;
+- struct task_struct *tsk;
+-
+- __set_current_state(TASK_RUNNING);
+-
+- worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, wqe->node);
+- if (!worker) {
+-fail:
+- atomic_dec(&acct->nr_running);
+- raw_spin_lock(&wqe->lock);
+- acct->nr_workers--;
+- raw_spin_unlock(&wqe->lock);
+- io_worker_ref_put(wq);
+- return false;
+- }
+-
+- refcount_set(&worker->ref, 1);
+- worker->wqe = wqe;
+- raw_spin_lock_init(&worker->lock);
+- init_completion(&worker->ref_done);
+-
+- if (index == IO_WQ_ACCT_BOUND)
+- worker->flags |= IO_WORKER_F_BOUND;
+-
+- tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
+- if (!IS_ERR(tsk)) {
+- io_init_new_worker(wqe, worker, tsk);
+- } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+- kfree(worker);
+- goto fail;
+- } else {
+- INIT_WORK(&worker->work, io_workqueue_create);
+- schedule_work(&worker->work);
+- }
+-
+- return true;
+-}
+-
+-/*
+- * Iterate the passed in list and call the specific function for each
+- * worker that isn't exiting
+- */
+-static bool io_wq_for_each_worker(struct io_wqe *wqe,
+- bool (*func)(struct io_worker *, void *),
+- void *data)
+-{
+- struct io_worker *worker;
+- bool ret = false;
+-
+- list_for_each_entry_rcu(worker, &wqe->all_list, all_list) {
+- if (io_worker_get(worker)) {
+- /* no task if node is/was offline */
+- if (worker->task)
+- ret = func(worker, data);
+- io_worker_release(worker);
+- if (ret)
+- break;
+- }
+- }
+-
+- return ret;
+-}
+-
+-static bool io_wq_worker_wake(struct io_worker *worker, void *data)
+-{
+- set_notify_signal(worker->task);
+- wake_up_process(worker->task);
+- return false;
+-}
+-
+-static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
+-{
+- struct io_wq *wq = wqe->wq;
+-
+- do {
+- work->flags |= IO_WQ_WORK_CANCEL;
+- wq->do_work(work);
+- work = wq->free_work(work);
+- } while (work);
+-}
+-
+-static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
+-{
+- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+- unsigned int hash;
+- struct io_wq_work *tail;
+-
+- if (!io_wq_is_hashed(work)) {
+-append:
+- wq_list_add_tail(&work->list, &acct->work_list);
+- return;
+- }
+-
+- hash = io_get_work_hash(work);
+- tail = wqe->hash_tail[hash];
+- wqe->hash_tail[hash] = work;
+- if (!tail)
+- goto append;
+-
+- wq_list_add_after(&work->list, &tail->list, &acct->work_list);
+-}
+-
+-static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
+-{
+- return work == data;
+-}
+-
+-static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
+-{
+- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+- struct io_cb_cancel_data match;
+- unsigned work_flags = work->flags;
+- bool do_create;
+-
+- /*
+- * If io-wq is exiting for this task, or if the request has explicitly
+- * been marked as one that should not get executed, cancel it here.
+- */
+- if (test_bit(IO_WQ_BIT_EXIT, &wqe->wq->state) ||
+- (work->flags & IO_WQ_WORK_CANCEL)) {
+- io_run_cancel(work, wqe);
+- return;
+- }
+-
+- raw_spin_lock(&acct->lock);
+- io_wqe_insert_work(wqe, work);
+- clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+- raw_spin_unlock(&acct->lock);
+-
+- raw_spin_lock(&wqe->lock);
+- rcu_read_lock();
+- do_create = !io_wqe_activate_free_worker(wqe, acct);
+- rcu_read_unlock();
+-
+- raw_spin_unlock(&wqe->lock);
+-
+- if (do_create && ((work_flags & IO_WQ_WORK_CONCURRENT) ||
+- !atomic_read(&acct->nr_running))) {
+- bool did_create;
+-
+- did_create = io_wqe_create_worker(wqe, acct);
+- if (likely(did_create))
+- return;
+-
+- raw_spin_lock(&wqe->lock);
+- if (acct->nr_workers) {
+- raw_spin_unlock(&wqe->lock);
+- return;
+- }
+- raw_spin_unlock(&wqe->lock);
+-
+- /* fatal condition, failed to create the first worker */
+- match.fn = io_wq_work_match_item,
+- match.data = work,
+- match.cancel_all = false,
+-
+- io_acct_cancel_pending_work(wqe, acct, &match);
+- }
+-}
+-
+-void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
+-{
+- struct io_wqe *wqe = wq->wqes[numa_node_id()];
+-
+- io_wqe_enqueue(wqe, work);
+-}
+-
+-/*
+- * Work items that hash to the same value will not be done in parallel.
+- * Used to limit concurrent writes, generally hashed by inode.
+- */
+-void io_wq_hash_work(struct io_wq_work *work, void *val)
+-{
+- unsigned int bit;
+-
+- bit = hash_ptr(val, IO_WQ_HASH_ORDER);
+- work->flags |= (IO_WQ_WORK_HASHED | (bit << IO_WQ_HASH_SHIFT));
+-}
+-
+-static bool __io_wq_worker_cancel(struct io_worker *worker,
+- struct io_cb_cancel_data *match,
+- struct io_wq_work *work)
+-{
+- if (work && match->fn(work, match->data)) {
+- work->flags |= IO_WQ_WORK_CANCEL;
+- set_notify_signal(worker->task);
+- return true;
+- }
+-
+- return false;
+-}
+-
+-static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
+-{
+- struct io_cb_cancel_data *match = data;
+-
+- /*
+- * Hold the lock to avoid ->cur_work going out of scope, caller
+- * may dereference the passed in work.
+- */
+- raw_spin_lock(&worker->lock);
+- if (__io_wq_worker_cancel(worker, match, worker->cur_work) ||
+- __io_wq_worker_cancel(worker, match, worker->next_work))
+- match->nr_running++;
+- raw_spin_unlock(&worker->lock);
+-
+- return match->nr_running && !match->cancel_all;
+-}
+-
+-static inline void io_wqe_remove_pending(struct io_wqe *wqe,
+- struct io_wq_work *work,
+- struct io_wq_work_node *prev)
+-{
+- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+- unsigned int hash = io_get_work_hash(work);
+- struct io_wq_work *prev_work = NULL;
+-
+- if (io_wq_is_hashed(work) && work == wqe->hash_tail[hash]) {
+- if (prev)
+- prev_work = container_of(prev, struct io_wq_work, list);
+- if (prev_work && io_get_work_hash(prev_work) == hash)
+- wqe->hash_tail[hash] = prev_work;
+- else
+- wqe->hash_tail[hash] = NULL;
+- }
+- wq_list_del(&acct->work_list, &work->list, prev);
+-}
+-
+-static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
+- struct io_wqe_acct *acct,
+- struct io_cb_cancel_data *match)
+-{
+- struct io_wq_work_node *node, *prev;
+- struct io_wq_work *work;
+-
+- raw_spin_lock(&acct->lock);
+- wq_list_for_each(node, prev, &acct->work_list) {
+- work = container_of(node, struct io_wq_work, list);
+- if (!match->fn(work, match->data))
+- continue;
+- io_wqe_remove_pending(wqe, work, prev);
+- raw_spin_unlock(&acct->lock);
+- io_run_cancel(work, wqe);
+- match->nr_pending++;
+- /* not safe to continue after unlock */
+- return true;
+- }
+- raw_spin_unlock(&acct->lock);
+-
+- return false;
+-}
+-
+-static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
+- struct io_cb_cancel_data *match)
+-{
+- int i;
+-retry:
+- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+- struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
+-
+- if (io_acct_cancel_pending_work(wqe, acct, match)) {
+- if (match->cancel_all)
+- goto retry;
+- break;
+- }
+- }
+-}
+-
+-static void io_wqe_cancel_running_work(struct io_wqe *wqe,
+- struct io_cb_cancel_data *match)
+-{
+- rcu_read_lock();
+- io_wq_for_each_worker(wqe, io_wq_worker_cancel, match);
+- rcu_read_unlock();
+-}
+-
+-enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
+- void *data, bool cancel_all)
+-{
+- struct io_cb_cancel_data match = {
+- .fn = cancel,
+- .data = data,
+- .cancel_all = cancel_all,
+- };
+- int node;
+-
+- /*
+- * First check pending list, if we're lucky we can just remove it
+- * from there. CANCEL_OK means that the work is returned as-new,
+- * no completion will be posted for it.
+- *
+- * Then check if a free (going busy) or busy worker has the work
+- * currently running. If we find it there, we'll return CANCEL_RUNNING
+- * as an indication that we attempt to signal cancellation. The
+- * completion will run normally in this case.
+- *
+- * Do both of these while holding the wqe->lock, to ensure that
+- * we'll find a work item regardless of state.
+- */
+- for_each_node(node) {
+- struct io_wqe *wqe = wq->wqes[node];
+-
+- io_wqe_cancel_pending_work(wqe, &match);
+- if (match.nr_pending && !match.cancel_all)
+- return IO_WQ_CANCEL_OK;
+-
+- raw_spin_lock(&wqe->lock);
+- io_wqe_cancel_running_work(wqe, &match);
+- raw_spin_unlock(&wqe->lock);
+- if (match.nr_running && !match.cancel_all)
+- return IO_WQ_CANCEL_RUNNING;
+- }
+-
+- if (match.nr_running)
+- return IO_WQ_CANCEL_RUNNING;
+- if (match.nr_pending)
+- return IO_WQ_CANCEL_OK;
+- return IO_WQ_CANCEL_NOTFOUND;
+-}
+-
+-static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
+- int sync, void *key)
+-{
+- struct io_wqe *wqe = container_of(wait, struct io_wqe, wait);
+- int i;
+-
+- list_del_init(&wait->entry);
+-
+- rcu_read_lock();
+- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+- struct io_wqe_acct *acct = &wqe->acct[i];
+-
+- if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
+- io_wqe_activate_free_worker(wqe, acct);
+- }
+- rcu_read_unlock();
+- return 1;
+-}
+-
+-struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
+-{
+- int ret, node, i;
+- struct io_wq *wq;
+-
+- if (WARN_ON_ONCE(!data->free_work || !data->do_work))
+- return ERR_PTR(-EINVAL);
+- if (WARN_ON_ONCE(!bounded))
+- return ERR_PTR(-EINVAL);
+-
+- wq = kzalloc(struct_size(wq, wqes, nr_node_ids), GFP_KERNEL);
+- if (!wq)
+- return ERR_PTR(-ENOMEM);
+- ret = cpuhp_state_add_instance_nocalls(io_wq_online, &wq->cpuhp_node);
+- if (ret)
+- goto err_wq;
+-
+- refcount_inc(&data->hash->refs);
+- wq->hash = data->hash;
+- wq->free_work = data->free_work;
+- wq->do_work = data->do_work;
+-
+- ret = -ENOMEM;
+- for_each_node(node) {
+- struct io_wqe *wqe;
+- int alloc_node = node;
+-
+- if (!node_online(alloc_node))
+- alloc_node = NUMA_NO_NODE;
+- wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, alloc_node);
+- if (!wqe)
+- goto err;
+- if (!alloc_cpumask_var(&wqe->cpu_mask, GFP_KERNEL))
+- goto err;
+- cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
+- wq->wqes[node] = wqe;
+- wqe->node = alloc_node;
+- wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
+- wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
+- task_rlimit(current, RLIMIT_NPROC);
+- INIT_LIST_HEAD(&wqe->wait.entry);
+- wqe->wait.func = io_wqe_hash_wake;
+- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+- struct io_wqe_acct *acct = &wqe->acct[i];
+-
+- acct->index = i;
+- atomic_set(&acct->nr_running, 0);
+- INIT_WQ_LIST(&acct->work_list);
+- raw_spin_lock_init(&acct->lock);
+- }
+- wqe->wq = wq;
+- raw_spin_lock_init(&wqe->lock);
+- INIT_HLIST_NULLS_HEAD(&wqe->free_list, 0);
+- INIT_LIST_HEAD(&wqe->all_list);
+- }
+-
+- wq->task = get_task_struct(data->task);
+- atomic_set(&wq->worker_refs, 1);
+- init_completion(&wq->worker_done);
+- return wq;
+-err:
+- io_wq_put_hash(data->hash);
+- cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
+- for_each_node(node) {
+- if (!wq->wqes[node])
+- continue;
+- free_cpumask_var(wq->wqes[node]->cpu_mask);
+- kfree(wq->wqes[node]);
+- }
+-err_wq:
+- kfree(wq);
+- return ERR_PTR(ret);
+-}
+-
+-static bool io_task_work_match(struct callback_head *cb, void *data)
+-{
+- struct io_worker *worker;
+-
+- if (cb->func != create_worker_cb && cb->func != create_worker_cont)
+- return false;
+- worker = container_of(cb, struct io_worker, create_work);
+- return worker->wqe->wq == data;
+-}
+-
+-void io_wq_exit_start(struct io_wq *wq)
+-{
+- set_bit(IO_WQ_BIT_EXIT, &wq->state);
+-}
+-
+-static void io_wq_cancel_tw_create(struct io_wq *wq)
+-{
+- struct callback_head *cb;
+-
+- while ((cb = task_work_cancel_match(wq->task, io_task_work_match, wq)) != NULL) {
+- struct io_worker *worker;
+-
+- worker = container_of(cb, struct io_worker, create_work);
+- io_worker_cancel_cb(worker);
+- }
+-}
+-
+-static void io_wq_exit_workers(struct io_wq *wq)
+-{
+- int node;
+-
+- if (!wq->task)
+- return;
+-
+- io_wq_cancel_tw_create(wq);
+-
+- rcu_read_lock();
+- for_each_node(node) {
+- struct io_wqe *wqe = wq->wqes[node];
+-
+- io_wq_for_each_worker(wqe, io_wq_worker_wake, NULL);
+- }
+- rcu_read_unlock();
+- io_worker_ref_put(wq);
+- wait_for_completion(&wq->worker_done);
+-
+- for_each_node(node) {
+- spin_lock_irq(&wq->hash->wait.lock);
+- list_del_init(&wq->wqes[node]->wait.entry);
+- spin_unlock_irq(&wq->hash->wait.lock);
+- }
+- put_task_struct(wq->task);
+- wq->task = NULL;
+-}
+-
+-static void io_wq_destroy(struct io_wq *wq)
+-{
+- int node;
+-
+- cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
+-
+- for_each_node(node) {
+- struct io_wqe *wqe = wq->wqes[node];
+- struct io_cb_cancel_data match = {
+- .fn = io_wq_work_match_all,
+- .cancel_all = true,
+- };
+- io_wqe_cancel_pending_work(wqe, &match);
+- free_cpumask_var(wqe->cpu_mask);
+- kfree(wqe);
+- }
+- io_wq_put_hash(wq->hash);
+- kfree(wq);
+-}
+-
+-void io_wq_put_and_exit(struct io_wq *wq)
+-{
+- WARN_ON_ONCE(!test_bit(IO_WQ_BIT_EXIT, &wq->state));
+-
+- io_wq_exit_workers(wq);
+- io_wq_destroy(wq);
+-}
+-
+-struct online_data {
+- unsigned int cpu;
+- bool online;
+-};
+-
+-static bool io_wq_worker_affinity(struct io_worker *worker, void *data)
+-{
+- struct online_data *od = data;
+-
+- if (od->online)
+- cpumask_set_cpu(od->cpu, worker->wqe->cpu_mask);
+- else
+- cpumask_clear_cpu(od->cpu, worker->wqe->cpu_mask);
+- return false;
+-}
+-
+-static int __io_wq_cpu_online(struct io_wq *wq, unsigned int cpu, bool online)
+-{
+- struct online_data od = {
+- .cpu = cpu,
+- .online = online
+- };
+- int i;
+-
+- rcu_read_lock();
+- for_each_node(i)
+- io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, &od);
+- rcu_read_unlock();
+- return 0;
+-}
+-
+-static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node)
+-{
+- struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
+-
+- return __io_wq_cpu_online(wq, cpu, true);
+-}
+-
+-static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node)
+-{
+- struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
+-
+- return __io_wq_cpu_online(wq, cpu, false);
+-}
+-
+-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
+-{
+- int i;
+-
+- rcu_read_lock();
+- for_each_node(i) {
+- struct io_wqe *wqe = wq->wqes[i];
+-
+- if (mask)
+- cpumask_copy(wqe->cpu_mask, mask);
+- else
+- cpumask_copy(wqe->cpu_mask, cpumask_of_node(i));
+- }
+- rcu_read_unlock();
+- return 0;
+-}
+-
+-/*
+- * Set max number of unbounded workers, returns old value. If new_count is 0,
+- * then just return the old value.
+- */
+-int io_wq_max_workers(struct io_wq *wq, int *new_count)
+-{
+- int prev[IO_WQ_ACCT_NR];
+- bool first_node = true;
+- int i, node;
+-
+- BUILD_BUG_ON((int) IO_WQ_ACCT_BOUND != (int) IO_WQ_BOUND);
+- BUILD_BUG_ON((int) IO_WQ_ACCT_UNBOUND != (int) IO_WQ_UNBOUND);
+- BUILD_BUG_ON((int) IO_WQ_ACCT_NR != 2);
+-
+- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+- if (new_count[i] > task_rlimit(current, RLIMIT_NPROC))
+- new_count[i] = task_rlimit(current, RLIMIT_NPROC);
+- }
+-
+- for (i = 0; i < IO_WQ_ACCT_NR; i++)
+- prev[i] = 0;
+-
+- rcu_read_lock();
+- for_each_node(node) {
+- struct io_wqe *wqe = wq->wqes[node];
+- struct io_wqe_acct *acct;
+-
+- raw_spin_lock(&wqe->lock);
+- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+- acct = &wqe->acct[i];
+- if (first_node)
+- prev[i] = max_t(int, acct->max_workers, prev[i]);
+- if (new_count[i])
+- acct->max_workers = new_count[i];
+- }
+- raw_spin_unlock(&wqe->lock);
+- first_node = false;
+- }
+- rcu_read_unlock();
+-
+- for (i = 0; i < IO_WQ_ACCT_NR; i++)
+- new_count[i] = prev[i];
+-
+- return 0;
+-}
+-
+-static __init int io_wq_init(void)
+-{
+- int ret;
+-
+- ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "io-wq/online",
+- io_wq_cpu_online, io_wq_cpu_offline);
+- if (ret < 0)
+- return ret;
+- io_wq_online = ret;
+- return 0;
+-}
+-subsys_initcall(io_wq_init);
+diff --git a/fs/io-wq.h b/fs/io-wq.h
+deleted file mode 100644
+index dbecd27656c7c..0000000000000
+--- a/fs/io-wq.h
++++ /dev/null
+@@ -1,227 +0,0 @@
+-#ifndef INTERNAL_IO_WQ_H
+-#define INTERNAL_IO_WQ_H
+-
+-#include <linux/refcount.h>
+-
+-struct io_wq;
+-
+-enum {
+- IO_WQ_WORK_CANCEL = 1,
+- IO_WQ_WORK_HASHED = 2,
+- IO_WQ_WORK_UNBOUND = 4,
+- IO_WQ_WORK_CONCURRENT = 16,
+-
+- IO_WQ_HASH_SHIFT = 24, /* upper 8 bits are used for hash key */
+-};
+-
+-enum io_wq_cancel {
+- IO_WQ_CANCEL_OK, /* cancelled before started */
+- IO_WQ_CANCEL_RUNNING, /* found, running, and attempted cancelled */
+- IO_WQ_CANCEL_NOTFOUND, /* work not found */
+-};
+-
+-struct io_wq_work_node {
+- struct io_wq_work_node *next;
+-};
+-
+-struct io_wq_work_list {
+- struct io_wq_work_node *first;
+- struct io_wq_work_node *last;
+-};
+-
+-#define wq_list_for_each(pos, prv, head) \
+- for (pos = (head)->first, prv = NULL; pos; prv = pos, pos = (pos)->next)
+-
+-#define wq_list_for_each_resume(pos, prv) \
+- for (; pos; prv = pos, pos = (pos)->next)
+-
+-#define wq_list_empty(list) (READ_ONCE((list)->first) == NULL)
+-#define INIT_WQ_LIST(list) do { \
+- (list)->first = NULL; \
+-} while (0)
+-
+-static inline void wq_list_add_after(struct io_wq_work_node *node,
+- struct io_wq_work_node *pos,
+- struct io_wq_work_list *list)
+-{
+- struct io_wq_work_node *next = pos->next;
+-
+- pos->next = node;
+- node->next = next;
+- if (!next)
+- list->last = node;
+-}
+-
+-/**
+- * wq_list_merge - merge the second list to the first one.
+- * @list0: the first list
+- * @list1: the second list
+- * Return the first node after mergence.
+- */
+-static inline struct io_wq_work_node *wq_list_merge(struct io_wq_work_list *list0,
+- struct io_wq_work_list *list1)
+-{
+- struct io_wq_work_node *ret;
+-
+- if (!list0->first) {
+- ret = list1->first;
+- } else {
+- ret = list0->first;
+- list0->last->next = list1->first;
+- }
+- INIT_WQ_LIST(list0);
+- INIT_WQ_LIST(list1);
+- return ret;
+-}
+-
+-static inline void wq_list_add_tail(struct io_wq_work_node *node,
+- struct io_wq_work_list *list)
+-{
+- node->next = NULL;
+- if (!list->first) {
+- list->last = node;
+- WRITE_ONCE(list->first, node);
+- } else {
+- list->last->next = node;
+- list->last = node;
+- }
+-}
+-
+-static inline void wq_list_add_head(struct io_wq_work_node *node,
+- struct io_wq_work_list *list)
+-{
+- node->next = list->first;
+- if (!node->next)
+- list->last = node;
+- WRITE_ONCE(list->first, node);
+-}
+-
+-static inline void wq_list_cut(struct io_wq_work_list *list,
+- struct io_wq_work_node *last,
+- struct io_wq_work_node *prev)
+-{
+- /* first in the list, if prev==NULL */
+- if (!prev)
+- WRITE_ONCE(list->first, last->next);
+- else
+- prev->next = last->next;
+-
+- if (last == list->last)
+- list->last = prev;
+- last->next = NULL;
+-}
+-
+-static inline void __wq_list_splice(struct io_wq_work_list *list,
+- struct io_wq_work_node *to)
+-{
+- list->last->next = to->next;
+- to->next = list->first;
+- INIT_WQ_LIST(list);
+-}
+-
+-static inline bool wq_list_splice(struct io_wq_work_list *list,
+- struct io_wq_work_node *to)
+-{
+- if (!wq_list_empty(list)) {
+- __wq_list_splice(list, to);
+- return true;
+- }
+- return false;
+-}
+-
+-static inline void wq_stack_add_head(struct io_wq_work_node *node,
+- struct io_wq_work_node *stack)
+-{
+- node->next = stack->next;
+- stack->next = node;
+-}
+-
+-static inline void wq_list_del(struct io_wq_work_list *list,
+- struct io_wq_work_node *node,
+- struct io_wq_work_node *prev)
+-{
+- wq_list_cut(list, node, prev);
+-}
+-
+-static inline
+-struct io_wq_work_node *wq_stack_extract(struct io_wq_work_node *stack)
+-{
+- struct io_wq_work_node *node = stack->next;
+-
+- stack->next = node->next;
+- return node;
+-}
+-
+-struct io_wq_work {
+- struct io_wq_work_node list;
+- unsigned flags;
+-};
+-
+-static inline struct io_wq_work *wq_next_work(struct io_wq_work *work)
+-{
+- if (!work->list.next)
+- return NULL;
+-
+- return container_of(work->list.next, struct io_wq_work, list);
+-}
+-
+-typedef struct io_wq_work *(free_work_fn)(struct io_wq_work *);
+-typedef void (io_wq_work_fn)(struct io_wq_work *);
+-
+-struct io_wq_hash {
+- refcount_t refs;
+- unsigned long map;
+- struct wait_queue_head wait;
+-};
+-
+-static inline void io_wq_put_hash(struct io_wq_hash *hash)
+-{
+- if (refcount_dec_and_test(&hash->refs))
+- kfree(hash);
+-}
+-
+-struct io_wq_data {
+- struct io_wq_hash *hash;
+- struct task_struct *task;
+- io_wq_work_fn *do_work;
+- free_work_fn *free_work;
+-};
+-
+-struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data);
+-void io_wq_exit_start(struct io_wq *wq);
+-void io_wq_put_and_exit(struct io_wq *wq);
+-
+-void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work);
+-void io_wq_hash_work(struct io_wq_work *work, void *val);
+-
+-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
+-int io_wq_max_workers(struct io_wq *wq, int *new_count);
+-
+-static inline bool io_wq_is_hashed(struct io_wq_work *work)
+-{
+- return work->flags & IO_WQ_WORK_HASHED;
+-}
+-
+-typedef bool (work_cancel_fn)(struct io_wq_work *, void *);
+-
+-enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
+- void *data, bool cancel_all);
+-
+-#if defined(CONFIG_IO_WQ)
+-extern void io_wq_worker_sleeping(struct task_struct *);
+-extern void io_wq_worker_running(struct task_struct *);
+-#else
+-static inline void io_wq_worker_sleeping(struct task_struct *tsk)
+-{
+-}
+-static inline void io_wq_worker_running(struct task_struct *tsk)
+-{
+-}
+-#endif
+-
+-static inline bool io_wq_current_is_worker(void)
+-{
+- return in_task() && (current->flags & PF_IO_WORKER) &&
+- current->worker_private;
+-}
+-#endif
+diff --git a/fs/io_uring.c b/fs/io_uring.c
+deleted file mode 100644
+index 3d97372e811eb..0000000000000
+--- a/fs/io_uring.c
++++ /dev/null
+@@ -1,11915 +0,0 @@
+-// SPDX-License-Identifier: GPL-2.0
+-/*
+- * Shared application/kernel submission and completion ring pairs, for
+- * supporting fast/efficient IO.
+- *
+- * A note on the read/write ordering memory barriers that are matched between
+- * the application and kernel side.
+- *
+- * After the application reads the CQ ring tail, it must use an
+- * appropriate smp_rmb() to pair with the smp_wmb() the kernel uses
+- * before writing the tail (using smp_load_acquire to read the tail will
+- * do). It also needs a smp_mb() before updating CQ head (ordering the
+- * entry load(s) with the head store), pairing with an implicit barrier
+- * through a control-dependency in io_get_cqe (smp_store_release to
+- * store head will do). Failure to do so could lead to reading invalid
+- * CQ entries.
+- *
+- * Likewise, the application must use an appropriate smp_wmb() before
+- * writing the SQ tail (ordering SQ entry stores with the tail store),
+- * which pairs with smp_load_acquire in io_get_sqring (smp_store_release
+- * to store the tail will do). And it needs a barrier ordering the SQ
+- * head load before writing new SQ entries (smp_load_acquire to read
+- * head will do).
+- *
+- * When using the SQ poll thread (IORING_SETUP_SQPOLL), the application
+- * needs to check the SQ flags for IORING_SQ_NEED_WAKEUP *after*
+- * updating the SQ tail; a full memory barrier smp_mb() is needed
+- * between.
+- *
+- * Also see the examples in the liburing library:
+- *
+- * git://git.kernel.dk/liburing
+- *
+- * io_uring also uses READ/WRITE_ONCE() for _any_ store or load that happens
+- * from data shared between the kernel and application. This is done both
+- * for ordering purposes, but also to ensure that once a value is loaded from
+- * data that the application could potentially modify, it remains stable.
+- *
+- * Copyright (C) 2018-2019 Jens Axboe
+- * Copyright (c) 2018-2019 Christoph Hellwig
+- */
+-#include <linux/kernel.h>
+-#include <linux/init.h>
+-#include <linux/errno.h>
+-#include <linux/syscalls.h>
+-#include <linux/compat.h>
+-#include <net/compat.h>
+-#include <linux/refcount.h>
+-#include <linux/uio.h>
+-#include <linux/bits.h>
+-
+-#include <linux/sched/signal.h>
+-#include <linux/fs.h>
+-#include <linux/file.h>
+-#include <linux/fdtable.h>
+-#include <linux/mm.h>
+-#include <linux/mman.h>
+-#include <linux/percpu.h>
+-#include <linux/slab.h>
+-#include <linux/blk-mq.h>
+-#include <linux/bvec.h>
+-#include <linux/net.h>
+-#include <net/sock.h>
+-#include <net/af_unix.h>
+-#include <net/scm.h>
+-#include <linux/anon_inodes.h>
+-#include <linux/sched/mm.h>
+-#include <linux/uaccess.h>
+-#include <linux/nospec.h>
+-#include <linux/sizes.h>
+-#include <linux/hugetlb.h>
+-#include <linux/highmem.h>
+-#include <linux/namei.h>
+-#include <linux/fsnotify.h>
+-#include <linux/fadvise.h>
+-#include <linux/eventpoll.h>
+-#include <linux/splice.h>
+-#include <linux/task_work.h>
+-#include <linux/pagemap.h>
+-#include <linux/io_uring.h>
+-#include <linux/audit.h>
+-#include <linux/security.h>
+-
+-#define CREATE_TRACE_POINTS
+-#include <trace/events/io_uring.h>
+-
+-#include <uapi/linux/io_uring.h>
+-
+-#include "internal.h"
+-#include "io-wq.h"
+-
+-#define IORING_MAX_ENTRIES 32768
+-#define IORING_MAX_CQ_ENTRIES (2 * IORING_MAX_ENTRIES)
+-#define IORING_SQPOLL_CAP_ENTRIES_VALUE 8
+-
+-/* only define max */
+-#define IORING_MAX_FIXED_FILES (1U << 15)
+-#define IORING_MAX_RESTRICTIONS (IORING_RESTRICTION_LAST + \
+- IORING_REGISTER_LAST + IORING_OP_LAST)
+-
+-#define IO_RSRC_TAG_TABLE_SHIFT (PAGE_SHIFT - 3)
+-#define IO_RSRC_TAG_TABLE_MAX (1U << IO_RSRC_TAG_TABLE_SHIFT)
+-#define IO_RSRC_TAG_TABLE_MASK (IO_RSRC_TAG_TABLE_MAX - 1)
+-
+-#define IORING_MAX_REG_BUFFERS (1U << 14)
+-
+-#define SQE_COMMON_FLAGS (IOSQE_FIXED_FILE | IOSQE_IO_LINK | \
+- IOSQE_IO_HARDLINK | IOSQE_ASYNC)
+-
+-#define SQE_VALID_FLAGS (SQE_COMMON_FLAGS | IOSQE_BUFFER_SELECT | \
+- IOSQE_IO_DRAIN | IOSQE_CQE_SKIP_SUCCESS)
+-
+-#define IO_REQ_CLEAN_FLAGS (REQ_F_BUFFER_SELECTED | REQ_F_NEED_CLEANUP | \
+- REQ_F_POLLED | REQ_F_INFLIGHT | REQ_F_CREDS | \
+- REQ_F_ASYNC_DATA)
+-
+-#define IO_TCTX_REFS_CACHE_NR (1U << 10)
+-
+-struct io_uring {
+- u32 head ____cacheline_aligned_in_smp;
+- u32 tail ____cacheline_aligned_in_smp;
+-};
+-
+-/*
+- * This data is shared with the application through the mmap at offsets
+- * IORING_OFF_SQ_RING and IORING_OFF_CQ_RING.
+- *
+- * The offsets to the member fields are published through struct
+- * io_sqring_offsets when calling io_uring_setup.
+- */
+-struct io_rings {
+- /*
+- * Head and tail offsets into the ring; the offsets need to be
+- * masked to get valid indices.
+- *
+- * The kernel controls head of the sq ring and the tail of the cq ring,
+- * and the application controls tail of the sq ring and the head of the
+- * cq ring.
+- */
+- struct io_uring sq, cq;
+- /*
+- * Bitmasks to apply to head and tail offsets (constant, equals
+- * ring_entries - 1)
+- */
+- u32 sq_ring_mask, cq_ring_mask;
+- /* Ring sizes (constant, power of 2) */
+- u32 sq_ring_entries, cq_ring_entries;
+- /*
+- * Number of invalid entries dropped by the kernel due to
+- * invalid index stored in array
+- *
+- * Written by the kernel, shouldn't be modified by the
+- * application (i.e. get number of "new events" by comparing to
+- * cached value).
+- *
+- * After a new SQ head value was read by the application this
+- * counter includes all submissions that were dropped reaching
+- * the new SQ head (and possibly more).
+- */
+- u32 sq_dropped;
+- /*
+- * Runtime SQ flags
+- *
+- * Written by the kernel, shouldn't be modified by the
+- * application.
+- *
+- * The application needs a full memory barrier before checking
+- * for IORING_SQ_NEED_WAKEUP after updating the sq tail.
+- */
+- u32 sq_flags;
+- /*
+- * Runtime CQ flags
+- *
+- * Written by the application, shouldn't be modified by the
+- * kernel.
+- */
+- u32 cq_flags;
+- /*
+- * Number of completion events lost because the queue was full;
+- * this should be avoided by the application by making sure
+- * there are not more requests pending than there is space in
+- * the completion queue.
+- *
+- * Written by the kernel, shouldn't be modified by the
+- * application (i.e. get number of "new events" by comparing to
+- * cached value).
+- *
+- * As completion events come in out of order this counter is not
+- * ordered with any other data.
+- */
+- u32 cq_overflow;
+- /*
+- * Ring buffer of completion events.
+- *
+- * The kernel writes completion events fresh every time they are
+- * produced, so the application is allowed to modify pending
+- * entries.
+- */
+- struct io_uring_cqe cqes[] ____cacheline_aligned_in_smp;
+-};
+-
+-enum io_uring_cmd_flags {
+- IO_URING_F_COMPLETE_DEFER = 1,
+- IO_URING_F_UNLOCKED = 2,
+- /* int's last bit, sign checks are usually faster than a bit test */
+- IO_URING_F_NONBLOCK = INT_MIN,
+-};
+-
+-struct io_mapped_ubuf {
+- u64 ubuf;
+- u64 ubuf_end;
+- unsigned int nr_bvecs;
+- unsigned long acct_pages;
+- struct bio_vec bvec[];
+-};
+-
+-struct io_ring_ctx;
+-
+-struct io_overflow_cqe {
+- struct io_uring_cqe cqe;
+- struct list_head list;
+-};
+-
+-struct io_fixed_file {
+- /* file * with additional FFS_* flags */
+- unsigned long file_ptr;
+-};
+-
+-struct io_rsrc_put {
+- struct list_head list;
+- u64 tag;
+- union {
+- void *rsrc;
+- struct file *file;
+- struct io_mapped_ubuf *buf;
+- };
+-};
+-
+-struct io_file_table {
+- struct io_fixed_file *files;
+-};
+-
+-struct io_rsrc_node {
+- struct percpu_ref refs;
+- struct list_head node;
+- struct list_head rsrc_list;
+- struct io_rsrc_data *rsrc_data;
+- struct llist_node llist;
+- bool done;
+-};
+-
+-typedef void (rsrc_put_fn)(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc);
+-
+-struct io_rsrc_data {
+- struct io_ring_ctx *ctx;
+-
+- u64 **tags;
+- unsigned int nr;
+- rsrc_put_fn *do_put;
+- atomic_t refs;
+- struct completion done;
+- bool quiesce;
+-};
+-
+-struct io_buffer_list {
+- struct list_head list;
+- struct list_head buf_list;
+- __u16 bgid;
+-};
+-
+-struct io_buffer {
+- struct list_head list;
+- __u64 addr;
+- __u32 len;
+- __u16 bid;
+- __u16 bgid;
+-};
+-
+-struct io_restriction {
+- DECLARE_BITMAP(register_op, IORING_REGISTER_LAST);
+- DECLARE_BITMAP(sqe_op, IORING_OP_LAST);
+- u8 sqe_flags_allowed;
+- u8 sqe_flags_required;
+- bool registered;
+-};
+-
+-enum {
+- IO_SQ_THREAD_SHOULD_STOP = 0,
+- IO_SQ_THREAD_SHOULD_PARK,
+-};
+-
+-struct io_sq_data {
+- refcount_t refs;
+- atomic_t park_pending;
+- struct mutex lock;
+-
+- /* ctx's that are using this sqd */
+- struct list_head ctx_list;
+-
+- struct task_struct *thread;
+- struct wait_queue_head wait;
+-
+- unsigned sq_thread_idle;
+- int sq_cpu;
+- pid_t task_pid;
+- pid_t task_tgid;
+-
+- unsigned long state;
+- struct completion exited;
+-};
+-
+-#define IO_COMPL_BATCH 32
+-#define IO_REQ_CACHE_SIZE 32
+-#define IO_REQ_ALLOC_BATCH 8
+-
+-struct io_submit_link {
+- struct io_kiocb *head;
+- struct io_kiocb *last;
+-};
+-
+-struct io_submit_state {
+- /* inline/task_work completion list, under ->uring_lock */
+- struct io_wq_work_node free_list;
+- /* batch completion logic */
+- struct io_wq_work_list compl_reqs;
+- struct io_submit_link link;
+-
+- bool plug_started;
+- bool need_plug;
+- bool flush_cqes;
+- unsigned short submit_nr;
+- struct blk_plug plug;
+-};
+-
+-struct io_ev_fd {
+- struct eventfd_ctx *cq_ev_fd;
+- unsigned int eventfd_async: 1;
+- struct rcu_head rcu;
+-};
+-
+-#define IO_BUFFERS_HASH_BITS 5
+-
+-struct io_ring_ctx {
+- /* const or read-mostly hot data */
+- struct {
+- struct percpu_ref refs;
+-
+- struct io_rings *rings;
+- unsigned int flags;
+- unsigned int compat: 1;
+- unsigned int drain_next: 1;
+- unsigned int restricted: 1;
+- unsigned int off_timeout_used: 1;
+- unsigned int drain_active: 1;
+- unsigned int drain_disabled: 1;
+- unsigned int has_evfd: 1;
+- } ____cacheline_aligned_in_smp;
+-
+- /* submission data */
+- struct {
+- struct mutex uring_lock;
+-
+- /*
+- * Ring buffer of indices into array of io_uring_sqe, which is
+- * mmapped by the application using the IORING_OFF_SQES offset.
+- *
+- * This indirection could e.g. be used to assign fixed
+- * io_uring_sqe entries to operations and only submit them to
+- * the queue when needed.
+- *
+- * The kernel modifies neither the indices array nor the entries
+- * array.
+- */
+- u32 *sq_array;
+- struct io_uring_sqe *sq_sqes;
+- unsigned cached_sq_head;
+- unsigned sq_entries;
+- struct list_head defer_list;
+-
+- /*
+- * Fixed resources fast path, should be accessed only under
+- * uring_lock, and updated through io_uring_register(2)
+- */
+- struct io_rsrc_node *rsrc_node;
+- int rsrc_cached_refs;
+- struct io_file_table file_table;
+- unsigned nr_user_files;
+- unsigned nr_user_bufs;
+- struct io_mapped_ubuf **user_bufs;
+-
+- struct io_submit_state submit_state;
+- struct list_head timeout_list;
+- struct list_head ltimeout_list;
+- struct list_head cq_overflow_list;
+- struct list_head *io_buffers;
+- struct list_head io_buffers_cache;
+- struct list_head apoll_cache;
+- struct xarray personalities;
+- u32 pers_next;
+- unsigned sq_thread_idle;
+- } ____cacheline_aligned_in_smp;
+-
+- /* IRQ completion list, under ->completion_lock */
+- struct io_wq_work_list locked_free_list;
+- unsigned int locked_free_nr;
+-
+- const struct cred *sq_creds; /* cred used for __io_sq_thread() */
+- struct io_sq_data *sq_data; /* if using sq thread polling */
+-
+- struct wait_queue_head sqo_sq_wait;
+- struct list_head sqd_list;
+-
+- unsigned long check_cq_overflow;
+-
+- struct {
+- unsigned cached_cq_tail;
+- unsigned cq_entries;
+- struct io_ev_fd __rcu *io_ev_fd;
+- struct wait_queue_head cq_wait;
+- unsigned cq_extra;
+- atomic_t cq_timeouts;
+- unsigned cq_last_tm_flush;
+- } ____cacheline_aligned_in_smp;
+-
+- struct {
+- spinlock_t completion_lock;
+-
+- spinlock_t timeout_lock;
+-
+- /*
+- * ->iopoll_list is protected by the ctx->uring_lock for
+- * io_uring instances that don't use IORING_SETUP_SQPOLL.
+- * For SQPOLL, only the single threaded io_sq_thread() will
+- * manipulate the list, hence no extra locking is needed there.
+- */
+- struct io_wq_work_list iopoll_list;
+- struct hlist_head *cancel_hash;
+- unsigned cancel_hash_bits;
+- bool poll_multi_queue;
+-
+- struct list_head io_buffers_comp;
+- } ____cacheline_aligned_in_smp;
+-
+- struct io_restriction restrictions;
+-
+- /* slow path rsrc auxilary data, used by update/register */
+- struct {
+- struct io_rsrc_node *rsrc_backup_node;
+- struct io_mapped_ubuf *dummy_ubuf;
+- struct io_rsrc_data *file_data;
+- struct io_rsrc_data *buf_data;
+-
+- struct delayed_work rsrc_put_work;
+- struct llist_head rsrc_put_llist;
+- struct list_head rsrc_ref_list;
+- spinlock_t rsrc_ref_lock;
+-
+- struct list_head io_buffers_pages;
+- };
+-
+- /* Keep this last, we don't need it for the fast path */
+- struct {
+- #if defined(CONFIG_UNIX)
+- struct socket *ring_sock;
+- #endif
+- /* hashed buffered write serialization */
+- struct io_wq_hash *hash_map;
+-
+- /* Only used for accounting purposes */
+- struct user_struct *user;
+- struct mm_struct *mm_account;
+-
+- /* ctx exit and cancelation */
+- struct llist_head fallback_llist;
+- struct delayed_work fallback_work;
+- struct work_struct exit_work;
+- struct list_head tctx_list;
+- struct completion ref_comp;
+- u32 iowq_limits[2];
+- bool iowq_limits_set;
+- };
+-};
+-
+-/*
+- * Arbitrary limit, can be raised if need be
+- */
+-#define IO_RINGFD_REG_MAX 16
+-
+-struct io_uring_task {
+- /* submission side */
+- int cached_refs;
+- struct xarray xa;
+- struct wait_queue_head wait;
+- const struct io_ring_ctx *last;
+- struct io_wq *io_wq;
+- struct percpu_counter inflight;
+- atomic_t inflight_tracked;
+- atomic_t in_idle;
+-
+- spinlock_t task_lock;
+- struct io_wq_work_list task_list;
+- struct io_wq_work_list prior_task_list;
+- struct callback_head task_work;
+- struct file **registered_rings;
+- bool task_running;
+-};
+-
+-/*
+- * First field must be the file pointer in all the
+- * iocb unions! See also 'struct kiocb' in <linux/fs.h>
+- */
+-struct io_poll_iocb {
+- struct file *file;
+- struct wait_queue_head *head;
+- __poll_t events;
+- struct wait_queue_entry wait;
+-};
+-
+-struct io_poll_update {
+- struct file *file;
+- u64 old_user_data;
+- u64 new_user_data;
+- __poll_t events;
+- bool update_events;
+- bool update_user_data;
+-};
+-
+-struct io_close {
+- struct file *file;
+- int fd;
+- u32 file_slot;
+-};
+-
+-struct io_timeout_data {
+- struct io_kiocb *req;
+- struct hrtimer timer;
+- struct timespec64 ts;
+- enum hrtimer_mode mode;
+- u32 flags;
+-};
+-
+-struct io_accept {
+- struct file *file;
+- struct sockaddr __user *addr;
+- int __user *addr_len;
+- int flags;
+- u32 file_slot;
+- unsigned long nofile;
+-};
+-
+-struct io_sync {
+- struct file *file;
+- loff_t len;
+- loff_t off;
+- int flags;
+- int mode;
+-};
+-
+-struct io_cancel {
+- struct file *file;
+- u64 addr;
+-};
+-
+-struct io_timeout {
+- struct file *file;
+- u32 off;
+- u32 target_seq;
+- struct list_head list;
+- /* head of the link, used by linked timeouts only */
+- struct io_kiocb *head;
+- /* for linked completions */
+- struct io_kiocb *prev;
+-};
+-
+-struct io_timeout_rem {
+- struct file *file;
+- u64 addr;
+-
+- /* timeout update */
+- struct timespec64 ts;
+- u32 flags;
+- bool ltimeout;
+-};
+-
+-struct io_rw {
+- /* NOTE: kiocb has the file as the first member, so don't do it here */
+- struct kiocb kiocb;
+- u64 addr;
+- u32 len;
+- u32 flags;
+-};
+-
+-struct io_connect {
+- struct file *file;
+- struct sockaddr __user *addr;
+- int addr_len;
+-};
+-
+-struct io_sr_msg {
+- struct file *file;
+- union {
+- struct compat_msghdr __user *umsg_compat;
+- struct user_msghdr __user *umsg;
+- void __user *buf;
+- };
+- int msg_flags;
+- int bgid;
+- size_t len;
+- size_t done_io;
+-};
+-
+-struct io_open {
+- struct file *file;
+- int dfd;
+- u32 file_slot;
+- struct filename *filename;
+- struct open_how how;
+- unsigned long nofile;
+-};
+-
+-struct io_rsrc_update {
+- struct file *file;
+- u64 arg;
+- u32 nr_args;
+- u32 offset;
+-};
+-
+-struct io_fadvise {
+- struct file *file;
+- u64 offset;
+- u32 len;
+- u32 advice;
+-};
+-
+-struct io_madvise {
+- struct file *file;
+- u64 addr;
+- u32 len;
+- u32 advice;
+-};
+-
+-struct io_epoll {
+- struct file *file;
+- int epfd;
+- int op;
+- int fd;
+- struct epoll_event event;
+-};
+-
+-struct io_splice {
+- struct file *file_out;
+- loff_t off_out;
+- loff_t off_in;
+- u64 len;
+- int splice_fd_in;
+- unsigned int flags;
+-};
+-
+-struct io_provide_buf {
+- struct file *file;
+- __u64 addr;
+- __u32 len;
+- __u32 bgid;
+- __u16 nbufs;
+- __u16 bid;
+-};
+-
+-struct io_statx {
+- struct file *file;
+- int dfd;
+- unsigned int mask;
+- unsigned int flags;
+- struct filename *filename;
+- struct statx __user *buffer;
+-};
+-
+-struct io_shutdown {
+- struct file *file;
+- int how;
+-};
+-
+-struct io_rename {
+- struct file *file;
+- int old_dfd;
+- int new_dfd;
+- struct filename *oldpath;
+- struct filename *newpath;
+- int flags;
+-};
+-
+-struct io_unlink {
+- struct file *file;
+- int dfd;
+- int flags;
+- struct filename *filename;
+-};
+-
+-struct io_mkdir {
+- struct file *file;
+- int dfd;
+- umode_t mode;
+- struct filename *filename;
+-};
+-
+-struct io_symlink {
+- struct file *file;
+- int new_dfd;
+- struct filename *oldpath;
+- struct filename *newpath;
+-};
+-
+-struct io_hardlink {
+- struct file *file;
+- int old_dfd;
+- int new_dfd;
+- struct filename *oldpath;
+- struct filename *newpath;
+- int flags;
+-};
+-
+-struct io_msg {
+- struct file *file;
+- u64 user_data;
+- u32 len;
+-};
+-
+-struct io_async_connect {
+- struct sockaddr_storage address;
+-};
+-
+-struct io_async_msghdr {
+- struct iovec fast_iov[UIO_FASTIOV];
+- /* points to an allocated iov, if NULL we use fast_iov instead */
+- struct iovec *free_iov;
+- struct sockaddr __user *uaddr;
+- struct msghdr msg;
+- struct sockaddr_storage addr;
+-};
+-
+-struct io_rw_state {
+- struct iov_iter iter;
+- struct iov_iter_state iter_state;
+- struct iovec fast_iov[UIO_FASTIOV];
+-};
+-
+-struct io_async_rw {
+- struct io_rw_state s;
+- const struct iovec *free_iovec;
+- size_t bytes_done;
+- struct wait_page_queue wpq;
+-};
+-
+-enum {
+- REQ_F_FIXED_FILE_BIT = IOSQE_FIXED_FILE_BIT,
+- REQ_F_IO_DRAIN_BIT = IOSQE_IO_DRAIN_BIT,
+- REQ_F_LINK_BIT = IOSQE_IO_LINK_BIT,
+- REQ_F_HARDLINK_BIT = IOSQE_IO_HARDLINK_BIT,
+- REQ_F_FORCE_ASYNC_BIT = IOSQE_ASYNC_BIT,
+- REQ_F_BUFFER_SELECT_BIT = IOSQE_BUFFER_SELECT_BIT,
+- REQ_F_CQE_SKIP_BIT = IOSQE_CQE_SKIP_SUCCESS_BIT,
+-
+- /* first byte is taken by user flags, shift it to not overlap */
+- REQ_F_FAIL_BIT = 8,
+- REQ_F_INFLIGHT_BIT,
+- REQ_F_CUR_POS_BIT,
+- REQ_F_NOWAIT_BIT,
+- REQ_F_LINK_TIMEOUT_BIT,
+- REQ_F_NEED_CLEANUP_BIT,
+- REQ_F_POLLED_BIT,
+- REQ_F_BUFFER_SELECTED_BIT,
+- REQ_F_COMPLETE_INLINE_BIT,
+- REQ_F_REISSUE_BIT,
+- REQ_F_CREDS_BIT,
+- REQ_F_REFCOUNT_BIT,
+- REQ_F_ARM_LTIMEOUT_BIT,
+- REQ_F_ASYNC_DATA_BIT,
+- REQ_F_SKIP_LINK_CQES_BIT,
+- REQ_F_SINGLE_POLL_BIT,
+- REQ_F_DOUBLE_POLL_BIT,
+- REQ_F_PARTIAL_IO_BIT,
+- /* keep async read/write and isreg together and in order */
+- REQ_F_SUPPORT_NOWAIT_BIT,
+- REQ_F_ISREG_BIT,
+-
+- /* not a real bit, just to check we're not overflowing the space */
+- __REQ_F_LAST_BIT,
+-};
+-
+-enum {
+- /* ctx owns file */
+- REQ_F_FIXED_FILE = BIT(REQ_F_FIXED_FILE_BIT),
+- /* drain existing IO first */
+- REQ_F_IO_DRAIN = BIT(REQ_F_IO_DRAIN_BIT),
+- /* linked sqes */
+- REQ_F_LINK = BIT(REQ_F_LINK_BIT),
+- /* doesn't sever on completion < 0 */
+- REQ_F_HARDLINK = BIT(REQ_F_HARDLINK_BIT),
+- /* IOSQE_ASYNC */
+- REQ_F_FORCE_ASYNC = BIT(REQ_F_FORCE_ASYNC_BIT),
+- /* IOSQE_BUFFER_SELECT */
+- REQ_F_BUFFER_SELECT = BIT(REQ_F_BUFFER_SELECT_BIT),
+- /* IOSQE_CQE_SKIP_SUCCESS */
+- REQ_F_CQE_SKIP = BIT(REQ_F_CQE_SKIP_BIT),
+-
+- /* fail rest of links */
+- REQ_F_FAIL = BIT(REQ_F_FAIL_BIT),
+- /* on inflight list, should be cancelled and waited on exit reliably */
+- REQ_F_INFLIGHT = BIT(REQ_F_INFLIGHT_BIT),
+- /* read/write uses file position */
+- REQ_F_CUR_POS = BIT(REQ_F_CUR_POS_BIT),
+- /* must not punt to workers */
+- REQ_F_NOWAIT = BIT(REQ_F_NOWAIT_BIT),
+- /* has or had linked timeout */
+- REQ_F_LINK_TIMEOUT = BIT(REQ_F_LINK_TIMEOUT_BIT),
+- /* needs cleanup */
+- REQ_F_NEED_CLEANUP = BIT(REQ_F_NEED_CLEANUP_BIT),
+- /* already went through poll handler */
+- REQ_F_POLLED = BIT(REQ_F_POLLED_BIT),
+- /* buffer already selected */
+- REQ_F_BUFFER_SELECTED = BIT(REQ_F_BUFFER_SELECTED_BIT),
+- /* completion is deferred through io_comp_state */
+- REQ_F_COMPLETE_INLINE = BIT(REQ_F_COMPLETE_INLINE_BIT),
+- /* caller should reissue async */
+- REQ_F_REISSUE = BIT(REQ_F_REISSUE_BIT),
+- /* supports async reads/writes */
+- REQ_F_SUPPORT_NOWAIT = BIT(REQ_F_SUPPORT_NOWAIT_BIT),
+- /* regular file */
+- REQ_F_ISREG = BIT(REQ_F_ISREG_BIT),
+- /* has creds assigned */
+- REQ_F_CREDS = BIT(REQ_F_CREDS_BIT),
+- /* skip refcounting if not set */
+- REQ_F_REFCOUNT = BIT(REQ_F_REFCOUNT_BIT),
+- /* there is a linked timeout that has to be armed */
+- REQ_F_ARM_LTIMEOUT = BIT(REQ_F_ARM_LTIMEOUT_BIT),
+- /* ->async_data allocated */
+- REQ_F_ASYNC_DATA = BIT(REQ_F_ASYNC_DATA_BIT),
+- /* don't post CQEs while failing linked requests */
+- REQ_F_SKIP_LINK_CQES = BIT(REQ_F_SKIP_LINK_CQES_BIT),
+- /* single poll may be active */
+- REQ_F_SINGLE_POLL = BIT(REQ_F_SINGLE_POLL_BIT),
+- /* double poll may active */
+- REQ_F_DOUBLE_POLL = BIT(REQ_F_DOUBLE_POLL_BIT),
+- /* request has already done partial IO */
+- REQ_F_PARTIAL_IO = BIT(REQ_F_PARTIAL_IO_BIT),
+-};
+-
+-struct async_poll {
+- struct io_poll_iocb poll;
+- struct io_poll_iocb *double_poll;
+-};
+-
+-typedef void (*io_req_tw_func_t)(struct io_kiocb *req, bool *locked);
+-
+-struct io_task_work {
+- union {
+- struct io_wq_work_node node;
+- struct llist_node fallback_node;
+- };
+- io_req_tw_func_t func;
+-};
+-
+-enum {
+- IORING_RSRC_FILE = 0,
+- IORING_RSRC_BUFFER = 1,
+-};
+-
+-/*
+- * NOTE! Each of the iocb union members has the file pointer
+- * as the first entry in their struct definition. So you can
+- * access the file pointer through any of the sub-structs,
+- * or directly as just 'file' in this struct.
+- */
+-struct io_kiocb {
+- union {
+- struct file *file;
+- struct io_rw rw;
+- struct io_poll_iocb poll;
+- struct io_poll_update poll_update;
+- struct io_accept accept;
+- struct io_sync sync;
+- struct io_cancel cancel;
+- struct io_timeout timeout;
+- struct io_timeout_rem timeout_rem;
+- struct io_connect connect;
+- struct io_sr_msg sr_msg;
+- struct io_open open;
+- struct io_close close;
+- struct io_rsrc_update rsrc_update;
+- struct io_fadvise fadvise;
+- struct io_madvise madvise;
+- struct io_epoll epoll;
+- struct io_splice splice;
+- struct io_provide_buf pbuf;
+- struct io_statx statx;
+- struct io_shutdown shutdown;
+- struct io_rename rename;
+- struct io_unlink unlink;
+- struct io_mkdir mkdir;
+- struct io_symlink symlink;
+- struct io_hardlink hardlink;
+- struct io_msg msg;
+- };
+-
+- u8 opcode;
+- /* polled IO has completed */
+- u8 iopoll_completed;
+- u16 buf_index;
+- unsigned int flags;
+-
+- u64 user_data;
+- u32 result;
+- /* fd initially, then cflags for completion */
+- union {
+- u32 cflags;
+- int fd;
+- };
+-
+- struct io_ring_ctx *ctx;
+- struct task_struct *task;
+-
+- struct percpu_ref *fixed_rsrc_refs;
+- /* store used ubuf, so we can prevent reloading */
+- struct io_mapped_ubuf *imu;
+-
+- union {
+- /* used by request caches, completion batching and iopoll */
+- struct io_wq_work_node comp_list;
+- /* cache ->apoll->events */
+- __poll_t apoll_events;
+- };
+- atomic_t refs;
+- atomic_t poll_refs;
+- struct io_task_work io_task_work;
+- /* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
+- struct hlist_node hash_node;
+- /* internal polling, see IORING_FEAT_FAST_POLL */
+- struct async_poll *apoll;
+- /* opcode allocated if it needs to store data for async defer */
+- void *async_data;
+- /* stores selected buf, valid IFF REQ_F_BUFFER_SELECTED is set */
+- struct io_buffer *kbuf;
+- /* linked requests, IFF REQ_F_HARDLINK or REQ_F_LINK are set */
+- struct io_kiocb *link;
+- /* custom credentials, valid IFF REQ_F_CREDS is set */
+- const struct cred *creds;
+- struct io_wq_work work;
+-};
+-
+-struct io_tctx_node {
+- struct list_head ctx_node;
+- struct task_struct *task;
+- struct io_ring_ctx *ctx;
+-};
+-
+-struct io_defer_entry {
+- struct list_head list;
+- struct io_kiocb *req;
+- u32 seq;
+-};
+-
+-struct io_op_def {
+- /* needs req->file assigned */
+- unsigned needs_file : 1;
+- /* should block plug */
+- unsigned plug : 1;
+- /* hash wq insertion if file is a regular file */
+- unsigned hash_reg_file : 1;
+- /* unbound wq insertion if file is a non-regular file */
+- unsigned unbound_nonreg_file : 1;
+- /* set if opcode supports polled "wait" */
+- unsigned pollin : 1;
+- unsigned pollout : 1;
+- unsigned poll_exclusive : 1;
+- /* op supports buffer selection */
+- unsigned buffer_select : 1;
+- /* do prep async if is going to be punted */
+- unsigned needs_async_setup : 1;
+- /* opcode is not supported by this kernel */
+- unsigned not_supported : 1;
+- /* skip auditing */
+- unsigned audit_skip : 1;
+- /* size of async data needed, if any */
+- unsigned short async_size;
+-};
+-
+-static const struct io_op_def io_op_defs[] = {
+- [IORING_OP_NOP] = {},
+- [IORING_OP_READV] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollin = 1,
+- .buffer_select = 1,
+- .needs_async_setup = 1,
+- .plug = 1,
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_async_rw),
+- },
+- [IORING_OP_WRITEV] = {
+- .needs_file = 1,
+- .hash_reg_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollout = 1,
+- .needs_async_setup = 1,
+- .plug = 1,
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_async_rw),
+- },
+- [IORING_OP_FSYNC] = {
+- .needs_file = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_READ_FIXED] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollin = 1,
+- .plug = 1,
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_async_rw),
+- },
+- [IORING_OP_WRITE_FIXED] = {
+- .needs_file = 1,
+- .hash_reg_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollout = 1,
+- .plug = 1,
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_async_rw),
+- },
+- [IORING_OP_POLL_ADD] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_POLL_REMOVE] = {
+- .audit_skip = 1,
+- },
+- [IORING_OP_SYNC_FILE_RANGE] = {
+- .needs_file = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_SENDMSG] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollout = 1,
+- .needs_async_setup = 1,
+- .async_size = sizeof(struct io_async_msghdr),
+- },
+- [IORING_OP_RECVMSG] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollin = 1,
+- .buffer_select = 1,
+- .needs_async_setup = 1,
+- .async_size = sizeof(struct io_async_msghdr),
+- },
+- [IORING_OP_TIMEOUT] = {
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_timeout_data),
+- },
+- [IORING_OP_TIMEOUT_REMOVE] = {
+- /* used by timeout updates' prep() */
+- .audit_skip = 1,
+- },
+- [IORING_OP_ACCEPT] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollin = 1,
+- .poll_exclusive = 1,
+- },
+- [IORING_OP_ASYNC_CANCEL] = {
+- .audit_skip = 1,
+- },
+- [IORING_OP_LINK_TIMEOUT] = {
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_timeout_data),
+- },
+- [IORING_OP_CONNECT] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollout = 1,
+- .needs_async_setup = 1,
+- .async_size = sizeof(struct io_async_connect),
+- },
+- [IORING_OP_FALLOCATE] = {
+- .needs_file = 1,
+- },
+- [IORING_OP_OPENAT] = {},
+- [IORING_OP_CLOSE] = {},
+- [IORING_OP_FILES_UPDATE] = {
+- .audit_skip = 1,
+- },
+- [IORING_OP_STATX] = {
+- .audit_skip = 1,
+- },
+- [IORING_OP_READ] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollin = 1,
+- .buffer_select = 1,
+- .plug = 1,
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_async_rw),
+- },
+- [IORING_OP_WRITE] = {
+- .needs_file = 1,
+- .hash_reg_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollout = 1,
+- .plug = 1,
+- .audit_skip = 1,
+- .async_size = sizeof(struct io_async_rw),
+- },
+- [IORING_OP_FADVISE] = {
+- .needs_file = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_MADVISE] = {},
+- [IORING_OP_SEND] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollout = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_RECV] = {
+- .needs_file = 1,
+- .unbound_nonreg_file = 1,
+- .pollin = 1,
+- .buffer_select = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_OPENAT2] = {
+- },
+- [IORING_OP_EPOLL_CTL] = {
+- .unbound_nonreg_file = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_SPLICE] = {
+- .needs_file = 1,
+- .hash_reg_file = 1,
+- .unbound_nonreg_file = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_PROVIDE_BUFFERS] = {
+- .audit_skip = 1,
+- },
+- [IORING_OP_REMOVE_BUFFERS] = {
+- .audit_skip = 1,
+- },
+- [IORING_OP_TEE] = {
+- .needs_file = 1,
+- .hash_reg_file = 1,
+- .unbound_nonreg_file = 1,
+- .audit_skip = 1,
+- },
+- [IORING_OP_SHUTDOWN] = {
+- .needs_file = 1,
+- },
+- [IORING_OP_RENAMEAT] = {},
+- [IORING_OP_UNLINKAT] = {},
+- [IORING_OP_MKDIRAT] = {},
+- [IORING_OP_SYMLINKAT] = {},
+- [IORING_OP_LINKAT] = {},
+- [IORING_OP_MSG_RING] = {
+- .needs_file = 1,
+- },
+-};
+-
+-/* requests with any of those set should undergo io_disarm_next() */
+-#define IO_DISARM_MASK (REQ_F_ARM_LTIMEOUT | REQ_F_LINK_TIMEOUT | REQ_F_FAIL)
+-
+-static bool io_disarm_next(struct io_kiocb *req);
+-static void io_uring_del_tctx_node(unsigned long index);
+-static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
+- struct task_struct *task,
+- bool cancel_all);
+-static void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd);
+-
+-static void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags);
+-
+-static void io_put_req(struct io_kiocb *req);
+-static void io_put_req_deferred(struct io_kiocb *req);
+-static void io_dismantle_req(struct io_kiocb *req);
+-static void io_queue_linked_timeout(struct io_kiocb *req);
+-static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
+- struct io_uring_rsrc_update2 *up,
+- unsigned nr_args);
+-static void io_clean_op(struct io_kiocb *req);
+-static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
+- unsigned issue_flags);
+-static inline struct file *io_file_get_normal(struct io_kiocb *req, int fd);
+-static void __io_queue_sqe(struct io_kiocb *req);
+-static void io_rsrc_put_work(struct work_struct *work);
+-
+-static void io_req_task_queue(struct io_kiocb *req);
+-static void __io_submit_flush_completions(struct io_ring_ctx *ctx);
+-static int io_req_prep_async(struct io_kiocb *req);
+-
+-static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
+- unsigned int issue_flags, u32 slot_index);
+-static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags);
+-
+-static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer);
+-static void io_eventfd_signal(struct io_ring_ctx *ctx);
+-
+-static struct kmem_cache *req_cachep;
+-
+-static const struct file_operations io_uring_fops;
+-
+-struct sock *io_uring_get_socket(struct file *file)
+-{
+-#if defined(CONFIG_UNIX)
+- if (file->f_op == &io_uring_fops) {
+- struct io_ring_ctx *ctx = file->private_data;
+-
+- return ctx->ring_sock->sk;
+- }
+-#endif
+- return NULL;
+-}
+-EXPORT_SYMBOL(io_uring_get_socket);
+-
+-static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
+-{
+- if (!*locked) {
+- mutex_lock(&ctx->uring_lock);
+- *locked = true;
+- }
+-}
+-
+-#define io_for_each_link(pos, head) \
+- for (pos = (head); pos; pos = pos->link)
+-
+-/*
+- * Shamelessly stolen from the mm implementation of page reference checking,
+- * see commit f958d7b528b1 for details.
+- */
+-#define req_ref_zero_or_close_to_overflow(req) \
+- ((unsigned int) atomic_read(&(req->refs)) + 127u <= 127u)
+-
+-static inline bool req_ref_inc_not_zero(struct io_kiocb *req)
+-{
+- WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
+- return atomic_inc_not_zero(&req->refs);
+-}
+-
+-static inline bool req_ref_put_and_test(struct io_kiocb *req)
+-{
+- if (likely(!(req->flags & REQ_F_REFCOUNT)))
+- return true;
+-
+- WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
+- return atomic_dec_and_test(&req->refs);
+-}
+-
+-static inline void req_ref_get(struct io_kiocb *req)
+-{
+- WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
+- WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
+- atomic_inc(&req->refs);
+-}
+-
+-static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
+-{
+- if (!wq_list_empty(&ctx->submit_state.compl_reqs))
+- __io_submit_flush_completions(ctx);
+-}
+-
+-static inline void __io_req_set_refcount(struct io_kiocb *req, int nr)
+-{
+- if (!(req->flags & REQ_F_REFCOUNT)) {
+- req->flags |= REQ_F_REFCOUNT;
+- atomic_set(&req->refs, nr);
+- }
+-}
+-
+-static inline void io_req_set_refcount(struct io_kiocb *req)
+-{
+- __io_req_set_refcount(req, 1);
+-}
+-
+-#define IO_RSRC_REF_BATCH 100
+-
+-static inline void io_req_put_rsrc_locked(struct io_kiocb *req,
+- struct io_ring_ctx *ctx)
+- __must_hold(&ctx->uring_lock)
+-{
+- struct percpu_ref *ref = req->fixed_rsrc_refs;
+-
+- if (ref) {
+- if (ref == &ctx->rsrc_node->refs)
+- ctx->rsrc_cached_refs++;
+- else
+- percpu_ref_put(ref);
+- }
+-}
+-
+-static inline void io_req_put_rsrc(struct io_kiocb *req, struct io_ring_ctx *ctx)
+-{
+- if (req->fixed_rsrc_refs)
+- percpu_ref_put(req->fixed_rsrc_refs);
+-}
+-
+-static __cold void io_rsrc_refs_drop(struct io_ring_ctx *ctx)
+- __must_hold(&ctx->uring_lock)
+-{
+- if (ctx->rsrc_cached_refs) {
+- percpu_ref_put_many(&ctx->rsrc_node->refs, ctx->rsrc_cached_refs);
+- ctx->rsrc_cached_refs = 0;
+- }
+-}
+-
+-static void io_rsrc_refs_refill(struct io_ring_ctx *ctx)
+- __must_hold(&ctx->uring_lock)
+-{
+- ctx->rsrc_cached_refs += IO_RSRC_REF_BATCH;
+- percpu_ref_get_many(&ctx->rsrc_node->refs, IO_RSRC_REF_BATCH);
+-}
+-
+-static inline void io_req_set_rsrc_node(struct io_kiocb *req,
+- struct io_ring_ctx *ctx,
+- unsigned int issue_flags)
+-{
+- if (!req->fixed_rsrc_refs) {
+- req->fixed_rsrc_refs = &ctx->rsrc_node->refs;
+-
+- if (!(issue_flags & IO_URING_F_UNLOCKED)) {
+- lockdep_assert_held(&ctx->uring_lock);
+- ctx->rsrc_cached_refs--;
+- if (unlikely(ctx->rsrc_cached_refs < 0))
+- io_rsrc_refs_refill(ctx);
+- } else {
+- percpu_ref_get(req->fixed_rsrc_refs);
+- }
+- }
+-}
+-
+-static unsigned int __io_put_kbuf(struct io_kiocb *req, struct list_head *list)
+-{
+- struct io_buffer *kbuf = req->kbuf;
+- unsigned int cflags;
+-
+- cflags = IORING_CQE_F_BUFFER | (kbuf->bid << IORING_CQE_BUFFER_SHIFT);
+- req->flags &= ~REQ_F_BUFFER_SELECTED;
+- list_add(&kbuf->list, list);
+- req->kbuf = NULL;
+- return cflags;
+-}
+-
+-static inline unsigned int io_put_kbuf_comp(struct io_kiocb *req)
+-{
+- lockdep_assert_held(&req->ctx->completion_lock);
+-
+- if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
+- return 0;
+- return __io_put_kbuf(req, &req->ctx->io_buffers_comp);
+-}
+-
+-static inline unsigned int io_put_kbuf(struct io_kiocb *req,
+- unsigned issue_flags)
+-{
+- unsigned int cflags;
+-
+- if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
+- return 0;
+-
+- /*
+- * We can add this buffer back to two lists:
+- *
+- * 1) The io_buffers_cache list. This one is protected by the
+- * ctx->uring_lock. If we already hold this lock, add back to this
+- * list as we can grab it from issue as well.
+- * 2) The io_buffers_comp list. This one is protected by the
+- * ctx->completion_lock.
+- *
+- * We migrate buffers from the comp_list to the issue cache list
+- * when we need one.
+- */
+- if (issue_flags & IO_URING_F_UNLOCKED) {
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- spin_lock(&ctx->completion_lock);
+- cflags = __io_put_kbuf(req, &ctx->io_buffers_comp);
+- spin_unlock(&ctx->completion_lock);
+- } else {
+- lockdep_assert_held(&req->ctx->uring_lock);
+-
+- cflags = __io_put_kbuf(req, &req->ctx->io_buffers_cache);
+- }
+-
+- return cflags;
+-}
+-
+-static struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx *ctx,
+- unsigned int bgid)
+-{
+- struct list_head *hash_list;
+- struct io_buffer_list *bl;
+-
+- hash_list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
+- list_for_each_entry(bl, hash_list, list)
+- if (bl->bgid == bgid || bgid == -1U)
+- return bl;
+-
+- return NULL;
+-}
+-
+-static void io_kbuf_recycle(struct io_kiocb *req, unsigned issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_buffer_list *bl;
+- struct io_buffer *buf;
+-
+- if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
+- return;
+- /* don't recycle if we already did IO to this buffer */
+- if (req->flags & REQ_F_PARTIAL_IO)
+- return;
+-
+- if (issue_flags & IO_URING_F_UNLOCKED)
+- mutex_lock(&ctx->uring_lock);
+-
+- lockdep_assert_held(&ctx->uring_lock);
+-
+- buf = req->kbuf;
+- bl = io_buffer_get_list(ctx, buf->bgid);
+- list_add(&buf->list, &bl->buf_list);
+- req->flags &= ~REQ_F_BUFFER_SELECTED;
+- req->kbuf = NULL;
+-
+- if (issue_flags & IO_URING_F_UNLOCKED)
+- mutex_unlock(&ctx->uring_lock);
+-}
+-
+-static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
+- bool cancel_all)
+- __must_hold(&req->ctx->timeout_lock)
+-{
+- struct io_kiocb *req;
+-
+- if (task && head->task != task)
+- return false;
+- if (cancel_all)
+- return true;
+-
+- io_for_each_link(req, head) {
+- if (req->flags & REQ_F_INFLIGHT)
+- return true;
+- }
+- return false;
+-}
+-
+-static bool io_match_linked(struct io_kiocb *head)
+-{
+- struct io_kiocb *req;
+-
+- io_for_each_link(req, head) {
+- if (req->flags & REQ_F_INFLIGHT)
+- return true;
+- }
+- return false;
+-}
+-
+-/*
+- * As io_match_task() but protected against racing with linked timeouts.
+- * User must not hold timeout_lock.
+- */
+-static bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
+- bool cancel_all)
+-{
+- bool matched;
+-
+- if (task && head->task != task)
+- return false;
+- if (cancel_all)
+- return true;
+-
+- if (head->flags & REQ_F_LINK_TIMEOUT) {
+- struct io_ring_ctx *ctx = head->ctx;
+-
+- /* protect against races with linked timeouts */
+- spin_lock_irq(&ctx->timeout_lock);
+- matched = io_match_linked(head);
+- spin_unlock_irq(&ctx->timeout_lock);
+- } else {
+- matched = io_match_linked(head);
+- }
+- return matched;
+-}
+-
+-static inline bool req_has_async_data(struct io_kiocb *req)
+-{
+- return req->flags & REQ_F_ASYNC_DATA;
+-}
+-
+-static inline void req_set_fail(struct io_kiocb *req)
+-{
+- req->flags |= REQ_F_FAIL;
+- if (req->flags & REQ_F_CQE_SKIP) {
+- req->flags &= ~REQ_F_CQE_SKIP;
+- req->flags |= REQ_F_SKIP_LINK_CQES;
+- }
+-}
+-
+-static inline void req_fail_link_node(struct io_kiocb *req, int res)
+-{
+- req_set_fail(req);
+- req->result = res;
+-}
+-
+-static __cold void io_ring_ctx_ref_free(struct percpu_ref *ref)
+-{
+- struct io_ring_ctx *ctx = container_of(ref, struct io_ring_ctx, refs);
+-
+- complete(&ctx->ref_comp);
+-}
+-
+-static inline bool io_is_timeout_noseq(struct io_kiocb *req)
+-{
+- return !req->timeout.off;
+-}
+-
+-static __cold void io_fallback_req_func(struct work_struct *work)
+-{
+- struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
+- fallback_work.work);
+- struct llist_node *node = llist_del_all(&ctx->fallback_llist);
+- struct io_kiocb *req, *tmp;
+- bool locked = false;
+-
+- percpu_ref_get(&ctx->refs);
+- llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
+- req->io_task_work.func(req, &locked);
+-
+- if (locked) {
+- io_submit_flush_completions(ctx);
+- mutex_unlock(&ctx->uring_lock);
+- }
+- percpu_ref_put(&ctx->refs);
+-}
+-
+-static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
+-{
+- struct io_ring_ctx *ctx;
+- int i, hash_bits;
+-
+- ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+- if (!ctx)
+- return NULL;
+-
+- /*
+- * Use 5 bits less than the max cq entries, that should give us around
+- * 32 entries per hash list if totally full and uniformly spread.
+- */
+- hash_bits = ilog2(p->cq_entries);
+- hash_bits -= 5;
+- if (hash_bits <= 0)
+- hash_bits = 1;
+- ctx->cancel_hash_bits = hash_bits;
+- ctx->cancel_hash = kmalloc((1U << hash_bits) * sizeof(struct hlist_head),
+- GFP_KERNEL);
+- if (!ctx->cancel_hash)
+- goto err;
+- __hash_init(ctx->cancel_hash, 1U << hash_bits);
+-
+- ctx->dummy_ubuf = kzalloc(sizeof(*ctx->dummy_ubuf), GFP_KERNEL);
+- if (!ctx->dummy_ubuf)
+- goto err;
+- /* set invalid range, so io_import_fixed() fails meeting it */
+- ctx->dummy_ubuf->ubuf = -1UL;
+-
+- ctx->io_buffers = kcalloc(1U << IO_BUFFERS_HASH_BITS,
+- sizeof(struct list_head), GFP_KERNEL);
+- if (!ctx->io_buffers)
+- goto err;
+- for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++)
+- INIT_LIST_HEAD(&ctx->io_buffers[i]);
+-
+- if (percpu_ref_init(&ctx->refs, io_ring_ctx_ref_free,
+- PERCPU_REF_ALLOW_REINIT, GFP_KERNEL))
+- goto err;
+-
+- ctx->flags = p->flags;
+- init_waitqueue_head(&ctx->sqo_sq_wait);
+- INIT_LIST_HEAD(&ctx->sqd_list);
+- INIT_LIST_HEAD(&ctx->cq_overflow_list);
+- INIT_LIST_HEAD(&ctx->io_buffers_cache);
+- INIT_LIST_HEAD(&ctx->apoll_cache);
+- init_completion(&ctx->ref_comp);
+- xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1);
+- mutex_init(&ctx->uring_lock);
+- init_waitqueue_head(&ctx->cq_wait);
+- spin_lock_init(&ctx->completion_lock);
+- spin_lock_init(&ctx->timeout_lock);
+- INIT_WQ_LIST(&ctx->iopoll_list);
+- INIT_LIST_HEAD(&ctx->io_buffers_pages);
+- INIT_LIST_HEAD(&ctx->io_buffers_comp);
+- INIT_LIST_HEAD(&ctx->defer_list);
+- INIT_LIST_HEAD(&ctx->timeout_list);
+- INIT_LIST_HEAD(&ctx->ltimeout_list);
+- spin_lock_init(&ctx->rsrc_ref_lock);
+- INIT_LIST_HEAD(&ctx->rsrc_ref_list);
+- INIT_DELAYED_WORK(&ctx->rsrc_put_work, io_rsrc_put_work);
+- init_llist_head(&ctx->rsrc_put_llist);
+- INIT_LIST_HEAD(&ctx->tctx_list);
+- ctx->submit_state.free_list.next = NULL;
+- INIT_WQ_LIST(&ctx->locked_free_list);
+- INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
+- INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
+- return ctx;
+-err:
+- kfree(ctx->dummy_ubuf);
+- kfree(ctx->cancel_hash);
+- kfree(ctx->io_buffers);
+- kfree(ctx);
+- return NULL;
+-}
+-
+-static void io_account_cq_overflow(struct io_ring_ctx *ctx)
+-{
+- struct io_rings *r = ctx->rings;
+-
+- WRITE_ONCE(r->cq_overflow, READ_ONCE(r->cq_overflow) + 1);
+- ctx->cq_extra--;
+-}
+-
+-static bool req_need_defer(struct io_kiocb *req, u32 seq)
+-{
+- if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- return seq + READ_ONCE(ctx->cq_extra) != ctx->cached_cq_tail;
+- }
+-
+- return false;
+-}
+-
+-#define FFS_NOWAIT 0x1UL
+-#define FFS_ISREG 0x2UL
+-#define FFS_MASK ~(FFS_NOWAIT|FFS_ISREG)
+-
+-static inline bool io_req_ffs_set(struct io_kiocb *req)
+-{
+- return req->flags & REQ_F_FIXED_FILE;
+-}
+-
+-static inline void io_req_track_inflight(struct io_kiocb *req)
+-{
+- if (!(req->flags & REQ_F_INFLIGHT)) {
+- req->flags |= REQ_F_INFLIGHT;
+- atomic_inc(&req->task->io_uring->inflight_tracked);
+- }
+-}
+-
+-static struct io_kiocb *__io_prep_linked_timeout(struct io_kiocb *req)
+-{
+- if (WARN_ON_ONCE(!req->link))
+- return NULL;
+-
+- req->flags &= ~REQ_F_ARM_LTIMEOUT;
+- req->flags |= REQ_F_LINK_TIMEOUT;
+-
+- /* linked timeouts should have two refs once prep'ed */
+- io_req_set_refcount(req);
+- __io_req_set_refcount(req->link, 2);
+- return req->link;
+-}
+-
+-static inline struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
+-{
+- if (likely(!(req->flags & REQ_F_ARM_LTIMEOUT)))
+- return NULL;
+- return __io_prep_linked_timeout(req);
+-}
+-
+-static void io_prep_async_work(struct io_kiocb *req)
+-{
+- const struct io_op_def *def = &io_op_defs[req->opcode];
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- if (!(req->flags & REQ_F_CREDS)) {
+- req->flags |= REQ_F_CREDS;
+- req->creds = get_current_cred();
+- }
+-
+- req->work.list.next = NULL;
+- req->work.flags = 0;
+- if (req->flags & REQ_F_FORCE_ASYNC)
+- req->work.flags |= IO_WQ_WORK_CONCURRENT;
+-
+- if (req->flags & REQ_F_ISREG) {
+- if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
+- io_wq_hash_work(&req->work, file_inode(req->file));
+- } else if (!req->file || !S_ISBLK(file_inode(req->file)->i_mode)) {
+- if (def->unbound_nonreg_file)
+- req->work.flags |= IO_WQ_WORK_UNBOUND;
+- }
+-}
+-
+-static void io_prep_async_link(struct io_kiocb *req)
+-{
+- struct io_kiocb *cur;
+-
+- if (req->flags & REQ_F_LINK_TIMEOUT) {
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- spin_lock_irq(&ctx->timeout_lock);
+- io_for_each_link(cur, req)
+- io_prep_async_work(cur);
+- spin_unlock_irq(&ctx->timeout_lock);
+- } else {
+- io_for_each_link(cur, req)
+- io_prep_async_work(cur);
+- }
+-}
+-
+-static inline void io_req_add_compl_list(struct io_kiocb *req)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_submit_state *state = &ctx->submit_state;
+-
+- if (!(req->flags & REQ_F_CQE_SKIP))
+- ctx->submit_state.flush_cqes = true;
+- wq_list_add_tail(&req->comp_list, &state->compl_reqs);
+-}
+-
+-static void io_queue_async_work(struct io_kiocb *req, bool *dont_use)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_kiocb *link = io_prep_linked_timeout(req);
+- struct io_uring_task *tctx = req->task->io_uring;
+-
+- BUG_ON(!tctx);
+- BUG_ON(!tctx->io_wq);
+-
+- /* init ->work of the whole link before punting */
+- io_prep_async_link(req);
+-
+- /*
+- * Not expected to happen, but if we do have a bug where this _can_
+- * happen, catch it here and ensure the request is marked as
+- * canceled. That will make io-wq go through the usual work cancel
+- * procedure rather than attempt to run this request (or create a new
+- * worker for it).
+- */
+- if (WARN_ON_ONCE(!same_thread_group(req->task, current)))
+- req->work.flags |= IO_WQ_WORK_CANCEL;
+-
+- trace_io_uring_queue_async_work(ctx, req, req->user_data, req->opcode, req->flags,
+- &req->work, io_wq_is_hashed(&req->work));
+- io_wq_enqueue(tctx->io_wq, &req->work);
+- if (link)
+- io_queue_linked_timeout(link);
+-}
+-
+-static void io_kill_timeout(struct io_kiocb *req, int status)
+- __must_hold(&req->ctx->completion_lock)
+- __must_hold(&req->ctx->timeout_lock)
+-{
+- struct io_timeout_data *io = req->async_data;
+-
+- if (hrtimer_try_to_cancel(&io->timer) != -1) {
+- if (status)
+- req_set_fail(req);
+- atomic_set(&req->ctx->cq_timeouts,
+- atomic_read(&req->ctx->cq_timeouts) + 1);
+- list_del_init(&req->timeout.list);
+- io_fill_cqe_req(req, status, 0);
+- io_put_req_deferred(req);
+- }
+-}
+-
+-static __cold void io_queue_deferred(struct io_ring_ctx *ctx)
+-{
+- while (!list_empty(&ctx->defer_list)) {
+- struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
+- struct io_defer_entry, list);
+-
+- if (req_need_defer(de->req, de->seq))
+- break;
+- list_del_init(&de->list);
+- io_req_task_queue(de->req);
+- kfree(de);
+- }
+-}
+-
+-static __cold void io_flush_timeouts(struct io_ring_ctx *ctx)
+- __must_hold(&ctx->completion_lock)
+-{
+- u32 seq = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
+- struct io_kiocb *req, *tmp;
+-
+- spin_lock_irq(&ctx->timeout_lock);
+- list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
+- u32 events_needed, events_got;
+-
+- if (io_is_timeout_noseq(req))
+- break;
+-
+- /*
+- * Since seq can easily wrap around over time, subtract
+- * the last seq at which timeouts were flushed before comparing.
+- * Assuming not more than 2^31-1 events have happened since,
+- * these subtractions won't have wrapped, so we can check if
+- * target is in [last_seq, current_seq] by comparing the two.
+- */
+- events_needed = req->timeout.target_seq - ctx->cq_last_tm_flush;
+- events_got = seq - ctx->cq_last_tm_flush;
+- if (events_got < events_needed)
+- break;
+-
+- io_kill_timeout(req, 0);
+- }
+- ctx->cq_last_tm_flush = seq;
+- spin_unlock_irq(&ctx->timeout_lock);
+-}
+-
+-static inline void io_commit_cqring(struct io_ring_ctx *ctx)
+-{
+- /* order cqe stores with ring update */
+- smp_store_release(&ctx->rings->cq.tail, ctx->cached_cq_tail);
+-}
+-
+-static void __io_commit_cqring_flush(struct io_ring_ctx *ctx)
+-{
+- if (ctx->off_timeout_used || ctx->drain_active) {
+- spin_lock(&ctx->completion_lock);
+- if (ctx->off_timeout_used)
+- io_flush_timeouts(ctx);
+- if (ctx->drain_active)
+- io_queue_deferred(ctx);
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- }
+- if (ctx->has_evfd)
+- io_eventfd_signal(ctx);
+-}
+-
+-static inline bool io_sqring_full(struct io_ring_ctx *ctx)
+-{
+- struct io_rings *r = ctx->rings;
+-
+- return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
+-}
+-
+-static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
+-{
+- return ctx->cached_cq_tail - READ_ONCE(ctx->rings->cq.head);
+-}
+-
+-static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
+-{
+- struct io_rings *rings = ctx->rings;
+- unsigned tail, mask = ctx->cq_entries - 1;
+-
+- /*
+- * writes to the cq entry need to come after reading head; the
+- * control dependency is enough as we're using WRITE_ONCE to
+- * fill the cq entry
+- */
+- if (__io_cqring_events(ctx) == ctx->cq_entries)
+- return NULL;
+-
+- tail = ctx->cached_cq_tail++;
+- return &rings->cqes[tail & mask];
+-}
+-
+-static void io_eventfd_signal(struct io_ring_ctx *ctx)
+-{
+- struct io_ev_fd *ev_fd;
+-
+- rcu_read_lock();
+- /*
+- * rcu_dereference ctx->io_ev_fd once and use it for both for checking
+- * and eventfd_signal
+- */
+- ev_fd = rcu_dereference(ctx->io_ev_fd);
+-
+- /*
+- * Check again if ev_fd exists incase an io_eventfd_unregister call
+- * completed between the NULL check of ctx->io_ev_fd at the start of
+- * the function and rcu_read_lock.
+- */
+- if (unlikely(!ev_fd))
+- goto out;
+- if (READ_ONCE(ctx->rings->cq_flags) & IORING_CQ_EVENTFD_DISABLED)
+- goto out;
+-
+- if (!ev_fd->eventfd_async || io_wq_current_is_worker())
+- eventfd_signal(ev_fd->cq_ev_fd, 1);
+-out:
+- rcu_read_unlock();
+-}
+-
+-static inline void io_cqring_wake(struct io_ring_ctx *ctx)
+-{
+- /*
+- * wake_up_all() may seem excessive, but io_wake_function() and
+- * io_should_wake() handle the termination of the loop and only
+- * wake as many waiters as we need to.
+- */
+- if (wq_has_sleeper(&ctx->cq_wait))
+- wake_up_all(&ctx->cq_wait);
+-}
+-
+-/*
+- * This should only get called when at least one event has been posted.
+- * Some applications rely on the eventfd notification count only changing
+- * IFF a new CQE has been added to the CQ ring. There's no depedency on
+- * 1:1 relationship between how many times this function is called (and
+- * hence the eventfd count) and number of CQEs posted to the CQ ring.
+- */
+-static inline void io_cqring_ev_posted(struct io_ring_ctx *ctx)
+-{
+- if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
+- ctx->has_evfd))
+- __io_commit_cqring_flush(ctx);
+-
+- io_cqring_wake(ctx);
+-}
+-
+-static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
+-{
+- if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
+- ctx->has_evfd))
+- __io_commit_cqring_flush(ctx);
+-
+- if (ctx->flags & IORING_SETUP_SQPOLL)
+- io_cqring_wake(ctx);
+-}
+-
+-/* Returns true if there are no backlogged entries after the flush */
+-static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
+-{
+- bool all_flushed, posted;
+-
+- if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
+- return false;
+-
+- posted = false;
+- spin_lock(&ctx->completion_lock);
+- while (!list_empty(&ctx->cq_overflow_list)) {
+- struct io_uring_cqe *cqe = io_get_cqe(ctx);
+- struct io_overflow_cqe *ocqe;
+-
+- if (!cqe && !force)
+- break;
+- ocqe = list_first_entry(&ctx->cq_overflow_list,
+- struct io_overflow_cqe, list);
+- if (cqe)
+- memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
+- else
+- io_account_cq_overflow(ctx);
+-
+- posted = true;
+- list_del(&ocqe->list);
+- kfree(ocqe);
+- }
+-
+- all_flushed = list_empty(&ctx->cq_overflow_list);
+- if (all_flushed) {
+- clear_bit(0, &ctx->check_cq_overflow);
+- WRITE_ONCE(ctx->rings->sq_flags,
+- ctx->rings->sq_flags & ~IORING_SQ_CQ_OVERFLOW);
+- }
+-
+- if (posted)
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- if (posted)
+- io_cqring_ev_posted(ctx);
+- return all_flushed;
+-}
+-
+-static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx)
+-{
+- bool ret = true;
+-
+- if (test_bit(0, &ctx->check_cq_overflow)) {
+- /* iopoll syncs against uring_lock, not completion_lock */
+- if (ctx->flags & IORING_SETUP_IOPOLL)
+- mutex_lock(&ctx->uring_lock);
+- ret = __io_cqring_overflow_flush(ctx, false);
+- if (ctx->flags & IORING_SETUP_IOPOLL)
+- mutex_unlock(&ctx->uring_lock);
+- }
+-
+- return ret;
+-}
+-
+-/* must to be called somewhat shortly after putting a request */
+-static inline void io_put_task(struct task_struct *task, int nr)
+-{
+- struct io_uring_task *tctx = task->io_uring;
+-
+- if (likely(task == current)) {
+- tctx->cached_refs += nr;
+- } else {
+- percpu_counter_sub(&tctx->inflight, nr);
+- if (unlikely(atomic_read(&tctx->in_idle)))
+- wake_up(&tctx->wait);
+- put_task_struct_many(task, nr);
+- }
+-}
+-
+-static void io_task_refs_refill(struct io_uring_task *tctx)
+-{
+- unsigned int refill = -tctx->cached_refs + IO_TCTX_REFS_CACHE_NR;
+-
+- percpu_counter_add(&tctx->inflight, refill);
+- refcount_add(refill, ¤t->usage);
+- tctx->cached_refs += refill;
+-}
+-
+-static inline void io_get_task_refs(int nr)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+-
+- tctx->cached_refs -= nr;
+- if (unlikely(tctx->cached_refs < 0))
+- io_task_refs_refill(tctx);
+-}
+-
+-static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
+-{
+- struct io_uring_task *tctx = task->io_uring;
+- unsigned int refs = tctx->cached_refs;
+-
+- if (refs) {
+- tctx->cached_refs = 0;
+- percpu_counter_sub(&tctx->inflight, refs);
+- put_task_struct_many(task, refs);
+- }
+-}
+-
+-static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
+- s32 res, u32 cflags)
+-{
+- struct io_overflow_cqe *ocqe;
+-
+- ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
+- if (!ocqe) {
+- /*
+- * If we're in ring overflow flush mode, or in task cancel mode,
+- * or cannot allocate an overflow entry, then we need to drop it
+- * on the floor.
+- */
+- io_account_cq_overflow(ctx);
+- return false;
+- }
+- if (list_empty(&ctx->cq_overflow_list)) {
+- set_bit(0, &ctx->check_cq_overflow);
+- WRITE_ONCE(ctx->rings->sq_flags,
+- ctx->rings->sq_flags | IORING_SQ_CQ_OVERFLOW);
+-
+- }
+- ocqe->cqe.user_data = user_data;
+- ocqe->cqe.res = res;
+- ocqe->cqe.flags = cflags;
+- list_add_tail(&ocqe->list, &ctx->cq_overflow_list);
+- return true;
+-}
+-
+-static inline bool __io_fill_cqe(struct io_ring_ctx *ctx, u64 user_data,
+- s32 res, u32 cflags)
+-{
+- struct io_uring_cqe *cqe;
+-
+- /*
+- * If we can't get a cq entry, userspace overflowed the
+- * submission (by quite a lot). Increment the overflow count in
+- * the ring.
+- */
+- cqe = io_get_cqe(ctx);
+- if (likely(cqe)) {
+- WRITE_ONCE(cqe->user_data, user_data);
+- WRITE_ONCE(cqe->res, res);
+- WRITE_ONCE(cqe->flags, cflags);
+- return true;
+- }
+- return io_cqring_event_overflow(ctx, user_data, res, cflags);
+-}
+-
+-static inline bool __io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
+-{
+- trace_io_uring_complete(req->ctx, req, req->user_data, res, cflags);
+- return __io_fill_cqe(req->ctx, req->user_data, res, cflags);
+-}
+-
+-static noinline void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
+-{
+- if (!(req->flags & REQ_F_CQE_SKIP))
+- __io_fill_cqe_req(req, res, cflags);
+-}
+-
+-static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
+- s32 res, u32 cflags)
+-{
+- ctx->cq_extra++;
+- trace_io_uring_complete(ctx, NULL, user_data, res, cflags);
+- return __io_fill_cqe(ctx, user_data, res, cflags);
+-}
+-
+-static void __io_req_complete_post(struct io_kiocb *req, s32 res,
+- u32 cflags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- if (!(req->flags & REQ_F_CQE_SKIP))
+- __io_fill_cqe_req(req, res, cflags);
+- /*
+- * If we're the last reference to this request, add to our locked
+- * free_list cache.
+- */
+- if (req_ref_put_and_test(req)) {
+- if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
+- if (req->flags & IO_DISARM_MASK)
+- io_disarm_next(req);
+- if (req->link) {
+- io_req_task_queue(req->link);
+- req->link = NULL;
+- }
+- }
+- io_req_put_rsrc(req, ctx);
+- /*
+- * Selected buffer deallocation in io_clean_op() assumes that
+- * we don't hold ->completion_lock. Clean them here to avoid
+- * deadlocks.
+- */
+- io_put_kbuf_comp(req);
+- io_dismantle_req(req);
+- io_put_task(req->task, 1);
+- wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
+- ctx->locked_free_nr++;
+- }
+-}
+-
+-static void io_req_complete_post(struct io_kiocb *req, s32 res,
+- u32 cflags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- spin_lock(&ctx->completion_lock);
+- __io_req_complete_post(req, res, cflags);
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- io_cqring_ev_posted(ctx);
+-}
+-
+-static inline void io_req_complete_state(struct io_kiocb *req, s32 res,
+- u32 cflags)
+-{
+- req->result = res;
+- req->cflags = cflags;
+- req->flags |= REQ_F_COMPLETE_INLINE;
+-}
+-
+-static inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags,
+- s32 res, u32 cflags)
+-{
+- if (issue_flags & IO_URING_F_COMPLETE_DEFER)
+- io_req_complete_state(req, res, cflags);
+- else
+- io_req_complete_post(req, res, cflags);
+-}
+-
+-static inline void io_req_complete(struct io_kiocb *req, s32 res)
+-{
+- __io_req_complete(req, 0, res, 0);
+-}
+-
+-static void io_req_complete_failed(struct io_kiocb *req, s32 res)
+-{
+- req_set_fail(req);
+- io_req_complete_post(req, res, io_put_kbuf(req, IO_URING_F_UNLOCKED));
+-}
+-
+-static void io_req_complete_fail_submit(struct io_kiocb *req)
+-{
+- /*
+- * We don't submit, fail them all, for that replace hardlinks with
+- * normal links. Extra REQ_F_LINK is tolerated.
+- */
+- req->flags &= ~REQ_F_HARDLINK;
+- req->flags |= REQ_F_LINK;
+- io_req_complete_failed(req, req->result);
+-}
+-
+-/*
+- * Don't initialise the fields below on every allocation, but do that in
+- * advance and keep them valid across allocations.
+- */
+-static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
+-{
+- req->ctx = ctx;
+- req->link = NULL;
+- req->async_data = NULL;
+- /* not necessary, but safer to zero */
+- req->result = 0;
+-}
+-
+-static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
+- struct io_submit_state *state)
+-{
+- spin_lock(&ctx->completion_lock);
+- wq_list_splice(&ctx->locked_free_list, &state->free_list);
+- ctx->locked_free_nr = 0;
+- spin_unlock(&ctx->completion_lock);
+-}
+-
+-/* Returns true IFF there are requests in the cache */
+-static bool io_flush_cached_reqs(struct io_ring_ctx *ctx)
+-{
+- struct io_submit_state *state = &ctx->submit_state;
+-
+- /*
+- * If we have more than a batch's worth of requests in our IRQ side
+- * locked cache, grab the lock and move them over to our submission
+- * side cache.
+- */
+- if (READ_ONCE(ctx->locked_free_nr) > IO_COMPL_BATCH)
+- io_flush_cached_locked_reqs(ctx, state);
+- return !!state->free_list.next;
+-}
+-
+-/*
+- * A request might get retired back into the request caches even before opcode
+- * handlers and io_issue_sqe() are done with it, e.g. inline completion path.
+- * Because of that, io_alloc_req() should be called only under ->uring_lock
+- * and with extra caution to not get a request that is still worked on.
+- */
+-static __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
+- __must_hold(&ctx->uring_lock)
+-{
+- struct io_submit_state *state = &ctx->submit_state;
+- gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
+- void *reqs[IO_REQ_ALLOC_BATCH];
+- struct io_kiocb *req;
+- int ret, i;
+-
+- if (likely(state->free_list.next || io_flush_cached_reqs(ctx)))
+- return true;
+-
+- ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
+-
+- /*
+- * Bulk alloc is all-or-nothing. If we fail to get a batch,
+- * retry single alloc to be on the safe side.
+- */
+- if (unlikely(ret <= 0)) {
+- reqs[0] = kmem_cache_alloc(req_cachep, gfp);
+- if (!reqs[0])
+- return false;
+- ret = 1;
+- }
+-
+- percpu_ref_get_many(&ctx->refs, ret);
+- for (i = 0; i < ret; i++) {
+- req = reqs[i];
+-
+- io_preinit_req(req, ctx);
+- wq_stack_add_head(&req->comp_list, &state->free_list);
+- }
+- return true;
+-}
+-
+-static inline bool io_alloc_req_refill(struct io_ring_ctx *ctx)
+-{
+- if (unlikely(!ctx->submit_state.free_list.next))
+- return __io_alloc_req_refill(ctx);
+- return true;
+-}
+-
+-static inline struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
+-{
+- struct io_wq_work_node *node;
+-
+- node = wq_stack_extract(&ctx->submit_state.free_list);
+- return container_of(node, struct io_kiocb, comp_list);
+-}
+-
+-static inline void io_put_file(struct file *file)
+-{
+- if (file)
+- fput(file);
+-}
+-
+-static inline void io_dismantle_req(struct io_kiocb *req)
+-{
+- unsigned int flags = req->flags;
+-
+- if (unlikely(flags & IO_REQ_CLEAN_FLAGS))
+- io_clean_op(req);
+- if (!(flags & REQ_F_FIXED_FILE))
+- io_put_file(req->file);
+-}
+-
+-static __cold void __io_free_req(struct io_kiocb *req)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- io_req_put_rsrc(req, ctx);
+- io_dismantle_req(req);
+- io_put_task(req->task, 1);
+-
+- spin_lock(&ctx->completion_lock);
+- wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
+- ctx->locked_free_nr++;
+- spin_unlock(&ctx->completion_lock);
+-}
+-
+-static inline void io_remove_next_linked(struct io_kiocb *req)
+-{
+- struct io_kiocb *nxt = req->link;
+-
+- req->link = nxt->link;
+- nxt->link = NULL;
+-}
+-
+-static bool io_kill_linked_timeout(struct io_kiocb *req)
+- __must_hold(&req->ctx->completion_lock)
+- __must_hold(&req->ctx->timeout_lock)
+-{
+- struct io_kiocb *link = req->link;
+-
+- if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
+- struct io_timeout_data *io = link->async_data;
+-
+- io_remove_next_linked(req);
+- link->timeout.head = NULL;
+- if (hrtimer_try_to_cancel(&io->timer) != -1) {
+- list_del(&link->timeout.list);
+- /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
+- io_fill_cqe_req(link, -ECANCELED, 0);
+- io_put_req_deferred(link);
+- return true;
+- }
+- }
+- return false;
+-}
+-
+-static void io_fail_links(struct io_kiocb *req)
+- __must_hold(&req->ctx->completion_lock)
+-{
+- struct io_kiocb *nxt, *link = req->link;
+- bool ignore_cqes = req->flags & REQ_F_SKIP_LINK_CQES;
+-
+- req->link = NULL;
+- while (link) {
+- long res = -ECANCELED;
+-
+- if (link->flags & REQ_F_FAIL)
+- res = link->result;
+-
+- nxt = link->link;
+- link->link = NULL;
+-
+- trace_io_uring_fail_link(req->ctx, req, req->user_data,
+- req->opcode, link);
+-
+- if (!ignore_cqes) {
+- link->flags &= ~REQ_F_CQE_SKIP;
+- io_fill_cqe_req(link, res, 0);
+- }
+- io_put_req_deferred(link);
+- link = nxt;
+- }
+-}
+-
+-static bool io_disarm_next(struct io_kiocb *req)
+- __must_hold(&req->ctx->completion_lock)
+-{
+- bool posted = false;
+-
+- if (req->flags & REQ_F_ARM_LTIMEOUT) {
+- struct io_kiocb *link = req->link;
+-
+- req->flags &= ~REQ_F_ARM_LTIMEOUT;
+- if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
+- io_remove_next_linked(req);
+- /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
+- io_fill_cqe_req(link, -ECANCELED, 0);
+- io_put_req_deferred(link);
+- posted = true;
+- }
+- } else if (req->flags & REQ_F_LINK_TIMEOUT) {
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- spin_lock_irq(&ctx->timeout_lock);
+- posted = io_kill_linked_timeout(req);
+- spin_unlock_irq(&ctx->timeout_lock);
+- }
+- if (unlikely((req->flags & REQ_F_FAIL) &&
+- !(req->flags & REQ_F_HARDLINK))) {
+- posted |= (req->link != NULL);
+- io_fail_links(req);
+- }
+- return posted;
+-}
+-
+-static void __io_req_find_next_prep(struct io_kiocb *req)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- bool posted;
+-
+- spin_lock(&ctx->completion_lock);
+- posted = io_disarm_next(req);
+- if (posted)
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- if (posted)
+- io_cqring_ev_posted(ctx);
+-}
+-
+-static inline struct io_kiocb *io_req_find_next(struct io_kiocb *req)
+-{
+- struct io_kiocb *nxt;
+-
+- if (likely(!(req->flags & (REQ_F_LINK|REQ_F_HARDLINK))))
+- return NULL;
+- /*
+- * If LINK is set, we have dependent requests in this chain. If we
+- * didn't fail this request, queue the first one up, moving any other
+- * dependencies to the next request. In case of failure, fail the rest
+- * of the chain.
+- */
+- if (unlikely(req->flags & IO_DISARM_MASK))
+- __io_req_find_next_prep(req);
+- nxt = req->link;
+- req->link = NULL;
+- return nxt;
+-}
+-
+-static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
+-{
+- if (!ctx)
+- return;
+- if (*locked) {
+- io_submit_flush_completions(ctx);
+- mutex_unlock(&ctx->uring_lock);
+- *locked = false;
+- }
+- percpu_ref_put(&ctx->refs);
+-}
+-
+-static inline void ctx_commit_and_unlock(struct io_ring_ctx *ctx)
+-{
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- io_cqring_ev_posted(ctx);
+-}
+-
+-static void handle_prev_tw_list(struct io_wq_work_node *node,
+- struct io_ring_ctx **ctx, bool *uring_locked)
+-{
+- if (*ctx && !*uring_locked)
+- spin_lock(&(*ctx)->completion_lock);
+-
+- do {
+- struct io_wq_work_node *next = node->next;
+- struct io_kiocb *req = container_of(node, struct io_kiocb,
+- io_task_work.node);
+-
+- prefetch(container_of(next, struct io_kiocb, io_task_work.node));
+-
+- if (req->ctx != *ctx) {
+- if (unlikely(!*uring_locked && *ctx))
+- ctx_commit_and_unlock(*ctx);
+-
+- ctx_flush_and_put(*ctx, uring_locked);
+- *ctx = req->ctx;
+- /* if not contended, grab and improve batching */
+- *uring_locked = mutex_trylock(&(*ctx)->uring_lock);
+- percpu_ref_get(&(*ctx)->refs);
+- if (unlikely(!*uring_locked))
+- spin_lock(&(*ctx)->completion_lock);
+- }
+- if (likely(*uring_locked))
+- req->io_task_work.func(req, uring_locked);
+- else
+- __io_req_complete_post(req, req->result,
+- io_put_kbuf_comp(req));
+- node = next;
+- } while (node);
+-
+- if (unlikely(!*uring_locked))
+- ctx_commit_and_unlock(*ctx);
+-}
+-
+-static void handle_tw_list(struct io_wq_work_node *node,
+- struct io_ring_ctx **ctx, bool *locked)
+-{
+- do {
+- struct io_wq_work_node *next = node->next;
+- struct io_kiocb *req = container_of(node, struct io_kiocb,
+- io_task_work.node);
+-
+- prefetch(container_of(next, struct io_kiocb, io_task_work.node));
+-
+- if (req->ctx != *ctx) {
+- ctx_flush_and_put(*ctx, locked);
+- *ctx = req->ctx;
+- /* if not contended, grab and improve batching */
+- *locked = mutex_trylock(&(*ctx)->uring_lock);
+- percpu_ref_get(&(*ctx)->refs);
+- }
+- req->io_task_work.func(req, locked);
+- node = next;
+- } while (node);
+-}
+-
+-static void tctx_task_work(struct callback_head *cb)
+-{
+- bool uring_locked = false;
+- struct io_ring_ctx *ctx = NULL;
+- struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
+- task_work);
+-
+- while (1) {
+- struct io_wq_work_node *node1, *node2;
+-
+- if (!tctx->task_list.first &&
+- !tctx->prior_task_list.first && uring_locked)
+- io_submit_flush_completions(ctx);
+-
+- spin_lock_irq(&tctx->task_lock);
+- node1 = tctx->prior_task_list.first;
+- node2 = tctx->task_list.first;
+- INIT_WQ_LIST(&tctx->task_list);
+- INIT_WQ_LIST(&tctx->prior_task_list);
+- if (!node2 && !node1)
+- tctx->task_running = false;
+- spin_unlock_irq(&tctx->task_lock);
+- if (!node2 && !node1)
+- break;
+-
+- if (node1)
+- handle_prev_tw_list(node1, &ctx, &uring_locked);
+-
+- if (node2)
+- handle_tw_list(node2, &ctx, &uring_locked);
+- cond_resched();
+- }
+-
+- ctx_flush_and_put(ctx, &uring_locked);
+-
+- /* relaxed read is enough as only the task itself sets ->in_idle */
+- if (unlikely(atomic_read(&tctx->in_idle)))
+- io_uring_drop_tctx_refs(current);
+-}
+-
+-static void io_req_task_work_add(struct io_kiocb *req, bool priority)
+-{
+- struct task_struct *tsk = req->task;
+- struct io_uring_task *tctx = tsk->io_uring;
+- enum task_work_notify_mode notify;
+- struct io_wq_work_node *node;
+- unsigned long flags;
+- bool running;
+-
+- WARN_ON_ONCE(!tctx);
+-
+- spin_lock_irqsave(&tctx->task_lock, flags);
+- if (priority)
+- wq_list_add_tail(&req->io_task_work.node, &tctx->prior_task_list);
+- else
+- wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
+- running = tctx->task_running;
+- if (!running)
+- tctx->task_running = true;
+- spin_unlock_irqrestore(&tctx->task_lock, flags);
+-
+- /* task_work already pending, we're done */
+- if (running)
+- return;
+-
+- /*
+- * SQPOLL kernel thread doesn't need notification, just a wakeup. For
+- * all other cases, use TWA_SIGNAL unconditionally to ensure we're
+- * processing task_work. There's no reliable way to tell if TWA_RESUME
+- * will do the job.
+- */
+- notify = (req->ctx->flags & IORING_SETUP_SQPOLL) ? TWA_NONE : TWA_SIGNAL;
+- if (likely(!task_work_add(tsk, &tctx->task_work, notify))) {
+- if (notify == TWA_NONE)
+- wake_up_process(tsk);
+- return;
+- }
+-
+- spin_lock_irqsave(&tctx->task_lock, flags);
+- tctx->task_running = false;
+- node = wq_list_merge(&tctx->prior_task_list, &tctx->task_list);
+- spin_unlock_irqrestore(&tctx->task_lock, flags);
+-
+- while (node) {
+- req = container_of(node, struct io_kiocb, io_task_work.node);
+- node = node->next;
+- if (llist_add(&req->io_task_work.fallback_node,
+- &req->ctx->fallback_llist))
+- schedule_delayed_work(&req->ctx->fallback_work, 1);
+- }
+-}
+-
+-static void io_req_task_cancel(struct io_kiocb *req, bool *locked)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- /* not needed for normal modes, but SQPOLL depends on it */
+- io_tw_lock(ctx, locked);
+- io_req_complete_failed(req, req->result);
+-}
+-
+-static void io_req_task_submit(struct io_kiocb *req, bool *locked)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- io_tw_lock(ctx, locked);
+- /* req->task == current here, checking PF_EXITING is safe */
+- if (likely(!(req->task->flags & PF_EXITING)))
+- __io_queue_sqe(req);
+- else
+- io_req_complete_failed(req, -EFAULT);
+-}
+-
+-static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
+-{
+- req->result = ret;
+- req->io_task_work.func = io_req_task_cancel;
+- io_req_task_work_add(req, false);
+-}
+-
+-static void io_req_task_queue(struct io_kiocb *req)
+-{
+- req->io_task_work.func = io_req_task_submit;
+- io_req_task_work_add(req, false);
+-}
+-
+-static void io_req_task_queue_reissue(struct io_kiocb *req)
+-{
+- req->io_task_work.func = io_queue_async_work;
+- io_req_task_work_add(req, false);
+-}
+-
+-static inline void io_queue_next(struct io_kiocb *req)
+-{
+- struct io_kiocb *nxt = io_req_find_next(req);
+-
+- if (nxt)
+- io_req_task_queue(nxt);
+-}
+-
+-static void io_free_req(struct io_kiocb *req)
+-{
+- io_queue_next(req);
+- __io_free_req(req);
+-}
+-
+-static void io_free_req_work(struct io_kiocb *req, bool *locked)
+-{
+- io_free_req(req);
+-}
+-
+-static void io_free_batch_list(struct io_ring_ctx *ctx,
+- struct io_wq_work_node *node)
+- __must_hold(&ctx->uring_lock)
+-{
+- struct task_struct *task = NULL;
+- int task_refs = 0;
+-
+- do {
+- struct io_kiocb *req = container_of(node, struct io_kiocb,
+- comp_list);
+-
+- if (unlikely(req->flags & REQ_F_REFCOUNT)) {
+- node = req->comp_list.next;
+- if (!req_ref_put_and_test(req))
+- continue;
+- }
+-
+- io_req_put_rsrc_locked(req, ctx);
+- io_queue_next(req);
+- io_dismantle_req(req);
+-
+- if (req->task != task) {
+- if (task)
+- io_put_task(task, task_refs);
+- task = req->task;
+- task_refs = 0;
+- }
+- task_refs++;
+- node = req->comp_list.next;
+- wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
+- } while (node);
+-
+- if (task)
+- io_put_task(task, task_refs);
+-}
+-
+-static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
+- __must_hold(&ctx->uring_lock)
+-{
+- struct io_wq_work_node *node, *prev;
+- struct io_submit_state *state = &ctx->submit_state;
+-
+- if (state->flush_cqes) {
+- spin_lock(&ctx->completion_lock);
+- wq_list_for_each(node, prev, &state->compl_reqs) {
+- struct io_kiocb *req = container_of(node, struct io_kiocb,
+- comp_list);
+-
+- if (!(req->flags & REQ_F_CQE_SKIP))
+- __io_fill_cqe_req(req, req->result, req->cflags);
+- if ((req->flags & REQ_F_POLLED) && req->apoll) {
+- struct async_poll *apoll = req->apoll;
+-
+- if (apoll->double_poll)
+- kfree(apoll->double_poll);
+- list_add(&apoll->poll.wait.entry,
+- &ctx->apoll_cache);
+- req->flags &= ~REQ_F_POLLED;
+- }
+- }
+-
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- io_cqring_ev_posted(ctx);
+- state->flush_cqes = false;
+- }
+-
+- io_free_batch_list(ctx, state->compl_reqs.first);
+- INIT_WQ_LIST(&state->compl_reqs);
+-}
+-
+-/*
+- * Drop reference to request, return next in chain (if there is one) if this
+- * was the last reference to this request.
+- */
+-static inline struct io_kiocb *io_put_req_find_next(struct io_kiocb *req)
+-{
+- struct io_kiocb *nxt = NULL;
+-
+- if (req_ref_put_and_test(req)) {
+- nxt = io_req_find_next(req);
+- __io_free_req(req);
+- }
+- return nxt;
+-}
+-
+-static inline void io_put_req(struct io_kiocb *req)
+-{
+- if (req_ref_put_and_test(req))
+- io_free_req(req);
+-}
+-
+-static inline void io_put_req_deferred(struct io_kiocb *req)
+-{
+- if (req_ref_put_and_test(req)) {
+- req->io_task_work.func = io_free_req_work;
+- io_req_task_work_add(req, false);
+- }
+-}
+-
+-static unsigned io_cqring_events(struct io_ring_ctx *ctx)
+-{
+- /* See comment at the top of this file */
+- smp_rmb();
+- return __io_cqring_events(ctx);
+-}
+-
+-static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
+-{
+- struct io_rings *rings = ctx->rings;
+-
+- /* make sure SQ entry isn't read before tail */
+- return smp_load_acquire(&rings->sq.tail) - ctx->cached_sq_head;
+-}
+-
+-static inline bool io_run_task_work(void)
+-{
+- if (test_thread_flag(TIF_NOTIFY_SIGNAL) || task_work_pending(current)) {
+- __set_current_state(TASK_RUNNING);
+- clear_notify_signal();
+- if (task_work_pending(current))
+- task_work_run();
+- return true;
+- }
+-
+- return false;
+-}
+-
+-static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
+-{
+- struct io_wq_work_node *pos, *start, *prev;
+- unsigned int poll_flags = BLK_POLL_NOSLEEP;
+- DEFINE_IO_COMP_BATCH(iob);
+- int nr_events = 0;
+-
+- /*
+- * Only spin for completions if we don't have multiple devices hanging
+- * off our complete list.
+- */
+- if (ctx->poll_multi_queue || force_nonspin)
+- poll_flags |= BLK_POLL_ONESHOT;
+-
+- wq_list_for_each(pos, start, &ctx->iopoll_list) {
+- struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
+- struct kiocb *kiocb = &req->rw.kiocb;
+- int ret;
+-
+- /*
+- * Move completed and retryable entries to our local lists.
+- * If we find a request that requires polling, break out
+- * and complete those lists first, if we have entries there.
+- */
+- if (READ_ONCE(req->iopoll_completed))
+- break;
+-
+- ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob, poll_flags);
+- if (unlikely(ret < 0))
+- return ret;
+- else if (ret)
+- poll_flags |= BLK_POLL_ONESHOT;
+-
+- /* iopoll may have completed current req */
+- if (!rq_list_empty(iob.req_list) ||
+- READ_ONCE(req->iopoll_completed))
+- break;
+- }
+-
+- if (!rq_list_empty(iob.req_list))
+- iob.complete(&iob);
+- else if (!pos)
+- return 0;
+-
+- prev = start;
+- wq_list_for_each_resume(pos, prev) {
+- struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
+-
+- /* order with io_complete_rw_iopoll(), e.g. ->result updates */
+- if (!smp_load_acquire(&req->iopoll_completed))
+- break;
+- nr_events++;
+- if (unlikely(req->flags & REQ_F_CQE_SKIP))
+- continue;
+- __io_fill_cqe_req(req, req->result, io_put_kbuf(req, 0));
+- }
+-
+- if (unlikely(!nr_events))
+- return 0;
+-
+- io_commit_cqring(ctx);
+- io_cqring_ev_posted_iopoll(ctx);
+- pos = start ? start->next : ctx->iopoll_list.first;
+- wq_list_cut(&ctx->iopoll_list, prev, start);
+- io_free_batch_list(ctx, pos);
+- return nr_events;
+-}
+-
+-/*
+- * We can't just wait for polled events to come to us, we have to actively
+- * find and complete them.
+- */
+-static __cold void io_iopoll_try_reap_events(struct io_ring_ctx *ctx)
+-{
+- if (!(ctx->flags & IORING_SETUP_IOPOLL))
+- return;
+-
+- mutex_lock(&ctx->uring_lock);
+- while (!wq_list_empty(&ctx->iopoll_list)) {
+- /* let it sleep and repeat later if can't complete a request */
+- if (io_do_iopoll(ctx, true) == 0)
+- break;
+- /*
+- * Ensure we allow local-to-the-cpu processing to take place,
+- * in this case we need to ensure that we reap all events.
+- * Also let task_work, etc. to progress by releasing the mutex
+- */
+- if (need_resched()) {
+- mutex_unlock(&ctx->uring_lock);
+- cond_resched();
+- mutex_lock(&ctx->uring_lock);
+- }
+- }
+- mutex_unlock(&ctx->uring_lock);
+-}
+-
+-static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
+-{
+- unsigned int nr_events = 0;
+- int ret = 0;
+-
+- /*
+- * We disallow the app entering submit/complete with polling, but we
+- * still need to lock the ring to prevent racing with polled issue
+- * that got punted to a workqueue.
+- */
+- mutex_lock(&ctx->uring_lock);
+- /*
+- * Don't enter poll loop if we already have events pending.
+- * If we do, we can potentially be spinning for commands that
+- * already triggered a CQE (eg in error).
+- */
+- if (test_bit(0, &ctx->check_cq_overflow))
+- __io_cqring_overflow_flush(ctx, false);
+- if (io_cqring_events(ctx))
+- goto out;
+- do {
+- /*
+- * If a submit got punted to a workqueue, we can have the
+- * application entering polling for a command before it gets
+- * issued. That app will hold the uring_lock for the duration
+- * of the poll right here, so we need to take a breather every
+- * now and then to ensure that the issue has a chance to add
+- * the poll to the issued list. Otherwise we can spin here
+- * forever, while the workqueue is stuck trying to acquire the
+- * very same mutex.
+- */
+- if (wq_list_empty(&ctx->iopoll_list)) {
+- u32 tail = ctx->cached_cq_tail;
+-
+- mutex_unlock(&ctx->uring_lock);
+- io_run_task_work();
+- mutex_lock(&ctx->uring_lock);
+-
+- /* some requests don't go through iopoll_list */
+- if (tail != ctx->cached_cq_tail ||
+- wq_list_empty(&ctx->iopoll_list))
+- break;
+- }
+- ret = io_do_iopoll(ctx, !min);
+- if (ret < 0)
+- break;
+- nr_events += ret;
+- ret = 0;
+- } while (nr_events < min && !need_resched());
+-out:
+- mutex_unlock(&ctx->uring_lock);
+- return ret;
+-}
+-
+-static void kiocb_end_write(struct io_kiocb *req)
+-{
+- /*
+- * Tell lockdep we inherited freeze protection from submission
+- * thread.
+- */
+- if (req->flags & REQ_F_ISREG) {
+- struct super_block *sb = file_inode(req->file)->i_sb;
+-
+- __sb_writers_acquired(sb, SB_FREEZE_WRITE);
+- sb_end_write(sb);
+- }
+-}
+-
+-#ifdef CONFIG_BLOCK
+-static bool io_resubmit_prep(struct io_kiocb *req)
+-{
+- struct io_async_rw *rw = req->async_data;
+-
+- if (!req_has_async_data(req))
+- return !io_req_prep_async(req);
+- iov_iter_restore(&rw->s.iter, &rw->s.iter_state);
+- return true;
+-}
+-
+-static bool io_rw_should_reissue(struct io_kiocb *req)
+-{
+- umode_t mode = file_inode(req->file)->i_mode;
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- if (!S_ISBLK(mode) && !S_ISREG(mode))
+- return false;
+- if ((req->flags & REQ_F_NOWAIT) || (io_wq_current_is_worker() &&
+- !(ctx->flags & IORING_SETUP_IOPOLL)))
+- return false;
+- /*
+- * If ref is dying, we might be running poll reap from the exit work.
+- * Don't attempt to reissue from that path, just let it fail with
+- * -EAGAIN.
+- */
+- if (percpu_ref_is_dying(&ctx->refs))
+- return false;
+- /*
+- * Play it safe and assume not safe to re-import and reissue if we're
+- * not in the original thread group (or in task context).
+- */
+- if (!same_thread_group(req->task, current) || !in_task())
+- return false;
+- return true;
+-}
+-#else
+-static bool io_resubmit_prep(struct io_kiocb *req)
+-{
+- return false;
+-}
+-static bool io_rw_should_reissue(struct io_kiocb *req)
+-{
+- return false;
+-}
+-#endif
+-
+-static bool __io_complete_rw_common(struct io_kiocb *req, long res)
+-{
+- if (req->rw.kiocb.ki_flags & IOCB_WRITE) {
+- kiocb_end_write(req);
+- fsnotify_modify(req->file);
+- } else {
+- fsnotify_access(req->file);
+- }
+- if (unlikely(res != req->result)) {
+- if ((res == -EAGAIN || res == -EOPNOTSUPP) &&
+- io_rw_should_reissue(req)) {
+- req->flags |= REQ_F_REISSUE;
+- return true;
+- }
+- req_set_fail(req);
+- req->result = res;
+- }
+- return false;
+-}
+-
+-static inline void io_req_task_complete(struct io_kiocb *req, bool *locked)
+-{
+- int res = req->result;
+-
+- if (*locked) {
+- io_req_complete_state(req, res, io_put_kbuf(req, 0));
+- io_req_add_compl_list(req);
+- } else {
+- io_req_complete_post(req, res,
+- io_put_kbuf(req, IO_URING_F_UNLOCKED));
+- }
+-}
+-
+-static void __io_complete_rw(struct io_kiocb *req, long res,
+- unsigned int issue_flags)
+-{
+- if (__io_complete_rw_common(req, res))
+- return;
+- __io_req_complete(req, issue_flags, req->result,
+- io_put_kbuf(req, issue_flags));
+-}
+-
+-static void io_complete_rw(struct kiocb *kiocb, long res)
+-{
+- struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
+-
+- if (__io_complete_rw_common(req, res))
+- return;
+- req->result = res;
+- req->io_task_work.func = io_req_task_complete;
+- io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
+-}
+-
+-static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
+-{
+- struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
+-
+- if (kiocb->ki_flags & IOCB_WRITE)
+- kiocb_end_write(req);
+- if (unlikely(res != req->result)) {
+- if (res == -EAGAIN && io_rw_should_reissue(req)) {
+- req->flags |= REQ_F_REISSUE;
+- return;
+- }
+- req->result = res;
+- }
+-
+- /* order with io_iopoll_complete() checking ->iopoll_completed */
+- smp_store_release(&req->iopoll_completed, 1);
+-}
+-
+-/*
+- * After the iocb has been issued, it's safe to be found on the poll list.
+- * Adding the kiocb to the list AFTER submission ensures that we don't
+- * find it from a io_do_iopoll() thread before the issuer is done
+- * accessing the kiocb cookie.
+- */
+-static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- const bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+-
+- /* workqueue context doesn't hold uring_lock, grab it now */
+- if (unlikely(needs_lock))
+- mutex_lock(&ctx->uring_lock);
+-
+- /*
+- * Track whether we have multiple files in our lists. This will impact
+- * how we do polling eventually, not spinning if we're on potentially
+- * different devices.
+- */
+- if (wq_list_empty(&ctx->iopoll_list)) {
+- ctx->poll_multi_queue = false;
+- } else if (!ctx->poll_multi_queue) {
+- struct io_kiocb *list_req;
+-
+- list_req = container_of(ctx->iopoll_list.first, struct io_kiocb,
+- comp_list);
+- if (list_req->file != req->file)
+- ctx->poll_multi_queue = true;
+- }
+-
+- /*
+- * For fast devices, IO may have already completed. If it has, add
+- * it to the front so we find it first.
+- */
+- if (READ_ONCE(req->iopoll_completed))
+- wq_list_add_head(&req->comp_list, &ctx->iopoll_list);
+- else
+- wq_list_add_tail(&req->comp_list, &ctx->iopoll_list);
+-
+- if (unlikely(needs_lock)) {
+- /*
+- * If IORING_SETUP_SQPOLL is enabled, sqes are either handle
+- * in sq thread task context or in io worker task context. If
+- * current task context is sq thread, we don't need to check
+- * whether should wake up sq thread.
+- */
+- if ((ctx->flags & IORING_SETUP_SQPOLL) &&
+- wq_has_sleeper(&ctx->sq_data->wait))
+- wake_up(&ctx->sq_data->wait);
+-
+- mutex_unlock(&ctx->uring_lock);
+- }
+-}
+-
+-static bool io_bdev_nowait(struct block_device *bdev)
+-{
+- return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
+-}
+-
+-/*
+- * If we tracked the file through the SCM inflight mechanism, we could support
+- * any file. For now, just ensure that anything potentially problematic is done
+- * inline.
+- */
+-static bool __io_file_supports_nowait(struct file *file, umode_t mode)
+-{
+- if (S_ISBLK(mode)) {
+- if (IS_ENABLED(CONFIG_BLOCK) &&
+- io_bdev_nowait(I_BDEV(file->f_mapping->host)))
+- return true;
+- return false;
+- }
+- if (S_ISSOCK(mode))
+- return true;
+- if (S_ISREG(mode)) {
+- if (IS_ENABLED(CONFIG_BLOCK) &&
+- io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
+- file->f_op != &io_uring_fops)
+- return true;
+- return false;
+- }
+-
+- /* any ->read/write should understand O_NONBLOCK */
+- if (file->f_flags & O_NONBLOCK)
+- return true;
+- return file->f_mode & FMODE_NOWAIT;
+-}
+-
+-/*
+- * If we tracked the file through the SCM inflight mechanism, we could support
+- * any file. For now, just ensure that anything potentially problematic is done
+- * inline.
+- */
+-static unsigned int io_file_get_flags(struct file *file)
+-{
+- umode_t mode = file_inode(file)->i_mode;
+- unsigned int res = 0;
+-
+- if (S_ISREG(mode))
+- res |= FFS_ISREG;
+- if (__io_file_supports_nowait(file, mode))
+- res |= FFS_NOWAIT;
+- return res;
+-}
+-
+-static inline bool io_file_supports_nowait(struct io_kiocb *req)
+-{
+- return req->flags & REQ_F_SUPPORT_NOWAIT;
+-}
+-
+-static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct kiocb *kiocb = &req->rw.kiocb;
+- unsigned ioprio;
+- int ret;
+-
+- kiocb->ki_pos = READ_ONCE(sqe->off);
+- /* used for fixed read/write too - just read unconditionally */
+- req->buf_index = READ_ONCE(sqe->buf_index);
+- req->imu = NULL;
+-
+- if (req->opcode == IORING_OP_READ_FIXED ||
+- req->opcode == IORING_OP_WRITE_FIXED) {
+- struct io_ring_ctx *ctx = req->ctx;
+- u16 index;
+-
+- if (unlikely(req->buf_index >= ctx->nr_user_bufs))
+- return -EFAULT;
+- index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
+- req->imu = ctx->user_bufs[index];
+- io_req_set_rsrc_node(req, ctx, 0);
+- }
+-
+- ioprio = READ_ONCE(sqe->ioprio);
+- if (ioprio) {
+- ret = ioprio_check_cap(ioprio);
+- if (ret)
+- return ret;
+-
+- kiocb->ki_ioprio = ioprio;
+- } else {
+- kiocb->ki_ioprio = get_current_ioprio();
+- }
+-
+- req->rw.addr = READ_ONCE(sqe->addr);
+- req->rw.len = READ_ONCE(sqe->len);
+- req->rw.flags = READ_ONCE(sqe->rw_flags);
+- return 0;
+-}
+-
+-static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
+-{
+- switch (ret) {
+- case -EIOCBQUEUED:
+- break;
+- case -ERESTARTSYS:
+- case -ERESTARTNOINTR:
+- case -ERESTARTNOHAND:
+- case -ERESTART_RESTARTBLOCK:
+- /*
+- * We can't just restart the syscall, since previously
+- * submitted sqes may already be in progress. Just fail this
+- * IO with EINTR.
+- */
+- ret = -EINTR;
+- fallthrough;
+- default:
+- kiocb->ki_complete(kiocb, ret);
+- }
+-}
+-
+-static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
+-{
+- struct kiocb *kiocb = &req->rw.kiocb;
+-
+- if (kiocb->ki_pos != -1)
+- return &kiocb->ki_pos;
+-
+- if (!(req->file->f_mode & FMODE_STREAM)) {
+- req->flags |= REQ_F_CUR_POS;
+- kiocb->ki_pos = req->file->f_pos;
+- return &kiocb->ki_pos;
+- }
+-
+- kiocb->ki_pos = 0;
+- return NULL;
+-}
+-
+-static void kiocb_done(struct io_kiocb *req, ssize_t ret,
+- unsigned int issue_flags)
+-{
+- struct io_async_rw *io = req->async_data;
+-
+- /* add previously done IO, if any */
+- if (req_has_async_data(req) && io->bytes_done > 0) {
+- if (ret < 0)
+- ret = io->bytes_done;
+- else
+- ret += io->bytes_done;
+- }
+-
+- if (req->flags & REQ_F_CUR_POS)
+- req->file->f_pos = req->rw.kiocb.ki_pos;
+- if (ret >= 0 && (req->rw.kiocb.ki_complete == io_complete_rw))
+- __io_complete_rw(req, ret, issue_flags);
+- else
+- io_rw_done(&req->rw.kiocb, ret);
+-
+- if (req->flags & REQ_F_REISSUE) {
+- req->flags &= ~REQ_F_REISSUE;
+- if (io_resubmit_prep(req))
+- io_req_task_queue_reissue(req);
+- else
+- io_req_task_queue_fail(req, ret);
+- }
+-}
+-
+-static int __io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
+- struct io_mapped_ubuf *imu)
+-{
+- size_t len = req->rw.len;
+- u64 buf_end, buf_addr = req->rw.addr;
+- size_t offset;
+-
+- if (unlikely(check_add_overflow(buf_addr, (u64)len, &buf_end)))
+- return -EFAULT;
+- /* not inside the mapped region */
+- if (unlikely(buf_addr < imu->ubuf || buf_end > imu->ubuf_end))
+- return -EFAULT;
+-
+- /*
+- * May not be a start of buffer, set size appropriately
+- * and advance us to the beginning.
+- */
+- offset = buf_addr - imu->ubuf;
+- iov_iter_bvec(iter, rw, imu->bvec, imu->nr_bvecs, offset + len);
+-
+- if (offset) {
+- /*
+- * Don't use iov_iter_advance() here, as it's really slow for
+- * using the latter parts of a big fixed buffer - it iterates
+- * over each segment manually. We can cheat a bit here, because
+- * we know that:
+- *
+- * 1) it's a BVEC iter, we set it up
+- * 2) all bvecs are PAGE_SIZE in size, except potentially the
+- * first and last bvec
+- *
+- * So just find our index, and adjust the iterator afterwards.
+- * If the offset is within the first bvec (or the whole first
+- * bvec, just use iov_iter_advance(). This makes it easier
+- * since we can just skip the first segment, which may not
+- * be PAGE_SIZE aligned.
+- */
+- const struct bio_vec *bvec = imu->bvec;
+-
+- if (offset <= bvec->bv_len) {
+- iov_iter_advance(iter, offset);
+- } else {
+- unsigned long seg_skip;
+-
+- /* skip first vec */
+- offset -= bvec->bv_len;
+- seg_skip = 1 + (offset >> PAGE_SHIFT);
+-
+- iter->bvec = bvec + seg_skip;
+- iter->nr_segs -= seg_skip;
+- iter->count -= bvec->bv_len + offset;
+- iter->iov_offset = offset & ~PAGE_MASK;
+- }
+- }
+-
+- return 0;
+-}
+-
+-static int io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
+- unsigned int issue_flags)
+-{
+- if (WARN_ON_ONCE(!req->imu))
+- return -EFAULT;
+- return __io_import_fixed(req, rw, iter, req->imu);
+-}
+-
+-static void io_ring_submit_unlock(struct io_ring_ctx *ctx, bool needs_lock)
+-{
+- if (needs_lock)
+- mutex_unlock(&ctx->uring_lock);
+-}
+-
+-static void io_ring_submit_lock(struct io_ring_ctx *ctx, bool needs_lock)
+-{
+- /*
+- * "Normal" inline submissions always hold the uring_lock, since we
+- * grab it from the system call. Same is true for the SQPOLL offload.
+- * The only exception is when we've detached the request and issue it
+- * from an async worker thread, grab the lock for that case.
+- */
+- if (needs_lock)
+- mutex_lock(&ctx->uring_lock);
+-}
+-
+-static void io_buffer_add_list(struct io_ring_ctx *ctx,
+- struct io_buffer_list *bl, unsigned int bgid)
+-{
+- struct list_head *list;
+-
+- list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
+- INIT_LIST_HEAD(&bl->buf_list);
+- bl->bgid = bgid;
+- list_add(&bl->list, list);
+-}
+-
+-static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
+- int bgid, unsigned int issue_flags)
+-{
+- struct io_buffer *kbuf = req->kbuf;
+- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_buffer_list *bl;
+-
+- if (req->flags & REQ_F_BUFFER_SELECTED)
+- return kbuf;
+-
+- io_ring_submit_lock(ctx, needs_lock);
+-
+- lockdep_assert_held(&ctx->uring_lock);
+-
+- bl = io_buffer_get_list(ctx, bgid);
+- if (bl && !list_empty(&bl->buf_list)) {
+- kbuf = list_first_entry(&bl->buf_list, struct io_buffer, list);
+- list_del(&kbuf->list);
+- if (*len > kbuf->len)
+- *len = kbuf->len;
+- req->flags |= REQ_F_BUFFER_SELECTED;
+- req->kbuf = kbuf;
+- } else {
+- kbuf = ERR_PTR(-ENOBUFS);
+- }
+-
+- io_ring_submit_unlock(req->ctx, needs_lock);
+- return kbuf;
+-}
+-
+-static void __user *io_rw_buffer_select(struct io_kiocb *req, size_t *len,
+- unsigned int issue_flags)
+-{
+- struct io_buffer *kbuf;
+- u16 bgid;
+-
+- bgid = req->buf_index;
+- kbuf = io_buffer_select(req, len, bgid, issue_flags);
+- if (IS_ERR(kbuf))
+- return kbuf;
+- return u64_to_user_ptr(kbuf->addr);
+-}
+-
+-#ifdef CONFIG_COMPAT
+-static ssize_t io_compat_import(struct io_kiocb *req, struct iovec *iov,
+- unsigned int issue_flags)
+-{
+- struct compat_iovec __user *uiov;
+- compat_ssize_t clen;
+- void __user *buf;
+- ssize_t len;
+-
+- uiov = u64_to_user_ptr(req->rw.addr);
+- if (!access_ok(uiov, sizeof(*uiov)))
+- return -EFAULT;
+- if (__get_user(clen, &uiov->iov_len))
+- return -EFAULT;
+- if (clen < 0)
+- return -EINVAL;
+-
+- len = clen;
+- buf = io_rw_buffer_select(req, &len, issue_flags);
+- if (IS_ERR(buf))
+- return PTR_ERR(buf);
+- iov[0].iov_base = buf;
+- iov[0].iov_len = (compat_size_t) len;
+- return 0;
+-}
+-#endif
+-
+-static ssize_t __io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
+- unsigned int issue_flags)
+-{
+- struct iovec __user *uiov = u64_to_user_ptr(req->rw.addr);
+- void __user *buf;
+- ssize_t len;
+-
+- if (copy_from_user(iov, uiov, sizeof(*uiov)))
+- return -EFAULT;
+-
+- len = iov[0].iov_len;
+- if (len < 0)
+- return -EINVAL;
+- buf = io_rw_buffer_select(req, &len, issue_flags);
+- if (IS_ERR(buf))
+- return PTR_ERR(buf);
+- iov[0].iov_base = buf;
+- iov[0].iov_len = len;
+- return 0;
+-}
+-
+-static ssize_t io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
+- unsigned int issue_flags)
+-{
+- if (req->flags & REQ_F_BUFFER_SELECTED) {
+- struct io_buffer *kbuf = req->kbuf;
+-
+- iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
+- iov[0].iov_len = kbuf->len;
+- return 0;
+- }
+- if (req->rw.len != 1)
+- return -EINVAL;
+-
+-#ifdef CONFIG_COMPAT
+- if (req->ctx->compat)
+- return io_compat_import(req, iov, issue_flags);
+-#endif
+-
+- return __io_iov_buffer_select(req, iov, issue_flags);
+-}
+-
+-static inline bool io_do_buffer_select(struct io_kiocb *req)
+-{
+- if (!(req->flags & REQ_F_BUFFER_SELECT))
+- return false;
+- return !(req->flags & REQ_F_BUFFER_SELECTED);
+-}
+-
+-static struct iovec *__io_import_iovec(int rw, struct io_kiocb *req,
+- struct io_rw_state *s,
+- unsigned int issue_flags)
+-{
+- struct iov_iter *iter = &s->iter;
+- u8 opcode = req->opcode;
+- struct iovec *iovec;
+- void __user *buf;
+- size_t sqe_len;
+- ssize_t ret;
+-
+- if (opcode == IORING_OP_READ_FIXED || opcode == IORING_OP_WRITE_FIXED) {
+- ret = io_import_fixed(req, rw, iter, issue_flags);
+- if (ret)
+- return ERR_PTR(ret);
+- return NULL;
+- }
+-
+- /* buffer index only valid with fixed read/write, or buffer select */
+- if (unlikely(req->buf_index && !(req->flags & REQ_F_BUFFER_SELECT)))
+- return ERR_PTR(-EINVAL);
+-
+- buf = u64_to_user_ptr(req->rw.addr);
+- sqe_len = req->rw.len;
+-
+- if (opcode == IORING_OP_READ || opcode == IORING_OP_WRITE) {
+- if (req->flags & REQ_F_BUFFER_SELECT) {
+- buf = io_rw_buffer_select(req, &sqe_len, issue_flags);
+- if (IS_ERR(buf))
+- return ERR_CAST(buf);
+- req->rw.len = sqe_len;
+- }
+-
+- ret = import_single_range(rw, buf, sqe_len, s->fast_iov, iter);
+- if (ret)
+- return ERR_PTR(ret);
+- return NULL;
+- }
+-
+- iovec = s->fast_iov;
+- if (req->flags & REQ_F_BUFFER_SELECT) {
+- ret = io_iov_buffer_select(req, iovec, issue_flags);
+- if (ret)
+- return ERR_PTR(ret);
+- iov_iter_init(iter, rw, iovec, 1, iovec->iov_len);
+- return NULL;
+- }
+-
+- ret = __import_iovec(rw, buf, sqe_len, UIO_FASTIOV, &iovec, iter,
+- req->ctx->compat);
+- if (unlikely(ret < 0))
+- return ERR_PTR(ret);
+- return iovec;
+-}
+-
+-static inline int io_import_iovec(int rw, struct io_kiocb *req,
+- struct iovec **iovec, struct io_rw_state *s,
+- unsigned int issue_flags)
+-{
+- *iovec = __io_import_iovec(rw, req, s, issue_flags);
+- if (unlikely(IS_ERR(*iovec)))
+- return PTR_ERR(*iovec);
+-
+- iov_iter_save_state(&s->iter, &s->iter_state);
+- return 0;
+-}
+-
+-static inline loff_t *io_kiocb_ppos(struct kiocb *kiocb)
+-{
+- return (kiocb->ki_filp->f_mode & FMODE_STREAM) ? NULL : &kiocb->ki_pos;
+-}
+-
+-/*
+- * For files that don't have ->read_iter() and ->write_iter(), handle them
+- * by looping over ->read() or ->write() manually.
+- */
+-static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
+-{
+- struct kiocb *kiocb = &req->rw.kiocb;
+- struct file *file = req->file;
+- ssize_t ret = 0;
+- loff_t *ppos;
+-
+- /*
+- * Don't support polled IO through this interface, and we can't
+- * support non-blocking either. For the latter, this just causes
+- * the kiocb to be handled from an async context.
+- */
+- if (kiocb->ki_flags & IOCB_HIPRI)
+- return -EOPNOTSUPP;
+- if ((kiocb->ki_flags & IOCB_NOWAIT) &&
+- !(kiocb->ki_filp->f_flags & O_NONBLOCK))
+- return -EAGAIN;
+-
+- ppos = io_kiocb_ppos(kiocb);
+-
+- while (iov_iter_count(iter)) {
+- struct iovec iovec;
+- ssize_t nr;
+-
+- if (!iov_iter_is_bvec(iter)) {
+- iovec = iov_iter_iovec(iter);
+- } else {
+- iovec.iov_base = u64_to_user_ptr(req->rw.addr);
+- iovec.iov_len = req->rw.len;
+- }
+-
+- if (rw == READ) {
+- nr = file->f_op->read(file, iovec.iov_base,
+- iovec.iov_len, ppos);
+- } else {
+- nr = file->f_op->write(file, iovec.iov_base,
+- iovec.iov_len, ppos);
+- }
+-
+- if (nr < 0) {
+- if (!ret)
+- ret = nr;
+- break;
+- }
+- ret += nr;
+- if (!iov_iter_is_bvec(iter)) {
+- iov_iter_advance(iter, nr);
+- } else {
+- req->rw.addr += nr;
+- req->rw.len -= nr;
+- if (!req->rw.len)
+- break;
+- }
+- if (nr != iovec.iov_len)
+- break;
+- }
+-
+- return ret;
+-}
+-
+-static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
+- const struct iovec *fast_iov, struct iov_iter *iter)
+-{
+- struct io_async_rw *rw = req->async_data;
+-
+- memcpy(&rw->s.iter, iter, sizeof(*iter));
+- rw->free_iovec = iovec;
+- rw->bytes_done = 0;
+- /* can only be fixed buffers, no need to do anything */
+- if (iov_iter_is_bvec(iter))
+- return;
+- if (!iovec) {
+- unsigned iov_off = 0;
+-
+- rw->s.iter.iov = rw->s.fast_iov;
+- if (iter->iov != fast_iov) {
+- iov_off = iter->iov - fast_iov;
+- rw->s.iter.iov += iov_off;
+- }
+- if (rw->s.fast_iov != fast_iov)
+- memcpy(rw->s.fast_iov + iov_off, fast_iov + iov_off,
+- sizeof(struct iovec) * iter->nr_segs);
+- } else {
+- req->flags |= REQ_F_NEED_CLEANUP;
+- }
+-}
+-
+-static inline bool io_alloc_async_data(struct io_kiocb *req)
+-{
+- WARN_ON_ONCE(!io_op_defs[req->opcode].async_size);
+- req->async_data = kmalloc(io_op_defs[req->opcode].async_size, GFP_KERNEL);
+- if (req->async_data) {
+- req->flags |= REQ_F_ASYNC_DATA;
+- return false;
+- }
+- return true;
+-}
+-
+-static int io_setup_async_rw(struct io_kiocb *req, const struct iovec *iovec,
+- struct io_rw_state *s, bool force)
+-{
+- if (!force && !io_op_defs[req->opcode].needs_async_setup)
+- return 0;
+- if (!req_has_async_data(req)) {
+- struct io_async_rw *iorw;
+-
+- if (io_alloc_async_data(req)) {
+- kfree(iovec);
+- return -ENOMEM;
+- }
+-
+- io_req_map_rw(req, iovec, s->fast_iov, &s->iter);
+- iorw = req->async_data;
+- /* we've copied and mapped the iter, ensure state is saved */
+- iov_iter_save_state(&iorw->s.iter, &iorw->s.iter_state);
+- }
+- return 0;
+-}
+-
+-static inline int io_rw_prep_async(struct io_kiocb *req, int rw)
+-{
+- struct io_async_rw *iorw = req->async_data;
+- struct iovec *iov;
+- int ret;
+-
+- /* submission path, ->uring_lock should already be taken */
+- ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
+- if (unlikely(ret < 0))
+- return ret;
+-
+- iorw->bytes_done = 0;
+- iorw->free_iovec = iov;
+- if (iov)
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-/*
+- * This is our waitqueue callback handler, registered through __folio_lock_async()
+- * when we initially tried to do the IO with the iocb armed our waitqueue.
+- * This gets called when the page is unlocked, and we generally expect that to
+- * happen when the page IO is completed and the page is now uptodate. This will
+- * queue a task_work based retry of the operation, attempting to copy the data
+- * again. If the latter fails because the page was NOT uptodate, then we will
+- * do a thread based blocking retry of the operation. That's the unexpected
+- * slow path.
+- */
+-static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
+- int sync, void *arg)
+-{
+- struct wait_page_queue *wpq;
+- struct io_kiocb *req = wait->private;
+- struct wait_page_key *key = arg;
+-
+- wpq = container_of(wait, struct wait_page_queue, wait);
+-
+- if (!wake_page_match(wpq, key))
+- return 0;
+-
+- req->rw.kiocb.ki_flags &= ~IOCB_WAITQ;
+- list_del_init(&wait->entry);
+- io_req_task_queue(req);
+- return 1;
+-}
+-
+-/*
+- * This controls whether a given IO request should be armed for async page
+- * based retry. If we return false here, the request is handed to the async
+- * worker threads for retry. If we're doing buffered reads on a regular file,
+- * we prepare a private wait_page_queue entry and retry the operation. This
+- * will either succeed because the page is now uptodate and unlocked, or it
+- * will register a callback when the page is unlocked at IO completion. Through
+- * that callback, io_uring uses task_work to setup a retry of the operation.
+- * That retry will attempt the buffered read again. The retry will generally
+- * succeed, or in rare cases where it fails, we then fall back to using the
+- * async worker threads for a blocking retry.
+- */
+-static bool io_rw_should_retry(struct io_kiocb *req)
+-{
+- struct io_async_rw *rw = req->async_data;
+- struct wait_page_queue *wait = &rw->wpq;
+- struct kiocb *kiocb = &req->rw.kiocb;
+-
+- /* never retry for NOWAIT, we just complete with -EAGAIN */
+- if (req->flags & REQ_F_NOWAIT)
+- return false;
+-
+- /* Only for buffered IO */
+- if (kiocb->ki_flags & (IOCB_DIRECT | IOCB_HIPRI))
+- return false;
+-
+- /*
+- * just use poll if we can, and don't attempt if the fs doesn't
+- * support callback based unlocks
+- */
+- if (file_can_poll(req->file) || !(req->file->f_mode & FMODE_BUF_RASYNC))
+- return false;
+-
+- wait->wait.func = io_async_buf_func;
+- wait->wait.private = req;
+- wait->wait.flags = 0;
+- INIT_LIST_HEAD(&wait->wait.entry);
+- kiocb->ki_flags |= IOCB_WAITQ;
+- kiocb->ki_flags &= ~IOCB_NOWAIT;
+- kiocb->ki_waitq = wait;
+- return true;
+-}
+-
+-static inline int io_iter_do_read(struct io_kiocb *req, struct iov_iter *iter)
+-{
+- if (likely(req->file->f_op->read_iter))
+- return call_read_iter(req->file, &req->rw.kiocb, iter);
+- else if (req->file->f_op->read)
+- return loop_rw_iter(READ, req, iter);
+- else
+- return -EINVAL;
+-}
+-
+-static bool need_read_all(struct io_kiocb *req)
+-{
+- return req->flags & REQ_F_ISREG ||
+- S_ISBLK(file_inode(req->file)->i_mode);
+-}
+-
+-static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
+-{
+- struct kiocb *kiocb = &req->rw.kiocb;
+- struct io_ring_ctx *ctx = req->ctx;
+- struct file *file = req->file;
+- int ret;
+-
+- if (unlikely(!file || !(file->f_mode & mode)))
+- return -EBADF;
+-
+- if (!io_req_ffs_set(req))
+- req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
+-
+- kiocb->ki_flags = iocb_flags(file);
+- ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
+- if (unlikely(ret))
+- return ret;
+-
+- /*
+- * If the file is marked O_NONBLOCK, still allow retry for it if it
+- * supports async. Otherwise it's impossible to use O_NONBLOCK files
+- * reliably. If not, or it IOCB_NOWAIT is set, don't retry.
+- */
+- if ((kiocb->ki_flags & IOCB_NOWAIT) ||
+- ((file->f_flags & O_NONBLOCK) && !io_file_supports_nowait(req)))
+- req->flags |= REQ_F_NOWAIT;
+-
+- if (ctx->flags & IORING_SETUP_IOPOLL) {
+- if (!(kiocb->ki_flags & IOCB_DIRECT) || !file->f_op->iopoll)
+- return -EOPNOTSUPP;
+-
+- kiocb->private = NULL;
+- kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;
+- kiocb->ki_complete = io_complete_rw_iopoll;
+- req->iopoll_completed = 0;
+- } else {
+- if (kiocb->ki_flags & IOCB_HIPRI)
+- return -EINVAL;
+- kiocb->ki_complete = io_complete_rw;
+- }
+-
+- return 0;
+-}
+-
+-static int io_read(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_rw_state __s, *s = &__s;
+- struct iovec *iovec;
+- struct kiocb *kiocb = &req->rw.kiocb;
+- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+- struct io_async_rw *rw;
+- ssize_t ret, ret2;
+- loff_t *ppos;
+-
+- if (!req_has_async_data(req)) {
+- ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
+- if (unlikely(ret < 0))
+- return ret;
+- } else {
+- rw = req->async_data;
+- s = &rw->s;
+-
+- /*
+- * Safe and required to re-import if we're using provided
+- * buffers, as we dropped the selected one before retry.
+- */
+- if (io_do_buffer_select(req)) {
+- ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
+- if (unlikely(ret < 0))
+- return ret;
+- }
+-
+- /*
+- * We come here from an earlier attempt, restore our state to
+- * match in case it doesn't. It's cheap enough that we don't
+- * need to make this conditional.
+- */
+- iov_iter_restore(&s->iter, &s->iter_state);
+- iovec = NULL;
+- }
+- ret = io_rw_init_file(req, FMODE_READ);
+- if (unlikely(ret)) {
+- kfree(iovec);
+- return ret;
+- }
+- req->result = iov_iter_count(&s->iter);
+-
+- if (force_nonblock) {
+- /* If the file doesn't support async, just async punt */
+- if (unlikely(!io_file_supports_nowait(req))) {
+- ret = io_setup_async_rw(req, iovec, s, true);
+- return ret ?: -EAGAIN;
+- }
+- kiocb->ki_flags |= IOCB_NOWAIT;
+- } else {
+- /* Ensure we clear previously set non-block flag */
+- kiocb->ki_flags &= ~IOCB_NOWAIT;
+- }
+-
+- ppos = io_kiocb_update_pos(req);
+-
+- ret = rw_verify_area(READ, req->file, ppos, req->result);
+- if (unlikely(ret)) {
+- kfree(iovec);
+- return ret;
+- }
+-
+- ret = io_iter_do_read(req, &s->iter);
+-
+- if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
+- req->flags &= ~REQ_F_REISSUE;
+- /* if we can poll, just do that */
+- if (req->opcode == IORING_OP_READ && file_can_poll(req->file))
+- return -EAGAIN;
+- /* IOPOLL retry should happen for io-wq threads */
+- if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
+- goto done;
+- /* no retry on NONBLOCK nor RWF_NOWAIT */
+- if (req->flags & REQ_F_NOWAIT)
+- goto done;
+- ret = 0;
+- } else if (ret == -EIOCBQUEUED) {
+- goto out_free;
+- } else if (ret == req->result || ret <= 0 || !force_nonblock ||
+- (req->flags & REQ_F_NOWAIT) || !need_read_all(req)) {
+- /* read all, failed, already did sync or don't want to retry */
+- goto done;
+- }
+-
+- /*
+- * Don't depend on the iter state matching what was consumed, or being
+- * untouched in case of error. Restore it and we'll advance it
+- * manually if we need to.
+- */
+- iov_iter_restore(&s->iter, &s->iter_state);
+-
+- ret2 = io_setup_async_rw(req, iovec, s, true);
+- if (ret2)
+- return ret2;
+-
+- iovec = NULL;
+- rw = req->async_data;
+- s = &rw->s;
+- /*
+- * Now use our persistent iterator and state, if we aren't already.
+- * We've restored and mapped the iter to match.
+- */
+-
+- do {
+- /*
+- * We end up here because of a partial read, either from
+- * above or inside this loop. Advance the iter by the bytes
+- * that were consumed.
+- */
+- iov_iter_advance(&s->iter, ret);
+- if (!iov_iter_count(&s->iter))
+- break;
+- rw->bytes_done += ret;
+- iov_iter_save_state(&s->iter, &s->iter_state);
+-
+- /* if we can retry, do so with the callbacks armed */
+- if (!io_rw_should_retry(req)) {
+- kiocb->ki_flags &= ~IOCB_WAITQ;
+- return -EAGAIN;
+- }
+-
+- /*
+- * Now retry read with the IOCB_WAITQ parts set in the iocb. If
+- * we get -EIOCBQUEUED, then we'll get a notification when the
+- * desired page gets unlocked. We can also get a partial read
+- * here, and if we do, then just retry at the new offset.
+- */
+- ret = io_iter_do_read(req, &s->iter);
+- if (ret == -EIOCBQUEUED)
+- return 0;
+- /* we got some bytes, but not all. retry. */
+- kiocb->ki_flags &= ~IOCB_WAITQ;
+- iov_iter_restore(&s->iter, &s->iter_state);
+- } while (ret > 0);
+-done:
+- kiocb_done(req, ret, issue_flags);
+-out_free:
+- /* it's faster to check here then delegate to kfree */
+- if (iovec)
+- kfree(iovec);
+- return 0;
+-}
+-
+-static int io_write(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_rw_state __s, *s = &__s;
+- struct iovec *iovec;
+- struct kiocb *kiocb = &req->rw.kiocb;
+- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+- ssize_t ret, ret2;
+- loff_t *ppos;
+-
+- if (!req_has_async_data(req)) {
+- ret = io_import_iovec(WRITE, req, &iovec, s, issue_flags);
+- if (unlikely(ret < 0))
+- return ret;
+- } else {
+- struct io_async_rw *rw = req->async_data;
+-
+- s = &rw->s;
+- iov_iter_restore(&s->iter, &s->iter_state);
+- iovec = NULL;
+- }
+- ret = io_rw_init_file(req, FMODE_WRITE);
+- if (unlikely(ret)) {
+- kfree(iovec);
+- return ret;
+- }
+- req->result = iov_iter_count(&s->iter);
+-
+- if (force_nonblock) {
+- /* If the file doesn't support async, just async punt */
+- if (unlikely(!io_file_supports_nowait(req)))
+- goto copy_iov;
+-
+- /* file path doesn't support NOWAIT for non-direct_IO */
+- if (force_nonblock && !(kiocb->ki_flags & IOCB_DIRECT) &&
+- (req->flags & REQ_F_ISREG))
+- goto copy_iov;
+-
+- kiocb->ki_flags |= IOCB_NOWAIT;
+- } else {
+- /* Ensure we clear previously set non-block flag */
+- kiocb->ki_flags &= ~IOCB_NOWAIT;
+- }
+-
+- ppos = io_kiocb_update_pos(req);
+-
+- ret = rw_verify_area(WRITE, req->file, ppos, req->result);
+- if (unlikely(ret))
+- goto out_free;
+-
+- /*
+- * Open-code file_start_write here to grab freeze protection,
+- * which will be released by another thread in
+- * io_complete_rw(). Fool lockdep by telling it the lock got
+- * released so that it doesn't complain about the held lock when
+- * we return to userspace.
+- */
+- if (req->flags & REQ_F_ISREG) {
+- sb_start_write(file_inode(req->file)->i_sb);
+- __sb_writers_release(file_inode(req->file)->i_sb,
+- SB_FREEZE_WRITE);
+- }
+- kiocb->ki_flags |= IOCB_WRITE;
+-
+- if (likely(req->file->f_op->write_iter))
+- ret2 = call_write_iter(req->file, kiocb, &s->iter);
+- else if (req->file->f_op->write)
+- ret2 = loop_rw_iter(WRITE, req, &s->iter);
+- else
+- ret2 = -EINVAL;
+-
+- if (req->flags & REQ_F_REISSUE) {
+- req->flags &= ~REQ_F_REISSUE;
+- ret2 = -EAGAIN;
+- }
+-
+- /*
+- * Raw bdev writes will return -EOPNOTSUPP for IOCB_NOWAIT. Just
+- * retry them without IOCB_NOWAIT.
+- */
+- if (ret2 == -EOPNOTSUPP && (kiocb->ki_flags & IOCB_NOWAIT))
+- ret2 = -EAGAIN;
+- /* no retry on NONBLOCK nor RWF_NOWAIT */
+- if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
+- goto done;
+- if (!force_nonblock || ret2 != -EAGAIN) {
+- /* IOPOLL retry should happen for io-wq threads */
+- if (ret2 == -EAGAIN && (req->ctx->flags & IORING_SETUP_IOPOLL))
+- goto copy_iov;
+-done:
+- kiocb_done(req, ret2, issue_flags);
+- } else {
+-copy_iov:
+- iov_iter_restore(&s->iter, &s->iter_state);
+- ret = io_setup_async_rw(req, iovec, s, false);
+- return ret ?: -EAGAIN;
+- }
+-out_free:
+- /* it's reportedly faster than delegating the null check to kfree() */
+- if (iovec)
+- kfree(iovec);
+- return ret;
+-}
+-
+-static int io_renameat_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_rename *ren = &req->rename;
+- const char __user *oldf, *newf;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->flags & REQ_F_FIXED_FILE))
+- return -EBADF;
+-
+- ren->old_dfd = READ_ONCE(sqe->fd);
+- oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+- ren->new_dfd = READ_ONCE(sqe->len);
+- ren->flags = READ_ONCE(sqe->rename_flags);
+-
+- ren->oldpath = getname(oldf);
+- if (IS_ERR(ren->oldpath))
+- return PTR_ERR(ren->oldpath);
+-
+- ren->newpath = getname(newf);
+- if (IS_ERR(ren->newpath)) {
+- putname(ren->oldpath);
+- return PTR_ERR(ren->newpath);
+- }
+-
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-static int io_renameat(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_rename *ren = &req->rename;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = do_renameat2(ren->old_dfd, ren->oldpath, ren->new_dfd,
+- ren->newpath, ren->flags);
+-
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_unlinkat_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_unlink *un = &req->unlink;
+- const char __user *fname;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->off || sqe->len || sqe->buf_index ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->flags & REQ_F_FIXED_FILE))
+- return -EBADF;
+-
+- un->dfd = READ_ONCE(sqe->fd);
+-
+- un->flags = READ_ONCE(sqe->unlink_flags);
+- if (un->flags & ~AT_REMOVEDIR)
+- return -EINVAL;
+-
+- fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- un->filename = getname(fname);
+- if (IS_ERR(un->filename))
+- return PTR_ERR(un->filename);
+-
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-static int io_unlinkat(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_unlink *un = &req->unlink;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- if (un->flags & AT_REMOVEDIR)
+- ret = do_rmdir(un->dfd, un->filename);
+- else
+- ret = do_unlinkat(un->dfd, un->filename);
+-
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_mkdirat_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_mkdir *mkd = &req->mkdir;
+- const char __user *fname;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->off || sqe->rw_flags || sqe->buf_index ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->flags & REQ_F_FIXED_FILE))
+- return -EBADF;
+-
+- mkd->dfd = READ_ONCE(sqe->fd);
+- mkd->mode = READ_ONCE(sqe->len);
+-
+- fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- mkd->filename = getname(fname);
+- if (IS_ERR(mkd->filename))
+- return PTR_ERR(mkd->filename);
+-
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-static int io_mkdirat(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_mkdir *mkd = &req->mkdir;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = do_mkdirat(mkd->dfd, mkd->filename, mkd->mode);
+-
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_symlinkat_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_symlink *sl = &req->symlink;
+- const char __user *oldpath, *newpath;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->len || sqe->rw_flags || sqe->buf_index ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->flags & REQ_F_FIXED_FILE))
+- return -EBADF;
+-
+- sl->new_dfd = READ_ONCE(sqe->fd);
+- oldpath = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- newpath = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+-
+- sl->oldpath = getname(oldpath);
+- if (IS_ERR(sl->oldpath))
+- return PTR_ERR(sl->oldpath);
+-
+- sl->newpath = getname(newpath);
+- if (IS_ERR(sl->newpath)) {
+- putname(sl->oldpath);
+- return PTR_ERR(sl->newpath);
+- }
+-
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-static int io_symlinkat(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_symlink *sl = &req->symlink;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = do_symlinkat(sl->oldpath, sl->new_dfd, sl->newpath);
+-
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_linkat_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_hardlink *lnk = &req->hardlink;
+- const char __user *oldf, *newf;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->flags & REQ_F_FIXED_FILE))
+- return -EBADF;
+-
+- lnk->old_dfd = READ_ONCE(sqe->fd);
+- lnk->new_dfd = READ_ONCE(sqe->len);
+- oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+- lnk->flags = READ_ONCE(sqe->hardlink_flags);
+-
+- lnk->oldpath = getname(oldf);
+- if (IS_ERR(lnk->oldpath))
+- return PTR_ERR(lnk->oldpath);
+-
+- lnk->newpath = getname(newf);
+- if (IS_ERR(lnk->newpath)) {
+- putname(lnk->oldpath);
+- return PTR_ERR(lnk->newpath);
+- }
+-
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-static int io_linkat(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_hardlink *lnk = &req->hardlink;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = do_linkat(lnk->old_dfd, lnk->oldpath, lnk->new_dfd,
+- lnk->newpath, lnk->flags);
+-
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_shutdown_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+-#if defined(CONFIG_NET)
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(sqe->ioprio || sqe->off || sqe->addr || sqe->rw_flags ||
+- sqe->buf_index || sqe->splice_fd_in))
+- return -EINVAL;
+-
+- req->shutdown.how = READ_ONCE(sqe->len);
+- return 0;
+-#else
+- return -EOPNOTSUPP;
+-#endif
+-}
+-
+-static int io_shutdown(struct io_kiocb *req, unsigned int issue_flags)
+-{
+-#if defined(CONFIG_NET)
+- struct socket *sock;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- sock = sock_from_file(req->file);
+- if (unlikely(!sock))
+- return -ENOTSOCK;
+-
+- ret = __sys_shutdown_sock(sock, req->shutdown.how);
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-#else
+- return -EOPNOTSUPP;
+-#endif
+-}
+-
+-static int __io_splice_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_splice *sp = &req->splice;
+- unsigned int valid_flags = SPLICE_F_FD_IN_FIXED | SPLICE_F_ALL;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+-
+- sp->len = READ_ONCE(sqe->len);
+- sp->flags = READ_ONCE(sqe->splice_flags);
+- if (unlikely(sp->flags & ~valid_flags))
+- return -EINVAL;
+- sp->splice_fd_in = READ_ONCE(sqe->splice_fd_in);
+- return 0;
+-}
+-
+-static int io_tee_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- if (READ_ONCE(sqe->splice_off_in) || READ_ONCE(sqe->off))
+- return -EINVAL;
+- return __io_splice_prep(req, sqe);
+-}
+-
+-static int io_tee(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_splice *sp = &req->splice;
+- struct file *out = sp->file_out;
+- unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
+- struct file *in;
+- long ret = 0;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- if (sp->flags & SPLICE_F_FD_IN_FIXED)
+- in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
+- else
+- in = io_file_get_normal(req, sp->splice_fd_in);
+- if (!in) {
+- ret = -EBADF;
+- goto done;
+- }
+-
+- if (sp->len)
+- ret = do_tee(in, out, sp->len, flags);
+-
+- if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
+- io_put_file(in);
+-done:
+- if (ret != sp->len)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_splice_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_splice *sp = &req->splice;
+-
+- sp->off_in = READ_ONCE(sqe->splice_off_in);
+- sp->off_out = READ_ONCE(sqe->off);
+- return __io_splice_prep(req, sqe);
+-}
+-
+-static int io_splice(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_splice *sp = &req->splice;
+- struct file *out = sp->file_out;
+- unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
+- loff_t *poff_in, *poff_out;
+- struct file *in;
+- long ret = 0;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- if (sp->flags & SPLICE_F_FD_IN_FIXED)
+- in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
+- else
+- in = io_file_get_normal(req, sp->splice_fd_in);
+- if (!in) {
+- ret = -EBADF;
+- goto done;
+- }
+-
+- poff_in = (sp->off_in == -1) ? NULL : &sp->off_in;
+- poff_out = (sp->off_out == -1) ? NULL : &sp->off_out;
+-
+- if (sp->len)
+- ret = do_splice(in, poff_in, out, poff_out, sp->len, flags);
+-
+- if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
+- io_put_file(in);
+-done:
+- if (ret != sp->len)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-/*
+- * IORING_OP_NOP just posts a completion event, nothing else.
+- */
+-static int io_nop(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+-
+- __io_req_complete(req, issue_flags, 0, 0);
+- return 0;
+-}
+-
+-static int io_msg_ring_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- if (unlikely(sqe->addr || sqe->ioprio || sqe->rw_flags ||
+- sqe->splice_fd_in || sqe->buf_index || sqe->personality))
+- return -EINVAL;
+-
+- req->msg.user_data = READ_ONCE(sqe->off);
+- req->msg.len = READ_ONCE(sqe->len);
+- return 0;
+-}
+-
+-static int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_ring_ctx *target_ctx;
+- struct io_msg *msg = &req->msg;
+- bool filled;
+- int ret;
+-
+- ret = -EBADFD;
+- if (req->file->f_op != &io_uring_fops)
+- goto done;
+-
+- ret = -EOVERFLOW;
+- target_ctx = req->file->private_data;
+-
+- spin_lock(&target_ctx->completion_lock);
+- filled = io_fill_cqe_aux(target_ctx, msg->user_data, msg->len, 0);
+- io_commit_cqring(target_ctx);
+- spin_unlock(&target_ctx->completion_lock);
+-
+- if (filled) {
+- io_cqring_ev_posted(target_ctx);
+- ret = 0;
+- }
+-
+-done:
+- if (ret < 0)
+- req_set_fail(req);
+- __io_req_complete(req, issue_flags, ret, 0);
+- /* put file to avoid an attempt to IOPOLL the req */
+- io_put_file(req->file);
+- req->file = NULL;
+- return 0;
+-}
+-
+-static int io_fsync_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
+- sqe->splice_fd_in))
+- return -EINVAL;
+-
+- req->sync.flags = READ_ONCE(sqe->fsync_flags);
+- if (unlikely(req->sync.flags & ~IORING_FSYNC_DATASYNC))
+- return -EINVAL;
+-
+- req->sync.off = READ_ONCE(sqe->off);
+- req->sync.len = READ_ONCE(sqe->len);
+- return 0;
+-}
+-
+-static int io_fsync(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- loff_t end = req->sync.off + req->sync.len;
+- int ret;
+-
+- /* fsync always requires a blocking context */
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = vfs_fsync_range(req->file, req->sync.off,
+- end > 0 ? end : LLONG_MAX,
+- req->sync.flags & IORING_FSYNC_DATASYNC);
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_fallocate_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- if (sqe->ioprio || sqe->buf_index || sqe->rw_flags ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+-
+- req->sync.off = READ_ONCE(sqe->off);
+- req->sync.len = READ_ONCE(sqe->addr);
+- req->sync.mode = READ_ONCE(sqe->len);
+- return 0;
+-}
+-
+-static int io_fallocate(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- int ret;
+-
+- /* fallocate always requiring blocking context */
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+- ret = vfs_fallocate(req->file, req->sync.mode, req->sync.off,
+- req->sync.len);
+- if (ret < 0)
+- req_set_fail(req);
+- else
+- fsnotify_modify(req->file);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- const char __user *fname;
+- int ret;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(sqe->ioprio || sqe->buf_index))
+- return -EINVAL;
+- if (unlikely(req->flags & REQ_F_FIXED_FILE))
+- return -EBADF;
+-
+- /* open.how should be already initialised */
+- if (!(req->open.how.flags & O_PATH) && force_o_largefile())
+- req->open.how.flags |= O_LARGEFILE;
+-
+- req->open.dfd = READ_ONCE(sqe->fd);
+- fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- req->open.filename = getname(fname);
+- if (IS_ERR(req->open.filename)) {
+- ret = PTR_ERR(req->open.filename);
+- req->open.filename = NULL;
+- return ret;
+- }
+-
+- req->open.file_slot = READ_ONCE(sqe->file_index);
+- if (req->open.file_slot && (req->open.how.flags & O_CLOEXEC))
+- return -EINVAL;
+-
+- req->open.nofile = rlimit(RLIMIT_NOFILE);
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- u64 mode = READ_ONCE(sqe->len);
+- u64 flags = READ_ONCE(sqe->open_flags);
+-
+- req->open.how = build_open_how(flags, mode);
+- return __io_openat_prep(req, sqe);
+-}
+-
+-static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct open_how __user *how;
+- size_t len;
+- int ret;
+-
+- how = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+- len = READ_ONCE(sqe->len);
+- if (len < OPEN_HOW_SIZE_VER0)
+- return -EINVAL;
+-
+- ret = copy_struct_from_user(&req->open.how, sizeof(req->open.how), how,
+- len);
+- if (ret)
+- return ret;
+-
+- return __io_openat_prep(req, sqe);
+-}
+-
+-static int io_openat2(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct open_flags op;
+- struct file *file;
+- bool resolve_nonblock, nonblock_set;
+- bool fixed = !!req->open.file_slot;
+- int ret;
+-
+- ret = build_open_flags(&req->open.how, &op);
+- if (ret)
+- goto err;
+- nonblock_set = op.open_flag & O_NONBLOCK;
+- resolve_nonblock = req->open.how.resolve & RESOLVE_CACHED;
+- if (issue_flags & IO_URING_F_NONBLOCK) {
+- /*
+- * Don't bother trying for O_TRUNC, O_CREAT, or O_TMPFILE open,
+- * it'll always -EAGAIN
+- */
+- if (req->open.how.flags & (O_TRUNC | O_CREAT | O_TMPFILE))
+- return -EAGAIN;
+- op.lookup_flags |= LOOKUP_CACHED;
+- op.open_flag |= O_NONBLOCK;
+- }
+-
+- if (!fixed) {
+- ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
+- if (ret < 0)
+- goto err;
+- }
+-
+- file = do_filp_open(req->open.dfd, req->open.filename, &op);
+- if (IS_ERR(file)) {
+- /*
+- * We could hang on to this 'fd' on retrying, but seems like
+- * marginal gain for something that is now known to be a slower
+- * path. So just put it, and we'll get a new one when we retry.
+- */
+- if (!fixed)
+- put_unused_fd(ret);
+-
+- ret = PTR_ERR(file);
+- /* only retry if RESOLVE_CACHED wasn't already set by application */
+- if (ret == -EAGAIN &&
+- (!resolve_nonblock && (issue_flags & IO_URING_F_NONBLOCK)))
+- return -EAGAIN;
+- goto err;
+- }
+-
+- if ((issue_flags & IO_URING_F_NONBLOCK) && !nonblock_set)
+- file->f_flags &= ~O_NONBLOCK;
+- fsnotify_open(file);
+-
+- if (!fixed)
+- fd_install(ret, file);
+- else
+- ret = io_install_fixed_file(req, file, issue_flags,
+- req->open.file_slot - 1);
+-err:
+- putname(req->open.filename);
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- if (ret < 0)
+- req_set_fail(req);
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int io_openat(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- return io_openat2(req, issue_flags);
+-}
+-
+-static int io_remove_buffers_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_provide_buf *p = &req->pbuf;
+- u64 tmp;
+-
+- if (sqe->ioprio || sqe->rw_flags || sqe->addr || sqe->len || sqe->off ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+-
+- tmp = READ_ONCE(sqe->fd);
+- if (!tmp || tmp > USHRT_MAX)
+- return -EINVAL;
+-
+- memset(p, 0, sizeof(*p));
+- p->nbufs = tmp;
+- p->bgid = READ_ONCE(sqe->buf_group);
+- return 0;
+-}
+-
+-static int __io_remove_buffers(struct io_ring_ctx *ctx,
+- struct io_buffer_list *bl, unsigned nbufs)
+-{
+- unsigned i = 0;
+-
+- /* shouldn't happen */
+- if (!nbufs)
+- return 0;
+-
+- /* the head kbuf is the list itself */
+- while (!list_empty(&bl->buf_list)) {
+- struct io_buffer *nxt;
+-
+- nxt = list_first_entry(&bl->buf_list, struct io_buffer, list);
+- list_del(&nxt->list);
+- if (++i == nbufs)
+- return i;
+- cond_resched();
+- }
+- i++;
+-
+- return i;
+-}
+-
+-static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_provide_buf *p = &req->pbuf;
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_buffer_list *bl;
+- int ret = 0;
+- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+-
+- io_ring_submit_lock(ctx, needs_lock);
+-
+- lockdep_assert_held(&ctx->uring_lock);
+-
+- ret = -ENOENT;
+- bl = io_buffer_get_list(ctx, p->bgid);
+- if (bl)
+- ret = __io_remove_buffers(ctx, bl, p->nbufs);
+- if (ret < 0)
+- req_set_fail(req);
+-
+- /* complete before unlock, IOPOLL may need the lock */
+- __io_req_complete(req, issue_flags, ret, 0);
+- io_ring_submit_unlock(ctx, needs_lock);
+- return 0;
+-}
+-
+-static int io_provide_buffers_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- unsigned long size, tmp_check;
+- struct io_provide_buf *p = &req->pbuf;
+- u64 tmp;
+-
+- if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
+- return -EINVAL;
+-
+- tmp = READ_ONCE(sqe->fd);
+- if (!tmp || tmp > USHRT_MAX)
+- return -E2BIG;
+- p->nbufs = tmp;
+- p->addr = READ_ONCE(sqe->addr);
+- p->len = READ_ONCE(sqe->len);
+-
+- if (check_mul_overflow((unsigned long)p->len, (unsigned long)p->nbufs,
+- &size))
+- return -EOVERFLOW;
+- if (check_add_overflow((unsigned long)p->addr, size, &tmp_check))
+- return -EOVERFLOW;
+-
+- size = (unsigned long)p->len * p->nbufs;
+- if (!access_ok(u64_to_user_ptr(p->addr), size))
+- return -EFAULT;
+-
+- p->bgid = READ_ONCE(sqe->buf_group);
+- tmp = READ_ONCE(sqe->off);
+- if (tmp > USHRT_MAX)
+- return -E2BIG;
+- p->bid = tmp;
+- return 0;
+-}
+-
+-static int io_refill_buffer_cache(struct io_ring_ctx *ctx)
+-{
+- struct io_buffer *buf;
+- struct page *page;
+- int bufs_in_page;
+-
+- /*
+- * Completions that don't happen inline (eg not under uring_lock) will
+- * add to ->io_buffers_comp. If we don't have any free buffers, check
+- * the completion list and splice those entries first.
+- */
+- if (!list_empty_careful(&ctx->io_buffers_comp)) {
+- spin_lock(&ctx->completion_lock);
+- if (!list_empty(&ctx->io_buffers_comp)) {
+- list_splice_init(&ctx->io_buffers_comp,
+- &ctx->io_buffers_cache);
+- spin_unlock(&ctx->completion_lock);
+- return 0;
+- }
+- spin_unlock(&ctx->completion_lock);
+- }
+-
+- /*
+- * No free buffers and no completion entries either. Allocate a new
+- * page worth of buffer entries and add those to our freelist.
+- */
+- page = alloc_page(GFP_KERNEL_ACCOUNT);
+- if (!page)
+- return -ENOMEM;
+-
+- list_add(&page->lru, &ctx->io_buffers_pages);
+-
+- buf = page_address(page);
+- bufs_in_page = PAGE_SIZE / sizeof(*buf);
+- while (bufs_in_page) {
+- list_add_tail(&buf->list, &ctx->io_buffers_cache);
+- buf++;
+- bufs_in_page--;
+- }
+-
+- return 0;
+-}
+-
+-static int io_add_buffers(struct io_ring_ctx *ctx, struct io_provide_buf *pbuf,
+- struct io_buffer_list *bl)
+-{
+- struct io_buffer *buf;
+- u64 addr = pbuf->addr;
+- int i, bid = pbuf->bid;
+-
+- for (i = 0; i < pbuf->nbufs; i++) {
+- if (list_empty(&ctx->io_buffers_cache) &&
+- io_refill_buffer_cache(ctx))
+- break;
+- buf = list_first_entry(&ctx->io_buffers_cache, struct io_buffer,
+- list);
+- list_move_tail(&buf->list, &bl->buf_list);
+- buf->addr = addr;
+- buf->len = min_t(__u32, pbuf->len, MAX_RW_COUNT);
+- buf->bid = bid;
+- buf->bgid = pbuf->bgid;
+- addr += pbuf->len;
+- bid++;
+- cond_resched();
+- }
+-
+- return i ? 0 : -ENOMEM;
+-}
+-
+-static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_provide_buf *p = &req->pbuf;
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_buffer_list *bl;
+- int ret = 0;
+- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+-
+- io_ring_submit_lock(ctx, needs_lock);
+-
+- lockdep_assert_held(&ctx->uring_lock);
+-
+- bl = io_buffer_get_list(ctx, p->bgid);
+- if (unlikely(!bl)) {
+- bl = kmalloc(sizeof(*bl), GFP_KERNEL);
+- if (!bl) {
+- ret = -ENOMEM;
+- goto err;
+- }
+- io_buffer_add_list(ctx, bl, p->bgid);
+- }
+-
+- ret = io_add_buffers(ctx, p, bl);
+-err:
+- if (ret < 0)
+- req_set_fail(req);
+- /* complete before unlock, IOPOLL may need the lock */
+- __io_req_complete(req, issue_flags, ret, 0);
+- io_ring_submit_unlock(ctx, needs_lock);
+- return 0;
+-}
+-
+-static int io_epoll_ctl_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+-#if defined(CONFIG_EPOLL)
+- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+-
+- req->epoll.epfd = READ_ONCE(sqe->fd);
+- req->epoll.op = READ_ONCE(sqe->len);
+- req->epoll.fd = READ_ONCE(sqe->off);
+-
+- if (ep_op_has_event(req->epoll.op)) {
+- struct epoll_event __user *ev;
+-
+- ev = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- if (copy_from_user(&req->epoll.event, ev, sizeof(*ev)))
+- return -EFAULT;
+- }
+-
+- return 0;
+-#else
+- return -EOPNOTSUPP;
+-#endif
+-}
+-
+-static int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
+-{
+-#if defined(CONFIG_EPOLL)
+- struct io_epoll *ie = &req->epoll;
+- int ret;
+- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+-
+- ret = do_epoll_ctl(ie->epfd, ie->op, ie->fd, &ie->event, force_nonblock);
+- if (force_nonblock && ret == -EAGAIN)
+- return -EAGAIN;
+-
+- if (ret < 0)
+- req_set_fail(req);
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-#else
+- return -EOPNOTSUPP;
+-#endif
+-}
+-
+-static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+-#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
+- if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+-
+- req->madvise.addr = READ_ONCE(sqe->addr);
+- req->madvise.len = READ_ONCE(sqe->len);
+- req->madvise.advice = READ_ONCE(sqe->fadvise_advice);
+- return 0;
+-#else
+- return -EOPNOTSUPP;
+-#endif
+-}
+-
+-static int io_madvise(struct io_kiocb *req, unsigned int issue_flags)
+-{
+-#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
+- struct io_madvise *ma = &req->madvise;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = do_madvise(current->mm, ma->addr, ma->len, ma->advice);
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-#else
+- return -EOPNOTSUPP;
+-#endif
+-}
+-
+-static int io_fadvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- if (sqe->ioprio || sqe->buf_index || sqe->addr || sqe->splice_fd_in)
+- return -EINVAL;
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+-
+- req->fadvise.offset = READ_ONCE(sqe->off);
+- req->fadvise.len = READ_ONCE(sqe->len);
+- req->fadvise.advice = READ_ONCE(sqe->fadvise_advice);
+- return 0;
+-}
+-
+-static int io_fadvise(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_fadvise *fa = &req->fadvise;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK) {
+- switch (fa->advice) {
+- case POSIX_FADV_NORMAL:
+- case POSIX_FADV_RANDOM:
+- case POSIX_FADV_SEQUENTIAL:
+- break;
+- default:
+- return -EAGAIN;
+- }
+- }
+-
+- ret = vfs_fadvise(req->file, fa->offset, fa->len, fa->advice);
+- if (ret < 0)
+- req_set_fail(req);
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- const char __user *path;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+- return -EINVAL;
+- if (req->flags & REQ_F_FIXED_FILE)
+- return -EBADF;
+-
+- req->statx.dfd = READ_ONCE(sqe->fd);
+- req->statx.mask = READ_ONCE(sqe->len);
+- path = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- req->statx.buffer = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+- req->statx.flags = READ_ONCE(sqe->statx_flags);
+-
+- req->statx.filename = getname_flags(path,
+- getname_statx_lookup_flags(req->statx.flags),
+- NULL);
+-
+- if (IS_ERR(req->statx.filename)) {
+- int ret = PTR_ERR(req->statx.filename);
+-
+- req->statx.filename = NULL;
+- return ret;
+- }
+-
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return 0;
+-}
+-
+-static int io_statx(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_statx *ctx = &req->statx;
+- int ret;
+-
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = do_statx(ctx->dfd, ctx->filename, ctx->flags, ctx->mask,
+- ctx->buffer);
+-
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
+- sqe->rw_flags || sqe->buf_index)
+- return -EINVAL;
+- if (req->flags & REQ_F_FIXED_FILE)
+- return -EBADF;
+-
+- req->close.fd = READ_ONCE(sqe->fd);
+- req->close.file_slot = READ_ONCE(sqe->file_index);
+- if (req->close.file_slot && req->close.fd)
+- return -EINVAL;
+-
+- return 0;
+-}
+-
+-static int io_close(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct files_struct *files = current->files;
+- struct io_close *close = &req->close;
+- struct fdtable *fdt;
+- struct file *file = NULL;
+- int ret = -EBADF;
+-
+- if (req->close.file_slot) {
+- ret = io_close_fixed(req, issue_flags);
+- goto err;
+- }
+-
+- spin_lock(&files->file_lock);
+- fdt = files_fdtable(files);
+- if (close->fd >= fdt->max_fds) {
+- spin_unlock(&files->file_lock);
+- goto err;
+- }
+- file = fdt->fd[close->fd];
+- if (!file || file->f_op == &io_uring_fops) {
+- spin_unlock(&files->file_lock);
+- file = NULL;
+- goto err;
+- }
+-
+- /* if the file has a flush method, be safe and punt to async */
+- if (file->f_op->flush && (issue_flags & IO_URING_F_NONBLOCK)) {
+- spin_unlock(&files->file_lock);
+- return -EAGAIN;
+- }
+-
+- ret = __close_fd_get_file(close->fd, &file);
+- spin_unlock(&files->file_lock);
+- if (ret < 0) {
+- if (ret == -ENOENT)
+- ret = -EBADF;
+- goto err;
+- }
+-
+- /* No ->flush() or already async, safely close from here */
+- ret = filp_close(file, current->files);
+-err:
+- if (ret < 0)
+- req_set_fail(req);
+- if (file)
+- fput(file);
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int io_sfr_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
+- sqe->splice_fd_in))
+- return -EINVAL;
+-
+- req->sync.off = READ_ONCE(sqe->off);
+- req->sync.len = READ_ONCE(sqe->len);
+- req->sync.flags = READ_ONCE(sqe->sync_range_flags);
+- return 0;
+-}
+-
+-static int io_sync_file_range(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- int ret;
+-
+- /* sync_file_range always requires a blocking context */
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- return -EAGAIN;
+-
+- ret = sync_file_range(req->file, req->sync.off, req->sync.len,
+- req->sync.flags);
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete(req, ret);
+- return 0;
+-}
+-
+-#if defined(CONFIG_NET)
+-static int io_setup_async_msg(struct io_kiocb *req,
+- struct io_async_msghdr *kmsg)
+-{
+- struct io_async_msghdr *async_msg = req->async_data;
+-
+- if (async_msg)
+- return -EAGAIN;
+- if (io_alloc_async_data(req)) {
+- kfree(kmsg->free_iov);
+- return -ENOMEM;
+- }
+- async_msg = req->async_data;
+- req->flags |= REQ_F_NEED_CLEANUP;
+- memcpy(async_msg, kmsg, sizeof(*kmsg));
+- async_msg->msg.msg_name = &async_msg->addr;
+- /* if were using fast_iov, set it to the new one */
+- if (!async_msg->free_iov)
+- async_msg->msg.msg_iter.iov = async_msg->fast_iov;
+-
+- return -EAGAIN;
+-}
+-
+-static int io_sendmsg_copy_hdr(struct io_kiocb *req,
+- struct io_async_msghdr *iomsg)
+-{
+- iomsg->msg.msg_name = &iomsg->addr;
+- iomsg->free_iov = iomsg->fast_iov;
+- return sendmsg_copy_msghdr(&iomsg->msg, req->sr_msg.umsg,
+- req->sr_msg.msg_flags, &iomsg->free_iov);
+-}
+-
+-static int io_sendmsg_prep_async(struct io_kiocb *req)
+-{
+- int ret;
+-
+- ret = io_sendmsg_copy_hdr(req, req->async_data);
+- if (!ret)
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return ret;
+-}
+-
+-static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_sr_msg *sr = &req->sr_msg;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
+- return -EINVAL;
+-
+- sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- sr->len = READ_ONCE(sqe->len);
+- sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
+- if (sr->msg_flags & MSG_DONTWAIT)
+- req->flags |= REQ_F_NOWAIT;
+-
+-#ifdef CONFIG_COMPAT
+- if (req->ctx->compat)
+- sr->msg_flags |= MSG_CMSG_COMPAT;
+-#endif
+- return 0;
+-}
+-
+-static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_async_msghdr iomsg, *kmsg;
+- struct socket *sock;
+- unsigned flags;
+- int min_ret = 0;
+- int ret;
+-
+- sock = sock_from_file(req->file);
+- if (unlikely(!sock))
+- return -ENOTSOCK;
+-
+- if (req_has_async_data(req)) {
+- kmsg = req->async_data;
+- } else {
+- ret = io_sendmsg_copy_hdr(req, &iomsg);
+- if (ret)
+- return ret;
+- kmsg = &iomsg;
+- }
+-
+- flags = req->sr_msg.msg_flags;
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- flags |= MSG_DONTWAIT;
+- if (flags & MSG_WAITALL)
+- min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+-
+- ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
+-
+- if (ret < min_ret) {
+- if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
+- return io_setup_async_msg(req, kmsg);
+- if (ret == -ERESTARTSYS)
+- ret = -EINTR;
+- req_set_fail(req);
+- }
+- /* fast path, check for non-NULL to avoid function call */
+- if (kmsg->free_iov)
+- kfree(kmsg->free_iov);
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int io_send(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_sr_msg *sr = &req->sr_msg;
+- struct msghdr msg;
+- struct iovec iov;
+- struct socket *sock;
+- unsigned flags;
+- int min_ret = 0;
+- int ret;
+-
+- sock = sock_from_file(req->file);
+- if (unlikely(!sock))
+- return -ENOTSOCK;
+-
+- ret = import_single_range(WRITE, sr->buf, sr->len, &iov, &msg.msg_iter);
+- if (unlikely(ret))
+- return ret;
+-
+- msg.msg_name = NULL;
+- msg.msg_control = NULL;
+- msg.msg_controllen = 0;
+- msg.msg_namelen = 0;
+-
+- flags = req->sr_msg.msg_flags;
+- if (issue_flags & IO_URING_F_NONBLOCK)
+- flags |= MSG_DONTWAIT;
+- if (flags & MSG_WAITALL)
+- min_ret = iov_iter_count(&msg.msg_iter);
+-
+- msg.msg_flags = flags;
+- ret = sock_sendmsg(sock, &msg);
+- if (ret < min_ret) {
+- if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
+- return -EAGAIN;
+- if (ret == -ERESTARTSYS)
+- ret = -EINTR;
+- req_set_fail(req);
+- }
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int __io_recvmsg_copy_hdr(struct io_kiocb *req,
+- struct io_async_msghdr *iomsg)
+-{
+- struct io_sr_msg *sr = &req->sr_msg;
+- struct iovec __user *uiov;
+- size_t iov_len;
+- int ret;
+-
+- ret = __copy_msghdr_from_user(&iomsg->msg, sr->umsg,
+- &iomsg->uaddr, &uiov, &iov_len);
+- if (ret)
+- return ret;
+-
+- if (req->flags & REQ_F_BUFFER_SELECT) {
+- if (iov_len > 1)
+- return -EINVAL;
+- if (copy_from_user(iomsg->fast_iov, uiov, sizeof(*uiov)))
+- return -EFAULT;
+- sr->len = iomsg->fast_iov[0].iov_len;
+- iomsg->free_iov = NULL;
+- } else {
+- iomsg->free_iov = iomsg->fast_iov;
+- ret = __import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
+- &iomsg->free_iov, &iomsg->msg.msg_iter,
+- false);
+- if (ret > 0)
+- ret = 0;
+- }
+-
+- return ret;
+-}
+-
+-#ifdef CONFIG_COMPAT
+-static int __io_compat_recvmsg_copy_hdr(struct io_kiocb *req,
+- struct io_async_msghdr *iomsg)
+-{
+- struct io_sr_msg *sr = &req->sr_msg;
+- struct compat_iovec __user *uiov;
+- compat_uptr_t ptr;
+- compat_size_t len;
+- int ret;
+-
+- ret = __get_compat_msghdr(&iomsg->msg, sr->umsg_compat, &iomsg->uaddr,
+- &ptr, &len);
+- if (ret)
+- return ret;
+-
+- uiov = compat_ptr(ptr);
+- if (req->flags & REQ_F_BUFFER_SELECT) {
+- compat_ssize_t clen;
+-
+- if (len > 1)
+- return -EINVAL;
+- if (!access_ok(uiov, sizeof(*uiov)))
+- return -EFAULT;
+- if (__get_user(clen, &uiov->iov_len))
+- return -EFAULT;
+- if (clen < 0)
+- return -EINVAL;
+- sr->len = clen;
+- iomsg->free_iov = NULL;
+- } else {
+- iomsg->free_iov = iomsg->fast_iov;
+- ret = __import_iovec(READ, (struct iovec __user *)uiov, len,
+- UIO_FASTIOV, &iomsg->free_iov,
+- &iomsg->msg.msg_iter, true);
+- if (ret < 0)
+- return ret;
+- }
+-
+- return 0;
+-}
+-#endif
+-
+-static int io_recvmsg_copy_hdr(struct io_kiocb *req,
+- struct io_async_msghdr *iomsg)
+-{
+- iomsg->msg.msg_name = &iomsg->addr;
+-
+-#ifdef CONFIG_COMPAT
+- if (req->ctx->compat)
+- return __io_compat_recvmsg_copy_hdr(req, iomsg);
+-#endif
+-
+- return __io_recvmsg_copy_hdr(req, iomsg);
+-}
+-
+-static struct io_buffer *io_recv_buffer_select(struct io_kiocb *req,
+- unsigned int issue_flags)
+-{
+- struct io_sr_msg *sr = &req->sr_msg;
+-
+- return io_buffer_select(req, &sr->len, sr->bgid, issue_flags);
+-}
+-
+-static int io_recvmsg_prep_async(struct io_kiocb *req)
+-{
+- int ret;
+-
+- ret = io_recvmsg_copy_hdr(req, req->async_data);
+- if (!ret)
+- req->flags |= REQ_F_NEED_CLEANUP;
+- return ret;
+-}
+-
+-static int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_sr_msg *sr = &req->sr_msg;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
+- return -EINVAL;
+-
+- sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- sr->len = READ_ONCE(sqe->len);
+- sr->bgid = READ_ONCE(sqe->buf_group);
+- sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
+- if (sr->msg_flags & MSG_DONTWAIT)
+- req->flags |= REQ_F_NOWAIT;
+-
+-#ifdef CONFIG_COMPAT
+- if (req->ctx->compat)
+- sr->msg_flags |= MSG_CMSG_COMPAT;
+-#endif
+- sr->done_io = 0;
+- return 0;
+-}
+-
+-static bool io_net_retry(struct socket *sock, int flags)
+-{
+- if (!(flags & MSG_WAITALL))
+- return false;
+- return sock->type == SOCK_STREAM || sock->type == SOCK_SEQPACKET;
+-}
+-
+-static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_async_msghdr iomsg, *kmsg;
+- struct io_sr_msg *sr = &req->sr_msg;
+- struct socket *sock;
+- struct io_buffer *kbuf;
+- unsigned flags;
+- int ret, min_ret = 0;
+- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+-
+- sock = sock_from_file(req->file);
+- if (unlikely(!sock))
+- return -ENOTSOCK;
+-
+- if (req_has_async_data(req)) {
+- kmsg = req->async_data;
+- } else {
+- ret = io_recvmsg_copy_hdr(req, &iomsg);
+- if (ret)
+- return ret;
+- kmsg = &iomsg;
+- }
+-
+- if (req->flags & REQ_F_BUFFER_SELECT) {
+- kbuf = io_recv_buffer_select(req, issue_flags);
+- if (IS_ERR(kbuf))
+- return PTR_ERR(kbuf);
+- kmsg->fast_iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
+- kmsg->fast_iov[0].iov_len = req->sr_msg.len;
+- iov_iter_init(&kmsg->msg.msg_iter, READ, kmsg->fast_iov,
+- 1, req->sr_msg.len);
+- }
+-
+- flags = req->sr_msg.msg_flags;
+- if (force_nonblock)
+- flags |= MSG_DONTWAIT;
+- if (flags & MSG_WAITALL)
+- min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+-
+- ret = __sys_recvmsg_sock(sock, &kmsg->msg, req->sr_msg.umsg,
+- kmsg->uaddr, flags);
+- if (ret < min_ret) {
+- if (ret == -EAGAIN && force_nonblock)
+- return io_setup_async_msg(req, kmsg);
+- if (ret == -ERESTARTSYS)
+- ret = -EINTR;
+- if (ret > 0 && io_net_retry(sock, flags)) {
+- sr->done_io += ret;
+- req->flags |= REQ_F_PARTIAL_IO;
+- return io_setup_async_msg(req, kmsg);
+- }
+- req_set_fail(req);
+- } else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
+- req_set_fail(req);
+- }
+-
+- /* fast path, check for non-NULL to avoid function call */
+- if (kmsg->free_iov)
+- kfree(kmsg->free_iov);
+- req->flags &= ~REQ_F_NEED_CLEANUP;
+- if (ret >= 0)
+- ret += sr->done_io;
+- else if (sr->done_io)
+- ret = sr->done_io;
+- __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
+- return 0;
+-}
+-
+-static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_buffer *kbuf;
+- struct io_sr_msg *sr = &req->sr_msg;
+- struct msghdr msg;
+- void __user *buf = sr->buf;
+- struct socket *sock;
+- struct iovec iov;
+- unsigned flags;
+- int ret, min_ret = 0;
+- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+-
+- sock = sock_from_file(req->file);
+- if (unlikely(!sock))
+- return -ENOTSOCK;
+-
+- if (req->flags & REQ_F_BUFFER_SELECT) {
+- kbuf = io_recv_buffer_select(req, issue_flags);
+- if (IS_ERR(kbuf))
+- return PTR_ERR(kbuf);
+- buf = u64_to_user_ptr(kbuf->addr);
+- }
+-
+- ret = import_single_range(READ, buf, sr->len, &iov, &msg.msg_iter);
+- if (unlikely(ret))
+- goto out_free;
+-
+- msg.msg_name = NULL;
+- msg.msg_control = NULL;
+- msg.msg_controllen = 0;
+- msg.msg_namelen = 0;
+- msg.msg_iocb = NULL;
+- msg.msg_flags = 0;
+-
+- flags = req->sr_msg.msg_flags;
+- if (force_nonblock)
+- flags |= MSG_DONTWAIT;
+- if (flags & MSG_WAITALL)
+- min_ret = iov_iter_count(&msg.msg_iter);
+-
+- ret = sock_recvmsg(sock, &msg, flags);
+- if (ret < min_ret) {
+- if (ret == -EAGAIN && force_nonblock)
+- return -EAGAIN;
+- if (ret == -ERESTARTSYS)
+- ret = -EINTR;
+- if (ret > 0 && io_net_retry(sock, flags)) {
+- sr->len -= ret;
+- sr->buf += ret;
+- sr->done_io += ret;
+- req->flags |= REQ_F_PARTIAL_IO;
+- return -EAGAIN;
+- }
+- req_set_fail(req);
+- } else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
+-out_free:
+- req_set_fail(req);
+- }
+-
+- if (ret >= 0)
+- ret += sr->done_io;
+- else if (sr->done_io)
+- ret = sr->done_io;
+- __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
+- return 0;
+-}
+-
+-static int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_accept *accept = &req->accept;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->len || sqe->buf_index)
+- return -EINVAL;
+-
+- accept->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- accept->addr_len = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+- accept->flags = READ_ONCE(sqe->accept_flags);
+- accept->nofile = rlimit(RLIMIT_NOFILE);
+-
+- accept->file_slot = READ_ONCE(sqe->file_index);
+- if (accept->file_slot && (accept->flags & SOCK_CLOEXEC))
+- return -EINVAL;
+- if (accept->flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
+- return -EINVAL;
+- if (SOCK_NONBLOCK != O_NONBLOCK && (accept->flags & SOCK_NONBLOCK))
+- accept->flags = (accept->flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
+- return 0;
+-}
+-
+-static int io_accept(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_accept *accept = &req->accept;
+- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+- unsigned int file_flags = force_nonblock ? O_NONBLOCK : 0;
+- bool fixed = !!accept->file_slot;
+- struct file *file;
+- int ret, fd;
+-
+- if (!fixed) {
+- fd = __get_unused_fd_flags(accept->flags, accept->nofile);
+- if (unlikely(fd < 0))
+- return fd;
+- }
+- file = do_accept(req->file, file_flags, accept->addr, accept->addr_len,
+- accept->flags);
+- if (IS_ERR(file)) {
+- if (!fixed)
+- put_unused_fd(fd);
+- ret = PTR_ERR(file);
+- if (ret == -EAGAIN && force_nonblock)
+- return -EAGAIN;
+- if (ret == -ERESTARTSYS)
+- ret = -EINTR;
+- req_set_fail(req);
+- } else if (!fixed) {
+- fd_install(fd, file);
+- ret = fd;
+- } else {
+- ret = io_install_fixed_file(req, file, issue_flags,
+- accept->file_slot - 1);
+- }
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int io_connect_prep_async(struct io_kiocb *req)
+-{
+- struct io_async_connect *io = req->async_data;
+- struct io_connect *conn = &req->connect;
+-
+- return move_addr_to_kernel(conn->addr, conn->addr_len, &io->address);
+-}
+-
+-static int io_connect_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_connect *conn = &req->connect;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->len || sqe->buf_index || sqe->rw_flags ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+-
+- conn->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
+- conn->addr_len = READ_ONCE(sqe->addr2);
+- return 0;
+-}
+-
+-static int io_connect(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_async_connect __io, *io;
+- unsigned file_flags;
+- int ret;
+- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+-
+- if (req_has_async_data(req)) {
+- io = req->async_data;
+- } else {
+- ret = move_addr_to_kernel(req->connect.addr,
+- req->connect.addr_len,
+- &__io.address);
+- if (ret)
+- goto out;
+- io = &__io;
+- }
+-
+- file_flags = force_nonblock ? O_NONBLOCK : 0;
+-
+- ret = __sys_connect_file(req->file, &io->address,
+- req->connect.addr_len, file_flags);
+- if ((ret == -EAGAIN || ret == -EINPROGRESS) && force_nonblock) {
+- if (req_has_async_data(req))
+- return -EAGAIN;
+- if (io_alloc_async_data(req)) {
+- ret = -ENOMEM;
+- goto out;
+- }
+- memcpy(req->async_data, &__io, sizeof(__io));
+- return -EAGAIN;
+- }
+- if (ret == -ERESTARTSYS)
+- ret = -EINTR;
+-out:
+- if (ret < 0)
+- req_set_fail(req);
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-#else /* !CONFIG_NET */
+-#define IO_NETOP_FN(op) \
+-static int io_##op(struct io_kiocb *req, unsigned int issue_flags) \
+-{ \
+- return -EOPNOTSUPP; \
+-}
+-
+-#define IO_NETOP_PREP(op) \
+-IO_NETOP_FN(op) \
+-static int io_##op##_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) \
+-{ \
+- return -EOPNOTSUPP; \
+-} \
+-
+-#define IO_NETOP_PREP_ASYNC(op) \
+-IO_NETOP_PREP(op) \
+-static int io_##op##_prep_async(struct io_kiocb *req) \
+-{ \
+- return -EOPNOTSUPP; \
+-}
+-
+-IO_NETOP_PREP_ASYNC(sendmsg);
+-IO_NETOP_PREP_ASYNC(recvmsg);
+-IO_NETOP_PREP_ASYNC(connect);
+-IO_NETOP_PREP(accept);
+-IO_NETOP_FN(send);
+-IO_NETOP_FN(recv);
+-#endif /* CONFIG_NET */
+-
+-struct io_poll_table {
+- struct poll_table_struct pt;
+- struct io_kiocb *req;
+- int nr_entries;
+- int error;
+-};
+-
+-#define IO_POLL_CANCEL_FLAG BIT(31)
+-#define IO_POLL_REF_MASK GENMASK(30, 0)
+-
+-/*
+- * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
+- * bump it and acquire ownership. It's disallowed to modify requests while not
+- * owning it, that prevents from races for enqueueing task_work's and b/w
+- * arming poll and wakeups.
+- */
+-static inline bool io_poll_get_ownership(struct io_kiocb *req)
+-{
+- return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
+-}
+-
+-static void io_poll_mark_cancelled(struct io_kiocb *req)
+-{
+- atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
+-}
+-
+-static struct io_poll_iocb *io_poll_get_double(struct io_kiocb *req)
+-{
+- /* pure poll stashes this in ->async_data, poll driven retry elsewhere */
+- if (req->opcode == IORING_OP_POLL_ADD)
+- return req->async_data;
+- return req->apoll->double_poll;
+-}
+-
+-static struct io_poll_iocb *io_poll_get_single(struct io_kiocb *req)
+-{
+- if (req->opcode == IORING_OP_POLL_ADD)
+- return &req->poll;
+- return &req->apoll->poll;
+-}
+-
+-static void io_poll_req_insert(struct io_kiocb *req)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct hlist_head *list;
+-
+- list = &ctx->cancel_hash[hash_long(req->user_data, ctx->cancel_hash_bits)];
+- hlist_add_head(&req->hash_node, list);
+-}
+-
+-static void io_init_poll_iocb(struct io_poll_iocb *poll, __poll_t events,
+- wait_queue_func_t wake_func)
+-{
+- poll->head = NULL;
+-#define IO_POLL_UNMASK (EPOLLERR|EPOLLHUP|EPOLLNVAL|EPOLLRDHUP)
+- /* mask in events that we always want/need */
+- poll->events = events | IO_POLL_UNMASK;
+- INIT_LIST_HEAD(&poll->wait.entry);
+- init_waitqueue_func_entry(&poll->wait, wake_func);
+-}
+-
+-static inline void io_poll_remove_entry(struct io_poll_iocb *poll)
+-{
+- struct wait_queue_head *head = smp_load_acquire(&poll->head);
+-
+- if (head) {
+- spin_lock_irq(&head->lock);
+- list_del_init(&poll->wait.entry);
+- poll->head = NULL;
+- spin_unlock_irq(&head->lock);
+- }
+-}
+-
+-static void io_poll_remove_entries(struct io_kiocb *req)
+-{
+- /*
+- * Nothing to do if neither of those flags are set. Avoid dipping
+- * into the poll/apoll/double cachelines if we can.
+- */
+- if (!(req->flags & (REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL)))
+- return;
+-
+- /*
+- * While we hold the waitqueue lock and the waitqueue is nonempty,
+- * wake_up_pollfree() will wait for us. However, taking the waitqueue
+- * lock in the first place can race with the waitqueue being freed.
+- *
+- * We solve this as eventpoll does: by taking advantage of the fact that
+- * all users of wake_up_pollfree() will RCU-delay the actual free. If
+- * we enter rcu_read_lock() and see that the pointer to the queue is
+- * non-NULL, we can then lock it without the memory being freed out from
+- * under us.
+- *
+- * Keep holding rcu_read_lock() as long as we hold the queue lock, in
+- * case the caller deletes the entry from the queue, leaving it empty.
+- * In that case, only RCU prevents the queue memory from being freed.
+- */
+- rcu_read_lock();
+- if (req->flags & REQ_F_SINGLE_POLL)
+- io_poll_remove_entry(io_poll_get_single(req));
+- if (req->flags & REQ_F_DOUBLE_POLL)
+- io_poll_remove_entry(io_poll_get_double(req));
+- rcu_read_unlock();
+-}
+-
+-/*
+- * All poll tw should go through this. Checks for poll events, manages
+- * references, does rewait, etc.
+- *
+- * Returns a negative error on failure. >0 when no action require, which is
+- * either spurious wakeup or multishot CQE is served. 0 when it's done with
+- * the request, then the mask is stored in req->result.
+- */
+-static int io_poll_check_events(struct io_kiocb *req, bool locked)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- int v;
+-
+- /* req->task == current here, checking PF_EXITING is safe */
+- if (unlikely(req->task->flags & PF_EXITING))
+- io_poll_mark_cancelled(req);
+-
+- do {
+- v = atomic_read(&req->poll_refs);
+-
+- /* tw handler should be the owner, and so have some references */
+- if (WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))
+- return 0;
+- if (v & IO_POLL_CANCEL_FLAG)
+- return -ECANCELED;
+-
+- if (!req->result) {
+- struct poll_table_struct pt = { ._key = req->apoll_events };
+- req->result = vfs_poll(req->file, &pt) & req->apoll_events;
+- }
+-
+- /* multishot, just fill an CQE and proceed */
+- if (req->result && !(req->apoll_events & EPOLLONESHOT)) {
+- __poll_t mask = mangle_poll(req->result & req->apoll_events);
+- bool filled;
+-
+- spin_lock(&ctx->completion_lock);
+- filled = io_fill_cqe_aux(ctx, req->user_data, mask,
+- IORING_CQE_F_MORE);
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- if (unlikely(!filled))
+- return -ECANCELED;
+- io_cqring_ev_posted(ctx);
+- } else if (req->result) {
+- return 0;
+- }
+-
+- /*
+- * Release all references, retry if someone tried to restart
+- * task_work while we were executing it.
+- */
+- } while (atomic_sub_return(v & IO_POLL_REF_MASK, &req->poll_refs));
+-
+- return 1;
+-}
+-
+-static void io_poll_task_func(struct io_kiocb *req, bool *locked)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- int ret;
+-
+- ret = io_poll_check_events(req, *locked);
+- if (ret > 0)
+- return;
+-
+- if (!ret) {
+- req->result = mangle_poll(req->result & req->poll.events);
+- } else {
+- req->result = ret;
+- req_set_fail(req);
+- }
+-
+- io_poll_remove_entries(req);
+- spin_lock(&ctx->completion_lock);
+- hash_del(&req->hash_node);
+- __io_req_complete_post(req, req->result, 0);
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- io_cqring_ev_posted(ctx);
+-}
+-
+-static void io_apoll_task_func(struct io_kiocb *req, bool *locked)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- int ret;
+-
+- ret = io_poll_check_events(req, *locked);
+- if (ret > 0)
+- return;
+-
+- io_poll_remove_entries(req);
+- spin_lock(&ctx->completion_lock);
+- hash_del(&req->hash_node);
+- spin_unlock(&ctx->completion_lock);
+-
+- if (!ret)
+- io_req_task_submit(req, locked);
+- else
+- io_req_complete_failed(req, ret);
+-}
+-
+-static void __io_poll_execute(struct io_kiocb *req, int mask,
+- __poll_t __maybe_unused events)
+-{
+- req->result = mask;
+- /*
+- * This is useful for poll that is armed on behalf of another
+- * request, and where the wakeup path could be on a different
+- * CPU. We want to avoid pulling in req->apoll->events for that
+- * case.
+- */
+- if (req->opcode == IORING_OP_POLL_ADD)
+- req->io_task_work.func = io_poll_task_func;
+- else
+- req->io_task_work.func = io_apoll_task_func;
+-
+- trace_io_uring_task_add(req->ctx, req, req->user_data, req->opcode, mask);
+- io_req_task_work_add(req, false);
+-}
+-
+-static inline void io_poll_execute(struct io_kiocb *req, int res,
+- __poll_t events)
+-{
+- if (io_poll_get_ownership(req))
+- __io_poll_execute(req, res, events);
+-}
+-
+-static void io_poll_cancel_req(struct io_kiocb *req)
+-{
+- io_poll_mark_cancelled(req);
+- /* kick tw, which should complete the request */
+- io_poll_execute(req, 0, 0);
+-}
+-
+-#define wqe_to_req(wait) ((void *)((unsigned long) (wait)->private & ~1))
+-#define wqe_is_double(wait) ((unsigned long) (wait)->private & 1)
+-#define IO_ASYNC_POLL_COMMON (EPOLLONESHOT | POLLPRI)
+-
+-static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
+- void *key)
+-{
+- struct io_kiocb *req = wqe_to_req(wait);
+- struct io_poll_iocb *poll = container_of(wait, struct io_poll_iocb,
+- wait);
+- __poll_t mask = key_to_poll(key);
+-
+- if (unlikely(mask & POLLFREE)) {
+- io_poll_mark_cancelled(req);
+- /* we have to kick tw in case it's not already */
+- io_poll_execute(req, 0, poll->events);
+-
+- /*
+- * If the waitqueue is being freed early but someone is already
+- * holds ownership over it, we have to tear down the request as
+- * best we can. That means immediately removing the request from
+- * its waitqueue and preventing all further accesses to the
+- * waitqueue via the request.
+- */
+- list_del_init(&poll->wait.entry);
+-
+- /*
+- * Careful: this *must* be the last step, since as soon
+- * as req->head is NULL'ed out, the request can be
+- * completed and freed, since aio_poll_complete_work()
+- * will no longer need to take the waitqueue lock.
+- */
+- smp_store_release(&poll->head, NULL);
+- return 1;
+- }
+-
+- /* for instances that support it check for an event match first */
+- if (mask && !(mask & (poll->events & ~IO_ASYNC_POLL_COMMON)))
+- return 0;
+-
+- if (io_poll_get_ownership(req)) {
+- /* optional, saves extra locking for removal in tw handler */
+- if (mask && poll->events & EPOLLONESHOT) {
+- list_del_init(&poll->wait.entry);
+- poll->head = NULL;
+- if (wqe_is_double(wait))
+- req->flags &= ~REQ_F_DOUBLE_POLL;
+- else
+- req->flags &= ~REQ_F_SINGLE_POLL;
+- }
+- __io_poll_execute(req, mask, poll->events);
+- }
+- return 1;
+-}
+-
+-static void __io_queue_proc(struct io_poll_iocb *poll, struct io_poll_table *pt,
+- struct wait_queue_head *head,
+- struct io_poll_iocb **poll_ptr)
+-{
+- struct io_kiocb *req = pt->req;
+- unsigned long wqe_private = (unsigned long) req;
+-
+- /*
+- * The file being polled uses multiple waitqueues for poll handling
+- * (e.g. one for read, one for write). Setup a separate io_poll_iocb
+- * if this happens.
+- */
+- if (unlikely(pt->nr_entries)) {
+- struct io_poll_iocb *first = poll;
+-
+- /* double add on the same waitqueue head, ignore */
+- if (first->head == head)
+- return;
+- /* already have a 2nd entry, fail a third attempt */
+- if (*poll_ptr) {
+- if ((*poll_ptr)->head == head)
+- return;
+- pt->error = -EINVAL;
+- return;
+- }
+-
+- poll = kmalloc(sizeof(*poll), GFP_ATOMIC);
+- if (!poll) {
+- pt->error = -ENOMEM;
+- return;
+- }
+- /* mark as double wq entry */
+- wqe_private |= 1;
+- req->flags |= REQ_F_DOUBLE_POLL;
+- io_init_poll_iocb(poll, first->events, first->wait.func);
+- *poll_ptr = poll;
+- if (req->opcode == IORING_OP_POLL_ADD)
+- req->flags |= REQ_F_ASYNC_DATA;
+- }
+-
+- req->flags |= REQ_F_SINGLE_POLL;
+- pt->nr_entries++;
+- poll->head = head;
+- poll->wait.private = (void *) wqe_private;
+-
+- if (poll->events & EPOLLEXCLUSIVE)
+- add_wait_queue_exclusive(head, &poll->wait);
+- else
+- add_wait_queue(head, &poll->wait);
+-}
+-
+-static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
+- struct poll_table_struct *p)
+-{
+- struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
+-
+- __io_queue_proc(&pt->req->poll, pt, head,
+- (struct io_poll_iocb **) &pt->req->async_data);
+-}
+-
+-static int __io_arm_poll_handler(struct io_kiocb *req,
+- struct io_poll_iocb *poll,
+- struct io_poll_table *ipt, __poll_t mask)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- int v;
+-
+- INIT_HLIST_NODE(&req->hash_node);
+- io_init_poll_iocb(poll, mask, io_poll_wake);
+- poll->file = req->file;
+-
+- req->apoll_events = poll->events;
+-
+- ipt->pt._key = mask;
+- ipt->req = req;
+- ipt->error = 0;
+- ipt->nr_entries = 0;
+-
+- /*
+- * Take the ownership to delay any tw execution up until we're done
+- * with poll arming. see io_poll_get_ownership().
+- */
+- atomic_set(&req->poll_refs, 1);
+- mask = vfs_poll(req->file, &ipt->pt) & poll->events;
+-
+- if (mask && (poll->events & EPOLLONESHOT)) {
+- io_poll_remove_entries(req);
+- /* no one else has access to the req, forget about the ref */
+- return mask;
+- }
+- if (!mask && unlikely(ipt->error || !ipt->nr_entries)) {
+- io_poll_remove_entries(req);
+- if (!ipt->error)
+- ipt->error = -EINVAL;
+- return 0;
+- }
+-
+- spin_lock(&ctx->completion_lock);
+- io_poll_req_insert(req);
+- spin_unlock(&ctx->completion_lock);
+-
+- if (mask) {
+- /* can't multishot if failed, just queue the event we've got */
+- if (unlikely(ipt->error || !ipt->nr_entries)) {
+- poll->events |= EPOLLONESHOT;
+- req->apoll_events |= EPOLLONESHOT;
+- ipt->error = 0;
+- }
+- __io_poll_execute(req, mask, poll->events);
+- return 0;
+- }
+-
+- /*
+- * Release ownership. If someone tried to queue a tw while it was
+- * locked, kick it off for them.
+- */
+- v = atomic_dec_return(&req->poll_refs);
+- if (unlikely(v & IO_POLL_REF_MASK))
+- __io_poll_execute(req, 0, poll->events);
+- return 0;
+-}
+-
+-static void io_async_queue_proc(struct file *file, struct wait_queue_head *head,
+- struct poll_table_struct *p)
+-{
+- struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
+- struct async_poll *apoll = pt->req->apoll;
+-
+- __io_queue_proc(&apoll->poll, pt, head, &apoll->double_poll);
+-}
+-
+-enum {
+- IO_APOLL_OK,
+- IO_APOLL_ABORTED,
+- IO_APOLL_READY
+-};
+-
+-static int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
+-{
+- const struct io_op_def *def = &io_op_defs[req->opcode];
+- struct io_ring_ctx *ctx = req->ctx;
+- struct async_poll *apoll;
+- struct io_poll_table ipt;
+- __poll_t mask = IO_ASYNC_POLL_COMMON | POLLERR;
+- int ret;
+-
+- if (!def->pollin && !def->pollout)
+- return IO_APOLL_ABORTED;
+- if (!file_can_poll(req->file) || (req->flags & REQ_F_POLLED))
+- return IO_APOLL_ABORTED;
+-
+- if (def->pollin) {
+- mask |= POLLIN | POLLRDNORM;
+-
+- /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
+- if ((req->opcode == IORING_OP_RECVMSG) &&
+- (req->sr_msg.msg_flags & MSG_ERRQUEUE))
+- mask &= ~POLLIN;
+- } else {
+- mask |= POLLOUT | POLLWRNORM;
+- }
+- if (def->poll_exclusive)
+- mask |= EPOLLEXCLUSIVE;
+- if (!(issue_flags & IO_URING_F_UNLOCKED) &&
+- !list_empty(&ctx->apoll_cache)) {
+- apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
+- poll.wait.entry);
+- list_del_init(&apoll->poll.wait.entry);
+- } else {
+- apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
+- if (unlikely(!apoll))
+- return IO_APOLL_ABORTED;
+- }
+- apoll->double_poll = NULL;
+- req->apoll = apoll;
+- req->flags |= REQ_F_POLLED;
+- ipt.pt._qproc = io_async_queue_proc;
+-
+- io_kbuf_recycle(req, issue_flags);
+-
+- ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask);
+- if (ret || ipt.error)
+- return ret ? IO_APOLL_READY : IO_APOLL_ABORTED;
+-
+- trace_io_uring_poll_arm(ctx, req, req->user_data, req->opcode,
+- mask, apoll->poll.events);
+- return IO_APOLL_OK;
+-}
+-
+-/*
+- * Returns true if we found and killed one or more poll requests
+- */
+-static __cold bool io_poll_remove_all(struct io_ring_ctx *ctx,
+- struct task_struct *tsk, bool cancel_all)
+-{
+- struct hlist_node *tmp;
+- struct io_kiocb *req;
+- bool found = false;
+- int i;
+-
+- spin_lock(&ctx->completion_lock);
+- for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
+- struct hlist_head *list;
+-
+- list = &ctx->cancel_hash[i];
+- hlist_for_each_entry_safe(req, tmp, list, hash_node) {
+- if (io_match_task_safe(req, tsk, cancel_all)) {
+- hlist_del_init(&req->hash_node);
+- io_poll_cancel_req(req);
+- found = true;
+- }
+- }
+- }
+- spin_unlock(&ctx->completion_lock);
+- return found;
+-}
+-
+-static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, __u64 sqe_addr,
+- bool poll_only)
+- __must_hold(&ctx->completion_lock)
+-{
+- struct hlist_head *list;
+- struct io_kiocb *req;
+-
+- list = &ctx->cancel_hash[hash_long(sqe_addr, ctx->cancel_hash_bits)];
+- hlist_for_each_entry(req, list, hash_node) {
+- if (sqe_addr != req->user_data)
+- continue;
+- if (poll_only && req->opcode != IORING_OP_POLL_ADD)
+- continue;
+- return req;
+- }
+- return NULL;
+-}
+-
+-static bool io_poll_disarm(struct io_kiocb *req)
+- __must_hold(&ctx->completion_lock)
+-{
+- if (!io_poll_get_ownership(req))
+- return false;
+- io_poll_remove_entries(req);
+- hash_del(&req->hash_node);
+- return true;
+-}
+-
+-static int io_poll_cancel(struct io_ring_ctx *ctx, __u64 sqe_addr,
+- bool poll_only)
+- __must_hold(&ctx->completion_lock)
+-{
+- struct io_kiocb *req = io_poll_find(ctx, sqe_addr, poll_only);
+-
+- if (!req)
+- return -ENOENT;
+- io_poll_cancel_req(req);
+- return 0;
+-}
+-
+-static __poll_t io_poll_parse_events(const struct io_uring_sqe *sqe,
+- unsigned int flags)
+-{
+- u32 events;
+-
+- events = READ_ONCE(sqe->poll32_events);
+-#ifdef __BIG_ENDIAN
+- events = swahw32(events);
+-#endif
+- if (!(flags & IORING_POLL_ADD_MULTI))
+- events |= EPOLLONESHOT;
+- return demangle_poll(events) | (events & (EPOLLEXCLUSIVE|EPOLLONESHOT));
+-}
+-
+-static int io_poll_update_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_poll_update *upd = &req->poll_update;
+- u32 flags;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+- return -EINVAL;
+- flags = READ_ONCE(sqe->len);
+- if (flags & ~(IORING_POLL_UPDATE_EVENTS | IORING_POLL_UPDATE_USER_DATA |
+- IORING_POLL_ADD_MULTI))
+- return -EINVAL;
+- /* meaningless without update */
+- if (flags == IORING_POLL_ADD_MULTI)
+- return -EINVAL;
+-
+- upd->old_user_data = READ_ONCE(sqe->addr);
+- upd->update_events = flags & IORING_POLL_UPDATE_EVENTS;
+- upd->update_user_data = flags & IORING_POLL_UPDATE_USER_DATA;
+-
+- upd->new_user_data = READ_ONCE(sqe->off);
+- if (!upd->update_user_data && upd->new_user_data)
+- return -EINVAL;
+- if (upd->update_events)
+- upd->events = io_poll_parse_events(sqe, flags);
+- else if (sqe->poll32_events)
+- return -EINVAL;
+-
+- return 0;
+-}
+-
+-static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- struct io_poll_iocb *poll = &req->poll;
+- u32 flags;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->addr)
+- return -EINVAL;
+- flags = READ_ONCE(sqe->len);
+- if (flags & ~IORING_POLL_ADD_MULTI)
+- return -EINVAL;
+- if ((flags & IORING_POLL_ADD_MULTI) && (req->flags & REQ_F_CQE_SKIP))
+- return -EINVAL;
+-
+- io_req_set_refcount(req);
+- poll->events = io_poll_parse_events(sqe, flags);
+- return 0;
+-}
+-
+-static int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_poll_iocb *poll = &req->poll;
+- struct io_poll_table ipt;
+- int ret;
+-
+- ipt.pt._qproc = io_poll_queue_proc;
+-
+- ret = __io_arm_poll_handler(req, &req->poll, &ipt, poll->events);
+- if (!ret && ipt.error)
+- req_set_fail(req);
+- ret = ret ?: ipt.error;
+- if (ret)
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int io_poll_update(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_kiocb *preq;
+- int ret2, ret = 0;
+- bool locked;
+-
+- spin_lock(&ctx->completion_lock);
+- preq = io_poll_find(ctx, req->poll_update.old_user_data, true);
+- if (!preq || !io_poll_disarm(preq)) {
+- spin_unlock(&ctx->completion_lock);
+- ret = preq ? -EALREADY : -ENOENT;
+- goto out;
+- }
+- spin_unlock(&ctx->completion_lock);
+-
+- if (req->poll_update.update_events || req->poll_update.update_user_data) {
+- /* only mask one event flags, keep behavior flags */
+- if (req->poll_update.update_events) {
+- preq->poll.events &= ~0xffff;
+- preq->poll.events |= req->poll_update.events & 0xffff;
+- preq->poll.events |= IO_POLL_UNMASK;
+- }
+- if (req->poll_update.update_user_data)
+- preq->user_data = req->poll_update.new_user_data;
+-
+- ret2 = io_poll_add(preq, issue_flags);
+- /* successfully updated, don't complete poll request */
+- if (!ret2)
+- goto out;
+- }
+-
+- req_set_fail(preq);
+- preq->result = -ECANCELED;
+- locked = !(issue_flags & IO_URING_F_UNLOCKED);
+- io_req_task_complete(preq, &locked);
+-out:
+- if (ret < 0)
+- req_set_fail(req);
+- /* complete update request, we're done with it */
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
+-{
+- struct io_timeout_data *data = container_of(timer,
+- struct io_timeout_data, timer);
+- struct io_kiocb *req = data->req;
+- struct io_ring_ctx *ctx = req->ctx;
+- unsigned long flags;
+-
+- spin_lock_irqsave(&ctx->timeout_lock, flags);
+- list_del_init(&req->timeout.list);
+- atomic_set(&req->ctx->cq_timeouts,
+- atomic_read(&req->ctx->cq_timeouts) + 1);
+- spin_unlock_irqrestore(&ctx->timeout_lock, flags);
+-
+- if (!(data->flags & IORING_TIMEOUT_ETIME_SUCCESS))
+- req_set_fail(req);
+-
+- req->result = -ETIME;
+- req->io_task_work.func = io_req_task_complete;
+- io_req_task_work_add(req, false);
+- return HRTIMER_NORESTART;
+-}
+-
+-static struct io_kiocb *io_timeout_extract(struct io_ring_ctx *ctx,
+- __u64 user_data)
+- __must_hold(&ctx->timeout_lock)
+-{
+- struct io_timeout_data *io;
+- struct io_kiocb *req;
+- bool found = false;
+-
+- list_for_each_entry(req, &ctx->timeout_list, timeout.list) {
+- found = user_data == req->user_data;
+- if (found)
+- break;
+- }
+- if (!found)
+- return ERR_PTR(-ENOENT);
+-
+- io = req->async_data;
+- if (hrtimer_try_to_cancel(&io->timer) == -1)
+- return ERR_PTR(-EALREADY);
+- list_del_init(&req->timeout.list);
+- return req;
+-}
+-
+-static int io_timeout_cancel(struct io_ring_ctx *ctx, __u64 user_data)
+- __must_hold(&ctx->completion_lock)
+- __must_hold(&ctx->timeout_lock)
+-{
+- struct io_kiocb *req = io_timeout_extract(ctx, user_data);
+-
+- if (IS_ERR(req))
+- return PTR_ERR(req);
+- io_req_task_queue_fail(req, -ECANCELED);
+- return 0;
+-}
+-
+-static clockid_t io_timeout_get_clock(struct io_timeout_data *data)
+-{
+- switch (data->flags & IORING_TIMEOUT_CLOCK_MASK) {
+- case IORING_TIMEOUT_BOOTTIME:
+- return CLOCK_BOOTTIME;
+- case IORING_TIMEOUT_REALTIME:
+- return CLOCK_REALTIME;
+- default:
+- /* can't happen, vetted at prep time */
+- WARN_ON_ONCE(1);
+- fallthrough;
+- case 0:
+- return CLOCK_MONOTONIC;
+- }
+-}
+-
+-static int io_linked_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
+- struct timespec64 *ts, enum hrtimer_mode mode)
+- __must_hold(&ctx->timeout_lock)
+-{
+- struct io_timeout_data *io;
+- struct io_kiocb *req;
+- bool found = false;
+-
+- list_for_each_entry(req, &ctx->ltimeout_list, timeout.list) {
+- found = user_data == req->user_data;
+- if (found)
+- break;
+- }
+- if (!found)
+- return -ENOENT;
+-
+- io = req->async_data;
+- if (hrtimer_try_to_cancel(&io->timer) == -1)
+- return -EALREADY;
+- hrtimer_init(&io->timer, io_timeout_get_clock(io), mode);
+- io->timer.function = io_link_timeout_fn;
+- hrtimer_start(&io->timer, timespec64_to_ktime(*ts), mode);
+- return 0;
+-}
+-
+-static int io_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
+- struct timespec64 *ts, enum hrtimer_mode mode)
+- __must_hold(&ctx->timeout_lock)
+-{
+- struct io_kiocb *req = io_timeout_extract(ctx, user_data);
+- struct io_timeout_data *data;
+-
+- if (IS_ERR(req))
+- return PTR_ERR(req);
+-
+- req->timeout.off = 0; /* noseq */
+- data = req->async_data;
+- list_add_tail(&req->timeout.list, &ctx->timeout_list);
+- hrtimer_init(&data->timer, io_timeout_get_clock(data), mode);
+- data->timer.function = io_timeout_fn;
+- hrtimer_start(&data->timer, timespec64_to_ktime(*ts), mode);
+- return 0;
+-}
+-
+-static int io_timeout_remove_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- struct io_timeout_rem *tr = &req->timeout_rem;
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->buf_index || sqe->len || sqe->splice_fd_in)
+- return -EINVAL;
+-
+- tr->ltimeout = false;
+- tr->addr = READ_ONCE(sqe->addr);
+- tr->flags = READ_ONCE(sqe->timeout_flags);
+- if (tr->flags & IORING_TIMEOUT_UPDATE_MASK) {
+- if (hweight32(tr->flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
+- return -EINVAL;
+- if (tr->flags & IORING_LINK_TIMEOUT_UPDATE)
+- tr->ltimeout = true;
+- if (tr->flags & ~(IORING_TIMEOUT_UPDATE_MASK|IORING_TIMEOUT_ABS))
+- return -EINVAL;
+- if (get_timespec64(&tr->ts, u64_to_user_ptr(sqe->addr2)))
+- return -EFAULT;
+- if (tr->ts.tv_sec < 0 || tr->ts.tv_nsec < 0)
+- return -EINVAL;
+- } else if (tr->flags) {
+- /* timeout removal doesn't support flags */
+- return -EINVAL;
+- }
+-
+- return 0;
+-}
+-
+-static inline enum hrtimer_mode io_translate_timeout_mode(unsigned int flags)
+-{
+- return (flags & IORING_TIMEOUT_ABS) ? HRTIMER_MODE_ABS
+- : HRTIMER_MODE_REL;
+-}
+-
+-/*
+- * Remove or update an existing timeout command
+- */
+-static int io_timeout_remove(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_timeout_rem *tr = &req->timeout_rem;
+- struct io_ring_ctx *ctx = req->ctx;
+- int ret;
+-
+- if (!(req->timeout_rem.flags & IORING_TIMEOUT_UPDATE)) {
+- spin_lock(&ctx->completion_lock);
+- spin_lock_irq(&ctx->timeout_lock);
+- ret = io_timeout_cancel(ctx, tr->addr);
+- spin_unlock_irq(&ctx->timeout_lock);
+- spin_unlock(&ctx->completion_lock);
+- } else {
+- enum hrtimer_mode mode = io_translate_timeout_mode(tr->flags);
+-
+- spin_lock_irq(&ctx->timeout_lock);
+- if (tr->ltimeout)
+- ret = io_linked_timeout_update(ctx, tr->addr, &tr->ts, mode);
+- else
+- ret = io_timeout_update(ctx, tr->addr, &tr->ts, mode);
+- spin_unlock_irq(&ctx->timeout_lock);
+- }
+-
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete_post(req, ret, 0);
+- return 0;
+-}
+-
+-static int io_timeout_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
+- bool is_timeout_link)
+-{
+- struct io_timeout_data *data;
+- unsigned flags;
+- u32 off = READ_ONCE(sqe->off);
+-
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->buf_index || sqe->len != 1 ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+- if (off && is_timeout_link)
+- return -EINVAL;
+- flags = READ_ONCE(sqe->timeout_flags);
+- if (flags & ~(IORING_TIMEOUT_ABS | IORING_TIMEOUT_CLOCK_MASK |
+- IORING_TIMEOUT_ETIME_SUCCESS))
+- return -EINVAL;
+- /* more than one clock specified is invalid, obviously */
+- if (hweight32(flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
+- return -EINVAL;
+-
+- INIT_LIST_HEAD(&req->timeout.list);
+- req->timeout.off = off;
+- if (unlikely(off && !req->ctx->off_timeout_used))
+- req->ctx->off_timeout_used = true;
+-
+- if (WARN_ON_ONCE(req_has_async_data(req)))
+- return -EFAULT;
+- if (io_alloc_async_data(req))
+- return -ENOMEM;
+-
+- data = req->async_data;
+- data->req = req;
+- data->flags = flags;
+-
+- if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
+- return -EFAULT;
+-
+- if (data->ts.tv_sec < 0 || data->ts.tv_nsec < 0)
+- return -EINVAL;
+-
+- INIT_LIST_HEAD(&req->timeout.list);
+- data->mode = io_translate_timeout_mode(flags);
+- hrtimer_init(&data->timer, io_timeout_get_clock(data), data->mode);
+-
+- if (is_timeout_link) {
+- struct io_submit_link *link = &req->ctx->submit_state.link;
+-
+- if (!link->head)
+- return -EINVAL;
+- if (link->last->opcode == IORING_OP_LINK_TIMEOUT)
+- return -EINVAL;
+- req->timeout.head = link->last;
+- link->last->flags |= REQ_F_ARM_LTIMEOUT;
+- }
+- return 0;
+-}
+-
+-static int io_timeout(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_timeout_data *data = req->async_data;
+- struct list_head *entry;
+- u32 tail, off = req->timeout.off;
+-
+- spin_lock_irq(&ctx->timeout_lock);
+-
+- /*
+- * sqe->off holds how many events that need to occur for this
+- * timeout event to be satisfied. If it isn't set, then this is
+- * a pure timeout request, sequence isn't used.
+- */
+- if (io_is_timeout_noseq(req)) {
+- entry = ctx->timeout_list.prev;
+- goto add;
+- }
+-
+- tail = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
+- req->timeout.target_seq = tail + off;
+-
+- /* Update the last seq here in case io_flush_timeouts() hasn't.
+- * This is safe because ->completion_lock is held, and submissions
+- * and completions are never mixed in the same ->completion_lock section.
+- */
+- ctx->cq_last_tm_flush = tail;
+-
+- /*
+- * Insertion sort, ensuring the first entry in the list is always
+- * the one we need first.
+- */
+- list_for_each_prev(entry, &ctx->timeout_list) {
+- struct io_kiocb *nxt = list_entry(entry, struct io_kiocb,
+- timeout.list);
+-
+- if (io_is_timeout_noseq(nxt))
+- continue;
+- /* nxt.seq is behind @tail, otherwise would've been completed */
+- if (off >= nxt->timeout.target_seq - tail)
+- break;
+- }
+-add:
+- list_add(&req->timeout.list, entry);
+- data->timer.function = io_timeout_fn;
+- hrtimer_start(&data->timer, timespec64_to_ktime(data->ts), data->mode);
+- spin_unlock_irq(&ctx->timeout_lock);
+- return 0;
+-}
+-
+-struct io_cancel_data {
+- struct io_ring_ctx *ctx;
+- u64 user_data;
+-};
+-
+-static bool io_cancel_cb(struct io_wq_work *work, void *data)
+-{
+- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+- struct io_cancel_data *cd = data;
+-
+- return req->ctx == cd->ctx && req->user_data == cd->user_data;
+-}
+-
+-static int io_async_cancel_one(struct io_uring_task *tctx, u64 user_data,
+- struct io_ring_ctx *ctx)
+-{
+- struct io_cancel_data data = { .ctx = ctx, .user_data = user_data, };
+- enum io_wq_cancel cancel_ret;
+- int ret = 0;
+-
+- if (!tctx || !tctx->io_wq)
+- return -ENOENT;
+-
+- cancel_ret = io_wq_cancel_cb(tctx->io_wq, io_cancel_cb, &data, false);
+- switch (cancel_ret) {
+- case IO_WQ_CANCEL_OK:
+- ret = 0;
+- break;
+- case IO_WQ_CANCEL_RUNNING:
+- ret = -EALREADY;
+- break;
+- case IO_WQ_CANCEL_NOTFOUND:
+- ret = -ENOENT;
+- break;
+- }
+-
+- return ret;
+-}
+-
+-static int io_try_cancel_userdata(struct io_kiocb *req, u64 sqe_addr)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- int ret;
+-
+- WARN_ON_ONCE(!io_wq_current_is_worker() && req->task != current);
+-
+- ret = io_async_cancel_one(req->task->io_uring, sqe_addr, ctx);
+- /*
+- * Fall-through even for -EALREADY, as we may have poll armed
+- * that need unarming.
+- */
+- if (!ret)
+- return 0;
+-
+- spin_lock(&ctx->completion_lock);
+- ret = io_poll_cancel(ctx, sqe_addr, false);
+- if (ret != -ENOENT)
+- goto out;
+-
+- spin_lock_irq(&ctx->timeout_lock);
+- ret = io_timeout_cancel(ctx, sqe_addr);
+- spin_unlock_irq(&ctx->timeout_lock);
+-out:
+- spin_unlock(&ctx->completion_lock);
+- return ret;
+-}
+-
+-static int io_async_cancel_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+- return -EINVAL;
+- if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->off || sqe->len || sqe->cancel_flags ||
+- sqe->splice_fd_in)
+- return -EINVAL;
+-
+- req->cancel.addr = READ_ONCE(sqe->addr);
+- return 0;
+-}
+-
+-static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- u64 sqe_addr = req->cancel.addr;
+- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+- struct io_tctx_node *node;
+- int ret;
+-
+- ret = io_try_cancel_userdata(req, sqe_addr);
+- if (ret != -ENOENT)
+- goto done;
+-
+- /* slow path, try all io-wq's */
+- io_ring_submit_lock(ctx, needs_lock);
+- ret = -ENOENT;
+- list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
+- struct io_uring_task *tctx = node->task->io_uring;
+-
+- ret = io_async_cancel_one(tctx, req->cancel.addr, ctx);
+- if (ret != -ENOENT)
+- break;
+- }
+- io_ring_submit_unlock(ctx, needs_lock);
+-done:
+- if (ret < 0)
+- req_set_fail(req);
+- io_req_complete_post(req, ret, 0);
+- return 0;
+-}
+-
+-static int io_rsrc_update_prep(struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+-{
+- if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+- return -EINVAL;
+- if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
+- return -EINVAL;
+-
+- req->rsrc_update.offset = READ_ONCE(sqe->off);
+- req->rsrc_update.nr_args = READ_ONCE(sqe->len);
+- if (!req->rsrc_update.nr_args)
+- return -EINVAL;
+- req->rsrc_update.arg = READ_ONCE(sqe->addr);
+- return 0;
+-}
+-
+-static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+- struct io_uring_rsrc_update2 up;
+- int ret;
+-
+- up.offset = req->rsrc_update.offset;
+- up.data = req->rsrc_update.arg;
+- up.nr = 0;
+- up.tags = 0;
+- up.resv = 0;
+- up.resv2 = 0;
+-
+- io_ring_submit_lock(ctx, needs_lock);
+- ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE,
+- &up, req->rsrc_update.nr_args);
+- io_ring_submit_unlock(ctx, needs_lock);
+-
+- if (ret < 0)
+- req_set_fail(req);
+- __io_req_complete(req, issue_flags, ret, 0);
+- return 0;
+-}
+-
+-static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+-{
+- switch (req->opcode) {
+- case IORING_OP_NOP:
+- return 0;
+- case IORING_OP_READV:
+- case IORING_OP_READ_FIXED:
+- case IORING_OP_READ:
+- case IORING_OP_WRITEV:
+- case IORING_OP_WRITE_FIXED:
+- case IORING_OP_WRITE:
+- return io_prep_rw(req, sqe);
+- case IORING_OP_POLL_ADD:
+- return io_poll_add_prep(req, sqe);
+- case IORING_OP_POLL_REMOVE:
+- return io_poll_update_prep(req, sqe);
+- case IORING_OP_FSYNC:
+- return io_fsync_prep(req, sqe);
+- case IORING_OP_SYNC_FILE_RANGE:
+- return io_sfr_prep(req, sqe);
+- case IORING_OP_SENDMSG:
+- case IORING_OP_SEND:
+- return io_sendmsg_prep(req, sqe);
+- case IORING_OP_RECVMSG:
+- case IORING_OP_RECV:
+- return io_recvmsg_prep(req, sqe);
+- case IORING_OP_CONNECT:
+- return io_connect_prep(req, sqe);
+- case IORING_OP_TIMEOUT:
+- return io_timeout_prep(req, sqe, false);
+- case IORING_OP_TIMEOUT_REMOVE:
+- return io_timeout_remove_prep(req, sqe);
+- case IORING_OP_ASYNC_CANCEL:
+- return io_async_cancel_prep(req, sqe);
+- case IORING_OP_LINK_TIMEOUT:
+- return io_timeout_prep(req, sqe, true);
+- case IORING_OP_ACCEPT:
+- return io_accept_prep(req, sqe);
+- case IORING_OP_FALLOCATE:
+- return io_fallocate_prep(req, sqe);
+- case IORING_OP_OPENAT:
+- return io_openat_prep(req, sqe);
+- case IORING_OP_CLOSE:
+- return io_close_prep(req, sqe);
+- case IORING_OP_FILES_UPDATE:
+- return io_rsrc_update_prep(req, sqe);
+- case IORING_OP_STATX:
+- return io_statx_prep(req, sqe);
+- case IORING_OP_FADVISE:
+- return io_fadvise_prep(req, sqe);
+- case IORING_OP_MADVISE:
+- return io_madvise_prep(req, sqe);
+- case IORING_OP_OPENAT2:
+- return io_openat2_prep(req, sqe);
+- case IORING_OP_EPOLL_CTL:
+- return io_epoll_ctl_prep(req, sqe);
+- case IORING_OP_SPLICE:
+- return io_splice_prep(req, sqe);
+- case IORING_OP_PROVIDE_BUFFERS:
+- return io_provide_buffers_prep(req, sqe);
+- case IORING_OP_REMOVE_BUFFERS:
+- return io_remove_buffers_prep(req, sqe);
+- case IORING_OP_TEE:
+- return io_tee_prep(req, sqe);
+- case IORING_OP_SHUTDOWN:
+- return io_shutdown_prep(req, sqe);
+- case IORING_OP_RENAMEAT:
+- return io_renameat_prep(req, sqe);
+- case IORING_OP_UNLINKAT:
+- return io_unlinkat_prep(req, sqe);
+- case IORING_OP_MKDIRAT:
+- return io_mkdirat_prep(req, sqe);
+- case IORING_OP_SYMLINKAT:
+- return io_symlinkat_prep(req, sqe);
+- case IORING_OP_LINKAT:
+- return io_linkat_prep(req, sqe);
+- case IORING_OP_MSG_RING:
+- return io_msg_ring_prep(req, sqe);
+- }
+-
+- printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
+- req->opcode);
+- return -EINVAL;
+-}
+-
+-static int io_req_prep_async(struct io_kiocb *req)
+-{
+- const struct io_op_def *def = &io_op_defs[req->opcode];
+-
+- /* assign early for deferred execution for non-fixed file */
+- if (def->needs_file && !(req->flags & REQ_F_FIXED_FILE))
+- req->file = io_file_get_normal(req, req->fd);
+- if (!def->needs_async_setup)
+- return 0;
+- if (WARN_ON_ONCE(req_has_async_data(req)))
+- return -EFAULT;
+- if (io_alloc_async_data(req))
+- return -EAGAIN;
+-
+- switch (req->opcode) {
+- case IORING_OP_READV:
+- return io_rw_prep_async(req, READ);
+- case IORING_OP_WRITEV:
+- return io_rw_prep_async(req, WRITE);
+- case IORING_OP_SENDMSG:
+- return io_sendmsg_prep_async(req);
+- case IORING_OP_RECVMSG:
+- return io_recvmsg_prep_async(req);
+- case IORING_OP_CONNECT:
+- return io_connect_prep_async(req);
+- }
+- printk_once(KERN_WARNING "io_uring: prep_async() bad opcode %d\n",
+- req->opcode);
+- return -EFAULT;
+-}
+-
+-static u32 io_get_sequence(struct io_kiocb *req)
+-{
+- u32 seq = req->ctx->cached_sq_head;
+-
+- /* need original cached_sq_head, but it was increased for each req */
+- io_for_each_link(req, req)
+- seq--;
+- return seq;
+-}
+-
+-static __cold void io_drain_req(struct io_kiocb *req)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_defer_entry *de;
+- int ret;
+- u32 seq = io_get_sequence(req);
+-
+- /* Still need defer if there is pending req in defer list. */
+- spin_lock(&ctx->completion_lock);
+- if (!req_need_defer(req, seq) && list_empty_careful(&ctx->defer_list)) {
+- spin_unlock(&ctx->completion_lock);
+-queue:
+- ctx->drain_active = false;
+- io_req_task_queue(req);
+- return;
+- }
+- spin_unlock(&ctx->completion_lock);
+-
+- ret = io_req_prep_async(req);
+- if (ret) {
+-fail:
+- io_req_complete_failed(req, ret);
+- return;
+- }
+- io_prep_async_link(req);
+- de = kmalloc(sizeof(*de), GFP_KERNEL);
+- if (!de) {
+- ret = -ENOMEM;
+- goto fail;
+- }
+-
+- spin_lock(&ctx->completion_lock);
+- if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
+- spin_unlock(&ctx->completion_lock);
+- kfree(de);
+- goto queue;
+- }
+-
+- trace_io_uring_defer(ctx, req, req->user_data, req->opcode);
+- de->req = req;
+- de->seq = seq;
+- list_add_tail(&de->list, &ctx->defer_list);
+- spin_unlock(&ctx->completion_lock);
+-}
+-
+-static void io_clean_op(struct io_kiocb *req)
+-{
+- if (req->flags & REQ_F_BUFFER_SELECTED) {
+- spin_lock(&req->ctx->completion_lock);
+- io_put_kbuf_comp(req);
+- spin_unlock(&req->ctx->completion_lock);
+- }
+-
+- if (req->flags & REQ_F_NEED_CLEANUP) {
+- switch (req->opcode) {
+- case IORING_OP_READV:
+- case IORING_OP_READ_FIXED:
+- case IORING_OP_READ:
+- case IORING_OP_WRITEV:
+- case IORING_OP_WRITE_FIXED:
+- case IORING_OP_WRITE: {
+- struct io_async_rw *io = req->async_data;
+-
+- kfree(io->free_iovec);
+- break;
+- }
+- case IORING_OP_RECVMSG:
+- case IORING_OP_SENDMSG: {
+- struct io_async_msghdr *io = req->async_data;
+-
+- kfree(io->free_iov);
+- break;
+- }
+- case IORING_OP_OPENAT:
+- case IORING_OP_OPENAT2:
+- if (req->open.filename)
+- putname(req->open.filename);
+- break;
+- case IORING_OP_RENAMEAT:
+- putname(req->rename.oldpath);
+- putname(req->rename.newpath);
+- break;
+- case IORING_OP_UNLINKAT:
+- putname(req->unlink.filename);
+- break;
+- case IORING_OP_MKDIRAT:
+- putname(req->mkdir.filename);
+- break;
+- case IORING_OP_SYMLINKAT:
+- putname(req->symlink.oldpath);
+- putname(req->symlink.newpath);
+- break;
+- case IORING_OP_LINKAT:
+- putname(req->hardlink.oldpath);
+- putname(req->hardlink.newpath);
+- break;
+- case IORING_OP_STATX:
+- if (req->statx.filename)
+- putname(req->statx.filename);
+- break;
+- }
+- }
+- if ((req->flags & REQ_F_POLLED) && req->apoll) {
+- kfree(req->apoll->double_poll);
+- kfree(req->apoll);
+- req->apoll = NULL;
+- }
+- if (req->flags & REQ_F_INFLIGHT) {
+- struct io_uring_task *tctx = req->task->io_uring;
+-
+- atomic_dec(&tctx->inflight_tracked);
+- }
+- if (req->flags & REQ_F_CREDS)
+- put_cred(req->creds);
+- if (req->flags & REQ_F_ASYNC_DATA) {
+- kfree(req->async_data);
+- req->async_data = NULL;
+- }
+- req->flags &= ~IO_REQ_CLEAN_FLAGS;
+-}
+-
+-static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- if (req->file || !io_op_defs[req->opcode].needs_file)
+- return true;
+-
+- if (req->flags & REQ_F_FIXED_FILE)
+- req->file = io_file_get_fixed(req, req->fd, issue_flags);
+- else
+- req->file = io_file_get_normal(req, req->fd);
+- if (req->file)
+- return true;
+-
+- req_set_fail(req);
+- req->result = -EBADF;
+- return false;
+-}
+-
+-static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- const struct cred *creds = NULL;
+- int ret;
+-
+- if (unlikely(!io_assign_file(req, issue_flags)))
+- return -EBADF;
+-
+- if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred()))
+- creds = override_creds(req->creds);
+-
+- if (!io_op_defs[req->opcode].audit_skip)
+- audit_uring_entry(req->opcode);
+-
+- switch (req->opcode) {
+- case IORING_OP_NOP:
+- ret = io_nop(req, issue_flags);
+- break;
+- case IORING_OP_READV:
+- case IORING_OP_READ_FIXED:
+- case IORING_OP_READ:
+- ret = io_read(req, issue_flags);
+- break;
+- case IORING_OP_WRITEV:
+- case IORING_OP_WRITE_FIXED:
+- case IORING_OP_WRITE:
+- ret = io_write(req, issue_flags);
+- break;
+- case IORING_OP_FSYNC:
+- ret = io_fsync(req, issue_flags);
+- break;
+- case IORING_OP_POLL_ADD:
+- ret = io_poll_add(req, issue_flags);
+- break;
+- case IORING_OP_POLL_REMOVE:
+- ret = io_poll_update(req, issue_flags);
+- break;
+- case IORING_OP_SYNC_FILE_RANGE:
+- ret = io_sync_file_range(req, issue_flags);
+- break;
+- case IORING_OP_SENDMSG:
+- ret = io_sendmsg(req, issue_flags);
+- break;
+- case IORING_OP_SEND:
+- ret = io_send(req, issue_flags);
+- break;
+- case IORING_OP_RECVMSG:
+- ret = io_recvmsg(req, issue_flags);
+- break;
+- case IORING_OP_RECV:
+- ret = io_recv(req, issue_flags);
+- break;
+- case IORING_OP_TIMEOUT:
+- ret = io_timeout(req, issue_flags);
+- break;
+- case IORING_OP_TIMEOUT_REMOVE:
+- ret = io_timeout_remove(req, issue_flags);
+- break;
+- case IORING_OP_ACCEPT:
+- ret = io_accept(req, issue_flags);
+- break;
+- case IORING_OP_CONNECT:
+- ret = io_connect(req, issue_flags);
+- break;
+- case IORING_OP_ASYNC_CANCEL:
+- ret = io_async_cancel(req, issue_flags);
+- break;
+- case IORING_OP_FALLOCATE:
+- ret = io_fallocate(req, issue_flags);
+- break;
+- case IORING_OP_OPENAT:
+- ret = io_openat(req, issue_flags);
+- break;
+- case IORING_OP_CLOSE:
+- ret = io_close(req, issue_flags);
+- break;
+- case IORING_OP_FILES_UPDATE:
+- ret = io_files_update(req, issue_flags);
+- break;
+- case IORING_OP_STATX:
+- ret = io_statx(req, issue_flags);
+- break;
+- case IORING_OP_FADVISE:
+- ret = io_fadvise(req, issue_flags);
+- break;
+- case IORING_OP_MADVISE:
+- ret = io_madvise(req, issue_flags);
+- break;
+- case IORING_OP_OPENAT2:
+- ret = io_openat2(req, issue_flags);
+- break;
+- case IORING_OP_EPOLL_CTL:
+- ret = io_epoll_ctl(req, issue_flags);
+- break;
+- case IORING_OP_SPLICE:
+- ret = io_splice(req, issue_flags);
+- break;
+- case IORING_OP_PROVIDE_BUFFERS:
+- ret = io_provide_buffers(req, issue_flags);
+- break;
+- case IORING_OP_REMOVE_BUFFERS:
+- ret = io_remove_buffers(req, issue_flags);
+- break;
+- case IORING_OP_TEE:
+- ret = io_tee(req, issue_flags);
+- break;
+- case IORING_OP_SHUTDOWN:
+- ret = io_shutdown(req, issue_flags);
+- break;
+- case IORING_OP_RENAMEAT:
+- ret = io_renameat(req, issue_flags);
+- break;
+- case IORING_OP_UNLINKAT:
+- ret = io_unlinkat(req, issue_flags);
+- break;
+- case IORING_OP_MKDIRAT:
+- ret = io_mkdirat(req, issue_flags);
+- break;
+- case IORING_OP_SYMLINKAT:
+- ret = io_symlinkat(req, issue_flags);
+- break;
+- case IORING_OP_LINKAT:
+- ret = io_linkat(req, issue_flags);
+- break;
+- case IORING_OP_MSG_RING:
+- ret = io_msg_ring(req, issue_flags);
+- break;
+- default:
+- ret = -EINVAL;
+- break;
+- }
+-
+- if (!io_op_defs[req->opcode].audit_skip)
+- audit_uring_exit(!ret, ret);
+-
+- if (creds)
+- revert_creds(creds);
+- if (ret)
+- return ret;
+- /* If the op doesn't have a file, we're not polling for it */
+- if ((req->ctx->flags & IORING_SETUP_IOPOLL) && req->file)
+- io_iopoll_req_issued(req, issue_flags);
+-
+- return 0;
+-}
+-
+-static struct io_wq_work *io_wq_free_work(struct io_wq_work *work)
+-{
+- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+-
+- req = io_put_req_find_next(req);
+- return req ? &req->work : NULL;
+-}
+-
+-static void io_wq_submit_work(struct io_wq_work *work)
+-{
+- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+- const struct io_op_def *def = &io_op_defs[req->opcode];
+- unsigned int issue_flags = IO_URING_F_UNLOCKED;
+- bool needs_poll = false;
+- struct io_kiocb *timeout;
+- int ret = 0, err = -ECANCELED;
+-
+- /* one will be dropped by ->io_free_work() after returning to io-wq */
+- if (!(req->flags & REQ_F_REFCOUNT))
+- __io_req_set_refcount(req, 2);
+- else
+- req_ref_get(req);
+-
+- timeout = io_prep_linked_timeout(req);
+- if (timeout)
+- io_queue_linked_timeout(timeout);
+-
+-
+- /* either cancelled or io-wq is dying, so don't touch tctx->iowq */
+- if (work->flags & IO_WQ_WORK_CANCEL) {
+-fail:
+- io_req_task_queue_fail(req, err);
+- return;
+- }
+- if (!io_assign_file(req, issue_flags)) {
+- err = -EBADF;
+- work->flags |= IO_WQ_WORK_CANCEL;
+- goto fail;
+- }
+-
+- if (req->flags & REQ_F_FORCE_ASYNC) {
+- bool opcode_poll = def->pollin || def->pollout;
+-
+- if (opcode_poll && file_can_poll(req->file)) {
+- needs_poll = true;
+- issue_flags |= IO_URING_F_NONBLOCK;
+- }
+- }
+-
+- do {
+- ret = io_issue_sqe(req, issue_flags);
+- if (ret != -EAGAIN)
+- break;
+- /*
+- * We can get EAGAIN for iopolled IO even though we're
+- * forcing a sync submission from here, since we can't
+- * wait for request slots on the block side.
+- */
+- if (!needs_poll) {
+- if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
+- break;
+- cond_resched();
+- continue;
+- }
+-
+- if (io_arm_poll_handler(req, issue_flags) == IO_APOLL_OK)
+- return;
+- /* aborted or ready, in either case retry blocking */
+- needs_poll = false;
+- issue_flags &= ~IO_URING_F_NONBLOCK;
+- } while (1);
+-
+- /* avoid locking problems by failing it from a clean context */
+- if (ret)
+- io_req_task_queue_fail(req, ret);
+-}
+-
+-static inline struct io_fixed_file *io_fixed_file_slot(struct io_file_table *table,
+- unsigned i)
+-{
+- return &table->files[i];
+-}
+-
+-static inline struct file *io_file_from_index(struct io_ring_ctx *ctx,
+- int index)
+-{
+- struct io_fixed_file *slot = io_fixed_file_slot(&ctx->file_table, index);
+-
+- return (struct file *) (slot->file_ptr & FFS_MASK);
+-}
+-
+-static void io_fixed_file_set(struct io_fixed_file *file_slot, struct file *file)
+-{
+- unsigned long file_ptr = (unsigned long) file;
+-
+- file_ptr |= io_file_get_flags(file);
+- file_slot->file_ptr = file_ptr;
+-}
+-
+-static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
+- unsigned int issue_flags)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct file *file = NULL;
+- unsigned long file_ptr;
+-
+- if (issue_flags & IO_URING_F_UNLOCKED)
+- mutex_lock(&ctx->uring_lock);
+-
+- if (unlikely((unsigned int)fd >= ctx->nr_user_files))
+- goto out;
+- fd = array_index_nospec(fd, ctx->nr_user_files);
+- file_ptr = io_fixed_file_slot(&ctx->file_table, fd)->file_ptr;
+- file = (struct file *) (file_ptr & FFS_MASK);
+- file_ptr &= ~FFS_MASK;
+- /* mask in overlapping REQ_F and FFS bits */
+- req->flags |= (file_ptr << REQ_F_SUPPORT_NOWAIT_BIT);
+- io_req_set_rsrc_node(req, ctx, 0);
+-out:
+- if (issue_flags & IO_URING_F_UNLOCKED)
+- mutex_unlock(&ctx->uring_lock);
+- return file;
+-}
+-
+-static struct file *io_file_get_normal(struct io_kiocb *req, int fd)
+-{
+- struct file *file = fget(fd);
+-
+- trace_io_uring_file_get(req->ctx, req, req->user_data, fd);
+-
+- /* we don't allow fixed io_uring files */
+- if (file && file->f_op == &io_uring_fops)
+- io_req_track_inflight(req);
+- return file;
+-}
+-
+-static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
+-{
+- struct io_kiocb *prev = req->timeout.prev;
+- int ret = -ENOENT;
+-
+- if (prev) {
+- if (!(req->task->flags & PF_EXITING))
+- ret = io_try_cancel_userdata(req, prev->user_data);
+- io_req_complete_post(req, ret ?: -ETIME, 0);
+- io_put_req(prev);
+- } else {
+- io_req_complete_post(req, -ETIME, 0);
+- }
+-}
+-
+-static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
+-{
+- struct io_timeout_data *data = container_of(timer,
+- struct io_timeout_data, timer);
+- struct io_kiocb *prev, *req = data->req;
+- struct io_ring_ctx *ctx = req->ctx;
+- unsigned long flags;
+-
+- spin_lock_irqsave(&ctx->timeout_lock, flags);
+- prev = req->timeout.head;
+- req->timeout.head = NULL;
+-
+- /*
+- * We don't expect the list to be empty, that will only happen if we
+- * race with the completion of the linked work.
+- */
+- if (prev) {
+- io_remove_next_linked(prev);
+- if (!req_ref_inc_not_zero(prev))
+- prev = NULL;
+- }
+- list_del(&req->timeout.list);
+- req->timeout.prev = prev;
+- spin_unlock_irqrestore(&ctx->timeout_lock, flags);
+-
+- req->io_task_work.func = io_req_task_link_timeout;
+- io_req_task_work_add(req, false);
+- return HRTIMER_NORESTART;
+-}
+-
+-static void io_queue_linked_timeout(struct io_kiocb *req)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+-
+- spin_lock_irq(&ctx->timeout_lock);
+- /*
+- * If the back reference is NULL, then our linked request finished
+- * before we got a chance to setup the timer
+- */
+- if (req->timeout.head) {
+- struct io_timeout_data *data = req->async_data;
+-
+- data->timer.function = io_link_timeout_fn;
+- hrtimer_start(&data->timer, timespec64_to_ktime(data->ts),
+- data->mode);
+- list_add_tail(&req->timeout.list, &ctx->ltimeout_list);
+- }
+- spin_unlock_irq(&ctx->timeout_lock);
+- /* drop submission reference */
+- io_put_req(req);
+-}
+-
+-static void io_queue_sqe_arm_apoll(struct io_kiocb *req)
+- __must_hold(&req->ctx->uring_lock)
+-{
+- struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
+-
+- switch (io_arm_poll_handler(req, 0)) {
+- case IO_APOLL_READY:
+- io_req_task_queue(req);
+- break;
+- case IO_APOLL_ABORTED:
+- /*
+- * Queued up for async execution, worker will release
+- * submit reference when the iocb is actually submitted.
+- */
+- io_queue_async_work(req, NULL);
+- break;
+- case IO_APOLL_OK:
+- break;
+- }
+-
+- if (linked_timeout)
+- io_queue_linked_timeout(linked_timeout);
+-}
+-
+-static inline void __io_queue_sqe(struct io_kiocb *req)
+- __must_hold(&req->ctx->uring_lock)
+-{
+- struct io_kiocb *linked_timeout;
+- int ret;
+-
+- ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
+-
+- if (req->flags & REQ_F_COMPLETE_INLINE) {
+- io_req_add_compl_list(req);
+- return;
+- }
+- /*
+- * We async punt it if the file wasn't marked NOWAIT, or if the file
+- * doesn't support non-blocking read/write attempts
+- */
+- if (likely(!ret)) {
+- linked_timeout = io_prep_linked_timeout(req);
+- if (linked_timeout)
+- io_queue_linked_timeout(linked_timeout);
+- } else if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
+- io_queue_sqe_arm_apoll(req);
+- } else {
+- io_req_complete_failed(req, ret);
+- }
+-}
+-
+-static void io_queue_sqe_fallback(struct io_kiocb *req)
+- __must_hold(&req->ctx->uring_lock)
+-{
+- if (req->flags & REQ_F_FAIL) {
+- io_req_complete_fail_submit(req);
+- } else if (unlikely(req->ctx->drain_active)) {
+- io_drain_req(req);
+- } else {
+- int ret = io_req_prep_async(req);
+-
+- if (unlikely(ret))
+- io_req_complete_failed(req, ret);
+- else
+- io_queue_async_work(req, NULL);
+- }
+-}
+-
+-static inline void io_queue_sqe(struct io_kiocb *req)
+- __must_hold(&req->ctx->uring_lock)
+-{
+- if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
+- __io_queue_sqe(req);
+- else
+- io_queue_sqe_fallback(req);
+-}
+-
+-/*
+- * Check SQE restrictions (opcode and flags).
+- *
+- * Returns 'true' if SQE is allowed, 'false' otherwise.
+- */
+-static inline bool io_check_restriction(struct io_ring_ctx *ctx,
+- struct io_kiocb *req,
+- unsigned int sqe_flags)
+-{
+- if (!test_bit(req->opcode, ctx->restrictions.sqe_op))
+- return false;
+-
+- if ((sqe_flags & ctx->restrictions.sqe_flags_required) !=
+- ctx->restrictions.sqe_flags_required)
+- return false;
+-
+- if (sqe_flags & ~(ctx->restrictions.sqe_flags_allowed |
+- ctx->restrictions.sqe_flags_required))
+- return false;
+-
+- return true;
+-}
+-
+-static void io_init_req_drain(struct io_kiocb *req)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- struct io_kiocb *head = ctx->submit_state.link.head;
+-
+- ctx->drain_active = true;
+- if (head) {
+- /*
+- * If we need to drain a request in the middle of a link, drain
+- * the head request and the next request/link after the current
+- * link. Considering sequential execution of links,
+- * REQ_F_IO_DRAIN will be maintained for every request of our
+- * link.
+- */
+- head->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
+- ctx->drain_next = true;
+- }
+-}
+-
+-static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+- __must_hold(&ctx->uring_lock)
+-{
+- unsigned int sqe_flags;
+- int personality;
+- u8 opcode;
+-
+- /* req is partially pre-initialised, see io_preinit_req() */
+- req->opcode = opcode = READ_ONCE(sqe->opcode);
+- /* same numerical values with corresponding REQ_F_*, safe to copy */
+- req->flags = sqe_flags = READ_ONCE(sqe->flags);
+- req->user_data = READ_ONCE(sqe->user_data);
+- req->file = NULL;
+- req->fixed_rsrc_refs = NULL;
+- req->task = current;
+-
+- if (unlikely(opcode >= IORING_OP_LAST)) {
+- req->opcode = 0;
+- return -EINVAL;
+- }
+- if (unlikely(sqe_flags & ~SQE_COMMON_FLAGS)) {
+- /* enforce forwards compatibility on users */
+- if (sqe_flags & ~SQE_VALID_FLAGS)
+- return -EINVAL;
+- if ((sqe_flags & IOSQE_BUFFER_SELECT) &&
+- !io_op_defs[opcode].buffer_select)
+- return -EOPNOTSUPP;
+- if (sqe_flags & IOSQE_CQE_SKIP_SUCCESS)
+- ctx->drain_disabled = true;
+- if (sqe_flags & IOSQE_IO_DRAIN) {
+- if (ctx->drain_disabled)
+- return -EOPNOTSUPP;
+- io_init_req_drain(req);
+- }
+- }
+- if (unlikely(ctx->restricted || ctx->drain_active || ctx->drain_next)) {
+- if (ctx->restricted && !io_check_restriction(ctx, req, sqe_flags))
+- return -EACCES;
+- /* knock it to the slow queue path, will be drained there */
+- if (ctx->drain_active)
+- req->flags |= REQ_F_FORCE_ASYNC;
+- /* if there is no link, we're at "next" request and need to drain */
+- if (unlikely(ctx->drain_next) && !ctx->submit_state.link.head) {
+- ctx->drain_next = false;
+- ctx->drain_active = true;
+- req->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
+- }
+- }
+-
+- if (io_op_defs[opcode].needs_file) {
+- struct io_submit_state *state = &ctx->submit_state;
+-
+- req->fd = READ_ONCE(sqe->fd);
+-
+- /*
+- * Plug now if we have more than 2 IO left after this, and the
+- * target is potentially a read/write to block based storage.
+- */
+- if (state->need_plug && io_op_defs[opcode].plug) {
+- state->plug_started = true;
+- state->need_plug = false;
+- blk_start_plug_nr_ios(&state->plug, state->submit_nr);
+- }
+- }
+-
+- personality = READ_ONCE(sqe->personality);
+- if (personality) {
+- int ret;
+-
+- req->creds = xa_load(&ctx->personalities, personality);
+- if (!req->creds)
+- return -EINVAL;
+- get_cred(req->creds);
+- ret = security_uring_override_creds(req->creds);
+- if (ret) {
+- put_cred(req->creds);
+- return ret;
+- }
+- req->flags |= REQ_F_CREDS;
+- }
+-
+- return io_req_prep(req, sqe);
+-}
+-
+-static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
+- const struct io_uring_sqe *sqe)
+- __must_hold(&ctx->uring_lock)
+-{
+- struct io_submit_link *link = &ctx->submit_state.link;
+- int ret;
+-
+- ret = io_init_req(ctx, req, sqe);
+- if (unlikely(ret)) {
+- trace_io_uring_req_failed(sqe, ctx, req, ret);
+-
+- /* fail even hard links since we don't submit */
+- if (link->head) {
+- /*
+- * we can judge a link req is failed or cancelled by if
+- * REQ_F_FAIL is set, but the head is an exception since
+- * it may be set REQ_F_FAIL because of other req's failure
+- * so let's leverage req->result to distinguish if a head
+- * is set REQ_F_FAIL because of its failure or other req's
+- * failure so that we can set the correct ret code for it.
+- * init result here to avoid affecting the normal path.
+- */
+- if (!(link->head->flags & REQ_F_FAIL))
+- req_fail_link_node(link->head, -ECANCELED);
+- } else if (!(req->flags & (REQ_F_LINK | REQ_F_HARDLINK))) {
+- /*
+- * the current req is a normal req, we should return
+- * error and thus break the submittion loop.
+- */
+- io_req_complete_failed(req, ret);
+- return ret;
+- }
+- req_fail_link_node(req, ret);
+- }
+-
+- /* don't need @sqe from now on */
+- trace_io_uring_submit_sqe(ctx, req, req->user_data, req->opcode,
+- req->flags, true,
+- ctx->flags & IORING_SETUP_SQPOLL);
+-
+- /*
+- * If we already have a head request, queue this one for async
+- * submittal once the head completes. If we don't have a head but
+- * IOSQE_IO_LINK is set in the sqe, start a new head. This one will be
+- * submitted sync once the chain is complete. If none of those
+- * conditions are true (normal request), then just queue it.
+- */
+- if (link->head) {
+- struct io_kiocb *head = link->head;
+-
+- if (!(req->flags & REQ_F_FAIL)) {
+- ret = io_req_prep_async(req);
+- if (unlikely(ret)) {
+- req_fail_link_node(req, ret);
+- if (!(head->flags & REQ_F_FAIL))
+- req_fail_link_node(head, -ECANCELED);
+- }
+- }
+- trace_io_uring_link(ctx, req, head);
+- link->last->link = req;
+- link->last = req;
+-
+- if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK))
+- return 0;
+- /* last request of a link, enqueue the link */
+- link->head = NULL;
+- req = head;
+- } else if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
+- link->head = req;
+- link->last = req;
+- return 0;
+- }
+-
+- io_queue_sqe(req);
+- return 0;
+-}
+-
+-/*
+- * Batched submission is done, ensure local IO is flushed out.
+- */
+-static void io_submit_state_end(struct io_ring_ctx *ctx)
+-{
+- struct io_submit_state *state = &ctx->submit_state;
+-
+- if (state->link.head)
+- io_queue_sqe(state->link.head);
+- /* flush only after queuing links as they can generate completions */
+- io_submit_flush_completions(ctx);
+- if (state->plug_started)
+- blk_finish_plug(&state->plug);
+-}
+-
+-/*
+- * Start submission side cache.
+- */
+-static void io_submit_state_start(struct io_submit_state *state,
+- unsigned int max_ios)
+-{
+- state->plug_started = false;
+- state->need_plug = max_ios > 2;
+- state->submit_nr = max_ios;
+- /* set only head, no need to init link_last in advance */
+- state->link.head = NULL;
+-}
+-
+-static void io_commit_sqring(struct io_ring_ctx *ctx)
+-{
+- struct io_rings *rings = ctx->rings;
+-
+- /*
+- * Ensure any loads from the SQEs are done at this point,
+- * since once we write the new head, the application could
+- * write new data to them.
+- */
+- smp_store_release(&rings->sq.head, ctx->cached_sq_head);
+-}
+-
+-/*
+- * Fetch an sqe, if one is available. Note this returns a pointer to memory
+- * that is mapped by userspace. This means that care needs to be taken to
+- * ensure that reads are stable, as we cannot rely on userspace always
+- * being a good citizen. If members of the sqe are validated and then later
+- * used, it's important that those reads are done through READ_ONCE() to
+- * prevent a re-load down the line.
+- */
+-static const struct io_uring_sqe *io_get_sqe(struct io_ring_ctx *ctx)
+-{
+- unsigned head, mask = ctx->sq_entries - 1;
+- unsigned sq_idx = ctx->cached_sq_head++ & mask;
+-
+- /*
+- * The cached sq head (or cq tail) serves two purposes:
+- *
+- * 1) allows us to batch the cost of updating the user visible
+- * head updates.
+- * 2) allows the kernel side to track the head on its own, even
+- * though the application is the one updating it.
+- */
+- head = READ_ONCE(ctx->sq_array[sq_idx]);
+- if (likely(head < ctx->sq_entries))
+- return &ctx->sq_sqes[head];
+-
+- /* drop invalid entries */
+- ctx->cq_extra--;
+- WRITE_ONCE(ctx->rings->sq_dropped,
+- READ_ONCE(ctx->rings->sq_dropped) + 1);
+- return NULL;
+-}
+-
+-static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
+- __must_hold(&ctx->uring_lock)
+-{
+- unsigned int entries = io_sqring_entries(ctx);
+- int submitted = 0;
+-
+- if (unlikely(!entries))
+- return 0;
+- /* make sure SQ entry isn't read before tail */
+- nr = min3(nr, ctx->sq_entries, entries);
+- io_get_task_refs(nr);
+-
+- io_submit_state_start(&ctx->submit_state, nr);
+- do {
+- const struct io_uring_sqe *sqe;
+- struct io_kiocb *req;
+-
+- if (unlikely(!io_alloc_req_refill(ctx))) {
+- if (!submitted)
+- submitted = -EAGAIN;
+- break;
+- }
+- req = io_alloc_req(ctx);
+- sqe = io_get_sqe(ctx);
+- if (unlikely(!sqe)) {
+- wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
+- break;
+- }
+- /* will complete beyond this point, count as submitted */
+- submitted++;
+- if (io_submit_sqe(ctx, req, sqe)) {
+- /*
+- * Continue submitting even for sqe failure if the
+- * ring was setup with IORING_SETUP_SUBMIT_ALL
+- */
+- if (!(ctx->flags & IORING_SETUP_SUBMIT_ALL))
+- break;
+- }
+- } while (submitted < nr);
+-
+- if (unlikely(submitted != nr)) {
+- int ref_used = (submitted == -EAGAIN) ? 0 : submitted;
+- int unused = nr - ref_used;
+-
+- current->io_uring->cached_refs += unused;
+- }
+-
+- io_submit_state_end(ctx);
+- /* Commit SQ ring head once we've consumed and submitted all SQEs */
+- io_commit_sqring(ctx);
+-
+- return submitted;
+-}
+-
+-static inline bool io_sqd_events_pending(struct io_sq_data *sqd)
+-{
+- return READ_ONCE(sqd->state);
+-}
+-
+-static inline void io_ring_set_wakeup_flag(struct io_ring_ctx *ctx)
+-{
+- /* Tell userspace we may need a wakeup call */
+- spin_lock(&ctx->completion_lock);
+- WRITE_ONCE(ctx->rings->sq_flags,
+- ctx->rings->sq_flags | IORING_SQ_NEED_WAKEUP);
+- spin_unlock(&ctx->completion_lock);
+-}
+-
+-static inline void io_ring_clear_wakeup_flag(struct io_ring_ctx *ctx)
+-{
+- spin_lock(&ctx->completion_lock);
+- WRITE_ONCE(ctx->rings->sq_flags,
+- ctx->rings->sq_flags & ~IORING_SQ_NEED_WAKEUP);
+- spin_unlock(&ctx->completion_lock);
+-}
+-
+-static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
+-{
+- unsigned int to_submit;
+- int ret = 0;
+-
+- to_submit = io_sqring_entries(ctx);
+- /* if we're handling multiple rings, cap submit size for fairness */
+- if (cap_entries && to_submit > IORING_SQPOLL_CAP_ENTRIES_VALUE)
+- to_submit = IORING_SQPOLL_CAP_ENTRIES_VALUE;
+-
+- if (!wq_list_empty(&ctx->iopoll_list) || to_submit) {
+- const struct cred *creds = NULL;
+-
+- if (ctx->sq_creds != current_cred())
+- creds = override_creds(ctx->sq_creds);
+-
+- mutex_lock(&ctx->uring_lock);
+- if (!wq_list_empty(&ctx->iopoll_list))
+- io_do_iopoll(ctx, true);
+-
+- /*
+- * Don't submit if refs are dying, good for io_uring_register(),
+- * but also it is relied upon by io_ring_exit_work()
+- */
+- if (to_submit && likely(!percpu_ref_is_dying(&ctx->refs)) &&
+- !(ctx->flags & IORING_SETUP_R_DISABLED))
+- ret = io_submit_sqes(ctx, to_submit);
+- mutex_unlock(&ctx->uring_lock);
+-
+- if (to_submit && wq_has_sleeper(&ctx->sqo_sq_wait))
+- wake_up(&ctx->sqo_sq_wait);
+- if (creds)
+- revert_creds(creds);
+- }
+-
+- return ret;
+-}
+-
+-static __cold void io_sqd_update_thread_idle(struct io_sq_data *sqd)
+-{
+- struct io_ring_ctx *ctx;
+- unsigned sq_thread_idle = 0;
+-
+- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+- sq_thread_idle = max(sq_thread_idle, ctx->sq_thread_idle);
+- sqd->sq_thread_idle = sq_thread_idle;
+-}
+-
+-static bool io_sqd_handle_event(struct io_sq_data *sqd)
+-{
+- bool did_sig = false;
+- struct ksignal ksig;
+-
+- if (test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) ||
+- signal_pending(current)) {
+- mutex_unlock(&sqd->lock);
+- if (signal_pending(current))
+- did_sig = get_signal(&ksig);
+- cond_resched();
+- mutex_lock(&sqd->lock);
+- }
+- return did_sig || test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
+-}
+-
+-static int io_sq_thread(void *data)
+-{
+- struct io_sq_data *sqd = data;
+- struct io_ring_ctx *ctx;
+- unsigned long timeout = 0;
+- char buf[TASK_COMM_LEN];
+- DEFINE_WAIT(wait);
+-
+- snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid);
+- set_task_comm(current, buf);
+-
+- if (sqd->sq_cpu != -1)
+- set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
+- else
+- set_cpus_allowed_ptr(current, cpu_online_mask);
+- current->flags |= PF_NO_SETAFFINITY;
+-
+- audit_alloc_kernel(current);
+-
+- mutex_lock(&sqd->lock);
+- while (1) {
+- bool cap_entries, sqt_spin = false;
+-
+- if (io_sqd_events_pending(sqd) || signal_pending(current)) {
+- if (io_sqd_handle_event(sqd))
+- break;
+- timeout = jiffies + sqd->sq_thread_idle;
+- }
+-
+- cap_entries = !list_is_singular(&sqd->ctx_list);
+- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+- int ret = __io_sq_thread(ctx, cap_entries);
+-
+- if (!sqt_spin && (ret > 0 || !wq_list_empty(&ctx->iopoll_list)))
+- sqt_spin = true;
+- }
+- if (io_run_task_work())
+- sqt_spin = true;
+-
+- if (sqt_spin || !time_after(jiffies, timeout)) {
+- cond_resched();
+- if (sqt_spin)
+- timeout = jiffies + sqd->sq_thread_idle;
+- continue;
+- }
+-
+- prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
+- if (!io_sqd_events_pending(sqd) && !task_work_pending(current)) {
+- bool needs_sched = true;
+-
+- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+- io_ring_set_wakeup_flag(ctx);
+-
+- if ((ctx->flags & IORING_SETUP_IOPOLL) &&
+- !wq_list_empty(&ctx->iopoll_list)) {
+- needs_sched = false;
+- break;
+- }
+-
+- /*
+- * Ensure the store of the wakeup flag is not
+- * reordered with the load of the SQ tail
+- */
+- smp_mb();
+-
+- if (io_sqring_entries(ctx)) {
+- needs_sched = false;
+- break;
+- }
+- }
+-
+- if (needs_sched) {
+- mutex_unlock(&sqd->lock);
+- schedule();
+- mutex_lock(&sqd->lock);
+- }
+- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+- io_ring_clear_wakeup_flag(ctx);
+- }
+-
+- finish_wait(&sqd->wait, &wait);
+- timeout = jiffies + sqd->sq_thread_idle;
+- }
+-
+- io_uring_cancel_generic(true, sqd);
+- sqd->thread = NULL;
+- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+- io_ring_set_wakeup_flag(ctx);
+- io_run_task_work();
+- mutex_unlock(&sqd->lock);
+-
+- audit_free(current);
+-
+- complete(&sqd->exited);
+- do_exit(0);
+-}
+-
+-struct io_wait_queue {
+- struct wait_queue_entry wq;
+- struct io_ring_ctx *ctx;
+- unsigned cq_tail;
+- unsigned nr_timeouts;
+-};
+-
+-static inline bool io_should_wake(struct io_wait_queue *iowq)
+-{
+- struct io_ring_ctx *ctx = iowq->ctx;
+- int dist = ctx->cached_cq_tail - (int) iowq->cq_tail;
+-
+- /*
+- * Wake up if we have enough events, or if a timeout occurred since we
+- * started waiting. For timeouts, we always want to return to userspace,
+- * regardless of event count.
+- */
+- return dist >= 0 || atomic_read(&ctx->cq_timeouts) != iowq->nr_timeouts;
+-}
+-
+-static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
+- int wake_flags, void *key)
+-{
+- struct io_wait_queue *iowq = container_of(curr, struct io_wait_queue,
+- wq);
+-
+- /*
+- * Cannot safely flush overflowed CQEs from here, ensure we wake up
+- * the task, and the next invocation will do it.
+- */
+- if (io_should_wake(iowq) || test_bit(0, &iowq->ctx->check_cq_overflow))
+- return autoremove_wake_function(curr, mode, wake_flags, key);
+- return -1;
+-}
+-
+-static int io_run_task_work_sig(void)
+-{
+- if (io_run_task_work())
+- return 1;
+- if (test_thread_flag(TIF_NOTIFY_SIGNAL))
+- return -ERESTARTSYS;
+- if (task_sigpending(current))
+- return -EINTR;
+- return 0;
+-}
+-
+-/* when returns >0, the caller should retry */
+-static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+- struct io_wait_queue *iowq,
+- ktime_t timeout)
+-{
+- int ret;
+-
+- /* make sure we run task_work before checking for signals */
+- ret = io_run_task_work_sig();
+- if (ret || io_should_wake(iowq))
+- return ret;
+- /* let the caller flush overflows, retry */
+- if (test_bit(0, &ctx->check_cq_overflow))
+- return 1;
+-
+- if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS))
+- return -ETIME;
+- return 1;
+-}
+-
+-/*
+- * Wait until events become available, if we don't already have some. The
+- * application must reap them itself, as they reside on the shared cq ring.
+- */
+-static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
+- const sigset_t __user *sig, size_t sigsz,
+- struct __kernel_timespec __user *uts)
+-{
+- struct io_wait_queue iowq;
+- struct io_rings *rings = ctx->rings;
+- ktime_t timeout = KTIME_MAX;
+- int ret;
+-
+- do {
+- io_cqring_overflow_flush(ctx);
+- if (io_cqring_events(ctx) >= min_events)
+- return 0;
+- if (!io_run_task_work())
+- break;
+- } while (1);
+-
+- if (sig) {
+-#ifdef CONFIG_COMPAT
+- if (in_compat_syscall())
+- ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
+- sigsz);
+- else
+-#endif
+- ret = set_user_sigmask(sig, sigsz);
+-
+- if (ret)
+- return ret;
+- }
+-
+- if (uts) {
+- struct timespec64 ts;
+-
+- if (get_timespec64(&ts, uts))
+- return -EFAULT;
+- timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
+- }
+-
+- init_waitqueue_func_entry(&iowq.wq, io_wake_function);
+- iowq.wq.private = current;
+- INIT_LIST_HEAD(&iowq.wq.entry);
+- iowq.ctx = ctx;
+- iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
+- iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
+-
+- trace_io_uring_cqring_wait(ctx, min_events);
+- do {
+- /* if we can't even flush overflow, don't wait for more */
+- if (!io_cqring_overflow_flush(ctx)) {
+- ret = -EBUSY;
+- break;
+- }
+- prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
+- TASK_INTERRUPTIBLE);
+- ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
+- finish_wait(&ctx->cq_wait, &iowq.wq);
+- cond_resched();
+- } while (ret > 0);
+-
+- restore_saved_sigmask_unless(ret == -EINTR);
+-
+- return READ_ONCE(rings->cq.head) == READ_ONCE(rings->cq.tail) ? ret : 0;
+-}
+-
+-static void io_free_page_table(void **table, size_t size)
+-{
+- unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
+-
+- for (i = 0; i < nr_tables; i++)
+- kfree(table[i]);
+- kfree(table);
+-}
+-
+-static __cold void **io_alloc_page_table(size_t size)
+-{
+- unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
+- size_t init_size = size;
+- void **table;
+-
+- table = kcalloc(nr_tables, sizeof(*table), GFP_KERNEL_ACCOUNT);
+- if (!table)
+- return NULL;
+-
+- for (i = 0; i < nr_tables; i++) {
+- unsigned int this_size = min_t(size_t, size, PAGE_SIZE);
+-
+- table[i] = kzalloc(this_size, GFP_KERNEL_ACCOUNT);
+- if (!table[i]) {
+- io_free_page_table(table, init_size);
+- return NULL;
+- }
+- size -= this_size;
+- }
+- return table;
+-}
+-
+-static void io_rsrc_node_destroy(struct io_rsrc_node *ref_node)
+-{
+- percpu_ref_exit(&ref_node->refs);
+- kfree(ref_node);
+-}
+-
+-static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
+-{
+- struct io_rsrc_node *node = container_of(ref, struct io_rsrc_node, refs);
+- struct io_ring_ctx *ctx = node->rsrc_data->ctx;
+- unsigned long flags;
+- bool first_add = false;
+- unsigned long delay = HZ;
+-
+- spin_lock_irqsave(&ctx->rsrc_ref_lock, flags);
+- node->done = true;
+-
+- /* if we are mid-quiesce then do not delay */
+- if (node->rsrc_data->quiesce)
+- delay = 0;
+-
+- while (!list_empty(&ctx->rsrc_ref_list)) {
+- node = list_first_entry(&ctx->rsrc_ref_list,
+- struct io_rsrc_node, node);
+- /* recycle ref nodes in order */
+- if (!node->done)
+- break;
+- list_del(&node->node);
+- first_add |= llist_add(&node->llist, &ctx->rsrc_put_llist);
+- }
+- spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
+-
+- if (first_add)
+- mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
+-}
+-
+-static struct io_rsrc_node *io_rsrc_node_alloc(void)
+-{
+- struct io_rsrc_node *ref_node;
+-
+- ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL);
+- if (!ref_node)
+- return NULL;
+-
+- if (percpu_ref_init(&ref_node->refs, io_rsrc_node_ref_zero,
+- 0, GFP_KERNEL)) {
+- kfree(ref_node);
+- return NULL;
+- }
+- INIT_LIST_HEAD(&ref_node->node);
+- INIT_LIST_HEAD(&ref_node->rsrc_list);
+- ref_node->done = false;
+- return ref_node;
+-}
+-
+-static void io_rsrc_node_switch(struct io_ring_ctx *ctx,
+- struct io_rsrc_data *data_to_kill)
+- __must_hold(&ctx->uring_lock)
+-{
+- WARN_ON_ONCE(!ctx->rsrc_backup_node);
+- WARN_ON_ONCE(data_to_kill && !ctx->rsrc_node);
+-
+- io_rsrc_refs_drop(ctx);
+-
+- if (data_to_kill) {
+- struct io_rsrc_node *rsrc_node = ctx->rsrc_node;
+-
+- rsrc_node->rsrc_data = data_to_kill;
+- spin_lock_irq(&ctx->rsrc_ref_lock);
+- list_add_tail(&rsrc_node->node, &ctx->rsrc_ref_list);
+- spin_unlock_irq(&ctx->rsrc_ref_lock);
+-
+- atomic_inc(&data_to_kill->refs);
+- percpu_ref_kill(&rsrc_node->refs);
+- ctx->rsrc_node = NULL;
+- }
+-
+- if (!ctx->rsrc_node) {
+- ctx->rsrc_node = ctx->rsrc_backup_node;
+- ctx->rsrc_backup_node = NULL;
+- }
+-}
+-
+-static int io_rsrc_node_switch_start(struct io_ring_ctx *ctx)
+-{
+- if (ctx->rsrc_backup_node)
+- return 0;
+- ctx->rsrc_backup_node = io_rsrc_node_alloc();
+- return ctx->rsrc_backup_node ? 0 : -ENOMEM;
+-}
+-
+-static __cold int io_rsrc_ref_quiesce(struct io_rsrc_data *data,
+- struct io_ring_ctx *ctx)
+-{
+- int ret;
+-
+- /* As we may drop ->uring_lock, other task may have started quiesce */
+- if (data->quiesce)
+- return -ENXIO;
+-
+- data->quiesce = true;
+- do {
+- ret = io_rsrc_node_switch_start(ctx);
+- if (ret)
+- break;
+- io_rsrc_node_switch(ctx, data);
+-
+- /* kill initial ref, already quiesced if zero */
+- if (atomic_dec_and_test(&data->refs))
+- break;
+- mutex_unlock(&ctx->uring_lock);
+- flush_delayed_work(&ctx->rsrc_put_work);
+- ret = wait_for_completion_interruptible(&data->done);
+- if (!ret) {
+- mutex_lock(&ctx->uring_lock);
+- if (atomic_read(&data->refs) > 0) {
+- /*
+- * it has been revived by another thread while
+- * we were unlocked
+- */
+- mutex_unlock(&ctx->uring_lock);
+- } else {
+- break;
+- }
+- }
+-
+- atomic_inc(&data->refs);
+- /* wait for all works potentially completing data->done */
+- flush_delayed_work(&ctx->rsrc_put_work);
+- reinit_completion(&data->done);
+-
+- ret = io_run_task_work_sig();
+- mutex_lock(&ctx->uring_lock);
+- } while (ret >= 0);
+- data->quiesce = false;
+-
+- return ret;
+-}
+-
+-static u64 *io_get_tag_slot(struct io_rsrc_data *data, unsigned int idx)
+-{
+- unsigned int off = idx & IO_RSRC_TAG_TABLE_MASK;
+- unsigned int table_idx = idx >> IO_RSRC_TAG_TABLE_SHIFT;
+-
+- return &data->tags[table_idx][off];
+-}
+-
+-static void io_rsrc_data_free(struct io_rsrc_data *data)
+-{
+- size_t size = data->nr * sizeof(data->tags[0][0]);
+-
+- if (data->tags)
+- io_free_page_table((void **)data->tags, size);
+- kfree(data);
+-}
+-
+-static __cold int io_rsrc_data_alloc(struct io_ring_ctx *ctx, rsrc_put_fn *do_put,
+- u64 __user *utags, unsigned nr,
+- struct io_rsrc_data **pdata)
+-{
+- struct io_rsrc_data *data;
+- int ret = -ENOMEM;
+- unsigned i;
+-
+- data = kzalloc(sizeof(*data), GFP_KERNEL);
+- if (!data)
+- return -ENOMEM;
+- data->tags = (u64 **)io_alloc_page_table(nr * sizeof(data->tags[0][0]));
+- if (!data->tags) {
+- kfree(data);
+- return -ENOMEM;
+- }
+-
+- data->nr = nr;
+- data->ctx = ctx;
+- data->do_put = do_put;
+- if (utags) {
+- ret = -EFAULT;
+- for (i = 0; i < nr; i++) {
+- u64 *tag_slot = io_get_tag_slot(data, i);
+-
+- if (copy_from_user(tag_slot, &utags[i],
+- sizeof(*tag_slot)))
+- goto fail;
+- }
+- }
+-
+- atomic_set(&data->refs, 1);
+- init_completion(&data->done);
+- *pdata = data;
+- return 0;
+-fail:
+- io_rsrc_data_free(data);
+- return ret;
+-}
+-
+-static bool io_alloc_file_tables(struct io_file_table *table, unsigned nr_files)
+-{
+- table->files = kvcalloc(nr_files, sizeof(table->files[0]),
+- GFP_KERNEL_ACCOUNT);
+- return !!table->files;
+-}
+-
+-static void io_free_file_tables(struct io_file_table *table)
+-{
+- kvfree(table->files);
+- table->files = NULL;
+-}
+-
+-static void __io_sqe_files_unregister(struct io_ring_ctx *ctx)
+-{
+-#if defined(CONFIG_UNIX)
+- if (ctx->ring_sock) {
+- struct sock *sock = ctx->ring_sock->sk;
+- struct sk_buff *skb;
+-
+- while ((skb = skb_dequeue(&sock->sk_receive_queue)) != NULL)
+- kfree_skb(skb);
+- }
+-#else
+- int i;
+-
+- for (i = 0; i < ctx->nr_user_files; i++) {
+- struct file *file;
+-
+- file = io_file_from_index(ctx, i);
+- if (file)
+- fput(file);
+- }
+-#endif
+- io_free_file_tables(&ctx->file_table);
+- io_rsrc_data_free(ctx->file_data);
+- ctx->file_data = NULL;
+- ctx->nr_user_files = 0;
+-}
+-
+-static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
+-{
+- unsigned nr = ctx->nr_user_files;
+- int ret;
+-
+- if (!ctx->file_data)
+- return -ENXIO;
+-
+- /*
+- * Quiesce may unlock ->uring_lock, and while it's not held
+- * prevent new requests using the table.
+- */
+- ctx->nr_user_files = 0;
+- ret = io_rsrc_ref_quiesce(ctx->file_data, ctx);
+- ctx->nr_user_files = nr;
+- if (!ret)
+- __io_sqe_files_unregister(ctx);
+- return ret;
+-}
+-
+-static void io_sq_thread_unpark(struct io_sq_data *sqd)
+- __releases(&sqd->lock)
+-{
+- WARN_ON_ONCE(sqd->thread == current);
+-
+- /*
+- * Do the dance but not conditional clear_bit() because it'd race with
+- * other threads incrementing park_pending and setting the bit.
+- */
+- clear_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
+- if (atomic_dec_return(&sqd->park_pending))
+- set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
+- mutex_unlock(&sqd->lock);
+-}
+-
+-static void io_sq_thread_park(struct io_sq_data *sqd)
+- __acquires(&sqd->lock)
+-{
+- WARN_ON_ONCE(sqd->thread == current);
+-
+- atomic_inc(&sqd->park_pending);
+- set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
+- mutex_lock(&sqd->lock);
+- if (sqd->thread)
+- wake_up_process(sqd->thread);
+-}
+-
+-static void io_sq_thread_stop(struct io_sq_data *sqd)
+-{
+- WARN_ON_ONCE(sqd->thread == current);
+- WARN_ON_ONCE(test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state));
+-
+- set_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
+- mutex_lock(&sqd->lock);
+- if (sqd->thread)
+- wake_up_process(sqd->thread);
+- mutex_unlock(&sqd->lock);
+- wait_for_completion(&sqd->exited);
+-}
+-
+-static void io_put_sq_data(struct io_sq_data *sqd)
+-{
+- if (refcount_dec_and_test(&sqd->refs)) {
+- WARN_ON_ONCE(atomic_read(&sqd->park_pending));
+-
+- io_sq_thread_stop(sqd);
+- kfree(sqd);
+- }
+-}
+-
+-static void io_sq_thread_finish(struct io_ring_ctx *ctx)
+-{
+- struct io_sq_data *sqd = ctx->sq_data;
+-
+- if (sqd) {
+- io_sq_thread_park(sqd);
+- list_del_init(&ctx->sqd_list);
+- io_sqd_update_thread_idle(sqd);
+- io_sq_thread_unpark(sqd);
+-
+- io_put_sq_data(sqd);
+- ctx->sq_data = NULL;
+- }
+-}
+-
+-static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
+-{
+- struct io_ring_ctx *ctx_attach;
+- struct io_sq_data *sqd;
+- struct fd f;
+-
+- f = fdget(p->wq_fd);
+- if (!f.file)
+- return ERR_PTR(-ENXIO);
+- if (f.file->f_op != &io_uring_fops) {
+- fdput(f);
+- return ERR_PTR(-EINVAL);
+- }
+-
+- ctx_attach = f.file->private_data;
+- sqd = ctx_attach->sq_data;
+- if (!sqd) {
+- fdput(f);
+- return ERR_PTR(-EINVAL);
+- }
+- if (sqd->task_tgid != current->tgid) {
+- fdput(f);
+- return ERR_PTR(-EPERM);
+- }
+-
+- refcount_inc(&sqd->refs);
+- fdput(f);
+- return sqd;
+-}
+-
+-static struct io_sq_data *io_get_sq_data(struct io_uring_params *p,
+- bool *attached)
+-{
+- struct io_sq_data *sqd;
+-
+- *attached = false;
+- if (p->flags & IORING_SETUP_ATTACH_WQ) {
+- sqd = io_attach_sq_data(p);
+- if (!IS_ERR(sqd)) {
+- *attached = true;
+- return sqd;
+- }
+- /* fall through for EPERM case, setup new sqd/task */
+- if (PTR_ERR(sqd) != -EPERM)
+- return sqd;
+- }
+-
+- sqd = kzalloc(sizeof(*sqd), GFP_KERNEL);
+- if (!sqd)
+- return ERR_PTR(-ENOMEM);
+-
+- atomic_set(&sqd->park_pending, 0);
+- refcount_set(&sqd->refs, 1);
+- INIT_LIST_HEAD(&sqd->ctx_list);
+- mutex_init(&sqd->lock);
+- init_waitqueue_head(&sqd->wait);
+- init_completion(&sqd->exited);
+- return sqd;
+-}
+-
+-#if defined(CONFIG_UNIX)
+-/*
+- * Ensure the UNIX gc is aware of our file set, so we are certain that
+- * the io_uring can be safely unregistered on process exit, even if we have
+- * loops in the file referencing.
+- */
+-static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset)
+-{
+- struct sock *sk = ctx->ring_sock->sk;
+- struct scm_fp_list *fpl;
+- struct sk_buff *skb;
+- int i, nr_files;
+-
+- fpl = kzalloc(sizeof(*fpl), GFP_KERNEL);
+- if (!fpl)
+- return -ENOMEM;
+-
+- skb = alloc_skb(0, GFP_KERNEL);
+- if (!skb) {
+- kfree(fpl);
+- return -ENOMEM;
+- }
+-
+- skb->sk = sk;
+-
+- nr_files = 0;
+- fpl->user = get_uid(current_user());
+- for (i = 0; i < nr; i++) {
+- struct file *file = io_file_from_index(ctx, i + offset);
+-
+- if (!file)
+- continue;
+- fpl->fp[nr_files] = get_file(file);
+- unix_inflight(fpl->user, fpl->fp[nr_files]);
+- nr_files++;
+- }
+-
+- if (nr_files) {
+- fpl->max = SCM_MAX_FD;
+- fpl->count = nr_files;
+- UNIXCB(skb).fp = fpl;
+- skb->destructor = unix_destruct_scm;
+- refcount_add(skb->truesize, &sk->sk_wmem_alloc);
+- skb_queue_head(&sk->sk_receive_queue, skb);
+-
+- for (i = 0; i < nr; i++) {
+- struct file *file = io_file_from_index(ctx, i + offset);
+-
+- if (file)
+- fput(file);
+- }
+- } else {
+- kfree_skb(skb);
+- free_uid(fpl->user);
+- kfree(fpl);
+- }
+-
+- return 0;
+-}
+-
+-/*
+- * If UNIX sockets are enabled, fd passing can cause a reference cycle which
+- * causes regular reference counting to break down. We rely on the UNIX
+- * garbage collection to take care of this problem for us.
+- */
+-static int io_sqe_files_scm(struct io_ring_ctx *ctx)
+-{
+- unsigned left, total;
+- int ret = 0;
+-
+- total = 0;
+- left = ctx->nr_user_files;
+- while (left) {
+- unsigned this_files = min_t(unsigned, left, SCM_MAX_FD);
+-
+- ret = __io_sqe_files_scm(ctx, this_files, total);
+- if (ret)
+- break;
+- left -= this_files;
+- total += this_files;
+- }
+-
+- if (!ret)
+- return 0;
+-
+- while (total < ctx->nr_user_files) {
+- struct file *file = io_file_from_index(ctx, total);
+-
+- if (file)
+- fput(file);
+- total++;
+- }
+-
+- return ret;
+-}
+-#else
+-static int io_sqe_files_scm(struct io_ring_ctx *ctx)
+-{
+- return 0;
+-}
+-#endif
+-
+-static void io_rsrc_file_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
+-{
+- struct file *file = prsrc->file;
+-#if defined(CONFIG_UNIX)
+- struct sock *sock = ctx->ring_sock->sk;
+- struct sk_buff_head list, *head = &sock->sk_receive_queue;
+- struct sk_buff *skb;
+- int i;
+-
+- __skb_queue_head_init(&list);
+-
+- /*
+- * Find the skb that holds this file in its SCM_RIGHTS. When found,
+- * remove this entry and rearrange the file array.
+- */
+- skb = skb_dequeue(head);
+- while (skb) {
+- struct scm_fp_list *fp;
+-
+- fp = UNIXCB(skb).fp;
+- for (i = 0; i < fp->count; i++) {
+- int left;
+-
+- if (fp->fp[i] != file)
+- continue;
+-
+- unix_notinflight(fp->user, fp->fp[i]);
+- left = fp->count - 1 - i;
+- if (left) {
+- memmove(&fp->fp[i], &fp->fp[i + 1],
+- left * sizeof(struct file *));
+- }
+- fp->count--;
+- if (!fp->count) {
+- kfree_skb(skb);
+- skb = NULL;
+- } else {
+- __skb_queue_tail(&list, skb);
+- }
+- fput(file);
+- file = NULL;
+- break;
+- }
+-
+- if (!file)
+- break;
+-
+- __skb_queue_tail(&list, skb);
+-
+- skb = skb_dequeue(head);
+- }
+-
+- if (skb_peek(&list)) {
+- spin_lock_irq(&head->lock);
+- while ((skb = __skb_dequeue(&list)) != NULL)
+- __skb_queue_tail(head, skb);
+- spin_unlock_irq(&head->lock);
+- }
+-#else
+- fput(file);
+-#endif
+-}
+-
+-static void __io_rsrc_put_work(struct io_rsrc_node *ref_node)
+-{
+- struct io_rsrc_data *rsrc_data = ref_node->rsrc_data;
+- struct io_ring_ctx *ctx = rsrc_data->ctx;
+- struct io_rsrc_put *prsrc, *tmp;
+-
+- list_for_each_entry_safe(prsrc, tmp, &ref_node->rsrc_list, list) {
+- list_del(&prsrc->list);
+-
+- if (prsrc->tag) {
+- bool lock_ring = ctx->flags & IORING_SETUP_IOPOLL;
+-
+- io_ring_submit_lock(ctx, lock_ring);
+- spin_lock(&ctx->completion_lock);
+- io_fill_cqe_aux(ctx, prsrc->tag, 0, 0);
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- io_cqring_ev_posted(ctx);
+- io_ring_submit_unlock(ctx, lock_ring);
+- }
+-
+- rsrc_data->do_put(ctx, prsrc);
+- kfree(prsrc);
+- }
+-
+- io_rsrc_node_destroy(ref_node);
+- if (atomic_dec_and_test(&rsrc_data->refs))
+- complete(&rsrc_data->done);
+-}
+-
+-static void io_rsrc_put_work(struct work_struct *work)
+-{
+- struct io_ring_ctx *ctx;
+- struct llist_node *node;
+-
+- ctx = container_of(work, struct io_ring_ctx, rsrc_put_work.work);
+- node = llist_del_all(&ctx->rsrc_put_llist);
+-
+- while (node) {
+- struct io_rsrc_node *ref_node;
+- struct llist_node *next = node->next;
+-
+- ref_node = llist_entry(node, struct io_rsrc_node, llist);
+- __io_rsrc_put_work(ref_node);
+- node = next;
+- }
+-}
+-
+-static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
+- unsigned nr_args, u64 __user *tags)
+-{
+- __s32 __user *fds = (__s32 __user *) arg;
+- struct file *file;
+- int fd, ret;
+- unsigned i;
+-
+- if (ctx->file_data)
+- return -EBUSY;
+- if (!nr_args)
+- return -EINVAL;
+- if (nr_args > IORING_MAX_FIXED_FILES)
+- return -EMFILE;
+- if (nr_args > rlimit(RLIMIT_NOFILE))
+- return -EMFILE;
+- ret = io_rsrc_node_switch_start(ctx);
+- if (ret)
+- return ret;
+- ret = io_rsrc_data_alloc(ctx, io_rsrc_file_put, tags, nr_args,
+- &ctx->file_data);
+- if (ret)
+- return ret;
+-
+- ret = -ENOMEM;
+- if (!io_alloc_file_tables(&ctx->file_table, nr_args))
+- goto out_free;
+-
+- for (i = 0; i < nr_args; i++, ctx->nr_user_files++) {
+- if (copy_from_user(&fd, &fds[i], sizeof(fd))) {
+- ret = -EFAULT;
+- goto out_fput;
+- }
+- /* allow sparse sets */
+- if (fd == -1) {
+- ret = -EINVAL;
+- if (unlikely(*io_get_tag_slot(ctx->file_data, i)))
+- goto out_fput;
+- continue;
+- }
+-
+- file = fget(fd);
+- ret = -EBADF;
+- if (unlikely(!file))
+- goto out_fput;
+-
+- /*
+- * Don't allow io_uring instances to be registered. If UNIX
+- * isn't enabled, then this causes a reference cycle and this
+- * instance can never get freed. If UNIX is enabled we'll
+- * handle it just fine, but there's still no point in allowing
+- * a ring fd as it doesn't support regular read/write anyway.
+- */
+- if (file->f_op == &io_uring_fops) {
+- fput(file);
+- goto out_fput;
+- }
+- io_fixed_file_set(io_fixed_file_slot(&ctx->file_table, i), file);
+- }
+-
+- ret = io_sqe_files_scm(ctx);
+- if (ret) {
+- __io_sqe_files_unregister(ctx);
+- return ret;
+- }
+-
+- io_rsrc_node_switch(ctx, NULL);
+- return ret;
+-out_fput:
+- for (i = 0; i < ctx->nr_user_files; i++) {
+- file = io_file_from_index(ctx, i);
+- if (file)
+- fput(file);
+- }
+- io_free_file_tables(&ctx->file_table);
+- ctx->nr_user_files = 0;
+-out_free:
+- io_rsrc_data_free(ctx->file_data);
+- ctx->file_data = NULL;
+- return ret;
+-}
+-
+-static int io_sqe_file_register(struct io_ring_ctx *ctx, struct file *file,
+- int index)
+-{
+-#if defined(CONFIG_UNIX)
+- struct sock *sock = ctx->ring_sock->sk;
+- struct sk_buff_head *head = &sock->sk_receive_queue;
+- struct sk_buff *skb;
+-
+- /*
+- * See if we can merge this file into an existing skb SCM_RIGHTS
+- * file set. If there's no room, fall back to allocating a new skb
+- * and filling it in.
+- */
+- spin_lock_irq(&head->lock);
+- skb = skb_peek(head);
+- if (skb) {
+- struct scm_fp_list *fpl = UNIXCB(skb).fp;
+-
+- if (fpl->count < SCM_MAX_FD) {
+- __skb_unlink(skb, head);
+- spin_unlock_irq(&head->lock);
+- fpl->fp[fpl->count] = get_file(file);
+- unix_inflight(fpl->user, fpl->fp[fpl->count]);
+- fpl->count++;
+- spin_lock_irq(&head->lock);
+- __skb_queue_head(head, skb);
+- } else {
+- skb = NULL;
+- }
+- }
+- spin_unlock_irq(&head->lock);
+-
+- if (skb) {
+- fput(file);
+- return 0;
+- }
+-
+- return __io_sqe_files_scm(ctx, 1, index);
+-#else
+- return 0;
+-#endif
+-}
+-
+-static int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx,
+- struct io_rsrc_node *node, void *rsrc)
+-{
+- u64 *tag_slot = io_get_tag_slot(data, idx);
+- struct io_rsrc_put *prsrc;
+-
+- prsrc = kzalloc(sizeof(*prsrc), GFP_KERNEL);
+- if (!prsrc)
+- return -ENOMEM;
+-
+- prsrc->tag = *tag_slot;
+- *tag_slot = 0;
+- prsrc->rsrc = rsrc;
+- list_add(&prsrc->list, &node->rsrc_list);
+- return 0;
+-}
+-
+-static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
+- unsigned int issue_flags, u32 slot_index)
+-{
+- struct io_ring_ctx *ctx = req->ctx;
+- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+- bool needs_switch = false;
+- struct io_fixed_file *file_slot;
+- int ret = -EBADF;
+-
+- io_ring_submit_lock(ctx, needs_lock);
+- if (file->f_op == &io_uring_fops)
+- goto err;
+- ret = -ENXIO;
+- if (!ctx->file_data)
+- goto err;
+- ret = -EINVAL;
+- if (slot_index >= ctx->nr_user_files)
+- goto err;
+-
+- slot_index = array_index_nospec(slot_index, ctx->nr_user_files);
+- file_slot = io_fixed_file_slot(&ctx->file_table, slot_index);
+-
+- if (file_slot->file_ptr) {
+- struct file *old_file;
+-
+- ret = io_rsrc_node_switch_start(ctx);
+- if (ret)
+- goto err;
+-
+- old_file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+- ret = io_queue_rsrc_removal(ctx->file_data, slot_index,
+- ctx->rsrc_node, old_file);
+- if (ret)
+- goto err;
+- file_slot->file_ptr = 0;
+- needs_switch = true;
+- }
+-
+- *io_get_tag_slot(ctx->file_data, slot_index) = 0;
+- io_fixed_file_set(file_slot, file);
+- ret = io_sqe_file_register(ctx, file, slot_index);
+- if (ret) {
+- file_slot->file_ptr = 0;
+- goto err;
+- }
+-
+- ret = 0;
+-err:
+- if (needs_switch)
+- io_rsrc_node_switch(ctx, ctx->file_data);
+- io_ring_submit_unlock(ctx, needs_lock);
+- if (ret)
+- fput(file);
+- return ret;
+-}
+-
+-static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
+-{
+- unsigned int offset = req->close.file_slot - 1;
+- struct io_ring_ctx *ctx = req->ctx;
+- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+- struct io_fixed_file *file_slot;
+- struct file *file;
+- int ret;
+-
+- io_ring_submit_lock(ctx, needs_lock);
+- ret = -ENXIO;
+- if (unlikely(!ctx->file_data))
+- goto out;
+- ret = -EINVAL;
+- if (offset >= ctx->nr_user_files)
+- goto out;
+- ret = io_rsrc_node_switch_start(ctx);
+- if (ret)
+- goto out;
+-
+- offset = array_index_nospec(offset, ctx->nr_user_files);
+- file_slot = io_fixed_file_slot(&ctx->file_table, offset);
+- ret = -EBADF;
+- if (!file_slot->file_ptr)
+- goto out;
+-
+- file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+- ret = io_queue_rsrc_removal(ctx->file_data, offset, ctx->rsrc_node, file);
+- if (ret)
+- goto out;
+-
+- file_slot->file_ptr = 0;
+- io_rsrc_node_switch(ctx, ctx->file_data);
+- ret = 0;
+-out:
+- io_ring_submit_unlock(ctx, needs_lock);
+- return ret;
+-}
+-
+-static int __io_sqe_files_update(struct io_ring_ctx *ctx,
+- struct io_uring_rsrc_update2 *up,
+- unsigned nr_args)
+-{
+- u64 __user *tags = u64_to_user_ptr(up->tags);
+- __s32 __user *fds = u64_to_user_ptr(up->data);
+- struct io_rsrc_data *data = ctx->file_data;
+- struct io_fixed_file *file_slot;
+- struct file *file;
+- int fd, i, err = 0;
+- unsigned int done;
+- bool needs_switch = false;
+-
+- if (!ctx->file_data)
+- return -ENXIO;
+- if (up->offset + nr_args > ctx->nr_user_files)
+- return -EINVAL;
+-
+- for (done = 0; done < nr_args; done++) {
+- u64 tag = 0;
+-
+- if ((tags && copy_from_user(&tag, &tags[done], sizeof(tag))) ||
+- copy_from_user(&fd, &fds[done], sizeof(fd))) {
+- err = -EFAULT;
+- break;
+- }
+- if ((fd == IORING_REGISTER_FILES_SKIP || fd == -1) && tag) {
+- err = -EINVAL;
+- break;
+- }
+- if (fd == IORING_REGISTER_FILES_SKIP)
+- continue;
+-
+- i = array_index_nospec(up->offset + done, ctx->nr_user_files);
+- file_slot = io_fixed_file_slot(&ctx->file_table, i);
+-
+- if (file_slot->file_ptr) {
+- file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+- err = io_queue_rsrc_removal(data, i, ctx->rsrc_node, file);
+- if (err)
+- break;
+- file_slot->file_ptr = 0;
+- needs_switch = true;
+- }
+- if (fd != -1) {
+- file = fget(fd);
+- if (!file) {
+- err = -EBADF;
+- break;
+- }
+- /*
+- * Don't allow io_uring instances to be registered. If
+- * UNIX isn't enabled, then this causes a reference
+- * cycle and this instance can never get freed. If UNIX
+- * is enabled we'll handle it just fine, but there's
+- * still no point in allowing a ring fd as it doesn't
+- * support regular read/write anyway.
+- */
+- if (file->f_op == &io_uring_fops) {
+- fput(file);
+- err = -EBADF;
+- break;
+- }
+- *io_get_tag_slot(data, i) = tag;
+- io_fixed_file_set(file_slot, file);
+- err = io_sqe_file_register(ctx, file, i);
+- if (err) {
+- file_slot->file_ptr = 0;
+- fput(file);
+- break;
+- }
+- }
+- }
+-
+- if (needs_switch)
+- io_rsrc_node_switch(ctx, data);
+- return done ? done : err;
+-}
+-
+-static struct io_wq *io_init_wq_offload(struct io_ring_ctx *ctx,
+- struct task_struct *task)
+-{
+- struct io_wq_hash *hash;
+- struct io_wq_data data;
+- unsigned int concurrency;
+-
+- mutex_lock(&ctx->uring_lock);
+- hash = ctx->hash_map;
+- if (!hash) {
+- hash = kzalloc(sizeof(*hash), GFP_KERNEL);
+- if (!hash) {
+- mutex_unlock(&ctx->uring_lock);
+- return ERR_PTR(-ENOMEM);
+- }
+- refcount_set(&hash->refs, 1);
+- init_waitqueue_head(&hash->wait);
+- ctx->hash_map = hash;
+- }
+- mutex_unlock(&ctx->uring_lock);
+-
+- data.hash = hash;
+- data.task = task;
+- data.free_work = io_wq_free_work;
+- data.do_work = io_wq_submit_work;
+-
+- /* Do QD, or 4 * CPUS, whatever is smallest */
+- concurrency = min(ctx->sq_entries, 4 * num_online_cpus());
+-
+- return io_wq_create(concurrency, &data);
+-}
+-
+-static __cold int io_uring_alloc_task_context(struct task_struct *task,
+- struct io_ring_ctx *ctx)
+-{
+- struct io_uring_task *tctx;
+- int ret;
+-
+- tctx = kzalloc(sizeof(*tctx), GFP_KERNEL);
+- if (unlikely(!tctx))
+- return -ENOMEM;
+-
+- tctx->registered_rings = kcalloc(IO_RINGFD_REG_MAX,
+- sizeof(struct file *), GFP_KERNEL);
+- if (unlikely(!tctx->registered_rings)) {
+- kfree(tctx);
+- return -ENOMEM;
+- }
+-
+- ret = percpu_counter_init(&tctx->inflight, 0, GFP_KERNEL);
+- if (unlikely(ret)) {
+- kfree(tctx->registered_rings);
+- kfree(tctx);
+- return ret;
+- }
+-
+- tctx->io_wq = io_init_wq_offload(ctx, task);
+- if (IS_ERR(tctx->io_wq)) {
+- ret = PTR_ERR(tctx->io_wq);
+- percpu_counter_destroy(&tctx->inflight);
+- kfree(tctx->registered_rings);
+- kfree(tctx);
+- return ret;
+- }
+-
+- xa_init(&tctx->xa);
+- init_waitqueue_head(&tctx->wait);
+- atomic_set(&tctx->in_idle, 0);
+- atomic_set(&tctx->inflight_tracked, 0);
+- task->io_uring = tctx;
+- spin_lock_init(&tctx->task_lock);
+- INIT_WQ_LIST(&tctx->task_list);
+- INIT_WQ_LIST(&tctx->prior_task_list);
+- init_task_work(&tctx->task_work, tctx_task_work);
+- return 0;
+-}
+-
+-void __io_uring_free(struct task_struct *tsk)
+-{
+- struct io_uring_task *tctx = tsk->io_uring;
+-
+- WARN_ON_ONCE(!xa_empty(&tctx->xa));
+- WARN_ON_ONCE(tctx->io_wq);
+- WARN_ON_ONCE(tctx->cached_refs);
+-
+- kfree(tctx->registered_rings);
+- percpu_counter_destroy(&tctx->inflight);
+- kfree(tctx);
+- tsk->io_uring = NULL;
+-}
+-
+-static __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
+- struct io_uring_params *p)
+-{
+- int ret;
+-
+- /* Retain compatibility with failing for an invalid attach attempt */
+- if ((ctx->flags & (IORING_SETUP_ATTACH_WQ | IORING_SETUP_SQPOLL)) ==
+- IORING_SETUP_ATTACH_WQ) {
+- struct fd f;
+-
+- f = fdget(p->wq_fd);
+- if (!f.file)
+- return -ENXIO;
+- if (f.file->f_op != &io_uring_fops) {
+- fdput(f);
+- return -EINVAL;
+- }
+- fdput(f);
+- }
+- if (ctx->flags & IORING_SETUP_SQPOLL) {
+- struct task_struct *tsk;
+- struct io_sq_data *sqd;
+- bool attached;
+-
+- ret = security_uring_sqpoll();
+- if (ret)
+- return ret;
+-
+- sqd = io_get_sq_data(p, &attached);
+- if (IS_ERR(sqd)) {
+- ret = PTR_ERR(sqd);
+- goto err;
+- }
+-
+- ctx->sq_creds = get_current_cred();
+- ctx->sq_data = sqd;
+- ctx->sq_thread_idle = msecs_to_jiffies(p->sq_thread_idle);
+- if (!ctx->sq_thread_idle)
+- ctx->sq_thread_idle = HZ;
+-
+- io_sq_thread_park(sqd);
+- list_add(&ctx->sqd_list, &sqd->ctx_list);
+- io_sqd_update_thread_idle(sqd);
+- /* don't attach to a dying SQPOLL thread, would be racy */
+- ret = (attached && !sqd->thread) ? -ENXIO : 0;
+- io_sq_thread_unpark(sqd);
+-
+- if (ret < 0)
+- goto err;
+- if (attached)
+- return 0;
+-
+- if (p->flags & IORING_SETUP_SQ_AFF) {
+- int cpu = p->sq_thread_cpu;
+-
+- ret = -EINVAL;
+- if (cpu >= nr_cpu_ids || !cpu_online(cpu))
+- goto err_sqpoll;
+- sqd->sq_cpu = cpu;
+- } else {
+- sqd->sq_cpu = -1;
+- }
+-
+- sqd->task_pid = current->pid;
+- sqd->task_tgid = current->tgid;
+- tsk = create_io_thread(io_sq_thread, sqd, NUMA_NO_NODE);
+- if (IS_ERR(tsk)) {
+- ret = PTR_ERR(tsk);
+- goto err_sqpoll;
+- }
+-
+- sqd->thread = tsk;
+- ret = io_uring_alloc_task_context(tsk, ctx);
+- wake_up_new_task(tsk);
+- if (ret)
+- goto err;
+- } else if (p->flags & IORING_SETUP_SQ_AFF) {
+- /* Can't have SQ_AFF without SQPOLL */
+- ret = -EINVAL;
+- goto err;
+- }
+-
+- return 0;
+-err_sqpoll:
+- complete(&ctx->sq_data->exited);
+-err:
+- io_sq_thread_finish(ctx);
+- return ret;
+-}
+-
+-static inline void __io_unaccount_mem(struct user_struct *user,
+- unsigned long nr_pages)
+-{
+- atomic_long_sub(nr_pages, &user->locked_vm);
+-}
+-
+-static inline int __io_account_mem(struct user_struct *user,
+- unsigned long nr_pages)
+-{
+- unsigned long page_limit, cur_pages, new_pages;
+-
+- /* Don't allow more pages than we can safely lock */
+- page_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+-
+- do {
+- cur_pages = atomic_long_read(&user->locked_vm);
+- new_pages = cur_pages + nr_pages;
+- if (new_pages > page_limit)
+- return -ENOMEM;
+- } while (atomic_long_cmpxchg(&user->locked_vm, cur_pages,
+- new_pages) != cur_pages);
+-
+- return 0;
+-}
+-
+-static void io_unaccount_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
+-{
+- if (ctx->user)
+- __io_unaccount_mem(ctx->user, nr_pages);
+-
+- if (ctx->mm_account)
+- atomic64_sub(nr_pages, &ctx->mm_account->pinned_vm);
+-}
+-
+-static int io_account_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
+-{
+- int ret;
+-
+- if (ctx->user) {
+- ret = __io_account_mem(ctx->user, nr_pages);
+- if (ret)
+- return ret;
+- }
+-
+- if (ctx->mm_account)
+- atomic64_add(nr_pages, &ctx->mm_account->pinned_vm);
+-
+- return 0;
+-}
+-
+-static void io_mem_free(void *ptr)
+-{
+- struct page *page;
+-
+- if (!ptr)
+- return;
+-
+- page = virt_to_head_page(ptr);
+- if (put_page_testzero(page))
+- free_compound_page(page);
+-}
+-
+-static void *io_mem_alloc(size_t size)
+-{
+- gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
+-
+- return (void *) __get_free_pages(gfp, get_order(size));
+-}
+-
+-static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries,
+- size_t *sq_offset)
+-{
+- struct io_rings *rings;
+- size_t off, sq_array_size;
+-
+- off = struct_size(rings, cqes, cq_entries);
+- if (off == SIZE_MAX)
+- return SIZE_MAX;
+-
+-#ifdef CONFIG_SMP
+- off = ALIGN(off, SMP_CACHE_BYTES);
+- if (off == 0)
+- return SIZE_MAX;
+-#endif
+-
+- if (sq_offset)
+- *sq_offset = off;
+-
+- sq_array_size = array_size(sizeof(u32), sq_entries);
+- if (sq_array_size == SIZE_MAX)
+- return SIZE_MAX;
+-
+- if (check_add_overflow(off, sq_array_size, &off))
+- return SIZE_MAX;
+-
+- return off;
+-}
+-
+-static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf **slot)
+-{
+- struct io_mapped_ubuf *imu = *slot;
+- unsigned int i;
+-
+- if (imu != ctx->dummy_ubuf) {
+- for (i = 0; i < imu->nr_bvecs; i++)
+- unpin_user_page(imu->bvec[i].bv_page);
+- if (imu->acct_pages)
+- io_unaccount_mem(ctx, imu->acct_pages);
+- kvfree(imu);
+- }
+- *slot = NULL;
+-}
+-
+-static void io_rsrc_buf_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
+-{
+- io_buffer_unmap(ctx, &prsrc->buf);
+- prsrc->buf = NULL;
+-}
+-
+-static void __io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
+-{
+- unsigned int i;
+-
+- for (i = 0; i < ctx->nr_user_bufs; i++)
+- io_buffer_unmap(ctx, &ctx->user_bufs[i]);
+- kfree(ctx->user_bufs);
+- io_rsrc_data_free(ctx->buf_data);
+- ctx->user_bufs = NULL;
+- ctx->buf_data = NULL;
+- ctx->nr_user_bufs = 0;
+-}
+-
+-static int io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
+-{
+- unsigned nr = ctx->nr_user_bufs;
+- int ret;
+-
+- if (!ctx->buf_data)
+- return -ENXIO;
+-
+- /*
+- * Quiesce may unlock ->uring_lock, and while it's not held
+- * prevent new requests using the table.
+- */
+- ctx->nr_user_bufs = 0;
+- ret = io_rsrc_ref_quiesce(ctx->buf_data, ctx);
+- ctx->nr_user_bufs = nr;
+- if (!ret)
+- __io_sqe_buffers_unregister(ctx);
+- return ret;
+-}
+-
+-static int io_copy_iov(struct io_ring_ctx *ctx, struct iovec *dst,
+- void __user *arg, unsigned index)
+-{
+- struct iovec __user *src;
+-
+-#ifdef CONFIG_COMPAT
+- if (ctx->compat) {
+- struct compat_iovec __user *ciovs;
+- struct compat_iovec ciov;
+-
+- ciovs = (struct compat_iovec __user *) arg;
+- if (copy_from_user(&ciov, &ciovs[index], sizeof(ciov)))
+- return -EFAULT;
+-
+- dst->iov_base = u64_to_user_ptr((u64)ciov.iov_base);
+- dst->iov_len = ciov.iov_len;
+- return 0;
+- }
+-#endif
+- src = (struct iovec __user *) arg;
+- if (copy_from_user(dst, &src[index], sizeof(*dst)))
+- return -EFAULT;
+- return 0;
+-}
+-
+-/*
+- * Not super efficient, but this is just a registration time. And we do cache
+- * the last compound head, so generally we'll only do a full search if we don't
+- * match that one.
+- *
+- * We check if the given compound head page has already been accounted, to
+- * avoid double accounting it. This allows us to account the full size of the
+- * page, not just the constituent pages of a huge page.
+- */
+-static bool headpage_already_acct(struct io_ring_ctx *ctx, struct page **pages,
+- int nr_pages, struct page *hpage)
+-{
+- int i, j;
+-
+- /* check current page array */
+- for (i = 0; i < nr_pages; i++) {
+- if (!PageCompound(pages[i]))
+- continue;
+- if (compound_head(pages[i]) == hpage)
+- return true;
+- }
+-
+- /* check previously registered pages */
+- for (i = 0; i < ctx->nr_user_bufs; i++) {
+- struct io_mapped_ubuf *imu = ctx->user_bufs[i];
+-
+- for (j = 0; j < imu->nr_bvecs; j++) {
+- if (!PageCompound(imu->bvec[j].bv_page))
+- continue;
+- if (compound_head(imu->bvec[j].bv_page) == hpage)
+- return true;
+- }
+- }
+-
+- return false;
+-}
+-
+-static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
+- int nr_pages, struct io_mapped_ubuf *imu,
+- struct page **last_hpage)
+-{
+- int i, ret;
+-
+- imu->acct_pages = 0;
+- for (i = 0; i < nr_pages; i++) {
+- if (!PageCompound(pages[i])) {
+- imu->acct_pages++;
+- } else {
+- struct page *hpage;
+-
+- hpage = compound_head(pages[i]);
+- if (hpage == *last_hpage)
+- continue;
+- *last_hpage = hpage;
+- if (headpage_already_acct(ctx, pages, i, hpage))
+- continue;
+- imu->acct_pages += page_size(hpage) >> PAGE_SHIFT;
+- }
+- }
+-
+- if (!imu->acct_pages)
+- return 0;
+-
+- ret = io_account_mem(ctx, imu->acct_pages);
+- if (ret)
+- imu->acct_pages = 0;
+- return ret;
+-}
+-
+-static int io_sqe_buffer_register(struct io_ring_ctx *ctx, struct iovec *iov,
+- struct io_mapped_ubuf **pimu,
+- struct page **last_hpage)
+-{
+- struct io_mapped_ubuf *imu = NULL;
+- struct vm_area_struct **vmas = NULL;
+- struct page **pages = NULL;
+- unsigned long off, start, end, ubuf;
+- size_t size;
+- int ret, pret, nr_pages, i;
+-
+- if (!iov->iov_base) {
+- *pimu = ctx->dummy_ubuf;
+- return 0;
+- }
+-
+- ubuf = (unsigned long) iov->iov_base;
+- end = (ubuf + iov->iov_len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+- start = ubuf >> PAGE_SHIFT;
+- nr_pages = end - start;
+-
+- *pimu = NULL;
+- ret = -ENOMEM;
+-
+- pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL);
+- if (!pages)
+- goto done;
+-
+- vmas = kvmalloc_array(nr_pages, sizeof(struct vm_area_struct *),
+- GFP_KERNEL);
+- if (!vmas)
+- goto done;
+-
+- imu = kvmalloc(struct_size(imu, bvec, nr_pages), GFP_KERNEL);
+- if (!imu)
+- goto done;
+-
+- ret = 0;
+- mmap_read_lock(current->mm);
+- pret = pin_user_pages(ubuf, nr_pages, FOLL_WRITE | FOLL_LONGTERM,
+- pages, vmas);
+- if (pret == nr_pages) {
+- /* don't support file backed memory */
+- for (i = 0; i < nr_pages; i++) {
+- struct vm_area_struct *vma = vmas[i];
+-
+- if (vma_is_shmem(vma))
+- continue;
+- if (vma->vm_file &&
+- !is_file_hugepages(vma->vm_file)) {
+- ret = -EOPNOTSUPP;
+- break;
+- }
+- }
+- } else {
+- ret = pret < 0 ? pret : -EFAULT;
+- }
+- mmap_read_unlock(current->mm);
+- if (ret) {
+- /*
+- * if we did partial map, or found file backed vmas,
+- * release any pages we did get
+- */
+- if (pret > 0)
+- unpin_user_pages(pages, pret);
+- goto done;
+- }
+-
+- ret = io_buffer_account_pin(ctx, pages, pret, imu, last_hpage);
+- if (ret) {
+- unpin_user_pages(pages, pret);
+- goto done;
+- }
+-
+- off = ubuf & ~PAGE_MASK;
+- size = iov->iov_len;
+- for (i = 0; i < nr_pages; i++) {
+- size_t vec_len;
+-
+- vec_len = min_t(size_t, size, PAGE_SIZE - off);
+- imu->bvec[i].bv_page = pages[i];
+- imu->bvec[i].bv_len = vec_len;
+- imu->bvec[i].bv_offset = off;
+- off = 0;
+- size -= vec_len;
+- }
+- /* store original address for later verification */
+- imu->ubuf = ubuf;
+- imu->ubuf_end = ubuf + iov->iov_len;
+- imu->nr_bvecs = nr_pages;
+- *pimu = imu;
+- ret = 0;
+-done:
+- if (ret)
+- kvfree(imu);
+- kvfree(pages);
+- kvfree(vmas);
+- return ret;
+-}
+-
+-static int io_buffers_map_alloc(struct io_ring_ctx *ctx, unsigned int nr_args)
+-{
+- ctx->user_bufs = kcalloc(nr_args, sizeof(*ctx->user_bufs), GFP_KERNEL);
+- return ctx->user_bufs ? 0 : -ENOMEM;
+-}
+-
+-static int io_buffer_validate(struct iovec *iov)
+-{
+- unsigned long tmp, acct_len = iov->iov_len + (PAGE_SIZE - 1);
+-
+- /*
+- * Don't impose further limits on the size and buffer
+- * constraints here, we'll -EINVAL later when IO is
+- * submitted if they are wrong.
+- */
+- if (!iov->iov_base)
+- return iov->iov_len ? -EFAULT : 0;
+- if (!iov->iov_len)
+- return -EFAULT;
+-
+- /* arbitrary limit, but we need something */
+- if (iov->iov_len > SZ_1G)
+- return -EFAULT;
+-
+- if (check_add_overflow((unsigned long)iov->iov_base, acct_len, &tmp))
+- return -EOVERFLOW;
+-
+- return 0;
+-}
+-
+-static int io_sqe_buffers_register(struct io_ring_ctx *ctx, void __user *arg,
+- unsigned int nr_args, u64 __user *tags)
+-{
+- struct page *last_hpage = NULL;
+- struct io_rsrc_data *data;
+- int i, ret;
+- struct iovec iov;
+-
+- if (ctx->user_bufs)
+- return -EBUSY;
+- if (!nr_args || nr_args > IORING_MAX_REG_BUFFERS)
+- return -EINVAL;
+- ret = io_rsrc_node_switch_start(ctx);
+- if (ret)
+- return ret;
+- ret = io_rsrc_data_alloc(ctx, io_rsrc_buf_put, tags, nr_args, &data);
+- if (ret)
+- return ret;
+- ret = io_buffers_map_alloc(ctx, nr_args);
+- if (ret) {
+- io_rsrc_data_free(data);
+- return ret;
+- }
+-
+- for (i = 0; i < nr_args; i++, ctx->nr_user_bufs++) {
+- ret = io_copy_iov(ctx, &iov, arg, i);
+- if (ret)
+- break;
+- ret = io_buffer_validate(&iov);
+- if (ret)
+- break;
+- if (!iov.iov_base && *io_get_tag_slot(data, i)) {
+- ret = -EINVAL;
+- break;
+- }
+-
+- ret = io_sqe_buffer_register(ctx, &iov, &ctx->user_bufs[i],
+- &last_hpage);
+- if (ret)
+- break;
+- }
+-
+- WARN_ON_ONCE(ctx->buf_data);
+-
+- ctx->buf_data = data;
+- if (ret)
+- __io_sqe_buffers_unregister(ctx);
+- else
+- io_rsrc_node_switch(ctx, NULL);
+- return ret;
+-}
+-
+-static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
+- struct io_uring_rsrc_update2 *up,
+- unsigned int nr_args)
+-{
+- u64 __user *tags = u64_to_user_ptr(up->tags);
+- struct iovec iov, __user *iovs = u64_to_user_ptr(up->data);
+- struct page *last_hpage = NULL;
+- bool needs_switch = false;
+- __u32 done;
+- int i, err;
+-
+- if (!ctx->buf_data)
+- return -ENXIO;
+- if (up->offset + nr_args > ctx->nr_user_bufs)
+- return -EINVAL;
+-
+- for (done = 0; done < nr_args; done++) {
+- struct io_mapped_ubuf *imu;
+- int offset = up->offset + done;
+- u64 tag = 0;
+-
+- err = io_copy_iov(ctx, &iov, iovs, done);
+- if (err)
+- break;
+- if (tags && copy_from_user(&tag, &tags[done], sizeof(tag))) {
+- err = -EFAULT;
+- break;
+- }
+- err = io_buffer_validate(&iov);
+- if (err)
+- break;
+- if (!iov.iov_base && tag) {
+- err = -EINVAL;
+- break;
+- }
+- err = io_sqe_buffer_register(ctx, &iov, &imu, &last_hpage);
+- if (err)
+- break;
+-
+- i = array_index_nospec(offset, ctx->nr_user_bufs);
+- if (ctx->user_bufs[i] != ctx->dummy_ubuf) {
+- err = io_queue_rsrc_removal(ctx->buf_data, i,
+- ctx->rsrc_node, ctx->user_bufs[i]);
+- if (unlikely(err)) {
+- io_buffer_unmap(ctx, &imu);
+- break;
+- }
+- ctx->user_bufs[i] = NULL;
+- needs_switch = true;
+- }
+-
+- ctx->user_bufs[i] = imu;
+- *io_get_tag_slot(ctx->buf_data, offset) = tag;
+- }
+-
+- if (needs_switch)
+- io_rsrc_node_switch(ctx, ctx->buf_data);
+- return done ? done : err;
+-}
+-
+-static int io_eventfd_register(struct io_ring_ctx *ctx, void __user *arg,
+- unsigned int eventfd_async)
+-{
+- struct io_ev_fd *ev_fd;
+- __s32 __user *fds = arg;
+- int fd;
+-
+- ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
+- lockdep_is_held(&ctx->uring_lock));
+- if (ev_fd)
+- return -EBUSY;
+-
+- if (copy_from_user(&fd, fds, sizeof(*fds)))
+- return -EFAULT;
+-
+- ev_fd = kmalloc(sizeof(*ev_fd), GFP_KERNEL);
+- if (!ev_fd)
+- return -ENOMEM;
+-
+- ev_fd->cq_ev_fd = eventfd_ctx_fdget(fd);
+- if (IS_ERR(ev_fd->cq_ev_fd)) {
+- int ret = PTR_ERR(ev_fd->cq_ev_fd);
+- kfree(ev_fd);
+- return ret;
+- }
+- ev_fd->eventfd_async = eventfd_async;
+- ctx->has_evfd = true;
+- rcu_assign_pointer(ctx->io_ev_fd, ev_fd);
+- return 0;
+-}
+-
+-static void io_eventfd_put(struct rcu_head *rcu)
+-{
+- struct io_ev_fd *ev_fd = container_of(rcu, struct io_ev_fd, rcu);
+-
+- eventfd_ctx_put(ev_fd->cq_ev_fd);
+- kfree(ev_fd);
+-}
+-
+-static int io_eventfd_unregister(struct io_ring_ctx *ctx)
+-{
+- struct io_ev_fd *ev_fd;
+-
+- ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
+- lockdep_is_held(&ctx->uring_lock));
+- if (ev_fd) {
+- ctx->has_evfd = false;
+- rcu_assign_pointer(ctx->io_ev_fd, NULL);
+- call_rcu(&ev_fd->rcu, io_eventfd_put);
+- return 0;
+- }
+-
+- return -ENXIO;
+-}
+-
+-static void io_destroy_buffers(struct io_ring_ctx *ctx)
+-{
+- int i;
+-
+- for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++) {
+- struct list_head *list = &ctx->io_buffers[i];
+-
+- while (!list_empty(list)) {
+- struct io_buffer_list *bl;
+-
+- bl = list_first_entry(list, struct io_buffer_list, list);
+- __io_remove_buffers(ctx, bl, -1U);
+- list_del(&bl->list);
+- kfree(bl);
+- }
+- }
+-
+- while (!list_empty(&ctx->io_buffers_pages)) {
+- struct page *page;
+-
+- page = list_first_entry(&ctx->io_buffers_pages, struct page, lru);
+- list_del_init(&page->lru);
+- __free_page(page);
+- }
+-}
+-
+-static void io_req_caches_free(struct io_ring_ctx *ctx)
+-{
+- struct io_submit_state *state = &ctx->submit_state;
+- int nr = 0;
+-
+- mutex_lock(&ctx->uring_lock);
+- io_flush_cached_locked_reqs(ctx, state);
+-
+- while (state->free_list.next) {
+- struct io_wq_work_node *node;
+- struct io_kiocb *req;
+-
+- node = wq_stack_extract(&state->free_list);
+- req = container_of(node, struct io_kiocb, comp_list);
+- kmem_cache_free(req_cachep, req);
+- nr++;
+- }
+- if (nr)
+- percpu_ref_put_many(&ctx->refs, nr);
+- mutex_unlock(&ctx->uring_lock);
+-}
+-
+-static void io_wait_rsrc_data(struct io_rsrc_data *data)
+-{
+- if (data && !atomic_dec_and_test(&data->refs))
+- wait_for_completion(&data->done);
+-}
+-
+-static void io_flush_apoll_cache(struct io_ring_ctx *ctx)
+-{
+- struct async_poll *apoll;
+-
+- while (!list_empty(&ctx->apoll_cache)) {
+- apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
+- poll.wait.entry);
+- list_del(&apoll->poll.wait.entry);
+- kfree(apoll);
+- }
+-}
+-
+-static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
+-{
+- io_sq_thread_finish(ctx);
+-
+- if (ctx->mm_account) {
+- mmdrop(ctx->mm_account);
+- ctx->mm_account = NULL;
+- }
+-
+- io_rsrc_refs_drop(ctx);
+- /* __io_rsrc_put_work() may need uring_lock to progress, wait w/o it */
+- io_wait_rsrc_data(ctx->buf_data);
+- io_wait_rsrc_data(ctx->file_data);
+-
+- mutex_lock(&ctx->uring_lock);
+- if (ctx->buf_data)
+- __io_sqe_buffers_unregister(ctx);
+- if (ctx->file_data)
+- __io_sqe_files_unregister(ctx);
+- if (ctx->rings)
+- __io_cqring_overflow_flush(ctx, true);
+- io_eventfd_unregister(ctx);
+- io_flush_apoll_cache(ctx);
+- mutex_unlock(&ctx->uring_lock);
+- io_destroy_buffers(ctx);
+- if (ctx->sq_creds)
+- put_cred(ctx->sq_creds);
+-
+- /* there are no registered resources left, nobody uses it */
+- if (ctx->rsrc_node)
+- io_rsrc_node_destroy(ctx->rsrc_node);
+- if (ctx->rsrc_backup_node)
+- io_rsrc_node_destroy(ctx->rsrc_backup_node);
+- flush_delayed_work(&ctx->rsrc_put_work);
+- flush_delayed_work(&ctx->fallback_work);
+-
+- WARN_ON_ONCE(!list_empty(&ctx->rsrc_ref_list));
+- WARN_ON_ONCE(!llist_empty(&ctx->rsrc_put_llist));
+-
+-#if defined(CONFIG_UNIX)
+- if (ctx->ring_sock) {
+- ctx->ring_sock->file = NULL; /* so that iput() is called */
+- sock_release(ctx->ring_sock);
+- }
+-#endif
+- WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
+-
+- io_mem_free(ctx->rings);
+- io_mem_free(ctx->sq_sqes);
+-
+- percpu_ref_exit(&ctx->refs);
+- free_uid(ctx->user);
+- io_req_caches_free(ctx);
+- if (ctx->hash_map)
+- io_wq_put_hash(ctx->hash_map);
+- kfree(ctx->cancel_hash);
+- kfree(ctx->dummy_ubuf);
+- kfree(ctx->io_buffers);
+- kfree(ctx);
+-}
+-
+-static __poll_t io_uring_poll(struct file *file, poll_table *wait)
+-{
+- struct io_ring_ctx *ctx = file->private_data;
+- __poll_t mask = 0;
+-
+- poll_wait(file, &ctx->cq_wait, wait);
+- /*
+- * synchronizes with barrier from wq_has_sleeper call in
+- * io_commit_cqring
+- */
+- smp_rmb();
+- if (!io_sqring_full(ctx))
+- mask |= EPOLLOUT | EPOLLWRNORM;
+-
+- /*
+- * Don't flush cqring overflow list here, just do a simple check.
+- * Otherwise there could possible be ABBA deadlock:
+- * CPU0 CPU1
+- * ---- ----
+- * lock(&ctx->uring_lock);
+- * lock(&ep->mtx);
+- * lock(&ctx->uring_lock);
+- * lock(&ep->mtx);
+- *
+- * Users may get EPOLLIN meanwhile seeing nothing in cqring, this
+- * pushs them to do the flush.
+- */
+- if (io_cqring_events(ctx) || test_bit(0, &ctx->check_cq_overflow))
+- mask |= EPOLLIN | EPOLLRDNORM;
+-
+- return mask;
+-}
+-
+-static int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id)
+-{
+- const struct cred *creds;
+-
+- creds = xa_erase(&ctx->personalities, id);
+- if (creds) {
+- put_cred(creds);
+- return 0;
+- }
+-
+- return -EINVAL;
+-}
+-
+-struct io_tctx_exit {
+- struct callback_head task_work;
+- struct completion completion;
+- struct io_ring_ctx *ctx;
+-};
+-
+-static __cold void io_tctx_exit_cb(struct callback_head *cb)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+- struct io_tctx_exit *work;
+-
+- work = container_of(cb, struct io_tctx_exit, task_work);
+- /*
+- * When @in_idle, we're in cancellation and it's racy to remove the
+- * node. It'll be removed by the end of cancellation, just ignore it.
+- */
+- if (!atomic_read(&tctx->in_idle))
+- io_uring_del_tctx_node((unsigned long)work->ctx);
+- complete(&work->completion);
+-}
+-
+-static __cold bool io_cancel_ctx_cb(struct io_wq_work *work, void *data)
+-{
+- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+-
+- return req->ctx == data;
+-}
+-
+-static __cold void io_ring_exit_work(struct work_struct *work)
+-{
+- struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx, exit_work);
+- unsigned long timeout = jiffies + HZ * 60 * 5;
+- unsigned long interval = HZ / 20;
+- struct io_tctx_exit exit;
+- struct io_tctx_node *node;
+- int ret;
+-
+- /*
+- * If we're doing polled IO and end up having requests being
+- * submitted async (out-of-line), then completions can come in while
+- * we're waiting for refs to drop. We need to reap these manually,
+- * as nobody else will be looking for them.
+- */
+- do {
+- io_uring_try_cancel_requests(ctx, NULL, true);
+- if (ctx->sq_data) {
+- struct io_sq_data *sqd = ctx->sq_data;
+- struct task_struct *tsk;
+-
+- io_sq_thread_park(sqd);
+- tsk = sqd->thread;
+- if (tsk && tsk->io_uring && tsk->io_uring->io_wq)
+- io_wq_cancel_cb(tsk->io_uring->io_wq,
+- io_cancel_ctx_cb, ctx, true);
+- io_sq_thread_unpark(sqd);
+- }
+-
+- io_req_caches_free(ctx);
+-
+- if (WARN_ON_ONCE(time_after(jiffies, timeout))) {
+- /* there is little hope left, don't run it too often */
+- interval = HZ * 60;
+- }
+- } while (!wait_for_completion_timeout(&ctx->ref_comp, interval));
+-
+- init_completion(&exit.completion);
+- init_task_work(&exit.task_work, io_tctx_exit_cb);
+- exit.ctx = ctx;
+- /*
+- * Some may use context even when all refs and requests have been put,
+- * and they are free to do so while still holding uring_lock or
+- * completion_lock, see io_req_task_submit(). Apart from other work,
+- * this lock/unlock section also waits them to finish.
+- */
+- mutex_lock(&ctx->uring_lock);
+- while (!list_empty(&ctx->tctx_list)) {
+- WARN_ON_ONCE(time_after(jiffies, timeout));
+-
+- node = list_first_entry(&ctx->tctx_list, struct io_tctx_node,
+- ctx_node);
+- /* don't spin on a single task if cancellation failed */
+- list_rotate_left(&ctx->tctx_list);
+- ret = task_work_add(node->task, &exit.task_work, TWA_SIGNAL);
+- if (WARN_ON_ONCE(ret))
+- continue;
+-
+- mutex_unlock(&ctx->uring_lock);
+- wait_for_completion(&exit.completion);
+- mutex_lock(&ctx->uring_lock);
+- }
+- mutex_unlock(&ctx->uring_lock);
+- spin_lock(&ctx->completion_lock);
+- spin_unlock(&ctx->completion_lock);
+-
+- io_ring_ctx_free(ctx);
+-}
+-
+-/* Returns true if we found and killed one or more timeouts */
+-static __cold bool io_kill_timeouts(struct io_ring_ctx *ctx,
+- struct task_struct *tsk, bool cancel_all)
+-{
+- struct io_kiocb *req, *tmp;
+- int canceled = 0;
+-
+- spin_lock(&ctx->completion_lock);
+- spin_lock_irq(&ctx->timeout_lock);
+- list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
+- if (io_match_task(req, tsk, cancel_all)) {
+- io_kill_timeout(req, -ECANCELED);
+- canceled++;
+- }
+- }
+- spin_unlock_irq(&ctx->timeout_lock);
+- if (canceled != 0)
+- io_commit_cqring(ctx);
+- spin_unlock(&ctx->completion_lock);
+- if (canceled != 0)
+- io_cqring_ev_posted(ctx);
+- return canceled != 0;
+-}
+-
+-static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
+-{
+- unsigned long index;
+- struct creds *creds;
+-
+- mutex_lock(&ctx->uring_lock);
+- percpu_ref_kill(&ctx->refs);
+- if (ctx->rings)
+- __io_cqring_overflow_flush(ctx, true);
+- xa_for_each(&ctx->personalities, index, creds)
+- io_unregister_personality(ctx, index);
+- mutex_unlock(&ctx->uring_lock);
+-
+- io_kill_timeouts(ctx, NULL, true);
+- io_poll_remove_all(ctx, NULL, true);
+-
+- /* if we failed setting up the ctx, we might not have any rings */
+- io_iopoll_try_reap_events(ctx);
+-
+- INIT_WORK(&ctx->exit_work, io_ring_exit_work);
+- /*
+- * Use system_unbound_wq to avoid spawning tons of event kworkers
+- * if we're exiting a ton of rings at the same time. It just adds
+- * noise and overhead, there's no discernable change in runtime
+- * over using system_wq.
+- */
+- queue_work(system_unbound_wq, &ctx->exit_work);
+-}
+-
+-static int io_uring_release(struct inode *inode, struct file *file)
+-{
+- struct io_ring_ctx *ctx = file->private_data;
+-
+- file->private_data = NULL;
+- io_ring_ctx_wait_and_kill(ctx);
+- return 0;
+-}
+-
+-struct io_task_cancel {
+- struct task_struct *task;
+- bool all;
+-};
+-
+-static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
+-{
+- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+- struct io_task_cancel *cancel = data;
+-
+- return io_match_task_safe(req, cancel->task, cancel->all);
+-}
+-
+-static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
+- struct task_struct *task,
+- bool cancel_all)
+-{
+- struct io_defer_entry *de;
+- LIST_HEAD(list);
+-
+- spin_lock(&ctx->completion_lock);
+- list_for_each_entry_reverse(de, &ctx->defer_list, list) {
+- if (io_match_task_safe(de->req, task, cancel_all)) {
+- list_cut_position(&list, &ctx->defer_list, &de->list);
+- break;
+- }
+- }
+- spin_unlock(&ctx->completion_lock);
+- if (list_empty(&list))
+- return false;
+-
+- while (!list_empty(&list)) {
+- de = list_first_entry(&list, struct io_defer_entry, list);
+- list_del_init(&de->list);
+- io_req_complete_failed(de->req, -ECANCELED);
+- kfree(de);
+- }
+- return true;
+-}
+-
+-static __cold bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
+-{
+- struct io_tctx_node *node;
+- enum io_wq_cancel cret;
+- bool ret = false;
+-
+- mutex_lock(&ctx->uring_lock);
+- list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
+- struct io_uring_task *tctx = node->task->io_uring;
+-
+- /*
+- * io_wq will stay alive while we hold uring_lock, because it's
+- * killed after ctx nodes, which requires to take the lock.
+- */
+- if (!tctx || !tctx->io_wq)
+- continue;
+- cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
+- ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
+- }
+- mutex_unlock(&ctx->uring_lock);
+-
+- return ret;
+-}
+-
+-static __cold void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
+- struct task_struct *task,
+- bool cancel_all)
+-{
+- struct io_task_cancel cancel = { .task = task, .all = cancel_all, };
+- struct io_uring_task *tctx = task ? task->io_uring : NULL;
+-
+- while (1) {
+- enum io_wq_cancel cret;
+- bool ret = false;
+-
+- if (!task) {
+- ret |= io_uring_try_cancel_iowq(ctx);
+- } else if (tctx && tctx->io_wq) {
+- /*
+- * Cancels requests of all rings, not only @ctx, but
+- * it's fine as the task is in exit/exec.
+- */
+- cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_task_cb,
+- &cancel, true);
+- ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
+- }
+-
+- /* SQPOLL thread does its own polling */
+- if ((!(ctx->flags & IORING_SETUP_SQPOLL) && cancel_all) ||
+- (ctx->sq_data && ctx->sq_data->thread == current)) {
+- while (!wq_list_empty(&ctx->iopoll_list)) {
+- io_iopoll_try_reap_events(ctx);
+- ret = true;
+- }
+- }
+-
+- ret |= io_cancel_defer_files(ctx, task, cancel_all);
+- ret |= io_poll_remove_all(ctx, task, cancel_all);
+- ret |= io_kill_timeouts(ctx, task, cancel_all);
+- if (task)
+- ret |= io_run_task_work();
+- if (!ret)
+- break;
+- cond_resched();
+- }
+-}
+-
+-static int __io_uring_add_tctx_node(struct io_ring_ctx *ctx)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+- struct io_tctx_node *node;
+- int ret;
+-
+- if (unlikely(!tctx)) {
+- ret = io_uring_alloc_task_context(current, ctx);
+- if (unlikely(ret))
+- return ret;
+-
+- tctx = current->io_uring;
+- if (ctx->iowq_limits_set) {
+- unsigned int limits[2] = { ctx->iowq_limits[0],
+- ctx->iowq_limits[1], };
+-
+- ret = io_wq_max_workers(tctx->io_wq, limits);
+- if (ret)
+- return ret;
+- }
+- }
+- if (!xa_load(&tctx->xa, (unsigned long)ctx)) {
+- node = kmalloc(sizeof(*node), GFP_KERNEL);
+- if (!node)
+- return -ENOMEM;
+- node->ctx = ctx;
+- node->task = current;
+-
+- ret = xa_err(xa_store(&tctx->xa, (unsigned long)ctx,
+- node, GFP_KERNEL));
+- if (ret) {
+- kfree(node);
+- return ret;
+- }
+-
+- mutex_lock(&ctx->uring_lock);
+- list_add(&node->ctx_node, &ctx->tctx_list);
+- mutex_unlock(&ctx->uring_lock);
+- }
+- tctx->last = ctx;
+- return 0;
+-}
+-
+-/*
+- * Note that this task has used io_uring. We use it for cancelation purposes.
+- */
+-static inline int io_uring_add_tctx_node(struct io_ring_ctx *ctx)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+-
+- if (likely(tctx && tctx->last == ctx))
+- return 0;
+- return __io_uring_add_tctx_node(ctx);
+-}
+-
+-/*
+- * Remove this io_uring_file -> task mapping.
+- */
+-static __cold void io_uring_del_tctx_node(unsigned long index)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+- struct io_tctx_node *node;
+-
+- if (!tctx)
+- return;
+- node = xa_erase(&tctx->xa, index);
+- if (!node)
+- return;
+-
+- WARN_ON_ONCE(current != node->task);
+- WARN_ON_ONCE(list_empty(&node->ctx_node));
+-
+- mutex_lock(&node->ctx->uring_lock);
+- list_del(&node->ctx_node);
+- mutex_unlock(&node->ctx->uring_lock);
+-
+- if (tctx->last == node->ctx)
+- tctx->last = NULL;
+- kfree(node);
+-}
+-
+-static __cold void io_uring_clean_tctx(struct io_uring_task *tctx)
+-{
+- struct io_wq *wq = tctx->io_wq;
+- struct io_tctx_node *node;
+- unsigned long index;
+-
+- xa_for_each(&tctx->xa, index, node) {
+- io_uring_del_tctx_node(index);
+- cond_resched();
+- }
+- if (wq) {
+- /*
+- * Must be after io_uring_del_tctx_node() (removes nodes under
+- * uring_lock) to avoid race with io_uring_try_cancel_iowq().
+- */
+- io_wq_put_and_exit(wq);
+- tctx->io_wq = NULL;
+- }
+-}
+-
+-static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
+-{
+- if (tracked)
+- return atomic_read(&tctx->inflight_tracked);
+- return percpu_counter_sum(&tctx->inflight);
+-}
+-
+-/*
+- * Find any io_uring ctx that this task has registered or done IO on, and cancel
+- * requests. @sqd should be not-null IFF it's an SQPOLL thread cancellation.
+- */
+-static __cold void io_uring_cancel_generic(bool cancel_all,
+- struct io_sq_data *sqd)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+- struct io_ring_ctx *ctx;
+- s64 inflight;
+- DEFINE_WAIT(wait);
+-
+- WARN_ON_ONCE(sqd && sqd->thread != current);
+-
+- if (!current->io_uring)
+- return;
+- if (tctx->io_wq)
+- io_wq_exit_start(tctx->io_wq);
+-
+- atomic_inc(&tctx->in_idle);
+- do {
+- io_uring_drop_tctx_refs(current);
+- /* read completions before cancelations */
+- inflight = tctx_inflight(tctx, !cancel_all);
+- if (!inflight)
+- break;
+-
+- if (!sqd) {
+- struct io_tctx_node *node;
+- unsigned long index;
+-
+- xa_for_each(&tctx->xa, index, node) {
+- /* sqpoll task will cancel all its requests */
+- if (node->ctx->sq_data)
+- continue;
+- io_uring_try_cancel_requests(node->ctx, current,
+- cancel_all);
+- }
+- } else {
+- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+- io_uring_try_cancel_requests(ctx, current,
+- cancel_all);
+- }
+-
+- prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
+- io_run_task_work();
+- io_uring_drop_tctx_refs(current);
+-
+- /*
+- * If we've seen completions, retry without waiting. This
+- * avoids a race where a completion comes in before we did
+- * prepare_to_wait().
+- */
+- if (inflight == tctx_inflight(tctx, !cancel_all))
+- schedule();
+- finish_wait(&tctx->wait, &wait);
+- } while (1);
+-
+- io_uring_clean_tctx(tctx);
+- if (cancel_all) {
+- /*
+- * We shouldn't run task_works after cancel, so just leave
+- * ->in_idle set for normal exit.
+- */
+- atomic_dec(&tctx->in_idle);
+- /* for exec all current's requests should be gone, kill tctx */
+- __io_uring_free(current);
+- }
+-}
+-
+-void __io_uring_cancel(bool cancel_all)
+-{
+- io_uring_cancel_generic(cancel_all, NULL);
+-}
+-
+-void io_uring_unreg_ringfd(void)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+- int i;
+-
+- for (i = 0; i < IO_RINGFD_REG_MAX; i++) {
+- if (tctx->registered_rings[i]) {
+- fput(tctx->registered_rings[i]);
+- tctx->registered_rings[i] = NULL;
+- }
+- }
+-}
+-
+-static int io_ring_add_registered_fd(struct io_uring_task *tctx, int fd,
+- int start, int end)
+-{
+- struct file *file;
+- int offset;
+-
+- for (offset = start; offset < end; offset++) {
+- offset = array_index_nospec(offset, IO_RINGFD_REG_MAX);
+- if (tctx->registered_rings[offset])
+- continue;
+-
+- file = fget(fd);
+- if (!file) {
+- return -EBADF;
+- } else if (file->f_op != &io_uring_fops) {
+- fput(file);
+- return -EOPNOTSUPP;
+- }
+- tctx->registered_rings[offset] = file;
+- return offset;
+- }
+-
+- return -EBUSY;
+-}
+-
+-/*
+- * Register a ring fd to avoid fdget/fdput for each io_uring_enter()
+- * invocation. User passes in an array of struct io_uring_rsrc_update
+- * with ->data set to the ring_fd, and ->offset given for the desired
+- * index. If no index is desired, application may set ->offset == -1U
+- * and we'll find an available index. Returns number of entries
+- * successfully processed, or < 0 on error if none were processed.
+- */
+-static int io_ringfd_register(struct io_ring_ctx *ctx, void __user *__arg,
+- unsigned nr_args)
+-{
+- struct io_uring_rsrc_update __user *arg = __arg;
+- struct io_uring_rsrc_update reg;
+- struct io_uring_task *tctx;
+- int ret, i;
+-
+- if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
+- return -EINVAL;
+-
+- mutex_unlock(&ctx->uring_lock);
+- ret = io_uring_add_tctx_node(ctx);
+- mutex_lock(&ctx->uring_lock);
+- if (ret)
+- return ret;
+-
+- tctx = current->io_uring;
+- for (i = 0; i < nr_args; i++) {
+- int start, end;
+-
+- if (copy_from_user(®, &arg[i], sizeof(reg))) {
+- ret = -EFAULT;
+- break;
+- }
+-
+- if (reg.resv) {
+- ret = -EINVAL;
+- break;
+- }
+-
+- if (reg.offset == -1U) {
+- start = 0;
+- end = IO_RINGFD_REG_MAX;
+- } else {
+- if (reg.offset >= IO_RINGFD_REG_MAX) {
+- ret = -EINVAL;
+- break;
+- }
+- start = reg.offset;
+- end = start + 1;
+- }
+-
+- ret = io_ring_add_registered_fd(tctx, reg.data, start, end);
+- if (ret < 0)
+- break;
+-
+- reg.offset = ret;
+- if (copy_to_user(&arg[i], ®, sizeof(reg))) {
+- fput(tctx->registered_rings[reg.offset]);
+- tctx->registered_rings[reg.offset] = NULL;
+- ret = -EFAULT;
+- break;
+- }
+- }
+-
+- return i ? i : ret;
+-}
+-
+-static int io_ringfd_unregister(struct io_ring_ctx *ctx, void __user *__arg,
+- unsigned nr_args)
+-{
+- struct io_uring_rsrc_update __user *arg = __arg;
+- struct io_uring_task *tctx = current->io_uring;
+- struct io_uring_rsrc_update reg;
+- int ret = 0, i;
+-
+- if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
+- return -EINVAL;
+- if (!tctx)
+- return 0;
+-
+- for (i = 0; i < nr_args; i++) {
+- if (copy_from_user(®, &arg[i], sizeof(reg))) {
+- ret = -EFAULT;
+- break;
+- }
+- if (reg.resv || reg.data || reg.offset >= IO_RINGFD_REG_MAX) {
+- ret = -EINVAL;
+- break;
+- }
+-
+- reg.offset = array_index_nospec(reg.offset, IO_RINGFD_REG_MAX);
+- if (tctx->registered_rings[reg.offset]) {
+- fput(tctx->registered_rings[reg.offset]);
+- tctx->registered_rings[reg.offset] = NULL;
+- }
+- }
+-
+- return i ? i : ret;
+-}
+-
+-static void *io_uring_validate_mmap_request(struct file *file,
+- loff_t pgoff, size_t sz)
+-{
+- struct io_ring_ctx *ctx = file->private_data;
+- loff_t offset = pgoff << PAGE_SHIFT;
+- struct page *page;
+- void *ptr;
+-
+- switch (offset) {
+- case IORING_OFF_SQ_RING:
+- case IORING_OFF_CQ_RING:
+- ptr = ctx->rings;
+- break;
+- case IORING_OFF_SQES:
+- ptr = ctx->sq_sqes;
+- break;
+- default:
+- return ERR_PTR(-EINVAL);
+- }
+-
+- page = virt_to_head_page(ptr);
+- if (sz > page_size(page))
+- return ERR_PTR(-EINVAL);
+-
+- return ptr;
+-}
+-
+-#ifdef CONFIG_MMU
+-
+-static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
+-{
+- size_t sz = vma->vm_end - vma->vm_start;
+- unsigned long pfn;
+- void *ptr;
+-
+- ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff, sz);
+- if (IS_ERR(ptr))
+- return PTR_ERR(ptr);
+-
+- pfn = virt_to_phys(ptr) >> PAGE_SHIFT;
+- return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
+-}
+-
+-#else /* !CONFIG_MMU */
+-
+-static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
+-{
+- return vma->vm_flags & (VM_SHARED | VM_MAYSHARE) ? 0 : -EINVAL;
+-}
+-
+-static unsigned int io_uring_nommu_mmap_capabilities(struct file *file)
+-{
+- return NOMMU_MAP_DIRECT | NOMMU_MAP_READ | NOMMU_MAP_WRITE;
+-}
+-
+-static unsigned long io_uring_nommu_get_unmapped_area(struct file *file,
+- unsigned long addr, unsigned long len,
+- unsigned long pgoff, unsigned long flags)
+-{
+- void *ptr;
+-
+- ptr = io_uring_validate_mmap_request(file, pgoff, len);
+- if (IS_ERR(ptr))
+- return PTR_ERR(ptr);
+-
+- return (unsigned long) ptr;
+-}
+-
+-#endif /* !CONFIG_MMU */
+-
+-static int io_sqpoll_wait_sq(struct io_ring_ctx *ctx)
+-{
+- DEFINE_WAIT(wait);
+-
+- do {
+- if (!io_sqring_full(ctx))
+- break;
+- prepare_to_wait(&ctx->sqo_sq_wait, &wait, TASK_INTERRUPTIBLE);
+-
+- if (!io_sqring_full(ctx))
+- break;
+- schedule();
+- } while (!signal_pending(current));
+-
+- finish_wait(&ctx->sqo_sq_wait, &wait);
+- return 0;
+-}
+-
+-static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz,
+- struct __kernel_timespec __user **ts,
+- const sigset_t __user **sig)
+-{
+- struct io_uring_getevents_arg arg;
+-
+- /*
+- * If EXT_ARG isn't set, then we have no timespec and the argp pointer
+- * is just a pointer to the sigset_t.
+- */
+- if (!(flags & IORING_ENTER_EXT_ARG)) {
+- *sig = (const sigset_t __user *) argp;
+- *ts = NULL;
+- return 0;
+- }
+-
+- /*
+- * EXT_ARG is set - ensure we agree on the size of it and copy in our
+- * timespec and sigset_t pointers if good.
+- */
+- if (*argsz != sizeof(arg))
+- return -EINVAL;
+- if (copy_from_user(&arg, argp, sizeof(arg)))
+- return -EFAULT;
+- if (arg.pad)
+- return -EINVAL;
+- *sig = u64_to_user_ptr(arg.sigmask);
+- *argsz = arg.sigmask_sz;
+- *ts = u64_to_user_ptr(arg.ts);
+- return 0;
+-}
+-
+-SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
+- u32, min_complete, u32, flags, const void __user *, argp,
+- size_t, argsz)
+-{
+- struct io_ring_ctx *ctx;
+- int submitted = 0;
+- struct fd f;
+- long ret;
+-
+- io_run_task_work();
+-
+- if (unlikely(flags & ~(IORING_ENTER_GETEVENTS | IORING_ENTER_SQ_WAKEUP |
+- IORING_ENTER_SQ_WAIT | IORING_ENTER_EXT_ARG |
+- IORING_ENTER_REGISTERED_RING)))
+- return -EINVAL;
+-
+- /*
+- * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
+- * need only dereference our task private array to find it.
+- */
+- if (flags & IORING_ENTER_REGISTERED_RING) {
+- struct io_uring_task *tctx = current->io_uring;
+-
+- if (!tctx || fd >= IO_RINGFD_REG_MAX)
+- return -EINVAL;
+- fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
+- f.file = tctx->registered_rings[fd];
+- if (unlikely(!f.file))
+- return -EBADF;
+- } else {
+- f = fdget(fd);
+- if (unlikely(!f.file))
+- return -EBADF;
+- }
+-
+- ret = -EOPNOTSUPP;
+- if (unlikely(f.file->f_op != &io_uring_fops))
+- goto out_fput;
+-
+- ret = -ENXIO;
+- ctx = f.file->private_data;
+- if (unlikely(!percpu_ref_tryget(&ctx->refs)))
+- goto out_fput;
+-
+- ret = -EBADFD;
+- if (unlikely(ctx->flags & IORING_SETUP_R_DISABLED))
+- goto out;
+-
+- /*
+- * For SQ polling, the thread will do all submissions and completions.
+- * Just return the requested submit count, and wake the thread if
+- * we were asked to.
+- */
+- ret = 0;
+- if (ctx->flags & IORING_SETUP_SQPOLL) {
+- io_cqring_overflow_flush(ctx);
+-
+- if (unlikely(ctx->sq_data->thread == NULL)) {
+- ret = -EOWNERDEAD;
+- goto out;
+- }
+- if (flags & IORING_ENTER_SQ_WAKEUP)
+- wake_up(&ctx->sq_data->wait);
+- if (flags & IORING_ENTER_SQ_WAIT) {
+- ret = io_sqpoll_wait_sq(ctx);
+- if (ret)
+- goto out;
+- }
+- submitted = to_submit;
+- } else if (to_submit) {
+- ret = io_uring_add_tctx_node(ctx);
+- if (unlikely(ret))
+- goto out;
+- mutex_lock(&ctx->uring_lock);
+- submitted = io_submit_sqes(ctx, to_submit);
+- mutex_unlock(&ctx->uring_lock);
+-
+- if (submitted != to_submit)
+- goto out;
+- }
+- if (flags & IORING_ENTER_GETEVENTS) {
+- const sigset_t __user *sig;
+- struct __kernel_timespec __user *ts;
+-
+- ret = io_get_ext_arg(flags, argp, &argsz, &ts, &sig);
+- if (unlikely(ret))
+- goto out;
+-
+- min_complete = min(min_complete, ctx->cq_entries);
+-
+- /*
+- * When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, user
+- * space applications don't need to do io completion events
+- * polling again, they can rely on io_sq_thread to do polling
+- * work, which can reduce cpu usage and uring_lock contention.
+- */
+- if (ctx->flags & IORING_SETUP_IOPOLL &&
+- !(ctx->flags & IORING_SETUP_SQPOLL)) {
+- ret = io_iopoll_check(ctx, min_complete);
+- } else {
+- ret = io_cqring_wait(ctx, min_complete, sig, argsz, ts);
+- }
+- }
+-
+-out:
+- percpu_ref_put(&ctx->refs);
+-out_fput:
+- if (!(flags & IORING_ENTER_REGISTERED_RING))
+- fdput(f);
+- return submitted ? submitted : ret;
+-}
+-
+-#ifdef CONFIG_PROC_FS
+-static __cold int io_uring_show_cred(struct seq_file *m, unsigned int id,
+- const struct cred *cred)
+-{
+- struct user_namespace *uns = seq_user_ns(m);
+- struct group_info *gi;
+- kernel_cap_t cap;
+- unsigned __capi;
+- int g;
+-
+- seq_printf(m, "%5d\n", id);
+- seq_put_decimal_ull(m, "\tUid:\t", from_kuid_munged(uns, cred->uid));
+- seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->euid));
+- seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->suid));
+- seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->fsuid));
+- seq_put_decimal_ull(m, "\n\tGid:\t", from_kgid_munged(uns, cred->gid));
+- seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->egid));
+- seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->sgid));
+- seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->fsgid));
+- seq_puts(m, "\n\tGroups:\t");
+- gi = cred->group_info;
+- for (g = 0; g < gi->ngroups; g++) {
+- seq_put_decimal_ull(m, g ? " " : "",
+- from_kgid_munged(uns, gi->gid[g]));
+- }
+- seq_puts(m, "\n\tCapEff:\t");
+- cap = cred->cap_effective;
+- CAP_FOR_EACH_U32(__capi)
+- seq_put_hex_ll(m, NULL, cap.cap[CAP_LAST_U32 - __capi], 8);
+- seq_putc(m, '\n');
+- return 0;
+-}
+-
+-static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
+- struct seq_file *m)
+-{
+- struct io_sq_data *sq = NULL;
+- struct io_overflow_cqe *ocqe;
+- struct io_rings *r = ctx->rings;
+- unsigned int sq_mask = ctx->sq_entries - 1, cq_mask = ctx->cq_entries - 1;
+- unsigned int sq_head = READ_ONCE(r->sq.head);
+- unsigned int sq_tail = READ_ONCE(r->sq.tail);
+- unsigned int cq_head = READ_ONCE(r->cq.head);
+- unsigned int cq_tail = READ_ONCE(r->cq.tail);
+- unsigned int sq_entries, cq_entries;
+- bool has_lock;
+- unsigned int i;
+-
+- /*
+- * we may get imprecise sqe and cqe info if uring is actively running
+- * since we get cached_sq_head and cached_cq_tail without uring_lock
+- * and sq_tail and cq_head are changed by userspace. But it's ok since
+- * we usually use these info when it is stuck.
+- */
+- seq_printf(m, "SqMask:\t0x%x\n", sq_mask);
+- seq_printf(m, "SqHead:\t%u\n", sq_head);
+- seq_printf(m, "SqTail:\t%u\n", sq_tail);
+- seq_printf(m, "CachedSqHead:\t%u\n", ctx->cached_sq_head);
+- seq_printf(m, "CqMask:\t0x%x\n", cq_mask);
+- seq_printf(m, "CqHead:\t%u\n", cq_head);
+- seq_printf(m, "CqTail:\t%u\n", cq_tail);
+- seq_printf(m, "CachedCqTail:\t%u\n", ctx->cached_cq_tail);
+- seq_printf(m, "SQEs:\t%u\n", sq_tail - ctx->cached_sq_head);
+- sq_entries = min(sq_tail - sq_head, ctx->sq_entries);
+- for (i = 0; i < sq_entries; i++) {
+- unsigned int entry = i + sq_head;
+- unsigned int sq_idx = READ_ONCE(ctx->sq_array[entry & sq_mask]);
+- struct io_uring_sqe *sqe;
+-
+- if (sq_idx > sq_mask)
+- continue;
+- sqe = &ctx->sq_sqes[sq_idx];
+- seq_printf(m, "%5u: opcode:%d, fd:%d, flags:%x, user_data:%llu\n",
+- sq_idx, sqe->opcode, sqe->fd, sqe->flags,
+- sqe->user_data);
+- }
+- seq_printf(m, "CQEs:\t%u\n", cq_tail - cq_head);
+- cq_entries = min(cq_tail - cq_head, ctx->cq_entries);
+- for (i = 0; i < cq_entries; i++) {
+- unsigned int entry = i + cq_head;
+- struct io_uring_cqe *cqe = &r->cqes[entry & cq_mask];
+-
+- seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
+- entry & cq_mask, cqe->user_data, cqe->res,
+- cqe->flags);
+- }
+-
+- /*
+- * Avoid ABBA deadlock between the seq lock and the io_uring mutex,
+- * since fdinfo case grabs it in the opposite direction of normal use
+- * cases. If we fail to get the lock, we just don't iterate any
+- * structures that could be going away outside the io_uring mutex.
+- */
+- has_lock = mutex_trylock(&ctx->uring_lock);
+-
+- if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
+- sq = ctx->sq_data;
+- if (!sq->thread)
+- sq = NULL;
+- }
+-
+- seq_printf(m, "SqThread:\t%d\n", sq ? task_pid_nr(sq->thread) : -1);
+- seq_printf(m, "SqThreadCpu:\t%d\n", sq ? task_cpu(sq->thread) : -1);
+- seq_printf(m, "UserFiles:\t%u\n", ctx->nr_user_files);
+- for (i = 0; has_lock && i < ctx->nr_user_files; i++) {
+- struct file *f = io_file_from_index(ctx, i);
+-
+- if (f)
+- seq_printf(m, "%5u: %s\n", i, file_dentry(f)->d_iname);
+- else
+- seq_printf(m, "%5u: <none>\n", i);
+- }
+- seq_printf(m, "UserBufs:\t%u\n", ctx->nr_user_bufs);
+- for (i = 0; has_lock && i < ctx->nr_user_bufs; i++) {
+- struct io_mapped_ubuf *buf = ctx->user_bufs[i];
+- unsigned int len = buf->ubuf_end - buf->ubuf;
+-
+- seq_printf(m, "%5u: 0x%llx/%u\n", i, buf->ubuf, len);
+- }
+- if (has_lock && !xa_empty(&ctx->personalities)) {
+- unsigned long index;
+- const struct cred *cred;
+-
+- seq_printf(m, "Personalities:\n");
+- xa_for_each(&ctx->personalities, index, cred)
+- io_uring_show_cred(m, index, cred);
+- }
+- if (has_lock)
+- mutex_unlock(&ctx->uring_lock);
+-
+- seq_puts(m, "PollList:\n");
+- spin_lock(&ctx->completion_lock);
+- for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
+- struct hlist_head *list = &ctx->cancel_hash[i];
+- struct io_kiocb *req;
+-
+- hlist_for_each_entry(req, list, hash_node)
+- seq_printf(m, " op=%d, task_works=%d\n", req->opcode,
+- task_work_pending(req->task));
+- }
+-
+- seq_puts(m, "CqOverflowList:\n");
+- list_for_each_entry(ocqe, &ctx->cq_overflow_list, list) {
+- struct io_uring_cqe *cqe = &ocqe->cqe;
+-
+- seq_printf(m, " user_data=%llu, res=%d, flags=%x\n",
+- cqe->user_data, cqe->res, cqe->flags);
+-
+- }
+-
+- spin_unlock(&ctx->completion_lock);
+-}
+-
+-static __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
+-{
+- struct io_ring_ctx *ctx = f->private_data;
+-
+- if (percpu_ref_tryget(&ctx->refs)) {
+- __io_uring_show_fdinfo(ctx, m);
+- percpu_ref_put(&ctx->refs);
+- }
+-}
+-#endif
+-
+-static const struct file_operations io_uring_fops = {
+- .release = io_uring_release,
+- .mmap = io_uring_mmap,
+-#ifndef CONFIG_MMU
+- .get_unmapped_area = io_uring_nommu_get_unmapped_area,
+- .mmap_capabilities = io_uring_nommu_mmap_capabilities,
+-#endif
+- .poll = io_uring_poll,
+-#ifdef CONFIG_PROC_FS
+- .show_fdinfo = io_uring_show_fdinfo,
+-#endif
+-};
+-
+-static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
+- struct io_uring_params *p)
+-{
+- struct io_rings *rings;
+- size_t size, sq_array_offset;
+-
+- /* make sure these are sane, as we already accounted them */
+- ctx->sq_entries = p->sq_entries;
+- ctx->cq_entries = p->cq_entries;
+-
+- size = rings_size(p->sq_entries, p->cq_entries, &sq_array_offset);
+- if (size == SIZE_MAX)
+- return -EOVERFLOW;
+-
+- rings = io_mem_alloc(size);
+- if (!rings)
+- return -ENOMEM;
+-
+- ctx->rings = rings;
+- ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
+- rings->sq_ring_mask = p->sq_entries - 1;
+- rings->cq_ring_mask = p->cq_entries - 1;
+- rings->sq_ring_entries = p->sq_entries;
+- rings->cq_ring_entries = p->cq_entries;
+-
+- size = array_size(sizeof(struct io_uring_sqe), p->sq_entries);
+- if (size == SIZE_MAX) {
+- io_mem_free(ctx->rings);
+- ctx->rings = NULL;
+- return -EOVERFLOW;
+- }
+-
+- ctx->sq_sqes = io_mem_alloc(size);
+- if (!ctx->sq_sqes) {
+- io_mem_free(ctx->rings);
+- ctx->rings = NULL;
+- return -ENOMEM;
+- }
+-
+- return 0;
+-}
+-
+-static int io_uring_install_fd(struct io_ring_ctx *ctx, struct file *file)
+-{
+- int ret, fd;
+-
+- fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
+- if (fd < 0)
+- return fd;
+-
+- ret = io_uring_add_tctx_node(ctx);
+- if (ret) {
+- put_unused_fd(fd);
+- return ret;
+- }
+- fd_install(fd, file);
+- return fd;
+-}
+-
+-/*
+- * Allocate an anonymous fd, this is what constitutes the application
+- * visible backing of an io_uring instance. The application mmaps this
+- * fd to gain access to the SQ/CQ ring details. If UNIX sockets are enabled,
+- * we have to tie this fd to a socket for file garbage collection purposes.
+- */
+-static struct file *io_uring_get_file(struct io_ring_ctx *ctx)
+-{
+- struct file *file;
+-#if defined(CONFIG_UNIX)
+- int ret;
+-
+- ret = sock_create_kern(&init_net, PF_UNIX, SOCK_RAW, IPPROTO_IP,
+- &ctx->ring_sock);
+- if (ret)
+- return ERR_PTR(ret);
+-#endif
+-
+- file = anon_inode_getfile_secure("[io_uring]", &io_uring_fops, ctx,
+- O_RDWR | O_CLOEXEC, NULL);
+-#if defined(CONFIG_UNIX)
+- if (IS_ERR(file)) {
+- sock_release(ctx->ring_sock);
+- ctx->ring_sock = NULL;
+- } else {
+- ctx->ring_sock->file = file;
+- }
+-#endif
+- return file;
+-}
+-
+-static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
+- struct io_uring_params __user *params)
+-{
+- struct io_ring_ctx *ctx;
+- struct file *file;
+- int ret;
+-
+- if (!entries)
+- return -EINVAL;
+- if (entries > IORING_MAX_ENTRIES) {
+- if (!(p->flags & IORING_SETUP_CLAMP))
+- return -EINVAL;
+- entries = IORING_MAX_ENTRIES;
+- }
+-
+- /*
+- * Use twice as many entries for the CQ ring. It's possible for the
+- * application to drive a higher depth than the size of the SQ ring,
+- * since the sqes are only used at submission time. This allows for
+- * some flexibility in overcommitting a bit. If the application has
+- * set IORING_SETUP_CQSIZE, it will have passed in the desired number
+- * of CQ ring entries manually.
+- */
+- p->sq_entries = roundup_pow_of_two(entries);
+- if (p->flags & IORING_SETUP_CQSIZE) {
+- /*
+- * If IORING_SETUP_CQSIZE is set, we do the same roundup
+- * to a power-of-two, if it isn't already. We do NOT impose
+- * any cq vs sq ring sizing.
+- */
+- if (!p->cq_entries)
+- return -EINVAL;
+- if (p->cq_entries > IORING_MAX_CQ_ENTRIES) {
+- if (!(p->flags & IORING_SETUP_CLAMP))
+- return -EINVAL;
+- p->cq_entries = IORING_MAX_CQ_ENTRIES;
+- }
+- p->cq_entries = roundup_pow_of_two(p->cq_entries);
+- if (p->cq_entries < p->sq_entries)
+- return -EINVAL;
+- } else {
+- p->cq_entries = 2 * p->sq_entries;
+- }
+-
+- ctx = io_ring_ctx_alloc(p);
+- if (!ctx)
+- return -ENOMEM;
+- ctx->compat = in_compat_syscall();
+- if (!capable(CAP_IPC_LOCK))
+- ctx->user = get_uid(current_user());
+-
+- /*
+- * This is just grabbed for accounting purposes. When a process exits,
+- * the mm is exited and dropped before the files, hence we need to hang
+- * on to this mm purely for the purposes of being able to unaccount
+- * memory (locked/pinned vm). It's not used for anything else.
+- */
+- mmgrab(current->mm);
+- ctx->mm_account = current->mm;
+-
+- ret = io_allocate_scq_urings(ctx, p);
+- if (ret)
+- goto err;
+-
+- ret = io_sq_offload_create(ctx, p);
+- if (ret)
+- goto err;
+- /* always set a rsrc node */
+- ret = io_rsrc_node_switch_start(ctx);
+- if (ret)
+- goto err;
+- io_rsrc_node_switch(ctx, NULL);
+-
+- memset(&p->sq_off, 0, sizeof(p->sq_off));
+- p->sq_off.head = offsetof(struct io_rings, sq.head);
+- p->sq_off.tail = offsetof(struct io_rings, sq.tail);
+- p->sq_off.ring_mask = offsetof(struct io_rings, sq_ring_mask);
+- p->sq_off.ring_entries = offsetof(struct io_rings, sq_ring_entries);
+- p->sq_off.flags = offsetof(struct io_rings, sq_flags);
+- p->sq_off.dropped = offsetof(struct io_rings, sq_dropped);
+- p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
+-
+- memset(&p->cq_off, 0, sizeof(p->cq_off));
+- p->cq_off.head = offsetof(struct io_rings, cq.head);
+- p->cq_off.tail = offsetof(struct io_rings, cq.tail);
+- p->cq_off.ring_mask = offsetof(struct io_rings, cq_ring_mask);
+- p->cq_off.ring_entries = offsetof(struct io_rings, cq_ring_entries);
+- p->cq_off.overflow = offsetof(struct io_rings, cq_overflow);
+- p->cq_off.cqes = offsetof(struct io_rings, cqes);
+- p->cq_off.flags = offsetof(struct io_rings, cq_flags);
+-
+- p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
+- IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS |
+- IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL |
+- IORING_FEAT_POLL_32BITS | IORING_FEAT_SQPOLL_NONFIXED |
+- IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
+- IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
+- IORING_FEAT_LINKED_FILE;
+-
+- if (copy_to_user(params, p, sizeof(*p))) {
+- ret = -EFAULT;
+- goto err;
+- }
+-
+- file = io_uring_get_file(ctx);
+- if (IS_ERR(file)) {
+- ret = PTR_ERR(file);
+- goto err;
+- }
+-
+- /*
+- * Install ring fd as the very last thing, so we don't risk someone
+- * having closed it before we finish setup
+- */
+- ret = io_uring_install_fd(ctx, file);
+- if (ret < 0) {
+- /* fput will clean it up */
+- fput(file);
+- return ret;
+- }
+-
+- trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
+- return ret;
+-err:
+- io_ring_ctx_wait_and_kill(ctx);
+- return ret;
+-}
+-
+-/*
+- * Sets up an aio uring context, and returns the fd. Applications asks for a
+- * ring size, we return the actual sq/cq ring sizes (among other things) in the
+- * params structure passed in.
+- */
+-static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
+-{
+- struct io_uring_params p;
+- int i;
+-
+- if (copy_from_user(&p, params, sizeof(p)))
+- return -EFAULT;
+- for (i = 0; i < ARRAY_SIZE(p.resv); i++) {
+- if (p.resv[i])
+- return -EINVAL;
+- }
+-
+- if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
+- IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
+- IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ |
+- IORING_SETUP_R_DISABLED | IORING_SETUP_SUBMIT_ALL))
+- return -EINVAL;
+-
+- return io_uring_create(entries, &p, params);
+-}
+-
+-SYSCALL_DEFINE2(io_uring_setup, u32, entries,
+- struct io_uring_params __user *, params)
+-{
+- return io_uring_setup(entries, params);
+-}
+-
+-static __cold int io_probe(struct io_ring_ctx *ctx, void __user *arg,
+- unsigned nr_args)
+-{
+- struct io_uring_probe *p;
+- size_t size;
+- int i, ret;
+-
+- size = struct_size(p, ops, nr_args);
+- if (size == SIZE_MAX)
+- return -EOVERFLOW;
+- p = kzalloc(size, GFP_KERNEL);
+- if (!p)
+- return -ENOMEM;
+-
+- ret = -EFAULT;
+- if (copy_from_user(p, arg, size))
+- goto out;
+- ret = -EINVAL;
+- if (memchr_inv(p, 0, size))
+- goto out;
+-
+- p->last_op = IORING_OP_LAST - 1;
+- if (nr_args > IORING_OP_LAST)
+- nr_args = IORING_OP_LAST;
+-
+- for (i = 0; i < nr_args; i++) {
+- p->ops[i].op = i;
+- if (!io_op_defs[i].not_supported)
+- p->ops[i].flags = IO_URING_OP_SUPPORTED;
+- }
+- p->ops_len = i;
+-
+- ret = 0;
+- if (copy_to_user(arg, p, size))
+- ret = -EFAULT;
+-out:
+- kfree(p);
+- return ret;
+-}
+-
+-static int io_register_personality(struct io_ring_ctx *ctx)
+-{
+- const struct cred *creds;
+- u32 id;
+- int ret;
+-
+- creds = get_current_cred();
+-
+- ret = xa_alloc_cyclic(&ctx->personalities, &id, (void *)creds,
+- XA_LIMIT(0, USHRT_MAX), &ctx->pers_next, GFP_KERNEL);
+- if (ret < 0) {
+- put_cred(creds);
+- return ret;
+- }
+- return id;
+-}
+-
+-static __cold int io_register_restrictions(struct io_ring_ctx *ctx,
+- void __user *arg, unsigned int nr_args)
+-{
+- struct io_uring_restriction *res;
+- size_t size;
+- int i, ret;
+-
+- /* Restrictions allowed only if rings started disabled */
+- if (!(ctx->flags & IORING_SETUP_R_DISABLED))
+- return -EBADFD;
+-
+- /* We allow only a single restrictions registration */
+- if (ctx->restrictions.registered)
+- return -EBUSY;
+-
+- if (!arg || nr_args > IORING_MAX_RESTRICTIONS)
+- return -EINVAL;
+-
+- size = array_size(nr_args, sizeof(*res));
+- if (size == SIZE_MAX)
+- return -EOVERFLOW;
+-
+- res = memdup_user(arg, size);
+- if (IS_ERR(res))
+- return PTR_ERR(res);
+-
+- ret = 0;
+-
+- for (i = 0; i < nr_args; i++) {
+- switch (res[i].opcode) {
+- case IORING_RESTRICTION_REGISTER_OP:
+- if (res[i].register_op >= IORING_REGISTER_LAST) {
+- ret = -EINVAL;
+- goto out;
+- }
+-
+- __set_bit(res[i].register_op,
+- ctx->restrictions.register_op);
+- break;
+- case IORING_RESTRICTION_SQE_OP:
+- if (res[i].sqe_op >= IORING_OP_LAST) {
+- ret = -EINVAL;
+- goto out;
+- }
+-
+- __set_bit(res[i].sqe_op, ctx->restrictions.sqe_op);
+- break;
+- case IORING_RESTRICTION_SQE_FLAGS_ALLOWED:
+- ctx->restrictions.sqe_flags_allowed = res[i].sqe_flags;
+- break;
+- case IORING_RESTRICTION_SQE_FLAGS_REQUIRED:
+- ctx->restrictions.sqe_flags_required = res[i].sqe_flags;
+- break;
+- default:
+- ret = -EINVAL;
+- goto out;
+- }
+- }
+-
+-out:
+- /* Reset all restrictions if an error happened */
+- if (ret != 0)
+- memset(&ctx->restrictions, 0, sizeof(ctx->restrictions));
+- else
+- ctx->restrictions.registered = true;
+-
+- kfree(res);
+- return ret;
+-}
+-
+-static int io_register_enable_rings(struct io_ring_ctx *ctx)
+-{
+- if (!(ctx->flags & IORING_SETUP_R_DISABLED))
+- return -EBADFD;
+-
+- if (ctx->restrictions.registered)
+- ctx->restricted = 1;
+-
+- ctx->flags &= ~IORING_SETUP_R_DISABLED;
+- if (ctx->sq_data && wq_has_sleeper(&ctx->sq_data->wait))
+- wake_up(&ctx->sq_data->wait);
+- return 0;
+-}
+-
+-static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
+- struct io_uring_rsrc_update2 *up,
+- unsigned nr_args)
+-{
+- __u32 tmp;
+- int err;
+-
+- if (check_add_overflow(up->offset, nr_args, &tmp))
+- return -EOVERFLOW;
+- err = io_rsrc_node_switch_start(ctx);
+- if (err)
+- return err;
+-
+- switch (type) {
+- case IORING_RSRC_FILE:
+- return __io_sqe_files_update(ctx, up, nr_args);
+- case IORING_RSRC_BUFFER:
+- return __io_sqe_buffers_update(ctx, up, nr_args);
+- }
+- return -EINVAL;
+-}
+-
+-static int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg,
+- unsigned nr_args)
+-{
+- struct io_uring_rsrc_update2 up;
+-
+- if (!nr_args)
+- return -EINVAL;
+- memset(&up, 0, sizeof(up));
+- if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update)))
+- return -EFAULT;
+- if (up.resv || up.resv2)
+- return -EINVAL;
+- return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args);
+-}
+-
+-static int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
+- unsigned size, unsigned type)
+-{
+- struct io_uring_rsrc_update2 up;
+-
+- if (size != sizeof(up))
+- return -EINVAL;
+- if (copy_from_user(&up, arg, sizeof(up)))
+- return -EFAULT;
+- if (!up.nr || up.resv || up.resv2)
+- return -EINVAL;
+- return __io_register_rsrc_update(ctx, type, &up, up.nr);
+-}
+-
+-static __cold int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
+- unsigned int size, unsigned int type)
+-{
+- struct io_uring_rsrc_register rr;
+-
+- /* keep it extendible */
+- if (size != sizeof(rr))
+- return -EINVAL;
+-
+- memset(&rr, 0, sizeof(rr));
+- if (copy_from_user(&rr, arg, size))
+- return -EFAULT;
+- if (!rr.nr || rr.resv || rr.resv2)
+- return -EINVAL;
+-
+- switch (type) {
+- case IORING_RSRC_FILE:
+- return io_sqe_files_register(ctx, u64_to_user_ptr(rr.data),
+- rr.nr, u64_to_user_ptr(rr.tags));
+- case IORING_RSRC_BUFFER:
+- return io_sqe_buffers_register(ctx, u64_to_user_ptr(rr.data),
+- rr.nr, u64_to_user_ptr(rr.tags));
+- }
+- return -EINVAL;
+-}
+-
+-static __cold int io_register_iowq_aff(struct io_ring_ctx *ctx,
+- void __user *arg, unsigned len)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+- cpumask_var_t new_mask;
+- int ret;
+-
+- if (!tctx || !tctx->io_wq)
+- return -EINVAL;
+-
+- if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
+- return -ENOMEM;
+-
+- cpumask_clear(new_mask);
+- if (len > cpumask_size())
+- len = cpumask_size();
+-
+- if (in_compat_syscall()) {
+- ret = compat_get_bitmap(cpumask_bits(new_mask),
+- (const compat_ulong_t __user *)arg,
+- len * 8 /* CHAR_BIT */);
+- } else {
+- ret = copy_from_user(new_mask, arg, len);
+- }
+-
+- if (ret) {
+- free_cpumask_var(new_mask);
+- return -EFAULT;
+- }
+-
+- ret = io_wq_cpu_affinity(tctx->io_wq, new_mask);
+- free_cpumask_var(new_mask);
+- return ret;
+-}
+-
+-static __cold int io_unregister_iowq_aff(struct io_ring_ctx *ctx)
+-{
+- struct io_uring_task *tctx = current->io_uring;
+-
+- if (!tctx || !tctx->io_wq)
+- return -EINVAL;
+-
+- return io_wq_cpu_affinity(tctx->io_wq, NULL);
+-}
+-
+-static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
+- void __user *arg)
+- __must_hold(&ctx->uring_lock)
+-{
+- struct io_tctx_node *node;
+- struct io_uring_task *tctx = NULL;
+- struct io_sq_data *sqd = NULL;
+- __u32 new_count[2];
+- int i, ret;
+-
+- if (copy_from_user(new_count, arg, sizeof(new_count)))
+- return -EFAULT;
+- for (i = 0; i < ARRAY_SIZE(new_count); i++)
+- if (new_count[i] > INT_MAX)
+- return -EINVAL;
+-
+- if (ctx->flags & IORING_SETUP_SQPOLL) {
+- sqd = ctx->sq_data;
+- if (sqd) {
+- /*
+- * Observe the correct sqd->lock -> ctx->uring_lock
+- * ordering. Fine to drop uring_lock here, we hold
+- * a ref to the ctx.
+- */
+- refcount_inc(&sqd->refs);
+- mutex_unlock(&ctx->uring_lock);
+- mutex_lock(&sqd->lock);
+- mutex_lock(&ctx->uring_lock);
+- if (sqd->thread)
+- tctx = sqd->thread->io_uring;
+- }
+- } else {
+- tctx = current->io_uring;
+- }
+-
+- BUILD_BUG_ON(sizeof(new_count) != sizeof(ctx->iowq_limits));
+-
+- for (i = 0; i < ARRAY_SIZE(new_count); i++)
+- if (new_count[i])
+- ctx->iowq_limits[i] = new_count[i];
+- ctx->iowq_limits_set = true;
+-
+- if (tctx && tctx->io_wq) {
+- ret = io_wq_max_workers(tctx->io_wq, new_count);
+- if (ret)
+- goto err;
+- } else {
+- memset(new_count, 0, sizeof(new_count));
+- }
+-
+- if (sqd) {
+- mutex_unlock(&sqd->lock);
+- io_put_sq_data(sqd);
+- }
+-
+- if (copy_to_user(arg, new_count, sizeof(new_count)))
+- return -EFAULT;
+-
+- /* that's it for SQPOLL, only the SQPOLL task creates requests */
+- if (sqd)
+- return 0;
+-
+- /* now propagate the restriction to all registered users */
+- list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
+- struct io_uring_task *tctx = node->task->io_uring;
+-
+- if (WARN_ON_ONCE(!tctx->io_wq))
+- continue;
+-
+- for (i = 0; i < ARRAY_SIZE(new_count); i++)
+- new_count[i] = ctx->iowq_limits[i];
+- /* ignore errors, it always returns zero anyway */
+- (void)io_wq_max_workers(tctx->io_wq, new_count);
+- }
+- return 0;
+-err:
+- if (sqd) {
+- mutex_unlock(&sqd->lock);
+- io_put_sq_data(sqd);
+- }
+- return ret;
+-}
+-
+-static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
+- void __user *arg, unsigned nr_args)
+- __releases(ctx->uring_lock)
+- __acquires(ctx->uring_lock)
+-{
+- int ret;
+-
+- /*
+- * We're inside the ring mutex, if the ref is already dying, then
+- * someone else killed the ctx or is already going through
+- * io_uring_register().
+- */
+- if (percpu_ref_is_dying(&ctx->refs))
+- return -ENXIO;
+-
+- if (ctx->restricted) {
+- if (opcode >= IORING_REGISTER_LAST)
+- return -EINVAL;
+- opcode = array_index_nospec(opcode, IORING_REGISTER_LAST);
+- if (!test_bit(opcode, ctx->restrictions.register_op))
+- return -EACCES;
+- }
+-
+- switch (opcode) {
+- case IORING_REGISTER_BUFFERS:
+- ret = io_sqe_buffers_register(ctx, arg, nr_args, NULL);
+- break;
+- case IORING_UNREGISTER_BUFFERS:
+- ret = -EINVAL;
+- if (arg || nr_args)
+- break;
+- ret = io_sqe_buffers_unregister(ctx);
+- break;
+- case IORING_REGISTER_FILES:
+- ret = io_sqe_files_register(ctx, arg, nr_args, NULL);
+- break;
+- case IORING_UNREGISTER_FILES:
+- ret = -EINVAL;
+- if (arg || nr_args)
+- break;
+- ret = io_sqe_files_unregister(ctx);
+- break;
+- case IORING_REGISTER_FILES_UPDATE:
+- ret = io_register_files_update(ctx, arg, nr_args);
+- break;
+- case IORING_REGISTER_EVENTFD:
+- ret = -EINVAL;
+- if (nr_args != 1)
+- break;
+- ret = io_eventfd_register(ctx, arg, 0);
+- break;
+- case IORING_REGISTER_EVENTFD_ASYNC:
+- ret = -EINVAL;
+- if (nr_args != 1)
+- break;
+- ret = io_eventfd_register(ctx, arg, 1);
+- break;
+- case IORING_UNREGISTER_EVENTFD:
+- ret = -EINVAL;
+- if (arg || nr_args)
+- break;
+- ret = io_eventfd_unregister(ctx);
+- break;
+- case IORING_REGISTER_PROBE:
+- ret = -EINVAL;
+- if (!arg || nr_args > 256)
+- break;
+- ret = io_probe(ctx, arg, nr_args);
+- break;
+- case IORING_REGISTER_PERSONALITY:
+- ret = -EINVAL;
+- if (arg || nr_args)
+- break;
+- ret = io_register_personality(ctx);
+- break;
+- case IORING_UNREGISTER_PERSONALITY:
+- ret = -EINVAL;
+- if (arg)
+- break;
+- ret = io_unregister_personality(ctx, nr_args);
+- break;
+- case IORING_REGISTER_ENABLE_RINGS:
+- ret = -EINVAL;
+- if (arg || nr_args)
+- break;
+- ret = io_register_enable_rings(ctx);
+- break;
+- case IORING_REGISTER_RESTRICTIONS:
+- ret = io_register_restrictions(ctx, arg, nr_args);
+- break;
+- case IORING_REGISTER_FILES2:
+- ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_FILE);
+- break;
+- case IORING_REGISTER_FILES_UPDATE2:
+- ret = io_register_rsrc_update(ctx, arg, nr_args,
+- IORING_RSRC_FILE);
+- break;
+- case IORING_REGISTER_BUFFERS2:
+- ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_BUFFER);
+- break;
+- case IORING_REGISTER_BUFFERS_UPDATE:
+- ret = io_register_rsrc_update(ctx, arg, nr_args,
+- IORING_RSRC_BUFFER);
+- break;
+- case IORING_REGISTER_IOWQ_AFF:
+- ret = -EINVAL;
+- if (!arg || !nr_args)
+- break;
+- ret = io_register_iowq_aff(ctx, arg, nr_args);
+- break;
+- case IORING_UNREGISTER_IOWQ_AFF:
+- ret = -EINVAL;
+- if (arg || nr_args)
+- break;
+- ret = io_unregister_iowq_aff(ctx);
+- break;
+- case IORING_REGISTER_IOWQ_MAX_WORKERS:
+- ret = -EINVAL;
+- if (!arg || nr_args != 2)
+- break;
+- ret = io_register_iowq_max_workers(ctx, arg);
+- break;
+- case IORING_REGISTER_RING_FDS:
+- ret = io_ringfd_register(ctx, arg, nr_args);
+- break;
+- case IORING_UNREGISTER_RING_FDS:
+- ret = io_ringfd_unregister(ctx, arg, nr_args);
+- break;
+- default:
+- ret = -EINVAL;
+- break;
+- }
+-
+- return ret;
+-}
+-
+-SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
+- void __user *, arg, unsigned int, nr_args)
+-{
+- struct io_ring_ctx *ctx;
+- long ret = -EBADF;
+- struct fd f;
+-
+- f = fdget(fd);
+- if (!f.file)
+- return -EBADF;
+-
+- ret = -EOPNOTSUPP;
+- if (f.file->f_op != &io_uring_fops)
+- goto out_fput;
+-
+- ctx = f.file->private_data;
+-
+- io_run_task_work();
+-
+- mutex_lock(&ctx->uring_lock);
+- ret = __io_uring_register(ctx, opcode, arg, nr_args);
+- mutex_unlock(&ctx->uring_lock);
+- trace_io_uring_register(ctx, opcode, ctx->nr_user_files, ctx->nr_user_bufs, ret);
+-out_fput:
+- fdput(f);
+- return ret;
+-}
+-
+-static int __init io_uring_init(void)
+-{
+-#define __BUILD_BUG_VERIFY_ELEMENT(stype, eoffset, etype, ename) do { \
+- BUILD_BUG_ON(offsetof(stype, ename) != eoffset); \
+- BUILD_BUG_ON(sizeof(etype) != sizeof_field(stype, ename)); \
+-} while (0)
+-
+-#define BUILD_BUG_SQE_ELEM(eoffset, etype, ename) \
+- __BUILD_BUG_VERIFY_ELEMENT(struct io_uring_sqe, eoffset, etype, ename)
+- BUILD_BUG_ON(sizeof(struct io_uring_sqe) != 64);
+- BUILD_BUG_SQE_ELEM(0, __u8, opcode);
+- BUILD_BUG_SQE_ELEM(1, __u8, flags);
+- BUILD_BUG_SQE_ELEM(2, __u16, ioprio);
+- BUILD_BUG_SQE_ELEM(4, __s32, fd);
+- BUILD_BUG_SQE_ELEM(8, __u64, off);
+- BUILD_BUG_SQE_ELEM(8, __u64, addr2);
+- BUILD_BUG_SQE_ELEM(16, __u64, addr);
+- BUILD_BUG_SQE_ELEM(16, __u64, splice_off_in);
+- BUILD_BUG_SQE_ELEM(24, __u32, len);
+- BUILD_BUG_SQE_ELEM(28, __kernel_rwf_t, rw_flags);
+- BUILD_BUG_SQE_ELEM(28, /* compat */ int, rw_flags);
+- BUILD_BUG_SQE_ELEM(28, /* compat */ __u32, rw_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, fsync_flags);
+- BUILD_BUG_SQE_ELEM(28, /* compat */ __u16, poll_events);
+- BUILD_BUG_SQE_ELEM(28, __u32, poll32_events);
+- BUILD_BUG_SQE_ELEM(28, __u32, sync_range_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, msg_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, timeout_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, accept_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, cancel_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, open_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, statx_flags);
+- BUILD_BUG_SQE_ELEM(28, __u32, fadvise_advice);
+- BUILD_BUG_SQE_ELEM(28, __u32, splice_flags);
+- BUILD_BUG_SQE_ELEM(32, __u64, user_data);
+- BUILD_BUG_SQE_ELEM(40, __u16, buf_index);
+- BUILD_BUG_SQE_ELEM(40, __u16, buf_group);
+- BUILD_BUG_SQE_ELEM(42, __u16, personality);
+- BUILD_BUG_SQE_ELEM(44, __s32, splice_fd_in);
+- BUILD_BUG_SQE_ELEM(44, __u32, file_index);
+-
+- BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
+- sizeof(struct io_uring_rsrc_update));
+- BUILD_BUG_ON(sizeof(struct io_uring_rsrc_update) >
+- sizeof(struct io_uring_rsrc_update2));
+-
+- /* ->buf_index is u16 */
+- BUILD_BUG_ON(IORING_MAX_REG_BUFFERS >= (1u << 16));
+-
+- /* should fit into one byte */
+- BUILD_BUG_ON(SQE_VALID_FLAGS >= (1 << 8));
+- BUILD_BUG_ON(SQE_COMMON_FLAGS >= (1 << 8));
+- BUILD_BUG_ON((SQE_VALID_FLAGS | SQE_COMMON_FLAGS) != SQE_VALID_FLAGS);
+-
+- BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST);
+- BUILD_BUG_ON(__REQ_F_LAST_BIT > 8 * sizeof(int));
+-
+- req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
+- SLAB_ACCOUNT);
+- return 0;
+-};
+-__initcall(io_uring_init);
+diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
+index ac7f067b7bddb..f306b52b8e0b7 100644
+--- a/fs/jbd2/commit.c
++++ b/fs/jbd2/commit.c
+@@ -551,13 +551,13 @@ void jbd2_journal_commit_transaction(journal_t *journal)
+ */
+ jbd2_journal_switch_revoke_table(journal);
+
++ write_lock(&journal->j_state_lock);
+ /*
+ * Reserved credits cannot be claimed anymore, free them
+ */
+ atomic_sub(atomic_read(&journal->j_reserved_credits),
+ &commit_transaction->t_outstanding_credits);
+
+- write_lock(&journal->j_state_lock);
+ trace_jbd2_commit_flushing(journal, commit_transaction);
+ stats.run.rs_flushing = jiffies;
+ stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked,
+diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
+index fcb9175016a59..49f1096999339 100644
+--- a/fs/jbd2/transaction.c
++++ b/fs/jbd2/transaction.c
+@@ -1486,8 +1486,6 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
+ struct journal_head *jh;
+ int ret = 0;
+
+- if (is_handle_aborted(handle))
+- return -EROFS;
+ if (!buffer_jbd(bh))
+ return -EUCLEAN;
+
+@@ -1534,6 +1532,18 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
+ journal = transaction->t_journal;
+ spin_lock(&jh->b_state_lock);
+
++ if (is_handle_aborted(handle)) {
++ /*
++ * Check journal aborting with @jh->b_state_lock locked,
++ * since 'jh->b_transaction' could be replaced with
++ * 'jh->b_next_transaction' during old transaction
++ * committing if journal aborted, which may fail
++ * assertion on 'jh->b_frozen_data == NULL'.
++ */
++ ret = -EROFS;
++ goto out_unlock_bh;
++ }
++
+ if (jh->b_modified == 0) {
+ /*
+ * This buffer's got modified and becoming part
+diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
+index 6eca72cfa1f28..1cc88ba6de907 100644
+--- a/fs/kernfs/dir.c
++++ b/fs/kernfs/dir.c
+@@ -1343,14 +1343,17 @@ static void __kernfs_remove(struct kernfs_node *kn)
+ {
+ struct kernfs_node *pos;
+
++ /* Short-circuit if non-root @kn has already finished removal. */
++ if (!kn)
++ return;
++
+ lockdep_assert_held_write(&kernfs_root(kn)->kernfs_rwsem);
+
+ /*
+- * Short-circuit if non-root @kn has already finished removal.
+ * This is for kernfs_remove_self() which plays with active ref
+ * after removal.
+ */
+- if (!kn || (kn->parent && RB_EMPTY_NODE(&kn->rb)))
++ if (kn->parent && RB_EMPTY_NODE(&kn->rb))
+ return;
+
+ pr_debug("kernfs %s: removing\n", kn->name);
+diff --git a/fs/ksmbd/connection.c b/fs/ksmbd/connection.c
+index bc6050b67256d..e8f476c5f1890 100644
+--- a/fs/ksmbd/connection.c
++++ b/fs/ksmbd/connection.c
+@@ -205,31 +205,31 @@ int ksmbd_conn_write(struct ksmbd_work *work)
+ return 0;
+ }
+
+-int ksmbd_conn_rdma_read(struct ksmbd_conn *conn, void *buf,
+- unsigned int buflen, u32 remote_key, u64 remote_offset,
+- u32 remote_len)
++int ksmbd_conn_rdma_read(struct ksmbd_conn *conn,
++ void *buf, unsigned int buflen,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len)
+ {
+ int ret = -EINVAL;
+
+ if (conn->transport->ops->rdma_read)
+ ret = conn->transport->ops->rdma_read(conn->transport,
+ buf, buflen,
+- remote_key, remote_offset,
+- remote_len);
++ desc, desc_len);
+ return ret;
+ }
+
+-int ksmbd_conn_rdma_write(struct ksmbd_conn *conn, void *buf,
+- unsigned int buflen, u32 remote_key,
+- u64 remote_offset, u32 remote_len)
++int ksmbd_conn_rdma_write(struct ksmbd_conn *conn,
++ void *buf, unsigned int buflen,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len)
+ {
+ int ret = -EINVAL;
+
+ if (conn->transport->ops->rdma_write)
+ ret = conn->transport->ops->rdma_write(conn->transport,
+ buf, buflen,
+- remote_key, remote_offset,
+- remote_len);
++ desc, desc_len);
+ return ret;
+ }
+
+diff --git a/fs/ksmbd/connection.h b/fs/ksmbd/connection.h
+index 7a59aacb5daa5..98c1cbe45ec97 100644
+--- a/fs/ksmbd/connection.h
++++ b/fs/ksmbd/connection.h
+@@ -122,11 +122,14 @@ struct ksmbd_transport_ops {
+ int (*writev)(struct ksmbd_transport *t, struct kvec *iovs, int niov,
+ int size, bool need_invalidate_rkey,
+ unsigned int remote_key);
+- int (*rdma_read)(struct ksmbd_transport *t, void *buf, unsigned int len,
+- u32 remote_key, u64 remote_offset, u32 remote_len);
+- int (*rdma_write)(struct ksmbd_transport *t, void *buf,
+- unsigned int len, u32 remote_key, u64 remote_offset,
+- u32 remote_len);
++ int (*rdma_read)(struct ksmbd_transport *t,
++ void *buf, unsigned int len,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len);
++ int (*rdma_write)(struct ksmbd_transport *t,
++ void *buf, unsigned int len,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len);
+ };
+
+ struct ksmbd_transport {
+@@ -148,12 +151,14 @@ struct ksmbd_conn *ksmbd_conn_alloc(void);
+ void ksmbd_conn_free(struct ksmbd_conn *conn);
+ bool ksmbd_conn_lookup_dialect(struct ksmbd_conn *c);
+ int ksmbd_conn_write(struct ksmbd_work *work);
+-int ksmbd_conn_rdma_read(struct ksmbd_conn *conn, void *buf,
+- unsigned int buflen, u32 remote_key, u64 remote_offset,
+- u32 remote_len);
+-int ksmbd_conn_rdma_write(struct ksmbd_conn *conn, void *buf,
+- unsigned int buflen, u32 remote_key, u64 remote_offset,
+- u32 remote_len);
++int ksmbd_conn_rdma_read(struct ksmbd_conn *conn,
++ void *buf, unsigned int buflen,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len);
++int ksmbd_conn_rdma_write(struct ksmbd_conn *conn,
++ void *buf, unsigned int buflen,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len);
+ void ksmbd_conn_enqueue_request(struct ksmbd_work *work);
+ int ksmbd_conn_try_dequeue_request(struct ksmbd_work *work);
+ void ksmbd_conn_init_server_callbacks(struct ksmbd_conn_ops *ops);
+diff --git a/fs/ksmbd/ksmbd_netlink.h b/fs/ksmbd/ksmbd_netlink.h
+index ebe6ca08467ac..52aa0adeb9519 100644
+--- a/fs/ksmbd/ksmbd_netlink.h
++++ b/fs/ksmbd/ksmbd_netlink.h
+@@ -104,7 +104,8 @@ struct ksmbd_startup_request {
+ */
+ __u32 sub_auth[3]; /* Subauth value for Security ID */
+ __u32 smb2_max_credits; /* MAX credits */
+- __u32 reserved[128]; /* Reserved room */
++ __u32 smbd_max_io_size; /* smbd read write size */
++ __u32 reserved[127]; /* Reserved room */
+ __u32 ifc_list_sz; /* interfaces list size */
+ __s8 ____payload[];
+ };
+diff --git a/fs/ksmbd/smb2misc.c b/fs/ksmbd/smb2misc.c
+index f8f456377a51d..6e25ace365684 100644
+--- a/fs/ksmbd/smb2misc.c
++++ b/fs/ksmbd/smb2misc.c
+@@ -90,11 +90,6 @@ static int smb2_get_data_area_len(unsigned int *off, unsigned int *len,
+ *off = 0;
+ *len = 0;
+
+- /* error reqeusts do not have data area */
+- if (hdr->Status && hdr->Status != STATUS_MORE_PROCESSING_REQUIRED &&
+- (((struct smb2_err_rsp *)hdr)->StructureSize) == SMB2_ERROR_STRUCTURE_SIZE2_LE)
+- return ret;
+-
+ /*
+ * Following commands have data areas so we have to get the location
+ * of the data buffer offset and data buffer length for the particular
+@@ -136,8 +131,11 @@ static int smb2_get_data_area_len(unsigned int *off, unsigned int *len,
+ *len = le16_to_cpu(((struct smb2_read_req *)hdr)->ReadChannelInfoLength);
+ break;
+ case SMB2_WRITE:
+- if (((struct smb2_write_req *)hdr)->DataOffset) {
+- *off = le16_to_cpu(((struct smb2_write_req *)hdr)->DataOffset);
++ if (((struct smb2_write_req *)hdr)->DataOffset ||
++ ((struct smb2_write_req *)hdr)->Length) {
++ *off = max_t(unsigned int,
++ le16_to_cpu(((struct smb2_write_req *)hdr)->DataOffset),
++ offsetof(struct smb2_write_req, Buffer));
+ *len = le32_to_cpu(((struct smb2_write_req *)hdr)->Length);
+ break;
+ }
+diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c
+index 31138e9be1dc2..85a9ed7156ea1 100644
+--- a/fs/ksmbd/smb2pdu.c
++++ b/fs/ksmbd/smb2pdu.c
+@@ -535,9 +535,10 @@ int smb2_allocate_rsp_buf(struct ksmbd_work *work)
+ struct smb2_query_info_req *req;
+
+ req = smb2_get_msg(work->request_buf);
+- if (req->InfoType == SMB2_O_INFO_FILE &&
+- (req->FileInfoClass == FILE_FULL_EA_INFORMATION ||
+- req->FileInfoClass == FILE_ALL_INFORMATION))
++ if ((req->InfoType == SMB2_O_INFO_FILE &&
++ (req->FileInfoClass == FILE_FULL_EA_INFORMATION ||
++ req->FileInfoClass == FILE_ALL_INFORMATION)) ||
++ req->InfoType == SMB2_O_INFO_SECURITY)
+ sz = large_sz;
+ }
+
+@@ -1139,12 +1140,16 @@ int smb2_handle_negotiate(struct ksmbd_work *work)
+ status);
+ rsp->hdr.Status = status;
+ rc = -EINVAL;
++ kfree(conn->preauth_info);
++ conn->preauth_info = NULL;
+ goto err_out;
+ }
+
+ rc = init_smb3_11_server(conn);
+ if (rc < 0) {
+ rsp->hdr.Status = STATUS_INVALID_PARAMETER;
++ kfree(conn->preauth_info);
++ conn->preauth_info = NULL;
+ goto err_out;
+ }
+
+@@ -2039,6 +2044,7 @@ int smb2_tree_disconnect(struct ksmbd_work *work)
+
+ ksmbd_close_tree_conn_fds(work);
+ ksmbd_tree_conn_disconnect(sess, tcon);
++ work->tcon = NULL;
+ return 0;
+ }
+
+@@ -2969,7 +2975,7 @@ int smb2_open(struct ksmbd_work *work)
+ goto err_out;
+
+ rc = build_sec_desc(user_ns,
+- pntsd, NULL,
++ pntsd, NULL, 0,
+ OWNER_SECINFO |
+ GROUP_SECINFO |
+ DACL_SECINFO,
+@@ -3814,6 +3820,15 @@ static int verify_info_level(int info_level)
+ return 0;
+ }
+
++static int smb2_resp_buf_len(struct ksmbd_work *work, unsigned short hdr2_len)
++{
++ int free_len;
++
++ free_len = (int)(work->response_sz -
++ (get_rfc1002_len(work->response_buf) + 4)) - hdr2_len;
++ return free_len;
++}
++
+ static int smb2_calc_max_out_buf_len(struct ksmbd_work *work,
+ unsigned short hdr2_len,
+ unsigned int out_buf_len)
+@@ -3823,9 +3838,7 @@ static int smb2_calc_max_out_buf_len(struct ksmbd_work *work,
+ if (out_buf_len > work->conn->vals->max_trans_size)
+ return -EINVAL;
+
+- free_len = (int)(work->response_sz -
+- (get_rfc1002_len(work->response_buf) + 4)) -
+- hdr2_len;
++ free_len = smb2_resp_buf_len(work, hdr2_len);
+ if (free_len < 0)
+ return -EINVAL;
+
+@@ -5080,10 +5093,10 @@ static int smb2_get_info_sec(struct ksmbd_work *work,
+ struct smb_ntsd *pntsd = (struct smb_ntsd *)rsp->Buffer, *ppntsd = NULL;
+ struct smb_fattr fattr = {{0}};
+ struct inode *inode;
+- __u32 secdesclen;
++ __u32 secdesclen = 0;
+ unsigned int id = KSMBD_NO_FID, pid = KSMBD_NO_FID;
+ int addition_info = le32_to_cpu(req->AdditionalInformation);
+- int rc;
++ int rc = 0, ppntsd_size = 0;
+
+ if (addition_info & ~(OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO |
+ PROTECTED_DACL_SECINFO |
+@@ -5129,11 +5142,14 @@ static int smb2_get_info_sec(struct ksmbd_work *work,
+
+ if (test_share_config_flag(work->tcon->share_conf,
+ KSMBD_SHARE_FLAG_ACL_XATTR))
+- ksmbd_vfs_get_sd_xattr(work->conn, user_ns,
+- fp->filp->f_path.dentry, &ppntsd);
+-
+- rc = build_sec_desc(user_ns, pntsd, ppntsd, addition_info,
+- &secdesclen, &fattr);
++ ppntsd_size = ksmbd_vfs_get_sd_xattr(work->conn, user_ns,
++ fp->filp->f_path.dentry,
++ &ppntsd);
++
++ /* Check if sd buffer size exceeds response buffer size */
++ if (smb2_resp_buf_len(work, 8) > ppntsd_size)
++ rc = build_sec_desc(user_ns, pntsd, ppntsd, ppntsd_size,
++ addition_info, &secdesclen, &fattr);
+ posix_acl_release(fattr.cf_acls);
+ posix_acl_release(fattr.cf_dacls);
+ kfree(ppntsd);
+@@ -6116,7 +6132,6 @@ out:
+ static int smb2_set_remote_key_for_rdma(struct ksmbd_work *work,
+ struct smb2_buffer_desc_v1 *desc,
+ __le32 Channel,
+- __le16 ChannelInfoOffset,
+ __le16 ChannelInfoLength)
+ {
+ unsigned int i, ch_count;
+@@ -6142,7 +6157,8 @@ static int smb2_set_remote_key_for_rdma(struct ksmbd_work *work,
+
+ work->need_invalidate_rkey =
+ (Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE);
+- work->remote_key = le32_to_cpu(desc->token);
++ if (Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE)
++ work->remote_key = le32_to_cpu(desc->token);
+ return 0;
+ }
+
+@@ -6150,14 +6166,12 @@ static ssize_t smb2_read_rdma_channel(struct ksmbd_work *work,
+ struct smb2_read_req *req, void *data_buf,
+ size_t length)
+ {
+- struct smb2_buffer_desc_v1 *desc =
+- (struct smb2_buffer_desc_v1 *)&req->Buffer[0];
+ int err;
+
+ err = ksmbd_conn_rdma_write(work->conn, data_buf, length,
+- le32_to_cpu(desc->token),
+- le64_to_cpu(desc->offset),
+- le32_to_cpu(desc->length));
++ (struct smb2_buffer_desc_v1 *)
++ ((char *)req + le16_to_cpu(req->ReadChannelInfoOffset)),
++ le16_to_cpu(req->ReadChannelInfoLength));
+ if (err)
+ return err;
+
+@@ -6180,6 +6194,8 @@ int smb2_read(struct ksmbd_work *work)
+ size_t length, mincount;
+ ssize_t nbytes = 0, remain_bytes = 0;
+ int err = 0;
++ bool is_rdma_channel = false;
++ unsigned int max_read_size = conn->vals->max_read_size;
+
+ WORK_BUFFERS(work, req, rsp);
+
+@@ -6191,6 +6207,11 @@ int smb2_read(struct ksmbd_work *work)
+
+ if (req->Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE ||
+ req->Channel == SMB2_CHANNEL_RDMA_V1) {
++ is_rdma_channel = true;
++ max_read_size = get_smbd_max_read_write_size();
++ }
++
++ if (is_rdma_channel == true) {
+ unsigned int ch_offset = le16_to_cpu(req->ReadChannelInfoOffset);
+
+ if (ch_offset < offsetof(struct smb2_read_req, Buffer)) {
+@@ -6201,7 +6222,6 @@ int smb2_read(struct ksmbd_work *work)
+ (struct smb2_buffer_desc_v1 *)
+ ((char *)req + ch_offset),
+ req->Channel,
+- req->ReadChannelInfoOffset,
+ req->ReadChannelInfoLength);
+ if (err)
+ goto out;
+@@ -6223,9 +6243,9 @@ int smb2_read(struct ksmbd_work *work)
+ length = le32_to_cpu(req->Length);
+ mincount = le32_to_cpu(req->MinimumCount);
+
+- if (length > conn->vals->max_read_size) {
++ if (length > max_read_size) {
+ ksmbd_debug(SMB, "limiting read size to max size(%u)\n",
+- conn->vals->max_read_size);
++ max_read_size);
+ err = -EINVAL;
+ goto out;
+ }
+@@ -6257,8 +6277,7 @@ int smb2_read(struct ksmbd_work *work)
+ ksmbd_debug(SMB, "nbytes %zu, offset %lld mincount %zu\n",
+ nbytes, offset, mincount);
+
+- if (req->Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE ||
+- req->Channel == SMB2_CHANNEL_RDMA_V1) {
++ if (is_rdma_channel == true) {
+ /* write data to the client using rdma channel */
+ remain_bytes = smb2_read_rdma_channel(work, req,
+ work->aux_payload_buf,
+@@ -6328,23 +6347,18 @@ static noinline int smb2_write_pipe(struct ksmbd_work *work)
+ length = le32_to_cpu(req->Length);
+ id = req->VolatileFileId;
+
+- if (le16_to_cpu(req->DataOffset) ==
+- offsetof(struct smb2_write_req, Buffer)) {
+- data_buf = (char *)&req->Buffer[0];
+- } else {
+- if ((u64)le16_to_cpu(req->DataOffset) + length >
+- get_rfc1002_len(work->request_buf)) {
+- pr_err("invalid write data offset %u, smb_len %u\n",
+- le16_to_cpu(req->DataOffset),
+- get_rfc1002_len(work->request_buf));
+- err = -EINVAL;
+- goto out;
+- }
+-
+- data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
+- le16_to_cpu(req->DataOffset));
++ if ((u64)le16_to_cpu(req->DataOffset) + length >
++ get_rfc1002_len(work->request_buf)) {
++ pr_err("invalid write data offset %u, smb_len %u\n",
++ le16_to_cpu(req->DataOffset),
++ get_rfc1002_len(work->request_buf));
++ err = -EINVAL;
++ goto out;
+ }
+
++ data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
++ le16_to_cpu(req->DataOffset));
++
+ rpc_resp = ksmbd_rpc_write(work->sess, id, data_buf, length);
+ if (rpc_resp) {
+ if (rpc_resp->flags == KSMBD_RPC_ENOTIMPLEMENTED) {
+@@ -6384,21 +6398,18 @@ static ssize_t smb2_write_rdma_channel(struct ksmbd_work *work,
+ struct ksmbd_file *fp,
+ loff_t offset, size_t length, bool sync)
+ {
+- struct smb2_buffer_desc_v1 *desc;
+ char *data_buf;
+ int ret;
+ ssize_t nbytes;
+
+- desc = (struct smb2_buffer_desc_v1 *)&req->Buffer[0];
+-
+ data_buf = kvmalloc(length, GFP_KERNEL | __GFP_ZERO);
+ if (!data_buf)
+ return -ENOMEM;
+
+ ret = ksmbd_conn_rdma_read(work->conn, data_buf, length,
+- le32_to_cpu(desc->token),
+- le64_to_cpu(desc->offset),
+- le32_to_cpu(desc->length));
++ (struct smb2_buffer_desc_v1 *)
++ ((char *)req + le16_to_cpu(req->WriteChannelInfoOffset)),
++ le16_to_cpu(req->WriteChannelInfoLength));
+ if (ret < 0) {
+ kvfree(data_buf);
+ return ret;
+@@ -6427,8 +6438,9 @@ int smb2_write(struct ksmbd_work *work)
+ size_t length;
+ ssize_t nbytes;
+ char *data_buf;
+- bool writethrough = false;
++ bool writethrough = false, is_rdma_channel = false;
+ int err = 0;
++ unsigned int max_write_size = work->conn->vals->max_write_size;
+
+ WORK_BUFFERS(work, req, rsp);
+
+@@ -6437,8 +6449,17 @@ int smb2_write(struct ksmbd_work *work)
+ return smb2_write_pipe(work);
+ }
+
++ offset = le64_to_cpu(req->Offset);
++ length = le32_to_cpu(req->Length);
++
+ if (req->Channel == SMB2_CHANNEL_RDMA_V1 ||
+ req->Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE) {
++ is_rdma_channel = true;
++ max_write_size = get_smbd_max_read_write_size();
++ length = le32_to_cpu(req->RemainingBytes);
++ }
++
++ if (is_rdma_channel == true) {
+ unsigned int ch_offset = le16_to_cpu(req->WriteChannelInfoOffset);
+
+ if (req->Length != 0 || req->DataOffset != 0 ||
+@@ -6450,7 +6471,6 @@ int smb2_write(struct ksmbd_work *work)
+ (struct smb2_buffer_desc_v1 *)
+ ((char *)req + ch_offset),
+ req->Channel,
+- req->WriteChannelInfoOffset,
+ req->WriteChannelInfoLength);
+ if (err)
+ goto out;
+@@ -6474,12 +6494,9 @@ int smb2_write(struct ksmbd_work *work)
+ goto out;
+ }
+
+- offset = le64_to_cpu(req->Offset);
+- length = le32_to_cpu(req->Length);
+-
+- if (length > work->conn->vals->max_write_size) {
++ if (length > max_write_size) {
+ ksmbd_debug(SMB, "limiting write size to max size(%u)\n",
+- work->conn->vals->max_write_size);
++ max_write_size);
+ err = -EINVAL;
+ goto out;
+ }
+@@ -6487,25 +6504,16 @@ int smb2_write(struct ksmbd_work *work)
+ if (le32_to_cpu(req->Flags) & SMB2_WRITEFLAG_WRITE_THROUGH)
+ writethrough = true;
+
+- if (req->Channel != SMB2_CHANNEL_RDMA_V1 &&
+- req->Channel != SMB2_CHANNEL_RDMA_V1_INVALIDATE) {
+- if (le16_to_cpu(req->DataOffset) ==
++ if (is_rdma_channel == false) {
++ if (le16_to_cpu(req->DataOffset) <
+ offsetof(struct smb2_write_req, Buffer)) {
+- data_buf = (char *)&req->Buffer[0];
+- } else {
+- if ((u64)le16_to_cpu(req->DataOffset) + length >
+- get_rfc1002_len(work->request_buf)) {
+- pr_err("invalid write data offset %u, smb_len %u\n",
+- le16_to_cpu(req->DataOffset),
+- get_rfc1002_len(work->request_buf));
+- err = -EINVAL;
+- goto out;
+- }
+-
+- data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
+- le16_to_cpu(req->DataOffset));
++ err = -EINVAL;
++ goto out;
+ }
+
++ data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
++ le16_to_cpu(req->DataOffset));
++
+ ksmbd_debug(SMB, "flags %u\n", le32_to_cpu(req->Flags));
+ if (le32_to_cpu(req->Flags) & SMB2_WRITEFLAG_WRITE_THROUGH)
+ writethrough = true;
+@@ -6520,8 +6528,7 @@ int smb2_write(struct ksmbd_work *work)
+ /* read data from the client using rdma channel, and
+ * write the data.
+ */
+- nbytes = smb2_write_rdma_channel(work, req, fp, offset,
+- le32_to_cpu(req->RemainingBytes),
++ nbytes = smb2_write_rdma_channel(work, req, fp, offset, length,
+ writethrough);
+ if (nbytes < 0) {
+ err = (int)nbytes;
+diff --git a/fs/ksmbd/smbacl.c b/fs/ksmbd/smbacl.c
+index 38f23bf981ac9..3781bca2c8fc4 100644
+--- a/fs/ksmbd/smbacl.c
++++ b/fs/ksmbd/smbacl.c
+@@ -690,6 +690,7 @@ posix_default_acl:
+ static void set_ntacl_dacl(struct user_namespace *user_ns,
+ struct smb_acl *pndacl,
+ struct smb_acl *nt_dacl,
++ unsigned int aces_size,
+ const struct smb_sid *pownersid,
+ const struct smb_sid *pgrpsid,
+ struct smb_fattr *fattr)
+@@ -703,9 +704,19 @@ static void set_ntacl_dacl(struct user_namespace *user_ns,
+ if (nt_num_aces) {
+ ntace = (struct smb_ace *)((char *)nt_dacl + sizeof(struct smb_acl));
+ for (i = 0; i < nt_num_aces; i++) {
+- memcpy((char *)pndace + size, ntace, le16_to_cpu(ntace->size));
+- size += le16_to_cpu(ntace->size);
+- ntace = (struct smb_ace *)((char *)ntace + le16_to_cpu(ntace->size));
++ unsigned short nt_ace_size;
++
++ if (offsetof(struct smb_ace, access_req) > aces_size)
++ break;
++
++ nt_ace_size = le16_to_cpu(ntace->size);
++ if (nt_ace_size > aces_size)
++ break;
++
++ memcpy((char *)pndace + size, ntace, nt_ace_size);
++ size += nt_ace_size;
++ aces_size -= nt_ace_size;
++ ntace = (struct smb_ace *)((char *)ntace + nt_ace_size);
+ num_aces++;
+ }
+ }
+@@ -878,7 +889,7 @@ int parse_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd,
+ /* Convert permission bits from mode to equivalent CIFS ACL */
+ int build_sec_desc(struct user_namespace *user_ns,
+ struct smb_ntsd *pntsd, struct smb_ntsd *ppntsd,
+- int addition_info, __u32 *secdesclen,
++ int ppntsd_size, int addition_info, __u32 *secdesclen,
+ struct smb_fattr *fattr)
+ {
+ int rc = 0;
+@@ -938,15 +949,25 @@ int build_sec_desc(struct user_namespace *user_ns,
+
+ if (!ppntsd) {
+ set_mode_dacl(user_ns, dacl_ptr, fattr);
+- } else if (!ppntsd->dacloffset) {
+- goto out;
+ } else {
+ struct smb_acl *ppdacl_ptr;
++ unsigned int dacl_offset = le32_to_cpu(ppntsd->dacloffset);
++ int ppdacl_size, ntacl_size = ppntsd_size - dacl_offset;
++
++ if (!dacl_offset ||
++ (dacl_offset + sizeof(struct smb_acl) > ppntsd_size))
++ goto out;
++
++ ppdacl_ptr = (struct smb_acl *)((char *)ppntsd + dacl_offset);
++ ppdacl_size = le16_to_cpu(ppdacl_ptr->size);
++ if (ppdacl_size > ntacl_size ||
++ ppdacl_size < sizeof(struct smb_acl))
++ goto out;
+
+- ppdacl_ptr = (struct smb_acl *)((char *)ppntsd +
+- le32_to_cpu(ppntsd->dacloffset));
+ set_ntacl_dacl(user_ns, dacl_ptr, ppdacl_ptr,
+- nowner_sid_ptr, ngroup_sid_ptr, fattr);
++ ntacl_size - sizeof(struct smb_acl),
++ nowner_sid_ptr, ngroup_sid_ptr,
++ fattr);
+ }
+ pntsd->dacloffset = cpu_to_le32(offset);
+ offset += le16_to_cpu(dacl_ptr->size);
+@@ -980,24 +1001,31 @@ int smb_inherit_dacl(struct ksmbd_conn *conn,
+ struct smb_sid owner_sid, group_sid;
+ struct dentry *parent = path->dentry->d_parent;
+ struct user_namespace *user_ns = mnt_user_ns(path->mnt);
+- int inherited_flags = 0, flags = 0, i, ace_cnt = 0, nt_size = 0;
+- int rc = 0, num_aces, dacloffset, pntsd_type, acl_len;
++ int inherited_flags = 0, flags = 0, i, ace_cnt = 0, nt_size = 0, pdacl_size;
++ int rc = 0, num_aces, dacloffset, pntsd_type, pntsd_size, acl_len, aces_size;
+ char *aces_base;
+ bool is_dir = S_ISDIR(d_inode(path->dentry)->i_mode);
+
+- acl_len = ksmbd_vfs_get_sd_xattr(conn, user_ns,
+- parent, &parent_pntsd);
+- if (acl_len <= 0)
++ pntsd_size = ksmbd_vfs_get_sd_xattr(conn, user_ns,
++ parent, &parent_pntsd);
++ if (pntsd_size <= 0)
+ return -ENOENT;
+ dacloffset = le32_to_cpu(parent_pntsd->dacloffset);
+- if (!dacloffset) {
++ if (!dacloffset || (dacloffset + sizeof(struct smb_acl) > pntsd_size)) {
+ rc = -EINVAL;
+ goto free_parent_pntsd;
+ }
+
+ parent_pdacl = (struct smb_acl *)((char *)parent_pntsd + dacloffset);
++ acl_len = pntsd_size - dacloffset;
+ num_aces = le32_to_cpu(parent_pdacl->num_aces);
+ pntsd_type = le16_to_cpu(parent_pntsd->type);
++ pdacl_size = le16_to_cpu(parent_pdacl->size);
++
++ if (pdacl_size > acl_len || pdacl_size < sizeof(struct smb_acl)) {
++ rc = -EINVAL;
++ goto free_parent_pntsd;
++ }
+
+ aces_base = kmalloc(sizeof(struct smb_ace) * num_aces * 2, GFP_KERNEL);
+ if (!aces_base) {
+@@ -1008,11 +1036,23 @@ int smb_inherit_dacl(struct ksmbd_conn *conn,
+ aces = (struct smb_ace *)aces_base;
+ parent_aces = (struct smb_ace *)((char *)parent_pdacl +
+ sizeof(struct smb_acl));
++ aces_size = acl_len - sizeof(struct smb_acl);
+
+ if (pntsd_type & DACL_AUTO_INHERITED)
+ inherited_flags = INHERITED_ACE;
+
+ for (i = 0; i < num_aces; i++) {
++ int pace_size;
++
++ if (offsetof(struct smb_ace, access_req) > aces_size)
++ break;
++
++ pace_size = le16_to_cpu(parent_aces->size);
++ if (pace_size > aces_size)
++ break;
++
++ aces_size -= pace_size;
++
+ flags = parent_aces->flags;
+ if (!smb_inherit_flags(flags, is_dir))
+ goto pass;
+@@ -1057,8 +1097,7 @@ int smb_inherit_dacl(struct ksmbd_conn *conn,
+ aces = (struct smb_ace *)((char *)aces + le16_to_cpu(aces->size));
+ ace_cnt++;
+ pass:
+- parent_aces =
+- (struct smb_ace *)((char *)parent_aces + le16_to_cpu(parent_aces->size));
++ parent_aces = (struct smb_ace *)((char *)parent_aces + pace_size);
+ }
+
+ if (nt_size > 0) {
+@@ -1153,7 +1192,7 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
+ struct smb_ntsd *pntsd = NULL;
+ struct smb_acl *pdacl;
+ struct posix_acl *posix_acls;
+- int rc = 0, acl_size;
++ int rc = 0, pntsd_size, acl_size, aces_size, pdacl_size, dacl_offset;
+ struct smb_sid sid;
+ int granted = le32_to_cpu(*pdaccess & ~FILE_MAXIMAL_ACCESS_LE);
+ struct smb_ace *ace;
+@@ -1162,37 +1201,33 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
+ struct smb_ace *others_ace = NULL;
+ struct posix_acl_entry *pa_entry;
+ unsigned int sid_type = SIDOWNER;
+- char *end_of_acl;
++ unsigned short ace_size;
+
+ ksmbd_debug(SMB, "check permission using windows acl\n");
+- acl_size = ksmbd_vfs_get_sd_xattr(conn, user_ns,
+- path->dentry, &pntsd);
+- if (acl_size <= 0 || !pntsd || !pntsd->dacloffset) {
+- kfree(pntsd);
+- return 0;
+- }
++ pntsd_size = ksmbd_vfs_get_sd_xattr(conn, user_ns,
++ path->dentry, &pntsd);
++ if (pntsd_size <= 0 || !pntsd)
++ goto err_out;
++
++ dacl_offset = le32_to_cpu(pntsd->dacloffset);
++ if (!dacl_offset ||
++ (dacl_offset + sizeof(struct smb_acl) > pntsd_size))
++ goto err_out;
+
+ pdacl = (struct smb_acl *)((char *)pntsd + le32_to_cpu(pntsd->dacloffset));
+- end_of_acl = ((char *)pntsd) + acl_size;
+- if (end_of_acl <= (char *)pdacl) {
+- kfree(pntsd);
+- return 0;
+- }
++ acl_size = pntsd_size - dacl_offset;
++ pdacl_size = le16_to_cpu(pdacl->size);
+
+- if (end_of_acl < (char *)pdacl + le16_to_cpu(pdacl->size) ||
+- le16_to_cpu(pdacl->size) < sizeof(struct smb_acl)) {
+- kfree(pntsd);
+- return 0;
+- }
++ if (pdacl_size > acl_size || pdacl_size < sizeof(struct smb_acl))
++ goto err_out;
+
+ if (!pdacl->num_aces) {
+- if (!(le16_to_cpu(pdacl->size) - sizeof(struct smb_acl)) &&
++ if (!(pdacl_size - sizeof(struct smb_acl)) &&
+ *pdaccess & ~(FILE_READ_CONTROL_LE | FILE_WRITE_DAC_LE)) {
+ rc = -EACCES;
+ goto err_out;
+ }
+- kfree(pntsd);
+- return 0;
++ goto err_out;
+ }
+
+ if (*pdaccess & FILE_MAXIMAL_ACCESS_LE) {
+@@ -1200,11 +1235,16 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
+ DELETE;
+
+ ace = (struct smb_ace *)((char *)pdacl + sizeof(struct smb_acl));
++ aces_size = acl_size - sizeof(struct smb_acl);
+ for (i = 0; i < le32_to_cpu(pdacl->num_aces); i++) {
++ if (offsetof(struct smb_ace, access_req) > aces_size)
++ break;
++ ace_size = le16_to_cpu(ace->size);
++ if (ace_size > aces_size)
++ break;
++ aces_size -= ace_size;
+ granted |= le32_to_cpu(ace->access_req);
+ ace = (struct smb_ace *)((char *)ace + le16_to_cpu(ace->size));
+- if (end_of_acl < (char *)ace)
+- goto err_out;
+ }
+
+ if (!pdacl->num_aces)
+@@ -1216,7 +1256,15 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
+ id_to_sid(uid, sid_type, &sid);
+
+ ace = (struct smb_ace *)((char *)pdacl + sizeof(struct smb_acl));
++ aces_size = acl_size - sizeof(struct smb_acl);
+ for (i = 0; i < le32_to_cpu(pdacl->num_aces); i++) {
++ if (offsetof(struct smb_ace, access_req) > aces_size)
++ break;
++ ace_size = le16_to_cpu(ace->size);
++ if (ace_size > aces_size)
++ break;
++ aces_size -= ace_size;
++
+ if (!compare_sids(&sid, &ace->sid) ||
+ !compare_sids(&sid_unix_NFS_mode, &ace->sid)) {
+ found = 1;
+@@ -1226,8 +1274,6 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
+ others_ace = ace;
+
+ ace = (struct smb_ace *)((char *)ace + le16_to_cpu(ace->size));
+- if (end_of_acl < (char *)ace)
+- goto err_out;
+ }
+
+ if (*pdaccess & FILE_MAXIMAL_ACCESS_LE && found) {
+diff --git a/fs/ksmbd/smbacl.h b/fs/ksmbd/smbacl.h
+index 811af33094291..fcb2c83f29928 100644
+--- a/fs/ksmbd/smbacl.h
++++ b/fs/ksmbd/smbacl.h
+@@ -193,7 +193,7 @@ struct posix_acl_state {
+ int parse_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd,
+ int acl_len, struct smb_fattr *fattr);
+ int build_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd,
+- struct smb_ntsd *ppntsd, int addition_info,
++ struct smb_ntsd *ppntsd, int ppntsd_size, int addition_info,
+ __u32 *secdesclen, struct smb_fattr *fattr);
+ int init_acl_state(struct posix_acl_state *state, int cnt);
+ void free_acl_state(struct posix_acl_state *state);
+diff --git a/fs/ksmbd/transport_ipc.c b/fs/ksmbd/transport_ipc.c
+index 3ad6881e0f7ed..7cb0eeb07c808 100644
+--- a/fs/ksmbd/transport_ipc.c
++++ b/fs/ksmbd/transport_ipc.c
+@@ -26,6 +26,7 @@
+ #include "mgmt/ksmbd_ida.h"
+ #include "connection.h"
+ #include "transport_tcp.h"
++#include "transport_rdma.h"
+
+ #define IPC_WAIT_TIMEOUT (2 * HZ)
+
+@@ -303,6 +304,8 @@ static int ipc_server_config_on_startup(struct ksmbd_startup_request *req)
+ init_smb2_max_trans_size(req->smb2_max_trans);
+ if (req->smb2_max_credits)
+ init_smb2_max_credits(req->smb2_max_credits);
++ if (req->smbd_max_io_size)
++ init_smbd_max_io_size(req->smbd_max_io_size);
+
+ ret = ksmbd_set_netbios_name(req->netbios_name);
+ ret |= ksmbd_set_server_string(req->server_string);
+diff --git a/fs/ksmbd/transport_rdma.c b/fs/ksmbd/transport_rdma.c
+index 3f5d135716945..c6af8d89b7f74 100644
+--- a/fs/ksmbd/transport_rdma.c
++++ b/fs/ksmbd/transport_rdma.c
+@@ -80,9 +80,7 @@ static int smb_direct_max_fragmented_recv_size = 1024 * 1024;
+ /* The maximum single-message size which can be received */
+ static int smb_direct_max_receive_size = 8192;
+
+-static int smb_direct_max_read_write_size = 524224;
+-
+-static int smb_direct_max_outstanding_rw_ops = 8;
++static int smb_direct_max_read_write_size = SMBD_DEFAULT_IOSIZE;
+
+ static LIST_HEAD(smb_direct_device_list);
+ static DEFINE_RWLOCK(smb_direct_device_lock);
+@@ -147,10 +145,12 @@ struct smb_direct_transport {
+ atomic_t send_credits;
+ spinlock_t lock_new_recv_credits;
+ int new_recv_credits;
+- atomic_t rw_avail_ops;
++ int max_rw_credits;
++ int pages_per_rw_credit;
++ atomic_t rw_credits;
+
+ wait_queue_head_t wait_send_credits;
+- wait_queue_head_t wait_rw_avail_ops;
++ wait_queue_head_t wait_rw_credits;
+
+ mempool_t *sendmsg_mempool;
+ struct kmem_cache *sendmsg_cache;
+@@ -214,6 +214,17 @@ struct smb_direct_rdma_rw_msg {
+ struct scatterlist sg_list[];
+ };
+
++void init_smbd_max_io_size(unsigned int sz)
++{
++ sz = clamp_val(sz, SMBD_MIN_IOSIZE, SMBD_MAX_IOSIZE);
++ smb_direct_max_read_write_size = sz;
++}
++
++unsigned int get_smbd_max_read_write_size(void)
++{
++ return smb_direct_max_read_write_size;
++}
++
+ static inline int get_buf_page_count(void *buf, int size)
+ {
+ return DIV_ROUND_UP((uintptr_t)buf + size, PAGE_SIZE) -
+@@ -377,7 +388,7 @@ static struct smb_direct_transport *alloc_transport(struct rdma_cm_id *cm_id)
+ t->reassembly_queue_length = 0;
+ init_waitqueue_head(&t->wait_reassembly_queue);
+ init_waitqueue_head(&t->wait_send_credits);
+- init_waitqueue_head(&t->wait_rw_avail_ops);
++ init_waitqueue_head(&t->wait_rw_credits);
+
+ spin_lock_init(&t->receive_credit_lock);
+ spin_lock_init(&t->recvmsg_queue_lock);
+@@ -984,18 +995,19 @@ static int smb_direct_flush_send_list(struct smb_direct_transport *t,
+ }
+
+ static int wait_for_credits(struct smb_direct_transport *t,
+- wait_queue_head_t *waitq, atomic_t *credits)
++ wait_queue_head_t *waitq, atomic_t *total_credits,
++ int needed)
+ {
+ int ret;
+
+ do {
+- if (atomic_dec_return(credits) >= 0)
++ if (atomic_sub_return(needed, total_credits) >= 0)
+ return 0;
+
+- atomic_inc(credits);
++ atomic_add(needed, total_credits);
+ ret = wait_event_interruptible(*waitq,
+- atomic_read(credits) > 0 ||
+- t->status != SMB_DIRECT_CS_CONNECTED);
++ atomic_read(total_credits) >= needed ||
++ t->status != SMB_DIRECT_CS_CONNECTED);
+
+ if (t->status != SMB_DIRECT_CS_CONNECTED)
+ return -ENOTCONN;
+@@ -1016,7 +1028,19 @@ static int wait_for_send_credits(struct smb_direct_transport *t,
+ return ret;
+ }
+
+- return wait_for_credits(t, &t->wait_send_credits, &t->send_credits);
++ return wait_for_credits(t, &t->wait_send_credits, &t->send_credits, 1);
++}
++
++static int wait_for_rw_credits(struct smb_direct_transport *t, int credits)
++{
++ return wait_for_credits(t, &t->wait_rw_credits, &t->rw_credits, credits);
++}
++
++static int calc_rw_credits(struct smb_direct_transport *t,
++ char *buf, unsigned int len)
++{
++ return DIV_ROUND_UP(get_buf_page_count(buf, len),
++ t->pages_per_rw_credit);
+ }
+
+ static int smb_direct_create_header(struct smb_direct_transport *t,
+@@ -1332,8 +1356,8 @@ static void read_write_done(struct ib_cq *cq, struct ib_wc *wc,
+ smb_direct_disconnect_rdma_connection(t);
+ }
+
+- if (atomic_inc_return(&t->rw_avail_ops) > 0)
+- wake_up(&t->wait_rw_avail_ops);
++ if (atomic_inc_return(&t->rw_credits) > 0)
++ wake_up(&t->wait_rw_credits);
+
+ rdma_rw_ctx_destroy(&msg->rw_ctx, t->qp, t->qp->port,
+ msg->sg_list, msg->sgt.nents, dir);
+@@ -1352,16 +1376,22 @@ static void write_done(struct ib_cq *cq, struct ib_wc *wc)
+ read_write_done(cq, wc, DMA_TO_DEVICE);
+ }
+
+-static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
+- int buf_len, u32 remote_key, u64 remote_offset,
+- u32 remote_len, bool is_read)
++static int smb_direct_rdma_xmit(struct smb_direct_transport *t,
++ void *buf, int buf_len,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len,
++ bool is_read)
+ {
+ struct smb_direct_rdma_rw_msg *msg;
+ int ret;
+ DECLARE_COMPLETION_ONSTACK(completion);
+ struct ib_send_wr *first_wr = NULL;
++ u32 remote_key = le32_to_cpu(desc[0].token);
++ u64 remote_offset = le64_to_cpu(desc[0].offset);
++ int credits_needed;
+
+- ret = wait_for_credits(t, &t->wait_rw_avail_ops, &t->rw_avail_ops);
++ credits_needed = calc_rw_credits(t, buf, buf_len);
++ ret = wait_for_rw_credits(t, credits_needed);
+ if (ret < 0)
+ return ret;
+
+@@ -1369,7 +1399,7 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
+ msg = kmalloc(offsetof(struct smb_direct_rdma_rw_msg, sg_list) +
+ sizeof(struct scatterlist) * SG_CHUNK_SIZE, GFP_KERNEL);
+ if (!msg) {
+- atomic_inc(&t->rw_avail_ops);
++ atomic_add(credits_needed, &t->rw_credits);
+ return -ENOMEM;
+ }
+
+@@ -1378,7 +1408,7 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
+ get_buf_page_count(buf, buf_len),
+ msg->sg_list, SG_CHUNK_SIZE);
+ if (ret) {
+- atomic_inc(&t->rw_avail_ops);
++ atomic_add(credits_needed, &t->rw_credits);
+ kfree(msg);
+ return -ENOMEM;
+ }
+@@ -1414,7 +1444,7 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
+ return 0;
+
+ err:
+- atomic_inc(&t->rw_avail_ops);
++ atomic_add(credits_needed, &t->rw_credits);
+ if (first_wr)
+ rdma_rw_ctx_destroy(&msg->rw_ctx, t->qp, t->qp->port,
+ msg->sg_list, msg->sgt.nents,
+@@ -1424,22 +1454,22 @@ err:
+ return ret;
+ }
+
+-static int smb_direct_rdma_write(struct ksmbd_transport *t, void *buf,
+- unsigned int buflen, u32 remote_key,
+- u64 remote_offset, u32 remote_len)
++static int smb_direct_rdma_write(struct ksmbd_transport *t,
++ void *buf, unsigned int buflen,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len)
+ {
+ return smb_direct_rdma_xmit(smb_trans_direct_transfort(t), buf, buflen,
+- remote_key, remote_offset,
+- remote_len, false);
++ desc, desc_len, false);
+ }
+
+-static int smb_direct_rdma_read(struct ksmbd_transport *t, void *buf,
+- unsigned int buflen, u32 remote_key,
+- u64 remote_offset, u32 remote_len)
++static int smb_direct_rdma_read(struct ksmbd_transport *t,
++ void *buf, unsigned int buflen,
++ struct smb2_buffer_desc_v1 *desc,
++ unsigned int desc_len)
+ {
+ return smb_direct_rdma_xmit(smb_trans_direct_transfort(t), buf, buflen,
+- remote_key, remote_offset,
+- remote_len, true);
++ desc, desc_len, true);
+ }
+
+ static void smb_direct_disconnect(struct ksmbd_transport *t)
+@@ -1639,11 +1669,19 @@ out_err:
+ return ret;
+ }
+
++static unsigned int smb_direct_get_max_fr_pages(struct smb_direct_transport *t)
++{
++ return min_t(unsigned int,
++ t->cm_id->device->attrs.max_fast_reg_page_list_len,
++ 256);
++}
++
+ static int smb_direct_init_params(struct smb_direct_transport *t,
+ struct ib_qp_cap *cap)
+ {
+ struct ib_device *device = t->cm_id->device;
+- int max_send_sges, max_pages, max_rw_wrs, max_send_wrs;
++ int max_send_sges, max_rw_wrs, max_send_wrs;
++ unsigned int max_sge_per_wr, wrs_per_credit;
+
+ /* need 2 more sge. because a SMB_DIRECT header will be mapped,
+ * and maybe a send buffer could be not page aligned.
+@@ -1655,25 +1693,31 @@ static int smb_direct_init_params(struct smb_direct_transport *t,
+ return -EINVAL;
+ }
+
+- /*
+- * allow smb_direct_max_outstanding_rw_ops of in-flight RDMA
+- * read/writes. HCA guarantees at least max_send_sge of sges for
+- * a RDMA read/write work request, and if memory registration is used,
+- * we need reg_mr, local_inv wrs for each read/write.
++ /* Calculate the number of work requests for RDMA R/W.
++ * The maximum number of pages which can be registered
++ * with one Memory region can be transferred with one
++ * R/W credit. And at least 4 work requests for each credit
++ * are needed for MR registration, RDMA R/W, local & remote
++ * MR invalidation.
+ */
+ t->max_rdma_rw_size = smb_direct_max_read_write_size;
+- max_pages = DIV_ROUND_UP(t->max_rdma_rw_size, PAGE_SIZE) + 1;
+- max_rw_wrs = DIV_ROUND_UP(max_pages, SMB_DIRECT_MAX_SEND_SGES);
+- max_rw_wrs += rdma_rw_mr_factor(device, t->cm_id->port_num,
+- max_pages) * 2;
+- max_rw_wrs *= smb_direct_max_outstanding_rw_ops;
++ t->pages_per_rw_credit = smb_direct_get_max_fr_pages(t);
++ t->max_rw_credits = DIV_ROUND_UP(t->max_rdma_rw_size,
++ (t->pages_per_rw_credit - 1) *
++ PAGE_SIZE);
++
++ max_sge_per_wr = min_t(unsigned int, device->attrs.max_send_sge,
++ device->attrs.max_sge_rd);
++ wrs_per_credit = max_t(unsigned int, 4,
++ DIV_ROUND_UP(t->pages_per_rw_credit,
++ max_sge_per_wr) + 1);
++ max_rw_wrs = t->max_rw_credits * wrs_per_credit;
+
+ max_send_wrs = smb_direct_send_credit_target + max_rw_wrs;
+ if (max_send_wrs > device->attrs.max_cqe ||
+ max_send_wrs > device->attrs.max_qp_wr) {
+- pr_err("consider lowering send_credit_target = %d, or max_outstanding_rw_ops = %d\n",
+- smb_direct_send_credit_target,
+- smb_direct_max_outstanding_rw_ops);
++ pr_err("consider lowering send_credit_target = %d\n",
++ smb_direct_send_credit_target);
+ pr_err("Possible CQE overrun, device reporting max_cqe %d max_qp_wr %d\n",
+ device->attrs.max_cqe, device->attrs.max_qp_wr);
+ return -EINVAL;
+@@ -1708,7 +1752,7 @@ static int smb_direct_init_params(struct smb_direct_transport *t,
+
+ t->send_credit_target = smb_direct_send_credit_target;
+ atomic_set(&t->send_credits, 0);
+- atomic_set(&t->rw_avail_ops, smb_direct_max_outstanding_rw_ops);
++ atomic_set(&t->rw_credits, t->max_rw_credits);
+
+ t->max_send_size = smb_direct_max_send_size;
+ t->max_recv_size = smb_direct_max_receive_size;
+@@ -1716,12 +1760,10 @@ static int smb_direct_init_params(struct smb_direct_transport *t,
+
+ cap->max_send_wr = max_send_wrs;
+ cap->max_recv_wr = t->recv_credit_max;
+- cap->max_send_sge = SMB_DIRECT_MAX_SEND_SGES;
++ cap->max_send_sge = max_sge_per_wr;
+ cap->max_recv_sge = SMB_DIRECT_MAX_RECV_SGES;
+ cap->max_inline_data = 0;
+- cap->max_rdma_ctxs =
+- rdma_rw_mr_factor(device, t->cm_id->port_num, max_pages) *
+- smb_direct_max_outstanding_rw_ops;
++ cap->max_rdma_ctxs = t->max_rw_credits;
+ return 0;
+ }
+
+@@ -1814,7 +1856,8 @@ static int smb_direct_create_qpair(struct smb_direct_transport *t,
+ }
+
+ t->send_cq = ib_alloc_cq(t->cm_id->device, t,
+- t->send_credit_target, 0, IB_POLL_WORKQUEUE);
++ smb_direct_send_credit_target + cap->max_rdma_ctxs,
++ 0, IB_POLL_WORKQUEUE);
+ if (IS_ERR(t->send_cq)) {
+ pr_err("Can't create RDMA send CQ\n");
+ ret = PTR_ERR(t->send_cq);
+@@ -1823,8 +1866,7 @@ static int smb_direct_create_qpair(struct smb_direct_transport *t,
+ }
+
+ t->recv_cq = ib_alloc_cq(t->cm_id->device, t,
+- cap->max_send_wr + cap->max_rdma_ctxs,
+- 0, IB_POLL_WORKQUEUE);
++ t->recv_credit_max, 0, IB_POLL_WORKQUEUE);
+ if (IS_ERR(t->recv_cq)) {
+ pr_err("Can't create RDMA recv CQ\n");
+ ret = PTR_ERR(t->recv_cq);
+@@ -1853,17 +1895,12 @@ static int smb_direct_create_qpair(struct smb_direct_transport *t,
+
+ pages_per_rw = DIV_ROUND_UP(t->max_rdma_rw_size, PAGE_SIZE) + 1;
+ if (pages_per_rw > t->cm_id->device->attrs.max_sgl_rd) {
+- int pages_per_mr, mr_count;
+-
+- pages_per_mr = min_t(int, pages_per_rw,
+- t->cm_id->device->attrs.max_fast_reg_page_list_len);
+- mr_count = DIV_ROUND_UP(pages_per_rw, pages_per_mr) *
+- atomic_read(&t->rw_avail_ops);
+- ret = ib_mr_pool_init(t->qp, &t->qp->rdma_mrs, mr_count,
+- IB_MR_TYPE_MEM_REG, pages_per_mr, 0);
++ ret = ib_mr_pool_init(t->qp, &t->qp->rdma_mrs,
++ t->max_rw_credits, IB_MR_TYPE_MEM_REG,
++ t->pages_per_rw_credit, 0);
+ if (ret) {
+ pr_err("failed to init mr pool count %d pages %d\n",
+- mr_count, pages_per_mr);
++ t->max_rw_credits, t->pages_per_rw_credit);
+ goto err;
+ }
+ }
+diff --git a/fs/ksmbd/transport_rdma.h b/fs/ksmbd/transport_rdma.h
+index 5567d93a6f96e..77aee4e5c9dcd 100644
+--- a/fs/ksmbd/transport_rdma.h
++++ b/fs/ksmbd/transport_rdma.h
+@@ -7,6 +7,10 @@
+ #ifndef __KSMBD_TRANSPORT_RDMA_H__
+ #define __KSMBD_TRANSPORT_RDMA_H__
+
++#define SMBD_DEFAULT_IOSIZE (8 * 1024 * 1024)
++#define SMBD_MIN_IOSIZE (512 * 1024)
++#define SMBD_MAX_IOSIZE (16 * 1024 * 1024)
++
+ /* SMB DIRECT negotiation request packet [MS-SMBD] 2.2.1 */
+ struct smb_direct_negotiate_req {
+ __le16 min_version;
+@@ -52,10 +56,14 @@ struct smb_direct_data_transfer {
+ int ksmbd_rdma_init(void);
+ void ksmbd_rdma_destroy(void);
+ bool ksmbd_rdma_capable_netdev(struct net_device *netdev);
++void init_smbd_max_io_size(unsigned int sz);
++unsigned int get_smbd_max_read_write_size(void);
+ #else
+ static inline int ksmbd_rdma_init(void) { return 0; }
+ static inline int ksmbd_rdma_destroy(void) { return 0; }
+ static inline bool ksmbd_rdma_capable_netdev(struct net_device *netdev) { return false; }
++static inline void init_smbd_max_io_size(unsigned int sz) { }
++static inline unsigned int get_smbd_max_read_write_size(void) { return 0; }
+ #endif
+
+ #endif /* __KSMBD_TRANSPORT_RDMA_H__ */
+diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c
+index 05efcdf7a4a73..201962f03772d 100644
+--- a/fs/ksmbd/vfs.c
++++ b/fs/ksmbd/vfs.c
+@@ -1540,6 +1540,11 @@ int ksmbd_vfs_get_sd_xattr(struct ksmbd_conn *conn,
+ }
+
+ *pntsd = acl.sd_buf;
++ if (acl.sd_size < sizeof(struct smb_ntsd)) {
++ pr_err("sd size is invalid\n");
++ goto out_free;
++ }
++
+ (*pntsd)->osidoffset = cpu_to_le32(le32_to_cpu((*pntsd)->osidoffset) -
+ NDR_NTSD_OFFSETOF);
+ (*pntsd)->gsidoffset = cpu_to_le32(le32_to_cpu((*pntsd)->gsidoffset) -
+diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
+index 176b468a61c75..e5adb524a445f 100644
+--- a/fs/lockd/svc4proc.c
++++ b/fs/lockd/svc4proc.c
+@@ -32,6 +32,10 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
+ if (!nlmsvc_ops)
+ return nlm_lck_denied_nolocks;
+
++ if (lock->lock_start > OFFSET_MAX ||
++ (lock->lock_len && ((lock->lock_len - 1) > (OFFSET_MAX - lock->lock_start))))
++ return nlm4_fbig;
++
+ /* Obtain host handle */
+ if (!(host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len))
+ || (argp->monitor && nsm_monitor(host) < 0))
+@@ -50,6 +54,10 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
+ /* Set up the missing parts of the file_lock structure */
+ lock->fl.fl_file = file->f_file[mode];
+ lock->fl.fl_pid = current->tgid;
++ lock->fl.fl_start = (loff_t)lock->lock_start;
++ lock->fl.fl_end = lock->lock_len ?
++ (loff_t)(lock->lock_start + lock->lock_len - 1) :
++ OFFSET_MAX;
+ lock->fl.fl_lmops = &nlmsvc_lock_operations;
+ nlmsvc_locks_init_private(&lock->fl, host, (pid_t)lock->svid);
+ if (!lock->fl.fl_owner) {
+diff --git a/fs/lockd/xdr4.c b/fs/lockd/xdr4.c
+index 856267c0864bd..712fdfeb8ef06 100644
+--- a/fs/lockd/xdr4.c
++++ b/fs/lockd/xdr4.c
+@@ -20,13 +20,6 @@
+
+ #include "svcxdr.h"
+
+-static inline loff_t
+-s64_to_loff_t(__s64 offset)
+-{
+- return (loff_t)offset;
+-}
+-
+-
+ static inline s64
+ loff_t_to_s64(loff_t offset)
+ {
+@@ -70,8 +63,6 @@ static bool
+ svcxdr_decode_lock(struct xdr_stream *xdr, struct nlm_lock *lock)
+ {
+ struct file_lock *fl = &lock->fl;
+- u64 len, start;
+- s64 end;
+
+ if (!svcxdr_decode_string(xdr, &lock->caller, &lock->len))
+ return false;
+@@ -81,20 +72,14 @@ svcxdr_decode_lock(struct xdr_stream *xdr, struct nlm_lock *lock)
+ return false;
+ if (xdr_stream_decode_u32(xdr, &lock->svid) < 0)
+ return false;
+- if (xdr_stream_decode_u64(xdr, &start) < 0)
++ if (xdr_stream_decode_u64(xdr, &lock->lock_start) < 0)
+ return false;
+- if (xdr_stream_decode_u64(xdr, &len) < 0)
++ if (xdr_stream_decode_u64(xdr, &lock->lock_len) < 0)
+ return false;
+
+ locks_init_lock(fl);
+ fl->fl_flags = FL_POSIX;
+ fl->fl_type = F_RDLCK;
+- end = start + len - 1;
+- fl->fl_start = s64_to_loff_t(start);
+- if (len == 0 || end < 0)
+- fl->fl_end = OFFSET_MAX;
+- else
+- fl->fl_end = s64_to_loff_t(end);
+
+ return true;
+ }
+diff --git a/fs/mbcache.c b/fs/mbcache.c
+index 97c54d3a22276..2010bc80a3f2d 100644
+--- a/fs/mbcache.c
++++ b/fs/mbcache.c
+@@ -11,7 +11,7 @@
+ /*
+ * Mbcache is a simple key-value store. Keys need not be unique, however
+ * key-value pairs are expected to be unique (we use this fact in
+- * mb_cache_entry_delete()).
++ * mb_cache_entry_delete_or_get()).
+ *
+ * Ext2 and ext4 use this cache for deduplication of extended attribute blocks.
+ * Ext4 also uses it for deduplication of xattr values stored in inodes.
+@@ -125,6 +125,19 @@ void __mb_cache_entry_free(struct mb_cache_entry *entry)
+ }
+ EXPORT_SYMBOL(__mb_cache_entry_free);
+
++/*
++ * mb_cache_entry_wait_unused - wait to be the last user of the entry
++ *
++ * @entry - entry to work on
++ *
++ * Wait to be the last user of the entry.
++ */
++void mb_cache_entry_wait_unused(struct mb_cache_entry *entry)
++{
++ wait_var_event(&entry->e_refcnt, atomic_read(&entry->e_refcnt) <= 3);
++}
++EXPORT_SYMBOL(mb_cache_entry_wait_unused);
++
+ static struct mb_cache_entry *__entry_find(struct mb_cache *cache,
+ struct mb_cache_entry *entry,
+ u32 key)
+@@ -217,7 +230,7 @@ out:
+ }
+ EXPORT_SYMBOL(mb_cache_entry_get);
+
+-/* mb_cache_entry_delete - remove a cache entry
++/* mb_cache_entry_delete - try to remove a cache entry
+ * @cache - cache we work with
+ * @key - key
+ * @value - value
+@@ -254,6 +267,55 @@ void mb_cache_entry_delete(struct mb_cache *cache, u32 key, u64 value)
+ }
+ EXPORT_SYMBOL(mb_cache_entry_delete);
+
++/* mb_cache_entry_delete_or_get - remove a cache entry if it has no users
++ * @cache - cache we work with
++ * @key - key
++ * @value - value
++ *
++ * Remove entry from cache @cache with key @key and value @value. The removal
++ * happens only if the entry is unused. The function returns NULL in case the
++ * entry was successfully removed or there's no entry in cache. Otherwise the
++ * function grabs reference of the entry that we failed to delete because it
++ * still has users and return it.
++ */
++struct mb_cache_entry *mb_cache_entry_delete_or_get(struct mb_cache *cache,
++ u32 key, u64 value)
++{
++ struct hlist_bl_node *node;
++ struct hlist_bl_head *head;
++ struct mb_cache_entry *entry;
++
++ head = mb_cache_entry_head(cache, key);
++ hlist_bl_lock(head);
++ hlist_bl_for_each_entry(entry, node, head, e_hash_list) {
++ if (entry->e_key == key && entry->e_value == value) {
++ if (atomic_read(&entry->e_refcnt) > 2) {
++ atomic_inc(&entry->e_refcnt);
++ hlist_bl_unlock(head);
++ return entry;
++ }
++ /* We keep hash list reference to keep entry alive */
++ hlist_bl_del_init(&entry->e_hash_list);
++ hlist_bl_unlock(head);
++ spin_lock(&cache->c_list_lock);
++ if (!list_empty(&entry->e_list)) {
++ list_del_init(&entry->e_list);
++ if (!WARN_ONCE(cache->c_entry_count == 0,
++ "mbcache: attempt to decrement c_entry_count past zero"))
++ cache->c_entry_count--;
++ atomic_dec(&entry->e_refcnt);
++ }
++ spin_unlock(&cache->c_list_lock);
++ mb_cache_entry_put(cache, entry);
++ return NULL;
++ }
++ }
++ hlist_bl_unlock(head);
++
++ return NULL;
++}
++EXPORT_SYMBOL(mb_cache_entry_delete_or_get);
++
+ /* mb_cache_entry_touch - cache entry got used
+ * @cache - cache the entry belongs to
+ * @entry - entry that got used
+@@ -288,7 +350,7 @@ static unsigned long mb_cache_shrink(struct mb_cache *cache,
+ while (nr_to_scan-- && !list_empty(&cache->c_list)) {
+ entry = list_first_entry(&cache->c_list,
+ struct mb_cache_entry, e_list);
+- if (entry->e_referenced) {
++ if (entry->e_referenced || atomic_read(&entry->e_refcnt) > 2) {
+ entry->e_referenced = 0;
+ list_move_tail(&entry->e_list, &cache->c_list);
+ continue;
+@@ -302,6 +364,14 @@ static unsigned long mb_cache_shrink(struct mb_cache *cache,
+ spin_unlock(&cache->c_list_lock);
+ head = mb_cache_entry_head(cache, entry->e_key);
+ hlist_bl_lock(head);
++ /* Now a reliable check if the entry didn't get used... */
++ if (atomic_read(&entry->e_refcnt) > 2) {
++ hlist_bl_unlock(head);
++ spin_lock(&cache->c_list_lock);
++ list_add_tail(&entry->e_list, &cache->c_list);
++ cache->c_entry_count++;
++ continue;
++ }
+ if (!hlist_bl_unhashed(&entry->e_hash_list)) {
+ hlist_bl_del_init(&entry->e_hash_list);
+ atomic_dec(&entry->e_refcnt);
+diff --git a/fs/namei.c b/fs/namei.c
+index fd3c95ac261bd..2fa412c5a0823 100644
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -1511,6 +1511,8 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path,
+ * becoming unpinned.
+ */
+ flags = dentry->d_flags;
++ if (read_seqretry(&mount_lock, nd->m_seq))
++ return false;
+ continue;
+ }
+ if (read_seqretry(&mount_lock, nd->m_seq))
+@@ -3571,6 +3573,8 @@ struct dentry *vfs_tmpfile(struct user_namespace *mnt_userns,
+ child = d_alloc(dentry, &slash_name);
+ if (unlikely(!child))
+ goto out_err;
++ if (!IS_POSIXACL(dir))
++ mode &= ~current_umask();
+ error = dir->i_op->tmpfile(mnt_userns, dir, child, mode);
+ if (error)
+ goto out_err;
+diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
+index 604be402ae13c..7d285561e59f6 100644
+--- a/fs/nfs/flexfilelayout/flexfilelayout.c
++++ b/fs/nfs/flexfilelayout/flexfilelayout.c
+@@ -1131,6 +1131,8 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task,
+ case -EIO:
+ case -ETIMEDOUT:
+ case -EPIPE:
++ case -EPROTO:
++ case -ENODEV:
+ dprintk("%s DS connection error %d\n", __func__,
+ task->tk_status);
+ nfs4_delete_deviceid(devid->ld, devid->nfs_client,
+@@ -1236,6 +1238,8 @@ static void ff_layout_io_track_ds_error(struct pnfs_layout_segment *lseg,
+ case -ENOBUFS:
+ case -EPIPE:
+ case -EPERM:
++ case -EPROTO:
++ case -ENODEV:
+ *op_status = status = NFS4ERR_NXIO;
+ break;
+ case -EACCES:
+diff --git a/fs/nfs/nfs3client.c b/fs/nfs/nfs3client.c
+index 5601e47360c28..b49359afac883 100644
+--- a/fs/nfs/nfs3client.c
++++ b/fs/nfs/nfs3client.c
+@@ -108,7 +108,6 @@ struct nfs_client *nfs3_set_ds_client(struct nfs_server *mds_srv,
+ if (mds_srv->flags & NFS_MOUNT_NORESVPORT)
+ __set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags);
+
+- __set_bit(NFS_CS_NOPING, &cl_init.init_flags);
+ __set_bit(NFS_CS_DS, &cl_init.init_flags);
+
+ /* Use the MDS nfs_client cl_ipaddr. */
+diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
+index 0326bdec5de77..a46260965cf10 100644
+--- a/fs/nfsd/filecache.c
++++ b/fs/nfsd/filecache.c
+@@ -183,12 +183,6 @@ nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval,
+ nf->nf_hashval = hashval;
+ refcount_set(&nf->nf_ref, 1);
+ nf->nf_may = may & NFSD_FILE_MAY_MASK;
+- if (may & NFSD_MAY_NOT_BREAK_LEASE) {
+- if (may & NFSD_MAY_WRITE)
+- __set_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags);
+- if (may & NFSD_MAY_READ)
+- __set_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
+- }
+ nf->nf_mark = NULL;
+ trace_nfsd_file_alloc(nf);
+ }
+@@ -954,21 +948,7 @@ wait_for_construction:
+
+ this_cpu_inc(nfsd_file_cache_hits);
+
+- if (!(may_flags & NFSD_MAY_NOT_BREAK_LEASE)) {
+- bool write = (may_flags & NFSD_MAY_WRITE);
+-
+- if (test_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags) ||
+- (test_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags) && write)) {
+- status = nfserrno(nfsd_open_break_lease(
+- file_inode(nf->nf_file), may_flags));
+- if (status == nfs_ok) {
+- clear_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
+- if (write)
+- clear_bit(NFSD_FILE_BREAK_WRITE,
+- &nf->nf_flags);
+- }
+- }
+- }
++ status = nfserrno(nfsd_open_break_lease(file_inode(nf->nf_file), may_flags));
+ out:
+ if (status == nfs_ok) {
+ *pnf = nf;
+diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
+index 435ceab27897a..63104be2865c5 100644
+--- a/fs/nfsd/filecache.h
++++ b/fs/nfsd/filecache.h
+@@ -37,9 +37,7 @@ struct nfsd_file {
+ struct net *nf_net;
+ #define NFSD_FILE_HASHED (0)
+ #define NFSD_FILE_PENDING (1)
+-#define NFSD_FILE_BREAK_READ (2)
+-#define NFSD_FILE_BREAK_WRITE (3)
+-#define NFSD_FILE_REFERENCED (4)
++#define NFSD_FILE_REFERENCED (2)
+ unsigned long nf_flags;
+ struct inode *nf_inode;
+ unsigned int nf_hashval;
+diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
+index 242fa123e0e94..61ee50843f5cb 100644
+--- a/fs/nfsd/trace.h
++++ b/fs/nfsd/trace.h
+@@ -692,18 +692,10 @@ DEFINE_CLID_EVENT(confirmed_r);
+ /*
+ * from fs/nfsd/filecache.h
+ */
+-TRACE_DEFINE_ENUM(NFSD_FILE_HASHED);
+-TRACE_DEFINE_ENUM(NFSD_FILE_PENDING);
+-TRACE_DEFINE_ENUM(NFSD_FILE_BREAK_READ);
+-TRACE_DEFINE_ENUM(NFSD_FILE_BREAK_WRITE);
+-TRACE_DEFINE_ENUM(NFSD_FILE_REFERENCED);
+-
+ #define show_nf_flags(val) \
+ __print_flags(val, "|", \
+ { 1 << NFSD_FILE_HASHED, "HASHED" }, \
+ { 1 << NFSD_FILE_PENDING, "PENDING" }, \
+- { 1 << NFSD_FILE_BREAK_READ, "BREAK_READ" }, \
+- { 1 << NFSD_FILE_BREAK_WRITE, "BREAK_WRITE" }, \
+ { 1 << NFSD_FILE_REFERENCED, "REFERENCED"})
+
+ DECLARE_EVENT_CLASS(nfsd_file_class,
+diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
+index ebde05c9cf62e..dbb944b5f81e5 100644
+--- a/fs/overlayfs/export.c
++++ b/fs/overlayfs/export.c
+@@ -259,7 +259,7 @@ static int ovl_encode_fh(struct inode *inode, u32 *fid, int *max_len,
+ return FILEID_INVALID;
+
+ dentry = d_find_any_alias(inode);
+- if (WARN_ON(!dentry))
++ if (!dentry)
+ return FILEID_INVALID;
+
+ bytes = ovl_dentry_to_fid(ofs, dentry, fid, buflen);
+diff --git a/fs/proc/base.c b/fs/proc/base.c
+index c1031843cc6aa..2f0595e8ec4a8 100644
+--- a/fs/proc/base.c
++++ b/fs/proc/base.c
+@@ -1885,7 +1885,7 @@ void proc_pid_evict_inode(struct proc_inode *ei)
+ put_pid(pid);
+ }
+
+-struct inode *proc_pid_make_inode(struct super_block * sb,
++struct inode *proc_pid_make_inode(struct super_block *sb,
+ struct task_struct *task, umode_t mode)
+ {
+ struct inode * inode;
+@@ -1914,11 +1914,6 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
+
+ /* Let the pid remember us for quick removal */
+ ei->pid = pid;
+- if (S_ISDIR(mode)) {
+- spin_lock(&pid->lock);
+- hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
+- spin_unlock(&pid->lock);
+- }
+
+ task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);
+ security_task_to_inode(task, inode);
+@@ -1931,6 +1926,39 @@ out_unlock:
+ return NULL;
+ }
+
++/*
++ * Generating an inode and adding it into @pid->inodes, so that task will
++ * invalidate inode's dentry before being released.
++ *
++ * This helper is used for creating dir-type entries under '/proc' and
++ * '/proc/<tgid>/task'. Other entries(eg. fd, stat) under '/proc/<tgid>'
++ * can be released by invalidating '/proc/<tgid>' dentry.
++ * In theory, dentries under '/proc/<tgid>/task' can also be released by
++ * invalidating '/proc/<tgid>' dentry, we reserve it to handle single
++ * thread exiting situation: Any one of threads should invalidate its
++ * '/proc/<tgid>/task/<pid>' dentry before released.
++ */
++static struct inode *proc_pid_make_base_inode(struct super_block *sb,
++ struct task_struct *task, umode_t mode)
++{
++ struct inode *inode;
++ struct proc_inode *ei;
++ struct pid *pid;
++
++ inode = proc_pid_make_inode(sb, task, mode);
++ if (!inode)
++ return NULL;
++
++ /* Let proc_flush_pid find this directory inode */
++ ei = PROC_I(inode);
++ pid = ei->pid;
++ spin_lock(&pid->lock);
++ hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
++ spin_unlock(&pid->lock);
++
++ return inode;
++}
++
+ int pid_getattr(struct user_namespace *mnt_userns, const struct path *path,
+ struct kstat *stat, u32 request_mask, unsigned int query_flags)
+ {
+@@ -3350,7 +3378,8 @@ static struct dentry *proc_pid_instantiate(struct dentry * dentry,
+ {
+ struct inode *inode;
+
+- inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
++ inode = proc_pid_make_base_inode(dentry->d_sb, task,
++ S_IFDIR | S_IRUGO | S_IXUGO);
+ if (!inode)
+ return ERR_PTR(-ENOENT);
+
+@@ -3649,7 +3678,8 @@ static struct dentry *proc_task_instantiate(struct dentry *dentry,
+ struct task_struct *task, const void *ptr)
+ {
+ struct inode *inode;
+- inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
++ inode = proc_pid_make_base_inode(dentry->d_sb, task,
++ S_IFDIR | S_IRUGO | S_IXUGO);
+ if (!inode)
+ return ERR_PTR(-ENOENT);
+
+diff --git a/fs/splice.c b/fs/splice.c
+index 047b79db8eb52..93a2c9bf62494 100644
+--- a/fs/splice.c
++++ b/fs/splice.c
+@@ -814,17 +814,15 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
+ {
+ struct pipe_inode_info *pipe;
+ long ret, bytes;
+- umode_t i_mode;
+ size_t len;
+ int i, flags, more;
+
+ /*
+- * We require the input being a regular file, as we don't want to
+- * randomly drop data for eg socket -> socket splicing. Use the
+- * piped splicing for that!
++ * We require the input to be seekable, as we don't want to randomly
++ * drop data for eg socket -> socket splicing. Use the piped splicing
++ * for that!
+ */
+- i_mode = file_inode(in)->i_mode;
+- if (unlikely(!S_ISREG(i_mode) && !S_ISBLK(i_mode)))
++ if (unlikely(!(in->f_mode & FMODE_LSEEK)))
+ return -EINVAL;
+
+ /*
+diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
+index 15a4c7c07a3bf..b68798a572fcb 100644
+--- a/fs/zonefs/super.c
++++ b/fs/zonefs/super.c
+@@ -723,13 +723,12 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
+ struct inode *inode = file_inode(iocb->ki_filp);
+ struct zonefs_inode_info *zi = ZONEFS_I(inode);
+ struct block_device *bdev = inode->i_sb->s_bdev;
+- unsigned int max;
++ unsigned int max = bdev_max_zone_append_sectors(bdev);
+ struct bio *bio;
+ ssize_t size;
+ int nr_pages;
+ ssize_t ret;
+
+- max = queue_max_zone_append_sectors(bdev_get_queue(bdev));
+ max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize);
+ iov_iter_truncate(from, max);
+
+diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
+index 181907349b49b..a76f8c6b732d2 100644
+--- a/include/acpi/cppc_acpi.h
++++ b/include/acpi/cppc_acpi.h
+@@ -17,7 +17,7 @@
+ #include <acpi/pcc.h>
+ #include <acpi/processor.h>
+
+-/* Support CPPCv2 and CPPCv3 */
++/* CPPCv2 and CPPCv3 support */
+ #define CPPC_V2_REV 2
+ #define CPPC_V3_REV 3
+ #define CPPC_V2_NUM_ENT 21
+diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h
+index 52363eee2b20e..506d56530ca93 100644
+--- a/include/crypto/internal/blake2s.h
++++ b/include/crypto/internal/blake2s.h
+@@ -8,7 +8,6 @@
+ #define _CRYPTO_INTERNAL_BLAKE2S_H
+
+ #include <crypto/blake2s.h>
+-#include <crypto/internal/hash.h>
+ #include <linux/string.h>
+
+ void blake2s_compress_generic(struct blake2s_state *state, const u8 *block,
+@@ -19,111 +18,4 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
+
+ bool blake2s_selftest(void);
+
+-static inline void blake2s_set_lastblock(struct blake2s_state *state)
+-{
+- state->f[0] = -1;
+-}
+-
+-/* Helper functions for BLAKE2s shared by the library and shash APIs */
+-
+-static __always_inline void
+-__blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen,
+- bool force_generic)
+-{
+- const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;
+-
+- if (unlikely(!inlen))
+- return;
+- if (inlen > fill) {
+- memcpy(state->buf + state->buflen, in, fill);
+- if (force_generic)
+- blake2s_compress_generic(state, state->buf, 1,
+- BLAKE2S_BLOCK_SIZE);
+- else
+- blake2s_compress(state, state->buf, 1,
+- BLAKE2S_BLOCK_SIZE);
+- state->buflen = 0;
+- in += fill;
+- inlen -= fill;
+- }
+- if (inlen > BLAKE2S_BLOCK_SIZE) {
+- const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
+- /* Hash one less (full) block than strictly possible */
+- if (force_generic)
+- blake2s_compress_generic(state, in, nblocks - 1,
+- BLAKE2S_BLOCK_SIZE);
+- else
+- blake2s_compress(state, in, nblocks - 1,
+- BLAKE2S_BLOCK_SIZE);
+- in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
+- inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
+- }
+- memcpy(state->buf + state->buflen, in, inlen);
+- state->buflen += inlen;
+-}
+-
+-static __always_inline void
+-__blake2s_final(struct blake2s_state *state, u8 *out, bool force_generic)
+-{
+- blake2s_set_lastblock(state);
+- memset(state->buf + state->buflen, 0,
+- BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
+- if (force_generic)
+- blake2s_compress_generic(state, state->buf, 1, state->buflen);
+- else
+- blake2s_compress(state, state->buf, 1, state->buflen);
+- cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
+- memcpy(out, state->h, state->outlen);
+-}
+-
+-/* Helper functions for shash implementations of BLAKE2s */
+-
+-struct blake2s_tfm_ctx {
+- u8 key[BLAKE2S_KEY_SIZE];
+- unsigned int keylen;
+-};
+-
+-static inline int crypto_blake2s_setkey(struct crypto_shash *tfm,
+- const u8 *key, unsigned int keylen)
+-{
+- struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm);
+-
+- if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE)
+- return -EINVAL;
+-
+- memcpy(tctx->key, key, keylen);
+- tctx->keylen = keylen;
+-
+- return 0;
+-}
+-
+-static inline int crypto_blake2s_init(struct shash_desc *desc)
+-{
+- const struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
+- struct blake2s_state *state = shash_desc_ctx(desc);
+- unsigned int outlen = crypto_shash_digestsize(desc->tfm);
+-
+- __blake2s_init(state, outlen, tctx->key, tctx->keylen);
+- return 0;
+-}
+-
+-static inline int crypto_blake2s_update(struct shash_desc *desc,
+- const u8 *in, unsigned int inlen,
+- bool force_generic)
+-{
+- struct blake2s_state *state = shash_desc_ctx(desc);
+-
+- __blake2s_update(state, in, inlen, force_generic);
+- return 0;
+-}
+-
+-static inline int crypto_blake2s_final(struct shash_desc *desc, u8 *out,
+- bool force_generic)
+-{
+- struct blake2s_state *state = shash_desc_ctx(desc);
+-
+- __blake2s_final(state, out, force_generic);
+- return 0;
+-}
+-
+ #endif /* _CRYPTO_INTERNAL_BLAKE2S_H */
+diff --git a/include/dt-bindings/clock/qcom,gcc-msm8939.h b/include/dt-bindings/clock/qcom,gcc-msm8939.h
+index 0634467c4ce5a..2d545ed0d35ab 100644
+--- a/include/dt-bindings/clock/qcom,gcc-msm8939.h
++++ b/include/dt-bindings/clock/qcom,gcc-msm8939.h
+@@ -192,6 +192,7 @@
+ #define GCC_VENUS0_CORE0_VCODEC0_CLK 183
+ #define GCC_VENUS0_CORE1_VCODEC0_CLK 184
+ #define GCC_OXILI_TIMER_CLK 185
++#define SYSTEM_MM_NOC_BFDCD_CLK_SRC 186
+
+ /* Indexes for GDSCs */
+ #define BIMC_GDSC 0
+diff --git a/include/linux/acpi_viot.h b/include/linux/acpi_viot.h
+index 1eb8ee5b0e5fe..a5a1224315637 100644
+--- a/include/linux/acpi_viot.h
++++ b/include/linux/acpi_viot.h
+@@ -6,9 +6,11 @@
+ #include <linux/acpi.h>
+
+ #ifdef CONFIG_ACPI_VIOT
++void __init acpi_viot_early_init(void);
+ void __init acpi_viot_init(void);
+ int viot_iommu_configure(struct device *dev);
+ #else
++static inline void acpi_viot_early_init(void) {}
+ static inline void acpi_viot_init(void) {}
+ static inline int viot_iommu_configure(struct device *dev)
+ {
+diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
+index 108e3d114bfc1..7927480b9cf7c 100644
+--- a/include/linux/blkdev.h
++++ b/include/linux/blkdev.h
+@@ -466,7 +466,6 @@ struct request_queue {
+ #endif /* CONFIG_BLK_DEV_ZONED */
+
+ int node;
+- struct mutex debugfs_mutex;
+ #ifdef CONFIG_BLK_DEV_IO_TRACE
+ struct blk_trace __rcu *blk_trace;
+ #endif
+@@ -510,11 +509,12 @@ struct request_queue {
+ struct bio_set bio_split;
+
+ struct dentry *debugfs_dir;
+-
+-#ifdef CONFIG_BLK_DEBUG_FS
+ struct dentry *sched_debugfs_dir;
+ struct dentry *rqos_debugfs_dir;
+-#endif
++ /*
++ * Serializes all debugfs metadata operations using the above dentries.
++ */
++ struct mutex debugfs_mutex;
+
+ bool mq_sysfs_init_done;
+
+@@ -1190,6 +1190,17 @@ static inline unsigned int queue_max_zone_append_sectors(const struct request_qu
+ return min(l->max_zone_append_sectors, l->max_sectors);
+ }
+
++static inline unsigned int
++bdev_max_zone_append_sectors(struct block_device *bdev)
++{
++ return queue_max_zone_append_sectors(bdev_get_queue(bdev));
++}
++
++static inline unsigned int bdev_max_segments(struct block_device *bdev)
++{
++ return queue_max_segments(bdev_get_queue(bdev));
++}
++
+ static inline unsigned queue_logical_block_size(const struct request_queue *q)
+ {
+ int retval = 512;
+diff --git a/include/linux/bpf.h b/include/linux/bpf.h
+index 83bd5598ec4df..a5f5172eb52c5 100644
+--- a/include/linux/bpf.h
++++ b/include/linux/bpf.h
+@@ -23,6 +23,7 @@
+ #include <linux/slab.h>
+ #include <linux/percpu-refcount.h>
+ #include <linux/bpfptr.h>
++#include <linux/btf.h>
+
+ struct bpf_verifier_env;
+ struct bpf_verifier_log;
+@@ -155,6 +156,41 @@ struct bpf_map_ops {
+ const struct bpf_iter_seq_info *iter_seq_info;
+ };
+
++enum {
++ /* Support at most 8 pointers in a BPF map value */
++ BPF_MAP_VALUE_OFF_MAX = 8,
++ BPF_MAP_OFF_ARR_MAX = BPF_MAP_VALUE_OFF_MAX +
++ 1 + /* for bpf_spin_lock */
++ 1, /* for bpf_timer */
++};
++
++enum bpf_kptr_type {
++ BPF_KPTR_UNREF,
++ BPF_KPTR_REF,
++};
++
++struct bpf_map_value_off_desc {
++ u32 offset;
++ enum bpf_kptr_type type;
++ struct {
++ struct btf *btf;
++ struct module *module;
++ btf_dtor_kfunc_t dtor;
++ u32 btf_id;
++ } kptr;
++};
++
++struct bpf_map_value_off {
++ u32 nr_off;
++ struct bpf_map_value_off_desc off[];
++};
++
++struct bpf_map_off_arr {
++ u32 cnt;
++ u32 field_off[BPF_MAP_OFF_ARR_MAX];
++ u8 field_sz[BPF_MAP_OFF_ARR_MAX];
++};
++
+ struct bpf_map {
+ /* The first two cachelines with read-mostly members of which some
+ * are also accessed in fast-path (e.g. ops, max_entries).
+@@ -171,6 +207,7 @@ struct bpf_map {
+ u64 map_extra; /* any per-map-type extra fields */
+ u32 map_flags;
+ int spin_lock_off; /* >=0 valid offset, <0 error */
++ struct bpf_map_value_off *kptr_off_tab;
+ int timer_off; /* >=0 valid offset, <0 error */
+ u32 id;
+ int numa_node;
+@@ -182,10 +219,7 @@ struct bpf_map {
+ struct mem_cgroup *memcg;
+ #endif
+ char name[BPF_OBJ_NAME_LEN];
+- bool bypass_spec_v1;
+- bool frozen; /* write-once; write-protected by freeze_mutex */
+- /* 14 bytes hole */
+-
++ struct bpf_map_off_arr *off_arr;
+ /* The 3rd and 4th cacheline with misc members to avoid false sharing
+ * particularly with refcounting.
+ */
+@@ -205,6 +239,8 @@ struct bpf_map {
+ bool jited;
+ bool xdp_has_frags;
+ } owner;
++ bool bypass_spec_v1;
++ bool frozen; /* write-once; write-protected by freeze_mutex */
+ };
+
+ static inline bool map_value_has_spin_lock(const struct bpf_map *map)
+@@ -217,43 +253,44 @@ static inline bool map_value_has_timer(const struct bpf_map *map)
+ return map->timer_off >= 0;
+ }
+
++static inline bool map_value_has_kptrs(const struct bpf_map *map)
++{
++ return !IS_ERR_OR_NULL(map->kptr_off_tab);
++}
++
+ static inline void check_and_init_map_value(struct bpf_map *map, void *dst)
+ {
+ if (unlikely(map_value_has_spin_lock(map)))
+ memset(dst + map->spin_lock_off, 0, sizeof(struct bpf_spin_lock));
+ if (unlikely(map_value_has_timer(map)))
+ memset(dst + map->timer_off, 0, sizeof(struct bpf_timer));
++ if (unlikely(map_value_has_kptrs(map))) {
++ struct bpf_map_value_off *tab = map->kptr_off_tab;
++ int i;
++
++ for (i = 0; i < tab->nr_off; i++)
++ *(u64 *)(dst + tab->off[i].offset) = 0;
++ }
+ }
+
+ /* copy everything but bpf_spin_lock and bpf_timer. There could be one of each. */
+ static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
+ {
+- u32 s_off = 0, s_sz = 0, t_off = 0, t_sz = 0;
++ u32 curr_off = 0;
++ int i;
+
+- if (unlikely(map_value_has_spin_lock(map))) {
+- s_off = map->spin_lock_off;
+- s_sz = sizeof(struct bpf_spin_lock);
+- }
+- if (unlikely(map_value_has_timer(map))) {
+- t_off = map->timer_off;
+- t_sz = sizeof(struct bpf_timer);
++ if (likely(!map->off_arr)) {
++ memcpy(dst, src, map->value_size);
++ return;
+ }
+
+- if (unlikely(s_sz || t_sz)) {
+- if (s_off < t_off || !s_sz) {
+- swap(s_off, t_off);
+- swap(s_sz, t_sz);
+- }
+- memcpy(dst, src, t_off);
+- memcpy(dst + t_off + t_sz,
+- src + t_off + t_sz,
+- s_off - t_off - t_sz);
+- memcpy(dst + s_off + s_sz,
+- src + s_off + s_sz,
+- map->value_size - s_off - s_sz);
+- } else {
+- memcpy(dst, src, map->value_size);
++ for (i = 0; i < map->off_arr->cnt; i++) {
++ u32 next_off = map->off_arr->field_off[i];
++
++ memcpy(dst + curr_off, src + curr_off, next_off - curr_off);
++ curr_off += map->off_arr->field_sz[i];
+ }
++ memcpy(dst + curr_off, src + curr_off, map->value_size - curr_off);
+ }
+ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
+ bool lock_src);
+@@ -342,7 +379,10 @@ enum bpf_type_flag {
+ */
+ MEM_PERCPU = BIT(4 + BPF_BASE_TYPE_BITS),
+
+- __BPF_TYPE_LAST_FLAG = MEM_PERCPU,
++ /* Indicates that the argument will be released. */
++ OBJ_RELEASE = BIT(5 + BPF_BASE_TYPE_BITS),
++
++ __BPF_TYPE_LAST_FLAG = OBJ_RELEASE,
+ };
+
+ /* Max number of base types. */
+@@ -391,6 +431,7 @@ enum bpf_arg_type {
+ ARG_PTR_TO_STACK, /* pointer to stack */
+ ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */
+ ARG_PTR_TO_TIMER, /* pointer to bpf_timer */
++ ARG_PTR_TO_KPTR, /* pointer to referenced kptr */
+ __BPF_ARG_TYPE_MAX,
+
+ /* Extended arg_types. */
+@@ -400,6 +441,7 @@ enum bpf_arg_type {
+ ARG_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_SOCKET,
+ ARG_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_ALLOC_MEM,
+ ARG_PTR_TO_STACK_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_STACK,
++ ARG_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_BTF_ID,
+
+ /* This must be the last entry. Its purpose is to ensure the enum is
+ * wide enough to hold the higher bits reserved for bpf_type_flag.
+@@ -674,11 +716,11 @@ struct btf_func_model {
+ /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
+ * bytes on x86.
+ */
+-#define BPF_MAX_TRAMP_PROGS 38
++#define BPF_MAX_TRAMP_LINKS 38
+
+-struct bpf_tramp_progs {
+- struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS];
+- int nr_progs;
++struct bpf_tramp_links {
++ struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
++ int nr_links;
+ };
+
+ /* Different use cases for BPF trampoline:
+@@ -704,7 +746,7 @@ struct bpf_tramp_progs {
+ struct bpf_tramp_image;
+ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
+ const struct btf_func_model *m, u32 flags,
+- struct bpf_tramp_progs *tprogs,
++ struct bpf_tramp_links *tlinks,
+ void *orig_call);
+ /* these two functions are called from generated trampoline */
+ u64 notrace __bpf_prog_enter(struct bpf_prog *prog);
+@@ -803,9 +845,10 @@ static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
+ {
+ return bpf_func(ctx, insnsi);
+ }
++
+ #ifdef CONFIG_BPF_JIT
+-int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr);
+-int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr);
++int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
++int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
+ struct bpf_trampoline *bpf_trampoline_get(u64 key,
+ struct bpf_attach_target_info *tgt_info);
+ void bpf_trampoline_put(struct bpf_trampoline *tr);
+@@ -856,12 +899,12 @@ int bpf_jit_charge_modmem(u32 size);
+ void bpf_jit_uncharge_modmem(u32 size);
+ bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
+ #else
+-static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
++static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
+ struct bpf_trampoline *tr)
+ {
+ return -ENOTSUPP;
+ }
+-static inline int bpf_trampoline_unlink_prog(struct bpf_prog *prog,
++static inline int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
+ struct bpf_trampoline *tr)
+ {
+ return -ENOTSUPP;
+@@ -959,8 +1002,6 @@ struct bpf_prog_aux {
+ bool sleepable;
+ bool tail_call_reachable;
+ bool xdp_has_frags;
+- bool use_bpf_prog_pack;
+- struct hlist_node tramp_hlist;
+ /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
+ const struct btf_type *attach_func_proto;
+ /* function name for valid attach_btf_id */
+@@ -1047,6 +1088,18 @@ struct bpf_link_ops {
+ struct bpf_link_info *info);
+ };
+
++struct bpf_tramp_link {
++ struct bpf_link link;
++ struct hlist_node tramp_hlist;
++};
++
++struct bpf_tracing_link {
++ struct bpf_tramp_link link;
++ enum bpf_attach_type attach_type;
++ struct bpf_trampoline *trampoline;
++ struct bpf_prog *tgt_prog;
++};
++
+ struct bpf_link_primer {
+ struct bpf_link *link;
+ struct file *file;
+@@ -1084,8 +1137,8 @@ bool bpf_struct_ops_get(const void *kdata);
+ void bpf_struct_ops_put(const void *kdata);
+ int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map, void *key,
+ void *value);
+-int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_progs *tprogs,
+- struct bpf_prog *prog,
++int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
++ struct bpf_tramp_link *link,
+ const struct btf_func_model *model,
+ void *image, void *image_end);
+ static inline bool bpf_try_module_get(const void *data, struct module *owner)
+@@ -1396,6 +1449,12 @@ void bpf_prog_put(struct bpf_prog *prog);
+ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock);
+ void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock);
+
++struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset);
++void bpf_map_free_kptr_off_tab(struct bpf_map *map);
++struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map);
++bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b);
++void bpf_map_free_kptrs(struct bpf_map *map, void *map_value);
++
+ struct bpf_map *bpf_map_get(u32 ufd);
+ struct bpf_map *bpf_map_get_with_uref(u32 ufd);
+ struct bpf_map *__bpf_map_get(struct fd f);
+@@ -2169,6 +2228,7 @@ extern const struct bpf_func_proto bpf_find_vma_proto;
+ extern const struct bpf_func_proto bpf_loop_proto;
+ extern const struct bpf_func_proto bpf_strncmp_proto;
+ extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
++extern const struct bpf_func_proto bpf_kptr_xchg_proto;
+
+ const struct bpf_func_proto *tracing_prog_func_proto(
+ enum bpf_func_id func_id, const struct bpf_prog *prog);
+diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
+index 3e24ad0c4b3c1..2b9112b801712 100644
+--- a/include/linux/bpf_types.h
++++ b/include/linux/bpf_types.h
+@@ -141,3 +141,4 @@ BPF_LINK_TYPE(BPF_LINK_TYPE_XDP, xdp)
+ BPF_LINK_TYPE(BPF_LINK_TYPE_PERF_EVENT, perf)
+ #endif
+ BPF_LINK_TYPE(BPF_LINK_TYPE_KPROBE_MULTI, kprobe_multi)
++BPF_LINK_TYPE(BPF_LINK_TYPE_STRUCT_OPS, struct_ops)
+diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
+index 3a9d2d7cc6b72..1f1e7f2ea967f 100644
+--- a/include/linux/bpf_verifier.h
++++ b/include/linux/bpf_verifier.h
+@@ -523,8 +523,7 @@ int check_ptr_off_reg(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno);
+ int check_func_arg_reg_off(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno,
+- enum bpf_arg_type arg_type,
+- bool is_release_func);
++ enum bpf_arg_type arg_type);
+ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+ u32 regno);
+ int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+diff --git a/include/linux/btf.h b/include/linux/btf.h
+index 36bc09b8e890d..f70625dd5bb4a 100644
+--- a/include/linux/btf.h
++++ b/include/linux/btf.h
+@@ -40,6 +40,13 @@ struct btf_kfunc_id_set {
+ };
+ };
+
++struct btf_id_dtor_kfunc {
++ u32 btf_id;
++ u32 kfunc_btf_id;
++};
++
++typedef void (*btf_dtor_kfunc_t)(void *);
++
+ extern const struct file_operations btf_fops;
+
+ void btf_get(struct btf *btf);
+@@ -123,6 +130,8 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
+ u32 expected_offset, u32 expected_size);
+ int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t);
+ int btf_find_timer(const struct btf *btf, const struct btf_type *t);
++struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf,
++ const struct btf_type *t);
+ bool btf_type_is_void(const struct btf_type *t);
+ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
+ const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
+@@ -344,6 +353,9 @@ bool btf_kfunc_id_set_contains(const struct btf *btf,
+ enum btf_kfunc_type type, u32 kfunc_btf_id);
+ int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
+ const struct btf_kfunc_id_set *s);
++s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id);
++int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_cnt,
++ struct module *owner);
+ #else
+ static inline const struct btf_type *btf_type_by_id(const struct btf *btf,
+ u32 type_id)
+@@ -367,6 +379,15 @@ static inline int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
+ {
+ return 0;
+ }
++static inline s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id)
++{
++ return -ENOENT;
++}
++static inline int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors,
++ u32 add_cnt, struct module *owner)
++{
++ return 0;
++}
+ #endif
+
+ #endif
+diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
+index bcb4fe9b8575c..8e4552460910e 100644
+--- a/include/linux/buffer_head.h
++++ b/include/linux/buffer_head.h
+@@ -117,7 +117,6 @@ static __always_inline int test_clear_buffer_##name(struct buffer_head *bh) \
+ * of the form "mark_buffer_foo()". These are higher-level functions which
+ * do something in addition to setting a b_state bit.
+ */
+-BUFFER_FNS(Uptodate, uptodate)
+ BUFFER_FNS(Dirty, dirty)
+ TAS_BUFFER_FNS(Dirty, dirty)
+ BUFFER_FNS(Lock, locked)
+@@ -135,6 +134,30 @@ BUFFER_FNS(Meta, meta)
+ BUFFER_FNS(Prio, prio)
+ BUFFER_FNS(Defer_Completion, defer_completion)
+
++static __always_inline void set_buffer_uptodate(struct buffer_head *bh)
++{
++ /*
++ * make it consistent with folio_mark_uptodate
++ * pairs with smp_load_acquire in buffer_uptodate
++ */
++ smp_mb__before_atomic();
++ set_bit(BH_Uptodate, &bh->b_state);
++}
++
++static __always_inline void clear_buffer_uptodate(struct buffer_head *bh)
++{
++ clear_bit(BH_Uptodate, &bh->b_state);
++}
++
++static __always_inline int buffer_uptodate(const struct buffer_head *bh)
++{
++ /*
++ * make it consistent with folio_test_uptodate
++ * pairs with smp_mb__before_atomic in set_buffer_uptodate
++ */
++ return (smp_load_acquire(&bh->b_state) & (1UL << BH_Uptodate)) != 0;
++}
++
+ #define bh_offset(bh) ((unsigned long)(bh)->b_data & ~PAGE_MASK)
+
+ /* If we *know* page->private refers to buffer_heads */
+diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
+index fe29ac7cc469c..4592d08459417 100644
+--- a/include/linux/cpumask.h
++++ b/include/linux/cpumask.h
+@@ -1071,4 +1071,22 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
+ [0] = 1UL \
+ } }
+
++/*
++ * Provide a valid theoretical max size for cpumap and cpulist sysfs files
++ * to avoid breaking userspace which may allocate a buffer based on the size
++ * reported by e.g. fstat.
++ *
++ * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
++ *
++ * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
++ * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
++ * cover a worst-case of every other cpu being on one of two nodes for a
++ * very large NR_CPUS.
++ *
++ * Use PAGE_SIZE as a minimum for smaller configurations.
++ */
++#define CPUMAP_FILE_MAX_BYTES ((((NR_CPUS * 9)/32 - 1) > PAGE_SIZE) \
++ ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
++#define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
++
+ #endif /* __LINUX_CPUMASK_H */
+diff --git a/include/linux/filter.h b/include/linux/filter.h
+index ed0c0ff42ad5b..8fd2e2f58eeb2 100644
+--- a/include/linux/filter.h
++++ b/include/linux/filter.h
+@@ -948,6 +948,7 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
+ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
+ void bpf_jit_compile(struct bpf_prog *prog);
+ bool bpf_jit_needs_zext(void);
++bool bpf_jit_supports_subprog_tailcalls(void);
+ bool bpf_jit_supports_kfunc_call(void);
+ bool bpf_helper_changes_pkt_data(void *func);
+
+@@ -1060,6 +1061,14 @@ u64 bpf_jit_alloc_exec_limit(void);
+ void *bpf_jit_alloc_exec(unsigned long size);
+ void bpf_jit_free_exec(void *addr);
+ void bpf_jit_free(struct bpf_prog *fp);
++struct bpf_binary_header *
++bpf_jit_binary_pack_hdr(const struct bpf_prog *fp);
++
++static inline bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
++{
++ return list_empty(&fp->aux->ksym.lnode) ||
++ fp->aux->ksym.lnode.prev == LIST_POISON2;
++}
+
+ struct bpf_binary_header *
+ bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **ro_image,
+diff --git a/include/linux/highmem.h b/include/linux/highmem.h
+index 39bb9b47fa9cd..7037c6bed55c0 100644
+--- a/include/linux/highmem.h
++++ b/include/linux/highmem.h
+@@ -218,6 +218,16 @@ static inline void clear_highpage(struct page *page)
+ kunmap_local(kaddr);
+ }
+
++static inline void clear_highpage_kasan_tagged(struct page *page)
++{
++ u8 tag;
++
++ tag = page_kasan_tag(page);
++ page_kasan_tag_reset(page);
++ clear_highpage(page);
++ page_kasan_tag_set(page, tag);
++}
++
+ #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
+
+ static inline void tag_clear_highpage(struct page *page)
+diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
+index ac2a1d758a80e..fbf4a6509fb39 100644
+--- a/include/linux/hugetlb.h
++++ b/include/linux/hugetlb.h
+@@ -167,7 +167,7 @@ bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
+ vm_flags_t vm_flags);
+ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
+ long freed);
+-bool isolate_huge_page(struct page *page, struct list_head *list);
++int isolate_hugetlb(struct page *page, struct list_head *list);
+ int get_hwpoison_huge_page(struct page *page, bool *hugetlb);
+ int get_huge_page_for_hwpoison(unsigned long pfn, int flags);
+ void putback_active_hugepage(struct page *page);
+@@ -369,9 +369,9 @@ static inline pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
+ return NULL;
+ }
+
+-static inline bool isolate_huge_page(struct page *page, struct list_head *list)
++static inline int isolate_hugetlb(struct page *page, struct list_head *list)
+ {
+- return false;
++ return -EBUSY;
+ }
+
+ static inline int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
+diff --git a/include/linux/iio/common/cros_ec_sensors_core.h b/include/linux/iio/common/cros_ec_sensors_core.h
+index c582e1a142320..7b5dbd7499957 100644
+--- a/include/linux/iio/common/cros_ec_sensors_core.h
++++ b/include/linux/iio/common/cros_ec_sensors_core.h
+@@ -95,8 +95,11 @@ int cros_ec_sensors_read_cmd(struct iio_dev *indio_dev, unsigned long scan_mask,
+ struct platform_device;
+ int cros_ec_sensors_core_init(struct platform_device *pdev,
+ struct iio_dev *indio_dev, bool physical_device,
+- cros_ec_sensors_capture_t trigger_capture,
+- cros_ec_sensorhub_push_data_cb_t push_data);
++ cros_ec_sensors_capture_t trigger_capture);
++
++int cros_ec_sensors_core_register(struct device *dev,
++ struct iio_dev *indio_dev,
++ cros_ec_sensorhub_push_data_cb_t push_data);
+
+ irqreturn_t cros_ec_sensors_capture(int irq, void *p);
+ int cros_ec_sensors_push_data(struct iio_dev *indio_dev,
+diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
+index faf00f2c0be6f..c4ce02293f1f8 100644
+--- a/include/linux/iio/iio.h
++++ b/include/linux/iio/iio.h
+@@ -9,6 +9,7 @@
+
+ #include <linux/device.h>
+ #include <linux/cdev.h>
++#include <linux/slab.h>
+ #include <linux/iio/types.h>
+ #include <linux/of.h>
+ /* IIO TODO LIST */
+@@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev)
+ return dev_get_drvdata(&indio_dev->dev);
+ }
+
+-/* Can we make this smaller? */
+-#define IIO_ALIGN L1_CACHE_BYTES
++/*
++ * Used to ensure the iio_priv() structure is aligned to allow that structure
++ * to in turn include IIO_DMA_MINALIGN'd elements such as buffers which
++ * must not share cachelines with the rest of the structure, thus making
++ * them safe for use with non-coherent DMA.
++ */
++#define IIO_DMA_MINALIGN ARCH_KMALLOC_MINALIGN
+ struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv);
+
+ /* The information at the returned address is guaranteed to be cacheline aligned */
+diff --git a/include/linux/kexec.h b/include/linux/kexec.h
+index 8d573baaab29e..f3e7680befcc4 100644
+--- a/include/linux/kexec.h
++++ b/include/linux/kexec.h
+@@ -188,21 +188,48 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
+ void *buf, unsigned int size,
+ bool get_value);
+ void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name);
++void *kexec_image_load_default(struct kimage *image);
+
+-/* Architectures may override the below functions */
+-int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
+- unsigned long buf_len);
+-void *arch_kexec_kernel_image_load(struct kimage *image);
+-int arch_kimage_file_post_load_cleanup(struct kimage *image);
+-#ifdef CONFIG_KEXEC_SIG
+-int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
+- unsigned long buf_len);
++#ifndef arch_kexec_kernel_image_probe
++static inline int
++arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigned long buf_len)
++{
++ return kexec_image_probe_default(image, buf, buf_len);
++}
++#endif
++
++#ifndef arch_kimage_file_post_load_cleanup
++static inline int arch_kimage_file_post_load_cleanup(struct kimage *image)
++{
++ return kexec_image_post_load_cleanup_default(image);
++}
++#endif
++
++#ifndef arch_kexec_kernel_image_load
++static inline void *arch_kexec_kernel_image_load(struct kimage *image)
++{
++ return kexec_image_load_default(image);
++}
+ #endif
+-int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
+
+ extern int kexec_add_buffer(struct kexec_buf *kbuf);
+ int kexec_locate_mem_hole(struct kexec_buf *kbuf);
+
++#ifndef arch_kexec_locate_mem_hole
++/**
++ * arch_kexec_locate_mem_hole - Find free memory to place the segments.
++ * @kbuf: Parameters for the memory search.
++ *
++ * On success, kbuf->mem will have the start address of the memory region found.
++ *
++ * Return: 0 on success, negative errno on error.
++ */
++static inline int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
++{
++ return kexec_locate_mem_hole(kbuf);
++}
++#endif
++
+ /* Alignment required for elf header segment */
+ #define ELF_CORE_HEADER_ALIGN 4096
+
+diff --git a/include/linux/kfifo.h b/include/linux/kfifo.h
+index 86249476b57f4..0b35a41440ff1 100644
+--- a/include/linux/kfifo.h
++++ b/include/linux/kfifo.h
+@@ -688,7 +688,7 @@ __kfifo_uint_must_check_helper( \
+ * writer, you don't need extra locking to use these macro.
+ */
+ #define kfifo_to_user(fifo, to, len, copied) \
+-__kfifo_uint_must_check_helper( \
++__kfifo_int_must_check_helper( \
+ ({ \
+ typeof((fifo) + 1) __tmp = (fifo); \
+ void __user *__to = (to); \
+diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
+index ac1ebb37a0ffd..f328a01db4fe9 100644
+--- a/include/linux/kvm_types.h
++++ b/include/linux/kvm_types.h
+@@ -19,6 +19,7 @@ struct kvm_memslots;
+ enum kvm_mr_change;
+
+ #include <linux/bits.h>
++#include <linux/mutex.h>
+ #include <linux/types.h>
+ #include <linux/spinlock_types.h>
+
+@@ -69,6 +70,7 @@ struct gfn_to_pfn_cache {
+ struct kvm_vcpu *vcpu;
+ struct list_head list;
+ rwlock_t lock;
++ struct mutex refresh_lock;
+ void *khva;
+ kvm_pfn_t pfn;
+ enum pfn_cache_usage usage;
+diff --git a/include/linux/lockd/xdr.h b/include/linux/lockd/xdr.h
+index 398f70093cd35..67e4a2c5500bd 100644
+--- a/include/linux/lockd/xdr.h
++++ b/include/linux/lockd/xdr.h
+@@ -41,6 +41,8 @@ struct nlm_lock {
+ struct nfs_fh fh;
+ struct xdr_netobj oh;
+ u32 svid;
++ u64 lock_start;
++ u64 lock_len;
+ struct file_lock fl;
+ };
+
+diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
+index 467b94257105e..0a9afe84ea44e 100644
+--- a/include/linux/lockdep.h
++++ b/include/linux/lockdep.h
+@@ -192,7 +192,7 @@ static inline void
+ lockdep_init_map_waits(struct lockdep_map *lock, const char *name,
+ struct lock_class_key *key, int subclass, u8 inner, u8 outer)
+ {
+- lockdep_init_map_type(lock, name, key, subclass, inner, LD_WAIT_INV, LD_LOCK_NORMAL);
++ lockdep_init_map_type(lock, name, key, subclass, inner, outer, LD_LOCK_NORMAL);
+ }
+
+ static inline void
+@@ -215,24 +215,28 @@ static inline void lockdep_init_map(struct lockdep_map *lock, const char *name,
+ * or they are too narrow (they suffer from a false class-split):
+ */
+ #define lockdep_set_class(lock, key) \
+- lockdep_init_map_waits(&(lock)->dep_map, #key, key, 0, \
+- (lock)->dep_map.wait_type_inner, \
+- (lock)->dep_map.wait_type_outer)
++ lockdep_init_map_type(&(lock)->dep_map, #key, key, 0, \
++ (lock)->dep_map.wait_type_inner, \
++ (lock)->dep_map.wait_type_outer, \
++ (lock)->dep_map.lock_type)
+
+ #define lockdep_set_class_and_name(lock, key, name) \
+- lockdep_init_map_waits(&(lock)->dep_map, name, key, 0, \
+- (lock)->dep_map.wait_type_inner, \
+- (lock)->dep_map.wait_type_outer)
++ lockdep_init_map_type(&(lock)->dep_map, name, key, 0, \
++ (lock)->dep_map.wait_type_inner, \
++ (lock)->dep_map.wait_type_outer, \
++ (lock)->dep_map.lock_type)
+
+ #define lockdep_set_class_and_subclass(lock, key, sub) \
+- lockdep_init_map_waits(&(lock)->dep_map, #key, key, sub,\
+- (lock)->dep_map.wait_type_inner, \
+- (lock)->dep_map.wait_type_outer)
++ lockdep_init_map_type(&(lock)->dep_map, #key, key, sub, \
++ (lock)->dep_map.wait_type_inner, \
++ (lock)->dep_map.wait_type_outer, \
++ (lock)->dep_map.lock_type)
+
+ #define lockdep_set_subclass(lock, sub) \
+- lockdep_init_map_waits(&(lock)->dep_map, #lock, (lock)->dep_map.key, sub,\
+- (lock)->dep_map.wait_type_inner, \
+- (lock)->dep_map.wait_type_outer)
++ lockdep_init_map_type(&(lock)->dep_map, #lock, (lock)->dep_map.key, sub,\
++ (lock)->dep_map.wait_type_inner, \
++ (lock)->dep_map.wait_type_outer, \
++ (lock)->dep_map.lock_type)
+
+ #define lockdep_set_novalidate_class(lock) \
+ lockdep_set_class_and_name(lock, &__lockdep_no_validate__, #lock)
+diff --git a/include/linux/mbcache.h b/include/linux/mbcache.h
+index 20f1e3ff60130..8eca7f25c4320 100644
+--- a/include/linux/mbcache.h
++++ b/include/linux/mbcache.h
+@@ -30,15 +30,23 @@ void mb_cache_destroy(struct mb_cache *cache);
+ int mb_cache_entry_create(struct mb_cache *cache, gfp_t mask, u32 key,
+ u64 value, bool reusable);
+ void __mb_cache_entry_free(struct mb_cache_entry *entry);
++void mb_cache_entry_wait_unused(struct mb_cache_entry *entry);
+ static inline int mb_cache_entry_put(struct mb_cache *cache,
+ struct mb_cache_entry *entry)
+ {
+- if (!atomic_dec_and_test(&entry->e_refcnt))
++ unsigned int cnt = atomic_dec_return(&entry->e_refcnt);
++
++ if (cnt > 0) {
++ if (cnt <= 3)
++ wake_up_var(&entry->e_refcnt);
+ return 0;
++ }
+ __mb_cache_entry_free(entry);
+ return 1;
+ }
+
++struct mb_cache_entry *mb_cache_entry_delete_or_get(struct mb_cache *cache,
++ u32 key, u64 value);
+ void mb_cache_entry_delete(struct mb_cache *cache, u32 key, u64 value);
+ struct mb_cache_entry *mb_cache_entry_get(struct mb_cache *cache, u32 key,
+ u64 value);
+diff --git a/include/linux/mfd/t7l66xb.h b/include/linux/mfd/t7l66xb.h
+index 69632c1b07bd8..ae3e7a5c5219b 100644
+--- a/include/linux/mfd/t7l66xb.h
++++ b/include/linux/mfd/t7l66xb.h
+@@ -12,7 +12,6 @@
+
+ struct t7l66xb_platform_data {
+ int (*enable)(struct platform_device *dev);
+- int (*disable)(struct platform_device *dev);
+ int (*suspend)(struct platform_device *dev);
+ int (*resume)(struct platform_device *dev);
+
+diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
+index 9424503eb8d30..3d1594bad4ec8 100644
+--- a/include/linux/mlx5/driver.h
++++ b/include/linux/mlx5/driver.h
+@@ -445,6 +445,11 @@ struct mlx5_qp_table {
+ struct radix_tree_root tree;
+ };
+
++enum {
++ MLX5_PF_NOTIFY_DISABLE_VF,
++ MLX5_PF_NOTIFY_ENABLE_VF,
++};
++
+ struct mlx5_vf_context {
+ int enabled;
+ u64 port_guid;
+@@ -455,6 +460,7 @@ struct mlx5_vf_context {
+ u8 port_guid_valid:1;
+ u8 node_guid_valid:1;
+ enum port_state_policy policy;
++ struct blocking_notifier_head notifier;
+ };
+
+ struct mlx5_core_sriov {
+@@ -1155,6 +1161,12 @@ int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type
+ struct mlx5_core_dev *mlx5_vf_get_core_dev(struct pci_dev *pdev);
+ void mlx5_vf_put_core_dev(struct mlx5_core_dev *mdev);
+
++int mlx5_sriov_blocking_notifier_register(struct mlx5_core_dev *mdev,
++ int vf_id,
++ struct notifier_block *nb);
++void mlx5_sriov_blocking_notifier_unregister(struct mlx5_core_dev *mdev,
++ int vf_id,
++ struct notifier_block *nb);
+ #ifdef CONFIG_MLX5_CORE_IPOIB
+ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
+ struct ib_device *ibdev,
+diff --git a/include/linux/nvme-fc-driver.h b/include/linux/nvme-fc-driver.h
+index 5358a5facdee6..fa092b9be2fdc 100644
+--- a/include/linux/nvme-fc-driver.h
++++ b/include/linux/nvme-fc-driver.h
+@@ -564,6 +564,15 @@ int nvme_fc_rcv_ls_req(struct nvme_fc_remote_port *remoteport,
+ void *lsreqbuf, u32 lsreqbuf_len);
+
+
++/*
++ * Routine called to get the appid field associated with request by the lldd
++ *
++ * If the return value is NULL : the user/libvirt has not set the appid to VM
++ * If the return value is non-zero: Returns the appid associated with VM
++ *
++ * @req: IO request from nvme fc to driver
++ */
++char *nvme_fc_io_getuuid(struct nvmefc_fcp_req *req);
+
+ /*
+ * *************** LLDD FC-NVME Target/Subsystem API ***************
+@@ -1048,5 +1057,10 @@ int nvmet_fc_rcv_fcp_req(struct nvmet_fc_target_port *tgtport,
+
+ void nvmet_fc_rcv_fcp_abort(struct nvmet_fc_target_port *tgtport,
+ struct nvmefc_tgt_fcp_req *fcpreq);
++/*
++ * add a define, visible to the compiler, that indicates support
++ * for feature. Allows for conditional compilation in LLDDs.
++ */
++#define NVME_FC_FEAT_UUID 0x0001
+
+ #endif /* _NVME_FC_DRIVER_H */
+diff --git a/include/linux/once_lite.h b/include/linux/once_lite.h
+index 861e606b820fa..b7bce4983638f 100644
+--- a/include/linux/once_lite.h
++++ b/include/linux/once_lite.h
+@@ -9,15 +9,27 @@
+ */
+ #define DO_ONCE_LITE(func, ...) \
+ DO_ONCE_LITE_IF(true, func, ##__VA_ARGS__)
+-#define DO_ONCE_LITE_IF(condition, func, ...) \
++
++#define __ONCE_LITE_IF(condition) \
+ ({ \
+ static bool __section(".data.once") __already_done; \
+- bool __ret_do_once = !!(condition); \
++ bool __ret_cond = !!(condition); \
++ bool __ret_once = false; \
+ \
+- if (unlikely(__ret_do_once && !__already_done)) { \
++ if (unlikely(__ret_cond && !__already_done)) { \
+ __already_done = true; \
+- func(__VA_ARGS__); \
++ __ret_once = true; \
+ } \
++ unlikely(__ret_once); \
++ })
++
++#define DO_ONCE_LITE_IF(condition, func, ...) \
++ ({ \
++ bool __ret_do_once = !!(condition); \
++ \
++ if (__ONCE_LITE_IF(__ret_do_once)) \
++ func(__VA_ARGS__); \
++ \
+ unlikely(__ret_do_once); \
+ })
+
+diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
+index af97dd427501c..a411080d5169f 100644
+--- a/include/linux/perf_event.h
++++ b/include/linux/perf_event.h
+@@ -1063,6 +1063,22 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
+ data->txn = 0;
+ }
+
++/*
++ * Clear all bitfields in the perf_branch_entry.
++ * The to and from fields are not cleared because they are
++ * systematically modified by caller.
++ */
++static inline void perf_clear_branch_entry_bitfields(struct perf_branch_entry *br)
++{
++ br->mispred = 0;
++ br->predicted = 0;
++ br->in_tx = 0;
++ br->abort = 0;
++ br->cycles = 0;
++ br->type = 0;
++ br->reserved = 0;
++}
++
+ extern void perf_output_sample(struct perf_output_handle *handle,
+ struct perf_event_header *header,
+ struct perf_sample_data *data,
+diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
+index cb0fd633a6106..4ea4969241062 100644
+--- a/include/linux/pipe_fs_i.h
++++ b/include/linux/pipe_fs_i.h
+@@ -229,6 +229,15 @@ static inline bool pipe_buf_try_steal(struct pipe_inode_info *pipe,
+ return buf->ops->try_steal(pipe, buf);
+ }
+
++static inline void pipe_discard_from(struct pipe_inode_info *pipe,
++ unsigned int old_head)
++{
++ unsigned int mask = pipe->ring_size - 1;
++
++ while (pipe->head > old_head)
++ pipe_buf_release(pipe, &pipe->bufs[--pipe->head & mask]);
++}
++
+ /* Differs from PIPE_BUF in that PIPE_SIZE is the length of the actual
+ memory allocation, whereas PIPE_BUF makes atomicity guarantees. */
+ #define PIPE_SIZE PAGE_SIZE
+diff --git a/include/linux/rmap.h b/include/linux/rmap.h
+index 17230c458341e..a0c4a870bb484 100644
+--- a/include/linux/rmap.h
++++ b/include/linux/rmap.h
+@@ -220,8 +220,8 @@ struct page_vma_mapped_walk {
+ #define DEFINE_PAGE_VMA_WALK(name, _page, _vma, _address, _flags) \
+ struct page_vma_mapped_walk name = { \
+ .pfn = page_to_pfn(_page), \
+- .nr_pages = compound_nr(page), \
+- .pgoff = page_to_pgoff(page), \
++ .nr_pages = compound_nr(_page), \
++ .pgoff = page_to_pgoff(_page), \
+ .vma = _vma, \
+ .address = _address, \
+ .flags = _flags, \
+diff --git a/include/linux/sched.h b/include/linux/sched.h
+index a8911b1f35aad..d438e39fffe50 100644
+--- a/include/linux/sched.h
++++ b/include/linux/sched.h
+@@ -1812,7 +1812,7 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags)
+ }
+
+ extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
+-extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_allowed);
++extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_effective_cpus);
+ #ifdef CONFIG_SMP
+ extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
+ extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
+diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
+index e5af028c08b49..994c25640e156 100644
+--- a/include/linux/sched/rt.h
++++ b/include/linux/sched/rt.h
+@@ -39,20 +39,12 @@ static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *p)
+ }
+ extern void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task);
+ extern void rt_mutex_adjust_pi(struct task_struct *p);
+-static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
+-{
+- return tsk->pi_blocked_on != NULL;
+-}
+ #else
+ static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
+ {
+ return NULL;
+ }
+ # define rt_mutex_adjust_pi(p) do { } while (0)
+-static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
+-{
+- return false;
+-}
+ #endif
+
+ extern void normalize_rt_tasks(void);
+diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
+index 56cffe42abbc4..816df6cc444e1 100644
+--- a/include/linux/sched/topology.h
++++ b/include/linux/sched/topology.h
+@@ -81,6 +81,7 @@ struct sched_domain_shared {
+ atomic_t ref;
+ atomic_t nr_busy_cpus;
+ int has_idle_cores;
++ int nr_idle_scan;
+ };
+
+ struct sched_domain {
+diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
+index 76ce3f3ac0f22..bf6f0decb3f6d 100644
+--- a/include/linux/soundwire/sdw.h
++++ b/include/linux/soundwire/sdw.h
+@@ -646,9 +646,6 @@ struct sdw_slave_ops {
+ * @dev_num: Current Device Number, values can be 0 or dev_num_sticky
+ * @dev_num_sticky: one-time static Device Number assigned by Bus
+ * @probed: boolean tracking driver state
+- * @probe_complete: completion utility to control potential races
+- * on startup between driver probe/initialization and SoundWire
+- * Slave state changes/implementation-defined interrupts
+ * @enumeration_complete: completion utility to control potential races
+ * on startup between device enumeration and read/write access to the
+ * Slave device
+@@ -663,6 +660,7 @@ struct sdw_slave_ops {
+ * for a Slave happens for the first time after enumeration
+ * @is_mockup_device: status flag used to squelch errors in the command/control
+ * protocol for SoundWire mockup devices
++ * @sdw_dev_lock: mutex used to protect callbacks/remove races
+ */
+ struct sdw_slave {
+ struct sdw_slave_id id;
+@@ -680,12 +678,12 @@ struct sdw_slave {
+ u16 dev_num;
+ u16 dev_num_sticky;
+ bool probed;
+- struct completion probe_complete;
+ struct completion enumeration_complete;
+ struct completion initialization_complete;
+ u32 unattach_request;
+ bool first_interrupt_done;
+ bool is_mockup_device;
++ struct mutex sdw_dev_lock; /* protect callbacks/remove races */
+ };
+
+ #define dev_to_sdw_dev(_dev) container_of(_dev, struct sdw_slave, dev)
+diff --git a/include/linux/swapops.h b/include/linux/swapops.h
+index d356ab4047f77..f99f0bd85724e 100644
+--- a/include/linux/swapops.h
++++ b/include/linux/swapops.h
+@@ -216,8 +216,10 @@ extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
+ spinlock_t *ptl);
+ extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address);
+-extern void migration_entry_wait_huge(struct vm_area_struct *vma,
+- struct mm_struct *mm, pte_t *pte);
++#ifdef CONFIG_HUGETLB_PAGE
++extern void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl);
++extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
++#endif
+ #else
+ static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
+ {
+@@ -238,8 +240,10 @@ static inline void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
+ spinlock_t *ptl) { }
+ static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address) { }
+-static inline void migration_entry_wait_huge(struct vm_area_struct *vma,
+- struct mm_struct *mm, pte_t *pte) { }
++#ifdef CONFIG_HUGETLB_PAGE
++static inline void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl) { }
++static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
++#endif
+ static inline int is_writable_migration_entry(swp_entry_t entry)
+ {
+ return 0;
+diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h
+index 739ba9a03ec16..20c0ff54b7a0d 100644
+--- a/include/linux/tpm_eventlog.h
++++ b/include/linux/tpm_eventlog.h
+@@ -157,7 +157,7 @@ struct tcg_algorithm_info {
+ * Return: size of the event on success, 0 on failure
+ */
+
+-static inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *event,
++static __always_inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *event,
+ struct tcg_pcr_event *event_header,
+ bool do_mapping)
+ {
+diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
+index e6e95a9f07a52..b18759a673c66 100644
+--- a/include/linux/trace_events.h
++++ b/include/linux/trace_events.h
+@@ -916,6 +916,24 @@ perf_trace_buf_submit(void *raw_data, int size, int rctx, u16 type,
+
+ #endif
+
++#define TRACE_EVENT_STR_MAX 512
++
++/*
++ * gcc warns that you can not use a va_list in an inlined
++ * function. But lets me make it into a macro :-/
++ */
++#define __trace_event_vstr_len(fmt, va) \
++({ \
++ va_list __ap; \
++ int __ret; \
++ \
++ va_copy(__ap, *(va)); \
++ __ret = vsnprintf(NULL, 0, fmt, __ap) + 1; \
++ va_end(__ap); \
++ \
++ min(__ret, TRACE_EVENT_STR_MAX); \
++})
++
+ #endif /* _LINUX_TRACE_EVENT_H */
+
+ /*
+diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
+index 2c1fc9212cf28..98d1921f02b1e 100644
+--- a/include/linux/usb/hcd.h
++++ b/include/linux/usb/hcd.h
+@@ -66,6 +66,7 @@
+
+ struct giveback_urb_bh {
+ bool running;
++ bool high_prio;
+ spinlock_t lock;
+ struct list_head head;
+ struct tasklet_struct bh;
+diff --git a/include/linux/wait.h b/include/linux/wait.h
+index 851e07da2583f..58cfbf81447cc 100644
+--- a/include/linux/wait.h
++++ b/include/linux/wait.h
+@@ -544,10 +544,11 @@ do { \
+ \
+ hrtimer_init_sleeper_on_stack(&__t, CLOCK_MONOTONIC, \
+ HRTIMER_MODE_REL); \
+- if ((timeout) != KTIME_MAX) \
+- hrtimer_start_range_ns(&__t.timer, timeout, \
+- current->timer_slack_ns, \
+- HRTIMER_MODE_REL); \
++ if ((timeout) != KTIME_MAX) { \
++ hrtimer_set_expires_range_ns(&__t.timer, timeout, \
++ current->timer_slack_ns); \
++ hrtimer_sleeper_start_expires(&__t, HRTIMER_MODE_REL); \
++ } \
+ \
+ __ret = ___wait_event(wq_head, condition, state, 0, 0, \
+ if (!__t.task) { \
+diff --git a/include/media/hevc-ctrls.h b/include/media/hevc-ctrls.h
+index 01ccda48d8c57..88e804578cb19 100644
+--- a/include/media/hevc-ctrls.h
++++ b/include/media/hevc-ctrls.h
+@@ -135,7 +135,7 @@ struct v4l2_hevc_dpb_entry {
+ __u64 timestamp;
+ __u8 flags;
+ __u8 field_pic;
+- __u16 pic_order_cnt[2];
++ __s32 pic_order_cnt_val;
+ __u8 padding[2];
+ };
+
+@@ -178,7 +178,7 @@ struct v4l2_ctrl_hevc_slice_params {
+ /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
+ __u8 slice_type;
+ __u8 colour_plane_id;
+- __u16 slice_pic_order_cnt;
++ __s32 slice_pic_order_cnt;
+ __u8 num_ref_idx_l0_active_minus1;
+ __u8 num_ref_idx_l1_active_minus1;
+ __u8 collocated_ref_idx;
+diff --git a/include/net/9p/client.h b/include/net/9p/client.h
+index ec1d1706f43c0..cb78e0e333324 100644
+--- a/include/net/9p/client.h
++++ b/include/net/9p/client.h
+@@ -76,7 +76,7 @@ enum p9_req_status_t {
+ struct p9_req_t {
+ int status;
+ int t_err;
+- struct kref refcount;
++ refcount_t refcount;
+ wait_queue_head_t wq;
+ struct p9_fcall tc;
+ struct p9_fcall rc;
+@@ -227,15 +227,15 @@ struct p9_req_t *p9_tag_lookup(struct p9_client *c, u16 tag);
+
+ static inline void p9_req_get(struct p9_req_t *r)
+ {
+- kref_get(&r->refcount);
++ refcount_inc(&r->refcount);
+ }
+
+ static inline int p9_req_try_get(struct p9_req_t *r)
+ {
+- return kref_get_unless_zero(&r->refcount);
++ return refcount_inc_not_zero(&r->refcount);
+ }
+
+-int p9_req_put(struct p9_req_t *r);
++int p9_req_put(struct p9_client *c, struct p9_req_t *r);
+
+ void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status);
+
+diff --git a/include/net/ax25.h b/include/net/ax25.h
+index a427a05672e2a..f8cf3629a4193 100644
+--- a/include/net/ax25.h
++++ b/include/net/ax25.h
+@@ -236,6 +236,7 @@ typedef struct ax25_cb {
+ ax25_address source_addr, dest_addr;
+ ax25_digi *digipeat;
+ ax25_dev *ax25_dev;
++ netdevice_tracker dev_tracker;
+ unsigned char iamdigi;
+ unsigned char state, modulus, pidincl;
+ unsigned short vs, vr, va;
+diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
+index 62a9bb022aedf..5f2342205ae16 100644
+--- a/include/net/bluetooth/hci.h
++++ b/include/net/bluetooth/hci.h
+@@ -361,6 +361,7 @@ enum {
+ HCI_QUALITY_REPORT,
+ HCI_OFFLOAD_CODECS_ENABLED,
+ HCI_LE_SIMULTANEOUS_ROLES,
++ HCI_CMD_DRAIN_WORKQUEUE,
+
+ __HCI_NUM_FLAGS,
+ };
+diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
+index 81b9659530368..56f1286583d3c 100644
+--- a/include/net/inet6_hashtables.h
++++ b/include/net/inet6_hashtables.h
+@@ -103,15 +103,24 @@ struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo,
+ const int dif);
+
+ int inet6_hash(struct sock *sk);
+-#endif /* IS_ENABLED(CONFIG_IPV6) */
+
+-#define INET6_MATCH(__sk, __net, __saddr, __daddr, __ports, __dif, __sdif) \
+- (((__sk)->sk_portpair == (__ports)) && \
+- ((__sk)->sk_family == AF_INET6) && \
+- ipv6_addr_equal(&(__sk)->sk_v6_daddr, (__saddr)) && \
+- ipv6_addr_equal(&(__sk)->sk_v6_rcv_saddr, (__daddr)) && \
+- (((__sk)->sk_bound_dev_if == (__dif)) || \
+- ((__sk)->sk_bound_dev_if == (__sdif))) && \
+- net_eq(sock_net(__sk), (__net)))
++static inline bool inet6_match(struct net *net, const struct sock *sk,
++ const struct in6_addr *saddr,
++ const struct in6_addr *daddr,
++ const __portpair ports,
++ const int dif, const int sdif)
++{
++ if (!net_eq(sock_net(sk), net) ||
++ sk->sk_family != AF_INET6 ||
++ sk->sk_portpair != ports ||
++ !ipv6_addr_equal(&sk->sk_v6_daddr, saddr) ||
++ !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, daddr))
++ return false;
++
++ /* READ_ONCE() paired with WRITE_ONCE() in sock_bindtoindex_locked() */
++ return inet_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif,
++ sdif);
++}
++#endif /* IS_ENABLED(CONFIG_IPV6) */
+
+ #endif /* _INET6_HASHTABLES_H */
+diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
+index 749bb1e460871..53c22b64e9724 100644
+--- a/include/net/inet_hashtables.h
++++ b/include/net/inet_hashtables.h
+@@ -203,17 +203,6 @@ static inline void inet_ehash_locks_free(struct inet_hashinfo *hashinfo)
+ hashinfo->ehash_locks = NULL;
+ }
+
+-static inline bool inet_sk_bound_dev_eq(struct net *net, int bound_dev_if,
+- int dif, int sdif)
+-{
+-#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
+- return inet_bound_dev_eq(!!READ_ONCE(net->ipv4.sysctl_tcp_l3mdev_accept),
+- bound_dev_if, dif, sdif);
+-#else
+- return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
+-#endif
+-}
+-
+ struct inet_bind_bucket *
+ inet_bind_bucket_create(struct kmem_cache *cachep, struct net *net,
+ struct inet_bind_hashbucket *head,
+@@ -295,7 +284,6 @@ static inline struct sock *inet_lookup_listener(struct net *net,
+ ((__force __portpair)(((__u32)(__dport) << 16) | (__force __u32)(__be16)(__sport)))
+ #endif
+
+-#if (BITS_PER_LONG == 64)
+ #ifdef __BIG_ENDIAN
+ #define INET_ADDR_COOKIE(__name, __saddr, __daddr) \
+ const __addrpair __name = (__force __addrpair) ( \
+@@ -307,24 +295,20 @@ static inline struct sock *inet_lookup_listener(struct net *net,
+ (((__force __u64)(__be32)(__daddr)) << 32) | \
+ ((__force __u64)(__be32)(__saddr)))
+ #endif /* __BIG_ENDIAN */
+-#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif, __sdif) \
+- (((__sk)->sk_portpair == (__ports)) && \
+- ((__sk)->sk_addrpair == (__cookie)) && \
+- (((__sk)->sk_bound_dev_if == (__dif)) || \
+- ((__sk)->sk_bound_dev_if == (__sdif))) && \
+- net_eq(sock_net(__sk), (__net)))
+-#else /* 32-bit arch */
+-#define INET_ADDR_COOKIE(__name, __saddr, __daddr) \
+- const int __name __deprecated __attribute__((unused))
+-
+-#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif, __sdif) \
+- (((__sk)->sk_portpair == (__ports)) && \
+- ((__sk)->sk_daddr == (__saddr)) && \
+- ((__sk)->sk_rcv_saddr == (__daddr)) && \
+- (((__sk)->sk_bound_dev_if == (__dif)) || \
+- ((__sk)->sk_bound_dev_if == (__sdif))) && \
+- net_eq(sock_net(__sk), (__net)))
+-#endif /* 64-bit arch */
++
++static inline bool INET_MATCH(struct net *net, const struct sock *sk,
++ const __addrpair cookie, const __portpair ports,
++ int dif, int sdif)
++{
++ if (!net_eq(sock_net(sk), net) ||
++ sk->sk_portpair != ports ||
++ sk->sk_addrpair != cookie)
++ return false;
++
++ /* READ_ONCE() paired with WRITE_ONCE() in sock_bindtoindex_locked() */
++ return inet_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif,
++ sdif);
++}
+
+ /* Sockets in TCP_CLOSE state are _always_ taken out of the hash, so we need
+ * not check it for lookups anymore, thanks Alexey. -DaveM
+diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
+index 6395f6b9a5d29..bf5654ce711ef 100644
+--- a/include/net/inet_sock.h
++++ b/include/net/inet_sock.h
+@@ -149,6 +149,17 @@ static inline bool inet_bound_dev_eq(bool l3mdev_accept, int bound_dev_if,
+ return bound_dev_if == dif || bound_dev_if == sdif;
+ }
+
++static inline bool inet_sk_bound_dev_eq(struct net *net, int bound_dev_if,
++ int dif, int sdif)
++{
++#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
++ return inet_bound_dev_eq(!!READ_ONCE(net->ipv4.sysctl_tcp_l3mdev_accept),
++ bound_dev_if, dif, sdif);
++#else
++ return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
++#endif
++}
++
+ struct inet_cork {
+ unsigned int flags;
+ __be32 addr;
+diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
+index 44a35531952e1..3372a1f67cf4e 100644
+--- a/include/net/pkt_sched.h
++++ b/include/net/pkt_sched.h
+@@ -173,11 +173,28 @@ struct tc_taprio_qopt_offload {
+ struct tc_taprio_sched_entry entries[];
+ };
+
++#if IS_ENABLED(CONFIG_NET_SCH_TAPRIO)
++
+ /* Reference counting */
+ struct tc_taprio_qopt_offload *taprio_offload_get(struct tc_taprio_qopt_offload
+ *offload);
+ void taprio_offload_free(struct tc_taprio_qopt_offload *offload);
+
++#else
++
++/* Reference counting */
++static inline struct tc_taprio_qopt_offload *
++taprio_offload_get(struct tc_taprio_qopt_offload *offload)
++{
++ return NULL;
++}
++
++static inline void taprio_offload_free(struct tc_taprio_qopt_offload *offload)
++{
++}
++
++#endif
++
+ /* Ensure skb_mstamp_ns, which might have been populated with the txtime, is
+ * not mistaken for a software timestamp, because this will otherwise prevent
+ * the dispatch of hardware timestamps to the socket.
+diff --git a/include/net/raw.h b/include/net/raw.h
+index c51a635671a73..537d9d1df890d 100644
+--- a/include/net/raw.h
++++ b/include/net/raw.h
+@@ -20,9 +20,8 @@
+ extern struct proto raw_prot;
+
+ extern struct raw_hashinfo raw_v4_hashinfo;
+-struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,
+- unsigned short num, __be32 raddr,
+- __be32 laddr, int dif, int sdif);
++bool raw_v4_match(struct net *net, struct sock *sk, unsigned short num,
++ __be32 raddr, __be32 laddr, int dif, int sdif);
+
+ int raw_abort(struct sock *sk, int err);
+ void raw_icmp_error(struct sk_buff *, int, u32);
+@@ -34,9 +33,18 @@ int raw_rcv(struct sock *, struct sk_buff *);
+
+ struct raw_hashinfo {
+ rwlock_t lock;
+- struct hlist_head ht[RAW_HTABLE_SIZE];
++ struct hlist_nulls_head ht[RAW_HTABLE_SIZE];
+ };
+
++static inline void raw_hashinfo_init(struct raw_hashinfo *hashinfo)
++{
++ int i;
++
++ rwlock_init(&hashinfo->lock);
++ for (i = 0; i < RAW_HTABLE_SIZE; i++)
++ INIT_HLIST_NULLS_HEAD(&hashinfo->ht[i], i);
++}
++
+ #ifdef CONFIG_PROC_FS
+ int raw_proc_init(void);
+ void raw_proc_exit(void);
+diff --git a/include/net/rawv6.h b/include/net/rawv6.h
+index 53d86b6055e8c..bc70909625f60 100644
+--- a/include/net/rawv6.h
++++ b/include/net/rawv6.h
+@@ -3,11 +3,12 @@
+ #define _NET_RAWV6_H
+
+ #include <net/protocol.h>
++#include <net/raw.h>
+
+ extern struct raw_hashinfo raw_v6_hashinfo;
+-struct sock *__raw_v6_lookup(struct net *net, struct sock *sk,
+- unsigned short num, const struct in6_addr *loc_addr,
+- const struct in6_addr *rmt_addr, int dif, int sdif);
++bool raw_v6_match(struct net *net, struct sock *sk, unsigned short num,
++ const struct in6_addr *loc_addr,
++ const struct in6_addr *rmt_addr, int dif, int sdif);
+
+ int raw_abort(struct sock *sk, int err);
+
+diff --git a/include/net/sock.h b/include/net/sock.h
+index 9563a093fdfc1..810cb96b7f390 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -161,9 +161,6 @@ typedef __u64 __bitwise __addrpair;
+ * for struct sock and struct inet_timewait_sock.
+ */
+ struct sock_common {
+- /* skc_daddr and skc_rcv_saddr must be grouped on a 8 bytes aligned
+- * address on 64bit arches : cf INET_MATCH()
+- */
+ union {
+ __addrpair skc_addrpair;
+ struct {
+@@ -1557,19 +1554,23 @@ static inline bool sk_has_account(struct sock *sk)
+
+ static inline bool sk_wmem_schedule(struct sock *sk, int size)
+ {
++ int delta;
++
+ if (!sk_has_account(sk))
+ return true;
+- return size <= sk->sk_forward_alloc ||
+- __sk_mem_schedule(sk, size, SK_MEM_SEND);
++ delta = size - sk->sk_forward_alloc;
++ return delta <= 0 || __sk_mem_schedule(sk, delta, SK_MEM_SEND);
+ }
+
+ static inline bool
+ sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
+ {
++ int delta;
++
+ if (!sk_has_account(sk))
+ return true;
+- return size <= sk->sk_forward_alloc ||
+- __sk_mem_schedule(sk, size, SK_MEM_RECV) ||
++ delta = size - sk->sk_forward_alloc;
++ return delta <= 0 || __sk_mem_schedule(sk, delta, SK_MEM_RECV) ||
+ skb_pfmemalloc(skb);
+ }
+
+diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
+index 4aa0318496688..0774ce97c2f1b 100644
+--- a/include/net/xdp_sock_drv.h
++++ b/include/net/xdp_sock_drv.h
+@@ -95,6 +95,13 @@ static inline void xsk_buff_free(struct xdp_buff *xdp)
+ xp_free(xskb);
+ }
+
++static inline void xsk_buff_discard(struct xdp_buff *xdp)
++{
++ struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp);
++
++ xp_release(xskb);
++}
++
+ static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
+ {
+ xdp->data = xdp->data_hard_start + XDP_PACKET_HEADROOM;
+@@ -238,6 +245,10 @@ static inline void xsk_buff_free(struct xdp_buff *xdp)
+ {
+ }
+
++static inline void xsk_buff_discard(struct xdp_buff *xdp)
++{
++}
++
+ static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
+ {
+ }
+diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
+index c0703cd20a993..9758a4a9923f5 100644
+--- a/include/scsi/libiscsi.h
++++ b/include/scsi/libiscsi.h
+@@ -411,7 +411,7 @@ extern int iscsi_host_add(struct Scsi_Host *shost, struct device *pdev);
+ extern struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
+ int dd_data_size,
+ bool xmit_can_sleep);
+-extern void iscsi_host_remove(struct Scsi_Host *shost);
++extern void iscsi_host_remove(struct Scsi_Host *shost, bool is_shutdown);
+ extern void iscsi_host_free(struct Scsi_Host *shost);
+ extern int iscsi_target_alloc(struct scsi_target *starget);
+ extern int iscsi_host_get_max_scsi_cmds(struct Scsi_Host *shost,
+diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
+index 9acb8422f6802..d6eab7cb221a7 100644
+--- a/include/scsi/scsi_transport_iscsi.h
++++ b/include/scsi/scsi_transport_iscsi.h
+@@ -442,6 +442,7 @@ extern struct iscsi_cls_session *iscsi_create_session(struct Scsi_Host *shost,
+ struct iscsi_transport *t,
+ int dd_size,
+ unsigned int target_id);
++extern void iscsi_force_destroy_session(struct iscsi_cls_session *session);
+ extern void iscsi_remove_session(struct iscsi_cls_session *session);
+ extern void iscsi_free_session(struct iscsi_cls_session *session);
+ extern struct iscsi_cls_conn *iscsi_alloc_conn(struct iscsi_cls_session *sess,
+diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
+index 9b4e6c78d0f41..c90a9a2f77a94 100644
+--- a/include/soc/mscc/ocelot.h
++++ b/include/soc/mscc/ocelot.h
+@@ -568,6 +568,7 @@ struct ocelot_ops {
+ int (*psfp_stats_get)(struct ocelot *ocelot, struct flow_cls_offload *f,
+ struct flow_stats *stats);
+ void (*cut_through_fwd)(struct ocelot *ocelot);
++ void (*tas_clock_adjust)(struct ocelot *ocelot);
+ };
+
+ struct ocelot_vcap_policer {
+@@ -652,29 +653,32 @@ struct ocelot_port {
+
+ struct regmap *target;
+
+- bool vlan_aware;
++ struct net_device *bond;
++ struct net_device *bridge;
++
+ /* VLAN that untagged frames are classified to, on ingress */
+ const struct ocelot_bridge_vlan *pvid_vlan;
+
++ struct tc_taprio_qopt_offload *taprio;
++
++ phy_interface_t phy_mode;
++
+ unsigned int ptp_skbs_in_flight;
+- u8 ptp_cmd;
+ struct sk_buff_head tx_skbs;
+- u8 ts_id;
+
+- phy_interface_t phy_mode;
++ u16 mrp_ring_id;
+
+- u8 *xmit_template;
++ u8 ptp_cmd;
++ u8 ts_id;
++
++ u8 stp_state;
++ bool vlan_aware;
+ bool is_dsa_8021q_cpu;
+ bool learn_ena;
+
+- struct net_device *bond;
+ bool lag_tx_active;
+
+- u16 mrp_ring_id;
+-
+- struct net_device *bridge;
+ int bridge_num;
+- u8 stp_state;
+
+ int speed;
+ };
+@@ -743,6 +747,9 @@ struct ocelot {
+ /* Lock for serializing forwarding domain changes */
+ struct mutex fwd_domain_lock;
+
++ /* Lock for serializing Time-Aware Shaper changes */
++ struct mutex tas_lock;
++
+ struct workqueue_struct *owq;
+
+ u8 ptp:1;
+diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
+index 1779e133cea0c..d91370377523d 100644
+--- a/include/trace/events/f2fs.h
++++ b/include/trace/events/f2fs.h
+@@ -15,10 +15,6 @@ TRACE_DEFINE_ENUM(NODE);
+ TRACE_DEFINE_ENUM(DATA);
+ TRACE_DEFINE_ENUM(META);
+ TRACE_DEFINE_ENUM(META_FLUSH);
+-TRACE_DEFINE_ENUM(INMEM);
+-TRACE_DEFINE_ENUM(INMEM_DROP);
+-TRACE_DEFINE_ENUM(INMEM_INVALIDATE);
+-TRACE_DEFINE_ENUM(INMEM_REVOKE);
+ TRACE_DEFINE_ENUM(IPU);
+ TRACE_DEFINE_ENUM(OPU);
+ TRACE_DEFINE_ENUM(HOT);
+@@ -59,10 +55,6 @@ TRACE_DEFINE_ENUM(CP_RESIZE);
+ { DATA, "DATA" }, \
+ { META, "META" }, \
+ { META_FLUSH, "META_FLUSH" }, \
+- { INMEM, "INMEM" }, \
+- { INMEM_DROP, "INMEM_DROP" }, \
+- { INMEM_INVALIDATE, "INMEM_INVALIDATE" }, \
+- { INMEM_REVOKE, "INMEM_REVOKE" }, \
+ { IPU, "IN-PLACE" }, \
+ { OPU, "OUT-OF-PLACE" })
+
+@@ -1289,20 +1281,6 @@ DEFINE_EVENT(f2fs__page, f2fs_vm_page_mkwrite,
+ TP_ARGS(page, type)
+ );
+
+-DEFINE_EVENT(f2fs__page, f2fs_register_inmem_page,
+-
+- TP_PROTO(struct page *page, int type),
+-
+- TP_ARGS(page, type)
+-);
+-
+-DEFINE_EVENT(f2fs__page, f2fs_commit_inmem_page,
+-
+- TP_PROTO(struct page *page, int type),
+-
+- TP_ARGS(page, type)
+-);
+-
+ TRACE_EVENT(f2fs_filemap_fault,
+
+ TP_PROTO(struct inode *inode, pgoff_t index, unsigned long ret),
+diff --git a/include/trace/events/spmi.h b/include/trace/events/spmi.h
+index 8b60efe18ba68..a6819fd85cdf4 100644
+--- a/include/trace/events/spmi.h
++++ b/include/trace/events/spmi.h
+@@ -21,15 +21,15 @@ TRACE_EVENT(spmi_write_begin,
+ __field ( u8, sid )
+ __field ( u16, addr )
+ __field ( u8, len )
+- __dynamic_array ( u8, buf, len + 1 )
++ __dynamic_array ( u8, buf, len )
+ ),
+
+ TP_fast_assign(
+ __entry->opcode = opcode;
+ __entry->sid = sid;
+ __entry->addr = addr;
+- __entry->len = len + 1;
+- memcpy(__get_dynamic_array(buf), buf, len + 1);
++ __entry->len = len;
++ memcpy(__get_dynamic_array(buf), buf, len);
+ ),
+
+ TP_printk("opc=%d sid=%02d addr=0x%04x len=%d buf=0x[%*phD]",
+@@ -92,7 +92,7 @@ TRACE_EVENT(spmi_read_end,
+ __field ( u16, addr )
+ __field ( int, ret )
+ __field ( u8, len )
+- __dynamic_array ( u8, buf, len + 1 )
++ __dynamic_array ( u8, buf, len )
+ ),
+
+ TP_fast_assign(
+@@ -100,8 +100,8 @@ TRACE_EVENT(spmi_read_end,
+ __entry->sid = sid;
+ __entry->addr = addr;
+ __entry->ret = ret;
+- __entry->len = len + 1;
+- memcpy(__get_dynamic_array(buf), buf, len + 1);
++ __entry->len = len;
++ memcpy(__get_dynamic_array(buf), buf, len);
+ ),
+
+ TP_printk("opc=%d sid=%02d addr=0x%04x ret=%d len=%02d buf=0x[%*phD]",
+diff --git a/include/trace/stages/stage1_struct_define.h b/include/trace/stages/stage1_struct_define.h
+index a16783419687e..1b7bab60434c1 100644
+--- a/include/trace/stages/stage1_struct_define.h
++++ b/include/trace/stages/stage1_struct_define.h
+@@ -26,6 +26,9 @@
+ #undef __string_len
+ #define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
++#undef __vstring
++#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
++
+ #undef __bitmask
+ #define __bitmask(item, nr_bits) __dynamic_array(char, item, -1)
+
+diff --git a/include/trace/stages/stage2_data_offsets.h b/include/trace/stages/stage2_data_offsets.h
+index 42fd1e8813ecf..1b7a8f764fddd 100644
+--- a/include/trace/stages/stage2_data_offsets.h
++++ b/include/trace/stages/stage2_data_offsets.h
+@@ -32,6 +32,9 @@
+ #undef __string_len
+ #define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
++#undef __vstring
++#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
++
+ #undef __bitmask
+ #define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
+
+diff --git a/include/trace/stages/stage4_event_fields.h b/include/trace/stages/stage4_event_fields.h
+index e80cdc397a436..80d34f3965555 100644
+--- a/include/trace/stages/stage4_event_fields.h
++++ b/include/trace/stages/stage4_event_fields.h
+@@ -2,16 +2,18 @@
+
+ /* Stage 4 definitions for creating trace events */
+
++#define ALIGN_STRUCTFIELD(type) ((int)(offsetof(struct {char a; type b;}, b)))
++
+ #undef __field_ext
+ #define __field_ext(_type, _item, _filter_type) { \
+ .type = #_type, .name = #_item, \
+- .size = sizeof(_type), .align = __alignof__(_type), \
++ .size = sizeof(_type), .align = ALIGN_STRUCTFIELD(_type), \
+ .is_signed = is_signed_type(_type), .filter_type = _filter_type },
+
+ #undef __field_struct_ext
+ #define __field_struct_ext(_type, _item, _filter_type) { \
+ .type = #_type, .name = #_item, \
+- .size = sizeof(_type), .align = __alignof__(_type), \
++ .size = sizeof(_type), .align = ALIGN_STRUCTFIELD(_type), \
+ 0, .filter_type = _filter_type },
+
+ #undef __field
+@@ -23,7 +25,7 @@
+ #undef __array
+ #define __array(_type, _item, _len) { \
+ .type = #_type"["__stringify(_len)"]", .name = #_item, \
+- .size = sizeof(_type[_len]), .align = __alignof__(_type), \
++ .size = sizeof(_type[_len]), .align = ALIGN_STRUCTFIELD(_type), \
+ .is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
+
+ #undef __dynamic_array
+@@ -38,6 +40,9 @@
+ #undef __string_len
+ #define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
++#undef __vstring
++#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
++
+ #undef __bitmask
+ #define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
+
+diff --git a/include/trace/stages/stage5_get_offsets.h b/include/trace/stages/stage5_get_offsets.h
+index 7ee5931300e6d..fba4c24ed9e60 100644
+--- a/include/trace/stages/stage5_get_offsets.h
++++ b/include/trace/stages/stage5_get_offsets.h
+@@ -39,6 +39,10 @@
+ #undef __string_len
+ #define __string_len(item, src, len) __dynamic_array(char, item, (len) + 1)
+
++#undef __vstring
++#define __vstring(item, fmt, ap) __dynamic_array(char, item, \
++ __trace_event_vstr_len(fmt, ap))
++
+ #undef __rel_dynamic_array
+ #define __rel_dynamic_array(type, item, len) \
+ __item_length = (len) * sizeof(type); \
+diff --git a/include/trace/stages/stage6_event_callback.h b/include/trace/stages/stage6_event_callback.h
+index e1724f73594be..3c554a5853204 100644
+--- a/include/trace/stages/stage6_event_callback.h
++++ b/include/trace/stages/stage6_event_callback.h
+@@ -24,6 +24,9 @@
+ #undef __string_len
+ #define __string_len(item, src, len) __dynamic_array(char, item, -1)
+
++#undef __vstring
++#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
++
+ #undef __assign_str
+ #define __assign_str(dst, src) \
+ strcpy(__get_str(dst), (src) ? (const char *)(src) : "(null)");
+@@ -35,6 +38,15 @@
+ __get_str(dst)[len] = '\0'; \
+ } while(0)
+
++#undef __assign_vstr
++#define __assign_vstr(dst, fmt, va) \
++ do { \
++ va_list __cp_va; \
++ va_copy(__cp_va, *(va)); \
++ vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, __cp_va); \
++ va_end(__cp_va); \
++ } while (0)
++
+ #undef __bitmask
+ #define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
+
+diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
+index d14b10b85e51f..ff9af73859cab 100644
+--- a/include/uapi/linux/bpf.h
++++ b/include/uapi/linux/bpf.h
+@@ -1013,6 +1013,7 @@ enum bpf_link_type {
+ BPF_LINK_TYPE_XDP = 6,
+ BPF_LINK_TYPE_PERF_EVENT = 7,
+ BPF_LINK_TYPE_KPROBE_MULTI = 8,
++ BPF_LINK_TYPE_STRUCT_OPS = 9,
+
+ MAX_BPF_LINK_TYPE,
+ };
+@@ -5143,6 +5144,17 @@ union bpf_attr {
+ * The **hash_algo** is returned on success,
+ * **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
+ * invalid arguments are passed.
++ *
++ * void *bpf_kptr_xchg(void *map_value, void *ptr)
++ * Description
++ * Exchange kptr at pointer *map_value* with *ptr*, and return the
++ * old value. *ptr* can be NULL, otherwise it must be a referenced
++ * pointer which will be released when this helper is called.
++ * Return
++ * The old value of kptr (which can be NULL). The returned pointer
++ * if not NULL, is a reference which must be released using its
++ * corresponding release function, or moved into a BPF map before
++ * program exit.
+ */
+ #define __BPF_FUNC_MAPPER(FN) \
+ FN(unspec), \
+@@ -5339,6 +5351,7 @@ union bpf_attr {
+ FN(copy_from_user_task), \
+ FN(skb_set_tstamp), \
+ FN(ima_file_hash), \
++ FN(kptr_xchg), \
+ /* */
+
+ /* integer value in 'imm' field of BPF_CALL instruction selects which helper
+diff --git a/include/uapi/linux/can/error.h b/include/uapi/linux/can/error.h
+index 34633283de641..a1000cb630632 100644
+--- a/include/uapi/linux/can/error.h
++++ b/include/uapi/linux/can/error.h
+@@ -120,6 +120,9 @@
+ #define CAN_ERR_TRX_CANL_SHORT_TO_GND 0x70 /* 0111 0000 */
+ #define CAN_ERR_TRX_CANL_SHORT_TO_CANH 0x80 /* 1000 0000 */
+
+-/* controller specific additional information / data[5..7] */
++/* data[5] is reserved (do not use) */
++
++/* TX error counter / data[6] */
++/* RX error counter / data[7] */
+
+ #endif /* _UAPI_CAN_ERROR_H */
+diff --git a/include/uapi/linux/f2fs.h b/include/uapi/linux/f2fs.h
+index 352a822d43709..3121d127d5aae 100644
+--- a/include/uapi/linux/f2fs.h
++++ b/include/uapi/linux/f2fs.h
+@@ -13,7 +13,7 @@
+ #define F2FS_IOC_COMMIT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 2)
+ #define F2FS_IOC_START_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 3)
+ #define F2FS_IOC_RELEASE_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 4)
+-#define F2FS_IOC_ABORT_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 5)
++#define F2FS_IOC_ABORT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 5)
+ #define F2FS_IOC_GARBAGE_COLLECT _IOW(F2FS_IOCTL_MAGIC, 6, __u32)
+ #define F2FS_IOC_WRITE_CHECKPOINT _IO(F2FS_IOCTL_MAGIC, 7)
+ #define F2FS_IOC_DEFRAGMENT _IOWR(F2FS_IOCTL_MAGIC, 8, \
+diff --git a/include/uapi/linux/netfilter/xt_IDLETIMER.h b/include/uapi/linux/netfilter/xt_IDLETIMER.h
+index 49ddcdc61c094..7bfb31a66fc9b 100644
+--- a/include/uapi/linux/netfilter/xt_IDLETIMER.h
++++ b/include/uapi/linux/netfilter/xt_IDLETIMER.h
+@@ -1,6 +1,5 @@
++/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+ /*
+- * linux/include/linux/netfilter/xt_IDLETIMER.h
+- *
+ * Header file for Xtables timer target module.
+ *
+ * Copyright (C) 2004, 2010 Nokia Corporation
+@@ -10,20 +9,6 @@
+ * by Luciano Coelho <luciano.coelho@nokia.com>
+ *
+ * Contact: Luciano Coelho <luciano.coelho@nokia.com>
+- *
+- * This program is free software; you can redistribute it and/or
+- * modify it under the terms of the GNU General Public License
+- * version 2 as published by the Free Software Foundation.
+- *
+- * This program is distributed in the hope that it will be useful, but
+- * WITHOUT ANY WARRANTY; without even the implied warranty of
+- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+- * General Public License for more details.
+- *
+- * You should have received a copy of the GNU General Public License
+- * along with this program; if not, write to the Free Software
+- * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
+- * 02110-1301 USA
+ */
+
+ #ifndef _XT_IDLETIMER_H
+diff --git a/init/main.c b/init/main.c
+index 80bd217dd35ea..1e866b31bf19b 100644
+--- a/init/main.c
++++ b/init/main.c
+@@ -99,6 +99,7 @@
+ #include <linux/kcsan.h>
+ #include <linux/init_syscalls.h>
+ #include <linux/stackdepot.h>
++#include <linux/randomize_kstack.h>
+ #include <net/net_namespace.h>
+
+ #include <asm/io.h>
+diff --git a/io_uring/Makefile b/io_uring/Makefile
+new file mode 100644
+index 0000000000000..3680425df9478
+--- /dev/null
++++ b/io_uring/Makefile
+@@ -0,0 +1,6 @@
++# SPDX-License-Identifier: GPL-2.0
++#
++# Makefile for io_uring
++
++obj-$(CONFIG_IO_URING) += io_uring.o
++obj-$(CONFIG_IO_WQ) += io-wq.o
+diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
+new file mode 100644
+index 0000000000000..32aeb2c581c58
+--- /dev/null
++++ b/io_uring/io-wq.c
+@@ -0,0 +1,1424 @@
++// SPDX-License-Identifier: GPL-2.0
++/*
++ * Basic worker thread pool for io_uring
++ *
++ * Copyright (C) 2019 Jens Axboe
++ *
++ */
++#include <linux/kernel.h>
++#include <linux/init.h>
++#include <linux/errno.h>
++#include <linux/sched/signal.h>
++#include <linux/percpu.h>
++#include <linux/slab.h>
++#include <linux/rculist_nulls.h>
++#include <linux/cpu.h>
++#include <linux/task_work.h>
++#include <linux/audit.h>
++#include <uapi/linux/io_uring.h>
++
++#include "io-wq.h"
++
++#define WORKER_IDLE_TIMEOUT (5 * HZ)
++
++enum {
++ IO_WORKER_F_UP = 1, /* up and active */
++ IO_WORKER_F_RUNNING = 2, /* account as running */
++ IO_WORKER_F_FREE = 4, /* worker on free list */
++ IO_WORKER_F_BOUND = 8, /* is doing bounded work */
++};
++
++enum {
++ IO_WQ_BIT_EXIT = 0, /* wq exiting */
++};
++
++enum {
++ IO_ACCT_STALLED_BIT = 0, /* stalled on hash */
++};
++
++/*
++ * One for each thread in a wqe pool
++ */
++struct io_worker {
++ refcount_t ref;
++ unsigned flags;
++ struct hlist_nulls_node nulls_node;
++ struct list_head all_list;
++ struct task_struct *task;
++ struct io_wqe *wqe;
++
++ struct io_wq_work *cur_work;
++ struct io_wq_work *next_work;
++ raw_spinlock_t lock;
++
++ struct completion ref_done;
++
++ unsigned long create_state;
++ struct callback_head create_work;
++ int create_index;
++
++ union {
++ struct rcu_head rcu;
++ struct work_struct work;
++ };
++};
++
++#if BITS_PER_LONG == 64
++#define IO_WQ_HASH_ORDER 6
++#else
++#define IO_WQ_HASH_ORDER 5
++#endif
++
++#define IO_WQ_NR_HASH_BUCKETS (1u << IO_WQ_HASH_ORDER)
++
++struct io_wqe_acct {
++ unsigned nr_workers;
++ unsigned max_workers;
++ int index;
++ atomic_t nr_running;
++ raw_spinlock_t lock;
++ struct io_wq_work_list work_list;
++ unsigned long flags;
++};
++
++enum {
++ IO_WQ_ACCT_BOUND,
++ IO_WQ_ACCT_UNBOUND,
++ IO_WQ_ACCT_NR,
++};
++
++/*
++ * Per-node worker thread pool
++ */
++struct io_wqe {
++ raw_spinlock_t lock;
++ struct io_wqe_acct acct[IO_WQ_ACCT_NR];
++
++ int node;
++
++ struct hlist_nulls_head free_list;
++ struct list_head all_list;
++
++ struct wait_queue_entry wait;
++
++ struct io_wq *wq;
++ struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS];
++
++ cpumask_var_t cpu_mask;
++};
++
++/*
++ * Per io_wq state
++ */
++struct io_wq {
++ unsigned long state;
++
++ free_work_fn *free_work;
++ io_wq_work_fn *do_work;
++
++ struct io_wq_hash *hash;
++
++ atomic_t worker_refs;
++ struct completion worker_done;
++
++ struct hlist_node cpuhp_node;
++
++ struct task_struct *task;
++
++ struct io_wqe *wqes[];
++};
++
++static enum cpuhp_state io_wq_online;
++
++struct io_cb_cancel_data {
++ work_cancel_fn *fn;
++ void *data;
++ int nr_running;
++ int nr_pending;
++ bool cancel_all;
++};
++
++static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index);
++static void io_wqe_dec_running(struct io_worker *worker);
++static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
++ struct io_wqe_acct *acct,
++ struct io_cb_cancel_data *match);
++static void create_worker_cb(struct callback_head *cb);
++static void io_wq_cancel_tw_create(struct io_wq *wq);
++
++static bool io_worker_get(struct io_worker *worker)
++{
++ return refcount_inc_not_zero(&worker->ref);
++}
++
++static void io_worker_release(struct io_worker *worker)
++{
++ if (refcount_dec_and_test(&worker->ref))
++ complete(&worker->ref_done);
++}
++
++static inline struct io_wqe_acct *io_get_acct(struct io_wqe *wqe, bool bound)
++{
++ return &wqe->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND];
++}
++
++static inline struct io_wqe_acct *io_work_get_acct(struct io_wqe *wqe,
++ struct io_wq_work *work)
++{
++ return io_get_acct(wqe, !(work->flags & IO_WQ_WORK_UNBOUND));
++}
++
++static inline struct io_wqe_acct *io_wqe_get_acct(struct io_worker *worker)
++{
++ return io_get_acct(worker->wqe, worker->flags & IO_WORKER_F_BOUND);
++}
++
++static void io_worker_ref_put(struct io_wq *wq)
++{
++ if (atomic_dec_and_test(&wq->worker_refs))
++ complete(&wq->worker_done);
++}
++
++static void io_worker_cancel_cb(struct io_worker *worker)
++{
++ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
++ struct io_wqe *wqe = worker->wqe;
++ struct io_wq *wq = wqe->wq;
++
++ atomic_dec(&acct->nr_running);
++ raw_spin_lock(&worker->wqe->lock);
++ acct->nr_workers--;
++ raw_spin_unlock(&worker->wqe->lock);
++ io_worker_ref_put(wq);
++ clear_bit_unlock(0, &worker->create_state);
++ io_worker_release(worker);
++}
++
++static bool io_task_worker_match(struct callback_head *cb, void *data)
++{
++ struct io_worker *worker;
++
++ if (cb->func != create_worker_cb)
++ return false;
++ worker = container_of(cb, struct io_worker, create_work);
++ return worker == data;
++}
++
++static void io_worker_exit(struct io_worker *worker)
++{
++ struct io_wqe *wqe = worker->wqe;
++ struct io_wq *wq = wqe->wq;
++
++ while (1) {
++ struct callback_head *cb = task_work_cancel_match(wq->task,
++ io_task_worker_match, worker);
++
++ if (!cb)
++ break;
++ io_worker_cancel_cb(worker);
++ }
++
++ io_worker_release(worker);
++ wait_for_completion(&worker->ref_done);
++
++ raw_spin_lock(&wqe->lock);
++ if (worker->flags & IO_WORKER_F_FREE)
++ hlist_nulls_del_rcu(&worker->nulls_node);
++ list_del_rcu(&worker->all_list);
++ raw_spin_unlock(&wqe->lock);
++ io_wqe_dec_running(worker);
++ worker->flags = 0;
++ preempt_disable();
++ current->flags &= ~PF_IO_WORKER;
++ preempt_enable();
++
++ kfree_rcu(worker, rcu);
++ io_worker_ref_put(wqe->wq);
++ do_exit(0);
++}
++
++static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
++{
++ bool ret = false;
++
++ raw_spin_lock(&acct->lock);
++ if (!wq_list_empty(&acct->work_list) &&
++ !test_bit(IO_ACCT_STALLED_BIT, &acct->flags))
++ ret = true;
++ raw_spin_unlock(&acct->lock);
++
++ return ret;
++}
++
++/*
++ * Check head of free list for an available worker. If one isn't available,
++ * caller must create one.
++ */
++static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
++ struct io_wqe_acct *acct)
++ __must_hold(RCU)
++{
++ struct hlist_nulls_node *n;
++ struct io_worker *worker;
++
++ /*
++ * Iterate free_list and see if we can find an idle worker to
++ * activate. If a given worker is on the free_list but in the process
++ * of exiting, keep trying.
++ */
++ hlist_nulls_for_each_entry_rcu(worker, n, &wqe->free_list, nulls_node) {
++ if (!io_worker_get(worker))
++ continue;
++ if (io_wqe_get_acct(worker) != acct) {
++ io_worker_release(worker);
++ continue;
++ }
++ if (wake_up_process(worker->task)) {
++ io_worker_release(worker);
++ return true;
++ }
++ io_worker_release(worker);
++ }
++
++ return false;
++}
++
++/*
++ * We need a worker. If we find a free one, we're good. If not, and we're
++ * below the max number of workers, create one.
++ */
++static bool io_wqe_create_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
++{
++ /*
++ * Most likely an attempt to queue unbounded work on an io_wq that
++ * wasn't setup with any unbounded workers.
++ */
++ if (unlikely(!acct->max_workers))
++ pr_warn_once("io-wq is not configured for unbound workers");
++
++ raw_spin_lock(&wqe->lock);
++ if (acct->nr_workers >= acct->max_workers) {
++ raw_spin_unlock(&wqe->lock);
++ return true;
++ }
++ acct->nr_workers++;
++ raw_spin_unlock(&wqe->lock);
++ atomic_inc(&acct->nr_running);
++ atomic_inc(&wqe->wq->worker_refs);
++ return create_io_worker(wqe->wq, wqe, acct->index);
++}
++
++static void io_wqe_inc_running(struct io_worker *worker)
++{
++ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
++
++ atomic_inc(&acct->nr_running);
++}
++
++static void create_worker_cb(struct callback_head *cb)
++{
++ struct io_worker *worker;
++ struct io_wq *wq;
++ struct io_wqe *wqe;
++ struct io_wqe_acct *acct;
++ bool do_create = false;
++
++ worker = container_of(cb, struct io_worker, create_work);
++ wqe = worker->wqe;
++ wq = wqe->wq;
++ acct = &wqe->acct[worker->create_index];
++ raw_spin_lock(&wqe->lock);
++ if (acct->nr_workers < acct->max_workers) {
++ acct->nr_workers++;
++ do_create = true;
++ }
++ raw_spin_unlock(&wqe->lock);
++ if (do_create) {
++ create_io_worker(wq, wqe, worker->create_index);
++ } else {
++ atomic_dec(&acct->nr_running);
++ io_worker_ref_put(wq);
++ }
++ clear_bit_unlock(0, &worker->create_state);
++ io_worker_release(worker);
++}
++
++static bool io_queue_worker_create(struct io_worker *worker,
++ struct io_wqe_acct *acct,
++ task_work_func_t func)
++{
++ struct io_wqe *wqe = worker->wqe;
++ struct io_wq *wq = wqe->wq;
++
++ /* raced with exit, just ignore create call */
++ if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
++ goto fail;
++ if (!io_worker_get(worker))
++ goto fail;
++ /*
++ * create_state manages ownership of create_work/index. We should
++ * only need one entry per worker, as the worker going to sleep
++ * will trigger the condition, and waking will clear it once it
++ * runs the task_work.
++ */
++ if (test_bit(0, &worker->create_state) ||
++ test_and_set_bit_lock(0, &worker->create_state))
++ goto fail_release;
++
++ atomic_inc(&wq->worker_refs);
++ init_task_work(&worker->create_work, func);
++ worker->create_index = acct->index;
++ if (!task_work_add(wq->task, &worker->create_work, TWA_SIGNAL)) {
++ /*
++ * EXIT may have been set after checking it above, check after
++ * adding the task_work and remove any creation item if it is
++ * now set. wq exit does that too, but we can have added this
++ * work item after we canceled in io_wq_exit_workers().
++ */
++ if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
++ io_wq_cancel_tw_create(wq);
++ io_worker_ref_put(wq);
++ return true;
++ }
++ io_worker_ref_put(wq);
++ clear_bit_unlock(0, &worker->create_state);
++fail_release:
++ io_worker_release(worker);
++fail:
++ atomic_dec(&acct->nr_running);
++ io_worker_ref_put(wq);
++ return false;
++}
++
++static void io_wqe_dec_running(struct io_worker *worker)
++{
++ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
++ struct io_wqe *wqe = worker->wqe;
++
++ if (!(worker->flags & IO_WORKER_F_UP))
++ return;
++
++ if (!atomic_dec_and_test(&acct->nr_running))
++ return;
++ if (!io_acct_run_queue(acct))
++ return;
++
++ atomic_inc(&acct->nr_running);
++ atomic_inc(&wqe->wq->worker_refs);
++ io_queue_worker_create(worker, acct, create_worker_cb);
++}
++
++/*
++ * Worker will start processing some work. Move it to the busy list, if
++ * it's currently on the freelist
++ */
++static void __io_worker_busy(struct io_wqe *wqe, struct io_worker *worker)
++{
++ if (worker->flags & IO_WORKER_F_FREE) {
++ worker->flags &= ~IO_WORKER_F_FREE;
++ raw_spin_lock(&wqe->lock);
++ hlist_nulls_del_init_rcu(&worker->nulls_node);
++ raw_spin_unlock(&wqe->lock);
++ }
++}
++
++/*
++ * No work, worker going to sleep. Move to freelist, and unuse mm if we
++ * have one attached. Dropping the mm may potentially sleep, so we drop
++ * the lock in that case and return success. Since the caller has to
++ * retry the loop in that case (we changed task state), we don't regrab
++ * the lock if we return success.
++ */
++static void __io_worker_idle(struct io_wqe *wqe, struct io_worker *worker)
++ __must_hold(wqe->lock)
++{
++ if (!(worker->flags & IO_WORKER_F_FREE)) {
++ worker->flags |= IO_WORKER_F_FREE;
++ hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
++ }
++}
++
++static inline unsigned int io_get_work_hash(struct io_wq_work *work)
++{
++ return work->flags >> IO_WQ_HASH_SHIFT;
++}
++
++static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
++{
++ struct io_wq *wq = wqe->wq;
++ bool ret = false;
++
++ spin_lock_irq(&wq->hash->wait.lock);
++ if (list_empty(&wqe->wait.entry)) {
++ __add_wait_queue(&wq->hash->wait, &wqe->wait);
++ if (!test_bit(hash, &wq->hash->map)) {
++ __set_current_state(TASK_RUNNING);
++ list_del_init(&wqe->wait.entry);
++ ret = true;
++ }
++ }
++ spin_unlock_irq(&wq->hash->wait.lock);
++ return ret;
++}
++
++static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
++ struct io_worker *worker)
++ __must_hold(acct->lock)
++{
++ struct io_wq_work_node *node, *prev;
++ struct io_wq_work *work, *tail;
++ unsigned int stall_hash = -1U;
++ struct io_wqe *wqe = worker->wqe;
++
++ wq_list_for_each(node, prev, &acct->work_list) {
++ unsigned int hash;
++
++ work = container_of(node, struct io_wq_work, list);
++
++ /* not hashed, can run anytime */
++ if (!io_wq_is_hashed(work)) {
++ wq_list_del(&acct->work_list, node, prev);
++ return work;
++ }
++
++ hash = io_get_work_hash(work);
++ /* all items with this hash lie in [work, tail] */
++ tail = wqe->hash_tail[hash];
++
++ /* hashed, can run if not already running */
++ if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
++ wqe->hash_tail[hash] = NULL;
++ wq_list_cut(&acct->work_list, &tail->list, prev);
++ return work;
++ }
++ if (stall_hash == -1U)
++ stall_hash = hash;
++ /* fast forward to a next hash, for-each will fix up @prev */
++ node = &tail->list;
++ }
++
++ if (stall_hash != -1U) {
++ bool unstalled;
++
++ /*
++ * Set this before dropping the lock to avoid racing with new
++ * work being added and clearing the stalled bit.
++ */
++ set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
++ raw_spin_unlock(&acct->lock);
++ unstalled = io_wait_on_hash(wqe, stall_hash);
++ raw_spin_lock(&acct->lock);
++ if (unstalled) {
++ clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
++ if (wq_has_sleeper(&wqe->wq->hash->wait))
++ wake_up(&wqe->wq->hash->wait);
++ }
++ }
++
++ return NULL;
++}
++
++static bool io_flush_signals(void)
++{
++ if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) {
++ __set_current_state(TASK_RUNNING);
++ clear_notify_signal();
++ if (task_work_pending(current))
++ task_work_run();
++ return true;
++ }
++ return false;
++}
++
++static void io_assign_current_work(struct io_worker *worker,
++ struct io_wq_work *work)
++{
++ if (work) {
++ io_flush_signals();
++ cond_resched();
++ }
++
++ raw_spin_lock(&worker->lock);
++ worker->cur_work = work;
++ worker->next_work = NULL;
++ raw_spin_unlock(&worker->lock);
++}
++
++static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
++
++static void io_worker_handle_work(struct io_worker *worker)
++{
++ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
++ struct io_wqe *wqe = worker->wqe;
++ struct io_wq *wq = wqe->wq;
++ bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
++
++ do {
++ struct io_wq_work *work;
++
++ /*
++ * If we got some work, mark us as busy. If we didn't, but
++ * the list isn't empty, it means we stalled on hashed work.
++ * Mark us stalled so we don't keep looking for work when we
++ * can't make progress, any work completion or insertion will
++ * clear the stalled flag.
++ */
++ raw_spin_lock(&acct->lock);
++ work = io_get_next_work(acct, worker);
++ raw_spin_unlock(&acct->lock);
++ if (work) {
++ __io_worker_busy(wqe, worker);
++
++ /*
++ * Make sure cancelation can find this, even before
++ * it becomes the active work. That avoids a window
++ * where the work has been removed from our general
++ * work list, but isn't yet discoverable as the
++ * current work item for this worker.
++ */
++ raw_spin_lock(&worker->lock);
++ worker->next_work = work;
++ raw_spin_unlock(&worker->lock);
++ } else {
++ break;
++ }
++ io_assign_current_work(worker, work);
++ __set_current_state(TASK_RUNNING);
++
++ /* handle a whole dependent link */
++ do {
++ struct io_wq_work *next_hashed, *linked;
++ unsigned int hash = io_get_work_hash(work);
++
++ next_hashed = wq_next_work(work);
++
++ if (unlikely(do_kill) && (work->flags & IO_WQ_WORK_UNBOUND))
++ work->flags |= IO_WQ_WORK_CANCEL;
++ wq->do_work(work);
++ io_assign_current_work(worker, NULL);
++
++ linked = wq->free_work(work);
++ work = next_hashed;
++ if (!work && linked && !io_wq_is_hashed(linked)) {
++ work = linked;
++ linked = NULL;
++ }
++ io_assign_current_work(worker, work);
++ if (linked)
++ io_wqe_enqueue(wqe, linked);
++
++ if (hash != -1U && !next_hashed) {
++ /* serialize hash clear with wake_up() */
++ spin_lock_irq(&wq->hash->wait.lock);
++ clear_bit(hash, &wq->hash->map);
++ clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
++ spin_unlock_irq(&wq->hash->wait.lock);
++ if (wq_has_sleeper(&wq->hash->wait))
++ wake_up(&wq->hash->wait);
++ }
++ } while (work);
++ } while (1);
++}
++
++static int io_wqe_worker(void *data)
++{
++ struct io_worker *worker = data;
++ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
++ struct io_wqe *wqe = worker->wqe;
++ struct io_wq *wq = wqe->wq;
++ bool last_timeout = false;
++ char buf[TASK_COMM_LEN];
++
++ worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
++
++ snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid);
++ set_task_comm(current, buf);
++
++ audit_alloc_kernel(current);
++
++ while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
++ long ret;
++
++ set_current_state(TASK_INTERRUPTIBLE);
++ while (io_acct_run_queue(acct))
++ io_worker_handle_work(worker);
++
++ raw_spin_lock(&wqe->lock);
++ /* timed out, exit unless we're the last worker */
++ if (last_timeout && acct->nr_workers > 1) {
++ acct->nr_workers--;
++ raw_spin_unlock(&wqe->lock);
++ __set_current_state(TASK_RUNNING);
++ break;
++ }
++ last_timeout = false;
++ __io_worker_idle(wqe, worker);
++ raw_spin_unlock(&wqe->lock);
++ if (io_flush_signals())
++ continue;
++ ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
++ if (signal_pending(current)) {
++ struct ksignal ksig;
++
++ if (!get_signal(&ksig))
++ continue;
++ break;
++ }
++ last_timeout = !ret;
++ }
++
++ if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
++ io_worker_handle_work(worker);
++
++ audit_free(current);
++ io_worker_exit(worker);
++ return 0;
++}
++
++/*
++ * Called when a worker is scheduled in. Mark us as currently running.
++ */
++void io_wq_worker_running(struct task_struct *tsk)
++{
++ struct io_worker *worker = tsk->worker_private;
++
++ if (!worker)
++ return;
++ if (!(worker->flags & IO_WORKER_F_UP))
++ return;
++ if (worker->flags & IO_WORKER_F_RUNNING)
++ return;
++ worker->flags |= IO_WORKER_F_RUNNING;
++ io_wqe_inc_running(worker);
++}
++
++/*
++ * Called when worker is going to sleep. If there are no workers currently
++ * running and we have work pending, wake up a free one or create a new one.
++ */
++void io_wq_worker_sleeping(struct task_struct *tsk)
++{
++ struct io_worker *worker = tsk->worker_private;
++
++ if (!worker)
++ return;
++ if (!(worker->flags & IO_WORKER_F_UP))
++ return;
++ if (!(worker->flags & IO_WORKER_F_RUNNING))
++ return;
++
++ worker->flags &= ~IO_WORKER_F_RUNNING;
++ io_wqe_dec_running(worker);
++}
++
++static void io_init_new_worker(struct io_wqe *wqe, struct io_worker *worker,
++ struct task_struct *tsk)
++{
++ tsk->worker_private = worker;
++ worker->task = tsk;
++ set_cpus_allowed_ptr(tsk, wqe->cpu_mask);
++ tsk->flags |= PF_NO_SETAFFINITY;
++
++ raw_spin_lock(&wqe->lock);
++ hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
++ list_add_tail_rcu(&worker->all_list, &wqe->all_list);
++ worker->flags |= IO_WORKER_F_FREE;
++ raw_spin_unlock(&wqe->lock);
++ wake_up_new_task(tsk);
++}
++
++static bool io_wq_work_match_all(struct io_wq_work *work, void *data)
++{
++ return true;
++}
++
++static inline bool io_should_retry_thread(long err)
++{
++ /*
++ * Prevent perpetual task_work retry, if the task (or its group) is
++ * exiting.
++ */
++ if (fatal_signal_pending(current))
++ return false;
++
++ switch (err) {
++ case -EAGAIN:
++ case -ERESTARTSYS:
++ case -ERESTARTNOINTR:
++ case -ERESTARTNOHAND:
++ return true;
++ default:
++ return false;
++ }
++}
++
++static void create_worker_cont(struct callback_head *cb)
++{
++ struct io_worker *worker;
++ struct task_struct *tsk;
++ struct io_wqe *wqe;
++
++ worker = container_of(cb, struct io_worker, create_work);
++ clear_bit_unlock(0, &worker->create_state);
++ wqe = worker->wqe;
++ tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
++ if (!IS_ERR(tsk)) {
++ io_init_new_worker(wqe, worker, tsk);
++ io_worker_release(worker);
++ return;
++ } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
++ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
++
++ atomic_dec(&acct->nr_running);
++ raw_spin_lock(&wqe->lock);
++ acct->nr_workers--;
++ if (!acct->nr_workers) {
++ struct io_cb_cancel_data match = {
++ .fn = io_wq_work_match_all,
++ .cancel_all = true,
++ };
++
++ raw_spin_unlock(&wqe->lock);
++ while (io_acct_cancel_pending_work(wqe, acct, &match))
++ ;
++ } else {
++ raw_spin_unlock(&wqe->lock);
++ }
++ io_worker_ref_put(wqe->wq);
++ kfree(worker);
++ return;
++ }
++
++ /* re-create attempts grab a new worker ref, drop the existing one */
++ io_worker_release(worker);
++ schedule_work(&worker->work);
++}
++
++static void io_workqueue_create(struct work_struct *work)
++{
++ struct io_worker *worker = container_of(work, struct io_worker, work);
++ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
++
++ if (!io_queue_worker_create(worker, acct, create_worker_cont))
++ kfree(worker);
++}
++
++static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
++{
++ struct io_wqe_acct *acct = &wqe->acct[index];
++ struct io_worker *worker;
++ struct task_struct *tsk;
++
++ __set_current_state(TASK_RUNNING);
++
++ worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, wqe->node);
++ if (!worker) {
++fail:
++ atomic_dec(&acct->nr_running);
++ raw_spin_lock(&wqe->lock);
++ acct->nr_workers--;
++ raw_spin_unlock(&wqe->lock);
++ io_worker_ref_put(wq);
++ return false;
++ }
++
++ refcount_set(&worker->ref, 1);
++ worker->wqe = wqe;
++ raw_spin_lock_init(&worker->lock);
++ init_completion(&worker->ref_done);
++
++ if (index == IO_WQ_ACCT_BOUND)
++ worker->flags |= IO_WORKER_F_BOUND;
++
++ tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
++ if (!IS_ERR(tsk)) {
++ io_init_new_worker(wqe, worker, tsk);
++ } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
++ kfree(worker);
++ goto fail;
++ } else {
++ INIT_WORK(&worker->work, io_workqueue_create);
++ schedule_work(&worker->work);
++ }
++
++ return true;
++}
++
++/*
++ * Iterate the passed in list and call the specific function for each
++ * worker that isn't exiting
++ */
++static bool io_wq_for_each_worker(struct io_wqe *wqe,
++ bool (*func)(struct io_worker *, void *),
++ void *data)
++{
++ struct io_worker *worker;
++ bool ret = false;
++
++ list_for_each_entry_rcu(worker, &wqe->all_list, all_list) {
++ if (io_worker_get(worker)) {
++ /* no task if node is/was offline */
++ if (worker->task)
++ ret = func(worker, data);
++ io_worker_release(worker);
++ if (ret)
++ break;
++ }
++ }
++
++ return ret;
++}
++
++static bool io_wq_worker_wake(struct io_worker *worker, void *data)
++{
++ set_notify_signal(worker->task);
++ wake_up_process(worker->task);
++ return false;
++}
++
++static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
++{
++ struct io_wq *wq = wqe->wq;
++
++ do {
++ work->flags |= IO_WQ_WORK_CANCEL;
++ wq->do_work(work);
++ work = wq->free_work(work);
++ } while (work);
++}
++
++static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
++{
++ struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
++ unsigned int hash;
++ struct io_wq_work *tail;
++
++ if (!io_wq_is_hashed(work)) {
++append:
++ wq_list_add_tail(&work->list, &acct->work_list);
++ return;
++ }
++
++ hash = io_get_work_hash(work);
++ tail = wqe->hash_tail[hash];
++ wqe->hash_tail[hash] = work;
++ if (!tail)
++ goto append;
++
++ wq_list_add_after(&work->list, &tail->list, &acct->work_list);
++}
++
++static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
++{
++ return work == data;
++}
++
++static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
++{
++ struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
++ struct io_cb_cancel_data match;
++ unsigned work_flags = work->flags;
++ bool do_create;
++
++ /*
++ * If io-wq is exiting for this task, or if the request has explicitly
++ * been marked as one that should not get executed, cancel it here.
++ */
++ if (test_bit(IO_WQ_BIT_EXIT, &wqe->wq->state) ||
++ (work->flags & IO_WQ_WORK_CANCEL)) {
++ io_run_cancel(work, wqe);
++ return;
++ }
++
++ raw_spin_lock(&acct->lock);
++ io_wqe_insert_work(wqe, work);
++ clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
++ raw_spin_unlock(&acct->lock);
++
++ raw_spin_lock(&wqe->lock);
++ rcu_read_lock();
++ do_create = !io_wqe_activate_free_worker(wqe, acct);
++ rcu_read_unlock();
++
++ raw_spin_unlock(&wqe->lock);
++
++ if (do_create && ((work_flags & IO_WQ_WORK_CONCURRENT) ||
++ !atomic_read(&acct->nr_running))) {
++ bool did_create;
++
++ did_create = io_wqe_create_worker(wqe, acct);
++ if (likely(did_create))
++ return;
++
++ raw_spin_lock(&wqe->lock);
++ if (acct->nr_workers) {
++ raw_spin_unlock(&wqe->lock);
++ return;
++ }
++ raw_spin_unlock(&wqe->lock);
++
++ /* fatal condition, failed to create the first worker */
++ match.fn = io_wq_work_match_item,
++ match.data = work,
++ match.cancel_all = false,
++
++ io_acct_cancel_pending_work(wqe, acct, &match);
++ }
++}
++
++void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
++{
++ struct io_wqe *wqe = wq->wqes[numa_node_id()];
++
++ io_wqe_enqueue(wqe, work);
++}
++
++/*
++ * Work items that hash to the same value will not be done in parallel.
++ * Used to limit concurrent writes, generally hashed by inode.
++ */
++void io_wq_hash_work(struct io_wq_work *work, void *val)
++{
++ unsigned int bit;
++
++ bit = hash_ptr(val, IO_WQ_HASH_ORDER);
++ work->flags |= (IO_WQ_WORK_HASHED | (bit << IO_WQ_HASH_SHIFT));
++}
++
++static bool __io_wq_worker_cancel(struct io_worker *worker,
++ struct io_cb_cancel_data *match,
++ struct io_wq_work *work)
++{
++ if (work && match->fn(work, match->data)) {
++ work->flags |= IO_WQ_WORK_CANCEL;
++ set_notify_signal(worker->task);
++ return true;
++ }
++
++ return false;
++}
++
++static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
++{
++ struct io_cb_cancel_data *match = data;
++
++ /*
++ * Hold the lock to avoid ->cur_work going out of scope, caller
++ * may dereference the passed in work.
++ */
++ raw_spin_lock(&worker->lock);
++ if (__io_wq_worker_cancel(worker, match, worker->cur_work) ||
++ __io_wq_worker_cancel(worker, match, worker->next_work))
++ match->nr_running++;
++ raw_spin_unlock(&worker->lock);
++
++ return match->nr_running && !match->cancel_all;
++}
++
++static inline void io_wqe_remove_pending(struct io_wqe *wqe,
++ struct io_wq_work *work,
++ struct io_wq_work_node *prev)
++{
++ struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
++ unsigned int hash = io_get_work_hash(work);
++ struct io_wq_work *prev_work = NULL;
++
++ if (io_wq_is_hashed(work) && work == wqe->hash_tail[hash]) {
++ if (prev)
++ prev_work = container_of(prev, struct io_wq_work, list);
++ if (prev_work && io_get_work_hash(prev_work) == hash)
++ wqe->hash_tail[hash] = prev_work;
++ else
++ wqe->hash_tail[hash] = NULL;
++ }
++ wq_list_del(&acct->work_list, &work->list, prev);
++}
++
++static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
++ struct io_wqe_acct *acct,
++ struct io_cb_cancel_data *match)
++{
++ struct io_wq_work_node *node, *prev;
++ struct io_wq_work *work;
++
++ raw_spin_lock(&acct->lock);
++ wq_list_for_each(node, prev, &acct->work_list) {
++ work = container_of(node, struct io_wq_work, list);
++ if (!match->fn(work, match->data))
++ continue;
++ io_wqe_remove_pending(wqe, work, prev);
++ raw_spin_unlock(&acct->lock);
++ io_run_cancel(work, wqe);
++ match->nr_pending++;
++ /* not safe to continue after unlock */
++ return true;
++ }
++ raw_spin_unlock(&acct->lock);
++
++ return false;
++}
++
++static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
++ struct io_cb_cancel_data *match)
++{
++ int i;
++retry:
++ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
++ struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
++
++ if (io_acct_cancel_pending_work(wqe, acct, match)) {
++ if (match->cancel_all)
++ goto retry;
++ break;
++ }
++ }
++}
++
++static void io_wqe_cancel_running_work(struct io_wqe *wqe,
++ struct io_cb_cancel_data *match)
++{
++ rcu_read_lock();
++ io_wq_for_each_worker(wqe, io_wq_worker_cancel, match);
++ rcu_read_unlock();
++}
++
++enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
++ void *data, bool cancel_all)
++{
++ struct io_cb_cancel_data match = {
++ .fn = cancel,
++ .data = data,
++ .cancel_all = cancel_all,
++ };
++ int node;
++
++ /*
++ * First check pending list, if we're lucky we can just remove it
++ * from there. CANCEL_OK means that the work is returned as-new,
++ * no completion will be posted for it.
++ *
++ * Then check if a free (going busy) or busy worker has the work
++ * currently running. If we find it there, we'll return CANCEL_RUNNING
++ * as an indication that we attempt to signal cancellation. The
++ * completion will run normally in this case.
++ *
++ * Do both of these while holding the wqe->lock, to ensure that
++ * we'll find a work item regardless of state.
++ */
++ for_each_node(node) {
++ struct io_wqe *wqe = wq->wqes[node];
++
++ io_wqe_cancel_pending_work(wqe, &match);
++ if (match.nr_pending && !match.cancel_all)
++ return IO_WQ_CANCEL_OK;
++
++ raw_spin_lock(&wqe->lock);
++ io_wqe_cancel_running_work(wqe, &match);
++ raw_spin_unlock(&wqe->lock);
++ if (match.nr_running && !match.cancel_all)
++ return IO_WQ_CANCEL_RUNNING;
++ }
++
++ if (match.nr_running)
++ return IO_WQ_CANCEL_RUNNING;
++ if (match.nr_pending)
++ return IO_WQ_CANCEL_OK;
++ return IO_WQ_CANCEL_NOTFOUND;
++}
++
++static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
++ int sync, void *key)
++{
++ struct io_wqe *wqe = container_of(wait, struct io_wqe, wait);
++ int i;
++
++ list_del_init(&wait->entry);
++
++ rcu_read_lock();
++ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
++ struct io_wqe_acct *acct = &wqe->acct[i];
++
++ if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
++ io_wqe_activate_free_worker(wqe, acct);
++ }
++ rcu_read_unlock();
++ return 1;
++}
++
++struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
++{
++ int ret, node, i;
++ struct io_wq *wq;
++
++ if (WARN_ON_ONCE(!data->free_work || !data->do_work))
++ return ERR_PTR(-EINVAL);
++ if (WARN_ON_ONCE(!bounded))
++ return ERR_PTR(-EINVAL);
++
++ wq = kzalloc(struct_size(wq, wqes, nr_node_ids), GFP_KERNEL);
++ if (!wq)
++ return ERR_PTR(-ENOMEM);
++ ret = cpuhp_state_add_instance_nocalls(io_wq_online, &wq->cpuhp_node);
++ if (ret)
++ goto err_wq;
++
++ refcount_inc(&data->hash->refs);
++ wq->hash = data->hash;
++ wq->free_work = data->free_work;
++ wq->do_work = data->do_work;
++
++ ret = -ENOMEM;
++ for_each_node(node) {
++ struct io_wqe *wqe;
++ int alloc_node = node;
++
++ if (!node_online(alloc_node))
++ alloc_node = NUMA_NO_NODE;
++ wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, alloc_node);
++ if (!wqe)
++ goto err;
++ if (!alloc_cpumask_var(&wqe->cpu_mask, GFP_KERNEL))
++ goto err;
++ cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
++ wq->wqes[node] = wqe;
++ wqe->node = alloc_node;
++ wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
++ wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
++ task_rlimit(current, RLIMIT_NPROC);
++ INIT_LIST_HEAD(&wqe->wait.entry);
++ wqe->wait.func = io_wqe_hash_wake;
++ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
++ struct io_wqe_acct *acct = &wqe->acct[i];
++
++ acct->index = i;
++ atomic_set(&acct->nr_running, 0);
++ INIT_WQ_LIST(&acct->work_list);
++ raw_spin_lock_init(&acct->lock);
++ }
++ wqe->wq = wq;
++ raw_spin_lock_init(&wqe->lock);
++ INIT_HLIST_NULLS_HEAD(&wqe->free_list, 0);
++ INIT_LIST_HEAD(&wqe->all_list);
++ }
++
++ wq->task = get_task_struct(data->task);
++ atomic_set(&wq->worker_refs, 1);
++ init_completion(&wq->worker_done);
++ return wq;
++err:
++ io_wq_put_hash(data->hash);
++ cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
++ for_each_node(node) {
++ if (!wq->wqes[node])
++ continue;
++ free_cpumask_var(wq->wqes[node]->cpu_mask);
++ kfree(wq->wqes[node]);
++ }
++err_wq:
++ kfree(wq);
++ return ERR_PTR(ret);
++}
++
++static bool io_task_work_match(struct callback_head *cb, void *data)
++{
++ struct io_worker *worker;
++
++ if (cb->func != create_worker_cb && cb->func != create_worker_cont)
++ return false;
++ worker = container_of(cb, struct io_worker, create_work);
++ return worker->wqe->wq == data;
++}
++
++void io_wq_exit_start(struct io_wq *wq)
++{
++ set_bit(IO_WQ_BIT_EXIT, &wq->state);
++}
++
++static void io_wq_cancel_tw_create(struct io_wq *wq)
++{
++ struct callback_head *cb;
++
++ while ((cb = task_work_cancel_match(wq->task, io_task_work_match, wq)) != NULL) {
++ struct io_worker *worker;
++
++ worker = container_of(cb, struct io_worker, create_work);
++ io_worker_cancel_cb(worker);
++ }
++}
++
++static void io_wq_exit_workers(struct io_wq *wq)
++{
++ int node;
++
++ if (!wq->task)
++ return;
++
++ io_wq_cancel_tw_create(wq);
++
++ rcu_read_lock();
++ for_each_node(node) {
++ struct io_wqe *wqe = wq->wqes[node];
++
++ io_wq_for_each_worker(wqe, io_wq_worker_wake, NULL);
++ }
++ rcu_read_unlock();
++ io_worker_ref_put(wq);
++ wait_for_completion(&wq->worker_done);
++
++ for_each_node(node) {
++ spin_lock_irq(&wq->hash->wait.lock);
++ list_del_init(&wq->wqes[node]->wait.entry);
++ spin_unlock_irq(&wq->hash->wait.lock);
++ }
++ put_task_struct(wq->task);
++ wq->task = NULL;
++}
++
++static void io_wq_destroy(struct io_wq *wq)
++{
++ int node;
++
++ cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
++
++ for_each_node(node) {
++ struct io_wqe *wqe = wq->wqes[node];
++ struct io_cb_cancel_data match = {
++ .fn = io_wq_work_match_all,
++ .cancel_all = true,
++ };
++ io_wqe_cancel_pending_work(wqe, &match);
++ free_cpumask_var(wqe->cpu_mask);
++ kfree(wqe);
++ }
++ io_wq_put_hash(wq->hash);
++ kfree(wq);
++}
++
++void io_wq_put_and_exit(struct io_wq *wq)
++{
++ WARN_ON_ONCE(!test_bit(IO_WQ_BIT_EXIT, &wq->state));
++
++ io_wq_exit_workers(wq);
++ io_wq_destroy(wq);
++}
++
++struct online_data {
++ unsigned int cpu;
++ bool online;
++};
++
++static bool io_wq_worker_affinity(struct io_worker *worker, void *data)
++{
++ struct online_data *od = data;
++
++ if (od->online)
++ cpumask_set_cpu(od->cpu, worker->wqe->cpu_mask);
++ else
++ cpumask_clear_cpu(od->cpu, worker->wqe->cpu_mask);
++ return false;
++}
++
++static int __io_wq_cpu_online(struct io_wq *wq, unsigned int cpu, bool online)
++{
++ struct online_data od = {
++ .cpu = cpu,
++ .online = online
++ };
++ int i;
++
++ rcu_read_lock();
++ for_each_node(i)
++ io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, &od);
++ rcu_read_unlock();
++ return 0;
++}
++
++static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node)
++{
++ struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
++
++ return __io_wq_cpu_online(wq, cpu, true);
++}
++
++static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node)
++{
++ struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
++
++ return __io_wq_cpu_online(wq, cpu, false);
++}
++
++int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
++{
++ int i;
++
++ rcu_read_lock();
++ for_each_node(i) {
++ struct io_wqe *wqe = wq->wqes[i];
++
++ if (mask)
++ cpumask_copy(wqe->cpu_mask, mask);
++ else
++ cpumask_copy(wqe->cpu_mask, cpumask_of_node(i));
++ }
++ rcu_read_unlock();
++ return 0;
++}
++
++/*
++ * Set max number of unbounded workers, returns old value. If new_count is 0,
++ * then just return the old value.
++ */
++int io_wq_max_workers(struct io_wq *wq, int *new_count)
++{
++ int prev[IO_WQ_ACCT_NR];
++ bool first_node = true;
++ int i, node;
++
++ BUILD_BUG_ON((int) IO_WQ_ACCT_BOUND != (int) IO_WQ_BOUND);
++ BUILD_BUG_ON((int) IO_WQ_ACCT_UNBOUND != (int) IO_WQ_UNBOUND);
++ BUILD_BUG_ON((int) IO_WQ_ACCT_NR != 2);
++
++ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
++ if (new_count[i] > task_rlimit(current, RLIMIT_NPROC))
++ new_count[i] = task_rlimit(current, RLIMIT_NPROC);
++ }
++
++ for (i = 0; i < IO_WQ_ACCT_NR; i++)
++ prev[i] = 0;
++
++ rcu_read_lock();
++ for_each_node(node) {
++ struct io_wqe *wqe = wq->wqes[node];
++ struct io_wqe_acct *acct;
++
++ raw_spin_lock(&wqe->lock);
++ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
++ acct = &wqe->acct[i];
++ if (first_node)
++ prev[i] = max_t(int, acct->max_workers, prev[i]);
++ if (new_count[i])
++ acct->max_workers = new_count[i];
++ }
++ raw_spin_unlock(&wqe->lock);
++ first_node = false;
++ }
++ rcu_read_unlock();
++
++ for (i = 0; i < IO_WQ_ACCT_NR; i++)
++ new_count[i] = prev[i];
++
++ return 0;
++}
++
++static __init int io_wq_init(void)
++{
++ int ret;
++
++ ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "io-wq/online",
++ io_wq_cpu_online, io_wq_cpu_offline);
++ if (ret < 0)
++ return ret;
++ io_wq_online = ret;
++ return 0;
++}
++subsys_initcall(io_wq_init);
+diff --git a/io_uring/io-wq.h b/io_uring/io-wq.h
+new file mode 100644
+index 0000000000000..dbecd27656c7c
+--- /dev/null
++++ b/io_uring/io-wq.h
+@@ -0,0 +1,227 @@
++#ifndef INTERNAL_IO_WQ_H
++#define INTERNAL_IO_WQ_H
++
++#include <linux/refcount.h>
++
++struct io_wq;
++
++enum {
++ IO_WQ_WORK_CANCEL = 1,
++ IO_WQ_WORK_HASHED = 2,
++ IO_WQ_WORK_UNBOUND = 4,
++ IO_WQ_WORK_CONCURRENT = 16,
++
++ IO_WQ_HASH_SHIFT = 24, /* upper 8 bits are used for hash key */
++};
++
++enum io_wq_cancel {
++ IO_WQ_CANCEL_OK, /* cancelled before started */
++ IO_WQ_CANCEL_RUNNING, /* found, running, and attempted cancelled */
++ IO_WQ_CANCEL_NOTFOUND, /* work not found */
++};
++
++struct io_wq_work_node {
++ struct io_wq_work_node *next;
++};
++
++struct io_wq_work_list {
++ struct io_wq_work_node *first;
++ struct io_wq_work_node *last;
++};
++
++#define wq_list_for_each(pos, prv, head) \
++ for (pos = (head)->first, prv = NULL; pos; prv = pos, pos = (pos)->next)
++
++#define wq_list_for_each_resume(pos, prv) \
++ for (; pos; prv = pos, pos = (pos)->next)
++
++#define wq_list_empty(list) (READ_ONCE((list)->first) == NULL)
++#define INIT_WQ_LIST(list) do { \
++ (list)->first = NULL; \
++} while (0)
++
++static inline void wq_list_add_after(struct io_wq_work_node *node,
++ struct io_wq_work_node *pos,
++ struct io_wq_work_list *list)
++{
++ struct io_wq_work_node *next = pos->next;
++
++ pos->next = node;
++ node->next = next;
++ if (!next)
++ list->last = node;
++}
++
++/**
++ * wq_list_merge - merge the second list to the first one.
++ * @list0: the first list
++ * @list1: the second list
++ * Return the first node after mergence.
++ */
++static inline struct io_wq_work_node *wq_list_merge(struct io_wq_work_list *list0,
++ struct io_wq_work_list *list1)
++{
++ struct io_wq_work_node *ret;
++
++ if (!list0->first) {
++ ret = list1->first;
++ } else {
++ ret = list0->first;
++ list0->last->next = list1->first;
++ }
++ INIT_WQ_LIST(list0);
++ INIT_WQ_LIST(list1);
++ return ret;
++}
++
++static inline void wq_list_add_tail(struct io_wq_work_node *node,
++ struct io_wq_work_list *list)
++{
++ node->next = NULL;
++ if (!list->first) {
++ list->last = node;
++ WRITE_ONCE(list->first, node);
++ } else {
++ list->last->next = node;
++ list->last = node;
++ }
++}
++
++static inline void wq_list_add_head(struct io_wq_work_node *node,
++ struct io_wq_work_list *list)
++{
++ node->next = list->first;
++ if (!node->next)
++ list->last = node;
++ WRITE_ONCE(list->first, node);
++}
++
++static inline void wq_list_cut(struct io_wq_work_list *list,
++ struct io_wq_work_node *last,
++ struct io_wq_work_node *prev)
++{
++ /* first in the list, if prev==NULL */
++ if (!prev)
++ WRITE_ONCE(list->first, last->next);
++ else
++ prev->next = last->next;
++
++ if (last == list->last)
++ list->last = prev;
++ last->next = NULL;
++}
++
++static inline void __wq_list_splice(struct io_wq_work_list *list,
++ struct io_wq_work_node *to)
++{
++ list->last->next = to->next;
++ to->next = list->first;
++ INIT_WQ_LIST(list);
++}
++
++static inline bool wq_list_splice(struct io_wq_work_list *list,
++ struct io_wq_work_node *to)
++{
++ if (!wq_list_empty(list)) {
++ __wq_list_splice(list, to);
++ return true;
++ }
++ return false;
++}
++
++static inline void wq_stack_add_head(struct io_wq_work_node *node,
++ struct io_wq_work_node *stack)
++{
++ node->next = stack->next;
++ stack->next = node;
++}
++
++static inline void wq_list_del(struct io_wq_work_list *list,
++ struct io_wq_work_node *node,
++ struct io_wq_work_node *prev)
++{
++ wq_list_cut(list, node, prev);
++}
++
++static inline
++struct io_wq_work_node *wq_stack_extract(struct io_wq_work_node *stack)
++{
++ struct io_wq_work_node *node = stack->next;
++
++ stack->next = node->next;
++ return node;
++}
++
++struct io_wq_work {
++ struct io_wq_work_node list;
++ unsigned flags;
++};
++
++static inline struct io_wq_work *wq_next_work(struct io_wq_work *work)
++{
++ if (!work->list.next)
++ return NULL;
++
++ return container_of(work->list.next, struct io_wq_work, list);
++}
++
++typedef struct io_wq_work *(free_work_fn)(struct io_wq_work *);
++typedef void (io_wq_work_fn)(struct io_wq_work *);
++
++struct io_wq_hash {
++ refcount_t refs;
++ unsigned long map;
++ struct wait_queue_head wait;
++};
++
++static inline void io_wq_put_hash(struct io_wq_hash *hash)
++{
++ if (refcount_dec_and_test(&hash->refs))
++ kfree(hash);
++}
++
++struct io_wq_data {
++ struct io_wq_hash *hash;
++ struct task_struct *task;
++ io_wq_work_fn *do_work;
++ free_work_fn *free_work;
++};
++
++struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data);
++void io_wq_exit_start(struct io_wq *wq);
++void io_wq_put_and_exit(struct io_wq *wq);
++
++void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work);
++void io_wq_hash_work(struct io_wq_work *work, void *val);
++
++int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
++int io_wq_max_workers(struct io_wq *wq, int *new_count);
++
++static inline bool io_wq_is_hashed(struct io_wq_work *work)
++{
++ return work->flags & IO_WQ_WORK_HASHED;
++}
++
++typedef bool (work_cancel_fn)(struct io_wq_work *, void *);
++
++enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
++ void *data, bool cancel_all);
++
++#if defined(CONFIG_IO_WQ)
++extern void io_wq_worker_sleeping(struct task_struct *);
++extern void io_wq_worker_running(struct task_struct *);
++#else
++static inline void io_wq_worker_sleeping(struct task_struct *tsk)
++{
++}
++static inline void io_wq_worker_running(struct task_struct *tsk)
++{
++}
++#endif
++
++static inline bool io_wq_current_is_worker(void)
++{
++ return in_task() && (current->flags & PF_IO_WORKER) &&
++ current->worker_private;
++}
++#endif
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+new file mode 100644
+index 0000000000000..ec4e31c67e07e
+--- /dev/null
++++ b/io_uring/io_uring.c
+@@ -0,0 +1,11915 @@
++// SPDX-License-Identifier: GPL-2.0
++/*
++ * Shared application/kernel submission and completion ring pairs, for
++ * supporting fast/efficient IO.
++ *
++ * A note on the read/write ordering memory barriers that are matched between
++ * the application and kernel side.
++ *
++ * After the application reads the CQ ring tail, it must use an
++ * appropriate smp_rmb() to pair with the smp_wmb() the kernel uses
++ * before writing the tail (using smp_load_acquire to read the tail will
++ * do). It also needs a smp_mb() before updating CQ head (ordering the
++ * entry load(s) with the head store), pairing with an implicit barrier
++ * through a control-dependency in io_get_cqe (smp_store_release to
++ * store head will do). Failure to do so could lead to reading invalid
++ * CQ entries.
++ *
++ * Likewise, the application must use an appropriate smp_wmb() before
++ * writing the SQ tail (ordering SQ entry stores with the tail store),
++ * which pairs with smp_load_acquire in io_get_sqring (smp_store_release
++ * to store the tail will do). And it needs a barrier ordering the SQ
++ * head load before writing new SQ entries (smp_load_acquire to read
++ * head will do).
++ *
++ * When using the SQ poll thread (IORING_SETUP_SQPOLL), the application
++ * needs to check the SQ flags for IORING_SQ_NEED_WAKEUP *after*
++ * updating the SQ tail; a full memory barrier smp_mb() is needed
++ * between.
++ *
++ * Also see the examples in the liburing library:
++ *
++ * git://git.kernel.dk/liburing
++ *
++ * io_uring also uses READ/WRITE_ONCE() for _any_ store or load that happens
++ * from data shared between the kernel and application. This is done both
++ * for ordering purposes, but also to ensure that once a value is loaded from
++ * data that the application could potentially modify, it remains stable.
++ *
++ * Copyright (C) 2018-2019 Jens Axboe
++ * Copyright (c) 2018-2019 Christoph Hellwig
++ */
++#include <linux/kernel.h>
++#include <linux/init.h>
++#include <linux/errno.h>
++#include <linux/syscalls.h>
++#include <linux/compat.h>
++#include <net/compat.h>
++#include <linux/refcount.h>
++#include <linux/uio.h>
++#include <linux/bits.h>
++
++#include <linux/sched/signal.h>
++#include <linux/fs.h>
++#include <linux/file.h>
++#include <linux/fdtable.h>
++#include <linux/mm.h>
++#include <linux/mman.h>
++#include <linux/percpu.h>
++#include <linux/slab.h>
++#include <linux/blk-mq.h>
++#include <linux/bvec.h>
++#include <linux/net.h>
++#include <net/sock.h>
++#include <net/af_unix.h>
++#include <net/scm.h>
++#include <linux/anon_inodes.h>
++#include <linux/sched/mm.h>
++#include <linux/uaccess.h>
++#include <linux/nospec.h>
++#include <linux/sizes.h>
++#include <linux/hugetlb.h>
++#include <linux/highmem.h>
++#include <linux/namei.h>
++#include <linux/fsnotify.h>
++#include <linux/fadvise.h>
++#include <linux/eventpoll.h>
++#include <linux/splice.h>
++#include <linux/task_work.h>
++#include <linux/pagemap.h>
++#include <linux/io_uring.h>
++#include <linux/audit.h>
++#include <linux/security.h>
++
++#define CREATE_TRACE_POINTS
++#include <trace/events/io_uring.h>
++
++#include <uapi/linux/io_uring.h>
++
++#include "../fs/internal.h"
++#include "io-wq.h"
++
++#define IORING_MAX_ENTRIES 32768
++#define IORING_MAX_CQ_ENTRIES (2 * IORING_MAX_ENTRIES)
++#define IORING_SQPOLL_CAP_ENTRIES_VALUE 8
++
++/* only define max */
++#define IORING_MAX_FIXED_FILES (1U << 15)
++#define IORING_MAX_RESTRICTIONS (IORING_RESTRICTION_LAST + \
++ IORING_REGISTER_LAST + IORING_OP_LAST)
++
++#define IO_RSRC_TAG_TABLE_SHIFT (PAGE_SHIFT - 3)
++#define IO_RSRC_TAG_TABLE_MAX (1U << IO_RSRC_TAG_TABLE_SHIFT)
++#define IO_RSRC_TAG_TABLE_MASK (IO_RSRC_TAG_TABLE_MAX - 1)
++
++#define IORING_MAX_REG_BUFFERS (1U << 14)
++
++#define SQE_COMMON_FLAGS (IOSQE_FIXED_FILE | IOSQE_IO_LINK | \
++ IOSQE_IO_HARDLINK | IOSQE_ASYNC)
++
++#define SQE_VALID_FLAGS (SQE_COMMON_FLAGS | IOSQE_BUFFER_SELECT | \
++ IOSQE_IO_DRAIN | IOSQE_CQE_SKIP_SUCCESS)
++
++#define IO_REQ_CLEAN_FLAGS (REQ_F_BUFFER_SELECTED | REQ_F_NEED_CLEANUP | \
++ REQ_F_POLLED | REQ_F_INFLIGHT | REQ_F_CREDS | \
++ REQ_F_ASYNC_DATA)
++
++#define IO_TCTX_REFS_CACHE_NR (1U << 10)
++
++struct io_uring {
++ u32 head ____cacheline_aligned_in_smp;
++ u32 tail ____cacheline_aligned_in_smp;
++};
++
++/*
++ * This data is shared with the application through the mmap at offsets
++ * IORING_OFF_SQ_RING and IORING_OFF_CQ_RING.
++ *
++ * The offsets to the member fields are published through struct
++ * io_sqring_offsets when calling io_uring_setup.
++ */
++struct io_rings {
++ /*
++ * Head and tail offsets into the ring; the offsets need to be
++ * masked to get valid indices.
++ *
++ * The kernel controls head of the sq ring and the tail of the cq ring,
++ * and the application controls tail of the sq ring and the head of the
++ * cq ring.
++ */
++ struct io_uring sq, cq;
++ /*
++ * Bitmasks to apply to head and tail offsets (constant, equals
++ * ring_entries - 1)
++ */
++ u32 sq_ring_mask, cq_ring_mask;
++ /* Ring sizes (constant, power of 2) */
++ u32 sq_ring_entries, cq_ring_entries;
++ /*
++ * Number of invalid entries dropped by the kernel due to
++ * invalid index stored in array
++ *
++ * Written by the kernel, shouldn't be modified by the
++ * application (i.e. get number of "new events" by comparing to
++ * cached value).
++ *
++ * After a new SQ head value was read by the application this
++ * counter includes all submissions that were dropped reaching
++ * the new SQ head (and possibly more).
++ */
++ u32 sq_dropped;
++ /*
++ * Runtime SQ flags
++ *
++ * Written by the kernel, shouldn't be modified by the
++ * application.
++ *
++ * The application needs a full memory barrier before checking
++ * for IORING_SQ_NEED_WAKEUP after updating the sq tail.
++ */
++ u32 sq_flags;
++ /*
++ * Runtime CQ flags
++ *
++ * Written by the application, shouldn't be modified by the
++ * kernel.
++ */
++ u32 cq_flags;
++ /*
++ * Number of completion events lost because the queue was full;
++ * this should be avoided by the application by making sure
++ * there are not more requests pending than there is space in
++ * the completion queue.
++ *
++ * Written by the kernel, shouldn't be modified by the
++ * application (i.e. get number of "new events" by comparing to
++ * cached value).
++ *
++ * As completion events come in out of order this counter is not
++ * ordered with any other data.
++ */
++ u32 cq_overflow;
++ /*
++ * Ring buffer of completion events.
++ *
++ * The kernel writes completion events fresh every time they are
++ * produced, so the application is allowed to modify pending
++ * entries.
++ */
++ struct io_uring_cqe cqes[] ____cacheline_aligned_in_smp;
++};
++
++enum io_uring_cmd_flags {
++ IO_URING_F_COMPLETE_DEFER = 1,
++ IO_URING_F_UNLOCKED = 2,
++ /* int's last bit, sign checks are usually faster than a bit test */
++ IO_URING_F_NONBLOCK = INT_MIN,
++};
++
++struct io_mapped_ubuf {
++ u64 ubuf;
++ u64 ubuf_end;
++ unsigned int nr_bvecs;
++ unsigned long acct_pages;
++ struct bio_vec bvec[];
++};
++
++struct io_ring_ctx;
++
++struct io_overflow_cqe {
++ struct io_uring_cqe cqe;
++ struct list_head list;
++};
++
++struct io_fixed_file {
++ /* file * with additional FFS_* flags */
++ unsigned long file_ptr;
++};
++
++struct io_rsrc_put {
++ struct list_head list;
++ u64 tag;
++ union {
++ void *rsrc;
++ struct file *file;
++ struct io_mapped_ubuf *buf;
++ };
++};
++
++struct io_file_table {
++ struct io_fixed_file *files;
++};
++
++struct io_rsrc_node {
++ struct percpu_ref refs;
++ struct list_head node;
++ struct list_head rsrc_list;
++ struct io_rsrc_data *rsrc_data;
++ struct llist_node llist;
++ bool done;
++};
++
++typedef void (rsrc_put_fn)(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc);
++
++struct io_rsrc_data {
++ struct io_ring_ctx *ctx;
++
++ u64 **tags;
++ unsigned int nr;
++ rsrc_put_fn *do_put;
++ atomic_t refs;
++ struct completion done;
++ bool quiesce;
++};
++
++struct io_buffer_list {
++ struct list_head list;
++ struct list_head buf_list;
++ __u16 bgid;
++};
++
++struct io_buffer {
++ struct list_head list;
++ __u64 addr;
++ __u32 len;
++ __u16 bid;
++ __u16 bgid;
++};
++
++struct io_restriction {
++ DECLARE_BITMAP(register_op, IORING_REGISTER_LAST);
++ DECLARE_BITMAP(sqe_op, IORING_OP_LAST);
++ u8 sqe_flags_allowed;
++ u8 sqe_flags_required;
++ bool registered;
++};
++
++enum {
++ IO_SQ_THREAD_SHOULD_STOP = 0,
++ IO_SQ_THREAD_SHOULD_PARK,
++};
++
++struct io_sq_data {
++ refcount_t refs;
++ atomic_t park_pending;
++ struct mutex lock;
++
++ /* ctx's that are using this sqd */
++ struct list_head ctx_list;
++
++ struct task_struct *thread;
++ struct wait_queue_head wait;
++
++ unsigned sq_thread_idle;
++ int sq_cpu;
++ pid_t task_pid;
++ pid_t task_tgid;
++
++ unsigned long state;
++ struct completion exited;
++};
++
++#define IO_COMPL_BATCH 32
++#define IO_REQ_CACHE_SIZE 32
++#define IO_REQ_ALLOC_BATCH 8
++
++struct io_submit_link {
++ struct io_kiocb *head;
++ struct io_kiocb *last;
++};
++
++struct io_submit_state {
++ /* inline/task_work completion list, under ->uring_lock */
++ struct io_wq_work_node free_list;
++ /* batch completion logic */
++ struct io_wq_work_list compl_reqs;
++ struct io_submit_link link;
++
++ bool plug_started;
++ bool need_plug;
++ bool flush_cqes;
++ unsigned short submit_nr;
++ struct blk_plug plug;
++};
++
++struct io_ev_fd {
++ struct eventfd_ctx *cq_ev_fd;
++ unsigned int eventfd_async: 1;
++ struct rcu_head rcu;
++};
++
++#define IO_BUFFERS_HASH_BITS 5
++
++struct io_ring_ctx {
++ /* const or read-mostly hot data */
++ struct {
++ struct percpu_ref refs;
++
++ struct io_rings *rings;
++ unsigned int flags;
++ unsigned int compat: 1;
++ unsigned int drain_next: 1;
++ unsigned int restricted: 1;
++ unsigned int off_timeout_used: 1;
++ unsigned int drain_active: 1;
++ unsigned int drain_disabled: 1;
++ unsigned int has_evfd: 1;
++ } ____cacheline_aligned_in_smp;
++
++ /* submission data */
++ struct {
++ struct mutex uring_lock;
++
++ /*
++ * Ring buffer of indices into array of io_uring_sqe, which is
++ * mmapped by the application using the IORING_OFF_SQES offset.
++ *
++ * This indirection could e.g. be used to assign fixed
++ * io_uring_sqe entries to operations and only submit them to
++ * the queue when needed.
++ *
++ * The kernel modifies neither the indices array nor the entries
++ * array.
++ */
++ u32 *sq_array;
++ struct io_uring_sqe *sq_sqes;
++ unsigned cached_sq_head;
++ unsigned sq_entries;
++ struct list_head defer_list;
++
++ /*
++ * Fixed resources fast path, should be accessed only under
++ * uring_lock, and updated through io_uring_register(2)
++ */
++ struct io_rsrc_node *rsrc_node;
++ int rsrc_cached_refs;
++ struct io_file_table file_table;
++ unsigned nr_user_files;
++ unsigned nr_user_bufs;
++ struct io_mapped_ubuf **user_bufs;
++
++ struct io_submit_state submit_state;
++ struct list_head timeout_list;
++ struct list_head ltimeout_list;
++ struct list_head cq_overflow_list;
++ struct list_head *io_buffers;
++ struct list_head io_buffers_cache;
++ struct list_head apoll_cache;
++ struct xarray personalities;
++ u32 pers_next;
++ unsigned sq_thread_idle;
++ } ____cacheline_aligned_in_smp;
++
++ /* IRQ completion list, under ->completion_lock */
++ struct io_wq_work_list locked_free_list;
++ unsigned int locked_free_nr;
++
++ const struct cred *sq_creds; /* cred used for __io_sq_thread() */
++ struct io_sq_data *sq_data; /* if using sq thread polling */
++
++ struct wait_queue_head sqo_sq_wait;
++ struct list_head sqd_list;
++
++ unsigned long check_cq_overflow;
++
++ struct {
++ unsigned cached_cq_tail;
++ unsigned cq_entries;
++ struct io_ev_fd __rcu *io_ev_fd;
++ struct wait_queue_head cq_wait;
++ unsigned cq_extra;
++ atomic_t cq_timeouts;
++ unsigned cq_last_tm_flush;
++ } ____cacheline_aligned_in_smp;
++
++ struct {
++ spinlock_t completion_lock;
++
++ spinlock_t timeout_lock;
++
++ /*
++ * ->iopoll_list is protected by the ctx->uring_lock for
++ * io_uring instances that don't use IORING_SETUP_SQPOLL.
++ * For SQPOLL, only the single threaded io_sq_thread() will
++ * manipulate the list, hence no extra locking is needed there.
++ */
++ struct io_wq_work_list iopoll_list;
++ struct hlist_head *cancel_hash;
++ unsigned cancel_hash_bits;
++ bool poll_multi_queue;
++
++ struct list_head io_buffers_comp;
++ } ____cacheline_aligned_in_smp;
++
++ struct io_restriction restrictions;
++
++ /* slow path rsrc auxilary data, used by update/register */
++ struct {
++ struct io_rsrc_node *rsrc_backup_node;
++ struct io_mapped_ubuf *dummy_ubuf;
++ struct io_rsrc_data *file_data;
++ struct io_rsrc_data *buf_data;
++
++ struct delayed_work rsrc_put_work;
++ struct llist_head rsrc_put_llist;
++ struct list_head rsrc_ref_list;
++ spinlock_t rsrc_ref_lock;
++
++ struct list_head io_buffers_pages;
++ };
++
++ /* Keep this last, we don't need it for the fast path */
++ struct {
++ #if defined(CONFIG_UNIX)
++ struct socket *ring_sock;
++ #endif
++ /* hashed buffered write serialization */
++ struct io_wq_hash *hash_map;
++
++ /* Only used for accounting purposes */
++ struct user_struct *user;
++ struct mm_struct *mm_account;
++
++ /* ctx exit and cancelation */
++ struct llist_head fallback_llist;
++ struct delayed_work fallback_work;
++ struct work_struct exit_work;
++ struct list_head tctx_list;
++ struct completion ref_comp;
++ u32 iowq_limits[2];
++ bool iowq_limits_set;
++ };
++};
++
++/*
++ * Arbitrary limit, can be raised if need be
++ */
++#define IO_RINGFD_REG_MAX 16
++
++struct io_uring_task {
++ /* submission side */
++ int cached_refs;
++ struct xarray xa;
++ struct wait_queue_head wait;
++ const struct io_ring_ctx *last;
++ struct io_wq *io_wq;
++ struct percpu_counter inflight;
++ atomic_t inflight_tracked;
++ atomic_t in_idle;
++
++ spinlock_t task_lock;
++ struct io_wq_work_list task_list;
++ struct io_wq_work_list prior_task_list;
++ struct callback_head task_work;
++ struct file **registered_rings;
++ bool task_running;
++};
++
++/*
++ * First field must be the file pointer in all the
++ * iocb unions! See also 'struct kiocb' in <linux/fs.h>
++ */
++struct io_poll_iocb {
++ struct file *file;
++ struct wait_queue_head *head;
++ __poll_t events;
++ struct wait_queue_entry wait;
++};
++
++struct io_poll_update {
++ struct file *file;
++ u64 old_user_data;
++ u64 new_user_data;
++ __poll_t events;
++ bool update_events;
++ bool update_user_data;
++};
++
++struct io_close {
++ struct file *file;
++ int fd;
++ u32 file_slot;
++};
++
++struct io_timeout_data {
++ struct io_kiocb *req;
++ struct hrtimer timer;
++ struct timespec64 ts;
++ enum hrtimer_mode mode;
++ u32 flags;
++};
++
++struct io_accept {
++ struct file *file;
++ struct sockaddr __user *addr;
++ int __user *addr_len;
++ int flags;
++ u32 file_slot;
++ unsigned long nofile;
++};
++
++struct io_sync {
++ struct file *file;
++ loff_t len;
++ loff_t off;
++ int flags;
++ int mode;
++};
++
++struct io_cancel {
++ struct file *file;
++ u64 addr;
++};
++
++struct io_timeout {
++ struct file *file;
++ u32 off;
++ u32 target_seq;
++ struct list_head list;
++ /* head of the link, used by linked timeouts only */
++ struct io_kiocb *head;
++ /* for linked completions */
++ struct io_kiocb *prev;
++};
++
++struct io_timeout_rem {
++ struct file *file;
++ u64 addr;
++
++ /* timeout update */
++ struct timespec64 ts;
++ u32 flags;
++ bool ltimeout;
++};
++
++struct io_rw {
++ /* NOTE: kiocb has the file as the first member, so don't do it here */
++ struct kiocb kiocb;
++ u64 addr;
++ u32 len;
++ u32 flags;
++};
++
++struct io_connect {
++ struct file *file;
++ struct sockaddr __user *addr;
++ int addr_len;
++};
++
++struct io_sr_msg {
++ struct file *file;
++ union {
++ struct compat_msghdr __user *umsg_compat;
++ struct user_msghdr __user *umsg;
++ void __user *buf;
++ };
++ int msg_flags;
++ int bgid;
++ size_t len;
++ size_t done_io;
++};
++
++struct io_open {
++ struct file *file;
++ int dfd;
++ u32 file_slot;
++ struct filename *filename;
++ struct open_how how;
++ unsigned long nofile;
++};
++
++struct io_rsrc_update {
++ struct file *file;
++ u64 arg;
++ u32 nr_args;
++ u32 offset;
++};
++
++struct io_fadvise {
++ struct file *file;
++ u64 offset;
++ u32 len;
++ u32 advice;
++};
++
++struct io_madvise {
++ struct file *file;
++ u64 addr;
++ u32 len;
++ u32 advice;
++};
++
++struct io_epoll {
++ struct file *file;
++ int epfd;
++ int op;
++ int fd;
++ struct epoll_event event;
++};
++
++struct io_splice {
++ struct file *file_out;
++ loff_t off_out;
++ loff_t off_in;
++ u64 len;
++ int splice_fd_in;
++ unsigned int flags;
++};
++
++struct io_provide_buf {
++ struct file *file;
++ __u64 addr;
++ __u32 len;
++ __u32 bgid;
++ __u16 nbufs;
++ __u16 bid;
++};
++
++struct io_statx {
++ struct file *file;
++ int dfd;
++ unsigned int mask;
++ unsigned int flags;
++ struct filename *filename;
++ struct statx __user *buffer;
++};
++
++struct io_shutdown {
++ struct file *file;
++ int how;
++};
++
++struct io_rename {
++ struct file *file;
++ int old_dfd;
++ int new_dfd;
++ struct filename *oldpath;
++ struct filename *newpath;
++ int flags;
++};
++
++struct io_unlink {
++ struct file *file;
++ int dfd;
++ int flags;
++ struct filename *filename;
++};
++
++struct io_mkdir {
++ struct file *file;
++ int dfd;
++ umode_t mode;
++ struct filename *filename;
++};
++
++struct io_symlink {
++ struct file *file;
++ int new_dfd;
++ struct filename *oldpath;
++ struct filename *newpath;
++};
++
++struct io_hardlink {
++ struct file *file;
++ int old_dfd;
++ int new_dfd;
++ struct filename *oldpath;
++ struct filename *newpath;
++ int flags;
++};
++
++struct io_msg {
++ struct file *file;
++ u64 user_data;
++ u32 len;
++};
++
++struct io_async_connect {
++ struct sockaddr_storage address;
++};
++
++struct io_async_msghdr {
++ struct iovec fast_iov[UIO_FASTIOV];
++ /* points to an allocated iov, if NULL we use fast_iov instead */
++ struct iovec *free_iov;
++ struct sockaddr __user *uaddr;
++ struct msghdr msg;
++ struct sockaddr_storage addr;
++};
++
++struct io_rw_state {
++ struct iov_iter iter;
++ struct iov_iter_state iter_state;
++ struct iovec fast_iov[UIO_FASTIOV];
++};
++
++struct io_async_rw {
++ struct io_rw_state s;
++ const struct iovec *free_iovec;
++ size_t bytes_done;
++ struct wait_page_queue wpq;
++};
++
++enum {
++ REQ_F_FIXED_FILE_BIT = IOSQE_FIXED_FILE_BIT,
++ REQ_F_IO_DRAIN_BIT = IOSQE_IO_DRAIN_BIT,
++ REQ_F_LINK_BIT = IOSQE_IO_LINK_BIT,
++ REQ_F_HARDLINK_BIT = IOSQE_IO_HARDLINK_BIT,
++ REQ_F_FORCE_ASYNC_BIT = IOSQE_ASYNC_BIT,
++ REQ_F_BUFFER_SELECT_BIT = IOSQE_BUFFER_SELECT_BIT,
++ REQ_F_CQE_SKIP_BIT = IOSQE_CQE_SKIP_SUCCESS_BIT,
++
++ /* first byte is taken by user flags, shift it to not overlap */
++ REQ_F_FAIL_BIT = 8,
++ REQ_F_INFLIGHT_BIT,
++ REQ_F_CUR_POS_BIT,
++ REQ_F_NOWAIT_BIT,
++ REQ_F_LINK_TIMEOUT_BIT,
++ REQ_F_NEED_CLEANUP_BIT,
++ REQ_F_POLLED_BIT,
++ REQ_F_BUFFER_SELECTED_BIT,
++ REQ_F_COMPLETE_INLINE_BIT,
++ REQ_F_REISSUE_BIT,
++ REQ_F_CREDS_BIT,
++ REQ_F_REFCOUNT_BIT,
++ REQ_F_ARM_LTIMEOUT_BIT,
++ REQ_F_ASYNC_DATA_BIT,
++ REQ_F_SKIP_LINK_CQES_BIT,
++ REQ_F_SINGLE_POLL_BIT,
++ REQ_F_DOUBLE_POLL_BIT,
++ REQ_F_PARTIAL_IO_BIT,
++ /* keep async read/write and isreg together and in order */
++ REQ_F_SUPPORT_NOWAIT_BIT,
++ REQ_F_ISREG_BIT,
++
++ /* not a real bit, just to check we're not overflowing the space */
++ __REQ_F_LAST_BIT,
++};
++
++enum {
++ /* ctx owns file */
++ REQ_F_FIXED_FILE = BIT(REQ_F_FIXED_FILE_BIT),
++ /* drain existing IO first */
++ REQ_F_IO_DRAIN = BIT(REQ_F_IO_DRAIN_BIT),
++ /* linked sqes */
++ REQ_F_LINK = BIT(REQ_F_LINK_BIT),
++ /* doesn't sever on completion < 0 */
++ REQ_F_HARDLINK = BIT(REQ_F_HARDLINK_BIT),
++ /* IOSQE_ASYNC */
++ REQ_F_FORCE_ASYNC = BIT(REQ_F_FORCE_ASYNC_BIT),
++ /* IOSQE_BUFFER_SELECT */
++ REQ_F_BUFFER_SELECT = BIT(REQ_F_BUFFER_SELECT_BIT),
++ /* IOSQE_CQE_SKIP_SUCCESS */
++ REQ_F_CQE_SKIP = BIT(REQ_F_CQE_SKIP_BIT),
++
++ /* fail rest of links */
++ REQ_F_FAIL = BIT(REQ_F_FAIL_BIT),
++ /* on inflight list, should be cancelled and waited on exit reliably */
++ REQ_F_INFLIGHT = BIT(REQ_F_INFLIGHT_BIT),
++ /* read/write uses file position */
++ REQ_F_CUR_POS = BIT(REQ_F_CUR_POS_BIT),
++ /* must not punt to workers */
++ REQ_F_NOWAIT = BIT(REQ_F_NOWAIT_BIT),
++ /* has or had linked timeout */
++ REQ_F_LINK_TIMEOUT = BIT(REQ_F_LINK_TIMEOUT_BIT),
++ /* needs cleanup */
++ REQ_F_NEED_CLEANUP = BIT(REQ_F_NEED_CLEANUP_BIT),
++ /* already went through poll handler */
++ REQ_F_POLLED = BIT(REQ_F_POLLED_BIT),
++ /* buffer already selected */
++ REQ_F_BUFFER_SELECTED = BIT(REQ_F_BUFFER_SELECTED_BIT),
++ /* completion is deferred through io_comp_state */
++ REQ_F_COMPLETE_INLINE = BIT(REQ_F_COMPLETE_INLINE_BIT),
++ /* caller should reissue async */
++ REQ_F_REISSUE = BIT(REQ_F_REISSUE_BIT),
++ /* supports async reads/writes */
++ REQ_F_SUPPORT_NOWAIT = BIT(REQ_F_SUPPORT_NOWAIT_BIT),
++ /* regular file */
++ REQ_F_ISREG = BIT(REQ_F_ISREG_BIT),
++ /* has creds assigned */
++ REQ_F_CREDS = BIT(REQ_F_CREDS_BIT),
++ /* skip refcounting if not set */
++ REQ_F_REFCOUNT = BIT(REQ_F_REFCOUNT_BIT),
++ /* there is a linked timeout that has to be armed */
++ REQ_F_ARM_LTIMEOUT = BIT(REQ_F_ARM_LTIMEOUT_BIT),
++ /* ->async_data allocated */
++ REQ_F_ASYNC_DATA = BIT(REQ_F_ASYNC_DATA_BIT),
++ /* don't post CQEs while failing linked requests */
++ REQ_F_SKIP_LINK_CQES = BIT(REQ_F_SKIP_LINK_CQES_BIT),
++ /* single poll may be active */
++ REQ_F_SINGLE_POLL = BIT(REQ_F_SINGLE_POLL_BIT),
++ /* double poll may active */
++ REQ_F_DOUBLE_POLL = BIT(REQ_F_DOUBLE_POLL_BIT),
++ /* request has already done partial IO */
++ REQ_F_PARTIAL_IO = BIT(REQ_F_PARTIAL_IO_BIT),
++};
++
++struct async_poll {
++ struct io_poll_iocb poll;
++ struct io_poll_iocb *double_poll;
++};
++
++typedef void (*io_req_tw_func_t)(struct io_kiocb *req, bool *locked);
++
++struct io_task_work {
++ union {
++ struct io_wq_work_node node;
++ struct llist_node fallback_node;
++ };
++ io_req_tw_func_t func;
++};
++
++enum {
++ IORING_RSRC_FILE = 0,
++ IORING_RSRC_BUFFER = 1,
++};
++
++/*
++ * NOTE! Each of the iocb union members has the file pointer
++ * as the first entry in their struct definition. So you can
++ * access the file pointer through any of the sub-structs,
++ * or directly as just 'file' in this struct.
++ */
++struct io_kiocb {
++ union {
++ struct file *file;
++ struct io_rw rw;
++ struct io_poll_iocb poll;
++ struct io_poll_update poll_update;
++ struct io_accept accept;
++ struct io_sync sync;
++ struct io_cancel cancel;
++ struct io_timeout timeout;
++ struct io_timeout_rem timeout_rem;
++ struct io_connect connect;
++ struct io_sr_msg sr_msg;
++ struct io_open open;
++ struct io_close close;
++ struct io_rsrc_update rsrc_update;
++ struct io_fadvise fadvise;
++ struct io_madvise madvise;
++ struct io_epoll epoll;
++ struct io_splice splice;
++ struct io_provide_buf pbuf;
++ struct io_statx statx;
++ struct io_shutdown shutdown;
++ struct io_rename rename;
++ struct io_unlink unlink;
++ struct io_mkdir mkdir;
++ struct io_symlink symlink;
++ struct io_hardlink hardlink;
++ struct io_msg msg;
++ };
++
++ u8 opcode;
++ /* polled IO has completed */
++ u8 iopoll_completed;
++ u16 buf_index;
++ unsigned int flags;
++
++ u64 user_data;
++ u32 result;
++ /* fd initially, then cflags for completion */
++ union {
++ u32 cflags;
++ int fd;
++ };
++
++ struct io_ring_ctx *ctx;
++ struct task_struct *task;
++
++ struct percpu_ref *fixed_rsrc_refs;
++ /* store used ubuf, so we can prevent reloading */
++ struct io_mapped_ubuf *imu;
++
++ union {
++ /* used by request caches, completion batching and iopoll */
++ struct io_wq_work_node comp_list;
++ /* cache ->apoll->events */
++ __poll_t apoll_events;
++ };
++ atomic_t refs;
++ atomic_t poll_refs;
++ struct io_task_work io_task_work;
++ /* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
++ struct hlist_node hash_node;
++ /* internal polling, see IORING_FEAT_FAST_POLL */
++ struct async_poll *apoll;
++ /* opcode allocated if it needs to store data for async defer */
++ void *async_data;
++ /* stores selected buf, valid IFF REQ_F_BUFFER_SELECTED is set */
++ struct io_buffer *kbuf;
++ /* linked requests, IFF REQ_F_HARDLINK or REQ_F_LINK are set */
++ struct io_kiocb *link;
++ /* custom credentials, valid IFF REQ_F_CREDS is set */
++ const struct cred *creds;
++ struct io_wq_work work;
++};
++
++struct io_tctx_node {
++ struct list_head ctx_node;
++ struct task_struct *task;
++ struct io_ring_ctx *ctx;
++};
++
++struct io_defer_entry {
++ struct list_head list;
++ struct io_kiocb *req;
++ u32 seq;
++};
++
++struct io_op_def {
++ /* needs req->file assigned */
++ unsigned needs_file : 1;
++ /* should block plug */
++ unsigned plug : 1;
++ /* hash wq insertion if file is a regular file */
++ unsigned hash_reg_file : 1;
++ /* unbound wq insertion if file is a non-regular file */
++ unsigned unbound_nonreg_file : 1;
++ /* set if opcode supports polled "wait" */
++ unsigned pollin : 1;
++ unsigned pollout : 1;
++ unsigned poll_exclusive : 1;
++ /* op supports buffer selection */
++ unsigned buffer_select : 1;
++ /* do prep async if is going to be punted */
++ unsigned needs_async_setup : 1;
++ /* opcode is not supported by this kernel */
++ unsigned not_supported : 1;
++ /* skip auditing */
++ unsigned audit_skip : 1;
++ /* size of async data needed, if any */
++ unsigned short async_size;
++};
++
++static const struct io_op_def io_op_defs[] = {
++ [IORING_OP_NOP] = {},
++ [IORING_OP_READV] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollin = 1,
++ .buffer_select = 1,
++ .needs_async_setup = 1,
++ .plug = 1,
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_async_rw),
++ },
++ [IORING_OP_WRITEV] = {
++ .needs_file = 1,
++ .hash_reg_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollout = 1,
++ .needs_async_setup = 1,
++ .plug = 1,
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_async_rw),
++ },
++ [IORING_OP_FSYNC] = {
++ .needs_file = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_READ_FIXED] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollin = 1,
++ .plug = 1,
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_async_rw),
++ },
++ [IORING_OP_WRITE_FIXED] = {
++ .needs_file = 1,
++ .hash_reg_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollout = 1,
++ .plug = 1,
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_async_rw),
++ },
++ [IORING_OP_POLL_ADD] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_POLL_REMOVE] = {
++ .audit_skip = 1,
++ },
++ [IORING_OP_SYNC_FILE_RANGE] = {
++ .needs_file = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_SENDMSG] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollout = 1,
++ .needs_async_setup = 1,
++ .async_size = sizeof(struct io_async_msghdr),
++ },
++ [IORING_OP_RECVMSG] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollin = 1,
++ .buffer_select = 1,
++ .needs_async_setup = 1,
++ .async_size = sizeof(struct io_async_msghdr),
++ },
++ [IORING_OP_TIMEOUT] = {
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_timeout_data),
++ },
++ [IORING_OP_TIMEOUT_REMOVE] = {
++ /* used by timeout updates' prep() */
++ .audit_skip = 1,
++ },
++ [IORING_OP_ACCEPT] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollin = 1,
++ .poll_exclusive = 1,
++ },
++ [IORING_OP_ASYNC_CANCEL] = {
++ .audit_skip = 1,
++ },
++ [IORING_OP_LINK_TIMEOUT] = {
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_timeout_data),
++ },
++ [IORING_OP_CONNECT] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollout = 1,
++ .needs_async_setup = 1,
++ .async_size = sizeof(struct io_async_connect),
++ },
++ [IORING_OP_FALLOCATE] = {
++ .needs_file = 1,
++ },
++ [IORING_OP_OPENAT] = {},
++ [IORING_OP_CLOSE] = {},
++ [IORING_OP_FILES_UPDATE] = {
++ .audit_skip = 1,
++ },
++ [IORING_OP_STATX] = {
++ .audit_skip = 1,
++ },
++ [IORING_OP_READ] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollin = 1,
++ .buffer_select = 1,
++ .plug = 1,
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_async_rw),
++ },
++ [IORING_OP_WRITE] = {
++ .needs_file = 1,
++ .hash_reg_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollout = 1,
++ .plug = 1,
++ .audit_skip = 1,
++ .async_size = sizeof(struct io_async_rw),
++ },
++ [IORING_OP_FADVISE] = {
++ .needs_file = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_MADVISE] = {},
++ [IORING_OP_SEND] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollout = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_RECV] = {
++ .needs_file = 1,
++ .unbound_nonreg_file = 1,
++ .pollin = 1,
++ .buffer_select = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_OPENAT2] = {
++ },
++ [IORING_OP_EPOLL_CTL] = {
++ .unbound_nonreg_file = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_SPLICE] = {
++ .needs_file = 1,
++ .hash_reg_file = 1,
++ .unbound_nonreg_file = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_PROVIDE_BUFFERS] = {
++ .audit_skip = 1,
++ },
++ [IORING_OP_REMOVE_BUFFERS] = {
++ .audit_skip = 1,
++ },
++ [IORING_OP_TEE] = {
++ .needs_file = 1,
++ .hash_reg_file = 1,
++ .unbound_nonreg_file = 1,
++ .audit_skip = 1,
++ },
++ [IORING_OP_SHUTDOWN] = {
++ .needs_file = 1,
++ },
++ [IORING_OP_RENAMEAT] = {},
++ [IORING_OP_UNLINKAT] = {},
++ [IORING_OP_MKDIRAT] = {},
++ [IORING_OP_SYMLINKAT] = {},
++ [IORING_OP_LINKAT] = {},
++ [IORING_OP_MSG_RING] = {
++ .needs_file = 1,
++ },
++};
++
++/* requests with any of those set should undergo io_disarm_next() */
++#define IO_DISARM_MASK (REQ_F_ARM_LTIMEOUT | REQ_F_LINK_TIMEOUT | REQ_F_FAIL)
++
++static bool io_disarm_next(struct io_kiocb *req);
++static void io_uring_del_tctx_node(unsigned long index);
++static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
++ struct task_struct *task,
++ bool cancel_all);
++static void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd);
++
++static void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags);
++
++static void io_put_req(struct io_kiocb *req);
++static void io_put_req_deferred(struct io_kiocb *req);
++static void io_dismantle_req(struct io_kiocb *req);
++static void io_queue_linked_timeout(struct io_kiocb *req);
++static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
++ struct io_uring_rsrc_update2 *up,
++ unsigned nr_args);
++static void io_clean_op(struct io_kiocb *req);
++static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
++ unsigned issue_flags);
++static inline struct file *io_file_get_normal(struct io_kiocb *req, int fd);
++static void __io_queue_sqe(struct io_kiocb *req);
++static void io_rsrc_put_work(struct work_struct *work);
++
++static void io_req_task_queue(struct io_kiocb *req);
++static void __io_submit_flush_completions(struct io_ring_ctx *ctx);
++static int io_req_prep_async(struct io_kiocb *req);
++
++static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
++ unsigned int issue_flags, u32 slot_index);
++static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags);
++
++static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer);
++static void io_eventfd_signal(struct io_ring_ctx *ctx);
++
++static struct kmem_cache *req_cachep;
++
++static const struct file_operations io_uring_fops;
++
++struct sock *io_uring_get_socket(struct file *file)
++{
++#if defined(CONFIG_UNIX)
++ if (file->f_op == &io_uring_fops) {
++ struct io_ring_ctx *ctx = file->private_data;
++
++ return ctx->ring_sock->sk;
++ }
++#endif
++ return NULL;
++}
++EXPORT_SYMBOL(io_uring_get_socket);
++
++static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
++{
++ if (!*locked) {
++ mutex_lock(&ctx->uring_lock);
++ *locked = true;
++ }
++}
++
++#define io_for_each_link(pos, head) \
++ for (pos = (head); pos; pos = pos->link)
++
++/*
++ * Shamelessly stolen from the mm implementation of page reference checking,
++ * see commit f958d7b528b1 for details.
++ */
++#define req_ref_zero_or_close_to_overflow(req) \
++ ((unsigned int) atomic_read(&(req->refs)) + 127u <= 127u)
++
++static inline bool req_ref_inc_not_zero(struct io_kiocb *req)
++{
++ WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
++ return atomic_inc_not_zero(&req->refs);
++}
++
++static inline bool req_ref_put_and_test(struct io_kiocb *req)
++{
++ if (likely(!(req->flags & REQ_F_REFCOUNT)))
++ return true;
++
++ WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
++ return atomic_dec_and_test(&req->refs);
++}
++
++static inline void req_ref_get(struct io_kiocb *req)
++{
++ WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
++ WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
++ atomic_inc(&req->refs);
++}
++
++static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
++{
++ if (!wq_list_empty(&ctx->submit_state.compl_reqs))
++ __io_submit_flush_completions(ctx);
++}
++
++static inline void __io_req_set_refcount(struct io_kiocb *req, int nr)
++{
++ if (!(req->flags & REQ_F_REFCOUNT)) {
++ req->flags |= REQ_F_REFCOUNT;
++ atomic_set(&req->refs, nr);
++ }
++}
++
++static inline void io_req_set_refcount(struct io_kiocb *req)
++{
++ __io_req_set_refcount(req, 1);
++}
++
++#define IO_RSRC_REF_BATCH 100
++
++static inline void io_req_put_rsrc_locked(struct io_kiocb *req,
++ struct io_ring_ctx *ctx)
++ __must_hold(&ctx->uring_lock)
++{
++ struct percpu_ref *ref = req->fixed_rsrc_refs;
++
++ if (ref) {
++ if (ref == &ctx->rsrc_node->refs)
++ ctx->rsrc_cached_refs++;
++ else
++ percpu_ref_put(ref);
++ }
++}
++
++static inline void io_req_put_rsrc(struct io_kiocb *req, struct io_ring_ctx *ctx)
++{
++ if (req->fixed_rsrc_refs)
++ percpu_ref_put(req->fixed_rsrc_refs);
++}
++
++static __cold void io_rsrc_refs_drop(struct io_ring_ctx *ctx)
++ __must_hold(&ctx->uring_lock)
++{
++ if (ctx->rsrc_cached_refs) {
++ percpu_ref_put_many(&ctx->rsrc_node->refs, ctx->rsrc_cached_refs);
++ ctx->rsrc_cached_refs = 0;
++ }
++}
++
++static void io_rsrc_refs_refill(struct io_ring_ctx *ctx)
++ __must_hold(&ctx->uring_lock)
++{
++ ctx->rsrc_cached_refs += IO_RSRC_REF_BATCH;
++ percpu_ref_get_many(&ctx->rsrc_node->refs, IO_RSRC_REF_BATCH);
++}
++
++static inline void io_req_set_rsrc_node(struct io_kiocb *req,
++ struct io_ring_ctx *ctx,
++ unsigned int issue_flags)
++{
++ if (!req->fixed_rsrc_refs) {
++ req->fixed_rsrc_refs = &ctx->rsrc_node->refs;
++
++ if (!(issue_flags & IO_URING_F_UNLOCKED)) {
++ lockdep_assert_held(&ctx->uring_lock);
++ ctx->rsrc_cached_refs--;
++ if (unlikely(ctx->rsrc_cached_refs < 0))
++ io_rsrc_refs_refill(ctx);
++ } else {
++ percpu_ref_get(req->fixed_rsrc_refs);
++ }
++ }
++}
++
++static unsigned int __io_put_kbuf(struct io_kiocb *req, struct list_head *list)
++{
++ struct io_buffer *kbuf = req->kbuf;
++ unsigned int cflags;
++
++ cflags = IORING_CQE_F_BUFFER | (kbuf->bid << IORING_CQE_BUFFER_SHIFT);
++ req->flags &= ~REQ_F_BUFFER_SELECTED;
++ list_add(&kbuf->list, list);
++ req->kbuf = NULL;
++ return cflags;
++}
++
++static inline unsigned int io_put_kbuf_comp(struct io_kiocb *req)
++{
++ lockdep_assert_held(&req->ctx->completion_lock);
++
++ if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
++ return 0;
++ return __io_put_kbuf(req, &req->ctx->io_buffers_comp);
++}
++
++static inline unsigned int io_put_kbuf(struct io_kiocb *req,
++ unsigned issue_flags)
++{
++ unsigned int cflags;
++
++ if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
++ return 0;
++
++ /*
++ * We can add this buffer back to two lists:
++ *
++ * 1) The io_buffers_cache list. This one is protected by the
++ * ctx->uring_lock. If we already hold this lock, add back to this
++ * list as we can grab it from issue as well.
++ * 2) The io_buffers_comp list. This one is protected by the
++ * ctx->completion_lock.
++ *
++ * We migrate buffers from the comp_list to the issue cache list
++ * when we need one.
++ */
++ if (issue_flags & IO_URING_F_UNLOCKED) {
++ struct io_ring_ctx *ctx = req->ctx;
++
++ spin_lock(&ctx->completion_lock);
++ cflags = __io_put_kbuf(req, &ctx->io_buffers_comp);
++ spin_unlock(&ctx->completion_lock);
++ } else {
++ lockdep_assert_held(&req->ctx->uring_lock);
++
++ cflags = __io_put_kbuf(req, &req->ctx->io_buffers_cache);
++ }
++
++ return cflags;
++}
++
++static struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx *ctx,
++ unsigned int bgid)
++{
++ struct list_head *hash_list;
++ struct io_buffer_list *bl;
++
++ hash_list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
++ list_for_each_entry(bl, hash_list, list)
++ if (bl->bgid == bgid || bgid == -1U)
++ return bl;
++
++ return NULL;
++}
++
++static void io_kbuf_recycle(struct io_kiocb *req, unsigned issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_buffer_list *bl;
++ struct io_buffer *buf;
++
++ if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
++ return;
++ /* don't recycle if we already did IO to this buffer */
++ if (req->flags & REQ_F_PARTIAL_IO)
++ return;
++
++ if (issue_flags & IO_URING_F_UNLOCKED)
++ mutex_lock(&ctx->uring_lock);
++
++ lockdep_assert_held(&ctx->uring_lock);
++
++ buf = req->kbuf;
++ bl = io_buffer_get_list(ctx, buf->bgid);
++ list_add(&buf->list, &bl->buf_list);
++ req->flags &= ~REQ_F_BUFFER_SELECTED;
++ req->kbuf = NULL;
++
++ if (issue_flags & IO_URING_F_UNLOCKED)
++ mutex_unlock(&ctx->uring_lock);
++}
++
++static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
++ bool cancel_all)
++ __must_hold(&req->ctx->timeout_lock)
++{
++ struct io_kiocb *req;
++
++ if (task && head->task != task)
++ return false;
++ if (cancel_all)
++ return true;
++
++ io_for_each_link(req, head) {
++ if (req->flags & REQ_F_INFLIGHT)
++ return true;
++ }
++ return false;
++}
++
++static bool io_match_linked(struct io_kiocb *head)
++{
++ struct io_kiocb *req;
++
++ io_for_each_link(req, head) {
++ if (req->flags & REQ_F_INFLIGHT)
++ return true;
++ }
++ return false;
++}
++
++/*
++ * As io_match_task() but protected against racing with linked timeouts.
++ * User must not hold timeout_lock.
++ */
++static bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
++ bool cancel_all)
++{
++ bool matched;
++
++ if (task && head->task != task)
++ return false;
++ if (cancel_all)
++ return true;
++
++ if (head->flags & REQ_F_LINK_TIMEOUT) {
++ struct io_ring_ctx *ctx = head->ctx;
++
++ /* protect against races with linked timeouts */
++ spin_lock_irq(&ctx->timeout_lock);
++ matched = io_match_linked(head);
++ spin_unlock_irq(&ctx->timeout_lock);
++ } else {
++ matched = io_match_linked(head);
++ }
++ return matched;
++}
++
++static inline bool req_has_async_data(struct io_kiocb *req)
++{
++ return req->flags & REQ_F_ASYNC_DATA;
++}
++
++static inline void req_set_fail(struct io_kiocb *req)
++{
++ req->flags |= REQ_F_FAIL;
++ if (req->flags & REQ_F_CQE_SKIP) {
++ req->flags &= ~REQ_F_CQE_SKIP;
++ req->flags |= REQ_F_SKIP_LINK_CQES;
++ }
++}
++
++static inline void req_fail_link_node(struct io_kiocb *req, int res)
++{
++ req_set_fail(req);
++ req->result = res;
++}
++
++static __cold void io_ring_ctx_ref_free(struct percpu_ref *ref)
++{
++ struct io_ring_ctx *ctx = container_of(ref, struct io_ring_ctx, refs);
++
++ complete(&ctx->ref_comp);
++}
++
++static inline bool io_is_timeout_noseq(struct io_kiocb *req)
++{
++ return !req->timeout.off;
++}
++
++static __cold void io_fallback_req_func(struct work_struct *work)
++{
++ struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
++ fallback_work.work);
++ struct llist_node *node = llist_del_all(&ctx->fallback_llist);
++ struct io_kiocb *req, *tmp;
++ bool locked = false;
++
++ percpu_ref_get(&ctx->refs);
++ llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
++ req->io_task_work.func(req, &locked);
++
++ if (locked) {
++ io_submit_flush_completions(ctx);
++ mutex_unlock(&ctx->uring_lock);
++ }
++ percpu_ref_put(&ctx->refs);
++}
++
++static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
++{
++ struct io_ring_ctx *ctx;
++ int i, hash_bits;
++
++ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
++ if (!ctx)
++ return NULL;
++
++ /*
++ * Use 5 bits less than the max cq entries, that should give us around
++ * 32 entries per hash list if totally full and uniformly spread.
++ */
++ hash_bits = ilog2(p->cq_entries);
++ hash_bits -= 5;
++ if (hash_bits <= 0)
++ hash_bits = 1;
++ ctx->cancel_hash_bits = hash_bits;
++ ctx->cancel_hash = kmalloc((1U << hash_bits) * sizeof(struct hlist_head),
++ GFP_KERNEL);
++ if (!ctx->cancel_hash)
++ goto err;
++ __hash_init(ctx->cancel_hash, 1U << hash_bits);
++
++ ctx->dummy_ubuf = kzalloc(sizeof(*ctx->dummy_ubuf), GFP_KERNEL);
++ if (!ctx->dummy_ubuf)
++ goto err;
++ /* set invalid range, so io_import_fixed() fails meeting it */
++ ctx->dummy_ubuf->ubuf = -1UL;
++
++ ctx->io_buffers = kcalloc(1U << IO_BUFFERS_HASH_BITS,
++ sizeof(struct list_head), GFP_KERNEL);
++ if (!ctx->io_buffers)
++ goto err;
++ for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++)
++ INIT_LIST_HEAD(&ctx->io_buffers[i]);
++
++ if (percpu_ref_init(&ctx->refs, io_ring_ctx_ref_free,
++ 0, GFP_KERNEL))
++ goto err;
++
++ ctx->flags = p->flags;
++ init_waitqueue_head(&ctx->sqo_sq_wait);
++ INIT_LIST_HEAD(&ctx->sqd_list);
++ INIT_LIST_HEAD(&ctx->cq_overflow_list);
++ INIT_LIST_HEAD(&ctx->io_buffers_cache);
++ INIT_LIST_HEAD(&ctx->apoll_cache);
++ init_completion(&ctx->ref_comp);
++ xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1);
++ mutex_init(&ctx->uring_lock);
++ init_waitqueue_head(&ctx->cq_wait);
++ spin_lock_init(&ctx->completion_lock);
++ spin_lock_init(&ctx->timeout_lock);
++ INIT_WQ_LIST(&ctx->iopoll_list);
++ INIT_LIST_HEAD(&ctx->io_buffers_pages);
++ INIT_LIST_HEAD(&ctx->io_buffers_comp);
++ INIT_LIST_HEAD(&ctx->defer_list);
++ INIT_LIST_HEAD(&ctx->timeout_list);
++ INIT_LIST_HEAD(&ctx->ltimeout_list);
++ spin_lock_init(&ctx->rsrc_ref_lock);
++ INIT_LIST_HEAD(&ctx->rsrc_ref_list);
++ INIT_DELAYED_WORK(&ctx->rsrc_put_work, io_rsrc_put_work);
++ init_llist_head(&ctx->rsrc_put_llist);
++ INIT_LIST_HEAD(&ctx->tctx_list);
++ ctx->submit_state.free_list.next = NULL;
++ INIT_WQ_LIST(&ctx->locked_free_list);
++ INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
++ INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
++ return ctx;
++err:
++ kfree(ctx->dummy_ubuf);
++ kfree(ctx->cancel_hash);
++ kfree(ctx->io_buffers);
++ kfree(ctx);
++ return NULL;
++}
++
++static void io_account_cq_overflow(struct io_ring_ctx *ctx)
++{
++ struct io_rings *r = ctx->rings;
++
++ WRITE_ONCE(r->cq_overflow, READ_ONCE(r->cq_overflow) + 1);
++ ctx->cq_extra--;
++}
++
++static bool req_need_defer(struct io_kiocb *req, u32 seq)
++{
++ if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
++ struct io_ring_ctx *ctx = req->ctx;
++
++ return seq + READ_ONCE(ctx->cq_extra) != ctx->cached_cq_tail;
++ }
++
++ return false;
++}
++
++#define FFS_NOWAIT 0x1UL
++#define FFS_ISREG 0x2UL
++#define FFS_MASK ~(FFS_NOWAIT|FFS_ISREG)
++
++static inline bool io_req_ffs_set(struct io_kiocb *req)
++{
++ return req->flags & REQ_F_FIXED_FILE;
++}
++
++static inline void io_req_track_inflight(struct io_kiocb *req)
++{
++ if (!(req->flags & REQ_F_INFLIGHT)) {
++ req->flags |= REQ_F_INFLIGHT;
++ atomic_inc(&req->task->io_uring->inflight_tracked);
++ }
++}
++
++static struct io_kiocb *__io_prep_linked_timeout(struct io_kiocb *req)
++{
++ if (WARN_ON_ONCE(!req->link))
++ return NULL;
++
++ req->flags &= ~REQ_F_ARM_LTIMEOUT;
++ req->flags |= REQ_F_LINK_TIMEOUT;
++
++ /* linked timeouts should have two refs once prep'ed */
++ io_req_set_refcount(req);
++ __io_req_set_refcount(req->link, 2);
++ return req->link;
++}
++
++static inline struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
++{
++ if (likely(!(req->flags & REQ_F_ARM_LTIMEOUT)))
++ return NULL;
++ return __io_prep_linked_timeout(req);
++}
++
++static void io_prep_async_work(struct io_kiocb *req)
++{
++ const struct io_op_def *def = &io_op_defs[req->opcode];
++ struct io_ring_ctx *ctx = req->ctx;
++
++ if (!(req->flags & REQ_F_CREDS)) {
++ req->flags |= REQ_F_CREDS;
++ req->creds = get_current_cred();
++ }
++
++ req->work.list.next = NULL;
++ req->work.flags = 0;
++ if (req->flags & REQ_F_FORCE_ASYNC)
++ req->work.flags |= IO_WQ_WORK_CONCURRENT;
++
++ if (req->flags & REQ_F_ISREG) {
++ if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
++ io_wq_hash_work(&req->work, file_inode(req->file));
++ } else if (!req->file || !S_ISBLK(file_inode(req->file)->i_mode)) {
++ if (def->unbound_nonreg_file)
++ req->work.flags |= IO_WQ_WORK_UNBOUND;
++ }
++}
++
++static void io_prep_async_link(struct io_kiocb *req)
++{
++ struct io_kiocb *cur;
++
++ if (req->flags & REQ_F_LINK_TIMEOUT) {
++ struct io_ring_ctx *ctx = req->ctx;
++
++ spin_lock_irq(&ctx->timeout_lock);
++ io_for_each_link(cur, req)
++ io_prep_async_work(cur);
++ spin_unlock_irq(&ctx->timeout_lock);
++ } else {
++ io_for_each_link(cur, req)
++ io_prep_async_work(cur);
++ }
++}
++
++static inline void io_req_add_compl_list(struct io_kiocb *req)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_submit_state *state = &ctx->submit_state;
++
++ if (!(req->flags & REQ_F_CQE_SKIP))
++ ctx->submit_state.flush_cqes = true;
++ wq_list_add_tail(&req->comp_list, &state->compl_reqs);
++}
++
++static void io_queue_async_work(struct io_kiocb *req, bool *dont_use)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_kiocb *link = io_prep_linked_timeout(req);
++ struct io_uring_task *tctx = req->task->io_uring;
++
++ BUG_ON(!tctx);
++ BUG_ON(!tctx->io_wq);
++
++ /* init ->work of the whole link before punting */
++ io_prep_async_link(req);
++
++ /*
++ * Not expected to happen, but if we do have a bug where this _can_
++ * happen, catch it here and ensure the request is marked as
++ * canceled. That will make io-wq go through the usual work cancel
++ * procedure rather than attempt to run this request (or create a new
++ * worker for it).
++ */
++ if (WARN_ON_ONCE(!same_thread_group(req->task, current)))
++ req->work.flags |= IO_WQ_WORK_CANCEL;
++
++ trace_io_uring_queue_async_work(ctx, req, req->user_data, req->opcode, req->flags,
++ &req->work, io_wq_is_hashed(&req->work));
++ io_wq_enqueue(tctx->io_wq, &req->work);
++ if (link)
++ io_queue_linked_timeout(link);
++}
++
++static void io_kill_timeout(struct io_kiocb *req, int status)
++ __must_hold(&req->ctx->completion_lock)
++ __must_hold(&req->ctx->timeout_lock)
++{
++ struct io_timeout_data *io = req->async_data;
++
++ if (hrtimer_try_to_cancel(&io->timer) != -1) {
++ if (status)
++ req_set_fail(req);
++ atomic_set(&req->ctx->cq_timeouts,
++ atomic_read(&req->ctx->cq_timeouts) + 1);
++ list_del_init(&req->timeout.list);
++ io_fill_cqe_req(req, status, 0);
++ io_put_req_deferred(req);
++ }
++}
++
++static __cold void io_queue_deferred(struct io_ring_ctx *ctx)
++{
++ while (!list_empty(&ctx->defer_list)) {
++ struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
++ struct io_defer_entry, list);
++
++ if (req_need_defer(de->req, de->seq))
++ break;
++ list_del_init(&de->list);
++ io_req_task_queue(de->req);
++ kfree(de);
++ }
++}
++
++static __cold void io_flush_timeouts(struct io_ring_ctx *ctx)
++ __must_hold(&ctx->completion_lock)
++{
++ u32 seq = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
++ struct io_kiocb *req, *tmp;
++
++ spin_lock_irq(&ctx->timeout_lock);
++ list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
++ u32 events_needed, events_got;
++
++ if (io_is_timeout_noseq(req))
++ break;
++
++ /*
++ * Since seq can easily wrap around over time, subtract
++ * the last seq at which timeouts were flushed before comparing.
++ * Assuming not more than 2^31-1 events have happened since,
++ * these subtractions won't have wrapped, so we can check if
++ * target is in [last_seq, current_seq] by comparing the two.
++ */
++ events_needed = req->timeout.target_seq - ctx->cq_last_tm_flush;
++ events_got = seq - ctx->cq_last_tm_flush;
++ if (events_got < events_needed)
++ break;
++
++ io_kill_timeout(req, 0);
++ }
++ ctx->cq_last_tm_flush = seq;
++ spin_unlock_irq(&ctx->timeout_lock);
++}
++
++static inline void io_commit_cqring(struct io_ring_ctx *ctx)
++{
++ /* order cqe stores with ring update */
++ smp_store_release(&ctx->rings->cq.tail, ctx->cached_cq_tail);
++}
++
++static void __io_commit_cqring_flush(struct io_ring_ctx *ctx)
++{
++ if (ctx->off_timeout_used || ctx->drain_active) {
++ spin_lock(&ctx->completion_lock);
++ if (ctx->off_timeout_used)
++ io_flush_timeouts(ctx);
++ if (ctx->drain_active)
++ io_queue_deferred(ctx);
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ }
++ if (ctx->has_evfd)
++ io_eventfd_signal(ctx);
++}
++
++static inline bool io_sqring_full(struct io_ring_ctx *ctx)
++{
++ struct io_rings *r = ctx->rings;
++
++ return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
++}
++
++static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
++{
++ return ctx->cached_cq_tail - READ_ONCE(ctx->rings->cq.head);
++}
++
++static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
++{
++ struct io_rings *rings = ctx->rings;
++ unsigned tail, mask = ctx->cq_entries - 1;
++
++ /*
++ * writes to the cq entry need to come after reading head; the
++ * control dependency is enough as we're using WRITE_ONCE to
++ * fill the cq entry
++ */
++ if (__io_cqring_events(ctx) == ctx->cq_entries)
++ return NULL;
++
++ tail = ctx->cached_cq_tail++;
++ return &rings->cqes[tail & mask];
++}
++
++static void io_eventfd_signal(struct io_ring_ctx *ctx)
++{
++ struct io_ev_fd *ev_fd;
++
++ rcu_read_lock();
++ /*
++ * rcu_dereference ctx->io_ev_fd once and use it for both for checking
++ * and eventfd_signal
++ */
++ ev_fd = rcu_dereference(ctx->io_ev_fd);
++
++ /*
++ * Check again if ev_fd exists incase an io_eventfd_unregister call
++ * completed between the NULL check of ctx->io_ev_fd at the start of
++ * the function and rcu_read_lock.
++ */
++ if (unlikely(!ev_fd))
++ goto out;
++ if (READ_ONCE(ctx->rings->cq_flags) & IORING_CQ_EVENTFD_DISABLED)
++ goto out;
++
++ if (!ev_fd->eventfd_async || io_wq_current_is_worker())
++ eventfd_signal(ev_fd->cq_ev_fd, 1);
++out:
++ rcu_read_unlock();
++}
++
++static inline void io_cqring_wake(struct io_ring_ctx *ctx)
++{
++ /*
++ * wake_up_all() may seem excessive, but io_wake_function() and
++ * io_should_wake() handle the termination of the loop and only
++ * wake as many waiters as we need to.
++ */
++ if (wq_has_sleeper(&ctx->cq_wait))
++ wake_up_all(&ctx->cq_wait);
++}
++
++/*
++ * This should only get called when at least one event has been posted.
++ * Some applications rely on the eventfd notification count only changing
++ * IFF a new CQE has been added to the CQ ring. There's no depedency on
++ * 1:1 relationship between how many times this function is called (and
++ * hence the eventfd count) and number of CQEs posted to the CQ ring.
++ */
++static inline void io_cqring_ev_posted(struct io_ring_ctx *ctx)
++{
++ if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
++ ctx->has_evfd))
++ __io_commit_cqring_flush(ctx);
++
++ io_cqring_wake(ctx);
++}
++
++static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
++{
++ if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
++ ctx->has_evfd))
++ __io_commit_cqring_flush(ctx);
++
++ if (ctx->flags & IORING_SETUP_SQPOLL)
++ io_cqring_wake(ctx);
++}
++
++/* Returns true if there are no backlogged entries after the flush */
++static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
++{
++ bool all_flushed, posted;
++
++ if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
++ return false;
++
++ posted = false;
++ spin_lock(&ctx->completion_lock);
++ while (!list_empty(&ctx->cq_overflow_list)) {
++ struct io_uring_cqe *cqe = io_get_cqe(ctx);
++ struct io_overflow_cqe *ocqe;
++
++ if (!cqe && !force)
++ break;
++ ocqe = list_first_entry(&ctx->cq_overflow_list,
++ struct io_overflow_cqe, list);
++ if (cqe)
++ memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
++ else
++ io_account_cq_overflow(ctx);
++
++ posted = true;
++ list_del(&ocqe->list);
++ kfree(ocqe);
++ }
++
++ all_flushed = list_empty(&ctx->cq_overflow_list);
++ if (all_flushed) {
++ clear_bit(0, &ctx->check_cq_overflow);
++ WRITE_ONCE(ctx->rings->sq_flags,
++ ctx->rings->sq_flags & ~IORING_SQ_CQ_OVERFLOW);
++ }
++
++ if (posted)
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ if (posted)
++ io_cqring_ev_posted(ctx);
++ return all_flushed;
++}
++
++static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx)
++{
++ bool ret = true;
++
++ if (test_bit(0, &ctx->check_cq_overflow)) {
++ /* iopoll syncs against uring_lock, not completion_lock */
++ if (ctx->flags & IORING_SETUP_IOPOLL)
++ mutex_lock(&ctx->uring_lock);
++ ret = __io_cqring_overflow_flush(ctx, false);
++ if (ctx->flags & IORING_SETUP_IOPOLL)
++ mutex_unlock(&ctx->uring_lock);
++ }
++
++ return ret;
++}
++
++/* must to be called somewhat shortly after putting a request */
++static inline void io_put_task(struct task_struct *task, int nr)
++{
++ struct io_uring_task *tctx = task->io_uring;
++
++ if (likely(task == current)) {
++ tctx->cached_refs += nr;
++ } else {
++ percpu_counter_sub(&tctx->inflight, nr);
++ if (unlikely(atomic_read(&tctx->in_idle)))
++ wake_up(&tctx->wait);
++ put_task_struct_many(task, nr);
++ }
++}
++
++static void io_task_refs_refill(struct io_uring_task *tctx)
++{
++ unsigned int refill = -tctx->cached_refs + IO_TCTX_REFS_CACHE_NR;
++
++ percpu_counter_add(&tctx->inflight, refill);
++ refcount_add(refill, ¤t->usage);
++ tctx->cached_refs += refill;
++}
++
++static inline void io_get_task_refs(int nr)
++{
++ struct io_uring_task *tctx = current->io_uring;
++
++ tctx->cached_refs -= nr;
++ if (unlikely(tctx->cached_refs < 0))
++ io_task_refs_refill(tctx);
++}
++
++static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
++{
++ struct io_uring_task *tctx = task->io_uring;
++ unsigned int refs = tctx->cached_refs;
++
++ if (refs) {
++ tctx->cached_refs = 0;
++ percpu_counter_sub(&tctx->inflight, refs);
++ put_task_struct_many(task, refs);
++ }
++}
++
++static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
++ s32 res, u32 cflags)
++{
++ struct io_overflow_cqe *ocqe;
++
++ ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
++ if (!ocqe) {
++ /*
++ * If we're in ring overflow flush mode, or in task cancel mode,
++ * or cannot allocate an overflow entry, then we need to drop it
++ * on the floor.
++ */
++ io_account_cq_overflow(ctx);
++ return false;
++ }
++ if (list_empty(&ctx->cq_overflow_list)) {
++ set_bit(0, &ctx->check_cq_overflow);
++ WRITE_ONCE(ctx->rings->sq_flags,
++ ctx->rings->sq_flags | IORING_SQ_CQ_OVERFLOW);
++
++ }
++ ocqe->cqe.user_data = user_data;
++ ocqe->cqe.res = res;
++ ocqe->cqe.flags = cflags;
++ list_add_tail(&ocqe->list, &ctx->cq_overflow_list);
++ return true;
++}
++
++static inline bool __io_fill_cqe(struct io_ring_ctx *ctx, u64 user_data,
++ s32 res, u32 cflags)
++{
++ struct io_uring_cqe *cqe;
++
++ /*
++ * If we can't get a cq entry, userspace overflowed the
++ * submission (by quite a lot). Increment the overflow count in
++ * the ring.
++ */
++ cqe = io_get_cqe(ctx);
++ if (likely(cqe)) {
++ WRITE_ONCE(cqe->user_data, user_data);
++ WRITE_ONCE(cqe->res, res);
++ WRITE_ONCE(cqe->flags, cflags);
++ return true;
++ }
++ return io_cqring_event_overflow(ctx, user_data, res, cflags);
++}
++
++static inline bool __io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
++{
++ trace_io_uring_complete(req->ctx, req, req->user_data, res, cflags);
++ return __io_fill_cqe(req->ctx, req->user_data, res, cflags);
++}
++
++static noinline void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
++{
++ if (!(req->flags & REQ_F_CQE_SKIP))
++ __io_fill_cqe_req(req, res, cflags);
++}
++
++static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
++ s32 res, u32 cflags)
++{
++ ctx->cq_extra++;
++ trace_io_uring_complete(ctx, NULL, user_data, res, cflags);
++ return __io_fill_cqe(ctx, user_data, res, cflags);
++}
++
++static void __io_req_complete_post(struct io_kiocb *req, s32 res,
++ u32 cflags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ if (!(req->flags & REQ_F_CQE_SKIP))
++ __io_fill_cqe_req(req, res, cflags);
++ /*
++ * If we're the last reference to this request, add to our locked
++ * free_list cache.
++ */
++ if (req_ref_put_and_test(req)) {
++ if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
++ if (req->flags & IO_DISARM_MASK)
++ io_disarm_next(req);
++ if (req->link) {
++ io_req_task_queue(req->link);
++ req->link = NULL;
++ }
++ }
++ io_req_put_rsrc(req, ctx);
++ /*
++ * Selected buffer deallocation in io_clean_op() assumes that
++ * we don't hold ->completion_lock. Clean them here to avoid
++ * deadlocks.
++ */
++ io_put_kbuf_comp(req);
++ io_dismantle_req(req);
++ io_put_task(req->task, 1);
++ wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
++ ctx->locked_free_nr++;
++ }
++}
++
++static void io_req_complete_post(struct io_kiocb *req, s32 res,
++ u32 cflags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ spin_lock(&ctx->completion_lock);
++ __io_req_complete_post(req, res, cflags);
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ io_cqring_ev_posted(ctx);
++}
++
++static inline void io_req_complete_state(struct io_kiocb *req, s32 res,
++ u32 cflags)
++{
++ req->result = res;
++ req->cflags = cflags;
++ req->flags |= REQ_F_COMPLETE_INLINE;
++}
++
++static inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags,
++ s32 res, u32 cflags)
++{
++ if (issue_flags & IO_URING_F_COMPLETE_DEFER)
++ io_req_complete_state(req, res, cflags);
++ else
++ io_req_complete_post(req, res, cflags);
++}
++
++static inline void io_req_complete(struct io_kiocb *req, s32 res)
++{
++ __io_req_complete(req, 0, res, 0);
++}
++
++static void io_req_complete_failed(struct io_kiocb *req, s32 res)
++{
++ req_set_fail(req);
++ io_req_complete_post(req, res, io_put_kbuf(req, IO_URING_F_UNLOCKED));
++}
++
++static void io_req_complete_fail_submit(struct io_kiocb *req)
++{
++ /*
++ * We don't submit, fail them all, for that replace hardlinks with
++ * normal links. Extra REQ_F_LINK is tolerated.
++ */
++ req->flags &= ~REQ_F_HARDLINK;
++ req->flags |= REQ_F_LINK;
++ io_req_complete_failed(req, req->result);
++}
++
++/*
++ * Don't initialise the fields below on every allocation, but do that in
++ * advance and keep them valid across allocations.
++ */
++static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
++{
++ req->ctx = ctx;
++ req->link = NULL;
++ req->async_data = NULL;
++ /* not necessary, but safer to zero */
++ req->result = 0;
++}
++
++static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
++ struct io_submit_state *state)
++{
++ spin_lock(&ctx->completion_lock);
++ wq_list_splice(&ctx->locked_free_list, &state->free_list);
++ ctx->locked_free_nr = 0;
++ spin_unlock(&ctx->completion_lock);
++}
++
++/* Returns true IFF there are requests in the cache */
++static bool io_flush_cached_reqs(struct io_ring_ctx *ctx)
++{
++ struct io_submit_state *state = &ctx->submit_state;
++
++ /*
++ * If we have more than a batch's worth of requests in our IRQ side
++ * locked cache, grab the lock and move them over to our submission
++ * side cache.
++ */
++ if (READ_ONCE(ctx->locked_free_nr) > IO_COMPL_BATCH)
++ io_flush_cached_locked_reqs(ctx, state);
++ return !!state->free_list.next;
++}
++
++/*
++ * A request might get retired back into the request caches even before opcode
++ * handlers and io_issue_sqe() are done with it, e.g. inline completion path.
++ * Because of that, io_alloc_req() should be called only under ->uring_lock
++ * and with extra caution to not get a request that is still worked on.
++ */
++static __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
++ __must_hold(&ctx->uring_lock)
++{
++ struct io_submit_state *state = &ctx->submit_state;
++ gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
++ void *reqs[IO_REQ_ALLOC_BATCH];
++ struct io_kiocb *req;
++ int ret, i;
++
++ if (likely(state->free_list.next || io_flush_cached_reqs(ctx)))
++ return true;
++
++ ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
++
++ /*
++ * Bulk alloc is all-or-nothing. If we fail to get a batch,
++ * retry single alloc to be on the safe side.
++ */
++ if (unlikely(ret <= 0)) {
++ reqs[0] = kmem_cache_alloc(req_cachep, gfp);
++ if (!reqs[0])
++ return false;
++ ret = 1;
++ }
++
++ percpu_ref_get_many(&ctx->refs, ret);
++ for (i = 0; i < ret; i++) {
++ req = reqs[i];
++
++ io_preinit_req(req, ctx);
++ wq_stack_add_head(&req->comp_list, &state->free_list);
++ }
++ return true;
++}
++
++static inline bool io_alloc_req_refill(struct io_ring_ctx *ctx)
++{
++ if (unlikely(!ctx->submit_state.free_list.next))
++ return __io_alloc_req_refill(ctx);
++ return true;
++}
++
++static inline struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
++{
++ struct io_wq_work_node *node;
++
++ node = wq_stack_extract(&ctx->submit_state.free_list);
++ return container_of(node, struct io_kiocb, comp_list);
++}
++
++static inline void io_put_file(struct file *file)
++{
++ if (file)
++ fput(file);
++}
++
++static inline void io_dismantle_req(struct io_kiocb *req)
++{
++ unsigned int flags = req->flags;
++
++ if (unlikely(flags & IO_REQ_CLEAN_FLAGS))
++ io_clean_op(req);
++ if (!(flags & REQ_F_FIXED_FILE))
++ io_put_file(req->file);
++}
++
++static __cold void __io_free_req(struct io_kiocb *req)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ io_req_put_rsrc(req, ctx);
++ io_dismantle_req(req);
++ io_put_task(req->task, 1);
++
++ spin_lock(&ctx->completion_lock);
++ wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
++ ctx->locked_free_nr++;
++ spin_unlock(&ctx->completion_lock);
++}
++
++static inline void io_remove_next_linked(struct io_kiocb *req)
++{
++ struct io_kiocb *nxt = req->link;
++
++ req->link = nxt->link;
++ nxt->link = NULL;
++}
++
++static bool io_kill_linked_timeout(struct io_kiocb *req)
++ __must_hold(&req->ctx->completion_lock)
++ __must_hold(&req->ctx->timeout_lock)
++{
++ struct io_kiocb *link = req->link;
++
++ if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
++ struct io_timeout_data *io = link->async_data;
++
++ io_remove_next_linked(req);
++ link->timeout.head = NULL;
++ if (hrtimer_try_to_cancel(&io->timer) != -1) {
++ list_del(&link->timeout.list);
++ /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
++ io_fill_cqe_req(link, -ECANCELED, 0);
++ io_put_req_deferred(link);
++ return true;
++ }
++ }
++ return false;
++}
++
++static void io_fail_links(struct io_kiocb *req)
++ __must_hold(&req->ctx->completion_lock)
++{
++ struct io_kiocb *nxt, *link = req->link;
++ bool ignore_cqes = req->flags & REQ_F_SKIP_LINK_CQES;
++
++ req->link = NULL;
++ while (link) {
++ long res = -ECANCELED;
++
++ if (link->flags & REQ_F_FAIL)
++ res = link->result;
++
++ nxt = link->link;
++ link->link = NULL;
++
++ trace_io_uring_fail_link(req->ctx, req, req->user_data,
++ req->opcode, link);
++
++ if (!ignore_cqes) {
++ link->flags &= ~REQ_F_CQE_SKIP;
++ io_fill_cqe_req(link, res, 0);
++ }
++ io_put_req_deferred(link);
++ link = nxt;
++ }
++}
++
++static bool io_disarm_next(struct io_kiocb *req)
++ __must_hold(&req->ctx->completion_lock)
++{
++ bool posted = false;
++
++ if (req->flags & REQ_F_ARM_LTIMEOUT) {
++ struct io_kiocb *link = req->link;
++
++ req->flags &= ~REQ_F_ARM_LTIMEOUT;
++ if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
++ io_remove_next_linked(req);
++ /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
++ io_fill_cqe_req(link, -ECANCELED, 0);
++ io_put_req_deferred(link);
++ posted = true;
++ }
++ } else if (req->flags & REQ_F_LINK_TIMEOUT) {
++ struct io_ring_ctx *ctx = req->ctx;
++
++ spin_lock_irq(&ctx->timeout_lock);
++ posted = io_kill_linked_timeout(req);
++ spin_unlock_irq(&ctx->timeout_lock);
++ }
++ if (unlikely((req->flags & REQ_F_FAIL) &&
++ !(req->flags & REQ_F_HARDLINK))) {
++ posted |= (req->link != NULL);
++ io_fail_links(req);
++ }
++ return posted;
++}
++
++static void __io_req_find_next_prep(struct io_kiocb *req)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ bool posted;
++
++ spin_lock(&ctx->completion_lock);
++ posted = io_disarm_next(req);
++ if (posted)
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ if (posted)
++ io_cqring_ev_posted(ctx);
++}
++
++static inline struct io_kiocb *io_req_find_next(struct io_kiocb *req)
++{
++ struct io_kiocb *nxt;
++
++ if (likely(!(req->flags & (REQ_F_LINK|REQ_F_HARDLINK))))
++ return NULL;
++ /*
++ * If LINK is set, we have dependent requests in this chain. If we
++ * didn't fail this request, queue the first one up, moving any other
++ * dependencies to the next request. In case of failure, fail the rest
++ * of the chain.
++ */
++ if (unlikely(req->flags & IO_DISARM_MASK))
++ __io_req_find_next_prep(req);
++ nxt = req->link;
++ req->link = NULL;
++ return nxt;
++}
++
++static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
++{
++ if (!ctx)
++ return;
++ if (*locked) {
++ io_submit_flush_completions(ctx);
++ mutex_unlock(&ctx->uring_lock);
++ *locked = false;
++ }
++ percpu_ref_put(&ctx->refs);
++}
++
++static inline void ctx_commit_and_unlock(struct io_ring_ctx *ctx)
++{
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ io_cqring_ev_posted(ctx);
++}
++
++static void handle_prev_tw_list(struct io_wq_work_node *node,
++ struct io_ring_ctx **ctx, bool *uring_locked)
++{
++ if (*ctx && !*uring_locked)
++ spin_lock(&(*ctx)->completion_lock);
++
++ do {
++ struct io_wq_work_node *next = node->next;
++ struct io_kiocb *req = container_of(node, struct io_kiocb,
++ io_task_work.node);
++
++ prefetch(container_of(next, struct io_kiocb, io_task_work.node));
++
++ if (req->ctx != *ctx) {
++ if (unlikely(!*uring_locked && *ctx))
++ ctx_commit_and_unlock(*ctx);
++
++ ctx_flush_and_put(*ctx, uring_locked);
++ *ctx = req->ctx;
++ /* if not contended, grab and improve batching */
++ *uring_locked = mutex_trylock(&(*ctx)->uring_lock);
++ percpu_ref_get(&(*ctx)->refs);
++ if (unlikely(!*uring_locked))
++ spin_lock(&(*ctx)->completion_lock);
++ }
++ if (likely(*uring_locked))
++ req->io_task_work.func(req, uring_locked);
++ else
++ __io_req_complete_post(req, req->result,
++ io_put_kbuf_comp(req));
++ node = next;
++ } while (node);
++
++ if (unlikely(!*uring_locked))
++ ctx_commit_and_unlock(*ctx);
++}
++
++static void handle_tw_list(struct io_wq_work_node *node,
++ struct io_ring_ctx **ctx, bool *locked)
++{
++ do {
++ struct io_wq_work_node *next = node->next;
++ struct io_kiocb *req = container_of(node, struct io_kiocb,
++ io_task_work.node);
++
++ prefetch(container_of(next, struct io_kiocb, io_task_work.node));
++
++ if (req->ctx != *ctx) {
++ ctx_flush_and_put(*ctx, locked);
++ *ctx = req->ctx;
++ /* if not contended, grab and improve batching */
++ *locked = mutex_trylock(&(*ctx)->uring_lock);
++ percpu_ref_get(&(*ctx)->refs);
++ }
++ req->io_task_work.func(req, locked);
++ node = next;
++ } while (node);
++}
++
++static void tctx_task_work(struct callback_head *cb)
++{
++ bool uring_locked = false;
++ struct io_ring_ctx *ctx = NULL;
++ struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
++ task_work);
++
++ while (1) {
++ struct io_wq_work_node *node1, *node2;
++
++ if (!tctx->task_list.first &&
++ !tctx->prior_task_list.first && uring_locked)
++ io_submit_flush_completions(ctx);
++
++ spin_lock_irq(&tctx->task_lock);
++ node1 = tctx->prior_task_list.first;
++ node2 = tctx->task_list.first;
++ INIT_WQ_LIST(&tctx->task_list);
++ INIT_WQ_LIST(&tctx->prior_task_list);
++ if (!node2 && !node1)
++ tctx->task_running = false;
++ spin_unlock_irq(&tctx->task_lock);
++ if (!node2 && !node1)
++ break;
++
++ if (node1)
++ handle_prev_tw_list(node1, &ctx, &uring_locked);
++
++ if (node2)
++ handle_tw_list(node2, &ctx, &uring_locked);
++ cond_resched();
++ }
++
++ ctx_flush_and_put(ctx, &uring_locked);
++
++ /* relaxed read is enough as only the task itself sets ->in_idle */
++ if (unlikely(atomic_read(&tctx->in_idle)))
++ io_uring_drop_tctx_refs(current);
++}
++
++static void io_req_task_work_add(struct io_kiocb *req, bool priority)
++{
++ struct task_struct *tsk = req->task;
++ struct io_uring_task *tctx = tsk->io_uring;
++ enum task_work_notify_mode notify;
++ struct io_wq_work_node *node;
++ unsigned long flags;
++ bool running;
++
++ WARN_ON_ONCE(!tctx);
++
++ spin_lock_irqsave(&tctx->task_lock, flags);
++ if (priority)
++ wq_list_add_tail(&req->io_task_work.node, &tctx->prior_task_list);
++ else
++ wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
++ running = tctx->task_running;
++ if (!running)
++ tctx->task_running = true;
++ spin_unlock_irqrestore(&tctx->task_lock, flags);
++
++ /* task_work already pending, we're done */
++ if (running)
++ return;
++
++ /*
++ * SQPOLL kernel thread doesn't need notification, just a wakeup. For
++ * all other cases, use TWA_SIGNAL unconditionally to ensure we're
++ * processing task_work. There's no reliable way to tell if TWA_RESUME
++ * will do the job.
++ */
++ notify = (req->ctx->flags & IORING_SETUP_SQPOLL) ? TWA_NONE : TWA_SIGNAL;
++ if (likely(!task_work_add(tsk, &tctx->task_work, notify))) {
++ if (notify == TWA_NONE)
++ wake_up_process(tsk);
++ return;
++ }
++
++ spin_lock_irqsave(&tctx->task_lock, flags);
++ tctx->task_running = false;
++ node = wq_list_merge(&tctx->prior_task_list, &tctx->task_list);
++ spin_unlock_irqrestore(&tctx->task_lock, flags);
++
++ while (node) {
++ req = container_of(node, struct io_kiocb, io_task_work.node);
++ node = node->next;
++ if (llist_add(&req->io_task_work.fallback_node,
++ &req->ctx->fallback_llist))
++ schedule_delayed_work(&req->ctx->fallback_work, 1);
++ }
++}
++
++static void io_req_task_cancel(struct io_kiocb *req, bool *locked)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ /* not needed for normal modes, but SQPOLL depends on it */
++ io_tw_lock(ctx, locked);
++ io_req_complete_failed(req, req->result);
++}
++
++static void io_req_task_submit(struct io_kiocb *req, bool *locked)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ io_tw_lock(ctx, locked);
++ /* req->task == current here, checking PF_EXITING is safe */
++ if (likely(!(req->task->flags & PF_EXITING)))
++ __io_queue_sqe(req);
++ else
++ io_req_complete_failed(req, -EFAULT);
++}
++
++static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
++{
++ req->result = ret;
++ req->io_task_work.func = io_req_task_cancel;
++ io_req_task_work_add(req, false);
++}
++
++static void io_req_task_queue(struct io_kiocb *req)
++{
++ req->io_task_work.func = io_req_task_submit;
++ io_req_task_work_add(req, false);
++}
++
++static void io_req_task_queue_reissue(struct io_kiocb *req)
++{
++ req->io_task_work.func = io_queue_async_work;
++ io_req_task_work_add(req, false);
++}
++
++static inline void io_queue_next(struct io_kiocb *req)
++{
++ struct io_kiocb *nxt = io_req_find_next(req);
++
++ if (nxt)
++ io_req_task_queue(nxt);
++}
++
++static void io_free_req(struct io_kiocb *req)
++{
++ io_queue_next(req);
++ __io_free_req(req);
++}
++
++static void io_free_req_work(struct io_kiocb *req, bool *locked)
++{
++ io_free_req(req);
++}
++
++static void io_free_batch_list(struct io_ring_ctx *ctx,
++ struct io_wq_work_node *node)
++ __must_hold(&ctx->uring_lock)
++{
++ struct task_struct *task = NULL;
++ int task_refs = 0;
++
++ do {
++ struct io_kiocb *req = container_of(node, struct io_kiocb,
++ comp_list);
++
++ if (unlikely(req->flags & REQ_F_REFCOUNT)) {
++ node = req->comp_list.next;
++ if (!req_ref_put_and_test(req))
++ continue;
++ }
++
++ io_req_put_rsrc_locked(req, ctx);
++ io_queue_next(req);
++ io_dismantle_req(req);
++
++ if (req->task != task) {
++ if (task)
++ io_put_task(task, task_refs);
++ task = req->task;
++ task_refs = 0;
++ }
++ task_refs++;
++ node = req->comp_list.next;
++ wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
++ } while (node);
++
++ if (task)
++ io_put_task(task, task_refs);
++}
++
++static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
++ __must_hold(&ctx->uring_lock)
++{
++ struct io_wq_work_node *node, *prev;
++ struct io_submit_state *state = &ctx->submit_state;
++
++ if (state->flush_cqes) {
++ spin_lock(&ctx->completion_lock);
++ wq_list_for_each(node, prev, &state->compl_reqs) {
++ struct io_kiocb *req = container_of(node, struct io_kiocb,
++ comp_list);
++
++ if (!(req->flags & REQ_F_CQE_SKIP))
++ __io_fill_cqe_req(req, req->result, req->cflags);
++ if ((req->flags & REQ_F_POLLED) && req->apoll) {
++ struct async_poll *apoll = req->apoll;
++
++ if (apoll->double_poll)
++ kfree(apoll->double_poll);
++ list_add(&apoll->poll.wait.entry,
++ &ctx->apoll_cache);
++ req->flags &= ~REQ_F_POLLED;
++ }
++ }
++
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ io_cqring_ev_posted(ctx);
++ state->flush_cqes = false;
++ }
++
++ io_free_batch_list(ctx, state->compl_reqs.first);
++ INIT_WQ_LIST(&state->compl_reqs);
++}
++
++/*
++ * Drop reference to request, return next in chain (if there is one) if this
++ * was the last reference to this request.
++ */
++static inline struct io_kiocb *io_put_req_find_next(struct io_kiocb *req)
++{
++ struct io_kiocb *nxt = NULL;
++
++ if (req_ref_put_and_test(req)) {
++ nxt = io_req_find_next(req);
++ __io_free_req(req);
++ }
++ return nxt;
++}
++
++static inline void io_put_req(struct io_kiocb *req)
++{
++ if (req_ref_put_and_test(req))
++ io_free_req(req);
++}
++
++static inline void io_put_req_deferred(struct io_kiocb *req)
++{
++ if (req_ref_put_and_test(req)) {
++ req->io_task_work.func = io_free_req_work;
++ io_req_task_work_add(req, false);
++ }
++}
++
++static unsigned io_cqring_events(struct io_ring_ctx *ctx)
++{
++ /* See comment at the top of this file */
++ smp_rmb();
++ return __io_cqring_events(ctx);
++}
++
++static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
++{
++ struct io_rings *rings = ctx->rings;
++
++ /* make sure SQ entry isn't read before tail */
++ return smp_load_acquire(&rings->sq.tail) - ctx->cached_sq_head;
++}
++
++static inline bool io_run_task_work(void)
++{
++ if (test_thread_flag(TIF_NOTIFY_SIGNAL) || task_work_pending(current)) {
++ __set_current_state(TASK_RUNNING);
++ clear_notify_signal();
++ if (task_work_pending(current))
++ task_work_run();
++ return true;
++ }
++
++ return false;
++}
++
++static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
++{
++ struct io_wq_work_node *pos, *start, *prev;
++ unsigned int poll_flags = BLK_POLL_NOSLEEP;
++ DEFINE_IO_COMP_BATCH(iob);
++ int nr_events = 0;
++
++ /*
++ * Only spin for completions if we don't have multiple devices hanging
++ * off our complete list.
++ */
++ if (ctx->poll_multi_queue || force_nonspin)
++ poll_flags |= BLK_POLL_ONESHOT;
++
++ wq_list_for_each(pos, start, &ctx->iopoll_list) {
++ struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
++ struct kiocb *kiocb = &req->rw.kiocb;
++ int ret;
++
++ /*
++ * Move completed and retryable entries to our local lists.
++ * If we find a request that requires polling, break out
++ * and complete those lists first, if we have entries there.
++ */
++ if (READ_ONCE(req->iopoll_completed))
++ break;
++
++ ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob, poll_flags);
++ if (unlikely(ret < 0))
++ return ret;
++ else if (ret)
++ poll_flags |= BLK_POLL_ONESHOT;
++
++ /* iopoll may have completed current req */
++ if (!rq_list_empty(iob.req_list) ||
++ READ_ONCE(req->iopoll_completed))
++ break;
++ }
++
++ if (!rq_list_empty(iob.req_list))
++ iob.complete(&iob);
++ else if (!pos)
++ return 0;
++
++ prev = start;
++ wq_list_for_each_resume(pos, prev) {
++ struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
++
++ /* order with io_complete_rw_iopoll(), e.g. ->result updates */
++ if (!smp_load_acquire(&req->iopoll_completed))
++ break;
++ nr_events++;
++ if (unlikely(req->flags & REQ_F_CQE_SKIP))
++ continue;
++ __io_fill_cqe_req(req, req->result, io_put_kbuf(req, 0));
++ }
++
++ if (unlikely(!nr_events))
++ return 0;
++
++ io_commit_cqring(ctx);
++ io_cqring_ev_posted_iopoll(ctx);
++ pos = start ? start->next : ctx->iopoll_list.first;
++ wq_list_cut(&ctx->iopoll_list, prev, start);
++ io_free_batch_list(ctx, pos);
++ return nr_events;
++}
++
++/*
++ * We can't just wait for polled events to come to us, we have to actively
++ * find and complete them.
++ */
++static __cold void io_iopoll_try_reap_events(struct io_ring_ctx *ctx)
++{
++ if (!(ctx->flags & IORING_SETUP_IOPOLL))
++ return;
++
++ mutex_lock(&ctx->uring_lock);
++ while (!wq_list_empty(&ctx->iopoll_list)) {
++ /* let it sleep and repeat later if can't complete a request */
++ if (io_do_iopoll(ctx, true) == 0)
++ break;
++ /*
++ * Ensure we allow local-to-the-cpu processing to take place,
++ * in this case we need to ensure that we reap all events.
++ * Also let task_work, etc. to progress by releasing the mutex
++ */
++ if (need_resched()) {
++ mutex_unlock(&ctx->uring_lock);
++ cond_resched();
++ mutex_lock(&ctx->uring_lock);
++ }
++ }
++ mutex_unlock(&ctx->uring_lock);
++}
++
++static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
++{
++ unsigned int nr_events = 0;
++ int ret = 0;
++
++ /*
++ * We disallow the app entering submit/complete with polling, but we
++ * still need to lock the ring to prevent racing with polled issue
++ * that got punted to a workqueue.
++ */
++ mutex_lock(&ctx->uring_lock);
++ /*
++ * Don't enter poll loop if we already have events pending.
++ * If we do, we can potentially be spinning for commands that
++ * already triggered a CQE (eg in error).
++ */
++ if (test_bit(0, &ctx->check_cq_overflow))
++ __io_cqring_overflow_flush(ctx, false);
++ if (io_cqring_events(ctx))
++ goto out;
++ do {
++ /*
++ * If a submit got punted to a workqueue, we can have the
++ * application entering polling for a command before it gets
++ * issued. That app will hold the uring_lock for the duration
++ * of the poll right here, so we need to take a breather every
++ * now and then to ensure that the issue has a chance to add
++ * the poll to the issued list. Otherwise we can spin here
++ * forever, while the workqueue is stuck trying to acquire the
++ * very same mutex.
++ */
++ if (wq_list_empty(&ctx->iopoll_list)) {
++ u32 tail = ctx->cached_cq_tail;
++
++ mutex_unlock(&ctx->uring_lock);
++ io_run_task_work();
++ mutex_lock(&ctx->uring_lock);
++
++ /* some requests don't go through iopoll_list */
++ if (tail != ctx->cached_cq_tail ||
++ wq_list_empty(&ctx->iopoll_list))
++ break;
++ }
++ ret = io_do_iopoll(ctx, !min);
++ if (ret < 0)
++ break;
++ nr_events += ret;
++ ret = 0;
++ } while (nr_events < min && !need_resched());
++out:
++ mutex_unlock(&ctx->uring_lock);
++ return ret;
++}
++
++static void kiocb_end_write(struct io_kiocb *req)
++{
++ /*
++ * Tell lockdep we inherited freeze protection from submission
++ * thread.
++ */
++ if (req->flags & REQ_F_ISREG) {
++ struct super_block *sb = file_inode(req->file)->i_sb;
++
++ __sb_writers_acquired(sb, SB_FREEZE_WRITE);
++ sb_end_write(sb);
++ }
++}
++
++#ifdef CONFIG_BLOCK
++static bool io_resubmit_prep(struct io_kiocb *req)
++{
++ struct io_async_rw *rw = req->async_data;
++
++ if (!req_has_async_data(req))
++ return !io_req_prep_async(req);
++ iov_iter_restore(&rw->s.iter, &rw->s.iter_state);
++ return true;
++}
++
++static bool io_rw_should_reissue(struct io_kiocb *req)
++{
++ umode_t mode = file_inode(req->file)->i_mode;
++ struct io_ring_ctx *ctx = req->ctx;
++
++ if (!S_ISBLK(mode) && !S_ISREG(mode))
++ return false;
++ if ((req->flags & REQ_F_NOWAIT) || (io_wq_current_is_worker() &&
++ !(ctx->flags & IORING_SETUP_IOPOLL)))
++ return false;
++ /*
++ * If ref is dying, we might be running poll reap from the exit work.
++ * Don't attempt to reissue from that path, just let it fail with
++ * -EAGAIN.
++ */
++ if (percpu_ref_is_dying(&ctx->refs))
++ return false;
++ /*
++ * Play it safe and assume not safe to re-import and reissue if we're
++ * not in the original thread group (or in task context).
++ */
++ if (!same_thread_group(req->task, current) || !in_task())
++ return false;
++ return true;
++}
++#else
++static bool io_resubmit_prep(struct io_kiocb *req)
++{
++ return false;
++}
++static bool io_rw_should_reissue(struct io_kiocb *req)
++{
++ return false;
++}
++#endif
++
++static bool __io_complete_rw_common(struct io_kiocb *req, long res)
++{
++ if (req->rw.kiocb.ki_flags & IOCB_WRITE) {
++ kiocb_end_write(req);
++ fsnotify_modify(req->file);
++ } else {
++ fsnotify_access(req->file);
++ }
++ if (unlikely(res != req->result)) {
++ if ((res == -EAGAIN || res == -EOPNOTSUPP) &&
++ io_rw_should_reissue(req)) {
++ req->flags |= REQ_F_REISSUE;
++ return true;
++ }
++ req_set_fail(req);
++ req->result = res;
++ }
++ return false;
++}
++
++static inline void io_req_task_complete(struct io_kiocb *req, bool *locked)
++{
++ int res = req->result;
++
++ if (*locked) {
++ io_req_complete_state(req, res, io_put_kbuf(req, 0));
++ io_req_add_compl_list(req);
++ } else {
++ io_req_complete_post(req, res,
++ io_put_kbuf(req, IO_URING_F_UNLOCKED));
++ }
++}
++
++static void __io_complete_rw(struct io_kiocb *req, long res,
++ unsigned int issue_flags)
++{
++ if (__io_complete_rw_common(req, res))
++ return;
++ __io_req_complete(req, issue_flags, req->result,
++ io_put_kbuf(req, issue_flags));
++}
++
++static void io_complete_rw(struct kiocb *kiocb, long res)
++{
++ struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
++
++ if (__io_complete_rw_common(req, res))
++ return;
++ req->result = res;
++ req->io_task_work.func = io_req_task_complete;
++ io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
++}
++
++static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
++{
++ struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
++
++ if (kiocb->ki_flags & IOCB_WRITE)
++ kiocb_end_write(req);
++ if (unlikely(res != req->result)) {
++ if (res == -EAGAIN && io_rw_should_reissue(req)) {
++ req->flags |= REQ_F_REISSUE;
++ return;
++ }
++ req->result = res;
++ }
++
++ /* order with io_iopoll_complete() checking ->iopoll_completed */
++ smp_store_release(&req->iopoll_completed, 1);
++}
++
++/*
++ * After the iocb has been issued, it's safe to be found on the poll list.
++ * Adding the kiocb to the list AFTER submission ensures that we don't
++ * find it from a io_do_iopoll() thread before the issuer is done
++ * accessing the kiocb cookie.
++ */
++static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ const bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++
++ /* workqueue context doesn't hold uring_lock, grab it now */
++ if (unlikely(needs_lock))
++ mutex_lock(&ctx->uring_lock);
++
++ /*
++ * Track whether we have multiple files in our lists. This will impact
++ * how we do polling eventually, not spinning if we're on potentially
++ * different devices.
++ */
++ if (wq_list_empty(&ctx->iopoll_list)) {
++ ctx->poll_multi_queue = false;
++ } else if (!ctx->poll_multi_queue) {
++ struct io_kiocb *list_req;
++
++ list_req = container_of(ctx->iopoll_list.first, struct io_kiocb,
++ comp_list);
++ if (list_req->file != req->file)
++ ctx->poll_multi_queue = true;
++ }
++
++ /*
++ * For fast devices, IO may have already completed. If it has, add
++ * it to the front so we find it first.
++ */
++ if (READ_ONCE(req->iopoll_completed))
++ wq_list_add_head(&req->comp_list, &ctx->iopoll_list);
++ else
++ wq_list_add_tail(&req->comp_list, &ctx->iopoll_list);
++
++ if (unlikely(needs_lock)) {
++ /*
++ * If IORING_SETUP_SQPOLL is enabled, sqes are either handle
++ * in sq thread task context or in io worker task context. If
++ * current task context is sq thread, we don't need to check
++ * whether should wake up sq thread.
++ */
++ if ((ctx->flags & IORING_SETUP_SQPOLL) &&
++ wq_has_sleeper(&ctx->sq_data->wait))
++ wake_up(&ctx->sq_data->wait);
++
++ mutex_unlock(&ctx->uring_lock);
++ }
++}
++
++static bool io_bdev_nowait(struct block_device *bdev)
++{
++ return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
++}
++
++/*
++ * If we tracked the file through the SCM inflight mechanism, we could support
++ * any file. For now, just ensure that anything potentially problematic is done
++ * inline.
++ */
++static bool __io_file_supports_nowait(struct file *file, umode_t mode)
++{
++ if (S_ISBLK(mode)) {
++ if (IS_ENABLED(CONFIG_BLOCK) &&
++ io_bdev_nowait(I_BDEV(file->f_mapping->host)))
++ return true;
++ return false;
++ }
++ if (S_ISSOCK(mode))
++ return true;
++ if (S_ISREG(mode)) {
++ if (IS_ENABLED(CONFIG_BLOCK) &&
++ io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
++ file->f_op != &io_uring_fops)
++ return true;
++ return false;
++ }
++
++ /* any ->read/write should understand O_NONBLOCK */
++ if (file->f_flags & O_NONBLOCK)
++ return true;
++ return file->f_mode & FMODE_NOWAIT;
++}
++
++/*
++ * If we tracked the file through the SCM inflight mechanism, we could support
++ * any file. For now, just ensure that anything potentially problematic is done
++ * inline.
++ */
++static unsigned int io_file_get_flags(struct file *file)
++{
++ umode_t mode = file_inode(file)->i_mode;
++ unsigned int res = 0;
++
++ if (S_ISREG(mode))
++ res |= FFS_ISREG;
++ if (__io_file_supports_nowait(file, mode))
++ res |= FFS_NOWAIT;
++ return res;
++}
++
++static inline bool io_file_supports_nowait(struct io_kiocb *req)
++{
++ return req->flags & REQ_F_SUPPORT_NOWAIT;
++}
++
++static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct kiocb *kiocb = &req->rw.kiocb;
++ unsigned ioprio;
++ int ret;
++
++ kiocb->ki_pos = READ_ONCE(sqe->off);
++ /* used for fixed read/write too - just read unconditionally */
++ req->buf_index = READ_ONCE(sqe->buf_index);
++ req->imu = NULL;
++
++ if (req->opcode == IORING_OP_READ_FIXED ||
++ req->opcode == IORING_OP_WRITE_FIXED) {
++ struct io_ring_ctx *ctx = req->ctx;
++ u16 index;
++
++ if (unlikely(req->buf_index >= ctx->nr_user_bufs))
++ return -EFAULT;
++ index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
++ req->imu = ctx->user_bufs[index];
++ io_req_set_rsrc_node(req, ctx, 0);
++ }
++
++ ioprio = READ_ONCE(sqe->ioprio);
++ if (ioprio) {
++ ret = ioprio_check_cap(ioprio);
++ if (ret)
++ return ret;
++
++ kiocb->ki_ioprio = ioprio;
++ } else {
++ kiocb->ki_ioprio = get_current_ioprio();
++ }
++
++ req->rw.addr = READ_ONCE(sqe->addr);
++ req->rw.len = READ_ONCE(sqe->len);
++ req->rw.flags = READ_ONCE(sqe->rw_flags);
++ return 0;
++}
++
++static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
++{
++ switch (ret) {
++ case -EIOCBQUEUED:
++ break;
++ case -ERESTARTSYS:
++ case -ERESTARTNOINTR:
++ case -ERESTARTNOHAND:
++ case -ERESTART_RESTARTBLOCK:
++ /*
++ * We can't just restart the syscall, since previously
++ * submitted sqes may already be in progress. Just fail this
++ * IO with EINTR.
++ */
++ ret = -EINTR;
++ fallthrough;
++ default:
++ kiocb->ki_complete(kiocb, ret);
++ }
++}
++
++static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
++{
++ struct kiocb *kiocb = &req->rw.kiocb;
++
++ if (kiocb->ki_pos != -1)
++ return &kiocb->ki_pos;
++
++ if (!(req->file->f_mode & FMODE_STREAM)) {
++ req->flags |= REQ_F_CUR_POS;
++ kiocb->ki_pos = req->file->f_pos;
++ return &kiocb->ki_pos;
++ }
++
++ kiocb->ki_pos = 0;
++ return NULL;
++}
++
++static void kiocb_done(struct io_kiocb *req, ssize_t ret,
++ unsigned int issue_flags)
++{
++ struct io_async_rw *io = req->async_data;
++
++ /* add previously done IO, if any */
++ if (req_has_async_data(req) && io->bytes_done > 0) {
++ if (ret < 0)
++ ret = io->bytes_done;
++ else
++ ret += io->bytes_done;
++ }
++
++ if (req->flags & REQ_F_CUR_POS)
++ req->file->f_pos = req->rw.kiocb.ki_pos;
++ if (ret >= 0 && (req->rw.kiocb.ki_complete == io_complete_rw))
++ __io_complete_rw(req, ret, issue_flags);
++ else
++ io_rw_done(&req->rw.kiocb, ret);
++
++ if (req->flags & REQ_F_REISSUE) {
++ req->flags &= ~REQ_F_REISSUE;
++ if (io_resubmit_prep(req))
++ io_req_task_queue_reissue(req);
++ else
++ io_req_task_queue_fail(req, ret);
++ }
++}
++
++static int __io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
++ struct io_mapped_ubuf *imu)
++{
++ size_t len = req->rw.len;
++ u64 buf_end, buf_addr = req->rw.addr;
++ size_t offset;
++
++ if (unlikely(check_add_overflow(buf_addr, (u64)len, &buf_end)))
++ return -EFAULT;
++ /* not inside the mapped region */
++ if (unlikely(buf_addr < imu->ubuf || buf_end > imu->ubuf_end))
++ return -EFAULT;
++
++ /*
++ * May not be a start of buffer, set size appropriately
++ * and advance us to the beginning.
++ */
++ offset = buf_addr - imu->ubuf;
++ iov_iter_bvec(iter, rw, imu->bvec, imu->nr_bvecs, offset + len);
++
++ if (offset) {
++ /*
++ * Don't use iov_iter_advance() here, as it's really slow for
++ * using the latter parts of a big fixed buffer - it iterates
++ * over each segment manually. We can cheat a bit here, because
++ * we know that:
++ *
++ * 1) it's a BVEC iter, we set it up
++ * 2) all bvecs are PAGE_SIZE in size, except potentially the
++ * first and last bvec
++ *
++ * So just find our index, and adjust the iterator afterwards.
++ * If the offset is within the first bvec (or the whole first
++ * bvec, just use iov_iter_advance(). This makes it easier
++ * since we can just skip the first segment, which may not
++ * be PAGE_SIZE aligned.
++ */
++ const struct bio_vec *bvec = imu->bvec;
++
++ if (offset <= bvec->bv_len) {
++ iov_iter_advance(iter, offset);
++ } else {
++ unsigned long seg_skip;
++
++ /* skip first vec */
++ offset -= bvec->bv_len;
++ seg_skip = 1 + (offset >> PAGE_SHIFT);
++
++ iter->bvec = bvec + seg_skip;
++ iter->nr_segs -= seg_skip;
++ iter->count -= bvec->bv_len + offset;
++ iter->iov_offset = offset & ~PAGE_MASK;
++ }
++ }
++
++ return 0;
++}
++
++static int io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
++ unsigned int issue_flags)
++{
++ if (WARN_ON_ONCE(!req->imu))
++ return -EFAULT;
++ return __io_import_fixed(req, rw, iter, req->imu);
++}
++
++static void io_ring_submit_unlock(struct io_ring_ctx *ctx, bool needs_lock)
++{
++ if (needs_lock)
++ mutex_unlock(&ctx->uring_lock);
++}
++
++static void io_ring_submit_lock(struct io_ring_ctx *ctx, bool needs_lock)
++{
++ /*
++ * "Normal" inline submissions always hold the uring_lock, since we
++ * grab it from the system call. Same is true for the SQPOLL offload.
++ * The only exception is when we've detached the request and issue it
++ * from an async worker thread, grab the lock for that case.
++ */
++ if (needs_lock)
++ mutex_lock(&ctx->uring_lock);
++}
++
++static void io_buffer_add_list(struct io_ring_ctx *ctx,
++ struct io_buffer_list *bl, unsigned int bgid)
++{
++ struct list_head *list;
++
++ list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
++ INIT_LIST_HEAD(&bl->buf_list);
++ bl->bgid = bgid;
++ list_add(&bl->list, list);
++}
++
++static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
++ int bgid, unsigned int issue_flags)
++{
++ struct io_buffer *kbuf = req->kbuf;
++ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_buffer_list *bl;
++
++ if (req->flags & REQ_F_BUFFER_SELECTED)
++ return kbuf;
++
++ io_ring_submit_lock(ctx, needs_lock);
++
++ lockdep_assert_held(&ctx->uring_lock);
++
++ bl = io_buffer_get_list(ctx, bgid);
++ if (bl && !list_empty(&bl->buf_list)) {
++ kbuf = list_first_entry(&bl->buf_list, struct io_buffer, list);
++ list_del(&kbuf->list);
++ if (*len > kbuf->len)
++ *len = kbuf->len;
++ req->flags |= REQ_F_BUFFER_SELECTED;
++ req->kbuf = kbuf;
++ } else {
++ kbuf = ERR_PTR(-ENOBUFS);
++ }
++
++ io_ring_submit_unlock(req->ctx, needs_lock);
++ return kbuf;
++}
++
++static void __user *io_rw_buffer_select(struct io_kiocb *req, size_t *len,
++ unsigned int issue_flags)
++{
++ struct io_buffer *kbuf;
++ u16 bgid;
++
++ bgid = req->buf_index;
++ kbuf = io_buffer_select(req, len, bgid, issue_flags);
++ if (IS_ERR(kbuf))
++ return kbuf;
++ return u64_to_user_ptr(kbuf->addr);
++}
++
++#ifdef CONFIG_COMPAT
++static ssize_t io_compat_import(struct io_kiocb *req, struct iovec *iov,
++ unsigned int issue_flags)
++{
++ struct compat_iovec __user *uiov;
++ compat_ssize_t clen;
++ void __user *buf;
++ ssize_t len;
++
++ uiov = u64_to_user_ptr(req->rw.addr);
++ if (!access_ok(uiov, sizeof(*uiov)))
++ return -EFAULT;
++ if (__get_user(clen, &uiov->iov_len))
++ return -EFAULT;
++ if (clen < 0)
++ return -EINVAL;
++
++ len = clen;
++ buf = io_rw_buffer_select(req, &len, issue_flags);
++ if (IS_ERR(buf))
++ return PTR_ERR(buf);
++ iov[0].iov_base = buf;
++ iov[0].iov_len = (compat_size_t) len;
++ return 0;
++}
++#endif
++
++static ssize_t __io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
++ unsigned int issue_flags)
++{
++ struct iovec __user *uiov = u64_to_user_ptr(req->rw.addr);
++ void __user *buf;
++ ssize_t len;
++
++ if (copy_from_user(iov, uiov, sizeof(*uiov)))
++ return -EFAULT;
++
++ len = iov[0].iov_len;
++ if (len < 0)
++ return -EINVAL;
++ buf = io_rw_buffer_select(req, &len, issue_flags);
++ if (IS_ERR(buf))
++ return PTR_ERR(buf);
++ iov[0].iov_base = buf;
++ iov[0].iov_len = len;
++ return 0;
++}
++
++static ssize_t io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
++ unsigned int issue_flags)
++{
++ if (req->flags & REQ_F_BUFFER_SELECTED) {
++ struct io_buffer *kbuf = req->kbuf;
++
++ iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
++ iov[0].iov_len = kbuf->len;
++ return 0;
++ }
++ if (req->rw.len != 1)
++ return -EINVAL;
++
++#ifdef CONFIG_COMPAT
++ if (req->ctx->compat)
++ return io_compat_import(req, iov, issue_flags);
++#endif
++
++ return __io_iov_buffer_select(req, iov, issue_flags);
++}
++
++static inline bool io_do_buffer_select(struct io_kiocb *req)
++{
++ if (!(req->flags & REQ_F_BUFFER_SELECT))
++ return false;
++ return !(req->flags & REQ_F_BUFFER_SELECTED);
++}
++
++static struct iovec *__io_import_iovec(int rw, struct io_kiocb *req,
++ struct io_rw_state *s,
++ unsigned int issue_flags)
++{
++ struct iov_iter *iter = &s->iter;
++ u8 opcode = req->opcode;
++ struct iovec *iovec;
++ void __user *buf;
++ size_t sqe_len;
++ ssize_t ret;
++
++ if (opcode == IORING_OP_READ_FIXED || opcode == IORING_OP_WRITE_FIXED) {
++ ret = io_import_fixed(req, rw, iter, issue_flags);
++ if (ret)
++ return ERR_PTR(ret);
++ return NULL;
++ }
++
++ /* buffer index only valid with fixed read/write, or buffer select */
++ if (unlikely(req->buf_index && !(req->flags & REQ_F_BUFFER_SELECT)))
++ return ERR_PTR(-EINVAL);
++
++ buf = u64_to_user_ptr(req->rw.addr);
++ sqe_len = req->rw.len;
++
++ if (opcode == IORING_OP_READ || opcode == IORING_OP_WRITE) {
++ if (req->flags & REQ_F_BUFFER_SELECT) {
++ buf = io_rw_buffer_select(req, &sqe_len, issue_flags);
++ if (IS_ERR(buf))
++ return ERR_CAST(buf);
++ req->rw.len = sqe_len;
++ }
++
++ ret = import_single_range(rw, buf, sqe_len, s->fast_iov, iter);
++ if (ret)
++ return ERR_PTR(ret);
++ return NULL;
++ }
++
++ iovec = s->fast_iov;
++ if (req->flags & REQ_F_BUFFER_SELECT) {
++ ret = io_iov_buffer_select(req, iovec, issue_flags);
++ if (ret)
++ return ERR_PTR(ret);
++ iov_iter_init(iter, rw, iovec, 1, iovec->iov_len);
++ return NULL;
++ }
++
++ ret = __import_iovec(rw, buf, sqe_len, UIO_FASTIOV, &iovec, iter,
++ req->ctx->compat);
++ if (unlikely(ret < 0))
++ return ERR_PTR(ret);
++ return iovec;
++}
++
++static inline int io_import_iovec(int rw, struct io_kiocb *req,
++ struct iovec **iovec, struct io_rw_state *s,
++ unsigned int issue_flags)
++{
++ *iovec = __io_import_iovec(rw, req, s, issue_flags);
++ if (unlikely(IS_ERR(*iovec)))
++ return PTR_ERR(*iovec);
++
++ iov_iter_save_state(&s->iter, &s->iter_state);
++ return 0;
++}
++
++static inline loff_t *io_kiocb_ppos(struct kiocb *kiocb)
++{
++ return (kiocb->ki_filp->f_mode & FMODE_STREAM) ? NULL : &kiocb->ki_pos;
++}
++
++/*
++ * For files that don't have ->read_iter() and ->write_iter(), handle them
++ * by looping over ->read() or ->write() manually.
++ */
++static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
++{
++ struct kiocb *kiocb = &req->rw.kiocb;
++ struct file *file = req->file;
++ ssize_t ret = 0;
++ loff_t *ppos;
++
++ /*
++ * Don't support polled IO through this interface, and we can't
++ * support non-blocking either. For the latter, this just causes
++ * the kiocb to be handled from an async context.
++ */
++ if (kiocb->ki_flags & IOCB_HIPRI)
++ return -EOPNOTSUPP;
++ if ((kiocb->ki_flags & IOCB_NOWAIT) &&
++ !(kiocb->ki_filp->f_flags & O_NONBLOCK))
++ return -EAGAIN;
++
++ ppos = io_kiocb_ppos(kiocb);
++
++ while (iov_iter_count(iter)) {
++ struct iovec iovec;
++ ssize_t nr;
++
++ if (!iov_iter_is_bvec(iter)) {
++ iovec = iov_iter_iovec(iter);
++ } else {
++ iovec.iov_base = u64_to_user_ptr(req->rw.addr);
++ iovec.iov_len = req->rw.len;
++ }
++
++ if (rw == READ) {
++ nr = file->f_op->read(file, iovec.iov_base,
++ iovec.iov_len, ppos);
++ } else {
++ nr = file->f_op->write(file, iovec.iov_base,
++ iovec.iov_len, ppos);
++ }
++
++ if (nr < 0) {
++ if (!ret)
++ ret = nr;
++ break;
++ }
++ ret += nr;
++ if (!iov_iter_is_bvec(iter)) {
++ iov_iter_advance(iter, nr);
++ } else {
++ req->rw.addr += nr;
++ req->rw.len -= nr;
++ if (!req->rw.len)
++ break;
++ }
++ if (nr != iovec.iov_len)
++ break;
++ }
++
++ return ret;
++}
++
++static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
++ const struct iovec *fast_iov, struct iov_iter *iter)
++{
++ struct io_async_rw *rw = req->async_data;
++
++ memcpy(&rw->s.iter, iter, sizeof(*iter));
++ rw->free_iovec = iovec;
++ rw->bytes_done = 0;
++ /* can only be fixed buffers, no need to do anything */
++ if (iov_iter_is_bvec(iter))
++ return;
++ if (!iovec) {
++ unsigned iov_off = 0;
++
++ rw->s.iter.iov = rw->s.fast_iov;
++ if (iter->iov != fast_iov) {
++ iov_off = iter->iov - fast_iov;
++ rw->s.iter.iov += iov_off;
++ }
++ if (rw->s.fast_iov != fast_iov)
++ memcpy(rw->s.fast_iov + iov_off, fast_iov + iov_off,
++ sizeof(struct iovec) * iter->nr_segs);
++ } else {
++ req->flags |= REQ_F_NEED_CLEANUP;
++ }
++}
++
++static inline bool io_alloc_async_data(struct io_kiocb *req)
++{
++ WARN_ON_ONCE(!io_op_defs[req->opcode].async_size);
++ req->async_data = kmalloc(io_op_defs[req->opcode].async_size, GFP_KERNEL);
++ if (req->async_data) {
++ req->flags |= REQ_F_ASYNC_DATA;
++ return false;
++ }
++ return true;
++}
++
++static int io_setup_async_rw(struct io_kiocb *req, const struct iovec *iovec,
++ struct io_rw_state *s, bool force)
++{
++ if (!force && !io_op_defs[req->opcode].needs_async_setup)
++ return 0;
++ if (!req_has_async_data(req)) {
++ struct io_async_rw *iorw;
++
++ if (io_alloc_async_data(req)) {
++ kfree(iovec);
++ return -ENOMEM;
++ }
++
++ io_req_map_rw(req, iovec, s->fast_iov, &s->iter);
++ iorw = req->async_data;
++ /* we've copied and mapped the iter, ensure state is saved */
++ iov_iter_save_state(&iorw->s.iter, &iorw->s.iter_state);
++ }
++ return 0;
++}
++
++static inline int io_rw_prep_async(struct io_kiocb *req, int rw)
++{
++ struct io_async_rw *iorw = req->async_data;
++ struct iovec *iov;
++ int ret;
++
++ /* submission path, ->uring_lock should already be taken */
++ ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
++ if (unlikely(ret < 0))
++ return ret;
++
++ iorw->bytes_done = 0;
++ iorw->free_iovec = iov;
++ if (iov)
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++/*
++ * This is our waitqueue callback handler, registered through __folio_lock_async()
++ * when we initially tried to do the IO with the iocb armed our waitqueue.
++ * This gets called when the page is unlocked, and we generally expect that to
++ * happen when the page IO is completed and the page is now uptodate. This will
++ * queue a task_work based retry of the operation, attempting to copy the data
++ * again. If the latter fails because the page was NOT uptodate, then we will
++ * do a thread based blocking retry of the operation. That's the unexpected
++ * slow path.
++ */
++static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
++ int sync, void *arg)
++{
++ struct wait_page_queue *wpq;
++ struct io_kiocb *req = wait->private;
++ struct wait_page_key *key = arg;
++
++ wpq = container_of(wait, struct wait_page_queue, wait);
++
++ if (!wake_page_match(wpq, key))
++ return 0;
++
++ req->rw.kiocb.ki_flags &= ~IOCB_WAITQ;
++ list_del_init(&wait->entry);
++ io_req_task_queue(req);
++ return 1;
++}
++
++/*
++ * This controls whether a given IO request should be armed for async page
++ * based retry. If we return false here, the request is handed to the async
++ * worker threads for retry. If we're doing buffered reads on a regular file,
++ * we prepare a private wait_page_queue entry and retry the operation. This
++ * will either succeed because the page is now uptodate and unlocked, or it
++ * will register a callback when the page is unlocked at IO completion. Through
++ * that callback, io_uring uses task_work to setup a retry of the operation.
++ * That retry will attempt the buffered read again. The retry will generally
++ * succeed, or in rare cases where it fails, we then fall back to using the
++ * async worker threads for a blocking retry.
++ */
++static bool io_rw_should_retry(struct io_kiocb *req)
++{
++ struct io_async_rw *rw = req->async_data;
++ struct wait_page_queue *wait = &rw->wpq;
++ struct kiocb *kiocb = &req->rw.kiocb;
++
++ /* never retry for NOWAIT, we just complete with -EAGAIN */
++ if (req->flags & REQ_F_NOWAIT)
++ return false;
++
++ /* Only for buffered IO */
++ if (kiocb->ki_flags & (IOCB_DIRECT | IOCB_HIPRI))
++ return false;
++
++ /*
++ * just use poll if we can, and don't attempt if the fs doesn't
++ * support callback based unlocks
++ */
++ if (file_can_poll(req->file) || !(req->file->f_mode & FMODE_BUF_RASYNC))
++ return false;
++
++ wait->wait.func = io_async_buf_func;
++ wait->wait.private = req;
++ wait->wait.flags = 0;
++ INIT_LIST_HEAD(&wait->wait.entry);
++ kiocb->ki_flags |= IOCB_WAITQ;
++ kiocb->ki_flags &= ~IOCB_NOWAIT;
++ kiocb->ki_waitq = wait;
++ return true;
++}
++
++static inline int io_iter_do_read(struct io_kiocb *req, struct iov_iter *iter)
++{
++ if (likely(req->file->f_op->read_iter))
++ return call_read_iter(req->file, &req->rw.kiocb, iter);
++ else if (req->file->f_op->read)
++ return loop_rw_iter(READ, req, iter);
++ else
++ return -EINVAL;
++}
++
++static bool need_read_all(struct io_kiocb *req)
++{
++ return req->flags & REQ_F_ISREG ||
++ S_ISBLK(file_inode(req->file)->i_mode);
++}
++
++static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
++{
++ struct kiocb *kiocb = &req->rw.kiocb;
++ struct io_ring_ctx *ctx = req->ctx;
++ struct file *file = req->file;
++ int ret;
++
++ if (unlikely(!file || !(file->f_mode & mode)))
++ return -EBADF;
++
++ if (!io_req_ffs_set(req))
++ req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
++
++ kiocb->ki_flags = iocb_flags(file);
++ ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
++ if (unlikely(ret))
++ return ret;
++
++ /*
++ * If the file is marked O_NONBLOCK, still allow retry for it if it
++ * supports async. Otherwise it's impossible to use O_NONBLOCK files
++ * reliably. If not, or it IOCB_NOWAIT is set, don't retry.
++ */
++ if ((kiocb->ki_flags & IOCB_NOWAIT) ||
++ ((file->f_flags & O_NONBLOCK) && !io_file_supports_nowait(req)))
++ req->flags |= REQ_F_NOWAIT;
++
++ if (ctx->flags & IORING_SETUP_IOPOLL) {
++ if (!(kiocb->ki_flags & IOCB_DIRECT) || !file->f_op->iopoll)
++ return -EOPNOTSUPP;
++
++ kiocb->private = NULL;
++ kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;
++ kiocb->ki_complete = io_complete_rw_iopoll;
++ req->iopoll_completed = 0;
++ } else {
++ if (kiocb->ki_flags & IOCB_HIPRI)
++ return -EINVAL;
++ kiocb->ki_complete = io_complete_rw;
++ }
++
++ return 0;
++}
++
++static int io_read(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_rw_state __s, *s = &__s;
++ struct iovec *iovec;
++ struct kiocb *kiocb = &req->rw.kiocb;
++ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
++ struct io_async_rw *rw;
++ ssize_t ret, ret2;
++ loff_t *ppos;
++
++ if (!req_has_async_data(req)) {
++ ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
++ if (unlikely(ret < 0))
++ return ret;
++ } else {
++ rw = req->async_data;
++ s = &rw->s;
++
++ /*
++ * Safe and required to re-import if we're using provided
++ * buffers, as we dropped the selected one before retry.
++ */
++ if (io_do_buffer_select(req)) {
++ ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
++ if (unlikely(ret < 0))
++ return ret;
++ }
++
++ /*
++ * We come here from an earlier attempt, restore our state to
++ * match in case it doesn't. It's cheap enough that we don't
++ * need to make this conditional.
++ */
++ iov_iter_restore(&s->iter, &s->iter_state);
++ iovec = NULL;
++ }
++ ret = io_rw_init_file(req, FMODE_READ);
++ if (unlikely(ret)) {
++ kfree(iovec);
++ return ret;
++ }
++ req->result = iov_iter_count(&s->iter);
++
++ if (force_nonblock) {
++ /* If the file doesn't support async, just async punt */
++ if (unlikely(!io_file_supports_nowait(req))) {
++ ret = io_setup_async_rw(req, iovec, s, true);
++ return ret ?: -EAGAIN;
++ }
++ kiocb->ki_flags |= IOCB_NOWAIT;
++ } else {
++ /* Ensure we clear previously set non-block flag */
++ kiocb->ki_flags &= ~IOCB_NOWAIT;
++ }
++
++ ppos = io_kiocb_update_pos(req);
++
++ ret = rw_verify_area(READ, req->file, ppos, req->result);
++ if (unlikely(ret)) {
++ kfree(iovec);
++ return ret;
++ }
++
++ ret = io_iter_do_read(req, &s->iter);
++
++ if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
++ req->flags &= ~REQ_F_REISSUE;
++ /* if we can poll, just do that */
++ if (req->opcode == IORING_OP_READ && file_can_poll(req->file))
++ return -EAGAIN;
++ /* IOPOLL retry should happen for io-wq threads */
++ if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
++ goto done;
++ /* no retry on NONBLOCK nor RWF_NOWAIT */
++ if (req->flags & REQ_F_NOWAIT)
++ goto done;
++ ret = 0;
++ } else if (ret == -EIOCBQUEUED) {
++ goto out_free;
++ } else if (ret == req->result || ret <= 0 || !force_nonblock ||
++ (req->flags & REQ_F_NOWAIT) || !need_read_all(req)) {
++ /* read all, failed, already did sync or don't want to retry */
++ goto done;
++ }
++
++ /*
++ * Don't depend on the iter state matching what was consumed, or being
++ * untouched in case of error. Restore it and we'll advance it
++ * manually if we need to.
++ */
++ iov_iter_restore(&s->iter, &s->iter_state);
++
++ ret2 = io_setup_async_rw(req, iovec, s, true);
++ if (ret2)
++ return ret2;
++
++ iovec = NULL;
++ rw = req->async_data;
++ s = &rw->s;
++ /*
++ * Now use our persistent iterator and state, if we aren't already.
++ * We've restored and mapped the iter to match.
++ */
++
++ do {
++ /*
++ * We end up here because of a partial read, either from
++ * above or inside this loop. Advance the iter by the bytes
++ * that were consumed.
++ */
++ iov_iter_advance(&s->iter, ret);
++ if (!iov_iter_count(&s->iter))
++ break;
++ rw->bytes_done += ret;
++ iov_iter_save_state(&s->iter, &s->iter_state);
++
++ /* if we can retry, do so with the callbacks armed */
++ if (!io_rw_should_retry(req)) {
++ kiocb->ki_flags &= ~IOCB_WAITQ;
++ return -EAGAIN;
++ }
++
++ /*
++ * Now retry read with the IOCB_WAITQ parts set in the iocb. If
++ * we get -EIOCBQUEUED, then we'll get a notification when the
++ * desired page gets unlocked. We can also get a partial read
++ * here, and if we do, then just retry at the new offset.
++ */
++ ret = io_iter_do_read(req, &s->iter);
++ if (ret == -EIOCBQUEUED)
++ return 0;
++ /* we got some bytes, but not all. retry. */
++ kiocb->ki_flags &= ~IOCB_WAITQ;
++ iov_iter_restore(&s->iter, &s->iter_state);
++ } while (ret > 0);
++done:
++ kiocb_done(req, ret, issue_flags);
++out_free:
++ /* it's faster to check here then delegate to kfree */
++ if (iovec)
++ kfree(iovec);
++ return 0;
++}
++
++static int io_write(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_rw_state __s, *s = &__s;
++ struct iovec *iovec;
++ struct kiocb *kiocb = &req->rw.kiocb;
++ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
++ ssize_t ret, ret2;
++ loff_t *ppos;
++
++ if (!req_has_async_data(req)) {
++ ret = io_import_iovec(WRITE, req, &iovec, s, issue_flags);
++ if (unlikely(ret < 0))
++ return ret;
++ } else {
++ struct io_async_rw *rw = req->async_data;
++
++ s = &rw->s;
++ iov_iter_restore(&s->iter, &s->iter_state);
++ iovec = NULL;
++ }
++ ret = io_rw_init_file(req, FMODE_WRITE);
++ if (unlikely(ret)) {
++ kfree(iovec);
++ return ret;
++ }
++ req->result = iov_iter_count(&s->iter);
++
++ if (force_nonblock) {
++ /* If the file doesn't support async, just async punt */
++ if (unlikely(!io_file_supports_nowait(req)))
++ goto copy_iov;
++
++ /* file path doesn't support NOWAIT for non-direct_IO */
++ if (force_nonblock && !(kiocb->ki_flags & IOCB_DIRECT) &&
++ (req->flags & REQ_F_ISREG))
++ goto copy_iov;
++
++ kiocb->ki_flags |= IOCB_NOWAIT;
++ } else {
++ /* Ensure we clear previously set non-block flag */
++ kiocb->ki_flags &= ~IOCB_NOWAIT;
++ }
++
++ ppos = io_kiocb_update_pos(req);
++
++ ret = rw_verify_area(WRITE, req->file, ppos, req->result);
++ if (unlikely(ret))
++ goto out_free;
++
++ /*
++ * Open-code file_start_write here to grab freeze protection,
++ * which will be released by another thread in
++ * io_complete_rw(). Fool lockdep by telling it the lock got
++ * released so that it doesn't complain about the held lock when
++ * we return to userspace.
++ */
++ if (req->flags & REQ_F_ISREG) {
++ sb_start_write(file_inode(req->file)->i_sb);
++ __sb_writers_release(file_inode(req->file)->i_sb,
++ SB_FREEZE_WRITE);
++ }
++ kiocb->ki_flags |= IOCB_WRITE;
++
++ if (likely(req->file->f_op->write_iter))
++ ret2 = call_write_iter(req->file, kiocb, &s->iter);
++ else if (req->file->f_op->write)
++ ret2 = loop_rw_iter(WRITE, req, &s->iter);
++ else
++ ret2 = -EINVAL;
++
++ if (req->flags & REQ_F_REISSUE) {
++ req->flags &= ~REQ_F_REISSUE;
++ ret2 = -EAGAIN;
++ }
++
++ /*
++ * Raw bdev writes will return -EOPNOTSUPP for IOCB_NOWAIT. Just
++ * retry them without IOCB_NOWAIT.
++ */
++ if (ret2 == -EOPNOTSUPP && (kiocb->ki_flags & IOCB_NOWAIT))
++ ret2 = -EAGAIN;
++ /* no retry on NONBLOCK nor RWF_NOWAIT */
++ if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
++ goto done;
++ if (!force_nonblock || ret2 != -EAGAIN) {
++ /* IOPOLL retry should happen for io-wq threads */
++ if (ret2 == -EAGAIN && (req->ctx->flags & IORING_SETUP_IOPOLL))
++ goto copy_iov;
++done:
++ kiocb_done(req, ret2, issue_flags);
++ } else {
++copy_iov:
++ iov_iter_restore(&s->iter, &s->iter_state);
++ ret = io_setup_async_rw(req, iovec, s, false);
++ return ret ?: -EAGAIN;
++ }
++out_free:
++ /* it's reportedly faster than delegating the null check to kfree() */
++ if (iovec)
++ kfree(iovec);
++ return ret;
++}
++
++static int io_renameat_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_rename *ren = &req->rename;
++ const char __user *oldf, *newf;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->flags & REQ_F_FIXED_FILE))
++ return -EBADF;
++
++ ren->old_dfd = READ_ONCE(sqe->fd);
++ oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
++ ren->new_dfd = READ_ONCE(sqe->len);
++ ren->flags = READ_ONCE(sqe->rename_flags);
++
++ ren->oldpath = getname(oldf);
++ if (IS_ERR(ren->oldpath))
++ return PTR_ERR(ren->oldpath);
++
++ ren->newpath = getname(newf);
++ if (IS_ERR(ren->newpath)) {
++ putname(ren->oldpath);
++ return PTR_ERR(ren->newpath);
++ }
++
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++static int io_renameat(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_rename *ren = &req->rename;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = do_renameat2(ren->old_dfd, ren->oldpath, ren->new_dfd,
++ ren->newpath, ren->flags);
++
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_unlinkat_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_unlink *un = &req->unlink;
++ const char __user *fname;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->off || sqe->len || sqe->buf_index ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->flags & REQ_F_FIXED_FILE))
++ return -EBADF;
++
++ un->dfd = READ_ONCE(sqe->fd);
++
++ un->flags = READ_ONCE(sqe->unlink_flags);
++ if (un->flags & ~AT_REMOVEDIR)
++ return -EINVAL;
++
++ fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ un->filename = getname(fname);
++ if (IS_ERR(un->filename))
++ return PTR_ERR(un->filename);
++
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++static int io_unlinkat(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_unlink *un = &req->unlink;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ if (un->flags & AT_REMOVEDIR)
++ ret = do_rmdir(un->dfd, un->filename);
++ else
++ ret = do_unlinkat(un->dfd, un->filename);
++
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_mkdirat_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_mkdir *mkd = &req->mkdir;
++ const char __user *fname;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->off || sqe->rw_flags || sqe->buf_index ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->flags & REQ_F_FIXED_FILE))
++ return -EBADF;
++
++ mkd->dfd = READ_ONCE(sqe->fd);
++ mkd->mode = READ_ONCE(sqe->len);
++
++ fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ mkd->filename = getname(fname);
++ if (IS_ERR(mkd->filename))
++ return PTR_ERR(mkd->filename);
++
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++static int io_mkdirat(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_mkdir *mkd = &req->mkdir;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = do_mkdirat(mkd->dfd, mkd->filename, mkd->mode);
++
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_symlinkat_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_symlink *sl = &req->symlink;
++ const char __user *oldpath, *newpath;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->len || sqe->rw_flags || sqe->buf_index ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->flags & REQ_F_FIXED_FILE))
++ return -EBADF;
++
++ sl->new_dfd = READ_ONCE(sqe->fd);
++ oldpath = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ newpath = u64_to_user_ptr(READ_ONCE(sqe->addr2));
++
++ sl->oldpath = getname(oldpath);
++ if (IS_ERR(sl->oldpath))
++ return PTR_ERR(sl->oldpath);
++
++ sl->newpath = getname(newpath);
++ if (IS_ERR(sl->newpath)) {
++ putname(sl->oldpath);
++ return PTR_ERR(sl->newpath);
++ }
++
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++static int io_symlinkat(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_symlink *sl = &req->symlink;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = do_symlinkat(sl->oldpath, sl->new_dfd, sl->newpath);
++
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_linkat_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_hardlink *lnk = &req->hardlink;
++ const char __user *oldf, *newf;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->flags & REQ_F_FIXED_FILE))
++ return -EBADF;
++
++ lnk->old_dfd = READ_ONCE(sqe->fd);
++ lnk->new_dfd = READ_ONCE(sqe->len);
++ oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
++ lnk->flags = READ_ONCE(sqe->hardlink_flags);
++
++ lnk->oldpath = getname(oldf);
++ if (IS_ERR(lnk->oldpath))
++ return PTR_ERR(lnk->oldpath);
++
++ lnk->newpath = getname(newf);
++ if (IS_ERR(lnk->newpath)) {
++ putname(lnk->oldpath);
++ return PTR_ERR(lnk->newpath);
++ }
++
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++static int io_linkat(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_hardlink *lnk = &req->hardlink;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = do_linkat(lnk->old_dfd, lnk->oldpath, lnk->new_dfd,
++ lnk->newpath, lnk->flags);
++
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_shutdown_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++#if defined(CONFIG_NET)
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(sqe->ioprio || sqe->off || sqe->addr || sqe->rw_flags ||
++ sqe->buf_index || sqe->splice_fd_in))
++ return -EINVAL;
++
++ req->shutdown.how = READ_ONCE(sqe->len);
++ return 0;
++#else
++ return -EOPNOTSUPP;
++#endif
++}
++
++static int io_shutdown(struct io_kiocb *req, unsigned int issue_flags)
++{
++#if defined(CONFIG_NET)
++ struct socket *sock;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ sock = sock_from_file(req->file);
++ if (unlikely(!sock))
++ return -ENOTSOCK;
++
++ ret = __sys_shutdown_sock(sock, req->shutdown.how);
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++#else
++ return -EOPNOTSUPP;
++#endif
++}
++
++static int __io_splice_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_splice *sp = &req->splice;
++ unsigned int valid_flags = SPLICE_F_FD_IN_FIXED | SPLICE_F_ALL;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++
++ sp->len = READ_ONCE(sqe->len);
++ sp->flags = READ_ONCE(sqe->splice_flags);
++ if (unlikely(sp->flags & ~valid_flags))
++ return -EINVAL;
++ sp->splice_fd_in = READ_ONCE(sqe->splice_fd_in);
++ return 0;
++}
++
++static int io_tee_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ if (READ_ONCE(sqe->splice_off_in) || READ_ONCE(sqe->off))
++ return -EINVAL;
++ return __io_splice_prep(req, sqe);
++}
++
++static int io_tee(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_splice *sp = &req->splice;
++ struct file *out = sp->file_out;
++ unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
++ struct file *in;
++ long ret = 0;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ if (sp->flags & SPLICE_F_FD_IN_FIXED)
++ in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
++ else
++ in = io_file_get_normal(req, sp->splice_fd_in);
++ if (!in) {
++ ret = -EBADF;
++ goto done;
++ }
++
++ if (sp->len)
++ ret = do_tee(in, out, sp->len, flags);
++
++ if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
++ io_put_file(in);
++done:
++ if (ret != sp->len)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_splice_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_splice *sp = &req->splice;
++
++ sp->off_in = READ_ONCE(sqe->splice_off_in);
++ sp->off_out = READ_ONCE(sqe->off);
++ return __io_splice_prep(req, sqe);
++}
++
++static int io_splice(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_splice *sp = &req->splice;
++ struct file *out = sp->file_out;
++ unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
++ loff_t *poff_in, *poff_out;
++ struct file *in;
++ long ret = 0;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ if (sp->flags & SPLICE_F_FD_IN_FIXED)
++ in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
++ else
++ in = io_file_get_normal(req, sp->splice_fd_in);
++ if (!in) {
++ ret = -EBADF;
++ goto done;
++ }
++
++ poff_in = (sp->off_in == -1) ? NULL : &sp->off_in;
++ poff_out = (sp->off_out == -1) ? NULL : &sp->off_out;
++
++ if (sp->len)
++ ret = do_splice(in, poff_in, out, poff_out, sp->len, flags);
++
++ if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
++ io_put_file(in);
++done:
++ if (ret != sp->len)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++/*
++ * IORING_OP_NOP just posts a completion event, nothing else.
++ */
++static int io_nop(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++
++ __io_req_complete(req, issue_flags, 0, 0);
++ return 0;
++}
++
++static int io_msg_ring_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ if (unlikely(sqe->addr || sqe->ioprio || sqe->rw_flags ||
++ sqe->splice_fd_in || sqe->buf_index || sqe->personality))
++ return -EINVAL;
++
++ req->msg.user_data = READ_ONCE(sqe->off);
++ req->msg.len = READ_ONCE(sqe->len);
++ return 0;
++}
++
++static int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_ring_ctx *target_ctx;
++ struct io_msg *msg = &req->msg;
++ bool filled;
++ int ret;
++
++ ret = -EBADFD;
++ if (req->file->f_op != &io_uring_fops)
++ goto done;
++
++ ret = -EOVERFLOW;
++ target_ctx = req->file->private_data;
++
++ spin_lock(&target_ctx->completion_lock);
++ filled = io_fill_cqe_aux(target_ctx, msg->user_data, msg->len, 0);
++ io_commit_cqring(target_ctx);
++ spin_unlock(&target_ctx->completion_lock);
++
++ if (filled) {
++ io_cqring_ev_posted(target_ctx);
++ ret = 0;
++ }
++
++done:
++ if (ret < 0)
++ req_set_fail(req);
++ __io_req_complete(req, issue_flags, ret, 0);
++ /* put file to avoid an attempt to IOPOLL the req */
++ io_put_file(req->file);
++ req->file = NULL;
++ return 0;
++}
++
++static int io_fsync_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
++ sqe->splice_fd_in))
++ return -EINVAL;
++
++ req->sync.flags = READ_ONCE(sqe->fsync_flags);
++ if (unlikely(req->sync.flags & ~IORING_FSYNC_DATASYNC))
++ return -EINVAL;
++
++ req->sync.off = READ_ONCE(sqe->off);
++ req->sync.len = READ_ONCE(sqe->len);
++ return 0;
++}
++
++static int io_fsync(struct io_kiocb *req, unsigned int issue_flags)
++{
++ loff_t end = req->sync.off + req->sync.len;
++ int ret;
++
++ /* fsync always requires a blocking context */
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = vfs_fsync_range(req->file, req->sync.off,
++ end > 0 ? end : LLONG_MAX,
++ req->sync.flags & IORING_FSYNC_DATASYNC);
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_fallocate_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ if (sqe->ioprio || sqe->buf_index || sqe->rw_flags ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++
++ req->sync.off = READ_ONCE(sqe->off);
++ req->sync.len = READ_ONCE(sqe->addr);
++ req->sync.mode = READ_ONCE(sqe->len);
++ return 0;
++}
++
++static int io_fallocate(struct io_kiocb *req, unsigned int issue_flags)
++{
++ int ret;
++
++ /* fallocate always requiring blocking context */
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++ ret = vfs_fallocate(req->file, req->sync.mode, req->sync.off,
++ req->sync.len);
++ if (ret < 0)
++ req_set_fail(req);
++ else
++ fsnotify_modify(req->file);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ const char __user *fname;
++ int ret;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(sqe->ioprio || sqe->buf_index))
++ return -EINVAL;
++ if (unlikely(req->flags & REQ_F_FIXED_FILE))
++ return -EBADF;
++
++ /* open.how should be already initialised */
++ if (!(req->open.how.flags & O_PATH) && force_o_largefile())
++ req->open.how.flags |= O_LARGEFILE;
++
++ req->open.dfd = READ_ONCE(sqe->fd);
++ fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ req->open.filename = getname(fname);
++ if (IS_ERR(req->open.filename)) {
++ ret = PTR_ERR(req->open.filename);
++ req->open.filename = NULL;
++ return ret;
++ }
++
++ req->open.file_slot = READ_ONCE(sqe->file_index);
++ if (req->open.file_slot && (req->open.how.flags & O_CLOEXEC))
++ return -EINVAL;
++
++ req->open.nofile = rlimit(RLIMIT_NOFILE);
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ u64 mode = READ_ONCE(sqe->len);
++ u64 flags = READ_ONCE(sqe->open_flags);
++
++ req->open.how = build_open_how(flags, mode);
++ return __io_openat_prep(req, sqe);
++}
++
++static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct open_how __user *how;
++ size_t len;
++ int ret;
++
++ how = u64_to_user_ptr(READ_ONCE(sqe->addr2));
++ len = READ_ONCE(sqe->len);
++ if (len < OPEN_HOW_SIZE_VER0)
++ return -EINVAL;
++
++ ret = copy_struct_from_user(&req->open.how, sizeof(req->open.how), how,
++ len);
++ if (ret)
++ return ret;
++
++ return __io_openat_prep(req, sqe);
++}
++
++static int io_openat2(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct open_flags op;
++ struct file *file;
++ bool resolve_nonblock, nonblock_set;
++ bool fixed = !!req->open.file_slot;
++ int ret;
++
++ ret = build_open_flags(&req->open.how, &op);
++ if (ret)
++ goto err;
++ nonblock_set = op.open_flag & O_NONBLOCK;
++ resolve_nonblock = req->open.how.resolve & RESOLVE_CACHED;
++ if (issue_flags & IO_URING_F_NONBLOCK) {
++ /*
++ * Don't bother trying for O_TRUNC, O_CREAT, or O_TMPFILE open,
++ * it'll always -EAGAIN
++ */
++ if (req->open.how.flags & (O_TRUNC | O_CREAT | O_TMPFILE))
++ return -EAGAIN;
++ op.lookup_flags |= LOOKUP_CACHED;
++ op.open_flag |= O_NONBLOCK;
++ }
++
++ if (!fixed) {
++ ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
++ if (ret < 0)
++ goto err;
++ }
++
++ file = do_filp_open(req->open.dfd, req->open.filename, &op);
++ if (IS_ERR(file)) {
++ /*
++ * We could hang on to this 'fd' on retrying, but seems like
++ * marginal gain for something that is now known to be a slower
++ * path. So just put it, and we'll get a new one when we retry.
++ */
++ if (!fixed)
++ put_unused_fd(ret);
++
++ ret = PTR_ERR(file);
++ /* only retry if RESOLVE_CACHED wasn't already set by application */
++ if (ret == -EAGAIN &&
++ (!resolve_nonblock && (issue_flags & IO_URING_F_NONBLOCK)))
++ return -EAGAIN;
++ goto err;
++ }
++
++ if ((issue_flags & IO_URING_F_NONBLOCK) && !nonblock_set)
++ file->f_flags &= ~O_NONBLOCK;
++ fsnotify_open(file);
++
++ if (!fixed)
++ fd_install(ret, file);
++ else
++ ret = io_install_fixed_file(req, file, issue_flags,
++ req->open.file_slot - 1);
++err:
++ putname(req->open.filename);
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ if (ret < 0)
++ req_set_fail(req);
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int io_openat(struct io_kiocb *req, unsigned int issue_flags)
++{
++ return io_openat2(req, issue_flags);
++}
++
++static int io_remove_buffers_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_provide_buf *p = &req->pbuf;
++ u64 tmp;
++
++ if (sqe->ioprio || sqe->rw_flags || sqe->addr || sqe->len || sqe->off ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++
++ tmp = READ_ONCE(sqe->fd);
++ if (!tmp || tmp > USHRT_MAX)
++ return -EINVAL;
++
++ memset(p, 0, sizeof(*p));
++ p->nbufs = tmp;
++ p->bgid = READ_ONCE(sqe->buf_group);
++ return 0;
++}
++
++static int __io_remove_buffers(struct io_ring_ctx *ctx,
++ struct io_buffer_list *bl, unsigned nbufs)
++{
++ unsigned i = 0;
++
++ /* shouldn't happen */
++ if (!nbufs)
++ return 0;
++
++ /* the head kbuf is the list itself */
++ while (!list_empty(&bl->buf_list)) {
++ struct io_buffer *nxt;
++
++ nxt = list_first_entry(&bl->buf_list, struct io_buffer, list);
++ list_del(&nxt->list);
++ if (++i == nbufs)
++ return i;
++ cond_resched();
++ }
++ i++;
++
++ return i;
++}
++
++static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_provide_buf *p = &req->pbuf;
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_buffer_list *bl;
++ int ret = 0;
++ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++
++ io_ring_submit_lock(ctx, needs_lock);
++
++ lockdep_assert_held(&ctx->uring_lock);
++
++ ret = -ENOENT;
++ bl = io_buffer_get_list(ctx, p->bgid);
++ if (bl)
++ ret = __io_remove_buffers(ctx, bl, p->nbufs);
++ if (ret < 0)
++ req_set_fail(req);
++
++ /* complete before unlock, IOPOLL may need the lock */
++ __io_req_complete(req, issue_flags, ret, 0);
++ io_ring_submit_unlock(ctx, needs_lock);
++ return 0;
++}
++
++static int io_provide_buffers_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ unsigned long size, tmp_check;
++ struct io_provide_buf *p = &req->pbuf;
++ u64 tmp;
++
++ if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
++ return -EINVAL;
++
++ tmp = READ_ONCE(sqe->fd);
++ if (!tmp || tmp > USHRT_MAX)
++ return -E2BIG;
++ p->nbufs = tmp;
++ p->addr = READ_ONCE(sqe->addr);
++ p->len = READ_ONCE(sqe->len);
++
++ if (check_mul_overflow((unsigned long)p->len, (unsigned long)p->nbufs,
++ &size))
++ return -EOVERFLOW;
++ if (check_add_overflow((unsigned long)p->addr, size, &tmp_check))
++ return -EOVERFLOW;
++
++ size = (unsigned long)p->len * p->nbufs;
++ if (!access_ok(u64_to_user_ptr(p->addr), size))
++ return -EFAULT;
++
++ p->bgid = READ_ONCE(sqe->buf_group);
++ tmp = READ_ONCE(sqe->off);
++ if (tmp > USHRT_MAX)
++ return -E2BIG;
++ p->bid = tmp;
++ return 0;
++}
++
++static int io_refill_buffer_cache(struct io_ring_ctx *ctx)
++{
++ struct io_buffer *buf;
++ struct page *page;
++ int bufs_in_page;
++
++ /*
++ * Completions that don't happen inline (eg not under uring_lock) will
++ * add to ->io_buffers_comp. If we don't have any free buffers, check
++ * the completion list and splice those entries first.
++ */
++ if (!list_empty_careful(&ctx->io_buffers_comp)) {
++ spin_lock(&ctx->completion_lock);
++ if (!list_empty(&ctx->io_buffers_comp)) {
++ list_splice_init(&ctx->io_buffers_comp,
++ &ctx->io_buffers_cache);
++ spin_unlock(&ctx->completion_lock);
++ return 0;
++ }
++ spin_unlock(&ctx->completion_lock);
++ }
++
++ /*
++ * No free buffers and no completion entries either. Allocate a new
++ * page worth of buffer entries and add those to our freelist.
++ */
++ page = alloc_page(GFP_KERNEL_ACCOUNT);
++ if (!page)
++ return -ENOMEM;
++
++ list_add(&page->lru, &ctx->io_buffers_pages);
++
++ buf = page_address(page);
++ bufs_in_page = PAGE_SIZE / sizeof(*buf);
++ while (bufs_in_page) {
++ list_add_tail(&buf->list, &ctx->io_buffers_cache);
++ buf++;
++ bufs_in_page--;
++ }
++
++ return 0;
++}
++
++static int io_add_buffers(struct io_ring_ctx *ctx, struct io_provide_buf *pbuf,
++ struct io_buffer_list *bl)
++{
++ struct io_buffer *buf;
++ u64 addr = pbuf->addr;
++ int i, bid = pbuf->bid;
++
++ for (i = 0; i < pbuf->nbufs; i++) {
++ if (list_empty(&ctx->io_buffers_cache) &&
++ io_refill_buffer_cache(ctx))
++ break;
++ buf = list_first_entry(&ctx->io_buffers_cache, struct io_buffer,
++ list);
++ list_move_tail(&buf->list, &bl->buf_list);
++ buf->addr = addr;
++ buf->len = min_t(__u32, pbuf->len, MAX_RW_COUNT);
++ buf->bid = bid;
++ buf->bgid = pbuf->bgid;
++ addr += pbuf->len;
++ bid++;
++ cond_resched();
++ }
++
++ return i ? 0 : -ENOMEM;
++}
++
++static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_provide_buf *p = &req->pbuf;
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_buffer_list *bl;
++ int ret = 0;
++ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++
++ io_ring_submit_lock(ctx, needs_lock);
++
++ lockdep_assert_held(&ctx->uring_lock);
++
++ bl = io_buffer_get_list(ctx, p->bgid);
++ if (unlikely(!bl)) {
++ bl = kzalloc(sizeof(*bl), GFP_KERNEL_ACCOUNT);
++ if (!bl) {
++ ret = -ENOMEM;
++ goto err;
++ }
++ io_buffer_add_list(ctx, bl, p->bgid);
++ }
++
++ ret = io_add_buffers(ctx, p, bl);
++err:
++ if (ret < 0)
++ req_set_fail(req);
++ /* complete before unlock, IOPOLL may need the lock */
++ __io_req_complete(req, issue_flags, ret, 0);
++ io_ring_submit_unlock(ctx, needs_lock);
++ return 0;
++}
++
++static int io_epoll_ctl_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++#if defined(CONFIG_EPOLL)
++ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++
++ req->epoll.epfd = READ_ONCE(sqe->fd);
++ req->epoll.op = READ_ONCE(sqe->len);
++ req->epoll.fd = READ_ONCE(sqe->off);
++
++ if (ep_op_has_event(req->epoll.op)) {
++ struct epoll_event __user *ev;
++
++ ev = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ if (copy_from_user(&req->epoll.event, ev, sizeof(*ev)))
++ return -EFAULT;
++ }
++
++ return 0;
++#else
++ return -EOPNOTSUPP;
++#endif
++}
++
++static int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
++{
++#if defined(CONFIG_EPOLL)
++ struct io_epoll *ie = &req->epoll;
++ int ret;
++ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
++
++ ret = do_epoll_ctl(ie->epfd, ie->op, ie->fd, &ie->event, force_nonblock);
++ if (force_nonblock && ret == -EAGAIN)
++ return -EAGAIN;
++
++ if (ret < 0)
++ req_set_fail(req);
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++#else
++ return -EOPNOTSUPP;
++#endif
++}
++
++static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
++ if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++
++ req->madvise.addr = READ_ONCE(sqe->addr);
++ req->madvise.len = READ_ONCE(sqe->len);
++ req->madvise.advice = READ_ONCE(sqe->fadvise_advice);
++ return 0;
++#else
++ return -EOPNOTSUPP;
++#endif
++}
++
++static int io_madvise(struct io_kiocb *req, unsigned int issue_flags)
++{
++#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
++ struct io_madvise *ma = &req->madvise;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = do_madvise(current->mm, ma->addr, ma->len, ma->advice);
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++#else
++ return -EOPNOTSUPP;
++#endif
++}
++
++static int io_fadvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ if (sqe->ioprio || sqe->buf_index || sqe->addr || sqe->splice_fd_in)
++ return -EINVAL;
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++
++ req->fadvise.offset = READ_ONCE(sqe->off);
++ req->fadvise.len = READ_ONCE(sqe->len);
++ req->fadvise.advice = READ_ONCE(sqe->fadvise_advice);
++ return 0;
++}
++
++static int io_fadvise(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_fadvise *fa = &req->fadvise;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK) {
++ switch (fa->advice) {
++ case POSIX_FADV_NORMAL:
++ case POSIX_FADV_RANDOM:
++ case POSIX_FADV_SEQUENTIAL:
++ break;
++ default:
++ return -EAGAIN;
++ }
++ }
++
++ ret = vfs_fadvise(req->file, fa->offset, fa->len, fa->advice);
++ if (ret < 0)
++ req_set_fail(req);
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ const char __user *path;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
++ return -EINVAL;
++ if (req->flags & REQ_F_FIXED_FILE)
++ return -EBADF;
++
++ req->statx.dfd = READ_ONCE(sqe->fd);
++ req->statx.mask = READ_ONCE(sqe->len);
++ path = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ req->statx.buffer = u64_to_user_ptr(READ_ONCE(sqe->addr2));
++ req->statx.flags = READ_ONCE(sqe->statx_flags);
++
++ req->statx.filename = getname_flags(path,
++ getname_statx_lookup_flags(req->statx.flags),
++ NULL);
++
++ if (IS_ERR(req->statx.filename)) {
++ int ret = PTR_ERR(req->statx.filename);
++
++ req->statx.filename = NULL;
++ return ret;
++ }
++
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return 0;
++}
++
++static int io_statx(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_statx *ctx = &req->statx;
++ int ret;
++
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = do_statx(ctx->dfd, ctx->filename, ctx->flags, ctx->mask,
++ ctx->buffer);
++
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
++ sqe->rw_flags || sqe->buf_index)
++ return -EINVAL;
++ if (req->flags & REQ_F_FIXED_FILE)
++ return -EBADF;
++
++ req->close.fd = READ_ONCE(sqe->fd);
++ req->close.file_slot = READ_ONCE(sqe->file_index);
++ if (req->close.file_slot && req->close.fd)
++ return -EINVAL;
++
++ return 0;
++}
++
++static int io_close(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct files_struct *files = current->files;
++ struct io_close *close = &req->close;
++ struct fdtable *fdt;
++ struct file *file = NULL;
++ int ret = -EBADF;
++
++ if (req->close.file_slot) {
++ ret = io_close_fixed(req, issue_flags);
++ goto err;
++ }
++
++ spin_lock(&files->file_lock);
++ fdt = files_fdtable(files);
++ if (close->fd >= fdt->max_fds) {
++ spin_unlock(&files->file_lock);
++ goto err;
++ }
++ file = fdt->fd[close->fd];
++ if (!file || file->f_op == &io_uring_fops) {
++ spin_unlock(&files->file_lock);
++ file = NULL;
++ goto err;
++ }
++
++ /* if the file has a flush method, be safe and punt to async */
++ if (file->f_op->flush && (issue_flags & IO_URING_F_NONBLOCK)) {
++ spin_unlock(&files->file_lock);
++ return -EAGAIN;
++ }
++
++ ret = __close_fd_get_file(close->fd, &file);
++ spin_unlock(&files->file_lock);
++ if (ret < 0) {
++ if (ret == -ENOENT)
++ ret = -EBADF;
++ goto err;
++ }
++
++ /* No ->flush() or already async, safely close from here */
++ ret = filp_close(file, current->files);
++err:
++ if (ret < 0)
++ req_set_fail(req);
++ if (file)
++ fput(file);
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int io_sfr_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
++ sqe->splice_fd_in))
++ return -EINVAL;
++
++ req->sync.off = READ_ONCE(sqe->off);
++ req->sync.len = READ_ONCE(sqe->len);
++ req->sync.flags = READ_ONCE(sqe->sync_range_flags);
++ return 0;
++}
++
++static int io_sync_file_range(struct io_kiocb *req, unsigned int issue_flags)
++{
++ int ret;
++
++ /* sync_file_range always requires a blocking context */
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ return -EAGAIN;
++
++ ret = sync_file_range(req->file, req->sync.off, req->sync.len,
++ req->sync.flags);
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete(req, ret);
++ return 0;
++}
++
++#if defined(CONFIG_NET)
++static int io_setup_async_msg(struct io_kiocb *req,
++ struct io_async_msghdr *kmsg)
++{
++ struct io_async_msghdr *async_msg = req->async_data;
++
++ if (async_msg)
++ return -EAGAIN;
++ if (io_alloc_async_data(req)) {
++ kfree(kmsg->free_iov);
++ return -ENOMEM;
++ }
++ async_msg = req->async_data;
++ req->flags |= REQ_F_NEED_CLEANUP;
++ memcpy(async_msg, kmsg, sizeof(*kmsg));
++ async_msg->msg.msg_name = &async_msg->addr;
++ /* if were using fast_iov, set it to the new one */
++ if (!async_msg->free_iov)
++ async_msg->msg.msg_iter.iov = async_msg->fast_iov;
++
++ return -EAGAIN;
++}
++
++static int io_sendmsg_copy_hdr(struct io_kiocb *req,
++ struct io_async_msghdr *iomsg)
++{
++ iomsg->msg.msg_name = &iomsg->addr;
++ iomsg->free_iov = iomsg->fast_iov;
++ return sendmsg_copy_msghdr(&iomsg->msg, req->sr_msg.umsg,
++ req->sr_msg.msg_flags, &iomsg->free_iov);
++}
++
++static int io_sendmsg_prep_async(struct io_kiocb *req)
++{
++ int ret;
++
++ ret = io_sendmsg_copy_hdr(req, req->async_data);
++ if (!ret)
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return ret;
++}
++
++static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_sr_msg *sr = &req->sr_msg;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
++ return -EINVAL;
++
++ sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ sr->len = READ_ONCE(sqe->len);
++ sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
++ if (sr->msg_flags & MSG_DONTWAIT)
++ req->flags |= REQ_F_NOWAIT;
++
++#ifdef CONFIG_COMPAT
++ if (req->ctx->compat)
++ sr->msg_flags |= MSG_CMSG_COMPAT;
++#endif
++ return 0;
++}
++
++static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_async_msghdr iomsg, *kmsg;
++ struct socket *sock;
++ unsigned flags;
++ int min_ret = 0;
++ int ret;
++
++ sock = sock_from_file(req->file);
++ if (unlikely(!sock))
++ return -ENOTSOCK;
++
++ if (req_has_async_data(req)) {
++ kmsg = req->async_data;
++ } else {
++ ret = io_sendmsg_copy_hdr(req, &iomsg);
++ if (ret)
++ return ret;
++ kmsg = &iomsg;
++ }
++
++ flags = req->sr_msg.msg_flags;
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ flags |= MSG_DONTWAIT;
++ if (flags & MSG_WAITALL)
++ min_ret = iov_iter_count(&kmsg->msg.msg_iter);
++
++ ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
++
++ if (ret < min_ret) {
++ if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
++ return io_setup_async_msg(req, kmsg);
++ if (ret == -ERESTARTSYS)
++ ret = -EINTR;
++ req_set_fail(req);
++ }
++ /* fast path, check for non-NULL to avoid function call */
++ if (kmsg->free_iov)
++ kfree(kmsg->free_iov);
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int io_send(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_sr_msg *sr = &req->sr_msg;
++ struct msghdr msg;
++ struct iovec iov;
++ struct socket *sock;
++ unsigned flags;
++ int min_ret = 0;
++ int ret;
++
++ sock = sock_from_file(req->file);
++ if (unlikely(!sock))
++ return -ENOTSOCK;
++
++ ret = import_single_range(WRITE, sr->buf, sr->len, &iov, &msg.msg_iter);
++ if (unlikely(ret))
++ return ret;
++
++ msg.msg_name = NULL;
++ msg.msg_control = NULL;
++ msg.msg_controllen = 0;
++ msg.msg_namelen = 0;
++
++ flags = req->sr_msg.msg_flags;
++ if (issue_flags & IO_URING_F_NONBLOCK)
++ flags |= MSG_DONTWAIT;
++ if (flags & MSG_WAITALL)
++ min_ret = iov_iter_count(&msg.msg_iter);
++
++ msg.msg_flags = flags;
++ ret = sock_sendmsg(sock, &msg);
++ if (ret < min_ret) {
++ if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
++ return -EAGAIN;
++ if (ret == -ERESTARTSYS)
++ ret = -EINTR;
++ req_set_fail(req);
++ }
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int __io_recvmsg_copy_hdr(struct io_kiocb *req,
++ struct io_async_msghdr *iomsg)
++{
++ struct io_sr_msg *sr = &req->sr_msg;
++ struct iovec __user *uiov;
++ size_t iov_len;
++ int ret;
++
++ ret = __copy_msghdr_from_user(&iomsg->msg, sr->umsg,
++ &iomsg->uaddr, &uiov, &iov_len);
++ if (ret)
++ return ret;
++
++ if (req->flags & REQ_F_BUFFER_SELECT) {
++ if (iov_len > 1)
++ return -EINVAL;
++ if (copy_from_user(iomsg->fast_iov, uiov, sizeof(*uiov)))
++ return -EFAULT;
++ sr->len = iomsg->fast_iov[0].iov_len;
++ iomsg->free_iov = NULL;
++ } else {
++ iomsg->free_iov = iomsg->fast_iov;
++ ret = __import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
++ &iomsg->free_iov, &iomsg->msg.msg_iter,
++ false);
++ if (ret > 0)
++ ret = 0;
++ }
++
++ return ret;
++}
++
++#ifdef CONFIG_COMPAT
++static int __io_compat_recvmsg_copy_hdr(struct io_kiocb *req,
++ struct io_async_msghdr *iomsg)
++{
++ struct io_sr_msg *sr = &req->sr_msg;
++ struct compat_iovec __user *uiov;
++ compat_uptr_t ptr;
++ compat_size_t len;
++ int ret;
++
++ ret = __get_compat_msghdr(&iomsg->msg, sr->umsg_compat, &iomsg->uaddr,
++ &ptr, &len);
++ if (ret)
++ return ret;
++
++ uiov = compat_ptr(ptr);
++ if (req->flags & REQ_F_BUFFER_SELECT) {
++ compat_ssize_t clen;
++
++ if (len > 1)
++ return -EINVAL;
++ if (!access_ok(uiov, sizeof(*uiov)))
++ return -EFAULT;
++ if (__get_user(clen, &uiov->iov_len))
++ return -EFAULT;
++ if (clen < 0)
++ return -EINVAL;
++ sr->len = clen;
++ iomsg->free_iov = NULL;
++ } else {
++ iomsg->free_iov = iomsg->fast_iov;
++ ret = __import_iovec(READ, (struct iovec __user *)uiov, len,
++ UIO_FASTIOV, &iomsg->free_iov,
++ &iomsg->msg.msg_iter, true);
++ if (ret < 0)
++ return ret;
++ }
++
++ return 0;
++}
++#endif
++
++static int io_recvmsg_copy_hdr(struct io_kiocb *req,
++ struct io_async_msghdr *iomsg)
++{
++ iomsg->msg.msg_name = &iomsg->addr;
++
++#ifdef CONFIG_COMPAT
++ if (req->ctx->compat)
++ return __io_compat_recvmsg_copy_hdr(req, iomsg);
++#endif
++
++ return __io_recvmsg_copy_hdr(req, iomsg);
++}
++
++static struct io_buffer *io_recv_buffer_select(struct io_kiocb *req,
++ unsigned int issue_flags)
++{
++ struct io_sr_msg *sr = &req->sr_msg;
++
++ return io_buffer_select(req, &sr->len, sr->bgid, issue_flags);
++}
++
++static int io_recvmsg_prep_async(struct io_kiocb *req)
++{
++ int ret;
++
++ ret = io_recvmsg_copy_hdr(req, req->async_data);
++ if (!ret)
++ req->flags |= REQ_F_NEED_CLEANUP;
++ return ret;
++}
++
++static int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_sr_msg *sr = &req->sr_msg;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
++ return -EINVAL;
++
++ sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ sr->len = READ_ONCE(sqe->len);
++ sr->bgid = READ_ONCE(sqe->buf_group);
++ sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
++ if (sr->msg_flags & MSG_DONTWAIT)
++ req->flags |= REQ_F_NOWAIT;
++
++#ifdef CONFIG_COMPAT
++ if (req->ctx->compat)
++ sr->msg_flags |= MSG_CMSG_COMPAT;
++#endif
++ sr->done_io = 0;
++ return 0;
++}
++
++static bool io_net_retry(struct socket *sock, int flags)
++{
++ if (!(flags & MSG_WAITALL))
++ return false;
++ return sock->type == SOCK_STREAM || sock->type == SOCK_SEQPACKET;
++}
++
++static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_async_msghdr iomsg, *kmsg;
++ struct io_sr_msg *sr = &req->sr_msg;
++ struct socket *sock;
++ struct io_buffer *kbuf;
++ unsigned flags;
++ int ret, min_ret = 0;
++ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
++
++ sock = sock_from_file(req->file);
++ if (unlikely(!sock))
++ return -ENOTSOCK;
++
++ if (req_has_async_data(req)) {
++ kmsg = req->async_data;
++ } else {
++ ret = io_recvmsg_copy_hdr(req, &iomsg);
++ if (ret)
++ return ret;
++ kmsg = &iomsg;
++ }
++
++ if (req->flags & REQ_F_BUFFER_SELECT) {
++ kbuf = io_recv_buffer_select(req, issue_flags);
++ if (IS_ERR(kbuf))
++ return PTR_ERR(kbuf);
++ kmsg->fast_iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
++ kmsg->fast_iov[0].iov_len = req->sr_msg.len;
++ iov_iter_init(&kmsg->msg.msg_iter, READ, kmsg->fast_iov,
++ 1, req->sr_msg.len);
++ }
++
++ flags = req->sr_msg.msg_flags;
++ if (force_nonblock)
++ flags |= MSG_DONTWAIT;
++ if (flags & MSG_WAITALL)
++ min_ret = iov_iter_count(&kmsg->msg.msg_iter);
++
++ ret = __sys_recvmsg_sock(sock, &kmsg->msg, req->sr_msg.umsg,
++ kmsg->uaddr, flags);
++ if (ret < min_ret) {
++ if (ret == -EAGAIN && force_nonblock)
++ return io_setup_async_msg(req, kmsg);
++ if (ret == -ERESTARTSYS)
++ ret = -EINTR;
++ if (ret > 0 && io_net_retry(sock, flags)) {
++ sr->done_io += ret;
++ req->flags |= REQ_F_PARTIAL_IO;
++ return io_setup_async_msg(req, kmsg);
++ }
++ req_set_fail(req);
++ } else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
++ req_set_fail(req);
++ }
++
++ /* fast path, check for non-NULL to avoid function call */
++ if (kmsg->free_iov)
++ kfree(kmsg->free_iov);
++ req->flags &= ~REQ_F_NEED_CLEANUP;
++ if (ret >= 0)
++ ret += sr->done_io;
++ else if (sr->done_io)
++ ret = sr->done_io;
++ __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
++ return 0;
++}
++
++static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_buffer *kbuf;
++ struct io_sr_msg *sr = &req->sr_msg;
++ struct msghdr msg;
++ void __user *buf = sr->buf;
++ struct socket *sock;
++ struct iovec iov;
++ unsigned flags;
++ int ret, min_ret = 0;
++ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
++
++ sock = sock_from_file(req->file);
++ if (unlikely(!sock))
++ return -ENOTSOCK;
++
++ if (req->flags & REQ_F_BUFFER_SELECT) {
++ kbuf = io_recv_buffer_select(req, issue_flags);
++ if (IS_ERR(kbuf))
++ return PTR_ERR(kbuf);
++ buf = u64_to_user_ptr(kbuf->addr);
++ }
++
++ ret = import_single_range(READ, buf, sr->len, &iov, &msg.msg_iter);
++ if (unlikely(ret))
++ goto out_free;
++
++ msg.msg_name = NULL;
++ msg.msg_control = NULL;
++ msg.msg_controllen = 0;
++ msg.msg_namelen = 0;
++ msg.msg_iocb = NULL;
++ msg.msg_flags = 0;
++
++ flags = req->sr_msg.msg_flags;
++ if (force_nonblock)
++ flags |= MSG_DONTWAIT;
++ if (flags & MSG_WAITALL)
++ min_ret = iov_iter_count(&msg.msg_iter);
++
++ ret = sock_recvmsg(sock, &msg, flags);
++ if (ret < min_ret) {
++ if (ret == -EAGAIN && force_nonblock)
++ return -EAGAIN;
++ if (ret == -ERESTARTSYS)
++ ret = -EINTR;
++ if (ret > 0 && io_net_retry(sock, flags)) {
++ sr->len -= ret;
++ sr->buf += ret;
++ sr->done_io += ret;
++ req->flags |= REQ_F_PARTIAL_IO;
++ return -EAGAIN;
++ }
++ req_set_fail(req);
++ } else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
++out_free:
++ req_set_fail(req);
++ }
++
++ if (ret >= 0)
++ ret += sr->done_io;
++ else if (sr->done_io)
++ ret = sr->done_io;
++ __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
++ return 0;
++}
++
++static int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_accept *accept = &req->accept;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->len || sqe->buf_index)
++ return -EINVAL;
++
++ accept->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ accept->addr_len = u64_to_user_ptr(READ_ONCE(sqe->addr2));
++ accept->flags = READ_ONCE(sqe->accept_flags);
++ accept->nofile = rlimit(RLIMIT_NOFILE);
++
++ accept->file_slot = READ_ONCE(sqe->file_index);
++ if (accept->file_slot && (accept->flags & SOCK_CLOEXEC))
++ return -EINVAL;
++ if (accept->flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
++ return -EINVAL;
++ if (SOCK_NONBLOCK != O_NONBLOCK && (accept->flags & SOCK_NONBLOCK))
++ accept->flags = (accept->flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
++ return 0;
++}
++
++static int io_accept(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_accept *accept = &req->accept;
++ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
++ unsigned int file_flags = force_nonblock ? O_NONBLOCK : 0;
++ bool fixed = !!accept->file_slot;
++ struct file *file;
++ int ret, fd;
++
++ if (!fixed) {
++ fd = __get_unused_fd_flags(accept->flags, accept->nofile);
++ if (unlikely(fd < 0))
++ return fd;
++ }
++ file = do_accept(req->file, file_flags, accept->addr, accept->addr_len,
++ accept->flags);
++ if (IS_ERR(file)) {
++ if (!fixed)
++ put_unused_fd(fd);
++ ret = PTR_ERR(file);
++ if (ret == -EAGAIN && force_nonblock)
++ return -EAGAIN;
++ if (ret == -ERESTARTSYS)
++ ret = -EINTR;
++ req_set_fail(req);
++ } else if (!fixed) {
++ fd_install(fd, file);
++ ret = fd;
++ } else {
++ ret = io_install_fixed_file(req, file, issue_flags,
++ accept->file_slot - 1);
++ }
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int io_connect_prep_async(struct io_kiocb *req)
++{
++ struct io_async_connect *io = req->async_data;
++ struct io_connect *conn = &req->connect;
++
++ return move_addr_to_kernel(conn->addr, conn->addr_len, &io->address);
++}
++
++static int io_connect_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_connect *conn = &req->connect;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->len || sqe->buf_index || sqe->rw_flags ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++
++ conn->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
++ conn->addr_len = READ_ONCE(sqe->addr2);
++ return 0;
++}
++
++static int io_connect(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_async_connect __io, *io;
++ unsigned file_flags;
++ int ret;
++ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
++
++ if (req_has_async_data(req)) {
++ io = req->async_data;
++ } else {
++ ret = move_addr_to_kernel(req->connect.addr,
++ req->connect.addr_len,
++ &__io.address);
++ if (ret)
++ goto out;
++ io = &__io;
++ }
++
++ file_flags = force_nonblock ? O_NONBLOCK : 0;
++
++ ret = __sys_connect_file(req->file, &io->address,
++ req->connect.addr_len, file_flags);
++ if ((ret == -EAGAIN || ret == -EINPROGRESS) && force_nonblock) {
++ if (req_has_async_data(req))
++ return -EAGAIN;
++ if (io_alloc_async_data(req)) {
++ ret = -ENOMEM;
++ goto out;
++ }
++ memcpy(req->async_data, &__io, sizeof(__io));
++ return -EAGAIN;
++ }
++ if (ret == -ERESTARTSYS)
++ ret = -EINTR;
++out:
++ if (ret < 0)
++ req_set_fail(req);
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++#else /* !CONFIG_NET */
++#define IO_NETOP_FN(op) \
++static int io_##op(struct io_kiocb *req, unsigned int issue_flags) \
++{ \
++ return -EOPNOTSUPP; \
++}
++
++#define IO_NETOP_PREP(op) \
++IO_NETOP_FN(op) \
++static int io_##op##_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) \
++{ \
++ return -EOPNOTSUPP; \
++} \
++
++#define IO_NETOP_PREP_ASYNC(op) \
++IO_NETOP_PREP(op) \
++static int io_##op##_prep_async(struct io_kiocb *req) \
++{ \
++ return -EOPNOTSUPP; \
++}
++
++IO_NETOP_PREP_ASYNC(sendmsg);
++IO_NETOP_PREP_ASYNC(recvmsg);
++IO_NETOP_PREP_ASYNC(connect);
++IO_NETOP_PREP(accept);
++IO_NETOP_FN(send);
++IO_NETOP_FN(recv);
++#endif /* CONFIG_NET */
++
++struct io_poll_table {
++ struct poll_table_struct pt;
++ struct io_kiocb *req;
++ int nr_entries;
++ int error;
++};
++
++#define IO_POLL_CANCEL_FLAG BIT(31)
++#define IO_POLL_REF_MASK GENMASK(30, 0)
++
++/*
++ * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
++ * bump it and acquire ownership. It's disallowed to modify requests while not
++ * owning it, that prevents from races for enqueueing task_work's and b/w
++ * arming poll and wakeups.
++ */
++static inline bool io_poll_get_ownership(struct io_kiocb *req)
++{
++ return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
++}
++
++static void io_poll_mark_cancelled(struct io_kiocb *req)
++{
++ atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
++}
++
++static struct io_poll_iocb *io_poll_get_double(struct io_kiocb *req)
++{
++ /* pure poll stashes this in ->async_data, poll driven retry elsewhere */
++ if (req->opcode == IORING_OP_POLL_ADD)
++ return req->async_data;
++ return req->apoll->double_poll;
++}
++
++static struct io_poll_iocb *io_poll_get_single(struct io_kiocb *req)
++{
++ if (req->opcode == IORING_OP_POLL_ADD)
++ return &req->poll;
++ return &req->apoll->poll;
++}
++
++static void io_poll_req_insert(struct io_kiocb *req)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct hlist_head *list;
++
++ list = &ctx->cancel_hash[hash_long(req->user_data, ctx->cancel_hash_bits)];
++ hlist_add_head(&req->hash_node, list);
++}
++
++static void io_init_poll_iocb(struct io_poll_iocb *poll, __poll_t events,
++ wait_queue_func_t wake_func)
++{
++ poll->head = NULL;
++#define IO_POLL_UNMASK (EPOLLERR|EPOLLHUP|EPOLLNVAL|EPOLLRDHUP)
++ /* mask in events that we always want/need */
++ poll->events = events | IO_POLL_UNMASK;
++ INIT_LIST_HEAD(&poll->wait.entry);
++ init_waitqueue_func_entry(&poll->wait, wake_func);
++}
++
++static inline void io_poll_remove_entry(struct io_poll_iocb *poll)
++{
++ struct wait_queue_head *head = smp_load_acquire(&poll->head);
++
++ if (head) {
++ spin_lock_irq(&head->lock);
++ list_del_init(&poll->wait.entry);
++ poll->head = NULL;
++ spin_unlock_irq(&head->lock);
++ }
++}
++
++static void io_poll_remove_entries(struct io_kiocb *req)
++{
++ /*
++ * Nothing to do if neither of those flags are set. Avoid dipping
++ * into the poll/apoll/double cachelines if we can.
++ */
++ if (!(req->flags & (REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL)))
++ return;
++
++ /*
++ * While we hold the waitqueue lock and the waitqueue is nonempty,
++ * wake_up_pollfree() will wait for us. However, taking the waitqueue
++ * lock in the first place can race with the waitqueue being freed.
++ *
++ * We solve this as eventpoll does: by taking advantage of the fact that
++ * all users of wake_up_pollfree() will RCU-delay the actual free. If
++ * we enter rcu_read_lock() and see that the pointer to the queue is
++ * non-NULL, we can then lock it without the memory being freed out from
++ * under us.
++ *
++ * Keep holding rcu_read_lock() as long as we hold the queue lock, in
++ * case the caller deletes the entry from the queue, leaving it empty.
++ * In that case, only RCU prevents the queue memory from being freed.
++ */
++ rcu_read_lock();
++ if (req->flags & REQ_F_SINGLE_POLL)
++ io_poll_remove_entry(io_poll_get_single(req));
++ if (req->flags & REQ_F_DOUBLE_POLL)
++ io_poll_remove_entry(io_poll_get_double(req));
++ rcu_read_unlock();
++}
++
++/*
++ * All poll tw should go through this. Checks for poll events, manages
++ * references, does rewait, etc.
++ *
++ * Returns a negative error on failure. >0 when no action require, which is
++ * either spurious wakeup or multishot CQE is served. 0 when it's done with
++ * the request, then the mask is stored in req->result.
++ */
++static int io_poll_check_events(struct io_kiocb *req, bool locked)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ int v;
++
++ /* req->task == current here, checking PF_EXITING is safe */
++ if (unlikely(req->task->flags & PF_EXITING))
++ io_poll_mark_cancelled(req);
++
++ do {
++ v = atomic_read(&req->poll_refs);
++
++ /* tw handler should be the owner, and so have some references */
++ if (WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))
++ return 0;
++ if (v & IO_POLL_CANCEL_FLAG)
++ return -ECANCELED;
++
++ if (!req->result) {
++ struct poll_table_struct pt = { ._key = req->apoll_events };
++ req->result = vfs_poll(req->file, &pt) & req->apoll_events;
++ }
++
++ /* multishot, just fill an CQE and proceed */
++ if (req->result && !(req->apoll_events & EPOLLONESHOT)) {
++ __poll_t mask = mangle_poll(req->result & req->apoll_events);
++ bool filled;
++
++ spin_lock(&ctx->completion_lock);
++ filled = io_fill_cqe_aux(ctx, req->user_data, mask,
++ IORING_CQE_F_MORE);
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ if (unlikely(!filled))
++ return -ECANCELED;
++ io_cqring_ev_posted(ctx);
++ } else if (req->result) {
++ return 0;
++ }
++
++ /*
++ * Release all references, retry if someone tried to restart
++ * task_work while we were executing it.
++ */
++ } while (atomic_sub_return(v & IO_POLL_REF_MASK, &req->poll_refs));
++
++ return 1;
++}
++
++static void io_poll_task_func(struct io_kiocb *req, bool *locked)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ int ret;
++
++ ret = io_poll_check_events(req, *locked);
++ if (ret > 0)
++ return;
++
++ if (!ret) {
++ req->result = mangle_poll(req->result & req->poll.events);
++ } else {
++ req->result = ret;
++ req_set_fail(req);
++ }
++
++ io_poll_remove_entries(req);
++ spin_lock(&ctx->completion_lock);
++ hash_del(&req->hash_node);
++ __io_req_complete_post(req, req->result, 0);
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ io_cqring_ev_posted(ctx);
++}
++
++static void io_apoll_task_func(struct io_kiocb *req, bool *locked)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ int ret;
++
++ ret = io_poll_check_events(req, *locked);
++ if (ret > 0)
++ return;
++
++ io_poll_remove_entries(req);
++ spin_lock(&ctx->completion_lock);
++ hash_del(&req->hash_node);
++ spin_unlock(&ctx->completion_lock);
++
++ if (!ret)
++ io_req_task_submit(req, locked);
++ else
++ io_req_complete_failed(req, ret);
++}
++
++static void __io_poll_execute(struct io_kiocb *req, int mask,
++ __poll_t __maybe_unused events)
++{
++ req->result = mask;
++ /*
++ * This is useful for poll that is armed on behalf of another
++ * request, and where the wakeup path could be on a different
++ * CPU. We want to avoid pulling in req->apoll->events for that
++ * case.
++ */
++ if (req->opcode == IORING_OP_POLL_ADD)
++ req->io_task_work.func = io_poll_task_func;
++ else
++ req->io_task_work.func = io_apoll_task_func;
++
++ trace_io_uring_task_add(req->ctx, req, req->user_data, req->opcode, mask);
++ io_req_task_work_add(req, false);
++}
++
++static inline void io_poll_execute(struct io_kiocb *req, int res,
++ __poll_t events)
++{
++ if (io_poll_get_ownership(req))
++ __io_poll_execute(req, res, events);
++}
++
++static void io_poll_cancel_req(struct io_kiocb *req)
++{
++ io_poll_mark_cancelled(req);
++ /* kick tw, which should complete the request */
++ io_poll_execute(req, 0, 0);
++}
++
++#define wqe_to_req(wait) ((void *)((unsigned long) (wait)->private & ~1))
++#define wqe_is_double(wait) ((unsigned long) (wait)->private & 1)
++#define IO_ASYNC_POLL_COMMON (EPOLLONESHOT | POLLPRI)
++
++static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
++ void *key)
++{
++ struct io_kiocb *req = wqe_to_req(wait);
++ struct io_poll_iocb *poll = container_of(wait, struct io_poll_iocb,
++ wait);
++ __poll_t mask = key_to_poll(key);
++
++ if (unlikely(mask & POLLFREE)) {
++ io_poll_mark_cancelled(req);
++ /* we have to kick tw in case it's not already */
++ io_poll_execute(req, 0, poll->events);
++
++ /*
++ * If the waitqueue is being freed early but someone is already
++ * holds ownership over it, we have to tear down the request as
++ * best we can. That means immediately removing the request from
++ * its waitqueue and preventing all further accesses to the
++ * waitqueue via the request.
++ */
++ list_del_init(&poll->wait.entry);
++
++ /*
++ * Careful: this *must* be the last step, since as soon
++ * as req->head is NULL'ed out, the request can be
++ * completed and freed, since aio_poll_complete_work()
++ * will no longer need to take the waitqueue lock.
++ */
++ smp_store_release(&poll->head, NULL);
++ return 1;
++ }
++
++ /* for instances that support it check for an event match first */
++ if (mask && !(mask & (poll->events & ~IO_ASYNC_POLL_COMMON)))
++ return 0;
++
++ if (io_poll_get_ownership(req)) {
++ /* optional, saves extra locking for removal in tw handler */
++ if (mask && poll->events & EPOLLONESHOT) {
++ list_del_init(&poll->wait.entry);
++ poll->head = NULL;
++ if (wqe_is_double(wait))
++ req->flags &= ~REQ_F_DOUBLE_POLL;
++ else
++ req->flags &= ~REQ_F_SINGLE_POLL;
++ }
++ __io_poll_execute(req, mask, poll->events);
++ }
++ return 1;
++}
++
++static void __io_queue_proc(struct io_poll_iocb *poll, struct io_poll_table *pt,
++ struct wait_queue_head *head,
++ struct io_poll_iocb **poll_ptr)
++{
++ struct io_kiocb *req = pt->req;
++ unsigned long wqe_private = (unsigned long) req;
++
++ /*
++ * The file being polled uses multiple waitqueues for poll handling
++ * (e.g. one for read, one for write). Setup a separate io_poll_iocb
++ * if this happens.
++ */
++ if (unlikely(pt->nr_entries)) {
++ struct io_poll_iocb *first = poll;
++
++ /* double add on the same waitqueue head, ignore */
++ if (first->head == head)
++ return;
++ /* already have a 2nd entry, fail a third attempt */
++ if (*poll_ptr) {
++ if ((*poll_ptr)->head == head)
++ return;
++ pt->error = -EINVAL;
++ return;
++ }
++
++ poll = kmalloc(sizeof(*poll), GFP_ATOMIC);
++ if (!poll) {
++ pt->error = -ENOMEM;
++ return;
++ }
++ /* mark as double wq entry */
++ wqe_private |= 1;
++ req->flags |= REQ_F_DOUBLE_POLL;
++ io_init_poll_iocb(poll, first->events, first->wait.func);
++ *poll_ptr = poll;
++ if (req->opcode == IORING_OP_POLL_ADD)
++ req->flags |= REQ_F_ASYNC_DATA;
++ }
++
++ req->flags |= REQ_F_SINGLE_POLL;
++ pt->nr_entries++;
++ poll->head = head;
++ poll->wait.private = (void *) wqe_private;
++
++ if (poll->events & EPOLLEXCLUSIVE)
++ add_wait_queue_exclusive(head, &poll->wait);
++ else
++ add_wait_queue(head, &poll->wait);
++}
++
++static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
++ struct poll_table_struct *p)
++{
++ struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
++
++ __io_queue_proc(&pt->req->poll, pt, head,
++ (struct io_poll_iocb **) &pt->req->async_data);
++}
++
++static int __io_arm_poll_handler(struct io_kiocb *req,
++ struct io_poll_iocb *poll,
++ struct io_poll_table *ipt, __poll_t mask)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ int v;
++
++ INIT_HLIST_NODE(&req->hash_node);
++ io_init_poll_iocb(poll, mask, io_poll_wake);
++ poll->file = req->file;
++
++ req->apoll_events = poll->events;
++
++ ipt->pt._key = mask;
++ ipt->req = req;
++ ipt->error = 0;
++ ipt->nr_entries = 0;
++
++ /*
++ * Take the ownership to delay any tw execution up until we're done
++ * with poll arming. see io_poll_get_ownership().
++ */
++ atomic_set(&req->poll_refs, 1);
++ mask = vfs_poll(req->file, &ipt->pt) & poll->events;
++
++ if (mask && (poll->events & EPOLLONESHOT)) {
++ io_poll_remove_entries(req);
++ /* no one else has access to the req, forget about the ref */
++ return mask;
++ }
++ if (!mask && unlikely(ipt->error || !ipt->nr_entries)) {
++ io_poll_remove_entries(req);
++ if (!ipt->error)
++ ipt->error = -EINVAL;
++ return 0;
++ }
++
++ spin_lock(&ctx->completion_lock);
++ io_poll_req_insert(req);
++ spin_unlock(&ctx->completion_lock);
++
++ if (mask) {
++ /* can't multishot if failed, just queue the event we've got */
++ if (unlikely(ipt->error || !ipt->nr_entries)) {
++ poll->events |= EPOLLONESHOT;
++ req->apoll_events |= EPOLLONESHOT;
++ ipt->error = 0;
++ }
++ __io_poll_execute(req, mask, poll->events);
++ return 0;
++ }
++
++ /*
++ * Release ownership. If someone tried to queue a tw while it was
++ * locked, kick it off for them.
++ */
++ v = atomic_dec_return(&req->poll_refs);
++ if (unlikely(v & IO_POLL_REF_MASK))
++ __io_poll_execute(req, 0, poll->events);
++ return 0;
++}
++
++static void io_async_queue_proc(struct file *file, struct wait_queue_head *head,
++ struct poll_table_struct *p)
++{
++ struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
++ struct async_poll *apoll = pt->req->apoll;
++
++ __io_queue_proc(&apoll->poll, pt, head, &apoll->double_poll);
++}
++
++enum {
++ IO_APOLL_OK,
++ IO_APOLL_ABORTED,
++ IO_APOLL_READY
++};
++
++static int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
++{
++ const struct io_op_def *def = &io_op_defs[req->opcode];
++ struct io_ring_ctx *ctx = req->ctx;
++ struct async_poll *apoll;
++ struct io_poll_table ipt;
++ __poll_t mask = IO_ASYNC_POLL_COMMON | POLLERR;
++ int ret;
++
++ if (!def->pollin && !def->pollout)
++ return IO_APOLL_ABORTED;
++ if (!file_can_poll(req->file) || (req->flags & REQ_F_POLLED))
++ return IO_APOLL_ABORTED;
++
++ if (def->pollin) {
++ mask |= POLLIN | POLLRDNORM;
++
++ /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
++ if ((req->opcode == IORING_OP_RECVMSG) &&
++ (req->sr_msg.msg_flags & MSG_ERRQUEUE))
++ mask &= ~POLLIN;
++ } else {
++ mask |= POLLOUT | POLLWRNORM;
++ }
++ if (def->poll_exclusive)
++ mask |= EPOLLEXCLUSIVE;
++ if (!(issue_flags & IO_URING_F_UNLOCKED) &&
++ !list_empty(&ctx->apoll_cache)) {
++ apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
++ poll.wait.entry);
++ list_del_init(&apoll->poll.wait.entry);
++ } else {
++ apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
++ if (unlikely(!apoll))
++ return IO_APOLL_ABORTED;
++ }
++ apoll->double_poll = NULL;
++ req->apoll = apoll;
++ req->flags |= REQ_F_POLLED;
++ ipt.pt._qproc = io_async_queue_proc;
++
++ io_kbuf_recycle(req, issue_flags);
++
++ ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask);
++ if (ret || ipt.error)
++ return ret ? IO_APOLL_READY : IO_APOLL_ABORTED;
++
++ trace_io_uring_poll_arm(ctx, req, req->user_data, req->opcode,
++ mask, apoll->poll.events);
++ return IO_APOLL_OK;
++}
++
++/*
++ * Returns true if we found and killed one or more poll requests
++ */
++static __cold bool io_poll_remove_all(struct io_ring_ctx *ctx,
++ struct task_struct *tsk, bool cancel_all)
++{
++ struct hlist_node *tmp;
++ struct io_kiocb *req;
++ bool found = false;
++ int i;
++
++ spin_lock(&ctx->completion_lock);
++ for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
++ struct hlist_head *list;
++
++ list = &ctx->cancel_hash[i];
++ hlist_for_each_entry_safe(req, tmp, list, hash_node) {
++ if (io_match_task_safe(req, tsk, cancel_all)) {
++ hlist_del_init(&req->hash_node);
++ io_poll_cancel_req(req);
++ found = true;
++ }
++ }
++ }
++ spin_unlock(&ctx->completion_lock);
++ return found;
++}
++
++static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, __u64 sqe_addr,
++ bool poll_only)
++ __must_hold(&ctx->completion_lock)
++{
++ struct hlist_head *list;
++ struct io_kiocb *req;
++
++ list = &ctx->cancel_hash[hash_long(sqe_addr, ctx->cancel_hash_bits)];
++ hlist_for_each_entry(req, list, hash_node) {
++ if (sqe_addr != req->user_data)
++ continue;
++ if (poll_only && req->opcode != IORING_OP_POLL_ADD)
++ continue;
++ return req;
++ }
++ return NULL;
++}
++
++static bool io_poll_disarm(struct io_kiocb *req)
++ __must_hold(&ctx->completion_lock)
++{
++ if (!io_poll_get_ownership(req))
++ return false;
++ io_poll_remove_entries(req);
++ hash_del(&req->hash_node);
++ return true;
++}
++
++static int io_poll_cancel(struct io_ring_ctx *ctx, __u64 sqe_addr,
++ bool poll_only)
++ __must_hold(&ctx->completion_lock)
++{
++ struct io_kiocb *req = io_poll_find(ctx, sqe_addr, poll_only);
++
++ if (!req)
++ return -ENOENT;
++ io_poll_cancel_req(req);
++ return 0;
++}
++
++static __poll_t io_poll_parse_events(const struct io_uring_sqe *sqe,
++ unsigned int flags)
++{
++ u32 events;
++
++ events = READ_ONCE(sqe->poll32_events);
++#ifdef __BIG_ENDIAN
++ events = swahw32(events);
++#endif
++ if (!(flags & IORING_POLL_ADD_MULTI))
++ events |= EPOLLONESHOT;
++ return demangle_poll(events) | (events & (EPOLLEXCLUSIVE|EPOLLONESHOT));
++}
++
++static int io_poll_update_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_poll_update *upd = &req->poll_update;
++ u32 flags;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
++ return -EINVAL;
++ flags = READ_ONCE(sqe->len);
++ if (flags & ~(IORING_POLL_UPDATE_EVENTS | IORING_POLL_UPDATE_USER_DATA |
++ IORING_POLL_ADD_MULTI))
++ return -EINVAL;
++ /* meaningless without update */
++ if (flags == IORING_POLL_ADD_MULTI)
++ return -EINVAL;
++
++ upd->old_user_data = READ_ONCE(sqe->addr);
++ upd->update_events = flags & IORING_POLL_UPDATE_EVENTS;
++ upd->update_user_data = flags & IORING_POLL_UPDATE_USER_DATA;
++
++ upd->new_user_data = READ_ONCE(sqe->off);
++ if (!upd->update_user_data && upd->new_user_data)
++ return -EINVAL;
++ if (upd->update_events)
++ upd->events = io_poll_parse_events(sqe, flags);
++ else if (sqe->poll32_events)
++ return -EINVAL;
++
++ return 0;
++}
++
++static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ struct io_poll_iocb *poll = &req->poll;
++ u32 flags;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->addr)
++ return -EINVAL;
++ flags = READ_ONCE(sqe->len);
++ if (flags & ~IORING_POLL_ADD_MULTI)
++ return -EINVAL;
++ if ((flags & IORING_POLL_ADD_MULTI) && (req->flags & REQ_F_CQE_SKIP))
++ return -EINVAL;
++
++ io_req_set_refcount(req);
++ poll->events = io_poll_parse_events(sqe, flags);
++ return 0;
++}
++
++static int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_poll_iocb *poll = &req->poll;
++ struct io_poll_table ipt;
++ int ret;
++
++ ipt.pt._qproc = io_poll_queue_proc;
++
++ ret = __io_arm_poll_handler(req, &req->poll, &ipt, poll->events);
++ if (!ret && ipt.error)
++ req_set_fail(req);
++ ret = ret ?: ipt.error;
++ if (ret)
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int io_poll_update(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_kiocb *preq;
++ int ret2, ret = 0;
++ bool locked;
++
++ spin_lock(&ctx->completion_lock);
++ preq = io_poll_find(ctx, req->poll_update.old_user_data, true);
++ if (!preq || !io_poll_disarm(preq)) {
++ spin_unlock(&ctx->completion_lock);
++ ret = preq ? -EALREADY : -ENOENT;
++ goto out;
++ }
++ spin_unlock(&ctx->completion_lock);
++
++ if (req->poll_update.update_events || req->poll_update.update_user_data) {
++ /* only mask one event flags, keep behavior flags */
++ if (req->poll_update.update_events) {
++ preq->poll.events &= ~0xffff;
++ preq->poll.events |= req->poll_update.events & 0xffff;
++ preq->poll.events |= IO_POLL_UNMASK;
++ }
++ if (req->poll_update.update_user_data)
++ preq->user_data = req->poll_update.new_user_data;
++
++ ret2 = io_poll_add(preq, issue_flags);
++ /* successfully updated, don't complete poll request */
++ if (!ret2)
++ goto out;
++ }
++
++ req_set_fail(preq);
++ preq->result = -ECANCELED;
++ locked = !(issue_flags & IO_URING_F_UNLOCKED);
++ io_req_task_complete(preq, &locked);
++out:
++ if (ret < 0)
++ req_set_fail(req);
++ /* complete update request, we're done with it */
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
++{
++ struct io_timeout_data *data = container_of(timer,
++ struct io_timeout_data, timer);
++ struct io_kiocb *req = data->req;
++ struct io_ring_ctx *ctx = req->ctx;
++ unsigned long flags;
++
++ spin_lock_irqsave(&ctx->timeout_lock, flags);
++ list_del_init(&req->timeout.list);
++ atomic_set(&req->ctx->cq_timeouts,
++ atomic_read(&req->ctx->cq_timeouts) + 1);
++ spin_unlock_irqrestore(&ctx->timeout_lock, flags);
++
++ if (!(data->flags & IORING_TIMEOUT_ETIME_SUCCESS))
++ req_set_fail(req);
++
++ req->result = -ETIME;
++ req->io_task_work.func = io_req_task_complete;
++ io_req_task_work_add(req, false);
++ return HRTIMER_NORESTART;
++}
++
++static struct io_kiocb *io_timeout_extract(struct io_ring_ctx *ctx,
++ __u64 user_data)
++ __must_hold(&ctx->timeout_lock)
++{
++ struct io_timeout_data *io;
++ struct io_kiocb *req;
++ bool found = false;
++
++ list_for_each_entry(req, &ctx->timeout_list, timeout.list) {
++ found = user_data == req->user_data;
++ if (found)
++ break;
++ }
++ if (!found)
++ return ERR_PTR(-ENOENT);
++
++ io = req->async_data;
++ if (hrtimer_try_to_cancel(&io->timer) == -1)
++ return ERR_PTR(-EALREADY);
++ list_del_init(&req->timeout.list);
++ return req;
++}
++
++static int io_timeout_cancel(struct io_ring_ctx *ctx, __u64 user_data)
++ __must_hold(&ctx->completion_lock)
++ __must_hold(&ctx->timeout_lock)
++{
++ struct io_kiocb *req = io_timeout_extract(ctx, user_data);
++
++ if (IS_ERR(req))
++ return PTR_ERR(req);
++ io_req_task_queue_fail(req, -ECANCELED);
++ return 0;
++}
++
++static clockid_t io_timeout_get_clock(struct io_timeout_data *data)
++{
++ switch (data->flags & IORING_TIMEOUT_CLOCK_MASK) {
++ case IORING_TIMEOUT_BOOTTIME:
++ return CLOCK_BOOTTIME;
++ case IORING_TIMEOUT_REALTIME:
++ return CLOCK_REALTIME;
++ default:
++ /* can't happen, vetted at prep time */
++ WARN_ON_ONCE(1);
++ fallthrough;
++ case 0:
++ return CLOCK_MONOTONIC;
++ }
++}
++
++static int io_linked_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
++ struct timespec64 *ts, enum hrtimer_mode mode)
++ __must_hold(&ctx->timeout_lock)
++{
++ struct io_timeout_data *io;
++ struct io_kiocb *req;
++ bool found = false;
++
++ list_for_each_entry(req, &ctx->ltimeout_list, timeout.list) {
++ found = user_data == req->user_data;
++ if (found)
++ break;
++ }
++ if (!found)
++ return -ENOENT;
++
++ io = req->async_data;
++ if (hrtimer_try_to_cancel(&io->timer) == -1)
++ return -EALREADY;
++ hrtimer_init(&io->timer, io_timeout_get_clock(io), mode);
++ io->timer.function = io_link_timeout_fn;
++ hrtimer_start(&io->timer, timespec64_to_ktime(*ts), mode);
++ return 0;
++}
++
++static int io_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
++ struct timespec64 *ts, enum hrtimer_mode mode)
++ __must_hold(&ctx->timeout_lock)
++{
++ struct io_kiocb *req = io_timeout_extract(ctx, user_data);
++ struct io_timeout_data *data;
++
++ if (IS_ERR(req))
++ return PTR_ERR(req);
++
++ req->timeout.off = 0; /* noseq */
++ data = req->async_data;
++ list_add_tail(&req->timeout.list, &ctx->timeout_list);
++ hrtimer_init(&data->timer, io_timeout_get_clock(data), mode);
++ data->timer.function = io_timeout_fn;
++ hrtimer_start(&data->timer, timespec64_to_ktime(*ts), mode);
++ return 0;
++}
++
++static int io_timeout_remove_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ struct io_timeout_rem *tr = &req->timeout_rem;
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->buf_index || sqe->len || sqe->splice_fd_in)
++ return -EINVAL;
++
++ tr->ltimeout = false;
++ tr->addr = READ_ONCE(sqe->addr);
++ tr->flags = READ_ONCE(sqe->timeout_flags);
++ if (tr->flags & IORING_TIMEOUT_UPDATE_MASK) {
++ if (hweight32(tr->flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
++ return -EINVAL;
++ if (tr->flags & IORING_LINK_TIMEOUT_UPDATE)
++ tr->ltimeout = true;
++ if (tr->flags & ~(IORING_TIMEOUT_UPDATE_MASK|IORING_TIMEOUT_ABS))
++ return -EINVAL;
++ if (get_timespec64(&tr->ts, u64_to_user_ptr(sqe->addr2)))
++ return -EFAULT;
++ if (tr->ts.tv_sec < 0 || tr->ts.tv_nsec < 0)
++ return -EINVAL;
++ } else if (tr->flags) {
++ /* timeout removal doesn't support flags */
++ return -EINVAL;
++ }
++
++ return 0;
++}
++
++static inline enum hrtimer_mode io_translate_timeout_mode(unsigned int flags)
++{
++ return (flags & IORING_TIMEOUT_ABS) ? HRTIMER_MODE_ABS
++ : HRTIMER_MODE_REL;
++}
++
++/*
++ * Remove or update an existing timeout command
++ */
++static int io_timeout_remove(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_timeout_rem *tr = &req->timeout_rem;
++ struct io_ring_ctx *ctx = req->ctx;
++ int ret;
++
++ if (!(req->timeout_rem.flags & IORING_TIMEOUT_UPDATE)) {
++ spin_lock(&ctx->completion_lock);
++ spin_lock_irq(&ctx->timeout_lock);
++ ret = io_timeout_cancel(ctx, tr->addr);
++ spin_unlock_irq(&ctx->timeout_lock);
++ spin_unlock(&ctx->completion_lock);
++ } else {
++ enum hrtimer_mode mode = io_translate_timeout_mode(tr->flags);
++
++ spin_lock_irq(&ctx->timeout_lock);
++ if (tr->ltimeout)
++ ret = io_linked_timeout_update(ctx, tr->addr, &tr->ts, mode);
++ else
++ ret = io_timeout_update(ctx, tr->addr, &tr->ts, mode);
++ spin_unlock_irq(&ctx->timeout_lock);
++ }
++
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete_post(req, ret, 0);
++ return 0;
++}
++
++static int io_timeout_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
++ bool is_timeout_link)
++{
++ struct io_timeout_data *data;
++ unsigned flags;
++ u32 off = READ_ONCE(sqe->off);
++
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->buf_index || sqe->len != 1 ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++ if (off && is_timeout_link)
++ return -EINVAL;
++ flags = READ_ONCE(sqe->timeout_flags);
++ if (flags & ~(IORING_TIMEOUT_ABS | IORING_TIMEOUT_CLOCK_MASK |
++ IORING_TIMEOUT_ETIME_SUCCESS))
++ return -EINVAL;
++ /* more than one clock specified is invalid, obviously */
++ if (hweight32(flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
++ return -EINVAL;
++
++ INIT_LIST_HEAD(&req->timeout.list);
++ req->timeout.off = off;
++ if (unlikely(off && !req->ctx->off_timeout_used))
++ req->ctx->off_timeout_used = true;
++
++ if (WARN_ON_ONCE(req_has_async_data(req)))
++ return -EFAULT;
++ if (io_alloc_async_data(req))
++ return -ENOMEM;
++
++ data = req->async_data;
++ data->req = req;
++ data->flags = flags;
++
++ if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
++ return -EFAULT;
++
++ if (data->ts.tv_sec < 0 || data->ts.tv_nsec < 0)
++ return -EINVAL;
++
++ INIT_LIST_HEAD(&req->timeout.list);
++ data->mode = io_translate_timeout_mode(flags);
++ hrtimer_init(&data->timer, io_timeout_get_clock(data), data->mode);
++
++ if (is_timeout_link) {
++ struct io_submit_link *link = &req->ctx->submit_state.link;
++
++ if (!link->head)
++ return -EINVAL;
++ if (link->last->opcode == IORING_OP_LINK_TIMEOUT)
++ return -EINVAL;
++ req->timeout.head = link->last;
++ link->last->flags |= REQ_F_ARM_LTIMEOUT;
++ }
++ return 0;
++}
++
++static int io_timeout(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_timeout_data *data = req->async_data;
++ struct list_head *entry;
++ u32 tail, off = req->timeout.off;
++
++ spin_lock_irq(&ctx->timeout_lock);
++
++ /*
++ * sqe->off holds how many events that need to occur for this
++ * timeout event to be satisfied. If it isn't set, then this is
++ * a pure timeout request, sequence isn't used.
++ */
++ if (io_is_timeout_noseq(req)) {
++ entry = ctx->timeout_list.prev;
++ goto add;
++ }
++
++ tail = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
++ req->timeout.target_seq = tail + off;
++
++ /* Update the last seq here in case io_flush_timeouts() hasn't.
++ * This is safe because ->completion_lock is held, and submissions
++ * and completions are never mixed in the same ->completion_lock section.
++ */
++ ctx->cq_last_tm_flush = tail;
++
++ /*
++ * Insertion sort, ensuring the first entry in the list is always
++ * the one we need first.
++ */
++ list_for_each_prev(entry, &ctx->timeout_list) {
++ struct io_kiocb *nxt = list_entry(entry, struct io_kiocb,
++ timeout.list);
++
++ if (io_is_timeout_noseq(nxt))
++ continue;
++ /* nxt.seq is behind @tail, otherwise would've been completed */
++ if (off >= nxt->timeout.target_seq - tail)
++ break;
++ }
++add:
++ list_add(&req->timeout.list, entry);
++ data->timer.function = io_timeout_fn;
++ hrtimer_start(&data->timer, timespec64_to_ktime(data->ts), data->mode);
++ spin_unlock_irq(&ctx->timeout_lock);
++ return 0;
++}
++
++struct io_cancel_data {
++ struct io_ring_ctx *ctx;
++ u64 user_data;
++};
++
++static bool io_cancel_cb(struct io_wq_work *work, void *data)
++{
++ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
++ struct io_cancel_data *cd = data;
++
++ return req->ctx == cd->ctx && req->user_data == cd->user_data;
++}
++
++static int io_async_cancel_one(struct io_uring_task *tctx, u64 user_data,
++ struct io_ring_ctx *ctx)
++{
++ struct io_cancel_data data = { .ctx = ctx, .user_data = user_data, };
++ enum io_wq_cancel cancel_ret;
++ int ret = 0;
++
++ if (!tctx || !tctx->io_wq)
++ return -ENOENT;
++
++ cancel_ret = io_wq_cancel_cb(tctx->io_wq, io_cancel_cb, &data, false);
++ switch (cancel_ret) {
++ case IO_WQ_CANCEL_OK:
++ ret = 0;
++ break;
++ case IO_WQ_CANCEL_RUNNING:
++ ret = -EALREADY;
++ break;
++ case IO_WQ_CANCEL_NOTFOUND:
++ ret = -ENOENT;
++ break;
++ }
++
++ return ret;
++}
++
++static int io_try_cancel_userdata(struct io_kiocb *req, u64 sqe_addr)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ int ret;
++
++ WARN_ON_ONCE(!io_wq_current_is_worker() && req->task != current);
++
++ ret = io_async_cancel_one(req->task->io_uring, sqe_addr, ctx);
++ /*
++ * Fall-through even for -EALREADY, as we may have poll armed
++ * that need unarming.
++ */
++ if (!ret)
++ return 0;
++
++ spin_lock(&ctx->completion_lock);
++ ret = io_poll_cancel(ctx, sqe_addr, false);
++ if (ret != -ENOENT)
++ goto out;
++
++ spin_lock_irq(&ctx->timeout_lock);
++ ret = io_timeout_cancel(ctx, sqe_addr);
++ spin_unlock_irq(&ctx->timeout_lock);
++out:
++ spin_unlock(&ctx->completion_lock);
++ return ret;
++}
++
++static int io_async_cancel_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
++ return -EINVAL;
++ if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->off || sqe->len || sqe->cancel_flags ||
++ sqe->splice_fd_in)
++ return -EINVAL;
++
++ req->cancel.addr = READ_ONCE(sqe->addr);
++ return 0;
++}
++
++static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ u64 sqe_addr = req->cancel.addr;
++ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++ struct io_tctx_node *node;
++ int ret;
++
++ ret = io_try_cancel_userdata(req, sqe_addr);
++ if (ret != -ENOENT)
++ goto done;
++
++ /* slow path, try all io-wq's */
++ io_ring_submit_lock(ctx, needs_lock);
++ ret = -ENOENT;
++ list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
++ struct io_uring_task *tctx = node->task->io_uring;
++
++ ret = io_async_cancel_one(tctx, req->cancel.addr, ctx);
++ if (ret != -ENOENT)
++ break;
++ }
++ io_ring_submit_unlock(ctx, needs_lock);
++done:
++ if (ret < 0)
++ req_set_fail(req);
++ io_req_complete_post(req, ret, 0);
++ return 0;
++}
++
++static int io_rsrc_update_prep(struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++{
++ if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
++ return -EINVAL;
++ if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
++ return -EINVAL;
++
++ req->rsrc_update.offset = READ_ONCE(sqe->off);
++ req->rsrc_update.nr_args = READ_ONCE(sqe->len);
++ if (!req->rsrc_update.nr_args)
++ return -EINVAL;
++ req->rsrc_update.arg = READ_ONCE(sqe->addr);
++ return 0;
++}
++
++static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++ struct io_uring_rsrc_update2 up;
++ int ret;
++
++ up.offset = req->rsrc_update.offset;
++ up.data = req->rsrc_update.arg;
++ up.nr = 0;
++ up.tags = 0;
++ up.resv = 0;
++ up.resv2 = 0;
++
++ io_ring_submit_lock(ctx, needs_lock);
++ ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE,
++ &up, req->rsrc_update.nr_args);
++ io_ring_submit_unlock(ctx, needs_lock);
++
++ if (ret < 0)
++ req_set_fail(req);
++ __io_req_complete(req, issue_flags, ret, 0);
++ return 0;
++}
++
++static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
++{
++ switch (req->opcode) {
++ case IORING_OP_NOP:
++ return 0;
++ case IORING_OP_READV:
++ case IORING_OP_READ_FIXED:
++ case IORING_OP_READ:
++ case IORING_OP_WRITEV:
++ case IORING_OP_WRITE_FIXED:
++ case IORING_OP_WRITE:
++ return io_prep_rw(req, sqe);
++ case IORING_OP_POLL_ADD:
++ return io_poll_add_prep(req, sqe);
++ case IORING_OP_POLL_REMOVE:
++ return io_poll_update_prep(req, sqe);
++ case IORING_OP_FSYNC:
++ return io_fsync_prep(req, sqe);
++ case IORING_OP_SYNC_FILE_RANGE:
++ return io_sfr_prep(req, sqe);
++ case IORING_OP_SENDMSG:
++ case IORING_OP_SEND:
++ return io_sendmsg_prep(req, sqe);
++ case IORING_OP_RECVMSG:
++ case IORING_OP_RECV:
++ return io_recvmsg_prep(req, sqe);
++ case IORING_OP_CONNECT:
++ return io_connect_prep(req, sqe);
++ case IORING_OP_TIMEOUT:
++ return io_timeout_prep(req, sqe, false);
++ case IORING_OP_TIMEOUT_REMOVE:
++ return io_timeout_remove_prep(req, sqe);
++ case IORING_OP_ASYNC_CANCEL:
++ return io_async_cancel_prep(req, sqe);
++ case IORING_OP_LINK_TIMEOUT:
++ return io_timeout_prep(req, sqe, true);
++ case IORING_OP_ACCEPT:
++ return io_accept_prep(req, sqe);
++ case IORING_OP_FALLOCATE:
++ return io_fallocate_prep(req, sqe);
++ case IORING_OP_OPENAT:
++ return io_openat_prep(req, sqe);
++ case IORING_OP_CLOSE:
++ return io_close_prep(req, sqe);
++ case IORING_OP_FILES_UPDATE:
++ return io_rsrc_update_prep(req, sqe);
++ case IORING_OP_STATX:
++ return io_statx_prep(req, sqe);
++ case IORING_OP_FADVISE:
++ return io_fadvise_prep(req, sqe);
++ case IORING_OP_MADVISE:
++ return io_madvise_prep(req, sqe);
++ case IORING_OP_OPENAT2:
++ return io_openat2_prep(req, sqe);
++ case IORING_OP_EPOLL_CTL:
++ return io_epoll_ctl_prep(req, sqe);
++ case IORING_OP_SPLICE:
++ return io_splice_prep(req, sqe);
++ case IORING_OP_PROVIDE_BUFFERS:
++ return io_provide_buffers_prep(req, sqe);
++ case IORING_OP_REMOVE_BUFFERS:
++ return io_remove_buffers_prep(req, sqe);
++ case IORING_OP_TEE:
++ return io_tee_prep(req, sqe);
++ case IORING_OP_SHUTDOWN:
++ return io_shutdown_prep(req, sqe);
++ case IORING_OP_RENAMEAT:
++ return io_renameat_prep(req, sqe);
++ case IORING_OP_UNLINKAT:
++ return io_unlinkat_prep(req, sqe);
++ case IORING_OP_MKDIRAT:
++ return io_mkdirat_prep(req, sqe);
++ case IORING_OP_SYMLINKAT:
++ return io_symlinkat_prep(req, sqe);
++ case IORING_OP_LINKAT:
++ return io_linkat_prep(req, sqe);
++ case IORING_OP_MSG_RING:
++ return io_msg_ring_prep(req, sqe);
++ }
++
++ printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
++ req->opcode);
++ return -EINVAL;
++}
++
++static int io_req_prep_async(struct io_kiocb *req)
++{
++ const struct io_op_def *def = &io_op_defs[req->opcode];
++
++ /* assign early for deferred execution for non-fixed file */
++ if (def->needs_file && !(req->flags & REQ_F_FIXED_FILE))
++ req->file = io_file_get_normal(req, req->fd);
++ if (!def->needs_async_setup)
++ return 0;
++ if (WARN_ON_ONCE(req_has_async_data(req)))
++ return -EFAULT;
++ if (io_alloc_async_data(req))
++ return -EAGAIN;
++
++ switch (req->opcode) {
++ case IORING_OP_READV:
++ return io_rw_prep_async(req, READ);
++ case IORING_OP_WRITEV:
++ return io_rw_prep_async(req, WRITE);
++ case IORING_OP_SENDMSG:
++ return io_sendmsg_prep_async(req);
++ case IORING_OP_RECVMSG:
++ return io_recvmsg_prep_async(req);
++ case IORING_OP_CONNECT:
++ return io_connect_prep_async(req);
++ }
++ printk_once(KERN_WARNING "io_uring: prep_async() bad opcode %d\n",
++ req->opcode);
++ return -EFAULT;
++}
++
++static u32 io_get_sequence(struct io_kiocb *req)
++{
++ u32 seq = req->ctx->cached_sq_head;
++
++ /* need original cached_sq_head, but it was increased for each req */
++ io_for_each_link(req, req)
++ seq--;
++ return seq;
++}
++
++static __cold void io_drain_req(struct io_kiocb *req)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_defer_entry *de;
++ int ret;
++ u32 seq = io_get_sequence(req);
++
++ /* Still need defer if there is pending req in defer list. */
++ spin_lock(&ctx->completion_lock);
++ if (!req_need_defer(req, seq) && list_empty_careful(&ctx->defer_list)) {
++ spin_unlock(&ctx->completion_lock);
++queue:
++ ctx->drain_active = false;
++ io_req_task_queue(req);
++ return;
++ }
++ spin_unlock(&ctx->completion_lock);
++
++ ret = io_req_prep_async(req);
++ if (ret) {
++fail:
++ io_req_complete_failed(req, ret);
++ return;
++ }
++ io_prep_async_link(req);
++ de = kmalloc(sizeof(*de), GFP_KERNEL);
++ if (!de) {
++ ret = -ENOMEM;
++ goto fail;
++ }
++
++ spin_lock(&ctx->completion_lock);
++ if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
++ spin_unlock(&ctx->completion_lock);
++ kfree(de);
++ goto queue;
++ }
++
++ trace_io_uring_defer(ctx, req, req->user_data, req->opcode);
++ de->req = req;
++ de->seq = seq;
++ list_add_tail(&de->list, &ctx->defer_list);
++ spin_unlock(&ctx->completion_lock);
++}
++
++static void io_clean_op(struct io_kiocb *req)
++{
++ if (req->flags & REQ_F_BUFFER_SELECTED) {
++ spin_lock(&req->ctx->completion_lock);
++ io_put_kbuf_comp(req);
++ spin_unlock(&req->ctx->completion_lock);
++ }
++
++ if (req->flags & REQ_F_NEED_CLEANUP) {
++ switch (req->opcode) {
++ case IORING_OP_READV:
++ case IORING_OP_READ_FIXED:
++ case IORING_OP_READ:
++ case IORING_OP_WRITEV:
++ case IORING_OP_WRITE_FIXED:
++ case IORING_OP_WRITE: {
++ struct io_async_rw *io = req->async_data;
++
++ kfree(io->free_iovec);
++ break;
++ }
++ case IORING_OP_RECVMSG:
++ case IORING_OP_SENDMSG: {
++ struct io_async_msghdr *io = req->async_data;
++
++ kfree(io->free_iov);
++ break;
++ }
++ case IORING_OP_OPENAT:
++ case IORING_OP_OPENAT2:
++ if (req->open.filename)
++ putname(req->open.filename);
++ break;
++ case IORING_OP_RENAMEAT:
++ putname(req->rename.oldpath);
++ putname(req->rename.newpath);
++ break;
++ case IORING_OP_UNLINKAT:
++ putname(req->unlink.filename);
++ break;
++ case IORING_OP_MKDIRAT:
++ putname(req->mkdir.filename);
++ break;
++ case IORING_OP_SYMLINKAT:
++ putname(req->symlink.oldpath);
++ putname(req->symlink.newpath);
++ break;
++ case IORING_OP_LINKAT:
++ putname(req->hardlink.oldpath);
++ putname(req->hardlink.newpath);
++ break;
++ case IORING_OP_STATX:
++ if (req->statx.filename)
++ putname(req->statx.filename);
++ break;
++ }
++ }
++ if ((req->flags & REQ_F_POLLED) && req->apoll) {
++ kfree(req->apoll->double_poll);
++ kfree(req->apoll);
++ req->apoll = NULL;
++ }
++ if (req->flags & REQ_F_INFLIGHT) {
++ struct io_uring_task *tctx = req->task->io_uring;
++
++ atomic_dec(&tctx->inflight_tracked);
++ }
++ if (req->flags & REQ_F_CREDS)
++ put_cred(req->creds);
++ if (req->flags & REQ_F_ASYNC_DATA) {
++ kfree(req->async_data);
++ req->async_data = NULL;
++ }
++ req->flags &= ~IO_REQ_CLEAN_FLAGS;
++}
++
++static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
++{
++ if (req->file || !io_op_defs[req->opcode].needs_file)
++ return true;
++
++ if (req->flags & REQ_F_FIXED_FILE)
++ req->file = io_file_get_fixed(req, req->fd, issue_flags);
++ else
++ req->file = io_file_get_normal(req, req->fd);
++ if (req->file)
++ return true;
++
++ req_set_fail(req);
++ req->result = -EBADF;
++ return false;
++}
++
++static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
++{
++ const struct cred *creds = NULL;
++ int ret;
++
++ if (unlikely(!io_assign_file(req, issue_flags)))
++ return -EBADF;
++
++ if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred()))
++ creds = override_creds(req->creds);
++
++ if (!io_op_defs[req->opcode].audit_skip)
++ audit_uring_entry(req->opcode);
++
++ switch (req->opcode) {
++ case IORING_OP_NOP:
++ ret = io_nop(req, issue_flags);
++ break;
++ case IORING_OP_READV:
++ case IORING_OP_READ_FIXED:
++ case IORING_OP_READ:
++ ret = io_read(req, issue_flags);
++ break;
++ case IORING_OP_WRITEV:
++ case IORING_OP_WRITE_FIXED:
++ case IORING_OP_WRITE:
++ ret = io_write(req, issue_flags);
++ break;
++ case IORING_OP_FSYNC:
++ ret = io_fsync(req, issue_flags);
++ break;
++ case IORING_OP_POLL_ADD:
++ ret = io_poll_add(req, issue_flags);
++ break;
++ case IORING_OP_POLL_REMOVE:
++ ret = io_poll_update(req, issue_flags);
++ break;
++ case IORING_OP_SYNC_FILE_RANGE:
++ ret = io_sync_file_range(req, issue_flags);
++ break;
++ case IORING_OP_SENDMSG:
++ ret = io_sendmsg(req, issue_flags);
++ break;
++ case IORING_OP_SEND:
++ ret = io_send(req, issue_flags);
++ break;
++ case IORING_OP_RECVMSG:
++ ret = io_recvmsg(req, issue_flags);
++ break;
++ case IORING_OP_RECV:
++ ret = io_recv(req, issue_flags);
++ break;
++ case IORING_OP_TIMEOUT:
++ ret = io_timeout(req, issue_flags);
++ break;
++ case IORING_OP_TIMEOUT_REMOVE:
++ ret = io_timeout_remove(req, issue_flags);
++ break;
++ case IORING_OP_ACCEPT:
++ ret = io_accept(req, issue_flags);
++ break;
++ case IORING_OP_CONNECT:
++ ret = io_connect(req, issue_flags);
++ break;
++ case IORING_OP_ASYNC_CANCEL:
++ ret = io_async_cancel(req, issue_flags);
++ break;
++ case IORING_OP_FALLOCATE:
++ ret = io_fallocate(req, issue_flags);
++ break;
++ case IORING_OP_OPENAT:
++ ret = io_openat(req, issue_flags);
++ break;
++ case IORING_OP_CLOSE:
++ ret = io_close(req, issue_flags);
++ break;
++ case IORING_OP_FILES_UPDATE:
++ ret = io_files_update(req, issue_flags);
++ break;
++ case IORING_OP_STATX:
++ ret = io_statx(req, issue_flags);
++ break;
++ case IORING_OP_FADVISE:
++ ret = io_fadvise(req, issue_flags);
++ break;
++ case IORING_OP_MADVISE:
++ ret = io_madvise(req, issue_flags);
++ break;
++ case IORING_OP_OPENAT2:
++ ret = io_openat2(req, issue_flags);
++ break;
++ case IORING_OP_EPOLL_CTL:
++ ret = io_epoll_ctl(req, issue_flags);
++ break;
++ case IORING_OP_SPLICE:
++ ret = io_splice(req, issue_flags);
++ break;
++ case IORING_OP_PROVIDE_BUFFERS:
++ ret = io_provide_buffers(req, issue_flags);
++ break;
++ case IORING_OP_REMOVE_BUFFERS:
++ ret = io_remove_buffers(req, issue_flags);
++ break;
++ case IORING_OP_TEE:
++ ret = io_tee(req, issue_flags);
++ break;
++ case IORING_OP_SHUTDOWN:
++ ret = io_shutdown(req, issue_flags);
++ break;
++ case IORING_OP_RENAMEAT:
++ ret = io_renameat(req, issue_flags);
++ break;
++ case IORING_OP_UNLINKAT:
++ ret = io_unlinkat(req, issue_flags);
++ break;
++ case IORING_OP_MKDIRAT:
++ ret = io_mkdirat(req, issue_flags);
++ break;
++ case IORING_OP_SYMLINKAT:
++ ret = io_symlinkat(req, issue_flags);
++ break;
++ case IORING_OP_LINKAT:
++ ret = io_linkat(req, issue_flags);
++ break;
++ case IORING_OP_MSG_RING:
++ ret = io_msg_ring(req, issue_flags);
++ break;
++ default:
++ ret = -EINVAL;
++ break;
++ }
++
++ if (!io_op_defs[req->opcode].audit_skip)
++ audit_uring_exit(!ret, ret);
++
++ if (creds)
++ revert_creds(creds);
++ if (ret)
++ return ret;
++ /* If the op doesn't have a file, we're not polling for it */
++ if ((req->ctx->flags & IORING_SETUP_IOPOLL) && req->file)
++ io_iopoll_req_issued(req, issue_flags);
++
++ return 0;
++}
++
++static struct io_wq_work *io_wq_free_work(struct io_wq_work *work)
++{
++ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
++
++ req = io_put_req_find_next(req);
++ return req ? &req->work : NULL;
++}
++
++static void io_wq_submit_work(struct io_wq_work *work)
++{
++ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
++ const struct io_op_def *def = &io_op_defs[req->opcode];
++ unsigned int issue_flags = IO_URING_F_UNLOCKED;
++ bool needs_poll = false;
++ struct io_kiocb *timeout;
++ int ret = 0, err = -ECANCELED;
++
++ /* one will be dropped by ->io_free_work() after returning to io-wq */
++ if (!(req->flags & REQ_F_REFCOUNT))
++ __io_req_set_refcount(req, 2);
++ else
++ req_ref_get(req);
++
++ timeout = io_prep_linked_timeout(req);
++ if (timeout)
++ io_queue_linked_timeout(timeout);
++
++
++ /* either cancelled or io-wq is dying, so don't touch tctx->iowq */
++ if (work->flags & IO_WQ_WORK_CANCEL) {
++fail:
++ io_req_task_queue_fail(req, err);
++ return;
++ }
++ if (!io_assign_file(req, issue_flags)) {
++ err = -EBADF;
++ work->flags |= IO_WQ_WORK_CANCEL;
++ goto fail;
++ }
++
++ if (req->flags & REQ_F_FORCE_ASYNC) {
++ bool opcode_poll = def->pollin || def->pollout;
++
++ if (opcode_poll && file_can_poll(req->file)) {
++ needs_poll = true;
++ issue_flags |= IO_URING_F_NONBLOCK;
++ }
++ }
++
++ do {
++ ret = io_issue_sqe(req, issue_flags);
++ if (ret != -EAGAIN)
++ break;
++ /*
++ * We can get EAGAIN for iopolled IO even though we're
++ * forcing a sync submission from here, since we can't
++ * wait for request slots on the block side.
++ */
++ if (!needs_poll) {
++ if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
++ break;
++ cond_resched();
++ continue;
++ }
++
++ if (io_arm_poll_handler(req, issue_flags) == IO_APOLL_OK)
++ return;
++ /* aborted or ready, in either case retry blocking */
++ needs_poll = false;
++ issue_flags &= ~IO_URING_F_NONBLOCK;
++ } while (1);
++
++ /* avoid locking problems by failing it from a clean context */
++ if (ret)
++ io_req_task_queue_fail(req, ret);
++}
++
++static inline struct io_fixed_file *io_fixed_file_slot(struct io_file_table *table,
++ unsigned i)
++{
++ return &table->files[i];
++}
++
++static inline struct file *io_file_from_index(struct io_ring_ctx *ctx,
++ int index)
++{
++ struct io_fixed_file *slot = io_fixed_file_slot(&ctx->file_table, index);
++
++ return (struct file *) (slot->file_ptr & FFS_MASK);
++}
++
++static void io_fixed_file_set(struct io_fixed_file *file_slot, struct file *file)
++{
++ unsigned long file_ptr = (unsigned long) file;
++
++ file_ptr |= io_file_get_flags(file);
++ file_slot->file_ptr = file_ptr;
++}
++
++static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
++ unsigned int issue_flags)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct file *file = NULL;
++ unsigned long file_ptr;
++
++ if (issue_flags & IO_URING_F_UNLOCKED)
++ mutex_lock(&ctx->uring_lock);
++
++ if (unlikely((unsigned int)fd >= ctx->nr_user_files))
++ goto out;
++ fd = array_index_nospec(fd, ctx->nr_user_files);
++ file_ptr = io_fixed_file_slot(&ctx->file_table, fd)->file_ptr;
++ file = (struct file *) (file_ptr & FFS_MASK);
++ file_ptr &= ~FFS_MASK;
++ /* mask in overlapping REQ_F and FFS bits */
++ req->flags |= (file_ptr << REQ_F_SUPPORT_NOWAIT_BIT);
++ io_req_set_rsrc_node(req, ctx, 0);
++out:
++ if (issue_flags & IO_URING_F_UNLOCKED)
++ mutex_unlock(&ctx->uring_lock);
++ return file;
++}
++
++static struct file *io_file_get_normal(struct io_kiocb *req, int fd)
++{
++ struct file *file = fget(fd);
++
++ trace_io_uring_file_get(req->ctx, req, req->user_data, fd);
++
++ /* we don't allow fixed io_uring files */
++ if (file && file->f_op == &io_uring_fops)
++ io_req_track_inflight(req);
++ return file;
++}
++
++static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
++{
++ struct io_kiocb *prev = req->timeout.prev;
++ int ret = -ENOENT;
++
++ if (prev) {
++ if (!(req->task->flags & PF_EXITING))
++ ret = io_try_cancel_userdata(req, prev->user_data);
++ io_req_complete_post(req, ret ?: -ETIME, 0);
++ io_put_req(prev);
++ } else {
++ io_req_complete_post(req, -ETIME, 0);
++ }
++}
++
++static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
++{
++ struct io_timeout_data *data = container_of(timer,
++ struct io_timeout_data, timer);
++ struct io_kiocb *prev, *req = data->req;
++ struct io_ring_ctx *ctx = req->ctx;
++ unsigned long flags;
++
++ spin_lock_irqsave(&ctx->timeout_lock, flags);
++ prev = req->timeout.head;
++ req->timeout.head = NULL;
++
++ /*
++ * We don't expect the list to be empty, that will only happen if we
++ * race with the completion of the linked work.
++ */
++ if (prev) {
++ io_remove_next_linked(prev);
++ if (!req_ref_inc_not_zero(prev))
++ prev = NULL;
++ }
++ list_del(&req->timeout.list);
++ req->timeout.prev = prev;
++ spin_unlock_irqrestore(&ctx->timeout_lock, flags);
++
++ req->io_task_work.func = io_req_task_link_timeout;
++ io_req_task_work_add(req, false);
++ return HRTIMER_NORESTART;
++}
++
++static void io_queue_linked_timeout(struct io_kiocb *req)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++
++ spin_lock_irq(&ctx->timeout_lock);
++ /*
++ * If the back reference is NULL, then our linked request finished
++ * before we got a chance to setup the timer
++ */
++ if (req->timeout.head) {
++ struct io_timeout_data *data = req->async_data;
++
++ data->timer.function = io_link_timeout_fn;
++ hrtimer_start(&data->timer, timespec64_to_ktime(data->ts),
++ data->mode);
++ list_add_tail(&req->timeout.list, &ctx->ltimeout_list);
++ }
++ spin_unlock_irq(&ctx->timeout_lock);
++ /* drop submission reference */
++ io_put_req(req);
++}
++
++static void io_queue_sqe_arm_apoll(struct io_kiocb *req)
++ __must_hold(&req->ctx->uring_lock)
++{
++ struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
++
++ switch (io_arm_poll_handler(req, 0)) {
++ case IO_APOLL_READY:
++ io_req_task_queue(req);
++ break;
++ case IO_APOLL_ABORTED:
++ /*
++ * Queued up for async execution, worker will release
++ * submit reference when the iocb is actually submitted.
++ */
++ io_queue_async_work(req, NULL);
++ break;
++ case IO_APOLL_OK:
++ break;
++ }
++
++ if (linked_timeout)
++ io_queue_linked_timeout(linked_timeout);
++}
++
++static inline void __io_queue_sqe(struct io_kiocb *req)
++ __must_hold(&req->ctx->uring_lock)
++{
++ struct io_kiocb *linked_timeout;
++ int ret;
++
++ ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
++
++ if (req->flags & REQ_F_COMPLETE_INLINE) {
++ io_req_add_compl_list(req);
++ return;
++ }
++ /*
++ * We async punt it if the file wasn't marked NOWAIT, or if the file
++ * doesn't support non-blocking read/write attempts
++ */
++ if (likely(!ret)) {
++ linked_timeout = io_prep_linked_timeout(req);
++ if (linked_timeout)
++ io_queue_linked_timeout(linked_timeout);
++ } else if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
++ io_queue_sqe_arm_apoll(req);
++ } else {
++ io_req_complete_failed(req, ret);
++ }
++}
++
++static void io_queue_sqe_fallback(struct io_kiocb *req)
++ __must_hold(&req->ctx->uring_lock)
++{
++ if (req->flags & REQ_F_FAIL) {
++ io_req_complete_fail_submit(req);
++ } else if (unlikely(req->ctx->drain_active)) {
++ io_drain_req(req);
++ } else {
++ int ret = io_req_prep_async(req);
++
++ if (unlikely(ret))
++ io_req_complete_failed(req, ret);
++ else
++ io_queue_async_work(req, NULL);
++ }
++}
++
++static inline void io_queue_sqe(struct io_kiocb *req)
++ __must_hold(&req->ctx->uring_lock)
++{
++ if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
++ __io_queue_sqe(req);
++ else
++ io_queue_sqe_fallback(req);
++}
++
++/*
++ * Check SQE restrictions (opcode and flags).
++ *
++ * Returns 'true' if SQE is allowed, 'false' otherwise.
++ */
++static inline bool io_check_restriction(struct io_ring_ctx *ctx,
++ struct io_kiocb *req,
++ unsigned int sqe_flags)
++{
++ if (!test_bit(req->opcode, ctx->restrictions.sqe_op))
++ return false;
++
++ if ((sqe_flags & ctx->restrictions.sqe_flags_required) !=
++ ctx->restrictions.sqe_flags_required)
++ return false;
++
++ if (sqe_flags & ~(ctx->restrictions.sqe_flags_allowed |
++ ctx->restrictions.sqe_flags_required))
++ return false;
++
++ return true;
++}
++
++static void io_init_req_drain(struct io_kiocb *req)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ struct io_kiocb *head = ctx->submit_state.link.head;
++
++ ctx->drain_active = true;
++ if (head) {
++ /*
++ * If we need to drain a request in the middle of a link, drain
++ * the head request and the next request/link after the current
++ * link. Considering sequential execution of links,
++ * REQ_F_IO_DRAIN will be maintained for every request of our
++ * link.
++ */
++ head->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
++ ctx->drain_next = true;
++ }
++}
++
++static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++ __must_hold(&ctx->uring_lock)
++{
++ unsigned int sqe_flags;
++ int personality;
++ u8 opcode;
++
++ /* req is partially pre-initialised, see io_preinit_req() */
++ req->opcode = opcode = READ_ONCE(sqe->opcode);
++ /* same numerical values with corresponding REQ_F_*, safe to copy */
++ req->flags = sqe_flags = READ_ONCE(sqe->flags);
++ req->user_data = READ_ONCE(sqe->user_data);
++ req->file = NULL;
++ req->fixed_rsrc_refs = NULL;
++ req->task = current;
++
++ if (unlikely(opcode >= IORING_OP_LAST)) {
++ req->opcode = 0;
++ return -EINVAL;
++ }
++ if (unlikely(sqe_flags & ~SQE_COMMON_FLAGS)) {
++ /* enforce forwards compatibility on users */
++ if (sqe_flags & ~SQE_VALID_FLAGS)
++ return -EINVAL;
++ if ((sqe_flags & IOSQE_BUFFER_SELECT) &&
++ !io_op_defs[opcode].buffer_select)
++ return -EOPNOTSUPP;
++ if (sqe_flags & IOSQE_CQE_SKIP_SUCCESS)
++ ctx->drain_disabled = true;
++ if (sqe_flags & IOSQE_IO_DRAIN) {
++ if (ctx->drain_disabled)
++ return -EOPNOTSUPP;
++ io_init_req_drain(req);
++ }
++ }
++ if (unlikely(ctx->restricted || ctx->drain_active || ctx->drain_next)) {
++ if (ctx->restricted && !io_check_restriction(ctx, req, sqe_flags))
++ return -EACCES;
++ /* knock it to the slow queue path, will be drained there */
++ if (ctx->drain_active)
++ req->flags |= REQ_F_FORCE_ASYNC;
++ /* if there is no link, we're at "next" request and need to drain */
++ if (unlikely(ctx->drain_next) && !ctx->submit_state.link.head) {
++ ctx->drain_next = false;
++ ctx->drain_active = true;
++ req->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
++ }
++ }
++
++ if (io_op_defs[opcode].needs_file) {
++ struct io_submit_state *state = &ctx->submit_state;
++
++ req->fd = READ_ONCE(sqe->fd);
++
++ /*
++ * Plug now if we have more than 2 IO left after this, and the
++ * target is potentially a read/write to block based storage.
++ */
++ if (state->need_plug && io_op_defs[opcode].plug) {
++ state->plug_started = true;
++ state->need_plug = false;
++ blk_start_plug_nr_ios(&state->plug, state->submit_nr);
++ }
++ }
++
++ personality = READ_ONCE(sqe->personality);
++ if (personality) {
++ int ret;
++
++ req->creds = xa_load(&ctx->personalities, personality);
++ if (!req->creds)
++ return -EINVAL;
++ get_cred(req->creds);
++ ret = security_uring_override_creds(req->creds);
++ if (ret) {
++ put_cred(req->creds);
++ return ret;
++ }
++ req->flags |= REQ_F_CREDS;
++ }
++
++ return io_req_prep(req, sqe);
++}
++
++static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
++ const struct io_uring_sqe *sqe)
++ __must_hold(&ctx->uring_lock)
++{
++ struct io_submit_link *link = &ctx->submit_state.link;
++ int ret;
++
++ ret = io_init_req(ctx, req, sqe);
++ if (unlikely(ret)) {
++ trace_io_uring_req_failed(sqe, ctx, req, ret);
++
++ /* fail even hard links since we don't submit */
++ if (link->head) {
++ /*
++ * we can judge a link req is failed or cancelled by if
++ * REQ_F_FAIL is set, but the head is an exception since
++ * it may be set REQ_F_FAIL because of other req's failure
++ * so let's leverage req->result to distinguish if a head
++ * is set REQ_F_FAIL because of its failure or other req's
++ * failure so that we can set the correct ret code for it.
++ * init result here to avoid affecting the normal path.
++ */
++ if (!(link->head->flags & REQ_F_FAIL))
++ req_fail_link_node(link->head, -ECANCELED);
++ } else if (!(req->flags & (REQ_F_LINK | REQ_F_HARDLINK))) {
++ /*
++ * the current req is a normal req, we should return
++ * error and thus break the submittion loop.
++ */
++ io_req_complete_failed(req, ret);
++ return ret;
++ }
++ req_fail_link_node(req, ret);
++ }
++
++ /* don't need @sqe from now on */
++ trace_io_uring_submit_sqe(ctx, req, req->user_data, req->opcode,
++ req->flags, true,
++ ctx->flags & IORING_SETUP_SQPOLL);
++
++ /*
++ * If we already have a head request, queue this one for async
++ * submittal once the head completes. If we don't have a head but
++ * IOSQE_IO_LINK is set in the sqe, start a new head. This one will be
++ * submitted sync once the chain is complete. If none of those
++ * conditions are true (normal request), then just queue it.
++ */
++ if (link->head) {
++ struct io_kiocb *head = link->head;
++
++ if (!(req->flags & REQ_F_FAIL)) {
++ ret = io_req_prep_async(req);
++ if (unlikely(ret)) {
++ req_fail_link_node(req, ret);
++ if (!(head->flags & REQ_F_FAIL))
++ req_fail_link_node(head, -ECANCELED);
++ }
++ }
++ trace_io_uring_link(ctx, req, head);
++ link->last->link = req;
++ link->last = req;
++
++ if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK))
++ return 0;
++ /* last request of a link, enqueue the link */
++ link->head = NULL;
++ req = head;
++ } else if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
++ link->head = req;
++ link->last = req;
++ return 0;
++ }
++
++ io_queue_sqe(req);
++ return 0;
++}
++
++/*
++ * Batched submission is done, ensure local IO is flushed out.
++ */
++static void io_submit_state_end(struct io_ring_ctx *ctx)
++{
++ struct io_submit_state *state = &ctx->submit_state;
++
++ if (state->link.head)
++ io_queue_sqe(state->link.head);
++ /* flush only after queuing links as they can generate completions */
++ io_submit_flush_completions(ctx);
++ if (state->plug_started)
++ blk_finish_plug(&state->plug);
++}
++
++/*
++ * Start submission side cache.
++ */
++static void io_submit_state_start(struct io_submit_state *state,
++ unsigned int max_ios)
++{
++ state->plug_started = false;
++ state->need_plug = max_ios > 2;
++ state->submit_nr = max_ios;
++ /* set only head, no need to init link_last in advance */
++ state->link.head = NULL;
++}
++
++static void io_commit_sqring(struct io_ring_ctx *ctx)
++{
++ struct io_rings *rings = ctx->rings;
++
++ /*
++ * Ensure any loads from the SQEs are done at this point,
++ * since once we write the new head, the application could
++ * write new data to them.
++ */
++ smp_store_release(&rings->sq.head, ctx->cached_sq_head);
++}
++
++/*
++ * Fetch an sqe, if one is available. Note this returns a pointer to memory
++ * that is mapped by userspace. This means that care needs to be taken to
++ * ensure that reads are stable, as we cannot rely on userspace always
++ * being a good citizen. If members of the sqe are validated and then later
++ * used, it's important that those reads are done through READ_ONCE() to
++ * prevent a re-load down the line.
++ */
++static const struct io_uring_sqe *io_get_sqe(struct io_ring_ctx *ctx)
++{
++ unsigned head, mask = ctx->sq_entries - 1;
++ unsigned sq_idx = ctx->cached_sq_head++ & mask;
++
++ /*
++ * The cached sq head (or cq tail) serves two purposes:
++ *
++ * 1) allows us to batch the cost of updating the user visible
++ * head updates.
++ * 2) allows the kernel side to track the head on its own, even
++ * though the application is the one updating it.
++ */
++ head = READ_ONCE(ctx->sq_array[sq_idx]);
++ if (likely(head < ctx->sq_entries))
++ return &ctx->sq_sqes[head];
++
++ /* drop invalid entries */
++ ctx->cq_extra--;
++ WRITE_ONCE(ctx->rings->sq_dropped,
++ READ_ONCE(ctx->rings->sq_dropped) + 1);
++ return NULL;
++}
++
++static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
++ __must_hold(&ctx->uring_lock)
++{
++ unsigned int entries = io_sqring_entries(ctx);
++ int submitted = 0;
++
++ if (unlikely(!entries))
++ return 0;
++ /* make sure SQ entry isn't read before tail */
++ nr = min3(nr, ctx->sq_entries, entries);
++ io_get_task_refs(nr);
++
++ io_submit_state_start(&ctx->submit_state, nr);
++ do {
++ const struct io_uring_sqe *sqe;
++ struct io_kiocb *req;
++
++ if (unlikely(!io_alloc_req_refill(ctx))) {
++ if (!submitted)
++ submitted = -EAGAIN;
++ break;
++ }
++ req = io_alloc_req(ctx);
++ sqe = io_get_sqe(ctx);
++ if (unlikely(!sqe)) {
++ wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
++ break;
++ }
++ /* will complete beyond this point, count as submitted */
++ submitted++;
++ if (io_submit_sqe(ctx, req, sqe)) {
++ /*
++ * Continue submitting even for sqe failure if the
++ * ring was setup with IORING_SETUP_SUBMIT_ALL
++ */
++ if (!(ctx->flags & IORING_SETUP_SUBMIT_ALL))
++ break;
++ }
++ } while (submitted < nr);
++
++ if (unlikely(submitted != nr)) {
++ int ref_used = (submitted == -EAGAIN) ? 0 : submitted;
++ int unused = nr - ref_used;
++
++ current->io_uring->cached_refs += unused;
++ }
++
++ io_submit_state_end(ctx);
++ /* Commit SQ ring head once we've consumed and submitted all SQEs */
++ io_commit_sqring(ctx);
++
++ return submitted;
++}
++
++static inline bool io_sqd_events_pending(struct io_sq_data *sqd)
++{
++ return READ_ONCE(sqd->state);
++}
++
++static inline void io_ring_set_wakeup_flag(struct io_ring_ctx *ctx)
++{
++ /* Tell userspace we may need a wakeup call */
++ spin_lock(&ctx->completion_lock);
++ WRITE_ONCE(ctx->rings->sq_flags,
++ ctx->rings->sq_flags | IORING_SQ_NEED_WAKEUP);
++ spin_unlock(&ctx->completion_lock);
++}
++
++static inline void io_ring_clear_wakeup_flag(struct io_ring_ctx *ctx)
++{
++ spin_lock(&ctx->completion_lock);
++ WRITE_ONCE(ctx->rings->sq_flags,
++ ctx->rings->sq_flags & ~IORING_SQ_NEED_WAKEUP);
++ spin_unlock(&ctx->completion_lock);
++}
++
++static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
++{
++ unsigned int to_submit;
++ int ret = 0;
++
++ to_submit = io_sqring_entries(ctx);
++ /* if we're handling multiple rings, cap submit size for fairness */
++ if (cap_entries && to_submit > IORING_SQPOLL_CAP_ENTRIES_VALUE)
++ to_submit = IORING_SQPOLL_CAP_ENTRIES_VALUE;
++
++ if (!wq_list_empty(&ctx->iopoll_list) || to_submit) {
++ const struct cred *creds = NULL;
++
++ if (ctx->sq_creds != current_cred())
++ creds = override_creds(ctx->sq_creds);
++
++ mutex_lock(&ctx->uring_lock);
++ if (!wq_list_empty(&ctx->iopoll_list))
++ io_do_iopoll(ctx, true);
++
++ /*
++ * Don't submit if refs are dying, good for io_uring_register(),
++ * but also it is relied upon by io_ring_exit_work()
++ */
++ if (to_submit && likely(!percpu_ref_is_dying(&ctx->refs)) &&
++ !(ctx->flags & IORING_SETUP_R_DISABLED))
++ ret = io_submit_sqes(ctx, to_submit);
++ mutex_unlock(&ctx->uring_lock);
++
++ if (to_submit && wq_has_sleeper(&ctx->sqo_sq_wait))
++ wake_up(&ctx->sqo_sq_wait);
++ if (creds)
++ revert_creds(creds);
++ }
++
++ return ret;
++}
++
++static __cold void io_sqd_update_thread_idle(struct io_sq_data *sqd)
++{
++ struct io_ring_ctx *ctx;
++ unsigned sq_thread_idle = 0;
++
++ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
++ sq_thread_idle = max(sq_thread_idle, ctx->sq_thread_idle);
++ sqd->sq_thread_idle = sq_thread_idle;
++}
++
++static bool io_sqd_handle_event(struct io_sq_data *sqd)
++{
++ bool did_sig = false;
++ struct ksignal ksig;
++
++ if (test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) ||
++ signal_pending(current)) {
++ mutex_unlock(&sqd->lock);
++ if (signal_pending(current))
++ did_sig = get_signal(&ksig);
++ cond_resched();
++ mutex_lock(&sqd->lock);
++ }
++ return did_sig || test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
++}
++
++static int io_sq_thread(void *data)
++{
++ struct io_sq_data *sqd = data;
++ struct io_ring_ctx *ctx;
++ unsigned long timeout = 0;
++ char buf[TASK_COMM_LEN];
++ DEFINE_WAIT(wait);
++
++ snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid);
++ set_task_comm(current, buf);
++
++ if (sqd->sq_cpu != -1)
++ set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
++ else
++ set_cpus_allowed_ptr(current, cpu_online_mask);
++ current->flags |= PF_NO_SETAFFINITY;
++
++ audit_alloc_kernel(current);
++
++ mutex_lock(&sqd->lock);
++ while (1) {
++ bool cap_entries, sqt_spin = false;
++
++ if (io_sqd_events_pending(sqd) || signal_pending(current)) {
++ if (io_sqd_handle_event(sqd))
++ break;
++ timeout = jiffies + sqd->sq_thread_idle;
++ }
++
++ cap_entries = !list_is_singular(&sqd->ctx_list);
++ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
++ int ret = __io_sq_thread(ctx, cap_entries);
++
++ if (!sqt_spin && (ret > 0 || !wq_list_empty(&ctx->iopoll_list)))
++ sqt_spin = true;
++ }
++ if (io_run_task_work())
++ sqt_spin = true;
++
++ if (sqt_spin || !time_after(jiffies, timeout)) {
++ cond_resched();
++ if (sqt_spin)
++ timeout = jiffies + sqd->sq_thread_idle;
++ continue;
++ }
++
++ prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
++ if (!io_sqd_events_pending(sqd) && !task_work_pending(current)) {
++ bool needs_sched = true;
++
++ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
++ io_ring_set_wakeup_flag(ctx);
++
++ if ((ctx->flags & IORING_SETUP_IOPOLL) &&
++ !wq_list_empty(&ctx->iopoll_list)) {
++ needs_sched = false;
++ break;
++ }
++
++ /*
++ * Ensure the store of the wakeup flag is not
++ * reordered with the load of the SQ tail
++ */
++ smp_mb();
++
++ if (io_sqring_entries(ctx)) {
++ needs_sched = false;
++ break;
++ }
++ }
++
++ if (needs_sched) {
++ mutex_unlock(&sqd->lock);
++ schedule();
++ mutex_lock(&sqd->lock);
++ }
++ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
++ io_ring_clear_wakeup_flag(ctx);
++ }
++
++ finish_wait(&sqd->wait, &wait);
++ timeout = jiffies + sqd->sq_thread_idle;
++ }
++
++ io_uring_cancel_generic(true, sqd);
++ sqd->thread = NULL;
++ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
++ io_ring_set_wakeup_flag(ctx);
++ io_run_task_work();
++ mutex_unlock(&sqd->lock);
++
++ audit_free(current);
++
++ complete(&sqd->exited);
++ do_exit(0);
++}
++
++struct io_wait_queue {
++ struct wait_queue_entry wq;
++ struct io_ring_ctx *ctx;
++ unsigned cq_tail;
++ unsigned nr_timeouts;
++};
++
++static inline bool io_should_wake(struct io_wait_queue *iowq)
++{
++ struct io_ring_ctx *ctx = iowq->ctx;
++ int dist = ctx->cached_cq_tail - (int) iowq->cq_tail;
++
++ /*
++ * Wake up if we have enough events, or if a timeout occurred since we
++ * started waiting. For timeouts, we always want to return to userspace,
++ * regardless of event count.
++ */
++ return dist >= 0 || atomic_read(&ctx->cq_timeouts) != iowq->nr_timeouts;
++}
++
++static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
++ int wake_flags, void *key)
++{
++ struct io_wait_queue *iowq = container_of(curr, struct io_wait_queue,
++ wq);
++
++ /*
++ * Cannot safely flush overflowed CQEs from here, ensure we wake up
++ * the task, and the next invocation will do it.
++ */
++ if (io_should_wake(iowq) || test_bit(0, &iowq->ctx->check_cq_overflow))
++ return autoremove_wake_function(curr, mode, wake_flags, key);
++ return -1;
++}
++
++static int io_run_task_work_sig(void)
++{
++ if (io_run_task_work())
++ return 1;
++ if (test_thread_flag(TIF_NOTIFY_SIGNAL))
++ return -ERESTARTSYS;
++ if (task_sigpending(current))
++ return -EINTR;
++ return 0;
++}
++
++/* when returns >0, the caller should retry */
++static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
++ struct io_wait_queue *iowq,
++ ktime_t timeout)
++{
++ int ret;
++
++ /* make sure we run task_work before checking for signals */
++ ret = io_run_task_work_sig();
++ if (ret || io_should_wake(iowq))
++ return ret;
++ /* let the caller flush overflows, retry */
++ if (test_bit(0, &ctx->check_cq_overflow))
++ return 1;
++
++ if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS))
++ return -ETIME;
++ return 1;
++}
++
++/*
++ * Wait until events become available, if we don't already have some. The
++ * application must reap them itself, as they reside on the shared cq ring.
++ */
++static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
++ const sigset_t __user *sig, size_t sigsz,
++ struct __kernel_timespec __user *uts)
++{
++ struct io_wait_queue iowq;
++ struct io_rings *rings = ctx->rings;
++ ktime_t timeout = KTIME_MAX;
++ int ret;
++
++ do {
++ io_cqring_overflow_flush(ctx);
++ if (io_cqring_events(ctx) >= min_events)
++ return 0;
++ if (!io_run_task_work())
++ break;
++ } while (1);
++
++ if (sig) {
++#ifdef CONFIG_COMPAT
++ if (in_compat_syscall())
++ ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
++ sigsz);
++ else
++#endif
++ ret = set_user_sigmask(sig, sigsz);
++
++ if (ret)
++ return ret;
++ }
++
++ if (uts) {
++ struct timespec64 ts;
++
++ if (get_timespec64(&ts, uts))
++ return -EFAULT;
++ timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
++ }
++
++ init_waitqueue_func_entry(&iowq.wq, io_wake_function);
++ iowq.wq.private = current;
++ INIT_LIST_HEAD(&iowq.wq.entry);
++ iowq.ctx = ctx;
++ iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
++ iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
++
++ trace_io_uring_cqring_wait(ctx, min_events);
++ do {
++ /* if we can't even flush overflow, don't wait for more */
++ if (!io_cqring_overflow_flush(ctx)) {
++ ret = -EBUSY;
++ break;
++ }
++ prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
++ TASK_INTERRUPTIBLE);
++ ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
++ finish_wait(&ctx->cq_wait, &iowq.wq);
++ cond_resched();
++ } while (ret > 0);
++
++ restore_saved_sigmask_unless(ret == -EINTR);
++
++ return READ_ONCE(rings->cq.head) == READ_ONCE(rings->cq.tail) ? ret : 0;
++}
++
++static void io_free_page_table(void **table, size_t size)
++{
++ unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
++
++ for (i = 0; i < nr_tables; i++)
++ kfree(table[i]);
++ kfree(table);
++}
++
++static __cold void **io_alloc_page_table(size_t size)
++{
++ unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
++ size_t init_size = size;
++ void **table;
++
++ table = kcalloc(nr_tables, sizeof(*table), GFP_KERNEL_ACCOUNT);
++ if (!table)
++ return NULL;
++
++ for (i = 0; i < nr_tables; i++) {
++ unsigned int this_size = min_t(size_t, size, PAGE_SIZE);
++
++ table[i] = kzalloc(this_size, GFP_KERNEL_ACCOUNT);
++ if (!table[i]) {
++ io_free_page_table(table, init_size);
++ return NULL;
++ }
++ size -= this_size;
++ }
++ return table;
++}
++
++static void io_rsrc_node_destroy(struct io_rsrc_node *ref_node)
++{
++ percpu_ref_exit(&ref_node->refs);
++ kfree(ref_node);
++}
++
++static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
++{
++ struct io_rsrc_node *node = container_of(ref, struct io_rsrc_node, refs);
++ struct io_ring_ctx *ctx = node->rsrc_data->ctx;
++ unsigned long flags;
++ bool first_add = false;
++ unsigned long delay = HZ;
++
++ spin_lock_irqsave(&ctx->rsrc_ref_lock, flags);
++ node->done = true;
++
++ /* if we are mid-quiesce then do not delay */
++ if (node->rsrc_data->quiesce)
++ delay = 0;
++
++ while (!list_empty(&ctx->rsrc_ref_list)) {
++ node = list_first_entry(&ctx->rsrc_ref_list,
++ struct io_rsrc_node, node);
++ /* recycle ref nodes in order */
++ if (!node->done)
++ break;
++ list_del(&node->node);
++ first_add |= llist_add(&node->llist, &ctx->rsrc_put_llist);
++ }
++ spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
++
++ if (first_add)
++ mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
++}
++
++static struct io_rsrc_node *io_rsrc_node_alloc(void)
++{
++ struct io_rsrc_node *ref_node;
++
++ ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL);
++ if (!ref_node)
++ return NULL;
++
++ if (percpu_ref_init(&ref_node->refs, io_rsrc_node_ref_zero,
++ 0, GFP_KERNEL)) {
++ kfree(ref_node);
++ return NULL;
++ }
++ INIT_LIST_HEAD(&ref_node->node);
++ INIT_LIST_HEAD(&ref_node->rsrc_list);
++ ref_node->done = false;
++ return ref_node;
++}
++
++static void io_rsrc_node_switch(struct io_ring_ctx *ctx,
++ struct io_rsrc_data *data_to_kill)
++ __must_hold(&ctx->uring_lock)
++{
++ WARN_ON_ONCE(!ctx->rsrc_backup_node);
++ WARN_ON_ONCE(data_to_kill && !ctx->rsrc_node);
++
++ io_rsrc_refs_drop(ctx);
++
++ if (data_to_kill) {
++ struct io_rsrc_node *rsrc_node = ctx->rsrc_node;
++
++ rsrc_node->rsrc_data = data_to_kill;
++ spin_lock_irq(&ctx->rsrc_ref_lock);
++ list_add_tail(&rsrc_node->node, &ctx->rsrc_ref_list);
++ spin_unlock_irq(&ctx->rsrc_ref_lock);
++
++ atomic_inc(&data_to_kill->refs);
++ percpu_ref_kill(&rsrc_node->refs);
++ ctx->rsrc_node = NULL;
++ }
++
++ if (!ctx->rsrc_node) {
++ ctx->rsrc_node = ctx->rsrc_backup_node;
++ ctx->rsrc_backup_node = NULL;
++ }
++}
++
++static int io_rsrc_node_switch_start(struct io_ring_ctx *ctx)
++{
++ if (ctx->rsrc_backup_node)
++ return 0;
++ ctx->rsrc_backup_node = io_rsrc_node_alloc();
++ return ctx->rsrc_backup_node ? 0 : -ENOMEM;
++}
++
++static __cold int io_rsrc_ref_quiesce(struct io_rsrc_data *data,
++ struct io_ring_ctx *ctx)
++{
++ int ret;
++
++ /* As we may drop ->uring_lock, other task may have started quiesce */
++ if (data->quiesce)
++ return -ENXIO;
++
++ data->quiesce = true;
++ do {
++ ret = io_rsrc_node_switch_start(ctx);
++ if (ret)
++ break;
++ io_rsrc_node_switch(ctx, data);
++
++ /* kill initial ref, already quiesced if zero */
++ if (atomic_dec_and_test(&data->refs))
++ break;
++ mutex_unlock(&ctx->uring_lock);
++ flush_delayed_work(&ctx->rsrc_put_work);
++ ret = wait_for_completion_interruptible(&data->done);
++ if (!ret) {
++ mutex_lock(&ctx->uring_lock);
++ if (atomic_read(&data->refs) > 0) {
++ /*
++ * it has been revived by another thread while
++ * we were unlocked
++ */
++ mutex_unlock(&ctx->uring_lock);
++ } else {
++ break;
++ }
++ }
++
++ atomic_inc(&data->refs);
++ /* wait for all works potentially completing data->done */
++ flush_delayed_work(&ctx->rsrc_put_work);
++ reinit_completion(&data->done);
++
++ ret = io_run_task_work_sig();
++ mutex_lock(&ctx->uring_lock);
++ } while (ret >= 0);
++ data->quiesce = false;
++
++ return ret;
++}
++
++static u64 *io_get_tag_slot(struct io_rsrc_data *data, unsigned int idx)
++{
++ unsigned int off = idx & IO_RSRC_TAG_TABLE_MASK;
++ unsigned int table_idx = idx >> IO_RSRC_TAG_TABLE_SHIFT;
++
++ return &data->tags[table_idx][off];
++}
++
++static void io_rsrc_data_free(struct io_rsrc_data *data)
++{
++ size_t size = data->nr * sizeof(data->tags[0][0]);
++
++ if (data->tags)
++ io_free_page_table((void **)data->tags, size);
++ kfree(data);
++}
++
++static __cold int io_rsrc_data_alloc(struct io_ring_ctx *ctx, rsrc_put_fn *do_put,
++ u64 __user *utags, unsigned nr,
++ struct io_rsrc_data **pdata)
++{
++ struct io_rsrc_data *data;
++ int ret = -ENOMEM;
++ unsigned i;
++
++ data = kzalloc(sizeof(*data), GFP_KERNEL);
++ if (!data)
++ return -ENOMEM;
++ data->tags = (u64 **)io_alloc_page_table(nr * sizeof(data->tags[0][0]));
++ if (!data->tags) {
++ kfree(data);
++ return -ENOMEM;
++ }
++
++ data->nr = nr;
++ data->ctx = ctx;
++ data->do_put = do_put;
++ if (utags) {
++ ret = -EFAULT;
++ for (i = 0; i < nr; i++) {
++ u64 *tag_slot = io_get_tag_slot(data, i);
++
++ if (copy_from_user(tag_slot, &utags[i],
++ sizeof(*tag_slot)))
++ goto fail;
++ }
++ }
++
++ atomic_set(&data->refs, 1);
++ init_completion(&data->done);
++ *pdata = data;
++ return 0;
++fail:
++ io_rsrc_data_free(data);
++ return ret;
++}
++
++static bool io_alloc_file_tables(struct io_file_table *table, unsigned nr_files)
++{
++ table->files = kvcalloc(nr_files, sizeof(table->files[0]),
++ GFP_KERNEL_ACCOUNT);
++ return !!table->files;
++}
++
++static void io_free_file_tables(struct io_file_table *table)
++{
++ kvfree(table->files);
++ table->files = NULL;
++}
++
++static void __io_sqe_files_unregister(struct io_ring_ctx *ctx)
++{
++#if defined(CONFIG_UNIX)
++ if (ctx->ring_sock) {
++ struct sock *sock = ctx->ring_sock->sk;
++ struct sk_buff *skb;
++
++ while ((skb = skb_dequeue(&sock->sk_receive_queue)) != NULL)
++ kfree_skb(skb);
++ }
++#else
++ int i;
++
++ for (i = 0; i < ctx->nr_user_files; i++) {
++ struct file *file;
++
++ file = io_file_from_index(ctx, i);
++ if (file)
++ fput(file);
++ }
++#endif
++ io_free_file_tables(&ctx->file_table);
++ io_rsrc_data_free(ctx->file_data);
++ ctx->file_data = NULL;
++ ctx->nr_user_files = 0;
++}
++
++static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
++{
++ unsigned nr = ctx->nr_user_files;
++ int ret;
++
++ if (!ctx->file_data)
++ return -ENXIO;
++
++ /*
++ * Quiesce may unlock ->uring_lock, and while it's not held
++ * prevent new requests using the table.
++ */
++ ctx->nr_user_files = 0;
++ ret = io_rsrc_ref_quiesce(ctx->file_data, ctx);
++ ctx->nr_user_files = nr;
++ if (!ret)
++ __io_sqe_files_unregister(ctx);
++ return ret;
++}
++
++static void io_sq_thread_unpark(struct io_sq_data *sqd)
++ __releases(&sqd->lock)
++{
++ WARN_ON_ONCE(sqd->thread == current);
++
++ /*
++ * Do the dance but not conditional clear_bit() because it'd race with
++ * other threads incrementing park_pending and setting the bit.
++ */
++ clear_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
++ if (atomic_dec_return(&sqd->park_pending))
++ set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
++ mutex_unlock(&sqd->lock);
++}
++
++static void io_sq_thread_park(struct io_sq_data *sqd)
++ __acquires(&sqd->lock)
++{
++ WARN_ON_ONCE(sqd->thread == current);
++
++ atomic_inc(&sqd->park_pending);
++ set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
++ mutex_lock(&sqd->lock);
++ if (sqd->thread)
++ wake_up_process(sqd->thread);
++}
++
++static void io_sq_thread_stop(struct io_sq_data *sqd)
++{
++ WARN_ON_ONCE(sqd->thread == current);
++ WARN_ON_ONCE(test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state));
++
++ set_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
++ mutex_lock(&sqd->lock);
++ if (sqd->thread)
++ wake_up_process(sqd->thread);
++ mutex_unlock(&sqd->lock);
++ wait_for_completion(&sqd->exited);
++}
++
++static void io_put_sq_data(struct io_sq_data *sqd)
++{
++ if (refcount_dec_and_test(&sqd->refs)) {
++ WARN_ON_ONCE(atomic_read(&sqd->park_pending));
++
++ io_sq_thread_stop(sqd);
++ kfree(sqd);
++ }
++}
++
++static void io_sq_thread_finish(struct io_ring_ctx *ctx)
++{
++ struct io_sq_data *sqd = ctx->sq_data;
++
++ if (sqd) {
++ io_sq_thread_park(sqd);
++ list_del_init(&ctx->sqd_list);
++ io_sqd_update_thread_idle(sqd);
++ io_sq_thread_unpark(sqd);
++
++ io_put_sq_data(sqd);
++ ctx->sq_data = NULL;
++ }
++}
++
++static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
++{
++ struct io_ring_ctx *ctx_attach;
++ struct io_sq_data *sqd;
++ struct fd f;
++
++ f = fdget(p->wq_fd);
++ if (!f.file)
++ return ERR_PTR(-ENXIO);
++ if (f.file->f_op != &io_uring_fops) {
++ fdput(f);
++ return ERR_PTR(-EINVAL);
++ }
++
++ ctx_attach = f.file->private_data;
++ sqd = ctx_attach->sq_data;
++ if (!sqd) {
++ fdput(f);
++ return ERR_PTR(-EINVAL);
++ }
++ if (sqd->task_tgid != current->tgid) {
++ fdput(f);
++ return ERR_PTR(-EPERM);
++ }
++
++ refcount_inc(&sqd->refs);
++ fdput(f);
++ return sqd;
++}
++
++static struct io_sq_data *io_get_sq_data(struct io_uring_params *p,
++ bool *attached)
++{
++ struct io_sq_data *sqd;
++
++ *attached = false;
++ if (p->flags & IORING_SETUP_ATTACH_WQ) {
++ sqd = io_attach_sq_data(p);
++ if (!IS_ERR(sqd)) {
++ *attached = true;
++ return sqd;
++ }
++ /* fall through for EPERM case, setup new sqd/task */
++ if (PTR_ERR(sqd) != -EPERM)
++ return sqd;
++ }
++
++ sqd = kzalloc(sizeof(*sqd), GFP_KERNEL);
++ if (!sqd)
++ return ERR_PTR(-ENOMEM);
++
++ atomic_set(&sqd->park_pending, 0);
++ refcount_set(&sqd->refs, 1);
++ INIT_LIST_HEAD(&sqd->ctx_list);
++ mutex_init(&sqd->lock);
++ init_waitqueue_head(&sqd->wait);
++ init_completion(&sqd->exited);
++ return sqd;
++}
++
++#if defined(CONFIG_UNIX)
++/*
++ * Ensure the UNIX gc is aware of our file set, so we are certain that
++ * the io_uring can be safely unregistered on process exit, even if we have
++ * loops in the file referencing.
++ */
++static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset)
++{
++ struct sock *sk = ctx->ring_sock->sk;
++ struct scm_fp_list *fpl;
++ struct sk_buff *skb;
++ int i, nr_files;
++
++ fpl = kzalloc(sizeof(*fpl), GFP_KERNEL);
++ if (!fpl)
++ return -ENOMEM;
++
++ skb = alloc_skb(0, GFP_KERNEL);
++ if (!skb) {
++ kfree(fpl);
++ return -ENOMEM;
++ }
++
++ skb->sk = sk;
++
++ nr_files = 0;
++ fpl->user = get_uid(current_user());
++ for (i = 0; i < nr; i++) {
++ struct file *file = io_file_from_index(ctx, i + offset);
++
++ if (!file)
++ continue;
++ fpl->fp[nr_files] = get_file(file);
++ unix_inflight(fpl->user, fpl->fp[nr_files]);
++ nr_files++;
++ }
++
++ if (nr_files) {
++ fpl->max = SCM_MAX_FD;
++ fpl->count = nr_files;
++ UNIXCB(skb).fp = fpl;
++ skb->destructor = unix_destruct_scm;
++ refcount_add(skb->truesize, &sk->sk_wmem_alloc);
++ skb_queue_head(&sk->sk_receive_queue, skb);
++
++ for (i = 0; i < nr; i++) {
++ struct file *file = io_file_from_index(ctx, i + offset);
++
++ if (file)
++ fput(file);
++ }
++ } else {
++ kfree_skb(skb);
++ free_uid(fpl->user);
++ kfree(fpl);
++ }
++
++ return 0;
++}
++
++/*
++ * If UNIX sockets are enabled, fd passing can cause a reference cycle which
++ * causes regular reference counting to break down. We rely on the UNIX
++ * garbage collection to take care of this problem for us.
++ */
++static int io_sqe_files_scm(struct io_ring_ctx *ctx)
++{
++ unsigned left, total;
++ int ret = 0;
++
++ total = 0;
++ left = ctx->nr_user_files;
++ while (left) {
++ unsigned this_files = min_t(unsigned, left, SCM_MAX_FD);
++
++ ret = __io_sqe_files_scm(ctx, this_files, total);
++ if (ret)
++ break;
++ left -= this_files;
++ total += this_files;
++ }
++
++ if (!ret)
++ return 0;
++
++ while (total < ctx->nr_user_files) {
++ struct file *file = io_file_from_index(ctx, total);
++
++ if (file)
++ fput(file);
++ total++;
++ }
++
++ return ret;
++}
++#else
++static int io_sqe_files_scm(struct io_ring_ctx *ctx)
++{
++ return 0;
++}
++#endif
++
++static void io_rsrc_file_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
++{
++ struct file *file = prsrc->file;
++#if defined(CONFIG_UNIX)
++ struct sock *sock = ctx->ring_sock->sk;
++ struct sk_buff_head list, *head = &sock->sk_receive_queue;
++ struct sk_buff *skb;
++ int i;
++
++ __skb_queue_head_init(&list);
++
++ /*
++ * Find the skb that holds this file in its SCM_RIGHTS. When found,
++ * remove this entry and rearrange the file array.
++ */
++ skb = skb_dequeue(head);
++ while (skb) {
++ struct scm_fp_list *fp;
++
++ fp = UNIXCB(skb).fp;
++ for (i = 0; i < fp->count; i++) {
++ int left;
++
++ if (fp->fp[i] != file)
++ continue;
++
++ unix_notinflight(fp->user, fp->fp[i]);
++ left = fp->count - 1 - i;
++ if (left) {
++ memmove(&fp->fp[i], &fp->fp[i + 1],
++ left * sizeof(struct file *));
++ }
++ fp->count--;
++ if (!fp->count) {
++ kfree_skb(skb);
++ skb = NULL;
++ } else {
++ __skb_queue_tail(&list, skb);
++ }
++ fput(file);
++ file = NULL;
++ break;
++ }
++
++ if (!file)
++ break;
++
++ __skb_queue_tail(&list, skb);
++
++ skb = skb_dequeue(head);
++ }
++
++ if (skb_peek(&list)) {
++ spin_lock_irq(&head->lock);
++ while ((skb = __skb_dequeue(&list)) != NULL)
++ __skb_queue_tail(head, skb);
++ spin_unlock_irq(&head->lock);
++ }
++#else
++ fput(file);
++#endif
++}
++
++static void __io_rsrc_put_work(struct io_rsrc_node *ref_node)
++{
++ struct io_rsrc_data *rsrc_data = ref_node->rsrc_data;
++ struct io_ring_ctx *ctx = rsrc_data->ctx;
++ struct io_rsrc_put *prsrc, *tmp;
++
++ list_for_each_entry_safe(prsrc, tmp, &ref_node->rsrc_list, list) {
++ list_del(&prsrc->list);
++
++ if (prsrc->tag) {
++ bool lock_ring = ctx->flags & IORING_SETUP_IOPOLL;
++
++ io_ring_submit_lock(ctx, lock_ring);
++ spin_lock(&ctx->completion_lock);
++ io_fill_cqe_aux(ctx, prsrc->tag, 0, 0);
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ io_cqring_ev_posted(ctx);
++ io_ring_submit_unlock(ctx, lock_ring);
++ }
++
++ rsrc_data->do_put(ctx, prsrc);
++ kfree(prsrc);
++ }
++
++ io_rsrc_node_destroy(ref_node);
++ if (atomic_dec_and_test(&rsrc_data->refs))
++ complete(&rsrc_data->done);
++}
++
++static void io_rsrc_put_work(struct work_struct *work)
++{
++ struct io_ring_ctx *ctx;
++ struct llist_node *node;
++
++ ctx = container_of(work, struct io_ring_ctx, rsrc_put_work.work);
++ node = llist_del_all(&ctx->rsrc_put_llist);
++
++ while (node) {
++ struct io_rsrc_node *ref_node;
++ struct llist_node *next = node->next;
++
++ ref_node = llist_entry(node, struct io_rsrc_node, llist);
++ __io_rsrc_put_work(ref_node);
++ node = next;
++ }
++}
++
++static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
++ unsigned nr_args, u64 __user *tags)
++{
++ __s32 __user *fds = (__s32 __user *) arg;
++ struct file *file;
++ int fd, ret;
++ unsigned i;
++
++ if (ctx->file_data)
++ return -EBUSY;
++ if (!nr_args)
++ return -EINVAL;
++ if (nr_args > IORING_MAX_FIXED_FILES)
++ return -EMFILE;
++ if (nr_args > rlimit(RLIMIT_NOFILE))
++ return -EMFILE;
++ ret = io_rsrc_node_switch_start(ctx);
++ if (ret)
++ return ret;
++ ret = io_rsrc_data_alloc(ctx, io_rsrc_file_put, tags, nr_args,
++ &ctx->file_data);
++ if (ret)
++ return ret;
++
++ ret = -ENOMEM;
++ if (!io_alloc_file_tables(&ctx->file_table, nr_args))
++ goto out_free;
++
++ for (i = 0; i < nr_args; i++, ctx->nr_user_files++) {
++ if (copy_from_user(&fd, &fds[i], sizeof(fd))) {
++ ret = -EFAULT;
++ goto out_fput;
++ }
++ /* allow sparse sets */
++ if (fd == -1) {
++ ret = -EINVAL;
++ if (unlikely(*io_get_tag_slot(ctx->file_data, i)))
++ goto out_fput;
++ continue;
++ }
++
++ file = fget(fd);
++ ret = -EBADF;
++ if (unlikely(!file))
++ goto out_fput;
++
++ /*
++ * Don't allow io_uring instances to be registered. If UNIX
++ * isn't enabled, then this causes a reference cycle and this
++ * instance can never get freed. If UNIX is enabled we'll
++ * handle it just fine, but there's still no point in allowing
++ * a ring fd as it doesn't support regular read/write anyway.
++ */
++ if (file->f_op == &io_uring_fops) {
++ fput(file);
++ goto out_fput;
++ }
++ io_fixed_file_set(io_fixed_file_slot(&ctx->file_table, i), file);
++ }
++
++ ret = io_sqe_files_scm(ctx);
++ if (ret) {
++ __io_sqe_files_unregister(ctx);
++ return ret;
++ }
++
++ io_rsrc_node_switch(ctx, NULL);
++ return ret;
++out_fput:
++ for (i = 0; i < ctx->nr_user_files; i++) {
++ file = io_file_from_index(ctx, i);
++ if (file)
++ fput(file);
++ }
++ io_free_file_tables(&ctx->file_table);
++ ctx->nr_user_files = 0;
++out_free:
++ io_rsrc_data_free(ctx->file_data);
++ ctx->file_data = NULL;
++ return ret;
++}
++
++static int io_sqe_file_register(struct io_ring_ctx *ctx, struct file *file,
++ int index)
++{
++#if defined(CONFIG_UNIX)
++ struct sock *sock = ctx->ring_sock->sk;
++ struct sk_buff_head *head = &sock->sk_receive_queue;
++ struct sk_buff *skb;
++
++ /*
++ * See if we can merge this file into an existing skb SCM_RIGHTS
++ * file set. If there's no room, fall back to allocating a new skb
++ * and filling it in.
++ */
++ spin_lock_irq(&head->lock);
++ skb = skb_peek(head);
++ if (skb) {
++ struct scm_fp_list *fpl = UNIXCB(skb).fp;
++
++ if (fpl->count < SCM_MAX_FD) {
++ __skb_unlink(skb, head);
++ spin_unlock_irq(&head->lock);
++ fpl->fp[fpl->count] = get_file(file);
++ unix_inflight(fpl->user, fpl->fp[fpl->count]);
++ fpl->count++;
++ spin_lock_irq(&head->lock);
++ __skb_queue_head(head, skb);
++ } else {
++ skb = NULL;
++ }
++ }
++ spin_unlock_irq(&head->lock);
++
++ if (skb) {
++ fput(file);
++ return 0;
++ }
++
++ return __io_sqe_files_scm(ctx, 1, index);
++#else
++ return 0;
++#endif
++}
++
++static int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx,
++ struct io_rsrc_node *node, void *rsrc)
++{
++ u64 *tag_slot = io_get_tag_slot(data, idx);
++ struct io_rsrc_put *prsrc;
++
++ prsrc = kzalloc(sizeof(*prsrc), GFP_KERNEL);
++ if (!prsrc)
++ return -ENOMEM;
++
++ prsrc->tag = *tag_slot;
++ *tag_slot = 0;
++ prsrc->rsrc = rsrc;
++ list_add(&prsrc->list, &node->rsrc_list);
++ return 0;
++}
++
++static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
++ unsigned int issue_flags, u32 slot_index)
++{
++ struct io_ring_ctx *ctx = req->ctx;
++ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++ bool needs_switch = false;
++ struct io_fixed_file *file_slot;
++ int ret = -EBADF;
++
++ io_ring_submit_lock(ctx, needs_lock);
++ if (file->f_op == &io_uring_fops)
++ goto err;
++ ret = -ENXIO;
++ if (!ctx->file_data)
++ goto err;
++ ret = -EINVAL;
++ if (slot_index >= ctx->nr_user_files)
++ goto err;
++
++ slot_index = array_index_nospec(slot_index, ctx->nr_user_files);
++ file_slot = io_fixed_file_slot(&ctx->file_table, slot_index);
++
++ if (file_slot->file_ptr) {
++ struct file *old_file;
++
++ ret = io_rsrc_node_switch_start(ctx);
++ if (ret)
++ goto err;
++
++ old_file = (struct file *)(file_slot->file_ptr & FFS_MASK);
++ ret = io_queue_rsrc_removal(ctx->file_data, slot_index,
++ ctx->rsrc_node, old_file);
++ if (ret)
++ goto err;
++ file_slot->file_ptr = 0;
++ needs_switch = true;
++ }
++
++ *io_get_tag_slot(ctx->file_data, slot_index) = 0;
++ io_fixed_file_set(file_slot, file);
++ ret = io_sqe_file_register(ctx, file, slot_index);
++ if (ret) {
++ file_slot->file_ptr = 0;
++ goto err;
++ }
++
++ ret = 0;
++err:
++ if (needs_switch)
++ io_rsrc_node_switch(ctx, ctx->file_data);
++ io_ring_submit_unlock(ctx, needs_lock);
++ if (ret)
++ fput(file);
++ return ret;
++}
++
++static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
++{
++ unsigned int offset = req->close.file_slot - 1;
++ struct io_ring_ctx *ctx = req->ctx;
++ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
++ struct io_fixed_file *file_slot;
++ struct file *file;
++ int ret;
++
++ io_ring_submit_lock(ctx, needs_lock);
++ ret = -ENXIO;
++ if (unlikely(!ctx->file_data))
++ goto out;
++ ret = -EINVAL;
++ if (offset >= ctx->nr_user_files)
++ goto out;
++ ret = io_rsrc_node_switch_start(ctx);
++ if (ret)
++ goto out;
++
++ offset = array_index_nospec(offset, ctx->nr_user_files);
++ file_slot = io_fixed_file_slot(&ctx->file_table, offset);
++ ret = -EBADF;
++ if (!file_slot->file_ptr)
++ goto out;
++
++ file = (struct file *)(file_slot->file_ptr & FFS_MASK);
++ ret = io_queue_rsrc_removal(ctx->file_data, offset, ctx->rsrc_node, file);
++ if (ret)
++ goto out;
++
++ file_slot->file_ptr = 0;
++ io_rsrc_node_switch(ctx, ctx->file_data);
++ ret = 0;
++out:
++ io_ring_submit_unlock(ctx, needs_lock);
++ return ret;
++}
++
++static int __io_sqe_files_update(struct io_ring_ctx *ctx,
++ struct io_uring_rsrc_update2 *up,
++ unsigned nr_args)
++{
++ u64 __user *tags = u64_to_user_ptr(up->tags);
++ __s32 __user *fds = u64_to_user_ptr(up->data);
++ struct io_rsrc_data *data = ctx->file_data;
++ struct io_fixed_file *file_slot;
++ struct file *file;
++ int fd, i, err = 0;
++ unsigned int done;
++ bool needs_switch = false;
++
++ if (!ctx->file_data)
++ return -ENXIO;
++ if (up->offset + nr_args > ctx->nr_user_files)
++ return -EINVAL;
++
++ for (done = 0; done < nr_args; done++) {
++ u64 tag = 0;
++
++ if ((tags && copy_from_user(&tag, &tags[done], sizeof(tag))) ||
++ copy_from_user(&fd, &fds[done], sizeof(fd))) {
++ err = -EFAULT;
++ break;
++ }
++ if ((fd == IORING_REGISTER_FILES_SKIP || fd == -1) && tag) {
++ err = -EINVAL;
++ break;
++ }
++ if (fd == IORING_REGISTER_FILES_SKIP)
++ continue;
++
++ i = array_index_nospec(up->offset + done, ctx->nr_user_files);
++ file_slot = io_fixed_file_slot(&ctx->file_table, i);
++
++ if (file_slot->file_ptr) {
++ file = (struct file *)(file_slot->file_ptr & FFS_MASK);
++ err = io_queue_rsrc_removal(data, i, ctx->rsrc_node, file);
++ if (err)
++ break;
++ file_slot->file_ptr = 0;
++ needs_switch = true;
++ }
++ if (fd != -1) {
++ file = fget(fd);
++ if (!file) {
++ err = -EBADF;
++ break;
++ }
++ /*
++ * Don't allow io_uring instances to be registered. If
++ * UNIX isn't enabled, then this causes a reference
++ * cycle and this instance can never get freed. If UNIX
++ * is enabled we'll handle it just fine, but there's
++ * still no point in allowing a ring fd as it doesn't
++ * support regular read/write anyway.
++ */
++ if (file->f_op == &io_uring_fops) {
++ fput(file);
++ err = -EBADF;
++ break;
++ }
++ *io_get_tag_slot(data, i) = tag;
++ io_fixed_file_set(file_slot, file);
++ err = io_sqe_file_register(ctx, file, i);
++ if (err) {
++ file_slot->file_ptr = 0;
++ fput(file);
++ break;
++ }
++ }
++ }
++
++ if (needs_switch)
++ io_rsrc_node_switch(ctx, data);
++ return done ? done : err;
++}
++
++static struct io_wq *io_init_wq_offload(struct io_ring_ctx *ctx,
++ struct task_struct *task)
++{
++ struct io_wq_hash *hash;
++ struct io_wq_data data;
++ unsigned int concurrency;
++
++ mutex_lock(&ctx->uring_lock);
++ hash = ctx->hash_map;
++ if (!hash) {
++ hash = kzalloc(sizeof(*hash), GFP_KERNEL);
++ if (!hash) {
++ mutex_unlock(&ctx->uring_lock);
++ return ERR_PTR(-ENOMEM);
++ }
++ refcount_set(&hash->refs, 1);
++ init_waitqueue_head(&hash->wait);
++ ctx->hash_map = hash;
++ }
++ mutex_unlock(&ctx->uring_lock);
++
++ data.hash = hash;
++ data.task = task;
++ data.free_work = io_wq_free_work;
++ data.do_work = io_wq_submit_work;
++
++ /* Do QD, or 4 * CPUS, whatever is smallest */
++ concurrency = min(ctx->sq_entries, 4 * num_online_cpus());
++
++ return io_wq_create(concurrency, &data);
++}
++
++static __cold int io_uring_alloc_task_context(struct task_struct *task,
++ struct io_ring_ctx *ctx)
++{
++ struct io_uring_task *tctx;
++ int ret;
++
++ tctx = kzalloc(sizeof(*tctx), GFP_KERNEL);
++ if (unlikely(!tctx))
++ return -ENOMEM;
++
++ tctx->registered_rings = kcalloc(IO_RINGFD_REG_MAX,
++ sizeof(struct file *), GFP_KERNEL);
++ if (unlikely(!tctx->registered_rings)) {
++ kfree(tctx);
++ return -ENOMEM;
++ }
++
++ ret = percpu_counter_init(&tctx->inflight, 0, GFP_KERNEL);
++ if (unlikely(ret)) {
++ kfree(tctx->registered_rings);
++ kfree(tctx);
++ return ret;
++ }
++
++ tctx->io_wq = io_init_wq_offload(ctx, task);
++ if (IS_ERR(tctx->io_wq)) {
++ ret = PTR_ERR(tctx->io_wq);
++ percpu_counter_destroy(&tctx->inflight);
++ kfree(tctx->registered_rings);
++ kfree(tctx);
++ return ret;
++ }
++
++ xa_init(&tctx->xa);
++ init_waitqueue_head(&tctx->wait);
++ atomic_set(&tctx->in_idle, 0);
++ atomic_set(&tctx->inflight_tracked, 0);
++ task->io_uring = tctx;
++ spin_lock_init(&tctx->task_lock);
++ INIT_WQ_LIST(&tctx->task_list);
++ INIT_WQ_LIST(&tctx->prior_task_list);
++ init_task_work(&tctx->task_work, tctx_task_work);
++ return 0;
++}
++
++void __io_uring_free(struct task_struct *tsk)
++{
++ struct io_uring_task *tctx = tsk->io_uring;
++
++ WARN_ON_ONCE(!xa_empty(&tctx->xa));
++ WARN_ON_ONCE(tctx->io_wq);
++ WARN_ON_ONCE(tctx->cached_refs);
++
++ kfree(tctx->registered_rings);
++ percpu_counter_destroy(&tctx->inflight);
++ kfree(tctx);
++ tsk->io_uring = NULL;
++}
++
++static __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
++ struct io_uring_params *p)
++{
++ int ret;
++
++ /* Retain compatibility with failing for an invalid attach attempt */
++ if ((ctx->flags & (IORING_SETUP_ATTACH_WQ | IORING_SETUP_SQPOLL)) ==
++ IORING_SETUP_ATTACH_WQ) {
++ struct fd f;
++
++ f = fdget(p->wq_fd);
++ if (!f.file)
++ return -ENXIO;
++ if (f.file->f_op != &io_uring_fops) {
++ fdput(f);
++ return -EINVAL;
++ }
++ fdput(f);
++ }
++ if (ctx->flags & IORING_SETUP_SQPOLL) {
++ struct task_struct *tsk;
++ struct io_sq_data *sqd;
++ bool attached;
++
++ ret = security_uring_sqpoll();
++ if (ret)
++ return ret;
++
++ sqd = io_get_sq_data(p, &attached);
++ if (IS_ERR(sqd)) {
++ ret = PTR_ERR(sqd);
++ goto err;
++ }
++
++ ctx->sq_creds = get_current_cred();
++ ctx->sq_data = sqd;
++ ctx->sq_thread_idle = msecs_to_jiffies(p->sq_thread_idle);
++ if (!ctx->sq_thread_idle)
++ ctx->sq_thread_idle = HZ;
++
++ io_sq_thread_park(sqd);
++ list_add(&ctx->sqd_list, &sqd->ctx_list);
++ io_sqd_update_thread_idle(sqd);
++ /* don't attach to a dying SQPOLL thread, would be racy */
++ ret = (attached && !sqd->thread) ? -ENXIO : 0;
++ io_sq_thread_unpark(sqd);
++
++ if (ret < 0)
++ goto err;
++ if (attached)
++ return 0;
++
++ if (p->flags & IORING_SETUP_SQ_AFF) {
++ int cpu = p->sq_thread_cpu;
++
++ ret = -EINVAL;
++ if (cpu >= nr_cpu_ids || !cpu_online(cpu))
++ goto err_sqpoll;
++ sqd->sq_cpu = cpu;
++ } else {
++ sqd->sq_cpu = -1;
++ }
++
++ sqd->task_pid = current->pid;
++ sqd->task_tgid = current->tgid;
++ tsk = create_io_thread(io_sq_thread, sqd, NUMA_NO_NODE);
++ if (IS_ERR(tsk)) {
++ ret = PTR_ERR(tsk);
++ goto err_sqpoll;
++ }
++
++ sqd->thread = tsk;
++ ret = io_uring_alloc_task_context(tsk, ctx);
++ wake_up_new_task(tsk);
++ if (ret)
++ goto err;
++ } else if (p->flags & IORING_SETUP_SQ_AFF) {
++ /* Can't have SQ_AFF without SQPOLL */
++ ret = -EINVAL;
++ goto err;
++ }
++
++ return 0;
++err_sqpoll:
++ complete(&ctx->sq_data->exited);
++err:
++ io_sq_thread_finish(ctx);
++ return ret;
++}
++
++static inline void __io_unaccount_mem(struct user_struct *user,
++ unsigned long nr_pages)
++{
++ atomic_long_sub(nr_pages, &user->locked_vm);
++}
++
++static inline int __io_account_mem(struct user_struct *user,
++ unsigned long nr_pages)
++{
++ unsigned long page_limit, cur_pages, new_pages;
++
++ /* Don't allow more pages than we can safely lock */
++ page_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
++
++ do {
++ cur_pages = atomic_long_read(&user->locked_vm);
++ new_pages = cur_pages + nr_pages;
++ if (new_pages > page_limit)
++ return -ENOMEM;
++ } while (atomic_long_cmpxchg(&user->locked_vm, cur_pages,
++ new_pages) != cur_pages);
++
++ return 0;
++}
++
++static void io_unaccount_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
++{
++ if (ctx->user)
++ __io_unaccount_mem(ctx->user, nr_pages);
++
++ if (ctx->mm_account)
++ atomic64_sub(nr_pages, &ctx->mm_account->pinned_vm);
++}
++
++static int io_account_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
++{
++ int ret;
++
++ if (ctx->user) {
++ ret = __io_account_mem(ctx->user, nr_pages);
++ if (ret)
++ return ret;
++ }
++
++ if (ctx->mm_account)
++ atomic64_add(nr_pages, &ctx->mm_account->pinned_vm);
++
++ return 0;
++}
++
++static void io_mem_free(void *ptr)
++{
++ struct page *page;
++
++ if (!ptr)
++ return;
++
++ page = virt_to_head_page(ptr);
++ if (put_page_testzero(page))
++ free_compound_page(page);
++}
++
++static void *io_mem_alloc(size_t size)
++{
++ gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
++
++ return (void *) __get_free_pages(gfp, get_order(size));
++}
++
++static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries,
++ size_t *sq_offset)
++{
++ struct io_rings *rings;
++ size_t off, sq_array_size;
++
++ off = struct_size(rings, cqes, cq_entries);
++ if (off == SIZE_MAX)
++ return SIZE_MAX;
++
++#ifdef CONFIG_SMP
++ off = ALIGN(off, SMP_CACHE_BYTES);
++ if (off == 0)
++ return SIZE_MAX;
++#endif
++
++ if (sq_offset)
++ *sq_offset = off;
++
++ sq_array_size = array_size(sizeof(u32), sq_entries);
++ if (sq_array_size == SIZE_MAX)
++ return SIZE_MAX;
++
++ if (check_add_overflow(off, sq_array_size, &off))
++ return SIZE_MAX;
++
++ return off;
++}
++
++static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf **slot)
++{
++ struct io_mapped_ubuf *imu = *slot;
++ unsigned int i;
++
++ if (imu != ctx->dummy_ubuf) {
++ for (i = 0; i < imu->nr_bvecs; i++)
++ unpin_user_page(imu->bvec[i].bv_page);
++ if (imu->acct_pages)
++ io_unaccount_mem(ctx, imu->acct_pages);
++ kvfree(imu);
++ }
++ *slot = NULL;
++}
++
++static void io_rsrc_buf_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
++{
++ io_buffer_unmap(ctx, &prsrc->buf);
++ prsrc->buf = NULL;
++}
++
++static void __io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
++{
++ unsigned int i;
++
++ for (i = 0; i < ctx->nr_user_bufs; i++)
++ io_buffer_unmap(ctx, &ctx->user_bufs[i]);
++ kfree(ctx->user_bufs);
++ io_rsrc_data_free(ctx->buf_data);
++ ctx->user_bufs = NULL;
++ ctx->buf_data = NULL;
++ ctx->nr_user_bufs = 0;
++}
++
++static int io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
++{
++ unsigned nr = ctx->nr_user_bufs;
++ int ret;
++
++ if (!ctx->buf_data)
++ return -ENXIO;
++
++ /*
++ * Quiesce may unlock ->uring_lock, and while it's not held
++ * prevent new requests using the table.
++ */
++ ctx->nr_user_bufs = 0;
++ ret = io_rsrc_ref_quiesce(ctx->buf_data, ctx);
++ ctx->nr_user_bufs = nr;
++ if (!ret)
++ __io_sqe_buffers_unregister(ctx);
++ return ret;
++}
++
++static int io_copy_iov(struct io_ring_ctx *ctx, struct iovec *dst,
++ void __user *arg, unsigned index)
++{
++ struct iovec __user *src;
++
++#ifdef CONFIG_COMPAT
++ if (ctx->compat) {
++ struct compat_iovec __user *ciovs;
++ struct compat_iovec ciov;
++
++ ciovs = (struct compat_iovec __user *) arg;
++ if (copy_from_user(&ciov, &ciovs[index], sizeof(ciov)))
++ return -EFAULT;
++
++ dst->iov_base = u64_to_user_ptr((u64)ciov.iov_base);
++ dst->iov_len = ciov.iov_len;
++ return 0;
++ }
++#endif
++ src = (struct iovec __user *) arg;
++ if (copy_from_user(dst, &src[index], sizeof(*dst)))
++ return -EFAULT;
++ return 0;
++}
++
++/*
++ * Not super efficient, but this is just a registration time. And we do cache
++ * the last compound head, so generally we'll only do a full search if we don't
++ * match that one.
++ *
++ * We check if the given compound head page has already been accounted, to
++ * avoid double accounting it. This allows us to account the full size of the
++ * page, not just the constituent pages of a huge page.
++ */
++static bool headpage_already_acct(struct io_ring_ctx *ctx, struct page **pages,
++ int nr_pages, struct page *hpage)
++{
++ int i, j;
++
++ /* check current page array */
++ for (i = 0; i < nr_pages; i++) {
++ if (!PageCompound(pages[i]))
++ continue;
++ if (compound_head(pages[i]) == hpage)
++ return true;
++ }
++
++ /* check previously registered pages */
++ for (i = 0; i < ctx->nr_user_bufs; i++) {
++ struct io_mapped_ubuf *imu = ctx->user_bufs[i];
++
++ for (j = 0; j < imu->nr_bvecs; j++) {
++ if (!PageCompound(imu->bvec[j].bv_page))
++ continue;
++ if (compound_head(imu->bvec[j].bv_page) == hpage)
++ return true;
++ }
++ }
++
++ return false;
++}
++
++static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
++ int nr_pages, struct io_mapped_ubuf *imu,
++ struct page **last_hpage)
++{
++ int i, ret;
++
++ imu->acct_pages = 0;
++ for (i = 0; i < nr_pages; i++) {
++ if (!PageCompound(pages[i])) {
++ imu->acct_pages++;
++ } else {
++ struct page *hpage;
++
++ hpage = compound_head(pages[i]);
++ if (hpage == *last_hpage)
++ continue;
++ *last_hpage = hpage;
++ if (headpage_already_acct(ctx, pages, i, hpage))
++ continue;
++ imu->acct_pages += page_size(hpage) >> PAGE_SHIFT;
++ }
++ }
++
++ if (!imu->acct_pages)
++ return 0;
++
++ ret = io_account_mem(ctx, imu->acct_pages);
++ if (ret)
++ imu->acct_pages = 0;
++ return ret;
++}
++
++static int io_sqe_buffer_register(struct io_ring_ctx *ctx, struct iovec *iov,
++ struct io_mapped_ubuf **pimu,
++ struct page **last_hpage)
++{
++ struct io_mapped_ubuf *imu = NULL;
++ struct vm_area_struct **vmas = NULL;
++ struct page **pages = NULL;
++ unsigned long off, start, end, ubuf;
++ size_t size;
++ int ret, pret, nr_pages, i;
++
++ if (!iov->iov_base) {
++ *pimu = ctx->dummy_ubuf;
++ return 0;
++ }
++
++ ubuf = (unsigned long) iov->iov_base;
++ end = (ubuf + iov->iov_len + PAGE_SIZE - 1) >> PAGE_SHIFT;
++ start = ubuf >> PAGE_SHIFT;
++ nr_pages = end - start;
++
++ *pimu = NULL;
++ ret = -ENOMEM;
++
++ pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL);
++ if (!pages)
++ goto done;
++
++ vmas = kvmalloc_array(nr_pages, sizeof(struct vm_area_struct *),
++ GFP_KERNEL);
++ if (!vmas)
++ goto done;
++
++ imu = kvmalloc(struct_size(imu, bvec, nr_pages), GFP_KERNEL);
++ if (!imu)
++ goto done;
++
++ ret = 0;
++ mmap_read_lock(current->mm);
++ pret = pin_user_pages(ubuf, nr_pages, FOLL_WRITE | FOLL_LONGTERM,
++ pages, vmas);
++ if (pret == nr_pages) {
++ /* don't support file backed memory */
++ for (i = 0; i < nr_pages; i++) {
++ struct vm_area_struct *vma = vmas[i];
++
++ if (vma_is_shmem(vma))
++ continue;
++ if (vma->vm_file &&
++ !is_file_hugepages(vma->vm_file)) {
++ ret = -EOPNOTSUPP;
++ break;
++ }
++ }
++ } else {
++ ret = pret < 0 ? pret : -EFAULT;
++ }
++ mmap_read_unlock(current->mm);
++ if (ret) {
++ /*
++ * if we did partial map, or found file backed vmas,
++ * release any pages we did get
++ */
++ if (pret > 0)
++ unpin_user_pages(pages, pret);
++ goto done;
++ }
++
++ ret = io_buffer_account_pin(ctx, pages, pret, imu, last_hpage);
++ if (ret) {
++ unpin_user_pages(pages, pret);
++ goto done;
++ }
++
++ off = ubuf & ~PAGE_MASK;
++ size = iov->iov_len;
++ for (i = 0; i < nr_pages; i++) {
++ size_t vec_len;
++
++ vec_len = min_t(size_t, size, PAGE_SIZE - off);
++ imu->bvec[i].bv_page = pages[i];
++ imu->bvec[i].bv_len = vec_len;
++ imu->bvec[i].bv_offset = off;
++ off = 0;
++ size -= vec_len;
++ }
++ /* store original address for later verification */
++ imu->ubuf = ubuf;
++ imu->ubuf_end = ubuf + iov->iov_len;
++ imu->nr_bvecs = nr_pages;
++ *pimu = imu;
++ ret = 0;
++done:
++ if (ret)
++ kvfree(imu);
++ kvfree(pages);
++ kvfree(vmas);
++ return ret;
++}
++
++static int io_buffers_map_alloc(struct io_ring_ctx *ctx, unsigned int nr_args)
++{
++ ctx->user_bufs = kcalloc(nr_args, sizeof(*ctx->user_bufs), GFP_KERNEL);
++ return ctx->user_bufs ? 0 : -ENOMEM;
++}
++
++static int io_buffer_validate(struct iovec *iov)
++{
++ unsigned long tmp, acct_len = iov->iov_len + (PAGE_SIZE - 1);
++
++ /*
++ * Don't impose further limits on the size and buffer
++ * constraints here, we'll -EINVAL later when IO is
++ * submitted if they are wrong.
++ */
++ if (!iov->iov_base)
++ return iov->iov_len ? -EFAULT : 0;
++ if (!iov->iov_len)
++ return -EFAULT;
++
++ /* arbitrary limit, but we need something */
++ if (iov->iov_len > SZ_1G)
++ return -EFAULT;
++
++ if (check_add_overflow((unsigned long)iov->iov_base, acct_len, &tmp))
++ return -EOVERFLOW;
++
++ return 0;
++}
++
++static int io_sqe_buffers_register(struct io_ring_ctx *ctx, void __user *arg,
++ unsigned int nr_args, u64 __user *tags)
++{
++ struct page *last_hpage = NULL;
++ struct io_rsrc_data *data;
++ int i, ret;
++ struct iovec iov;
++
++ if (ctx->user_bufs)
++ return -EBUSY;
++ if (!nr_args || nr_args > IORING_MAX_REG_BUFFERS)
++ return -EINVAL;
++ ret = io_rsrc_node_switch_start(ctx);
++ if (ret)
++ return ret;
++ ret = io_rsrc_data_alloc(ctx, io_rsrc_buf_put, tags, nr_args, &data);
++ if (ret)
++ return ret;
++ ret = io_buffers_map_alloc(ctx, nr_args);
++ if (ret) {
++ io_rsrc_data_free(data);
++ return ret;
++ }
++
++ for (i = 0; i < nr_args; i++, ctx->nr_user_bufs++) {
++ ret = io_copy_iov(ctx, &iov, arg, i);
++ if (ret)
++ break;
++ ret = io_buffer_validate(&iov);
++ if (ret)
++ break;
++ if (!iov.iov_base && *io_get_tag_slot(data, i)) {
++ ret = -EINVAL;
++ break;
++ }
++
++ ret = io_sqe_buffer_register(ctx, &iov, &ctx->user_bufs[i],
++ &last_hpage);
++ if (ret)
++ break;
++ }
++
++ WARN_ON_ONCE(ctx->buf_data);
++
++ ctx->buf_data = data;
++ if (ret)
++ __io_sqe_buffers_unregister(ctx);
++ else
++ io_rsrc_node_switch(ctx, NULL);
++ return ret;
++}
++
++static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
++ struct io_uring_rsrc_update2 *up,
++ unsigned int nr_args)
++{
++ u64 __user *tags = u64_to_user_ptr(up->tags);
++ struct iovec iov, __user *iovs = u64_to_user_ptr(up->data);
++ struct page *last_hpage = NULL;
++ bool needs_switch = false;
++ __u32 done;
++ int i, err;
++
++ if (!ctx->buf_data)
++ return -ENXIO;
++ if (up->offset + nr_args > ctx->nr_user_bufs)
++ return -EINVAL;
++
++ for (done = 0; done < nr_args; done++) {
++ struct io_mapped_ubuf *imu;
++ int offset = up->offset + done;
++ u64 tag = 0;
++
++ err = io_copy_iov(ctx, &iov, iovs, done);
++ if (err)
++ break;
++ if (tags && copy_from_user(&tag, &tags[done], sizeof(tag))) {
++ err = -EFAULT;
++ break;
++ }
++ err = io_buffer_validate(&iov);
++ if (err)
++ break;
++ if (!iov.iov_base && tag) {
++ err = -EINVAL;
++ break;
++ }
++ err = io_sqe_buffer_register(ctx, &iov, &imu, &last_hpage);
++ if (err)
++ break;
++
++ i = array_index_nospec(offset, ctx->nr_user_bufs);
++ if (ctx->user_bufs[i] != ctx->dummy_ubuf) {
++ err = io_queue_rsrc_removal(ctx->buf_data, i,
++ ctx->rsrc_node, ctx->user_bufs[i]);
++ if (unlikely(err)) {
++ io_buffer_unmap(ctx, &imu);
++ break;
++ }
++ ctx->user_bufs[i] = NULL;
++ needs_switch = true;
++ }
++
++ ctx->user_bufs[i] = imu;
++ *io_get_tag_slot(ctx->buf_data, offset) = tag;
++ }
++
++ if (needs_switch)
++ io_rsrc_node_switch(ctx, ctx->buf_data);
++ return done ? done : err;
++}
++
++static int io_eventfd_register(struct io_ring_ctx *ctx, void __user *arg,
++ unsigned int eventfd_async)
++{
++ struct io_ev_fd *ev_fd;
++ __s32 __user *fds = arg;
++ int fd;
++
++ ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
++ lockdep_is_held(&ctx->uring_lock));
++ if (ev_fd)
++ return -EBUSY;
++
++ if (copy_from_user(&fd, fds, sizeof(*fds)))
++ return -EFAULT;
++
++ ev_fd = kmalloc(sizeof(*ev_fd), GFP_KERNEL);
++ if (!ev_fd)
++ return -ENOMEM;
++
++ ev_fd->cq_ev_fd = eventfd_ctx_fdget(fd);
++ if (IS_ERR(ev_fd->cq_ev_fd)) {
++ int ret = PTR_ERR(ev_fd->cq_ev_fd);
++ kfree(ev_fd);
++ return ret;
++ }
++ ev_fd->eventfd_async = eventfd_async;
++ ctx->has_evfd = true;
++ rcu_assign_pointer(ctx->io_ev_fd, ev_fd);
++ return 0;
++}
++
++static void io_eventfd_put(struct rcu_head *rcu)
++{
++ struct io_ev_fd *ev_fd = container_of(rcu, struct io_ev_fd, rcu);
++
++ eventfd_ctx_put(ev_fd->cq_ev_fd);
++ kfree(ev_fd);
++}
++
++static int io_eventfd_unregister(struct io_ring_ctx *ctx)
++{
++ struct io_ev_fd *ev_fd;
++
++ ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
++ lockdep_is_held(&ctx->uring_lock));
++ if (ev_fd) {
++ ctx->has_evfd = false;
++ rcu_assign_pointer(ctx->io_ev_fd, NULL);
++ call_rcu(&ev_fd->rcu, io_eventfd_put);
++ return 0;
++ }
++
++ return -ENXIO;
++}
++
++static void io_destroy_buffers(struct io_ring_ctx *ctx)
++{
++ int i;
++
++ for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++) {
++ struct list_head *list = &ctx->io_buffers[i];
++
++ while (!list_empty(list)) {
++ struct io_buffer_list *bl;
++
++ bl = list_first_entry(list, struct io_buffer_list, list);
++ __io_remove_buffers(ctx, bl, -1U);
++ list_del(&bl->list);
++ kfree(bl);
++ }
++ }
++
++ while (!list_empty(&ctx->io_buffers_pages)) {
++ struct page *page;
++
++ page = list_first_entry(&ctx->io_buffers_pages, struct page, lru);
++ list_del_init(&page->lru);
++ __free_page(page);
++ }
++}
++
++static void io_req_caches_free(struct io_ring_ctx *ctx)
++{
++ struct io_submit_state *state = &ctx->submit_state;
++ int nr = 0;
++
++ mutex_lock(&ctx->uring_lock);
++ io_flush_cached_locked_reqs(ctx, state);
++
++ while (state->free_list.next) {
++ struct io_wq_work_node *node;
++ struct io_kiocb *req;
++
++ node = wq_stack_extract(&state->free_list);
++ req = container_of(node, struct io_kiocb, comp_list);
++ kmem_cache_free(req_cachep, req);
++ nr++;
++ }
++ if (nr)
++ percpu_ref_put_many(&ctx->refs, nr);
++ mutex_unlock(&ctx->uring_lock);
++}
++
++static void io_wait_rsrc_data(struct io_rsrc_data *data)
++{
++ if (data && !atomic_dec_and_test(&data->refs))
++ wait_for_completion(&data->done);
++}
++
++static void io_flush_apoll_cache(struct io_ring_ctx *ctx)
++{
++ struct async_poll *apoll;
++
++ while (!list_empty(&ctx->apoll_cache)) {
++ apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
++ poll.wait.entry);
++ list_del(&apoll->poll.wait.entry);
++ kfree(apoll);
++ }
++}
++
++static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
++{
++ io_sq_thread_finish(ctx);
++
++ if (ctx->mm_account) {
++ mmdrop(ctx->mm_account);
++ ctx->mm_account = NULL;
++ }
++
++ io_rsrc_refs_drop(ctx);
++ /* __io_rsrc_put_work() may need uring_lock to progress, wait w/o it */
++ io_wait_rsrc_data(ctx->buf_data);
++ io_wait_rsrc_data(ctx->file_data);
++
++ mutex_lock(&ctx->uring_lock);
++ if (ctx->buf_data)
++ __io_sqe_buffers_unregister(ctx);
++ if (ctx->file_data)
++ __io_sqe_files_unregister(ctx);
++ if (ctx->rings)
++ __io_cqring_overflow_flush(ctx, true);
++ io_eventfd_unregister(ctx);
++ io_flush_apoll_cache(ctx);
++ mutex_unlock(&ctx->uring_lock);
++ io_destroy_buffers(ctx);
++ if (ctx->sq_creds)
++ put_cred(ctx->sq_creds);
++
++ /* there are no registered resources left, nobody uses it */
++ if (ctx->rsrc_node)
++ io_rsrc_node_destroy(ctx->rsrc_node);
++ if (ctx->rsrc_backup_node)
++ io_rsrc_node_destroy(ctx->rsrc_backup_node);
++ flush_delayed_work(&ctx->rsrc_put_work);
++ flush_delayed_work(&ctx->fallback_work);
++
++ WARN_ON_ONCE(!list_empty(&ctx->rsrc_ref_list));
++ WARN_ON_ONCE(!llist_empty(&ctx->rsrc_put_llist));
++
++#if defined(CONFIG_UNIX)
++ if (ctx->ring_sock) {
++ ctx->ring_sock->file = NULL; /* so that iput() is called */
++ sock_release(ctx->ring_sock);
++ }
++#endif
++ WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
++
++ io_mem_free(ctx->rings);
++ io_mem_free(ctx->sq_sqes);
++
++ percpu_ref_exit(&ctx->refs);
++ free_uid(ctx->user);
++ io_req_caches_free(ctx);
++ if (ctx->hash_map)
++ io_wq_put_hash(ctx->hash_map);
++ kfree(ctx->cancel_hash);
++ kfree(ctx->dummy_ubuf);
++ kfree(ctx->io_buffers);
++ kfree(ctx);
++}
++
++static __poll_t io_uring_poll(struct file *file, poll_table *wait)
++{
++ struct io_ring_ctx *ctx = file->private_data;
++ __poll_t mask = 0;
++
++ poll_wait(file, &ctx->cq_wait, wait);
++ /*
++ * synchronizes with barrier from wq_has_sleeper call in
++ * io_commit_cqring
++ */
++ smp_rmb();
++ if (!io_sqring_full(ctx))
++ mask |= EPOLLOUT | EPOLLWRNORM;
++
++ /*
++ * Don't flush cqring overflow list here, just do a simple check.
++ * Otherwise there could possible be ABBA deadlock:
++ * CPU0 CPU1
++ * ---- ----
++ * lock(&ctx->uring_lock);
++ * lock(&ep->mtx);
++ * lock(&ctx->uring_lock);
++ * lock(&ep->mtx);
++ *
++ * Users may get EPOLLIN meanwhile seeing nothing in cqring, this
++ * pushs them to do the flush.
++ */
++ if (io_cqring_events(ctx) || test_bit(0, &ctx->check_cq_overflow))
++ mask |= EPOLLIN | EPOLLRDNORM;
++
++ return mask;
++}
++
++static int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id)
++{
++ const struct cred *creds;
++
++ creds = xa_erase(&ctx->personalities, id);
++ if (creds) {
++ put_cred(creds);
++ return 0;
++ }
++
++ return -EINVAL;
++}
++
++struct io_tctx_exit {
++ struct callback_head task_work;
++ struct completion completion;
++ struct io_ring_ctx *ctx;
++};
++
++static __cold void io_tctx_exit_cb(struct callback_head *cb)
++{
++ struct io_uring_task *tctx = current->io_uring;
++ struct io_tctx_exit *work;
++
++ work = container_of(cb, struct io_tctx_exit, task_work);
++ /*
++ * When @in_idle, we're in cancellation and it's racy to remove the
++ * node. It'll be removed by the end of cancellation, just ignore it.
++ */
++ if (!atomic_read(&tctx->in_idle))
++ io_uring_del_tctx_node((unsigned long)work->ctx);
++ complete(&work->completion);
++}
++
++static __cold bool io_cancel_ctx_cb(struct io_wq_work *work, void *data)
++{
++ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
++
++ return req->ctx == data;
++}
++
++static __cold void io_ring_exit_work(struct work_struct *work)
++{
++ struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx, exit_work);
++ unsigned long timeout = jiffies + HZ * 60 * 5;
++ unsigned long interval = HZ / 20;
++ struct io_tctx_exit exit;
++ struct io_tctx_node *node;
++ int ret;
++
++ /*
++ * If we're doing polled IO and end up having requests being
++ * submitted async (out-of-line), then completions can come in while
++ * we're waiting for refs to drop. We need to reap these manually,
++ * as nobody else will be looking for them.
++ */
++ do {
++ io_uring_try_cancel_requests(ctx, NULL, true);
++ if (ctx->sq_data) {
++ struct io_sq_data *sqd = ctx->sq_data;
++ struct task_struct *tsk;
++
++ io_sq_thread_park(sqd);
++ tsk = sqd->thread;
++ if (tsk && tsk->io_uring && tsk->io_uring->io_wq)
++ io_wq_cancel_cb(tsk->io_uring->io_wq,
++ io_cancel_ctx_cb, ctx, true);
++ io_sq_thread_unpark(sqd);
++ }
++
++ io_req_caches_free(ctx);
++
++ if (WARN_ON_ONCE(time_after(jiffies, timeout))) {
++ /* there is little hope left, don't run it too often */
++ interval = HZ * 60;
++ }
++ } while (!wait_for_completion_timeout(&ctx->ref_comp, interval));
++
++ init_completion(&exit.completion);
++ init_task_work(&exit.task_work, io_tctx_exit_cb);
++ exit.ctx = ctx;
++ /*
++ * Some may use context even when all refs and requests have been put,
++ * and they are free to do so while still holding uring_lock or
++ * completion_lock, see io_req_task_submit(). Apart from other work,
++ * this lock/unlock section also waits them to finish.
++ */
++ mutex_lock(&ctx->uring_lock);
++ while (!list_empty(&ctx->tctx_list)) {
++ WARN_ON_ONCE(time_after(jiffies, timeout));
++
++ node = list_first_entry(&ctx->tctx_list, struct io_tctx_node,
++ ctx_node);
++ /* don't spin on a single task if cancellation failed */
++ list_rotate_left(&ctx->tctx_list);
++ ret = task_work_add(node->task, &exit.task_work, TWA_SIGNAL);
++ if (WARN_ON_ONCE(ret))
++ continue;
++
++ mutex_unlock(&ctx->uring_lock);
++ wait_for_completion(&exit.completion);
++ mutex_lock(&ctx->uring_lock);
++ }
++ mutex_unlock(&ctx->uring_lock);
++ spin_lock(&ctx->completion_lock);
++ spin_unlock(&ctx->completion_lock);
++
++ io_ring_ctx_free(ctx);
++}
++
++/* Returns true if we found and killed one or more timeouts */
++static __cold bool io_kill_timeouts(struct io_ring_ctx *ctx,
++ struct task_struct *tsk, bool cancel_all)
++{
++ struct io_kiocb *req, *tmp;
++ int canceled = 0;
++
++ spin_lock(&ctx->completion_lock);
++ spin_lock_irq(&ctx->timeout_lock);
++ list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
++ if (io_match_task(req, tsk, cancel_all)) {
++ io_kill_timeout(req, -ECANCELED);
++ canceled++;
++ }
++ }
++ spin_unlock_irq(&ctx->timeout_lock);
++ if (canceled != 0)
++ io_commit_cqring(ctx);
++ spin_unlock(&ctx->completion_lock);
++ if (canceled != 0)
++ io_cqring_ev_posted(ctx);
++ return canceled != 0;
++}
++
++static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
++{
++ unsigned long index;
++ struct creds *creds;
++
++ mutex_lock(&ctx->uring_lock);
++ percpu_ref_kill(&ctx->refs);
++ if (ctx->rings)
++ __io_cqring_overflow_flush(ctx, true);
++ xa_for_each(&ctx->personalities, index, creds)
++ io_unregister_personality(ctx, index);
++ mutex_unlock(&ctx->uring_lock);
++
++ io_kill_timeouts(ctx, NULL, true);
++ io_poll_remove_all(ctx, NULL, true);
++
++ /* if we failed setting up the ctx, we might not have any rings */
++ io_iopoll_try_reap_events(ctx);
++
++ INIT_WORK(&ctx->exit_work, io_ring_exit_work);
++ /*
++ * Use system_unbound_wq to avoid spawning tons of event kworkers
++ * if we're exiting a ton of rings at the same time. It just adds
++ * noise and overhead, there's no discernable change in runtime
++ * over using system_wq.
++ */
++ queue_work(system_unbound_wq, &ctx->exit_work);
++}
++
++static int io_uring_release(struct inode *inode, struct file *file)
++{
++ struct io_ring_ctx *ctx = file->private_data;
++
++ file->private_data = NULL;
++ io_ring_ctx_wait_and_kill(ctx);
++ return 0;
++}
++
++struct io_task_cancel {
++ struct task_struct *task;
++ bool all;
++};
++
++static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
++{
++ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
++ struct io_task_cancel *cancel = data;
++
++ return io_match_task_safe(req, cancel->task, cancel->all);
++}
++
++static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
++ struct task_struct *task,
++ bool cancel_all)
++{
++ struct io_defer_entry *de;
++ LIST_HEAD(list);
++
++ spin_lock(&ctx->completion_lock);
++ list_for_each_entry_reverse(de, &ctx->defer_list, list) {
++ if (io_match_task_safe(de->req, task, cancel_all)) {
++ list_cut_position(&list, &ctx->defer_list, &de->list);
++ break;
++ }
++ }
++ spin_unlock(&ctx->completion_lock);
++ if (list_empty(&list))
++ return false;
++
++ while (!list_empty(&list)) {
++ de = list_first_entry(&list, struct io_defer_entry, list);
++ list_del_init(&de->list);
++ io_req_complete_failed(de->req, -ECANCELED);
++ kfree(de);
++ }
++ return true;
++}
++
++static __cold bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
++{
++ struct io_tctx_node *node;
++ enum io_wq_cancel cret;
++ bool ret = false;
++
++ mutex_lock(&ctx->uring_lock);
++ list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
++ struct io_uring_task *tctx = node->task->io_uring;
++
++ /*
++ * io_wq will stay alive while we hold uring_lock, because it's
++ * killed after ctx nodes, which requires to take the lock.
++ */
++ if (!tctx || !tctx->io_wq)
++ continue;
++ cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
++ ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
++ }
++ mutex_unlock(&ctx->uring_lock);
++
++ return ret;
++}
++
++static __cold void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
++ struct task_struct *task,
++ bool cancel_all)
++{
++ struct io_task_cancel cancel = { .task = task, .all = cancel_all, };
++ struct io_uring_task *tctx = task ? task->io_uring : NULL;
++
++ while (1) {
++ enum io_wq_cancel cret;
++ bool ret = false;
++
++ if (!task) {
++ ret |= io_uring_try_cancel_iowq(ctx);
++ } else if (tctx && tctx->io_wq) {
++ /*
++ * Cancels requests of all rings, not only @ctx, but
++ * it's fine as the task is in exit/exec.
++ */
++ cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_task_cb,
++ &cancel, true);
++ ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
++ }
++
++ /* SQPOLL thread does its own polling */
++ if ((!(ctx->flags & IORING_SETUP_SQPOLL) && cancel_all) ||
++ (ctx->sq_data && ctx->sq_data->thread == current)) {
++ while (!wq_list_empty(&ctx->iopoll_list)) {
++ io_iopoll_try_reap_events(ctx);
++ ret = true;
++ }
++ }
++
++ ret |= io_cancel_defer_files(ctx, task, cancel_all);
++ ret |= io_poll_remove_all(ctx, task, cancel_all);
++ ret |= io_kill_timeouts(ctx, task, cancel_all);
++ if (task)
++ ret |= io_run_task_work();
++ if (!ret)
++ break;
++ cond_resched();
++ }
++}
++
++static int __io_uring_add_tctx_node(struct io_ring_ctx *ctx)
++{
++ struct io_uring_task *tctx = current->io_uring;
++ struct io_tctx_node *node;
++ int ret;
++
++ if (unlikely(!tctx)) {
++ ret = io_uring_alloc_task_context(current, ctx);
++ if (unlikely(ret))
++ return ret;
++
++ tctx = current->io_uring;
++ if (ctx->iowq_limits_set) {
++ unsigned int limits[2] = { ctx->iowq_limits[0],
++ ctx->iowq_limits[1], };
++
++ ret = io_wq_max_workers(tctx->io_wq, limits);
++ if (ret)
++ return ret;
++ }
++ }
++ if (!xa_load(&tctx->xa, (unsigned long)ctx)) {
++ node = kmalloc(sizeof(*node), GFP_KERNEL);
++ if (!node)
++ return -ENOMEM;
++ node->ctx = ctx;
++ node->task = current;
++
++ ret = xa_err(xa_store(&tctx->xa, (unsigned long)ctx,
++ node, GFP_KERNEL));
++ if (ret) {
++ kfree(node);
++ return ret;
++ }
++
++ mutex_lock(&ctx->uring_lock);
++ list_add(&node->ctx_node, &ctx->tctx_list);
++ mutex_unlock(&ctx->uring_lock);
++ }
++ tctx->last = ctx;
++ return 0;
++}
++
++/*
++ * Note that this task has used io_uring. We use it for cancelation purposes.
++ */
++static inline int io_uring_add_tctx_node(struct io_ring_ctx *ctx)
++{
++ struct io_uring_task *tctx = current->io_uring;
++
++ if (likely(tctx && tctx->last == ctx))
++ return 0;
++ return __io_uring_add_tctx_node(ctx);
++}
++
++/*
++ * Remove this io_uring_file -> task mapping.
++ */
++static __cold void io_uring_del_tctx_node(unsigned long index)
++{
++ struct io_uring_task *tctx = current->io_uring;
++ struct io_tctx_node *node;
++
++ if (!tctx)
++ return;
++ node = xa_erase(&tctx->xa, index);
++ if (!node)
++ return;
++
++ WARN_ON_ONCE(current != node->task);
++ WARN_ON_ONCE(list_empty(&node->ctx_node));
++
++ mutex_lock(&node->ctx->uring_lock);
++ list_del(&node->ctx_node);
++ mutex_unlock(&node->ctx->uring_lock);
++
++ if (tctx->last == node->ctx)
++ tctx->last = NULL;
++ kfree(node);
++}
++
++static __cold void io_uring_clean_tctx(struct io_uring_task *tctx)
++{
++ struct io_wq *wq = tctx->io_wq;
++ struct io_tctx_node *node;
++ unsigned long index;
++
++ xa_for_each(&tctx->xa, index, node) {
++ io_uring_del_tctx_node(index);
++ cond_resched();
++ }
++ if (wq) {
++ /*
++ * Must be after io_uring_del_tctx_node() (removes nodes under
++ * uring_lock) to avoid race with io_uring_try_cancel_iowq().
++ */
++ io_wq_put_and_exit(wq);
++ tctx->io_wq = NULL;
++ }
++}
++
++static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
++{
++ if (tracked)
++ return atomic_read(&tctx->inflight_tracked);
++ return percpu_counter_sum(&tctx->inflight);
++}
++
++/*
++ * Find any io_uring ctx that this task has registered or done IO on, and cancel
++ * requests. @sqd should be not-null IFF it's an SQPOLL thread cancellation.
++ */
++static __cold void io_uring_cancel_generic(bool cancel_all,
++ struct io_sq_data *sqd)
++{
++ struct io_uring_task *tctx = current->io_uring;
++ struct io_ring_ctx *ctx;
++ s64 inflight;
++ DEFINE_WAIT(wait);
++
++ WARN_ON_ONCE(sqd && sqd->thread != current);
++
++ if (!current->io_uring)
++ return;
++ if (tctx->io_wq)
++ io_wq_exit_start(tctx->io_wq);
++
++ atomic_inc(&tctx->in_idle);
++ do {
++ io_uring_drop_tctx_refs(current);
++ /* read completions before cancelations */
++ inflight = tctx_inflight(tctx, !cancel_all);
++ if (!inflight)
++ break;
++
++ if (!sqd) {
++ struct io_tctx_node *node;
++ unsigned long index;
++
++ xa_for_each(&tctx->xa, index, node) {
++ /* sqpoll task will cancel all its requests */
++ if (node->ctx->sq_data)
++ continue;
++ io_uring_try_cancel_requests(node->ctx, current,
++ cancel_all);
++ }
++ } else {
++ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
++ io_uring_try_cancel_requests(ctx, current,
++ cancel_all);
++ }
++
++ prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
++ io_run_task_work();
++ io_uring_drop_tctx_refs(current);
++
++ /*
++ * If we've seen completions, retry without waiting. This
++ * avoids a race where a completion comes in before we did
++ * prepare_to_wait().
++ */
++ if (inflight == tctx_inflight(tctx, !cancel_all))
++ schedule();
++ finish_wait(&tctx->wait, &wait);
++ } while (1);
++
++ io_uring_clean_tctx(tctx);
++ if (cancel_all) {
++ /*
++ * We shouldn't run task_works after cancel, so just leave
++ * ->in_idle set for normal exit.
++ */
++ atomic_dec(&tctx->in_idle);
++ /* for exec all current's requests should be gone, kill tctx */
++ __io_uring_free(current);
++ }
++}
++
++void __io_uring_cancel(bool cancel_all)
++{
++ io_uring_cancel_generic(cancel_all, NULL);
++}
++
++void io_uring_unreg_ringfd(void)
++{
++ struct io_uring_task *tctx = current->io_uring;
++ int i;
++
++ for (i = 0; i < IO_RINGFD_REG_MAX; i++) {
++ if (tctx->registered_rings[i]) {
++ fput(tctx->registered_rings[i]);
++ tctx->registered_rings[i] = NULL;
++ }
++ }
++}
++
++static int io_ring_add_registered_fd(struct io_uring_task *tctx, int fd,
++ int start, int end)
++{
++ struct file *file;
++ int offset;
++
++ for (offset = start; offset < end; offset++) {
++ offset = array_index_nospec(offset, IO_RINGFD_REG_MAX);
++ if (tctx->registered_rings[offset])
++ continue;
++
++ file = fget(fd);
++ if (!file) {
++ return -EBADF;
++ } else if (file->f_op != &io_uring_fops) {
++ fput(file);
++ return -EOPNOTSUPP;
++ }
++ tctx->registered_rings[offset] = file;
++ return offset;
++ }
++
++ return -EBUSY;
++}
++
++/*
++ * Register a ring fd to avoid fdget/fdput for each io_uring_enter()
++ * invocation. User passes in an array of struct io_uring_rsrc_update
++ * with ->data set to the ring_fd, and ->offset given for the desired
++ * index. If no index is desired, application may set ->offset == -1U
++ * and we'll find an available index. Returns number of entries
++ * successfully processed, or < 0 on error if none were processed.
++ */
++static int io_ringfd_register(struct io_ring_ctx *ctx, void __user *__arg,
++ unsigned nr_args)
++{
++ struct io_uring_rsrc_update __user *arg = __arg;
++ struct io_uring_rsrc_update reg;
++ struct io_uring_task *tctx;
++ int ret, i;
++
++ if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
++ return -EINVAL;
++
++ mutex_unlock(&ctx->uring_lock);
++ ret = io_uring_add_tctx_node(ctx);
++ mutex_lock(&ctx->uring_lock);
++ if (ret)
++ return ret;
++
++ tctx = current->io_uring;
++ for (i = 0; i < nr_args; i++) {
++ int start, end;
++
++ if (copy_from_user(®, &arg[i], sizeof(reg))) {
++ ret = -EFAULT;
++ break;
++ }
++
++ if (reg.resv) {
++ ret = -EINVAL;
++ break;
++ }
++
++ if (reg.offset == -1U) {
++ start = 0;
++ end = IO_RINGFD_REG_MAX;
++ } else {
++ if (reg.offset >= IO_RINGFD_REG_MAX) {
++ ret = -EINVAL;
++ break;
++ }
++ start = reg.offset;
++ end = start + 1;
++ }
++
++ ret = io_ring_add_registered_fd(tctx, reg.data, start, end);
++ if (ret < 0)
++ break;
++
++ reg.offset = ret;
++ if (copy_to_user(&arg[i], ®, sizeof(reg))) {
++ fput(tctx->registered_rings[reg.offset]);
++ tctx->registered_rings[reg.offset] = NULL;
++ ret = -EFAULT;
++ break;
++ }
++ }
++
++ return i ? i : ret;
++}
++
++static int io_ringfd_unregister(struct io_ring_ctx *ctx, void __user *__arg,
++ unsigned nr_args)
++{
++ struct io_uring_rsrc_update __user *arg = __arg;
++ struct io_uring_task *tctx = current->io_uring;
++ struct io_uring_rsrc_update reg;
++ int ret = 0, i;
++
++ if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
++ return -EINVAL;
++ if (!tctx)
++ return 0;
++
++ for (i = 0; i < nr_args; i++) {
++ if (copy_from_user(®, &arg[i], sizeof(reg))) {
++ ret = -EFAULT;
++ break;
++ }
++ if (reg.resv || reg.data || reg.offset >= IO_RINGFD_REG_MAX) {
++ ret = -EINVAL;
++ break;
++ }
++
++ reg.offset = array_index_nospec(reg.offset, IO_RINGFD_REG_MAX);
++ if (tctx->registered_rings[reg.offset]) {
++ fput(tctx->registered_rings[reg.offset]);
++ tctx->registered_rings[reg.offset] = NULL;
++ }
++ }
++
++ return i ? i : ret;
++}
++
++static void *io_uring_validate_mmap_request(struct file *file,
++ loff_t pgoff, size_t sz)
++{
++ struct io_ring_ctx *ctx = file->private_data;
++ loff_t offset = pgoff << PAGE_SHIFT;
++ struct page *page;
++ void *ptr;
++
++ switch (offset) {
++ case IORING_OFF_SQ_RING:
++ case IORING_OFF_CQ_RING:
++ ptr = ctx->rings;
++ break;
++ case IORING_OFF_SQES:
++ ptr = ctx->sq_sqes;
++ break;
++ default:
++ return ERR_PTR(-EINVAL);
++ }
++
++ page = virt_to_head_page(ptr);
++ if (sz > page_size(page))
++ return ERR_PTR(-EINVAL);
++
++ return ptr;
++}
++
++#ifdef CONFIG_MMU
++
++static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
++{
++ size_t sz = vma->vm_end - vma->vm_start;
++ unsigned long pfn;
++ void *ptr;
++
++ ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff, sz);
++ if (IS_ERR(ptr))
++ return PTR_ERR(ptr);
++
++ pfn = virt_to_phys(ptr) >> PAGE_SHIFT;
++ return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
++}
++
++#else /* !CONFIG_MMU */
++
++static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
++{
++ return vma->vm_flags & (VM_SHARED | VM_MAYSHARE) ? 0 : -EINVAL;
++}
++
++static unsigned int io_uring_nommu_mmap_capabilities(struct file *file)
++{
++ return NOMMU_MAP_DIRECT | NOMMU_MAP_READ | NOMMU_MAP_WRITE;
++}
++
++static unsigned long io_uring_nommu_get_unmapped_area(struct file *file,
++ unsigned long addr, unsigned long len,
++ unsigned long pgoff, unsigned long flags)
++{
++ void *ptr;
++
++ ptr = io_uring_validate_mmap_request(file, pgoff, len);
++ if (IS_ERR(ptr))
++ return PTR_ERR(ptr);
++
++ return (unsigned long) ptr;
++}
++
++#endif /* !CONFIG_MMU */
++
++static int io_sqpoll_wait_sq(struct io_ring_ctx *ctx)
++{
++ DEFINE_WAIT(wait);
++
++ do {
++ if (!io_sqring_full(ctx))
++ break;
++ prepare_to_wait(&ctx->sqo_sq_wait, &wait, TASK_INTERRUPTIBLE);
++
++ if (!io_sqring_full(ctx))
++ break;
++ schedule();
++ } while (!signal_pending(current));
++
++ finish_wait(&ctx->sqo_sq_wait, &wait);
++ return 0;
++}
++
++static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz,
++ struct __kernel_timespec __user **ts,
++ const sigset_t __user **sig)
++{
++ struct io_uring_getevents_arg arg;
++
++ /*
++ * If EXT_ARG isn't set, then we have no timespec and the argp pointer
++ * is just a pointer to the sigset_t.
++ */
++ if (!(flags & IORING_ENTER_EXT_ARG)) {
++ *sig = (const sigset_t __user *) argp;
++ *ts = NULL;
++ return 0;
++ }
++
++ /*
++ * EXT_ARG is set - ensure we agree on the size of it and copy in our
++ * timespec and sigset_t pointers if good.
++ */
++ if (*argsz != sizeof(arg))
++ return -EINVAL;
++ if (copy_from_user(&arg, argp, sizeof(arg)))
++ return -EFAULT;
++ if (arg.pad)
++ return -EINVAL;
++ *sig = u64_to_user_ptr(arg.sigmask);
++ *argsz = arg.sigmask_sz;
++ *ts = u64_to_user_ptr(arg.ts);
++ return 0;
++}
++
++SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
++ u32, min_complete, u32, flags, const void __user *, argp,
++ size_t, argsz)
++{
++ struct io_ring_ctx *ctx;
++ int submitted = 0;
++ struct fd f;
++ long ret;
++
++ io_run_task_work();
++
++ if (unlikely(flags & ~(IORING_ENTER_GETEVENTS | IORING_ENTER_SQ_WAKEUP |
++ IORING_ENTER_SQ_WAIT | IORING_ENTER_EXT_ARG |
++ IORING_ENTER_REGISTERED_RING)))
++ return -EINVAL;
++
++ /*
++ * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
++ * need only dereference our task private array to find it.
++ */
++ if (flags & IORING_ENTER_REGISTERED_RING) {
++ struct io_uring_task *tctx = current->io_uring;
++
++ if (!tctx || fd >= IO_RINGFD_REG_MAX)
++ return -EINVAL;
++ fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
++ f.file = tctx->registered_rings[fd];
++ if (unlikely(!f.file))
++ return -EBADF;
++ } else {
++ f = fdget(fd);
++ if (unlikely(!f.file))
++ return -EBADF;
++ }
++
++ ret = -EOPNOTSUPP;
++ if (unlikely(f.file->f_op != &io_uring_fops))
++ goto out_fput;
++
++ ret = -ENXIO;
++ ctx = f.file->private_data;
++ if (unlikely(!percpu_ref_tryget(&ctx->refs)))
++ goto out_fput;
++
++ ret = -EBADFD;
++ if (unlikely(ctx->flags & IORING_SETUP_R_DISABLED))
++ goto out;
++
++ /*
++ * For SQ polling, the thread will do all submissions and completions.
++ * Just return the requested submit count, and wake the thread if
++ * we were asked to.
++ */
++ ret = 0;
++ if (ctx->flags & IORING_SETUP_SQPOLL) {
++ io_cqring_overflow_flush(ctx);
++
++ if (unlikely(ctx->sq_data->thread == NULL)) {
++ ret = -EOWNERDEAD;
++ goto out;
++ }
++ if (flags & IORING_ENTER_SQ_WAKEUP)
++ wake_up(&ctx->sq_data->wait);
++ if (flags & IORING_ENTER_SQ_WAIT) {
++ ret = io_sqpoll_wait_sq(ctx);
++ if (ret)
++ goto out;
++ }
++ submitted = to_submit;
++ } else if (to_submit) {
++ ret = io_uring_add_tctx_node(ctx);
++ if (unlikely(ret))
++ goto out;
++ mutex_lock(&ctx->uring_lock);
++ submitted = io_submit_sqes(ctx, to_submit);
++ mutex_unlock(&ctx->uring_lock);
++
++ if (submitted != to_submit)
++ goto out;
++ }
++ if (flags & IORING_ENTER_GETEVENTS) {
++ const sigset_t __user *sig;
++ struct __kernel_timespec __user *ts;
++
++ ret = io_get_ext_arg(flags, argp, &argsz, &ts, &sig);
++ if (unlikely(ret))
++ goto out;
++
++ min_complete = min(min_complete, ctx->cq_entries);
++
++ /*
++ * When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, user
++ * space applications don't need to do io completion events
++ * polling again, they can rely on io_sq_thread to do polling
++ * work, which can reduce cpu usage and uring_lock contention.
++ */
++ if (ctx->flags & IORING_SETUP_IOPOLL &&
++ !(ctx->flags & IORING_SETUP_SQPOLL)) {
++ ret = io_iopoll_check(ctx, min_complete);
++ } else {
++ ret = io_cqring_wait(ctx, min_complete, sig, argsz, ts);
++ }
++ }
++
++out:
++ percpu_ref_put(&ctx->refs);
++out_fput:
++ if (!(flags & IORING_ENTER_REGISTERED_RING))
++ fdput(f);
++ return submitted ? submitted : ret;
++}
++
++#ifdef CONFIG_PROC_FS
++static __cold int io_uring_show_cred(struct seq_file *m, unsigned int id,
++ const struct cred *cred)
++{
++ struct user_namespace *uns = seq_user_ns(m);
++ struct group_info *gi;
++ kernel_cap_t cap;
++ unsigned __capi;
++ int g;
++
++ seq_printf(m, "%5d\n", id);
++ seq_put_decimal_ull(m, "\tUid:\t", from_kuid_munged(uns, cred->uid));
++ seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->euid));
++ seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->suid));
++ seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->fsuid));
++ seq_put_decimal_ull(m, "\n\tGid:\t", from_kgid_munged(uns, cred->gid));
++ seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->egid));
++ seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->sgid));
++ seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->fsgid));
++ seq_puts(m, "\n\tGroups:\t");
++ gi = cred->group_info;
++ for (g = 0; g < gi->ngroups; g++) {
++ seq_put_decimal_ull(m, g ? " " : "",
++ from_kgid_munged(uns, gi->gid[g]));
++ }
++ seq_puts(m, "\n\tCapEff:\t");
++ cap = cred->cap_effective;
++ CAP_FOR_EACH_U32(__capi)
++ seq_put_hex_ll(m, NULL, cap.cap[CAP_LAST_U32 - __capi], 8);
++ seq_putc(m, '\n');
++ return 0;
++}
++
++static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
++ struct seq_file *m)
++{
++ struct io_sq_data *sq = NULL;
++ struct io_overflow_cqe *ocqe;
++ struct io_rings *r = ctx->rings;
++ unsigned int sq_mask = ctx->sq_entries - 1, cq_mask = ctx->cq_entries - 1;
++ unsigned int sq_head = READ_ONCE(r->sq.head);
++ unsigned int sq_tail = READ_ONCE(r->sq.tail);
++ unsigned int cq_head = READ_ONCE(r->cq.head);
++ unsigned int cq_tail = READ_ONCE(r->cq.tail);
++ unsigned int sq_entries, cq_entries;
++ bool has_lock;
++ unsigned int i;
++
++ /*
++ * we may get imprecise sqe and cqe info if uring is actively running
++ * since we get cached_sq_head and cached_cq_tail without uring_lock
++ * and sq_tail and cq_head are changed by userspace. But it's ok since
++ * we usually use these info when it is stuck.
++ */
++ seq_printf(m, "SqMask:\t0x%x\n", sq_mask);
++ seq_printf(m, "SqHead:\t%u\n", sq_head);
++ seq_printf(m, "SqTail:\t%u\n", sq_tail);
++ seq_printf(m, "CachedSqHead:\t%u\n", ctx->cached_sq_head);
++ seq_printf(m, "CqMask:\t0x%x\n", cq_mask);
++ seq_printf(m, "CqHead:\t%u\n", cq_head);
++ seq_printf(m, "CqTail:\t%u\n", cq_tail);
++ seq_printf(m, "CachedCqTail:\t%u\n", ctx->cached_cq_tail);
++ seq_printf(m, "SQEs:\t%u\n", sq_tail - ctx->cached_sq_head);
++ sq_entries = min(sq_tail - sq_head, ctx->sq_entries);
++ for (i = 0; i < sq_entries; i++) {
++ unsigned int entry = i + sq_head;
++ unsigned int sq_idx = READ_ONCE(ctx->sq_array[entry & sq_mask]);
++ struct io_uring_sqe *sqe;
++
++ if (sq_idx > sq_mask)
++ continue;
++ sqe = &ctx->sq_sqes[sq_idx];
++ seq_printf(m, "%5u: opcode:%d, fd:%d, flags:%x, user_data:%llu\n",
++ sq_idx, sqe->opcode, sqe->fd, sqe->flags,
++ sqe->user_data);
++ }
++ seq_printf(m, "CQEs:\t%u\n", cq_tail - cq_head);
++ cq_entries = min(cq_tail - cq_head, ctx->cq_entries);
++ for (i = 0; i < cq_entries; i++) {
++ unsigned int entry = i + cq_head;
++ struct io_uring_cqe *cqe = &r->cqes[entry & cq_mask];
++
++ seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
++ entry & cq_mask, cqe->user_data, cqe->res,
++ cqe->flags);
++ }
++
++ /*
++ * Avoid ABBA deadlock between the seq lock and the io_uring mutex,
++ * since fdinfo case grabs it in the opposite direction of normal use
++ * cases. If we fail to get the lock, we just don't iterate any
++ * structures that could be going away outside the io_uring mutex.
++ */
++ has_lock = mutex_trylock(&ctx->uring_lock);
++
++ if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
++ sq = ctx->sq_data;
++ if (!sq->thread)
++ sq = NULL;
++ }
++
++ seq_printf(m, "SqThread:\t%d\n", sq ? task_pid_nr(sq->thread) : -1);
++ seq_printf(m, "SqThreadCpu:\t%d\n", sq ? task_cpu(sq->thread) : -1);
++ seq_printf(m, "UserFiles:\t%u\n", ctx->nr_user_files);
++ for (i = 0; has_lock && i < ctx->nr_user_files; i++) {
++ struct file *f = io_file_from_index(ctx, i);
++
++ if (f)
++ seq_printf(m, "%5u: %s\n", i, file_dentry(f)->d_iname);
++ else
++ seq_printf(m, "%5u: <none>\n", i);
++ }
++ seq_printf(m, "UserBufs:\t%u\n", ctx->nr_user_bufs);
++ for (i = 0; has_lock && i < ctx->nr_user_bufs; i++) {
++ struct io_mapped_ubuf *buf = ctx->user_bufs[i];
++ unsigned int len = buf->ubuf_end - buf->ubuf;
++
++ seq_printf(m, "%5u: 0x%llx/%u\n", i, buf->ubuf, len);
++ }
++ if (has_lock && !xa_empty(&ctx->personalities)) {
++ unsigned long index;
++ const struct cred *cred;
++
++ seq_printf(m, "Personalities:\n");
++ xa_for_each(&ctx->personalities, index, cred)
++ io_uring_show_cred(m, index, cred);
++ }
++ if (has_lock)
++ mutex_unlock(&ctx->uring_lock);
++
++ seq_puts(m, "PollList:\n");
++ spin_lock(&ctx->completion_lock);
++ for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
++ struct hlist_head *list = &ctx->cancel_hash[i];
++ struct io_kiocb *req;
++
++ hlist_for_each_entry(req, list, hash_node)
++ seq_printf(m, " op=%d, task_works=%d\n", req->opcode,
++ task_work_pending(req->task));
++ }
++
++ seq_puts(m, "CqOverflowList:\n");
++ list_for_each_entry(ocqe, &ctx->cq_overflow_list, list) {
++ struct io_uring_cqe *cqe = &ocqe->cqe;
++
++ seq_printf(m, " user_data=%llu, res=%d, flags=%x\n",
++ cqe->user_data, cqe->res, cqe->flags);
++
++ }
++
++ spin_unlock(&ctx->completion_lock);
++}
++
++static __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
++{
++ struct io_ring_ctx *ctx = f->private_data;
++
++ if (percpu_ref_tryget(&ctx->refs)) {
++ __io_uring_show_fdinfo(ctx, m);
++ percpu_ref_put(&ctx->refs);
++ }
++}
++#endif
++
++static const struct file_operations io_uring_fops = {
++ .release = io_uring_release,
++ .mmap = io_uring_mmap,
++#ifndef CONFIG_MMU
++ .get_unmapped_area = io_uring_nommu_get_unmapped_area,
++ .mmap_capabilities = io_uring_nommu_mmap_capabilities,
++#endif
++ .poll = io_uring_poll,
++#ifdef CONFIG_PROC_FS
++ .show_fdinfo = io_uring_show_fdinfo,
++#endif
++};
++
++static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
++ struct io_uring_params *p)
++{
++ struct io_rings *rings;
++ size_t size, sq_array_offset;
++
++ /* make sure these are sane, as we already accounted them */
++ ctx->sq_entries = p->sq_entries;
++ ctx->cq_entries = p->cq_entries;
++
++ size = rings_size(p->sq_entries, p->cq_entries, &sq_array_offset);
++ if (size == SIZE_MAX)
++ return -EOVERFLOW;
++
++ rings = io_mem_alloc(size);
++ if (!rings)
++ return -ENOMEM;
++
++ ctx->rings = rings;
++ ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
++ rings->sq_ring_mask = p->sq_entries - 1;
++ rings->cq_ring_mask = p->cq_entries - 1;
++ rings->sq_ring_entries = p->sq_entries;
++ rings->cq_ring_entries = p->cq_entries;
++
++ size = array_size(sizeof(struct io_uring_sqe), p->sq_entries);
++ if (size == SIZE_MAX) {
++ io_mem_free(ctx->rings);
++ ctx->rings = NULL;
++ return -EOVERFLOW;
++ }
++
++ ctx->sq_sqes = io_mem_alloc(size);
++ if (!ctx->sq_sqes) {
++ io_mem_free(ctx->rings);
++ ctx->rings = NULL;
++ return -ENOMEM;
++ }
++
++ return 0;
++}
++
++static int io_uring_install_fd(struct io_ring_ctx *ctx, struct file *file)
++{
++ int ret, fd;
++
++ fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
++ if (fd < 0)
++ return fd;
++
++ ret = io_uring_add_tctx_node(ctx);
++ if (ret) {
++ put_unused_fd(fd);
++ return ret;
++ }
++ fd_install(fd, file);
++ return fd;
++}
++
++/*
++ * Allocate an anonymous fd, this is what constitutes the application
++ * visible backing of an io_uring instance. The application mmaps this
++ * fd to gain access to the SQ/CQ ring details. If UNIX sockets are enabled,
++ * we have to tie this fd to a socket for file garbage collection purposes.
++ */
++static struct file *io_uring_get_file(struct io_ring_ctx *ctx)
++{
++ struct file *file;
++#if defined(CONFIG_UNIX)
++ int ret;
++
++ ret = sock_create_kern(&init_net, PF_UNIX, SOCK_RAW, IPPROTO_IP,
++ &ctx->ring_sock);
++ if (ret)
++ return ERR_PTR(ret);
++#endif
++
++ file = anon_inode_getfile_secure("[io_uring]", &io_uring_fops, ctx,
++ O_RDWR | O_CLOEXEC, NULL);
++#if defined(CONFIG_UNIX)
++ if (IS_ERR(file)) {
++ sock_release(ctx->ring_sock);
++ ctx->ring_sock = NULL;
++ } else {
++ ctx->ring_sock->file = file;
++ }
++#endif
++ return file;
++}
++
++static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
++ struct io_uring_params __user *params)
++{
++ struct io_ring_ctx *ctx;
++ struct file *file;
++ int ret;
++
++ if (!entries)
++ return -EINVAL;
++ if (entries > IORING_MAX_ENTRIES) {
++ if (!(p->flags & IORING_SETUP_CLAMP))
++ return -EINVAL;
++ entries = IORING_MAX_ENTRIES;
++ }
++
++ /*
++ * Use twice as many entries for the CQ ring. It's possible for the
++ * application to drive a higher depth than the size of the SQ ring,
++ * since the sqes are only used at submission time. This allows for
++ * some flexibility in overcommitting a bit. If the application has
++ * set IORING_SETUP_CQSIZE, it will have passed in the desired number
++ * of CQ ring entries manually.
++ */
++ p->sq_entries = roundup_pow_of_two(entries);
++ if (p->flags & IORING_SETUP_CQSIZE) {
++ /*
++ * If IORING_SETUP_CQSIZE is set, we do the same roundup
++ * to a power-of-two, if it isn't already. We do NOT impose
++ * any cq vs sq ring sizing.
++ */
++ if (!p->cq_entries)
++ return -EINVAL;
++ if (p->cq_entries > IORING_MAX_CQ_ENTRIES) {
++ if (!(p->flags & IORING_SETUP_CLAMP))
++ return -EINVAL;
++ p->cq_entries = IORING_MAX_CQ_ENTRIES;
++ }
++ p->cq_entries = roundup_pow_of_two(p->cq_entries);
++ if (p->cq_entries < p->sq_entries)
++ return -EINVAL;
++ } else {
++ p->cq_entries = 2 * p->sq_entries;
++ }
++
++ ctx = io_ring_ctx_alloc(p);
++ if (!ctx)
++ return -ENOMEM;
++ ctx->compat = in_compat_syscall();
++ if (!capable(CAP_IPC_LOCK))
++ ctx->user = get_uid(current_user());
++
++ /*
++ * This is just grabbed for accounting purposes. When a process exits,
++ * the mm is exited and dropped before the files, hence we need to hang
++ * on to this mm purely for the purposes of being able to unaccount
++ * memory (locked/pinned vm). It's not used for anything else.
++ */
++ mmgrab(current->mm);
++ ctx->mm_account = current->mm;
++
++ ret = io_allocate_scq_urings(ctx, p);
++ if (ret)
++ goto err;
++
++ ret = io_sq_offload_create(ctx, p);
++ if (ret)
++ goto err;
++ /* always set a rsrc node */
++ ret = io_rsrc_node_switch_start(ctx);
++ if (ret)
++ goto err;
++ io_rsrc_node_switch(ctx, NULL);
++
++ memset(&p->sq_off, 0, sizeof(p->sq_off));
++ p->sq_off.head = offsetof(struct io_rings, sq.head);
++ p->sq_off.tail = offsetof(struct io_rings, sq.tail);
++ p->sq_off.ring_mask = offsetof(struct io_rings, sq_ring_mask);
++ p->sq_off.ring_entries = offsetof(struct io_rings, sq_ring_entries);
++ p->sq_off.flags = offsetof(struct io_rings, sq_flags);
++ p->sq_off.dropped = offsetof(struct io_rings, sq_dropped);
++ p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
++
++ memset(&p->cq_off, 0, sizeof(p->cq_off));
++ p->cq_off.head = offsetof(struct io_rings, cq.head);
++ p->cq_off.tail = offsetof(struct io_rings, cq.tail);
++ p->cq_off.ring_mask = offsetof(struct io_rings, cq_ring_mask);
++ p->cq_off.ring_entries = offsetof(struct io_rings, cq_ring_entries);
++ p->cq_off.overflow = offsetof(struct io_rings, cq_overflow);
++ p->cq_off.cqes = offsetof(struct io_rings, cqes);
++ p->cq_off.flags = offsetof(struct io_rings, cq_flags);
++
++ p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
++ IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS |
++ IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL |
++ IORING_FEAT_POLL_32BITS | IORING_FEAT_SQPOLL_NONFIXED |
++ IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
++ IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
++ IORING_FEAT_LINKED_FILE;
++
++ if (copy_to_user(params, p, sizeof(*p))) {
++ ret = -EFAULT;
++ goto err;
++ }
++
++ file = io_uring_get_file(ctx);
++ if (IS_ERR(file)) {
++ ret = PTR_ERR(file);
++ goto err;
++ }
++
++ /*
++ * Install ring fd as the very last thing, so we don't risk someone
++ * having closed it before we finish setup
++ */
++ ret = io_uring_install_fd(ctx, file);
++ if (ret < 0) {
++ /* fput will clean it up */
++ fput(file);
++ return ret;
++ }
++
++ trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
++ return ret;
++err:
++ io_ring_ctx_wait_and_kill(ctx);
++ return ret;
++}
++
++/*
++ * Sets up an aio uring context, and returns the fd. Applications asks for a
++ * ring size, we return the actual sq/cq ring sizes (among other things) in the
++ * params structure passed in.
++ */
++static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
++{
++ struct io_uring_params p;
++ int i;
++
++ if (copy_from_user(&p, params, sizeof(p)))
++ return -EFAULT;
++ for (i = 0; i < ARRAY_SIZE(p.resv); i++) {
++ if (p.resv[i])
++ return -EINVAL;
++ }
++
++ if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
++ IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
++ IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ |
++ IORING_SETUP_R_DISABLED | IORING_SETUP_SUBMIT_ALL))
++ return -EINVAL;
++
++ return io_uring_create(entries, &p, params);
++}
++
++SYSCALL_DEFINE2(io_uring_setup, u32, entries,
++ struct io_uring_params __user *, params)
++{
++ return io_uring_setup(entries, params);
++}
++
++static __cold int io_probe(struct io_ring_ctx *ctx, void __user *arg,
++ unsigned nr_args)
++{
++ struct io_uring_probe *p;
++ size_t size;
++ int i, ret;
++
++ size = struct_size(p, ops, nr_args);
++ if (size == SIZE_MAX)
++ return -EOVERFLOW;
++ p = kzalloc(size, GFP_KERNEL);
++ if (!p)
++ return -ENOMEM;
++
++ ret = -EFAULT;
++ if (copy_from_user(p, arg, size))
++ goto out;
++ ret = -EINVAL;
++ if (memchr_inv(p, 0, size))
++ goto out;
++
++ p->last_op = IORING_OP_LAST - 1;
++ if (nr_args > IORING_OP_LAST)
++ nr_args = IORING_OP_LAST;
++
++ for (i = 0; i < nr_args; i++) {
++ p->ops[i].op = i;
++ if (!io_op_defs[i].not_supported)
++ p->ops[i].flags = IO_URING_OP_SUPPORTED;
++ }
++ p->ops_len = i;
++
++ ret = 0;
++ if (copy_to_user(arg, p, size))
++ ret = -EFAULT;
++out:
++ kfree(p);
++ return ret;
++}
++
++static int io_register_personality(struct io_ring_ctx *ctx)
++{
++ const struct cred *creds;
++ u32 id;
++ int ret;
++
++ creds = get_current_cred();
++
++ ret = xa_alloc_cyclic(&ctx->personalities, &id, (void *)creds,
++ XA_LIMIT(0, USHRT_MAX), &ctx->pers_next, GFP_KERNEL);
++ if (ret < 0) {
++ put_cred(creds);
++ return ret;
++ }
++ return id;
++}
++
++static __cold int io_register_restrictions(struct io_ring_ctx *ctx,
++ void __user *arg, unsigned int nr_args)
++{
++ struct io_uring_restriction *res;
++ size_t size;
++ int i, ret;
++
++ /* Restrictions allowed only if rings started disabled */
++ if (!(ctx->flags & IORING_SETUP_R_DISABLED))
++ return -EBADFD;
++
++ /* We allow only a single restrictions registration */
++ if (ctx->restrictions.registered)
++ return -EBUSY;
++
++ if (!arg || nr_args > IORING_MAX_RESTRICTIONS)
++ return -EINVAL;
++
++ size = array_size(nr_args, sizeof(*res));
++ if (size == SIZE_MAX)
++ return -EOVERFLOW;
++
++ res = memdup_user(arg, size);
++ if (IS_ERR(res))
++ return PTR_ERR(res);
++
++ ret = 0;
++
++ for (i = 0; i < nr_args; i++) {
++ switch (res[i].opcode) {
++ case IORING_RESTRICTION_REGISTER_OP:
++ if (res[i].register_op >= IORING_REGISTER_LAST) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ __set_bit(res[i].register_op,
++ ctx->restrictions.register_op);
++ break;
++ case IORING_RESTRICTION_SQE_OP:
++ if (res[i].sqe_op >= IORING_OP_LAST) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ __set_bit(res[i].sqe_op, ctx->restrictions.sqe_op);
++ break;
++ case IORING_RESTRICTION_SQE_FLAGS_ALLOWED:
++ ctx->restrictions.sqe_flags_allowed = res[i].sqe_flags;
++ break;
++ case IORING_RESTRICTION_SQE_FLAGS_REQUIRED:
++ ctx->restrictions.sqe_flags_required = res[i].sqe_flags;
++ break;
++ default:
++ ret = -EINVAL;
++ goto out;
++ }
++ }
++
++out:
++ /* Reset all restrictions if an error happened */
++ if (ret != 0)
++ memset(&ctx->restrictions, 0, sizeof(ctx->restrictions));
++ else
++ ctx->restrictions.registered = true;
++
++ kfree(res);
++ return ret;
++}
++
++static int io_register_enable_rings(struct io_ring_ctx *ctx)
++{
++ if (!(ctx->flags & IORING_SETUP_R_DISABLED))
++ return -EBADFD;
++
++ if (ctx->restrictions.registered)
++ ctx->restricted = 1;
++
++ ctx->flags &= ~IORING_SETUP_R_DISABLED;
++ if (ctx->sq_data && wq_has_sleeper(&ctx->sq_data->wait))
++ wake_up(&ctx->sq_data->wait);
++ return 0;
++}
++
++static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
++ struct io_uring_rsrc_update2 *up,
++ unsigned nr_args)
++{
++ __u32 tmp;
++ int err;
++
++ if (check_add_overflow(up->offset, nr_args, &tmp))
++ return -EOVERFLOW;
++ err = io_rsrc_node_switch_start(ctx);
++ if (err)
++ return err;
++
++ switch (type) {
++ case IORING_RSRC_FILE:
++ return __io_sqe_files_update(ctx, up, nr_args);
++ case IORING_RSRC_BUFFER:
++ return __io_sqe_buffers_update(ctx, up, nr_args);
++ }
++ return -EINVAL;
++}
++
++static int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg,
++ unsigned nr_args)
++{
++ struct io_uring_rsrc_update2 up;
++
++ if (!nr_args)
++ return -EINVAL;
++ memset(&up, 0, sizeof(up));
++ if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update)))
++ return -EFAULT;
++ if (up.resv || up.resv2)
++ return -EINVAL;
++ return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args);
++}
++
++static int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
++ unsigned size, unsigned type)
++{
++ struct io_uring_rsrc_update2 up;
++
++ if (size != sizeof(up))
++ return -EINVAL;
++ if (copy_from_user(&up, arg, sizeof(up)))
++ return -EFAULT;
++ if (!up.nr || up.resv || up.resv2)
++ return -EINVAL;
++ return __io_register_rsrc_update(ctx, type, &up, up.nr);
++}
++
++static __cold int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
++ unsigned int size, unsigned int type)
++{
++ struct io_uring_rsrc_register rr;
++
++ /* keep it extendible */
++ if (size != sizeof(rr))
++ return -EINVAL;
++
++ memset(&rr, 0, sizeof(rr));
++ if (copy_from_user(&rr, arg, size))
++ return -EFAULT;
++ if (!rr.nr || rr.resv || rr.resv2)
++ return -EINVAL;
++
++ switch (type) {
++ case IORING_RSRC_FILE:
++ return io_sqe_files_register(ctx, u64_to_user_ptr(rr.data),
++ rr.nr, u64_to_user_ptr(rr.tags));
++ case IORING_RSRC_BUFFER:
++ return io_sqe_buffers_register(ctx, u64_to_user_ptr(rr.data),
++ rr.nr, u64_to_user_ptr(rr.tags));
++ }
++ return -EINVAL;
++}
++
++static __cold int io_register_iowq_aff(struct io_ring_ctx *ctx,
++ void __user *arg, unsigned len)
++{
++ struct io_uring_task *tctx = current->io_uring;
++ cpumask_var_t new_mask;
++ int ret;
++
++ if (!tctx || !tctx->io_wq)
++ return -EINVAL;
++
++ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
++ return -ENOMEM;
++
++ cpumask_clear(new_mask);
++ if (len > cpumask_size())
++ len = cpumask_size();
++
++ if (in_compat_syscall()) {
++ ret = compat_get_bitmap(cpumask_bits(new_mask),
++ (const compat_ulong_t __user *)arg,
++ len * 8 /* CHAR_BIT */);
++ } else {
++ ret = copy_from_user(new_mask, arg, len);
++ }
++
++ if (ret) {
++ free_cpumask_var(new_mask);
++ return -EFAULT;
++ }
++
++ ret = io_wq_cpu_affinity(tctx->io_wq, new_mask);
++ free_cpumask_var(new_mask);
++ return ret;
++}
++
++static __cold int io_unregister_iowq_aff(struct io_ring_ctx *ctx)
++{
++ struct io_uring_task *tctx = current->io_uring;
++
++ if (!tctx || !tctx->io_wq)
++ return -EINVAL;
++
++ return io_wq_cpu_affinity(tctx->io_wq, NULL);
++}
++
++static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
++ void __user *arg)
++ __must_hold(&ctx->uring_lock)
++{
++ struct io_tctx_node *node;
++ struct io_uring_task *tctx = NULL;
++ struct io_sq_data *sqd = NULL;
++ __u32 new_count[2];
++ int i, ret;
++
++ if (copy_from_user(new_count, arg, sizeof(new_count)))
++ return -EFAULT;
++ for (i = 0; i < ARRAY_SIZE(new_count); i++)
++ if (new_count[i] > INT_MAX)
++ return -EINVAL;
++
++ if (ctx->flags & IORING_SETUP_SQPOLL) {
++ sqd = ctx->sq_data;
++ if (sqd) {
++ /*
++ * Observe the correct sqd->lock -> ctx->uring_lock
++ * ordering. Fine to drop uring_lock here, we hold
++ * a ref to the ctx.
++ */
++ refcount_inc(&sqd->refs);
++ mutex_unlock(&ctx->uring_lock);
++ mutex_lock(&sqd->lock);
++ mutex_lock(&ctx->uring_lock);
++ if (sqd->thread)
++ tctx = sqd->thread->io_uring;
++ }
++ } else {
++ tctx = current->io_uring;
++ }
++
++ BUILD_BUG_ON(sizeof(new_count) != sizeof(ctx->iowq_limits));
++
++ for (i = 0; i < ARRAY_SIZE(new_count); i++)
++ if (new_count[i])
++ ctx->iowq_limits[i] = new_count[i];
++ ctx->iowq_limits_set = true;
++
++ if (tctx && tctx->io_wq) {
++ ret = io_wq_max_workers(tctx->io_wq, new_count);
++ if (ret)
++ goto err;
++ } else {
++ memset(new_count, 0, sizeof(new_count));
++ }
++
++ if (sqd) {
++ mutex_unlock(&sqd->lock);
++ io_put_sq_data(sqd);
++ }
++
++ if (copy_to_user(arg, new_count, sizeof(new_count)))
++ return -EFAULT;
++
++ /* that's it for SQPOLL, only the SQPOLL task creates requests */
++ if (sqd)
++ return 0;
++
++ /* now propagate the restriction to all registered users */
++ list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
++ struct io_uring_task *tctx = node->task->io_uring;
++
++ if (WARN_ON_ONCE(!tctx->io_wq))
++ continue;
++
++ for (i = 0; i < ARRAY_SIZE(new_count); i++)
++ new_count[i] = ctx->iowq_limits[i];
++ /* ignore errors, it always returns zero anyway */
++ (void)io_wq_max_workers(tctx->io_wq, new_count);
++ }
++ return 0;
++err:
++ if (sqd) {
++ mutex_unlock(&sqd->lock);
++ io_put_sq_data(sqd);
++ }
++ return ret;
++}
++
++static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
++ void __user *arg, unsigned nr_args)
++ __releases(ctx->uring_lock)
++ __acquires(ctx->uring_lock)
++{
++ int ret;
++
++ /*
++ * We're inside the ring mutex, if the ref is already dying, then
++ * someone else killed the ctx or is already going through
++ * io_uring_register().
++ */
++ if (percpu_ref_is_dying(&ctx->refs))
++ return -ENXIO;
++
++ if (ctx->restricted) {
++ if (opcode >= IORING_REGISTER_LAST)
++ return -EINVAL;
++ opcode = array_index_nospec(opcode, IORING_REGISTER_LAST);
++ if (!test_bit(opcode, ctx->restrictions.register_op))
++ return -EACCES;
++ }
++
++ switch (opcode) {
++ case IORING_REGISTER_BUFFERS:
++ ret = io_sqe_buffers_register(ctx, arg, nr_args, NULL);
++ break;
++ case IORING_UNREGISTER_BUFFERS:
++ ret = -EINVAL;
++ if (arg || nr_args)
++ break;
++ ret = io_sqe_buffers_unregister(ctx);
++ break;
++ case IORING_REGISTER_FILES:
++ ret = io_sqe_files_register(ctx, arg, nr_args, NULL);
++ break;
++ case IORING_UNREGISTER_FILES:
++ ret = -EINVAL;
++ if (arg || nr_args)
++ break;
++ ret = io_sqe_files_unregister(ctx);
++ break;
++ case IORING_REGISTER_FILES_UPDATE:
++ ret = io_register_files_update(ctx, arg, nr_args);
++ break;
++ case IORING_REGISTER_EVENTFD:
++ ret = -EINVAL;
++ if (nr_args != 1)
++ break;
++ ret = io_eventfd_register(ctx, arg, 0);
++ break;
++ case IORING_REGISTER_EVENTFD_ASYNC:
++ ret = -EINVAL;
++ if (nr_args != 1)
++ break;
++ ret = io_eventfd_register(ctx, arg, 1);
++ break;
++ case IORING_UNREGISTER_EVENTFD:
++ ret = -EINVAL;
++ if (arg || nr_args)
++ break;
++ ret = io_eventfd_unregister(ctx);
++ break;
++ case IORING_REGISTER_PROBE:
++ ret = -EINVAL;
++ if (!arg || nr_args > 256)
++ break;
++ ret = io_probe(ctx, arg, nr_args);
++ break;
++ case IORING_REGISTER_PERSONALITY:
++ ret = -EINVAL;
++ if (arg || nr_args)
++ break;
++ ret = io_register_personality(ctx);
++ break;
++ case IORING_UNREGISTER_PERSONALITY:
++ ret = -EINVAL;
++ if (arg)
++ break;
++ ret = io_unregister_personality(ctx, nr_args);
++ break;
++ case IORING_REGISTER_ENABLE_RINGS:
++ ret = -EINVAL;
++ if (arg || nr_args)
++ break;
++ ret = io_register_enable_rings(ctx);
++ break;
++ case IORING_REGISTER_RESTRICTIONS:
++ ret = io_register_restrictions(ctx, arg, nr_args);
++ break;
++ case IORING_REGISTER_FILES2:
++ ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_FILE);
++ break;
++ case IORING_REGISTER_FILES_UPDATE2:
++ ret = io_register_rsrc_update(ctx, arg, nr_args,
++ IORING_RSRC_FILE);
++ break;
++ case IORING_REGISTER_BUFFERS2:
++ ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_BUFFER);
++ break;
++ case IORING_REGISTER_BUFFERS_UPDATE:
++ ret = io_register_rsrc_update(ctx, arg, nr_args,
++ IORING_RSRC_BUFFER);
++ break;
++ case IORING_REGISTER_IOWQ_AFF:
++ ret = -EINVAL;
++ if (!arg || !nr_args)
++ break;
++ ret = io_register_iowq_aff(ctx, arg, nr_args);
++ break;
++ case IORING_UNREGISTER_IOWQ_AFF:
++ ret = -EINVAL;
++ if (arg || nr_args)
++ break;
++ ret = io_unregister_iowq_aff(ctx);
++ break;
++ case IORING_REGISTER_IOWQ_MAX_WORKERS:
++ ret = -EINVAL;
++ if (!arg || nr_args != 2)
++ break;
++ ret = io_register_iowq_max_workers(ctx, arg);
++ break;
++ case IORING_REGISTER_RING_FDS:
++ ret = io_ringfd_register(ctx, arg, nr_args);
++ break;
++ case IORING_UNREGISTER_RING_FDS:
++ ret = io_ringfd_unregister(ctx, arg, nr_args);
++ break;
++ default:
++ ret = -EINVAL;
++ break;
++ }
++
++ return ret;
++}
++
++SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
++ void __user *, arg, unsigned int, nr_args)
++{
++ struct io_ring_ctx *ctx;
++ long ret = -EBADF;
++ struct fd f;
++
++ f = fdget(fd);
++ if (!f.file)
++ return -EBADF;
++
++ ret = -EOPNOTSUPP;
++ if (f.file->f_op != &io_uring_fops)
++ goto out_fput;
++
++ ctx = f.file->private_data;
++
++ io_run_task_work();
++
++ mutex_lock(&ctx->uring_lock);
++ ret = __io_uring_register(ctx, opcode, arg, nr_args);
++ mutex_unlock(&ctx->uring_lock);
++ trace_io_uring_register(ctx, opcode, ctx->nr_user_files, ctx->nr_user_bufs, ret);
++out_fput:
++ fdput(f);
++ return ret;
++}
++
++static int __init io_uring_init(void)
++{
++#define __BUILD_BUG_VERIFY_ELEMENT(stype, eoffset, etype, ename) do { \
++ BUILD_BUG_ON(offsetof(stype, ename) != eoffset); \
++ BUILD_BUG_ON(sizeof(etype) != sizeof_field(stype, ename)); \
++} while (0)
++
++#define BUILD_BUG_SQE_ELEM(eoffset, etype, ename) \
++ __BUILD_BUG_VERIFY_ELEMENT(struct io_uring_sqe, eoffset, etype, ename)
++ BUILD_BUG_ON(sizeof(struct io_uring_sqe) != 64);
++ BUILD_BUG_SQE_ELEM(0, __u8, opcode);
++ BUILD_BUG_SQE_ELEM(1, __u8, flags);
++ BUILD_BUG_SQE_ELEM(2, __u16, ioprio);
++ BUILD_BUG_SQE_ELEM(4, __s32, fd);
++ BUILD_BUG_SQE_ELEM(8, __u64, off);
++ BUILD_BUG_SQE_ELEM(8, __u64, addr2);
++ BUILD_BUG_SQE_ELEM(16, __u64, addr);
++ BUILD_BUG_SQE_ELEM(16, __u64, splice_off_in);
++ BUILD_BUG_SQE_ELEM(24, __u32, len);
++ BUILD_BUG_SQE_ELEM(28, __kernel_rwf_t, rw_flags);
++ BUILD_BUG_SQE_ELEM(28, /* compat */ int, rw_flags);
++ BUILD_BUG_SQE_ELEM(28, /* compat */ __u32, rw_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, fsync_flags);
++ BUILD_BUG_SQE_ELEM(28, /* compat */ __u16, poll_events);
++ BUILD_BUG_SQE_ELEM(28, __u32, poll32_events);
++ BUILD_BUG_SQE_ELEM(28, __u32, sync_range_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, msg_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, timeout_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, accept_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, cancel_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, open_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, statx_flags);
++ BUILD_BUG_SQE_ELEM(28, __u32, fadvise_advice);
++ BUILD_BUG_SQE_ELEM(28, __u32, splice_flags);
++ BUILD_BUG_SQE_ELEM(32, __u64, user_data);
++ BUILD_BUG_SQE_ELEM(40, __u16, buf_index);
++ BUILD_BUG_SQE_ELEM(40, __u16, buf_group);
++ BUILD_BUG_SQE_ELEM(42, __u16, personality);
++ BUILD_BUG_SQE_ELEM(44, __s32, splice_fd_in);
++ BUILD_BUG_SQE_ELEM(44, __u32, file_index);
++
++ BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
++ sizeof(struct io_uring_rsrc_update));
++ BUILD_BUG_ON(sizeof(struct io_uring_rsrc_update) >
++ sizeof(struct io_uring_rsrc_update2));
++
++ /* ->buf_index is u16 */
++ BUILD_BUG_ON(IORING_MAX_REG_BUFFERS >= (1u << 16));
++
++ /* should fit into one byte */
++ BUILD_BUG_ON(SQE_VALID_FLAGS >= (1 << 8));
++ BUILD_BUG_ON(SQE_COMMON_FLAGS >= (1 << 8));
++ BUILD_BUG_ON((SQE_VALID_FLAGS | SQE_COMMON_FLAGS) != SQE_VALID_FLAGS);
++
++ BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST);
++ BUILD_BUG_ON(__REQ_F_LAST_BIT > 8 * sizeof(int));
++
++ req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
++ SLAB_ACCOUNT);
++ return 0;
++};
++__initcall(io_uring_init);
+diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
+index 7f145aefbff87..d015fce678654 100644
+--- a/kernel/bpf/arraymap.c
++++ b/kernel/bpf/arraymap.c
+@@ -155,6 +155,11 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
+ return &array->map;
+ }
+
++static void *array_map_elem_ptr(struct bpf_array* array, u32 index)
++{
++ return array->value + (u64)array->elem_size * index;
++}
++
+ /* Called from syscall or from eBPF program */
+ static void *array_map_lookup_elem(struct bpf_map *map, void *key)
+ {
+@@ -164,7 +169,7 @@ static void *array_map_lookup_elem(struct bpf_map *map, void *key)
+ if (unlikely(index >= array->map.max_entries))
+ return NULL;
+
+- return array->value + array->elem_size * (index & array->index_mask);
++ return array->value + (u64)array->elem_size * (index & array->index_mask);
+ }
+
+ static int array_map_direct_value_addr(const struct bpf_map *map, u64 *imm,
+@@ -287,10 +292,12 @@ static int array_map_get_next_key(struct bpf_map *map, void *key, void *next_key
+ return 0;
+ }
+
+-static void check_and_free_timer_in_array(struct bpf_array *arr, void *val)
++static void check_and_free_fields(struct bpf_array *arr, void *val)
+ {
+- if (unlikely(map_value_has_timer(&arr->map)))
++ if (map_value_has_timer(&arr->map))
+ bpf_timer_cancel_and_free(val + arr->map.timer_off);
++ if (map_value_has_kptrs(&arr->map))
++ bpf_map_free_kptrs(&arr->map, val);
+ }
+
+ /* Called from syscall or from eBPF program */
+@@ -322,12 +329,12 @@ static int array_map_update_elem(struct bpf_map *map, void *key, void *value,
+ value, map->value_size);
+ } else {
+ val = array->value +
+- array->elem_size * (index & array->index_mask);
++ (u64)array->elem_size * (index & array->index_mask);
+ if (map_flags & BPF_F_LOCK)
+ copy_map_value_locked(map, val, value, false);
+ else
+ copy_map_value(map, val, value);
+- check_and_free_timer_in_array(array, val);
++ check_and_free_fields(array, val);
+ }
+ return 0;
+ }
+@@ -386,18 +393,25 @@ static void array_map_free_timers(struct bpf_map *map)
+ struct bpf_array *array = container_of(map, struct bpf_array, map);
+ int i;
+
+- if (likely(!map_value_has_timer(map)))
++ /* We don't reset or free kptr on uref dropping to zero. */
++ if (!map_value_has_timer(map))
+ return;
+
+ for (i = 0; i < array->map.max_entries; i++)
+- bpf_timer_cancel_and_free(array->value + array->elem_size * i +
+- map->timer_off);
++ bpf_timer_cancel_and_free(array_map_elem_ptr(array, i) + map->timer_off);
+ }
+
+ /* Called when map->refcnt goes to zero, either from workqueue or from syscall */
+ static void array_map_free(struct bpf_map *map)
+ {
+ struct bpf_array *array = container_of(map, struct bpf_array, map);
++ int i;
++
++ if (map_value_has_kptrs(map)) {
++ for (i = 0; i < array->map.max_entries; i++)
++ bpf_map_free_kptrs(map, array_map_elem_ptr(array, i));
++ bpf_map_free_kptr_off_tab(map);
++ }
+
+ if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
+ bpf_array_free_percpu(array);
+@@ -531,7 +545,7 @@ static void *bpf_array_map_seq_start(struct seq_file *seq, loff_t *pos)
+ index = info->index & array->index_mask;
+ if (info->percpu_value_buf)
+ return array->pptrs[index];
+- return array->value + array->elem_size * index;
++ return array_map_elem_ptr(array, index);
+ }
+
+ static void *bpf_array_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+@@ -550,7 +564,7 @@ static void *bpf_array_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+ index = info->index & array->index_mask;
+ if (info->percpu_value_buf)
+ return array->pptrs[index];
+- return array->value + array->elem_size * index;
++ return array_map_elem_ptr(array, index);
+ }
+
+ static int __bpf_array_map_seq_show(struct seq_file *seq, void *v)
+@@ -665,7 +679,7 @@ static int bpf_for_each_array_elem(struct bpf_map *map, bpf_callback_t callback_
+ if (is_percpu)
+ val = this_cpu_ptr(array->pptrs[i]);
+ else
+- val = array->value + array->elem_size * i;
++ val = array_map_elem_ptr(array, i);
+ num_elems++;
+ key = i;
+ ret = callback_fn((u64)(long)map, (u64)(long)&key,
+diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
+index 21069dbe9138f..310b0591d91fd 100644
+--- a/kernel/bpf/bpf_struct_ops.c
++++ b/kernel/bpf/bpf_struct_ops.c
+@@ -32,15 +32,15 @@ struct bpf_struct_ops_map {
+ const struct bpf_struct_ops *st_ops;
+ /* protect map_update */
+ struct mutex lock;
+- /* progs has all the bpf_prog that is populated
++ /* link has all the bpf_links that is populated
+ * to the func ptr of the kernel's struct
+ * (in kvalue.data).
+ */
+- struct bpf_prog **progs;
++ struct bpf_link **links;
+ /* image is a page that has all the trampolines
+ * that stores the func args before calling the bpf_prog.
+ * A PAGE_SIZE "image" is enough to store all trampoline for
+- * "progs[]".
++ * "links[]".
+ */
+ void *image;
+ /* uvalue->data stores the kernel struct
+@@ -282,9 +282,9 @@ static void bpf_struct_ops_map_put_progs(struct bpf_struct_ops_map *st_map)
+ u32 i;
+
+ for (i = 0; i < btf_type_vlen(t); i++) {
+- if (st_map->progs[i]) {
+- bpf_prog_put(st_map->progs[i]);
+- st_map->progs[i] = NULL;
++ if (st_map->links[i]) {
++ bpf_link_put(st_map->links[i]);
++ st_map->links[i] = NULL;
+ }
+ }
+ }
+@@ -315,18 +315,34 @@ static int check_zero_holes(const struct btf_type *t, void *data)
+ return 0;
+ }
+
+-int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_progs *tprogs,
+- struct bpf_prog *prog,
++static void bpf_struct_ops_link_release(struct bpf_link *link)
++{
++}
++
++static void bpf_struct_ops_link_dealloc(struct bpf_link *link)
++{
++ struct bpf_tramp_link *tlink = container_of(link, struct bpf_tramp_link, link);
++
++ kfree(tlink);
++}
++
++const struct bpf_link_ops bpf_struct_ops_link_lops = {
++ .release = bpf_struct_ops_link_release,
++ .dealloc = bpf_struct_ops_link_dealloc,
++};
++
++int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
++ struct bpf_tramp_link *link,
+ const struct btf_func_model *model,
+ void *image, void *image_end)
+ {
+ u32 flags;
+
+- tprogs[BPF_TRAMP_FENTRY].progs[0] = prog;
+- tprogs[BPF_TRAMP_FENTRY].nr_progs = 1;
++ tlinks[BPF_TRAMP_FENTRY].links[0] = link;
++ tlinks[BPF_TRAMP_FENTRY].nr_links = 1;
+ flags = model->ret_size > 0 ? BPF_TRAMP_F_RET_FENTRY_RET : 0;
+ return arch_prepare_bpf_trampoline(NULL, image, image_end,
+- model, flags, tprogs, NULL);
++ model, flags, tlinks, NULL);
+ }
+
+ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
+@@ -337,7 +353,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
+ struct bpf_struct_ops_value *uvalue, *kvalue;
+ const struct btf_member *member;
+ const struct btf_type *t = st_ops->type;
+- struct bpf_tramp_progs *tprogs = NULL;
++ struct bpf_tramp_links *tlinks = NULL;
+ void *udata, *kdata;
+ int prog_fd, err = 0;
+ void *image, *image_end;
+@@ -361,8 +377,8 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
+ if (uvalue->state || refcount_read(&uvalue->refcnt))
+ return -EINVAL;
+
+- tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(*tprogs), GFP_KERNEL);
+- if (!tprogs)
++ tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
++ if (!tlinks)
+ return -ENOMEM;
+
+ uvalue = (struct bpf_struct_ops_value *)st_map->uvalue;
+@@ -385,6 +401,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
+ for_each_member(i, t, member) {
+ const struct btf_type *mtype, *ptype;
+ struct bpf_prog *prog;
++ struct bpf_tramp_link *link;
+ u32 moff;
+
+ moff = __btf_member_bit_offset(t, member) / 8;
+@@ -438,16 +455,26 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
+ err = PTR_ERR(prog);
+ goto reset_unlock;
+ }
+- st_map->progs[i] = prog;
+
+ if (prog->type != BPF_PROG_TYPE_STRUCT_OPS ||
+ prog->aux->attach_btf_id != st_ops->type_id ||
+ prog->expected_attach_type != i) {
++ bpf_prog_put(prog);
+ err = -EINVAL;
+ goto reset_unlock;
+ }
+
+- err = bpf_struct_ops_prepare_trampoline(tprogs, prog,
++ link = kzalloc(sizeof(*link), GFP_USER);
++ if (!link) {
++ bpf_prog_put(prog);
++ err = -ENOMEM;
++ goto reset_unlock;
++ }
++ bpf_link_init(&link->link, BPF_LINK_TYPE_STRUCT_OPS,
++ &bpf_struct_ops_link_lops, prog);
++ st_map->links[i] = &link->link;
++
++ err = bpf_struct_ops_prepare_trampoline(tlinks, link,
+ &st_ops->func_models[i],
+ image, image_end);
+ if (err < 0)
+@@ -490,7 +517,7 @@ reset_unlock:
+ memset(uvalue, 0, map->value_size);
+ memset(kvalue, 0, map->value_size);
+ unlock:
+- kfree(tprogs);
++ kfree(tlinks);
+ mutex_unlock(&st_map->lock);
+ return err;
+ }
+@@ -545,9 +572,9 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
+ {
+ struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map;
+
+- if (st_map->progs)
++ if (st_map->links)
+ bpf_struct_ops_map_put_progs(st_map);
+- bpf_map_area_free(st_map->progs);
++ bpf_map_area_free(st_map->links);
+ bpf_jit_free_exec(st_map->image);
+ bpf_map_area_free(st_map->uvalue);
+ bpf_map_area_free(st_map);
+@@ -596,11 +623,11 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
+ map = &st_map->map;
+
+ st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
+- st_map->progs =
+- bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_prog *),
++ st_map->links =
++ bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_links *),
+ NUMA_NO_NODE);
+ st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
+- if (!st_map->uvalue || !st_map->progs || !st_map->image) {
++ if (!st_map->uvalue || !st_map->links || !st_map->image) {
+ bpf_struct_ops_map_free(map);
+ return ERR_PTR(-ENOMEM);
+ }
+diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
+index feef799884d1f..7a593ecfbeec5 100644
+--- a/kernel/bpf/btf.c
++++ b/kernel/bpf/btf.c
+@@ -207,12 +207,18 @@ enum btf_kfunc_hook {
+
+ enum {
+ BTF_KFUNC_SET_MAX_CNT = 32,
++ BTF_DTOR_KFUNC_MAX_CNT = 256,
+ };
+
+ struct btf_kfunc_set_tab {
+ struct btf_id_set *sets[BTF_KFUNC_HOOK_MAX][BTF_KFUNC_TYPE_MAX];
+ };
+
++struct btf_id_dtor_kfunc_tab {
++ u32 cnt;
++ struct btf_id_dtor_kfunc dtors[];
++};
++
+ struct btf {
+ void *data;
+ struct btf_type **types;
+@@ -228,6 +234,7 @@ struct btf {
+ u32 id;
+ struct rcu_head rcu;
+ struct btf_kfunc_set_tab *kfunc_set_tab;
++ struct btf_id_dtor_kfunc_tab *dtor_kfunc_tab;
+
+ /* split BTF support */
+ struct btf *base_btf;
+@@ -1616,8 +1623,19 @@ free_tab:
+ btf->kfunc_set_tab = NULL;
+ }
+
++static void btf_free_dtor_kfunc_tab(struct btf *btf)
++{
++ struct btf_id_dtor_kfunc_tab *tab = btf->dtor_kfunc_tab;
++
++ if (!tab)
++ return;
++ kfree(tab);
++ btf->dtor_kfunc_tab = NULL;
++}
++
+ static void btf_free(struct btf *btf)
+ {
++ btf_free_dtor_kfunc_tab(btf);
+ btf_free_kfunc_set_tab(btf);
+ kvfree(btf->types);
+ kvfree(btf->resolved_sizes);
+@@ -3163,24 +3181,86 @@ static void btf_struct_log(struct btf_verifier_env *env,
+ btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t));
+ }
+
++enum btf_field_type {
++ BTF_FIELD_SPIN_LOCK,
++ BTF_FIELD_TIMER,
++ BTF_FIELD_KPTR,
++};
++
++enum {
++ BTF_FIELD_IGNORE = 0,
++ BTF_FIELD_FOUND = 1,
++};
++
++struct btf_field_info {
++ u32 type_id;
++ u32 off;
++ enum bpf_kptr_type type;
++};
++
++static int btf_find_struct(const struct btf *btf, const struct btf_type *t,
++ u32 off, int sz, struct btf_field_info *info)
++{
++ if (!__btf_type_is_struct(t))
++ return BTF_FIELD_IGNORE;
++ if (t->size != sz)
++ return BTF_FIELD_IGNORE;
++ info->off = off;
++ return BTF_FIELD_FOUND;
++}
++
++static int btf_find_kptr(const struct btf *btf, const struct btf_type *t,
++ u32 off, int sz, struct btf_field_info *info)
++{
++ enum bpf_kptr_type type;
++ u32 res_id;
++
++ /* For PTR, sz is always == 8 */
++ if (!btf_type_is_ptr(t))
++ return BTF_FIELD_IGNORE;
++ t = btf_type_by_id(btf, t->type);
++
++ if (!btf_type_is_type_tag(t))
++ return BTF_FIELD_IGNORE;
++ /* Reject extra tags */
++ if (btf_type_is_type_tag(btf_type_by_id(btf, t->type)))
++ return -EINVAL;
++ if (!strcmp("kptr", __btf_name_by_offset(btf, t->name_off)))
++ type = BPF_KPTR_UNREF;
++ else if (!strcmp("kptr_ref", __btf_name_by_offset(btf, t->name_off)))
++ type = BPF_KPTR_REF;
++ else
++ return -EINVAL;
++
++ /* Get the base type */
++ t = btf_type_skip_modifiers(btf, t->type, &res_id);
++ /* Only pointer to struct is allowed */
++ if (!__btf_type_is_struct(t))
++ return -EINVAL;
++
++ info->type_id = res_id;
++ info->off = off;
++ info->type = type;
++ return BTF_FIELD_FOUND;
++}
++
+ static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t,
+- const char *name, int sz, int align)
++ const char *name, int sz, int align,
++ enum btf_field_type field_type,
++ struct btf_field_info *info, int info_cnt)
+ {
+ const struct btf_member *member;
+- u32 i, off = -ENOENT;
++ struct btf_field_info tmp;
++ int ret, idx = 0;
++ u32 i, off;
+
+ for_each_member(i, t, member) {
+ const struct btf_type *member_type = btf_type_by_id(btf,
+ member->type);
+- if (!__btf_type_is_struct(member_type))
+- continue;
+- if (member_type->size != sz)
+- continue;
+- if (strcmp(__btf_name_by_offset(btf, member_type->name_off), name))
++
++ if (name && strcmp(__btf_name_by_offset(btf, member_type->name_off), name))
+ continue;
+- if (off != -ENOENT)
+- /* only one such field is allowed */
+- return -E2BIG;
++
+ off = __btf_member_bit_offset(t, member);
+ if (off % 8)
+ /* valid C code cannot generate such BTF */
+@@ -3188,46 +3268,115 @@ static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t
+ off /= 8;
+ if (off % align)
+ return -EINVAL;
++
++ switch (field_type) {
++ case BTF_FIELD_SPIN_LOCK:
++ case BTF_FIELD_TIMER:
++ ret = btf_find_struct(btf, member_type, off, sz,
++ idx < info_cnt ? &info[idx] : &tmp);
++ if (ret < 0)
++ return ret;
++ break;
++ case BTF_FIELD_KPTR:
++ ret = btf_find_kptr(btf, member_type, off, sz,
++ idx < info_cnt ? &info[idx] : &tmp);
++ if (ret < 0)
++ return ret;
++ break;
++ default:
++ return -EFAULT;
++ }
++
++ if (ret == BTF_FIELD_IGNORE)
++ continue;
++ if (idx >= info_cnt)
++ return -E2BIG;
++ ++idx;
+ }
+- return off;
++ return idx;
+ }
+
+ static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
+- const char *name, int sz, int align)
++ const char *name, int sz, int align,
++ enum btf_field_type field_type,
++ struct btf_field_info *info, int info_cnt)
+ {
+ const struct btf_var_secinfo *vsi;
+- u32 i, off = -ENOENT;
++ struct btf_field_info tmp;
++ int ret, idx = 0;
++ u32 i, off;
+
+ for_each_vsi(i, t, vsi) {
+ const struct btf_type *var = btf_type_by_id(btf, vsi->type);
+ const struct btf_type *var_type = btf_type_by_id(btf, var->type);
+
+- if (!__btf_type_is_struct(var_type))
+- continue;
+- if (var_type->size != sz)
++ off = vsi->offset;
++
++ if (name && strcmp(__btf_name_by_offset(btf, var_type->name_off), name))
+ continue;
+ if (vsi->size != sz)
+ continue;
+- if (strcmp(__btf_name_by_offset(btf, var_type->name_off), name))
+- continue;
+- if (off != -ENOENT)
+- /* only one such field is allowed */
+- return -E2BIG;
+- off = vsi->offset;
+ if (off % align)
+ return -EINVAL;
++
++ switch (field_type) {
++ case BTF_FIELD_SPIN_LOCK:
++ case BTF_FIELD_TIMER:
++ ret = btf_find_struct(btf, var_type, off, sz,
++ idx < info_cnt ? &info[idx] : &tmp);
++ if (ret < 0)
++ return ret;
++ break;
++ case BTF_FIELD_KPTR:
++ ret = btf_find_kptr(btf, var_type, off, sz,
++ idx < info_cnt ? &info[idx] : &tmp);
++ if (ret < 0)
++ return ret;
++ break;
++ default:
++ return -EFAULT;
++ }
++
++ if (ret == BTF_FIELD_IGNORE)
++ continue;
++ if (idx >= info_cnt)
++ return -E2BIG;
++ ++idx;
+ }
+- return off;
++ return idx;
+ }
+
+ static int btf_find_field(const struct btf *btf, const struct btf_type *t,
+- const char *name, int sz, int align)
++ enum btf_field_type field_type,
++ struct btf_field_info *info, int info_cnt)
+ {
++ const char *name;
++ int sz, align;
++
++ switch (field_type) {
++ case BTF_FIELD_SPIN_LOCK:
++ name = "bpf_spin_lock";
++ sz = sizeof(struct bpf_spin_lock);
++ align = __alignof__(struct bpf_spin_lock);
++ break;
++ case BTF_FIELD_TIMER:
++ name = "bpf_timer";
++ sz = sizeof(struct bpf_timer);
++ align = __alignof__(struct bpf_timer);
++ break;
++ case BTF_FIELD_KPTR:
++ name = NULL;
++ sz = sizeof(u64);
++ align = 8;
++ break;
++ default:
++ return -EFAULT;
++ }
+
+ if (__btf_type_is_struct(t))
+- return btf_find_struct_field(btf, t, name, sz, align);
++ return btf_find_struct_field(btf, t, name, sz, align, field_type, info, info_cnt);
+ else if (btf_type_is_datasec(t))
+- return btf_find_datasec_var(btf, t, name, sz, align);
++ return btf_find_datasec_var(btf, t, name, sz, align, field_type, info, info_cnt);
+ return -EINVAL;
+ }
+
+@@ -3237,16 +3386,130 @@ static int btf_find_field(const struct btf *btf, const struct btf_type *t,
+ */
+ int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t)
+ {
+- return btf_find_field(btf, t, "bpf_spin_lock",
+- sizeof(struct bpf_spin_lock),
+- __alignof__(struct bpf_spin_lock));
++ struct btf_field_info info;
++ int ret;
++
++ ret = btf_find_field(btf, t, BTF_FIELD_SPIN_LOCK, &info, 1);
++ if (ret < 0)
++ return ret;
++ if (!ret)
++ return -ENOENT;
++ return info.off;
+ }
+
+ int btf_find_timer(const struct btf *btf, const struct btf_type *t)
+ {
+- return btf_find_field(btf, t, "bpf_timer",
+- sizeof(struct bpf_timer),
+- __alignof__(struct bpf_timer));
++ struct btf_field_info info;
++ int ret;
++
++ ret = btf_find_field(btf, t, BTF_FIELD_TIMER, &info, 1);
++ if (ret < 0)
++ return ret;
++ if (!ret)
++ return -ENOENT;
++ return info.off;
++}
++
++struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf,
++ const struct btf_type *t)
++{
++ struct btf_field_info info_arr[BPF_MAP_VALUE_OFF_MAX];
++ struct bpf_map_value_off *tab;
++ struct btf *kernel_btf = NULL;
++ struct module *mod = NULL;
++ int ret, i, nr_off;
++
++ ret = btf_find_field(btf, t, BTF_FIELD_KPTR, info_arr, ARRAY_SIZE(info_arr));
++ if (ret < 0)
++ return ERR_PTR(ret);
++ if (!ret)
++ return NULL;
++
++ nr_off = ret;
++ tab = kzalloc(offsetof(struct bpf_map_value_off, off[nr_off]), GFP_KERNEL | __GFP_NOWARN);
++ if (!tab)
++ return ERR_PTR(-ENOMEM);
++
++ for (i = 0; i < nr_off; i++) {
++ const struct btf_type *t;
++ s32 id;
++
++ /* Find type in map BTF, and use it to look up the matching type
++ * in vmlinux or module BTFs, by name and kind.
++ */
++ t = btf_type_by_id(btf, info_arr[i].type_id);
++ id = bpf_find_btf_id(__btf_name_by_offset(btf, t->name_off), BTF_INFO_KIND(t->info),
++ &kernel_btf);
++ if (id < 0) {
++ ret = id;
++ goto end;
++ }
++
++ /* Find and stash the function pointer for the destruction function that
++ * needs to be eventually invoked from the map free path.
++ */
++ if (info_arr[i].type == BPF_KPTR_REF) {
++ const struct btf_type *dtor_func;
++ const char *dtor_func_name;
++ unsigned long addr;
++ s32 dtor_btf_id;
++
++ /* This call also serves as a whitelist of allowed objects that
++ * can be used as a referenced pointer and be stored in a map at
++ * the same time.
++ */
++ dtor_btf_id = btf_find_dtor_kfunc(kernel_btf, id);
++ if (dtor_btf_id < 0) {
++ ret = dtor_btf_id;
++ goto end_btf;
++ }
++
++ dtor_func = btf_type_by_id(kernel_btf, dtor_btf_id);
++ if (!dtor_func) {
++ ret = -ENOENT;
++ goto end_btf;
++ }
++
++ if (btf_is_module(kernel_btf)) {
++ mod = btf_try_get_module(kernel_btf);
++ if (!mod) {
++ ret = -ENXIO;
++ goto end_btf;
++ }
++ }
++
++ /* We already verified dtor_func to be btf_type_is_func
++ * in register_btf_id_dtor_kfuncs.
++ */
++ dtor_func_name = __btf_name_by_offset(kernel_btf, dtor_func->name_off);
++ addr = kallsyms_lookup_name(dtor_func_name);
++ if (!addr) {
++ ret = -EINVAL;
++ goto end_mod;
++ }
++ tab->off[i].kptr.dtor = (void *)addr;
++ }
++
++ tab->off[i].offset = info_arr[i].off;
++ tab->off[i].type = info_arr[i].type;
++ tab->off[i].kptr.btf_id = id;
++ tab->off[i].kptr.btf = kernel_btf;
++ tab->off[i].kptr.module = mod;
++ }
++ tab->nr_off = nr_off;
++ return tab;
++end_mod:
++ module_put(mod);
++end_btf:
++ btf_put(kernel_btf);
++end:
++ while (i--) {
++ btf_put(tab->off[i].kptr.btf);
++ if (tab->off[i].kptr.module)
++ module_put(tab->off[i].kptr.module);
++ }
++ kfree(tab);
++ return ERR_PTR(ret);
+ }
+
+ static void __btf_struct_show(const struct btf *btf, const struct btf_type *t,
+@@ -5811,6 +6074,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
+ * verifier sees.
+ */
+ for (i = 0; i < nargs; i++) {
++ enum bpf_arg_type arg_type = ARG_DONTCARE;
+ u32 regno = i + 1;
+ struct bpf_reg_state *reg = ®s[regno];
+
+@@ -5831,7 +6095,9 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
+ ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
+ ref_tname = btf_name_by_offset(btf, ref_t->name_off);
+
+- ret = check_func_arg_reg_off(env, reg, regno, ARG_DONTCARE, rel);
++ if (rel && reg->ref_obj_id)
++ arg_type |= OBJ_RELEASE;
++ ret = check_func_arg_reg_off(env, reg, regno, arg_type);
+ if (ret < 0)
+ return ret;
+
+@@ -5862,11 +6128,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
+ if (reg->type == PTR_TO_BTF_ID) {
+ reg_btf = reg->btf;
+ reg_ref_id = reg->btf_id;
+- /* Ensure only one argument is referenced
+- * PTR_TO_BTF_ID, check_func_arg_reg_off relies
+- * on only one referenced register being allowed
+- * for kfuncs.
+- */
++ /* Ensure only one argument is referenced PTR_TO_BTF_ID */
+ if (reg->ref_obj_id) {
+ if (ref_obj_id) {
+ bpf_log(log, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
+@@ -6832,6 +7094,138 @@ int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
+ }
+ EXPORT_SYMBOL_GPL(register_btf_kfunc_id_set);
+
++s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id)
++{
++ struct btf_id_dtor_kfunc_tab *tab = btf->dtor_kfunc_tab;
++ struct btf_id_dtor_kfunc *dtor;
++
++ if (!tab)
++ return -ENOENT;
++ /* Even though the size of tab->dtors[0] is > sizeof(u32), we only need
++ * to compare the first u32 with btf_id, so we can reuse btf_id_cmp_func.
++ */
++ BUILD_BUG_ON(offsetof(struct btf_id_dtor_kfunc, btf_id) != 0);
++ dtor = bsearch(&btf_id, tab->dtors, tab->cnt, sizeof(tab->dtors[0]), btf_id_cmp_func);
++ if (!dtor)
++ return -ENOENT;
++ return dtor->kfunc_btf_id;
++}
++
++static int btf_check_dtor_kfuncs(struct btf *btf, const struct btf_id_dtor_kfunc *dtors, u32 cnt)
++{
++ const struct btf_type *dtor_func, *dtor_func_proto, *t;
++ const struct btf_param *args;
++ s32 dtor_btf_id;
++ u32 nr_args, i;
++
++ for (i = 0; i < cnt; i++) {
++ dtor_btf_id = dtors[i].kfunc_btf_id;
++
++ dtor_func = btf_type_by_id(btf, dtor_btf_id);
++ if (!dtor_func || !btf_type_is_func(dtor_func))
++ return -EINVAL;
++
++ dtor_func_proto = btf_type_by_id(btf, dtor_func->type);
++ if (!dtor_func_proto || !btf_type_is_func_proto(dtor_func_proto))
++ return -EINVAL;
++
++ /* Make sure the prototype of the destructor kfunc is 'void func(type *)' */
++ t = btf_type_by_id(btf, dtor_func_proto->type);
++ if (!t || !btf_type_is_void(t))
++ return -EINVAL;
++
++ nr_args = btf_type_vlen(dtor_func_proto);
++ if (nr_args != 1)
++ return -EINVAL;
++ args = btf_params(dtor_func_proto);
++ t = btf_type_by_id(btf, args[0].type);
++ /* Allow any pointer type, as width on targets Linux supports
++ * will be same for all pointer types (i.e. sizeof(void *))
++ */
++ if (!t || !btf_type_is_ptr(t))
++ return -EINVAL;
++ }
++ return 0;
++}
++
++/* This function must be invoked only from initcalls/module init functions */
++int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_cnt,
++ struct module *owner)
++{
++ struct btf_id_dtor_kfunc_tab *tab;
++ struct btf *btf;
++ u32 tab_cnt;
++ int ret;
++
++ btf = btf_get_module_btf(owner);
++ if (!btf) {
++ if (!owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
++ pr_err("missing vmlinux BTF, cannot register dtor kfuncs\n");
++ return -ENOENT;
++ }
++ if (owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)) {
++ pr_err("missing module BTF, cannot register dtor kfuncs\n");
++ return -ENOENT;
++ }
++ return 0;
++ }
++ if (IS_ERR(btf))
++ return PTR_ERR(btf);
++
++ if (add_cnt >= BTF_DTOR_KFUNC_MAX_CNT) {
++ pr_err("cannot register more than %d kfunc destructors\n", BTF_DTOR_KFUNC_MAX_CNT);
++ ret = -E2BIG;
++ goto end;
++ }
++
++ /* Ensure that the prototype of dtor kfuncs being registered is sane */
++ ret = btf_check_dtor_kfuncs(btf, dtors, add_cnt);
++ if (ret < 0)
++ goto end;
++
++ tab = btf->dtor_kfunc_tab;
++ /* Only one call allowed for modules */
++ if (WARN_ON_ONCE(tab && btf_is_module(btf))) {
++ ret = -EINVAL;
++ goto end;
++ }
++
++ tab_cnt = tab ? tab->cnt : 0;
++ if (tab_cnt > U32_MAX - add_cnt) {
++ ret = -EOVERFLOW;
++ goto end;
++ }
++ if (tab_cnt + add_cnt >= BTF_DTOR_KFUNC_MAX_CNT) {
++ pr_err("cannot register more than %d kfunc destructors\n", BTF_DTOR_KFUNC_MAX_CNT);
++ ret = -E2BIG;
++ goto end;
++ }
++
++ tab = krealloc(btf->dtor_kfunc_tab,
++ offsetof(struct btf_id_dtor_kfunc_tab, dtors[tab_cnt + add_cnt]),
++ GFP_KERNEL | __GFP_NOWARN);
++ if (!tab) {
++ ret = -ENOMEM;
++ goto end;
++ }
++
++ if (!btf->dtor_kfunc_tab)
++ tab->cnt = 0;
++ btf->dtor_kfunc_tab = tab;
++
++ memcpy(tab->dtors + tab->cnt, dtors, add_cnt * sizeof(tab->dtors[0]));
++ tab->cnt += add_cnt;
++
++ sort(tab->dtors, tab->cnt, sizeof(tab->dtors[0]), btf_id_cmp_func, NULL);
++
++ return 0;
++end:
++ btf_free_dtor_kfunc_tab(btf);
++ btf_put(btf);
++ return ret;
++}
++EXPORT_SYMBOL_GPL(register_btf_id_dtor_kfuncs);
++
+ #define MAX_TYPES_ARE_COMPAT_DEPTH 2
+
+ static
+diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
+index 0cb6211fcb58d..e7af3e582b4c1 100644
+--- a/kernel/bpf/cgroup.c
++++ b/kernel/bpf/cgroup.c
+@@ -747,6 +747,60 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
+ return ERR_PTR(-ENOENT);
+ }
+
++/**
++ * purge_effective_progs() - After compute_effective_progs fails to alloc new
++ * cgrp->bpf.inactive table we can recover by
++ * recomputing the array in place.
++ *
++ * @cgrp: The cgroup which descendants to travers
++ * @prog: A program to detach or NULL
++ * @link: A link to detach or NULL
++ * @atype: Type of detach operation
++ */
++static void purge_effective_progs(struct cgroup *cgrp, struct bpf_prog *prog,
++ struct bpf_cgroup_link *link,
++ enum cgroup_bpf_attach_type atype)
++{
++ struct cgroup_subsys_state *css;
++ struct bpf_prog_array *progs;
++ struct bpf_prog_list *pl;
++ struct list_head *head;
++ struct cgroup *cg;
++ int pos;
++
++ /* recompute effective prog array in place */
++ css_for_each_descendant_pre(css, &cgrp->self) {
++ struct cgroup *desc = container_of(css, struct cgroup, self);
++
++ if (percpu_ref_is_zero(&desc->bpf.refcnt))
++ continue;
++
++ /* find position of link or prog in effective progs array */
++ for (pos = 0, cg = desc; cg; cg = cgroup_parent(cg)) {
++ if (pos && !(cg->bpf.flags[atype] & BPF_F_ALLOW_MULTI))
++ continue;
++
++ head = &cg->bpf.progs[atype];
++ list_for_each_entry(pl, head, node) {
++ if (!prog_list_prog(pl))
++ continue;
++ if (pl->prog == prog && pl->link == link)
++ goto found;
++ pos++;
++ }
++ }
++found:
++ BUG_ON(!cg);
++ progs = rcu_dereference_protected(
++ desc->bpf.effective[atype],
++ lockdep_is_held(&cgroup_mutex));
++
++ /* Remove the program from the array */
++ WARN_ONCE(bpf_prog_array_delete_safe_at(progs, pos),
++ "Failed to purge a prog from array at index %d", pos);
++ }
++}
++
+ /**
+ * __cgroup_bpf_detach() - Detach the program or link from a cgroup, and
+ * propagate the change to descendants
+@@ -766,7 +820,6 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
+ struct bpf_prog_list *pl;
+ struct list_head *progs;
+ u32 flags;
+- int err;
+
+ atype = to_cgroup_bpf_attach_type(type);
+ if (atype < 0)
+@@ -788,9 +841,12 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
+ pl->prog = NULL;
+ pl->link = NULL;
+
+- err = update_effective_progs(cgrp, atype);
+- if (err)
+- goto cleanup;
++ if (update_effective_progs(cgrp, atype)) {
++ /* if update effective array failed replace the prog with a dummy prog*/
++ pl->prog = old_prog;
++ pl->link = link;
++ purge_effective_progs(cgrp, old_prog, link, atype);
++ }
+
+ /* now can actually delete it from this cgroup list */
+ list_del(&pl->node);
+@@ -802,12 +858,6 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
+ bpf_prog_put(old_prog);
+ static_branch_dec(&cgroup_bpf_enabled_key[atype]);
+ return 0;
+-
+-cleanup:
+- /* restore back prog or link */
+- pl->prog = old_prog;
+- pl->link = link;
+- return err;
+ }
+
+ static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
+diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
+index 3adff3831c047..483bee45ead51 100644
+--- a/kernel/bpf/core.c
++++ b/kernel/bpf/core.c
+@@ -649,12 +649,6 @@ static bool bpf_prog_kallsyms_candidate(const struct bpf_prog *fp)
+ return fp->jited && !bpf_prog_was_classic(fp);
+ }
+
+-static bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
+-{
+- return list_empty(&fp->aux->ksym.lnode) ||
+- fp->aux->ksym.lnode.prev == LIST_POISON2;
+-}
+-
+ void bpf_prog_kallsyms_add(struct bpf_prog *fp)
+ {
+ if (!bpf_prog_kallsyms_candidate(fp) ||
+@@ -1149,7 +1143,6 @@ int bpf_jit_binary_pack_finalize(struct bpf_prog *prog,
+ bpf_prog_pack_free(ro_header);
+ return PTR_ERR(ptr);
+ }
+- prog->aux->use_bpf_prog_pack = true;
+ return 0;
+ }
+
+@@ -1173,17 +1166,23 @@ void bpf_jit_binary_pack_free(struct bpf_binary_header *ro_header,
+ bpf_jit_uncharge_modmem(size);
+ }
+
++struct bpf_binary_header *
++bpf_jit_binary_pack_hdr(const struct bpf_prog *fp)
++{
++ unsigned long real_start = (unsigned long)fp->bpf_func;
++ unsigned long addr;
++
++ addr = real_start & BPF_PROG_CHUNK_MASK;
++ return (void *)addr;
++}
++
+ static inline struct bpf_binary_header *
+ bpf_jit_binary_hdr(const struct bpf_prog *fp)
+ {
+ unsigned long real_start = (unsigned long)fp->bpf_func;
+ unsigned long addr;
+
+- if (fp->aux->use_bpf_prog_pack)
+- addr = real_start & BPF_PROG_CHUNK_MASK;
+- else
+- addr = real_start & PAGE_MASK;
+-
++ addr = real_start & PAGE_MASK;
+ return (void *)addr;
+ }
+
+@@ -1196,11 +1195,7 @@ void __weak bpf_jit_free(struct bpf_prog *fp)
+ if (fp->jited) {
+ struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp);
+
+- if (fp->aux->use_bpf_prog_pack)
+- bpf_jit_binary_pack_free(hdr, NULL /* rw_buffer */);
+- else
+- bpf_jit_binary_free(hdr);
+-
++ bpf_jit_binary_free(hdr);
+ WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(fp));
+ }
+
+@@ -2712,6 +2707,12 @@ bool __weak bpf_jit_needs_zext(void)
+ return false;
+ }
+
++/* Return TRUE if the JIT backend supports mixing bpf2bpf and tailcalls. */
++bool __weak bpf_jit_supports_subprog_tailcalls(void)
++{
++ return false;
++}
++
+ bool __weak bpf_jit_supports_kfunc_call(void)
+ {
+ return false;
+diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
+index 65877967f414e..ea99c91f72b68 100644
+--- a/kernel/bpf/hashtab.c
++++ b/kernel/bpf/hashtab.c
+@@ -238,7 +238,7 @@ static void htab_free_prealloced_timers(struct bpf_htab *htab)
+ u32 num_entries = htab->map.max_entries;
+ int i;
+
+- if (likely(!map_value_has_timer(&htab->map)))
++ if (!map_value_has_timer(&htab->map))
+ return;
+ if (htab_has_extra_elems(htab))
+ num_entries += num_possible_cpus();
+@@ -254,6 +254,25 @@ static void htab_free_prealloced_timers(struct bpf_htab *htab)
+ }
+ }
+
++static void htab_free_prealloced_kptrs(struct bpf_htab *htab)
++{
++ u32 num_entries = htab->map.max_entries;
++ int i;
++
++ if (!map_value_has_kptrs(&htab->map))
++ return;
++ if (htab_has_extra_elems(htab))
++ num_entries += num_possible_cpus();
++
++ for (i = 0; i < num_entries; i++) {
++ struct htab_elem *elem;
++
++ elem = get_htab_elem(htab, i);
++ bpf_map_free_kptrs(&htab->map, elem->key + round_up(htab->map.key_size, 8));
++ cond_resched();
++ }
++}
++
+ static void htab_free_elems(struct bpf_htab *htab)
+ {
+ int i;
+@@ -725,12 +744,15 @@ static int htab_lru_map_gen_lookup(struct bpf_map *map,
+ return insn - insn_buf;
+ }
+
+-static void check_and_free_timer(struct bpf_htab *htab, struct htab_elem *elem)
++static void check_and_free_fields(struct bpf_htab *htab,
++ struct htab_elem *elem)
+ {
+- if (unlikely(map_value_has_timer(&htab->map)))
+- bpf_timer_cancel_and_free(elem->key +
+- round_up(htab->map.key_size, 8) +
+- htab->map.timer_off);
++ void *map_value = elem->key + round_up(htab->map.key_size, 8);
++
++ if (map_value_has_timer(&htab->map))
++ bpf_timer_cancel_and_free(map_value + htab->map.timer_off);
++ if (map_value_has_kptrs(&htab->map))
++ bpf_map_free_kptrs(&htab->map, map_value);
+ }
+
+ /* It is called from the bpf_lru_list when the LRU needs to delete
+@@ -757,7 +779,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
+ hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
+ if (l == tgt_l) {
+ hlist_nulls_del_rcu(&l->hash_node);
+- check_and_free_timer(htab, l);
++ check_and_free_fields(htab, l);
+ break;
+ }
+
+@@ -829,7 +851,7 @@ static void htab_elem_free(struct bpf_htab *htab, struct htab_elem *l)
+ {
+ if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH)
+ free_percpu(htab_elem_get_ptr(l, htab->map.key_size));
+- check_and_free_timer(htab, l);
++ check_and_free_fields(htab, l);
+ kfree(l);
+ }
+
+@@ -857,7 +879,7 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
+ htab_put_fd_value(htab, l);
+
+ if (htab_is_prealloc(htab)) {
+- check_and_free_timer(htab, l);
++ check_and_free_fields(htab, l);
+ __pcpu_freelist_push(&htab->freelist, &l->fnode);
+ } else {
+ atomic_dec(&htab->count);
+@@ -1104,7 +1126,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
+ if (!htab_is_prealloc(htab))
+ free_htab_elem(htab, l_old);
+ else
+- check_and_free_timer(htab, l_old);
++ check_and_free_fields(htab, l_old);
+ }
+ ret = 0;
+ err:
+@@ -1114,7 +1136,7 @@ err:
+
+ static void htab_lru_push_free(struct bpf_htab *htab, struct htab_elem *elem)
+ {
+- check_and_free_timer(htab, elem);
++ check_and_free_fields(htab, elem);
+ bpf_lru_push_free(&htab->lru, &elem->lru_node);
+ }
+
+@@ -1419,8 +1441,14 @@ static void htab_free_malloced_timers(struct bpf_htab *htab)
+ struct hlist_nulls_node *n;
+ struct htab_elem *l;
+
+- hlist_nulls_for_each_entry(l, n, head, hash_node)
+- check_and_free_timer(htab, l);
++ hlist_nulls_for_each_entry(l, n, head, hash_node) {
++ /* We don't reset or free kptr on uref dropping to zero,
++ * hence just free timer.
++ */
++ bpf_timer_cancel_and_free(l->key +
++ round_up(htab->map.key_size, 8) +
++ htab->map.timer_off);
++ }
+ cond_resched_rcu();
+ }
+ rcu_read_unlock();
+@@ -1430,7 +1458,8 @@ static void htab_map_free_timers(struct bpf_map *map)
+ {
+ struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+
+- if (likely(!map_value_has_timer(&htab->map)))
++ /* We don't reset or free kptr on uref dropping to zero. */
++ if (!map_value_has_timer(&htab->map))
+ return;
+ if (!htab_is_prealloc(htab))
+ htab_free_malloced_timers(htab);
+@@ -1453,11 +1482,14 @@ static void htab_map_free(struct bpf_map *map)
+ * not have executed. Wait for them.
+ */
+ rcu_barrier();
+- if (!htab_is_prealloc(htab))
++ if (!htab_is_prealloc(htab)) {
+ delete_all_elements(htab);
+- else
++ } else {
++ htab_free_prealloced_kptrs(htab);
+ prealloc_destroy(htab);
++ }
+
++ bpf_map_free_kptr_off_tab(map);
+ free_percpu(htab->extra_elems);
+ bpf_map_area_free(htab->buckets);
+ for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
+diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
+index 315053ef6a750..3e709fed53061 100644
+--- a/kernel/bpf/helpers.c
++++ b/kernel/bpf/helpers.c
+@@ -1374,6 +1374,28 @@ out:
+ kfree(t);
+ }
+
++BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr)
++{
++ unsigned long *kptr = map_value;
++
++ return xchg(kptr, (unsigned long)ptr);
++}
++
++/* Unlike other PTR_TO_BTF_ID helpers the btf_id in bpf_kptr_xchg()
++ * helper is determined dynamically by the verifier.
++ */
++#define BPF_PTR_POISON ((void *)((0xeB9FUL << 2) + POISON_POINTER_DELTA))
++
++const struct bpf_func_proto bpf_kptr_xchg_proto = {
++ .func = bpf_kptr_xchg,
++ .gpl_only = false,
++ .ret_type = RET_PTR_TO_BTF_ID_OR_NULL,
++ .ret_btf_id = BPF_PTR_POISON,
++ .arg1_type = ARG_PTR_TO_KPTR,
++ .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL | OBJ_RELEASE,
++ .arg2_btf_id = BPF_PTR_POISON,
++};
++
+ const struct bpf_func_proto bpf_get_current_task_proto __weak;
+ const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
+ const struct bpf_func_proto bpf_probe_read_user_proto __weak;
+@@ -1452,6 +1474,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
+ return &bpf_timer_start_proto;
+ case BPF_FUNC_timer_cancel:
+ return &bpf_timer_cancel_proto;
++ case BPF_FUNC_kptr_xchg:
++ return &bpf_kptr_xchg_proto;
+ default:
+ break;
+ }
+diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c
+index 5cd8f52772790..135205d0d5607 100644
+--- a/kernel/bpf/map_in_map.c
++++ b/kernel/bpf/map_in_map.c
+@@ -52,6 +52,7 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
+ inner_map_meta->max_entries = inner_map->max_entries;
+ inner_map_meta->spin_lock_off = inner_map->spin_lock_off;
+ inner_map_meta->timer_off = inner_map->timer_off;
++ inner_map_meta->kptr_off_tab = bpf_map_copy_kptr_off_tab(inner_map);
+ if (inner_map->btf) {
+ btf_get(inner_map->btf);
+ inner_map_meta->btf = inner_map->btf;
+@@ -71,6 +72,7 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
+
+ void bpf_map_meta_free(struct bpf_map *map_meta)
+ {
++ bpf_map_free_kptr_off_tab(map_meta);
+ btf_put(map_meta->btf);
+ kfree(map_meta);
+ }
+@@ -83,7 +85,8 @@ bool bpf_map_meta_equal(const struct bpf_map *meta0,
+ meta0->key_size == meta1->key_size &&
+ meta0->value_size == meta1->value_size &&
+ meta0->timer_off == meta1->timer_off &&
+- meta0->map_flags == meta1->map_flags;
++ meta0->map_flags == meta1->map_flags &&
++ bpf_map_equal_kptr_off_tab(meta0, meta1);
+ }
+
+ void *bpf_map_fd_get_ptr(struct bpf_map *map,
+diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
+index 710ba9de12ce4..5173fd37590f1 100644
+--- a/kernel/bpf/ringbuf.c
++++ b/kernel/bpf/ringbuf.c
+@@ -404,7 +404,7 @@ BPF_CALL_2(bpf_ringbuf_submit, void *, sample, u64, flags)
+ const struct bpf_func_proto bpf_ringbuf_submit_proto = {
+ .func = bpf_ringbuf_submit,
+ .ret_type = RET_VOID,
+- .arg1_type = ARG_PTR_TO_ALLOC_MEM,
++ .arg1_type = ARG_PTR_TO_ALLOC_MEM | OBJ_RELEASE,
+ .arg2_type = ARG_ANYTHING,
+ };
+
+@@ -417,7 +417,7 @@ BPF_CALL_2(bpf_ringbuf_discard, void *, sample, u64, flags)
+ const struct bpf_func_proto bpf_ringbuf_discard_proto = {
+ .func = bpf_ringbuf_discard,
+ .ret_type = RET_VOID,
+- .arg1_type = ARG_PTR_TO_ALLOC_MEM,
++ .arg1_type = ARG_PTR_TO_ALLOC_MEM | OBJ_RELEASE,
+ .arg2_type = ARG_ANYTHING,
+ };
+
+diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
+index cdaa1152436aa..f4d1f974a8cd0 100644
+--- a/kernel/bpf/syscall.c
++++ b/kernel/bpf/syscall.c
+@@ -6,6 +6,7 @@
+ #include <linux/bpf_trace.h>
+ #include <linux/bpf_lirc.h>
+ #include <linux/bpf_verifier.h>
++#include <linux/bsearch.h>
+ #include <linux/btf.h>
+ #include <linux/syscalls.h>
+ #include <linux/slab.h>
+@@ -29,6 +30,7 @@
+ #include <linux/pgtable.h>
+ #include <linux/bpf_lsm.h>
+ #include <linux/poll.h>
++#include <linux/sort.h>
+ #include <linux/bpf-netns.h>
+ #include <linux/rcupdate_trace.h>
+ #include <linux/memcontrol.h>
+@@ -473,14 +475,128 @@ static void bpf_map_release_memcg(struct bpf_map *map)
+ }
+ #endif
+
++static int bpf_map_kptr_off_cmp(const void *a, const void *b)
++{
++ const struct bpf_map_value_off_desc *off_desc1 = a, *off_desc2 = b;
++
++ if (off_desc1->offset < off_desc2->offset)
++ return -1;
++ else if (off_desc1->offset > off_desc2->offset)
++ return 1;
++ return 0;
++}
++
++struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset)
++{
++ /* Since members are iterated in btf_find_field in increasing order,
++ * offsets appended to kptr_off_tab are in increasing order, so we can
++ * do bsearch to find exact match.
++ */
++ struct bpf_map_value_off *tab;
++
++ if (!map_value_has_kptrs(map))
++ return NULL;
++ tab = map->kptr_off_tab;
++ return bsearch(&offset, tab->off, tab->nr_off, sizeof(tab->off[0]), bpf_map_kptr_off_cmp);
++}
++
++void bpf_map_free_kptr_off_tab(struct bpf_map *map)
++{
++ struct bpf_map_value_off *tab = map->kptr_off_tab;
++ int i;
++
++ if (!map_value_has_kptrs(map))
++ return;
++ for (i = 0; i < tab->nr_off; i++) {
++ if (tab->off[i].kptr.module)
++ module_put(tab->off[i].kptr.module);
++ btf_put(tab->off[i].kptr.btf);
++ }
++ kfree(tab);
++ map->kptr_off_tab = NULL;
++}
++
++struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map)
++{
++ struct bpf_map_value_off *tab = map->kptr_off_tab, *new_tab;
++ int size, i;
++
++ if (!map_value_has_kptrs(map))
++ return ERR_PTR(-ENOENT);
++ size = offsetof(struct bpf_map_value_off, off[tab->nr_off]);
++ new_tab = kmemdup(tab, size, GFP_KERNEL | __GFP_NOWARN);
++ if (!new_tab)
++ return ERR_PTR(-ENOMEM);
++ /* Do a deep copy of the kptr_off_tab */
++ for (i = 0; i < tab->nr_off; i++) {
++ btf_get(tab->off[i].kptr.btf);
++ if (tab->off[i].kptr.module && !try_module_get(tab->off[i].kptr.module)) {
++ while (i--) {
++ if (tab->off[i].kptr.module)
++ module_put(tab->off[i].kptr.module);
++ btf_put(tab->off[i].kptr.btf);
++ }
++ kfree(new_tab);
++ return ERR_PTR(-ENXIO);
++ }
++ }
++ return new_tab;
++}
++
++bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b)
++{
++ struct bpf_map_value_off *tab_a = map_a->kptr_off_tab, *tab_b = map_b->kptr_off_tab;
++ bool a_has_kptr = map_value_has_kptrs(map_a), b_has_kptr = map_value_has_kptrs(map_b);
++ int size;
++
++ if (!a_has_kptr && !b_has_kptr)
++ return true;
++ if (a_has_kptr != b_has_kptr)
++ return false;
++ if (tab_a->nr_off != tab_b->nr_off)
++ return false;
++ size = offsetof(struct bpf_map_value_off, off[tab_a->nr_off]);
++ return !memcmp(tab_a, tab_b, size);
++}
++
++/* Caller must ensure map_value_has_kptrs is true. Note that this function can
++ * be called on a map value while the map_value is visible to BPF programs, as
++ * it ensures the correct synchronization, and we already enforce the same using
++ * the bpf_kptr_xchg helper on the BPF program side for referenced kptrs.
++ */
++void bpf_map_free_kptrs(struct bpf_map *map, void *map_value)
++{
++ struct bpf_map_value_off *tab = map->kptr_off_tab;
++ unsigned long *btf_id_ptr;
++ int i;
++
++ for (i = 0; i < tab->nr_off; i++) {
++ struct bpf_map_value_off_desc *off_desc = &tab->off[i];
++ unsigned long old_ptr;
++
++ btf_id_ptr = map_value + off_desc->offset;
++ if (off_desc->type == BPF_KPTR_UNREF) {
++ u64 *p = (u64 *)btf_id_ptr;
++
++ WRITE_ONCE(p, 0);
++ continue;
++ }
++ old_ptr = xchg(btf_id_ptr, 0);
++ off_desc->kptr.dtor((void *)old_ptr);
++ }
++}
++
+ /* called from workqueue */
+ static void bpf_map_free_deferred(struct work_struct *work)
+ {
+ struct bpf_map *map = container_of(work, struct bpf_map, work);
+
+ security_bpf_map_free(map);
++ kfree(map->off_arr);
+ bpf_map_release_memcg(map);
+- /* implementation dependent freeing */
++ /* implementation dependent freeing, map_free callback also does
++ * bpf_map_free_kptr_off_tab, if needed.
++ */
+ map->ops->map_free(map);
+ }
+
+@@ -640,7 +756,7 @@ static int bpf_map_mmap(struct file *filp, struct vm_area_struct *vma)
+ int err;
+
+ if (!map->ops->map_mmap || map_value_has_spin_lock(map) ||
+- map_value_has_timer(map))
++ map_value_has_timer(map) || map_value_has_kptrs(map))
+ return -ENOTSUPP;
+
+ if (!(vma->vm_flags & VM_SHARED))
+@@ -767,6 +883,84 @@ int map_check_no_btf(const struct bpf_map *map,
+ return -ENOTSUPP;
+ }
+
++static int map_off_arr_cmp(const void *_a, const void *_b, const void *priv)
++{
++ const u32 a = *(const u32 *)_a;
++ const u32 b = *(const u32 *)_b;
++
++ if (a < b)
++ return -1;
++ else if (a > b)
++ return 1;
++ return 0;
++}
++
++static void map_off_arr_swap(void *_a, void *_b, int size, const void *priv)
++{
++ struct bpf_map *map = (struct bpf_map *)priv;
++ u32 *off_base = map->off_arr->field_off;
++ u32 *a = _a, *b = _b;
++ u8 *sz_a, *sz_b;
++
++ sz_a = map->off_arr->field_sz + (a - off_base);
++ sz_b = map->off_arr->field_sz + (b - off_base);
++
++ swap(*a, *b);
++ swap(*sz_a, *sz_b);
++}
++
++static int bpf_map_alloc_off_arr(struct bpf_map *map)
++{
++ bool has_spin_lock = map_value_has_spin_lock(map);
++ bool has_timer = map_value_has_timer(map);
++ bool has_kptrs = map_value_has_kptrs(map);
++ struct bpf_map_off_arr *off_arr;
++ u32 i;
++
++ if (!has_spin_lock && !has_timer && !has_kptrs) {
++ map->off_arr = NULL;
++ return 0;
++ }
++
++ off_arr = kmalloc(sizeof(*map->off_arr), GFP_KERNEL | __GFP_NOWARN);
++ if (!off_arr)
++ return -ENOMEM;
++ map->off_arr = off_arr;
++
++ off_arr->cnt = 0;
++ if (has_spin_lock) {
++ i = off_arr->cnt;
++
++ off_arr->field_off[i] = map->spin_lock_off;
++ off_arr->field_sz[i] = sizeof(struct bpf_spin_lock);
++ off_arr->cnt++;
++ }
++ if (has_timer) {
++ i = off_arr->cnt;
++
++ off_arr->field_off[i] = map->timer_off;
++ off_arr->field_sz[i] = sizeof(struct bpf_timer);
++ off_arr->cnt++;
++ }
++ if (has_kptrs) {
++ struct bpf_map_value_off *tab = map->kptr_off_tab;
++ u32 *off = &off_arr->field_off[off_arr->cnt];
++ u8 *sz = &off_arr->field_sz[off_arr->cnt];
++
++ for (i = 0; i < tab->nr_off; i++) {
++ *off++ = tab->off[i].offset;
++ *sz++ = sizeof(u64);
++ }
++ off_arr->cnt += tab->nr_off;
++ }
++
++ if (off_arr->cnt == 1)
++ return 0;
++ sort_r(off_arr->field_off, off_arr->cnt, sizeof(off_arr->field_off[0]),
++ map_off_arr_cmp, map_off_arr_swap, map);
++ return 0;
++}
++
+ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
+ u32 btf_key_id, u32 btf_value_id)
+ {
+@@ -820,9 +1014,33 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
+ return -EOPNOTSUPP;
+ }
+
+- if (map->ops->map_check_btf)
++ map->kptr_off_tab = btf_parse_kptrs(btf, value_type);
++ if (map_value_has_kptrs(map)) {
++ if (!bpf_capable()) {
++ ret = -EPERM;
++ goto free_map_tab;
++ }
++ if (map->map_flags & (BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG)) {
++ ret = -EACCES;
++ goto free_map_tab;
++ }
++ if (map->map_type != BPF_MAP_TYPE_HASH &&
++ map->map_type != BPF_MAP_TYPE_LRU_HASH &&
++ map->map_type != BPF_MAP_TYPE_ARRAY) {
++ ret = -EOPNOTSUPP;
++ goto free_map_tab;
++ }
++ }
++
++ if (map->ops->map_check_btf) {
+ ret = map->ops->map_check_btf(map, btf, key_type, value_type);
++ if (ret < 0)
++ goto free_map_tab;
++ }
+
++ return ret;
++free_map_tab:
++ bpf_map_free_kptr_off_tab(map);
+ return ret;
+ }
+
+@@ -912,10 +1130,14 @@ static int map_create(union bpf_attr *attr)
+ attr->btf_vmlinux_value_type_id;
+ }
+
+- err = security_bpf_map_alloc(map);
++ err = bpf_map_alloc_off_arr(map);
+ if (err)
+ goto free_map;
+
++ err = security_bpf_map_alloc(map);
++ if (err)
++ goto free_map_off_arr;
++
+ err = bpf_map_alloc_id(map);
+ if (err)
+ goto free_map_sec;
+@@ -938,6 +1160,8 @@ static int map_create(union bpf_attr *attr)
+
+ free_map_sec:
+ security_bpf_map_free(map);
++free_map_off_arr:
++ kfree(map->off_arr);
+ free_map:
+ btf_put(map->btf);
+ map->ops->map_free(map);
+@@ -1639,7 +1863,7 @@ static int map_freeze(const union bpf_attr *attr)
+ return PTR_ERR(map);
+
+ if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS ||
+- map_value_has_timer(map)) {
++ map_value_has_timer(map) || map_value_has_kptrs(map)) {
+ fdput(f);
+ return -ENOTSUPP;
+ }
+@@ -2640,19 +2864,12 @@ struct bpf_link *bpf_link_get_from_fd(u32 ufd)
+ }
+ EXPORT_SYMBOL(bpf_link_get_from_fd);
+
+-struct bpf_tracing_link {
+- struct bpf_link link;
+- enum bpf_attach_type attach_type;
+- struct bpf_trampoline *trampoline;
+- struct bpf_prog *tgt_prog;
+-};
+-
+ static void bpf_tracing_link_release(struct bpf_link *link)
+ {
+ struct bpf_tracing_link *tr_link =
+- container_of(link, struct bpf_tracing_link, link);
++ container_of(link, struct bpf_tracing_link, link.link);
+
+- WARN_ON_ONCE(bpf_trampoline_unlink_prog(link->prog,
++ WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
+ tr_link->trampoline));
+
+ bpf_trampoline_put(tr_link->trampoline);
+@@ -2665,7 +2882,7 @@ static void bpf_tracing_link_release(struct bpf_link *link)
+ static void bpf_tracing_link_dealloc(struct bpf_link *link)
+ {
+ struct bpf_tracing_link *tr_link =
+- container_of(link, struct bpf_tracing_link, link);
++ container_of(link, struct bpf_tracing_link, link.link);
+
+ kfree(tr_link);
+ }
+@@ -2674,7 +2891,7 @@ static void bpf_tracing_link_show_fdinfo(const struct bpf_link *link,
+ struct seq_file *seq)
+ {
+ struct bpf_tracing_link *tr_link =
+- container_of(link, struct bpf_tracing_link, link);
++ container_of(link, struct bpf_tracing_link, link.link);
+
+ seq_printf(seq,
+ "attach_type:\t%d\n",
+@@ -2685,7 +2902,7 @@ static int bpf_tracing_link_fill_link_info(const struct bpf_link *link,
+ struct bpf_link_info *info)
+ {
+ struct bpf_tracing_link *tr_link =
+- container_of(link, struct bpf_tracing_link, link);
++ container_of(link, struct bpf_tracing_link, link.link);
+
+ info->tracing.attach_type = tr_link->attach_type;
+ bpf_trampoline_unpack_key(tr_link->trampoline->key,
+@@ -2766,7 +2983,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
+ err = -ENOMEM;
+ goto out_put_prog;
+ }
+- bpf_link_init(&link->link, BPF_LINK_TYPE_TRACING,
++ bpf_link_init(&link->link.link, BPF_LINK_TYPE_TRACING,
+ &bpf_tracing_link_lops, prog);
+ link->attach_type = prog->expected_attach_type;
+
+@@ -2836,11 +3053,11 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
+ tgt_prog = prog->aux->dst_prog;
+ }
+
+- err = bpf_link_prime(&link->link, &link_primer);
++ err = bpf_link_prime(&link->link.link, &link_primer);
+ if (err)
+ goto out_unlock;
+
+- err = bpf_trampoline_link_prog(prog, tr);
++ err = bpf_trampoline_link_prog(&link->link, tr);
+ if (err) {
+ bpf_link_cleanup(&link_primer);
+ link = NULL;
+diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
+index 5d8bfb5ef239d..e3bcad5a7c685 100644
+--- a/kernel/bpf/trampoline.c
++++ b/kernel/bpf/trampoline.c
+@@ -168,30 +168,30 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
+ return ret;
+ }
+
+-static struct bpf_tramp_progs *
++static struct bpf_tramp_links *
+ bpf_trampoline_get_progs(const struct bpf_trampoline *tr, int *total, bool *ip_arg)
+ {
+- const struct bpf_prog_aux *aux;
+- struct bpf_tramp_progs *tprogs;
+- struct bpf_prog **progs;
++ struct bpf_tramp_link *link;
++ struct bpf_tramp_links *tlinks;
++ struct bpf_tramp_link **links;
+ int kind;
+
+ *total = 0;
+- tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(*tprogs), GFP_KERNEL);
+- if (!tprogs)
++ tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
++ if (!tlinks)
+ return ERR_PTR(-ENOMEM);
+
+ for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
+- tprogs[kind].nr_progs = tr->progs_cnt[kind];
++ tlinks[kind].nr_links = tr->progs_cnt[kind];
+ *total += tr->progs_cnt[kind];
+- progs = tprogs[kind].progs;
++ links = tlinks[kind].links;
+
+- hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist) {
+- *ip_arg |= aux->prog->call_get_func_ip;
+- *progs++ = aux->prog;
++ hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
++ *ip_arg |= link->link.prog->call_get_func_ip;
++ *links++ = link;
+ }
+ }
+- return tprogs;
++ return tlinks;
+ }
+
+ static void __bpf_tramp_image_put_deferred(struct work_struct *work)
+@@ -330,14 +330,14 @@ out:
+ static int bpf_trampoline_update(struct bpf_trampoline *tr)
+ {
+ struct bpf_tramp_image *im;
+- struct bpf_tramp_progs *tprogs;
++ struct bpf_tramp_links *tlinks;
+ u32 flags = BPF_TRAMP_F_RESTORE_REGS;
+ bool ip_arg = false;
+ int err, total;
+
+- tprogs = bpf_trampoline_get_progs(tr, &total, &ip_arg);
+- if (IS_ERR(tprogs))
+- return PTR_ERR(tprogs);
++ tlinks = bpf_trampoline_get_progs(tr, &total, &ip_arg);
++ if (IS_ERR(tlinks))
++ return PTR_ERR(tlinks);
+
+ if (total == 0) {
+ err = unregister_fentry(tr, tr->cur_image->image);
+@@ -353,15 +353,15 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
+ goto out;
+ }
+
+- if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
+- tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs)
++ if (tlinks[BPF_TRAMP_FEXIT].nr_links ||
++ tlinks[BPF_TRAMP_MODIFY_RETURN].nr_links)
+ flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
+
+ if (ip_arg)
+ flags |= BPF_TRAMP_F_IP_ARG;
+
+ err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE,
+- &tr->func.model, flags, tprogs,
++ &tr->func.model, flags, tlinks,
+ tr->func.addr);
+ if (err < 0)
+ goto out;
+@@ -381,7 +381,7 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
+ tr->cur_image = im;
+ tr->selector++;
+ out:
+- kfree(tprogs);
++ kfree(tlinks);
+ return err;
+ }
+
+@@ -407,13 +407,14 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
+ }
+ }
+
+-int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
++int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+ {
+ enum bpf_tramp_prog_type kind;
++ struct bpf_tramp_link *link_exiting;
+ int err = 0;
+ int cnt = 0, i;
+
+- kind = bpf_attach_type_to_tramp(prog);
++ kind = bpf_attach_type_to_tramp(link->link.prog);
+ mutex_lock(&tr->mutex);
+ if (tr->extension_prog) {
+ /* cannot attach fentry/fexit if extension prog is attached.
+@@ -432,25 +433,33 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+ err = -EBUSY;
+ goto out;
+ }
+- tr->extension_prog = prog;
++ tr->extension_prog = link->link.prog;
+ err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
+- prog->bpf_func);
++ link->link.prog->bpf_func);
+ goto out;
+ }
+- if (cnt >= BPF_MAX_TRAMP_PROGS) {
++ if (cnt >= BPF_MAX_TRAMP_LINKS) {
+ err = -E2BIG;
+ goto out;
+ }
+- if (!hlist_unhashed(&prog->aux->tramp_hlist)) {
++ if (!hlist_unhashed(&link->tramp_hlist)) {
+ /* prog already linked */
+ err = -EBUSY;
+ goto out;
+ }
+- hlist_add_head(&prog->aux->tramp_hlist, &tr->progs_hlist[kind]);
++ hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) {
++ if (link_exiting->link.prog != link->link.prog)
++ continue;
++ /* prog already linked */
++ err = -EBUSY;
++ goto out;
++ }
++
++ hlist_add_head(&link->tramp_hlist, &tr->progs_hlist[kind]);
+ tr->progs_cnt[kind]++;
+ err = bpf_trampoline_update(tr);
+ if (err) {
+- hlist_del_init(&prog->aux->tramp_hlist);
++ hlist_del_init(&link->tramp_hlist);
+ tr->progs_cnt[kind]--;
+ }
+ out:
+@@ -459,12 +468,12 @@ out:
+ }
+
+ /* bpf_trampoline_unlink_prog() should never fail. */
+-int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
++int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
+ {
+ enum bpf_tramp_prog_type kind;
+ int err;
+
+- kind = bpf_attach_type_to_tramp(prog);
++ kind = bpf_attach_type_to_tramp(link->link.prog);
+ mutex_lock(&tr->mutex);
+ if (kind == BPF_TRAMP_REPLACE) {
+ WARN_ON_ONCE(!tr->extension_prog);
+@@ -473,7 +482,7 @@ int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+ tr->extension_prog = NULL;
+ goto out;
+ }
+- hlist_del_init(&prog->aux->tramp_hlist);
++ hlist_del_init(&link->tramp_hlist);
+ tr->progs_cnt[kind]--;
+ err = bpf_trampoline_update(tr);
+ out:
+@@ -641,7 +650,7 @@ void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
+ int __weak
+ arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
+ const struct btf_func_model *m, u32 flags,
+- struct bpf_tramp_progs *tprogs,
++ struct bpf_tramp_links *tlinks,
+ void *orig_call)
+ {
+ return -ENOTSUPP;
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index d04147a5efa55..839340d4cc290 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -245,6 +245,7 @@ struct bpf_call_arg_meta {
+ struct bpf_map *map_ptr;
+ bool raw_mode;
+ bool pkt_access;
++ u8 release_regno;
+ int regno;
+ int access_size;
+ int mem_size;
+@@ -257,6 +258,7 @@ struct bpf_call_arg_meta {
+ struct btf *ret_btf;
+ u32 ret_btf_id;
+ u32 subprogno;
++ struct bpf_map_value_off_desc *kptr_off_desc;
+ };
+
+ struct btf *btf_vmlinux;
+@@ -471,17 +473,6 @@ static bool type_may_be_null(u32 type)
+ return type & PTR_MAYBE_NULL;
+ }
+
+-/* Determine whether the function releases some resources allocated by another
+- * function call. The first reference type argument will be assumed to be
+- * released by release_reference().
+- */
+-static bool is_release_function(enum bpf_func_id func_id)
+-{
+- return func_id == BPF_FUNC_sk_release ||
+- func_id == BPF_FUNC_ringbuf_submit ||
+- func_id == BPF_FUNC_ringbuf_discard;
+-}
+-
+ static bool may_be_acquire_function(enum bpf_func_id func_id)
+ {
+ return func_id == BPF_FUNC_sk_lookup_tcp ||
+@@ -499,7 +490,8 @@ static bool is_acquire_function(enum bpf_func_id func_id,
+ if (func_id == BPF_FUNC_sk_lookup_tcp ||
+ func_id == BPF_FUNC_sk_lookup_udp ||
+ func_id == BPF_FUNC_skc_lookup_tcp ||
+- func_id == BPF_FUNC_ringbuf_reserve)
++ func_id == BPF_FUNC_ringbuf_reserve ||
++ func_id == BPF_FUNC_kptr_xchg)
+ return true;
+
+ if (func_id == BPF_FUNC_map_lookup_elem &&
+@@ -3210,7 +3202,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
+ return 0;
+ }
+
+-enum stack_access_src {
++enum bpf_access_src {
+ ACCESS_DIRECT = 1, /* the access is performed by an instruction */
+ ACCESS_HELPER = 2, /* the access is performed by a helper */
+ };
+@@ -3218,7 +3210,7 @@ enum stack_access_src {
+ static int check_stack_range_initialized(struct bpf_verifier_env *env,
+ int regno, int off, int access_size,
+ bool zero_size_allowed,
+- enum stack_access_src type,
++ enum bpf_access_src type,
+ struct bpf_call_arg_meta *meta);
+
+ static struct bpf_reg_state *reg_state(struct bpf_verifier_env *env, int regno)
+@@ -3468,9 +3460,159 @@ static int check_mem_region_access(struct bpf_verifier_env *env, u32 regno,
+ return 0;
+ }
+
++static int __check_ptr_off_reg(struct bpf_verifier_env *env,
++ const struct bpf_reg_state *reg, int regno,
++ bool fixed_off_ok)
++{
++ /* Access to this pointer-typed register or passing it to a helper
++ * is only allowed in its original, unmodified form.
++ */
++
++ if (reg->off < 0) {
++ verbose(env, "negative offset %s ptr R%d off=%d disallowed\n",
++ reg_type_str(env, reg->type), regno, reg->off);
++ return -EACCES;
++ }
++
++ if (!fixed_off_ok && reg->off) {
++ verbose(env, "dereference of modified %s ptr R%d off=%d disallowed\n",
++ reg_type_str(env, reg->type), regno, reg->off);
++ return -EACCES;
++ }
++
++ if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
++ char tn_buf[48];
++
++ tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
++ verbose(env, "variable %s access var_off=%s disallowed\n",
++ reg_type_str(env, reg->type), tn_buf);
++ return -EACCES;
++ }
++
++ return 0;
++}
++
++int check_ptr_off_reg(struct bpf_verifier_env *env,
++ const struct bpf_reg_state *reg, int regno)
++{
++ return __check_ptr_off_reg(env, reg, regno, false);
++}
++
++static int map_kptr_match_type(struct bpf_verifier_env *env,
++ struct bpf_map_value_off_desc *off_desc,
++ struct bpf_reg_state *reg, u32 regno)
++{
++ const char *targ_name = kernel_type_name(off_desc->kptr.btf, off_desc->kptr.btf_id);
++ const char *reg_name = "";
++
++ if (base_type(reg->type) != PTR_TO_BTF_ID || type_flag(reg->type) != PTR_MAYBE_NULL)
++ goto bad_type;
++
++ if (!btf_is_kernel(reg->btf)) {
++ verbose(env, "R%d must point to kernel BTF\n", regno);
++ return -EINVAL;
++ }
++ /* We need to verify reg->type and reg->btf, before accessing reg->btf */
++ reg_name = kernel_type_name(reg->btf, reg->btf_id);
++
++ /* For ref_ptr case, release function check should ensure we get one
++ * referenced PTR_TO_BTF_ID, and that its fixed offset is 0. For the
++ * normal store of unreferenced kptr, we must ensure var_off is zero.
++ * Since ref_ptr cannot be accessed directly by BPF insns, checks for
++ * reg->off and reg->ref_obj_id are not needed here.
++ */
++ if (__check_ptr_off_reg(env, reg, regno, true))
++ return -EACCES;
++
++ /* A full type match is needed, as BTF can be vmlinux or module BTF, and
++ * we also need to take into account the reg->off.
++ *
++ * We want to support cases like:
++ *
++ * struct foo {
++ * struct bar br;
++ * struct baz bz;
++ * };
++ *
++ * struct foo *v;
++ * v = func(); // PTR_TO_BTF_ID
++ * val->foo = v; // reg->off is zero, btf and btf_id match type
++ * val->bar = &v->br; // reg->off is still zero, but we need to retry with
++ * // first member type of struct after comparison fails
++ * val->baz = &v->bz; // reg->off is non-zero, so struct needs to be walked
++ * // to match type
++ *
++ * In the kptr_ref case, check_func_arg_reg_off already ensures reg->off
++ * is zero.
++ */
++ if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off,
++ off_desc->kptr.btf, off_desc->kptr.btf_id))
++ goto bad_type;
++ return 0;
++bad_type:
++ verbose(env, "invalid kptr access, R%d type=%s%s ", regno,
++ reg_type_str(env, reg->type), reg_name);
++ verbose(env, "expected=%s%s\n", reg_type_str(env, PTR_TO_BTF_ID), targ_name);
++ return -EINVAL;
++}
++
++static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno,
++ int value_regno, int insn_idx,
++ struct bpf_map_value_off_desc *off_desc)
++{
++ struct bpf_insn *insn = &env->prog->insnsi[insn_idx];
++ int class = BPF_CLASS(insn->code);
++ struct bpf_reg_state *val_reg;
++
++ /* Things we already checked for in check_map_access and caller:
++ * - Reject cases where variable offset may touch kptr
++ * - size of access (must be BPF_DW)
++ * - tnum_is_const(reg->var_off)
++ * - off_desc->offset == off + reg->var_off.value
++ */
++ /* Only BPF_[LDX,STX,ST] | BPF_MEM | BPF_DW is supported */
++ if (BPF_MODE(insn->code) != BPF_MEM) {
++ verbose(env, "kptr in map can only be accessed using BPF_MEM instruction mode\n");
++ return -EACCES;
++ }
++
++ /* We cannot directly access kptr_ref */
++ if (off_desc->type == BPF_KPTR_REF) {
++ verbose(env, "accessing referenced kptr disallowed\n");
++ return -EACCES;
++ }
++
++ if (class == BPF_LDX) {
++ val_reg = reg_state(env, value_regno);
++ /* We can simply mark the value_regno receiving the pointer
++ * value from map as PTR_TO_BTF_ID, with the correct type.
++ */
++ mark_btf_ld_reg(env, cur_regs(env), value_regno, PTR_TO_BTF_ID, off_desc->kptr.btf,
++ off_desc->kptr.btf_id, PTR_MAYBE_NULL);
++ /* For mark_ptr_or_null_reg */
++ val_reg->id = ++env->id_gen;
++ } else if (class == BPF_STX) {
++ val_reg = reg_state(env, value_regno);
++ if (!register_is_null(val_reg) &&
++ map_kptr_match_type(env, off_desc, val_reg, value_regno))
++ return -EACCES;
++ } else if (class == BPF_ST) {
++ if (insn->imm) {
++ verbose(env, "BPF_ST imm must be 0 when storing to kptr at off=%u\n",
++ off_desc->offset);
++ return -EACCES;
++ }
++ } else {
++ verbose(env, "kptr in map can only be accessed using BPF_LDX/BPF_STX/BPF_ST\n");
++ return -EACCES;
++ }
++ return 0;
++}
++
+ /* check read/write into a map element with possible variable offset */
+ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
+- int off, int size, bool zero_size_allowed)
++ int off, int size, bool zero_size_allowed,
++ enum bpf_access_src src)
+ {
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
+@@ -3506,6 +3648,36 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
+ return -EACCES;
+ }
+ }
++ if (map_value_has_kptrs(map)) {
++ struct bpf_map_value_off *tab = map->kptr_off_tab;
++ int i;
++
++ for (i = 0; i < tab->nr_off; i++) {
++ u32 p = tab->off[i].offset;
++
++ if (reg->smin_value + off < p + sizeof(u64) &&
++ p < reg->umax_value + off + size) {
++ if (src != ACCESS_DIRECT) {
++ verbose(env, "kptr cannot be accessed indirectly by helper\n");
++ return -EACCES;
++ }
++ if (!tnum_is_const(reg->var_off)) {
++ verbose(env, "kptr access cannot have variable offset\n");
++ return -EACCES;
++ }
++ if (p != off + reg->var_off.value) {
++ verbose(env, "kptr access misaligned expected=%u off=%llu\n",
++ p, off + reg->var_off.value);
++ return -EACCES;
++ }
++ if (size != bpf_size_to_bytes(BPF_DW)) {
++ verbose(env, "kptr access size must be BPF_DW\n");
++ return -EACCES;
++ }
++ break;
++ }
++ }
++ }
+ return err;
+ }
+
+@@ -3979,44 +4151,6 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
+ }
+ #endif
+
+-static int __check_ptr_off_reg(struct bpf_verifier_env *env,
+- const struct bpf_reg_state *reg, int regno,
+- bool fixed_off_ok)
+-{
+- /* Access to this pointer-typed register or passing it to a helper
+- * is only allowed in its original, unmodified form.
+- */
+-
+- if (reg->off < 0) {
+- verbose(env, "negative offset %s ptr R%d off=%d disallowed\n",
+- reg_type_str(env, reg->type), regno, reg->off);
+- return -EACCES;
+- }
+-
+- if (!fixed_off_ok && reg->off) {
+- verbose(env, "dereference of modified %s ptr R%d off=%d disallowed\n",
+- reg_type_str(env, reg->type), regno, reg->off);
+- return -EACCES;
+- }
+-
+- if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
+- char tn_buf[48];
+-
+- tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
+- verbose(env, "variable %s access var_off=%s disallowed\n",
+- reg_type_str(env, reg->type), tn_buf);
+- return -EACCES;
+- }
+-
+- return 0;
+-}
+-
+-int check_ptr_off_reg(struct bpf_verifier_env *env,
+- const struct bpf_reg_state *reg, int regno)
+-{
+- return __check_ptr_off_reg(env, reg, regno, false);
+-}
+-
+ static int __check_buffer_access(struct bpf_verifier_env *env,
+ const char *buf_info,
+ const struct bpf_reg_state *reg,
+@@ -4315,7 +4449,7 @@ static int check_stack_slot_within_bounds(int off,
+ static int check_stack_access_within_bounds(
+ struct bpf_verifier_env *env,
+ int regno, int off, int access_size,
+- enum stack_access_src src, enum bpf_access_type type)
++ enum bpf_access_src src, enum bpf_access_type type)
+ {
+ struct bpf_reg_state *regs = cur_regs(env);
+ struct bpf_reg_state *reg = regs + regno;
+@@ -4411,6 +4545,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
+ if (value_regno >= 0)
+ mark_reg_unknown(env, regs, value_regno);
+ } else if (reg->type == PTR_TO_MAP_VALUE) {
++ struct bpf_map_value_off_desc *kptr_off_desc = NULL;
++
+ if (t == BPF_WRITE && value_regno >= 0 &&
+ is_pointer_value(env, value_regno)) {
+ verbose(env, "R%d leaks addr into map\n", value_regno);
+@@ -4419,8 +4555,16 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
+ err = check_map_access_type(env, regno, off, size, t);
+ if (err)
+ return err;
+- err = check_map_access(env, regno, off, size, false);
+- if (!err && t == BPF_READ && value_regno >= 0) {
++ err = check_map_access(env, regno, off, size, false, ACCESS_DIRECT);
++ if (err)
++ return err;
++ if (tnum_is_const(reg->var_off))
++ kptr_off_desc = bpf_map_kptr_off_contains(reg->map_ptr,
++ off + reg->var_off.value);
++ if (kptr_off_desc) {
++ err = check_map_kptr_access(env, regno, value_regno, insn_idx,
++ kptr_off_desc);
++ } else if (t == BPF_READ && value_regno >= 0) {
+ struct bpf_map *map = reg->map_ptr;
+
+ /* if map is read-only, track its contents as scalars */
+@@ -4723,7 +4867,7 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
+ static int check_stack_range_initialized(
+ struct bpf_verifier_env *env, int regno, int off,
+ int access_size, bool zero_size_allowed,
+- enum stack_access_src type, struct bpf_call_arg_meta *meta)
++ enum bpf_access_src type, struct bpf_call_arg_meta *meta)
+ {
+ struct bpf_reg_state *reg = reg_state(env, regno);
+ struct bpf_func_state *state = func(env, reg);
+@@ -4873,7 +5017,7 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
+ BPF_READ))
+ return -EACCES;
+ return check_map_access(env, regno, reg->off, access_size,
+- zero_size_allowed);
++ zero_size_allowed, ACCESS_HELPER);
+ case PTR_TO_MEM:
+ if (type_is_rdonly_mem(reg->type)) {
+ if (meta && meta->raw_mode) {
+@@ -5162,6 +5306,53 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno,
+ return 0;
+ }
+
++static int process_kptr_func(struct bpf_verifier_env *env, int regno,
++ struct bpf_call_arg_meta *meta)
++{
++ struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno];
++ struct bpf_map_value_off_desc *off_desc;
++ struct bpf_map *map_ptr = reg->map_ptr;
++ u32 kptr_off;
++ int ret;
++
++ if (!tnum_is_const(reg->var_off)) {
++ verbose(env,
++ "R%d doesn't have constant offset. kptr has to be at the constant offset\n",
++ regno);
++ return -EINVAL;
++ }
++ if (!map_ptr->btf) {
++ verbose(env, "map '%s' has to have BTF in order to use bpf_kptr_xchg\n",
++ map_ptr->name);
++ return -EINVAL;
++ }
++ if (!map_value_has_kptrs(map_ptr)) {
++ ret = PTR_ERR_OR_ZERO(map_ptr->kptr_off_tab);
++ if (ret == -E2BIG)
++ verbose(env, "map '%s' has more than %d kptr\n", map_ptr->name,
++ BPF_MAP_VALUE_OFF_MAX);
++ else if (ret == -EEXIST)
++ verbose(env, "map '%s' has repeating kptr BTF tags\n", map_ptr->name);
++ else
++ verbose(env, "map '%s' has no valid kptr\n", map_ptr->name);
++ return -EINVAL;
++ }
++
++ meta->map_ptr = map_ptr;
++ kptr_off = reg->off + reg->var_off.value;
++ off_desc = bpf_map_kptr_off_contains(map_ptr, kptr_off);
++ if (!off_desc) {
++ verbose(env, "off=%d doesn't point to kptr\n", kptr_off);
++ return -EACCES;
++ }
++ if (off_desc->type != BPF_KPTR_REF) {
++ verbose(env, "off=%d kptr isn't referenced kptr\n", kptr_off);
++ return -EACCES;
++ }
++ meta->kptr_off_desc = off_desc;
++ return 0;
++}
++
+ static bool arg_type_is_mem_ptr(enum bpf_arg_type type)
+ {
+ return base_type(type) == ARG_PTR_TO_MEM ||
+@@ -5185,6 +5376,11 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
+ type == ARG_PTR_TO_LONG;
+ }
+
++static bool arg_type_is_release(enum bpf_arg_type type)
++{
++ return type & OBJ_RELEASE;
++}
++
+ static int int_ptr_type_to_size(enum bpf_arg_type type)
+ {
+ if (type == ARG_PTR_TO_INT)
+@@ -5297,6 +5493,7 @@ static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } };
+ static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } };
+ static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } };
+ static const struct bpf_reg_types timer_types = { .types = { PTR_TO_MAP_VALUE } };
++static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } };
+
+ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
+ [ARG_PTR_TO_MAP_KEY] = &map_key_value_types,
+@@ -5324,11 +5521,13 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
+ [ARG_PTR_TO_STACK] = &stack_ptr_types,
+ [ARG_PTR_TO_CONST_STR] = &const_str_ptr_types,
+ [ARG_PTR_TO_TIMER] = &timer_types,
++ [ARG_PTR_TO_KPTR] = &kptr_types,
+ };
+
+ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
+ enum bpf_arg_type arg_type,
+- const u32 *arg_btf_id)
++ const u32 *arg_btf_id,
++ struct bpf_call_arg_meta *meta)
+ {
+ struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno];
+ enum bpf_reg_type expected, type = reg->type;
+@@ -5381,8 +5580,11 @@ found:
+ arg_btf_id = compatible->btf_id;
+ }
+
+- if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off,
+- btf_vmlinux, *arg_btf_id)) {
++ if (meta->func_id == BPF_FUNC_kptr_xchg) {
++ if (map_kptr_match_type(env, meta->kptr_off_desc, reg, regno))
++ return -EACCES;
++ } else if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off,
++ btf_vmlinux, *arg_btf_id)) {
+ verbose(env, "R%d is of type %s but %s is expected\n",
+ regno, kernel_type_name(reg->btf, reg->btf_id),
+ kernel_type_name(btf_vmlinux, *arg_btf_id));
+@@ -5395,11 +5597,10 @@ found:
+
+ int check_func_arg_reg_off(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno,
+- enum bpf_arg_type arg_type,
+- bool is_release_func)
++ enum bpf_arg_type arg_type)
+ {
+- bool fixed_off_ok = false, release_reg;
+ enum bpf_reg_type type = reg->type;
++ bool fixed_off_ok = false;
+
+ switch ((u32)type) {
+ case SCALAR_VALUE:
+@@ -5417,7 +5618,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
+ /* Some of the argument types nevertheless require a
+ * zero register offset.
+ */
+- if (arg_type != ARG_PTR_TO_ALLOC_MEM)
++ if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
+ return 0;
+ break;
+ /* All the rest must be rejected, except PTR_TO_BTF_ID which allows
+@@ -5425,19 +5626,17 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
+ */
+ case PTR_TO_BTF_ID:
+ /* When referenced PTR_TO_BTF_ID is passed to release function,
+- * it's fixed offset must be 0. We rely on the property that
+- * only one referenced register can be passed to BPF helpers and
+- * kfuncs. In the other cases, fixed offset can be non-zero.
++ * it's fixed offset must be 0. In the other cases, fixed offset
++ * can be non-zero.
+ */
+- release_reg = is_release_func && reg->ref_obj_id;
+- if (release_reg && reg->off) {
++ if (arg_type_is_release(arg_type) && reg->off) {
+ verbose(env, "R%d must have zero offset when passed to release func\n",
+ regno);
+ return -EINVAL;
+ }
+- /* For release_reg == true, fixed_off_ok must be false, but we
+- * already checked and rejected reg->off != 0 above, so set to
+- * true to allow fixed offset for all other cases.
++ /* For arg is release pointer, fixed_off_ok must be false, but
++ * we already checked and rejected reg->off != 0 above, so set
++ * to true to allow fixed offset for all other cases.
+ */
+ fixed_off_ok = true;
+ break;
+@@ -5492,18 +5691,28 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
+ */
+ goto skip_type_check;
+
+- err = check_reg_type(env, regno, arg_type, fn->arg_btf_id[arg]);
++ err = check_reg_type(env, regno, arg_type, fn->arg_btf_id[arg], meta);
+ if (err)
+ return err;
+
+- err = check_func_arg_reg_off(env, reg, regno, arg_type, is_release_function(meta->func_id));
++ err = check_func_arg_reg_off(env, reg, regno, arg_type);
+ if (err)
+ return err;
+
+ skip_type_check:
+- /* check_func_arg_reg_off relies on only one referenced register being
+- * allowed for BPF helpers.
+- */
++ if (arg_type_is_release(arg_type)) {
++ if (!reg->ref_obj_id && !register_is_null(reg)) {
++ verbose(env, "R%d must be referenced when passed to release function\n",
++ regno);
++ return -EINVAL;
++ }
++ if (meta->release_regno) {
++ verbose(env, "verifier internal error: more than one release argument\n");
++ return -EFAULT;
++ }
++ meta->release_regno = regno;
++ }
++
+ if (reg->ref_obj_id) {
+ if (meta->ref_obj_id) {
+ verbose(env, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
+@@ -5641,7 +5850,8 @@ skip_type_check:
+ }
+
+ err = check_map_access(env, regno, reg->off,
+- map->value_size - reg->off, false);
++ map->value_size - reg->off, false,
++ ACCESS_HELPER);
+ if (err)
+ return err;
+
+@@ -5657,6 +5867,9 @@ skip_type_check:
+ verbose(env, "string is not zero-terminated\n");
+ return -EINVAL;
+ }
++ } else if (arg_type == ARG_PTR_TO_KPTR) {
++ if (process_kptr_func(env, regno, meta))
++ return -EACCES;
+ }
+
+ return err;
+@@ -5696,7 +5909,8 @@ static bool may_update_sockmap(struct bpf_verifier_env *env, int func_id)
+
+ static bool allow_tail_call_in_subprogs(struct bpf_verifier_env *env)
+ {
+- return env->prog->jit_requested && IS_ENABLED(CONFIG_X86_64);
++ return env->prog->jit_requested &&
++ bpf_jit_supports_subprog_tailcalls();
+ }
+
+ static int check_map_func_compatibility(struct bpf_verifier_env *env,
+@@ -5999,17 +6213,18 @@ static bool check_btf_id_ok(const struct bpf_func_proto *fn)
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
+- if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
++ if (base_type(fn->arg_type[i]) == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
+ return false;
+
+- if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
++ if (base_type(fn->arg_type[i]) != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
+ return false;
+ }
+
+ return true;
+ }
+
+-static int check_func_proto(const struct bpf_func_proto *fn, int func_id)
++static int check_func_proto(const struct bpf_func_proto *fn, int func_id,
++ struct bpf_call_arg_meta *meta)
+ {
+ return check_raw_mode_ok(fn) &&
+ check_arg_pair_ok(fn) &&
+@@ -6691,7 +6906,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
+ memset(&meta, 0, sizeof(meta));
+ meta.pkt_access = fn->pkt_access;
+
+- err = check_func_proto(fn, func_id);
++ err = check_func_proto(fn, func_id, &meta);
+ if (err) {
+ verbose(env, "kernel subsystem misconfigured func %s#%d\n",
+ func_id_name(func_id), func_id);
+@@ -6724,8 +6939,17 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
+ return err;
+ }
+
+- if (is_release_function(func_id)) {
+- err = release_reference(env, meta.ref_obj_id);
++ regs = cur_regs(env);
++
++ if (meta.release_regno) {
++ err = -EINVAL;
++ if (meta.ref_obj_id)
++ err = release_reference(env, meta.ref_obj_id);
++ /* meta.ref_obj_id can only be 0 if register that is meant to be
++ * released is NULL, which must be > R0.
++ */
++ else if (register_is_null(®s[meta.release_regno]))
++ err = 0;
+ if (err) {
+ verbose(env, "func %s#%d reference has not been acquired before\n",
+ func_id_name(func_id), func_id);
+@@ -6733,8 +6957,6 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
+ }
+ }
+
+- regs = cur_regs(env);
+-
+ switch (func_id) {
+ case BPF_FUNC_tail_call:
+ err = check_reference_leak(env);
+@@ -6858,21 +7080,25 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
+ regs[BPF_REG_0].btf_id = meta.ret_btf_id;
+ }
+ } else if (base_type(ret_type) == RET_PTR_TO_BTF_ID) {
++ struct btf *ret_btf;
+ int ret_btf_id;
+
+ mark_reg_known_zero(env, regs, BPF_REG_0);
+ regs[BPF_REG_0].type = PTR_TO_BTF_ID | ret_flag;
+- ret_btf_id = *fn->ret_btf_id;
++ if (func_id == BPF_FUNC_kptr_xchg) {
++ ret_btf = meta.kptr_off_desc->kptr.btf;
++ ret_btf_id = meta.kptr_off_desc->kptr.btf_id;
++ } else {
++ ret_btf = btf_vmlinux;
++ ret_btf_id = *fn->ret_btf_id;
++ }
+ if (ret_btf_id == 0) {
+ verbose(env, "invalid return type %u of func %s#%d\n",
+ base_type(ret_type), func_id_name(func_id),
+ func_id);
+ return -EINVAL;
+ }
+- /* current BPF helper definitions are only coming from
+- * built-in code with type IDs from vmlinux BTF
+- */
+- regs[BPF_REG_0].btf = btf_vmlinux;
++ regs[BPF_REG_0].btf = ret_btf;
+ regs[BPF_REG_0].btf_id = ret_btf_id;
+ } else {
+ verbose(env, "unknown return type %u of func %s#%d\n",
+@@ -7459,7 +7685,7 @@ static int sanitize_check_bounds(struct bpf_verifier_env *env,
+ return -EACCES;
+ break;
+ case PTR_TO_MAP_VALUE:
+- if (check_map_access(env, dst, dst_reg->off, 1, false)) {
++ if (check_map_access(env, dst, dst_reg->off, 1, false, ACCESS_HELPER)) {
+ verbose(env, "R%d pointer arithmetic of map value goes out of range, "
+ "prohibited for !root\n", dst);
+ return -EACCES;
+@@ -13015,6 +13241,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
+ /* Below members will be freed only at prog->aux */
+ func[i]->aux->btf = prog->aux->btf;
+ func[i]->aux->func_info = prog->aux->func_info;
++ func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
+ func[i]->aux->poke_tab = prog->aux->poke_tab;
+ func[i]->aux->size_poke_tab = prog->aux->size_poke_tab;
+
+@@ -13027,9 +13254,6 @@ static int jit_subprogs(struct bpf_verifier_env *env)
+ poke->aux = func[i]->aux;
+ }
+
+- /* Use bpf_prog_F_tag to indicate functions in stack traces.
+- * Long term would need debug info to populate names
+- */
+ func[i]->aux->name[0] = 'F';
+ func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
+ func[i]->jit_requested = 1;
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index 71a418858a5e0..58aadfda9b8b3 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -2239,7 +2239,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
+ goto out_unlock;
+
+ cgroup_taskset_for_each(task, css, tset) {
+- ret = task_can_attach(task, cs->cpus_allowed);
++ ret = task_can_attach(task, cs->effective_cpus);
+ if (ret)
+ goto out_unlock;
+ ret = security_task_setscheduler(task);
+diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
+index 73a41cec9e386..90e5f5c92fdc9 100644
+--- a/kernel/dma/swiotlb.c
++++ b/kernel/dma/swiotlb.c
+@@ -588,7 +588,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
+ int index;
+ phys_addr_t tlb_addr;
+
+- if (!mem)
++ if (!mem || !mem->nslabs)
+ panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
+
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
+diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
+index 10929eda98258..fc760d064a653 100644
+--- a/kernel/irq/Kconfig
++++ b/kernel/irq/Kconfig
+@@ -82,6 +82,7 @@ config IRQ_FASTEOI_HIERARCHY_HANDLERS
+ # Generic IRQ IPI support
+ config GENERIC_IRQ_IPI
+ bool
++ depends on SMP
+ select IRQ_DOMAIN_HIERARCHY
+
+ # Generic MSI interrupt support
+diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
+index 54af0deb239b8..eb921485930fc 100644
+--- a/kernel/irq/chip.c
++++ b/kernel/irq/chip.c
+@@ -1513,7 +1513,8 @@ int irq_chip_request_resources_parent(struct irq_data *data)
+ if (data->chip->irq_request_resources)
+ return data->chip->irq_request_resources(data);
+
+- return -ENOSYS;
++ /* no error on missing optional irq_chip::irq_request_resources */
++ return 0;
+ }
+ EXPORT_SYMBOL_GPL(irq_chip_request_resources_parent);
+
+diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
+index d5ce965105493..481abb885d61f 100644
+--- a/kernel/irq/irqdomain.c
++++ b/kernel/irq/irqdomain.c
+@@ -910,6 +910,8 @@ struct irq_desc *__irq_resolve_mapping(struct irq_domain *domain,
+ data = irq_domain_get_irq_data(domain, hwirq);
+ if (data && data->hwirq == hwirq)
+ desc = irq_data_to_desc(data);
++ if (irq && desc)
++ *irq = hwirq;
+ }
+
+ return desc;
+diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
+index bb0fb63f563cf..ad005cd184a47 100644
+--- a/kernel/kexec_file.c
++++ b/kernel/kexec_file.c
+@@ -62,14 +62,7 @@ int kexec_image_probe_default(struct kimage *image, void *buf,
+ return ret;
+ }
+
+-/* Architectures can provide this probe function */
+-int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
+- unsigned long buf_len)
+-{
+- return kexec_image_probe_default(image, buf, buf_len);
+-}
+-
+-static void *kexec_image_load_default(struct kimage *image)
++void *kexec_image_load_default(struct kimage *image)
+ {
+ if (!image->fops || !image->fops->load)
+ return ERR_PTR(-ENOEXEC);
+@@ -80,11 +73,6 @@ static void *kexec_image_load_default(struct kimage *image)
+ image->cmdline_buf_len);
+ }
+
+-void * __weak arch_kexec_kernel_image_load(struct kimage *image)
+-{
+- return kexec_image_load_default(image);
+-}
+-
+ int kexec_image_post_load_cleanup_default(struct kimage *image)
+ {
+ if (!image->fops || !image->fops->cleanup)
+@@ -93,30 +81,6 @@ int kexec_image_post_load_cleanup_default(struct kimage *image)
+ return image->fops->cleanup(image->image_loader_data);
+ }
+
+-int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
+-{
+- return kexec_image_post_load_cleanup_default(image);
+-}
+-
+-#ifdef CONFIG_KEXEC_SIG
+-static int kexec_image_verify_sig_default(struct kimage *image, void *buf,
+- unsigned long buf_len)
+-{
+- if (!image->fops || !image->fops->verify_sig) {
+- pr_debug("kernel loader does not support signature verification.\n");
+- return -EKEYREJECTED;
+- }
+-
+- return image->fops->verify_sig(buf, buf_len);
+-}
+-
+-int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
+- unsigned long buf_len)
+-{
+- return kexec_image_verify_sig_default(image, buf, buf_len);
+-}
+-#endif
+-
+ /*
+ * Free up memory used by kernel, initrd, and command line. This is temporary
+ * memory allocation which is not needed any more after these buffers have
+@@ -159,13 +123,24 @@ void kimage_file_post_load_cleanup(struct kimage *image)
+ }
+
+ #ifdef CONFIG_KEXEC_SIG
++static int kexec_image_verify_sig(struct kimage *image, void *buf,
++ unsigned long buf_len)
++{
++ if (!image->fops || !image->fops->verify_sig) {
++ pr_debug("kernel loader does not support signature verification.\n");
++ return -EKEYREJECTED;
++ }
++
++ return image->fops->verify_sig(buf, buf_len);
++}
++
+ static int
+ kimage_validate_signature(struct kimage *image)
+ {
+ int ret;
+
+- ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
+- image->kernel_buf_len);
++ ret = kexec_image_verify_sig(image, image->kernel_buf,
++ image->kernel_buf_len);
+ if (ret) {
+
+ if (sig_enforce) {
+@@ -621,19 +596,6 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
+ return ret == 1 ? 0 : -EADDRNOTAVAIL;
+ }
+
+-/**
+- * arch_kexec_locate_mem_hole - Find free memory to place the segments.
+- * @kbuf: Parameters for the memory search.
+- *
+- * On success, kbuf->mem will have the start address of the memory region found.
+- *
+- * Return: 0 on success, negative errno on error.
+- */
+-int __weak arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
+-{
+- return kexec_locate_mem_hole(kbuf);
+-}
+-
+ /**
+ * kexec_add_buffer - place a buffer in a kexec segment
+ * @kbuf: Buffer contents and memory parameters.
+diff --git a/kernel/kprobes.c b/kernel/kprobes.c
+index f214f8c088ede..80697e5e03e49 100644
+--- a/kernel/kprobes.c
++++ b/kernel/kprobes.c
+@@ -1560,7 +1560,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
+ preempt_disable();
+
+ /* Ensure it is not in reserved area nor out of text */
+- if (!kernel_text_address((unsigned long) p->addr) ||
++ if (!(core_kernel_text((unsigned long) p->addr) ||
++ is_module_text_address((unsigned long) p->addr)) ||
+ within_kprobe_blacklist((unsigned long) p->addr) ||
+ jump_label_text_reserved(p->addr, p->addr) ||
+ static_call_text_reserved(p->addr, p->addr) ||
+diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
+index c06cab6546ed6..e45ec82b42e67 100644
+--- a/kernel/locking/lockdep.c
++++ b/kernel/locking/lockdep.c
+@@ -5214,9 +5214,10 @@ __lock_set_class(struct lockdep_map *lock, const char *name,
+ return 0;
+ }
+
+- lockdep_init_map_waits(lock, name, key, 0,
+- lock->wait_type_inner,
+- lock->wait_type_outer);
++ lockdep_init_map_type(lock, name, key, 0,
++ lock->wait_type_inner,
++ lock->wait_type_outer,
++ lock->lock_type);
+ class = register_lock_class(lock, subclass, 0);
+ hlock->class_idx = class - lock_classes;
+
+diff --git a/kernel/power/user.c b/kernel/power/user.c
+index ad241b4ff64c5..d43c2aa583b26 100644
+--- a/kernel/power/user.c
++++ b/kernel/power/user.c
+@@ -26,6 +26,7 @@
+
+ #include "power.h"
+
++static bool need_wait;
+
+ static struct snapshot_data {
+ struct snapshot_handle handle;
+@@ -78,7 +79,7 @@ static int snapshot_open(struct inode *inode, struct file *filp)
+ * Resuming. We may need to wait for the image device to
+ * appear.
+ */
+- wait_for_device_probe();
++ need_wait = true;
+
+ data->swap = -1;
+ data->mode = O_WRONLY;
+@@ -168,6 +169,11 @@ static ssize_t snapshot_write(struct file *filp, const char __user *buf,
+ ssize_t res;
+ loff_t pg_offp = *offp & ~PAGE_MASK;
+
++ if (need_wait) {
++ wait_for_device_probe();
++ need_wait = false;
++ }
++
+ lock_system_sleep();
+
+ data = filp->private_data;
+@@ -244,6 +250,11 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
+ loff_t size;
+ sector_t offset;
+
++ if (need_wait) {
++ wait_for_device_probe();
++ need_wait = false;
++ }
++
+ if (_IOC_TYPE(cmd) != SNAPSHOT_IOC_MAGIC)
+ return -ENOTTY;
+ if (_IOC_NR(cmd) > SNAPSHOT_IOC_MAXNR)
+diff --git a/kernel/profile.c b/kernel/profile.c
+index 37640a0bd8a3c..ae82ddfc6a684 100644
+--- a/kernel/profile.c
++++ b/kernel/profile.c
+@@ -109,6 +109,13 @@ int __ref profile_init(void)
+
+ /* only text is profiled */
+ prof_len = (_etext - _stext) >> prof_shift;
++
++ if (!prof_len) {
++ pr_warn("profiling shift: %u too large\n", prof_shift);
++ prof_on = 0;
++ return -EINVAL;
++ }
++
+ buffer_bytes = prof_len*sizeof(atomic_t);
+
+ if (!alloc_cpumask_var(&prof_cpu_mask, GFP_KERNEL))
+diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
+index 55d049c39608f..7c3a7f8c828b9 100644
+--- a/kernel/rcu/rcutorture.c
++++ b/kernel/rcu/rcutorture.c
+@@ -2030,6 +2030,19 @@ static int rcutorture_booster_init(unsigned int cpu)
+ if (boost_tasks[cpu] != NULL)
+ return 0; /* Already created, nothing more to do. */
+
++ // Testing RCU priority boosting requires rcutorture do
++ // some serious abuse. Counter this by running ksoftirqd
++ // at higher priority.
++ if (IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)) {
++ struct sched_param sp;
++ struct task_struct *t;
++
++ t = per_cpu(ksoftirqd, cpu);
++ WARN_ON_ONCE(!t);
++ sp.sched_priority = 2;
++ sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
++ }
++
+ /* Don't allow time recalculation while creating a new task. */
+ mutex_lock(&boost_mutex);
+ rcu_torture_disable_rt_throttle();
+@@ -3281,21 +3294,6 @@ rcu_torture_init(void)
+ rcutor_hp = firsterr;
+ if (torture_init_error(firsterr))
+ goto unwind;
+-
+- // Testing RCU priority boosting requires rcutorture do
+- // some serious abuse. Counter this by running ksoftirqd
+- // at higher priority.
+- if (IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)) {
+- for_each_online_cpu(cpu) {
+- struct sched_param sp;
+- struct task_struct *t;
+-
+- t = per_cpu(ksoftirqd, cpu);
+- WARN_ON_ONCE(!t);
+- sp.sched_priority = 2;
+- sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+- }
+- }
+ }
+ shutdown_jiffies = jiffies + shutdown_secs * HZ;
+ firsterr = torture_shutdown_init(shutdown_secs, rcu_torture_cleanup);
+diff --git a/kernel/sched/core.c b/kernel/sched/core.c
+index dd11daa7a84b1..9671796a11cc9 100644
+--- a/kernel/sched/core.c
++++ b/kernel/sched/core.c
+@@ -88,7 +88,7 @@
+ #include "stats.h"
+
+ #include "../workqueue_internal.h"
+-#include "../../fs/io-wq.h"
++#include "../../io_uring/io-wq.h"
+ #include "../smpboot.h"
+
+ /*
+@@ -3811,7 +3811,7 @@ bool cpus_share_cache(int this_cpu, int that_cpu)
+ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
+ }
+
+-static inline bool ttwu_queue_cond(int cpu, int wake_flags)
++static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
+ {
+ /*
+ * Do not complicate things with the async wake_list while the CPU is
+@@ -3820,6 +3820,10 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
+ if (!cpu_active(cpu))
+ return false;
+
++ /* Ensure the task will still be allowed to run on the CPU. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
+ /*
+ * If the CPU does not share cache, then queue the task on the
+ * remote rqs wakelist to avoid accessing remote data.
+@@ -3827,13 +3831,21 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
+ if (!cpus_share_cache(smp_processor_id(), cpu))
+ return true;
+
++ if (cpu == smp_processor_id())
++ return false;
++
+ /*
+- * If the task is descheduling and the only running task on the
+- * CPU then use the wakelist to offload the task activation to
+- * the soon-to-be-idle CPU as the current CPU is likely busy.
+- * nr_running is checked to avoid unnecessary task stacking.
++ * If the wakee cpu is idle, or the task is descheduling and the
++ * only running task on the CPU, then use the wakelist to offload
++ * the task activation to the idle (or soon-to-be-idle) CPU as
++ * the current CPU is likely busy. nr_running is checked to
++ * avoid unnecessary task stacking.
++ *
++ * Note that we can only get here with (wakee) p->on_rq=0,
++ * p->on_cpu can be whatever, we've done the dequeue, so
++ * the wakee has been accounted out of ->nr_running.
+ */
+- if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
++ if (!cpu_rq(cpu)->nr_running)
+ return true;
+
+ return false;
+@@ -3841,10 +3853,7 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
+
+ static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
+ {
+- if (sched_feat(TTWU_QUEUE) && ttwu_queue_cond(cpu, wake_flags)) {
+- if (WARN_ON_ONCE(cpu == smp_processor_id()))
+- return false;
+-
++ if (sched_feat(TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
+ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
+ __ttwu_queue_wakelist(p, cpu, wake_flags);
+ return true;
+@@ -4166,7 +4175,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
+ * scheduling.
+ */
+ if (smp_load_acquire(&p->on_cpu) &&
+- ttwu_queue_wakelist(p, task_cpu(p), wake_flags | WF_ON_CPU))
++ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
+ goto unlock;
+
+ /*
+@@ -4710,7 +4719,8 @@ static inline void prepare_task(struct task_struct *next)
+ * Claim the task as running, we do this before switching to it
+ * such that any running task will have this set.
+ *
+- * See the ttwu() WF_ON_CPU case and its ordering comment.
++ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
++ * its ordering comment.
+ */
+ WRITE_ONCE(next->on_cpu, 1);
+ #endif
+@@ -6460,8 +6470,12 @@ static inline void sched_submit_work(struct task_struct *tsk)
+ io_wq_worker_sleeping(tsk);
+ }
+
+- if (tsk_is_pi_blocked(tsk))
+- return;
++ /*
++ * spinlock and rwlock must not flush block requests. This will
++ * deadlock if the callback attempts to acquire a lock which is
++ * already acquired.
++ */
++ SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
+
+ /*
+ * If we are going to sleep and we have plugged IO queued,
+@@ -8895,7 +8909,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
+ }
+
+ int task_can_attach(struct task_struct *p,
+- const struct cpumask *cs_cpus_allowed)
++ const struct cpumask *cs_effective_cpus)
+ {
+ int ret = 0;
+
+@@ -8914,9 +8928,11 @@ int task_can_attach(struct task_struct *p,
+ }
+
+ if (dl_task(p) && !cpumask_intersects(task_rq(p)->rd->span,
+- cs_cpus_allowed)) {
+- int cpu = cpumask_any_and(cpu_active_mask, cs_cpus_allowed);
++ cs_effective_cpus)) {
++ int cpu = cpumask_any_and(cpu_active_mask, cs_effective_cpus);
+
++ if (unlikely(cpu >= nr_cpu_ids))
++ return -EINVAL;
+ ret = dl_cpu_busy(cpu, p);
+ }
+
+diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
+index cc8daa3dcc8bc..46f6674a0979d 100644
+--- a/kernel/sched/fair.c
++++ b/kernel/sched/fair.c
+@@ -6324,6 +6324,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
+ {
+ struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
+ int i, cpu, idle_cpu = -1, nr = INT_MAX;
++ struct sched_domain_shared *sd_share;
+ struct rq *this_rq = this_rq();
+ int this = smp_processor_id();
+ struct sched_domain *this_sd;
+@@ -6363,6 +6364,17 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
+ time = cpu_clock(this);
+ }
+
++ if (sched_feat(SIS_UTIL)) {
++ sd_share = rcu_dereference(per_cpu(sd_llc_shared, target));
++ if (sd_share) {
++ /* because !--nr is the condition to stop scan */
++ nr = READ_ONCE(sd_share->nr_idle_scan) + 1;
++ /* overloaded LLC is unlikely to have idle cpu/core */
++ if (nr == 1)
++ return -1;
++ }
++ }
++
+ for_each_cpu_wrap(cpu, cpus, target + 1) {
+ if (has_idle_core) {
+ i = select_idle_core(p, cpu, cpus, &idle_cpu);
+@@ -7616,8 +7628,8 @@ enum group_type {
+ */
+ group_fully_busy,
+ /*
+- * SD_ASYM_CPUCAPACITY only: One task doesn't fit with CPU's capacity
+- * and must be migrated to a more powerful CPU.
++ * One task doesn't fit with CPU's capacity and must be migrated to a
++ * more powerful CPU.
+ */
+ group_misfit_task,
+ /*
+@@ -8700,6 +8712,19 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs
+ return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
+ }
+
++static inline bool
++sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
++{
++ /*
++ * When there is more than 1 task, the group_overloaded case already
++ * takes care of cpu with reduced capacity
++ */
++ if (rq->cfs.h_nr_running != 1)
++ return false;
++
++ return check_cpu_capacity(rq, sd);
++}
++
+ /**
+ * update_sg_lb_stats - Update sched_group's statistics for load balancing.
+ * @env: The load balancing environment.
+@@ -8722,8 +8747,9 @@ static inline void update_sg_lb_stats(struct lb_env *env,
+
+ for_each_cpu_and(i, sched_group_span(group), env->cpus) {
+ struct rq *rq = cpu_rq(i);
++ unsigned long load = cpu_load(rq);
+
+- sgs->group_load += cpu_load(rq);
++ sgs->group_load += load;
+ sgs->group_util += cpu_util_cfs(i);
+ sgs->group_runnable += cpu_runnable(rq);
+ sgs->sum_h_nr_running += rq->cfs.h_nr_running;
+@@ -8753,11 +8779,17 @@ static inline void update_sg_lb_stats(struct lb_env *env,
+ if (local_group)
+ continue;
+
+- /* Check for a misfit task on the cpu */
+- if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
+- sgs->group_misfit_task_load < rq->misfit_task_load) {
+- sgs->group_misfit_task_load = rq->misfit_task_load;
+- *sg_status |= SG_OVERLOAD;
++ if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
++ /* Check for a misfit task on the cpu */
++ if (sgs->group_misfit_task_load < rq->misfit_task_load) {
++ sgs->group_misfit_task_load = rq->misfit_task_load;
++ *sg_status |= SG_OVERLOAD;
++ }
++ } else if ((env->idle != CPU_NOT_IDLE) &&
++ sched_reduced_capacity(rq, env->sd)) {
++ /* Check for a task running on a CPU with reduced capacity */
++ if (sgs->group_misfit_task_load < load)
++ sgs->group_misfit_task_load = load;
+ }
+ }
+
+@@ -8810,7 +8842,8 @@ static bool update_sd_pick_busiest(struct lb_env *env,
+ * CPUs in the group should either be possible to resolve
+ * internally or be covered by avg_load imbalance (eventually).
+ */
+- if (sgs->group_type == group_misfit_task &&
++ if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
++ (sgs->group_type == group_misfit_task) &&
+ (!capacity_greater(capacity_of(env->dst_cpu), sg->sgc->max_capacity) ||
+ sds->local_stat.group_type != group_has_spare))
+ return false;
+@@ -9253,6 +9286,77 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
+ return idlest;
+ }
+
++static void update_idle_cpu_scan(struct lb_env *env,
++ unsigned long sum_util)
++{
++ struct sched_domain_shared *sd_share;
++ int llc_weight, pct;
++ u64 x, y, tmp;
++ /*
++ * Update the number of CPUs to scan in LLC domain, which could
++ * be used as a hint in select_idle_cpu(). The update of sd_share
++ * could be expensive because it is within a shared cache line.
++ * So the write of this hint only occurs during periodic load
++ * balancing, rather than CPU_NEWLY_IDLE, because the latter
++ * can fire way more frequently than the former.
++ */
++ if (!sched_feat(SIS_UTIL) || env->idle == CPU_NEWLY_IDLE)
++ return;
++
++ llc_weight = per_cpu(sd_llc_size, env->dst_cpu);
++ if (env->sd->span_weight != llc_weight)
++ return;
++
++ sd_share = rcu_dereference(per_cpu(sd_llc_shared, env->dst_cpu));
++ if (!sd_share)
++ return;
++
++ /*
++ * The number of CPUs to search drops as sum_util increases, when
++ * sum_util hits 85% or above, the scan stops.
++ * The reason to choose 85% as the threshold is because this is the
++ * imbalance_pct(117) when a LLC sched group is overloaded.
++ *
++ * let y = SCHED_CAPACITY_SCALE - p * x^2 [1]
++ * and y'= y / SCHED_CAPACITY_SCALE
++ *
++ * x is the ratio of sum_util compared to the CPU capacity:
++ * x = sum_util / (llc_weight * SCHED_CAPACITY_SCALE)
++ * y' is the ratio of CPUs to be scanned in the LLC domain,
++ * and the number of CPUs to scan is calculated by:
++ *
++ * nr_scan = llc_weight * y' [2]
++ *
++ * When x hits the threshold of overloaded, AKA, when
++ * x = 100 / pct, y drops to 0. According to [1],
++ * p should be SCHED_CAPACITY_SCALE * pct^2 / 10000
++ *
++ * Scale x by SCHED_CAPACITY_SCALE:
++ * x' = sum_util / llc_weight; [3]
++ *
++ * and finally [1] becomes:
++ * y = SCHED_CAPACITY_SCALE -
++ * x'^2 * pct^2 / (10000 * SCHED_CAPACITY_SCALE) [4]
++ *
++ */
++ /* equation [3] */
++ x = sum_util;
++ do_div(x, llc_weight);
++
++ /* equation [4] */
++ pct = env->sd->imbalance_pct;
++ tmp = x * x * pct * pct;
++ do_div(tmp, 10000 * SCHED_CAPACITY_SCALE);
++ tmp = min_t(long, tmp, SCHED_CAPACITY_SCALE);
++ y = SCHED_CAPACITY_SCALE - tmp;
++
++ /* equation [2] */
++ y *= llc_weight;
++ do_div(y, SCHED_CAPACITY_SCALE);
++ if ((int)y != sd_share->nr_idle_scan)
++ WRITE_ONCE(sd_share->nr_idle_scan, (int)y);
++}
++
+ /**
+ * update_sd_lb_stats - Update sched_domain's statistics for load balancing.
+ * @env: The load balancing environment.
+@@ -9265,6 +9369,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
+ struct sched_group *sg = env->sd->groups;
+ struct sg_lb_stats *local = &sds->local_stat;
+ struct sg_lb_stats tmp_sgs;
++ unsigned long sum_util = 0;
+ int sg_status = 0;
+
+ do {
+@@ -9297,6 +9402,7 @@ next_group:
+ sds->total_load += sgs->group_load;
+ sds->total_capacity += sgs->group_capacity;
+
++ sum_util += sgs->group_util;
+ sg = sg->next;
+ } while (sg != env->sd->groups);
+
+@@ -9322,6 +9428,8 @@ next_group:
+ WRITE_ONCE(rd->overutilized, SG_OVERUTILIZED);
+ trace_sched_overutilized_tp(rd, SG_OVERUTILIZED);
+ }
++
++ update_idle_cpu_scan(env, sum_util);
+ }
+
+ #define NUMA_IMBALANCE_MIN 2
+@@ -9356,9 +9464,18 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
+ busiest = &sds->busiest_stat;
+
+ if (busiest->group_type == group_misfit_task) {
+- /* Set imbalance to allow misfit tasks to be balanced. */
+- env->migration_type = migrate_misfit;
+- env->imbalance = 1;
++ if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
++ /* Set imbalance to allow misfit tasks to be balanced. */
++ env->migration_type = migrate_misfit;
++ env->imbalance = 1;
++ } else {
++ /*
++ * Set load imbalance to allow moving task from cpu
++ * with reduced capacity.
++ */
++ env->migration_type = migrate_load;
++ env->imbalance = busiest->group_misfit_task_load;
++ }
+ return;
+ }
+
+diff --git a/kernel/sched/features.h b/kernel/sched/features.h
+index 1cf435bbcd9ca..ee7f23c76bd33 100644
+--- a/kernel/sched/features.h
++++ b/kernel/sched/features.h
+@@ -60,7 +60,8 @@ SCHED_FEAT(TTWU_QUEUE, true)
+ /*
+ * When doing wakeups, attempt to limit superfluous scans of the LLC domain.
+ */
+-SCHED_FEAT(SIS_PROP, true)
++SCHED_FEAT(SIS_PROP, false)
++SCHED_FEAT(SIS_UTIL, true)
+
+ /*
+ * Issue a WARN when we do multiple update_rq_clock() calls
+diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
+index 7891c0f0e1ff7..907a5d7507a85 100644
+--- a/kernel/sched/rt.c
++++ b/kernel/sched/rt.c
+@@ -430,7 +430,7 @@ static inline void rt_queue_push_tasks(struct rq *rq)
+ #endif /* CONFIG_SMP */
+
+ static void enqueue_top_rt_rq(struct rt_rq *rt_rq);
+-static void dequeue_top_rt_rq(struct rt_rq *rt_rq);
++static void dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count);
+
+ static inline int on_rt_rq(struct sched_rt_entity *rt_se)
+ {
+@@ -551,7 +551,7 @@ static void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
+ rt_se = rt_rq->tg->rt_se[cpu];
+
+ if (!rt_se) {
+- dequeue_top_rt_rq(rt_rq);
++ dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running);
+ /* Kick cpufreq (see the comment in kernel/sched/sched.h). */
+ cpufreq_update_util(rq_of_rt_rq(rt_rq), 0);
+ }
+@@ -637,7 +637,7 @@ static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
+
+ static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
+ {
+- dequeue_top_rt_rq(rt_rq);
++ dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running);
+ }
+
+ static inline int rt_rq_throttled(struct rt_rq *rt_rq)
+@@ -1039,7 +1039,7 @@ static void update_curr_rt(struct rq *rq)
+ }
+
+ static void
+-dequeue_top_rt_rq(struct rt_rq *rt_rq)
++dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count)
+ {
+ struct rq *rq = rq_of_rt_rq(rt_rq);
+
+@@ -1050,7 +1050,7 @@ dequeue_top_rt_rq(struct rt_rq *rt_rq)
+
+ BUG_ON(!rq->nr_running);
+
+- sub_nr_running(rq, rt_rq->rt_nr_running);
++ sub_nr_running(rq, count);
+ rt_rq->rt_queued = 0;
+
+ }
+@@ -1436,18 +1436,21 @@ static void __dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flag
+ static void dequeue_rt_stack(struct sched_rt_entity *rt_se, unsigned int flags)
+ {
+ struct sched_rt_entity *back = NULL;
++ unsigned int rt_nr_running;
+
+ for_each_sched_rt_entity(rt_se) {
+ rt_se->back = back;
+ back = rt_se;
+ }
+
+- dequeue_top_rt_rq(rt_rq_of_se(back));
++ rt_nr_running = rt_rq_of_se(back)->rt_nr_running;
+
+ for (rt_se = back; rt_se; rt_se = rt_se->back) {
+ if (on_rt_rq(rt_se))
+ __dequeue_rt_entity(rt_se, flags);
+ }
++
++ dequeue_top_rt_rq(rt_rq_of_se(back), rt_nr_running);
+ }
+
+ static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index 84bba67c92dc6..08fdb9ccd14dc 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -2042,7 +2042,6 @@ static inline int task_on_rq_migrating(struct task_struct *p)
+
+ #define WF_SYNC 0x10 /* Waker goes to sleep after wakeup */
+ #define WF_MIGRATED 0x20 /* Internal use, task got migrated */
+-#define WF_ON_CPU 0x40 /* Wakee is on_cpu */
+
+ #ifdef CONFIG_SMP
+ static_assert(WF_EXEC == SD_BALANCE_EXEC);
+diff --git a/kernel/smp.c b/kernel/smp.c
+index 65a630f62363c..381eb15cd28f6 100644
+--- a/kernel/smp.c
++++ b/kernel/smp.c
+@@ -174,9 +174,9 @@ static int __init csdlock_debug(char *str)
+ if (val)
+ static_branch_enable(&csdlock_debug_enabled);
+
+- return 0;
++ return 1;
+ }
+-early_param("csdlock_debug", csdlock_debug);
++__setup("csdlock_debug=", csdlock_debug);
+
+ static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
+ static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
+diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
+index 0ea8702eb5163..23af5eca11b14 100644
+--- a/kernel/time/hrtimer.c
++++ b/kernel/time/hrtimer.c
+@@ -2311,6 +2311,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
+
+ return !t.task ? 0 : -EINTR;
+ }
++EXPORT_SYMBOL_GPL(schedule_hrtimeout_range_clock);
+
+ /**
+ * schedule_hrtimeout_range - sleep until timeout
+diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
+index 871c912860ed5..d6a0ff68df410 100644
+--- a/kernel/time/timekeeping.c
++++ b/kernel/time/timekeeping.c
+@@ -23,6 +23,7 @@
+ #include <linux/pvclock_gtod.h>
+ #include <linux/compiler.h>
+ #include <linux/audit.h>
++#include <linux/random.h>
+
+ #include "tick-internal.h"
+ #include "ntp_internal.h"
+@@ -1326,8 +1327,10 @@ out:
+ /* Signal hrtimers about time change */
+ clock_was_set(CLOCK_SET_WALL);
+
+- if (!ret)
++ if (!ret) {
+ audit_tk_injoffset(ts_delta);
++ add_device_randomness(ts, sizeof(*ts));
++ }
+
+ return ret;
+ }
+@@ -2413,6 +2416,7 @@ int do_adjtimex(struct __kernel_timex *txc)
+ ret = timekeeping_validate_timex(txc);
+ if (ret)
+ return ret;
++ add_device_randomness(txc, sizeof(*txc));
+
+ if (txc->modes & ADJ_SETOFFSET) {
+ struct timespec64 delta;
+@@ -2430,6 +2434,7 @@ int do_adjtimex(struct __kernel_timex *txc)
+ audit_ntp_init(&ad);
+
+ ktime_get_real_ts64(&ts);
++ add_device_randomness(&ts, sizeof(ts));
+
+ raw_spin_lock_irqsave(&timekeeper_lock, flags);
+ write_seqcount_begin(&tk_core.seq);
+diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
+index 4d5629196d01d..f0500b5cfefee 100644
+--- a/kernel/trace/blktrace.c
++++ b/kernel/trace/blktrace.c
+@@ -770,14 +770,11 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg)
+ **/
+ void blk_trace_shutdown(struct request_queue *q)
+ {
+- mutex_lock(&q->debugfs_mutex);
+ if (rcu_dereference_protected(q->blk_trace,
+ lockdep_is_held(&q->debugfs_mutex))) {
+ __blk_trace_startstop(q, 0);
+ __blk_trace_remove(q);
+ }
+-
+- mutex_unlock(&q->debugfs_mutex);
+ }
+
+ #ifdef CONFIG_BLK_CGROUP
+@@ -1059,7 +1056,7 @@ static void blk_add_trace_rq_remap(void *ignore, struct request *rq, dev_t dev,
+ r.sector_from = cpu_to_be64(from);
+
+ __blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq),
+- rq_data_dir(rq), 0, BLK_TA_REMAP, 0,
++ req_op(rq), rq->cmd_flags, BLK_TA_REMAP, 0,
+ sizeof(r), &r, blk_trace_request_get_cgid(rq));
+ rcu_read_unlock();
+ }
+diff --git a/lib/crypto/blake2s-selftest.c b/lib/crypto/blake2s-selftest.c
+index 409e4b7287704..7d77dea155873 100644
+--- a/lib/crypto/blake2s-selftest.c
++++ b/lib/crypto/blake2s-selftest.c
+@@ -4,6 +4,8 @@
+ */
+
+ #include <crypto/internal/blake2s.h>
++#include <linux/kernel.h>
++#include <linux/random.h>
+ #include <linux/string.h>
+
+ /*
+@@ -587,5 +589,44 @@ bool __init blake2s_selftest(void)
+ }
+ }
+
++ for (i = 0; i < 32; ++i) {
++ enum { TEST_ALIGNMENT = 16 };
++ u8 unaligned_block[BLAKE2S_BLOCK_SIZE + TEST_ALIGNMENT - 1]
++ __aligned(TEST_ALIGNMENT);
++ u8 blocks[BLAKE2S_BLOCK_SIZE * 2];
++ struct blake2s_state state1, state2;
++
++ get_random_bytes(blocks, sizeof(blocks));
++ get_random_bytes(&state, sizeof(state));
++
++#if defined(CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC) && \
++ defined(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)
++ memcpy(&state1, &state, sizeof(state1));
++ memcpy(&state2, &state, sizeof(state2));
++ blake2s_compress(&state1, blocks, 2, BLAKE2S_BLOCK_SIZE);
++ blake2s_compress_generic(&state2, blocks, 2, BLAKE2S_BLOCK_SIZE);
++ if (memcmp(&state1, &state2, sizeof(state1))) {
++ pr_err("blake2s random compress self-test %d: FAIL\n",
++ i + 1);
++ success = false;
++ }
++#endif
++
++ memcpy(&state1, &state, sizeof(state1));
++ blake2s_compress(&state1, blocks, 1, BLAKE2S_BLOCK_SIZE);
++ for (l = 1; l < TEST_ALIGNMENT; ++l) {
++ memcpy(unaligned_block + l, blocks,
++ BLAKE2S_BLOCK_SIZE);
++ memcpy(&state2, &state, sizeof(state2));
++ blake2s_compress(&state2, unaligned_block + l, 1,
++ BLAKE2S_BLOCK_SIZE);
++ if (memcmp(&state1, &state2, sizeof(state1))) {
++ pr_err("blake2s random compress align %d self-test %d: FAIL\n",
++ l, i + 1);
++ success = false;
++ }
++ }
++ }
++
+ return success;
+ }
+diff --git a/lib/crypto/blake2s.c b/lib/crypto/blake2s.c
+index c71c09621c09c..98e688c6d8910 100644
+--- a/lib/crypto/blake2s.c
++++ b/lib/crypto/blake2s.c
+@@ -16,16 +16,44 @@
+ #include <linux/init.h>
+ #include <linux/bug.h>
+
++static inline void blake2s_set_lastblock(struct blake2s_state *state)
++{
++ state->f[0] = -1;
++}
++
+ void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen)
+ {
+- __blake2s_update(state, in, inlen, false);
++ const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;
++
++ if (unlikely(!inlen))
++ return;
++ if (inlen > fill) {
++ memcpy(state->buf + state->buflen, in, fill);
++ blake2s_compress(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
++ state->buflen = 0;
++ in += fill;
++ inlen -= fill;
++ }
++ if (inlen > BLAKE2S_BLOCK_SIZE) {
++ const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
++ blake2s_compress(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
++ in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
++ inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
++ }
++ memcpy(state->buf + state->buflen, in, inlen);
++ state->buflen += inlen;
+ }
+ EXPORT_SYMBOL(blake2s_update);
+
+ void blake2s_final(struct blake2s_state *state, u8 *out)
+ {
+ WARN_ON(IS_ENABLED(DEBUG) && !out);
+- __blake2s_final(state, out, false);
++ blake2s_set_lastblock(state);
++ memset(state->buf + state->buflen, 0,
++ BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
++ blake2s_compress(state, state->buf, 1, state->buflen);
++ cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
++ memcpy(out, state->h, state->outlen);
+ memzero_explicit(state, sizeof(*state));
+ }
+ EXPORT_SYMBOL(blake2s_final);
+@@ -38,12 +66,7 @@ static int __init blake2s_mod_init(void)
+ return 0;
+ }
+
+-static void __exit blake2s_mod_exit(void)
+-{
+-}
+-
+ module_init(blake2s_mod_init);
+-module_exit(blake2s_mod_exit);
+ MODULE_LICENSE("GPL v2");
+ MODULE_DESCRIPTION("BLAKE2s hash function");
+ MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+diff --git a/lib/iov_iter.c b/lib/iov_iter.c
+index 0b64695ab632f..2bf20b48a04ad 100644
+--- a/lib/iov_iter.c
++++ b/lib/iov_iter.c
+@@ -689,6 +689,7 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
+ struct pipe_inode_info *pipe = i->pipe;
+ unsigned int p_mask = pipe->ring_size - 1;
+ unsigned int i_head;
++ unsigned int valid = pipe->head;
+ size_t n, off, xfer = 0;
+
+ if (!sanity(i))
+@@ -702,11 +703,17 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
+ rem = copy_mc_to_kernel(p + off, addr + xfer, chunk);
+ chunk -= rem;
+ kunmap_local(p);
+- i->head = i_head;
+- i->iov_offset = off + chunk;
+- xfer += chunk;
+- if (rem)
++ if (chunk) {
++ i->head = i_head;
++ i->iov_offset = off + chunk;
++ xfer += chunk;
++ valid = i_head + 1;
++ }
++ if (rem) {
++ pipe->bufs[i_head & p_mask].len -= rem;
++ pipe_discard_from(pipe, valid);
+ break;
++ }
+ n -= chunk;
+ off = 0;
+ i_head++;
+diff --git a/lib/kunit/executor.c b/lib/kunit/executor.c
+index 96f96e42ce062..16fb88c0aca31 100644
+--- a/lib/kunit/executor.c
++++ b/lib/kunit/executor.c
+@@ -76,8 +76,10 @@ kunit_filter_tests(struct kunit_suite *const suite, const char *test_glob)
+ memcpy(copy, suite, sizeof(*copy));
+
+ filtered = kcalloc(n + 1, sizeof(*filtered), GFP_KERNEL);
+- if (!filtered)
++ if (!filtered) {
++ kfree(copy);
+ return ERR_PTR(-ENOMEM);
++ }
+
+ n = 0;
+ kunit_suite_for_each_test_case(suite, test_case) {
+diff --git a/lib/livepatch/test_klp_callbacks_busy.c b/lib/livepatch/test_klp_callbacks_busy.c
+index 7ac845f65be56..133929e0ce8ff 100644
+--- a/lib/livepatch/test_klp_callbacks_busy.c
++++ b/lib/livepatch/test_klp_callbacks_busy.c
+@@ -16,10 +16,12 @@ MODULE_PARM_DESC(block_transition, "block_transition (default=false)");
+
+ static void busymod_work_func(struct work_struct *work);
+ static DECLARE_WORK(work, busymod_work_func);
++static DECLARE_COMPLETION(busymod_work_started);
+
+ static void busymod_work_func(struct work_struct *work)
+ {
+ pr_info("%s enter\n", __func__);
++ complete(&busymod_work_started);
+
+ while (READ_ONCE(block_transition)) {
+ /*
+@@ -37,6 +39,12 @@ static int test_klp_callbacks_busy_init(void)
+ pr_info("%s\n", __func__);
+ schedule_work(&work);
+
++ /*
++ * To synchronize kernel messages, hold the init function from
++ * exiting until the work function's entry message has printed.
++ */
++ wait_for_completion(&busymod_work_started);
++
+ if (!block_transition) {
+ /*
+ * Serialize output: print all messages from the work
+diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c
+index 475f0c064bf65..7e3e43679b73c 100644
+--- a/lib/overflow_kunit.c
++++ b/lib/overflow_kunit.c
+@@ -91,6 +91,7 @@ DEFINE_TEST_ARRAY(u32) = {
+ {-4U, 5U, 1U, -9U, -20U, true, false, true},
+ };
+
++#if BITS_PER_LONG == 64
+ DEFINE_TEST_ARRAY(u64) = {
+ {0, 0, 0, 0, 0, false, false, false},
+ {1, 1, 2, 0, 1, false, false, false},
+@@ -114,6 +115,7 @@ DEFINE_TEST_ARRAY(u64) = {
+ false, true, false},
+ {-15ULL, 10ULL, -5ULL, -25ULL, -150ULL, false, false, true},
+ };
++#endif
+
+ DEFINE_TEST_ARRAY(s8) = {
+ {0, 0, 0, 0, 0, false, false, false},
+@@ -188,6 +190,8 @@ DEFINE_TEST_ARRAY(s32) = {
+ {S32_MIN, S32_MIN, 0, 0, 0, true, false, true},
+ {S32_MAX, S32_MAX, -2, 0, 1, true, false, true},
+ };
++
++#if BITS_PER_LONG == 64
+ DEFINE_TEST_ARRAY(s64) = {
+ {0, 0, 0, 0, 0, false, false, false},
+
+@@ -216,6 +220,7 @@ DEFINE_TEST_ARRAY(s64) = {
+ {-128, -1, -129, -127, 128, false, false, false},
+ {0, -S64_MAX, -S64_MAX, S64_MAX, 0, false, false, false},
+ };
++#endif
+
+ #define check_one_op(t, fmt, op, sym, a, b, r, of) do { \
+ t _r; \
+@@ -650,6 +655,7 @@ static struct kunit_case overflow_test_cases[] = {
+ KUNIT_CASE(s16_overflow_test),
+ KUNIT_CASE(u32_overflow_test),
+ KUNIT_CASE(s32_overflow_test),
++/* Clang 13 and earlier generate unwanted libcalls on 32-bit. */
+ #if BITS_PER_LONG == 64
+ KUNIT_CASE(u64_overflow_test),
+ KUNIT_CASE(s64_overflow_test),
+diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
+index 046ac6297c781..a2bb7738c373c 100644
+--- a/lib/smp_processor_id.c
++++ b/lib/smp_processor_id.c
+@@ -47,9 +47,9 @@ unsigned int check_preemption_disabled(const char *what1, const char *what2)
+
+ printk("caller is %pS\n", __builtin_return_address(0));
+ dump_stack();
+- instrumentation_end();
+
+ out_enable:
++ instrumentation_end();
+ preempt_enable_no_resched_notrace();
+ out:
+ return this_cpu;
+diff --git a/lib/test_bpf.c b/lib/test_bpf.c
+index 0c5cb2d6436a4..1a00d0247f708 100644
+--- a/lib/test_bpf.c
++++ b/lib/test_bpf.c
+@@ -14456,9 +14456,9 @@ static struct skb_segment_test skb_segment_tests[] __initconst = {
+ .build_skb = build_test_skb_linear_no_head_frag,
+ .features = NETIF_F_SG | NETIF_F_FRAGLIST |
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_GSO |
+- NETIF_F_LLTX_BIT | NETIF_F_GRO |
++ NETIF_F_LLTX | NETIF_F_GRO |
+ NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM |
+- NETIF_F_HW_VLAN_STAG_TX_BIT
++ NETIF_F_HW_VLAN_STAG_TX
+ }
+ };
+
+diff --git a/lib/test_hmm.c b/lib/test_hmm.c
+index cfe6320478391..f2c3015c5c82c 100644
+--- a/lib/test_hmm.c
++++ b/lib/test_hmm.c
+@@ -732,7 +732,7 @@ static int dmirror_exclusive(struct dmirror *dmirror,
+
+ mmap_read_lock(mm);
+ for (addr = start; addr < end; addr = next) {
+- unsigned long mapped;
++ unsigned long mapped = 0;
+ int i;
+
+ if (end < addr + (ARRAY_SIZE(pages) << PAGE_SHIFT))
+@@ -741,7 +741,13 @@ static int dmirror_exclusive(struct dmirror *dmirror,
+ next = addr + (ARRAY_SIZE(pages) << PAGE_SHIFT);
+
+ ret = make_device_exclusive_range(mm, addr, next, pages, NULL);
+- mapped = dmirror_atomic_map(addr, next, pages, dmirror);
++ /*
++ * Do dmirror_atomic_map() iff all pages are marked for
++ * exclusive access to avoid accessing uninitialized
++ * fields of pages.
++ */
++ if (ret == (next - addr) >> PAGE_SHIFT)
++ mapped = dmirror_atomic_map(addr, next, pages, dmirror);
+ for (i = 0; i < ret; i++) {
+ if (pages[i]) {
+ unlock_page(pages[i]);
+diff --git a/lib/test_kasan.c b/lib/test_kasan.c
+index ad880231dfa84..630e0c31d7c2b 100644
+--- a/lib/test_kasan.c
++++ b/lib/test_kasan.c
+@@ -131,6 +131,7 @@ static void kmalloc_oob_right(struct kunit *test)
+ ptr = kmalloc(size, GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+
++ OPTIMIZER_HIDE_VAR(ptr);
+ /*
+ * An unaligned access past the requested kmalloc size.
+ * Only generic KASAN can precisely detect these.
+@@ -159,6 +160,7 @@ static void kmalloc_oob_left(struct kunit *test)
+ ptr = kmalloc(size, GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+
++ OPTIMIZER_HIDE_VAR(ptr);
+ KUNIT_EXPECT_KASAN_FAIL(test, *ptr = *(ptr - 1));
+ kfree(ptr);
+ }
+@@ -171,6 +173,7 @@ static void kmalloc_node_oob_right(struct kunit *test)
+ ptr = kmalloc_node(size, GFP_KERNEL, 0);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+
++ OPTIMIZER_HIDE_VAR(ptr);
+ KUNIT_EXPECT_KASAN_FAIL(test, ptr[0] = ptr[size]);
+ kfree(ptr);
+ }
+@@ -191,6 +194,7 @@ static void kmalloc_pagealloc_oob_right(struct kunit *test)
+ ptr = kmalloc(size, GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+
++ OPTIMIZER_HIDE_VAR(ptr);
+ KUNIT_EXPECT_KASAN_FAIL(test, ptr[size + OOB_TAG_OFF] = 0);
+
+ kfree(ptr);
+@@ -271,6 +275,7 @@ static void kmalloc_large_oob_right(struct kunit *test)
+ ptr = kmalloc(size, GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+
++ OPTIMIZER_HIDE_VAR(ptr);
+ KUNIT_EXPECT_KASAN_FAIL(test, ptr[size] = 0);
+ kfree(ptr);
+ }
+@@ -410,6 +415,8 @@ static void kmalloc_oob_16(struct kunit *test)
+ ptr2 = kmalloc(sizeof(*ptr2), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr2);
+
++ OPTIMIZER_HIDE_VAR(ptr1);
++ OPTIMIZER_HIDE_VAR(ptr2);
+ KUNIT_EXPECT_KASAN_FAIL(test, *ptr1 = *ptr2);
+ kfree(ptr1);
+ kfree(ptr2);
+@@ -756,6 +763,8 @@ static void ksize_unpoisons_memory(struct kunit *test)
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ real_size = ksize(ptr);
+
++ OPTIMIZER_HIDE_VAR(ptr);
++
+ /* This access shouldn't trigger a KASAN report. */
+ ptr[size] = 'x';
+
+@@ -778,6 +787,7 @@ static void ksize_uaf(struct kunit *test)
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ kfree(ptr);
+
++ OPTIMIZER_HIDE_VAR(ptr);
+ KUNIT_EXPECT_KASAN_FAIL(test, ksize(ptr));
+ KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[0]);
+ KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[size]);
+diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
+index e34c4d0c4d939..11982685508ee 100644
+--- a/mm/damon/reclaim.c
++++ b/mm/damon/reclaim.c
+@@ -384,8 +384,10 @@ static int __init damon_reclaim_init(void)
+ if (!ctx)
+ return -ENOMEM;
+
+- if (damon_select_ops(ctx, DAMON_OPS_PADDR))
++ if (damon_select_ops(ctx, DAMON_OPS_PADDR)) {
++ damon_destroy_ctx(ctx);
+ return -EINVAL;
++ }
+
+ ctx->callback.after_aggregation = damon_reclaim_after_aggregation;
+
+diff --git a/mm/gup.c b/mm/gup.c
+index c5d076d43d9be..414f0c4b4caa0 100644
+--- a/mm/gup.c
++++ b/mm/gup.c
+@@ -1795,7 +1795,7 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
+ * Try to move out any movable page before pinning the range.
+ */
+ if (folio_test_hugetlb(folio)) {
+- if (!isolate_huge_page(&folio->page,
++ if (isolate_hugetlb(&folio->page,
+ &movable_page_list))
+ isolation_error_count++;
+ continue;
+diff --git a/mm/huge_memory.c b/mm/huge_memory.c
+index b7d0697b1f26e..e0c1cf3168d7b 100644
+--- a/mm/huge_memory.c
++++ b/mm/huge_memory.c
+@@ -18,6 +18,7 @@
+ #include <linux/shrinker.h>
+ #include <linux/mm_inline.h>
+ #include <linux/swapops.h>
++#include <linux/backing-dev.h>
+ #include <linux/dax.h>
+ #include <linux/khugepaged.h>
+ #include <linux/freezer.h>
+@@ -2389,11 +2390,15 @@ static void __split_huge_page(struct page *page, struct list_head *list,
+ __split_huge_page_tail(head, i, lruvec, list);
+ /* Some pages can be beyond EOF: drop them from page cache */
+ if (head[i].index >= end) {
+- ClearPageDirty(head + i);
+- __delete_from_page_cache(head + i, NULL);
++ struct folio *tail = page_folio(head + i);
++
+ if (shmem_mapping(head->mapping))
+ shmem_uncharge(head->mapping->host, 1);
+- put_page(head + i);
++ else if (folio_test_clear_dirty(tail))
++ folio_account_cleaned(tail,
++ inode_to_wb(folio->mapping->host));
++ __filemap_remove_folio(tail, NULL);
++ folio_put(tail);
+ } else if (!PageAnon(page)) {
+ __xa_store(&head->mapping->i_pages, head[i].index,
+ head + i, 0);
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index 859cfcaecddbc..19063dbed332d 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -2758,8 +2758,7 @@ retry:
+ * Fail with -EBUSY if not possible.
+ */
+ spin_unlock_irq(&hugetlb_lock);
+- if (!isolate_huge_page(old_page, list))
+- ret = -EBUSY;
++ ret = isolate_hugetlb(old_page, list);
+ spin_lock_irq(&hugetlb_lock);
+ goto free_new;
+ } else if (!HPageFreed(old_page)) {
+@@ -2835,7 +2834,7 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
+ if (hstate_is_gigantic(h))
+ return -ENOMEM;
+
+- if (page_count(head) && isolate_huge_page(head, list))
++ if (page_count(head) && !isolate_hugetlb(head, list))
+ ret = 0;
+ else if (!page_count(head))
+ ret = alloc_and_dissolve_huge_page(h, head, list);
+@@ -5603,7 +5602,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
+ */
+ entry = huge_ptep_get(ptep);
+ if (unlikely(is_hugetlb_entry_migration(entry))) {
+- migration_entry_wait_huge(vma, mm, ptep);
++ migration_entry_wait_huge(vma, ptep);
+ return 0;
+ } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
+ return VM_FAULT_HWPOISON_LARGE |
+@@ -6726,7 +6725,7 @@ retry:
+ } else {
+ if (is_hugetlb_entry_migration(pte)) {
+ spin_unlock(ptl);
+- __migration_entry_wait(mm, (pte_t *)pmd, ptl);
++ __migration_entry_wait_huge((pte_t *)pmd, ptl);
+ goto retry;
+ }
+ /*
+@@ -6758,15 +6757,15 @@ follow_huge_pgd(struct mm_struct *mm, unsigned long address, pgd_t *pgd, int fla
+ return pte_page(*(pte_t *)pgd) + ((address & ~PGDIR_MASK) >> PAGE_SHIFT);
+ }
+
+-bool isolate_huge_page(struct page *page, struct list_head *list)
++int isolate_hugetlb(struct page *page, struct list_head *list)
+ {
+- bool ret = true;
++ int ret = 0;
+
+ spin_lock_irq(&hugetlb_lock);
+ if (!PageHeadHuge(page) ||
+ !HPageMigratable(page) ||
+ !get_page_unless_zero(page)) {
+- ret = false;
++ ret = -EBUSY;
+ goto unlock;
+ }
+ ClearHPageMigratable(page);
+diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
+index f9942841df18b..c86691c431fd7 100644
+--- a/mm/hugetlb_cgroup.c
++++ b/mm/hugetlb_cgroup.c
+@@ -772,6 +772,7 @@ static void __init __hugetlb_cgroup_file_dfl_init(int idx)
+ /* Add the numa stat file */
+ cft = &h->cgroup_files_dfl[6];
+ snprintf(cft->name, MAX_CFTYPE_NAME, "%s.numa_stat", buf);
++ cft->private = MEMFILE_PRIVATE(idx, 0);
+ cft->seq_show = hugetlb_cgroup_read_numa_stat;
+ cft->flags = CFTYPE_NOT_ON_ROOT;
+
+diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
+index 9e1b6544bfa8e..9ad8eff71b28d 100644
+--- a/mm/kasan/hw_tags.c
++++ b/mm/kasan/hw_tags.c
+@@ -257,27 +257,37 @@ static void unpoison_vmalloc_pages(const void *addr, u8 tag)
+ }
+ }
+
++static void init_vmalloc_pages(const void *start, unsigned long size)
++{
++ const void *addr;
++
++ for (addr = start; addr < start + size; addr += PAGE_SIZE) {
++ struct page *page = virt_to_page(addr);
++
++ clear_highpage_kasan_tagged(page);
++ }
++}
++
+ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
+ kasan_vmalloc_flags_t flags)
+ {
+ u8 tag;
+ unsigned long redzone_start, redzone_size;
+
+- if (!kasan_vmalloc_enabled())
+- return (void *)start;
+-
+- if (!is_vmalloc_or_module_addr(start))
++ if (!kasan_vmalloc_enabled() || !is_vmalloc_or_module_addr(start)) {
++ if (flags & KASAN_VMALLOC_INIT)
++ init_vmalloc_pages(start, size);
+ return (void *)start;
++ }
+
+ /*
+- * Skip unpoisoning and assigning a pointer tag for non-VM_ALLOC
+- * mappings as:
++ * Don't tag non-VM_ALLOC mappings, as:
+ *
+ * 1. Unlike the software KASAN modes, hardware tag-based KASAN only
+ * supports tagging physical memory. Therefore, it can only tag a
+ * single mapping of normal physical pages.
+ * 2. Hardware tag-based KASAN can only tag memory mapped with special
+- * mapping protection bits, see arch_vmalloc_pgprot_modify().
++ * mapping protection bits, see arch_vmap_pgprot_tagged().
+ * As non-VM_ALLOC mappings can be mapped outside of vmalloc code,
+ * providing these bits would require tracking all non-VM_ALLOC
+ * mappers.
+@@ -289,15 +299,19 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
+ *
+ * For non-VM_ALLOC allocations, page_alloc memory is tagged as usual.
+ */
+- if (!(flags & KASAN_VMALLOC_VM_ALLOC))
++ if (!(flags & KASAN_VMALLOC_VM_ALLOC)) {
++ WARN_ON(flags & KASAN_VMALLOC_INIT);
+ return (void *)start;
++ }
+
+ /*
+ * Don't tag executable memory.
+ * The kernel doesn't tolerate having the PC register tagged.
+ */
+- if (!(flags & KASAN_VMALLOC_PROT_NORMAL))
++ if (!(flags & KASAN_VMALLOC_PROT_NORMAL)) {
++ WARN_ON(flags & KASAN_VMALLOC_INIT);
+ return (void *)start;
++ }
+
+ tag = kasan_random_tag();
+ start = set_tag(start, tag);
+diff --git a/mm/memory-failure.c b/mm/memory-failure.c
+index 94dac77f5ebad..fb8c9b0b53abf 100644
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -2211,7 +2211,7 @@ static bool isolate_page(struct page *page, struct list_head *pagelist)
+ bool lru = PageLRU(page);
+
+ if (PageHuge(page)) {
+- isolated = isolate_huge_page(page, pagelist);
++ isolated = !isolate_hugetlb(page, pagelist);
+ } else {
+ if (lru)
+ isolated = !isolate_lru_page(page);
+diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
+index 416b38ca8defa..3a60c1444bc7e 100644
+--- a/mm/memory_hotplug.c
++++ b/mm/memory_hotplug.c
+@@ -1627,7 +1627,7 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
+
+ if (PageHuge(page)) {
+ pfn = page_to_pfn(head) + compound_nr(head) - 1;
+- isolate_huge_page(head, &source);
++ isolate_hugetlb(head, &source);
+ continue;
+ } else if (PageTransHuge(page))
+ pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
+diff --git a/mm/mempolicy.c b/mm/mempolicy.c
+index ea6dee61bc9dc..c1ccc89845f4f 100644
+--- a/mm/mempolicy.c
++++ b/mm/mempolicy.c
+@@ -607,7 +607,7 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
+ /* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
+ if (flags & (MPOL_MF_MOVE_ALL) ||
+ (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+- if (!isolate_huge_page(page, qp->pagelist) &&
++ if (isolate_hugetlb(page, qp->pagelist) &&
+ (flags & MPOL_MF_STRICT))
+ /*
+ * Failed to isolate page but allow migrating pages
+@@ -1387,7 +1387,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
+ unsigned long bits = min_t(unsigned long, maxnode, BITS_PER_LONG);
+ unsigned long t;
+
+- if (get_bitmap(&t, &nmask[maxnode / BITS_PER_LONG], bits))
++ if (get_bitmap(&t, &nmask[(maxnode - 1) / BITS_PER_LONG], bits))
+ return -EFAULT;
+
+ if (maxnode - bits >= MAX_NUMNODES) {
+diff --git a/mm/memremap.c b/mm/memremap.c
+index e11653fd348cc..2c1130486d28b 100644
+--- a/mm/memremap.c
++++ b/mm/memremap.c
+@@ -141,10 +141,10 @@ void memunmap_pages(struct dev_pagemap *pgmap)
+ for (i = 0; i < pgmap->nr_range; i++)
+ percpu_ref_put_many(&pgmap->ref, pfn_len(pgmap, i));
+ wait_for_completion(&pgmap->done);
+- percpu_ref_exit(&pgmap->ref);
+
+ for (i = 0; i < pgmap->nr_range; i++)
+ pageunmap_range(pgmap, i);
++ percpu_ref_exit(&pgmap->ref);
+
+ WARN_ONCE(pgmap->altmap.alloc, "failed to free all reserved pages\n");
+ devmap_managed_enable_put(pgmap);
+diff --git a/mm/migrate.c b/mm/migrate.c
+index 6c31ee1e1c9b0..5aba3fb612d43 100644
+--- a/mm/migrate.c
++++ b/mm/migrate.c
+@@ -133,7 +133,7 @@ static void putback_movable_page(struct page *page)
+ *
+ * This function shall be used whenever the isolated pageset has been
+ * built from lru, balloon, hugetlbfs page. See isolate_migratepages_range()
+- * and isolate_huge_page().
++ * and isolate_hugetlb().
+ */
+ void putback_movable_pages(struct list_head *l)
+ {
+@@ -309,13 +309,28 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ __migration_entry_wait(mm, ptep, ptl);
+ }
+
+-void migration_entry_wait_huge(struct vm_area_struct *vma,
+- struct mm_struct *mm, pte_t *pte)
++#ifdef CONFIG_HUGETLB_PAGE
++void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl)
+ {
+- spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), mm, pte);
+- __migration_entry_wait(mm, pte, ptl);
++ pte_t pte;
++
++ spin_lock(ptl);
++ pte = huge_ptep_get(ptep);
++
++ if (unlikely(!is_hugetlb_entry_migration(pte)))
++ spin_unlock(ptl);
++ else
++ migration_entry_wait_on_locked(pte_to_swp_entry(pte), NULL, ptl);
+ }
+
++void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
++{
++ spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte);
++
++ __migration_entry_wait_huge(pte, ptl);
++}
++#endif
++
+ #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
+ {
+@@ -1631,8 +1646,9 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
+
+ if (PageHuge(page)) {
+ if (PageHead(page)) {
+- isolate_huge_page(page, pagelist);
+- err = 1;
++ err = isolate_hugetlb(page, pagelist);
++ if (!err)
++ err = 1;
+ }
+ } else {
+ struct page *head;
+diff --git a/mm/mmap.c b/mm/mmap.c
+index 313b57d55a634..6d9bfd9c94ca0 100644
+--- a/mm/mmap.c
++++ b/mm/mmap.c
+@@ -1882,7 +1882,6 @@ unmap_and_free_vma:
+
+ /* Undo any partial mapping done by a device driver. */
+ unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
+- charged = 0;
+ if (vm_flags & VM_SHARED)
+ mapping_unmap_writable(file->f_mapping);
+ free_vma:
+diff --git a/mm/page_alloc.c b/mm/page_alloc.c
+index 135a081edb82c..a8bd356a4d1e3 100644
+--- a/mm/page_alloc.c
++++ b/mm/page_alloc.c
+@@ -1287,12 +1287,8 @@ static void kernel_init_free_pages(struct page *page, int numpages)
+
+ /* s390's use of memset() could override KASAN redzones. */
+ kasan_disable_current();
+- for (i = 0; i < numpages; i++) {
+- u8 tag = page_kasan_tag(page + i);
+- page_kasan_tag_reset(page + i);
+- clear_highpage(page + i);
+- page_kasan_tag_set(page + i, tag);
+- }
++ for (i = 0; i < numpages; i++)
++ clear_highpage_kasan_tagged(page + i);
+ kasan_enable_current();
+ }
+
+diff --git a/mm/vmalloc.c b/mm/vmalloc.c
+index cadfbb5155ea5..06c555fc5be06 100644
+--- a/mm/vmalloc.c
++++ b/mm/vmalloc.c
+@@ -3170,15 +3170,15 @@ again:
+
+ /*
+ * Mark the pages as accessible, now that they are mapped.
+- * The init condition should match the one in post_alloc_hook()
+- * (except for the should_skip_init() check) to make sure that memory
+- * is initialized under the same conditions regardless of the enabled
+- * KASAN mode.
++ * The condition for setting KASAN_VMALLOC_INIT should complement the
++ * one in post_alloc_hook() with regards to the __GFP_SKIP_ZERO check
++ * to make sure that memory is initialized under the same conditions.
+ * Tag-based KASAN modes only assign tags to normal non-executable
+ * allocations, see __kasan_unpoison_vmalloc().
+ */
+ kasan_flags |= KASAN_VMALLOC_VM_ALLOC;
+- if (!want_init_on_free() && want_init_on_alloc(gfp_mask))
++ if (!want_init_on_free() && want_init_on_alloc(gfp_mask) &&
++ (gfp_mask & __GFP_SKIP_ZERO))
+ kasan_flags |= KASAN_VMALLOC_INIT;
+ /* KASAN_VMALLOC_PROT_NORMAL already set if required. */
+ area->addr = kasan_unpoison_vmalloc(area->addr, real_size, kasan_flags);
+diff --git a/net/9p/client.c b/net/9p/client.c
+index 8bba0d9cf9754..87cde948f628e 100644
+--- a/net/9p/client.c
++++ b/net/9p/client.c
+@@ -305,7 +305,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
+ * callback), so p9_client_cb eats the second ref there
+ * as the pointer is duplicated directly by virtqueue_add_sgs()
+ */
+- refcount_set(&req->refcount.refcount, 2);
++ refcount_set(&req->refcount, 2);
+
+ return req;
+
+@@ -341,7 +341,7 @@ again:
+ if (!p9_req_try_get(req))
+ goto again;
+ if (req->tc.tag != tag) {
+- p9_req_put(req);
++ p9_req_put(c, req);
+ goto again;
+ }
+ }
+@@ -367,21 +367,18 @@ static int p9_tag_remove(struct p9_client *c, struct p9_req_t *r)
+ spin_lock_irqsave(&c->lock, flags);
+ idr_remove(&c->reqs, tag);
+ spin_unlock_irqrestore(&c->lock, flags);
+- return p9_req_put(r);
++ return p9_req_put(c, r);
+ }
+
+-static void p9_req_free(struct kref *ref)
++int p9_req_put(struct p9_client *c, struct p9_req_t *r)
+ {
+- struct p9_req_t *r = container_of(ref, struct p9_req_t, refcount);
+-
+- p9_fcall_fini(&r->tc);
+- p9_fcall_fini(&r->rc);
+- kmem_cache_free(p9_req_cache, r);
+-}
+-
+-int p9_req_put(struct p9_req_t *r)
+-{
+- return kref_put(&r->refcount, p9_req_free);
++ if (refcount_dec_and_test(&r->refcount)) {
++ p9_fcall_fini(&r->tc);
++ p9_fcall_fini(&r->rc);
++ kmem_cache_free(p9_req_cache, r);
++ return 1;
++ }
++ return 0;
+ }
+ EXPORT_SYMBOL(p9_req_put);
+
+@@ -426,7 +423,7 @@ void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status)
+
+ wake_up(&req->wq);
+ p9_debug(P9_DEBUG_MUX, "wakeup: %d\n", req->tc.tag);
+- p9_req_put(req);
++ p9_req_put(c, req);
+ }
+ EXPORT_SYMBOL(p9_client_cb);
+
+@@ -709,7 +706,7 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
+ reterr:
+ p9_tag_remove(c, req);
+ /* We have to put also the 2nd reference as it won't be used */
+- p9_req_put(req);
++ p9_req_put(c, req);
+ return ERR_PTR(err);
+ }
+
+@@ -746,7 +743,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
+ err = c->trans_mod->request(c, req);
+ if (err < 0) {
+ /* write won't happen */
+- p9_req_put(req);
++ p9_req_put(c, req);
+ if (err != -ERESTARTSYS && err != -EFAULT)
+ c->status = Disconnected;
+ goto recalc_sigpending;
+@@ -889,16 +886,13 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt)
+ struct p9_fid *fid;
+
+ p9_debug(P9_DEBUG_FID, "clnt %p\n", clnt);
+- fid = kmalloc(sizeof(*fid), GFP_KERNEL);
++ fid = kzalloc(sizeof(*fid), GFP_KERNEL);
+ if (!fid)
+ return NULL;
+
+- memset(&fid->qid, 0, sizeof(fid->qid));
+ fid->mode = -1;
+ fid->uid = current_fsuid();
+ fid->clnt = clnt;
+- fid->rdir = NULL;
+- fid->fid = 0;
+ refcount_set(&fid->count, 1);
+
+ idr_preload(GFP_KERNEL);
+diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
+index 8f8f95e39b03a..e758978b44bee 100644
+--- a/net/9p/trans_fd.c
++++ b/net/9p/trans_fd.c
+@@ -343,6 +343,7 @@ static void p9_read_work(struct work_struct *work)
+ p9_debug(P9_DEBUG_ERROR,
+ "No recv fcall for tag %d (req %p), disconnecting!\n",
+ m->rc.tag, m->rreq);
++ p9_req_put(m->client, m->rreq);
+ m->rreq = NULL;
+ err = -EIO;
+ goto error;
+@@ -378,7 +379,7 @@ static void p9_read_work(struct work_struct *work)
+ m->rc.sdata = NULL;
+ m->rc.offset = 0;
+ m->rc.capacity = 0;
+- p9_req_put(m->rreq);
++ p9_req_put(m->client, m->rreq);
+ m->rreq = NULL;
+ }
+
+@@ -492,7 +493,7 @@ static void p9_write_work(struct work_struct *work)
+ m->wpos += err;
+ if (m->wpos == m->wsize) {
+ m->wpos = m->wsize = 0;
+- p9_req_put(m->wreq);
++ p9_req_put(m->client, m->wreq);
+ m->wreq = NULL;
+ }
+
+@@ -695,7 +696,7 @@ static int p9_fd_cancel(struct p9_client *client, struct p9_req_t *req)
+ if (req->status == REQ_STATUS_UNSENT) {
+ list_del(&req->req_list);
+ req->status = REQ_STATUS_FLSHD;
+- p9_req_put(req);
++ p9_req_put(client, req);
+ ret = 0;
+ }
+ spin_unlock(&client->lock);
+@@ -722,7 +723,7 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
+ list_del(&req->req_list);
+ req->status = REQ_STATUS_FLSHD;
+ spin_unlock(&client->lock);
+- p9_req_put(req);
++ p9_req_put(client, req);
+
+ return 0;
+ }
+@@ -883,12 +884,12 @@ static void p9_conn_destroy(struct p9_conn *m)
+ p9_mux_poll_stop(m);
+ cancel_work_sync(&m->rq);
+ if (m->rreq) {
+- p9_req_put(m->rreq);
++ p9_req_put(m->client, m->rreq);
+ m->rreq = NULL;
+ }
+ cancel_work_sync(&m->wq);
+ if (m->wreq) {
+- p9_req_put(m->wreq);
++ p9_req_put(m->client, m->wreq);
+ m->wreq = NULL;
+ }
+
+diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
+index 88e5638266743..d817d3745238b 100644
+--- a/net/9p/trans_rdma.c
++++ b/net/9p/trans_rdma.c
+@@ -350,7 +350,7 @@ send_done(struct ib_cq *cq, struct ib_wc *wc)
+ c->busa, c->req->tc.size,
+ DMA_TO_DEVICE);
+ up(&rdma->sq_sem);
+- p9_req_put(c->req);
++ p9_req_put(client, c->req);
+ kfree(c);
+ }
+
+diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
+index b24a4fb0f0a23..147972bf2e797 100644
+--- a/net/9p/trans_virtio.c
++++ b/net/9p/trans_virtio.c
+@@ -199,7 +199,7 @@ static int p9_virtio_cancel(struct p9_client *client, struct p9_req_t *req)
+ /* Reply won't come, so drop req ref */
+ static int p9_virtio_cancelled(struct p9_client *client, struct p9_req_t *req)
+ {
+- p9_req_put(req);
++ p9_req_put(client, req);
+ return 0;
+ }
+
+@@ -523,7 +523,7 @@ err_out:
+ kvfree(out_pages);
+ if (!kicked) {
+ /* reply won't come */
+- p9_req_put(req);
++ p9_req_put(client, req);
+ }
+ return err;
+ }
+diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
+index 77883b6788cde..4cf0c78d4d225 100644
+--- a/net/9p/trans_xen.c
++++ b/net/9p/trans_xen.c
+@@ -163,7 +163,7 @@ again:
+ ring->intf->out_prod = prod;
+ spin_unlock_irqrestore(&ring->lock, flags);
+ notify_remote_via_irq(ring->irq);
+- p9_req_put(p9_req);
++ p9_req_put(client, p9_req);
+
+ return 0;
+ }
+diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
+index 4c7030ed8d331..5b5363c99ed50 100644
+--- a/net/ax25/af_ax25.c
++++ b/net/ax25/af_ax25.c
+@@ -1065,7 +1065,7 @@ static int ax25_release(struct socket *sock)
+ del_timer_sync(&ax25->t3timer);
+ del_timer_sync(&ax25->idletimer);
+ }
+- dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker);
++ dev_put_track(ax25_dev->dev, &ax25->dev_tracker);
+ ax25_dev_put(ax25_dev);
+ }
+
+@@ -1146,7 +1146,7 @@ static int ax25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
+
+ if (ax25_dev) {
+ ax25_fillin_cb(ax25, ax25_dev);
+- dev_hold_track(ax25_dev->dev, &ax25_dev->dev_tracker, GFP_ATOMIC);
++ dev_hold_track(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
+ }
+
+ done:
+diff --git a/net/batman-adv/trace.h b/net/batman-adv/trace.h
+index d673ebdd04267..31c8f922651d5 100644
+--- a/net/batman-adv/trace.h
++++ b/net/batman-adv/trace.h
+@@ -28,8 +28,6 @@
+
+ #endif /* CONFIG_BATMAN_ADV_TRACING */
+
+-#define BATADV_MAX_MSG_LEN 256
+-
+ TRACE_EVENT(batadv_dbg,
+
+ TP_PROTO(struct batadv_priv *bat_priv,
+@@ -40,16 +38,13 @@ TRACE_EVENT(batadv_dbg,
+ TP_STRUCT__entry(
+ __string(device, bat_priv->soft_iface->name)
+ __string(driver, KBUILD_MODNAME)
+- __dynamic_array(char, msg, BATADV_MAX_MSG_LEN)
++ __vstring(msg, vaf->fmt, vaf->va)
+ ),
+
+ TP_fast_assign(
+ __assign_str(device, bat_priv->soft_iface->name);
+ __assign_str(driver, KBUILD_MODNAME);
+- WARN_ON_ONCE(vsnprintf(__get_dynamic_array(msg),
+- BATADV_MAX_MSG_LEN,
+- vaf->fmt,
+- *vaf->va) >= BATADV_MAX_MSG_LEN);
++ __assign_vstr(msg, vaf->fmt, vaf->va);
+ ),
+
+ TP_printk(
+diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
+index 19df3905c5f8e..2b8b51b894522 100644
+--- a/net/bluetooth/hci_core.c
++++ b/net/bluetooth/hci_core.c
+@@ -593,6 +593,11 @@ static int hci_dev_do_reset(struct hci_dev *hdev)
+ skb_queue_purge(&hdev->rx_q);
+ skb_queue_purge(&hdev->cmd_q);
+
++ /* Cancel these to avoid queueing non-chained pending work */
++ hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
++ cancel_delayed_work(&hdev->cmd_timer);
++ cancel_delayed_work(&hdev->ncmd_timer);
++
+ /* Avoid potential lockdep warnings from the *_flush() calls by
+ * ensuring the workqueue is empty up front.
+ */
+@@ -606,6 +611,8 @@ static int hci_dev_do_reset(struct hci_dev *hdev)
+ if (hdev->flush)
+ hdev->flush(hdev);
+
++ hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
++
+ atomic_set(&hdev->cmd_cnt, 1);
+ hdev->acl_cnt = 0; hdev->sco_cnt = 0; hdev->le_cnt = 0;
+
+@@ -3863,7 +3870,8 @@ static void hci_cmd_work(struct work_struct *work)
+ if (res < 0)
+ __hci_cmd_sync_cancel(hdev, -res);
+
+- if (test_bit(HCI_RESET, &hdev->flags))
++ if (test_bit(HCI_RESET, &hdev->flags) ||
++ hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
+ cancel_delayed_work(&hdev->cmd_timer);
+ else
+ schedule_delayed_work(&hdev->cmd_timer,
+diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
+index af17dfb20e017..7cb956d3abb26 100644
+--- a/net/bluetooth/hci_event.c
++++ b/net/bluetooth/hci_event.c
+@@ -3768,8 +3768,9 @@ static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd)
+ cancel_delayed_work(&hdev->ncmd_timer);
+ atomic_set(&hdev->cmd_cnt, 1);
+ } else {
+- schedule_delayed_work(&hdev->ncmd_timer,
+- HCI_NCMD_TIMEOUT);
++ if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
++ schedule_delayed_work(&hdev->ncmd_timer,
++ HCI_NCMD_TIMEOUT);
+ }
+ }
+ }
+diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
+index 9e2a42299fc09..6f901398132e3 100644
+--- a/net/bluetooth/hci_sync.c
++++ b/net/bluetooth/hci_sync.c
+@@ -1612,6 +1612,9 @@ static int hci_le_add_resolve_list_sync(struct hci_dev *hdev,
+ bacpy(&cp.bdaddr, ¶ms->addr);
+ memcpy(cp.peer_irk, irk->val, 16);
+
++ /* Default privacy mode is always Network */
++ params->privacy_mode = HCI_NETWORK_PRIVACY;
++
+ done:
+ if (hci_dev_test_flag(hdev, HCI_PRIVACY))
+ memcpy(cp.local_irk, hdev->irk, 16);
+@@ -5008,13 +5011,13 @@ static int hci_resume_scan_sync(struct hci_dev *hdev)
+ if (!hdev->scanning_paused)
+ return 0;
+
++ hdev->scanning_paused = false;
++
+ hci_update_scan_sync(hdev);
+
+ /* Reset passive scanning to normal */
+ hci_update_passive_scan_sync(hdev);
+
+- hdev->scanning_paused = false;
+-
+ return 0;
+ }
+
+@@ -5033,7 +5036,6 @@ int hci_resume_sync(struct hci_dev *hdev)
+ return 0;
+
+ hdev->suspended = false;
+- hdev->scanning_paused = false;
+
+ /* Restore event mask */
+ hci_set_event_mask_sync(hdev);
+diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
+index 52668662ae8de..f18d0c72713f1 100644
+--- a/net/bluetooth/l2cap_core.c
++++ b/net/bluetooth/l2cap_core.c
+@@ -1969,11 +1969,11 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
+ bdaddr_t *dst,
+ u8 link_type)
+ {
+- struct l2cap_chan *c, *c1 = NULL;
++ struct l2cap_chan *c, *tmp, *c1 = NULL;
+
+ read_lock(&chan_list_lock);
+
+- list_for_each_entry(c, &chan_list, global_l) {
++ list_for_each_entry_safe(c, tmp, &chan_list, global_l) {
+ if (state && c->state != state)
+ continue;
+
+@@ -1992,11 +1992,10 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
+ dst_match = !bacmp(&c->dst, dst);
+ if (src_match && dst_match) {
+ c = l2cap_chan_hold_unless_zero(c);
+- if (!c)
+- continue;
+-
+- read_unlock(&chan_list_lock);
+- return c;
++ if (c) {
++ read_unlock(&chan_list_lock);
++ return c;
++ }
+ }
+
+ /* Closest match */
+diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
+index ae758ab1b558d..12c1ecdba3f3a 100644
+--- a/net/bluetooth/mgmt.c
++++ b/net/bluetooth/mgmt.c
+@@ -6821,11 +6821,14 @@ static int get_conn_info(struct sock *sk, struct hci_dev *hdev, void *data,
+
+ cmd = mgmt_pending_new(sk, MGMT_OP_GET_CONN_INFO, hdev, data,
+ len);
+- if (!cmd)
++ if (!cmd) {
+ err = -ENOMEM;
+- else
++ } else {
++ hci_conn_hold(conn);
++ cmd->user_data = hci_conn_get(conn);
+ err = hci_cmd_sync_queue(hdev, get_conn_info_sync,
+ cmd, get_conn_info_complete);
++ }
+
+ if (err < 0) {
+ mgmt_cmd_complete(sk, hdev->id, MGMT_OP_GET_CONN_INFO,
+@@ -6837,9 +6840,6 @@ static int get_conn_info(struct sock *sk, struct hci_dev *hdev, void *data,
+ goto unlock;
+ }
+
+- hci_conn_hold(conn);
+- cmd->user_data = hci_conn_get(conn);
+-
+ conn->conn_info_timestamp = jiffies;
+ } else {
+ /* Cache is valid, just reply with values cached in hci_conn */
+diff --git a/net/bpf/bpf_dummy_struct_ops.c b/net/bpf/bpf_dummy_struct_ops.c
+index d0e54e30658ac..e78dadfc58290 100644
+--- a/net/bpf/bpf_dummy_struct_ops.c
++++ b/net/bpf/bpf_dummy_struct_ops.c
+@@ -72,13 +72,16 @@ static int dummy_ops_call_op(void *image, struct bpf_dummy_ops_test_args *args)
+ args->args[3], args->args[4]);
+ }
+
++extern const struct bpf_link_ops bpf_struct_ops_link_lops;
++
+ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
+ union bpf_attr __user *uattr)
+ {
+ const struct bpf_struct_ops *st_ops = &bpf_bpf_dummy_ops;
+ const struct btf_type *func_proto;
+ struct bpf_dummy_ops_test_args *args;
+- struct bpf_tramp_progs *tprogs;
++ struct bpf_tramp_links *tlinks;
++ struct bpf_tramp_link *link = NULL;
+ void *image = NULL;
+ unsigned int op_idx;
+ int prog_ret;
+@@ -92,8 +95,8 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
+ if (IS_ERR(args))
+ return PTR_ERR(args);
+
+- tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(*tprogs), GFP_KERNEL);
+- if (!tprogs) {
++ tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
++ if (!tlinks) {
+ err = -ENOMEM;
+ goto out;
+ }
+@@ -105,8 +108,17 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
+ }
+ set_vm_flush_reset_perms(image);
+
++ link = kzalloc(sizeof(*link), GFP_USER);
++ if (!link) {
++ err = -ENOMEM;
++ goto out;
++ }
++ /* prog doesn't take the ownership of the reference from caller */
++ bpf_prog_inc(prog);
++ bpf_link_init(&link->link, BPF_LINK_TYPE_STRUCT_OPS, &bpf_struct_ops_link_lops, prog);
++
+ op_idx = prog->expected_attach_type;
+- err = bpf_struct_ops_prepare_trampoline(tprogs, prog,
++ err = bpf_struct_ops_prepare_trampoline(tlinks, link,
+ &st_ops->func_models[op_idx],
+ image, image + PAGE_SIZE);
+ if (err < 0)
+@@ -124,7 +136,9 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
+ out:
+ kfree(args);
+ bpf_jit_free_exec(image);
+- kfree(tprogs);
++ if (link)
++ bpf_link_put(&link->link);
++ kfree(tlinks);
+ return err;
+ }
+
+diff --git a/net/core/filter.c b/net/core/filter.c
+index d0b0c163d3f34..a98f34cb5aee7 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -3917,7 +3917,7 @@ static void *bpf_xdp_pointer(struct xdp_buff *xdp, u32 offset, u32 len)
+ offset -= frag_size;
+ }
+ out:
+- return offset + len < size ? addr + offset : NULL;
++ return offset + len <= size ? addr + offset : NULL;
+ }
+
+ BPF_CALL_4(bpf_xdp_load_bytes, struct xdp_buff *, xdp, u32, offset,
+@@ -6642,7 +6642,7 @@ static const struct bpf_func_proto bpf_sk_release_proto = {
+ .func = bpf_sk_release,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+- .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
++ .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON | OBJ_RELEASE,
+ };
+
+ BPF_CALL_5(bpf_xdp_sk_lookup_udp, struct xdp_buff *, ctx,
+diff --git a/net/core/skmsg.c b/net/core/skmsg.c
+index ede0af308f404..f50f8d95b6283 100644
+--- a/net/core/skmsg.c
++++ b/net/core/skmsg.c
+@@ -462,7 +462,7 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
+
+ if (copied == len)
+ break;
+- } while (i != msg_rx->sg.end);
++ } while (!sg_is_last(sge));
+
+ if (unlikely(peek)) {
+ msg_rx = sk_psock_next_msg(psock, msg_rx);
+@@ -472,7 +472,7 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
+ }
+
+ msg_rx->sg.start = i;
+- if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) {
++ if (!sge->length && sg_is_last(sge)) {
+ msg_rx = sk_psock_dequeue_msg(psock);
+ kfree_sk_msg(msg_rx);
+ }
+diff --git a/net/dccp/proto.c b/net/dccp/proto.c
+index a976b4d298925..0ee62506105a8 100644
+--- a/net/dccp/proto.c
++++ b/net/dccp/proto.c
+@@ -736,11 +736,6 @@ int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+
+ lock_sock(sk);
+
+- if (dccp_qpolicy_full(sk)) {
+- rc = -EAGAIN;
+- goto out_release;
+- }
+-
+ timeo = sock_sndtimeo(sk, noblock);
+
+ /*
+@@ -759,6 +754,11 @@ int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ if (skb == NULL)
+ goto out_release;
+
++ if (dccp_qpolicy_full(sk)) {
++ rc = -EAGAIN;
++ goto out_discard;
++ }
++
+ if (sk->sk_state == DCCP_CLOSED) {
+ rc = -ENOTCONN;
+ goto out_discard;
+diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
+index 5c207367b3b4f..9a70032625af1 100644
+--- a/net/ipv4/af_inet.c
++++ b/net/ipv4/af_inet.c
+@@ -1920,6 +1920,8 @@ static int __init inet_init(void)
+
+ sock_skb_cb_check_size(sizeof(struct inet_skb_parm));
+
++ raw_hashinfo_init(&raw_v4_hashinfo);
++
+ rc = proto_register(&tcp_prot, 1);
+ if (rc)
+ goto out;
+diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
+index 55654e335d43d..d631198e09383 100644
+--- a/net/ipv4/inet_hashtables.c
++++ b/net/ipv4/inet_hashtables.c
+@@ -410,13 +410,11 @@ begin:
+ sk_nulls_for_each_rcu(sk, node, &head->chain) {
+ if (sk->sk_hash != hash)
+ continue;
+- if (likely(INET_MATCH(sk, net, acookie,
+- saddr, daddr, ports, dif, sdif))) {
++ if (likely(INET_MATCH(net, sk, acookie, ports, dif, sdif))) {
+ if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
+ goto out;
+- if (unlikely(!INET_MATCH(sk, net, acookie,
+- saddr, daddr, ports,
+- dif, sdif))) {
++ if (unlikely(!INET_MATCH(net, sk, acookie,
++ ports, dif, sdif))) {
+ sock_gen_put(sk);
+ goto begin;
+ }
+@@ -465,8 +463,7 @@ static int __inet_check_established(struct inet_timewait_death_row *death_row,
+ if (sk2->sk_hash != hash)
+ continue;
+
+- if (likely(INET_MATCH(sk2, net, acookie,
+- saddr, daddr, ports, dif, sdif))) {
++ if (likely(INET_MATCH(net, sk2, acookie, ports, dif, sdif))) {
+ if (sk2->sk_state == TCP_TIME_WAIT) {
+ tw = inet_twsk(sk2);
+ if (twsk_unique(sk, sk2, twp))
+@@ -532,16 +529,14 @@ static bool inet_ehash_lookup_by_sk(struct sock *sk,
+ if (esk->sk_hash != sk->sk_hash)
+ continue;
+ if (sk->sk_family == AF_INET) {
+- if (unlikely(INET_MATCH(esk, net, acookie,
+- sk->sk_daddr,
+- sk->sk_rcv_saddr,
++ if (unlikely(INET_MATCH(net, esk, acookie,
+ ports, dif, sdif))) {
+ return true;
+ }
+ }
+ #if IS_ENABLED(CONFIG_IPV6)
+ else if (sk->sk_family == AF_INET6) {
+- if (unlikely(INET6_MATCH(esk, net,
++ if (unlikely(inet6_match(net, esk,
+ &sk->sk_v6_daddr,
+ &sk->sk_v6_rcv_saddr,
+ ports, dif, sdif))) {
+diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
+index c9dd9603f2e73..d269b85630b2b 100644
+--- a/net/ipv4/raw.c
++++ b/net/ipv4/raw.c
+@@ -85,20 +85,19 @@ struct raw_frag_vec {
+ int hlen;
+ };
+
+-struct raw_hashinfo raw_v4_hashinfo = {
+- .lock = __RW_LOCK_UNLOCKED(raw_v4_hashinfo.lock),
+-};
++struct raw_hashinfo raw_v4_hashinfo;
+ EXPORT_SYMBOL_GPL(raw_v4_hashinfo);
+
+ int raw_hash_sk(struct sock *sk)
+ {
+ struct raw_hashinfo *h = sk->sk_prot->h.raw_hash;
+- struct hlist_head *head;
++ struct hlist_nulls_head *hlist;
+
+- head = &h->ht[inet_sk(sk)->inet_num & (RAW_HTABLE_SIZE - 1)];
++ hlist = &h->ht[inet_sk(sk)->inet_num & (RAW_HTABLE_SIZE - 1)];
+
+ write_lock_bh(&h->lock);
+- sk_add_node(sk, head);
++ hlist_nulls_add_head_rcu(&sk->sk_nulls_node, hlist);
++ sock_set_flag(sk, SOCK_RCU_FREE);
+ write_unlock_bh(&h->lock);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
+
+@@ -111,30 +110,25 @@ void raw_unhash_sk(struct sock *sk)
+ struct raw_hashinfo *h = sk->sk_prot->h.raw_hash;
+
+ write_lock_bh(&h->lock);
+- if (sk_del_node_init(sk))
++ if (__sk_nulls_del_node_init_rcu(sk))
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
+ write_unlock_bh(&h->lock);
+ }
+ EXPORT_SYMBOL_GPL(raw_unhash_sk);
+
+-struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,
+- unsigned short num, __be32 raddr, __be32 laddr,
+- int dif, int sdif)
++bool raw_v4_match(struct net *net, struct sock *sk, unsigned short num,
++ __be32 raddr, __be32 laddr, int dif, int sdif)
+ {
+- sk_for_each_from(sk) {
+- struct inet_sock *inet = inet_sk(sk);
+-
+- if (net_eq(sock_net(sk), net) && inet->inet_num == num &&
+- !(inet->inet_daddr && inet->inet_daddr != raddr) &&
+- !(inet->inet_rcv_saddr && inet->inet_rcv_saddr != laddr) &&
+- raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
+- goto found; /* gotcha */
+- }
+- sk = NULL;
+-found:
+- return sk;
++ struct inet_sock *inet = inet_sk(sk);
++
++ if (net_eq(sock_net(sk), net) && inet->inet_num == num &&
++ !(inet->inet_daddr && inet->inet_daddr != raddr) &&
++ !(inet->inet_rcv_saddr && inet->inet_rcv_saddr != laddr) &&
++ raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
++ return true;
++ return false;
+ }
+-EXPORT_SYMBOL_GPL(__raw_v4_lookup);
++EXPORT_SYMBOL_GPL(raw_v4_match);
+
+ /*
+ * 0 - deliver
+@@ -168,23 +162,20 @@ static int icmp_filter(const struct sock *sk, const struct sk_buff *skb)
+ */
+ static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash)
+ {
++ struct net *net = dev_net(skb->dev);
++ struct hlist_nulls_head *hlist;
++ struct hlist_nulls_node *hnode;
+ int sdif = inet_sdif(skb);
+ int dif = inet_iif(skb);
+- struct sock *sk;
+- struct hlist_head *head;
+ int delivered = 0;
+- struct net *net;
+-
+- read_lock(&raw_v4_hashinfo.lock);
+- head = &raw_v4_hashinfo.ht[hash];
+- if (hlist_empty(head))
+- goto out;
+-
+- net = dev_net(skb->dev);
+- sk = __raw_v4_lookup(net, __sk_head(head), iph->protocol,
+- iph->saddr, iph->daddr, dif, sdif);
++ struct sock *sk;
+
+- while (sk) {
++ hlist = &raw_v4_hashinfo.ht[hash];
++ rcu_read_lock();
++ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
++ if (!raw_v4_match(net, sk, iph->protocol,
++ iph->saddr, iph->daddr, dif, sdif))
++ continue;
+ delivered = 1;
+ if ((iph->protocol != IPPROTO_ICMP || !icmp_filter(sk, skb)) &&
+ ip_mc_sf_allow(sk, iph->daddr, iph->saddr,
+@@ -195,31 +186,16 @@ static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash)
+ if (clone)
+ raw_rcv(sk, clone);
+ }
+- sk = __raw_v4_lookup(net, sk_next(sk), iph->protocol,
+- iph->saddr, iph->daddr,
+- dif, sdif);
+ }
+-out:
+- read_unlock(&raw_v4_hashinfo.lock);
++ rcu_read_unlock();
+ return delivered;
+ }
+
+ int raw_local_deliver(struct sk_buff *skb, int protocol)
+ {
+- int hash;
+- struct sock *raw_sk;
+-
+- hash = protocol & (RAW_HTABLE_SIZE - 1);
+- raw_sk = sk_head(&raw_v4_hashinfo.ht[hash]);
+-
+- /* If there maybe a raw socket we must check - if not we
+- * don't care less
+- */
+- if (raw_sk && !raw_v4_input(skb, ip_hdr(skb), hash))
+- raw_sk = NULL;
+-
+- return raw_sk != NULL;
++ int hash = protocol & (RAW_HTABLE_SIZE - 1);
+
++ return raw_v4_input(skb, ip_hdr(skb), hash);
+ }
+
+ static void raw_err(struct sock *sk, struct sk_buff *skb, u32 info)
+@@ -286,31 +262,27 @@ static void raw_err(struct sock *sk, struct sk_buff *skb, u32 info)
+
+ void raw_icmp_error(struct sk_buff *skb, int protocol, u32 info)
+ {
+- int hash;
+- struct sock *raw_sk;
++ struct net *net = dev_net(skb->dev);
++ struct hlist_nulls_head *hlist;
++ struct hlist_nulls_node *hnode;
++ int dif = skb->dev->ifindex;
++ int sdif = inet_sdif(skb);
+ const struct iphdr *iph;
+- struct net *net;
++ struct sock *sk;
++ int hash;
+
+ hash = protocol & (RAW_HTABLE_SIZE - 1);
++ hlist = &raw_v4_hashinfo.ht[hash];
+
+- read_lock(&raw_v4_hashinfo.lock);
+- raw_sk = sk_head(&raw_v4_hashinfo.ht[hash]);
+- if (raw_sk) {
+- int dif = skb->dev->ifindex;
+- int sdif = inet_sdif(skb);
+-
++ rcu_read_lock();
++ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
+ iph = (const struct iphdr *)skb->data;
+- net = dev_net(skb->dev);
+-
+- while ((raw_sk = __raw_v4_lookup(net, raw_sk, protocol,
+- iph->daddr, iph->saddr,
+- dif, sdif)) != NULL) {
+- raw_err(raw_sk, skb, info);
+- raw_sk = sk_next(raw_sk);
+- iph = (const struct iphdr *)skb->data;
+- }
++ if (!raw_v4_match(net, sk, iph->protocol,
++ iph->daddr, iph->saddr, dif, sdif))
++ continue;
++ raw_err(sk, skb, info);
+ }
+- read_unlock(&raw_v4_hashinfo.lock);
++ rcu_read_unlock();
+ }
+
+ static int raw_rcv_skb(struct sock *sk, struct sk_buff *skb)
+@@ -972,44 +944,41 @@ struct proto raw_prot = {
+ };
+
+ #ifdef CONFIG_PROC_FS
+-static struct sock *raw_get_first(struct seq_file *seq)
++static struct sock *raw_get_first(struct seq_file *seq, int bucket)
+ {
+- struct sock *sk;
+ struct raw_hashinfo *h = pde_data(file_inode(seq->file));
+ struct raw_iter_state *state = raw_seq_private(seq);
++ struct hlist_nulls_head *hlist;
++ struct hlist_nulls_node *hnode;
++ struct sock *sk;
+
+- for (state->bucket = 0; state->bucket < RAW_HTABLE_SIZE;
++ for (state->bucket = bucket; state->bucket < RAW_HTABLE_SIZE;
+ ++state->bucket) {
+- sk_for_each(sk, &h->ht[state->bucket])
++ hlist = &h->ht[state->bucket];
++ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
+ if (sock_net(sk) == seq_file_net(seq))
+- goto found;
++ return sk;
++ }
+ }
+- sk = NULL;
+-found:
+- return sk;
++ return NULL;
+ }
+
+ static struct sock *raw_get_next(struct seq_file *seq, struct sock *sk)
+ {
+- struct raw_hashinfo *h = pde_data(file_inode(seq->file));
+ struct raw_iter_state *state = raw_seq_private(seq);
+
+ do {
+- sk = sk_next(sk);
+-try_again:
+- ;
++ sk = sk_nulls_next(sk);
+ } while (sk && sock_net(sk) != seq_file_net(seq));
+
+- if (!sk && ++state->bucket < RAW_HTABLE_SIZE) {
+- sk = sk_head(&h->ht[state->bucket]);
+- goto try_again;
+- }
++ if (!sk)
++ return raw_get_first(seq, state->bucket + 1);
+ return sk;
+ }
+
+ static struct sock *raw_get_idx(struct seq_file *seq, loff_t pos)
+ {
+- struct sock *sk = raw_get_first(seq);
++ struct sock *sk = raw_get_first(seq, 0);
+
+ if (sk)
+ while (pos && (sk = raw_get_next(seq, sk)) != NULL)
+@@ -1018,11 +987,9 @@ static struct sock *raw_get_idx(struct seq_file *seq, loff_t pos)
+ }
+
+ void *raw_seq_start(struct seq_file *seq, loff_t *pos)
+- __acquires(&h->lock)
++ __acquires(RCU)
+ {
+- struct raw_hashinfo *h = pde_data(file_inode(seq->file));
+-
+- read_lock(&h->lock);
++ rcu_read_lock();
+ return *pos ? raw_get_idx(seq, *pos - 1) : SEQ_START_TOKEN;
+ }
+ EXPORT_SYMBOL_GPL(raw_seq_start);
+@@ -1032,7 +999,7 @@ void *raw_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+ struct sock *sk;
+
+ if (v == SEQ_START_TOKEN)
+- sk = raw_get_first(seq);
++ sk = raw_get_first(seq, 0);
+ else
+ sk = raw_get_next(seq, v);
+ ++*pos;
+@@ -1041,11 +1008,9 @@ void *raw_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+ EXPORT_SYMBOL_GPL(raw_seq_next);
+
+ void raw_seq_stop(struct seq_file *seq, void *v)
+- __releases(&h->lock)
++ __releases(RCU)
+ {
+- struct raw_hashinfo *h = pde_data(file_inode(seq->file));
+-
+- read_unlock(&h->lock);
++ rcu_read_unlock();
+ }
+ EXPORT_SYMBOL_GPL(raw_seq_stop);
+
+@@ -1107,6 +1072,7 @@ static __net_initdata struct pernet_operations raw_net_ops = {
+
+ int __init raw_proc_init(void)
+ {
++
+ return register_pernet_subsys(&raw_net_ops);
+ }
+
+diff --git a/net/ipv4/raw_diag.c b/net/ipv4/raw_diag.c
+index ccacbde30a2c5..5f208e840d859 100644
+--- a/net/ipv4/raw_diag.c
++++ b/net/ipv4/raw_diag.c
+@@ -34,57 +34,57 @@ raw_get_hashinfo(const struct inet_diag_req_v2 *r)
+ * use helper to figure it out.
+ */
+
+-static struct sock *raw_lookup(struct net *net, struct sock *from,
+- const struct inet_diag_req_v2 *req)
++static bool raw_lookup(struct net *net, struct sock *sk,
++ const struct inet_diag_req_v2 *req)
+ {
+ struct inet_diag_req_raw *r = (void *)req;
+- struct sock *sk = NULL;
+
+ if (r->sdiag_family == AF_INET)
+- sk = __raw_v4_lookup(net, from, r->sdiag_raw_protocol,
+- r->id.idiag_dst[0],
+- r->id.idiag_src[0],
+- r->id.idiag_if, 0);
++ return raw_v4_match(net, sk, r->sdiag_raw_protocol,
++ r->id.idiag_dst[0],
++ r->id.idiag_src[0],
++ r->id.idiag_if, 0);
+ #if IS_ENABLED(CONFIG_IPV6)
+ else
+- sk = __raw_v6_lookup(net, from, r->sdiag_raw_protocol,
+- (const struct in6_addr *)r->id.idiag_src,
+- (const struct in6_addr *)r->id.idiag_dst,
+- r->id.idiag_if, 0);
++ return raw_v6_match(net, sk, r->sdiag_raw_protocol,
++ (const struct in6_addr *)r->id.idiag_src,
++ (const struct in6_addr *)r->id.idiag_dst,
++ r->id.idiag_if, 0);
+ #endif
+- return sk;
++ return false;
+ }
+
+ static struct sock *raw_sock_get(struct net *net, const struct inet_diag_req_v2 *r)
+ {
+ struct raw_hashinfo *hashinfo = raw_get_hashinfo(r);
+- struct sock *sk = NULL, *s;
++ struct hlist_nulls_head *hlist;
++ struct hlist_nulls_node *hnode;
++ struct sock *sk;
+ int slot;
+
+ if (IS_ERR(hashinfo))
+ return ERR_CAST(hashinfo);
+
+- read_lock(&hashinfo->lock);
++ rcu_read_lock();
+ for (slot = 0; slot < RAW_HTABLE_SIZE; slot++) {
+- sk_for_each(s, &hashinfo->ht[slot]) {
+- sk = raw_lookup(net, s, r);
+- if (sk) {
++ hlist = &hashinfo->ht[slot];
++ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
++ if (raw_lookup(net, sk, r)) {
+ /*
+ * Grab it and keep until we fill
+- * diag meaage to be reported, so
++ * diag message to be reported, so
+ * caller should call sock_put then.
+- * We can do that because we're keeping
+- * hashinfo->lock here.
+ */
+- sock_hold(sk);
+- goto out_unlock;
++ if (refcount_inc_not_zero(&sk->sk_refcnt))
++ goto out_unlock;
+ }
+ }
+ }
++ sk = ERR_PTR(-ENOENT);
+ out_unlock:
+- read_unlock(&hashinfo->lock);
++ rcu_read_unlock();
+
+- return sk ? sk : ERR_PTR(-ENOENT);
++ return sk;
+ }
+
+ static int raw_diag_dump_one(struct netlink_callback *cb,
+@@ -142,6 +142,8 @@ static void raw_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
+ struct raw_hashinfo *hashinfo = raw_get_hashinfo(r);
+ struct net *net = sock_net(skb->sk);
+ struct inet_diag_dump_data *cb_data;
++ struct hlist_nulls_head *hlist;
++ struct hlist_nulls_node *hnode;
+ int num, s_num, slot, s_slot;
+ struct sock *sk = NULL;
+ struct nlattr *bc;
+@@ -158,7 +160,8 @@ static void raw_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
+ for (slot = s_slot; slot < RAW_HTABLE_SIZE; s_num = 0, slot++) {
+ num = 0;
+
+- sk_for_each(sk, &hashinfo->ht[slot]) {
++ hlist = &hashinfo->ht[slot];
++ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
+ struct inet_sock *inet = inet_sk(sk);
+
+ if (!net_eq(sock_net(sk), net))
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index 91735d631a282..51116166e3d21 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -953,6 +953,23 @@ static int tcp_downgrade_zcopy_pure(struct sock *sk, struct sk_buff *skb)
+ return 0;
+ }
+
++static int tcp_wmem_schedule(struct sock *sk, int copy)
++{
++ int left;
++
++ if (likely(sk_wmem_schedule(sk, copy)))
++ return copy;
++
++ /* We could be in trouble if we have nothing queued.
++ * Use whatever is left in sk->sk_forward_alloc and tcp_wmem[0]
++ * to guarantee some progress.
++ */
++ left = sock_net(sk)->ipv4.sysctl_tcp_wmem[0] - sk->sk_wmem_queued;
++ if (left > 0)
++ sk_forced_mem_schedule(sk, min(left, copy));
++ return min(copy, sk->sk_forward_alloc);
++}
++
+ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
+ struct page *page, int offset, size_t *size)
+ {
+@@ -988,7 +1005,11 @@ new_segment:
+ tcp_mark_push(tp, skb);
+ goto new_segment;
+ }
+- if (tcp_downgrade_zcopy_pure(sk, skb) || !sk_wmem_schedule(sk, copy))
++ if (tcp_downgrade_zcopy_pure(sk, skb))
++ return NULL;
++
++ copy = tcp_wmem_schedule(sk, copy);
++ if (!copy)
+ return NULL;
+
+ if (can_coalesce) {
+@@ -1337,8 +1358,11 @@ new_segment:
+
+ copy = min_t(int, copy, pfrag->size - pfrag->offset);
+
+- if (tcp_downgrade_zcopy_pure(sk, skb) ||
+- !sk_wmem_schedule(sk, copy))
++ if (tcp_downgrade_zcopy_pure(sk, skb))
++ goto wait_for_space;
++
++ copy = tcp_wmem_schedule(sk, copy);
++ if (!copy)
+ goto wait_for_space;
+
+ err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
+@@ -1365,7 +1389,8 @@ new_segment:
+ skb_shinfo(skb)->flags |= SKBFL_PURE_ZEROCOPY;
+
+ if (!skb_zcopy_pure(skb)) {
+- if (!sk_wmem_schedule(sk, copy))
++ copy = tcp_wmem_schedule(sk, copy);
++ if (!copy)
+ goto wait_for_space;
+ }
+
+diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
+index a7f0a1f0c2a34..b19090b81ff09 100644
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -3140,7 +3140,7 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
+ struct tcp_sock *tp = tcp_sk(sk);
+ unsigned int cur_mss;
+ int diff, len, err;
+-
++ int avail_wnd;
+
+ /* Inconclusive MTU probe */
+ if (icsk->icsk_mtup.probe_size)
+@@ -3162,17 +3162,25 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
+ return -EHOSTUNREACH; /* Routing failure or similar. */
+
+ cur_mss = tcp_current_mss(sk);
++ avail_wnd = tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq;
+
+ /* If receiver has shrunk his window, and skb is out of
+ * new window, do not retransmit it. The exception is the
+ * case, when window is shrunk to zero. In this case
+- * our retransmit serves as a zero window probe.
++ * our retransmit of one segment serves as a zero window probe.
+ */
+- if (!before(TCP_SKB_CB(skb)->seq, tcp_wnd_end(tp)) &&
+- TCP_SKB_CB(skb)->seq != tp->snd_una)
+- return -EAGAIN;
++ if (avail_wnd <= 0) {
++ if (TCP_SKB_CB(skb)->seq != tp->snd_una)
++ return -EAGAIN;
++ avail_wnd = cur_mss;
++ }
+
+ len = cur_mss * segs;
++ if (len > avail_wnd) {
++ len = rounddown(avail_wnd, cur_mss);
++ if (!len)
++ len = avail_wnd;
++ }
+ if (skb->len > len) {
+ if (tcp_fragment(sk, TCP_FRAG_IN_RTX_QUEUE, skb, len,
+ cur_mss, GFP_ATOMIC))
+@@ -3186,8 +3194,9 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
+ diff -= tcp_skb_pcount(skb);
+ if (diff)
+ tcp_adjust_pcount(sk, skb, diff);
+- if (skb->len < cur_mss)
+- tcp_retrans_try_collapse(sk, skb, cur_mss);
++ avail_wnd = min_t(int, avail_wnd, cur_mss);
++ if (skb->len < avail_wnd)
++ tcp_retrans_try_collapse(sk, skb, avail_wnd);
+ }
+
+ /* RFC3168, section 6.1.1.1. ECN fallback */
+@@ -3358,11 +3367,12 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
+ */
+ void sk_forced_mem_schedule(struct sock *sk, int size)
+ {
+- int amt;
++ int delta, amt;
+
+- if (size <= sk->sk_forward_alloc)
++ delta = size - sk->sk_forward_alloc;
++ if (delta <= 0)
+ return;
+- amt = sk_mem_pages(size);
++ amt = sk_mem_pages(delta);
+ sk->sk_forward_alloc += amt * SK_MEM_QUANTUM;
+ sk_memory_allocated_add(sk, amt);
+
+diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
+index 6b4d8361560fa..201e2533acb29 100644
+--- a/net/ipv4/udp.c
++++ b/net/ipv4/udp.c
+@@ -2564,8 +2564,7 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
+ struct sock *sk;
+
+ udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
+- if (INET_MATCH(sk, net, acookie, rmt_addr,
+- loc_addr, ports, dif, sdif))
++ if (INET_MATCH(net, sk, acookie, ports, dif, sdif))
+ return sk;
+ /* Only check first socket in chain */
+ break;
+diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
+index ef1e6545d8690..095298c5727ac 100644
+--- a/net/ipv6/af_inet6.c
++++ b/net/ipv6/af_inet6.c
+@@ -63,6 +63,7 @@
+ #include <net/compat.h>
+ #include <net/xfrm.h>
+ #include <net/ioam6.h>
++#include <net/rawv6.h>
+
+ #include <linux/uaccess.h>
+ #include <linux/mroute6.h>
+@@ -1074,6 +1075,8 @@ static int __init inet6_init(void)
+ goto out;
+ }
+
++ raw_hashinfo_init(&raw_v6_hashinfo);
++
+ err = proto_register(&tcpv6_prot, 1);
+ if (err)
+ goto out;
+diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
+index 32ccac10bd625..bb4939e368402 100644
+--- a/net/ipv6/inet6_hashtables.c
++++ b/net/ipv6/inet6_hashtables.c
+@@ -71,12 +71,12 @@ begin:
+ sk_nulls_for_each_rcu(sk, node, &head->chain) {
+ if (sk->sk_hash != hash)
+ continue;
+- if (!INET6_MATCH(sk, net, saddr, daddr, ports, dif, sdif))
++ if (!inet6_match(net, sk, saddr, daddr, ports, dif, sdif))
+ continue;
+ if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
+ goto out;
+
+- if (unlikely(!INET6_MATCH(sk, net, saddr, daddr, ports, dif, sdif))) {
++ if (unlikely(!inet6_match(net, sk, saddr, daddr, ports, dif, sdif))) {
+ sock_gen_put(sk);
+ goto begin;
+ }
+@@ -269,7 +269,7 @@ static int __inet6_check_established(struct inet_timewait_death_row *death_row,
+ if (sk2->sk_hash != hash)
+ continue;
+
+- if (likely(INET6_MATCH(sk2, net, saddr, daddr, ports,
++ if (likely(inet6_match(net, sk2, saddr, daddr, ports,
+ dif, sdif))) {
+ if (sk2->sk_state == TCP_TIME_WAIT) {
+ tw = inet_twsk(sk2);
+diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
+index 8bb41f3b246a9..62f0a3d08d4d8 100644
+--- a/net/ipv6/raw.c
++++ b/net/ipv6/raw.c
+@@ -61,46 +61,30 @@
+
+ #define ICMPV6_HDRLEN 4 /* ICMPv6 header, RFC 4443 Section 2.1 */
+
+-struct raw_hashinfo raw_v6_hashinfo = {
+- .lock = __RW_LOCK_UNLOCKED(raw_v6_hashinfo.lock),
+-};
++struct raw_hashinfo raw_v6_hashinfo;
+ EXPORT_SYMBOL_GPL(raw_v6_hashinfo);
+
+-struct sock *__raw_v6_lookup(struct net *net, struct sock *sk,
+- unsigned short num, const struct in6_addr *loc_addr,
+- const struct in6_addr *rmt_addr, int dif, int sdif)
++bool raw_v6_match(struct net *net, struct sock *sk, unsigned short num,
++ const struct in6_addr *loc_addr,
++ const struct in6_addr *rmt_addr, int dif, int sdif)
+ {
+- bool is_multicast = ipv6_addr_is_multicast(loc_addr);
+-
+- sk_for_each_from(sk)
+- if (inet_sk(sk)->inet_num == num) {
+-
+- if (!net_eq(sock_net(sk), net))
+- continue;
+-
+- if (!ipv6_addr_any(&sk->sk_v6_daddr) &&
+- !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr))
+- continue;
+-
+- if (!raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
+- dif, sdif))
+- continue;
+-
+- if (!ipv6_addr_any(&sk->sk_v6_rcv_saddr)) {
+- if (ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr))
+- goto found;
+- if (is_multicast &&
+- inet6_mc_check(sk, loc_addr, rmt_addr))
+- goto found;
+- continue;
+- }
+- goto found;
+- }
+- sk = NULL;
+-found:
+- return sk;
++ if (inet_sk(sk)->inet_num != num ||
++ !net_eq(sock_net(sk), net) ||
++ (!ipv6_addr_any(&sk->sk_v6_daddr) &&
++ !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
++ !raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
++ dif, sdif))
++ return false;
++
++ if (ipv6_addr_any(&sk->sk_v6_rcv_saddr) ||
++ ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr) ||
++ (ipv6_addr_is_multicast(loc_addr) &&
++ inet6_mc_check(sk, loc_addr, rmt_addr)))
++ return true;
++
++ return false;
+ }
+-EXPORT_SYMBOL_GPL(__raw_v6_lookup);
++EXPORT_SYMBOL_GPL(raw_v6_match);
+
+ /*
+ * 0 - deliver
+@@ -156,31 +140,27 @@ EXPORT_SYMBOL(rawv6_mh_filter_unregister);
+ */
+ static bool ipv6_raw_deliver(struct sk_buff *skb, int nexthdr)
+ {
++ struct net *net = dev_net(skb->dev);
++ struct hlist_nulls_head *hlist;
++ struct hlist_nulls_node *hnode;
+ const struct in6_addr *saddr;
+ const struct in6_addr *daddr;
+ struct sock *sk;
+ bool delivered = false;
+ __u8 hash;
+- struct net *net;
+
+ saddr = &ipv6_hdr(skb)->saddr;
+ daddr = saddr + 1;
+
+ hash = nexthdr & (RAW_HTABLE_SIZE - 1);
+-
+- read_lock(&raw_v6_hashinfo.lock);
+- sk = sk_head(&raw_v6_hashinfo.ht[hash]);
+-
+- if (!sk)
+- goto out;
+-
+- net = dev_net(skb->dev);
+- sk = __raw_v6_lookup(net, sk, nexthdr, daddr, saddr,
+- inet6_iif(skb), inet6_sdif(skb));
+-
+- while (sk) {
++ hlist = &raw_v6_hashinfo.ht[hash];
++ rcu_read_lock();
++ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
+ int filtered;
+
++ if (!raw_v6_match(net, sk, nexthdr, daddr, saddr,
++ inet6_iif(skb), inet6_sdif(skb)))
++ continue;
+ delivered = true;
+ switch (nexthdr) {
+ case IPPROTO_ICMPV6:
+@@ -219,23 +199,14 @@ static bool ipv6_raw_deliver(struct sk_buff *skb, int nexthdr)
+ rawv6_rcv(sk, clone);
+ }
+ }
+- sk = __raw_v6_lookup(net, sk_next(sk), nexthdr, daddr, saddr,
+- inet6_iif(skb), inet6_sdif(skb));
+ }
+-out:
+- read_unlock(&raw_v6_hashinfo.lock);
++ rcu_read_unlock();
+ return delivered;
+ }
+
+ bool raw6_local_deliver(struct sk_buff *skb, int nexthdr)
+ {
+- struct sock *raw_sk;
+-
+- raw_sk = sk_head(&raw_v6_hashinfo.ht[nexthdr & (RAW_HTABLE_SIZE - 1)]);
+- if (raw_sk && !ipv6_raw_deliver(skb, nexthdr))
+- raw_sk = NULL;
+-
+- return raw_sk != NULL;
++ return ipv6_raw_deliver(skb, nexthdr);
+ }
+
+ /* This cleans up af_inet6 a bit. -DaveM */
+@@ -361,30 +332,25 @@ static void rawv6_err(struct sock *sk, struct sk_buff *skb,
+ void raw6_icmp_error(struct sk_buff *skb, int nexthdr,
+ u8 type, u8 code, int inner_offset, __be32 info)
+ {
++ struct net *net = dev_net(skb->dev);
++ struct hlist_nulls_head *hlist;
++ struct hlist_nulls_node *hnode;
+ struct sock *sk;
+ int hash;
+- const struct in6_addr *saddr, *daddr;
+- struct net *net;
+
+ hash = nexthdr & (RAW_HTABLE_SIZE - 1);
+-
+- read_lock(&raw_v6_hashinfo.lock);
+- sk = sk_head(&raw_v6_hashinfo.ht[hash]);
+- if (sk) {
++ hlist = &raw_v6_hashinfo.ht[hash];
++ rcu_read_lock();
++ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
+ /* Note: ipv6_hdr(skb) != skb->data */
+ const struct ipv6hdr *ip6h = (const struct ipv6hdr *)skb->data;
+- saddr = &ip6h->saddr;
+- daddr = &ip6h->daddr;
+- net = dev_net(skb->dev);
+-
+- while ((sk = __raw_v6_lookup(net, sk, nexthdr, saddr, daddr,
+- inet6_iif(skb), inet6_iif(skb)))) {
+- rawv6_err(sk, skb, NULL, type, code,
+- inner_offset, info);
+- sk = sk_next(sk);
+- }
++
++ if (!raw_v6_match(net, sk, nexthdr, &ip6h->saddr, &ip6h->daddr,
++ inet6_iif(skb), inet6_iif(skb)))
++ continue;
++ rawv6_err(sk, skb, NULL, type, code, inner_offset, info);
+ }
+- read_unlock(&raw_v6_hashinfo.lock);
++ rcu_read_unlock();
+ }
+
+ static inline int rawv6_rcv_skb(struct sock *sk, struct sk_buff *skb)
+diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
+index aea28bf701be4..544842fc831e4 100644
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -1044,7 +1044,7 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
+
+ udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
+ if (sk->sk_state == TCP_ESTABLISHED &&
+- INET6_MATCH(sk, net, rmt_addr, loc_addr, ports, dif, sdif))
++ inet6_match(net, sk, rmt_addr, loc_addr, ports, dif, sdif))
+ return sk;
+ /* Only check first socket in chain */
+ break;
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index 07b5a2044cab4..9c25e6a9a2795 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -323,9 +323,10 @@ static bool mptcp_rmem_schedule(struct sock *sk, struct sock *ssk, int size)
+ struct mptcp_sock *msk = mptcp_sk(sk);
+ int amt, amount;
+
+- if (size < msk->rmem_fwd_alloc)
++ if (size <= msk->rmem_fwd_alloc)
+ return true;
+
++ size -= msk->rmem_fwd_alloc;
+ amt = sk_mem_pages(size);
+ amount = amt << SK_MEM_QUANTUM_SHIFT;
+ msk->rmem_fwd_alloc += amount;
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index de3dc35ce6095..2b60923cf87b0 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -153,6 +153,7 @@ static struct nft_trans *nft_trans_alloc_gfp(const struct nft_ctx *ctx,
+ if (trans == NULL)
+ return NULL;
+
++ INIT_LIST_HEAD(&trans->list);
+ trans->msg_type = msg_type;
+ trans->ctx = *ctx;
+
+@@ -2472,6 +2473,7 @@ err:
+ }
+
+ static struct nft_chain *nft_chain_lookup_byid(const struct net *net,
++ const struct nft_table *table,
+ const struct nlattr *nla)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+@@ -2482,6 +2484,7 @@ static struct nft_chain *nft_chain_lookup_byid(const struct net *net,
+ struct nft_chain *chain = trans->ctx.chain;
+
+ if (trans->msg_type == NFT_MSG_NEWCHAIN &&
++ chain->table == table &&
+ id == nft_trans_chain_id(trans))
+ return chain;
+ }
+@@ -3369,6 +3372,7 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
+ }
+
+ static struct nft_rule *nft_rule_lookup_byid(const struct net *net,
++ const struct nft_chain *chain,
+ const struct nlattr *nla);
+
+ #define NFT_RULE_MAXEXPRS 128
+@@ -3415,7 +3419,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ return -EOPNOTSUPP;
+
+ } else if (nla[NFTA_RULE_CHAIN_ID]) {
+- chain = nft_chain_lookup_byid(net, nla[NFTA_RULE_CHAIN_ID]);
++ chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID]);
+ if (IS_ERR(chain)) {
+ NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN_ID]);
+ return PTR_ERR(chain);
+@@ -3457,7 +3461,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ return PTR_ERR(old_rule);
+ }
+ } else if (nla[NFTA_RULE_POSITION_ID]) {
+- old_rule = nft_rule_lookup_byid(net, nla[NFTA_RULE_POSITION_ID]);
++ old_rule = nft_rule_lookup_byid(net, chain, nla[NFTA_RULE_POSITION_ID]);
+ if (IS_ERR(old_rule)) {
+ NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION_ID]);
+ return PTR_ERR(old_rule);
+@@ -3602,6 +3606,7 @@ err_release_expr:
+ }
+
+ static struct nft_rule *nft_rule_lookup_byid(const struct net *net,
++ const struct nft_chain *chain,
+ const struct nlattr *nla)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+@@ -3612,6 +3617,7 @@ static struct nft_rule *nft_rule_lookup_byid(const struct net *net,
+ struct nft_rule *rule = nft_trans_rule(trans);
+
+ if (trans->msg_type == NFT_MSG_NEWRULE &&
++ trans->ctx.chain == chain &&
+ id == nft_trans_rule_id(trans))
+ return rule;
+ }
+@@ -3661,7 +3667,7 @@ static int nf_tables_delrule(struct sk_buff *skb, const struct nfnl_info *info,
+
+ err = nft_delrule(&ctx, rule);
+ } else if (nla[NFTA_RULE_ID]) {
+- rule = nft_rule_lookup_byid(net, nla[NFTA_RULE_ID]);
++ rule = nft_rule_lookup_byid(net, chain, nla[NFTA_RULE_ID]);
+ if (IS_ERR(rule)) {
+ NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_ID]);
+ return PTR_ERR(rule);
+@@ -3840,6 +3846,7 @@ static struct nft_set *nft_set_lookup_byhandle(const struct nft_table *table,
+ }
+
+ static struct nft_set *nft_set_lookup_byid(const struct net *net,
++ const struct nft_table *table,
+ const struct nlattr *nla, u8 genmask)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+@@ -3851,6 +3858,7 @@ static struct nft_set *nft_set_lookup_byid(const struct net *net,
+ struct nft_set *set = nft_trans_set(trans);
+
+ if (id == nft_trans_set_id(trans) &&
++ set->table == table &&
+ nft_active_genmask(set, genmask))
+ return set;
+ }
+@@ -3871,7 +3879,7 @@ struct nft_set *nft_set_lookup_global(const struct net *net,
+ if (!nla_set_id)
+ return set;
+
+- set = nft_set_lookup_byid(net, nla_set_id, genmask);
++ set = nft_set_lookup_byid(net, table, nla_set_id, genmask);
+ }
+ return set;
+ }
+@@ -9595,7 +9603,7 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
+ tb[NFTA_VERDICT_CHAIN],
+ genmask);
+ } else if (tb[NFTA_VERDICT_CHAIN_ID]) {
+- chain = nft_chain_lookup_byid(ctx->net,
++ chain = nft_chain_lookup_byid(ctx->net, ctx->table,
+ tb[NFTA_VERDICT_CHAIN_ID]);
+ if (IS_ERR(chain))
+ return PTR_ERR(chain);
+diff --git a/net/netfilter/nft_queue.c b/net/netfilter/nft_queue.c
+index 15e4b7640dc00..da29e92c03e27 100644
+--- a/net/netfilter/nft_queue.c
++++ b/net/netfilter/nft_queue.c
+@@ -68,6 +68,31 @@ static void nft_queue_sreg_eval(const struct nft_expr *expr,
+ regs->verdict.code = ret;
+ }
+
++static int nft_queue_validate(const struct nft_ctx *ctx,
++ const struct nft_expr *expr,
++ const struct nft_data **data)
++{
++ static const unsigned int supported_hooks = ((1 << NF_INET_PRE_ROUTING) |
++ (1 << NF_INET_LOCAL_IN) |
++ (1 << NF_INET_FORWARD) |
++ (1 << NF_INET_LOCAL_OUT) |
++ (1 << NF_INET_POST_ROUTING));
++
++ switch (ctx->family) {
++ case NFPROTO_IPV4:
++ case NFPROTO_IPV6:
++ case NFPROTO_INET:
++ case NFPROTO_BRIDGE:
++ break;
++ case NFPROTO_NETDEV: /* lacks okfn */
++ fallthrough;
++ default:
++ return -EOPNOTSUPP;
++ }
++
++ return nft_chain_validate_hooks(ctx->chain, supported_hooks);
++}
++
+ static const struct nla_policy nft_queue_policy[NFTA_QUEUE_MAX + 1] = {
+ [NFTA_QUEUE_NUM] = { .type = NLA_U16 },
+ [NFTA_QUEUE_TOTAL] = { .type = NLA_U16 },
+@@ -164,6 +189,7 @@ static const struct nft_expr_ops nft_queue_ops = {
+ .eval = nft_queue_eval,
+ .init = nft_queue_init,
+ .dump = nft_queue_dump,
++ .validate = nft_queue_validate,
+ .reduce = NFT_REDUCE_READONLY,
+ };
+
+@@ -173,6 +199,7 @@ static const struct nft_expr_ops nft_queue_sreg_ops = {
+ .eval = nft_queue_sreg_eval,
+ .init = nft_queue_sreg_init,
+ .dump = nft_queue_sreg_dump,
++ .validate = nft_queue_validate,
+ .reduce = NFT_REDUCE_READONLY,
+ };
+
+diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
+index bf2d986a6bc39..a8e3ec800a9c8 100644
+--- a/net/rose/af_rose.c
++++ b/net/rose/af_rose.c
+@@ -192,6 +192,7 @@ static void rose_kill_by_device(struct net_device *dev)
+ rose_disconnect(s, ENETUNREACH, ROSE_OUT_OF_ORDER, 0);
+ if (rose->neighbour)
+ rose->neighbour->use--;
++ dev_put(rose->device);
+ rose->device = NULL;
+ }
+ }
+@@ -592,6 +593,8 @@ static struct sock *rose_make_new(struct sock *osk)
+ rose->idle = orose->idle;
+ rose->defer = orose->defer;
+ rose->device = orose->device;
++ if (rose->device)
++ dev_hold(rose->device);
+ rose->qbitincl = orose->qbitincl;
+
+ return sk;
+@@ -645,6 +648,7 @@ static int rose_release(struct socket *sock)
+ break;
+ }
+
++ dev_put(rose->device);
+ sock->sk = NULL;
+ release_sock(sk);
+ sock_put(sk);
+@@ -721,7 +725,6 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le
+ struct rose_sock *rose = rose_sk(sk);
+ struct sockaddr_rose *addr = (struct sockaddr_rose *)uaddr;
+ unsigned char cause, diagnostic;
+- struct net_device *dev;
+ ax25_uid_assoc *user;
+ int n, err = 0;
+
+@@ -778,9 +781,12 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le
+ }
+
+ if (sock_flag(sk, SOCK_ZAPPED)) { /* Must bind first - autobinding in this may or may not work */
++ struct net_device *dev;
++
+ sock_reset_flag(sk, SOCK_ZAPPED);
+
+- if ((dev = rose_dev_first()) == NULL) {
++ dev = rose_dev_first();
++ if (!dev) {
+ err = -ENETUNREACH;
+ goto out_release;
+ }
+@@ -788,6 +794,7 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le
+ user = ax25_findbyuid(current_euid());
+ if (!user) {
+ err = -EINVAL;
++ dev_put(dev);
+ goto out_release;
+ }
+
+diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c
+index 5e93510fa3d24..09ad2d9e63c9c 100644
+--- a/net/rose/rose_route.c
++++ b/net/rose/rose_route.c
+@@ -615,6 +615,8 @@ struct net_device *rose_dev_first(void)
+ if (first == NULL || strncmp(dev->name, first->name, 3) < 0)
+ first = dev;
+ }
++ if (first)
++ dev_hold(first);
+ rcu_read_unlock();
+
+ return first;
+diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
+index a35ab8c27866e..3f935cbbaff66 100644
+--- a/net/sched/cls_route.c
++++ b/net/sched/cls_route.c
+@@ -526,7 +526,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
+ rcu_assign_pointer(f->next, f1);
+ rcu_assign_pointer(*fp, f);
+
+- if (fold && fold->handle && f->handle != fold->handle) {
++ if (fold) {
+ th = to_hash(fold->handle);
+ h = from_hash(fold->handle >> 16);
+ b = rtnl_dereference(head->table[th]);
+diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
+index 48585c4d04ade..0273bf7375e26 100644
+--- a/scripts/Makefile.modpost
++++ b/scripts/Makefile.modpost
+@@ -87,8 +87,7 @@ obj := $(KBUILD_EXTMOD)
+ src := $(obj)
+
+ # Include the module's Makefile to find KBUILD_EXTRA_SYMBOLS
+-include $(if $(wildcard $(KBUILD_EXTMOD)/Kbuild), \
+- $(KBUILD_EXTMOD)/Kbuild, $(KBUILD_EXTMOD)/Makefile)
++include $(if $(wildcard $(src)/Kbuild), $(src)/Kbuild, $(src)/Makefile)
+
+ # modpost option for external modules
+ MODPOST += -e
+diff --git a/scripts/faddr2line b/scripts/faddr2line
+index 94ed98dd899f3..57099687e5e1d 100755
+--- a/scripts/faddr2line
++++ b/scripts/faddr2line
+@@ -112,7 +112,9 @@ __faddr2line() {
+ # section offsets.
+ local file_type=$(${READELF} --file-header $objfile |
+ ${AWK} '$1 == "Type:" { print $2; exit }')
+- [[ $file_type = "EXEC" ]] && is_vmlinux=1
++ if [[ $file_type = "EXEC" ]] || [[ $file_type == "DYN" ]]; then
++ is_vmlinux=1
++ fi
+
+ # Go through each of the object's symbols which match the func name.
+ # In rare cases there might be duplicates, in which case we print all
+diff --git a/scripts/gdb/linux/dmesg.py b/scripts/gdb/linux/dmesg.py
+index d5983cf3db7d0..c771831eb077d 100644
+--- a/scripts/gdb/linux/dmesg.py
++++ b/scripts/gdb/linux/dmesg.py
+@@ -22,7 +22,6 @@ prb_desc_type = utils.CachedType("struct prb_desc")
+ prb_desc_ring_type = utils.CachedType("struct prb_desc_ring")
+ prb_data_ring_type = utils.CachedType("struct prb_data_ring")
+ printk_ringbuffer_type = utils.CachedType("struct printk_ringbuffer")
+-atomic_long_type = utils.CachedType("atomic_long_t")
+
+ class LxDmesg(gdb.Command):
+ """Print Linux kernel log buffer."""
+@@ -68,8 +67,6 @@ class LxDmesg(gdb.Command):
+ off = prb_data_ring_type.get_type()['data'].bitpos // 8
+ text_data_addr = utils.read_ulong(text_data_ring, off)
+
+- counter_off = atomic_long_type.get_type()['counter'].bitpos // 8
+-
+ sv_off = prb_desc_type.get_type()['state_var'].bitpos // 8
+
+ off = prb_desc_type.get_type()['text_blk_lpos'].bitpos // 8
+@@ -89,9 +86,9 @@ class LxDmesg(gdb.Command):
+
+ # read in tail and head descriptor ids
+ off = prb_desc_ring_type.get_type()['tail_id'].bitpos // 8
+- tail_id = utils.read_u64(desc_ring, off + counter_off)
++ tail_id = utils.read_atomic_long(desc_ring, off)
+ off = prb_desc_ring_type.get_type()['head_id'].bitpos // 8
+- head_id = utils.read_u64(desc_ring, off + counter_off)
++ head_id = utils.read_atomic_long(desc_ring, off)
+
+ did = tail_id
+ while True:
+@@ -102,7 +99,7 @@ class LxDmesg(gdb.Command):
+ desc = utils.read_memoryview(inf, desc_addr + desc_off, desc_sz).tobytes()
+
+ # skip non-committed record
+- state = 3 & (utils.read_u64(desc, sv_off + counter_off) >> desc_flags_shift)
++ state = 3 & (utils.read_atomic_long(desc, sv_off) >> desc_flags_shift)
+ if state != desc_committed and state != desc_finalized:
+ if did == head_id:
+ break
+diff --git a/scripts/gdb/linux/utils.py b/scripts/gdb/linux/utils.py
+index ff7c1799d588f..1553f68716cc2 100644
+--- a/scripts/gdb/linux/utils.py
++++ b/scripts/gdb/linux/utils.py
+@@ -35,13 +35,12 @@ class CachedType:
+
+
+ long_type = CachedType("long")
+-
++atomic_long_type = CachedType("atomic_long_t")
+
+ def get_long_type():
+ global long_type
+ return long_type.get_type()
+
+-
+ def offset_of(typeobj, field):
+ element = gdb.Value(0).cast(typeobj)
+ return int(str(element[field].address).split()[0], 16)
+@@ -129,6 +128,17 @@ def read_ulong(buffer, offset):
+ else:
+ return read_u32(buffer, offset)
+
++atomic_long_counter_offset = atomic_long_type.get_type()['counter'].bitpos
++atomic_long_counter_sizeof = atomic_long_type.get_type()['counter'].type.sizeof
++
++def read_atomic_long(buffer, offset):
++ global atomic_long_counter_offset
++ global atomic_long_counter_sizeof
++
++ if atomic_long_counter_sizeof == 8:
++ return read_u64(buffer, offset + atomic_long_counter_offset)
++ else:
++ return read_u32(buffer, offset + atomic_long_counter_offset)
+
+ target_arch = None
+
+diff --git a/security/selinux/ss/policydb.h b/security/selinux/ss/policydb.h
+index c24d4e1063ea0..ffc4e7bad2054 100644
+--- a/security/selinux/ss/policydb.h
++++ b/security/selinux/ss/policydb.h
+@@ -370,6 +370,8 @@ static inline int put_entry(const void *buf, size_t bytes, int num, struct polic
+ {
+ size_t len = bytes * num;
+
++ if (len > fp->len)
++ return -EINVAL;
+ memcpy(fp->data, buf, len);
+ fp->data += len;
+ fp->len -= len;
+diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
+index 6901dc07680de..cad54f454d013 100644
+--- a/security/selinux/ss/services.c
++++ b/security/selinux/ss/services.c
+@@ -4049,6 +4049,7 @@ int security_read_policy(struct selinux_state *state,
+ int security_read_state_kernel(struct selinux_state *state,
+ void **data, size_t *len)
+ {
++ int err;
+ struct selinux_policy *policy;
+
+ policy = rcu_dereference_protected(
+@@ -4061,5 +4062,11 @@ int security_read_state_kernel(struct selinux_state *state,
+ if (!*data)
+ return -ENOMEM;
+
+- return __security_read_policy(policy, *data, len);
++ err = __security_read_policy(policy, *data, len);
++ if (err) {
++ vfree(*data);
++ *data = NULL;
++ *len = 0;
++ }
++ return err;
+ }
+diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
+index 678fbcaf2a3bc..6807b4708a176 100644
+--- a/sound/pci/hda/patch_cirrus.c
++++ b/sound/pci/hda/patch_cirrus.c
+@@ -395,6 +395,7 @@ static const struct snd_pci_quirk cs420x_fixup_tbl[] = {
+
+ /* codec SSID */
+ SND_PCI_QUIRK(0x106b, 0x0600, "iMac 14,1", CS420X_IMAC27_122),
++ SND_PCI_QUIRK(0x106b, 0x0900, "iMac 12,1", CS420X_IMAC27_122),
+ SND_PCI_QUIRK(0x106b, 0x1c00, "MacBookPro 8,1", CS420X_MBP81),
+ SND_PCI_QUIRK(0x106b, 0x2000, "iMac 12,2", CS420X_IMAC27_122),
+ SND_PCI_QUIRK(0x106b, 0x2800, "MacBookPro 10,1", CS420X_MBP101),
+diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
+index 8c68e6e1387ef..2bc9274e0960b 100644
+--- a/sound/pci/hda/patch_conexant.c
++++ b/sound/pci/hda/patch_conexant.c
+@@ -222,6 +222,7 @@ enum {
+ CXT_PINCFG_LEMOTE_A1205,
+ CXT_PINCFG_COMPAQ_CQ60,
+ CXT_FIXUP_STEREO_DMIC,
++ CXT_PINCFG_LENOVO_NOTEBOOK,
+ CXT_FIXUP_INC_MIC_BOOST,
+ CXT_FIXUP_HEADPHONE_MIC_PIN,
+ CXT_FIXUP_HEADPHONE_MIC,
+@@ -772,6 +773,14 @@ static const struct hda_fixup cxt_fixups[] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = cxt_fixup_stereo_dmic,
+ },
++ [CXT_PINCFG_LENOVO_NOTEBOOK] = {
++ .type = HDA_FIXUP_PINS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x1a, 0x05d71030 },
++ { }
++ },
++ .chain_id = CXT_FIXUP_STEREO_DMIC,
++ },
+ [CXT_FIXUP_INC_MIC_BOOST] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = cxt5066_increase_mic_boost,
+@@ -971,7 +980,7 @@ static const struct snd_pci_quirk cxt5066_fixups[] = {
+ SND_PCI_QUIRK(0x17aa, 0x3905, "Lenovo G50-30", CXT_FIXUP_STEREO_DMIC),
+ SND_PCI_QUIRK(0x17aa, 0x390b, "Lenovo G50-80", CXT_FIXUP_STEREO_DMIC),
+ SND_PCI_QUIRK(0x17aa, 0x3975, "Lenovo U300s", CXT_FIXUP_STEREO_DMIC),
+- SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_FIXUP_STEREO_DMIC),
++ SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_PINCFG_LENOVO_NOTEBOOK),
+ SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo G50-70", CXT_FIXUP_STEREO_DMIC),
+ SND_PCI_QUIRK(0x17aa, 0x397b, "Lenovo S205", CXT_FIXUP_STEREO_DMIC),
+ SND_PCI_QUIRK_VENDOR(0x17aa, "Thinkpad", CXT_FIXUP_THINKPAD_ACPI),
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index b7f1f2fb60cbc..9c7335721cd0b 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -6839,6 +6839,43 @@ static void alc_fixup_dell4_mic_no_presence_quiet(struct hda_codec *codec,
+ }
+ }
+
++static void alc287_fixup_yoga9_14iap7_bass_spk_pin(struct hda_codec *codec,
++ const struct hda_fixup *fix, int action)
++{
++ /*
++ * The Pin Complex 0x17 for the bass speakers is wrongly reported as
++ * unconnected.
++ */
++ static const struct hda_pintbl pincfgs[] = {
++ { 0x17, 0x90170121 },
++ { }
++ };
++ /*
++ * Avoid DAC 0x06 and 0x08, as they have no volume controls.
++ * DAC 0x02 and 0x03 would be fine.
++ */
++ static const hda_nid_t conn[] = { 0x02, 0x03 };
++ /*
++ * Prefer both speakerbar (0x14) and bass speakers (0x17) connected to DAC 0x02.
++ * Headphones (0x21) are connected to DAC 0x03.
++ */
++ static const hda_nid_t preferred_pairs[] = {
++ 0x14, 0x02,
++ 0x17, 0x02,
++ 0x21, 0x03,
++ 0
++ };
++ struct alc_spec *spec = codec->spec;
++
++ switch (action) {
++ case HDA_FIXUP_ACT_PRE_PROBE:
++ snd_hda_apply_pincfgs(codec, pincfgs);
++ snd_hda_override_conn_list(codec, 0x17, ARRAY_SIZE(conn), conn);
++ spec->gen.preferred_dacs = preferred_pairs;
++ break;
++ }
++}
++
+ enum {
+ ALC269_FIXUP_GPIO2,
+ ALC269_FIXUP_SONY_VAIO,
+@@ -6894,6 +6931,7 @@ enum {
+ ALC269_FIXUP_LIMIT_INT_MIC_BOOST,
+ ALC269VB_FIXUP_ASUS_ZENBOOK,
+ ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A,
++ ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE,
+ ALC269_FIXUP_LIMIT_INT_MIC_BOOST_MUTE_LED,
+ ALC269VB_FIXUP_ORDISSIMO_EVE2,
+ ALC283_FIXUP_CHROME_BOOK,
+@@ -7075,6 +7113,8 @@ enum {
+ ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED,
+ ALC285_FIXUP_HP_SPEAKERS_MICMUTE_LED,
+ ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE,
++ ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK,
++ ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN,
+ };
+
+ /* A special fixup for Lenovo C940 and Yoga Duet 7;
+@@ -7479,6 +7519,15 @@ static const struct hda_fixup alc269_fixups[] = {
+ .chained = true,
+ .chain_id = ALC269VB_FIXUP_ASUS_ZENBOOK,
+ },
++ [ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE] = {
++ .type = HDA_FIXUP_PINS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x18, 0x01a110f0 }, /* use as headset mic */
++ { }
++ },
++ .chained = true,
++ .chain_id = ALC269_FIXUP_HEADSET_MIC
++ },
+ [ALC269_FIXUP_LIMIT_INT_MIC_BOOST_MUTE_LED] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc269_fixup_limit_int_mic_boost,
+@@ -8917,6 +8966,74 @@ static const struct hda_fixup alc269_fixups[] = {
+ .chained = true,
+ .chain_id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC
+ },
++ [ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK] = {
++ .type = HDA_FIXUP_VERBS,
++ .v.verbs = (const struct hda_verb[]) {
++ // enable left speaker
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x41 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xc },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x1a },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xf },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x42 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x10 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x40 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x2 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ // enable right speaker
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x46 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xc },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x2a },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xf },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x46 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x10 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x44 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x2 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
++
++ { },
++ },
++ },
++ [ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN] = {
++ .type = HDA_FIXUP_FUNC,
++ .v.func = alc287_fixup_yoga9_14iap7_bass_spk_pin,
++ .chained = true,
++ .chain_id = ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK,
++ },
+ };
+
+ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+@@ -9096,6 +9213,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x861f, "HP Elite Dragonfly G1", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x869d, "HP", ALC236_FIXUP_HP_MUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x86c7, "HP Envy AiO 32", ALC274_FIXUP_HP_ENVY_GPIO),
++ SND_PCI_QUIRK(0x103c, 0x86e7, "HP Spectre x360 15-eb0xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
++ SND_PCI_QUIRK(0x103c, 0x86e8, "HP Spectre x360 15-eb0xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
+ SND_PCI_QUIRK(0x103c, 0x8716, "HP Elite Dragonfly G2 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8720, "HP EliteBook x360 1040 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8724, "HP EliteBook 850 G7", ALC285_FIXUP_HP_GPIO_LED),
+@@ -9111,6 +9230,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8783, "HP ZBook Fury 15 G7 Mobile Workstation",
+ ALC285_FIXUP_HP_GPIO_AMP_INIT),
++ SND_PCI_QUIRK(0x103c, 0x8786, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x8787, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x8788, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x87c8, "HP", ALC287_FIXUP_HP_GPIO_LED),
+@@ -9180,6 +9300,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1043, 0x12a0, "ASUS X441UV", ALC233_FIXUP_EAPD_COEF_AND_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1043, 0x12e0, "ASUS X541SA", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x12f0, "ASUS X541UV", ALC256_FIXUP_ASUS_MIC),
++ SND_PCI_QUIRK(0x1043, 0x1313, "Asus K42JZ", ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1043, 0x13b0, "ASUS Z550SA", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1427, "Asus Zenbook UX31E", ALC269VB_FIXUP_ASUS_ZENBOOK),
+ SND_PCI_QUIRK(0x1043, 0x1517, "Asus Zenbook UX31A", ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A),
+@@ -9255,6 +9376,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1558, 0x4018, "Clevo NV40M[BE]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x4019, "Clevo NV40MZ", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x4020, "Clevo NV40MB", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
++ SND_PCI_QUIRK(0x1558, 0x4041, "Clevo NV4[15]PZ", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x40a1, "Clevo NL40GU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x40c1, "Clevo NL40[CZ]U", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x40d1, "Clevo NL41DU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+@@ -9367,6 +9489,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x17aa, 0x3176, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
+ SND_PCI_QUIRK(0x17aa, 0x3178, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
+ SND_PCI_QUIRK(0x17aa, 0x31af, "ThinkCentre Station", ALC623_FIXUP_LENOVO_THINKSTATION_P340),
++ SND_PCI_QUIRK(0x17aa, 0x3801, "Lenovo Yoga9 14IAP7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN),
+ SND_PCI_QUIRK(0x17aa, 0x3802, "Lenovo Yoga DuetITL 2021", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS),
+ SND_PCI_QUIRK(0x17aa, 0x3813, "Legion 7i 15IMHG05", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS),
+ SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940 / Yoga Duet 7", ALC298_FIXUP_LENOVO_C940_DUET7),
+@@ -9612,6 +9735,7 @@ static const struct hda_model_fixup alc269_fixup_models[] = {
+ {.id = ALC285_FIXUP_HP_SPECTRE_X360, .name = "alc285-hp-spectre-x360"},
+ {.id = ALC285_FIXUP_HP_SPECTRE_X360_EB1, .name = "alc285-hp-spectre-x360-eb1"},
+ {.id = ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP, .name = "alc287-ideapad-bass-spk-amp"},
++ {.id = ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN, .name = "alc287-yoga9-bass-spk-pin"},
+ {.id = ALC623_FIXUP_LENOVO_THINKSTATION_P340, .name = "alc623-lenovo-thinkstation-p340"},
+ {.id = ALC255_FIXUP_ACER_HEADPHONE_AND_MIC, .name = "alc255-acer-headphone-and-mic"},
+ {.id = ALC285_FIXUP_HP_GPIO_AMP_INIT, .name = "alc285-hp-amp-init"},
+diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c
+index 959b70e8baf21..7d3d7d33fa20b 100644
+--- a/sound/soc/amd/yc/acp6x-mach.c
++++ b/sound/soc/amd/yc/acp6x-mach.c
+@@ -104,28 +104,14 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
+ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "21AW"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "21CM"),
+ }
+ },
+ {
+ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "21AX"),
+- }
+- },
+- {
+- .driver_data = &acp6x_card,
+- .matches = {
+- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "21BN"),
+- }
+- },
+- {
+- .driver_data = &acp6x_card,
+- .matches = {
+- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "21BQ"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "21CN"),
+ }
+ },
+ {
+@@ -156,20 +142,6 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CL"),
+ }
+ },
+- {
+- .driver_data = &acp6x_card,
+- .matches = {
+- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "21D8"),
+- }
+- },
+- {
+- .driver_data = &acp6x_card,
+- .matches = {
+- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "21D9"),
+- }
+- },
+ {}
+ };
+
+diff --git a/sound/soc/atmel/mchp-spdifrx.c b/sound/soc/atmel/mchp-spdifrx.c
+index 5fc968483f2c8..a7baa0385ec58 100644
+--- a/sound/soc/atmel/mchp-spdifrx.c
++++ b/sound/soc/atmel/mchp-spdifrx.c
+@@ -288,15 +288,17 @@ static void mchp_spdifrx_isr_blockend_en(struct mchp_spdifrx_dev *dev)
+ spin_unlock_irqrestore(&dev->blockend_lock, flags);
+ }
+
+-/* called from atomic context only */
++/* called from atomic/non-atomic context */
+ static void mchp_spdifrx_isr_blockend_dis(struct mchp_spdifrx_dev *dev)
+ {
+- spin_lock(&dev->blockend_lock);
++ unsigned long flags;
++
++ spin_lock_irqsave(&dev->blockend_lock, flags);
+ dev->blockend_refcount--;
+ /* don't enable BLOCKEND interrupt if it's already enabled */
+ if (dev->blockend_refcount == 0)
+ regmap_write(dev->regmap, SPDIFRX_IDR, SPDIFRX_IR_BLOCKEND);
+- spin_unlock(&dev->blockend_lock);
++ spin_unlock_irqrestore(&dev->blockend_lock, flags);
+ }
+
+ static irqreturn_t mchp_spdif_interrupt(int irq, void *dev_id)
+@@ -575,6 +577,7 @@ static int mchp_spdifrx_subcode_ch_get(struct mchp_spdifrx_dev *dev,
+ if (ret <= 0) {
+ dev_dbg(dev->dev, "user data for channel %d timeout\n",
+ channel);
++ mchp_spdifrx_isr_blockend_dis(dev);
+ return ret;
+ }
+
+diff --git a/sound/soc/codecs/cros_ec_codec.c b/sound/soc/codecs/cros_ec_codec.c
+index 9b92e1a0d1a3a..43561ff1bb8d1 100644
+--- a/sound/soc/codecs/cros_ec_codec.c
++++ b/sound/soc/codecs/cros_ec_codec.c
+@@ -994,6 +994,7 @@ static int cros_ec_codec_platform_probe(struct platform_device *pdev)
+ dev_dbg(dev, "ap_shm_phys_addr=%#llx len=%#x\n",
+ priv->ap_shm_phys_addr, priv->ap_shm_len);
+ }
++ of_node_put(node);
+ }
+ #endif
+
+diff --git a/sound/soc/codecs/da7210.c b/sound/soc/codecs/da7210.c
+index 8af344b2fdbf6..d75d15006f64e 100644
+--- a/sound/soc/codecs/da7210.c
++++ b/sound/soc/codecs/da7210.c
+@@ -1336,6 +1336,8 @@ static int __init da7210_modinit(void)
+ int ret = 0;
+ #if IS_ENABLED(CONFIG_I2C)
+ ret = i2c_add_driver(&da7210_i2c_driver);
++ if (ret)
++ return ret;
+ #endif
+ #if defined(CONFIG_SPI_MASTER)
+ ret = spi_register_driver(&da7210_spi_driver);
+diff --git a/sound/soc/codecs/msm8916-wcd-digital.c b/sound/soc/codecs/msm8916-wcd-digital.c
+index 20a07c92b2fc2..098a58990f07d 100644
+--- a/sound/soc/codecs/msm8916-wcd-digital.c
++++ b/sound/soc/codecs/msm8916-wcd-digital.c
+@@ -328,8 +328,8 @@ static const struct snd_kcontrol_new rx1_mix2_inp1_mux = SOC_DAPM_ENUM(
+ static const struct snd_kcontrol_new rx2_mix2_inp1_mux = SOC_DAPM_ENUM(
+ "RX2 MIX2 INP1 Mux", rx2_mix2_inp1_chain_enum);
+
+-/* Digital Gain control -38.4 dB to +38.4 dB in 0.3 dB steps */
+-static const DECLARE_TLV_DB_SCALE(digital_gain, -3840, 30, 0);
++/* Digital Gain control -84 dB to +40 dB in 1 dB steps */
++static const DECLARE_TLV_DB_SCALE(digital_gain, -8400, 100, -8400);
+
+ /* Cutoff Freq for High Pass Filter at -3dB */
+ static const char * const hpf_cutoff_text[] = {
+@@ -510,15 +510,15 @@ static int wcd_iir_filter_info(struct snd_kcontrol *kcontrol,
+
+ static const struct snd_kcontrol_new msm8916_wcd_digital_snd_controls[] = {
+ SOC_SINGLE_S8_TLV("RX1 Digital Volume", LPASS_CDC_RX1_VOL_CTL_B2_CTL,
+- -128, 127, digital_gain),
++ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX2 Digital Volume", LPASS_CDC_RX2_VOL_CTL_B2_CTL,
+- -128, 127, digital_gain),
++ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX3 Digital Volume", LPASS_CDC_RX3_VOL_CTL_B2_CTL,
+- -128, 127, digital_gain),
++ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("TX1 Digital Volume", LPASS_CDC_TX1_VOL_CTL_GAIN,
+- -128, 127, digital_gain),
++ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("TX2 Digital Volume", LPASS_CDC_TX2_VOL_CTL_GAIN,
+- -128, 127, digital_gain),
++ -84, 40, digital_gain),
+ SOC_ENUM("TX1 HPF Cutoff", tx1_hpf_cutoff_enum),
+ SOC_ENUM("TX2 HPF Cutoff", tx2_hpf_cutoff_enum),
+ SOC_SINGLE("TX1 HPF Switch", LPASS_CDC_TX1_MUX_CTL, 3, 1, 0),
+@@ -553,22 +553,22 @@ static const struct snd_kcontrol_new msm8916_wcd_digital_snd_controls[] = {
+ WCD_IIR_FILTER_CTL("IIR2 Band3", IIR2, BAND3),
+ WCD_IIR_FILTER_CTL("IIR2 Band4", IIR2, BAND4),
+ WCD_IIR_FILTER_CTL("IIR2 Band5", IIR2, BAND5),
+- SOC_SINGLE_SX_TLV("IIR1 INP1 Volume", LPASS_CDC_IIR1_GAIN_B1_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("IIR1 INP2 Volume", LPASS_CDC_IIR1_GAIN_B2_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("IIR1 INP3 Volume", LPASS_CDC_IIR1_GAIN_B3_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("IIR1 INP4 Volume", LPASS_CDC_IIR1_GAIN_B4_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("IIR2 INP1 Volume", LPASS_CDC_IIR2_GAIN_B1_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("IIR2 INP2 Volume", LPASS_CDC_IIR2_GAIN_B2_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("IIR2 INP3 Volume", LPASS_CDC_IIR2_GAIN_B3_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("IIR2 INP4 Volume", LPASS_CDC_IIR2_GAIN_B4_CTL,
+- 0, -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR1 INP1 Volume", LPASS_CDC_IIR1_GAIN_B1_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR1 INP2 Volume", LPASS_CDC_IIR1_GAIN_B2_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR1 INP3 Volume", LPASS_CDC_IIR1_GAIN_B3_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR1 INP4 Volume", LPASS_CDC_IIR1_GAIN_B4_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR2 INP1 Volume", LPASS_CDC_IIR2_GAIN_B1_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR2 INP2 Volume", LPASS_CDC_IIR2_GAIN_B2_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR2 INP3 Volume", LPASS_CDC_IIR2_GAIN_B3_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("IIR2 INP4 Volume", LPASS_CDC_IIR2_GAIN_B4_CTL,
++ -84, 40, digital_gain),
+
+ };
+
+diff --git a/sound/soc/codecs/mt6359-accdet.c b/sound/soc/codecs/mt6359-accdet.c
+index 6d3d170144a0a..c190628e29056 100644
+--- a/sound/soc/codecs/mt6359-accdet.c
++++ b/sound/soc/codecs/mt6359-accdet.c
+@@ -675,6 +675,7 @@ static int mt6359_accdet_parse_dt(struct mt6359_accdet *priv)
+ sizeof(struct three_key_threshold));
+ }
+
++ of_node_put(node);
+ dev_warn(priv->dev, "accdet caps=%x\n", priv->caps);
+
+ return 0;
+diff --git a/sound/soc/codecs/mt6359.c b/sound/soc/codecs/mt6359.c
+index f8532aa7e4aa0..9a9c8555f7204 100644
+--- a/sound/soc/codecs/mt6359.c
++++ b/sound/soc/codecs/mt6359.c
+@@ -2780,6 +2780,7 @@ static int mt6359_parse_dt(struct mt6359_priv *priv)
+
+ ret = of_property_read_u32(np, "mediatek,mic-type-2",
+ &priv->mux_select[MUX_MIC_TYPE_2]);
++ of_node_put(np);
+ if (ret) {
+ dev_info(priv->dev,
+ "%s() failed to read mic-type-2, use default (%d)\n",
+diff --git a/sound/soc/codecs/wcd9335.c b/sound/soc/codecs/wcd9335.c
+index aa685980a97b6..b7c5bfc441278 100644
+--- a/sound/soc/codecs/wcd9335.c
++++ b/sound/soc/codecs/wcd9335.c
+@@ -2259,51 +2259,42 @@ static int wcd9335_rx_hph_mode_put(struct snd_kcontrol *kc,
+
+ static const struct snd_kcontrol_new wcd9335_snd_controls[] = {
+ /* -84dB min - 40dB max */
+- SOC_SINGLE_SX_TLV("RX0 Digital Volume", WCD9335_CDC_RX0_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX1 Digital Volume", WCD9335_CDC_RX1_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX2 Digital Volume", WCD9335_CDC_RX2_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX3 Digital Volume", WCD9335_CDC_RX3_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX4 Digital Volume", WCD9335_CDC_RX4_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX5 Digital Volume", WCD9335_CDC_RX5_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX6 Digital Volume", WCD9335_CDC_RX6_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX7 Digital Volume", WCD9335_CDC_RX7_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX8 Digital Volume", WCD9335_CDC_RX8_RX_VOL_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX0 Mix Digital Volume",
+- WCD9335_CDC_RX0_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX1 Mix Digital Volume",
+- WCD9335_CDC_RX1_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX2 Mix Digital Volume",
+- WCD9335_CDC_RX2_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX3 Mix Digital Volume",
+- WCD9335_CDC_RX3_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX4 Mix Digital Volume",
+- WCD9335_CDC_RX4_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX5 Mix Digital Volume",
+- WCD9335_CDC_RX5_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX6 Mix Digital Volume",
+- WCD9335_CDC_RX6_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX7 Mix Digital Volume",
+- WCD9335_CDC_RX7_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
+- SOC_SINGLE_SX_TLV("RX8 Mix Digital Volume",
+- WCD9335_CDC_RX8_RX_VOL_MIX_CTL,
+- 0, -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX0 Digital Volume", WCD9335_CDC_RX0_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX1 Digital Volume", WCD9335_CDC_RX1_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX2 Digital Volume", WCD9335_CDC_RX2_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX3 Digital Volume", WCD9335_CDC_RX3_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX4 Digital Volume", WCD9335_CDC_RX4_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX5 Digital Volume", WCD9335_CDC_RX5_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX6 Digital Volume", WCD9335_CDC_RX6_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX7 Digital Volume", WCD9335_CDC_RX7_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX8 Digital Volume", WCD9335_CDC_RX8_RX_VOL_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX0 Mix Digital Volume", WCD9335_CDC_RX0_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX1 Mix Digital Volume", WCD9335_CDC_RX1_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX2 Mix Digital Volume", WCD9335_CDC_RX2_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX3 Mix Digital Volume", WCD9335_CDC_RX3_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX4 Mix Digital Volume", WCD9335_CDC_RX4_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX5 Mix Digital Volume", WCD9335_CDC_RX5_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX6 Mix Digital Volume", WCD9335_CDC_RX6_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX7 Mix Digital Volume", WCD9335_CDC_RX7_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
++ SOC_SINGLE_S8_TLV("RX8 Mix Digital Volume", WCD9335_CDC_RX8_RX_VOL_MIX_CTL,
++ -84, 40, digital_gain),
+ SOC_ENUM("RX INT0_1 HPF cut off", cf_int0_1_enum),
+ SOC_ENUM("RX INT0_2 HPF cut off", cf_int0_2_enum),
+ SOC_ENUM("RX INT1_1 HPF cut off", cf_int1_1_enum),
+diff --git a/sound/soc/codecs/wsa881x.c b/sound/soc/codecs/wsa881x.c
+index 616b26c70c3b2..649c7e73f7744 100644
+--- a/sound/soc/codecs/wsa881x.c
++++ b/sound/soc/codecs/wsa881x.c
+@@ -1174,11 +1174,17 @@ static int __maybe_unused wsa881x_runtime_resume(struct device *dev)
+ struct sdw_slave *slave = dev_to_sdw_dev(dev);
+ struct regmap *regmap = dev_get_regmap(dev, NULL);
+ struct wsa881x_priv *wsa881x = dev_get_drvdata(dev);
++ unsigned long time;
+
+ gpiod_direction_output(wsa881x->sd_n, 1);
+
+- wait_for_completion_timeout(&slave->initialization_complete,
+- msecs_to_jiffies(WSA881X_PROBE_TIMEOUT));
++ time = wait_for_completion_timeout(&slave->initialization_complete,
++ msecs_to_jiffies(WSA881X_PROBE_TIMEOUT));
++ if (!time) {
++ dev_err(dev, "Initialization not complete, timed out\n");
++ gpiod_direction_output(wsa881x->sd_n, 0);
++ return -ETIMEDOUT;
++ }
+
+ regcache_cache_only(regmap, false);
+ regcache_sync(regmap);
+diff --git a/sound/soc/fsl/fsl-asoc-card.c b/sound/soc/fsl/fsl-asoc-card.c
+index d9a0d4768c4d5..c836848ef0a65 100644
+--- a/sound/soc/fsl/fsl-asoc-card.c
++++ b/sound/soc/fsl/fsl-asoc-card.c
+@@ -537,6 +537,7 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
+ struct device *codec_dev = NULL;
+ const char *codec_dai_name;
+ const char *codec_dev_name;
++ u32 asrc_fmt = 0;
+ u32 width;
+ int ret;
+
+@@ -829,8 +830,8 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
+ goto asrc_fail;
+ }
+
+- ret = of_property_read_u32(asrc_np, "fsl,asrc-format",
+- &priv->asrc_format);
++ ret = of_property_read_u32(asrc_np, "fsl,asrc-format", &asrc_fmt);
++ priv->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
+ if (ret) {
+ /* Fallback to old binding; translate to asrc_format */
+ ret = of_property_read_u32(asrc_np, "fsl,asrc-width",
+diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
+index d7d1536a4f377..44dcbf49456cb 100644
+--- a/sound/soc/fsl/fsl_asrc.c
++++ b/sound/soc/fsl/fsl_asrc.c
+@@ -1066,6 +1066,7 @@ static int fsl_asrc_probe(struct platform_device *pdev)
+ struct resource *res;
+ void __iomem *regs;
+ int irq, ret, i;
++ u32 asrc_fmt = 0;
+ u32 map_idx;
+ char tmp[16];
+ u32 width;
+@@ -1174,7 +1175,8 @@ static int fsl_asrc_probe(struct platform_device *pdev)
+ return ret;
+ }
+
+- ret = of_property_read_u32(np, "fsl,asrc-format", &asrc->asrc_format);
++ ret = of_property_read_u32(np, "fsl,asrc-format", &asrc_fmt);
++ asrc->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
+ if (ret) {
+ ret = of_property_read_u32(np, "fsl,asrc-width", &width);
+ if (ret) {
+@@ -1197,7 +1199,7 @@ static int fsl_asrc_probe(struct platform_device *pdev)
+ }
+ }
+
+- if (!(FSL_ASRC_FORMATS & (1ULL << asrc->asrc_format))) {
++ if (!(FSL_ASRC_FORMATS & pcm_format_to_bits(asrc->asrc_format))) {
+ dev_warn(&pdev->dev, "unsupported width, use default S24_LE\n");
+ asrc->asrc_format = SNDRV_PCM_FORMAT_S24_LE;
+ }
+diff --git a/sound/soc/fsl/fsl_easrc.c b/sound/soc/fsl/fsl_easrc.c
+index be14f84796cb4..cf0e10d17dbe3 100644
+--- a/sound/soc/fsl/fsl_easrc.c
++++ b/sound/soc/fsl/fsl_easrc.c
+@@ -476,7 +476,8 @@ static int fsl_easrc_prefilter_config(struct fsl_asrc *easrc,
+ struct fsl_asrc_pair *ctx;
+ struct device *dev;
+ u32 inrate, outrate, offset = 0;
+- u32 in_s_rate, out_s_rate, in_s_fmt, out_s_fmt;
++ u32 in_s_rate, out_s_rate;
++ snd_pcm_format_t in_s_fmt, out_s_fmt;
+ int ret, i;
+
+ if (!easrc)
+@@ -1873,6 +1874,7 @@ static int fsl_easrc_probe(struct platform_device *pdev)
+ struct resource *res;
+ struct device_node *np;
+ void __iomem *regs;
++ u32 asrc_fmt = 0;
+ int ret, irq;
+
+ easrc = devm_kzalloc(dev, sizeof(*easrc), GFP_KERNEL);
+@@ -1933,13 +1935,14 @@ static int fsl_easrc_probe(struct platform_device *pdev)
+ return ret;
+ }
+
+- ret = of_property_read_u32(np, "fsl,asrc-format", &easrc->asrc_format);
++ ret = of_property_read_u32(np, "fsl,asrc-format", &asrc_fmt);
++ easrc->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
+ if (ret) {
+ dev_err(dev, "failed to asrc format\n");
+ return ret;
+ }
+
+- if (!(FSL_EASRC_FORMATS & (1ULL << easrc->asrc_format))) {
++ if (!(FSL_EASRC_FORMATS & (pcm_format_to_bits(easrc->asrc_format)))) {
+ dev_warn(dev, "unsupported format, switching to S24_LE\n");
+ easrc->asrc_format = SNDRV_PCM_FORMAT_S24_LE;
+ }
+diff --git a/sound/soc/fsl/fsl_easrc.h b/sound/soc/fsl/fsl_easrc.h
+index 30620d56252cc..5b8469757c122 100644
+--- a/sound/soc/fsl/fsl_easrc.h
++++ b/sound/soc/fsl/fsl_easrc.h
+@@ -569,7 +569,7 @@ struct fsl_easrc_io_params {
+ unsigned int access_len;
+ unsigned int fifo_wtmk;
+ unsigned int sample_rate;
+- unsigned int sample_format;
++ snd_pcm_format_t sample_format;
+ unsigned int norm_rate;
+ };
+
+diff --git a/sound/soc/fsl/imx-audmux.c b/sound/soc/fsl/imx-audmux.c
+index dfa05d40b2764..a8e5e0f57faf9 100644
+--- a/sound/soc/fsl/imx-audmux.c
++++ b/sound/soc/fsl/imx-audmux.c
+@@ -298,7 +298,7 @@ static int imx_audmux_probe(struct platform_device *pdev)
+ audmux_clk = NULL;
+ }
+
+- audmux_type = (enum imx_audmux_type)of_device_get_match_data(&pdev->dev);
++ audmux_type = (uintptr_t)of_device_get_match_data(&pdev->dev);
+
+ switch (audmux_type) {
+ case IMX31_AUDMUX:
+diff --git a/sound/soc/fsl/imx-card.c b/sound/soc/fsl/imx-card.c
+index 6f8efd838fcc8..4a8609b0d700d 100644
+--- a/sound/soc/fsl/imx-card.c
++++ b/sound/soc/fsl/imx-card.c
+@@ -17,6 +17,9 @@
+
+ #include "fsl_sai.h"
+
++#define IMX_CARD_MCLK_22P5792MHZ 22579200
++#define IMX_CARD_MCLK_24P576MHZ 24576000
++
+ enum codec_type {
+ CODEC_DUMMY = 0,
+ CODEC_AK5558 = 1,
+@@ -115,7 +118,7 @@ struct imx_card_data {
+ struct snd_soc_card card;
+ int num_dapm_routes;
+ u32 asrc_rate;
+- u32 asrc_format;
++ snd_pcm_format_t asrc_format;
+ };
+
+ static struct imx_akcodec_fs_mul ak4458_fs_mul[] = {
+@@ -353,9 +356,14 @@ static int imx_aif_hw_params(struct snd_pcm_substream *substream,
+ mclk_freq = akcodec_get_mclk_rate(substream, params, slots, slot_width);
+ else
+ mclk_freq = params_rate(params) * slots * slot_width;
+- /* Use the maximum freq from DSD512 (512*44100 = 22579200) */
+- if (format_is_dsd(params))
+- mclk_freq = 22579200;
++
++ if (format_is_dsd(params)) {
++ /* Use the maximum freq from DSD512 (512*44100 = 22579200) */
++ if (!(params_rate(params) % 11025))
++ mclk_freq = IMX_CARD_MCLK_22P5792MHZ;
++ else
++ mclk_freq = IMX_CARD_MCLK_24P576MHZ;
++ }
+
+ ret = snd_soc_dai_set_sysclk(cpu_dai, link_data->cpu_sysclk_id, mclk_freq,
+ SND_SOC_CLOCK_OUT);
+@@ -466,7 +474,7 @@ static int be_hw_params_fixup(struct snd_soc_pcm_runtime *rtd,
+
+ mask = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT);
+ snd_mask_none(mask);
+- snd_mask_set(mask, data->asrc_format);
++ snd_mask_set(mask, (__force unsigned int)data->asrc_format);
+
+ return 0;
+ }
+@@ -485,6 +493,7 @@ static int imx_card_parse_of(struct imx_card_data *data)
+ struct dai_link_data *link_data;
+ struct of_phandle_args args;
+ int ret, num_links;
++ u32 asrc_fmt = 0;
+ u32 width;
+
+ ret = snd_soc_of_parse_card_name(card, "model");
+@@ -631,7 +640,8 @@ static int imx_card_parse_of(struct imx_card_data *data)
+ goto err;
+ }
+
+- ret = of_property_read_u32(args.np, "fsl,asrc-format", &data->asrc_format);
++ ret = of_property_read_u32(args.np, "fsl,asrc-format", &asrc_fmt);
++ data->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
+ if (ret) {
+ /* Fallback to old binding; translate to asrc_format */
+ ret = of_property_read_u32(args.np, "fsl,asrc-width", &width);
+diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c
+index 2b598af8feef8..b327372f2e4ae 100644
+--- a/sound/soc/generic/audio-graph-card.c
++++ b/sound/soc/generic/audio-graph-card.c
+@@ -158,8 +158,10 @@ static int asoc_simple_parse_dai(struct device_node *ep,
+ * if he unbinded CPU or Codec.
+ */
+ ret = snd_soc_get_dai_name(&args, &dlc->dai_name);
+- if (ret < 0)
++ if (ret < 0) {
++ of_node_put(node);
+ return ret;
++ }
+
+ dlc->of_node = node;
+
+diff --git a/sound/soc/generic/audio-graph-card2.c b/sound/soc/generic/audio-graph-card2.c
+index c0f3907a01fd4..da782403316f5 100644
+--- a/sound/soc/generic/audio-graph-card2.c
++++ b/sound/soc/generic/audio-graph-card2.c
+@@ -229,7 +229,8 @@ enum graph_type {
+
+ static enum graph_type __graph_get_type(struct device_node *lnk)
+ {
+- struct device_node *np;
++ struct device_node *np, *parent_np;
++ enum graph_type ret;
+
+ /*
+ * target {
+@@ -240,19 +241,33 @@ static enum graph_type __graph_get_type(struct device_node *lnk)
+ * };
+ */
+ np = of_get_parent(lnk);
+- if (of_node_name_eq(np, "ports"))
+- np = of_get_parent(np);
++ if (of_node_name_eq(np, "ports")) {
++ parent_np = of_get_parent(np);
++ of_node_put(np);
++ np = parent_np;
++ }
++
++ if (of_node_name_eq(np, GRAPH_NODENAME_MULTI)) {
++ ret = GRAPH_MULTI;
++ goto out_put;
++ }
++
++ if (of_node_name_eq(np, GRAPH_NODENAME_DPCM)) {
++ ret = GRAPH_DPCM;
++ goto out_put;
++ }
+
+- if (of_node_name_eq(np, GRAPH_NODENAME_MULTI))
+- return GRAPH_MULTI;
++ if (of_node_name_eq(np, GRAPH_NODENAME_C2C)) {
++ ret = GRAPH_C2C;
++ goto out_put;
++ }
+
+- if (of_node_name_eq(np, GRAPH_NODENAME_DPCM))
+- return GRAPH_DPCM;
++ ret = GRAPH_NORMAL;
+
+- if (of_node_name_eq(np, GRAPH_NODENAME_C2C))
+- return GRAPH_C2C;
++out_put:
++ of_node_put(np);
++ return ret;
+
+- return GRAPH_NORMAL;
+ }
+
+ static enum graph_type graph_get_type(struct asoc_simple_priv *priv,
+@@ -430,8 +445,10 @@ static int asoc_simple_parse_dai(struct device_node *ep,
+ * if he unbinded CPU or Codec.
+ */
+ ret = snd_soc_get_dai_name(&args, &dlc->dai_name);
+- if (ret < 0)
++ if (ret < 0) {
++ of_node_put(node);
+ return ret;
++ }
+
+ dlc->of_node = node;
+
+@@ -856,7 +873,7 @@ int audio_graph2_link_c2c(struct asoc_simple_priv *priv,
+ struct device_node *port0, *port1, *ports;
+ struct device_node *codec0_port, *codec1_port;
+ struct device_node *ep0, *ep1;
+- u32 val;
++ u32 val = 0;
+ int ret = -EINVAL;
+
+ /*
+@@ -880,7 +897,8 @@ int audio_graph2_link_c2c(struct asoc_simple_priv *priv,
+ ports = of_get_parent(port0);
+ port1 = of_get_next_child(ports, lnk);
+
+- if (!of_get_property(ports, "rate", &val)) {
++ of_property_read_u32(ports, "rate", &val);
++ if (!val) {
+ struct device *dev = simple_priv_to_dev(priv);
+
+ dev_err(dev, "Codec2Codec needs rate settings\n");
+diff --git a/sound/soc/intel/boards/sof_rt5682.c b/sound/soc/intel/boards/sof_rt5682.c
+index 7126fcb63d904..d9f83b04be87c 100644
+--- a/sound/soc/intel/boards/sof_rt5682.c
++++ b/sound/soc/intel/boards/sof_rt5682.c
+@@ -435,6 +435,15 @@ static int sof_card_late_probe(struct snd_soc_card *card)
+ int err;
+ int i = 0;
+
++ if (sof_rt5682_quirk & SOF_MAX98373_SPEAKER_AMP_PRESENT) {
++ /* Disable Left and Right Spk pin after boot */
++ snd_soc_dapm_disable_pin(dapm, "Left Spk");
++ snd_soc_dapm_disable_pin(dapm, "Right Spk");
++ err = snd_soc_dapm_sync(dapm);
++ if (err < 0)
++ return err;
++ }
++
+ /* HDMI is not supported by SOF on Baytrail/CherryTrail */
+ if (is_legacy_cpu || !ctx->idisp_codec)
+ return 0;
+@@ -468,15 +477,6 @@ static int sof_card_late_probe(struct snd_soc_card *card)
+ i++;
+ }
+
+- if (sof_rt5682_quirk & SOF_MAX98373_SPEAKER_AMP_PRESENT) {
+- /* Disable Left and Right Spk pin after boot */
+- snd_soc_dapm_disable_pin(dapm, "Left Spk");
+- snd_soc_dapm_disable_pin(dapm, "Right Spk");
+- err = snd_soc_dapm_sync(dapm);
+- if (err < 0)
+- return err;
+- }
+-
+ return hdac_hdmi_jack_port_init(component, &card->dapm);
+ }
+
+diff --git a/sound/soc/mediatek/mt6797/mt6797-mt6351.c b/sound/soc/mediatek/mt6797/mt6797-mt6351.c
+index 496f32bcfb5e3..d2f6213a6bfcc 100644
+--- a/sound/soc/mediatek/mt6797/mt6797-mt6351.c
++++ b/sound/soc/mediatek/mt6797/mt6797-mt6351.c
+@@ -217,7 +217,8 @@ static int mt6797_mt6351_dev_probe(struct platform_device *pdev)
+ if (!codec_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_platform_node;
+ }
+ for_each_card_prelinks(card, i, dai_link) {
+ if (dai_link->codecs->name)
+@@ -230,6 +231,9 @@ static int mt6797_mt6351_dev_probe(struct platform_device *pdev)
+ dev_err(&pdev->dev, "%s snd_soc_register_card fail %d\n",
+ __func__, ret);
+
++ of_node_put(codec_node);
++put_platform_node:
++ of_node_put(platform_node);
+ return ret;
+ }
+
+diff --git a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
+index 5716d92990668..c1e8633a74a7a 100644
+--- a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
++++ b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
+@@ -256,14 +256,16 @@ static int mt8173_rt5650_rt5676_dev_probe(struct platform_device *pdev)
+ if (!mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[0].of_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_node;
+ }
+ mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node =
+ of_parse_phandle(pdev->dev.of_node, "mediatek,audio-codec", 1);
+ if (!mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_node;
+ }
+ mt8173_rt5650_rt5676_codec_conf[0].dlc.of_node =
+ mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node;
+@@ -276,13 +278,15 @@ static int mt8173_rt5650_rt5676_dev_probe(struct platform_device *pdev)
+ if (!mt8173_rt5650_rt5676_dais[DAI_LINK_HDMI_I2S].codecs->of_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_node;
+ }
+
+ card->dev = &pdev->dev;
+
+ ret = devm_snd_soc_register_card(&pdev->dev, card);
+
++put_node:
+ of_node_put(platform_node);
+ return ret;
+ }
+diff --git a/sound/soc/mediatek/mt8173/mt8173-rt5650.c b/sound/soc/mediatek/mt8173/mt8173-rt5650.c
+index fc164f4f95f85..487807ee70194 100644
+--- a/sound/soc/mediatek/mt8173/mt8173-rt5650.c
++++ b/sound/soc/mediatek/mt8173/mt8173-rt5650.c
+@@ -280,7 +280,8 @@ static int mt8173_rt5650_dev_probe(struct platform_device *pdev)
+ if (!mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[0].of_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_platform_node;
+ }
+ mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node =
+ mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[0].of_node;
+@@ -293,7 +294,7 @@ static int mt8173_rt5650_dev_probe(struct platform_device *pdev)
+ dev_err(&pdev->dev,
+ "%s codec_capture_dai name fail %d\n",
+ __func__, ret);
+- return ret;
++ goto put_platform_node;
+ }
+ mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[1].dai_name =
+ codec_capture_dai;
+@@ -315,12 +316,14 @@ static int mt8173_rt5650_dev_probe(struct platform_device *pdev)
+ if (!mt8173_rt5650_dais[DAI_LINK_HDMI_I2S].codecs->of_node) {
+ dev_err(&pdev->dev,
+ "Property 'audio-codec' missing or invalid\n");
+- return -EINVAL;
++ ret = -EINVAL;
++ goto put_platform_node;
+ }
+ card->dev = &pdev->dev;
+
+ ret = devm_snd_soc_register_card(&pdev->dev, card);
+
++put_platform_node:
+ of_node_put(platform_node);
+ return ret;
+ }
+diff --git a/sound/soc/qcom/lpass-cpu.c b/sound/soc/qcom/lpass-cpu.c
+index e6846ad2b5fa4..964eb07f46d6a 100644
+--- a/sound/soc/qcom/lpass-cpu.c
++++ b/sound/soc/qcom/lpass-cpu.c
+@@ -1090,6 +1090,7 @@ int asoc_qcom_lpass_cpu_platform_probe(struct platform_device *pdev)
+ dsp_of_node = of_parse_phandle(pdev->dev.of_node, "qcom,adsp", 0);
+ if (dsp_of_node) {
+ dev_err(dev, "DSP exists and holds audio resources\n");
++ of_node_put(dsp_of_node);
+ return -EBUSY;
+ }
+
+diff --git a/sound/soc/qcom/qdsp6/q6adm.c b/sound/soc/qcom/qdsp6/q6adm.c
+index 72c5719f1d253..a0678e8cf20a8 100644
+--- a/sound/soc/qcom/qdsp6/q6adm.c
++++ b/sound/soc/qcom/qdsp6/q6adm.c
+@@ -217,7 +217,7 @@ static struct q6copp *q6adm_alloc_copp(struct q6adm *adm, int port_idx)
+ idx = find_first_zero_bit(&adm->copp_bitmap[port_idx],
+ MAX_COPPS_PER_PORT);
+
+- if (idx > MAX_COPPS_PER_PORT)
++ if (idx >= MAX_COPPS_PER_PORT)
+ return ERR_PTR(-EBUSY);
+
+ c = kzalloc(sizeof(*c), GFP_ATOMIC);
+diff --git a/sound/soc/samsung/aries_wm8994.c b/sound/soc/samsung/aries_wm8994.c
+index 83acbe57b2489..a0825da9fff97 100644
+--- a/sound/soc/samsung/aries_wm8994.c
++++ b/sound/soc/samsung/aries_wm8994.c
+@@ -628,8 +628,10 @@ static int aries_audio_probe(struct platform_device *pdev)
+ return -EINVAL;
+
+ codec = of_get_child_by_name(dev->of_node, "codec");
+- if (!codec)
+- return -EINVAL;
++ if (!codec) {
++ ret = -EINVAL;
++ goto out;
++ }
+
+ for_each_card_prelinks(card, i, dai_link) {
+ dai_link->codecs->of_node = of_parse_phandle(codec,
+diff --git a/sound/soc/samsung/h1940_uda1380.c b/sound/soc/samsung/h1940_uda1380.c
+index c994e67d1eaf0..ca086243fcfd6 100644
+--- a/sound/soc/samsung/h1940_uda1380.c
++++ b/sound/soc/samsung/h1940_uda1380.c
+@@ -8,7 +8,7 @@
+ // Based on version from Arnaud Patard <arnaud.patard@rtp-net.org>
+
+ #include <linux/types.h>
+-#include <linux/gpio.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/module.h>
+
+ #include <sound/soc.h>
+diff --git a/sound/soc/samsung/rx1950_uda1380.c b/sound/soc/samsung/rx1950_uda1380.c
+index 6ea1c8cc91675..2820097b00b93 100644
+--- a/sound/soc/samsung/rx1950_uda1380.c
++++ b/sound/soc/samsung/rx1950_uda1380.c
+@@ -128,7 +128,7 @@ static int rx1950_startup(struct snd_pcm_substream *substream)
+ &hw_rates);
+ }
+
+-struct gpio_desc *gpiod_speaker_power;
++static struct gpio_desc *gpiod_speaker_power;
+
+ static int rx1950_spk_power(struct snd_soc_dapm_widget *w,
+ struct snd_kcontrol *kcontrol, int event)
+@@ -227,7 +227,7 @@ static int rx1950_probe(struct platform_device *pdev)
+ return devm_snd_soc_register_card(dev, &rx1950_asoc);
+ }
+
+-struct platform_driver rx1950_audio = {
++static struct platform_driver rx1950_audio = {
+ .driver = {
+ .name = "rx1950-audio",
+ .pm = &snd_soc_pm_ops,
+diff --git a/sound/soc/sof/ipc3-topology.c b/sound/soc/sof/ipc3-topology.c
+index 80fb82ece38d7..74c230fbcf688 100644
+--- a/sound/soc/sof/ipc3-topology.c
++++ b/sound/soc/sof/ipc3-topology.c
+@@ -1629,6 +1629,7 @@ static int sof_ipc3_control_load_bytes(struct snd_sof_dev *sdev, struct snd_sof_
+ return 0;
+ err:
+ kfree(scontrol->ipc_control_data);
++ scontrol->ipc_control_data = NULL;
+ return ret;
+ }
+
+diff --git a/sound/soc/sof/mediatek/mt8195/mt8195-loader.c b/sound/soc/sof/mediatek/mt8195/mt8195-loader.c
+index ed18d6379e922..ef2664c3cd47d 100644
+--- a/sound/soc/sof/mediatek/mt8195/mt8195-loader.c
++++ b/sound/soc/sof/mediatek/mt8195/mt8195-loader.c
+@@ -21,7 +21,7 @@ void sof_hifixdsp_boot_sequence(struct snd_sof_dev *sdev, u32 boot_addr)
+
+ /* pull high StatVectorSel to use AltResetVec (set bit4 to 1) */
+ snd_sof_dsp_update_bits(sdev, DSP_REG_BAR, DSP_RESET_SW,
+- DSP_RESET_SW, DSP_RESET_SW);
++ STATVECTOR_SEL, STATVECTOR_SEL);
+
+ /* toggle DReset & BReset */
+ /* pull high DReset & BReset */
+diff --git a/sound/soc/sof/sof-priv.h b/sound/soc/sof/sof-priv.h
+index c856f0d84e495..f369242dd975e 100644
+--- a/sound/soc/sof/sof-priv.h
++++ b/sound/soc/sof/sof-priv.h
+@@ -364,8 +364,8 @@ struct snd_sof_ipc_msg {
+
+ /**
+ * struct sof_ipc_pm_ops - IPC-specific PM ops
+- * @ctx_save: Function pointer for context save
+- * @ctx_restore: Function pointer for context restore
++ * @ctx_save: Optional function pointer for context save
++ * @ctx_restore: Optional function pointer for context restore
+ */
+ struct sof_ipc_pm_ops {
+ int (*ctx_save)(struct snd_sof_dev *sdev);
+diff --git a/sound/usb/bcd2000/bcd2000.c b/sound/usb/bcd2000/bcd2000.c
+index cd4a0bc6d278f..7aec0a95c609a 100644
+--- a/sound/usb/bcd2000/bcd2000.c
++++ b/sound/usb/bcd2000/bcd2000.c
+@@ -348,7 +348,8 @@ static int bcd2000_init_midi(struct bcd2000 *bcd2k)
+ static void bcd2000_free_usb_related_resources(struct bcd2000 *bcd2k,
+ struct usb_interface *interface)
+ {
+- /* usb_kill_urb not necessary, urb is aborted automatically */
++ usb_kill_urb(bcd2k->midi_out_urb);
++ usb_kill_urb(bcd2k->midi_in_urb);
+
+ usb_free_urb(bcd2k->midi_out_urb);
+ usb_free_urb(bcd2k->midi_in_urb);
+diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
+index 968d90caeefa0..168fd802d70bd 100644
+--- a/sound/usb/quirks.c
++++ b/sound/usb/quirks.c
+@@ -1843,6 +1843,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER),
+ DEVICE_FLG(0x1395, 0x740a, /* Sennheiser DECT */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
++ DEVICE_FLG(0x1397, 0x0507, /* Behringer UMC202HD */
++ QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB),
+ DEVICE_FLG(0x1397, 0x0508, /* Behringer UMC204HD */
+ QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB),
+ DEVICE_FLG(0x1397, 0x0509, /* Behringer UMC404HD */
+diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
+index 97dec81950e5d..6353a789322b5 100644
+--- a/tools/bpf/bpftool/link.c
++++ b/tools/bpf/bpftool/link.c
+@@ -20,6 +20,10 @@ static const char * const link_type_name[] = {
+ [BPF_LINK_TYPE_CGROUP] = "cgroup",
+ [BPF_LINK_TYPE_ITER] = "iter",
+ [BPF_LINK_TYPE_NETNS] = "netns",
++ [BPF_LINK_TYPE_XDP] = "xdp",
++ [BPF_LINK_TYPE_PERF_EVENT] = "perf_event",
++ [BPF_LINK_TYPE_KPROBE_MULTI] = "kprobe_multi",
++ [BPF_LINK_TYPE_STRUCT_OPS] = "struct_ops",
+ };
+
+ static struct hashmap *link_table;
+diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
+index d14b10b85e51f..ff9af73859cab 100644
+--- a/tools/include/uapi/linux/bpf.h
++++ b/tools/include/uapi/linux/bpf.h
+@@ -1013,6 +1013,7 @@ enum bpf_link_type {
+ BPF_LINK_TYPE_XDP = 6,
+ BPF_LINK_TYPE_PERF_EVENT = 7,
+ BPF_LINK_TYPE_KPROBE_MULTI = 8,
++ BPF_LINK_TYPE_STRUCT_OPS = 9,
+
+ MAX_BPF_LINK_TYPE,
+ };
+@@ -5143,6 +5144,17 @@ union bpf_attr {
+ * The **hash_algo** is returned on success,
+ * **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
+ * invalid arguments are passed.
++ *
++ * void *bpf_kptr_xchg(void *map_value, void *ptr)
++ * Description
++ * Exchange kptr at pointer *map_value* with *ptr*, and return the
++ * old value. *ptr* can be NULL, otherwise it must be a referenced
++ * pointer which will be released when this helper is called.
++ * Return
++ * The old value of kptr (which can be NULL). The returned pointer
++ * if not NULL, is a reference which must be released using its
++ * corresponding release function, or moved into a BPF map before
++ * program exit.
+ */
+ #define __BPF_FUNC_MAPPER(FN) \
+ FN(unspec), \
+@@ -5339,6 +5351,7 @@ union bpf_attr {
+ FN(copy_from_user_task), \
+ FN(skb_set_tstamp), \
+ FN(ima_file_hash), \
++ FN(kptr_xchg), \
+ /* */
+
+ /* integer value in 'imm' field of BPF_CALL instruction selects which helper
+diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
+index e3a8c947e89f6..479f514f74119 100644
+--- a/tools/lib/bpf/bpf_tracing.h
++++ b/tools/lib/bpf/bpf_tracing.h
+@@ -227,7 +227,7 @@ struct pt_regs___arm64 {
+ #define __PT_PARM5_REG a4
+ #define __PT_RET_REG ra
+ #define __PT_FP_REG s0
+-#define __PT_RC_REG a5
++#define __PT_RC_REG a0
+ #define __PT_SP_REG sp
+ #define __PT_IP_REG pc
+ /* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */
+diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
+index 927745b080141..23f5c46708f8f 100644
+--- a/tools/lib/bpf/gen_loader.c
++++ b/tools/lib/bpf/gen_loader.c
+@@ -533,7 +533,7 @@ void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *attach_name,
+ gen->attach_kind = kind;
+ ret = snprintf(gen->attach_target, sizeof(gen->attach_target), "%s%s",
+ prefix, attach_name);
+- if (ret == sizeof(gen->attach_target))
++ if (ret >= sizeof(gen->attach_target))
+ gen->error = -ENOSPC;
+ }
+
+diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
+index 881ea905ca815..2323d3e38a8e3 100644
+--- a/tools/lib/bpf/libbpf.c
++++ b/tools/lib/bpf/libbpf.c
+@@ -4271,7 +4271,7 @@ static int bpf_get_map_info_from_fdinfo(int fd, struct bpf_map_info *info)
+ int bpf_map__reuse_fd(struct bpf_map *map, int fd)
+ {
+ struct bpf_map_info info = {};
+- __u32 len = sizeof(info);
++ __u32 len = sizeof(info), name_len;
+ int new_fd, err;
+ char *new_name;
+
+@@ -4281,7 +4281,12 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd)
+ if (err)
+ return libbpf_err(err);
+
+- new_name = strdup(info.name);
++ name_len = strlen(info.name);
++ if (name_len == BPF_OBJ_NAME_LEN - 1 && strncmp(map->name, info.name, name_len) == 0)
++ new_name = strdup(map->name);
++ else
++ new_name = strdup(info.name);
++
+ if (!new_name)
+ return libbpf_err(-errno);
+
+diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
+index af136f73b09d0..67dc010e9fe3b 100644
+--- a/tools/lib/bpf/xsk.c
++++ b/tools/lib/bpf/xsk.c
+@@ -1147,8 +1147,6 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
+ goto out_mmap_tx;
+ }
+
+- ctx->prog_fd = -1;
+-
+ if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
+ err = __xsk_setup_xdp_prog(xsk, NULL);
+ if (err)
+@@ -1229,7 +1227,10 @@ void xsk_socket__delete(struct xsk_socket *xsk)
+
+ ctx = xsk->ctx;
+ umem = ctx->umem;
+- if (ctx->prog_fd != -1) {
++
++ xsk_put_ctx(ctx, true);
++
++ if (!ctx->refcount) {
+ xsk_delete_bpf_maps(xsk);
+ close(ctx->prog_fd);
+ if (ctx->has_bpf_link)
+@@ -1248,8 +1249,6 @@ void xsk_socket__delete(struct xsk_socket *xsk)
+ }
+ }
+
+- xsk_put_ctx(ctx, true);
+-
+ umem->refcount--;
+ /* Do not close an fd that also has an associated umem connected
+ * to it.
+diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
+index f058e8cddfa85..e15fd6f9f7ea2 100644
+--- a/tools/perf/builtin-stat.c
++++ b/tools/perf/builtin-stat.c
+@@ -1663,12 +1663,6 @@ static int add_default_attributes(void)
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
+
+-};
+- struct perf_event_attr default_sw_attrs[] = {
+- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
+- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
+- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
+- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
+ };
+
+ /*
+@@ -1910,30 +1904,6 @@ setup_metrics:
+ }
+
+ if (!evsel_list->core.nr_entries) {
+- if (perf_pmu__has_hybrid()) {
+- struct parse_events_error errinfo;
+- const char *hybrid_str = "cycles,instructions,branches,branch-misses";
+-
+- if (target__has_cpu(&target))
+- default_sw_attrs[0].config = PERF_COUNT_SW_CPU_CLOCK;
+-
+- if (evlist__add_default_attrs(evsel_list,
+- default_sw_attrs) < 0) {
+- return -1;
+- }
+-
+- parse_events_error__init(&errinfo);
+- err = parse_events(evsel_list, hybrid_str, &errinfo);
+- if (err) {
+- fprintf(stderr,
+- "Cannot set up hybrid events %s: %d\n",
+- hybrid_str, err);
+- parse_events_error__print(&errinfo, hybrid_str);
+- }
+- parse_events_error__exit(&errinfo);
+- return err ? -1 : 0;
+- }
+-
+ if (target__has_cpu(&target))
+ default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
+
+diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
+index b97366f77bbf7..2bd23e4cf19ef 100644
+--- a/tools/perf/util/dsos.c
++++ b/tools/perf/util/dsos.c
+@@ -23,8 +23,19 @@ static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
+ if (a->ino > b->ino) return -1;
+ if (a->ino < b->ino) return 1;
+
+- if (a->ino_generation > b->ino_generation) return -1;
+- if (a->ino_generation < b->ino_generation) return 1;
++ /*
++ * Synthesized MMAP events have zero ino_generation, avoid comparing
++ * them with MMAP events with actual ino_generation.
++ *
++ * I found it harmful because the mismatch resulted in a new
++ * dso that did not have a build ID whereas the original dso did have a
++ * build ID. The build ID was essential because the object was not found
++ * otherwise. - Adrian
++ */
++ if (a->ino_generation && b->ino_generation) {
++ if (a->ino_generation > b->ino_generation) return -1;
++ if (a->ino_generation < b->ino_generation) return 1;
++ }
+
+ return 0;
+ }
+diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
+index aed49806a09ba..953338b9e887e 100644
+--- a/tools/perf/util/genelf.c
++++ b/tools/perf/util/genelf.c
+@@ -30,7 +30,11 @@
+
+ #define BUILD_ID_URANDOM /* different uuid for each run */
+
+-#ifdef HAVE_LIBCRYPTO
++// FIXME, remove this and fix the deprecation warnings before its removed and
++// We'll break for good here...
++#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
++
++#ifdef HAVE_LIBCRYPTO_SUPPORT
+
+ #define BUILD_ID_MD5
+ #undef BUILD_ID_SHA /* does not seem to work well when linked with Java */
+diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
+index ef6ced5c5746a..cb7b244937826 100644
+--- a/tools/perf/util/symbol-elf.c
++++ b/tools/perf/util/symbol-elf.c
+@@ -1294,16 +1294,29 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
+
+ if (elf_read_program_header(syms_ss->elf,
+ (u64)sym.st_value, &phdr)) {
+- pr_warning("%s: failed to find program header for "
++ pr_debug4("%s: failed to find program header for "
+ "symbol: %s st_value: %#" PRIx64 "\n",
+ __func__, elf_name, (u64)sym.st_value);
+- continue;
++ pr_debug4("%s: adjusting symbol: st_value: %#" PRIx64 " "
++ "sh_addr: %#" PRIx64 " sh_offset: %#" PRIx64 "\n",
++ __func__, (u64)sym.st_value, (u64)shdr.sh_addr,
++ (u64)shdr.sh_offset);
++ /*
++ * Fail to find program header, let's rollback
++ * to use shdr.sh_addr and shdr.sh_offset to
++ * calibrate symbol's file address, though this
++ * is not necessary for normal C ELF file, we
++ * still need to handle java JIT symbols in this
++ * case.
++ */
++ sym.st_value -= shdr.sh_addr - shdr.sh_offset;
++ } else {
++ pr_debug4("%s: adjusting symbol: st_value: %#" PRIx64 " "
++ "p_vaddr: %#" PRIx64 " p_offset: %#" PRIx64 "\n",
++ __func__, (u64)sym.st_value, (u64)phdr.p_vaddr,
++ (u64)phdr.p_offset);
++ sym.st_value -= phdr.p_vaddr - phdr.p_offset;
+ }
+- pr_debug4("%s: adjusting symbol: st_value: %#" PRIx64 " "
+- "p_vaddr: %#" PRIx64 " p_offset: %#" PRIx64 "\n",
+- __func__, (u64)sym.st_value, (u64)phdr.p_vaddr,
+- (u64)phdr.p_offset);
+- sym.st_value -= phdr.p_vaddr - phdr.p_offset;
+ }
+
+ demangled = demangle_sym(dso, kmodule, elf_name);
+diff --git a/tools/power/x86/intel-speed-select/isst-daemon.c b/tools/power/x86/intel-speed-select/isst-daemon.c
+index dd372924bc826..d0400c6684ba9 100644
+--- a/tools/power/x86/intel-speed-select/isst-daemon.c
++++ b/tools/power/x86/intel-speed-select/isst-daemon.c
+@@ -41,7 +41,7 @@ void process_level_change(int cpu)
+ time_t tm;
+ int ret;
+
+- if (pkg_id >= MAX_PACKAGE_COUNT || die_id > MAX_DIE_PER_PACKAGE) {
++ if (pkg_id >= MAX_PACKAGE_COUNT || die_id >= MAX_DIE_PER_PACKAGE) {
+ debug_printf("Invalid package/die info for cpu:%d\n", cpu);
+ return;
+ }
+diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
+index ec823561b912f..a294176f8a9da 100644
+--- a/tools/testing/selftests/bpf/prog_tests/btf.c
++++ b/tools/testing/selftests/bpf/prog_tests/btf.c
+@@ -5226,7 +5226,7 @@ static void do_test_pprint(int test_num)
+ ret = snprintf(pin_path, sizeof(pin_path), "%s/%s",
+ "/sys/fs/bpf", test->map_name);
+
+- if (CHECK(ret == sizeof(pin_path), "pin_path %s/%s is too long",
++ if (CHECK(ret >= sizeof(pin_path), "pin_path %s/%s is too long",
+ "/sys/fs/bpf", test->map_name)) {
+ err = -1;
+ goto done;
+diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
+index 3ee2107bbf7a7..58b03d1a70c81 100644
+--- a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
++++ b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
+@@ -7,11 +7,9 @@
+
+ void test_fexit_stress(void)
+ {
+- char test_skb[128] = {};
+ int fexit_fd[CNT] = {};
+ int link_fd[CNT] = {};
+- char error[4096];
+- int err, i, filter_fd;
++ int err, i;
+
+ const struct bpf_insn trace_program[] = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+@@ -20,25 +18,9 @@ void test_fexit_stress(void)
+
+ LIBBPF_OPTS(bpf_prog_load_opts, trace_opts,
+ .expected_attach_type = BPF_TRACE_FEXIT,
+- .log_buf = error,
+- .log_size = sizeof(error),
+ );
+
+- const struct bpf_insn skb_program[] = {
+- BPF_MOV64_IMM(BPF_REG_0, 0),
+- BPF_EXIT_INSN(),
+- };
+-
+- LIBBPF_OPTS(bpf_prog_load_opts, skb_opts,
+- .log_buf = error,
+- .log_size = sizeof(error),
+- );
+-
+- LIBBPF_OPTS(bpf_test_run_opts, topts,
+- .data_in = test_skb,
+- .data_size_in = sizeof(test_skb),
+- .repeat = 1,
+- );
++ LIBBPF_OPTS(bpf_test_run_opts, topts);
+
+ err = libbpf_find_vmlinux_btf_id("bpf_fentry_test1",
+ trace_opts.expected_attach_type);
+@@ -58,15 +40,9 @@ void test_fexit_stress(void)
+ goto out;
+ }
+
+- filter_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL",
+- skb_program, sizeof(skb_program) / sizeof(struct bpf_insn),
+- &skb_opts);
+- if (!ASSERT_GE(filter_fd, 0, "test_program_loaded"))
+- goto out;
++ err = bpf_prog_test_run_opts(fexit_fd[0], &topts);
++ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+- err = bpf_prog_test_run_opts(filter_fd, &topts);
+- close(filter_fd);
+- CHECK_FAIL(err);
+ out:
+ for (i = 0; i < CNT; i++) {
+ if (link_fd[i])
+diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+index 9d211b5c22c41..7d23166c77af5 100644
+--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
++++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+@@ -394,7 +394,6 @@ void serial_test_sock_fields(void)
+ test();
+
+ done:
+- test_sock_fields__detach(skel);
+ test_sock_fields__destroy(skel);
+ if (child_cg_fd >= 0)
+ close(child_cg_fd);
+diff --git a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
+index 7ad66a247c02c..b2e415647bd73 100644
+--- a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
++++ b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
+@@ -646,7 +646,7 @@ static void test_tcp_clear_dtime(struct test_tc_dtime *skel)
+ __u32 *errs = skel->bss->errs[t];
+
+ skel->bss->test = t;
+- test_inet_dtime(AF_INET6, SOCK_STREAM, IP6_DST, 0);
++ test_inet_dtime(AF_INET6, SOCK_STREAM, IP6_DST, 50000 + t);
+
+ ASSERT_EQ(dtimes[INGRESS_FWDNS_P100], 0,
+ dtime_cnt_str(t, INGRESS_FWDNS_P100));
+@@ -683,7 +683,7 @@ static void test_tcp_dtime(struct test_tc_dtime *skel, int family, bool bpf_fwd)
+ errs = skel->bss->errs[t];
+
+ skel->bss->test = t;
+- test_inet_dtime(family, SOCK_STREAM, addr, 0);
++ test_inet_dtime(family, SOCK_STREAM, addr, 50000 + t);
+
+ /* fwdns_prio100 prog does not read delivery_time_type, so
+ * kernel puts the (rcv) timetamp in __sk_buff->tstamp
+@@ -715,13 +715,13 @@ static void test_udp_dtime(struct test_tc_dtime *skel, int family, bool bpf_fwd)
+ errs = skel->bss->errs[t];
+
+ skel->bss->test = t;
+- test_inet_dtime(family, SOCK_DGRAM, addr, 0);
++ test_inet_dtime(family, SOCK_DGRAM, addr, 50000 + t);
+
+ ASSERT_EQ(dtimes[INGRESS_FWDNS_P100], 0,
+ dtime_cnt_str(t, INGRESS_FWDNS_P100));
+ /* non mono delivery time is not forwarded */
+ ASSERT_EQ(dtimes[INGRESS_FWDNS_P101], 0,
+- dtime_cnt_str(t, INGRESS_FWDNS_P100));
++ dtime_cnt_str(t, INGRESS_FWDNS_P101));
+ for (i = EGRESS_FWDNS_P100; i < SET_DTIME; i++)
+ ASSERT_GT(dtimes[i], 0, dtime_cnt_str(t, i));
+
+diff --git a/tools/testing/selftests/bpf/progs/test_tc_dtime.c b/tools/testing/selftests/bpf/progs/test_tc_dtime.c
+index 06f300d06dbd7..b596479a9ebeb 100644
+--- a/tools/testing/selftests/bpf/progs/test_tc_dtime.c
++++ b/tools/testing/selftests/bpf/progs/test_tc_dtime.c
+@@ -11,6 +11,8 @@
+ #include <linux/in.h>
+ #include <linux/ip.h>
+ #include <linux/ipv6.h>
++#include <linux/tcp.h>
++#include <linux/udp.h>
+ #include <bpf/bpf_helpers.h>
+ #include <bpf/bpf_endian.h>
+ #include <sys/socket.h>
+@@ -115,6 +117,19 @@ static bool bpf_fwd(void)
+ return test < TCP_IP4_RT_FWD;
+ }
+
++static __u8 get_proto(void)
++{
++ switch (test) {
++ case UDP_IP4:
++ case UDP_IP6:
++ case UDP_IP4_RT_FWD:
++ case UDP_IP6_RT_FWD:
++ return IPPROTO_UDP;
++ default:
++ return IPPROTO_TCP;
++ }
++}
++
+ /* -1: parse error: TC_ACT_SHOT
+ * 0: not testing traffic: TC_ACT_OK
+ * >0: first byte is the inet_proto, second byte has the netns
+@@ -122,11 +137,16 @@ static bool bpf_fwd(void)
+ */
+ static int skb_get_type(struct __sk_buff *skb)
+ {
++ __u16 dst_ns_port = __bpf_htons(50000 + test);
+ void *data_end = ctx_ptr(skb->data_end);
+ void *data = ctx_ptr(skb->data);
+ __u8 inet_proto = 0, ns = 0;
+ struct ipv6hdr *ip6h;
++ __u16 sport, dport;
+ struct iphdr *iph;
++ struct tcphdr *th;
++ struct udphdr *uh;
++ void *trans;
+
+ switch (skb->protocol) {
+ case __bpf_htons(ETH_P_IP):
+@@ -138,6 +158,7 @@ static int skb_get_type(struct __sk_buff *skb)
+ else if (iph->saddr == ip4_dst)
+ ns = DST_NS;
+ inet_proto = iph->protocol;
++ trans = iph + 1;
+ break;
+ case __bpf_htons(ETH_P_IPV6):
+ ip6h = data + sizeof(struct ethhdr);
+@@ -148,15 +169,43 @@ static int skb_get_type(struct __sk_buff *skb)
+ else if (v6_equal(ip6h->saddr, (struct in6_addr)ip6_dst))
+ ns = DST_NS;
+ inet_proto = ip6h->nexthdr;
++ trans = ip6h + 1;
+ break;
+ default:
+ return 0;
+ }
+
+- if ((inet_proto != IPPROTO_TCP && inet_proto != IPPROTO_UDP) || !ns)
++ /* skb is not from src_ns or dst_ns.
++ * skb is not the testing IPPROTO.
++ */
++ if (!ns || inet_proto != get_proto())
+ return 0;
+
+- return (ns << 8 | inet_proto);
++ switch (inet_proto) {
++ case IPPROTO_TCP:
++ th = trans;
++ if (th + 1 > data_end)
++ return -1;
++ sport = th->source;
++ dport = th->dest;
++ break;
++ case IPPROTO_UDP:
++ uh = trans;
++ if (uh + 1 > data_end)
++ return -1;
++ sport = uh->source;
++ dport = uh->dest;
++ break;
++ default:
++ return 0;
++ }
++
++ /* The skb is the testing traffic */
++ if ((ns == SRC_NS && dport == dst_ns_port) ||
++ (ns == DST_NS && sport == dst_ns_port))
++ return (ns << 8 | inet_proto);
++
++ return 0;
+ }
+
+ /* format: direction@iface@netns
+diff --git a/tools/testing/selftests/bpf/verifier/ref_tracking.c b/tools/testing/selftests/bpf/verifier/ref_tracking.c
+index fbd682520e475..57a83d763ec17 100644
+--- a/tools/testing/selftests/bpf/verifier/ref_tracking.c
++++ b/tools/testing/selftests/bpf/verifier/ref_tracking.c
+@@ -796,7 +796,7 @@
+ },
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .result = REJECT,
+- .errstr = "reference has not been acquired before",
++ .errstr = "R1 must be referenced when passed to release function",
+ },
+ {
+ /* !bpf_sk_fullsock(sk) is checked but !bpf_tcp_sock(sk) is not checked */
+diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c
+index 86b24cad27a76..d11d0b28be416 100644
+--- a/tools/testing/selftests/bpf/verifier/sock.c
++++ b/tools/testing/selftests/bpf/verifier/sock.c
+@@ -417,7 +417,7 @@
+ },
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .result = REJECT,
+- .errstr = "reference has not been acquired before",
++ .errstr = "R1 must be referenced when passed to release function",
+ },
+ {
+ "bpf_sk_release(bpf_sk_fullsock(skb->sk))",
+@@ -436,7 +436,7 @@
+ },
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .result = REJECT,
+- .errstr = "reference has not been acquired before",
++ .errstr = "R1 must be referenced when passed to release function",
+ },
+ {
+ "bpf_sk_release(bpf_tcp_sock(skb->sk))",
+@@ -455,7 +455,7 @@
+ },
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .result = REJECT,
+- .errstr = "reference has not been acquired before",
++ .errstr = "R1 must be referenced when passed to release function",
+ },
+ {
+ "sk_storage_get(map, skb->sk, NULL, 0): value == NULL",
+diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
+index 33ea5e9955d9b..b9984124a2949 100644
+--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
++++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
+@@ -1423,7 +1423,7 @@ uint64_t kvm_hypercall(uint64_t nr, uint64_t a0, uint64_t a1, uint64_t a2,
+
+ asm volatile("vmcall"
+ : "=a"(r)
+- : "b"(a0), "c"(a1), "d"(a2), "S"(a3));
++ : "a"(nr), "b"(a0), "c"(a1), "d"(a2), "S"(a3));
+ return r;
+ }
+
+diff --git a/tools/testing/selftests/powerpc/papr_attributes/attr_test.c b/tools/testing/selftests/powerpc/papr_attributes/attr_test.c
+index bab0dc06e90b7..9b655be641c90 100644
+--- a/tools/testing/selftests/powerpc/papr_attributes/attr_test.c
++++ b/tools/testing/selftests/powerpc/papr_attributes/attr_test.c
+@@ -7,6 +7,7 @@
+ * Copyright 2022, Pratik Rajesh Sampat, IBM Corp.
+ */
+
++#include <errno.h>
+ #include <stdio.h>
+ #include <string.h>
+ #include <dirent.h>
+@@ -32,7 +33,7 @@ enum type {
+ NUM_VAL
+ };
+
+-int value_type(int id)
++static int value_type(int id)
+ {
+ int val_type;
+
+@@ -54,15 +55,21 @@ int value_type(int id)
+ return val_type;
+ }
+
+-int verify_energy_info(void)
++static int verify_energy_info(void)
+ {
+ const char *path = "/sys/firmware/papr/energy_scale_info";
+ struct dirent *entry;
+ struct stat s;
+ DIR *dirp;
+
+- if (stat(path, &s) || !S_ISDIR(s.st_mode))
+- return -1;
++ errno = 0;
++ if (stat(path, &s)) {
++ SKIP_IF(errno == ENOENT);
++ FAIL_IF(errno);
++ }
++
++ FAIL_IF(!S_ISDIR(s.st_mode));
++
+ dirp = opendir(path);
+
+ while ((entry = readdir(dirp)) != NULL) {
+@@ -76,25 +83,24 @@ int verify_energy_info(void)
+
+ id = atoi(entry->d_name);
+ attr_type = value_type(id);
+- if (attr_type == INVALID)
+- return -1;
++ FAIL_IF(attr_type == INVALID);
+
+ /* Check if the files exist and have data in them */
+ sprintf(file_name, "%s/%d/desc", path, id);
+ f = fopen(file_name, "r");
+- if (!f || fgetc(f) == EOF)
+- return -1;
++ FAIL_IF(!f);
++ FAIL_IF(fgetc(f) == EOF);
+
+ sprintf(file_name, "%s/%d/value", path, id);
+ f = fopen(file_name, "r");
+- if (!f || fgetc(f) == EOF)
+- return -1;
++ FAIL_IF(!f);
++ FAIL_IF(fgetc(f) == EOF);
+
+ if (attr_type == STR_VAL) {
+ sprintf(file_name, "%s/%d/value_desc", path, id);
+ f = fopen(file_name, "r");
+- if (!f || fgetc(f) == EOF)
+- return -1;
++ FAIL_IF(!f);
++ FAIL_IF(fgetc(f) == EOF);
+ }
+ }
+
+diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
+index 55b2c15332827..f2768994d861e 100755
+--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
++++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
+@@ -44,6 +44,7 @@ TORTURE_KCONFIG_KASAN_ARG=""
+ TORTURE_KCONFIG_KCSAN_ARG=""
+ TORTURE_KMAKE_ARG=""
+ TORTURE_QEMU_MEM=512
++torture_qemu_mem_default=1
+ TORTURE_REMOTE=
+ TORTURE_SHUTDOWN_GRACE=180
+ TORTURE_SUITE=rcu
+@@ -163,7 +164,7 @@ do
+ shift
+ ;;
+ --gdb)
+- TORTURE_KCONFIG_GDB_ARG="CONFIG_DEBUG_INFO=y"; export TORTURE_KCONFIG_GDB_ARG
++ TORTURE_KCONFIG_GDB_ARG="CONFIG_DEBUG_INFO_NONE=n CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y"; export TORTURE_KCONFIG_GDB_ARG
+ TORTURE_BOOT_GDB_ARG="nokaslr"; export TORTURE_BOOT_GDB_ARG
+ TORTURE_QEMU_GDB_ARG="-s -S"; export TORTURE_QEMU_GDB_ARG
+ ;;
+@@ -179,7 +180,11 @@ do
+ shift
+ ;;
+ --kasan)
+- TORTURE_KCONFIG_KASAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KASAN=y"; export TORTURE_KCONFIG_KASAN_ARG
++ TORTURE_KCONFIG_KASAN_ARG="CONFIG_DEBUG_INFO_NONE=n CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y CONFIG_KASAN=y"; export TORTURE_KCONFIG_KASAN_ARG
++ if test -n "$torture_qemu_mem_default"
++ then
++ TORTURE_QEMU_MEM=2G
++ fi
+ ;;
+ --kconfig|--kconfigs)
+ checkarg --kconfig "(Kconfig options)" $# "$2" '^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\)*$' '^error$'
+@@ -187,7 +192,7 @@ do
+ shift
+ ;;
+ --kcsan)
+- TORTURE_KCONFIG_KCSAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KCSAN=y CONFIG_KCSAN_STRICT=y CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_VERBOSE=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y"; export TORTURE_KCONFIG_KCSAN_ARG
++ TORTURE_KCONFIG_KCSAN_ARG="CONFIG_DEBUG_INFO_NONE=n CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y CONFIG_KCSAN=y CONFIG_KCSAN_STRICT=y CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_VERBOSE=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y"; export TORTURE_KCONFIG_KCSAN_ARG
+ ;;
+ --kmake-arg|--kmake-args)
+ checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
+@@ -202,6 +207,7 @@ do
+ --memory)
+ checkarg --memory "(memory size)" $# "$2" '^[0-9]\+[MG]\?$' error
+ TORTURE_QEMU_MEM=$2
++ torture_qemu_mem_default=
+ shift
+ ;;
+ --no-initrd)
+diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
+index 313bb0cbfb1eb..6f65eeb6a3dd5 100644
+--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
++++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
+@@ -802,7 +802,7 @@ void kill_thread_or_group(struct __test_metadata *_metadata,
+ .len = (unsigned short)ARRAY_SIZE(filter_thread),
+ .filter = filter_thread,
+ };
+- int kill = kill_how == KILL_PROCESS ? SECCOMP_RET_KILL_PROCESS : 0xAAAAAAAAA;
++ int kill = kill_how == KILL_PROCESS ? SECCOMP_RET_KILL_PROCESS : 0xAAAAAAAA;
+ struct sock_filter filter_process[] = {
+ BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
+ offsetof(struct seccomp_data, nr)),
+diff --git a/tools/testing/selftests/timers/clocksource-switch.c b/tools/testing/selftests/timers/clocksource-switch.c
+index ef8eb3604595e..b57f0a9be4902 100644
+--- a/tools/testing/selftests/timers/clocksource-switch.c
++++ b/tools/testing/selftests/timers/clocksource-switch.c
+@@ -110,10 +110,10 @@ int run_tests(int secs)
+
+ sprintf(buf, "./inconsistency-check -t %i", secs);
+ ret = system(buf);
+- if (ret)
+- return ret;
++ if (WIFEXITED(ret) && WEXITSTATUS(ret))
++ return WEXITSTATUS(ret);
+ ret = system("./nanosleep");
+- return ret;
++ return WIFEXITED(ret) ? WEXITSTATUS(ret) : 0;
+ }
+
+
+diff --git a/tools/testing/selftests/timers/valid-adjtimex.c b/tools/testing/selftests/timers/valid-adjtimex.c
+index 5397de708d3c2..48b9a803235a8 100644
+--- a/tools/testing/selftests/timers/valid-adjtimex.c
++++ b/tools/testing/selftests/timers/valid-adjtimex.c
+@@ -40,7 +40,7 @@
+ #define ADJ_SETOFFSET 0x0100
+
+ #include <sys/syscall.h>
+-static int clock_adjtime(clockid_t id, struct timex *tx)
++int clock_adjtime(clockid_t id, struct timex *tx)
+ {
+ return syscall(__NR_clock_adjtime, id, tx);
+ }
+diff --git a/tools/testing/selftests/vm/hugepage-mremap.c b/tools/testing/selftests/vm/hugepage-mremap.c
+index 1d689084a54ba..22546f2795341 100644
+--- a/tools/testing/selftests/vm/hugepage-mremap.c
++++ b/tools/testing/selftests/vm/hugepage-mremap.c
+@@ -107,7 +107,7 @@ static void register_region_with_uffd(char *addr, size_t len)
+
+ int main(int argc, char *argv[])
+ {
+- size_t length;
++ size_t length = 0;
+
+ if (argc != 2 && argc != 3) {
+ printf("Usage: %s [length_in_MB] <hugetlb_file>\n", argv[0]);
+diff --git a/tools/testing/selftests/vm/hugetlb-madvise.c b/tools/testing/selftests/vm/hugetlb-madvise.c
+index 6c6af40f57478..3c9943131881e 100644
+--- a/tools/testing/selftests/vm/hugetlb-madvise.c
++++ b/tools/testing/selftests/vm/hugetlb-madvise.c
+@@ -89,10 +89,11 @@ void write_fault_pages(void *addr, unsigned long nr_pages)
+
+ void read_fault_pages(void *addr, unsigned long nr_pages)
+ {
+- unsigned long i, tmp;
++ unsigned long dummy = 0;
++ unsigned long i;
+
+ for (i = 0; i < nr_pages; i++)
+- tmp += *((unsigned long *)(addr + (i * huge_page_size)));
++ dummy += *((unsigned long *)(addr + (i * huge_page_size)));
+ }
+
+ int main(int argc, char **argv)
+diff --git a/tools/testing/selftests/wireguard/qemu/arch/riscv32.config b/tools/testing/selftests/wireguard/qemu/arch/riscv32.config
+index 0bd0e72d95d49..2fc36efb166dc 100644
+--- a/tools/testing/selftests/wireguard/qemu/arch/riscv32.config
++++ b/tools/testing/selftests/wireguard/qemu/arch/riscv32.config
+@@ -1,3 +1,4 @@
++CONFIG_NONPORTABLE=y
+ CONFIG_ARCH_RV32I=y
+ CONFIG_MMU=y
+ CONFIG_FPU=y
+diff --git a/tools/thermal/tmon/sysfs.c b/tools/thermal/tmon/sysfs.c
+index b00b1bfd9d8e7..cb1108bc92498 100644
+--- a/tools/thermal/tmon/sysfs.c
++++ b/tools/thermal/tmon/sysfs.c
+@@ -13,6 +13,7 @@
+ #include <stdint.h>
+ #include <dirent.h>
+ #include <libintl.h>
++#include <limits.h>
+ #include <ctype.h>
+ #include <time.h>
+ #include <syslog.h>
+@@ -33,9 +34,9 @@ int sysfs_set_ulong(char *path, char *filename, unsigned long val)
+ {
+ FILE *fd;
+ int ret = -1;
+- char filepath[256];
++ char filepath[PATH_MAX + 2]; /* NUL and '/' */
+
+- snprintf(filepath, 256, "%s/%s", path, filename);
++ snprintf(filepath, sizeof(filepath), "%s/%s", path, filename);
+
+ fd = fopen(filepath, "w");
+ if (!fd) {
+@@ -57,9 +58,9 @@ static int sysfs_get_ulong(char *path, char *filename, unsigned long *p_ulong)
+ {
+ FILE *fd;
+ int ret = -1;
+- char filepath[256];
++ char filepath[PATH_MAX + 2]; /* NUL and '/' */
+
+- snprintf(filepath, 256, "%s/%s", path, filename);
++ snprintf(filepath, sizeof(filepath), "%s/%s", path, filename);
+
+ fd = fopen(filepath, "r");
+ if (!fd) {
+@@ -76,9 +77,9 @@ static int sysfs_get_string(char *path, char *filename, char *str)
+ {
+ FILE *fd;
+ int ret = -1;
+- char filepath[256];
++ char filepath[PATH_MAX + 2]; /* NUL and '/' */
+
+- snprintf(filepath, 256, "%s/%s", path, filename);
++ snprintf(filepath, sizeof(filepath), "%s/%s", path, filename);
+
+ fd = fopen(filepath, "r");
+ if (!fd) {
+@@ -199,8 +200,8 @@ static int find_tzone_cdev(struct dirent *nl, char *tz_name,
+ {
+ unsigned long trip_instance = 0;
+ char cdev_name_linked[256];
+- char cdev_name[256];
+- char cdev_trip_name[256];
++ char cdev_name[PATH_MAX];
++ char cdev_trip_name[PATH_MAX];
+ int cdev_id;
+
+ if (nl->d_type == DT_LNK) {
+@@ -213,7 +214,8 @@ static int find_tzone_cdev(struct dirent *nl, char *tz_name,
+ return -EINVAL;
+ }
+ /* find the link to real cooling device record binding */
+- snprintf(cdev_name, 256, "%s/%s", tz_name, nl->d_name);
++ snprintf(cdev_name, sizeof(cdev_name) - 2, "%s/%s",
++ tz_name, nl->d_name);
+ memset(cdev_name_linked, 0, sizeof(cdev_name_linked));
+ if (readlink(cdev_name, cdev_name_linked,
+ sizeof(cdev_name_linked) - 1) != -1) {
+@@ -226,8 +228,8 @@ static int find_tzone_cdev(struct dirent *nl, char *tz_name,
+ /* find the trip point in which the cdev is binded to
+ * in this tzone
+ */
+- snprintf(cdev_trip_name, 256, "%s%s", nl->d_name,
+- "_trip_point");
++ snprintf(cdev_trip_name, sizeof(cdev_trip_name) - 1,
++ "%s%s", nl->d_name, "_trip_point");
+ sysfs_get_ulong(tz_name, cdev_trip_name,
+ &trip_instance);
+ /* validate trip point range, e.g. trip could return -1
+diff --git a/tools/thermal/tmon/tmon.h b/tools/thermal/tmon/tmon.h
+index c9066ec104ddd..44d16d778f044 100644
+--- a/tools/thermal/tmon/tmon.h
++++ b/tools/thermal/tmon/tmon.h
+@@ -27,6 +27,9 @@
+ #define NR_LINES_TZDATA 1
+ #define TMON_LOG_FILE "/var/tmp/tmon.log"
+
++#include <sys/time.h>
++#include <pthread.h>
++
+ extern unsigned long ticktime;
+ extern double time_elapsed;
+ extern unsigned long target_temp_user;
+diff --git a/tools/tracing/rtla/Makefile b/tools/tracing/rtla/Makefile
+index 3822f4ea5f495..1bea2d16d4c11 100644
+--- a/tools/tracing/rtla/Makefile
++++ b/tools/tracing/rtla/Makefile
+@@ -1,6 +1,6 @@
+ NAME := rtla
+ # Follow the kernel version
+-VERSION := $(shell cat VERSION 2> /dev/null || make -sC ../../.. kernelversion)
++VERSION := $(shell cat VERSION 2> /dev/null || make -sC ../../.. kernelversion | grep -v make)
+
+ # From libtracefs:
+ # Makefiles suck: This macro sets a default value of $(2) for the
+diff --git a/tools/tracing/rtla/src/trace.c b/tools/tracing/rtla/src/trace.c
+index 5784c9f9e5706..e1ba6d9f42658 100644
+--- a/tools/tracing/rtla/src/trace.c
++++ b/tools/tracing/rtla/src/trace.c
+@@ -134,13 +134,18 @@ void trace_instance_destroy(struct trace_instance *trace)
+ if (trace->inst) {
+ disable_tracer(trace->inst);
+ destroy_instance(trace->inst);
++ trace->inst = NULL;
+ }
+
+- if (trace->seq)
++ if (trace->seq) {
+ free(trace->seq);
++ trace->seq = NULL;
++ }
+
+- if (trace->tep)
++ if (trace->tep) {
+ tep_free(trace->tep);
++ trace->tep = NULL;
++ }
+ }
+
+ /*
+diff --git a/tools/tracing/rtla/src/utils.c b/tools/tracing/rtla/src/utils.c
+index 5352167a1e751..5ae2fa96fde1e 100644
+--- a/tools/tracing/rtla/src/utils.c
++++ b/tools/tracing/rtla/src/utils.c
+@@ -106,8 +106,9 @@ int parse_cpu_list(char *cpu_list, char **monitored_cpus)
+
+ nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
+
+- mon_cpus = malloc(nr_cpus * sizeof(char));
+- memset(mon_cpus, 0, (nr_cpus * sizeof(char)));
++ mon_cpus = calloc(nr_cpus, sizeof(char));
++ if (!mon_cpus)
++ goto err;
+
+ for (p = cpu_list; *p; ) {
+ cpu = atoi(p);
+diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
+index 7f1d19689701b..843396ed4ad33 100644
+--- a/virt/kvm/kvm_main.c
++++ b/virt/kvm/kvm_main.c
+@@ -724,6 +724,15 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
+ kvm->mn_active_invalidate_count++;
+ spin_unlock(&kvm->mn_invalidate_lock);
+
++ /*
++ * Invalidate pfn caches _before_ invalidating the secondary MMUs, i.e.
++ * before acquiring mmu_lock, to avoid holding mmu_lock while acquiring
++ * each cache's lock. There are relatively few caches in existence at
++ * any given time, and the caches themselves can check for hva overlap,
++ * i.e. don't need to rely on memslot overlap checks for performance.
++ * Because this runs without holding mmu_lock, the pfn caches must use
++ * mn_active_invalidate_count (see above) instead of mmu_notifier_count.
++ */
+ gfn_to_pfn_cache_invalidate_start(kvm, range->start, range->end,
+ hva_range.may_block);
+
+@@ -2843,16 +2852,28 @@ void kvm_release_pfn_dirty(kvm_pfn_t pfn)
+ }
+ EXPORT_SYMBOL_GPL(kvm_release_pfn_dirty);
+
++static bool kvm_is_ad_tracked_pfn(kvm_pfn_t pfn)
++{
++ if (!pfn_valid(pfn))
++ return false;
++
++ /*
++ * Per page-flags.h, pages tagged PG_reserved "should in general not be
++ * touched (e.g. set dirty) except by its owner".
++ */
++ return !PageReserved(pfn_to_page(pfn));
++}
++
+ void kvm_set_pfn_dirty(kvm_pfn_t pfn)
+ {
+- if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
++ if (kvm_is_ad_tracked_pfn(pfn))
+ SetPageDirty(pfn_to_page(pfn));
+ }
+ EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
+
+ void kvm_set_pfn_accessed(kvm_pfn_t pfn)
+ {
+- if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
++ if (kvm_is_ad_tracked_pfn(pfn))
+ mark_page_accessed(pfn_to_page(pfn));
+ }
+ EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
+diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
+index dd84676615f1a..b0b6783673765 100644
+--- a/virt/kvm/pfncache.c
++++ b/virt/kvm/pfncache.c
+@@ -95,7 +95,7 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+ }
+ EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_check);
+
+-static void __release_gpc(struct kvm *kvm, kvm_pfn_t pfn, void *khva, gpa_t gpa)
++static void gpc_release_pfn_and_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva)
+ {
+ /* Unmap the old page if it was mapped before, and release it */
+ if (!is_error_noslot_pfn(pfn)) {
+@@ -112,31 +112,122 @@ static void __release_gpc(struct kvm *kvm, kvm_pfn_t pfn, void *khva, gpa_t gpa)
+ }
+ }
+
+-static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, unsigned long uhva)
++static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_seq)
+ {
++ /*
++ * mn_active_invalidate_count acts for all intents and purposes
++ * like mmu_notifier_count here; but the latter cannot be used
++ * here because the invalidation of caches in the mmu_notifier
++ * event occurs _before_ mmu_notifier_count is elevated.
++ *
++ * Note, it does not matter that mn_active_invalidate_count
++ * is not protected by gpc->lock. It is guaranteed to
++ * be elevated before the mmu_notifier acquires gpc->lock, and
++ * isn't dropped until after mmu_notifier_seq is updated.
++ */
++ if (kvm->mn_active_invalidate_count)
++ return true;
++
++ /*
++ * Ensure mn_active_invalidate_count is read before
++ * mmu_notifier_seq. This pairs with the smp_wmb() in
++ * mmu_notifier_invalidate_range_end() to guarantee either the
++ * old (non-zero) value of mn_active_invalidate_count or the
++ * new (incremented) value of mmu_notifier_seq is observed.
++ */
++ smp_rmb();
++ return kvm->mmu_notifier_seq != mmu_seq;
++}
++
++static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
++{
++ /* Note, the new page offset may be different than the old! */
++ void *old_khva = gpc->khva - offset_in_page(gpc->khva);
++ kvm_pfn_t new_pfn = KVM_PFN_ERR_FAULT;
++ void *new_khva = NULL;
+ unsigned long mmu_seq;
+- kvm_pfn_t new_pfn;
+- int retry;
++
++ lockdep_assert_held(&gpc->refresh_lock);
++
++ lockdep_assert_held_write(&gpc->lock);
++
++ /*
++ * Invalidate the cache prior to dropping gpc->lock, the gpa=>uhva
++ * assets have already been updated and so a concurrent check() from a
++ * different task may not fail the gpa/uhva/generation checks.
++ */
++ gpc->valid = false;
+
+ do {
+ mmu_seq = kvm->mmu_notifier_seq;
+ smp_rmb();
+
++ write_unlock_irq(&gpc->lock);
++
++ /*
++ * If the previous iteration "failed" due to an mmu_notifier
++ * event, release the pfn and unmap the kernel virtual address
++ * from the previous attempt. Unmapping might sleep, so this
++ * needs to be done after dropping the lock. Opportunistically
++ * check for resched while the lock isn't held.
++ */
++ if (new_pfn != KVM_PFN_ERR_FAULT) {
++ /*
++ * Keep the mapping if the previous iteration reused
++ * the existing mapping and didn't create a new one.
++ */
++ if (new_khva == old_khva)
++ new_khva = NULL;
++
++ gpc_release_pfn_and_khva(kvm, new_pfn, new_khva);
++
++ cond_resched();
++ }
++
+ /* We always request a writeable mapping */
+- new_pfn = hva_to_pfn(uhva, false, NULL, true, NULL);
++ new_pfn = hva_to_pfn(gpc->uhva, false, NULL, true, NULL);
+ if (is_error_noslot_pfn(new_pfn))
+- break;
++ goto out_error;
++
++ /*
++ * Obtain a new kernel mapping if KVM itself will access the
++ * pfn. Note, kmap() and memremap() can both sleep, so this
++ * too must be done outside of gpc->lock!
++ */
++ if (gpc->usage & KVM_HOST_USES_PFN) {
++ if (new_pfn == gpc->pfn) {
++ new_khva = old_khva;
++ } else if (pfn_valid(new_pfn)) {
++ new_khva = kmap(pfn_to_page(new_pfn));
++#ifdef CONFIG_HAS_IOMEM
++ } else {
++ new_khva = memremap(pfn_to_hpa(new_pfn), PAGE_SIZE, MEMREMAP_WB);
++#endif
++ }
++ if (!new_khva) {
++ kvm_release_pfn_clean(new_pfn);
++ goto out_error;
++ }
++ }
+
+- KVM_MMU_READ_LOCK(kvm);
+- retry = mmu_notifier_retry_hva(kvm, mmu_seq, uhva);
+- KVM_MMU_READ_UNLOCK(kvm);
+- if (!retry)
+- break;
++ write_lock_irq(&gpc->lock);
+
+- cond_resched();
+- } while (1);
++ /*
++ * Other tasks must wait for _this_ refresh to complete before
++ * attempting to refresh.
++ */
++ WARN_ON_ONCE(gpc->valid);
++ } while (mmu_notifier_retry_cache(kvm, mmu_seq));
+
+- return new_pfn;
++ gpc->valid = true;
++ gpc->pfn = new_pfn;
++ gpc->khva = new_khva + (gpc->gpa & ~PAGE_MASK);
++ return 0;
++
++out_error:
++ write_lock_irq(&gpc->lock);
++
++ return -EFAULT;
+ }
+
+ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+@@ -146,9 +237,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+ unsigned long page_offset = gpa & ~PAGE_MASK;
+ kvm_pfn_t old_pfn, new_pfn;
+ unsigned long old_uhva;
+- gpa_t old_gpa;
+ void *old_khva;
+- bool old_valid;
+ int ret = 0;
+
+ /*
+@@ -158,13 +247,18 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+ if (page_offset + len > PAGE_SIZE)
+ return -EINVAL;
+
++ /*
++ * If another task is refreshing the cache, wait for it to complete.
++ * There is no guarantee that concurrent refreshes will see the same
++ * gpa, memslots generation, etc..., so they must be fully serialized.
++ */
++ mutex_lock(&gpc->refresh_lock);
++
+ write_lock_irq(&gpc->lock);
+
+- old_gpa = gpc->gpa;
+ old_pfn = gpc->pfn;
+ old_khva = gpc->khva - offset_in_page(gpc->khva);
+ old_uhva = gpc->uhva;
+- old_valid = gpc->valid;
+
+ /* If the userspace HVA is invalid, refresh that first */
+ if (gpc->gpa != gpa || gpc->generation != slots->generation ||
+@@ -177,64 +271,17 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+ gpc->uhva = gfn_to_hva_memslot(gpc->memslot, gfn);
+
+ if (kvm_is_error_hva(gpc->uhva)) {
+- gpc->pfn = KVM_PFN_ERR_FAULT;
+ ret = -EFAULT;
+ goto out;
+ }
+-
+- gpc->uhva += page_offset;
+ }
+
+ /*
+ * If the userspace HVA changed or the PFN was already invalid,
+ * drop the lock and do the HVA to PFN lookup again.
+ */
+- if (!old_valid || old_uhva != gpc->uhva) {
+- unsigned long uhva = gpc->uhva;
+- void *new_khva = NULL;
+-
+- /* Placeholders for "hva is valid but not yet mapped" */
+- gpc->pfn = KVM_PFN_ERR_FAULT;
+- gpc->khva = NULL;
+- gpc->valid = true;
+-
+- write_unlock_irq(&gpc->lock);
+-
+- new_pfn = hva_to_pfn_retry(kvm, uhva);
+- if (is_error_noslot_pfn(new_pfn)) {
+- ret = -EFAULT;
+- goto map_done;
+- }
+-
+- if (gpc->usage & KVM_HOST_USES_PFN) {
+- if (new_pfn == old_pfn) {
+- new_khva = old_khva;
+- old_pfn = KVM_PFN_ERR_FAULT;
+- old_khva = NULL;
+- } else if (pfn_valid(new_pfn)) {
+- new_khva = kmap(pfn_to_page(new_pfn));
+-#ifdef CONFIG_HAS_IOMEM
+- } else {
+- new_khva = memremap(pfn_to_hpa(new_pfn), PAGE_SIZE, MEMREMAP_WB);
+-#endif
+- }
+- if (new_khva)
+- new_khva += page_offset;
+- else
+- ret = -EFAULT;
+- }
+-
+- map_done:
+- write_lock_irq(&gpc->lock);
+- if (ret) {
+- gpc->valid = false;
+- gpc->pfn = KVM_PFN_ERR_FAULT;
+- gpc->khva = NULL;
+- } else {
+- /* At this point, gpc->valid may already have been cleared */
+- gpc->pfn = new_pfn;
+- gpc->khva = new_khva;
+- }
++ if (!gpc->valid || old_uhva != gpc->uhva) {
++ ret = hva_to_pfn_retry(kvm, gpc);
+ } else {
+ /* If the HVA→PFN mapping was already valid, don't unmap it. */
+ old_pfn = KVM_PFN_ERR_FAULT;
+@@ -242,9 +289,26 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+ }
+
+ out:
++ /*
++ * Invalidate the cache and purge the pfn/khva if the refresh failed.
++ * Some/all of the uhva, gpa, and memslot generation info may still be
++ * valid, leave it as is.
++ */
++ if (ret) {
++ gpc->valid = false;
++ gpc->pfn = KVM_PFN_ERR_FAULT;
++ gpc->khva = NULL;
++ }
++
++ /* Snapshot the new pfn before dropping the lock! */
++ new_pfn = gpc->pfn;
++
+ write_unlock_irq(&gpc->lock);
+
+- __release_gpc(kvm, old_pfn, old_khva, old_gpa);
++ mutex_unlock(&gpc->refresh_lock);
++
++ if (old_pfn != new_pfn)
++ gpc_release_pfn_and_khva(kvm, old_pfn, old_khva);
+
+ return ret;
+ }
+@@ -254,14 +318,13 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+ {
+ void *old_khva;
+ kvm_pfn_t old_pfn;
+- gpa_t old_gpa;
+
++ mutex_lock(&gpc->refresh_lock);
+ write_lock_irq(&gpc->lock);
+
+ gpc->valid = false;
+
+ old_khva = gpc->khva - offset_in_page(gpc->khva);
+- old_gpa = gpc->gpa;
+ old_pfn = gpc->pfn;
+
+ /*
+@@ -272,8 +335,9 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+ gpc->pfn = KVM_PFN_ERR_FAULT;
+
+ write_unlock_irq(&gpc->lock);
++ mutex_unlock(&gpc->refresh_lock);
+
+- __release_gpc(kvm, old_pfn, old_khva, old_gpa);
++ gpc_release_pfn_and_khva(kvm, old_pfn, old_khva);
+ }
+ EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap);
+
+@@ -286,6 +350,7 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
+
+ if (!gpc->active) {
+ rwlock_init(&gpc->lock);
++ mutex_init(&gpc->refresh_lock);
+
+ gpc->khva = NULL;
+ gpc->pfn = KVM_PFN_ERR_FAULT;
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-08-21 16:54 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-08-21 16:54 UTC (permalink / raw
To: gentoo-commits
commit: e6bb58bd953cb5e5e6168c1e1c5bce5383c507a7
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sun Aug 21 16:54:12 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sun Aug 21 16:54:12 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=e6bb58bd
Linux patch 5.18.19
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1018_linux-5.18.19.patch | 331 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 335 insertions(+)
diff --git a/0000_README b/0000_README
index c9212ac7..b8b0f1ff 100644
--- a/0000_README
+++ b/0000_README
@@ -115,6 +115,10 @@ Patch: 1017_linux-5.18.18.patch
From: http://www.kernel.org
Desc: Linux 5.18.18
+Patch: 1018_linux-5.18.19.patch
+From: http://www.kernel.org
+Desc: Linux 5.18.19
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1018_linux-5.18.19.patch b/1018_linux-5.18.19.patch
new file mode 100644
index 00000000..bb0c4371
--- /dev/null
+++ b/1018_linux-5.18.19.patch
@@ -0,0 +1,331 @@
+diff --git a/Makefile b/Makefile
+index 23162e2bdf140..fc7efcdab0a27 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 5
+ PATCHLEVEL = 18
+-SUBLEVEL = 18
++SUBLEVEL = 19
+ EXTRAVERSION =
+ NAME = Superb Owl
+
+diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
+index 9ec34690e2551..5ed6a585f21fd 100644
+--- a/arch/arm64/kernel/kexec_image.c
++++ b/arch/arm64/kernel/kexec_image.c
+@@ -14,7 +14,6 @@
+ #include <linux/kexec.h>
+ #include <linux/pe.h>
+ #include <linux/string.h>
+-#include <linux/verification.h>
+ #include <asm/byteorder.h>
+ #include <asm/cpufeature.h>
+ #include <asm/image.h>
+@@ -130,18 +129,10 @@ static void *image_load(struct kimage *image,
+ return NULL;
+ }
+
+-#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+-static int image_verify_sig(const char *kernel, unsigned long kernel_len)
+-{
+- return verify_pefile_signature(kernel, kernel_len, NULL,
+- VERIFYING_KEXEC_PE_SIGNATURE);
+-}
+-#endif
+-
+ const struct kexec_file_ops kexec_image_ops = {
+ .probe = image_probe,
+ .load = image_load,
+ #ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+- .verify_sig = image_verify_sig,
++ .verify_sig = kexec_kernel_verify_pe_sig,
+ #endif
+ };
+diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
+index 170d0fd68b1f4..f299b48f9c9f0 100644
+--- a/arch/x86/kernel/kexec-bzimage64.c
++++ b/arch/x86/kernel/kexec-bzimage64.c
+@@ -17,7 +17,6 @@
+ #include <linux/kernel.h>
+ #include <linux/mm.h>
+ #include <linux/efi.h>
+-#include <linux/verification.h>
+
+ #include <asm/bootparam.h>
+ #include <asm/setup.h>
+@@ -528,28 +527,11 @@ static int bzImage64_cleanup(void *loader_data)
+ return 0;
+ }
+
+-#ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG
+-static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
+-{
+- int ret;
+-
+- ret = verify_pefile_signature(kernel, kernel_len,
+- VERIFY_USE_SECONDARY_KEYRING,
+- VERIFYING_KEXEC_PE_SIGNATURE);
+- if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) {
+- ret = verify_pefile_signature(kernel, kernel_len,
+- VERIFY_USE_PLATFORM_KEYRING,
+- VERIFYING_KEXEC_PE_SIGNATURE);
+- }
+- return ret;
+-}
+-#endif
+-
+ const struct kexec_file_ops kexec_bzImage64_ops = {
+ .probe = bzImage64_probe,
+ .load = bzImage64_load,
+ .cleanup = bzImage64_cleanup,
+ #ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG
+- .verify_sig = bzImage64_verify_sig,
++ .verify_sig = kexec_kernel_verify_pe_sig,
+ #endif
+ };
+diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
+index f31e29e8f1cac..b55d4c733fa8a 100644
+--- a/drivers/tee/tee_shm.c
++++ b/drivers/tee/tee_shm.c
+@@ -311,6 +311,9 @@ struct tee_shm *tee_shm_register_user_buf(struct tee_context *ctx,
+ void *ret;
+ int id;
+
++ if (!access_ok((void __user *)addr, length))
++ return ERR_PTR(-EFAULT);
++
+ mutex_lock(&teedev->mutex);
+ id = idr_alloc(&teedev->idr, NULL, 1, 0, GFP_KERNEL);
+ mutex_unlock(&teedev->mutex);
+diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
+index 0e239a4c3b264..39c4c513bf979 100644
+--- a/fs/btrfs/raid56.c
++++ b/fs/btrfs/raid56.c
+@@ -323,6 +323,9 @@ static void merge_rbio(struct btrfs_raid_bio *dest,
+ {
+ bio_list_merge(&dest->bio_list, &victim->bio_list);
+ dest->bio_list_bytes += victim->bio_list_bytes;
++ /* Also inherit the bitmaps from @victim. */
++ bitmap_or(dest->dbitmap, victim->dbitmap, dest->dbitmap,
++ dest->stripe_npages);
+ dest->generic_bio_cnt += victim->generic_bio_cnt;
+ bio_list_init(&victim->bio_list);
+ }
+@@ -864,6 +867,12 @@ static void rbio_orig_end_io(struct btrfs_raid_bio *rbio, blk_status_t err)
+
+ if (rbio->generic_bio_cnt)
+ btrfs_bio_counter_sub(rbio->bioc->fs_info, rbio->generic_bio_cnt);
++ /*
++ * Clear the data bitmap, as the rbio may be cached for later usage.
++ * do this before before unlock_stripe() so there will be no new bio
++ * for this bio.
++ */
++ bitmap_clear(rbio->dbitmap, 0, rbio->stripe_npages);
+
+ /*
+ * At this moment, rbio->bio_list is empty, however since rbio does not
+@@ -1195,6 +1204,9 @@ static noinline void finish_rmw(struct btrfs_raid_bio *rbio)
+ else
+ BUG();
+
++ /* We should have at least one data sector. */
++ ASSERT(bitmap_weight(rbio->dbitmap, rbio->stripe_npages));
++
+ /* at this point we either have a full stripe,
+ * or we've read the full stripe from the drive.
+ * recalculate the parity and write the new results.
+@@ -1266,6 +1278,11 @@ static noinline void finish_rmw(struct btrfs_raid_bio *rbio)
+ for (stripe = 0; stripe < rbio->real_stripes; stripe++) {
+ for (pagenr = 0; pagenr < rbio->stripe_npages; pagenr++) {
+ struct page *page;
++
++ /* This vertical stripe has no data, skip it. */
++ if (!test_bit(pagenr, rbio->dbitmap))
++ continue;
++
+ if (stripe < rbio->nr_data) {
+ page = page_in_rbio(rbio, stripe, pagenr, 1);
+ if (!page)
+@@ -1290,6 +1307,11 @@ static noinline void finish_rmw(struct btrfs_raid_bio *rbio)
+
+ for (pagenr = 0; pagenr < rbio->stripe_npages; pagenr++) {
+ struct page *page;
++
++ /* This vertical stripe has no data, skip it. */
++ if (!test_bit(pagenr, rbio->dbitmap))
++ continue;
++
+ if (stripe < rbio->nr_data) {
+ page = page_in_rbio(rbio, stripe, pagenr, 1);
+ if (!page)
+@@ -1713,6 +1735,33 @@ static void btrfs_raid_unplug(struct blk_plug_cb *cb, bool from_schedule)
+ run_plug(plug);
+ }
+
++/* Add the original bio into rbio->bio_list, and update rbio::dbitmap. */
++static void rbio_add_bio(struct btrfs_raid_bio *rbio, struct bio *orig_bio)
++{
++ const struct btrfs_fs_info *fs_info = rbio->bioc->fs_info;
++ const u64 orig_logical = orig_bio->bi_iter.bi_sector << SECTOR_SHIFT;
++ const u64 full_stripe_start = rbio->bioc->raid_map[0];
++ const u32 orig_len = orig_bio->bi_iter.bi_size;
++ const u32 sectorsize = fs_info->sectorsize;
++ u64 cur_logical;
++
++ ASSERT(orig_logical >= full_stripe_start &&
++ orig_logical + orig_len <= full_stripe_start +
++ rbio->nr_data * rbio->stripe_len);
++
++ bio_list_add(&rbio->bio_list, orig_bio);
++ rbio->bio_list_bytes += orig_bio->bi_iter.bi_size;
++
++ /* Update the dbitmap. */
++ for (cur_logical = orig_logical; cur_logical < orig_logical + orig_len;
++ cur_logical += sectorsize) {
++ int bit = ((u32)(cur_logical - full_stripe_start) >>
++ fs_info->sectorsize_bits) % rbio->stripe_npages;
++
++ set_bit(bit, rbio->dbitmap);
++ }
++}
++
+ /*
+ * our main entry point for writes from the rest of the FS.
+ */
+@@ -1730,9 +1779,8 @@ int raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc,
+ btrfs_put_bioc(bioc);
+ return PTR_ERR(rbio);
+ }
+- bio_list_add(&rbio->bio_list, bio);
+- rbio->bio_list_bytes = bio->bi_iter.bi_size;
+ rbio->operation = BTRFS_RBIO_WRITE;
++ rbio_add_bio(rbio, bio);
+
+ btrfs_bio_counter_inc_noblocked(fs_info);
+ rbio->generic_bio_cnt = 1;
+@@ -2036,9 +2084,12 @@ static int __raid56_parity_recover(struct btrfs_raid_bio *rbio)
+ atomic_set(&rbio->error, 0);
+
+ /*
+- * read everything that hasn't failed. Thanks to the
+- * stripe cache, it is possible that some or all of these
+- * pages are going to be uptodate.
++ * Read everything that hasn't failed. However this time we will
++ * not trust any cached sector.
++ * As we may read out some stale data but higher layer is not reading
++ * that stale part.
++ *
++ * So here we always re-read everything in recovery path.
+ */
+ for (stripe = 0; stripe < rbio->real_stripes; stripe++) {
+ if (rbio->faila == stripe || rbio->failb == stripe) {
+@@ -2047,16 +2098,6 @@ static int __raid56_parity_recover(struct btrfs_raid_bio *rbio)
+ }
+
+ for (pagenr = 0; pagenr < rbio->stripe_npages; pagenr++) {
+- struct page *p;
+-
+- /*
+- * the rmw code may have already read this
+- * page in
+- */
+- p = rbio_stripe_page(rbio, stripe, pagenr);
+- if (PageUptodate(p))
+- continue;
+-
+ ret = rbio_add_io_page(rbio, &bio_list,
+ rbio_stripe_page(rbio, stripe, pagenr),
+ stripe, pagenr, rbio->stripe_len);
+@@ -2134,8 +2175,7 @@ int raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc,
+ }
+
+ rbio->operation = BTRFS_RBIO_READ_REBUILD;
+- bio_list_add(&rbio->bio_list, bio);
+- rbio->bio_list_bytes = bio->bi_iter.bi_size;
++ rbio_add_bio(rbio, bio);
+
+ rbio->faila = find_logical_bio_stripe(rbio, bio);
+ if (rbio->faila == -1) {
+diff --git a/include/linux/kexec.h b/include/linux/kexec.h
+index f3e7680befcc4..6a349ef1619f6 100644
+--- a/include/linux/kexec.h
++++ b/include/linux/kexec.h
+@@ -19,6 +19,7 @@
+ #include <asm/io.h>
+
+ #include <uapi/linux/kexec.h>
++#include <linux/verification.h>
+
+ /* Location of a reserved region to hold the crash kernel.
+ */
+@@ -212,6 +213,12 @@ static inline void *arch_kexec_kernel_image_load(struct kimage *image)
+ }
+ #endif
+
++#ifdef CONFIG_KEXEC_SIG
++#ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION
++int kexec_kernel_verify_pe_sig(const char *kernel, unsigned long kernel_len);
++#endif
++#endif
++
+ extern int kexec_add_buffer(struct kexec_buf *kbuf);
+ int kexec_locate_mem_hole(struct kexec_buf *kbuf);
+
+diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
+index ad005cd184a47..cc3179140a9f9 100644
+--- a/kernel/kexec_file.c
++++ b/kernel/kexec_file.c
+@@ -123,6 +123,23 @@ void kimage_file_post_load_cleanup(struct kimage *image)
+ }
+
+ #ifdef CONFIG_KEXEC_SIG
++#ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION
++int kexec_kernel_verify_pe_sig(const char *kernel, unsigned long kernel_len)
++{
++ int ret;
++
++ ret = verify_pefile_signature(kernel, kernel_len,
++ VERIFY_USE_SECONDARY_KEYRING,
++ VERIFYING_KEXEC_PE_SIGNATURE);
++ if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) {
++ ret = verify_pefile_signature(kernel, kernel_len,
++ VERIFY_USE_PLATFORM_KEYRING,
++ VERIFYING_KEXEC_PE_SIGNATURE);
++ }
++ return ret;
++}
++#endif
++
+ static int kexec_image_verify_sig(struct kimage *image, void *buf,
+ unsigned long buf_len)
+ {
+diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
+index 3f935cbbaff66..48712bc51bda7 100644
+--- a/net/sched/cls_route.c
++++ b/net/sched/cls_route.c
+@@ -424,6 +424,11 @@ static int route4_set_parms(struct net *net, struct tcf_proto *tp,
+ return -EINVAL;
+ }
+
++ if (!nhandle) {
++ NL_SET_ERR_MSG(extack, "Replacing with handle of 0 is invalid");
++ return -EINVAL;
++ }
++
+ h1 = to_hash(nhandle);
+ b = rtnl_dereference(head->table[h1]);
+ if (!b) {
+@@ -477,6 +482,11 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
+ int err;
+ bool new = true;
+
++ if (!handle) {
++ NL_SET_ERR_MSG(extack, "Creating with handle of 0 is invalid");
++ return -EINVAL;
++ }
++
+ if (opt == NULL)
+ return handle ? -EINVAL : 0;
+
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [gentoo-commits] proj/linux-patches:5.18 commit in: /
@ 2022-08-21 19:59 Mike Pagano
0 siblings, 0 replies; 31+ messages in thread
From: Mike Pagano @ 2022-08-21 19:59 UTC (permalink / raw
To: gentoo-commits
commit: 5df6212a6d56876ab202675521e9b205f40bbae8
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sun Aug 21 19:59:14 2022 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sun Aug 21 19:59:14 2022 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=5df6212a
Remove broken BMQ Patch. This kernel series is EOL
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 8 -
5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch | 9916 --------------------------
5021_BMQ-and-PDS-gentoo-defaults.patch | 13 -
3 files changed, 9937 deletions(-)
diff --git a/0000_README b/0000_README
index b8b0f1ff..ecfddca1 100644
--- a/0000_README
+++ b/0000_README
@@ -154,11 +154,3 @@ Desc: Add Gentoo Linux support config settings and defaults.
Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
-
-Patch: 5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch
-From: https://gitlab.com/alfredchen/linux-prjc
-Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
-
-Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
-From: https://gitweb.gentoo.org/proj/linux-patches.git/
-Desc: Set defaults for BMQ. Add archs as people test, default to N
diff --git a/5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch b/5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch
deleted file mode 100644
index cf13d856..00000000
--- a/5020_BMQ-and-PDS-io-scheduler-v5.18-r2.patch
+++ /dev/null
@@ -1,9916 +0,0 @@
-diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
-index 3f1cc5e317ed..e6f88a16732b 100644
---- a/Documentation/admin-guide/kernel-parameters.txt
-+++ b/Documentation/admin-guide/kernel-parameters.txt
-@@ -5164,6 +5164,12 @@
- sa1100ir [NET]
- See drivers/net/irda/sa1100_ir.c.
-
-+ sched_timeslice=
-+ [KNL] Time slice in ms for Project C BMQ/PDS scheduler.
-+ Format: integer 2, 4
-+ Default: 4
-+ See Documentation/scheduler/sched-BMQ.txt
-+
- sched_verbose [KNL] Enables verbose scheduler debug messages.
-
- schedstats= [KNL,X86] Enable or disable scheduled statistics.
-diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
-index 1144ea3229a3..2accee67d6fb 100644
---- a/Documentation/admin-guide/sysctl/kernel.rst
-+++ b/Documentation/admin-guide/sysctl/kernel.rst
-@@ -1517,3 +1517,13 @@ is 10 seconds.
-
- The softlockup threshold is (``2 * watchdog_thresh``). Setting this
- tunable to zero will disable lockup detection altogether.
-+
-+yield_type:
-+===========
-+
-+BMQ/PDS CPU scheduler only. This determines what type of yield calls
-+to sched_yield will perform.
-+
-+ 0 - No yield.
-+ 1 - Deboost and requeue task. (default)
-+ 2 - Set run queue skip task.
-diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
-new file mode 100644
-index 000000000000..05c84eec0f31
---- /dev/null
-+++ b/Documentation/scheduler/sched-BMQ.txt
-@@ -0,0 +1,110 @@
-+ BitMap queue CPU Scheduler
-+ --------------------------
-+
-+CONTENT
-+========
-+
-+ Background
-+ Design
-+ Overview
-+ Task policy
-+ Priority management
-+ BitMap Queue
-+ CPU Assignment and Migration
-+
-+
-+Background
-+==========
-+
-+BitMap Queue CPU scheduler, referred to as BMQ from here on, is an evolution
-+of previous Priority and Deadline based Skiplist multiple queue scheduler(PDS),
-+and inspired by Zircon scheduler. The goal of it is to keep the scheduler code
-+simple, while efficiency and scalable for interactive tasks, such as desktop,
-+movie playback and gaming etc.
-+
-+Design
-+======
-+
-+Overview
-+--------
-+
-+BMQ use per CPU run queue design, each CPU(logical) has it's own run queue,
-+each CPU is responsible for scheduling the tasks that are putting into it's
-+run queue.
-+
-+The run queue is a set of priority queues. Note that these queues are fifo
-+queue for non-rt tasks or priority queue for rt tasks in data structure. See
-+BitMap Queue below for details. BMQ is optimized for non-rt tasks in the fact
-+that most applications are non-rt tasks. No matter the queue is fifo or
-+priority, In each queue is an ordered list of runnable tasks awaiting execution
-+and the data structures are the same. When it is time for a new task to run,
-+the scheduler simply looks the lowest numbered queueue that contains a task,
-+and runs the first task from the head of that queue. And per CPU idle task is
-+also in the run queue, so the scheduler can always find a task to run on from
-+its run queue.
-+
-+Each task will assigned the same timeslice(default 4ms) when it is picked to
-+start running. Task will be reinserted at the end of the appropriate priority
-+queue when it uses its whole timeslice. When the scheduler selects a new task
-+from the priority queue it sets the CPU's preemption timer for the remainder of
-+the previous timeslice. When that timer fires the scheduler will stop execution
-+on that task, select another task and start over again.
-+
-+If a task blocks waiting for a shared resource then it's taken out of its
-+priority queue and is placed in a wait queue for the shared resource. When it
-+is unblocked it will be reinserted in the appropriate priority queue of an
-+eligible CPU.
-+
-+Task policy
-+-----------
-+
-+BMQ supports DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policy like the
-+mainline CFS scheduler. But BMQ is heavy optimized for non-rt task, that's
-+NORMAL/BATCH/IDLE policy tasks. Below is the implementation detail of each
-+policy.
-+
-+DEADLINE
-+ It is squashed as priority 0 FIFO task.
-+
-+FIFO/RR
-+ All RT tasks share one single priority queue in BMQ run queue designed. The
-+complexity of insert operation is O(n). BMQ is not designed for system runs
-+with major rt policy tasks.
-+
-+NORMAL/BATCH/IDLE
-+ BATCH and IDLE tasks are treated as the same policy. They compete CPU with
-+NORMAL policy tasks, but they just don't boost. To control the priority of
-+NORMAL/BATCH/IDLE tasks, simply use nice level.
-+
-+ISO
-+ ISO policy is not supported in BMQ. Please use nice level -20 NORMAL policy
-+task instead.
-+
-+Priority management
-+-------------------
-+
-+RT tasks have priority from 0-99. For non-rt tasks, there are three different
-+factors used to determine the effective priority of a task. The effective
-+priority being what is used to determine which queue it will be in.
-+
-+The first factor is simply the task’s static priority. Which is assigned from
-+task's nice level, within [-20, 19] in userland's point of view and [0, 39]
-+internally.
-+
-+The second factor is the priority boost. This is a value bounded between
-+[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] used to offset the base priority, it is
-+modified by the following cases:
-+
-+*When a thread has used up its entire timeslice, always deboost its boost by
-+increasing by one.
-+*When a thread gives up cpu control(voluntary or non-voluntary) to reschedule,
-+and its switch-in time(time after last switch and run) below the thredhold
-+based on its priority boost, will boost its boost by decreasing by one buti is
-+capped at 0 (won’t go negative).
-+
-+The intent in this system is to ensure that interactive threads are serviced
-+quickly. These are usually the threads that interact directly with the user
-+and cause user-perceivable latency. These threads usually do little work and
-+spend most of their time blocked awaiting another user event. So they get the
-+priority boost from unblocking while background threads that do most of the
-+processing receive the priority penalty for using their entire timeslice.
-diff --git a/fs/proc/base.c b/fs/proc/base.c
-index c1031843cc6a..f2b0af41a3eb 100644
---- a/fs/proc/base.c
-+++ b/fs/proc/base.c
-@@ -479,7 +479,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
- seq_puts(m, "0 0 0\n");
- else
- seq_printf(m, "%llu %llu %lu\n",
-- (unsigned long long)task->se.sum_exec_runtime,
-+ (unsigned long long)tsk_seruntime(task),
- (unsigned long long)task->sched_info.run_delay,
- task->sched_info.pcount);
-
-diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
-index 8874f681b056..59eb72bf7d5f 100644
---- a/include/asm-generic/resource.h
-+++ b/include/asm-generic/resource.h
-@@ -23,7 +23,7 @@
- [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
- [RLIMIT_SIGPENDING] = { 0, 0 }, \
- [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
-- [RLIMIT_NICE] = { 0, 0 }, \
-+ [RLIMIT_NICE] = { 30, 30 }, \
- [RLIMIT_RTPRIO] = { 0, 0 }, \
- [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
- }
-diff --git a/include/linux/sched.h b/include/linux/sched.h
-index a8911b1f35aa..7a4bf3a0db5a 100644
---- a/include/linux/sched.h
-+++ b/include/linux/sched.h
-@@ -753,8 +753,14 @@ struct task_struct {
- unsigned int ptrace;
-
- #ifdef CONFIG_SMP
-- int on_cpu;
- struct __call_single_node wake_entry;
-+#endif
-+#if defined(CONFIG_SMP) || defined(CONFIG_SCHED_ALT)
-+ int on_cpu;
-+#endif
-+
-+#ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- unsigned int wakee_flips;
- unsigned long wakee_flip_decay_ts;
- struct task_struct *last_wakee;
-@@ -768,6 +774,7 @@ struct task_struct {
- */
- int recent_used_cpu;
- int wake_cpu;
-+#endif /* !CONFIG_SCHED_ALT */
- #endif
- int on_rq;
-
-@@ -776,6 +783,20 @@ struct task_struct {
- int normal_prio;
- unsigned int rt_priority;
-
-+#ifdef CONFIG_SCHED_ALT
-+ u64 last_ran;
-+ s64 time_slice;
-+ int sq_idx;
-+ struct list_head sq_node;
-+#ifdef CONFIG_SCHED_BMQ
-+ int boost_prio;
-+#endif /* CONFIG_SCHED_BMQ */
-+#ifdef CONFIG_SCHED_PDS
-+ u64 deadline;
-+#endif /* CONFIG_SCHED_PDS */
-+ /* sched_clock time spent running */
-+ u64 sched_time;
-+#else /* !CONFIG_SCHED_ALT */
- struct sched_entity se;
- struct sched_rt_entity rt;
- struct sched_dl_entity dl;
-@@ -786,6 +807,7 @@ struct task_struct {
- unsigned long core_cookie;
- unsigned int core_occupation;
- #endif
-+#endif /* !CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_CGROUP_SCHED
- struct task_group *sched_task_group;
-@@ -1516,6 +1538,15 @@ struct task_struct {
- */
- };
-
-+#ifdef CONFIG_SCHED_ALT
-+#define tsk_seruntime(t) ((t)->sched_time)
-+/* replace the uncertian rt_timeout with 0UL */
-+#define tsk_rttimeout(t) (0UL)
-+#else /* CFS */
-+#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
-+#define tsk_rttimeout(t) ((t)->rt.timeout)
-+#endif /* !CONFIG_SCHED_ALT */
-+
- static inline struct pid *task_pid(struct task_struct *task)
- {
- return task->thread_pid;
-diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
-index 7c83d4d5a971..fa30f98cb2be 100644
---- a/include/linux/sched/deadline.h
-+++ b/include/linux/sched/deadline.h
-@@ -1,5 +1,24 @@
- /* SPDX-License-Identifier: GPL-2.0 */
-
-+#ifdef CONFIG_SCHED_ALT
-+
-+static inline int dl_task(struct task_struct *p)
-+{
-+ return 0;
-+}
-+
-+#ifdef CONFIG_SCHED_BMQ
-+#define __tsk_deadline(p) (0UL)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
-+#endif
-+
-+#else
-+
-+#define __tsk_deadline(p) ((p)->dl.deadline)
-+
- /*
- * SCHED_DEADLINE tasks has negative priorities, reflecting
- * the fact that any of them has higher prio than RT and
-@@ -21,6 +40,7 @@ static inline int dl_task(struct task_struct *p)
- {
- return dl_prio(p->prio);
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- static inline bool dl_time_before(u64 a, u64 b)
- {
-diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
-index ab83d85e1183..6af9ae681116 100644
---- a/include/linux/sched/prio.h
-+++ b/include/linux/sched/prio.h
-@@ -18,6 +18,32 @@
- #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
- #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
-
-+#ifdef CONFIG_SCHED_ALT
-+
-+/* Undefine MAX_PRIO and DEFAULT_PRIO */
-+#undef MAX_PRIO
-+#undef DEFAULT_PRIO
-+
-+/* +/- priority levels from the base priority */
-+#ifdef CONFIG_SCHED_BMQ
-+#define MAX_PRIORITY_ADJ (7)
-+
-+#define MIN_NORMAL_PRIO (MAX_RT_PRIO)
-+#define MAX_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH)
-+#define DEFAULT_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH / 2)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+#define MAX_PRIORITY_ADJ (0)
-+
-+#define MIN_NORMAL_PRIO (128)
-+#define NORMAL_PRIO_NUM (64)
-+#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
-+#define DEFAULT_PRIO (MAX_PRIO - NICE_WIDTH / 2)
-+#endif
-+
-+#endif /* CONFIG_SCHED_ALT */
-+
- /*
- * Convert user-nice values [ -20 ... 0 ... 19 ]
- * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
-diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
-index e5af028c08b4..0a7565d0d3cf 100644
---- a/include/linux/sched/rt.h
-+++ b/include/linux/sched/rt.h
-@@ -24,8 +24,10 @@ static inline bool task_is_realtime(struct task_struct *tsk)
-
- if (policy == SCHED_FIFO || policy == SCHED_RR)
- return true;
-+#ifndef CONFIG_SCHED_ALT
- if (policy == SCHED_DEADLINE)
- return true;
-+#endif
- return false;
- }
-
-diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
-index 56cffe42abbc..e020fc572b22 100644
---- a/include/linux/sched/topology.h
-+++ b/include/linux/sched/topology.h
-@@ -233,7 +233,8 @@ static inline bool cpus_share_cache(int this_cpu, int that_cpu)
-
- #endif /* !CONFIG_SMP */
-
--#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
-+#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
-+ !defined(CONFIG_SCHED_ALT)
- extern void rebuild_sched_domains_energy(void);
- #else
- static inline void rebuild_sched_domains_energy(void)
-diff --git a/init/Kconfig b/init/Kconfig
-index ddcbefe535e9..85616423dc94 100644
---- a/init/Kconfig
-+++ b/init/Kconfig
-@@ -821,6 +821,7 @@ menu "Scheduler features"
- config UCLAMP_TASK
- bool "Enable utilization clamping for RT/FAIR tasks"
- depends on CPU_FREQ_GOV_SCHEDUTIL
-+ depends on !SCHED_ALT
- help
- This feature enables the scheduler to track the clamped utilization
- of each CPU based on RUNNABLE tasks scheduled on that CPU.
-@@ -867,6 +868,35 @@ config UCLAMP_BUCKETS_COUNT
-
- If in doubt, use the default value.
-
-+menuconfig SCHED_ALT
-+ bool "Alternative CPU Schedulers"
-+ default y
-+ help
-+ This feature enable alternative CPU scheduler"
-+
-+if SCHED_ALT
-+
-+choice
-+ prompt "Alternative CPU Scheduler"
-+ default SCHED_BMQ
-+
-+config SCHED_BMQ
-+ bool "BMQ CPU scheduler"
-+ help
-+ The BitMap Queue CPU scheduler for excellent interactivity and
-+ responsiveness on the desktop and solid scalability on normal
-+ hardware and commodity servers.
-+
-+config SCHED_PDS
-+ bool "PDS CPU scheduler"
-+ help
-+ The Priority and Deadline based Skip list multiple queue CPU
-+ Scheduler.
-+
-+endchoice
-+
-+endif
-+
- endmenu
-
- #
-@@ -911,6 +941,7 @@ config NUMA_BALANCING
- depends on ARCH_SUPPORTS_NUMA_BALANCING
- depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
- depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
-+ depends on !SCHED_ALT
- help
- This option adds support for automatic NUMA aware memory/task placement.
- The mechanism is quite primitive and is based on migrating memory when
-@@ -1003,6 +1034,7 @@ config FAIR_GROUP_SCHED
- depends on CGROUP_SCHED
- default CGROUP_SCHED
-
-+if !SCHED_ALT
- config CFS_BANDWIDTH
- bool "CPU bandwidth provisioning for FAIR_GROUP_SCHED"
- depends on FAIR_GROUP_SCHED
-@@ -1025,6 +1057,7 @@ config RT_GROUP_SCHED
- realtime bandwidth for them.
- See Documentation/scheduler/sched-rt-group.rst for more information.
-
-+endif #!SCHED_ALT
- endif #CGROUP_SCHED
-
- config UCLAMP_TASK_GROUP
-@@ -1268,6 +1301,7 @@ config CHECKPOINT_RESTORE
-
- config SCHED_AUTOGROUP
- bool "Automatic process group scheduling"
-+ depends on !SCHED_ALT
- select CGROUPS
- select CGROUP_SCHED
- select FAIR_GROUP_SCHED
-diff --git a/init/init_task.c b/init/init_task.c
-index 73cc8f03511a..2d0bad762895 100644
---- a/init/init_task.c
-+++ b/init/init_task.c
-@@ -75,9 +75,15 @@ struct task_struct init_task
- .stack = init_stack,
- .usage = REFCOUNT_INIT(2),
- .flags = PF_KTHREAD,
-+#ifdef CONFIG_SCHED_ALT
-+ .prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
-+ .static_prio = DEFAULT_PRIO,
-+ .normal_prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
-+#else
- .prio = MAX_PRIO - 20,
- .static_prio = MAX_PRIO - 20,
- .normal_prio = MAX_PRIO - 20,
-+#endif
- .policy = SCHED_NORMAL,
- .cpus_ptr = &init_task.cpus_mask,
- .user_cpus_ptr = NULL,
-@@ -88,6 +94,17 @@ struct task_struct init_task
- .restart_block = {
- .fn = do_no_restart_syscall,
- },
-+#ifdef CONFIG_SCHED_ALT
-+ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
-+#ifdef CONFIG_SCHED_BMQ
-+ .boost_prio = 0,
-+ .sq_idx = 15,
-+#endif
-+#ifdef CONFIG_SCHED_PDS
-+ .deadline = 0,
-+#endif
-+ .time_slice = HZ,
-+#else
- .se = {
- .group_node = LIST_HEAD_INIT(init_task.se.group_node),
- },
-@@ -95,6 +112,7 @@ struct task_struct init_task
- .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
- .time_slice = RR_TIMESLICE,
- },
-+#endif
- .tasks = LIST_HEAD_INIT(init_task.tasks),
- #ifdef CONFIG_SMP
- .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
-diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
-index c2f1fd95a821..41654679b1b2 100644
---- a/kernel/Kconfig.preempt
-+++ b/kernel/Kconfig.preempt
-@@ -117,7 +117,7 @@ config PREEMPT_DYNAMIC
-
- config SCHED_CORE
- bool "Core Scheduling for SMT"
-- depends on SCHED_SMT
-+ depends on SCHED_SMT && !SCHED_ALT
- help
- This option permits Core Scheduling, a means of coordinated task
- selection across SMT siblings. When enabled -- see
-diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
-index 71a418858a5e..7e3016873db1 100644
---- a/kernel/cgroup/cpuset.c
-+++ b/kernel/cgroup/cpuset.c
-@@ -704,7 +704,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
- return ret;
- }
-
--#ifdef CONFIG_SMP
-+#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
- /*
- * Helper routine for generate_sched_domains().
- * Do cpusets a, b have overlapping effective cpus_allowed masks?
-@@ -1100,7 +1100,7 @@ static void rebuild_sched_domains_locked(void)
- /* Have scheduler rebuild the domains */
- partition_and_rebuild_sched_domains(ndoms, doms, attr);
- }
--#else /* !CONFIG_SMP */
-+#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
- static void rebuild_sched_domains_locked(void)
- {
- }
-diff --git a/kernel/delayacct.c b/kernel/delayacct.c
-index c5e8cea9e05f..8e90b2a3667a 100644
---- a/kernel/delayacct.c
-+++ b/kernel/delayacct.c
-@@ -130,7 +130,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
- */
- t1 = tsk->sched_info.pcount;
- t2 = tsk->sched_info.run_delay;
-- t3 = tsk->se.sum_exec_runtime;
-+ t3 = tsk_seruntime(tsk);
-
- d->cpu_count += t1;
-
-diff --git a/kernel/exit.c b/kernel/exit.c
-index f072959fcab7..da97095a2997 100644
---- a/kernel/exit.c
-+++ b/kernel/exit.c
-@@ -124,7 +124,7 @@ static void __exit_signal(struct task_struct *tsk)
- sig->curr_target = next_thread(tsk);
- }
-
-- add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
-+ add_device_randomness((const void*) &tsk_seruntime(tsk),
- sizeof(unsigned long long));
-
- /*
-@@ -145,7 +145,7 @@ static void __exit_signal(struct task_struct *tsk)
- sig->inblock += task_io_get_inblock(tsk);
- sig->oublock += task_io_get_oublock(tsk);
- task_io_accounting_add(&sig->ioac, &tsk->ioac);
-- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
-+ sig->sum_sched_runtime += tsk_seruntime(tsk);
- sig->nr_threads--;
- __unhash_process(tsk, group_dead);
- write_sequnlock(&sig->stats_lock);
-diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
-index 8555c4efe97c..a2b3bd3fd85c 100644
---- a/kernel/locking/rtmutex.c
-+++ b/kernel/locking/rtmutex.c
-@@ -298,21 +298,25 @@ static __always_inline void
- waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
- {
- waiter->prio = __waiter_prio(task);
-- waiter->deadline = task->dl.deadline;
-+ waiter->deadline = __tsk_deadline(task);
- }
-
- /*
- * Only use with rt_mutex_waiter_{less,equal}()
- */
- #define task_to_waiter(p) \
-- &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
-+ &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
-
- static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
- struct rt_mutex_waiter *right)
- {
-+#ifdef CONFIG_SCHED_PDS
-+ return (left->deadline < right->deadline);
-+#else
- if (left->prio < right->prio)
- return 1;
-
-+#ifndef CONFIG_SCHED_BMQ
- /*
- * If both waiters have dl_prio(), we check the deadlines of the
- * associated tasks.
-@@ -321,16 +325,22 @@ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
- */
- if (dl_prio(left->prio))
- return dl_time_before(left->deadline, right->deadline);
-+#endif
-
- return 0;
-+#endif
- }
-
- static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
- struct rt_mutex_waiter *right)
- {
-+#ifdef CONFIG_SCHED_PDS
-+ return (left->deadline == right->deadline);
-+#else
- if (left->prio != right->prio)
- return 0;
-
-+#ifndef CONFIG_SCHED_BMQ
- /*
- * If both waiters have dl_prio(), we check the deadlines of the
- * associated tasks.
-@@ -339,8 +349,10 @@ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
- */
- if (dl_prio(left->prio))
- return left->deadline == right->deadline;
-+#endif
-
- return 1;
-+#endif
- }
-
- static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
-diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
-index 976092b7bd45..31d587c16ec1 100644
---- a/kernel/sched/Makefile
-+++ b/kernel/sched/Makefile
-@@ -28,7 +28,12 @@ endif
- # These compilation units have roughly the same size and complexity - so their
- # build parallelizes well and finishes roughly at once:
- #
-+ifdef CONFIG_SCHED_ALT
-+obj-y += alt_core.o
-+obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
-+else
- obj-y += core.o
- obj-y += fair.o
-+endif
- obj-y += build_policy.o
- obj-y += build_utility.o
-diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
-new file mode 100644
-index 000000000000..b8e67d568e17
---- /dev/null
-+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,7750 @@
-+/*
-+ * kernel/sched/alt_core.c
-+ *
-+ * Core alternative kernel scheduler code and related syscalls
-+ *
-+ * Copyright (C) 1991-2002 Linus Torvalds
-+ *
-+ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
-+ * a whole lot of those previous things.
-+ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
-+ * scheduler by Alfred Chen.
-+ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
-+ */
-+#include <linux/sched/cputime.h>
-+#include <linux/sched/debug.h>
-+#include <linux/sched/isolation.h>
-+#include <linux/sched/loadavg.h>
-+#include <linux/sched/mm.h>
-+#include <linux/sched/nohz.h>
-+#include <linux/sched/stat.h>
-+#include <linux/sched/wake_q.h>
-+
-+#include <linux/blkdev.h>
-+#include <linux/context_tracking.h>
-+#include <linux/cpuset.h>
-+#include <linux/delayacct.h>
-+#include <linux/init_task.h>
-+#include <linux/kcov.h>
-+#include <linux/kprobes.h>
-+#include <linux/profile.h>
-+#include <linux/nmi.h>
-+#include <linux/scs.h>
-+
-+#include <uapi/linux/sched/types.h>
-+
-+#include <asm/switch_to.h>
-+
-+#define CREATE_TRACE_POINTS
-+#include <trace/events/sched.h>
-+#undef CREATE_TRACE_POINTS
-+
-+#include "sched.h"
-+
-+#include "pelt.h"
-+
-+#include "../../fs/io-wq.h"
-+#include "../smpboot.h"
-+
-+/*
-+ * Export tracepoints that act as a bare tracehook (ie: have no trace event
-+ * associated with them) to allow external modules to probe them.
-+ */
-+EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+#define sched_feat(x) (1)
-+/*
-+ * Print a warning if need_resched is set for the given duration (if
-+ * LATENCY_WARN is enabled).
-+ *
-+ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
-+ * per boot.
-+ */
-+__read_mostly int sysctl_resched_latency_warn_ms = 100;
-+__read_mostly int sysctl_resched_latency_warn_once = 1;
-+#else
-+#define sched_feat(x) (0)
-+#endif /* CONFIG_SCHED_DEBUG */
-+
-+#define ALT_SCHED_VERSION "v5.18-r2"
-+
-+/* rt_prio(prio) defined in include/linux/sched/rt.h */
-+#define rt_task(p) rt_prio((p)->prio)
-+#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
-+#define task_has_rt_policy(p) (rt_policy((p)->policy))
-+
-+#define STOP_PRIO (MAX_RT_PRIO - 1)
-+
-+/* Default time slice is 4 in ms, can be set via kernel parameter "sched_timeslice" */
-+u64 sched_timeslice_ns __read_mostly = (4 << 20);
-+
-+static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx);
-+
-+#ifdef CONFIG_SCHED_BMQ
-+#include "bmq.h"
-+#endif
-+#ifdef CONFIG_SCHED_PDS
-+#include "pds.h"
-+#endif
-+
-+static int __init sched_timeslice(char *str)
-+{
-+ int timeslice_ms;
-+
-+ get_option(&str, ×lice_ms);
-+ if (2 != timeslice_ms)
-+ timeslice_ms = 4;
-+ sched_timeslice_ns = timeslice_ms << 20;
-+ sched_timeslice_imp(timeslice_ms);
-+
-+ return 0;
-+}
-+early_param("sched_timeslice", sched_timeslice);
-+
-+/* Reschedule if less than this many μs left */
-+#define RESCHED_NS (100 << 10)
-+
-+/**
-+ * sched_yield_type - Choose what sort of yield sched_yield will perform.
-+ * 0: No yield.
-+ * 1: Deboost and requeue task. (default)
-+ * 2: Set rq skip task.
-+ */
-+int sched_yield_type __read_mostly = 1;
-+
-+#ifdef CONFIG_SMP
-+static cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
-+
-+DEFINE_PER_CPU(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
-+DEFINE_PER_CPU(cpumask_t *, sched_cpu_llc_mask);
-+DEFINE_PER_CPU(cpumask_t *, sched_cpu_topo_end_mask);
-+
-+#ifdef CONFIG_SCHED_SMT
-+DEFINE_STATIC_KEY_FALSE(sched_smt_present);
-+EXPORT_SYMBOL_GPL(sched_smt_present);
-+#endif
-+
-+/*
-+ * Keep a unique ID per domain (we use the first CPUs number in the cpumask of
-+ * the domain), this allows us to quickly tell if two cpus are in the same cache
-+ * domain, see cpus_share_cache().
-+ */
-+DEFINE_PER_CPU(int, sd_llc_id);
-+#endif /* CONFIG_SMP */
-+
-+static DEFINE_MUTEX(sched_hotcpu_mutex);
-+
-+DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-+
-+#ifndef prepare_arch_switch
-+# define prepare_arch_switch(next) do { } while (0)
-+#endif
-+#ifndef finish_arch_post_lock_switch
-+# define finish_arch_post_lock_switch() do { } while (0)
-+#endif
-+
-+#ifdef CONFIG_SCHED_SMT
-+static cpumask_t sched_sg_idle_mask ____cacheline_aligned_in_smp;
-+#endif
-+static cpumask_t sched_rq_watermark[SCHED_QUEUE_BITS] ____cacheline_aligned_in_smp;
-+
-+/* sched_queue related functions */
-+static inline void sched_queue_init(struct sched_queue *q)
-+{
-+ int i;
-+
-+ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
-+ for(i = 0; i < SCHED_BITS; i++)
-+ INIT_LIST_HEAD(&q->heads[i]);
-+}
-+
-+/*
-+ * Init idle task and put into queue structure of rq
-+ * IMPORTANT: may be called multiple times for a single cpu
-+ */
-+static inline void sched_queue_init_idle(struct sched_queue *q,
-+ struct task_struct *idle)
-+{
-+ idle->sq_idx = IDLE_TASK_SCHED_PRIO;
-+ INIT_LIST_HEAD(&q->heads[idle->sq_idx]);
-+ list_add(&idle->sq_node, &q->heads[idle->sq_idx]);
-+}
-+
-+/* water mark related functions */
-+static inline void update_sched_rq_watermark(struct rq *rq)
-+{
-+ unsigned long watermark = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
-+ unsigned long last_wm = rq->watermark;
-+ unsigned long i;
-+ int cpu;
-+
-+ if (watermark == last_wm)
-+ return;
-+
-+ rq->watermark = watermark;
-+ cpu = cpu_of(rq);
-+ if (watermark < last_wm) {
-+ for (i = last_wm; i > watermark; i--)
-+ cpumask_clear_cpu(cpu, sched_rq_watermark + SCHED_QUEUE_BITS - i);
-+#ifdef CONFIG_SCHED_SMT
-+ if (static_branch_likely(&sched_smt_present) &&
-+ IDLE_TASK_SCHED_PRIO == last_wm)
-+ cpumask_andnot(&sched_sg_idle_mask,
-+ &sched_sg_idle_mask, cpu_smt_mask(cpu));
-+#endif
-+ return;
-+ }
-+ /* last_wm < watermark */
-+ for (i = watermark; i > last_wm; i--)
-+ cpumask_set_cpu(cpu, sched_rq_watermark + SCHED_QUEUE_BITS - i);
-+#ifdef CONFIG_SCHED_SMT
-+ if (static_branch_likely(&sched_smt_present) &&
-+ IDLE_TASK_SCHED_PRIO == watermark) {
-+ cpumask_t tmp;
-+
-+ cpumask_and(&tmp, cpu_smt_mask(cpu), sched_rq_watermark);
-+ if (cpumask_equal(&tmp, cpu_smt_mask(cpu)))
-+ cpumask_or(&sched_sg_idle_mask,
-+ &sched_sg_idle_mask, cpu_smt_mask(cpu));
-+ }
-+#endif
-+}
-+
-+/*
-+ * This routine assume that the idle task always in queue
-+ */
-+static inline struct task_struct *sched_rq_first_task(struct rq *rq)
-+{
-+ unsigned long idx = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
-+ const struct list_head *head = &rq->queue.heads[sched_prio2idx(idx, rq)];
-+
-+ return list_first_entry(head, struct task_struct, sq_node);
-+}
-+
-+static inline struct task_struct *
-+sched_rq_next_task(struct task_struct *p, struct rq *rq)
-+{
-+ unsigned long idx = p->sq_idx;
-+ struct list_head *head = &rq->queue.heads[idx];
-+
-+ if (list_is_last(&p->sq_node, head)) {
-+ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
-+ sched_idx2prio(idx, rq) + 1);
-+ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
-+
-+ return list_first_entry(head, struct task_struct, sq_node);
-+ }
-+
-+ return list_next_entry(p, sq_node);
-+}
-+
-+static inline struct task_struct *rq_runnable_task(struct rq *rq)
-+{
-+ struct task_struct *next = sched_rq_first_task(rq);
-+
-+ if (unlikely(next == rq->skip))
-+ next = sched_rq_next_task(next, rq);
-+
-+ return next;
-+}
-+
-+/*
-+ * Serialization rules:
-+ *
-+ * Lock order:
-+ *
-+ * p->pi_lock
-+ * rq->lock
-+ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
-+ *
-+ * rq1->lock
-+ * rq2->lock where: rq1 < rq2
-+ *
-+ * Regular state:
-+ *
-+ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
-+ * local CPU's rq->lock, it optionally removes the task from the runqueue and
-+ * always looks at the local rq data structures to find the most eligible task
-+ * to run next.
-+ *
-+ * Task enqueue is also under rq->lock, possibly taken from another CPU.
-+ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
-+ * the local CPU to avoid bouncing the runqueue state around [ see
-+ * ttwu_queue_wakelist() ]
-+ *
-+ * Task wakeup, specifically wakeups that involve migration, are horribly
-+ * complicated to avoid having to take two rq->locks.
-+ *
-+ * Special state:
-+ *
-+ * System-calls and anything external will use task_rq_lock() which acquires
-+ * both p->pi_lock and rq->lock. As a consequence the state they change is
-+ * stable while holding either lock:
-+ *
-+ * - sched_setaffinity()/
-+ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
-+ * - set_user_nice(): p->se.load, p->*prio
-+ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
-+ * p->se.load, p->rt_priority,
-+ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
-+ * - sched_setnuma(): p->numa_preferred_nid
-+ * - sched_move_task()/
-+ * cpu_cgroup_fork(): p->sched_task_group
-+ * - uclamp_update_active() p->uclamp*
-+ *
-+ * p->state <- TASK_*:
-+ *
-+ * is changed locklessly using set_current_state(), __set_current_state() or
-+ * set_special_state(), see their respective comments, or by
-+ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
-+ * concurrent self.
-+ *
-+ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
-+ *
-+ * is set by activate_task() and cleared by deactivate_task(), under
-+ * rq->lock. Non-zero indicates the task is runnable, the special
-+ * ON_RQ_MIGRATING state is used for migration without holding both
-+ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
-+ *
-+ * p->on_cpu <- { 0, 1 }:
-+ *
-+ * is set by prepare_task() and cleared by finish_task() such that it will be
-+ * set before p is scheduled-in and cleared after p is scheduled-out, both
-+ * under rq->lock. Non-zero indicates the task is running on its CPU.
-+ *
-+ * [ The astute reader will observe that it is possible for two tasks on one
-+ * CPU to have ->on_cpu = 1 at the same time. ]
-+ *
-+ * task_cpu(p): is changed by set_task_cpu(), the rules are:
-+ *
-+ * - Don't call set_task_cpu() on a blocked task:
-+ *
-+ * We don't care what CPU we're not running on, this simplifies hotplug,
-+ * the CPU assignment of blocked tasks isn't required to be valid.
-+ *
-+ * - for try_to_wake_up(), called under p->pi_lock:
-+ *
-+ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
-+ *
-+ * - for migration called under rq->lock:
-+ * [ see task_on_rq_migrating() in task_rq_lock() ]
-+ *
-+ * o move_queued_task()
-+ * o detach_task()
-+ *
-+ * - for migration called under double_rq_lock():
-+ *
-+ * o __migrate_swap_task()
-+ * o push_rt_task() / pull_rt_task()
-+ * o push_dl_task() / pull_dl_task()
-+ * o dl_task_offline_migration()
-+ *
-+ */
-+
-+/*
-+ * Context: p->pi_lock
-+ */
-+static inline struct rq
-+*__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
-+{
-+ struct rq *rq;
-+ for (;;) {
-+ rq = task_rq(p);
-+ if (p->on_cpu || task_on_rq_queued(p)) {
-+ raw_spin_lock(&rq->lock);
-+ if (likely((p->on_cpu || task_on_rq_queued(p))
-+ && rq == task_rq(p))) {
-+ *plock = &rq->lock;
-+ return rq;
-+ }
-+ raw_spin_unlock(&rq->lock);
-+ } else if (task_on_rq_migrating(p)) {
-+ do {
-+ cpu_relax();
-+ } while (unlikely(task_on_rq_migrating(p)));
-+ } else {
-+ *plock = NULL;
-+ return rq;
-+ }
-+ }
-+}
-+
-+static inline void
-+__task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
-+{
-+ if (NULL != lock)
-+ raw_spin_unlock(lock);
-+}
-+
-+static inline struct rq
-+*task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock,
-+ unsigned long *flags)
-+{
-+ struct rq *rq;
-+ for (;;) {
-+ rq = task_rq(p);
-+ if (p->on_cpu || task_on_rq_queued(p)) {
-+ raw_spin_lock_irqsave(&rq->lock, *flags);
-+ if (likely((p->on_cpu || task_on_rq_queued(p))
-+ && rq == task_rq(p))) {
-+ *plock = &rq->lock;
-+ return rq;
-+ }
-+ raw_spin_unlock_irqrestore(&rq->lock, *flags);
-+ } else if (task_on_rq_migrating(p)) {
-+ do {
-+ cpu_relax();
-+ } while (unlikely(task_on_rq_migrating(p)));
-+ } else {
-+ raw_spin_lock_irqsave(&p->pi_lock, *flags);
-+ if (likely(!p->on_cpu && !p->on_rq &&
-+ rq == task_rq(p))) {
-+ *plock = &p->pi_lock;
-+ return rq;
-+ }
-+ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
-+ }
-+ }
-+}
-+
-+static inline void
-+task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock,
-+ unsigned long *flags)
-+{
-+ raw_spin_unlock_irqrestore(lock, *flags);
-+}
-+
-+/*
-+ * __task_rq_lock - lock the rq @p resides on.
-+ */
-+struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ lockdep_assert_held(&p->pi_lock);
-+
-+ for (;;) {
-+ rq = task_rq(p);
-+ raw_spin_lock(&rq->lock);
-+ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
-+ return rq;
-+ raw_spin_unlock(&rq->lock);
-+
-+ while (unlikely(task_on_rq_migrating(p)))
-+ cpu_relax();
-+ }
-+}
-+
-+/*
-+ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
-+ */
-+struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(p->pi_lock)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ for (;;) {
-+ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
-+ rq = task_rq(p);
-+ raw_spin_lock(&rq->lock);
-+ /*
-+ * move_queued_task() task_rq_lock()
-+ *
-+ * ACQUIRE (rq->lock)
-+ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
-+ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
-+ * [S] ->cpu = new_cpu [L] task_rq()
-+ * [L] ->on_rq
-+ * RELEASE (rq->lock)
-+ *
-+ * If we observe the old CPU in task_rq_lock(), the acquire of
-+ * the old rq->lock will fully serialize against the stores.
-+ *
-+ * If we observe the new CPU in task_rq_lock(), the address
-+ * dependency headed by '[L] rq = task_rq()' and the acquire
-+ * will pair with the WMB to ensure we then also see migrating.
-+ */
-+ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
-+ return rq;
-+ }
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
-+
-+ while (unlikely(task_on_rq_migrating(p)))
-+ cpu_relax();
-+ }
-+}
-+
-+static inline void
-+rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock_irqsave(&rq->lock, rf->flags);
-+}
-+
-+static inline void
-+rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
-+}
-+
-+void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
-+{
-+ raw_spinlock_t *lock;
-+
-+ /* Matches synchronize_rcu() in __sched_core_enable() */
-+ preempt_disable();
-+
-+ for (;;) {
-+ lock = __rq_lockp(rq);
-+ raw_spin_lock_nested(lock, subclass);
-+ if (likely(lock == __rq_lockp(rq))) {
-+ /* preempt_count *MUST* be > 1 */
-+ preempt_enable_no_resched();
-+ return;
-+ }
-+ raw_spin_unlock(lock);
-+ }
-+}
-+
-+void raw_spin_rq_unlock(struct rq *rq)
-+{
-+ raw_spin_unlock(rq_lockp(rq));
-+}
-+
-+/*
-+ * RQ-clock updating methods:
-+ */
-+
-+static void update_rq_clock_task(struct rq *rq, s64 delta)
-+{
-+/*
-+ * In theory, the compile should just see 0 here, and optimize out the call
-+ * to sched_rt_avg_update. But I don't trust it...
-+ */
-+ s64 __maybe_unused steal = 0, irq_delta = 0;
-+
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
-+
-+ /*
-+ * Since irq_time is only updated on {soft,}irq_exit, we might run into
-+ * this case when a previous update_rq_clock() happened inside a
-+ * {soft,}irq region.
-+ *
-+ * When this happens, we stop ->clock_task and only update the
-+ * prev_irq_time stamp to account for the part that fit, so that a next
-+ * update will consume the rest. This ensures ->clock_task is
-+ * monotonic.
-+ *
-+ * It does however cause some slight miss-attribution of {soft,}irq
-+ * time, a more accurate solution would be to update the irq_time using
-+ * the current rq->clock timestamp, except that would require using
-+ * atomic ops.
-+ */
-+ if (irq_delta > delta)
-+ irq_delta = delta;
-+
-+ rq->prev_irq_time += irq_delta;
-+ delta -= irq_delta;
-+#endif
-+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
-+ if (static_key_false((¶virt_steal_rq_enabled))) {
-+ steal = paravirt_steal_clock(cpu_of(rq));
-+ steal -= rq->prev_steal_time_rq;
-+
-+ if (unlikely(steal > delta))
-+ steal = delta;
-+
-+ rq->prev_steal_time_rq += steal;
-+ delta -= steal;
-+ }
-+#endif
-+
-+ rq->clock_task += delta;
-+
-+#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
-+ if ((irq_delta + steal))
-+ update_irq_load_avg(rq, irq_delta + steal);
-+#endif
-+}
-+
-+static inline void update_rq_clock(struct rq *rq)
-+{
-+ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
-+
-+ if (unlikely(delta <= 0))
-+ return;
-+ rq->clock += delta;
-+ update_rq_time_edge(rq);
-+ update_rq_clock_task(rq, delta);
-+}
-+
-+/*
-+ * RQ Load update routine
-+ */
-+#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
-+#define RQ_UTIL_SHIFT (8)
-+#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
-+
-+#define LOAD_BLOCK(t) ((t) >> 17)
-+#define LOAD_HALF_BLOCK(t) ((t) >> 16)
-+#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
-+#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
-+#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
-+
-+static inline void rq_load_update(struct rq *rq)
-+{
-+ u64 time = rq->clock;
-+ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp),
-+ RQ_LOAD_HISTORY_BITS - 1);
-+ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
-+ u64 curr = !!rq->nr_running;
-+
-+ if (delta) {
-+ rq->load_history = rq->load_history >> delta;
-+
-+ if (delta < RQ_UTIL_SHIFT) {
-+ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
-+ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
-+ rq->load_history ^= LOAD_BLOCK_BIT(delta);
-+ }
-+
-+ rq->load_block = BLOCK_MASK(time) * prev;
-+ } else {
-+ rq->load_block += (time - rq->load_stamp) * prev;
-+ }
-+ if (prev ^ curr)
-+ rq->load_history ^= CURRENT_LOAD_BIT;
-+ rq->load_stamp = time;
-+}
-+
-+unsigned long rq_load_util(struct rq *rq, unsigned long max)
-+{
-+ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
-+}
-+
-+#ifdef CONFIG_SMP
-+unsigned long sched_cpu_util(int cpu, unsigned long max)
-+{
-+ return rq_load_util(cpu_rq(cpu), max);
-+}
-+#endif /* CONFIG_SMP */
-+
-+#ifdef CONFIG_CPU_FREQ
-+/**
-+ * cpufreq_update_util - Take a note about CPU utilization changes.
-+ * @rq: Runqueue to carry out the update for.
-+ * @flags: Update reason flags.
-+ *
-+ * This function is called by the scheduler on the CPU whose utilization is
-+ * being updated.
-+ *
-+ * It can only be called from RCU-sched read-side critical sections.
-+ *
-+ * The way cpufreq is currently arranged requires it to evaluate the CPU
-+ * performance state (frequency/voltage) on a regular basis to prevent it from
-+ * being stuck in a completely inadequate performance level for too long.
-+ * That is not guaranteed to happen if the updates are only triggered from CFS
-+ * and DL, though, because they may not be coming in if only RT tasks are
-+ * active all the time (or there are RT tasks only).
-+ *
-+ * As a workaround for that issue, this function is called periodically by the
-+ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
-+ * but that really is a band-aid. Going forward it should be replaced with
-+ * solutions targeted more specifically at RT tasks.
-+ */
-+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
-+{
-+ struct update_util_data *data;
-+
-+#ifdef CONFIG_SMP
-+ rq_load_update(rq);
-+#endif
-+ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
-+ cpu_of(rq)));
-+ if (data)
-+ data->func(data, rq_clock(rq), flags);
-+}
-+#else
-+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
-+{
-+#ifdef CONFIG_SMP
-+ rq_load_update(rq);
-+#endif
-+}
-+#endif /* CONFIG_CPU_FREQ */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+/*
-+ * Tick may be needed by tasks in the runqueue depending on their policy and
-+ * requirements. If tick is needed, lets send the target an IPI to kick it out
-+ * of nohz mode if necessary.
-+ */
-+static inline void sched_update_tick_dependency(struct rq *rq)
-+{
-+ int cpu = cpu_of(rq);
-+
-+ if (!tick_nohz_full_cpu(cpu))
-+ return;
-+
-+ if (rq->nr_running < 2)
-+ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
-+ else
-+ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
-+}
-+#else /* !CONFIG_NO_HZ_FULL */
-+static inline void sched_update_tick_dependency(struct rq *rq) { }
-+#endif
-+
-+bool sched_task_on_rq(struct task_struct *p)
-+{
-+ return task_on_rq_queued(p);
-+}
-+
-+unsigned long get_wchan(struct task_struct *p)
-+{
-+ unsigned long ip = 0;
-+ unsigned int state;
-+
-+ if (!p || p == current)
-+ return 0;
-+
-+ /* Only get wchan if task is blocked and we can keep it that way. */
-+ raw_spin_lock_irq(&p->pi_lock);
-+ state = READ_ONCE(p->__state);
-+ smp_rmb(); /* see try_to_wake_up() */
-+ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
-+ ip = __get_wchan(p);
-+ raw_spin_unlock_irq(&p->pi_lock);
-+
-+ return ip;
-+}
-+
-+/*
-+ * Add/Remove/Requeue task to/from the runqueue routines
-+ * Context: rq->lock
-+ */
-+#define __SCHED_DEQUEUE_TASK(p, rq, flags) \
-+ psi_dequeue(p, flags & DEQUEUE_SLEEP); \
-+ sched_info_dequeue(rq, p); \
-+ \
-+ list_del(&p->sq_node); \
-+ if (list_empty(&rq->queue.heads[p->sq_idx])) \
-+ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+
-+#define __SCHED_ENQUEUE_TASK(p, rq, flags) \
-+ sched_info_enqueue(rq, p); \
-+ psi_enqueue(p, flags); \
-+ \
-+ p->sq_idx = task_sched_prio_idx(p, rq); \
-+ list_add_tail(&p->sq_node, &rq->queue.heads[p->sq_idx]); \
-+ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+
-+static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
-+{
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->priodl);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
-+ task_cpu(p), cpu_of(rq));
-+
-+ __SCHED_DEQUEUE_TASK(p, rq, flags);
-+ --rq->nr_running;
-+#ifdef CONFIG_SMP
-+ if (1 == rq->nr_running)
-+ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
-+#endif
-+
-+ sched_update_tick_dependency(rq);
-+}
-+
-+static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
-+{
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*printk(KERN_INFO "sched: enqueue(%d) %px %016llx\n", cpu_of(rq), p, p->priodl);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
-+ task_cpu(p), cpu_of(rq));
-+
-+ __SCHED_ENQUEUE_TASK(p, rq, flags);
-+ update_sched_rq_watermark(rq);
-+ ++rq->nr_running;
-+#ifdef CONFIG_SMP
-+ if (2 == rq->nr_running)
-+ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
-+#endif
-+
-+ sched_update_tick_dependency(rq);
-+}
-+
-+static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx)
-+{
-+ lockdep_assert_held(&rq->lock);
-+ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->priodl);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
-+ cpu_of(rq), task_cpu(p));
-+
-+ list_del(&p->sq_node);
-+ list_add_tail(&p->sq_node, &rq->queue.heads[idx]);
-+ if (idx != p->sq_idx) {
-+ if (list_empty(&rq->queue.heads[p->sq_idx]))
-+ clear_bit(sched_idx2prio(p->sq_idx, rq),
-+ rq->queue.bitmap);
-+ p->sq_idx = idx;
-+ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+ update_sched_rq_watermark(rq);
-+ }
-+}
-+
-+/*
-+ * cmpxchg based fetch_or, macro so it works for different integer types
-+ */
-+#define fetch_or(ptr, mask) \
-+ ({ \
-+ typeof(ptr) _ptr = (ptr); \
-+ typeof(mask) _mask = (mask); \
-+ typeof(*_ptr) _old, _val = *_ptr; \
-+ \
-+ for (;;) { \
-+ _old = cmpxchg(_ptr, _val, _val | _mask); \
-+ if (_old == _val) \
-+ break; \
-+ _val = _old; \
-+ } \
-+ _old; \
-+})
-+
-+#if defined(CONFIG_SMP) && defined(TIF_POLLING_NRFLAG)
-+/*
-+ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
-+ * this avoids any races wrt polling state changes and thereby avoids
-+ * spurious IPIs.
-+ */
-+static bool set_nr_and_not_polling(struct task_struct *p)
-+{
-+ struct thread_info *ti = task_thread_info(p);
-+ return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
-+}
-+
-+/*
-+ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
-+ *
-+ * If this returns true, then the idle task promises to call
-+ * sched_ttwu_pending() and reschedule soon.
-+ */
-+static bool set_nr_if_polling(struct task_struct *p)
-+{
-+ struct thread_info *ti = task_thread_info(p);
-+ typeof(ti->flags) old, val = READ_ONCE(ti->flags);
-+
-+ for (;;) {
-+ if (!(val & _TIF_POLLING_NRFLAG))
-+ return false;
-+ if (val & _TIF_NEED_RESCHED)
-+ return true;
-+ old = cmpxchg(&ti->flags, val, val | _TIF_NEED_RESCHED);
-+ if (old == val)
-+ break;
-+ val = old;
-+ }
-+ return true;
-+}
-+
-+#else
-+static bool set_nr_and_not_polling(struct task_struct *p)
-+{
-+ set_tsk_need_resched(p);
-+ return true;
-+}
-+
-+#ifdef CONFIG_SMP
-+static bool set_nr_if_polling(struct task_struct *p)
-+{
-+ return false;
-+}
-+#endif
-+#endif
-+
-+static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
-+{
-+ struct wake_q_node *node = &task->wake_q;
-+
-+ /*
-+ * Atomically grab the task, if ->wake_q is !nil already it means
-+ * it's already queued (either by us or someone else) and will get the
-+ * wakeup due to that.
-+ *
-+ * In order to ensure that a pending wakeup will observe our pending
-+ * state, even in the failed case, an explicit smp_mb() must be used.
-+ */
-+ smp_mb__before_atomic();
-+ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
-+ return false;
-+
-+ /*
-+ * The head is context local, there can be no concurrency.
-+ */
-+ *head->lastp = node;
-+ head->lastp = &node->next;
-+ return true;
-+}
-+
-+/**
-+ * wake_q_add() - queue a wakeup for 'later' waking.
-+ * @head: the wake_q_head to add @task to
-+ * @task: the task to queue for 'later' wakeup
-+ *
-+ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
-+ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
-+ * instantly.
-+ *
-+ * This function must be used as-if it were wake_up_process(); IOW the task
-+ * must be ready to be woken at this location.
-+ */
-+void wake_q_add(struct wake_q_head *head, struct task_struct *task)
-+{
-+ if (__wake_q_add(head, task))
-+ get_task_struct(task);
-+}
-+
-+/**
-+ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
-+ * @head: the wake_q_head to add @task to
-+ * @task: the task to queue for 'later' wakeup
-+ *
-+ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
-+ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
-+ * instantly.
-+ *
-+ * This function must be used as-if it were wake_up_process(); IOW the task
-+ * must be ready to be woken at this location.
-+ *
-+ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
-+ * that already hold reference to @task can call the 'safe' version and trust
-+ * wake_q to do the right thing depending whether or not the @task is already
-+ * queued for wakeup.
-+ */
-+void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
-+{
-+ if (!__wake_q_add(head, task))
-+ put_task_struct(task);
-+}
-+
-+void wake_up_q(struct wake_q_head *head)
-+{
-+ struct wake_q_node *node = head->first;
-+
-+ while (node != WAKE_Q_TAIL) {
-+ struct task_struct *task;
-+
-+ task = container_of(node, struct task_struct, wake_q);
-+ /* task can safely be re-inserted now: */
-+ node = node->next;
-+ task->wake_q.next = NULL;
-+
-+ /*
-+ * wake_up_process() executes a full barrier, which pairs with
-+ * the queueing in wake_q_add() so as not to miss wakeups.
-+ */
-+ wake_up_process(task);
-+ put_task_struct(task);
-+ }
-+}
-+
-+/*
-+ * resched_curr - mark rq's current task 'to be rescheduled now'.
-+ *
-+ * On UP this means the setting of the need_resched flag, on SMP it
-+ * might also involve a cross-CPU call to trigger the scheduler on
-+ * the target CPU.
-+ */
-+void resched_curr(struct rq *rq)
-+{
-+ struct task_struct *curr = rq->curr;
-+ int cpu;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ if (test_tsk_need_resched(curr))
-+ return;
-+
-+ cpu = cpu_of(rq);
-+ if (cpu == smp_processor_id()) {
-+ set_tsk_need_resched(curr);
-+ set_preempt_need_resched();
-+ return;
-+ }
-+
-+ if (set_nr_and_not_polling(curr))
-+ smp_send_reschedule(cpu);
-+ else
-+ trace_sched_wake_idle_without_ipi(cpu);
-+}
-+
-+void resched_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (cpu_online(cpu) || cpu == smp_processor_id())
-+ resched_curr(cpu_rq(cpu));
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+}
-+
-+#ifdef CONFIG_SMP
-+#ifdef CONFIG_NO_HZ_COMMON
-+void nohz_balance_enter_idle(int cpu) {}
-+
-+void select_nohz_load_balancer(int stop_tick) {}
-+
-+void set_cpu_sd_state_idle(void) {}
-+
-+/*
-+ * In the semi idle case, use the nearest busy CPU for migrating timers
-+ * from an idle CPU. This is good for power-savings.
-+ *
-+ * We don't do similar optimization for completely idle system, as
-+ * selecting an idle CPU will add more delays to the timers than intended
-+ * (as that CPU's timer base may not be uptodate wrt jiffies etc).
-+ */
-+int get_nohz_timer_target(void)
-+{
-+ int i, cpu = smp_processor_id(), default_cpu = -1;
-+ struct cpumask *mask;
-+ const struct cpumask *hk_mask;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TIMER)) {
-+ if (!idle_cpu(cpu))
-+ return cpu;
-+ default_cpu = cpu;
-+ }
-+
-+ hk_mask = housekeeping_cpumask(HK_TYPE_TIMER);
-+
-+ for (mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
-+ for_each_cpu_and(i, mask, hk_mask)
-+ if (!idle_cpu(i))
-+ return i;
-+
-+ if (default_cpu == -1)
-+ default_cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
-+ cpu = default_cpu;
-+
-+ return cpu;
-+}
-+
-+/*
-+ * When add_timer_on() enqueues a timer into the timer wheel of an
-+ * idle CPU then this timer might expire before the next timer event
-+ * which is scheduled to wake up that CPU. In case of a completely
-+ * idle system the next event might even be infinite time into the
-+ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
-+ * leaves the inner idle loop so the newly added timer is taken into
-+ * account when the CPU goes back to idle and evaluates the timer
-+ * wheel for the next timer event.
-+ */
-+static inline void wake_up_idle_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (cpu == smp_processor_id())
-+ return;
-+
-+ if (set_nr_and_not_polling(rq->idle))
-+ smp_send_reschedule(cpu);
-+ else
-+ trace_sched_wake_idle_without_ipi(cpu);
-+}
-+
-+static inline bool wake_up_full_nohz_cpu(int cpu)
-+{
-+ /*
-+ * We just need the target to call irq_exit() and re-evaluate
-+ * the next tick. The nohz full kick at least implies that.
-+ * If needed we can still optimize that later with an
-+ * empty IRQ.
-+ */
-+ if (cpu_is_offline(cpu))
-+ return true; /* Don't try to wake offline CPUs. */
-+ if (tick_nohz_full_cpu(cpu)) {
-+ if (cpu != smp_processor_id() ||
-+ tick_nohz_tick_stopped())
-+ tick_nohz_full_kick_cpu(cpu);
-+ return true;
-+ }
-+
-+ return false;
-+}
-+
-+void wake_up_nohz_cpu(int cpu)
-+{
-+ if (!wake_up_full_nohz_cpu(cpu))
-+ wake_up_idle_cpu(cpu);
-+}
-+
-+static void nohz_csd_func(void *info)
-+{
-+ struct rq *rq = info;
-+ int cpu = cpu_of(rq);
-+ unsigned int flags;
-+
-+ /*
-+ * Release the rq::nohz_csd.
-+ */
-+ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
-+ WARN_ON(!(flags & NOHZ_KICK_MASK));
-+
-+ rq->idle_balance = idle_cpu(cpu);
-+ if (rq->idle_balance && !need_resched()) {
-+ rq->nohz_idle_balance = flags;
-+ raise_softirq_irqoff(SCHED_SOFTIRQ);
-+ }
-+}
-+
-+#endif /* CONFIG_NO_HZ_COMMON */
-+#endif /* CONFIG_SMP */
-+
-+static inline void check_preempt_curr(struct rq *rq)
-+{
-+ if (sched_rq_first_task(rq) != rq->curr)
-+ resched_curr(rq);
-+}
-+
-+#ifdef CONFIG_SCHED_HRTICK
-+/*
-+ * Use HR-timers to deliver accurate preemption points.
-+ */
-+
-+static void hrtick_clear(struct rq *rq)
-+{
-+ if (hrtimer_active(&rq->hrtick_timer))
-+ hrtimer_cancel(&rq->hrtick_timer);
-+}
-+
-+/*
-+ * High-resolution timer tick.
-+ * Runs from hardirq context with interrupts disabled.
-+ */
-+static enum hrtimer_restart hrtick(struct hrtimer *timer)
-+{
-+ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
-+
-+ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
-+
-+ raw_spin_lock(&rq->lock);
-+ resched_curr(rq);
-+ raw_spin_unlock(&rq->lock);
-+
-+ return HRTIMER_NORESTART;
-+}
-+
-+/*
-+ * Use hrtick when:
-+ * - enabled by features
-+ * - hrtimer is actually high res
-+ */
-+static inline int hrtick_enabled(struct rq *rq)
-+{
-+ /**
-+ * Alt schedule FW doesn't support sched_feat yet
-+ if (!sched_feat(HRTICK))
-+ return 0;
-+ */
-+ if (!cpu_active(cpu_of(rq)))
-+ return 0;
-+ return hrtimer_is_hres_active(&rq->hrtick_timer);
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+static void __hrtick_restart(struct rq *rq)
-+{
-+ struct hrtimer *timer = &rq->hrtick_timer;
-+ ktime_t time = rq->hrtick_time;
-+
-+ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
-+}
-+
-+/*
-+ * called from hardirq (IPI) context
-+ */
-+static void __hrtick_start(void *arg)
-+{
-+ struct rq *rq = arg;
-+
-+ raw_spin_lock(&rq->lock);
-+ __hrtick_restart(rq);
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+/*
-+ * Called to set the hrtick timer state.
-+ *
-+ * called with rq->lock held and irqs disabled
-+ */
-+void hrtick_start(struct rq *rq, u64 delay)
-+{
-+ struct hrtimer *timer = &rq->hrtick_timer;
-+ s64 delta;
-+
-+ /*
-+ * Don't schedule slices shorter than 10000ns, that just
-+ * doesn't make sense and can cause timer DoS.
-+ */
-+ delta = max_t(s64, delay, 10000LL);
-+
-+ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
-+
-+ if (rq == this_rq())
-+ __hrtick_restart(rq);
-+ else
-+ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
-+}
-+
-+#else
-+/*
-+ * Called to set the hrtick timer state.
-+ *
-+ * called with rq->lock held and irqs disabled
-+ */
-+void hrtick_start(struct rq *rq, u64 delay)
-+{
-+ /*
-+ * Don't schedule slices shorter than 10000ns, that just
-+ * doesn't make sense. Rely on vruntime for fairness.
-+ */
-+ delay = max_t(u64, delay, 10000LL);
-+ hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
-+ HRTIMER_MODE_REL_PINNED_HARD);
-+}
-+#endif /* CONFIG_SMP */
-+
-+static void hrtick_rq_init(struct rq *rq)
-+{
-+#ifdef CONFIG_SMP
-+ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
-+#endif
-+
-+ hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
-+ rq->hrtick_timer.function = hrtick;
-+}
-+#else /* CONFIG_SCHED_HRTICK */
-+static inline int hrtick_enabled(struct rq *rq)
-+{
-+ return 0;
-+}
-+
-+static inline void hrtick_clear(struct rq *rq)
-+{
-+}
-+
-+static inline void hrtick_rq_init(struct rq *rq)
-+{
-+}
-+#endif /* CONFIG_SCHED_HRTICK */
-+
-+static inline int __normal_prio(int policy, int rt_prio, int static_prio)
-+{
-+ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) :
-+ static_prio + MAX_PRIORITY_ADJ;
-+}
-+
-+/*
-+ * Calculate the expected normal priority: i.e. priority
-+ * without taking RT-inheritance into account. Might be
-+ * boosted by interactivity modifiers. Changes upon fork,
-+ * setprio syscalls, and whenever the interactivity
-+ * estimator recalculates.
-+ */
-+static inline int normal_prio(struct task_struct *p)
-+{
-+ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
-+}
-+
-+/*
-+ * Calculate the current priority, i.e. the priority
-+ * taken into account by the scheduler. This value might
-+ * be boosted by RT tasks as it will be RT if the task got
-+ * RT-boosted. If not then it returns p->normal_prio.
-+ */
-+static int effective_prio(struct task_struct *p)
-+{
-+ p->normal_prio = normal_prio(p);
-+ /*
-+ * If we are RT tasks or we were boosted to RT priority,
-+ * keep the priority unchanged. Otherwise, update priority
-+ * to the normal priority:
-+ */
-+ if (!rt_prio(p->prio))
-+ return p->normal_prio;
-+ return p->prio;
-+}
-+
-+/*
-+ * activate_task - move a task to the runqueue.
-+ *
-+ * Context: rq->lock
-+ */
-+static void activate_task(struct task_struct *p, struct rq *rq)
-+{
-+ enqueue_task(p, rq, ENQUEUE_WAKEUP);
-+ p->on_rq = TASK_ON_RQ_QUEUED;
-+
-+ /*
-+ * If in_iowait is set, the code below may not trigger any cpufreq
-+ * utilization updates, so do it here explicitly with the IOWAIT flag
-+ * passed.
-+ */
-+ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
-+}
-+
-+/*
-+ * deactivate_task - remove a task from the runqueue.
-+ *
-+ * Context: rq->lock
-+ */
-+static inline void deactivate_task(struct task_struct *p, struct rq *rq)
-+{
-+ dequeue_task(p, rq, DEQUEUE_SLEEP);
-+ p->on_rq = 0;
-+ cpufreq_update_util(rq, 0);
-+}
-+
-+static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
-+{
-+#ifdef CONFIG_SMP
-+ /*
-+ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
-+ * successfully executed on another CPU. We must ensure that updates of
-+ * per-task data have been completed by this moment.
-+ */
-+ smp_wmb();
-+
-+ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
-+#endif
-+}
-+
-+static inline bool is_migration_disabled(struct task_struct *p)
-+{
-+#ifdef CONFIG_SMP
-+ return p->migration_disabled;
-+#else
-+ return false;
-+#endif
-+}
-+
-+#define SCA_CHECK 0x01
-+#define SCA_USER 0x08
-+
-+#ifdef CONFIG_SMP
-+
-+void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
-+{
-+#ifdef CONFIG_SCHED_DEBUG
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /*
-+ * We should never call set_task_cpu() on a blocked task,
-+ * ttwu() will sort out the placement.
-+ */
-+ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
-+
-+#ifdef CONFIG_LOCKDEP
-+ /*
-+ * The caller should hold either p->pi_lock or rq->lock, when changing
-+ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
-+ *
-+ * sched_move_task() holds both and thus holding either pins the cgroup,
-+ * see task_group().
-+ */
-+ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
-+ lockdep_is_held(&task_rq(p)->lock)));
-+#endif
-+ /*
-+ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
-+ */
-+ WARN_ON_ONCE(!cpu_online(new_cpu));
-+
-+ WARN_ON_ONCE(is_migration_disabled(p));
-+#endif
-+ if (task_cpu(p) == new_cpu)
-+ return;
-+ trace_sched_migrate_task(p, new_cpu);
-+ rseq_migrate(p);
-+ perf_event_task_migrate(p);
-+
-+ __set_task_cpu(p, new_cpu);
-+}
-+
-+#define MDF_FORCE_ENABLED 0x80
-+
-+static void
-+__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ /*
-+ * This here violates the locking rules for affinity, since we're only
-+ * supposed to change these variables while holding both rq->lock and
-+ * p->pi_lock.
-+ *
-+ * HOWEVER, it magically works, because ttwu() is the only code that
-+ * accesses these variables under p->pi_lock and only does so after
-+ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
-+ * before finish_task().
-+ *
-+ * XXX do further audits, this smells like something putrid.
-+ */
-+ SCHED_WARN_ON(!p->on_cpu);
-+ p->cpus_ptr = new_mask;
-+}
-+
-+void migrate_disable(void)
-+{
-+ struct task_struct *p = current;
-+ int cpu;
-+
-+ if (p->migration_disabled) {
-+ p->migration_disabled++;
-+ return;
-+ }
-+
-+ preempt_disable();
-+ cpu = smp_processor_id();
-+ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
-+ cpu_rq(cpu)->nr_pinned++;
-+ p->migration_disabled = 1;
-+ p->migration_flags &= ~MDF_FORCE_ENABLED;
-+
-+ /*
-+ * Violates locking rules! see comment in __do_set_cpus_ptr().
-+ */
-+ if (p->cpus_ptr == &p->cpus_mask)
-+ __do_set_cpus_ptr(p, cpumask_of(cpu));
-+ }
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(migrate_disable);
-+
-+void migrate_enable(void)
-+{
-+ struct task_struct *p = current;
-+
-+ if (0 == p->migration_disabled)
-+ return;
-+
-+ if (p->migration_disabled > 1) {
-+ p->migration_disabled--;
-+ return;
-+ }
-+
-+ if (WARN_ON_ONCE(!p->migration_disabled))
-+ return;
-+
-+ /*
-+ * Ensure stop_task runs either before or after this, and that
-+ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
-+ */
-+ preempt_disable();
-+ /*
-+ * Assumption: current should be running on allowed cpu
-+ */
-+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
-+ if (p->cpus_ptr != &p->cpus_mask)
-+ __do_set_cpus_ptr(p, &p->cpus_mask);
-+ /*
-+ * Mustn't clear migration_disabled() until cpus_ptr points back at the
-+ * regular cpus_mask, otherwise things that race (eg.
-+ * select_fallback_rq) get confused.
-+ */
-+ barrier();
-+ p->migration_disabled = 0;
-+ this_rq()->nr_pinned--;
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(migrate_enable);
-+
-+static inline bool rq_has_pinned_tasks(struct rq *rq)
-+{
-+ return rq->nr_pinned;
-+}
-+
-+/*
-+ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
-+ * __set_cpus_allowed_ptr() and select_fallback_rq().
-+ */
-+static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
-+{
-+ /* When not in the task's cpumask, no point in looking further. */
-+ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
-+ return false;
-+
-+ /* migrate_disabled() must be allowed to finish. */
-+ if (is_migration_disabled(p))
-+ return cpu_online(cpu);
-+
-+ /* Non kernel threads are not allowed during either online or offline. */
-+ if (!(p->flags & PF_KTHREAD))
-+ return cpu_active(cpu) && task_cpu_possible(cpu, p);
-+
-+ /* KTHREAD_IS_PER_CPU is always allowed. */
-+ if (kthread_is_per_cpu(p))
-+ return cpu_online(cpu);
-+
-+ /* Regular kernel threads don't get to stay during offline. */
-+ if (cpu_dying(cpu))
-+ return false;
-+
-+ /* But are allowed during online. */
-+ return cpu_online(cpu);
-+}
-+
-+/*
-+ * This is how migration works:
-+ *
-+ * 1) we invoke migration_cpu_stop() on the target CPU using
-+ * stop_one_cpu().
-+ * 2) stopper starts to run (implicitly forcing the migrated thread
-+ * off the CPU)
-+ * 3) it checks whether the migrated task is still in the wrong runqueue.
-+ * 4) if it's in the wrong runqueue then the migration thread removes
-+ * it and puts it into the right queue.
-+ * 5) stopper completes and stop_one_cpu() returns and the migration
-+ * is done.
-+ */
-+
-+/*
-+ * move_queued_task - move a queued task to new rq.
-+ *
-+ * Returns (locked) new rq. Old rq's lock is released.
-+ */
-+static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int
-+ new_cpu)
-+{
-+ lockdep_assert_held(&rq->lock);
-+
-+ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
-+ dequeue_task(p, rq, 0);
-+ update_sched_rq_watermark(rq);
-+ set_task_cpu(p, new_cpu);
-+ raw_spin_unlock(&rq->lock);
-+
-+ rq = cpu_rq(new_cpu);
-+
-+ raw_spin_lock(&rq->lock);
-+ BUG_ON(task_cpu(p) != new_cpu);
-+ sched_task_sanity_check(p, rq);
-+ enqueue_task(p, rq, 0);
-+ p->on_rq = TASK_ON_RQ_QUEUED;
-+ check_preempt_curr(rq);
-+
-+ return rq;
-+}
-+
-+struct migration_arg {
-+ struct task_struct *task;
-+ int dest_cpu;
-+};
-+
-+/*
-+ * Move (not current) task off this CPU, onto the destination CPU. We're doing
-+ * this because either it can't run here any more (set_cpus_allowed()
-+ * away from this CPU, or CPU going down), or because we're
-+ * attempting to rebalance this task on exec (sched_exec).
-+ *
-+ * So we race with normal scheduler movements, but that's OK, as long
-+ * as the task is no longer on this CPU.
-+ */
-+static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int
-+ dest_cpu)
-+{
-+ /* Affinity changed (again). */
-+ if (!is_cpu_allowed(p, dest_cpu))
-+ return rq;
-+
-+ update_rq_clock(rq);
-+ return move_queued_task(rq, p, dest_cpu);
-+}
-+
-+/*
-+ * migration_cpu_stop - this will be executed by a highprio stopper thread
-+ * and performs thread migration by bumping thread off CPU then
-+ * 'pushing' onto another runqueue.
-+ */
-+static int migration_cpu_stop(void *data)
-+{
-+ struct migration_arg *arg = data;
-+ struct task_struct *p = arg->task;
-+ struct rq *rq = this_rq();
-+ unsigned long flags;
-+
-+ /*
-+ * The original target CPU might have gone down and we might
-+ * be on another CPU but it doesn't matter.
-+ */
-+ local_irq_save(flags);
-+ /*
-+ * We need to explicitly wake pending tasks before running
-+ * __migrate_task() such that we will not miss enforcing cpus_ptr
-+ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
-+ */
-+ flush_smp_call_function_from_idle();
-+
-+ raw_spin_lock(&p->pi_lock);
-+ raw_spin_lock(&rq->lock);
-+ /*
-+ * If task_rq(p) != rq, it cannot be migrated here, because we're
-+ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
-+ * we're holding p->pi_lock.
-+ */
-+ if (task_rq(p) == rq && task_on_rq_queued(p))
-+ rq = __migrate_task(rq, p, arg->dest_cpu);
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+
-+ return 0;
-+}
-+
-+static inline void
-+set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ cpumask_copy(&p->cpus_mask, new_mask);
-+ p->nr_cpus_allowed = cpumask_weight(new_mask);
-+}
-+
-+static void
-+__do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ lockdep_assert_held(&p->pi_lock);
-+ set_cpus_allowed_common(p, new_mask);
-+}
-+
-+void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ __do_set_cpus_allowed(p, new_mask);
-+}
-+
-+int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
-+ int node)
-+{
-+ if (!src->user_cpus_ptr)
-+ return 0;
-+
-+ dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
-+ if (!dst->user_cpus_ptr)
-+ return -ENOMEM;
-+
-+ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
-+ return 0;
-+}
-+
-+static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
-+{
-+ struct cpumask *user_mask = NULL;
-+
-+ swap(p->user_cpus_ptr, user_mask);
-+
-+ return user_mask;
-+}
-+
-+void release_user_cpus_ptr(struct task_struct *p)
-+{
-+ kfree(clear_user_cpus_ptr(p));
-+}
-+
-+#endif
-+
-+/**
-+ * task_curr - is this task currently executing on a CPU?
-+ * @p: the task in question.
-+ *
-+ * Return: 1 if the task is currently executing. 0 otherwise.
-+ */
-+inline int task_curr(const struct task_struct *p)
-+{
-+ return cpu_curr(task_cpu(p)) == p;
-+}
-+
-+#ifdef CONFIG_SMP
-+/*
-+ * wait_task_inactive - wait for a thread to unschedule.
-+ *
-+ * If @match_state is nonzero, it's the @p->state value just checked and
-+ * not expected to change. If it changes, i.e. @p might have woken up,
-+ * then return zero. When we succeed in waiting for @p to be off its CPU,
-+ * we return a positive number (its total switch count). If a second call
-+ * a short while later returns the same number, the caller can be sure that
-+ * @p has remained unscheduled the whole time.
-+ *
-+ * The caller must ensure that the task *will* unschedule sometime soon,
-+ * else this function might spin for a *long* time. This function can't
-+ * be called with interrupts off, or it may introduce deadlock with
-+ * smp_call_function() if an IPI is sent by the same process we are
-+ * waiting to become inactive.
-+ */
-+unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
-+{
-+ unsigned long flags;
-+ bool running, on_rq;
-+ unsigned long ncsw;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ for (;;) {
-+ rq = task_rq(p);
-+
-+ /*
-+ * If the task is actively running on another CPU
-+ * still, just relax and busy-wait without holding
-+ * any locks.
-+ *
-+ * NOTE! Since we don't hold any locks, it's not
-+ * even sure that "rq" stays as the right runqueue!
-+ * But we don't care, since this will return false
-+ * if the runqueue has changed and p is actually now
-+ * running somewhere else!
-+ */
-+ while (task_running(p) && p == rq->curr) {
-+ if (match_state && unlikely(READ_ONCE(p->__state) != match_state))
-+ return 0;
-+ cpu_relax();
-+ }
-+
-+ /*
-+ * Ok, time to look more closely! We need the rq
-+ * lock now, to be *sure*. If we're wrong, we'll
-+ * just go back and repeat.
-+ */
-+ task_access_lock_irqsave(p, &lock, &flags);
-+ trace_sched_wait_task(p);
-+ running = task_running(p);
-+ on_rq = p->on_rq;
-+ ncsw = 0;
-+ if (!match_state || READ_ONCE(p->__state) == match_state)
-+ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+ /*
-+ * If it changed from the expected state, bail out now.
-+ */
-+ if (unlikely(!ncsw))
-+ break;
-+
-+ /*
-+ * Was it really running after all now that we
-+ * checked with the proper locks actually held?
-+ *
-+ * Oops. Go back and try again..
-+ */
-+ if (unlikely(running)) {
-+ cpu_relax();
-+ continue;
-+ }
-+
-+ /*
-+ * It's not enough that it's not actively running,
-+ * it must be off the runqueue _entirely_, and not
-+ * preempted!
-+ *
-+ * So if it was still runnable (but just not actively
-+ * running right now), it's preempted, and we should
-+ * yield - it could be a while.
-+ */
-+ if (unlikely(on_rq)) {
-+ ktime_t to = NSEC_PER_SEC / HZ;
-+
-+ set_current_state(TASK_UNINTERRUPTIBLE);
-+ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
-+ continue;
-+ }
-+
-+ /*
-+ * Ahh, all good. It wasn't running, and it wasn't
-+ * runnable, which means that it will never become
-+ * running in the future either. We're all done!
-+ */
-+ break;
-+ }
-+
-+ return ncsw;
-+}
-+
-+/***
-+ * kick_process - kick a running thread to enter/exit the kernel
-+ * @p: the to-be-kicked thread
-+ *
-+ * Cause a process which is running on another CPU to enter
-+ * kernel-mode, without any delay. (to get signals handled.)
-+ *
-+ * NOTE: this function doesn't have to take the runqueue lock,
-+ * because all it wants to ensure is that the remote task enters
-+ * the kernel. If the IPI races and the task has been migrated
-+ * to another CPU then no harm is done and the purpose has been
-+ * achieved as well.
-+ */
-+void kick_process(struct task_struct *p)
-+{
-+ int cpu;
-+
-+ preempt_disable();
-+ cpu = task_cpu(p);
-+ if ((cpu != smp_processor_id()) && task_curr(p))
-+ smp_send_reschedule(cpu);
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(kick_process);
-+
-+/*
-+ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
-+ *
-+ * A few notes on cpu_active vs cpu_online:
-+ *
-+ * - cpu_active must be a subset of cpu_online
-+ *
-+ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
-+ * see __set_cpus_allowed_ptr(). At this point the newly online
-+ * CPU isn't yet part of the sched domains, and balancing will not
-+ * see it.
-+ *
-+ * - on cpu-down we clear cpu_active() to mask the sched domains and
-+ * avoid the load balancer to place new tasks on the to be removed
-+ * CPU. Existing tasks will remain running there and will be taken
-+ * off.
-+ *
-+ * This means that fallback selection must not select !active CPUs.
-+ * And can assume that any active CPU must be online. Conversely
-+ * select_task_rq() below may allow selection of !active CPUs in order
-+ * to satisfy the above rules.
-+ */
-+static int select_fallback_rq(int cpu, struct task_struct *p)
-+{
-+ int nid = cpu_to_node(cpu);
-+ const struct cpumask *nodemask = NULL;
-+ enum { cpuset, possible, fail } state = cpuset;
-+ int dest_cpu;
-+
-+ /*
-+ * If the node that the CPU is on has been offlined, cpu_to_node()
-+ * will return -1. There is no CPU on the node, and we should
-+ * select the CPU on the other node.
-+ */
-+ if (nid != -1) {
-+ nodemask = cpumask_of_node(nid);
-+
-+ /* Look for allowed, online CPU in same node. */
-+ for_each_cpu(dest_cpu, nodemask) {
-+ if (is_cpu_allowed(p, dest_cpu))
-+ return dest_cpu;
-+ }
-+ }
-+
-+ for (;;) {
-+ /* Any allowed, online CPU? */
-+ for_each_cpu(dest_cpu, p->cpus_ptr) {
-+ if (!is_cpu_allowed(p, dest_cpu))
-+ continue;
-+ goto out;
-+ }
-+
-+ /* No more Mr. Nice Guy. */
-+ switch (state) {
-+ case cpuset:
-+ if (cpuset_cpus_allowed_fallback(p)) {
-+ state = possible;
-+ break;
-+ }
-+ fallthrough;
-+ case possible:
-+ /*
-+ * XXX When called from select_task_rq() we only
-+ * hold p->pi_lock and again violate locking order.
-+ *
-+ * More yuck to audit.
-+ */
-+ do_set_cpus_allowed(p, task_cpu_possible_mask(p));
-+ state = fail;
-+ break;
-+
-+ case fail:
-+ BUG();
-+ break;
-+ }
-+ }
-+
-+out:
-+ if (state != cpuset) {
-+ /*
-+ * Don't tell them about moving exiting tasks or
-+ * kernel threads (both mm NULL), since they never
-+ * leave kernel.
-+ */
-+ if (p->mm && printk_ratelimit()) {
-+ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
-+ task_pid_nr(p), p->comm, cpu);
-+ }
-+ }
-+
-+ return dest_cpu;
-+}
-+
-+static inline int select_task_rq(struct task_struct *p)
-+{
-+ cpumask_t chk_mask, tmp;
-+
-+ if (unlikely(!cpumask_and(&chk_mask, p->cpus_ptr, cpu_active_mask)))
-+ return select_fallback_rq(task_cpu(p), p);
-+
-+ if (
-+#ifdef CONFIG_SCHED_SMT
-+ cpumask_and(&tmp, &chk_mask, &sched_sg_idle_mask) ||
-+#endif
-+ cpumask_and(&tmp, &chk_mask, sched_rq_watermark) ||
-+ cpumask_and(&tmp, &chk_mask,
-+ sched_rq_watermark + SCHED_QUEUE_BITS - 1 - task_sched_prio(p)))
-+ return best_mask_cpu(task_cpu(p), &tmp);
-+
-+ return best_mask_cpu(task_cpu(p), &chk_mask);
-+}
-+
-+void sched_set_stop_task(int cpu, struct task_struct *stop)
-+{
-+ static struct lock_class_key stop_pi_lock;
-+ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
-+ struct sched_param start_param = { .sched_priority = 0 };
-+ struct task_struct *old_stop = cpu_rq(cpu)->stop;
-+
-+ if (stop) {
-+ /*
-+ * Make it appear like a SCHED_FIFO task, its something
-+ * userspace knows about and won't get confused about.
-+ *
-+ * Also, it will make PI more or less work without too
-+ * much confusion -- but then, stop work should not
-+ * rely on PI working anyway.
-+ */
-+ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
-+
-+ /*
-+ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
-+ * adjust the effective priority of a task. As a result,
-+ * rt_mutex_setprio() can trigger (RT) balancing operations,
-+ * which can then trigger wakeups of the stop thread to push
-+ * around the current task.
-+ *
-+ * The stop task itself will never be part of the PI-chain, it
-+ * never blocks, therefore that ->pi_lock recursion is safe.
-+ * Tell lockdep about this by placing the stop->pi_lock in its
-+ * own class.
-+ */
-+ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
-+ }
-+
-+ cpu_rq(cpu)->stop = stop;
-+
-+ if (old_stop) {
-+ /*
-+ * Reset it back to a normal scheduling policy so that
-+ * it can die in pieces.
-+ */
-+ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
-+ }
-+}
-+
-+static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
-+ raw_spinlock_t *lock, unsigned long irq_flags)
-+{
-+ /* Can the task run on the task's current CPU? If so, we're done */
-+ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
-+ if (p->migration_disabled) {
-+ if (likely(p->cpus_ptr != &p->cpus_mask))
-+ __do_set_cpus_ptr(p, &p->cpus_mask);
-+ p->migration_disabled = 0;
-+ p->migration_flags |= MDF_FORCE_ENABLED;
-+ /* When p is migrate_disabled, rq->lock should be held */
-+ rq->nr_pinned--;
-+ }
-+
-+ if (task_running(p) || READ_ONCE(p->__state) == TASK_WAKING) {
-+ struct migration_arg arg = { p, dest_cpu };
-+
-+ /* Need help from migration thread: drop lock and wait. */
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
-+ return 0;
-+ }
-+ if (task_on_rq_queued(p)) {
-+ /*
-+ * OK, since we're going to drop the lock immediately
-+ * afterwards anyway.
-+ */
-+ update_rq_clock(rq);
-+ rq = move_queued_task(rq, p, dest_cpu);
-+ lock = &rq->lock;
-+ }
-+ }
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ return 0;
-+}
-+
-+static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
-+ const struct cpumask *new_mask,
-+ u32 flags,
-+ struct rq *rq,
-+ raw_spinlock_t *lock,
-+ unsigned long irq_flags)
-+{
-+ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
-+ const struct cpumask *cpu_valid_mask = cpu_active_mask;
-+ bool kthread = p->flags & PF_KTHREAD;
-+ struct cpumask *user_mask = NULL;
-+ int dest_cpu;
-+ int ret = 0;
-+
-+ if (kthread || is_migration_disabled(p)) {
-+ /*
-+ * Kernel threads are allowed on online && !active CPUs,
-+ * however, during cpu-hot-unplug, even these might get pushed
-+ * away if not KTHREAD_IS_PER_CPU.
-+ *
-+ * Specifically, migration_disabled() tasks must not fail the
-+ * cpumask_any_and_distribute() pick below, esp. so on
-+ * SCA_MIGRATE_ENABLE, otherwise we'll not call
-+ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
-+ */
-+ cpu_valid_mask = cpu_online_mask;
-+ }
-+
-+ if (!kthread && !cpumask_subset(new_mask, cpu_allowed_mask)) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ /*
-+ * Must re-check here, to close a race against __kthread_bind(),
-+ * sched_setaffinity() is not guaranteed to observe the flag.
-+ */
-+ if ((flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ if (cpumask_equal(&p->cpus_mask, new_mask))
-+ goto out;
-+
-+ dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);
-+ if (dest_cpu >= nr_cpu_ids) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ __do_set_cpus_allowed(p, new_mask);
-+
-+ if (flags & SCA_USER)
-+ user_mask = clear_user_cpus_ptr(p);
-+
-+ ret = affine_move_task(rq, p, dest_cpu, lock, irq_flags);
-+
-+ kfree(user_mask);
-+
-+ return ret;
-+
-+out:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+
-+ return ret;
-+}
-+
-+/*
-+ * Change a given task's CPU affinity. Migrate the thread to a
-+ * proper CPU and schedule it away if the CPU it's executing on
-+ * is removed from the allowed bitmask.
-+ *
-+ * NOTE: the caller must have a valid reference to the task, the
-+ * task must not exit() & deallocate itself prematurely. The
-+ * call is not atomic; no spinlocks may be held.
-+ */
-+static int __set_cpus_allowed_ptr(struct task_struct *p,
-+ const struct cpumask *new_mask, u32 flags)
-+{
-+ unsigned long irq_flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
-+ rq = __task_access_lock(p, &lock);
-+
-+ return __set_cpus_allowed_ptr_locked(p, new_mask, flags, rq, lock, irq_flags);
-+}
-+
-+int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ return __set_cpus_allowed_ptr(p, new_mask, 0);
-+}
-+EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
-+
-+/*
-+ * Change a given task's CPU affinity to the intersection of its current
-+ * affinity mask and @subset_mask, writing the resulting mask to @new_mask
-+ * and pointing @p->user_cpus_ptr to a copy of the old mask.
-+ * If the resulting mask is empty, leave the affinity unchanged and return
-+ * -EINVAL.
-+ */
-+static int restrict_cpus_allowed_ptr(struct task_struct *p,
-+ struct cpumask *new_mask,
-+ const struct cpumask *subset_mask)
-+{
-+ struct cpumask *user_mask = NULL;
-+ unsigned long irq_flags;
-+ raw_spinlock_t *lock;
-+ struct rq *rq;
-+ int err;
-+
-+ if (!p->user_cpus_ptr) {
-+ user_mask = kmalloc(cpumask_size(), GFP_KERNEL);
-+ if (!user_mask)
-+ return -ENOMEM;
-+ }
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
-+ rq = __task_access_lock(p, &lock);
-+
-+ if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
-+ err = -EINVAL;
-+ goto err_unlock;
-+ }
-+
-+ /*
-+ * We're about to butcher the task affinity, so keep track of what
-+ * the user asked for in case we're able to restore it later on.
-+ */
-+ if (user_mask) {
-+ cpumask_copy(user_mask, p->cpus_ptr);
-+ p->user_cpus_ptr = user_mask;
-+ }
-+
-+ /*return __set_cpus_allowed_ptr_locked(p, new_mask, 0, rq, &rf);*/
-+ return __set_cpus_allowed_ptr_locked(p, new_mask, 0, rq, lock, irq_flags);
-+
-+err_unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ kfree(user_mask);
-+ return err;
-+}
-+
-+/*
-+ * Restrict the CPU affinity of task @p so that it is a subset of
-+ * task_cpu_possible_mask() and point @p->user_cpu_ptr to a copy of the
-+ * old affinity mask. If the resulting mask is empty, we warn and walk
-+ * up the cpuset hierarchy until we find a suitable mask.
-+ */
-+void force_compatible_cpus_allowed_ptr(struct task_struct *p)
-+{
-+ cpumask_var_t new_mask;
-+ const struct cpumask *override_mask = task_cpu_possible_mask(p);
-+
-+ alloc_cpumask_var(&new_mask, GFP_KERNEL);
-+
-+ /*
-+ * __migrate_task() can fail silently in the face of concurrent
-+ * offlining of the chosen destination CPU, so take the hotplug
-+ * lock to ensure that the migration succeeds.
-+ */
-+ cpus_read_lock();
-+ if (!cpumask_available(new_mask))
-+ goto out_set_mask;
-+
-+ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
-+ goto out_free_mask;
-+
-+ /*
-+ * We failed to find a valid subset of the affinity mask for the
-+ * task, so override it based on its cpuset hierarchy.
-+ */
-+ cpuset_cpus_allowed(p, new_mask);
-+ override_mask = new_mask;
-+
-+out_set_mask:
-+ if (printk_ratelimit()) {
-+ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
-+ task_pid_nr(p), p->comm,
-+ cpumask_pr_args(override_mask));
-+ }
-+
-+ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
-+out_free_mask:
-+ cpus_read_unlock();
-+ free_cpumask_var(new_mask);
-+}
-+
-+static int
-+__sched_setaffinity(struct task_struct *p, const struct cpumask *mask);
-+
-+/*
-+ * Restore the affinity of a task @p which was previously restricted by a
-+ * call to force_compatible_cpus_allowed_ptr(). This will clear (and free)
-+ * @p->user_cpus_ptr.
-+ *
-+ * It is the caller's responsibility to serialise this with any calls to
-+ * force_compatible_cpus_allowed_ptr(@p).
-+ */
-+void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
-+{
-+ struct cpumask *user_mask = p->user_cpus_ptr;
-+ unsigned long flags;
-+
-+ /*
-+ * Try to restore the old affinity mask. If this fails, then
-+ * we free the mask explicitly to avoid it being inherited across
-+ * a subsequent fork().
-+ */
-+ if (!user_mask || !__sched_setaffinity(p, user_mask))
-+ return;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ user_mask = clear_user_cpus_ptr(p);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+
-+ kfree(user_mask);
-+}
-+
-+#else /* CONFIG_SMP */
-+
-+static inline int select_task_rq(struct task_struct *p)
-+{
-+ return 0;
-+}
-+
-+static inline int
-+__set_cpus_allowed_ptr(struct task_struct *p,
-+ const struct cpumask *new_mask, u32 flags)
-+{
-+ return set_cpus_allowed_ptr(p, new_mask);
-+}
-+
-+static inline bool rq_has_pinned_tasks(struct rq *rq)
-+{
-+ return false;
-+}
-+
-+#endif /* !CONFIG_SMP */
-+
-+static void
-+ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq;
-+
-+ if (!schedstat_enabled())
-+ return;
-+
-+ rq = this_rq();
-+
-+#ifdef CONFIG_SMP
-+ if (cpu == rq->cpu) {
-+ __schedstat_inc(rq->ttwu_local);
-+ __schedstat_inc(p->stats.nr_wakeups_local);
-+ } else {
-+ /** Alt schedule FW ToDo:
-+ * How to do ttwu_wake_remote
-+ */
-+ }
-+#endif /* CONFIG_SMP */
-+
-+ __schedstat_inc(rq->ttwu_count);
-+ __schedstat_inc(p->stats.nr_wakeups);
-+}
-+
-+/*
-+ * Mark the task runnable and perform wakeup-preemption.
-+ */
-+static inline void
-+ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
-+{
-+ check_preempt_curr(rq);
-+ WRITE_ONCE(p->__state, TASK_RUNNING);
-+ trace_sched_wakeup(p);
-+}
-+
-+static inline void
-+ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
-+{
-+ if (p->sched_contributes_to_load)
-+ rq->nr_uninterruptible--;
-+
-+ if (
-+#ifdef CONFIG_SMP
-+ !(wake_flags & WF_MIGRATED) &&
-+#endif
-+ p->in_iowait) {
-+ delayacct_blkio_end(p);
-+ atomic_dec(&task_rq(p)->nr_iowait);
-+ }
-+
-+ activate_task(p, rq);
-+ ttwu_do_wakeup(rq, p, 0);
-+}
-+
-+/*
-+ * Consider @p being inside a wait loop:
-+ *
-+ * for (;;) {
-+ * set_current_state(TASK_UNINTERRUPTIBLE);
-+ *
-+ * if (CONDITION)
-+ * break;
-+ *
-+ * schedule();
-+ * }
-+ * __set_current_state(TASK_RUNNING);
-+ *
-+ * between set_current_state() and schedule(). In this case @p is still
-+ * runnable, so all that needs doing is change p->state back to TASK_RUNNING in
-+ * an atomic manner.
-+ *
-+ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
-+ * then schedule() must still happen and p->state can be changed to
-+ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
-+ * need to do a full wakeup with enqueue.
-+ *
-+ * Returns: %true when the wakeup is done,
-+ * %false otherwise.
-+ */
-+static int ttwu_runnable(struct task_struct *p, int wake_flags)
-+{
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+ int ret = 0;
-+
-+ rq = __task_access_lock(p, &lock);
-+ if (task_on_rq_queued(p)) {
-+ /* check_preempt_curr() may use rq clock */
-+ update_rq_clock(rq);
-+ ttwu_do_wakeup(rq, p, wake_flags);
-+ ret = 1;
-+ }
-+ __task_access_unlock(p, lock);
-+
-+ return ret;
-+}
-+
-+#ifdef CONFIG_SMP
-+void sched_ttwu_pending(void *arg)
-+{
-+ struct llist_node *llist = arg;
-+ struct rq *rq = this_rq();
-+ struct task_struct *p, *t;
-+ struct rq_flags rf;
-+
-+ if (!llist)
-+ return;
-+
-+ /*
-+ * rq::ttwu_pending racy indication of out-standing wakeups.
-+ * Races such that false-negatives are possible, since they
-+ * are shorter lived that false-positives would be.
-+ */
-+ WRITE_ONCE(rq->ttwu_pending, 0);
-+
-+ rq_lock_irqsave(rq, &rf);
-+ update_rq_clock(rq);
-+
-+ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
-+ if (WARN_ON_ONCE(p->on_cpu))
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+
-+ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
-+ set_task_cpu(p, cpu_of(rq));
-+
-+ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
-+ }
-+
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+void send_call_function_single_ipi(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (!set_nr_if_polling(rq->idle))
-+ arch_send_call_function_single_ipi(cpu);
-+ else
-+ trace_sched_wake_idle_without_ipi(cpu);
-+}
-+
-+/*
-+ * Queue a task on the target CPUs wake_list and wake the CPU via IPI if
-+ * necessary. The wakee CPU on receipt of the IPI will queue the task
-+ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
-+ * of the wakeup instead of the waker.
-+ */
-+static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
-+
-+ WRITE_ONCE(rq->ttwu_pending, 1);
-+ __smp_call_single_queue(cpu, &p->wake_entry.llist);
-+}
-+
-+static inline bool ttwu_queue_cond(int cpu, int wake_flags)
-+{
-+ /*
-+ * Do not complicate things with the async wake_list while the CPU is
-+ * in hotplug state.
-+ */
-+ if (!cpu_active(cpu))
-+ return false;
-+
-+ /*
-+ * If the CPU does not share cache, then queue the task on the
-+ * remote rqs wakelist to avoid accessing remote data.
-+ */
-+ if (!cpus_share_cache(smp_processor_id(), cpu))
-+ return true;
-+
-+ /*
-+ * If the task is descheduling and the only running task on the
-+ * CPU then use the wakelist to offload the task activation to
-+ * the soon-to-be-idle CPU as the current CPU is likely busy.
-+ * nr_running is checked to avoid unnecessary task stacking.
-+ */
-+ if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
-+ return true;
-+
-+ return false;
-+}
-+
-+static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(cpu, wake_flags)) {
-+ if (WARN_ON_ONCE(cpu == smp_processor_id()))
-+ return false;
-+
-+ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
-+ __ttwu_queue_wakelist(p, cpu, wake_flags);
-+ return true;
-+ }
-+
-+ return false;
-+}
-+
-+void wake_up_if_idle(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ rcu_read_lock();
-+
-+ if (!is_idle_task(rcu_dereference(rq->curr)))
-+ goto out;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (is_idle_task(rq->curr))
-+ resched_curr(rq);
-+ /* Else CPU is not idle, do nothing here */
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+out:
-+ rcu_read_unlock();
-+}
-+
-+bool cpus_share_cache(int this_cpu, int that_cpu)
-+{
-+ if (this_cpu == that_cpu)
-+ return true;
-+
-+ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
-+}
-+#else /* !CONFIG_SMP */
-+
-+static inline bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ return false;
-+}
-+
-+#endif /* CONFIG_SMP */
-+
-+static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (ttwu_queue_wakelist(p, cpu, wake_flags))
-+ return;
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+ ttwu_do_activate(rq, p, wake_flags);
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+/*
-+ * Invoked from try_to_wake_up() to check whether the task can be woken up.
-+ *
-+ * The caller holds p::pi_lock if p != current or has preemption
-+ * disabled when p == current.
-+ *
-+ * The rules of PREEMPT_RT saved_state:
-+ *
-+ * The related locking code always holds p::pi_lock when updating
-+ * p::saved_state, which means the code is fully serialized in both cases.
-+ *
-+ * The lock wait and lock wakeups happen via TASK_RTLOCK_WAIT. No other
-+ * bits set. This allows to distinguish all wakeup scenarios.
-+ */
-+static __always_inline
-+bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
-+{
-+ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
-+ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
-+ state != TASK_RTLOCK_WAIT);
-+ }
-+
-+ if (READ_ONCE(p->__state) & state) {
-+ *success = 1;
-+ return true;
-+ }
-+
-+#ifdef CONFIG_PREEMPT_RT
-+ /*
-+ * Saved state preserves the task state across blocking on
-+ * an RT lock. If the state matches, set p::saved_state to
-+ * TASK_RUNNING, but do not wake the task because it waits
-+ * for a lock wakeup. Also indicate success because from
-+ * the regular waker's point of view this has succeeded.
-+ *
-+ * After acquiring the lock the task will restore p::__state
-+ * from p::saved_state which ensures that the regular
-+ * wakeup is not lost. The restore will also set
-+ * p::saved_state to TASK_RUNNING so any further tests will
-+ * not result in false positives vs. @success
-+ */
-+ if (p->saved_state & state) {
-+ p->saved_state = TASK_RUNNING;
-+ *success = 1;
-+ }
-+#endif
-+ return false;
-+}
-+
-+/*
-+ * Notes on Program-Order guarantees on SMP systems.
-+ *
-+ * MIGRATION
-+ *
-+ * The basic program-order guarantee on SMP systems is that when a task [t]
-+ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
-+ * execution on its new CPU [c1].
-+ *
-+ * For migration (of runnable tasks) this is provided by the following means:
-+ *
-+ * A) UNLOCK of the rq(c0)->lock scheduling out task t
-+ * B) migration for t is required to synchronize *both* rq(c0)->lock and
-+ * rq(c1)->lock (if not at the same time, then in that order).
-+ * C) LOCK of the rq(c1)->lock scheduling in task
-+ *
-+ * Transitivity guarantees that B happens after A and C after B.
-+ * Note: we only require RCpc transitivity.
-+ * Note: the CPU doing B need not be c0 or c1
-+ *
-+ * Example:
-+ *
-+ * CPU0 CPU1 CPU2
-+ *
-+ * LOCK rq(0)->lock
-+ * sched-out X
-+ * sched-in Y
-+ * UNLOCK rq(0)->lock
-+ *
-+ * LOCK rq(0)->lock // orders against CPU0
-+ * dequeue X
-+ * UNLOCK rq(0)->lock
-+ *
-+ * LOCK rq(1)->lock
-+ * enqueue X
-+ * UNLOCK rq(1)->lock
-+ *
-+ * LOCK rq(1)->lock // orders against CPU2
-+ * sched-out Z
-+ * sched-in X
-+ * UNLOCK rq(1)->lock
-+ *
-+ *
-+ * BLOCKING -- aka. SLEEP + WAKEUP
-+ *
-+ * For blocking we (obviously) need to provide the same guarantee as for
-+ * migration. However the means are completely different as there is no lock
-+ * chain to provide order. Instead we do:
-+ *
-+ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
-+ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
-+ *
-+ * Example:
-+ *
-+ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
-+ *
-+ * LOCK rq(0)->lock LOCK X->pi_lock
-+ * dequeue X
-+ * sched-out X
-+ * smp_store_release(X->on_cpu, 0);
-+ *
-+ * smp_cond_load_acquire(&X->on_cpu, !VAL);
-+ * X->state = WAKING
-+ * set_task_cpu(X,2)
-+ *
-+ * LOCK rq(2)->lock
-+ * enqueue X
-+ * X->state = RUNNING
-+ * UNLOCK rq(2)->lock
-+ *
-+ * LOCK rq(2)->lock // orders against CPU1
-+ * sched-out Z
-+ * sched-in X
-+ * UNLOCK rq(2)->lock
-+ *
-+ * UNLOCK X->pi_lock
-+ * UNLOCK rq(0)->lock
-+ *
-+ *
-+ * However; for wakeups there is a second guarantee we must provide, namely we
-+ * must observe the state that lead to our wakeup. That is, not only must our
-+ * task observe its own prior state, it must also observe the stores prior to
-+ * its wakeup.
-+ *
-+ * This means that any means of doing remote wakeups must order the CPU doing
-+ * the wakeup against the CPU the task is going to end up running on. This,
-+ * however, is already required for the regular Program-Order guarantee above,
-+ * since the waking CPU is the one issueing the ACQUIRE (smp_cond_load_acquire).
-+ *
-+ */
-+
-+/**
-+ * try_to_wake_up - wake up a thread
-+ * @p: the thread to be awakened
-+ * @state: the mask of task states that can be woken
-+ * @wake_flags: wake modifier flags (WF_*)
-+ *
-+ * Conceptually does:
-+ *
-+ * If (@state & @p->state) @p->state = TASK_RUNNING.
-+ *
-+ * If the task was not queued/runnable, also place it back on a runqueue.
-+ *
-+ * This function is atomic against schedule() which would dequeue the task.
-+ *
-+ * It issues a full memory barrier before accessing @p->state, see the comment
-+ * with set_current_state().
-+ *
-+ * Uses p->pi_lock to serialize against concurrent wake-ups.
-+ *
-+ * Relies on p->pi_lock stabilizing:
-+ * - p->sched_class
-+ * - p->cpus_ptr
-+ * - p->sched_task_group
-+ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
-+ *
-+ * Tries really hard to only take one task_rq(p)->lock for performance.
-+ * Takes rq->lock in:
-+ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
-+ * - ttwu_queue() -- new rq, for enqueue of the task;
-+ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
-+ *
-+ * As a consequence we race really badly with just about everything. See the
-+ * many memory barriers and their comments for details.
-+ *
-+ * Return: %true if @p->state changes (an actual wakeup was done),
-+ * %false otherwise.
-+ */
-+static int try_to_wake_up(struct task_struct *p, unsigned int state,
-+ int wake_flags)
-+{
-+ unsigned long flags;
-+ int cpu, success = 0;
-+
-+ preempt_disable();
-+ if (p == current) {
-+ /*
-+ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
-+ * == smp_processor_id()'. Together this means we can special
-+ * case the whole 'p->on_rq && ttwu_runnable()' case below
-+ * without taking any locks.
-+ *
-+ * In particular:
-+ * - we rely on Program-Order guarantees for all the ordering,
-+ * - we're serialized against set_special_state() by virtue of
-+ * it disabling IRQs (this allows not taking ->pi_lock).
-+ */
-+ if (!ttwu_state_match(p, state, &success))
-+ goto out;
-+
-+ trace_sched_waking(p);
-+ WRITE_ONCE(p->__state, TASK_RUNNING);
-+ trace_sched_wakeup(p);
-+ goto out;
-+ }
-+
-+ /*
-+ * If we are going to wake up a thread waiting for CONDITION we
-+ * need to ensure that CONDITION=1 done by the caller can not be
-+ * reordered with p->state check below. This pairs with smp_store_mb()
-+ * in set_current_state() that the waiting thread does.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ smp_mb__after_spinlock();
-+ if (!ttwu_state_match(p, state, &success))
-+ goto unlock;
-+
-+ trace_sched_waking(p);
-+
-+ /*
-+ * Ensure we load p->on_rq _after_ p->state, otherwise it would
-+ * be possible to, falsely, observe p->on_rq == 0 and get stuck
-+ * in smp_cond_load_acquire() below.
-+ *
-+ * sched_ttwu_pending() try_to_wake_up()
-+ * STORE p->on_rq = 1 LOAD p->state
-+ * UNLOCK rq->lock
-+ *
-+ * __schedule() (switch to task 'p')
-+ * LOCK rq->lock smp_rmb();
-+ * smp_mb__after_spinlock();
-+ * UNLOCK rq->lock
-+ *
-+ * [task p]
-+ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
-+ *
-+ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
-+ * __schedule(). See the comment for smp_mb__after_spinlock().
-+ *
-+ * A similar smb_rmb() lives in try_invoke_on_locked_down_task().
-+ */
-+ smp_rmb();
-+ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
-+ goto unlock;
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
-+ * possible to, falsely, observe p->on_cpu == 0.
-+ *
-+ * One must be running (->on_cpu == 1) in order to remove oneself
-+ * from the runqueue.
-+ *
-+ * __schedule() (switch to task 'p') try_to_wake_up()
-+ * STORE p->on_cpu = 1 LOAD p->on_rq
-+ * UNLOCK rq->lock
-+ *
-+ * __schedule() (put 'p' to sleep)
-+ * LOCK rq->lock smp_rmb();
-+ * smp_mb__after_spinlock();
-+ * STORE p->on_rq = 0 LOAD p->on_cpu
-+ *
-+ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
-+ * __schedule(). See the comment for smp_mb__after_spinlock().
-+ *
-+ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
-+ * schedule()'s deactivate_task() has 'happened' and p will no longer
-+ * care about it's own p->state. See the comment in __schedule().
-+ */
-+ smp_acquire__after_ctrl_dep();
-+
-+ /*
-+ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
-+ * == 0), which means we need to do an enqueue, change p->state to
-+ * TASK_WAKING such that we can unlock p->pi_lock before doing the
-+ * enqueue, such as ttwu_queue_wakelist().
-+ */
-+ WRITE_ONCE(p->__state, TASK_WAKING);
-+
-+ /*
-+ * If the owning (remote) CPU is still in the middle of schedule() with
-+ * this task as prev, considering queueing p on the remote CPUs wake_list
-+ * which potentially sends an IPI instead of spinning on p->on_cpu to
-+ * let the waker make forward progress. This is safe because IRQs are
-+ * disabled and the IPI will deliver after on_cpu is cleared.
-+ *
-+ * Ensure we load task_cpu(p) after p->on_cpu:
-+ *
-+ * set_task_cpu(p, cpu);
-+ * STORE p->cpu = @cpu
-+ * __schedule() (switch to task 'p')
-+ * LOCK rq->lock
-+ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
-+ * STORE p->on_cpu = 1 LOAD p->cpu
-+ *
-+ * to ensure we observe the correct CPU on which the task is currently
-+ * scheduling.
-+ */
-+ if (smp_load_acquire(&p->on_cpu) &&
-+ ttwu_queue_wakelist(p, task_cpu(p), wake_flags | WF_ON_CPU))
-+ goto unlock;
-+
-+ /*
-+ * If the owning (remote) CPU is still in the middle of schedule() with
-+ * this task as prev, wait until it's done referencing the task.
-+ *
-+ * Pairs with the smp_store_release() in finish_task().
-+ *
-+ * This ensures that tasks getting woken will be fully ordered against
-+ * their previous state and preserve Program Order.
-+ */
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+
-+ sched_task_ttwu(p);
-+
-+ cpu = select_task_rq(p);
-+
-+ if (cpu != task_cpu(p)) {
-+ if (p->in_iowait) {
-+ delayacct_blkio_end(p);
-+ atomic_dec(&task_rq(p)->nr_iowait);
-+ }
-+
-+ wake_flags |= WF_MIGRATED;
-+ psi_ttwu_dequeue(p);
-+ set_task_cpu(p, cpu);
-+ }
-+#else
-+ cpu = task_cpu(p);
-+#endif /* CONFIG_SMP */
-+
-+ ttwu_queue(p, cpu, wake_flags);
-+unlock:
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+out:
-+ if (success)
-+ ttwu_stat(p, task_cpu(p), wake_flags);
-+ preempt_enable();
-+
-+ return success;
-+}
-+
-+/**
-+ * task_call_func - Invoke a function on task in fixed state
-+ * @p: Process for which the function is to be invoked, can be @current.
-+ * @func: Function to invoke.
-+ * @arg: Argument to function.
-+ *
-+ * Fix the task in it's current state by avoiding wakeups and or rq operations
-+ * and call @func(@arg) on it. This function can use ->on_rq and task_curr()
-+ * to work out what the state is, if required. Given that @func can be invoked
-+ * with a runqueue lock held, it had better be quite lightweight.
-+ *
-+ * Returns:
-+ * Whatever @func returns
-+ */
-+int task_call_func(struct task_struct *p, task_call_f func, void *arg)
-+{
-+ struct rq *rq = NULL;
-+ unsigned int state;
-+ struct rq_flags rf;
-+ int ret;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
-+
-+ state = READ_ONCE(p->__state);
-+
-+ /*
-+ * Ensure we load p->on_rq after p->__state, otherwise it would be
-+ * possible to, falsely, observe p->on_rq == 0.
-+ *
-+ * See try_to_wake_up() for a longer comment.
-+ */
-+ smp_rmb();
-+
-+ /*
-+ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
-+ * the task is blocked. Make sure to check @state since ttwu() can drop
-+ * locks at the end, see ttwu_queue_wakelist().
-+ */
-+ if (state == TASK_RUNNING || state == TASK_WAKING || p->on_rq)
-+ rq = __task_rq_lock(p, &rf);
-+
-+ /*
-+ * At this point the task is pinned; either:
-+ * - blocked and we're holding off wakeups (pi->lock)
-+ * - woken, and we're holding off enqueue (rq->lock)
-+ * - queued, and we're holding off schedule (rq->lock)
-+ * - running, and we're holding off de-schedule (rq->lock)
-+ *
-+ * The called function (@func) can use: task_curr(), p->on_rq and
-+ * p->__state to differentiate between these states.
-+ */
-+ ret = func(p, arg);
-+
-+ if (rq)
-+ __task_rq_unlock(rq, &rf);
-+
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
-+ return ret;
-+}
-+
-+/**
-+ * wake_up_process - Wake up a specific process
-+ * @p: The process to be woken up.
-+ *
-+ * Attempt to wake up the nominated process and move it to the set of runnable
-+ * processes.
-+ *
-+ * Return: 1 if the process was woken up, 0 if it was already running.
-+ *
-+ * This function executes a full memory barrier before accessing the task state.
-+ */
-+int wake_up_process(struct task_struct *p)
-+{
-+ return try_to_wake_up(p, TASK_NORMAL, 0);
-+}
-+EXPORT_SYMBOL(wake_up_process);
-+
-+int wake_up_state(struct task_struct *p, unsigned int state)
-+{
-+ return try_to_wake_up(p, state, 0);
-+}
-+
-+/*
-+ * Perform scheduler related setup for a newly forked process p.
-+ * p is forked by current.
-+ *
-+ * __sched_fork() is basic setup used by init_idle() too:
-+ */
-+static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
-+{
-+ p->on_rq = 0;
-+ p->on_cpu = 0;
-+ p->utime = 0;
-+ p->stime = 0;
-+ p->sched_time = 0;
-+
-+#ifdef CONFIG_SCHEDSTATS
-+ /* Even if schedstat is disabled, there should not be garbage */
-+ memset(&p->stats, 0, sizeof(p->stats));
-+#endif
-+
-+#ifdef CONFIG_PREEMPT_NOTIFIERS
-+ INIT_HLIST_HEAD(&p->preempt_notifiers);
-+#endif
-+
-+#ifdef CONFIG_COMPACTION
-+ p->capture_control = NULL;
-+#endif
-+#ifdef CONFIG_SMP
-+ p->wake_entry.u_flags = CSD_TYPE_TTWU;
-+#endif
-+}
-+
-+/*
-+ * fork()/clone()-time setup:
-+ */
-+int sched_fork(unsigned long clone_flags, struct task_struct *p)
-+{
-+ __sched_fork(clone_flags, p);
-+ /*
-+ * We mark the process as NEW here. This guarantees that
-+ * nobody will actually run it, and a signal or other external
-+ * event cannot wake it up and insert it on the runqueue either.
-+ */
-+ p->__state = TASK_NEW;
-+
-+ /*
-+ * Make sure we do not leak PI boosting priority to the child.
-+ */
-+ p->prio = current->normal_prio;
-+
-+ /*
-+ * Revert to default priority/policy on fork if requested.
-+ */
-+ if (unlikely(p->sched_reset_on_fork)) {
-+ if (task_has_rt_policy(p)) {
-+ p->policy = SCHED_NORMAL;
-+ p->static_prio = NICE_TO_PRIO(0);
-+ p->rt_priority = 0;
-+ } else if (PRIO_TO_NICE(p->static_prio) < 0)
-+ p->static_prio = NICE_TO_PRIO(0);
-+
-+ p->prio = p->normal_prio = p->static_prio;
-+
-+ /*
-+ * We don't need the reset flag anymore after the fork. It has
-+ * fulfilled its duty:
-+ */
-+ p->sched_reset_on_fork = 0;
-+ }
-+
-+#ifdef CONFIG_SCHED_INFO
-+ if (unlikely(sched_info_on()))
-+ memset(&p->sched_info, 0, sizeof(p->sched_info));
-+#endif
-+ init_task_preempt_count(p);
-+
-+ return 0;
-+}
-+
-+void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+
-+ /*
-+ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
-+ * required yet, but lockdep gets upset if rules are violated.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ /*
-+ * Share the timeslice between parent and child, thus the
-+ * total amount of pending timeslices in the system doesn't change,
-+ * resulting in more scheduling fairness.
-+ */
-+ rq = this_rq();
-+ raw_spin_lock(&rq->lock);
-+
-+ rq->curr->time_slice /= 2;
-+ p->time_slice = rq->curr->time_slice;
-+#ifdef CONFIG_SCHED_HRTICK
-+ hrtick_start(rq, rq->curr->time_slice);
-+#endif
-+
-+ if (p->time_slice < RESCHED_NS) {
-+ p->time_slice = sched_timeslice_ns;
-+ resched_curr(rq);
-+ }
-+ sched_task_fork(p, rq);
-+ raw_spin_unlock(&rq->lock);
-+
-+ rseq_migrate(p);
-+ /*
-+ * We're setting the CPU for the first time, we don't migrate,
-+ * so use __set_task_cpu().
-+ */
-+ __set_task_cpu(p, smp_processor_id());
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+
-+void sched_post_fork(struct task_struct *p)
-+{
-+}
-+
-+#ifdef CONFIG_SCHEDSTATS
-+
-+DEFINE_STATIC_KEY_FALSE(sched_schedstats);
-+
-+static void set_schedstats(bool enabled)
-+{
-+ if (enabled)
-+ static_branch_enable(&sched_schedstats);
-+ else
-+ static_branch_disable(&sched_schedstats);
-+}
-+
-+void force_schedstat_enabled(void)
-+{
-+ if (!schedstat_enabled()) {
-+ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
-+ static_branch_enable(&sched_schedstats);
-+ }
-+}
-+
-+static int __init setup_schedstats(char *str)
-+{
-+ int ret = 0;
-+ if (!str)
-+ goto out;
-+
-+ if (!strcmp(str, "enable")) {
-+ set_schedstats(true);
-+ ret = 1;
-+ } else if (!strcmp(str, "disable")) {
-+ set_schedstats(false);
-+ ret = 1;
-+ }
-+out:
-+ if (!ret)
-+ pr_warn("Unable to parse schedstats=\n");
-+
-+ return ret;
-+}
-+__setup("schedstats=", setup_schedstats);
-+
-+#ifdef CONFIG_PROC_SYSCTL
-+int sysctl_schedstats(struct ctl_table *table, int write,
-+ void __user *buffer, size_t *lenp, loff_t *ppos)
-+{
-+ struct ctl_table t;
-+ int err;
-+ int state = static_branch_likely(&sched_schedstats);
-+
-+ if (write && !capable(CAP_SYS_ADMIN))
-+ return -EPERM;
-+
-+ t = *table;
-+ t.data = &state;
-+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
-+ if (err < 0)
-+ return err;
-+ if (write)
-+ set_schedstats(state);
-+ return err;
-+}
-+#endif /* CONFIG_PROC_SYSCTL */
-+#endif /* CONFIG_SCHEDSTATS */
-+
-+/*
-+ * wake_up_new_task - wake up a newly created task for the first time.
-+ *
-+ * This function will do some initial scheduler statistics housekeeping
-+ * that must be done for every newly created context, then puts the task
-+ * on the runqueue and wakes it.
-+ */
-+void wake_up_new_task(struct task_struct *p)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ WRITE_ONCE(p->__state, TASK_RUNNING);
-+ rq = cpu_rq(select_task_rq(p));
-+#ifdef CONFIG_SMP
-+ rseq_migrate(p);
-+ /*
-+ * Fork balancing, do it here and not earlier because:
-+ * - cpus_ptr can change in the fork path
-+ * - any previously selected CPU might disappear through hotplug
-+ *
-+ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
-+ * as we're not fully set-up yet.
-+ */
-+ __set_task_cpu(p, cpu_of(rq));
-+#endif
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+
-+ activate_task(p, rq);
-+ trace_sched_wakeup_new(p);
-+ check_preempt_curr(rq);
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+
-+#ifdef CONFIG_PREEMPT_NOTIFIERS
-+
-+static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
-+
-+void preempt_notifier_inc(void)
-+{
-+ static_branch_inc(&preempt_notifier_key);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_inc);
-+
-+void preempt_notifier_dec(void)
-+{
-+ static_branch_dec(&preempt_notifier_key);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_dec);
-+
-+/**
-+ * preempt_notifier_register - tell me when current is being preempted & rescheduled
-+ * @notifier: notifier struct to register
-+ */
-+void preempt_notifier_register(struct preempt_notifier *notifier)
-+{
-+ if (!static_branch_unlikely(&preempt_notifier_key))
-+ WARN(1, "registering preempt_notifier while notifiers disabled\n");
-+
-+ hlist_add_head(¬ifier->link, ¤t->preempt_notifiers);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_register);
-+
-+/**
-+ * preempt_notifier_unregister - no longer interested in preemption notifications
-+ * @notifier: notifier struct to unregister
-+ *
-+ * This is *not* safe to call from within a preemption notifier.
-+ */
-+void preempt_notifier_unregister(struct preempt_notifier *notifier)
-+{
-+ hlist_del(¬ifier->link);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
-+
-+static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+ struct preempt_notifier *notifier;
-+
-+ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
-+ notifier->ops->sched_in(notifier, raw_smp_processor_id());
-+}
-+
-+static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+ if (static_branch_unlikely(&preempt_notifier_key))
-+ __fire_sched_in_preempt_notifiers(curr);
-+}
-+
-+static void
-+__fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+ struct preempt_notifier *notifier;
-+
-+ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
-+ notifier->ops->sched_out(notifier, next);
-+}
-+
-+static __always_inline void
-+fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+ if (static_branch_unlikely(&preempt_notifier_key))
-+ __fire_sched_out_preempt_notifiers(curr, next);
-+}
-+
-+#else /* !CONFIG_PREEMPT_NOTIFIERS */
-+
-+static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+}
-+
-+static inline void
-+fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+}
-+
-+#endif /* CONFIG_PREEMPT_NOTIFIERS */
-+
-+static inline void prepare_task(struct task_struct *next)
-+{
-+ /*
-+ * Claim the task as running, we do this before switching to it
-+ * such that any running task will have this set.
-+ *
-+ * See the ttwu() WF_ON_CPU case and its ordering comment.
-+ */
-+ WRITE_ONCE(next->on_cpu, 1);
-+}
-+
-+static inline void finish_task(struct task_struct *prev)
-+{
-+#ifdef CONFIG_SMP
-+ /*
-+ * This must be the very last reference to @prev from this CPU. After
-+ * p->on_cpu is cleared, the task can be moved to a different CPU. We
-+ * must ensure this doesn't happen until the switch is completely
-+ * finished.
-+ *
-+ * In particular, the load of prev->state in finish_task_switch() must
-+ * happen before this.
-+ *
-+ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
-+ */
-+ smp_store_release(&prev->on_cpu, 0);
-+#else
-+ prev->on_cpu = 0;
-+#endif
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+static void do_balance_callbacks(struct rq *rq, struct callback_head *head)
-+{
-+ void (*func)(struct rq *rq);
-+ struct callback_head *next;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ while (head) {
-+ func = (void (*)(struct rq *))head->func;
-+ next = head->next;
-+ head->next = NULL;
-+ head = next;
-+
-+ func(rq);
-+ }
-+}
-+
-+static void balance_push(struct rq *rq);
-+
-+struct callback_head balance_push_callback = {
-+ .next = NULL,
-+ .func = (void (*)(struct callback_head *))balance_push,
-+};
-+
-+static inline struct callback_head *splice_balance_callbacks(struct rq *rq)
-+{
-+ struct callback_head *head = rq->balance_callback;
-+
-+ if (head) {
-+ lockdep_assert_held(&rq->lock);
-+ rq->balance_callback = NULL;
-+ }
-+
-+ return head;
-+}
-+
-+static void __balance_callbacks(struct rq *rq)
-+{
-+ do_balance_callbacks(rq, splice_balance_callbacks(rq));
-+}
-+
-+static inline void balance_callbacks(struct rq *rq, struct callback_head *head)
-+{
-+ unsigned long flags;
-+
-+ if (unlikely(head)) {
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ do_balance_callbacks(rq, head);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+ }
-+}
-+
-+#else
-+
-+static inline void __balance_callbacks(struct rq *rq)
-+{
-+}
-+
-+static inline struct callback_head *splice_balance_callbacks(struct rq *rq)
-+{
-+ return NULL;
-+}
-+
-+static inline void balance_callbacks(struct rq *rq, struct callback_head *head)
-+{
-+}
-+
-+#endif
-+
-+static inline void
-+prepare_lock_switch(struct rq *rq, struct task_struct *next)
-+{
-+ /*
-+ * Since the runqueue lock will be released by the next
-+ * task (which is an invalid locking op but in the case
-+ * of the scheduler it's an obvious special-case), so we
-+ * do an early lockdep release here:
-+ */
-+ spin_release(&rq->lock.dep_map, _THIS_IP_);
-+#ifdef CONFIG_DEBUG_SPINLOCK
-+ /* this is a valid case when another task releases the spinlock */
-+ rq->lock.owner = next;
-+#endif
-+}
-+
-+static inline void finish_lock_switch(struct rq *rq)
-+{
-+ /*
-+ * If we are tracking spinlock dependencies then we have to
-+ * fix up the runqueue lock - which gets 'carried over' from
-+ * prev into current:
-+ */
-+ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
-+ __balance_callbacks(rq);
-+ raw_spin_unlock_irq(&rq->lock);
-+}
-+
-+/*
-+ * NOP if the arch has not defined these:
-+ */
-+
-+#ifndef prepare_arch_switch
-+# define prepare_arch_switch(next) do { } while (0)
-+#endif
-+
-+#ifndef finish_arch_post_lock_switch
-+# define finish_arch_post_lock_switch() do { } while (0)
-+#endif
-+
-+static inline void kmap_local_sched_out(void)
-+{
-+#ifdef CONFIG_KMAP_LOCAL
-+ if (unlikely(current->kmap_ctrl.idx))
-+ __kmap_local_sched_out();
-+#endif
-+}
-+
-+static inline void kmap_local_sched_in(void)
-+{
-+#ifdef CONFIG_KMAP_LOCAL
-+ if (unlikely(current->kmap_ctrl.idx))
-+ __kmap_local_sched_in();
-+#endif
-+}
-+
-+/**
-+ * prepare_task_switch - prepare to switch tasks
-+ * @rq: the runqueue preparing to switch
-+ * @next: the task we are going to switch to.
-+ *
-+ * This is called with the rq lock held and interrupts off. It must
-+ * be paired with a subsequent finish_task_switch after the context
-+ * switch.
-+ *
-+ * prepare_task_switch sets up locking and calls architecture specific
-+ * hooks.
-+ */
-+static inline void
-+prepare_task_switch(struct rq *rq, struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ kcov_prepare_switch(prev);
-+ sched_info_switch(rq, prev, next);
-+ perf_event_task_sched_out(prev, next);
-+ rseq_preempt(prev);
-+ fire_sched_out_preempt_notifiers(prev, next);
-+ kmap_local_sched_out();
-+ prepare_task(next);
-+ prepare_arch_switch(next);
-+}
-+
-+/**
-+ * finish_task_switch - clean up after a task-switch
-+ * @rq: runqueue associated with task-switch
-+ * @prev: the thread we just switched away from.
-+ *
-+ * finish_task_switch must be called after the context switch, paired
-+ * with a prepare_task_switch call before the context switch.
-+ * finish_task_switch will reconcile locking set up by prepare_task_switch,
-+ * and do any other architecture-specific cleanup actions.
-+ *
-+ * Note that we may have delayed dropping an mm in context_switch(). If
-+ * so, we finish that here outside of the runqueue lock. (Doing it
-+ * with the lock held can cause deadlocks; see schedule() for
-+ * details.)
-+ *
-+ * The context switch have flipped the stack from under us and restored the
-+ * local variables which were saved when this task called schedule() in the
-+ * past. prev == current is still correct but we need to recalculate this_rq
-+ * because prev may have moved to another CPU.
-+ */
-+static struct rq *finish_task_switch(struct task_struct *prev)
-+ __releases(rq->lock)
-+{
-+ struct rq *rq = this_rq();
-+ struct mm_struct *mm = rq->prev_mm;
-+ unsigned int prev_state;
-+
-+ /*
-+ * The previous task will have left us with a preempt_count of 2
-+ * because it left us after:
-+ *
-+ * schedule()
-+ * preempt_disable(); // 1
-+ * __schedule()
-+ * raw_spin_lock_irq(&rq->lock) // 2
-+ *
-+ * Also, see FORK_PREEMPT_COUNT.
-+ */
-+ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
-+ "corrupted preempt_count: %s/%d/0x%x\n",
-+ current->comm, current->pid, preempt_count()))
-+ preempt_count_set(FORK_PREEMPT_COUNT);
-+
-+ rq->prev_mm = NULL;
-+
-+ /*
-+ * A task struct has one reference for the use as "current".
-+ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
-+ * schedule one last time. The schedule call will never return, and
-+ * the scheduled task must drop that reference.
-+ *
-+ * We must observe prev->state before clearing prev->on_cpu (in
-+ * finish_task), otherwise a concurrent wakeup can get prev
-+ * running on another CPU and we could rave with its RUNNING -> DEAD
-+ * transition, resulting in a double drop.
-+ */
-+ prev_state = READ_ONCE(prev->__state);
-+ vtime_task_switch(prev);
-+ perf_event_task_sched_in(prev, current);
-+ finish_task(prev);
-+ tick_nohz_task_switch();
-+ finish_lock_switch(rq);
-+ finish_arch_post_lock_switch();
-+ kcov_finish_switch(current);
-+ /*
-+ * kmap_local_sched_out() is invoked with rq::lock held and
-+ * interrupts disabled. There is no requirement for that, but the
-+ * sched out code does not have an interrupt enabled section.
-+ * Restoring the maps on sched in does not require interrupts being
-+ * disabled either.
-+ */
-+ kmap_local_sched_in();
-+
-+ fire_sched_in_preempt_notifiers(current);
-+ /*
-+ * When switching through a kernel thread, the loop in
-+ * membarrier_{private,global}_expedited() may have observed that
-+ * kernel thread and not issued an IPI. It is therefore possible to
-+ * schedule between user->kernel->user threads without passing though
-+ * switch_mm(). Membarrier requires a barrier after storing to
-+ * rq->curr, before returning to userspace, so provide them here:
-+ *
-+ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
-+ * provided by mmdrop(),
-+ * - a sync_core for SYNC_CORE.
-+ */
-+ if (mm) {
-+ membarrier_mm_sync_core_before_usermode(mm);
-+ mmdrop_sched(mm);
-+ }
-+ if (unlikely(prev_state == TASK_DEAD)) {
-+ /* Task is done with its stack. */
-+ put_task_stack(prev);
-+
-+ put_task_struct_rcu_user(prev);
-+ }
-+
-+ return rq;
-+}
-+
-+/**
-+ * schedule_tail - first thing a freshly forked thread must call.
-+ * @prev: the thread we just switched away from.
-+ */
-+asmlinkage __visible void schedule_tail(struct task_struct *prev)
-+ __releases(rq->lock)
-+{
-+ /*
-+ * New tasks start with FORK_PREEMPT_COUNT, see there and
-+ * finish_task_switch() for details.
-+ *
-+ * finish_task_switch() will drop rq->lock() and lower preempt_count
-+ * and the preempt_enable() will end up enabling preemption (on
-+ * PREEMPT_COUNT kernels).
-+ */
-+
-+ finish_task_switch(prev);
-+ preempt_enable();
-+
-+ if (current->set_child_tid)
-+ put_user(task_pid_vnr(current), current->set_child_tid);
-+
-+ calculate_sigpending();
-+}
-+
-+/*
-+ * context_switch - switch to the new MM and the new thread's register state.
-+ */
-+static __always_inline struct rq *
-+context_switch(struct rq *rq, struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ prepare_task_switch(rq, prev, next);
-+
-+ /*
-+ * For paravirt, this is coupled with an exit in switch_to to
-+ * combine the page table reload and the switch backend into
-+ * one hypercall.
-+ */
-+ arch_start_context_switch(prev);
-+
-+ /*
-+ * kernel -> kernel lazy + transfer active
-+ * user -> kernel lazy + mmgrab() active
-+ *
-+ * kernel -> user switch + mmdrop() active
-+ * user -> user switch
-+ */
-+ if (!next->mm) { // to kernel
-+ enter_lazy_tlb(prev->active_mm, next);
-+
-+ next->active_mm = prev->active_mm;
-+ if (prev->mm) // from user
-+ mmgrab(prev->active_mm);
-+ else
-+ prev->active_mm = NULL;
-+ } else { // to user
-+ membarrier_switch_mm(rq, prev->active_mm, next->mm);
-+ /*
-+ * sys_membarrier() requires an smp_mb() between setting
-+ * rq->curr / membarrier_switch_mm() and returning to userspace.
-+ *
-+ * The below provides this either through switch_mm(), or in
-+ * case 'prev->active_mm == next->mm' through
-+ * finish_task_switch()'s mmdrop().
-+ */
-+ switch_mm_irqs_off(prev->active_mm, next->mm, next);
-+
-+ if (!prev->mm) { // from kernel
-+ /* will mmdrop() in finish_task_switch(). */
-+ rq->prev_mm = prev->active_mm;
-+ prev->active_mm = NULL;
-+ }
-+ }
-+
-+ prepare_lock_switch(rq, next);
-+
-+ /* Here we just switch the register state and the stack. */
-+ switch_to(prev, next, prev);
-+ barrier();
-+
-+ return finish_task_switch(prev);
-+}
-+
-+/*
-+ * nr_running, nr_uninterruptible and nr_context_switches:
-+ *
-+ * externally visible scheduler statistics: current number of runnable
-+ * threads, total number of context switches performed since bootup.
-+ */
-+unsigned int nr_running(void)
-+{
-+ unsigned int i, sum = 0;
-+
-+ for_each_online_cpu(i)
-+ sum += cpu_rq(i)->nr_running;
-+
-+ return sum;
-+}
-+
-+/*
-+ * Check if only the current task is running on the CPU.
-+ *
-+ * Caution: this function does not check that the caller has disabled
-+ * preemption, thus the result might have a time-of-check-to-time-of-use
-+ * race. The caller is responsible to use it correctly, for example:
-+ *
-+ * - from a non-preemptible section (of course)
-+ *
-+ * - from a thread that is bound to a single CPU
-+ *
-+ * - in a loop with very short iterations (e.g. a polling loop)
-+ */
-+bool single_task_running(void)
-+{
-+ return raw_rq()->nr_running == 1;
-+}
-+EXPORT_SYMBOL(single_task_running);
-+
-+unsigned long long nr_context_switches(void)
-+{
-+ int i;
-+ unsigned long long sum = 0;
-+
-+ for_each_possible_cpu(i)
-+ sum += cpu_rq(i)->nr_switches;
-+
-+ return sum;
-+}
-+
-+/*
-+ * Consumers of these two interfaces, like for example the cpuidle menu
-+ * governor, are using nonsensical data. Preferring shallow idle state selection
-+ * for a CPU that has IO-wait which might not even end up running the task when
-+ * it does become runnable.
-+ */
-+
-+unsigned int nr_iowait_cpu(int cpu)
-+{
-+ return atomic_read(&cpu_rq(cpu)->nr_iowait);
-+}
-+
-+/*
-+ * IO-wait accounting, and how it's mostly bollocks (on SMP).
-+ *
-+ * The idea behind IO-wait account is to account the idle time that we could
-+ * have spend running if it were not for IO. That is, if we were to improve the
-+ * storage performance, we'd have a proportional reduction in IO-wait time.
-+ *
-+ * This all works nicely on UP, where, when a task blocks on IO, we account
-+ * idle time as IO-wait, because if the storage were faster, it could've been
-+ * running and we'd not be idle.
-+ *
-+ * This has been extended to SMP, by doing the same for each CPU. This however
-+ * is broken.
-+ *
-+ * Imagine for instance the case where two tasks block on one CPU, only the one
-+ * CPU will have IO-wait accounted, while the other has regular idle. Even
-+ * though, if the storage were faster, both could've ran at the same time,
-+ * utilising both CPUs.
-+ *
-+ * This means, that when looking globally, the current IO-wait accounting on
-+ * SMP is a lower bound, by reason of under accounting.
-+ *
-+ * Worse, since the numbers are provided per CPU, they are sometimes
-+ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
-+ * associated with any one particular CPU, it can wake to another CPU than it
-+ * blocked on. This means the per CPU IO-wait number is meaningless.
-+ *
-+ * Task CPU affinities can make all that even more 'interesting'.
-+ */
-+
-+unsigned int nr_iowait(void)
-+{
-+ unsigned int i, sum = 0;
-+
-+ for_each_possible_cpu(i)
-+ sum += nr_iowait_cpu(i);
-+
-+ return sum;
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+/*
-+ * sched_exec - execve() is a valuable balancing opportunity, because at
-+ * this point the task has the smallest effective memory and cache
-+ * footprint.
-+ */
-+void sched_exec(void)
-+{
-+}
-+
-+#endif
-+
-+DEFINE_PER_CPU(struct kernel_stat, kstat);
-+DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
-+
-+EXPORT_PER_CPU_SYMBOL(kstat);
-+EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
-+
-+static inline void update_curr(struct rq *rq, struct task_struct *p)
-+{
-+ s64 ns = rq->clock_task - p->last_ran;
-+
-+ p->sched_time += ns;
-+ cgroup_account_cputime(p, ns);
-+ account_group_exec_runtime(p, ns);
-+
-+ p->time_slice -= ns;
-+ p->last_ran = rq->clock_task;
-+}
-+
-+/*
-+ * Return accounted runtime for the task.
-+ * Return separately the current's pending runtime that have not been
-+ * accounted yet.
-+ */
-+unsigned long long task_sched_runtime(struct task_struct *p)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+ u64 ns;
-+
-+#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
-+ /*
-+ * 64-bit doesn't need locks to atomically read a 64-bit value.
-+ * So we have a optimization chance when the task's delta_exec is 0.
-+ * Reading ->on_cpu is racy, but this is ok.
-+ *
-+ * If we race with it leaving CPU, we'll take a lock. So we're correct.
-+ * If we race with it entering CPU, unaccounted time is 0. This is
-+ * indistinguishable from the read occurring a few cycles earlier.
-+ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
-+ * been accounted, so we're correct here as well.
-+ */
-+ if (!p->on_cpu || !task_on_rq_queued(p))
-+ return tsk_seruntime(p);
-+#endif
-+
-+ rq = task_access_lock_irqsave(p, &lock, &flags);
-+ /*
-+ * Must be ->curr _and_ ->on_rq. If dequeued, we would
-+ * project cycles that may never be accounted to this
-+ * thread, breaking clock_gettime().
-+ */
-+ if (p == rq->curr && task_on_rq_queued(p)) {
-+ update_rq_clock(rq);
-+ update_curr(rq, p);
-+ }
-+ ns = tsk_seruntime(p);
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+ return ns;
-+}
-+
-+/* This manages tasks that have run out of timeslice during a scheduler_tick */
-+static inline void scheduler_task_tick(struct rq *rq)
-+{
-+ struct task_struct *p = rq->curr;
-+
-+ if (is_idle_task(p))
-+ return;
-+
-+ update_curr(rq, p);
-+ cpufreq_update_util(rq, 0);
-+
-+ /*
-+ * Tasks have less than RESCHED_NS of time slice left they will be
-+ * rescheduled.
-+ */
-+ if (p->time_slice >= RESCHED_NS)
-+ return;
-+ set_tsk_need_resched(p);
-+ set_preempt_need_resched();
-+}
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+static u64 cpu_resched_latency(struct rq *rq)
-+{
-+ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
-+ u64 resched_latency, now = rq_clock(rq);
-+ static bool warned_once;
-+
-+ if (sysctl_resched_latency_warn_once && warned_once)
-+ return 0;
-+
-+ if (!need_resched() || !latency_warn_ms)
-+ return 0;
-+
-+ if (system_state == SYSTEM_BOOTING)
-+ return 0;
-+
-+ if (!rq->last_seen_need_resched_ns) {
-+ rq->last_seen_need_resched_ns = now;
-+ rq->ticks_without_resched = 0;
-+ return 0;
-+ }
-+
-+ rq->ticks_without_resched++;
-+ resched_latency = now - rq->last_seen_need_resched_ns;
-+ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
-+ return 0;
-+
-+ warned_once = true;
-+
-+ return resched_latency;
-+}
-+
-+static int __init setup_resched_latency_warn_ms(char *str)
-+{
-+ long val;
-+
-+ if ((kstrtol(str, 0, &val))) {
-+ pr_warn("Unable to set resched_latency_warn_ms\n");
-+ return 1;
-+ }
-+
-+ sysctl_resched_latency_warn_ms = val;
-+ return 1;
-+}
-+__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
-+#else
-+static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
-+#endif /* CONFIG_SCHED_DEBUG */
-+
-+/*
-+ * This function gets called by the timer code, with HZ frequency.
-+ * We call it with interrupts disabled.
-+ */
-+void scheduler_tick(void)
-+{
-+ int cpu __maybe_unused = smp_processor_id();
-+ struct rq *rq = cpu_rq(cpu);
-+ u64 resched_latency;
-+
-+ arch_scale_freq_tick();
-+ sched_clock_tick();
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+
-+ scheduler_task_tick(rq);
-+ if (sched_feat(LATENCY_WARN))
-+ resched_latency = cpu_resched_latency(rq);
-+ calc_global_load_tick(rq);
-+
-+ rq->last_tick = rq->clock;
-+ raw_spin_unlock(&rq->lock);
-+
-+ if (sched_feat(LATENCY_WARN) && resched_latency)
-+ resched_latency_warn(cpu, resched_latency);
-+
-+ perf_event_task_tick();
-+}
-+
-+#ifdef CONFIG_SCHED_SMT
-+static inline int sg_balance_cpu_stop(void *data)
-+{
-+ struct rq *rq = this_rq();
-+ struct task_struct *p = data;
-+ cpumask_t tmp;
-+ unsigned long flags;
-+
-+ local_irq_save(flags);
-+
-+ raw_spin_lock(&p->pi_lock);
-+ raw_spin_lock(&rq->lock);
-+
-+ rq->active_balance = 0;
-+ /* _something_ may have changed the task, double check again */
-+ if (task_on_rq_queued(p) && task_rq(p) == rq &&
-+ cpumask_and(&tmp, p->cpus_ptr, &sched_sg_idle_mask) &&
-+ !is_migration_disabled(p)) {
-+ int cpu = cpu_of(rq);
-+ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu));
-+ rq = move_queued_task(rq, p, dcpu);
-+ }
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock(&p->pi_lock);
-+
-+ local_irq_restore(flags);
-+
-+ return 0;
-+}
-+
-+/* sg_balance_trigger - trigger slibing group balance for @cpu */
-+static inline int sg_balance_trigger(const int cpu)
-+{
-+ struct rq *rq= cpu_rq(cpu);
-+ unsigned long flags;
-+ struct task_struct *curr;
-+ int res;
-+
-+ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
-+ return 0;
-+ curr = rq->curr;
-+ res = (!is_idle_task(curr)) && (1 == rq->nr_running) &&\
-+ cpumask_intersects(curr->cpus_ptr, &sched_sg_idle_mask) &&\
-+ !is_migration_disabled(curr) && (!rq->active_balance);
-+
-+ if (res)
-+ rq->active_balance = 1;
-+
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ if (res)
-+ stop_one_cpu_nowait(cpu, sg_balance_cpu_stop, curr,
-+ &rq->active_balance_work);
-+ return res;
-+}
-+
-+/*
-+ * sg_balance - slibing group balance check for run queue @rq
-+ */
-+static inline void sg_balance(struct rq *rq)
-+{
-+ cpumask_t chk;
-+ int cpu = cpu_of(rq);
-+
-+ /* exit when cpu is offline */
-+ if (unlikely(!rq->online))
-+ return;
-+
-+ /*
-+ * Only cpu in slibing idle group will do the checking and then
-+ * find potential cpus which can migrate the current running task
-+ */
-+ if (cpumask_test_cpu(cpu, &sched_sg_idle_mask) &&
-+ cpumask_andnot(&chk, cpu_online_mask, sched_rq_watermark) &&
-+ cpumask_andnot(&chk, &chk, &sched_rq_pending_mask)) {
-+ int i;
-+
-+ for_each_cpu_wrap(i, &chk, cpu) {
-+ if (cpumask_subset(cpu_smt_mask(i), &chk) &&
-+ sg_balance_trigger(i))
-+ return;
-+ }
-+ }
-+}
-+#endif /* CONFIG_SCHED_SMT */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+
-+struct tick_work {
-+ int cpu;
-+ atomic_t state;
-+ struct delayed_work work;
-+};
-+/* Values for ->state, see diagram below. */
-+#define TICK_SCHED_REMOTE_OFFLINE 0
-+#define TICK_SCHED_REMOTE_OFFLINING 1
-+#define TICK_SCHED_REMOTE_RUNNING 2
-+
-+/*
-+ * State diagram for ->state:
-+ *
-+ *
-+ * TICK_SCHED_REMOTE_OFFLINE
-+ * | ^
-+ * | |
-+ * | | sched_tick_remote()
-+ * | |
-+ * | |
-+ * +--TICK_SCHED_REMOTE_OFFLINING
-+ * | ^
-+ * | |
-+ * sched_tick_start() | | sched_tick_stop()
-+ * | |
-+ * V |
-+ * TICK_SCHED_REMOTE_RUNNING
-+ *
-+ *
-+ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
-+ * and sched_tick_start() are happy to leave the state in RUNNING.
-+ */
-+
-+static struct tick_work __percpu *tick_work_cpu;
-+
-+static void sched_tick_remote(struct work_struct *work)
-+{
-+ struct delayed_work *dwork = to_delayed_work(work);
-+ struct tick_work *twork = container_of(dwork, struct tick_work, work);
-+ int cpu = twork->cpu;
-+ struct rq *rq = cpu_rq(cpu);
-+ struct task_struct *curr;
-+ unsigned long flags;
-+ u64 delta;
-+ int os;
-+
-+ /*
-+ * Handle the tick only if it appears the remote CPU is running in full
-+ * dynticks mode. The check is racy by nature, but missing a tick or
-+ * having one too much is no big deal because the scheduler tick updates
-+ * statistics and checks timeslices in a time-independent way, regardless
-+ * of when exactly it is running.
-+ */
-+ if (!tick_nohz_tick_stopped_cpu(cpu))
-+ goto out_requeue;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ curr = rq->curr;
-+ if (cpu_is_offline(cpu))
-+ goto out_unlock;
-+
-+ update_rq_clock(rq);
-+ if (!is_idle_task(curr)) {
-+ /*
-+ * Make sure the next tick runs within a reasonable
-+ * amount of time.
-+ */
-+ delta = rq_clock_task(rq) - curr->last_ran;
-+ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
-+ }
-+ scheduler_task_tick(rq);
-+
-+ calc_load_nohz_remote(rq);
-+out_unlock:
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+out_requeue:
-+ /*
-+ * Run the remote tick once per second (1Hz). This arbitrary
-+ * frequency is large enough to avoid overload but short enough
-+ * to keep scheduler internal stats reasonably up to date. But
-+ * first update state to reflect hotplug activity if required.
-+ */
-+ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
-+ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
-+ if (os == TICK_SCHED_REMOTE_RUNNING)
-+ queue_delayed_work(system_unbound_wq, dwork, HZ);
-+}
-+
-+static void sched_tick_start(int cpu)
-+{
-+ int os;
-+ struct tick_work *twork;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ return;
-+
-+ WARN_ON_ONCE(!tick_work_cpu);
-+
-+ twork = per_cpu_ptr(tick_work_cpu, cpu);
-+ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
-+ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
-+ if (os == TICK_SCHED_REMOTE_OFFLINE) {
-+ twork->cpu = cpu;
-+ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
-+ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
-+ }
-+}
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+static void sched_tick_stop(int cpu)
-+{
-+ struct tick_work *twork;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ return;
-+
-+ WARN_ON_ONCE(!tick_work_cpu);
-+
-+ twork = per_cpu_ptr(tick_work_cpu, cpu);
-+ cancel_delayed_work_sync(&twork->work);
-+}
-+#endif /* CONFIG_HOTPLUG_CPU */
-+
-+int __init sched_tick_offload_init(void)
-+{
-+ tick_work_cpu = alloc_percpu(struct tick_work);
-+ BUG_ON(!tick_work_cpu);
-+ return 0;
-+}
-+
-+#else /* !CONFIG_NO_HZ_FULL */
-+static inline void sched_tick_start(int cpu) { }
-+static inline void sched_tick_stop(int cpu) { }
-+#endif
-+
-+#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
-+ defined(CONFIG_PREEMPT_TRACER))
-+/*
-+ * If the value passed in is equal to the current preempt count
-+ * then we just disabled preemption. Start timing the latency.
-+ */
-+static inline void preempt_latency_start(int val)
-+{
-+ if (preempt_count() == val) {
-+ unsigned long ip = get_lock_parent_ip();
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ current->preempt_disable_ip = ip;
-+#endif
-+ trace_preempt_off(CALLER_ADDR0, ip);
-+ }
-+}
-+
-+void preempt_count_add(int val)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Underflow?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
-+ return;
-+#endif
-+ __preempt_count_add(val);
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Spinlock count overflowing soon?
-+ */
-+ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
-+ PREEMPT_MASK - 10);
-+#endif
-+ preempt_latency_start(val);
-+}
-+EXPORT_SYMBOL(preempt_count_add);
-+NOKPROBE_SYMBOL(preempt_count_add);
-+
-+/*
-+ * If the value passed in equals to the current preempt count
-+ * then we just enabled preemption. Stop timing the latency.
-+ */
-+static inline void preempt_latency_stop(int val)
-+{
-+ if (preempt_count() == val)
-+ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
-+}
-+
-+void preempt_count_sub(int val)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Underflow?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
-+ return;
-+ /*
-+ * Is the spinlock portion underflowing?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
-+ !(preempt_count() & PREEMPT_MASK)))
-+ return;
-+#endif
-+
-+ preempt_latency_stop(val);
-+ __preempt_count_sub(val);
-+}
-+EXPORT_SYMBOL(preempt_count_sub);
-+NOKPROBE_SYMBOL(preempt_count_sub);
-+
-+#else
-+static inline void preempt_latency_start(int val) { }
-+static inline void preempt_latency_stop(int val) { }
-+#endif
-+
-+static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ return p->preempt_disable_ip;
-+#else
-+ return 0;
-+#endif
-+}
-+
-+/*
-+ * Print scheduling while atomic bug:
-+ */
-+static noinline void __schedule_bug(struct task_struct *prev)
-+{
-+ /* Save this before calling printk(), since that will clobber it */
-+ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
-+
-+ if (oops_in_progress)
-+ return;
-+
-+ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
-+ prev->comm, prev->pid, preempt_count());
-+
-+ debug_show_held_locks(prev);
-+ print_modules();
-+ if (irqs_disabled())
-+ print_irqtrace_events(prev);
-+ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)
-+ && in_atomic_preempt_off()) {
-+ pr_err("Preemption disabled at:");
-+ print_ip_sym(KERN_ERR, preempt_disable_ip);
-+ }
-+ if (panic_on_warn)
-+ panic("scheduling while atomic\n");
-+
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+
-+/*
-+ * Various schedule()-time debugging checks and statistics:
-+ */
-+static inline void schedule_debug(struct task_struct *prev, bool preempt)
-+{
-+#ifdef CONFIG_SCHED_STACK_END_CHECK
-+ if (task_stack_end_corrupted(prev))
-+ panic("corrupted stack end detected inside scheduler\n");
-+
-+ if (task_scs_end_corrupted(prev))
-+ panic("corrupted shadow stack detected inside scheduler\n");
-+#endif
-+
-+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-+ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
-+ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
-+ prev->comm, prev->pid, prev->non_block_count);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+ }
-+#endif
-+
-+ if (unlikely(in_atomic_preempt_off())) {
-+ __schedule_bug(prev);
-+ preempt_count_set(PREEMPT_DISABLED);
-+ }
-+ rcu_sleep_check();
-+ SCHED_WARN_ON(ct_state() == CONTEXT_USER);
-+
-+ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
-+
-+ schedstat_inc(this_rq()->sched_count);
-+}
-+
-+/*
-+ * Compile time debug macro
-+ * #define ALT_SCHED_DEBUG
-+ */
-+
-+#ifdef ALT_SCHED_DEBUG
-+void alt_sched_debug(void)
-+{
-+ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx\n",
-+ sched_rq_pending_mask.bits[0],
-+ sched_rq_watermark[0].bits[0],
-+ sched_sg_idle_mask.bits[0]);
-+}
-+#else
-+inline void alt_sched_debug(void) {}
-+#endif
-+
-+#ifdef CONFIG_SMP
-+
-+#define SCHED_RQ_NR_MIGRATION (32U)
-+/*
-+ * Migrate pending tasks in @rq to @dest_cpu
-+ * Will try to migrate mininal of half of @rq nr_running tasks and
-+ * SCHED_RQ_NR_MIGRATION to @dest_cpu
-+ */
-+static inline int
-+migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
-+{
-+ struct task_struct *p, *skip = rq->curr;
-+ int nr_migrated = 0;
-+ int nr_tries = min(rq->nr_running / 2, SCHED_RQ_NR_MIGRATION);
-+
-+ while (skip != rq->idle && nr_tries &&
-+ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
-+ skip = sched_rq_next_task(p, rq);
-+ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
-+ __SCHED_DEQUEUE_TASK(p, rq, 0);
-+ set_task_cpu(p, dest_cpu);
-+ sched_task_sanity_check(p, dest_rq);
-+ __SCHED_ENQUEUE_TASK(p, dest_rq, 0);
-+ nr_migrated++;
-+ }
-+ nr_tries--;
-+ }
-+
-+ return nr_migrated;
-+}
-+
-+static inline int take_other_rq_tasks(struct rq *rq, int cpu)
-+{
-+ struct cpumask *topo_mask, *end_mask;
-+
-+ if (unlikely(!rq->online))
-+ return 0;
-+
-+ if (cpumask_empty(&sched_rq_pending_mask))
-+ return 0;
-+
-+ topo_mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
-+ do {
-+ int i;
-+ for_each_cpu_and(i, &sched_rq_pending_mask, topo_mask) {
-+ int nr_migrated;
-+ struct rq *src_rq;
-+
-+ src_rq = cpu_rq(i);
-+ if (!do_raw_spin_trylock(&src_rq->lock))
-+ continue;
-+ spin_acquire(&src_rq->lock.dep_map,
-+ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
-+
-+ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
-+ src_rq->nr_running -= nr_migrated;
-+ if (src_rq->nr_running < 2)
-+ cpumask_clear_cpu(i, &sched_rq_pending_mask);
-+
-+ rq->nr_running += nr_migrated;
-+ if (rq->nr_running > 1)
-+ cpumask_set_cpu(cpu, &sched_rq_pending_mask);
-+
-+ cpufreq_update_util(rq, 0);
-+
-+ spin_release(&src_rq->lock.dep_map, _RET_IP_);
-+ do_raw_spin_unlock(&src_rq->lock);
-+
-+ return 1;
-+ }
-+
-+ spin_release(&src_rq->lock.dep_map, _RET_IP_);
-+ do_raw_spin_unlock(&src_rq->lock);
-+ }
-+ } while (++topo_mask < end_mask);
-+
-+ return 0;
-+}
-+#endif
-+
-+/*
-+ * Timeslices below RESCHED_NS are considered as good as expired as there's no
-+ * point rescheduling when there's so little time left.
-+ */
-+static inline void check_curr(struct task_struct *p, struct rq *rq)
-+{
-+ if (unlikely(rq->idle == p))
-+ return;
-+
-+ update_curr(rq, p);
-+
-+ if (p->time_slice < RESCHED_NS)
-+ time_slice_expired(p, rq);
-+}
-+
-+static inline struct task_struct *
-+choose_next_task(struct rq *rq, int cpu, struct task_struct *prev)
-+{
-+ struct task_struct *next;
-+
-+ if (unlikely(rq->skip)) {
-+ next = rq_runnable_task(rq);
-+ if (next == rq->idle) {
-+#ifdef CONFIG_SMP
-+ if (!take_other_rq_tasks(rq, cpu)) {
-+#endif
-+ rq->skip = NULL;
-+ schedstat_inc(rq->sched_goidle);
-+ return next;
-+#ifdef CONFIG_SMP
-+ }
-+ next = rq_runnable_task(rq);
-+#endif
-+ }
-+ rq->skip = NULL;
-+#ifdef CONFIG_HIGH_RES_TIMERS
-+ hrtick_start(rq, next->time_slice);
-+#endif
-+ return next;
-+ }
-+
-+ next = sched_rq_first_task(rq);
-+ if (next == rq->idle) {
-+#ifdef CONFIG_SMP
-+ if (!take_other_rq_tasks(rq, cpu)) {
-+#endif
-+ schedstat_inc(rq->sched_goidle);
-+ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
-+ return next;
-+#ifdef CONFIG_SMP
-+ }
-+ next = sched_rq_first_task(rq);
-+#endif
-+ }
-+#ifdef CONFIG_HIGH_RES_TIMERS
-+ hrtick_start(rq, next->time_slice);
-+#endif
-+ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu,
-+ * next);*/
-+ return next;
-+}
-+
-+/*
-+ * Constants for the sched_mode argument of __schedule().
-+ *
-+ * The mode argument allows RT enabled kernels to differentiate a
-+ * preemption from blocking on an 'sleeping' spin/rwlock. Note that
-+ * SM_MASK_PREEMPT for !RT has all bits set, which allows the compiler to
-+ * optimize the AND operation out and just check for zero.
-+ */
-+#define SM_NONE 0x0
-+#define SM_PREEMPT 0x1
-+#define SM_RTLOCK_WAIT 0x2
-+
-+#ifndef CONFIG_PREEMPT_RT
-+# define SM_MASK_PREEMPT (~0U)
-+#else
-+# define SM_MASK_PREEMPT SM_PREEMPT
-+#endif
-+
-+/*
-+ * schedule() is the main scheduler function.
-+ *
-+ * The main means of driving the scheduler and thus entering this function are:
-+ *
-+ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
-+ *
-+ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
-+ * paths. For example, see arch/x86/entry_64.S.
-+ *
-+ * To drive preemption between tasks, the scheduler sets the flag in timer
-+ * interrupt handler scheduler_tick().
-+ *
-+ * 3. Wakeups don't really cause entry into schedule(). They add a
-+ * task to the run-queue and that's it.
-+ *
-+ * Now, if the new task added to the run-queue preempts the current
-+ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
-+ * called on the nearest possible occasion:
-+ *
-+ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
-+ *
-+ * - in syscall or exception context, at the next outmost
-+ * preempt_enable(). (this might be as soon as the wake_up()'s
-+ * spin_unlock()!)
-+ *
-+ * - in IRQ context, return from interrupt-handler to
-+ * preemptible context
-+ *
-+ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
-+ * then at the next:
-+ *
-+ * - cond_resched() call
-+ * - explicit schedule() call
-+ * - return from syscall or exception to user-space
-+ * - return from interrupt-handler to user-space
-+ *
-+ * WARNING: must be called with preemption disabled!
-+ */
-+static void __sched notrace __schedule(unsigned int sched_mode)
-+{
-+ struct task_struct *prev, *next;
-+ unsigned long *switch_count;
-+ unsigned long prev_state;
-+ struct rq *rq;
-+ int cpu;
-+ int deactivated = 0;
-+
-+ cpu = smp_processor_id();
-+ rq = cpu_rq(cpu);
-+ prev = rq->curr;
-+
-+ schedule_debug(prev, !!sched_mode);
-+
-+ /* by passing sched_feat(HRTICK) checking which Alt schedule FW doesn't support */
-+ hrtick_clear(rq);
-+
-+ local_irq_disable();
-+ rcu_note_context_switch(!!sched_mode);
-+
-+ /*
-+ * Make sure that signal_pending_state()->signal_pending() below
-+ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
-+ * done by the caller to avoid the race with signal_wake_up():
-+ *
-+ * __set_current_state(@state) signal_wake_up()
-+ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
-+ * wake_up_state(p, state)
-+ * LOCK rq->lock LOCK p->pi_state
-+ * smp_mb__after_spinlock() smp_mb__after_spinlock()
-+ * if (signal_pending_state()) if (p->state & @state)
-+ *
-+ * Also, the membarrier system call requires a full memory barrier
-+ * after coming from user-space, before storing to rq->curr.
-+ */
-+ raw_spin_lock(&rq->lock);
-+ smp_mb__after_spinlock();
-+
-+ update_rq_clock(rq);
-+
-+ switch_count = &prev->nivcsw;
-+ /*
-+ * We must load prev->state once (task_struct::state is volatile), such
-+ * that:
-+ *
-+ * - we form a control dependency vs deactivate_task() below.
-+ * - ptrace_{,un}freeze_traced() can change ->state underneath us.
-+ */
-+ prev_state = READ_ONCE(prev->__state);
-+ if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
-+ if (signal_pending_state(prev_state, prev)) {
-+ WRITE_ONCE(prev->__state, TASK_RUNNING);
-+ } else {
-+ prev->sched_contributes_to_load =
-+ (prev_state & TASK_UNINTERRUPTIBLE) &&
-+ !(prev_state & TASK_NOLOAD) &&
-+ !(prev->flags & PF_FROZEN);
-+
-+ if (prev->sched_contributes_to_load)
-+ rq->nr_uninterruptible++;
-+
-+ /*
-+ * __schedule() ttwu()
-+ * prev_state = prev->state; if (p->on_rq && ...)
-+ * if (prev_state) goto out;
-+ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
-+ * p->state = TASK_WAKING
-+ *
-+ * Where __schedule() and ttwu() have matching control dependencies.
-+ *
-+ * After this, schedule() must not care about p->state any more.
-+ */
-+ sched_task_deactivate(prev, rq);
-+ deactivate_task(prev, rq);
-+ deactivated = 1;
-+
-+ if (prev->in_iowait) {
-+ atomic_inc(&rq->nr_iowait);
-+ delayacct_blkio_start();
-+ }
-+ }
-+ switch_count = &prev->nvcsw;
-+ }
-+
-+ check_curr(prev, rq);
-+
-+ next = choose_next_task(rq, cpu, prev);
-+ clear_tsk_need_resched(prev);
-+ clear_preempt_need_resched();
-+#ifdef CONFIG_SCHED_DEBUG
-+ rq->last_seen_need_resched_ns = 0;
-+#endif
-+
-+ if (likely(prev != next)) {
-+ if (deactivated)
-+ update_sched_rq_watermark(rq);
-+ next->last_ran = rq->clock_task;
-+ rq->last_ts_switch = rq->clock;
-+
-+ rq->nr_switches++;
-+ /*
-+ * RCU users of rcu_dereference(rq->curr) may not see
-+ * changes to task_struct made by pick_next_task().
-+ */
-+ RCU_INIT_POINTER(rq->curr, next);
-+ /*
-+ * The membarrier system call requires each architecture
-+ * to have a full memory barrier after updating
-+ * rq->curr, before returning to user-space.
-+ *
-+ * Here are the schemes providing that barrier on the
-+ * various architectures:
-+ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
-+ * switch_mm() rely on membarrier_arch_switch_mm() on PowerPC.
-+ * - finish_lock_switch() for weakly-ordered
-+ * architectures where spin_unlock is a full barrier,
-+ * - switch_to() for arm64 (weakly-ordered, spin_unlock
-+ * is a RELEASE barrier),
-+ */
-+ ++*switch_count;
-+
-+ psi_sched_switch(prev, next, !task_on_rq_queued(prev));
-+
-+ trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state);
-+
-+ /* Also unlocks the rq: */
-+ rq = context_switch(rq, prev, next);
-+ } else {
-+ __balance_callbacks(rq);
-+ raw_spin_unlock_irq(&rq->lock);
-+ }
-+
-+#ifdef CONFIG_SCHED_SMT
-+ sg_balance(rq);
-+#endif
-+}
-+
-+void __noreturn do_task_dead(void)
-+{
-+ /* Causes final put_task_struct in finish_task_switch(): */
-+ set_special_state(TASK_DEAD);
-+
-+ /* Tell freezer to ignore us: */
-+ current->flags |= PF_NOFREEZE;
-+
-+ __schedule(SM_NONE);
-+ BUG();
-+
-+ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
-+ for (;;)
-+ cpu_relax();
-+}
-+
-+static inline void sched_submit_work(struct task_struct *tsk)
-+{
-+ unsigned int task_flags;
-+
-+ if (task_is_running(tsk))
-+ return;
-+
-+ task_flags = tsk->flags;
-+ /*
-+ * If a worker goes to sleep, notify and ask workqueue whether it
-+ * wants to wake up a task to maintain concurrency.
-+ */
-+ if (task_flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
-+ if (task_flags & PF_WQ_WORKER)
-+ wq_worker_sleeping(tsk);
-+ else
-+ io_wq_worker_sleeping(tsk);
-+ }
-+
-+ if (tsk_is_pi_blocked(tsk))
-+ return;
-+
-+ /*
-+ * If we are going to sleep and we have plugged IO queued,
-+ * make sure to submit it to avoid deadlocks.
-+ */
-+ blk_flush_plug(tsk->plug, true);
-+}
-+
-+static void sched_update_worker(struct task_struct *tsk)
-+{
-+ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
-+ if (tsk->flags & PF_WQ_WORKER)
-+ wq_worker_running(tsk);
-+ else
-+ io_wq_worker_running(tsk);
-+ }
-+}
-+
-+asmlinkage __visible void __sched schedule(void)
-+{
-+ struct task_struct *tsk = current;
-+
-+ sched_submit_work(tsk);
-+ do {
-+ preempt_disable();
-+ __schedule(SM_NONE);
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+ sched_update_worker(tsk);
-+}
-+EXPORT_SYMBOL(schedule);
-+
-+/*
-+ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
-+ * state (have scheduled out non-voluntarily) by making sure that all
-+ * tasks have either left the run queue or have gone into user space.
-+ * As idle tasks do not do either, they must not ever be preempted
-+ * (schedule out non-voluntarily).
-+ *
-+ * schedule_idle() is similar to schedule_preempt_disable() except that it
-+ * never enables preemption because it does not call sched_submit_work().
-+ */
-+void __sched schedule_idle(void)
-+{
-+ /*
-+ * As this skips calling sched_submit_work(), which the idle task does
-+ * regardless because that function is a nop when the task is in a
-+ * TASK_RUNNING state, make sure this isn't used someplace that the
-+ * current task can be in any other state. Note, idle is always in the
-+ * TASK_RUNNING state.
-+ */
-+ WARN_ON_ONCE(current->__state);
-+ do {
-+ __schedule(SM_NONE);
-+ } while (need_resched());
-+}
-+
-+#if defined(CONFIG_CONTEXT_TRACKING) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK)
-+asmlinkage __visible void __sched schedule_user(void)
-+{
-+ /*
-+ * If we come here after a random call to set_need_resched(),
-+ * or we have been woken up remotely but the IPI has not yet arrived,
-+ * we haven't yet exited the RCU idle mode. Do it here manually until
-+ * we find a better solution.
-+ *
-+ * NB: There are buggy callers of this function. Ideally we
-+ * should warn if prev_state != CONTEXT_USER, but that will trigger
-+ * too frequently to make sense yet.
-+ */
-+ enum ctx_state prev_state = exception_enter();
-+ schedule();
-+ exception_exit(prev_state);
-+}
-+#endif
-+
-+/**
-+ * schedule_preempt_disabled - called with preemption disabled
-+ *
-+ * Returns with preemption disabled. Note: preempt_count must be 1
-+ */
-+void __sched schedule_preempt_disabled(void)
-+{
-+ sched_preempt_enable_no_resched();
-+ schedule();
-+ preempt_disable();
-+}
-+
-+#ifdef CONFIG_PREEMPT_RT
-+void __sched notrace schedule_rtlock(void)
-+{
-+ do {
-+ preempt_disable();
-+ __schedule(SM_RTLOCK_WAIT);
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+}
-+NOKPROBE_SYMBOL(schedule_rtlock);
-+#endif
-+
-+static void __sched notrace preempt_schedule_common(void)
-+{
-+ do {
-+ /*
-+ * Because the function tracer can trace preempt_count_sub()
-+ * and it also uses preempt_enable/disable_notrace(), if
-+ * NEED_RESCHED is set, the preempt_enable_notrace() called
-+ * by the function tracer will call this function again and
-+ * cause infinite recursion.
-+ *
-+ * Preemption must be disabled here before the function
-+ * tracer can trace. Break up preempt_disable() into two
-+ * calls. One to disable preemption without fear of being
-+ * traced. The other to still record the preemption latency,
-+ * which can also be traced by the function tracer.
-+ */
-+ preempt_disable_notrace();
-+ preempt_latency_start(1);
-+ __schedule(SM_PREEMPT);
-+ preempt_latency_stop(1);
-+ preempt_enable_no_resched_notrace();
-+
-+ /*
-+ * Check again in case we missed a preemption opportunity
-+ * between schedule and now.
-+ */
-+ } while (need_resched());
-+}
-+
-+#ifdef CONFIG_PREEMPTION
-+/*
-+ * This is the entry point to schedule() from in-kernel preemption
-+ * off of preempt_enable.
-+ */
-+asmlinkage __visible void __sched notrace preempt_schedule(void)
-+{
-+ /*
-+ * If there is a non-zero preempt_count or interrupts are disabled,
-+ * we do not want to preempt the current task. Just return..
-+ */
-+ if (likely(!preemptible()))
-+ return;
-+
-+ preempt_schedule_common();
-+}
-+NOKPROBE_SYMBOL(preempt_schedule);
-+EXPORT_SYMBOL(preempt_schedule);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#ifndef preempt_schedule_dynamic_enabled
-+#define preempt_schedule_dynamic_enabled preempt_schedule
-+#define preempt_schedule_dynamic_disabled NULL
-+#endif
-+DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
-+EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
-+void __sched notrace dynamic_preempt_schedule(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
-+ return;
-+ preempt_schedule();
-+}
-+NOKPROBE_SYMBOL(dynamic_preempt_schedule);
-+EXPORT_SYMBOL(dynamic_preempt_schedule);
-+#endif
-+#endif
-+
-+/**
-+ * preempt_schedule_notrace - preempt_schedule called by tracing
-+ *
-+ * The tracing infrastructure uses preempt_enable_notrace to prevent
-+ * recursion and tracing preempt enabling caused by the tracing
-+ * infrastructure itself. But as tracing can happen in areas coming
-+ * from userspace or just about to enter userspace, a preempt enable
-+ * can occur before user_exit() is called. This will cause the scheduler
-+ * to be called when the system is still in usermode.
-+ *
-+ * To prevent this, the preempt_enable_notrace will use this function
-+ * instead of preempt_schedule() to exit user context if needed before
-+ * calling the scheduler.
-+ */
-+asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
-+{
-+ enum ctx_state prev_ctx;
-+
-+ if (likely(!preemptible()))
-+ return;
-+
-+ do {
-+ /*
-+ * Because the function tracer can trace preempt_count_sub()
-+ * and it also uses preempt_enable/disable_notrace(), if
-+ * NEED_RESCHED is set, the preempt_enable_notrace() called
-+ * by the function tracer will call this function again and
-+ * cause infinite recursion.
-+ *
-+ * Preemption must be disabled here before the function
-+ * tracer can trace. Break up preempt_disable() into two
-+ * calls. One to disable preemption without fear of being
-+ * traced. The other to still record the preemption latency,
-+ * which can also be traced by the function tracer.
-+ */
-+ preempt_disable_notrace();
-+ preempt_latency_start(1);
-+ /*
-+ * Needs preempt disabled in case user_exit() is traced
-+ * and the tracer calls preempt_enable_notrace() causing
-+ * an infinite recursion.
-+ */
-+ prev_ctx = exception_enter();
-+ __schedule(SM_PREEMPT);
-+ exception_exit(prev_ctx);
-+
-+ preempt_latency_stop(1);
-+ preempt_enable_no_resched_notrace();
-+ } while (need_resched());
-+}
-+EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#ifndef preempt_schedule_notrace_dynamic_enabled
-+#define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
-+#define preempt_schedule_notrace_dynamic_disabled NULL
-+#endif
-+DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
-+EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
-+void __sched notrace dynamic_preempt_schedule_notrace(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
-+ return;
-+ preempt_schedule_notrace();
-+}
-+NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
-+EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
-+#endif
-+#endif
-+
-+#endif /* CONFIG_PREEMPTION */
-+
-+/*
-+ * This is the entry point to schedule() from kernel preemption
-+ * off of irq context.
-+ * Note, that this is called and return with irqs disabled. This will
-+ * protect us against recursive calling from irq.
-+ */
-+asmlinkage __visible void __sched preempt_schedule_irq(void)
-+{
-+ enum ctx_state prev_state;
-+
-+ /* Catch callers which need to be fixed */
-+ BUG_ON(preempt_count() || !irqs_disabled());
-+
-+ prev_state = exception_enter();
-+
-+ do {
-+ preempt_disable();
-+ local_irq_enable();
-+ __schedule(SM_PREEMPT);
-+ local_irq_disable();
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+
-+ exception_exit(prev_state);
-+}
-+
-+int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
-+ void *key)
-+{
-+ WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~WF_SYNC);
-+ return try_to_wake_up(curr->private, mode, wake_flags);
-+}
-+EXPORT_SYMBOL(default_wake_function);
-+
-+static inline void check_task_changed(struct task_struct *p, struct rq *rq)
-+{
-+ int idx;
-+
-+ /* Trigger resched if task sched_prio has been modified. */
-+ if (task_on_rq_queued(p) && (idx = task_sched_prio_idx(p, rq)) != p->sq_idx) {
-+ requeue_task(p, rq, idx);
-+ check_preempt_curr(rq);
-+ }
-+}
-+
-+static void __setscheduler_prio(struct task_struct *p, int prio)
-+{
-+ p->prio = prio;
-+}
-+
-+#ifdef CONFIG_RT_MUTEXES
-+
-+static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
-+{
-+ if (pi_task)
-+ prio = min(prio, pi_task->prio);
-+
-+ return prio;
-+}
-+
-+static inline int rt_effective_prio(struct task_struct *p, int prio)
-+{
-+ struct task_struct *pi_task = rt_mutex_get_top_task(p);
-+
-+ return __rt_effective_prio(pi_task, prio);
-+}
-+
-+/*
-+ * rt_mutex_setprio - set the current priority of a task
-+ * @p: task to boost
-+ * @pi_task: donor task
-+ *
-+ * This function changes the 'effective' priority of a task. It does
-+ * not touch ->normal_prio like __setscheduler().
-+ *
-+ * Used by the rt_mutex code to implement priority inheritance
-+ * logic. Call site only calls if the priority of the task changed.
-+ */
-+void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
-+{
-+ int prio;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ /* XXX used to be waiter->prio, not waiter->task->prio */
-+ prio = __rt_effective_prio(pi_task, p->normal_prio);
-+
-+ /*
-+ * If nothing changed; bail early.
-+ */
-+ if (p->pi_top_task == pi_task && prio == p->prio)
-+ return;
-+
-+ rq = __task_access_lock(p, &lock);
-+ /*
-+ * Set under pi_lock && rq->lock, such that the value can be used under
-+ * either lock.
-+ *
-+ * Note that there is loads of tricky to make this pointer cache work
-+ * right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
-+ * ensure a task is de-boosted (pi_task is set to NULL) before the
-+ * task is allowed to run again (and can exit). This ensures the pointer
-+ * points to a blocked task -- which guarantees the task is present.
-+ */
-+ p->pi_top_task = pi_task;
-+
-+ /*
-+ * For FIFO/RR we only need to set prio, if that matches we're done.
-+ */
-+ if (prio == p->prio)
-+ goto out_unlock;
-+
-+ /*
-+ * Idle task boosting is a nono in general. There is one
-+ * exception, when PREEMPT_RT and NOHZ is active:
-+ *
-+ * The idle task calls get_next_timer_interrupt() and holds
-+ * the timer wheel base->lock on the CPU and another CPU wants
-+ * to access the timer (probably to cancel it). We can safely
-+ * ignore the boosting request, as the idle CPU runs this code
-+ * with interrupts disabled and will complete the lock
-+ * protected section without being interrupted. So there is no
-+ * real need to boost.
-+ */
-+ if (unlikely(p == rq->idle)) {
-+ WARN_ON(p != rq->curr);
-+ WARN_ON(p->pi_blocked_on);
-+ goto out_unlock;
-+ }
-+
-+ trace_sched_pi_setprio(p, pi_task);
-+
-+ __setscheduler_prio(p, prio);
-+
-+ check_task_changed(p, rq);
-+out_unlock:
-+ /* Avoid rq from going away on us: */
-+ preempt_disable();
-+
-+ __balance_callbacks(rq);
-+ __task_access_unlock(p, lock);
-+
-+ preempt_enable();
-+}
-+#else
-+static inline int rt_effective_prio(struct task_struct *p, int prio)
-+{
-+ return prio;
-+}
-+#endif
-+
-+void set_user_nice(struct task_struct *p, long nice)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
-+ return;
-+ /*
-+ * We have to be careful, if called from sys_setpriority(),
-+ * the task might be in the middle of scheduling on another CPU.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ rq = __task_access_lock(p, &lock);
-+
-+ p->static_prio = NICE_TO_PRIO(nice);
-+ /*
-+ * The RT priorities are set via sched_setscheduler(), but we still
-+ * allow the 'normal' nice value to be set - but as expected
-+ * it won't have any effect on scheduling until the task is
-+ * not SCHED_NORMAL/SCHED_BATCH:
-+ */
-+ if (task_has_rt_policy(p))
-+ goto out_unlock;
-+
-+ p->prio = effective_prio(p);
-+
-+ check_task_changed(p, rq);
-+out_unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+EXPORT_SYMBOL(set_user_nice);
-+
-+/*
-+ * can_nice - check if a task can reduce its nice value
-+ * @p: task
-+ * @nice: nice value
-+ */
-+int can_nice(const struct task_struct *p, const int nice)
-+{
-+ /* Convert nice value [19,-20] to rlimit style value [1,40] */
-+ int nice_rlim = nice_to_rlimit(nice);
-+
-+ return (nice_rlim <= task_rlimit(p, RLIMIT_NICE) ||
-+ capable(CAP_SYS_NICE));
-+}
-+
-+#ifdef __ARCH_WANT_SYS_NICE
-+
-+/*
-+ * sys_nice - change the priority of the current process.
-+ * @increment: priority increment
-+ *
-+ * sys_setpriority is a more generic, but much slower function that
-+ * does similar things.
-+ */
-+SYSCALL_DEFINE1(nice, int, increment)
-+{
-+ long nice, retval;
-+
-+ /*
-+ * Setpriority might change our priority at the same moment.
-+ * We don't have to worry. Conceptually one call occurs first
-+ * and we have a single winner.
-+ */
-+
-+ increment = clamp(increment, -NICE_WIDTH, NICE_WIDTH);
-+ nice = task_nice(current) + increment;
-+
-+ nice = clamp_val(nice, MIN_NICE, MAX_NICE);
-+ if (increment < 0 && !can_nice(current, nice))
-+ return -EPERM;
-+
-+ retval = security_task_setnice(current, nice);
-+ if (retval)
-+ return retval;
-+
-+ set_user_nice(current, nice);
-+ return 0;
-+}
-+
-+#endif
-+
-+/**
-+ * task_prio - return the priority value of a given task.
-+ * @p: the task in question.
-+ *
-+ * Return: The priority value as seen by users in /proc.
-+ *
-+ * sched policy return value kernel prio user prio/nice
-+ *
-+ * (BMQ)normal, batch, idle[0 ... 53] [100 ... 139] 0/[-20 ... 19]/[-7 ... 7]
-+ * (PDS)normal, batch, idle[0 ... 39] 100 0/[-20 ... 19]
-+ * fifo, rr [-1 ... -100] [99 ... 0] [0 ... 99]
-+ */
-+int task_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
-+ task_sched_prio_normal(p, task_rq(p));
-+}
-+
-+/**
-+ * idle_cpu - is a given CPU idle currently?
-+ * @cpu: the processor in question.
-+ *
-+ * Return: 1 if the CPU is currently idle. 0 otherwise.
-+ */
-+int idle_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (rq->curr != rq->idle)
-+ return 0;
-+
-+ if (rq->nr_running)
-+ return 0;
-+
-+#ifdef CONFIG_SMP
-+ if (rq->ttwu_pending)
-+ return 0;
-+#endif
-+
-+ return 1;
-+}
-+
-+/**
-+ * idle_task - return the idle task for a given CPU.
-+ * @cpu: the processor in question.
-+ *
-+ * Return: The idle task for the cpu @cpu.
-+ */
-+struct task_struct *idle_task(int cpu)
-+{
-+ return cpu_rq(cpu)->idle;
-+}
-+
-+/**
-+ * find_process_by_pid - find a process with a matching PID value.
-+ * @pid: the pid in question.
-+ *
-+ * The task of @pid, if found. %NULL otherwise.
-+ */
-+static inline struct task_struct *find_process_by_pid(pid_t pid)
-+{
-+ return pid ? find_task_by_vpid(pid) : current;
-+}
-+
-+/*
-+ * sched_setparam() passes in -1 for its policy, to let the functions
-+ * it calls know not to change it.
-+ */
-+#define SETPARAM_POLICY -1
-+
-+static void __setscheduler_params(struct task_struct *p,
-+ const struct sched_attr *attr)
-+{
-+ int policy = attr->sched_policy;
-+
-+ if (policy == SETPARAM_POLICY)
-+ policy = p->policy;
-+
-+ p->policy = policy;
-+
-+ /*
-+ * allow normal nice value to be set, but will not have any
-+ * effect on scheduling until the task not SCHED_NORMAL/
-+ * SCHED_BATCH
-+ */
-+ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
-+
-+ /*
-+ * __sched_setscheduler() ensures attr->sched_priority == 0 when
-+ * !rt_policy. Always setting this ensures that things like
-+ * getparam()/getattr() don't report silly values for !rt tasks.
-+ */
-+ p->rt_priority = attr->sched_priority;
-+ p->normal_prio = normal_prio(p);
-+}
-+
-+/*
-+ * check the target process has a UID that matches the current process's
-+ */
-+static bool check_same_owner(struct task_struct *p)
-+{
-+ const struct cred *cred = current_cred(), *pcred;
-+ bool match;
-+
-+ rcu_read_lock();
-+ pcred = __task_cred(p);
-+ match = (uid_eq(cred->euid, pcred->euid) ||
-+ uid_eq(cred->euid, pcred->uid));
-+ rcu_read_unlock();
-+ return match;
-+}
-+
-+static int __sched_setscheduler(struct task_struct *p,
-+ const struct sched_attr *attr,
-+ bool user, bool pi)
-+{
-+ const struct sched_attr dl_squash_attr = {
-+ .size = sizeof(struct sched_attr),
-+ .sched_policy = SCHED_FIFO,
-+ .sched_nice = 0,
-+ .sched_priority = 99,
-+ };
-+ int oldpolicy = -1, policy = attr->sched_policy;
-+ int retval, newprio;
-+ struct callback_head *head;
-+ unsigned long flags;
-+ struct rq *rq;
-+ int reset_on_fork;
-+ raw_spinlock_t *lock;
-+
-+ /* The pi code expects interrupts enabled */
-+ BUG_ON(pi && in_interrupt());
-+
-+ /*
-+ * Alt schedule FW supports SCHED_DEADLINE by squash it as prio 0 SCHED_FIFO
-+ */
-+ if (unlikely(SCHED_DEADLINE == policy)) {
-+ attr = &dl_squash_attr;
-+ policy = attr->sched_policy;
-+ }
-+recheck:
-+ /* Double check policy once rq lock held */
-+ if (policy < 0) {
-+ reset_on_fork = p->sched_reset_on_fork;
-+ policy = oldpolicy = p->policy;
-+ } else {
-+ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
-+
-+ if (policy > SCHED_IDLE)
-+ return -EINVAL;
-+ }
-+
-+ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
-+ return -EINVAL;
-+
-+ /*
-+ * Valid priorities for SCHED_FIFO and SCHED_RR are
-+ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
-+ * SCHED_BATCH and SCHED_IDLE is 0.
-+ */
-+ if (attr->sched_priority < 0 ||
-+ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
-+ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
-+ return -EINVAL;
-+ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
-+ (attr->sched_priority != 0))
-+ return -EINVAL;
-+
-+ /*
-+ * Allow unprivileged RT tasks to decrease priority:
-+ */
-+ if (user && !capable(CAP_SYS_NICE)) {
-+ if (SCHED_FIFO == policy || SCHED_RR == policy) {
-+ unsigned long rlim_rtprio =
-+ task_rlimit(p, RLIMIT_RTPRIO);
-+
-+ /* Can't set/change the rt policy */
-+ if (policy != p->policy && !rlim_rtprio)
-+ return -EPERM;
-+
-+ /* Can't increase priority */
-+ if (attr->sched_priority > p->rt_priority &&
-+ attr->sched_priority > rlim_rtprio)
-+ return -EPERM;
-+ }
-+
-+ /* Can't change other user's priorities */
-+ if (!check_same_owner(p))
-+ return -EPERM;
-+
-+ /* Normal users shall not reset the sched_reset_on_fork flag */
-+ if (p->sched_reset_on_fork && !reset_on_fork)
-+ return -EPERM;
-+ }
-+
-+ if (user) {
-+ retval = security_task_setscheduler(p);
-+ if (retval)
-+ return retval;
-+ }
-+
-+ if (pi)
-+ cpuset_read_lock();
-+
-+ /*
-+ * Make sure no PI-waiters arrive (or leave) while we are
-+ * changing the priority of the task:
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+
-+ /*
-+ * To be able to change p->policy safely, task_access_lock()
-+ * must be called.
-+ * IF use task_access_lock() here:
-+ * For the task p which is not running, reading rq->stop is
-+ * racy but acceptable as ->stop doesn't change much.
-+ * An enhancemnet can be made to read rq->stop saftly.
-+ */
-+ rq = __task_access_lock(p, &lock);
-+
-+ /*
-+ * Changing the policy of the stop threads its a very bad idea
-+ */
-+ if (p == rq->stop) {
-+ retval = -EINVAL;
-+ goto unlock;
-+ }
-+
-+ /*
-+ * If not changing anything there's no need to proceed further:
-+ */
-+ if (unlikely(policy == p->policy)) {
-+ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
-+ goto change;
-+ if (!rt_policy(policy) &&
-+ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
-+ goto change;
-+
-+ p->sched_reset_on_fork = reset_on_fork;
-+ retval = 0;
-+ goto unlock;
-+ }
-+change:
-+
-+ /* Re-check policy now with rq lock held */
-+ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
-+ policy = oldpolicy = -1;
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ if (pi)
-+ cpuset_read_unlock();
-+ goto recheck;
-+ }
-+
-+ p->sched_reset_on_fork = reset_on_fork;
-+
-+ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
-+ if (pi) {
-+ /*
-+ * Take priority boosted tasks into account. If the new
-+ * effective priority is unchanged, we just store the new
-+ * normal parameters and do not touch the scheduler class and
-+ * the runqueue. This will be done when the task deboost
-+ * itself.
-+ */
-+ newprio = rt_effective_prio(p, newprio);
-+ }
-+
-+ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
-+ __setscheduler_params(p, attr);
-+ __setscheduler_prio(p, newprio);
-+ }
-+
-+ check_task_changed(p, rq);
-+
-+ /* Avoid rq from going away on us: */
-+ preempt_disable();
-+ head = splice_balance_callbacks(rq);
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+
-+ if (pi) {
-+ cpuset_read_unlock();
-+ rt_mutex_adjust_pi(p);
-+ }
-+
-+ /* Run balance callbacks after we've adjusted the PI chain: */
-+ balance_callbacks(rq, head);
-+ preempt_enable();
-+
-+ return 0;
-+
-+unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ if (pi)
-+ cpuset_read_unlock();
-+ return retval;
-+}
-+
-+static int _sched_setscheduler(struct task_struct *p, int policy,
-+ const struct sched_param *param, bool check)
-+{
-+ struct sched_attr attr = {
-+ .sched_policy = policy,
-+ .sched_priority = param->sched_priority,
-+ .sched_nice = PRIO_TO_NICE(p->static_prio),
-+ };
-+
-+ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
-+ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
-+ attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-+ policy &= ~SCHED_RESET_ON_FORK;
-+ attr.sched_policy = policy;
-+ }
-+
-+ return __sched_setscheduler(p, &attr, check, true);
-+}
-+
-+/**
-+ * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
-+ * @p: the task in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Use sched_set_fifo(), read its comment.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ *
-+ * NOTE that the task may be already dead.
-+ */
-+int sched_setscheduler(struct task_struct *p, int policy,
-+ const struct sched_param *param)
-+{
-+ return _sched_setscheduler(p, policy, param, true);
-+}
-+
-+int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
-+{
-+ return __sched_setscheduler(p, attr, true, true);
-+}
-+
-+int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
-+{
-+ return __sched_setscheduler(p, attr, false, true);
-+}
-+EXPORT_SYMBOL_GPL(sched_setattr_nocheck);
-+
-+/**
-+ * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
-+ * @p: the task in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Just like sched_setscheduler, only don't bother checking if the
-+ * current context has permission. For example, this is needed in
-+ * stop_machine(): we create temporary high priority worker threads,
-+ * but our caller might not have that capability.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+int sched_setscheduler_nocheck(struct task_struct *p, int policy,
-+ const struct sched_param *param)
-+{
-+ return _sched_setscheduler(p, policy, param, false);
-+}
-+
-+/*
-+ * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
-+ * incapable of resource management, which is the one thing an OS really should
-+ * be doing.
-+ *
-+ * This is of course the reason it is limited to privileged users only.
-+ *
-+ * Worse still; it is fundamentally impossible to compose static priority
-+ * workloads. You cannot take two correctly working static prio workloads
-+ * and smash them together and still expect them to work.
-+ *
-+ * For this reason 'all' FIFO tasks the kernel creates are basically at:
-+ *
-+ * MAX_RT_PRIO / 2
-+ *
-+ * The administrator _MUST_ configure the system, the kernel simply doesn't
-+ * know enough information to make a sensible choice.
-+ */
-+void sched_set_fifo(struct task_struct *p)
-+{
-+ struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
-+ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_fifo);
-+
-+/*
-+ * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
-+ */
-+void sched_set_fifo_low(struct task_struct *p)
-+{
-+ struct sched_param sp = { .sched_priority = 1 };
-+ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_fifo_low);
-+
-+void sched_set_normal(struct task_struct *p, int nice)
-+{
-+ struct sched_attr attr = {
-+ .sched_policy = SCHED_NORMAL,
-+ .sched_nice = nice,
-+ };
-+ WARN_ON_ONCE(sched_setattr_nocheck(p, &attr) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_normal);
-+
-+static int
-+do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
-+{
-+ struct sched_param lparam;
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!param || pid < 0)
-+ return -EINVAL;
-+ if (copy_from_user(&lparam, param, sizeof(struct sched_param)))
-+ return -EFAULT;
-+
-+ rcu_read_lock();
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (likely(p))
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (likely(p)) {
-+ retval = sched_setscheduler(p, policy, &lparam);
-+ put_task_struct(p);
-+ }
-+
-+ return retval;
-+}
-+
-+/*
-+ * Mimics kernel/events/core.c perf_copy_attr().
-+ */
-+static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *attr)
-+{
-+ u32 size;
-+ int ret;
-+
-+ /* Zero the full structure, so that a short copy will be nice: */
-+ memset(attr, 0, sizeof(*attr));
-+
-+ ret = get_user(size, &uattr->size);
-+ if (ret)
-+ return ret;
-+
-+ /* ABI compatibility quirk: */
-+ if (!size)
-+ size = SCHED_ATTR_SIZE_VER0;
-+
-+ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
-+ goto err_size;
-+
-+ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
-+ if (ret) {
-+ if (ret == -E2BIG)
-+ goto err_size;
-+ return ret;
-+ }
-+
-+ /*
-+ * XXX: Do we want to be lenient like existing syscalls; or do we want
-+ * to be strict and return an error on out-of-bounds values?
-+ */
-+ attr->sched_nice = clamp(attr->sched_nice, -20, 19);
-+
-+ /* sched/core.c uses zero here but we already know ret is zero */
-+ return 0;
-+
-+err_size:
-+ put_user(sizeof(*attr), &uattr->size);
-+ return -E2BIG;
-+}
-+
-+/**
-+ * sys_sched_setscheduler - set/change the scheduler policy and RT priority
-+ * @pid: the pid in question.
-+ * @policy: new policy.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ * @param: structure containing the new RT priority.
-+ */
-+SYSCALL_DEFINE3(sched_setscheduler, pid_t, pid, int, policy, struct sched_param __user *, param)
-+{
-+ if (policy < 0)
-+ return -EINVAL;
-+
-+ return do_sched_setscheduler(pid, policy, param);
-+}
-+
-+/**
-+ * sys_sched_setparam - set/change the RT priority of a thread
-+ * @pid: the pid in question.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
-+{
-+ return do_sched_setscheduler(pid, SETPARAM_POLICY, param);
-+}
-+
-+/**
-+ * sys_sched_setattr - same as above, but with extended sched_attr
-+ * @pid: the pid in question.
-+ * @uattr: structure containing the extended parameters.
-+ */
-+SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
-+ unsigned int, flags)
-+{
-+ struct sched_attr attr;
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!uattr || pid < 0 || flags)
-+ return -EINVAL;
-+
-+ retval = sched_copy_attr(uattr, &attr);
-+ if (retval)
-+ return retval;
-+
-+ if ((int)attr.sched_policy < 0)
-+ return -EINVAL;
-+
-+ rcu_read_lock();
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (likely(p))
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (likely(p)) {
-+ retval = sched_setattr(p, &attr);
-+ put_task_struct(p);
-+ }
-+
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getscheduler - get the policy (scheduling class) of a thread
-+ * @pid: the pid in question.
-+ *
-+ * Return: On success, the policy of the thread. Otherwise, a negative error
-+ * code.
-+ */
-+SYSCALL_DEFINE1(sched_getscheduler, pid_t, pid)
-+{
-+ struct task_struct *p;
-+ int retval = -EINVAL;
-+
-+ if (pid < 0)
-+ goto out_nounlock;
-+
-+ retval = -ESRCH;
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ if (p) {
-+ retval = security_task_getscheduler(p);
-+ if (!retval)
-+ retval = p->policy;
-+ }
-+ rcu_read_unlock();
-+
-+out_nounlock:
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getscheduler - get the RT priority of a thread
-+ * @pid: the pid in question.
-+ * @param: structure containing the RT priority.
-+ *
-+ * Return: On success, 0 and the RT priority is in @param. Otherwise, an error
-+ * code.
-+ */
-+SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
-+{
-+ struct sched_param lp = { .sched_priority = 0 };
-+ struct task_struct *p;
-+ int retval = -EINVAL;
-+
-+ if (!param || pid < 0)
-+ goto out_nounlock;
-+
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ retval = -ESRCH;
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ if (task_has_rt_policy(p))
-+ lp.sched_priority = p->rt_priority;
-+ rcu_read_unlock();
-+
-+ /*
-+ * This one might sleep, we cannot do it with a spinlock held ...
-+ */
-+ retval = copy_to_user(param, &lp, sizeof(*param)) ? -EFAULT : 0;
-+
-+out_nounlock:
-+ return retval;
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
-+
-+/*
-+ * Copy the kernel size attribute structure (which might be larger
-+ * than what user-space knows about) to user-space.
-+ *
-+ * Note that all cases are valid: user-space buffer can be larger or
-+ * smaller than the kernel-space buffer. The usual case is that both
-+ * have the same size.
-+ */
-+static int
-+sched_attr_copy_to_user(struct sched_attr __user *uattr,
-+ struct sched_attr *kattr,
-+ unsigned int usize)
-+{
-+ unsigned int ksize = sizeof(*kattr);
-+
-+ if (!access_ok(uattr, usize))
-+ return -EFAULT;
-+
-+ /*
-+ * sched_getattr() ABI forwards and backwards compatibility:
-+ *
-+ * If usize == ksize then we just copy everything to user-space and all is good.
-+ *
-+ * If usize < ksize then we only copy as much as user-space has space for,
-+ * this keeps ABI compatibility as well. We skip the rest.
-+ *
-+ * If usize > ksize then user-space is using a newer version of the ABI,
-+ * which part the kernel doesn't know about. Just ignore it - tooling can
-+ * detect the kernel's knowledge of attributes from the attr->size value
-+ * which is set to ksize in this case.
-+ */
-+ kattr->size = min(usize, ksize);
-+
-+ if (copy_to_user(uattr, kattr, kattr->size))
-+ return -EFAULT;
-+
-+ return 0;
-+}
-+
-+/**
-+ * sys_sched_getattr - similar to sched_getparam, but with sched_attr
-+ * @pid: the pid in question.
-+ * @uattr: structure containing the extended parameters.
-+ * @usize: sizeof(attr) for fwd/bwd comp.
-+ * @flags: for future extension.
-+ */
-+SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
-+ unsigned int, usize, unsigned int, flags)
-+{
-+ struct sched_attr kattr = { };
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!uattr || pid < 0 || usize > PAGE_SIZE ||
-+ usize < SCHED_ATTR_SIZE_VER0 || flags)
-+ return -EINVAL;
-+
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ retval = -ESRCH;
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ kattr.sched_policy = p->policy;
-+ if (p->sched_reset_on_fork)
-+ kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-+ if (task_has_rt_policy(p))
-+ kattr.sched_priority = p->rt_priority;
-+ else
-+ kattr.sched_nice = task_nice(p);
-+ kattr.sched_flags &= SCHED_FLAG_ALL;
-+
-+#ifdef CONFIG_UCLAMP_TASK
-+ kattr.sched_util_min = p->uclamp_req[UCLAMP_MIN].value;
-+ kattr.sched_util_max = p->uclamp_req[UCLAMP_MAX].value;
-+#endif
-+
-+ rcu_read_unlock();
-+
-+ return sched_attr_copy_to_user(uattr, &kattr, usize);
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
-+
-+static int
-+__sched_setaffinity(struct task_struct *p, const struct cpumask *mask)
-+{
-+ int retval;
-+ cpumask_var_t cpus_allowed, new_mask;
-+
-+ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
-+ retval = -ENOMEM;
-+ goto out_free_cpus_allowed;
-+ }
-+
-+ cpuset_cpus_allowed(p, cpus_allowed);
-+ cpumask_and(new_mask, mask, cpus_allowed);
-+again:
-+ retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK | SCA_USER);
-+ if (retval)
-+ goto out_free_new_mask;
-+
-+ cpuset_cpus_allowed(p, cpus_allowed);
-+ if (!cpumask_subset(new_mask, cpus_allowed)) {
-+ /*
-+ * We must have raced with a concurrent cpuset
-+ * update. Just reset the cpus_allowed to the
-+ * cpuset's cpus_allowed
-+ */
-+ cpumask_copy(new_mask, cpus_allowed);
-+ goto again;
-+ }
-+
-+out_free_new_mask:
-+ free_cpumask_var(new_mask);
-+out_free_cpus_allowed:
-+ free_cpumask_var(cpus_allowed);
-+ return retval;
-+}
-+
-+long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
-+{
-+ struct task_struct *p;
-+ int retval;
-+
-+ rcu_read_lock();
-+
-+ p = find_process_by_pid(pid);
-+ if (!p) {
-+ rcu_read_unlock();
-+ return -ESRCH;
-+ }
-+
-+ /* Prevent p going away */
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (p->flags & PF_NO_SETAFFINITY) {
-+ retval = -EINVAL;
-+ goto out_put_task;
-+ }
-+
-+ if (!check_same_owner(p)) {
-+ rcu_read_lock();
-+ if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
-+ rcu_read_unlock();
-+ retval = -EPERM;
-+ goto out_put_task;
-+ }
-+ rcu_read_unlock();
-+ }
-+
-+ retval = security_task_setscheduler(p);
-+ if (retval)
-+ goto out_put_task;
-+
-+ retval = __sched_setaffinity(p, in_mask);
-+out_put_task:
-+ put_task_struct(p);
-+ return retval;
-+}
-+
-+static int get_user_cpu_mask(unsigned long __user *user_mask_ptr, unsigned len,
-+ struct cpumask *new_mask)
-+{
-+ if (len < cpumask_size())
-+ cpumask_clear(new_mask);
-+ else if (len > cpumask_size())
-+ len = cpumask_size();
-+
-+ return copy_from_user(new_mask, user_mask_ptr, len) ? -EFAULT : 0;
-+}
-+
-+/**
-+ * sys_sched_setaffinity - set the CPU affinity of a process
-+ * @pid: pid of the process
-+ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
-+ * @user_mask_ptr: user-space pointer to the new CPU mask
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_setaffinity, pid_t, pid, unsigned int, len,
-+ unsigned long __user *, user_mask_ptr)
-+{
-+ cpumask_var_t new_mask;
-+ int retval;
-+
-+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ retval = get_user_cpu_mask(user_mask_ptr, len, new_mask);
-+ if (retval == 0)
-+ retval = sched_setaffinity(pid, new_mask);
-+ free_cpumask_var(new_mask);
-+ return retval;
-+}
-+
-+long sched_getaffinity(pid_t pid, cpumask_t *mask)
-+{
-+ struct task_struct *p;
-+ raw_spinlock_t *lock;
-+ unsigned long flags;
-+ int retval;
-+
-+ rcu_read_lock();
-+
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ task_access_lock_irqsave(p, &lock, &flags);
-+ cpumask_and(mask, &p->cpus_mask, cpu_active_mask);
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+out_unlock:
-+ rcu_read_unlock();
-+
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getaffinity - get the CPU affinity of a process
-+ * @pid: pid of the process
-+ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
-+ * @user_mask_ptr: user-space pointer to hold the current CPU mask
-+ *
-+ * Return: size of CPU mask copied to user_mask_ptr on success. An
-+ * error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
-+ unsigned long __user *, user_mask_ptr)
-+{
-+ int ret;
-+ cpumask_var_t mask;
-+
-+ if ((len * BITS_PER_BYTE) < nr_cpu_ids)
-+ return -EINVAL;
-+ if (len & (sizeof(unsigned long)-1))
-+ return -EINVAL;
-+
-+ if (!alloc_cpumask_var(&mask, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ ret = sched_getaffinity(pid, mask);
-+ if (ret == 0) {
-+ unsigned int retlen = min_t(size_t, len, cpumask_size());
-+
-+ if (copy_to_user(user_mask_ptr, mask, retlen))
-+ ret = -EFAULT;
-+ else
-+ ret = retlen;
-+ }
-+ free_cpumask_var(mask);
-+
-+ return ret;
-+}
-+
-+static void do_sched_yield(void)
-+{
-+ struct rq *rq;
-+ struct rq_flags rf;
-+
-+ if (!sched_yield_type)
-+ return;
-+
-+ rq = this_rq_lock_irq(&rf);
-+
-+ schedstat_inc(rq->yld_count);
-+
-+ if (1 == sched_yield_type) {
-+ if (!rt_task(current))
-+ do_sched_yield_type_1(current, rq);
-+ } else if (2 == sched_yield_type) {
-+ if (rq->nr_running > 1)
-+ rq->skip = current;
-+ }
-+
-+ preempt_disable();
-+ raw_spin_unlock_irq(&rq->lock);
-+ sched_preempt_enable_no_resched();
-+
-+ schedule();
-+}
-+
-+/**
-+ * sys_sched_yield - yield the current processor to other threads.
-+ *
-+ * This function yields the current CPU to other tasks. If there are no
-+ * other threads running on this CPU then this function will return.
-+ *
-+ * Return: 0.
-+ */
-+SYSCALL_DEFINE0(sched_yield)
-+{
-+ do_sched_yield();
-+ return 0;
-+}
-+
-+#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
-+int __sched __cond_resched(void)
-+{
-+ if (should_resched(0)) {
-+ preempt_schedule_common();
-+ return 1;
-+ }
-+ /*
-+ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
-+ * whether the current CPU is in an RCU read-side critical section,
-+ * so the tick can report quiescent states even for CPUs looping
-+ * in kernel context. In contrast, in non-preemptible kernels,
-+ * RCU readers leave no in-memory hints, which means that CPU-bound
-+ * processes executing in kernel context might never report an
-+ * RCU quiescent state. Therefore, the following code causes
-+ * cond_resched() to report a quiescent state, but only when RCU
-+ * is in urgent need of one.
-+ */
-+#ifndef CONFIG_PREEMPT_RCU
-+ rcu_all_qs();
-+#endif
-+ return 0;
-+}
-+EXPORT_SYMBOL(__cond_resched);
-+#endif
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#define cond_resched_dynamic_enabled __cond_resched
-+#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
-+DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
-+EXPORT_STATIC_CALL_TRAMP(cond_resched);
-+
-+#define might_resched_dynamic_enabled __cond_resched
-+#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
-+DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
-+EXPORT_STATIC_CALL_TRAMP(might_resched);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
-+int __sched dynamic_cond_resched(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
-+ return 0;
-+ return __cond_resched();
-+}
-+EXPORT_SYMBOL(dynamic_cond_resched);
-+
-+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
-+int __sched dynamic_might_resched(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_might_resched))
-+ return 0;
-+ return __cond_resched();
-+}
-+EXPORT_SYMBOL(dynamic_might_resched);
-+#endif
-+#endif
-+
-+/*
-+ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
-+ * call schedule, and on return reacquire the lock.
-+ *
-+ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
-+ * operations here to prevent schedule() from being called twice (once via
-+ * spin_unlock(), once by hand).
-+ */
-+int __cond_resched_lock(spinlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held(lock);
-+
-+ if (spin_needbreak(lock) || resched) {
-+ spin_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ spin_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_lock);
-+
-+int __cond_resched_rwlock_read(rwlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held_read(lock);
-+
-+ if (rwlock_needbreak(lock) || resched) {
-+ read_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ read_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_rwlock_read);
-+
-+int __cond_resched_rwlock_write(rwlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held_write(lock);
-+
-+ if (rwlock_needbreak(lock) || resched) {
-+ write_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ write_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_rwlock_write);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+
-+#ifdef CONFIG_GENERIC_ENTRY
-+#include <linux/entry-common.h>
-+#endif
-+
-+/*
-+ * SC:cond_resched
-+ * SC:might_resched
-+ * SC:preempt_schedule
-+ * SC:preempt_schedule_notrace
-+ * SC:irqentry_exit_cond_resched
-+ *
-+ *
-+ * NONE:
-+ * cond_resched <- __cond_resched
-+ * might_resched <- RET0
-+ * preempt_schedule <- NOP
-+ * preempt_schedule_notrace <- NOP
-+ * irqentry_exit_cond_resched <- NOP
-+ *
-+ * VOLUNTARY:
-+ * cond_resched <- __cond_resched
-+ * might_resched <- __cond_resched
-+ * preempt_schedule <- NOP
-+ * preempt_schedule_notrace <- NOP
-+ * irqentry_exit_cond_resched <- NOP
-+ *
-+ * FULL:
-+ * cond_resched <- RET0
-+ * might_resched <- RET0
-+ * preempt_schedule <- preempt_schedule
-+ * preempt_schedule_notrace <- preempt_schedule_notrace
-+ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
-+ */
-+
-+enum {
-+ preempt_dynamic_undefined = -1,
-+ preempt_dynamic_none,
-+ preempt_dynamic_voluntary,
-+ preempt_dynamic_full,
-+};
-+
-+int preempt_dynamic_mode = preempt_dynamic_undefined;
-+
-+int sched_dynamic_mode(const char *str)
-+{
-+ if (!strcmp(str, "none"))
-+ return preempt_dynamic_none;
-+
-+ if (!strcmp(str, "voluntary"))
-+ return preempt_dynamic_voluntary;
-+
-+ if (!strcmp(str, "full"))
-+ return preempt_dynamic_full;
-+
-+ return -EINVAL;
-+}
-+
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
-+#define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+#define preempt_dynamic_enable(f) static_key_enable(&sk_dynamic_##f.key)
-+#define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
-+#else
-+#error "Unsupported PREEMPT_DYNAMIC mechanism"
-+#endif
-+
-+void sched_dynamic_update(int mode)
-+{
-+ /*
-+ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
-+ * the ZERO state, which is invalid.
-+ */
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_enable(might_resched);
-+ preempt_dynamic_enable(preempt_schedule);
-+ preempt_dynamic_enable(preempt_schedule_notrace);
-+ preempt_dynamic_enable(irqentry_exit_cond_resched);
-+
-+ switch (mode) {
-+ case preempt_dynamic_none:
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_disable(might_resched);
-+ preempt_dynamic_disable(preempt_schedule);
-+ preempt_dynamic_disable(preempt_schedule_notrace);
-+ preempt_dynamic_disable(irqentry_exit_cond_resched);
-+ pr_info("Dynamic Preempt: none\n");
-+ break;
-+
-+ case preempt_dynamic_voluntary:
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_enable(might_resched);
-+ preempt_dynamic_disable(preempt_schedule);
-+ preempt_dynamic_disable(preempt_schedule_notrace);
-+ preempt_dynamic_disable(irqentry_exit_cond_resched);
-+ pr_info("Dynamic Preempt: voluntary\n");
-+ break;
-+
-+ case preempt_dynamic_full:
-+ preempt_dynamic_disable(cond_resched);
-+ preempt_dynamic_disable(might_resched);
-+ preempt_dynamic_enable(preempt_schedule);
-+ preempt_dynamic_enable(preempt_schedule_notrace);
-+ preempt_dynamic_enable(irqentry_exit_cond_resched);
-+ pr_info("Dynamic Preempt: full\n");
-+ break;
-+ }
-+
-+ preempt_dynamic_mode = mode;
-+}
-+
-+static int __init setup_preempt_mode(char *str)
-+{
-+ int mode = sched_dynamic_mode(str);
-+ if (mode < 0) {
-+ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
-+ return 0;
-+ }
-+
-+ sched_dynamic_update(mode);
-+ return 1;
-+}
-+__setup("preempt=", setup_preempt_mode);
-+
-+static void __init preempt_dynamic_init(void)
-+{
-+ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
-+ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
-+ sched_dynamic_update(preempt_dynamic_none);
-+ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
-+ sched_dynamic_update(preempt_dynamic_voluntary);
-+ } else {
-+ /* Default static call setting, nothing to do */
-+ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
-+ preempt_dynamic_mode = preempt_dynamic_full;
-+ pr_info("Dynamic Preempt: full\n");
-+ }
-+ }
-+}
-+
-+#else /* !CONFIG_PREEMPT_DYNAMIC */
-+
-+static inline void preempt_dynamic_init(void) { }
-+
-+#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */
-+
-+/**
-+ * yield - yield the current processor to other threads.
-+ *
-+ * Do not ever use this function, there's a 99% chance you're doing it wrong.
-+ *
-+ * The scheduler is at all times free to pick the calling task as the most
-+ * eligible task to run, if removing the yield() call from your code breaks
-+ * it, it's already broken.
-+ *
-+ * Typical broken usage is:
-+ *
-+ * while (!event)
-+ * yield();
-+ *
-+ * where one assumes that yield() will let 'the other' process run that will
-+ * make event true. If the current task is a SCHED_FIFO task that will never
-+ * happen. Never use yield() as a progress guarantee!!
-+ *
-+ * If you want to use yield() to wait for something, use wait_event().
-+ * If you want to use yield() to be 'nice' for others, use cond_resched().
-+ * If you still want to use yield(), do not!
-+ */
-+void __sched yield(void)
-+{
-+ set_current_state(TASK_RUNNING);
-+ do_sched_yield();
-+}
-+EXPORT_SYMBOL(yield);
-+
-+/**
-+ * yield_to - yield the current processor to another thread in
-+ * your thread group, or accelerate that thread toward the
-+ * processor it's on.
-+ * @p: target task
-+ * @preempt: whether task preemption is allowed or not
-+ *
-+ * It's the caller's job to ensure that the target task struct
-+ * can't go away on us before we can do any checks.
-+ *
-+ * In Alt schedule FW, yield_to is not supported.
-+ *
-+ * Return:
-+ * true (>0) if we indeed boosted the target task.
-+ * false (0) if we failed to boost the target.
-+ * -ESRCH if there's no task to yield to.
-+ */
-+int __sched yield_to(struct task_struct *p, bool preempt)
-+{
-+ return 0;
-+}
-+EXPORT_SYMBOL_GPL(yield_to);
-+
-+int io_schedule_prepare(void)
-+{
-+ int old_iowait = current->in_iowait;
-+
-+ current->in_iowait = 1;
-+ blk_flush_plug(current->plug, true);
-+ return old_iowait;
-+}
-+
-+void io_schedule_finish(int token)
-+{
-+ current->in_iowait = token;
-+}
-+
-+/*
-+ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
-+ * that process accounting knows that this is a task in IO wait state.
-+ *
-+ * But don't do that if it is a deliberate, throttling IO wait (this task
-+ * has set its backing_dev_info: the queue against which it should throttle)
-+ */
-+
-+long __sched io_schedule_timeout(long timeout)
-+{
-+ int token;
-+ long ret;
-+
-+ token = io_schedule_prepare();
-+ ret = schedule_timeout(timeout);
-+ io_schedule_finish(token);
-+
-+ return ret;
-+}
-+EXPORT_SYMBOL(io_schedule_timeout);
-+
-+void __sched io_schedule(void)
-+{
-+ int token;
-+
-+ token = io_schedule_prepare();
-+ schedule();
-+ io_schedule_finish(token);
-+}
-+EXPORT_SYMBOL(io_schedule);
-+
-+/**
-+ * sys_sched_get_priority_max - return maximum RT priority.
-+ * @policy: scheduling class.
-+ *
-+ * Return: On success, this syscall returns the maximum
-+ * rt_priority that can be used by a given scheduling class.
-+ * On failure, a negative error code is returned.
-+ */
-+SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
-+{
-+ int ret = -EINVAL;
-+
-+ switch (policy) {
-+ case SCHED_FIFO:
-+ case SCHED_RR:
-+ ret = MAX_RT_PRIO - 1;
-+ break;
-+ case SCHED_NORMAL:
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ ret = 0;
-+ break;
-+ }
-+ return ret;
-+}
-+
-+/**
-+ * sys_sched_get_priority_min - return minimum RT priority.
-+ * @policy: scheduling class.
-+ *
-+ * Return: On success, this syscall returns the minimum
-+ * rt_priority that can be used by a given scheduling class.
-+ * On failure, a negative error code is returned.
-+ */
-+SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
-+{
-+ int ret = -EINVAL;
-+
-+ switch (policy) {
-+ case SCHED_FIFO:
-+ case SCHED_RR:
-+ ret = 1;
-+ break;
-+ case SCHED_NORMAL:
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ ret = 0;
-+ break;
-+ }
-+ return ret;
-+}
-+
-+static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
-+{
-+ struct task_struct *p;
-+ int retval;
-+
-+ alt_sched_debug();
-+
-+ if (pid < 0)
-+ return -EINVAL;
-+
-+ retval = -ESRCH;
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+ rcu_read_unlock();
-+
-+ *t = ns_to_timespec64(sched_timeslice_ns);
-+ return 0;
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_rr_get_interval - return the default timeslice of a process.
-+ * @pid: pid of the process.
-+ * @interval: userspace pointer to the timeslice value.
-+ *
-+ *
-+ * Return: On success, 0 and the timeslice is in @interval. Otherwise,
-+ * an error code.
-+ */
-+SYSCALL_DEFINE2(sched_rr_get_interval, pid_t, pid,
-+ struct __kernel_timespec __user *, interval)
-+{
-+ struct timespec64 t;
-+ int retval = sched_rr_get_interval(pid, &t);
-+
-+ if (retval == 0)
-+ retval = put_timespec64(&t, interval);
-+
-+ return retval;
-+}
-+
-+#ifdef CONFIG_COMPAT_32BIT_TIME
-+SYSCALL_DEFINE2(sched_rr_get_interval_time32, pid_t, pid,
-+ struct old_timespec32 __user *, interval)
-+{
-+ struct timespec64 t;
-+ int retval = sched_rr_get_interval(pid, &t);
-+
-+ if (retval == 0)
-+ retval = put_old_timespec32(&t, interval);
-+ return retval;
-+}
-+#endif
-+
-+void sched_show_task(struct task_struct *p)
-+{
-+ unsigned long free = 0;
-+ int ppid;
-+
-+ if (!try_get_task_stack(p))
-+ return;
-+
-+ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
-+
-+ if (task_is_running(p))
-+ pr_cont(" running task ");
-+#ifdef CONFIG_DEBUG_STACK_USAGE
-+ free = stack_not_used(p);
-+#endif
-+ ppid = 0;
-+ rcu_read_lock();
-+ if (pid_alive(p))
-+ ppid = task_pid_nr(rcu_dereference(p->real_parent));
-+ rcu_read_unlock();
-+ pr_cont(" stack:%5lu pid:%5d ppid:%6d flags:0x%08lx\n",
-+ free, task_pid_nr(p), ppid,
-+ read_task_thread_flags(p));
-+
-+ print_worker_info(KERN_INFO, p);
-+ print_stop_info(KERN_INFO, p);
-+ show_stack(p, NULL, KERN_INFO);
-+ put_task_stack(p);
-+}
-+EXPORT_SYMBOL_GPL(sched_show_task);
-+
-+static inline bool
-+state_filter_match(unsigned long state_filter, struct task_struct *p)
-+{
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /* no filter, everything matches */
-+ if (!state_filter)
-+ return true;
-+
-+ /* filter, but doesn't match */
-+ if (!(state & state_filter))
-+ return false;
-+
-+ /*
-+ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
-+ * TASK_KILLABLE).
-+ */
-+ if (state_filter == TASK_UNINTERRUPTIBLE && state == TASK_IDLE)
-+ return false;
-+
-+ return true;
-+}
-+
-+
-+void show_state_filter(unsigned int state_filter)
-+{
-+ struct task_struct *g, *p;
-+
-+ rcu_read_lock();
-+ for_each_process_thread(g, p) {
-+ /*
-+ * reset the NMI-timeout, listing all files on a slow
-+ * console might take a lot of time:
-+ * Also, reset softlockup watchdogs on all CPUs, because
-+ * another CPU might be blocked waiting for us to process
-+ * an IPI.
-+ */
-+ touch_nmi_watchdog();
-+ touch_all_softlockup_watchdogs();
-+ if (state_filter_match(state_filter, p))
-+ sched_show_task(p);
-+ }
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+ /* TODO: Alt schedule FW should support this
-+ if (!state_filter)
-+ sysrq_sched_debug_show();
-+ */
-+#endif
-+ rcu_read_unlock();
-+ /*
-+ * Only show locks if all tasks are dumped:
-+ */
-+ if (!state_filter)
-+ debug_show_all_locks();
-+}
-+
-+void dump_cpu_task(int cpu)
-+{
-+ pr_info("Task dump for CPU %d:\n", cpu);
-+ sched_show_task(cpu_curr(cpu));
-+}
-+
-+/**
-+ * init_idle - set up an idle thread for a given CPU
-+ * @idle: task in question
-+ * @cpu: CPU the idle task belongs to
-+ *
-+ * NOTE: this function does not set the idle thread's NEED_RESCHED
-+ * flag, to make booting more robust.
-+ */
-+void __init init_idle(struct task_struct *idle, int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ __sched_fork(0, idle);
-+
-+ raw_spin_lock_irqsave(&idle->pi_lock, flags);
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+
-+ idle->last_ran = rq->clock_task;
-+ idle->__state = TASK_RUNNING;
-+ /*
-+ * PF_KTHREAD should already be set at this point; regardless, make it
-+ * look like a proper per-CPU kthread.
-+ */
-+ idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY;
-+ kthread_set_per_cpu(idle, cpu);
-+
-+ sched_queue_init_idle(&rq->queue, idle);
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * It's possible that init_idle() gets called multiple times on a task,
-+ * in that case do_set_cpus_allowed() will not do the right thing.
-+ *
-+ * And since this is boot we can forgo the serialisation.
-+ */
-+ set_cpus_allowed_common(idle, cpumask_of(cpu));
-+#endif
-+
-+ /* Silence PROVE_RCU */
-+ rcu_read_lock();
-+ __set_task_cpu(idle, cpu);
-+ rcu_read_unlock();
-+
-+ rq->idle = idle;
-+ rcu_assign_pointer(rq->curr, idle);
-+ idle->on_cpu = 1;
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
-+
-+ /* Set the preempt count _outside_ the spinlocks! */
-+ init_idle_preempt_count(idle, cpu);
-+
-+ ftrace_graph_init_idle_task(idle, cpu);
-+ vtime_init_idle(idle, cpu);
-+#ifdef CONFIG_SMP
-+ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
-+#endif
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
-+ const struct cpumask __maybe_unused *trial)
-+{
-+ return 1;
-+}
-+
-+int task_can_attach(struct task_struct *p,
-+ const struct cpumask *cs_cpus_allowed)
-+{
-+ int ret = 0;
-+
-+ /*
-+ * Kthreads which disallow setaffinity shouldn't be moved
-+ * to a new cpuset; we don't want to change their CPU
-+ * affinity and isolating such threads by their set of
-+ * allowed nodes is unnecessary. Thus, cpusets are not
-+ * applicable for such threads. This prevents checking for
-+ * success of set_cpus_allowed_ptr() on all attached tasks
-+ * before cpus_mask may be changed.
-+ */
-+ if (p->flags & PF_NO_SETAFFINITY)
-+ ret = -EINVAL;
-+
-+ return ret;
-+}
-+
-+bool sched_smp_initialized __read_mostly;
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+/*
-+ * Ensures that the idle task is using init_mm right before its CPU goes
-+ * offline.
-+ */
-+void idle_task_exit(void)
-+{
-+ struct mm_struct *mm = current->active_mm;
-+
-+ BUG_ON(current != this_rq()->idle);
-+
-+ if (mm != &init_mm) {
-+ switch_mm(mm, &init_mm, current);
-+ finish_arch_post_lock_switch();
-+ }
-+
-+ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
-+}
-+
-+static int __balance_push_cpu_stop(void *arg)
-+{
-+ struct task_struct *p = arg;
-+ struct rq *rq = this_rq();
-+ struct rq_flags rf;
-+ int cpu;
-+
-+ raw_spin_lock_irq(&p->pi_lock);
-+ rq_lock(rq, &rf);
-+
-+ update_rq_clock(rq);
-+
-+ if (task_rq(p) == rq && task_on_rq_queued(p)) {
-+ cpu = select_fallback_rq(rq->cpu, p);
-+ rq = __migrate_task(rq, p, cpu);
-+ }
-+
-+ rq_unlock(rq, &rf);
-+ raw_spin_unlock_irq(&p->pi_lock);
-+
-+ put_task_struct(p);
-+
-+ return 0;
-+}
-+
-+static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
-+
-+/*
-+ * This is enabled below SCHED_AP_ACTIVE; when !cpu_active(), but only
-+ * effective when the hotplug motion is down.
-+ */
-+static void balance_push(struct rq *rq)
-+{
-+ struct task_struct *push_task = rq->curr;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*
-+ * Ensure the thing is persistent until balance_push_set(.on = false);
-+ */
-+ rq->balance_callback = &balance_push_callback;
-+
-+ /*
-+ * Only active while going offline and when invoked on the outgoing
-+ * CPU.
-+ */
-+ if (!cpu_dying(rq->cpu) || rq != this_rq())
-+ return;
-+
-+ /*
-+ * Both the cpu-hotplug and stop task are in this case and are
-+ * required to complete the hotplug process.
-+ */
-+ if (kthread_is_per_cpu(push_task) ||
-+ is_migration_disabled(push_task)) {
-+
-+ /*
-+ * If this is the idle task on the outgoing CPU try to wake
-+ * up the hotplug control thread which might wait for the
-+ * last task to vanish. The rcuwait_active() check is
-+ * accurate here because the waiter is pinned on this CPU
-+ * and can't obviously be running in parallel.
-+ *
-+ * On RT kernels this also has to check whether there are
-+ * pinned and scheduled out tasks on the runqueue. They
-+ * need to leave the migrate disabled section first.
-+ */
-+ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
-+ rcuwait_active(&rq->hotplug_wait)) {
-+ raw_spin_unlock(&rq->lock);
-+ rcuwait_wake_up(&rq->hotplug_wait);
-+ raw_spin_lock(&rq->lock);
-+ }
-+ return;
-+ }
-+
-+ get_task_struct(push_task);
-+ /*
-+ * Temporarily drop rq->lock such that we can wake-up the stop task.
-+ * Both preemption and IRQs are still disabled.
-+ */
-+ raw_spin_unlock(&rq->lock);
-+ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
-+ this_cpu_ptr(&push_work));
-+ /*
-+ * At this point need_resched() is true and we'll take the loop in
-+ * schedule(). The next pick is obviously going to be the stop task
-+ * which kthread_is_per_cpu() and will push this task away.
-+ */
-+ raw_spin_lock(&rq->lock);
-+}
-+
-+static void balance_push_set(int cpu, bool on)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ struct rq_flags rf;
-+
-+ rq_lock_irqsave(rq, &rf);
-+ if (on) {
-+ WARN_ON_ONCE(rq->balance_callback);
-+ rq->balance_callback = &balance_push_callback;
-+ } else if (rq->balance_callback == &balance_push_callback) {
-+ rq->balance_callback = NULL;
-+ }
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+/*
-+ * Invoked from a CPUs hotplug control thread after the CPU has been marked
-+ * inactive. All tasks which are not per CPU kernel threads are either
-+ * pushed off this CPU now via balance_push() or placed on a different CPU
-+ * during wakeup. Wait until the CPU is quiescent.
-+ */
-+static void balance_hotplug_wait(void)
-+{
-+ struct rq *rq = this_rq();
-+
-+ rcuwait_wait_event(&rq->hotplug_wait,
-+ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
-+ TASK_UNINTERRUPTIBLE);
-+}
-+
-+#else
-+
-+static void balance_push(struct rq *rq)
-+{
-+}
-+
-+static void balance_push_set(int cpu, bool on)
-+{
-+}
-+
-+static inline void balance_hotplug_wait(void)
-+{
-+}
-+#endif /* CONFIG_HOTPLUG_CPU */
-+
-+static void set_rq_offline(struct rq *rq)
-+{
-+ if (rq->online)
-+ rq->online = false;
-+}
-+
-+static void set_rq_online(struct rq *rq)
-+{
-+ if (!rq->online)
-+ rq->online = true;
-+}
-+
-+/*
-+ * used to mark begin/end of suspend/resume:
-+ */
-+static int num_cpus_frozen;
-+
-+/*
-+ * Update cpusets according to cpu_active mask. If cpusets are
-+ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
-+ * around partition_sched_domains().
-+ *
-+ * If we come here as part of a suspend/resume, don't touch cpusets because we
-+ * want to restore it back to its original state upon resume anyway.
-+ */
-+static void cpuset_cpu_active(void)
-+{
-+ if (cpuhp_tasks_frozen) {
-+ /*
-+ * num_cpus_frozen tracks how many CPUs are involved in suspend
-+ * resume sequence. As long as this is not the last online
-+ * operation in the resume sequence, just build a single sched
-+ * domain, ignoring cpusets.
-+ */
-+ partition_sched_domains(1, NULL, NULL);
-+ if (--num_cpus_frozen)
-+ return;
-+ /*
-+ * This is the last CPU online operation. So fall through and
-+ * restore the original sched domains by considering the
-+ * cpuset configurations.
-+ */
-+ cpuset_force_rebuild();
-+ }
-+
-+ cpuset_update_active_cpus();
-+}
-+
-+static int cpuset_cpu_inactive(unsigned int cpu)
-+{
-+ if (!cpuhp_tasks_frozen) {
-+ cpuset_update_active_cpus();
-+ } else {
-+ num_cpus_frozen++;
-+ partition_sched_domains(1, NULL, NULL);
-+ }
-+ return 0;
-+}
-+
-+int sched_cpu_activate(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ /*
-+ * Clear the balance_push callback and prepare to schedule
-+ * regular tasks.
-+ */
-+ balance_push_set(cpu, false);
-+
-+#ifdef CONFIG_SCHED_SMT
-+ /*
-+ * When going up, increment the number of cores with SMT present.
-+ */
-+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
-+ static_branch_inc_cpuslocked(&sched_smt_present);
-+#endif
-+ set_cpu_active(cpu, true);
-+
-+ if (sched_smp_initialized)
-+ cpuset_cpu_active();
-+
-+ /*
-+ * Put the rq online, if not already. This happens:
-+ *
-+ * 1) In the early boot process, because we build the real domains
-+ * after all cpus have been brought up.
-+ *
-+ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
-+ * domains.
-+ */
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ set_rq_online(rq);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ return 0;
-+}
-+
-+int sched_cpu_deactivate(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+ int ret;
-+
-+ set_cpu_active(cpu, false);
-+
-+ /*
-+ * From this point forward, this CPU will refuse to run any task that
-+ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
-+ * push those tasks away until this gets cleared, see
-+ * sched_cpu_dying().
-+ */
-+ balance_push_set(cpu, true);
-+
-+ /*
-+ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
-+ * users of this state to go away such that all new such users will
-+ * observe it.
-+ *
-+ * Specifically, we rely on ttwu to no longer target this CPU, see
-+ * ttwu_queue_cond() and is_cpu_allowed().
-+ *
-+ * Do sync before park smpboot threads to take care the rcu boost case.
-+ */
-+ synchronize_rcu();
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ update_rq_clock(rq);
-+ set_rq_offline(rq);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+#ifdef CONFIG_SCHED_SMT
-+ /*
-+ * When going down, decrement the number of cores with SMT present.
-+ */
-+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
-+ static_branch_dec_cpuslocked(&sched_smt_present);
-+ if (!static_branch_likely(&sched_smt_present))
-+ cpumask_clear(&sched_sg_idle_mask);
-+ }
-+#endif
-+
-+ if (!sched_smp_initialized)
-+ return 0;
-+
-+ ret = cpuset_cpu_inactive(cpu);
-+ if (ret) {
-+ balance_push_set(cpu, false);
-+ set_cpu_active(cpu, true);
-+ return ret;
-+ }
-+
-+ return 0;
-+}
-+
-+static void sched_rq_cpu_starting(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ rq->calc_load_update = calc_load_update;
-+}
-+
-+int sched_cpu_starting(unsigned int cpu)
-+{
-+ sched_rq_cpu_starting(cpu);
-+ sched_tick_start(cpu);
-+ return 0;
-+}
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+
-+/*
-+ * Invoked immediately before the stopper thread is invoked to bring the
-+ * CPU down completely. At this point all per CPU kthreads except the
-+ * hotplug thread (current) and the stopper thread (inactive) have been
-+ * either parked or have been unbound from the outgoing CPU. Ensure that
-+ * any of those which might be on the way out are gone.
-+ *
-+ * If after this point a bound task is being woken on this CPU then the
-+ * responsible hotplug callback has failed to do it's job.
-+ * sched_cpu_dying() will catch it with the appropriate fireworks.
-+ */
-+int sched_cpu_wait_empty(unsigned int cpu)
-+{
-+ balance_hotplug_wait();
-+ return 0;
-+}
-+
-+/*
-+ * Since this CPU is going 'away' for a while, fold any nr_active delta we
-+ * might have. Called from the CPU stopper task after ensuring that the
-+ * stopper is the last running task on the CPU, so nr_active count is
-+ * stable. We need to take the teardown thread which is calling this into
-+ * account, so we hand in adjust = 1 to the load calculation.
-+ *
-+ * Also see the comment "Global load-average calculations".
-+ */
-+static void calc_load_migrate(struct rq *rq)
-+{
-+ long delta = calc_load_fold_active(rq, 1);
-+
-+ if (delta)
-+ atomic_long_add(delta, &calc_load_tasks);
-+}
-+
-+static void dump_rq_tasks(struct rq *rq, const char *loglvl)
-+{
-+ struct task_struct *g, *p;
-+ int cpu = cpu_of(rq);
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
-+ for_each_process_thread(g, p) {
-+ if (task_cpu(p) != cpu)
-+ continue;
-+
-+ if (!task_on_rq_queued(p))
-+ continue;
-+
-+ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
-+ }
-+}
-+
-+int sched_cpu_dying(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ /* Handle pending wakeups and then migrate everything off */
-+ sched_tick_stop(cpu);
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
-+ WARN(true, "Dying CPU not properly vacated!");
-+ dump_rq_tasks(rq, KERN_WARNING);
-+ }
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ calc_load_migrate(rq);
-+ hrtick_clear(rq);
-+ return 0;
-+}
-+#endif
-+
-+#ifdef CONFIG_SMP
-+static void sched_init_topology_cpumask_early(void)
-+{
-+ int cpu;
-+ cpumask_t *tmp;
-+
-+ for_each_possible_cpu(cpu) {
-+ /* init topo masks */
-+ tmp = per_cpu(sched_cpu_topo_masks, cpu);
-+
-+ cpumask_copy(tmp, cpumask_of(cpu));
-+ tmp++;
-+ cpumask_copy(tmp, cpu_possible_mask);
-+ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
-+ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
-+ /*per_cpu(sd_llc_id, cpu) = cpu;*/
-+ }
-+}
-+
-+#define TOPOLOGY_CPUMASK(name, mask, last)\
-+ if (cpumask_and(topo, topo, mask)) { \
-+ cpumask_copy(topo, mask); \
-+ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
-+ cpu, (topo++)->bits[0]); \
-+ } \
-+ if (!last) \
-+ cpumask_complement(topo, mask)
-+
-+static void sched_init_topology_cpumask(void)
-+{
-+ int cpu;
-+ cpumask_t *topo;
-+
-+ for_each_online_cpu(cpu) {
-+ /* take chance to reset time slice for idle tasks */
-+ cpu_rq(cpu)->idle->time_slice = sched_timeslice_ns;
-+
-+ topo = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+
-+ cpumask_complement(topo, cpumask_of(cpu));
-+#ifdef CONFIG_SCHED_SMT
-+ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
-+#endif
-+ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
-+ per_cpu(sched_cpu_llc_mask, cpu) = topo;
-+ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
-+
-+ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
-+
-+ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
-+
-+ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
-+ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
-+ cpu, per_cpu(sd_llc_id, cpu),
-+ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
-+ per_cpu(sched_cpu_topo_masks, cpu)));
-+ }
-+}
-+#endif
-+
-+void __init sched_init_smp(void)
-+{
-+ /* Move init over to a non-isolated CPU */
-+ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
-+ BUG();
-+ current->flags &= ~PF_NO_SETAFFINITY;
-+
-+ sched_init_topology_cpumask();
-+
-+ sched_smp_initialized = true;
-+}
-+#else
-+void __init sched_init_smp(void)
-+{
-+ cpu_rq(0)->idle->time_slice = sched_timeslice_ns;
-+}
-+#endif /* CONFIG_SMP */
-+
-+int in_sched_functions(unsigned long addr)
-+{
-+ return in_lock_functions(addr) ||
-+ (addr >= (unsigned long)__sched_text_start
-+ && addr < (unsigned long)__sched_text_end);
-+}
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+/* task group related information */
-+struct task_group {
-+ struct cgroup_subsys_state css;
-+
-+ struct rcu_head rcu;
-+ struct list_head list;
-+
-+ struct task_group *parent;
-+ struct list_head siblings;
-+ struct list_head children;
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+ unsigned long shares;
-+#endif
-+};
-+
-+/*
-+ * Default task group.
-+ * Every task in system belongs to this group at bootup.
-+ */
-+struct task_group root_task_group;
-+LIST_HEAD(task_groups);
-+
-+/* Cacheline aligned slab cache for task_group */
-+static struct kmem_cache *task_group_cache __read_mostly;
-+#endif /* CONFIG_CGROUP_SCHED */
-+
-+void __init sched_init(void)
-+{
-+ int i;
-+ struct rq *rq;
-+
-+ printk(KERN_INFO ALT_SCHED_VERSION_MSG);
-+
-+ wait_bit_init();
-+
-+#ifdef CONFIG_SMP
-+ for (i = 0; i < SCHED_QUEUE_BITS; i++)
-+ cpumask_copy(sched_rq_watermark + i, cpu_present_mask);
-+#endif
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+ task_group_cache = KMEM_CACHE(task_group, 0);
-+
-+ list_add(&root_task_group.list, &task_groups);
-+ INIT_LIST_HEAD(&root_task_group.children);
-+ INIT_LIST_HEAD(&root_task_group.siblings);
-+#endif /* CONFIG_CGROUP_SCHED */
-+ for_each_possible_cpu(i) {
-+ rq = cpu_rq(i);
-+
-+ sched_queue_init(&rq->queue);
-+ rq->watermark = IDLE_TASK_SCHED_PRIO;
-+ rq->skip = NULL;
-+
-+ raw_spin_lock_init(&rq->lock);
-+ rq->nr_running = rq->nr_uninterruptible = 0;
-+ rq->calc_load_active = 0;
-+ rq->calc_load_update = jiffies + LOAD_FREQ;
-+#ifdef CONFIG_SMP
-+ rq->online = false;
-+ rq->cpu = i;
-+
-+#ifdef CONFIG_SCHED_SMT
-+ rq->active_balance = 0;
-+#endif
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
-+#endif
-+ rq->balance_callback = &balance_push_callback;
-+#ifdef CONFIG_HOTPLUG_CPU
-+ rcuwait_init(&rq->hotplug_wait);
-+#endif
-+#endif /* CONFIG_SMP */
-+ rq->nr_switches = 0;
-+
-+ hrtick_rq_init(rq);
-+ atomic_set(&rq->nr_iowait, 0);
-+ }
-+#ifdef CONFIG_SMP
-+ /* Set rq->online for cpu 0 */
-+ cpu_rq(0)->online = true;
-+#endif
-+ /*
-+ * The boot idle thread does lazy MMU switching as well:
-+ */
-+ mmgrab(&init_mm);
-+ enter_lazy_tlb(&init_mm, current);
-+
-+ /*
-+ * The idle task doesn't need the kthread struct to function, but it
-+ * is dressed up as a per-CPU kthread and thus needs to play the part
-+ * if we want to avoid special-casing it in code that deals with per-CPU
-+ * kthreads.
-+ */
-+ WARN_ON(!set_kthread_struct(current));
-+
-+ /*
-+ * Make us the idle thread. Technically, schedule() should not be
-+ * called from this thread, however somewhere below it might be,
-+ * but because we are the idle thread, we just pick up running again
-+ * when this runqueue becomes "idle".
-+ */
-+ init_idle(current, smp_processor_id());
-+
-+ calc_load_update = jiffies + LOAD_FREQ;
-+
-+#ifdef CONFIG_SMP
-+ idle_thread_set_boot_cpu();
-+ balance_push_set(smp_processor_id(), false);
-+
-+ sched_init_topology_cpumask_early();
-+#endif /* SMP */
-+
-+ psi_init();
-+
-+ preempt_dynamic_init();
-+}
-+
-+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-+
-+void __might_sleep(const char *file, int line)
-+{
-+ unsigned int state = get_current_state();
-+ /*
-+ * Blocking primitives will set (and therefore destroy) current->state,
-+ * since we will exit with TASK_RUNNING make sure we enter with it,
-+ * otherwise we will destroy state.
-+ */
-+ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
-+ "do not call blocking ops when !TASK_RUNNING; "
-+ "state=%x set at [<%p>] %pS\n", state,
-+ (void *)current->task_state_change,
-+ (void *)current->task_state_change);
-+
-+ __might_resched(file, line, 0);
-+}
-+EXPORT_SYMBOL(__might_sleep);
-+
-+static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
-+{
-+ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
-+ return;
-+
-+ if (preempt_count() == preempt_offset)
-+ return;
-+
-+ pr_err("Preemption disabled at:");
-+ print_ip_sym(KERN_ERR, ip);
-+}
-+
-+static inline bool resched_offsets_ok(unsigned int offsets)
-+{
-+ unsigned int nested = preempt_count();
-+
-+ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
-+
-+ return nested == offsets;
-+}
-+
-+void __might_resched(const char *file, int line, unsigned int offsets)
-+{
-+ /* Ratelimiting timestamp: */
-+ static unsigned long prev_jiffy;
-+
-+ unsigned long preempt_disable_ip;
-+
-+ /* WARN_ON_ONCE() by default, no rate limit required: */
-+ rcu_sleep_check();
-+
-+ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
-+ !is_idle_task(current) && !current->non_block_count) ||
-+ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
-+ oops_in_progress)
-+ return;
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ /* Save this before calling printk(), since that will clobber it: */
-+ preempt_disable_ip = get_preempt_disable_ip(current);
-+
-+ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
-+ file, line);
-+ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(), current->non_block_count,
-+ current->pid, current->comm);
-+ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
-+ offsets & MIGHT_RESCHED_PREEMPT_MASK);
-+
-+ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
-+ pr_err("RCU nest depth: %d, expected: %u\n",
-+ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
-+ }
-+
-+ if (task_stack_end_corrupted(current))
-+ pr_emerg("Thread overran stack, or stack corrupted\n");
-+
-+ debug_show_held_locks(current);
-+ if (irqs_disabled())
-+ print_irqtrace_events(current);
-+
-+ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
-+ preempt_disable_ip);
-+
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL(__might_resched);
-+
-+void __cant_sleep(const char *file, int line, int preempt_offset)
-+{
-+ static unsigned long prev_jiffy;
-+
-+ if (irqs_disabled())
-+ return;
-+
-+ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
-+ return;
-+
-+ if (preempt_count() > preempt_offset)
-+ return;
-+
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
-+ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(),
-+ current->pid, current->comm);
-+
-+ debug_show_held_locks(current);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL_GPL(__cant_sleep);
-+
-+#ifdef CONFIG_SMP
-+void __cant_migrate(const char *file, int line)
-+{
-+ static unsigned long prev_jiffy;
-+
-+ if (irqs_disabled())
-+ return;
-+
-+ if (is_migration_disabled(current))
-+ return;
-+
-+ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
-+ return;
-+
-+ if (preempt_count() > 0)
-+ return;
-+
-+ if (current->migration_flags & MDF_FORCE_ENABLED)
-+ return;
-+
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
-+ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(), is_migration_disabled(current),
-+ current->pid, current->comm);
-+
-+ debug_show_held_locks(current);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL_GPL(__cant_migrate);
-+#endif
-+#endif
-+
-+#ifdef CONFIG_MAGIC_SYSRQ
-+void normalize_rt_tasks(void)
-+{
-+ struct task_struct *g, *p;
-+ struct sched_attr attr = {
-+ .sched_policy = SCHED_NORMAL,
-+ };
-+
-+ read_lock(&tasklist_lock);
-+ for_each_process_thread(g, p) {
-+ /*
-+ * Only normalize user tasks:
-+ */
-+ if (p->flags & PF_KTHREAD)
-+ continue;
-+
-+ schedstat_set(p->stats.wait_start, 0);
-+ schedstat_set(p->stats.sleep_start, 0);
-+ schedstat_set(p->stats.block_start, 0);
-+
-+ if (!rt_task(p)) {
-+ /*
-+ * Renice negative nice level userspace
-+ * tasks back to 0:
-+ */
-+ if (task_nice(p) < 0)
-+ set_user_nice(p, 0);
-+ continue;
-+ }
-+
-+ __sched_setscheduler(p, &attr, false, false);
-+ }
-+ read_unlock(&tasklist_lock);
-+}
-+#endif /* CONFIG_MAGIC_SYSRQ */
-+
-+#if defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB)
-+/*
-+ * These functions are only useful for the IA64 MCA handling, or kdb.
-+ *
-+ * They can only be called when the whole system has been
-+ * stopped - every CPU needs to be quiescent, and no scheduling
-+ * activity can take place. Using them for anything else would
-+ * be a serious bug, and as a result, they aren't even visible
-+ * under any other configuration.
-+ */
-+
-+/**
-+ * curr_task - return the current task for a given CPU.
-+ * @cpu: the processor in question.
-+ *
-+ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
-+ *
-+ * Return: The current task for @cpu.
-+ */
-+struct task_struct *curr_task(int cpu)
-+{
-+ return cpu_curr(cpu);
-+}
-+
-+#endif /* defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB) */
-+
-+#ifdef CONFIG_IA64
-+/**
-+ * ia64_set_curr_task - set the current task for a given CPU.
-+ * @cpu: the processor in question.
-+ * @p: the task pointer to set.
-+ *
-+ * Description: This function must only be used when non-maskable interrupts
-+ * are serviced on a separate stack. It allows the architecture to switch the
-+ * notion of the current task on a CPU in a non-blocking manner. This function
-+ * must be called with all CPU's synchronised, and interrupts disabled, the
-+ * and caller must save the original value of the current task (see
-+ * curr_task() above) and restore that value before reenabling interrupts and
-+ * re-starting the system.
-+ *
-+ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
-+ */
-+void ia64_set_curr_task(int cpu, struct task_struct *p)
-+{
-+ cpu_curr(cpu) = p;
-+}
-+
-+#endif
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+static void sched_free_group(struct task_group *tg)
-+{
-+ kmem_cache_free(task_group_cache, tg);
-+}
-+
-+static void sched_free_group_rcu(struct rcu_head *rhp)
-+{
-+ sched_free_group(container_of(rhp, struct task_group, rcu));
-+}
-+
-+static void sched_unregister_group(struct task_group *tg)
-+{
-+ /*
-+ * We have to wait for yet another RCU grace period to expire, as
-+ * print_cfs_stats() might run concurrently.
-+ */
-+ call_rcu(&tg->rcu, sched_free_group_rcu);
-+}
-+
-+/* allocate runqueue etc for a new task group */
-+struct task_group *sched_create_group(struct task_group *parent)
-+{
-+ struct task_group *tg;
-+
-+ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
-+ if (!tg)
-+ return ERR_PTR(-ENOMEM);
-+
-+ return tg;
-+}
-+
-+void sched_online_group(struct task_group *tg, struct task_group *parent)
-+{
-+}
-+
-+/* rcu callback to free various structures associated with a task group */
-+static void sched_unregister_group_rcu(struct rcu_head *rhp)
-+{
-+ /* Now it should be safe to free those cfs_rqs: */
-+ sched_unregister_group(container_of(rhp, struct task_group, rcu));
-+}
-+
-+void sched_destroy_group(struct task_group *tg)
-+{
-+ /* Wait for possible concurrent references to cfs_rqs complete: */
-+ call_rcu(&tg->rcu, sched_unregister_group_rcu);
-+}
-+
-+void sched_release_group(struct task_group *tg)
-+{
-+}
-+
-+static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
-+{
-+ return css ? container_of(css, struct task_group, css) : NULL;
-+}
-+
-+static struct cgroup_subsys_state *
-+cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
-+{
-+ struct task_group *parent = css_tg(parent_css);
-+ struct task_group *tg;
-+
-+ if (!parent) {
-+ /* This is early initialization for the top cgroup */
-+ return &root_task_group.css;
-+ }
-+
-+ tg = sched_create_group(parent);
-+ if (IS_ERR(tg))
-+ return ERR_PTR(-ENOMEM);
-+ return &tg->css;
-+}
-+
-+/* Expose task group only after completing cgroup initialization */
-+static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+ struct task_group *parent = css_tg(css->parent);
-+
-+ if (parent)
-+ sched_online_group(tg, parent);
-+ return 0;
-+}
-+
-+static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ sched_release_group(tg);
-+}
-+
-+static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ /*
-+ * Relies on the RCU grace period between css_released() and this.
-+ */
-+ sched_unregister_group(tg);
-+}
-+
-+static void cpu_cgroup_fork(struct task_struct *task)
-+{
-+}
-+
-+static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
-+{
-+ return 0;
-+}
-+
-+static void cpu_cgroup_attach(struct cgroup_taskset *tset)
-+{
-+}
-+
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+static DEFINE_MUTEX(shares_mutex);
-+
-+int sched_group_set_shares(struct task_group *tg, unsigned long shares)
-+{
-+ /*
-+ * We can't change the weight of the root cgroup.
-+ */
-+ if (&root_task_group == tg)
-+ return -EINVAL;
-+
-+ shares = clamp(shares, scale_load(MIN_SHARES), scale_load(MAX_SHARES));
-+
-+ mutex_lock(&shares_mutex);
-+ if (tg->shares == shares)
-+ goto done;
-+
-+ tg->shares = shares;
-+done:
-+ mutex_unlock(&shares_mutex);
-+ return 0;
-+}
-+
-+static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
-+ struct cftype *cftype, u64 shareval)
-+{
-+ if (shareval > scale_load_down(ULONG_MAX))
-+ shareval = MAX_SHARES;
-+ return sched_group_set_shares(css_tg(css), scale_load(shareval));
-+}
-+
-+static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
-+ struct cftype *cft)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ return (u64) scale_load_down(tg->shares);
-+}
-+#endif
-+
-+static struct cftype cpu_legacy_files[] = {
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+ {
-+ .name = "shares",
-+ .read_u64 = cpu_shares_read_u64,
-+ .write_u64 = cpu_shares_write_u64,
-+ },
-+#endif
-+ { } /* Terminate */
-+};
-+
-+
-+static struct cftype cpu_files[] = {
-+ { } /* terminate */
-+};
-+
-+static int cpu_extra_stat_show(struct seq_file *sf,
-+ struct cgroup_subsys_state *css)
-+{
-+ return 0;
-+}
-+
-+struct cgroup_subsys cpu_cgrp_subsys = {
-+ .css_alloc = cpu_cgroup_css_alloc,
-+ .css_online = cpu_cgroup_css_online,
-+ .css_released = cpu_cgroup_css_released,
-+ .css_free = cpu_cgroup_css_free,
-+ .css_extra_stat_show = cpu_extra_stat_show,
-+ .fork = cpu_cgroup_fork,
-+ .can_attach = cpu_cgroup_can_attach,
-+ .attach = cpu_cgroup_attach,
-+ .legacy_cftypes = cpu_files,
-+ .legacy_cftypes = cpu_legacy_files,
-+ .dfl_cftypes = cpu_files,
-+ .early_init = true,
-+ .threaded = true,
-+};
-+#endif /* CONFIG_CGROUP_SCHED */
-+
-+#undef CREATE_TRACE_POINTS
-diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
-new file mode 100644
-index 000000000000..1212a031700e
---- /dev/null
-+++ b/kernel/sched/alt_debug.c
-@@ -0,0 +1,31 @@
-+/*
-+ * kernel/sched/alt_debug.c
-+ *
-+ * Print the alt scheduler debugging details
-+ *
-+ * Author: Alfred Chen
-+ * Date : 2020
-+ */
-+#include "sched.h"
-+
-+/*
-+ * This allows printing both to /proc/sched_debug and
-+ * to the console
-+ */
-+#define SEQ_printf(m, x...) \
-+ do { \
-+ if (m) \
-+ seq_printf(m, x); \
-+ else \
-+ pr_cont(x); \
-+ } while (0)
-+
-+void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
-+ struct seq_file *m)
-+{
-+ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
-+ get_nr_threads(p));
-+}
-+
-+void proc_sched_set_task(struct task_struct *p)
-+{}
-diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
-new file mode 100644
-index 000000000000..611424bbfa9b
---- /dev/null
-+++ b/kernel/sched/alt_sched.h
-@@ -0,0 +1,645 @@
-+#ifndef ALT_SCHED_H
-+#define ALT_SCHED_H
-+
-+#include <linux/psi.h>
-+#include <linux/stop_machine.h>
-+#include <linux/syscalls.h>
-+#include <linux/tick.h>
-+
-+#include <trace/events/power.h>
-+#include <trace/events/sched.h>
-+
-+#include "../workqueue_internal.h"
-+
-+#include "cpupri.h"
-+
-+#ifdef CONFIG_SCHED_BMQ
-+/* bits:
-+ * RT(0-99), (Low prio adj range, nice width, high prio adj range) / 2, cpu idle task */
-+#define SCHED_BITS (MAX_RT_PRIO + NICE_WIDTH / 2 + MAX_PRIORITY_ADJ + 1)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+/* bits: RT(0-99), reserved(100-127), NORMAL_PRIO_NUM, cpu idle task */
-+#define SCHED_BITS (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM + 1)
-+#endif /* CONFIG_SCHED_PDS */
-+
-+#define IDLE_TASK_SCHED_PRIO (SCHED_BITS - 1)
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+# define SCHED_WARN_ON(x) WARN_ONCE(x, #x)
-+extern void resched_latency_warn(int cpu, u64 latency);
-+#else
-+# define SCHED_WARN_ON(x) ({ (void)(x), 0; })
-+static inline void resched_latency_warn(int cpu, u64 latency) {}
-+#endif
-+
-+/*
-+ * Increase resolution of nice-level calculations for 64-bit architectures.
-+ * The extra resolution improves shares distribution and load balancing of
-+ * low-weight task groups (eg. nice +19 on an autogroup), deeper taskgroup
-+ * hierarchies, especially on larger systems. This is not a user-visible change
-+ * and does not change the user-interface for setting shares/weights.
-+ *
-+ * We increase resolution only if we have enough bits to allow this increased
-+ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
-+ * are pretty high and the returns do not justify the increased costs.
-+ *
-+ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
-+ * increase coverage and consistency always enable it on 64-bit platforms.
-+ */
-+#ifdef CONFIG_64BIT
-+# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load_down(w) \
-+({ \
-+ unsigned long __w = (w); \
-+ if (__w) \
-+ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
-+ __w; \
-+})
-+#else
-+# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load(w) (w)
-+# define scale_load_down(w) (w)
-+#endif
-+
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+#define ROOT_TASK_GROUP_LOAD NICE_0_LOAD
-+
-+/*
-+ * A weight of 0 or 1 can cause arithmetics problems.
-+ * A weight of a cfs_rq is the sum of weights of which entities
-+ * are queued on this cfs_rq, so a weight of a entity should not be
-+ * too large, so as the shares value of a task group.
-+ * (The default weight is 1024 - so there's no practical
-+ * limitation from this.)
-+ */
-+#define MIN_SHARES (1UL << 1)
-+#define MAX_SHARES (1UL << 18)
-+#endif
-+
-+/* task_struct::on_rq states: */
-+#define TASK_ON_RQ_QUEUED 1
-+#define TASK_ON_RQ_MIGRATING 2
-+
-+static inline int task_on_rq_queued(struct task_struct *p)
-+{
-+ return p->on_rq == TASK_ON_RQ_QUEUED;
-+}
-+
-+static inline int task_on_rq_migrating(struct task_struct *p)
-+{
-+ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
-+}
-+
-+/*
-+ * wake flags
-+ */
-+#define WF_SYNC 0x01 /* waker goes to sleep after wakeup */
-+#define WF_FORK 0x02 /* child wakeup after fork */
-+#define WF_MIGRATED 0x04 /* internal use, task got migrated */
-+#define WF_ON_CPU 0x08 /* Wakee is on_rq */
-+
-+#define SCHED_QUEUE_BITS (SCHED_BITS - 1)
-+
-+struct sched_queue {
-+ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
-+ struct list_head heads[SCHED_BITS];
-+};
-+
-+/*
-+ * This is the main, per-CPU runqueue data structure.
-+ * This data should only be modified by the local cpu.
-+ */
-+struct rq {
-+ /* runqueue lock: */
-+ raw_spinlock_t lock;
-+
-+ struct task_struct __rcu *curr;
-+ struct task_struct *idle, *stop, *skip;
-+ struct mm_struct *prev_mm;
-+
-+ struct sched_queue queue;
-+#ifdef CONFIG_SCHED_PDS
-+ u64 time_edge;
-+#endif
-+ unsigned long watermark;
-+
-+ /* switch count */
-+ u64 nr_switches;
-+
-+ atomic_t nr_iowait;
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+ u64 last_seen_need_resched_ns;
-+ int ticks_without_resched;
-+#endif
-+
-+#ifdef CONFIG_MEMBARRIER
-+ int membarrier_state;
-+#endif
-+
-+#ifdef CONFIG_SMP
-+ int cpu; /* cpu of this runqueue */
-+ bool online;
-+
-+ unsigned int ttwu_pending;
-+ unsigned char nohz_idle_balance;
-+ unsigned char idle_balance;
-+
-+#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
-+ struct sched_avg avg_irq;
-+#endif
-+
-+#ifdef CONFIG_SCHED_SMT
-+ int active_balance;
-+ struct cpu_stop_work active_balance_work;
-+#endif
-+ struct callback_head *balance_callback;
-+#ifdef CONFIG_HOTPLUG_CPU
-+ struct rcuwait hotplug_wait;
-+#endif
-+ unsigned int nr_pinned;
-+
-+#endif /* CONFIG_SMP */
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+ u64 prev_irq_time;
-+#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
-+#ifdef CONFIG_PARAVIRT
-+ u64 prev_steal_time;
-+#endif /* CONFIG_PARAVIRT */
-+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
-+ u64 prev_steal_time_rq;
-+#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
-+
-+ /* For genenal cpu load util */
-+ s32 load_history;
-+ u64 load_block;
-+ u64 load_stamp;
-+
-+ /* calc_load related fields */
-+ unsigned long calc_load_update;
-+ long calc_load_active;
-+
-+ u64 clock, last_tick;
-+ u64 last_ts_switch;
-+ u64 clock_task;
-+
-+ unsigned int nr_running;
-+ unsigned long nr_uninterruptible;
-+
-+#ifdef CONFIG_SCHED_HRTICK
-+#ifdef CONFIG_SMP
-+ call_single_data_t hrtick_csd;
-+#endif
-+ struct hrtimer hrtick_timer;
-+ ktime_t hrtick_time;
-+#endif
-+
-+#ifdef CONFIG_SCHEDSTATS
-+
-+ /* latency stats */
-+ struct sched_info rq_sched_info;
-+ unsigned long long rq_cpu_time;
-+ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
-+
-+ /* sys_sched_yield() stats */
-+ unsigned int yld_count;
-+
-+ /* schedule() stats */
-+ unsigned int sched_switch;
-+ unsigned int sched_count;
-+ unsigned int sched_goidle;
-+
-+ /* try_to_wake_up() stats */
-+ unsigned int ttwu_count;
-+ unsigned int ttwu_local;
-+#endif /* CONFIG_SCHEDSTATS */
-+
-+#ifdef CONFIG_CPU_IDLE
-+ /* Must be inspected within a rcu lock section */
-+ struct cpuidle_state *idle_state;
-+#endif
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+#ifdef CONFIG_SMP
-+ call_single_data_t nohz_csd;
-+#endif
-+ atomic_t nohz_flags;
-+#endif /* CONFIG_NO_HZ_COMMON */
-+};
-+
-+extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
-+
-+extern unsigned long calc_load_update;
-+extern atomic_long_t calc_load_tasks;
-+
-+extern void calc_global_load_tick(struct rq *this_rq);
-+extern long calc_load_fold_active(struct rq *this_rq, long adjust);
-+
-+DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-+#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
-+#define this_rq() this_cpu_ptr(&runqueues)
-+#define task_rq(p) cpu_rq(task_cpu(p))
-+#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
-+#define raw_rq() raw_cpu_ptr(&runqueues)
-+
-+#ifdef CONFIG_SMP
-+#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
-+void register_sched_domain_sysctl(void);
-+void unregister_sched_domain_sysctl(void);
-+#else
-+static inline void register_sched_domain_sysctl(void)
-+{
-+}
-+static inline void unregister_sched_domain_sysctl(void)
-+{
-+}
-+#endif
-+
-+extern bool sched_smp_initialized;
-+
-+enum {
-+ ITSELF_LEVEL_SPACE_HOLDER,
-+#ifdef CONFIG_SCHED_SMT
-+ SMT_LEVEL_SPACE_HOLDER,
-+#endif
-+ COREGROUP_LEVEL_SPACE_HOLDER,
-+ CORE_LEVEL_SPACE_HOLDER,
-+ OTHER_LEVEL_SPACE_HOLDER,
-+ NR_CPU_AFFINITY_LEVELS
-+};
-+
-+DECLARE_PER_CPU(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
-+DECLARE_PER_CPU(cpumask_t *, sched_cpu_llc_mask);
-+
-+static inline int
-+__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
-+{
-+ int cpu;
-+
-+ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
-+ mask++;
-+
-+ return cpu;
-+}
-+
-+static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
-+{
-+ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
-+}
-+
-+extern void flush_smp_call_function_from_idle(void);
-+
-+#else /* !CONFIG_SMP */
-+static inline void flush_smp_call_function_from_idle(void) { }
-+#endif
-+
-+#ifndef arch_scale_freq_tick
-+static __always_inline
-+void arch_scale_freq_tick(void)
-+{
-+}
-+#endif
-+
-+#ifndef arch_scale_freq_capacity
-+static __always_inline
-+unsigned long arch_scale_freq_capacity(int cpu)
-+{
-+ return SCHED_CAPACITY_SCALE;
-+}
-+#endif
-+
-+static inline u64 __rq_clock_broken(struct rq *rq)
-+{
-+ return READ_ONCE(rq->clock);
-+}
-+
-+static inline u64 rq_clock(struct rq *rq)
-+{
-+ /*
-+ * Relax lockdep_assert_held() checking as in VRQ, call to
-+ * sched_info_xxxx() may not held rq->lock
-+ * lockdep_assert_held(&rq->lock);
-+ */
-+ return rq->clock;
-+}
-+
-+static inline u64 rq_clock_task(struct rq *rq)
-+{
-+ /*
-+ * Relax lockdep_assert_held() checking as in VRQ, call to
-+ * sched_info_xxxx() may not held rq->lock
-+ * lockdep_assert_held(&rq->lock);
-+ */
-+ return rq->clock_task;
-+}
-+
-+/*
-+ * {de,en}queue flags:
-+ *
-+ * DEQUEUE_SLEEP - task is no longer runnable
-+ * ENQUEUE_WAKEUP - task just became runnable
-+ *
-+ */
-+
-+#define DEQUEUE_SLEEP 0x01
-+
-+#define ENQUEUE_WAKEUP 0x01
-+
-+
-+/*
-+ * Below are scheduler API which using in other kernel code
-+ * It use the dummy rq_flags
-+ * ToDo : BMQ need to support these APIs for compatibility with mainline
-+ * scheduler code.
-+ */
-+struct rq_flags {
-+ unsigned long flags;
-+};
-+
-+struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(rq->lock);
-+
-+struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(p->pi_lock)
-+ __acquires(rq->lock);
-+
-+static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+static inline void
-+task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
-+ __releases(rq->lock)
-+ __releases(p->pi_lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
-+}
-+
-+static inline void
-+rq_lock(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock(&rq->lock);
-+}
-+
-+static inline void
-+rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock_irq(&rq->lock);
-+}
-+
-+static inline void
-+rq_unlock(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+static inline struct rq *
-+this_rq_lock_irq(struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ local_irq_disable();
-+ rq = this_rq();
-+ raw_spin_lock(&rq->lock);
-+
-+ return rq;
-+}
-+
-+static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
-+{
-+ return &rq->lock;
-+}
-+
-+static inline raw_spinlock_t *rq_lockp(struct rq *rq)
-+{
-+ return __rq_lockp(rq);
-+}
-+
-+static inline void lockdep_assert_rq_held(struct rq *rq)
-+{
-+ lockdep_assert_held(__rq_lockp(rq));
-+}
-+
-+extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
-+extern void raw_spin_rq_unlock(struct rq *rq);
-+
-+static inline void raw_spin_rq_lock(struct rq *rq)
-+{
-+ raw_spin_rq_lock_nested(rq, 0);
-+}
-+
-+static inline void raw_spin_rq_lock_irq(struct rq *rq)
-+{
-+ local_irq_disable();
-+ raw_spin_rq_lock(rq);
-+}
-+
-+static inline void raw_spin_rq_unlock_irq(struct rq *rq)
-+{
-+ raw_spin_rq_unlock(rq);
-+ local_irq_enable();
-+}
-+
-+static inline int task_current(struct rq *rq, struct task_struct *p)
-+{
-+ return rq->curr == p;
-+}
-+
-+static inline bool task_running(struct task_struct *p)
-+{
-+ return p->on_cpu;
-+}
-+
-+extern int task_running_nice(struct task_struct *p);
-+
-+extern struct static_key_false sched_schedstats;
-+
-+#ifdef CONFIG_CPU_IDLE
-+static inline void idle_set_state(struct rq *rq,
-+ struct cpuidle_state *idle_state)
-+{
-+ rq->idle_state = idle_state;
-+}
-+
-+static inline struct cpuidle_state *idle_get_state(struct rq *rq)
-+{
-+ WARN_ON(!rcu_read_lock_held());
-+ return rq->idle_state;
-+}
-+#else
-+static inline void idle_set_state(struct rq *rq,
-+ struct cpuidle_state *idle_state)
-+{
-+}
-+
-+static inline struct cpuidle_state *idle_get_state(struct rq *rq)
-+{
-+ return NULL;
-+}
-+#endif
-+
-+static inline int cpu_of(const struct rq *rq)
-+{
-+#ifdef CONFIG_SMP
-+ return rq->cpu;
-+#else
-+ return 0;
-+#endif
-+}
-+
-+#include "stats.h"
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+#define NOHZ_BALANCE_KICK_BIT 0
-+#define NOHZ_STATS_KICK_BIT 1
-+
-+#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
-+#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
-+
-+#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
-+
-+#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
-+
-+/* TODO: needed?
-+extern void nohz_balance_exit_idle(struct rq *rq);
-+#else
-+static inline void nohz_balance_exit_idle(struct rq *rq) { }
-+*/
-+#endif
-+
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+struct irqtime {
-+ u64 total;
-+ u64 tick_delta;
-+ u64 irq_start_time;
-+ struct u64_stats_sync sync;
-+};
-+
-+DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
-+
-+/*
-+ * Returns the irqtime minus the softirq time computed by ksoftirqd.
-+ * Otherwise ksoftirqd's sum_exec_runtime is substracted its own runtime
-+ * and never move forward.
-+ */
-+static inline u64 irq_time_read(int cpu)
-+{
-+ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
-+ unsigned int seq;
-+ u64 total;
-+
-+ do {
-+ seq = __u64_stats_fetch_begin(&irqtime->sync);
-+ total = irqtime->total;
-+ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
-+
-+ return total;
-+}
-+#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
-+
-+#ifdef CONFIG_CPU_FREQ
-+DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
-+#endif /* CONFIG_CPU_FREQ */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+extern int __init sched_tick_offload_init(void);
-+#else
-+static inline int sched_tick_offload_init(void) { return 0; }
-+#endif
-+
-+#ifdef arch_scale_freq_capacity
-+#ifndef arch_scale_freq_invariant
-+#define arch_scale_freq_invariant() (true)
-+#endif
-+#else /* arch_scale_freq_capacity */
-+#define arch_scale_freq_invariant() (false)
-+#endif
-+
-+extern void schedule_idle(void);
-+
-+#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
-+
-+/*
-+ * !! For sched_setattr_nocheck() (kernel) only !!
-+ *
-+ * This is actually gross. :(
-+ *
-+ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
-+ * tasks, but still be able to sleep. We need this on platforms that cannot
-+ * atomically change clock frequency. Remove once fast switching will be
-+ * available on such platforms.
-+ *
-+ * SUGOV stands for SchedUtil GOVernor.
-+ */
-+#define SCHED_FLAG_SUGOV 0x10000000
-+
-+#ifdef CONFIG_MEMBARRIER
-+/*
-+ * The scheduler provides memory barriers required by membarrier between:
-+ * - prior user-space memory accesses and store to rq->membarrier_state,
-+ * - store to rq->membarrier_state and following user-space memory accesses.
-+ * In the same way it provides those guarantees around store to rq->curr.
-+ */
-+static inline void membarrier_switch_mm(struct rq *rq,
-+ struct mm_struct *prev_mm,
-+ struct mm_struct *next_mm)
-+{
-+ int membarrier_state;
-+
-+ if (prev_mm == next_mm)
-+ return;
-+
-+ membarrier_state = atomic_read(&next_mm->membarrier_state);
-+ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
-+ return;
-+
-+ WRITE_ONCE(rq->membarrier_state, membarrier_state);
-+}
-+#else
-+static inline void membarrier_switch_mm(struct rq *rq,
-+ struct mm_struct *prev_mm,
-+ struct mm_struct *next_mm)
-+{
-+}
-+#endif
-+
-+#ifdef CONFIG_NUMA
-+extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
-+#else
-+static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
-+{
-+ return nr_cpu_ids;
-+}
-+#endif
-+
-+extern void swake_up_all_locked(struct swait_queue_head *q);
-+extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+extern int preempt_dynamic_mode;
-+extern int sched_dynamic_mode(const char *str);
-+extern void sched_dynamic_update(int mode);
-+#endif
-+
-+static inline void nohz_run_idle_balance(int cpu) { }
-+
-+static inline
-+unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
-+ struct task_struct *p)
-+{
-+ return util;
-+}
-+
-+static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
-+
-+#endif /* ALT_SCHED_H */
-diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
-new file mode 100644
-index 000000000000..66b77291b9d0
---- /dev/null
-+++ b/kernel/sched/bmq.h
-@@ -0,0 +1,110 @@
-+#define ALT_SCHED_VERSION_MSG "sched/bmq: BMQ CPU Scheduler "ALT_SCHED_VERSION" by Alfred Chen.\n"
-+
-+/*
-+ * BMQ only routines
-+ */
-+#define rq_switch_time(rq) ((rq)->clock - (rq)->last_ts_switch)
-+#define boost_threshold(p) (sched_timeslice_ns >>\
-+ (15 - MAX_PRIORITY_ADJ - (p)->boost_prio))
-+
-+static inline void boost_task(struct task_struct *p)
-+{
-+ int limit;
-+
-+ switch (p->policy) {
-+ case SCHED_NORMAL:
-+ limit = -MAX_PRIORITY_ADJ;
-+ break;
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ limit = 0;
-+ break;
-+ default:
-+ return;
-+ }
-+
-+ if (p->boost_prio > limit)
-+ p->boost_prio--;
-+}
-+
-+static inline void deboost_task(struct task_struct *p)
-+{
-+ if (p->boost_prio < MAX_PRIORITY_ADJ)
-+ p->boost_prio++;
-+}
-+
-+/*
-+ * Common interfaces
-+ */
-+static inline void sched_timeslice_imp(const int timeslice_ms) {}
-+
-+static inline int
-+task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
-+{
-+ return p->prio + p->boost_prio - MAX_RT_PRIO;
-+}
-+
-+static inline int task_sched_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MAX_RT_PRIO)? p->prio : MAX_RT_PRIO / 2 + (p->prio + p->boost_prio) / 2;
-+}
-+
-+static inline int
-+task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
-+{
-+ return task_sched_prio(p);
-+}
-+
-+static inline int sched_prio2idx(int prio, struct rq *rq)
-+{
-+ return prio;
-+}
-+
-+static inline int sched_idx2prio(int idx, struct rq *rq)
-+{
-+ return idx;
-+}
-+
-+static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
-+{
-+ p->time_slice = sched_timeslice_ns;
-+
-+ if (SCHED_FIFO != p->policy && task_on_rq_queued(p)) {
-+ if (SCHED_RR != p->policy)
-+ deboost_task(p);
-+ requeue_task(p, rq, task_sched_prio_idx(p, rq));
-+ }
-+}
-+
-+static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
-+
-+inline int task_running_nice(struct task_struct *p)
-+{
-+ return (p->prio + p->boost_prio > DEFAULT_PRIO + MAX_PRIORITY_ADJ);
-+}
-+
-+static void sched_task_fork(struct task_struct *p, struct rq *rq)
-+{
-+ p->boost_prio = MAX_PRIORITY_ADJ;
-+}
-+
-+static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
-+{
-+ p->boost_prio = MAX_PRIORITY_ADJ;
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline void sched_task_ttwu(struct task_struct *p)
-+{
-+ if(this_rq()->clock_task - p->last_ran > sched_timeslice_ns)
-+ boost_task(p);
-+}
-+#endif
-+
-+static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
-+{
-+ if (rq_switch_time(rq) < boost_threshold(p))
-+ boost_task(p);
-+}
-+
-+static inline void update_rq_time_edge(struct rq *rq) {}
-diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
-index e0104b45029a..5eb28f1fdd74 100644
---- a/kernel/sched/build_policy.c
-+++ b/kernel/sched/build_policy.c
-@@ -40,13 +40,19 @@
-
- #include "idle.c"
-
-+#ifndef CONFIG_SCHED_ALT
- #include "rt.c"
-+#endif
-
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- # include "cpudeadline.c"
-+#endif
- # include "pelt.c"
- #endif
-
- #include "cputime.c"
--#include "deadline.c"
-
-+#ifndef CONFIG_SCHED_ALT
-+#include "deadline.c"
-+#endif
-diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
-index eec0849b2aae..880f4f819d77 100644
---- a/kernel/sched/build_utility.c
-+++ b/kernel/sched/build_utility.c
-@@ -84,7 +84,9 @@
-
- #ifdef CONFIG_SMP
- # include "cpupri.c"
-+#ifndef CONFIG_SCHED_ALT
- # include "stop_task.c"
-+#endif
- # include "topology.c"
- #endif
-
-diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
-index 3dbf351d12d5..b2590f961139 100644
---- a/kernel/sched/cpufreq_schedutil.c
-+++ b/kernel/sched/cpufreq_schedutil.c
-@@ -160,9 +160,14 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
- unsigned long max = arch_scale_cpu_capacity(sg_cpu->cpu);
-
- sg_cpu->max = max;
-+#ifndef CONFIG_SCHED_ALT
- sg_cpu->bw_dl = cpu_bw_dl(rq);
- sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu), max,
- FREQUENCY_UTIL, NULL);
-+#else
-+ sg_cpu->bw_dl = 0;
-+ sg_cpu->util = rq_load_util(rq, max);
-+#endif /* CONFIG_SCHED_ALT */
- }
-
- /**
-@@ -306,8 +311,10 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
- */
- static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
- {
-+#ifndef CONFIG_SCHED_ALT
- if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
- sg_cpu->sg_policy->limits_changed = true;
-+#endif
- }
-
- static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
-@@ -607,6 +614,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
- }
-
- ret = sched_setattr_nocheck(thread, &attr);
-+
- if (ret) {
- kthread_stop(thread);
- pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
-@@ -839,7 +847,9 @@ cpufreq_governor_init(schedutil_gov);
- #ifdef CONFIG_ENERGY_MODEL
- static void rebuild_sd_workfn(struct work_struct *work)
- {
-+#ifndef CONFIG_SCHED_ALT
- rebuild_sched_domains_energy();
-+#endif /* CONFIG_SCHED_ALT */
- }
- static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
-
-diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
-index 78a233d43757..b3bbc87d4352 100644
---- a/kernel/sched/cputime.c
-+++ b/kernel/sched/cputime.c
-@@ -122,7 +122,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
- p->utime += cputime;
- account_group_user_time(p, cputime);
-
-- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
-+ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
-
- /* Add user time to cpustat. */
- task_group_account_field(p, index, cputime);
-@@ -146,7 +146,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
- p->gtime += cputime;
-
- /* Add guest time to cpustat. */
-- if (task_nice(p) > 0) {
-+ if (task_running_nice(p)) {
- task_group_account_field(p, CPUTIME_NICE, cputime);
- cpustat[CPUTIME_GUEST_NICE] += cputime;
- } else {
-@@ -269,7 +269,7 @@ static inline u64 account_other_time(u64 max)
- #ifdef CONFIG_64BIT
- static inline u64 read_sum_exec_runtime(struct task_struct *t)
- {
-- return t->se.sum_exec_runtime;
-+ return tsk_seruntime(t);
- }
- #else
- static u64 read_sum_exec_runtime(struct task_struct *t)
-@@ -279,7 +279,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
- struct rq *rq;
-
- rq = task_rq_lock(t, &rf);
-- ns = t->se.sum_exec_runtime;
-+ ns = tsk_seruntime(t);
- task_rq_unlock(rq, t, &rf);
-
- return ns;
-@@ -611,7 +611,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
- void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
- {
- struct task_cputime cputime = {
-- .sum_exec_runtime = p->se.sum_exec_runtime,
-+ .sum_exec_runtime = tsk_seruntime(p),
- };
-
- if (task_cputime(p, &cputime.utime, &cputime.stime))
-diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
-index bb3d63bdf4ae..4e1680785704 100644
---- a/kernel/sched/debug.c
-+++ b/kernel/sched/debug.c
-@@ -7,6 +7,7 @@
- * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
- */
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * This allows printing both to /proc/sched_debug and
- * to the console
-@@ -215,6 +216,7 @@ static const struct file_operations sched_scaling_fops = {
- };
-
- #endif /* SMP */
-+#endif /* !CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_PREEMPT_DYNAMIC
-
-@@ -278,6 +280,7 @@ static const struct file_operations sched_dynamic_fops = {
-
- #endif /* CONFIG_PREEMPT_DYNAMIC */
-
-+#ifndef CONFIG_SCHED_ALT
- __read_mostly bool sched_debug_verbose;
-
- static const struct seq_operations sched_debug_sops;
-@@ -293,6 +296,7 @@ static const struct file_operations sched_debug_fops = {
- .llseek = seq_lseek,
- .release = seq_release,
- };
-+#endif /* !CONFIG_SCHED_ALT */
-
- static struct dentry *debugfs_sched;
-
-@@ -302,12 +306,15 @@ static __init int sched_init_debug(void)
-
- debugfs_sched = debugfs_create_dir("sched", NULL);
-
-+#ifndef CONFIG_SCHED_ALT
- debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
- debugfs_create_bool("verbose", 0644, debugfs_sched, &sched_debug_verbose);
-+#endif /* !CONFIG_SCHED_ALT */
- #ifdef CONFIG_PREEMPT_DYNAMIC
- debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
- #endif
-
-+#ifndef CONFIG_SCHED_ALT
- debugfs_create_u32("latency_ns", 0644, debugfs_sched, &sysctl_sched_latency);
- debugfs_create_u32("min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_min_granularity);
- debugfs_create_u32("idle_min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_idle_min_granularity);
-@@ -336,11 +343,13 @@ static __init int sched_init_debug(void)
- #endif
-
- debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
-+#endif /* !CONFIG_SCHED_ALT */
-
- return 0;
- }
- late_initcall(sched_init_debug);
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_SMP
-
- static cpumask_var_t sd_sysctl_cpus;
-@@ -1067,6 +1076,7 @@ void proc_sched_set_task(struct task_struct *p)
- memset(&p->stats, 0, sizeof(p->stats));
- #endif
- }
-+#endif /* !CONFIG_SCHED_ALT */
-
- void resched_latency_warn(int cpu, u64 latency)
- {
-diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
-index ecb0d7052877..000c0d87de78 100644
---- a/kernel/sched/idle.c
-+++ b/kernel/sched/idle.c
-@@ -400,6 +400,7 @@ void cpu_startup_entry(enum cpuhp_state state)
- do_idle();
- }
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * idle-task scheduling class.
- */
-@@ -521,3 +522,4 @@ DEFINE_SCHED_CLASS(idle) = {
- .switched_to = switched_to_idle,
- .update_curr = update_curr_idle,
- };
-+#endif
-diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
-new file mode 100644
-index 000000000000..56a649d02e49
---- /dev/null
-+++ b/kernel/sched/pds.h
-@@ -0,0 +1,127 @@
-+#define ALT_SCHED_VERSION_MSG "sched/pds: PDS CPU Scheduler "ALT_SCHED_VERSION" by Alfred Chen.\n"
-+
-+static int sched_timeslice_shift = 22;
-+
-+#define NORMAL_PRIO_MOD(x) ((x) & (NORMAL_PRIO_NUM - 1))
-+
-+/*
-+ * Common interfaces
-+ */
-+static inline void sched_timeslice_imp(const int timeslice_ms)
-+{
-+ if (2 == timeslice_ms)
-+ sched_timeslice_shift = 21;
-+}
-+
-+static inline int
-+task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
-+{
-+ s64 delta = p->deadline - rq->time_edge + NORMAL_PRIO_NUM - NICE_WIDTH;
-+
-+ if (WARN_ONCE(delta > NORMAL_PRIO_NUM - 1,
-+ "pds: task_sched_prio_normal() delta %lld\n", delta))
-+ return NORMAL_PRIO_NUM - 1;
-+
-+ return (delta < 0) ? 0 : delta;
-+}
-+
-+static inline int task_sched_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MAX_RT_PRIO) ? p->prio :
-+ MIN_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
-+}
-+
-+static inline int
-+task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
-+{
-+ return (p->prio < MAX_RT_PRIO) ? p->prio : MIN_NORMAL_PRIO +
-+ NORMAL_PRIO_MOD(task_sched_prio_normal(p, rq) + rq->time_edge);
-+}
-+
-+static inline int sched_prio2idx(int prio, struct rq *rq)
-+{
-+ return (IDLE_TASK_SCHED_PRIO == prio || prio < MAX_RT_PRIO) ? prio :
-+ MIN_NORMAL_PRIO + NORMAL_PRIO_MOD((prio - MIN_NORMAL_PRIO) +
-+ rq->time_edge);
-+}
-+
-+static inline int sched_idx2prio(int idx, struct rq *rq)
-+{
-+ return (idx < MAX_RT_PRIO) ? idx : MIN_NORMAL_PRIO +
-+ NORMAL_PRIO_MOD((idx - MIN_NORMAL_PRIO) + NORMAL_PRIO_NUM -
-+ NORMAL_PRIO_MOD(rq->time_edge));
-+}
-+
-+static inline void sched_renew_deadline(struct task_struct *p, const struct rq *rq)
-+{
-+ if (p->prio >= MAX_RT_PRIO)
-+ p->deadline = (rq->clock >> sched_timeslice_shift) +
-+ p->static_prio - (MAX_PRIO - NICE_WIDTH);
-+}
-+
-+int task_running_nice(struct task_struct *p)
-+{
-+ return (p->prio > DEFAULT_PRIO);
-+}
-+
-+static inline void update_rq_time_edge(struct rq *rq)
-+{
-+ struct list_head head;
-+ u64 old = rq->time_edge;
-+ u64 now = rq->clock >> sched_timeslice_shift;
-+ u64 prio, delta;
-+
-+ if (now == old)
-+ return;
-+
-+ delta = min_t(u64, NORMAL_PRIO_NUM, now - old);
-+ INIT_LIST_HEAD(&head);
-+
-+ for_each_set_bit(prio, &rq->queue.bitmap[2], delta)
-+ list_splice_tail_init(rq->queue.heads + MIN_NORMAL_PRIO +
-+ NORMAL_PRIO_MOD(prio + old), &head);
-+
-+ rq->queue.bitmap[2] = (NORMAL_PRIO_NUM == delta) ? 0UL :
-+ rq->queue.bitmap[2] >> delta;
-+ rq->time_edge = now;
-+ if (!list_empty(&head)) {
-+ u64 idx = MIN_NORMAL_PRIO + NORMAL_PRIO_MOD(now);
-+ struct task_struct *p;
-+
-+ list_for_each_entry(p, &head, sq_node)
-+ p->sq_idx = idx;
-+
-+ list_splice(&head, rq->queue.heads + idx);
-+ rq->queue.bitmap[2] |= 1UL;
-+ }
-+}
-+
-+static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
-+{
-+ p->time_slice = sched_timeslice_ns;
-+ sched_renew_deadline(p, rq);
-+ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
-+ requeue_task(p, rq, task_sched_prio_idx(p, rq));
-+}
-+
-+static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
-+{
-+ u64 max_dl = rq->time_edge + NICE_WIDTH - 1;
-+ if (unlikely(p->deadline > max_dl))
-+ p->deadline = max_dl;
-+}
-+
-+static void sched_task_fork(struct task_struct *p, struct rq *rq)
-+{
-+ sched_renew_deadline(p, rq);
-+}
-+
-+static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
-+{
-+ time_slice_expired(p, rq);
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline void sched_task_ttwu(struct task_struct *p) {}
-+#endif
-+static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
-diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
-index 0f310768260c..bd38bf738fe9 100644
---- a/kernel/sched/pelt.c
-+++ b/kernel/sched/pelt.c
-@@ -266,6 +266,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
- WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
- }
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * sched_entity:
- *
-@@ -383,8 +384,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
-
- return 0;
- }
-+#endif
-
--#ifdef CONFIG_SCHED_THERMAL_PRESSURE
-+#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
- /*
- * thermal:
- *
-diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
-index c336f5f481bc..5865f14714a9 100644
---- a/kernel/sched/pelt.h
-+++ b/kernel/sched/pelt.h
-@@ -1,13 +1,15 @@
- #ifdef CONFIG_SMP
- #include "sched-pelt.h"
-
-+#ifndef CONFIG_SCHED_ALT
- int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
- int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
- int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
- int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
- int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
-+#endif
-
--#ifdef CONFIG_SCHED_THERMAL_PRESSURE
-+#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
- int update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity);
-
- static inline u64 thermal_load_avg(struct rq *rq)
-@@ -44,6 +46,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
- return PELT_MIN_DIVIDER + avg->period_contrib;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- static inline void cfs_se_util_change(struct sched_avg *avg)
- {
- unsigned int enqueued;
-@@ -155,9 +158,11 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
- return rq_clock_pelt(rq_of(cfs_rq));
- }
- #endif
-+#endif /* CONFIG_SCHED_ALT */
-
- #else
-
-+#ifndef CONFIG_SCHED_ALT
- static inline int
- update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
- {
-@@ -175,6 +180,7 @@ update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
- {
- return 0;
- }
-+#endif
-
- static inline int
- update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity)
-diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
-index 8dccb34eb190..bb3598e0ba5d 100644
---- a/kernel/sched/sched.h
-+++ b/kernel/sched/sched.h
-@@ -5,6 +5,10 @@
- #ifndef _KERNEL_SCHED_SCHED_H
- #define _KERNEL_SCHED_SCHED_H
-
-+#ifdef CONFIG_SCHED_ALT
-+#include "alt_sched.h"
-+#else
-+
- #include <linux/sched/affinity.h>
- #include <linux/sched/autogroup.h>
- #include <linux/sched/cpufreq.h>
-@@ -3087,4 +3091,9 @@ extern int sched_dynamic_mode(const char *str);
- extern void sched_dynamic_update(int mode);
- #endif
-
-+static inline int task_running_nice(struct task_struct *p)
-+{
-+ return (task_nice(p) > 0);
-+}
-+#endif /* !CONFIG_SCHED_ALT */
- #endif /* _KERNEL_SCHED_SCHED_H */
-diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
-index 857f837f52cb..5486c63e4790 100644
---- a/kernel/sched/stats.c
-+++ b/kernel/sched/stats.c
-@@ -125,8 +125,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
- } else {
- struct rq *rq;
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- struct sched_domain *sd;
- int dcount = 0;
-+#endif
- #endif
- cpu = (unsigned long)(v - 2);
- rq = cpu_rq(cpu);
-@@ -143,6 +145,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
- seq_printf(seq, "\n");
-
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- /* domain-specific stats */
- rcu_read_lock();
- for_each_domain(cpu, sd) {
-@@ -171,6 +174,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
- sd->ttwu_move_balance);
- }
- rcu_read_unlock();
-+#endif
- #endif
- }
- return 0;
-diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
-index baa839c1ba96..15238be0581b 100644
---- a/kernel/sched/stats.h
-+++ b/kernel/sched/stats.h
-@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
-
- #endif /* CONFIG_SCHEDSTATS */
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_FAIR_GROUP_SCHED
- struct sched_entity_stats {
- struct sched_entity se;
-@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
- #endif
- return &task_of(se)->stats;
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_PSI
- /*
-diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
-index 810750e62118..f2cdbb696dba 100644
---- a/kernel/sched/topology.c
-+++ b/kernel/sched/topology.c
-@@ -3,6 +3,7 @@
- * Scheduler topology setup/handling methods
- */
-
-+#ifndef CONFIG_SCHED_ALT
- DEFINE_MUTEX(sched_domains_mutex);
-
- /* Protected by sched_domains_mutex: */
-@@ -1392,8 +1393,10 @@ static void asym_cpu_capacity_scan(void)
- */
-
- static int default_relax_domain_level = -1;
-+#endif /* CONFIG_SCHED_ALT */
- int sched_domain_level_max;
-
-+#ifndef CONFIG_SCHED_ALT
- static int __init setup_relax_domain_level(char *str)
- {
- if (kstrtoint(str, 0, &default_relax_domain_level))
-@@ -1626,6 +1629,7 @@ sd_init(struct sched_domain_topology_level *tl,
-
- return sd;
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- /*
- * Topology list, bottom-up.
-@@ -1662,6 +1666,7 @@ void set_sched_topology(struct sched_domain_topology_level *tl)
- sched_domain_topology_saved = NULL;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_NUMA
-
- static const struct cpumask *sd_numa_mask(int cpu)
-@@ -2617,3 +2622,15 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
- partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
- mutex_unlock(&sched_domains_mutex);
- }
-+#else /* CONFIG_SCHED_ALT */
-+void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
-+ struct sched_domain_attr *dattr_new)
-+{}
-+
-+#ifdef CONFIG_NUMA
-+int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
-+{
-+ return best_mask_cpu(cpu, cpus);
-+}
-+#endif /* CONFIG_NUMA */
-+#endif
-diff --git a/kernel/sysctl.c b/kernel/sysctl.c
-index 830aaf8ca08e..7ad676d5ae3b 100644
---- a/kernel/sysctl.c
-+++ b/kernel/sysctl.c
-@@ -96,6 +96,10 @@
-
- /* Constants used for minimum and maximum */
-
-+#ifdef CONFIG_SCHED_ALT
-+extern int sched_yield_type;
-+#endif
-+
- #ifdef CONFIG_PERF_EVENTS
- static const int six_hundred_forty_kb = 640 * 1024;
- #endif
-@@ -1659,6 +1663,24 @@ int proc_do_static_key(struct ctl_table *table, int write,
- }
-
- static struct ctl_table kern_table[] = {
-+#ifdef CONFIG_SCHED_ALT
-+/* In ALT, only supported "sched_schedstats" */
-+#ifdef CONFIG_SCHED_DEBUG
-+#ifdef CONFIG_SMP
-+#ifdef CONFIG_SCHEDSTATS
-+ {
-+ .procname = "sched_schedstats",
-+ .data = NULL,
-+ .maxlen = sizeof(unsigned int),
-+ .mode = 0644,
-+ .proc_handler = sysctl_schedstats,
-+ .extra1 = SYSCTL_ZERO,
-+ .extra2 = SYSCTL_ONE,
-+ },
-+#endif /* CONFIG_SCHEDSTATS */
-+#endif /* CONFIG_SMP */
-+#endif /* CONFIG_SCHED_DEBUG */
-+#else /* !CONFIG_SCHED_ALT */
- {
- .procname = "sched_child_runs_first",
- .data = &sysctl_sched_child_runs_first,
-@@ -1778,6 +1800,7 @@ static struct ctl_table kern_table[] = {
- .extra2 = SYSCTL_ONE,
- },
- #endif
-+#endif /* !CONFIG_SCHED_ALT */
- #ifdef CONFIG_PROVE_LOCKING
- {
- .procname = "prove_locking",
-@@ -2163,6 +2186,17 @@ static struct ctl_table kern_table[] = {
- .proc_handler = proc_dointvec,
- },
- #endif
-+#ifdef CONFIG_SCHED_ALT
-+ {
-+ .procname = "yield_type",
-+ .data = &sched_yield_type,
-+ .maxlen = sizeof (int),
-+ .mode = 0644,
-+ .proc_handler = &proc_dointvec_minmax,
-+ .extra1 = SYSCTL_ZERO,
-+ .extra2 = SYSCTL_TWO,
-+ },
-+#endif
- #if defined(CONFIG_S390) && defined(CONFIG_SMP)
- {
- .procname = "spin_retry",
-diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
-index 0ea8702eb516..a27a0f3a654d 100644
---- a/kernel/time/hrtimer.c
-+++ b/kernel/time/hrtimer.c
-@@ -2088,8 +2088,10 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum hrtimer_mode mode,
- int ret = 0;
- u64 slack;
-
-+#ifndef CONFIG_SCHED_ALT
- slack = current->timer_slack_ns;
- if (dl_task(current) || rt_task(current))
-+#endif
- slack = 0;
-
- hrtimer_init_sleeper_on_stack(&t, clockid, mode);
-diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
-index 0a97193984db..e32235cdc3b1 100644
---- a/kernel/time/posix-cpu-timers.c
-+++ b/kernel/time/posix-cpu-timers.c
-@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
- u64 stime, utime;
-
- task_cputime(p, &utime, &stime);
-- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
-+ store_samples(samples, stime, utime, tsk_seruntime(p));
- }
-
- static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
-@@ -866,6 +866,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
- }
- }
-
-+#ifndef CONFIG_SCHED_ALT
- static inline void check_dl_overrun(struct task_struct *tsk)
- {
- if (tsk->dl.dl_overrun) {
-@@ -873,6 +874,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
- __group_send_sig_info(SIGXCPU, SEND_SIG_PRIV, tsk);
- }
- }
-+#endif
-
- static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
- {
-@@ -900,8 +902,10 @@ static void check_thread_timers(struct task_struct *tsk,
- u64 samples[CPUCLOCK_MAX];
- unsigned long soft;
-
-+#ifndef CONFIG_SCHED_ALT
- if (dl_task(tsk))
- check_dl_overrun(tsk);
-+#endif
-
- if (expiry_cache_is_inactive(pct))
- return;
-@@ -915,7 +919,7 @@ static void check_thread_timers(struct task_struct *tsk,
- soft = task_rlimit(tsk, RLIMIT_RTTIME);
- if (soft != RLIM_INFINITY) {
- /* Task RT timeout is accounted in jiffies. RTTIME is usec */
-- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
-+ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
- unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
-
- /* At the hard limit, send SIGKILL. No further action. */
-@@ -1151,8 +1155,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
- return true;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- if (dl_task(tsk) && tsk->dl.dl_overrun)
- return true;
-+#endif
-
- return false;
- }
-diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
-index abcadbe933bb..d4c778b0ab0e 100644
---- a/kernel/trace/trace_selftest.c
-+++ b/kernel/trace/trace_selftest.c
-@@ -1140,10 +1140,15 @@ static int trace_wakeup_test_thread(void *data)
- {
- /* Make this a -deadline thread */
- static const struct sched_attr attr = {
-+#ifdef CONFIG_SCHED_ALT
-+ /* No deadline on BMQ/PDS, use RR */
-+ .sched_policy = SCHED_RR,
-+#else
- .sched_policy = SCHED_DEADLINE,
- .sched_runtime = 100000ULL,
- .sched_deadline = 10000000ULL,
- .sched_period = 10000000ULL
-+#endif
- };
- struct wakeup_test_data *x = data;
-
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
deleted file mode 100644
index 6b2049da..00000000
--- a/5021_BMQ-and-PDS-gentoo-defaults.patch
+++ /dev/null
@@ -1,13 +0,0 @@
---- a/init/Kconfig 2022-07-07 13:22:00.698439887 -0400
-+++ b/init/Kconfig 2022-07-07 13:23:45.152333576 -0400
-@@ -874,8 +874,9 @@ config UCLAMP_BUCKETS_COUNT
- If in doubt, use the default value.
-
- menuconfig SCHED_ALT
-+ depends on X86_64
- bool "Alternative CPU Schedulers"
-- default y
-+ default n
- help
- This feature enable alternative CPU scheduler"
-
^ permalink raw reply related [flat|nested] 31+ messages in thread
end of thread, other threads:[~2022-08-21 19:59 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-05-11 17:39 [gentoo-commits] proj/linux-patches:5.18 commit in: / Mike Pagano
-- strict thread matches above, loose matches on Subject: below --
2022-08-21 19:59 Mike Pagano
2022-08-21 16:54 Mike Pagano
2022-08-17 14:31 Mike Pagano
2022-08-11 12:32 Mike Pagano
2022-08-03 14:22 Alice Ferrazzi
2022-07-29 16:39 Mike Pagano
2022-07-23 11:45 Alice Ferrazzi
2022-07-22 11:11 Mike Pagano
2022-07-22 11:06 Mike Pagano
2022-07-15 10:10 Alice Ferrazzi
2022-07-15 8:59 Alice Ferrazzi
2022-07-12 15:58 Mike Pagano
2022-07-07 17:29 Mike Pagano
2022-07-07 16:26 Mike Pagano
2022-07-02 16:12 Mike Pagano
2022-06-29 11:07 Mike Pagano
2022-06-26 21:52 Mike Pagano
2022-06-25 19:42 Mike Pagano
2022-06-22 12:53 Mike Pagano
2022-06-16 12:07 Mike Pagano
2022-06-14 17:29 Mike Pagano
2022-06-09 18:40 Mike Pagano
2022-06-09 11:25 Mike Pagano
2022-06-06 11:01 Mike Pagano
2022-05-30 21:42 Mike Pagano
2022-05-30 13:57 Mike Pagano
2022-05-24 21:02 Mike Pagano
2022-05-24 18:14 Mike Pagano
2022-04-25 16:23 Mike Pagano
2022-04-25 16:15 Mike Pagano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox