From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by finch.gentoo.org (Postfix) with ESMTPS id A098E138334 for ; Thu, 16 May 2019 23:01:26 +0000 (UTC) Received: from pigeon.gentoo.org (localhost [127.0.0.1]) by pigeon.gentoo.org (Postfix) with SMTP id 7768AE085A; Thu, 16 May 2019 23:01:25 +0000 (UTC) Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by pigeon.gentoo.org (Postfix) with ESMTPS id 21FDDE085A for ; Thu, 16 May 2019 23:01:25 +0000 (UTC) Received: from oystercatcher.gentoo.org (unknown [IPv6:2a01:4f8:202:4333:225:90ff:fed9:fc84]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.gentoo.org (Postfix) with ESMTPS id DFF0E3447C1 for ; Thu, 16 May 2019 23:01:22 +0000 (UTC) Received: from localhost.localdomain (localhost [IPv6:::1]) by oystercatcher.gentoo.org (Postfix) with ESMTP id 5D6AC601 for ; Thu, 16 May 2019 23:01:21 +0000 (UTC) From: "Mike Pagano" To: gentoo-commits@lists.gentoo.org Content-Transfer-Encoding: 8bit Content-type: text/plain; charset=UTF-8 Reply-To: gentoo-dev@lists.gentoo.org, "Mike Pagano" Message-ID: <1558047663.3c35585cbd7d4d38ec6d96a638e0ee62d35aa104.mpagano@gentoo> Subject: [gentoo-commits] proj/linux-patches:4.4 commit in: / X-VCS-Repository: proj/linux-patches X-VCS-Files: 0000_README 1179_linux-4.4.180.patch X-VCS-Directories: / X-VCS-Committer: mpagano X-VCS-Committer-Name: Mike Pagano X-VCS-Revision: 3c35585cbd7d4d38ec6d96a638e0ee62d35aa104 X-VCS-Branch: 4.4 Date: Thu, 16 May 2019 23:01:21 +0000 (UTC) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-commits@lists.gentoo.org X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply X-Archives-Salt: b0f5ee34-9681-460a-beef-3d00c12b218d X-Archives-Hash: 87fe5fa5bde0e5b754435827d6e1165b commit: 3c35585cbd7d4d38ec6d96a638e0ee62d35aa104 Author: Mike Pagano gentoo org> AuthorDate: Thu May 16 23:01:03 2019 +0000 Commit: Mike Pagano gentoo org> CommitDate: Thu May 16 23:01:03 2019 +0000 URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=3c35585c Linux patch 4.4.180 Signed-off-by: Mike Pagano gentoo.org> 0000_README | 4 + 1179_linux-4.4.180.patch | 10052 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 10056 insertions(+) diff --git a/0000_README b/0000_README index f6d929e..8557058 100644 --- a/0000_README +++ b/0000_README @@ -759,6 +759,10 @@ Patch: 1178_linux-4.4.179.patch From: http://www.kernel.org Desc: Linux 4.4.179 +Patch: 1179_linux-4.4.180.patch +From: http://www.kernel.org +Desc: Linux 4.4.180 + Patch: 1500_XATTR_USER_PREFIX.patch From: https://bugs.gentoo.org/show_bug.cgi?id=470644 Desc: Support for namespace user.pax.* on tmpfs. 
diff --git a/1179_linux-4.4.180.patch b/1179_linux-4.4.180.patch new file mode 100644 index 0000000..0d785d9 --- /dev/null +++ b/1179_linux-4.4.180.patch @@ -0,0 +1,10052 @@ +diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu +index 50f95689ab38..e4cd3be77663 100644 +--- a/Documentation/ABI/testing/sysfs-devices-system-cpu ++++ b/Documentation/ABI/testing/sysfs-devices-system-cpu +@@ -277,6 +277,8 @@ What: /sys/devices/system/cpu/vulnerabilities + /sys/devices/system/cpu/vulnerabilities/spectre_v1 + /sys/devices/system/cpu/vulnerabilities/spectre_v2 + /sys/devices/system/cpu/vulnerabilities/spec_store_bypass ++ /sys/devices/system/cpu/vulnerabilities/l1tf ++ /sys/devices/system/cpu/vulnerabilities/mds + Date: January 2018 + Contact: Linux kernel mailing list + Description: Information about CPU vulnerabilities +diff --git a/Documentation/hw-vuln/mds.rst b/Documentation/hw-vuln/mds.rst +new file mode 100644 +index 000000000000..3f92728be021 +--- /dev/null ++++ b/Documentation/hw-vuln/mds.rst +@@ -0,0 +1,305 @@ ++MDS - Microarchitectural Data Sampling ++====================================== ++ ++Microarchitectural Data Sampling is a hardware vulnerability which allows ++unprivileged speculative access to data which is available in various CPU ++internal buffers. ++ ++Affected processors ++------------------- ++ ++This vulnerability affects a wide range of Intel processors. The ++vulnerability is not present on: ++ ++ - Processors from AMD, Centaur and other non Intel vendors ++ ++ - Older processor models, where the CPU family is < 6 ++ ++ - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus) ++ ++ - Intel processors which have the ARCH_CAP_MDS_NO bit set in the ++ IA32_ARCH_CAPABILITIES MSR. ++ ++Whether a processor is affected or not can be read out from the MDS ++vulnerability file in sysfs. See :ref:`mds_sys_info`. ++ ++Not all processors are affected by all variants of MDS, but the mitigation ++is identical for all of them so the kernel treats them as a single ++vulnerability. ++ ++Related CVEs ++------------ ++ ++The following CVE entries are related to the MDS vulnerability: ++ ++ ============== ===== =================================================== ++ CVE-2018-12126 MSBDS Microarchitectural Store Buffer Data Sampling ++ CVE-2018-12130 MFBDS Microarchitectural Fill Buffer Data Sampling ++ CVE-2018-12127 MLPDS Microarchitectural Load Port Data Sampling ++ CVE-2019-11091 MDSUM Microarchitectural Data Sampling Uncacheable Memory ++ ============== ===== =================================================== ++ ++Problem ++------- ++ ++When performing store, load, L1 refill operations, processors write data ++into temporary microarchitectural structures (buffers). The data in the ++buffer can be forwarded to load operations as an optimization. ++ ++Under certain conditions, usually a fault/assist caused by a load ++operation, data unrelated to the load memory address can be speculatively ++forwarded from the buffers. Because the load operation causes a fault or ++assist and its result will be discarded, the forwarded data will not cause ++incorrect program execution or state changes. But a malicious operation ++may be able to forward this speculative data to a disclosure gadget which ++allows in turn to infer the value via a cache side channel attack. ++ ++Because the buffers are potentially shared between Hyper-Threads cross ++Hyper-Thread attacks are possible. 
++ ++Deeper technical information is available in the MDS specific x86 ++architecture section: :ref:`Documentation/x86/mds.rst `. ++ ++ ++Attack scenarios ++---------------- ++ ++Attacks against the MDS vulnerabilities can be mounted from malicious non ++priviledged user space applications running on hosts or guest. Malicious ++guest OSes can obviously mount attacks as well. ++ ++Contrary to other speculation based vulnerabilities the MDS vulnerability ++does not allow the attacker to control the memory target address. As a ++consequence the attacks are purely sampling based, but as demonstrated with ++the TLBleed attack samples can be postprocessed successfully. ++ ++Web-Browsers ++^^^^^^^^^^^^ ++ ++ It's unclear whether attacks through Web-Browsers are possible at ++ all. The exploitation through Java-Script is considered very unlikely, ++ but other widely used web technologies like Webassembly could possibly be ++ abused. ++ ++ ++.. _mds_sys_info: ++ ++MDS system information ++----------------------- ++ ++The Linux kernel provides a sysfs interface to enumerate the current MDS ++status of the system: whether the system is vulnerable, and which ++mitigations are active. The relevant sysfs file is: ++ ++/sys/devices/system/cpu/vulnerabilities/mds ++ ++The possible values in this file are: ++ ++ .. list-table:: ++ ++ * - 'Not affected' ++ - The processor is not vulnerable ++ * - 'Vulnerable' ++ - The processor is vulnerable, but no mitigation enabled ++ * - 'Vulnerable: Clear CPU buffers attempted, no microcode' ++ - The processor is vulnerable but microcode is not updated. ++ ++ The mitigation is enabled on a best effort basis. See :ref:`vmwerv` ++ * - 'Mitigation: Clear CPU buffers' ++ - The processor is vulnerable and the CPU buffer clearing mitigation is ++ enabled. ++ ++If the processor is vulnerable then the following information is appended ++to the above information: ++ ++ ======================== ============================================ ++ 'SMT vulnerable' SMT is enabled ++ 'SMT mitigated' SMT is enabled and mitigated ++ 'SMT disabled' SMT is disabled ++ 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown ++ ======================== ============================================ ++ ++.. _vmwerv: ++ ++Best effort mitigation mode ++^^^^^^^^^^^^^^^^^^^^^^^^^^^ ++ ++ If the processor is vulnerable, but the availability of the microcode based ++ mitigation mechanism is not advertised via CPUID the kernel selects a best ++ effort mitigation mode. This mode invokes the mitigation instructions ++ without a guarantee that they clear the CPU buffers. ++ ++ This is done to address virtualization scenarios where the host has the ++ microcode update applied, but the hypervisor is not yet updated to expose ++ the CPUID to the guest. If the host has updated microcode the protection ++ takes effect otherwise a few cpu cycles are wasted pointlessly. ++ ++ The state in the mds sysfs file reflects this situation accordingly. ++ ++ ++Mitigation mechanism ++------------------------- ++ ++The kernel detects the affected CPUs and the presence of the microcode ++which is required. ++ ++If a CPU is affected and the microcode is available, then the kernel ++enables the mitigation by default. The mitigation can be controlled at boot ++time via a kernel command line option. See ++:ref:`mds_mitigation_control_command_line`. ++ ++.. 
_cpu_buffer_clear: ++ ++CPU buffer clearing ++^^^^^^^^^^^^^^^^^^^ ++ ++ The mitigation for MDS clears the affected CPU buffers on return to user ++ space and when entering a guest. ++ ++ If SMT is enabled it also clears the buffers on idle entry when the CPU ++ is only affected by MSBDS and not any other MDS variant, because the ++ other variants cannot be protected against cross Hyper-Thread attacks. ++ ++ For CPUs which are only affected by MSBDS the user space, guest and idle ++ transition mitigations are sufficient and SMT is not affected. ++ ++.. _virt_mechanism: ++ ++Virtualization mitigation ++^^^^^^^^^^^^^^^^^^^^^^^^^ ++ ++ The protection for host to guest transition depends on the L1TF ++ vulnerability of the CPU: ++ ++ - CPU is affected by L1TF: ++ ++ If the L1D flush mitigation is enabled and up to date microcode is ++ available, the L1D flush mitigation is automatically protecting the ++ guest transition. ++ ++ If the L1D flush mitigation is disabled then the MDS mitigation is ++ invoked explicit when the host MDS mitigation is enabled. ++ ++ For details on L1TF and virtualization see: ++ :ref:`Documentation/hw-vuln//l1tf.rst `. ++ ++ - CPU is not affected by L1TF: ++ ++ CPU buffers are flushed before entering the guest when the host MDS ++ mitigation is enabled. ++ ++ The resulting MDS protection matrix for the host to guest transition: ++ ++ ============ ===== ============= ============ ================= ++ L1TF MDS VMX-L1FLUSH Host MDS MDS-State ++ ++ Don't care No Don't care N/A Not affected ++ ++ Yes Yes Disabled Off Vulnerable ++ ++ Yes Yes Disabled Full Mitigated ++ ++ Yes Yes Enabled Don't care Mitigated ++ ++ No Yes N/A Off Vulnerable ++ ++ No Yes N/A Full Mitigated ++ ============ ===== ============= ============ ================= ++ ++ This only covers the host to guest transition, i.e. prevents leakage from ++ host to guest, but does not protect the guest internally. Guests need to ++ have their own protections. ++ ++.. _xeon_phi: ++ ++XEON PHI specific considerations ++^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ++ ++ The XEON PHI processor family is affected by MSBDS which can be exploited ++ cross Hyper-Threads when entering idle states. Some XEON PHI variants allow ++ to use MWAIT in user space (Ring 3) which opens an potential attack vector ++ for malicious user space. The exposure can be disabled on the kernel ++ command line with the 'ring3mwait=disable' command line option. ++ ++ XEON PHI is not affected by the other MDS variants and MSBDS is mitigated ++ before the CPU enters a idle state. As XEON PHI is not affected by L1TF ++ either disabling SMT is not required for full protection. ++ ++.. _mds_smt_control: ++ ++SMT control ++^^^^^^^^^^^ ++ ++ All MDS variants except MSBDS can be attacked cross Hyper-Threads. That ++ means on CPUs which are affected by MFBDS or MLPDS it is necessary to ++ disable SMT for full protection. These are most of the affected CPUs; the ++ exception is XEON PHI, see :ref:`xeon_phi`. ++ ++ Disabling SMT can have a significant performance impact, but the impact ++ depends on the type of workloads. ++ ++ See the relevant chapter in the L1TF mitigation documentation for details: ++ :ref:`Documentation/hw-vuln/l1tf.rst `. ++ ++ ++.. _mds_mitigation_control_command_line: ++ ++Mitigation control on the kernel command line ++--------------------------------------------- ++ ++The kernel command line allows to control the MDS mitigations at boot ++time with the option "mds=". 
The valid arguments for this option are: ++ ++ ============ ============================================================= ++ full If the CPU is vulnerable, enable all available mitigations ++ for the MDS vulnerability, CPU buffer clearing on exit to ++ userspace and when entering a VM. Idle transitions are ++ protected as well if SMT is enabled. ++ ++ It does not automatically disable SMT. ++ ++ off Disables MDS mitigations completely. ++ ++ ============ ============================================================= ++ ++Not specifying this option is equivalent to "mds=full". ++ ++ ++Mitigation selection guide ++-------------------------- ++ ++1. Trusted userspace ++^^^^^^^^^^^^^^^^^^^^ ++ ++ If all userspace applications are from a trusted source and do not ++ execute untrusted code which is supplied externally, then the mitigation ++ can be disabled. ++ ++ ++2. Virtualization with trusted guests ++^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ++ ++ The same considerations as above versus trusted user space apply. ++ ++3. Virtualization with untrusted guests ++^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ++ ++ The protection depends on the state of the L1TF mitigations. ++ See :ref:`virt_mechanism`. ++ ++ If the MDS mitigation is enabled and SMT is disabled, guest to host and ++ guest to guest attacks are prevented. ++ ++.. _mds_default_mitigations: ++ ++Default mitigations ++------------------- ++ ++ The kernel default mitigations for vulnerable processors are: ++ ++ - Enable CPU buffer clearing ++ ++ The kernel does not by default enforce the disabling of SMT, which leaves ++ SMT systems vulnerable when running untrusted code. The same rationale as ++ for L1TF applies. ++ See :ref:`Documentation/hw-vuln//l1tf.rst `. +diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt +index da515c535e62..175d57049168 100644 +--- a/Documentation/kernel-parameters.txt ++++ b/Documentation/kernel-parameters.txt +@@ -2035,6 +2035,30 @@ bytes respectively. Such letter suffixes can also be entirely omitted. + Format: , + Specifies range of consoles to be captured by the MDA. + ++ mds= [X86,INTEL] ++ Control mitigation for the Micro-architectural Data ++ Sampling (MDS) vulnerability. ++ ++ Certain CPUs are vulnerable to an exploit against CPU ++ internal buffers which can forward information to a ++ disclosure gadget under certain conditions. ++ ++ In vulnerable processors, the speculatively ++ forwarded data can be used in a cache side channel ++ attack, to access data to which the attacker does ++ not have direct access. ++ ++ This parameter controls the MDS mitigation. The ++ options are: ++ ++ full - Enable MDS mitigation on vulnerable CPUs ++ off - Unconditionally disable MDS mitigation ++ ++ Not specifying this option is equivalent to ++ mds=full. ++ ++ For details see: Documentation/hw-vuln/mds.rst ++ + mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory + Amount of memory to be used when the kernel is not able + to see the whole system memory or for test. +@@ -2149,6 +2173,30 @@ bytes respectively. Such letter suffixes can also be entirely omitted. + in the "bleeding edge" mini2440 support kernel at + http://repo.or.cz/w/linux-2.6/mini2440.git + ++ mitigations= ++ [X86] Control optional mitigations for CPU ++ vulnerabilities. This is a set of curated, ++ arch-independent options, each of which is an ++ aggregation of existing arch-specific options. ++ ++ off ++ Disable all optional CPU mitigations. 
This ++ improves system performance, but it may also ++ expose users to several CPU vulnerabilities. ++ Equivalent to: nopti [X86] ++ nospectre_v2 [X86] ++ spectre_v2_user=off [X86] ++ spec_store_bypass_disable=off [X86] ++ mds=off [X86] ++ ++ auto (default) ++ Mitigate all CPU vulnerabilities, but leave SMT ++ enabled, even if it's vulnerable. This is for ++ users who don't want to be surprised by SMT ++ getting disabled across kernel upgrades, or who ++ have other ways of avoiding SMT-based attacks. ++ Equivalent to: (default behavior) ++ + mminit_loglevel= + [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this + parameter allows control of the logging verbosity for +@@ -2450,7 +2498,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted. + + nohugeiomap [KNL,x86] Disable kernel huge I/O mappings. + +- nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2 ++ nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds ++ check bypass). With this option data leaks are possible ++ in the system. ++ ++ nospectre_v2 [X86,PPC_FSL_BOOK3E] Disable all mitigations for the Spectre variant 2 + (indirect branch prediction) vulnerability. System may + allow data leaks with this option, which is equivalent + to spectre_v2=off. +@@ -3600,9 +3652,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted. + + spectre_v2= [X86] Control mitigation of Spectre variant 2 + (indirect branch speculation) vulnerability. ++ The default operation protects the kernel from ++ user space attacks. + +- on - unconditionally enable +- off - unconditionally disable ++ on - unconditionally enable, implies ++ spectre_v2_user=on ++ off - unconditionally disable, implies ++ spectre_v2_user=off + auto - kernel detects whether your CPU model is + vulnerable + +@@ -3612,6 +3668,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. + CONFIG_RETPOLINE configuration option, and the + compiler with which the kernel was built. + ++ Selecting 'on' will also enable the mitigation ++ against user space to user space task attacks. ++ ++ Selecting 'off' will disable both the kernel and ++ the user space protections. ++ + Specific mitigations can also be selected manually: + + retpoline - replace indirect branches +@@ -3621,6 +3683,48 @@ bytes respectively. Such letter suffixes can also be entirely omitted. + Not specifying this option is equivalent to + spectre_v2=auto. + ++ spectre_v2_user= ++ [X86] Control mitigation of Spectre variant 2 ++ (indirect branch speculation) vulnerability between ++ user space tasks ++ ++ on - Unconditionally enable mitigations. Is ++ enforced by spectre_v2=on ++ ++ off - Unconditionally disable mitigations. Is ++ enforced by spectre_v2=off ++ ++ prctl - Indirect branch speculation is enabled, ++ but mitigation can be enabled via prctl ++ per thread. The mitigation control state ++ is inherited on fork. ++ ++ prctl,ibpb ++ - Like "prctl" above, but only STIBP is ++ controlled per thread. IBPB is issued ++ always when switching between different user ++ space processes. ++ ++ seccomp ++ - Same as "prctl" above, but all seccomp ++ threads will enable the mitigation unless ++ they explicitly opt out. ++ ++ seccomp,ibpb ++ - Like "seccomp" above, but only STIBP is ++ controlled per thread. IBPB is issued ++ always when switching between different ++ user space processes. ++ ++ auto - Kernel selects the mitigation depending on ++ the available CPU features and vulnerability. 
++ ++ Default mitigation: ++ If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl" ++ ++ Not specifying this option is equivalent to ++ spectre_v2_user=auto. ++ + spec_store_bypass_disable= + [HW] Control Speculative Store Bypass (SSB) Disable mitigation + (Speculative Store Bypass vulnerability) +diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt +index 2fb35658d151..709d24b4b533 100644 +--- a/Documentation/networking/ip-sysctl.txt ++++ b/Documentation/networking/ip-sysctl.txt +@@ -387,6 +387,7 @@ tcp_min_rtt_wlen - INTEGER + minimum RTT when it is moved to a longer path (e.g., due to traffic + engineering). A longer window makes the filter more resistant to RTT + inflations such as transient congestion. The unit is seconds. ++ Possible values: 0 - 86400 (1 day) + Default: 300 + + tcp_moderate_rcvbuf - BOOLEAN +diff --git a/Documentation/spec_ctrl.txt b/Documentation/spec_ctrl.txt +index 32f3d55c54b7..c4dbe6f7cdae 100644 +--- a/Documentation/spec_ctrl.txt ++++ b/Documentation/spec_ctrl.txt +@@ -92,3 +92,12 @@ Speculation misfeature controls + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE, 0, 0); + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE, 0, 0); + * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_FORCE_DISABLE, 0, 0); ++ ++- PR_SPEC_INDIR_BRANCH: Indirect Branch Speculation in User Processes ++ (Mitigate Spectre V2 style attacks against user processes) ++ ++ Invocations: ++ * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0); ++ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0); ++ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0); ++ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0); +diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt +index 0a94ffe17ab6..b13e031beaa6 100644 +--- a/Documentation/usb/power-management.txt ++++ b/Documentation/usb/power-management.txt +@@ -365,11 +365,15 @@ autosuspend the interface's device. When the usage counter is = 0 + then the interface is considered to be idle, and the kernel may + autosuspend the device. + +-Drivers need not be concerned about balancing changes to the usage +-counter; the USB core will undo any remaining "get"s when a driver +-is unbound from its interface. As a corollary, drivers must not call +-any of the usb_autopm_* functions after their disconnect() routine has +-returned. ++Drivers must be careful to balance their overall changes to the usage ++counter. Unbalanced "get"s will remain in effect when a driver is ++unbound from its interface, preventing the device from going into ++runtime suspend should the interface be bound to a driver again. On ++the other hand, drivers are allowed to achieve this balance by calling ++the ``usb_autopm_*`` functions even after their ``disconnect`` routine ++has returned -- say from within a work-queue routine -- provided they ++retain an active reference to the interface (via ``usb_get_intf`` and ++``usb_put_intf``). + + Drivers using the async routines are responsible for their own + synchronization and mutual exclusion. +diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst +new file mode 100644 +index 000000000000..534e9baa4e1d +--- /dev/null ++++ b/Documentation/x86/mds.rst +@@ -0,0 +1,225 @@ ++Microarchitectural Data Sampling (MDS) mitigation ++================================================= ++ ++.. 
_mds: ++ ++Overview ++-------- ++ ++Microarchitectural Data Sampling (MDS) is a family of side channel attacks ++on internal buffers in Intel CPUs. The variants are: ++ ++ - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126) ++ - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130) ++ - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127) ++ - Microarchitectural Data Sampling Uncacheable Memory (MDSUM) (CVE-2019-11091) ++ ++MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a ++dependent load (store-to-load forwarding) as an optimization. The forward ++can also happen to a faulting or assisting load operation for a different ++memory address, which can be exploited under certain conditions. Store ++buffers are partitioned between Hyper-Threads so cross thread forwarding is ++not possible. But if a thread enters or exits a sleep state the store ++buffer is repartitioned which can expose data from one thread to the other. ++ ++MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage ++L1 miss situations and to hold data which is returned or sent in response ++to a memory or I/O operation. Fill buffers can forward data to a load ++operation and also write data to the cache. When the fill buffer is ++deallocated it can retain the stale data of the preceding operations which ++can then be forwarded to a faulting or assisting load operation, which can ++be exploited under certain conditions. Fill buffers are shared between ++Hyper-Threads so cross thread leakage is possible. ++ ++MLPDS leaks Load Port Data. Load ports are used to perform load operations ++from memory or I/O. The received data is then forwarded to the register ++file or a subsequent operation. In some implementations the Load Port can ++contain stale data from a previous operation which can be forwarded to ++faulting or assisting loads under certain conditions, which again can be ++exploited eventually. Load ports are shared between Hyper-Threads so cross ++thread leakage is possible. ++ ++MDSUM is a special case of MSBDS, MFBDS and MLPDS. An uncacheable load from ++memory that takes a fault or assist can leave data in a microarchitectural ++structure that may later be observed using one of the same methods used by ++MSBDS, MFBDS or MLPDS. ++ ++Exposure assumptions ++-------------------- ++ ++It is assumed that attack code resides in user space or in a guest with one ++exception. The rationale behind this assumption is that the code construct ++needed for exploiting MDS requires: ++ ++ - to control the load to trigger a fault or assist ++ ++ - to have a disclosure gadget which exposes the speculatively accessed ++ data for consumption through a side channel. ++ ++ - to control the pointer through which the disclosure gadget exposes the ++ data ++ ++The existence of such a construct in the kernel cannot be excluded with ++100% certainty, but the complexity involved makes it extremly unlikely. ++ ++There is one exception, which is untrusted BPF. The functionality of ++untrusted BPF is limited, but it needs to be thoroughly investigated ++whether it can be used to create such a construct. ++ ++ ++Mitigation strategy ++------------------- ++ ++All variants have the same mitigation strategy at least for the single CPU ++thread case (SMT off): Force the CPU to clear the affected buffers. ++ ++This is achieved by using the otherwise unused and obsolete VERW ++instruction in combination with a microcode update. 
The microcode clears ++the affected CPU buffers when the VERW instruction is executed. ++ ++For virtualization there are two ways to achieve CPU buffer ++clearing. Either the modified VERW instruction or via the L1D Flush ++command. The latter is issued when L1TF mitigation is enabled so the extra ++VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to ++be issued. ++ ++If the VERW instruction with the supplied segment selector argument is ++executed on a CPU without the microcode update there is no side effect ++other than a small number of pointlessly wasted CPU cycles. ++ ++This does not protect against cross Hyper-Thread attacks except for MSBDS ++which is only exploitable cross Hyper-thread when one of the Hyper-Threads ++enters a C-state. ++ ++The kernel provides a function to invoke the buffer clearing: ++ ++ mds_clear_cpu_buffers() ++ ++The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state ++(idle) transitions. ++ ++As a special quirk to address virtualization scenarios where the host has ++the microcode updated, but the hypervisor does not (yet) expose the ++MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the ++hope that it might actually clear the buffers. The state is reflected ++accordingly. ++ ++According to current knowledge additional mitigations inside the kernel ++itself are not required because the necessary gadgets to expose the leaked ++data cannot be controlled in a way which allows exploitation from malicious ++user space or VM guests. ++ ++Kernel internal mitigation modes ++-------------------------------- ++ ++ ======= ============================================================ ++ off Mitigation is disabled. Either the CPU is not affected or ++ mds=off is supplied on the kernel command line ++ ++ full Mitigation is enabled. CPU is affected and MD_CLEAR is ++ advertised in CPUID. ++ ++ vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not ++ advertised in CPUID. That is mainly for virtualization ++ scenarios where the host has the updated microcode but the ++ hypervisor does not expose MD_CLEAR in CPUID. It's a best ++ effort approach without guarantee. ++ ======= ============================================================ ++ ++If the CPU is affected and mds=off is not supplied on the kernel command ++line then the kernel selects the appropriate mitigation mode depending on ++the availability of the MD_CLEAR CPUID bit. ++ ++Mitigation points ++----------------- ++ ++1. Return to user space ++^^^^^^^^^^^^^^^^^^^^^^^ ++ ++ When transitioning from kernel to user space the CPU buffers are flushed ++ on affected CPUs when the mitigation is not disabled on the kernel ++ command line. The migitation is enabled through the static key ++ mds_user_clear. ++ ++ The mitigation is invoked in prepare_exit_to_usermode() which covers ++ most of the kernel to user space transitions. There are a few exceptions ++ which are not invoking prepare_exit_to_usermode() on return to user ++ space. These exceptions use the paranoid exit code. ++ ++ - Non Maskable Interrupt (NMI): ++ ++ Access to sensible data like keys, credentials in the NMI context is ++ mostly theoretical: The CPU can do prefetching or execute a ++ misspeculated code path and thereby fetching data which might end up ++ leaking through a buffer. ++ ++ But for mounting other attacks the kernel stack address of the task is ++ already valuable information. 
So in full mitigation mode, the NMI is ++ mitigated on the return from do_nmi() to provide almost complete ++ coverage. ++ ++ - Double fault (#DF): ++ ++ A double fault is usually fatal, but the ESPFIX workaround, which can ++ be triggered from user space through modify_ldt(2) is a recoverable ++ double fault. #DF uses the paranoid exit path, so explicit mitigation ++ in the double fault handler is required. ++ ++ - Machine Check Exception (#MC): ++ ++ Another corner case is a #MC which hits between the CPU buffer clear ++ invocation and the actual return to user. As this still is in kernel ++ space it takes the paranoid exit path which does not clear the CPU ++ buffers. So the #MC handler repopulates the buffers to some ++ extent. Machine checks are not reliably controllable and the window is ++ extremly small so mitigation would just tick a checkbox that this ++ theoretical corner case is covered. To keep the amount of special ++ cases small, ignore #MC. ++ ++ - Debug Exception (#DB): ++ ++ This takes the paranoid exit path only when the INT1 breakpoint is in ++ kernel space. #DB on a user space address takes the regular exit path, ++ so no extra mitigation required. ++ ++ ++2. C-State transition ++^^^^^^^^^^^^^^^^^^^^^ ++ ++ When a CPU goes idle and enters a C-State the CPU buffers need to be ++ cleared on affected CPUs when SMT is active. This addresses the ++ repartitioning of the store buffer when one of the Hyper-Threads enters ++ a C-State. ++ ++ When SMT is inactive, i.e. either the CPU does not support it or all ++ sibling threads are offline CPU buffer clearing is not required. ++ ++ The idle clearing is enabled on CPUs which are only affected by MSBDS ++ and not by any other MDS variant. The other MDS variants cannot be ++ protected against cross Hyper-Thread attacks because the Fill Buffer and ++ the Load Ports are shared. So on CPUs affected by other variants, the ++ idle clearing would be a window dressing exercise and is therefore not ++ activated. ++ ++ The invocation is controlled by the static key mds_idle_clear which is ++ switched depending on the chosen mitigation mode and the SMT state of ++ the system. ++ ++ The buffer clear is only invoked before entering the C-State to prevent ++ that stale data from the idling CPU from spilling to the Hyper-Thread ++ sibling after the store buffer got repartitioned and all entries are ++ available to the non idle sibling. ++ ++ When coming out of idle the store buffer is partitioned again so each ++ sibling has half of it available. The back from idle CPU could be then ++ speculatively exposed to contents of the sibling. The buffers are ++ flushed either on exit to user space or on VMENTER so malicious code ++ in user space or the guest cannot speculatively access them. ++ ++ The mitigation is hooked into all variants of halt()/mwait(), but does ++ not cover the legacy ACPI IO-Port mechanism because the ACPI idle driver ++ has been superseded by the intel_idle driver around 2010 and is ++ preferred on all affected CPUs which are expected to gain the MD_CLEAR ++ functionality in microcode. Aside of that the IO-Port mechanism is a ++ legacy interface which is only used on older systems which are either ++ not affected or do not receive microcode updates anymore. 
+diff --git a/Makefile b/Makefile +index ee0a50b871b9..6023a9dbad59 100644 +--- a/Makefile ++++ b/Makefile +@@ -1,6 +1,6 @@ + VERSION = 4 + PATCHLEVEL = 4 +-SUBLEVEL = 179 ++SUBLEVEL = 180 + EXTRAVERSION = + NAME = Blurry Fish Butt + +diff --git a/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi b/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi +index d6d98d426384..cae04e806036 100644 +--- a/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi ++++ b/arch/arm/boot/dts/imx6qdl-phytec-pfla02.dtsi +@@ -90,6 +90,7 @@ + pinctrl-names = "default"; + pinctrl-0 = <&pinctrl_enet>; + phy-mode = "rgmii"; ++ phy-reset-duration = <10>; /* in msecs */ + phy-reset-gpios = <&gpio3 23 GPIO_ACTIVE_LOW>; + phy-supply = <&vdd_eth_io_reg>; + status = "disabled"; +diff --git a/arch/arm/mach-iop13xx/setup.c b/arch/arm/mach-iop13xx/setup.c +index 53c316f7301e..fe4932fda01d 100644 +--- a/arch/arm/mach-iop13xx/setup.c ++++ b/arch/arm/mach-iop13xx/setup.c +@@ -300,7 +300,7 @@ static struct resource iop13xx_adma_2_resources[] = { + } + }; + +-static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(64); ++static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(32); + static struct iop_adma_platform_data iop13xx_adma_0_data = { + .hw_id = 0, + .pool_size = PAGE_SIZE, +@@ -324,7 +324,7 @@ static struct platform_device iop13xx_adma_0_channel = { + .resource = iop13xx_adma_0_resources, + .dev = { + .dma_mask = &iop13xx_adma_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = (void *) &iop13xx_adma_0_data, + }, + }; +@@ -336,7 +336,7 @@ static struct platform_device iop13xx_adma_1_channel = { + .resource = iop13xx_adma_1_resources, + .dev = { + .dma_mask = &iop13xx_adma_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = (void *) &iop13xx_adma_1_data, + }, + }; +@@ -348,7 +348,7 @@ static struct platform_device iop13xx_adma_2_channel = { + .resource = iop13xx_adma_2_resources, + .dev = { + .dma_mask = &iop13xx_adma_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = (void *) &iop13xx_adma_2_data, + }, + }; +diff --git a/arch/arm/mach-iop13xx/tpmi.c b/arch/arm/mach-iop13xx/tpmi.c +index db511ec2b1df..116feb6b261e 100644 +--- a/arch/arm/mach-iop13xx/tpmi.c ++++ b/arch/arm/mach-iop13xx/tpmi.c +@@ -152,7 +152,7 @@ static struct resource iop13xx_tpmi_3_resources[] = { + } + }; + +-u64 iop13xx_tpmi_mask = DMA_BIT_MASK(64); ++u64 iop13xx_tpmi_mask = DMA_BIT_MASK(32); + static struct platform_device iop13xx_tpmi_0_device = { + .name = "iop-tpmi", + .id = 0, +@@ -160,7 +160,7 @@ static struct platform_device iop13xx_tpmi_0_device = { + .resource = iop13xx_tpmi_0_resources, + .dev = { + .dma_mask = &iop13xx_tpmi_mask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }; + +@@ -171,7 +171,7 @@ static struct platform_device iop13xx_tpmi_1_device = { + .resource = iop13xx_tpmi_1_resources, + .dev = { + .dma_mask = &iop13xx_tpmi_mask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }; + +@@ -182,7 +182,7 @@ static struct platform_device iop13xx_tpmi_2_device = { + .resource = iop13xx_tpmi_2_resources, + .dev = { + .dma_mask = &iop13xx_tpmi_mask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }; + +@@ -193,7 +193,7 @@ static struct platform_device iop13xx_tpmi_3_device = { + .resource = iop13xx_tpmi_3_resources, + .dev = { + .dma_mask = &iop13xx_tpmi_mask, +- 
.coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + }, + }; + +diff --git a/arch/arm/plat-iop/adma.c b/arch/arm/plat-iop/adma.c +index a4d1f8de3b5b..d9612221e484 100644 +--- a/arch/arm/plat-iop/adma.c ++++ b/arch/arm/plat-iop/adma.c +@@ -143,7 +143,7 @@ struct platform_device iop3xx_dma_0_channel = { + .resource = iop3xx_dma_0_resources, + .dev = { + .dma_mask = &iop3xx_adma_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = (void *) &iop3xx_dma_0_data, + }, + }; +@@ -155,7 +155,7 @@ struct platform_device iop3xx_dma_1_channel = { + .resource = iop3xx_dma_1_resources, + .dev = { + .dma_mask = &iop3xx_adma_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = (void *) &iop3xx_dma_1_data, + }, + }; +@@ -167,7 +167,7 @@ struct platform_device iop3xx_aau_channel = { + .resource = iop3xx_aau_resources, + .dev = { + .dma_mask = &iop3xx_adma_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = (void *) &iop3xx_aau_data, + }, + }; +diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c +index 8861c367d061..51c3737ddba7 100644 +--- a/arch/arm/plat-orion/common.c ++++ b/arch/arm/plat-orion/common.c +@@ -645,7 +645,7 @@ static struct platform_device orion_xor0_shared = { + .resource = orion_xor0_shared_resources, + .dev = { + .dma_mask = &orion_xor_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = &orion_xor0_pdata, + }, + }; +@@ -706,7 +706,7 @@ static struct platform_device orion_xor1_shared = { + .resource = orion_xor1_shared_resources, + .dev = { + .dma_mask = &orion_xor_dmamask, +- .coherent_dma_mask = DMA_BIT_MASK(64), ++ .coherent_dma_mask = DMA_BIT_MASK(32), + .platform_data = &orion_xor1_pdata, + }, + }; +diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S +index 87c697181d25..4faff3e77b25 100644 +--- a/arch/mips/kernel/scall64-o32.S ++++ b/arch/mips/kernel/scall64-o32.S +@@ -126,7 +126,7 @@ trace_a_syscall: + subu t1, v0, __NR_O32_Linux + move a1, v0 + bnez t1, 1f /* __NR_syscall at offset 0 */ +- lw a1, PT_R4(sp) /* Arg1 for __NR_syscall case */ ++ ld a1, PT_R4(sp) /* Arg1 for __NR_syscall case */ + .set pop + + 1: jal syscall_trace_enter +diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig +index 58a1fa979655..01b6c00a7060 100644 +--- a/arch/powerpc/Kconfig ++++ b/arch/powerpc/Kconfig +@@ -136,7 +136,7 @@ config PPC + select GENERIC_SMP_IDLE_THREAD + select GENERIC_CMOS_UPDATE + select GENERIC_TIME_VSYSCALL_OLD +- select GENERIC_CPU_VULNERABILITIES if PPC_BOOK3S_64 ++ select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC + select GENERIC_CLOCKEVENTS + select GENERIC_CLOCKEVENTS_BROADCAST if SMP + select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST +@@ -162,6 +162,11 @@ config PPC + select ARCH_HAS_DMA_SET_COHERENT_MASK + select HAVE_ARCH_SECCOMP_FILTER + ++config PPC_BARRIER_NOSPEC ++ bool ++ default y ++ depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E ++ + config GENERIC_CSUM + def_bool CPU_LITTLE_ENDIAN + +diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h +new file mode 100644 +index 000000000000..8944c55591cf +--- /dev/null ++++ b/arch/powerpc/include/asm/asm-prototypes.h +@@ -0,0 +1,21 @@ ++#ifndef _ASM_POWERPC_ASM_PROTOTYPES_H ++#define _ASM_POWERPC_ASM_PROTOTYPES_H ++/* ++ * This file is for prototypes of C 
functions that are only called ++ * from asm, and any associated variables. ++ * ++ * Copyright 2016, Daniel Axtens, IBM Corporation. ++ * ++ * This program is free software; you can redistribute it and/or ++ * modify it under the terms of the GNU General Public License ++ * as published by the Free Software Foundation; either version 2 ++ * of the License, or (at your option) any later version. ++ */ ++ ++/* Patch sites */ ++extern s32 patch__call_flush_count_cache; ++extern s32 patch__flush_count_cache_return; ++ ++extern long flush_count_cache; ++ ++#endif /* _ASM_POWERPC_ASM_PROTOTYPES_H */ +diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h +index b9e16855a037..e7cb72cdb2ba 100644 +--- a/arch/powerpc/include/asm/barrier.h ++++ b/arch/powerpc/include/asm/barrier.h +@@ -92,4 +92,25 @@ do { \ + #define smp_mb__after_atomic() smp_mb() + #define smp_mb__before_spinlock() smp_mb() + ++#ifdef CONFIG_PPC_BOOK3S_64 ++#define NOSPEC_BARRIER_SLOT nop ++#elif defined(CONFIG_PPC_FSL_BOOK3E) ++#define NOSPEC_BARRIER_SLOT nop; nop ++#endif ++ ++#ifdef CONFIG_PPC_BARRIER_NOSPEC ++/* ++ * Prevent execution of subsequent instructions until preceding branches have ++ * been fully resolved and are no longer executing speculatively. ++ */ ++#define barrier_nospec_asm NOSPEC_BARRIER_FIXUP_SECTION; NOSPEC_BARRIER_SLOT ++ ++// This also acts as a compiler barrier due to the memory clobber. ++#define barrier_nospec() asm (stringify_in_c(barrier_nospec_asm) ::: "memory") ++ ++#else /* !CONFIG_PPC_BARRIER_NOSPEC */ ++#define barrier_nospec_asm ++#define barrier_nospec() ++#endif /* CONFIG_PPC_BARRIER_NOSPEC */ ++ + #endif /* _ASM_POWERPC_BARRIER_H */ +diff --git a/arch/powerpc/include/asm/code-patching-asm.h b/arch/powerpc/include/asm/code-patching-asm.h +new file mode 100644 +index 000000000000..ed7b1448493a +--- /dev/null ++++ b/arch/powerpc/include/asm/code-patching-asm.h +@@ -0,0 +1,18 @@ ++/* SPDX-License-Identifier: GPL-2.0+ */ ++/* ++ * Copyright 2018, Michael Ellerman, IBM Corporation. ++ */ ++#ifndef _ASM_POWERPC_CODE_PATCHING_ASM_H ++#define _ASM_POWERPC_CODE_PATCHING_ASM_H ++ ++/* Define a "site" that can be patched */ ++.macro patch_site label name ++ .pushsection ".rodata" ++ .balign 4 ++ .global \name ++\name: ++ .4byte \label - . 
++ .popsection ++.endm ++ ++#endif /* _ASM_POWERPC_CODE_PATCHING_ASM_H */ +diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h +index 840a5509b3f1..a734b4b34d26 100644 +--- a/arch/powerpc/include/asm/code-patching.h ++++ b/arch/powerpc/include/asm/code-patching.h +@@ -28,6 +28,8 @@ unsigned int create_cond_branch(const unsigned int *addr, + unsigned long target, int flags); + int patch_branch(unsigned int *addr, unsigned long target, int flags); + int patch_instruction(unsigned int *addr, unsigned int instr); ++int patch_instruction_site(s32 *addr, unsigned int instr); ++int patch_branch_site(s32 *site, unsigned long target, int flags); + + int instr_is_relative_branch(unsigned int instr); + int instr_is_branch_to_addr(const unsigned int *instr, unsigned long addr); +diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h +index 9bddbec441b8..3ed536bec462 100644 +--- a/arch/powerpc/include/asm/exception-64s.h ++++ b/arch/powerpc/include/asm/exception-64s.h +@@ -50,6 +50,27 @@ + #define EX_PPR 88 /* SMT thread status register (priority) */ + #define EX_CTR 96 + ++#define STF_ENTRY_BARRIER_SLOT \ ++ STF_ENTRY_BARRIER_FIXUP_SECTION; \ ++ nop; \ ++ nop; \ ++ nop ++ ++#define STF_EXIT_BARRIER_SLOT \ ++ STF_EXIT_BARRIER_FIXUP_SECTION; \ ++ nop; \ ++ nop; \ ++ nop; \ ++ nop; \ ++ nop; \ ++ nop ++ ++/* ++ * r10 must be free to use, r13 must be paca ++ */ ++#define INTERRUPT_TO_KERNEL \ ++ STF_ENTRY_BARRIER_SLOT ++ + /* + * Macros for annotating the expected destination of (h)rfid + * +@@ -66,16 +87,19 @@ + rfid + + #define RFI_TO_USER \ ++ STF_EXIT_BARRIER_SLOT; \ + RFI_FLUSH_SLOT; \ + rfid; \ + b rfi_flush_fallback + + #define RFI_TO_USER_OR_KERNEL \ ++ STF_EXIT_BARRIER_SLOT; \ + RFI_FLUSH_SLOT; \ + rfid; \ + b rfi_flush_fallback + + #define RFI_TO_GUEST \ ++ STF_EXIT_BARRIER_SLOT; \ + RFI_FLUSH_SLOT; \ + rfid; \ + b rfi_flush_fallback +@@ -84,21 +108,25 @@ + hrfid + + #define HRFI_TO_USER \ ++ STF_EXIT_BARRIER_SLOT; \ + RFI_FLUSH_SLOT; \ + hrfid; \ + b hrfi_flush_fallback + + #define HRFI_TO_USER_OR_KERNEL \ ++ STF_EXIT_BARRIER_SLOT; \ + RFI_FLUSH_SLOT; \ + hrfid; \ + b hrfi_flush_fallback + + #define HRFI_TO_GUEST \ ++ STF_EXIT_BARRIER_SLOT; \ + RFI_FLUSH_SLOT; \ + hrfid; \ + b hrfi_flush_fallback + + #define HRFI_TO_UNKNOWN \ ++ STF_EXIT_BARRIER_SLOT; \ + RFI_FLUSH_SLOT; \ + hrfid; \ + b hrfi_flush_fallback +@@ -226,6 +254,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) + #define __EXCEPTION_PROLOG_1(area, extra, vec) \ + OPT_SAVE_REG_TO_PACA(area+EX_PPR, r9, CPU_FTR_HAS_PPR); \ + OPT_SAVE_REG_TO_PACA(area+EX_CFAR, r10, CPU_FTR_CFAR); \ ++ INTERRUPT_TO_KERNEL; \ + SAVE_CTR(r10, area); \ + mfcr r9; \ + extra(vec); \ +@@ -512,6 +541,12 @@ label##_relon_hv: \ + #define _MASKABLE_EXCEPTION_PSERIES(vec, label, h, extra) \ + __MASKABLE_EXCEPTION_PSERIES(vec, label, h, extra) + ++#define MASKABLE_EXCEPTION_OOL(vec, label) \ ++ .globl label##_ool; \ ++label##_ool: \ ++ EXCEPTION_PROLOG_1(PACA_EXGEN, SOFTEN_TEST_PR, vec); \ ++ EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD); ++ + #define MASKABLE_EXCEPTION_PSERIES(loc, vec, label) \ + . 
= loc; \ + .globl label##_pSeries; \ +diff --git a/arch/powerpc/include/asm/feature-fixups.h b/arch/powerpc/include/asm/feature-fixups.h +index 7068bafbb2d6..145a37ab2d3e 100644 +--- a/arch/powerpc/include/asm/feature-fixups.h ++++ b/arch/powerpc/include/asm/feature-fixups.h +@@ -184,6 +184,22 @@ label##3: \ + FTR_ENTRY_OFFSET label##1b-label##3b; \ + .popsection; + ++#define STF_ENTRY_BARRIER_FIXUP_SECTION \ ++953: \ ++ .pushsection __stf_entry_barrier_fixup,"a"; \ ++ .align 2; \ ++954: \ ++ FTR_ENTRY_OFFSET 953b-954b; \ ++ .popsection; ++ ++#define STF_EXIT_BARRIER_FIXUP_SECTION \ ++955: \ ++ .pushsection __stf_exit_barrier_fixup,"a"; \ ++ .align 2; \ ++956: \ ++ FTR_ENTRY_OFFSET 955b-956b; \ ++ .popsection; ++ + #define RFI_FLUSH_FIXUP_SECTION \ + 951: \ + .pushsection __rfi_flush_fixup,"a"; \ +@@ -192,10 +208,34 @@ label##3: \ + FTR_ENTRY_OFFSET 951b-952b; \ + .popsection; + ++#define NOSPEC_BARRIER_FIXUP_SECTION \ ++953: \ ++ .pushsection __barrier_nospec_fixup,"a"; \ ++ .align 2; \ ++954: \ ++ FTR_ENTRY_OFFSET 953b-954b; \ ++ .popsection; ++ ++#define START_BTB_FLUSH_SECTION \ ++955: \ ++ ++#define END_BTB_FLUSH_SECTION \ ++956: \ ++ .pushsection __btb_flush_fixup,"a"; \ ++ .align 2; \ ++957: \ ++ FTR_ENTRY_OFFSET 955b-957b; \ ++ FTR_ENTRY_OFFSET 956b-957b; \ ++ .popsection; + + #ifndef __ASSEMBLY__ + ++extern long stf_barrier_fallback; ++extern long __start___stf_entry_barrier_fixup, __stop___stf_entry_barrier_fixup; ++extern long __start___stf_exit_barrier_fixup, __stop___stf_exit_barrier_fixup; + extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup; ++extern long __start___barrier_nospec_fixup, __stop___barrier_nospec_fixup; ++extern long __start__btb_flush_fixup, __stop__btb_flush_fixup; + + #endif + +diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h +index 449bbb87c257..b57db9d09db9 100644 +--- a/arch/powerpc/include/asm/hvcall.h ++++ b/arch/powerpc/include/asm/hvcall.h +@@ -292,10 +292,15 @@ + #define H_CPU_CHAR_L1D_FLUSH_ORI30 (1ull << 61) // IBM bit 2 + #define H_CPU_CHAR_L1D_FLUSH_TRIG2 (1ull << 60) // IBM bit 3 + #define H_CPU_CHAR_L1D_THREAD_PRIV (1ull << 59) // IBM bit 4 ++#define H_CPU_CHAR_BRANCH_HINTS_HONORED (1ull << 58) // IBM bit 5 ++#define H_CPU_CHAR_THREAD_RECONFIG_CTRL (1ull << 57) // IBM bit 6 ++#define H_CPU_CHAR_COUNT_CACHE_DISABLED (1ull << 56) // IBM bit 7 ++#define H_CPU_CHAR_BCCTR_FLUSH_ASSIST (1ull << 54) // IBM bit 9 + + #define H_CPU_BEHAV_FAVOUR_SECURITY (1ull << 63) // IBM bit 0 + #define H_CPU_BEHAV_L1D_FLUSH_PR (1ull << 62) // IBM bit 1 + #define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ull << 61) // IBM bit 2 ++#define H_CPU_BEHAV_FLUSH_COUNT_CACHE (1ull << 58) // IBM bit 5 + + #ifndef __ASSEMBLY__ + #include +diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h +index 45e2aefece16..08e5df3395fa 100644 +--- a/arch/powerpc/include/asm/paca.h ++++ b/arch/powerpc/include/asm/paca.h +@@ -199,8 +199,7 @@ struct paca_struct { + */ + u64 exrfi[13] __aligned(0x80); + void *rfi_flush_fallback_area; +- u64 l1d_flush_congruence; +- u64 l1d_flush_sets; ++ u64 l1d_flush_size; + #endif + }; + +diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h +index 7ab04fc59e24..faf1bb045dee 100644 +--- a/arch/powerpc/include/asm/ppc-opcode.h ++++ b/arch/powerpc/include/asm/ppc-opcode.h +@@ -147,6 +147,7 @@ + #define PPC_INST_LWSYNC 0x7c2004ac + #define PPC_INST_SYNC 0x7c0004ac + #define PPC_INST_SYNC_MASK 0xfc0007fe ++#define PPC_INST_ISYNC 0x4c00012c + #define 
PPC_INST_LXVD2X 0x7c000698 + #define PPC_INST_MCRXR 0x7c000400 + #define PPC_INST_MCRXR_MASK 0xfc0007fe +diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h +index 160bb2311bbb..d219816b3e19 100644 +--- a/arch/powerpc/include/asm/ppc_asm.h ++++ b/arch/powerpc/include/asm/ppc_asm.h +@@ -821,4 +821,15 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,945) + .long 0x2400004c /* rfid */ + #endif /* !CONFIG_PPC_BOOK3E */ + #endif /* __ASSEMBLY__ */ ++ ++#ifdef CONFIG_PPC_FSL_BOOK3E ++#define BTB_FLUSH(reg) \ ++ lis reg,BUCSR_INIT@h; \ ++ ori reg,reg,BUCSR_INIT@l; \ ++ mtspr SPRN_BUCSR,reg; \ ++ isync; ++#else ++#define BTB_FLUSH(reg) ++#endif /* CONFIG_PPC_FSL_BOOK3E */ ++ + #endif /* _ASM_POWERPC_PPC_ASM_H */ +diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h +index 2fef74b474f0..410ebee9e339 100644 +--- a/arch/powerpc/include/asm/reg_booke.h ++++ b/arch/powerpc/include/asm/reg_booke.h +@@ -41,7 +41,7 @@ + #if defined(CONFIG_PPC_BOOK3E_64) + #define MSR_64BIT MSR_CM + +-#define MSR_ (MSR_ME | MSR_CE) ++#define MSR_ (MSR_ME | MSR_RI | MSR_CE) + #define MSR_KERNEL (MSR_ | MSR_64BIT) + #define MSR_USER32 (MSR_ | MSR_PR | MSR_EE) + #define MSR_USER64 (MSR_USER32 | MSR_64BIT) +diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h +new file mode 100644 +index 000000000000..759597bf0fd8 +--- /dev/null ++++ b/arch/powerpc/include/asm/security_features.h +@@ -0,0 +1,92 @@ ++/* SPDX-License-Identifier: GPL-2.0+ */ ++/* ++ * Security related feature bit definitions. ++ * ++ * Copyright 2018, Michael Ellerman, IBM Corporation. ++ */ ++ ++#ifndef _ASM_POWERPC_SECURITY_FEATURES_H ++#define _ASM_POWERPC_SECURITY_FEATURES_H ++ ++ ++extern unsigned long powerpc_security_features; ++extern bool rfi_flush; ++ ++/* These are bit flags */ ++enum stf_barrier_type { ++ STF_BARRIER_NONE = 0x1, ++ STF_BARRIER_FALLBACK = 0x2, ++ STF_BARRIER_EIEIO = 0x4, ++ STF_BARRIER_SYNC_ORI = 0x8, ++}; ++ ++void setup_stf_barrier(void); ++void do_stf_barrier_fixups(enum stf_barrier_type types); ++void setup_count_cache_flush(void); ++ ++static inline void security_ftr_set(unsigned long feature) ++{ ++ powerpc_security_features |= feature; ++} ++ ++static inline void security_ftr_clear(unsigned long feature) ++{ ++ powerpc_security_features &= ~feature; ++} ++ ++static inline bool security_ftr_enabled(unsigned long feature) ++{ ++ return !!(powerpc_security_features & feature); ++} ++ ++ ++// Features indicating support for Spectre/Meltdown mitigations ++ ++// The L1-D cache can be flushed with ori r30,r30,0 ++#define SEC_FTR_L1D_FLUSH_ORI30 0x0000000000000001ull ++ ++// The L1-D cache can be flushed with mtspr 882,r0 (aka SPRN_TRIG2) ++#define SEC_FTR_L1D_FLUSH_TRIG2 0x0000000000000002ull ++ ++// ori r31,r31,0 acts as a speculation barrier ++#define SEC_FTR_SPEC_BAR_ORI31 0x0000000000000004ull ++ ++// Speculation past bctr is disabled ++#define SEC_FTR_BCCTRL_SERIALISED 0x0000000000000008ull ++ ++// Entries in L1-D are private to a SMT thread ++#define SEC_FTR_L1D_THREAD_PRIV 0x0000000000000010ull ++ ++// Indirect branch prediction cache disabled ++#define SEC_FTR_COUNT_CACHE_DISABLED 0x0000000000000020ull ++ ++// bcctr 2,0,0 triggers a hardware assisted count cache flush ++#define SEC_FTR_BCCTR_FLUSH_ASSIST 0x0000000000000800ull ++ ++ ++// Features indicating need for Spectre/Meltdown mitigations ++ ++// The L1-D cache should be flushed on MSR[HV] 1->0 transition (hypervisor to guest) ++#define 
SEC_FTR_L1D_FLUSH_HV 0x0000000000000040ull ++ ++// The L1-D cache should be flushed on MSR[PR] 0->1 transition (kernel to userspace) ++#define SEC_FTR_L1D_FLUSH_PR 0x0000000000000080ull ++ ++// A speculation barrier should be used for bounds checks (Spectre variant 1) ++#define SEC_FTR_BNDS_CHK_SPEC_BAR 0x0000000000000100ull ++ ++// Firmware configuration indicates user favours security over performance ++#define SEC_FTR_FAVOUR_SECURITY 0x0000000000000200ull ++ ++// Software required to flush count cache on context switch ++#define SEC_FTR_FLUSH_COUNT_CACHE 0x0000000000000400ull ++ ++ ++// Features enabled by default ++#define SEC_FTR_DEFAULT \ ++ (SEC_FTR_L1D_FLUSH_HV | \ ++ SEC_FTR_L1D_FLUSH_PR | \ ++ SEC_FTR_BNDS_CHK_SPEC_BAR | \ ++ SEC_FTR_FAVOUR_SECURITY) ++ ++#endif /* _ASM_POWERPC_SECURITY_FEATURES_H */ +diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h +index 7916b56f2e60..d299479c770b 100644 +--- a/arch/powerpc/include/asm/setup.h ++++ b/arch/powerpc/include/asm/setup.h +@@ -8,6 +8,7 @@ extern void ppc_printk_progress(char *s, unsigned short hex); + + extern unsigned int rtas_data; + extern unsigned long long memory_limit; ++extern bool init_mem_is_free; + extern unsigned long klimit; + extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask); + +@@ -36,8 +37,28 @@ enum l1d_flush_type { + L1D_FLUSH_MTTRIG = 0x8, + }; + +-void __init setup_rfi_flush(enum l1d_flush_type, bool enable); ++void setup_rfi_flush(enum l1d_flush_type, bool enable); + void do_rfi_flush_fixups(enum l1d_flush_type types); ++#ifdef CONFIG_PPC_BARRIER_NOSPEC ++void setup_barrier_nospec(void); ++#else ++static inline void setup_barrier_nospec(void) { }; ++#endif ++void do_barrier_nospec_fixups(bool enable); ++extern bool barrier_nospec_enabled; ++ ++#ifdef CONFIG_PPC_BARRIER_NOSPEC ++void do_barrier_nospec_fixups_range(bool enable, void *start, void *end); ++#else ++static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *end) { }; ++#endif ++ ++#ifdef CONFIG_PPC_FSL_BOOK3E ++void setup_spectre_v2(void); ++#else ++static inline void setup_spectre_v2(void) {}; ++#endif ++void do_btb_flush_fixups(void); + + #endif /* !__ASSEMBLY__ */ + +diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h +index 05f1389228d2..e51ce5a0e221 100644 +--- a/arch/powerpc/include/asm/uaccess.h ++++ b/arch/powerpc/include/asm/uaccess.h +@@ -269,6 +269,7 @@ do { \ + __chk_user_ptr(ptr); \ + if (!is_kernel_addr((unsigned long)__gu_addr)) \ + might_fault(); \ ++ barrier_nospec(); \ + __get_user_size(__gu_val, __gu_addr, (size), __gu_err); \ + (x) = (__typeof__(*(ptr)))__gu_val; \ + __gu_err; \ +@@ -283,6 +284,7 @@ do { \ + __chk_user_ptr(ptr); \ + if (!is_kernel_addr((unsigned long)__gu_addr)) \ + might_fault(); \ ++ barrier_nospec(); \ + __get_user_size(__gu_val, __gu_addr, (size), __gu_err); \ + (x) = (__force __typeof__(*(ptr)))__gu_val; \ + __gu_err; \ +@@ -295,8 +297,10 @@ do { \ + unsigned long __gu_val = 0; \ + __typeof__(*(ptr)) __user *__gu_addr = (ptr); \ + might_fault(); \ +- if (access_ok(VERIFY_READ, __gu_addr, (size))) \ ++ if (access_ok(VERIFY_READ, __gu_addr, (size))) { \ ++ barrier_nospec(); \ + __get_user_size(__gu_val, __gu_addr, (size), __gu_err); \ ++ } \ + (x) = (__force __typeof__(*(ptr)))__gu_val; \ + __gu_err; \ + }) +@@ -307,6 +311,7 @@ do { \ + unsigned long __gu_val; \ + __typeof__(*(ptr)) __user *__gu_addr = (ptr); \ + __chk_user_ptr(ptr); \ ++ barrier_nospec(); \ + __get_user_size(__gu_val, __gu_addr, (size), __gu_err); \ + 
(x) = (__force __typeof__(*(ptr)))__gu_val; \ + __gu_err; \ +@@ -323,8 +328,10 @@ extern unsigned long __copy_tofrom_user(void __user *to, + static inline unsigned long copy_from_user(void *to, + const void __user *from, unsigned long n) + { +- if (likely(access_ok(VERIFY_READ, from, n))) ++ if (likely(access_ok(VERIFY_READ, from, n))) { ++ barrier_nospec(); + return __copy_tofrom_user((__force void __user *)to, from, n); ++ } + memset(to, 0, n); + return n; + } +@@ -359,21 +366,27 @@ static inline unsigned long __copy_from_user_inatomic(void *to, + + switch (n) { + case 1: ++ barrier_nospec(); + __get_user_size(*(u8 *)to, from, 1, ret); + break; + case 2: ++ barrier_nospec(); + __get_user_size(*(u16 *)to, from, 2, ret); + break; + case 4: ++ barrier_nospec(); + __get_user_size(*(u32 *)to, from, 4, ret); + break; + case 8: ++ barrier_nospec(); + __get_user_size(*(u64 *)to, from, 8, ret); + break; + } + if (ret == 0) + return 0; + } ++ ++ barrier_nospec(); + return __copy_tofrom_user((__force void __user *)to, from, n); + } + +@@ -400,6 +413,7 @@ static inline unsigned long __copy_to_user_inatomic(void __user *to, + if (ret == 0) + return 0; + } ++ + return __copy_tofrom_user(to, (__force const void __user *)from, n); + } + +diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile +index ba336930d448..22ed3c32fca8 100644 +--- a/arch/powerpc/kernel/Makefile ++++ b/arch/powerpc/kernel/Makefile +@@ -44,6 +44,7 @@ obj-$(CONFIG_PPC_BOOK3S_64) += cpu_setup_power.o + obj-$(CONFIG_PPC_BOOK3S_64) += mce.o mce_power.o + obj64-$(CONFIG_RELOCATABLE) += reloc_64.o + obj-$(CONFIG_PPC_BOOK3E_64) += exceptions-64e.o idle_book3e.o ++obj-$(CONFIG_PPC_BARRIER_NOSPEC) += security.o + obj-$(CONFIG_PPC64) += vdso64/ + obj-$(CONFIG_ALTIVEC) += vecemu.o + obj-$(CONFIG_PPC_970_NAP) += idle_power4.o +diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c +index d92705e3a0c1..de3c29c51503 100644 +--- a/arch/powerpc/kernel/asm-offsets.c ++++ b/arch/powerpc/kernel/asm-offsets.c +@@ -245,8 +245,7 @@ int main(void) + DEFINE(PACA_IN_MCE, offsetof(struct paca_struct, in_mce)); + DEFINE(PACA_RFI_FLUSH_FALLBACK_AREA, offsetof(struct paca_struct, rfi_flush_fallback_area)); + DEFINE(PACA_EXRFI, offsetof(struct paca_struct, exrfi)); +- DEFINE(PACA_L1D_FLUSH_CONGRUENCE, offsetof(struct paca_struct, l1d_flush_congruence)); +- DEFINE(PACA_L1D_FLUSH_SETS, offsetof(struct paca_struct, l1d_flush_sets)); ++ DEFINE(PACA_L1D_FLUSH_SIZE, offsetof(struct paca_struct, l1d_flush_size)); + #endif + DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id)); + DEFINE(PACAKEXECSTATE, offsetof(struct paca_struct, kexec_state)); +diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S +index 3728e617e17e..609bc7d01f13 100644 +--- a/arch/powerpc/kernel/entry_32.S ++++ b/arch/powerpc/kernel/entry_32.S +@@ -33,6 +33,7 @@ + #include + #include + #include ++#include + + /* + * MSR_KERNEL is > 0x10000 on 4xx/Book-E since it include MSR_CE. +@@ -340,6 +341,15 @@ syscall_dotrace_cont: + ori r10,r10,sys_call_table@l + slwi r0,r0,2 + bge- 66f ++ ++ barrier_nospec_asm ++ /* ++ * Prevent the load of the handler below (based on the user-passed ++ * system call number) being speculatively executed until the test ++ * against NR_syscalls and branch to .66f above has ++ * committed. 
++ */ ++ + lwzx r10,r10,r0 /* Fetch system call handler [ptr] */ + mtlr r10 + addi r9,r1,STACK_FRAME_OVERHEAD +diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S +index 59be96917369..6d36a4fb4acf 100644 +--- a/arch/powerpc/kernel/entry_64.S ++++ b/arch/powerpc/kernel/entry_64.S +@@ -25,6 +25,7 @@ + #include + #include + #include ++#include + #include + #include + #include +@@ -36,6 +37,7 @@ + #include + #include + #include ++#include + #ifdef CONFIG_PPC_BOOK3S + #include + #else +@@ -75,6 +77,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM) + std r0,GPR0(r1) + std r10,GPR1(r1) + beq 2f /* if from kernel mode */ ++#ifdef CONFIG_PPC_FSL_BOOK3E ++START_BTB_FLUSH_SECTION ++ BTB_FLUSH(r10) ++END_BTB_FLUSH_SECTION ++#endif + ACCOUNT_CPU_USER_ENTRY(r10, r11) + 2: std r2,GPR2(r1) + std r3,GPR3(r1) +@@ -177,6 +184,15 @@ system_call: /* label this so stack traces look sane */ + clrldi r8,r8,32 + 15: + slwi r0,r0,4 ++ ++ barrier_nospec_asm ++ /* ++ * Prevent the load of the handler below (based on the user-passed ++ * system call number) being speculatively executed until the test ++ * against NR_syscalls and branch to .Lsyscall_enosys above has ++ * committed. ++ */ ++ + ldx r12,r11,r0 /* Fetch system call handler [ptr] */ + mtctr r12 + bctrl /* Call handler */ +@@ -440,6 +456,57 @@ _GLOBAL(ret_from_kernel_thread) + li r3,0 + b .Lsyscall_exit + ++#ifdef CONFIG_PPC_BOOK3S_64 ++ ++#define FLUSH_COUNT_CACHE \ ++1: nop; \ ++ patch_site 1b, patch__call_flush_count_cache ++ ++ ++#define BCCTR_FLUSH .long 0x4c400420 ++ ++.macro nops number ++ .rept \number ++ nop ++ .endr ++.endm ++ ++.balign 32 ++.global flush_count_cache ++flush_count_cache: ++ /* Save LR into r9 */ ++ mflr r9 ++ ++ .rept 64 ++ bl .+4 ++ .endr ++ b 1f ++ nops 6 ++ ++ .balign 32 ++ /* Restore LR */ ++1: mtlr r9 ++ li r9,0x7fff ++ mtctr r9 ++ ++ BCCTR_FLUSH ++ ++2: nop ++ patch_site 2b patch__flush_count_cache_return ++ ++ nops 3 ++ ++ .rept 278 ++ .balign 32 ++ BCCTR_FLUSH ++ nops 7 ++ .endr ++ ++ blr ++#else ++#define FLUSH_COUNT_CACHE ++#endif /* CONFIG_PPC_BOOK3S_64 */ ++ + /* + * This routine switches between two different tasks. The process + * state of one is saved on its kernel stack. Then the state +@@ -503,6 +570,8 @@ BEGIN_FTR_SECTION + END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) + #endif + ++ FLUSH_COUNT_CACHE ++ + #ifdef CONFIG_SMP + /* We need a sync somewhere here to make sure that if the + * previous task gets rescheduled on another CPU, it sees all +diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S +index 5cc93f0b52ca..48ec841ea1bf 100644 +--- a/arch/powerpc/kernel/exceptions-64e.S ++++ b/arch/powerpc/kernel/exceptions-64e.S +@@ -295,7 +295,8 @@ ret_from_mc_except: + andi. 
r10,r11,MSR_PR; /* save stack pointer */ \ + beq 1f; /* branch around if supervisor */ \ + ld r1,PACAKSAVE(r13); /* get kernel stack coming from usr */\ +-1: cmpdi cr1,r1,0; /* check if SP makes sense */ \ ++1: type##_BTB_FLUSH \ ++ cmpdi cr1,r1,0; /* check if SP makes sense */ \ + bge- cr1,exc_##n##_bad_stack;/* bad stack (TODO: out of line) */ \ + mfspr r10,SPRN_##type##_SRR0; /* read SRR0 before touching stack */ + +@@ -327,6 +328,30 @@ ret_from_mc_except: + #define SPRN_MC_SRR0 SPRN_MCSRR0 + #define SPRN_MC_SRR1 SPRN_MCSRR1 + ++#ifdef CONFIG_PPC_FSL_BOOK3E ++#define GEN_BTB_FLUSH \ ++ START_BTB_FLUSH_SECTION \ ++ beq 1f; \ ++ BTB_FLUSH(r10) \ ++ 1: \ ++ END_BTB_FLUSH_SECTION ++ ++#define CRIT_BTB_FLUSH \ ++ START_BTB_FLUSH_SECTION \ ++ BTB_FLUSH(r10) \ ++ END_BTB_FLUSH_SECTION ++ ++#define DBG_BTB_FLUSH CRIT_BTB_FLUSH ++#define MC_BTB_FLUSH CRIT_BTB_FLUSH ++#define GDBELL_BTB_FLUSH GEN_BTB_FLUSH ++#else ++#define GEN_BTB_FLUSH ++#define CRIT_BTB_FLUSH ++#define DBG_BTB_FLUSH ++#define MC_BTB_FLUSH ++#define GDBELL_BTB_FLUSH ++#endif ++ + #define NORMAL_EXCEPTION_PROLOG(n, intnum, addition) \ + EXCEPTION_PROLOG(n, intnum, GEN, addition##_GEN(n)) + +diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S +index 938a30fef031..10e7cec9553d 100644 +--- a/arch/powerpc/kernel/exceptions-64s.S ++++ b/arch/powerpc/kernel/exceptions-64s.S +@@ -36,6 +36,7 @@ BEGIN_FTR_SECTION \ + END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE) \ + mr r9,r13 ; \ + GET_PACA(r13) ; \ ++ INTERRUPT_TO_KERNEL ; \ + mfspr r11,SPRN_SRR0 ; \ + 0: + +@@ -292,7 +293,9 @@ hardware_interrupt_hv: + . = 0x900 + .globl decrementer_pSeries + decrementer_pSeries: +- _MASKABLE_EXCEPTION_PSERIES(0x900, decrementer, EXC_STD, SOFTEN_TEST_PR) ++ SET_SCRATCH0(r13) ++ EXCEPTION_PROLOG_0(PACA_EXGEN) ++ b decrementer_ool + + STD_EXCEPTION_HV(0x980, 0x982, hdecrementer) + +@@ -319,6 +322,7 @@ system_call_pSeries: + OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR); + HMT_MEDIUM; + std r10,PACA_EXGEN+EX_R10(r13) ++ INTERRUPT_TO_KERNEL + OPT_SAVE_REG_TO_PACA(PACA_EXGEN+EX_PPR, r9, CPU_FTR_HAS_PPR); + mfcr r9 + KVMTEST(0xc00) +@@ -607,6 +611,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) + + .align 7 + /* moved from 0xe00 */ ++ MASKABLE_EXCEPTION_OOL(0x900, decrementer) + STD_EXCEPTION_HV_OOL(0xe02, h_data_storage) + KVM_HANDLER_SKIP(PACA_EXGEN, EXC_HV, 0xe02) + STD_EXCEPTION_HV_OOL(0xe22, h_instr_storage) +@@ -1564,6 +1569,21 @@ power4_fixup_nap: + blr + #endif + ++ .balign 16 ++ .globl stf_barrier_fallback ++stf_barrier_fallback: ++ std r9,PACA_EXRFI+EX_R9(r13) ++ std r10,PACA_EXRFI+EX_R10(r13) ++ sync ++ ld r9,PACA_EXRFI+EX_R9(r13) ++ ld r10,PACA_EXRFI+EX_R10(r13) ++ ori 31,31,0 ++ .rept 14 ++ b 1f ++1: ++ .endr ++ blr ++ + .globl rfi_flush_fallback + rfi_flush_fallback: + SET_SCRATCH0(r13); +@@ -1571,39 +1591,37 @@ rfi_flush_fallback: + std r9,PACA_EXRFI+EX_R9(r13) + std r10,PACA_EXRFI+EX_R10(r13) + std r11,PACA_EXRFI+EX_R11(r13) +- std r12,PACA_EXRFI+EX_R12(r13) +- std r8,PACA_EXRFI+EX_R13(r13) + mfctr r9 + ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13) +- ld r11,PACA_L1D_FLUSH_SETS(r13) +- ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13) +- /* +- * The load adresses are at staggered offsets within cachelines, +- * which suits some pipelines better (on others it should not +- * hurt). 
+- */ +- addi r12,r12,8 ++ ld r11,PACA_L1D_FLUSH_SIZE(r13) ++ srdi r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */ + mtctr r11 + DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */ + + /* order ld/st prior to dcbt stop all streams with flushing */ + sync +-1: li r8,0 +- .rept 8 /* 8-way set associative */ +- ldx r11,r10,r8 +- add r8,r8,r12 +- xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not +- add r8,r8,r11 // Add 0, this creates a dependency on the ldx +- .endr +- addi r10,r10,128 /* 128 byte cache line */ ++ ++ /* ++ * The load adresses are at staggered offsets within cachelines, ++ * which suits some pipelines better (on others it should not ++ * hurt). ++ */ ++1: ++ ld r11,(0x80 + 8)*0(r10) ++ ld r11,(0x80 + 8)*1(r10) ++ ld r11,(0x80 + 8)*2(r10) ++ ld r11,(0x80 + 8)*3(r10) ++ ld r11,(0x80 + 8)*4(r10) ++ ld r11,(0x80 + 8)*5(r10) ++ ld r11,(0x80 + 8)*6(r10) ++ ld r11,(0x80 + 8)*7(r10) ++ addi r10,r10,0x80*8 + bdnz 1b + + mtctr r9 + ld r9,PACA_EXRFI+EX_R9(r13) + ld r10,PACA_EXRFI+EX_R10(r13) + ld r11,PACA_EXRFI+EX_R11(r13) +- ld r12,PACA_EXRFI+EX_R12(r13) +- ld r8,PACA_EXRFI+EX_R13(r13) + GET_SCRATCH0(r13); + rfid + +@@ -1614,39 +1632,37 @@ hrfi_flush_fallback: + std r9,PACA_EXRFI+EX_R9(r13) + std r10,PACA_EXRFI+EX_R10(r13) + std r11,PACA_EXRFI+EX_R11(r13) +- std r12,PACA_EXRFI+EX_R12(r13) +- std r8,PACA_EXRFI+EX_R13(r13) + mfctr r9 + ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13) +- ld r11,PACA_L1D_FLUSH_SETS(r13) +- ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13) +- /* +- * The load adresses are at staggered offsets within cachelines, +- * which suits some pipelines better (on others it should not +- * hurt). +- */ +- addi r12,r12,8 ++ ld r11,PACA_L1D_FLUSH_SIZE(r13) ++ srdi r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */ + mtctr r11 + DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */ + + /* order ld/st prior to dcbt stop all streams with flushing */ + sync +-1: li r8,0 +- .rept 8 /* 8-way set associative */ +- ldx r11,r10,r8 +- add r8,r8,r12 +- xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not +- add r8,r8,r11 // Add 0, this creates a dependency on the ldx +- .endr +- addi r10,r10,128 /* 128 byte cache line */ ++ ++ /* ++ * The load adresses are at staggered offsets within cachelines, ++ * which suits some pipelines better (on others it should not ++ * hurt). ++ */ ++1: ++ ld r11,(0x80 + 8)*0(r10) ++ ld r11,(0x80 + 8)*1(r10) ++ ld r11,(0x80 + 8)*2(r10) ++ ld r11,(0x80 + 8)*3(r10) ++ ld r11,(0x80 + 8)*4(r10) ++ ld r11,(0x80 + 8)*5(r10) ++ ld r11,(0x80 + 8)*6(r10) ++ ld r11,(0x80 + 8)*7(r10) ++ addi r10,r10,0x80*8 + bdnz 1b + + mtctr r9 + ld r9,PACA_EXRFI+EX_R9(r13) + ld r10,PACA_EXRFI+EX_R10(r13) + ld r11,PACA_EXRFI+EX_R11(r13) +- ld r12,PACA_EXRFI+EX_R12(r13) +- ld r8,PACA_EXRFI+EX_R13(r13) + GET_SCRATCH0(r13); + hrfid + +diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h +index a620203f7de3..7b98c7351f6c 100644 +--- a/arch/powerpc/kernel/head_booke.h ++++ b/arch/powerpc/kernel/head_booke.h +@@ -31,6 +31,16 @@ + */ + #define THREAD_NORMSAVE(offset) (THREAD_NORMSAVES + (offset * 4)) + ++#ifdef CONFIG_PPC_FSL_BOOK3E ++#define BOOKE_CLEAR_BTB(reg) \ ++START_BTB_FLUSH_SECTION \ ++ BTB_FLUSH(reg) \ ++END_BTB_FLUSH_SECTION ++#else ++#define BOOKE_CLEAR_BTB(reg) ++#endif ++ ++ + #define NORMAL_EXCEPTION_PROLOG(intno) \ + mtspr SPRN_SPRG_WSCRATCH0, r10; /* save one register */ \ + mfspr r10, SPRN_SPRG_THREAD; \ +@@ -42,6 +52,7 @@ + andi. 
r11, r11, MSR_PR; /* check whether user or kernel */\ + mr r11, r1; \ + beq 1f; \ ++ BOOKE_CLEAR_BTB(r11) \ + /* if from user, start at top of this thread's kernel stack */ \ + lwz r11, THREAD_INFO-THREAD(r10); \ + ALLOC_STACK_FRAME(r11, THREAD_SIZE); \ +@@ -127,6 +138,7 @@ + stw r9,_CCR(r8); /* save CR on stack */\ + mfspr r11,exc_level_srr1; /* check whether user or kernel */\ + DO_KVM BOOKE_INTERRUPT_##intno exc_level_srr1; \ ++ BOOKE_CLEAR_BTB(r10) \ + andi. r11,r11,MSR_PR; \ + mfspr r11,SPRN_SPRG_THREAD; /* if from user, start at top of */\ + lwz r11,THREAD_INFO-THREAD(r11); /* this thread's kernel stack */\ +diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S +index fffd1f96bb1d..275769b6fb0d 100644 +--- a/arch/powerpc/kernel/head_fsl_booke.S ++++ b/arch/powerpc/kernel/head_fsl_booke.S +@@ -451,6 +451,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV) + mfcr r13 + stw r13, THREAD_NORMSAVE(3)(r10) + DO_KVM BOOKE_INTERRUPT_DTLB_MISS SPRN_SRR1 ++START_BTB_FLUSH_SECTION ++ mfspr r11, SPRN_SRR1 ++ andi. r10,r11,MSR_PR ++ beq 1f ++ BTB_FLUSH(r10) ++1: ++END_BTB_FLUSH_SECTION + mfspr r10, SPRN_DEAR /* Get faulting address */ + + /* If we are faulting a kernel address, we have to use the +@@ -545,6 +552,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV) + mfcr r13 + stw r13, THREAD_NORMSAVE(3)(r10) + DO_KVM BOOKE_INTERRUPT_ITLB_MISS SPRN_SRR1 ++START_BTB_FLUSH_SECTION ++ mfspr r11, SPRN_SRR1 ++ andi. r10,r11,MSR_PR ++ beq 1f ++ BTB_FLUSH(r10) ++1: ++END_BTB_FLUSH_SECTION ++ + mfspr r10, SPRN_SRR0 /* Get faulting address */ + + /* If we are faulting a kernel address, we have to use the +diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c +index 9547381b631a..ff009be97a42 100644 +--- a/arch/powerpc/kernel/module.c ++++ b/arch/powerpc/kernel/module.c +@@ -67,7 +67,15 @@ int module_finalize(const Elf_Ehdr *hdr, + do_feature_fixups(powerpc_firmware_features, + (void *)sect->sh_addr, + (void *)sect->sh_addr + sect->sh_size); +-#endif ++#endif /* CONFIG_PPC64 */ ++ ++#ifdef CONFIG_PPC_BARRIER_NOSPEC ++ sect = find_section(hdr, sechdrs, "__spec_barrier_fixup"); ++ if (sect != NULL) ++ do_barrier_nospec_fixups_range(barrier_nospec_enabled, ++ (void *)sect->sh_addr, ++ (void *)sect->sh_addr + sect->sh_size); ++#endif /* CONFIG_PPC_BARRIER_NOSPEC */ + + sect = find_section(hdr, sechdrs, "__lwsync_fixup"); + if (sect != NULL) +diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c +new file mode 100644 +index 000000000000..fe30ddfd51ee +--- /dev/null ++++ b/arch/powerpc/kernel/security.c +@@ -0,0 +1,434 @@ ++// SPDX-License-Identifier: GPL-2.0+ ++// ++// Security related flags and so on. ++// ++// Copyright 2018, Michael Ellerman, IBM Corporation. 
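The uaccess and syscall-entry hunks above all follow the same shape: a bounds or access_ok() style check on a user-controlled value, then a speculation barrier, then the dependent load. A minimal, self-contained C sketch of that pattern is below; barrier_nospec() is stubbed out as a plain compiler barrier here purely for illustration, since in the real patch the barrier site is nop-patched to "ori 31,31,0" (Book3S) or "isync; sync" (FSL Book3E) at boot.

#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in only: the kernel patches the real barrier site at
 * runtime; a compiler barrier just marks where it sits in the sequence. */
#define barrier_nospec() __asm__ __volatile__("" ::: "memory")

static const uint32_t table[16] = { 0 };

uint32_t lookup(size_t user_index)
{
	uint32_t val = 0;

	if (user_index < 16) {           /* bounds check (cf. access_ok / NR_syscalls tests) */
		barrier_nospec();        /* block speculative use of user_index until the check resolves */
		val = table[user_index]; /* dependent load of the user-controlled index */
	}
	return val;
}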
++ ++#include ++#include ++#include ++#include ++#include ++ ++#include ++#include ++#include ++#include ++#include ++ ++ ++unsigned long powerpc_security_features __read_mostly = SEC_FTR_DEFAULT; ++ ++enum count_cache_flush_type { ++ COUNT_CACHE_FLUSH_NONE = 0x1, ++ COUNT_CACHE_FLUSH_SW = 0x2, ++ COUNT_CACHE_FLUSH_HW = 0x4, ++}; ++static enum count_cache_flush_type count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; ++ ++bool barrier_nospec_enabled; ++static bool no_nospec; ++static bool btb_flush_enabled; ++#ifdef CONFIG_PPC_FSL_BOOK3E ++static bool no_spectrev2; ++#endif ++ ++static void enable_barrier_nospec(bool enable) ++{ ++ barrier_nospec_enabled = enable; ++ do_barrier_nospec_fixups(enable); ++} ++ ++void setup_barrier_nospec(void) ++{ ++ bool enable; ++ ++ /* ++ * It would make sense to check SEC_FTR_SPEC_BAR_ORI31 below as well. ++ * But there's a good reason not to. The two flags we check below are ++ * both are enabled by default in the kernel, so if the hcall is not ++ * functional they will be enabled. ++ * On a system where the host firmware has been updated (so the ori ++ * functions as a barrier), but on which the hypervisor (KVM/Qemu) has ++ * not been updated, we would like to enable the barrier. Dropping the ++ * check for SEC_FTR_SPEC_BAR_ORI31 achieves that. The only downside is ++ * we potentially enable the barrier on systems where the host firmware ++ * is not updated, but that's harmless as it's a no-op. ++ */ ++ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && ++ security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR); ++ ++ if (!no_nospec) ++ enable_barrier_nospec(enable); ++} ++ ++static int __init handle_nospectre_v1(char *p) ++{ ++ no_nospec = true; ++ ++ return 0; ++} ++early_param("nospectre_v1", handle_nospectre_v1); ++ ++#ifdef CONFIG_DEBUG_FS ++static int barrier_nospec_set(void *data, u64 val) ++{ ++ switch (val) { ++ case 0: ++ case 1: ++ break; ++ default: ++ return -EINVAL; ++ } ++ ++ if (!!val == !!barrier_nospec_enabled) ++ return 0; ++ ++ enable_barrier_nospec(!!val); ++ ++ return 0; ++} ++ ++static int barrier_nospec_get(void *data, u64 *val) ++{ ++ *val = barrier_nospec_enabled ? 
1 : 0; ++ return 0; ++} ++ ++DEFINE_SIMPLE_ATTRIBUTE(fops_barrier_nospec, ++ barrier_nospec_get, barrier_nospec_set, "%llu\n"); ++ ++static __init int barrier_nospec_debugfs_init(void) ++{ ++ debugfs_create_file("barrier_nospec", 0600, powerpc_debugfs_root, NULL, ++ &fops_barrier_nospec); ++ return 0; ++} ++device_initcall(barrier_nospec_debugfs_init); ++#endif /* CONFIG_DEBUG_FS */ ++ ++#ifdef CONFIG_PPC_FSL_BOOK3E ++static int __init handle_nospectre_v2(char *p) ++{ ++ no_spectrev2 = true; ++ ++ return 0; ++} ++early_param("nospectre_v2", handle_nospectre_v2); ++void setup_spectre_v2(void) ++{ ++ if (no_spectrev2) ++ do_btb_flush_fixups(); ++ else ++ btb_flush_enabled = true; ++} ++#endif /* CONFIG_PPC_FSL_BOOK3E */ ++ ++#ifdef CONFIG_PPC_BOOK3S_64 ++ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf) ++{ ++ bool thread_priv; ++ ++ thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV); ++ ++ if (rfi_flush || thread_priv) { ++ struct seq_buf s; ++ seq_buf_init(&s, buf, PAGE_SIZE - 1); ++ ++ seq_buf_printf(&s, "Mitigation: "); ++ ++ if (rfi_flush) ++ seq_buf_printf(&s, "RFI Flush"); ++ ++ if (rfi_flush && thread_priv) ++ seq_buf_printf(&s, ", "); ++ ++ if (thread_priv) ++ seq_buf_printf(&s, "L1D private per thread"); ++ ++ seq_buf_printf(&s, "\n"); ++ ++ return s.len; ++ } ++ ++ if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) && ++ !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)) ++ return sprintf(buf, "Not affected\n"); ++ ++ return sprintf(buf, "Vulnerable\n"); ++} ++#endif ++ ++ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf) ++{ ++ struct seq_buf s; ++ ++ seq_buf_init(&s, buf, PAGE_SIZE - 1); ++ ++ if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) { ++ if (barrier_nospec_enabled) ++ seq_buf_printf(&s, "Mitigation: __user pointer sanitization"); ++ else ++ seq_buf_printf(&s, "Vulnerable"); ++ ++ if (security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31)) ++ seq_buf_printf(&s, ", ori31 speculation barrier enabled"); ++ ++ seq_buf_printf(&s, "\n"); ++ } else ++ seq_buf_printf(&s, "Not affected\n"); ++ ++ return s.len; ++} ++ ++ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf) ++{ ++ struct seq_buf s; ++ bool bcs, ccd; ++ ++ seq_buf_init(&s, buf, PAGE_SIZE - 1); ++ ++ bcs = security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED); ++ ccd = security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED); ++ ++ if (bcs || ccd) { ++ seq_buf_printf(&s, "Mitigation: "); ++ ++ if (bcs) ++ seq_buf_printf(&s, "Indirect branch serialisation (kernel only)"); ++ ++ if (bcs && ccd) ++ seq_buf_printf(&s, ", "); ++ ++ if (ccd) ++ seq_buf_printf(&s, "Indirect branch cache disabled"); ++ } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) { ++ seq_buf_printf(&s, "Mitigation: Software count cache flush"); ++ ++ if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW) ++ seq_buf_printf(&s, " (hardware accelerated)"); ++ } else if (btb_flush_enabled) { ++ seq_buf_printf(&s, "Mitigation: Branch predictor state flush"); ++ } else { ++ seq_buf_printf(&s, "Vulnerable"); ++ } ++ ++ seq_buf_printf(&s, "\n"); ++ ++ return s.len; ++} ++ ++#ifdef CONFIG_PPC_BOOK3S_64 ++/* ++ * Store-forwarding barrier support. 
++ */ ++ ++static enum stf_barrier_type stf_enabled_flush_types; ++static bool no_stf_barrier; ++bool stf_barrier; ++ ++static int __init handle_no_stf_barrier(char *p) ++{ ++ pr_info("stf-barrier: disabled on command line."); ++ no_stf_barrier = true; ++ return 0; ++} ++ ++early_param("no_stf_barrier", handle_no_stf_barrier); ++ ++/* This is the generic flag used by other architectures */ ++static int __init handle_ssbd(char *p) ++{ ++ if (!p || strncmp(p, "auto", 5) == 0 || strncmp(p, "on", 2) == 0 ) { ++ /* Until firmware tells us, we have the barrier with auto */ ++ return 0; ++ } else if (strncmp(p, "off", 3) == 0) { ++ handle_no_stf_barrier(NULL); ++ return 0; ++ } else ++ return 1; ++ ++ return 0; ++} ++early_param("spec_store_bypass_disable", handle_ssbd); ++ ++/* This is the generic flag used by other architectures */ ++static int __init handle_no_ssbd(char *p) ++{ ++ handle_no_stf_barrier(NULL); ++ return 0; ++} ++early_param("nospec_store_bypass_disable", handle_no_ssbd); ++ ++static void stf_barrier_enable(bool enable) ++{ ++ if (enable) ++ do_stf_barrier_fixups(stf_enabled_flush_types); ++ else ++ do_stf_barrier_fixups(STF_BARRIER_NONE); ++ ++ stf_barrier = enable; ++} ++ ++void setup_stf_barrier(void) ++{ ++ enum stf_barrier_type type; ++ bool enable, hv; ++ ++ hv = cpu_has_feature(CPU_FTR_HVMODE); ++ ++ /* Default to fallback in case fw-features are not available */ ++ if (cpu_has_feature(CPU_FTR_ARCH_207S)) ++ type = STF_BARRIER_SYNC_ORI; ++ else if (cpu_has_feature(CPU_FTR_ARCH_206)) ++ type = STF_BARRIER_FALLBACK; ++ else ++ type = STF_BARRIER_NONE; ++ ++ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && ++ (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) || ++ (security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) && hv)); ++ ++ if (type == STF_BARRIER_FALLBACK) { ++ pr_info("stf-barrier: fallback barrier available\n"); ++ } else if (type == STF_BARRIER_SYNC_ORI) { ++ pr_info("stf-barrier: hwsync barrier available\n"); ++ } else if (type == STF_BARRIER_EIEIO) { ++ pr_info("stf-barrier: eieio barrier available\n"); ++ } ++ ++ stf_enabled_flush_types = type; ++ ++ if (!no_stf_barrier) ++ stf_barrier_enable(enable); ++} ++ ++ssize_t cpu_show_spec_store_bypass(struct device *dev, struct device_attribute *attr, char *buf) ++{ ++ if (stf_barrier && stf_enabled_flush_types != STF_BARRIER_NONE) { ++ const char *type; ++ switch (stf_enabled_flush_types) { ++ case STF_BARRIER_EIEIO: ++ type = "eieio"; ++ break; ++ case STF_BARRIER_SYNC_ORI: ++ type = "hwsync"; ++ break; ++ case STF_BARRIER_FALLBACK: ++ type = "fallback"; ++ break; ++ default: ++ type = "unknown"; ++ } ++ return sprintf(buf, "Mitigation: Kernel entry/exit barrier (%s)\n", type); ++ } ++ ++ if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) && ++ !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)) ++ return sprintf(buf, "Not affected\n"); ++ ++ return sprintf(buf, "Vulnerable\n"); ++} ++ ++#ifdef CONFIG_DEBUG_FS ++static int stf_barrier_set(void *data, u64 val) ++{ ++ bool enable; ++ ++ if (val == 1) ++ enable = true; ++ else if (val == 0) ++ enable = false; ++ else ++ return -EINVAL; ++ ++ /* Only do anything if we're changing state */ ++ if (enable != stf_barrier) ++ stf_barrier_enable(enable); ++ ++ return 0; ++} ++ ++static int stf_barrier_get(void *data, u64 *val) ++{ ++ *val = stf_barrier ? 
1 : 0; ++ return 0; ++} ++ ++DEFINE_SIMPLE_ATTRIBUTE(fops_stf_barrier, stf_barrier_get, stf_barrier_set, "%llu\n"); ++ ++static __init int stf_barrier_debugfs_init(void) ++{ ++ debugfs_create_file("stf_barrier", 0600, powerpc_debugfs_root, NULL, &fops_stf_barrier); ++ return 0; ++} ++device_initcall(stf_barrier_debugfs_init); ++#endif /* CONFIG_DEBUG_FS */ ++ ++static void toggle_count_cache_flush(bool enable) ++{ ++ if (!enable || !security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) { ++ patch_instruction_site(&patch__call_flush_count_cache, PPC_INST_NOP); ++ count_cache_flush_type = COUNT_CACHE_FLUSH_NONE; ++ pr_info("count-cache-flush: software flush disabled.\n"); ++ return; ++ } ++ ++ patch_branch_site(&patch__call_flush_count_cache, ++ (u64)&flush_count_cache, BRANCH_SET_LINK); ++ ++ if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) { ++ count_cache_flush_type = COUNT_CACHE_FLUSH_SW; ++ pr_info("count-cache-flush: full software flush sequence enabled.\n"); ++ return; ++ } ++ ++ patch_instruction_site(&patch__flush_count_cache_return, PPC_INST_BLR); ++ count_cache_flush_type = COUNT_CACHE_FLUSH_HW; ++ pr_info("count-cache-flush: hardware assisted flush sequence enabled\n"); ++} ++ ++void setup_count_cache_flush(void) ++{ ++ toggle_count_cache_flush(true); ++} ++ ++#ifdef CONFIG_DEBUG_FS ++static int count_cache_flush_set(void *data, u64 val) ++{ ++ bool enable; ++ ++ if (val == 1) ++ enable = true; ++ else if (val == 0) ++ enable = false; ++ else ++ return -EINVAL; ++ ++ toggle_count_cache_flush(enable); ++ ++ return 0; ++} ++ ++static int count_cache_flush_get(void *data, u64 *val) ++{ ++ if (count_cache_flush_type == COUNT_CACHE_FLUSH_NONE) ++ *val = 0; ++ else ++ *val = 1; ++ ++ return 0; ++} ++ ++DEFINE_SIMPLE_ATTRIBUTE(fops_count_cache_flush, count_cache_flush_get, ++ count_cache_flush_set, "%llu\n"); ++ ++static __init int count_cache_flush_debugfs_init(void) ++{ ++ debugfs_create_file("count_cache_flush", 0600, powerpc_debugfs_root, ++ NULL, &fops_count_cache_flush); ++ return 0; ++} ++device_initcall(count_cache_flush_debugfs_init); ++#endif /* CONFIG_DEBUG_FS */ ++#endif /* CONFIG_PPC_BOOK3S_64 */ +diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c +index ad8c9db61237..cb37f27bb928 100644 +--- a/arch/powerpc/kernel/setup_32.c ++++ b/arch/powerpc/kernel/setup_32.c +@@ -322,6 +322,9 @@ void __init setup_arch(char **cmdline_p) + ppc_md.setup_arch(); + if ( ppc_md.progress ) ppc_md.progress("arch: exit", 0x3eab); + ++ setup_barrier_nospec(); ++ setup_spectre_v2(); ++ + paging_init(); + + /* Initialize the MMU context management stuff */ +diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c +index 9eb469bed22b..11590f6cb2f9 100644 +--- a/arch/powerpc/kernel/setup_64.c ++++ b/arch/powerpc/kernel/setup_64.c +@@ -736,6 +736,9 @@ void __init setup_arch(char **cmdline_p) + if (ppc_md.setup_arch) + ppc_md.setup_arch(); + ++ setup_barrier_nospec(); ++ setup_spectre_v2(); ++ + paging_init(); + + /* Initialize the MMU context management stuff */ +@@ -873,9 +876,6 @@ static void do_nothing(void *unused) + + void rfi_flush_enable(bool enable) + { +- if (rfi_flush == enable) +- return; +- + if (enable) { + do_rfi_flush_fixups(enabled_flush_types); + on_each_cpu(do_nothing, NULL, 1); +@@ -885,11 +885,15 @@ void rfi_flush_enable(bool enable) + rfi_flush = enable; + } + +-static void init_fallback_flush(void) ++static void __ref init_fallback_flush(void) + { + u64 l1d_size, limit; + int cpu; + ++ /* Only allocate the fallback flush area once 
(at boot time). */ ++ if (l1d_flush_fallback_area) ++ return; ++ + l1d_size = ppc64_caches.dsize; + limit = min(safe_stack_limit(), ppc64_rma_size); + +@@ -902,34 +906,23 @@ static void init_fallback_flush(void) + memset(l1d_flush_fallback_area, 0, l1d_size * 2); + + for_each_possible_cpu(cpu) { +- /* +- * The fallback flush is currently coded for 8-way +- * associativity. Different associativity is possible, but it +- * will be treated as 8-way and may not evict the lines as +- * effectively. +- * +- * 128 byte lines are mandatory. +- */ +- u64 c = l1d_size / 8; +- + paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area; +- paca[cpu].l1d_flush_congruence = c; +- paca[cpu].l1d_flush_sets = c / 128; ++ paca[cpu].l1d_flush_size = l1d_size; + } + } + +-void __init setup_rfi_flush(enum l1d_flush_type types, bool enable) ++void setup_rfi_flush(enum l1d_flush_type types, bool enable) + { + if (types & L1D_FLUSH_FALLBACK) { +- pr_info("rfi-flush: Using fallback displacement flush\n"); ++ pr_info("rfi-flush: fallback displacement flush available\n"); + init_fallback_flush(); + } + + if (types & L1D_FLUSH_ORI) +- pr_info("rfi-flush: Using ori type flush\n"); ++ pr_info("rfi-flush: ori type flush available\n"); + + if (types & L1D_FLUSH_MTTRIG) +- pr_info("rfi-flush: Using mttrig type flush\n"); ++ pr_info("rfi-flush: mttrig type flush available\n"); + + enabled_flush_types = types; + +@@ -940,13 +933,19 @@ void __init setup_rfi_flush(enum l1d_flush_type types, bool enable) + #ifdef CONFIG_DEBUG_FS + static int rfi_flush_set(void *data, u64 val) + { ++ bool enable; ++ + if (val == 1) +- rfi_flush_enable(true); ++ enable = true; + else if (val == 0) +- rfi_flush_enable(false); ++ enable = false; + else + return -EINVAL; + ++ /* Only do anything if we're changing state */ ++ if (enable != rfi_flush) ++ rfi_flush_enable(enable); ++ + return 0; + } + +@@ -965,12 +964,4 @@ static __init int rfi_flush_debugfs_init(void) + } + device_initcall(rfi_flush_debugfs_init); + #endif +- +-ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf) +-{ +- if (rfi_flush) +- return sprintf(buf, "Mitigation: RFI Flush\n"); +- +- return sprintf(buf, "Vulnerable\n"); +-} + #endif /* CONFIG_PPC_BOOK3S_64 */ +diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S +index 072a23a17350..876ac9d52afc 100644 +--- a/arch/powerpc/kernel/vmlinux.lds.S ++++ b/arch/powerpc/kernel/vmlinux.lds.S +@@ -73,14 +73,45 @@ SECTIONS + RODATA + + #ifdef CONFIG_PPC64 ++ . = ALIGN(8); ++ __stf_entry_barrier_fixup : AT(ADDR(__stf_entry_barrier_fixup) - LOAD_OFFSET) { ++ __start___stf_entry_barrier_fixup = .; ++ *(__stf_entry_barrier_fixup) ++ __stop___stf_entry_barrier_fixup = .; ++ } ++ ++ . = ALIGN(8); ++ __stf_exit_barrier_fixup : AT(ADDR(__stf_exit_barrier_fixup) - LOAD_OFFSET) { ++ __start___stf_exit_barrier_fixup = .; ++ *(__stf_exit_barrier_fixup) ++ __stop___stf_exit_barrier_fixup = .; ++ } ++ + . = ALIGN(8); + __rfi_flush_fixup : AT(ADDR(__rfi_flush_fixup) - LOAD_OFFSET) { + __start___rfi_flush_fixup = .; + *(__rfi_flush_fixup) + __stop___rfi_flush_fixup = .; + } +-#endif ++#endif /* CONFIG_PPC64 */ + ++#ifdef CONFIG_PPC_BARRIER_NOSPEC ++ . = ALIGN(8); ++ __spec_barrier_fixup : AT(ADDR(__spec_barrier_fixup) - LOAD_OFFSET) { ++ __start___barrier_nospec_fixup = .; ++ *(__barrier_nospec_fixup) ++ __stop___barrier_nospec_fixup = .; ++ } ++#endif /* CONFIG_PPC_BARRIER_NOSPEC */ ++ ++#ifdef CONFIG_PPC_FSL_BOOK3E ++ . 
= ALIGN(8); ++ __spec_btb_flush_fixup : AT(ADDR(__spec_btb_flush_fixup) - LOAD_OFFSET) { ++ __start__btb_flush_fixup = .; ++ *(__btb_flush_fixup) ++ __stop__btb_flush_fixup = .; ++ } ++#endif + EXCEPTION_TABLE(0) + + NOTES :kernel :notes +diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S +index 81bd8a07aa51..612b7f6a887f 100644 +--- a/arch/powerpc/kvm/bookehv_interrupts.S ++++ b/arch/powerpc/kvm/bookehv_interrupts.S +@@ -75,6 +75,10 @@ + PPC_LL r1, VCPU_HOST_STACK(r4) + PPC_LL r2, HOST_R2(r1) + ++START_BTB_FLUSH_SECTION ++ BTB_FLUSH(r10) ++END_BTB_FLUSH_SECTION ++ + mfspr r10, SPRN_PID + lwz r8, VCPU_HOST_PID(r4) + PPC_LL r11, VCPU_SHARED(r4) +diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c +index 990db69a1d0b..fa88f641ac03 100644 +--- a/arch/powerpc/kvm/e500_emulate.c ++++ b/arch/powerpc/kvm/e500_emulate.c +@@ -277,6 +277,13 @@ int kvmppc_core_emulate_mtspr_e500(struct kvm_vcpu *vcpu, int sprn, ulong spr_va + vcpu->arch.pwrmgtcr0 = spr_val; + break; + ++ case SPRN_BUCSR: ++ /* ++ * If we are here, it means that we have already flushed the ++ * branch predictor, so just return to guest. ++ */ ++ break; ++ + /* extra exceptions */ + #ifdef CONFIG_SPE_POSSIBLE + case SPRN_IVOR32: +diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c +index d5edbeb8eb82..31d31a10f71f 100644 +--- a/arch/powerpc/lib/code-patching.c ++++ b/arch/powerpc/lib/code-patching.c +@@ -14,12 +14,25 @@ + #include + #include + #include ++#include ++#include + + ++static inline bool is_init(unsigned int *addr) ++{ ++ return addr >= (unsigned int *)__init_begin && addr < (unsigned int *)__init_end; ++} ++ + int patch_instruction(unsigned int *addr, unsigned int instr) + { + int err; + ++ /* Make sure we aren't patching a freed init section */ ++ if (*PTRRELOC(&init_mem_is_free) && is_init(addr)) { ++ pr_debug("Skipping init section patching addr: 0x%px\n", addr); ++ return 0; ++ } ++ + __put_user_size(instr, addr, 4, err); + if (err) + return err; +@@ -32,6 +45,22 @@ int patch_branch(unsigned int *addr, unsigned long target, int flags) + return patch_instruction(addr, create_branch(addr, target, flags)); + } + ++int patch_branch_site(s32 *site, unsigned long target, int flags) ++{ ++ unsigned int *addr; ++ ++ addr = (unsigned int *)((unsigned long)site + *site); ++ return patch_instruction(addr, create_branch(addr, target, flags)); ++} ++ ++int patch_instruction_site(s32 *site, unsigned int instr) ++{ ++ unsigned int *addr; ++ ++ addr = (unsigned int *)((unsigned long)site + *site); ++ return patch_instruction(addr, instr); ++} ++ + unsigned int create_branch(const unsigned int *addr, + unsigned long target, int flags) + { +diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c +index 3af014684872..7bdfc19a491d 100644 +--- a/arch/powerpc/lib/feature-fixups.c ++++ b/arch/powerpc/lib/feature-fixups.c +@@ -21,7 +21,7 @@ + #include + #include + #include +- ++#include + + struct fixup_entry { + unsigned long mask; +@@ -115,6 +115,120 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end) + } + + #ifdef CONFIG_PPC_BOOK3S_64 ++void do_stf_entry_barrier_fixups(enum stf_barrier_type types) ++{ ++ unsigned int instrs[3], *dest; ++ long *start, *end; ++ int i; ++ ++ start = PTRRELOC(&__start___stf_entry_barrier_fixup), ++ end = PTRRELOC(&__stop___stf_entry_barrier_fixup); ++ ++ instrs[0] = 0x60000000; /* nop */ ++ instrs[1] = 0x60000000; /* nop */ ++ instrs[2] = 0x60000000; /* nop 
*/ ++ ++ i = 0; ++ if (types & STF_BARRIER_FALLBACK) { ++ instrs[i++] = 0x7d4802a6; /* mflr r10 */ ++ instrs[i++] = 0x60000000; /* branch patched below */ ++ instrs[i++] = 0x7d4803a6; /* mtlr r10 */ ++ } else if (types & STF_BARRIER_EIEIO) { ++ instrs[i++] = 0x7e0006ac; /* eieio + bit 6 hint */ ++ } else if (types & STF_BARRIER_SYNC_ORI) { ++ instrs[i++] = 0x7c0004ac; /* hwsync */ ++ instrs[i++] = 0xe94d0000; /* ld r10,0(r13) */ ++ instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */ ++ } ++ ++ for (i = 0; start < end; start++, i++) { ++ dest = (void *)start + *start; ++ ++ pr_devel("patching dest %lx\n", (unsigned long)dest); ++ ++ patch_instruction(dest, instrs[0]); ++ ++ if (types & STF_BARRIER_FALLBACK) ++ patch_branch(dest + 1, (unsigned long)&stf_barrier_fallback, ++ BRANCH_SET_LINK); ++ else ++ patch_instruction(dest + 1, instrs[1]); ++ ++ patch_instruction(dest + 2, instrs[2]); ++ } ++ ++ printk(KERN_DEBUG "stf-barrier: patched %d entry locations (%s barrier)\n", i, ++ (types == STF_BARRIER_NONE) ? "no" : ++ (types == STF_BARRIER_FALLBACK) ? "fallback" : ++ (types == STF_BARRIER_EIEIO) ? "eieio" : ++ (types == (STF_BARRIER_SYNC_ORI)) ? "hwsync" ++ : "unknown"); ++} ++ ++void do_stf_exit_barrier_fixups(enum stf_barrier_type types) ++{ ++ unsigned int instrs[6], *dest; ++ long *start, *end; ++ int i; ++ ++ start = PTRRELOC(&__start___stf_exit_barrier_fixup), ++ end = PTRRELOC(&__stop___stf_exit_barrier_fixup); ++ ++ instrs[0] = 0x60000000; /* nop */ ++ instrs[1] = 0x60000000; /* nop */ ++ instrs[2] = 0x60000000; /* nop */ ++ instrs[3] = 0x60000000; /* nop */ ++ instrs[4] = 0x60000000; /* nop */ ++ instrs[5] = 0x60000000; /* nop */ ++ ++ i = 0; ++ if (types & STF_BARRIER_FALLBACK || types & STF_BARRIER_SYNC_ORI) { ++ if (cpu_has_feature(CPU_FTR_HVMODE)) { ++ instrs[i++] = 0x7db14ba6; /* mtspr 0x131, r13 (HSPRG1) */ ++ instrs[i++] = 0x7db04aa6; /* mfspr r13, 0x130 (HSPRG0) */ ++ } else { ++ instrs[i++] = 0x7db243a6; /* mtsprg 2,r13 */ ++ instrs[i++] = 0x7db142a6; /* mfsprg r13,1 */ ++ } ++ instrs[i++] = 0x7c0004ac; /* hwsync */ ++ instrs[i++] = 0xe9ad0000; /* ld r13,0(r13) */ ++ instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */ ++ if (cpu_has_feature(CPU_FTR_HVMODE)) { ++ instrs[i++] = 0x7db14aa6; /* mfspr r13, 0x131 (HSPRG1) */ ++ } else { ++ instrs[i++] = 0x7db242a6; /* mfsprg r13,2 */ ++ } ++ } else if (types & STF_BARRIER_EIEIO) { ++ instrs[i++] = 0x7e0006ac; /* eieio + bit 6 hint */ ++ } ++ ++ for (i = 0; start < end; start++, i++) { ++ dest = (void *)start + *start; ++ ++ pr_devel("patching dest %lx\n", (unsigned long)dest); ++ ++ patch_instruction(dest, instrs[0]); ++ patch_instruction(dest + 1, instrs[1]); ++ patch_instruction(dest + 2, instrs[2]); ++ patch_instruction(dest + 3, instrs[3]); ++ patch_instruction(dest + 4, instrs[4]); ++ patch_instruction(dest + 5, instrs[5]); ++ } ++ printk(KERN_DEBUG "stf-barrier: patched %d exit locations (%s barrier)\n", i, ++ (types == STF_BARRIER_NONE) ? "no" : ++ (types == STF_BARRIER_FALLBACK) ? "fallback" : ++ (types == STF_BARRIER_EIEIO) ? "eieio" : ++ (types == (STF_BARRIER_SYNC_ORI)) ? 
"hwsync" ++ : "unknown"); ++} ++ ++ ++void do_stf_barrier_fixups(enum stf_barrier_type types) ++{ ++ do_stf_entry_barrier_fixups(types); ++ do_stf_exit_barrier_fixups(types); ++} ++ + void do_rfi_flush_fixups(enum l1d_flush_type types) + { + unsigned int instrs[3], *dest; +@@ -151,10 +265,110 @@ void do_rfi_flush_fixups(enum l1d_flush_type types) + patch_instruction(dest + 2, instrs[2]); + } + +- printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i); ++ printk(KERN_DEBUG "rfi-flush: patched %d locations (%s flush)\n", i, ++ (types == L1D_FLUSH_NONE) ? "no" : ++ (types == L1D_FLUSH_FALLBACK) ? "fallback displacement" : ++ (types & L1D_FLUSH_ORI) ? (types & L1D_FLUSH_MTTRIG) ++ ? "ori+mttrig type" ++ : "ori type" : ++ (types & L1D_FLUSH_MTTRIG) ? "mttrig type" ++ : "unknown"); ++} ++ ++void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end) ++{ ++ unsigned int instr, *dest; ++ long *start, *end; ++ int i; ++ ++ start = fixup_start; ++ end = fixup_end; ++ ++ instr = 0x60000000; /* nop */ ++ ++ if (enable) { ++ pr_info("barrier-nospec: using ORI speculation barrier\n"); ++ instr = 0x63ff0000; /* ori 31,31,0 speculation barrier */ ++ } ++ ++ for (i = 0; start < end; start++, i++) { ++ dest = (void *)start + *start; ++ ++ pr_devel("patching dest %lx\n", (unsigned long)dest); ++ patch_instruction(dest, instr); ++ } ++ ++ printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i); + } ++ + #endif /* CONFIG_PPC_BOOK3S_64 */ + ++#ifdef CONFIG_PPC_BARRIER_NOSPEC ++void do_barrier_nospec_fixups(bool enable) ++{ ++ void *start, *end; ++ ++ start = PTRRELOC(&__start___barrier_nospec_fixup), ++ end = PTRRELOC(&__stop___barrier_nospec_fixup); ++ ++ do_barrier_nospec_fixups_range(enable, start, end); ++} ++#endif /* CONFIG_PPC_BARRIER_NOSPEC */ ++ ++#ifdef CONFIG_PPC_FSL_BOOK3E ++void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end) ++{ ++ unsigned int instr[2], *dest; ++ long *start, *end; ++ int i; ++ ++ start = fixup_start; ++ end = fixup_end; ++ ++ instr[0] = PPC_INST_NOP; ++ instr[1] = PPC_INST_NOP; ++ ++ if (enable) { ++ pr_info("barrier-nospec: using isync; sync as speculation barrier\n"); ++ instr[0] = PPC_INST_ISYNC; ++ instr[1] = PPC_INST_SYNC; ++ } ++ ++ for (i = 0; start < end; start++, i++) { ++ dest = (void *)start + *start; ++ ++ pr_devel("patching dest %lx\n", (unsigned long)dest); ++ patch_instruction(dest, instr[0]); ++ patch_instruction(dest + 1, instr[1]); ++ } ++ ++ printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i); ++} ++ ++static void patch_btb_flush_section(long *curr) ++{ ++ unsigned int *start, *end; ++ ++ start = (void *)curr + *curr; ++ end = (void *)curr + *(curr + 1); ++ for (; start < end; start++) { ++ pr_devel("patching dest %lx\n", (unsigned long)start); ++ patch_instruction(start, PPC_INST_NOP); ++ } ++} ++ ++void do_btb_flush_fixups(void) ++{ ++ long *start, *end; ++ ++ start = PTRRELOC(&__start__btb_flush_fixup); ++ end = PTRRELOC(&__stop__btb_flush_fixup); ++ ++ for (; start < end; start += 2) ++ patch_btb_flush_section(start); ++} ++#endif /* CONFIG_PPC_FSL_BOOK3E */ ++ + void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end) + { + long *start, *end; +diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c +index 22d94c3e6fc4..1efe5ca5c3bc 100644 +--- a/arch/powerpc/mm/mem.c ++++ b/arch/powerpc/mm/mem.c +@@ -62,6 +62,7 @@ + #endif + + unsigned long long memory_limit; ++bool init_mem_is_free; + + #ifdef CONFIG_HIGHMEM + pte_t *kmap_pte; +@@ -381,6 +382,7 @@ void 
__init mem_init(void) + void free_initmem(void) + { + ppc_md.progress = ppc_printk_progress; ++ init_mem_is_free = true; + free_initmem_default(POISON_FREE_INITMEM); + } + +diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S +index 29d6987c37ba..5486d56da289 100644 +--- a/arch/powerpc/mm/tlb_low_64e.S ++++ b/arch/powerpc/mm/tlb_low_64e.S +@@ -69,6 +69,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV) + std r15,EX_TLB_R15(r12) + std r10,EX_TLB_CR(r12) + #ifdef CONFIG_PPC_FSL_BOOK3E ++START_BTB_FLUSH_SECTION ++ mfspr r11, SPRN_SRR1 ++ andi. r10,r11,MSR_PR ++ beq 1f ++ BTB_FLUSH(r10) ++1: ++END_BTB_FLUSH_SECTION + std r7,EX_TLB_R7(r12) + #endif + TLB_MISS_PROLOG_STATS +diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c +index c57afc619b20..e14b52c7ebd8 100644 +--- a/arch/powerpc/platforms/powernv/setup.c ++++ b/arch/powerpc/platforms/powernv/setup.c +@@ -37,53 +37,99 @@ + #include + #include + #include ++#include + + #include "powernv.h" + ++ ++static bool fw_feature_is(const char *state, const char *name, ++ struct device_node *fw_features) ++{ ++ struct device_node *np; ++ bool rc = false; ++ ++ np = of_get_child_by_name(fw_features, name); ++ if (np) { ++ rc = of_property_read_bool(np, state); ++ of_node_put(np); ++ } ++ ++ return rc; ++} ++ ++static void init_fw_feat_flags(struct device_node *np) ++{ ++ if (fw_feature_is("enabled", "inst-spec-barrier-ori31,31,0", np)) ++ security_ftr_set(SEC_FTR_SPEC_BAR_ORI31); ++ ++ if (fw_feature_is("enabled", "fw-bcctrl-serialized", np)) ++ security_ftr_set(SEC_FTR_BCCTRL_SERIALISED); ++ ++ if (fw_feature_is("enabled", "inst-l1d-flush-ori30,30,0", np)) ++ security_ftr_set(SEC_FTR_L1D_FLUSH_ORI30); ++ ++ if (fw_feature_is("enabled", "inst-l1d-flush-trig2", np)) ++ security_ftr_set(SEC_FTR_L1D_FLUSH_TRIG2); ++ ++ if (fw_feature_is("enabled", "fw-l1d-thread-split", np)) ++ security_ftr_set(SEC_FTR_L1D_THREAD_PRIV); ++ ++ if (fw_feature_is("enabled", "fw-count-cache-disabled", np)) ++ security_ftr_set(SEC_FTR_COUNT_CACHE_DISABLED); ++ ++ if (fw_feature_is("enabled", "fw-count-cache-flush-bcctr2,0,0", np)) ++ security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST); ++ ++ if (fw_feature_is("enabled", "needs-count-cache-flush-on-context-switch", np)) ++ security_ftr_set(SEC_FTR_FLUSH_COUNT_CACHE); ++ ++ /* ++ * The features below are enabled by default, so we instead look to see ++ * if firmware has *disabled* them, and clear them if so. 
++ */ ++ if (fw_feature_is("disabled", "speculation-policy-favor-security", np)) ++ security_ftr_clear(SEC_FTR_FAVOUR_SECURITY); ++ ++ if (fw_feature_is("disabled", "needs-l1d-flush-msr-pr-0-to-1", np)) ++ security_ftr_clear(SEC_FTR_L1D_FLUSH_PR); ++ ++ if (fw_feature_is("disabled", "needs-l1d-flush-msr-hv-1-to-0", np)) ++ security_ftr_clear(SEC_FTR_L1D_FLUSH_HV); ++ ++ if (fw_feature_is("disabled", "needs-spec-barrier-for-bound-checks", np)) ++ security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR); ++} ++ + static void pnv_setup_rfi_flush(void) + { + struct device_node *np, *fw_features; + enum l1d_flush_type type; +- int enable; ++ bool enable; + + /* Default to fallback in case fw-features are not available */ + type = L1D_FLUSH_FALLBACK; +- enable = 1; + + np = of_find_node_by_name(NULL, "ibm,opal"); + fw_features = of_get_child_by_name(np, "fw-features"); + of_node_put(np); + + if (fw_features) { +- np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2"); +- if (np && of_property_read_bool(np, "enabled")) +- type = L1D_FLUSH_MTTRIG; ++ init_fw_feat_flags(fw_features); ++ of_node_put(fw_features); + +- of_node_put(np); ++ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_TRIG2)) ++ type = L1D_FLUSH_MTTRIG; + +- np = of_get_child_by_name(fw_features, "inst-l1d-flush-ori30,30,0"); +- if (np && of_property_read_bool(np, "enabled")) ++ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_ORI30)) + type = L1D_FLUSH_ORI; +- +- of_node_put(np); +- +- /* Enable unless firmware says NOT to */ +- enable = 2; +- np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-hv-1-to-0"); +- if (np && of_property_read_bool(np, "disabled")) +- enable--; +- +- of_node_put(np); +- +- np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-pr-0-to-1"); +- if (np && of_property_read_bool(np, "disabled")) +- enable--; +- +- of_node_put(np); +- of_node_put(fw_features); + } + +- setup_rfi_flush(type, enable > 0); ++ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \ ++ (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) || \ ++ security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV)); ++ ++ setup_rfi_flush(type, enable); ++ setup_count_cache_flush(); + } + + static void __init pnv_setup_arch(void) +@@ -91,6 +137,7 @@ static void __init pnv_setup_arch(void) + set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT); + + pnv_setup_rfi_flush(); ++ setup_stf_barrier(); + + /* Initialize SMP */ + pnv_smp_init(); +diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c +index 8dd0c8edefd6..c773396d0969 100644 +--- a/arch/powerpc/platforms/pseries/mobility.c ++++ b/arch/powerpc/platforms/pseries/mobility.c +@@ -314,6 +314,9 @@ void post_mobility_fixup(void) + printk(KERN_ERR "Post-mobility device tree update " + "failed: %d\n", rc); + ++ /* Possibly switch to a new RFI flush type */ ++ pseries_setup_rfi_flush(); ++ + return; + } + +diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h +index 8411c27293e4..e7d80797384d 100644 +--- a/arch/powerpc/platforms/pseries/pseries.h ++++ b/arch/powerpc/platforms/pseries/pseries.h +@@ -81,4 +81,6 @@ extern struct pci_controller_ops pseries_pci_controller_ops; + + unsigned long pseries_memory_block_size(void); + ++void pseries_setup_rfi_flush(void); ++ + #endif /* _PSERIES_PSERIES_H */ +diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c +index dd2545fc9947..9cc976ff7fec 100644 +--- a/arch/powerpc/platforms/pseries/setup.c ++++ b/arch/powerpc/platforms/pseries/setup.c +@@ -67,6 +67,7 @@ + 
#include + #include + #include ++#include + + #include "pseries.h" + +@@ -499,37 +500,87 @@ static void __init find_and_init_phbs(void) + of_pci_check_probe_only(); + } + +-static void pseries_setup_rfi_flush(void) ++static void init_cpu_char_feature_flags(struct h_cpu_char_result *result) ++{ ++ /* ++ * The features below are disabled by default, so we instead look to see ++ * if firmware has *enabled* them, and set them if so. ++ */ ++ if (result->character & H_CPU_CHAR_SPEC_BAR_ORI31) ++ security_ftr_set(SEC_FTR_SPEC_BAR_ORI31); ++ ++ if (result->character & H_CPU_CHAR_BCCTRL_SERIALISED) ++ security_ftr_set(SEC_FTR_BCCTRL_SERIALISED); ++ ++ if (result->character & H_CPU_CHAR_L1D_FLUSH_ORI30) ++ security_ftr_set(SEC_FTR_L1D_FLUSH_ORI30); ++ ++ if (result->character & H_CPU_CHAR_L1D_FLUSH_TRIG2) ++ security_ftr_set(SEC_FTR_L1D_FLUSH_TRIG2); ++ ++ if (result->character & H_CPU_CHAR_L1D_THREAD_PRIV) ++ security_ftr_set(SEC_FTR_L1D_THREAD_PRIV); ++ ++ if (result->character & H_CPU_CHAR_COUNT_CACHE_DISABLED) ++ security_ftr_set(SEC_FTR_COUNT_CACHE_DISABLED); ++ ++ if (result->character & H_CPU_CHAR_BCCTR_FLUSH_ASSIST) ++ security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST); ++ ++ if (result->behaviour & H_CPU_BEHAV_FLUSH_COUNT_CACHE) ++ security_ftr_set(SEC_FTR_FLUSH_COUNT_CACHE); ++ ++ /* ++ * The features below are enabled by default, so we instead look to see ++ * if firmware has *disabled* them, and clear them if so. ++ */ ++ if (!(result->behaviour & H_CPU_BEHAV_FAVOUR_SECURITY)) ++ security_ftr_clear(SEC_FTR_FAVOUR_SECURITY); ++ ++ if (!(result->behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)) ++ security_ftr_clear(SEC_FTR_L1D_FLUSH_PR); ++ ++ if (!(result->behaviour & H_CPU_BEHAV_BNDS_CHK_SPEC_BAR)) ++ security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR); ++} ++ ++void pseries_setup_rfi_flush(void) + { + struct h_cpu_char_result result; + enum l1d_flush_type types; + bool enable; + long rc; + +- /* Enable by default */ +- enable = true; ++ /* ++ * Set features to the defaults assumed by init_cpu_char_feature_flags() ++ * so it can set/clear again any features that might have changed after ++ * migration, and in case the hypercall fails and it is not even called. ++ */ ++ powerpc_security_features = SEC_FTR_DEFAULT; + + rc = plpar_get_cpu_characteristics(&result); +- if (rc == H_SUCCESS) { +- types = L1D_FLUSH_NONE; ++ if (rc == H_SUCCESS) ++ init_cpu_char_feature_flags(&result); + +- if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2) +- types |= L1D_FLUSH_MTTRIG; +- if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30) +- types |= L1D_FLUSH_ORI; ++ /* ++ * We're the guest so this doesn't apply to us, clear it to simplify ++ * handling of it elsewhere. 
++ */ ++ security_ftr_clear(SEC_FTR_L1D_FLUSH_HV); + +- /* Use fallback if nothing set in hcall */ +- if (types == L1D_FLUSH_NONE) +- types = L1D_FLUSH_FALLBACK; ++ types = L1D_FLUSH_FALLBACK; + +- if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)) +- enable = false; +- } else { +- /* Default to fallback if case hcall is not available */ +- types = L1D_FLUSH_FALLBACK; +- } ++ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_TRIG2)) ++ types |= L1D_FLUSH_MTTRIG; ++ ++ if (security_ftr_enabled(SEC_FTR_L1D_FLUSH_ORI30)) ++ types |= L1D_FLUSH_ORI; ++ ++ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \ ++ security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR); + + setup_rfi_flush(types, enable); ++ setup_count_cache_flush(); + } + + static void __init pSeries_setup_arch(void) +@@ -549,6 +600,7 @@ static void __init pSeries_setup_arch(void) + fwnmi_init(); + + pseries_setup_rfi_flush(); ++ setup_stf_barrier(); + + /* By default, only probe PCI (can be overridden by rtas_pci) */ + pci_add_flags(PCI_PROBE_ONLY); +diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c +index 786bf01691c9..83619ebede93 100644 +--- a/arch/powerpc/xmon/xmon.c ++++ b/arch/powerpc/xmon/xmon.c +@@ -2144,6 +2144,8 @@ static void dump_one_paca(int cpu) + DUMP(p, slb_cache_ptr, "x"); + for (i = 0; i < SLB_CACHE_ENTRIES; i++) + printf(" slb_cache[%d]: = 0x%016lx\n", i, p->slb_cache[i]); ++ ++ DUMP(p, rfi_flush_fallback_area, "px"); + #endif + DUMP(p, dscr_default, "llx"); + #ifdef CONFIG_PPC_BOOK3E +diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig +index 4598d087dec2..4d1262cf630c 100644 +--- a/arch/x86/Kconfig ++++ b/arch/x86/Kconfig +@@ -893,13 +893,7 @@ config NR_CPUS + approximately eight kilobytes to the kernel image. + + config SCHED_SMT +- bool "SMT (Hyperthreading) scheduler support" +- depends on SMP +- ---help--- +- SMT scheduler support improves the CPU scheduler's decision making +- when dealing with Intel Pentium 4 chips with HyperThreading at a +- cost of slightly increased overhead in some places. If unsure say +- N here. 
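The pseries changes above translate the H_GET_CPU_CHARACTERISTICS result into kernel-internal security feature flags before picking an L1D flush type: disabled-by-default features are set when firmware reports them, enabled-by-default features are cleared when firmware disables them. A standalone sketch of that mapping pattern follows; the bit values and names here are hypothetical placeholders, not the real H_CPU_CHAR_*/SEC_FTR_* constants from the patched headers.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for illustration only. */
#define CHAR_L1D_FLUSH_TRIG2   (1ull << 0)
#define CHAR_L1D_FLUSH_ORI30   (1ull << 1)
#define BEHAV_L1D_FLUSH_PR     (1ull << 2)
#define BEHAV_FAVOUR_SECURITY  (1ull << 3)

#define FTR_L1D_FLUSH_TRIG2    (1ull << 0)
#define FTR_L1D_FLUSH_ORI30    (1ull << 1)
#define FTR_L1D_FLUSH_PR       (1ull << 2)
#define FTR_FAVOUR_SECURITY    (1ull << 3)

/* Enabled-by-default features start out set, as in SEC_FTR_DEFAULT. */
static uint64_t security_features = FTR_L1D_FLUSH_PR | FTR_FAVOUR_SECURITY;

void init_feature_flags(uint64_t character, uint64_t behaviour)
{
	/* Disabled by default: set only if firmware reports support. */
	if (character & CHAR_L1D_FLUSH_TRIG2)
		security_features |= FTR_L1D_FLUSH_TRIG2;
	if (character & CHAR_L1D_FLUSH_ORI30)
		security_features |= FTR_L1D_FLUSH_ORI30;

	/* Enabled by default: clear only if firmware disabled them. */
	if (!(behaviour & BEHAV_L1D_FLUSH_PR))
		security_features &= ~FTR_L1D_FLUSH_PR;
	if (!(behaviour & BEHAV_FAVOUR_SECURITY))
		security_features &= ~FTR_FAVOUR_SECURITY;
}

bool flush_needed(void)
{
	return (security_features & FTR_FAVOUR_SECURITY) &&
	       (security_features & FTR_L1D_FLUSH_PR);
}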
++ def_bool y if SMP + + config SCHED_MC + def_bool y +diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c +index 071582a3b5c0..57be07f27f37 100644 +--- a/arch/x86/entry/common.c ++++ b/arch/x86/entry/common.c +@@ -28,6 +28,7 @@ + #include + #include + #include ++#include + + #define CREATE_TRACE_POINTS + #include +@@ -295,6 +296,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs) + #endif + + user_enter(); ++ ++ mds_user_clear_cpu_buffers(); + } + + #define SYSCALL_EXIT_WORK_FLAGS \ +diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile +index 297dda4d5947..15ed35f5097d 100644 +--- a/arch/x86/entry/vdso/Makefile ++++ b/arch/x86/entry/vdso/Makefile +@@ -159,7 +159,8 @@ quiet_cmd_vdso = VDSO $@ + sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' + + VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ +- $(call ld-option, --build-id) -Bsymbolic ++ $(call ld-option, --build-id) $(call ld-option, --eh-frame-hdr) \ ++ -Bsymbolic + GCOV_PROFILE := n + + # +diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h +index a5fa3195a230..d9f7d1770e98 100644 +--- a/arch/x86/include/asm/cpufeatures.h ++++ b/arch/x86/include/asm/cpufeatures.h +@@ -214,6 +214,7 @@ + #define X86_FEATURE_STIBP ( 7*32+27) /* Single Thread Indirect Branch Predictors */ + #define X86_FEATURE_ZEN ( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */ + #define X86_FEATURE_L1TF_PTEINV ( 7*32+29) /* "" L1TF workaround PTE inversion */ ++#define X86_FEATURE_IBRS_ENHANCED ( 7*32+30) /* Enhanced IBRS */ + + /* Virtualization flags: Linux defined, word 8 */ + #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ +@@ -265,10 +266,12 @@ + + /* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */ + #define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */ +-#define X86_FEATURE_AMD_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */ +-#define X86_FEATURE_AMD_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */ +-#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */ ++#define X86_FEATURE_AMD_IBPB (13*32+12) /* "" Indirect Branch Prediction Barrier */ ++#define X86_FEATURE_AMD_IBRS (13*32+14) /* "" Indirect Branch Restricted Speculation */ ++#define X86_FEATURE_AMD_STIBP (13*32+15) /* "" Single Thread Indirect Branch Predictors */ ++#define X86_FEATURE_AMD_SSBD (13*32+24) /* "" Speculative Store Bypass Disable */ + #define X86_FEATURE_VIRT_SSBD (13*32+25) /* Virtualized Speculative Store Bypass Disable */ ++#define X86_FEATURE_AMD_SSB_NO (13*32+26) /* "" Speculative Store Bypass is fixed in hardware. 
*/ + + /* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */ + #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ +@@ -307,6 +310,7 @@ + /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */ + #define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */ + #define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */ ++#define X86_FEATURE_MD_CLEAR (18*32+10) /* VERW clears CPU buffers */ + #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */ + #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */ + #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */ +@@ -332,5 +336,7 @@ + #define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */ + #define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */ + #define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */ ++#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */ ++#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */ + + #endif /* _ASM_X86_CPUFEATURES_H */ +diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h +index e13ff5a14633..6801f958e254 100644 +--- a/arch/x86/include/asm/intel-family.h ++++ b/arch/x86/include/asm/intel-family.h +@@ -50,19 +50,23 @@ + + /* "Small Core" Processors (Atom) */ + +-#define INTEL_FAM6_ATOM_PINEVIEW 0x1C +-#define INTEL_FAM6_ATOM_LINCROFT 0x26 +-#define INTEL_FAM6_ATOM_PENWELL 0x27 +-#define INTEL_FAM6_ATOM_CLOVERVIEW 0x35 +-#define INTEL_FAM6_ATOM_CEDARVIEW 0x36 +-#define INTEL_FAM6_ATOM_SILVERMONT1 0x37 /* BayTrail/BYT / Valleyview */ +-#define INTEL_FAM6_ATOM_SILVERMONT2 0x4D /* Avaton/Rangely */ +-#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* CherryTrail / Braswell */ +-#define INTEL_FAM6_ATOM_MERRIFIELD 0x4A /* Tangier */ +-#define INTEL_FAM6_ATOM_MOOREFIELD 0x5A /* Annidale */ +-#define INTEL_FAM6_ATOM_GOLDMONT 0x5C +-#define INTEL_FAM6_ATOM_DENVERTON 0x5F /* Goldmont Microserver */ +-#define INTEL_FAM6_ATOM_GEMINI_LAKE 0x7A ++#define INTEL_FAM6_ATOM_BONNELL 0x1C /* Diamondville, Pineview */ ++#define INTEL_FAM6_ATOM_BONNELL_MID 0x26 /* Silverthorne, Lincroft */ ++ ++#define INTEL_FAM6_ATOM_SALTWELL 0x36 /* Cedarview */ ++#define INTEL_FAM6_ATOM_SALTWELL_MID 0x27 /* Penwell */ ++#define INTEL_FAM6_ATOM_SALTWELL_TABLET 0x35 /* Cloverview */ ++ ++#define INTEL_FAM6_ATOM_SILVERMONT 0x37 /* Bay Trail, Valleyview */ ++#define INTEL_FAM6_ATOM_SILVERMONT_X 0x4D /* Avaton, Rangely */ ++#define INTEL_FAM6_ATOM_SILVERMONT_MID 0x4A /* Merriefield */ ++ ++#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* Cherry Trail, Braswell */ ++#define INTEL_FAM6_ATOM_AIRMONT_MID 0x5A /* Moorefield */ ++ ++#define INTEL_FAM6_ATOM_GOLDMONT 0x5C /* Apollo Lake */ ++#define INTEL_FAM6_ATOM_GOLDMONT_X 0x5F /* Denverton */ ++#define INTEL_FAM6_ATOM_GOLDMONT_PLUS 0x7A /* Gemini Lake */ + + /* Xeon Phi */ + +diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h +index 8afbdcd3032b..46d8b99a0ff1 100644 +--- a/arch/x86/include/asm/irqflags.h ++++ b/arch/x86/include/asm/irqflags.h +@@ -4,6 +4,9 @@ + #include + + #ifndef __ASSEMBLY__ ++ ++#include ++ + /* + * Interrupt control: + */ +@@ -49,11 +52,13 @@ static inline void native_irq_enable(void) + + static inline void native_safe_halt(void) + { ++ 
mds_idle_clear_cpu_buffers(); + asm volatile("sti; hlt": : :"memory"); + } + + static inline void native_halt(void) + { ++ mds_idle_clear_cpu_buffers(); + asm volatile("hlt": : :"memory"); + } + +diff --git a/arch/x86/include/asm/microcode_intel.h b/arch/x86/include/asm/microcode_intel.h +index 8559b0102ea1..90343ba50485 100644 +--- a/arch/x86/include/asm/microcode_intel.h ++++ b/arch/x86/include/asm/microcode_intel.h +@@ -53,6 +53,21 @@ struct extended_sigtable { + + #define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE) + ++static inline u32 intel_get_microcode_revision(void) ++{ ++ u32 rev, dummy; ++ ++ native_wrmsrl(MSR_IA32_UCODE_REV, 0); ++ ++ /* As documented in the SDM: Do a CPUID 1 here */ ++ sync_core(); ++ ++ /* get the current revision from MSR 0x8B */ ++ native_rdmsr(MSR_IA32_UCODE_REV, dummy, rev); ++ ++ return rev; ++} ++ + extern int has_newer_microcode(void *mc, unsigned int csig, int cpf, int rev); + extern int microcode_sanity_check(void *mc, int print_err); + extern int find_matching_signature(void *mc, unsigned int csig, int cpf); +diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h +index caa00191e565..d4f5b8209393 100644 +--- a/arch/x86/include/asm/msr-index.h ++++ b/arch/x86/include/asm/msr-index.h +@@ -1,6 +1,8 @@ + #ifndef _ASM_X86_MSR_INDEX_H + #define _ASM_X86_MSR_INDEX_H + ++#include ++ + /* CPU model specific register (MSR) numbers */ + + /* x86-64 specific MSRs */ +@@ -33,13 +35,14 @@ + + /* Intel MSRs. Some also available on other CPUs */ + #define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */ +-#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */ +-#define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */ ++#define SPEC_CTRL_IBRS BIT(0) /* Indirect Branch Restricted Speculation */ ++#define SPEC_CTRL_STIBP_SHIFT 1 /* Single Thread Indirect Branch Predictor (STIBP) bit */ ++#define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */ + #define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */ +-#define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */ ++#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */ + + #define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */ +-#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */ ++#define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */ + + #define MSR_IA32_PERFCTR0 0x000000c1 + #define MSR_IA32_PERFCTR1 0x000000c2 +@@ -56,13 +59,18 @@ + #define MSR_MTRRcap 0x000000fe + + #define MSR_IA32_ARCH_CAPABILITIES 0x0000010a +-#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */ +-#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */ +-#define ARCH_CAP_SSB_NO (1 << 4) /* +- * Not susceptible to Speculative Store Bypass +- * attack, so no Speculative Store Bypass +- * control required. +- */ ++#define ARCH_CAP_RDCL_NO BIT(0) /* Not susceptible to Meltdown */ ++#define ARCH_CAP_IBRS_ALL BIT(1) /* Enhanced IBRS support */ ++#define ARCH_CAP_SSB_NO BIT(4) /* ++ * Not susceptible to Speculative Store Bypass ++ * attack, so no Speculative Store Bypass ++ * control required. ++ */ ++#define ARCH_CAP_MDS_NO BIT(5) /* ++ * Not susceptible to ++ * Microarchitectural Data ++ * Sampling (MDS) vulnerabilities. 
++ */ + + #define MSR_IA32_BBL_CR_CTL 0x00000119 + #define MSR_IA32_BBL_CR_CTL3 0x0000011e +diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h +index 0deeb2d26df7..b98dbdaee8ac 100644 +--- a/arch/x86/include/asm/mwait.h ++++ b/arch/x86/include/asm/mwait.h +@@ -4,6 +4,7 @@ + #include + + #include ++#include + + #define MWAIT_SUBSTATE_MASK 0xf + #define MWAIT_CSTATE_MASK 0xf +@@ -38,6 +39,8 @@ static inline void __monitorx(const void *eax, unsigned long ecx, + + static inline void __mwait(unsigned long eax, unsigned long ecx) + { ++ mds_idle_clear_cpu_buffers(); ++ + /* "mwait %eax, %ecx;" */ + asm volatile(".byte 0x0f, 0x01, 0xc9;" + :: "a" (eax), "c" (ecx)); +@@ -72,6 +75,8 @@ static inline void __mwait(unsigned long eax, unsigned long ecx) + static inline void __mwaitx(unsigned long eax, unsigned long ebx, + unsigned long ecx) + { ++ /* No MDS buffer clear as this is AMD/HYGON only */ ++ + /* "mwaitx %eax, %ebx, %ecx;" */ + asm volatile(".byte 0x0f, 0x01, 0xfb;" + :: "a" (eax), "b" (ebx), "c" (ecx)); +@@ -79,6 +84,8 @@ static inline void __mwaitx(unsigned long eax, unsigned long ebx, + + static inline void __sti_mwait(unsigned long eax, unsigned long ecx) + { ++ mds_idle_clear_cpu_buffers(); ++ + trace_hardirqs_on(); + /* "mwait %eax, %ecx;" */ + asm volatile("sti; .byte 0x0f, 0x01, 0xc9;" +diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h +index b4c74c24c890..e58c078f3d96 100644 +--- a/arch/x86/include/asm/nospec-branch.h ++++ b/arch/x86/include/asm/nospec-branch.h +@@ -3,6 +3,8 @@ + #ifndef _ASM_X86_NOSPEC_BRANCH_H_ + #define _ASM_X86_NOSPEC_BRANCH_H_ + ++#include ++ + #include + #include + #include +@@ -169,7 +171,15 @@ enum spectre_v2_mitigation { + SPECTRE_V2_RETPOLINE_MINIMAL_AMD, + SPECTRE_V2_RETPOLINE_GENERIC, + SPECTRE_V2_RETPOLINE_AMD, +- SPECTRE_V2_IBRS, ++ SPECTRE_V2_IBRS_ENHANCED, ++}; ++ ++/* The indirect branch speculation control variants */ ++enum spectre_v2_user_mitigation { ++ SPECTRE_V2_USER_NONE, ++ SPECTRE_V2_USER_STRICT, ++ SPECTRE_V2_USER_PRCTL, ++ SPECTRE_V2_USER_SECCOMP, + }; + + /* The Speculative Store Bypass disable variants */ +@@ -248,6 +258,60 @@ do { \ + preempt_enable(); \ + } while (0) + ++DECLARE_STATIC_KEY_FALSE(switch_to_cond_stibp); ++DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); ++DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb); ++ ++DECLARE_STATIC_KEY_FALSE(mds_user_clear); ++DECLARE_STATIC_KEY_FALSE(mds_idle_clear); ++ ++#include ++ ++/** ++ * mds_clear_cpu_buffers - Mitigation for MDS vulnerability ++ * ++ * This uses the otherwise unused and obsolete VERW instruction in ++ * combination with microcode which triggers a CPU buffer flush when the ++ * instruction is executed. ++ */ ++static inline void mds_clear_cpu_buffers(void) ++{ ++ static const u16 ds = __KERNEL_DS; ++ ++ /* ++ * Has to be the memory-operand variant because only that ++ * guarantees the CPU buffer flush functionality according to ++ * documentation. The register-operand variant does not. ++ * Works with any segment selector, but a valid writable ++ * data segment is the fastest variant. ++ * ++ * "cc" clobber is required because VERW modifies ZF. 
++ */ ++ asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc"); ++} ++ ++/** ++ * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability ++ * ++ * Clear CPU buffers if the corresponding static key is enabled ++ */ ++static inline void mds_user_clear_cpu_buffers(void) ++{ ++ if (static_branch_likely(&mds_user_clear)) ++ mds_clear_cpu_buffers(); ++} ++ ++/** ++ * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability ++ * ++ * Clear CPU buffers if the corresponding static key is enabled ++ */ ++static inline void mds_idle_clear_cpu_buffers(void) ++{ ++ if (static_branch_likely(&mds_idle_clear)) ++ mds_clear_cpu_buffers(); ++} ++ + #endif /* __ASSEMBLY__ */ + + /* +diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h +index 221a32ed1372..f12e61e2a86b 100644 +--- a/arch/x86/include/asm/pgtable_64.h ++++ b/arch/x86/include/asm/pgtable_64.h +@@ -44,15 +44,15 @@ struct mm_struct; + void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte); + + +-static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, +- pte_t *ptep) ++static inline void native_set_pte(pte_t *ptep, pte_t pte) + { +- *ptep = native_make_pte(0); ++ WRITE_ONCE(*ptep, pte); + } + +-static inline void native_set_pte(pte_t *ptep, pte_t pte) ++static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, ++ pte_t *ptep) + { +- *ptep = pte; ++ native_set_pte(ptep, native_make_pte(0)); + } + + static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte) +@@ -62,7 +62,7 @@ static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte) + + static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd) + { +- *pmdp = pmd; ++ WRITE_ONCE(*pmdp, pmd); + } + + static inline void native_pmd_clear(pmd_t *pmd) +@@ -98,7 +98,7 @@ static inline pmd_t native_pmdp_get_and_clear(pmd_t *xp) + + static inline void native_set_pud(pud_t *pudp, pud_t pud) + { +- *pudp = pud; ++ WRITE_ONCE(*pudp, pud); + } + + static inline void native_pud_clear(pud_t *pud) +@@ -131,7 +131,7 @@ static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp) + + static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd) + { +- *pgdp = kaiser_set_shadow_pgd(pgdp, pgd); ++ WRITE_ONCE(*pgdp, kaiser_set_shadow_pgd(pgdp, pgd)); + } + + static inline void native_pgd_clear(pgd_t *pgd) +diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h +index 440a948c4feb..dab73faef9b0 100644 +--- a/arch/x86/include/asm/processor.h ++++ b/arch/x86/include/asm/processor.h +@@ -845,4 +845,11 @@ bool xen_set_default_idle(void); + + void stop_this_cpu(void *dummy); + void df_debug(struct pt_regs *regs, long error_code); ++ ++enum mds_mitigations { ++ MDS_MITIGATION_OFF, ++ MDS_MITIGATION_FULL, ++ MDS_MITIGATION_VMWERV, ++}; ++ + #endif /* _ASM_X86_PROCESSOR_H */ +diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h +index ae7c2c5cd7f0..5393babc0598 100644 +--- a/arch/x86/include/asm/spec-ctrl.h ++++ b/arch/x86/include/asm/spec-ctrl.h +@@ -53,12 +53,24 @@ static inline u64 ssbd_tif_to_spec_ctrl(u64 tifn) + return (tifn & _TIF_SSBD) >> (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT); + } + ++static inline u64 stibp_tif_to_spec_ctrl(u64 tifn) ++{ ++ BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT); ++ return (tifn & _TIF_SPEC_IB) >> (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT); ++} ++ + static inline unsigned long ssbd_spec_ctrl_to_tif(u64 spec_ctrl) + { + BUILD_BUG_ON(TIF_SSBD < SPEC_CTRL_SSBD_SHIFT); + return (spec_ctrl & SPEC_CTRL_SSBD) << (TIF_SSBD - SPEC_CTRL_SSBD_SHIFT); + } + 
++static inline unsigned long stibp_spec_ctrl_to_tif(u64 spec_ctrl) ++{ ++ BUILD_BUG_ON(TIF_SPEC_IB < SPEC_CTRL_STIBP_SHIFT); ++ return (spec_ctrl & SPEC_CTRL_STIBP) << (TIF_SPEC_IB - SPEC_CTRL_STIBP_SHIFT); ++} ++ + static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn) + { + return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL; +@@ -70,11 +82,7 @@ extern void speculative_store_bypass_ht_init(void); + static inline void speculative_store_bypass_ht_init(void) { } + #endif + +-extern void speculative_store_bypass_update(unsigned long tif); +- +-static inline void speculative_store_bypass_update_current(void) +-{ +- speculative_store_bypass_update(current_thread_info()->flags); +-} ++extern void speculation_ctrl_update(unsigned long tif); ++extern void speculation_ctrl_update_current(void); + + #endif +diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h +index 025ecfaba9c9..4ff0878f4633 100644 +--- a/arch/x86/include/asm/switch_to.h ++++ b/arch/x86/include/asm/switch_to.h +@@ -6,9 +6,6 @@ + struct task_struct; /* one of the stranger aspects of C forward declarations */ + __visible struct task_struct *__switch_to(struct task_struct *prev, + struct task_struct *next); +-struct tss_struct; +-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, +- struct tss_struct *tss); + + #ifdef CONFIG_X86_32 + +diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h +index a96e88b243ef..e522b15fa3f0 100644 +--- a/arch/x86/include/asm/thread_info.h ++++ b/arch/x86/include/asm/thread_info.h +@@ -92,10 +92,12 @@ struct thread_info { + #define TIF_SIGPENDING 2 /* signal pending */ + #define TIF_NEED_RESCHED 3 /* rescheduling necessary */ + #define TIF_SINGLESTEP 4 /* reenable singlestep on user return*/ +-#define TIF_SSBD 5 /* Reduced data speculation */ ++#define TIF_SSBD 5 /* Speculative store bypass disable */ + #define TIF_SYSCALL_EMU 6 /* syscall emulation active */ + #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */ + #define TIF_SECCOMP 8 /* secure computing */ ++#define TIF_SPEC_IB 9 /* Indirect branch speculation mitigation */ ++#define TIF_SPEC_FORCE_UPDATE 10 /* Force speculation MSR update in context switch */ + #define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */ + #define TIF_UPROBE 12 /* breakpointed or singlestepping */ + #define TIF_NOTSC 16 /* TSC is not accessible in userland */ +@@ -121,6 +123,8 @@ struct thread_info { + #define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU) + #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) + #define _TIF_SECCOMP (1 << TIF_SECCOMP) ++#define _TIF_SPEC_IB (1 << TIF_SPEC_IB) ++#define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE) + #define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY) + #define _TIF_UPROBE (1 << TIF_UPROBE) + #define _TIF_NOTSC (1 << TIF_NOTSC) +@@ -148,8 +152,18 @@ struct thread_info { + _TIF_NOHZ) + + /* flags to check in __switch_to() */ +-#define _TIF_WORK_CTXSW \ +- (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_SSBD) ++#define _TIF_WORK_CTXSW_BASE \ ++ (_TIF_IO_BITMAP|_TIF_NOTSC|_TIF_BLOCKSTEP| \ ++ _TIF_SSBD | _TIF_SPEC_FORCE_UPDATE) ++ ++/* ++ * Avoid calls to __switch_to_xtra() on UP as STIBP is not evaluated. 
++ */ ++#ifdef CONFIG_SMP ++# define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE | _TIF_SPEC_IB) ++#else ++# define _TIF_WORK_CTXSW (_TIF_WORK_CTXSW_BASE) ++#endif + + #define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY) + #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW) +diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h +index 72cfe3e53af1..8dab88b85785 100644 +--- a/arch/x86/include/asm/tlbflush.h ++++ b/arch/x86/include/asm/tlbflush.h +@@ -68,8 +68,12 @@ static inline void invpcid_flush_all_nonglobals(void) + struct tlb_state { + struct mm_struct *active_mm; + int state; +- /* last user mm's ctx id */ +- u64 last_ctx_id; ++ ++ /* Last user mm for optimizing IBPB */ ++ union { ++ struct mm_struct *last_user_mm; ++ unsigned long last_user_mm_ibpb; ++ }; + + /* + * Access to this CR4 shadow and to H/W CR4 is protected by +diff --git a/arch/x86/include/uapi/asm/Kbuild b/arch/x86/include/uapi/asm/Kbuild +index 3dec769cadf7..1c532b3f18ea 100644 +--- a/arch/x86/include/uapi/asm/Kbuild ++++ b/arch/x86/include/uapi/asm/Kbuild +@@ -27,7 +27,6 @@ header-y += ldt.h + header-y += mce.h + header-y += mman.h + header-y += msgbuf.h +-header-y += msr-index.h + header-y += msr.h + header-y += mtrr.h + header-y += param.h +diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h +index 03429da2fa80..83b9be4e0492 100644 +--- a/arch/x86/include/uapi/asm/mce.h ++++ b/arch/x86/include/uapi/asm/mce.h +@@ -26,6 +26,10 @@ struct mce { + __u32 socketid; /* CPU socket ID */ + __u32 apicid; /* CPU initial apic ID */ + __u64 mcgcap; /* MCGCAP MSR: machine check capabilities of CPU */ ++ __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */ ++ __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */ ++ __u64 ppin; /* Protected Processor Inventory Number */ ++ __u32 microcode;/* Microcode revision */ + }; + + #define MCE_GET_RECORD_LEN _IOR('M', 1, int) +diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c +index 621bc6561189..2017fa20611c 100644 +--- a/arch/x86/kernel/cpu/bugs.c ++++ b/arch/x86/kernel/cpu/bugs.c +@@ -13,6 +13,7 @@ + #include + #include + #include ++#include + + #include + #include +@@ -23,6 +24,7 @@ + #include + #include + #include ++#include + #include + #include + #include +@@ -31,13 +33,12 @@ + static void __init spectre_v2_select_mitigation(void); + static void __init ssb_select_mitigation(void); + static void __init l1tf_select_mitigation(void); ++static void __init mds_select_mitigation(void); + +-/* +- * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any +- * writes to SPEC_CTRL contain whatever reserved bits have been set. +- */ ++/* The base value of the SPEC_CTRL MSR that always has to be preserved. 
*/ + u64 x86_spec_ctrl_base; + EXPORT_SYMBOL_GPL(x86_spec_ctrl_base); ++static DEFINE_MUTEX(spec_ctrl_mutex); + + /* + * The vendor and possibly platform specific bits which can be modified in +@@ -52,6 +53,19 @@ static u64 x86_spec_ctrl_mask = SPEC_CTRL_IBRS; + u64 x86_amd_ls_cfg_base; + u64 x86_amd_ls_cfg_ssbd_mask; + ++/* Control conditional STIPB in switch_to() */ ++DEFINE_STATIC_KEY_FALSE(switch_to_cond_stibp); ++/* Control conditional IBPB in switch_mm() */ ++DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb); ++/* Control unconditional IBPB in switch_mm() */ ++DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb); ++ ++/* Control MDS CPU buffer clear before returning to user space */ ++DEFINE_STATIC_KEY_FALSE(mds_user_clear); ++/* Control MDS CPU buffer clear before idling (halt, mwait) */ ++DEFINE_STATIC_KEY_FALSE(mds_idle_clear); ++EXPORT_SYMBOL_GPL(mds_idle_clear); ++ + void __init check_bugs(void) + { + identify_boot_cpu(); +@@ -84,6 +98,10 @@ void __init check_bugs(void) + + l1tf_select_mitigation(); + ++ mds_select_mitigation(); ++ ++ arch_smt_update(); ++ + #ifdef CONFIG_X86_32 + /* + * Check whether we are able to run this kernel safely on SMP. +@@ -116,29 +134,6 @@ void __init check_bugs(void) + #endif + } + +-/* The kernel command line selection */ +-enum spectre_v2_mitigation_cmd { +- SPECTRE_V2_CMD_NONE, +- SPECTRE_V2_CMD_AUTO, +- SPECTRE_V2_CMD_FORCE, +- SPECTRE_V2_CMD_RETPOLINE, +- SPECTRE_V2_CMD_RETPOLINE_GENERIC, +- SPECTRE_V2_CMD_RETPOLINE_AMD, +-}; +- +-static const char *spectre_v2_strings[] = { +- [SPECTRE_V2_NONE] = "Vulnerable", +- [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline", +- [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline", +- [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline", +- [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline", +-}; +- +-#undef pr_fmt +-#define pr_fmt(fmt) "Spectre V2 : " fmt +- +-static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE; +- + void + x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) + { +@@ -156,9 +151,14 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) + guestval |= guest_spec_ctrl & x86_spec_ctrl_mask; + + /* SSBD controlled in MSR_SPEC_CTRL */ +- if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD)) ++ if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) || ++ static_cpu_has(X86_FEATURE_AMD_SSBD)) + hostval |= ssbd_tif_to_spec_ctrl(ti->flags); + ++ /* Conditional STIBP enabled? */ ++ if (static_branch_unlikely(&switch_to_cond_stibp)) ++ hostval |= stibp_tif_to_spec_ctrl(ti->flags); ++ + if (hostval != guestval) { + msrval = setguest ? guestval : hostval; + wrmsrl(MSR_IA32_SPEC_CTRL, msrval); +@@ -192,7 +192,7 @@ x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest) + tif = setguest ? 
ssbd_spec_ctrl_to_tif(guestval) : + ssbd_spec_ctrl_to_tif(hostval); + +- speculative_store_bypass_update(tif); ++ speculation_ctrl_update(tif); + } + } + EXPORT_SYMBOL_GPL(x86_virt_spec_ctrl); +@@ -207,6 +207,57 @@ static void x86_amd_ssb_disable(void) + wrmsrl(MSR_AMD64_LS_CFG, msrval); + } + ++#undef pr_fmt ++#define pr_fmt(fmt) "MDS: " fmt ++ ++/* Default mitigation for MDS-affected CPUs */ ++static enum mds_mitigations mds_mitigation = MDS_MITIGATION_FULL; ++ ++static const char * const mds_strings[] = { ++ [MDS_MITIGATION_OFF] = "Vulnerable", ++ [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers", ++ [MDS_MITIGATION_VMWERV] = "Vulnerable: Clear CPU buffers attempted, no microcode", ++}; ++ ++static void __init mds_select_mitigation(void) ++{ ++ if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) { ++ mds_mitigation = MDS_MITIGATION_OFF; ++ return; ++ } ++ ++ if (mds_mitigation == MDS_MITIGATION_FULL) { ++ if (!boot_cpu_has(X86_FEATURE_MD_CLEAR)) ++ mds_mitigation = MDS_MITIGATION_VMWERV; ++ static_branch_enable(&mds_user_clear); ++ } ++ pr_info("%s\n", mds_strings[mds_mitigation]); ++} ++ ++static int __init mds_cmdline(char *str) ++{ ++ if (!boot_cpu_has_bug(X86_BUG_MDS)) ++ return 0; ++ ++ if (!str) ++ return -EINVAL; ++ ++ if (!strcmp(str, "off")) ++ mds_mitigation = MDS_MITIGATION_OFF; ++ else if (!strcmp(str, "full")) ++ mds_mitigation = MDS_MITIGATION_FULL; ++ ++ return 0; ++} ++early_param("mds", mds_cmdline); ++ ++#undef pr_fmt ++#define pr_fmt(fmt) "Spectre V2 : " fmt ++ ++static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE; ++ ++static enum spectre_v2_user_mitigation spectre_v2_user = SPECTRE_V2_USER_NONE; ++ + #ifdef RETPOLINE + static bool spectre_v2_bad_module; + +@@ -228,67 +279,224 @@ static inline const char *spectre_v2_module_string(void) + static inline const char *spectre_v2_module_string(void) { return ""; } + #endif + +-static void __init spec2_print_if_insecure(const char *reason) ++static inline bool match_option(const char *arg, int arglen, const char *opt) + { +- if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) +- pr_info("%s selected on command line.\n", reason); ++ int len = strlen(opt); ++ ++ return len == arglen && !strncmp(arg, opt, len); + } + +-static void __init spec2_print_if_secure(const char *reason) ++/* The kernel command line selection for spectre v2 */ ++enum spectre_v2_mitigation_cmd { ++ SPECTRE_V2_CMD_NONE, ++ SPECTRE_V2_CMD_AUTO, ++ SPECTRE_V2_CMD_FORCE, ++ SPECTRE_V2_CMD_RETPOLINE, ++ SPECTRE_V2_CMD_RETPOLINE_GENERIC, ++ SPECTRE_V2_CMD_RETPOLINE_AMD, ++}; ++ ++enum spectre_v2_user_cmd { ++ SPECTRE_V2_USER_CMD_NONE, ++ SPECTRE_V2_USER_CMD_AUTO, ++ SPECTRE_V2_USER_CMD_FORCE, ++ SPECTRE_V2_USER_CMD_PRCTL, ++ SPECTRE_V2_USER_CMD_PRCTL_IBPB, ++ SPECTRE_V2_USER_CMD_SECCOMP, ++ SPECTRE_V2_USER_CMD_SECCOMP_IBPB, ++}; ++ ++static const char * const spectre_v2_user_strings[] = { ++ [SPECTRE_V2_USER_NONE] = "User space: Vulnerable", ++ [SPECTRE_V2_USER_STRICT] = "User space: Mitigation: STIBP protection", ++ [SPECTRE_V2_USER_PRCTL] = "User space: Mitigation: STIBP via prctl", ++ [SPECTRE_V2_USER_SECCOMP] = "User space: Mitigation: STIBP via seccomp and prctl", ++}; ++ ++static const struct { ++ const char *option; ++ enum spectre_v2_user_cmd cmd; ++ bool secure; ++} v2_user_options[] __initconst = { ++ { "auto", SPECTRE_V2_USER_CMD_AUTO, false }, ++ { "off", SPECTRE_V2_USER_CMD_NONE, false }, ++ { "on", SPECTRE_V2_USER_CMD_FORCE, true }, ++ { "prctl", SPECTRE_V2_USER_CMD_PRCTL, false }, ++ { "prctl,ibpb", 
SPECTRE_V2_USER_CMD_PRCTL_IBPB, false }, ++ { "seccomp", SPECTRE_V2_USER_CMD_SECCOMP, false }, ++ { "seccomp,ibpb", SPECTRE_V2_USER_CMD_SECCOMP_IBPB, false }, ++}; ++ ++static void __init spec_v2_user_print_cond(const char *reason, bool secure) + { +- if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) +- pr_info("%s selected on command line.\n", reason); ++ if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure) ++ pr_info("spectre_v2_user=%s forced on command line.\n", reason); + } + +-static inline bool retp_compiler(void) ++static enum spectre_v2_user_cmd __init ++spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd) + { +- return __is_defined(RETPOLINE); ++ char arg[20]; ++ int ret, i; ++ ++ switch (v2_cmd) { ++ case SPECTRE_V2_CMD_NONE: ++ return SPECTRE_V2_USER_CMD_NONE; ++ case SPECTRE_V2_CMD_FORCE: ++ return SPECTRE_V2_USER_CMD_FORCE; ++ default: ++ break; ++ } ++ ++ ret = cmdline_find_option(boot_command_line, "spectre_v2_user", ++ arg, sizeof(arg)); ++ if (ret < 0) ++ return SPECTRE_V2_USER_CMD_AUTO; ++ ++ for (i = 0; i < ARRAY_SIZE(v2_user_options); i++) { ++ if (match_option(arg, ret, v2_user_options[i].option)) { ++ spec_v2_user_print_cond(v2_user_options[i].option, ++ v2_user_options[i].secure); ++ return v2_user_options[i].cmd; ++ } ++ } ++ ++ pr_err("Unknown user space protection option (%s). Switching to AUTO select\n", arg); ++ return SPECTRE_V2_USER_CMD_AUTO; + } + +-static inline bool match_option(const char *arg, int arglen, const char *opt) ++static void __init ++spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd) + { +- int len = strlen(opt); ++ enum spectre_v2_user_mitigation mode = SPECTRE_V2_USER_NONE; ++ bool smt_possible = IS_ENABLED(CONFIG_SMP); ++ enum spectre_v2_user_cmd cmd; + +- return len == arglen && !strncmp(arg, opt, len); ++ if (!boot_cpu_has(X86_FEATURE_IBPB) && !boot_cpu_has(X86_FEATURE_STIBP)) ++ return; ++ ++ if (!IS_ENABLED(CONFIG_SMP)) ++ smt_possible = false; ++ ++ cmd = spectre_v2_parse_user_cmdline(v2_cmd); ++ switch (cmd) { ++ case SPECTRE_V2_USER_CMD_NONE: ++ goto set_mode; ++ case SPECTRE_V2_USER_CMD_FORCE: ++ mode = SPECTRE_V2_USER_STRICT; ++ break; ++ case SPECTRE_V2_USER_CMD_PRCTL: ++ case SPECTRE_V2_USER_CMD_PRCTL_IBPB: ++ mode = SPECTRE_V2_USER_PRCTL; ++ break; ++ case SPECTRE_V2_USER_CMD_AUTO: ++ case SPECTRE_V2_USER_CMD_SECCOMP: ++ case SPECTRE_V2_USER_CMD_SECCOMP_IBPB: ++ if (IS_ENABLED(CONFIG_SECCOMP)) ++ mode = SPECTRE_V2_USER_SECCOMP; ++ else ++ mode = SPECTRE_V2_USER_PRCTL; ++ break; ++ } ++ ++ /* Initialize Indirect Branch Prediction Barrier */ ++ if (boot_cpu_has(X86_FEATURE_IBPB)) { ++ setup_force_cpu_cap(X86_FEATURE_USE_IBPB); ++ ++ switch (cmd) { ++ case SPECTRE_V2_USER_CMD_FORCE: ++ case SPECTRE_V2_USER_CMD_PRCTL_IBPB: ++ case SPECTRE_V2_USER_CMD_SECCOMP_IBPB: ++ static_branch_enable(&switch_mm_always_ibpb); ++ break; ++ case SPECTRE_V2_USER_CMD_PRCTL: ++ case SPECTRE_V2_USER_CMD_AUTO: ++ case SPECTRE_V2_USER_CMD_SECCOMP: ++ static_branch_enable(&switch_mm_cond_ibpb); ++ break; ++ default: ++ break; ++ } ++ ++ pr_info("mitigation: Enabling %s Indirect Branch Prediction Barrier\n", ++ static_key_enabled(&switch_mm_always_ibpb) ? ++ "always-on" : "conditional"); ++ } ++ ++ /* If enhanced IBRS is enabled no STIPB required */ ++ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) ++ return; ++ ++ /* ++ * If SMT is not possible or STIBP is not available clear the STIPB ++ * mode. 
++ */ ++ if (!smt_possible || !boot_cpu_has(X86_FEATURE_STIBP)) ++ mode = SPECTRE_V2_USER_NONE; ++set_mode: ++ spectre_v2_user = mode; ++ /* Only print the STIBP mode when SMT possible */ ++ if (smt_possible) ++ pr_info("%s\n", spectre_v2_user_strings[mode]); + } + ++static const char * const spectre_v2_strings[] = { ++ [SPECTRE_V2_NONE] = "Vulnerable", ++ [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline", ++ [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline", ++ [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline", ++ [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline", ++ [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS", ++}; ++ + static const struct { + const char *option; + enum spectre_v2_mitigation_cmd cmd; + bool secure; +-} mitigation_options[] = { +- { "off", SPECTRE_V2_CMD_NONE, false }, +- { "on", SPECTRE_V2_CMD_FORCE, true }, +- { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false }, +- { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false }, +- { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false }, +- { "auto", SPECTRE_V2_CMD_AUTO, false }, ++} mitigation_options[] __initconst = { ++ { "off", SPECTRE_V2_CMD_NONE, false }, ++ { "on", SPECTRE_V2_CMD_FORCE, true }, ++ { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false }, ++ { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false }, ++ { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false }, ++ { "auto", SPECTRE_V2_CMD_AUTO, false }, + }; + ++static void __init spec_v2_print_cond(const char *reason, bool secure) ++{ ++ if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2) != secure) ++ pr_info("%s selected on command line.\n", reason); ++} ++ ++static inline bool retp_compiler(void) ++{ ++ return __is_defined(RETPOLINE); ++} ++ + static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void) + { ++ enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO; + char arg[20]; + int ret, i; +- enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO; + +- if (cmdline_find_option_bool(boot_command_line, "nospectre_v2")) ++ if (cmdline_find_option_bool(boot_command_line, "nospectre_v2") || ++ cpu_mitigations_off()) + return SPECTRE_V2_CMD_NONE; +- else { +- ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg)); +- if (ret < 0) +- return SPECTRE_V2_CMD_AUTO; + +- for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) { +- if (!match_option(arg, ret, mitigation_options[i].option)) +- continue; +- cmd = mitigation_options[i].cmd; +- break; +- } ++ ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg)); ++ if (ret < 0) ++ return SPECTRE_V2_CMD_AUTO; + +- if (i >= ARRAY_SIZE(mitigation_options)) { +- pr_err("unknown option (%s). Switching to AUTO select\n", arg); +- return SPECTRE_V2_CMD_AUTO; +- } ++ for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) { ++ if (!match_option(arg, ret, mitigation_options[i].option)) ++ continue; ++ cmd = mitigation_options[i].cmd; ++ break; ++ } ++ ++ if (i >= ARRAY_SIZE(mitigation_options)) { ++ pr_err("unknown option (%s). 
Switching to AUTO select\n", arg); ++ return SPECTRE_V2_CMD_AUTO; + } + + if ((cmd == SPECTRE_V2_CMD_RETPOLINE || +@@ -305,11 +513,8 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void) + return SPECTRE_V2_CMD_AUTO; + } + +- if (mitigation_options[i].secure) +- spec2_print_if_secure(mitigation_options[i].option); +- else +- spec2_print_if_insecure(mitigation_options[i].option); +- ++ spec_v2_print_cond(mitigation_options[i].option, ++ mitigation_options[i].secure); + return cmd; + } + +@@ -332,6 +537,13 @@ static void __init spectre_v2_select_mitigation(void) + + case SPECTRE_V2_CMD_FORCE: + case SPECTRE_V2_CMD_AUTO: ++ if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) { ++ mode = SPECTRE_V2_IBRS_ENHANCED; ++ /* Force it so VMEXIT will restore correctly */ ++ x86_spec_ctrl_base |= SPEC_CTRL_IBRS; ++ wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); ++ goto specv2_set_mode; ++ } + if (IS_ENABLED(CONFIG_RETPOLINE)) + goto retpoline_auto; + break; +@@ -369,6 +581,7 @@ retpoline_auto: + setup_force_cpu_cap(X86_FEATURE_RETPOLINE); + } + ++specv2_set_mode: + spectre_v2_enabled = mode; + pr_info("%s\n", spectre_v2_strings[mode]); + +@@ -383,20 +596,114 @@ retpoline_auto: + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW); + pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n"); + +- /* Initialize Indirect Branch Prediction Barrier if supported */ +- if (boot_cpu_has(X86_FEATURE_IBPB)) { +- setup_force_cpu_cap(X86_FEATURE_USE_IBPB); +- pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n"); +- } +- + /* + * Retpoline means the kernel is safe because it has no indirect +- * branches. But firmware isn't, so use IBRS to protect that. ++ * branches. Enhanced IBRS protects firmware too, so, enable restricted ++ * speculation around firmware calls only when Enhanced IBRS isn't ++ * supported. ++ * ++ * Use "mode" to check Enhanced IBRS instead of boot_cpu_has(), because ++ * the user might select retpoline on the kernel command line and if ++ * the CPU supports Enhanced IBRS, kernel might un-intentionally not ++ * enable IBRS around firmware calls. + */ +- if (boot_cpu_has(X86_FEATURE_IBRS)) { ++ if (boot_cpu_has(X86_FEATURE_IBRS) && mode != SPECTRE_V2_IBRS_ENHANCED) { + setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW); + pr_info("Enabling Restricted Speculation for firmware calls\n"); + } ++ ++ /* Set up IBPB and STIBP depending on the general spectre V2 command */ ++ spectre_v2_user_select_mitigation(cmd); ++} ++ ++static void update_stibp_msr(void * __unused) ++{ ++ wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); ++} ++ ++/* Update x86_spec_ctrl_base in case SMT state changed. */ ++static void update_stibp_strict(void) ++{ ++ u64 mask = x86_spec_ctrl_base & ~SPEC_CTRL_STIBP; ++ ++ if (sched_smt_active()) ++ mask |= SPEC_CTRL_STIBP; ++ ++ if (mask == x86_spec_ctrl_base) ++ return; ++ ++ pr_info("Update user space SMT mitigation: STIBP %s\n", ++ mask & SPEC_CTRL_STIBP ? 
"always-on" : "off"); ++ x86_spec_ctrl_base = mask; ++ on_each_cpu(update_stibp_msr, NULL, 1); ++} ++ ++/* Update the static key controlling the evaluation of TIF_SPEC_IB */ ++static void update_indir_branch_cond(void) ++{ ++ if (sched_smt_active()) ++ static_branch_enable(&switch_to_cond_stibp); ++ else ++ static_branch_disable(&switch_to_cond_stibp); ++} ++ ++#undef pr_fmt ++#define pr_fmt(fmt) fmt ++ ++/* Update the static key controlling the MDS CPU buffer clear in idle */ ++static void update_mds_branch_idle(void) ++{ ++ /* ++ * Enable the idle clearing if SMT is active on CPUs which are ++ * affected only by MSBDS and not any other MDS variant. ++ * ++ * The other variants cannot be mitigated when SMT is enabled, so ++ * clearing the buffers on idle just to prevent the Store Buffer ++ * repartitioning leak would be a window dressing exercise. ++ */ ++ if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY)) ++ return; ++ ++ if (sched_smt_active()) ++ static_branch_enable(&mds_idle_clear); ++ else ++ static_branch_disable(&mds_idle_clear); ++} ++ ++#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n" ++ ++void arch_smt_update(void) ++{ ++ /* Enhanced IBRS implies STIBP. No update required. */ ++ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) ++ return; ++ ++ mutex_lock(&spec_ctrl_mutex); ++ ++ switch (spectre_v2_user) { ++ case SPECTRE_V2_USER_NONE: ++ break; ++ case SPECTRE_V2_USER_STRICT: ++ update_stibp_strict(); ++ break; ++ case SPECTRE_V2_USER_PRCTL: ++ case SPECTRE_V2_USER_SECCOMP: ++ update_indir_branch_cond(); ++ break; ++ } ++ ++ switch (mds_mitigation) { ++ case MDS_MITIGATION_FULL: ++ case MDS_MITIGATION_VMWERV: ++ if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY)) ++ pr_warn_once(MDS_MSG_SMT); ++ update_mds_branch_idle(); ++ break; ++ case MDS_MITIGATION_OFF: ++ break; ++ } ++ ++ mutex_unlock(&spec_ctrl_mutex); + } + + #undef pr_fmt +@@ -413,7 +720,7 @@ enum ssb_mitigation_cmd { + SPEC_STORE_BYPASS_CMD_SECCOMP, + }; + +-static const char *ssb_strings[] = { ++static const char * const ssb_strings[] = { + [SPEC_STORE_BYPASS_NONE] = "Vulnerable", + [SPEC_STORE_BYPASS_DISABLE] = "Mitigation: Speculative Store Bypass disabled", + [SPEC_STORE_BYPASS_PRCTL] = "Mitigation: Speculative Store Bypass disabled via prctl", +@@ -423,7 +730,7 @@ static const char *ssb_strings[] = { + static const struct { + const char *option; + enum ssb_mitigation_cmd cmd; +-} ssb_mitigation_options[] = { ++} ssb_mitigation_options[] __initconst = { + { "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */ + { "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */ + { "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */ +@@ -437,7 +744,8 @@ static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void) + char arg[20]; + int ret, i; + +- if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable")) { ++ if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable") || ++ cpu_mitigations_off()) { + return SPEC_STORE_BYPASS_CMD_NONE; + } else { + ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable", +@@ -507,18 +815,16 @@ static enum ssb_mitigation __init __ssb_select_mitigation(void) + if (mode == SPEC_STORE_BYPASS_DISABLE) { + setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE); + /* +- * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses +- * a completely different MSR 
and bit dependent on family. ++ * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD may ++ * use a completely different MSR and bit dependent on family. + */ +- switch (boot_cpu_data.x86_vendor) { +- case X86_VENDOR_INTEL: ++ if (!static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) && ++ !static_cpu_has(X86_FEATURE_AMD_SSBD)) { ++ x86_amd_ssb_disable(); ++ } else { + x86_spec_ctrl_base |= SPEC_CTRL_SSBD; + x86_spec_ctrl_mask |= SPEC_CTRL_SSBD; + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base); +- break; +- case X86_VENDOR_AMD: +- x86_amd_ssb_disable(); +- break; + } + } + +@@ -536,10 +842,25 @@ static void ssb_select_mitigation(void) + #undef pr_fmt + #define pr_fmt(fmt) "Speculation prctl: " fmt + +-static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) ++static void task_update_spec_tif(struct task_struct *tsk) + { +- bool update; ++ /* Force the update of the real TIF bits */ ++ set_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE); ++ ++ /* ++ * Immediately update the speculation control MSRs for the current ++ * task, but for a non-current task delay setting the CPU ++ * mitigation until it is scheduled next. ++ * ++ * This can only happen for SECCOMP mitigation. For PRCTL it's ++ * always the current task. ++ */ ++ if (tsk == current) ++ speculation_ctrl_update_current(); ++} + ++static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) ++{ + if (ssb_mode != SPEC_STORE_BYPASS_PRCTL && + ssb_mode != SPEC_STORE_BYPASS_SECCOMP) + return -ENXIO; +@@ -550,28 +871,56 @@ static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl) + if (task_spec_ssb_force_disable(task)) + return -EPERM; + task_clear_spec_ssb_disable(task); +- update = test_and_clear_tsk_thread_flag(task, TIF_SSBD); ++ task_update_spec_tif(task); + break; + case PR_SPEC_DISABLE: + task_set_spec_ssb_disable(task); +- update = !test_and_set_tsk_thread_flag(task, TIF_SSBD); ++ task_update_spec_tif(task); + break; + case PR_SPEC_FORCE_DISABLE: + task_set_spec_ssb_disable(task); + task_set_spec_ssb_force_disable(task); +- update = !test_and_set_tsk_thread_flag(task, TIF_SSBD); ++ task_update_spec_tif(task); + break; + default: + return -ERANGE; + } ++ return 0; ++} + +- /* +- * If being set on non-current task, delay setting the CPU +- * mitigation until it is next scheduled. +- */ +- if (task == current && update) +- speculative_store_bypass_update_current(); +- ++static int ib_prctl_set(struct task_struct *task, unsigned long ctrl) ++{ ++ switch (ctrl) { ++ case PR_SPEC_ENABLE: ++ if (spectre_v2_user == SPECTRE_V2_USER_NONE) ++ return 0; ++ /* ++ * Indirect branch speculation is always disabled in strict ++ * mode. ++ */ ++ if (spectre_v2_user == SPECTRE_V2_USER_STRICT) ++ return -EPERM; ++ task_clear_spec_ib_disable(task); ++ task_update_spec_tif(task); ++ break; ++ case PR_SPEC_DISABLE: ++ case PR_SPEC_FORCE_DISABLE: ++ /* ++ * Indirect branch speculation is always allowed when ++ * mitigation is force disabled. 
++ */ ++ if (spectre_v2_user == SPECTRE_V2_USER_NONE) ++ return -EPERM; ++ if (spectre_v2_user == SPECTRE_V2_USER_STRICT) ++ return 0; ++ task_set_spec_ib_disable(task); ++ if (ctrl == PR_SPEC_FORCE_DISABLE) ++ task_set_spec_ib_force_disable(task); ++ task_update_spec_tif(task); ++ break; ++ default: ++ return -ERANGE; ++ } + return 0; + } + +@@ -581,6 +930,8 @@ int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which, + switch (which) { + case PR_SPEC_STORE_BYPASS: + return ssb_prctl_set(task, ctrl); ++ case PR_SPEC_INDIRECT_BRANCH: ++ return ib_prctl_set(task, ctrl); + default: + return -ENODEV; + } +@@ -591,6 +942,8 @@ void arch_seccomp_spec_mitigate(struct task_struct *task) + { + if (ssb_mode == SPEC_STORE_BYPASS_SECCOMP) + ssb_prctl_set(task, PR_SPEC_FORCE_DISABLE); ++ if (spectre_v2_user == SPECTRE_V2_USER_SECCOMP) ++ ib_prctl_set(task, PR_SPEC_FORCE_DISABLE); + } + #endif + +@@ -613,11 +966,35 @@ static int ssb_prctl_get(struct task_struct *task) + } + } + ++static int ib_prctl_get(struct task_struct *task) ++{ ++ if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2)) ++ return PR_SPEC_NOT_AFFECTED; ++ ++ switch (spectre_v2_user) { ++ case SPECTRE_V2_USER_NONE: ++ return PR_SPEC_ENABLE; ++ case SPECTRE_V2_USER_PRCTL: ++ case SPECTRE_V2_USER_SECCOMP: ++ if (task_spec_ib_force_disable(task)) ++ return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE; ++ if (task_spec_ib_disable(task)) ++ return PR_SPEC_PRCTL | PR_SPEC_DISABLE; ++ return PR_SPEC_PRCTL | PR_SPEC_ENABLE; ++ case SPECTRE_V2_USER_STRICT: ++ return PR_SPEC_DISABLE; ++ default: ++ return PR_SPEC_NOT_AFFECTED; ++ } ++} ++ + int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which) + { + switch (which) { + case PR_SPEC_STORE_BYPASS: + return ssb_prctl_get(task); ++ case PR_SPEC_INDIRECT_BRANCH: ++ return ib_prctl_get(task); + default: + return -ENODEV; + } +@@ -694,16 +1071,66 @@ static void __init l1tf_select_mitigation(void) + pr_info("You may make it effective by booting the kernel with mem=%llu parameter.\n", + half_pa); + pr_info("However, doing so will make a part of your RAM unusable.\n"); +- pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html might help you decide.\n"); ++ pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html might help you decide.\n"); + return; + } + + setup_force_cpu_cap(X86_FEATURE_L1TF_PTEINV); + } + #undef pr_fmt ++#define pr_fmt(fmt) fmt + + #ifdef CONFIG_SYSFS + ++static ssize_t mds_show_state(char *buf) ++{ ++#ifdef CONFIG_HYPERVISOR_GUEST ++ if (x86_hyper) { ++ return sprintf(buf, "%s; SMT Host state unknown\n", ++ mds_strings[mds_mitigation]); ++ } ++#endif ++ ++ if (boot_cpu_has(X86_BUG_MSBDS_ONLY)) { ++ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation], ++ (mds_mitigation == MDS_MITIGATION_OFF ? "vulnerable" : ++ sched_smt_active() ? "mitigated" : "disabled")); ++ } ++ ++ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation], ++ sched_smt_active() ? 
"vulnerable" : "disabled"); ++} ++ ++static char *stibp_state(void) ++{ ++ if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED) ++ return ""; ++ ++ switch (spectre_v2_user) { ++ case SPECTRE_V2_USER_NONE: ++ return ", STIBP: disabled"; ++ case SPECTRE_V2_USER_STRICT: ++ return ", STIBP: forced"; ++ case SPECTRE_V2_USER_PRCTL: ++ case SPECTRE_V2_USER_SECCOMP: ++ if (static_key_enabled(&switch_to_cond_stibp)) ++ return ", STIBP: conditional"; ++ } ++ return ""; ++} ++ ++static char *ibpb_state(void) ++{ ++ if (boot_cpu_has(X86_FEATURE_IBPB)) { ++ if (static_key_enabled(&switch_mm_always_ibpb)) ++ return ", IBPB: always-on"; ++ if (static_key_enabled(&switch_mm_cond_ibpb)) ++ return ", IBPB: conditional"; ++ return ", IBPB: disabled"; ++ } ++ return ""; ++} ++ + static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr, + char *buf, unsigned int bug) + { +@@ -721,9 +1148,11 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr + return sprintf(buf, "Mitigation: __user pointer sanitization\n"); + + case X86_BUG_SPECTRE_V2: +- return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], +- boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "", ++ return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled], ++ ibpb_state(), + boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "", ++ stibp_state(), ++ boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "", + spectre_v2_module_string()); + + case X86_BUG_SPEC_STORE_BYPASS: +@@ -731,9 +1160,12 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr + + case X86_BUG_L1TF: + if (boot_cpu_has(X86_FEATURE_L1TF_PTEINV)) +- return sprintf(buf, "Mitigation: Page Table Inversion\n"); ++ return sprintf(buf, "Mitigation: PTE Inversion\n"); + break; + ++ case X86_BUG_MDS: ++ return mds_show_state(buf); ++ + default: + break; + } +@@ -765,4 +1197,9 @@ ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *b + { + return cpu_show_common(dev, attr, buf, X86_BUG_L1TF); + } ++ ++ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *buf) ++{ ++ return cpu_show_common(dev, attr, buf, X86_BUG_MDS); ++} + #endif +diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c +index e8b46f575306..4bce77bc7e61 100644 +--- a/arch/x86/kernel/cpu/common.c ++++ b/arch/x86/kernel/cpu/common.c +@@ -709,6 +709,12 @@ static void init_speculation_control(struct cpuinfo_x86 *c) + set_cpu_cap(c, X86_FEATURE_STIBP); + set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL); + } ++ ++ if (cpu_has(c, X86_FEATURE_AMD_SSBD)) { ++ set_cpu_cap(c, X86_FEATURE_SSBD); ++ set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL); ++ clear_cpu_cap(c, X86_FEATURE_VIRT_SSBD); ++ } + } + + void get_cpu_cap(struct cpuinfo_x86 *c) +@@ -841,81 +847,95 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c) + #endif + } + +-static const __initconst struct x86_cpu_id cpu_no_speculation[] = { +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY }, +- { X86_VENDOR_CENTAUR, 5 }, +- { X86_VENDOR_INTEL, 5 }, +- { X86_VENDOR_NSC, 5 }, +- { X86_VENDOR_ANY, 4 }, ++#define NO_SPECULATION BIT(0) ++#define NO_MELTDOWN BIT(1) ++#define NO_SSB BIT(2) ++#define NO_L1TF 
BIT(3) ++#define NO_MDS BIT(4) ++#define MSBDS_ONLY BIT(5) ++ ++#define VULNWL(_vendor, _family, _model, _whitelist) \ ++ { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist } ++ ++#define VULNWL_INTEL(model, whitelist) \ ++ VULNWL(INTEL, 6, INTEL_FAM6_##model, whitelist) ++ ++#define VULNWL_AMD(family, whitelist) \ ++ VULNWL(AMD, family, X86_MODEL_ANY, whitelist) ++ ++static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = { ++ VULNWL(ANY, 4, X86_MODEL_ANY, NO_SPECULATION), ++ VULNWL(CENTAUR, 5, X86_MODEL_ANY, NO_SPECULATION), ++ VULNWL(INTEL, 5, X86_MODEL_ANY, NO_SPECULATION), ++ VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION), ++ ++ /* Intel Family 6 */ ++ VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION), ++ VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION), ++ VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION), ++ VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION), ++ VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION), ++ ++ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY), ++ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF | MSBDS_ONLY), ++ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY), ++ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY), ++ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY), ++ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY), ++ ++ VULNWL_INTEL(CORE_YONAH, NO_SSB), ++ ++ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY), ++ ++ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF), ++ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF), ++ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF), ++ ++ /* AMD Family 0xf - 0x12 */ ++ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS), ++ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS), ++ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS), ++ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS), ++ ++ /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */ ++ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS), + {} + }; + +-static const __initconst struct x86_cpu_id cpu_no_meltdown[] = { +- { X86_VENDOR_AMD }, +- {} +-}; +- +-static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = { +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM }, +- { X86_VENDOR_CENTAUR, 5, }, +- { X86_VENDOR_INTEL, 5, }, +- { X86_VENDOR_NSC, 5, }, +- { X86_VENDOR_AMD, 0x12, }, +- { X86_VENDOR_AMD, 0x11, }, +- { X86_VENDOR_AMD, 0x10, }, +- { X86_VENDOR_AMD, 0xf, }, +- { X86_VENDOR_ANY, 4, }, +- {} +-}; ++static bool __init cpu_matches(unsigned long which) ++{ ++ const struct x86_cpu_id *m = x86_match_cpu(cpu_vuln_whitelist); + +-static const __initconst struct x86_cpu_id cpu_no_l1tf[] = { +- /* in addition to cpu_no_speculation */ +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT }, +- { X86_VENDOR_INTEL, 6, 
INTEL_FAM6_ATOM_MERRIFIELD }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MOOREFIELD }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_DENVERTON }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GEMINI_LAKE }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL }, +- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM }, +- {} +-}; ++ return m && !!(m->driver_data & which); ++} + + static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) + { + u64 ia32_cap = 0; + ++ if (cpu_matches(NO_SPECULATION)) ++ return; ++ ++ setup_force_cpu_bug(X86_BUG_SPECTRE_V1); ++ setup_force_cpu_bug(X86_BUG_SPECTRE_V2); ++ + if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES)) + rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap); + +- if (!x86_match_cpu(cpu_no_spec_store_bypass) && +- !(ia32_cap & ARCH_CAP_SSB_NO)) ++ if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) && ++ !cpu_has(c, X86_FEATURE_AMD_SSB_NO)) + setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS); + +- if (x86_match_cpu(cpu_no_speculation)) +- return; ++ if (ia32_cap & ARCH_CAP_IBRS_ALL) ++ setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED); + +- setup_force_cpu_bug(X86_BUG_SPECTRE_V1); +- setup_force_cpu_bug(X86_BUG_SPECTRE_V2); ++ if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) { ++ setup_force_cpu_bug(X86_BUG_MDS); ++ if (cpu_matches(MSBDS_ONLY)) ++ setup_force_cpu_bug(X86_BUG_MSBDS_ONLY); ++ } + +- if (x86_match_cpu(cpu_no_meltdown)) ++ if (cpu_matches(NO_MELTDOWN)) + return; + + /* Rogue Data Cache Load? No! */ +@@ -924,7 +944,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c) + + setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN); + +- if (x86_match_cpu(cpu_no_l1tf)) ++ if (cpu_matches(NO_L1TF)) + return; + + setup_force_cpu_bug(X86_BUG_L1TF); +diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c +index b18fe3d245fe..b0e0c7a12e61 100644 +--- a/arch/x86/kernel/cpu/intel.c ++++ b/arch/x86/kernel/cpu/intel.c +@@ -14,6 +14,7 @@ + #include + #include + #include ++#include + + #ifdef CONFIG_X86_64 + #include +@@ -102,14 +103,8 @@ static void early_init_intel(struct cpuinfo_x86 *c) + (c->x86 == 0x6 && c->x86_model >= 0x0e)) + set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC); + +- if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64)) { +- unsigned lower_word; +- +- wrmsr(MSR_IA32_UCODE_REV, 0, 0); +- /* Required by the SDM */ +- sync_core(); +- rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode); +- } ++ if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64)) ++ c->microcode = intel_get_microcode_revision(); + + /* Now if any of them are set, check the blacklist and clear the lot */ + if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) || +diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c +index 9c682c222071..1ce85ba50005 100644 +--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c ++++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c +@@ -132,6 +132,11 @@ static struct severity { + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR), + USER + ), ++ MCESEV( ++ PANIC, "Instruction fetch error in kernel", ++ SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR), ++ KERNEL ++ ), + #endif + MCESEV( + PANIC, "Action required: unknown MCACOD", +diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c +index 77f7580e22c6..4b9cfdcc3aaa 100644 +--- a/arch/x86/kernel/cpu/mcheck/mce.c ++++ b/arch/x86/kernel/cpu/mcheck/mce.c +@@ -138,6 +138,8 @@ void mce_setup(struct mce *m) + m->socketid = 
cpu_data(m->extcpu).phys_proc_id; + m->apicid = cpu_data(m->extcpu).initial_apicid; + rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap); ++ ++ m->microcode = boot_cpu_data.microcode; + } + + DEFINE_PER_CPU(struct mce, injectm); +@@ -258,7 +260,7 @@ static void print_mce(struct mce *m) + */ + pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n", + m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid, +- cpu_data(m->extcpu).microcode); ++ m->microcode); + + /* + * Print out human-readable details about the MCE error, +diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c +index 6da6f9cd6d2d..ca5b45799264 100644 +--- a/arch/x86/kernel/cpu/microcode/amd.c ++++ b/arch/x86/kernel/cpu/microcode/amd.c +@@ -695,22 +695,26 @@ int apply_microcode_amd(int cpu) + return -1; + + /* need to apply patch? */ +- if (rev >= mc_amd->hdr.patch_id) { +- c->microcode = rev; +- uci->cpu_sig.rev = rev; +- return 0; +- } ++ if (rev >= mc_amd->hdr.patch_id) ++ goto out; + + if (__apply_microcode_amd(mc_amd)) { + pr_err("CPU%d: update failed for patch_level=0x%08x\n", + cpu, mc_amd->hdr.patch_id); + return -1; + } +- pr_info("CPU%d: new patch_level=0x%08x\n", cpu, +- mc_amd->hdr.patch_id); + +- uci->cpu_sig.rev = mc_amd->hdr.patch_id; +- c->microcode = mc_amd->hdr.patch_id; ++ rev = mc_amd->hdr.patch_id; ++ ++ pr_info("CPU%d: new patch_level=0x%08x\n", cpu, rev); ++ ++out: ++ uci->cpu_sig.rev = rev; ++ c->microcode = rev; ++ ++ /* Update boot_cpu_data's revision too, if we're on the BSP: */ ++ if (c->cpu_index == boot_cpu_data.cpu_index) ++ boot_cpu_data.microcode = rev; + + return 0; + } +diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c +index 2f38a99cdb98..afaf648386e9 100644 +--- a/arch/x86/kernel/cpu/microcode/intel.c ++++ b/arch/x86/kernel/cpu/microcode/intel.c +@@ -376,15 +376,8 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci) + native_rdmsr(MSR_IA32_PLATFORM_ID, val[0], val[1]); + csig.pf = 1 << ((val[1] >> 18) & 7); + } +- native_wrmsr(MSR_IA32_UCODE_REV, 0, 0); + +- /* As documented in the SDM: Do a CPUID 1 here */ +- sync_core(); +- +- /* get the current revision from MSR 0x8B */ +- native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]); +- +- csig.rev = val[1]; ++ csig.rev = intel_get_microcode_revision(); + + uci->cpu_sig = csig; + uci->valid = 1; +@@ -654,31 +647,37 @@ static inline void print_ucode(struct ucode_cpu_info *uci) + static int apply_microcode_early(struct ucode_cpu_info *uci, bool early) + { + struct microcode_intel *mc_intel; +- unsigned int val[2]; ++ u32 rev; + + mc_intel = uci->mc; + if (mc_intel == NULL) + return 0; + ++ /* ++ * Save us the MSR write below - which is a particular expensive ++ * operation - when the other hyperthread has updated the microcode ++ * already. ++ */ ++ rev = intel_get_microcode_revision(); ++ if (rev >= mc_intel->hdr.rev) { ++ uci->cpu_sig.rev = rev; ++ return 0; ++ } ++ + /* write microcode via MSR 0x79 */ + native_wrmsr(MSR_IA32_UCODE_WRITE, + (unsigned long) mc_intel->bits, + (unsigned long) mc_intel->bits >> 16 >> 16); +- native_wrmsr(MSR_IA32_UCODE_REV, 0, 0); +- +- /* As documented in the SDM: Do a CPUID 1 here */ +- sync_core(); + +- /* get the current revision from MSR 0x8B */ +- native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]); +- if (val[1] != mc_intel->hdr.rev) ++ rev = intel_get_microcode_revision(); ++ if (rev != mc_intel->hdr.rev) + return -1; + + #ifdef CONFIG_X86_64 + /* Flush global tlb. This is precaution. 
*/ + flush_tlb_early(); + #endif +- uci->cpu_sig.rev = val[1]; ++ uci->cpu_sig.rev = rev; + + if (early) + print_ucode(uci); +@@ -852,7 +851,7 @@ static int apply_microcode_intel(int cpu) + { + struct microcode_intel *mc_intel; + struct ucode_cpu_info *uci; +- unsigned int val[2]; ++ u32 rev; + int cpu_num = raw_smp_processor_id(); + struct cpuinfo_x86 *c = &cpu_data(cpu_num); + +@@ -873,31 +872,40 @@ static int apply_microcode_intel(int cpu) + if (get_matching_mc(mc_intel, cpu) == 0) + return 0; + ++ /* ++ * Save us the MSR write below - which is a particular expensive ++ * operation - when the other hyperthread has updated the microcode ++ * already. ++ */ ++ rev = intel_get_microcode_revision(); ++ if (rev >= mc_intel->hdr.rev) ++ goto out; ++ + /* write microcode via MSR 0x79 */ + wrmsr(MSR_IA32_UCODE_WRITE, + (unsigned long) mc_intel->bits, + (unsigned long) mc_intel->bits >> 16 >> 16); +- wrmsr(MSR_IA32_UCODE_REV, 0, 0); +- +- /* As documented in the SDM: Do a CPUID 1 here */ +- sync_core(); + +- /* get the current revision from MSR 0x8B */ +- rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]); ++ rev = intel_get_microcode_revision(); + +- if (val[1] != mc_intel->hdr.rev) { ++ if (rev != mc_intel->hdr.rev) { + pr_err("CPU%d update to revision 0x%x failed\n", + cpu_num, mc_intel->hdr.rev); + return -1; + } + pr_info("CPU%d updated to revision 0x%x, date = %04x-%02x-%02x\n", +- cpu_num, val[1], ++ cpu_num, rev, + mc_intel->hdr.date & 0xffff, + mc_intel->hdr.date >> 24, + (mc_intel->hdr.date >> 16) & 0xff); + +- uci->cpu_sig.rev = val[1]; +- c->microcode = val[1]; ++out: ++ uci->cpu_sig.rev = rev; ++ c->microcode = rev; ++ ++ /* Update boot_cpu_data's revision too, if we're on the BSP: */ ++ if (c->cpu_index == boot_cpu_data.cpu_index) ++ boot_cpu_data.microcode = rev; + + return 0; + } +diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c +index 7b79c80ce029..325ed90511cf 100644 +--- a/arch/x86/kernel/cpu/perf_event_intel.c ++++ b/arch/x86/kernel/cpu/perf_event_intel.c +@@ -2513,7 +2513,7 @@ static int intel_pmu_hw_config(struct perf_event *event) + return ret; + + if (event->attr.precise_ip) { +- if (!event->attr.freq) { ++ if (!(event->attr.freq || event->attr.wakeup_events)) { + event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD; + if (!(event->attr.sample_type & + ~intel_pmu_free_running_flags(event))) +diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c +index 697f90db0e37..a4df15f3878e 100644 +--- a/arch/x86/kernel/nmi.c ++++ b/arch/x86/kernel/nmi.c +@@ -29,6 +29,7 @@ + #include + #include + #include ++#include + + #define CREATE_TRACE_POINTS + #include +@@ -522,6 +523,9 @@ nmi_restart: + write_cr2(this_cpu_read(nmi_cr2)); + if (this_cpu_dec_return(nmi_state)) + goto nmi_restart; ++ ++ if (user_mode(regs)) ++ mds_user_clear_cpu_buffers(); + } + NOKPROBE_SYMBOL(do_nmi); + +diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c +index e18c8798c3a2..64090c943f05 100644 +--- a/arch/x86/kernel/process.c ++++ b/arch/x86/kernel/process.c +@@ -33,6 +33,8 @@ + #include + #include + ++#include "process.h" ++ + /* + * per-CPU TSS segments. Threads are completely 'soft' on Linux, + * no more per-task TSS's. 
The TSS size is kept cacheline-aligned +@@ -179,11 +181,12 @@ int set_tsc_mode(unsigned int val) + return 0; + } + +-static inline void switch_to_bitmap(struct tss_struct *tss, +- struct thread_struct *prev, ++static inline void switch_to_bitmap(struct thread_struct *prev, + struct thread_struct *next, + unsigned long tifp, unsigned long tifn) + { ++ struct tss_struct *tss = this_cpu_ptr(&cpu_tss); ++ + if (tifn & _TIF_IO_BITMAP) { + /* + * Copy the relevant range of the IO bitmap. +@@ -317,32 +320,85 @@ static __always_inline void amd_set_ssb_virt_state(unsigned long tifn) + wrmsrl(MSR_AMD64_VIRT_SPEC_CTRL, ssbd_tif_to_spec_ctrl(tifn)); + } + +-static __always_inline void intel_set_ssb_state(unsigned long tifn) ++/* ++ * Update the MSRs managing speculation control, during context switch. ++ * ++ * tifp: Previous task's thread flags ++ * tifn: Next task's thread flags ++ */ ++static __always_inline void __speculation_ctrl_update(unsigned long tifp, ++ unsigned long tifn) + { +- u64 msr = x86_spec_ctrl_base | ssbd_tif_to_spec_ctrl(tifn); ++ unsigned long tif_diff = tifp ^ tifn; ++ u64 msr = x86_spec_ctrl_base; ++ bool updmsr = false; ++ ++ /* ++ * If TIF_SSBD is different, select the proper mitigation ++ * method. Note that if SSBD mitigation is disabled or permanentely ++ * enabled this branch can't be taken because nothing can set ++ * TIF_SSBD. ++ */ ++ if (tif_diff & _TIF_SSBD) { ++ if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) { ++ amd_set_ssb_virt_state(tifn); ++ } else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) { ++ amd_set_core_ssb_state(tifn); ++ } else if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) || ++ static_cpu_has(X86_FEATURE_AMD_SSBD)) { ++ msr |= ssbd_tif_to_spec_ctrl(tifn); ++ updmsr = true; ++ } ++ } ++ ++ /* ++ * Only evaluate TIF_SPEC_IB if conditional STIBP is enabled, ++ * otherwise avoid the MSR write. ++ */ ++ if (IS_ENABLED(CONFIG_SMP) && ++ static_branch_unlikely(&switch_to_cond_stibp)) { ++ updmsr |= !!(tif_diff & _TIF_SPEC_IB); ++ msr |= stibp_tif_to_spec_ctrl(tifn); ++ } + +- wrmsrl(MSR_IA32_SPEC_CTRL, msr); ++ if (updmsr) ++ wrmsrl(MSR_IA32_SPEC_CTRL, msr); + } + +-static __always_inline void __speculative_store_bypass_update(unsigned long tifn) ++static unsigned long speculation_ctrl_update_tif(struct task_struct *tsk) + { +- if (static_cpu_has(X86_FEATURE_VIRT_SSBD)) +- amd_set_ssb_virt_state(tifn); +- else if (static_cpu_has(X86_FEATURE_LS_CFG_SSBD)) +- amd_set_core_ssb_state(tifn); +- else +- intel_set_ssb_state(tifn); ++ if (test_and_clear_tsk_thread_flag(tsk, TIF_SPEC_FORCE_UPDATE)) { ++ if (task_spec_ssb_disable(tsk)) ++ set_tsk_thread_flag(tsk, TIF_SSBD); ++ else ++ clear_tsk_thread_flag(tsk, TIF_SSBD); ++ ++ if (task_spec_ib_disable(tsk)) ++ set_tsk_thread_flag(tsk, TIF_SPEC_IB); ++ else ++ clear_tsk_thread_flag(tsk, TIF_SPEC_IB); ++ } ++ /* Return the updated threadinfo flags*/ ++ return task_thread_info(tsk)->flags; + } + +-void speculative_store_bypass_update(unsigned long tif) ++void speculation_ctrl_update(unsigned long tif) + { ++ /* Forced update. 
Make sure all relevant TIF flags are different */ + preempt_disable(); +- __speculative_store_bypass_update(tif); ++ __speculation_ctrl_update(~tif, tif); + preempt_enable(); + } + +-void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, +- struct tss_struct *tss) ++/* Called from seccomp/prctl update */ ++void speculation_ctrl_update_current(void) ++{ ++ preempt_disable(); ++ speculation_ctrl_update(speculation_ctrl_update_tif(current)); ++ preempt_enable(); ++} ++ ++void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p) + { + struct thread_struct *prev, *next; + unsigned long tifp, tifn; +@@ -352,7 +408,7 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, + + tifn = READ_ONCE(task_thread_info(next_p)->flags); + tifp = READ_ONCE(task_thread_info(prev_p)->flags); +- switch_to_bitmap(tss, prev, next, tifp, tifn); ++ switch_to_bitmap(prev, next, tifp, tifn); + + propagate_user_return_notify(prev_p, next_p); + +@@ -370,8 +426,15 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, + if ((tifp ^ tifn) & _TIF_NOTSC) + cr4_toggle_bits(X86_CR4_TSD); + +- if ((tifp ^ tifn) & _TIF_SSBD) +- __speculative_store_bypass_update(tifn); ++ if (likely(!((tifp | tifn) & _TIF_SPEC_FORCE_UPDATE))) { ++ __speculation_ctrl_update(tifp, tifn); ++ } else { ++ speculation_ctrl_update_tif(prev_p); ++ tifn = speculation_ctrl_update_tif(next_p); ++ ++ /* Enforce MSR update to ensure consistent state */ ++ __speculation_ctrl_update(~tifn, tifn); ++ } + } + + /* +diff --git a/arch/x86/kernel/process.h b/arch/x86/kernel/process.h +new file mode 100644 +index 000000000000..898e97cf6629 +--- /dev/null ++++ b/arch/x86/kernel/process.h +@@ -0,0 +1,39 @@ ++// SPDX-License-Identifier: GPL-2.0 ++// ++// Code shared between 32 and 64 bit ++ ++#include ++ ++void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p); ++ ++/* ++ * This needs to be inline to optimize for the common case where no extra ++ * work needs to be done. ++ */ ++static inline void switch_to_extra(struct task_struct *prev, ++ struct task_struct *next) ++{ ++ unsigned long next_tif = task_thread_info(next)->flags; ++ unsigned long prev_tif = task_thread_info(prev)->flags; ++ ++ if (IS_ENABLED(CONFIG_SMP)) { ++ /* ++ * Avoid __switch_to_xtra() invocation when conditional ++ * STIPB is disabled and the only different bit is ++ * TIF_SPEC_IB. For CONFIG_SMP=n TIF_SPEC_IB is not ++ * in the TIF_WORK_CTXSW masks. ++ */ ++ if (!static_branch_likely(&switch_to_cond_stibp)) { ++ prev_tif &= ~_TIF_SPEC_IB; ++ next_tif &= ~_TIF_SPEC_IB; ++ } ++ } ++ ++ /* ++ * __switch_to_xtra() handles debug registers, i/o bitmaps, ++ * speculation mitigations etc. 
++ */ ++ if (unlikely(next_tif & _TIF_WORK_CTXSW_NEXT || ++ prev_tif & _TIF_WORK_CTXSW_PREV)) ++ __switch_to_xtra(prev, next); ++} +diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c +index 9f950917528b..85b112efac30 100644 +--- a/arch/x86/kernel/process_32.c ++++ b/arch/x86/kernel/process_32.c +@@ -55,6 +55,8 @@ + #include + #include + ++#include "process.h" ++ + asmlinkage void ret_from_fork(void) __asm__("ret_from_fork"); + asmlinkage void ret_from_kernel_thread(void) __asm__("ret_from_kernel_thread"); + +@@ -279,12 +281,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) + if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl)) + set_iopl_mask(next->iopl); + +- /* +- * Now maybe handle debug registers and/or IO bitmaps +- */ +- if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV || +- task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT)) +- __switch_to_xtra(prev_p, next_p, tss); ++ switch_to_extra(prev_p, next_p); + + /* + * Leave lazy mode, flushing any hypercalls made here. +diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c +index c7cc81e9bb84..618565fecb1c 100644 +--- a/arch/x86/kernel/process_64.c ++++ b/arch/x86/kernel/process_64.c +@@ -50,6 +50,8 @@ + #include + #include + ++#include "process.h" ++ + asmlinkage extern void ret_from_fork(void); + + __visible DEFINE_PER_CPU(unsigned long, rsp_scratch); +@@ -406,12 +408,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) + /* Reload esp0 and ss1. This changes current_thread_info(). */ + load_sp0(tss, next); + +- /* +- * Now maybe reload the debug registers and handle I/O bitmaps +- */ +- if (unlikely(task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT || +- task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV)) +- __switch_to_xtra(prev_p, next_p, tss); ++ switch_to_extra(prev_p, next_p); + + #ifdef CONFIG_XEN + /* +diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c +index 8c73bf1492b8..6223929fc621 100644 +--- a/arch/x86/kernel/traps.c ++++ b/arch/x86/kernel/traps.c +@@ -61,6 +61,7 @@ + #include + #include + #include ++#include + #include + #include + +@@ -337,6 +338,13 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code) + regs->ip = (unsigned long)general_protection; + regs->sp = (unsigned long)&normal_regs->orig_ax; + ++ /* ++ * This situation can be triggered by userspace via ++ * modify_ldt(2) and the return does not take the regular ++ * user space exit, so a CPU buffer clear is required when ++ * MDS mitigation is enabled. 
++ */ ++ mds_user_clear_cpu_buffers(); + return; + } + #endif +diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c +index b857bb9f6f23..53918abccbc3 100644 +--- a/arch/x86/kvm/cpuid.c ++++ b/arch/x86/kvm/cpuid.c +@@ -343,7 +343,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, + + /* cpuid 0x80000008.ebx */ + const u32 kvm_cpuid_8000_0008_ebx_x86_features = +- F(AMD_IBPB) | F(AMD_IBRS) | F(VIRT_SSBD); ++ F(AMD_IBPB) | F(AMD_IBRS) | F(AMD_SSBD) | F(VIRT_SSBD) | ++ F(AMD_SSB_NO) | F(AMD_STIBP); + + /* cpuid 0xC0000001.edx */ + const u32 kvm_supported_word5_x86_features = +@@ -364,7 +365,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, + + /* cpuid 7.0.edx*/ + const u32 kvm_cpuid_7_0_edx_x86_features = +- F(SPEC_CTRL) | F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES); ++ F(SPEC_CTRL) | F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | ++ F(INTEL_STIBP) | F(MD_CLEAR); + + /* all calls to cpuid_count() should be made on the same cpu */ + get_cpu(); +@@ -607,7 +609,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, + entry->ebx |= F(VIRT_SSBD); + entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features; + cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX); +- if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD)) ++ /* ++ * The preference is to use SPEC CTRL MSR instead of the ++ * VIRT_SPEC MSR. ++ */ ++ if (boot_cpu_has(X86_FEATURE_LS_CFG_SSBD) && ++ !boot_cpu_has(X86_FEATURE_AMD_SSBD)) + entry->ebx |= F(VIRT_SSBD); + break; + } +diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h +index 72f159f4d456..8c28926dc900 100644 +--- a/arch/x86/kvm/cpuid.h ++++ b/arch/x86/kvm/cpuid.h +@@ -175,7 +175,7 @@ static inline bool guest_cpuid_has_spec_ctrl(struct kvm_vcpu *vcpu) + struct kvm_cpuid_entry2 *best; + + best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0); +- if (best && (best->ebx & bit(X86_FEATURE_AMD_IBRS))) ++ if (best && (best->ebx & (bit(X86_FEATURE_AMD_IBRS | bit(X86_FEATURE_AMD_SSBD))))) + return true; + best = kvm_find_cpuid_entry(vcpu, 7, 0); + return best && (best->edx & (bit(X86_FEATURE_SPEC_CTRL) | bit(X86_FEATURE_SPEC_CTRL_SSBD))); +diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c +index acbde1249b6f..9fc536657492 100644 +--- a/arch/x86/kvm/svm.c ++++ b/arch/x86/kvm/svm.c +@@ -3197,7 +3197,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) + return 1; + + /* The STIBP bit doesn't fault even if it's not advertised */ +- if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP)) ++ if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD)) + return 1; + + svm->spec_ctrl = data; +@@ -3928,8 +3928,6 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu) + + clgi(); + +- local_irq_enable(); +- + /* + * If this vCPU has touched SPEC_CTRL, restore the guest's value if + * it's non-zero. 
Since vmentry is serialising on affected CPUs, there +@@ -3938,6 +3936,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu) + */ + x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl); + ++ local_irq_enable(); ++ + asm volatile ( + "push %%" _ASM_BP "; \n\t" + "mov %c[rbx](%[svm]), %%" _ASM_BX " \n\t" +@@ -4060,12 +4060,12 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu) + if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)) + svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL); + +- x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl); +- + reload_tss(vcpu); + + local_irq_disable(); + ++ x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl); ++ + vcpu->arch.cr2 = svm->vmcb->save.cr2; + vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax; + vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp; +diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h +index ab9ae67a80e4..0ec94c6b4757 100644 +--- a/arch/x86/kvm/trace.h ++++ b/arch/x86/kvm/trace.h +@@ -434,13 +434,13 @@ TRACE_EVENT(kvm_apic_ipi, + ); + + TRACE_EVENT(kvm_apic_accept_irq, +- TP_PROTO(__u32 apicid, __u16 dm, __u8 tm, __u8 vec), ++ TP_PROTO(__u32 apicid, __u16 dm, __u16 tm, __u8 vec), + TP_ARGS(apicid, dm, tm, vec), + + TP_STRUCT__entry( + __field( __u32, apicid ) + __field( __u16, dm ) +- __field( __u8, tm ) ++ __field( __u16, tm ) + __field( __u8, vec ) + ), + +diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c +index 706c5d63a53f..d830a0d60ba4 100644 +--- a/arch/x86/kvm/x86.c ++++ b/arch/x86/kvm/x86.c +@@ -2972,6 +2972,10 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu, + | KVM_VCPUEVENT_VALID_SMM)) + return -EINVAL; + ++ if (events->exception.injected && ++ (events->exception.nr > 31 || events->exception.nr == NMI_VECTOR)) ++ return -EINVAL; ++ + /* INITs are latched while in SMM */ + if (events->flags & KVM_VCPUEVENT_VALID_SMM && + (events->smi.smm || events->smi.pending) && +diff --git a/arch/x86/mm/kaiser.c b/arch/x86/mm/kaiser.c +index 7a72e32e4806..2cbcd6f3317d 100644 +--- a/arch/x86/mm/kaiser.c ++++ b/arch/x86/mm/kaiser.c +@@ -10,6 +10,7 @@ + #include + #include + #include ++#include + + #undef pr_fmt + #define pr_fmt(fmt) "Kernel/User page tables isolation: " fmt +@@ -297,7 +298,8 @@ void __init kaiser_check_boottime_disable(void) + goto skip; + } + +- if (cmdline_find_option_bool(boot_command_line, "nopti")) ++ if (cmdline_find_option_bool(boot_command_line, "nopti") || ++ cpu_mitigations_off()) + goto disable; + + skip: +diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c +index 55c7446311a7..50f75768aadd 100644 +--- a/arch/x86/mm/pgtable.c ++++ b/arch/x86/mm/pgtable.c +@@ -247,7 +247,7 @@ static void pgd_mop_up_pmds(struct mm_struct *mm, pgd_t *pgdp) + if (pgd_val(pgd) != 0) { + pmd_t *pmd = (pmd_t *)pgd_page_vaddr(pgd); + +- pgdp[i] = native_make_pgd(0); ++ pgd_clear(&pgdp[i]); + + paravirt_release_pmd(pgd_val(pgd) >> PAGE_SHIFT); + pmd_free(mm, pmd); +@@ -424,7 +424,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, + int changed = !pte_same(*ptep, entry); + + if (changed && dirty) { +- *ptep = entry; ++ set_pte(ptep, entry); + pte_update_defer(vma->vm_mm, address, ptep); + } + +@@ -441,7 +441,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, + VM_BUG_ON(address & ~HPAGE_PMD_MASK); + + if (changed && dirty) { +- *pmdp = entry; ++ set_pmd(pmdp, entry); + pmd_update_defer(vma->vm_mm, address, pmdp); + /* + * We had a write-protection fault here and changed the pmd +diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c +index 
6d683bbb3502..f3237e4cb18f 100644 +--- a/arch/x86/mm/tlb.c ++++ b/arch/x86/mm/tlb.c +@@ -30,6 +30,12 @@ + * Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi + */ + ++/* ++ * Use bit 0 to mangle the TIF_SPEC_IB state into the mm pointer which is ++ * stored in cpu_tlb_state.last_user_mm_ibpb. ++ */ ++#define LAST_USER_MM_IBPB 0x1UL ++ + atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1); + + struct flush_tlb_info { +@@ -101,41 +107,101 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next, + local_irq_restore(flags); + } + +-void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, +- struct task_struct *tsk) ++static inline unsigned long mm_mangle_tif_spec_ib(struct task_struct *next) + { +- unsigned cpu = smp_processor_id(); ++ unsigned long next_tif = task_thread_info(next)->flags; ++ unsigned long ibpb = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_IBPB; + +- if (likely(prev != next)) { +- u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id); ++ return (unsigned long)next->mm | ibpb; ++} ++ ++static void cond_ibpb(struct task_struct *next) ++{ ++ if (!next || !next->mm) ++ return; ++ ++ /* ++ * Both, the conditional and the always IBPB mode use the mm ++ * pointer to avoid the IBPB when switching between tasks of the ++ * same process. Using the mm pointer instead of mm->context.ctx_id ++ * opens a hypothetical hole vs. mm_struct reuse, which is more or ++ * less impossible to control by an attacker. Aside of that it ++ * would only affect the first schedule so the theoretically ++ * exposed data is not really interesting. ++ */ ++ if (static_branch_likely(&switch_mm_cond_ibpb)) { ++ unsigned long prev_mm, next_mm; + + /* +- * Avoid user/user BTB poisoning by flushing the branch +- * predictor when switching between processes. This stops +- * one process from doing Spectre-v2 attacks on another. ++ * This is a bit more complex than the always mode because ++ * it has to handle two cases: ++ * ++ * 1) Switch from a user space task (potential attacker) ++ * which has TIF_SPEC_IB set to a user space task ++ * (potential victim) which has TIF_SPEC_IB not set. ++ * ++ * 2) Switch from a user space task (potential attacker) ++ * which has TIF_SPEC_IB not set to a user space task ++ * (potential victim) which has TIF_SPEC_IB set. ++ * ++ * This could be done by unconditionally issuing IBPB when ++ * a task which has TIF_SPEC_IB set is either scheduled in ++ * or out. Though that results in two flushes when: ++ * ++ * - the same user space task is scheduled out and later ++ * scheduled in again and only a kernel thread ran in ++ * between. ++ * ++ * - a user space task belonging to the same process is ++ * scheduled in after a kernel thread ran in between + * +- * As an optimization, flush indirect branches only when +- * switching into processes that disable dumping. This +- * protects high value processes like gpg, without having +- * too high performance overhead. IBPB is *expensive*! ++ * - a user space task belonging to the same process is ++ * scheduled in immediately. + * +- * This will not flush branches when switching into kernel +- * threads. It will also not flush if we switch to idle +- * thread and back to the same process. It will flush if we +- * switch to a different non-dumpable process. ++ * Optimize this with reasonably small overhead for the ++ * above cases. Mangle the TIF_SPEC_IB bit into the mm ++ * pointer of the incoming task which is stored in ++ * cpu_tlbstate.last_user_mm_ibpb for comparison. 
+ */ +- if (tsk && tsk->mm && +- tsk->mm->context.ctx_id != last_ctx_id && +- get_dumpable(tsk->mm) != SUID_DUMP_USER) ++ next_mm = mm_mangle_tif_spec_ib(next); ++ prev_mm = this_cpu_read(cpu_tlbstate.last_user_mm_ibpb); ++ ++ /* ++ * Issue IBPB only if the mm's are different and one or ++ * both have the IBPB bit set. ++ */ ++ if (next_mm != prev_mm && ++ (next_mm | prev_mm) & LAST_USER_MM_IBPB) + indirect_branch_prediction_barrier(); + ++ this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, next_mm); ++ } ++ ++ if (static_branch_unlikely(&switch_mm_always_ibpb)) { + /* +- * Record last user mm's context id, so we can avoid +- * flushing branch buffer with IBPB if we switch back +- * to the same user. ++ * Only flush when switching to a user space task with a ++ * different context than the user space task which ran ++ * last on this CPU. ++ */ ++ if (this_cpu_read(cpu_tlbstate.last_user_mm) != next->mm) { ++ indirect_branch_prediction_barrier(); ++ this_cpu_write(cpu_tlbstate.last_user_mm, next->mm); ++ } ++ } ++} ++ ++void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, ++ struct task_struct *tsk) ++{ ++ unsigned cpu = smp_processor_id(); ++ ++ if (likely(prev != next)) { ++ /* ++ * Avoid user/user BTB poisoning by flushing the branch ++ * predictor when switching between processes. This stops ++ * one process from doing Spectre-v2 attacks on another. + */ +- if (next != &init_mm) +- this_cpu_write(cpu_tlbstate.last_ctx_id, next->context.ctx_id); ++ cond_ibpb(tsk); + + this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK); + this_cpu_write(cpu_tlbstate.active_mm, next); +diff --git a/drivers/ata/libata-zpodd.c b/drivers/ata/libata-zpodd.c +index 0ad96c647541..7017a81d53cf 100644 +--- a/drivers/ata/libata-zpodd.c ++++ b/drivers/ata/libata-zpodd.c +@@ -51,38 +51,52 @@ static int eject_tray(struct ata_device *dev) + /* Per the spec, only slot type and drawer type ODD can be supported */ + static enum odd_mech_type zpodd_get_mech_type(struct ata_device *dev) + { +- char buf[16]; ++ char *buf; + unsigned int ret; +- struct rm_feature_desc *desc = (void *)(buf + 8); ++ struct rm_feature_desc *desc; + struct ata_taskfile tf; + static const char cdb[] = { GPCMD_GET_CONFIGURATION, + 2, /* only 1 feature descriptor requested */ + 0, 3, /* 3, removable medium feature */ + 0, 0, 0,/* reserved */ +- 0, sizeof(buf), ++ 0, 16, + 0, 0, 0, + }; + ++ buf = kzalloc(16, GFP_KERNEL); ++ if (!buf) ++ return ODD_MECH_TYPE_UNSUPPORTED; ++ desc = (void *)(buf + 8); ++ + ata_tf_init(dev, &tf); + tf.flags = ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE; + tf.command = ATA_CMD_PACKET; + tf.protocol = ATAPI_PROT_PIO; +- tf.lbam = sizeof(buf); ++ tf.lbam = 16; + + ret = ata_exec_internal(dev, &tf, cdb, DMA_FROM_DEVICE, +- buf, sizeof(buf), 0); +- if (ret) ++ buf, 16, 0); ++ if (ret) { ++ kfree(buf); + return ODD_MECH_TYPE_UNSUPPORTED; ++ } + +- if (be16_to_cpu(desc->feature_code) != 3) ++ if (be16_to_cpu(desc->feature_code) != 3) { ++ kfree(buf); + return ODD_MECH_TYPE_UNSUPPORTED; ++ } + +- if (desc->mech_type == 0 && desc->load == 0 && desc->eject == 1) ++ if (desc->mech_type == 0 && desc->load == 0 && desc->eject == 1) { ++ kfree(buf); + return ODD_MECH_TYPE_SLOT; +- else if (desc->mech_type == 1 && desc->load == 0 && desc->eject == 1) ++ } else if (desc->mech_type == 1 && desc->load == 0 && ++ desc->eject == 1) { ++ kfree(buf); + return ODD_MECH_TYPE_DRAWER; +- else ++ } else { ++ kfree(buf); + return ODD_MECH_TYPE_UNSUPPORTED; ++ } + } + + /* Test if ODD is zero power ready by sense code */ +diff --git 
a/drivers/base/cpu.c b/drivers/base/cpu.c +index 41090ef5facb..3934aaf9d157 100644 +--- a/drivers/base/cpu.c ++++ b/drivers/base/cpu.c +@@ -530,11 +530,18 @@ ssize_t __weak cpu_show_l1tf(struct device *dev, + return sprintf(buf, "Not affected\n"); + } + ++ssize_t __weak cpu_show_mds(struct device *dev, ++ struct device_attribute *attr, char *buf) ++{ ++ return sprintf(buf, "Not affected\n"); ++} ++ + static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL); + static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL); + static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL); + static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL); + static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL); ++static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL); + + static struct attribute *cpu_root_vulnerabilities_attrs[] = { + &dev_attr_meltdown.attr, +@@ -542,6 +549,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = { + &dev_attr_spectre_v2.attr, + &dev_attr_spec_store_bypass.attr, + &dev_attr_l1tf.attr, ++ &dev_attr_mds.attr, + NULL + }; + +diff --git a/drivers/block/loop.c b/drivers/block/loop.c +index ae361ee90587..da3902ac16c8 100644 +--- a/drivers/block/loop.c ++++ b/drivers/block/loop.c +@@ -82,7 +82,6 @@ + + static DEFINE_IDR(loop_index_idr); + static DEFINE_MUTEX(loop_index_mutex); +-static DEFINE_MUTEX(loop_ctl_mutex); + + static int max_part; + static int part_shift; +@@ -1045,7 +1044,7 @@ static int loop_clr_fd(struct loop_device *lo) + */ + if (atomic_read(&lo->lo_refcnt) > 1) { + lo->lo_flags |= LO_FLAGS_AUTOCLEAR; +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + return 0; + } + +@@ -1094,12 +1093,12 @@ static int loop_clr_fd(struct loop_device *lo) + if (!part_shift) + lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN; + loop_unprepare_queue(lo); +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + /* +- * Need not hold loop_ctl_mutex to fput backing file. +- * Calling fput holding loop_ctl_mutex triggers a circular ++ * Need not hold lo_ctl_mutex to fput backing file. ++ * Calling fput holding lo_ctl_mutex triggers a circular + * lock dependency possibility warning as fput can take +- * bd_mutex which is usually taken before loop_ctl_mutex. ++ * bd_mutex which is usually taken before lo_ctl_mutex. + */ + fput(filp); + return 0; +@@ -1362,7 +1361,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, + struct loop_device *lo = bdev->bd_disk->private_data; + int err; + +- mutex_lock_nested(&loop_ctl_mutex, 1); ++ mutex_lock_nested(&lo->lo_ctl_mutex, 1); + switch (cmd) { + case LOOP_SET_FD: + err = loop_set_fd(lo, mode, bdev, arg); +@@ -1371,7 +1370,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, + err = loop_change_fd(lo, bdev, arg); + break; + case LOOP_CLR_FD: +- /* loop_clr_fd would have unlocked loop_ctl_mutex on success */ ++ /* loop_clr_fd would have unlocked lo_ctl_mutex on success */ + err = loop_clr_fd(lo); + if (!err) + goto out_unlocked; +@@ -1407,7 +1406,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode, + default: + err = lo->ioctl ? 
lo->ioctl(lo, cmd, arg) : -EINVAL; + } +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + + out_unlocked: + return err; +@@ -1540,16 +1539,16 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode, + + switch(cmd) { + case LOOP_SET_STATUS: +- mutex_lock(&loop_ctl_mutex); ++ mutex_lock(&lo->lo_ctl_mutex); + err = loop_set_status_compat( + lo, (const struct compat_loop_info __user *) arg); +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + break; + case LOOP_GET_STATUS: +- mutex_lock(&loop_ctl_mutex); ++ mutex_lock(&lo->lo_ctl_mutex); + err = loop_get_status_compat( + lo, (struct compat_loop_info __user *) arg); +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + break; + case LOOP_SET_CAPACITY: + case LOOP_CLR_FD: +@@ -1593,7 +1592,7 @@ static void __lo_release(struct loop_device *lo) + if (atomic_dec_return(&lo->lo_refcnt)) + return; + +- mutex_lock(&loop_ctl_mutex); ++ mutex_lock(&lo->lo_ctl_mutex); + if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { + /* + * In autoclear mode, stop the loop thread +@@ -1610,7 +1609,7 @@ static void __lo_release(struct loop_device *lo) + loop_flush(lo); + } + +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + } + + static void lo_release(struct gendisk *disk, fmode_t mode) +@@ -1656,10 +1655,10 @@ static int unregister_transfer_cb(int id, void *ptr, void *data) + struct loop_device *lo = ptr; + struct loop_func_table *xfer = data; + +- mutex_lock(&loop_ctl_mutex); ++ mutex_lock(&lo->lo_ctl_mutex); + if (lo->lo_encryption == xfer) + loop_release_xfer(lo); +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + return 0; + } + +@@ -1821,6 +1820,7 @@ static int loop_add(struct loop_device **l, int i) + if (!part_shift) + disk->flags |= GENHD_FL_NO_PART_SCAN; + disk->flags |= GENHD_FL_EXT_DEVT; ++ mutex_init(&lo->lo_ctl_mutex); + atomic_set(&lo->lo_refcnt, 0); + lo->lo_number = i; + spin_lock_init(&lo->lo_lock); +@@ -1933,19 +1933,19 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd, + ret = loop_lookup(&lo, parm); + if (ret < 0) + break; +- mutex_lock(&loop_ctl_mutex); ++ mutex_lock(&lo->lo_ctl_mutex); + if (lo->lo_state != Lo_unbound) { + ret = -EBUSY; +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + break; + } + if (atomic_read(&lo->lo_refcnt) > 0) { + ret = -EBUSY; +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + break; + } + lo->lo_disk->private_data = NULL; +- mutex_unlock(&loop_ctl_mutex); ++ mutex_unlock(&lo->lo_ctl_mutex); + idr_remove(&loop_index_idr, lo->lo_number); + loop_remove(lo); + break; +diff --git a/drivers/block/loop.h b/drivers/block/loop.h +index a923e74495ce..60f0fd2c0c65 100644 +--- a/drivers/block/loop.h ++++ b/drivers/block/loop.h +@@ -55,6 +55,7 @@ struct loop_device { + + spinlock_t lo_lock; + int lo_state; ++ struct mutex lo_ctl_mutex; + struct kthread_worker worker; + struct task_struct *worker_task; + bool use_dio; +diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c +index c4328d9d9981..f838119d12b2 100644 +--- a/drivers/block/xsysace.c ++++ b/drivers/block/xsysace.c +@@ -1062,6 +1062,8 @@ static int ace_setup(struct ace_device *ace) + return 0; + + err_read: ++ /* prevent double queue cleanup */ ++ ace->gd->queue = NULL; + put_disk(ace->gd); + err_alloc_disk: + blk_cleanup_queue(ace->queue); +diff --git a/drivers/gpu/ipu-v3/ipu-dp.c b/drivers/gpu/ipu-v3/ipu-dp.c +index 98686edbcdbb..33de3a1bac49 100644 +--- a/drivers/gpu/ipu-v3/ipu-dp.c ++++ 
b/drivers/gpu/ipu-v3/ipu-dp.c +@@ -195,7 +195,8 @@ int ipu_dp_setup_channel(struct ipu_dp *dp, + ipu_dp_csc_init(flow, flow->foreground.in_cs, flow->out_cs, + DP_COM_CONF_CSC_DEF_BOTH); + } else { +- if (flow->foreground.in_cs == flow->out_cs) ++ if (flow->foreground.in_cs == IPUV3_COLORSPACE_UNKNOWN || ++ flow->foreground.in_cs == flow->out_cs) + /* + * foreground identical to output, apply color + * conversion on background +@@ -261,6 +262,8 @@ void ipu_dp_disable_channel(struct ipu_dp *dp) + struct ipu_dp_priv *priv = flow->priv; + u32 reg, csc; + ++ dp->in_cs = IPUV3_COLORSPACE_UNKNOWN; ++ + if (!dp->foreground) + return; + +@@ -268,8 +271,9 @@ void ipu_dp_disable_channel(struct ipu_dp *dp) + + reg = readl(flow->base + DP_COM_CONF); + csc = reg & DP_COM_CONF_CSC_DEF_MASK; +- if (csc == DP_COM_CONF_CSC_DEF_FG) +- reg &= ~DP_COM_CONF_CSC_DEF_MASK; ++ reg &= ~DP_COM_CONF_CSC_DEF_MASK; ++ if (csc == DP_COM_CONF_CSC_DEF_BOTH || csc == DP_COM_CONF_CSC_DEF_BG) ++ reg |= DP_COM_CONF_CSC_DEF_BG; + + reg &= ~DP_COM_CONF_FG_EN; + writel(reg, flow->base + DP_COM_CONF); +@@ -350,6 +354,8 @@ int ipu_dp_init(struct ipu_soc *ipu, struct device *dev, unsigned long base) + mutex_init(&priv->mutex); + + for (i = 0; i < IPUV3_NUM_FLOWS; i++) { ++ priv->flow[i].background.in_cs = IPUV3_COLORSPACE_UNKNOWN; ++ priv->flow[i].foreground.in_cs = IPUV3_COLORSPACE_UNKNOWN; + priv->flow[i].foreground.foreground = true; + priv->flow[i].base = priv->base + ipu_dp_flow_base[i]; + priv->flow[i].priv = priv; +diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c +index d7179dd3c9ef..3cafa1d28fed 100644 +--- a/drivers/hid/hid-debug.c ++++ b/drivers/hid/hid-debug.c +@@ -1058,10 +1058,15 @@ static int hid_debug_rdesc_show(struct seq_file *f, void *p) + seq_printf(f, "\n\n"); + + /* dump parsed data and input mappings */ ++ if (down_interruptible(&hdev->driver_input_lock)) ++ return 0; ++ + hid_dump_device(hdev, f); + seq_printf(f, "\n"); + hid_dump_input_mapping(hdev, f); + ++ up(&hdev->driver_input_lock); ++ + return 0; + } + +diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c +index 8d74e691ac90..ee3c66c02043 100644 +--- a/drivers/hid/hid-input.c ++++ b/drivers/hid/hid-input.c +@@ -783,6 +783,10 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel + case 0x074: map_key_clear(KEY_BRIGHTNESS_MAX); break; + case 0x075: map_key_clear(KEY_BRIGHTNESS_AUTO); break; + ++ case 0x079: map_key_clear(KEY_KBDILLUMUP); break; ++ case 0x07a: map_key_clear(KEY_KBDILLUMDOWN); break; ++ case 0x07c: map_key_clear(KEY_KBDILLUMTOGGLE); break; ++ + case 0x082: map_key_clear(KEY_VIDEO_NEXT); break; + case 0x083: map_key_clear(KEY_LAST); break; + case 0x084: map_key_clear(KEY_ENTER); break; +@@ -913,6 +917,8 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel + case 0x2cb: map_key_clear(KEY_KBDINPUTASSIST_ACCEPT); break; + case 0x2cc: map_key_clear(KEY_KBDINPUTASSIST_CANCEL); break; + ++ case 0x29f: map_key_clear(KEY_SCALE); break; ++ + default: map_key_clear(KEY_UNKNOWN); + } + break; +diff --git a/drivers/hwtracing/intel_th/gth.c b/drivers/hwtracing/intel_th/gth.c +index eb43943cdf07..189eb6269971 100644 +--- a/drivers/hwtracing/intel_th/gth.c ++++ b/drivers/hwtracing/intel_th/gth.c +@@ -597,7 +597,7 @@ static void intel_th_gth_unassign(struct intel_th_device *thdev, + othdev->output.port = -1; + othdev->output.active = false; + gth->output[port].output = NULL; +- for (master = 0; master < TH_CONFIGURABLE_MASTERS; master++) ++ for (master = 0; master <= 
TH_CONFIGURABLE_MASTERS; master++) + if (gth->master[master] == port) + gth->master[master] = -1; + spin_unlock(>h->gth_lock); +diff --git a/drivers/iio/adc/xilinx-xadc-core.c b/drivers/iio/adc/xilinx-xadc-core.c +index 475c5a74f2d1..6398e86a272b 100644 +--- a/drivers/iio/adc/xilinx-xadc-core.c ++++ b/drivers/iio/adc/xilinx-xadc-core.c +@@ -1299,7 +1299,7 @@ static int xadc_remove(struct platform_device *pdev) + } + free_irq(irq, indio_dev); + clk_disable_unprepare(xadc->clk); +- cancel_delayed_work(&xadc->zynq_unmask_work); ++ cancel_delayed_work_sync(&xadc->zynq_unmask_work); + kfree(xadc->data); + kfree(indio_dev->channels); + +diff --git a/drivers/input/keyboard/snvs_pwrkey.c b/drivers/input/keyboard/snvs_pwrkey.c +index 9adf13a5864a..57143365e945 100644 +--- a/drivers/input/keyboard/snvs_pwrkey.c ++++ b/drivers/input/keyboard/snvs_pwrkey.c +@@ -156,6 +156,9 @@ static int imx_snvs_pwrkey_probe(struct platform_device *pdev) + return error; + } + ++ pdata->input = input; ++ platform_set_drvdata(pdev, pdata); ++ + error = devm_request_irq(&pdev->dev, pdata->irq, + imx_snvs_pwrkey_interrupt, + 0, pdev->name, pdev); +@@ -172,9 +175,6 @@ static int imx_snvs_pwrkey_probe(struct platform_device *pdev) + return error; + } + +- pdata->input = input; +- platform_set_drvdata(pdev, pdata); +- + device_init_wakeup(&pdev->dev, pdata->wakeup); + + return 0; +diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c +index 94f1bf772ec9..db85cc5791dc 100644 +--- a/drivers/iommu/amd_iommu_init.c ++++ b/drivers/iommu/amd_iommu_init.c +@@ -295,7 +295,7 @@ static void iommu_write_l2(struct amd_iommu *iommu, u8 address, u32 val) + static void iommu_set_exclusion_range(struct amd_iommu *iommu) + { + u64 start = iommu->exclusion_start & PAGE_MASK; +- u64 limit = (start + iommu->exclusion_length) & PAGE_MASK; ++ u64 limit = (start + iommu->exclusion_length - 1) & PAGE_MASK; + u64 entry; + + if (!iommu->exclusion_start) +diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c +index 5e65dc6def7e..17517889d46b 100644 +--- a/drivers/md/raid5.c ++++ b/drivers/md/raid5.c +@@ -3897,26 +3897,15 @@ static void handle_parity_checks6(struct r5conf *conf, struct stripe_head *sh, + case check_state_check_result: + sh->check_state = check_state_idle; + ++ if (s->failed > 1) ++ break; + /* handle a successful check operation, if parity is correct + * we are done. 
Otherwise update the mismatch count and repair + * parity if !MD_RECOVERY_CHECK + */ + if (sh->ops.zero_sum_result == 0) { +- /* both parities are correct */ +- if (!s->failed) +- set_bit(STRIPE_INSYNC, &sh->state); +- else { +- /* in contrast to the raid5 case we can validate +- * parity, but still have a failure to write +- * back +- */ +- sh->check_state = check_state_compute_result; +- /* Returning at this point means that we may go +- * off and bring p and/or q uptodate again so +- * we make sure to check zero_sum_result again +- * to verify if p or q need writeback +- */ +- } ++ /* Any parity checked was correct */ ++ set_bit(STRIPE_INSYNC, &sh->state); + } else { + atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); + if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) +diff --git a/drivers/media/i2c/ov7670.c b/drivers/media/i2c/ov7670.c +index e1b5dc84c14e..24a0c21a3d8d 100644 +--- a/drivers/media/i2c/ov7670.c ++++ b/drivers/media/i2c/ov7670.c +@@ -155,10 +155,10 @@ MODULE_PARM_DESC(debug, "Debug level (0-1)"); + #define REG_GFIX 0x69 /* Fix gain control */ + + #define REG_DBLV 0x6b /* PLL control an debugging */ +-#define DBLV_BYPASS 0x00 /* Bypass PLL */ +-#define DBLV_X4 0x01 /* clock x4 */ +-#define DBLV_X6 0x10 /* clock x6 */ +-#define DBLV_X8 0x11 /* clock x8 */ ++#define DBLV_BYPASS 0x0a /* Bypass PLL */ ++#define DBLV_X4 0x4a /* clock x4 */ ++#define DBLV_X6 0x8a /* clock x6 */ ++#define DBLV_X8 0xca /* clock x8 */ + + #define REG_REG76 0x76 /* OV's name */ + #define R76_BLKPCOR 0x80 /* Black pixel correction enable */ +@@ -833,7 +833,7 @@ static int ov7675_set_framerate(struct v4l2_subdev *sd, + if (ret < 0) + return ret; + +- return ov7670_write(sd, REG_DBLV, DBLV_X4); ++ return 0; + } + + static void ov7670_get_framerate_legacy(struct v4l2_subdev *sd, +@@ -1578,11 +1578,7 @@ static int ov7670_probe(struct i2c_client *client, + if (config->clock_speed) + info->clock_speed = config->clock_speed; + +- /* +- * It should be allowed for ov7670 too when it is migrated to +- * the new frame rate formula. 
+- */ +- if (config->pll_bypass && id->driver_data != MODEL_OV7670) ++ if (config->pll_bypass) + info->pll_bypass = true; + + if (config->pclk_hb_disable) +diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c +index 66560a8fcfa2..1022e80aaf97 100644 +--- a/drivers/net/bonding/bond_options.c ++++ b/drivers/net/bonding/bond_options.c +@@ -1066,13 +1066,6 @@ static int bond_option_arp_validate_set(struct bonding *bond, + { + netdev_info(bond->dev, "Setting arp_validate to %s (%llu)\n", + newval->string, newval->value); +- +- if (bond->dev->flags & IFF_UP) { +- if (!newval->value) +- bond->recv_probe = NULL; +- else if (bond->params.arp_interval) +- bond->recv_probe = bond_arp_rcv; +- } + bond->params.arp_validate = newval->value; + + return 0; +diff --git a/drivers/net/bonding/bond_sysfs_slave.c b/drivers/net/bonding/bond_sysfs_slave.c +index 7d16c51e6913..641a532b67cb 100644 +--- a/drivers/net/bonding/bond_sysfs_slave.c ++++ b/drivers/net/bonding/bond_sysfs_slave.c +@@ -55,7 +55,9 @@ static SLAVE_ATTR_RO(link_failure_count); + + static ssize_t perm_hwaddr_show(struct slave *slave, char *buf) + { +- return sprintf(buf, "%pM\n", slave->perm_hwaddr); ++ return sprintf(buf, "%*phC\n", ++ slave->dev->addr_len, ++ slave->perm_hwaddr); + } + static SLAVE_ATTR_RO(perm_hwaddr); + +diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c +index 00bd7be85679..d9ab970dcbe9 100644 +--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c ++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c +@@ -4957,8 +4957,15 @@ static int bnxt_cfg_rx_mode(struct bnxt *bp) + + skip_uc: + rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, 0); ++ if (rc && vnic->mc_list_count) { ++ netdev_info(bp->dev, "Failed setting MC filters rc: %d, turning on ALL_MCAST mode\n", ++ rc); ++ vnic->rx_mask |= CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST; ++ vnic->mc_list_count = 0; ++ rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, 0); ++ } + if (rc) +- netdev_err(bp->dev, "HWRM cfa l2 rx mask failure rc: %x\n", ++ netdev_err(bp->dev, "HWRM cfa l2 rx mask failure rc: %d\n", + rc); + + return rc; +diff --git a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c +index 89714f5e0dfc..c8b9a73d6b1b 100644 +--- a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c ++++ b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c +@@ -253,14 +253,12 @@ uec_set_ringparam(struct net_device *netdev, + return -EINVAL; + } + ++ if (netif_running(netdev)) ++ return -EBUSY; ++ + ug_info->bdRingLenRx[queue] = ring->rx_pending; + ug_info->bdRingLenTx[queue] = ring->tx_pending; + +- if (netif_running(netdev)) { +- /* FIXME: restart automatically */ +- netdev_info(netdev, "Please re-open the interface\n"); +- } +- + return ret; + } + +diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c b/drivers/net/ethernet/hisilicon/hns/hnae.c +index b3645297477e..3ce41efe8a94 100644 +--- a/drivers/net/ethernet/hisilicon/hns/hnae.c ++++ b/drivers/net/ethernet/hisilicon/hns/hnae.c +@@ -144,7 +144,6 @@ out_buffer_fail: + /* free desc along with its attached buffer */ + static void hnae_free_desc(struct hnae_ring *ring) + { +- hnae_free_buffers(ring); + dma_unmap_single(ring_to_dev(ring), ring->desc_dma_addr, + ring->desc_num * sizeof(ring->desc[0]), + ring_to_dma_dir(ring)); +@@ -177,6 +176,9 @@ static int hnae_alloc_desc(struct hnae_ring *ring) + /* fini ring, also free the buffer for the ring */ + static void hnae_fini_ring(struct hnae_ring *ring) + { ++ if (is_rx_ring(ring)) ++ 
hnae_free_buffers(ring); ++ + hnae_free_desc(ring); + kfree(ring->desc_cb); + ring->desc_cb = NULL; +diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c +index 2fa54b0b0679..6d649e7b45a9 100644 +--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c ++++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c +@@ -28,9 +28,6 @@ + + #define SERVICE_TIMER_HZ (1 * HZ) + +-#define NIC_TX_CLEAN_MAX_NUM 256 +-#define NIC_RX_CLEAN_MAX_NUM 64 +- + #define RCB_IRQ_NOT_INITED 0 + #define RCB_IRQ_INITED 1 + +@@ -1408,7 +1405,7 @@ static int hns_nic_init_ring_data(struct hns_nic_priv *priv) + rd->fini_process = hns_nic_tx_fini_pro; + + netif_napi_add(priv->netdev, &rd->napi, +- hns_nic_common_poll, NIC_TX_CLEAN_MAX_NUM); ++ hns_nic_common_poll, NAPI_POLL_WEIGHT); + rd->ring->irq_init_flag = RCB_IRQ_NOT_INITED; + } + for (i = h->q_num; i < h->q_num * 2; i++) { +@@ -1420,7 +1417,7 @@ static int hns_nic_init_ring_data(struct hns_nic_priv *priv) + rd->fini_process = hns_nic_rx_fini_pro; + + netif_napi_add(priv->netdev, &rd->napi, +- hns_nic_common_poll, NIC_RX_CLEAN_MAX_NUM); ++ hns_nic_common_poll, NAPI_POLL_WEIGHT); + rd->ring->irq_init_flag = RCB_IRQ_NOT_INITED; + } + +diff --git a/drivers/net/ethernet/ibm/ehea/ehea_main.c b/drivers/net/ethernet/ibm/ehea/ehea_main.c +index 2a0dc127df3f..1a56de06b014 100644 +--- a/drivers/net/ethernet/ibm/ehea/ehea_main.c ++++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c +@@ -3183,6 +3183,7 @@ static ssize_t ehea_probe_port(struct device *dev, + + if (ehea_add_adapter_mr(adapter)) { + pr_err("creating MR failed\n"); ++ of_node_put(eth_dn); + return -EIO; + } + +diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h +index b1915043bc0c..7b9fb71137da 100644 +--- a/drivers/net/ethernet/intel/igb/e1000_defines.h ++++ b/drivers/net/ethernet/intel/igb/e1000_defines.h +@@ -193,6 +193,8 @@ + /* enable link status from external LINK_0 and LINK_1 pins */ + #define E1000_CTRL_SWDPIN0 0x00040000 /* SWDPIN 0 value */ + #define E1000_CTRL_SWDPIN1 0x00080000 /* SWDPIN 1 value */ ++#define E1000_CTRL_ADVD3WUC 0x00100000 /* D3 WUC */ ++#define E1000_CTRL_EN_PHY_PWR_MGMT 0x00200000 /* PHY PM enable */ + #define E1000_CTRL_SDP0_DIR 0x00400000 /* SDP0 Data direction */ + #define E1000_CTRL_SDP1_DIR 0x00800000 /* SDP1 Data direction */ + #define E1000_CTRL_RST 0x04000000 /* Global reset */ +diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c +index c1796aa2dde5..70ed5e5c3514 100644 +--- a/drivers/net/ethernet/intel/igb/igb_main.c ++++ b/drivers/net/ethernet/intel/igb/igb_main.c +@@ -7325,9 +7325,7 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake, + struct e1000_hw *hw = &adapter->hw; + u32 ctrl, rctl, status; + u32 wufc = runtime ? 
E1000_WUFC_LNKC : adapter->wol; +-#ifdef CONFIG_PM +- int retval = 0; +-#endif ++ bool wake; + + rtnl_lock(); + netif_device_detach(netdev); +@@ -7338,14 +7336,6 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake, + igb_clear_interrupt_scheme(adapter); + rtnl_unlock(); + +-#ifdef CONFIG_PM +- if (!runtime) { +- retval = pci_save_state(pdev); +- if (retval) +- return retval; +- } +-#endif +- + status = rd32(E1000_STATUS); + if (status & E1000_STATUS_LU) + wufc &= ~E1000_WUFC_LNKC; +@@ -7362,10 +7352,6 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake, + } + + ctrl = rd32(E1000_CTRL); +- /* advertise wake from D3Cold */ +- #define E1000_CTRL_ADVD3WUC 0x00100000 +- /* phy power management enable */ +- #define E1000_CTRL_EN_PHY_PWR_MGMT 0x00200000 + ctrl |= E1000_CTRL_ADVD3WUC; + wr32(E1000_CTRL, ctrl); + +@@ -7379,12 +7365,15 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake, + wr32(E1000_WUFC, 0); + } + +- *enable_wake = wufc || adapter->en_mng_pt; +- if (!*enable_wake) ++ wake = wufc || adapter->en_mng_pt; ++ if (!wake) + igb_power_down_link(adapter); + else + igb_power_up_link(adapter); + ++ if (enable_wake) ++ *enable_wake = wake; ++ + /* Release control of h/w to f/w. If f/w is AMT enabled, this + * would have already happened in close and is redundant. + */ +@@ -7399,22 +7388,7 @@ static int __igb_shutdown(struct pci_dev *pdev, bool *enable_wake, + #ifdef CONFIG_PM_SLEEP + static int igb_suspend(struct device *dev) + { +- int retval; +- bool wake; +- struct pci_dev *pdev = to_pci_dev(dev); +- +- retval = __igb_shutdown(pdev, &wake, 0); +- if (retval) +- return retval; +- +- if (wake) { +- pci_prepare_to_sleep(pdev); +- } else { +- pci_wake_from_d3(pdev, false); +- pci_set_power_state(pdev, PCI_D3hot); +- } +- +- return 0; ++ return __igb_shutdown(to_pci_dev(dev), NULL, 0); + } + #endif /* CONFIG_PM_SLEEP */ + +@@ -7483,22 +7457,7 @@ static int igb_runtime_idle(struct device *dev) + + static int igb_runtime_suspend(struct device *dev) + { +- struct pci_dev *pdev = to_pci_dev(dev); +- int retval; +- bool wake; +- +- retval = __igb_shutdown(pdev, &wake, 1); +- if (retval) +- return retval; +- +- if (wake) { +- pci_prepare_to_sleep(pdev); +- } else { +- pci_wake_from_d3(pdev, false); +- pci_set_power_state(pdev, PCI_D3hot); +- } +- +- return 0; ++ return __igb_shutdown(to_pci_dev(dev), NULL, 1); + } + + static int igb_runtime_resume(struct device *dev) +diff --git a/drivers/net/ethernet/micrel/ks8851.c b/drivers/net/ethernet/micrel/ks8851.c +index 1edc973df4c4..7377dca6eb57 100644 +--- a/drivers/net/ethernet/micrel/ks8851.c ++++ b/drivers/net/ethernet/micrel/ks8851.c +@@ -547,9 +547,8 @@ static void ks8851_rx_pkts(struct ks8851_net *ks) + /* set dma read address */ + ks8851_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI | 0x00); + +- /* start the packet dma process, and set auto-dequeue rx */ +- ks8851_wrreg16(ks, KS_RXQCR, +- ks->rc_rxqcr | RXQCR_SDA | RXQCR_ADRFE); ++ /* start DMA access */ ++ ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_SDA); + + if (rxlen > 4) { + unsigned int rxalign; +@@ -580,7 +579,8 @@ static void ks8851_rx_pkts(struct ks8851_net *ks) + } + } + +- ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr); ++ /* end DMA access and dequeue packet */ ++ ks8851_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_RRXEF); + } + } + +@@ -797,6 +797,15 @@ static void ks8851_tx_work(struct work_struct *work) + static int ks8851_net_open(struct net_device *dev) + { + struct ks8851_net *ks = netdev_priv(dev); ++ int ret; ++ ++ ret = 
request_threaded_irq(dev->irq, NULL, ks8851_irq, ++ IRQF_TRIGGER_LOW | IRQF_ONESHOT, ++ dev->name, ks); ++ if (ret < 0) { ++ netdev_err(dev, "failed to get irq\n"); ++ return ret; ++ } + + /* lock the card, even if we may not actually be doing anything + * else at the moment */ +@@ -861,6 +870,7 @@ static int ks8851_net_open(struct net_device *dev) + netif_dbg(ks, ifup, ks->netdev, "network device up\n"); + + mutex_unlock(&ks->lock); ++ mii_check_link(&ks->mii); + return 0; + } + +@@ -911,6 +921,8 @@ static int ks8851_net_stop(struct net_device *dev) + dev_kfree_skb(txb); + } + ++ free_irq(dev->irq, ks); ++ + return 0; + } + +@@ -1516,6 +1528,7 @@ static int ks8851_probe(struct spi_device *spi) + + spi_set_drvdata(spi, ks); + ++ netif_carrier_off(ks->netdev); + ndev->if_port = IF_PORT_100BASET; + ndev->netdev_ops = &ks8851_netdev_ops; + ndev->irq = spi->irq; +@@ -1542,14 +1555,6 @@ static int ks8851_probe(struct spi_device *spi) + ks8851_read_selftest(ks); + ks8851_init_mac(ks); + +- ret = request_threaded_irq(spi->irq, NULL, ks8851_irq, +- IRQF_TRIGGER_LOW | IRQF_ONESHOT, +- ndev->name, ks); +- if (ret < 0) { +- dev_err(&spi->dev, "failed to get irq\n"); +- goto err_irq; +- } +- + ret = register_netdev(ndev); + if (ret) { + dev_err(&spi->dev, "failed to register network device\n"); +@@ -1562,14 +1567,10 @@ static int ks8851_probe(struct spi_device *spi) + + return 0; + +- + err_netdev: +- free_irq(ndev->irq, ks); +- +-err_irq: ++err_id: + if (gpio_is_valid(gpio)) + gpio_set_value(gpio, 0); +-err_id: + regulator_disable(ks->vdd_reg); + err_reg: + regulator_disable(ks->vdd_io); +@@ -1587,7 +1588,6 @@ static int ks8851_remove(struct spi_device *spi) + dev_info(&spi->dev, "remove\n"); + + unregister_netdev(priv->netdev); +- free_irq(spi->irq, priv); + if (gpio_is_valid(priv->gpio)) + gpio_set_value(priv->gpio, 0); + regulator_disable(priv->vdd_reg); +diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c +index 0a2318cad34d..63ebc491057b 100644 +--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c ++++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c +@@ -1038,6 +1038,8 @@ int qlcnic_do_lb_test(struct qlcnic_adapter *adapter, u8 mode) + + for (i = 0; i < QLCNIC_NUM_ILB_PKT; i++) { + skb = netdev_alloc_skb(adapter->netdev, QLCNIC_ILB_PKT_SIZE); ++ if (!skb) ++ break; + qlcnic_create_loopback_buff(skb->data, adapter->mac_addr); + skb_put(skb, QLCNIC_ILB_PKT_SIZE); + adapter->ahw->diag_cnt = 0; +diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +index 059113dce6e0..f4d6512f066c 100644 +--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c ++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +@@ -1792,8 +1792,6 @@ static int stmmac_open(struct net_device *dev) + struct stmmac_priv *priv = netdev_priv(dev); + int ret; + +- stmmac_check_ether_addr(priv); +- + if (priv->pcs != STMMAC_PCS_RGMII && priv->pcs != STMMAC_PCS_TBI && + priv->pcs != STMMAC_PCS_RTBI) { + ret = stmmac_init_phy(dev); +@@ -2929,6 +2927,8 @@ int stmmac_dvr_probe(struct device *device, + if (ret) + goto error_hw_init; + ++ stmmac_check_ether_addr(priv); ++ + ndev->netdev_ops = &stmmac_netdev_ops; + + ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | +diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c +index 4e70e7586a09..a5732edc8437 100644 +--- a/drivers/net/ethernet/ti/netcp_ethss.c ++++ 
b/drivers/net/ethernet/ti/netcp_ethss.c +@@ -3122,12 +3122,16 @@ static int gbe_probe(struct netcp_device *netcp_device, struct device *dev, + + ret = netcp_txpipe_init(&gbe_dev->tx_pipe, netcp_device, + gbe_dev->dma_chan_name, gbe_dev->tx_queue_id); +- if (ret) ++ if (ret) { ++ of_node_put(interfaces); + return ret; ++ } + + ret = netcp_txpipe_open(&gbe_dev->tx_pipe); +- if (ret) ++ if (ret) { ++ of_node_put(interfaces); + return ret; ++ } + + /* Create network interfaces */ + INIT_LIST_HEAD(&gbe_dev->gbe_intf_head); +diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c +index 4684644703cc..58ba579793f8 100644 +--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c ++++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c +@@ -1595,12 +1595,14 @@ static int axienet_probe(struct platform_device *pdev) + ret = of_address_to_resource(np, 0, &dmares); + if (ret) { + dev_err(&pdev->dev, "unable to get DMA resource\n"); ++ of_node_put(np); + goto free_netdev; + } + lp->dma_regs = devm_ioremap_resource(&pdev->dev, &dmares); + if (IS_ERR(lp->dma_regs)) { + dev_err(&pdev->dev, "could not map DMA regs\n"); + ret = PTR_ERR(lp->dma_regs); ++ of_node_put(np); + goto free_netdev; + } + lp->rx_irq = irq_of_parse_and_map(np, 1); +diff --git a/drivers/net/slip/slhc.c b/drivers/net/slip/slhc.c +index cfd81eb1b532..ddceed3c5a4a 100644 +--- a/drivers/net/slip/slhc.c ++++ b/drivers/net/slip/slhc.c +@@ -153,7 +153,7 @@ out_fail: + void + slhc_free(struct slcompress *comp) + { +- if ( comp == NULLSLCOMPR ) ++ if ( IS_ERR_OR_NULL(comp) ) + return; + + if ( comp->tstate != NULLSLSTATE ) +diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c +index 267a90423154..7b3ef6dc45a4 100644 +--- a/drivers/net/team/team.c ++++ b/drivers/net/team/team.c +@@ -1136,6 +1136,12 @@ static int team_port_add(struct team *team, struct net_device *port_dev) + return -EINVAL; + } + ++ if (netdev_has_upper_dev(dev, port_dev)) { ++ netdev_err(dev, "Device %s is already an upper device of the team interface\n", ++ portname); ++ return -EBUSY; ++ } ++ + if (port_dev->features & NETIF_F_VLAN_CHALLENGED && + vlan_uses_dev(dev)) { + netdev_err(dev, "Device %s is VLAN challenged and team device has VLAN set up\n", +diff --git a/drivers/net/usb/ipheth.c b/drivers/net/usb/ipheth.c +index f1f8227e7342..01f95d192d25 100644 +--- a/drivers/net/usb/ipheth.c ++++ b/drivers/net/usb/ipheth.c +@@ -148,6 +148,7 @@ struct ipheth_device { + u8 bulk_in; + u8 bulk_out; + struct delayed_work carrier_work; ++ bool confirmed_pairing; + }; + + static int ipheth_rx_submit(struct ipheth_device *dev, gfp_t mem_flags); +@@ -259,7 +260,7 @@ static void ipheth_rcvbulk_callback(struct urb *urb) + + dev->net->stats.rx_packets++; + dev->net->stats.rx_bytes += len; +- ++ dev->confirmed_pairing = true; + netif_rx(skb); + ipheth_rx_submit(dev, GFP_ATOMIC); + } +@@ -280,14 +281,24 @@ static void ipheth_sndbulk_callback(struct urb *urb) + dev_err(&dev->intf->dev, "%s: urb status: %d\n", + __func__, status); + +- netif_wake_queue(dev->net); ++ if (status == 0) ++ netif_wake_queue(dev->net); ++ else ++ // on URB error, trigger immediate poll ++ schedule_delayed_work(&dev->carrier_work, 0); + } + + static int ipheth_carrier_set(struct ipheth_device *dev) + { +- struct usb_device *udev = dev->udev; ++ struct usb_device *udev; + int retval; + ++ if (!dev) ++ return 0; ++ if (!dev->confirmed_pairing) ++ return 0; ++ ++ udev = dev->udev; + retval = usb_control_msg(udev, + usb_rcvctrlpipe(udev, IPHETH_CTRL_ENDP), 
+ IPHETH_CMD_CARRIER_CHECK, /* request */ +@@ -302,11 +313,14 @@ static int ipheth_carrier_set(struct ipheth_device *dev) + return retval; + } + +- if (dev->ctrl_buf[0] == IPHETH_CARRIER_ON) ++ if (dev->ctrl_buf[0] == IPHETH_CARRIER_ON) { + netif_carrier_on(dev->net); +- else ++ if (dev->tx_urb->status != -EINPROGRESS) ++ netif_wake_queue(dev->net); ++ } else { + netif_carrier_off(dev->net); +- ++ netif_stop_queue(dev->net); ++ } + return 0; + } + +@@ -386,7 +400,6 @@ static int ipheth_open(struct net_device *net) + return retval; + + schedule_delayed_work(&dev->carrier_work, IPHETH_CARRIER_CHECK_TIMEOUT); +- netif_start_queue(net); + return retval; + } + +@@ -489,7 +502,7 @@ static int ipheth_probe(struct usb_interface *intf, + dev->udev = udev; + dev->net = netdev; + dev->intf = intf; +- ++ dev->confirmed_pairing = false; + /* Set up endpoints */ + hintf = usb_altnum_to_altsetting(intf, IPHETH_ALT_INTFNUM); + if (hintf == NULL) { +@@ -540,7 +553,9 @@ static int ipheth_probe(struct usb_interface *intf, + retval = -EIO; + goto err_register_netdev; + } +- ++ // carrier down and transmit queues stopped until packet from device ++ netif_carrier_off(netdev); ++ netif_tx_stop_all_queues(netdev); + dev_info(&intf->dev, "Apple iPhone USB Ethernet device attached\n"); + return 0; + +diff --git a/drivers/net/wireless/cw1200/scan.c b/drivers/net/wireless/cw1200/scan.c +index 9f1037e7e55c..2ce0193614f2 100644 +--- a/drivers/net/wireless/cw1200/scan.c ++++ b/drivers/net/wireless/cw1200/scan.c +@@ -84,8 +84,11 @@ int cw1200_hw_scan(struct ieee80211_hw *hw, + + frame.skb = ieee80211_probereq_get(hw, priv->vif->addr, NULL, 0, + req->ie_len); +- if (!frame.skb) ++ if (!frame.skb) { ++ mutex_unlock(&priv->conf_mutex); ++ up(&priv->scan.lock); + return -ENOMEM; ++ } + + if (req->ie_len) + memcpy(skb_put(frame.skb, req->ie_len), req->ie, req->ie_len); +diff --git a/drivers/nvdimm/btt_devs.c b/drivers/nvdimm/btt_devs.c +index cb477518dd0e..4c129450495d 100644 +--- a/drivers/nvdimm/btt_devs.c ++++ b/drivers/nvdimm/btt_devs.c +@@ -170,14 +170,15 @@ static struct device *__nd_btt_create(struct nd_region *nd_region, + return NULL; + + nd_btt->id = ida_simple_get(&nd_region->btt_ida, 0, 0, GFP_KERNEL); +- if (nd_btt->id < 0) { +- kfree(nd_btt); +- return NULL; +- } ++ if (nd_btt->id < 0) ++ goto out_nd_btt; + + nd_btt->lbasize = lbasize; +- if (uuid) ++ if (uuid) { + uuid = kmemdup(uuid, 16, GFP_KERNEL); ++ if (!uuid) ++ goto out_put_id; ++ } + nd_btt->uuid = uuid; + dev = &nd_btt->dev; + dev_set_name(dev, "btt%d.%d", nd_region->id, nd_btt->id); +@@ -192,6 +193,13 @@ static struct device *__nd_btt_create(struct nd_region *nd_region, + return NULL; + } + return dev; ++ ++out_put_id: ++ ida_simple_remove(&nd_region->btt_ida, nd_btt->id); ++ ++out_nd_btt: ++ kfree(nd_btt); ++ return NULL; + } + + struct device *nd_btt_create(struct nd_region *nd_region) +diff --git a/drivers/platform/x86/sony-laptop.c b/drivers/platform/x86/sony-laptop.c +index f73c29558cd3..c54ff94c491d 100644 +--- a/drivers/platform/x86/sony-laptop.c ++++ b/drivers/platform/x86/sony-laptop.c +@@ -4394,14 +4394,16 @@ sony_pic_read_possible_resource(struct acpi_resource *resource, void *context) + } + return AE_OK; + } ++ ++ case ACPI_RESOURCE_TYPE_END_TAG: ++ return AE_OK; ++ + default: + dprintk("Resource %d isn't an IRQ nor an IO port\n", + resource->type); ++ return AE_CTRL_TERMINATE; + +- case ACPI_RESOURCE_TYPE_END_TAG: +- return AE_OK; + } +- return AE_CTRL_TERMINATE; + } + + static int sony_pic_possible_resources(struct acpi_device *device) 
+diff --git a/drivers/rtc/rtc-da9063.c b/drivers/rtc/rtc-da9063.c +index d6c853bbfa9f..e93beecd5010 100644 +--- a/drivers/rtc/rtc-da9063.c ++++ b/drivers/rtc/rtc-da9063.c +@@ -491,6 +491,13 @@ static int da9063_rtc_probe(struct platform_device *pdev) + da9063_data_to_tm(data, &rtc->alarm_time, rtc); + rtc->rtc_sync = false; + ++ /* ++ * TODO: some models have alarms on a minute boundary but still support ++ * real hardware interrupts. Add this once the core supports it. ++ */ ++ if (config->rtc_data_start != RTC_SEC) ++ rtc->rtc_dev->uie_unsupported = 1; ++ + irq_alarm = platform_get_irq_byname(pdev, "ALARM"); + ret = devm_request_threaded_irq(&pdev->dev, irq_alarm, NULL, + da9063_alarm_event, +diff --git a/drivers/rtc/rtc-sh.c b/drivers/rtc/rtc-sh.c +index 2b81dd4baf17..104c854d6a8a 100644 +--- a/drivers/rtc/rtc-sh.c ++++ b/drivers/rtc/rtc-sh.c +@@ -455,7 +455,7 @@ static int sh_rtc_set_time(struct device *dev, struct rtc_time *tm) + static inline int sh_rtc_read_alarm_value(struct sh_rtc *rtc, int reg_off) + { + unsigned int byte; +- int value = 0xff; /* return 0xff for ignored values */ ++ int value = -1; /* return -1 for ignored values */ + + byte = readb(rtc->regbase + reg_off); + if (byte & AR_ENB) { +diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c +index 80a43074c2f9..c530610f61ac 100644 +--- a/drivers/s390/block/dasd_eckd.c ++++ b/drivers/s390/block/dasd_eckd.c +@@ -2066,14 +2066,14 @@ static int dasd_eckd_end_analysis(struct dasd_block *block) + blk_per_trk = recs_per_track(&private->rdc_data, 0, block->bp_block); + + raw: +- block->blocks = (private->real_cyl * ++ block->blocks = ((unsigned long) private->real_cyl * + private->rdc_data.trk_per_cyl * + blk_per_trk); + + dev_info(&device->cdev->dev, +- "DASD with %d KB/block, %d KB total size, %d KB/track, " ++ "DASD with %u KB/block, %lu KB total size, %u KB/track, " + "%s\n", (block->bp_block >> 10), +- ((private->real_cyl * ++ (((unsigned long) private->real_cyl * + private->rdc_data.trk_per_cyl * + blk_per_trk * (block->bp_block >> 9)) >> 1), + ((blk_per_trk * block->bp_block) >> 10), +diff --git a/drivers/s390/char/con3270.c b/drivers/s390/char/con3270.c +index bae98521c808..3e5a7912044f 100644 +--- a/drivers/s390/char/con3270.c ++++ b/drivers/s390/char/con3270.c +@@ -627,7 +627,7 @@ con3270_init(void) + (void (*)(unsigned long)) con3270_read_tasklet, + (unsigned long) condev->read); + +- raw3270_add_view(&condev->view, &con3270_fn, 1); ++ raw3270_add_view(&condev->view, &con3270_fn, 1, RAW3270_VIEW_LOCK_IRQ); + + INIT_LIST_HEAD(&condev->freemem); + for (i = 0; i < CON3270_STRING_PAGES; i++) { +diff --git a/drivers/s390/char/fs3270.c b/drivers/s390/char/fs3270.c +index 71e974738014..f0c86bcbe316 100644 +--- a/drivers/s390/char/fs3270.c ++++ b/drivers/s390/char/fs3270.c +@@ -463,7 +463,8 @@ fs3270_open(struct inode *inode, struct file *filp) + + init_waitqueue_head(&fp->wait); + fp->fs_pid = get_pid(task_pid(current)); +- rc = raw3270_add_view(&fp->view, &fs3270_fn, minor); ++ rc = raw3270_add_view(&fp->view, &fs3270_fn, minor, ++ RAW3270_VIEW_LOCK_BH); + if (rc) { + fs3270_free_view(&fp->view); + goto out; +diff --git a/drivers/s390/char/raw3270.c b/drivers/s390/char/raw3270.c +index 220acb4cbee5..9c350e6d75bf 100644 +--- a/drivers/s390/char/raw3270.c ++++ b/drivers/s390/char/raw3270.c +@@ -956,7 +956,7 @@ raw3270_deactivate_view(struct raw3270_view *view) + * Add view to device with minor "minor". 
+ */ + int +-raw3270_add_view(struct raw3270_view *view, struct raw3270_fn *fn, int minor) ++raw3270_add_view(struct raw3270_view *view, struct raw3270_fn *fn, int minor, int subclass) + { + unsigned long flags; + struct raw3270 *rp; +@@ -978,6 +978,7 @@ raw3270_add_view(struct raw3270_view *view, struct raw3270_fn *fn, int minor) + view->cols = rp->cols; + view->ascebc = rp->ascebc; + spin_lock_init(&view->lock); ++ lockdep_set_subclass(&view->lock, subclass); + list_add(&view->list, &rp->view_list); + rc = 0; + spin_unlock_irqrestore(get_ccwdev_lock(rp->cdev), flags); +diff --git a/drivers/s390/char/raw3270.h b/drivers/s390/char/raw3270.h +index e1e41c2861fb..5ae54317857a 100644 +--- a/drivers/s390/char/raw3270.h ++++ b/drivers/s390/char/raw3270.h +@@ -155,6 +155,8 @@ struct raw3270_fn { + struct raw3270_view { + struct list_head list; + spinlock_t lock; ++#define RAW3270_VIEW_LOCK_IRQ 0 ++#define RAW3270_VIEW_LOCK_BH 1 + atomic_t ref_count; + struct raw3270 *dev; + struct raw3270_fn *fn; +@@ -163,7 +165,7 @@ struct raw3270_view { + unsigned char *ascebc; /* ascii -> ebcdic table */ + }; + +-int raw3270_add_view(struct raw3270_view *, struct raw3270_fn *, int); ++int raw3270_add_view(struct raw3270_view *, struct raw3270_fn *, int, int); + int raw3270_activate_view(struct raw3270_view *); + void raw3270_del_view(struct raw3270_view *); + void raw3270_deactivate_view(struct raw3270_view *); +diff --git a/drivers/s390/char/tty3270.c b/drivers/s390/char/tty3270.c +index e96fc7fd9498..ab95d24b991b 100644 +--- a/drivers/s390/char/tty3270.c ++++ b/drivers/s390/char/tty3270.c +@@ -937,7 +937,8 @@ static int tty3270_install(struct tty_driver *driver, struct tty_struct *tty) + return PTR_ERR(tp); + + rc = raw3270_add_view(&tp->view, &tty3270_fn, +- tty->index + RAW3270_FIRSTMINOR); ++ tty->index + RAW3270_FIRSTMINOR, ++ RAW3270_VIEW_LOCK_BH); + if (rc) { + tty3270_free_view(tp); + return rc; +diff --git a/drivers/s390/net/ctcm_main.c b/drivers/s390/net/ctcm_main.c +index 05c37d6d4afe..a31821d94677 100644 +--- a/drivers/s390/net/ctcm_main.c ++++ b/drivers/s390/net/ctcm_main.c +@@ -1595,6 +1595,7 @@ static int ctcm_new_device(struct ccwgroup_device *cgdev) + if (priv->channel[direction] == NULL) { + if (direction == CTCM_WRITE) + channel_free(priv->channel[CTCM_READ]); ++ result = -ENODEV; + goto out_dev; + } + priv->channel[direction]->netdev = dev; +diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c +index 237688af179b..f7630cf581cd 100644 +--- a/drivers/s390/scsi/zfcp_fc.c ++++ b/drivers/s390/scsi/zfcp_fc.c +@@ -238,10 +238,6 @@ static void _zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req, u32 range, + list_for_each_entry(port, &adapter->port_list, list) { + if ((port->d_id & range) == (ntoh24(page->rscn_fid) & range)) + zfcp_fc_test_link(port); +- if (!port->d_id) +- zfcp_erp_port_reopen(port, +- ZFCP_STATUS_COMMON_ERP_FAILED, +- "fcrscn1"); + } + read_unlock_irqrestore(&adapter->port_list_lock, flags); + } +@@ -249,6 +245,7 @@ static void _zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req, u32 range, + static void zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req) + { + struct fsf_status_read_buffer *status_buffer = (void *)fsf_req->data; ++ struct zfcp_adapter *adapter = fsf_req->adapter; + struct fc_els_rscn *head; + struct fc_els_rscn_page *page; + u16 i; +@@ -261,6 +258,22 @@ static void zfcp_fc_incoming_rscn(struct zfcp_fsf_req *fsf_req) + /* see FC-FS */ + no_entries = head->rscn_plen / sizeof(struct fc_els_rscn_page); + ++ if (no_entries > 1) { ++ /* handle 
failed ports */ ++ unsigned long flags; ++ struct zfcp_port *port; ++ ++ read_lock_irqsave(&adapter->port_list_lock, flags); ++ list_for_each_entry(port, &adapter->port_list, list) { ++ if (port->d_id) ++ continue; ++ zfcp_erp_port_reopen(port, ++ ZFCP_STATUS_COMMON_ERP_FAILED, ++ "fcrscn1"); ++ } ++ read_unlock_irqrestore(&adapter->port_list_lock, flags); ++ } ++ + for (i = 1; i < no_entries; i++) { + /* skip head and start with 1st element */ + page++; +diff --git a/drivers/scsi/csiostor/csio_scsi.c b/drivers/scsi/csiostor/csio_scsi.c +index c2a6f9f29427..ddbdaade654d 100644 +--- a/drivers/scsi/csiostor/csio_scsi.c ++++ b/drivers/scsi/csiostor/csio_scsi.c +@@ -1713,8 +1713,11 @@ csio_scsi_err_handler(struct csio_hw *hw, struct csio_ioreq *req) + } + + out: +- if (req->nsge > 0) ++ if (req->nsge > 0) { + scsi_dma_unmap(cmnd); ++ if (req->dcopy && (host_status == DID_OK)) ++ host_status = csio_scsi_copy_to_sgl(hw, req); ++ } + + cmnd->result = (((host_status) << 16) | scsi_status); + cmnd->scsi_done(cmnd); +diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c +index 7be581f7c35d..1a6f65db615e 100644 +--- a/drivers/scsi/libsas/sas_expander.c ++++ b/drivers/scsi/libsas/sas_expander.c +@@ -47,17 +47,16 @@ static void smp_task_timedout(unsigned long _task) + unsigned long flags; + + spin_lock_irqsave(&task->task_state_lock, flags); +- if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) ++ if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) { + task->task_state_flags |= SAS_TASK_STATE_ABORTED; ++ complete(&task->slow_task->completion); ++ } + spin_unlock_irqrestore(&task->task_state_lock, flags); +- +- complete(&task->slow_task->completion); + } + + static void smp_task_done(struct sas_task *task) + { +- if (!del_timer(&task->slow_task->timer)) +- return; ++ del_timer(&task->slow_task->timer); + complete(&task->slow_task->completion); + } + +diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c +index ac12ee844bfc..31c29a5d1f38 100644 +--- a/drivers/scsi/qla2xxx/qla_attr.c ++++ b/drivers/scsi/qla2xxx/qla_attr.c +@@ -431,7 +431,7 @@ qla2x00_sysfs_write_optrom_ctl(struct file *filp, struct kobject *kobj, + } + + ha->optrom_region_start = start; +- ha->optrom_region_size = start + size; ++ ha->optrom_region_size = size; + + ha->optrom_state = QLA_SREADING; + ha->optrom_buffer = vmalloc(ha->optrom_region_size); +@@ -504,7 +504,7 @@ qla2x00_sysfs_write_optrom_ctl(struct file *filp, struct kobject *kobj, + } + + ha->optrom_region_start = start; +- ha->optrom_region_size = start + size; ++ ha->optrom_region_size = size; + + ha->optrom_state = QLA_SWRITING; + ha->optrom_buffer = vmalloc(ha->optrom_region_size); +diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c +index f9f899ec9427..c158967b59d7 100644 +--- a/drivers/scsi/qla4xxx/ql4_os.c ++++ b/drivers/scsi/qla4xxx/ql4_os.c +@@ -3207,6 +3207,8 @@ static int qla4xxx_conn_bind(struct iscsi_cls_session *cls_session, + if (iscsi_conn_bind(cls_session, cls_conn, is_leading)) + return -EINVAL; + ep = iscsi_lookup_endpoint(transport_fd); ++ if (!ep) ++ return -EINVAL; + conn = cls_conn->dd_data; + qla_conn = conn->dd_data; + qla_conn->qla_ep = ep->dd_data; +diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c +index 44b7a69d022a..45cd4cf93af3 100644 +--- a/drivers/scsi/storvsc_drv.c ++++ b/drivers/scsi/storvsc_drv.c +@@ -613,13 +613,22 @@ static void handle_sc_creation(struct vmbus_channel *new_sc) + static void handle_multichannel_storage(struct hv_device 
*device, int max_chns) + { + struct storvsc_device *stor_device; +- int num_cpus = num_online_cpus(); + int num_sc; + struct storvsc_cmd_request *request; + struct vstor_packet *vstor_packet; + int ret, t; + +- num_sc = ((max_chns > num_cpus) ? num_cpus : max_chns); ++ /* ++ * If the number of CPUs is artificially restricted, such as ++ * with maxcpus=1 on the kernel boot line, Hyper-V could offer ++ * sub-channels >= the number of CPUs. These sub-channels ++ * should not be created. The primary channel is already created ++ * and assigned to one CPU, so check against # CPUs - 1. ++ */ ++ num_sc = min((int)(num_online_cpus() - 1), max_chns); ++ if (!num_sc) ++ return; ++ + stor_device = get_out_stor_device(device); + if (!stor_device) + return; +diff --git a/drivers/staging/iio/addac/adt7316.c b/drivers/staging/iio/addac/adt7316.c +index 3adc4516918c..8c5cfb9400d0 100644 +--- a/drivers/staging/iio/addac/adt7316.c ++++ b/drivers/staging/iio/addac/adt7316.c +@@ -47,6 +47,8 @@ + #define ADT7516_MSB_AIN3 0xA + #define ADT7516_MSB_AIN4 0xB + #define ADT7316_DA_DATA_BASE 0x10 ++#define ADT7316_DA_10_BIT_LSB_SHIFT 6 ++#define ADT7316_DA_12_BIT_LSB_SHIFT 4 + #define ADT7316_DA_MSB_DATA_REGS 4 + #define ADT7316_LSB_DAC_A 0x10 + #define ADT7316_MSB_DAC_A 0x11 +@@ -1092,7 +1094,7 @@ static ssize_t adt7316_store_DAC_internal_Vref(struct device *dev, + ldac_config = chip->ldac_config & (~ADT7516_DAC_IN_VREF_MASK); + if (data & 0x1) + ldac_config |= ADT7516_DAC_AB_IN_VREF; +- else if (data & 0x2) ++ if (data & 0x2) + ldac_config |= ADT7516_DAC_CD_IN_VREF; + } else { + ret = kstrtou8(buf, 16, &data); +@@ -1414,7 +1416,7 @@ static IIO_DEVICE_ATTR(ex_analog_temp_offset, S_IRUGO | S_IWUSR, + static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip, + int channel, char *buf) + { +- u16 data; ++ u16 data = 0; + u8 msb, lsb, offset; + int ret; + +@@ -1439,7 +1441,11 @@ static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip, + if (ret) + return -EIO; + +- data = (msb << offset) + (lsb & ((1 << offset) - 1)); ++ if (chip->dac_bits == 12) ++ data = lsb >> ADT7316_DA_12_BIT_LSB_SHIFT; ++ else if (chip->dac_bits == 10) ++ data = lsb >> ADT7316_DA_10_BIT_LSB_SHIFT; ++ data |= msb << offset; + + return sprintf(buf, "%d\n", data); + } +@@ -1447,7 +1453,7 @@ static ssize_t adt7316_show_DAC(struct adt7316_chip_info *chip, + static ssize_t adt7316_store_DAC(struct adt7316_chip_info *chip, + int channel, const char *buf, size_t len) + { +- u8 msb, lsb, offset; ++ u8 msb, lsb, lsb_reg, offset; + u16 data; + int ret; + +@@ -1465,9 +1471,13 @@ static ssize_t adt7316_store_DAC(struct adt7316_chip_info *chip, + return -EINVAL; + + if (chip->dac_bits > 8) { +- lsb = data & (1 << offset); ++ lsb = data & ((1 << offset) - 1); ++ if (chip->dac_bits == 12) ++ lsb_reg = lsb << ADT7316_DA_12_BIT_LSB_SHIFT; ++ else ++ lsb_reg = lsb << ADT7316_DA_10_BIT_LSB_SHIFT; + ret = chip->bus.write(chip->bus.client, +- ADT7316_DA_DATA_BASE + channel * 2, lsb); ++ ADT7316_DA_DATA_BASE + channel * 2, lsb_reg); + if (ret) + return -EIO; + } +diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c +index 17a22073d226..032f3c13b8c4 100644 +--- a/drivers/tty/serial/sc16is7xx.c ++++ b/drivers/tty/serial/sc16is7xx.c +@@ -1448,7 +1448,7 @@ static int __init sc16is7xx_init(void) + ret = i2c_add_driver(&sc16is7xx_i2c_uart_driver); + if (ret < 0) { + pr_err("failed to init sc16is7xx i2c --> %d\n", ret); +- return ret; ++ goto err_i2c; + } + #endif + +@@ -1456,10 +1456,18 @@ static int __init sc16is7xx_init(void) + ret = 
spi_register_driver(&sc16is7xx_spi_uart_driver); + if (ret < 0) { + pr_err("failed to init sc16is7xx spi --> %d\n", ret); +- return ret; ++ goto err_spi; + } + #endif + return ret; ++ ++err_spi: ++#ifdef CONFIG_SERIAL_SC16IS7XX_I2C ++ i2c_del_driver(&sc16is7xx_i2c_uart_driver); ++#endif ++err_i2c: ++ uart_unregister_driver(&sc16is7xx_uart); ++ return ret; + } + module_init(sc16is7xx_init); + +diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c +index e9d6cf146fcc..654199c6a36c 100644 +--- a/drivers/usb/core/driver.c ++++ b/drivers/usb/core/driver.c +@@ -470,11 +470,6 @@ static int usb_unbind_interface(struct device *dev) + pm_runtime_disable(dev); + pm_runtime_set_suspended(dev); + +- /* Undo any residual pm_autopm_get_interface_* calls */ +- for (r = atomic_read(&intf->pm_usage_cnt); r > 0; --r) +- usb_autopm_put_interface_no_suspend(intf); +- atomic_set(&intf->pm_usage_cnt, 0); +- + if (!error) + usb_autosuspend_device(udev); + +@@ -1625,7 +1620,6 @@ void usb_autopm_put_interface(struct usb_interface *intf) + int status; + + usb_mark_last_busy(udev); +- atomic_dec(&intf->pm_usage_cnt); + status = pm_runtime_put_sync(&intf->dev); + dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n", + __func__, atomic_read(&intf->dev.power.usage_count), +@@ -1654,7 +1648,6 @@ void usb_autopm_put_interface_async(struct usb_interface *intf) + int status; + + usb_mark_last_busy(udev); +- atomic_dec(&intf->pm_usage_cnt); + status = pm_runtime_put(&intf->dev); + dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n", + __func__, atomic_read(&intf->dev.power.usage_count), +@@ -1676,7 +1669,6 @@ void usb_autopm_put_interface_no_suspend(struct usb_interface *intf) + struct usb_device *udev = interface_to_usbdev(intf); + + usb_mark_last_busy(udev); +- atomic_dec(&intf->pm_usage_cnt); + pm_runtime_put_noidle(&intf->dev); + } + EXPORT_SYMBOL_GPL(usb_autopm_put_interface_no_suspend); +@@ -1707,8 +1699,6 @@ int usb_autopm_get_interface(struct usb_interface *intf) + status = pm_runtime_get_sync(&intf->dev); + if (status < 0) + pm_runtime_put_sync(&intf->dev); +- else +- atomic_inc(&intf->pm_usage_cnt); + dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n", + __func__, atomic_read(&intf->dev.power.usage_count), + status); +@@ -1742,8 +1732,6 @@ int usb_autopm_get_interface_async(struct usb_interface *intf) + status = pm_runtime_get(&intf->dev); + if (status < 0 && status != -EINPROGRESS) + pm_runtime_put_noidle(&intf->dev); +- else +- atomic_inc(&intf->pm_usage_cnt); + dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n", + __func__, atomic_read(&intf->dev.power.usage_count), + status); +@@ -1767,7 +1755,6 @@ void usb_autopm_get_interface_no_resume(struct usb_interface *intf) + struct usb_device *udev = interface_to_usbdev(intf); + + usb_mark_last_busy(udev); +- atomic_inc(&intf->pm_usage_cnt); + pm_runtime_get_noresume(&intf->dev); + } + EXPORT_SYMBOL_GPL(usb_autopm_get_interface_no_resume); +@@ -1888,14 +1875,11 @@ int usb_runtime_idle(struct device *dev) + return -EBUSY; + } + +-int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable) ++static int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable) + { + struct usb_hcd *hcd = bus_to_hcd(udev->bus); + int ret = -EPERM; + +- if (enable && !udev->usb2_hw_lpm_allowed) +- return 0; +- + if (hcd->driver->set_usb2_hw_lpm) { + ret = hcd->driver->set_usb2_hw_lpm(hcd, udev, enable); + if (!ret) +@@ -1905,6 +1889,24 @@ int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable) + return ret; + } + ++int usb_enable_usb2_hardware_lpm(struct usb_device *udev) ++{ ++ if 
(!udev->usb2_hw_lpm_capable || ++ !udev->usb2_hw_lpm_allowed || ++ udev->usb2_hw_lpm_enabled) ++ return 0; ++ ++ return usb_set_usb2_hardware_lpm(udev, 1); ++} ++ ++int usb_disable_usb2_hardware_lpm(struct usb_device *udev) ++{ ++ if (!udev->usb2_hw_lpm_enabled) ++ return 0; ++ ++ return usb_set_usb2_hardware_lpm(udev, 0); ++} ++ + #endif /* CONFIG_PM */ + + struct bus_type usb_bus_type = { +diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c +index 3a6978458d95..7c87c0b38bcf 100644 +--- a/drivers/usb/core/hub.c ++++ b/drivers/usb/core/hub.c +@@ -3116,8 +3116,7 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg) + } + + /* disable USB2 hardware LPM */ +- if (udev->usb2_hw_lpm_enabled == 1) +- usb_set_usb2_hardware_lpm(udev, 0); ++ usb_disable_usb2_hardware_lpm(udev); + + if (usb_disable_ltm(udev)) { + dev_err(&udev->dev, "Failed to disable LTM before suspend\n."); +@@ -3163,8 +3162,7 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg) + usb_enable_ltm(udev); + err_ltm: + /* Try to enable USB2 hardware LPM again */ +- if (udev->usb2_hw_lpm_capable == 1) +- usb_set_usb2_hardware_lpm(udev, 1); ++ usb_enable_usb2_hardware_lpm(udev); + + if (udev->do_remote_wakeup) + (void) usb_disable_remote_wakeup(udev); +@@ -3443,8 +3441,7 @@ int usb_port_resume(struct usb_device *udev, pm_message_t msg) + hub_port_logical_disconnect(hub, port1); + } else { + /* Try to enable USB2 hardware LPM */ +- if (udev->usb2_hw_lpm_capable == 1) +- usb_set_usb2_hardware_lpm(udev, 1); ++ usb_enable_usb2_hardware_lpm(udev); + + /* Try to enable USB3 LTM and LPM */ + usb_enable_ltm(udev); +@@ -4270,7 +4267,7 @@ static void hub_set_initial_usb2_lpm_policy(struct usb_device *udev) + if ((udev->bos->ext_cap->bmAttributes & cpu_to_le32(USB_BESL_SUPPORT)) || + connect_type == USB_PORT_CONNECT_TYPE_HARD_WIRED) { + udev->usb2_hw_lpm_allowed = 1; +- usb_set_usb2_hardware_lpm(udev, 1); ++ usb_enable_usb2_hardware_lpm(udev); + } + } + +@@ -5415,8 +5412,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev) + /* Disable USB2 hardware LPM. + * It will be re-enabled by the enumeration process. + */ +- if (udev->usb2_hw_lpm_enabled == 1) +- usb_set_usb2_hardware_lpm(udev, 0); ++ usb_disable_usb2_hardware_lpm(udev); + + /* Disable LPM and LTM while we reset the device and reinstall the alt + * settings. Device-initiated LPM settings, and system exit latency +@@ -5526,7 +5522,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev) + + done: + /* Now that the alt settings are re-installed, enable LTM and LPM. 
*/ +- usb_set_usb2_hardware_lpm(udev, 1); ++ usb_enable_usb2_hardware_lpm(udev); + usb_unlocked_enable_lpm(udev); + usb_enable_ltm(udev); + usb_release_bos_descriptor(udev); +diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c +index 08cba309eb78..adc696a76b20 100644 +--- a/drivers/usb/core/message.c ++++ b/drivers/usb/core/message.c +@@ -820,9 +820,11 @@ int usb_string(struct usb_device *dev, int index, char *buf, size_t size) + + if (dev->state == USB_STATE_SUSPENDED) + return -EHOSTUNREACH; +- if (size <= 0 || !buf || !index) ++ if (size <= 0 || !buf) + return -EINVAL; + buf[0] = 0; ++ if (index <= 0 || index >= 256) ++ return -EINVAL; + tbuf = kmalloc(256, GFP_NOIO); + if (!tbuf) + return -ENOMEM; +@@ -1184,8 +1186,7 @@ void usb_disable_device(struct usb_device *dev, int skip_ep0) + dev->actconfig->interface[i] = NULL; + } + +- if (dev->usb2_hw_lpm_enabled == 1) +- usb_set_usb2_hardware_lpm(dev, 0); ++ usb_disable_usb2_hardware_lpm(dev); + usb_unlocked_disable_lpm(dev); + usb_disable_ltm(dev); + +diff --git a/drivers/usb/core/sysfs.c b/drivers/usb/core/sysfs.c +index 65b6e6b84043..6dc0f4e25cf3 100644 +--- a/drivers/usb/core/sysfs.c ++++ b/drivers/usb/core/sysfs.c +@@ -472,7 +472,10 @@ static ssize_t usb2_hardware_lpm_store(struct device *dev, + + if (!ret) { + udev->usb2_hw_lpm_allowed = value; +- ret = usb_set_usb2_hardware_lpm(udev, value); ++ if (value) ++ ret = usb_enable_usb2_hardware_lpm(udev); ++ else ++ ret = usb_disable_usb2_hardware_lpm(udev); + } + + usb_unlock_device(udev); +diff --git a/drivers/usb/core/usb.h b/drivers/usb/core/usb.h +index 53318126ed91..6b2f11544283 100644 +--- a/drivers/usb/core/usb.h ++++ b/drivers/usb/core/usb.h +@@ -84,7 +84,8 @@ extern int usb_remote_wakeup(struct usb_device *dev); + extern int usb_runtime_suspend(struct device *dev); + extern int usb_runtime_resume(struct device *dev); + extern int usb_runtime_idle(struct device *dev); +-extern int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable); ++extern int usb_enable_usb2_hardware_lpm(struct usb_device *udev); ++extern int usb_disable_usb2_hardware_lpm(struct usb_device *udev); + + #else + +@@ -104,7 +105,12 @@ static inline int usb_autoresume_device(struct usb_device *udev) + return 0; + } + +-static inline int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable) ++static inline int usb_enable_usb2_hardware_lpm(struct usb_device *udev) ++{ ++ return 0; ++} ++ ++static inline int usb_disable_usb2_hardware_lpm(struct usb_device *udev) + { + return 0; + } +diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c +index 22b4797383cd..4378e758baef 100644 +--- a/drivers/usb/dwc3/core.c ++++ b/drivers/usb/dwc3/core.c +@@ -867,7 +867,7 @@ static int dwc3_probe(struct platform_device *pdev) + dwc->regs_size = resource_size(res); + + /* default to highest possible threshold */ +- lpm_nyet_threshold = 0xff; ++ lpm_nyet_threshold = 0xf; + + /* default to -3.5dB de-emphasis */ + tx_de_emphasis = 1; +diff --git a/drivers/usb/gadget/udc/net2272.c b/drivers/usb/gadget/udc/net2272.c +index 3b6e34fc032b..553922c3be85 100644 +--- a/drivers/usb/gadget/udc/net2272.c ++++ b/drivers/usb/gadget/udc/net2272.c +@@ -962,6 +962,7 @@ net2272_dequeue(struct usb_ep *_ep, struct usb_request *_req) + break; + } + if (&req->req != _req) { ++ ep->stopped = stopped; + spin_unlock_irqrestore(&ep->dev->lock, flags); + return -EINVAL; + } +diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c +index 8efeadf30b4d..3a8d056a5d16 100644 +--- 
a/drivers/usb/gadget/udc/net2280.c ++++ b/drivers/usb/gadget/udc/net2280.c +@@ -870,9 +870,6 @@ static void start_queue(struct net2280_ep *ep, u32 dmactl, u32 td_dma) + (void) readl(&ep->dev->pci->pcimstctl); + + writel(BIT(DMA_START), &dma->dmastat); +- +- if (!ep->is_in) +- stop_out_naking(ep); + } + + static void start_dma(struct net2280_ep *ep, struct net2280_request *req) +@@ -911,6 +908,7 @@ static void start_dma(struct net2280_ep *ep, struct net2280_request *req) + writel(BIT(DMA_START), &dma->dmastat); + return; + } ++ stop_out_naking(ep); + } + + tmp = dmactl_default; +@@ -1272,9 +1270,9 @@ static int net2280_dequeue(struct usb_ep *_ep, struct usb_request *_req) + break; + } + if (&req->req != _req) { ++ ep->stopped = stopped; + spin_unlock_irqrestore(&ep->dev->lock, flags); +- dev_err(&ep->dev->pdev->dev, "%s: Request mismatch\n", +- __func__); ++ ep_dbg(ep->dev, "%s: Request mismatch\n", __func__); + return -EINVAL; + } + +diff --git a/drivers/usb/host/u132-hcd.c b/drivers/usb/host/u132-hcd.c +index d5434e7a3b2e..86f9944f337d 100644 +--- a/drivers/usb/host/u132-hcd.c ++++ b/drivers/usb/host/u132-hcd.c +@@ -3214,6 +3214,9 @@ static int __init u132_hcd_init(void) + printk(KERN_INFO "driver %s\n", hcd_name); + workqueue = create_singlethread_workqueue("u132"); + retval = platform_driver_register(&u132_platform_driver); ++ if (retval) ++ destroy_workqueue(workqueue); ++ + return retval; + } + +diff --git a/drivers/usb/misc/yurex.c b/drivers/usb/misc/yurex.c +index 5594a4a4a83f..a8b6d0036e5d 100644 +--- a/drivers/usb/misc/yurex.c ++++ b/drivers/usb/misc/yurex.c +@@ -332,6 +332,7 @@ static void yurex_disconnect(struct usb_interface *interface) + usb_deregister_dev(interface, &yurex_class); + + /* prevent more I/O from starting */ ++ usb_poison_urb(dev->urb); + mutex_lock(&dev->io_mutex); + dev->interface = NULL; + mutex_unlock(&dev->io_mutex); +diff --git a/drivers/usb/serial/generic.c b/drivers/usb/serial/generic.c +index 54e170dd3dad..faead4f32b1c 100644 +--- a/drivers/usb/serial/generic.c ++++ b/drivers/usb/serial/generic.c +@@ -350,39 +350,59 @@ void usb_serial_generic_read_bulk_callback(struct urb *urb) + struct usb_serial_port *port = urb->context; + unsigned char *data = urb->transfer_buffer; + unsigned long flags; ++ bool stopped = false; ++ int status = urb->status; + int i; + + for (i = 0; i < ARRAY_SIZE(port->read_urbs); ++i) { + if (urb == port->read_urbs[i]) + break; + } +- set_bit(i, &port->read_urbs_free); + + dev_dbg(&port->dev, "%s - urb %d, len %d\n", __func__, i, + urb->actual_length); +- switch (urb->status) { ++ switch (status) { + case 0: ++ usb_serial_debug_data(&port->dev, __func__, urb->actual_length, ++ data); ++ port->serial->type->process_read_urb(urb); + break; + case -ENOENT: + case -ECONNRESET: + case -ESHUTDOWN: + dev_dbg(&port->dev, "%s - urb stopped: %d\n", +- __func__, urb->status); +- return; ++ __func__, status); ++ stopped = true; ++ break; + case -EPIPE: + dev_err(&port->dev, "%s - urb stopped: %d\n", +- __func__, urb->status); +- return; ++ __func__, status); ++ stopped = true; ++ break; + default: + dev_dbg(&port->dev, "%s - nonzero urb status: %d\n", +- __func__, urb->status); +- goto resubmit; ++ __func__, status); ++ break; + } + +- usb_serial_debug_data(&port->dev, __func__, urb->actual_length, data); +- port->serial->type->process_read_urb(urb); ++ /* ++ * Make sure URB processing is done before marking as free to avoid ++ * racing with unthrottle() on another CPU. 
Matches the barriers ++ * implied by the test_and_clear_bit() in ++ * usb_serial_generic_submit_read_urb(). ++ */ ++ smp_mb__before_atomic(); ++ set_bit(i, &port->read_urbs_free); ++ /* ++ * Make sure URB is marked as free before checking the throttled flag ++ * to avoid racing with unthrottle() on another CPU. Matches the ++ * smp_mb() in unthrottle(). ++ */ ++ smp_mb__after_atomic(); ++ ++ if (stopped) ++ return; + +-resubmit: + /* Throttle the device if requested by tty */ + spin_lock_irqsave(&port->lock, flags); + port->throttled = port->throttle_req; +@@ -399,6 +419,7 @@ void usb_serial_generic_write_bulk_callback(struct urb *urb) + { + unsigned long flags; + struct usb_serial_port *port = urb->context; ++ int status = urb->status; + int i; + + for (i = 0; i < ARRAY_SIZE(port->write_urbs); ++i) { +@@ -410,22 +431,22 @@ void usb_serial_generic_write_bulk_callback(struct urb *urb) + set_bit(i, &port->write_urbs_free); + spin_unlock_irqrestore(&port->lock, flags); + +- switch (urb->status) { ++ switch (status) { + case 0: + break; + case -ENOENT: + case -ECONNRESET: + case -ESHUTDOWN: + dev_dbg(&port->dev, "%s - urb stopped: %d\n", +- __func__, urb->status); ++ __func__, status); + return; + case -EPIPE: + dev_err_console(port, "%s - urb stopped: %d\n", +- __func__, urb->status); ++ __func__, status); + return; + default: + dev_err_console(port, "%s - nonzero urb status: %d\n", +- __func__, urb->status); ++ __func__, status); + goto resubmit; + } + +@@ -456,6 +477,12 @@ void usb_serial_generic_unthrottle(struct tty_struct *tty) + port->throttled = port->throttle_req = 0; + spin_unlock_irq(&port->lock); + ++ /* ++ * Matches the smp_mb__after_atomic() in ++ * usb_serial_generic_read_bulk_callback(). ++ */ ++ smp_mb(); ++ + if (was_throttled) + usb_serial_generic_submit_read_urbs(port, GFP_KERNEL); + } +diff --git a/drivers/usb/storage/realtek_cr.c b/drivers/usb/storage/realtek_cr.c +index 20433563a601..be432bec0c5b 100644 +--- a/drivers/usb/storage/realtek_cr.c ++++ b/drivers/usb/storage/realtek_cr.c +@@ -772,18 +772,16 @@ static void rts51x_suspend_timer_fn(unsigned long data) + break; + case RTS51X_STAT_IDLE: + case RTS51X_STAT_SS: +- usb_stor_dbg(us, "RTS51X_STAT_SS, intf->pm_usage_cnt:%d, power.usage:%d\n", +- atomic_read(&us->pusb_intf->pm_usage_cnt), ++ usb_stor_dbg(us, "RTS51X_STAT_SS, power.usage:%d\n", + atomic_read(&us->pusb_intf->dev.power.usage_count)); + +- if (atomic_read(&us->pusb_intf->pm_usage_cnt) > 0) { ++ if (atomic_read(&us->pusb_intf->dev.power.usage_count) > 0) { + usb_stor_dbg(us, "Ready to enter SS state\n"); + rts51x_set_stat(chip, RTS51X_STAT_SS); + /* ignore mass storage interface's children */ + pm_suspend_ignore_children(&us->pusb_intf->dev, true); + usb_autopm_put_interface_async(us->pusb_intf); +- usb_stor_dbg(us, "RTS51X_STAT_SS 01, intf->pm_usage_cnt:%d, power.usage:%d\n", +- atomic_read(&us->pusb_intf->pm_usage_cnt), ++ usb_stor_dbg(us, "RTS51X_STAT_SS 01, power.usage:%d\n", + atomic_read(&us->pusb_intf->dev.power.usage_count)); + } + break; +@@ -816,11 +814,10 @@ static void rts51x_invoke_transport(struct scsi_cmnd *srb, struct us_data *us) + int ret; + + if (working_scsi(srb)) { +- usb_stor_dbg(us, "working scsi, intf->pm_usage_cnt:%d, power.usage:%d\n", +- atomic_read(&us->pusb_intf->pm_usage_cnt), ++ usb_stor_dbg(us, "working scsi, power.usage:%d\n", + atomic_read(&us->pusb_intf->dev.power.usage_count)); + +- if (atomic_read(&us->pusb_intf->pm_usage_cnt) <= 0) { ++ if (atomic_read(&us->pusb_intf->dev.power.usage_count) <= 0) { + ret = 
usb_autopm_get_interface(us->pusb_intf); + usb_stor_dbg(us, "working scsi, ret=%d\n", ret); + } +diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c +index 6cac8f26b97a..e657b111b320 100644 +--- a/drivers/usb/storage/uas.c ++++ b/drivers/usb/storage/uas.c +@@ -772,23 +772,33 @@ static int uas_slave_alloc(struct scsi_device *sdev) + { + struct uas_dev_info *devinfo = + (struct uas_dev_info *)sdev->host->hostdata; ++ int maxp; + + sdev->hostdata = devinfo; + +- /* USB has unusual DMA-alignment requirements: Although the +- * starting address of each scatter-gather element doesn't matter, +- * the length of each element except the last must be divisible +- * by the Bulk maxpacket value. There's currently no way to +- * express this by block-layer constraints, so we'll cop out +- * and simply require addresses to be aligned at 512-byte +- * boundaries. This is okay since most block I/O involves +- * hardware sectors that are multiples of 512 bytes in length, +- * and since host controllers up through USB 2.0 have maxpacket +- * values no larger than 512. +- * +- * But it doesn't suffice for Wireless USB, where Bulk maxpacket +- * values can be as large as 2048. To make that work properly +- * will require changes to the block layer. ++ /* ++ * We have two requirements here. We must satisfy the requirements ++ * of the physical HC and the demands of the protocol, as we ++ * definitely want no additional memory allocation in this path ++ * ruling out using bounce buffers. ++ * ++ * For a transmission on USB to continue we must never send ++ * a package that is smaller than maxpacket. Hence the length of each ++ * scatterlist element except the last must be divisible by the ++ * Bulk maxpacket value. ++ * If the HC does not ensure that through SG, ++ * the upper layer must do that. We must assume nothing ++ * about the capabilities off the HC, so we use the most ++ * pessimistic requirement. ++ */ ++ ++ maxp = usb_maxpacket(devinfo->udev, devinfo->data_in_pipe, 0); ++ blk_queue_virt_boundary(sdev->request_queue, maxp - 1); ++ ++ /* ++ * The protocol has no requirements on alignment in the strict sense. ++ * Controllers may or may not have alignment restrictions. ++ * As this is not exported, we use an extremely conservative guess. 
+ */ + blk_queue_update_dma_alignment(sdev->request_queue, (512 - 1)); + +diff --git a/drivers/usb/usbip/stub_rx.c b/drivers/usb/usbip/stub_rx.c +index 56cacb68040c..808e3a317954 100644 +--- a/drivers/usb/usbip/stub_rx.c ++++ b/drivers/usb/usbip/stub_rx.c +@@ -380,22 +380,10 @@ static int get_pipe(struct stub_device *sdev, struct usbip_header *pdu) + } + + if (usb_endpoint_xfer_isoc(epd)) { +- /* validate packet size and number of packets */ +- unsigned int maxp, packets, bytes; +- +-#define USB_EP_MAXP_MULT_SHIFT 11 +-#define USB_EP_MAXP_MULT_MASK (3 << USB_EP_MAXP_MULT_SHIFT) +-#define USB_EP_MAXP_MULT(m) \ +- (((m) & USB_EP_MAXP_MULT_MASK) >> USB_EP_MAXP_MULT_SHIFT) +- +- maxp = usb_endpoint_maxp(epd); +- maxp *= (USB_EP_MAXP_MULT( +- __le16_to_cpu(epd->wMaxPacketSize)) + 1); +- bytes = pdu->u.cmd_submit.transfer_buffer_length; +- packets = DIV_ROUND_UP(bytes, maxp); +- ++ /* validate number of packets */ + if (pdu->u.cmd_submit.number_of_packets < 0 || +- pdu->u.cmd_submit.number_of_packets > packets) { ++ pdu->u.cmd_submit.number_of_packets > ++ USBIP_MAX_ISO_PACKETS) { + dev_err(&sdev->udev->dev, + "CMD_SUBMIT: isoc invalid num packets %d\n", + pdu->u.cmd_submit.number_of_packets); +diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h +index 0fc5ace57c0e..af903aa4ad90 100644 +--- a/drivers/usb/usbip/usbip_common.h ++++ b/drivers/usb/usbip/usbip_common.h +@@ -134,6 +134,13 @@ extern struct device_attribute dev_attr_usbip_debug; + #define USBIP_DIR_OUT 0x00 + #define USBIP_DIR_IN 0x01 + ++/* ++ * Arbitrary limit for the maximum number of isochronous packets in an URB, ++ * compare for example the uhci_submit_isochronous function in ++ * drivers/usb/host/uhci-q.c ++ */ ++#define USBIP_MAX_ISO_PACKETS 1024 ++ + /** + * struct usbip_header_basic - data pertinent to every request + * @command: the usbip request type +diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c +index b31b84f56e8f..47b229fa5e8e 100644 +--- a/drivers/vfio/pci/vfio_pci.c ++++ b/drivers/vfio/pci/vfio_pci.c +@@ -1191,11 +1191,11 @@ static void __init vfio_pci_fill_ids(void) + rc = pci_add_dynid(&vfio_pci_driver, vendor, device, + subvendor, subdevice, class, class_mask, 0); + if (rc) +- pr_warn("failed to add dynamic id [%04hx:%04hx[%04hx:%04hx]] class %#08x/%08x (%d)\n", ++ pr_warn("failed to add dynamic id [%04x:%04x[%04x:%04x]] class %#08x/%08x (%d)\n", + vendor, device, subvendor, subdevice, + class, class_mask, rc); + else +- pr_info("add [%04hx:%04hx[%04hx:%04hx]] class %#08x/%08x\n", ++ pr_info("add [%04x:%04x[%04x:%04x]] class %#08x/%08x\n", + vendor, device, subvendor, subdevice, + class, class_mask); + } +diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c +index 2fa280671c1e..875634d0d020 100644 +--- a/drivers/vfio/vfio_iommu_type1.c ++++ b/drivers/vfio/vfio_iommu_type1.c +@@ -53,10 +53,16 @@ module_param_named(disable_hugepages, + MODULE_PARM_DESC(disable_hugepages, + "Disable VFIO IOMMU support for IOMMU hugepages."); + ++static unsigned int dma_entry_limit __read_mostly = U16_MAX; ++module_param_named(dma_entry_limit, dma_entry_limit, uint, 0644); ++MODULE_PARM_DESC(dma_entry_limit, ++ "Maximum number of user DMA mappings per container (65535)."); ++ + struct vfio_iommu { + struct list_head domain_list; + struct mutex lock; + struct rb_root dma_list; ++ unsigned int dma_avail; + bool v2; + bool nesting; + }; +@@ -382,6 +388,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma) + vfio_unmap_unpin(iommu, 
dma); + vfio_unlink_dma(iommu, dma); + kfree(dma); ++ iommu->dma_avail++; + } + + static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu) +@@ -582,12 +589,18 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, + return -EEXIST; + } + ++ if (!iommu->dma_avail) { ++ mutex_unlock(&iommu->lock); ++ return -ENOSPC; ++ } ++ + dma = kzalloc(sizeof(*dma), GFP_KERNEL); + if (!dma) { + mutex_unlock(&iommu->lock); + return -ENOMEM; + } + ++ iommu->dma_avail--; + dma->iova = iova; + dma->vaddr = vaddr; + dma->prot = prot; +@@ -903,6 +916,7 @@ static void *vfio_iommu_type1_open(unsigned long arg) + + INIT_LIST_HEAD(&iommu->domain_list); + iommu->dma_list = RB_ROOT; ++ iommu->dma_avail = dma_entry_limit; + mutex_init(&iommu->lock); + + return iommu; +diff --git a/drivers/virt/fsl_hypervisor.c b/drivers/virt/fsl_hypervisor.c +index 590a0f51a249..9f96c7e61387 100644 +--- a/drivers/virt/fsl_hypervisor.c ++++ b/drivers/virt/fsl_hypervisor.c +@@ -215,6 +215,9 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p) + * hypervisor. + */ + lb_offset = param.local_vaddr & (PAGE_SIZE - 1); ++ if (param.count == 0 || ++ param.count > U64_MAX - lb_offset - PAGE_SIZE + 1) ++ return -EINVAL; + num_pages = (param.count + lb_offset + PAGE_SIZE - 1) >> PAGE_SHIFT; + + /* Allocate the buffers we need */ +@@ -335,8 +338,8 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set) + struct fsl_hv_ioctl_prop param; + char __user *upath, *upropname; + void __user *upropval; +- char *path = NULL, *propname = NULL; +- void *propval = NULL; ++ char *path, *propname; ++ void *propval; + int ret = 0; + + /* Get the parameters from the user. */ +@@ -348,32 +351,30 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set) + upropval = (void __user *)(uintptr_t)param.propval; + + path = strndup_user(upath, FH_DTPROP_MAX_PATHLEN); +- if (IS_ERR(path)) { +- ret = PTR_ERR(path); +- goto out; +- } ++ if (IS_ERR(path)) ++ return PTR_ERR(path); + + propname = strndup_user(upropname, FH_DTPROP_MAX_PATHLEN); + if (IS_ERR(propname)) { + ret = PTR_ERR(propname); +- goto out; ++ goto err_free_path; + } + + if (param.proplen > FH_DTPROP_MAX_PROPLEN) { + ret = -EINVAL; +- goto out; ++ goto err_free_propname; + } + + propval = kmalloc(param.proplen, GFP_KERNEL); + if (!propval) { + ret = -ENOMEM; +- goto out; ++ goto err_free_propname; + } + + if (set) { + if (copy_from_user(propval, upropval, param.proplen)) { + ret = -EFAULT; +- goto out; ++ goto err_free_propval; + } + + param.ret = fh_partition_set_dtprop(param.handle, +@@ -392,7 +393,7 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set) + if (copy_to_user(upropval, propval, param.proplen) || + put_user(param.proplen, &p->proplen)) { + ret = -EFAULT; +- goto out; ++ goto err_free_propval; + } + } + } +@@ -400,10 +401,12 @@ static long ioctl_dtprop(struct fsl_hv_ioctl_prop __user *p, int set) + if (put_user(param.ret, &p->ret)) + ret = -EFAULT; + +-out: +- kfree(path); ++err_free_propval: + kfree(propval); ++err_free_propname: + kfree(propname); ++err_free_path: ++ kfree(path); + + return ret; + } +diff --git a/drivers/w1/masters/ds2490.c b/drivers/w1/masters/ds2490.c +index 59d74d1b47a8..2287e1be0e55 100644 +--- a/drivers/w1/masters/ds2490.c ++++ b/drivers/w1/masters/ds2490.c +@@ -1039,15 +1039,15 @@ static int ds_probe(struct usb_interface *intf, + /* alternative 3, 1ms interrupt (greatly speeds search), 64 byte bulk */ + alt = 3; + err = usb_set_interface(dev->udev, +- intf->altsetting[alt].desc.bInterfaceNumber, 
alt); ++ intf->cur_altsetting->desc.bInterfaceNumber, alt); + if (err) { + dev_err(&dev->udev->dev, "Failed to set alternative setting %d " + "for %d interface: err=%d.\n", alt, +- intf->altsetting[alt].desc.bInterfaceNumber, err); ++ intf->cur_altsetting->desc.bInterfaceNumber, err); + goto err_out_clear; + } + +- iface_desc = &intf->altsetting[alt]; ++ iface_desc = intf->cur_altsetting; + if (iface_desc->desc.bNumEndpoints != NUM_EP-1) { + pr_info("Num endpoints=%d. It is not DS9490R.\n", + iface_desc->desc.bNumEndpoints); +diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c +index be7d187d53fd..d636e2660e62 100644 +--- a/fs/ceph/dir.c ++++ b/fs/ceph/dir.c +@@ -1288,6 +1288,7 @@ void ceph_dentry_lru_del(struct dentry *dn) + unsigned ceph_dentry_hash(struct inode *dir, struct dentry *dn) + { + struct ceph_inode_info *dci = ceph_inode(dir); ++ unsigned hash; + + switch (dci->i_dir_layout.dl_dir_hash) { + case 0: /* for backward compat */ +@@ -1295,8 +1296,11 @@ unsigned ceph_dentry_hash(struct inode *dir, struct dentry *dn) + return dn->d_name.hash; + + default: +- return ceph_str_hash(dci->i_dir_layout.dl_dir_hash, ++ spin_lock(&dn->d_lock); ++ hash = ceph_str_hash(dci->i_dir_layout.dl_dir_hash, + dn->d_name.name, dn->d_name.len); ++ spin_unlock(&dn->d_lock); ++ return hash; + } + } + +diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c +index 9f0d99094cc1..a663b676d566 100644 +--- a/fs/ceph/inode.c ++++ b/fs/ceph/inode.c +@@ -474,6 +474,7 @@ static void ceph_i_callback(struct rcu_head *head) + struct inode *inode = container_of(head, struct inode, i_rcu); + struct ceph_inode_info *ci = ceph_inode(inode); + ++ kfree(ci->i_symlink); + kmem_cache_free(ceph_inode_cachep, ci); + } + +@@ -505,7 +506,6 @@ void ceph_destroy_inode(struct inode *inode) + ceph_put_snap_realm(mdsc, realm); + } + +- kfree(ci->i_symlink); + while ((n = rb_first(&ci->i_fragtree)) != NULL) { + frag = rb_entry(n, struct ceph_inode_frag, node); + rb_erase(n, &ci->i_fragtree); +diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c +index 35e6e0b2cf34..a5de8e22629b 100644 +--- a/fs/ceph/mds_client.c ++++ b/fs/ceph/mds_client.c +@@ -1198,6 +1198,15 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, + list_add(&ci->i_prealloc_cap_flush->list, &to_remove); + ci->i_prealloc_cap_flush = NULL; + } ++ ++ if (drop && ++ ci->i_wrbuffer_ref_head == 0 && ++ ci->i_wr_ref == 0 && ++ ci->i_dirty_caps == 0 && ++ ci->i_flushing_caps == 0) { ++ ceph_put_snap_context(ci->i_head_snapc); ++ ci->i_head_snapc = NULL; ++ } + } + spin_unlock(&ci->i_ceph_lock); + while (!list_empty(&to_remove)) { +diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c +index a485d0cdc559..3d876a1cf567 100644 +--- a/fs/ceph/snap.c ++++ b/fs/ceph/snap.c +@@ -567,7 +567,12 @@ void ceph_queue_cap_snap(struct ceph_inode_info *ci) + capsnap = NULL; + + update_snapc: +- if (ci->i_head_snapc) { ++ if (ci->i_wrbuffer_ref_head == 0 && ++ ci->i_wr_ref == 0 && ++ ci->i_dirty_caps == 0 && ++ ci->i_flushing_caps == 0) { ++ ci->i_head_snapc = NULL; ++ } else { + ci->i_head_snapc = ceph_get_snap_context(new_snapc); + dout(" new snapc is %p\n", new_snapc); + } +diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c +index d8bd8dd36211..0f210cb5038a 100644 +--- a/fs/cifs/inode.c ++++ b/fs/cifs/inode.c +@@ -1669,6 +1669,10 @@ cifs_do_rename(const unsigned int xid, struct dentry *from_dentry, + if (rc == 0 || rc != -EBUSY) + goto do_rename_exit; + ++ /* Don't fall back to using SMB on SMB 2+ mount */ ++ if (server->vals->protocol_id != 0) ++ goto do_rename_exit; ++ + /* open-file 
renames don't work across directories */ + if (to_dentry->d_parent != from_dentry->d_parent) + goto do_rename_exit; +diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c +index 22fe11baef2b..3530e1c3ff56 100644 +--- a/fs/debugfs/inode.c ++++ b/fs/debugfs/inode.c +@@ -164,19 +164,24 @@ static int debugfs_show_options(struct seq_file *m, struct dentry *root) + return 0; + } + +-static void debugfs_evict_inode(struct inode *inode) ++static void debugfs_i_callback(struct rcu_head *head) + { +- truncate_inode_pages_final(&inode->i_data); +- clear_inode(inode); ++ struct inode *inode = container_of(head, struct inode, i_rcu); + if (S_ISLNK(inode->i_mode)) + kfree(inode->i_link); ++ free_inode_nonrcu(inode); ++} ++ ++static void debugfs_destroy_inode(struct inode *inode) ++{ ++ call_rcu(&inode->i_rcu, debugfs_i_callback); + } + + static const struct super_operations debugfs_super_operations = { + .statfs = simple_statfs, + .remount_fs = debugfs_remount, + .show_options = debugfs_show_options, +- .evict_inode = debugfs_evict_inode, ++ .destroy_inode = debugfs_destroy_inode, + }; + + static struct vfsmount *debugfs_automount(struct path *path) +diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c +index cefae2350da5..27c4e2ac39a9 100644 +--- a/fs/hugetlbfs/inode.c ++++ b/fs/hugetlbfs/inode.c +@@ -745,11 +745,17 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb, + umode_t mode, dev_t dev) + { + struct inode *inode; +- struct resv_map *resv_map; ++ struct resv_map *resv_map = NULL; + +- resv_map = resv_map_alloc(); +- if (!resv_map) +- return NULL; ++ /* ++ * Reserve maps are only needed for inodes that can have associated ++ * page allocations. ++ */ ++ if (S_ISREG(mode) || S_ISLNK(mode)) { ++ resv_map = resv_map_alloc(); ++ if (!resv_map) ++ return NULL; ++ } + + inode = new_inode(sb); + if (inode) { +@@ -790,8 +796,10 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb, + break; + } + lockdep_annotate_inode_mutex_key(inode); +- } else +- kref_put(&resv_map->refs, resv_map_release); ++ } else { ++ if (resv_map) ++ kref_put(&resv_map->refs, resv_map_release); ++ } + + return inode; + } +diff --git a/fs/jffs2/readinode.c b/fs/jffs2/readinode.c +index bfebbf13698c..5b52ea41b84f 100644 +--- a/fs/jffs2/readinode.c ++++ b/fs/jffs2/readinode.c +@@ -1414,11 +1414,6 @@ void jffs2_do_clear_inode(struct jffs2_sb_info *c, struct jffs2_inode_info *f) + + jffs2_kill_fragtree(&f->fragtree, deleted?c:NULL); + +- if (f->target) { +- kfree(f->target); +- f->target = NULL; +- } +- + fds = f->dents; + while(fds) { + fd = fds; +diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c +index 023e7f32ee1b..9fc297df8c75 100644 +--- a/fs/jffs2/super.c ++++ b/fs/jffs2/super.c +@@ -47,7 +47,10 @@ static struct inode *jffs2_alloc_inode(struct super_block *sb) + static void jffs2_i_callback(struct rcu_head *head) + { + struct inode *inode = container_of(head, struct inode, i_rcu); +- kmem_cache_free(jffs2_inode_cachep, JFFS2_INODE_INFO(inode)); ++ struct jffs2_inode_info *f = JFFS2_INODE_INFO(inode); ++ ++ kfree(f->target); ++ kmem_cache_free(jffs2_inode_cachep, f); + } + + static void jffs2_destroy_inode(struct inode *inode) +diff --git a/fs/nfs/super.c b/fs/nfs/super.c +index 9b42139a479b..dced329a8584 100644 +--- a/fs/nfs/super.c ++++ b/fs/nfs/super.c +@@ -2020,7 +2020,8 @@ static int nfs23_validate_mount_data(void *options, + memcpy(sap, &data->addr, sizeof(data->addr)); + args->nfs_server.addrlen = sizeof(data->addr); + args->nfs_server.port = ntohs(data->addr.sin_port); +- if 
(!nfs_verify_server_address(sap)) ++ if (sap->sa_family != AF_INET || ++ !nfs_verify_server_address(sap)) + goto out_no_address; + + if (!(data->flags & NFS_MOUNT_TCP)) +diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c +index 24ace275160c..4fa3f0ba9ab3 100644 +--- a/fs/nfsd/nfs4callback.c ++++ b/fs/nfsd/nfs4callback.c +@@ -874,8 +874,9 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata) + cb->cb_seq_status = 1; + cb->cb_status = 0; + if (minorversion) { +- if (!nfsd41_cb_get_slot(clp, task)) ++ if (!cb->cb_holds_slot && !nfsd41_cb_get_slot(clp, task)) + return; ++ cb->cb_holds_slot = true; + } + rpc_call_start(task); + } +@@ -902,6 +903,9 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback + return true; + } + ++ if (!cb->cb_holds_slot) ++ goto need_restart; ++ + switch (cb->cb_seq_status) { + case 0: + /* +@@ -939,6 +943,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback + cb->cb_seq_status); + } + ++ cb->cb_holds_slot = false; + clear_bit(0, &clp->cl_cb_slot_busy); + rpc_wake_up_next(&clp->cl_cb_waitq); + dprintk("%s: freed slot, new seqid=%d\n", __func__, +@@ -1146,6 +1151,7 @@ void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp, + cb->cb_seq_status = 1; + cb->cb_status = 0; + cb->cb_need_restart = false; ++ cb->cb_holds_slot = false; + } + + void nfsd4_run_cb(struct nfsd4_callback *cb) +diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h +index 86af697c21d3..2c26bedda7be 100644 +--- a/fs/nfsd/state.h ++++ b/fs/nfsd/state.h +@@ -70,6 +70,7 @@ struct nfsd4_callback { + int cb_seq_status; + int cb_status; + bool cb_need_restart; ++ bool cb_holds_slot; + }; + + struct nfsd4_callback_ops { +diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c +index c7e32a891502..2eea16a81500 100644 +--- a/fs/proc/proc_sysctl.c ++++ b/fs/proc/proc_sysctl.c +@@ -1550,9 +1550,11 @@ static void drop_sysctl_table(struct ctl_table_header *header) + if (--header->nreg) + return; + +- if (parent) ++ if (parent) { + put_links(header); +- start_unregistering(header); ++ start_unregistering(header); ++ } ++ + if (!--header->count) + kfree_rcu(header, rcu); + +diff --git a/include/linux/bitops.h b/include/linux/bitops.h +index defeaac0745f..e76d03f44c80 100644 +--- a/include/linux/bitops.h ++++ b/include/linux/bitops.h +@@ -1,28 +1,9 @@ + #ifndef _LINUX_BITOPS_H + #define _LINUX_BITOPS_H + #include ++#include + +-#ifdef __KERNEL__ +-#define BIT(nr) (1UL << (nr)) +-#define BIT_ULL(nr) (1ULL << (nr)) +-#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG)) +-#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) +-#define BIT_ULL_MASK(nr) (1ULL << ((nr) % BITS_PER_LONG_LONG)) +-#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG) +-#define BITS_PER_BYTE 8 + #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long)) +-#endif +- +-/* +- * Create a contiguous bitmask starting at bit position @l and ending at +- * position @h. For example +- * GENMASK_ULL(39, 21) gives us the 64bit vector 0x000000ffffe00000. 
+- */ +-#define GENMASK(h, l) \ +- (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h)))) +- +-#define GENMASK_ULL(h, l) \ +- (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h)))) + + extern unsigned int __sw_hweight8(unsigned int w); + extern unsigned int __sw_hweight16(unsigned int w); +diff --git a/include/linux/bits.h b/include/linux/bits.h +new file mode 100644 +index 000000000000..2b7b532c1d51 +--- /dev/null ++++ b/include/linux/bits.h +@@ -0,0 +1,26 @@ ++/* SPDX-License-Identifier: GPL-2.0 */ ++#ifndef __LINUX_BITS_H ++#define __LINUX_BITS_H ++#include ++ ++#define BIT(nr) (1UL << (nr)) ++#define BIT_ULL(nr) (1ULL << (nr)) ++#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG)) ++#define BIT_WORD(nr) ((nr) / BITS_PER_LONG) ++#define BIT_ULL_MASK(nr) (1ULL << ((nr) % BITS_PER_LONG_LONG)) ++#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG) ++#define BITS_PER_BYTE 8 ++ ++/* ++ * Create a contiguous bitmask starting at bit position @l and ending at ++ * position @h. For example ++ * GENMASK_ULL(39, 21) gives us the 64bit vector 0x000000ffffe00000. ++ */ ++#define GENMASK(h, l) \ ++ (((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h)))) ++ ++#define GENMASK_ULL(h, l) \ ++ (((~0ULL) - (1ULL << (l)) + 1) & \ ++ (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h)))) ++ ++#endif /* __LINUX_BITS_H */ +diff --git a/include/linux/cpu.h b/include/linux/cpu.h +index 063c73ed6d78..664f892d6e73 100644 +--- a/include/linux/cpu.h ++++ b/include/linux/cpu.h +@@ -50,6 +50,8 @@ extern ssize_t cpu_show_spec_store_bypass(struct device *dev, + struct device_attribute *attr, char *buf); + extern ssize_t cpu_show_l1tf(struct device *dev, + struct device_attribute *attr, char *buf); ++extern ssize_t cpu_show_mds(struct device *dev, ++ struct device_attribute *attr, char *buf); + + extern __printf(4, 5) + struct device *cpu_device_create(struct device *parent, void *drvdata, +@@ -294,4 +296,21 @@ bool cpu_wait_death(unsigned int cpu, int seconds); + bool cpu_report_death(void); + #endif /* #ifdef CONFIG_HOTPLUG_CPU */ + ++/* ++ * These are used for a global "mitigations=" cmdline option for toggling ++ * optional CPU mitigations. 
++ */ ++enum cpu_mitigations { ++ CPU_MITIGATIONS_OFF, ++ CPU_MITIGATIONS_AUTO, ++}; ++ ++extern enum cpu_mitigations cpu_mitigations; ++ ++/* mitigations=off */ ++static inline bool cpu_mitigations_off(void) ++{ ++ return cpu_mitigations == CPU_MITIGATIONS_OFF; ++} ++ + #endif /* _LINUX_CPU_H_ */ +diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h +index 68904469fba1..2209eb0740b0 100644 +--- a/include/linux/jump_label.h ++++ b/include/linux/jump_label.h +@@ -267,9 +267,15 @@ struct static_key_false { + #define DEFINE_STATIC_KEY_TRUE(name) \ + struct static_key_true name = STATIC_KEY_TRUE_INIT + ++#define DECLARE_STATIC_KEY_TRUE(name) \ ++ extern struct static_key_true name ++ + #define DEFINE_STATIC_KEY_FALSE(name) \ + struct static_key_false name = STATIC_KEY_FALSE_INIT + ++#define DECLARE_STATIC_KEY_FALSE(name) \ ++ extern struct static_key_false name ++ + extern bool ____wrong_branch_error(void); + + #define static_key_enabled(x) \ +diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h +index 81fdf4b8aba4..8b1e2bd46bb7 100644 +--- a/include/linux/ptrace.h ++++ b/include/linux/ptrace.h +@@ -57,14 +57,17 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead); + #define PTRACE_MODE_READ 0x01 + #define PTRACE_MODE_ATTACH 0x02 + #define PTRACE_MODE_NOAUDIT 0x04 +-#define PTRACE_MODE_FSCREDS 0x08 +-#define PTRACE_MODE_REALCREDS 0x10 ++#define PTRACE_MODE_FSCREDS 0x08 ++#define PTRACE_MODE_REALCREDS 0x10 ++#define PTRACE_MODE_SCHED 0x20 ++#define PTRACE_MODE_IBPB 0x40 + + /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */ + #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS) + #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS) + #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS) + #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS) ++#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB) + + /** + * ptrace_may_access - check whether the caller is permitted to access +@@ -82,6 +85,20 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead); + */ + extern bool ptrace_may_access(struct task_struct *task, unsigned int mode); + ++/** ++ * ptrace_may_access - check whether the caller is permitted to access ++ * a target task. ++ * @task: target task ++ * @mode: selects type of access and caller credentials ++ * ++ * Returns true on success, false on denial. ++ * ++ * Similar to ptrace_may_access(). Only to be called from context switch ++ * code. Does not call into audit and the regular LSM hooks due to locking ++ * constraints. 
++ */ ++extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode); ++ + static inline int ptrace_reparented(struct task_struct *child) + { + return !same_thread_group(child->real_parent, child->parent); +diff --git a/include/linux/sched.h b/include/linux/sched.h +index 48a59f731406..a0b540f800d9 100644 +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -2169,6 +2169,8 @@ static inline void memalloc_noio_restore(unsigned int flags) + #define PFA_SPREAD_SLAB 2 /* Spread some slab caches over cpuset */ + #define PFA_SPEC_SSB_DISABLE 4 /* Speculative Store Bypass disabled */ + #define PFA_SPEC_SSB_FORCE_DISABLE 5 /* Speculative Store Bypass force disabled*/ ++#define PFA_SPEC_IB_DISABLE 6 /* Indirect branch speculation restricted */ ++#define PFA_SPEC_IB_FORCE_DISABLE 7 /* Indirect branch speculation permanently restricted */ + + + #define TASK_PFA_TEST(name, func) \ +@@ -2199,6 +2201,13 @@ TASK_PFA_CLEAR(SPEC_SSB_DISABLE, spec_ssb_disable) + TASK_PFA_TEST(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable) + TASK_PFA_SET(SPEC_SSB_FORCE_DISABLE, spec_ssb_force_disable) + ++TASK_PFA_TEST(SPEC_IB_DISABLE, spec_ib_disable) ++TASK_PFA_SET(SPEC_IB_DISABLE, spec_ib_disable) ++TASK_PFA_CLEAR(SPEC_IB_DISABLE, spec_ib_disable) ++ ++TASK_PFA_TEST(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable) ++TASK_PFA_SET(SPEC_IB_FORCE_DISABLE, spec_ib_force_disable) ++ + /* + * task->jobctl flags + */ +diff --git a/include/linux/sched/smt.h b/include/linux/sched/smt.h +new file mode 100644 +index 000000000000..559ac4590593 +--- /dev/null ++++ b/include/linux/sched/smt.h +@@ -0,0 +1,20 @@ ++/* SPDX-License-Identifier: GPL-2.0 */ ++#ifndef _LINUX_SCHED_SMT_H ++#define _LINUX_SCHED_SMT_H ++ ++#include ++ ++#ifdef CONFIG_SCHED_SMT ++extern atomic_t sched_smt_present; ++ ++static __always_inline bool sched_smt_active(void) ++{ ++ return atomic_read(&sched_smt_present); ++} ++#else ++static inline bool sched_smt_active(void) { return false; } ++#endif ++ ++void arch_smt_update(void); ++ ++#endif +diff --git a/include/linux/usb.h b/include/linux/usb.h +index 5c03ebc6dfa0..02bffcc611c3 100644 +--- a/include/linux/usb.h ++++ b/include/linux/usb.h +@@ -127,7 +127,6 @@ enum usb_interface_condition { + * @dev: driver model's view of this device + * @usb_dev: if an interface is bound to the USB major, this will point + * to the sysfs representation for that device. +- * @pm_usage_cnt: PM usage counter for this interface + * @reset_ws: Used for scheduling resets from atomic context. + * @resetting_device: USB core reset the device, so use alt setting 0 as + * current; needs bandwidth alloc after reset. 
+@@ -184,7 +183,6 @@ struct usb_interface { + + struct device dev; /* interface specific device info */ + struct device *usb_dev; +- atomic_t pm_usage_cnt; /* usage counter for autosuspend */ + struct work_struct reset_ws; /* for resets in atomic context */ + }; + #define to_usb_interface(d) container_of(d, struct usb_interface, dev) +diff --git a/include/net/addrconf.h b/include/net/addrconf.h +index 18dd7a3caf2f..af032e5405f6 100644 +--- a/include/net/addrconf.h ++++ b/include/net/addrconf.h +@@ -162,6 +162,7 @@ int ipv6_sock_mc_join(struct sock *sk, int ifindex, + const struct in6_addr *addr); + int ipv6_sock_mc_drop(struct sock *sk, int ifindex, + const struct in6_addr *addr); ++void __ipv6_sock_mc_close(struct sock *sk); + void ipv6_sock_mc_close(struct sock *sk); + bool inet6_mc_check(struct sock *sk, const struct in6_addr *mc_addr, + const struct in6_addr *src_addr); +diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h +index 876688b5a356..7c0c83dfe86e 100644 +--- a/include/net/bluetooth/hci_core.h ++++ b/include/net/bluetooth/hci_core.h +@@ -174,6 +174,9 @@ struct adv_info { + + #define HCI_MAX_SHORT_NAME_LENGTH 10 + ++/* Min encryption key size to match with SMP */ ++#define HCI_MIN_ENC_KEY_SIZE 7 ++ + /* Default LE RPA expiry time, 15 minutes */ + #define HCI_DEFAULT_RPA_TIMEOUT (15 * 60) + +diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h +index 64776b72e1eb..64ec0d62e5f5 100644 +--- a/include/uapi/linux/prctl.h ++++ b/include/uapi/linux/prctl.h +@@ -202,6 +202,7 @@ struct prctl_mm_map { + #define PR_SET_SPECULATION_CTRL 53 + /* Speculation control variants */ + # define PR_SPEC_STORE_BYPASS 0 ++# define PR_SPEC_INDIRECT_BRANCH 1 + /* Return and control values for PR_SET/GET_SPECULATION_CTRL */ + # define PR_SPEC_NOT_AFFECTED 0 + # define PR_SPEC_PRCTL (1UL << 0) +diff --git a/init/main.c b/init/main.c +index 49926d95442f..e88c8cdef6a7 100644 +--- a/init/main.c ++++ b/init/main.c +@@ -538,6 +538,8 @@ asmlinkage __visible void __init start_kernel(void) + page_alloc_init(); + + pr_notice("Kernel command line: %s\n", boot_command_line); ++ /* parameters may set static keys */ ++ jump_label_init(); + parse_early_param(); + after_dashes = parse_args("Booting kernel", + static_command_line, __start___param, +@@ -547,8 +549,6 @@ asmlinkage __visible void __init start_kernel(void) + parse_args("Setting init args", after_dashes, NULL, 0, -1, -1, + NULL, set_init_arg); + +- jump_label_init(); +- + /* + * These use large bootmem allocations and must precede + * kmem_cache_init() +diff --git a/kernel/cpu.c b/kernel/cpu.c +index 42ce0b0ae5c5..3225c3a9d028 100644 +--- a/kernel/cpu.c ++++ b/kernel/cpu.c +@@ -8,6 +8,7 @@ + #include + #include + #include ++#include + #include + #include + #include +@@ -199,6 +200,12 @@ void cpu_hotplug_enable(void) + EXPORT_SYMBOL_GPL(cpu_hotplug_enable); + #endif /* CONFIG_HOTPLUG_CPU */ + ++/* ++ * Architectures that need SMT-specific errata handling during SMT hotplug ++ * should override this. ++ */ ++void __weak arch_smt_update(void) { } ++ + /* Need to know about CPUs going up/down? 
*/ + int register_cpu_notifier(struct notifier_block *nb) + { +@@ -434,6 +441,7 @@ out_release: + cpu_hotplug_done(); + if (!err) + cpu_notify_nofail(CPU_POST_DEAD | mod, hcpu); ++ arch_smt_update(); + return err; + } + +@@ -537,7 +545,7 @@ out_notify: + __cpu_notify(CPU_UP_CANCELED | mod, hcpu, nr_calls, NULL); + out: + cpu_hotplug_done(); +- ++ arch_smt_update(); + return ret; + } + +@@ -834,3 +842,16 @@ void init_cpu_online(const struct cpumask *src) + { + cpumask_copy(to_cpumask(cpu_online_bits), src); + } ++ ++enum cpu_mitigations cpu_mitigations = CPU_MITIGATIONS_AUTO; ++ ++static int __init mitigations_parse_cmdline(char *arg) ++{ ++ if (!strcmp(arg, "off")) ++ cpu_mitigations = CPU_MITIGATIONS_OFF; ++ else if (!strcmp(arg, "auto")) ++ cpu_mitigations = CPU_MITIGATIONS_AUTO; ++ ++ return 0; ++} ++early_param("mitigations", mitigations_parse_cmdline); +diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c +index 83cea913983c..92c7eb1aeded 100644 +--- a/kernel/irq/manage.c ++++ b/kernel/irq/manage.c +@@ -319,8 +319,10 @@ irq_set_affinity_notifier(unsigned int irq, struct irq_affinity_notify *notify) + desc->affinity_notify = notify; + raw_spin_unlock_irqrestore(&desc->lock, flags); + +- if (old_notify) ++ if (old_notify) { ++ cancel_work_sync(&old_notify->work); + kref_put(&old_notify->kref, old_notify->release); ++ } + + return 0; + } +diff --git a/kernel/ptrace.c b/kernel/ptrace.c +index 5e2cd1030702..8303874c2a06 100644 +--- a/kernel/ptrace.c ++++ b/kernel/ptrace.c +@@ -228,6 +228,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state) + + static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode) + { ++ if (mode & PTRACE_MODE_SCHED) ++ return false; ++ + if (mode & PTRACE_MODE_NOAUDIT) + return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE); + else +@@ -295,9 +298,16 @@ ok: + !ptrace_has_cap(mm->user_ns, mode))) + return -EPERM; + ++ if (mode & PTRACE_MODE_SCHED) ++ return 0; + return security_ptrace_access_check(task, mode); + } + ++bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode) ++{ ++ return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED); ++} ++ + bool ptrace_may_access(struct task_struct *task, unsigned int mode) + { + int err; +diff --git a/kernel/sched/core.c b/kernel/sched/core.c +index d0618951014b..d35a7d528ea6 100644 +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -5610,6 +5610,10 @@ static void set_cpu_rq_start_time(void) + rq->age_stamp = sched_clock_cpu(cpu); + } + ++#ifdef CONFIG_SCHED_SMT ++atomic_t sched_smt_present = ATOMIC_INIT(0); ++#endif ++ + static int sched_cpu_active(struct notifier_block *nfb, + unsigned long action, void *hcpu) + { +@@ -5626,11 +5630,23 @@ static int sched_cpu_active(struct notifier_block *nfb, + * set_cpu_online(). But it might not yet have marked itself + * as active, which is essential from here on. + */ ++#ifdef CONFIG_SCHED_SMT ++ /* ++ * When going up, increment the number of cores with SMT present. 
++ */ ++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) ++ atomic_inc(&sched_smt_present); ++#endif + set_cpu_active(cpu, true); + stop_machine_unpark(cpu); + return NOTIFY_OK; + + case CPU_DOWN_FAILED: ++#ifdef CONFIG_SCHED_SMT ++ /* Same as for CPU_ONLINE */ ++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) ++ atomic_inc(&sched_smt_present); ++#endif + set_cpu_active(cpu, true); + return NOTIFY_OK; + +@@ -5645,7 +5661,15 @@ static int sched_cpu_inactive(struct notifier_block *nfb, + switch (action & ~CPU_TASKS_FROZEN) { + case CPU_DOWN_PREPARE: + set_cpu_active((long)hcpu, false); ++#ifdef CONFIG_SCHED_SMT ++ /* ++ * When going down, decrement the number of cores with SMT present. ++ */ ++ if (cpumask_weight(cpu_smt_mask((long)hcpu)) == 2) ++ atomic_dec(&sched_smt_present); ++#endif + return NOTIFY_OK; ++ + default: + return NOTIFY_DONE; + } +diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c +index d706cf4fda99..75bfa23f97b4 100644 +--- a/kernel/sched/fair.c ++++ b/kernel/sched/fair.c +@@ -1722,6 +1722,10 @@ static u64 numa_get_avg_runtime(struct task_struct *p, u64 *period) + if (p->last_task_numa_placement) { + delta = runtime - p->last_sum_exec_runtime; + *period = now - p->last_task_numa_placement; ++ ++ /* Avoid time going backwards, prevent potential divide error: */ ++ if (unlikely((s64)*period < 0)) ++ *period = 0; + } else { + delta = p->se.avg.load_sum / p->se.load.weight; + *period = LOAD_AVG_MAX; +diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h +index 6893ee31df4d..8b96df04ba78 100644 +--- a/kernel/sched/sched.h ++++ b/kernel/sched/sched.h +@@ -2,6 +2,7 @@ + #include + #include + #include ++#include + #include + #include + #include +diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c +index 1adecb4b87c8..7e4d715f9c22 100644 +--- a/kernel/time/timer_stats.c ++++ b/kernel/time/timer_stats.c +@@ -417,7 +417,7 @@ static int __init init_tstats_procfs(void) + { + struct proc_dir_entry *pe; + +- pe = proc_create("timer_stats", 0644, NULL, &tstats_fops); ++ pe = proc_create("timer_stats", 0600, NULL, &tstats_fops); + if (!pe) + return -ENOMEM; + return 0; +diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c +index 5e091614fe39..1cf2402c6922 100644 +--- a/kernel/trace/ring_buffer.c ++++ b/kernel/trace/ring_buffer.c +@@ -701,7 +701,7 @@ u64 ring_buffer_time_stamp(struct ring_buffer *buffer, int cpu) + + preempt_disable_notrace(); + time = rb_time_stamp(buffer); +- preempt_enable_no_resched_notrace(); ++ preempt_enable_notrace(); + + return time; + } +diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c +index ac9791dd4768..5139c4ebb96b 100644 +--- a/net/8021q/vlan_dev.c ++++ b/net/8021q/vlan_dev.c +@@ -363,10 +363,12 @@ static int vlan_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) + ifrr.ifr_ifru = ifr->ifr_ifru; + + switch (cmd) { ++ case SIOCSHWTSTAMP: ++ if (!net_eq(dev_net(dev), &init_net)) ++ break; + case SIOCGMIIPHY: + case SIOCGMIIREG: + case SIOCSMIIREG: +- case SIOCSHWTSTAMP: + case SIOCGHWTSTAMP: + if (netif_device_present(real_dev) && ops->ndo_do_ioctl) + err = ops->ndo_do_ioctl(real_dev, &ifrr, cmd); +diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c +index 80be0ee17ff3..83d4d574fa44 100644 +--- a/net/bluetooth/hci_conn.c ++++ b/net/bluetooth/hci_conn.c +@@ -1177,6 +1177,14 @@ int hci_conn_check_link_mode(struct hci_conn *conn) + !test_bit(HCI_CONN_ENCRYPT, &conn->flags)) + return 0; + ++ /* The minimum encryption key size needs to be enforced by the ++ * host stack before establishing any L2CAP 
connections. The ++ * specification in theory allows a minimum of 1, but to align ++ * BR/EDR and LE transports, a minimum of 7 is chosen. ++ */ ++ if (conn->enc_key_size < HCI_MIN_ENC_KEY_SIZE) ++ return 0; ++ + return 1; + } + +diff --git a/net/bluetooth/hidp/sock.c b/net/bluetooth/hidp/sock.c +index 008ba439bd62..cc80c76177b6 100644 +--- a/net/bluetooth/hidp/sock.c ++++ b/net/bluetooth/hidp/sock.c +@@ -76,6 +76,7 @@ static int hidp_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long + sockfd_put(csock); + return err; + } ++ ca.name[sizeof(ca.name)-1] = 0; + + err = hidp_connection_add(&ca, csock, isock); + if (!err && copy_to_user(argp, &ca, sizeof(ca))) +diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c +index 50e84e634dfe..c7a281549d91 100644 +--- a/net/bridge/br_if.c ++++ b/net/bridge/br_if.c +@@ -471,13 +471,15 @@ int br_add_if(struct net_bridge *br, struct net_device *dev) + call_netdevice_notifiers(NETDEV_JOIN, dev); + + err = dev_set_allmulti(dev, 1); +- if (err) +- goto put_back; ++ if (err) { ++ kfree(p); /* kobject not yet init'd, manually free */ ++ goto err1; ++ } + + err = kobject_init_and_add(&p->kobj, &brport_ktype, &(dev->dev.kobj), + SYSFS_BRIDGE_PORT_ATTR); + if (err) +- goto err1; ++ goto err2; + + err = br_sysfs_addif(p); + if (err) +@@ -551,12 +553,9 @@ err3: + sysfs_remove_link(br->ifobj, p->dev->name); + err2: + kobject_put(&p->kobj); +- p = NULL; /* kobject_put frees */ +-err1: + dev_set_allmulti(dev, -1); +-put_back: ++err1: + dev_put(dev); +- kfree(p); + return err; + } + +diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c +index 93b5525bcccf..2ae0451fd634 100644 +--- a/net/bridge/br_netfilter_hooks.c ++++ b/net/bridge/br_netfilter_hooks.c +@@ -507,6 +507,7 @@ static unsigned int br_nf_pre_routing(void *priv, + nf_bridge->ipv4_daddr = ip_hdr(skb)->daddr; + + skb->protocol = htons(ETH_P_IP); ++ skb->transport_header = skb->network_header + ip_hdr(skb)->ihl * 4; + + NF_HOOK(NFPROTO_IPV4, NF_INET_PRE_ROUTING, state->net, state->sk, skb, + skb->dev, NULL, +diff --git a/net/bridge/br_netfilter_ipv6.c b/net/bridge/br_netfilter_ipv6.c +index 69dfd212e50d..f94c83f5cc37 100644 +--- a/net/bridge/br_netfilter_ipv6.c ++++ b/net/bridge/br_netfilter_ipv6.c +@@ -237,6 +237,8 @@ unsigned int br_nf_pre_routing_ipv6(void *priv, + nf_bridge->ipv6_daddr = ipv6_hdr(skb)->daddr; + + skb->protocol = htons(ETH_P_IPV6); ++ skb->transport_header = skb->network_header + sizeof(struct ipv6hdr); ++ + NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING, state->net, state->sk, skb, + skb->dev, NULL, + br_nf_pre_routing_finish_ipv6); +diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c +index f13402d407e4..1a87cf78fadc 100644 +--- a/net/bridge/netfilter/ebtables.c ++++ b/net/bridge/netfilter/ebtables.c +@@ -2046,7 +2046,8 @@ static int ebt_size_mwt(struct compat_ebt_entry_mwt *match32, + if (match_kern) + match_kern->match_size = ret; + +- if (WARN_ON(type == EBT_COMPAT_TARGET && size_left)) ++ /* rule should have no remaining data after target */ ++ if (type == EBT_COMPAT_TARGET && size_left) + return -EINVAL; + + match32 = (struct compat_ebt_entry_mwt *) buf; +diff --git a/net/core/filter.c b/net/core/filter.c +index 1a9ded6af138..3c5f51198c41 100644 +--- a/net/core/filter.c ++++ b/net/core/filter.c +@@ -742,6 +742,17 @@ static bool chk_code_allowed(u16 code_to_probe) + return codes[code_to_probe]; + } + ++static bool bpf_check_basics_ok(const struct sock_filter *filter, ++ unsigned int flen) ++{ ++ if (filter == NULL) ++ return 
false; ++ if (flen == 0 || flen > BPF_MAXINSNS) ++ return false; ++ ++ return true; ++} ++ + /** + * bpf_check_classic - verify socket filter code + * @filter: filter to verify +@@ -762,9 +773,6 @@ static int bpf_check_classic(const struct sock_filter *filter, + bool anc_found; + int pc; + +- if (flen == 0 || flen > BPF_MAXINSNS) +- return -EINVAL; +- + /* Check the filter code now */ + for (pc = 0; pc < flen; pc++) { + const struct sock_filter *ftest = &filter[pc]; +@@ -1057,7 +1065,7 @@ int bpf_prog_create(struct bpf_prog **pfp, struct sock_fprog_kern *fprog) + struct bpf_prog *fp; + + /* Make sure new filter is there and in the right amounts. */ +- if (fprog->filter == NULL) ++ if (!bpf_check_basics_ok(fprog->filter, fprog->len)) + return -EINVAL; + + fp = bpf_prog_alloc(bpf_prog_size(fprog->len), 0); +@@ -1104,7 +1112,7 @@ int bpf_prog_create_from_user(struct bpf_prog **pfp, struct sock_fprog *fprog, + int err; + + /* Make sure new filter is there and in the right amounts. */ +- if (fprog->filter == NULL) ++ if (!bpf_check_basics_ok(fprog->filter, fprog->len)) + return -EINVAL; + + fp = bpf_prog_alloc(bpf_prog_size(fprog->len), 0); +@@ -1184,7 +1192,6 @@ int __sk_attach_filter(struct sock_fprog *fprog, struct sock *sk, + bool locked) + { + unsigned int fsize = bpf_classic_proglen(fprog); +- unsigned int bpf_fsize = bpf_prog_size(fprog->len); + struct bpf_prog *prog; + int err; + +@@ -1192,10 +1199,10 @@ int __sk_attach_filter(struct sock_fprog *fprog, struct sock *sk, + return -EPERM; + + /* Make sure new filter is there and in the right amounts. */ +- if (fprog->filter == NULL) ++ if (!bpf_check_basics_ok(fprog->filter, fprog->len)) + return -EINVAL; + +- prog = bpf_prog_alloc(bpf_fsize, 0); ++ prog = bpf_prog_alloc(bpf_prog_size(fprog->len), 0); + if (!prog) + return -ENOMEM; + +diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c +index c11bb6d2d00a..6d5a0a7ebe10 100644 +--- a/net/ipv4/ip_output.c ++++ b/net/ipv4/ip_output.c +@@ -475,6 +475,7 @@ static void ip_copy_metadata(struct sk_buff *to, struct sk_buff *from) + to->pkt_type = from->pkt_type; + to->priority = from->priority; + to->protocol = from->protocol; ++ to->skb_iif = from->skb_iif; + skb_dst_drop(to); + skb_dst_copy(to, from); + to->dev = from->dev; +diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c +index 4d3d4291c82f..e742323d69e1 100644 +--- a/net/ipv4/raw.c ++++ b/net/ipv4/raw.c +@@ -167,6 +167,7 @@ static int icmp_filter(const struct sock *sk, const struct sk_buff *skb) + */ + static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash) + { ++ int dif = inet_iif(skb); + struct sock *sk; + struct hlist_head *head; + int delivered = 0; +@@ -179,8 +180,7 @@ static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash) + + net = dev_net(skb->dev); + sk = __raw_v4_lookup(net, __sk_head(head), iph->protocol, +- iph->saddr, iph->daddr, +- skb->dev->ifindex); ++ iph->saddr, iph->daddr, dif); + + while (sk) { + delivered = 1; +diff --git a/net/ipv4/route.c b/net/ipv4/route.c +index 1d580d290054..a58effba760a 100644 +--- a/net/ipv4/route.c ++++ b/net/ipv4/route.c +@@ -1162,25 +1162,39 @@ static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie) + return dst; + } + +-static void ipv4_link_failure(struct sk_buff *skb) ++static void ipv4_send_dest_unreach(struct sk_buff *skb) + { + struct ip_options opt; +- struct rtable *rt; + int res; + + /* Recompile ip options since IPCB may not be valid anymore. ++ * Also check we have a reasonable ipv4 header. 
+ */ +- memset(&opt, 0, sizeof(opt)); +- opt.optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr); ++ if (!pskb_network_may_pull(skb, sizeof(struct iphdr)) || ++ ip_hdr(skb)->version != 4 || ip_hdr(skb)->ihl < 5) ++ return; + +- rcu_read_lock(); +- res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL); +- rcu_read_unlock(); ++ memset(&opt, 0, sizeof(opt)); ++ if (ip_hdr(skb)->ihl > 5) { ++ if (!pskb_network_may_pull(skb, ip_hdr(skb)->ihl * 4)) ++ return; ++ opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr); + +- if (res) +- return; ++ rcu_read_lock(); ++ res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL); ++ rcu_read_unlock(); + ++ if (res) ++ return; ++ } + __icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt); ++} ++ ++static void ipv4_link_failure(struct sk_buff *skb) ++{ ++ struct rtable *rt; ++ ++ ipv4_send_dest_unreach(skb); + + rt = skb_rtable(skb); + if (rt) +diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c +index da90c74d12ef..167ca0fddf9e 100644 +--- a/net/ipv4/sysctl_net_ipv4.c ++++ b/net/ipv4/sysctl_net_ipv4.c +@@ -42,6 +42,7 @@ static int tcp_syn_retries_min = 1; + static int tcp_syn_retries_max = MAX_TCP_SYNCNT; + static int ip_ping_group_range_min[] = { 0, 0 }; + static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX }; ++static int one_day_secs = 24 * 3600; + + /* Update system visible IP port range */ + static void set_local_port_range(struct net *net, int range[2]) +@@ -597,7 +598,9 @@ static struct ctl_table ipv4_table[] = { + .data = &sysctl_tcp_min_rtt_wlen, + .maxlen = sizeof(int), + .mode = 0644, +- .proc_handler = proc_dointvec ++ .proc_handler = proc_dointvec_minmax, ++ .extra1 = &zero, ++ .extra2 = &one_day_secs + }, + { + .procname = "tcp_low_latency", +diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c +index f3a0a9c0f61e..c6061f7343f1 100644 +--- a/net/ipv6/ip6_flowlabel.c ++++ b/net/ipv6/ip6_flowlabel.c +@@ -94,15 +94,21 @@ static struct ip6_flowlabel *fl_lookup(struct net *net, __be32 label) + return fl; + } + ++static void fl_free_rcu(struct rcu_head *head) ++{ ++ struct ip6_flowlabel *fl = container_of(head, struct ip6_flowlabel, rcu); ++ ++ if (fl->share == IPV6_FL_S_PROCESS) ++ put_pid(fl->owner.pid); ++ kfree(fl->opt); ++ kfree(fl); ++} ++ + + static void fl_free(struct ip6_flowlabel *fl) + { +- if (fl) { +- if (fl->share == IPV6_FL_S_PROCESS) +- put_pid(fl->owner.pid); +- kfree(fl->opt); +- kfree_rcu(fl, rcu); +- } ++ if (fl) ++ call_rcu(&fl->rcu, fl_free_rcu); + } + + static void fl_release(struct ip6_flowlabel *fl) +@@ -633,9 +639,9 @@ recheck: + if (fl1->share == IPV6_FL_S_EXCL || + fl1->share != fl->share || + ((fl1->share == IPV6_FL_S_PROCESS) && +- (fl1->owner.pid == fl->owner.pid)) || ++ (fl1->owner.pid != fl->owner.pid)) || + ((fl1->share == IPV6_FL_S_USER) && +- uid_eq(fl1->owner.uid, fl->owner.uid))) ++ !uid_eq(fl1->owner.uid, fl->owner.uid))) + goto release; + + err = -ENOMEM; +diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c +index 8d11a034ca3f..71263754b19b 100644 +--- a/net/ipv6/ipv6_sockglue.c ++++ b/net/ipv6/ipv6_sockglue.c +@@ -121,6 +121,7 @@ struct ipv6_txoptions *ipv6_update_options(struct sock *sk, + static bool setsockopt_needs_rtnl(int optname) + { + switch (optname) { ++ case IPV6_ADDRFORM: + case IPV6_ADD_MEMBERSHIP: + case IPV6_DROP_MEMBERSHIP: + case IPV6_JOIN_ANYCAST: +@@ -199,7 +200,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, + } + + fl6_free_socklist(sk); +- ipv6_sock_mc_close(sk); ++ 
__ipv6_sock_mc_close(sk); + + /* + * Sock is moving from IPv6 to IPv4 (sk_prot), so +diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c +index a5ec9a0cbb80..976c8133a281 100644 +--- a/net/ipv6/mcast.c ++++ b/net/ipv6/mcast.c +@@ -276,16 +276,14 @@ static struct inet6_dev *ip6_mc_find_dev_rcu(struct net *net, + return idev; + } + +-void ipv6_sock_mc_close(struct sock *sk) ++void __ipv6_sock_mc_close(struct sock *sk) + { + struct ipv6_pinfo *np = inet6_sk(sk); + struct ipv6_mc_socklist *mc_lst; + struct net *net = sock_net(sk); + +- if (!rcu_access_pointer(np->ipv6_mc_list)) +- return; ++ ASSERT_RTNL(); + +- rtnl_lock(); + while ((mc_lst = rtnl_dereference(np->ipv6_mc_list)) != NULL) { + struct net_device *dev; + +@@ -303,8 +301,17 @@ void ipv6_sock_mc_close(struct sock *sk) + + atomic_sub(sizeof(*mc_lst), &sk->sk_omem_alloc); + kfree_rcu(mc_lst, rcu); +- + } ++} ++ ++void ipv6_sock_mc_close(struct sock *sk) ++{ ++ struct ipv6_pinfo *np = inet6_sk(sk); ++ ++ if (!rcu_access_pointer(np->ipv6_mc_list)) ++ return; ++ rtnl_lock(); ++ __ipv6_sock_mc_close(sk); + rtnl_unlock(); + } + +diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c +index 77736190dc15..5039486c4f86 100644 +--- a/net/ipv6/sit.c ++++ b/net/ipv6/sit.c +@@ -1076,7 +1076,7 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev) + if (!tdev && tunnel->parms.link) + tdev = __dev_get_by_index(tunnel->net, tunnel->parms.link); + +- if (tdev) { ++ if (tdev && !netif_is_l3_master(tdev)) { + int t_hlen = tunnel->hlen + sizeof(struct iphdr); + + dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr); +diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c +index ac212542a217..c4509a10ce52 100644 +--- a/net/netfilter/ipvs/ip_vs_core.c ++++ b/net/netfilter/ipvs/ip_vs_core.c +@@ -1484,7 +1484,7 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related, + if (!cp) { + int v; + +- if (!sysctl_schedule_icmp(ipvs)) ++ if (ipip || !sysctl_schedule_icmp(ipvs)) + return NF_ACCEPT; + + if (!ip_vs_try_to_schedule(ipvs, AF_INET, skb, pd, &v, &cp, &ciph)) +diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c +index b6e72af15237..cdafbd38a456 100644 +--- a/net/netfilter/x_tables.c ++++ b/net/netfilter/x_tables.c +@@ -1699,7 +1699,7 @@ static int __init xt_init(void) + seqcount_init(&per_cpu(xt_recseq, i)); + } + +- xt = kmalloc(sizeof(struct xt_af) * NFPROTO_NUMPROTO, GFP_KERNEL); ++ xt = kcalloc(NFPROTO_NUMPROTO, sizeof(struct xt_af), GFP_KERNEL); + if (!xt) + return -ENOMEM; + +diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c +index 7d93228ba1e1..c78bcc13ebab 100644 +--- a/net/packet/af_packet.c ++++ b/net/packet/af_packet.c +@@ -2490,8 +2490,8 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) + void *ph; + DECLARE_SOCKADDR(struct sockaddr_ll *, saddr, msg->msg_name); + bool need_wait = !(msg->msg_flags & MSG_DONTWAIT); ++ unsigned char *addr = NULL; + int tp_len, size_max; +- unsigned char *addr; + int len_sum = 0; + int status = TP_STATUS_AVAILABLE; + int hlen, tlen; +@@ -2511,10 +2511,13 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg) + sll_addr))) + goto out; + proto = saddr->sll_protocol; +- addr = saddr->sll_halen ? 
saddr->sll_addr : NULL; + dev = dev_get_by_index(sock_net(&po->sk), saddr->sll_ifindex); +- if (addr && dev && saddr->sll_halen < dev->addr_len) +- goto out_put; ++ if (po->sk.sk_socket->type == SOCK_DGRAM) { ++ if (dev && msg->msg_namelen < dev->addr_len + ++ offsetof(struct sockaddr_ll, sll_addr)) ++ goto out_put; ++ addr = saddr->sll_addr; ++ } + } + + err = -ENXIO; +@@ -2652,7 +2655,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) + struct sk_buff *skb; + struct net_device *dev; + __be16 proto; +- unsigned char *addr; ++ unsigned char *addr = NULL; + int err, reserve = 0; + struct sockcm_cookie sockc; + struct virtio_net_hdr vnet_hdr = { 0 }; +@@ -2672,7 +2675,6 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) + if (likely(saddr == NULL)) { + dev = packet_cached_dev_get(po); + proto = po->num; +- addr = NULL; + } else { + err = -EINVAL; + if (msg->msg_namelen < sizeof(struct sockaddr_ll)) +@@ -2680,10 +2682,13 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) + if (msg->msg_namelen < (saddr->sll_halen + offsetof(struct sockaddr_ll, sll_addr))) + goto out; + proto = saddr->sll_protocol; +- addr = saddr->sll_halen ? saddr->sll_addr : NULL; + dev = dev_get_by_index(sock_net(sk), saddr->sll_ifindex); +- if (addr && dev && saddr->sll_halen < dev->addr_len) +- goto out_unlock; ++ if (sock->type == SOCK_DGRAM) { ++ if (dev && msg->msg_namelen < dev->addr_len + ++ offsetof(struct sockaddr_ll, sll_addr)) ++ goto out_unlock; ++ addr = saddr->sll_addr; ++ } + } + + err = -ENXIO; +@@ -4518,14 +4523,29 @@ static void __exit packet_exit(void) + + static int __init packet_init(void) + { +- int rc = proto_register(&packet_proto, 0); ++ int rc; + +- if (rc != 0) ++ rc = proto_register(&packet_proto, 0); ++ if (rc) + goto out; ++ rc = sock_register(&packet_family_ops); ++ if (rc) ++ goto out_proto; ++ rc = register_pernet_subsys(&packet_net_ops); ++ if (rc) ++ goto out_sock; ++ rc = register_netdevice_notifier(&packet_netdev_notifier); ++ if (rc) ++ goto out_pernet; + +- sock_register(&packet_family_ops); +- register_pernet_subsys(&packet_net_ops); +- register_netdevice_notifier(&packet_netdev_notifier); ++ return 0; ++ ++out_pernet: ++ unregister_pernet_subsys(&packet_net_ops); ++out_sock: ++ sock_unregister(PF_PACKET); ++out_proto: ++ proto_unregister(&packet_proto); + out: + return rc; + } +diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c +index af17b00145e1..a8ab98b53a3a 100644 +--- a/net/sunrpc/cache.c ++++ b/net/sunrpc/cache.c +@@ -54,6 +54,7 @@ static void cache_init(struct cache_head *h, struct cache_detail *detail) + h->last_refresh = now; + } + ++static inline int cache_is_valid(struct cache_head *h); + static void cache_fresh_locked(struct cache_head *head, time_t expiry, + struct cache_detail *detail); + static void cache_fresh_unlocked(struct cache_head *head, +@@ -100,6 +101,8 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail, + if (cache_is_expired(detail, tmp)) { + hlist_del_init(&tmp->cache_list); + detail->entries --; ++ if (cache_is_valid(tmp) == -EAGAIN) ++ set_bit(CACHE_NEGATIVE, &tmp->flags); + cache_fresh_locked(tmp, 0, detail); + freeme = tmp; + break; +diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c +index e9653c42cdd1..8400211537a2 100644 +--- a/net/tipc/netlink_compat.c ++++ b/net/tipc/netlink_compat.c +@@ -262,8 +262,14 @@ static int tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd, + if (msg->rep_type) + tipc_tlv_init(msg->rep, 
msg->rep_type); + +- if (cmd->header) +- (*cmd->header)(msg); ++ if (cmd->header) { ++ err = (*cmd->header)(msg); ++ if (err) { ++ kfree_skb(msg->rep); ++ msg->rep = NULL; ++ return err; ++ } ++ } + + arg = nlmsg_new(0, GFP_KERNEL); + if (!arg) { +@@ -382,7 +388,12 @@ static int tipc_nl_compat_bearer_enable(struct tipc_nl_compat_cmd_doit *cmd, + if (!bearer) + return -EMSGSIZE; + +- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME); ++ len = TLV_GET_DATA_LEN(msg->req); ++ len -= offsetof(struct tipc_bearer_config, name); ++ if (len <= 0) ++ return -EINVAL; ++ ++ len = min_t(int, len, TIPC_MAX_BEARER_NAME); + if (!string_is_valid(b->name, len)) + return -EINVAL; + +@@ -727,7 +738,12 @@ static int tipc_nl_compat_link_set(struct tipc_nl_compat_cmd_doit *cmd, + + lc = (struct tipc_link_config *)TLV_DATA(msg->req); + +- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME); ++ len = TLV_GET_DATA_LEN(msg->req); ++ len -= offsetof(struct tipc_link_config, name); ++ if (len <= 0) ++ return -EINVAL; ++ ++ len = min_t(int, len, TIPC_MAX_LINK_NAME); + if (!string_is_valid(lc->name, len)) + return -EINVAL; + +diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include +index 5e9cf7d146f0..e61a5c29b08c 100644 +--- a/scripts/Kbuild.include ++++ b/scripts/Kbuild.include +@@ -156,9 +156,7 @@ cc-ldoption = $(call try-run,\ + + # ld-option + # Usage: LDFLAGS += $(call ld-option, -X) +-ld-option = $(call try-run,\ +- $(CC) $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) -x c /dev/null -c -o "$$TMPO"; \ +- $(LD) $(LDFLAGS) $(1) "$$TMPO" -o "$$TMP",$(1),$(2)) ++ld-option = $(call try-run, $(LD) $(LDFLAGS) $(1) -v,$(1),$(2)) + + # ar-option + # Usage: KBUILD_ARFLAGS := $(call ar-option,D) +diff --git a/scripts/kconfig/lxdialog/inputbox.c b/scripts/kconfig/lxdialog/inputbox.c +index d58de1dc5360..510049a7bd1d 100644 +--- a/scripts/kconfig/lxdialog/inputbox.c ++++ b/scripts/kconfig/lxdialog/inputbox.c +@@ -126,7 +126,8 @@ do_resize: + case KEY_DOWN: + break; + case KEY_BACKSPACE: +- case 127: ++ case 8: /* ^H */ ++ case 127: /* ^? */ + if (pos) { + wattrset(dialog, dlg.inputbox.atr); + if (input_x == 0) { +diff --git a/scripts/kconfig/nconf.c b/scripts/kconfig/nconf.c +index d42d534a66cd..f7049e288e93 100644 +--- a/scripts/kconfig/nconf.c ++++ b/scripts/kconfig/nconf.c +@@ -1046,7 +1046,7 @@ static int do_match(int key, struct match_state *state, int *ans) + state->match_direction = FIND_NEXT_MATCH_UP; + *ans = get_mext_match(state->pattern, + state->match_direction); +- } else if (key == KEY_BACKSPACE || key == 127) { ++ } else if (key == KEY_BACKSPACE || key == 8 || key == 127) { + state->pattern[strlen(state->pattern)-1] = '\0'; + adj_match_dir(&state->match_direction); + } else +diff --git a/scripts/kconfig/nconf.gui.c b/scripts/kconfig/nconf.gui.c +index 4b2f44c20caf..9a65035cf787 100644 +--- a/scripts/kconfig/nconf.gui.c ++++ b/scripts/kconfig/nconf.gui.c +@@ -439,7 +439,8 @@ int dialog_inputbox(WINDOW *main_window, + case KEY_F(F_EXIT): + case KEY_F(F_BACK): + break; +- case 127: ++ case 8: /* ^H */ ++ case 127: /* ^? 
*/ + case KEY_BACKSPACE: + if (cursor_position > 0) { + memmove(&result[cursor_position-1], +diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c +index 99212ff6a568..ab2759d88bc6 100644 +--- a/security/selinux/hooks.c ++++ b/security/selinux/hooks.c +@@ -396,21 +396,43 @@ static int may_context_mount_inode_relabel(u32 sid, + return rc; + } + +-static int selinux_is_sblabel_mnt(struct super_block *sb) ++static int selinux_is_genfs_special_handling(struct super_block *sb) + { +- struct superblock_security_struct *sbsec = sb->s_security; +- +- return sbsec->behavior == SECURITY_FS_USE_XATTR || +- sbsec->behavior == SECURITY_FS_USE_TRANS || +- sbsec->behavior == SECURITY_FS_USE_TASK || +- sbsec->behavior == SECURITY_FS_USE_NATIVE || +- /* Special handling. Genfs but also in-core setxattr handler */ +- !strcmp(sb->s_type->name, "sysfs") || ++ /* Special handling. Genfs but also in-core setxattr handler */ ++ return !strcmp(sb->s_type->name, "sysfs") || + !strcmp(sb->s_type->name, "pstore") || + !strcmp(sb->s_type->name, "debugfs") || + !strcmp(sb->s_type->name, "rootfs"); + } + ++static int selinux_is_sblabel_mnt(struct super_block *sb) ++{ ++ struct superblock_security_struct *sbsec = sb->s_security; ++ ++ /* ++ * IMPORTANT: Double-check logic in this function when adding a new ++ * SECURITY_FS_USE_* definition! ++ */ ++ BUILD_BUG_ON(SECURITY_FS_USE_MAX != 7); ++ ++ switch (sbsec->behavior) { ++ case SECURITY_FS_USE_XATTR: ++ case SECURITY_FS_USE_TRANS: ++ case SECURITY_FS_USE_TASK: ++ case SECURITY_FS_USE_NATIVE: ++ return 1; ++ ++ case SECURITY_FS_USE_GENFS: ++ return selinux_is_genfs_special_handling(sb); ++ ++ /* Never allow relabeling on context mounts */ ++ case SECURITY_FS_USE_MNTPOINT: ++ case SECURITY_FS_USE_NONE: ++ default: ++ return 0; ++ } ++} ++ + static int sb_finish_set_opts(struct super_block *sb) + { + struct superblock_security_struct *sbsec = sb->s_security; +diff --git a/sound/soc/codecs/cs4270.c b/sound/soc/codecs/cs4270.c +index 3670086b9227..f273533c6653 100644 +--- a/sound/soc/codecs/cs4270.c ++++ b/sound/soc/codecs/cs4270.c +@@ -641,6 +641,7 @@ static const struct regmap_config cs4270_regmap = { + .reg_defaults = cs4270_reg_defaults, + .num_reg_defaults = ARRAY_SIZE(cs4270_reg_defaults), + .cache_type = REGCACHE_RBTREE, ++ .write_flag_mask = CS4270_I2C_INCR, + + .readable_reg = cs4270_reg_is_readable, + .volatile_reg = cs4270_reg_is_volatile, +diff --git a/sound/soc/codecs/tlv320aic32x4.c b/sound/soc/codecs/tlv320aic32x4.c +index f2d3191961e1..714bd0e3fc71 100644 +--- a/sound/soc/codecs/tlv320aic32x4.c ++++ b/sound/soc/codecs/tlv320aic32x4.c +@@ -234,6 +234,8 @@ static const struct snd_soc_dapm_widget aic32x4_dapm_widgets[] = { + SND_SOC_DAPM_INPUT("IN2_R"), + SND_SOC_DAPM_INPUT("IN3_L"), + SND_SOC_DAPM_INPUT("IN3_R"), ++ SND_SOC_DAPM_INPUT("CM_L"), ++ SND_SOC_DAPM_INPUT("CM_R"), + }; + + static const struct snd_soc_dapm_route aic32x4_dapm_routes[] = { +diff --git a/sound/soc/intel/common/sst-dsp.c b/sound/soc/intel/common/sst-dsp.c +index c9452e02e0dd..c0a50ecb6dbd 100644 +--- a/sound/soc/intel/common/sst-dsp.c ++++ b/sound/soc/intel/common/sst-dsp.c +@@ -463,11 +463,15 @@ struct sst_dsp *sst_dsp_new(struct device *dev, + goto irq_err; + + err = sst_dma_new(sst); +- if (err) +- dev_warn(dev, "sst_dma_new failed %d\n", err); ++ if (err) { ++ dev_err(dev, "sst_dma_new failed %d\n", err); ++ goto dma_err; ++ } + + return sst; + ++dma_err: ++ free_irq(sst->irq, sst); + irq_err: + if (sst->ops->free) + sst->ops->free(sst); +diff --git a/sound/soc/soc-pcm.c 
b/sound/soc/soc-pcm.c +index f99eb8f44282..1c0d44c86c01 100644 +--- a/sound/soc/soc-pcm.c ++++ b/sound/soc/soc-pcm.c +@@ -882,10 +882,13 @@ static int soc_pcm_hw_params(struct snd_pcm_substream *substream, + codec_params = *params; + + /* fixup params based on TDM slot masks */ +- if (codec_dai->tx_mask) ++ if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK && ++ codec_dai->tx_mask) + soc_pcm_codec_params_fixup(&codec_params, + codec_dai->tx_mask); +- if (codec_dai->rx_mask) ++ ++ if (substream->stream == SNDRV_PCM_STREAM_CAPTURE && ++ codec_dai->rx_mask) + soc_pcm_codec_params_fixup(&codec_params, + codec_dai->rx_mask); + +diff --git a/sound/usb/line6/driver.c b/sound/usb/line6/driver.c +index be78078a10ba..954dc4423cb0 100644 +--- a/sound/usb/line6/driver.c ++++ b/sound/usb/line6/driver.c +@@ -307,12 +307,16 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data, + { + struct usb_device *usbdev = line6->usbdev; + int ret; +- unsigned char len; ++ unsigned char *len; + unsigned count; + + if (address > 0xffff || datalen > 0xff) + return -EINVAL; + ++ len = kmalloc(sizeof(*len), GFP_KERNEL); ++ if (!len) ++ return -ENOMEM; ++ + /* query the serial number: */ + ret = usb_control_msg(usbdev, usb_sndctrlpipe(usbdev, 0), 0x67, + USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT, +@@ -321,7 +325,7 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data, + + if (ret < 0) { + dev_err(line6->ifcdev, "read request failed (error %d)\n", ret); +- return ret; ++ goto exit; + } + + /* Wait for data length. We'll get 0xff until length arrives. */ +@@ -331,28 +335,29 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data, + ret = usb_control_msg(usbdev, usb_rcvctrlpipe(usbdev, 0), 0x67, + USB_TYPE_VENDOR | USB_RECIP_DEVICE | + USB_DIR_IN, +- 0x0012, 0x0000, &len, 1, ++ 0x0012, 0x0000, len, 1, + LINE6_TIMEOUT * HZ); + if (ret < 0) { + dev_err(line6->ifcdev, + "receive length failed (error %d)\n", ret); +- return ret; ++ goto exit; + } + +- if (len != 0xff) ++ if (*len != 0xff) + break; + } + +- if (len == 0xff) { ++ ret = -EIO; ++ if (*len == 0xff) { + dev_err(line6->ifcdev, "read failed after %d retries\n", + count); +- return -EIO; +- } else if (len != datalen) { ++ goto exit; ++ } else if (*len != datalen) { + /* should be equal or something went wrong */ + dev_err(line6->ifcdev, + "length mismatch (expected %d, got %d)\n", +- (int)datalen, (int)len); +- return -EIO; ++ (int)datalen, (int)*len); ++ goto exit; + } + + /* receive the result: */ +@@ -361,12 +366,12 @@ int line6_read_data(struct usb_line6 *line6, unsigned address, void *data, + 0x0013, 0x0000, data, datalen, + LINE6_TIMEOUT * HZ); + +- if (ret < 0) { ++ if (ret < 0) + dev_err(line6->ifcdev, "read failed (error %d)\n", ret); +- return ret; +- } + +- return 0; ++exit: ++ kfree(len); ++ return ret; + } + EXPORT_SYMBOL_GPL(line6_read_data); + +@@ -378,12 +383,16 @@ int line6_write_data(struct usb_line6 *line6, unsigned address, void *data, + { + struct usb_device *usbdev = line6->usbdev; + int ret; +- unsigned char status; ++ unsigned char *status; + int count; + + if (address > 0xffff || datalen > 0xffff) + return -EINVAL; + ++ status = kmalloc(sizeof(*status), GFP_KERNEL); ++ if (!status) ++ return -ENOMEM; ++ + ret = usb_control_msg(usbdev, usb_sndctrlpipe(usbdev, 0), 0x67, + USB_TYPE_VENDOR | USB_RECIP_DEVICE | USB_DIR_OUT, + 0x0022, address, data, datalen, +@@ -392,7 +401,7 @@ int line6_write_data(struct usb_line6 *line6, unsigned address, void *data, + if (ret < 0) { + 
dev_err(line6->ifcdev, + "write request failed (error %d)\n", ret); +- return ret; ++ goto exit; + } + + for (count = 0; count < LINE6_READ_WRITE_MAX_RETRIES; count++) { +@@ -403,28 +412,29 @@ int line6_write_data(struct usb_line6 *line6, unsigned address, void *data, + USB_TYPE_VENDOR | USB_RECIP_DEVICE | + USB_DIR_IN, + 0x0012, 0x0000, +- &status, 1, LINE6_TIMEOUT * HZ); ++ status, 1, LINE6_TIMEOUT * HZ); + + if (ret < 0) { + dev_err(line6->ifcdev, + "receiving status failed (error %d)\n", ret); +- return ret; ++ goto exit; + } + +- if (status != 0xff) ++ if (*status != 0xff) + break; + } + +- if (status == 0xff) { ++ if (*status == 0xff) { + dev_err(line6->ifcdev, "write failed after %d retries\n", + count); +- return -EIO; +- } else if (status != 0) { ++ ret = -EIO; ++ } else if (*status != 0) { + dev_err(line6->ifcdev, "write failed (error %d)\n", ret); +- return -EIO; ++ ret = -EIO; + } +- +- return 0; ++exit: ++ kfree(status); ++ return ret; + } + EXPORT_SYMBOL_GPL(line6_write_data); + +diff --git a/sound/usb/line6/toneport.c b/sound/usb/line6/toneport.c +index 6d4c50c9b17d..5512b3d532e7 100644 +--- a/sound/usb/line6/toneport.c ++++ b/sound/usb/line6/toneport.c +@@ -365,15 +365,20 @@ static bool toneport_has_source_select(struct usb_line6_toneport *toneport) + /* + Setup Toneport device. + */ +-static void toneport_setup(struct usb_line6_toneport *toneport) ++static int toneport_setup(struct usb_line6_toneport *toneport) + { +- int ticks; ++ int *ticks; + struct usb_line6 *line6 = &toneport->line6; + struct usb_device *usbdev = line6->usbdev; + ++ ticks = kmalloc(sizeof(*ticks), GFP_KERNEL); ++ if (!ticks) ++ return -ENOMEM; ++ + /* sync time on device with host: */ +- ticks = (int)get_seconds(); +- line6_write_data(line6, 0x80c6, &ticks, 4); ++ *ticks = (int)get_seconds(); ++ line6_write_data(line6, 0x80c6, ticks, 4); ++ kfree(ticks); + + /* enable device: */ + toneport_send_cmd(usbdev, 0x0301, 0x0000); +@@ -388,6 +393,7 @@ static void toneport_setup(struct usb_line6_toneport *toneport) + toneport_update_led(toneport); + + mod_timer(&toneport->timer, jiffies + TONEPORT_PCM_DELAY * HZ); ++ return 0; + } + + /* +@@ -451,7 +457,9 @@ static int toneport_init(struct usb_line6 *line6, + return err; + } + +- toneport_setup(toneport); ++ err = toneport_setup(toneport); ++ if (err) ++ return err; + + /* register audio system: */ + return snd_card_register(line6->card); +@@ -463,7 +471,11 @@ static int toneport_init(struct usb_line6 *line6, + */ + static int toneport_reset_resume(struct usb_interface *interface) + { +- toneport_setup(usb_get_intfdata(interface)); ++ int err; ++ ++ err = toneport_setup(usb_get_intfdata(interface)); ++ if (err) ++ return err; + return line6_resume(interface); + } + #endif +diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c +index 743746a3c50d..df3c73e9dea4 100644 +--- a/tools/lib/traceevent/event-parse.c ++++ b/tools/lib/traceevent/event-parse.c +@@ -2201,7 +2201,7 @@ eval_type_str(unsigned long long val, const char *type, int pointer) + return val & 0xffffffff; + + if (strcmp(type, "u64") == 0 || +- strcmp(type, "s64")) ++ strcmp(type, "s64") == 0) + return val; + + if (strcmp(type, "s8") == 0) +diff --git a/tools/power/x86/turbostat/Makefile b/tools/power/x86/turbostat/Makefile +index e367b1a85d70..3c04e2a85599 100644 +--- a/tools/power/x86/turbostat/Makefile ++++ b/tools/power/x86/turbostat/Makefile +@@ -8,7 +8,7 @@ ifeq ("$(origin O)", "command line") + endif + + turbostat : turbostat.c +-CFLAGS += -Wall ++CFLAGS += -Wall 
-I../../../include + CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"' + + %: %.c +diff --git a/tools/testing/selftests/net/run_netsocktests b/tools/testing/selftests/net/run_netsocktests +index 16058bbea7a8..c195b4478662 100755 +--- a/tools/testing/selftests/net/run_netsocktests ++++ b/tools/testing/selftests/net/run_netsocktests +@@ -6,7 +6,7 @@ echo "--------------------" + ./socket + if [ $? -ne 0 ]; then + echo "[FAIL]" ++ exit 1 + else + echo "[PASS]" + fi +-