From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp.gentoo.org (woodpecker.gentoo.org [140.211.166.183]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by finch.gentoo.org (Postfix) with ESMTPS id D16C41581EE for ; Tue, 25 Mar 2025 10:47:45 +0000 (UTC) Received: from lists.gentoo.org (bobolink.gentoo.org [140.211.166.189]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) (Authenticated sender: relay-lists.gentoo.org@gentoo.org) by smtp.gentoo.org (Postfix) with ESMTPSA id AA0A33432C7 for ; Tue, 25 Mar 2025 10:47:45 +0000 (UTC) Received: from bobolink.gentoo.org (localhost [127.0.0.1]) by bobolink.gentoo.org (Postfix) with ESMTP id 6ECA7110296; Tue, 25 Mar 2025 10:47:43 +0000 (UTC) Received: from smtp.gentoo.org (woodpecker.gentoo.org [140.211.166.183]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by bobolink.gentoo.org (Postfix) with ESMTPS id 45F13110296 for ; Tue, 25 Mar 2025 10:47:43 +0000 (UTC) Received: from oystercatcher.gentoo.org (oystercatcher.gentoo.org [148.251.78.52]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by smtp.gentoo.org (Postfix) with ESMTPS id 0FD0F3432B0 for ; Tue, 25 Mar 2025 10:47:42 +0000 (UTC) Received: from localhost.localdomain (localhost [IPv6:::1]) by oystercatcher.gentoo.org (Postfix) with ESMTP id 8D00C157C for ; Tue, 25 Mar 2025 10:47:40 +0000 (UTC) From: "Tomáš Mózes" To: gentoo-commits@lists.gentoo.org Content-Transfer-Encoding: 8bit Content-type: text/plain; charset=UTF-8 Reply-To: gentoo-dev@lists.gentoo.org, "Tomáš Mózes" Message-ID: 
<1742899603.e8a8d2c357c31d34f8d2d9389641f8bc008eb8b0.hydrapolic@gentoo> Subject: [gentoo-commits] proj/xen-upstream-patches:main commit in: / X-VCS-Repository: proj/xen-upstream-patches X-VCS-Files: 0001-update-Xen-version-to-4.19.1-pre.patch 0001-xen-device-tree-Allow-region-overlapping-with-memres.patch 0002-bunzip2-fix-rare-decompression-failure.patch 0002-update-Xen-version-to-4.19.2-pre.patch 0003-XSM-domctl-Fix-permission-checks-on-XEN_DOMCTL_creat.patch 0003-x86emul-MOVBE-requires-a-memory-operand.patch 0004-x86-dom0-fix-restoring-cr3-and-the-mapcache-override.patch 0004-xen-Kconfig-livepatch-build-tools-requires-debug-inf.patch 0005-libs-guest-Fix-migration-compatibility-with-a-securi.patch 0005-x86-altcall-further-refine-clang-workaround.patch 0006-tools-ocaml-Specify-rpath-correctly-for-ocamlmklib.patch 0006-xen-sched-fix-error-handling-in-cpu_schedule_up.patch 0007-x86-io-apic-prevent-early-exit-from-i8259-loop-detec.patch 0007-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch 0008-tools-lsevtchn-Use-errno-macro-to-handle-hypercall-e.patch 0008-xen-update-ECLAIR-service-identifiers-from-MC3R1-to-.patch 0009-9pfsd-fix-release-build-with-old-gcc.patch 0009-MISRA-Unmark-Rules-1.1-and-2.1-as-clean-following-Ec.patch 0010-tools-xg-increase-LZMA_BLOCK_SIZE-for-uncompressing-.patch 0010-x86-emul-Fix-misaligned-IO-breakpoint-behaviour-in-P.patch 0011-x86-IOMMU-move-tracking-in-iommu_identity_mapping.patch 0011-xen-arch-x86-make-objdump-output-user-locale-agnosti.patch 0012-x86-pass-through-documents-as-security-unsupported-w.patch 0012-x86-spec-ctrl-Support-for-SRSO_U-S_NO-and-SRSO_MSR_F.patch 0013-automation-disable-Yocto-jobs.patch 0013-x86-traps-Rework-LER-initialisation-and-support-Zen5.patch 0014-automation-use-expect-to-run-QEMU.patch 0014-x86-amd-Misc-setup-for-Fam1Ah-processors.patch 0015-x86-vLAPIC-prevent-undue-recursion-of-vlapic_error.patch 0015-x86emul-VCVT-U-DQ2PD-ignores-embedded-rounding.patch 0016-Arm-correct-FIXADDR_TOP.patch 
0016-x86emul-correct-put_fpu-s-segment-selector-handling.patch 0017-xen-flask-Wire-up-XEN_DOMCTL_vuart_op.patch 0017-xl-fix-incorrect-output-in-help-command.patch 0018-x86emul-correct-UD-check-for-AVX512-FP16-complex-mul.patch 0018-xen-flask-Wire-up-XEN_DOMCTL_dt_overlay.patch 0019-x86-pv-Introduce-x86_merge_dr6-and-fix-do_debug.patch 0019-xen-events-fix-race-with-set_global_virq_handler.patch 0020-x86-HVM-reduce-recursion-in-linear_-read-write.patch 0020-x86-pv-Fix-merging-of-new-status-bits-into-dr6.patch 0021-x86-HVM-correct-MMIO-emulation-cache-bounds-check.patch 0021-x86-pv-Address-Coverity-complaint-in-check_guest_io_.patch 0022-x86-HVM-allocate-emulation-cache-entries-dynamically.patch 0022-x86emul-always-set-operand-size-for-AVX-VNNI-INT8-in.patch 0023-x86-HVM-correct-read-write-split-at-page-boundaries.patch 0023-x86emul-set-fake-operand-size-for-AVX512CD-broadcast.patch 0024-x86-iommu-check-for-CMPXCHG16B-when-enabling-IOMMU.patch 0024-x86-x2APIC-correct-cluster-tracking-upon-CPUs-going-.patch 0025-iommu-amd-atomically-update-IRTE.patch 0025-x86-dom0-disable-SMAP-for-PV-domain-building-only.patch 0026-x86-HVM-correct-partial-HPET_STATUS-write-emulation.patch 0026-x86emul-further-correct-64-bit-mode-zero-count-repea.patch 0027-Arm64-adjust-__irq_to_desc-to-fix-build-with-gcc14.patch 0027-x86-PV-further-harden-guest-memory-accesses-against-.patch 0028-libxl-Fix-nul-termination-of-the-return-value-of-lib.patch 0028-x86-intel-Fix-PERF_GLOBAL-fixup-when-virtualised.patch 0029-SUPPORT.md-split-XSM-from-Flask.patch 0029-radix-tree-purge-node-allocation-override-hooks.patch 0030-radix-tree-introduce-RADIX_TREE-_INIT.patch 0030-x86-fix-UP-build-with-gcc14.patch 0031-x86-shutdown-offline-APs-with-interrupts-disabled-on.patch 0031-x86emul-test-fix-build-with-gas-2.43.patch 0032-x86-HVM-properly-reject-indirect-VRAM-writes.patch 0032-x86-smp-perform-disabling-on-interrupts-ahead-of-AP-.patch 0033-x86-pci-disable-MSI-X-on-all-devices-at-shutdown.patch 
0033-xen-x86-pvh-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch 0034-blkif-reconcile-protocol-specification-with-in-use-i.patch 0034-x86-iommu-disable-interrupts-at-shutdown.patch 0035-IOMMU-x86-the-bus-to-bridge-lock-needs-to-be-acquire.patch 0035-xen-ucode-Fix-buffer-under-run-when-parsing-AMD-cont.patch 0036-xen-console-Fix-truncation-of-panic-messages.patch 0036-xen-ucode-Make-Intel-s-microcode_sanity_check-strict.patch 0037-x86-PV-simplify-and-thus-correct-guest-accessor-func.patch 0037-xen-memory-Make-resource_max_frames-to-return-0-on-u.patch 0038-x86-svm-Separate-STI-and-VMRUN-instructions-in-svm_a.patch 0038-x86-traps-Re-enable-interrupts-after-reading-cr2-in-.patch 0039-x86-emul-dump-unhandled-memory-accesses-for-PVH-dom0.patch 0039-x86-pv-Rework-guest_io_okay-to-return-X86EMUL_.patch 0040-x86-dom0-attempt-to-fixup-p2m-page-faults-for-PVH-do.patch 0040-x86-pv-Handle-PF-correctly-when-reading-the-IO-permi.patch 0041-x86-dom0-correctly-set-the-maximum-iomem_caps-bound-.patch 0041-x86-pv-Rename-pv.iobmp_limit-to-iobmp_nr-and-clarify.patch 0042-stubdom-Fix-newlib-build-with-GCC-14.patch 0042-x86-iommu-account-for-IOMEM-caps-when-populating-dom.patch 0043-x86-dom0-be-less-restrictive-with-the-Interrupt-Addr.patch 0043-x86-dpci-do-not-leak-pending-interrupts-on-CPU-offli.patch 0044-ioreq-don-t-wrongly-claim-success-in-ioreq_send_buff.patch 0044-tools-xl-fix-channel-configuration-setting.patch 0045-x86-domctl-fix-maximum-number-of-MSRs-in-XEN_DOMCTL_.patch 0045-x86-vlapic-Fix-handling-of-writes-to-APIC_ESR.patch 0046-x86-msr-expose-MSR_FAM10H_MMIO_CONF_BASE-on-AMD.patch 0046-xen-spinlock-Fix-UBSAN-load-of-address-with-insuffic.patch 0047-iommu-amd-vi-do-not-error-if-device-referenced-in-IV.patch 0047-x86-vmx-fix-posted-interrupts-usage-of-msi_desc-msg-.patch 0048-x86-boot-Fix-microcode-module-handling-during-PVH-bo.patch 0048-x86-hvm-check-return-code-of-hvm_pi_update_irte-when.patch 0049-libxl-avoid-infinite-loop-in-libxl__remove_directory.patch 
0049-x86-boot-Fix-XSM-module-handling-during-PVH-boot.patch 0050-Config-Update-MiniOS-revision.patch 0050-xen-sched-fix-arinc653-to-not-use-variables-across-c.patch 0051-CI-Resync-.cirrus.yml-for-FreeBSD-testing.patch 0051-x86-ioremap-prevent-additions-against-the-NULL-point.patch 0052-automation-add-x86_64-xilinx-smoke-test.patch 0052-x86-mm-Fix-IS_ALIGNED-check-in-IS_LnE_ALIGNED.patch 0053-automation-add-default-QEMU_TIMEOUT-value-if-not-alr.patch 0053-xen-arinc653-call-xfree-with-local-IRQ-enabled.patch 0054-automation-restore-CR-filtering.patch 0055-automation-update-xilinx-test-scripts-tty.patch 0056-automation-fix-false-success-in-qemu-tests.patch 0057-automation-use-expect-utility-in-xilinx-tests.patch 0058-automation-fix-xilinx-test-console-settings.patch 0059-automation-introduce-TEST_TIMEOUT_OVERRIDE.patch 0060-automation-preserve-built-xen.efi.patch 0061-automation-add-a-smoke-test-for-xen.efi-on-X86.patch 0062-automation-shorten-the-timeout-for-smoke-tests.patch 0063-CI-Stop-building-QEMU-in-general.patch 0064-CI-Minor-cleanup-to-qubes-x86-64.sh.patch 0065-CI-Rework-domU_config-generation-in-qubes-x86-64.sh.patch 0066-CI-Add-adl-zen3p-pvshim-tests.patch 0067-CI-Drop-alpine-3.18-rootfs-export-and-use-test-artef.patch 0068-CI-Refresh-the-Debian-12-x86_64-container.patch 0069-CI-Refresh-the-Debian-12-x86_32-container.patch 0070-x86-HVM-drop-stdvga-s-cache-struct-member.patch 0071-x86-HVM-drop-stdvga-s-stdvga-struct-member.patch 0072-x86-HVM-remove-unused-MMIO-handling-code.patch 0073-x86-HVM-drop-stdvga-s-gr-struct-member.patch 0074-x86-HVM-drop-stdvga-s-sr-struct-member.patch 0075-x86-HVM-drop-stdvga-s-g-s-r_index-struct-members.patch 0076-x86-HVM-drop-stdvga-s-vram_page-struct-member.patch 0077-x86-HVM-drop-stdvga-s-lock-struct-member.patch 0078-x86-hvm-Simplify-stdvga_mem_accept-further.patch 0079-libxl-Use-zero-ed-memory-for-PVH-acpi-tables.patch 0080-x86-io-apic-fix-directed-EOI-when-using-AMD-Vi-inter.patch 
0081-tools-libxl-remove-usage-of-VLA-arrays.patch 0082-xen-x86-prevent-addition-of-.note.gnu.property-if-li.patch 0083-xen-arm64-entry-Actually-skip-do_trap_-when-an-SErro.patch info.txt X-VCS-Directories: / X-VCS-Committer: hydrapolic X-VCS-Committer-Name: Tomáš Mózes X-VCS-Revision: e8a8d2c357c31d34f8d2d9389641f8bc008eb8b0 X-VCS-Branch: main Date: Tue, 25 Mar 2025 10:47:40 +0000 (UTC) Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-Id: Gentoo Linux mail X-BeenThere: gentoo-commits@lists.gentoo.org X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply X-Archives-Salt: 35f780ef-85f2-46ad-a0e1-a429c4edfce9 X-Archives-Hash: d19c7c061ad8d1256254535eb1e5fe8b commit: e8a8d2c357c31d34f8d2d9389641f8bc008eb8b0 Author: Tomáš Mózes gmail com> AuthorDate: Tue Mar 25 10:46:43 2025 +0000 Commit: Tomáš Mózes gmail com> CommitDate: Tue Mar 25 10:46:43 2025 +0000 URL: https://gitweb.gentoo.org/proj/xen-upstream-patches.git/commit/?id=e8a8d2c3 Xen 4.19.2-pre-patchset-0 Signed-off-by: Tomáš Mózes gmail.com> 0001-update-Xen-version-to-4.19.1-pre.patch | 164 --- ...tree-Allow-region-overlapping-with-memres.patch | 272 +++++ 0002-bunzip2-fix-rare-decompression-failure.patch | 39 - 0002-update-Xen-version-to-4.19.2-pre.patch | 25 + ...Fix-permission-checks-on-XEN_DOMCTL_creat.patch | 150 --- 0003-x86emul-MOVBE-requires-a-memory-operand.patch | 32 + ...x-restoring-cr3-and-the-mapcache-override.patch | 38 - ...-livepatch-build-tools-requires-debug-inf.patch | 45 + ...Fix-migration-compatibility-with-a-securi.patch | 65 + ...6-altcall-further-refine-clang-workaround.patch | 73 -- ...ml-Specify-rpath-correctly-for-ocamlmklib.patch | 50 + ...hed-fix-error-handling-in-cpu_schedule_up.patch | 113 -- ...-prevent-early-exit-from-i8259-loop-detec.patch | 84 ++ ...-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch | 40 - ...chn-Use-errno-macro-to-handle-hypercall-e.patch | 75 -- ...ECLAIR-service-identifiers-from-MC3R1-to-.patch | 1280 ++++++++++++++++++++ 
0009-9pfsd-fix-release-build-with-old-gcc.patch | 33 - ...k-Rules-1.1-and-2.1-as-clean-following-Ec.patch | 47 + ...crease-LZMA_BLOCK_SIZE-for-uncompressing-.patch | 67 + ...x-misaligned-IO-breakpoint-behaviour-in-P.patch | 41 - ...U-move-tracking-in-iommu_identity_mapping.patch | 111 -- ...6-make-objdump-output-user-locale-agnosti.patch | 34 + ...rough-documents-as-security-unsupported-w.patch | 42 - ...rl-Support-for-SRSO_U-S_NO-and-SRSO_MSR_F.patch | 350 ++++++ 0013-automation-disable-Yocto-jobs.patch | 48 - ...ework-LER-initialisation-and-support-Zen5.patch | 156 +++ 0014-automation-use-expect-to-run-QEMU.patch | 362 ------ ...-x86-amd-Misc-setup-for-Fam1Ah-processors.patch | 66 + ...C-prevent-undue-recursion-of-vlapic_error.patch | 57 - ...ul-VCVT-U-DQ2PD-ignores-embedded-rounding.patch | 67 + 0016-Arm-correct-FIXADDR_TOP.patch | 58 - ...rrect-put_fpu-s-segment-selector-handling.patch | 116 ++ 0017-xen-flask-Wire-up-XEN_DOMCTL_vuart_op.patch | 63 + 0017-xl-fix-incorrect-output-in-help-command.patch | 36 - ...rect-UD-check-for-AVX512-FP16-complex-mul.patch | 37 - 0018-xen-flask-Wire-up-XEN_DOMCTL_dt_overlay.patch | 78 ++ ...-Introduce-x86_merge_dr6-and-fix-do_debug.patch | 140 --- ...nts-fix-race-with-set_global_virq_handler.patch | 81 ++ ...VM-reduce-recursion-in-linear_-read-write.patch | 85 ++ ...v-Fix-merging-of-new-status-bits-into-dr6.patch | 222 ---- ...correct-MMIO-emulation-cache-bounds-check.patch | 36 + ...ess-Coverity-complaint-in-check_guest_io_.patch | 112 -- ...ocate-emulation-cache-entries-dynamically.patch | 172 +++ ...ays-set-operand-size-for-AVX-VNNI-INT8-in.patch | 36 - ...rrect-read-write-split-at-page-boundaries.patch | 241 ++++ ...-fake-operand-size-for-AVX512CD-broadcast.patch | 35 - ...-check-for-CMPXCHG16B-when-enabling-IOMMU.patch | 126 ++ ...correct-cluster-tracking-upon-CPUs-going-.patch | 52 - 0025-iommu-amd-atomically-update-IRTE.patch | 222 ++++ ...-disable-SMAP-for-PV-domain-building-only.patch | 145 --- 
...rrect-partial-HPET_STATUS-write-emulation.patch | 37 - ...ther-correct-64-bit-mode-zero-count-repea.patch | 108 ++ ...ust-__irq_to_desc-to-fix-build-with-gcc14.patch | 61 - ...her-harden-guest-memory-accesses-against-.patch | 88 ++ ...ul-termination-of-the-return-value-of-lib.patch | 100 -- ...el-Fix-PERF_GLOBAL-fixup-when-virtualised.patch | 117 ++ 0029-SUPPORT.md-split-XSM-from-Flask.patch | 66 - ...tree-purge-node-allocation-override-hooks.patch | 125 ++ 0030-radix-tree-introduce-RADIX_TREE-_INIT.patch | 99 ++ 0030-x86-fix-UP-build-with-gcc14.patch | 63 - ...n-offline-APs-with-interrupts-disabled-on.patch | 142 +++ 0031-x86emul-test-fix-build-with-gas-2.43.patch | 86 -- ...-HVM-properly-reject-indirect-VRAM-writes.patch | 45 - ...form-disabling-on-interrupts-ahead-of-AP-.patch | 46 + ...-disable-MSI-X-on-all-devices-at-shutdown.patch | 201 +++ ...-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch | 63 - ...cile-protocol-specification-with-in-use-i.patch | 183 --- ...-x86-iommu-disable-interrupts-at-shutdown.patch | 193 +++ ...he-bus-to-bridge-lock-needs-to-be-acquire.patch | 113 ++ ...ix-buffer-under-run-when-parsing-AMD-cont.patch | 62 - ...-console-Fix-truncation-of-panic-messages.patch | 103 ++ ...ake-Intel-s-microcode_sanity_check-strict.patch | 43 - ...lify-and-thus-correct-guest-accessor-func.patch | 201 --- ...Make-resource_max_frames-to-return-0-on-u.patch | 76 ++ ...arate-STI-and-VMRUN-instructions-in-svm_a.patch | 55 + ...e-enable-interrupts-after-reading-cr2-in-.patch | 104 -- ...mp-unhandled-memory-accesses-for-PVH-dom0.patch | 44 + ...v-Rework-guest_io_okay-to-return-X86EMUL_.patch | 127 -- ...tempt-to-fixup-p2m-page-faults-for-PVH-do.patch | 242 ++++ ...le-PF-correctly-when-reading-the-IO-permi.patch | 82 -- ...rrectly-set-the-maximum-iomem_caps-bound-.patch | 42 + ...me-pv.iobmp_limit-to-iobmp_nr-and-clarify.patch | 87 -- 0042-stubdom-Fix-newlib-build-with-GCC-14.patch | 58 - ...ccount-for-IOMEM-caps-when-populating-dom.patch | 255 ++++ 
...-less-restrictive-with-the-Interrupt-Addr.patch | 136 +++ ...-not-leak-pending-interrupts-on-CPU-offli.patch | 75 -- ...-wrongly-claim-success-in-ioreq_send_buff.patch | 44 - ...ools-xl-fix-channel-configuration-setting.patch | 41 + ...fix-maximum-number-of-MSRs-in-XEN_DOMCTL_.patch | 51 - ...vlapic-Fix-handling-of-writes-to-APIC_ESR.patch | 90 ++ ...r-expose-MSR_FAM10H_MMIO_CONF_BASE-on-AMD.patch | 72 ++ ...k-Fix-UBSAN-load-of-address-with-insuffic.patch | 67 - ...i-do-not-error-if-device-referenced-in-IV.patch | 52 - ...-posted-interrupts-usage-of-msi_desc-msg-.patch | 75 ++ ...x-microcode-module-handling-during-PVH-bo.patch | 166 --- ...ck-return-code-of-hvm_pi_update_irte-when.patch | 45 + ...-infinite-loop-in-libxl__remove_directory.patch | 37 + ...t-Fix-XSM-module-handling-during-PVH-boot.patch | 120 -- 0050-Config-Update-MiniOS-revision.patch | 28 - ...ix-arinc653-to-not-use-variables-across-c.patch | 109 ++ ...CI-Resync-.cirrus.yml-for-FreeBSD-testing.patch | 29 - ...-prevent-additions-against-the-NULL-point.patch | 89 ++ 0052-automation-add-x86_64-xilinx-smoke-test.patch | 211 ---- ...mm-Fix-IS_ALIGNED-check-in-IS_LnE_ALIGNED.patch | 61 + ...add-default-QEMU_TIMEOUT-value-if-not-alr.patch | 35 - ...rinc653-call-xfree-with-local-IRQ-enabled.patch | 53 + 0054-automation-restore-CR-filtering.patch | 118 -- ...automation-update-xilinx-test-scripts-tty.patch | 115 -- ...utomation-fix-false-success-in-qemu-tests.patch | 226 ---- ...mation-use-expect-utility-in-xilinx-tests.patch | 393 ------ ...tomation-fix-xilinx-test-console-settings.patch | 45 - ...utomation-introduce-TEST_TIMEOUT_OVERRIDE.patch | 65 - 0060-automation-preserve-built-xen.efi.patch | 65 - ...ation-add-a-smoke-test-for-xen.efi-on-X86.patch | 91 -- ...ation-shorten-the-timeout-for-smoke-tests.patch | 79 -- 0063-CI-Stop-building-QEMU-in-general.patch | 67 - 0064-CI-Minor-cleanup-to-qubes-x86-64.sh.patch | 186 --- ...domU_config-generation-in-qubes-x86-64.sh.patch | 117 -- 
0066-CI-Add-adl-zen3p-pvshim-tests.patch | 88 -- ...ine-3.18-rootfs-export-and-use-test-artef.patch | 55 - ...CI-Refresh-the-Debian-12-x86_64-container.patch | 295 ----- ...CI-Refresh-the-Debian-12-x86_32-container.patch | 184 --- ...x86-HVM-drop-stdvga-s-cache-struct-member.patch | 146 --- ...86-HVM-drop-stdvga-s-stdvga-struct-member.patch | 112 -- ...-x86-HVM-remove-unused-MMIO-handling-code.patch | 392 ------ 0073-x86-HVM-drop-stdvga-s-gr-struct-member.patch | 70 -- 0074-x86-HVM-drop-stdvga-s-sr-struct-member.patch | 70 -- ...-drop-stdvga-s-g-s-r_index-struct-members.patch | 114 -- ...HVM-drop-stdvga-s-vram_page-struct-member.patch | 124 -- ...-x86-HVM-drop-stdvga-s-lock-struct-member.patch | 119 -- ...86-hvm-Simplify-stdvga_mem_accept-further.patch | 94 -- ...xl-Use-zero-ed-memory-for-PVH-acpi-tables.patch | 43 - ...-fix-directed-EOI-when-using-AMD-Vi-inter.patch | 160 --- 0081-tools-libxl-remove-usage-of-VLA-arrays.patch | 54 - ...vent-addition-of-.note.gnu.property-if-li.patch | 46 - ...ntry-Actually-skip-do_trap_-when-an-SErro.patch | 43 - info.txt | 6 +- 137 files changed, 6720 insertions(+), 8384 deletions(-) diff --git a/0001-update-Xen-version-to-4.19.1-pre.patch b/0001-update-Xen-version-to-4.19.1-pre.patch deleted file mode 100644 index 86c82ea..0000000 --- a/0001-update-Xen-version-to-4.19.1-pre.patch +++ /dev/null @@ -1,164 +0,0 @@ -From f97db9b3bc3deac4eead160106a3f6de2ccce81d Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Thu, 8 Aug 2024 13:43:19 +0200 -Subject: [PATCH 01/83] update Xen version to 4.19.1-pre - ---- - Config.mk | 2 - - MAINTAINERS | 106 +++++---------------------------------------------- - xen/Makefile | 2 +- - 3 files changed, 10 insertions(+), 100 deletions(-) - -diff --git a/Config.mk b/Config.mk -index ac8fb847ce..03a89624c7 100644 ---- a/Config.mk -+++ b/Config.mk -@@ -234,8 +234,6 @@ ETHERBOOT_NICS ?= rtl8139 8086100e - - QEMU_TRADITIONAL_URL ?= https://xenbits.xen.org/git-http/qemu-xen-traditional.git - 
QEMU_TRADITIONAL_REVISION ?= xen-4.19.0 --# Wed Jul 15 10:01:40 2020 +0100 --# qemu-trad: remove Xen path dependencies - - # Specify which qemu-dm to use. This may be `ioemu' to use the old - # Mercurial in-tree version, or a local directory, or a git URL. -diff --git a/MAINTAINERS b/MAINTAINERS -index 2b0c894527..fe81ed63ad 100644 ---- a/MAINTAINERS -+++ b/MAINTAINERS -@@ -54,6 +54,15 @@ list. Remember to copy the appropriate stable branch maintainer who - will be listed in this section of the MAINTAINERS file in the - appropriate branch. - -+The maintainer for this branch is: -+ -+ Jan Beulich -+ -+Tools backport requests should also be copied to: -+ -+       Anthony Perard  -+ -+ - Unstable Subsystem Maintainers - ============================== - -@@ -104,103 +113,6 @@ Descriptions of section entries: - xen-maintainers- - - -- Check-in policy -- =============== -- --In order for a patch to be checked in, in general, several conditions --must be met: -- --1. In order to get a change to a given file committed, it must have -- the approval of at least one maintainer of that file. -- -- A patch of course needs Acks from the maintainers of each file that -- it changes; so a patch which changes xen/arch/x86/traps.c, -- xen/arch/x86/mm/p2m.c, and xen/arch/x86/mm/shadow/multi.c would -- require an Ack from each of the three sets of maintainers. -- -- See below for rules on nested maintainership. -- --2. Each change must have appropriate approval from someone other than -- the person who wrote it. This can be either: -- -- a. An Acked-by from a maintainer of the code being touched (a -- co-maintainer if available, or a more general level maintainer if -- not available; see the secton on nested maintainership) -- -- b. A Reviewed-by by anyone of suitable stature in the community -- --3. Sufficient time must have been given for anyone to respond. This -- depends in large part upon the urgency and nature of the patch. 
-- For a straightforward uncontroversial patch, a day or two may be -- sufficient; for a controversial patch, a week or two may be better. -- --4. There must be no "open" objections. -- --In a case where one person submits a patch and a maintainer gives an --Ack, the Ack stands in for both the approval requirement (#1) and the --Acked-by-non-submitter requirement (#2). -- --In a case where a maintainer themselves submits a patch, the --Signed-off-by meets the approval requirement (#1); so a Review --from anyone in the community suffices for requirement #2. -- --Before a maintainer checks in their own patch with another community --member's R-b but no co-maintainer Ack, it is especially important to --give their co-maintainer opportunity to give feedback, perhaps --declaring their intention to check it in without their co-maintainers --ack a day before doing so. -- --In the case where two people collaborate on a patch, at least one of --whom is a maintainer -- typically where one maintainer will do an --early version of the patch, and another maintainer will pick it up and --revise it -- there should be two Signed-off-by's and one Acked-by or --Reviewed-by; with the maintainer who did the most recent change --sending the patch, and an Acked-by or Reviewed-by coming from the --maintainer who did not most recently edit the patch. This satisfies --the requirement #2 because a) the Signed-off-by of the sender approves --the final version of the patch; including all parts of the patch that --the sender did not write b) the Reviewed-by approves the final version --of the patch, including all patches that the reviewer did not write. --Thus all code in the patch has been approved by someone who did not --write it. -- --Maintainers may choose to override non-maintainer objections in the --case that consensus can't be reached. -- --As always, no policy can cover all possible situations. 
In --exceptional circumstances, committers may commit a patch in absence of --one or more of the above requirements, if they are reasonably --confident that the other maintainers will approve of their decision in --retrospect. -- -- The meaning of nesting -- ====================== -- --Many maintainership areas are "nested": for example, there are entries --for xen/arch/x86 as well as xen/arch/x86/mm, and even --xen/arch/x86/mm/shadow; and there is a section at the end called "THE --REST" which lists all committers. The meaning of nesting is that: -- --1. Under normal circumstances, the Ack of the most specific maintainer --is both necessary and sufficient to get a change to a given file --committed. So a change to xen/arch/x86/mm/shadow/multi.c requires the --the Ack of the xen/arch/x86/mm/shadow maintainer for that part of the --patch, but would not require the Ack of the xen/arch/x86 maintainer or --the xen/arch/x86/mm maintainer. -- --2. In unusual circumstances, a more general maintainer's Ack can stand --in for or even overrule a specific maintainer's Ack. Unusual --circumstances might include: -- - The patch is fixing a high-priority issue causing immediate pain, -- and the more specific maintainer is not available. -- - The more specific maintainer has not responded either to the -- original patch, nor to "pings", within a reasonable amount of time. -- - The more general maintainer wants to overrule the more specific -- maintainer on some issue. (This should be exceptional.) -- - In the case of a disagreement between maintainers, THE REST can -- settle the matter by majority vote. (This should be very exceptional -- indeed.) -- - - Maintainers List (try to look for most precise areas first) - -diff --git a/xen/Makefile b/xen/Makefile -index 16055101fb..59dac504b3 100644 ---- a/xen/Makefile -+++ b/xen/Makefile -@@ -6,7 +6,7 @@ this-makefile := $(call lastword,$(MAKEFILE_LIST)) - # All other places this is stored (eg. compile.h) should be autogenerated. 
- export XEN_VERSION = 4 - export XEN_SUBVERSION = 19 --export XEN_EXTRAVERSION ?= .0$(XEN_VENDORVERSION) -+export XEN_EXTRAVERSION ?= .1-pre$(XEN_VENDORVERSION) - export XEN_FULLVERSION = $(XEN_VERSION).$(XEN_SUBVERSION)$(XEN_EXTRAVERSION) - -include xen-version - --- -2.47.0 - diff --git a/0001-xen-device-tree-Allow-region-overlapping-with-memres.patch b/0001-xen-device-tree-Allow-region-overlapping-with-memres.patch new file mode 100644 index 0000000..00a7e17 --- /dev/null +++ b/0001-xen-device-tree-Allow-region-overlapping-with-memres.patch @@ -0,0 +1,272 @@ +From 6f7af8383f51f6e4f0588a12ff945d22d7375ae6 Mon Sep 17 00:00:00 2001 +From: Luca Fancellu +Date: Mon, 16 Dec 2024 13:31:23 +0100 +Subject: [PATCH 01/53] xen/device-tree: Allow region overlapping with + /memreserve/ ranges + +There are some cases where the device tree exposes a memory range +in both /memreserve/ and reserved-memory node, in this case the +current code will stop Xen to boot since it will find that the +latter range is clashing with the already recorded /memreserve/ +ranges. + +Furthermore, u-boot lists boot modules ranges, such as ramdisk, +in the /memreserve/ part and even in this case this will prevent +Xen to boot since it will see that the module memory range that +it is going to add in 'add_boot_module' clashes with a /memreserve/ +range. + +When Xen populate the data structure that tracks the memory ranges, +it also adds a memory type described in 'enum membank_type', so +in order to fix this behavior, allow overlapping with the /memreserve/ +ranges in the 'check_reserved_regions_overlap' function when a flag +is set. + +In order to implement this solution, there is a distinction between +the 'struct membanks *' handled by meminfo_overlap_check(...) 
that +needs to be done, because the static shared memory banks doesn't have +a usable bank[].type field and so it can't be accessed, hence now +the 'struct membanks_hdr' have a 'enum region_type type' field in order +to be able to identify static shared memory banks in meminfo_overlap_check(...). + +While there, set a type for the memory recorded using meminfo_add_bank() +from efi-boot.h. + +Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem") +Reported-by: Shawn Anastasio +Reported-by: Grygorii Strashko +Signed-off-by: Luca Fancellu +Tested-by: Grygorii Strashko +Reviewed-by: Julien Grall + +bootfdt: Add missing trailing commas in BOOTINFO_{ACPI,SHMEM}_INIT + +Commit a14593e3995a extended BOOTINFO_{ACPI,SHMEM}_INIT initializers +list with a new 'type' member but forgot to add trailing commas (they +were present before). This results in a build failure when building +with CONFIG_ACPI=y and CONFIG_STATIC_SHM=y: +./include/xen/bootfdt.h:155:5: error: request for member 'shmem' in something not a structure or union + 155 | .shmem.common.max_banks = NR_SHMEM_BANKS, \ + | ^ +./include/xen/bootfdt.h:168:5: note: in expansion of macro 'BOOTINFO_SHMEM_INIT' + 168 | BOOTINFO_SHMEM_INIT \ + | ^~~~~~~~~~~~~~~~~~~ +common/device-tree/bootinfo.c:22:39: note: in expansion of macro 'BOOTINFO_INIT' + 22 | struct bootinfo __initdata bootinfo = BOOTINFO_INIT; + +Fixes: a14593e3995a ("xen/device-tree: Allow region overlapping with /memreserve/ ranges") +Signed-off-by: Michal Orzel +Reviewed-by: Bertrand Marquis +Reviewed-by: Luca Fancellu +master commit: a14593e3995afc74bf4efe91116e34894e0ea49a +master date: 2024-11-28 18:57:21 +0000 +master commit: 5a455a52eae1420619df14c8e55fd17ced70538e +master date: 2024-12-03 12:20:41 +0000 +--- + xen/arch/arm/bootfdt.c | 9 +++++++- + xen/arch/arm/efi/efi-boot.h | 3 ++- + xen/arch/arm/include/asm/setup.h | 20 +++++++++++++--- + xen/arch/arm/setup.c | 39 ++++++++++++++++++++++++-------- + 
xen/arch/arm/static-shmem.c | 2 +- + 5 files changed, 57 insertions(+), 16 deletions(-) + +diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c +index 6e060111d9..1a766a2c4b 100644 +--- a/xen/arch/arm/bootfdt.c ++++ b/xen/arch/arm/bootfdt.c +@@ -148,8 +148,15 @@ static int __init device_tree_get_meminfo(const void *fdt, int node, + for ( i = 0; i < banks && mem->nr_banks < mem->max_banks; i++ ) + { + device_tree_get_reg(&cell, address_cells, size_cells, &start, &size); ++ /* ++ * Some valid device trees, such as those generated by OpenPOWER ++ * skiboot firmware, expose all reserved memory regions in the ++ * FDT memory reservation block AND in the reserved-memory node which ++ * has already been parsed. Thus, any matching overlaps in the ++ * reserved_mem banks should be ignored. ++ */ + if ( mem == bootinfo_get_reserved_mem() && +- check_reserved_regions_overlap(start, size) ) ++ check_reserved_regions_overlap(start, size, true) ) + return -EINVAL; + /* Some DT may describe empty bank, ignore them */ + if ( !size ) +diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h +index 199f526022..a80a5a7ab3 100644 +--- a/xen/arch/arm/efi/efi-boot.h ++++ b/xen/arch/arm/efi/efi-boot.h +@@ -167,13 +167,14 @@ static bool __init meminfo_add_bank(struct membanks *mem, + if ( mem->nr_banks >= mem->max_banks ) + return false; + #ifdef CONFIG_ACPI +- if ( check_reserved_regions_overlap(start, size) ) ++ if ( check_reserved_regions_overlap(start, size, false) ) + return false; + #endif + + bank = &mem->bank[mem->nr_banks]; + bank->start = start; + bank->size = size; ++ bank->type = MEMBANK_DEFAULT; + + mem->nr_banks++; + +diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h +index c34179da93..ea2e8503dc 100644 +--- a/xen/arch/arm/include/asm/setup.h ++++ b/xen/arch/arm/include/asm/setup.h +@@ -49,6 +49,12 @@ enum membank_type { + MEMBANK_FDT_RESVMEM, + }; + ++enum region_type { ++ MEMORY, ++ RESERVED_MEMORY, ++ 
STATIC_SHARED_MEMORY ++}; ++ + /* Indicates the maximum number of characters(\0 included) for shm_id */ + #define MAX_SHM_ID_LENGTH 16 + +@@ -72,6 +78,7 @@ struct membanks { + __struct_group(membanks_hdr, common, , + unsigned int nr_banks; + unsigned int max_banks; ++ enum region_type type; + ); + struct membank bank[]; + }; +@@ -137,13 +144,17 @@ struct bootinfo { + }; + + #ifdef CONFIG_ACPI +-#define BOOTINFO_ACPI_INIT .acpi.common.max_banks = NR_MEM_BANKS, ++#define BOOTINFO_ACPI_INIT \ ++ .acpi.common.max_banks = NR_MEM_BANKS, \ ++ .acpi.common.type = MEMORY, + #else + #define BOOTINFO_ACPI_INIT + #endif + + #ifdef CONFIG_STATIC_SHM +-#define BOOTINFO_SHMEM_INIT .shmem.common.max_banks = NR_SHMEM_BANKS, ++#define BOOTINFO_SHMEM_INIT \ ++ .shmem.common.max_banks = NR_SHMEM_BANKS, \ ++ .shmem.common.type = STATIC_SHARED_MEMORY, + #else + #define BOOTINFO_SHMEM_INIT + #endif +@@ -151,7 +162,9 @@ struct bootinfo { + #define BOOTINFO_INIT \ + { \ + .mem.common.max_banks = NR_MEM_BANKS, \ ++ .mem.common.type = MEMORY, \ + .reserved_mem.common.max_banks = NR_MEM_BANKS, \ ++ .reserved_mem.common.type = RESERVED_MEMORY, \ + BOOTINFO_ACPI_INIT \ + BOOTINFO_SHMEM_INIT \ + } +@@ -223,7 +236,8 @@ void fw_unreserved_regions(paddr_t s, paddr_t e, + size_t boot_fdt_info(const void *fdt, paddr_t paddr); + const char *boot_fdt_cmdline(const void *fdt); + +-bool check_reserved_regions_overlap(paddr_t region_start, paddr_t region_size); ++bool check_reserved_regions_overlap(paddr_t region_start, paddr_t region_size, ++ bool allow_memreserve_overlap); + + struct bootmodule *add_boot_module(bootmodule_kind kind, + paddr_t start, paddr_t size, bool domU); +diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c +index 0c2fdaceaf..d0533cb715 100644 +--- a/xen/arch/arm/setup.c ++++ b/xen/arch/arm/setup.c +@@ -268,7 +268,8 @@ static void __init dt_unreserved_regions(paddr_t s, paddr_t e, + */ + static bool __init meminfo_overlap_check(const struct membanks *mem, + paddr_t region_start, 
+- paddr_t region_size) ++ paddr_t region_size, ++ bool allow_memreserve_overlap) + { + paddr_t bank_start = INVALID_PADDR, bank_end = 0; + paddr_t region_end = region_start + region_size; +@@ -282,12 +283,23 @@ static bool __init meminfo_overlap_check(const struct membanks *mem, + if ( INVALID_PADDR == bank_start || region_end <= bank_start || + region_start >= bank_end ) + continue; +- else +- { +- printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n", +- region_start, region_end, i, bank_start, bank_end); +- return true; +- } ++ ++ /* ++ * If allow_memreserve_overlap is set, this check allows a region to be ++ * included in a MEMBANK_FDT_RESVMEM bank, but struct membanks *mem of ++ * type STATIC_SHARED_MEMORY don't set the bank[].type field because ++ * that is declared in a union with a field that is instead used, ++ * in any case this restriction is ok since STATIC_SHARED_MEMORY banks ++ * are not meant to clash with FDT /memreserve/ ranges. ++ */ ++ if ( allow_memreserve_overlap && mem->type != STATIC_SHARED_MEMORY && ++ region_start >= bank_start && region_end <= bank_end && ++ mem->bank[i].type == MEMBANK_FDT_RESVMEM ) ++ continue; ++ ++ printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n", ++ region_start, region_end, i, bank_start, bank_end); ++ return true; + } + + return false; +@@ -340,7 +352,8 @@ void __init fw_unreserved_regions(paddr_t s, paddr_t e, + * existing reserved memory regions, otherwise false. 
+ */ + bool __init check_reserved_regions_overlap(paddr_t region_start, +- paddr_t region_size) ++ paddr_t region_size, ++ bool allow_memreserve_overlap) + { + const struct membanks *mem_banks[] = { + bootinfo_get_reserved_mem(), +@@ -359,7 +372,8 @@ bool __init check_reserved_regions_overlap(paddr_t region_start, + * shared memory banks (when static shared memory feature is enabled) + */ + for ( i = 0; i < ARRAY_SIZE(mem_banks); i++ ) +- if ( meminfo_overlap_check(mem_banks[i], region_start, region_size) ) ++ if ( meminfo_overlap_check(mem_banks[i], region_start, region_size, ++ allow_memreserve_overlap) ) + return true; + + /* Check if input region is overlapping with bootmodules */ +@@ -385,7 +399,12 @@ struct bootmodule __init *add_boot_module(bootmodule_kind kind, + return NULL; + } + +- if ( check_reserved_regions_overlap(start, size) ) ++ /* ++ * u-boot adds boot module such as ramdisk to the /memreserve/, since these ++ * ranges are saved in reserved_mem at this stage, allow an eventual exact ++ * match with MEMBANK_FDT_RESVMEM banks. ++ */ ++ if ( check_reserved_regions_overlap(start, size, true) ) + return NULL; + + for ( i = 0 ; i < mods->nr_mods ; i++ ) +diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c +index aa80756c3c..66088a4267 100644 +--- a/xen/arch/arm/static-shmem.c ++++ b/xen/arch/arm/static-shmem.c +@@ -696,7 +696,7 @@ int __init process_shm_node(const void *fdt, int node, uint32_t address_cells, + if (i < mem->max_banks) + { + if ( (paddr != INVALID_PADDR) && +- check_reserved_regions_overlap(paddr, size) ) ++ check_reserved_regions_overlap(paddr, size, false) ) + return -EINVAL; + + /* Static shared memory shall be reserved from any other use. 
*/ +-- +2.48.1 + diff --git a/0002-bunzip2-fix-rare-decompression-failure.patch b/0002-bunzip2-fix-rare-decompression-failure.patch deleted file mode 100644 index 84481a7..0000000 --- a/0002-bunzip2-fix-rare-decompression-failure.patch +++ /dev/null @@ -1,39 +0,0 @@ -From e54077cbca7149c8fa856535b69a4c70dfd48cd2 Mon Sep 17 00:00:00 2001 -From: Ross Lagerwall -Date: Thu, 8 Aug 2024 13:44:26 +0200 -Subject: [PATCH 02/83] bunzip2: fix rare decompression failure - -The decompression code parses a huffman tree and counts the number of -symbols for a given bit length. In rare cases, there may be >= 256 -symbols with a given bit length, causing the unsigned char to overflow. -This causes a decompression failure later when the code tries and fails to -find the bit length for a given symbol. - -Since the maximum number of symbols is 258, use unsigned short instead. - -Fixes: ab77e81f6521 ("x86/dom0: support bzip2 and lzma compressed bzImage payloads") -Signed-off-by: Ross Lagerwall -Acked-by: Jan Beulich -master commit: 303d3ff85c90ee4af4bad4e3b1d4932fa2634d64 -master date: 2024-07-30 11:55:56 +0200 ---- - xen/common/bunzip2.c | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/xen/common/bunzip2.c b/xen/common/bunzip2.c -index 4466426941..79f17162b1 100644 ---- a/xen/common/bunzip2.c -+++ b/xen/common/bunzip2.c -@@ -221,7 +221,8 @@ static int __init get_next_block(struct bunzip_data *bd) - RUNB) */ - symCount = symTotal+2; - for (j = 0; j < groupCount; j++) { -- unsigned char length[MAX_SYMBOLS], temp[MAX_HUFCODE_BITS+1]; -+ unsigned char length[MAX_SYMBOLS]; -+ unsigned short temp[MAX_HUFCODE_BITS+1]; - int minLen, maxLen, pp; - /* Read Huffman code lengths for each symbol. 
They're - stored in a way similar to mtf; record a starting --- -2.47.0 - diff --git a/0002-update-Xen-version-to-4.19.2-pre.patch b/0002-update-Xen-version-to-4.19.2-pre.patch new file mode 100644 index 0000000..cb2e4e6 --- /dev/null +++ b/0002-update-Xen-version-to-4.19.2-pre.patch @@ -0,0 +1,25 @@ +From 5d61bc05850c05a32c2b4192adc1d33bcc19d1f7 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 16 Dec 2024 13:31:51 +0100 +Subject: [PATCH 02/53] update Xen version to 4.19.2-pre + +--- + xen/Makefile | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/xen/Makefile b/xen/Makefile +index 0a942894dd..968971694c 100644 +--- a/xen/Makefile ++++ b/xen/Makefile +@@ -6,7 +6,7 @@ this-makefile := $(call lastword,$(MAKEFILE_LIST)) + # All other places this is stored (eg. compile.h) should be autogenerated. + export XEN_VERSION = 4 + export XEN_SUBVERSION = 19 +-export XEN_EXTRAVERSION ?= .1$(XEN_VENDORVERSION) ++export XEN_EXTRAVERSION ?= .2-pre$(XEN_VENDORVERSION) + export XEN_FULLVERSION = $(XEN_VERSION).$(XEN_SUBVERSION)$(XEN_EXTRAVERSION) + -include xen-version + +-- +2.48.1 + diff --git a/0003-XSM-domctl-Fix-permission-checks-on-XEN_DOMCTL_creat.patch b/0003-XSM-domctl-Fix-permission-checks-on-XEN_DOMCTL_creat.patch deleted file mode 100644 index c89e8be..0000000 --- a/0003-XSM-domctl-Fix-permission-checks-on-XEN_DOMCTL_creat.patch +++ /dev/null @@ -1,150 +0,0 @@ -From d2ecc1f231b90d4e54394e25a9aef9be42c0d196 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Thu, 8 Aug 2024 13:44:56 +0200 -Subject: [PATCH 03/83] XSM/domctl: Fix permission checks on - XEN_DOMCTL_createdomain - -The XSM checks for XEN_DOMCTL_createdomain are problematic. There's a split -between xsm_domctl() called early, and flask_domain_create() called quite late -during domain construction. - -All XSM implementations except Flask have a simple IS_PRIV check in -xsm_domctl(), and operate as expected when an unprivileged domain tries to -make a hypercall. 
- -Flask however foregoes any action in xsm_domctl() and defers everything, -including the simple "is the caller permitted to create a domain" check, to -flask_domain_create(). - -As a consequence, when XSM Flask is active, and irrespective of the policy -loaded, all domains irrespective of privilege can: - - * Mutate the global 'rover' variable, used to track the next free domid. - Therefore, all domains can cause a domid wraparound, and combined with a - voluntary reboot, choose their own domid. - - * Cause a reasonable amount of a domain to be constructed before ultimately - failing for permission reasons, including the use of settings outside of - supported limits. - -In order to remediate this, pass the ssidref into xsm_domctl() and at least -check that the calling domain privileged enough to create domains. - -Take the opportunity to also fix the sign of the cmd parameter to be unsigned. - -This issue has not been assigned an XSA, because Flask is experimental and not -security supported. - -Reported-by: Ross Lagerwall -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -Acked-by: Daniel P. 
Smith -master commit: ee32b9b29af449d38aad0a1b3a81aaae586f5ea7 -master date: 2024-07-30 17:42:17 +0100 ---- - xen/arch/x86/mm/paging.c | 2 +- - xen/common/domctl.c | 4 +++- - xen/include/xsm/dummy.h | 2 +- - xen/include/xsm/xsm.h | 7 ++++--- - xen/xsm/flask/hooks.c | 14 ++++++++++++-- - 5 files changed, 21 insertions(+), 8 deletions(-) - -diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c -index bca320fffa..dd47bde5ce 100644 ---- a/xen/arch/x86/mm/paging.c -+++ b/xen/arch/x86/mm/paging.c -@@ -767,7 +767,7 @@ long do_paging_domctl_cont( - if ( d == NULL ) - return -ESRCH; - -- ret = xsm_domctl(XSM_OTHER, d, op.cmd); -+ ret = xsm_domctl(XSM_OTHER, d, op.cmd, 0 /* SSIDref not applicable */); - if ( !ret ) - { - if ( domctl_lock_acquire() ) -diff --git a/xen/common/domctl.c b/xen/common/domctl.c -index 2c0331bb05..ea16b75910 100644 ---- a/xen/common/domctl.c -+++ b/xen/common/domctl.c -@@ -322,7 +322,9 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl) - break; - } - -- ret = xsm_domctl(XSM_OTHER, d, op->cmd); -+ ret = xsm_domctl(XSM_OTHER, d, op->cmd, -+ /* SSIDRef only applicable for cmd == createdomain */ -+ op->u.createdomain.ssidref); - if ( ret ) - goto domctl_out_unlock_domonly; - -diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h -index 00d2cbebf2..7956f27a29 100644 ---- a/xen/include/xsm/dummy.h -+++ b/xen/include/xsm/dummy.h -@@ -162,7 +162,7 @@ static XSM_INLINE int cf_check xsm_set_target( - } - - static XSM_INLINE int cf_check xsm_domctl( -- XSM_DEFAULT_ARG struct domain *d, int cmd) -+ XSM_DEFAULT_ARG struct domain *d, unsigned int cmd, uint32_t ssidref) - { - XSM_ASSERT_ACTION(XSM_OTHER); - switch ( cmd ) -diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h -index 8dad03fd3d..627c0d2731 100644 ---- a/xen/include/xsm/xsm.h -+++ b/xen/include/xsm/xsm.h -@@ -60,7 +60,7 @@ struct xsm_ops { - int (*domctl_scheduler_op)(struct domain *d, int op); - int (*sysctl_scheduler_op)(int op); - int (*set_target)(struct 
domain *d, struct domain *e); -- int (*domctl)(struct domain *d, int cmd); -+ int (*domctl)(struct domain *d, unsigned int cmd, uint32_t ssidref); - int (*sysctl)(int cmd); - int (*readconsole)(uint32_t clear); - -@@ -248,9 +248,10 @@ static inline int xsm_set_target( - return alternative_call(xsm_ops.set_target, d, e); - } - --static inline int xsm_domctl(xsm_default_t def, struct domain *d, int cmd) -+static inline int xsm_domctl(xsm_default_t def, struct domain *d, -+ unsigned int cmd, uint32_t ssidref) - { -- return alternative_call(xsm_ops.domctl, d, cmd); -+ return alternative_call(xsm_ops.domctl, d, cmd, ssidref); - } - - static inline int xsm_sysctl(xsm_default_t def, int cmd) -diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c -index 5e88c71b8e..278ad38c2a 100644 ---- a/xen/xsm/flask/hooks.c -+++ b/xen/xsm/flask/hooks.c -@@ -663,12 +663,22 @@ static int cf_check flask_set_target(struct domain *d, struct domain *t) - return rc; - } - --static int cf_check flask_domctl(struct domain *d, int cmd) -+static int cf_check flask_domctl(struct domain *d, unsigned int cmd, -+ uint32_t ssidref) - { - switch ( cmd ) - { -- /* These have individual XSM hooks (common/domctl.c) */ - case XEN_DOMCTL_createdomain: -+ /* -+ * There is a later hook too, but at this early point simply check -+ * that the calling domain is privileged enough to create a domain. -+ * -+ * Note that d is NULL because we haven't even allocated memory for it -+ * this early in XEN_DOMCTL_createdomain. 
-+ */ -+ return avc_current_has_perm(ssidref, SECCLASS_DOMAIN, DOMAIN__CREATE, NULL); -+ -+ /* These have individual XSM hooks (common/domctl.c) */ - case XEN_DOMCTL_getdomaininfo: - case XEN_DOMCTL_scheduler_op: - case XEN_DOMCTL_irq_permission: --- -2.47.0 - diff --git a/0003-x86emul-MOVBE-requires-a-memory-operand.patch b/0003-x86emul-MOVBE-requires-a-memory-operand.patch new file mode 100644 index 0000000..fbfa0be --- /dev/null +++ b/0003-x86emul-MOVBE-requires-a-memory-operand.patch @@ -0,0 +1,32 @@ +From 3a9e5a93e6ed5300f25612963c0fe521156951a1 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 16 Dec 2024 13:32:19 +0100 +Subject: [PATCH 03/53] x86emul: MOVBE requires a memory operand + +The reg-reg forms should cause #UD; they come into existence only with +APX, where MOVBE also extends BSWAP (for the latter not being "eligible" +to a REX2 prefix). + +Signed-off-by: Jan Beulich +Reviewed-by: Andrew Cooper +master commit: 4c5d9a01f8fa81417a9c431e9624fb71361ec4f9 +master date: 2024-12-02 09:50:14 +0100 +--- + xen/arch/x86/x86_emulate/x86_emulate.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c +index 905c6abf42..09ab75d035 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.c ++++ b/xen/arch/x86/x86_emulate/x86_emulate.c +@@ -6944,6 +6944,7 @@ x86_emulate( + + case X86EMUL_OPC(0x0f38, 0xf0): /* movbe m,r */ + case X86EMUL_OPC(0x0f38, 0xf1): /* movbe r,m */ ++ generate_exception_if(ea.type != OP_MEM, X86_EXC_UD); + vcpu_must_have(movbe); + switch ( op_bytes ) + { +-- +2.48.1 + diff --git a/0004-x86-dom0-fix-restoring-cr3-and-the-mapcache-override.patch b/0004-x86-dom0-fix-restoring-cr3-and-the-mapcache-override.patch deleted file mode 100644 index a1fcdce..0000000 --- a/0004-x86-dom0-fix-restoring-cr3-and-the-mapcache-override.patch +++ /dev/null @@ -1,38 +0,0 @@ -From adf1939b51a0a2fa596f7acca0989bfe56cab307 Mon Sep 17 00:00:00 2001 -From: 
=?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Thu, 8 Aug 2024 13:45:28 +0200 -Subject: [PATCH 04/83] x86/dom0: fix restoring %cr3 and the mapcache override - on PV build error -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -One of the error paths in the PV dom0 builder section that runs on the guest -page-tables wasn't restoring the Xen value of %cr3, neither removing the -mapcache override. - -Fixes: 079ff2d32c3d ('libelf-loader: introduce elf_load_image') -Signed-off-by: Roger Pau Monné -Reviewed-by: Jan Beulich -master commit: 1fc3f77113dd43b14fa7ef5936dcdba120c0b63f -master date: 2024-07-31 12:41:02 +0200 ---- - xen/arch/x86/pv/dom0_build.c | 2 ++ - 1 file changed, 2 insertions(+) - -diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c -index d8043fa58a..57e58a02e7 100644 ---- a/xen/arch/x86/pv/dom0_build.c -+++ b/xen/arch/x86/pv/dom0_build.c -@@ -825,6 +825,8 @@ int __init dom0_construct_pv(struct domain *d, - rc = elf_load_binary(&elf); - if ( rc < 0 ) - { -+ mapcache_override_current(NULL); -+ switch_cr3_cr4(current->arch.cr3, read_cr4()); - printk("Failed to load the kernel binary\n"); - goto out; - } --- -2.47.0 - diff --git a/0004-xen-Kconfig-livepatch-build-tools-requires-debug-inf.patch b/0004-xen-Kconfig-livepatch-build-tools-requires-debug-inf.patch new file mode 100644 index 0000000..5c7a863 --- /dev/null +++ b/0004-xen-Kconfig-livepatch-build-tools-requires-debug-inf.patch @@ -0,0 +1,45 @@ +From 475511a467e8296f7d11f35cb4b5838c2b465937 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Mon, 16 Dec 2024 13:32:43 +0100 +Subject: [PATCH 04/53] xen/Kconfig: livepatch-build-tools requires debug + information +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The tools infrastructure used to build livepatches for Xen +(livepatch-build-tools) consumes some DWARF debug information present in +xen-syms to generate a 
livepatch (see livepatch-build script usage of readelf +-wi). + +The current Kconfig defaults however will enable LIVEPATCH without DEBUG_INFO +on release builds, thus providing a default Kconfig selection that's not +suitable for livepatch-build-tools even when LIVEPATCH support is enabled, +because it's missing the DWARF debug section. + +Fix by defaulting DEBUG_INFO to enabled when LIVEPATCH is. + +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: 126b0a6e537ce1d486a29e35cfeec1f222a74d11 +master date: 2024-12-02 15:22:05 +0100 +--- + xen/Kconfig.debug | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug +index 78b5a7c603..778db6f4e9 100644 +--- a/xen/Kconfig.debug ++++ b/xen/Kconfig.debug +@@ -133,7 +133,7 @@ endif # DEBUG || EXPERT + + config DEBUG_INFO + bool "Compile Xen with debug info" +- default DEBUG ++ default DEBUG || LIVEPATCH + help + Say Y here if you want to build Xen with debug information. This + information is needed e.g. for doing crash dump analysis of the +-- +2.48.1 + diff --git a/0005-libs-guest-Fix-migration-compatibility-with-a-securi.patch b/0005-libs-guest-Fix-migration-compatibility-with-a-securi.patch new file mode 100644 index 0000000..13dcfd1 --- /dev/null +++ b/0005-libs-guest-Fix-migration-compatibility-with-a-securi.patch @@ -0,0 +1,65 @@ +From 60573721c5f85d79d1ed04de6534248e504df0da Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Mon, 16 Dec 2024 13:33:07 +0100 +Subject: [PATCH 05/53] libs/guest: Fix migration compatibility with a + security-patched Xen 4.13 + +xc_cpuid_apply_policy() provides compatibility for migration of a pre-4.14 VM +where no CPUID data was provided in the stream. + +It guesses the various max-leaf limits, based on what was true at the time of +writing, but this was not correctly adapted when speculative security issues +forced the advertisement of new feature bits. 
Of note are: + + * LFENCE-DISPATCH, in leaf 0x80000021.eax + * BHI-CTRL, in leaf 0x7[2].edx + +In both cases, a VM booted on a security-patched Xen 4.13, and then migrated +on to any newer version of Xen on the same or compatible hardware would have +these features stripped back because Xen is still editing the cpu-policy for +sanity behind the back of the toolstack. + +For VMs using BHI_DIS_S to mitigate Native-BHI, this resulted in a failure to +restore the guests MSR_SPEC_CTRL setting: + + (XEN) HVM d7v0 load MSR 0x48 with value 0x401 failed + (XEN) HVM7 restore: failed to load entry 20/0 rc -6 + +Fixes: e9b4fe263649 ("x86/cpuid: support LFENCE always serialising CPUID bit") +Fixes: f3709b15fc86 ("x86/cpuid: Infrastructure for cpuid word 7:2.edx") +Signed-off-by: Andrew Cooper +Reviewed-by: Jan Beulich +master commit: 28301682f492c1df2ff9c3e01a0aab6262bd925a +master date: 2024-12-03 12:20:41 +0000 +--- + tools/libs/guest/xg_cpuid_x86.c | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c +index 4453178100..263a9d4787 100644 +--- a/tools/libs/guest/xg_cpuid_x86.c ++++ b/tools/libs/guest/xg_cpuid_x86.c +@@ -640,7 +640,8 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid, bool restore, + * + * This restore path is used for incoming VMs with no CPUID data + * i.e. originated on Xen 4.13 or earlier. We must invent a policy +- * compatible with what Xen 4.13 would have done on the same hardware. ++ * compatible with what a security-patched Xen 4.13 would have done on ++ * the same hardware. + * + * Specifically: + * - Clamp max leaves. 
+@@ -657,8 +658,8 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid, bool restore, + } + + p->policy.basic.max_leaf = min(p->policy.basic.max_leaf, 0xdu); +- p->policy.feat.max_subleaf = 0; +- p->policy.extd.max_leaf = min(p->policy.extd.max_leaf, 0x8000001c); ++ p->policy.feat.max_subleaf = min(p->policy.feat.max_subleaf, 0x2u); ++ p->policy.extd.max_leaf = min(p->policy.extd.max_leaf, 0x80000021); + } + + if ( featureset ) +-- +2.48.1 + diff --git a/0005-x86-altcall-further-refine-clang-workaround.patch b/0005-x86-altcall-further-refine-clang-workaround.patch deleted file mode 100644 index 1d993e3..0000000 --- a/0005-x86-altcall-further-refine-clang-workaround.patch +++ /dev/null @@ -1,73 +0,0 @@ -From ee032f29972b8c58db9fcf96650f9cbc083edca8 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Thu, 8 Aug 2024 13:45:58 +0200 -Subject: [PATCH 05/83] x86/altcall: further refine clang workaround -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -The current code in ALT_CALL_ARG() won't successfully workaround the clang -code-generation issue if the arg parameter has a size that's not a power of 2. -While there are no such sized parameters at the moment, improve the workaround -to also be effective when such sizes are used. - -Instead of using a union with a long use an unsigned long that's first -initialized to 0 and afterwards set to the argument value. 
- -Reported-by: Alejandro Vallejo -Suggested-by: Alejandro Vallejo -Signed-off-by: Roger Pau Monné -Reviewed-by: Jan Beulich -master commit: 561cba38ff551383a628dc93e64ab0691cfc92bf -master date: 2024-07-31 12:41:22 +0200 ---- - xen/arch/x86/include/asm/alternative.h | 26 ++++++++++++-------------- - 1 file changed, 12 insertions(+), 14 deletions(-) - -diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative.h -index e63b459276..a86eadfaec 100644 ---- a/xen/arch/x86/include/asm/alternative.h -+++ b/xen/arch/x86/include/asm/alternative.h -@@ -169,27 +169,25 @@ extern void alternative_branches(void); - - #ifdef CONFIG_CC_IS_CLANG - /* -- * Use a union with an unsigned long in order to prevent clang from -- * skipping a possible truncation of the value. By using the union any -- * truncation is carried before the call instruction, in turn covering -- * for ABI-non-compliance in that the necessary clipping / extension of -- * the value is supposed to be carried out in the callee. -+ * Clang doesn't follow the psABI and doesn't truncate parameter values at the -+ * callee. This can lead to bad code being generated when using alternative -+ * calls. - * -- * Note this behavior is not mandated by the standard, and hence could -- * stop being a viable workaround, or worse, could cause a different set -- * of code-generation issues in future clang versions. -+ * Workaround it by using a temporary intermediate variable that's zeroed -+ * before being assigned the parameter value, as that forces clang to zero the -+ * register at the caller. 
- * - * This has been reported upstream: - * https://github.com/llvm/llvm-project/issues/12579 - * https://github.com/llvm/llvm-project/issues/82598 - */ - #define ALT_CALL_ARG(arg, n) \ -- register union { \ -- typeof(arg) e[sizeof(long) / sizeof(arg)]; \ -- unsigned long r; \ -- } a ## n ## _ asm ( ALT_CALL_arg ## n ) = { \ -- .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })\ -- } -+ register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = ({ \ -+ unsigned long tmp = 0; \ -+ BUILD_BUG_ON(sizeof(arg) > sizeof(unsigned long)); \ -+ *(typeof(arg) *)&tmp = (arg); \ -+ tmp; \ -+ }) - #else - #define ALT_CALL_ARG(arg, n) \ - register typeof(arg) a ## n ## _ asm ( ALT_CALL_arg ## n ) = \ --- -2.47.0 - diff --git a/0006-tools-ocaml-Specify-rpath-correctly-for-ocamlmklib.patch b/0006-tools-ocaml-Specify-rpath-correctly-for-ocamlmklib.patch new file mode 100644 index 0000000..cc09b81 --- /dev/null +++ b/0006-tools-ocaml-Specify-rpath-correctly-for-ocamlmklib.patch @@ -0,0 +1,50 @@ +From fbe3ec72dc0d6ecf4007fdb6eff821cc7570e9ca Mon Sep 17 00:00:00 2001 +From: Andrii Sultanov +Date: Mon, 16 Dec 2024 13:33:17 +0100 +Subject: [PATCH 06/53] tools/ocaml: Specify rpath correctly for ocamlmklib + +ocamlmklib has special handling for C-like '-Wl,-rpath' option, but does +not know how to handle '-Wl,-rpath-link', as evidenced by warnings like: +"Unknown option +-Wl,-rpath-link=$HOME/xen/tools/ocaml/libs/eventchn/../../../../tools/libs/toollog" +Pass this option directly to the compiler with -ccopt instead. + +Also pass -L directly to the linker with -ldopt. This prevents embedding absolute +paths from buildtime into binary's RPATH. 
+ +Fixes: f7b4e4558b42 ("tools/ocaml: Fix OCaml libs rules") +Reported-by: Fernando Rodrigues +Tested-by: Fernando Rodrigues +Signed-off-by: Andrii Sultanov +Acked-by: Christian Lindig +master commit: bf8a209915804088c09ac6575bcca554450fa7e8 +master date: 2024-12-11 10:45:08 +0000 +--- + tools/ocaml/Makefile.rules | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/tools/ocaml/Makefile.rules b/tools/ocaml/Makefile.rules +index 5638193edf..678bbc046f 100644 +--- a/tools/ocaml/Makefile.rules ++++ b/tools/ocaml/Makefile.rules +@@ -61,7 +61,7 @@ mk-caml-lib-bytecode = $(call quiet-command, $(OCAMLC) $(OCAMLCFLAGS) -a -o $1 $ + + mk-caml-stubs = $(call quiet-command, $(OCAMLMKLIB) -o `basename $1 .a` $2,MKLIB,$1) + mk-caml-lib-stubs = \ +- $(call quiet-command, $(OCAMLMKLIB) -o `basename $1 .a | sed -e 's/^lib//'` $2 $3,MKLIB,$1) ++ $(call quiet-command, $(OCAMLMKLIB) -o `basename $1 .a | sed -e 's/^lib//'` $2 `echo $3 | sed -e 's/-ccopt -l/-l/g' | sed -e 's/-ccopt -L/-ldopt -L/g'`,MKLIB,$1) + + # define a library target .cmxa and .cma + define OCAML_LIBRARY_template +@@ -72,7 +72,7 @@ define OCAML_LIBRARY_template + $(1)_stubs.a: $(foreach obj,$$($(1)_C_OBJS),$(obj).o) + $(call mk-caml-stubs,$$@, $$+) + lib$(1)_stubs.a: $(foreach obj,$($(1)_C_OBJS),$(obj).o) +- $(call mk-caml-lib-stubs,$$@, $$+, $(foreach lib,$(LIBS_$(1)),$(lib))) ++ $(call mk-caml-lib-stubs,$$@, $$+, $(foreach lib,$(LIBS_$(1)),-ccopt $(lib))) + endef + + define OCAML_NOC_LIBRARY_template +-- +2.48.1 + diff --git a/0006-xen-sched-fix-error-handling-in-cpu_schedule_up.patch b/0006-xen-sched-fix-error-handling-in-cpu_schedule_up.patch deleted file mode 100644 index 2d2d078..0000000 --- a/0006-xen-sched-fix-error-handling-in-cpu_schedule_up.patch +++ /dev/null @@ -1,113 +0,0 @@ -From b37580d5e984770266783b639552a97c36ecb58a Mon Sep 17 00:00:00 2001 -From: Juergen Gross -Date: Thu, 8 Aug 2024 13:46:21 +0200 -Subject: [PATCH 06/83] xen/sched: fix error handling in cpu_schedule_up() - 
-In case cpu_schedule_up() is failing, it needs to undo all externally -visible changes it has done before. - -Reason is that cpu_schedule_callback() won't be called with the -CPU_UP_CANCELED notifier in case cpu_schedule_up() did fail. - -Fixes: 207589dbacd4 ("xen/sched: move per cpu scheduler private data into struct sched_resource") -Reported-by: Jan Beulich -Signed-off-by: Juergen Gross -Reviewed-by: Jan Beulich -master commit: 44a7d4f0a5e9eae41a44a162e54ff6d2ebe5b7d6 -master date: 2024-07-31 14:50:18 +0200 ---- - xen/common/sched/core.c | 63 +++++++++++++++++++++-------------------- - 1 file changed, 33 insertions(+), 30 deletions(-) - -diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c -index d84b65f197..c466711e9e 100644 ---- a/xen/common/sched/core.c -+++ b/xen/common/sched/core.c -@@ -2755,6 +2755,36 @@ static struct sched_resource *sched_alloc_res(void) - return sr; - } - -+static void cf_check sched_res_free(struct rcu_head *head) -+{ -+ struct sched_resource *sr = container_of(head, struct sched_resource, rcu); -+ -+ free_cpumask_var(sr->cpus); -+ if ( sr->sched_unit_idle ) -+ sched_free_unit_mem(sr->sched_unit_idle); -+ xfree(sr); -+} -+ -+static void cpu_schedule_down(unsigned int cpu) -+{ -+ struct sched_resource *sr; -+ -+ rcu_read_lock(&sched_res_rculock); -+ -+ sr = get_sched_res(cpu); -+ -+ kill_timer(&sr->s_timer); -+ -+ cpumask_clear_cpu(cpu, &sched_res_mask); -+ set_sched_res(cpu, NULL); -+ -+ /* Keep idle unit. 
*/ -+ sr->sched_unit_idle = NULL; -+ call_rcu(&sr->rcu, sched_res_free); -+ -+ rcu_read_unlock(&sched_res_rculock); -+} -+ - static int cpu_schedule_up(unsigned int cpu) - { - struct sched_resource *sr; -@@ -2794,7 +2824,10 @@ static int cpu_schedule_up(unsigned int cpu) - idle_vcpu[cpu]->sched_unit->res = sr; - - if ( idle_vcpu[cpu] == NULL ) -+ { -+ cpu_schedule_down(cpu); - return -ENOMEM; -+ } - - idle_vcpu[cpu]->sched_unit->rendezvous_in_cnt = 0; - -@@ -2812,36 +2845,6 @@ static int cpu_schedule_up(unsigned int cpu) - return 0; - } - --static void cf_check sched_res_free(struct rcu_head *head) --{ -- struct sched_resource *sr = container_of(head, struct sched_resource, rcu); -- -- free_cpumask_var(sr->cpus); -- if ( sr->sched_unit_idle ) -- sched_free_unit_mem(sr->sched_unit_idle); -- xfree(sr); --} -- --static void cpu_schedule_down(unsigned int cpu) --{ -- struct sched_resource *sr; -- -- rcu_read_lock(&sched_res_rculock); -- -- sr = get_sched_res(cpu); -- -- kill_timer(&sr->s_timer); -- -- cpumask_clear_cpu(cpu, &sched_res_mask); -- set_sched_res(cpu, NULL); -- -- /* Keep idle unit. 
*/ -- sr->sched_unit_idle = NULL; -- call_rcu(&sr->rcu, sched_res_free); -- -- rcu_read_unlock(&sched_res_rculock); --} -- - void sched_rm_cpu(unsigned int cpu) - { - int rc; --- -2.47.0 - diff --git a/0007-x86-io-apic-prevent-early-exit-from-i8259-loop-detec.patch b/0007-x86-io-apic-prevent-early-exit-from-i8259-loop-detec.patch new file mode 100644 index 0000000..14e696c --- /dev/null +++ b/0007-x86-io-apic-prevent-early-exit-from-i8259-loop-detec.patch @@ -0,0 +1,84 @@ +From c41c22bf8e9f7ff6c5d4dbe9fcf19a61643543c9 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Tue, 17 Dec 2024 12:46:29 +0100 +Subject: [PATCH 07/53] x86/io-apic: prevent early exit from i8259 loop + detection +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Avoid exiting early from the loop when a pin that could be connected to the +i8259 is found, as such early exit would leave the EOI handler translation +array only partially allocated and/or initialized. + +Otherwise on systems with multiple IO-APICs and an unmasked ExtINT pin on +any IO-APIC that's no the last one the following NULL pointer dereference +triggers: + +(XEN) Enabling APIC mode. Using 2 I/O APICs +(XEN) ----[ Xen-4.20-unstable x86_64 debug=y Not tainted ]---- +(XEN) CPU: 0 +(XEN) RIP: e008:[] __ioapic_write_entry+0x83/0x95 +[...] 
+(XEN) Xen call trace: +(XEN) [] R __ioapic_write_entry+0x83/0x95 +(XEN) [] F amd_iommu_ioapic_update_ire+0x1ea/0x273 +(XEN) [] F iommu_update_ire_from_apic+0xa/0xc +(XEN) [] F __ioapic_write_entry+0x93/0x95 +(XEN) [] F arch/x86/io_apic.c#clear_IO_APIC_pin+0x7c/0x10e +(XEN) [] F arch/x86/io_apic.c#clear_IO_APIC+0x2d/0x61 +(XEN) [] F enable_IO_APIC+0x2e3/0x34f +(XEN) [] F smp_prepare_cpus+0x254/0x27a +(XEN) [] F __start_xen+0x1ce1/0x23ae +(XEN) [] F __high_start+0x8e/0x90 +(XEN) +(XEN) Pagetable walk from 0000000000000000: +(XEN) L4[0x000] = 000000007dbfd063 ffffffffffffffff +(XEN) L3[0x000] = 000000007dbfa063 ffffffffffffffff +(XEN) L2[0x000] = 000000007dbcc063 ffffffffffffffff +(XEN) L1[0x000] = 0000000000000000 ffffffffffffffff +(XEN) +(XEN) **************************************** +(XEN) Panic on CPU 0: +(XEN) FATAL PAGE FAULT +(XEN) [error_code=0002] +(XEN) Faulting linear address: 0000000000000000 +(XEN) **************************************** +(XEN) +(XEN) Reboot in five seconds... + +Reported-by: Sergii Dmytruk +Fixes: 86001b3970fe ('x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping') +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: f38fd27c4ceadf7ec4e82e82d0731b6ea415c51e +master date: 2024-12-17 11:15:30 +0100 +--- + xen/arch/x86/io_apic.c | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c +index 1fd9559a06..65b019db02 100644 +--- a/xen/arch/x86/io_apic.c ++++ b/xen/arch/x86/io_apic.c +@@ -1372,14 +1372,14 @@ void __init enable_IO_APIC(void) + /* If the interrupt line is enabled and in ExtInt mode + * I have found the pin where the i8259 is connected. 
+ */ +- if ((entry.mask == 0) && (entry.delivery_mode == dest_ExtINT)) { ++ if ( ioapic_i8259.pin == -1 && entry.mask == 0 && ++ entry.delivery_mode == dest_ExtINT ) ++ { + ioapic_i8259.apic = apic; + ioapic_i8259.pin = pin; +- goto found_i8259; + } + } + } +- found_i8259: + /* Look to see what if the MP table has reported the ExtINT */ + /* If we could not find the appropriate pin by looking at the ioapic + * the i8259 probably is not connected the ioapic but give the +-- +2.48.1 + diff --git a/0007-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch b/0007-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch deleted file mode 100644 index 5c6bdf4..0000000 --- a/0007-xen-hvm-Don-t-skip-MSR_READ-trace-record.patch +++ /dev/null @@ -1,40 +0,0 @@ -From 97a15007c9606d4c53109754bb21fd593bca589b Mon Sep 17 00:00:00 2001 -From: George Dunlap -Date: Thu, 8 Aug 2024 13:47:02 +0200 -Subject: [PATCH 07/83] xen/hvm: Don't skip MSR_READ trace record - -Commit 37f074a3383 ("x86/msr: introduce guest_rdmsr()") introduced a -function to combine the MSR_READ handling between PV and HVM. -Unfortunately, by returning directly, it skipped the trace generation, -leading to gaps in the trace record, as well as xenalyze errors like -this: - -hvm_generic_postprocess: d2v0 Strange, exit 7c(VMEXIT_MSR) missing a handler - -Replace the `return` with `goto out`. 
- -Fixes: 37f074a3383 ("x86/msr: introduce guest_rdmsr()") -Signed-off-by: George Dunlap -Reviewed-by: Jan Beulich -master commit: bc8a43fff61ae4162a95d84f4e148d6773667cd2 -master date: 2024-08-02 08:42:09 +0200 ---- - xen/arch/x86/hvm/hvm.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c -index 7f4b627b1f..0fe2b85b16 100644 ---- a/xen/arch/x86/hvm/hvm.c -+++ b/xen/arch/x86/hvm/hvm.c -@@ -3557,7 +3557,7 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) - fixed_range_base = (uint64_t *)v->arch.hvm.mtrr.fixed_ranges; - - if ( (ret = guest_rdmsr(v, msr, msr_content)) != X86EMUL_UNHANDLEABLE ) -- return ret; -+ goto out; - - ret = X86EMUL_OKAY; - --- -2.47.0 - diff --git a/0008-tools-lsevtchn-Use-errno-macro-to-handle-hypercall-e.patch b/0008-tools-lsevtchn-Use-errno-macro-to-handle-hypercall-e.patch deleted file mode 100644 index 821890f..0000000 --- a/0008-tools-lsevtchn-Use-errno-macro-to-handle-hypercall-e.patch +++ /dev/null @@ -1,75 +0,0 @@ -From e0e84771b61ed985809d105d8f116d4c520542b0 Mon Sep 17 00:00:00 2001 -From: Matthew Barnes -Date: Thu, 8 Aug 2024 13:47:30 +0200 -Subject: [PATCH 08/83] tools/lsevtchn: Use errno macro to handle hypercall - error cases - -Currently, lsevtchn aborts its event channel enumeration when it hits -an event channel that is owned by Xen. - -lsevtchn does not distinguish between different hypercall errors, which -results in lsevtchn missing potential relevant event channels with -higher port numbers. - -Use the errno macro to distinguish between hypercall errors, and -continue event channel enumeration if the hypercall error is not -critical to enumeration. 
- -Signed-off-by: Matthew Barnes -Reviewed-by: Anthony PERARD -master commit: e92a453c8db8bba62d6be3006079e2b9990c3978 -master date: 2024-08-02 08:43:57 +0200 ---- - tools/xcutils/lsevtchn.c | 22 ++++++++++++++++++++-- - 1 file changed, 20 insertions(+), 2 deletions(-) - -diff --git a/tools/xcutils/lsevtchn.c b/tools/xcutils/lsevtchn.c -index d1710613dd..30c8d847b8 100644 ---- a/tools/xcutils/lsevtchn.c -+++ b/tools/xcutils/lsevtchn.c -@@ -3,6 +3,7 @@ - #include - #include - #include -+#include - - #include - -@@ -24,7 +25,23 @@ int main(int argc, char **argv) - status.port = port; - rc = xc_evtchn_status(xch, &status); - if ( rc < 0 ) -- break; -+ { -+ switch ( errno ) -+ { -+ case EACCES: /* Xen-owned evtchn */ -+ continue; -+ -+ case EINVAL: /* Port enumeration has ended */ -+ rc = 0; -+ break; -+ -+ default: -+ perror("xc_evtchn_status"); -+ rc = 1; -+ break; -+ } -+ goto out; -+ } - - if ( status.status == EVTCHNSTAT_closed ) - continue; -@@ -58,7 +75,8 @@ int main(int argc, char **argv) - printf("\n"); - } - -+ out: - xc_interface_close(xch); - -- return 0; -+ return rc; - } --- -2.47.0 - diff --git a/0008-xen-update-ECLAIR-service-identifiers-from-MC3R1-to-.patch b/0008-xen-update-ECLAIR-service-identifiers-from-MC3R1-to-.patch new file mode 100644 index 0000000..2510bc4 --- /dev/null +++ b/0008-xen-update-ECLAIR-service-identifiers-from-MC3R1-to-.patch @@ -0,0 +1,1280 @@ +From 8b584c97f88724a390fd17b08b2735f488d5f980 Mon Sep 17 00:00:00 2001 +From: Alessandro Zucchelli +Date: Tue, 10 Dec 2024 11:37:23 +0100 +Subject: [PATCH 08/53] xen: update ECLAIR service identifiers from MC3R1 to + MC3A2. + +Rename all instances of ECLAIR MISRA C:2012 service identifiers, +identified by the prefix MC3R1, to use the prefix MC3A2, which +refers to MISRA C:2012 Amendment 2 guidelines. + +This update is motivated by the need to upgrade ECLAIR GitLab runners +that use the new naming scheme for MISRA C:2012 Amendment 2 guidelines. 
+ +Changes to the docs/misra directory are needed in order to keep +comment-based deviation up to date. + +Signed-off-by: Alessandro Zucchelli +Reviewed-by: Stefano Stabellini +(cherry picked from commit 631f535a3d4ffd66a270672f0f787d79f3bf38f8) +--- + .../eclair_analysis/ECLAIR/B.UNEVALEFF.ecl | 2 +- + .../ECLAIR/accepted_guidelines.sh | 2 +- + .../eclair_analysis/ECLAIR/analysis.ecl | 6 +- + .../eclair_analysis/ECLAIR/deviations.ecl | 228 +++++++++--------- + .../eclair_analysis/ECLAIR/monitored.ecl | 204 ++++++++-------- + automation/eclair_analysis/ECLAIR/tagging.ecl | 160 ++++++------ + docs/misra/documenting-violations.rst | 6 +- + docs/misra/safe.json | 26 +- + 8 files changed, 317 insertions(+), 317 deletions(-) + +diff --git a/automation/eclair_analysis/ECLAIR/B.UNEVALEFF.ecl b/automation/eclair_analysis/ECLAIR/B.UNEVALEFF.ecl +index 92d8db8986..fa249b8e36 100644 +--- a/automation/eclair_analysis/ECLAIR/B.UNEVALEFF.ecl ++++ b/automation/eclair_analysis/ECLAIR/B.UNEVALEFF.ecl +@@ -1,4 +1,4 @@ +--clone_service=MC3R1.R13.6,B.UNEVALEFF ++-clone_service=MC3A2.R13.6,B.UNEVALEFF + + -config=B.UNEVALEFF,summary="The operand of the `alignof' and `typeof' operators shall not contain any expression which has potential side effects" + -config=B.UNEVALEFF,stmt_child_matcher= +diff --git a/automation/eclair_analysis/ECLAIR/accepted_guidelines.sh b/automation/eclair_analysis/ECLAIR/accepted_guidelines.sh +index 368135122c..2c4b339d0d 100755 +--- a/automation/eclair_analysis/ECLAIR/accepted_guidelines.sh ++++ b/automation/eclair_analysis/ECLAIR/accepted_guidelines.sh +@@ -10,6 +10,6 @@ script_dir="$( + accepted_rst=$1 + + grep -Eo "\`(Dir|Rule) [0-9]+\.[0-9]+" ${accepted_rst} \ +- | sed -e 's/`Rule /MC3R1.R/' -e 's/`Dir /MC3R1.D/' -e 's/.*/-enable=&/' > ${script_dir}/accepted.ecl ++ | sed -e 's/`Rule /MC3A2.R/' -e 's/`Dir /MC3A2.D/' -e 's/.*/-enable=&/' > ${script_dir}/accepted.ecl + + echo "-enable=B.UNEVALEFF" >> ${script_dir}/accepted.ecl +diff --git 
a/automation/eclair_analysis/ECLAIR/analysis.ecl b/automation/eclair_analysis/ECLAIR/analysis.ecl +index df0b551812..824283a989 100644 +--- a/automation/eclair_analysis/ECLAIR/analysis.ecl ++++ b/automation/eclair_analysis/ECLAIR/analysis.ecl +@@ -22,15 +22,15 @@ setq(analysis_kind,getenv("ANALYSIS_KIND")) + -doc_begin="These configurations serve the purpose of recognizing the 'mem*' macros as + their Standard Library equivalents." + +--config=MC3R1.R21.14,call_select+= ++-config=MC3A2.R21.14,call_select+= + {"macro(^memcmp$)&&any_arg(1..2, skip(__non_syntactic_paren_cast_stmts, node(string_literal)))", + "any()", violation, "%{__callslct_any_base_fmt()}", {{arg, "%{__callslct_arg_fmt()}"}}} + +--config=MC3R1.R21.15,call_args+= ++-config=MC3A2.R21.15,call_args+= + {"macro(^mem(cmp|move|cpy)$)", {1, 2}, "unqual_pointee_compatible", + "%{__argscmpr_culprit_fmt()}", "%{__argscmpr_evidence_fmt()}"} + +--config=MC3R1.R21.16,call_select+= ++-config=MC3A2.R21.16,call_select+= + {"macro(^memcmp$)&&any_arg(1..2, skip(__non_syntactic_paren_stmts, type(canonical(__memcmp_pte_types))))", + "any()", violation, "%{__callslct_any_base_fmt()}", {{arg,"%{__callslct_arg_type_fmt()}"}}} + +diff --git a/automation/eclair_analysis/ECLAIR/deviations.ecl b/automation/eclair_analysis/ECLAIR/deviations.ecl +index 0af1cb93d1..046d378087 100644 +--- a/automation/eclair_analysis/ECLAIR/deviations.ecl ++++ b/automation/eclair_analysis/ECLAIR/deviations.ecl +@@ -4,36 +4,36 @@ + + -doc_begin="The compiler implementation guarantees that the unreachable code is removed. + Constant expressions and unreachable branches of if and switch statements are expected." 
+--config=MC3R1.R2.1,+reports={safe,"first_area(^.*has an invariantly.*$)"} +--config=MC3R1.R2.1,+reports={safe,"first_area(^.*incompatible with labeled statement$)"} ++-config=MC3A2.R2.1,+reports={safe,"first_area(^.*has an invariantly.*$)"} ++-config=MC3A2.R2.1,+reports={safe,"first_area(^.*incompatible with labeled statement$)"} + -doc_end + + -doc_begin="Some functions are intended to be not referenced." +--config=MC3R1.R2.1,+reports={deliberate,"first_area(^.*is never referenced$)"} ++-config=MC3A2.R2.1,+reports={deliberate,"first_area(^.*is never referenced$)"} + -doc_end + + -doc_begin="Unreachability caused by calls to the following functions or macros is deliberate and there is no risk of code being unexpectedly left out." +--config=MC3R1.R2.1,statements+={deliberate,"macro(name(BUG||assert_failed))"} +--config=MC3R1.R2.1,statements+={deliberate, "call(decl(name(__builtin_unreachable||panic||do_unexpected_trap||machine_halt||machine_restart||reboot_or_halt)))"} ++-config=MC3A2.R2.1,statements+={deliberate,"macro(name(BUG||assert_failed))"} ++-config=MC3A2.R2.1,statements+={deliberate, "call(decl(name(__builtin_unreachable||panic||do_unexpected_trap||machine_halt||machine_restart||reboot_or_halt)))"} + -doc_end + + -doc_begin="Unreachability inside an ASSERT_UNREACHABLE() and analogous macro calls is deliberate and safe." +--config=MC3R1.R2.1,reports+={deliberate, "any_area(any_loc(any_exp(macro(name(ASSERT_UNREACHABLE||PARSE_ERR_RET||PARSE_ERR||FAIL_MSR||FAIL_CPUID)))))"} ++-config=MC3A2.R2.1,reports+={deliberate, "any_area(any_loc(any_exp(macro(name(ASSERT_UNREACHABLE||PARSE_ERR_RET||PARSE_ERR||FAIL_MSR||FAIL_CPUID)))))"} + -doc_end + + -doc_begin="The asm-offset files are not linked deliberately, since they are used to generate definitions for asm modules." 
+ -file_tag+={asm_offsets, "^xen/arch/(arm|x86)/(arm32|arm64|x86_64)/asm-offsets\\.c$"} +--config=MC3R1.R2.1,reports+={deliberate, "any_area(any_loc(file(asm_offsets)))"} ++-config=MC3A2.R2.1,reports+={deliberate, "any_area(any_loc(file(asm_offsets)))"} + -doc_end + + -doc_begin="Pure declarations (i.e., declarations without initialization) are + not executable, and therefore it is safe for them to be unreachable." +--config=MC3R1.R2.1,ignored_stmts+={"any()", "pure_decl()"} ++-config=MC3A2.R2.1,ignored_stmts+={"any()", "pure_decl()"} + -doc_end + + -doc_begin="The following autogenerated file is not linked deliberately." + -file_tag+={C_runtime_failures,"^automation/eclair_analysis/C-runtime-failures\\.rst\\.c$"} +--config=MC3R1.R2.1,reports+={deliberate, "any_area(any_loc(file(C_runtime_failures)))"} ++-config=MC3A2.R2.1,reports+={deliberate, "any_area(any_loc(file(C_runtime_failures)))"} + -doc_end + + -doc_begin="Proving compliance with respect to Rule 2.2 is generally impossible: +@@ -42,11 +42,11 @@ confidence that no evidence of errors in the program's logic has been missed due + to undetected violations of Rule 2.2, if any. Testing on time behavior gives us + confidence on the fact that, should the program contain dead code that is not + removed by the compiler, the resulting slowdown is negligible." +--config=MC3R1.R2.2,reports+={disapplied,"any()"} ++-config=MC3A2.R2.2,reports+={disapplied,"any()"} + -doc_end + + -doc_begin="Some labels are unused in certain build configurations, or are deliberately marked as unused, so that the compiler is entitled to remove them." +--config=MC3R1.R2.6,reports+={deliberate, "any_area(text(^.*__maybe_unused.*$))"} ++-config=MC3A2.R2.6,reports+={deliberate, "any_area(text(^.*__maybe_unused.*$))"} + -doc_end + + # +@@ -55,7 +55,7 @@ removed by the compiler, the resulting slowdown is negligible." + + -doc_begin="Comments starting with '/*' and containing hyperlinks are safe as + they are not instances of commented-out code." 
+--config=MC3R1.R3.1,reports+={safe, "first_area(text(^.*https?://.*$))"} ++-config=MC3A2.R3.1,reports+={safe, "first_area(text(^.*https?://.*$))"} + -doc_end + + # +@@ -63,26 +63,26 @@ they are not instances of commented-out code." + # + + -doc_begin="The directive has been accepted only for the ARM codebase." +--config=MC3R1.D4.3,reports+={disapplied,"!(any_area(any_loc(file(^xen/arch/arm/arm64/.*$))))"} ++-config=MC3A2.D4.3,reports+={disapplied,"!(any_area(any_loc(file(^xen/arch/arm/arm64/.*$))))"} + -doc_end + + -doc_begin="The inline asm in 'arm64/lib/bitops.c' is tightly coupled with the surronding C code that acts as a wrapper, so it has been decided not to add an additional encapsulation layer." + -file_tag+={arm64_bitops, "^xen/arch/arm/arm64/lib/bitops\\.c$"} +--config=MC3R1.D4.3,reports+={deliberate, "all_area(any_loc(file(arm64_bitops)&&any_exp(macro(^(bit|test)op$))))"} +--config=MC3R1.D4.3,reports+={deliberate, "any_area(any_loc(file(arm64_bitops))&&context(name(int_clear_mask16)))"} ++-config=MC3A2.D4.3,reports+={deliberate, "all_area(any_loc(file(arm64_bitops)&&any_exp(macro(^(bit|test)op$))))"} ++-config=MC3A2.D4.3,reports+={deliberate, "any_area(any_loc(file(arm64_bitops))&&context(name(int_clear_mask16)))"} + -doc_end + + -doc_begin="This header file is autogenerated or empty, therefore it poses no + risk if included more than once." + -file_tag+={empty_header, "^xen/arch/arm/efi/runtime\\.h$"} + -file_tag+={autogen_headers, "^xen/include/xen/compile\\.h$||^xen/include/generated/autoconf.h$||^xen/include/xen/hypercall-defs.h$"} +--config=MC3R1.D4.10,reports+={safe, "all_area(all_loc(file(empty_header||autogen_headers)))"} ++-config=MC3A2.D4.10,reports+={safe, "all_area(all_loc(file(empty_header||autogen_headers)))"} + -doc_end + + -doc_begin="Files that are intended to be included more than once do not need to + conform to the directive." 
+--config=MC3R1.D4.10,reports+={safe, "first_area(text(^/\\* This file is legitimately included multiple times\\. \\*/$, begin-4))"} +--config=MC3R1.D4.10,reports+={safe, "first_area(text(^/\\* Generated file, do not edit! \\*/$, begin-3))"} ++-config=MC3A2.D4.10,reports+={safe, "first_area(text(^/\\* This file is legitimately included multiple times\\. \\*/$, begin-4))"} ++-config=MC3A2.D4.10,reports+={safe, "first_area(text(^/\\* Generated file, do not edit! \\*/$, begin-3))"} + -doc_end + + # +@@ -91,50 +91,50 @@ conform to the directive." + + -doc_begin="The project adopted the rule with an exception listed in + 'docs/misra/rules.rst'" +--config=MC3R1.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^READ_SYSREG$))&&any_exp(macro(^WRITE_SYSREG$))))"} +--config=MC3R1.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^max(_t)?$))&&any_exp(macro(^min(_t)?$))))"} +--config=MC3R1.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^read[bwlq]$))&&any_exp(macro(^read[bwlq]_relaxed$))))"} +--config=MC3R1.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^per_cpu$))&&any_exp(macro(^this_cpu$))))"} +--config=MC3R1.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^__emulate_2op$))&&any_exp(macro(^__emulate_2op_nobyte$))))"} +--config=MC3R1.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^read_debugreg$))&&any_exp(macro(^write_debugreg$))))"} ++-config=MC3A2.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^READ_SYSREG$))&&any_exp(macro(^WRITE_SYSREG$))))"} ++-config=MC3A2.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^max(_t)?$))&&any_exp(macro(^min(_t)?$))))"} ++-config=MC3A2.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^read[bwlq]$))&&any_exp(macro(^read[bwlq]_relaxed$))))"} ++-config=MC3A2.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^per_cpu$))&&any_exp(macro(^this_cpu$))))"} ++-config=MC3A2.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^__emulate_2op$))&&any_exp(macro(^__emulate_2op_nobyte$))))"} 
++-config=MC3A2.R5.3,reports+={safe, "any_area(any_loc(any_exp(macro(^read_debugreg$))&&any_exp(macro(^write_debugreg$))))"} + -doc_end + + -doc_begin="Macros expanding to their own identifier (e.g., \"#define x x\") are deliberate." +--config=MC3R1.R5.5,reports+={deliberate, "all_area(macro(same_id_body())||!macro(!same_id_body()))"} ++-config=MC3A2.R5.5,reports+={deliberate, "all_area(macro(same_id_body())||!macro(!same_id_body()))"} + -doc_end + + -doc_begin="There is no clash between function like macros and not callable objects." +--config=MC3R1.R5.5,reports+={deliberate, "all_area(macro(function_like())||decl(any()))&&all_area(macro(any())||!decl(kind(function))&&!decl(__function_pointer_decls))"} ++-config=MC3A2.R5.5,reports+={deliberate, "all_area(macro(function_like())||decl(any()))&&all_area(macro(any())||!decl(kind(function))&&!decl(__function_pointer_decls))"} + -doc_end + + -doc_begin="Clashes between function names and macros are deliberate for string handling functions since some architectures may want to use their own arch-specific implementation." +--config=MC3R1.R5.5,reports+={deliberate, "all_area(all_loc(file(^xen/arch/x86/string\\.c|xen/include/xen/string\\.h|xen/lib/.*$)))"} ++-config=MC3A2.R5.5,reports+={deliberate, "all_area(all_loc(file(^xen/arch/x86/string\\.c|xen/include/xen/string\\.h|xen/lib/.*$)))"} + -doc_end + + -doc_begin="In libelf, clashes between macros and function names are deliberate and needed to prevent the use of undecorated versions of memcpy, memset and memmove." 
+--config=MC3R1.R5.5,reports+={deliberate, "any_area(decl(kind(function))||any_loc(macro(name(memcpy||memset||memmove))))&&any_area(any_loc(file(^xen/common/libelf/libelf-private\\.h$)))"} ++-config=MC3A2.R5.5,reports+={deliberate, "any_area(decl(kind(function))||any_loc(macro(name(memcpy||memset||memmove))))&&any_area(any_loc(file(^xen/common/libelf/libelf-private\\.h$)))"} + -doc_end + + -doc_begin="The type \"ret_t\" is deliberately defined multiple times, + depending on the guest." +--config=MC3R1.R5.6,reports+={deliberate,"any_area(any_loc(text(^.*ret_t.*$)))"} ++-config=MC3A2.R5.6,reports+={deliberate,"any_area(any_loc(text(^.*ret_t.*$)))"} + -doc_end + + -doc_begin="On X86, the types \"guest_intpte_t\", \"guest_l1e_t\" and + \"guest_l2e_t\" are deliberately defined multiple times, depending on the + number of guest paging levels." +--config=MC3R1.R5.6,reports+={deliberate,"any_area(any_loc(file(^xen/arch/x86/include/asm/guest_pt\\.h$)))&&any_area(any_loc(text(^.*(guest_intpte_t|guest_l[12]e_t).*$)))"} ++-config=MC3A2.R5.6,reports+={deliberate,"any_area(any_loc(file(^xen/arch/x86/include/asm/guest_pt\\.h$)))&&any_area(any_loc(text(^.*(guest_intpte_t|guest_l[12]e_t).*$)))"} + -doc_end + + -doc_begin="The following files are imported from the gnu-efi package." + -file_tag+={adopted_r5_6,"^xen/include/efi/.*$"} + -file_tag+={adopted_r5_6,"^xen/arch/.*/include/asm/.*/efibind\\.h$"} +--config=MC3R1.R5.6,reports+={deliberate,"any_area(any_loc(file(adopted_r5_6)))"} ++-config=MC3A2.R5.6,reports+={deliberate,"any_area(any_loc(file(adopted_r5_6)))"} + -doc_end + + -doc_begin="The project intentionally reuses tag names in order to have identifiers matching the applicable external specifications as well as established internal conventions. + As there is little possibility for developer confusion not resulting into compilation errors, the risk of renaming outweighs the potential advantages of compliance." 
+--config=MC3R1.R5.7,reports+={deliberate,"any()"} ++-config=MC3A2.R5.7,reports+={deliberate,"any()"} + -doc_end + + # +@@ -143,7 +143,7 @@ As there is little possibility for developer confusion not resulting into compil + + -doc_begin="It is safe to use certain octal constants the way they are defined + in specifications, manuals, and algorithm descriptions." +--config=MC3R1.R7.1,reports+={safe, "any_area(any_loc(any_exp(text(^.*octal-ok.*$))))"} ++-config=MC3A2.R7.1,reports+={safe, "any_area(any_loc(any_exp(text(^.*octal-ok.*$))))"} + -doc_end + + -doc_begin="Violations in files that maintainers have asked to not modify in the +@@ -156,17 +156,17 @@ context of R7.2." + -file_tag+={adopted_r7_2,"^xen/arch/x86/cpu/intel\\.c$"} + -file_tag+={adopted_r7_2,"^xen/arch/x86/cpu/amd\\.c$"} + -file_tag+={adopted_r7_2,"^xen/arch/x86/cpu/common\\.c$"} +--config=MC3R1.R7.2,reports+={deliberate,"any_area(any_loc(file(adopted_r7_2)))"} ++-config=MC3A2.R7.2,reports+={deliberate,"any_area(any_loc(file(adopted_r7_2)))"} + -doc_end + + -doc_begin="Violations caused by __HYPERVISOR_VIRT_START are related to the + particular use of it done in xen_mk_ulong." +--config=MC3R1.R7.2,reports+={deliberate,"any_area(any_loc(macro(name(BUILD_BUG_ON))))"} ++-config=MC3A2.R7.2,reports+={deliberate,"any_area(any_loc(macro(name(BUILD_BUG_ON))))"} + -doc_end + + -doc_begin="Allow pointers of non-character type as long as the pointee is + const-qualified." +--config=MC3R1.R7.4,same_pointee=false ++-config=MC3A2.R7.4,same_pointee=false + -doc_end + + # +@@ -174,7 +174,7 @@ const-qualified." + # + + -doc_begin="The type ret_t is deliberately used and defined as int or long depending on the architecture." 
+--config=MC3R1.R8.3,reports+={deliberate,"any_area(any_loc(text(^.*ret_t.*$)))"} ++-config=MC3A2.R8.3,reports+={deliberate,"any_area(any_loc(text(^.*ret_t.*$)))"} + -doc_end + + -doc_begin="The following files are imported from Linux and decompress.h defines a unique and documented interface towards all the (adopted) decompress functions." +@@ -184,71 +184,71 @@ const-qualified." + -file_tag+={adopted_decompress_r8_3,"^xen/common/unlzo\\.c$"} + -file_tag+={adopted_decompress_r8_3,"^xen/common/unxz\\.c$"} + -file_tag+={adopted_decompress_r8_3,"^xen/common/unzstd\\.c$"} +--config=MC3R1.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_decompress_r8_3)))&&any_area(any_loc(file(^xen/include/xen/decompress\\.h$)))"} ++-config=MC3A2.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_decompress_r8_3)))&&any_area(any_loc(file(^xen/include/xen/decompress\\.h$)))"} + -doc_end + + -doc_begin="Parameter name \"unused\" (with an optional numeric suffix) is deliberate and makes explicit the intention of not using such parameter within the function." +--config=MC3R1.R8.3,reports+={deliberate, "any_area(^.*parameter `unused[0-9]*'.*$)"} ++-config=MC3A2.R8.3,reports+={deliberate, "any_area(^.*parameter `unused[0-9]*'.*$)"} + -doc_end + + -doc_begin="The following file is imported from Linux: ignore for now." + -file_tag+={adopted_time_r8_3,"^xen/arch/x86/time\\.c$"} +--config=MC3R1.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_time_r8_3)))&&(any_area(any_loc(file(^xen/include/xen/time\\.h$)))||any_area(any_loc(file(^xen/arch/x86/include/asm/setup\\.h$))))"} ++-config=MC3A2.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_time_r8_3)))&&(any_area(any_loc(file(^xen/include/xen/time\\.h$)))||any_area(any_loc(file(^xen/arch/x86/include/asm/setup\\.h$))))"} + -doc_end + + -doc_begin="The following file is imported from Linux: ignore for now." 
+ -file_tag+={adopted_cpu_idle_r8_3,"^xen/arch/x86/acpi/cpu_idle\\.c$"} +--config=MC3R1.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_cpu_idle_r8_3)))&&any_area(any_loc(file(^xen/include/xen/pmstat\\.h$)))"} ++-config=MC3A2.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_cpu_idle_r8_3)))&&any_area(any_loc(file(^xen/include/xen/pmstat\\.h$)))"} + -doc_end + + -doc_begin="The following file is imported from Linux: ignore for now." + -file_tag+={adopted_mpparse_r8_3,"^xen/arch/x86/mpparse\\.c$"} +--config=MC3R1.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_mpparse_r8_3)))&&any_area(any_loc(file(^xen/arch/x86/include/asm/mpspec\\.h$)))"} ++-config=MC3A2.R8.3,reports+={deliberate,"any_area(any_loc(file(adopted_mpparse_r8_3)))&&any_area(any_loc(file(^xen/arch/x86/include/asm/mpspec\\.h$)))"} + -doc_end + + -doc_begin="The definitions present in this file are meant to generate definitions for asm modules, and are not called by C code. Therefore the absence of prior declarations is safe." + -file_tag+={asm_offsets, "^xen/arch/(arm|x86)/(arm32|arm64|x86_64)/asm-offsets\\.c$"} +--config=MC3R1.R8.4,reports+={safe, "first_area(any_loc(file(asm_offsets)))"} ++-config=MC3A2.R8.4,reports+={safe, "first_area(any_loc(file(asm_offsets)))"} + -doc_end + + -doc_begin="The functions defined in this file are meant to be called from gcc-generated code in a non-release build configuration. + Therefore the absence of prior declarations is safe." + -file_tag+={gcov, "^xen/common/coverage/gcov_base\\.c$"} +--config=MC3R1.R8.4,reports+={safe, "first_area(any_loc(file(gcov)))"} ++-config=MC3A2.R8.4,reports+={safe, "first_area(any_loc(file(gcov)))"} + -doc_end + + -doc_begin="Recognize the occurrence of current_stack_pointer as a declaration." 
+ -file_tag+={asm_defns, "^xen/arch/x86/include/asm/asm_defns\\.h$"} +--config=MC3R1.R8.4,declarations+={safe, "loc(file(asm_defns))&&^current_stack_pointer$"} ++-config=MC3A2.R8.4,declarations+={safe, "loc(file(asm_defns))&&^current_stack_pointer$"} + -doc_end + + -doc_begin="The function apei_(read|check|clear)_mce are dead code and are excluded from non-debug builds, therefore the absence of prior declarations is safe." +--config=MC3R1.R8.4,declarations+={safe, "^apei_(read|check|clear)_mce\\(.*$"} ++-config=MC3A2.R8.4,declarations+={safe, "^apei_(read|check|clear)_mce\\(.*$"} + -doc_end + + -doc_begin="asmlinkage is a marker to indicate that the function is only used to interface with asm modules." +--config=MC3R1.R8.4,declarations+={safe,"loc(text(^(?s).*asmlinkage.*$, -1..0))"} ++-config=MC3A2.R8.4,declarations+={safe,"loc(text(^(?s).*asmlinkage.*$, -1..0))"} + -doc_end + + -doc_begin="Given that bsearch and sort are defined with the attribute 'gnu_inline', it's deliberate not to have a prior declaration. + See Section \"6.33.1 Common Function Attributes\" of \"GCC_MANUAL\" for a full explanation of gnu_inline." + -file_tag+={bsearch_sort, "^xen/include/xen/(sort|lib)\\.h$"} +--config=MC3R1.R8.4,reports+={deliberate, "any_area(any_loc(file(bsearch_sort))&&decl(name(bsearch||sort)))"} ++-config=MC3A2.R8.4,reports+={deliberate, "any_area(any_loc(file(bsearch_sort))&&decl(name(bsearch||sort)))"} + -doc_end + + -doc_begin="first_valid_mfn is defined in this way because the current lack of NUMA support in Arm and PPC requires it." + -file_tag+={first_valid_mfn, "^xen/common/page_alloc\\.c$"} +--config=MC3R1.R8.4,declarations+={deliberate,"loc(file(first_valid_mfn))"} ++-config=MC3A2.R8.4,declarations+={deliberate,"loc(file(first_valid_mfn))"} + -doc_end + + -doc_begin="The following variables are compiled in multiple translation units + belonging to different executables and therefore are safe." 
+--config=MC3R1.R8.6,declarations+={safe, "name(current_stack_pointer||bsearch||sort)"} ++-config=MC3A2.R8.6,declarations+={safe, "name(current_stack_pointer||bsearch||sort)"} + -doc_end + + -doc_begin="Declarations without definitions are allowed (specifically when the + definition is compiled-out or optimized-out by the compiler)" +--config=MC3R1.R8.6,reports+={deliberate, "first_area(^.*has no definition$)"} ++-config=MC3A2.R8.6,reports+={deliberate, "first_area(^.*has no definition$)"} + -doc_end + + -doc_begin="The search procedure for Unix linkers is well defined, see ld(1) +@@ -259,11 +259,11 @@ the linker will include the appropriate file(s) from the archive\". + In Xen, thanks to the order in which file names appear in the build commands, + if arch-specific definitions are present, they get always linked in before + searching in the lib.a archive resulting from xen/lib." +--config=MC3R1.R8.6,declarations+={deliberate, "loc(file(^xen/lib/.*$))"} ++-config=MC3A2.R8.6,declarations+={deliberate, "loc(file(^xen/lib/.*$))"} + -doc_end + + -doc_begin="The gnu_inline attribute without static is deliberately allowed." +--config=MC3R1.R8.10,declarations+={deliberate,"property(gnu_inline)"} ++-config=MC3A2.R8.10,declarations+={deliberate,"property(gnu_inline)"} + -doc_end + + # +@@ -273,12 +273,12 @@ searching in the lib.a archive resulting from xen/lib." + -doc_begin="Violations in files that maintainers have asked to not modify in the + context of R9.1." + -file_tag+={adopted_r9_1,"^xen/arch/arm/arm64/lib/find_next_bit\\.c$"} +--config=MC3R1.R9.1,reports+={deliberate,"any_area(any_loc(file(adopted_r9_1)))"} ++-config=MC3A2.R9.1,reports+={deliberate,"any_area(any_loc(file(adopted_r9_1)))"} + -doc_end + + -doc_begin="The possibility of committing mistakes by specifying an explicit + dimension is higher than omitting the dimension." 
+--config=MC3R1.R9.5,reports+={deliberate, "any()"} ++-config=MC3A2.R9.5,reports+={deliberate, "any()"} + -doc_end + + # +@@ -286,45 +286,45 @@ dimension is higher than omitting the dimension." + # + + -doc_begin="The value-preserving conversions of integer constants are safe" +--config=MC3R1.R10.1,etypes={safe,"any()","preserved_integer_constant()"} +--config=MC3R1.R10.3,etypes={safe,"any()","preserved_integer_constant()"} +--config=MC3R1.R10.4,etypes={safe,"any()","preserved_integer_constant()||sibling(rhs,preserved_integer_constant())"} ++-config=MC3A2.R10.1,etypes={safe,"any()","preserved_integer_constant()"} ++-config=MC3A2.R10.3,etypes={safe,"any()","preserved_integer_constant()"} ++-config=MC3A2.R10.4,etypes={safe,"any()","preserved_integer_constant()||sibling(rhs,preserved_integer_constant())"} + -doc_end + + -doc_begin="Shifting non-negative integers to the right is safe." +--config=MC3R1.R10.1,etypes+={safe, ++-config=MC3A2.R10.1,etypes+={safe, + "stmt(node(binary_operator)&&operator(shr))", + "src_expr(definitely_in(0..))"} + -doc_end + + -doc_begin="Shifting non-negative integers to the left is safe if the result is + still non-negative." +--config=MC3R1.R10.1,etypes+={safe, ++-config=MC3A2.R10.1,etypes+={safe, + "stmt(node(binary_operator)&&operator(shl)&&definitely_in(0..))", + "src_expr(definitely_in(0..))"} + -doc_end + + -doc_begin="Bitwise logical operations on non-negative integers are safe." 
+--config=MC3R1.R10.1,etypes+={safe, ++-config=MC3A2.R10.1,etypes+={safe, + "stmt(node(binary_operator)&&operator(and||or||xor))", + "src_expr(definitely_in(0..))"} + -doc_end + + -doc_begin="The implicit conversion to Boolean for logical operator arguments is well known to all Xen developers to be a comparison with 0" +--config=MC3R1.R10.1,etypes+={safe, "stmt(operator(logical)||node(conditional_operator||binary_conditional_operator))", "dst_type(ebool||boolean)"} ++-config=MC3A2.R10.1,etypes+={safe, "stmt(operator(logical)||node(conditional_operator||binary_conditional_operator))", "dst_type(ebool||boolean)"} + -doc_end + + -doc_begin="The macro ISOLATE_LSB encapsulates a well-known pattern to obtain + a mask where only the lowest bit set in the argument is set, if any, for unsigned + integers arguments on two's complement architectures + (all the architectures supported by Xen satisfy this requirement)." +--config=MC3R1.R10.1,reports+={safe, "any_area(any_loc(any_exp(macro(^ISOLATE_LSB$))))"} ++-config=MC3A2.R10.1,reports+={safe, "any_area(any_loc(any_exp(macro(^ISOLATE_LSB$))))"} + -doc_end + + -doc_begin="XEN only supports architectures where signed integers are + representend using two's complement and all the XEN developers are aware of + this." +--config=MC3R1.R10.1,etypes+={safe, ++-config=MC3A2.R10.1,etypes+={safe, + "stmt(operator(and||or||xor||not||and_assign||or_assign||xor_assign))", + "any()"} + -doc_end +@@ -335,7 +335,7 @@ C language, GCC does not use the latitude given in C99 and C11 only to treat + certain aspects of signed `<<' as undefined. However, -fsanitize=shift (and + -fsanitize=undefined) will diagnose such cases. 
They are also diagnosed where + constant expressions are required.\"" +--config=MC3R1.R10.1,etypes+={safe, ++-config=MC3A2.R10.1,etypes+={safe, + "stmt(operator(shl||shr||shl_assign||shr_assign))", + "any()"} + -doc_end +@@ -345,7 +345,7 @@ constant expressions are required.\"" + # + + -doc_begin="The conversion from a function pointer to unsigned long or (void *) does not lose any information, provided that the target type has enough bits to store it." +--config=MC3R1.R11.1,casts+={safe, ++-config=MC3A2.R11.1,casts+={safe, + "from(type(canonical(__function_pointer_types))) + &&to(type(canonical(builtin(unsigned long)||pointer(builtin(void))))) + &&relation(definitely_preserves_value)" +@@ -353,14 +353,14 @@ constant expressions are required.\"" + -doc_end + + -doc_begin="The conversion from a function pointer to a boolean has a well-known semantics that do not lead to unexpected behaviour." +--config=MC3R1.R11.1,casts+={safe, ++-config=MC3A2.R11.1,casts+={safe, + "from(type(canonical(__function_pointer_types))) + &&kind(pointer_to_boolean)" + } + -doc_end + + -doc_begin="The conversion from a pointer to an incomplete type to unsigned long does not lose any information, provided that the target type has enough bits to store it." +--config=MC3R1.R11.2,casts+={safe, ++-config=MC3A2.R11.2,casts+={safe, + "from(type(any())) + &&to(type(canonical(builtin(unsigned long)))) + &&relation(definitely_preserves_value)" +@@ -368,20 +368,20 @@ constant expressions are required.\"" + -doc_end + + -doc_begin="Conversions to object pointers that have a pointee type with a smaller (i.e., less strict) alignment requirement are safe." +--config=MC3R1.R11.3,casts+={safe, ++-config=MC3A2.R11.3,casts+={safe, + "!relation(more_aligned_pointee)" + } + -doc_end + + -doc_begin="Conversions from and to integral types are safe, in the assumption that the target type has enough bits to store the value. 
+ See also Section \"4.7 Arrays and Pointers\" of \"GCC_MANUAL\"" +--config=MC3R1.R11.6,casts+={safe, ++-config=MC3A2.R11.6,casts+={safe, + "(from(type(canonical(integral())))||to(type(canonical(integral())))) + &&relation(definitely_preserves_value)"} + -doc_end + + -doc_begin="The conversion from a pointer to a boolean has a well-known semantics that do not lead to unexpected behaviour." +--config=MC3R1.R11.6,casts+={safe, ++-config=MC3A2.R11.6,casts+={safe, + "from(type(canonical(__pointer_types))) + &&kind(pointer_to_boolean)" + } +@@ -391,11 +391,11 @@ See also Section \"4.7 Arrays and Pointers\" of \"GCC_MANUAL\"" + with the provided offset. The resulting pointer is then immediately cast back to its + original type, which preserves the qualifier. This use is deemed safe. + Fixing this violation would require to increase code complexity and lower readability." +--config=MC3R1.R11.8,reports+={safe,"any_area(any_loc(any_exp(macro(^container_of$))))"} ++-config=MC3A2.R11.8,reports+={safe,"any_area(any_loc(any_exp(macro(^container_of$))))"} + -doc_end + + -doc_begin="This construct is used to check if the type is scalar, and for this purpose the use of 0 as a null pointer constant is deliberate." +--config=MC3R1.R11.9,reports+={deliberate, "any_area(any_loc(any_exp(macro(^__ACCESS_ONCE$))))" ++-config=MC3A2.R11.9,reports+={deliberate, "any_area(any_loc(any_exp(macro(^__ACCESS_ONCE$))))" + } + -doc_end + +@@ -405,16 +405,16 @@ Fixing this violation would require to increase code complexity and lower readab + + -doc_begin="All developers and reviewers can be safely assumed to be well aware + of the short-circuit evaluation strategy of such logical operators." +--config=MC3R1.R13.5,reports+={disapplied,"any()"} ++-config=MC3A2.R13.5,reports+={disapplied,"any()"} + -doc_end + + -doc_begin="Macros alternative_v?call[0-9] use sizeof and typeof to check that the argument types match the corresponding parameter ones." 
+--config=MC3R1.R13.6,reports+={deliberate,"any_area(any_loc(any_exp(macro(^alternative_vcall[0-9]$))&&file(^xen/arch/x86/include/asm/alternative\\.h*$)))"} ++-config=MC3A2.R13.6,reports+={deliberate,"any_area(any_loc(any_exp(macro(^alternative_vcall[0-9]$))&&file(^xen/arch/x86/include/asm/alternative\\.h*$)))"} + -config=B.UNEVALEFF,reports+={deliberate,"any_area(any_loc(any_exp(macro(^alternative_v?call[0-9]$))&&file(^xen/arch/x86/include/asm/alterantive\\.h*$)))"} + -doc_end + + -doc_begin="Anything, no matter how complicated, inside the BUILD_BUG_ON macro is subject to a compile-time evaluation without relevant side effects." +--config=MC3R1.R13.6,reports+={safe,"any_area(any_loc(any_exp(macro(name(BUILD_BUG_ON)))))"} ++-config=MC3A2.R13.6,reports+={safe,"any_area(any_loc(any_exp(macro(name(BUILD_BUG_ON)))))"} + -config=B.UNEVALEFF,reports+={safe,"any_area(any_loc(any_exp(macro(name(BUILD_BUG_ON)))))"} + -doc_end + +@@ -425,31 +425,31 @@ of the short-circuit evaluation strategy of such logical operators." + -doc_begin="The severe restrictions imposed by this rule on the use of for + statements are not balanced by the presumed facilitation of the peer review + activity." +--config=MC3R1.R14.2,reports+={disapplied,"any()"} ++-config=MC3A2.R14.2,reports+={disapplied,"any()"} + -doc_end + + -doc_begin="The XEN team relies on the fact that invariant conditions of 'if' statements and conditional operators are deliberate" +--config=MC3R1.R14.3,statements+={deliberate, "wrapped(any(),node(if_stmt||conditional_operator||binary_conditional_operator))" } ++-config=MC3A2.R14.3,statements+={deliberate, "wrapped(any(),node(if_stmt||conditional_operator||binary_conditional_operator))" } + -doc_end + + -doc_begin="Switches having a 'sizeof' operator as the condition are deliberate and have limited scope." 
+--config=MC3R1.R14.3,statements+={deliberate, "wrapped(any(),node(switch_stmt)&&child(cond, operator(sizeof)))" } ++-config=MC3A2.R14.3,statements+={deliberate, "wrapped(any(),node(switch_stmt)&&child(cond, operator(sizeof)))" } + -doc_end + + -doc_begin="The use of an invariant size argument in {put,get}_unsafe_size and array_access_ok, as defined in arch/x86(_64)?/include/asm/uaccess.h is deliberate and is deemed safe." + -file_tag+={x86_uaccess, "^xen/arch/x86(_64)?/include/asm/uaccess\\.h$"} +--config=MC3R1.R14.3,reports+={deliberate, "any_area(any_loc(file(x86_uaccess)&&any_exp(macro(^(put|get)_unsafe_size$))))"} +--config=MC3R1.R14.3,reports+={deliberate, "any_area(any_loc(file(x86_uaccess)&&any_exp(macro(^array_access_ok$))))"} ++-config=MC3A2.R14.3,reports+={deliberate, "any_area(any_loc(file(x86_uaccess)&&any_exp(macro(^(put|get)_unsafe_size$))))"} ++-config=MC3A2.R14.3,reports+={deliberate, "any_area(any_loc(file(x86_uaccess)&&any_exp(macro(^array_access_ok$))))"} + -doc_end + + -doc_begin="A controlling expression of 'if' and iteration statements having integer, character or pointer type has a semantics that is well-known to all Xen developers." 
+--config=MC3R1.R14.4,etypes+={deliberate, "any()", "src_type(integer||character)||src_expr(type(desugar(pointer(any()))))"} ++-config=MC3A2.R14.4,etypes+={deliberate, "any()", "src_type(integer||character)||src_expr(type(desugar(pointer(any()))))"} + -doc_end + + -doc_begin="The XEN team relies on the fact that the enum is_dying has the + constant with assigned value 0 act as false and the other ones as true, + therefore have the same behavior of a boolean" +--config=MC3R1.R14.4,etypes+={deliberate, "stmt(child(cond,child(expr,ref(^?::is_dying$))))","src_type(enum)"} ++-config=MC3A2.R14.4,etypes+={deliberate, "stmt(child(cond,child(expr,ref(^?::is_dying$))))","src_type(enum)"} + -doc_end + + # +@@ -460,43 +460,43 @@ therefore have the same behavior of a boolean" + therefore it is deemed better to leave such files as is." + -file_tag+={x86_emulate,"^xen/arch/x86/x86_emulate/.*$"} + -file_tag+={x86_svm_emulate,"^xen/arch/x86/hvm/svm/emulate\\.c$"} +--config=MC3R1.R16.2,reports+={deliberate, "any_area(any_loc(file(x86_emulate||x86_svm_emulate)))"} ++-config=MC3A2.R16.2,reports+={deliberate, "any_area(any_loc(file(x86_emulate||x86_svm_emulate)))"} + -doc_end + + -doc_begin="Switch clauses ending with continue, goto, return statements are + safe." +--config=MC3R1.R16.3,terminals+={safe, "node(continue_stmt||goto_stmt||return_stmt)"} ++-config=MC3A2.R16.3,terminals+={safe, "node(continue_stmt||goto_stmt||return_stmt)"} + -doc_end + + -doc_begin="Switch clauses ending with a call to a function that does not give + the control back (i.e., a function with attribute noreturn) are safe." +--config=MC3R1.R16.3,terminals+={safe, "call(property(noreturn))"} ++-config=MC3A2.R16.3,terminals+={safe, "call(property(noreturn))"} + -doc_end + + -doc_begin="Switch clauses ending with pseudo-keyword \"fallthrough\" are + safe." 
+--config=MC3R1.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(/fallthrough;/))))"} ++-config=MC3A2.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(/fallthrough;/))))"} + -doc_end + + -doc_begin="Switch clauses ending with failure method \"BUG()\" are safe." +--config=MC3R1.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(/BUG\\(\\);/))))"} ++-config=MC3A2.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(/BUG\\(\\);/))))"} + -doc_end + + -doc_begin="Switch clauses not ending with the break statement are safe if an + explicit comment indicating the fallthrough intention is present." +--config=MC3R1.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(^(?s).*/\\* [fF]all ?through.? \\*/.*$,0..1))))"} ++-config=MC3A2.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(^(?s).*/\\* [fF]all ?through.? \\*/.*$,0..1))))"} + -doc_end + + -doc_begin="Switch statements having a controlling expression of enum type deliberately do not have a default case: gcc -Wall enables -Wswitch which warns (and breaks the build as we use -Werror) if one of the enum labels is missing from the switch." +--config=MC3R1.R16.4,reports+={deliberate,'any_area(kind(context)&&^.* has no `default.*$&&stmt(node(switch_stmt)&&child(cond,skip(__non_syntactic_paren_stmts,type(canonical(enum_underlying_type(any())))))))'} ++-config=MC3A2.R16.4,reports+={deliberate,'any_area(kind(context)&&^.* has no `default.*$&&stmt(node(switch_stmt)&&child(cond,skip(__non_syntactic_paren_stmts,type(canonical(enum_underlying_type(any())))))))'} + -doc_end + + -doc_begin="A switch statement with a single switch clause and no default label may be used in place of an equivalent if statement if it is considered to improve readability." 
+--config=MC3R1.R16.4,switch_clauses+={deliberate,"switch(1)&&default(0)"} ++-config=MC3A2.R16.4,switch_clauses+={deliberate,"switch(1)&&default(0)"} + -doc_end + + -doc_begin="A switch statement with a single switch clause and no default label may be used in place of an equivalent if statement if it is considered to improve readability." +--config=MC3R1.R16.6,switch_clauses+={deliberate, "default(0)"} ++-config=MC3A2.R16.6,switch_clauses+={deliberate, "default(0)"} + -doc_end + + # +@@ -504,16 +504,16 @@ explicit comment indicating the fallthrough intention is present." + # + + -doc_begin="printf()-like functions are allowed to use the variadic features provided by stdarg.h." +--config=MC3R1.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(^.*printk\\(.*\\)$)))"} +--config=MC3R1.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(^.*printf\\(.*\\)$)))"} +--config=MC3R1.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(name(panic)&&kind(function))))"} +--config=MC3R1.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(name(elf_call_log_callback)&&kind(function))))"} +--config=MC3R1.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(name(vprintk_common)&&kind(function))))"} +--config=MC3R1.R17.1,macros+={hide , "^va_(arg|start|copy|end)$"} ++-config=MC3A2.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(^.*printk\\(.*\\)$)))"} ++-config=MC3A2.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(^.*printf\\(.*\\)$)))"} ++-config=MC3A2.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(name(panic)&&kind(function))))"} ++-config=MC3A2.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(name(elf_call_log_callback)&&kind(function))))"} 
++-config=MC3A2.R17.1,reports+={deliberate,"any_area(^.*va_list.*$&&context(ancestor_or_self(name(vprintk_common)&&kind(function))))"} ++-config=MC3A2.R17.1,macros+={hide , "^va_(arg|start|copy|end)$"} + -doc_end + + -doc_begin="Not using the return value of a function does not endanger safety if it coincides with an actual argument." +--config=MC3R1.R17.7,calls+={safe, "any()", "decl(name(__builtin_memcpy||__builtin_memmove||__builtin_memset||cpumask_check))"} ++-config=MC3A2.R17.7,calls+={safe, "any()", "decl(name(__builtin_memcpy||__builtin_memmove||__builtin_memset||cpumask_check))"} + -doc_end + + # +@@ -522,7 +522,7 @@ explicit comment indicating the fallthrough intention is present." + + -doc_begin="Flexible array members are deliberately used and XEN developers are aware of the dangers related to them: + unexpected result when the structure is given as argument to a sizeof() operator and the truncation in assignment between structures." +--config=MC3R1.R18.7,reports+={deliberate, "any()"} ++-config=MC3A2.R18.7,reports+={deliberate, "any()"} + -doc_end + + # +@@ -533,7 +533,7 @@ unexpected result when the structure is given as argument to a sizeof() operator + as function arguments; (2) as macro arguments; (3) as array indices; (4) as lhs + in assignments; (5) as initializers, possibly designated, in initalizer lists; + (6) as the constant expression in a switch clause label." 
+--config=MC3R1.R20.7,expansion_context= ++-config=MC3A2.R20.7,expansion_context= + {safe, "context(__call_expr_arg_contexts)"}, + {safe, "left_right(^[(,\\[]$,^[),\\]]$)"}, + {safe, "context(skip_to(__expr_non_syntactic_contexts, stmt_child(node(array_subscript_expr), subscript)))"}, +@@ -546,52 +546,52 @@ in assignments; (5) as initializers, possibly designated, in initalizer lists; + breaking the macro's logic; futhermore, the macro is only ever used in the context + of the IS_ENABLED or STATIC_IF/STATIC_IF_NOT macros, so it always receives a literal + 0 or 1 as input, posing no risk to safety." +--config=MC3R1.R20.7,reports+={safe, "any_area(any_loc(any_exp(macro(^___config_enabled$))))"} ++-config=MC3A2.R20.7,reports+={safe, "any_area(any_loc(any_exp(macro(^___config_enabled$))))"} + -doc_end + + -doc_begin="Violations due to the use of macros defined in files that are + not in scope for compliance are allowed, as that is imported code." + -file_tag+={gnu_efi_include, "^xen/include/efi/.*$"} + -file_tag+={acpi_cpu_idle, "^xen/arch/x86/acpi/cpu_idle\\.c$"} +--config=MC3R1.R20.7,reports+={safe, "any_area(any_loc(file(gnu_efi_include)))"} +--config=MC3R1.R20.7,reports+={safe, "any_area(any_loc(file(acpi_cpu_idle)))"} ++-config=MC3A2.R20.7,reports+={safe, "any_area(any_loc(file(gnu_efi_include)))"} ++-config=MC3A2.R20.7,reports+={safe, "any_area(any_loc(file(acpi_cpu_idle)))"} + -doc_end + + -doc_begin="To avoid compromising readability, the macros alternative_(v)?call[0-9] are allowed + not to parenthesize their arguments." +--config=MC3R1.R20.7,reports+={safe, "any_area(any_loc(any_exp(macro(^alternative_(v)?call[0-9]$))))"} ++-config=MC3A2.R20.7,reports+={safe, "any_area(any_loc(any_exp(macro(^alternative_(v)?call[0-9]$))))"} + -doc_end + + -doc_begin="The argument 'x' of the count_args_ macro can't be parenthesized as + the rule would require, without breaking the functionality of the macro. 
The uses + of this macro do not lead to developer confusion, and can thus be deviated." +--config=MC3R1.R20.7,reports+={safe, "any_area(any_loc(any_exp(macro(^count_args_$))))"} ++-config=MC3A2.R20.7,reports+={safe, "any_area(any_loc(any_exp(macro(^count_args_$))))"} + -doc_end + + -doc_begin="Uses of variadic macros that have one of their arguments defined as + a macro and used within the body for both ordinary parameter expansion and as an + operand to the # or ## operators have a behavior that is well-understood and + deliberate." +--config=MC3R1.R20.12,macros+={deliberate, "variadic()"} ++-config=MC3A2.R20.12,macros+={deliberate, "variadic()"} + -doc_end + + -doc_begin="Uses of a macro parameter for ordinary expansion and as an operand + to the # or ## operators within the following macros are deliberate, to provide + useful diagnostic messages to the user." +--config=MC3R1.R20.12,macros+={deliberate, "name(ASSERT||BUILD_BUG_ON||BUILD_BUG_ON_ZERO||RUNTIME_CHECK)"} ++-config=MC3A2.R20.12,macros+={deliberate, "name(ASSERT||BUILD_BUG_ON||BUILD_BUG_ON_ZERO||RUNTIME_CHECK)"} + -doc_end + + -doc_begin="The helper macro GENERATE_CASE may use a macro parameter for ordinary + expansion and token pasting to improve readability. Only instances where this + leads to a violation of the Rule are deviated." + -file_tag+={deliberate_generate_case, "^xen/arch/arm/vcpreg\\.c$"} +--config=MC3R1.R20.12,macros+={deliberate, "name(GENERATE_CASE)&&loc(file(deliberate_generate_case))"} ++-config=MC3A2.R20.12,macros+={deliberate, "name(GENERATE_CASE)&&loc(file(deliberate_generate_case))"} + -doc_end + + -doc_begin="The macro DEFINE is defined and used in excluded files asm-offsets.c. + This may still cause violations if entities outside these files are referred to + in the expansion." 
+--config=MC3R1.R20.12,macros+={deliberate, "name(DEFINE)&&loc(file(asm_offsets))"} ++-config=MC3A2.R20.12,macros+={deliberate, "name(DEFINE)&&loc(file(asm_offsets))"} + -doc_end + + # +@@ -601,7 +601,7 @@ in the expansion." + -doc_begin="or, and and xor are reserved identifiers because they constitute alternate + spellings for the corresponding operators (they are defined as macros by iso646.h). + However, Xen doesn't use standard library headers, so there is no risk of overlap." +--config=MC3R1.R21.2,reports+={safe, "any_area(stmt(ref(kind(label)&&^(or|and|xor|not)$)))"} ++-config=MC3A2.R21.2,reports+={safe, "any_area(stmt(ref(kind(label)&&^(or|and|xor|not)$)))"} + -doc_end + + -doc_begin="Xen does not use the functions provided by the Standard Library, but +@@ -610,8 +610,8 @@ The implementation of these functions is available in source form, so the undefi + or implementation-defined behaviors contemplated by the C Standard do not apply. + If some undefined or unspecified behavior does arise in the implementation, it + falls under the jurisdiction of other MISRA rules." +--config=MC3R1.R21.9,reports+={deliberate, "any()"} +--config=MC3R1.R21.10,reports+={deliberate, "any()"} ++-config=MC3A2.R21.9,reports+={deliberate, "any()"} ++-config=MC3A2.R21.10,reports+={deliberate, "any()"} + -doc_end + + # +@@ -636,7 +636,7 @@ falls under the jurisdiction of other MISRA rules." + programmers:no developers' confusion is not possible. In addition, adopted code + is assumed to work as is. Reports that are fully contained in adopted code are + hidden/tagged with the 'adopted' tag." 
+--service_selector={developer_confusion_guidelines,"^(MC3R1\\.R2\\.1|MC3R1\\.R2\\.2|MC3R1\\.R2\\.3|MC3R1\\.R2\\.4|MC3R1\\.R2\\.5|MC3R1\\.R2\\.6|MC3R1\\.R2\\.7|MC3R1\\.R4\\.1|MC3R1\\.R5\\.3|MC3R1\\.R5\\.6|MC3R1\\.R5\\.7|MC3R1\\.R5\\.8|MC3R1\\.R5\\.9|MC3R1\\.R7\\.1|MC3R1\\.R7\\.2|MC3R1\\.R7\\.3|MC3R1\\.R8\\.7|MC3R1\\.R8\\.8|MC3R1\\.R8\\.9|MC3R1\\.R8\\.11|MC3R1\\.R8\\.12|MC3R1\\.R8\\.13|MC3R1\\.R9\\.3|MC3R1\\.R9\\.4|MC3R1\\.R9\\.5|MC3R1\\.R10\\.2|MC3R1\\.R10\\.5|MC3R1\\.R10\\.6|MC3R1\\.R10\\.7|MC3R1\\.R10\\.8|MC3R1\\.R11\\.9|MC3R1\\.R12\\.1|MC3R1\\.R12\\.3|MC3R1\\.R12\\.4|MC3R1\\.R13\\.5|MC3R1\\.R14\\.1|MC3R1\\.R14\\.2|MC3R1\\.R14\\.3|MC3R1\\.R15\\.1|MC3R1\\.R15\\.2|MC3R1\\.R15\\.3|MC3R1\\.R15\\.4|MC3R1\\.R15\\.5|MC3R1\\.R15\\.6|MC3R1\\.R15\\.7|MC3R1\\.R16\\.1|MC3R1\\.R16\\.2|MC3R1\\.R16\\.3|MC3R1\\.R16\\.4|MC3R1\\.R16\\.5|MC3R1\\.R16\\.6|MC3R1\\.R16\\.7|MC3R1\\.R17\\.7|MC3R1\\.R17\\.8|MC3R1\\.R18\\.4|MC3R1\\.R18\\.5)$" ++-service_selector={developer_confusion_guidelines,"^(MC3A2\\.R2\\.1|MC3A2\\.R2\\.2|MC3A2\\.R2\\.3|MC3A2\\.R2\\.4|MC3A2\\.R2\\.5|MC3A2\\.R2\\.6|MC3A2\\.R2\\.7|MC3A2\\.R4\\.1|MC3A2\\.R5\\.3|MC3A2\\.R5\\.6|MC3A2\\.R5\\.7|MC3A2\\.R5\\.8|MC3A2\\.R5\\.9|MC3A2\\.R7\\.1|MC3A2\\.R7\\.2|MC3A2\\.R7\\.3|MC3A2\\.R8\\.7|MC3A2\\.R8\\.8|MC3A2\\.R8\\.9|MC3A2\\.R8\\.11|MC3A2\\.R8\\.12|MC3A2\\.R8\\.13|MC3A2\\.R9\\.3|MC3A2\\.R9\\.4|MC3A2\\.R9\\.5|MC3A2\\.R10\\.2|MC3A2\\.R10\\.5|MC3A2\\.R10\\.6|MC3A2\\.R10\\.7|MC3A2\\.R10\\.8|MC3A2\\.R11\\.9|MC3A2\\.R12\\.1|MC3A2\\.R12\\.3|MC3A2\\.R12\\.4|MC3A2\\.R13\\.5|MC3A2\\.R14\\.1|MC3A2\\.R14\\.2|MC3A2\\.R14\\.3|MC3A2\\.R15\\.1|MC3A2\\.R15\\.2|MC3A2\\.R15\\.3|MC3A2\\.R15\\.4|MC3A2\\.R15\\.5|MC3A2\\.R15\\.6|MC3A2\\.R15\\.7|MC3A2\\.R16\\.1|MC3A2\\.R16\\.2|MC3A2\\.R16\\.3|MC3A2\\.R16\\.4|MC3A2\\.R16\\.5|MC3A2\\.R16\\.6|MC3A2\\.R16\\.7|MC3A2\\.R17\\.7|MC3A2\\.R17\\.8|MC3A2\\.R18\\.4|MC3A2\\.R18\\.5)$" + } + -config=developer_confusion_guidelines,reports+={relied,adopted_report} + -doc_end +diff --git 
a/automation/eclair_analysis/ECLAIR/monitored.ecl b/automation/eclair_analysis/ECLAIR/monitored.ecl +index 9ffaebbdc3..464516e780 100644 +--- a/automation/eclair_analysis/ECLAIR/monitored.ecl ++++ b/automation/eclair_analysis/ECLAIR/monitored.ecl +@@ -1,104 +1,104 @@ + -doc_begin="A set of guidelines that are clean or that only have few violations left." +--enable=MC3R1.D1.1 +--enable=MC3R1.D2.1 +--enable=MC3R1.D4.1 +--enable=MC3R1.D4.10 +--enable=MC3R1.D4.11 +--enable=MC3R1.D4.12 +--enable=MC3R1.D4.14 +--enable=MC3R1.D4.3 +--enable=MC3R1.D4.7 +--enable=MC3R1.R10.1 +--enable=MC3R1.R10.2 +--enable=MC3R1.R1.1 +--enable=MC3R1.R11.1 +--enable=MC3R1.R11.7 +--enable=MC3R1.R11.8 +--enable=MC3R1.R11.9 +--enable=MC3R1.R12.5 +--enable=MC3R1.R1.3 +--enable=MC3R1.R13.6 +--enable=MC3R1.R13.1 +--enable=MC3R1.R1.4 +--enable=MC3R1.R14.1 +--enable=MC3R1.R14.4 +--enable=MC3R1.R16.2 +--enable=MC3R1.R16.3 +--enable=MC3R1.R16.4 +--enable=MC3R1.R16.6 +--enable=MC3R1.R16.7 +--enable=MC3R1.R17.1 +--enable=MC3R1.R17.3 +--enable=MC3R1.R17.4 +--enable=MC3R1.R17.5 +--enable=MC3R1.R17.6 +--enable=MC3R1.R19.1 +--enable=MC3R1.R20.12 +--enable=MC3R1.R20.13 +--enable=MC3R1.R20.14 +--enable=MC3R1.R20.4 +--enable=MC3R1.R20.7 +--enable=MC3R1.R20.9 +--enable=MC3R1.R2.1 +--enable=MC3R1.R21.10 +--enable=MC3R1.R21.13 +--enable=MC3R1.R21.17 +--enable=MC3R1.R21.18 +--enable=MC3R1.R21.19 +--enable=MC3R1.R21.20 +--enable=MC3R1.R21.21 +--enable=MC3R1.R21.9 +--enable=MC3R1.R2.2 +--enable=MC3R1.R22.2 +--enable=MC3R1.R22.4 +--enable=MC3R1.R22.5 +--enable=MC3R1.R22.6 +--enable=MC3R1.R2.6 +--enable=MC3R1.R3.1 +--enable=MC3R1.R3.2 +--enable=MC3R1.R4.1 +--enable=MC3R1.R4.2 +--enable=MC3R1.R5.1 +--enable=MC3R1.R5.2 +--enable=MC3R1.R5.3 +--enable=MC3R1.R5.4 +--enable=MC3R1.R5.5 +--enable=MC3R1.R5.6 +--enable=MC3R1.R6.1 +--enable=MC3R1.R6.2 +--enable=MC3R1.R7.1 +--enable=MC3R1.R7.2 +--enable=MC3R1.R7.3 +--enable=MC3R1.R7.4 +--enable=MC3R1.R8.1 +--enable=MC3R1.R8.10 +--enable=MC3R1.R8.12 +--enable=MC3R1.R8.14 
+--enable=MC3R1.R8.2 +--enable=MC3R1.R8.3 +--enable=MC3R1.R8.4 +--enable=MC3R1.R8.5 +--enable=MC3R1.R8.6 +--enable=MC3R1.R8.8 +--enable=MC3R1.R9.2 +--enable=MC3R1.R9.3 +--enable=MC3R1.R9.4 +--enable=MC3R1.R9.5 +--enable=MC3R1.R18.8 +--enable=MC3R1.R20.2 +--enable=MC3R1.R20.3 +--enable=MC3R1.R20.6 +--enable=MC3R1.R20.11 +--enable=MC3R1.R21.3 +--enable=MC3R1.R21.4 +--enable=MC3R1.R21.5 +--enable=MC3R1.R21.7 +--enable=MC3R1.R21.8 +--enable=MC3R1.R21.12 +--enable=MC3R1.R22.1 +--enable=MC3R1.R22.3 +--enable=MC3R1.R22.7 +--enable=MC3R1.R22.8 +--enable=MC3R1.R22.9 +--enable=MC3R1.R22.10 ++-enable=MC3A2.D1.1 ++-enable=MC3A2.D2.1 ++-enable=MC3A2.D4.1 ++-enable=MC3A2.D4.10 ++-enable=MC3A2.D4.11 ++-enable=MC3A2.D4.12 ++-enable=MC3A2.D4.14 ++-enable=MC3A2.D4.3 ++-enable=MC3A2.D4.7 ++-enable=MC3A2.R10.1 ++-enable=MC3A2.R10.2 ++-enable=MC3A2.R1.1 ++-enable=MC3A2.R11.1 ++-enable=MC3A2.R11.7 ++-enable=MC3A2.R11.8 ++-enable=MC3A2.R11.9 ++-enable=MC3A2.R12.5 ++-enable=MC3A2.R1.3 ++-enable=MC3A2.R13.6 ++-enable=MC3A2.R13.1 ++-enable=MC3A2.R1.4 ++-enable=MC3A2.R14.1 ++-enable=MC3A2.R14.4 ++-enable=MC3A2.R16.2 ++-enable=MC3A2.R16.3 ++-enable=MC3A2.R16.4 ++-enable=MC3A2.R16.6 ++-enable=MC3A2.R16.7 ++-enable=MC3A2.R17.1 ++-enable=MC3A2.R17.3 ++-enable=MC3A2.R17.4 ++-enable=MC3A2.R17.5 ++-enable=MC3A2.R17.6 ++-enable=MC3A2.R19.1 ++-enable=MC3A2.R20.12 ++-enable=MC3A2.R20.13 ++-enable=MC3A2.R20.14 ++-enable=MC3A2.R20.4 ++-enable=MC3A2.R20.7 ++-enable=MC3A2.R20.9 ++-enable=MC3A2.R2.1 ++-enable=MC3A2.R21.10 ++-enable=MC3A2.R21.13 ++-enable=MC3A2.R21.17 ++-enable=MC3A2.R21.18 ++-enable=MC3A2.R21.19 ++-enable=MC3A2.R21.20 ++-enable=MC3A2.R21.21 ++-enable=MC3A2.R21.9 ++-enable=MC3A2.R2.2 ++-enable=MC3A2.R22.2 ++-enable=MC3A2.R22.4 ++-enable=MC3A2.R22.5 ++-enable=MC3A2.R22.6 ++-enable=MC3A2.R2.6 ++-enable=MC3A2.R3.1 ++-enable=MC3A2.R3.2 ++-enable=MC3A2.R4.1 ++-enable=MC3A2.R4.2 ++-enable=MC3A2.R5.1 ++-enable=MC3A2.R5.2 ++-enable=MC3A2.R5.3 ++-enable=MC3A2.R5.4 ++-enable=MC3A2.R5.5 
++-enable=MC3A2.R5.6 ++-enable=MC3A2.R6.1 ++-enable=MC3A2.R6.2 ++-enable=MC3A2.R7.1 ++-enable=MC3A2.R7.2 ++-enable=MC3A2.R7.3 ++-enable=MC3A2.R7.4 ++-enable=MC3A2.R8.1 ++-enable=MC3A2.R8.10 ++-enable=MC3A2.R8.12 ++-enable=MC3A2.R8.14 ++-enable=MC3A2.R8.2 ++-enable=MC3A2.R8.3 ++-enable=MC3A2.R8.4 ++-enable=MC3A2.R8.5 ++-enable=MC3A2.R8.6 ++-enable=MC3A2.R8.8 ++-enable=MC3A2.R9.2 ++-enable=MC3A2.R9.3 ++-enable=MC3A2.R9.4 ++-enable=MC3A2.R9.5 ++-enable=MC3A2.R18.8 ++-enable=MC3A2.R20.2 ++-enable=MC3A2.R20.3 ++-enable=MC3A2.R20.6 ++-enable=MC3A2.R20.11 ++-enable=MC3A2.R21.3 ++-enable=MC3A2.R21.4 ++-enable=MC3A2.R21.5 ++-enable=MC3A2.R21.7 ++-enable=MC3A2.R21.8 ++-enable=MC3A2.R21.12 ++-enable=MC3A2.R22.1 ++-enable=MC3A2.R22.3 ++-enable=MC3A2.R22.7 ++-enable=MC3A2.R22.8 ++-enable=MC3A2.R22.9 ++-enable=MC3A2.R22.10 + -doc_end +diff --git a/automation/eclair_analysis/ECLAIR/tagging.ecl b/automation/eclair_analysis/ECLAIR/tagging.ecl +index b829655ca0..eec1d50f90 100644 +--- a/automation/eclair_analysis/ECLAIR/tagging.ecl ++++ b/automation/eclair_analysis/ECLAIR/tagging.ecl +@@ -20,94 +20,94 @@ + -doc_begin="Clean guidelines: new violations for these guidelines are not accepted." 
+ + -service_selector={clean_guidelines_common, +-"MC3R1.D1.1|| +-MC3R1.D2.1|| +-MC3R1.D4.1|| +-MC3R1.D4.11|| +-MC3R1.D4.14|| +-MC3R1.R1.1|| +-MC3R1.R1.3|| +-MC3R1.R1.4|| +-MC3R1.R2.2|| +-MC3R1.R2.6|| +-MC3R1.R3.1|| +-MC3R1.R3.2|| +-MC3R1.R4.1|| +-MC3R1.R4.2|| +-MC3R1.R5.1|| +-MC3R1.R5.2|| +-MC3R1.R5.4|| +-MC3R1.R5.6|| +-MC3R1.R6.1|| +-MC3R1.R6.2|| +-MC3R1.R7.1|| +-MC3R1.R7.2|| +-MC3R1.R7.4|| +-MC3R1.R8.1|| +-MC3R1.R8.2|| +-MC3R1.R8.5|| +-MC3R1.R8.6|| +-MC3R1.R8.8|| +-MC3R1.R8.10|| +-MC3R1.R8.12|| +-MC3R1.R8.14|| +-MC3R1.R9.2|| +-MC3R1.R9.3|| +-MC3R1.R9.4|| +-MC3R1.R9.5|| +-MC3R1.R10.2|| +-MC3R1.R11.7|| +-MC3R1.R11.9|| +-MC3R1.R12.5|| +-MC3R1.R14.1|| +-MC3R1.R14.4|| +-MC3R1.R16.7|| +-MC3R1.R17.1|| +-MC3R1.R17.3|| +-MC3R1.R17.4|| +-MC3R1.R17.5|| +-MC3R1.R17.6|| +-MC3R1.R18.8|| +-MC3R1.R20.2|| +-MC3R1.R20.3|| +-MC3R1.R20.4|| +-MC3R1.R20.6|| +-MC3R1.R20.9|| +-MC3R1.R20.11|| +-MC3R1.R20.12|| +-MC3R1.R20.13|| +-MC3R1.R20.14|| +-MC3R1.R21.3|| +-MC3R1.R21.4|| +-MC3R1.R21.5|| +-MC3R1.R21.7|| +-MC3R1.R21.8|| +-MC3R1.R21.9|| +-MC3R1.R21.10|| +-MC3R1.R21.12|| +-MC3R1.R21.13|| +-MC3R1.R21.19|| +-MC3R1.R21.21|| +-MC3R1.R22.1|| +-MC3R1.R22.2|| +-MC3R1.R22.3|| +-MC3R1.R22.4|| +-MC3R1.R22.5|| +-MC3R1.R22.6|| +-MC3R1.R22.7|| +-MC3R1.R22.8|| +-MC3R1.R22.9|| +-MC3R1.R22.10" ++"MC3A2.D1.1|| ++MC3A2.D2.1|| ++MC3A2.D4.1|| ++MC3A2.D4.11|| ++MC3A2.D4.14|| ++MC3A2.R1.1|| ++MC3A2.R1.3|| ++MC3A2.R1.4|| ++MC3A2.R2.2|| ++MC3A2.R2.6|| ++MC3A2.R3.1|| ++MC3A2.R3.2|| ++MC3A2.R4.1|| ++MC3A2.R4.2|| ++MC3A2.R5.1|| ++MC3A2.R5.2|| ++MC3A2.R5.4|| ++MC3A2.R5.6|| ++MC3A2.R6.1|| ++MC3A2.R6.2|| ++MC3A2.R7.1|| ++MC3A2.R7.2|| ++MC3A2.R7.4|| ++MC3A2.R8.1|| ++MC3A2.R8.2|| ++MC3A2.R8.5|| ++MC3A2.R8.6|| ++MC3A2.R8.8|| ++MC3A2.R8.10|| ++MC3A2.R8.12|| ++MC3A2.R8.14|| ++MC3A2.R9.2|| ++MC3A2.R9.3|| ++MC3A2.R9.4|| ++MC3A2.R9.5|| ++MC3A2.R10.2|| ++MC3A2.R11.7|| ++MC3A2.R11.9|| ++MC3A2.R12.5|| ++MC3A2.R14.1|| ++MC3A2.R14.4|| ++MC3A2.R16.7|| ++MC3A2.R17.1|| ++MC3A2.R17.3|| ++MC3A2.R17.4|| ++MC3A2.R17.5|| ++MC3A2.R17.6|| 
++MC3A2.R18.8|| ++MC3A2.R20.2|| ++MC3A2.R20.3|| ++MC3A2.R20.4|| ++MC3A2.R20.6|| ++MC3A2.R20.9|| ++MC3A2.R20.11|| ++MC3A2.R20.12|| ++MC3A2.R20.13|| ++MC3A2.R20.14|| ++MC3A2.R21.3|| ++MC3A2.R21.4|| ++MC3A2.R21.5|| ++MC3A2.R21.7|| ++MC3A2.R21.8|| ++MC3A2.R21.9|| ++MC3A2.R21.10|| ++MC3A2.R21.12|| ++MC3A2.R21.13|| ++MC3A2.R21.19|| ++MC3A2.R21.21|| ++MC3A2.R22.1|| ++MC3A2.R22.2|| ++MC3A2.R22.3|| ++MC3A2.R22.4|| ++MC3A2.R22.5|| ++MC3A2.R22.6|| ++MC3A2.R22.7|| ++MC3A2.R22.8|| ++MC3A2.R22.9|| ++MC3A2.R22.10" + } + + -setq=target,getenv("XEN_TARGET_ARCH") + + if(string_equal(target,"x86_64"), +- service_selector({"additional_clean_guidelines","MC3R1.D4.3"}) ++ service_selector({"additional_clean_guidelines","MC3A2.D4.3"}) + ) + + if(string_equal(target,"arm64"), +- service_selector({"additional_clean_guidelines","MC3R1.R16.6||MC3R1.R2.1||MC3R1.R5.3||MC3R1.R7.3"}) ++ service_selector({"additional_clean_guidelines","MC3A2.R16.6||MC3A2.R2.1||MC3A2.R5.3||MC3A2.R7.3"}) + ) + + -reports+={clean:added,"service(clean_guidelines_common||additional_clean_guidelines)"} +diff --git a/docs/misra/documenting-violations.rst b/docs/misra/documenting-violations.rst +index 8f1cbd83b8..d26377d5aa 100644 +--- a/docs/misra/documenting-violations.rst ++++ b/docs/misra/documenting-violations.rst +@@ -53,7 +53,7 @@ Here is an example to add a new justification in safe.json:: + | "analyser": { + | "cppcheck": "misra-c2012-20.7", + | "coverity": "misra_c_2012_rule_20_7_violation", +-| "eclair": "MC3R1.R20.7" ++| "eclair": "MC3A2.R20.7" + | }, + | "name": "R20.7 C macro parameters not used as expression", + | "text": "The macro parameters used in this [...]" +@@ -138,7 +138,7 @@ for the Rule 8.6: + + Eclair reports it in its web report, file xen/include/xen/kernel.h, line 68: + +-| MC3R1.R8.6 for program 'xen/xen-syms', variable '_start' has no definition ++| MC3A2.R8.6 for program 'xen/xen-syms', variable '_start' has no definition + + Also coverity reports it, here is an extract of the finding: + 
+@@ -165,7 +165,7 @@ We will prepare our entry in the safe.json database:: + | { + | "id": "SAF-1-safe", + | "analyser": { +-| "eclair": "MC3R1.R8.6", ++| "eclair": "MC3A2.R8.6", + | "coverity": "misra_c_2012_rule_8_6_violation" + | }, + | "name": "Rule 8.6: linker script defined symbols", +diff --git a/docs/misra/safe.json b/docs/misra/safe.json +index 3f18ef401c..a3b07d1c0d 100644 +--- a/docs/misra/safe.json ++++ b/docs/misra/safe.json +@@ -4,7 +4,7 @@ + { + "id": "SAF-0-safe", + "analyser": { +- "eclair": "MC3R1.R8.6", ++ "eclair": "MC3A2.R8.6", + "coverity": "misra_c_2012_rule_8_6_violation" + }, + "name": "Rule 8.6: linker script defined symbols", +@@ -13,7 +13,7 @@ + { + "id": "SAF-1-safe", + "analyser": { +- "eclair": "MC3R1.R8.4" ++ "eclair": "MC3A2.R8.4" + }, + "name": "Rule 8.4: asm-only definition", + "text": "Functions and variables used only by asm modules do not need to have a visible declaration prior to their definition." +@@ -21,23 +21,23 @@ + { + "id": "SAF-2-safe", + "analyser": { +- "eclair": "MC3R1.R10.1" ++ "eclair": "MC3A2.R10.1" + }, +- "name": "MC3R1.R10.1: use of an enumeration constant in an arithmetic operation", ++ "name": "MC3A2.R10.1: use of an enumeration constant in an arithmetic operation", + "text": "This violation can be fixed with a cast to (int) of the enumeration constant, but a deviation was chosen due to code readability (see also the comment in BITS_TO_LONGS)." + }, + { + "id": "SAF-3-safe", + "analyser": { +- "eclair": "MC3R1.R20.4" ++ "eclair": "MC3A2.R20.4" + }, +- "name": "MC3R1.R20.4: allow the definition of a macro with the same name as a keyword in some special cases", ++ "name": "MC3A2.R20.4: allow the definition of a macro with the same name as a keyword in some special cases", + "text": "The definition of a macro with the same name as a keyword can be useful in certain configurations to improve the guarantees that can be provided by Xen. See docs/misra/deviations.rst for a precise rationale for all such cases." 
+ }, + { + "id": "SAF-4-safe", + "analyser": { +- "eclair": "MC3R1.R17.1" ++ "eclair": "MC3A2.R17.1" + }, + "name": "Rule 17.1: internal helper functions made to break long running hypercalls into multiple calls.", + "text": "They need to take a variable number of arguments depending on the original hypercall they are trying to continue." +@@ -45,25 +45,25 @@ + { + "id": "SAF-5-safe", + "analyser": { +- "eclair": "MC3R1.R16.2" ++ "eclair": "MC3A2.R16.2" + }, +- "name": "MC3R1.R16.2: using a case label when the most closely-enclosing compound statement is not a switch statement", ++ "name": "MC3A2.R16.2: using a case label when the most closely-enclosing compound statement is not a switch statement", + "text": "A switch label enclosed by some compound statement that is not the body of a switch is permitted within local helper macros that are unlikely to be misused or misunderstood." + }, + { + "id": "SAF-6-safe", + "analyser": { +- "eclair": "MC3R1.R20.12" ++ "eclair": "MC3A2.R20.12" + }, +- "name": "MC3R1.R20.12: use of a macro argument that deliberately violates the Rule", ++ "name": "MC3A2.R20.12: use of a macro argument that deliberately violates the Rule", + "text": "A macro parameter that is itself a macro is intentionally used within the macro both as a regular parameter and for text replacement." + }, + { + "id": "SAF-7-safe", + "analyser": { +- "eclair": "MC3R1.R20.7" ++ "eclair": "MC3A2.R20.7" + }, +- "name": "MC3R1.R20.7: deliberately non-parenthesized macro argument", ++ "name": "MC3A2.R20.7: deliberately non-parenthesized macro argument", + "text": "A macro parameter expands to an expression that is non-parenthesized, as doing so would break the functionality." 
+ },
+ {
+--
+2.48.1
+
diff --git a/0009-9pfsd-fix-release-build-with-old-gcc.patch b/0009-9pfsd-fix-release-build-with-old-gcc.patch
deleted file mode 100644
index c91ae2d..0000000
--- a/0009-9pfsd-fix-release-build-with-old-gcc.patch
+++ /dev/null
@@ -1,33 +0,0 @@
-From 8ad5a8c5c36add2eee70a7253da4098ebffdb79b Mon Sep 17 00:00:00 2001
-From: Jan Beulich
-Date: Thu, 8 Aug 2024 13:47:44 +0200
-Subject: [PATCH 09/83] 9pfsd: fix release build with old gcc
-
-Being able to recognize that "par" is reliably initialized on the 1st
-loop iteration requires not overly old compilers.
-
-Fixes: 7809132b1a1d ("tools/xen-9pfsd: add 9pfs response generation support")
-Signed-off-by: Jan Beulich
-Reviewed-by: Juergen Gross
-master commit: 984cb316cb27b53704c607e640a7dd2763b898ab
-master date: 2024-08-02 08:44:22 +0200
----
- tools/9pfsd/io.c | 2 +-
- 1 file changed, 1 insertion(+), 1 deletion(-)
-
-diff --git a/tools/9pfsd/io.c b/tools/9pfsd/io.c
-index df1be3df7d..468e0241f5 100644
---- a/tools/9pfsd/io.c
-+++ b/tools/9pfsd/io.c
-@@ -196,7 +196,7 @@ static void fill_buffer_at(void **data, const char *fmt, ...);
- static void vfill_buffer_at(void **data, const char *fmt, va_list ap)
- {
- const char *f;
-- const void *par;
-+ const void *par = NULL; /* old gcc */
- const char *str_val;
- const struct p9_qid *qid;
- const struct p9_stat *stat;
---
-2.47.0
-
diff --git a/0009-MISRA-Unmark-Rules-1.1-and-2.1-as-clean-following-Ec.patch b/0009-MISRA-Unmark-Rules-1.1-and-2.1-as-clean-following-Ec.patch
new file mode 100644
index 0000000..0e3eaae
--- /dev/null
+++ b/0009-MISRA-Unmark-Rules-1.1-and-2.1-as-clean-following-Ec.patch
@@ -0,0 +1,47 @@
+From 8dd897e69119492989aaa034967f3a887f590197 Mon Sep 17 00:00:00 2001
+From: Andrew Cooper
+Date: Tue, 17 Dec 2024 17:04:59 +0000
+Subject: [PATCH 09/53] MISRA: Unmark Rules 1.1 and 2.1 as clean following
+ Eclair upgrade
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Updating the Eclair runner
has had knock-on effects with previously-clean +rules now flagging violations: + + - x86: Rule 1.1, 1940 violations + - ARM64: Rule 1.1, 725 violations, Rule 2.1, 255 violations + +Fixes: 631f535a3d4f ("xen: update ECLAIR service identifiers from MC3R1 to MC3A2.") +Signed-off-by: Andrew Cooper +Acked-by: Roger Pau Monné +(cherry picked from commit 171cb318deaa0be786cc3af3599c72e8909e60f9) +--- + automation/eclair_analysis/ECLAIR/tagging.ecl | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/automation/eclair_analysis/ECLAIR/tagging.ecl b/automation/eclair_analysis/ECLAIR/tagging.ecl +index eec1d50f90..91f27243dd 100644 +--- a/automation/eclair_analysis/ECLAIR/tagging.ecl ++++ b/automation/eclair_analysis/ECLAIR/tagging.ecl +@@ -25,7 +25,6 @@ MC3A2.D2.1|| + MC3A2.D4.1|| + MC3A2.D4.11|| + MC3A2.D4.14|| +-MC3A2.R1.1|| + MC3A2.R1.3|| + MC3A2.R1.4|| + MC3A2.R2.2|| +@@ -107,7 +106,7 @@ if(string_equal(target,"x86_64"), + ) + + if(string_equal(target,"arm64"), +- service_selector({"additional_clean_guidelines","MC3A2.R16.6||MC3A2.R2.1||MC3A2.R5.3||MC3A2.R7.3"}) ++ service_selector({"additional_clean_guidelines","MC3A2.R16.6||MC3A2.R5.3||MC3A2.R7.3"}) + ) + + -reports+={clean:added,"service(clean_guidelines_common||additional_clean_guidelines)"} +-- +2.48.1 + diff --git a/0010-tools-xg-increase-LZMA_BLOCK_SIZE-for-uncompressing-.patch b/0010-tools-xg-increase-LZMA_BLOCK_SIZE-for-uncompressing-.patch new file mode 100644 index 0000000..3eca53f --- /dev/null +++ b/0010-tools-xg-increase-LZMA_BLOCK_SIZE-for-uncompressing-.patch @@ -0,0 +1,67 @@ +From 0a14438052e65507041cf205c517ab0b4c786813 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?= + +Date: Tue, 21 Jan 2025 09:17:44 +0100 +Subject: [PATCH 10/53] tools/xg: increase LZMA_BLOCK_SIZE for uncompressing + the kernel +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Linux 6.12-rc2 fails to decompress with the current 128MiB, 
contrary to +the code comment. It results in a failure like this: + + domainbuilder: detail: xc_dom_kernel_file: filename="/var/lib/qubes/vm-kernels/6.12-rc2-1.1.fc37/vmlinuz" + domainbuilder: detail: xc_dom_malloc_filemap : 12104 kB + domainbuilder: detail: xc_dom_module_file: filename="/var/lib/qubes/vm-kernels/6.12-rc2-1.1.fc37/initramfs" + domainbuilder: detail: xc_dom_malloc_filemap : 7711 kB + domainbuilder: detail: xc_dom_boot_xen_init: ver 4.19, caps xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 + domainbuilder: detail: xc_dom_parse_image: called + domainbuilder: detail: xc_dom_find_loader: trying multiboot-binary loader ... + domainbuilder: detail: loader probe failed + domainbuilder: detail: xc_dom_find_loader: trying HVM-generic loader ... + domainbuilder: detail: loader probe failed + domainbuilder: detail: xc_dom_find_loader: trying Linux bzImage loader ... + domainbuilder: detail: _xc_try_lzma_decode: XZ decompression error: Memory usage limit reached + xc: error: panic: xg_dom_bzimageloader.c:761: xc_dom_probe_bzimage_kernel unable to XZ decompress kernel: Invalid kernel + domainbuilder: detail: loader probe failed + domainbuilder: detail: xc_dom_find_loader: trying ELF-generic loader ... + domainbuilder: detail: loader probe failed + xc: error: panic: xg_dom_core.c:689: xc_dom_find_loader: no loader found: Invalid kernel + libxl: error: libxl_dom.c:566:libxl__build_dom: xc_dom_parse_image failed + +The important part: XZ decompression error: Memory usage limit reached + +This looks to be related to the following change in Linux: +8653c909922743bceb4800e5cc26087208c9e0e6 ("xz: use 128 MiB dictionary and force single-threaded mode") + +Fix this by increasing the block size to 256MiB. And remove the +misleading comment (from lack of better ideas). 
+ +Signed-off-by: Marek Marczykowski-Górecki +Reviewed-by: Roger Pau Monné +Acked-by: Anthony PERARD +Acked-by: Andrew Cooper +master commit: e6472d46680ccd2b804ad73c19042a5811d036f0 +master date: 2024-12-19 17:33:54 +0000 +--- + tools/libs/guest/xg_dom_bzimageloader.c | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/tools/libs/guest/xg_dom_bzimageloader.c b/tools/libs/guest/xg_dom_bzimageloader.c +index c6ee6d83e7..1fb4e5a1f7 100644 +--- a/tools/libs/guest/xg_dom_bzimageloader.c ++++ b/tools/libs/guest/xg_dom_bzimageloader.c +@@ -272,8 +272,7 @@ static int _xc_try_lzma_decode( + return retval; + } + +-/* 128 Mb is the minimum size (half-way) documented to work for all inputs. */ +-#define LZMA_BLOCK_SIZE (128*1024*1024) ++#define LZMA_BLOCK_SIZE (256*1024*1024) + + static int xc_try_xz_decode( + struct xc_dom_image *dom, void **blob, size_t *size) +-- +2.48.1 + diff --git a/0010-x86-emul-Fix-misaligned-IO-breakpoint-behaviour-in-P.patch b/0010-x86-emul-Fix-misaligned-IO-breakpoint-behaviour-in-P.patch deleted file mode 100644 index b2e5e83..0000000 --- a/0010-x86-emul-Fix-misaligned-IO-breakpoint-behaviour-in-P.patch +++ /dev/null @@ -1,41 +0,0 @@ -From 033060ee6e05f9e86ef1a51674864b55dc15e62c Mon Sep 17 00:00:00 2001 -From: Matthew Barnes -Date: Thu, 8 Aug 2024 13:48:03 +0200 -Subject: [PATCH 10/83] x86/emul: Fix misaligned IO breakpoint behaviour in PV - guests - -When hardware breakpoints are configured on misaligned IO ports, the -hardware will mask the addresses based on the breakpoint width during -comparison. - -For PV guests, misaligned IO breakpoints do not behave the same way, and -therefore yield different results. - -This patch tweaks the emulation of IO breakpoints for PV guests such -that they reproduce the same behaviour as hardware. 
- -Fixes: bec9e3205018 ("x86: emulate I/O port access breakpoints") -Signed-off-by: Matthew Barnes -Reviewed-by: Jan Beulich -master commit: 08aacc392d86d4c7dbebdb5e664060ae2af72057 -master date: 2024-08-08 13:27:50 +0200 ---- - xen/arch/x86/pv/emul-priv-op.c | 2 ++ - 1 file changed, 2 insertions(+) - -diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c -index f101510a1b..aa11ecadaa 100644 ---- a/xen/arch/x86/pv/emul-priv-op.c -+++ b/xen/arch/x86/pv/emul-priv-op.c -@@ -346,6 +346,8 @@ static unsigned int check_guest_io_breakpoint(struct vcpu *v, - case DR_LEN_8: width = 8; break; - } - -+ start &= ~(width - 1UL); -+ - if ( (start < (port + len)) && ((start + width) > port) ) - match |= 1u << i; - } --- -2.47.0 - diff --git a/0011-x86-IOMMU-move-tracking-in-iommu_identity_mapping.patch b/0011-x86-IOMMU-move-tracking-in-iommu_identity_mapping.patch deleted file mode 100644 index cc34d00..0000000 --- a/0011-x86-IOMMU-move-tracking-in-iommu_identity_mapping.patch +++ /dev/null @@ -1,111 +0,0 @@ -From c61d4264d26d1ffb26563bfb6dc2f0b06cd72128 Mon Sep 17 00:00:00 2001 -From: Teddy Astie -Date: Tue, 13 Aug 2024 16:47:19 +0200 -Subject: [PATCH 11/83] x86/IOMMU: move tracking in iommu_identity_mapping() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -If for some reason xmalloc() fails after having mapped the reserved -regions, an error is reported, but the regions remain mapped in the P2M. - -Similarly if an error occurs during set_identity_p2m_entry() (except on -the first call), the partial mappings of the region would be retained -without being tracked anywhere, and hence without there being a way to -remove them again from the domain's P2M. - -Move the setting up of the list entry ahead of trying to map the region. -In cases other than the first mapping failing, keep record of the full -region, such that a subsequent unmapping request can be properly torn -down. 
- -To compensate for the potentially excess unmapping requests, don't log a -warning from p2m_remove_identity_entry() when there really was nothing -mapped at a given GFN. - -This is XSA-460 / CVE-2024-31145. - -Fixes: 2201b67b9128 ("VT-d: improve RMRR region handling") -Fixes: c0e19d7c6c42 ("IOMMU: generalize VT-d's tracking of mapped RMRR regions") -Signed-off-by: Teddy Astie -Signed-off-by: Jan Beulich -Reviewed-by: Roger Pau Monné -master commit: beadd68b5490ada053d72f8a9ce6fd696d626596 -master date: 2024-08-13 16:36:40 +0200 ---- - xen/arch/x86/mm/p2m.c | 8 +++++--- - xen/drivers/passthrough/x86/iommu.c | 30 ++++++++++++++++++++--------- - 2 files changed, 26 insertions(+), 12 deletions(-) - -diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c -index e7e327d6a6..1739133fc2 100644 ---- a/xen/arch/x86/mm/p2m.c -+++ b/xen/arch/x86/mm/p2m.c -@@ -1267,9 +1267,11 @@ int p2m_remove_identity_entry(struct domain *d, unsigned long gfn_l) - else - { - gfn_unlock(p2m, gfn, 0); -- printk(XENLOG_G_WARNING -- "non-identity map d%d:%lx not cleared (mapped to %lx)\n", -- d->domain_id, gfn_l, mfn_x(mfn)); -+ if ( (p2mt != p2m_invalid && p2mt != p2m_mmio_dm) || -+ a != p2m_access_n || !mfn_eq(mfn, INVALID_MFN) ) -+ printk(XENLOG_G_WARNING -+ "non-identity map %pd:%lx not cleared (mapped to %lx)\n", -+ d, gfn_l, mfn_x(mfn)); - ret = 0; - } - -diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c -index cc0062b027..8b1e0596b8 100644 ---- a/xen/drivers/passthrough/x86/iommu.c -+++ b/xen/drivers/passthrough/x86/iommu.c -@@ -267,24 +267,36 @@ int iommu_identity_mapping(struct domain *d, p2m_access_t p2ma, - if ( p2ma == p2m_access_x ) - return -ENOENT; - -- while ( base_pfn < end_pfn ) -- { -- int err = set_identity_p2m_entry(d, base_pfn, p2ma, flag); -- -- if ( err ) -- return err; -- base_pfn++; -- } -- - map = xmalloc(struct identity_map); - if ( !map ) - return -ENOMEM; -+ - map->base = base; - map->end = end; - map->access = p2ma; - 
map->count = 1; -+ -+ /* -+ * Insert into list ahead of mapping, so the range can be found when -+ * trying to clean up. -+ */ - list_add_tail(&map->list, &hd->arch.identity_maps); - -+ for ( ; base_pfn < end_pfn; ++base_pfn ) -+ { -+ int err = set_identity_p2m_entry(d, base_pfn, p2ma, flag); -+ -+ if ( !err ) -+ continue; -+ -+ if ( (map->base >> PAGE_SHIFT_4K) == base_pfn ) -+ { -+ list_del(&map->list); -+ xfree(map); -+ } -+ return err; -+ } -+ - return 0; - } - --- -2.47.0 - diff --git a/0011-xen-arch-x86-make-objdump-output-user-locale-agnosti.patch b/0011-xen-arch-x86-make-objdump-output-user-locale-agnosti.patch new file mode 100644 index 0000000..d9dcc96 --- /dev/null +++ b/0011-xen-arch-x86-make-objdump-output-user-locale-agnosti.patch @@ -0,0 +1,34 @@ +From d260f797843417e65c4d035198f7a171b0dfa1f1 Mon Sep 17 00:00:00 2001 +From: Maximilian Engelhardt +Date: Tue, 21 Jan 2025 09:17:58 +0100 +Subject: [PATCH 11/53] xen/arch/x86: make objdump output user locale agnostic + +The objdump output is fed to grep, so make sure it doesn't change with +different user locales and break the grep parsing. +This problem was identified while updating xen in Debian and the fix is +needed for generating reproducible builds in varying environments. 
+ +Signed-off-by: Maximilian Engelhardt +Acked-by: Andrew Cooper +master commit: 0d729221ab74c5d2571e71501dc63838acbf752a +master date: 2024-12-30 21:40:37 +0000 +--- + xen/arch/x86/arch.mk | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/xen/arch/x86/arch.mk b/xen/arch/x86/arch.mk +index a683d4bedc..b88d097a84 100644 +--- a/xen/arch/x86/arch.mk ++++ b/xen/arch/x86/arch.mk +@@ -111,7 +111,7 @@ endif + ifeq ($(XEN_BUILD_PE),y) + + # Check if the linker produces fixups in PE by default +-efi-nr-fixups := $(shell $(OBJDUMP) -p $(efi-check).efi | grep '^[[:blank:]]*reloc[[:blank:]]*[0-9][[:blank:]].*DIR64$$' | wc -l) ++efi-nr-fixups := $(shell LC_ALL=C $(OBJDUMP) -p $(efi-check).efi | grep '^[[:blank:]]*reloc[[:blank:]]*[0-9][[:blank:]].*DIR64$$' | wc -l) + + ifeq ($(efi-nr-fixups),2) + MKRELOC := : +-- +2.48.1 + diff --git a/0012-x86-pass-through-documents-as-security-unsupported-w.patch b/0012-x86-pass-through-documents-as-security-unsupported-w.patch deleted file mode 100644 index 31dde8d..0000000 --- a/0012-x86-pass-through-documents-as-security-unsupported-w.patch +++ /dev/null @@ -1,42 +0,0 @@ -From 3e8a2217f211d49dd771f7918d72df057121109f Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 13 Aug 2024 16:48:13 +0200 -Subject: [PATCH 12/83] x86/pass-through: documents as security-unsupported - when sharing resources - -When multiple devices share resources and one of them is to be passed -through to a guest, security of the entire system and of respective -guests individually cannot really be guaranteed without knowing -internals of any of the involved guests. Therefore such a configuration -cannot really be security-supported, yet making that explicit was so far -missing. - -This is XSA-461 / CVE-2024-31146. 
- -Signed-off-by: Jan Beulich -Reviewed-by: Juergen Gross -master commit: 9c94eda1e3790820699a6de3f6a7c959ecf30600 -master date: 2024-08-13 16:37:25 +0200 ---- - SUPPORT.md | 5 +++++ - 1 file changed, 5 insertions(+) - -diff --git a/SUPPORT.md b/SUPPORT.md -index 8b998d9bc7..1d8b38cbd0 100644 ---- a/SUPPORT.md -+++ b/SUPPORT.md -@@ -841,6 +841,11 @@ This feature is not security supported: see https://xenbits.xen.org/xsa/advisory - - Only systems using IOMMUs are supported. - -+Passing through of devices sharing resources with another device is not -+security supported. Such sharing could e.g. be the same line interrupt being -+used by multiple devices, one of which is to be passed through, or two such -+devices having memory BARs within the same 4k page. -+ - Not compatible with migration, populate-on-demand, altp2m, - introspection, memory sharing, or memory paging. - --- -2.47.0 - diff --git a/0012-x86-spec-ctrl-Support-for-SRSO_U-S_NO-and-SRSO_MSR_F.patch b/0012-x86-spec-ctrl-Support-for-SRSO_U-S_NO-and-SRSO_MSR_F.patch new file mode 100644 index 0000000..5a8ba0b --- /dev/null +++ b/0012-x86-spec-ctrl-Support-for-SRSO_U-S_NO-and-SRSO_MSR_F.patch @@ -0,0 +1,350 @@ +From 816235e311e1dca0b2e37b1036e8e7e7f5f22433 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Tue, 21 Jan 2025 09:18:08 +0100 +Subject: [PATCH 12/53] x86/spec-ctrl: Support for SRSO_U/S_NO and SRSO_MSR_FIX +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +AMD have updated the SRSO whitepaper[1] with further information. These +features exist on AMD Zen5 CPUs and are necessary for Xen to use. + +The two features are in principle unrelated: + + * SRSO_U/S_NO is an enumeration saying that SRSO attacks can't cross the + User(CPL3) / Supervisor(CPL<3) boundary. i.e. Xen don't need to use + IBPB-on-entry for PV64. PV32 guests are explicitly unsupported for + speculative issues, and excluded from consideration for simplicity. 
+ + * SRSO_MSR_FIX is an enumeration identifying that the BP_SPEC_REDUCE bit is + available in MSR_BP_CFG. When set, SRSO attacks can't cross the host/guest + boundary. i.e. Xen don't need to use IBPB-on-entry for HVM. + +Extend ibpb_calculations() to account for these when calculating +opt_ibpb_entry_{pv,hvm} defaults. Add a `bp-spec-reduce=` option to +control the use of BP_SPEC_REDUCE, with it active by default. + +Because MSR_BP_CFG is core-scoped with a race condition updating it, repurpose +amd_check_erratum_1485() into amd_check_bp_cfg() and calculate all updates at +once. + +Xen also needs to to advertise SRSO_U/S_NO to guests to allow the guest kernel +to skip SRSO mitigations too: + + * This is trivial for HVM guests. It is also is accurate for PV32 guests + too, but we have already excluded them from consideration, and do so again + here to simplify the policy logic. + + * As written, SRSO_U/S_NO does not help for the PV64 user->kernel boundary. + However, after discussing with AMD, an implementation detail of having + BP_SPEC_REDUCE active causes the PV64 user->kernel boundary to have the + property described by SRSO_U/S_NO, so we can advertise SRSO_U/S_NO to + guests when the BP_SPEC_REDUCE precondition is met. + +Finally, fix a typo in the SRSO_NO's comment. 
+ +[1] https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf +Signed-off-by: Andrew Cooper +Reviewed-by: Roger Pau Monné +master commit: a1746cd4434dd27ca2da8430dfb10edc76264bb3 +master date: 2025-01-02 18:44:49 +0000 +--- + docs/misc/xen-command-line.pandoc | 9 +++- + xen/arch/x86/cpu-policy.c | 21 +++++++++ + xen/arch/x86/cpu/amd.c | 29 +++++++++--- + xen/arch/x86/include/asm/msr-index.h | 1 + + xen/arch/x86/include/asm/spec_ctrl.h | 1 + + xen/arch/x86/spec_ctrl.c | 51 ++++++++++++++++----- + xen/include/public/arch-x86/cpufeatureset.h | 4 +- + 7 files changed, 96 insertions(+), 20 deletions(-) + +diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc +index 98a4521155..a398891bc0 100644 +--- a/docs/misc/xen-command-line.pandoc ++++ b/docs/misc/xen-command-line.pandoc +@@ -2392,7 +2392,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`). + > {ibrs,ibpb,ssbd,psfd, + > eager-fpu,l1d-flush,branch-harden,srb-lock, + > unpriv-mmio,gds-mit,div-scrub,lock-harden, +-> bhi-dis-s}= ]` ++> bhi-dis-s,bp-spec-reduce}= ]` + + Controls for speculative execution sidechannel mitigations. By default, Xen + will pick the most appropriate mitigations based on compiled in support, +@@ -2541,6 +2541,13 @@ boolean can be used to force or prevent Xen from using speculation barriers to + protect lock critical regions. This mitigation won't be engaged by default, + and needs to be explicitly enabled on the command line. + ++On hardware supporting SRSO_MSR_FIX, the `bp-spec-reduce=` option can be used ++to force or prevent Xen from using MSR_BP_CFG.BP_SPEC_REDUCE to mitigate the ++SRSO (Speculative Return Stack Overflow) vulnerability. Xen will use ++bp-spec-reduce when available, as it is preferable to using `ibpb-entry=hvm` ++to mitigate SRSO for HVM guests, and because it is a prerequisite to advertise ++SRSO_U/S_NO to PV guests. 
++ + ### sync_console + > `= ` + +diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c +index fd8a20afa1..52caad5162 100644 +--- a/xen/arch/x86/cpu-policy.c ++++ b/xen/arch/x86/cpu-policy.c +@@ -14,6 +14,7 @@ + #include + #include + #include ++#include + #include + + struct cpu_policy __read_mostly raw_cpu_policy; +@@ -622,6 +623,26 @@ static void __init calculate_pv_max_policy(void) + __clear_bit(X86_FEATURE_IBRS, fs); + } + ++ /* ++ * SRSO_U/S_NO means that the CPU is not vulnerable to SRSO attacks across ++ * the User (CPL3) / Supervisor (CPL<3) boundary. ++ * ++ * PV32 guests are unsupported for speculative issues, and excluded from ++ * consideration for simplicity. ++ * ++ * The PV64 user/kernel boundary is CPL3 on both sides, so SRSO_U/S_NO ++ * won't convey the meaning that a PV kernel expects. ++ * ++ * After discussions with AMD, an implementation detail of having ++ * BP_SPEC_REDUCE active causes the PV64 user/kernel boundary to have a ++ * property compatible with the meaning of SRSO_U/S_NO. ++ * ++ * If BP_SPEC_REDUCE isn't active, remove SRSO_U/S_NO from the PV max ++ * policy, which will cause it to filter out of PV default too. ++ */ ++ if ( !boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX) || !opt_bp_spec_reduce ) ++ __clear_bit(X86_FEATURE_SRSO_US_NO, fs); ++ + guest_common_max_feature_adjustments(fs); + guest_common_feature_adjustments(fs); + +diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c +index ab92333673..c448997be5 100644 +--- a/xen/arch/x86/cpu/amd.c ++++ b/xen/arch/x86/cpu/amd.c +@@ -1009,16 +1009,33 @@ static void cf_check fam17_disable_c6(void *arg) + wrmsrl(MSR_AMD_CSTATE_CFG, val & mask); + } + +-static void amd_check_erratum_1485(void) ++static void amd_check_bp_cfg(void) + { +- uint64_t val, chickenbit = (1 << 5); ++ uint64_t val, new = 0; + +- if (cpu_has_hypervisor || boot_cpu_data.x86 != 0x19 || !is_zen4_uarch()) ++ /* ++ * AMD Erratum #1485. Set bit 5, as instructed. 
++ */ ++ if (!cpu_has_hypervisor && boot_cpu_data.x86 == 0x19 && is_zen4_uarch()) ++ new |= (1 << 5); ++ ++ /* ++ * On hardware supporting SRSO_MSR_FIX, activate BP_SPEC_REDUCE by ++ * default. This lets us do two things: ++ * ++ * 1) Avoid IBPB-on-entry to mitigate SRSO attacks from HVM guests. ++ * 2) Advertise SRSO_US_NO to PV guests. ++ */ ++ if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX) && opt_bp_spec_reduce) ++ new |= BP_CFG_SPEC_REDUCE; ++ ++ /* Avoid reading BP_CFG if we don't intend to change anything. */ ++ if (!new) + return; + + rdmsrl(MSR_AMD64_BP_CFG, val); + +- if (val & chickenbit) ++ if ((val & new) == new) + return; + + /* +@@ -1027,7 +1044,7 @@ static void amd_check_erratum_1485(void) + * same time before the chickenbit is set. It's benign because the + * value being written is the same on both. + */ +- wrmsrl(MSR_AMD64_BP_CFG, val | chickenbit); ++ wrmsrl(MSR_AMD64_BP_CFG, val | new); + } + + static void cf_check init_amd(struct cpuinfo_x86 *c) +@@ -1297,7 +1314,7 @@ static void cf_check init_amd(struct cpuinfo_x86 *c) + disable_c1_ramping(); + + amd_check_zenbleed(); +- amd_check_erratum_1485(); ++ amd_check_bp_cfg(); + + if (fam17_c6_disabled) + fam17_disable_c6(NULL); +diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h +index 9cdb5b2625..22d9e76e55 100644 +--- a/xen/arch/x86/include/asm/msr-index.h ++++ b/xen/arch/x86/include/asm/msr-index.h +@@ -412,6 +412,7 @@ + #define AMD64_DE_CFG_LFENCE_SERIALISE (_AC(1, ULL) << 1) + #define MSR_AMD64_EX_CFG 0xc001102cU + #define MSR_AMD64_BP_CFG 0xc001102eU ++#define BP_CFG_SPEC_REDUCE (_AC(1, ULL) << 4) + #define MSR_AMD64_DE_CFG2 0xc00110e3U + + #define MSR_AMD64_DR0_ADDRESS_MASK 0xc0011027U +diff --git a/xen/arch/x86/include/asm/spec_ctrl.h b/xen/arch/x86/include/asm/spec_ctrl.h +index 72347ef2b9..0772254189 100644 +--- a/xen/arch/x86/include/asm/spec_ctrl.h ++++ b/xen/arch/x86/include/asm/spec_ctrl.h +@@ -90,6 +90,7 @@ extern int8_t opt_xpti_hwdom, opt_xpti_domu; + 
+ extern bool cpu_has_bug_l1tf; + extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu; ++extern bool opt_bp_spec_reduce; + + /* + * The L1D address mask, which might be wider than reported in CPUID, and the +diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c +index 40f6ae0170..35351044f9 100644 +--- a/xen/arch/x86/spec_ctrl.c ++++ b/xen/arch/x86/spec_ctrl.c +@@ -83,6 +83,7 @@ static bool __initdata opt_unpriv_mmio; + static bool __ro_after_init opt_verw_mmio; + static int8_t __initdata opt_gds_mit = -1; + static int8_t __initdata opt_div_scrub = -1; ++bool __ro_after_init opt_bp_spec_reduce = true; + + static int __init cf_check parse_spec_ctrl(const char *s) + { +@@ -143,6 +144,7 @@ static int __init cf_check parse_spec_ctrl(const char *s) + opt_unpriv_mmio = false; + opt_gds_mit = 0; + opt_div_scrub = 0; ++ opt_bp_spec_reduce = false; + } + else if ( val > 0 ) + rc = -EINVAL; +@@ -363,6 +365,8 @@ static int __init cf_check parse_spec_ctrl(const char *s) + opt_gds_mit = val; + else if ( (val = parse_boolean("div-scrub", s, ss)) >= 0 ) + opt_div_scrub = val; ++ else if ( (val = parse_boolean("bp-spec-reduce", s, ss)) >= 0 ) ++ opt_bp_spec_reduce = val; + else + rc = -EINVAL; + +@@ -505,7 +509,7 @@ static void __init print_details(enum ind_thunk thunk) + * Hardware read-only information, stating immunity to certain issues, or + * suggestions of which mitigation to use. + */ +- printk(" Hardware hints:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n", ++ printk(" Hardware hints:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n", + (caps & ARCH_CAPS_RDCL_NO) ? " RDCL_NO" : "", + (caps & ARCH_CAPS_EIBRS) ? " EIBRS" : "", + (caps & ARCH_CAPS_RSBA) ? " RSBA" : "", +@@ -529,10 +533,11 @@ static void __init print_details(enum ind_thunk thunk) + (e8b & cpufeat_mask(X86_FEATURE_BTC_NO)) ? " BTC_NO" : "", + (e8b & cpufeat_mask(X86_FEATURE_IBPB_RET)) ? " IBPB_RET" : "", + (e21a & cpufeat_mask(X86_FEATURE_IBPB_BRTYPE)) ? 
" IBPB_BRTYPE" : "", +- (e21a & cpufeat_mask(X86_FEATURE_SRSO_NO)) ? " SRSO_NO" : ""); ++ (e21a & cpufeat_mask(X86_FEATURE_SRSO_NO)) ? " SRSO_NO" : "", ++ (e21a & cpufeat_mask(X86_FEATURE_SRSO_US_NO)) ? " SRSO_US_NO" : ""); + + /* Hardware features which need driving to mitigate issues. */ +- printk(" Hardware features:%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n", ++ printk(" Hardware features:%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s\n", + (e8b & cpufeat_mask(X86_FEATURE_IBPB)) || + (_7d0 & cpufeat_mask(X86_FEATURE_IBRSB)) ? " IBPB" : "", + (e8b & cpufeat_mask(X86_FEATURE_IBRS)) || +@@ -551,7 +556,8 @@ static void __init print_details(enum ind_thunk thunk) + (caps & ARCH_CAPS_FB_CLEAR_CTRL) ? " FB_CLEAR_CTRL" : "", + (caps & ARCH_CAPS_GDS_CTRL) ? " GDS_CTRL" : "", + (caps & ARCH_CAPS_RFDS_CLEAR) ? " RFDS_CLEAR" : "", +- (e21a & cpufeat_mask(X86_FEATURE_SBPB)) ? " SBPB" : ""); ++ (e21a & cpufeat_mask(X86_FEATURE_SBPB)) ? " SBPB" : "", ++ (e21a & cpufeat_mask(X86_FEATURE_SRSO_MSR_FIX)) ? " SRSO_MSR_FIX" : ""); + + /* Compiled-in support which pertains to mitigations. */ + if ( IS_ENABLED(CONFIG_INDIRECT_THUNK) || IS_ENABLED(CONFIG_SHADOW_PAGING) || +@@ -1120,7 +1126,7 @@ static void __init div_calculations(bool hw_smt_enabled) + + static void __init ibpb_calculations(void) + { +- bool def_ibpb_entry = false; ++ bool def_ibpb_entry_pv = false, def_ibpb_entry_hvm = false; + + /* Check we have hardware IBPB support before using it... */ + if ( !boot_cpu_has(X86_FEATURE_IBRSB) && !boot_cpu_has(X86_FEATURE_IBPB) ) +@@ -1145,22 +1151,43 @@ static void __init ibpb_calculations(void) + * Confusion. Mitigate with IBPB-on-entry. + */ + if ( !boot_cpu_has(X86_FEATURE_BTC_NO) ) +- def_ibpb_entry = true; ++ def_ibpb_entry_pv = def_ibpb_entry_hvm = true; + + /* +- * Further to BTC, Zen3/4 CPUs suffer from Speculative Return Stack +- * Overflow in most configurations. Mitigate with IBPB-on-entry if we +- * have the microcode that makes this an effective option. 
++ * In addition to BTC, Zen3 and later CPUs suffer from Speculative ++ * Return Stack Overflow in most configurations. If we have microcode ++ * that makes IBPB-on-entry an effective mitigation, see about using ++ * it. + */ + if ( !boot_cpu_has(X86_FEATURE_SRSO_NO) && + boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) ) +- def_ibpb_entry = true; ++ { ++ /* ++ * SRSO_U/S_NO is a subset of SRSO_NO, identifying that SRSO isn't ++ * possible across the User (CPL3) / Supervisor (CPL<3) boundary. ++ * ++ * Ignoring PV32 (not security supported for speculative issues), ++ * this means we only need to use IBPB-on-entry for PV guests on ++ * hardware which doesn't enumerate SRSO_US_NO. ++ */ ++ if ( !boot_cpu_has(X86_FEATURE_SRSO_US_NO) ) ++ def_ibpb_entry_pv = true; ++ ++ /* ++ * SRSO_MSR_FIX enumerates that we can use MSR_BP_CFG.SPEC_REDUCE ++ * to mitigate SRSO across the host/guest boundary. We only need ++ * to use IBPB-on-entry for HVM guests if we haven't enabled this ++ * control. ++ */ ++ if ( !boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX) || !opt_bp_spec_reduce ) ++ def_ibpb_entry_hvm = true; ++ } + } + + if ( opt_ibpb_entry_pv == -1 ) +- opt_ibpb_entry_pv = IS_ENABLED(CONFIG_PV) && def_ibpb_entry; ++ opt_ibpb_entry_pv = IS_ENABLED(CONFIG_PV) && def_ibpb_entry_pv; + if ( opt_ibpb_entry_hvm == -1 ) +- opt_ibpb_entry_hvm = IS_ENABLED(CONFIG_HVM) && def_ibpb_entry; ++ opt_ibpb_entry_hvm = IS_ENABLED(CONFIG_HVM) && def_ibpb_entry_hvm; + + if ( opt_ibpb_entry_pv ) + { +diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h +index d9eba5e9a7..9c98e49928 100644 +--- a/xen/include/public/arch-x86/cpufeatureset.h ++++ b/xen/include/public/arch-x86/cpufeatureset.h +@@ -312,7 +312,9 @@ XEN_CPUFEATURE(FSRSC, 11*32+19) /*A Fast Short REP SCASB */ + XEN_CPUFEATURE(AMD_PREFETCHI, 11*32+20) /*A PREFETCHIT{0,1} Instructions */ + XEN_CPUFEATURE(SBPB, 11*32+27) /*A Selective Branch Predictor Barrier */ + XEN_CPUFEATURE(IBPB_BRTYPE, 11*32+28) /*A IBPB 
flushes Branch Type predictions too */ +-XEN_CPUFEATURE(SRSO_NO, 11*32+29) /*A Hardware not vulenrable to Speculative Return Stack Overflow */ ++XEN_CPUFEATURE(SRSO_NO, 11*32+29) /*A Hardware not vulnerable to Speculative Return Stack Overflow */ ++XEN_CPUFEATURE(SRSO_US_NO, 11*32+30) /*A! Hardware not vulnerable to SRSO across the User/Supervisor boundary */ ++XEN_CPUFEATURE(SRSO_MSR_FIX, 11*32+31) /* MSR_BP_CFG.BP_SPEC_REDUCE available */ + + /* Intel-defined CPU features, CPUID level 0x00000007:1.ebx, word 12 */ + XEN_CPUFEATURE(INTEL_PPIN, 12*32+ 0) /* Protected Processor Inventory Number */ +-- +2.48.1 + diff --git a/0013-automation-disable-Yocto-jobs.patch b/0013-automation-disable-Yocto-jobs.patch deleted file mode 100644 index 3972719..0000000 --- a/0013-automation-disable-Yocto-jobs.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 51ae51301f2b4bccd365353f78510c1bdac522c9 Mon Sep 17 00:00:00 2001 -From: Stefano Stabellini -Date: Fri, 9 Aug 2024 23:59:18 -0700 -Subject: [PATCH 13/83] automation: disable Yocto jobs - -The Yocto jobs take a long time to run. We are changing Gitlab ARM64 -runners and the new runners might not be able to finish the Yocto jobs -in a reasonable time. - -For now, disable the Yocto jobs by turning them into "manual" trigger -(they need to be manually executed.) 
- -Signed-off-by: Stefano Stabellini -Reviewed-by: Michal Orzel -master commit: 1c24bca387136d73f88f46ce3db82d34411702e8 -master date: 2024-08-09 23:59:18 -0700 ---- - automation/gitlab-ci/build.yaml | 3 +++ - 1 file changed, 3 insertions(+) - -diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml -index 7ce88d38e7..09895d1fbd 100644 ---- a/automation/gitlab-ci/build.yaml -+++ b/automation/gitlab-ci/build.yaml -@@ -470,17 +470,20 @@ yocto-qemuarm64: - extends: .yocto-test-arm64 - variables: - YOCTO_BOARD: qemuarm64 -+ when: manual - - yocto-qemuarm: - extends: .yocto-test-arm64 - variables: - YOCTO_BOARD: qemuarm - YOCTO_OUTPUT: --copy-output -+ when: manual - - yocto-qemux86-64: - extends: .yocto-test-x86-64 - variables: - YOCTO_BOARD: qemux86-64 -+ when: manual - - # Cppcheck analysis jobs - --- -2.47.0 - diff --git a/0013-x86-traps-Rework-LER-initialisation-and-support-Zen5.patch b/0013-x86-traps-Rework-LER-initialisation-and-support-Zen5.patch new file mode 100644 index 0000000..95a50bf --- /dev/null +++ b/0013-x86-traps-Rework-LER-initialisation-and-support-Zen5.patch @@ -0,0 +1,156 @@ +From 1b88dc9afcac6c33db309ac12969c1e60fce71fb Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Tue, 21 Jan 2025 09:18:42 +0100 +Subject: [PATCH 13/53] x86/traps: Rework LER initialisation and support + Zen5/Diamond Rapids + +AMD have always used the architectural MSRs for LER. As the first processor +to support LER was the K7 (which was 32bit), we can assume it's presence +unconditionally in 64bit mode. + +Intel are about to run out of space in Family 6 and start using 19. It is +only the Pentium 4 which uses non-architectural LER MSRs. + +percpu_traps_init(), which runs on every CPU, contains a lot of code which +should be init-only, and is the only reason why opt_ler can't be in initdata. + +Write a brand new init_ler() which expects all future Intel and AMD CPUs to +continue using the architectural MSRs, and does all setup together. 
Call it +from trap_init(), and remove the setup logic percpu_traps_init() except for +the single path configuring MSR_IA32_DEBUGCTLMSR. + +Leave behind a warning if the user asked for LER and Xen couldn't enable it. + +Signed-off-by: Andrew Cooper +Reviewed-by: Jan Beulich +master commit: 555866cb56002849014a1409ecdfa3f436c0c2c4 +master date: 2025-01-06 12:24:05 +0000 +--- + xen/arch/x86/traps.c | 86 ++++++++++++++++++++------------------------ + 1 file changed, 39 insertions(+), 47 deletions(-) + +diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c +index ccb5a37a72..ae573ee4c0 100644 +--- a/xen/arch/x86/traps.c ++++ b/xen/arch/x86/traps.c +@@ -114,7 +114,7 @@ DEFINE_PER_CPU_PAGE_ALIGNED(struct tss_page, tss_page); + static int debug_stack_lines = 20; + integer_param("debug_stack_lines", debug_stack_lines); + +-static bool __ro_after_init opt_ler; ++static bool __initdata opt_ler; + boolean_param("ler", opt_ler); + + /* LastExceptionFromIP on this hardware. Zero if LER is not in use. */ +@@ -2093,56 +2093,10 @@ static void __init set_intr_gate(unsigned int n, void *addr) + __set_intr_gate(n, 0, addr); + } + +-static unsigned int noinline __init calc_ler_msr(void) +-{ +- switch ( boot_cpu_data.x86_vendor ) +- { +- case X86_VENDOR_INTEL: +- switch ( boot_cpu_data.x86 ) +- { +- case 6: +- return MSR_IA32_LASTINTFROMIP; +- +- case 15: +- return MSR_P4_LER_FROM_LIP; +- } +- break; +- +- case X86_VENDOR_AMD: +- switch ( boot_cpu_data.x86 ) +- { +- case 6: +- case 0xf ... 
0x19: +- return MSR_IA32_LASTINTFROMIP; +- } +- break; +- +- case X86_VENDOR_HYGON: +- return MSR_IA32_LASTINTFROMIP; +- } +- +- return 0; +-} +- + void percpu_traps_init(void) + { + subarch_percpu_traps_init(); + +- if ( !opt_ler ) +- return; +- +- if ( !ler_msr ) +- { +- ler_msr = calc_ler_msr(); +- if ( !ler_msr ) +- { +- opt_ler = false; +- return; +- } +- +- setup_force_cpu_cap(X86_FEATURE_XEN_LBR); +- } +- + if ( cpu_has_xen_lbr ) + wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR); + } +@@ -2202,6 +2156,42 @@ void __init init_idt_traps(void) + this_cpu(compat_gdt) = boot_compat_gdt; + } + ++static void __init init_ler(void) ++{ ++ unsigned int msr = 0; ++ ++ if ( !opt_ler ) ++ return; ++ ++ /* ++ * Intel Pentium 4 is the only known CPU to not use the architectural MSR ++ * indicies. ++ */ ++ switch ( boot_cpu_data.x86_vendor ) ++ { ++ case X86_VENDOR_INTEL: ++ if ( boot_cpu_data.x86 == 0xf ) ++ { ++ msr = MSR_P4_LER_FROM_LIP; ++ break; ++ } ++ fallthrough; ++ case X86_VENDOR_AMD: ++ case X86_VENDOR_HYGON: ++ msr = MSR_IA32_LASTINTFROMIP; ++ break; ++ } ++ ++ if ( msr == 0 ) ++ { ++ printk(XENLOG_WARNING "LER disabled: failed to identify MSRs\n"); ++ return; ++ } ++ ++ ler_msr = msr; ++ setup_force_cpu_cap(X86_FEATURE_XEN_LBR); ++} ++ + extern void (*const autogen_entrypoints[X86_NR_VECTORS])(void); + void __init trap_init(void) + { +@@ -2227,6 +2217,8 @@ void __init trap_init(void) + } + } + ++ init_ler(); ++ + /* Cache {,compat_}gdt_l1e now that physically relocation is done. 
*/ + this_cpu(gdt_l1e) = + l1e_from_pfn(virt_to_mfn(boot_gdt), __PAGE_HYPERVISOR_RW); +-- +2.48.1 + diff --git a/0014-automation-use-expect-to-run-QEMU.patch b/0014-automation-use-expect-to-run-QEMU.patch deleted file mode 100644 index 5ffd5f7..0000000 --- a/0014-automation-use-expect-to-run-QEMU.patch +++ /dev/null @@ -1,362 +0,0 @@ -From 0918434e0fbee48c9dccc5fe262de5a81e380c15 Mon Sep 17 00:00:00 2001 -From: Stefano Stabellini -Date: Fri, 9 Aug 2024 23:59:20 -0700 -Subject: [PATCH 14/83] automation: use expect to run QEMU - -Use expect to invoke QEMU so that we can terminate the test as soon as -we get the right string in the output instead of waiting until the -final timeout. - -For timeout, instead of an hardcoding the value, use a Gitlab CI -variable "QEMU_TIMEOUT" that can be changed depending on the latest -status of the Gitlab CI runners. - -Signed-off-by: Stefano Stabellini -Reviewed-by: Michal Orzel -master commit: c36efb7fcea6ef9f31a20e60ec79ed3ae293feee -master date: 2024-08-09 23:59:20 -0700 ---- - automation/scripts/qemu-alpine-x86_64.sh | 16 +++---- - automation/scripts/qemu-key.exp | 45 +++++++++++++++++++ - automation/scripts/qemu-smoke-dom0-arm32.sh | 16 +++---- - automation/scripts/qemu-smoke-dom0-arm64.sh | 16 +++---- - .../scripts/qemu-smoke-dom0less-arm32.sh | 18 ++++---- - .../scripts/qemu-smoke-dom0less-arm64.sh | 16 +++---- - automation/scripts/qemu-smoke-ppc64le.sh | 13 +++--- - automation/scripts/qemu-smoke-riscv64.sh | 13 +++--- - automation/scripts/qemu-smoke-x86-64.sh | 15 ++++--- - automation/scripts/qemu-xtf-dom0less-arm64.sh | 15 +++---- - 10 files changed, 112 insertions(+), 71 deletions(-) - create mode 100755 automation/scripts/qemu-key.exp - -diff --git a/automation/scripts/qemu-alpine-x86_64.sh b/automation/scripts/qemu-alpine-x86_64.sh -index 8e398dcea3..5359e0820b 100755 ---- a/automation/scripts/qemu-alpine-x86_64.sh -+++ b/automation/scripts/qemu-alpine-x86_64.sh -@@ -77,18 +77,16 @@ EOF - # Run the test - rm -f 
smoke.serial - set +e --timeout -k 1 720 \ --qemu-system-x86_64 \ -+export QEMU_CMD="qemu-system-x86_64 \ - -cpu qemu64,+svm \ - -m 2G -smp 2 \ - -monitor none -serial stdio \ - -nographic \ - -device virtio-net-pci,netdev=n0 \ -- -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 |& \ -- # Remove carriage returns from the stdout output, as gitlab -- # interface chokes on them -- tee smoke.serial | sed 's/\r//' -+ -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0" - --set -e --(grep -q "Domain-0" smoke.serial && grep -q "BusyBox" smoke.serial) || exit 1 --exit 0 -+export QEMU_LOG="smoke.serial" -+export LOG_MSG="Domain-0" -+export PASSED="BusyBox" -+ -+./automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-key.exp b/automation/scripts/qemu-key.exp -new file mode 100755 -index 0000000000..35eb903a31 ---- /dev/null -+++ b/automation/scripts/qemu-key.exp -@@ -0,0 +1,45 @@ -+#!/usr/bin/expect -f -+ -+set timeout $env(QEMU_TIMEOUT) -+ -+log_file -a $env(QEMU_LOG) -+ -+match_max 10000 -+ -+eval spawn $env(QEMU_CMD) -+ -+expect_after { -+ -re "(.*)\r" { -+ exp_continue -+ } -+ timeout {send_error "ERROR-Timeout!\n"; exit 1} -+ eof {send_error "ERROR-EOF!\n"; exit 1} -+} -+ -+if {[info exists env(UBOOT_CMD)]} { -+ expect "=>" -+ -+ send "$env(UBOOT_CMD)\r" -+} -+ -+if {[info exists env(LOG_MSG)]} { -+ expect { -+ "$env(PASSED)" { -+ expect "$env(LOG_MSG)" -+ exit 0 -+ } -+ "$env(LOG_MSG)" { -+ expect "$env(PASSED)" -+ exit 0 -+ } -+ } -+} -+ -+expect { -+ "$env(PASSED)" { -+ exit 0 -+ } -+} -+ -+expect eof -+ -diff --git a/automation/scripts/qemu-smoke-dom0-arm32.sh b/automation/scripts/qemu-smoke-dom0-arm32.sh -index 31c05cc840..bab66bfe44 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm32.sh -@@ -78,9 +78,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d . 
-c config - - rm -f ${serial_log} - set +e --echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ --timeout -k 1 720 \ --./qemu-system-arm \ -+export QEMU_CMD="./qemu-system-arm \ - -machine virt \ - -machine virtualization=true \ - -smp 4 \ -@@ -91,9 +89,11 @@ timeout -k 1 720 \ - -no-reboot \ - -device virtio-net-pci,netdev=n0 \ - -netdev user,id=n0,tftp=./ \ -- -bios /usr/lib/u-boot/qemu_arm/u-boot.bin |& \ -- tee ${serial_log} | sed 's/\r//' -+ -bios /usr/lib/u-boot/qemu_arm/u-boot.bin" -+ -+export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" -+export QEMU_LOG="${serial_log}" -+export LOG_MSG="Domain-0" -+export PASSED="/ #" - --set -e --(grep -q "Domain-0" ${serial_log} && grep -q "^/ #" ${serial_log}) || exit 1 --exit 0 -+../automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-smoke-dom0-arm64.sh b/automation/scripts/qemu-smoke-dom0-arm64.sh -index 352963a741..0094bfc8e1 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm64.sh -@@ -94,9 +94,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - # Run the test - rm -f smoke.serial - set +e --echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ --timeout -k 1 720 \ --./binaries/qemu-system-aarch64 \ -+export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt \ - -m 2048 -monitor none -serial stdio \ -@@ -104,9 +102,11 @@ timeout -k 1 720 \ - -no-reboot \ - -device virtio-net-pci,netdev=n0 \ - -netdev user,id=n0,tftp=binaries \ -- -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& \ -- tee smoke.serial | sed 's/\r//' -+ -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" -+ -+export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" -+export QEMU_LOG="smoke.serial" -+export LOG_MSG="Domain-0" -+export PASSED="BusyBox" - --set -e --(grep -q "Domain-0" smoke.serial 
&& grep -q "BusyBox" smoke.serial) || exit 1 --exit 0 -+./automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh b/automation/scripts/qemu-smoke-dom0less-arm32.sh -index c027c8c5c8..68ffbabdb8 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh -@@ -5,7 +5,7 @@ set -ex - test_variant=$1 - - # Prompt to grep for to check if dom0 booted successfully --dom0_prompt="^/ #" -+dom0_prompt="/ #" - - serial_log="$(pwd)/smoke.serial" - -@@ -131,9 +131,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config - # Run the test - rm -f ${serial_log} - set +e --echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ --timeout -k 1 240 \ --./qemu-system-arm \ -+export QEMU_CMD="./qemu-system-arm \ - -machine virt \ - -machine virtualization=true \ - -smp 4 \ -@@ -144,9 +142,11 @@ timeout -k 1 240 \ - -no-reboot \ - -device virtio-net-pci,netdev=n0 \ - -netdev user,id=n0,tftp=./ \ -- -bios /usr/lib/u-boot/qemu_arm/u-boot.bin |& \ -- tee ${serial_log} | sed 's/\r//' -+ -bios /usr/lib/u-boot/qemu_arm/u-boot.bin" -+ -+export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" -+export QEMU_LOG="${serial_log}" -+export LOG_MSG="${dom0_prompt}" -+export PASSED="${passed}" - --set -e --(grep -q "${dom0_prompt}" ${serial_log} && grep -q "${passed}" ${serial_log}) || exit 1 --exit 0 -+../automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh b/automation/scripts/qemu-smoke-dom0less-arm64.sh -index 15258692d5..eb25c4af4b 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh -@@ -205,9 +205,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - # Run the test - rm -f smoke.serial - set +e --echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ --timeout -k 1 240 \ 
--./binaries/qemu-system-aarch64 \ -+export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt,gic-version=$gic_version \ - -m 2048 -monitor none -serial stdio \ -@@ -215,9 +213,11 @@ timeout -k 1 240 \ - -no-reboot \ - -device virtio-net-pci,netdev=n0 \ - -netdev user,id=n0,tftp=binaries \ -- -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& \ -- tee smoke.serial | sed 's/\r//' -+ -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" -+ -+export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" -+export QEMU_LOG="smoke.serial" -+export LOG_MSG="Welcome to Alpine Linux" -+export PASSED="${passed}" - --set -e --(grep -q "^Welcome to Alpine Linux" smoke.serial && grep -q "${passed}" smoke.serial) || exit 1 --exit 0 -+./automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-smoke-ppc64le.sh b/automation/scripts/qemu-smoke-ppc64le.sh -index 9088881b73..ccb4a576f4 100755 ---- a/automation/scripts/qemu-smoke-ppc64le.sh -+++ b/automation/scripts/qemu-smoke-ppc64le.sh -@@ -11,8 +11,7 @@ machine=$1 - rm -f ${serial_log} - set +e - --timeout -k 1 20 \ --qemu-system-ppc64 \ -+export QEMU_CMD="qemu-system-ppc64 \ - -bios skiboot.lid \ - -M $machine \ - -m 2g \ -@@ -21,9 +20,9 @@ qemu-system-ppc64 \ - -monitor none \ - -nographic \ - -serial stdio \ -- -kernel binaries/xen \ -- |& tee ${serial_log} | sed 's/\r//' -+ -kernel binaries/xen" - --set -e --(grep -q "Hello, ppc64le!" ${serial_log}) || exit 1 --exit 0 -+export QEMU_LOG="${serial_log}" -+export PASSED="Hello, ppc64le!" 
-+ -+./automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-smoke-riscv64.sh b/automation/scripts/qemu-smoke-riscv64.sh -index f90df3c051..0355c075b7 100755 ---- a/automation/scripts/qemu-smoke-riscv64.sh -+++ b/automation/scripts/qemu-smoke-riscv64.sh -@@ -6,15 +6,14 @@ set -ex - rm -f smoke.serial - set +e - --timeout -k 1 2 \ --qemu-system-riscv64 \ -+export QEMU_CMD="qemu-system-riscv64 \ - -M virt \ - -smp 1 \ - -nographic \ - -m 2g \ -- -kernel binaries/xen \ -- |& tee smoke.serial | sed 's/\r//' -+ -kernel binaries/xen" - --set -e --(grep -q "All set up" smoke.serial) || exit 1 --exit 0 -+export QEMU_LOG="smoke.serial" -+export PASSED="All set up" -+ -+./automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-smoke-x86-64.sh b/automation/scripts/qemu-smoke-x86-64.sh -index 3014d07314..37ac10e068 100755 ---- a/automation/scripts/qemu-smoke-x86-64.sh -+++ b/automation/scripts/qemu-smoke-x86-64.sh -@@ -16,11 +16,12 @@ esac - - rm -f smoke.serial - set +e --timeout -k 1 30 \ --qemu-system-x86_64 -nographic -kernel binaries/xen \ -+export QEMU_CMD="qemu-system-x86_64 -nographic -kernel binaries/xen \ - -initrd xtf/tests/example/$k \ -- -append "loglvl=all console=com1 noreboot console_timestamps=boot $extra" \ -- -m 512 -monitor none -serial file:smoke.serial --set -e --grep -q 'Test result: SUCCESS' smoke.serial || exit 1 --exit 0 -+ -append \"loglvl=all console=com1 noreboot console_timestamps=boot $extra\" \ -+ -m 512 -monitor none -serial stdio" -+ -+export QEMU_LOG="smoke.serial" -+export PASSED="Test result: SUCCESS" -+ -+./automation/scripts/qemu-key.exp -diff --git a/automation/scripts/qemu-xtf-dom0less-arm64.sh b/automation/scripts/qemu-xtf-dom0less-arm64.sh -index b08c2d44fb..0666f6363e 100755 ---- a/automation/scripts/qemu-xtf-dom0less-arm64.sh -+++ b/automation/scripts/qemu-xtf-dom0less-arm64.sh -@@ -51,9 +51,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - # Run the test - rm -f 
smoke.serial - set +e --echo " virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000"| \ --timeout -k 1 120 \ --./binaries/qemu-system-aarch64 \ -+export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt \ - -m 2048 -monitor none -serial stdio \ -@@ -61,9 +59,10 @@ timeout -k 1 120 \ - -no-reboot \ - -device virtio-net-pci,netdev=n0 \ - -netdev user,id=n0,tftp=binaries \ -- -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin |& \ -- tee smoke.serial | sed 's/\r//' -+ -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" -+ -+export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" -+export QEMU_LOG="smoke.serial" -+export PASSED="${passed}" - --set -e --(grep -q "${passed}" smoke.serial) || exit 1 --exit 0 -+./automation/scripts/qemu-key.exp --- -2.47.0 - diff --git a/0014-x86-amd-Misc-setup-for-Fam1Ah-processors.patch b/0014-x86-amd-Misc-setup-for-Fam1Ah-processors.patch new file mode 100644 index 0000000..9aeece7 --- /dev/null +++ b/0014-x86-amd-Misc-setup-for-Fam1Ah-processors.patch @@ -0,0 +1,66 @@ +From 71d626f2f7da6003f5dccbd479a6818147ce66fd Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Tue, 21 Jan 2025 09:19:14 +0100 +Subject: [PATCH 14/53] x86/amd: Misc setup for Fam1Ah processors + +Fam1Ah is similar to Fam19h in these regards. 
+ +Signed-off-by: Andrew Cooper +Acked-by: Jan Beulich +master commit: f29cc14de1d195bcd8312dcab2b5f8e634b57288 +master date: 2025-01-06 18:01:32 +0000 +--- + xen/arch/x86/acpi/cpu_idle.c | 1 + + xen/arch/x86/cpu/microcode/amd.c | 4 ++++ + xen/arch/x86/cpu/vpmu_amd.c | 1 + + 3 files changed, 6 insertions(+) + +diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c +index 57ac984790..52808f9809 100644 +--- a/xen/arch/x86/acpi/cpu_idle.c ++++ b/xen/arch/x86/acpi/cpu_idle.c +@@ -1429,6 +1429,7 @@ static void amd_cpuidle_init(struct acpi_processor_power *power) + + switch ( c->x86 ) + { ++ case 0x1a: + case 0x19: + case 0x18: + if ( boot_cpu_data.x86_vendor != X86_VENDOR_HYGON ) +diff --git a/xen/arch/x86/cpu/microcode/amd.c b/xen/arch/x86/cpu/microcode/amd.c +index 9fe6e29751..31fbd326e5 100644 +--- a/xen/arch/x86/cpu/microcode/amd.c ++++ b/xen/arch/x86/cpu/microcode/amd.c +@@ -114,6 +114,7 @@ static bool verify_patch_size(uint32_t patch_size) + #define F16H_MPB_MAX_SIZE 3458 + #define F17H_MPB_MAX_SIZE 3200 + #define F19H_MPB_MAX_SIZE 5568 ++#define F1AH_MPB_MAX_SIZE 15296 + + switch ( boot_cpu_data.x86 ) + { +@@ -132,6 +133,9 @@ static bool verify_patch_size(uint32_t patch_size) + case 0x19: + max_size = F19H_MPB_MAX_SIZE; + break; ++ case 0x1a: ++ max_size = F1AH_MPB_MAX_SIZE; ++ break; + default: + max_size = F1XH_MPB_MAX_SIZE; + break; +diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c +index 97e6315bd9..d31e5db5b1 100644 +--- a/xen/arch/x86/cpu/vpmu_amd.c ++++ b/xen/arch/x86/cpu/vpmu_amd.c +@@ -567,6 +567,7 @@ const struct arch_vpmu_ops *__init amd_vpmu_init(void) + case 0x15: + case 0x17: + case 0x19: ++ case 0x1a: + num_counters = F15H_NUM_COUNTERS; + counters = AMD_F15H_COUNTERS; + ctrls = AMD_F15H_CTRLS; +-- +2.48.1 + diff --git a/0015-x86-vLAPIC-prevent-undue-recursion-of-vlapic_error.patch b/0015-x86-vLAPIC-prevent-undue-recursion-of-vlapic_error.patch deleted file mode 100644 index 479d717..0000000 --- 
a/0015-x86-vLAPIC-prevent-undue-recursion-of-vlapic_error.patch +++ /dev/null @@ -1,57 +0,0 @@ -From 9358a7fad7f0427e7d1666da0c78cef341ee9072 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:27:03 +0200 -Subject: [PATCH 15/83] x86/vLAPIC: prevent undue recursion of vlapic_error() - -With the error vector set to an illegal value, the function invoking -vlapic_set_irq() would bring execution back here, with the non-recursive -lock already held. Avoid the call in this case, merely further updating -ESR (if necessary). - -This is XSA-462 / CVE-2024-45817. - -Fixes: 5f32d186a8b1 ("x86/vlapic: don't silently accept bad vectors") -Reported-by: Federico Serafini -Reported-by: Andrew Cooper -Signed-off-by: Jan Beulich -Signed-off-by: Andrew Cooper -Reviewed-by: Andrew Cooper -master commit: c42d9ec61f6d11e25fa77bd44dd11dad1edda268 -master date: 2024-09-24 14:23:29 +0200 ---- - xen/arch/x86/hvm/vlapic.c | 17 ++++++++++++++++- - 1 file changed, 16 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c -index 9cfc82666a..46ff758904 100644 ---- a/xen/arch/x86/hvm/vlapic.c -+++ b/xen/arch/x86/hvm/vlapic.c -@@ -112,9 +112,24 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask) - if ( (esr & errmask) != errmask ) - { - uint32_t lvterr = vlapic_get_reg(vlapic, APIC_LVTERR); -+ bool inj = false; - -- vlapic_set_reg(vlapic, APIC_ESR, esr | errmask); - if ( !(lvterr & APIC_LVT_MASKED) ) -+ { -+ /* -+ * If LVTERR is unmasked and has an illegal vector, vlapic_set_irq() -+ * will end up back here. Break the cycle by only injecting LVTERR -+ * if it will succeed, and folding in RECVILL otherwise. 
-+ */ -+ if ( (lvterr & APIC_VECTOR_MASK) >= 16 ) -+ inj = true; -+ else -+ errmask |= APIC_ESR_RECVILL; -+ } -+ -+ vlapic_set_reg(vlapic, APIC_ESR, esr | errmask); -+ -+ if ( inj ) - vlapic_set_irq(vlapic, lvterr & APIC_VECTOR_MASK, 0); - } - spin_unlock_irqrestore(&vlapic->esr_lock, flags); --- -2.47.0 - diff --git a/0015-x86emul-VCVT-U-DQ2PD-ignores-embedded-rounding.patch b/0015-x86emul-VCVT-U-DQ2PD-ignores-embedded-rounding.patch new file mode 100644 index 0000000..6ecf00c --- /dev/null +++ b/0015-x86emul-VCVT-U-DQ2PD-ignores-embedded-rounding.patch @@ -0,0 +1,67 @@ +From 4abf7f1d6174b74640c219e1cc7cb768e8d7ea32 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Tue, 21 Jan 2025 09:19:39 +0100 +Subject: [PATCH 15/53] x86emul: VCVT{,U}DQ2PD ignores embedded rounding + +IOW we shouldn't raise #UD in that case. Be on the safe side though and +only encode fully legitimate forms into the stub to be executed. + +Things weren't quite right for VCVT{,U}SI2SD either, in the attempt to +be on the safe side: Clearing EVEX.L'L isn't useful; it's EVEX.b which +primarily needs clearing. Also reflect the somewhat improved doc +situation in the comment there. 
+ +Fixes: ed806f373730 ("x86emul: support AVX512F legacy-equivalent packed int/FP conversion insns") +Fixes: baf4a376f550 ("x86emul: support AVX512F legacy-equivalent scalar int/FP conversion insns") +Signed-off-by: Jan Beulich +Acked-by: Andrew Cooper +master commit: d3709d1324aa140f064b9c68da37547f459f8e8d +master date: 2025-01-08 11:01:17 +0100 +--- + xen/arch/x86/x86_emulate/x86_emulate.c | 20 ++++++++++++++++---- + 1 file changed, 16 insertions(+), 4 deletions(-) + +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c +index 09ab75d035..4ef868de3d 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.c ++++ b/xen/arch/x86/x86_emulate/x86_emulate.c +@@ -3592,12 +3592,15 @@ x86_emulate( + if ( !mode_64bit() ) + evex.w = 0; + /* +- * SDM version 067 claims that exception type E10NF implies #UD when +- * EVEX.L'L is non-zero for 32-bit VCVT{,U}SI2SD. Experimentally this +- * cannot be confirmed, but be on the safe side for the stub. ++ * While SDM version 085 has explicit wording towards embedded rounding ++ * being ignored, it's still not entirely unambiguous with the exception ++ * type referred to. Be on the safe side for the stub. + */ + if ( !evex.w && evex.pfx == vex_f2 ) ++ { ++ evex.brs = 0; + evex.lr = 0; ++ } + opc[1] = (modrm & 0x38) | 0xc0; + insn_bytes = EVEX_PFX_BYTES + 2; + opc[2] = 0xc3; +@@ -4815,7 +4818,16 @@ x86_emulate( + else + { + host_and_vcpu_must_have(avx512f); +- generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD); ++ /* ++ * While SDM version 085 has explicit wording towards embedded ++ * rounding being ignored, it's still not entirely unambiguous with ++ * the exception type referred to. Be on the safe side for the stub. 
++ */ ++ if ( ea.type != OP_MEM && evex.brs ) ++ { ++ evex.brs = 0; ++ evex.lr = 2; ++ } + } + if ( ea.type != OP_REG || !evex.brs ) + avx512_vlen_check(false); +-- +2.48.1 + diff --git a/0016-Arm-correct-FIXADDR_TOP.patch b/0016-Arm-correct-FIXADDR_TOP.patch deleted file mode 100644 index 8a23ee6..0000000 --- a/0016-Arm-correct-FIXADDR_TOP.patch +++ /dev/null @@ -1,58 +0,0 @@ -From 46a2ce35212c9b35c4818ca9eec918aa4a45cb48 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:28:22 +0200 -Subject: [PATCH 16/83] Arm: correct FIXADDR_TOP - -While reviewing a RISC-V patch cloning the Arm code, I noticed an -off-by-1 here: FIX_PMAP_{BEGIN,END} being an inclusive range and -FIX_LAST being the same as FIX_PMAP_END, FIXADDR_TOP cannot derive from -FIX_LAST alone, or else the BUG_ON() in virt_to_fix() would trigger if -FIX_PMAP_END ended up being used. - -While touching this area also add a check for fixmap and boot FDT area -to not only not overlap, but to have at least one (unmapped) page in -between. 
- -Fixes: 4f17357b52f6 ("xen/arm: add Persistent Map (PMAP) infrastructure") -Signed-off-by: Jan Beulich -Reviewed-by: Michal Orzel -master commit: fe3412ab83cc53c2bf2c497be3794bc09751efa5 -master date: 2024-08-13 21:50:55 +0100 ---- - xen/arch/arm/include/asm/fixmap.h | 2 +- - xen/arch/arm/mmu/setup.c | 6 ++++++ - 2 files changed, 7 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/arm/include/asm/fixmap.h b/xen/arch/arm/include/asm/fixmap.h -index a823456ecb..0cb5d54d1c 100644 ---- a/xen/arch/arm/include/asm/fixmap.h -+++ b/xen/arch/arm/include/asm/fixmap.h -@@ -18,7 +18,7 @@ - #define FIX_LAST FIX_PMAP_END - - #define FIXADDR_START FIXMAP_ADDR(0) --#define FIXADDR_TOP FIXMAP_ADDR(FIX_LAST) -+#define FIXADDR_TOP FIXMAP_ADDR(FIX_LAST + 1) - - #ifndef __ASSEMBLY__ - -diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c -index f4bb424c3c..57042ed57b 100644 ---- a/xen/arch/arm/mmu/setup.c -+++ b/xen/arch/arm/mmu/setup.c -@@ -128,6 +128,12 @@ static void __init __maybe_unused build_assertions(void) - - #undef CHECK_SAME_SLOT - #undef CHECK_DIFFERENT_SLOT -+ -+ /* -+ * Fixmaps must not overlap with boot FDT mapping area. Make sure there's -+ * at least one guard page in between. -+ */ -+ BUILD_BUG_ON(FIXADDR_TOP >= BOOT_FDT_VIRT_START); - } - - lpae_t __init pte_of_xenaddr(vaddr_t va) --- -2.47.0 - diff --git a/0016-x86emul-correct-put_fpu-s-segment-selector-handling.patch b/0016-x86emul-correct-put_fpu-s-segment-selector-handling.patch new file mode 100644 index 0000000..2974ae5 --- /dev/null +++ b/0016-x86emul-correct-put_fpu-s-segment-selector-handling.patch @@ -0,0 +1,116 @@ +From 4df28706ad8756b36de23b5173e8f4ce4c086c57 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Tue, 21 Jan 2025 09:19:56 +0100 +Subject: [PATCH 16/53] x86emul: correct put_fpu()'s segment selector handling + +All selector fields under ctxt->regs are (normally) poisoned in the HVM +case, and the four ones besides CS and SS are potentially stale for PV. 
+Avoid using them in the hypervisor incarnation of the emulator, when +trying to cover for a missing ->read_segment() hook. + +To make sure there's always a valid ->read_segment() handler for all HVM +cases, add a respective function to shadow code, even if it is not +expected for FPU insns to be used to update page tables. + +Fixes: 0711b59b858a ("x86emul: correct FPU code/data pointers and opcode handling") +Reported-by: Andrew Cooper +Signed-off-by: Jan Beulich +Acked-by: Andrew Cooper +master commit: 645b8d48c78f5b6ffd6230873f9e3ced4e840acd +master date: 2025-01-08 11:02:16 +0100 +--- + xen/arch/x86/mm/shadow/hvm.c | 18 +++++++++++++++ + xen/arch/x86/x86_emulate/x86_emulate.c | 32 ++++++++++++++++++++++---- + 2 files changed, 46 insertions(+), 4 deletions(-) + +diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c +index c16f3b3adf..114957a3e1 100644 +--- a/xen/arch/x86/mm/shadow/hvm.c ++++ b/xen/arch/x86/mm/shadow/hvm.c +@@ -287,11 +287,29 @@ hvm_emulate_cmpxchg(enum x86_segment seg, + return rc; + } + ++static int cf_check ++hvm_emulate_read_segment(enum x86_segment seg, ++ struct segment_register *reg, ++ struct x86_emulate_ctxt *ctxt) ++{ ++ struct sh_emulate_ctxt *sh_ctxt = ++ container_of(ctxt, struct sh_emulate_ctxt, ctxt); ++ const struct segment_register *sreg = hvm_get_seg_reg(seg, sh_ctxt); ++ ++ if ( IS_ERR(sreg) ) ++ return -PTR_ERR(sreg); ++ ++ *reg = *sreg; ++ ++ return X86EMUL_OKAY; ++} ++ + static const struct x86_emulate_ops hvm_shadow_emulator_ops = { + .read = hvm_emulate_read, + .insn_fetch = hvm_emulate_insn_fetch, + .write = hvm_emulate_write, + .cmpxchg = hvm_emulate_cmpxchg, ++ .read_segment = hvm_emulate_read_segment, + }; + + const struct x86_emulate_ops *shadow_init_emulation( +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c +index 4ef868de3d..31475208d1 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.c ++++ b/xen/arch/x86/x86_emulate/x86_emulate.c +@@ -447,14 +447,37 
@@ static void put_fpu( + if ( state->ea.type == OP_MEM ) + { + aux.dp = state->ea.mem.off; +- if ( ops->read_segment && +- ops->read_segment(state->ea.mem.seg, &sreg, +- ctxt) == X86EMUL_OKAY ) ++ if ( state->ea.mem.seg == x86_seg_cs ) ++ aux.ds = aux.cs; ++ else if ( ops->read_segment && ++ ops->read_segment(state->ea.mem.seg, &sreg, ++ ctxt) == X86EMUL_OKAY ) + aux.ds = sreg.sel; ++#ifdef __XEN__ ++ /* ++ * While generally the expectation is that input structures are ++ * fully populated, the selector fields under ctxt->regs normally ++ * aren't set, with the exception of CS and SS for PV domains. ++ * Read the real selector registers for PV, and assert that HVM ++ * invocations always set a properly functioning ->read_segment() ++ * hook. ++ */ ++ else if ( is_pv_vcpu(current) ) ++ switch ( state->ea.mem.seg ) ++ { ++ case x86_seg_ds: aux.ds = read_sreg(ds); break; ++ case x86_seg_es: aux.ds = read_sreg(es); break; ++ case x86_seg_fs: aux.ds = read_sreg(fs); break; ++ case x86_seg_gs: aux.ds = read_sreg(gs); break; ++ case x86_seg_ss: aux.ds = ctxt->regs->ss; break; ++ default: ASSERT_UNREACHABLE(); break; ++ } ++ else ++ ASSERT_UNREACHABLE(); ++#else + else + switch ( state->ea.mem.seg ) + { +- case x86_seg_cs: aux.ds = ctxt->regs->cs; break; + case x86_seg_ds: aux.ds = ctxt->regs->ds; break; + case x86_seg_es: aux.ds = ctxt->regs->es; break; + case x86_seg_fs: aux.ds = ctxt->regs->fs; break; +@@ -462,6 +485,7 @@ static void put_fpu( + case x86_seg_ss: aux.ds = ctxt->regs->ss; break; + default: ASSERT_UNREACHABLE(); break; + } ++#endif + aux.dval = true; + } + ops->put_fpu(ctxt, X86EMUL_FPU_none, &aux); +-- +2.48.1 + diff --git a/0017-xen-flask-Wire-up-XEN_DOMCTL_vuart_op.patch b/0017-xen-flask-Wire-up-XEN_DOMCTL_vuart_op.patch new file mode 100644 index 0000000..60aaadf --- /dev/null +++ b/0017-xen-flask-Wire-up-XEN_DOMCTL_vuart_op.patch @@ -0,0 +1,63 @@ +From 30a8d910ca97ba460236bce53fb8e3c3035ea8fe Mon Sep 17 00:00:00 2001 +From: Michal Orzel +Date: Tue, 21 
Jan 2025 09:20:42 +0100 +Subject: [PATCH 17/53] xen/flask: Wire up XEN_DOMCTL_vuart_op + +Addition of FLASK permission for this hypercall was overlooked in the +original patch. Fix it. The only VUART operation is initialization that +can occur only during domain creation. + +Fixes: 86039f2e8c20 ("xen/arm: vpl011: Add a new domctl API to initialize vpl011") +Signed-off-by: Michal Orzel +Acked-by: Daniel P. Smith +master commit: 29daa72e4019aae92f857cf6e7e0c3ca8fb1483e +master date: 2025-01-08 13:05:38 +0100 +--- + tools/flask/policy/modules/xen.if | 2 +- + xen/xsm/flask/hooks.c | 3 +++ + xen/xsm/flask/policy/access_vectors | 2 ++ + 3 files changed, 6 insertions(+), 1 deletion(-) + +diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if +index 11c1562aa5..ba9e91d302 100644 +--- a/tools/flask/policy/modules/xen.if ++++ b/tools/flask/policy/modules/xen.if +@@ -54,7 +54,7 @@ define(`create_domain_common', ` + allow $1 $2:domain2 { set_cpu_policy settsc setscheduler setclaim + set_vnumainfo get_vnumainfo cacheflush + psr_cmt_op psr_alloc soft_reset +- resource_map get_cpu_policy }; ++ resource_map get_cpu_policy vuart_op }; + allow $1 $2:security check_context; + allow $1 $2:shadow enable; + allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp }; +diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c +index 278ad38c2a..35237a00c4 100644 +--- a/xen/xsm/flask/hooks.c ++++ b/xen/xsm/flask/hooks.c +@@ -829,6 +829,9 @@ static int cf_check flask_domctl(struct domain *d, unsigned int cmd, + case XEN_DOMCTL_soft_reset: + return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SOFT_RESET); + ++ case XEN_DOMCTL_vuart_op: ++ return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__VUART_OP); ++ + case XEN_DOMCTL_get_cpu_policy: + return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__GET_CPU_POLICY); + +diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors +index a35e3d4c51..7cbdb7ea64 
100644 +--- a/xen/xsm/flask/policy/access_vectors ++++ b/xen/xsm/flask/policy/access_vectors +@@ -251,6 +251,8 @@ class domain2 + resource_map + # XEN_DOMCTL_get_cpu_policy + get_cpu_policy ++# XEN_DOMCTL_vuart_op ++ vuart_op + } + + # Similar to class domain, but primarily contains domctls related to HVM domains +-- +2.48.1 + diff --git a/0017-xl-fix-incorrect-output-in-help-command.patch b/0017-xl-fix-incorrect-output-in-help-command.patch deleted file mode 100644 index f5f72bc..0000000 --- a/0017-xl-fix-incorrect-output-in-help-command.patch +++ /dev/null @@ -1,36 +0,0 @@ -From e12998a9db8d0ac14477557d09b437783a999ea4 Mon Sep 17 00:00:00 2001 -From: "John E. Krokes" -Date: Tue, 24 Sep 2024 14:29:26 +0200 -Subject: [PATCH 17/83] xl: fix incorrect output in "help" command - -In "xl help", the output includes this line: - - vsnd-list List virtual display devices for a domain - -This should obviously say "sound devices" instead of "display devices". - -Signed-off-by: John E. Krokes -Reviewed-by: Juergen Gross -Acked-by: Anthony PERARD -master commit: 09226d165b57d919150458044c5b594d3d1dc23a -master date: 2024-08-14 08:49:44 +0200 ---- - tools/xl/xl_cmdtable.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c -index 42751228c1..53fc22d344 100644 ---- a/tools/xl/xl_cmdtable.c -+++ b/tools/xl/xl_cmdtable.c -@@ -433,7 +433,7 @@ const struct cmd_spec cmd_table[] = { - }, - { "vsnd-list", - &main_vsndlist, 0, 0, -- "List virtual display devices for a domain", -+ "List virtual sound devices for a domain", - "", - }, - { "vsnd-detach", --- -2.47.0 - diff --git a/0018-x86emul-correct-UD-check-for-AVX512-FP16-complex-mul.patch b/0018-x86emul-correct-UD-check-for-AVX512-FP16-complex-mul.patch deleted file mode 100644 index e881140..0000000 --- a/0018-x86emul-correct-UD-check-for-AVX512-FP16-complex-mul.patch +++ /dev/null @@ -1,37 +0,0 @@ -From e2f29f7bad59c4be53363c8c0d2933982a22d0de Mon Sep 17 00:00:00 2001 
-From: Jan Beulich -Date: Tue, 24 Sep 2024 14:30:04 +0200 -Subject: [PATCH 18/83] x86emul: correct #UD check for AVX512-FP16 complex - multiplications - -avx512_vlen_check()'s argument was inverted, while the surrounding -conditional wrongly forced the EVEX.L'L check for the scalar forms when -embedded rounding was in effect. - -Fixes: d14c52cba0f5 ("x86emul: handle AVX512-FP16 complex multiplication insns") -Signed-off-by: Jan Beulich -Acked-by: Andrew Cooper -master commit: a30d438ce58b70c5955f5d37f776086ab8f88623 -master date: 2024-08-19 15:32:31 +0200 ---- - xen/arch/x86/x86_emulate/x86_emulate.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c -index 2d5c1de8ec..16557385bf 100644 ---- a/xen/arch/x86/x86_emulate/x86_emulate.c -+++ b/xen/arch/x86/x86_emulate/x86_emulate.c -@@ -7984,8 +7984,8 @@ x86_emulate( - generate_exception_if(modrm_reg == src1 || - (ea.type != OP_MEM && modrm_reg == modrm_rm), - X86_EXC_UD); -- if ( ea.type != OP_REG || (b & 1) || !evex.brs ) -- avx512_vlen_check(!(b & 1)); -+ if ( ea.type != OP_REG || !evex.brs ) -+ avx512_vlen_check(b & 1); - goto simd_zmm; - } - --- -2.47.0 - diff --git a/0018-xen-flask-Wire-up-XEN_DOMCTL_dt_overlay.patch b/0018-xen-flask-Wire-up-XEN_DOMCTL_dt_overlay.patch new file mode 100644 index 0000000..6c01fda --- /dev/null +++ b/0018-xen-flask-Wire-up-XEN_DOMCTL_dt_overlay.patch @@ -0,0 +1,78 @@ +From e7f96aa3f3d8b1ad2f0475a627f62763261df743 Mon Sep 17 00:00:00 2001 +From: Michal Orzel +Date: Tue, 21 Jan 2025 09:20:51 +0100 +Subject: [PATCH 18/53] xen/flask: Wire up XEN_DOMCTL_dt_overlay + +Addition of FLASK permission for this hypercall was overlooked in the +original patch. Fix it. The only dt overlay operation is attaching that can +happen only after the domain is created. Dom0 can attach overlay to itself +as well. 
+ +Fixes: 4c733873b5c2 ("xen/arm: Add XEN_DOMCTL_dt_overlay and device attachment to domains") +Signed-off-by: Michal Orzel +Acked-by: Daniel P. Smith +master commit: 7fa1411676150634b1d6ca030e53b94c26a949dd +master date: 2025-01-08 13:05:50 +0100 +--- + tools/flask/policy/modules/dom0.te | 2 +- + tools/flask/policy/modules/xen.if | 2 +- + xen/xsm/flask/hooks.c | 3 +++ + xen/xsm/flask/policy/access_vectors | 2 ++ + 4 files changed, 7 insertions(+), 2 deletions(-) + +diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te +index 16b8c9646d..f148bfbf27 100644 +--- a/tools/flask/policy/modules/dom0.te ++++ b/tools/flask/policy/modules/dom0.te +@@ -40,7 +40,7 @@ allow dom0_t dom0_t:domain { + }; + allow dom0_t dom0_t:domain2 { + set_cpu_policy gettsc settsc setscheduler set_vnumainfo +- get_vnumainfo psr_cmt_op psr_alloc get_cpu_policy ++ get_vnumainfo psr_cmt_op psr_alloc get_cpu_policy dt_overlay + }; + allow dom0_t dom0_t:resource { add remove }; + +diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if +index ba9e91d302..def60da883 100644 +--- a/tools/flask/policy/modules/xen.if ++++ b/tools/flask/policy/modules/xen.if +@@ -94,7 +94,7 @@ define(`manage_domain', ` + getaddrsize pause unpause trigger shutdown destroy + setaffinity setdomainmaxmem getscheduler resume + setpodtarget getpodtarget getpagingmempool setpagingmempool }; +- allow $1 $2:domain2 set_vnumainfo; ++ allow $1 $2:domain2 { set_vnumainfo dt_overlay }; + ') + + # migrate_domain_out(priv, target) +diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c +index 35237a00c4..415edee251 100644 +--- a/xen/xsm/flask/hooks.c ++++ b/xen/xsm/flask/hooks.c +@@ -841,6 +841,9 @@ static int cf_check flask_domctl(struct domain *d, unsigned int cmd, + case XEN_DOMCTL_set_paging_mempool_size: + return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__SETPAGINGMEMPOOL); + ++ case XEN_DOMCTL_dt_overlay: ++ return current_has_perm(d, SECCLASS_DOMAIN2, 
DOMAIN2__DT_OVERLAY); ++ + default: + return avc_unknown_permission("domctl", cmd); + } +diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors +index 7cbdb7ea64..78fe37583b 100644 +--- a/xen/xsm/flask/policy/access_vectors ++++ b/xen/xsm/flask/policy/access_vectors +@@ -253,6 +253,8 @@ class domain2 + get_cpu_policy + # XEN_DOMCTL_vuart_op + vuart_op ++# XEN_DOMCTL_dt_overlay ++ dt_overlay + } + + # Similar to class domain, but primarily contains domctls related to HVM domains +-- +2.48.1 + diff --git a/0019-x86-pv-Introduce-x86_merge_dr6-and-fix-do_debug.patch b/0019-x86-pv-Introduce-x86_merge_dr6-and-fix-do_debug.patch deleted file mode 100644 index 5aabaf3..0000000 --- a/0019-x86-pv-Introduce-x86_merge_dr6-and-fix-do_debug.patch +++ /dev/null @@ -1,140 +0,0 @@ -From de924e4dbac80ac7d94a2e86c37eecccaa1bc677 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 24 Sep 2024 14:30:49 +0200 -Subject: [PATCH 19/83] x86/pv: Introduce x86_merge_dr6() and fix do_debug() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Pretty much everywhere in Xen the logic to update %dr6 when injecting #DB is -buggy. Introduce a new x86_merge_dr6() helper, and start fixing the mess by -adjusting the dr6 merge in do_debug(). Also correct the comment. - -Signed-off-by: Andrew Cooper -Reviewed-by: Roger Pau Monné -Reviewed-by: Jan Beulich -master commit: 54ef601a66e8d812a6a6a308f02524e81201825e -master date: 2024-08-21 23:59:19 +0100 ---- - xen/arch/x86/debug.c | 40 ++++++++++++++++++++++++++++ - xen/arch/x86/include/asm/debugreg.h | 7 +++++ - xen/arch/x86/include/asm/x86-defns.h | 7 +++++ - xen/arch/x86/traps.c | 11 +++++--- - 4 files changed, 62 insertions(+), 3 deletions(-) - -diff --git a/xen/arch/x86/debug.c b/xen/arch/x86/debug.c -index 127fe83021..b10f1f12b6 100644 ---- a/xen/arch/x86/debug.c -+++ b/xen/arch/x86/debug.c -@@ -2,12 +2,52 @@ - /* - * Copyright (C) 2023 XenServer. 
- */ -+#include - #include - - #include - - #include - -+/* -+ * Merge new bits into dr6. 'new' is always given in positive polarity, -+ * matching the Intel VMCS PENDING_DBG semantics. -+ * -+ * At the time of writing (August 2024), on the subject of %dr6 updates the -+ * manuals are either vague (Intel "certain exceptions may clear bits 0-3"), -+ * or disputed (AMD makes statements which don't match observed behaviour). -+ * -+ * The only debug exception I can find which doesn't clear the breakpoint bits -+ * is ICEBP(/INT1) on AMD systems. This is also the one source of #DB that -+ * doesn't have an explicit status bit, meaning we can't easily identify this -+ * case either (AMD systems don't virtualise PENDING_DBG and only provide a -+ * post-merge %dr6 value). -+ * -+ * Treat %dr6 merging as unconditionally writing the breakpoint bits. -+ * -+ * We can't really manage any better, and guest kernels handling #DB as -+ * instructed by the SDM/APM (i.e. reading %dr6 then resetting it back to -+ * default) wont notice. -+ */ -+unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6, -+ unsigned int new) -+{ -+ /* Flip dr6 to have positive polarity. */ -+ dr6 ^= X86_DR6_DEFAULT; -+ -+ /* Sanity check that only known values are passed in. */ -+ ASSERT(!(dr6 & ~X86_DR6_KNOWN_MASK)); -+ ASSERT(!(new & ~X86_DR6_KNOWN_MASK)); -+ -+ /* Breakpoint bits overridden. All others accumulate. */ -+ dr6 = (dr6 & ~X86_DR6_BP_MASK) | new; -+ -+ /* Flip dr6 back to having default polarity. 
*/ -+ dr6 ^= X86_DR6_DEFAULT; -+ -+ return x86_adj_dr6_rsvd(p, dr6); -+} -+ - unsigned int x86_adj_dr6_rsvd(const struct cpu_policy *p, unsigned int dr6) - { - unsigned int ones = X86_DR6_DEFAULT; -diff --git a/xen/arch/x86/include/asm/debugreg.h b/xen/arch/x86/include/asm/debugreg.h -index 96c406ad53..6baa725441 100644 ---- a/xen/arch/x86/include/asm/debugreg.h -+++ b/xen/arch/x86/include/asm/debugreg.h -@@ -108,4 +108,11 @@ struct cpu_policy; - unsigned int x86_adj_dr6_rsvd(const struct cpu_policy *p, unsigned int dr6); - unsigned int x86_adj_dr7_rsvd(const struct cpu_policy *p, unsigned int dr7); - -+/* -+ * Merge new bits into dr6. 'new' is always given in positive polarity, -+ * matching the Intel VMCS PENDING_DBG semantics. -+ */ -+unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6, -+ unsigned int new); -+ - #endif /* _X86_DEBUGREG_H */ -diff --git a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h -index 3bcdbaccd3..caa92829ea 100644 ---- a/xen/arch/x86/include/asm/x86-defns.h -+++ b/xen/arch/x86/include/asm/x86-defns.h -@@ -132,6 +132,13 @@ - #define X86_DR6_ZEROS _AC(0x00001000, UL) /* %dr6 bits forced to 0 */ - #define X86_DR6_DEFAULT _AC(0xffff0ff0, UL) /* Default %dr6 value */ - -+#define X86_DR6_BP_MASK \ -+ (X86_DR6_B0 | X86_DR6_B1 | X86_DR6_B2 | X86_DR6_B3) -+ -+#define X86_DR6_KNOWN_MASK \ -+ (X86_DR6_BP_MASK | X86_DR6_BLD | X86_DR6_BD | X86_DR6_BS | \ -+ X86_DR6_BT | X86_DR6_RTM) -+ - /* - * Debug control flags in DR7. - */ -diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c -index ee91fc56b1..78e83f6fc1 100644 ---- a/xen/arch/x86/traps.c -+++ b/xen/arch/x86/traps.c -@@ -2017,9 +2017,14 @@ void asmlinkage do_debug(struct cpu_user_regs *regs) - return; - } - -- /* Save debug status register where guest OS can peek at it */ -- v->arch.dr6 |= (dr6 & ~X86_DR6_DEFAULT); -- v->arch.dr6 &= (dr6 | ~X86_DR6_DEFAULT); -+ /* -+ * Update the guest's dr6 so the debugger can peek at it. 
-+ * -+ * TODO: This should be passed out-of-band, so guest state is not modified -+ * by debugging actions completed behind it's back. -+ */ -+ v->arch.dr6 = x86_merge_dr6(v->domain->arch.cpu_policy, -+ v->arch.dr6, dr6 ^ X86_DR6_DEFAULT); - - if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached ) - { --- -2.47.0 - diff --git a/0019-xen-events-fix-race-with-set_global_virq_handler.patch b/0019-xen-events-fix-race-with-set_global_virq_handler.patch new file mode 100644 index 0000000..82aba1f --- /dev/null +++ b/0019-xen-events-fix-race-with-set_global_virq_handler.patch @@ -0,0 +1,81 @@ +From 4803a3c5b5f1f131bd87386c33b285a67126c851 Mon Sep 17 00:00:00 2001 +From: Juergen Gross +Date: Tue, 21 Jan 2025 09:21:01 +0100 +Subject: [PATCH 19/53] xen/events: fix race with set_global_virq_handler() + +There is a possible race scenario between set_global_virq_handler() +and clear_global_virq_handlers() targeting the same domain, which +might result in that domain ending as a zombie domain. + +In case set_global_virq_handler() is being called for a domain which +is just dying, it might happen that clear_global_virq_handlers() is +running first, resulting in set_global_virq_handler() taking a new +reference for that domain and entering in the global_virq_handlers[] +array afterwards. The reference will never be dropped, thus the domain +will never be freed completely. + +This can be fixed by checking the is_dying state of the domain inside +the region guarded by global_virq_handlers_lock. In case the domain is +dying, handle it as if the domain wouldn't exist, which will be the +case in near future anyway. 
+ +Fixes: 87521589aa6a ("xen: allow global VIRQ handlers to be delegated to other domains") +Signed-off-by: Juergen Gross +Reviewed-by: Jan Beulich +master commit: 4d8acc9c1cf14233dda21dd3a7791b5a84b0f6c3 +master date: 2025-01-09 17:34:01 +0100 +--- + xen/common/event_channel.c | 25 ++++++++++++++++++++++--- + 1 file changed, 22 insertions(+), 3 deletions(-) + +diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c +index aac7d4b93f..cc93c52849 100644 +--- a/xen/common/event_channel.c ++++ b/xen/common/event_channel.c +@@ -977,6 +977,7 @@ void send_global_virq(uint32_t virq) + int set_global_virq_handler(struct domain *d, uint32_t virq) + { + struct domain *old; ++ int rc = 0; + + if (virq >= NR_VIRQS) + return -EINVAL; +@@ -990,14 +991,32 @@ int set_global_virq_handler(struct domain *d, uint32_t virq) + return -EINVAL; + + spin_lock(&global_virq_handlers_lock); +- old = global_virq_handlers[virq]; +- global_virq_handlers[virq] = d; ++ ++ /* ++ * Note that this check won't guarantee that a domain just going down can't ++ * be set as the handling domain of a virq, as the is_dying indicator might ++ * change just after testing it. ++ * This isn't going to be a major problem, as clear_global_virq_handlers() ++ * is guaranteed to run afterwards and it will reset the handling domain ++ * for the virq to the hardware domain. 
++ */ ++ if ( d->is_dying != DOMDYING_alive ) ++ { ++ old = d; ++ rc = -EINVAL; ++ } ++ else ++ { ++ old = global_virq_handlers[virq]; ++ global_virq_handlers[virq] = d; ++ } ++ + spin_unlock(&global_virq_handlers_lock); + + if (old != NULL) + put_domain(old); + +- return 0; ++ return rc; + } + + static void clear_global_virq_handlers(struct domain *d) +-- +2.48.1 + diff --git a/0020-x86-HVM-reduce-recursion-in-linear_-read-write.patch b/0020-x86-HVM-reduce-recursion-in-linear_-read-write.patch new file mode 100644 index 0000000..4edb49d --- /dev/null +++ b/0020-x86-HVM-reduce-recursion-in-linear_-read-write.patch @@ -0,0 +1,85 @@ +From 41d38d270e18ded067b1800b76c2e0e16844d233 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:17:45 +0100 +Subject: [PATCH 20/53] x86/HVM: reduce recursion in linear_{read,write}() + +Let's make explicit what the compiler may or may not do on our behalf: +The 2nd of the recursive invocations each can fall through rather than +re-invoking the function. This will save us from adding yet another +parameter (or more) to the function, just for the recursive invocations. 
+ +Signed-off-by: Jan Beulich +Reviewed-by: Andrew Cooper +master commit: 18053054b7583810dd356efc8d7018bbc8720f36 +master date: 2024-09-09 13:40:47 +0200 +--- + xen/arch/x86/hvm/emulate.c | 28 ++++++++++++++++++---------- + 1 file changed, 18 insertions(+), 10 deletions(-) + +diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c +index 02e378365b..9e62b2f184 100644 +--- a/xen/arch/x86/hvm/emulate.c ++++ b/xen/arch/x86/hvm/emulate.c +@@ -1147,7 +1147,7 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data, + pagefault_info_t pfinfo; + struct hvm_vcpu_io *hvio = ¤t->arch.hvm.hvm_io; + unsigned int offset = addr & ~PAGE_MASK; +- int rc = HVMTRANS_bad_gfn_to_mfn; ++ int rc; + + if ( offset + bytes > PAGE_SIZE ) + { +@@ -1155,12 +1155,16 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data, + + /* Split the access at the page boundary. */ + rc = linear_read(addr, part1, p_data, pfec, hvmemul_ctxt); +- if ( rc == X86EMUL_OKAY ) +- rc = linear_read(addr + part1, bytes - part1, p_data + part1, +- pfec, hvmemul_ctxt); +- return rc; ++ if ( rc != X86EMUL_OKAY ) ++ return rc; ++ ++ addr += part1; ++ bytes -= part1; ++ p_data += part1; + } + ++ rc = HVMTRANS_bad_gfn_to_mfn; ++ + /* + * If there is an MMIO cache entry for the access then we must be re-issuing + * an access that was previously handled as MMIO. Thus it is imperative that +@@ -1202,7 +1206,7 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data, + pagefault_info_t pfinfo; + struct hvm_vcpu_io *hvio = ¤t->arch.hvm.hvm_io; + unsigned int offset = addr & ~PAGE_MASK; +- int rc = HVMTRANS_bad_gfn_to_mfn; ++ int rc; + + if ( offset + bytes > PAGE_SIZE ) + { +@@ -1210,12 +1214,16 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data, + + /* Split the access at the page boundary. 
*/ + rc = linear_write(addr, part1, p_data, pfec, hvmemul_ctxt); +- if ( rc == X86EMUL_OKAY ) +- rc = linear_write(addr + part1, bytes - part1, p_data + part1, +- pfec, hvmemul_ctxt); +- return rc; ++ if ( rc != X86EMUL_OKAY ) ++ return rc; ++ ++ addr += part1; ++ bytes -= part1; ++ p_data += part1; + } + ++ rc = HVMTRANS_bad_gfn_to_mfn; ++ + /* + * If there is an MMIO cache entry for the access then we must be re-issuing + * an access that was previously handled as MMIO. Thus it is imperative that +-- +2.48.1 + diff --git a/0020-x86-pv-Fix-merging-of-new-status-bits-into-dr6.patch b/0020-x86-pv-Fix-merging-of-new-status-bits-into-dr6.patch deleted file mode 100644 index c0ea772..0000000 --- a/0020-x86-pv-Fix-merging-of-new-status-bits-into-dr6.patch +++ /dev/null @@ -1,222 +0,0 @@ -From b74a5ea8399d1a0466c55332f557863acdae21b6 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 24 Sep 2024 14:34:30 +0200 -Subject: [PATCH 20/83] x86/pv: Fix merging of new status bits into %dr6 -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -All #DB exceptions result in an update of %dr6, but this isn't captured in -Xen's handling, and is buggy just about everywhere. - -To begin resolving this issue, add a new pending_dbg field to x86_event -(unioned with cr2 to avoid taking any extra space, adjusting users to avoid -old-GCC bugs with anonymous unions), and introduce pv_inject_DB() to replace -the current callers using pv_inject_hw_exception(). - -Push the adjustment of v->arch.dr6 into pv_inject_event(), and use the new -x86_merge_dr6() rather than the current incorrect logic. - -A key property is that pending_dbg is taken with positive polarity to deal -with RTM/BLD sensibly. Most callers pass in a constant, but callers passing -in a hardware %dr6 value need to XOR the value with X86_DR6_DEFAULT to flip to -positive polarity. 
- -This fixes the behaviour of the breakpoint status bits; that any left pending -are generally discarded when a new #DB is raised. In principle it would fix -RTM/BLD too, except PV guests can't turn these capabilities on to start with. - -Signed-off-by: Andrew Cooper -Reviewed-by: Roger Pau Monné -Reviewed-by: Jan Beulich -master commit: db39fa4b27ea470902d4625567cb6fa24030ddfa -master date: 2024-08-21 23:59:19 +0100 ---- - xen/arch/x86/include/asm/domain.h | 18 ++++++++++++++++-- - xen/arch/x86/include/asm/hvm/hvm.h | 3 ++- - xen/arch/x86/pv/emul-priv-op.c | 5 +---- - xen/arch/x86/pv/emulate.c | 9 +++++++-- - xen/arch/x86/pv/ro-page-fault.c | 2 +- - xen/arch/x86/pv/traps.c | 16 ++++++++++++---- - xen/arch/x86/traps.c | 2 +- - xen/arch/x86/x86_emulate/x86_emulate.h | 5 ++++- - 8 files changed, 44 insertions(+), 16 deletions(-) - -diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h -index f5daeb182b..5d92891e6f 100644 ---- a/xen/arch/x86/include/asm/domain.h -+++ b/xen/arch/x86/include/asm/domain.h -@@ -731,15 +731,29 @@ static inline void pv_inject_hw_exception(unsigned int vector, int errcode) - pv_inject_event(&event); - } - -+static inline void pv_inject_DB(unsigned long pending_dbg) -+{ -+ struct x86_event event = { -+ .vector = X86_EXC_DB, -+ .type = X86_EVENTTYPE_HW_EXCEPTION, -+ .error_code = X86_EVENT_NO_EC, -+ }; -+ -+ event.pending_dbg = pending_dbg; -+ -+ pv_inject_event(&event); -+} -+ - static inline void pv_inject_page_fault(int errcode, unsigned long cr2) - { -- const struct x86_event event = { -+ struct x86_event event = { - .vector = X86_EXC_PF, - .type = X86_EVENTTYPE_HW_EXCEPTION, - .error_code = errcode, -- .cr2 = cr2, - }; - -+ event.cr2 = cr2; -+ - pv_inject_event(&event); - } - -diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h -index 1c01e22c8e..238eece0cf 100644 ---- a/xen/arch/x86/include/asm/hvm/hvm.h -+++ b/xen/arch/x86/include/asm/hvm/hvm.h -@@ -525,9 +525,10 @@ static 
inline void hvm_inject_page_fault(int errcode, unsigned long cr2) - .vector = X86_EXC_PF, - .type = X86_EVENTTYPE_HW_EXCEPTION, - .error_code = errcode, -- .cr2 = cr2, - }; - -+ event.cr2 = cr2; -+ - hvm_inject_event(&event); - } - -diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c -index aa11ecadaa..15c83b9d23 100644 ---- a/xen/arch/x86/pv/emul-priv-op.c -+++ b/xen/arch/x86/pv/emul-priv-op.c -@@ -1366,10 +1366,7 @@ int pv_emulate_privileged_op(struct cpu_user_regs *regs) - ctxt.bpmatch |= DR_STEP; - - if ( ctxt.bpmatch ) -- { -- curr->arch.dr6 |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE; -- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); -- } -+ pv_inject_DB(ctxt.bpmatch); - - /* fall through */ - case X86EMUL_RETRY: -diff --git a/xen/arch/x86/pv/emulate.c b/xen/arch/x86/pv/emulate.c -index e7a1c0a2cc..8c44dea123 100644 ---- a/xen/arch/x86/pv/emulate.c -+++ b/xen/arch/x86/pv/emulate.c -@@ -71,10 +71,15 @@ void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip) - { - regs->rip = rip; - regs->eflags &= ~X86_EFLAGS_RF; -+ - if ( regs->eflags & X86_EFLAGS_TF ) - { -- current->arch.dr6 |= DR_STEP | DR_STATUS_RESERVED_ONE; -- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); -+ /* -+ * TODO: this should generally use TF from the start of the -+ * instruction. It's only a latent bug for now, as this path isn't -+ * used for any instruction which modifies eflags. 
-+ */ -+ pv_inject_DB(X86_DR6_BS); - } - } - -diff --git a/xen/arch/x86/pv/ro-page-fault.c b/xen/arch/x86/pv/ro-page-fault.c -index cad28ef928..d0fe07e3a1 100644 ---- a/xen/arch/x86/pv/ro-page-fault.c -+++ b/xen/arch/x86/pv/ro-page-fault.c -@@ -390,7 +390,7 @@ int pv_ro_page_fault(unsigned long addr, struct cpu_user_regs *regs) - /* Fallthrough */ - case X86EMUL_OKAY: - if ( ctxt.retire.singlestep ) -- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); -+ pv_inject_DB(X86_DR6_BS); - - /* Fallthrough */ - case X86EMUL_RETRY: -diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c -index 83e84e2762..5a7341abf0 100644 ---- a/xen/arch/x86/pv/traps.c -+++ b/xen/arch/x86/pv/traps.c -@@ -12,6 +12,7 @@ - #include - #include - -+#include - #include - #include - #include -@@ -50,9 +51,9 @@ void pv_inject_event(const struct x86_event *event) - tb->cs = ti->cs; - tb->eip = ti->address; - -- if ( event->type == X86_EVENTTYPE_HW_EXCEPTION && -- vector == X86_EXC_PF ) -+ switch ( vector | -(event->type == X86_EVENTTYPE_SW_INTERRUPT) ) - { -+ case X86_EXC_PF: - curr->arch.pv.ctrlreg[2] = event->cr2; - arch_set_cr2(curr, event->cr2); - -@@ -62,9 +63,16 @@ void pv_inject_event(const struct x86_event *event) - error_code |= PFEC_user_mode; - - trace_pv_page_fault(event->cr2, error_code); -- } -- else -+ break; -+ -+ case X86_EXC_DB: -+ curr->arch.dr6 = x86_merge_dr6(curr->domain->arch.cpu_policy, -+ curr->arch.dr6, event->pending_dbg); -+ fallthrough; -+ default: - trace_pv_trap(vector, regs->rip, use_error_code, error_code); -+ break; -+ } - - if ( use_error_code ) - { -diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c -index 78e83f6fc1..8e2df3e719 100644 ---- a/xen/arch/x86/traps.c -+++ b/xen/arch/x86/traps.c -@@ -2032,7 +2032,7 @@ void asmlinkage do_debug(struct cpu_user_regs *regs) - return; - } - -- pv_inject_hw_exception(X86_EXC_DB, X86_EVENT_NO_EC); -+ pv_inject_DB(0 /* N/A, already merged */); - } - - void asmlinkage do_entry_CP(struct cpu_user_regs *regs) 
-diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h -index d92be69d84..e8a0e57228 100644 ---- a/xen/arch/x86/x86_emulate/x86_emulate.h -+++ b/xen/arch/x86/x86_emulate/x86_emulate.h -@@ -78,7 +78,10 @@ struct x86_event { - uint8_t type; /* X86_EVENTTYPE_* */ - uint8_t insn_len; /* Instruction length */ - int32_t error_code; /* X86_EVENT_NO_EC if n/a */ -- unsigned long cr2; /* Only for X86_EXC_PF h/w exception */ -+ union { -+ unsigned long cr2; /* #PF */ -+ unsigned long pending_dbg; /* #DB (new DR6 bits, positive polarity) */ -+ }; - }; - - /* --- -2.47.0 - diff --git a/0021-x86-HVM-correct-MMIO-emulation-cache-bounds-check.patch b/0021-x86-HVM-correct-MMIO-emulation-cache-bounds-check.patch new file mode 100644 index 0000000..1ce9d58 --- /dev/null +++ b/0021-x86-HVM-correct-MMIO-emulation-cache-bounds-check.patch @@ -0,0 +1,36 @@ +From 15f83e27444768c60671d7b8c3fdc597c4fca563 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:18:15 +0100 +Subject: [PATCH 21/53] x86/HVM: correct MMIO emulation cache bounds check +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +To avoid overrunning the internal buffer we need to take the offset into +the buffer into account. + +Fixes: d95da91fb497 ("x86/HVM: grow MMIO cache data size to 64 bytes") +Signed-off-by: Jan Beulich +Reviewed-by: Roger Pau Monné +master commit: e5339bb689dfa79a914c6c96e1d82d61e1ae3161 +master date: 2025-01-23 11:14:48 +0100 +--- + xen/arch/x86/hvm/emulate.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c +index 9e62b2f184..10737533ba 100644 +--- a/xen/arch/x86/hvm/emulate.c ++++ b/xen/arch/x86/hvm/emulate.c +@@ -936,7 +936,7 @@ static int hvmemul_phys_mmio_access( + } + + /* Accesses must not overflow the cache's buffer. 
*/ +- if ( size > sizeof(cache->buffer) ) ++ if ( offset + size > sizeof(cache->buffer) ) + { + ASSERT_UNREACHABLE(); + return X86EMUL_UNHANDLEABLE; +-- +2.48.1 + diff --git a/0021-x86-pv-Address-Coverity-complaint-in-check_guest_io_.patch b/0021-x86-pv-Address-Coverity-complaint-in-check_guest_io_.patch deleted file mode 100644 index a54c8e5..0000000 --- a/0021-x86-pv-Address-Coverity-complaint-in-check_guest_io_.patch +++ /dev/null @@ -1,112 +0,0 @@ -From cb6c3cfc5f8aa8bd8aae1abffea0574b02a04840 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 24 Sep 2024 14:36:25 +0200 -Subject: [PATCH 21/83] x86/pv: Address Coverity complaint in - check_guest_io_breakpoint() - -Commit 08aacc392d86 ("x86/emul: Fix misaligned IO breakpoint behaviour in PV -guests") caused a Coverity INTEGER_OVERFLOW complaint based on the reasoning -that width could be 0. - -It can't, but digging into the code generation, GCC 8 and later (bisected on -godbolt) choose to emit a CSWITCH lookup table, and because the range (bottom -2 bits clear), it's a 16-entry lookup table. - -So Coverity is understandable, given that GCC did emit a (dead) logic path -where width stayed 0. - -Rewrite the logic. Introduce x86_bp_width() which compiles to a single basic -block, which replaces the switch() statement. Take the opportunity to also -make start and width be loop-scope variables. - -No practical change, but it should compile better and placate Coverity. 
- -Fixes: 08aacc392d86 ("x86/emul: Fix misaligned IO breakpoint behaviour in PV guests") -Coverity-ID: 1616152 -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -master commit: 6d41a9d8a12ff89adabdc286e63e9391a0481699 -master date: 2024-08-21 23:59:19 +0100 ---- - xen/arch/x86/include/asm/debugreg.h | 25 +++++++++++++++++++++++++ - xen/arch/x86/pv/emul-priv-op.c | 21 ++++++--------------- - 2 files changed, 31 insertions(+), 15 deletions(-) - -diff --git a/xen/arch/x86/include/asm/debugreg.h b/xen/arch/x86/include/asm/debugreg.h -index 6baa725441..23aa592e40 100644 ---- a/xen/arch/x86/include/asm/debugreg.h -+++ b/xen/arch/x86/include/asm/debugreg.h -@@ -115,4 +115,29 @@ unsigned int x86_adj_dr7_rsvd(const struct cpu_policy *p, unsigned int dr7); - unsigned int x86_merge_dr6(const struct cpu_policy *p, unsigned int dr6, - unsigned int new); - -+/* -+ * Calculate the width of a breakpoint from its dr7 encoding. -+ * -+ * The LEN encoding in dr7 is 2 bits wide per breakpoint and encoded as a X-1 -+ * (0, 1 and 3) for widths of 1, 2 and 4 respectively in the 32bit days. -+ * -+ * In 64bit, the unused value (2) was given a meaning of width 8, which is -+ * great for efficiency but less great for nicely calculating the width. -+ */ -+static inline unsigned int x86_bp_width(unsigned int dr7, unsigned int bp) -+{ -+ unsigned int raw = (dr7 >> (DR_CONTROL_SHIFT + -+ DR_CONTROL_SIZE * bp + 2)) & 3; -+ -+ /* -+ * If the top bit is set (i.e. we've got an 4 or 8 byte wide breakpoint), -+ * flip the bottom to reverse their order, making them sorted properly. -+ * Then it's a simple shift to calculate the width. 
-+ */ -+ if ( raw & 2 ) -+ raw ^= 1; -+ -+ return 1U << raw; -+} -+ - #endif /* _X86_DEBUGREG_H */ -diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c -index 15c83b9d23..b90f745c75 100644 ---- a/xen/arch/x86/pv/emul-priv-op.c -+++ b/xen/arch/x86/pv/emul-priv-op.c -@@ -323,30 +323,21 @@ static unsigned int check_guest_io_breakpoint(struct vcpu *v, - unsigned int port, - unsigned int len) - { -- unsigned int width, i, match = 0; -- unsigned long start; -+ unsigned int i, match = 0; - - if ( !v->arch.pv.dr7_emul || !(v->arch.pv.ctrlreg[4] & X86_CR4_DE) ) - return 0; - - for ( i = 0; i < 4; i++ ) - { -+ unsigned long start; -+ unsigned int width; -+ - if ( !(v->arch.pv.dr7_emul & (3 << (i * DR_ENABLE_SIZE))) ) - continue; - -- start = v->arch.dr[i]; -- width = 0; -- -- switch ( (v->arch.dr7 >> -- (DR_CONTROL_SHIFT + i * DR_CONTROL_SIZE)) & 0xc ) -- { -- case DR_LEN_1: width = 1; break; -- case DR_LEN_2: width = 2; break; -- case DR_LEN_4: width = 4; break; -- case DR_LEN_8: width = 8; break; -- } -- -- start &= ~(width - 1UL); -+ width = x86_bp_width(v->arch.dr7, i); -+ start = v->arch.dr[i] & ~(width - 1UL); - - if ( (start < (port + len)) && ((start + width) > port) ) - match |= 1u << i; --- -2.47.0 - diff --git a/0022-x86-HVM-allocate-emulation-cache-entries-dynamically.patch b/0022-x86-HVM-allocate-emulation-cache-entries-dynamically.patch new file mode 100644 index 0000000..97a0936 --- /dev/null +++ b/0022-x86-HVM-allocate-emulation-cache-entries-dynamically.patch @@ -0,0 +1,172 @@ +From dcac022fd932a844d1d9a22fe231cef939921945 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:19:04 +0100 +Subject: [PATCH 22/53] x86/HVM: allocate emulation cache entries dynamically +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Both caches may need higher capacity, and the upper bound will need to +be determined dynamically based on CPUID policy (for AMX'es TILELOAD / +TILESTORE at 
least). + +Signed-off-by: Jan Beulich +Reviewed-by: Roger Pau Monné +master commit: 23d60dbb0493b2f9ec1d89be5341eec2ee9dab32 +master date: 2025-01-24 10:15:29 +0100 +--- + xen/arch/x86/hvm/emulate.c | 51 ++++++++++++++++++++------ + xen/arch/x86/include/asm/hvm/emulate.h | 4 ++ + xen/arch/x86/include/asm/hvm/vcpu.h | 13 +------ + 3 files changed, 44 insertions(+), 24 deletions(-) + +diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c +index 10737533ba..5c84aff1dc 100644 +--- a/xen/arch/x86/hvm/emulate.c ++++ b/xen/arch/x86/hvm/emulate.c +@@ -26,6 +26,18 @@ + #include + #include + ++/* ++ * We may read or write up to m512 or up to a tile row as a number of ++ * device-model transactions. ++ */ ++struct hvm_mmio_cache { ++ unsigned long gla; ++ unsigned int size; /* Amount of buffer[] actually used. */ ++ unsigned int space:31; /* Allocated size of buffer[]. */ ++ unsigned int dir:1; ++ uint8_t buffer[] __aligned(sizeof(long)); ++}; ++ + struct hvmemul_cache + { + /* The cache is disabled as long as num_ents > max_ents. */ +@@ -936,7 +948,7 @@ static int hvmemul_phys_mmio_access( + } + + /* Accesses must not overflow the cache's buffer. 
*/ +- if ( offset + size > sizeof(cache->buffer) ) ++ if ( offset + size > cache->space ) + { + ASSERT_UNREACHABLE(); + return X86EMUL_UNHANDLEABLE; +@@ -1012,7 +1024,7 @@ static struct hvm_mmio_cache *hvmemul_find_mmio_cache( + + for ( i = 0; i < hvio->mmio_cache_count; i ++ ) + { +- cache = &hvio->mmio_cache[i]; ++ cache = hvio->mmio_cache[i]; + + if ( gla == cache->gla && + dir == cache->dir ) +@@ -1028,10 +1040,11 @@ static struct hvm_mmio_cache *hvmemul_find_mmio_cache( + + ++hvio->mmio_cache_count; + +- cache = &hvio->mmio_cache[i]; +- memset(cache, 0, sizeof (*cache)); ++ cache = hvio->mmio_cache[i]; ++ memset(cache->buffer, 0, cache->space); + + cache->gla = gla; ++ cache->size = 0; + cache->dir = dir; + + return cache; +@@ -2977,16 +2990,21 @@ void hvm_dump_emulation_state(const char *loglvl, const char *prefix, + int hvmemul_cache_init(struct vcpu *v) + { + /* +- * No insn can access more than 16 independent linear addresses (AVX512F +- * scatters/gathers being the worst). Each such linear range can span a +- * page boundary, i.e. may require two page walks. Account for each insn +- * byte individually, for simplicity. ++ * AVX512F scatter/gather insns can access up to 16 independent linear ++ * addresses, up to 8 bytes size. Each such linear range can span a page ++ * boundary, i.e. may require two page walks. ++ */ ++ unsigned int nents = 16 * 2 * (CONFIG_PAGING_LEVELS + 1); ++ unsigned int i, max_bytes = 64; ++ struct hvmemul_cache *cache; ++ ++ /* ++ * Account for each insn byte individually, both for simplicity and to ++ * leave some slack space. 
+ */ +- const unsigned int nents = (CONFIG_PAGING_LEVELS + 1) * +- (MAX_INST_LEN + 16 * 2); +- struct hvmemul_cache *cache = xmalloc_flex_struct(struct hvmemul_cache, +- ents, nents); ++ nents += MAX_INST_LEN * (CONFIG_PAGING_LEVELS + 1); + ++ cache = xmalloc_flex_struct(struct hvmemul_cache, ents, nents); + if ( !cache ) + return -ENOMEM; + +@@ -2996,6 +3014,15 @@ int hvmemul_cache_init(struct vcpu *v) + + v->arch.hvm.hvm_io.cache = cache; + ++ for ( i = 0; i < ARRAY_SIZE(v->arch.hvm.hvm_io.mmio_cache); ++i ) ++ { ++ v->arch.hvm.hvm_io.mmio_cache[i] = ++ xmalloc_flex_struct(struct hvm_mmio_cache, buffer, max_bytes); ++ if ( !v->arch.hvm.hvm_io.mmio_cache[i] ) ++ return -ENOMEM; ++ v->arch.hvm.hvm_io.mmio_cache[i]->space = max_bytes; ++ } ++ + return 0; + } + +diff --git a/xen/arch/x86/include/asm/hvm/emulate.h b/xen/arch/x86/include/asm/hvm/emulate.h +index 29d679442e..2e1eedefa7 100644 +--- a/xen/arch/x86/include/asm/hvm/emulate.h ++++ b/xen/arch/x86/include/asm/hvm/emulate.h +@@ -119,6 +119,10 @@ int hvmemul_do_pio_buffer(uint16_t port, + int __must_check hvmemul_cache_init(struct vcpu *v); + static inline void hvmemul_cache_destroy(struct vcpu *v) + { ++ unsigned int i; ++ ++ for ( i = 0; i < ARRAY_SIZE(v->arch.hvm.hvm_io.mmio_cache); ++i ) ++ XFREE(v->arch.hvm.hvm_io.mmio_cache[i]); + XFREE(v->arch.hvm.hvm_io.cache); + } + bool hvmemul_read_cache(const struct vcpu *v, paddr_t gpa, +diff --git a/xen/arch/x86/include/asm/hvm/vcpu.h b/xen/arch/x86/include/asm/hvm/vcpu.h +index 64c7a6fede..ddf9f8b831 100644 +--- a/xen/arch/x86/include/asm/hvm/vcpu.h ++++ b/xen/arch/x86/include/asm/hvm/vcpu.h +@@ -22,17 +22,6 @@ struct hvm_vcpu_asid { + uint32_t asid; + }; + +-/* +- * We may read or write up to m512 as a number of device-model +- * transactions. 
+- */ +-struct hvm_mmio_cache { +- unsigned long gla; +- unsigned int size; +- uint8_t dir; +- uint8_t buffer[64] __aligned(sizeof(long)); +-}; +- + struct hvm_vcpu_io { + /* + * HVM emulation: +@@ -48,7 +37,7 @@ struct hvm_vcpu_io { + * We may need to handle up to 3 distinct memory accesses per + * instruction. + */ +- struct hvm_mmio_cache mmio_cache[3]; ++ struct hvm_mmio_cache *mmio_cache[3]; + unsigned int mmio_cache_count; + + /* For retries we shouldn't re-fetch the instruction. */ +-- +2.48.1 + diff --git a/0022-x86emul-always-set-operand-size-for-AVX-VNNI-INT8-in.patch b/0022-x86emul-always-set-operand-size-for-AVX-VNNI-INT8-in.patch deleted file mode 100644 index 6d1a110..0000000 --- a/0022-x86emul-always-set-operand-size-for-AVX-VNNI-INT8-in.patch +++ /dev/null @@ -1,36 +0,0 @@ -From 1e68200487e662e9f8720d508a1d6b3d3e2c72b9 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:37:08 +0200 -Subject: [PATCH 22/83] x86emul: always set operand size for AVX-VNNI-INT8 - insns - -Unlike for AVX-VNNI-INT16 I failed to notice that op_bytes may still be -zero when reaching the respective case block: With the ext0f38_table[] -entries having simd_packed_int, the defaulting at the bottom of -x86emul_decode() won't set the field to non-zero for F3- or F2-prefixed -insns. 
- -Fixes: 842acaa743a5 ("x86emul: support AVX-VNNI-INT8") -Signed-off-by: Jan Beulich -Acked-by: Andrew Cooper -master commit: d45687cca2450bfebe1dfbddb22f4f03c6fbc9cb -master date: 2024-08-23 09:11:15 +0200 ---- - xen/arch/x86/x86_emulate/x86_emulate.c | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c -index 16557385bf..4d9649a2af 100644 ---- a/xen/arch/x86/x86_emulate/x86_emulate.c -+++ b/xen/arch/x86/x86_emulate/x86_emulate.c -@@ -6075,6 +6075,7 @@ x86_emulate( - case X86EMUL_OPC_VEX_F2(0x0f38, 0x51): /* vpdpbssds [xy]mm/mem,[xy]mm,[xy]mm */ - host_and_vcpu_must_have(avx_vnni_int8); - generate_exception_if(vex.w, X86_EXC_UD); -+ op_bytes = 16 << vex.l; - goto simd_0f_ymm; - - case X86EMUL_OPC_VEX_66(0x0f38, 0x50): /* vpdpbusd [xy]mm/mem,[xy]mm,[xy]mm */ --- -2.47.0 - diff --git a/0023-x86-HVM-correct-read-write-split-at-page-boundaries.patch b/0023-x86-HVM-correct-read-write-split-at-page-boundaries.patch new file mode 100644 index 0000000..847b2ab --- /dev/null +++ b/0023-x86-HVM-correct-read-write-split-at-page-boundaries.patch @@ -0,0 +1,241 @@ +From e990bf3de5fafd1414c0091c1066ce1463b01c1d Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:19:51 +0100 +Subject: [PATCH 23/53] x86/HVM: correct read/write split at page boundaries +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The MMIO cache is intended to have one entry used per independent memory +access that an insn does. This, in particular, is supposed to be +ignoring any page boundary crossing. Therefore when looking up a cache +entry, the access'es starting (linear) address is relevant, not the one +possibly advanced past a page boundary. 
+ +In order for the same offset-into-buffer variable to be usable in +hvmemul_phys_mmio_access() for both the caller's buffer and the cache +entry's it is further necessary to have the un-adjusted caller buffer +passed into there. + +Fixes: 2d527ba310dc ("x86/hvm: split all linear reads and writes at page boundary") +Reported-by: Manuel Andreas +Signed-off-by: Jan Beulich +Acked-by: Roger Pau Monné +master commit: 672894a11fe06e664a0ebfb600baf5dbb897b9e4 +master date: 2025-01-24 10:15:56 +0100 +--- + xen/arch/x86/hvm/emulate.c | 92 +++++++++++++++++++++++++------------- + 1 file changed, 61 insertions(+), 31 deletions(-) + +diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c +index 5c84aff1dc..fb4de6ee0a 100644 +--- a/xen/arch/x86/hvm/emulate.c ++++ b/xen/arch/x86/hvm/emulate.c +@@ -31,8 +31,9 @@ + * device-model transactions. + */ + struct hvm_mmio_cache { +- unsigned long gla; +- unsigned int size; /* Amount of buffer[] actually used. */ ++ unsigned long gla; /* Start of original access (e.g. insn operand). */ ++ unsigned int skip; /* Offset to start of MMIO */ ++ unsigned int size; /* Amount of buffer[] actually used, incl @skip. */ + unsigned int space:31; /* Allocated size of buffer[]. */ + unsigned int dir:1; + uint8_t buffer[] __aligned(sizeof(long)); +@@ -954,6 +955,13 @@ static int hvmemul_phys_mmio_access( + return X86EMUL_UNHANDLEABLE; + } + ++ /* Accesses must not be to the unused leading space. */ ++ if ( offset < cache->skip ) ++ { ++ ASSERT_UNREACHABLE(); ++ return X86EMUL_UNHANDLEABLE; ++ } ++ + /* + * hvmemul_do_io() cannot handle non-power-of-2 accesses or + * accesses larger than sizeof(long), so choose the highest power +@@ -1011,13 +1019,15 @@ static int hvmemul_phys_mmio_access( + + /* + * Multi-cycle MMIO handling is based upon the assumption that emulation +- * of the same instruction will not access the same MMIO region more +- * than once. 
Hence we can deal with re-emulation (for secondary or +- * subsequent cycles) by looking up the result or previous I/O in a +- * cache indexed by linear MMIO address. ++ * of the same instruction will not access the exact same MMIO region ++ * more than once in exactly the same way (if it does, the accesses will ++ * be "folded"). Hence we can deal with re-emulation (for secondary or ++ * subsequent cycles) by looking up the result of previous I/O in a cache ++ * indexed by linear address and access type. + */ + static struct hvm_mmio_cache *hvmemul_find_mmio_cache( +- struct hvm_vcpu_io *hvio, unsigned long gla, uint8_t dir, bool create) ++ struct hvm_vcpu_io *hvio, unsigned long gla, uint8_t dir, ++ unsigned int skip) + { + unsigned int i; + struct hvm_mmio_cache *cache; +@@ -1031,7 +1041,11 @@ static struct hvm_mmio_cache *hvmemul_find_mmio_cache( + return cache; + } + +- if ( !create ) ++ /* ++ * Bail if a new entry shouldn't be allocated, relying on ->space having ++ * the same value for all entries. 
++ */ ++ if ( skip >= hvio->mmio_cache[0]->space ) + return NULL; + + i = hvio->mmio_cache_count; +@@ -1044,7 +1058,8 @@ static struct hvm_mmio_cache *hvmemul_find_mmio_cache( + memset(cache->buffer, 0, cache->space); + + cache->gla = gla; +- cache->size = 0; ++ cache->skip = skip; ++ cache->size = skip; + cache->dir = dir; + + return cache; +@@ -1065,12 +1080,14 @@ static void latch_linear_to_phys(struct hvm_vcpu_io *hvio, unsigned long gla, + + static int hvmemul_linear_mmio_access( + unsigned long gla, unsigned int size, uint8_t dir, void *buffer, +- uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool known_gpfn) ++ uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, ++ unsigned long start_gla, bool known_gpfn) + { + struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io; + unsigned long offset = gla & ~PAGE_MASK; +- struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(hvio, gla, dir, true); +- unsigned int chunk, buffer_offset = 0; ++ unsigned int chunk, buffer_offset = gla - start_gla; ++ struct hvm_mmio_cache *cache = hvmemul_find_mmio_cache(hvio, start_gla, ++ dir, buffer_offset); + paddr_t gpa; + unsigned long one_rep = 1; + int rc; +@@ -1118,19 +1135,19 @@ static int hvmemul_linear_mmio_access( + static inline int hvmemul_linear_mmio_read( + unsigned long gla, unsigned int size, void *buffer, + uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, +- bool translate) ++ unsigned long start_gla, bool translate) + { +- return hvmemul_linear_mmio_access(gla, size, IOREQ_READ, buffer, +- pfec, hvmemul_ctxt, translate); ++ return hvmemul_linear_mmio_access(gla, size, IOREQ_READ, buffer, pfec, ++ hvmemul_ctxt, start_gla, translate); + } + + static inline int hvmemul_linear_mmio_write( + unsigned long gla, unsigned int size, void *buffer, + uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, +- bool translate) ++ unsigned long start_gla, bool translate) + { +- return hvmemul_linear_mmio_access(gla, size, IOREQ_WRITE, buffer, +- pfec, hvmemul_ctxt, translate);
++ return hvmemul_linear_mmio_access(gla, size, IOREQ_WRITE, buffer, pfec, ++ hvmemul_ctxt, start_gla, translate); + } + + static bool known_gla(unsigned long addr, unsigned int bytes, uint32_t pfec) +@@ -1159,7 +1176,10 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data, + { + pagefault_info_t pfinfo; + struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io; ++ void *buffer = p_data; ++ unsigned long start = addr; + unsigned int offset = addr & ~PAGE_MASK; ++ const struct hvm_mmio_cache *cache; + int rc; + + if ( offset + bytes > PAGE_SIZE ) +@@ -1183,8 +1203,17 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data, + * an access that was previously handled as MMIO. Thus it is imperative that + * we handle this access in the same way to guarantee completion and hence + * clean up any interim state. ++ * ++ * Care must be taken, however, to correctly deal with crossing RAM/MMIO or ++ * MMIO/RAM boundaries. While we want to use a single cache entry (tagged ++ * by the starting linear address), we need to continue issuing (i.e. also ++ * upon replay) the RAM access for anything that's ahead of or past MMIO, ++ * i.e. in RAM.
+ */ +- if ( !hvmemul_find_mmio_cache(hvio, addr, IOREQ_READ, false) ) ++ cache = hvmemul_find_mmio_cache(hvio, start, IOREQ_READ, ~0); ++ if ( !cache || ++ addr + bytes <= start + cache->skip || ++ addr >= start + cache->size ) + rc = hvm_copy_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo); + + switch ( rc ) +@@ -1200,8 +1229,8 @@ static int linear_read(unsigned long addr, unsigned int bytes, void *p_data, + if ( pfec & PFEC_insn_fetch ) + return X86EMUL_UNHANDLEABLE; + +- return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, +- hvmemul_ctxt, ++ return hvmemul_linear_mmio_read(addr, bytes, buffer, pfec, ++ hvmemul_ctxt, start, + known_gla(addr, bytes, pfec)); + + case HVMTRANS_gfn_paged_out: +@@ -1218,7 +1247,10 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data, + { + pagefault_info_t pfinfo; + struct hvm_vcpu_io *hvio = &current->arch.hvm.hvm_io; ++ void *buffer = p_data; ++ unsigned long start = addr; + unsigned int offset = addr & ~PAGE_MASK; ++ const struct hvm_mmio_cache *cache; + int rc; + + if ( offset + bytes > PAGE_SIZE ) +@@ -1237,13 +1269,11 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data, + + rc = HVMTRANS_bad_gfn_to_mfn; + +- /* +- * If there is an MMIO cache entry for the access then we must be re-issuing +- * an access that was previously handled as MMIO. Thus it is imperative that +- * we handle this access in the same way to guarantee completion and hence +- * clean up any interim state. +- */ +- if ( !hvmemul_find_mmio_cache(hvio, addr, IOREQ_WRITE, false) ) ++ /* See commentary in linear_read().
*/ ++ cache = hvmemul_find_mmio_cache(hvio, start, IOREQ_WRITE, ~0); ++ if ( !cache || ++ addr + bytes <= start + cache->skip || ++ addr >= start + cache->size ) + rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo); + + switch ( rc ) +@@ -1256,8 +1286,8 @@ static int linear_write(unsigned long addr, unsigned int bytes, void *p_data, + return X86EMUL_EXCEPTION; + + case HVMTRANS_bad_gfn_to_mfn: +- return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, +- hvmemul_ctxt, ++ return hvmemul_linear_mmio_write(addr, bytes, buffer, pfec, ++ hvmemul_ctxt, start, + known_gla(addr, bytes, pfec)); + + case HVMTRANS_gfn_paged_out: +@@ -1644,7 +1674,7 @@ static int cf_check hvmemul_cmpxchg( + { + /* Fix this in case the guest is really relying on r-m-w atomicity. */ + return hvmemul_linear_mmio_write(addr, bytes, p_new, pfec, +- hvmemul_ctxt, ++ hvmemul_ctxt, addr, + hvio->mmio_access.write_access && + hvio->mmio_gla == (addr & PAGE_MASK)); + } +-- +2.48.1 + diff --git a/0023-x86emul-set-fake-operand-size-for-AVX512CD-broadcast.patch b/0023-x86emul-set-fake-operand-size-for-AVX512CD-broadcast.patch deleted file mode 100644 index df4bf74..0000000 --- a/0023-x86emul-set-fake-operand-size-for-AVX512CD-broadcast.patch +++ /dev/null @@ -1,35 +0,0 @@ -From a0d6b75b832d2f7c54429de1a550fe122bcd6881 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:37:52 +0200 -Subject: [PATCH 23/83] x86emul: set (fake) operand size for AVX512CD broadcast - insns - -Back at the time I failed to pay attention to op_bytes still being zero -when reaching the respective case block: With the ext0f38_table[] -entries having simd_packed_int, the defaulting at the bottom of -x86emul_decode() won't set the field to non-zero for F3-prefixed insns. 
- -Fixes: 37ccca740c26 ("x86emul: support AVX512CD insns") -Signed-off-by: Jan Beulich -Acked-by: Andrew Cooper -master commit: 6fa6b7feaafd622db3a2f3436750cf07782f4c12 -master date: 2024-08-23 09:12:24 +0200 ---- - xen/arch/x86/x86_emulate/x86_emulate.c | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c -index 4d9649a2af..305f4286bf 100644 ---- a/xen/arch/x86/x86_emulate/x86_emulate.c -+++ b/xen/arch/x86/x86_emulate/x86_emulate.c -@@ -5928,6 +5928,7 @@ x86_emulate( - evex.w == ((b >> 4) & 1)), - X86_EXC_UD); - d |= TwoOp; -+ op_bytes = 1; /* fake */ - /* fall through */ - case X86EMUL_OPC_EVEX_66(0x0f38, 0xc4): /* vpconflict{d,q} [xyz]mm/mem,[xyz]mm{k} */ - fault_suppression = false; --- -2.47.0 - diff --git a/0024-x86-iommu-check-for-CMPXCHG16B-when-enabling-IOMMU.patch b/0024-x86-iommu-check-for-CMPXCHG16B-when-enabling-IOMMU.patch new file mode 100644 index 0000000..3d6c303 --- /dev/null +++ b/0024-x86-iommu-check-for-CMPXCHG16B-when-enabling-IOMMU.patch @@ -0,0 +1,126 @@ +From 49c3324f3737fbdfdbfeab7a682f3c4665a732ce Mon Sep 17 00:00:00 2001 +From: Teddy Astie +Date: Mon, 17 Feb 2025 13:20:14 +0100 +Subject: [PATCH 24/53] x86/iommu: check for CMPXCHG16B when enabling IOMMU +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +All hardware with VT-d/AMD-Vi has CMPXCHG16B support. Check this at +initialisation time, and otherwise refuse to use the IOMMU. + +If the local APICs support x2APIC mode the IOMMU support for interrupt +remapping will be checked earlier using a specific helper. If no support +for CX16 is detected by that earlier hook disable the IOMMU at that point +and prevent further poking for CX16 later in the boot process, which would +also fail. + +There's a possible corner case when running virtualized, and the underlying +hypervisor exposing an IOMMU but no CMPXCHG16B support. 
In which case +ignoring the IOMMU is fine, albeit the most natural would be for the +underlying hypervisor to also expose CMPXCHG16B support if an IOMMU is +available to the VM. + +Note this change only introduces the checks, but doesn't remove the now +stale checks for CX16 support sprinkled in the IOMMU code. Further changes +will take care of that. + +Suggested-by: Andrew Cooper +Signed-off-by: Teddy Astie +Signed-off-by: Roger Pau Monné +Reviewed-by: Andrew Cooper +master commit: 2636fcdc15c707d5e097770133f0afb69e8d70c9 +master date: 2025-01-27 13:05:11 +0100 +--- + xen/drivers/passthrough/amd/iommu_intr.c | 13 +++++++++++++ + xen/drivers/passthrough/amd/pci_amd_iommu.c | 6 ++++++ + xen/drivers/passthrough/vtd/intremap.c | 13 +++++++++++++ + xen/drivers/passthrough/vtd/iommu.c | 7 +++++++ + 4 files changed, 39 insertions(+) + +diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c +index 7fc796dec2..f07fd9e3d9 100644 +--- a/xen/drivers/passthrough/amd/iommu_intr.c ++++ b/xen/drivers/passthrough/amd/iommu_intr.c +@@ -649,6 +649,19 @@ bool __init cf_check iov_supports_xt(void) + if ( !iommu_enable || !iommu_intremap ) + return false; + ++ if ( unlikely(!cpu_has_cx16) ) ++ { ++ AMD_IOMMU_ERROR("no CMPXCHG16B support, disabling IOMMU\n"); ++ /* ++ * Disable IOMMU support at once: there's no reason to check for CX16 ++ * yet again when attempting to initialize IOMMU DMA remapping ++ * functionality or interrupt remapping without x2APIC support. 
++ */ ++ iommu_enable = false; ++ iommu_intremap = iommu_intremap_off; ++ return false; ++ } ++ + if ( amd_iommu_prepare(true) ) + return false; + +diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c +index 73dcc4a2dd..f96f59440b 100644 +--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c ++++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c +@@ -309,6 +309,12 @@ static int __init cf_check iov_detect(void) + if ( !iommu_enable && !iommu_intremap ) + return 0; + ++ if ( unlikely(!cpu_has_cx16) ) ++ { ++ AMD_IOMMU_ERROR("no CMPXCHG16B support, disabling IOMMU\n"); ++ return -ENODEV; ++ } ++ + if ( (init_done ? amd_iommu_init_late() + : amd_iommu_init(false)) != 0 ) + { +diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c +index c504852eb8..233db5cb64 100644 +--- a/xen/drivers/passthrough/vtd/intremap.c ++++ b/xen/drivers/passthrough/vtd/intremap.c +@@ -150,6 +150,19 @@ bool __init cf_check intel_iommu_supports_eim(void) + if ( !iommu_qinval || !iommu_intremap || list_empty(&acpi_drhd_units) ) + return false; + ++ if ( unlikely(!cpu_has_cx16) ) ++ { ++ printk(XENLOG_ERR VTDPREFIX "no CMPXCHG16B support, disabling IOMMU\n"); ++ /* ++ * Disable IOMMU support at once: there's no reason to check for CX16 ++ * yet again when attempting to initialize IOMMU DMA remapping ++ * functionality or interrupt remapping without x2APIC support. ++ */ ++ iommu_enable = false; ++ iommu_intremap = iommu_intremap_off; ++ return false; ++ } ++ + /* We MUST have a DRHD unit for each IOAPIC. 
*/ + for ( apic = 0; apic < nr_ioapics; apic++ ) + if ( !ioapic_to_drhd(IO_APIC_ID(apic)) ) +diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c +index e13be244c1..ab38bbc8a5 100644 +--- a/xen/drivers/passthrough/vtd/iommu.c ++++ b/xen/drivers/passthrough/vtd/iommu.c +@@ -2630,6 +2630,13 @@ static int __init cf_check vtd_setup(void) + int ret; + bool reg_inval_supported = true; + ++ if ( unlikely(!cpu_has_cx16) ) ++ { ++ printk(XENLOG_ERR VTDPREFIX "no CMPXCHG16B support, disabling IOMMU\n"); ++ ret = -ENODEV; ++ goto error; ++ } ++ + if ( list_empty(&acpi_drhd_units) ) + { + ret = -ENODEV; +-- +2.48.1 + diff --git a/0024-x86-x2APIC-correct-cluster-tracking-upon-CPUs-going-.patch b/0024-x86-x2APIC-correct-cluster-tracking-upon-CPUs-going-.patch deleted file mode 100644 index cf1d5c1..0000000 --- a/0024-x86-x2APIC-correct-cluster-tracking-upon-CPUs-going-.patch +++ /dev/null @@ -1,52 +0,0 @@ -From 404fb9b745dd3f1ca17c3e957e43e3f95ab2613a Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:38:27 +0200 -Subject: [PATCH 24/83] x86/x2APIC: correct cluster tracking upon CPUs going - down for S3 -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Downing CPUs for S3 is somewhat special: Since we can expect the system -to come back up in exactly the same hardware configuration, per-CPU data -for the secondary CPUs isn't de-allocated (and then cleared upon re- -allocation when the CPUs are being brought back up). Therefore the -cluster_cpus per-CPU pointer will retain its value for all CPUs other -than the final one in a cluster (i.e. in particular for all CPUs in the -same cluster as CPU0). That, however, is in conflict with the assertion -early in init_apic_ldr_x2apic_cluster(). - -Note that the issue is avoided on Intel hardware, where we park CPUs -instead of bringing them down. 
- -Extend the bypassing of the freeing to the suspend case, thus making -suspend/resume also a tiny bit faster. - -Fixes: 2e6c8f182c9c ("x86: distinguish CPU offlining from CPU removal") -Reported-by: Marek Marczykowski-Górecki -Signed-off-by: Jan Beulich -Tested-by: Marek Marczykowski-Górecki -Acked-by: Andrew Cooper -master commit: ad3ff7b4279d16c91c23cda6e8be5bc670b25c9a -master date: 2024-08-26 10:30:40 +0200 ---- - xen/arch/x86/genapic/x2apic.c | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c -index 371dd100c7..d531035fa4 100644 ---- a/xen/arch/x86/genapic/x2apic.c -+++ b/xen/arch/x86/genapic/x2apic.c -@@ -228,7 +228,8 @@ static int cf_check update_clusterinfo( - case CPU_UP_CANCELED: - case CPU_DEAD: - case CPU_REMOVE: -- if ( park_offline_cpus == (action != CPU_REMOVE) ) -+ if ( park_offline_cpus == (action != CPU_REMOVE) || -+ system_state == SYS_STATE_suspend ) - break; - if ( per_cpu(cluster_cpus, cpu) ) - { --- -2.47.0 - diff --git a/0025-iommu-amd-atomically-update-IRTE.patch b/0025-iommu-amd-atomically-update-IRTE.patch new file mode 100644 index 0000000..6e99111 --- /dev/null +++ b/0025-iommu-amd-atomically-update-IRTE.patch @@ -0,0 +1,222 @@ +From 457b9e11fa131f3a9f376aed3e00f0d16d6ea7c7 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Mon, 17 Feb 2025 13:20:38 +0100 +Subject: [PATCH 25/53] iommu/amd: atomically update IRTE +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Either when using a 32bit Interrupt Remapping Entry or a 128bit one update +the entry atomically, by using cmpxchg unconditionally as IOMMU depends on +it. No longer disable the entry by setting RemapEn = 0 ahead of updating +it. As a consequence of not toggling RemapEn ahead of the update the +Interrupt Remapping Table needs to be flushed after the entry update. 
+ +This avoids a window where the IRTE has RemapEn = 0, which can lead to +IO_PAGE_FAULT if the underlying interrupt source is not masked. + +There's no guidance in AMD-Vi specification about how IRTE update should be +performed as opposed to DTE updating which has specific guidance. However +DTE updating claims that reads will always be at least 128bits in size, and +hence for the purposes here assume that reads and caching of the IRTE +entries in either 32 or 128 bit format will be done atomically from +the IOMMU. + +Note that as part of introducing a new raw128 field in the IRTE struct, the +current raw field is renamed to raw64 to explicitly contain the size in the +field name. + +Signed-off-by: Roger Pau Monné +Reviewed-by: Andrew Cooper +master commit: b953a99da98d63a7c827248abc450d4e8e015ab6 +master date: 2025-01-27 13:05:11 +0100 +--- + xen/drivers/passthrough/amd/iommu_intr.c | 75 ++++++++++-------------- + 1 file changed, 32 insertions(+), 43 deletions(-) + +diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c +index f07fd9e3d9..c0273059cb 100644 +--- a/xen/drivers/passthrough/amd/iommu_intr.c ++++ b/xen/drivers/passthrough/amd/iommu_intr.c +@@ -39,7 +39,8 @@ union irte32 { + }; + + union irte128 { +- uint64_t raw[2]; ++ uint64_t raw64[2]; ++ __uint128_t raw128; + struct { + bool remap_en:1; + bool sup_io_pf:1; +@@ -187,7 +188,7 @@ static void free_intremap_entry(const struct amd_iommu *iommu, + + if ( iommu->ctrl.ga_en ) + { +- ACCESS_ONCE(entry.ptr128->raw[0]) = 0; ++ ACCESS_ONCE(entry.ptr128->raw64[0]) = 0; + /* + * Low half (containing RemapEn) needs to be cleared first. Note that + * strictly speaking smp_wmb() isn't enough, as conceptually it expands +@@ -197,7 +198,7 @@ static void free_intremap_entry(const struct amd_iommu *iommu, + * variant will do. 
+ */ + smp_wmb(); +- entry.ptr128->raw[1] = 0; ++ entry.ptr128->raw64[1] = 0; + } + else + ACCESS_ONCE(entry.ptr32->raw) = 0; +@@ -212,7 +213,7 @@ static void update_intremap_entry(const struct amd_iommu *iommu, + { + if ( iommu->ctrl.ga_en ) + { +- union irte128 irte = { ++ const union irte128 irte = { + .full = { + .remap_en = true, + .int_type = int_type, +@@ -222,19 +223,26 @@ static void update_intremap_entry(const struct amd_iommu *iommu, + .vector = vector, + }, + }; ++ __uint128_t old = entry.ptr128->raw128; ++ __uint128_t res = cmpxchg16b(&entry.ptr128->raw128, &old, ++ &irte.raw128); + +- ASSERT(!entry.ptr128->full.remap_en); +- entry.ptr128->raw[1] = irte.raw[1]; + /* +- * High half needs to be set before low one (containing RemapEn). See +- * comment in free_intremap_entry() regarding the choice of barrier. ++ * Hardware does not update the IRTE behind our backs, so the return ++ * value should match "old". + */ +- smp_wmb(); +- ACCESS_ONCE(entry.ptr128->raw[0]) = irte.raw[0]; ++ if ( res != old ) ++ { ++ printk(XENLOG_ERR ++ "unexpected IRTE %016lx_%016lx (expected %016lx_%016lx)\n", ++ (uint64_t)(res >> 64), (uint64_t)res, ++ (uint64_t)(old >> 64), (uint64_t)old); ++ ASSERT_UNREACHABLE(); ++ } + } + else + { +- union irte32 irte = { ++ const union irte32 irte = { + .flds = { + .remap_en = true, + .int_type = int_type, +@@ -299,21 +307,13 @@ static int update_intremap_entry_from_ioapic( + + entry = get_intremap_entry(iommu, req_id, offset); + +- /* The RemapEn fields match for all formats. 
*/ +- while ( iommu->enabled && entry.ptr32->flds.remap_en ) +- { +- entry.ptr32->flds.remap_en = false; +- spin_unlock(lock); +- +- amd_iommu_flush_intremap(iommu, req_id); +- +- spin_lock(lock); +- } +- + update_intremap_entry(iommu, entry, vector, delivery_mode, dest_mode, dest); + + spin_unlock_irqrestore(lock, flags); + ++ if ( !fresh ) ++ amd_iommu_flush_intremap(iommu, req_id); ++ + set_rte_index(rte, offset); + + return 0; +@@ -322,7 +322,7 @@ static int update_intremap_entry_from_ioapic( + void cf_check amd_iommu_ioapic_update_ire( + unsigned int apic, unsigned int pin, uint64_t rte) + { +- struct IO_APIC_route_entry old_rte, new_rte; ++ struct IO_APIC_route_entry new_rte; + int seg, bdf, rc; + struct amd_iommu *iommu; + unsigned int idx; +@@ -346,14 +346,6 @@ void cf_check amd_iommu_ioapic_update_ire( + return; + } + +- old_rte = __ioapic_read_entry(apic, pin, true); +- /* mask the interrupt while we change the intremap table */ +- if ( !old_rte.mask ) +- { +- old_rte.mask = 1; +- __ioapic_write_entry(apic, pin, true, old_rte); +- } +- + /* Update interrupt remapping entry */ + rc = update_intremap_entry_from_ioapic( + bdf, iommu, &new_rte, +@@ -425,6 +417,7 @@ static int update_intremap_entry_from_msi_msg( + uint8_t delivery_mode, vector, dest_mode; + spinlock_t *lock; + unsigned int dest, offset, i; ++ bool fresh = false; + + req_id = get_dma_requestor_id(iommu->seg, bdf); + alias_id = get_intremap_requestor_id(iommu->seg, bdf); +@@ -468,26 +461,21 @@ static int update_intremap_entry_from_msi_msg( + return -ENOSPC; + } + *remap_index = offset; ++ fresh = true; + } + + entry = get_intremap_entry(iommu, req_id, offset); + +- /* The RemapEn fields match for all formats. 
*/ +- while ( iommu->enabled && entry.ptr32->flds.remap_en ) +- { +- entry.ptr32->flds.remap_en = false; +- spin_unlock(lock); ++ update_intremap_entry(iommu, entry, vector, delivery_mode, dest_mode, dest); ++ spin_unlock_irqrestore(lock, flags); + ++ if ( !fresh ) ++ { + amd_iommu_flush_intremap(iommu, req_id); + if ( alias_id != req_id ) + amd_iommu_flush_intremap(iommu, alias_id); +- +- spin_lock(lock); + } + +- update_intremap_entry(iommu, entry, vector, delivery_mode, dest_mode, dest); +- spin_unlock_irqrestore(lock, flags); +- + *data = (msg->data & ~(INTREMAP_MAX_ENTRIES - 1)) | offset; + + /* +@@ -735,7 +723,7 @@ static void dump_intremap_table(const struct amd_iommu *iommu, + for ( count = 0; count < nr; count++ ) + { + if ( iommu->ctrl.ga_en +- ? !tbl.ptr128[count].raw[0] && !tbl.ptr128[count].raw[1] ++ ? !tbl.ptr128[count].raw64[0] && !tbl.ptr128[count].raw64[1] + : !tbl.ptr32[count].raw ) + continue; + +@@ -748,7 +736,8 @@ static void dump_intremap_table(const struct amd_iommu *iommu, + + if ( iommu->ctrl.ga_en ) + printk(" IRTE[%03x] %016lx_%016lx\n", +- count, tbl.ptr128[count].raw[1], tbl.ptr128[count].raw[0]); ++ count, tbl.ptr128[count].raw64[1], ++ tbl.ptr128[count].raw64[0]); + else + printk(" IRTE[%03x] %08x\n", count, tbl.ptr32[count].raw); + } +-- +2.48.1 + diff --git a/0025-x86-dom0-disable-SMAP-for-PV-domain-building-only.patch b/0025-x86-dom0-disable-SMAP-for-PV-domain-building-only.patch deleted file mode 100644 index 1dd749f..0000000 --- a/0025-x86-dom0-disable-SMAP-for-PV-domain-building-only.patch +++ /dev/null @@ -1,145 +0,0 @@ -From 743af916723eb4f1197719fc0aebd4460bafb5bf Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 24 Sep 2024 14:39:23 +0200 -Subject: [PATCH 25/83] x86/dom0: disable SMAP for PV domain building only -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Move the logic that disables SMAP so it's only performed when building a PV -dom0, PVH 
dom0 builder doesn't require disabling SMAP. - -The fixes tag is to account for the wrong usage of cpu_has_smap in -create_dom0(), it should instead have used -boot_cpu_has(X86_FEATURE_XEN_SMAP). Fix while moving the logic to apply to PV -only. - -While there also make cr4_pv32_mask __ro_after_init. - -Fixes: 493ab190e5b1 ('xen/sm{e, a}p: allow disabling sm{e, a}p for Xen itself') -Signed-off-by: Roger Pau Monné -Reviewed-by: Jan Beulich -Reviewed-by: Andrew Cooper -master commit: fb1658221a31ec1db33253a80001191391e73b17 -master date: 2024-08-28 19:59:07 +0100 ---- - xen/arch/x86/include/asm/setup.h | 2 ++ - xen/arch/x86/pv/dom0_build.c | 40 ++++++++++++++++++++++++++++---- - xen/arch/x86/setup.c | 20 +--------------- - 3 files changed, 38 insertions(+), 24 deletions(-) - -diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h -index d75589178b..8f7dfefb4d 100644 ---- a/xen/arch/x86/include/asm/setup.h -+++ b/xen/arch/x86/include/asm/setup.h -@@ -64,6 +64,8 @@ extern bool opt_dom0_verbose; - extern bool opt_dom0_cpuid_faulting; - extern bool opt_dom0_msr_relaxed; - -+extern unsigned long cr4_pv32_mask; -+ - #define max_init_domid (0) - - #endif -diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c -index 57e58a02e7..07e9594493 100644 ---- a/xen/arch/x86/pv/dom0_build.c -+++ b/xen/arch/x86/pv/dom0_build.c -@@ -354,11 +354,11 @@ static struct page_info * __init alloc_chunk(struct domain *d, - return page; - } - --int __init dom0_construct_pv(struct domain *d, -- const module_t *image, -- unsigned long image_headroom, -- module_t *initrd, -- const char *cmdline) -+static int __init dom0_construct(struct domain *d, -+ const module_t *image, -+ unsigned long image_headroom, -+ module_t *initrd, -+ const char *cmdline) - { - int i, rc, order, machine; - bool compatible, compat; -@@ -1051,6 +1051,36 @@ out: - return rc; - } - -+int __init dom0_construct_pv(struct domain *d, -+ const module_t *image, -+ unsigned long 
image_headroom, -+ module_t *initrd, -+ const char *cmdline) -+{ -+ int rc; -+ -+ /* -+ * Clear SMAP in CR4 to allow user-accesses in construct_dom0(). This -+ * prevents us needing to rewrite construct_dom0() in terms of -+ * copy_{to,from}_user(). -+ */ -+ if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) ) -+ { -+ cr4_pv32_mask &= ~X86_CR4_SMAP; -+ write_cr4(read_cr4() & ~X86_CR4_SMAP); -+ } -+ -+ rc = dom0_construct(d, image, image_headroom, initrd, cmdline); -+ -+ if ( boot_cpu_has(X86_FEATURE_XEN_SMAP) ) -+ { -+ write_cr4(read_cr4() | X86_CR4_SMAP); -+ cr4_pv32_mask |= X86_CR4_SMAP; -+ } -+ -+ return rc; -+} -+ - /* - * Local variables: - * mode: C -diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c -index eee20bb175..f1076c7203 100644 ---- a/xen/arch/x86/setup.c -+++ b/xen/arch/x86/setup.c -@@ -79,8 +79,7 @@ bool __read_mostly use_invpcid; - int8_t __initdata opt_probe_port_aliases = -1; - boolean_param("probe-port-aliases", opt_probe_port_aliases); - --/* Only used in asm code and within this source file */ --unsigned long asmlinkage __read_mostly cr4_pv32_mask; -+unsigned long __ro_after_init cr4_pv32_mask; - - /* **** Linux config option: propagated to domain0. */ - /* "acpi=off": Sisables both ACPI table parsing and interpreter. */ -@@ -955,26 +954,9 @@ static struct domain *__init create_dom0(const module_t *image, - } - } - -- /* -- * Temporarily clear SMAP in CR4 to allow user-accesses in construct_dom0(). -- * This saves a large number of corner cases interactions with -- * copy_from_user(). 
-- */ -- if ( cpu_has_smap ) -- { -- cr4_pv32_mask &= ~X86_CR4_SMAP; -- write_cr4(read_cr4() & ~X86_CR4_SMAP); -- } -- - if ( construct_dom0(d, image, headroom, initrd, cmdline) != 0 ) - panic("Could not construct domain 0\n"); - -- if ( cpu_has_smap ) -- { -- write_cr4(read_cr4() | X86_CR4_SMAP); -- cr4_pv32_mask |= X86_CR4_SMAP; -- } -- - return d; - } - --- -2.47.0 - diff --git a/0026-x86-HVM-correct-partial-HPET_STATUS-write-emulation.patch b/0026-x86-HVM-correct-partial-HPET_STATUS-write-emulation.patch deleted file mode 100644 index 798680b..0000000 --- a/0026-x86-HVM-correct-partial-HPET_STATUS-write-emulation.patch +++ /dev/null @@ -1,37 +0,0 @@ -From 6e96dee93c60af4ee446f5e0fddf3b424824de18 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:40:03 +0200 -Subject: [PATCH 26/83] x86/HVM: correct partial HPET_STATUS write emulation - -For partial writes the non-written parts of registers are folded into -the full 64-bit value from what they're presently set to. That's wrong -to do though when the behavior is write-1-to-clear: Writes not -including to low 3 bits would unconditionally clear all ISR bits which -are presently set. Re-calculate the value to use. - -Fixes: be07023be115 ("x86/vhpet: add support for level triggered interrupts") -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -master commit: 41d358d2f9607ba37c216effa39b9f1bc58de69d -master date: 2024-08-29 10:02:20 +0200 ---- - xen/arch/x86/hvm/hpet.c | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c -index 87642575f9..f0e5f877f4 100644 ---- a/xen/arch/x86/hvm/hpet.c -+++ b/xen/arch/x86/hvm/hpet.c -@@ -404,7 +404,8 @@ static int cf_check hpet_write( - break; - - case HPET_STATUS: -- /* write 1 to clear. */ -+ /* Write 1 to clear. Therefore don't use new_val directly here. 
*/ -+ new_val = val << ((addr & 7) * 8); - while ( new_val ) - { - bool active; --- -2.47.0 - diff --git a/0026-x86emul-further-correct-64-bit-mode-zero-count-repea.patch b/0026-x86emul-further-correct-64-bit-mode-zero-count-repea.patch new file mode 100644 index 0000000..fd1f13f --- /dev/null +++ b/0026-x86emul-further-correct-64-bit-mode-zero-count-repea.patch @@ -0,0 +1,108 @@ +From b7b0ce5e11203bb81f99b1f05956cd766f55e802 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:21:00 +0100 +Subject: [PATCH 26/53] x86emul: further correct 64-bit mode zero count + repeated string insn handling +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +In an entirely different context I came across Linux commit 428e3d08574b +("KVM: x86: Fix zero iterations REP-string"), which points out that +we're still doing things wrong: For one, there's no zero-extension at +all on AMD. And then while RCX is zero-extended from 32 bits uniformly +for all string instructions on newer hardware, RSI/RDI are only for MOVS +and STOS on the systems I have access to. (On an old family 0xf system +I've further found that for REP LODS even RCX is not zero-extended.) + +While touching the lines anyway, replace two casts in get_rep_prefix(). + +Fixes: 79e996a89f69 ("x86emul: correct 64-bit mode repeated string insn handling with zero count") +Signed-off-by: Jan Beulich +Acked-by: Roger Pau Monné +master commit: 5310a042c4e3135c471446c8253ad13250539957 +master date: 2025-01-27 15:23:19 +0100 +--- + xen/arch/x86/x86_emulate/x86_emulate.c | 20 ++++++++++---------- + 1 file changed, 10 insertions(+), 10 deletions(-) + +diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c +index 31475208d1..3d837f7e9e 100644 +--- a/xen/arch/x86/x86_emulate/x86_emulate.c ++++ b/xen/arch/x86/x86_emulate/x86_emulate.c +@@ -513,7 +513,7 @@ static inline void put_loop_count( + regs->r(cx) = ad_bytes == 4 ? 
(uint32_t)count : count; + } + +-#define get_rep_prefix(using_si, using_di) ({ \ ++#define get_rep_prefix(extend_si, extend_di) ({ \ + unsigned long max_reps = 1; \ + if ( rep_prefix() ) \ + max_reps = get_loop_count(&_regs, ad_bytes); \ +@@ -521,14 +521,14 @@ static inline void put_loop_count( + { \ + /* \ + * Skip the instruction if no repetitions are required, but \ +- * zero extend involved registers first when using 32-bit \ ++ * zero extend relevant registers first when using 32-bit \ + * addressing in 64-bit mode. \ + */ \ +- if ( mode_64bit() && ad_bytes == 4 ) \ ++ if ( !amd_like(ctxt) && mode_64bit() && ad_bytes == 4 ) \ + { \ + _regs.r(cx) = 0; \ +- if ( using_si ) _regs.r(si) = (uint32_t)_regs.r(si); \ +- if ( using_di ) _regs.r(di) = (uint32_t)_regs.r(di); \ ++ if ( extend_si ) _regs.r(si) = _regs.esi; \ ++ if ( extend_di ) _regs.r(di) = _regs.edi; \ + } \ + goto complete_insn; \ + } \ +@@ -1815,7 +1815,7 @@ x86_emulate( + dst.bytes = !(b & 1) ? 1 : (op_bytes == 8) ? 4 : op_bytes; + if ( (rc = ioport_access_check(port, dst.bytes, ctxt, ops)) != 0 ) + goto done; +- nr_reps = get_rep_prefix(false, true); ++ nr_reps = get_rep_prefix(false, false /* don't extend RSI/RDI */); + dst.mem.off = truncate_ea_and_reps(_regs.r(di), nr_reps, dst.bytes); + dst.mem.seg = x86_seg_es; + /* Try the presumably most efficient approach first. */ +@@ -1857,7 +1857,7 @@ x86_emulate( + dst.bytes = !(b & 1) ? 1 : (op_bytes == 8) ? 4 : op_bytes; + if ( (rc = ioport_access_check(port, dst.bytes, ctxt, ops)) != 0 ) + goto done; +- nr_reps = get_rep_prefix(true, false); ++ nr_reps = get_rep_prefix(false, false /* don't extend RSI/RDI */); + ea.mem.off = truncate_ea_and_reps(_regs.r(si), nr_reps, dst.bytes); + /* Try the presumably most efficient approach first. */ + if ( !ops->rep_outs ) +@@ -2194,7 +2194,7 @@ x86_emulate( + case 0xa6 ... 
0xa7: /* cmps */ { + unsigned long next_eip = _regs.r(ip); + +- get_rep_prefix(true, true); ++ get_rep_prefix(false, false /* don't extend RSI/RDI */); + src.bytes = dst.bytes = (d & ByteOp) ? 1 : op_bytes; + if ( (rc = read_ulong(ea.mem.seg, truncate_ea(_regs.r(si)), + &dst.val, dst.bytes, ctxt, ops)) || +@@ -2236,7 +2236,7 @@ x86_emulate( + } + + case 0xac ... 0xad: /* lods */ +- get_rep_prefix(true, false); ++ get_rep_prefix(false, false /* don't extend RSI/RDI */); + if ( (rc = read_ulong(ea.mem.seg, truncate_ea(_regs.r(si)), + &dst.val, dst.bytes, ctxt, ops)) != 0 ) + goto done; +@@ -2247,7 +2247,7 @@ x86_emulate( + case 0xae ... 0xaf: /* scas */ { + unsigned long next_eip = _regs.r(ip); + +- get_rep_prefix(false, true); ++ get_rep_prefix(false, false /* don't extend RSI/RDI */); + if ( (rc = read_ulong(x86_seg_es, truncate_ea(_regs.r(di)), + &dst.val, src.bytes, ctxt, ops)) != 0 ) + goto done; +-- +2.48.1 + diff --git a/0027-Arm64-adjust-__irq_to_desc-to-fix-build-with-gcc14.patch b/0027-Arm64-adjust-__irq_to_desc-to-fix-build-with-gcc14.patch deleted file mode 100644 index 6b60d46..0000000 --- a/0027-Arm64-adjust-__irq_to_desc-to-fix-build-with-gcc14.patch +++ /dev/null @@ -1,61 +0,0 @@ -From ee826bc490d6036ed9b637ada014a2d59d151f79 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:40:34 +0200 -Subject: [PATCH 27/83] Arm64: adjust __irq_to_desc() to fix build with gcc14 -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -With the original code I observe - -In function ‘__irq_to_desc’, - inlined from ‘route_irq_to_guest’ at arch/arm/irq.c:465:12: -arch/arm/irq.c:54:16: error: array subscript -2 is below array bounds of ‘irq_desc_t[32]’ {aka ‘struct irq_desc[32]’} [-Werror=array-bounds=] - 54 | return &this_cpu(local_irq_desc)[irq]; - | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -which looks pretty bogus: How in the world does the compiler arrive at --2 when compiling route_irq_to_guest()? 
Yet independent of that the -function's parameter wants to be of unsigned type anyway, as shown by -a vast majority of callers (others use plain int when they really mean -non-negative quantities). With that adjustment the code compiles fine -again. - -Signed-off-by: Jan Beulich -Acked-by: Michal Orzel -master commit: 99f942f3d410059dc223ee0a908827e928ef3592 -master date: 2024-08-29 10:03:53 +0200 ---- - xen/arch/arm/include/asm/irq.h | 2 +- - xen/arch/arm/irq.c | 2 +- - 2 files changed, 2 insertions(+), 2 deletions(-) - -diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h -index ec437add09..88e060bf29 100644 ---- a/xen/arch/arm/include/asm/irq.h -+++ b/xen/arch/arm/include/asm/irq.h -@@ -56,7 +56,7 @@ extern const unsigned int nr_irqs; - struct irq_desc; - struct irqaction; - --struct irq_desc *__irq_to_desc(int irq); -+struct irq_desc *__irq_to_desc(unsigned int irq); - - #define irq_to_desc(irq) __irq_to_desc(irq) - -diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c -index 6b89f64fd1..b9757d7ad3 100644 ---- a/xen/arch/arm/irq.c -+++ b/xen/arch/arm/irq.c -@@ -48,7 +48,7 @@ void irq_end_none(struct irq_desc *irq) - static irq_desc_t irq_desc[NR_IRQS]; - static DEFINE_PER_CPU(irq_desc_t[NR_LOCAL_IRQS], local_irq_desc); - --struct irq_desc *__irq_to_desc(int irq) -+struct irq_desc *__irq_to_desc(unsigned int irq) - { - if ( irq < NR_LOCAL_IRQS ) - return &this_cpu(local_irq_desc)[irq]; --- -2.47.0 - diff --git a/0027-x86-PV-further-harden-guest-memory-accesses-against-.patch b/0027-x86-PV-further-harden-guest-memory-accesses-against-.patch new file mode 100644 index 0000000..1430113 --- /dev/null +++ b/0027-x86-PV-further-harden-guest-memory-accesses-against-.patch @@ -0,0 +1,88 @@ +From 125270d49dcc42c1249e0fb99abb8d33c9a8b439 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:21:30 +0100 +Subject: [PATCH 27/53] x86/PV: further harden guest memory accesses against + speculative abuse +MIME-Version: 1.0 +Content-Type: 
text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The original implementation has two issues: For one it doesn't preserve +non-canonical-ness of inputs in the range 0x8000000000000000 through +0x80007fffffffffff. Bogus guest pointers in that range would not cause a +(#GP) fault upon access, when they should. + +And then there is an AMD-specific aspect, where only the low 48 bits of +an address are used for speculative execution; the architecturally +mandated #GP for non-canonical addresses would be raised at a later +execution stage. Therefore to prevent Xen controlled data to make it +into any of the caches in a guest controllable manner, we need to +additionally ensure that for non-canonical inputs bit 47 would be clear. + +See the code comment for how addressing both is being achieved. + +Fixes: 4dc181599142 ("x86/PV: harden guest memory accesses against speculative abuse") +Signed-off-by: Jan Beulich +Reviewed-by: Roger Pau Monné +master commit: 8306d773b03acec6062c0547ac05e3dd4a6960f6 +master date: 2025-01-27 15:23:59 +0100 +--- + xen/arch/x86/include/asm/asm-defns.h | 33 +++++++++++++++++++++++----- + 1 file changed, 27 insertions(+), 6 deletions(-) + +diff --git a/xen/arch/x86/include/asm/asm-defns.h b/xen/arch/x86/include/asm/asm-defns.h +index d55dd3bbc3..32d6b44910 100644 +--- a/xen/arch/x86/include/asm/asm-defns.h ++++ b/xen/arch/x86/include/asm/asm-defns.h +@@ -1,3 +1,5 @@ ++#include ++ + #ifndef HAVE_AS_CLAC_STAC + .macro clac + .byte 0x0f, 0x01, 0xca +@@ -65,17 +67,36 @@ + .macro guest_access_mask_ptr ptr:req, scratch1:req, scratch2:req + #if defined(CONFIG_SPECULATIVE_HARDEN_GUEST_ACCESS) + /* +- * Here we want +- * +- * ptr &= ~0ull >> (ptr < HYPERVISOR_VIRT_END); +- * ++ * Here we want to adjust \ptr such that ++ * - if it's within Xen range, it becomes non-canonical, ++ * - otherwise if it's (non-)canonical on input, it retains that property, ++ * - if the result is non-canonical, bit 47 is clear (to avoid ++ * potentially populating 
the cache with Xen data on AMD-like hardware), + * but guaranteed without any conditional branches (hence in assembly). ++ * ++ * To achieve this we determine which bit to forcibly clear: Either bit 47 ++ * (in case the address is below HYPERVISOR_VIRT_END) or bit 63. Further ++ * we determine whether for forcably set bit 63: In case we first cleared ++ * it, we'll merely restore the original address. In case we ended up ++ * clearing bit 47 (i.e. the address was either non-canonical or within Xen ++ * range), setting the bit will yield a guaranteed non-canonical address. ++ * If we didn't clear a bit, we also won't set one: The address was in the ++ * low half of address space in that case with bit 47 already clear. The ++ * address can thus be left unchanged, whether canonical or not. + */ + mov $(HYPERVISOR_VIRT_END - 1), \scratch1 +- mov $~0, \scratch2 ++ mov $(VADDR_BITS - 1), \scratch2 + cmp \ptr, \scratch1 ++ /* ++ * Not needed: The value we have in \scratch1 will be truncated to 6 bits, ++ * thus yielding the value we need. 
++ mov $63, \scratch1 ++ */ ++ cmovnb \scratch2, \scratch1 ++ xor \scratch2, \scratch2 ++ btr \scratch1, \ptr + rcr $1, \scratch2 +- and \scratch2, \ptr ++ or \scratch2, \ptr + #elif defined(CONFIG_DEBUG) && defined(CONFIG_PV) + xor $~\@, \scratch1 + xor $~\@, \scratch2 +-- +2.48.1 + diff --git a/0028-libxl-Fix-nul-termination-of-the-return-value-of-lib.patch b/0028-libxl-Fix-nul-termination-of-the-return-value-of-lib.patch deleted file mode 100644 index d17d394..0000000 --- a/0028-libxl-Fix-nul-termination-of-the-return-value-of-lib.patch +++ /dev/null @@ -1,100 +0,0 @@ -From c18635fd69fc2da238f00a26ab707f1b2a50bf64 Mon Sep 17 00:00:00 2001 -From: Javi Merino -Date: Tue, 24 Sep 2024 14:41:06 +0200 -Subject: [PATCH 28/83] libxl: Fix nul-termination of the return value of - libxl_xen_console_read_line() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)" -call in main_dmesg(). ASAN reports a heap buffer overflow: an -off-by-one access to cr->buffer. - -The readconsole sysctl copies up to count characters into the buffer, -but it does not add a null character at the end. Despite the -documentation of libxl_xen_console_read_line(), line_r is not -nul-terminated if 16384 characters were copied to the buffer. - -Fix this by asking xc_readconsolering() to fill the buffer up to size -- 1. As the number of characters in the buffer is only needed in -libxl_xen_console_read_line(), make it a local variable there instead -of part of the libxl__xen_console_reader struct. 
- -Fixes: 4024bae739cc ("xl: Add subcommand 'xl dmesg'") -Reported-by: Edwin Török -Signed-off-by: Javi Merino -Reviewed-by: Anthony PERARD -master commit: bb03169bcb6ecccf372de1f6b9285cd519a26bb8 -master date: 2024-09-03 10:53:44 +0100 ---- - tools/libs/light/libxl_console.c | 19 +++++++++++++++---- - tools/libs/light/libxl_internal.h | 1 - - 2 files changed, 15 insertions(+), 5 deletions(-) - -diff --git a/tools/libs/light/libxl_console.c b/tools/libs/light/libxl_console.c -index a563c9d3c7..9f736b8913 100644 ---- a/tools/libs/light/libxl_console.c -+++ b/tools/libs/light/libxl_console.c -@@ -774,12 +774,17 @@ libxl_xen_console_reader * - { - GC_INIT(ctx); - libxl_xen_console_reader *cr; -- unsigned int size = 16384; -+ /* -+ * We want xen to fill the buffer in as few hypercalls as -+ * possible, but xen will not nul-terminate it. The default size -+ * of Xen's console buffer is 16384. Leave one byte at the end -+ * for the null character. -+ */ -+ unsigned int size = 16384 + 1; - - cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader)); - cr->buffer = libxl__zalloc(NOGC, size); - cr->size = size; -- cr->count = size; - cr->clear = clear; - cr->incremental = 1; - -@@ -800,10 +805,16 @@ int libxl_xen_console_read_line(libxl_ctx *ctx, - char **line_r) - { - int ret; -+ /* -+ * Number of chars to copy into the buffer. xc_readconsolering() -+ * does not add a null character at the end, so leave a space for -+ * us to add it. 
-+ */ -+ unsigned int nr_chars = cr->size - 1; - GC_INIT(ctx); - - memset(cr->buffer, 0, cr->size); -- ret = xc_readconsolering(ctx->xch, cr->buffer, &cr->count, -+ ret = xc_readconsolering(ctx->xch, cr->buffer, &nr_chars, - cr->clear, cr->incremental, &cr->index); - if (ret < 0) { - LOGE(ERROR, "reading console ring buffer"); -@@ -811,7 +822,7 @@ int libxl_xen_console_read_line(libxl_ctx *ctx, - return ERROR_FAIL; - } - if (!ret) { -- if (cr->count) { -+ if (nr_chars) { - *line_r = cr->buffer; - ret = 1; - } else { -diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h -index 3b58bb2d7f..96d14f5746 100644 ---- a/tools/libs/light/libxl_internal.h -+++ b/tools/libs/light/libxl_internal.h -@@ -2077,7 +2077,6 @@ _hidden char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid); - struct libxl__xen_console_reader { - char *buffer; - unsigned int size; -- unsigned int count; - unsigned int clear; - unsigned int incremental; - unsigned int index; --- -2.47.0 - diff --git a/0028-x86-intel-Fix-PERF_GLOBAL-fixup-when-virtualised.patch b/0028-x86-intel-Fix-PERF_GLOBAL-fixup-when-virtualised.patch new file mode 100644 index 0000000..05c4b46 --- /dev/null +++ b/0028-x86-intel-Fix-PERF_GLOBAL-fixup-when-virtualised.patch @@ -0,0 +1,117 @@ +From c4d4fc58b6f7f10655958e2fb07c702f4fdb0b0b Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Mon, 17 Feb 2025 13:21:50 +0100 +Subject: [PATCH 28/53] x86/intel: Fix PERF_GLOBAL fixup when virtualised +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Logic using performance counters needs to look at +MSR_MISC_ENABLE.PERF_AVAILABLE before touching any other resources. + +When virtualised under ESX, Xen dies with a #GP fault trying to read +MSR_CORE_PERF_GLOBAL_CTRL. + +Factor this logic out into a separate function (it's already too squashed to +the RHS), and insert a check of MSR_MISC_ENABLE.PERF_AVAILABLE. 
+ +This also avoids setting X86_FEATURE_ARCH_PERFMON if MSR_MISC_ENABLE says that +PERF is unavailable, although oprofile (the only consumer of this flag) +cross-checks too. + +Fixes: 6bdb965178bb ("x86/intel: ensure Global Performance Counter Control is setup correctly") +Reported-by: Jonathan Katz +Link: https://xcp-ng.org/forum/topic/10286/nesting-xcp-ng-on-esx-8 +Signed-off-by: Andrew Cooper +Reviewed-by: Roger Pau Monné +Tested-by: Jonathan Katz +master commit: dd05d265b8abda4cc7206b29cd71b77fb46658bf +master date: 2025-01-28 11:19:45 +0000 +--- + xen/arch/x86/cpu/intel.c | 64 +++++++++++++++++++++++----------------- + 1 file changed, 37 insertions(+), 27 deletions(-) + +diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c +index af56e57bd8..bb9c6220de 100644 +--- a/xen/arch/x86/cpu/intel.c ++++ b/xen/arch/x86/cpu/intel.c +@@ -535,39 +535,49 @@ static void intel_log_freq(const struct cpuinfo_x86 *c) + printk("%u MHz\n", (factor * max_ratio + 50) / 100); + } + ++static void init_intel_perf(struct cpuinfo_x86 *c) ++{ ++ uint64_t val; ++ unsigned int eax, ver, nr_cnt; ++ ++ if ( c->cpuid_level <= 9 || ++ ({ rdmsrl(MSR_IA32_MISC_ENABLE, val); ++ !(val & MSR_IA32_MISC_ENABLE_PERF_AVAIL); }) ) ++ return; ++ ++ eax = cpuid_eax(10); ++ ver = eax & 0xff; ++ nr_cnt = (eax >> 8) & 0xff; ++ ++ if ( ver && nr_cnt > 1 && nr_cnt <= 32 ) ++ { ++ unsigned int cnt_mask = (1UL << nr_cnt) - 1; ++ ++ /* ++ * On (some?) Sapphire/Emerald Rapids platforms each package-BSP ++ * starts with all the enable bits for the general-purpose PMCs ++ * cleared. Adjust so counters can be enabled from EVNTSEL. 
++ */ ++ rdmsrl(MSR_CORE_PERF_GLOBAL_CTRL, val); ++ ++ if ( (val & cnt_mask) != cnt_mask ) ++ { ++ printk("FIRMWARE BUG: CPU%u invalid PERF_GLOBAL_CTRL: %#"PRIx64" adjusting to %#"PRIx64"\n", ++ smp_processor_id(), val, val | cnt_mask); ++ wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, val | cnt_mask); ++ } ++ ++ __set_bit(X86_FEATURE_ARCH_PERFMON, c->x86_capability); ++ } ++} ++ + static void cf_check init_intel(struct cpuinfo_x86 *c) + { + /* Detect the extended topology information if available */ + detect_extended_topology(c); + + init_intel_cacheinfo(c); +- if (c->cpuid_level > 9) { +- unsigned eax = cpuid_eax(10); +- unsigned int cnt = (eax >> 8) & 0xff; +- +- /* Check for version and the number of counters */ +- if ((eax & 0xff) && (cnt > 1) && (cnt <= 32)) { +- uint64_t global_ctrl; +- unsigned int cnt_mask = (1UL << cnt) - 1; +- +- /* +- * On (some?) Sapphire/Emerald Rapids platforms each +- * package-BSP starts with all the enable bits for the +- * general-purpose PMCs cleared. Adjust so counters +- * can be enabled from EVNTSEL. 
+- */ +- rdmsrl(MSR_CORE_PERF_GLOBAL_CTRL, global_ctrl); +- if ((global_ctrl & cnt_mask) != cnt_mask) { +- printk("CPU%u: invalid PERF_GLOBAL_CTRL: %#" +- PRIx64 " adjusting to %#" PRIx64 "\n", +- smp_processor_id(), global_ctrl, +- global_ctrl | cnt_mask); +- wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, +- global_ctrl | cnt_mask); +- } +- __set_bit(X86_FEATURE_ARCH_PERFMON, c->x86_capability); +- } +- } ++ init_intel_perf(c); + + if ( !cpu_has(c, X86_FEATURE_XTOPOLOGY) ) + { +-- +2.48.1 + diff --git a/0029-SUPPORT.md-split-XSM-from-Flask.patch b/0029-SUPPORT.md-split-XSM-from-Flask.patch deleted file mode 100644 index d5ce891..0000000 --- a/0029-SUPPORT.md-split-XSM-from-Flask.patch +++ /dev/null @@ -1,66 +0,0 @@ -From 3ceb79ceabab58305a0f35aed0117537f7a6b922 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:41:51 +0200 -Subject: [PATCH 29/83] SUPPORT.md: split XSM from Flask -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -XSM is a generic framework, which in particular is also used by SILO. -With this it can't really be experimental: Arm mandates SILO for having -a security supported configuration. - -Signed-off-by: Jan Beulich -Reviewed-by: Roger Pau Monné -Reviewed-by: Daniel P. Smith -master commit: d7c18b8720824d7efc39ffa7296751e1812865a9 -master date: 2024-09-04 16:05:03 +0200 ---- - SUPPORT.md | 19 +++++++++++++++++-- - 1 file changed, 17 insertions(+), 2 deletions(-) - -diff --git a/SUPPORT.md b/SUPPORT.md -index 1d8b38cbd0..ba6052477b 100644 ---- a/SUPPORT.md -+++ b/SUPPORT.md -@@ -768,13 +768,21 @@ Compile time disabled for ARM by default. - - Status, x86: Supported, not security supported - --### XSM & FLASK -+### XSM (Xen Security Module) Framework -+ -+XSM is a security policy framework. The dummy implementation is covered by this -+statement, and implements a policy whereby dom0 is all powerful. See below for -+alternative modules (FLASK, SILO). 
-+ -+ Status: Supported -+ -+### FLASK XSM Module - - Status: Experimental - - Compile time disabled by default. - --Also note that using XSM -+Also note that using FLASK - to delegate various domain control hypercalls - to particular other domains, rather than only permitting use by dom0, - is also specifically excluded from security support for many hypercalls. -@@ -787,6 +795,13 @@ Please see XSA-77 for more details. - The default policy includes FLASK labels and roles for a "typical" Xen-based system - with dom0, driver domains, stub domains, domUs, and so on. - -+### SILO XSM Module -+ -+SILO extends the dummy policy by enforcing that DomU-s can only communicate -+with Dom0, yet not with each other. -+ -+ Status: Supported -+ - ## Virtual Hardware, Hypervisor - - ### x86/Nested PV --- -2.47.0 - diff --git a/0029-radix-tree-purge-node-allocation-override-hooks.patch b/0029-radix-tree-purge-node-allocation-override-hooks.patch new file mode 100644 index 0000000..ad0ea7c --- /dev/null +++ b/0029-radix-tree-purge-node-allocation-override-hooks.patch @@ -0,0 +1,125 @@ +From ebc8b4a4508a0b84c3f3b31324f688dabb210093 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:22:24 +0100 +Subject: [PATCH 29/53] radix-tree: purge node allocation override hooks + +These were needed by TMEM only, which is long gone. The Linux original +doesn't have such either. This effectively reverts one of the "Other +changes" from 8dc6738dbb3c ("Update radix-tree.[ch] from upstream Linux +to gain RCU awareness"). + +Positive side effect: Two cf_check go away. + +While there also convert xmalloc()+memset() to xzalloc(). 
+ +Requested-by: Andrew Cooper +Signed-off-by: Jan Beulich +Reviewed-by: Andrew Cooper +master commit: 1275093a96fed45057db241b3aa6e191d9dcf596 +master date: 2025-02-07 09:59:11 +0100 +--- + xen/common/radix-tree.c | 37 ++++++------------------------------ + xen/include/xen/radix-tree.h | 10 ---------- + 2 files changed, 6 insertions(+), 41 deletions(-) + +diff --git a/xen/common/radix-tree.c b/xen/common/radix-tree.c +index adc3034222..994a5a3b3d 100644 +--- a/xen/common/radix-tree.c ++++ b/xen/common/radix-tree.c +@@ -52,12 +52,6 @@ struct rcu_node { + struct rcu_head rcu_head; + }; + +-static struct radix_tree_node *cf_check rcu_node_alloc(void *arg) +-{ +- struct rcu_node *rcu_node = xmalloc(struct rcu_node); +- return rcu_node ? &rcu_node->node : NULL; +-} +- + static void cf_check _rcu_node_free(struct rcu_head *head) + { + struct rcu_node *rcu_node = +@@ -65,26 +59,20 @@ static void cf_check _rcu_node_free(struct rcu_head *head) + xfree(rcu_node); + } + +-static void cf_check rcu_node_free(struct radix_tree_node *node, void *arg) +-{ +- struct rcu_node *rcu_node = container_of(node, struct rcu_node, node); +- call_rcu(&rcu_node->rcu_head, _rcu_node_free); +-} +- + static struct radix_tree_node *radix_tree_node_alloc( + struct radix_tree_root *root) + { +- struct radix_tree_node *ret; +- ret = root->node_alloc(root->node_alloc_free_arg); +- if (ret) +- memset(ret, 0, sizeof(*ret)); +- return ret; ++ struct rcu_node *rcu_node = xzalloc(struct rcu_node); ++ ++ return rcu_node ? 
&rcu_node->node : NULL; + } + + static void radix_tree_node_free( + struct radix_tree_root *root, struct radix_tree_node *node) + { +- root->node_free(node, root->node_alloc_free_arg); ++ struct rcu_node *rcu_node = container_of(node, struct rcu_node, node); ++ ++ call_rcu(&rcu_node->rcu_head, _rcu_node_free); + } + + /* +@@ -717,19 +705,6 @@ void radix_tree_destroy( + void radix_tree_init(struct radix_tree_root *root) + { + memset(root, 0, sizeof(*root)); +- root->node_alloc = rcu_node_alloc; +- root->node_free = rcu_node_free; +-} +- +-void radix_tree_set_alloc_callbacks( +- struct radix_tree_root *root, +- radix_tree_alloc_fn_t *node_alloc, +- radix_tree_free_fn_t *node_free, +- void *node_alloc_free_arg) +-{ +- root->node_alloc = node_alloc; +- root->node_free = node_free; +- root->node_alloc_free_arg = node_alloc_free_arg; + } + + static __init unsigned long __maxindex(unsigned int height) +diff --git a/xen/include/xen/radix-tree.h b/xen/include/xen/radix-tree.h +index 58c40312e6..9d5ffae3eb 100644 +--- a/xen/include/xen/radix-tree.h ++++ b/xen/include/xen/radix-tree.h +@@ -66,11 +66,6 @@ typedef void radix_tree_free_fn_t(struct radix_tree_node *, void *); + struct radix_tree_root { + unsigned int height; + struct radix_tree_node __rcu *rnode; +- +- /* Allow to specify custom node alloc/dealloc routines. 
*/ +- radix_tree_alloc_fn_t *node_alloc; +- radix_tree_free_fn_t *node_free; +- void *node_alloc_free_arg; + }; + + /* +@@ -78,11 +73,6 @@ struct radix_tree_root { + */ + + void radix_tree_init(struct radix_tree_root *root); +-void radix_tree_set_alloc_callbacks( +- struct radix_tree_root *root, +- radix_tree_alloc_fn_t *node_alloc, +- radix_tree_free_fn_t *node_free, +- void *node_alloc_free_arg); + + void radix_tree_destroy( + struct radix_tree_root *root, +-- +2.48.1 + diff --git a/0030-radix-tree-introduce-RADIX_TREE-_INIT.patch b/0030-radix-tree-introduce-RADIX_TREE-_INIT.patch new file mode 100644 index 0000000..d2e479a --- /dev/null +++ b/0030-radix-tree-introduce-RADIX_TREE-_INIT.patch @@ -0,0 +1,99 @@ +From 46e3e9d7398303a12714cacf1e56c6da505cffec Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Mon, 17 Feb 2025 13:23:00 +0100 +Subject: [PATCH 30/53] radix-tree: introduce RADIX_TREE{,_INIT}() + +... now that static initialization is possible. Use RADIX_TREE() for +pci_segments and ivrs_maps. + +This then fixes an ordering issue on x86: With the call to +radix_tree_init(), acpi_mmcfg_init()'s invocation of pci_segments_init() +will zap the possible earlier introduction of segment 0 by +amd_iommu_detect_one_acpi()'s call to pci_ro_device(), and thus the +write-protection of the PCI devices representing AMD IOMMUs. 
+ +Fixes: 3950f2485bbc ("x86/x2APIC: defer probe until after IOMMU ACPI table parsing") +Requested-by: Andrew Cooper +Signed-off-by: Jan Beulich +Reviewed-by: Andrew Cooper +master commit: 26fe09e34566d701ecaea76b4563bb9934e85861 +master date: 2025-02-07 10:00:04 +0100 +--- + xen/common/radix-tree.c | 2 +- + xen/drivers/passthrough/amd/iommu_init.c | 3 +-- + xen/drivers/passthrough/pci.c | 3 +-- + xen/include/xen/radix-tree.h | 3 +++ + 4 files changed, 6 insertions(+), 5 deletions(-) + +diff --git a/xen/common/radix-tree.c b/xen/common/radix-tree.c +index 994a5a3b3d..ac4bfbd3f4 100644 +--- a/xen/common/radix-tree.c ++++ b/xen/common/radix-tree.c +@@ -704,7 +704,7 @@ void radix_tree_destroy( + + void radix_tree_init(struct radix_tree_root *root) + { +- memset(root, 0, sizeof(*root)); ++ *root = (struct radix_tree_root)RADIX_TREE_INIT(); + } + + static __init unsigned long __maxindex(unsigned int height) +diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c +index 6c0dc2d5cb..b546c28667 100644 +--- a/xen/drivers/passthrough/amd/iommu_init.c ++++ b/xen/drivers/passthrough/amd/iommu_init.c +@@ -31,7 +31,7 @@ static struct tasklet amd_iommu_irq_tasklet; + unsigned int __read_mostly amd_iommu_acpi_info; + unsigned int __read_mostly ivrs_bdf_entries; + u8 __read_mostly ivhd_type; +-static struct radix_tree_root ivrs_maps; ++static RADIX_TREE(ivrs_maps); + LIST_HEAD_READ_MOSTLY(amd_iommu_head); + bool iommuv2_enabled; + +@@ -1410,7 +1410,6 @@ int __init amd_iommu_prepare(bool xt) + goto error_out; + ivrs_bdf_entries = rc; + +- radix_tree_init(&ivrs_maps); + for_each_amd_iommu ( iommu ) + { + rc = amd_iommu_prepare_one(iommu); +diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c +index ae620b3007..c7ee835b8a 100644 +--- a/xen/drivers/passthrough/pci.c ++++ b/xen/drivers/passthrough/pci.c +@@ -68,7 +68,7 @@ bool pcidevs_locked(void) + return rspin_is_locked(&_pcidevs_lock); + } + +-static struct 
radix_tree_root pci_segments; ++static RADIX_TREE(pci_segments); + + static inline struct pci_seg *get_pseg(u16 seg) + { +@@ -124,7 +124,6 @@ static int pci_segments_iterate( + + void __init pci_segments_init(void) + { +- radix_tree_init(&pci_segments); + if ( !alloc_pseg(0) ) + panic("Could not initialize PCI segment 0\n"); + } +diff --git a/xen/include/xen/radix-tree.h b/xen/include/xen/radix-tree.h +index 9d5ffae3eb..4077365972 100644 +--- a/xen/include/xen/radix-tree.h ++++ b/xen/include/xen/radix-tree.h +@@ -72,6 +72,9 @@ struct radix_tree_root { + *** radix-tree API starts here ** + */ + ++#define RADIX_TREE_INIT() {} ++#define RADIX_TREE(name) struct radix_tree_root name = RADIX_TREE_INIT() ++ + void radix_tree_init(struct radix_tree_root *root); + + void radix_tree_destroy( +-- +2.48.1 + diff --git a/0030-x86-fix-UP-build-with-gcc14.patch b/0030-x86-fix-UP-build-with-gcc14.patch deleted file mode 100644 index 06d5dcf..0000000 --- a/0030-x86-fix-UP-build-with-gcc14.patch +++ /dev/null @@ -1,63 +0,0 @@ -From d625c4e9fb46ef1b81a5b32d8fe1774c432cddd6 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:41:59 +0200 -Subject: [PATCH 30/83] x86: fix UP build with gcc14 -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -The complaint is: - -In file included from ././include/xen/config.h:17, - from : -arch/x86/smpboot.c: In function ‘link_thread_siblings.constprop’: -./include/asm-generic/percpu.h:16:51: error: array subscript [0, 0] is outside array bounds of ‘long unsigned int[1]’ [-Werror=array-bounds=] - 16 | (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset[cpu])) -./include/xen/compiler.h:140:29: note: in definition of macro ‘RELOC_HIDE’ - 140 | (typeof(ptr)) (__ptr + (off)); }) - | ^~~ -arch/x86/smpboot.c:238:27: note: in expansion of macro ‘per_cpu’ - 238 | cpumask_set_cpu(cpu2, per_cpu(cpu_sibling_mask, cpu1)); - | ^~~~~~~ -In file included from ./arch/x86/include/generated/asm/percpu.h:1, - from 
./include/xen/percpu.h:30, - from ./arch/x86/include/asm/cpuid.h:9, - from ./arch/x86/include/asm/cpufeature.h:11, - from ./arch/x86/include/asm/system.h:6, - from ./include/xen/list.h:11, - from ./include/xen/mm.h:68, - from arch/x86/smpboot.c:12: -./include/asm-generic/percpu.h:12:22: note: while referencing ‘__per_cpu_offset’ - 12 | extern unsigned long __per_cpu_offset[NR_CPUS]; - | ^~~~~~~~~~~~~~~~ - -Which I consider bogus in the first place ("array subscript [0, 0]" vs a -1-element array). Yet taking the experience from 99f942f3d410 ("Arm64: -adjust __irq_to_desc() to fix build with gcc14") I guessed that -switching function parameters to unsigned int (which they should have -been anyway) might help. And voilà ... - -Signed-off-by: Jan Beulich -Acked-by: Andrew Cooper -master commit: a2de7dc4d845738e734b10fce6550c89c6b1092c -master date: 2024-09-04 16:09:28 +0200 ---- - xen/arch/x86/smpboot.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c -index 8aa621533f..0a89f22a39 100644 ---- a/xen/arch/x86/smpboot.c -+++ b/xen/arch/x86/smpboot.c -@@ -226,7 +226,7 @@ static int booting_cpu; - /* CPUs for which sibling maps can be computed. 
*/ - static cpumask_t cpu_sibling_setup_map; - --static void link_thread_siblings(int cpu1, int cpu2) -+static void link_thread_siblings(unsigned int cpu1, unsigned int cpu2) - { - cpumask_set_cpu(cpu1, per_cpu(cpu_sibling_mask, cpu2)); - cpumask_set_cpu(cpu2, per_cpu(cpu_sibling_mask, cpu1)); --- -2.47.0 - diff --git a/0031-x86-shutdown-offline-APs-with-interrupts-disabled-on.patch b/0031-x86-shutdown-offline-APs-with-interrupts-disabled-on.patch new file mode 100644 index 0000000..e3161fe --- /dev/null +++ b/0031-x86-shutdown-offline-APs-with-interrupts-disabled-on.patch @@ -0,0 +1,142 @@ +From e801a86d5eb2ed62c30c35f1d39cdccdf8d860e8 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Mon, 17 Feb 2025 13:23:27 +0100 +Subject: [PATCH 31/53] x86/shutdown: offline APs with interrupts disabled on + all CPUs +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The current shutdown logic in smp_send_stop() will disable the APs while +having interrupts enabled on the BSP or possibly other APs. On AMD systems +this can lead to local APIC errors: + +APIC error on CPU0: 00(08), Receive accept error + +Such error message can be printed in a loop, thus blocking the system from +rebooting. I assume this loop is created by the error being triggered by +the console interrupt, which is further stirred by the ESR handler +printing to the console. + +Intel SDM states: + +"Receive Accept Error. + +Set when the local APIC detects that the message it received was not +accepted by any APIC on the APIC bus, including itself. Used only on P6 +family and Pentium processors." + +So the error shouldn't trigger on any Intel CPU supported by Xen. + +However AMD doesn't make such claims, and indeed the error is broadcast to +all local APICs when an interrupt targets a CPU that's already offline. 
+ +To prevent the error from stalling the shutdown process perform the +disabling of APs and the BSP local APIC with interrupts disabled on all +CPUs in the system, so that by the time interrupts are unmasked on the BSP +the local APIC is already disabled. This can still lead to a spurious: + +APIC error on CPU0: 00(00) + +As a result of an LVT Error getting injected while interrupts are masked on +the CPU, and the vector only handled after the local APIC is already +disabled. ESR reports 0 because as part of disable_local_APIC() the ESR +register is cleared. + +Note the NMI crash path doesn't have such issue, because disabling of APs +and the caller local APIC is already done in the same contiguous region +with interrupts disabled. There's a possible window on the NMI crash path +(nmi_shootdown_cpus()) where some APs might be disabled (and thus +interrupts targeting them raising "Receive accept error") before others APs +have interrupts disabled. However the shutdown NMI will be handled, +regardless of whether the AP is processing a local APIC error, and hence +such interrupts will not cause the shutdown process to get stuck. + +Remove the call to fixup_irqs() in smp_send_stop(): it doesn't achieve the +intended goal of moving all interrupts to the BSP anyway. The logic in +fixup_irqs() will move interrupts whose affinity doesn't overlap with the +passed mask, but the movement of interrupts is done to any CPU set in +cpu_online_map. As in the shutdown path fixup_irqs() is called before APs +are cleared from cpu_online_map this leads to interrupts being shuffled +around, but not assigned to the BSP exclusively. + +The Fixes tag is more of a guess than a certainty; it's possible the +previous sleep window in fixup_irqs() allowed any in-flight interrupt to be +delivered before APs went offline. However fixup_irqs() was still +incorrectly used, as it didn't (and still doesn't) move all interrupts to +target the provided cpu mask. 
+ +Fixes: e2bb28d62158 ('x86/irq: forward pending interrupts to new destination in fixup_irqs()') +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: 1191ce954f64244a3c5f553116184928bcc677e8 +master date: 2025-02-12 15:56:07 +0100 +--- + xen/arch/x86/smp.c | 27 ++++++++++++++++++++------- + 1 file changed, 20 insertions(+), 7 deletions(-) + +diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c +index 04c6a05723..bd92496d3d 100644 +--- a/xen/arch/x86/smp.c ++++ b/xen/arch/x86/smp.c +@@ -343,6 +343,11 @@ void __stop_this_cpu(void) + + static void cf_check stop_this_cpu(void *dummy) + { ++ const bool *stop_aps = dummy; ++ ++ while ( !*stop_aps ) ++ cpu_relax(); ++ + __stop_this_cpu(); + for ( ; ; ) + halt(); +@@ -355,16 +360,25 @@ static void cf_check stop_this_cpu(void *dummy) + void smp_send_stop(void) + { + unsigned int cpu = smp_processor_id(); ++ bool stop_aps = false; ++ ++ /* ++ * Perform AP offlining and disabling of interrupt controllers with all ++ * CPUs on the system having interrupts disabled to prevent interrupt ++ * delivery errors. On AMD systems "Receive accept error" will be ++ * broadcast to local APICs if interrupts target CPUs that are offline. ++ */ ++ if ( num_online_cpus() > 1 ) ++ smp_call_function(stop_this_cpu, &stop_aps, 0); ++ ++ local_irq_disable(); + + if ( num_online_cpus() > 1 ) + { + int timeout = 10; + +- local_irq_disable(); +- fixup_irqs(cpumask_of(cpu), 0); +- local_irq_enable(); +- +- smp_call_function(stop_this_cpu, NULL, 0); ++ /* Signal APs to stop. */ ++ stop_aps = true; + + /* Wait 10ms for all other CPUs to go offline. 
*/ + while ( (num_online_cpus() > 1) && (timeout-- > 0) ) +@@ -373,13 +387,12 @@ void smp_send_stop(void) + + if ( cpu_online(cpu) ) + { +- local_irq_disable(); + disable_IO_APIC(); + hpet_disable(); + __stop_this_cpu(); + x2apic_enabled = (current_local_apic_mode() == APIC_MODE_X2APIC); +- local_irq_enable(); + } ++ local_irq_enable(); + } + + void smp_send_nmi_allbutself(void) +-- +2.48.1 + diff --git a/0031-x86emul-test-fix-build-with-gas-2.43.patch b/0031-x86emul-test-fix-build-with-gas-2.43.patch deleted file mode 100644 index 8953a09..0000000 --- a/0031-x86emul-test-fix-build-with-gas-2.43.patch +++ /dev/null @@ -1,86 +0,0 @@ -From 78d412f8bc3d78458cd868ba375ad30175194d91 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:42:39 +0200 -Subject: [PATCH 31/83] x86emul/test: fix build with gas 2.43 - -Drop explicit {evex} pseudo-prefixes. New gas (validly) complains when -they're used on things other than instructions. Our use was potentially -ahead of macro invocations - see simd.h's "override" macro. 
- -Signed-off-by: Jan Beulich -Acked-by: Andrew Cooper -master commit: 3c09288298af881ea1bb568740deb2d2a06bcd41 -master date: 2024-09-06 08:41:18 +0200 ---- - tools/tests/x86_emulator/simd.c | 14 +++++++------- - 1 file changed, 7 insertions(+), 7 deletions(-) - -diff --git a/tools/tests/x86_emulator/simd.c b/tools/tests/x86_emulator/simd.c -index 263cea662d..d68a7364c2 100644 ---- a/tools/tests/x86_emulator/simd.c -+++ b/tools/tests/x86_emulator/simd.c -@@ -333,7 +333,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { - # if FLOAT_SIZE == 4 - # define broadcast(x) ({ \ - vec_t t_; \ -- asm ( "%{evex%} vbroadcastss %1, %0" \ -+ asm ( "vbroadcastss %1, %0" \ - : "=v" (t_) : "m" (*(float[1]){ x }) ); \ - t_; \ - }) -@@ -401,14 +401,14 @@ static inline vec_t movlhps(vec_t x, vec_t y) { - # if VEC_SIZE >= 32 - # define broadcast(x) ({ \ - vec_t t_; \ -- asm ( "%{evex%} vbroadcastsd %1, %0" : "=v" (t_) \ -+ asm ( "vbroadcastsd %1, %0" : "=v" (t_) \ - : "m" (*(double[1]){ x }) ); \ - t_; \ - }) - # else - # define broadcast(x) ({ \ - vec_t t_; \ -- asm ( "%{evex%} vpbroadcastq %1, %0" \ -+ asm ( "vpbroadcastq %1, %0" \ - : "=v" (t_) : "m" (*(double[1]){ x }) ); \ - t_; \ - }) -@@ -601,7 +601,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { - # if INT_SIZE == 4 || UINT_SIZE == 4 - # define broadcast(x) ({ \ - vec_t t_; \ -- asm ( "%{evex%} vpbroadcastd %1, %0" \ -+ asm ( "vpbroadcastd %1, %0" \ - : "=v" (t_) : "m" (*(int[1]){ x }) ); \ - t_; \ - }) -@@ -649,7 +649,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { - # elif INT_SIZE == 8 || UINT_SIZE == 8 - # define broadcast(x) ({ \ - vec_t t_; \ -- asm ( "%{evex%} vpbroadcastq %1, %0" \ -+ asm ( "vpbroadcastq %1, %0" \ - : "=v" (t_) : "m" (*(long long[1]){ x }) ); \ - t_; \ - }) -@@ -716,7 +716,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { - # if INT_SIZE == 1 || UINT_SIZE == 1 - # define broadcast(x) ({ \ - vec_t t_; \ -- asm ( "%{evex%} vpbroadcastb %1, %0" \ -+ asm ( "vpbroadcastb %1, %0" \ - : "=v" 
(t_) : "m" (*(char[1]){ x }) ); \ - t_; \ - }) -@@ -745,7 +745,7 @@ static inline vec_t movlhps(vec_t x, vec_t y) { - # elif INT_SIZE == 2 || UINT_SIZE == 2 - # define broadcast(x) ({ \ - vec_t t_; \ -- asm ( "%{evex%} vpbroadcastw %1, %0" \ -+ asm ( "vpbroadcastw %1, %0" \ - : "=v" (t_) : "m" (*(short[1]){ x }) ); \ - t_; \ - }) --- -2.47.0 - diff --git a/0032-x86-HVM-properly-reject-indirect-VRAM-writes.patch b/0032-x86-HVM-properly-reject-indirect-VRAM-writes.patch deleted file mode 100644 index 304a423..0000000 --- a/0032-x86-HVM-properly-reject-indirect-VRAM-writes.patch +++ /dev/null @@ -1,45 +0,0 @@ -From ec3999e205ccadbeb8ab1f8420dea02fee2b5a5d Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 24 Sep 2024 14:43:02 +0200 -Subject: [PATCH 32/83] x86/HVM: properly reject "indirect" VRAM writes - -While ->count will only be different from 1 for "indirect" (data in -guest memory) accesses, it being 1 does not exclude the request being an -"indirect" one. Check both to be on the safe side, and bring the ->count -part also in line with what ioreq_send_buffered() actually refuses to -handle. - -Fixes: 3bbaaec09b1b ("x86/hvm: unify stdvga mmio intercept with standard mmio intercept") -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -master commit: eb7cd0593d88c4b967a24bca8bd30591966676cd -master date: 2024-09-12 09:13:04 +0200 ---- - xen/arch/x86/hvm/stdvga.c | 6 +++--- - 1 file changed, 3 insertions(+), 3 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index b16c59f772..5f02d88615 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -530,14 +530,14 @@ static bool cf_check stdvga_mem_accept( - - spin_lock(&s->lock); - -- if ( p->dir == IOREQ_WRITE && p->count > 1 ) -+ if ( p->dir == IOREQ_WRITE && (p->data_is_ptr || p->count != 1) ) - { - /* - * We cannot return X86EMUL_UNHANDLEABLE on anything other then the - * first cycle of an I/O. 
So, since we cannot guarantee to always be - * able to send buffered writes, we have to reject any multi-cycle -- * I/O and, since we are rejecting an I/O, we must invalidate the -- * cache. -+ * or "indirect" I/O and, since we are rejecting an I/O, we must -+ * invalidate the cache. - * Single-cycle write transactions are accepted even if the cache is - * not active since we can assert, when in stdvga mode, that writes - * to VRAM have no side effect and thus we can try to buffer them. --- -2.47.0 - diff --git a/0032-x86-smp-perform-disabling-on-interrupts-ahead-of-AP-.patch b/0032-x86-smp-perform-disabling-on-interrupts-ahead-of-AP-.patch new file mode 100644 index 0000000..ce61001 --- /dev/null +++ b/0032-x86-smp-perform-disabling-on-interrupts-ahead-of-AP-.patch @@ -0,0 +1,46 @@ +From 30aadc8dbc9ac1abf88bfeb370d1c27c5ea9d1e6 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Mon, 17 Feb 2025 13:23:50 +0100 +Subject: [PATCH 32/53] x86/smp: perform disabling on interrupts ahead of AP + shutdown +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Move the disabling of interrupt sources so it's done ahead of the offlining +of APs. This is to prevent AMD systems triggering "Receive accept error" +when interrupts target CPUs that are no longer online. 
+ +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: db6daa9bf411260d2c1f5301e4fc786ae4a5cef8 +master date: 2025-02-12 15:56:07 +0100 +--- + xen/arch/x86/smp.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c +index bd92496d3d..eb134ecac8 100644 +--- a/xen/arch/x86/smp.c ++++ b/xen/arch/x86/smp.c +@@ -372,6 +372,8 @@ void smp_send_stop(void) + smp_call_function(stop_this_cpu, &stop_aps, 0); + + local_irq_disable(); ++ disable_IO_APIC(); ++ hpet_disable(); + + if ( num_online_cpus() > 1 ) + { +@@ -387,8 +389,6 @@ void smp_send_stop(void) + + if ( cpu_online(cpu) ) + { +- disable_IO_APIC(); +- hpet_disable(); + __stop_this_cpu(); + x2apic_enabled = (current_local_apic_mode() == APIC_MODE_X2APIC); + } +-- +2.48.1 + diff --git a/0033-x86-pci-disable-MSI-X-on-all-devices-at-shutdown.patch b/0033-x86-pci-disable-MSI-X-on-all-devices-at-shutdown.patch new file mode 100644 index 0000000..af51fbc --- /dev/null +++ b/0033-x86-pci-disable-MSI-X-on-all-devices-at-shutdown.patch @@ -0,0 +1,201 @@ +From eae35da8f4c420f5f6752d4c0291be5ba4d22f5a Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Mon, 17 Feb 2025 13:24:03 +0100 +Subject: [PATCH 33/53] x86/pci: disable MSI(-X) on all devices at shutdown +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Attempt to disable MSI(-X) capabilities on all PCI devices know by Xen at +shutdown. Doing such disabling should facilitate kexec chained kernel from +booting more reliably, as device MSI(-X) interrupt generation should be +quiesced. + +Only attempt to disable MSI(-X) on all devices in the crash context if the +PCI lock is not taken, otherwise the PCI device list could be in an +inconsistent state. This requires introducing a new pcidevs_trylock() +helper to check whether the lock is currently taken. 
+ +Disabling MSI(-X) should prevent "Receive accept error" being raised as a +result of non-disabled interrupts targeting offline CPUs. + +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: 7ab6951981231b4c576a3588248c303001272588 +master date: 2025-02-12 15:56:07 +0100 +--- + xen/arch/x86/crash.c | 10 ++++++++ + xen/arch/x86/include/asm/msi.h | 1 + + xen/arch/x86/msi.c | 18 +++++++++++++++ + xen/arch/x86/smp.c | 1 + + xen/drivers/passthrough/pci.c | 42 ++++++++++++++++++++++++++++++++++ + xen/include/xen/pci.h | 12 ++++++++++ + 6 files changed, 84 insertions(+) + +diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c +index a789416ca3..22b1121d7a 100644 +--- a/xen/arch/x86/crash.c ++++ b/xen/arch/x86/crash.c +@@ -175,6 +175,16 @@ static void nmi_shootdown_cpus(void) + */ + x2apic_enabled = (current_local_apic_mode() == APIC_MODE_X2APIC); + ++ if ( pcidevs_trylock() ) ++ { ++ /* ++ * Assume the PCI device list to be in a consistent state if the ++ * lock is not held when the crash happened. 
++ */ ++ pci_disable_msi_all(); ++ pcidevs_unlock(); ++ } ++ + disable_IO_APIC(); + hpet_disable(); + } +diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h +index 748bc3cd6d..503c9447f6 100644 +--- a/xen/arch/x86/include/asm/msi.h ++++ b/xen/arch/x86/include/asm/msi.h +@@ -86,6 +86,7 @@ extern int pci_enable_msi(struct pci_dev *pdev, struct msi_info *msi, + extern void pci_disable_msi(struct msi_desc *msi_desc); + extern int pci_prepare_msix(u16 seg, u8 bus, u8 devfn, bool off); + extern void pci_cleanup_msi(struct pci_dev *pdev); ++extern void pci_disable_msi_all(void); + extern int setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc); + extern int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc, + hw_irq_controller *handler); +diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c +index 0d11448234..0c099215c7 100644 +--- a/xen/arch/x86/msi.c ++++ b/xen/arch/x86/msi.c +@@ -1245,6 +1245,24 @@ void pci_cleanup_msi(struct pci_dev *pdev) + msi_free_irqs(pdev); + } + ++static int cf_check disable_msi(struct pci_dev *pdev, void *arg) ++{ ++ msi_set_enable(pdev, 0); ++ msix_set_enable(pdev, 0); ++ ++ return 0; ++} ++ ++/* Disable MSI and/or MSI-X on all devices known by Xen. 
*/ ++void pci_disable_msi_all(void) ++{ ++ int rc = pci_iterate_devices(disable_msi, NULL); ++ ++ if ( rc ) ++ printk(XENLOG_ERR ++ "Failed to disable MSI(-X) on some devices: %d\n", rc); ++} ++ + int pci_reset_msix_state(struct pci_dev *pdev) + { + unsigned int pos = pci_find_cap_offset(pdev->sbdf, PCI_CAP_ID_MSIX); +diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c +index eb134ecac8..92341b7ee7 100644 +--- a/xen/arch/x86/smp.c ++++ b/xen/arch/x86/smp.c +@@ -372,6 +372,7 @@ void smp_send_stop(void) + smp_call_function(stop_this_cpu, &stop_aps, 0); + + local_irq_disable(); ++ pci_disable_msi_all(); + disable_IO_APIC(); + hpet_disable(); + +diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c +index c7ee835b8a..f4c8c505af 100644 +--- a/xen/drivers/passthrough/pci.c ++++ b/xen/drivers/passthrough/pci.c +@@ -68,6 +68,11 @@ bool pcidevs_locked(void) + return rspin_is_locked(&_pcidevs_lock); + } + ++bool pcidevs_trylock_unsafe(void) ++{ ++ return _rspin_trylock(&_pcidevs_lock); ++} ++ + static RADIX_TREE(pci_segments); + + static inline struct pci_seg *get_pseg(u16 seg) +@@ -1816,6 +1821,43 @@ int iommu_do_pci_domctl( + return ret; + } + ++struct segment_iter { ++ int (*handler)(struct pci_dev *pdev, void *arg); ++ void *arg; ++ int rc; ++}; ++ ++static int cf_check iterate_all(struct pci_seg *pseg, void *arg) ++{ ++ struct segment_iter *iter = arg; ++ struct pci_dev *pdev; ++ ++ list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list ) ++ { ++ int rc = iter->handler(pdev, iter->arg); ++ ++ if ( !iter->rc ) ++ iter->rc = rc; ++ } ++ ++ return 0; ++} ++ ++/* ++ * Iterate without locking or preemption over all PCI devices known by Xen. ++ * Can be called with interrupts disabled. 
++ */ ++int pci_iterate_devices(int (*handler)(struct pci_dev *pdev, void *arg), ++ void *arg) ++{ ++ struct segment_iter iter = { ++ .handler = handler, ++ .arg = arg, ++ }; ++ ++ return pci_segments_iterate(iterate_all, &iter) ?: iter.rc; ++} ++ + /* + * Local variables: + * mode: C +diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h +index 82e1221c9c..06f98c7c80 100644 +--- a/xen/include/xen/pci.h ++++ b/xen/include/xen/pci.h +@@ -187,6 +187,11 @@ static always_inline void pcidevs_lock(void) + } + void pcidevs_unlock(void); + bool __must_check pcidevs_locked(void); ++bool pcidevs_trylock_unsafe(void); ++static always_inline bool pcidevs_trylock(void) ++{ ++ return lock_evaluate_nospec(pcidevs_trylock_unsafe()); ++} + + #ifndef NDEBUG + /* +@@ -223,6 +228,13 @@ struct pci_dev *pci_get_pdev(const struct domain *d, pci_sbdf_t sbdf); + struct pci_dev *pci_get_real_pdev(pci_sbdf_t sbdf); + void pci_check_disable_device(u16 seg, u8 bus, u8 devfn); + ++/* ++ * Iterate without locking or preemption over all PCI devices known by Xen. ++ * Can be called with interrupts disabled. 
++ */ ++int pci_iterate_devices(int (*handler)(struct pci_dev *pdev, void *arg), ++ void *arg); ++ + uint8_t pci_conf_read8(pci_sbdf_t sbdf, unsigned int reg); + uint16_t pci_conf_read16(pci_sbdf_t sbdf, unsigned int reg); + uint32_t pci_conf_read32(pci_sbdf_t sbdf, unsigned int reg); +-- +2.48.1 + diff --git a/0033-xen-x86-pvh-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch b/0033-xen-x86-pvh-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch deleted file mode 100644 index 2e2c55e..0000000 --- a/0033-xen-x86-pvh-handle-ACPI-RSDT-table-in-PVH-Dom0-build.patch +++ /dev/null @@ -1,63 +0,0 @@ -From d0ea9b319d4ca04e29ef533db0c3655a78dec315 Mon Sep 17 00:00:00 2001 -From: Stefano Stabellini -Date: Tue, 24 Sep 2024 14:43:24 +0200 -Subject: [PATCH 33/83] xen/x86/pvh: handle ACPI RSDT table in PVH Dom0 build -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Xen always generates an XSDT table even if the firmware only provided an -RSDT table. Copy the RSDT header from the firmware table, adjusting the -signature, for the XSDT table when not provided by the firmware. - -This is necessary to run Xen on QEMU. - -Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables') -Suggested-by: Roger Pau Monné -Signed-off-by: Stefano Stabellini -Signed-off-by: Daniel P. 
Smith -Reviewed-by: Roger Pau Monné -master commit: 6e7f7a0c16c4d406bda6d4a900252ff63a7c5fad -master date: 2024-09-12 09:18:25 +0200 ---- - xen/arch/x86/hvm/dom0_build.c | 17 ++++++++++++++++- - 1 file changed, 16 insertions(+), 1 deletion(-) - -diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c -index f3eddb6846..3dd913bdb0 100644 ---- a/xen/arch/x86/hvm/dom0_build.c -+++ b/xen/arch/x86/hvm/dom0_build.c -@@ -1078,7 +1078,16 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, - rc = -EINVAL; - goto out; - } -- xsdt_paddr = rsdp->xsdt_physical_address; -+ /* -+ * Note the header is the same for both RSDT and XSDT, so it's fine to -+ * copy the native RSDT header to the Xen crafted XSDT if no native -+ * XSDT is available. -+ */ -+ if ( rsdp->revision > 1 && rsdp->xsdt_physical_address ) -+ xsdt_paddr = rsdp->xsdt_physical_address; -+ else -+ xsdt_paddr = rsdp->rsdt_physical_address; -+ - acpi_os_unmap_memory(rsdp, sizeof(*rsdp)); - table = acpi_os_map_memory(xsdt_paddr, sizeof(*table)); - if ( !table ) -@@ -1090,6 +1099,12 @@ static int __init pvh_setup_acpi_xsdt(struct domain *d, paddr_t madt_addr, - xsdt->header = *table; - acpi_os_unmap_memory(table, sizeof(*table)); - -+ /* -+ * In case the header is an RSDT copy, unconditionally ensure it has -+ * an XSDT sig. -+ */ -+ xsdt->header.signature[0] = 'X'; -+ - /* Add the custom MADT. 
*/ - xsdt->table_offset_entry[0] = madt_addr; - --- -2.47.0 - diff --git a/0034-blkif-reconcile-protocol-specification-with-in-use-i.patch b/0034-blkif-reconcile-protocol-specification-with-in-use-i.patch deleted file mode 100644 index 4646dfa..0000000 --- a/0034-blkif-reconcile-protocol-specification-with-in-use-i.patch +++ /dev/null @@ -1,183 +0,0 @@ -From 933416b13966a3fa2a37b1f645c23afbd8fb6d09 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 24 Sep 2024 14:43:50 +0200 -Subject: [PATCH 34/83] blkif: reconcile protocol specification with in-use - implementations -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Current blkif implementations (both backends and frontends) have all slight -differences about how they handle the 'sector-size' xenstore node, and how -other fields are derived from this value or hardcoded to be expressed in units -of 512 bytes. - -To give some context, this is an excerpt of how different implementations use -the value in 'sector-size' as the base unit for to other fields rather than -just to set the logical sector size of the block device: - - │ sectors xenbus node │ requests sector_number │ requests {first,last}_sect -────────────────────────┼─────────────────────┼────────────────────────┼─────────────────────────── -FreeBSD blk{front,back} │ sector-size │ sector-size │ 512 -────────────────────────┼─────────────────────┼────────────────────────┼─────────────────────────── -Linux blk{front,back} │ 512 │ 512 │ 512 -────────────────────────┼─────────────────────┼────────────────────────┼─────────────────────────── -QEMU blkback │ sector-size │ sector-size │ sector-size -────────────────────────┼─────────────────────┼────────────────────────┼─────────────────────────── -Windows blkfront │ sector-size │ sector-size │ sector-size -────────────────────────┼─────────────────────┼────────────────────────┼─────────────────────────── -MiniOS │ sector-size │ 512 │ 512 - 
-An attempt was made by 67e1c050e36b in order to change the base units of the -request fields and the xenstore 'sectors' node. That however only lead to more -confusion, as the specification now clearly diverged from the reference -implementation in Linux. Such change was only implemented for QEMU Qdisk -and Windows PV blkfront. - -Partially revert to the state before 67e1c050e36b while adjusting the -documentation for 'sectors' to match what it used to be previous to -2fa701e5346d: - - * Declare 'feature-large-sector-size' deprecated. Frontends should not expose - the node, backends should not make decisions based on its presence. - - * Clarify that 'sectors' xenstore node and the requests fields are always in - 512-byte units, like it was previous to 2fa701e5346d and 67e1c050e36b. - -All base units for the fields used in the protocol are 512-byte based, the -xenbus 'sector-size' field is only used to signal the logic block size. When -'sector-size' is greater than 512, blkfront implementations must make sure that -the offsets and sizes (despite being expressed in 512-byte units) are aligned -to the logical block size specified in 'sector-size', otherwise the backend -will fail to process the requests. - -This will require changes to some of the frontends and backends in order to -properly support 'sector-size' nodes greater than 512. 
- -Fixes: 2fa701e5346d ('blkif.h: Provide more complete documentation of the blkif interface') -Fixes: 67e1c050e36b ('public/io/blkif.h: try to fix the semantics of sector based quantities') -Signed-off-by: Roger Pau Monné -Reviewed-by: Juergen Gross -Reviewed-by: Anthony PERARD -master commit: 221f2748e8dabe8361b8cdfcffbeab9102c4c899 -master date: 2024-09-12 14:04:56 +0200 ---- - xen/include/public/io/blkif.h | 52 ++++++++++++++++++++++++++--------- - 1 file changed, 39 insertions(+), 13 deletions(-) - -diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h -index 22f1eef0c0..9b00d633d3 100644 ---- a/xen/include/public/io/blkif.h -+++ b/xen/include/public/io/blkif.h -@@ -237,12 +237,16 @@ - * sector-size - * Values: - * -- * The logical block size, in bytes, of the underlying storage. This -- * must be a power of two with a minimum value of 512. -+ * The logical block size, in bytes, of the underlying storage. This must -+ * be a power of two with a minimum value of 512. The sector size should -+ * only be used for request segment length and alignment. - * -- * NOTE: Because of implementation bugs in some frontends this must be -- * set to 512, unless the frontend advertizes a non-zero value -- * in its "feature-large-sector-size" xenbus node. (See below). -+ * When exposing a device that uses a logical sector size of 4096, the -+ * only difference xenstore wise will be that 'sector-size' (and possibly -+ * 'physical-sector-size' if supported by the backend) will be 4096, but -+ * the 'sectors' node will still be calculated using 512 byte units. The -+ * sector base units in the ring requests fields will all be 512 byte -+ * based despite the logical sector size exposed in 'sector-size'. - * - * physical-sector-size - * Values: -@@ -254,9 +258,9 @@ - * sectors - * Values: - * -- * The size of the backend device, expressed in units of "sector-size". 
-- * The product of "sector-size" and "sectors" must also be an integer -- * multiple of "physical-sector-size", if that node is present. -+ * The size of the backend device, expressed in units of 512b. The -+ * product of "sectors" * 512 must also be an integer multiple of -+ * "physical-sector-size", if that node is present. - * - ***************************************************************************** - * Frontend XenBus Nodes -@@ -338,6 +342,7 @@ - * feature-large-sector-size - * Values: 0/1 (boolean) - * Default Value: 0 -+ * Notes: DEPRECATED, 12 - * - * A value of "1" indicates that the frontend will correctly supply and - * interpret all sector-based quantities in terms of the "sector-size" -@@ -411,6 +416,11 @@ - *(10) The discard-secure property may be present and will be set to 1 if the - * backing device supports secure discard. - *(11) Only used by Linux and NetBSD. -+ *(12) Possibly only ever implemented by the QEMU Qdisk backend and the Windows -+ * PV block frontend. Other backends and frontends supported 'sector-size' -+ * values greater than 512 before such feature was added. Frontends should -+ * not expose this node, neither should backends make any decisions based -+ * on it being exposed by the frontend. - */ - - /* -@@ -619,11 +629,14 @@ - #define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8 - - /* -- * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as -- * 'sector_number' in blkif_request, blkif_request_discard and -- * blkif_request_indirect are sector-based quantities. See the description -- * of the "feature-large-sector-size" frontend xenbus node above for -- * more information. -+ * NB. 'first_sect' and 'last_sect' in blkif_request_segment are all in units -+ * of 512 bytes, despite the 'sector-size' xenstore node possibly having a -+ * value greater than 512. 
-+ * -+ * The value in 'first_sect' and 'last_sect' fields must be setup so that the -+ * resulting segment offset and size is aligned to the logical sector size -+ * reported by the 'sector-size' xenstore node, see 'Backend Device Properties' -+ * section. - */ - struct blkif_request_segment { - grant_ref_t gref; /* reference to I/O buffer frame */ -@@ -634,6 +647,10 @@ struct blkif_request_segment { - - /* - * Starting ring element for any I/O request. -+ * -+ * The 'sector_number' field is in units of 512b, despite the value of the -+ * 'sector-size' xenstore node. Note however that the offset in -+ * 'sector_number' must be aligned to 'sector-size'. - */ - struct blkif_request { - uint8_t operation; /* BLKIF_OP_??? */ -@@ -648,6 +665,10 @@ typedef struct blkif_request blkif_request_t; - /* - * Cast to this structure when blkif_request.operation == BLKIF_OP_DISCARD - * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request) -+ * -+ * The 'sector_number' field is in units of 512b, despite the value of the -+ * 'sector-size' xenstore node. Note however that the offset in -+ * 'sector_number' must be aligned to 'sector-size'. - */ - struct blkif_request_discard { - uint8_t operation; /* BLKIF_OP_DISCARD */ -@@ -660,6 +681,11 @@ struct blkif_request_discard { - }; - typedef struct blkif_request_discard blkif_request_discard_t; - -+/* -+ * The 'sector_number' field is in units of 512b, despite the value of the -+ * 'sector-size' xenstore node. Note however that the offset in -+ * 'sector_number' must be aligned to 'sector-size'. 
-+ */ - struct blkif_request_indirect { - uint8_t operation; /* BLKIF_OP_INDIRECT */ - uint8_t indirect_op; /* BLKIF_OP_{READ/WRITE} */ --- -2.47.0 - diff --git a/0034-x86-iommu-disable-interrupts-at-shutdown.patch b/0034-x86-iommu-disable-interrupts-at-shutdown.patch new file mode 100644 index 0000000..2cae963 --- /dev/null +++ b/0034-x86-iommu-disable-interrupts-at-shutdown.patch @@ -0,0 +1,193 @@ +From 93302bb88855c5f308f1e67ac2cd84271aa2d73a Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Mon, 17 Feb 2025 13:24:23 +0100 +Subject: [PATCH 34/53] x86/iommu: disable interrupts at shutdown +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Add a new hook to inhibit interrupt generation by the IOMMU(s). Note the +hook is currently only implemented for x86 IOMMUs. The purpose is to +disable interrupt generation at shutdown so any kexec chained image finds +the IOMMU(s) in a quiesced state. + +It would also prevent "Receive accept error" being raised as a result of +non-disabled interrupts targeting offline CPUs. + +Note that the iommu_quiesce() call in nmi_shootdown_cpus() is still +required even when there's a preceding iommu_crash_shutdown() call; the +later can become a no-op depending on the setting of the "crash-disable" +command line option. 
+ +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: 819c3cb186a86ef3e04fb5af4d9f9f6de032c3ee +master date: 2025-02-12 15:56:07 +0100 +--- + xen/arch/x86/crash.c | 1 + + xen/arch/x86/smp.c | 1 + + xen/drivers/passthrough/amd/iommu.h | 1 + + xen/drivers/passthrough/amd/iommu_init.c | 17 +++++++++++++++++ + xen/drivers/passthrough/amd/pci_amd_iommu.c | 1 + + xen/drivers/passthrough/iommu.c | 12 ++++++++++++ + xen/drivers/passthrough/vtd/iommu.c | 19 +++++++++++++++++++ + xen/include/xen/iommu.h | 3 +++ + 8 files changed, 55 insertions(+) + +diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c +index 22b1121d7a..26057c71d3 100644 +--- a/xen/arch/x86/crash.c ++++ b/xen/arch/x86/crash.c +@@ -187,6 +187,7 @@ static void nmi_shootdown_cpus(void) + + disable_IO_APIC(); + hpet_disable(); ++ iommu_quiesce(); + } + } + +diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c +index 92341b7ee7..2cf36cb96c 100644 +--- a/xen/arch/x86/smp.c ++++ b/xen/arch/x86/smp.c +@@ -375,6 +375,7 @@ void smp_send_stop(void) + pci_disable_msi_all(); + disable_IO_APIC(); + hpet_disable(); ++ iommu_quiesce(); + + if ( num_online_cpus() > 1 ) + { +diff --git a/xen/drivers/passthrough/amd/iommu.h b/xen/drivers/passthrough/amd/iommu.h +index 8d6f63d87f..9934316351 100644 +--- a/xen/drivers/passthrough/amd/iommu.h ++++ b/xen/drivers/passthrough/amd/iommu.h +@@ -292,6 +292,7 @@ extern unsigned long *shared_intremap_inuse; + void cf_check amd_iommu_resume(void); + int __must_check cf_check amd_iommu_suspend(void); + void cf_check amd_iommu_crash_shutdown(void); ++void cf_check amd_iommu_quiesce(void); + + static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift) + { +diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c +index b546c28667..d899f58a38 100644 +--- a/xen/drivers/passthrough/amd/iommu_init.c ++++ b/xen/drivers/passthrough/amd/iommu_init.c +@@ -1610,3 +1610,20 @@ void cf_check amd_iommu_resume(void) + 
invalidate_all_domain_pages(); + } + } ++ ++void cf_check amd_iommu_quiesce(void) ++{ ++ struct amd_iommu *iommu; ++ ++ for_each_amd_iommu ( iommu ) ++ { ++ if ( iommu->ctrl.int_cap_xt_en ) ++ { ++ iommu->ctrl.int_cap_xt_en = false; ++ writeq(iommu->ctrl.raw, ++ iommu->mmio_base + IOMMU_CONTROL_MMIO_OFFSET); ++ } ++ else ++ amd_iommu_msi_enable(iommu, IOMMU_CONTROL_DISABLED); ++ } ++} +diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c +index f96f59440b..d00697edb3 100644 +--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c ++++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c +@@ -791,6 +791,7 @@ static const struct iommu_ops __initconst_cf_clobber _iommu_ops = { + .crash_shutdown = amd_iommu_crash_shutdown, + .get_reserved_device_memory = amd_iommu_get_reserved_device_memory, + .dump_page_tables = amd_dump_page_tables, ++ .quiesce = amd_iommu_quiesce, + }; + + static const struct iommu_init_ops __initconstrel _iommu_init_ops = { +diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c +index 50bfd62553..1a3180708c 100644 +--- a/xen/drivers/passthrough/iommu.c ++++ b/xen/drivers/passthrough/iommu.c +@@ -662,6 +662,18 @@ void iommu_crash_shutdown(void) + #endif + } + ++void iommu_quiesce(void) ++{ ++ const struct iommu_ops *ops; ++ ++ if ( !iommu_enabled ) ++ return; ++ ++ ops = iommu_get_ops(); ++ if ( ops->quiesce ) ++ iommu_vcall(ops, quiesce); ++} ++ + int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt) + { + const struct iommu_ops *ops; +diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c +index ab38bbc8a5..f28de36458 100644 +--- a/xen/drivers/passthrough/vtd/iommu.c ++++ b/xen/drivers/passthrough/vtd/iommu.c +@@ -3247,6 +3247,24 @@ static int cf_check intel_iommu_quarantine_init(struct pci_dev *pdev, + return rc; + } + ++static void cf_check vtd_quiesce(void) ++{ ++ const struct acpi_drhd_unit *drhd; ++ ++ for_each_drhd_unit ( drhd ) ++ { 
++ const struct vtd_iommu *iommu = drhd->iommu; ++ uint32_t sts = dmar_readl(iommu->reg, DMAR_FECTL_REG); ++ ++ /* ++ * Open code dma_msi_mask() to avoid taking the spinlock which could ++ * deadlock if called from crash context. ++ */ ++ sts |= DMA_FECTL_IM; ++ dmar_writel(iommu->reg, DMAR_FECTL_REG, sts); ++ } ++} ++ + static const struct iommu_ops __initconst_cf_clobber vtd_ops = { + .page_sizes = PAGE_SIZE_4K, + .init = intel_iommu_domain_init, +@@ -3276,6 +3294,7 @@ static const struct iommu_ops __initconst_cf_clobber vtd_ops = { + .iotlb_flush = iommu_flush_iotlb, + .get_reserved_device_memory = intel_iommu_get_reserved_device_memory, + .dump_page_tables = vtd_dump_page_tables, ++ .quiesce = vtd_quiesce, + }; + + const struct iommu_init_ops __initconstrel intel_iommu_init_ops = { +diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h +index 442ae5322d..04f195be04 100644 +--- a/xen/include/xen/iommu.h ++++ b/xen/include/xen/iommu.h +@@ -314,6 +314,8 @@ struct iommu_ops { + */ + int (*dt_xlate)(device_t *dev, const struct dt_phandle_args *args); + #endif ++ /* Inhibit all interrupt generation, to be used at shutdown. 
*/ ++ void (*quiesce)(void); + }; + + /* +@@ -404,6 +406,7 @@ static inline int iommu_do_domctl(struct xen_domctl *domctl, struct domain *d, + int __must_check iommu_suspend(void); + void iommu_resume(void); + void iommu_crash_shutdown(void); ++void iommu_quiesce(void); + int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt); + int iommu_quarantine_dev_init(device_t *dev); + +-- +2.48.1 + diff --git a/0035-IOMMU-x86-the-bus-to-bridge-lock-needs-to-be-acquire.patch b/0035-IOMMU-x86-the-bus-to-bridge-lock-needs-to-be-acquire.patch new file mode 100644 index 0000000..9e92144 --- /dev/null +++ b/0035-IOMMU-x86-the-bus-to-bridge-lock-needs-to-be-acquire.patch @@ -0,0 +1,113 @@ +From f2c5b4b1aa1972b522bfb5207f4b771681ef8fd2 Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Thu, 27 Feb 2025 12:58:32 +0000 +Subject: [PATCH 35/53] IOMMU/x86: the bus-to-bridge lock needs to be acquired + IRQ-safe +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The function's use from set_msi_source_id() is guaranteed to be in an +IRQs-off region. While the invocation of that function could be moved +ahead in msi_msg_to_remap_entry() (doesn't need to be in the IOMMU- +intremap-locked region), the call tree from map_domain_pirq() holds an +IRQ descriptor lock. Hence all use sites of the lock need become IRQ- +safe ones. + +In find_upstream_bridge() do a tiny bit of tidying in adjacent code: +Change a variable's type to unsigned and merge a redundant assignment +into another variable's initializer. + +This is XSA-467 / CVE-2025-1713. 
+ +Fixes: 476bbccc811c ("VT-d: fix MSI source-id of interrupt remapping") +Signed-off-by: Jan Beulich +Reviewed-by: Juergen Gross +Reviewed-by: Roger Pau Monné +(cherry picked from commit 39bc6af3ba483282ed6bbf94b08aec38c93d39e6) +--- + xen/drivers/passthrough/pci.c | 20 +++++++++++--------- + 1 file changed, 11 insertions(+), 9 deletions(-) + +diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c +index f4c8c505af..4d06e5d3f9 100644 +--- a/xen/drivers/passthrough/pci.c ++++ b/xen/drivers/passthrough/pci.c +@@ -351,20 +351,21 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn) + switch ( pdev->type = pdev_type(pseg->nr, bus, devfn) ) + { + unsigned int cap, sec_bus, sub_bus; ++ unsigned long flags; + + case DEV_TYPE_PCIe2PCI_BRIDGE: + case DEV_TYPE_LEGACY_PCI_BRIDGE: + sec_bus = pci_conf_read8(pdev->sbdf, PCI_SECONDARY_BUS); + sub_bus = pci_conf_read8(pdev->sbdf, PCI_SUBORDINATE_BUS); + +- spin_lock(&pseg->bus2bridge_lock); ++ spin_lock_irqsave(&pseg->bus2bridge_lock, flags); + for ( ; sec_bus <= sub_bus; sec_bus++ ) + { + pseg->bus2bridge[sec_bus].map = 1; + pseg->bus2bridge[sec_bus].bus = bus; + pseg->bus2bridge[sec_bus].devfn = devfn; + } +- spin_unlock(&pseg->bus2bridge_lock); ++ spin_unlock_irqrestore(&pseg->bus2bridge_lock, flags); + break; + + case DEV_TYPE_PCIe_ENDPOINT: +@@ -434,16 +435,17 @@ static void free_pdev(struct pci_seg *pseg, struct pci_dev *pdev) + switch ( pdev->type ) + { + unsigned int sec_bus, sub_bus; ++ unsigned long flags; + + case DEV_TYPE_PCIe2PCI_BRIDGE: + case DEV_TYPE_LEGACY_PCI_BRIDGE: + sec_bus = pci_conf_read8(pdev->sbdf, PCI_SECONDARY_BUS); + sub_bus = pci_conf_read8(pdev->sbdf, PCI_SUBORDINATE_BUS); + +- spin_lock(&pseg->bus2bridge_lock); ++ spin_lock_irqsave(&pseg->bus2bridge_lock, flags); + for ( ; sec_bus <= sub_bus; sec_bus++ ) + pseg->bus2bridge[sec_bus] = pseg->bus2bridge[pdev->bus]; +- spin_unlock(&pseg->bus2bridge_lock); ++ spin_unlock_irqrestore(&pseg->bus2bridge_lock, 
flags); + break; + + default: +@@ -1067,8 +1069,9 @@ enum pdev_type pdev_type(u16 seg, u8 bus, u8 devfn) + int find_upstream_bridge(u16 seg, u8 *bus, u8 *devfn, u8 *secbus) + { + struct pci_seg *pseg = get_pseg(seg); +- int ret = 0; +- int cnt = 0; ++ int ret = 1; ++ unsigned long flags; ++ unsigned int cnt = 0; + + if ( *bus == 0 ) + return 0; +@@ -1079,8 +1082,7 @@ int find_upstream_bridge(u16 seg, u8 *bus, u8 *devfn, u8 *secbus) + if ( !pseg->bus2bridge[*bus].map ) + return 0; + +- ret = 1; +- spin_lock(&pseg->bus2bridge_lock); ++ spin_lock_irqsave(&pseg->bus2bridge_lock, flags); + while ( pseg->bus2bridge[*bus].map ) + { + *secbus = *bus; +@@ -1094,7 +1096,7 @@ int find_upstream_bridge(u16 seg, u8 *bus, u8 *devfn, u8 *secbus) + } + + out: +- spin_unlock(&pseg->bus2bridge_lock); ++ spin_unlock_irqrestore(&pseg->bus2bridge_lock, flags); + return ret; + } + +-- +2.48.1 + diff --git a/0035-xen-ucode-Fix-buffer-under-run-when-parsing-AMD-cont.patch b/0035-xen-ucode-Fix-buffer-under-run-when-parsing-AMD-cont.patch deleted file mode 100644 index 4fe8e78..0000000 --- a/0035-xen-ucode-Fix-buffer-under-run-when-parsing-AMD-cont.patch +++ /dev/null @@ -1,62 +0,0 @@ -From 2c61ab407172682e1382204a8305107f19e2951b Mon Sep 17 00:00:00 2001 -From: Demi Marie Obenour -Date: Tue, 24 Sep 2024 14:44:10 +0200 -Subject: [PATCH 35/83] xen/ucode: Fix buffer under-run when parsing AMD - containers - -The AMD container format has no formal spec. It is, at best, precision -guesswork based on AMD's prior contributions to open source projects. The -Equivalence Table has both an explicit length, and an expectation of having a -NULL entry at the end. - -Xen was sanity checking the NULL entry, but without confirming that an entry -was present, resulting in a read off the front of the buffer. 
With some -manual debugging/annotations this manifests as: - - (XEN) *** Buf ffff83204c00b19c, eq ffff83204c00b194 - (XEN) *** eq: 0c 00 00 00 44 4d 41 00 00 00 00 00 00 00 00 00 aa aa aa aa - ^-Actual buffer-------------------^ - (XEN) *** installed_cpu: 000c - (XEN) microcode: Bad equivalent cpu table - (XEN) Parsing microcode blob error -22 - -When loaded by hypercall, the 4 bytes interpreted as installed_cpu happen to -be the containing struct ucode_buf's len field, and luckily will be nonzero. - -When loaded at boot, it's possible for the access to #PF if the module happens -to have been placed on a 2M boundary by the bootloader. Under Linux, it will -commonly be the end of the CPIO header. - -Drop the probe of the NULL entry; Nothing else cares. A container without one -is well formed, insofar that we can still parse it correctly. With this -dropped, the same container results in: - - (XEN) microcode: couldn't find any matching ucode in the provided blob! - -Fixes: 4de936a38aa9 ("x86/ucode/amd: Rework parsing logic in cpu_request_microcode()") -Signed-off-by: Demi Marie Obenour -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -master commit: a8bf14f6f331d4f428010b4277b67c33f561ed19 -master date: 2024-09-13 15:23:30 +0100 ---- - xen/arch/x86/cpu/microcode/amd.c | 3 +-- - 1 file changed, 1 insertion(+), 2 deletions(-) - -diff --git a/xen/arch/x86/cpu/microcode/amd.c b/xen/arch/x86/cpu/microcode/amd.c -index f76a563c8b..9fe6e29751 100644 ---- a/xen/arch/x86/cpu/microcode/amd.c -+++ b/xen/arch/x86/cpu/microcode/amd.c -@@ -336,8 +336,7 @@ static struct microcode_patch *cf_check cpu_request_microcode( - if ( size < sizeof(*et) || - (et = buf)->type != UCODE_EQUIV_CPU_TABLE_TYPE || - size - sizeof(*et) < et->len || -- et->len % sizeof(et->eq[0]) || -- et->eq[(et->len / sizeof(et->eq[0])) - 1].installed_cpu ) -+ et->len % sizeof(et->eq[0]) ) - { - printk(XENLOG_ERR "microcode: Bad equivalent cpu table\n"); - error = -EINVAL; --- -2.47.0 - diff --git 
a/0036-xen-console-Fix-truncation-of-panic-messages.patch b/0036-xen-console-Fix-truncation-of-panic-messages.patch new file mode 100644 index 0000000..b00b6b2 --- /dev/null +++ b/0036-xen-console-Fix-truncation-of-panic-messages.patch @@ -0,0 +1,103 @@ +From 330ecfc7c1bea7699d3783c8d733cab53e933d39 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Thu, 20 Mar 2025 13:13:44 +0100 +Subject: [PATCH 36/53] xen/console: Fix truncation of panic() messages + +The panic() function uses a static buffer to format its arguments into, simply +to emit the result via printk("%s", buf). This buffer is not large enough for +some existing users in Xen. e.g.: + + (XEN) **************************************** + (XEN) Panic on CPU 0: + (XEN) Invalid device tree blob at physical address 0x46a00000. + (XEN) The DTB must be 8-byte aligned and must not exceed 2 MB in size. + (XEN) + (XEN) Plea**************************************** + +The remainder of this particular message is 'e check your bootloader.', but +has been inherited by RISC-V from ARM. + +It is also pointless double buffering. Implement vprintk() beside printk(), +and use it directly rather than rendering into a local buffer, removing it as +one source of message limitation. + +This marginally simplifies panic(), and drops a global used-once buffer. 
+ +Signed-off-by: Andrew Cooper +Reviewed-by: Jan Beulich +master commit: 81f8b1dd9407e4a3d9dc058b7fbbc591168649ad +master date: 2025-02-18 14:15:58 +0000 +--- + xen/drivers/char/console.c | 21 +++++++++++++-------- + xen/include/xen/lib.h | 2 ++ + 2 files changed, 15 insertions(+), 8 deletions(-) + +diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c +index 3a3a97bcbe..a18cf7dfa0 100644 +--- a/xen/drivers/char/console.c ++++ b/xen/drivers/char/console.c +@@ -961,11 +961,17 @@ static void vprintk_common(const char *prefix, const char *fmt, va_list args) + local_irq_restore(flags); + } + ++void vprintk(const char *fmt, va_list args) ++{ ++ vprintk_common("(XEN) ", fmt, args); ++} ++ + void printk(const char *fmt, ...) + { + va_list args; ++ + va_start(args, fmt); +- vprintk_common("(XEN) ", fmt, args); ++ vprintk(fmt, args); + va_end(args); + } + +@@ -1267,23 +1273,22 @@ void panic(const char *fmt, ...) + va_list args; + unsigned long flags; + static DEFINE_SPINLOCK(lock); +- static char buf[128]; + + spin_debug_disable(); + spinlock_profile_printall('\0'); + debugtrace_dump(); + +- /* Protects buf[] and ensure multi-line message prints atomically. */ ++ /* Ensure multi-line message prints atomically. */ + spin_lock_irqsave(&lock, flags); + +- va_start(args, fmt); +- (void)vsnprintf(buf, sizeof(buf), fmt, args); +- va_end(args); +- + console_start_sync(); + printk("\n****************************************\n"); + printk("Panic on CPU %d:\n", smp_processor_id()); +- printk("%s", buf); ++ ++ va_start(args, fmt); ++ vprintk(fmt, args); ++ va_end(args); ++ + printk("****************************************\n\n"); + if ( opt_noreboot ) + printk("Manual reset required ('noreboot' specified)\n"); +diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h +index 394319c818..544d45cb9d 100644 +--- a/xen/include/xen/lib.h ++++ b/xen/include/xen/lib.h +@@ -61,6 +61,8 @@ debugtrace_printk(const char *fmt, ...) 
{} + #define _p(_x) ((void *)(unsigned long)(_x)) + extern void printk(const char *fmt, ...) + __attribute__ ((format (printf, 1, 2), cold)); ++void vprintk(const char *fmt, va_list args) ++ __attribute__ ((format (printf, 1, 0), cold)); + + #define printk_once(fmt, args...) \ + ({ \ +-- +2.48.1 + diff --git a/0036-xen-ucode-Make-Intel-s-microcode_sanity_check-strict.patch b/0036-xen-ucode-Make-Intel-s-microcode_sanity_check-strict.patch deleted file mode 100644 index 9885b7f..0000000 --- a/0036-xen-ucode-Make-Intel-s-microcode_sanity_check-strict.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 84d8fbd883882b6c3ca3e86261bcbd1d3bc2df70 Mon Sep 17 00:00:00 2001 -From: Demi Marie Obenour -Date: Tue, 29 Oct 2024 16:25:07 +0100 -Subject: [PATCH 36/83] xen/ucode: Make Intel's microcode_sanity_check() - stricter - -The SDM states that data size must be a multiple of 4, but Xen doesn't check -this propery. - -This is liable to cause a later failures, but should be checked explicitly. - -Signed-off-by: Demi Marie Obenour -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -master commit: 8752ad83e79754f8109457cff796e5f86f644348 -master date: 2024-09-24 18:57:38 +0100 ---- - xen/arch/x86/cpu/microcode/intel.c | 7 +++++-- - 1 file changed, 5 insertions(+), 2 deletions(-) - -diff --git a/xen/arch/x86/cpu/microcode/intel.c b/xen/arch/x86/cpu/microcode/intel.c -index f505aa1b78..fa3c2bab00 100644 ---- a/xen/arch/x86/cpu/microcode/intel.c -+++ b/xen/arch/x86/cpu/microcode/intel.c -@@ -155,10 +155,13 @@ static int microcode_sanity_check(const struct microcode_patch *patch) - uint32_t sum; - - /* -- * Total size must be a multiple of 1024 bytes. Data size and the header -- * must fit within it. -+ * The SDM states: -+ * - Data size must be a multiple of 4. -+ * - Total size must be a multiple of 1024 bytes. Data size and the -+ * header must fit within it. 
- */ - if ( (total_size & 1023) || -+ (data_size & 3) || - data_size > (total_size - MC_HEADER_SIZE) ) - { - printk(XENLOG_WARNING "microcode: Bad size\n"); --- -2.47.0 - diff --git a/0037-x86-PV-simplify-and-thus-correct-guest-accessor-func.patch b/0037-x86-PV-simplify-and-thus-correct-guest-accessor-func.patch deleted file mode 100644 index a02f586..0000000 --- a/0037-x86-PV-simplify-and-thus-correct-guest-accessor-func.patch +++ /dev/null @@ -1,201 +0,0 @@ -From 950e57e0ce74f8284b8aa3f34f15f38c70dbc9ae Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 29 Oct 2024 16:26:30 +0100 -Subject: [PATCH 37/83] x86/PV: simplify (and thus correct) guest accessor - functions - -Taking a fault on a non-byte-granular insn means that the "number of -bytes not handled" return value would need extra care in calculating, if -we want callers to be able to derive e.g. exception context (to be -injected to the guest) - CR2 for #PF in particular - from the value. To -simplify things rather than complicating them, reduce inline assembly to -just byte-granular string insns. On recent CPUs that's also supposed to -be more efficient anyway. - -For singular element accessors, however, alignment checks are added, -hence slightly complicating the code. Misaligned (user) buffer accesses -will now be forwarded to copy_{from,to}_guest_ll(). - -Naturally copy_{from,to}_unsafe_ll() accessors end up being adjusted the -same way, as they're produced by mere re-processing of the same code. -Otoh copy_{from,to}_unsafe() aren't similarly adjusted, but have their -comments made match reality; down the road we may want to change their -return types, e.g. to bool. 
- -Fixes: 76974398a63c ("Added user-memory accessing functionality for x86_64") -Fixes: 7b8c36701d26 ("Introduce clear_user and clear_guest") -Reported-by: Andrew Cooper -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -Tested-by: Andrew Cooper -master commit: 67a8e5721e1ea9c28526883036bf08fb2e8a8c9c -master date: 2024-10-01 09:44:55 +0200 ---- - xen/arch/x86/include/asm/uaccess.h | 12 +++--- - xen/arch/x86/usercopy.c | 66 ++++-------------------------- - 2 files changed, 14 insertions(+), 64 deletions(-) - -diff --git a/xen/arch/x86/include/asm/uaccess.h b/xen/arch/x86/include/asm/uaccess.h -index 48b684c19d..c44faf7e5b 100644 ---- a/xen/arch/x86/include/asm/uaccess.h -+++ b/xen/arch/x86/include/asm/uaccess.h -@@ -251,7 +251,8 @@ do { \ - static always_inline unsigned long - __copy_to_guest_pv(void __user *to, const void *from, unsigned long n) - { -- if (__builtin_constant_p(n)) { -+ if ( __builtin_constant_p(n) && !((unsigned long)to & (n - 1)) ) -+ { - unsigned long ret; - - switch (n) { -@@ -291,7 +292,8 @@ __copy_to_guest_pv(void __user *to, const void *from, unsigned long n) - static always_inline unsigned long - __copy_from_guest_pv(void *to, const void __user *from, unsigned long n) - { -- if (__builtin_constant_p(n)) { -+ if ( __builtin_constant_p(n) && !((unsigned long)from & (n - 1)) ) -+ { - unsigned long ret; - - switch (n) { -@@ -321,8 +323,7 @@ __copy_from_guest_pv(void *to, const void __user *from, unsigned long n) - * - * Copy data from hypervisor space to a potentially unmapped area. - * -- * Returns number of bytes that could not be copied. -- * On success, this will be zero. -+ * Returns zero on success and non-zero if some bytes could not be copied. - */ - static always_inline unsigned int - copy_to_unsafe(void __user *to, const void *from, unsigned int n) -@@ -358,8 +359,7 @@ copy_to_unsafe(void __user *to, const void *from, unsigned int n) - * - * Copy data from a potentially unmapped area space to hypervisor space. 
- * -- * Returns number of bytes that could not be copied. -- * On success, this will be zero. -+ * Returns zero on success and non-zero if some bytes could not be copied. - * - * If some data could not be copied, this function will pad the copied - * data to the requested size using zero bytes. -diff --git a/xen/arch/x86/usercopy.c b/xen/arch/x86/usercopy.c -index b8c2d1cc0b..7ab2009efe 100644 ---- a/xen/arch/x86/usercopy.c -+++ b/xen/arch/x86/usercopy.c -@@ -16,42 +16,19 @@ - - unsigned int copy_to_guest_ll(void __user *to, const void *from, unsigned int n) - { -- unsigned dummy; -+ GUARD(unsigned dummy); - - stac(); - asm volatile ( - GUARD( - " guest_access_mask_ptr %[to], %q[scratch1], %q[scratch2]\n" - ) -- " cmp $"STR(2*BYTES_PER_LONG-1)", %[cnt]\n" -- " jbe 1f\n" -- " mov %k[to], %[cnt]\n" -- " neg %[cnt]\n" -- " and $"STR(BYTES_PER_LONG-1)", %[cnt]\n" -- " sub %[cnt], %[aux]\n" -- "4: rep movsb\n" /* make 'to' address aligned */ -- " mov %[aux], %[cnt]\n" -- " shr $"STR(LONG_BYTEORDER)", %[cnt]\n" -- " and $"STR(BYTES_PER_LONG-1)", %[aux]\n" -- " .align 2,0x90\n" -- "0: rep movs"__OS"\n" /* as many words as possible... 
*/ -- " mov %[aux],%[cnt]\n" -- "1: rep movsb\n" /* ...remainder copied as bytes */ -+ "1: rep movsb\n" - "2:\n" -- ".section .fixup,\"ax\"\n" -- "5: add %[aux], %[cnt]\n" -- " jmp 2b\n" -- "3: lea (%q[aux], %q[cnt], "STR(BYTES_PER_LONG)"), %[cnt]\n" -- " jmp 2b\n" -- ".previous\n" -- _ASM_EXTABLE(4b, 5b) -- _ASM_EXTABLE(0b, 3b) - _ASM_EXTABLE(1b, 2b) -- : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from), -- [aux] "=&r" (dummy) -+ : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from) - GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy)) -- : "[aux]" (n) -- : "memory" ); -+ :: "memory" ); - clac(); - - return n; -@@ -66,25 +43,9 @@ unsigned int copy_from_guest_ll(void *to, const void __user *from, unsigned int - GUARD( - " guest_access_mask_ptr %[from], %q[scratch1], %q[scratch2]\n" - ) -- " cmp $"STR(2*BYTES_PER_LONG-1)", %[cnt]\n" -- " jbe 1f\n" -- " mov %k[to], %[cnt]\n" -- " neg %[cnt]\n" -- " and $"STR(BYTES_PER_LONG-1)", %[cnt]\n" -- " sub %[cnt], %[aux]\n" -- "4: rep movsb\n" /* make 'to' address aligned */ -- " mov %[aux],%[cnt]\n" -- " shr $"STR(LONG_BYTEORDER)", %[cnt]\n" -- " and $"STR(BYTES_PER_LONG-1)", %[aux]\n" -- " .align 2,0x90\n" -- "0: rep movs"__OS"\n" /* as many words as possible... 
*/ -- " mov %[aux], %[cnt]\n" -- "1: rep movsb\n" /* ...remainder copied as bytes */ -+ "1: rep movsb\n" - "2:\n" - ".section .fixup,\"ax\"\n" -- "5: add %[aux], %[cnt]\n" -- " jmp 6f\n" -- "3: lea (%q[aux], %q[cnt], "STR(BYTES_PER_LONG)"), %[cnt]\n" - "6: mov %[cnt], %k[from]\n" - " xchg %%eax, %[aux]\n" - " xor %%eax, %%eax\n" -@@ -93,14 +54,11 @@ unsigned int copy_from_guest_ll(void *to, const void __user *from, unsigned int - " mov %k[from], %[cnt]\n" - " jmp 2b\n" - ".previous\n" -- _ASM_EXTABLE(4b, 5b) -- _ASM_EXTABLE(0b, 3b) - _ASM_EXTABLE(1b, 6b) - : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from), - [aux] "=&r" (dummy) - GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy)) -- : "[aux]" (n) -- : "memory" ); -+ :: "memory" ); - clac(); - - return n; -@@ -145,20 +103,12 @@ unsigned int clear_guest_pv(void __user *to, unsigned int n) - stac(); - asm volatile ( - " guest_access_mask_ptr %[to], %[scratch1], %[scratch2]\n" -- "0: rep stos"__OS"\n" -- " mov %[bytes], %[cnt]\n" - "1: rep stosb\n" - "2:\n" -- ".section .fixup,\"ax\"\n" -- "3: lea (%q[bytes], %q[longs], "STR(BYTES_PER_LONG)"), %[cnt]\n" -- " jmp 2b\n" -- ".previous\n" -- _ASM_EXTABLE(0b,3b) - _ASM_EXTABLE(1b,2b) -- : [cnt] "=&c" (n), [to] "+D" (to), [scratch1] "=&r" (dummy), -+ : [cnt] "+c" (n), [to] "+D" (to), [scratch1] "=&r" (dummy), - [scratch2] "=&r" (dummy) -- : [bytes] "r" (n & (BYTES_PER_LONG - 1)), -- [longs] "0" (n / BYTES_PER_LONG), "a" (0) ); -+ : "a" (0) ); - clac(); - } - --- -2.47.0 - diff --git a/0037-xen-memory-Make-resource_max_frames-to-return-0-on-u.patch b/0037-xen-memory-Make-resource_max_frames-to-return-0-on-u.patch new file mode 100644 index 0000000..f7f0816 --- /dev/null +++ b/0037-xen-memory-Make-resource_max_frames-to-return-0-on-u.patch @@ -0,0 +1,76 @@ +From 93027a991a89795c312e20a1c404321ce15ce404 Mon Sep 17 00:00:00 2001 +From: Oleksandr Tyshchenko +Date: Thu, 20 Mar 2025 13:14:27 +0100 +Subject: [PATCH 37/53] xen/memory: Make resource_max_frames() to return 
0 on + unknown type + +This is actually what the caller acquire_resource() expects on any kind +of error (the comment on top of resource_max_frames() also suggests that). +Otherwise, the caller will treat -errno as a valid value and propagate incorrect +nr_frames to the VM. As a possible consequence, a VM trying to query a resource +size of an unknown type will get the success result from the hypercall and obtain +nr_frames 4294967201. + +Also, add an ASSERT_UNREACHABLE() in the default case of _acquire_resource(), +normally we won't get to this point, as an unknown type will always be rejected +earlier in resource_max_frames(). + +Also, update test-resource app to verify that Xen can deal with invalid +(unknown) resource type properly. + +Fixes: 9244528955de ("xen/memory: Fix acquire_resource size semantics") +Signed-off-by: Oleksandr Tyshchenko +Reviewed-by: Jan Beulich +Reviewed-by: Andrew Cooper +master commit: 9b8708290002f0a4d0b363e0c66ce945f6b520bd +master date: 2025-02-18 14:47:34 +0000 +--- + tools/tests/resource/test-resource.c | 10 ++++++++++ + xen/common/memory.c | 3 ++- + 2 files changed, 12 insertions(+), 1 deletion(-) + +diff --git a/tools/tests/resource/test-resource.c b/tools/tests/resource/test-resource.c +index 1b10be16a6..a7f2d04643 100644 +--- a/tools/tests/resource/test-resource.c ++++ b/tools/tests/resource/test-resource.c +@@ -123,6 +123,16 @@ static void test_gnttab(uint32_t domid, unsigned int nr_frames, + fail(" Fail: Managed to map gnttab v2 status frames in v1 mode\n"); + xenforeignmemory_unmap_resource(fh, res); + } ++ ++ /* ++ * If this check starts failing, you've found the right place to test your ++ * addition to the Acquire Resource infrastructure. ++ */ ++ rc = xenforeignmemory_resource_size(fh, domid, 3, 0, &size); ++ ++ /* Check that Xen rejected the resource type. 
*/ ++ if ( !rc ) ++ fail(" Fail: Expected error on an invalid resource type, got success\n"); + } + + static void test_domain_configurations(void) +diff --git a/xen/common/memory.c b/xen/common/memory.c +index de2cc7ad92..1f0a9d7e5f 100644 +--- a/xen/common/memory.c ++++ b/xen/common/memory.c +@@ -1156,7 +1156,7 @@ static unsigned int resource_max_frames(const struct domain *d, + return d->vmtrace_size >> PAGE_SHIFT; + + default: +- return -EOPNOTSUPP; ++ return 0; + } + } + +@@ -1239,6 +1239,7 @@ static int _acquire_resource( + return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list); + + default: ++ ASSERT_UNREACHABLE(); + return -EOPNOTSUPP; + } + } +-- +2.48.1 + diff --git a/0038-x86-svm-Separate-STI-and-VMRUN-instructions-in-svm_a.patch b/0038-x86-svm-Separate-STI-and-VMRUN-instructions-in-svm_a.patch new file mode 100644 index 0000000..e4e2f61 --- /dev/null +++ b/0038-x86-svm-Separate-STI-and-VMRUN-instructions-in-svm_a.patch @@ -0,0 +1,55 @@ +From b067751570d77e661e22c8b177d1fff4980c4b6a Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Thu, 20 Mar 2025 13:14:51 +0100 +Subject: [PATCH 38/53] x86/svm: Separate STI and VMRUN instructions in + svm_asm_do_resume() + +There is a corner case in the VMRUN instruction where its INTR_SHADOW state +leaks into guest state if a VMExit occurs before the VMRUN is complete. An +example of this could be taking #NPF due to event injection. + +Xen can safely execute STI anywhere between CLGI and VMRUN, as CLGI blocks +external interrupts too. However, an exception (while fatal) will appear to +be in an irqs-on region (as GIF isn't considered), so position the STI after +the speculation actions but prior to the GPR pops. 
+ +Link: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4uzJQ@mail.gmail.com/ +Fixes: 66b245d9eaeb ("SVM: limit GIF=0 region") +Signed-off-by: Andrew Cooper +Reviewed-by: Jan Beulich +master commit: c989ff614f6bad48b3bd4b32694f711b31c7b2d6 +master date: 2025-02-19 12:45:48 +0000 +--- + xen/arch/x86/hvm/svm/entry.S | 9 ++++++++- + 1 file changed, 8 insertions(+), 1 deletion(-) + +diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S +index 6fd9652c04..91edb33459 100644 +--- a/xen/arch/x86/hvm/svm/entry.S ++++ b/xen/arch/x86/hvm/svm/entry.S +@@ -74,6 +74,14 @@ __UNLIKELY_END(nsvm_hap) + ALTERNATIVE "", svm_vmentry_spec_ctrl, X86_FEATURE_SC_MSR_HVM + ALTERNATIVE "", DO_SPEC_CTRL_DIV, X86_FEATURE_SC_DIV + ++ /* ++ * Set EFLAGS.IF after CLGI covers us from real interrupts, but not ++ * immediately prior to VMRUN. The VMRUN instruction leaks it's ++ * INTR_SHADOW into guest state if a VMExit occurs before VMRUN ++ * completes (e.g. taking #NPF during event injecting.) ++ */ ++ sti ++ + pop %r15 + pop %r14 + pop %r13 +@@ -91,7 +99,6 @@ __UNLIKELY_END(nsvm_hap) + pop %rsi + pop %rdi + +- sti + vmrun + + SAVE_ALL +-- +2.48.1 + diff --git a/0038-x86-traps-Re-enable-interrupts-after-reading-cr2-in-.patch b/0038-x86-traps-Re-enable-interrupts-after-reading-cr2-in-.patch deleted file mode 100644 index b8c8a4b..0000000 --- a/0038-x86-traps-Re-enable-interrupts-after-reading-cr2-in-.patch +++ /dev/null @@ -1,104 +0,0 @@ -From 8f9dad658ad7b1a9c2a7f1a0a5e7e7cbe7f87bc3 Mon Sep 17 00:00:00 2001 -From: Alejandro Vallejo -Date: Tue, 29 Oct 2024 16:26:50 +0100 -Subject: [PATCH 38/83] x86/traps: Re-enable interrupts after reading cr2 in - the #PF handler -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Hitting a page fault clobbers %cr2, so if a page fault is handled while -handling a previous page fault then %cr2 will hold the address of the -latter fault rather than the former. 
In particular, if a debug key -handler happens to trigger during #PF and before %cr2 is read, and that -handler itself encounters a #PF, then %cr2 will be corrupt for the outer #PF -handler. - -This patch makes the page fault path delay re-enabling IRQs until %cr2 -has been read in order to ensure it stays consistent. - -A similar argument holds in additional cases, but they happen to be safe: - * %dr6 inside #DB: Safe because IST exceptions don't re-enable IRQs. - * MSR_XFD_ERR inside #NM: Safe because AMX isn't used in #NM handler. - -While in the area, remove redundant q suffix to a movq in entry.S and -the space after the comma. - -Fixes: a4cd20a19073 ("[XEN] 'd' key dumps both host and guest state.") -Signed-off-by: Alejandro Vallejo -Acked-by: Roger Pau Monné -master commit: b06e76db7c35974f1b127762683e7852ca0c8e76 -master date: 2024-10-01 09:45:49 +0200 ---- - xen/arch/x86/traps.c | 8 ++++++++ - xen/arch/x86/x86_64/entry.S | 20 ++++++++++++++++---- - 2 files changed, 24 insertions(+), 4 deletions(-) - -diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c -index 8e2df3e719..ccb5a37a72 100644 ---- a/xen/arch/x86/traps.c -+++ b/xen/arch/x86/traps.c -@@ -1603,6 +1603,14 @@ void asmlinkage do_page_fault(struct cpu_user_regs *regs) - - addr = read_cr2(); - -+ /* -+ * Don't re-enable interrupts if we were running an IRQ-off region when -+ * we hit the page fault, or we'll break that code. -+ */ -+ ASSERT(!local_irq_is_enabled()); -+ if ( regs->flags & X86_EFLAGS_IF ) -+ local_irq_enable(); -+ - /* fixup_page_fault() might change regs->error_code, so cache it here. 
*/ - error_code = regs->error_code; - -diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S -index b8482de8ee..9b0cdb7640 100644 ---- a/xen/arch/x86/x86_64/entry.S -+++ b/xen/arch/x86/x86_64/entry.S -@@ -844,9 +844,9 @@ handle_exception_saved: - #elif !defined(CONFIG_PV) - ASSERT_CONTEXT_IS_XEN - #endif /* CONFIG_PV */ -- sti --1: movq %rsp,%rdi -- movzbl UREGS_entry_vector(%rsp),%eax -+.Ldispatch_exceptions: -+ mov %rsp, %rdi -+ movzbl UREGS_entry_vector(%rsp), %eax - #ifdef CONFIG_PERF_COUNTERS - lea per_cpu__perfcounters(%rip), %rcx - add STACK_CPUINFO_FIELD(per_cpu_offset)(%r14), %rcx -@@ -866,7 +866,19 @@ handle_exception_saved: - jmp .L_exn_dispatch_done; \ - .L_ ## vec ## _done: - -+ /* -+ * IRQs kept off to derisk being hit by a nested interrupt before -+ * reading %cr2. Otherwise a page fault in the nested interrupt handler -+ * would corrupt %cr2. -+ */ - DISPATCH(X86_EXC_PF, do_page_fault) -+ -+ /* Only re-enable IRQs if they were active before taking the fault */ -+ testb $X86_EFLAGS_IF >> 8, UREGS_eflags + 1(%rsp) -+ jz 1f -+ sti -+1: -+ - DISPATCH(X86_EXC_GP, do_general_protection) - DISPATCH(X86_EXC_UD, do_invalid_op) - DISPATCH(X86_EXC_NM, do_device_not_available) -@@ -911,7 +923,7 @@ exception_with_ints_disabled: - movq %rsp,%rdi - call search_pre_exception_table - testq %rax,%rax # no fixup code for faulting EIP? 
-- jz 1b -+ jz .Ldispatch_exceptions - movq %rax,UREGS_rip(%rsp) # fixup regular stack - - #ifdef CONFIG_XEN_SHSTK --- -2.47.0 - diff --git a/0039-x86-emul-dump-unhandled-memory-accesses-for-PVH-dom0.patch b/0039-x86-emul-dump-unhandled-memory-accesses-for-PVH-dom0.patch new file mode 100644 index 0000000..b5a6add --- /dev/null +++ b/0039-x86-emul-dump-unhandled-memory-accesses-for-PVH-dom0.patch @@ -0,0 +1,44 @@ +From c6366b64dd498727bfb0e3aacc37c6db6d7242f1 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:15:48 +0100 +Subject: [PATCH 39/53] x86/emul: dump unhandled memory accesses for PVH dom0 +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +A PV dom0 can map any host memory as long as it's allowed by the IO +capability range in d->iomem_caps. On the other hand, a PVH dom0 has no +way to populate MMIO region onto it's p2m, so it's limited to what Xen +initially populates on the p2m based on the host memory map and the enabled +device BARs. + +Introduce a new debug build only printk that reports attempts by dom0 to +access addresses not populated on the p2m, and not handled by any emulator. +This is for information purposes only, but might allow getting an idea of +what MMIO ranges might be missing on the p2m. + +Signed-off-by: Roger Pau Monné +Acked-by: Jan Beulich +master commit: 43d8a80a0cccfe3715bb3178b5c15fb983979651 +master date: 2025-03-05 10:26:46 +0100 +--- + xen/arch/x86/hvm/emulate.c | 3 +++ + 1 file changed, 3 insertions(+) + +diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c +index fb4de6ee0a..2664fe0c71 100644 +--- a/xen/arch/x86/hvm/emulate.c ++++ b/xen/arch/x86/hvm/emulate.c +@@ -337,6 +337,9 @@ static int hvmemul_do_io( + /* If there is no suitable backing DM, just ignore accesses */ + if ( !s ) + { ++ if ( is_mmio && is_hardware_domain(currd) ) ++ gdprintk(XENLOG_DEBUG, "unhandled memory %s %#lx size %u\n", ++ dir ? 
"read from" : "write to", addr, size); + rc = hvm_process_io_intercept(&null_handler, &p); + vio->req.state = STATE_IOREQ_NONE; + } +-- +2.48.1 + diff --git a/0039-x86-pv-Rework-guest_io_okay-to-return-X86EMUL_.patch b/0039-x86-pv-Rework-guest_io_okay-to-return-X86EMUL_.patch deleted file mode 100644 index 72975f1..0000000 --- a/0039-x86-pv-Rework-guest_io_okay-to-return-X86EMUL_.patch +++ /dev/null @@ -1,127 +0,0 @@ -From f879df5eb40fb32057e09a78cfa52f9ff08f8030 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 29 Oct 2024 16:27:29 +0100 -Subject: [PATCH 39/83] x86/pv: Rework guest_io_okay() to return X86EMUL_* - -In order to fix a bug with guest_io_okay() (subsequent patch), rework -guest_io_okay() to take in an emulation context, and return X86EMUL_* rather -than a boolean. - -For the failing case, take the opportunity to inject #GP explicitly, rather -than returning X86EMUL_UNHANDLEABLE. There is a logical difference between -"we know what this is, and it's #GP", vs "we don't know what this is". - -There is no change in practice as emulation is the final step on general #GP -resolution, but returning X86EMUL_UNHANDLEABLE would be a latent bug if a -subsequent action were to appear. - -No practical change. - -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -master commit: 7429e1cc071b0e20ea9581da4893fb9b2f6d21d4 -master date: 2024-10-01 14:58:18 +0100 ---- - xen/arch/x86/pv/emul-priv-op.c | 36 ++++++++++++++++++++++------------ - 1 file changed, 23 insertions(+), 13 deletions(-) - -diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c -index b90f745c75..cc66ffbf8e 100644 ---- a/xen/arch/x86/pv/emul-priv-op.c -+++ b/xen/arch/x86/pv/emul-priv-op.c -@@ -156,14 +156,16 @@ static bool iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs) - } - - /* Has the guest requested sufficient permission for this I/O access? 
*/ --static bool guest_io_okay(unsigned int port, unsigned int bytes, -- struct vcpu *v, struct cpu_user_regs *regs) -+static int guest_io_okay(unsigned int port, unsigned int bytes, -+ struct x86_emulate_ctxt *ctxt) - { -+ const struct cpu_user_regs *regs = ctxt->regs; -+ struct vcpu *v = current; - /* If in user mode, switch to kernel mode just to read I/O bitmap. */ - const bool user_mode = !(v->arch.flags & TF_kernel_mode); - - if ( iopl_ok(v, regs) ) -- return true; -+ return X86EMUL_OKAY; - - if ( (port + bytes) <= v->arch.pv.iobmp_limit ) - { -@@ -190,10 +192,12 @@ static bool guest_io_okay(unsigned int port, unsigned int bytes, - toggle_guest_pt(v); - - if ( (x.mask & (((1 << bytes) - 1) << (port & 7))) == 0 ) -- return true; -+ return X86EMUL_OKAY; - } - -- return false; -+ x86_emul_hw_exception(X86_EXC_GP, 0, ctxt); -+ -+ return X86EMUL_EXCEPTION; - } - - /* Has the administrator granted sufficient permission for this I/O access? */ -@@ -353,12 +357,14 @@ static int cf_check read_io( - struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt); - struct vcpu *curr = current; - struct domain *currd = current->domain; -+ int rc; - - /* INS must not come here. */ - ASSERT((ctxt->opcode & ~9) == 0xe4); - -- if ( !guest_io_okay(port, bytes, curr, ctxt->regs) ) -- return X86EMUL_UNHANDLEABLE; -+ rc = guest_io_okay(port, bytes, ctxt); -+ if ( rc != X86EMUL_OKAY ) -+ return rc; - - poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes); - -@@ -458,12 +464,14 @@ static int cf_check write_io( - struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt); - struct vcpu *curr = current; - struct domain *currd = current->domain; -+ int rc; - - /* OUTS must not come here. 
*/ - ASSERT((ctxt->opcode & ~9) == 0xe6); - -- if ( !guest_io_okay(port, bytes, curr, ctxt->regs) ) -- return X86EMUL_UNHANDLEABLE; -+ rc = guest_io_okay(port, bytes, ctxt); -+ if ( rc != X86EMUL_OKAY ) -+ return rc; - - poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes); - -@@ -612,8 +620,9 @@ static int cf_check rep_ins( - - *reps = 0; - -- if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) ) -- return X86EMUL_UNHANDLEABLE; -+ rc = guest_io_okay(port, bytes_per_rep, ctxt); -+ if ( rc != X86EMUL_OKAY ) -+ return rc; - - rc = read_segment(x86_seg_es, &sreg, ctxt); - if ( rc != X86EMUL_OKAY ) -@@ -678,8 +687,9 @@ static int cf_check rep_outs( - - *reps = 0; - -- if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) ) -- return X86EMUL_UNHANDLEABLE; -+ rc = guest_io_okay(port, bytes_per_rep, ctxt); -+ if ( rc != X86EMUL_OKAY ) -+ return rc; - - rc = read_segment(seg, &sreg, ctxt); - if ( rc != X86EMUL_OKAY ) --- -2.47.0 - diff --git a/0040-x86-dom0-attempt-to-fixup-p2m-page-faults-for-PVH-do.patch b/0040-x86-dom0-attempt-to-fixup-p2m-page-faults-for-PVH-do.patch new file mode 100644 index 0000000..584bb6f --- /dev/null +++ b/0040-x86-dom0-attempt-to-fixup-p2m-page-faults-for-PVH-do.patch @@ -0,0 +1,242 @@ +From 2f47f9df89c8655d16bef125fb48c63170977582 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:16:14 +0100 +Subject: [PATCH 40/53] x86/dom0: attempt to fixup p2m page-faults for PVH dom0 +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +When building a PVH dom0 Xen attempts to map all (relevant) MMIO regions +into the p2m for dom0 access. However the information Xen has about the +host memory map is limited. Xen doesn't have access to any resources +described in ACPI dynamic tables, and hence the p2m mappings provided might +not be complete. 
+ +PV doesn't suffer from this issue because a PV dom0 is capable of mapping +into it's page-tables any address not explicitly banned in d->iomem_caps. + +Introduce a new command line options that allows Xen to attempt to fixup +the p2m page-faults, by creating p2m identity maps in response to p2m +page-faults. + +This is aimed as a workaround to small ACPI regions Xen doesn't know about. +Note that missing large MMIO regions mapped in this way will lead to +slowness due to the VM exit processing, plus the mappings will always use +small pages. + +The ultimate aim is to attempt to bring better parity with a classic PV +dom0. + +Note such fixup rely on the CPU doing the access to the unpopulated +address. If the access is attempted from a device instead there's no +possible way to fixup, as IOMMU page-fault are asynchronous. + +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +Acked-by: Oleksii Kurochko +master commit: 104591f5dd675d7bfb04885dace0e4e5a097fc1e +master date: 2025-03-05 10:26:46 +0100 +--- + CHANGELOG.md | 6 +++ + docs/misc/xen-command-line.pandoc | 16 +++++- + xen/arch/x86/dom0_build.c | 5 ++ + xen/arch/x86/hvm/emulate.c | 74 +++++++++++++++++++++++++- + xen/arch/x86/include/asm/hvm/emulate.h | 3 ++ + 5 files changed, 101 insertions(+), 3 deletions(-) + +diff --git a/CHANGELOG.md b/CHANGELOG.md +index 1639886aa7..216e66b576 100644 +--- a/CHANGELOG.md ++++ b/CHANGELOG.md +@@ -4,6 +4,12 @@ Notable changes to Xen will be documented in this file. + + The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) + ++## [4.19.2](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.19.2) ++ ++### Added ++ - On x86: ++ - Option to attempt to fixup p2m page-faults on PVH dom0. 
++ + ## [4.19.1](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.19.1) + + ### Changed +diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc +index a398891bc0..ff10ff9f65 100644 +--- a/docs/misc/xen-command-line.pandoc ++++ b/docs/misc/xen-command-line.pandoc +@@ -806,7 +806,8 @@ Specify the bit width of the DMA heap. + + ### dom0 + = List of [ pv | pvh, shadow=, verbose=, +- cpuid-faulting=, msr-relaxed= ] (x86) ++ cpuid-faulting=, msr-relaxed=, ++ pf-fixup= ] (x86) + + = List of [ sve= ] (Arm64) + +@@ -867,6 +868,19 @@ Controls for how dom0 is constructed on x86 systems. + + If using this option is necessary to fix an issue, please report a bug. + ++* The `pf-fixup` boolean is only applicable when using a PVH dom0 and ++ defaults to false. ++ ++ When running dom0 in PVH mode the dom0 kernel has no way to map MMIO ++ regions into its physical memory map, such mode relies on Xen dom0 builder ++ populating the physical memory map with all MMIO regions that dom0 should ++ access. However Xen doesn't have a complete picture of the host memory ++ map, due to not being able to process ACPI dynamic tables. ++ ++ The `pf-fixup` option allows Xen to attempt to add missing MMIO regions ++ to the dom0 physical memory map in response to page-faults generated by ++ dom0 trying to access unpopulated entries in the memory map. ++ + Enables features on dom0 on Arm systems. 
+ + * The `sve` integer parameter enables Arm SVE usage for Dom0 and sets the +diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c +index 8d56705a08..6b22f59ab2 100644 +--- a/xen/arch/x86/dom0_build.c ++++ b/xen/arch/x86/dom0_build.c +@@ -16,6 +16,7 @@ + #include + #include + #include ++#include + #include + #include + #include +@@ -286,6 +287,10 @@ int __init parse_arch_dom0_param(const char *s, const char *e) + opt_dom0_cpuid_faulting = val; + else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 ) + opt_dom0_msr_relaxed = val; ++#ifdef CONFIG_HVM ++ else if ( (val = parse_boolean("pf-fixup", s, e)) >= 0 ) ++ opt_dom0_pf_fixup = val; ++#endif + else + return -EINVAL; + +diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c +index 2664fe0c71..abc4f5f261 100644 +--- a/xen/arch/x86/hvm/emulate.c ++++ b/xen/arch/x86/hvm/emulate.c +@@ -10,12 +10,15 @@ + */ + + #include ++#include + #include + #include + #include + #include + #include + #include ++ ++#include + #include + #include + #include +@@ -161,6 +164,36 @@ void hvmemul_cancel(struct vcpu *v) + hvmemul_cache_disable(v); + } + ++bool __ro_after_init opt_dom0_pf_fixup; ++static int hwdom_fixup_p2m(paddr_t addr) ++{ ++ unsigned long gfn = paddr_to_pfn(addr); ++ struct domain *currd = current->domain; ++ p2m_type_t type; ++ mfn_t mfn; ++ int rc; ++ ++ ASSERT(is_hardware_domain(currd)); ++ ASSERT(!altp2m_active(currd)); ++ ++ /* ++ * Fixups are only applied for MMIO holes, and rely on the hardware domain ++ * having identity mappings for non RAM regions (gfn == mfn). ++ */ ++ if ( !iomem_access_permitted(currd, gfn, gfn) || ++ !is_memory_hole(_mfn(gfn), _mfn(gfn)) ) ++ return -EPERM; ++ ++ mfn = get_gfn(currd, gfn, &type); ++ if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) ) ++ rc = mfn_eq(mfn, _mfn(gfn)) ? 
-EEXIST : -ENOTEMPTY; ++ else ++ rc = set_mmio_p2m_entry(currd, _gfn(gfn), _mfn(gfn), 0); ++ put_gfn(currd, gfn); ++ ++ return rc; ++} ++ + static int hvmemul_do_io( + bool is_mmio, paddr_t addr, unsigned long *reps, unsigned int size, + uint8_t dir, bool df, bool data_is_addr, uintptr_t data) +@@ -338,8 +371,45 @@ static int hvmemul_do_io( + if ( !s ) + { + if ( is_mmio && is_hardware_domain(currd) ) +- gdprintk(XENLOG_DEBUG, "unhandled memory %s %#lx size %u\n", +- dir ? "read from" : "write to", addr, size); ++ { ++ /* ++ * PVH dom0 is likely missing MMIO mappings on the p2m, due to ++ * the incomplete information Xen has about the memory layout. ++ * ++ * Either print a message to note dom0 attempted to access an ++ * unpopulated GPA, or try to fixup the p2m by creating an ++ * identity mapping for the faulting GPA. ++ */ ++ if ( opt_dom0_pf_fixup ) ++ { ++ int inner_rc = hwdom_fixup_p2m(addr); ++ ++ if ( !inner_rc || inner_rc == -EEXIST ) ++ { ++ if ( !inner_rc ) ++ gdprintk(XENLOG_DEBUG, ++ "fixup p2m mapping for page %lx added\n", ++ paddr_to_pfn(addr)); ++ else ++ gprintk(XENLOG_INFO, ++ "fixup p2m mapping for page %lx already present\n", ++ paddr_to_pfn(addr)); ++ ++ rc = X86EMUL_RETRY; ++ vio->req.state = STATE_IOREQ_NONE; ++ break; ++ } ++ ++ gprintk(XENLOG_WARNING, ++ "unable to fixup memory %s %#lx size %u: %d\n", ++ dir ? "read from" : "write to", addr, size, ++ inner_rc); ++ } ++ else ++ gdprintk(XENLOG_DEBUG, ++ "unhandled memory %s %#lx size %u\n", ++ dir ? 
"read from" : "write to", addr, size); ++ } + rc = hvm_process_io_intercept(&null_handler, &p); + vio->req.state = STATE_IOREQ_NONE; + } +diff --git a/xen/arch/x86/include/asm/hvm/emulate.h b/xen/arch/x86/include/asm/hvm/emulate.h +index 2e1eedefa7..108472f6f7 100644 +--- a/xen/arch/x86/include/asm/hvm/emulate.h ++++ b/xen/arch/x86/include/asm/hvm/emulate.h +@@ -147,6 +147,9 @@ static inline void hvmemul_write_cache(const struct vcpu *v, paddr_t gpa, + void hvm_dump_emulation_state(const char *loglvl, const char *prefix, + struct hvm_emulate_ctxt *hvmemul_ctxt, int rc); + ++/* For PVH dom0: signal whether to attempt fixup of p2m page-faults. */ ++extern bool opt_dom0_pf_fixup; ++ + #endif /* __ASM_X86_HVM_EMULATE_H__ */ + + /* +-- +2.48.1 + diff --git a/0040-x86-pv-Handle-PF-correctly-when-reading-the-IO-permi.patch b/0040-x86-pv-Handle-PF-correctly-when-reading-the-IO-permi.patch deleted file mode 100644 index 92ed980..0000000 --- a/0040-x86-pv-Handle-PF-correctly-when-reading-the-IO-permi.patch +++ /dev/null @@ -1,82 +0,0 @@ -From 0cfbae3f860db5f1ec842e12b68f942583e9fb2f Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 29 Oct 2024 16:27:41 +0100 -Subject: [PATCH 40/83] x86/pv: Handle #PF correctly when reading the IO - permission bitmap - -The switch statement in guest_io_okay() is a very expensive way of -pre-initialising x with ~0, and performing a partial read into it. - -However, the logic isn't correct either. - -In a real TSS, the CPU always reads two bytes (like here), and any TSS limit -violation turns silently into no-access. But, in-limit accesses trigger #PF -as usual. AMD document this property explicitly, and while Intel don't (so -far as I can tell), they do behave consistently with AMD. - -Switch from __copy_from_guest_offset() to __copy_from_guest_pv(), like -everything else in this file. This removes code generation setting up -copy_from_user_hvm() (in the likely path even), and safety LFENCEs from -evaluate_nospec(). 
- -Change the logic to raise #PF if __copy_from_guest_pv() fails, rather than -disallowing the IO port access. This brings the behaviour better in line with -normal x86. - -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -master commit: 8a6c495d725408d333c1b47bb8af44615a5bfb18 -master date: 2024-10-01 14:58:18 +0100 ---- - xen/arch/x86/pv/emul-priv-op.c | 27 ++++++++++++--------------- - 1 file changed, 12 insertions(+), 15 deletions(-) - -diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c -index cc66ffbf8e..e35285d4ab 100644 ---- a/xen/arch/x86/pv/emul-priv-op.c -+++ b/xen/arch/x86/pv/emul-priv-op.c -@@ -169,29 +169,26 @@ static int guest_io_okay(unsigned int port, unsigned int bytes, - - if ( (port + bytes) <= v->arch.pv.iobmp_limit ) - { -- union { uint8_t bytes[2]; uint16_t mask; } x; -+ const void *__user addr = v->arch.pv.iobmp.p + (port >> 3); -+ uint16_t mask; -+ int rc; - -- /* -- * Grab permission bytes from guest space. Inaccessible bytes are -- * read as 0xff (no access allowed). -- */ -+ /* Grab permission bytes from guest space. 
*/ - if ( user_mode ) - toggle_guest_pt(v); - -- switch ( __copy_from_guest_offset(x.bytes, v->arch.pv.iobmp, -- port>>3, 2) ) -- { -- default: x.bytes[0] = ~0; -- /* fallthrough */ -- case 1: x.bytes[1] = ~0; -- /* fallthrough */ -- case 0: break; -- } -+ rc = __copy_from_guest_pv(&mask, addr, 2); - - if ( user_mode ) - toggle_guest_pt(v); - -- if ( (x.mask & (((1 << bytes) - 1) << (port & 7))) == 0 ) -+ if ( rc ) -+ { -+ x86_emul_pagefault(0, (unsigned long)addr + bytes - rc, ctxt); -+ return X86EMUL_EXCEPTION; -+ } -+ -+ if ( (mask & (((1 << bytes) - 1) << (port & 7))) == 0 ) - return X86EMUL_OKAY; - } - --- -2.47.0 - diff --git a/0041-x86-dom0-correctly-set-the-maximum-iomem_caps-bound-.patch b/0041-x86-dom0-correctly-set-the-maximum-iomem_caps-bound-.patch new file mode 100644 index 0000000..bfb6be7 --- /dev/null +++ b/0041-x86-dom0-correctly-set-the-maximum-iomem_caps-bound-.patch @@ -0,0 +1,42 @@ +From eaa79d83b640f36b4973b15193d7405969030640 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:16:37 +0100 +Subject: [PATCH 41/53] x86/dom0: correctly set the maximum ->iomem_caps bound + for PVH +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The logic in dom0_setup_permissions() sets the maximum bound in +->iomem_caps unconditionally using paddr_bits, which is not correct for HVM +based domains. Instead use domain_max_paddr_bits() to get the correct +maximum paddr bits for each possible domain type. + +Switch to using PFN_DOWN() instead of PAGE_SHIFT, as that's shorter. 
+ +Fixes: 53de839fb409 ('x86: constrain MFN range Dom0 may access') +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: a00e08799cc7657d2a1aca158f4ad43d4c9103e7 +master date: 2025-03-05 10:26:46 +0100 +--- + xen/arch/x86/dom0_build.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c +index 6b22f59ab2..330af7cf76 100644 +--- a/xen/arch/x86/dom0_build.c ++++ b/xen/arch/x86/dom0_build.c +@@ -481,7 +481,8 @@ int __init dom0_setup_permissions(struct domain *d) + + /* The hardware domain is initially permitted full I/O capabilities. */ + rc = ioports_permit_access(d, 0, 0xFFFF); +- rc |= iomem_permit_access(d, 0UL, (1UL << (paddr_bits - PAGE_SHIFT)) - 1); ++ rc |= iomem_permit_access(d, 0UL, ++ PFN_DOWN(1UL << domain_max_paddr_bits(d)) - 1); + rc |= irqs_permit_access(d, 1, nr_irqs_gsi - 1); + + /* Modify I/O port access permissions. */ +-- +2.48.1 + diff --git a/0041-x86-pv-Rename-pv.iobmp_limit-to-iobmp_nr-and-clarify.patch b/0041-x86-pv-Rename-pv.iobmp_limit-to-iobmp_nr-and-clarify.patch deleted file mode 100644 index 8ed1f09..0000000 --- a/0041-x86-pv-Rename-pv.iobmp_limit-to-iobmp_nr-and-clarify.patch +++ /dev/null @@ -1,87 +0,0 @@ -From 8321aa3db828c50c1d514938fb86edd161bb5adc Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 29 Oct 2024 16:27:54 +0100 -Subject: [PATCH 41/83] x86/pv: Rename pv.iobmp_limit to iobmp_nr and clarify - behaviour - -Ever since it's introduction in commit 013351bd7ab3 ("Define new event-channel -and physdev hypercalls") in 2006, the public interface was named nr_ports -while the internal field was called iobmp_limit. - -Rename the internal field to iobmp_nr to match the public interface, and -clarify that, when nonzero, Xen will read 2 bytes. - -There isn't a perfect parallel with a real TSS, but iobmp_nr being 0 is the -paravirt "no IOPB" case, and it is important that no read occurs in this case. 
- -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -master commit: 633ee8b2df963f7e5cb8de1219c1a48bfb4447f6 -master date: 2024-10-01 14:58:18 +0100 ---- - xen/arch/x86/include/asm/domain.h | 2 +- - xen/arch/x86/physdev.c | 2 +- - xen/arch/x86/pv/emul-priv-op.c | 6 +++++- - xen/include/public/physdev.h | 3 +++ - 4 files changed, 10 insertions(+), 3 deletions(-) - -diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h -index 5d92891e6f..21e6ca90d5 100644 ---- a/xen/arch/x86/include/asm/domain.h -+++ b/xen/arch/x86/include/asm/domain.h -@@ -573,7 +573,7 @@ struct pv_vcpu - - /* I/O-port access bitmap. */ - XEN_GUEST_HANDLE(uint8) iobmp; /* Guest kernel vaddr of the bitmap. */ -- unsigned int iobmp_limit; /* Number of ports represented in the bitmap. */ -+ unsigned int iobmp_nr; /* Number of ports represented in the bitmap. */ - #define IOPL(val) MASK_INSR(val, X86_EFLAGS_IOPL) - unsigned int iopl; /* Current IOPL for this VCPU, shifted left by - * 12 to match the eflags register. 
*/ -diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c -index d6dd622952..69fd42667c 100644 ---- a/xen/arch/x86/physdev.c -+++ b/xen/arch/x86/physdev.c -@@ -436,7 +436,7 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) - #else - guest_from_compat_handle(curr->arch.pv.iobmp, set_iobitmap.bitmap); - #endif -- curr->arch.pv.iobmp_limit = set_iobitmap.nr_ports; -+ curr->arch.pv.iobmp_nr = set_iobitmap.nr_ports; - break; - } - -diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c -index e35285d4ab..70150c2722 100644 ---- a/xen/arch/x86/pv/emul-priv-op.c -+++ b/xen/arch/x86/pv/emul-priv-op.c -@@ -167,7 +167,11 @@ static int guest_io_okay(unsigned int port, unsigned int bytes, - if ( iopl_ok(v, regs) ) - return X86EMUL_OKAY; - -- if ( (port + bytes) <= v->arch.pv.iobmp_limit ) -+ /* -+ * When @iobmp_nr is non-zero, Xen, like real CPUs and the TSS IOPB, -+ * always reads 2 bytes from @iobmp, which might be one byte @iobmp_nr. -+ */ -+ if ( (port + bytes) <= v->arch.pv.iobmp_nr ) - { - const void *__user addr = v->arch.pv.iobmp.p + (port >> 3); - uint16_t mask; -diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h -index f0c0d4727c..d694104cd8 100644 ---- a/xen/include/public/physdev.h -+++ b/xen/include/public/physdev.h -@@ -87,6 +87,9 @@ DEFINE_XEN_GUEST_HANDLE(physdev_set_iopl_t); - /* - * Set the current VCPU's I/O-port permissions bitmap. - * @arg == pointer to physdev_set_iobitmap structure. -+ * -+ * When @nr_ports is non-zero, Xen, like real CPUs and the TSS IOPB, always -+ * reads 2 bytes from @bitmap, which might be one byte beyond @nr_ports. 
- */ - #define PHYSDEVOP_set_iobitmap 7 - struct physdev_set_iobitmap { --- -2.47.0 - diff --git a/0042-stubdom-Fix-newlib-build-with-GCC-14.patch b/0042-stubdom-Fix-newlib-build-with-GCC-14.patch deleted file mode 100644 index 551c051..0000000 --- a/0042-stubdom-Fix-newlib-build-with-GCC-14.patch +++ /dev/null @@ -1,58 +0,0 @@ -From 8eb2fdbc5bff333c2cdbe31c65d06ac846893185 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 29 Oct 2024 16:28:48 +0100 -Subject: [PATCH 42/83] stubdom: Fix newlib build with GCC-14 - -Based on a fix from OpenSUSE, but adjusted to be Clang-compatible too. Pass --Wno-implicit-function-declaration library-wide rather than using local GCC -pragmas. - -Fix of copy_past_newline() to avoid triggering -Wstrict-prototypes. - -Link: https://build.opensuse.org/request/show/1178775 -Signed-off-by: Andrew Cooper -Reviewed-by: Anthony PERARD -master commit: 444cb9350f2c1cc202b6b86176ddd8e57525e2d9 -master date: 2024-10-03 10:07:25 +0100 ---- - stubdom/Makefile | 2 ++ - stubdom/newlib-fix-copy_past_newline.patch | 10 ++++++++++ - 2 files changed, 12 insertions(+) - create mode 100644 stubdom/newlib-fix-copy_past_newline.patch - -diff --git a/stubdom/Makefile b/stubdom/Makefile -index 8c503c2bf8..f8c31fd35d 100644 ---- a/stubdom/Makefile -+++ b/stubdom/Makefile -@@ -97,10 +97,12 @@ newlib-$(NEWLIB_VERSION): newlib-$(NEWLIB_VERSION).tar.gz - patch -d $@ -p1 < newlib-disable-texinfo.patch - patch -d $@ -p1 < newlib-cygmon-gmon.patch - patch -d $@ -p1 < newlib-makedoc.patch -+ patch -d $@ -p1 < newlib-fix-copy_past_newline.patch - find $@ -type f | xargs perl -i.bak \ - -pe 's/\b_(tzname|daylight|timezone)\b/$$1/g' - touch $@ - -+NEWLIB_CFLAGS += -Wno-implicit-function-declaration - NEWLIB_STAMPFILE=$(CROSS_ROOT)/$(GNU_TARGET_ARCH)-xen-elf/lib/libc.a - .PHONY: cross-newlib - cross-newlib: $(NEWLIB_STAMPFILE) -diff --git a/stubdom/newlib-fix-copy_past_newline.patch b/stubdom/newlib-fix-copy_past_newline.patch -new file mode 100644 -index 
0000000000..f8452480bc ---- /dev/null -+++ b/stubdom/newlib-fix-copy_past_newline.patch -@@ -0,0 +1,10 @@ -+--- newlib-1.16.0/newlib/doc/makedoc.c.orig -++++ newlib-1.16.0/newlib/doc/makedoc.c -+@@ -798,6 +798,7 @@ DEFUN( iscommand,(ptr, idx), -+ } -+ -+ -++static unsigned int -+ DEFUN(copy_past_newline,(ptr, idx, dst), -+ string_type *ptr AND -+ unsigned int idx AND --- -2.47.0 - diff --git a/0042-x86-iommu-account-for-IOMEM-caps-when-populating-dom.patch b/0042-x86-iommu-account-for-IOMEM-caps-when-populating-dom.patch new file mode 100644 index 0000000..76f6945 --- /dev/null +++ b/0042-x86-iommu-account-for-IOMEM-caps-when-populating-dom.patch @@ -0,0 +1,255 @@ +From 74e860e45ec839393e4a3036f51b5af2c4aa7efe Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:16:56 +0100 +Subject: [PATCH 42/53] x86/iommu: account for IOMEM caps when populating dom0 + IOMMU page-tables +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The current code in arch_iommu_hwdom_init() kind of open-codes the same +MMIO permission ranges that are added to the hardware domain ->iomem_caps. +Avoid this duplication and use ->iomem_caps in arch_iommu_hwdom_init() to +filter which memory regions should be added to the dom0 IOMMU page-tables. + +Note the IO-APIC and MCFG page(s) must be set as not accessible for a PVH +dom0, otherwise the internal Xen emulation for those ranges won't work. +This requires adjustments in dom0_setup_permissions(). + +The call to pvh_setup_mmcfg() in dom0_construct_pvh() must now strictly be +done ahead of setting up dom0 permissions, so take the opportunity to also +put it inside the existing is_hardware_domain() region. 
+ +Also the special casing of E820_UNUSABLE regions no longer needs to be done +in arch_iommu_hwdom_init(), as those regions are already blocked in +->iomem_caps and thus would be removed from the rangeset as part of +->iomem_caps processing in arch_iommu_hwdom_init(). The E820_UNUSABLE +regions below 1Mb are not removed from ->iomem_caps, that's a slight +difference for the IOMMU created page-tables, but the aim is to allow +access to the same memory either from the CPU or the IOMMU page-tables. + +Since ->iomem_caps already takes into account the domain max paddr, there's +no need to remove any regions past the last address addressable by the +domain, as applying ->iomem_caps would have already taken care of that. + +Suggested-by: Jan Beulich +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: 62f3fc5296c452285e81adb50976bde2d68d3181 +master date: 2025-03-05 10:26:46 +0100 +--- + xen/arch/x86/dom0_build.c | 11 ++++- + xen/arch/x86/hvm/dom0_build.c | 14 +++--- + xen/arch/x86/hvm/io.c | 6 +-- + xen/arch/x86/include/asm/hvm/io.h | 4 +- + xen/drivers/passthrough/x86/iommu.c | 67 ++++++++++++----------------- + 5 files changed, 49 insertions(+), 53 deletions(-) + +diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c +index 330af7cf76..c0b2ad34b5 100644 +--- a/xen/arch/x86/dom0_build.c ++++ b/xen/arch/x86/dom0_build.c +@@ -558,7 +558,9 @@ int __init dom0_setup_permissions(struct domain *d) + for ( i = 0; i < nr_ioapics; i++ ) + { + mfn = paddr_to_pfn(mp_ioapics[i].mpc_apicaddr); +- if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) ) ++ /* If emulating IO-APIC(s) make sure the base address is unmapped. */ ++ if ( has_vioapic(d) || ++ !rangeset_contains_singleton(mmio_ro_ranges, mfn) ) + rc |= iomem_deny_access(d, mfn, mfn); + } + /* MSI range. 
*/ +@@ -599,6 +601,13 @@ int __init dom0_setup_permissions(struct domain *d) + rc |= rangeset_add_singleton(mmio_ro_ranges, mfn); + } + ++ if ( has_vpci(d) ) ++ /* ++ * TODO: runtime added MMCFG regions are not checked to make sure they ++ * don't overlap with already mapped regions, thus preventing trapping. ++ */ ++ rc |= vpci_mmcfg_deny_access(d); ++ + return rc; + } + +diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c +index 3dd913bdb0..81445d5b37 100644 +--- a/xen/arch/x86/hvm/dom0_build.c ++++ b/xen/arch/x86/hvm/dom0_build.c +@@ -1312,6 +1312,13 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image, + + if ( is_hardware_domain(d) ) + { ++ /* ++ * MMCFG initialization must be performed before setting domain ++ * permissions, as the MCFG areas must not be part of the domain IOMEM ++ * accessible regions. ++ */ ++ pvh_setup_mmcfg(d); ++ + /* + * Setup permissions early so that calls to add MMIO regions to the + * p2m as part of vPCI setup don't fail due to permission checks. +@@ -1324,13 +1331,6 @@ int __init dom0_construct_pvh(struct domain *d, const module_t *image, + } + } + +- /* +- * NB: MMCFG initialization needs to be performed before iommu +- * initialization so the iommu code can fetch the MMCFG regions used by the +- * domain. +- */ +- pvh_setup_mmcfg(d); +- + /* + * Craft dom0 physical memory map and set the paging allocation. 
This must + * be done before the iommu initializion, since iommu initialization code +diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c +index db726b3817..de6ee6c4dd 100644 +--- a/xen/arch/x86/hvm/io.c ++++ b/xen/arch/x86/hvm/io.c +@@ -363,14 +363,14 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d, + return NULL; + } + +-int __hwdom_init vpci_subtract_mmcfg(const struct domain *d, struct rangeset *r) ++int __hwdom_init vpci_mmcfg_deny_access(struct domain *d) + { + const struct hvm_mmcfg *mmcfg; + + list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next ) + { +- int rc = rangeset_remove_range(r, PFN_DOWN(mmcfg->addr), +- PFN_DOWN(mmcfg->addr + mmcfg->size - 1)); ++ int rc = iomem_deny_access(d, PFN_DOWN(mmcfg->addr), ++ PFN_DOWN(mmcfg->addr + mmcfg->size - 1)); + + if ( rc ) + return rc; +diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h +index d72b29f73f..377c59a5c4 100644 +--- a/xen/arch/x86/include/asm/hvm/io.h ++++ b/xen/arch/x86/include/asm/hvm/io.h +@@ -134,8 +134,8 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr, + /* Destroy tracked MMCFG areas. */ + void destroy_vpci_mmcfg(struct domain *d); + +-/* Remove MMCFG regions from a given rangeset. */ +-int vpci_subtract_mmcfg(const struct domain *d, struct rangeset *r); ++/* Remove MMCFG regions from a domain ->iomem_caps. 
*/ ++int vpci_mmcfg_deny_access(struct domain *d); + + #endif /* __ASM_X86_HVM_IO_H__ */ + +diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c +index 8b1e0596b8..67f025c1ec 100644 +--- a/xen/drivers/passthrough/x86/iommu.c ++++ b/xen/drivers/passthrough/x86/iommu.c +@@ -320,6 +320,26 @@ static int __hwdom_init cf_check map_subtract(unsigned long s, unsigned long e, + return rangeset_remove_range(map, s, e); + } + ++struct handle_iomemcap { ++ struct rangeset *r; ++ unsigned long last; ++}; ++static int __hwdom_init cf_check map_subtract_iomemcap(unsigned long s, ++ unsigned long e, ++ void *data) ++{ ++ struct handle_iomemcap *h = data; ++ int rc = 0; ++ ++ if ( h->last != s ) ++ rc = rangeset_remove_range(h->r, h->last, s - 1); ++ ++ ASSERT(e < ~0UL); ++ h->last = e + 1; ++ ++ return rc; ++} ++ + struct map_data { + struct domain *d; + unsigned int flush_flags; +@@ -400,6 +420,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d) + unsigned int i; + struct rangeset *map; + struct map_data map_data = { .d = d }; ++ struct handle_iomemcap iomem = {}; + int rc; + + BUG_ON(!is_hardware_domain(d)); +@@ -442,14 +463,6 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d) + + switch ( entry.type ) + { +- case E820_UNUSABLE: +- /* Only relevant for inclusive mode, otherwise this is a no-op. */ +- rc = rangeset_remove_range(map, PFN_DOWN(entry.addr), +- PFN_DOWN(entry.addr + entry.size - 1)); +- if ( rc ) +- panic("IOMMU failed to remove unusable memory: %d\n", rc); +- continue; +- + case E820_RESERVED: + if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved ) + continue; +@@ -475,22 +488,13 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d) + if ( rc ) + panic("IOMMU failed to remove Xen ranges: %d\n", rc); + +- /* Remove any overlap with the Interrupt Address Range. 
*/ +- rc = rangeset_remove_range(map, 0xfee00, 0xfeeff); ++ iomem.r = map; ++ rc = rangeset_report_ranges(d->iomem_caps, 0, ~0UL, map_subtract_iomemcap, ++ &iomem); ++ if ( !rc && iomem.last < ~0UL ) ++ rc = rangeset_remove_range(map, iomem.last, ~0UL); + if ( rc ) +- panic("IOMMU failed to remove Interrupt Address Range: %d\n", rc); +- +- /* If emulating IO-APIC(s) make sure the base address is unmapped. */ +- if ( has_vioapic(d) ) +- { +- for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ ) +- { +- rc = rangeset_remove_singleton(map, +- PFN_DOWN(domain_vioapic(d, i)->base_address)); +- if ( rc ) +- panic("IOMMU failed to remove IO-APIC: %d\n", rc); +- } +- } ++ panic("IOMMU failed to remove forbidden regions: %d\n", rc); + + if ( is_pv_domain(d) ) + { +@@ -506,23 +510,6 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d) + panic("IOMMU failed to remove read-only regions: %d\n", rc); + } + +- if ( has_vpci(d) ) +- { +- /* +- * TODO: runtime added MMCFG regions are not checked to make sure they +- * don't overlap with already mapped regions, thus preventing trapping. +- */ +- rc = vpci_subtract_mmcfg(d, map); +- if ( rc ) +- panic("IOMMU unable to remove MMCFG areas: %d\n", rc); +- } +- +- /* Remove any regions past the last address addressable by the domain. 
*/ +- rc = rangeset_remove_range(map, PFN_DOWN(1UL << domain_max_paddr_bits(d)), +- ~0UL); +- if ( rc ) +- panic("IOMMU unable to remove unaddressable ranges: %d\n", rc); +- + if ( iommu_verbose ) + printk(XENLOG_INFO "%pd: identity mappings for IOMMU:\n", d); + +-- +2.48.1 + diff --git a/0043-x86-dom0-be-less-restrictive-with-the-Interrupt-Addr.patch b/0043-x86-dom0-be-less-restrictive-with-the-Interrupt-Addr.patch new file mode 100644 index 0000000..3dd92e3 --- /dev/null +++ b/0043-x86-dom0-be-less-restrictive-with-the-Interrupt-Addr.patch @@ -0,0 +1,136 @@ +From e665204eb38343bb9ee9ca2a21461659abb01dc3 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:17:17 +0100 +Subject: [PATCH 43/53] x86/dom0: be less restrictive with the Interrupt + Address Range +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Xen currently prevents dom0 from creating CPU or IOMMU page-table mappings +into the interrupt address range [0xfee00000, 0xfeefffff]. This range has +two different purposes. For accesses from the CPU is contains the default +position of local APIC page at 0xfee00000. For accesses from devices +it's the MSI address range, so the address field in the MSI entries +(usually) point to an address on that range to trigger an interrupt. + +There are reports of Lenovo Thinkpad devices placing what seems to be the +UCSI shared mailbox at address 0xfeec2000 in the interrupt address range. 
+Attempting to use that device with a Linux PV dom0 leads to an error when +Linux kernel maps 0xfeec2000: + +RIP: e030:xen_mc_flush+0x1e8/0x2b0 + xen_leave_lazy_mmu+0x15/0x60 + vmap_range_noflush+0x408/0x6f0 + __ioremap_caller+0x20d/0x350 + acpi_os_map_iomem+0x1a3/0x1c0 + acpi_ex_system_memory_space_handler+0x229/0x3f0 + acpi_ev_address_space_dispatch+0x17e/0x4c0 + acpi_ex_access_region+0x28a/0x510 + acpi_ex_field_datum_io+0x95/0x5c0 + acpi_ex_extract_from_field+0x36b/0x4e0 + acpi_ex_read_data_from_field+0xcb/0x430 + acpi_ex_resolve_node_to_value+0x2e0/0x530 + acpi_ex_resolve_to_value+0x1e7/0x550 + acpi_ds_evaluate_name_path+0x107/0x170 + acpi_ds_exec_end_op+0x392/0x860 + acpi_ps_parse_loop+0x268/0xa30 + acpi_ps_parse_aml+0x221/0x5e0 + acpi_ps_execute_method+0x171/0x3e0 + acpi_ns_evaluate+0x174/0x5d0 + acpi_evaluate_object+0x167/0x440 + acpi_evaluate_dsm+0xb6/0x130 + ucsi_acpi_dsm+0x53/0x80 + ucsi_acpi_read+0x2e/0x60 + ucsi_register+0x24/0xa0 + ucsi_acpi_probe+0x162/0x1e3 + platform_probe+0x48/0x90 + really_probe+0xde/0x340 + __driver_probe_device+0x78/0x110 + driver_probe_device+0x1f/0x90 + __driver_attach+0xd2/0x1c0 + bus_for_each_dev+0x77/0xc0 + bus_add_driver+0x112/0x1f0 + driver_register+0x72/0xd0 + do_one_initcall+0x48/0x300 + do_init_module+0x60/0x220 + __do_sys_init_module+0x17f/0x1b0 + do_syscall_64+0x82/0x170 + +Remove the restrictions to create mappings in the interrupt address range +for dom0. Note that the restriction to map the local APIC page is enforced +separately, and that continues to be present. Additionally make sure the +emulated local APIC page is also not mapped, in case dom0 is using it. + +Note that even if the interrupt address range entries are populated in the +IOMMU page-tables no device access will reach those pages. Device accesses +to the Interrupt Address Range will always be converted into Interrupt +Messages and are not subject to DMA remapping. 
+ +There's also the following restriction noted in Intel VT-d: + +> Software must not program paging-structure entries to remap any address to +> the interrupt address range. Untranslated requests and translation requests +> that result in an address in the interrupt range will be blocked with +> condition code LGN.4 or SGN.8. Translated requests with an address in the +> interrupt address range are treated as Unsupported Request (UR). + +Similarly for AMD-Vi: + +> Accesses to the interrupt address range (Table 3) are defined to go through +> the interrupt remapping portion of the IOMMU and not through address +> translation processing. Therefore, when a transaction is being processed as +> an interrupt remapping operation, the transaction attribute of +> pretranslated or untranslated is ignored. +> +> Software Note: The IOMMU should +> not be configured such that an address translation results in a special +> address such as the interrupt address range. + +However those restrictions don't apply to the identity mappings possibly +created for dom0, since the interrupt address range is never subject to DMA +remapping, and hence there's no output address after translation that +belongs to the interrupt address range. 
+ +Reported-by: Jürgen Groß +Link: https://lore.kernel.org/xen-devel/baade0a7-e204-4743-bda1-282df74e5f89@suse.com/ +Signed-off-by: Roger Pau Monné +Acked-by: Jan Beulich +master commit: 381caa38850771ae218eb6f6d490dc02e40df964 +master date: 2025-03-05 10:26:46 +0100 +--- + xen/arch/x86/dom0_build.c | 11 +++++++---- + 1 file changed, 7 insertions(+), 4 deletions(-) + +diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c +index c0b2ad34b5..6ee3dfb73f 100644 +--- a/xen/arch/x86/dom0_build.c ++++ b/xen/arch/x86/dom0_build.c +@@ -554,6 +554,13 @@ int __init dom0_setup_permissions(struct domain *d) + mfn = paddr_to_pfn(mp_lapic_addr); + rc |= iomem_deny_access(d, mfn, mfn); + } ++ /* If using an emulated local APIC make sure its MMIO is unpopulated. */ ++ if ( has_vlapic(d) ) ++ { ++ /* Xen doesn't allow changing the local APIC MMIO window position. */ ++ mfn = paddr_to_pfn(APIC_DEFAULT_PHYS_BASE); ++ rc |= iomem_deny_access(d, mfn, mfn); ++ } + /* I/O APICs. */ + for ( i = 0; i < nr_ioapics; i++ ) + { +@@ -563,10 +570,6 @@ int __init dom0_setup_permissions(struct domain *d) + !rangeset_contains_singleton(mmio_ro_ranges, mfn) ) + rc |= iomem_deny_access(d, mfn, mfn); + } +- /* MSI range. */ +- rc |= iomem_deny_access(d, paddr_to_pfn(MSI_ADDR_BASE_LO), +- paddr_to_pfn(MSI_ADDR_BASE_LO + +- MSI_ADDR_DEST_ID_MASK)); + /* HyperTransport range. 
*/ + if ( boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) ) + { +-- +2.48.1 + diff --git a/0043-x86-dpci-do-not-leak-pending-interrupts-on-CPU-offli.patch b/0043-x86-dpci-do-not-leak-pending-interrupts-on-CPU-offli.patch deleted file mode 100644 index 346f2d9..0000000 --- a/0043-x86-dpci-do-not-leak-pending-interrupts-on-CPU-offli.patch +++ /dev/null @@ -1,75 +0,0 @@ -From 8ebd6b066d17b585876c761cee298d1e3384079b Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 29 Oct 2024 16:29:12 +0100 -Subject: [PATCH 43/83] x86/dpci: do not leak pending interrupts on CPU offline -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -The current dpci logic relies on a softirq being executed as a side effect of -the cpu_notifier_call_chain() call in the code path that offlines the target -CPU. However the call to cpu_notifier_call_chain() won't trigger any softirq -processing, and even if it did, such processing should be done after all -interrupts have been migrated off the current CPU, otherwise new pending dpci -interrupts could still appear. - -Currently the ASSERT() in the cpu callback notifier is fairly easy to trigger -by doing CPU offline from a PVH dom0. - -Solve this by instead moving out any dpci interrupts pending processing once -the CPU is dead. This might introduce more latency than attempting to drain -before the CPU is put offline, but it's less complex, and CPU online/offline is -not a common action. Any extra introduced latency should be tolerable. 
- -Fixes: f6dd295381f4 ('dpci: replace tasklet with softirq') -Signed-off-by: Roger Pau Monné -Acked-by: Andrew Cooper -master commit: 29555668b5725b9d5393b72bfe7ff9a3fa606714 -master date: 2024-10-07 11:10:21 +0200 ---- - xen/drivers/passthrough/x86/hvm.c | 20 ++++++++++++-------- - 1 file changed, 12 insertions(+), 8 deletions(-) - -diff --git a/xen/drivers/passthrough/x86/hvm.c b/xen/drivers/passthrough/x86/hvm.c -index d3627e4af7..f5faff7a49 100644 ---- a/xen/drivers/passthrough/x86/hvm.c -+++ b/xen/drivers/passthrough/x86/hvm.c -@@ -1105,23 +1105,27 @@ static int cf_check cpu_callback( - struct notifier_block *nfb, unsigned long action, void *hcpu) - { - unsigned int cpu = (unsigned long)hcpu; -+ unsigned long flags; - - switch ( action ) - { - case CPU_UP_PREPARE: - INIT_LIST_HEAD(&per_cpu(dpci_list, cpu)); - break; -+ - case CPU_UP_CANCELED: -- case CPU_DEAD: -- /* -- * On CPU_DYING this callback is called (on the CPU that is dying) -- * with an possible HVM_DPIC_SOFTIRQ pending - at which point we can -- * clear out any outstanding domains (by the virtue of the idle loop -- * calling the softirq later). In CPU_DEAD case the CPU is deaf and -- * there are no pending softirqs for us to handle so we can chill. -- */ - ASSERT(list_empty(&per_cpu(dpci_list, cpu))); - break; -+ -+ case CPU_DEAD: -+ if ( list_empty(&per_cpu(dpci_list, cpu)) ) -+ break; -+ /* Take whatever dpci interrupts are pending on the dead CPU. 
*/ -+ local_irq_save(flags); -+ list_splice_init(&per_cpu(dpci_list, cpu), &this_cpu(dpci_list)); -+ local_irq_restore(flags); -+ raise_softirq(HVM_DPCI_SOFTIRQ); -+ break; - } - - return NOTIFY_DONE; --- -2.47.0 - diff --git a/0044-ioreq-don-t-wrongly-claim-success-in-ioreq_send_buff.patch b/0044-ioreq-don-t-wrongly-claim-success-in-ioreq_send_buff.patch deleted file mode 100644 index 6132bb4..0000000 --- a/0044-ioreq-don-t-wrongly-claim-success-in-ioreq_send_buff.patch +++ /dev/null @@ -1,44 +0,0 @@ -From d15e9fa3c880d0d2e0a3c19f0fa09ddac01e0ff9 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 29 Oct 2024 16:29:47 +0100 -Subject: [PATCH 44/83] ioreq: don't wrongly claim "success" in - ioreq_send_buffered() - -Returning a literal number is a bad idea anyway when all other returns -use IOREQ_STATUS_* values. The function is dead on Arm, and mapping to -X86EMUL_OKAY is surely wrong on x86. - -Fixes: f6bf39f84f82 ("x86/hvm: add support for broadcast of buffered ioreqs...") -Signed-off-by: Jan Beulich -Reviewed-by: Julien Grall -master commit: 2e0b545b847df7d4feb07308d50bad708bd35a66 -master date: 2024-10-08 14:36:27 +0200 ---- - xen/common/ioreq.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/xen/common/ioreq.c b/xen/common/ioreq.c -index 1257a3d972..f5fd30ce12 100644 ---- a/xen/common/ioreq.c -+++ b/xen/common/ioreq.c -@@ -1175,7 +1175,7 @@ static int ioreq_send_buffered(struct ioreq_server *s, ioreq_t *p) - return IOREQ_STATUS_UNHANDLED; - - /* -- * Return 0 for the cases we can't deal with: -+ * Return UNHANDLED for the cases we can't deal with: - * - 'addr' is only a 20-bit field, so we cannot address beyond 1MB - * - we cannot buffer accesses to guest memory buffers, as the guest - * may expect the memory buffer to be synchronously accessed -@@ -1183,7 +1183,7 @@ static int ioreq_send_buffered(struct ioreq_server *s, ioreq_t *p) - * support data_is_ptr we do not waste space for the count field either - */ - if ( (p->addr > 
0xfffffUL) || p->data_is_ptr || (p->count != 1) ) -- return 0; -+ return IOREQ_STATUS_UNHANDLED; - - switch ( p->size ) - { --- -2.47.0 - diff --git a/0044-tools-xl-fix-channel-configuration-setting.patch b/0044-tools-xl-fix-channel-configuration-setting.patch new file mode 100644 index 0000000..dd7149a --- /dev/null +++ b/0044-tools-xl-fix-channel-configuration-setting.patch @@ -0,0 +1,41 @@ +From 7c767b7093ad4f6f53760f84eef51d9ca585a7a0 Mon Sep 17 00:00:00 2001 +From: Juergen Gross +Date: Thu, 20 Mar 2025 13:17:41 +0100 +Subject: [PATCH 44/53] tools/xl: fix channel configuration setting + +Channels work differently than other device types: their devid should +be -1 initially in order to distinguish them from the primary console +which has the devid of 0. + +So when parsing the channel configuration, use +ARRAY_EXTEND_INIT_NODEVID() in order to avoid overwriting the devid +set by libxl_device_channel_init(). + +Fixes: 3a6679634766 ("libxl: set channel devid when not provided by application") +Signed-off-by: Juergen Gross +Reviewed-by: Anthony PERARD +master commit: e1ccced4afe465d6541c5825a0f8d1b8f5fa4253 +master date: 2025-03-05 16:37:37 +0100 +--- + tools/xl/xl_parse.c | 5 +++-- + 1 file changed, 3 insertions(+), 2 deletions(-) + +diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c +index e3a4800f6e..9018efa117 100644 +--- a/tools/xl/xl_parse.c ++++ b/tools/xl/xl_parse.c +@@ -2387,8 +2387,9 @@ void parse_config_data(const char *config_source, + char *path = NULL; + int len; + +- chn = ARRAY_EXTEND_INIT(d_config->channels, d_config->num_channels, +- libxl_device_channel_init); ++ chn = ARRAY_EXTEND_INIT_NODEVID(d_config->channels, ++ d_config->num_channels, ++ libxl_device_channel_init); + + split_string_into_string_list(buf, ",", &pairs); + len = libxl_string_list_length(&pairs); +-- +2.48.1 + diff --git a/0045-x86-domctl-fix-maximum-number-of-MSRs-in-XEN_DOMCTL_.patch b/0045-x86-domctl-fix-maximum-number-of-MSRs-in-XEN_DOMCTL_.patch deleted file mode 100644 
index ea834dc..0000000 --- a/0045-x86-domctl-fix-maximum-number-of-MSRs-in-XEN_DOMCTL_.patch +++ /dev/null @@ -1,51 +0,0 @@ -From 05292f914f388868f54429f6feeab8c9b0a1b57d Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 29 Oct 2024 16:30:04 +0100 -Subject: [PATCH 45/83] x86/domctl: fix maximum number of MSRs in - XEN_DOMCTL_{get,set}_vcpu_msrs -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Since the addition of the MSR_AMD64_DR{1-4}_ADDRESS_MASK MSRs to the -msrs_to_send array, the calculations for the maximum number of MSRs that -the hypercall can handle is off by 4. - -Remove the addition of 4 to the maximum number of MSRs that -XEN_DOMCTL_{set,get}_vcpu_msrs supports, as those are already part of the -array. - -A further adjustment could be to subtract 4 from the maximum size if the DBEXT -CPUID feature is not exposed to the guest, but guest_{rd,wr}msr() will already -perform that check when fetching or loading the MSRs. The maximum array is -used to indicate the caller of the buffer it needs to allocate in the get case, -and as an early input sanitation in the set case, using a buffer size slightly -lager than required is not an issue. - -Fixes: 86d47adcd3c4 ('x86/msr: Handle MSR_AMD64_DR{0-3}_ADDRESS_MASK in the new MSR infrastructure') -Signed-off-by: Roger Pau Monné -Reviewed-by: Jan Beulich -master commit: c95cd5f9c5a8c1c6ab1b0b366d829fa8561958fd -master date: 2024-10-08 14:37:53 +0200 ---- - xen/arch/x86/domctl.c | 4 ---- - 1 file changed, 4 deletions(-) - -diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c -index 9190e11faa..8066f28e9d 100644 ---- a/xen/arch/x86/domctl.c -+++ b/xen/arch/x86/domctl.c -@@ -1055,10 +1055,6 @@ long arch_do_domctl( - !is_pv_domain(d) ) - break; - -- /* Count maximum number of optional msrs. 
*/ -- if ( boot_cpu_has(X86_FEATURE_DBEXT) ) -- nr_msrs += 4; -- - if ( domctl->cmd == XEN_DOMCTL_get_vcpu_msrs ) - { - ret = 0; copyback = true; --- -2.47.0 - diff --git a/0045-x86-vlapic-Fix-handling-of-writes-to-APIC_ESR.patch b/0045-x86-vlapic-Fix-handling-of-writes-to-APIC_ESR.patch new file mode 100644 index 0000000..821e751 --- /dev/null +++ b/0045-x86-vlapic-Fix-handling-of-writes-to-APIC_ESR.patch @@ -0,0 +1,90 @@ +From da239140a48eb9b2b30784ee5dd420b6f879d189 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Thu, 20 Mar 2025 13:18:23 +0100 +Subject: [PATCH 45/53] x86/vlapic: Fix handling of writes to APIC_ESR + +Xen currently presents APIC_ESR to guests as a simple read/write register. + +This is incorrect. The SDM states: + + The ESR is a write/read register. Before attempt to read from the ESR, + software should first write to it. (The value written does not affect the + values read subsequently; only zero may be written in x2APIC mode.) This + write clears any previously logged errors and updates the ESR with any + errors detected since the last write to the ESR. + +Introduce a new pending_esr field in hvm_hw_lapic. + +Update vlapic_error() to accumulate errors here, and extend vlapic_reg_write() +to discard the written value and transfer pending_esr into APIC_ESR. Reads +are still as before. + +Importantly, this means that guests no longer destroys the ESR value it's +looking for in the LVTERR handler when following the SDM instructions. 
+ +Signed-off-by: Andrew Cooper +Reviewed-by: Jan Beulich +master commit: b28b590d4a23894672f1dd7fb98cdf9926ecb282 +master date: 2025-03-07 14:34:08 +0000 +--- + xen/arch/x86/hvm/vlapic.c | 17 +++++++++++++++-- + xen/include/public/arch-x86/hvm/save.h | 1 + + 2 files changed, 16 insertions(+), 2 deletions(-) + +diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c +index 46ff758904..f34bbfb3cc 100644 +--- a/xen/arch/x86/hvm/vlapic.c ++++ b/xen/arch/x86/hvm/vlapic.c +@@ -108,7 +108,7 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask) + uint32_t esr; + + spin_lock_irqsave(&vlapic->esr_lock, flags); +- esr = vlapic_get_reg(vlapic, APIC_ESR); ++ esr = vlapic->hw.pending_esr; + if ( (esr & errmask) != errmask ) + { + uint32_t lvterr = vlapic_get_reg(vlapic, APIC_LVTERR); +@@ -127,7 +127,7 @@ static void vlapic_error(struct vlapic *vlapic, unsigned int errmask) + errmask |= APIC_ESR_RECVILL; + } + +- vlapic_set_reg(vlapic, APIC_ESR, esr | errmask); ++ vlapic->hw.pending_esr |= errmask; + + if ( inj ) + vlapic_set_irq(vlapic, lvterr & APIC_VECTOR_MASK, 0); +@@ -799,6 +799,19 @@ void vlapic_reg_write(struct vcpu *v, unsigned int reg, uint32_t val) + vlapic_set_reg(vlapic, APIC_ID, val); + break; + ++ case APIC_ESR: ++ { ++ unsigned long flags; ++ ++ spin_lock_irqsave(&vlapic->esr_lock, flags); ++ val = vlapic->hw.pending_esr; ++ vlapic->hw.pending_esr = 0; ++ spin_unlock_irqrestore(&vlapic->esr_lock, flags); ++ ++ vlapic_set_reg(vlapic, APIC_ESR, val); ++ break; ++ } ++ + case APIC_TASKPRI: + vlapic_set_reg(vlapic, APIC_TASKPRI, val & 0xff); + break; +diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h +index 7ecacadde1..9c4bfc7ebd 100644 +--- a/xen/include/public/arch-x86/hvm/save.h ++++ b/xen/include/public/arch-x86/hvm/save.h +@@ -394,6 +394,7 @@ struct hvm_hw_lapic { + uint32_t disabled; /* VLAPIC_xx_DISABLED */ + uint32_t timer_divisor; + uint64_t tdt_msr; ++ uint32_t pending_esr; + }; + + 
DECLARE_HVM_SAVE_TYPE(LAPIC, 5, struct hvm_hw_lapic); +-- +2.48.1 + diff --git a/0046-x86-msr-expose-MSR_FAM10H_MMIO_CONF_BASE-on-AMD.patch b/0046-x86-msr-expose-MSR_FAM10H_MMIO_CONF_BASE-on-AMD.patch new file mode 100644 index 0000000..7444290 --- /dev/null +++ b/0046-x86-msr-expose-MSR_FAM10H_MMIO_CONF_BASE-on-AMD.patch @@ -0,0 +1,72 @@ +From 3c2b6175a1e49ea4e6fa0aec989ee49daf593558 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:18:46 +0100 +Subject: [PATCH 46/53] x86/msr: expose MSR_FAM10H_MMIO_CONF_BASE on AMD +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The MMIO_CONF_BASE reports the base of the MCFG range on AMD systems. +Linux pre-6.14 is unconditionally attempting to read the MSR without a +safe MSR accessor, and since Xen doesn't allow access to it Linux reports +the following error: + +unchecked MSR access error: RDMSR from 0xc0010058 at rIP: 0xffffffff8101d19f (xen_do_read_msr+0x7f/0xa0) +Call Trace: + xen_read_msr+0x1e/0x30 + amd_get_mmconfig_range+0x2b/0x80 + quirk_amd_mmconfig_area+0x28/0x100 + pnp_fixup_device+0x39/0x50 + __pnp_add_device+0xf/0x150 + pnp_add_device+0x3d/0x100 + pnpacpi_add_device_handler+0x1f9/0x280 + acpi_ns_get_device_callback+0x104/0x1c0 + acpi_ns_walk_namespace+0x1d0/0x260 + acpi_get_devices+0x8a/0xb0 + pnpacpi_init+0x50/0x80 + do_one_initcall+0x46/0x2e0 + kernel_init_freeable+0x1da/0x2f0 + kernel_init+0x16/0x1b0 + ret_from_fork+0x30/0x50 + ret_from_fork_asm+0x1b/0x30 + +Such access is conditional to the presence of a device with PnP ID +"PNP0c01", which triggers the execution of the quirk_amd_mmconfig_area() +function. Note that prior to commit 3fac3734c43a MSR accesses when running +as a PV guest would always use the safe variant, and thus silently handle +the #GP. + +Fix by allowing access to the MSR on AMD systems for the hardware domain. + +Write attempts to the MSR will still result in #GP for all domain types. 
+ +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: b4071d28c5bd9ca4fed76031cbf0e782b74209b9 +master date: 2025-03-12 13:32:30 +0100 +--- + xen/arch/x86/msr.c | 8 ++++++++ + 1 file changed, 8 insertions(+) + +diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c +index 289cf10b78..3f612ad27c 100644 +--- a/xen/arch/x86/msr.c ++++ b/xen/arch/x86/msr.c +@@ -245,6 +245,14 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val) + *val = 0; + break; + ++ case MSR_FAM10H_MMIO_CONF_BASE: ++ if ( !is_hardware_domain(d) || ++ !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) || ++ rdmsr_safe(msr, *val) ) ++ goto gp_fault; ++ ++ break; ++ + case MSR_VIRT_SPEC_CTRL: + if ( !cp->extd.virt_ssbd ) + goto gp_fault; +-- +2.48.1 + diff --git a/0046-xen-spinlock-Fix-UBSAN-load-of-address-with-insuffic.patch b/0046-xen-spinlock-Fix-UBSAN-load-of-address-with-insuffic.patch deleted file mode 100644 index 0b8af31..0000000 --- a/0046-xen-spinlock-Fix-UBSAN-load-of-address-with-insuffic.patch +++ /dev/null @@ -1,67 +0,0 @@ -From a756c242ea32d3285d5582bc9aca030bafd24f31 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 29 Oct 2024 16:30:41 +0100 -Subject: [PATCH 46/83] xen/spinlock: Fix UBSAN "load of address with - insufficient space" in lock_prof_init() - -UBSAN complains: - - (XEN) ================================================================================ - (XEN) UBSAN: Undefined behaviour in common/spinlock.c:794:10 - (XEN) load of address ffff82d040ae24c8 with insufficient space - (XEN) for an object of type 'struct lock_profile *' - (XEN) ----[ Xen-4.20-unstable x86_64 debug=y ubsan=y Tainted: C ]---- - -This shows up with GCC-14, but not with GCC-12. I have not bisected further. - -Either way, the types for __lock_profile_{start,end} are incorrect. - -They are an array of struct lock_profile pointers. Correct the extern's -types, and adjust the loop to match. - -No practical change. 
- -Reported-by: Andreas Glashauser -Signed-off-by: Andrew Cooper -Reviewed-by: Juergen Gross -master commit: 542ac112fc68c66cfafc577e252404c21da4f75b -master date: 2024-10-14 16:14:26 +0100 ---- - xen/common/spinlock.c | 8 ++++---- - 1 file changed, 4 insertions(+), 4 deletions(-) - -diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c -index 28c6e9d3ac..e672b1041c 100644 ---- a/xen/common/spinlock.c -+++ b/xen/common/spinlock.c -@@ -607,9 +607,6 @@ struct lock_profile_anc { - typedef void lock_profile_subfunc(struct lock_profile *data, int32_t type, - int32_t idx, void *par); - --extern struct lock_profile *__lock_profile_start; --extern struct lock_profile *__lock_profile_end; -- - static s_time_t lock_profile_start; - static struct lock_profile_anc lock_profile_ancs[] = { - [LOCKPROF_TYPE_GLOBAL] = { .name = "Global" }, -@@ -779,13 +776,16 @@ void _lock_profile_deregister_struct( - spin_unlock(&lock_profile_lock); - } - -+extern struct lock_profile *__lock_profile_start[]; -+extern struct lock_profile *__lock_profile_end[]; -+ - static int __init cf_check lock_prof_init(void) - { - struct lock_profile **q; - - BUILD_BUG_ON(ARRAY_SIZE(lock_profile_ancs) != LOCKPROF_TYPE_N); - -- for ( q = &__lock_profile_start; q < &__lock_profile_end; q++ ) -+ for ( q = __lock_profile_start; q < __lock_profile_end; q++ ) - { - (*q)->next = lock_profile_glb_q.elem_q; - lock_profile_glb_q.elem_q = *q; --- -2.47.0 - diff --git a/0047-iommu-amd-vi-do-not-error-if-device-referenced-in-IV.patch b/0047-iommu-amd-vi-do-not-error-if-device-referenced-in-IV.patch deleted file mode 100644 index a132ac0..0000000 --- a/0047-iommu-amd-vi-do-not-error-if-device-referenced-in-IV.patch +++ /dev/null @@ -1,52 +0,0 @@ -From eec09073ad1d941669836a94e072cc895d3b560a Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 29 Oct 2024 16:30:51 +0100 -Subject: [PATCH 47/83] iommu/amd-vi: do not error if device referenced in IVMD - is not behind any IOMMU -MIME-Version: 
1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -IVMD table contains restrictions about memory which must be mandatory assigned -to devices (and which permissions it should use), or memory that should be -never accessible to devices. - -Some hardware however contains ranges in IVMD that reference devices outside of -the IVHD tables (in other words, devices not behind any IOMMU). Such mismatch -will cause Xen to fail in register_range_for_device(), ultimately leading to -the IOMMU being disabled, and Xen crashing as x2APIC support might be already -enabled and relying on the IOMMU functionality. - -Relax IVMD parsing: allow IVMD blocks to reference devices not assigned to any -IOMMU. It's impossible for Xen to fulfill the requirement in the IVMD block if -the device is not behind any IOMMU, but it's no worse than booting without -IOMMU support, and thus not parsing ACPI IVRS in the first place. - -Reported-by: Willi Junga -Signed-off-by: Roger Pau Monné -Acked-by: Jan Beulich -master commit: 2defb544900a11f93104ac68d2f8beba89d4bd02 -master date: 2024-10-15 14:23:59 +0200 ---- - xen/drivers/passthrough/amd/iommu_acpi.c | 5 +++-- - 1 file changed, 3 insertions(+), 2 deletions(-) - -diff --git a/xen/drivers/passthrough/amd/iommu_acpi.c b/xen/drivers/passthrough/amd/iommu_acpi.c -index 3f5508eba0..c416120326 100644 ---- a/xen/drivers/passthrough/amd/iommu_acpi.c -+++ b/xen/drivers/passthrough/amd/iommu_acpi.c -@@ -248,8 +248,9 @@ static int __init register_range_for_device( - iommu = find_iommu_for_device(seg, bdf); - if ( !iommu ) - { -- AMD_IOMMU_ERROR("IVMD: no IOMMU for Dev_Id %#x\n", bdf); -- return -ENODEV; -+ AMD_IOMMU_WARN("IVMD: no IOMMU for device %pp - ignoring constrain\n", -+ &PCI_SBDF(seg, bdf)); -+ return 0; - } - req = ivrs_mappings[bdf].dte_requestor_id; - --- -2.47.0 - diff --git a/0047-x86-vmx-fix-posted-interrupts-usage-of-msi_desc-msg-.patch b/0047-x86-vmx-fix-posted-interrupts-usage-of-msi_desc-msg-.patch new file 
mode 100644 index 0000000..4d6ad6b --- /dev/null +++ b/0047-x86-vmx-fix-posted-interrupts-usage-of-msi_desc-msg-.patch @@ -0,0 +1,75 @@ +From 037d3e77d7a6d06d97bd993d275510b8706cd60d Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:19:17 +0100 +Subject: [PATCH 47/53] x86/vmx: fix posted interrupts usage of msi_desc->msg + field +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The current usage of msi_desc->msg in vmx_pi_update_irte() will make the +field contain a translated MSI message, instead of the expected +untranslated one. This breaks dump_msi(), that use the data in +msi_desc->msg to print the interrupt details. + +Fix this by introducing a dummy local msi_msg, and use it with +iommu_update_ire_from_msi(). vmx_pi_update_irte() relies on the MSI +message not changing, so there's no need to propagate the resulting msi_msg +to the hardware, and the contents can be ignored. + +Additionally add a comment to clarify that msi_desc->msg must always +contain the untranslated MSI message. + +Fixes: a5e25908d18d ('VT-d: introduce new fields in msi_desc to track binding with guest interrupt') +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: 30f0e55a79206702b4e82e86dad6b35033157858 +master date: 2025-03-12 13:32:30 +0100 +--- + xen/arch/x86/hvm/vmx/vmx.c | 4 +++- + xen/arch/x86/include/asm/msi.h | 2 +- + 2 files changed, 4 insertions(+), 2 deletions(-) + +diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c +index f16faa6a61..cb2cc8aa28 100644 +--- a/xen/arch/x86/hvm/vmx/vmx.c ++++ b/xen/arch/x86/hvm/vmx/vmx.c +@@ -396,6 +396,7 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v, + const struct pi_desc *pi_desc = v ? 
&v->arch.hvm.vmx.pi_desc : NULL; + struct irq_desc *desc; + struct msi_desc *msi_desc; ++ struct msi_msg msg; + int rc; + + desc = pirq_spin_lock_irq_desc(pirq, NULL); +@@ -410,12 +411,13 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v, + } + msi_desc->pi_desc = pi_desc; + msi_desc->gvec = gvec; ++ msg = msi_desc->msg; + + spin_unlock_irq(&desc->lock); + + ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain); + +- return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg); ++ return iommu_update_ire_from_msi(msi_desc, &msg); + + unlock_out: + spin_unlock_irq(&desc->lock); +diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h +index 503c9447f6..6e6fedee74 100644 +--- a/xen/arch/x86/include/asm/msi.h ++++ b/xen/arch/x86/include/asm/msi.h +@@ -124,7 +124,7 @@ struct msi_desc { + int irq; + int remap_index; /* index in interrupt remapping table */ + +- struct msi_msg msg; /* Last set MSI message */ ++ struct msi_msg msg; /* Last set MSI message (untranslated) */ + }; + + /* +-- +2.48.1 + diff --git a/0048-x86-boot-Fix-microcode-module-handling-during-PVH-bo.patch b/0048-x86-boot-Fix-microcode-module-handling-during-PVH-bo.patch deleted file mode 100644 index c96c604..0000000 --- a/0048-x86-boot-Fix-microcode-module-handling-during-PVH-bo.patch +++ /dev/null @@ -1,166 +0,0 @@ -From 8e157210c022a8ec061a1cec44ac255961e6739e Mon Sep 17 00:00:00 2001 -From: "Daniel P. Smith" -Date: Tue, 29 Oct 2024 16:31:25 +0100 -Subject: [PATCH 48/83] x86/boot: Fix microcode module handling during PVH boot -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -As detailed in commit 0fe607b2a144 ("x86/boot: Fix PVH boot during boot_info -transition period"), the use of __va(mbi->mods_addr) constitutes a -use-after-free on the PVH boot path. - -This pattern has been in use since before PVH support was added. 
Inside a PVH -VM, it will go unnoticed as long as the microcode container parser doesn't -choke on the random data it finds. - -The use within early_microcode_init() happens to be safe because it's prior to -move_xen(). microcode_init_cache() is after move_xen(), and therefore unsafe. - -Plumb the boot_info pointer down, replacing module_map and mbi. Importantly, -bi->mods[].mod is a safe way to access the module list during PVH boot. - -Note: microcode_scan_module() is still bogusly stashing a bootstrap_map()'d - pointer in ucode_blob.data, which constitutes a different - use-after-free, and only works in general because of a second bug. This - is unrelated to PVH, and needs untangling differently. - -Signed-off-by: Daniel P. Smith -Signed-off-by: Andrew Cooper -Reviewed-by: Daniel P. Smith -Acked-by: Roger Pau Monné -master commit: 8ddf63a252a6eae6e619ba2df9ad6b6f82e660c1 -master date: 2024-10-23 18:14:24 +0100 ---- - xen/arch/x86/cpu/microcode/core.c | 21 +++++++++++---------- - xen/arch/x86/include/asm/microcode.h | 7 +++++-- - xen/arch/x86/setup.c | 4 ++-- - 3 files changed, 18 insertions(+), 14 deletions(-) - -diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c -index e90055772a..655bc41e07 100644 ---- a/xen/arch/x86/cpu/microcode/core.c -+++ b/xen/arch/x86/cpu/microcode/core.c -@@ -151,9 +151,9 @@ custom_param("ucode", parse_ucode); - - static void __init microcode_scan_module( - unsigned long *module_map, -- const multiboot_info_t *mbi) -+ const multiboot_info_t *mbi, -+ const module_t mod[]) - { -- module_t *mod = (module_t *)__va(mbi->mods_addr); - uint64_t *_blob_start; - unsigned long _blob_size; - struct cpio_data cd; -@@ -203,10 +203,9 @@ static void __init microcode_scan_module( - - static void __init microcode_grab_module( - unsigned long *module_map, -- const multiboot_info_t *mbi) -+ const multiboot_info_t *mbi, -+ const module_t mod[]) - { -- module_t *mod = (module_t *)__va(mbi->mods_addr); -- - if ( ucode_mod_idx 
< 0 ) - ucode_mod_idx += mbi->mods_count; - if ( ucode_mod_idx <= 0 || ucode_mod_idx >= mbi->mods_count || -@@ -215,7 +214,7 @@ static void __init microcode_grab_module( - ucode_mod = mod[ucode_mod_idx]; - scan: - if ( ucode_scan ) -- microcode_scan_module(module_map, mbi); -+ microcode_scan_module(module_map, mbi, mod); - } - - static struct microcode_ops __ro_after_init ucode_ops; -@@ -801,7 +800,8 @@ static int __init early_update_cache(const void *data, size_t len) - } - - int __init microcode_init_cache(unsigned long *module_map, -- const struct multiboot_info *mbi) -+ const struct multiboot_info *mbi, -+ const module_t mods[]) - { - int rc = 0; - -@@ -810,7 +810,7 @@ int __init microcode_init_cache(unsigned long *module_map, - - if ( ucode_scan ) - /* Need to rescan the modules because they might have been relocated */ -- microcode_scan_module(module_map, mbi); -+ microcode_scan_module(module_map, mbi, mods); - - if ( ucode_mod.mod_end ) - rc = early_update_cache(bootstrap_map(&ucode_mod), -@@ -857,7 +857,8 @@ static int __init early_microcode_update_cpu(void) - } - - int __init early_microcode_init(unsigned long *module_map, -- const struct multiboot_info *mbi) -+ const struct multiboot_info *mbi, -+ const module_t mods[]) - { - const struct cpuinfo_x86 *c = &boot_cpu_data; - int rc = 0; -@@ -900,7 +901,7 @@ int __init early_microcode_init(unsigned long *module_map, - return -ENODEV; - } - -- microcode_grab_module(module_map, mbi); -+ microcode_grab_module(module_map, mbi, mods); - - if ( ucode_mod.mod_end || ucode_blob.size ) - rc = early_microcode_update_cpu(); -diff --git a/xen/arch/x86/include/asm/microcode.h b/xen/arch/x86/include/asm/microcode.h -index 8f59b20b02..1c9a4aa7d7 100644 ---- a/xen/arch/x86/include/asm/microcode.h -+++ b/xen/arch/x86/include/asm/microcode.h -@@ -3,6 +3,7 @@ - - #include - #include -+#include - - #include - -@@ -24,9 +25,11 @@ DECLARE_PER_CPU(struct cpu_signature, cpu_sig); - void microcode_set_module(unsigned int idx); - int 
microcode_update(XEN_GUEST_HANDLE(const_void) buf, unsigned long len); - int early_microcode_init(unsigned long *module_map, -- const struct multiboot_info *mbi); -+ const struct multiboot_info *mbi, -+ const module_t mods[]); - int microcode_init_cache(unsigned long *module_map, -- const struct multiboot_info *mbi); -+ const struct multiboot_info *mbi, -+ const module_t mods[]); - int microcode_update_one(void); - - #endif /* ASM_X86__MICROCODE_H */ -diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c -index f1076c7203..9e5e871b31 100644 ---- a/xen/arch/x86/setup.c -+++ b/xen/arch/x86/setup.c -@@ -1322,7 +1322,7 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) - * TODO: load ucode earlier once multiboot modules become accessible - * at an earlier stage. - */ -- early_microcode_init(module_map, mbi); -+ early_microcode_init(module_map, mbi, mod); - - if ( xen_phys_start ) - { -@@ -1866,7 +1866,7 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) - - timer_init(); - -- microcode_init_cache(module_map, mbi); /* Needs xmalloc() */ -+ microcode_init_cache(module_map, mbi, mod); /* Needs xmalloc() */ - - tsx_init(); /* Needs microcode. May change HLE/RTM feature bits. */ - --- -2.47.0 - diff --git a/0048-x86-hvm-check-return-code-of-hvm_pi_update_irte-when.patch b/0048-x86-hvm-check-return-code-of-hvm_pi_update_irte-when.patch new file mode 100644 index 0000000..f5559a0 --- /dev/null +++ b/0048-x86-hvm-check-return-code-of-hvm_pi_update_irte-when.patch @@ -0,0 +1,45 @@ +From 5e816eb950600a0f7c31f031a8f44c385f8c6b76 Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:19:36 +0100 +Subject: [PATCH 48/53] x86/hvm: check return code of hvm_pi_update_irte when + binding +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +Consume the return code from hvm_pi_update_irte(), and propagate the error +back to the caller if hvm_pi_update_irte() fails. 
+ +Fixes: 35a1caf8b6b5 ('pass-through: update IRTE according to guest interrupt config changes') +Signed-off-by: Roger Pau Monné +Reviewed-by: Jan Beulich +master commit: cb587f620ab56cc683347d8120ba63989fad2693 +master date: 2025-03-12 13:32:31 +0100 +--- + xen/drivers/passthrough/x86/hvm.c | 10 +++++++++- + 1 file changed, 9 insertions(+), 1 deletion(-) + +diff --git a/xen/drivers/passthrough/x86/hvm.c b/xen/drivers/passthrough/x86/hvm.c +index f5faff7a49..47de6953fd 100644 +--- a/xen/drivers/passthrough/x86/hvm.c ++++ b/xen/drivers/passthrough/x86/hvm.c +@@ -381,7 +381,15 @@ int pt_irq_create_bind( + + /* Use interrupt posting if it is supported. */ + if ( iommu_intpost ) +- hvm_pi_update_irte(vcpu, info, pirq_dpci->gmsi.gvec); ++ { ++ rc = hvm_pi_update_irte(vcpu, info, pirq_dpci->gmsi.gvec); ++ ++ if ( rc ) ++ { ++ pt_irq_destroy_bind(d, pt_irq_bind); ++ return rc; ++ } ++ } + + if ( pt_irq_bind->u.msi.gflags & XEN_DOMCTL_VMSI_X86_UNMASKED ) + { +-- +2.48.1 + diff --git a/0049-libxl-avoid-infinite-loop-in-libxl__remove_directory.patch b/0049-libxl-avoid-infinite-loop-in-libxl__remove_directory.patch new file mode 100644 index 0000000..01019b8 --- /dev/null +++ b/0049-libxl-avoid-infinite-loop-in-libxl__remove_directory.patch @@ -0,0 +1,37 @@ +From 3f73662226e82c67d79de950bfb4cf4ea83447ef Mon Sep 17 00:00:00 2001 +From: Jan Beulich +Date: Thu, 20 Mar 2025 13:20:06 +0100 +Subject: [PATCH 49/53] libxl: avoid infinite loop in libxl__remove_directory() + +Infinitely retrying the rmdir() invocation makes little sense. While the +original observation was the log filling the disk (due to repeated +"Directory not empty" errors, in turn occurring for unclear reasons), +the loop wants breaking even if there was no error message being logged +(much like is done in the similar loops in libxl__remove_file() and +libxl__remove_file_or_directory()). 
+ +Fixes: c4dcbee67e6d ("libxl: provide libxl__remove_file et al") +Signed-off-by: Jan Beulich +Reviewed-by: Juergen Gross +Acked-by: Anthony PERARD +master commit: 68baeb5c4852e652b9599e049f40477edac4060e +master date: 2025-03-13 10:23:10 +0100 +--- + tools/libs/light/libxl_utils.c | 1 + + 1 file changed, 1 insertion(+) + +diff --git a/tools/libs/light/libxl_utils.c b/tools/libs/light/libxl_utils.c +index 506c5b5631..5ae8935344 100644 +--- a/tools/libs/light/libxl_utils.c ++++ b/tools/libs/light/libxl_utils.c +@@ -577,6 +577,7 @@ int libxl__remove_directory(libxl__gc *gc, const char *dirpath) + if (errno == EINTR) continue; + LOGE(ERROR, "failed to remove emptied directory %s", dirpath); + rc = ERROR_FAIL; ++ break; + } + + out: +-- +2.48.1 + diff --git a/0049-x86-boot-Fix-XSM-module-handling-during-PVH-boot.patch b/0049-x86-boot-Fix-XSM-module-handling-during-PVH-boot.patch deleted file mode 100644 index 271ed8f..0000000 --- a/0049-x86-boot-Fix-XSM-module-handling-during-PVH-boot.patch +++ /dev/null @@ -1,120 +0,0 @@ -From fadbc7e32e42f1a4199b854a895744f026803320 Mon Sep 17 00:00:00 2001 -From: "Daniel P. Smith" -Date: Tue, 29 Oct 2024 16:31:38 +0100 -Subject: [PATCH 49/83] x86/boot: Fix XSM module handling during PVH boot -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -As detailed in commit 0fe607b2a144 ("x86/boot: Fix PVH boot during boot_info -transition period"), the use of __va(mbi->mods_addr) constitutes a -use-after-free on the PVH boot path. - -This pattern has been in use since before PVH support was added. This has -most likely gone unnoticed because no-one's tried using a detached Flask -policy in a PVH VM before. - -Plumb the boot_info pointer down, replacing module_map and mbi. Importantly, -bi->mods[].mod is a safe way to access the module list during PVH boot. 
- -As this is the final non-bi use of mbi in __start_xen(), make the pointer -unusable once bi has been established, to prevent new uses creeping back in. -This is a stopgap until mbi can be fully removed. - -Signed-off-by: Daniel P. Smith -Signed-off-by: Andrew Cooper -Reviewed-by: Daniel P. Smith -Acked-by: Roger Pau Monné -master commit: 6cf0aaeb8df951fb34679f0408461a5c67cb02c6 -master date: 2024-10-23 18:14:24 +0100 ---- - xen/arch/x86/setup.c | 2 +- - xen/include/xsm/xsm.h | 7 +++++-- - xen/xsm/xsm_core.c | 7 ++++--- - xen/xsm/xsm_policy.c | 2 +- - 4 files changed, 11 insertions(+), 7 deletions(-) - -diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c -index 9e5e871b31..89482140cf 100644 ---- a/xen/arch/x86/setup.c -+++ b/xen/arch/x86/setup.c -@@ -1792,7 +1792,7 @@ void asmlinkage __init noreturn __start_xen(unsigned long mbi_p) - mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges", - RANGESETF_prettyprint_hex); - -- xsm_multiboot_init(module_map, mbi); -+ xsm_multiboot_init(module_map, mbi, mod); - - /* - * IOMMU-related ACPI table parsing may require some of the system domains -diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h -index 627c0d2731..5867ccceaf 100644 ---- a/xen/include/xsm/xsm.h -+++ b/xen/include/xsm/xsm.h -@@ -779,9 +779,11 @@ static inline int xsm_argo_send(const struct domain *d, const struct domain *t) - - #ifdef CONFIG_MULTIBOOT - int xsm_multiboot_init( -- unsigned long *module_map, const multiboot_info_t *mbi); -+ unsigned long *module_map, const multiboot_info_t *mbi, -+ const module_t mods[]); - int xsm_multiboot_policy_init( - unsigned long *module_map, const multiboot_info_t *mbi, -+ const module_t mods[], - void **policy_buffer, size_t *policy_size); - #endif - -@@ -829,7 +831,8 @@ static const inline struct xsm_ops *silo_init(void) - - #ifdef CONFIG_MULTIBOOT - static inline int xsm_multiboot_init ( -- unsigned long *module_map, const multiboot_info_t *mbi) -+ unsigned long *module_map, const multiboot_info_t *mbi, 
-+ const module_t mods[]) - { - return 0; - } -diff --git a/xen/xsm/xsm_core.c b/xen/xsm/xsm_core.c -index eaa028109b..82b0d76d40 100644 ---- a/xen/xsm/xsm_core.c -+++ b/xen/xsm/xsm_core.c -@@ -140,7 +140,8 @@ static int __init xsm_core_init(const void *policy_buffer, size_t policy_size) - - #ifdef CONFIG_MULTIBOOT - int __init xsm_multiboot_init( -- unsigned long *module_map, const multiboot_info_t *mbi) -+ unsigned long *module_map, const multiboot_info_t *mbi, -+ const module_t mods[]) - { - int ret = 0; - void *policy_buffer = NULL; -@@ -150,8 +151,8 @@ int __init xsm_multiboot_init( - - if ( XSM_MAGIC ) - { -- ret = xsm_multiboot_policy_init(module_map, mbi, &policy_buffer, -- &policy_size); -+ ret = xsm_multiboot_policy_init(module_map, mbi, mods, -+ &policy_buffer, &policy_size); - if ( ret ) - { - bootstrap_map(NULL); -diff --git a/xen/xsm/xsm_policy.c b/xen/xsm/xsm_policy.c -index 8dafbc9381..9244a3612d 100644 ---- a/xen/xsm/xsm_policy.c -+++ b/xen/xsm/xsm_policy.c -@@ -32,10 +32,10 @@ - #ifdef CONFIG_MULTIBOOT - int __init xsm_multiboot_policy_init( - unsigned long *module_map, const multiboot_info_t *mbi, -+ const module_t mod[], - void **policy_buffer, size_t *policy_size) - { - int i; -- module_t *mod = (module_t *)__va(mbi->mods_addr); - int rc = 0; - u32 *_policy_start; - unsigned long _policy_len; --- -2.47.0 - diff --git a/0050-Config-Update-MiniOS-revision.patch b/0050-Config-Update-MiniOS-revision.patch deleted file mode 100644 index 57e43c7..0000000 --- a/0050-Config-Update-MiniOS-revision.patch +++ /dev/null @@ -1,28 +0,0 @@ -From 47cdc5fe71bc9d422827d66a7bea15c9b7d32252 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Wed, 30 Oct 2024 18:00:22 +0000 -Subject: [PATCH 50/83] Config: Update MiniOS revision - -Commit a400dd517068 ("Add missing symbol exports for grub-pv"). 
- -Signed-off-by: Andrew Cooper ---- - Config.mk | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/Config.mk b/Config.mk -index 03a89624c7..a35e821491 100644 ---- a/Config.mk -+++ b/Config.mk -@@ -224,7 +224,7 @@ QEMU_UPSTREAM_URL ?= https://xenbits.xen.org/git-http/qemu-xen.git - QEMU_UPSTREAM_REVISION ?= qemu-xen-4.19.0 - - MINIOS_UPSTREAM_URL ?= https://xenbits.xen.org/git-http/mini-os.git --MINIOS_UPSTREAM_REVISION ?= xen-RELEASE-4.19.0 -+MINIOS_UPSTREAM_REVISION ?= a400dd51706867565ed1382b23d3475bb30668c2 - - SEABIOS_UPSTREAM_URL ?= https://xenbits.xen.org/git-http/seabios.git - SEABIOS_UPSTREAM_REVISION ?= rel-1.16.3 --- -2.47.0 - diff --git a/0050-xen-sched-fix-arinc653-to-not-use-variables-across-c.patch b/0050-xen-sched-fix-arinc653-to-not-use-variables-across-c.patch new file mode 100644 index 0000000..73fc69a --- /dev/null +++ b/0050-xen-sched-fix-arinc653-to-not-use-variables-across-c.patch @@ -0,0 +1,109 @@ +From b65a10814a3693c521b09bef62b5a8f0ce737654 Mon Sep 17 00:00:00 2001 +From: Juergen Gross +Date: Thu, 20 Mar 2025 13:20:14 +0100 +Subject: [PATCH 50/53] xen/sched: fix arinc653 to not use variables across + cpupools + +a653sched_do_schedule() is using two function local static variables, +which is resulting in bad behavior when using more than one cpupool +with the arinc653 scheduler. + +Fix that by moving those variables to the scheduler private data. 
+ +Fixes: 22787f2e107c ("ARINC 653 scheduler") +Reported-by: Choi Anderson +Signed-off-by: Juergen Gross +Reviewed-by: Andrew Cooper +Acked-by: Nathan Studer +master commit: d0561ac8ab0e780b1e8ab41d0d15e9f9b076dee3 +master date: 2025-03-14 10:17:11 +0100 +--- + xen/common/sched/arinc653.c | 31 ++++++++++++++++++------------- + 1 file changed, 18 insertions(+), 13 deletions(-) + +diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c +index a82c0d7314..9ebae6d7ae 100644 +--- a/xen/common/sched/arinc653.c ++++ b/xen/common/sched/arinc653.c +@@ -143,6 +143,12 @@ typedef struct a653sched_priv_s + * pointers to all Xen UNIT structures for iterating through + */ + struct list_head unit_list; ++ ++ /** ++ * scheduling house keeping variables ++ */ ++ unsigned int sched_index; ++ s_time_t next_switch_time; + } a653sched_priv_t; + + /************************************************************************** +@@ -513,8 +519,6 @@ a653sched_do_schedule( + bool tasklet_work_scheduled) + { + struct sched_unit *new_task = NULL; +- static unsigned int sched_index = 0; +- static s_time_t next_switch_time; + a653sched_priv_t *sched_priv = SCHED_PRIV(ops); + const unsigned int cpu = sched_get_resource_cpu(smp_processor_id()); + unsigned long flags; +@@ -528,18 +532,19 @@ a653sched_do_schedule( + /* time to enter a new major frame + * the first time this function is called, this will be true */ + /* start with the first domain in the schedule */ +- sched_index = 0; ++ sched_priv->sched_index = 0; + sched_priv->next_major_frame = now + sched_priv->major_frame; +- next_switch_time = now + sched_priv->schedule[0].runtime; ++ sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime; + } + else + { +- while ( (now >= next_switch_time) +- && (sched_index < sched_priv->num_schedule_entries) ) ++ while ( (now >= sched_priv->next_switch_time) && ++ (sched_priv->sched_index < sched_priv->num_schedule_entries) ) + { + /* time to switch to the next domain in this major 
frame */ +- sched_index++; +- next_switch_time += sched_priv->schedule[sched_index].runtime; ++ sched_priv->sched_index++; ++ sched_priv->next_switch_time += ++ sched_priv->schedule[sched_priv->sched_index].runtime; + } + } + +@@ -547,8 +552,8 @@ a653sched_do_schedule( + * If we exhausted the domains in the schedule and still have time left + * in the major frame then switch next at the next major frame. + */ +- if ( sched_index >= sched_priv->num_schedule_entries ) +- next_switch_time = sched_priv->next_major_frame; ++ if ( sched_priv->sched_index >= sched_priv->num_schedule_entries ) ++ sched_priv->next_switch_time = sched_priv->next_major_frame; + + /* + * If there are more domains to run in the current major frame, set +@@ -556,8 +561,8 @@ a653sched_do_schedule( + * Otherwise, set new_task equal to the address of the idle task's + * sched_unit structure. + */ +- new_task = (sched_index < sched_priv->num_schedule_entries) +- ? sched_priv->schedule[sched_index].unit ++ new_task = (sched_priv->sched_index < sched_priv->num_schedule_entries) ++ ? sched_priv->schedule[sched_priv->sched_index].unit + : IDLETASK(cpu); + + /* Check to see if the new task can be run (awake & runnable). */ +@@ -589,7 +594,7 @@ a653sched_do_schedule( + * Return the amount of time the next domain has to run and the address + * of the selected task's UNIT structure. 
+ */ +- prev->next_time = next_switch_time - now; ++ prev->next_time = sched_priv->next_switch_time - now; + prev->next_task = new_task; + new_task->migrated = false; + +-- +2.48.1 + diff --git a/0051-CI-Resync-.cirrus.yml-for-FreeBSD-testing.patch b/0051-CI-Resync-.cirrus.yml-for-FreeBSD-testing.patch deleted file mode 100644 index 7bc68d7..0000000 --- a/0051-CI-Resync-.cirrus.yml-for-FreeBSD-testing.patch +++ /dev/null @@ -1,29 +0,0 @@ -From 3ba995ab8d0ccdee6f99a330db7be96886c05d5e Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Mon, 11 Nov 2024 16:59:23 +0000 -Subject: [PATCH 51/83] CI: Resync .cirrus.yml for FreeBSD testing - -Includes: - commit ebb7c6b2faf2 ("cirrus-ci: update to FreeBSD 14.1 image") - -Signed-off-by: Andrew Cooper ---- - .cirrus.yml | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/.cirrus.yml b/.cirrus.yml -index 72227916c7..1c2a6cb812 100644 ---- a/.cirrus.yml -+++ b/.cirrus.yml -@@ -23,7 +23,7 @@ task: - task: - name: 'FreeBSD 14' - freebsd_instance: -- image_family: freebsd-14-0 -+ image_family: freebsd-14-1 - << : *FREEBSD_TEMPLATE - - task: --- -2.47.0 - diff --git a/0051-x86-ioremap-prevent-additions-against-the-NULL-point.patch b/0051-x86-ioremap-prevent-additions-against-the-NULL-point.patch new file mode 100644 index 0000000..59f4cb7 --- /dev/null +++ b/0051-x86-ioremap-prevent-additions-against-the-NULL-point.patch @@ -0,0 +1,89 @@ +From 69754ede24a44c3c3aafdb77d4a203810f00ba9e Mon Sep 17 00:00:00 2001 +From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= +Date: Thu, 20 Mar 2025 13:20:51 +0100 +Subject: [PATCH 51/53] x86/ioremap: prevent additions against the NULL pointer +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +This was reported by clang UBSAN as: + +UBSAN: Undefined behaviour in arch/x86/mm.c:6297:40 +applying zero offset to null pointer +[...] 
+Xen call trace: + [] R common/ubsan/ubsan.c#ubsan_epilogue+0xa/0xc0 + [] F __ubsan_handle_pointer_overflow+0xcb/0x100 + [] F ioremap_wc+0xc8/0xe0 + [] F video_init+0xd0/0x180 + [] F console_init_preirq+0x3d/0x220 + [] F __start_xen+0x68e/0x5530 + [] F __high_start+0x8e/0x90 + +Fix bt_ioremap() and ioremap{,_wc}() to not add the offset if the returned +pointer from __vmap() is NULL. + +Fixes: d0d4635d034f ('implement vmap()') +Fixes: f390941a92f1 ('x86/DMI: fix table mapping when one lives above 1Mb') +Fixes: 81d195c6c0e2 ('x86: introduce ioremap_wc()') +Signed-off-by: Roger Pau Monné +Reviewed-by: Andrew Cooper +master commit: 9a6f2c52f75781acda39fab5cc96d1bcc54bf534 +master date: 2025-03-17 13:33:29 +0100 +--- + xen/arch/x86/dmi_scan.c | 7 +++++-- + xen/arch/x86/mm.c | 6 ++++-- + 2 files changed, 9 insertions(+), 4 deletions(-) + +diff --git a/xen/arch/x86/dmi_scan.c b/xen/arch/x86/dmi_scan.c +index 81f80c053a..b517c068b8 100644 +--- a/xen/arch/x86/dmi_scan.c ++++ b/xen/arch/x86/dmi_scan.c +@@ -113,6 +113,7 @@ static const void *__init bt_ioremap(paddr_t addr, unsigned int len) + { + mfn_t mfn = _mfn(PFN_DOWN(addr)); + unsigned int offs = PAGE_OFFSET(addr); ++ void *va; + + if ( addr + len <= MB(1) ) + return __va(addr); +@@ -120,8 +121,10 @@ static const void *__init bt_ioremap(paddr_t addr, unsigned int len) + if ( system_state < SYS_STATE_boot ) + return __acpi_map_table(addr, len); + +- return __vmap(&mfn, PFN_UP(offs + len), 1, 1, PAGE_HYPERVISOR_RO, +- VMAP_DEFAULT) + offs; ++ va = __vmap(&mfn, PFN_UP(offs + len), 1, 1, PAGE_HYPERVISOR_RO, ++ VMAP_DEFAULT); ++ ++ return va ? 
va + offs : NULL; + } + + static void __init bt_iounmap(const void *ptr, unsigned int len) +diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c +index d65cf9fa71..2d9c7a5316 100644 +--- a/xen/arch/x86/mm.c ++++ b/xen/arch/x86/mm.c +@@ -6015,7 +6015,9 @@ void __iomem *ioremap(paddr_t pa, size_t len) + unsigned int offs = pa & (PAGE_SIZE - 1); + unsigned int nr = PFN_UP(offs + len); + +- va = __vmap(&mfn, nr, 1, 1, PAGE_HYPERVISOR_UCMINUS, VMAP_DEFAULT) + offs; ++ va = __vmap(&mfn, nr, 1, 1, PAGE_HYPERVISOR_UCMINUS, VMAP_DEFAULT); ++ if ( va ) ++ va += offs; + } + + return (void __force __iomem *)va; +@@ -6032,7 +6034,7 @@ void __iomem *__init ioremap_wc(paddr_t pa, size_t len) + + va = __vmap(&mfn, nr, 1, 1, PAGE_HYPERVISOR_WC, VMAP_DEFAULT); + +- return (void __force __iomem *)(va + offs); ++ return (void __force __iomem *)(va ? va + offs : NULL); + } + + int create_perdomain_mapping(struct domain *d, unsigned long va, +-- +2.48.1 + diff --git a/0052-automation-add-x86_64-xilinx-smoke-test.patch b/0052-automation-add-x86_64-xilinx-smoke-test.patch deleted file mode 100644 index d23dd13..0000000 --- a/0052-automation-add-x86_64-xilinx-smoke-test.patch +++ /dev/null @@ -1,211 +0,0 @@ -From a0e776530c9dbb68c34a12b5f1bba46efe75dd93 Mon Sep 17 00:00:00 2001 -From: Victor Lira -Date: Fri, 26 Jul 2024 18:56:39 -0700 -Subject: [PATCH 52/83] automation: add x86_64 xilinx smoke test - -Add a test script and related job for running x86_64 dom0 tests. 
- -Signed-off-by: Victor Lira -Reviewed-by: Stefano Stabellini -(cherry picked from commit 6979e17b3f8a18d2ba5dbd4f0623c4061dae0dfc) ---- - automation/gitlab-ci/test.yaml | 24 +++ - .../scripts/xilinx-smoke-dom0-x86_64.sh | 144 ++++++++++++++++++ - 2 files changed, 168 insertions(+) - create mode 100755 automation/scripts/xilinx-smoke-dom0-x86_64.sh - -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index d89e41f244..4e74946419 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -96,6 +96,22 @@ - tags: - - xilinx - -+.xilinx-x86_64: -+ extends: .test-jobs-common -+ variables: -+ CONTAINER: ubuntu:xenial-xilinx -+ LOGFILE: xilinx-smoke-x86_64.log -+ artifacts: -+ paths: -+ - smoke.serial -+ - '*.log' -+ when: always -+ only: -+ variables: -+ - $XILINX_JOBS == "true" && $CI_COMMIT_REF_PROTECTED == "true" -+ tags: -+ - xilinx -+ - .adl-x86-64: - extends: .test-jobs-common - variables: -@@ -159,6 +175,14 @@ xilinx-smoke-dom0less-arm64-gcc-debug-gem-passthrough: - - *arm64-test-needs - - alpine-3.18-gcc-debug-arm64 - -+xilinx-smoke-dom0-x86_64-gcc-debug: -+ extends: .xilinx-x86_64 -+ script: -+ - ./automation/scripts/xilinx-smoke-dom0-x86_64.sh ping 2>&1 | tee ${LOGFILE} -+ needs: -+ - *x86-64-test-needs -+ - alpine-3.18-gcc-debug -+ - adl-smoke-x86-64-gcc-debug: - extends: .adl-x86-64 - script: -diff --git a/automation/scripts/xilinx-smoke-dom0-x86_64.sh b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -new file mode 100755 -index 0000000000..e6e6fac6a5 ---- /dev/null -+++ b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -@@ -0,0 +1,144 @@ -+#!/bin/sh -+ -+# Run x86_64 dom0 tests on hardware. -+ -+set -ex -+ -+fatal() { -+ echo "$(basename "$0") $*" >&2 -+ exit 1 -+} -+ -+# Test parameter defaults. 
-+TEST="$1" -+PASS_MSG="Test passed: ${TEST}" -+XEN_CMD_CONSOLE="console=com1 com1=115200,8n1,0x3F8,4" -+XEN_CMD_DOM0="dom0=pvh dom0_max_vcpus=4 dom0_mem=4G" -+XEN_CMD_XEN="sched=null loglvl=all guest_loglvl=all console_timestamps=boot" -+XEN_CMD_EXTRA="" -+DOM0_CMD="" -+DOMU_CMD="" -+DOMU_CFG=' -+type = "pvh" -+name = "domU" -+kernel = "/boot/vmlinuz" -+ramdisk = "/boot/initrd-domU" -+extra = "root=/dev/ram0 console=hvc0" -+memory = 512 -+vif = [ "bridge=xenbr0", ] -+disk = [ ] -+' -+TIMEOUT_SECONDS=120 -+ -+# Select test variant. -+if [ "${TEST}" = "ping" ]; then -+ DOMU_MSG="domU online" -+ DOMU_CMD=" -+ifconfig eth0 192.168.0.2 -+until ping -c 10 192.168.0.1; do -+ sleep 1 -+done -+echo \"${DOMU_MSG}\" -+" -+ DOM0_CMD=" -+set +x -+until grep -q \"${DOMU_MSG}\" /var/log/xen/console/guest-domU.log; do -+ sleep 1 -+done -+set -x -+echo \"${PASS_MSG}\" -+" -+else -+ fatal "Unknown test: ${TEST}" -+fi -+ -+# Set up domU rootfs. -+mkdir -p rootfs -+cd rootfs -+tar xzf ../binaries/initrd.tar.gz -+mkdir proc -+mkdir run -+mkdir srv -+mkdir sys -+rm var/run -+echo "#!/bin/sh -+ -+${DOMU_CMD} -+" > etc/local.d/xen.start -+chmod +x etc/local.d/xen.start -+echo "rc_verbose=yes" >> etc/rc.conf -+sed -i -e 's/^Welcome/domU \0/' etc/issue -+find . | cpio -H newc -o | gzip > ../binaries/domU-rootfs.cpio.gz -+cd .. -+rm -rf rootfs -+ -+# Set up dom0 rootfs. -+mkdir -p rootfs -+cd rootfs -+tar xzf ../binaries/initrd.tar.gz -+mkdir boot -+mkdir proc -+mkdir run -+mkdir srv -+mkdir sys -+rm var/run -+cp -ar ../binaries/dist/install/* . 
-+echo "#!/bin/bash -+ -+export LD_LIBRARY_PATH=/usr/local/lib -+bash /etc/init.d/xencommons start -+ -+brctl addbr xenbr0 -+brctl addif xenbr0 eth0 -+ifconfig eth0 up -+ifconfig xenbr0 up -+ifconfig xenbr0 192.168.0.1 -+ -+# get domU console content into test log -+tail -F /var/log/xen/console/guest-domU.log 2>/dev/null | sed -e \"s/^/(domU) /\" & -+xl create /etc/xen/domU.cfg -+${DOM0_CMD} -+" > etc/local.d/xen.start -+chmod +x etc/local.d/xen.start -+echo "${DOMU_CFG}" > etc/xen/domU.cfg -+echo "rc_verbose=yes" >> etc/rc.conf -+echo "XENCONSOLED_TRACE=all" >> etc/default/xencommons -+echo "QEMU_XEN=/bin/false" >> etc/default/xencommons -+mkdir -p var/log/xen/console -+cp ../binaries/bzImage boot/vmlinuz -+cp ../binaries/domU-rootfs.cpio.gz boot/initrd-domU -+find . | cpio -H newc -o | gzip > ../binaries/dom0-rootfs.cpio.gz -+cd .. -+ -+# Load software into TFTP server directory. -+TFTP="/scratch/gitlab-runner/tftp" -+XEN_CMDLINE="${XEN_CMD_CONSOLE} ${XEN_CMD_XEN} ${XEN_CMD_DOM0} ${XEN_CMD_EXTRA}" -+cp -f binaries/xen ${TFTP}/pxelinux.cfg/xen -+cp -f binaries/bzImage ${TFTP}/pxelinux.cfg/vmlinuz -+cp -f binaries/dom0-rootfs.cpio.gz ${TFTP}/pxelinux.cfg/initrd-dom0 -+echo " -+net_default_server=10.0.6.1 -+multiboot2 (tftp)/pxelinux.cfg/xen ${XEN_CMDLINE} -+module2 (tftp)/pxelinux.cfg/vmlinuz console=hvc0 root=/dev/ram0 earlyprintk=xen -+module2 (tftp)/pxelinux.cfg/initrd-dom0 -+boot -+" > ${TFTP}/pxelinux.cfg/grub.cfg -+ -+# Power cycle board and collect serial port output. 
-+SERIAL_CMD="cat /dev/ttyUSB9 | tee smoke.serial | sed 's/\r//'" -+sh /scratch/gitlab-runner/v2000a.sh 2 -+sleep 5 -+sh /scratch/gitlab-runner/v2000a.sh 1 -+sleep 5 -+set +e -+stty -F /dev/ttyUSB9 115200 -+timeout -k 1 ${TIMEOUT_SECONDS} nohup sh -c "${SERIAL_CMD}" -+sh /scratch/gitlab-runner/v2000a.sh 2 -+ -+set -e -+ -+if grep -q "${PASS_MSG}" smoke.serial; then -+ exit 0 -+fi -+ -+fatal "Test failed" --- -2.47.0 - diff --git a/0052-x86-mm-Fix-IS_ALIGNED-check-in-IS_LnE_ALIGNED.patch b/0052-x86-mm-Fix-IS_ALIGNED-check-in-IS_LnE_ALIGNED.patch new file mode 100644 index 0000000..f67137a --- /dev/null +++ b/0052-x86-mm-Fix-IS_ALIGNED-check-in-IS_LnE_ALIGNED.patch @@ -0,0 +1,61 @@ +From 4d91aaf3494ed7ef25c12564b3b3a4f8a933bc26 Mon Sep 17 00:00:00 2001 +From: Andrew Cooper +Date: Thu, 20 Mar 2025 13:21:12 +0100 +Subject: [PATCH 52/53] x86/mm: Fix IS_ALIGNED() check in IS_LnE_ALIGNED() +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit + +The current CI failures turn out to be a latent bug triggered by a narrow set +of properties of the initrd and the host memory map, which CI encountered by +chance. + +One step during boot involves constructing directmap mappings for modules. +With some probing at the point of creation, it is observed that there's a 4k +mapping missing towards the end of the initrd. + + (XEN) === Mapped Mod1 [0000000394001000, 00000003be1ff6dc] to Directmap + (XEN) Probing paddr 394001000, va ffff830394001000 + (XEN) Probing paddr 3be1ff6db, va ffff8303be1ff6db + (XEN) Probing paddr 3bdffffff, va ffff8303bdffffff + (XEN) Probing paddr 3be001000, va ffff8303be001000 + (XEN) Probing paddr 3be000000, va ffff8303be000000 + (XEN) Early fatal page fault at e008:ffff82d04032014c (cr2=ffff8303be000000, ec=0000) + +The conditions for this bug appear to be map_pages_to_xen() call with a start +address of exactly 4k beyond a 2M boundary, some number of full 2M pages, then +a tail needing 4k pages. 
+ +Anyway, the condition for spotting superpage boundaries in map_pages_to_xen() +is wrong. The IS_ALIGNED() macro expects a power of two for the alignment +argument, and subtracts 1 itself. + +Fixing this causes the failing case to now boot. + +Fixes: 97fb6fcf26e8 ("x86/mm: introduce helpers to detect super page alignment") +Debugged-by: Marek Marczykowski-Górecki +Signed-off-by: Andrew Cooper +Tested-by: Marek Marczykowski-Górecki +Reviewed-by: Jan Beulich +master commit: b07c7d63f9b587e4df5d71f6da9eaa433512c974 +master date: 2025-03-19 14:53:28 +0000 +--- + xen/arch/x86/mm.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c +index 2d9c7a5316..c3e15a029b 100644 +--- a/xen/arch/x86/mm.c ++++ b/xen/arch/x86/mm.c +@@ -5236,7 +5236,7 @@ int map_pages_to_xen( + \ + ASSERT(!mfn_eq(m_, INVALID_MFN)); \ + IS_ALIGNED(PFN_DOWN(v) | mfn_x(m_), \ +- (1UL << (PAGETABLE_ORDER * ((n) - 1))) - 1); \ ++ 1UL << (PAGETABLE_ORDER * ((n) - 1))); \ + }) + #define IS_L2E_ALIGNED(v, m) IS_LnE_ALIGNED(v, m, 2) + #define IS_L3E_ALIGNED(v, m) IS_LnE_ALIGNED(v, m, 3) +-- +2.48.1 + diff --git a/0053-automation-add-default-QEMU_TIMEOUT-value-if-not-alr.patch b/0053-automation-add-default-QEMU_TIMEOUT-value-if-not-alr.patch deleted file mode 100644 index 9cd1dea..0000000 --- a/0053-automation-add-default-QEMU_TIMEOUT-value-if-not-alr.patch +++ /dev/null @@ -1,35 +0,0 @@ -From cbea75a3cd339d5a28abbd2a0ae08460e4a8e395 Mon Sep 17 00:00:00 2001 -From: Stefano Stabellini -Date: Thu, 15 Aug 2024 18:00:34 -0700 -Subject: [PATCH 53/83] automation: add default QEMU_TIMEOUT value if not - already set - -The expectation is that QEMU_TIMEOUT should be set as a Gitlab CI/CD -variable but if not we should be able to run the pipeline anyway. 
- -Signed-off-by: Stefano Stabellini -Reviewed-by: Michal Orzel -(cherry picked from commit 1e2a5f991f86979b89aa9a60ca3ba8106ee7d987) ---- - automation/scripts/qemu-key.exp | 6 +++++- - 1 file changed, 5 insertions(+), 1 deletion(-) - -diff --git a/automation/scripts/qemu-key.exp b/automation/scripts/qemu-key.exp -index 35eb903a31..787f1f08cb 100755 ---- a/automation/scripts/qemu-key.exp -+++ b/automation/scripts/qemu-key.exp -@@ -1,6 +1,10 @@ - #!/usr/bin/expect -f - --set timeout $env(QEMU_TIMEOUT) -+if {[info exists env(QEMU_TIMEOUT)]} { -+ set timeout $env(QEMU_TIMEOUT) -+} else { -+ set timeout 1500 -+} - - log_file -a $env(QEMU_LOG) - --- -2.47.0 - diff --git a/0053-xen-arinc653-call-xfree-with-local-IRQ-enabled.patch b/0053-xen-arinc653-call-xfree-with-local-IRQ-enabled.patch new file mode 100644 index 0000000..2d1ae22 --- /dev/null +++ b/0053-xen-arinc653-call-xfree-with-local-IRQ-enabled.patch @@ -0,0 +1,53 @@ +From ce591a92ca50d2b8851469006a7d7824445b5dbc Mon Sep 17 00:00:00 2001 +From: Anderson Choi +Date: Thu, 20 Mar 2025 13:21:30 +0100 +Subject: [PATCH 53/53] xen/arinc653: call xfree() with local IRQ enabled + +xen panic is observed with the following configuration. + +1. Debug xen build (CONFIG_DEBUG=y) +2. dom1 of an ARINC653 domain +3. shutdown dom1 with xl command + +$ xl shutdown + +(XEN) **************************************** +(XEN) Panic on CPU 2: +(XEN) Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:714 +(XEN) **************************************** + +panic was triggered since xfree() was called with local IRQ disabled and +therefore assertion failed. + +Fix this by calling xfree() after local IRQ is enabled. 
+ +Fixes: 19049f8d796a sched: fix locking in a653sched_free_vdata() +Signed-off-by: Anderson Choi +Reviewed-by: Juergen Gross +Acked-by: Nathan Studer +master commit: 3ee55c9543fcf0b35593f030b53f56f3222046b7 +master date: 2025-03-19 16:44:00 +0000 +--- + xen/common/sched/arinc653.c | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c +index 9ebae6d7ae..930361fa5c 100644 +--- a/xen/common/sched/arinc653.c ++++ b/xen/common/sched/arinc653.c +@@ -463,10 +463,11 @@ a653sched_free_udata(const struct scheduler *ops, void *priv) + if ( !is_idle_unit(av->unit) ) + list_del(&av->list); + +- xfree(av); + update_schedule_units(ops); + + spin_unlock_irqrestore(&sched_priv->lock, flags); ++ ++ xfree(av); + } + + /** +-- +2.48.1 + diff --git a/0054-automation-restore-CR-filtering.patch b/0054-automation-restore-CR-filtering.patch deleted file mode 100644 index c515282..0000000 --- a/0054-automation-restore-CR-filtering.patch +++ /dev/null @@ -1,118 +0,0 @@ -From 59ac149af9fc69358f7fe6477ed9b90ea8d83915 Mon Sep 17 00:00:00 2001 -From: Stefano Stabellini -Date: Wed, 21 Aug 2024 13:29:58 -0700 -Subject: [PATCH 54/83] automation: restore CR filtering - -After commit c36efb7fcea6 ("automation: use expect to run QEMU") we lost -the \r filtering introduced by b576497e3b7d ("automation: remove CR -characters from serial output"). This patch reintroduced it. 
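The effect of the restored filter can be checked in isolation. The sketch below assumes GNU sed (the `\r` and `\+` escapes are GNU extensions); `strip_trailing_cr` is a hypothetical wrapper around the same expression the patch appends to each script.

```shell
# Hypothetical wrapper around the filter used in the patch.
strip_trailing_cr() {
    # Remove one or more carriage returns at the end of each line.
    sed 's/\r\+$//'
}

# Example: serial output often ends lines with "\r\n" or even "\r\r\n".
printf 'Domain-0\r\r\nBusyBox\r\n' | strip_trailing_cr
```

Only carriage returns at the end of a line are removed; a `\r` in the middle of a line is left alone.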
- -Fixes: c36efb7fcea6 ("automation: use expect to run QEMU") -Signed-off-by: Stefano Stabellini -Reviewed-by: Michal Orzel -(cherry picked from commit aa80a04df488528d90a0d892f0752571b1759e8b) ---- - automation/scripts/qemu-alpine-x86_64.sh | 2 +- - automation/scripts/qemu-smoke-dom0-arm32.sh | 2 +- - automation/scripts/qemu-smoke-dom0-arm64.sh | 2 +- - automation/scripts/qemu-smoke-dom0less-arm32.sh | 2 +- - automation/scripts/qemu-smoke-dom0less-arm64.sh | 2 +- - automation/scripts/qemu-smoke-ppc64le.sh | 2 +- - automation/scripts/qemu-smoke-riscv64.sh | 2 +- - automation/scripts/qemu-smoke-x86-64.sh | 2 +- - automation/scripts/qemu-xtf-dom0less-arm64.sh | 2 +- - 9 files changed, 9 insertions(+), 9 deletions(-) - -diff --git a/automation/scripts/qemu-alpine-x86_64.sh b/automation/scripts/qemu-alpine-x86_64.sh -index 5359e0820b..42a89e86b0 100755 ---- a/automation/scripts/qemu-alpine-x86_64.sh -+++ b/automation/scripts/qemu-alpine-x86_64.sh -@@ -89,4 +89,4 @@ export QEMU_LOG="smoke.serial" - export LOG_MSG="Domain-0" - export PASSED="BusyBox" - --./automation/scripts/qemu-key.exp -+./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0-arm32.sh b/automation/scripts/qemu-smoke-dom0-arm32.sh -index bab66bfe44..7f3d520d9b 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm32.sh -@@ -96,4 +96,4 @@ export QEMU_LOG="${serial_log}" - export LOG_MSG="Domain-0" - export PASSED="/ #" - --../automation/scripts/qemu-key.exp -+../automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0-arm64.sh b/automation/scripts/qemu-smoke-dom0-arm64.sh -index 0094bfc8e1..e0cea742b0 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm64.sh -@@ -109,4 +109,4 @@ export QEMU_LOG="smoke.serial" - export LOG_MSG="Domain-0" - export PASSED="BusyBox" - --./automation/scripts/qemu-key.exp 
-+./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh b/automation/scripts/qemu-smoke-dom0less-arm32.sh -index 68ffbabdb8..e824cb7c2a 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh -@@ -149,4 +149,4 @@ export QEMU_LOG="${serial_log}" - export LOG_MSG="${dom0_prompt}" - export PASSED="${passed}" - --../automation/scripts/qemu-key.exp -+../automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh b/automation/scripts/qemu-smoke-dom0less-arm64.sh -index eb25c4af4b..f42ba5d196 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh -@@ -220,4 +220,4 @@ export QEMU_LOG="smoke.serial" - export LOG_MSG="Welcome to Alpine Linux" - export PASSED="${passed}" - --./automation/scripts/qemu-key.exp -+./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-ppc64le.sh b/automation/scripts/qemu-smoke-ppc64le.sh -index ccb4a576f4..594f92c19c 100755 ---- a/automation/scripts/qemu-smoke-ppc64le.sh -+++ b/automation/scripts/qemu-smoke-ppc64le.sh -@@ -25,4 +25,4 @@ export QEMU_CMD="qemu-system-ppc64 \ - export QEMU_LOG="${serial_log}" - export PASSED="Hello, ppc64le!" 
- --./automation/scripts/qemu-key.exp -+./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-riscv64.sh b/automation/scripts/qemu-smoke-riscv64.sh -index 0355c075b7..c2595f657f 100755 ---- a/automation/scripts/qemu-smoke-riscv64.sh -+++ b/automation/scripts/qemu-smoke-riscv64.sh -@@ -16,4 +16,4 @@ export QEMU_CMD="qemu-system-riscv64 \ - export QEMU_LOG="smoke.serial" - export PASSED="All set up" - --./automation/scripts/qemu-key.exp -+./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-x86-64.sh b/automation/scripts/qemu-smoke-x86-64.sh -index 37ac10e068..3440b1761d 100755 ---- a/automation/scripts/qemu-smoke-x86-64.sh -+++ b/automation/scripts/qemu-smoke-x86-64.sh -@@ -24,4 +24,4 @@ export QEMU_CMD="qemu-system-x86_64 -nographic -kernel binaries/xen \ - export QEMU_LOG="smoke.serial" - export PASSED="Test result: SUCCESS" - --./automation/scripts/qemu-key.exp -+./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-xtf-dom0less-arm64.sh b/automation/scripts/qemu-xtf-dom0less-arm64.sh -index 0666f6363e..4042fe5060 100755 ---- a/automation/scripts/qemu-xtf-dom0less-arm64.sh -+++ b/automation/scripts/qemu-xtf-dom0less-arm64.sh -@@ -65,4 +65,4 @@ export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x4000000 - export QEMU_LOG="smoke.serial" - export PASSED="${passed}" - --./automation/scripts/qemu-key.exp -+./automation/scripts/qemu-key.exp | sed 's/\r\+$//' --- -2.47.0 - diff --git a/0055-automation-update-xilinx-test-scripts-tty.patch b/0055-automation-update-xilinx-test-scripts-tty.patch deleted file mode 100644 index 86f812c..0000000 --- a/0055-automation-update-xilinx-test-scripts-tty.patch +++ /dev/null @@ -1,115 +0,0 @@ -From 5efbc09cd76b75de87abd00a2b38306d8c5e966a Mon Sep 17 00:00:00 2001 -From: Victor Lira -Date: Fri, 23 Aug 2024 15:29:04 -0700 -Subject: [PATCH 55/83] automation: update xilinx test scripts (tty) - 
-Update serial device names from ttyUSB* to test board specific names. - -Update xilinx-smoke-dom0-x86_64 with new Xen command line console options, -which are now set as Gitlab CI/CD variables. Abstract the directory where -binaries are stored. Increase the timeout to match new setup. - -Signed-off-by: Victor Lira -Reviewed-by: Stefano Stabellini -(cherry picked from commit 95764a0817a51741b7ffb1f78cba2a19b08ab2d1) ---- - automation/gitlab-ci/test.yaml | 2 ++ - .../scripts/xilinx-smoke-dom0-x86_64.sh | 28 +++++++++---------- - .../scripts/xilinx-smoke-dom0less-arm64.sh | 5 ++-- - 3 files changed, 19 insertions(+), 16 deletions(-) - -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index 4e74946419..3b339f387f 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -101,6 +101,8 @@ - variables: - CONTAINER: ubuntu:xenial-xilinx - LOGFILE: xilinx-smoke-x86_64.log -+ XEN_CMD_CONSOLE: "console=com2 com2=115200,8n1,0x2F8,4" -+ TEST_BOARD: "crater" - artifacts: - paths: - - smoke.serial -diff --git a/automation/scripts/xilinx-smoke-dom0-x86_64.sh b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -index e6e6fac6a5..4559e2b9ee 100755 ---- a/automation/scripts/xilinx-smoke-dom0-x86_64.sh -+++ b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -@@ -12,7 +12,6 @@ fatal() { - # Test parameter defaults. - TEST="$1" - PASS_MSG="Test passed: ${TEST}" --XEN_CMD_CONSOLE="console=com1 com1=115200,8n1,0x3F8,4" - XEN_CMD_DOM0="dom0=pvh dom0_max_vcpus=4 dom0_mem=4G" - XEN_CMD_XEN="sched=null loglvl=all guest_loglvl=all console_timestamps=boot" - XEN_CMD_EXTRA="" -@@ -28,7 +27,7 @@ memory = 512 - vif = [ "bridge=xenbr0", ] - disk = [ ] - ' --TIMEOUT_SECONDS=120 -+TIMEOUT_SECONDS=200 - - # Select test variant. - if [ "${TEST}" = "ping" ]; then -@@ -113,27 +112,28 @@ cd .. - # Load software into TFTP server directory. 
- TFTP="/scratch/gitlab-runner/tftp" - XEN_CMDLINE="${XEN_CMD_CONSOLE} ${XEN_CMD_XEN} ${XEN_CMD_DOM0} ${XEN_CMD_EXTRA}" --cp -f binaries/xen ${TFTP}/pxelinux.cfg/xen --cp -f binaries/bzImage ${TFTP}/pxelinux.cfg/vmlinuz --cp -f binaries/dom0-rootfs.cpio.gz ${TFTP}/pxelinux.cfg/initrd-dom0 -+cp -f binaries/xen ${TFTP}/${TEST_BOARD}/xen -+cp -f binaries/bzImage ${TFTP}/${TEST_BOARD}/vmlinuz -+cp -f binaries/dom0-rootfs.cpio.gz ${TFTP}/${TEST_BOARD}/initrd-dom0 - echo " - net_default_server=10.0.6.1 --multiboot2 (tftp)/pxelinux.cfg/xen ${XEN_CMDLINE} --module2 (tftp)/pxelinux.cfg/vmlinuz console=hvc0 root=/dev/ram0 earlyprintk=xen --module2 (tftp)/pxelinux.cfg/initrd-dom0 -+multiboot2 (tftp)/${TEST_BOARD}/xen ${XEN_CMDLINE} -+module2 (tftp)/${TEST_BOARD}/vmlinuz console=hvc0 root=/dev/ram0 earlyprintk=xen -+module2 (tftp)/${TEST_BOARD}/initrd-dom0 - boot --" > ${TFTP}/pxelinux.cfg/grub.cfg -+" > ${TFTP}/${TEST_BOARD}/grub.cfg - - # Power cycle board and collect serial port output. --SERIAL_CMD="cat /dev/ttyUSB9 | tee smoke.serial | sed 's/\r//'" --sh /scratch/gitlab-runner/v2000a.sh 2 -+SERIAL_DEV="/dev/serial/${TEST_BOARD}" -+SERIAL_CMD="cat ${SERIAL_DEV} | tee smoke.serial | sed 's/\r//'" -+sh /scratch/gitlab-runner/${TEST_BOARD}.sh 2 - sleep 5 --sh /scratch/gitlab-runner/v2000a.sh 1 -+sh /scratch/gitlab-runner/${TEST_BOARD}.sh 1 - sleep 5 - set +e --stty -F /dev/ttyUSB9 115200 -+stty -F ${SERIAL_DEV} 115200 - timeout -k 1 ${TIMEOUT_SECONDS} nohup sh -c "${SERIAL_CMD}" --sh /scratch/gitlab-runner/v2000a.sh 2 -+sh /scratch/gitlab-runner/${TEST_BOARD}.sh 2 - - set -e - -diff --git a/automation/scripts/xilinx-smoke-dom0less-arm64.sh b/automation/scripts/xilinx-smoke-dom0less-arm64.sh -index 666411d6a0..18aa07f0a2 100755 ---- a/automation/scripts/xilinx-smoke-dom0less-arm64.sh -+++ b/automation/scripts/xilinx-smoke-dom0less-arm64.sh -@@ -134,9 +134,10 @@ sleep 5 - cd $START - - # connect to serial -+SERIAL_DEV="/dev/serial/zynq" - set +e --stty -F /dev/ttyUSB0 115200 
--timeout -k 1 120 nohup sh -c "cat /dev/ttyUSB0 | tee smoke.serial | sed 's/\r//'" -+stty -F ${SERIAL_DEV} 115200 -+timeout -k 1 120 nohup sh -c "cat ${SERIAL_DEV} | tee smoke.serial | sed 's/\r//'" - - # stop the board - cd /scratch/gitlab-runner --- -2.47.0 - diff --git a/0056-automation-fix-false-success-in-qemu-tests.patch b/0056-automation-fix-false-success-in-qemu-tests.patch deleted file mode 100644 index 8226b81..0000000 --- a/0056-automation-fix-false-success-in-qemu-tests.patch +++ /dev/null @@ -1,226 +0,0 @@ -From ed130bef9300f23b855eedeba9fc364e47914df0 Mon Sep 17 00:00:00 2001 -From: Victor Lira -Date: Thu, 29 Aug 2024 15:34:22 -0700 -Subject: [PATCH 56/83] automation: fix false success in qemu tests - -Fix flaw in qemu-*.sh tests that producess a false success. The following -lines produces success despite the "expect" script producing nonzero exit -status: - - set +e -... - ./automation/scripts/qemu-key.exp | sed 's/\r\+$//' - (end of file) - -The default exit status for a pipeline using "|" operator is that of the -rightmost command. Fix this by setting the "pipefail" option in the shell, -and removing "set +e" allowing the expect script to determine the result. 
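The exit-status behaviour described above can be reproduced without QEMU. In this sketch, `false` stands in for a failing expect script and `pipeline_status` is a hypothetical helper; it assumes `bash` is available, since the scripts in question are bash scripts.

```shell
# Hypothetical helper: print the exit status of "failing command | sed"
# with pipefail enabled ("on") or left at the shell default (anything else).
pipeline_status() {
    if [ "$1" = "on" ]; then
        # With pipefail, the pipeline reports the failure on its left side.
        bash -c 'set -o pipefail; false | sed "s/\r\+$//"; echo $?'
    else
        # By default, only the status of the rightmost command (sed) counts.
        bash -c 'false | sed "s/\r\+$//"; echo $?'
    fi
}

pipeline_status off
pipeline_status on
```

The first call reports success even though the left-hand command failed, which is the false success the patch removes.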
- -Signed-off-by: Victor Lira -Reviewed-by: Stefano Stabellini -(cherry picked from commit 740c41ca05a83a2c3629eb2ff323877c37d95c1e) ---- - automation/scripts/qemu-alpine-x86_64.sh | 3 +-- - automation/scripts/qemu-key.exp | 2 +- - automation/scripts/qemu-smoke-dom0-arm32.sh | 3 +-- - automation/scripts/qemu-smoke-dom0-arm64.sh | 3 +-- - automation/scripts/qemu-smoke-dom0less-arm32.sh | 3 +-- - automation/scripts/qemu-smoke-dom0less-arm64.sh | 3 +-- - automation/scripts/qemu-smoke-ppc64le.sh | 3 +-- - automation/scripts/qemu-smoke-riscv64.sh | 3 +-- - automation/scripts/qemu-smoke-x86-64.sh | 3 +-- - automation/scripts/qemu-xtf-dom0less-arm64.sh | 3 +-- - 10 files changed, 10 insertions(+), 19 deletions(-) - -diff --git a/automation/scripts/qemu-alpine-x86_64.sh b/automation/scripts/qemu-alpine-x86_64.sh -index 42a89e86b0..93914c41bc 100755 ---- a/automation/scripts/qemu-alpine-x86_64.sh -+++ b/automation/scripts/qemu-alpine-x86_64.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - # DomU Busybox - cd binaries -@@ -76,7 +76,6 @@ EOF - - # Run the test - rm -f smoke.serial --set +e - export QEMU_CMD="qemu-system-x86_64 \ - -cpu qemu64,+svm \ - -m 2G -smp 2 \ -diff --git a/automation/scripts/qemu-key.exp b/automation/scripts/qemu-key.exp -index 787f1f08cb..66c4164831 100755 ---- a/automation/scripts/qemu-key.exp -+++ b/automation/scripts/qemu-key.exp -@@ -14,7 +14,7 @@ eval spawn $env(QEMU_CMD) - - expect_after { - -re "(.*)\r" { -- exp_continue -+ exp_continue -continue_timer - } - timeout {send_error "ERROR-Timeout!\n"; exit 1} - eof {send_error "ERROR-EOF!\n"; exit 1} -diff --git a/automation/scripts/qemu-smoke-dom0-arm32.sh b/automation/scripts/qemu-smoke-dom0-arm32.sh -index 7f3d520d9b..0e758dc8f4 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm32.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - serial_log="$(pwd)/smoke.serial" - -@@ -77,7 +77,6 @@ git clone 
--depth 1 https://gitlab.com/xen-project/imagebuilder.git - bash imagebuilder/scripts/uboot-script-gen -t tftp -d . -c config - - rm -f ${serial_log} --set +e - export QEMU_CMD="./qemu-system-arm \ - -machine virt \ - -machine virtualization=true \ -diff --git a/automation/scripts/qemu-smoke-dom0-arm64.sh b/automation/scripts/qemu-smoke-dom0-arm64.sh -index e0cea742b0..81f210f7f5 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm64.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - # DomU Busybox - cd binaries -@@ -93,7 +93,6 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - - # Run the test - rm -f smoke.serial --set +e - export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt \ -diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh b/automation/scripts/qemu-smoke-dom0less-arm32.sh -index e824cb7c2a..38e8a0b0bd 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - test_variant=$1 - -@@ -130,7 +130,6 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d . 
-c config - - # Run the test - rm -f ${serial_log} --set +e - export QEMU_CMD="./qemu-system-arm \ - -machine virt \ - -machine virtualization=true \ -diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh b/automation/scripts/qemu-smoke-dom0less-arm64.sh -index f42ba5d196..ea67650e17 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - test_variant=$1 - -@@ -204,7 +204,6 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - - # Run the test - rm -f smoke.serial --set +e - export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt,gic-version=$gic_version \ -diff --git a/automation/scripts/qemu-smoke-ppc64le.sh b/automation/scripts/qemu-smoke-ppc64le.sh -index 594f92c19c..49e189c370 100755 ---- a/automation/scripts/qemu-smoke-ppc64le.sh -+++ b/automation/scripts/qemu-smoke-ppc64le.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - serial_log="$(pwd)/smoke.serial" - -@@ -9,7 +9,6 @@ machine=$1 - - # Run the test - rm -f ${serial_log} --set +e - - export QEMU_CMD="qemu-system-ppc64 \ - -bios skiboot.lid \ -diff --git a/automation/scripts/qemu-smoke-riscv64.sh b/automation/scripts/qemu-smoke-riscv64.sh -index c2595f657f..422ee03e0d 100755 ---- a/automation/scripts/qemu-smoke-riscv64.sh -+++ b/automation/scripts/qemu-smoke-riscv64.sh -@@ -1,10 +1,9 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - # Run the test - rm -f smoke.serial --set +e - - export QEMU_CMD="qemu-system-riscv64 \ - -M virt \ -diff --git a/automation/scripts/qemu-smoke-x86-64.sh b/automation/scripts/qemu-smoke-x86-64.sh -index 3440b1761d..7495185d9f 100755 ---- a/automation/scripts/qemu-smoke-x86-64.sh -+++ b/automation/scripts/qemu-smoke-x86-64.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - # variant should be either pv 
or pvh - variant=$1 -@@ -15,7 +15,6 @@ case $variant in - esac - - rm -f smoke.serial --set +e - export QEMU_CMD="qemu-system-x86_64 -nographic -kernel binaries/xen \ - -initrd xtf/tests/example/$k \ - -append \"loglvl=all console=com1 noreboot console_timestamps=boot $extra\" \ -diff --git a/automation/scripts/qemu-xtf-dom0less-arm64.sh b/automation/scripts/qemu-xtf-dom0less-arm64.sh -index 4042fe5060..acef1637e2 100755 ---- a/automation/scripts/qemu-xtf-dom0less-arm64.sh -+++ b/automation/scripts/qemu-xtf-dom0less-arm64.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - # Name of the XTF test - xtf_test=$1 -@@ -50,7 +50,6 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - - # Run the test - rm -f smoke.serial --set +e - export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt \ --- -2.47.0 - diff --git a/0057-automation-use-expect-utility-in-xilinx-tests.patch b/0057-automation-use-expect-utility-in-xilinx-tests.patch deleted file mode 100644 index 63bad97..0000000 --- a/0057-automation-use-expect-utility-in-xilinx-tests.patch +++ /dev/null @@ -1,393 +0,0 @@ -From 9c17da3ea0d04226c584feedf4121691a440b4a6 Mon Sep 17 00:00:00 2001 -From: Victor Lira -Date: Thu, 29 Aug 2024 15:34:23 -0700 -Subject: [PATCH 57/83] automation: use expect utility in xilinx tests - -Fixes: 95764a0817a5 (automation: update xilinx test scripts (tty)) -This patch introduced a CI failure due to a timeout in xilinx-x86_64 test. - -Change xilinx-x86_64 and xilinx-arm64 scripts to use "expect" utility -to determine test result and allow early exit from tests. -Add "expect" to xilinx container environment (dockerfile). -Rename references to "QEMU" in "qemu-key.exp" expect script to "TEST" to be -used by both QEMU and hardware tests. 
- -Signed-off-by: Victor Lira -Reviewed-by: Stefano Stabellini -(cherry picked from commit 5e99a40ea54a6bf0bdc18241992866a642d7782b) ---- - .../build/ubuntu/xenial-xilinx.dockerfile | 1 + - automation/gitlab-ci/test.yaml | 2 ++ - .../scripts/{qemu-key.exp => console.exp} | 8 +++---- - automation/scripts/qemu-alpine-x86_64.sh | 6 ++--- - automation/scripts/qemu-smoke-dom0-arm32.sh | 6 ++--- - automation/scripts/qemu-smoke-dom0-arm64.sh | 6 ++--- - .../scripts/qemu-smoke-dom0less-arm32.sh | 6 ++--- - .../scripts/qemu-smoke-dom0less-arm64.sh | 6 ++--- - automation/scripts/qemu-smoke-ppc64le.sh | 6 ++--- - automation/scripts/qemu-smoke-riscv64.sh | 6 ++--- - automation/scripts/qemu-smoke-x86-64.sh | 6 ++--- - automation/scripts/qemu-xtf-dom0less-arm64.sh | 6 ++--- - .../scripts/xilinx-smoke-dom0-x86_64.sh | 22 +++++++++---------- - .../scripts/xilinx-smoke-dom0less-arm64.sh | 19 ++++++++-------- - 14 files changed, 54 insertions(+), 52 deletions(-) - rename automation/scripts/{qemu-key.exp => console.exp} (83%) - -diff --git a/automation/build/ubuntu/xenial-xilinx.dockerfile b/automation/build/ubuntu/xenial-xilinx.dockerfile -index f03d62e8bd..6107d8b771 100644 ---- a/automation/build/ubuntu/xenial-xilinx.dockerfile -+++ b/automation/build/ubuntu/xenial-xilinx.dockerfile -@@ -20,6 +20,7 @@ RUN apt-get update && \ - git \ - gzip \ - file \ -+ expect \ - && \ - apt-get autoremove -y && \ - apt-get clean && \ -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index 3b339f387f..cecc18a019 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -84,6 +84,7 @@ - variables: - CONTAINER: ubuntu:xenial-xilinx - LOGFILE: qemu-smoke-xilinx.log -+ TEST_TIMEOUT: 120 - artifacts: - paths: - - smoke.serial -@@ -103,6 +104,7 @@ - LOGFILE: xilinx-smoke-x86_64.log - XEN_CMD_CONSOLE: "console=com2 com2=115200,8n1,0x2F8,4" - TEST_BOARD: "crater" -+ TEST_TIMEOUT: 1000 - artifacts: - paths: - - smoke.serial -diff --git 
a/automation/scripts/qemu-key.exp b/automation/scripts/console.exp -similarity index 83% -rename from automation/scripts/qemu-key.exp -rename to automation/scripts/console.exp -index 66c4164831..f538aa6bd0 100755 ---- a/automation/scripts/qemu-key.exp -+++ b/automation/scripts/console.exp -@@ -1,16 +1,16 @@ - #!/usr/bin/expect -f - --if {[info exists env(QEMU_TIMEOUT)]} { -- set timeout $env(QEMU_TIMEOUT) -+if {[info exists env(TEST_TIMEOUT)]} { -+ set timeout $env(TEST_TIMEOUT) - } else { - set timeout 1500 - } - --log_file -a $env(QEMU_LOG) -+log_file -a $env(TEST_LOG) - - match_max 10000 - --eval spawn $env(QEMU_CMD) -+eval spawn $env(TEST_CMD) - - expect_after { - -re "(.*)\r" { -diff --git a/automation/scripts/qemu-alpine-x86_64.sh b/automation/scripts/qemu-alpine-x86_64.sh -index 93914c41bc..1ff689b577 100755 ---- a/automation/scripts/qemu-alpine-x86_64.sh -+++ b/automation/scripts/qemu-alpine-x86_64.sh -@@ -76,7 +76,7 @@ EOF - - # Run the test - rm -f smoke.serial --export QEMU_CMD="qemu-system-x86_64 \ -+export TEST_CMD="qemu-system-x86_64 \ - -cpu qemu64,+svm \ - -m 2G -smp 2 \ - -monitor none -serial stdio \ -@@ -84,8 +84,8 @@ export QEMU_CMD="qemu-system-x86_64 \ - -device virtio-net-pci,netdev=n0 \ - -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0" - --export QEMU_LOG="smoke.serial" -+export TEST_LOG="smoke.serial" - export LOG_MSG="Domain-0" - export PASSED="BusyBox" - --./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+./automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0-arm32.sh b/automation/scripts/qemu-smoke-dom0-arm32.sh -index 0e758dc8f4..b752424cc2 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm32.sh -@@ -77,7 +77,7 @@ git clone --depth 1 https://gitlab.com/xen-project/imagebuilder.git - bash imagebuilder/scripts/uboot-script-gen -t tftp -d . 
-c config - - rm -f ${serial_log} --export QEMU_CMD="./qemu-system-arm \ -+export TEST_CMD="./qemu-system-arm \ - -machine virt \ - -machine virtualization=true \ - -smp 4 \ -@@ -91,8 +91,8 @@ export QEMU_CMD="./qemu-system-arm \ - -bios /usr/lib/u-boot/qemu_arm/u-boot.bin" - - export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" --export QEMU_LOG="${serial_log}" -+export TEST_LOG="${serial_log}" - export LOG_MSG="Domain-0" - export PASSED="/ #" - --../automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+../automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0-arm64.sh b/automation/scripts/qemu-smoke-dom0-arm64.sh -index 81f210f7f5..4d22a124df 100755 ---- a/automation/scripts/qemu-smoke-dom0-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0-arm64.sh -@@ -93,7 +93,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - - # Run the test - rm -f smoke.serial --export QEMU_CMD="./binaries/qemu-system-aarch64 \ -+export TEST_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt \ - -m 2048 -monitor none -serial stdio \ -@@ -104,8 +104,8 @@ export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" - - export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" --export QEMU_LOG="smoke.serial" -+export TEST_LOG="smoke.serial" - export LOG_MSG="Domain-0" - export PASSED="BusyBox" - --./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+./automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0less-arm32.sh b/automation/scripts/qemu-smoke-dom0less-arm32.sh -index 38e8a0b0bd..41f6e5d8e6 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm32.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm32.sh -@@ -130,7 +130,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d . 
-c config - - # Run the test - rm -f ${serial_log} --export QEMU_CMD="./qemu-system-arm \ -+export TEST_CMD="./qemu-system-arm \ - -machine virt \ - -machine virtualization=true \ - -smp 4 \ -@@ -144,8 +144,8 @@ export QEMU_CMD="./qemu-system-arm \ - -bios /usr/lib/u-boot/qemu_arm/u-boot.bin" - - export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" --export QEMU_LOG="${serial_log}" -+export TEST_LOG="${serial_log}" - export LOG_MSG="${dom0_prompt}" - export PASSED="${passed}" - --../automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+../automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh b/automation/scripts/qemu-smoke-dom0less-arm64.sh -index ea67650e17..83e1866ca6 100755 ---- a/automation/scripts/qemu-smoke-dom0less-arm64.sh -+++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh -@@ -204,7 +204,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - - # Run the test - rm -f smoke.serial --export QEMU_CMD="./binaries/qemu-system-aarch64 \ -+export TEST_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt,gic-version=$gic_version \ - -m 2048 -monitor none -serial stdio \ -@@ -215,8 +215,8 @@ export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" - - export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" --export QEMU_LOG="smoke.serial" -+export TEST_LOG="smoke.serial" - export LOG_MSG="Welcome to Alpine Linux" - export PASSED="${passed}" - --./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+./automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-ppc64le.sh b/automation/scripts/qemu-smoke-ppc64le.sh -index 49e189c370..617096ad1f 100755 ---- a/automation/scripts/qemu-smoke-ppc64le.sh -+++ b/automation/scripts/qemu-smoke-ppc64le.sh -@@ -10,7 +10,7 @@ machine=$1 - # Run the test - rm 
-f ${serial_log} - --export QEMU_CMD="qemu-system-ppc64 \ -+export TEST_CMD="qemu-system-ppc64 \ - -bios skiboot.lid \ - -M $machine \ - -m 2g \ -@@ -21,7 +21,7 @@ export QEMU_CMD="qemu-system-ppc64 \ - -serial stdio \ - -kernel binaries/xen" - --export QEMU_LOG="${serial_log}" -+export TEST_LOG="${serial_log}" - export PASSED="Hello, ppc64le!" - --./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+./automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-riscv64.sh b/automation/scripts/qemu-smoke-riscv64.sh -index 422ee03e0d..8f755d0a6a 100755 ---- a/automation/scripts/qemu-smoke-riscv64.sh -+++ b/automation/scripts/qemu-smoke-riscv64.sh -@@ -5,14 +5,14 @@ set -ex -o pipefail - # Run the test - rm -f smoke.serial - --export QEMU_CMD="qemu-system-riscv64 \ -+export TEST_CMD="qemu-system-riscv64 \ - -M virt \ - -smp 1 \ - -nographic \ - -m 2g \ - -kernel binaries/xen" - --export QEMU_LOG="smoke.serial" -+export TEST_LOG="smoke.serial" - export PASSED="All set up" - --./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+./automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-smoke-x86-64.sh b/automation/scripts/qemu-smoke-x86-64.sh -index 7495185d9f..da0c26cc2f 100755 ---- a/automation/scripts/qemu-smoke-x86-64.sh -+++ b/automation/scripts/qemu-smoke-x86-64.sh -@@ -15,12 +15,12 @@ case $variant in - esac - - rm -f smoke.serial --export QEMU_CMD="qemu-system-x86_64 -nographic -kernel binaries/xen \ -+export TEST_CMD="qemu-system-x86_64 -nographic -kernel binaries/xen \ - -initrd xtf/tests/example/$k \ - -append \"loglvl=all console=com1 noreboot console_timestamps=boot $extra\" \ - -m 512 -monitor none -serial stdio" - --export QEMU_LOG="smoke.serial" -+export TEST_LOG="smoke.serial" - export PASSED="Test result: SUCCESS" - --./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+./automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/qemu-xtf-dom0less-arm64.sh 
b/automation/scripts/qemu-xtf-dom0less-arm64.sh -index acef1637e2..9608de6ec0 100755 ---- a/automation/scripts/qemu-xtf-dom0less-arm64.sh -+++ b/automation/scripts/qemu-xtf-dom0less-arm64.sh -@@ -50,7 +50,7 @@ bash imagebuilder/scripts/uboot-script-gen -t tftp -d binaries/ -c binaries/conf - - # Run the test - rm -f smoke.serial --export QEMU_CMD="./binaries/qemu-system-aarch64 \ -+export TEST_CMD="./binaries/qemu-system-aarch64 \ - -machine virtualization=true \ - -cpu cortex-a57 -machine type=virt \ - -m 2048 -monitor none -serial stdio \ -@@ -61,7 +61,7 @@ export QEMU_CMD="./binaries/qemu-system-aarch64 \ - -bios /usr/lib/u-boot/qemu_arm64/u-boot.bin" - - export UBOOT_CMD="virtio scan; dhcp; tftpb 0x40000000 boot.scr; source 0x40000000" --export QEMU_LOG="smoke.serial" -+export TEST_LOG="smoke.serial" - export PASSED="${passed}" - --./automation/scripts/qemu-key.exp | sed 's/\r\+$//' -+./automation/scripts/console.exp | sed 's/\r\+$//' -diff --git a/automation/scripts/xilinx-smoke-dom0-x86_64.sh b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -index 4559e2b9ee..ef6e1361a9 100755 ---- a/automation/scripts/xilinx-smoke-dom0-x86_64.sh -+++ b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -@@ -1,8 +1,8 @@ --#!/bin/sh -+#!/usr/bin/env bash - - # Run x86_64 dom0 tests on hardware. - --set -ex -+set -ex -o pipefail - - fatal() { - echo "$(basename "$0") $*" >&2 -@@ -27,7 +27,6 @@ memory = 512 - vif = [ "bridge=xenbr0", ] - disk = [ ] - ' --TIMEOUT_SECONDS=200 - - # Select test variant. - if [ "${TEST}" = "ping" ]; then -@@ -125,20 +124,19 @@ boot - - # Power cycle board and collect serial port output. 
- SERIAL_DEV="/dev/serial/${TEST_BOARD}" --SERIAL_CMD="cat ${SERIAL_DEV} | tee smoke.serial | sed 's/\r//'" - sh /scratch/gitlab-runner/${TEST_BOARD}.sh 2 - sleep 5 - sh /scratch/gitlab-runner/${TEST_BOARD}.sh 1 - sleep 5 - set +e - stty -F ${SERIAL_DEV} 115200 --timeout -k 1 ${TIMEOUT_SECONDS} nohup sh -c "${SERIAL_CMD}" --sh /scratch/gitlab-runner/${TEST_BOARD}.sh 2 -- --set -e - --if grep -q "${PASS_MSG}" smoke.serial; then -- exit 0 --fi -+# Capture test result and power off board before exiting. -+export PASSED="${PASS_MSG}" -+export TEST_CMD="cat ${SERIAL_DEV}" -+export TEST_LOG="smoke.serial" - --fatal "Test failed" -+./automation/scripts/console.exp | sed 's/\r\+$//' -+TEST_RESULT=$? -+sh "/scratch/gitlab-runner/${TEST_BOARD}.sh" 2 -+exit ${TEST_RESULT} -diff --git a/automation/scripts/xilinx-smoke-dom0less-arm64.sh b/automation/scripts/xilinx-smoke-dom0less-arm64.sh -index 18aa07f0a2..b24ad11b8c 100755 ---- a/automation/scripts/xilinx-smoke-dom0less-arm64.sh -+++ b/automation/scripts/xilinx-smoke-dom0less-arm64.sh -@@ -1,6 +1,6 @@ - #!/bin/bash - --set -ex -+set -ex -o pipefail - - test_variant=$1 - -@@ -137,13 +137,14 @@ cd $START - SERIAL_DEV="/dev/serial/zynq" - set +e - stty -F ${SERIAL_DEV} 115200 --timeout -k 1 120 nohup sh -c "cat ${SERIAL_DEV} | tee smoke.serial | sed 's/\r//'" - --# stop the board --cd /scratch/gitlab-runner --bash zcu102.sh 2 --cd $START -+# Capture test result and power off board before exiting. -+export PASSED="${passed}" -+export LOG_MSG="Welcome to Alpine Linux" -+export TEST_CMD="cat ${SERIAL_DEV}" -+export TEST_LOG="smoke.serial" - --set -e --(grep -q "^Welcome to Alpine Linux" smoke.serial && grep -q "${passed}" smoke.serial) || exit 1 --exit 0 -+./automation/scripts/console.exp | sed 's/\r\+$//' -+TEST_RESULT=$? 
-+sh "/scratch/gitlab-runner/zcu102.sh" 2 -+exit ${TEST_RESULT} --- -2.47.0 - diff --git a/0058-automation-fix-xilinx-test-console-settings.patch b/0058-automation-fix-xilinx-test-console-settings.patch deleted file mode 100644 index a6a7d9c..0000000 --- a/0058-automation-fix-xilinx-test-console-settings.patch +++ /dev/null @@ -1,45 +0,0 @@ -From 7b3b33efab6306ca7e3d0e1f1079ac2eee63aafa Mon Sep 17 00:00:00 2001 -From: Victor Lira -Date: Mon, 9 Sep 2024 17:31:46 -0700 -Subject: [PATCH 58/83] automation: fix xilinx test console settings - -The test showed unreliable behavior due to unsupported console settings. -Update the baud rate used to connect to the UART. - -Signed-off-by: Victor Lira -Reviewed-by: Stefano Stabellini -(cherry picked from commit c23571fe3150c2994afabcaa10c218b3d87fa832) ---- - automation/gitlab-ci/test.yaml | 2 +- - automation/scripts/xilinx-smoke-dom0-x86_64.sh | 2 +- - 2 files changed, 2 insertions(+), 2 deletions(-) - -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index cecc18a019..8675016b6a 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -102,7 +102,7 @@ - variables: - CONTAINER: ubuntu:xenial-xilinx - LOGFILE: xilinx-smoke-x86_64.log -- XEN_CMD_CONSOLE: "console=com2 com2=115200,8n1,0x2F8,4" -+ XEN_CMD_CONSOLE: "console=com2 com2=57600,8n1,0x2F8,4" - TEST_BOARD: "crater" - TEST_TIMEOUT: 1000 - artifacts: -diff --git a/automation/scripts/xilinx-smoke-dom0-x86_64.sh b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -index ef6e1361a9..7027f083ba 100755 ---- a/automation/scripts/xilinx-smoke-dom0-x86_64.sh -+++ b/automation/scripts/xilinx-smoke-dom0-x86_64.sh -@@ -129,7 +129,7 @@ sleep 5 - sh /scratch/gitlab-runner/${TEST_BOARD}.sh 1 - sleep 5 - set +e --stty -F ${SERIAL_DEV} 115200 -+stty -F ${SERIAL_DEV} 57600 - - # Capture test result and power off board before exiting. 
- export PASSED="${PASS_MSG}" --- -2.47.0 - diff --git a/0059-automation-introduce-TEST_TIMEOUT_OVERRIDE.patch b/0059-automation-introduce-TEST_TIMEOUT_OVERRIDE.patch deleted file mode 100644 index 9c6e348..0000000 --- a/0059-automation-introduce-TEST_TIMEOUT_OVERRIDE.patch +++ /dev/null @@ -1,65 +0,0 @@ -From b68a7b9b29f831c4261eca2237a6f45d16609391 Mon Sep 17 00:00:00 2001 -From: Stefano Stabellini -Date: Thu, 3 Oct 2024 13:22:51 -0700 -Subject: [PATCH 59/83] automation: introduce TEST_TIMEOUT_OVERRIDE -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -TEST_TIMEOUT is set as a CI/CD project variable, as it should be, to -match the capability and speed of the testing infrastructure. - -As it turns out, TEST_TIMEOUT defined in test.yaml cannot override -TEST_TIMEOUT defined as CI/CD project variable. As a consequence, today -the TEST_TIMEOUT setting in test.yaml for the Xilinx jobs is ignored. - -Instead, rename TEST_TIMEOUT to TEST_TIMEOUT_OVERRIDE in test.yaml and -check for TEST_TIMEOUT_OVERRIDE first in console.exp. 
- -Signed-off-by: Stefano Stabellini -Reviewed-by: Marek Marczykowski-Górecki -(cherry picked from commit d82e0e094e7a07353ba0fb35732724316c2ec2f6) ---- - automation/gitlab-ci/test.yaml | 4 ++-- - automation/scripts/console.exp | 4 +++- - 2 files changed, 5 insertions(+), 3 deletions(-) - -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index 8675016b6a..e947736195 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -84,7 +84,7 @@ - variables: - CONTAINER: ubuntu:xenial-xilinx - LOGFILE: qemu-smoke-xilinx.log -- TEST_TIMEOUT: 120 -+ TEST_TIMEOUT_OVERRIDE: 120 - artifacts: - paths: - - smoke.serial -@@ -104,7 +104,7 @@ - LOGFILE: xilinx-smoke-x86_64.log - XEN_CMD_CONSOLE: "console=com2 com2=57600,8n1,0x2F8,4" - TEST_BOARD: "crater" -- TEST_TIMEOUT: 1000 -+ TEST_TIMEOUT_OVERRIDE: 1000 - artifacts: - paths: - - smoke.serial -diff --git a/automation/scripts/console.exp b/automation/scripts/console.exp -index f538aa6bd0..310543c33e 100755 ---- a/automation/scripts/console.exp -+++ b/automation/scripts/console.exp -@@ -1,6 +1,8 @@ - #!/usr/bin/expect -f - --if {[info exists env(TEST_TIMEOUT)]} { -+if {[info exists env(TEST_TIMEOUT_OVERRIDE)]} { -+ set timeout $env(TEST_TIMEOUT_OVERRIDE) -+} elseif {[info exists env(TEST_TIMEOUT)]} { - set timeout $env(TEST_TIMEOUT) - } else { - set timeout 1500 --- -2.47.0 - diff --git a/0060-automation-preserve-built-xen.efi.patch b/0060-automation-preserve-built-xen.efi.patch deleted file mode 100644 index d25deb3..0000000 --- a/0060-automation-preserve-built-xen.efi.patch +++ /dev/null @@ -1,65 +0,0 @@ -From d1c774c17a4c8c6ede588b38ed9eec8894951a2c Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?= - -Date: Fri, 4 Oct 2024 04:29:37 +0200 -Subject: [PATCH 60/83] automation: preserve built xen.efi -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -It will be useful for further tests. 
Deuplicate the collection. - -Signed-off-by: Marek Marczykowski-Górecki -Acked-by: Andrew Cooper -(cherry picked from commit 636e66b143ac1aad2f6a9c2e7166d8ba88f4559a) ---- - automation/scripts/build | 17 ++++++++++++++--- - 1 file changed, 14 insertions(+), 3 deletions(-) - -diff --git a/automation/scripts/build b/automation/scripts/build -index b3c71fb6fb..1879c1db6d 100755 ---- a/automation/scripts/build -+++ b/automation/scripts/build -@@ -41,19 +41,30 @@ cp xen/.config xen-config - # Directory for the artefacts to be dumped into - mkdir -p binaries - -+collect_xen_artefacts() -+{ -+ local f -+ -+ for f in xen/xen xen/xen.efi; do -+ if [[ -f $f ]]; then -+ cp $f binaries/ -+ fi -+ done -+} -+ - if [[ "${CPPCHECK}" == "y" ]] && [[ "${HYPERVISOR_ONLY}" == "y" ]]; then - # Cppcheck analysis invokes Xen-only build - xen/scripts/xen-analysis.py --run-cppcheck --cppcheck-misra -- -j$(nproc) - - # Preserve artefacts -- cp xen/xen binaries/xen -+ collect_xen_artefacts - cp xen/cppcheck-report/xen-cppcheck.txt xen-cppcheck.txt - elif [[ "${HYPERVISOR_ONLY}" == "y" ]]; then - # Xen-only build - make -j$(nproc) xen - - # Preserve artefacts -- cp xen/xen binaries/xen -+ collect_xen_artefacts - else - # Full build. 
Figure out our ./configure options - cfgargs=() -@@ -101,5 +112,5 @@ else - # even though dist/ contains everything, while some containers don't even - # build Xen - cp -r dist binaries/ -- if [[ -f xen/xen ]] ; then cp xen/xen binaries/xen; fi -+ collect_xen_artefacts - fi --- -2.47.0 - diff --git a/0061-automation-add-a-smoke-test-for-xen.efi-on-X86.patch b/0061-automation-add-a-smoke-test-for-xen.efi-on-X86.patch deleted file mode 100644 index cc568bf..0000000 --- a/0061-automation-add-a-smoke-test-for-xen.efi-on-X86.patch +++ /dev/null @@ -1,91 +0,0 @@ -From 811637696b881039b1e594f848044588aaee8c0c Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?= - -Date: Fri, 4 Oct 2024 04:29:38 +0200 -Subject: [PATCH 61/83] automation: add a smoke test for xen.efi on X86 -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Check if xen.efi is bootable with an XTF dom0. -The multiboot2+EFI path is tested on hardware tests already. 
- -Signed-off-by: Marek Marczykowski-Górecki -Acked-by: Andrew Cooper -(cherry picked from commit 2d1c673baea563bb1af00b1e977b4ff7c213cf7f) ---- - automation/gitlab-ci/test.yaml | 7 ++++ - automation/scripts/qemu-smoke-x86-64-efi.sh | 43 +++++++++++++++++++++ - 2 files changed, 50 insertions(+) - create mode 100755 automation/scripts/qemu-smoke-x86-64-efi.sh - -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index e947736195..5687eaf914 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -463,6 +463,13 @@ qemu-smoke-x86-64-clang-pvh: - needs: - - debian-bookworm-clang-debug - -+qemu-smoke-x86-64-gcc-efi: -+ extends: .qemu-x86-64 -+ script: -+ - ./automation/scripts/qemu-smoke-x86-64-efi.sh pv 2>&1 | tee ${LOGFILE} -+ needs: -+ - debian-bookworm-gcc-debug -+ - qemu-smoke-riscv64-gcc: - extends: .qemu-riscv64 - script: -diff --git a/automation/scripts/qemu-smoke-x86-64-efi.sh b/automation/scripts/qemu-smoke-x86-64-efi.sh -new file mode 100755 -index 0000000000..7572722be6 ---- /dev/null -+++ b/automation/scripts/qemu-smoke-x86-64-efi.sh -@@ -0,0 +1,43 @@ -+#!/bin/bash -+ -+set -ex -o pipefail -+ -+# variant should be either pv or pvh -+variant=$1 -+ -+# Clone and build XTF -+git clone https://xenbits.xen.org/git-http/xtf.git -+cd xtf && make -j$(nproc) && cd - -+ -+case $variant in -+ pvh) k=test-hvm64-example extra="dom0-iommu=none dom0=pvh" ;; -+ *) k=test-pv64-example extra= ;; -+esac -+ -+mkdir -p boot-esp/EFI/BOOT -+cp binaries/xen.efi boot-esp/EFI/BOOT/BOOTX64.EFI -+cp xtf/tests/example/$k boot-esp/EFI/BOOT/kernel -+ -+cat > boot-esp/EFI/BOOT/BOOTX64.cfg < -Date: Fri, 4 Oct 2024 04:29:39 +0200 -Subject: [PATCH 62/83] automation: shorten the timeout for smoke tests -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -The smoke tests when successful complete in about 5s. 
Don't waste -20min+ on failure, shorten the timeout to 120s - -Signed-off-by: Marek Marczykowski-Górecki -Acked-by: Andrew Cooper -(cherry picked from commit bcce5a6b62761c8b678aebce33c55ea66f879f66) ---- - automation/gitlab-ci/test.yaml | 15 ++++++++++----- - 1 file changed, 10 insertions(+), 5 deletions(-) - -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index 5687eaf914..b27c2be174 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -53,6 +53,11 @@ - tags: - - x86_64 - -+.qemu-smoke-x86-64: -+ extends: .qemu-x86-64 -+ variables: -+ TEST_TIMEOUT_OVERRIDE: 120 -+ - .qemu-riscv64: - extends: .test-jobs-common - variables: -@@ -436,35 +441,35 @@ qemu-alpine-x86_64-gcc: - - alpine-3.18-gcc - - qemu-smoke-x86-64-gcc: -- extends: .qemu-x86-64 -+ extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64.sh pv 2>&1 | tee ${LOGFILE} - needs: - - debian-bookworm-gcc-debug - - qemu-smoke-x86-64-clang: -- extends: .qemu-x86-64 -+ extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64.sh pv 2>&1 | tee ${LOGFILE} - needs: - - debian-bookworm-clang-debug - - qemu-smoke-x86-64-gcc-pvh: -- extends: .qemu-x86-64 -+ extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64.sh pvh 2>&1 | tee ${LOGFILE} - needs: - - debian-bookworm-gcc-debug - - qemu-smoke-x86-64-clang-pvh: -- extends: .qemu-x86-64 -+ extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64.sh pvh 2>&1 | tee ${LOGFILE} - needs: - - debian-bookworm-clang-debug - - qemu-smoke-x86-64-gcc-efi: -- extends: .qemu-x86-64 -+ extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64-efi.sh pv 2>&1 | tee ${LOGFILE} - needs: --- -2.47.0 - diff --git a/0063-CI-Stop-building-QEMU-in-general.patch b/0063-CI-Stop-building-QEMU-in-general.patch deleted file mode 100644 index e7ad836..0000000 --- a/0063-CI-Stop-building-QEMU-in-general.patch +++ 
/dev/null @@ -1,67 +0,0 @@ -From 76f180625bc3024be97d967662e168d7118f0e0e Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Sat, 13 Jul 2024 17:50:30 +0100 -Subject: [PATCH 63/83] CI: Stop building QEMU in general - -We spend an awful lot of CI time building QEMU, even though most changes don't -touch the subset of tools/libs/ used by QEMU. Some numbers taken at a time -when CI was otherwise quiet: - - With Without - Alpine: 13m38s 6m04s - Debian 12: 10m05s 8m10s - OpenSUSE Tumbleweed: 11m40s 7m54s - Ubuntu 24.04: 14m56s 8m06s - -which is a >50% improvement in wallclock time in some cases. - -The only build we have that needs QEMU is alpine-3.18-gcc-debug. This is the -build deployed and used by the QubesOS ADL-* and Zen3p-* jobs. - -Xilinx-x86_64 deploys it too, but is PVH-only and doesn't use QEMU. - -QEMU is also built by CirrusCI for FreeBSD (fully Clang/LLVM toolchain). - -This should help quite a lot with Gitlab CI capacity. - -Signed-off-by: Andrew Cooper -Reviewed-by: Stefano Stabellini -(cherry picked from commit e305256e69b1c943db3ca20173da6ded3db2d252) ---- - automation/gitlab-ci/build.yaml | 1 + - automation/scripts/build | 7 ++----- - 2 files changed, 3 insertions(+), 5 deletions(-) - -diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml -index 09895d1fbd..6342a025ac 100644 ---- a/automation/gitlab-ci/build.yaml -+++ b/automation/gitlab-ci/build.yaml -@@ -339,6 +339,7 @@ alpine-3.18-gcc-debug: - extends: .gcc-x86-64-build-debug - variables: - CONTAINER: alpine:3.18 -+ BUILD_QEMU_XEN: y - - debian-bookworm-gcc-debug: - extends: .gcc-x86-64-build-debug -diff --git a/automation/scripts/build b/automation/scripts/build -index 1879c1db6d..952599cc25 100755 ---- a/automation/scripts/build -+++ b/automation/scripts/build -@@ -91,11 +91,8 @@ else - cfgargs+=("--with-extra-qemuu-configure-args=\"--disable-werror\"") - fi - -- # Qemu requires Python 3.5 or later, and ninja -- # and Clang 10 or later -- if ! 
type python3 || python3 -c "import sys; res = sys.version_info < (3, 5); exit(not(res))" \ -- || [[ "$cc_is_clang" == y && "$cc_ver" -lt 0x0a0000 ]] \ -- || ! type ninja; then -+ # QEMU is only for those who ask -+ if [[ "$BUILD_QEMU_XEN" != "y" ]]; then - cfgargs+=("--with-system-qemu=/bin/false") - fi - --- -2.47.0 - diff --git a/0064-CI-Minor-cleanup-to-qubes-x86-64.sh.patch b/0064-CI-Minor-cleanup-to-qubes-x86-64.sh.patch deleted file mode 100644 index 89797e3..0000000 --- a/0064-CI-Minor-cleanup-to-qubes-x86-64.sh.patch +++ /dev/null @@ -1,186 +0,0 @@ -From 1dd4b60de136565f18a93988705b6372f9958935 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Mon, 21 Oct 2024 14:06:24 +0100 -Subject: [PATCH 64/83] CI: Minor cleanup to qubes-x86-64.sh -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - - * List all the test_variants and summerise what's going on - * Use case rather than an if/else chain for $test_variant - * Adjust indentation inside the case block - -No functional change. 
- -Signed-off-by: Andrew Cooper -Reviewed-by: Stefano Stabellini -Reviewed-by: Marek Marczykowski-Górecki -(cherry picked from commit 6685a129c7864e3733afef09a2539ccd722a4380) ---- - automation/scripts/qubes-x86-64.sh | 84 ++++++++++++++++++------------ - 1 file changed, 50 insertions(+), 34 deletions(-) - -diff --git a/automation/scripts/qubes-x86-64.sh b/automation/scripts/qubes-x86-64.sh -index bfa60c912a..306304e921 100755 ---- a/automation/scripts/qubes-x86-64.sh -+++ b/automation/scripts/qubes-x86-64.sh -@@ -2,6 +2,13 @@ - - set -ex - -+# One of: -+# - "" PV dom0, PVH domU -+# - dom0pvh PVH dom0, PVH domU -+# - dom0pvh-hvm PVH dom0, HVM domU -+# - pci-hvm PV dom0, HVM domU + PCI Passthrough -+# - pci-pv PV dom0, PV domU + PCI Passthrough -+# - s3 PV dom0, S3 suspend/resume - test_variant=$1 - - ### defaults -@@ -19,17 +26,18 @@ vif = [ "bridge=xenbr0", ] - disk = [ ] - ' - --### test: smoke test & smoke test PVH & smoke test HVM --if [ -z "${test_variant}" ] || [ "${test_variant}" = "dom0pvh" ] || [ "${test_variant}" = "dom0pvh-hvm" ]; then -- passed="ping test passed" -- domU_check=" -+case "${test_variant}" in -+ ### test: smoke test & smoke test PVH & smoke test HVM -+ ""|"dom0pvh"|"dom0pvh-hvm") -+ passed="ping test passed" -+ domU_check=" - ifconfig eth0 192.168.0.2 - until ping -c 10 192.168.0.1; do - sleep 1 - done - echo \"${passed}\" - " -- dom0_check=" -+ dom0_check=" - set +x - until grep -q \"${passed}\" /var/log/xen/console/guest-domU.log; do - sleep 1 -@@ -37,12 +45,12 @@ done - set -x - echo \"${passed}\" - " --if [ "${test_variant}" = "dom0pvh" ] || [ "${test_variant}" = "dom0pvh-hvm" ]; then -- extra_xen_opts="dom0=pvh" --fi -+ if [ "${test_variant}" = "dom0pvh" ] || [ "${test_variant}" = "dom0pvh-hvm" ]; then -+ extra_xen_opts="dom0=pvh" -+ fi - --if [ "${test_variant}" = "dom0pvh-hvm" ]; then -- domU_config=' -+ if [ "${test_variant}" = "dom0pvh-hvm" ]; then -+ domU_config=' - type = "hvm" - name = "domU" - kernel = "/boot/vmlinuz" -@@ 
-52,17 +60,18 @@ memory = 512 - vif = [ "bridge=xenbr0", ] - disk = [ ] - ' --fi -- --### test: S3 --elif [ "${test_variant}" = "s3" ]; then -- passed="suspend test passed" -- wait_and_wakeup="started, suspending" -- domU_check=" -+ fi -+ ;; -+ -+ ### test: S3 -+ "s3") -+ passed="suspend test passed" -+ wait_and_wakeup="started, suspending" -+ domU_check=" - ifconfig eth0 192.168.0.2 - echo domU started - " -- dom0_check=" -+ dom0_check=" - until grep 'domU started' /var/log/xen/console/guest-domU.log; do - sleep 1 - done -@@ -79,19 +88,20 @@ xl dmesg | grep 'Finishing wakeup from ACPI S3 state' || exit 1 - ping -c 10 192.168.0.2 || exit 1 - echo \"${passed}\" - " -+ ;; - --### test: pci-pv, pci-hvm --elif [ "${test_variant}" = "pci-pv" ] || [ "${test_variant}" = "pci-hvm" ]; then -+ ### test: pci-pv, pci-hvm -+ "pci-pv"|"pci-hvm") - -- if [ -z "$PCIDEV" ]; then -- echo "Please set 'PCIDEV' variable with BDF of test network adapter" >&2 -- echo "Optionally set also 'PCIDEV_INTR' to 'MSI' or 'MSI-X'" >&2 -- exit 1 -- fi -+ if [ -z "$PCIDEV" ]; then -+ echo "Please set 'PCIDEV' variable with BDF of test network adapter" >&2 -+ echo "Optionally set also 'PCIDEV_INTR' to 'MSI' or 'MSI-X'" >&2 -+ exit 1 -+ fi - -- passed="pci test passed" -+ passed="pci test passed" - -- domU_config=' -+ domU_config=' - type = "'${test_variant#pci-}'" - name = "domU" - kernel = "/boot/vmlinuz" -@@ -104,7 +114,7 @@ pci = [ "'$PCIDEV',seize=1" ] - on_reboot = "destroy" - ' - -- domU_check=" -+ domU_check=" - set -x -e - interface=eth0 - ip link set \"\$interface\" up -@@ -115,22 +125,28 @@ echo domU started - pcidevice=\$(basename \$(readlink /sys/class/net/\$interface/device)) - lspci -vs \$pcidevice - " -- if [ -n "$PCIDEV_INTR" ]; then -- domU_check="$domU_check -+ if [ -n "$PCIDEV_INTR" ]; then -+ domU_check="$domU_check - lspci -vs \$pcidevice | fgrep '$PCIDEV_INTR: Enable+' - " -- fi -- domU_check="$domU_check -+ fi -+ domU_check="$domU_check - echo \"${passed}\" - " - -- 
dom0_check=" -+ dom0_check=" - tail -F /var/log/xen/qemu-dm-domU.log & - until grep -q \"^domU Welcome to Alpine Linux\" /var/log/xen/console/guest-domU.log; do - sleep 1 - done - " --fi -+ ;; -+ -+ *) -+ echo "Unrecognised test_variant '${test_variant}'" >&2 -+ exit 1 -+ ;; -+esac - - # DomU - mkdir -p rootfs --- -2.47.0 - diff --git a/0065-CI-Rework-domU_config-generation-in-qubes-x86-64.sh.patch b/0065-CI-Rework-domU_config-generation-in-qubes-x86-64.sh.patch deleted file mode 100644 index 0c09ac6..0000000 --- a/0065-CI-Rework-domU_config-generation-in-qubes-x86-64.sh.patch +++ /dev/null @@ -1,117 +0,0 @@ -From 7e0ba9a387b2bf2a2fa0be8bf4a87160bba7d037 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Mon, 21 Oct 2024 15:07:54 +0100 -Subject: [PATCH 65/83] CI: Rework domU_config generation in qubes-x86-64.sh -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Right now, various blocks rewrite domU_config= as a whole, even though it is -largely the same. - - * dom0pvh-hvm does nothing but change the domain type to hvm - * *-pci sets the domain type, clears vif=[], appends earlyprintk=xen to the - cmdline, and adds some PCI config. - -Refactor this to be domU_type (defaults to pvh), domU_vif (defaults to -xenbr0), and domU_extra_config (defaults to empty) and use these variables to -build domU_config= once. - -Of note, the default domU_config= now sets cmdline=, and extra= is intended -for inclusion via domU_extra_config as necessary. - -No practical change. 
- -Signed-off-by: Andrew Cooper -Reviewed-by: Stefano Stabellini -Reviewed-by: Marek Marczykowski-Górecki -(cherry picked from commit 3be3ae07705af77ab1c87c2e442c7646c938ad25) ---- - automation/scripts/qubes-x86-64.sh | 50 +++++++++++++----------------- - 1 file changed, 21 insertions(+), 29 deletions(-) - -diff --git a/automation/scripts/qubes-x86-64.sh b/automation/scripts/qubes-x86-64.sh -index 306304e921..76fbafac84 100755 ---- a/automation/scripts/qubes-x86-64.sh -+++ b/automation/scripts/qubes-x86-64.sh -@@ -15,16 +15,9 @@ test_variant=$1 - extra_xen_opts= - wait_and_wakeup= - timeout=120 --domU_config=' --type = "pvh" --name = "domU" --kernel = "/boot/vmlinuz" --ramdisk = "/boot/initrd-domU" --extra = "root=/dev/ram0 console=hvc0" --memory = 512 --vif = [ "bridge=xenbr0", ] --disk = [ ] --' -+domU_type="pvh" -+domU_vif="'bridge=xenbr0'," -+domU_extra_config= - - case "${test_variant}" in - ### test: smoke test & smoke test PVH & smoke test HVM -@@ -50,16 +43,7 @@ echo \"${passed}\" - fi - - if [ "${test_variant}" = "dom0pvh-hvm" ]; then -- domU_config=' --type = "hvm" --name = "domU" --kernel = "/boot/vmlinuz" --ramdisk = "/boot/initrd-domU" --extra = "root=/dev/ram0 console=hvc0" --memory = 512 --vif = [ "bridge=xenbr0", ] --disk = [ ] --' -+ domU_type="hvm" - fi - ;; - -@@ -101,15 +85,11 @@ echo \"${passed}\" - - passed="pci test passed" - -- domU_config=' --type = "'${test_variant#pci-}'" --name = "domU" --kernel = "/boot/vmlinuz" --ramdisk = "/boot/initrd-domU" --extra = "root=/dev/ram0 console=hvc0 earlyprintk=xen" --memory = 512 --vif = [ ] --disk = [ ] -+ domU_type="${test_variant#pci-}" -+ domU_vif="" -+ -+ domU_extra_config=' -+extra = "earlyprintk=xen" - pci = [ "'$PCIDEV',seize=1" ] - on_reboot = "destroy" - ' -@@ -148,6 +128,18 @@ done - ;; - esac - -+domU_config=" -+type = '${domU_type}' -+name = 'domU' -+kernel = '/boot/vmlinuz' -+ramdisk = '/boot/initrd-domU' -+cmdline = 'root=/dev/ram0 console=hvc0' -+memory = 512 -+vif = [ ${domU_vif} ] 
-+disk = [ ] -+${domU_extra_config} -+" -+ - # DomU - mkdir -p rootfs - cd rootfs --- -2.47.0 - diff --git a/0066-CI-Add-adl-zen3p-pvshim-tests.patch b/0066-CI-Add-adl-zen3p-pvshim-tests.patch deleted file mode 100644 index b1862f5..0000000 --- a/0066-CI-Add-adl-zen3p-pvshim-tests.patch +++ /dev/null @@ -1,88 +0,0 @@ -From 01951e1a05a3865b812e7d8cd27f476520023b57 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Mon, 21 Oct 2024 14:17:56 +0100 -Subject: [PATCH 66/83] CI: Add {adl,zen3p}-pvshim-* tests -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -GitlabCI has no testing of Xen's PVH entrypoint. Fix this. - -Signed-off-by: Andrew Cooper -Reviewed-by: Marek Marczykowski-Górecki -(cherry picked from commit b837d02163ff19e2440cae766e2bc50956da5410) ---- - automation/gitlab-ci/test.yaml | 16 ++++++++++++++++ - automation/scripts/qubes-x86-64.sh | 8 ++++++-- - 2 files changed, 22 insertions(+), 2 deletions(-) - -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index b27c2be174..e76a37bef3 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -240,6 +240,14 @@ adl-pci-hvm-x86-64-gcc-debug: - - *x86-64-test-needs - - alpine-3.18-gcc-debug - -+adl-pvshim-x86-64-gcc-debug: -+ extends: .adl-x86-64 -+ script: -+ - ./automation/scripts/qubes-x86-64.sh pvshim 2>&1 | tee ${LOGFILE} -+ needs: -+ - *x86-64-test-needs -+ - alpine-3.18-gcc-debug -+ - zen3p-smoke-x86-64-gcc-debug: - extends: .zen3p-x86-64 - script: -@@ -272,6 +280,14 @@ zen3p-pci-hvm-x86-64-gcc-debug: - - *x86-64-test-needs - - alpine-3.18-gcc-debug - -+zen3p-pvshim-x86-64-gcc-debug: -+ extends: .zen3p-x86-64 -+ script: -+ - ./automation/scripts/qubes-x86-64.sh pvshim 2>&1 | tee ${LOGFILE} -+ needs: -+ - *x86-64-test-needs -+ - alpine-3.18-gcc-debug -+ - qemu-smoke-dom0-arm64-gcc: - extends: .qemu-arm64 - script: -diff --git a/automation/scripts/qubes-x86-64.sh b/automation/scripts/qubes-x86-64.sh -index 
76fbafac84..8a0b7bfbc0 100755 ---- a/automation/scripts/qubes-x86-64.sh -+++ b/automation/scripts/qubes-x86-64.sh -@@ -8,6 +8,7 @@ set -ex - # - dom0pvh-hvm PVH dom0, HVM domU - # - pci-hvm PV dom0, HVM domU + PCI Passthrough - # - pci-pv PV dom0, PV domU + PCI Passthrough -+# - pvshim PV dom0, PVSHIM domU - # - s3 PV dom0, S3 suspend/resume - test_variant=$1 - -@@ -20,8 +21,8 @@ domU_vif="'bridge=xenbr0'," - domU_extra_config= - - case "${test_variant}" in -- ### test: smoke test & smoke test PVH & smoke test HVM -- ""|"dom0pvh"|"dom0pvh-hvm") -+ ### test: smoke test & smoke test PVH & smoke test HVM & smoke test PVSHIM -+ ""|"dom0pvh"|"dom0pvh-hvm"|"pvshim") - passed="ping test passed" - domU_check=" - ifconfig eth0 192.168.0.2 -@@ -44,6 +45,9 @@ echo \"${passed}\" - - if [ "${test_variant}" = "dom0pvh-hvm" ]; then - domU_type="hvm" -+ elif [ "${test_variant}" = "pvshim" ]; then -+ domU_type="pvh" -+ domU_extra_config='pvshim = 1' - fi - ;; - --- -2.47.0 - diff --git a/0067-CI-Drop-alpine-3.18-rootfs-export-and-use-test-artef.patch b/0067-CI-Drop-alpine-3.18-rootfs-export-and-use-test-artef.patch deleted file mode 100644 index c3ee2cb..0000000 --- a/0067-CI-Drop-alpine-3.18-rootfs-export-and-use-test-artef.patch +++ /dev/null @@ -1,55 +0,0 @@ -From 32e9c5de2efe7e76753668541b95eae5ae31fb19 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Thu, 31 Oct 2024 18:02:57 +0000 -Subject: [PATCH 67/83] CI: Drop alpine-3.18-rootfs-export and use - test-artefacts - -Signed-off-by: Andrew Cooper -Reviewed-by: Stefano Stabellini -(cherry picked from commit babe11b46c1a97036164099528a308476ea27953) ---- - automation/gitlab-ci/build.yaml | 11 ----------- - automation/gitlab-ci/test.yaml | 4 +++- - 2 files changed, 3 insertions(+), 12 deletions(-) - -diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml -index 6342a025ac..19ce06bb33 100644 ---- a/automation/gitlab-ci/build.yaml -+++ b/automation/gitlab-ci/build.yaml -@@ -304,17 +304,6 @@ 
qemu-system-aarch64-6.0.0-arm32-export: - - # x86_64 test artifacts - --alpine-3.18-rootfs-export: -- extends: .test-jobs-artifact-common -- image: registry.gitlab.com/xen-project/xen/tests-artifacts/alpine:3.18 -- script: -- - mkdir binaries && cp /initrd.tar.gz binaries/initrd.tar.gz -- artifacts: -- paths: -- - binaries/initrd.tar.gz -- tags: -- - x86_64 -- - kernel-6.1.19-export: - extends: .test-jobs-artifact-common - image: registry.gitlab.com/xen-project/xen/tests-artifacts/kernel:6.1.19 -diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml -index e76a37bef3..5c3cff1bc7 100644 ---- a/automation/gitlab-ci/test.yaml -+++ b/automation/gitlab-ci/test.yaml -@@ -11,8 +11,10 @@ - - qemu-system-aarch64-6.0.0-arm32-export - - .x86-64-test-needs: &x86-64-test-needs -- - alpine-3.18-rootfs-export - - kernel-6.1.19-export -+ - project: xen-project/hardware/test-artifacts -+ job: x86_64-rootfs-alpine-3.18 -+ ref: master - - .qemu-arm64: - extends: .test-jobs-common --- -2.47.0 - diff --git a/0068-CI-Refresh-the-Debian-12-x86_64-container.patch b/0068-CI-Refresh-the-Debian-12-x86_64-container.patch deleted file mode 100644 index d990b07..0000000 --- a/0068-CI-Refresh-the-Debian-12-x86_64-container.patch +++ /dev/null @@ -1,295 +0,0 @@ -From 6ac45f72a37ba2c0ba07711de6a88ccab7917220 Mon Sep 17 00:00:00 2001 -From: Javi Merino -Date: Mon, 14 Oct 2024 17:53:31 +0100 -Subject: [PATCH 68/83] CI: Refresh the Debian 12 x86_64 container - -Rework the container to use heredocs for readability, and use -apt-get --no-install-recommends to keep the size down. - -This reduces the size of the (uncompressed) container from 3.44GB to -1.97GB. - -The container is left running the builds and tests as root. A -subsequent patch will make the necessary changes to the test scripts -to allow test execution as a non-root user. 
- -Signed-off-by: Javi Merino -Reviewed-by: Andrew Cooper -(cherry picked from commit 44b742de09f2fd14f6211a6c7f24c0cba1624c14) ---- - automation/build/debian/12-x86_64.dockerfile | 71 ++++++++++++++++++++ - automation/build/debian/bookworm.dockerfile | 54 --------------- - automation/gitlab-ci/build.yaml | 20 +++--- - automation/gitlab-ci/test.yaml | 14 ++-- - automation/scripts/containerize | 2 +- - 5 files changed, 89 insertions(+), 72 deletions(-) - create mode 100644 automation/build/debian/12-x86_64.dockerfile - delete mode 100644 automation/build/debian/bookworm.dockerfile - -diff --git a/automation/build/debian/12-x86_64.dockerfile b/automation/build/debian/12-x86_64.dockerfile -new file mode 100644 -index 0000000000..6e0a403f64 ---- /dev/null -+++ b/automation/build/debian/12-x86_64.dockerfile -@@ -0,0 +1,71 @@ -+# syntax=docker/dockerfile:1 -+FROM --platform=linux/amd64 debian:bookworm -+LABEL maintainer.name="The Xen Project" -+LABEL maintainer.email="xen-devel@lists.xenproject.org" -+ -+ENV DEBIAN_FRONTEND=noninteractive -+ -+RUN <&1 | tee ${LOGFILE} - needs: -- - debian-bookworm-gcc-debug -+ - debian-12-x86_64-gcc-debug - - qemu-smoke-x86-64-clang: - extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64.sh pv 2>&1 | tee ${LOGFILE} - needs: -- - debian-bookworm-clang-debug -+ - debian-12-x86_64-clang-debug - - qemu-smoke-x86-64-gcc-pvh: - extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64.sh pvh 2>&1 | tee ${LOGFILE} - needs: -- - debian-bookworm-gcc-debug -+ - debian-12-x86_64-gcc-debug - - qemu-smoke-x86-64-clang-pvh: - extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64.sh pvh 2>&1 | tee ${LOGFILE} - needs: -- - debian-bookworm-clang-debug -+ - debian-12-x86_64-clang-debug - - qemu-smoke-x86-64-gcc-efi: - extends: .qemu-smoke-x86-64 - script: - - ./automation/scripts/qemu-smoke-x86-64-efi.sh pv 2>&1 | tee ${LOGFILE} - needs: -- - debian-bookworm-gcc-debug -+ - 
debian-12-x86_64-gcc-debug - - qemu-smoke-riscv64-gcc: - extends: .qemu-riscv64 -diff --git a/automation/scripts/containerize b/automation/scripts/containerize -index 7607b78f76..daa7818682 100755 ---- a/automation/scripts/containerize -+++ b/automation/scripts/containerize -@@ -34,7 +34,7 @@ case "_${CONTAINER}" in - _bullseye-riscv64) CONTAINER="${BASE}/debian:11-riscv64" ;; - _bookworm-riscv64) CONTAINER="${BASE}/debian:12-riscv64" ;; - _bookworm-x86_64-gcc-ibt) CONTAINER="${BASE}/debian:12-x86_64-gcc-ibt" ;; -- _bookworm|_) CONTAINER="${BASE}/debian:bookworm" ;; -+ _bookworm|_bookworm-x86_64|_) CONTAINER="${BASE}/debian:12-x86_64" ;; - _bookworm-i386) CONTAINER="${BASE}/debian:bookworm-i386" ;; - _bookworm-arm64v8-arm32-gcc) CONTAINER="${BASE}/debian:bookworm-arm64v8-arm32-gcc" ;; - _bookworm-arm64v8) CONTAINER="${BASE}/debian:bookworm-arm64v8" ;; --- -2.47.0 - diff --git a/0069-CI-Refresh-the-Debian-12-x86_32-container.patch b/0069-CI-Refresh-the-Debian-12-x86_32-container.patch deleted file mode 100644 index d44454f..0000000 --- a/0069-CI-Refresh-the-Debian-12-x86_32-container.patch +++ /dev/null @@ -1,184 +0,0 @@ -From c92f26973db95ab15e3f2f0bd442129b599b38ad Mon Sep 17 00:00:00 2001 -From: Javi Merino -Date: Fri, 18 Oct 2024 10:17:43 +0100 -Subject: [PATCH 69/83] CI: Refresh the Debian 12 x86_32 container - -Rework the container to be non-root, use heredocs for readability, and -use apt-get --no-install-recommends to keep the size down. Rename the -job to x86_32, to be consistent with XEN_TARGET_ARCH and the -naming scheme of all the other CI jobs: -${VERSION}-${ARCH}-${BUILD_NAME} - -Remove build dependencies for building QEMU. The absence of ninja/meson means -that the container hasn't been able to build QEMU in years. - -Remove build dependencies for the documentation as we don't have to -build it for every single arch. - -This reduces the size of the container from 2.22GB to 1.32Gb. 
- -Signed-off-by: Javi Merino -Reviewed-by: Andrew Cooper -(cherry picked from commit 1ceabff11575e5acb97f29aa9091539dfaf05e3d) ---- - automation/build/debian/12-x86_32.dockerfile | 51 +++++++++++++++++++ - .../build/debian/bookworm-i386.dockerfile | 50 ------------------ - automation/gitlab-ci/build.yaml | 8 +-- - automation/scripts/containerize | 2 +- - 4 files changed, 56 insertions(+), 55 deletions(-) - create mode 100644 automation/build/debian/12-x86_32.dockerfile - delete mode 100644 automation/build/debian/bookworm-i386.dockerfile - -diff --git a/automation/build/debian/12-x86_32.dockerfile b/automation/build/debian/12-x86_32.dockerfile -new file mode 100644 -index 0000000000..ef7a257155 ---- /dev/null -+++ b/automation/build/debian/12-x86_32.dockerfile -@@ -0,0 +1,51 @@ -+# syntax=docker/dockerfile:1 -+FROM --platform=linux/i386 debian:bookworm -+LABEL maintainer.name="The Xen Project" -+LABEL maintainer.email="xen-devel@lists.xenproject.org" -+ -+ENV DEBIAN_FRONTEND=noninteractive -+ -+RUN < -Date: Tue, 12 Nov 2024 13:38:08 +0100 -Subject: [PATCH 70/83] x86/HVM: drop stdvga's "cache" struct member - -Since 68e1183411be ("libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 -guests"), HVM guests are built using XEN_DOMCTL_sethvmcontext, which -ends up disabling stdvga caching because of arch_hvm_load() being -involved in the processing of the request. With that the field is -useless, and can be dropped. Drop the helper functions manipulating / -checking as well right away, but leave the use sites of -stdvga_cache_is_enabled() with the hard-coded result the function would -have produced, to aid validation of subsequent dropping of further code. 
- -This is part of XSA-463 / CVE-2024-45818 - -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit 53b7246bdfb3c280adcdf714918e4decb7e108f4) ---- - xen/arch/x86/hvm/save.c | 3 --- - xen/arch/x86/hvm/stdvga.c | 44 +++---------------------------- - xen/arch/x86/include/asm/hvm/io.h | 7 ----- - 3 files changed, 3 insertions(+), 51 deletions(-) - -diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c -index 99aaf3fc33..8ab6405706 100644 ---- a/xen/arch/x86/hvm/save.c -+++ b/xen/arch/x86/hvm/save.c -@@ -69,9 +69,6 @@ static void arch_hvm_load(struct domain *d, const struct hvm_save_header *hdr) - - /* Time when restore started */ - d->arch.hvm.sync_tsc = rdtsc(); -- -- /* VGA state is not saved/restored, so we nobble the cache. */ -- d->arch.hvm.stdvga.cache = STDVGA_CACHE_DISABLED; - } - - /* List of handlers for various HVM save and restore types */ -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index 5f02d88615..2520d0dd01 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -100,37 +100,6 @@ static void vram_put(struct hvm_hw_stdvga *s, void *p) - unmap_domain_page(p); - } - --static void stdvga_try_cache_enable(struct hvm_hw_stdvga *s) --{ -- /* -- * Caching mode can only be enabled if the the cache has -- * never been used before. As soon as it is disabled, it will -- * become out-of-sync with the VGA device model and since no -- * mechanism exists to acquire current VRAM state from the -- * device model, re-enabling it would lead to stale data being -- * seen by the guest. 
-- */ -- if ( s->cache != STDVGA_CACHE_UNINITIALIZED ) -- return; -- -- gdprintk(XENLOG_INFO, "entering caching mode\n"); -- s->cache = STDVGA_CACHE_ENABLED; --} -- --static void stdvga_cache_disable(struct hvm_hw_stdvga *s) --{ -- if ( s->cache != STDVGA_CACHE_ENABLED ) -- return; -- -- gdprintk(XENLOG_INFO, "leaving caching mode\n"); -- s->cache = STDVGA_CACHE_DISABLED; --} -- --static bool stdvga_cache_is_enabled(const struct hvm_hw_stdvga *s) --{ -- return s->cache == STDVGA_CACHE_ENABLED; --} -- - static int stdvga_outb(uint64_t addr, uint8_t val) - { - struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -@@ -170,7 +139,6 @@ static int stdvga_outb(uint64_t addr, uint8_t val) - if ( !prev_stdvga && s->stdvga ) - { - gdprintk(XENLOG_INFO, "entering stdvga mode\n"); -- stdvga_try_cache_enable(s); - } - else if ( prev_stdvga && !s->stdvga ) - { -@@ -468,7 +436,7 @@ static int cf_check stdvga_mem_write( - }; - struct ioreq_server *srv; - -- if ( !stdvga_cache_is_enabled(s) || !s->stdvga ) -+ if ( true || !s->stdvga ) - goto done; - - /* Intercept mmio write */ -@@ -536,18 +504,12 @@ static bool cf_check stdvga_mem_accept( - * We cannot return X86EMUL_UNHANDLEABLE on anything other then the - * first cycle of an I/O. So, since we cannot guarantee to always be - * able to send buffered writes, we have to reject any multi-cycle -- * or "indirect" I/O and, since we are rejecting an I/O, we must -- * invalidate the cache. -- * Single-cycle write transactions are accepted even if the cache is -- * not active since we can assert, when in stdvga mode, that writes -- * to VRAM have no side effect and thus we can try to buffer them. -+ * or "indirect" I/O.
- */ -- stdvga_cache_disable(s); -- - goto reject; - } - else if ( p->dir == IOREQ_READ && -- (!stdvga_cache_is_enabled(s) || !s->stdvga) ) -+ (true || !s->stdvga) ) - goto reject; - - /* s->lock intentionally held */ -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index 24d1b6134f..ce171eaca4 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -110,19 +110,12 @@ struct vpci_arch_msix_entry { - int pirq; - }; - --enum stdvga_cache_state { -- STDVGA_CACHE_UNINITIALIZED, -- STDVGA_CACHE_ENABLED, -- STDVGA_CACHE_DISABLED --}; -- - struct hvm_hw_stdvga { - uint8_t sr_index; - uint8_t sr[8]; - uint8_t gr_index; - uint8_t gr[9]; - bool stdvga; -- enum stdvga_cache_state cache; - uint32_t latch; - struct page_info *vram_page[64]; /* shadow of 0xa0000-0xaffff */ - spinlock_t lock; --- -2.47.0 - diff --git a/0071-x86-HVM-drop-stdvga-s-stdvga-struct-member.patch b/0071-x86-HVM-drop-stdvga-s-stdvga-struct-member.patch deleted file mode 100644 index aad87c0..0000000 --- a/0071-x86-HVM-drop-stdvga-s-stdvga-struct-member.patch +++ /dev/null @@ -1,112 +0,0 @@ -From 92667bef147c8e12eb57db7c8cf4476ca8473652 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 12 Nov 2024 13:38:35 +0100 -Subject: [PATCH 71/83] x86/HVM: drop stdvga's "stdvga" struct member - -Two of its consumers are dead (in compile-time constant conditionals) -and the only remaining ones are merely controlling debug logging. Hence -the field is now pointless to set, which in particular allows to get rid -of the questionable conditional from which the field's value was -established (afaict 551ceee97513 ["x86, hvm: stdvga cache always on"] -had dropped too much of the earlier extra check that was there, and -quite likely further checks were missing). 
- -This is part of XSA-463 / CVE-2024-45818 - -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit b740a9369e81bdda675a9780130ce2b9e75d4ec9) ---- - xen/arch/x86/hvm/stdvga.c | 30 +++++------------------------- - xen/arch/x86/include/asm/hvm/io.h | 1 - - 2 files changed, 5 insertions(+), 26 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index 2520d0dd01..8a9ce05346 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -103,7 +103,7 @@ static void vram_put(struct hvm_hw_stdvga *s, void *p) - static int stdvga_outb(uint64_t addr, uint8_t val) - { - struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -- int rc = 1, prev_stdvga = s->stdvga; -+ int rc = 1; - - switch ( addr ) - { -@@ -132,19 +132,6 @@ static int stdvga_outb(uint64_t addr, uint8_t val) - break; - } - -- /* When in standard vga mode, emulate here all writes to the vram buffer -- * so we can immediately satisfy reads without waiting for qemu.
*/ -- s->stdvga = (s->sr[7] == 0x00); -- -- if ( !prev_stdvga && s->stdvga ) -- { -- gdprintk(XENLOG_INFO, "entering stdvga mode\n"); -- } -- else if ( prev_stdvga && !s->stdvga ) -- { -- gdprintk(XENLOG_INFO, "leaving stdvga mode\n"); -- } -- - return rc; - } - -@@ -425,7 +412,6 @@ static int cf_check stdvga_mem_write( - const struct hvm_io_handler *handler, uint64_t addr, uint32_t size, - uint64_t data) - { -- struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; - ioreq_t p = { - .type = IOREQ_TYPE_COPY, - .addr = addr, -@@ -436,8 +422,7 @@ static int cf_check stdvga_mem_write( - }; - struct ioreq_server *srv; - -- if ( true || !s->stdvga ) -- goto done; -+ goto done; - - /* Intercept mmio write */ - switch ( size ) -@@ -498,19 +483,14 @@ static bool cf_check stdvga_mem_accept( - - spin_lock(&s->lock); - -- if ( p->dir == IOREQ_WRITE && (p->data_is_ptr || p->count != 1) ) -+ if ( p->dir != IOREQ_WRITE || p->data_is_ptr || p->count != 1 ) - { - /* -- * We cannot return X86EMUL_UNHANDLEABLE on anything other then the -- * first cycle of an I/O. So, since we cannot guarantee to always be -- * able to send buffered writes, we have to reject any multi-cycle -- * or "indirect" I/O. -+ * Only accept single direct writes, as that's the only thing we can -+ * accelerate using buffered ioreq handling.
- */ - goto reject; - } -- else if ( p->dir == IOREQ_READ && -- (true || !s->stdvga) ) -- goto reject; - - /* s->lock intentionally held */ - return 1; -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index ce171eaca4..67f4d033a7 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -115,7 +115,6 @@ struct hvm_hw_stdvga { - uint8_t sr[8]; - uint8_t gr_index; - uint8_t gr[9]; -- bool stdvga; - uint32_t latch; - struct page_info *vram_page[64]; /* shadow of 0xa0000-0xaffff */ - spinlock_t lock; --- -2.47.0 - diff --git a/0072-x86-HVM-remove-unused-MMIO-handling-code.patch b/0072-x86-HVM-remove-unused-MMIO-handling-code.patch deleted file mode 100644 index cd9d57e..0000000 --- a/0072-x86-HVM-remove-unused-MMIO-handling-code.patch +++ /dev/null @@ -1,392 +0,0 @@ -From 2ac4917c2456b723602048f239e12d9d3a9a8633 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 12 Nov 2024 13:38:59 +0100 -Subject: [PATCH 72/83] x86/HVM: remove unused MMIO handling code - -All read accesses are rejected by the ->accept handler, while writes -bypass the bulk of the function body. Drop the dead code, leaving an -assertion in the read handler. - -A number of other static items (and a macro) are then unreferenced and -hence also need (want) dropping. The same applies to the "latch" field -of the state structure. 
- -This is part of XSA-463 / CVE-2024-45818 - -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit 89108547af1f230b72893b48351f9c1106189649) ---- - xen/arch/x86/hvm/stdvga.c | 317 +----------------------------- - xen/arch/x86/include/asm/hvm/io.h | 1 - - 2 files changed, 4 insertions(+), 314 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index 8a9ce05346..0f0bd10068 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -37,26 +37,6 @@ - #define VGA_MEM_BASE 0xa0000 - #define VGA_MEM_SIZE 0x20000 - --#define PAT(x) (x) --static const uint32_t mask16[16] = { -- PAT(0x00000000U), -- PAT(0x000000ffU), -- PAT(0x0000ff00U), -- PAT(0x0000ffffU), -- PAT(0x00ff0000U), -- PAT(0x00ff00ffU), -- PAT(0x00ffff00U), -- PAT(0x00ffffffU), -- PAT(0xff000000U), -- PAT(0xff0000ffU), -- PAT(0xff00ff00U), -- PAT(0xff00ffffU), -- PAT(0xffff0000U), -- PAT(0xffff00ffU), -- PAT(0xffffff00U), -- PAT(0xffffffffU), --}; -- - /* force some bits to zero */ - static const uint8_t sr_mask[8] = { - (uint8_t)~0xfc, -@@ -81,25 +61,6 @@ static const uint8_t gr_mask[9] = { - (uint8_t)~0x00, /* 0x08 */ - }; - --static uint8_t *vram_getb(struct hvm_hw_stdvga *s, unsigned int a) --{ -- struct page_info *pg = s->vram_page[(a >> 12) & 0x3f]; -- uint8_t *p = __map_domain_page(pg); -- return &p[a & 0xfff]; --} -- --static uint32_t *vram_getl(struct hvm_hw_stdvga *s, unsigned int a) --{ -- struct page_info *pg = s->vram_page[(a >> 10) & 0x3f]; -- uint32_t *p = __map_domain_page(pg); -- return &p[a & 0x3ff]; --} -- --static void vram_put(struct hvm_hw_stdvga *s, void *p) --{ -- unmap_domain_page(p); --} -- - static int stdvga_outb(uint64_t addr, uint8_t val) - { - struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -@@ -168,244 +129,13 @@ static int cf_check stdvga_intercept_pio( - return X86EMUL_UNHANDLEABLE; /* propagate to external ioemu */ - } - --static unsigned int stdvga_mem_offset( -- struct hvm_hw_stdvga *s, 
unsigned int mmio_addr) --{ -- unsigned int memory_map_mode = (s->gr[6] >> 2) & 3; -- unsigned int offset = mmio_addr & 0x1ffff; -- -- switch ( memory_map_mode ) -- { -- case 0: -- break; -- case 1: -- if ( offset >= 0x10000 ) -- goto fail; -- offset += 0; /* assume bank_offset == 0; */ -- break; -- case 2: -- offset -= 0x10000; -- if ( offset >= 0x8000 ) -- goto fail; -- break; -- default: -- case 3: -- offset -= 0x18000; -- if ( offset >= 0x8000 ) -- goto fail; -- break; -- } -- -- return offset; -- -- fail: -- return ~0u; --} -- --#define GET_PLANE(data, p) (((data) >> ((p) * 8)) & 0xff) -- --static uint8_t stdvga_mem_readb(uint64_t addr) --{ -- struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -- int plane; -- uint32_t ret, *vram_l; -- uint8_t *vram_b; -- -- addr = stdvga_mem_offset(s, addr); -- if ( addr == ~0u ) -- return 0xff; -- -- if ( s->sr[4] & 0x08 ) -- { -- /* chain 4 mode : simplest access */ -- vram_b = vram_getb(s, addr); -- ret = *vram_b; -- vram_put(s, vram_b); -- } -- else if ( s->gr[5] & 0x10 ) -- { -- /* odd/even mode (aka text mode mapping) */ -- plane = (s->gr[4] & 2) | (addr & 1); -- vram_b = vram_getb(s, ((addr & ~1) << 1) | plane); -- ret = *vram_b; -- vram_put(s, vram_b); -- } -- else -- { -- /* standard VGA latched access */ -- vram_l = vram_getl(s, addr); -- s->latch = *vram_l; -- vram_put(s, vram_l); -- -- if ( !(s->gr[5] & 0x08) ) -- { -- /* read mode 0 */ -- plane = s->gr[4]; -- ret = GET_PLANE(s->latch, plane); -- } -- else -- { -- /* read mode 1 */ -- ret = (s->latch ^ mask16[s->gr[2]]) & mask16[s->gr[7]]; -- ret |= ret >> 16; -- ret |= ret >> 8; -- ret = (~ret) & 0xff; -- } -- } -- -- return ret; --} -- - static int cf_check stdvga_mem_read( - const struct hvm_io_handler *handler, uint64_t addr, uint32_t size, - uint64_t *p_data) - { -- uint64_t data = ~0UL; -- -- switch ( size ) -- { -- case 1: -- data = stdvga_mem_readb(addr); -- break; -- -- case 2: -- data = stdvga_mem_readb(addr); -- data |= stdvga_mem_readb(addr + 1) << 
8; -- break; -- -- case 4: -- data = stdvga_mem_readb(addr); -- data |= stdvga_mem_readb(addr + 1) << 8; -- data |= stdvga_mem_readb(addr + 2) << 16; -- data |= (uint32_t)stdvga_mem_readb(addr + 3) << 24; -- break; -- -- case 8: -- data = (uint64_t)(stdvga_mem_readb(addr)); -- data |= (uint64_t)(stdvga_mem_readb(addr + 1)) << 8; -- data |= (uint64_t)(stdvga_mem_readb(addr + 2)) << 16; -- data |= (uint64_t)(stdvga_mem_readb(addr + 3)) << 24; -- data |= (uint64_t)(stdvga_mem_readb(addr + 4)) << 32; -- data |= (uint64_t)(stdvga_mem_readb(addr + 5)) << 40; -- data |= (uint64_t)(stdvga_mem_readb(addr + 6)) << 48; -- data |= (uint64_t)(stdvga_mem_readb(addr + 7)) << 56; -- break; -- -- default: -- gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size); -- break; -- } -- -- *p_data = data; -- return X86EMUL_OKAY; --} -- --static void stdvga_mem_writeb(uint64_t addr, uint32_t val) --{ -- struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -- int plane, write_mode, b, func_select, mask; -- uint32_t write_mask, bit_mask, set_mask, *vram_l; -- uint8_t *vram_b; -- -- addr = stdvga_mem_offset(s, addr); -- if ( addr == ~0u ) -- return; -- -- if ( s->sr[4] & 0x08 ) -- { -- /* chain 4 mode : simplest access */ -- plane = addr & 3; -- mask = (1 << plane); -- if ( s->sr[2] & mask ) -- { -- vram_b = vram_getb(s, addr); -- *vram_b = val; -- vram_put(s, vram_b); -- } -- } -- else if ( s->gr[5] & 0x10 ) -- { -- /* odd/even mode (aka text mode mapping) */ -- plane = (s->gr[4] & 2) | (addr & 1); -- mask = (1 << plane); -- if ( s->sr[2] & mask ) -- { -- addr = ((addr & ~1) << 1) | plane; -- vram_b = vram_getb(s, addr); -- *vram_b = val; -- vram_put(s, vram_b); -- } -- } -- else -- { -- write_mode = s->gr[5] & 3; -- switch ( write_mode ) -- { -- default: -- case 0: -- /* rotate */ -- b = s->gr[3] & 7; -- val = ((val >> b) | (val << (8 - b))) & 0xff; -- val |= val << 8; -- val |= val << 16; -- -- /* apply set/reset mask */ -- set_mask = mask16[s->gr[1]]; -- val = (val & ~set_mask) | 
(mask16[s->gr[0]] & set_mask); -- bit_mask = s->gr[8]; -- break; -- case 1: -- val = s->latch; -- goto do_write; -- case 2: -- val = mask16[val & 0x0f]; -- bit_mask = s->gr[8]; -- break; -- case 3: -- /* rotate */ -- b = s->gr[3] & 7; -- val = (val >> b) | (val << (8 - b)); -- -- bit_mask = s->gr[8] & val; -- val = mask16[s->gr[0]]; -- break; -- } -- -- /* apply logical operation */ -- func_select = s->gr[3] >> 3; -- switch ( func_select ) -- { -- case 0: -- default: -- /* nothing to do */ -- break; -- case 1: -- /* and */ -- val &= s->latch; -- break; -- case 2: -- /* or */ -- val |= s->latch; -- break; -- case 3: -- /* xor */ -- val ^= s->latch; -- break; -- } -- -- /* apply bit mask */ -- bit_mask |= bit_mask << 8; -- bit_mask |= bit_mask << 16; -- val = (val & bit_mask) | (s->latch & ~bit_mask); -- -- do_write: -- /* mask data according to sr[2] */ -- mask = s->sr[2]; -- write_mask = mask16[mask]; -- vram_l = vram_getl(s, addr); -- *vram_l = (*vram_l & ~write_mask) | (val & write_mask); -- vram_put(s, vram_l); -- } -+ ASSERT_UNREACHABLE(); -+ *p_data = ~0; -+ return X86EMUL_UNHANDLEABLE; - } - - static int cf_check stdvga_mem_write( -@@ -420,47 +150,8 @@ static int cf_check stdvga_mem_write( - .dir = IOREQ_WRITE, - .data = data, - }; -- struct ioreq_server *srv; -- -- goto done; -- -- /* Intercept mmio write */ -- switch ( size ) -- { -- case 1: -- stdvga_mem_writeb(addr, (data >> 0) & 0xff); -- break; -- -- case 2: -- stdvga_mem_writeb(addr+0, (data >> 0) & 0xff); -- stdvga_mem_writeb(addr+1, (data >> 8) & 0xff); -- break; -- -- case 4: -- stdvga_mem_writeb(addr+0, (data >> 0) & 0xff); -- stdvga_mem_writeb(addr+1, (data >> 8) & 0xff); -- stdvga_mem_writeb(addr+2, (data >> 16) & 0xff); -- stdvga_mem_writeb(addr+3, (data >> 24) & 0xff); -- break; -- -- case 8: -- stdvga_mem_writeb(addr+0, (data >> 0) & 0xff); -- stdvga_mem_writeb(addr+1, (data >> 8) & 0xff); -- stdvga_mem_writeb(addr+2, (data >> 16) & 0xff); -- stdvga_mem_writeb(addr+3, (data >> 24) & 0xff); -- 
stdvga_mem_writeb(addr+4, (data >> 32) & 0xff); -- stdvga_mem_writeb(addr+5, (data >> 40) & 0xff); -- stdvga_mem_writeb(addr+6, (data >> 48) & 0xff); -- stdvga_mem_writeb(addr+7, (data >> 56) & 0xff); -- break; -- -- default: -- gdprintk(XENLOG_WARNING, "invalid io size: %u\n", size); -- break; -- } -+ struct ioreq_server *srv = ioreq_server_select(current->domain, &p); - -- done: -- srv = ioreq_server_select(current->domain, &p); - if ( !srv ) - return X86EMUL_UNHANDLEABLE; - -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index 67f4d033a7..91714f3614 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -115,7 +115,6 @@ struct hvm_hw_stdvga { - uint8_t sr[8]; - uint8_t gr_index; - uint8_t gr[9]; -- uint32_t latch; - struct page_info *vram_page[64]; /* shadow of 0xa0000-0xaffff */ - spinlock_t lock; - }; --- -2.47.0 - diff --git a/0073-x86-HVM-drop-stdvga-s-gr-struct-member.patch b/0073-x86-HVM-drop-stdvga-s-gr-struct-member.patch deleted file mode 100644 index 0e891d1..0000000 --- a/0073-x86-HVM-drop-stdvga-s-gr-struct-member.patch +++ /dev/null @@ -1,70 +0,0 @@ -From 2334fb4fefba07a33cc466d054fbb67a8cc2d5d5 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 12 Nov 2024 13:39:19 +0100 -Subject: [PATCH 73/83] x86/HVM: drop stdvga's "gr[]" struct member - -No consumers are left, hence the producer and the array itself can also -go away. The static gr_mask[] is then orphaned and hence needs dropping, -too. 
- -This is part of XSA-463 / CVE-2024-45818 - -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit b16c0966a17f19c0e55ed0b9baa28191d2590178) ---- - xen/arch/x86/hvm/stdvga.c | 18 ------------------ - xen/arch/x86/include/asm/hvm/io.h | 1 - - 2 files changed, 19 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index 0f0bd10068..fa25833caa 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -49,18 +49,6 @@ static const uint8_t sr_mask[8] = { - (uint8_t)~0x00, - }; - --static const uint8_t gr_mask[9] = { -- (uint8_t)~0xf0, /* 0x00 */ -- (uint8_t)~0xf0, /* 0x01 */ -- (uint8_t)~0xf0, /* 0x02 */ -- (uint8_t)~0xe0, /* 0x03 */ -- (uint8_t)~0xfc, /* 0x04 */ -- (uint8_t)~0x84, /* 0x05 */ -- (uint8_t)~0xf0, /* 0x06 */ -- (uint8_t)~0xf0, /* 0x07 */ -- (uint8_t)~0x00, /* 0x08 */ --}; -- - static int stdvga_outb(uint64_t addr, uint8_t val) - { - struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -@@ -82,12 +70,6 @@ static int stdvga_outb(uint64_t addr, uint8_t val) - s->gr_index = val; - break; - -- case 0x3cf: /* graphics data register */ -- rc = (s->gr_index < sizeof(s->gr)); -- if ( rc ) -- s->gr[s->gr_index] = val & gr_mask[s->gr_index]; -- break; -- - default: - rc = 0; - break; -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index 91714f3614..126622e53c 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -114,7 +114,6 @@ struct hvm_hw_stdvga { - uint8_t sr_index; - uint8_t sr[8]; - uint8_t gr_index; -- uint8_t gr[9]; - struct page_info *vram_page[64]; /* shadow of 0xa0000-0xaffff */ - spinlock_t lock; - }; --- -2.47.0 - diff --git a/0074-x86-HVM-drop-stdvga-s-sr-struct-member.patch b/0074-x86-HVM-drop-stdvga-s-sr-struct-member.patch deleted file mode 100644 index a5d3613..0000000 --- a/0074-x86-HVM-drop-stdvga-s-sr-struct-member.patch +++ /dev/null @@ -1,70 +0,0 @@ -From 
04f1c5e6f7ba31468bd97889a90f05e9ff0fe812 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 12 Nov 2024 13:39:38 +0100 -Subject: [PATCH 74/83] x86/HVM: drop stdvga's "sr[]" struct member - -No consumers are left, hence the producer and the array itself can also -go away. The static sr_mask[] is then orphaned and hence needs dropping, -too. - -This is part of XSA-463 / CVE-2024-45818 - -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit 7aba44bdd78aedb97703811948c3b69ccff85032) ---- - xen/arch/x86/hvm/stdvga.c | 18 ------------------ - xen/arch/x86/include/asm/hvm/io.h | 1 - - 2 files changed, 19 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index fa25833caa..5523a441dd 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -37,18 +37,6 @@ - #define VGA_MEM_BASE 0xa0000 - #define VGA_MEM_SIZE 0x20000 - --/* force some bits to zero */ --static const uint8_t sr_mask[8] = { -- (uint8_t)~0xfc, -- (uint8_t)~0xc2, -- (uint8_t)~0xf0, -- (uint8_t)~0xc0, -- (uint8_t)~0xf1, -- (uint8_t)~0xff, -- (uint8_t)~0xff, -- (uint8_t)~0x00, --}; -- - static int stdvga_outb(uint64_t addr, uint8_t val) - { - struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -@@ -60,12 +48,6 @@ static int stdvga_outb(uint64_t addr, uint8_t val) - s->sr_index = val; - break; - -- case 0x3c5: /* sequencer data register */ -- rc = (s->sr_index < sizeof(s->sr)); -- if ( rc ) -- s->sr[s->sr_index] = val & sr_mask[s->sr_index] ; -- break; -- - case 0x3ce: /* graphics address register */ - s->gr_index = val; - break; -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index 126622e53c..3e9079eab6 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -112,7 +112,6 @@ struct vpci_arch_msix_entry { - - struct hvm_hw_stdvga { - uint8_t sr_index; -- uint8_t sr[8]; - uint8_t gr_index; - struct page_info *vram_page[64]; /* shadow of 0xa0000-0xaffff 
*/ - spinlock_t lock; --- -2.47.0 - diff --git a/0075-x86-HVM-drop-stdvga-s-g-s-r_index-struct-members.patch b/0075-x86-HVM-drop-stdvga-s-g-s-r_index-struct-members.patch deleted file mode 100644 index 8fb5855..0000000 --- a/0075-x86-HVM-drop-stdvga-s-g-s-r_index-struct-members.patch +++ /dev/null @@ -1,114 +0,0 @@ -From 77cb6587d4ee56cf76e48bc08bc3e5a6abb835e4 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 12 Nov 2024 13:39:56 +0100 -Subject: [PATCH 75/83] x86/HVM: drop stdvga's "{g,s}r_index" struct members - -No consumers are left, hence the producer and the fields themselves can -also go away. stdvga_outb() is then useless, rendering stdvga_out() -useless as well. Hence the entire I/O port intercept can go away. - -This is part of XSA-463 / CVE-2024-45818 - -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit 86c03372e107f5c18266a62281663861b1144929) ---- - xen/arch/x86/hvm/stdvga.c | 61 ------------------------------- - xen/arch/x86/include/asm/hvm/io.h | 2 - - 2 files changed, 63 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index 5523a441dd..155a67a438 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -37,62 +37,6 @@ - #define VGA_MEM_BASE 0xa0000 - #define VGA_MEM_SIZE 0x20000 - --static int stdvga_outb(uint64_t addr, uint8_t val) --{ -- struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -- int rc = 1; -- -- switch ( addr ) -- { -- case 0x3c4: /* sequencer address register */ -- s->sr_index = val; -- break; -- -- case 0x3ce: /* graphics address register */ -- s->gr_index = val; -- break; -- -- default: -- rc = 0; -- break; -- } -- -- return rc; --} -- --static void stdvga_out(uint32_t port, uint32_t bytes, uint32_t val) --{ -- switch ( bytes ) -- { -- case 1: -- stdvga_outb(port, val); -- break; -- -- case 2: -- stdvga_outb(port + 0, val >> 0); -- stdvga_outb(port + 1, val >> 8); -- break; -- -- default: -- break; -- } --} -- --static int cf_check 
stdvga_intercept_pio( -- int dir, unsigned int port, unsigned int bytes, uint32_t *val) --{ -- struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -- -- if ( dir == IOREQ_WRITE ) -- { -- spin_lock(&s->lock); -- stdvga_out(port, bytes, *val); -- spin_unlock(&s->lock); -- } -- -- return X86EMUL_UNHANDLEABLE; /* propagate to external ioemu */ --} -- - static int cf_check stdvga_mem_read( - const struct hvm_io_handler *handler, uint64_t addr, uint32_t size, - uint64_t *p_data) -@@ -194,11 +138,6 @@ void stdvga_init(struct domain *d) - { - struct hvm_io_handler *handler; - -- /* Sequencer registers. */ -- register_portio_handler(d, 0x3c4, 2, stdvga_intercept_pio); -- /* Graphics registers. */ -- register_portio_handler(d, 0x3ce, 2, stdvga_intercept_pio); -- - /* VGA memory */ - handler = hvm_next_io_handler(d); - -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index 3e9079eab6..bf9ddfc70e 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -111,8 +111,6 @@ struct vpci_arch_msix_entry { - }; - - struct hvm_hw_stdvga { -- uint8_t sr_index; -- uint8_t gr_index; - struct page_info *vram_page[64]; /* shadow of 0xa0000-0xaffff */ - spinlock_t lock; - }; --- -2.47.0 - diff --git a/0076-x86-HVM-drop-stdvga-s-vram_page-struct-member.patch b/0076-x86-HVM-drop-stdvga-s-vram_page-struct-member.patch deleted file mode 100644 index cf33435..0000000 --- a/0076-x86-HVM-drop-stdvga-s-vram_page-struct-member.patch +++ /dev/null @@ -1,124 +0,0 @@ -From 7b2df91a0e680c1fa529c0de7102027225bd5a37 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 12 Nov 2024 13:40:15 +0100 -Subject: [PATCH 76/83] x86/HVM: drop stdvga's "vram_page[]" struct member - -No uses are left, hence its setup, teardown, and the field itself can -also go away. stdvga_deinit() is then empty and can be dropped as well.
- -This is part of XSA-463 / CVE-2024-45818 - -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit 3beb4baf2a0a2eef40d39eb7e6eecbfd36da5d14) ---- - xen/arch/x86/hvm/hvm.c | 2 -- - xen/arch/x86/hvm/stdvga.c | 41 +++---------------------------- - xen/arch/x86/include/asm/hvm/io.h | 2 -- - 3 files changed, 4 insertions(+), 41 deletions(-) - -diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c -index 0fe2b85b16..fbca7e4949 100644 ---- a/xen/arch/x86/hvm/hvm.c -+++ b/xen/arch/x86/hvm/hvm.c -@@ -700,7 +700,6 @@ int hvm_domain_initialise(struct domain *d, - return 0; - - fail2: -- stdvga_deinit(d); - vioapic_deinit(d); - fail1: - if ( is_hardware_domain(d) ) -@@ -763,7 +762,6 @@ void hvm_domain_destroy(struct domain *d) - if ( hvm_funcs.domain_destroy ) - alternative_vcall(hvm_funcs.domain_destroy, d); - -- stdvga_deinit(d); - vioapic_deinit(d); - - XFREE(d->arch.hvm.pl_time); -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index 155a67a438..9f308fc896 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -116,8 +116,7 @@ static const struct hvm_io_ops stdvga_mem_ops = { - void stdvga_init(struct domain *d) - { - struct hvm_hw_stdvga *s = &d->arch.hvm.stdvga; -- struct page_info *pg; -- unsigned int i; -+ struct hvm_io_handler *handler; - - if ( !has_vvga(d) ) - return; -@@ -125,47 +124,15 @@ void stdvga_init(struct domain *d) - memset(s, 0, sizeof(*s)); - spin_lock_init(&s->lock); - -- for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ ) -+ /* VGA memory */ -+ handler = hvm_next_io_handler(d); -+ if ( handler ) - { -- pg = alloc_domheap_page(d, MEMF_no_owner); -- if ( pg == NULL ) -- break; -- s->vram_page[i] = pg; -- clear_domain_page(page_to_mfn(pg)); -- } -- -- if ( i == ARRAY_SIZE(s->vram_page) ) -- { -- struct hvm_io_handler *handler; -- -- /* VGA memory */ -- handler = hvm_next_io_handler(d); -- -- if ( handler == NULL ) -- return; -- - handler->type = IOREQ_TYPE_COPY; - handler->ops = 
&stdvga_mem_ops; - } - } - --void stdvga_deinit(struct domain *d) --{ -- struct hvm_hw_stdvga *s = &d->arch.hvm.stdvga; -- int i; -- -- if ( !has_vvga(d) ) -- return; -- -- for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ ) -- { -- if ( s->vram_page[i] == NULL ) -- continue; -- free_domheap_page(s->vram_page[i]); -- s->vram_page[i] = NULL; -- } --} -- - /* - * Local variables: - * mode: C -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index bf9ddfc70e..d49f6d6f8c 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -111,12 +111,10 @@ struct vpci_arch_msix_entry { - }; - - struct hvm_hw_stdvga { -- struct page_info *vram_page[64]; /* shadow of 0xa0000-0xaffff */ - spinlock_t lock; - }; - - void stdvga_init(struct domain *d); --void stdvga_deinit(struct domain *d); - - extern void hvm_dpci_msi_eoi(struct domain *d, int vector); - --- -2.47.0 - diff --git a/0077-x86-HVM-drop-stdvga-s-lock-struct-member.patch b/0077-x86-HVM-drop-stdvga-s-lock-struct-member.patch deleted file mode 100644 index 24a642a..0000000 --- a/0077-x86-HVM-drop-stdvga-s-lock-struct-member.patch +++ /dev/null @@ -1,119 +0,0 @@ -From 1cb4e0a5fed5b7030b901b827643912b7070cd31 Mon Sep 17 00:00:00 2001 -From: Jan Beulich -Date: Tue, 12 Nov 2024 13:40:33 +0100 -Subject: [PATCH 77/83] x86/HVM: drop stdvga's "lock" struct member - -No state is left to protect. It being the last field, drop the struct -itself as well. Similarly for then ending up empty, drop the .complete -handler. 
- -This is part of XSA-463 / CVE-2024-45818 - -Suggested-by: Andrew Cooper -Signed-off-by: Jan Beulich -Reviewed-by: Andrew Cooper -(cherry picked from commit b180a50326c8a2c171f37c1940a0fbbdcad4be90) ---- - xen/arch/x86/hvm/stdvga.c | 30 ++------------------------- - xen/arch/x86/include/asm/hvm/domain.h | 1 - - xen/arch/x86/include/asm/hvm/io.h | 4 ---- - 3 files changed, 2 insertions(+), 33 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index 9f308fc896..d38d30affb 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -69,61 +69,35 @@ static int cf_check stdvga_mem_write( - static bool cf_check stdvga_mem_accept( - const struct hvm_io_handler *handler, const ioreq_t *p) - { -- struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -- -- /* -- * The range check must be done without taking the lock, to avoid -- * deadlock when hvm_mmio_internal() is called from -- * hvm_copy_to/from_guest_phys() in hvm_process_io_intercept(). -- */ - if ( (ioreq_mmio_first_byte(p) < VGA_MEM_BASE) || - (ioreq_mmio_last_byte(p) >= (VGA_MEM_BASE + VGA_MEM_SIZE)) ) - return 0; - -- spin_lock(&s->lock); -- - if ( p->dir != IOREQ_WRITE || p->data_is_ptr || p->count != 1 ) - { - /* - * Only accept single direct writes, as that's the only thing we can - * accelerate using buffered ioreq handling.
- */ -- goto reject; -+ return false; - } - -- /* s->lock intentionally held */ -- return 1; -- -- reject: -- spin_unlock(&s->lock); -- return 0; --} -- --static void cf_check stdvga_mem_complete(const struct hvm_io_handler *handler) --{ -- struct hvm_hw_stdvga *s = &current->domain->arch.hvm.stdvga; -- -- spin_unlock(&s->lock); -+ return true; - } - - static const struct hvm_io_ops stdvga_mem_ops = { - .accept = stdvga_mem_accept, - .read = stdvga_mem_read, - .write = stdvga_mem_write, -- .complete = stdvga_mem_complete - }; - - void stdvga_init(struct domain *d) - { -- struct hvm_hw_stdvga *s = &d->arch.hvm.stdvga; - struct hvm_io_handler *handler; - - if ( !has_vvga(d) ) - return; - -- memset(s, 0, sizeof(*s)); -- spin_lock_init(&s->lock); -- - /* VGA memory */ - handler = hvm_next_io_handler(d); - if ( handler ) -diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h -index dd9d837e84..333501d5f2 100644 ---- a/xen/arch/x86/include/asm/hvm/domain.h -+++ b/xen/arch/x86/include/asm/hvm/domain.h -@@ -72,7 +72,6 @@ struct hvm_domain { - struct hvm_hw_vpic vpic[2]; /* 0=master; 1=slave */ - struct hvm_vioapic **vioapic; - unsigned int nr_vioapics; -- struct hvm_hw_stdvga stdvga; - - /* - * hvm_hw_pmtimer is a publicly-visible name. 
We will defer renaming -diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h -index d49f6d6f8c..d72b29f73f 100644 ---- a/xen/arch/x86/include/asm/hvm/io.h -+++ b/xen/arch/x86/include/asm/hvm/io.h -@@ -110,10 +110,6 @@ struct vpci_arch_msix_entry { - int pirq; - }; - --struct hvm_hw_stdvga { -- spinlock_t lock; --}; -- - void stdvga_init(struct domain *d); - - extern void hvm_dpci_msi_eoi(struct domain *d, int vector); --- -2.47.0 - diff --git a/0078-x86-hvm-Simplify-stdvga_mem_accept-further.patch b/0078-x86-hvm-Simplify-stdvga_mem_accept-further.patch deleted file mode 100644 index 25e34bf..0000000 --- a/0078-x86-hvm-Simplify-stdvga_mem_accept-further.patch +++ /dev/null @@ -1,94 +0,0 @@ -From ad77081ac6f79d48c9492b6ecae3e4fde58c8a32 Mon Sep 17 00:00:00 2001 -From: Andrew Cooper -Date: Tue, 12 Nov 2024 13:40:51 +0100 -Subject: [PATCH 78/83] x86/hvm: Simplify stdvga_mem_accept() further - -stdvga_mem_accept() is called on almost all IO emulations, and the -overwhelming likely answer is to reject the ioreq. 
Simply rearranging the -expression yields an improvement: - - add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-57 (-57) - Function old new delta - stdvga_mem_accept 109 52 -57 - -which is best explained looking at the disassembly: - - Before: After: - f3 0f 1e fa endbr64 f3 0f 1e fa endbr64 - 0f b6 4e 1e movzbl 0x1e(%rsi),%ecx | 0f b6 46 1e movzbl 0x1e(%rsi),%eax - 48 8b 16 mov (%rsi),%rdx | 31 d2 xor %edx,%edx - f6 c1 40 test $0x40,%cl | a8 30 test $0x30,%al - 75 38 jne | 75 23 jne - 31 c0 xor %eax,%eax < - 48 81 fa ff ff 09 00 cmp $0x9ffff,%rdx < - 76 26 jbe < - 8b 46 14 mov 0x14(%rsi),%eax < - 8b 7e 10 mov 0x10(%rsi),%edi < - 48 0f af c7 imul %rdi,%rax < - 48 8d 54 02 ff lea -0x1(%rdx,%rax,1),%rdx < - 31 c0 xor %eax,%eax < - 48 81 fa ff ff 0b 00 cmp $0xbffff,%rdx < - 77 0c ja < - 83 e1 30 and $0x30,%ecx < - 75 07 jne < - 83 7e 10 01 cmpl $0x1,0x10(%rsi) 83 7e 10 01 cmpl $0x1,0x10(%rsi) - 0f 94 c0 sete %al | 75 1d jne - c3 ret | 48 8b 0e mov (%rsi),%rcx - 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) | 48 81 f9 ff ff 09 00 cmp $0x9ffff,%rcx - 8b 46 10 mov 0x10(%rsi),%eax | 76 11 jbe - 8b 7e 14 mov 0x14(%rsi),%edi | 8b 46 14 mov 0x14(%rsi),%eax - 49 89 d0 mov %rdx,%r8 | 48 8d 44 01 ff lea -0x1(%rcx,%rax,1),%rax - 48 83 e8 01 sub $0x1,%rax | 48 3d ff ff 0b 00 cmp $0xbffff,%rax - 48 8d 54 3a ff lea -0x1(%rdx,%rdi,1),%rdx | 0f 96 c2 setbe %dl - 48 0f af c7 imul %rdi,%rax | 89 d0 mov %edx,%eax - 49 29 c0 sub %rax,%r8 < - 31 c0 xor %eax,%eax < - 49 81 f8 ff ff 09 00 cmp $0x9ffff,%r8 < - 77 be ja < - c3 ret c3 ret - -By moving the "p->count != 1" check ahead of the -ioreq_mmio_{first,last}_byte() calls, both multiplies disappear along with a -lot of surrounding logic. - -No functional change. 
- -Signed-off-by: Andrew Cooper -Reviewed-by: Jan Beulich -(cherry picked from commit 08ffd8705d36c7c445df3ecee8ad9b8f8d65fbe0) ---- - xen/arch/x86/hvm/stdvga.c | 16 ++++++---------- - 1 file changed, 6 insertions(+), 10 deletions(-) - -diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c -index d38d30affb..c3c43f59ee 100644 ---- a/xen/arch/x86/hvm/stdvga.c -+++ b/xen/arch/x86/hvm/stdvga.c -@@ -69,18 +69,14 @@ static int cf_check stdvga_mem_write( - static bool cf_check stdvga_mem_accept( - const struct hvm_io_handler *handler, const ioreq_t *p) - { -- if ( (ioreq_mmio_first_byte(p) < VGA_MEM_BASE) || -+ /* -+ * Only accept single direct writes, as that's the only thing we can -+ * accelerate using buffered ioreq handling. -+ */ -+ if ( p->dir != IOREQ_WRITE || p->data_is_ptr || p->count != 1 || -+ (ioreq_mmio_first_byte(p) < VGA_MEM_BASE) || - (ioreq_mmio_last_byte(p) >= (VGA_MEM_BASE + VGA_MEM_SIZE)) ) -- return 0; -- -- if ( p->dir != IOREQ_WRITE || p->data_is_ptr || p->count != 1 ) -- { -- /* -- * Only accept single direct writes, as that's the only thing we can -- * accelerate using buffered ioreq handling. -- */ - return false; -- } - - return true; - } --- -2.47.0 - diff --git a/0079-libxl-Use-zero-ed-memory-for-PVH-acpi-tables.patch b/0079-libxl-Use-zero-ed-memory-for-PVH-acpi-tables.patch deleted file mode 100644 index 78dbe80..0000000 --- a/0079-libxl-Use-zero-ed-memory-for-PVH-acpi-tables.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 267796fd043de41d7e64b4b8d88512fc46c13d4b Mon Sep 17 00:00:00 2001 -From: Jason Andryuk -Date: Tue, 12 Nov 2024 13:41:13 +0100 -Subject: [PATCH 79/83] libxl: Use zero-ed memory for PVH acpi tables - -xl/libxl memory is leaking into a PVH guest through uninitialized -portions of the ACPI tables. - -Use libxl_zalloc() to obtain zero-ed memory to avoid this issue. - -This is XSA-464 / CVE-2024-45819. 
- -Signed-off-by: Jason Andryuk -Fixes: 14c0d328da2b ("libxl/acpi: Build ACPI tables for HVMlite guests") -Reviewed-by: Jan Beulich -master commit: 0bfe567b58f1182889dea9207103fc9d00baf414 -master date: 2024-11-12 13:32:45 +0100 ---- - tools/libs/light/libxl_x86_acpi.c | 7 ++++--- - 1 file changed, 4 insertions(+), 3 deletions(-) - -diff --git a/tools/libs/light/libxl_x86_acpi.c b/tools/libs/light/libxl_x86_acpi.c -index 5cf261bd67..2574ce2553 100644 ---- a/tools/libs/light/libxl_x86_acpi.c -+++ b/tools/libs/light/libxl_x86_acpi.c -@@ -176,10 +176,11 @@ int libxl__dom_load_acpi(libxl__gc *gc, - goto out; - } - -- config.rsdp = (unsigned long)libxl__malloc(gc, libxl_ctxt.page_size); -- config.infop = (unsigned long)libxl__malloc(gc, libxl_ctxt.page_size); -+ /* These are all copied into guest memory, so use zero-ed memory. */ -+ config.rsdp = (unsigned long)libxl__zalloc(gc, libxl_ctxt.page_size); -+ config.infop = (unsigned long)libxl__zalloc(gc, libxl_ctxt.page_size); - /* Pages to hold ACPI tables */ -- libxl_ctxt.buf = libxl__malloc(gc, NUM_ACPI_PAGES * -+ libxl_ctxt.buf = libxl__zalloc(gc, NUM_ACPI_PAGES * - libxl_ctxt.page_size); - - /* --- -2.47.0 - diff --git a/0080-x86-io-apic-fix-directed-EOI-when-using-AMD-Vi-inter.patch b/0080-x86-io-apic-fix-directed-EOI-when-using-AMD-Vi-inter.patch deleted file mode 100644 index 149cc17..0000000 --- a/0080-x86-io-apic-fix-directed-EOI-when-using-AMD-Vi-inter.patch +++ /dev/null @@ -1,160 +0,0 @@ -From c86ec8e156db497c674a3a8d40cbcec8c3f68629 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 12 Nov 2024 13:42:16 +0100 -Subject: [PATCH 80/83] x86/io-apic: fix directed EOI when using AMD-Vi - interrupt remapping -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -When using AMD-Vi interrupt remapping the vector field in the IO-APIC RTE is -repurposed to contain part of the offset into the remapping table. 
Previous to -2ca9fbd739b8 Xen had logic so that the offset into the interrupt remapping -table would match the vector. Such logic was mandatory for end of interrupt to -work, since the vector field (even when not containing a vector) is used by the -IO-APIC to find for which pin the EOI must be performed. - -A simple solution wold be to read the IO-APIC RTE each time an EOI is to be -performed, so the raw value of the vector field can be obtained. However -that's likely to perform poorly. Instead introduce a cache to store the -EOI handles when using interrupt remapping, so that the IO-APIC driver can -translate pins into EOI handles without having to read the IO-APIC RTE entry. -Note that to simplify the logic such cache is used unconditionally when -interrupt remapping is enabled, even if strictly it would only be required -for AMD-Vi. - -Reported-by: Willi Junga -Suggested-by: David Woodhouse -Fixes: 2ca9fbd739b8 ('AMD IOMMU: allocate IRTE entries instead of using a static mapping') -Signed-off-by: Roger Pau Monné -Tested-by: Marek Marczykowski-Górecki -Reviewed-by: Jan Beulich -master commit: 86001b3970fea4536048607ea6e12541736c48e1 -master date: 2024-11-05 10:36:53 +0000 ---- - xen/arch/x86/io_apic.c | 75 +++++++++++++++++++++++++++++++++++++++--- - 1 file changed, 70 insertions(+), 5 deletions(-) - -diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c -index d2a313c4ac..44bfb3de8e 100644 ---- a/xen/arch/x86/io_apic.c -+++ b/xen/arch/x86/io_apic.c -@@ -71,6 +71,24 @@ static int apic_pin_2_gsi_irq(int apic, int pin); - - static vmask_t *__read_mostly vector_map[MAX_IO_APICS]; - -+/* -+ * Store the EOI handle when using interrupt remapping. -+ * -+ * If using AMD-Vi interrupt remapping the IO-APIC redirection entry remapped -+ * format repurposes the vector field to store the offset into the Interrupt -+ * Remap table. This breaks directed EOI, as the CPU vector no longer matches -+ * the contents of the RTE vector field. 
Add a translation cache so that -+ * directed EOI uses the value in the RTE vector field when interrupt remapping -+ * is enabled. -+ * -+ * Intel VT-d Xen code still stores the CPU vector in the RTE vector field when -+ * using the remapped format, but use the translation cache uniformly in order -+ * to avoid extra logic to differentiate between VT-d and AMD-Vi. -+ * -+ * The matrix is accessed as [#io-apic][#pin]. -+ */ -+static uint8_t **__ro_after_init io_apic_pin_eoi; -+ - static void share_vector_maps(unsigned int src, unsigned int dst) - { - unsigned int pin; -@@ -273,6 +291,17 @@ void __ioapic_write_entry( - { - __io_apic_write(apic, 0x11 + 2 * pin, eu.w2); - __io_apic_write(apic, 0x10 + 2 * pin, eu.w1); -+ /* -+ * Might be called before io_apic_pin_eoi is allocated. Entry will be -+ * initialized to the RTE value once the cache is allocated. -+ * -+ * The vector field is only cached for raw RTE writes when using IR. -+ * In that case the vector field might have been repurposed to store -+ * something different than the CPU vector, and hence need to be cached -+ * for performing EOI. -+ */ -+ if ( io_apic_pin_eoi ) -+ io_apic_pin_eoi[apic][pin] = e.vector; - } - else - iommu_update_ire_from_apic(apic, pin, e.raw); -@@ -288,18 +317,36 @@ static void ioapic_write_entry( - spin_unlock_irqrestore(&ioapic_lock, flags); - } - --/* EOI an IO-APIC entry. Vector may be -1, indicating that it should be -+/* -+ * EOI an IO-APIC entry. Vector may be -1, indicating that it should be - * worked out using the pin. This function expects that the ioapic_lock is - * being held, and interrupts are disabled (or there is a good reason not - * to), and that if both pin and vector are passed, that they refer to the -- * same redirection entry in the IO-APIC. */ -+ * same redirection entry in the IO-APIC. 
-+ * -+ * If using Interrupt Remapping the vector is always ignored because the RTE -+ * remapping format might have repurposed the vector field and a cached value -+ * of the EOI handle to use is obtained based on the provided apic and pin -+ * values. -+ */ - static void __io_apic_eoi(unsigned int apic, unsigned int vector, unsigned int pin) - { - /* Prefer the use of the EOI register if available */ - if ( ioapic_has_eoi_reg(apic) ) - { -- /* If vector is unknown, read it from the IO-APIC */ -- if ( vector == IRQ_VECTOR_UNASSIGNED ) -+ if ( io_apic_pin_eoi ) -+ /* -+ * If the EOI handle is cached use it. When using AMD-Vi IR the CPU -+ * vector no longer matches the vector field in the RTE, because -+ * the RTE remapping format repurposes the field. -+ * -+ * The value in the RTE vector field must always be used to signal -+ * which RTE to EOI, hence use the cached value which always -+ * mirrors the contents of the raw RTE vector field. -+ */ -+ vector = io_apic_pin_eoi[apic][pin]; -+ else if ( vector == IRQ_VECTOR_UNASSIGNED ) -+ /* If vector is unknown, read it from the IO-APIC */ - vector = __ioapic_read_entry(apic, pin, true).vector; - - *(IO_APIC_BASE(apic)+16) = vector; -@@ -1298,12 +1345,30 @@ void __init enable_IO_APIC(void) - vector_map[apic] = vector_map[0]; - } - -+ if ( iommu_intremap != iommu_intremap_off ) -+ { -+ io_apic_pin_eoi = xmalloc_array(typeof(*io_apic_pin_eoi), nr_ioapics); -+ BUG_ON(!io_apic_pin_eoi); -+ } -+ - for(apic = 0; apic < nr_ioapics; apic++) { - int pin; -- /* See if any of the pins is in ExtINT mode */ -+ -+ if ( io_apic_pin_eoi ) -+ { -+ io_apic_pin_eoi[apic] = xmalloc_array(typeof(**io_apic_pin_eoi), -+ nr_ioapic_entries[apic]); -+ BUG_ON(!io_apic_pin_eoi[apic]); -+ } -+ -+ /* See if any of the pins is in ExtINT mode and cache EOI handle */ - for (pin = 0; pin < nr_ioapic_entries[apic]; pin++) { - struct IO_APIC_route_entry entry = ioapic_read_entry(apic, pin, false); - -+ if ( io_apic_pin_eoi ) -+ 
io_apic_pin_eoi[apic][pin] = -+ ioapic_read_entry(apic, pin, true).vector; -+ - /* If the interrupt line is enabled and in ExtInt mode - * I have found the pin where the i8259 is connected. - */ --- -2.47.0 - diff --git a/0081-tools-libxl-remove-usage-of-VLA-arrays.patch b/0081-tools-libxl-remove-usage-of-VLA-arrays.patch deleted file mode 100644 index e42cb26..0000000 --- a/0081-tools-libxl-remove-usage-of-VLA-arrays.patch +++ /dev/null @@ -1,54 +0,0 @@ -From 1406f07aa13edcbcc8094c266faab117275962c6 Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 12 Nov 2024 13:42:56 +0100 -Subject: [PATCH 81/83] tools/libxl: remove usage of VLA arrays -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -Clang 19 complains with the following error when building libxl: - -libxl_utils.c:48:15: error: variable length array folded to constant array as an extension [-Werror,-Wgnu-folding-constant] - 48 | char path[strlen("/local/domain") + 12]; - | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Replace the usage of strlen() with sizeof, which allows the literal -string length to be known at build time. Note sizeof accounts for the -NUL terminator while strlen() didn't, hence subtract 1 from the total -size calculation. 
- -Signed-off-by: Roger Pau Monné -Reviewed-by: Frediano Ziglio -Acked-by: Andrew Cooper -Acked-by: Anthony PERARD -master commit: a7c7c3f6424504c4004bbb3437be319aa41ad580 -master date: 2024-11-05 19:59:50 +0000 ---- - tools/libs/light/libxl_utils.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/tools/libs/light/libxl_utils.c b/tools/libs/light/libxl_utils.c -index 10398a6c86..506c5b5631 100644 ---- a/tools/libs/light/libxl_utils.c -+++ b/tools/libs/light/libxl_utils.c -@@ -45,7 +45,7 @@ unsigned long libxl_get_required_shadow_memory(unsigned long maxmem_kb, unsigned - char *libxl_domid_to_name(libxl_ctx *ctx, uint32_t domid) - { - unsigned int len; -- char path[strlen("/local/domain") + 12]; -+ char path[sizeof("/local/domain") + 11]; - char *s; - - snprintf(path, sizeof(path), "/local/domain/%d/name", domid); -@@ -141,7 +141,7 @@ int libxl_cpupool_qualifier_to_cpupoolid(libxl_ctx *ctx, const char *p, - char *libxl_cpupoolid_to_name(libxl_ctx *ctx, uint32_t poolid) - { - unsigned int len; -- char path[strlen("/local/pool") + 12]; -+ char path[sizeof("/local/pool") + 11]; - char *s; - - snprintf(path, sizeof(path), "/local/pool/%d/name", poolid); --- -2.47.0 - diff --git a/0082-xen-x86-prevent-addition-of-.note.gnu.property-if-li.patch b/0082-xen-x86-prevent-addition-of-.note.gnu.property-if-li.patch deleted file mode 100644 index 2c3261e..0000000 --- a/0082-xen-x86-prevent-addition-of-.note.gnu.property-if-li.patch +++ /dev/null @@ -1,46 +0,0 @@ -From 251a9496485a86f302980a3f8d3c656831b5a62f Mon Sep 17 00:00:00 2001 -From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= -Date: Tue, 12 Nov 2024 13:43:23 +0100 -Subject: [PATCH 82/83] xen/x86: prevent addition of .note.gnu.property if - livepatch is enabled -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit - -GNU assembly that supports such feature will unconditionally add a -.note.gnu.property section to object files. 
The content of that section can -change depending on the generated instructions. The current logic in -livepatch-build-tools doesn't know how to deal with such section changing -as a result of applying a patch and rebuilding. - -Since .note.gnu.property is not consumed by the Xen build, suppress its -addition when livepatch support is enabled. - -Signed-off-by: Roger Pau Monné -Reviewed-by: Jan Beulich -master commit: 718400a54dcfcc8a11958a6d953168f50944f002 -master date: 2024-11-11 13:19:45 +0100 ---- - xen/arch/x86/arch.mk | 6 ++++++ - 1 file changed, 6 insertions(+) - -diff --git a/xen/arch/x86/arch.mk b/xen/arch/x86/arch.mk -index 4f6c086988..a683d4bedc 100644 ---- a/xen/arch/x86/arch.mk -+++ b/xen/arch/x86/arch.mk -@@ -46,6 +46,12 @@ CFLAGS-$(CONFIG_CC_IS_GCC) += -fno-jump-tables - CFLAGS-$(CONFIG_CC_IS_CLANG) += -mretpoline-external-thunk - endif - -+# Disable the addition of a .note.gnu.property section to object files when -+# livepatch support is enabled. The contents of that section can change -+# depending on the instructions used, and livepatch-build-tools doesn't know -+# how to deal with such changes. 
-+$(call cc-option-add,CFLAGS-$(CONFIG_LIVEPATCH),CC,-Wa$$(comma)-mx86-used-note=no) -+ - ifdef CONFIG_XEN_IBT - # Force -fno-jump-tables to work around - # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816 --- -2.47.0 - diff --git a/0083-xen-arm64-entry-Actually-skip-do_trap_-when-an-SErro.patch b/0083-xen-arm64-entry-Actually-skip-do_trap_-when-an-SErro.patch deleted file mode 100644 index 98d4a38..0000000 --- a/0083-xen-arm64-entry-Actually-skip-do_trap_-when-an-SErro.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 8567eefe37dd518aa342166ffb81d4a2f73765b6 Mon Sep 17 00:00:00 2001 -From: Julien Grall -Date: Tue, 6 Aug 2024 13:48:15 +0100 -Subject: [PATCH 83/83] xen/arm64: entry: Actually skip do_trap_*() when an - SError is triggered - -For SErrors, we support two configurations: - * Every SErrors will result to a panic in Xen - * We will forward SErrors triggered by a VM back to itself - -For the latter case, we want to skip the call to do_trap_*() because the PC -was already adjusted. - -However, the alternative used to decide between the two configurations -is inverted. This would result to the VM corrupting itself if: - * x19 is non-zero in the panic case - * advance PC too much in the second case - -Solve the issue by switch from alternative_if to alternative_if_not. - -Fixes: a458d3bd0d25 ("xen/arm: entry: Ensure the guest state is synced when receiving a vSError") -Signed-off-by: Julien Grall -(cherry picked from commit 35c64c3dce01c2d0689a8c13240bf48a10cef783) ---- - xen/arch/arm/arm64/entry.S | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S -index 6251135ebd..fab10f8a0d 100644 ---- a/xen/arch/arm/arm64/entry.S -+++ b/xen/arch/arm/arm64/entry.S -@@ -259,7 +259,7 @@ - * apart. The easiest way is to duplicate the few instructions - * that need to be skipped. 
- */ -- alternative_if SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT -+ alternative_if_not SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT - cbnz x19, 1f - mov x0, sp - bl do_trap_\trap --- -2.47.0 - diff --git a/info.txt b/info.txt index 4c25256..ac603fd 100644 --- a/info.txt +++ b/info.txt @@ -1,6 +1,6 @@ -Xen upstream patchset #1 for 4.19.1-pre +Xen upstream patchset #0 for 4.19.2-pre Containing patches from -RELEASE-4.19.0 (0ef126c163d99932c9d7142e2bd130633c5c4844) +RELEASE-4.19.1 (f6479c21645d44814cd3bcee09e07786ef94a476) to -staging-4.19 (8567eefe37dd518aa342166ffb81d4a2f73765b6) +staging-4.19 (ce591a92ca50d2b8851469006a7d7824445b5dbc)