Re: [PATCH v2] x86/shutdown: change default reboot method preference
On Tue, Oct 03, 2023 at 01:35:25PM +0200, Roger Pau Monné wrote:
> On Wed, Sep 27, 2023 at 10:21:44AM +0200, Jan Beulich wrote:
> > On 15.09.2023 09:43, Roger Pau Monne wrote:
> > > The current logic to chose the preferred reboot method is based on the
> > > mode Xen has been booted into, so if the box is booted from UEFI, the
> > > preferred reboot method will be to use the ResetSystem() run time
> > > service call.
> > >
> > > However, that method seems to be widely untested, and quite often leads
> > > to a result similar to:
> > >
> > > Hardware Dom0 shutdown: rebooting machine
> > > ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> > > CPU: 0
> > > RIP: e008:[<0000000000000017>] 0000000000000017
> > > RFLAGS: 0000000000010202 CONTEXT: hypervisor
> > > [...]
> > > Xen call trace:
> > > [<0000000000000017>] R 0000000000000017
> > > [<ffff83207eff7b50>] S ffff83207eff7b50
> > > [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> > > [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> > > [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> > > [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> > > [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> > > [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> > > [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> > > [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> > > [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> > >
> > > ****************************************
> > > Panic on CPU 0:
> > > FATAL TRAP: vector = 6 (invalid opcode)
> > > ****************************************
> > >
> > > Which in most cases does lead to a reboot, however that's unreliable.
> > >
> > > Change the default reboot preference to prefer ACPI over UEFI if
> > > available and not in reduced hardware mode.
> > >
> > > This is in line to what Linux does, so it's unlikely to cause issues on
> > > current and future hardware, since there's a much higher chance of
> > > vendors testing hardware with Linux rather than Xen.
> > >
> > > Add a special case for one Acer model that does require being rebooted
> > > using ResetSystem(). See Linux commit 0082517fa4bce for rationale.
> > >
> > > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > > properly implemented ResetSystem() methods.
> >
> > A data point from a new system I'm still in the process of setting up: The
> > ACPI reboot method, as used by Linux, unconditionally means a warm reboot.
> > The EFI method, otoh, properly distinguishes "reboot=warm" from our default
> > of explicitly requesting cold reboot. (Without taking the EFI path, I
> > assume our write to the relevant BDA location simply has no effect, for
> > this being a legacy BIOS thing, and the system apparently defaults to warm
> > reboot when using the ACPI method.)
>
> This is unfortunate, but IMO not as worse as getting a #UD or any
> other fault while attempting a reboot. We can always force this
> system to use UEFI reboot, if that does work better than ACPI.
>
> > Clearly, as a secondary effect, this system adds to my personal experience
> > of so far EFI reboot consistently working on all x86 hardware I have (had)
> > direct access to. (That said, this is the first non-Intel system, which
> > likely biases my overall experience.)
>
> I can try to gather some data, I can at least tell you that the Intel
> NUC11TNHi7 TGL does also hit a fault when attempting UEFI reboot.
> The above crash was from a Dell PowerEdge R6625. I do recall seeing
> this with other boxes on the Citrix lab, but don't know the exact
> models. I'm quite sure other downstreams can provide similar
> feedback.

As a further data point, Dasharo [0], a coreboot downstream, was also
shipping firmware with a broken ResetSystem() method, and they didn't
notice until someone reported errors on Xen reboot:

https://github.com/Dasharo/edk2/pull/99/commits/dee75be10ac9387168bd3a8cad0f1ec6e372129a

It's quite clear that no one is testing ResetSystem(). The UEFI spec
doesn't mandate using it, and we are just hurting ourselves by forcing
its usage.

Regards, Roger.

[0] https://github.com/Dasharo
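
For readers following the thread, the preference order described in the
quoted commit message can be sketched roughly as below. This is only an
illustration, not the code from the patch: the enum, the struct and its
field names, and the function name are all hypothetical, and the final
fallback to a keyboard-controller reset is an assumption that the thread
does not discuss.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for illustration; not Xen's actual names. */
enum reboot_method {
    REBOOT_ACPI,   /* ACPI reset register */
    REBOOT_EFI,    /* UEFI ResetSystem() runtime service */
    REBOOT_KBD,    /* assumed last-resort fallback, not from the thread */
};

struct platform_info {
    bool booted_from_uefi;   /* EFI runtime services are usable */
    bool acpi_available;     /* ACPI is present and enabled */
    bool acpi_reduced_hw;    /* ACPI reduced hardware mode */
    bool quirk_needs_efi;    /* e.g. the one Acer model mentioned above */
};

static enum reboot_method default_reboot_method(const struct platform_info *p)
{
    assert(p != NULL);

    /* Quirked systems keep using ResetSystem() when it is reachable. */
    if (p->quirk_needs_efi && p->booted_from_uefi)
        return REBOOT_EFI;

    /* New default: ACPI wins if available and not in reduced hw mode. */
    if (p->acpi_available && !p->acpi_reduced_hw)
        return REBOOT_ACPI;

    /* Otherwise fall back to UEFI when booted that way. */
    if (p->booted_from_uefi)
        return REBOOT_EFI;

    return REBOOT_KBD;
}
```

With this ordering, a UEFI-booted box with working ACPI would get
REBOOT_ACPI, while the same box in reduced hardware mode, or one matching
the quirk, would still get REBOOT_EFI.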