
Re: [PATCH v2] x86/shutdown: change default reboot method preference


  • To: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 18 Sep 2023 14:26:51 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 18 Sep 2023 12:27:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15.09.2023 09:43, Roger Pau Monne wrote:
> The current logic to choose the preferred reboot method is based on the mode
> Xen has been booted into, so if the box is booted from UEFI, the preferred
> reboot method will be to use the ResetSystem() run time service call.
> 
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
> 
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C    ]----
> CPU:    0
> RIP:    e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202   CONTEXT: hypervisor
> [...]
> Xen call trace:
>    [<0000000000000017>] R 0000000000000017
>    [<ffff83207eff7b50>] S ffff83207eff7b50
>    [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>    [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>    [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>    [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>    [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>    [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>    [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>    [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>    [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> 
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
> 
> Which in most cases does lead to a reboot, however that's unreliable.
> 
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
> 
> This is in line with what Linux does, so it's unlikely to cause issues on
> current and future hardware, since there's a much higher chance of vendors
> testing hardware with Linux rather than Xen.

I certainly appreciate this as a goal. However, ...

> Add a special case for one Acer model that does require being rebooted using
> ResetSystem().  See Linux commit 0082517fa4bce for rationale.

... this is precisely what I'd like to avoid: needing workarounds on
spec-conforming systems.

> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.

I'm also puzzled by this statement: That Acer aspect is a clear indication
of there being an issue. Plus it's quite easy to see that hooks may be put
in place by various firmware components that would then be used to make
certain adjustments to the platform, ahead of an orderly reboot / shutdown.

> --- a/xen/arch/x86/shutdown.c
> +++ b/xen/arch/x86/shutdown.c
> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>  
>      if ( xen_guest )
>          reboot_type = BOOT_XEN;
> +    else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> +        reboot_type = BOOT_ACPI;
>      else if ( efi_enabled(EFI_RS) )
>          reboot_type = BOOT_EFI;
> -    else if ( acpi_disabled )
> -        reboot_type = BOOT_KBD;
>      else
> -        reboot_type = BOOT_ACPI;
> +        reboot_type = BOOT_KBD;
>  }
>  
>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
>  {
>      enum reboot_type type = (long)d->driver_data;
>  
> -    if ( type == BOOT_ACPI && acpi_disabled )
> +    if ( (type == BOOT_ACPI && acpi_disabled) ||
> +         (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
>          type = BOOT_KBD;

I guess I don't follow this adjustment: Why would we fall back to KBD
first thing? Wouldn't it make sense to try ACPI first if EFI cannot
be used? And go further to KBD only if ACPI then also turns out
disabled (a mode that Xen quite likely won't correctly operate in
anymore anyway, due to bitrot)?

As an aside, KBD is likely unusable on hw-reduced systems, since there
simply is no legacy keyboard controller. Instead we may need to
fall back to CF9 in such a case.
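
To illustrate the fallback order suggested above, here is a rough standalone
sketch in C. It is not a patch against shutdown.c: the enum and the
struct platform fields are hypothetical stand-ins for Xen's reboot types and
for acpi_disabled, acpi_gbl_reduced_hardware and efi_enabled(EFI_RS), and
sanitize_reboot_type() is a made-up name. It only shows the idea of stepping
down one method at a time instead of jumping straight to KBD.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for Xen's reboot types; not the real enum. */
enum reboot_type { BOOT_KBD, BOOT_ACPI, BOOT_EFI, BOOT_CF9 };

/* Hypothetical snapshot of platform state, replacing Xen's globals. */
struct platform {
    bool efi_rs;        /* EFI runtime services are usable */
    bool acpi_disabled; /* ACPI has been turned off */
    bool hw_reduced;    /* ACPI hardware-reduced mode, no legacy KBC */
};

/*
 * Downgrade a requested reboot method one step at a time:
 * EFI -> ACPI -> KBD -> CF9. Each step only fires if the method
 * chosen so far cannot be used on this platform.
 */
static enum reboot_type sanitize_reboot_type(enum reboot_type type,
                                             const struct platform *p)
{
    if ( type == BOOT_EFI && !p->efi_rs )
        type = BOOT_ACPI;

    if ( type == BOOT_ACPI && p->acpi_disabled )
        type = BOOT_KBD;

    /* Assumption: no legacy keyboard controller on hw-reduced systems. */
    if ( type == BOOT_KBD && p->hw_reduced )
        type = BOOT_CF9;

    return type;
}
```

With a helper along these lines, a DMI quirk asking for EFI on a box without
usable runtime services would end up with ACPI, and would reach KBD (or CF9
on hw-reduced systems) only if ACPI is disabled as well.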

Jan



 

