[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 2/2] x86/time: prefer CMOS over EFI_GET_TIME
On Fri, Aug 30, 2024 at 05:31:04PM +0100, Andrew Cooper wrote: > On 30/08/2024 10:52 am, Roger Pau Monne wrote: > > The EFI_GET_TIME implementation is well known to be broken for many firmware > > implementations, for Xen the result on such implementations are: > > > > ----[ Xen-4.19-unstable x86_64 debug=y Tainted: C ]---- > > CPU: 0 > > RIP: e008:[<0000000062ccfa70>] 0000000062ccfa70 > > [...] > > Xen call trace: > > [<0000000062ccfa70>] R 0000000062ccfa70 > > [<00000000732e9a3f>] S 00000000732e9a3f > > [<ffff82d04034f34f>] F arch/x86/time.c#get_cmos_time+0x1b3/0x26e > > [<ffff82d04045926f>] F init_xen_time+0x28/0xa4 > > [<ffff82d040454bc4>] F __start_xen+0x1ee7/0x2578 > > [<ffff82d040203334>] F __high_start+0x94/0xa0 > > > > Pagetable walk from 0000000062ccfa70: > > L4[0x000] = 000000207ef1c063 ffffffffffffffff > > L3[0x001] = 000000005d6c0063 ffffffffffffffff > > L2[0x116] = 8000000062c001e3 ffffffffffffffff (PSE) > > > > **************************************** > > Panic on CPU 0: > > FATAL PAGE FAULT > > [error_code=0011] > > Faulting linear address: 0000000062ccfa70 > > **************************************** > > > > Swap the preference to default to CMOS first, and EFI later, in an attempt > > to > > use EFI_GET_TIME as a last resort option only. Note that Linux for example > > doesn't allow calling the get_time method, and instead provides a dummy > > handler > > that unconditionally returns EFI_UNSUPPORTED on x86-64. > > > > Such change in the preferences requires some re-arranging of the function > > logic, so that panic messages with workaround suggestions are suitably > > printed. > > > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > > --- > > xen/arch/x86/time.c | 26 +++++++++++++++++++------- > > 1 file changed, 19 insertions(+), 7 deletions(-) > > > > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c > > index 272ca2468ea6..0eee954809a9 100644 > > --- a/xen/arch/x86/time.c > > +++ b/xen/arch/x86/time.c > > @@ -1305,24 +1305,36 @@ static unsigned long get_cmos_time(void) > > static bool __read_mostly cmos_rtc_probe; > > boolean_param("cmos-rtc-probe", cmos_rtc_probe); > > > > + if ( likely(!(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC)) ) > > + cmos_rtc_probe = false; > > + > > + if ( (!(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC) || > > + cmos_rtc_probe) && read_cmos_time(&rtc, cmos_rtc_probe) ) > > + return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, > > rtc.sec); > > + > > if ( efi_enabled(EFI_RS) ) > > { > > unsigned long res = efi_get_time(); > > > > if ( res ) > > return res; > > + > > + panic("Broken EFI_GET_TIME %s\n", > > + !cmos_rtc_probe ? "try booting with \"cmos-rtc-probe\" > > option" > > + : "and no CMOS RTC found"); > > } > > > > - if ( likely(!(acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_CMOS_RTC)) ) > > - cmos_rtc_probe = false; > > - else if ( system_state < SYS_STATE_smp_boot && !cmos_rtc_probe ) > > + /* > > + * No viable clock source found. Attempt to provide some guidance to > > the > > + * user about possible workarounds to boot Xen on the system. > > + */ > > + ASSERT(system_state < SYS_STATE_smp_boot); > > + > > + if ( !cmos_rtc_probe ) > > panic("System with no CMOS RTC advertised must be booted from EFI" > > " (or with command line option \"cmos-rtc-probe\")\n"); > > > > - if ( unlikely(!read_cmos_time(&rtc, cmos_rtc_probe)) ) > > - panic("No CMOS RTC found - system must be booted from EFI\n"); > > - > > - return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec); > > + panic("No CMOS RTC found - system must be booted from EFI\n"); > > } > > > > static unsigned int __ro_after_init cmos_alias_mask; > > Hmm, I know I said "fix the crash first, cleanup later" on the prior > patch, but I'm tempted to retract that. This is very hard to follow. Yeah, I was attempting to refrain myself from fully re-writing the logic. > Going back to first principles, at runtime we need USE_{XEN,RTC,EFI} to > select between the various methods available. > > At boot, we need to figure out what to do. I think we want an > init_wallclock_time() split out of get_cmos_time() (which is badly named > anyway, given EFI), and called from init_xen_time() only. In > particular, the init stuff should not be re-evaluated in > time_{suspend,resume}(). I also disliked the current arrangement, and adding a probe function crossed my mind, I simply wanted to avoid this fix growing to a 10 patch series, but I think it's inevitable. > Various details we have to work with: > > * ACPI_FADT_NO_CMOS_RTC, although we know this is falsely set on some > platforms, hence the whole probing logic. > > * Booted on an EFI system, although we know EFI_GET_TIME is broken on > millions of systems, and we only find this out with a #PF in the middle > of EFI-RS. Furthermore, more-major-OSes-than-us strictly veto the use > of this service, and it's not only Linux; it's Windows too. > > Personally, I think "cmos-rtc-probe" is inappropriate/insufficient. It > ought to be wc-time=guess|xen|rtc|efi. We should be able to influence > the behaviour more than a boolean "probe or not". The Xen case might be > better as "hypervisor", although I can't see any evidence of Viridian > having a virtualised wallclock interface. > > I think the init logic wants to be: > * If ACPI says we have an RTC, use it. > * If ACPI says we have no RTC, probe to see if it's really there. > * If we genuinely seem to not have an RTC, probe EFI. This would be > quite invasive in the #PF handler, but not impossible. My plan would be to use EFI as a last resort if available, and hence not probe it. It would however be nice to give the user a better error message than a #PF in the hypervisor with some random stack trace. Is #PF the only fault that we can expect from EFI_GET_TIME? That's what I've seen on some of the systems, but I wouldn't for example discard #GP or #UD as also possible outcomes? > > > That said, I'm still holding out hope that we can simply delete Xen's > need for wallclock time. Xen only uses wallclock time for the > non-default console_timestamps=date, but even then it's uncorrelate with > dom0 changing the platform time, leading to actively-misleading log > files. There's a reason why "time since boot" is the norm. But there's a wallclock provided in the share_info page, do you suggest that dom0 should pass the wallclock value to Xen, so Xen can initialize the shared_info wallclock time? That would leave the fields uninitialized for dom0 which would be an ABI change. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |