Re: [Xen-devel] PAT-related crash booting Linux 4.4 + Xen 4.5 on VMware ESXi
On Tue, 2016-05-24 at 11:54 -0400, Boris Ostrovsky wrote:
> On 05/24/2016 10:53 AM, Kani, Toshimitsu wrote:
> >
> > On Mon, 2016-05-23 at 15:52 -0700, Ed Swierk wrote:
> > >
> > > Good question. I ran my tests again, and found I'd misinterpreted the
> > > Fusion behavior.
> > >
> > > On Fusion 8.1.1, MSR_IA32_CR_PAT returns a reasonable value:
> > >
> > > (XEN) Freed 308kB init memory.
> > > mapping kernel into physical memory
> > > cpu_has_pat=0 cpuid_edx(1)=f89cbf5 pat=65536
> > > pat_init_cache_modes pat=50100070406
> > > pat_init_cache_modes i=7 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=6 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=5 pat_val=5 cache=5
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=4 pat_val=1 cache=1
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=3 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=2 pat_val=7 cache=2
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=1 pat_val=4 cache=4
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=0 pat_val=6 cache=0
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes pat_msg=WB WT UC- UC WC WP UC UC
> > > about to get started...
> > > [ 0.000000] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC
> > >
> > > On ESXi 5.5.0, MSR_IA32_CR_PAT returns 0, and we are indeed hitting
> > > the BUG_ON in update_cache_mode_entry():
> > >
> > > (XEN) Freed 312kB init memory.
> > > mapping kernel into physical memory
> > > cpu_has_pat=0 cpuid_edx(1)=f89cbf5 pat=65536
> > > pat_init_cache_modes pat=0
> > > pat_init_cache_modes i=7 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=6 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=5 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=4 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=3 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=2 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=1 pat_val=0 cache=3
> > > pat_init_cache_modes ok
> > > pat_init_cache_modes i=0 pat_val=0 cache=3
> > > (XEN) traps.c:459:d0v0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
> > > (XEN) domain_crash_sync called from entry.S: fault at ffff82d0802276c3 create_bounce_frame+0x12b/0x13a
> > >
> > > In both cases, the PAT CPUID feature bit is set, and cpu_has_pat is
> > > always 0 at this early point (so my RFC patch is wrong). The simplest
> > > fix is to call pat_init_cache_modes(pat) only if pat != 0.
> > >
> > > This is starting to look like the same logic that's in pat_bsp_init(),
> > > which doesn't seem to be called when booting on Xen. Should it be? Was
> > > Xen deliberately excluded from this PAT emulation change?
> > > https://groups.google.com/d/msg/linux.kernel/JoJKbCOxV0U/PM0I9d1v60kJ
> >
> > Calling pat_init() requires the CPU rendezvous handler in MTRR, which
> > is disabled in Xen. This PAT initialization has been problematic, and
> > the following patches addressed it in 4.6. This will fix your problem
> > as well.
> > https://lkml.org/lkml/2016/3/23/500
> >
> > In particular, patch 6/7 removed the Xen code in question.
> > https://lkml.org/lkml/2016/3/23/503
> >
> > Do you need to fix this issue in 4.4? If so, we should be able to
> > request backporting the patches to 4.4 stable.
>
> Would disabling PAT when the MSR is clearly broken (and not trying to
> emulate it) not work?

That should work, but the above patches fix the qemu32 issue also found in
4.4. So, they need to be backported to 4.4.
https://lkml.org/lkml/2016/3/3/828

-Toshi

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
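
For readers following the pat_init_cache_modes trace quoted above, here is a minimal sketch of how the eight PAT entries can be read out of the raw MSR value. It is a Python model, not the kernel's C code: the names decode_pat, PAT_TYPES and pat_looks_usable are made up for illustration, and the encoding table assumes the standard x86 PAT memory-type encodings. It covers only the decoding step and the "only if pat != 0" guard suggested in the thread, not the cache-mode table update done by update_cache_mode_entry().

```python
# Memory-type encodings for one PAT entry (low 3 bits of each byte).
# Encodings 2 and 3 are reserved.
PAT_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB", 7: "UC-"}


def decode_pat(msr):
    """Return the memory-type names for PAT entries 0..7 of a 64-bit MSR value."""
    names = []
    for i in range(8):
        pat_val = (msr >> (i * 8)) & 7  # entry i lives in byte i of the MSR
        names.append(PAT_TYPES.get(pat_val, "reserved"))
    return names


def pat_looks_usable(msr):
    """Hypothetical guard modelled on the 'only if pat != 0' suggestion."""
    return msr != 0


fusion = 0x50100070406  # value logged on Fusion 8.1.1
esxi = 0x0              # value logged on ESXi 5.5.0

print(" ".join(decode_pat(fusion)))
print(" ".join(decode_pat(esxi)))
print(pat_looks_usable(fusion), pat_looks_usable(esxi))
```

With the Fusion value this yields the same entry list as the pat_msg line in the log. With the all-zero ESXi value every entry, including entry 0, decodes as UC, which is consistent with the trap appearing right after the i=0 line in the second log.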