Re: [Xen-devel] PVH dom0 construction timeout
On Fri, Feb 28, 2020 at 09:08:30PM +0000, Andrew Cooper wrote:
> It turns out that PVH dom0 construction doesn't work so well on a
> 2-socket Rome system...
>
> (XEN) NX (Execute Disable) protection active
> (XEN) *** Building a PVH Dom0 ***
> (XEN) Watchdog timer detects that CPU0 is stuck!
> (XEN) ----[ Xen-4.14-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82d08029a8fd>] page_get_ram_type+0x58/0xb6
> (XEN) RFLAGS: 0000000000000206 CONTEXT: hypervisor
> (XEN) rax: ffff82d080948fe0 rbx: 0000000002b73db9 rcx: 0000000000000000
> (XEN) rdx: 0000000004000000 rsi: 0000000004000000 rdi: 0000002b73db9000
> (XEN) rbp: ffff82d080827be0 rsp: ffff82d080827ba0 r8: ffff82d080948fcc
> (XEN) r9: 0000002b73dba000 r10: ffff82d0809491fc r11: 8000000000000000
> (XEN) r12: 0000000002b73db9 r13: ffff8320341bc000 r14: 000000000404fc00
> (XEN) r15: ffff82d08046f209 cr0: 000000008005003b cr4: 00000000001506e0
> (XEN) cr3: 00000000a0414000 cr2: 0000000000000000
> (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen code around <ffff82d08029a8fd> (page_get_ram_type+0x58/0xb6):
> (XEN) 4c 39 d0 74 4d 49 39 d1 <76> 0b 89 ca 83 ca 10 48 39 38 0f 47 ca 49 89 c0
> (XEN) Xen stack trace from rsp=ffff82d080827ba0:
> (XEN) ffff82d08061ee91 ffff82d080827bb4 00000000000b2403 ffff82d080804340
> (XEN) ffff8320341bc000 ffff82d080804340 ffff83000003df90 ffff8320341bc000
> (XEN) ffff82d080827c08 ffff82d08061c38c ffff8320341bc000 ffff82d080827ca8
> (XEN) ffff82d080648750 ffff82d080827c20 ffff82d08061852c 0000000000200000
> (XEN) ffff82d080827d60 ffff82d080638abe ffff82d080232854 ffff82d080930c60
> (XEN) ffff82d080930280 ffff82d080674800 ffff83000003df90 0000000001a40000
> (XEN) ffff83000003df80 ffff82d080827c80 0000000000000206 ffff8320341bc000
> (XEN) ffff82d080827cb8 ffff82d080827ca8 ffff82d080232854 ffff82d080961780
> (XEN) ffff82d080930280 ffff82d080827c00 0000000000000002 ffff82d08022f9a0
> (XEN) 00000000010a4bb0 ffff82d080827ce0 0000000000000206 000000000381b66d
> (XEN) ffff82d080827d00 ffff82d0802b1e87 ffff82d080936900 ffff82d080936900
> (XEN) ffff82d080827d18 ffff82d0802b30d0 ffff82d080936900 ffff82d080827d50
> (XEN) ffff82d08022ef5e ffff8320341bc000 ffff83000003df80 ffff8320341bc000
> (XEN) ffff83000003df80 0000000001a40000 ffff83000003df90 ffff82d080674800
> (XEN) ffff82d080827d98 ffff82d08063cd06 0000000000000001 ffff82d080674800
> (XEN) ffff82d080931050 0000000000000100 ffff82d080950c80 ffff82d080827ee8
> (XEN) ffff82d08062eae7 0000000001a40fff 0000000000000000 000ffff82d080e00
> (XEN) ffffffff00000000 0000000000000005 0000000000000004 0000000000000004
> (XEN) 0000000000000003 0000000000000003 0000000000000002 0000000000000002
> (XEN) 0000000002050000 0000000000000000 ffff82d080674c20 ffff82d080674ea0
> (XEN) Xen call trace:
> (XEN) [<ffff82d08029a8fd>] R page_get_ram_type+0x58/0xb6
> (XEN) [<ffff82d08061ee91>] S arch_iommu_hwdom_init+0x239/0x2b7
> (XEN) [<ffff82d08061c38c>] F drivers/passthrough/amd/pci_amd_iommu.c#amd_iommu_hwdom_init+0x85/0x9f
> (XEN) [<ffff82d08061852c>] F iommu_hwdom_init+0x44/0x4b
> (XEN) [<ffff82d080638abe>] F dom0_construct_pvh+0x160/0x1233
> (XEN) [<ffff82d08063cd06>] F construct_dom0+0x5c/0x280e
> (XEN) [<ffff82d08062eae7>] F __start_xen+0x25db/0x2860
> (XEN) [<ffff82d0802000ec>] F __high_start+0x4c/0x4e
> (XEN)
> (XEN) CPU1 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU31 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU30 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU27 @ e008:ffff82d08022ad5a (scrub_one_page+0x6d/0x7b)
> (XEN) CPU26 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU244 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> (XEN) CPU245 @ e008:ffff82d08022ad5a (scrub_one_page+0x6d/0x7b)
> (XEN) CPU247 @ e008:ffff82d080256e3f (drivers/char/ns16550.c#ns_read_reg+0x2d/0x35)
> (XEN) CPU246 @ e008:ffff82d0802f203f (arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0xa9/0xbf)
> <snip rather a large number of cpus, all idle>
>
> This stack trace is the same on several boots, and in particular,
> page_get_ram_type() being the %rip which took the timeout. For an
> equivalent PV dom0 build, it takes perceptibly 0 time, based on how
> quickly the next line is printed.

set_identity_p2m_entry on AMD will always take longer, as it needs to
add the mfn to both the p2m and the iommu page tables because of the
lack of page table sharing. On a PVH dom0, hwdom_iommu_map will return
false more often than for PV, because RAM regions are already mapped
into the p2m and the iommu page tables if required, and hence the
process_pending_softirqs call was likely skipped far more often. That,
together with a big memory map, could explain the watchdog triggering
and %rip pointing to page_get_ram_type, I think.
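To make that skipping concrete, the per-pfn loop in
arch_iommu_hwdom_init is shaped roughly like the sketch below (a
simplified sketch, not the literal code: the mapping calls, error
handling and throttling mask here are illustrative):

    /* Simplified sketch of the arch_iommu_hwdom_init() per-pfn loop. */
    for ( i = 0; i < top; i++ )
    {
        unsigned long pfn = pdx_to_pfn(i);
        int rc;

        /*
         * For a PVH dom0 this rejects most pfns, since RAM is already
         * present in the p2m/IOMMU tables, and the early continue also
         * skips the preemption point at the bottom of the loop.
         */
        if ( !hwdom_iommu_map(d, pfn, max_pfn) )
            continue;

        if ( paging_mode_translate(d) )
            /* PVH: p2m plus IOMMU tables, no sharing on AMD. */
            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
        else
            /* PV: IOMMU tables only. */
            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), PAGE_ORDER_4K,
                           IOMMUF_readable | IOMMUF_writable, &flush_flags);

        if ( rc )
            printk(XENLOG_WARNING "identity mapping of %lx failed: %d\n",
                   pfn, rc);

        /* Only reached for pfns that actually get mapped. */
        if ( !(i & 0xfffff) )
            process_pending_softirqs();
    }

So on a huge host where most iterations bail out in hwdom_iommu_map
(and spend their time in page_get_ram_type), softirqs are hardly ever
processed and the watchdog fires.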
> I haven't diagnosed the exact issue, but some observations:
>
> The arch_iommu_hwdom_init() loop's positioning of
> process_pending_softirqs() looks problematic, because it is short
> circuited conditionally by hwdom_iommu_map().
>
> page_get_ram_type() is definitely suboptimal here. We have a linear
> search over a (large-ish) sorted list, and a caller which has every
> MFN in the system passed into it, which makes the total runtime of
> arch_iommu_hwdom_init() quadratic with the size of the system.

This could be improved for PVH dom0, I believe, as we already have an
adjusted e820 we could use instead of having to query the type of
every mfn on the system. We could just iterate over the holes and
reserved ranges in the adjusted memory map, and avoid having to query
the type of RAM regions, as those are already mapped in the p2m and
iommu page tables for a PVH dom0.
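Something along the lines of the very rough sketch below, where
map_identity_range() is just a placeholder for whatever combination of
set_identity_p2m_entry()/iommu_map() we end up wanting, and the
adjusted map is assumed to be reachable as d->arch.e820/nr_e820:

    /*
     * Rough sketch: walk the adjusted dom0 e820 instead of calling
     * page_get_ram_type() on every pfn. Only holes and non-RAM regions
     * need identity mappings for a PVH dom0; the special-range
     * filtering currently done by hwdom_iommu_map() is omitted here.
     */
    unsigned long last_pfn = 0;
    unsigned int i;

    for ( i = 0; i < d->arch.nr_e820; i++ )
    {
        unsigned long start = PFN_DOWN(d->arch.e820[i].addr);
        unsigned long end = PFN_UP(d->arch.e820[i].addr +
                                   d->arch.e820[i].size);

        /* Identity-map the hole between the previous region and this one. */
        if ( start > last_pfn )
            map_identity_range(d, last_pfn, start - last_pfn);

        /* Non-RAM regions (e.g. E820_RESERVED) get mapped as well. */
        if ( d->arch.e820[i].type != E820_RAM )
            map_identity_range(d, start, end - start);

        last_pfn = max(last_pfn, end);

        process_pending_softirqs();
    }

That would also make the amount of work proportional to the number of
memory map entries rather than to the number of pfns on the host.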
Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel