Re: [Xen-devel] [PATCH v9 0/9] xen/x86: various XPTI speedups
On 03/05/18 19:41, Andrew Cooper wrote:
> On 02/05/18 11:38, Juergen Gross wrote:
>> On 01/05/18 11:28, Andrew Cooper wrote:
>>> On 26/04/18 12:33, Juergen Gross wrote:
>>>> This patch series aims at reducing the overhead of the XPTI Meltdown
>>>> mitigation.
>>> With just the first 3 patches of this series (in a bisection attempt),
>>> on a XenServer build based off staging, XenRT finds the following:
>>>
>>> (XEN) Assertion 'first_dirty != INVALID_DIRTY_IDX || !(pg[i].count_info & PGC_need_scrub)' failed at page_alloc.c:979
>>> (XEN) ----[ Xen-4.11.0-6.0.0-d x86_64 debug=y Not tainted ]----
>>> (XEN) CPU: 0
>>> (XEN) RIP: e008:[<ffff82d080229914>] page_alloc.c#alloc_heap_pages+0x371/0x6f2
>>> (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor (d33v0)
>>> (XEN) rax: ffff82e01307ade8 rbx: 000000000007ffff rcx: 8180000000000000
>>> (XEN) rdx: 0000000000000000 rsi: 00000000000001b5 rdi: 0000000000000000
>>> (XEN) rbp: ffff8300952b7ba8 rsp: ffff8300952b7b18 r8:  8000000000000000
>>> (XEN) r9:  ffff82e01307ade8 r10: 0180000000000000 r11: 7fffffffffffffff
>>> (XEN) r12: 0000000000000000 r13: 00000000024c2e83 r14: 0000000000000000
>>> (XEN) r15: ffff82e01307add8 cr0: 0000000080050033 cr4: 00000000001526e0
>>> (XEN) cr3: 0000000799c41000 cr2: 00007fdaf5539000
>>> (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
>>> (XEN) Xen code around <ffff82d080229914> (page_alloc.c#alloc_heap_pages+0x371/0x6f2):
>>> (XEN)  ff 0f 0b 48 85 c9 79 31 <0f> 0b 48 c7 42 08 00 00 00 00 c7 42 10 00 00 00
>>> (XEN) Xen stack trace from rsp=ffff8300952b7b18:
>>> (XEN)    0000000000000001 ffff830799cdd000 0000000000000000 00000000003037e9
>>> (XEN)    0000000100000004 ffff8300952b7b68 0000000100000000 ffff830095738000
>>> (XEN)    ffff8300952b7be8 000000008033bfe8 ffff82e01295e540 0000000000001adc
>>> (XEN)    ffff830756971770 0000000000000028 0000000000000000 ffff830799cdd000
>>> (XEN)    0000000000000000 ffff830799cdd000 ffff8300952b7be8 ffff82d080229d4c
>>> (XEN)    0000000000000000 ffff8300952b7d40 0000000000000000 0000000000000000
>>> (XEN)    00000000000000a8 ffff830799cdd000 ffff8300952b7c98 ffff82d080221d90
>>> (XEN)    0000000100000000 ffff830799cdd000 0000000000000000 0000000099cdd000
>>> (XEN)    ffff82e009cd0fd8 00000000000e7b1f ffff8300952b7c88 0000000000000020
>>> (XEN)    ffff8800e7b1fdd8 0000000000000002 0000000000000006 ffff830799cdd000
>>> (XEN)    ffff8300952b7c78 000000000039f480 0000000000000000 000000000000008d
>>> (XEN)    ffff8800e7b1fdd8 ffff830799cdd000 0000000000000006 ffff830799cdd000
>>> (XEN)    ffff8300952b7db8 ffff82d080223ad7 0000000000000046 ffff830088ff9000
>>> (XEN)    ffff8300952b7d18 ffff82d08023cfaf ffff82c000230118 ffff830842ceeb8c
>>> (XEN)    ffff82e009f54db8 00000000003bc78b ffff830842cd2770 ffff830088ff9000
>>> (XEN)    0000000000000000 0000000000000000 ffff83085d6b9350 0000000000000000
>>> (XEN)    ffff8300952b7d28 ffff82d08023d766 ffff8300952b7d58 ffff82d08020c9a2
>>> (XEN)    ffff830842cee000 ffff830799cdd000 ffffffff81adbec0 0000000000000200
>>> (XEN)    0000008d00000000 ffff82d000000000 ffffffff81adbec0 0000000000000200
>>> (XEN)    0000000000000000 0000000000007ff0 ffff83085d6b9350 0000000000000006
>>> (XEN) Xen call trace:
>>> (XEN)    [<ffff82d080229914>] page_alloc.c#alloc_heap_pages+0x371/0x6f2
>>> (XEN)    [<ffff82d080229d4c>] alloc_domheap_pages+0xb7/0x157
>>> (XEN)    [<ffff82d080221d90>] memory.c#populate_physmap+0x27e/0x4c9
>>> (XEN)    [<ffff82d080223ad7>] do_memory_op+0x2e2/0x2695
>>> (XEN)    [<ffff82d080308be9>] hypercall.c#hvm_memory_op+0x36/0x60
>>> (XEN)    [<ffff82d0803091c2>] hvm_hypercall+0x5af/0x681
>>> (XEN)    [<ffff82d08032fee6>] vmx_vmexit_handler+0x1040/0x1e14
>>> (XEN)    [<ffff82d080335f88>] vmx_asm_vmexit_handler+0xe8/0x250
>>> (XEN)
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 0:
>>> (XEN) Assertion 'first_dirty != INVALID_DIRTY_IDX || !(pg[i].count_info & PGC_need_scrub)' failed at page_alloc.c:979
>>> (XEN) ****************************************
>>>
>>> Running repeated tests on adjacent builds, we never see the assertion
>>> failure without the patches (6 runs), and have so far seen it for 3 of 4
>>> runs (2 still pending) with the patches.
>>>
>>> What is rather strange is that there is a lot of migration and
>>> ballooning going on, but only for HVM (Debian Jessie, not that this
>>> should matter) VMs. dom0 will be the only PV domain in the system, and
>>> is 64bit.
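
[Editorial note: the failing ASSERT() ties the buddy allocator's per-chunk
first_dirty bookkeeping to the per-page PGC_need_scrub flag. Below is a
minimal, self-contained sketch of that invariant only -- it is not the real
page_alloc.c code, and the flag value and data layout are simplified purely
for illustration.]

#include <assert.h>
#include <stdio.h>

#define PGC_need_scrub    (1UL << 31)  /* illustrative flag bit */
#define INVALID_DIRTY_IDX (~0U)        /* "no page in this buddy needs scrubbing" */

struct page_info {
    unsigned long count_info;          /* holds PGC_need_scrub among other bits */
};

/*
 * Invariant checked for each page of a buddy taken off the heap: a page
 * may only still carry PGC_need_scrub if the buddy's head records a
 * valid first_dirty index.
 */
static void check_buddy(const struct page_info *pg, unsigned int nr,
                        unsigned int first_dirty)
{
    for ( unsigned int i = 0; i < nr; i++ )
        assert(first_dirty != INVALID_DIRTY_IDX ||
               !(pg[i].count_info & PGC_need_scrub));
}

int main(void)
{
    struct page_info buddy[4] = { { 0 }, { 0 }, { 0 }, { 0 } };

    /* Consistent: nothing flagged, nothing recorded as dirty. */
    check_buddy(buddy, 4, INVALID_DIRTY_IDX);

    /* Consistent: page 2 flagged and first_dirty points at it. */
    buddy[2].count_info |= PGC_need_scrub;
    check_buddy(buddy, 4, 2);

    /*
     * Inconsistent, as in the report above: a page is still flagged
     * PGC_need_scrub while the buddy claims no dirty pages are left.
     * The assert() fires here, mirroring the hypervisor panic.
     */
    check_buddy(buddy, 4, INVALID_DIRTY_IDX);

    printf("not reached\n");
    return 0;
}

[Which code path lets the real allocator reach that inconsistent state is
what the discussion below tries to narrow down.]
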
>> Are you sure you have no other patches compared to staging in your
>> hypervisor? I can't imagine how one of the 3 patches could cause that
>> behavior.
>>
>> I've tried to do similar testing on my machine: 2 HVM domains + 64-bit
>> PV dom0. dom0 and one HVM domain are ballooned up and down all the time
>> while the other HVM domain is being migrated (localhost) in a loop.
>>
>> Migration count is at 600 already...
> So it turns out that I've now reproduced this ASSERT() once without any
> patches from this series applied.
>
> Therefore, it is a latent bug in either XenServer or Xen, but shouldn't
> block this series (especially as this series makes it easier to reproduce).
>
> At this point, as we're planning to take the series for 4.11, it might
> be better to throw the whole series in and get some wider testing that way.

I believe taking this for RC3 tomorrow isn't the best idea, so let's wait
until Monday. This way we can let OSStest take a try with the series.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel