Re: [Xen-devel] [xen-unstable test] 113959: regressions - FAIL
On Mon, Oct 09, 2017 at 12:03:53PM +0100, Andrew Cooper wrote:
>On 09/10/17 08:58, Chao Gao wrote:
>> On Mon, Oct 09, 2017 at 02:13:22PM +0800, Chao Gao wrote:
>>> On Tue, Oct 03, 2017 at 11:08:01AM +0100, Roger Pau Monné wrote:
>>>> On Tue, Oct 03, 2017 at 09:55:44AM +0000, osstest service owner wrote:
>>>>> flight 113959 xen-unstable real [real]
>>>>> http://logs.test-lab.xenproject.org/osstest/logs/113959/
>>>>>
>>>>> Regressions :-(
>>>>>
>>>>> Tests which did not succeed and are blocking,
>>>>> including tests which could not be run:
>>>>>  test-amd64-i386-libvirt-xsm 21 leak-check/check fail REGR. vs. 113954
>>>> This is due to cron running when the leak-check is executed.
>>>>
>>>>>  test-armhf-armhf-xl-multivcpu 5 host-ping-check-native fail REGR. vs. 113954
>>>>>  test-amd64-i386-xl-qemut-debianhvm-amd64 17 guest-stop fail REGR. vs. 113954
>>>> The test below has triggered the following ASSERT, CCing the Intel
>>>> guys.
>>>>
>>>> Oct 3 06:12:00.415168 (XEN) d15v0: intack: 2:30 pt: 38
>>>> Oct 3 06:12:19.191141 (XEN) vIRR: 00000000 00000000 00000000 00000000 00000000 00000000 00010000 00000000
>>>> Oct 3 06:12:19.199162 (XEN) PIR: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>>>> Oct 3 06:12:19.207160 (XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:367
>>>> Oct 3 06:12:19.215215 (XEN) ----[ Xen-4.10-unstable x86_64 debug=y Not tainted ]----
>>>> Oct 3 06:12:19.223124 (XEN) CPU: 1
>>>> Oct 3 06:12:19.223153 (XEN) RIP: e008:[<ffff82d0803022a5>] vmx_intr_assist+0x617/0x637
>>>> Oct 3 06:12:19.231185 (XEN) RFLAGS: 0000000000010292 CONTEXT: hypervisor (d15v0)
>>>> Oct 3 06:12:19.239163 (XEN) rax: ffff83022dfc802c rbx: ffff8300ccc65680 rcx: 0000000000000000
>>>> Oct 3 06:12:19.247169 (XEN) rdx: ffff83022df7ffff rsi: 000000000000000a rdi: ffff82d0804606d8
>>>> Oct 3 06:12:19.255127 (XEN) rbp: ffff83022df7ff08 rsp: ffff83022df7fea8 r8: ffff83022df90000
>>>> Oct 3 06:12:19.263114 (XEN) r9: 0000000000000001 r10: 0000000000000000 r11: 0000000000000001
>>>> Oct 3 06:12:19.271109 (XEN) r12: 00000000ffffffff r13: ffff82d0803cfba6 r14: ffff82d0803cfba6
>>>> Oct 3 06:12:19.279119 (XEN) r15: 0000000000000004 cr0: 0000000080050033 cr4: 00000000001526e0
>>>> Oct 3 06:12:19.279157 (XEN) cr3: 0000000214274000 cr2: 00005622a2184dbf
>>>> Oct 3 06:12:19.287123 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
>>>> Oct 3 06:12:19.295105 (XEN) Xen code around <ffff82d0803022a5> (vmx_intr_assist+0x617/0x637):
>>>> Oct 3 06:12:19.303150 (XEN) 41 bf 00 00 00 00 eb a0 <0f> 0b 89 ce 48 89 df e8 bb 20 00 00 e9 49 fe ff
>>>> Oct 3 06:12:19.311112 (XEN) Xen stack trace from rsp=ffff83022df7fea8:
>>>> Oct 3 06:12:19.311146 (XEN) ffff83022df7ff08 000000388030cf76 ffff82d0805a7570 ffff82d08057ad80
>>>> Oct 3 06:12:19.319131 (XEN) ffff83022df7ffff ffff83022df7fee0 ffff82d08023b9b6 ffff8300ccc65000
>>>> Oct 3 06:12:19.327115 (XEN) 000000000000000b 0000000000000020 00000000000000c2 0000000000000004
>>>> Oct 3 06:12:19.345094 (XEN) ffff880029eb4000 ffff82d080311c21 0000000000000004 00000000000000c2
>>>> Oct 3 06:12:19.345177 (XEN) 0000000000000020 000000000000000b ffff880029eb4000 ffffffff81adf0a0
>>>> Oct 3 06:12:19.351221 (XEN) 0000000000000000 0000000000000000 ffff88002d400008 0000000000000000
>>>> Oct 3 06:12:19.359439 (XEN) 0000000000000030 0000000000000000 00000000000003f8 00000000000003f8
>>>> Oct 3 06:12:19.367267 (XEN) ffffffff81adf0a0 0000beef0000beef ffffffff8138a5f4 000000bf0000beef
>>>> Oct 3 06:12:19.375222 (XEN) 0000000000000002 ffff88002f803e08 000000000000beef 000000000000beef
>>>> Oct 3 06:12:19.383198 (XEN) 000000000000beef 000000000000beef 000000000000beef 0000000000000001
>>>> Oct 3 06:12:19.391230 (XEN) ffff8300ccc65000 00000031ada20d00 00000000001526e0
>>>> Oct 3 06:12:19.399336 (XEN) Xen call trace:
>>>> Oct 3 06:12:19.399389 (XEN) [<ffff82d0803022a5>] vmx_intr_assist+0x617/0x637
>>>> Oct 3 06:12:19.407337 (XEN) [<ffff82d080311c21>] vmx_asm_vmexit_handler+0x41/0x120
>>>> Oct 3 06:12:19.407380 (XEN)
>>>> Oct 3 06:12:19.415246 (XEN)
>>>> Oct 3 06:12:19.415278 (XEN) ****************************************
>>>> Oct 3 06:12:19.415307 (XEN) Panic on CPU 1:
>>>> Oct 3 06:12:19.415332 (XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:367
>>>> Oct 3 06:12:19.423432 (XEN) ****************************************
>>> (CC Jan)
>>>
>>> Hi, Roger.
>>>
>>> I sent a patch to fix a possible cause of this bug, see
>>> https://lists.xenproject.org/archives/html/xen-devel/2017-04/msg03254.html.
>>>
>>> Due to the Xen 4.9 release, I put this patch aside and later forgot to
>>> continue fixing this bug. Sorry for this. Of course, I will fix this
>>> bug.
>>>
>>> I thought the root cause was:
>>> When injecting a periodic timer interrupt in vmx_intr_assist(),
>>> the RTE is read more than once during one event delivery. For
>>> example, if a periodic timer interrupt is from the PIT, the
>>> corresponding RTE is accessed in pt_update_irq() when setting the
>>> corresponding bit in vIRR. When this function returns, it accesses
>>> the RTE again to get the vector it set in vIRR. Between the two
>>> accesses, the content of the RTE may have been changed by another
>>> CPU, because no protection is in use. This case can trigger the
>>> assertion failure in vmx_intr_assist().
>>>
>>> For example, in this case, we may set 0x30 in vIRR, but return 0x38 to
>>> vmx_intr_assist(). When we try to inject an interrupt, we would find
>>> 0x38 is greater than the highest vector; then the assertion failure
>>> happens. I have an XTF case to reproduce this bug, see
>>> https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg02906.html.
>>> But in Jan's opinion, the bug was unlikely to be triggered in
>>> OSSTEST by these weird operations.
>>>
>>> After thinking it over, the bug can also be caused by pt_update_irq()
>>> returning 0x38 without setting 0x38 in vIRR, because the corresponding
>>> RTE is masked. Please refer to the code path:
>>> vmx_intr_assist() -> pt_update_irq() -> hvm_isa_irq_assert() ->
>>> assert_irq() -> assert_gsi() -> vioapic_irq_positive_edge().
>>> Note that in vioapic_irq_positive_edge(), if ent->fields.mask is set,
>>> the function returns without setting the corresponding bit in vIRR.
>> To verify this guess, I modified the above XTF test a little. The new
>> XTF test (enclosed in attachment) creates a guest with 2 vCPUs. vCPU0
>> sets up the PIT to generate a timer interrupt every 1ms. It also boots
>> up vCPU1. vCPU1 incessantly masks/unmasks the corresponding IOAPIC RTE
>> and sends an IPI (vector 0x30) to vCPU0. The bug happens as expected:
>
>On the XTF side of things, I really need to get around to cleaning up my
>SMP support work. There are an increasing number of tests which are
>creating ad-hoc APs.
>
>Recently, an APIC driver has been introduced, so you can probably drop
>1/3 of that code by using apic_init()/apic_icr_write(). I've also got a
>proto IO-APIC driver which I should clean up and upstream.

Thanks for your information. I will try to clean up this test and send
it out for review.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
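The two suspected failure modes discussed in the thread (the RTE being rewritten between two reads, and a masked RTE whose vector is still returned) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: every type and function below (model_rte, model_lapic, pt_update_irq_model, the scenario_* helpers) is hypothetical and written for illustration, it is not Xen code, and the racer callback merely stands in for another vCPU touching the RTE. The invariant checked is the one from the log, 'intack.vector >= pt_vector', with intack modelled as the highest bit set in vIRR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for an IOAPIC RTE and a vLAPIC; not Xen's types. */
struct model_rte {
    uint8_t vector;   /* vector currently programmed by the guest */
    bool masked;      /* RTE mask bit */
};

struct model_lapic {
    uint32_t irr[8];  /* 256-bit vIRR, one bit per vector */
};

/* Callback standing in for another vCPU rewriting the RTE concurrently. */
typedef void (*racer_fn)(struct model_rte *rte);

static void irr_set(struct model_lapic *lapic, uint8_t vector)
{
    lapic->irr[vector / 32] |= UINT32_C(1) << (vector % 32);
}

/* Highest pending vector, or -1 if vIRR is empty. */
static int irr_highest(const struct model_lapic *lapic)
{
    for (int vector = 255; vector >= 0; vector--)
        if (lapic->irr[vector / 32] & (UINT32_C(1) << (vector % 32)))
            return vector;
    return -1;
}

static void retarget_to_0x38(struct model_rte *rte)
{
    rte->vector = 0x38;
}

/*
 * Models the pattern described in the thread: the RTE is read once to
 * deliver the interrupt and read again to produce the return value.
 */
static int pt_update_irq_model(struct model_lapic *lapic,
                               struct model_rte *rte, racer_fn racer)
{
    assert(lapic != NULL && rte != NULL);

    /* First access: a masked RTE delivers nothing into vIRR. */
    if (!rte->masked)
        irr_set(lapic, rte->vector);

    /* Window in which another vCPU may change the RTE. */
    if (racer != NULL)
        racer(rte);

    /* Second access: may no longer match what was set above. */
    return rte->vector;
}

/* Model of the check 'intack.vector >= pt_vector'. */
static bool invariant_holds(const struct model_lapic *lapic, int pt_vector)
{
    return irr_highest(lapic) >= pt_vector;
}

/* No interference: the returned vector is the one set in vIRR. */
bool scenario_quiescent(void)
{
    struct model_lapic lapic = { { 0 } };
    struct model_rte rte = { .vector = 0x30, .masked = false };

    return invariant_holds(&lapic, pt_update_irq_model(&lapic, &rte, NULL));
}

/* 0x30 is set in vIRR, then the RTE is retargeted and 0x38 is returned. */
bool scenario_retargeted_rte(void)
{
    struct model_lapic lapic = { { 0 } };
    struct model_rte rte = { .vector = 0x30, .masked = false };

    return invariant_holds(&lapic,
                           pt_update_irq_model(&lapic, &rte, retarget_to_0x38));
}

/* An IPI 0x30 is pending; the masked RTE sets nothing but returns 0x38. */
bool scenario_masked_rte(void)
{
    struct model_lapic lapic = { { 0 } };
    struct model_rte rte = { .vector = 0x38, .masked = true };

    irr_set(&lapic, 0x30);
    return invariant_holds(&lapic, pt_update_irq_model(&lapic, &rte, NULL));
}
```

In this model scenario_quiescent() returns true, while scenario_retargeted_rte() and scenario_masked_rte() both return false. The masked case leaves only bit 0x30 set in vIRR with pt = 0x38, which matches the "intack: 2:30 pt: 38" line in the quoted log.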