
Re: [Xen-devel] [xen-unstable test] 104131: regressions - FAIL



On Mon, Jan 16, 2017 at 06:27:23AM +0000, Xuquan (Quan Xu) wrote:
>On January 16, 2017 1:26 PM, Tian, Kevin wrote:
>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>> Sent: Thursday, January 12, 2017 8:26 PM
>>>
>>> >>> On 12.01.17 at 13:15, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> > On 12/01/17 12:07, Xuquan (Quan Xu) wrote:
>>> >> On January 12, 2017 5:14 PM, Andrew Cooper wrote:
>>> >>> On 12/01/2017 06:46, osstest service owner wrote:
>>> >>>> flight 104131 xen-unstable real [real]
>>> >>>> http://logs.test-lab.xenproject.org/osstest/logs/104131/
>>> >>>>
>>> >>>> Regressions :-(
>>> >>>>
>>> >>>> Tests which did not succeed and are blocking, including tests
>>> >>>> which could not be run:
>>> >>>>  test-amd64-i386-xl-qemuu-debianhvm-amd64 16 guest-stop   fail REGR. vs. 104119
>>> >>>
>>> >>> Jan 12 01:25:17.397607 (XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:321
>>> >>> Jan 12 01:25:37.133596 (XEN) ----[ Xen-4.9-unstable  x86_64 debug=y Not tainted ]----
>>> >>> Jan 12 01:25:37.141577 (XEN) CPU:    14
>>> >>> Jan 12 01:25:37.141607 (XEN) RIP:    e008:[<ffff82d0801ef7fc>] vmx_intr_assist+0x35e/0x51d
>>> >>> Jan 12 01:25:37.149617 (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d15v0)
>>> >>> Jan 12 01:25:37.149655 (XEN) rax: 0000000000000038   rbx: ffff830079e1e000   rcx: 0000000000000030
>>> >>> Jan 12 01:25:37.157582 (XEN) rdx: 0000000000000000   rsi: 0000000000000030   rdi: ffff830079e1e000
>>> >>> Jan 12 01:25:37.165584 (XEN) rbp: ffff83047de2ff08   rsp: ffff83047de2fea8   r8:  ffff82c00022f000
>>> >>> Jan 12 01:25:37.173579 (XEN) r9:  ffff8301b63ede80   r10: ffff830176386560   r11: 000001955ee79bd0
>>> >>> Jan 12 01:25:37.181582 (XEN) r12: 0000000000003002   r13: 0000000000003002   r14: 0000000000000030
>>> >>> Jan 12 01:25:37.189584 (XEN) r15: ffff83023fec2000   cr0: 0000000080050033   cr4: 00000000003526e0
>>> >>> Jan 12 01:25:37.197572 (XEN) cr3: 0000000232edb000   cr2: 0000000002487034
>>> >>> Jan 12 01:25:37.205569 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>> >>> Jan 12 01:25:37.205606 (XEN) Xen code around <ffff82d0801ef7fc> (vmx_intr_assist+0x35e/0x51d):
>>> >>> Jan 12 01:25:37.213575 (XEN)  41 0f b6 f6 39 f0 7e 02 <0f> 0b 48 89 df e8 51 20 00 00 b8 10 08 00 00 0f
>>> >>> Jan 12 01:25:37.221561 (XEN) Xen stack trace from rsp=ffff83047de2fea8:
>>> >>> Jan 12 01:25:37.229600 (XEN)    ffff82d08031aa80 00000038ffffffff ffff83047de2ffff ffff83023fec2000
>>> >>> Jan 12 01:25:37.237594 (XEN)    ffff83047de2fef8 ffff82d080130cb6 ffff830079e1e000 ffff830079e1e000
>>> >>> Jan 12 01:25:37.245588 (XEN)    ffff83007bae2000 000000000000000e ffff830233117000 ffff83023fec2000
>>> >>> Jan 12 01:25:37.253594 (XEN)    ffff83047de2fdc0 ffff82d0801fdeb1 0000000000000004 00000000000000c2
>>> >>> Jan 12 01:25:37.261584 (XEN)    0000000000000020 0000000000000007 ffff8800e8d28000 ffffffff81add0a0
>>> >>> Jan 12 01:25:37.269607 (XEN)    0000000000000246 0000000000000000 ffff880142400008 0000000000000004
>>> >>> Jan 12 01:25:37.277580 (XEN)    0000000000000036 0000000000000000 00000000000003f8 00000000000003f8
>>> >>> Jan 12 01:25:37.285584 (XEN)    ffffffff81add0a0 0000beef0000beef ffffffff813899a4 000000bf0000beef
>>> >>> Jan 12 01:25:37.293567 (XEN)    0000000000000002 ffff880147c03e08 000000000000beef 1cec835356e5beef
>>> >>> Jan 12 01:25:37.293606 (XEN)    085d8b002674beef 01dcb38b000cbeef 8914458d3174beef 2444c7100000000e
>>> >>> Jan 12 01:25:37.301586 (XEN)    ffff830079e1e000 00000031bfc37600 00000000003526e0
>>> >>> Jan 12 01:25:37.309607 (XEN) Xen call trace:
>>> >>> Jan 12 01:25:37.309639 (XEN)    [<ffff82d0801ef7fc>] vmx_intr_assist+0x35e/0x51d
>>> >>> Jan 12 01:25:37.317591 (XEN)    [<ffff82d0801fdeb1>] vmx_asm_vmexit_handler+0x41/0x120
>>> >>> Jan 12 01:25:37.325598 (XEN)
>>> >>> Jan 12 01:25:37.325624 (XEN)
>>> >>> Jan 12 01:25:37.325647 (XEN) ****************************************
>>> >>> Jan 12 01:25:37.333653 (XEN) Panic on CPU 14:
>>> >>> Jan 12 01:25:37.333684 (XEN) Assertion 'intack.vector >= pt_vector' failed at intr.c:321
>>> >>> Jan 12 01:25:37.341571 (XEN) ****************************************
>>> >>> Jan 12 01:25:37.341603 (XEN)
>>> >>> Jan 12 01:25:37.341626 (XEN) Reboot in five seconds...
>>> >>> Jan 12 01:25:37.349566 (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>>> >>>
>>> >>> This is caused by "x86/apicv: fix RTC periodic timer and apicv
>>> >>> issue".  It is not a deterministic issue, as it appears to have
>>> >>> survived a week of testing already, but there is clearly something
>>> >>> still problematic with the code.
>>> >>>
>>> >>
>>> >> Andrew,
>>> >> If you have any, could you give more information?
>>> >
>>> > No further information sorry.  This was found by the automated
>>> > test system.
>>>
>>> But some can be gathered:
>>>
>>> > Full logs are available from
>>> > http://logs.test-lab.xenproject.org/osstest/logs/104131/test-amd64-i386-xl-qemuu-debianhvm-amd64/
>>> > but I doubt any of them will help in diagnosing the issue any further.
>>> >
>>> >> Such as the value of intack.vector / pt_vector..
>>>
>>> At least one of the two values is likely to live in a register, and
>>> hence its value would be available in the dump. Just takes looking at
>>> the disassembly.
>>>
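
The lookup Jan describes can be replayed from the dump alone. The ten bytes ending at the faulting address decode to `movzbl %r14b,%esi; cmp %esi,%eax; jle +2; ud2`, so the check compares %eax against the zero-extended low byte of %r14. The sketch below is a user-space replay of that compare, not Xen code. The helper name and macros are made up for illustration, and mapping %eax to pt_vector and %r14b to intack.vector is an inference from the operand widths, not something the log states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register values copied from the crash dump quoted above. */
#define DUMP_RAX 0x0000000000000038ULL
#define DUMP_R14 0x0000000000000030ULL

/*
 * Replays the instructions leading up to the faulting ud2:
 *   41 0f b6 f6   movzbl %r14b,%esi
 *   39 f0         cmp    %esi,%eax
 *   7e 02         jle    <skip the ud2>
 *   0f 0b         ud2
 * Returns true when the jle would be taken, i.e. the assertion holds.
 * (Hypothetical helper, for illustration only.)
 */
static bool assertion_holds(uint64_t rax, uint64_t r14)
{
    uint32_t esi = (uint8_t)r14;            /* movzbl: zero-extend the low byte */
    int32_t eax = (int32_t)(uint32_t)rax;   /* low 32 bits, signed as jle sees them */

    return eax <= (int32_t)esi;             /* jle taken => ud2 skipped */
}
```

With the dumped values, `assertion_holds(DUMP_RAX, DUMP_R14)` is false: 0x38 is not less than or equal to 0x30, so the ud2 is reached. The dump's rsi of 0x30 is consistent with the movzbl having already executed.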
>>> >> I guess the reason may be that intack.vector is 'uint8_t' and
>>> >> pt_vector is 'int'.
>>>
>>> That would be odd.
>>>
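
A short standalone check of why the type mismatch alone should not matter: in C, a uint8_t operand is promoted to int before the comparison, so `uint8_t >= int` is an ordinary signed compare. The sketch below is not taken from Xen. The function names are hypothetical, and the -1 lower bound assumes a negative value may be used to mean "no vector".

```c
#include <stdbool.h>
#include <stdint.h>

/* Same operand types as the asserted expression: uint8_t on the left, int on the right. */
static bool vector_ge(uint8_t vector, int pt_vector)
{
    /* 'vector' is promoted to int (range 0..255) before the compare. */
    return vector >= pt_vector;
}

/*
 * Exhaustively compares the mixed-type result with the same comparison
 * done in a wider type. Returns the number of disagreements found.
 */
static int count_mismatches(void)
{
    int mismatches = 0;

    for (int v = 0; v <= 255; v++)
        for (int pt = -1; pt <= 255; pt++)   /* -1: assumed "no vector" value */
            if (vector_ge((uint8_t)v, pt) != ((long)v >= (long)pt))
                mismatches++;

    return mismatches;
}
```

`count_mismatches()` returns 0 over this range, so the mixed types by themselves do not make the comparison misbehave.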
>>> >> Or there is a corner case in which intack.vector is __not__ the
>>> >> highest priority vector.
>>>
>>> That's what I'm afraid of, and why I had asked to add the ASSERT().
>>>
>>
>>I cannot come up with a valid reason for such a situation (intack.vector is
>>0x30 while pt_vector is 0x38 from Chao's data). pt_update_irq is invoked
>>before checking the highest pending IRRs, so pt_vector should be honored
>>anyway. One possibility is that, for some reason, pt_vector is not in vIRR
>>at that point (due to some bug in the path from PIR to vIRR). However, I
>>didn't catch such a bug simply by looking at the code. We need to reproduce
>>this problem on the developer side to find the actual reason. Andrew, it'd
>>be helpful if you could help Quan/Chao find out more about the test
>>environment.
>>
>
>I'll continue to follow up on this issue.
>However, I don't have enough CPU-v3 machines to test it (they are occupied by
>another project). I hope Chao can set up a test environment.
>
No problem. When you come up with ideas or find some clues, I can verify them 
for you.

thanks
Chao
>
>Quan 
>
>
>
>
>>One more thing to note, though. The original patch from Quan is actually
>>orthogonal to this ASSERT. Regardless of whether intack.vector is larger or
>>smaller than pt_vector, we always require the trick as long as pt_vector is
>>not the one currently being programmed to RVI. So do we want to revert
>>the whole commit until the problem is finally fixed, or is it OK to just
>>remove the ASSERT (or replace it with WARN_ON plus more debug info) to
>>unblock the test system before the fix is ready?
>>
>>Thanks
>>Kevin
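
To illustrate the second option Kevin raises, here is a minimal user-space sketch of a check that reports both vectors and carries on instead of panicking. It is not Xen code and does not use Xen's WARN_ON or printk. The function name, message format, and use of stderr are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical non-fatal replacement for the assertion: logs both values
 * when intack_vector is below pt_vector and lets the caller continue.
 * Returns true if a warning was emitted.
 */
static bool warn_if_vector_below(uint8_t intack_vector, int pt_vector)
{
    if (intack_vector >= pt_vector)
        return false;               /* condition holds, nothing to report */

    /* pt_vector is non-negative here, so the unsigned cast is safe. */
    fprintf(stderr, "WARN: intack.vector %#x < pt_vector %#x\n",
            (unsigned int)intack_vector, (unsigned int)pt_vector);
    return true;
}
```

With the values from this flight (0x30 and 0x38) the function prints one warning and returns true. A real change would presumably also want the domain and vCPU in the message, which this sketch leaves out.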

