
Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm



On 30/10/16 04:29, osstest service owner wrote:
> branch xen-unstable
> xenbranch xen-unstable
> job test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm
> testid debian-hvm-install
>
> Tree: linux git://xenbits.xen.org/linux-pvops.git
> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
>
> *** Found and reproduced problem changeset ***
>
>   Bug is in tree:  xen git://xenbits.xen.org/xen.git
>   Bug introduced:  0897514b4b376a167f968f79c6ea0dee1061458e
>   Bug not present: 4000a7c7d7b0e01837abd3918e393f289c07d68c
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/101803/
>
>
>   commit 0897514b4b376a167f968f79c6ea0dee1061458e
>   Author: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>   Date:   Wed Oct 26 10:34:21 2016 +0100
>   
>       tools/oxenstored: Avoid allocating invalid transaction ids

I have to admit that I am staring at this report in disbelief, but it is
apparently deterministic.  It is very strange that only this job is
affected; if there were actually a problem with xenstore transactions, I
would have expected collateral damage everywhere.

Looking through the logs, there are several concerning things happening
even in the success cases.

First:

(XEN) HVM1 restore: CPU 0
(XEN) avc:  denied  { getvcpucontext } for domid=0 target=2
scontext=system_u:system_r:dom0_t tcontext=system_u:system_r:dm_dom_t
tclass=domain

The toolstack calls getvcpucontext as part of domain creation, and the
XSM policy disallows this on dm_dom_t's.  Interestingly, this failure
doesn't appear to be fatal to domain creation, and it really ought to
be.  I expect there is also another bug lurking in the lower levels of
the toolstack.
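
To find every such denial across the success and failure logs, something
like the following could be used.  It is only a sketch: the regular
expression matches the message form shown above and nothing else, and the
function name parse_avc_denials is made up for illustration.

```python
import re

# Matches only the AVC denial form quoted above (domid/target variant).
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+"
    r"domid=(?P<domid>\d+)\s+target=(?P<target>\d+)\s+"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def parse_avc_denials(text):
    """Return one dict per AVC denial found in a console log excerpt."""
    # Collapse all whitespace so that a message wrapped over several
    # lines (as in this mail) is matched the same way as a single line.
    flat = " ".join(text.split())
    denials = []
    for m in AVC_RE.finditer(flat):
        denials.append({
            "perms": m.group("perms").split(),
            "domid": int(m.group("domid")),
            "target": int(m.group("target")),
            "scontext": m.group("scontext"),
            "tcontext": m.group("tcontext"),
            "tclass": m.group("tclass"),
        })
    return denials
```

Grouping the results by (scontext, tcontext, perms) would show whether the
same denial appears in both the passing and the failing flights.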

Second:

(XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:577
(XEN) ----[ Xen-4.8.0-rc  x86_64  debug=y   Not tainted ]----
<snip>
(XEN) Xen call trace:
(XEN)    [<ffff82d08013cf20>] _xmalloc+0x2f/0x313
(XEN)    [<ffff82d08016a6f9>] services.c#context_struct_to_string+0x98/0x16d
(XEN)    [<ffff82d08016c0c2>] security_sid_to_context+0xd3/0xe7
(XEN)    [<ffff82d080162596>] hooks.c#flask_show_security_evtchn+0x6f/0x87
(XEN)    [<ffff82d08010819a>] event_channel.c#dump_evtchn_info+0x246/0x2cb
(XEN)    [<ffff82d080116271>] handle_keypress+0x8c/0xac
(XEN)    [<ffff82d08014600b>] console.c#__serial_rx+0x38/0x73
(XEN)    [<ffff82d0801467ea>] console.c#serial_rx+0x8a/0x8f
(XEN)    [<ffff82d080148b17>] serial_rx_interrupt+0x90/0xac
(XEN)    [<ffff82d08014756a>] ns16550.c#ns16550_interrupt+0x57/0x71
(XEN)    [<ffff82d0801839fb>] do_IRQ+0x56e/0x60f
(XEN)    [<ffff82d080254d67>] common_interrupt+0x67/0x70
(XEN)    [<ffff82d0801cd586>] mwait-idle.c#mwait_idle+0x2af/0x2f9

The 'e' debugkey isn't safe to use when XSM is compiled in:
security_sid_to_context() allocates memory, and the keyhandler is
invoked from the serial interrupt, where _xmalloc() asserts !in_irq().

Furthermore, any unexpected host crash should cause the test to fail.
This one appears to have gone unnoticed because it happens in the
capture-logs phase, presumably with timeouts long enough that OSSTest
doesn't notice the host rebooting in the middle of log collection.

Third:

(d2) **************************
(d2) blk_open(/local/domain/2/device/vbd/5632) -> 6
(d2) xs_watch(device-model/1/logdirty/cmd, logdirty)
(d2) xs_watch(device-model/1/command, dm-command)
(d2) xs_watch(/local/domain/1/cpu, vcpu-set)
(d2) xs_read(/local/domain/0/backend/pci/1/0/msitranslate): ENOENT
(d2) xs_read(/local/domain/0/backend/pci/1/0/power_mgmt): ENOENT
(d2) open(/var/log/dm-serial.log) -> 7
(d2) fcntl(-1, 3, 3ffbc8/17775710)
(d2) fcntl(-1, 4, ffffffff/37777777777)
(d2) fcntl(7, 3, ffffffff/37777777777)
(d2) fcntl(7, 4, ffffffff/37777777777)
(d2) xs_watch(/local/domain/0/backend/console/1, be:0x14b1a7:1:0x186800)
(d2) xs_directory(/local/domain/0/backend/console/1): EACCES
(d2) xs_watch(/local/domain/0/backend/vkbd/1, be:0x1479ff:1:0x1867c0)
(d2) xs_directory(/local/domain/0/backend/vkbd/1): EACCES
(d2) xs_read(device-model/1/disable_pf): ENOENT
(d2) xs_watch(/local/domain/1/log-throttling, /local/domain/1/log-throttling)
(d2) Thread "kbdfront": pointer: 0x0xb0182570, stack: 0x0xaa0000
(d2) ******************* FBFRONT for /local/domain/2/device/vfb/0 **********

The stub qemu attempts to read d1's backends.  It probably shouldn't be
doing that.


Comparing the xenstored-access logs between the success and failure
cases, it does appear that in the failing case all transactions have
the id 1.  I am trying to debug why.
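
For anyone wanting to repeat the comparison, the check amounts to
something like the sketch below.  It assumes the (domid, transaction id)
pairs have already been extracted from the access log by some other
means; the extraction is left out because the log format isn't shown
here, and the function name is hypothetical.

```python
from collections import defaultdict


def domains_reusing_one_id(transactions):
    """Map each domid whose transactions all share one id to that id.

    `transactions` is an iterable of (domid, transaction_id) pairs.
    Domains with only a single transaction are ignored, since one
    sample says nothing about whether ids are changing.
    """
    ids_by_domain = defaultdict(list)
    for domid, tid in transactions:
        ids_by_domain[domid].append(tid)
    return {
        domid: ids[0]
        for domid, ids in ids_by_domain.items()
        if len(ids) > 1 and len(set(ids)) == 1
    }
```

This is only a heuristic for spotting the pattern described above; it
says nothing about why the ids fail to advance.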

~Andrew


 

