
Re: [Xen-devel] [BUG] XEN 4.3.3 - segfault in xl create for HVM with PCI passthrough

On Tue, 2014-11-04 at 16:13 +0100, Atom2 wrote:
> I assume it may be warranted to "upgrade" this issue to a bug status 
> (obviously also in the hope that it attracts wider interest) by 
> prefixing the subject line with a [BUG] prefix as per 
> http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen_Project. I have 
> exhausted all my options (including numerous IRC attempts), provided all 
> the information I have been asked for but the issue persists and nobody 
> seems to have an idea how to rectify the problem.

Sorry for the delay, the issue is quite perplexing so I was intending to
sleep on it, but didn't get any inspiration in doing so...

In the gdb traces you provided there is:
#10 read_all (fd=10, data=data@entry=0x7ffff0000a10, len=len@entry=16, nonblocking=nonblocking@entry=0) at xs.c:374

which seems to correspond to the 
        if (!read_all(h->fd, &msg->hdr, sizeof(msg->hdr), nonblocking)) { /* Cancellation point */
in read_message (the size and offset seem to match this call, so I think
it is more likely than the other one, but the logic below applies in
either case).

The thing we are reading into has literally just been allocated, so I
can't think of any reason accessing it should fault.

There is only one xenstore change between 4.3.1 and 4.3.3, which is 
        commit 014f9219f1dca3ee92948f0cfcda8d1befa6cbcd
        Author: Matthew Daley <mattd@xxxxxxxxxxx>
        Date:   Sat Nov 30 13:20:04 2013 +1300
            xenstore: sanity check incoming message body lengths
            This is for the client-side receiving messages from xenstored, so 
            is no security impact, unlike XSA-72.
but I can't see any way that could possibly cause a segfault.

So, I'm afraid I'm completely mystified.

You could try running the xl command under valgrind; you may find "xl
create -F" (which keeps xl in the foreground) handy if you try this.
That might help catch any heap corruption etc.

A related thing to try might be to run "MALLOC_CHECK_=2 xl create ...",
which enables glibc's heap consistency checks (described in the glibc
documentation) and might give a clue.
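
For concreteness, the two suggestions above could be wrapped up roughly
as follows. This is only a sketch: the function names and the
xl-valgrind.log file name are made up for illustration, and the guest
config path passed as the first argument is whatever you normally give
to xl create.

```shell
# Hypothetical helpers; "$1" is the path to your guest config file.

# Run xl in the foreground with glibc's heap checks set to abort on the
# first inconsistency they detect.
xl_create_malloc_check() {
    MALLOC_CHECK_=2 xl create -F "$1"
}

# Run xl in the foreground under valgrind, writing the report to a log
# file (name chosen here for illustration) instead of the terminal.
xl_create_valgrind() {
    valgrind --log-file=xl-valgrind.log xl create -F "$1"
}
```

Either helper would be called with the config file as its only
argument. Expect the valgrind run to be considerably slower than a
normal xl create.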

Otherwise I think the next step would be to downgrade to 4.3.1 and see
if the problem persists, in order to rule out changes elsewhere in the
system. If the problem doesn't happen with a 4.3.1 rebuilt on your
current system then the next thing would probably be to bisect the
issue. There are only 31 toolstack changes in that range, so it ought to
only take 5-6 iterations.
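
As a rough check on that estimate, here is a small sketch that counts
how many halvings it takes to narrow a set of candidate commits down to
one. The bisect_steps helper is made up for illustration, and the
commented git commands assume a xen.git checkout in which the releases
are tagged RELEASE-4.3.1 and RELEASE-4.3.3 and the toolstack lives
under tools/ (please check both before relying on them).

```shell
# Estimate how many build-and-test rounds a bisection needs: each round
# roughly halves the set of candidate commits until one remains.
bisect_steps() {
    remaining=$1
    steps=0
    while [ "$remaining" -gt 1 ]; do
        remaining=$(( (remaining + 1) / 2 ))   # worst case keeps the larger half
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 31

# Sketch of the bisection itself, limited to toolstack changes.
# Tag names and the tools/ path are assumptions, see above.
#   git bisect start RELEASE-4.3.3 RELEASE-4.3.1 -- tools
#   (rebuild and install the tools, retry the failing xl create, then)
#   git bisect good      # or: git bisect bad
#   git bisect reset     # once the first bad commit has been reported
```

For 31 candidates this prints 5, which is in line with the 5-6
iterations mentioned above; git may occasionally ask for one more round
depending on how the history branches.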

