
Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)



On Tue, 12 May 2015, Fabio Fantoni wrote:
> On 12/05/2015 12:26, Fabio Fantoni wrote:
> > On 12/05/2015 11:23, Fabio Fantoni wrote:
> > > On 11/05/2015 17:04, Fabio Fantoni wrote:
> > > > On 21/04/2015 14:53, Stefano Stabellini wrote:
> > > > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > > > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > > > I updated Xen and QEMU from Xen 4.5.0 (with its bundled upstream
> > > > > > > > QEMU) to Xen 4.5.1-pre with upstream QEMU from stable-4.5 (I
> > > > > > > > changed Config.mk to use revision "master").
> > > > > > > > A few minutes after I booted a Windows 7 64-bit domU, QEMU
> > > > > > > > crashed; I tried twice with the same result.
> > > > > > > > 
> > > > > > > > In the domU's qemu log:
> > > > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion
> > > > > > > > > `(old_top ==
> > > > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > > > > > __builtin_offsetof
> > > > > > > > > (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned
> > > > > > > > > long)
> > > > > > > > > (old_size) >= (unsigned long)((((__builtin_offsetof (struct
> > > > > > > > > malloc_chunk,
> > > > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 *
> > > > > > > > > (sizeof(size_t))) -
> > > > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end &
> > > > > > > > > pagemask)
> > > > > > > > > ==
> > > > > > > > > 0)' failed.
> > > > > > > > > Killing all inferiors
> > > > > > > > The full backtrace of the QEMU crash is attached.
> > > > > > > > 
> > > > > > > > After a quick search based on the backtrace, I found a possible
> > > > > > > > cause of the regression (I'm not sure):
> > > > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > > >  
> > > > > > > > spice: make sure we don't overflow ssd->buf
> > > > > > > > 
> > > > > > > > I also added qemu-devel and spice-devel in cc.
> > > > > > > > 
> > > > > > > > If you need more information or tests, tell me and I'll post them.
> > > > > > > Maybe you could try to revert the offending commit
> > > > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or, even better, bisect
> > > > > > > the crash?
> > > > > > Thanks for your reply.
> > > > > > 
> > > > > > I reverted to 4.5.0 on dom0 for now on that system because I'm busy
> > > > > > trying to find another problem that causes very bad performance
> > > > > > with no errors or anything else in the logs :( I don't know yet
> > > > > > whether it is Xen related, kernel related, or something else.
> > > > > > 
> > > > > > About this regression with spice, I'll do further tests in the next
> > > > > > few days (probably starting by reverting the spice patch in qemu),
> > > > > > but any help is appreciated.
> > > > > > Based on the data I have so far, is it possible that the problem is
> > > > > > that qemu tries to allocate more RAM or video RAM after the domU is
> > > > > > created, which is not possible with Xen?
> > > > > > In the spice-related patch I saw something about dynamic allocation,
> > > > > > for example.
> > > > > It is probably caused by a commit in the range:
> > > > > 
> > > > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > > > >  
> > > > > 
> > > > > There are only 10 commits in that range. By using git bisect you
> > > > > should be able to narrow it down in just 3 tests.
> > > > 
> > > > Sorry for the delay, I was busy with many things. Today I retried with
> > > > the updated stable-4.5, and also (in a second test) with "spice: make
> > > > sure we don't overflow ssd->buf" reverted, but in both cases the
> > > > regression remains :(
> > > > I'll probably do other tests tomorrow.
> > > 
> > > I did another test, reverting this instead:
> > > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
> > >  
> > > And now it seems I'm unable to reproduce the regression: before, it
> > > happened within a few seconds up to 1-2 minutes; now I have used the
> > > same domU for 15-20 minutes without problems.
> > > This is probably the cause of the regression, even if it seems strange
> > > that it didn't happen on unstable with the same patch in tests some
> > > days ago.
> > > 
> > > Any ideas?
> > > 
> > > Thanks for any reply and sorry for my bad English.
> > 
> > Bad news: the qemu crash still happens, although this time the qemu log
> > shows different output, see attachment.
> > After taking a look at the other patches I saw:
> > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf
> >  
> > Since its description contains "Conflicts: hw/display/vga.c", I'll try to
> > revert it instead.
> > 
> > Or can someone suggest another test I can try?
> 
> I also tried to revert the patch above, with the same result, so I retried
> with qemu from 4.5.0 and it seems the crash happens in this case too... I'm
> going crazy :(
> 
> The full gdb log is attached.
> 
> Any ideas on how to find the problem, please?

Hi Fabio,

Don't worry, bisecting 10 commits should be pretty straightforward.
Just use the command "git bisect" on the QEMU repository:

git bisect start
git bisect bad
git bisect good 1ebb75b1fee779621b63e84fefa7b07354c43a99

These 3 commands tell git that 1ebb75b1fee779621b63e84fefa7b07354c43a99
was working correctly but the current head is broken. git bisect will
select a commit somewhere in the middle of the range, which you need to
test. Once you have tested it, run

git bisect good

if it worked, or

git bisect bad

if it didn't work as expected. git bisect will automatically select
another commit for you to test. After about 3 tests, git bisect will
identify the exact commit that introduced the issue.
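
To see why about 3 tests are enough for a range of 10 commits, here is a
minimal Python sketch of the halving idea. It is only an illustration under
simplifying assumptions: the history is treated as linear, the names
bisect_first_bad and is_bad are hypothetical, and is_bad stands in for the
manual "boot the domU and see whether qemu crashes" step. It is not how git
itself chooses commits on a non-linear history.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in a linear history.

    commits: list ordered oldest to newest; the commit just before
             commits[0] is assumed good and commits[-1] is assumed bad.
    is_bad:  callable standing in for one manual test of a commit.
    Returns (first_bad_commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1  # the first bad commit lies in commits[lo..hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid        # culprit is commits[mid] or something older
        else:
            lo = mid + 1    # culprit is strictly newer than commits[mid]
    return commits[lo], tests


if __name__ == "__main__":
    # Hypothetical range of 10 commits; pretend the 7th one broke things.
    history = ["c%d" % i for i in range(10)]
    culprit, tests = bisect_first_bad(history, lambda c: history.index(c) >= 6)
    print(culprit, tests)
```

Each test halves the remaining candidates, so with 10 commits the search
ends after 3 or 4 tests, depending on where the culprit sits in the range.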

Let me know if anything is not clear.

Cheers,

Stefano
