
Re: [Xen-devel] Block device hang after migration



On Thu, Oct 19, 2017 at 03:30:28PM +0100, Roger Pau Monné wrote:
> On Thu, Oct 19, 2017 at 11:53:11AM +0100, Wei Liu wrote:
> > Hi
> > 
> > In the process of upgrading osstest to Stretch, I discovered an issue
> > with the block device. This happens after a local migration.
> > 
> > [  127.216232] Freezing user space processes ... (elapsed 0.005 seconds) done.
> > [  127.222143] Freezing remaining freezable tasks ... 
> > [  147.228913] Freezing of tasks failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
> > [  147.228935] jbd2/xvda1-8    D    0   143      2 0x00000000
> > [  147.228964]  ffff880005109000 0000000000000000 ffff88000569e000 ffff88001f918240
> > [  147.228984]  ffff88001ea7a000 ffffc9004029bb30 ffffffff816038e3 ffffc9004029bbe8
> > [  147.229001]  00ff8800056d1500 ffff88001f918240 0000000000000000 ffff88000569e000
> > [  147.229028] Call Trace:
> > [  147.229274]  [<ffffffff816038e3>] ? __schedule+0x233/0x6d0
> > [  147.229297]  [<ffffffff81604550>] ? bit_wait+0x50/0x50
> > [  147.229307]  [<ffffffff81603db2>] ? schedule+0x32/0x80
> > [  147.229318]  [<ffffffff8160711e>] ? schedule_timeout+0x1de/0x350
> > [  147.229345]  [<ffffffff8101b601>] ? xen_clocksource_get_cycles+0x11/0x20
> > [  147.229363]  [<ffffffff810ec47b>] ? ktime_get+0x3b/0xb0
> > [  147.229378]  [<ffffffff81604550>] ? bit_wait+0x50/0x50
> > [  147.229389]  [<ffffffff8160364d>] ? io_schedule_timeout+0x9d/0x100
> > [  147.229401]  [<ffffffff810b8ab7>] ? prepare_to_wait+0x57/0x80
> > [  147.229417]  [<ffffffff81604567>] ? bit_wait_io+0x17/0x60
> > [  147.229427]  [<ffffffff81604133>] ? __wait_on_bit+0x53/0x80
> > [  147.229442]  [<ffffffff81604550>] ? bit_wait+0x50/0x50
> > [  147.229457]  [<ffffffff8160428e>] ? out_of_line_wait_on_bit+0x7e/0xa0
> > [  147.229469]  [<ffffffff810b8f20>] ? wake_atomic_t_function+0x60/0x60
> > [  147.229563]  [<ffffffffc007cde2>] ? jbd2_journal_commit_transaction+0xdd2/0x17a0 [jbd2]
> > [  147.229589]  [<ffffffff8109da1d>] ? finish_task_switch+0x7d/0x1f0
> > [  147.229612]  [<ffffffffc0081bc2>] ? kjournald2+0xc2/0x260 [jbd2]
> > [  147.229624]  [<ffffffff810b8e80>] ? prepare_to_wait_event+0xf0/0xf0
> > [  147.229643]  [<ffffffffc0081b00>] ? commit_timeout+0x10/0x10 [jbd2]
> > [  147.229656]  [<ffffffff81096697>] ? kthread+0xd7/0xf0
> > [  147.229667]  [<ffffffff810965c0>] ? kthread_park+0x60/0x60
> > [  147.229684]  [<ffffffff81608835>] ? ret_from_fork+0x25/0x30
> > [  147.229708] Restarting kernel threads ... done.
> > [  147.230496] xen:manage: do_suspend: freeze kernel threads failed -16
> > [  147.230508] Restarting tasks ... done.
> > [  238.484918] 
> > 
> > http://logs.test-lab.xenproject.org/osstest/logs/114709/test-amd64-amd64-xl-qcow2/fiano1---var-log-xen-console-guest-debian.stretch.guest.osstest--incoming.log
> > 
> > And this seems to be the same issue Olivier Bonvale reported in "[Xen-devel]
> > task btrfs-transacti:651 blocked for more than 120 seconds".
> 
> Not really, I think; that didn't involve migration IIRC. Olivier was
> attaching 26 PV disks, which starved the grant table.
> 
> > The guest in osstest uses ext4, with only one or two vbds. Kernel is
> > Debian's stock kernel (4.9).
> 
> There have been a lot of patches from Juergen and others since 4.9.
> osstest is currently using 4.9 also and doesn't seem to complain.
> 
> Is there any newer kernel available from backports?
> 

There is no Debian backport kernel in this case; it is going to be 4.9
all the time.

Assuming all the required patches will be backported to 4.9, we just need
to wait for the changes to trickle down to Debian.
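
For anyone triaging similar guest console logs, here is a minimal Python
sketch that pulls the relevant facts out of a log like the one quoted
above: the freeze timeout, the task(s) refusing to freeze, and the error
code reported by do_suspend. It is not part of osstest; the function name
and the regular expressions are my own assumptions, derived only from the
log format shown in this thread.

```python
import re

# Kernel timestamp prefix, e.g. "[  147.228913] ".
TS = r"\[\s*(\d+\.\d+)\]\s+"

# "Freezing of tasks failed after 20.006 seconds ..."
FAILED = re.compile(TS + r"Freezing of tasks failed after (\d+\.\d+) seconds")

# Task header line from the dump: name, one-letter state, three numeric
# fields and a hex flags word (layout assumed from the 4.9 log above).
TASK = re.compile(TS + r"(\S+)\s+([A-Z])\s+\d+\s+\d+\s+\d+\s+0x[0-9a-f]+")

# "xen:manage: do_suspend: freeze kernel threads failed -16"
SUSPEND = re.compile(TS + r"xen:manage: do_suspend: .* failed (-\d+)")


def summarize_freeze_failure(lines):
    """Return a dict describing the first freeze failure, or None.

    Hypothetical helper: uses search() rather than match() so that
    mail-quoting prefixes such as "> > " are tolerated.
    """
    result = None
    for line in lines:
        if result is None:
            m = FAILED.search(line)
            if m:
                result = {"timeout_s": float(m.group(2)),
                          "tasks": [], "errno": None}
            continue
        m = TASK.search(line)
        if m:
            # (task name, scheduler state); "D" is uninterruptible sleep.
            result["tasks"].append((m.group(2), m.group(3)))
            continue
        m = SUSPEND.search(line)
        if m:
            result["errno"] = int(m.group(2))  # -16 is -EBUSY
            break
    return result
```

Called with the lines of the console log, e.g.
summarize_freeze_failure(open(path)) where path is wherever you saved
the log, it reports jbd2/xvda1-8 in state D and errno -16 for the
excerpt above.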

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

