
Re: [Xen-users] Xen DomU's randomly freezing





On Mon, Apr 30, 2018 at 6:56 PM, Daz Day <dazday60@xxxxxxxxx> wrote:
Hi All,

I posted this to the centos-virt mailing list and was asked by dunlapg at umich.edu to post it here instead with some additional info (the config file of a VM, dmesg from inside a VM before a hang, and dmesg from Dom0 after a hang), so here's the original email, plus the extras at the bottom:

Hi,

I've tried hitting up the CentOS forums and thought I'd try here too as I don't seem to be getting any bites.

We've been in the process of migrating all our hypervisors over to CentOS 7 using Xen. Once we had a few up and running, we started to notice that the DomUs would randomly freeze. They become unresponsive to any network traffic, stop consuming CPU resources on the hypervisor, and it's not possible to log in to the console locally using:
virsh console <domain>
We can sometimes get as far as typing a username and hitting return, but the DomU just hangs there. It doesn't seem to matter which Linux distro the DomU is running; it affects them all. The only way we can get them back is by destroying and recreating them (far from ideal!).
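
For reference, "destroying and recreating" amounts to roughly the following, using the example domain from the config further down (virsh start assumes the domain is persistently defined in libvirt, as that config suggests):

virsh destroy t02red02    # hard-stop the frozen guest
virsh start t02red02      # boot it again from its persistent definition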

After a bit of research and digging around, we eventually found these 2 nuggets:
https://wiki.gentoo.org/wiki/Xen#Xen_domU_hanging_with_kernel_4.3.2B
https://www.novell.com/support/kb/doc.php?id=7018590

They both advise adding the command line argument:

gnttab_max_frames=256 (the default is 32).
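
Since gnttab_max_frames is a Xen hypervisor boot parameter, on a CentOS 7 Dom0 it goes on the Xen line in GRUB rather than on the kernel line, roughly like this (the "..." stands for whatever options are already there, and the grub.cfg path differs on EFI systems):

# in /etc/default/grub on the Dom0
GRUB_CMDLINE_XEN_DEFAULT="... gnttab_max_frames=256"
# regenerate grub.cfg and reboot the host
grub2-mkconfig -o /boot/grub2/grub.cfg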

We applied this change and all hypervisors ran stable for around a week, until DomUs started freezing again (we've since tried even higher values, to no avail). More research later led me to
https://bugs.centos.org/view.php?id=14258 and https://bugs.centos.org/view.php?id=14284 (which are essentially the same report). There hasn't really been any movement on these tickets unfortunately, but I have +1'd them.

Have any others had issues with Xen and DomUs locking up in CentOS 7?
Are there any other fixes/workarounds? If any additional info is needed that isn't already in the bug tickets or forum post, please let me know and I'll be happy to provide whatever is required (these freezes are happening at least once a day).

Any help would be much appreciated and would mean my Ops guys could get a decent sleep!
Cheers
Darren

Additional info as requested:
Config file for one of the frozen VMs (there are other VMs with fewer interfaces, some with more memory, some with more vCPUs):
<domain type='xen'>
  <name>t02red02</name>
  <uuid>bdbc6df1-e153-0dac-9a92-faa7b1270e97</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <bootloader>/usr/bin/pygrub</bootloader>
  <os>
    <type arch='x86_64' machine='xenpv'>linux</type>
  </os>
  <clock offset='utc' adjustment='reset'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/xen/t02red02.img'/>
      <target dev='xvda' bus='xen'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/xen/t02red02-xvdb.img'/>
      <target dev='xvdb' bus='xen'/>
    </disk>
    <interface type='bridge'>
      <mac address='00:16:3e:03:aa:d3'/>
      <source bridge='xbr_v999'/>
      <script path='vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:31:11:02'/>
      <source bridge='xbr_v11'/>
      <script path='vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:31:14:02'/>
      <source bridge='xbr_v14'/>
      <script path='vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:31:20:02'/>
      <source bridge='xbr_v20'/>
      <script path='vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:31:21:02'/>
      <source bridge='xbr_v21'/>
      <script path='vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:31:28:02'/>
      <source bridge='xbr_v28'/>
      <script path='vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:31:34:02'/>
      <source bridge='xbr_v34'/>
      <script path='vif-bridge'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:31:39:02'/>
      <source bridge='xbr_v39'/>
      <script path='vif-bridge'/>
    </interface>
    <console type='pty'>
      <target type='xen' port='0'/>
    </console>
    <input type='mouse' bus='xen'/>
    <input type='keyboard' bus='xen'/>
  </devices>
</domain>

dmesg from inside the VM. It's worth mentioning that in the bug ticket someone suggested attempting to save the state of the VM and then restore it; sometimes this works, sometimes it fails. In this case it failed, but the VM recovered without any other intervention:
[1206840.137075] INFO: task jbd2/xvda2-8:137 blocked for more than 120 seconds.
[1206840.137098] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1206840.137117] jbd2/xvda2-8    D 0000000000000000     0   137      2 0x00000000
[1206840.137141]  ffff88007768bac0 0000000000000246 ffff880003636dd0 ffff88007768bfd8
[1206840.137163]  ffff88007768bfd8 ffff88007768bfd8 ffff880003636dd0 ffff88007f816c40
[1206840.137189]  0000000000000000 7fffffffffffffff ffff88007fbde260 ffffffff8168a1e0
[1206840.137218] Call Trace:
[1206840.137240]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1206840.137255]  [<ffffffff8168c169>] schedule+0x29/0x70
[1206840.137266]  [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
[1206840.137282]  [<ffffffff8101fd60>] ? xen_clocksource_get_cycles+0x20/0x30
[1206840.137302]  [<ffffffff810eb08c>] ? ktime_get_ts64+0x4c/0xf0
[1206840.137318]  [<ffffffff8168e2a0>] ? _raw_spin_unlock_irqrestore+0x20/0x40
[1206840.137333]  [<ffffffff8101fd60>] ? xen_clocksource_get_cycles+0x20/0x30
[1206840.137346]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1206840.137357]  [<ffffffff8168b70e>] io_schedule_timeout+0xae/0x130
[1206840.137368]  [<ffffffff8168e2a0>] ? _raw_spin_unlock_irqrestore+0x20/0x40
[1206840.137380]  [<ffffffff8168b7a8>] io_schedule+0x18/0x20
[1206840.137393]  [<ffffffff8168a1f1>] bit_wait_io+0x11/0x50
[1206840.137406]  [<ffffffff81689d15>] __wait_on_bit+0x65/0x90
[1206840.137419]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1206840.137433]  [<ffffffff81689dc1>] out_of_line_wait_on_bit+0x81/0xb0
[1206840.137452]  [<ffffffff810b1be0>] ? wake_bit_function+0x40/0x40
[1206840.137467]  [<ffffffff8123338a>] __wait_on_buffer+0x2a/0x30
[1206840.137494]  [<ffffffffa0034e78>] jbd2_journal_commit_transaction+0xe88/0x19a0 [jbd2]
[1206840.137511]  [<ffffffff81029573>] ? __switch_to+0xe3/0x4c0
[1206840.137523]  [<ffffffff8168e48a>] ? _raw_spin_lock_irqsave+0x1a/0x60
[1206840.137540]  [<ffffffffa0039e99>] kjournald2+0xc9/0x260 [jbd2]
[1206840.137553]  [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
[1206840.145192]  [<ffffffffa0039dd0>] ? commit_timeout+0x10/0x10 [jbd2]
[1206840.145212]  [<ffffffff810b0a4f>] kthread+0xcf/0xe0
[1206840.145224]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
[1206840.145242]  [<ffffffff816970d8>] ret_from_fork+0x58/0x90
[1206840.145257]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
[1206840.145274] INFO: task auditd:302 blocked for more than 120 seconds.
[1206840.145288] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1206840.145305] auditd          D 0000000000000000     0   302      1 0x00000000
[1206840.145332]  ffff88000350b8a0 0000000000000286 ffff88007b426dd0 ffff88000350bfd8
[1206840.145360]  ffff88000350bfd8 ffff88000350bfd8 ffff88007b426dd0 ffff88007f816c40
[1206840.145389]  0000000000000000 7fffffffffffffff ffffffff8168a1e0 ffff88000350ba00
[1206840.145416] Call Trace:
[1206840.145429]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1206840.145444]  [<ffffffff8168c169>] schedule+0x29/0x70
[1206840.145458]  [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
[1206840.145489]  [<ffffffffa009ac38>] ? __es_insert_extent+0x148/0x260 [ext4]
[1206840.145503]  [<ffffffff8101fd60>] ? xen_clocksource_get_cycles+0x20/0x30
[1206840.145516]  [<ffffffff810eb08c>] ? ktime_get_ts64+0x4c/0xf0
[1206840.145529]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1206840.145540]  [<ffffffff8168b70e>] io_schedule_timeout+0xae/0x130
[1206840.145552]  [<ffffffff8168e2a0>] ? _raw_spin_unlock_irqrestore+0x20/0x40
[1206840.145564]  [<ffffffff8168b7a8>] io_schedule+0x18/0x20
[1206840.145576]  [<ffffffff8168a1f1>] bit_wait_io+0x11/0x50
[1206840.145587]  [<ffffffff81689f0f>] __wait_on_bit_lock+0x5f/0xc0
[1206840.145599]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1206840.145609]  [<ffffffff81689ff1>] out_of_line_wait_on_bit_lock+0x81/0xb0
[1206840.145621]  [<ffffffff810b1be0>] ? wake_bit_function+0x40/0x40
[1206840.145635]  [<ffffffff812333f2>] __lock_buffer+0x32/0x40
[1206840.145650]  [<ffffffffa0032e33>] do_get_write_access+0x433/0x4d0 [jbd2]
[1206840.149867]  [<ffffffff812336ed>] ? __getblk+0x2d/0x300
[1206840.149899]  [<ffffffffa0032ef7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
[1206840.149949]  [<ffffffffa0089d2b>] __ext4_journal_get_write_access+0x3b/0x80 [ext4]
[1206840.149985]  [<ffffffffa0058290>] ext4_reserve_inode_write+0x70/0xa0 [ext4]
[1206840.150035]  [<ffffffffa005b9b0>] ? ext4_dirty_inode+0x40/0x60 [ext4]
[1206840.150055]  [<ffffffffa0058313>] ext4_mark_inode_dirty+0x53/0x210 [ext4]
[1206840.150078]  [<ffffffffa005b9b0>] ext4_dirty_inode+0x40/0x60 [ext4]
[1206840.150104]  [<ffffffff8122944a>] __mark_inode_dirty+0xca/0x290
[1206840.150120]  [<ffffffff8121a001>] update_time+0x81/0xd0
[1206840.150132]  [<ffffffff8121a250>] file_update_time+0xa0/0xf0
[1206840.150160]  [<ffffffff811830c8>] __generic_file_aio_write+0x198/0x400
[1206840.150176]  [<ffffffff81183389>] generic_file_aio_write+0x59/0xa0
[1206840.150197]  [<ffffffffa004ef0b>] ext4_file_write+0xdb/0x470 [ext4]
[1206840.150213]  [<ffffffff811fdf3d>] do_sync_write+0x8d/0xd0
[1206840.150225]  [<ffffffff811fe7ad>] vfs_write+0xbd/0x1e0
[1206840.150235]  [<ffffffff811ff2cf>] SyS_write+0x7f/0xe0
[1206840.150248]  [<ffffffff81697189>] system_call_fastpath+0x16/0x1b
[1206840.150280] INFO: task dockerd:807 blocked for more than 120 seconds.
[1206840.150292] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1206840.150306] dockerd         D 0000000000000000     0   807      1 0x00000080
[1206840.150330]  ffff880076c77e38 0000000000000282 ffff8800026fce70 ffff880076c77fd8
[1206840.150352]  ffff880076c77fd8 ffff880076c77fd8 ffff8800026fce70 ffff880003725800
[1206840.150380]  000000000034ceeb ffff880003725890 ffff880003725828 ffff880076c77e80
[1206840.150408] Call Trace:
[1206840.150418]  [<ffffffff8168c169>] schedule+0x29/0x70
[1206840.150432]  [<ffffffffa0039875>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
[1206840.150449]  [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
[1206840.150469]  [<ffffffffa003c802>] jbd2_complete_transaction+0x52/0xa0 [jbd2]
[1206840.150490]  [<ffffffffa004fd92>] ext4_sync_file+0x292/0x320 [ext4]
[1206840.150508]  [<ffffffff81230445>] do_fsync+0x65/0xa0
[1206840.150519]  [<ffffffff81230710>] SyS_fsync+0x10/0x20
[1206840.150535]  [<ffffffff81697189>] system_call_fastpath+0x16/0x1b
.... These messages repeat for a long time, until we attempted to save the state, at which point this appeared:
[1208895.926365] Freezing user space processes ...
[1208915.930503] Freezing of tasks failed after 20.004 seconds (106 tasks refusing to freeze, wq_busy=0):
[1208915.930522] auditd          D 0000000000000000     0   302      1 0x00000004
[1208915.930541]  ffff88000350b8a0 0000000000000286 ffff88007b426dd0 ffff88000350bfd8
[1208915.930563]  ffff88000350bfd8 ffff88000350bfd8 ffff88007b426dd0 ffff88007f816c40
[1208915.930584]  0000000000000000 7fffffffffffffff ffffffff8168a1e0 ffff88000350ba00
[1208915.930605] Call Trace:
[1208915.930617]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1208915.930628]  [<ffffffff8168c169>] schedule+0x29/0x70
[1208915.930639]  [<ffffffff81689bc9>] schedule_timeout+0x239/0x2c0
[1208915.930679]  [<ffffffffa009ac38>] ? __es_insert_extent+0x148/0x260 [ext4]
[1208915.930693]  [<ffffffff8101fd60>] ? xen_clocksource_get_cycles+0x20/0x30
[1208915.930706]  [<ffffffff810eb08c>] ? ktime_get_ts64+0x4c/0xf0
[1208915.930725]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1208915.930736]  [<ffffffff8168b70e>] io_schedule_timeout+0xae/0x130
[1208915.930748]  [<ffffffff8168e2a0>] ? _raw_spin_unlock_irqrestore+0x20/0x40
[1208915.930760]  [<ffffffff8168b7a8>] io_schedule+0x18/0x20
[1208915.930770]  [<ffffffff8168a1f1>] bit_wait_io+0x11/0x50
[1208915.930787]  [<ffffffff81689f0f>] __wait_on_bit_lock+0x5f/0xc0
[1208915.930799]  [<ffffffff8168a1e0>] ? bit_wait+0x50/0x50
[1208915.930809]  [<ffffffff81689ff1>] out_of_line_wait_on_bit_lock+0x81/0xb0
[1208915.930832]  [<ffffffff810b1be0>] ? wake_bit_function+0x40/0x40
[1208915.930845]  [<ffffffff812333f2>] __lock_buffer+0x32/0x40
[1208915.930861]  [<ffffffffa0032e33>] do_get_write_access+0x433/0x4d0 [jbd2]
[1208915.930879]  [<ffffffff812336ed>] ? __getblk+0x2d/0x300
[1208915.930893]  [<ffffffffa0032ef7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
[1208915.930923]  [<ffffffffa0089d2b>] __ext4_journal_get_write_access+0x3b/0x80 [ext4]
[1208915.930946]  [<ffffffffa0058290>] ext4_reserve_inode_write+0x70/0xa0 [ext4]
[1208915.930971]  [<ffffffffa005b9b0>] ? ext4_dirty_inode+0x40/0x60 [ext4]
[1208915.930989]  [<ffffffffa0058313>] ext4_mark_inode_dirty+0x53/0x210 [ext4]
[1208915.931025]  [<ffffffffa005b9b0>] ext4_dirty_inode+0x40/0x60 [ext4]
[1208915.931048]  [<ffffffff8122944a>] __mark_inode_dirty+0xca/0x290
[1208915.931060]  [<ffffffff8121a001>] update_time+0x81/0xd0
[1208915.931072]  [<ffffffff8121a250>] file_update_time+0xa0/0xf0
[1208915.931093]  [<ffffffff811830c8>] __generic_file_aio_write+0x198/0x400
[1208915.931105]  [<ffffffff81183389>] generic_file_aio_write+0x59/0xa0
[1208915.931128]  [<ffffffffa004ef0b>] ext4_file_write+0xdb/0x470 [ext4]
[1208915.931141]  [<ffffffff811fdf3d>] do_sync_write+0x8d/0xd0
[1208915.931151]  [<ffffffff811fe7ad>] vfs_write+0xbd/0x1e0
[1208915.931169]  [<ffffffff811ff2cf>] SyS_write+0x7f/0xe0
[1208915.931179]  [<ffffffff81697189>] system_call_fastpath+0x16/0x1b
.... Then these messages continue and the last entries in dmesg are:
[1208922.543462] sched: RT throttling activated
[1208922.543444] Restarting tasks ... done.
[1208922.555833] xen:manage: do_suspend: freeze processes failed -16

dmesg from the hypervisor around this time:
[608439.695371] vif vif-35-1 vif35.1: Guest Rx ready
[608439.695656] xbr_v111: port 2(vif35.1) entered blocking state
[608439.695918] xbr_v111: port 2(vif35.1) entered forwarding state

I'll grab another set of these when the next VM locks up if required, possibly one where we don't attempt the save and restore.

If there's anything else required, let me know and I'll be happy to provide it.

Cheers,
Darren



Hello,
Which version of Xen are you using? Are you monitoring nr_frames with the xen-diag tool?
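
For reference, xen-diag is included with Xen 4.10 (the install path depends on the packaging, often under Xen's libexec/bin directory) and can query a guest's grant table usage, roughly:

xl list                              # to get the domid of the guest
xen-diag gnttab_query_size <domid>   # reports current and maximum grant frames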

In Xen 4.10, the grant frames are configured per domU, so if you're using Xen 4.10 you need to set them in each domU's configuration rather than globally. By the way, the Gentoo wiki page was written out of frustration after having these random hangs for half a year and not being able to find anything to overcome them. But setting the max grants to 256 (on kernel 4.14) solved it for us.
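
For xl-managed guests the per-domU knob is max_grant_frames in the domain's config file (xl.cfg syntax in Xen 4.10); for libvirt-defined guests like the one above, check whether your libvirt version exposes an equivalent setting. A minimal sketch:

# in the guest's xl config file, then restart the guest
max_grant_frames = 256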

Tomas

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-users

 

