
Re: [Xen-users] xen dom0 nfs hangs, oom killer and more




I plan to set up a test soon to smoke this out further. I am going to set up a host that I can boot with or without Xen and run the same operations both ways, to see whether I can show definitively that this shows up only under Xen dom0, and then maybe get a clearer picture of why.
Thank you.
I ran a job that backed up some local file systems using tar and parallel gzip compression (pigz) to an NFS-mounted directory. All was fine... for a while... and then once again the system fell down. I have a default Ubuntu install and have tried all sorts of tweaks, kernels, and sysctl variables, and NFS is simply death under Xen no matter what. Across generations of hardware, OS installs from at least Ubuntu 12 forward, switches, and networks, this cancer of NFS death never goes away, never changes, and is trivial to trigger (just try using it), and I am still at a complete and total loss. Is there anyone who successfully uses a network filesystem of any kind under Xen dom0, and if so, what is your experience?
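
For anyone who wants to try the same comparison, here is a minimal sketch of the workload described above, not the exact job that was run. It assumes GNU tar and pigz are on the PATH, and that /proc/xen/capabilities contains "control_d" when the kernel is running as dom0. The function names, the thread count, and the paths in the usage note are illustrative only.

```python
import subprocess
import time


def running_in_xen_dom0(capabilities_path="/proc/xen/capabilities"):
    """Return True if the kernel reports Xen control-domain (dom0) privileges."""
    try:
        with open(capabilities_path) as f:
            return "control_d" in f.read()
    except OSError:
        # File is absent when not booted under Xen (or xenfs is not mounted).
        return False


def backup_pipeline(source_dir, threads=4):
    """Build argument lists for: tar -C SOURCE -cf - . | pigz -p THREADS"""
    tar_cmd = ["tar", "-C", source_dir, "-cf", "-", "."]
    pigz_cmd = ["pigz", "-p", str(threads)]
    return tar_cmd, pigz_cmd


def run_backup(source_dir, dest_file, threads=4):
    """Stream a compressed tar of source_dir into dest_file; return seconds taken."""
    tar_cmd, pigz_cmd = backup_pipeline(source_dir, threads)
    start = time.monotonic()
    with open(dest_file, "wb") as out:
        tar = subprocess.Popen(tar_cmd, stdout=subprocess.PIPE)
        pigz = subprocess.Popen(pigz_cmd, stdin=tar.stdout, stdout=out)
        tar.stdout.close()  # let tar see SIGPIPE if pigz exits early
        pigz_rc = pigz.wait()
        tar_rc = tar.wait()
    if tar_rc != 0 or pigz_rc != 0:
        raise RuntimeError(f"backup failed: tar={tar_rc} pigz={pigz_rc}")
    return time.monotonic() - start
```

Calling run_backup("/srv/data", "/mnt/nfs/backup.tar.gz") once booted under Xen and once on bare metal, and recording running_in_xen_dom0() next to each timing, would give a like-for-like comparison. Both paths are placeholders.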

[ 2682.994745] INFO: task pigz:11030 blocked for more than 120 seconds.
[ 2682.994783] Tainted: G W 4.10.0-041000-generic #201702191831
[ 2682.994806] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2682.994832] pigz            D    0 11030  11028 0x00000000
[ 2682.994838] Call Trace:
[ 2682.994853]  __schedule+0x233/0x6f0
[ 2682.994857]  ? bit_wait+0x60/0x60
[ 2682.994859]  schedule+0x36/0x80
[ 2682.994863]  schedule_timeout+0x22a/0x3f0
[ 2682.994869]  ? node_dirty_ok+0x12c/0x170
[ 2682.994872]  ? get_page_from_freelist+0x27c/0xb20
[ 2682.994877]  ? ___slab_alloc+0x3a0/0x4b0
[ 2682.994884]  ? ktime_get+0x41/0xb0
[ 2682.994886]  ? bit_wait+0x60/0x60
[ 2682.994888]  io_schedule_timeout+0xa4/0x110
[ 2682.994892]  ? _raw_spin_unlock_irqrestore+0x1a/0x20
[ 2682.994894]  bit_wait_io+0x1b/0x60
[ 2682.994896]  __wait_on_bit+0x58/0x90
[ 2682.994898]  ? bit_wait+0x60/0x60
[ 2682.994900]  out_of_line_wait_on_bit+0x82/0xb0
[ 2682.994907]  ? autoremove_wake_function+0x40/0x40
[ 2682.994937]  nfs_wait_on_request+0x37/0x40 [nfs]
[ 2682.994952]  nfs_writepage_setup+0xd1/0x6f0 [nfs]
[ 2682.994965]  nfs_updatepage+0x107/0x3a0 [nfs]
[ 2682.994971]  ? __check_object_size+0x100/0x1d7
[ 2682.994983]  nfs_write_end+0xf8/0x570 [nfs]
[ 2682.994990]  generic_perform_write+0x10f/0x1c0
[ 2682.995002]  nfs_file_write+0xdc/0x220 [nfs]
[ 2682.995006]  __vfs_write+0xe5/0x160
[ 2682.995010]  vfs_write+0xb5/0x1a0
[ 2682.995012]  SyS_write+0x55/0xc0
[ 2682.995016]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 2682.995019] RIP: 0033:0x7f9ad7f994bd
[ 2682.995021] RSP: 002b:00007f9ad769be50 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 2682.995024] RAX: ffffffffffffffda RBX: 000000000062f500 RCX: 00007f9ad7f994bd
[ 2682.995025] RDX: 0000000000018635 RSI: 00007f9ad8304010 RDI: 0000000000000001
[ 2682.995027] RBP: 0000000001a89c00 R08: 0000000000000000 R09: 0000000000000000
[ 2682.995028] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[ 2682.995030] R13: 0000000000020000 R14: 00007f9a9c6ffa98 R15: 00007f9a84042040
[ 2803.829527] INFO: task pigz:11030 blocked for more than 120 seconds.
[ 2803.829556] Tainted: G W 4.10.0-041000-generic #201702191831
[ 2803.829577] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2803.829600] pigz            D    0 11030  11028 0x00000000
[ 2803.829604] Call Trace:
[ 2803.829613]  __schedule+0x233/0x6f0
[ 2803.829615]  ? bit_wait+0x60/0x60
[ 2803.829616]  schedule+0x36/0x80
[ 2803.829618]  schedule_timeout+0x22a/0x3f0
[ 2803.829622]  ? node_dirty_ok+0x12c/0x170
[ 2803.829624]  ? get_page_from_freelist+0x27c/0xb20
[ 2803.829627]  ? ___slab_alloc+0x3a0/0x4b0
[ 2803.829631]  ? ktime_get+0x41/0xb0
[ 2803.829632]  ? bit_wait+0x60/0x60
[ 2803.829633]  io_schedule_timeout+0xa4/0x110
[ 2803.829635]  ? _raw_spin_unlock_irqrestore+0x1a/0x20
[ 2803.829637]  bit_wait_io+0x1b/0x60
[ 2803.829638]  __wait_on_bit+0x58/0x90
[ 2803.829639]  ? bit_wait+0x60/0x60
[ 2803.829641]  out_of_line_wait_on_bit+0x82/0xb0
[ 2803.829644]  ? autoremove_wake_function+0x40/0x40
[ 2803.829663]  nfs_wait_on_request+0x37/0x40 [nfs]
[ 2803.829672]  nfs_writepage_setup+0xd1/0x6f0 [nfs]
[ 2803.829680]  nfs_updatepage+0x107/0x3a0 [nfs]
[ 2803.829683]  ? __check_object_size+0x100/0x1d7
[ 2803.829691]  nfs_write_end+0xf8/0x570 [nfs]
[ 2803.829695]  generic_perform_write+0x10f/0x1c0
[ 2803.829702]  nfs_file_write+0xdc/0x220 [nfs]
[ 2803.829705]  __vfs_write+0xe5/0x160
[ 2803.829707]  vfs_write+0xb5/0x1a0
[ 2803.829708]  SyS_write+0x55/0xc0
[ 2803.829711]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 2803.829712] RIP: 0033:0x7f9ad7f994bd
[ 2803.829713] RSP: 002b:00007f9ad769be50 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 2803.829715] RAX: ffffffffffffffda RBX: 000000000062f500 RCX: 00007f9ad7f994bd
[ 2803.829716] RDX: 0000000000018635 RSI: 00007f9ad8304010 RDI: 0000000000000001
[ 2803.829717] RBP: 0000000001a89c00 R08: 0000000000000000 R09: 0000000000000000
[ 2803.829718] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[ 2803.829719] R13: 0000000000020000 R14: 00007f9a9c6ffa98 R15: 00007f9a84042040
[ 4374.681798] INFO: task pigz:11030 blocked for more than 120 seconds.
[ 4374.681829] Tainted: G W 4.10.0-041000-generic #201702191831
[ 4374.681850] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4374.681874] pigz            D    0 11030  11028 0x00000000
[ 4374.681878] Call Trace:
[ 4374.681887]  __schedule+0x233/0x6f0
[ 4374.681889]  ? bit_wait+0x60/0x60
[ 4374.681891]  schedule+0x36/0x80
[ 4374.681893]  schedule_timeout+0x22a/0x3f0
[ 4374.681896]  ? node_dirty_ok+0x12c/0x170
[ 4374.681900]  ? update_load_avg+0x6b/0x510
[ 4374.681903]  ? ktime_get+0x41/0xb0
[ 4374.681905]  ? bit_wait+0x60/0x60
[ 4374.681906]  io_schedule_timeout+0xa4/0x110
[ 4374.681908]  ? _raw_spin_unlock_irqrestore+0x1a/0x20
[ 4374.681909]  bit_wait_io+0x1b/0x60
[ 4374.681911]  __wait_on_bit+0x58/0x90
[ 4374.681912]  ? bit_wait+0x60/0x60
[ 4374.681913]  out_of_line_wait_on_bit+0x82/0xb0
[ 4374.681916]  ? autoremove_wake_function+0x40/0x40
[ 4374.681934]  nfs_wait_on_request+0x37/0x40 [nfs]
[ 4374.681943]  nfs_writepage_setup+0xd1/0x6f0 [nfs]
[ 4374.681951]  nfs_updatepage+0x107/0x3a0 [nfs]
[ 4374.681954]  ? __check_object_size+0x100/0x1d7
[ 4374.681962]  nfs_write_end+0xf8/0x570 [nfs]
[ 4374.681966]  generic_perform_write+0x10f/0x1c0
[ 4374.681977]  nfs_file_write+0xdc/0x220 [nfs]
[ 4374.681980]  __vfs_write+0xe5/0x160
[ 4374.681983]  vfs_write+0xb5/0x1a0
[ 4374.681986]  SyS_write+0x55/0xc0
[ 4374.681989]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 4374.681991] RIP: 0033:0x7f9ad7f994bd
[ 4374.681993] RSP: 002b:00007f9ad769be50 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 4374.681995] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f9ad7f994bd
[ 4374.681997] RDX: 000000000000024f RSI: 00007f9abc1de010 RDI: 0000000000000001
[ 4374.681998] RBP: 0000000001a684f0 R08: 0000000000000000 R09: 0000000000000000
[ 4374.682000] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000024f
[ 4374.682001] R13: 0000000000020000 R14: 00007f9ad414c010 R15: 00007f9ab40420f0
[ 5051.805211] EXT4-fs (dm-7): mounted filesystem with ordered data mode. Opts: (null)
[ 5446.073199] EXT4-fs (dm-7): 1 orphan inode deleted
[ 5446.073203] EXT4-fs (dm-7): recovery complete
[ 5446.082484] EXT4-fs (dm-7): mounted filesystem with ordered data mode. Opts: (null)
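
When comparing runs, it may help to count the hung-task reports rather than read each trace by eye. The following is a rough sketch that pulls the "blocked for more than N seconds" lines out of saved dmesg text. The regular expression is written against the message format seen in the log above and may need adjusting for other kernel versions. The function and variable names are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [ 2682.994745] INFO: task pigz:11030 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>[^:]+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_task_reports(log_text):
    """Return (timestamp, command, pid, threshold_secs) for each report found."""
    return [
        (float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(log_text)
    ]


def count_by_task(log_text):
    """Count how many times each (command, pid) pair was reported as hung."""
    return Counter((comm, pid) for _, comm, pid, _ in hung_task_reports(log_text))


# The three report headers from the log above, used as sample input.
sample = """\
[ 2682.994745] INFO: task pigz:11030 blocked for more than 120 seconds.
[ 2803.829527] INFO: task pigz:11030 blocked for more than 120 seconds.
[ 4374.681798] INFO: task pigz:11030 blocked for more than 120 seconds.
"""

print(count_by_task(sample))
```

On a live system the input could be the captured output of dmesg instead of the sample string. A count of zero on the non-Xen boot with the same workload would support the dom0-only theory.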



_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users

 

