
[Xen-users] xen dom0 nfs hangs, oom killer and more



Hello,

I have had long-term issues with NFS and CIFS in Xen dom0. They make it essentially impossible to do any reasonable amount of network filesystem I/O from dom0 without risking dom0 processes blocking or hanging, with kernel messages such as "blocked for more than 120 seconds". I last posted about this in 2015 and documented some very extensive troubleshooting (https://lists.gt.net/xen/users/381469). I have never found a resolution, across generations of hardware, OS installs, networks and more. The penalty simply seems to be that if you use NFS or CIFS from Xen dom0, you can expect I/O performance akin to dialup speeds, and hung or blocked processes starved for I/O that can only be cleared by rebooting the box.

So here it is 2017 and I have taken yet another crack at all this. Once again I have a new network, new servers, new OS installs, new switches, the works. Not only can you still not use NFS/CIFS in dom0 without the hangs, there are now new symptoms: the OOM killer starts up and kills seemingly random processes in dom0, all while there are many gigabytes of free RAM and no obvious explanation. I am not even sure how best to document this problem, but it is like one of those emperor-has-no-clothes things: NFS/CIFS is unsafe and brings down Xen dom0, with no special effort required other than simply trying to use them.

I use NFS and CIFS in non-Xen setups (bare-metal servers) and find both reasonably reliable and well performing, for nearly the exact same workloads I would run under Xen dom0 (backups, for example creating compressed tar files of virtual machine image files). So I am fairly confident that Linux and these filesystems together are reasonably stable. What I would like to know is whether anyone else uses NFS/CIFS from dom0 successfully, and if so, what your setup looks like.

I plan on setting up a test soon to smoke this out further. I am going to set up a host that I can boot with or without Xen, run the same operations both ways, and see if I can show definitively that this shows up only under Xen dom0, and then maybe get a clearer picture of why.
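
For comparing the two boots, something like the following could tally the symptoms from a saved kernel log after each run. This is only a rough sketch: the function name count_symptoms is made up for illustration, the hung-task pattern comes from the message quoted above, and the OOM patterns are the wording Linux kernels commonly print, which I am assuming matches what shows up on these hosts.

```python
import re

# Hung-task wording as quoted in this post; the timeout value may differ per host.
HUNG_TASK = re.compile(r"blocked for more than \d+ seconds")
# Common OOM-killer wording (assumption: matches the kernel in use here).
OOM_KILL = re.compile(r"invoked oom-killer|Out of memory: Kill(?:ed)? process")


def count_symptoms(log_text):
    """Count hung-task and OOM-killer lines in a kernel log dump."""
    counts = {"hung_task": 0, "oom": 0}
    for line in log_text.splitlines():
        if HUNG_TASK.search(line):
            counts["hung_task"] += 1
        if OOM_KILL.search(line):
            counts["oom"] += 1
    return counts
```

The idea would be to save the dmesg output after running the same backup job once under Xen dom0 and once on bare metal, then pass each saved log to count_symptoms and compare the two results.
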
Thank you.



 

