Re: [Xen-users] xen dom0 nfs hangs, oom killer and more



On Sun, 2017-02-12 at 23:39 -0800, Mike wrote:
> Hello,
> 
Hi,

>      I have had long term issues with xen dom0 and NFS / CIFS which
> essentially make it impossible to do any reasonable amount of network
> based filesystem operations from dom0 without the risk of excessive
> blocking or hanging of dom0 processes with messages such as "blocked
> for more than 120 seconds". I last posted about this problem in 2015
> and documented some very extensive troubleshooting on this problem
> (https://lists.gt.net/xen/users/381469). I have never found a
> resolution, across generations of hardware, OS installs, networks and
> more, the penalty simply seems that if you attempt to use NFS / CIFS
> from xen dom0, you can expect io performance issues akin to dialup
> speeds, and hanging / blocked processes starved for I/O that can only
> be resolved with a reboot of the box.
> 
Ok. I do have a testbox that mounts stuff from NFS in dom0. It works
for me, but I use it (I mean, the NFS part) in a very limited way, I
have to admit.

>      So here it is in 2017 and I have taken yet another crack at all
> this. I now have yet again, new network, new servers, new os installs,
> new switches, the works. Not only can you still not use NFS/CIFS in
> dom0 without the hangs, new symptoms arise now with OOM killer
> starting up and killing seemingly random processes in dom0 all the
> while with many many gigabytes of free ram, no obvious explanation.
>
Right. I understand this can be very frustrating. :-(

> I am not even sure really how to best document this problem but it's
> like one of those emperor has no clothes things, nfs/cifs is unsafe
> and brings down xen dom0 no special effort required other than simply
> trying to use these, end of story.
> 
Well, sure. But at the same time, without information it's hard
(impossible) to figure out what's happening and get to the bottom of
the issue.

So, as a starting point, the full output of `xl dmesg' and `dmesg' is
the first thing I'd ask to see. Even more useful would be actual
information from a crash: for instance, what the OOM killer says,
whether Xen prints anything on the serial console before locking up or
dying, etc.
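
In case it helps, here is a minimal sketch of how the two outputs
mentioned above could be captured into timestamped files from dom0.
The function name, file names and output directory are made up for
illustration; only `xl dmesg' and `dmesg' come from this thread. Both
commands usually need root, and a command that is missing or times
out is skipped rather than treated as fatal.

```python
import subprocess
import time
from pathlib import Path

# The two commands named in this thread. The dictionary keys are only
# used to build file names and are an arbitrary choice.
DEFAULT_COMMANDS = {
    "xl-dmesg": ["xl", "dmesg"],
    "dmesg": ["dmesg"],
}


def collect_logs(outdir, commands=None, timeout=30):
    """Run each command and save stdout+stderr to outdir.

    Returns a dict mapping each name to the written Path, or to None
    when the command could not be run at all.
    """
    commands = DEFAULT_COMMANDS if commands is None else commands
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    saved = {}
    for name, argv in commands.items():
        try:
            result = subprocess.run(
                argv,
                capture_output=True,
                text=True,
                errors="replace",  # tolerate odd bytes in kernel logs
                timeout=timeout,
            )
        except (FileNotFoundError, PermissionError,
                subprocess.TimeoutExpired) as exc:
            print(f"skipped {name}: {exc}")
            saved[name] = None
            continue
        path = outdir / f"{name}-{stamp}.log"
        # Keep stderr too: a permission error is itself useful information.
        path.write_text(result.stdout + result.stderr)
        saved[name] = path
    return saved
```

Calling `collect_logs("xen-debug-logs")` as root, once right after boot
and once after the hang or the OOM kill shows up, would give two sets
of files to compare and attach.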

>      I plan on setting up a test soon to smoke this out further. Going
> to have a host set up where I can boot xen or non-xen and run the same
> operations and see if I can show definitely this shows up only under
> xen dom0, and then maybe get a clearer picture of why.
>
Actually, yes. If you can trigger and reproduce the bug, and provide as
many logs as possible from the affected system, that would hopefully be
a useful starting point for a diagnosis. :-)
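
For the Xen vs. non-Xen comparison, it helps if exactly the same
workload runs in both boots. Below is a rough sketch of one such
workload: a timed sequential write with an fsync at the end. The
function name, the scratch file name and the sizes are arbitrary
assumptions, not something taken from your setup, and a real
reproducer may well need a heavier or different I/O pattern.

```python
import os
import time


def timed_write(directory, total_mib=64, chunk_mib=1):
    """Write total_mib of zeros to a scratch file in `directory`.

    The data is flushed and fsync'ed before the clock stops, so the
    time includes pushing the data out to the server for a network
    mount. Returns (bytes_written, elapsed_seconds).
    """
    path = os.path.join(directory, "xen-nfs-probe.tmp")  # hypothetical name
    chunk = b"\0" * (chunk_mib * 1024 * 1024)
    written = 0
    start = time.monotonic()
    with open(path, "wb") as fh:
        for _ in range(total_mib // chunk_mib):
            written += fh.write(chunk)
        fh.flush()
        os.fsync(fh.fileno())
    elapsed = time.monotonic() - start
    os.remove(path)  # clean up the scratch file
    return written, elapsed
```

Running this against the NFS mount point under Xen dom0 and then
again on the same host booted without Xen, and noting the elapsed
times together with the logs from each boot, would give two directly
comparable data points.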

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
https://lists.xen.org/xen-users