Re: [Xen-users] Xen 4.12 DomU hang / freeze / stall under high network/disk load
On Mon, Feb 17, 2020 at 10:49 AM Sarah Newman <srn@xxxxxxxxx> wrote:
> On 2/17/20 10:33 AM, Tomas Mozes wrote:
> > Just a quick note - no stall after switching to credit scheduler on xen
> > 4.12 after 3 days.
> That's great news. By 4.12 do you mean release 4.12.1, 4.12.2, or something
> else?
> I'm assuming when "PGNet Dev" reported 4.12 being bad and 4.13 being good,
> they were using the default scheduler of credit 2.

I hope they respond with details! :-)

> It's worth asking on xen-devel if there's a known bug in the credit 2
> scheduler that's been fixed. It looks like there were some significant changes
> to the scheduling code in between Xen 4.12 and Xen 4.13, and if one was a fix
> I'm not sure it would have been recognized as being so.

Sarah, Tomas - Is that something one of you wants to do? If not, I'm happy to take that task, but don't want to step on toes.

In light of this report, I've added sched=credit to my bootloader, for the *next* time, on my 4.12 production host. The guest on that host - which is my production machine and which I am not stress testing - has now been up for 8 days (typical when not stress testing, it lasts for 3-14 days).

Rather than rebooting to sched=credit now, I'm still hoping it will stall again, so I can run the commands Sarah asked.... although I wonder if it's still worth it given what we're finding??? Sarah - If you do feel it's worth it, I'm happy to wait.

Here are the commands I have lined up to run on the physical host (current guest id=10) when the guest stalls next:

    xl sysrq 10 l
    xl sysrq 10 x
    xl debug-keys q
    xl dmesg
    xl info

Is this right? Are there any other debugging commands I can/should run on the host or guest when it stalls next? Anything that might be useful I'm happy to grab, but since it might be 2AM I want to line them all up in a file (as I have above) so I don't have to hunt while trying to stay awake.
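For the 2AM case, the command list above could be wrapped in a small shell function so that one call runs everything in the same order and appends the output to a single log. This is only a sketch: `collect_stall_debug`, its arguments, and the default log file name are hypothetical, and it uses no `xl` subcommands beyond the ones already listed.

```shell
# Sketch: run the debug commands from the list above against one guest
# and append all output to a single log file.
# collect_stall_debug is a hypothetical helper name, not part of xl.
collect_stall_debug() {
    domid="${1:-}"                   # guest domain id, e.g. 10
    out="${2:-xen-stall-debug.log}"  # log file (assumed default name)

    if [ -z "$domid" ]; then
        echo "usage: collect_stall_debug DOMID [LOGFILE]" >&2
        return 2
    fi

    {
        echo "=== stall debug for domain $domid ==="
        # Same commands, same order as the list above; each is labelled
        # so the log stays readable afterwards.
        for cmd in "sysrq $domid l" "sysrq $domid x" "debug-keys q" "dmesg" "info"; do
            echo "--- xl $cmd ---"
            # $cmd is left unquoted on purpose so it splits into arguments.
            xl $cmd 2>&1 || echo "(xl $cmd failed with status $?)"
        done
    } >> "$out"
}
```

Calling `collect_stall_debug 10` on the host would then write everything to `xen-stall-debug.log` in the current directory. A failing command is noted in the log and the remaining commands still run.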
:-)

After it stalls next and I grab the debugging output suggested, I'll reboot the physical host into sched=credit for the production guest. My test host/guest I'm going to leave on 15.0/4.10 for now - since it's my future production host - until I do more testing on that configuration and/or until we get this nailed down.

Glen

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-users
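As a note on the sched=credit switch discussed in this message, here is a minimal sketch of how the option might be set on a GRUB 2 host and how the active scheduler could be confirmed after the reboot. `GRUB_CMDLINE_XEN_DEFAULT` is the variable GRUB's Xen helper reads on many distributions, but the file path and the command that regenerates grub.cfg vary and are assumptions here. `active_xen_scheduler` is a hypothetical helper name, and it assumes `xl info` reports a `xen_scheduler` field.

```shell
# In /etc/default/grub (path assumed; other bootloaders differ),
# append sched=credit to any Xen options already listed:
#   GRUB_CMDLINE_XEN_DEFAULT="sched=credit"
# then regenerate grub.cfg with the distribution's tool and reboot.

# Hypothetical helper: print the scheduler name reported by `xl info`.
active_xen_scheduler() {
    xl info | awk -F: '$1 ~ /^xen_scheduler[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }'
}
```

After the reboot, `active_xen_scheduler` would be expected to print `credit`. If it still prints `credit2`, the option most likely did not reach the hypervisor command line.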