Re: [Xen-devel] support for more than 32 VCPUs when migrating PVHVM guest
Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> writes:

> On Mon, Feb 02, 2015 at 12:03:28PM +0100, Vitaly Kuznetsov wrote:
>> Andrew Cooper <andrew.cooper3@xxxxxxxxxx> writes:
>>
>> > On 02/02/15 10:47, Vitaly Kuznetsov wrote:
>> >> Hi Konrad,
>> >>
>> >> I just hit an issue with PVHVM guests after save/restore (or migration):
>> >> if a PVHVM guest has > 32 VCPUs it hangs. Turns out, you saw it almost a
>> >> year ago and even wrote patches to call VCPUOP_register_vcpu_info after
>> >> resume. Unfortunately these patches never made it to xen/kernel. Do you
>> >> have a plan to pick this up? What were the arguments against your
>> >> suggestion?
>> >
>> > 32 VCPUs is the legacy limit for HVM guests, but should not have any
>> > remaining artefacts these days.
>> >
>> > Do you know why the hang occurs? I can't spot anything in the legacy
>> > migration code which would enforce such a limit.
>> >
>> > What is the subject of the thread you reference so I can search for it?
>> >
>>
>> Sorry, I should have sent the link:
>>
>> http://lists.xen.org/archives/html/xen-devel/2014-04/msg00794.html
>>
>> Konrad's patches:
>>
>> http://lists.xen.org/archives/html/xen-devel/2014-04/msg01199.html
>>
>> The issue is that we don't call VCPUOP_register_vcpu_info after
>> suspend/resume (or migration), and it is mandatory.
>
> The issue I saw was that with that enabled everything (which is what Jan
> requested) seems to work - except that I, ah here it is:
>
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg02875.html
>
> err:
>
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg02945.html
>
> > The VCPUOP_send_nmi did cause the HVM guest to get an NMI and it spat out
> > 'Dazed and confused'. It also noticed corruption:
> >
> > [ 3.611742] Corrupted low memory at c000fffc (fffc phys) = 00029b00
> > [ 2.386785] Corrupted low memory at ffff88000000fff8 (fff8 phys) =
> > 2990000000000
> >
> > Which is odd, because there does not seem to be anything in the path
> > of the hypervisor that would cause this.
>
> Indeed. This looks a little like a segment descriptor got modified here,
> with a descriptor table base of zero and a selector of 0xfff8. That
> corruption needs to be hunted down in any case before enabling
> VCPUOP_send_nmi for HVM.
>
> I did not get a chance to "hunt down" that pesky issue. That is the only
> thing holding this patchset back.
>
> Said patch is in my queue of patches to upstream (amongst 30 other ones) -
> and I am working through the review/issues - but it will take me quite some
> time - so if you feel like taking a stab at this, please do!

Thanks for summing this up for me; if something pops up wrt this corruption
issue I'll report back.

-- 
Vitaly
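[Editor's note] For readers unfamiliar with the hypercall being discussed: the
missing step after resume is essentially a repeat of what the kernel's
xen_vcpu_setup() does at boot, i.e. re-issuing VCPUOP_register_vcpu_info for
every vCPU so that vCPUs beyond the 32 legacy shared_info slots get a usable
vcpu_info area again. The sketch below illustrates that call; it is not
Konrad's patch. The helper name xen_reregister_vcpu_info and the assumption
that the per-cpu xen_vcpu_info / xen_vcpu variables (defined in
arch/x86/xen/enlighten.c) are visible here are illustrative, and the exact
resume hook it would be called from (presumably something like
xen_hvm_post_suspend()) is likewise an assumption.

    #include <linux/percpu.h>
    #include <linux/printk.h>
    #include <linux/mm.h>               /* offset_in_page() */
    #include <asm/xen/hypercall.h>      /* HYPERVISOR_vcpu_op() */
    #include <asm/xen/page.h>           /* arbitrary_virt_to_mfn() */
    #include <xen/interface/xen.h>      /* struct vcpu_info */
    #include <xen/interface/vcpu.h>     /* VCPUOP_register_vcpu_info */

    /* Kernel-internal per-cpu state; mirrors the definitions in enlighten.c. */
    DECLARE_PER_CPU(struct vcpu_info, xen_vcpu_info);
    DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);

    /* Hypothetical helper: called for each vCPU after restore/migration. */
    static void xen_reregister_vcpu_info(int cpu)
    {
            struct vcpu_info *vcpup = &per_cpu(xen_vcpu_info, cpu);
            struct vcpu_register_vcpu_info info;
            int err;

            info.mfn = arbitrary_virt_to_mfn(vcpup);
            info.offset = offset_in_page(vcpup);

            /*
             * Point the hypervisor at our per-cpu vcpu_info page again.
             * The registration does not survive save/restore, and vCPUs
             * >= 32 have no fallback slot in the legacy shared_info page,
             * so without this the guest hangs after migration.
             */
            err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
            if (err) {
                    pr_err("VCPUOP_register_vcpu_info failed for CPU%d: %d\n",
                           cpu, err);
                    return;
            }
            per_cpu(xen_vcpu, cpu) = vcpup;
    }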