Re: [Xen-devel] support for more than 32 VCPUs when migrating PVHVM guest
On Mon, Feb 02, 2015 at 12:03:28PM +0100, Vitaly Kuznetsov wrote:
> Andrew Cooper <andrew.cooper3@xxxxxxxxxx> writes:
>
> > On 02/02/15 10:47, Vitaly Kuznetsov wrote:
> >> Hi Konrad,
> >>
> >> I just hit an issue with PVHVM guests after save/restore (or migration):
> >> if a PVHVM guest has > 32 VCPUs it hangs. Turns out you saw it almost a
> >> year ago and even wrote patches to call VCPUOP_register_vcpu_info after
> >> resume. Unfortunately these patches never made it to xen/kernel. Do you
> >> have a plan to pick this up? What were the arguments against your
> >> suggestion?
> >
> > 32 VCPUs is the legacy limit for HVM guests, but it should not have any
> > remaining artefacts these days.
> >
> > Do you know why the hang occurs? I can't spot anything in the legacy
> > migration code which would enforce such a limit.
> >
> > What is the subject of the thread you reference so I can search for it?
>
> Sorry, I should have sent the link:
>
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg00794.html
>
> Konrad's patches:
>
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg01199.html
>
> The issue is that we don't call VCPUOP_register_vcpu_info after
> suspend/resume (or migration), and it is mandatory.

The issue I saw was that with that enabled, everything seems to work
(which is what Jan requested) - except that I ... ah, here it is:

http://lists.xen.org/archives/html/xen-devel/2014-04/msg02875.html

er, rather:

http://lists.xen.org/archives/html/xen-devel/2014-04/msg02945.html

> The VCPUOP_send_nmi did cause the HVM guest to get an NMI and it spat out
> 'Dazed and confused'. It also noticed corruption:
>
> [    3.611742] Corrupted low memory at c000fffc (fffc phys) = 00029b00
> [    2.386785] Corrupted low memory at ffff88000000fff8 (fff8 phys) = 2990000000000
>
> Which is odd, because there does not seem to be anything in the
> hypervisor path that would cause this.

Indeed. This looks a little like a segment descriptor got modified here,
with a descriptor table base of zero and a selector of 0xfff8. That
corruption needs to be hunted down in any case before enabling
VCPUOP_send_nmi for HVM.

I did not get a chance to "hunt down" that pesky issue, and that is the
only thing holding this patch set back.

Said patch is in my queue of patches to upstream (amongst 30 other ones),
and I am working through the review/issues - but it will take me quite
some time - so if you feel like taking a stab at this, please do!
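
For context, the re-registration being discussed boils down to redoing the
VCPUOP_register_vcpu_info hypercall for every VCPU once the guest comes back
from save/restore or migration. Below is a minimal sketch, assuming the Linux
kernel's Xen interfaces (HYPERVISOR_vcpu_op, arbitrary_virt_to_mfn, and a
per-CPU vcpu_info area as in arch/x86/xen); it is an illustration only, not
the patch set linked above, and the function names are hypothetical.

/*
 * Sketch only: assumes Linux's Xen hypercall wrappers and page helpers.
 */
#include <linux/mm.h>
#include <linux/percpu.h>
#include <xen/interface/xen.h>
#include <xen/interface/vcpu.h>
#include <asm/xen/hypercall.h>
#include <asm/xen/page.h>

static DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);

/*
 * Tell the hypervisor where this VCPU's vcpu_info lives. VCPUs >= 32 have
 * no slot in the legacy shared_info page, so if this registration is not
 * redone after suspend/resume (or migration), those VCPUs are left without
 * a usable vcpu_info and the guest hangs - which is the bug in the thread.
 */
static int xen_register_vcpu_info(int cpu)
{
	struct vcpu_info *vcpup = &per_cpu(xen_vcpu_info, cpu);
	struct vcpu_register_vcpu_info info = {
		.mfn    = arbitrary_virt_to_mfn(vcpup),
		.offset = offset_in_page(vcpup),
	};

	return HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
}

/*
 * The test mentioned above: poke a VCPU with an NMI. VCPUOP_send_nmi takes
 * no argument structure.
 */
static int xen_test_send_nmi(int cpu)
{
	return HYPERVISOR_vcpu_op(VCPUOP_send_nmi, cpu, NULL);
}

In the kernel this registration is normally done once per VCPU at boot; the
thread above is about making sure an equivalent call is issued again on the
resume path, which is what Konrad's linked patches set out to do.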