[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Xen 4.14 and future work


  • To: "Durrant, Paul" <pdurrant@xxxxxxxxxx>, Xen-devel List <xen-devel@xxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Tue, 3 Dec 2019 17:37:20 +0000
  • Delivery-date: Tue, 03 Dec 2019 17:37:34 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 03/12/2019 09:03, Durrant, Paul wrote:
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of
>> Andrew Cooper
>> Sent: 02 December 2019 19:52
>> To: Xen-devel List <xen-devel@xxxxxxxxxxxxx>
>> Subject: [Xen-devel] Xen 4.14 and future work
>>
>> Hello,
>>
>> Now that 4.13 is on its way out of the door, it is time to look to
>> ongoing work.
>>
>> We have a large backlog of speculation-related work.  For one, we still
>> don't virtualise MSR_ARCH_CAPS for guests, or use eIBRS ourselves in
>> Xen.  Therefore, while Xen does function on Cascade Lake, support is
>> distinctly suboptimal.
>>
>> Similarly, AMD systems frequently fill /var/log with:
>>
>> (XEN) emul-priv-op.c:1113:d0v13 Domain attempted WRMSR c0011020 from
>> 0x0006404000000000 to 0x0006404000000400
>>
>> which is an interaction between Linux's prctl() to disable memory
>> disambiguation on a per-process basis, Xen's write/discard behaviour
>> for MSRs, and the long-overdue series to properly virtualise SSBD
>> support on AMD hardware.  AMD Rome hardware, like Cascade Lake, has
>> certain hardware speculative mitigation features which need
>> virtualising for guests to make use of.
>>
> I assume this would be addressed by the proposed cpuid/msr policy work?

Yes.  The next task there is to plumb the CPUID policy through the libxc
migrate stream, coping with its absence from older sources.  This
(purposefully) breaks the dual purpose of the CPUID code in libxc for
both domain start and domain restore, and allows us to rewrite the
domain start logic without impacting migrating-in VMs.

Then, and only then, is it safe to add MSR_ARCH_CAPS into the guest
policies and start setting it up.

> I think it is quite vital for Xen that we are able to migrate guests across 
> pools of heterogeneous h/w and therefore I'd like to see this done in 4.14 if 
> possible.

Why do you think it was at the top of my list? :)

>
>> Similarly, there is plenty more work to do with core-aware scheduling,
>> and from my side of things, sane guest topology.  This will eventually
>> unblock one of the factors on the hard 128 vcpu limit for HVM guests.
>>
>>
>> Another big area is the stability of toolstack hypercalls.  This is a
>> crippling pain point for distros and the upgradeability of systems,
>> and there is frankly no justifiable reason for the way we currently do
>> things.  The real reason is inertia from back in the days when Xen.git
>> (BitKeeper as it was back then) contained a fork of every relevant
>> piece of software.  That model is long-since obsolete, but it is still
>> causing us pain.  I will follow up with a proposal in due course, but
>> as a one-liner, it will build on the dm_op() API model.
> This is also fairly vital for the work on live update of Xen (as discussed at 
> the last dev summit). Any instability in the tools ABI will compromise 
> hypervisor update and fixing such issues on an ad-hoc basis as they arise is 
> not really a desirable prospect.
>
>> Likely included within this is making the domain/vcpu destroy paths
>> idempotent so we can fix a load of NULL pointer dereferences in Xen
>> caused by XEN_DOMCTL_max_vcpus not being part of XEN_DOMCTL_createdomain.
>>
>> Other work in this area involves adding X86_EMUL_{VIRIDIAN,NESTED_VIRT}
>> to replace their existing problematic enablement interfaces.
>>
> I think this should include deprecation of HVMOP_get/set_param as far as is 
> possible (i.e. tools use)...
>
>> A start needs to be made on a total rethink of the HVM ABI.  This has
>> come up repeatedly at previous dev summits, and is in desperate need of
>> having some work started on it.
>>
> ...and completely in any new ABI.

Both already in the plan(s).

> I wonder to what extent we can provide a guest-side compat layer here;
> otherwise I think it would be hard to get traction.

Step 1 of the design (deliberately) won't be concerned with guest
compatibility.  The single most important aspect is to come up with a
clean design which is not crippled by retaining compatibility for PV
guests, and without x86-isms leaking into other architectures.

Once a sensible design exists, we can go about figuring out how best to
enact it.  Most areas will be able to fit compatibility into existing
HVM guests, but some are going to have a very hard time.

> There was an interesting talk at KVM Forum (https://sched.co/Tmuy) on dealing 
> with emulation inside guest context by essentially re-injecting the VMEXITs 
> back into the guest for pseudo-SMM code (loaded as part of the firmware blob) 
> to deal with. I could imagine potentially using such a mechanism to have a 
> 'legacy' hypercall translated to the new ABI, which would allow older guests 
> to be supported unmodified (albeit with a performance penalty). Such a 
> mechanism may also be useful as an alternative way of dealing with some of 
> the emulation dealt with directly in Xen at the moment, to reduce the 
> hypervisor attack surface e.g. stdvga caching, hpet, rtc... perhaps.

I don't think this is relevant to the ABI discussion - it's not changing
anything in the guest's view.  I'm sure people will want it for other
reasons, and I don't see any issue with implementing it for existing HVM
guests.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel


