
Re: [Xen-devel] Design session report: Live-Updating Xen


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <JBeulich@xxxxxxxx>
  • Date: Thu, 18 Jul 2019 09:15:16 +0000
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Leonard Foerster <foersleo@xxxxxxxxxx>
  • Delivery-date: Thu, 18 Jul 2019 09:15:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [Xen-devel] Design session report: Live-Updating Xen

On 17.07.2019 20:40, Andrew Cooper wrote:
> On 17/07/2019 14:02, Jan Beulich wrote:
>> On 17.07.2019 13:26, Andrew Cooper wrote:
>>> We do not want to be grovelling around in the old Xen's data structures,
>>> because that adds a binary A=>B translation which is
>>> per-old-version-of-Xen, meaning that you either need a custom build of
>>> each target Xen which depends on the currently-running Xen, or have to
>>> maintain a matrix of old versions which will depend on the local
>>> changes, and therefore not be suitable for upstream.
>> Now the question is what alternative you would suggest. By you
>> saying "the pinned state lives in the migration stream", I assume
>> you mean to imply that Dom0 state should be handed from old to
>> new Xen via such a stream (minus raw data page contents)?
> 
> Yes, and this is explicitly identified in the bullet point saying "We do
> only rely on domain state and no internal xen state".
> 
> In practice, it is going to be far more efficient to have Xen
> serialise/deserialise the domain register state etc., than to bounce it
> via hypercalls.  By the time you're doing that in Xen, adding dom0 as
> well is trivial.

So I must be missing some context here: How could hypercalls come into
the picture at all when it comes to "migrating" Dom0?
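For concreteness, a rough sketch of what "Xen serialises the domain
register state into the handover stream" could look like, modelled on the
migration stream's type/length record framing. Every name below is
invented for illustration; only struct cpu_user_regs is an existing Xen
type (assuming x86):

    /* Hypothetical sketch only -- not an actual live-update format. */
    #include <stdint.h>

    struct lu_record_header {
        uint32_t type;      /* e.g. a made-up LU_REC_VCPU_REGS */
        uint32_t length;    /* payload bytes following this header */
    };

    struct lu_vcpu_regs {
        uint32_t domid;
        uint32_t vcpu_id;
        struct cpu_user_regs regs;  /* Xen's existing register layout */
    };

The old Xen would emit one such record per vCPU (plus records for the
other pieces of domain state), and the new Xen would rebuild the domain
purely from the stream, with no knowledge of the old Xen's internals.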

>>> The in-guest evtchn data structure will accumulate events just like a
>>> posted interrupt descriptor.  Real interrupts will queue in the LAPIC
>>> during the transition period.
>> Yes, that'll work as long as interrupts remain active from Xen's POV.
>> But if there's concern about a blackout period for HVM/PVH, then
>> surely there would also be such a concern for PV.
> 
> The only fix for that is to reduce the length of the blackout period.
> We can't magically inject interrupts half way through the xen-to-xen
> transition, because we can't run vcpus at that point in time.

Hence David's proposal to "re-inject": we'd have to record interrupts
arriving during the blackout period, and inject them once Dom0 is all set
up again.
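A minimal sketch of that record-and-re-inject idea, assuming vCPUs stay
paused for the whole blackout window; every name here is hypothetical,
including inject_guest_vector():

    #include <stdint.h>

    #define MAX_BLACKOUT_IRQS 1024

    /* Hypothetical injection primitive provided by the new Xen. */
    void inject_guest_vector(uint32_t domid, uint8_t vector);

    struct blackout_irq {
        uint32_t domid;
        uint8_t  vector;
    };

    static struct blackout_irq irq_log[MAX_BLACKOUT_IRQS];
    static unsigned int nr_logged;

    /* Minimal IRQ path while vCPUs are paused: just log the event. */
    void blackout_record(uint32_t domid, uint8_t vector)
    {
        if ( nr_logged < MAX_BLACKOUT_IRQS )
            irq_log[nr_logged++] =
                (struct blackout_irq){ .domid = domid, .vector = vector };
        /* On overflow, fall back on the interrupt remaining pending
         * in the LAPIC, as discussed above. */
    }

    /* Run by the new Xen once Dom0 is set up and runnable again. */
    void blackout_replay(void)
    {
        for ( unsigned int i = 0; i < nr_logged; i++ )
            inject_guest_vector(irq_log[i].domid, irq_log[i].vector);
    }

The log itself would of course have to live in memory that survives the
xen-to-xen transition.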

>>>> Re-using large data structures (or arrays thereof) may also turn out
>>>> useful in terms of latency until the new Xen actually becomes ready to
>>>> resume.
>>> When it comes to optimising the latency, there is a fair amount we might
>>> be able to do ahead of the critical region, but I still think this would
>>> be better done in terms of a "clean start" in the new Xen to reduce
>>> binary dependences.
>> Latency is actually only one aspect (albeit the larger the host, the
>> more relevant it is). Sufficient memory to hold both old and new copies
>> of the data structures, plus the migration stream, is another. This
>> would become especially relevant if even DomU-s were to remain in
>> memory, rather than getting saved/restored.
> 
> But we're still talking about something which is on a multi-MB scale,
> rather than multi-GB scale.

On multi-TB systems, frame_table[] is a multi-GB table. And with boot
times often scaling (roughly) with system size, live updating is (I guess)
all the more interesting on bigger systems.
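For a rough sense of scale, assuming sizeof(struct page_info) is 32 bytes
on x86-64 (which I believe is the current invariant):

    4 TiB RAM / 4 KiB per page   = 2^30 struct page_info entries
    2^30 entries * 32 bytes each = 32 GiB of frame_table[]

so keeping both the old and the new copy alive during the handover is a
real cost on such hosts.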

Jan