
Re: [Xen-devel] [PATCH] xen: Avoid calling device suspend/resume callbacks


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <JBeulich@xxxxxxxx>
  • Date: Tue, 30 Jul 2019 07:55:02 +0000
  • Cc: Juergen Gross <JGross@xxxxxxxx>, Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 30 Jul 2019 08:02:58 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.07.2019 19:06, Andrew Cooper wrote:
> On 29/07/2019 16:57, Jan Beulich wrote:
>> On 29.07.2019 17:41, Ross Lagerwall wrote:
>>> When suspending/resuming or migrating under Xen, there isn't much need
> >>> for suspending and resuming all the attached devices since Xen/QEMU
>>> should correctly maintain the hardware state. Drop these calls and
>>> replace with more specific calls to ensure Xen frontend devices are
>>> properly reconnected.
>> Is this true for the general pass-through case as well? While migration
>> may not be (fully) compatible with pass-through, iirc save/restore is.
> 
> What gives you this impression?
> 
> Migration and save/restore are *literally* the same thing, except that
> in one case you're piping the data to/from disk, and in the other you're
> piping it to the destination and restoring it immediately.
> 
> If you look at the toolstack code, it is all in terms of reading/writing
> an fd (including libxl's API) which is either a network socket or a
> file, as chosen by xl.

Sure. The main difference is where the restore happens (by default):
For live migration I expect this to be on a different host, whereas
for a non-live restore I'd expect this to be the same host. And it
is only the "same host" case where one can assume the same physical
piece of hardware to be available again for passing through to this
guest. In the "different host" case restore _may_ be possible, using
identical hardware. (And yes, in the "same host" case restore may
also be impossible, if the hardware meanwhile has been assigned to
another guest. But as said, I'm talking about the default case here.)
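
To make the "it is all in terms of reading/writing an fd" point concrete, here is a minimal sketch. It is not libxl's API or Xen's actual stream format: the length-prefixed framing and the names `write_stream`, `read_stream`, `roundtrip_via_file` and `roundtrip_via_socket` are hypothetical and exist only for illustration. It shows that the writer and reader do not care whether the fd is a file (save/restore) or a socket (migration); only the caller's choice of fd differs.

```python
import os
import socket
import tempfile


def write_stream(fd, records):
    """Write each record to fd with a 4-byte big-endian length prefix.

    The framing is an assumption made for this sketch, not Xen's format.
    """
    total = 0
    for rec in records:
        view = memoryview(len(rec).to_bytes(4, "big") + rec)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
            total += written
    return total


def _read_exact(fd, count):
    """Read up to count bytes from fd, stopping early only at EOF."""
    chunks = []
    while count > 0:
        chunk = os.read(fd, count)
        if not chunk:
            break
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def read_stream(fd):
    """Read length-prefixed records from fd until EOF."""
    records = []
    while True:
        header = _read_exact(fd, 4)
        if not header:
            return records
        if len(header) < 4:
            raise ValueError("truncated record header")
        length = int.from_bytes(header, "big")
        body = _read_exact(fd, length)
        if len(body) < length:
            raise ValueError("truncated record body")
        records.append(body)


def roundtrip_via_file(records):
    """'Save/restore': the fd refers to a file, read back later."""
    with tempfile.TemporaryFile() as f:
        write_stream(f.fileno(), records)
        os.lseek(f.fileno(), 0, os.SEEK_SET)
        return read_stream(f.fileno())


def roundtrip_via_socket(records):
    """'Migration': the fd refers to a socket, read by the peer."""
    sender, receiver = socket.socketpair()
    try:
        write_stream(sender.fileno(), records)
        sender.close()  # signal EOF to the reader
        return read_stream(receiver.fileno())
    finally:
        receiver.close()
```

Both helpers call the same `write_stream`/`read_stream` pair. What the sketch cannot show is the point above: where the reader runs, and whether the same physical device is available there.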

>> Would qemu restore state of physical PCI devices?
> 
> What state would Qemu be in a position to know about, which isn't
> already present in Qemu's datablob?

That's a valid (rhetorical) question, but it doesn't help answer mine.

> What we do with graphics cards is to merge Xen's logdirty bitmap with a
> dirty list provided by the card itself.  This needs device-specific
> knowledge.  In addition, there is an opaque blob of data produced by the
> source card, which is handed to the destination card.  That also lives
> in the stream.
> 
> Intel's Scalable IOV spec is attempting to rationalise this by having a
> standard way of getting logdirty and "internal state" information out
> of a device, but for the moment, it requires custom device-driver
> specific code to do anything migration related with real hardware.

Which isn't very nice, since it doesn't scale well as a model.
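
For reference, the merge step described in the quoted text could look roughly like the sketch below. Everything in it is an assumption for illustration: the bitmap layout (one bit per page frame, least significant bit first), the device's dirty list being a plain sequence of page frame numbers, and the names `merge_dirty` and `dirty_pfns`. The real formats are device specific, which is exactly the scaling concern.

```python
def merge_dirty(logdirty, device_dirty_pfns):
    """Return a copy of logdirty with the device-reported pages also set.

    logdirty: bytes-like bitmap, bit (pfn % 8) of byte (pfn // 8) per page
              (assumed layout, not necessarily Xen's).
    device_dirty_pfns: iterable of page frame numbers reported by the device
              (assumed format; real hardware would need a driver to decode).
    """
    merged = bytearray(logdirty)
    for pfn in device_dirty_pfns:
        byte_index, bit = divmod(pfn, 8)
        if pfn < 0 or byte_index >= len(merged):
            raise IndexError(f"pfn {pfn} is outside the bitmap")
        merged[byte_index] |= 1 << bit
    return merged


def dirty_pfns(bitmap):
    """List the page frame numbers whose bit is set, in ascending order."""
    return [
        byte_index * 8 + bit
        for byte_index, value in enumerate(bitmap)
        for bit in range(8)
        if (value >> bit) & 1
    ]
```

The merge itself is trivial. The device-specific part is producing `device_dirty_pfns` in the first place, and that is where the custom code ends up.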

> As for why it's safe to do like this, the best argument is that this is
> how all other vendors do migration, including KVM.  Xen is the
> odd-one-out using the full S3 path.

So how do "all other vendors" deal with device-specific state? So
far I was under the impression that dealing with this is precisely
why we use the S3 logic in the kernel.

Jan

 

