
Re: [PATCH v3 4/8] x86/mem-sharing: copy GADDR based shared guest areas


  • To: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 4 May 2023 16:25:20 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Thu, 04 May 2023 14:25:39 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.05.2023 14:50, Tamas K Lengyel wrote:
> On Thu, May 4, 2023 at 3:44 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>> On 03.05.2023 19:14, Tamas K Lengyel wrote:
>>>> @@ -1974,7 +2046,10 @@ int mem_sharing_fork_reset(struct domain
>>>>
>>>>   state:
>>>>      if ( reset_state )
>>>> +    {
>>>>          rc = copy_settings(d, pd);
>>>> +        /* TBD: What to do here with -ERESTART? */
>>>
>>> Ideally we could avoid hitting code-paths that are restartable during
>>> fork reset since it gets called from vm_event replies that have no
>>> concept of handling errors. If we start having errors like this we
>>> would just have to drop the vm_event reply optimization and issue a
>>> standalone fork reset hypercall every time, which isn't a big deal,
>>> it's just slower.
>>
>> I'm afraid I don't follow: We are in the process of fork-reset here. How
>> would issuing "a standalone fork reset hypercall every time" make this
>> any different? The possible need for a continuation here comes from a
>> failed spin_trylock() in map_guest_area(). That won't change the next
>> time round.
> 
> Why not? Who is holding the lock and why wouldn't it ever relinquish it?

What state is the fork in at that point in time? We're talking about the
fork's hypercall deadlock mutex here, after all. Hence if we knew the
fork is paused (just like the parent is), then I don't think -ERESTART
can be coming back. (To be precise, both paths leading here are of
interest, yet the state the fork is in may be different in both cases.)

> If
> that's really true then there is a larger issue than just not being able
> to report the error back to the user on the vm_event_resume path, and we
> need to devise a way of being able to copy this from the parent while
> bypassing this lock.

The issue isn't that the lock will never become available. But we can't
predict how many attempts it'll take.

But my earlier question went in a different direction anyway: You did
suggest replacing a fork-reset with "a standalone fork reset hypercall
every time". I somehow don't get the difference (but it looks like some
of your further reply below addresses that).

> The parent is paused and its state should not be changing while forks
> are active, so if the lock was turned into an rwlock of some sort,
> acquiring the read-lock would perhaps be a possible way out of this.
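
Purely to make the quoted idea concrete, it would amount to something
like the sketch below, where both the rwlock and its name are
hypothetical (nothing of that sort exists in the tree) and the copying
itself is elided:

    static int copy_settings_rwlock_sketch(struct domain *cd /* fork */,
                                           struct domain *d  /* parent */)
    {
        /* Take the hypothetical per-domain rwlock for reading only,
         * since the parent's state is merely inspected, not changed. */
        read_lock(&d->fork_settings_lock);

        /* ... copy the parent's registered areas into the fork ... */

        read_unlock(&d->fork_settings_lock);

        return 0;
    }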

Given the specific lock we're talking about here, an rwlock is out of
the question, I think.

>> But perhaps I should say that till now I didn't even pay much attention
>> to the 2nd use of the function by vm_event_resume(); I was mainly
>> focused on the one from XENMEM_sharing_op_fork_reset, where no
>> continuation handling exists. Yet perhaps your comment is mainly
>> related to that use?
>>
>> I actually notice that the comment ahead of the function already has a
>> continuation-related TODO, just that there the thought is only of a
>> larger memory footprint.
> 
> With XENMEM_sharing_op_fork_reset the caller actually receives the error
> code and can decide what to do next. With vm_event_resume there is no path
> currently to notify the agent of an error. We could generate another
> vm_event to send such an error, but the expectation with fork_reset is that
> it will always work because the parent is paused, so not having that path
> for an error to get back to the agent isn't a big deal.
> 
> Now, if it becomes the case that due to this locking we can get an error
> even while the parent is paused, that will render the vm_event_resume path
> unreliable, so we would just switch to using XENMEM_sharing_op_fork_reset
> so that at least it can re-try in case of an issue. Of course, only if a
> reissue of the hypercall has any reasonable chance of succeeding.

(I think this is the explanation for the "standalone reset hypercall".)
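
For completeness, if the XENMEM_sharing_op_fork_reset path were to handle
-ERESTART itself, the usual shape would be a continuation at the memop
call site, along the lines of the sketch below. This is not what the
series currently does, and the re-marshalling of the op arguments that a
real continuation would need is glossed over here; reset_state /
reset_memory stand for the decoded XENMEM_FORK_RESET_* flags.

    case XENMEM_sharing_op_fork_reset:
        rc = mem_sharing_fork_reset(d, reset_state, reset_memory);
        if ( rc == -ERESTART )
            /* Re-issue the memop once the lock contention has gone away. */
            rc = hypercall_create_continuation(__HYPERVISOR_memory_op,
                                               "lh", XENMEM_sharing_op,
                                               arg);
        break;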

Jan



 

