
Re: [PATCH v2 4/8] x86/mem-sharing: copy GADDR based shared guest areas


  • To: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 26 Jan 2023 09:13:39 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Thu, 26 Jan 2023 08:14:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 25.01.2023 16:34, Tamas K Lengyel wrote:
> On Tue, Jan 24, 2023 at 6:19 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>
>> On 23.01.2023 19:32, Tamas K Lengyel wrote:
>>> On Mon, Jan 23, 2023 at 11:24 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>> On 23.01.2023 17:09, Tamas K Lengyel wrote:
>>>>> On Mon, Jan 23, 2023 at 9:55 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>>>> --- a/xen/arch/x86/mm/mem_sharing.c
>>>>>> +++ b/xen/arch/x86/mm/mem_sharing.c
>>>>>> @@ -1653,6 +1653,65 @@ static void copy_vcpu_nonreg_state(struc
>>>>>>      hvm_set_nonreg_state(cd_vcpu, &nrs);
>>>>>>  }
>>>>>>
>>>>>> +static int copy_guest_area(struct guest_area *cd_area,
>>>>>> +                           const struct guest_area *d_area,
>>>>>> +                           struct vcpu *cd_vcpu,
>>>>>> +                           const struct domain *d)
>>>>>> +{
>>>>>> +    mfn_t d_mfn, cd_mfn;
>>>>>> +
>>>>>> +    if ( !d_area->pg )
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    d_mfn = page_to_mfn(d_area->pg);
>>>>>> +
>>>>>> +    /* Allocate & map a page for the area if it hasn't been already. */
>>>>>> +    if ( !cd_area->pg )
>>>>>> +    {
>>>>>> +        gfn_t gfn = mfn_to_gfn(d, d_mfn);
>>>>>> +        struct p2m_domain *p2m = p2m_get_hostp2m(cd_vcpu->domain);
>>>>>> +        p2m_type_t p2mt;
>>>>>> +        p2m_access_t p2ma;
>>>>>> +        unsigned int offset;
>>>>>> +        int ret;
>>>>>> +
>>>>>> +        cd_mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
>>>>>> +        if ( mfn_eq(cd_mfn, INVALID_MFN) )
>>>>>> +        {
>>>>>> +            struct page_info *pg = alloc_domheap_page(cd_vcpu->domain, 0);
>>>>>> +
>>>>>> +            if ( !pg )
>>>>>> +                return -ENOMEM;
>>>>>> +
>>>>>> +            cd_mfn = page_to_mfn(pg);
>>>>>> +            set_gpfn_from_mfn(mfn_x(cd_mfn), gfn_x(gfn));
>>>>>> +
>>>>>> +            ret = p2m->set_entry(p2m, gfn, cd_mfn, PAGE_ORDER_4K, p2m_ram_rw,
>>>>>> +            if ( ret )
>>>>>> +                return ret;
>>>>>> +        }
>>>>>> +        else if ( p2mt != p2m_ram_rw )
>>>>>> +            return -EBUSY;
>>>>>> +
>>>>>> +        /*
>>>>>> +         * Simply specify the entire range up to the end of the page. All the
>>>>>> +         * function uses it for is a check for not crossing page boundaries.
>>>>>> +         */
>>>>>> +        offset = PAGE_OFFSET(d_area->map);
>>>>>> +        ret = map_guest_area(cd_vcpu, gfn_to_gaddr(gfn) + offset,
>>>>>> +                             PAGE_SIZE - offset, cd_area, NULL);
>>>>>> +        if ( ret )
>>>>>> +            return ret;
>>>>>> +    }
>>>>>> +    else
>>>>>> +        cd_mfn = page_to_mfn(cd_area->pg);
>>>>>
>>>>> Everything to this point seems to be non mem-sharing/forking related.
>>> Could
>>>>> these live somewhere else? There must be some other place where
>>> allocating
>>>>> these areas happens already for non-fork VMs so it would make sense to
>>> just
>>>>> refactor that code to be callable from here.
>>>>
>>>> It is the "copy" aspect with makes this mem-sharing (or really fork)
>>>> specific. Plus in the end this is no different from what you have
>>>> there right now for copying the vCPU info area. In the final patch
>>>> that other code gets removed by re-using the code here.
>>>
>>> Yes, the copy part is fork-specific. Arguably, if there were a way to do
>>> the allocation of the page for vcpu_info elsewhere, I would prefer that;
>>> but as long as the only requirement is allocate-page and copy-from-parent,
>>> I'm OK with that logic being in here because it's really straightforward.
>>> But now you also do extra sanity checks here which are harder to
>>> comprehend in this context alone.
>>
>> What sanity checks are you talking about (also below, where you claim
>> map_guest_area() would be used only to sanity check)?
> 
> Did I misread your comment above "All the function uses it for is a check
> for not crossing page boundaries"? That sounds to me like a simple sanity
> check, though it's unclear why it matters and why it's only needed for forks.

The comment is about the function's use of the range it is being passed.
It doesn't say in any way that the function is doing only sanity checking.
If the comment wording is ambiguous or unclear, I'm happy to take
improvement suggestions.
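
For illustration, the check the comment refers to is roughly of the
following shape (a simplified sketch only, not the exact code in
map_guest_area(); the error code is merely an example):

    /* Reject an area that would cross a page boundary. */
    unsigned int offset = PAGE_OFFSET(gaddr);

    if ( !size || offset + size > PAGE_SIZE )
        return -ENXIO;

Passing "PAGE_SIZE - offset" as the size therefore trivially satisfies
that check, which is all the comment is meant to convey.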

>>> What if extra sanity checks are needed in the future? Or what if the
>>> future sanity checks diverge from where this happens for normal VMs,
>>> because someone overlooks that this also needs to be synched here?
>>>
>>>> I also haven't been able to spot anything that could be factored
>>>> out (and one might expect that if there were something, the vCPU
>>>> info area copying would already have used it). map_guest_area()
>>>> is the only piece that is also used for other purposes.
>>>
>>> Well, there must be a location where all this happens for normal VMs as
>>> well, no?
>>
>> That's map_guest_area(). What is needed here but not elsewhere is the
>> populating of the GFN underlying the to-be-mapped area. That's the code
>> being added here, mirroring what you need to do for the vCPU info page.
>> Similar code isn't needed elsewhere because the guest invoked operation
>> is purely a "map" - the underlying pages are already expected to be
>> populated (which of course we check, or else we wouldn't know what page
>> to actually map).
> 
> Populated by what and when?

Population happens either at domain creation or when the guest is moving
pages around (e.g. during ballooning). What is happening here (also in
the pre-existing code to deal with the vCPU info page) is the minimal
amount of "populate at creation" to meet the prereq for the mapping
operation(s).
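
To put the difference in rough pseudo-code (a sketch only; gfn_is_populated()
and populate_gfn_with_fresh_page() are made-up names for illustration,
populate_cb stands in for whatever callback a caller passes, and error
handling is mostly elided):

    /* Guest-invoked registration: the GFN is expected to be populated
     * already, so the operation is purely a "map". */
    rc = map_guest_area(v, gaddr, size, area, populate_cb);

    /* Fork path: first make sure the child's p2m holds a RAM page at the
     * GFN, then map, just like the hunk above does inline. */
    if ( !gfn_is_populated(cd, gfn) )               /* hypothetical helper */
        rc = populate_gfn_with_fresh_page(cd, gfn); /* hypothetical helper */
    if ( !rc )
        rc = map_guest_area(cd_vcpu, gfn_to_gaddr(gfn) + offset,
                            PAGE_SIZE - offset, cd_area, NULL);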

>>> Why not factor that code so that it can be called from here, so
>>> that we don't have to track sanity check requirements in two different
>>> locations? Or is that sanity-checking bit not required for normal VMs?
>>> If so, why?
>>
>> As per above, I'm afraid that I'm lost with these questions. I simply
>> don't know what you're talking about.
> 
> You are adding code here that allocates memory and copies the content of
> similarly allocated memory from the parent. You perform extra sanity checks
> for unknown reasons that seem to be needed only here. It is unclear why you
> are doing that, and why the same code paths that allocate this memory for
> the parent can't simply be reused, so that the only thing left to do here
> would be copying the content from the parent.

No, I'm not "adding code" in the sense that I read your comments so far.
Such code was already there (and, as pointed out somewhere, in slightly
broken form). Yes, I'm introducing a 2nd instance of this, but just to
then (in the last patch) remove the original (slightly broken) instance.
So across the entire series this is merely code movement (with
adjustments).

Jan



 

