
Re: [PATCH v2 4/8] x86/mem-sharing: copy GADDR based shared guest areas


  • To: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 23 Jan 2023 17:24:29 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 23 Jan 2023 16:24:40 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.01.2023 17:09, Tamas K Lengyel wrote:
> On Mon, Jan 23, 2023 at 9:55 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>> --- a/xen/arch/x86/mm/mem_sharing.c
>> +++ b/xen/arch/x86/mm/mem_sharing.c
>> @@ -1653,6 +1653,65 @@ static void copy_vcpu_nonreg_state(struc
>>      hvm_set_nonreg_state(cd_vcpu, &nrs);
>>  }
>>
>> +static int copy_guest_area(struct guest_area *cd_area,
>> +                           const struct guest_area *d_area,
>> +                           struct vcpu *cd_vcpu,
>> +                           const struct domain *d)
>> +{
>> +    mfn_t d_mfn, cd_mfn;
>> +
>> +    if ( !d_area->pg )
>> +        return 0;
>> +
>> +    d_mfn = page_to_mfn(d_area->pg);
>> +
>> +    /* Allocate & map a page for the area if it hasn't been already. */
>> +    if ( !cd_area->pg )
>> +    {
>> +        gfn_t gfn = mfn_to_gfn(d, d_mfn);
>> +        struct p2m_domain *p2m = p2m_get_hostp2m(cd_vcpu->domain);
>> +        p2m_type_t p2mt;
>> +        p2m_access_t p2ma;
>> +        unsigned int offset;
>> +        int ret;
>> +
>> +        cd_mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, NULL);
>> +        if ( mfn_eq(cd_mfn, INVALID_MFN) )
>> +        {
>> +            struct page_info *pg = alloc_domheap_page(cd_vcpu->domain, 0);
>> +
>> +            if ( !pg )
>> +                return -ENOMEM;
>> +
>> +            cd_mfn = page_to_mfn(pg);
>> +            set_gpfn_from_mfn(mfn_x(cd_mfn), gfn_x(gfn));
>> +
>> +            ret = p2m->set_entry(p2m, gfn, cd_mfn, PAGE_ORDER_4K, p2m_ram_rw,
>> +                                 p2m->default_access, -1);
>> +            if ( ret )
>> +                return ret;
>> +        }
>> +        else if ( p2mt != p2m_ram_rw )
>> +            return -EBUSY;
>> +
>> +        /*
>> +         * Simply specify the entire range up to the end of the page. All the
>> +         * function uses it for is a check for not crossing page boundaries.
>> +         */
>> +        offset = PAGE_OFFSET(d_area->map);
>> +        ret = map_guest_area(cd_vcpu, gfn_to_gaddr(gfn) + offset,
>> +                             PAGE_SIZE - offset, cd_area, NULL);
>> +        if ( ret )
>> +            return ret;
>> +    }
>> +    else
>> +        cd_mfn = page_to_mfn(cd_area->pg);
> 
> Everything up to this point seems to be unrelated to mem-sharing/forking.
> Could these live somewhere else? There must be some other place where these
> areas are already allocated for non-fork VMs, so it would make sense to just
> refactor that code to be callable from here.

It is the "copy" aspect which makes this mem-sharing (or really fork)
specific. Plus in the end this is no different from what you have
there right now for copying the vCPU info area. In the final patch
that other code gets removed by re-using the code here.

I also haven't been able to spot anything that could be factored
out (and one might expect that if there was something, then the vCPU
info area copying should also already have used it). map_guest_area()
is all that is used for other purposes as well.

>> +
>> +    copy_domain_page(cd_mfn, d_mfn);
>> +
>> +    return 0;
>> +}
>> +
>>  static int copy_vpmu(struct vcpu *d_vcpu, struct vcpu *cd_vcpu)
>>  {
>>      struct vpmu_struct *d_vpmu = vcpu_vpmu(d_vcpu);
>> @@ -1745,6 +1804,16 @@ static int copy_vcpu_settings(struct dom
>>              copy_domain_page(new_vcpu_info_mfn, vcpu_info_mfn);
>>          }
>>
>> +        /* Same for the (physically registered) runstate and time info areas. */
>> +        ret = copy_guest_area(&cd_vcpu->runstate_guest_area,
>> +                              &d_vcpu->runstate_guest_area, cd_vcpu, d);
>> +        if ( ret )
>> +            return ret;
>> +        ret = copy_guest_area(&cd_vcpu->arch.time_guest_area,
>> +                              &d_vcpu->arch.time_guest_area, cd_vcpu, d);
>> +        if ( ret )
>> +            return ret;
>> +
>>          ret = copy_vpmu(d_vcpu, cd_vcpu);
>>          if ( ret )
>>              return ret;
>> @@ -1987,7 +2056,10 @@ int mem_sharing_fork_reset(struct domain
>>
>>   state:
>>      if ( reset_state )
>> +    {
>>          rc = copy_settings(d, pd);
>> +        /* TBD: What to do here with -ERESTART? */
> 
> Where is -ERESTART coming from?

From map_guest_area()'s attempt to acquire the hypercall deadlock mutex,
in order to then pause the subject vCPU. I suppose that in the forking
case it may already be paused, but then there's no way map_guest_area()
could know. Looking at the pause count is fragile, as there's no
guarantee that the vCPU won't be unpaused while we're still doing work
on it. Hence I view such checks as only suitable for assertions.

Jan
