[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC 07/10] domain: map/unmap GADDR based shared guest areas


  • To: Julien Grall <julien@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 28 Nov 2022 10:01:48 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XfzsaMtroem9sNXWRAgAQe2QVEJEa1tD8rIN8Lj/yw0=; b=Uij3GX8Ug2A3jCvO3uCgUZoPpjkRjyIo/kIP1IQMjq41CUlRtgH07kY2fjR7MyxwpnT2YKcRO2fOziTeWX6JlD7xIhnqUX8yZzXblMd5KMdAzOVTTUcy/NhBr7Vp+gUCpjFDF7A7HAhTcYW+5Up/LyKa7sBk0q/pzF/arXFVoHrrC8lxUMg2irmdbQmcFItb7cPaSxM0v8vTfvpIRXp+wnmK5xHNu7chxZgAkKoP0D0QBEQ1bSB+Aaq3sNbFYJ93G/vk7SbKilLVZ8bAt6iv/l3LtE6KK+Px8PbwlA/mjn41kEIEdQXuaKdPa+wVVOEChAEWbVwgTpg6Z7APuGGLGA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=caIBJf+thrHVzjZIP2HXAYsyX8fMDXvbPP7EOgfU0qHUhVBGpENu0RPoUZSTupURSzPaZ6pEMkbbkTQJxk0zy7GM/VZYbUHcCpET5XyrYli73OGCrUSsZ/rJ7EIdwKOdoqUePzbmhpyN0PTic6ktq7gNkKupv4K+0gtpACdSdPJCPaocIwUXADNj0UFl5xKHcGRWD8BHWu6ur2s61fLwxKkqhB6s526GTFoHVf00ld6WEbQHOCSXFZx5FjMQ+TOtHJwgowRyOpshUuncJuvAGiYSS7vQTQ//YyC3+1SD36qtxODTTfQwLMu/vjNJwAcA3rs7m10uKw/uA2qvJC2C6Q==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 28 Nov 2022 09:02:24 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 24.11.2022 22:29, Julien Grall wrote:
> On 19/10/2022 09:43, Jan Beulich wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -1563,7 +1563,82 @@ int map_guest_area(struct vcpu *v, paddr
>>                      struct guest_area *area,
>>                      void (*populate)(void *dst, struct vcpu *v))
>>   {
>> -    return -EOPNOTSUPP;
>> +    struct domain *currd = v->domain;
>> +    void *map = NULL;
>> +    struct page_info *pg = NULL;
>> +    int rc = 0;
>> +
>> +    if ( gaddr )
> 
> 0 is technically a valid (guest) physical address on Arm.

I guess it is everywhere; it certainly also is on x86. While perhaps a
little unfortunate in ordering, the public header changes coming only
in the following patches was the best way I could think of to split
this work into reasonable size pieces. There the special meaning of 0
is clearly documented. And I don't really see it as a meaningful
limitation to not allow guests to register such areas at address zero.

>> +    {
>> +        unsigned long gfn = PFN_DOWN(gaddr);
> 
> This could be gfn_t for adding some type safety.

Indeed I did consider doing so, but the resulting code would imo be
less legible. But this difference perhaps isn't significant enough
for me to object to changing, in case you (or others) think the
type safety is really a meaningful gain here.

>> +        unsigned int align;
>> +        p2m_type_t p2mt;
>> +
>> +        if ( gfn != PFN_DOWN(gaddr + size - 1) )
>> +            return -ENXIO;
>> +
>> +#ifdef CONFIG_COMPAT
>> +        if ( has_32bit_shinfo(currd) )
>> +            align = alignof(compat_ulong_t);
>> +        else
>> +#endif
>> +            align = alignof(xen_ulong_t);
>> +        if ( gaddr & (align - 1) )
>> +            return -ENXIO;
>> +
>> +        rc = check_get_page_from_gfn(currd, _gfn(gfn), false, &p2mt, &pg);
>> +        if ( rc )
>> +            return rc;
>> +
>> +        if ( !get_page_type(pg, PGT_writable_page) )
>> +        {
>> +            put_page(pg);
>> +            return -EACCES;
>> +        }
>> +
>> +        map = __map_domain_page_global(pg);
>> +        if ( !map )
>> +        {
>> +            put_page_and_type(pg);
>> +            return -ENOMEM;
>> +        }
>> +        map += PAGE_OFFSET(gaddr);
>> +    }
>> +
>> +    if ( v != current )
>> +    {
>> +        if ( !spin_trylock(&currd->hypercall_deadlock_mutex) )
>> +        {
>> +            rc = -ERESTART;
>> +            goto unmap;
>> +        }
>> +
>> +        vcpu_pause(v);
> 
> AFAIU, the goal of vcpu_pause() here is to guarantee that the "area" 
> will not be touched by another pCPU.

Hmm, I find the way you put it a little confusing, but yes. I'd express
it as "The purpose of the vcpu_pause() is to guarantee that the vCPU in
question won't use its own area while the location thereof is being
updated." This includes updates by Xen as well as guest side consumption
of the data (with focus on the former, yes).

> However, looking at the function context_switch() we have:
> 
> sched_context_switched(prev, next);
> _update_runstate_area();

With this really being

    _update_runstate_area(next);

...

> The first function will set v->is_running to false (see 
> vcpu_context_saved()). So I think the "area" could be touched even afte 
> vcpu_pause() is returned.
> 
> Therefore, I think we will need _update_runstate_area() (or 
> update_runstate_area()) to be called first.

... I don't see a need for adjustment. The corresponding

    _update_runstate_area(prev);

sits quite a bit earlier in context_switch(). (Arm code is quite a bit
different, but this particular aspect looks to apply there as well.)

>> @@ -1573,6 +1648,22 @@ int map_guest_area(struct vcpu *v, paddr
>>    */
>>   void unmap_guest_area(struct vcpu *v, struct guest_area *area)
>>   {
>> +    struct domain *d = v->domain;
>> +    void *map;
>> +    struct page_info *pg;
> 
> AFAIU, the assumption is the vCPU should be paused here.

Yes, as the comment ahead of the function (introduced by an earlier
patch) says.

> Should we add an ASSERT()?

I was going from unmap_vcpu_info(), which had the same requirement,
while also only recording it by way of a comment. I certainly could
add an ASSERT(), but besides this being questionable as to the rules
set forth in ./CODING_STYLE I also view assertions of "paused" state
as being of limited use - the entity in question may become unpaused
on the clock cycle after the check was done. (But yes, such are no
different from e.g. the fair number of spin_is_locked() checks we
have scattered around, which don't really provide guarantees either.)

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.