
Re: [PATCH RFC 07/10] domain: map/unmap GADDR based shared guest areas


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 18 Jan 2023 10:55:20 +0100
  • Cc: George Dunlap <George.Dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 18 Jan 2023 09:55:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17.01.2023 23:04, Andrew Cooper wrote:
> On 19/10/2022 8:43 am, Jan Beulich wrote:
>> The registration by virtual/linear address has downsides: At least on
>> x86, the access is expensive for HVM/PVH domains. Furthermore, for
>> 64-bit PV domains the areas are inaccessible (and hence cannot be
>> updated by Xen) when in guest-user mode.
> 
> They're also inaccessible in HVM guests (x86 and ARM) when Meltdown
> mitigations are in place.

I've added this explicitly, but ...

> And let's not get started on the multitude of layering violations that is
> guest_memory_policy() for nested virt.  In fact, prohibiting any form of
> map-by-va is a prerequisite to any rational attempt to make nested virt work.
> 
> (In fact, that infrastructure needs purging before any other
> architecture picks up stubs too.)
> 
> They're also inaccessible in general because no architecture has
> hypervisor privilege in a normal user/supervisor split, you don't
> know whether the area is covered by a supervisor or a user mapping, and
> settings like SMAP/PAN can cause the pagewalk to fail even when the
> mapping is in place.

... I'm now merely saying that there are yet more reasons, rather than
trying to enumerate them all.

>> In preparation for the introduction of new vCPU operations allowing
>> registration of the respective areas (one of the two is x86-specific)
>> by guest-physical address, flesh out the map/unmap functions.
>>
>> Noteworthy differences from map_vcpu_info():
>> - areas can be registered more than once (and de-registered),
> 
> When registration by GFN is available, there is never a good reason to
> register the same area twice.

Why not? Why shouldn't different entities be permitted to register their
areas, one after the other? This at the very least requires a way to
de-register.

> The guest maps one MMIO-like region, and then constructs whatever regular
> virtual address mappings of it (if any) it wants.
> 
> This API is new, so we can enforce sane behaviour from the outset.  In
> particular, it will help with ...
> 
>> - remote vCPU-s are paused rather than checked for being down (which in
>>   principle can change right after the check),
>> - the domain lock is taken for a much smaller region.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> RFC: By using global domain page mappings the demand on the underlying
>>      VA range may increase significantly. I did consider using per-
>>      domain mappings instead, but they exist for x86 only. Of course we
>>      could have arch_{,un}map_guest_area() aliasing the global domain
>>      page mapping functions on Arm and using per-domain mappings on x86.
>>      Yet then again map_vcpu_info() doesn't do so either (albeit that's
>>      likely to be converted subsequently to use map_vcpu_area() anyway).
> 
> ... this by providing a bound on the amount of vmap() space that can be consumed.

I'm afraid I don't understand. When re-registering a different area, the
earlier one will be unmapped. The consumption of vmap space cannot grow
(or else we'd have a resource leak and hence an XSA).
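
To illustrate the invariant (a self-contained model only; unmap_page() is
a hypothetical stand-in for the real unmapping primitive, e.g.
unmap_domain_page_global()):

    /* Model only, not the actual code: each area holds at most one
     * mapping, and installing a replacement releases the old one, so
     * per-area vmap consumption is bounded by a single mapping. */
    struct guest_area {
        void *map;   /* current global mapping, or NULL */
    };

    void unmap_page(void *map);   /* hypothetical unmap helper */

    static void area_install(struct guest_area *area, void *new_map)
    {
        void *old = area->map;

        area->map = new_map;
        if ( old )
            unmap_page(old);   /* old mapping gone: no accumulation */
    }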

>> RFC: In map_guest_area() I'm not checking the P2M type, instead - just
>>      like map_vcpu_info() - solely relying on the type ref acquisition.
>>      Checking for p2m_ram_rw alone would be wrong, as at least
>>      p2m_ram_logdirty ought to also be okay to use here (and in similar
>>      cases, e.g. in Argo's find_ring_mfn()). p2m_is_pageable() could be
>>      used here (like altp2m_vcpu_enable_ve() does) as well as in
>>      map_vcpu_info(), yet then again the P2M type is stale by the time
>>      it is being looked at anyway without the P2M lock held.
> 
> Again, another error caused by Xen not knowing the guest physical
> address layout.  These mappings should be restricted to just RAM regions
> and I think we want to enforce that right from the outset.

Meaning what exactly in terms of action for me to take? As said, checking
the P2M type is pointless. So without you being more explicit, all I can
take your reply as is merely a comment, with no action on my part (not
even the removal of this RFC remark).
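
For reference, the type-ref acquisition relied upon here follows the
map_vcpu_info() pattern, roughly along these lines (abbreviated sketch;
exact details differ):

    struct page_info *pg = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);

    if ( !pg || !get_page_type(pg, PGT_writable_page) )
    {
        if ( pg )
            put_page(pg);
        return -EINVAL;
    }

    /*
     * The PGT_writable_page type reference, not the (possibly stale)
     * P2M type, is what guarantees the page remains ordinary writable
     * RAM for as long as it is held; it is dropped again via
     * put_page_and_type() on de-registration.
     */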

Jan
