[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 02/15] IOMMU/x86: perform PV Dom0 mappings in batches


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 31 May 2022 18:01:24 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=RYpT6ALAe8NCcFBFfjwCnKc+VRgSWanZdXJjpsA+WWc=; b=L6rWwzyvH0aEi8pLU0armT9dYGeyREbG2hYsBmFd46/UKeIIMECIsWdFtivCKxj8Nt8aK5OEFjrBpZZ/6dpYw0PVdKHFa3fVBVz4Bg/vjoLAV8Bbh8kSKTufTUp4WT97pZOSKpgcDkyMenICs+s9p1Z4P0cYlU4CZ+0Wrqe1Dg9ucTpMaBJUcaw8DfjXCaTbF0CjRK/lMBncA3+3Y/lPkEm0E1FP9TfJeHjOIJtlswsQuKEe39zcw6nLgsPolHtBTyNEqtnLChva2QgsaDmS4YZvhhuOH7rCY2wLXLKXFhlnWqja8pwgpC/olDtfRD83Ddgr8iTXP7pR3QAA5JN+9A==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=DQUVqL0XSkyVTkhpExje4biuCULNzYakUwo/2YynJS0oYZxjlpduZw97XDyHlBAO3kcpjtE3jLEFRRUlCbSGz91sTTgzSaB3MOyLmBORnXo0AifHrYWm/Y+mjxmLAzXLH+e87d7lEygEW6jnO9dO0XRIA5lPrL+JvmRxbKm2RRwH3bpttDKGbwkeWBm19CEOUXztcSbYuQaagg5qtD5b2BDlU9D+sKRYXgWZT18u04dT4Uv3LTzG+Q3IM9vDkNn2LIL5yz1ilixJAKReJrlhCEz7JI2geGM4LonhgaddTftqkuAHcu2QcxSVWkykqQzENL35/ftCU8w2r3+VNdKWdA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 31 May 2022 16:01:41 +0000
  • Ironport-data: A9a23:eh78yqyWpPSeiXA5mnl6t+dNxyrEfRIJ4+MujC+fZmUNrF6WrkUCz WUfW2CFbqrYYGLzLdF2PI2y9xsDvMDRzd82SwJtqyAxQypGp/SeCIXCJC8cHc8zwu4v7q5Dx 59DAjUVBJlsFhcwnj/0bv656yMUOZigHtIQMsadUsxKbVIiGX5JZS5LwbZj2NY12oHhWWthh PupyyHhEA79s9JLGjp8B5Kr8HuDa9yr5Vv0FnRnDRx6lAe2e0s9VfrzFonoR5fMeaFGH/bSe gr25OrRElU1XfsaIojNfr7TKiXmS1NJVOSEoiI+t6OK2nCuqsGuu0qS2TV1hUp/0l20c95NJ Nplk6eWEx0Vb6P3s/k4FBJRAQA9F50W5+qSSZS/mZT7I0zuVVLJmqwrIGRoeIoS96BwHH1E8 uEeJHYVdBefiumqwbW9DO5xmsAkK8qtN4Qa0p1i5WiBUbB6HtaeHOOTu44wMDQY36iiGd7EY MUUc3x3ZQnoaBxTIFYHTpk5mY9Eg1GgKWAE9AzK9cLb5UDy4itI+r+uOeDyReOgfu9HxxqHm UX/qjGR7hYycYb3JSC+2nCmi/LLnCj7cJkPD7D+/flv6HWDy2pWBBAIWF+TpfiillX4S99ZM 1YT+Cclse417kPDZsH0QhmQsHOC+BkGVLJ4DOkS+AyLjK3O7G6k6nMsSzdAbJkqsZEwTDlzj 1uRxYq3VHporaGfTm+b+vGMtzSuNCMJLGgEIygZUQ8C5Nqlq4Y25v7Scute/GeOpoWdMVnNL /qi9UDSW517YRY36piG
  • Ironport-hdrordr: A9a23:pLd4rKF3D7bxnuWlpLqFepHXdLJyesId70hD6qkvc3Fom52j/f xGws5x6faVslkssb8b6LW90Y27MAvhHPlOkPIs1NaZLXDbUQ6TQL2KgrGD/9SNIVycygcZ79 YbT0EcMqyOMbEZt7ec3ODQKb9Jrri6GeKT9IHjJh9WPH1XgspbnmNE42igYy9LrF4sP+tFKH PQ3LsPmxOQPVAsKuirDHgMWObO4/XNiZLdeBYDQzoq8hOHgz+E4KPzV0Hw5GZUbxp/hZMZtU TVmQ3w4auu99m91x/nzmfWq7BbgsHoxNdvDNGFzuIVNjLvoAC1Y5kJYczLgBkF5MWUrHo6mt jFpBkte+x19nPqZ2mw5SDg3gHxuQxen0PK+Bu9uz/OsMb5TDU1B45qnoRCaCbU7EImoZVVzL 9L93jxjesZMTrw2ADGo/TYXRBjkUS55VA4l/QIsnBZWYwCLJdMsI0k+l9PGptoJlO31GkeKp guMCjg3ocXTbvDBEqp/VWHgebcE0jbJy32DHTr4aeuonprdHMQ9Tps+CVQpAZEyHsHceg02w 31CNUXqFhwdL5nUUtcPpZ3fSLlMB26ffrzWFjiUmjPJeUgB0/njaLRzfEc2NyKEaZ4vqfa3q 6xGm9liQ==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, May 27, 2022 at 01:12:48PM +0200, Jan Beulich wrote:
> For large page mappings to be easily usable (i.e. in particular without
> un-shattering of smaller page mappings) and for mapping operations to
> then also be more efficient, pass batches of Dom0 memory to iommu_map().
> In dom0_construct_pv() and its helpers (covering strict mode) this
> additionally requires establishing the type of those pages (albeit with
> zero type references).
> 
> The earlier establishing of PGT_writable_page | PGT_validated requires
> the existing places where this gets done (through get_page_and_type())
> to be updated: For pages which actually have a mapping, the type
> refcount needs to be 1.
> 
> There is actually a related bug that gets fixed here as a side effect:
> Typically the last L1 table would get marked as such only after
> get_page_and_type(..., PGT_writable_page). While this is fine as far as
> refcounting goes, the page did remain mapped in the IOMMU in this case
> (when "iommu=dom0-strict").
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>

Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

> ---
> Subsequently p2m_add_identity_entry() may want to also gain an order
> parameter, for arch_iommu_hwdom_init() to use. While this only affects
> non-RAM regions, systems typically have 2-16Mb of reserved space
> immediately below 4Gb, which hence could be mapped more efficiently.
> 
> Eventually we may want to overhaul this logic to use a rangeset based
> approach instead, punching holes into originally uniformly large-page-
> mapped regions. Doing so right here would first and foremost be yet more
> of a change.
> 
> The installing of zero-ref writable types has in fact shown (observed
> while putting together the change) that despite the intention by the
> XSA-288 changes (affecting DomU-s only) for Dom0 a number of
> sufficiently ordinary pages (at the very least initrd and P2M ones as
> well as pages that are part of the initial allocation but not part of
> the initial mapping) still have been starting out as PGT_none, meaning
> that they would have gained IOMMU mappings only the first time these
> pages would get mapped writably. Consequently an open question is
> whether iommu_memory_setup() should set the pages to PGT_writable_page
> independent of need_iommu_pt_sync().

Hm, I see, non strict PV dom0s won't get the pages set to
PGT_writable_page even when accessible by devices by virtue of such
domain having all RAM mapped in the IOMMU page-tables.

I guess it does make sense to also have the pages set as
PGT_writable_page by default in that case, as tthe pages _are_
writable by the IOMMU.  Do pages added during runtime (ie: ballooned
in) also get PGT_writable_page set?

> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -363,8 +363,8 @@ static unsigned int __hwdom_init hwdom_i
>  
>  void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>  {
> -    unsigned long i, top, max_pfn;
> -    unsigned int flush_flags = 0;
> +    unsigned long i, top, max_pfn, start, count;
> +    unsigned int flush_flags = 0, start_perms = 0;
>  
>      BUG_ON(!is_hardware_domain(d));
>  
> @@ -395,9 +395,9 @@ void __hwdom_init arch_iommu_hwdom_init(
>       * First Mb will get mapped in one go by pvh_populate_p2m(). Avoid
>       * setting up potentially conflicting mappings here.
>       */
> -    i = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
> +    start = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
>  
> -    for ( ; i < top; i++ )
> +    for ( i = start, count = 0; i < top; )
>      {
>          unsigned long pfn = pdx_to_pfn(i);
>          unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
> @@ -406,20 +406,41 @@ void __hwdom_init arch_iommu_hwdom_init(
>          if ( !perms )
>              rc = 0;
>          else if ( paging_mode_translate(d) )
> +        {
>              rc = p2m_add_identity_entry(d, pfn,
>                                          perms & IOMMUF_writable ? 
> p2m_access_rw
>                                                                  : 
> p2m_access_r,
>                                          0);
> +            if ( rc )
> +                printk(XENLOG_WARNING
> +                       "%pd: identity mapping of %lx failed: %d\n",
> +                       d, pfn, rc);
> +        }
> +        else if ( pfn != start + count || perms != start_perms )
> +        {
> +        commit:
> +            rc = iommu_map(d, _dfn(start), _mfn(start), count, start_perms,
> +                           &flush_flags);
> +            if ( rc )
> +                printk(XENLOG_WARNING
> +                       "%pd: IOMMU identity mapping of [%lx,%lx) failed: 
> %d\n",
> +                       d, pfn, pfn + count, rc);
> +            SWAP(start, pfn);
> +            start_perms = perms;
> +            count = 1;
> +        }
>          else
> -            rc = iommu_map(d, _dfn(pfn), _mfn(pfn), 1ul << PAGE_ORDER_4K,
> -                           perms, &flush_flags);
> +        {
> +            ++count;
> +            rc = 0;
> +        }
>  
> -        if ( rc )
> -            printk(XENLOG_WARNING "%pd: identity %smapping of %lx failed: 
> %d\n",
> -                   d, !paging_mode_translate(d) ? "IOMMU " : "", pfn, rc);
>  
> -        if (!(i & 0xfffff))
> +        if ( !(++i & 0xfffff) )
>              process_pending_softirqs();
> +
> +        if ( i == top && count )

Nit: do you really need to check for count != 0? AFAICT this is only
possible in the first iteration.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.