RE: [PATCH v3 4/7] swiotlb: if swiotlb is full, fall back to a transient memory pool
- To: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
- From: "Michael Kelley (LINUX)" <mikelley@xxxxxxxxxxxxx>
- Date: Thu, 6 Jul 2023 14:22:50 +0000
- Cc: Petr Tesarik <petrtesarik@xxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Thomas Bogendoerfer <tsbogend@xxxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@xxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, "Rafael J. Wysocki" <rafael@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>, Christoph Hellwig <hch@xxxxxx>, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>, Robin Murphy <robin.murphy@xxxxxxx>, Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>, Hans de Goede <hdegoede@xxxxxxxxxx>, Jason Gunthorpe <jgg@xxxxxxxx>, Kees Cook <keescook@xxxxxxxxxxxx>, Saravana Kannan <saravanak@xxxxxxxxxx>, "moderated list:XEN HYPERVISOR ARM" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "moderated list:ARM PORT" <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>, open list <linux-kernel@xxxxxxxxxxxxxxx>, "open list:MIPS" <linux-mips@xxxxxxxxxxxxxxx>, "open list:XEN SWIOTLB SUBSYSTEM" <iommu@xxxxxxxxxxxxxxx>, Roberto Sassu <roberto.sassu@xxxxxxxxxxxxxxx>, Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>, "petr@xxxxxxxxxxx" <petr@xxxxxxxxxxx>
- Delivery-date: Thu, 06 Jul 2023 14:23:48 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Sent: Thursday, July 6, 2023 1:07 AM
>
> On Thu, Jul 06, 2023 at 03:50:55AM +0000, Michael Kelley (LINUX) wrote:
> > From: Petr Tesarik <petrtesarik@xxxxxxxxxxxxxxx> Sent: Tuesday, June 27, 2023 2:54 AM
> > >
> > > Try to allocate a transient memory pool if no suitable slots can be found,
> > > > except when allocating from a restricted pool. The transient pool is just
> > > > big enough for this one bounce buffer. It is inserted into a per-device
> > > list of transient memory pools, and it is freed again when the bounce
> > > buffer is unmapped.
> > >
> > > Transient memory pools are kept in an RCU list. A memory barrier is
> > > required after adding a new entry, because any address within a transient
> > > buffer must be immediately recognized as belonging to the SWIOTLB, even if
> > > it is passed to another CPU.
> > >
> > > Deletion does not require any synchronization beyond RCU ordering
> > > guarantees. After a buffer is unmapped, its physical addresses may no
> > > longer be passed to the DMA API, so the memory range of the corresponding
> > > stale entry in the RCU list never matches. If the memory range gets
> > > > allocated again, then it happens only after an RCU quiescent state.
> > >
> > > Since bounce buffers can now be allocated from different pools, add a
> > > parameter to swiotlb_alloc_pool() to let the caller know which memory pool
> > > is used. Add swiotlb_find_pool() to find the memory pool corresponding to
> > > an address. This function is now also used by is_swiotlb_buffer(), because
> > > a simple boundary check is no longer sufficient.
> > >
> > > The logic in swiotlb_alloc_tlb() is taken from __dma_direct_alloc_pages(),
> > > simplified and enhanced to use coherent memory pools if needed.
> > >
> > > Note that this is not the most efficient way to provide a bounce buffer,
> > > but when a DMA buffer can't be mapped, something may (and will) actually
> > > break. At that point it is better to make an allocation, even if it may be
> > > an expensive operation.
> >
> > I continue to think about swiotlb memory management from the standpoint
> > of CoCo VMs that may be quite large with high network and storage loads.
> > These VMs are often running mission-critical workloads that can't tolerate
> > a bounce buffer allocation failure. To prevent such failures, the swiotlb
> > memory size must be overly large, which wastes memory.
>
> If "mission critical workloads" are in a vm that allows overcommit and
> no control over other vms in that same system, then you have worse
> problems, sorry.
>
> Just don't do that.
>
No, the cases I'm concerned about don't involve memory overcommit.
CoCo VMs must use swiotlb bounce buffers to do DMA I/O. Current swiotlb
code in the Linux guest allocates a configurable, but fixed, amount of guest
memory at boot time for this purpose. But it's hard to know how much
swiotlb bounce buffer memory will be needed to handle peak I/O loads.
This patch set does dynamic allocation of swiotlb bounce buffer memory,
which can help avoid needing to configure an overly large fixed size at boot.

Michael
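
As a rough illustration of the fallback scheme discussed in this thread, here is a user-space sketch in C. It is not the patch's code: every name in it (struct pool, pools_init, bounce_alloc, find_pool, bounce_free) is hypothetical, the fixed pool is a plain static array with a bump allocator, and the RCU list and memory barrier described in the patch are deliberately left out, so it is single-threaded only. It only shows the shape of the idea: try the fixed pool first, create a pool sized for one buffer when the fixed pool is full, and look an address up by checking every pool's range instead of a single boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified model; none of these names are kernel APIs. */
struct pool {
    uintptr_t start;     /* first byte of the pool */
    size_t size;         /* pool length in bytes */
    bool transient;      /* true if created on demand for one buffer */
    struct pool *next;   /* next entry in the transient list */
};

#define DEFAULT_POOL_SIZE 4096

static unsigned char default_mem[DEFAULT_POOL_SIZE];
static size_t default_used;           /* bump-allocator cursor */
static struct pool default_pool;
static struct pool *transient_list;   /* stand-in for the per-device list */

static void pools_init(void)
{
    default_pool.start = (uintptr_t)default_mem;
    default_pool.size = sizeof(default_mem);
    default_pool.transient = false;
    default_pool.next = NULL;
    default_used = 0;
    transient_list = NULL;
}

/* Allocate from the fixed pool; fall back to a one-buffer transient pool. */
static void *bounce_alloc(size_t len, struct pool **pool_out)
{
    if (len <= DEFAULT_POOL_SIZE - default_used) {
        void *p = default_mem + default_used;
        default_used += len;
        *pool_out = &default_pool;
        return p;
    }

    struct pool *tp = malloc(sizeof(*tp));
    void *buf = tp ? malloc(len) : NULL;
    if (!buf) {
        free(tp);
        return NULL;
    }
    tp->start = (uintptr_t)buf;
    tp->size = len;
    tp->transient = true;
    tp->next = transient_list;   /* the real patch uses an RCU list here */
    transient_list = tp;
    *pool_out = tp;
    return buf;
}

/* Range check against every pool, since one boundary check is not enough. */
static struct pool *find_pool(const void *addr)
{
    uintptr_t a = (uintptr_t)addr;

    /* Unsigned subtraction makes this a single start <= a < start+size test. */
    if (a - default_pool.start < default_pool.size)
        return &default_pool;
    for (struct pool *p = transient_list; p; p = p->next)
        if (a - p->start < p->size)
            return p;
    return NULL;
}

/* Unmap: a transient pool is unlinked and freed together with its buffer. */
static void bounce_free(void *addr)
{
    for (struct pool **pp = &transient_list; *pp; pp = &(*pp)->next) {
        struct pool *p = *pp;
        if ((uintptr_t)addr - p->start < p->size) {
            *pp = p->next;
            free((void *)p->start);
            free(p);
            return;
        }
    }
    /* Default-pool slots are not reclaimed in this toy model. */
}
```

A caller would run pools_init() once, then pair each bounce_alloc() with a bounce_free(). Once the 4096-byte static array is exhausted, further allocations succeed through transient pools rather than failing, which is the behavior the patch description argues for even though it is the more expensive path.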