Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
On Fri, Jul 07, 2023 at 06:38:48AM +0200, Juergen Gross wrote:
> On 06.07.23 23:49, Stefano Stabellini wrote:
> > On Thu, 6 Jul 2023, Roger Pau Monné wrote:
> > > On Wed, Jul 05, 2023 at 03:41:10PM -0700, Stefano Stabellini wrote:
> > > > On Wed, 5 Jul 2023, Roger Pau Monné wrote:
> > > > > On Tue, Jul 04, 2023 at 08:14:59PM +0300, Oleksandr Tyshchenko wrote:
> > > > > > Part 2 (clarification):
> > > > > >
> > > > > > I think using a special config space register in the root complex
> > > > > > would not be terrible in terms of guest changes, because it is easy
> > > > > > to introduce a new root complex driver in Linux and other OSes. The
> > > > > > root complex would still be ECAM compatible, so the regular ECAM
> > > > > > driver would still work. A new driver would only be necessary if
> > > > > > you want to be able to access the special config space register.
> > > > >
> > > > > I'm slightly worried by this approach: we end up modifying a root
> > > > > complex emulation in order to avoid modifying a PCI device emulation
> > > > > in QEMU, and I'm not sure that's a good trade-off.
> > > > >
> > > > > Note also that different architectures will likely have different
> > > > > root complexes, so you might need to modify several of them, and
> > > > > then arrange the PCI layout correctly in order to have the proper
> > > > > hierarchy, so that devices belonging to different driver domains
> > > > > are assigned to different bridges.
> > > >
> > > > I do think that adding something to the PCI config space somewhere is
> > > > the best option, because it depends neither on ACPI nor on xenstore,
> > > > both of which are very undesirable.
> > > >
> > > > I am not sure where specifically is the best place. These are the
> > > > ideas we came up with:
> > > > 1. the PCI root complex
> > > > 2. a register on the device itself
> > > > 3. a new capability of the device
> > > > 4. one extra dummy PCI device for the sole purpose of exposing the
> > > >    grants capability
> > > >
> > > > Looking at the spec, there is a way to add a vendor-specific
> > > > capability (cap_vndr = 0x9). Could we use that? It doesn't look like
> > > > it is used today; Linux doesn't parse it.
> > >
> > > I did wonder the same from a quick look at the spec. There is however
> > > text in the specification that says:
> > >
> > > "The driver SHOULD NOT use the Vendor data capability except for
> > > debugging and reporting purposes."
> > >
> > > So we would at least need to change that, because the capability would
> > > then be used for purposes other than debugging and reporting.
> > >
> > > It seems like a minor adjustment, so it might be worth asking upstream
> > > for their opinion, and getting a conversation started.
> >
> > Wait, wouldn't this use-case fall under "reporting"? It is exactly what
> > we are doing, right?
>
> I'd understand "reporting" as e.g. logging, transferring statistics, ...
>
> We'd like to use it for configuration purposes.

I've also read it that way.

> Another idea would be to enhance the virtio IOMMU device to suit our
> needs: we could add the domid as another virtio IOMMU device capability
> and (for now) use bypass mode for all "productive" devices.

If we have to start adding capabilities, wouldn't it be easier to just
add one to each device instead of adding it to the virtio IOMMU? Or is
the parsing of capabilities device-specific, so that we would have to
implement such parsing for each device? I would expect some capabilities
to be shared between all devices, and a Xen capability could be one of
those.

> Later we could even add grant-V3 support to Xen and to the virtio IOMMU
> device (see my design session at last year's Xen Summit). This could be
> usable for disaggregated KVM setups, too, so I believe there is a chance
> to get this accepted.

> > > > > > **********
> > > > > >
> > > > > > What do you think about it? Are there any pitfalls, etc? This
> > > > > > also requires system changes, but at least without virtio spec
> > > > > > changes.
> > > > >
> > > > > Why are we so reluctant to add spec changes? I understand this
> > > > > might take time and effort, but it's the only way IMO to build a
> > > > > sustainable VirtIO Xen implementation. Did we already attempt to
> > > > > negotiate Xen-related spec changes with Oasis, and were those
> > > > > refused?
> > > >
> > > > That's because spec changes can be very slow. This is a bug that we
> > > > need a relatively quick solution for, and waiting 12-24 months for a
> > > > spec update is not realistic.
> > > >
> > > > I think a spec change would be best as a long-term solution. We also
> > > > need a short-term solution. The short-term solution doesn't have to
> > > > be ideal, but it has to work now.
> > >
> > > My fear with such an approach is that once a bodge is in place people
> > > move on to other stuff and this never gets properly fixed.
> > >
> > > I know this might not be a well-received opinion, but it would be
> > > better if such a bodge were kept in each interested party's patch
> > > queue for the time being, until a proper solution is implemented. That
> > > way there's an interest from the parties in properly fixing it
> > > upstream.
> >
> > Unfortunately we are in a situation where we have an outstanding
> > upstream bug, so we have to take action one way or the other.
>
> The required virtio IOMMU device modification would be rather small, so
> adding it, maybe under a CONFIG option defaulting to off, might be
> acceptable.

Would you then do the grant allocation as part of virtio IOMMU?

Thanks, Roger.