[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
From: Juergen Gross <jgross@xxxxxxxx>
Date: Fri, 7 Jul 2023 16:10:14 +0200
Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <olekstysh@xxxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Petr Pavlu <petr.pavlu@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, vikram.garhwal@xxxxxxx
Delivery-date: Fri, 07 Jul 2023 14:10:27 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 07.07.23 11:50, Roger Pau Monné wrote:

On Fri, Jul 07, 2023 at 06:38:48AM +0200, Juergen Gross wrote:

On 06.07.23 23:49, Stefano Stabellini wrote:

On Thu, 6 Jul 2023, Roger Pau Monné wrote:

On Wed, Jul 05, 2023 at 03:41:10PM -0700, Stefano Stabellini wrote:

On Wed, 5 Jul 2023, Roger Pau Monné wrote:

On Tue, Jul 04, 2023 at 08:14:59PM +0300, Oleksandr Tyshchenko wrote:

Part 2 (clarification):

I think using a special config space register in the root complex would
not be terrible in terms of guest changes because it is easy to
introduce a new root complex driver in Linux and other OSes. The root
complex would still be ECAM compatible so the regular ECAM driver would
still work. A new driver would only be necessary if you want to be able
to access the special config space register.


I'm slightly worry of this approach, we end up modifying a root
complex emulation in order to avoid modifying a PCI device emulation
on QEMU, not sure that's a good trade off.

Note also that different architectures will likely have different root
complex, and so you might need to modify several of them, plus then
arrange the PCI layout correctly in order to have the proper hierarchy
so that devices belonging to different driver domains are assigned to
different bridges.


I do think that adding something to the PCI conf register somewhere is
the best option because it is not dependent on ACPI and it is not
dependent on xenstore both of which are very undesirable.

I am not sure where specifically is the best place. These are 3 ideas
we came up with:
1. PCI root complex
2. a register on the device itself
3. a new capability of the device
4. add one extra dummy PCI device for the sole purpose of exposing the
     grants capability


Looking at the spec, there is a way to add a vendor-specific capability
(cap_vndr = 0x9). Could we use that? It doesn't look like it is used
today, Linux doesn't parse it.


I did wonder the same from a quick look at the spec.  There's however
a text in the specification that says:

"The driver SHOULD NOT use the Vendor data capability except for
debugging and reporting purposes."

So we would at least need to change that because the capability would
then be used by other purposes different than debugging and reporting.

Seems like a minor adjustment, so might we worth asking upstream about
their opinion, and to get a conversation started.


Wait, wouldn't this use-case fall under "reporting" ? It is exactly what
we are doing, right?


I'd understand "reporting" as e.g. logging, transferring statistics, ...

We'd like to use it for configuration purposes.


I've also read it that way.

Another idea would be to enhance the virtio IOMMU device to suit our needs:
we could add the domid as another virtio IOMMU device capability and (for now)
use bypass mode for all "productive" devices.


If we have to start adding capabilties, won't it be easier to just add
it to the each device instead of adding it to virtio IOMMU.  Or is the
parsing of capabilities device specific, and hence we would have to
implement such parsing for each device?  I would expect some
capabilities are shared between all devices, and a Xen capability could
be one of those.


Have a look at [1], which is describing the common device config layout.
The problem here is that we'd need to add the domid after the queue specific
data, resulting in a mess if further queue fields would be added later.

We could try that, of course.

Later we could even add grant-V3 support to Xen and to the virtio IOMMU device
(see my last year Xen Summit design session). This could be usable for
disaggregated KVM setups, too, so I believe there is a chance to get this
accepted.

**********
What do you think about it? Are there any pitfalls, etc? This also requires
system changes, but at least without virtio spec changes.


Why are we so reluctant to add spec changes?  I understand this might
take time an effort, but it's the only way IMO to build a sustainable
VirtIO Xen implementation.  Did we already attempt to negotiate with
Oasis Xen related spec changes and those where refused?


That's because spec changes can be very slow. This is a bug that we need
a relatively quick solution for and waiting 12-24 months for a spec
update is not realistic.

I think a spec change would be best as a long term solution. We also
need a short term solution. The short term solution doesn't have to be
ideal but it has to work now.


My fear with such approach is that once a bodge is in place people
move on to other stuff and this never gets properly fixed.

I know this might not be a well received opinion, but it would be
better if such bodge is kept in each interested party patchqueue for
the time being, until a proper solution is implemented.  That way
there's an interest from parties into properly fixing it upstream.


Unfortunately we are in the situation where we have an outstanding
upstream bug, so we have to take action one way or the other.


The required virtio IOMMU device modification would be rather small, so
adding it maybe under a CONFIG option defaulting to off might be
acceptable.


Would you then do the grant allocation as part of virtio IOMMU?


Long term, maybe. Do you remember my Grant-V3 design session last year? Being
able to reuse the same layout for virtio IOMMU was one of the basic ideas for
that layout (this would need some heavy work on the virtio IOMMU frontend and
backend, of course).


Juergen

[1]:http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.pdf#subsubsection.4.1.4.3

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature

Follow-Ups:
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Roger Pau Monné
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Juergen Gross

References:
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Roger Pau Monné
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Juergen Gross
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Marek Marczykowski-Górecki
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Roger Pau Monné
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Oleksandr Tyshchenko
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Roger Pau Monné
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Stefano Stabellini
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Roger Pau Monné
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Stefano Stabellini
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Juergen Gross
- Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
  - From: Roger Pau Monné

Prev by Date: [PATCH] iommu/amd-vi: avoid pointless flushes in invalidate_all_domain_pages()
Next by Date: Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
Previous by thread: Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
Next by thread: Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.