[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH V1 05/12] hvm/dm: Introduce xendevicemodel_set_irq_level DM op



Hi Stefano,

On 11/08/2020 00:34, Stefano Stabellini wrote:
On Sat, 8 Aug 2020, Julien Grall wrote:
On Fri, 7 Aug 2020 at 22:51, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:

On Fri, 7 Aug 2020, Jan Beulich wrote:
On 07.08.2020 01:49, Stefano Stabellini wrote:
On Thu, 6 Aug 2020, Julien Grall wrote:
On 06/08/2020 01:37, Stefano Stabellini wrote:
On Wed, 5 Aug 2020, Julien Grall wrote:
On 05/08/2020 00:22, Stefano Stabellini wrote:
On Mon, 3 Aug 2020, Oleksandr Tyshchenko wrote:
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>

This patch adds ability to the device emulator to notify otherend
(some entity running in the guest) using a SPI and implements Arm
specific bits for it. Proposed interface allows emulator to set
the logical level of a one of a domain's IRQ lines.

Please note, this is a split/cleanup of Julien's PoC:
"Add support for Guest IO forwarding to a device emulator"

Signed-off-by: Julien Grall <julien.grall@xxxxxxx>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
---
    tools/libs/devicemodel/core.c                   | 18
++++++++++++++++++
    tools/libs/devicemodel/include/xendevicemodel.h |  4 ++++
    tools/libs/devicemodel/libxendevicemodel.map    |  1 +
    xen/arch/arm/dm.c                               | 22
+++++++++++++++++++++-
    xen/common/hvm/dm.c                             |  1 +
    xen/include/public/hvm/dm_op.h                  | 15
+++++++++++++++
    6 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/tools/libs/devicemodel/core.c
b/tools/libs/devicemodel/core.c
index 4d40639..30bd79f 100644
--- a/tools/libs/devicemodel/core.c
+++ b/tools/libs/devicemodel/core.c
@@ -430,6 +430,24 @@ int xendevicemodel_set_isa_irq_level(
        return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
    }
    +int xendevicemodel_set_irq_level(
+    xendevicemodel_handle *dmod, domid_t domid, uint32_t irq,
+    unsigned int level)

It is a pity that having xen_dm_op_set_pci_intx_level and
xen_dm_op_set_isa_irq_level already we need to add a third one, but from
the names alone I don't think we can reuse either of them.

The problem is not the name...


It is very similar to set_isa_irq_level. We could almost rename
xendevicemodel_set_isa_irq_level to xendevicemodel_set_irq_level or,
better, just add an alias to it so that xendevicemodel_set_irq_level is
implemented by calling xendevicemodel_set_isa_irq_level. Honestly I am
not sure if it is worth doing it though. Any other opinions?

... the problem is the interrupt field is only 8-bit. So we would only be
able
to cover IRQ 0 - 255.

Argh, that's not going to work :-(  I wasn't sure if it was a good idea
anyway.


It is not entirely clear how the existing subop could be extended without
breaking existing callers.

But I think we should plan for not needing two calls (one to set level
to 1, and one to set it to 0):
https://marc.info/?l=xen-devel&m=159535112027405

I am not sure to understand your suggestion here? Are you suggesting to
remove
the 'level' parameter?

My hope was to make it optional to call the hypercall with level = 0,
not necessarily to remove 'level' from the struct.

 From my understanding, the hypercall is meant to represent the status of the
line between the device and the interrupt controller (either low or high).

This is then up to the interrupt controller to decide when the interrupt is
going to be fired:
   - For edge interrupt, this will fire when the line move from low to high (or
vice versa).
   - For level interrupt, this will fire when line is high (assuming level
trigger high) and will keeping firing until the device decided to lower the
line.

For a device, it is common to keep the line high until an OS wrote to a
specific register.

Furthermore, technically, the guest OS is in charge to configure how an
interrupt is triggered. Admittely this information is part of the DT, but
nothing prevent a guest to change it.

As side note, we have a workaround in Xen for some buggy DT (see the arch
timer) exposing the wrong trigger type.

Because of that, I don't really see a way to make optional. Maybe you have
something different in mind?

For level, we need the level parameter. For edge, we are only interested
in the "edge", right?

I don't think so, unless Arm has special restrictions. Edges can be
both rising and falling ones.

And the same is true for level interrupts too: they could be active-low
or active-high.


Instead of modelling the state of the line, which seems to be a bit
error prone especially in the case of a single-device emulator that
might not have enough information about the rest of the system (it might
not know if the interrupt is active-high or active-low), we could model
the triggering of the interrupt instead.

I am not sure to understand why the single (or event multiple) device
emulator needs to know the trigger type. The information of the
trigger type of the interrupt would be described in the firmware table
and it is expected to be the same as what the emulator expects.

If the guest OS decided to configure wrongly the interrupt trigger
type, then it may not work properly. But, from my understanding, this
doesn't differ from the HW behavior.


In the case of level=1, it would mean that the interrupt line is active,
no matter if it is active-low or active-high. In the case of level=0, it
would mean that it is inactive.

Similarly, in the case of an edge interrupt edge=1 or level=1 would mean
that there is an edge, no matter if it is rising or falling.

TBH, I think your approach is only going to introduce more headache in
Xen if a guest OS decides to change the trigger type.

It feels much easier to just ask the emulator to let us know the level
of the line. Then if the guest OS decides to change the trigger type,
we only need to resample the line.

Emulators, at least the ones in QEMU, don't model the hardware so
closely to care about trigger type. The only thing they typically care
about is to fire a notification.

I don't think I agree with this. Devices in QEMU will set the level (high or low) of the line. This is then up to the interrupt controller to decide how to act with it. See the function qemu_set_irq().

In the case of active-high level interrupt, the interrupt would fire until the line has been lowered.


The trigger type only comes into the picture when there is a bug or a
disagreement between Xen and QEMU. Imagine a device that can be both
level active-high or active-low, if the guest kernel changes the
configuration, Xen would know about it, but QEMU wouldn't.

Lets take a step back. From my understanding, on real HW, the OS will have to configure the device *and* the interrupt controller in order to switch from level active-low to level active-high. Otherwise, there would be discrepancy between the two.

In our situation, Xen is basically the interrupt controller and QEMU the device. So both should be aware of any change here. Did I miss anything?

 I vaguely
recall a bug 10+ years ago about this with QEMU on x86 and a line that
could be both active-high and active-low. So QEMU would raise the
interrupt but Xen would actually think that QEMU stopped the interrupt.

To do this right, we would have to introduce an interface between Xen
and QEMU to propagate the trigger type. Xen would have to tell QEMU when
the guest changed the configuration. That would work, but it would be
better if we can figure out a way to do without it to reduce complexity.
Per above, I don't think this is necessary.


Instead, given that QEMU and other emulators don't actually care about
active-high or active-low, if we have a Xen interface that just says
"fire the interrupt" we get away from this kind of troubles. It would
also be more efficient because the total number of hypercalls required
would be lower.

I read "fire interrupt" the interrupt as "Please generate an interrupt once". Is it what you definition you expect?

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.