
Re: [Xen-devel] [PATCH 1/4] xen/arm: gic: Ensure we have an ISB between ack and do_IRQ()

Hello Andre,

On 22.11.18 20:04, Andre Przywara wrote:

Is that benchmark chosen to put some interrupt load on the system? Or
is that what the customer actually uses and she is really suffering
from Xen's interrupt handling and emulation?
That is the 3D benchmark the customer uses to compare virtualization solutions from different providers.

If you chose this benchmark because it tends to trigger a lot of
interrupts and you hope to estimate some other interrupt property with
this (for instance interrupt latency or typical LR utilisation), then
you might be disappointed. This seems to go into the direction of an
interrupt storm, where we really don't care about performance, but just
want to make sure we keep running and ideally don't penalise other
Well, that benchmark itself is rather interrupt-oriented (on our HW). It generates GPU load, so it causes very low CPU load but a lot of interrupts from the GPU, video subsystem and display subsystem. I know about the WFI/WFE problem and `vwfi=native`, but we can't use it because our system is overcommitted.

Adding the reschedule
IRQ makes the system tend to not fit all IRQs into 4 LRs available in
my GIC. Moreover, the benchmark does not generate network traffic or
disk usage during the run. So real life cases will add more
concurrent IRQs.

That's rather odd. Are you sure you actually have all LRs used?
I have to recheck.

What is your guest system? Linux?
Yep, LK 4.14.35

Can you check whether you use EOI mode 1 in
the guest ("GIC: Using split EOI/Deactivate mode" in dmesg, right after
"NR_IRQS: xx, nr_irqs: xx, preallocated irqs: 0")?
I didn't find such a print in dmesg:
        [    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, 
        [    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
        [    0.000000] arch_timer: cp15 timer(s) running at 8.33MHz (virt).
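For anyone repeating this check on a saved log, here is a minimal sketch of the same test. It only looks for the message text quoted above; the function name `uses_split_eoi` and the sample string are made up for illustration, and an absent line only means the message was not found in the text given.

```python
# Marker string as quoted in this thread.
SPLIT_EOI_MARKER = "GIC: Using split EOI/Deactivate mode"


def uses_split_eoi(dmesg_text: str) -> bool:
    """Return True if any line of the given dmesg text contains the marker."""
    return any(SPLIT_EOI_MARKER in line for line in dmesg_text.splitlines())


# Hypothetical usage: the two complete lines from the excerpt above.
sample = (
    "[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0\n"
    "[    0.000000] arch_timer: cp15 timer(s) running at 8.33MHz (virt).\n"
)
found = uses_split_eoi(sample)  # False for this excerpt
```

On a live guest the text could come from the output of `dmesg` saved to a file and read back as a string.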

The frequency of the interrupt (n per second) should be unrelated to
the number of IRQs being presented simultaneously to the guest.
Actually yes. What could matter here is that we have 4 concurrent HW interrupt sources involved in 3D processing.

Typically I would assume you have zero interrupts normally, because
your guest is doing actual work (running the OS or userland program).
Then you handle the occasional interrupt (1 LR in active state), and
the timer IRQ fires during this. This lets the guest exit, and the
second LR gets used with the injected pending timer IRQ. Now every now
and then an IPI might also be injected from another VCPU at the same
time, which brings the count up to 3. But all of the time the guest
still handles this first interrupt. And chances are the interrupt
handler sooner or later triggers an (MMIO) exit, in which case the
number of LR becomes irrelevant. A high number of LRs would only be
interesting if you are able to process all those interrupts without a
single exit, typically this is only true for the virtual arch timer IRQ.
I need some time to sort it out.
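To make the LR counting in the quoted paragraph concrete, here is a toy model. It is not Xen's vGIC code: the class `ToyVcpu`, the event lists and the IRQ names are all hypothetical, and guest exits are ignored on purpose. It only counts how many interrupts are outstanding at once against the 4 LRs mentioned in this thread.

```python
from dataclasses import dataclass, field

NUM_LRS = 4  # number of list registers reported for the GIC in this thread


@dataclass
class ToyVcpu:
    lrs: list = field(default_factory=list)       # IRQs currently held in LRs
    overflow: list = field(default_factory=list)  # IRQs waiting for a free LR

    def inject(self, irq: str) -> None:
        """Place the IRQ in a free LR, or queue it if all LRs are in use."""
        if len(self.lrs) < NUM_LRS:
            self.lrs.append(irq)
        else:
            self.overflow.append(irq)

    def eoi(self, irq: str) -> None:
        """Guest finished the IRQ: free its LR and refill from the queue."""
        self.lrs.remove(irq)
        if self.overflow:
            self.lrs.append(self.overflow.pop(0))


def peak_usage(events):
    """Return (peak number of LRs in use, peak number of queued IRQs)."""
    vcpu = ToyVcpu()
    peak_lrs = peak_queued = 0
    for action, irq in events:
        if action == "inject":
            vcpu.inject(irq)
        else:
            vcpu.eoi(irq)
        peak_lrs = max(peak_lrs, len(vcpu.lrs))
        peak_queued = max(peak_queued, len(vcpu.overflow))
    return peak_lrs, peak_queued


# Quoted scenario: one device IRQ being handled, then the timer, then an IPI.
typical = [("inject", "device"), ("inject", "timer"), ("inject", "ipi")]

# Assumed worst case for the benchmark: 4 HW sources at once, plus timer and IPI.
benchmark = [("inject", f"hw{i}") for i in range(4)]
benchmark += [("inject", "timer"), ("inject", "ipi")]

typical_peak = peak_usage(typical)      # (3, 0): fits in 4 LRs
benchmark_peak = peak_usage(benchmark)  # (4, 2): two IRQs have to wait
```

The second scenario overflows only if all four sources really are pending at the same moment, which is exactly what needs to be rechecked.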

Also keep in mind that some instructions here and there in the
interrupt handling path in Xen might not be relevant if you exit the
guest frequently (due to interrupts, for instance). The cost of an exit
will probably dwarf the cost of adding a struct pending_irq to a linked
list or so.
That is clear. As we discussed internally, even with a shorter IRQ path we may see the benchmark results drop, because we would be doing more context switches from the guest instead of collecting those interrupts directly from hyp.

Andrii Anisov.

Xen-devel mailing list


