[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads

To: George Dunlap <George.Dunlap@xxxxxxxxxx>, Julien Grall <julien.grall.oss@xxxxxxxxx>
From: Julien Grall <julien@xxxxxxx>
Date: Mon, 6 Apr 2020 19:47:14 +0100
Cc: Peng Fan <peng.fan@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "maz@xxxxxxxxxx" <maz@xxxxxxxxxx>, Wei Xu <xuwei5@xxxxxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
Delivery-date: Mon, 06 Apr 2020 18:47:42 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>



On 06/04/2020 18:58, George Dunlap wrote:

On Apr 3, 2020, at 9:27 PM, Julien Grall <julien.grall.oss@xxxxxxxxx> wrote:

On Fri, 3 Apr 2020 at 20:41, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:


On Thu, 2 Apr 2020, Julien Grall wrote:

As we discussed on Tuesday, the cost for other vCPUs is only going to be a
trap to the hypervisor and then back again. The cost is likely smaller than
receiving and forwarding an interrupt.

You actually agreed on this analysis. So can you enlighten me as to why
receiving an interrupt is a not problem for latency but this is?


My answer was that the difference is that an operating system can
disable interrupts, but it cannot disable receiving this special IPI.


An OS can *only* disable its own interrupts, yet interrupts will still
be received by Xen even if the interrupts are masked at the processor
(e.g local_irq_disable()).

You would need to disable interrupts one by one as the GIC level (use
ICENABLER) in order to not receive any interrupts. Yet, Xen may still
receive interrupts for operational purposes (e.g serial, console,
maintainance IRQ...). So trap will happen.


I think Stefano’s assertion is that the users he has in mind will be configuring the 
system such that RT workloads get a minimum number of hypervisor-related interrupts possible. 
 On a 4-core system, you could  have non-RT workloads running on cores 0-1, and RT workloads 
running with the NULL scheduler on cores 2-3.  In such a system, you’d obviously 
arrange that serial and maintenance IRQs are delivered to cores 0-1.

Well maintenance IRQs are per pCPU so you can't route to another one...

But, I think you missed my point that local_irq_disable() from the guestwill not prevent the hypervisor to receive interrupts *even* the onerouted to the vCPU itself. They will just not be delivered to the guestcontext until local_irq_enable() is called.


Julien, I think your argument above about interrupts somewhat undermines your point about deadlock.  Your basic 
argument is, that a guest will be interrupted by Xen quite frequently anyway for lots of reasons; adding one more 
to the list shouldn’t measurably increase the jitter.  But if it’s true that a guest will be 
interrupted by Xen frequently, then deadlock shouldn’t be much of an issue; and insofar as deadlock is an 
issue, it’s because there were no interrupts — and thus, adding an extra IPI will have a 
statistically significant effect on jitter.

I presented two way to disable interrupts because Stefano's e-mail wasnot clear enough which one he was referring to. So I was hoping to safesome round-trip, but it looks like I muddle my point.

If Stefano was referring to critical sections where interrupts aremasked using local_irq_disable(). Then, you are not going to limit thenumbers of traps at all (see why above). So the point here was moot.

I don't believe Stefano was referring to the second case as disablinginterrupts at the GIC level will require to trap in the hypervisor. ButI thought I would mention it.

Regarding the livelock (Marc pointed out it wasn't a deadlock), thereare multiple conflicting use cases. In an ideal world, there will belimited reasons to interrupts the guest. And yes it will result to alivelock.

In a less ideal world, there will some time be interruptions. But youdon't know when. If you look at determinism then, I am afraid thatStefano's implementation is not because you don't know when the vCPUwill exit.


On the other had, Stefano’s argument about deadlock (if I understand him correctly) has been that guests 
shouldn’t really be spinning on that register anyway.  But it sounds from Julien’s other email that 
perhaps spinning on the register is exactly what newer versions of Linux do — so Linux would certainly hang 
on such systems if Stefano’s assertion about the low number of Xen-related interrupts is true.

Rather than playing the game "one person's word against the otherperson's word", from Linux 5.4:


do {
    unsigned long flags;

    /*
     * Wait until we're out of the critical section.  This might
     * give the wrong answer due to the lack of memory barriers.
     */
    while (irqd_irq_inprogress(&desc->irq_data))
        cpu_relax();

    /* Ok, that indicated we're done: double-check carefully. */
    raw_spin_lock_irqsave(&desc->lock, flags);
    inprogress = irqd_irq_inprogress(&desc->irq_data);

    /*
     * If requested and supported, check at the chip whether it
     * is in flight at the hardware level, i.e. already pending
     * in a CPU and waiting for service and acknowledge.
     */
    if (!inprogress && sync_chip) {
        /*
         * Ignore the return code. inprogress is only updated
         * when the chip supports it.
         */
        __irq_get_irqchip_state(irqd, IRQCHIP_STATE_ACTIVE,
                                &inprogress);
    }
    raw_spin_unlock_irqrestore(&desc->lock, flags);
    /* Oops, that failed? */
} while (inprogress);


If the latter is true, then it sounds like addressing the deadlock issue will 
be necessary?  And so effort needs to be put towards figuring out how to 
minimize jitter due to Xen-related IPIs.

Here what I wrote before:

"
A few remarks:
    * The function do_nothing() is basically a NOP.
    * I am suggesting to use smp_call_function() rather
smp_send_event_check_cpu() is because we need to know when the vCPU has
synchronized the LRs. As the function do_nothing() will be call
afterwards, then we know the the snapshot of the LRs has been done
    * It would be possible to everything in one vCPU.
    * We can possibly optimize it for the SGIs/PPIs case

This is still not perfect, but I don't think the performance would be
that bad.
"

"
But if your concern is to disturb a vCPU which has interrupts
disabled. Then there is way to mitigate this in an implementation, you
can easily know whether there are interrupts inflight at a given point
for a given vCPU. So you can avoid to IPI if you know there is no
interrupts potentially active.
"

I am looking forward to hear about the potential improvements.

Cheers,

--
Julien Grall

Follow-Ups:
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: George Dunlap

References:
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Stefano Stabellini
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Julien Grall
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Stefano Stabellini
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Julien Grall
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Stefano Stabellini
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Julien Grall
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: George Dunlap

Prev by Date: Re: [XEN PATCH] hvmloader: Enable MMIO and I/O decode, after all resource allocation
Next by Date: [ovmf test] 149462: all pass - PUSHED
Previous by thread: Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
Next by thread: Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.