[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads

To: George Dunlap <dunlapg@xxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
From: Julien Grall <julien@xxxxxxx>
Date: Wed, 1 Apr 2020 18:56:30 +0100
Cc: Peng Fan <peng.fan@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>, Wei Xu <xuwei5@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, nd <nd@xxxxxxx>
Delivery-date: Wed, 01 Apr 2020 17:56:44 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>



On 01/04/2020 18:02, George Dunlap wrote:

On Wed, Apr 1, 2020 at 10:54 AM Bertrand Marquis<Bertrand.Marquis@xxxxxxx <mailto:Bertrand.Marquis@xxxxxxx>> wrote:

    On 1 Apr 2020, at 09:30, Julien Grall <julien@xxxxxxx
    <mailto:julien@xxxxxxx>> wrote:



    On 01/04/2020 01:57, Stefano Stabellini wrote:

    On Mon, 30 Mar 2020, Julien Grall wrote:

    Hi Stefano,

    On 30/03/2020 17:35, Stefano Stabellini wrote:

    On Sat, 28 Mar 2020, Julien Grall wrote:

    qHi Stefano,

    On 27/03/2020 02:34, Stefano Stabellini wrote:

    This is a simple implementation of GICD_ICACTIVER /
    GICD_ISACTIVER
    reads. It doesn't take into account the latest state of
    interrupts on
    other vCPUs. Only the current vCPU is up-to-date. A full
    solution is
    not possible because it would require synchronization among
    all vCPUs,
    which would be very expensive in terms or latency.


    Your sentence suggests you have number showing that correctly
    emulating
    the
    registers would be too slow. Mind sharing them?


    No, I don't have any numbers. Would you prefer a different
    wording or a
    better explanation? I also realized there is a typo in there
    (or/of).

    Let me start with I think correctness is more important than speed.
    So I would have expected your commit message to contain some
    fact why
    synchronization is going to be slow and why this is a problem.

    To give you a concrete example, the implementation of set/way
    instructions are
    really slow (it could take a few seconds depending on the
    setup). However,
    this was fine because not implementing them correctly would have
    a greater
    impact on the guest (corruption) and they are not used often.

    I don't think the performance in our case will be in same order
    magnitude. It
    is most likely to be in the range of milliseconds (if not less)
    which I think
    is acceptable for emulation (particularly for the vGIC) and the
    current uses.

    Writing on the mailing list some of our discussions today.
    Correctness is not just in terms of compliance to a specification
    but it
    is also about not breaking guests. Introducing latency in the
    range of
    milliseconds, or hundreds of microseconds, would break any latency
    sensitive workloads. We don't have numbers so we don't know for
    certain
    the effect that your suggestion would have.


    You missed part of the discussion. I don't disagree that latency
    is important. However, if an implementation is only 95% reliable,
    then it means 5% of the time your guest may break (corruption,
    crash, deadlock...). At which point the latency is the last of
    your concern.

    It would be interesting to have those numbers, and I'll add to my
    TODO
    list to run the experiments you suggested, but I'll put it on the
    back-burner (from a Xilinx perspective it is low priority as no
    customers are affected.)


    How about we get a correct implementation merge first and then
    discuss about optimization? This would allow the community to
    check whether there are actually noticeable latency in their workload.


    Hi,

    I am not sure that pushing something with a performance impact to
    later fix it is the right approach here.

    The patch is an improvement compared to the current code and it can
    be further improved later to handle more cases (other cores).

    If we really have to sync all vCPUs here, this will cost a lot and
    the result will still be the status in the past in fact because
    nothing will make sure that at the point the guest gets back the
    value it is still valid.


The same is true on real hardware, right?

Looking at this discussion as a non-ARM person, it sounds a bit like ARMspecified this in a way that wasuseless-but-easy-to-implement-so-why-not; but it turns out to beuseless-but-hard-to-implement virtualized.

On the one hand, I'm sympathetic to Julien's point of view; I very muchdon't like the idea of silently changing behavior just because thespecified behavior is inconvenient for us.

On the other hand, I'm sympathetic to Stefano's point of view, that it'spointless to introduce a load of overhead and jitter to implementbehavior which nobody in their right mind would even want to use (atleast virtualized).

What I think would be *ideal* would be for ARM to change thespecification to make it easier virtualize. For instance:, byspecifying that the register *may* contain information about othercores, but may not; or, that the register will contain information onother cores on real hardware, but not virtualized.

The goal of those registers is to give you a picture of the interruptsactivated on your HW. For instance, this is used by Linux to properlysynchronize pending interrupts as there would be a race otherwise.

I wrote a long e-mail a few months ago explaining why Linux is using itand why we should implement correctly [1].

Even on real HW, the value you read may already be incorrect (i.e aninterrupt may have become pending).

What Stefano implemented is using a snapshot of what Xen thinks is thestate of the vGIC. I think this is inline with the existing specification.

However, this has a major flaw because this relies on the vCPUs to exitfor synchronization the HW state and what Xen thinks of the vGIC. Wehave configured the vGIC to not exit when the virtual interrupt isbacked by an HW interrupt. So we need to rely on another trap for thesynchronization.

With the NULL scheduler (and even credit2 with only 1 vCPU running perpCPU), the number of traps is very limited. So read the I*ACTIVERregisters may report that an interrupt is actived for a really long time.


The result here is a thread in the guest may be blocked for a long-time.

To be brutally honest, this totally defeat the argument that "this patchhas a better latency".

Barring that, I think we should have a centralized place to documentdeviations from the spec; and I think changes to this spec should becoordinated with KVM (and maybe ACRN?), to minimize hypervisor-specificdeviations.

AFAIK, KVM is implementing the ACTIVER registers the same way asStefano's patch. It might be harder (or not possible) to hit thedeadlock depending on how they configure the vGIC (i.e trap the activestate) and the scheduling.

As I discussed with Stefano yesterday, I can accept that the state isnot up-to-date. But I can't ignore the fact this patch is introducing adeadlock in some circumstance.


Cheers,

[1]https://lore.kernel.org/xen-devel/7289f75f-1ab2-2d42-cd88-1be5306b3072@xxxxxxx/


--
Julien Grall

References:
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Stefano Stabellini
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Julien Grall
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: Bertrand Marquis
- Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
  - From: George Dunlap

Prev by Date: Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
Next by Date: [xen-unstable test] 149260: regressions - FAIL
Previous by thread: Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
Next by thread: Re: [PATCH v2] xen/arm: implement GICD_I[S/C]ACTIVER reads
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.