
Re: [Xen-devel] [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling



Hi,

On 08/03/17 15:28, Julien Grall wrote:
> 
> 
> On 07/03/17 18:08, Andre Przywara wrote:
>> Hi Julien,
> 
> Hi Andre,
> 
>>
>> On 06/02/17 19:16, Julien Grall wrote:
>>> Hi Andre,
>>>
>>> On 30/01/17 18:31, Andre Przywara wrote:
>>>> To be able to easily send commands to the ITS, create the respective
>>>> wrapper functions, which take care of the ring buffer.
>>>> The first two commands we implement provide methods to map a collection
>>>> to a redistributor (aka host core) and to flush the command queue
>>>> (SYNC).
>>>> Start using these commands for mapping one collection to each host CPU.
>>>>
>>>> Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
>>>> ---
>>>>  xen/arch/arm/gic-v3-its.c         | 142 +++++++++++++++++++++++++++++++++++++-
>>>>  xen/arch/arm/gic-v3-lpi.c         |  20 ++++++
>>>>  xen/arch/arm/gic-v3.c             |  18 ++++-
>>>>  xen/include/asm-arm/gic_v3_defs.h |   2 +
>>>>  xen/include/asm-arm/gic_v3_its.h  |  36 ++++++++++
>>>>  5 files changed, 215 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>>>> index ad7cd2a..6578e8a 100644
>>>> --- a/xen/arch/arm/gic-v3-its.c
>>>> +++ b/xen/arch/arm/gic-v3-its.c
>>>> @@ -19,6 +19,7 @@
>>>>  #include <xen/config.h>
>>>>  #include <xen/lib.h>
>>>>  #include <xen/device_tree.h>
>>>> +#include <xen/delay.h>
>>>>  #include <xen/libfdt/libfdt.h>
>>>>  #include <xen/mm.h>
>>>>  #include <xen/sizes.h>
>>>> @@ -29,6 +30,98 @@
>>>>
>>>>  #define ITS_CMD_QUEUE_SZ                SZ_64K
>>>>
>>>> +#define BUFPTR_MASK                     GENMASK(19, 5)
>>>> +static int its_send_command(struct host_its *hw_its, const void *its_cmd)
>>>> +{
>>>> +    uint64_t readp, writep;
>>>> +
>>>> +    spin_lock(&hw_its->cmd_lock);
>>>
>>> Do you never expect a command to be sent in an interrupt path? I can
>>> see at least one case: we may decide to throttle the number of LPIs
>>> received by a guest, which would involve disabling the interrupt.
>>
>> I take it you are asking for spin_lock_irq[save]()?
> 
> Yes.
> 
>> I don't think queuing ITS commands in interrupt context is a good idea,
>> especially since I just introduced a grace period that waits for the
>> command queue to drain.
>> I am happy to revisit this when needed.
> 
> As mentioned in the previous mail, we might need to send a command
> whilst in interrupt context if we need to disable an interrupt that
> fires too often.
> 
> I would be fine with an ASSERT(!in_irq()) and a comment explaining
> why, for the time being.

Done.
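
For reference, a minimal sketch of what that guard looks like at the
top of its_send_command(); the comment wording is illustrative, not the
final patch:

    static int its_send_command(struct host_its *hw_its, const void *its_cmd)
    {
        uint64_t readp, writep;

        /*
         * We may (briefly) spin waiting for the command queue to
         * drain, so queueing ITS commands from interrupt context is
         * not allowed for the time being.
         */
        ASSERT(!in_irq());

        spin_lock(&hw_its->cmd_lock);
        /* ... ring buffer handling as in the patch quoted below ... */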

>>
>>>> +
>>>> +    readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
>>>> +    writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
>>>> +
>>>> +    if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp )
>>>> +    {
>>>
>>> I looked at the whole series applied and there is no error message at
>>> all when the queue is full. This will make it difficult to see what's
>>> going on.
>>>
>>> Furthermore, this limit could easily be reached if you decide to map a
>>> device with thousands of interrupts: for instance the function
>>> gicv3_map_its_map_host_events will issue 2 commands per event (MAPTI
>>> and INV).
>>>
>>> So how do you plan to address this?
>>
>> So I changed this now to wait for up to 1 ms (or whatever value you
>> prefer) in the hope that the command queue drains. In the end the ITS
>> is hardware, and processing commands is the only thing it does, so I
>> don't expect it to be seriously stalled under normal conditions.
>> Waiting a tiny bit to cover this odd case of command queue contention
>> seems useful to me, especially since we only send commands from
>> non-critical Dom0 code.
> 
> I don't have any idea of a good value. My worry with such a value is
> that you are only hoping the situation will never happen. If you fail
> here, what will you do? You will likely have to revert changes, which
> means more commands, and then what? If it fails once, why would it not
> fail again? You will end up in a downward spiral.
> 
> Regarding the value, is it something we could confirm with the hardware
> guys?

Let's move this bikesh^Wfine-tuning to a later point in time ;-)
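
For illustration, a bounded wait along the lines described above might
look like the sketch below; it also prints a warning on failure, which
addresses the "no error message" point. its_queue_full() is a
hypothetical helper wrapping the CREADR/CWRITER comparison from the
patch, and the retry count and delay are placeholders (1000 x 1 us
roughly matching the 1 ms mentioned above):

    #define ITS_QUEUE_DRAIN_RETRIES   1000

    static bool its_queue_full(struct host_its *hw_its, uint64_t writep)
    {
        uint64_t readp = readq_relaxed(hw_its->its_base + GITS_CREADR) &
                         BUFPTR_MASK;

        return ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp;
    }

    /* In its_send_command(), replacing the immediate -EBUSY return: */
        unsigned int retries = 0;

        while ( its_queue_full(hw_its, writep) )
        {
            if ( retries++ >= ITS_QUEUE_DRAIN_RETRIES )
            {
                spin_unlock(&hw_its->cmd_lock);
                printk(XENLOG_WARNING "ITS: command queue full\n");
                return -EBUSY;
            }
            udelay(1);    /* give the ITS a chance to make progress */
        }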

>> The command queue is now 1 MB in size, so it holds 32,768 commands
>> (at 32 bytes each). Should be enough for everybody ;-)
>>
>>>> +        spin_unlock(&hw_its->cmd_lock);
>>>> +        return -EBUSY;
>>>> +    }
>>>> +
>>>> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
>>>> +    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
>>>> +        __flush_dcache_area(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
>>>
>>> Please use dcache_.... helpers.
>>>
>>>> +    else
>>>> +        dsb(ishst);
>>>> +
>>>> +    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
>>>> +    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base + GITS_CWRITER);
>>>> +
>>>> +    spin_unlock(&hw_its->cmd_lock);
>>>> +
>>>> +    return 0;
>>>> +}
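
As an aside, on the dcache helper request above: Xen's ARM headers
provide clean_dcache_va_range(), so I would expect the flush branch to
end up looking something like this sketch (not the final patch):

    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
        /* Non-coherent ITS: clean the command out of the dcache. */
        clean_dcache_va_range(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
    else
        /* Coherent ITS: an ordering barrier is sufficient. */
        dsb(ishst);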
>>>> +
>>>> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
>>>
>>> s/int cpu/unsigned int cpu/
>>
>> It's easy enough to do, but why is that, actually?
> 
> Because a CPU ID and a vCPU ID can never be negative. So what's the
> point of making them signed, other than saving 9 characters?

That doesn't explain why the rest of Xen uses signed values, but fine,
I will change it.

>> I see that both "processor" and "vcpu_id" are "int" in struct vcpu, so I
>> was using int as the type for CPUs here as well.
> 
> 
> [...]
> 
>>>> +{
>>>> +    struct host_its *its;
>>>> +    int ret;
>>>> +
>>>> +    list_for_each_entry(its, &host_its_list, entry)
>>>> +    {
>>>> +        /*
>>>> +         * This function is called on CPU0 before any ITSes have been
>>>> +         * properly initialized. Skip the collection setup in this case,
>>>> +         * it will be done explicitly for CPU0 upon initializing the ITS.
>>>> +         */
>>>
>>> Looking at the code, I don't understand why you need to do that. AFAIU
>>> there is no restriction on initialising the ITS (e.g. calling
>>> gicv3_its_init()) before gicv3_cpu_init().
>>
>> Well, it's a bit more subtle: for initialising the ITS (the collection
>> table entry, more specifically), we need to know the "rdbase", i.e.
>> either the physical address or the logical ID of the redistributor.
>> We only determine those somewhere deep in gicv3_cpu_init().
>> So just moving gicv3_its_init() before gicv3_cpu_init() does not work.
>> I will try and see whether it's worth splitting gicv3_its_init() into
>> a generic and a per-CPU part, though I doubt that would be helpful.
> 
> Looking at the gicv3_its_init function, the only code requiring "rdbase"
> is mapping the collection for CPU0. The rest is CPU-agnostic and should
> only populate the data structures.
> 
> So this does not explain why you need to wait until CPU0 is initialized
> to populate those tables. The following path would be fine:
>     gicv3_its_init();
>         -> Populate BASER
>     gicv3_cpu_init();
>         -> gicv3_its_setup_collection()
>             -> Initialize collection for CPU0

Yeah, I changed it along those lines yesterday:
I moved the gicv3_its_setup_collection() call from the LPI setup path to
just before the gicv3_enable_lpis() call, and moved the gicv3_its_init()
call to directly after gicv3_dist_init(), so before gicv3_cpu_init().
That seems to work and lets us get rid of the extra handling.
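
In other words, the reworked boot path reads roughly like the sketch
below. The function names are the ones discussed in this thread; the
surrounding code and comments are illustrative, not literal code from
the new revision:

    /* Sketch of the GICv3 init ordering described above. */
    static int __init gicv3_init(void)
    {
        gicv3_dist_init();

        /* CPU-agnostic ITS setup: command queues, BASER tables. */
        gicv3_its_init();

        /*
         * Per-CPU init determines this CPU's "rdbase"; only then can
         * the collection for CPU0 be mapped (via
         * gicv3_its_setup_collection()) and LPIs be enabled.
         */
        gicv3_cpu_init();
        gicv3_enable_lpis();

        return 0;
    }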

Cheers,
Andre.
