
Re: [Xen-devel] Xen on ARM vITS Handling Draft B (Was Re: Xen/arm: Virtual ITS command queue handling)



On Tue, May 19, 2015 at 7:21 PM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> On Tue, 2015-05-19 at 14:37 +0100, Julien Grall wrote:
>> Hi Ian,
>>
>> On 19/05/15 13:10, Ian Campbell wrote:
>> > On Fri, 2015-05-15 at 15:55 +0100, Julien Grall wrote:
>> > [...]
>> >>> Translation of certain commands can be expensive (XXX citation
>> >>> needed).
>> >>
>> >> The term "expensive" is subjective. I think we can end up to cheap
>> >> translation if we properly pre-allocate information (such as device,
>> >> LPIs...). We can have all the informations before the guest as boot or
>> >> during hotplug part. It wouldn't take more memory than it should use.
>> >>
>> >> During command translation, we would just need to enable the device/LPIs.
>> >>
>> >> The remaining expensive part would be the validation. I think we can
>> >> improve most of them of O(1) (such as collection checking) or O(log(n))
>> >> (such as device checking).
>> > [...]
>> >>> XXX need a solution for this.
>> >>
>> >> Command translation can be improved. It may be good too add a section
>> >> explaining how translation of command foo can be done.
>> >
>> > I think that is covered by the spec, however if there are operations
>> > which form part of this which are potentially expensive we should
>> > outline in our design how this will be dealt with.
>> >
>> > Perhaps you or Vijay could propose some additional text covering:
>> >       * What the potentially expensive operations during a translation
>> >         are.
>> >       * How we are going to deal with those operations, including:
>> >               * What data structure is used
>> >               * What start of day setup is required to enable this
>> >               * What operations are therefore required at translation
>> >                 time
>>
>> I don't have much time to work on a proposal. I would be happy if Vijay
>> do it.
>
> OK, Vijay could you make a proposal here please.

__text__

1) Command translation:
-----------------------------------

 - ITS commands contain the following parameters: Device ID, Event ID
   (vID), Collection ID (vCID) and Target Address (vTA)
 - All these parameters should be validated
 - These parameters should be translated from Virtual to Physical

Of the existing GICv3 ITS commands, MAPC, MAPD and MAPVI/MAPI are the
time-consuming commands, as these commands create entries in the Xen ITS
structures, which are then used to validate other ITS commands.

1.1 MAPC command translation
-----------------------------------------------
   Format: MAPC vCID, vTA

   -  vTA is validated against the Redistributor address by searching the
      Redistributor region / CPU number, based on GITS_TYPER.PAtype, and
      the Physical Collection ID & Physical Target address are retrieved
   -  Each vITS will have a cid_map (struct cid_mapping) which holds the
      mapping of Virtual Collection ID, Virtual Target address and
      Physical Collection ID.
   -  The physical ITS command MAPC pCID, pTA is generated

   There is no allocation overhead here: the cid_map entries (approx 32
   entries) are preallocated when the vITS is created.
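
A rough sketch of what a preallocated cid_map could look like is below.
This is only an illustration: the field names, the VITS_MAX_CID limit of
32 (taken from the "approx 32 entries" above) and both helper functions
are assumptions, and the table is indexed directly by vCID rather than
searched, which is one possible way to make the lookup O(1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VITS_MAX_CID 32  /* assumption: "approx 32 entries" */

/* Hypothetical layout; not the actual Xen structure. */
struct cid_mapping {
    bool     valid;  /* set once the guest has issued MAPC for this vCID */
    uint64_t vta;    /* virtual target address supplied by the guest */
    uint16_t pcid;   /* physical collection ID */
    uint64_t pta;    /* physical target address */
};

struct vits {
    /* Preallocated with the vITS, indexed by vCID. */
    struct cid_mapping cid_map[VITS_MAX_CID];
};

/* Record the result of a validated MAPC. Returns 0, or -1 on a bad vCID. */
static int vits_map_collection(struct vits *v, uint16_t vcid, uint64_t vta,
                               uint16_t pcid, uint64_t pta)
{
    assert(v != NULL);
    if (vcid >= VITS_MAX_CID)
        return -1;
    v->cid_map[vcid].valid = true;
    v->cid_map[vcid].vta = vta;
    v->cid_map[vcid].pcid = pcid;
    v->cid_map[vcid].pta = pta;
    return 0;
}

/* vCID -> (pCID, pTA) lookup used when translating later commands. */
static int vits_lookup_collection(const struct vits *v, uint16_t vcid,
                                  uint16_t *pcid, uint64_t *pta)
{
    assert(v != NULL && pcid != NULL && pta != NULL);
    if (vcid >= VITS_MAX_CID || !v->cid_map[vcid].valid)
        return -1;
    *pcid = v->cid_map[vcid].pcid;
    *pta = v->cid_map[vcid].pta;
    return 0;
}
```

With this layout no memory is allocated at MAPC time; the command only
fills in a slot that already exists.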

1.2 MAPD Command translation:
-----------------------------------------------
   Format: MAPD device, ITT IPA, ITT Size

   MAPD is sent with the Validation bit set if a device needs to be added,
   and with it clear when the device is removed

If Validation bit is set:
   - Allocate memory for its_device struct
   - Validate ITT IPA & ITT size and update its_device struct
   - Find the number of vectors (nrvecs) for this device by querying a PCI
     helper function
   - Allocate nrvecs LPIs
   - Allocate memory for struct vlpi_map for this device. This vlpi_map
     holds the mapping of Virtual LPI to Physical LPI and ID.
   - Find the physical ITS node to which this device is assigned

   - Call p2m_lookup on the ITT IPA and get the physical ITT address
   - Validate ITT Size
   - Generate/format physical ITS command: MAPD, ITT PA, ITT Size

   Here the overhead is in the memory allocation for its_device and vlpi_map

If Validation bit is not set:
    - Validate that the device exists by checking the vITS device list
    - Clear all vlpis assigned for this device
    - Remove this device from vITS list
    - Free memory
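
To make the allocation cost concrete, here is a minimal sketch of the two
MAPD paths. It is an illustration under assumptions: the struct fields,
the singly linked device list and the function names are hypothetical,
libc calloc/free stand in for Xen's allocators, and p2m_lookup, the PCI
helper and the physical command generation are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-device state; not the actual Xen structure. */
struct its_device {
    uint32_t devid;
    uint64_t itt_pa;     /* physical ITT address, after translation */
    uint32_t itt_size;
    uint32_t nrvecs;
    uint32_t *vlpi_map;  /* nrvecs slots, vLPI -> pLPI; 0 = unmapped */
    struct its_device *next;
};

struct vits_devs {
    struct its_device *head;  /* vITS device list */
};

static struct its_device *vits_find_device(struct vits_devs *v, uint32_t devid)
{
    struct its_device *dev;

    for (dev = v->head; dev != NULL; dev = dev->next)
        if (dev->devid == devid)
            return dev;
    return NULL;
}

/* MAPD with Validation bit set: both allocations happen here. */
static int vits_mapd_add(struct vits_devs *v, uint32_t devid, uint64_t itt_pa,
                         uint32_t itt_size, uint32_t nrvecs)
{
    struct its_device *dev;

    assert(v != NULL);
    if (nrvecs == 0 || vits_find_device(v, devid) != NULL)
        return -1;
    dev = calloc(1, sizeof(*dev));
    if (dev == NULL)
        return -1;
    dev->vlpi_map = calloc(nrvecs, sizeof(*dev->vlpi_map));
    if (dev->vlpi_map == NULL) {
        free(dev);
        return -1;
    }
    dev->devid = devid;
    dev->itt_pa = itt_pa;
    dev->itt_size = itt_size;
    dev->nrvecs = nrvecs;
    dev->next = v->head;
    v->head = dev;
    return 0;
}

/* MAPD with Validation bit clear: unlink the device and free its memory. */
static int vits_mapd_remove(struct vits_devs *v, uint32_t devid)
{
    struct its_device **pp;

    assert(v != NULL);
    for (pp = &v->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->devid == devid) {
            struct its_device *dev = *pp;

            *pp = dev->next;
            free(dev->vlpi_map);
            free(dev);
            return 0;
        }
    }
    return -1;
}
```

If its_device and vlpi_map were instead set up when the device is
assigned to the guest, the add path would reduce to a lookup plus
filling in the ITT fields.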

1.3 MAPVI/MAPI Command translation:
-----------------------------------------------
   Format: MAPVI device, ID, vID, vCID

- Validate that the device exists by checking the vITS device list
- Validate vCID and get pCID by searching cid_map
- Check whether vID has an entry in vlpi_entries of this device.
  If not, allot a pID from the vlpi_map of this device and update
  vlpi_entries with the new pID
- Allocate irq descriptor and add it to the RB tree
- Call route_irq_to_guest() for this pID
- Generate/format physical ITS command: MAPVI device ID, pID, pCID

Here the overhead is in allotting the physical ID, allocating memory for
the irq descriptor and routing the interrupt
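
The vID to pID step could look roughly like the sketch below. All names
here are hypothetical, the per-device limit NRVECS is fixed at 4 only to
keep the example small, and it assumes the device's physical LPIs were
reserved as one contiguous range at MAPD time. The irq descriptor
allocation and the route_irq_to_guest() call are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NRVECS 4  /* assumption: small fixed size for illustration */

struct vlpi_entry {
    uint32_t vid;  /* virtual ID from the guest's MAPVI */
    uint32_t pid;  /* physical LPI allotted to it */
};

/* Hypothetical per-device table; not the actual Xen structure. */
struct dev_lpis {
    uint32_t plpi_base;  /* first physical LPI reserved for the device */
    uint32_t nr_used;    /* number of entries currently allotted */
    struct vlpi_entry entries[NRVECS];
};

/*
 * Return the pID mapped to vid, allotting the next free physical LPI if
 * vid has no entry yet. Returns 0 when the device has no LPIs left
 * (0 is never a valid LPI, since LPI INTIDs start at 8192).
 */
static uint32_t vlpi_get_or_allot(struct dev_lpis *d, uint32_t vid)
{
    uint32_t i;

    assert(d != NULL);
    for (i = 0; i < d->nr_used; i++)
        if (d->entries[i].vid == vid)
            return d->entries[i].pid;

    if (d->nr_used == NRVECS)
        return 0;

    d->entries[d->nr_used].vid = vid;
    d->entries[d->nr_used].pid = d->plpi_base + d->nr_used;
    return d->entries[d->nr_used++].pid;
}
```

A repeated MAPVI for the same vID returns the existing pID, so only the
first mapping of each vID pays for the allotment.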

All other ITS commands, like MOVI, DISCARD, INV, INVALL, INT, CLEAR and
SYNC, are just validated and the physical command is generated

__text__

We can discuss and add how to reduce translation time.

>>
>> >>  I think
>> >> that limiting the number of batch/command sent per pass would allow a
>> >> small pass.
>> >
>> > I think we have a few choices:
>> >
>> >       * Limit to one batch per vits at a time
>> >       * Limit to some total number of batches per scheduling pass
>> >       * Time bound the scheduling procedure
>> >
>> > Do we have a preference?
>>
>> Time bound may be difficult to implement.
>
> Yes, I don't think that one is realistic.
>
>>  I think we would have to limit
>> batch per vITS (for code simplification) and total number of batch per
>> scheduling pass at the same time.
>
> OK.
>
>> >>>   the underlying hardware to the guest.
>> >>> * Adds complexity to the guest layout, which is right now static. How
>> >>>   do you decide the number of vITS/root controller exposed:
>> >>>     * Hotplug is tricky
>> >>> * Toolstack needs greater knowledge of the host layout
>> >>> * Given that PCI passthrough doesn't allow migration, maybe we could
>> >>>   use the layout of the hardware.
>> >>>
>> >>> In 1 vITS for all pITS:
>> >>>
>> >>> * What to do with global commands? Inject to all pITS and then
>> >>>   synchronise on them all finishing.
>> >>> * Handling of out of order completion of commands queued with
>> >>>   different pITS, since the vITS must appear to complete in
>> >>>   order. Apart from the book keeping question it makes scheduling more
>> >>>   interesting:
>> >>>     * What if you have a pITS with slots available, and the guest command
>> >>>       queue contains commands which could go to the pITS, but behind ones
>> >>>       which are targetting another pITS which has no slots
>> >>>     * What if one pITS is very busy and another is mostly idle and a
>> >>>       guest submits one command to the busy one (contending with other
>> >>>       guest) followed by a load of commands targeting the idle one. Those
>> >>>       commands would be held up in this situation.
>> >>>     * Reasoning about fairness may be harder.
>> >>>
>> >>> XXX need a solution/decision here.
>> >>
>> >>> In addition the introduction of direct interrupt injection in version
>> >>> 4 GICs may imply a vITS per pITS. (Update: it seems not)
>> >>
>> >> Other items to add: NUMA and I/O NUMA. I don't know much about it but I
>> >> think the first solution would be more suitable.
>> >
>> > first solution == ?
>>
>> 1 vITS per pITS.
>
> Ah, yes.
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

