
Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling

On Fri, 2015-05-15 at 18:08 +0530, Vijay Kilari wrote:
> On Fri, May 15, 2015 at 4:58 PM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> > On Wed, 2015-05-13 at 21:57 +0530, Vijay Kilari wrote:
> >> > * On receipt of an interrupt notification arising from Xen's own use
> >> >   of `INT`; (see discussion under Completion)
> >>
> >>     If the INT notification method is used, then I don't think there is a
> >> need for pITS scheduling on CREADR read.
> >>
> >> As we discussed in patch #13, the steps below should suffice to virtualize
> >> the command queue.
> >>
> >> 1) On each guest CWRITER update, read a batch ('m' commands) of commands,
> >>     translate it and put it on the pITS schedule list. If there are more
> >>     than 'm' commands, create m/n entries in the schedule list. Append an
> >>     INT command for each schedule list entry.
> >
> > How many INT commands do you mean here?
>    One INT command (Xen's completion INT) per batch
> >
> >>      1a) If there is no ongoing command from this vITS on physical queue,
> >>            send to physical queue.
> >>      1b) If there is ongoing command return to guest.
> >> 2) On receiving completion interrupt, update CREADR of guest and post next
> >>     command from schedule list to physical queue.
> >>
> >> With this,
> >>    - There will be no overhead of translating commands in interrupt context,
> >> which is quite heavy because translating an ITS command requires validating
> >> and updating internal ITS structures.
> >
> > Can you give some examples of the heaviest translations please so I can
> > get a feel for actually how expensive we are talking here.
> >
>     For example to translate MAPVI device_ID, event_ID, vID, vCID


> >>    - Always only one request from guest will be posted to physical queue
> >>    - Even if a guest floods with a large number of commands, all the
> >>      commands will be translated and queued in the schedule list and
> >>      posted batch by batch
> >>    - Scheduling pass is called only on CWRITER & completion INT.
> >
> > I think the main difference in what you propose here is that commands
> > are queued in pre-translated form to be injected (cheaply) during
> > scheduling as opposed to being left on the guest queue and translated
> > directly into the pits queue.
> >
> > I think `INT` vs `CREADR` scheduling is largely orthogonal to that.
> >
> > Julien proposed moving scheduling to a softirq, which gets it out of IRQ
> > context (good) but does not necessarily account the translation to the
> > guest, which is a benefit of your approach. (I think things which happen
> > in a softirq are implicitly accounted to current, whoever that may be)
> >
>    one softirq that looks at all the vITSes and posts the commands to pITS?
> or one softirq per vITS?

The former.

However in draft B I proposed that we might need something more like the
latter for accounting purposes, either the actual scheduling pass or a
per-vITS translation pass.

> > On the downside pretranslation adds memory overhead and reintroduces the
> > issue of a potentially long synchronous translation during `CWRITER`
> > handling.
>    Memory that is allocated is freed after completion of that batch.

It is still overhead.

>   The translation duration depends on how many commands the guest
> writes before updating CWRITER.

Xen cannot trust a guest not to write an enormous batch. We need to
think in terms of malicious guest behaviour, i.e. a guest deliberately
trying to subvert or DoS the system; we cannot assume a well-behaved guest.
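
To make the concern concrete, here is a minimal sketch of bounding the work done per scheduling pass, so that a huge guest batch cannot turn into one long synchronous operation. It uses Python purely as executable pseudocode and is not Xen code. The names `VitsQueue`, `next_idx`, `scheduling_pass` and the cap `MAX_CMDS_PER_PASS` are hypothetical, and the queue is modelled as a simple ring of command slots rather than real register offsets.

```python
from dataclasses import dataclass

MAX_CMDS_PER_PASS = 4  # hypothetical cap; a real value would need tuning


@dataclass
class VitsQueue:
    """Toy model of one guest's virtual command ring with `size` slots."""
    size: int
    next_idx: int = 0  # next guest command not yet handed to the pITS
    cwriter: int = 0   # slot index last written by the guest via CWRITER

    def pending(self) -> int:
        # Commands the guest has queued that have not been consumed yet.
        return (self.cwriter - self.next_idx) % self.size


def scheduling_pass(q: VitsQueue, budget: int = MAX_CMDS_PER_PASS) -> int:
    """Consume at most `budget` commands and return how many were taken.

    Anything beyond the budget stays on the guest's own ring for a later
    pass, so the cost of one pass is bounded regardless of guest behaviour.
    """
    n = min(q.pending(), budget)
    q.next_idx = (q.next_idx + n) % q.size
    return n
```

With a 16-slot ring and ten queued commands, successive passes take 4, 4 and 2 commands; no single pass does more than `MAX_CMDS_PER_PASS` worth of work.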

> >> > Possible simplification: If we arrange that no guest ever has multiple
> >> > batches in flight (which can occur if we wrap around the list several
> >> > times) then we may be able to simplify the book keeping
> >> > required. However this may need some careful thought wrt fairness for
> >> > guests submitting frequent small batches of commands vs those sending
> >> > large batches.
> >>
> >>   If one LPI of the dummy device assigned to one VM, then book keeping
> >> per vITS becomes simple
> >
> > What dummy device do you mean? What simplifications does it imply?
> >
>   I mean a fake (non-existent) device to generate the completion INT.
> Using a unique completion INT for every vITS, book keeping would be
> simple. This helps to identify the vITS on receiving a completion INT
> (Completion <=> vITS mapping)

It already seems interesting to find one INT, would finding N (for
potentially large N) be possible?

However given the synchronous nature of things I think one suffices, you
can fairly easily keep the vits on a list in the order they appear on
the ring etc.
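
As a rough illustration of why one INT can suffice, the sketch below keeps the in-flight batches on a FIFO in the order they were placed on the physical ring. It is Python used as executable pseudocode, not Xen code. `PitsTracker`, `submit` and `on_completion_int` are hypothetical names, and the sketch assumes that Xen observes exactly one completion notification per batch and that the physical ring is processed strictly in order.

```python
from collections import deque


class PitsTracker:
    """Toy book keeping for batches in flight on the physical ring."""

    def __init__(self):
        self.in_flight = deque()  # (vits_id, ncmds), oldest batch first
        self.completed = {}       # vits_id -> total commands completed

    def submit(self, vits_id, ncmds):
        # Record a batch placed on the physical ring, followed by Xen's INT.
        self.in_flight.append((vits_id, ncmds))

    def on_completion_int(self):
        # The ring is consumed in order, so the oldest batch is the one
        # that just finished; no per-vITS interrupt is needed to find it.
        vits_id, ncmds = self.in_flight.popleft()
        self.completed[vits_id] = self.completed.get(vits_id, 0) + ncmds
        return vits_id
```

Each completion simply credits the batch at the head of the list to its vITS, which is where the corresponding `vits_cq.creadr` update would happen.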

> >>
> >> >
> >> > ### Completion
> >> >
> >> > It is expected that commands will normally be completed (resulting in
> >> > an update of the corresponding `vits_cq.creadr`) via guest read from
> >> > `CREADR`. This will trigger a scheduling pass which will ensure the
> >> > `vits_cq.creadr` value is up to date before it is returned.
> >> >
> >>     If the guest is reading CREADR to learn of command completion, there
> >> is no need for a scheduling pass if INT is used.
> >
> > We cannot know a priori which scheme a guest is going to use, nor do we
> > have the freedom to mandate a particular scheme, or even that the guest
> > uses the same scheme for every batch of commands.
> >
> > So we need to design a system which works whether all guests use only
> > INT or all guests use only CREADR polling or anything in between.
> >
> > A scheduling pass is not needed on INT injection (either Xen's or the
> > guest's) in order to update `CREADR` (as you suggest), however it may be
> > necessary in order to keep the pITS command queue moving by scheduling
> > any outstanding commands. Consider the case of a guest which receives an
> > INT but does not subsequently read `CREADR` (at all or in a timely
> > manner).
>   Scheduling outstanding commands and updating CREADR
> is always done by Xen's completion INT.
> So even if the guest does not read CREADR it does not matter.
> One corner case I can think of: if the guest is using the INT method to
> learn of command completion and the guest's INT command is received before
> Xen's completion INT arrives, the guest might see an old CREADR.
> To handle this scenario, we can insert Xen's completion INT before the
> guest's INT command.

Or do the processing on guest INT command too, which is in the draft
proposal I think.

