Re: [Xen-devel] Xen/arm: Virtual ITS command queue handling
On Fri, May 15, 2015 at 4:29 PM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> On Wed, 2015-05-13 at 15:26 +0100, Julien Grall wrote:
>> >>> on that vits;
>> >>> * On receipt of an interrupt notification arising from Xen's own use
>> >>>   of `INT`; (see discussion under Completion)
>> >>> * On any interrupt injection arising from a guest's use of the `INT`
>> >>>   command; (XXX perhaps, see discussion under Completion)
>> >>
>> >> With all the solutions suggested, it is very likely that we will try
>> >> to execute multiple scheduling passes at the same time.
>> >>
>> >> One way is to wait until the previous pass has finished. But that
>> >> would mean that the scheduler would be executed very often.
>> >>
>> >> Or maybe you plan to offload the scheduler to a softirq?
>> >
>> > Good point.
>> >
>> > A softirq might be one solution, but it is problematic during emulation
>> > of `CREADR`, when we would like to do a pass immediately to complete any
>> > operations outstanding for the domain doing the read.
>> >
>> > Or just using spin_trylock and not bothering if a pass is already in
>> > progress might be another. But that has similar problems.
>> >
>> > Or we could defer only scheduling from `INT` (either the guest's or
>> > Xen's own) to a softirq but do the ones from `CREADR` emulation
>> > synchronously? The softirq would be run on return from the interrupt
>> > handler, but multiple such would be coalesced, I think?
>>
>> I think we could defer the scheduling to a softirq for CREADR too, if
>> the guest is using:
>>   - INT completion: vits.creadr would have been correctly updated when
>>     receiving the INT in Xen.
>>   - polling completion: the guest will loop on CREADR. It will likely get
>>     the info on the next read. The drawback is that the guest may lose a
>>     few instruction cycles.
>>
>> Overall, I don't think it's necessary to have an accurate CREADR.
>
> Yes, deferring the update by one exit+enter might be tolerable. I added
> after this list:
>
>   This may result in lots of contention on the scheduler locking.
>   Therefore we consider that in each case all which happens is the
>   triggering of a softirq which will be processed on return to guest,
>   and just once even for multiple events. This is considered OK for the
>   `CREADR` case because at worst the value read will be one cycle out of
>   date.
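For illustration, a minimal sketch of how that softirq-based deferral could
be wired up on top of Xen's existing open_softirq()/raise_softirq()
interface. VITS_SCHED_SOFTIRQ (which would need to be a new entry in the
softirq enum), vits_do_scheduling_pass() and vits_sched_lock are names
invented for this example, not existing code:

    /* Sketch only: all vITS-specific names here are hypothetical. */
    #include <xen/softirq.h>
    #include <xen/spinlock.h>

    void vits_do_scheduling_pass(void);          /* hypothetical: the pass described above */

    static DEFINE_SPINLOCK(vits_sched_lock);

    /* Runs on return to guest.  Several raise_softirq() calls issued
     * before it runs are coalesced into a single invocation. */
    static void vits_sched_softirq(void)
    {
        spin_lock(&vits_sched_lock);
        vits_do_scheduling_pass();
        spin_unlock(&vits_sched_lock);
    }

    /* Called from INT completion and from CREADR emulation instead of
     * doing the pass synchronously. */
    void vits_kick_scheduler(void)
    {
        raise_softirq(VITS_SCHED_SOFTIRQ);
    }

    void vits_sched_init(void)
    {
        open_softirq(VITS_SCHED_SOFTIRQ, vits_sched_softirq);
    }

However many INTs or CREADR reads occur before the softirq runs, only one
pass happens on the way back into the guest, which matches the "at worst
one cycle out of date" behaviour described above.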
>> [..]
>>
>> >> AFAIU the process suggested, Xen will inject small batches as long as
>> >> the physical command queue is not full.
>> >>
>> >> Let's take a simple case, where only a single domain is using vITS on
>> >> the platform. If it injects a huge number of commands, Xen will split
>> >> it into lots of small batches. All batches will be injected in the
>> >> same pass as long as they fit in the physical command queue. Am I
>> >> correct?
>> >
>> > That's how it is currently written, yes. With the "possible
>> > simplification" above the answer is no, only a batch at a time would be
>> > written for each guest.
>> >
>> > BTW, it doesn't have to be a single guest, the sum total of the
>> > injections across all guests could also take a similar amount of time.
>> > Is that a concern?
>>
>> Yes, the example with only a guest was easier to explain.
>
> So as well as limiting the number of commands in each domain's batch we
> also want to limit the total number of batches?
>
>> >> I think we have to restrict the total number of batches (i.e. for all
>> >> the domains) injected in the same scheduling pass.
>> >>
>> >> I would even tend to allow only one in-flight batch per domain. That
>> >> would limit the possible problem I pointed out.
>> >
>> > This is the "possible simplification" I think. Since it simplifies
>> > other things (I think) as well as addressing this issue I think it
>> > might be a good idea.
>>
>> With the limit on the number of commands sent per batch, would the
>> fairness you were talking about in the design doc still be required?
>
> I think we still want to schedule the guests in a strict round robin
> manner, to avoid one guest monopolising things.
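To make the "one in-flight batch per domain, strict round robin" idea
concrete, here is a rough sketch of what such a pass could look like. The
data layout, VITS_BATCH_MAX, and the helpers vits_queue_one_batch() and
vits_has_work() are invented for the example; only the list primitives are
Xen's:

    /* Sketch only: struct layout and the vits_* helpers are hypothetical. */
    #include <xen/list.h>
    #include <xen/types.h>

    #define VITS_BATCH_MAX  16          /* assumed per-batch command limit */

    struct vits {
        struct list_head sched_link;    /* entry on the scheduler's list */
        bool batch_in_flight;           /* at most one batch per domain */
        /* ... per-vITS command queue state ... */
    };

    struct pits_sched {
        struct list_head pending;       /* vITSs with commands to schedule */
        unsigned int free_slots;        /* free slots in the physical queue */
    };

    /* Hypothetical helpers, named for the example only. */
    unsigned int vits_queue_one_batch(struct vits *v, unsigned int max);
    bool vits_has_work(const struct vits *v);

    static void vits_scheduling_pass(struct pits_sched *s)
    {
        struct list_head done = LIST_HEAD_INIT(done);
        struct vits *v;

        /* Strict round robin: take vITSs from the head of the pending list
         * and give each at most one batch, stopping when the physical queue
         * is full.  Serviced vITSs are parked on a local list so nobody is
         * visited twice in one pass. */
        while ( s->free_slots >= VITS_BATCH_MAX && !list_empty(&s->pending) )
        {
            v = list_entry(s->pending.next, struct vits, sched_link);
            list_del(&v->sched_link);

            if ( !v->batch_in_flight )
            {
                s->free_slots -= vits_queue_one_batch(v, VITS_BATCH_MAX);
                v->batch_in_flight = true;
            }

            if ( vits_has_work(v) )
                list_add_tail(&v->sched_link, &done);
        }

        /* Whoever still has work goes to the back of the queue for the next
         * pass, so no guest can monopolise the physical command queue. */
        while ( !list_empty(&done) )
        {
            v = list_entry(done.next, struct vits, sched_link);
            list_del(&v->sched_link);
            list_add_tail(&v->sched_link, &s->pending);
        }
    }

Each vITS gets at most one batch per pass and is then rotated to the back
of the queue, so a guest flooding its command queue cannot starve others.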
>> >>> Therefore it is proposed that the restriction that a single vITS maps
>> >>> to one pITS be retained. If a guest requires access to devices
>> >>> associated with multiple pITSs then multiple vITS should be
>> >>> configured.
>> >>
>> >> Having multiple vITS per domain brings other issues:
>> >>   - How do you know the number of ITSs to describe in the device tree
>> >>     at boot?
>> >
>> > I'm not sure. I don't think 1 vs N is very different from the question
>> > of 0 vs 1 though, somehow the tools need to know about the pITS setup.
>>
>> I don't see why the tools would need to know the pITS setup.
>
> Even with only a single vITS the tools need to know if the system has 0,
> 1, or more pITSs, to know whether to create a vITS at all or not.
>
>> >>   - How do you tell the guest that the PCI device is mapped to a
>> >>     specific vITS?
>> >
>> > Device Tree or IORT, just like on native and just like we'd have to
>> > tell the guest about that mapping even if there was a single vITS.
>>
>> Right, although the root controller can only be attached to one ITS.
>>
>> It will be necessary to have multiple root controllers in the guest in
>> the case where we pass through devices using different ITSs.
>>
>> Is pci-back able to expose multiple root controllers?
>
> In principle the xenstore protocol supports it, but AFAIK all toolstacks
> have only ever used "bus" 0, so I wouldn't be surprised if there were
> bugs lurking.
>
> But we could fix those, I don't think it is a requirement that this
> stuff suddenly springs into life on ARM even with existing kernels.
>
>> > I think the complexity of having one vITS target multiple pITSs is
>> > going to be quite high in terms of data structures and the amount of
>> > thinking/tracking scheduler code will have to do, mostly down to out of
>> > order completion of things put in the pITS queue.
>>
>> I understand the complexity, but exposing one vITS per pITS means that
>> we are exposing the underlying hardware to the guest.
>
> Some aspect of it, yes, but it is still a virtual ITS.
>
>> That brings a lot of complexity into the guest layout, which is right
>> now static. How do you decide the number of vITSs/root controllers
>> exposed (think about PCI hotplug)?
>>
>> Given that PCI passthrough doesn't allow migration, maybe we could use
>> the layout of the hardware.
>
> That's an option.
>
>> If we are going to expose multiple vITS to the guest, we should only use
>> vITS for guests using PCI passthrough. This is because migration won't
>> be compatible with it.
>
> It would be possible to support one s/w-only vITS for migration, i.e. the
> evtchn thing at the end, but for the general case that is correct. On
> x86 I believe that if you hot unplug all passthrough devices you can
> migrate and then plug in other devices at the other end.
>
> Anyway, more generally there are certainly problems with multiple vITS.
> However there are also problems with a single vITS feeding multiple
> pITSs:
>
>   * What to do with global commands? Inject to all pITS and then
>     synchronise on them all finishing.
>   * Handling of out of order completion of commands queued with
>     different pITS, since the vITS must appear to complete in order.
>     Apart from the bookkeeping question it makes scheduling more
>     interesting:
>       * What if you have a pITS with slots available, and the guest
>         command queue contains commands which could go to that pITS,
>         but behind ones which are targeting another pITS which has no
>         slots?
>       * What if one pITS is very busy and another is mostly idle, and a
>         guest submits one command to the busy one (contending with other
>         guests) followed by a load of commands targeting the idle one?
>         Those commands would be held up in this situation.
>       * Reasoning about fairness may be harder.
>
> I've put both your list and mine into the next revision of the document.
> I think this remains an important open question.

Handling of a single vITS and multiple pITSs can be made simple.

All ITS commands except SYNC & INVALL carry a device id, which tells us
which pITS each command should be sent to. SYNC & INVALL from the guest
can be dropped by Xen, and Xen can append SYNC & INVALL wherever they are
required (e.g. the Linux driver adds SYNC for the commands that need it).

With this assumption, all ITS commands are mapped to a pITS and there is
no need for synchronization across pITSs.
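To illustrate the suggestion, a rough sketch of such per-command routing.
The opcode values are the GICv3 ITS ones; struct pits and
its_get_pits_for_devid() are placeholders, and it assumes, as stated above,
that everything other than SYNC/INVALL carries a device id:

    /* Sketch only: routing a guest command to a pITS by device id. */
    #include <xen/types.h>

    #define ITS_CMD_SYNC    0x05
    #define ITS_CMD_INVALL  0x0d

    struct its_cmd_block {
        uint64_t raw[4];                /* one 32-byte ITS command */
    };

    struct domain;
    struct pits;                        /* hypothetical per-pITS state */
    struct pits *its_get_pits_for_devid(struct domain *d, uint32_t devid);

    static uint8_t its_cmd_get_type(const struct its_cmd_block *cmd)
    {
        return cmd->raw[0] & 0xff;      /* DW0 bits [7:0]: command type */
    }

    static uint32_t its_cmd_get_devid(const struct its_cmd_block *cmd)
    {
        return cmd->raw[0] >> 32;       /* DW0 bits [63:32]: DeviceID */
    }

    /*
     * Returns the pITS a guest command should be queued on, or NULL if the
     * command is dropped here (Xen appends its own SYNC/INVALL when it
     * needs to know that a batch has completed on a given pITS).
     */
    static struct pits *vits_route_command(struct domain *d,
                                           const struct its_cmd_block *cmd)
    {
        switch ( its_cmd_get_type(cmd) )
        {
        case ITS_CMD_SYNC:
        case ITS_CMD_INVALL:
            return NULL;
        default:
            /* Per the suggestion above, everything else carries a device
             * id identifying the pITS behind which the device sits. */
            return its_get_pits_for_devid(d, its_cmd_get_devid(cmd));
        }
    }

Dropped SYNC/INVALL would then be re-issued by Xen per pITS at batch
boundaries, which is where it needs the completion guarantee anyway.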