Re: [Xen-devel] vmx: VT-d posted-interrupt core logic handling

>>> On 10.03.16 at 06:09, <kevin.tian@xxxxxxxxx> wrote:
> It's always good to have a clear definition of the extent to which a
> performance issue would become a security risk. I saw 200us/500us used as
> examples in this thread, however no one can give accurate criteria. In that
> case, how do we call it a problem even when Feng collected some data? Based
> on the mindset of all maintainers?

I think I've already made clear in previous comments that such
measurements won't lead anywhere. What we need is a
guarantee (by way of enforcement in source code) that the
lists can't grow overly large, compared to the total load placed
on the system.

> I think a good way of looking at this is based on which capability is
> impacted.
> In this specific case the directly impacted metric is the interrupt delivery
> latency. However, today Xen is not RT-capable. Xen doesn't commit to
> delivering a worst-case 10us interrupt latency. The whole interrupt delivery
> path (from Xen into the guest) has not been optimized yet, so there could be
> other reasons impacting latency too, besides the concern about this specific
> list walk. There is no baseline worst-case data w/o PI. There is no final
> goal to hit. There is no test case to measure.
> Then why block this feature due to this unmeasurable concern, and why
> not enable it and then improve it later, when it becomes a measurable
> concern once Xen commits to a clear interrupt latency goal (at that time
> people working on that effort will have to identify all kinds of problems
> impacting interrupt latency and can then optimize them together)?
> People should understand that interrupt latency may be bad in extreme cases
> like those discussed in this thread (w/ or w/o PI), since Xen doesn't commit
> to anything here.

I've never made any reference to this being an interrupt latency
issue; I think it was George who somehow implied this from earlier
comments. Interrupt latency, at least generally, isn't a security
concern (generally because of course latency can get so high that
it might become a concern). All my previous remarks regarding the
issue are solely from the common perspective of long running
operations (which we've been dealing with outside of interrupt
context in a variety of cases, as you may recall). Hence the purely
theoretical basis for some sort of measurement would be to
determine how long a worst case list traversal would take, with
"worst case" being derived from the theoretical limits the
hypervisor implementation so far implies: 128 vCPU-s per domain
(a limit which we sooner or later will need to lift, i.e. taking into
consideration a larger value - like the 8k for PV guests - wouldn't
hurt) times 32k domains per host, totaling 4M possible list entries.
Yes, it is obvious that this limit won't be reachable in practice, but
no, any lower limit can't be guaranteed to be good enough.
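
To make the arithmetic behind that bound explicit, here is a rough
sketch. The function names and the per-entry cost of 100ns are
hypothetical placeholders of mine, not measurements and not anything
taken from the hypervisor; only the 128, 8k and 32k figures come from
the text above.

```python
# Back-of-the-envelope bound on the list length. Illustrative only.

VCPUS_PER_DOMAIN = 128          # per-domain vCPU limit cited above
PV_VCPUS_PER_DOMAIN = 8 * 1024  # the "8k for PV guests" figure
DOMAINS_PER_HOST = 32 * 1024    # "32k domains per host"


def worst_case_entries(vcpus_per_domain=VCPUS_PER_DOMAIN,
                       domains=DOMAINS_PER_HOST):
    """Theoretical upper bound on the number of list entries."""
    return vcpus_per_domain * domains


def traversal_time_ms(entries, ns_per_entry):
    """Time to walk `entries` nodes at an ASSUMED per-node cost."""
    return entries * ns_per_entry / 1e6


if __name__ == "__main__":
    entries = worst_case_entries()
    print(entries)  # 4194304, i.e. 4M
    # 100ns per entry is a made-up number, used only for scale.
    print(traversal_time_ms(entries, ns_per_entry=100), "ms")
    # The same bound with the larger PV vCPU limit plugged in.
    print(worst_case_entries(PV_VCPUS_PER_DOMAIN))
```

Whatever per-entry cost one plugs in, the bound scales linearly with
both limits, which is why lifting the vCPU limit matters here.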

But I'm just now noticing this is the wrong thread to have this
discussion in - George specifically branched off the thread with
the new topic to separate the general discussion from the
specific case of the criteria for default enabling VT-d PI. So let's
please move this back to the other sub-thread (and I've
changed the subject back to express this).

