
Re: [Xen-devel] vmx: VT-d posted-interrupt core logic handling

To: Jan Beulich <JBeulich@xxxxxxxx>
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Thu, 10 Mar 2016 08:43:39 +0000
Cc: Lars Kurth <lars.kurth@xxxxxxxxxx>, "Wu, Feng" <feng.wu@xxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Dario Faggioli <dario.faggioli@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, David Vrabel <david.vrabel@xxxxxxxxxx>
Delivery-date: Thu, 10 Mar 2016 08:44:06 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>
Thread-topic: vmx: VT-d posted-interrupt core logic handling

> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Thursday, March 10, 2016 4:07 PM
> 
> >>> On 10.03.16 at 06:09, <kevin.tian@xxxxxxxxx> wrote:
> > It's always good to have a clear definition of the extent to which a
> > performance issue would become a security risk. I saw 200us/500us used as
> > examples in this thread, however no one can give an accurate criterion. In
> > that case, how do we call it a problem even when Feng collected some data?
> > Based on the mindset of all maintainers?
> 
> I think I've already made clear in previous comments that such
> measurements won't lead anywhere. What we need is a
> guarantee (by way of enforcement in source code) that the
> lists can't grow overly large, compared to the total load placed
> on the system.

Thanks for the clarification.

> 
> > I think a good way of looking at this is based on which capability is
> > impacted. In this specific case the directly impacted metric is the
> > interrupt delivery latency. However, today Xen is not RT-capable. Xen
> > doesn't commit to delivering a worst-case 10us interrupt latency. The
> > whole interrupt delivery path (from Xen into the guest) has not been
> > optimized yet, so there could be other reasons impacting latency too,
> > besides the concern about this specific list walk. There is no baseline
> > worst-case data w/o PI. There is no final goal to hit. There is no test
> > case to measure.
> >
> > Then why block this feature due to this unmeasurable concern? Why not
> > enable it and improve it later, when it becomes a measurable concern
> > once Xen commits to a clear interrupt latency goal (at that time people
> > working on that effort will have to identify all kinds of problems
> > impacting interrupt latency and can then optimize them together)?
> > People should understand that interrupt latency may be bad in extreme
> > cases like those discussed in this thread (w/ or w/o PI), since Xen
> > doesn't commit to anything here.
> 
> I've never made any reference to this being an interrupt latency
> issue; I think it was George who somehow implied this from earlier
> comments. Interrupt latency, at least generally, isn't a security
> concern (generally because of course latency can get so high that
> it might become a concern). All my previous remarks regarding the
> issue are solely from the common perspective of long running
> operations (which we've been dealing with outside of interrupt
> context in a variety of cases, as you may recall). Hence the purely

Yes, that concern makes sense.

> theoretical basis for some sort of measurement would be to
> determine how long a worst case list traversal would take. With
> "worst case" being derived from the theoretical limits the
> hypervisor implementation so far implies: 128 vCPU-s per domain
> (a limit which we sooner or later will need to lift, i.e. taking into
> consideration a larger value - like the 8k for PV guests - wouldn't
> hurt) by 32k domains per host, totaling to 4M possible list entries.
> Yes, it is obvious that this limit won't be reachable in practice, but
> no, any lower limit can't be guaranteed to be good enough.

Do you think 4M possible entries is already 'overly large', so that we
must have some enforcement in code, or are experiments still required
to verify that 4M really is a problem (since the total overhead
depends on what we do with each entry)? If the latter, what is the
criterion for calling it a problem (e.g. 200us in total)?
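
To make the question concrete, here is a rough sketch of the arithmetic
behind the 4M figure above. The per-entry cost is a made-up placeholder,
not a measurement, and the function names are only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Limits quoted in this thread. */
#define MAX_VCPUS_PER_DOMAIN  128u    /* current per-domain vCPU limit */
#define MAX_DOMAINS_PER_HOST  32768u  /* "32k domains per host" */

/* Theoretical maximum number of entries on a single list. */
uint64_t worst_case_entries(uint64_t vcpus_per_domain, uint64_t domains)
{
    return vcpus_per_domain * domains;
}

/*
 * Time in microseconds to walk 'entries' entries, given a HYPOTHETICAL
 * per-entry cost in nanoseconds (an assumption, not measured data).
 * The result is rounded down.
 */
uint64_t traversal_us(uint64_t entries, uint64_t ns_per_entry)
{
    /* Guard against overflow of the multiplication below. */
    assert(ns_per_entry == 0 || entries <= UINT64_MAX / ns_per_entry);
    return entries * ns_per_entry / 1000;
}

/*
 * Largest list length that still fits a given time budget, under the
 * same hypothetical per-entry cost (which must be non-zero).
 */
uint64_t max_entries_within_budget(uint64_t budget_us, uint64_t ns_per_entry)
{
    assert(ns_per_entry > 0);
    return budget_us * 1000 / ns_per_entry;
}
```

With an assumed 10ns per entry, 128 * 32768 = 4194304 entries would take
about 42ms to walk, while a 200us budget would allow only 20000 entries.
The real numbers depend entirely on the per-entry work, which is exactly
what I'd like us to agree on how to evaluate.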

There are many linked-list usages in the Xen hypervisor today, each
with a different theoretical maximum number of entries. The closest
one to PI might be the one in tmem (pool->share_list), which is
page based and so could grow 'overly large'. Other examples are
orders of magnitude lower, e.g. s->ioreq_vcpu_list in the ioreq server
(which could be 8K in the above example), and
d->arch.hvm_domain.msixtbl_list in MSI-X virtualization (which could
be 2^11 per the spec). Do we also want to create artificial scenarios
to examine them, since depending on the actual per-entry operation
even K-level entry counts may become a problem?

I just want to figure out how best to handle all the related
linked-list usages in the current hypervisor.
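
As one possible shape for a generic "enforcement in code", here is a
minimal sketch of a list that refuses to grow beyond a fixed limit. It is
only an illustration: the types and helper names are hypothetical, it does
not use Xen's list.h, and it ignores locking entirely:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal singly linked node, NOT Xen's struct list_head. */
struct node {
    struct node *next;
};

/* A list that tracks its length and refuses to exceed 'limit'. */
struct bounded_list {
    struct node *head;
    size_t count;
    size_t limit;
};

void bounded_list_init(struct bounded_list *l, size_t limit)
{
    l->head = NULL;
    l->count = 0;
    l->limit = limit;
}

/* Add at the head; returns false (list unchanged) once the cap is hit. */
bool bounded_list_add(struct bounded_list *l, struct node *n)
{
    if (l->count >= l->limit)
        return false;
    n->next = l->head;
    l->head = n;
    l->count++;
    return true;
}

/* Unlink 'n' if present; returns false if it was not on the list. */
bool bounded_list_del(struct bounded_list *l, struct node *n)
{
    for (struct node **pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            n->next = NULL;
            l->count--;
            return true;
        }
    }
    return false;
}
```

What the caller does when the add fails (pick another list, fall back to a
slower path, etc.) is the hard part and is specific to each usage, so this
only shows where such a check could sit.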

> 
> But I'm just now noticing this is the wrong thread to have this
> discussion in - George specifically branched off the thread with
> the new topic to separate the general discussion from the
> specific case of the criteria for default enabling VT-d PI. So let's
> please move this back to the other sub-thread (and I've
> changed the subject back to express this).
> 

Sorry for cross-posting.

Thanks
Kevin

