Xen project Mailing List

[Xen-devel] RE: One (possible) x86 get_user_pages bug

To: 'Jeremy Fitzhardinge' <jeremy@xxxxxxxx>

From: Kaushik Barde <kbarde@xxxxxxxxxx>

Date: Mon, 31 Jan 2011 12:10:04 -0800

Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, 'Kenneth Lee' <liguozhu@xxxxxxxxxx>, 'Peter Zijlstra' <a.p.zijlstra@xxxxxxxxx>, 'Marcelo Tosatti' <mtosatti@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, 'Jan Beulich' <JBeulich@xxxxxxxxxx>, wangzhenguo@xxxxxxxxxx, 'Xiaowei Yang' <xiaowei.yang@xxxxxxxxxx>, 'linqaingmin' <linqiangmin@xxxxxxxxxx>, fanhenglong@xxxxxxxxxx, 'Avi Kivity' <avi@xxxxxxxxxx>, 'Wu Fengguang' <fengguang.wu@xxxxxxxxx>, 'Nick Piggin' <npiggin@xxxxxxxxx>

Delivery-date: Mon, 31 Jan 2011 12:10:43 -0800

Iplanet-smtp-warning: Lines longer than SMTP allows found and truncated.

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Thread-index: AcvBcYgoa/90xZeyTW60Kt7EGFmB7AADqAQA

<< I'm not sure I follow you here. The issue with TLB flush IPIs is that the hypervisor doesn't know the purpose of the IPI and ends up (potentially) waking up a sleeping VCPU just to flush its tlb - but since it was sleeping there were no stale TLB entries to flush.>>

That's what I was trying to understand: what is "sleep" here? Is it ACPI sleep or some internal scheduling state? If vCPUs are asynchronous to the pCPU in terms of ACPI sleep state, then they need to be synced up. That's where the entire ACPI modeling needs to be considered. That's where KVM may not see this issue.

Maybe I am missing something here.

<< A "few hundred uSecs" is really very slow - that's nearly a millisecond. It's worth spending some effort to avoid those kinds of delays.>>

Actually, I just checked: IPIs are usually 1000-1500 cycles long (comparable to a VMEXIT).

My point is that the ideal solution should be one where virtual platform behavior is closer to bare metal (interrupts, memory, CPU state, etc.).

How to do it? Well, that's what needs to be figured out :-)

-Kaushik

-----Original Message-----
From: Jeremy Fitzhardinge [mailto:jeremy@xxxxxxxx]
Sent: Monday, January 31, 2011 10:05 AM
To: Kaushik Barde
Cc: 'Avi Kivity'; 'Jan Beulich'; 'Xiaowei Yang'; 'Nick Piggin'; 'Peter Zijlstra'; fanhenglong@xxxxxxxxxx; 'Kenneth Lee'; 'linqaingmin'; wangzhenguo@xxxxxxxxxx; 'Wu Fengguang'; xen-devel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; 'Marcelo Tosatti'
Subject: Re: One (possible) x86 get_user_pages bug

On 01/30/2011 02:21 PM, Kaushik Barde wrote:
> I agree i.e. deviation from underlying arch consideration is not a good
> idea.
>
> Also, agreed, hypervisor knows which page entries are ready for TLB flush
> across vCPUs.
>
> But, using above knowledge, along with TLB flush based on IPI is a better
> solution. Its ability to synchronize it with pCPU based IPI and TLB flush
> across vCPU. is key.

I'm not sure I follow you here. The issue with TLB flush IPIs is that the hypervisor doesn't know the purpose of the IPI and ends up (potentially) waking up a sleeping VCPU just to flush its tlb - but since it was sleeping there were no stale TLB entries to flush. Xen's TLB flush hypercalls can optimise that case by only sending a real IPI to PCPUs which are actually running target VCPUs. In other cases, where a PCPU is known to have stale entries but it isn't running a relevant VCPU, it can just mark a deferred TLB flush which gets executed before the VCPU runs again.

In other words, Xen can take significant advantage of getting a higher-level call ("flush these TLBs") compared to just a simple IPI.

Are you suggesting that the hypervisor should export some kind of "known dirty TLB" table to the guest, and have the guest work out which VCPUs need IPIs sent to them? How would that work?

> IPIs themselves should be in few hundred uSecs in terms latency. Also, why
> should pCPU be in sleep state for active vCPU scheduled page workload?

A "few hundred uSecs" is really very slow - that's nearly a millisecond. It's worth spending some effort to avoid those kinds of delays.

    J

> -Kaushik
>
> -----Original Message-----
> From: Avi Kivity [mailto:avi@xxxxxxxxxx]
> Sent: Sunday, January 30, 2011 5:02 AM
> To: Jeremy Fitzhardinge
> Cc: Jan Beulich; Xiaowei Yang; Nick Piggin; Peter Zijlstra;
> fanhenglong@xxxxxxxxxx; Kaushik Barde; Kenneth Lee; linqaingmin;
> wangzhenguo@xxxxxxxxxx; Wu Fengguang; xen-devel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; Marcelo Tosatti
> Subject: Re: One (possible) x86 get_user_pages bug
>
> On 01/27/2011 08:27 PM, Jeremy Fitzhardinge wrote:
>> And even just considering virtualization, having non-IPI-based tlb
>> shootdown is a measurable performance win, since a hypervisor can
>> optimise away a cross-VCPU shootdown if it knows no physical TLB
>> contains the target VCPU's entries. I can imagine the KVM folks could
>> get some benefit from that as well.
> It's nice to avoid the IPI (and waking up a cpu if it happens to be
> asleep) but I think the risk of deviating too much from the baremetal
> arch is too large, as demonstrated by this bug.
>
> (well, async page faults is a counterexample, I wonder if/when it will
> bite us)
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
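
[Illustration, not part of the original thread.] The behaviour Jeremy describes for a higher-level "flush these TLBs" call can be sketched as a toy model. This is only a sketch under stated assumptions: it is not Xen's code, and every name in it (ToyHypervisor, schedule, deschedule, flush_tlbs, the counters) is hypothetical. It models three cases: a PCPU running a target VCPU gets a real IPI, a PCPU that may hold a target's entries but is not running it gets a deferred flush that runs when that VCPU is next scheduled there, and a PCPU holding nothing for the targets is left alone.

```python
class ToyHypervisor:
    """Toy model of a 'flush these TLBs' hypercall. Hypothetical names; not Xen's code."""

    def __init__(self, n_pcpus):
        self.running = [None] * n_pcpus                  # VCPU id running on each PCPU, or None
        self.cached = [set() for _ in range(n_pcpus)]    # VCPUs whose entries a PCPU's TLB may hold
        self.deferred = [set() for _ in range(n_pcpus)]  # VCPUs owed a flush on that PCPU
        self.ipis_sent = 0
        self.local_flushes = 0

    def _flush_pcpu(self, p):
        # A flush empties the TLB; only the VCPU running there (if any) refills it.
        self.cached[p] = set() if self.running[p] is None else {self.running[p]}
        self.deferred[p].clear()

    def schedule(self, vcpu, p):
        self.running[p] = vcpu
        if vcpu in self.deferred[p]:
            # Deferred flush runs locally, before the VCPU executes again. No IPI needed.
            self.local_flushes += 1
            self._flush_pcpu(p)
        self.cached[p].add(vcpu)

    def deschedule(self, p):
        # The VCPU goes to sleep; its entries may remain in this PCPU's TLB.
        self.running[p] = None

    def flush_tlbs(self, targets):
        targets = set(targets)
        for p in range(len(self.running)):
            stale = self.cached[p] & targets
            if not stale:
                continue                      # nothing cached for the targets: no work at all
            if self.running[p] in targets:
                self.ipis_sent += 1           # a target is live here: real IPI
                self._flush_pcpu(p)
            else:
                self.deferred[p] |= stale     # not running a target: mark a deferred flush


hv = ToyHypervisor(n_pcpus=3)
hv.schedule(0, 0)        # VCPU 0 is running on PCPU 0
hv.schedule(1, 1)        # VCPU 1 ran on PCPU 1 ...
hv.deschedule(1)         # ... and is now asleep
hv.flush_tlbs({0, 1, 2}) # VCPU 2 has never run anywhere
print(hv.ipis_sent, hv.deferred)
```

In this scenario the guest asks for three VCPUs to be flushed, but only the PCPU running VCPU 0 is interrupted; the sleeping VCPU 1 is not woken, and its flush happens when it is next scheduled on PCPU 1. A plain IPI gives the hypervisor no such information, which is the point made above about the higher-level call.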
