RE: [Xen-ia64-devel] PATCH: slightly improve stability
Hi Tristan,

Could you please check whether this patch addresses the RSE issue?
Yes, the Intel QA team is running the test in the meantime.

Thanks,
-Anthony

>-----Original Message-----
>From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Xu, Anthony
>Sent: 2006-04-28 9:48
>To: Tristan Gingold; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan (HP
>Labs Fort Collins); Alex Williamson
>Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>
>>From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Tristan
>>Gingold
>>Sent: 2006-04-27 23:14
>>To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan (HP Labs Fort
>>Collins); Alex Williamson
>>Subject: [Xen-ia64-devel] PATCH: slightly improve stability
>>
>>Hi,
>>
>>as reported earlier, this patch seems to improve stability: crashes are at
>>least more coherent and maybe less frequent.
>>
>>RSE handling seems to have a bug: crashes are now due to either a bad value in
>>a stacked register or a use of an invalid stacked register (although cfm
>>seems correct in gdb!)
>
>I'm looking at this too.
>Yes, there is a bug in handle_lazy_cover.
>
>void ia64_do_page_fault (unsigned long address, unsigned long isr, struct
>pt_regs *regs, unsigned long itir)
>{
>    unsigned long iip = regs->cr_iip, iha;
>    // FIXME should validate address here
>    unsigned long pteval;
>    unsigned long is_data = !((isr >> IA64_ISR_X_BIT) & 1UL);
>    IA64FAULT fault;
>
>    if ((isr & IA64_ISR_IR) && handle_lazy_cover(current, isr, regs))
>        return;
>
>This code sequence is intended to handle the following scenario.
>
>1. The Guest executes br.ret. This may cause a mandatory RSE load, and this
>load may cause a TLB miss.
>2. The VMM gets control but can't handle this TLB miss itself, so it injects
>the TLB miss into the Guest TLB miss handler. When the VMM executes "rfi" to
>jump to the Guest TLB miss handler, the same TLB miss happens again.
>3. At this time, interrupt_collection_enabled is 0, so handle_lazy_cover
>executes "cover" on behalf of the Guest and returns to the Guest TLB miss
>handler again. This time there is no TLB miss.
>
>
>The following code sequence is in the ia64_leave_kernel path, with psr.ic and
>psr.i off. When br.ret.dptk.many b0 is executed, there may be a mandatory
>load, and thus there may be a TLB miss. According to the description above,
>handle_lazy_cover executes "cover" on behalf of the Guest and returns to the
>Guest. This is not correct in this scenario.
>
>I didn't find an easy way to fix this bug.
>
>
>    mov loc6=0
>    mov loc7=0
>(pRecurse) br.call.dptk.few b0=rse_clear_invalid
>    ;;
>    mov loc8=0
>    mov loc9=0
>    cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret
>    mov loc10=0
>    mov loc11=0
>(pReturn) br.ret.dptk.many b0
>#endif /* !CONFIG_ITANIUM */
># undef pRecurse
># undef pReturn
>    ;;
>    alloc r17=ar.pfs,0,0,0,0 // drop current register frame
>    ;;
>    loadrs
>
>Thanks,
>Anthony
>
>
>>
>>Tested by doing many linux kernel compilation in SMP domU (> 100).
>>
>>Tristan.
>
>_______________________________________________
>Xen-ia64-devel mailing list
>Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>http://lists.xensource.com/xen-ia64-devel

Attachment: rse.patch
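
The ambiguity described in the quoted message can be sketched with a small C model. This is only an illustration under stated assumptions, not the Xen code: `toy_vcpu`, `toy_fault`, `toy_lazy_cover` and the helper functions are hypothetical names, "cover" is reduced to setting a flag, and the psr.ic state in the ia64_leave_kernel case is assumed to be visible to the check as the same `interrupt_collection_enabled == 0` value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified state; not the real Xen vcpu layout. */
struct toy_vcpu {
    bool interrupt_collection_enabled; /* guest's virtual psr.ic */
    bool cover_emulated;               /* set when "cover" is done for the guest */
};

/* Hypothetical stand-in for the (isr & IA64_ISR_IR) test. */
struct toy_fault {
    bool from_rse; /* fault raised by an RSE access */
};

/* Returns true when the fault is absorbed by emulating "cover". */
static bool toy_lazy_cover(struct toy_vcpu *v, const struct toy_fault *f)
{
    assert(v != NULL && f != NULL);
    if (f->from_rse && !v->interrupt_collection_enabled) {
        v->cover_emulated = true;
        return true;
    }
    return false;
}

/* Convenience wrapper: build the toy state and run the check once. */
static bool toy_fault_absorbed(bool from_rse, bool ic_enabled)
{
    struct toy_vcpu v = { .interrupt_collection_enabled = ic_enabled,
                          .cover_emulated = false };
    struct toy_fault f = { .from_rse = from_rse };
    return toy_lazy_cover(&v, &f);
}

/*
 * Case A: rfi into the guest TLB miss handler (steps 1-3 of the mail).
 * Case B: br.ret in the ia64_leave_kernel path, psr.ic also off.
 * Both present the same inputs, so the check answers the same way.
 */
static bool toy_cases_indistinguishable(void)
{
    bool case_a = toy_fault_absorbed(true, false);
    bool case_b = toy_fault_absorbed(true, false);
    return case_a && case_b;
}
```

In this model `toy_cases_indistinguishable()` returns true: the check sees only "RSE fault" and "interrupt collection off", so it emulates "cover" in the leave-kernel case as well, which is the behaviour the message calls incorrect. Telling the two cases apart would need some extra input the mail does not identify.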