[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [for-4.7] x86/emulate: synchronize LOCKed instruction emulation
On 04/27/2016 10:14 AM, Razvan Cojocaru wrote: > On 04/27/2016 09:22 AM, Jan Beulich wrote: >>>>> On 26.04.16 at 19:23, <rcojocaru@xxxxxxxxxxxxxxx> wrote: >>> On 04/26/16 19:03, George Dunlap wrote: >>>> On 19/04/16 17:35, Jan Beulich wrote: >>>>>>>> Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx> 04/19/16 1:01 PM >>> >>>>>> I think this might be because the LOCK prefix should guarantee that the >>>>>> instruction that follows it has exclusive use of shared memory (for both >>>>>> reads and writes) but I might be misreading the docs: >>>>> >>>>> LOCK definitely has no effect on other than the instruction it gets >>>>> applied >>>>> to. >>>> >>>> Sorry I wasn't involved in this discussion -- what was the conclusion here? >>>> >>>> FWIW Andy's suggestion of using a stub seemed like the most robust >>>> solution, if that could be made to work. >>>> >>>> If you're going to submit a patch substantially similar to this one, let >>>> me know so I can review the mm bits of the original patch. >>> >>> I'm not really sure. >>> >>> Regarding this version of the patch, Jan has asked for more information >>> on the performance impact, but I'm not sure how to obtain it in a >>> rigorous manner. >> >> That was only one half, which Andrew has now answered. The >> other was the not understood issue with a later variant you had >> tried. > > Right, there's the fact that this version (with read / write locking the > whole function) works whereas just synchonizing hvmemul_cmpxchg() with a > spin lock does not. It is not yet fully clear why (the part of the > conversation at the very top of this email was an early attempt to > elucidate it). I've kept experimenting with the patch but can't quite figure out why minimizing the lock scope to the writeback part would not be sufficient, but it isn't. I.e. with this code: 3824 writeback: 3825 ops->smp_lock(lock_prefix); 3826 switch ( dst.type ) 3827 { 3828 case OP_REG: 3829 /* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */ 3830 switch ( dst.bytes ) 3831 { 3832 case 1: *(uint8_t *)dst.reg = (uint8_t)dst.val; break; 3833 case 2: *(uint16_t *)dst.reg = (uint16_t)dst.val; break; 3834 case 4: *dst.reg = (uint32_t)dst.val; break; /* 64b: zero-ext */ 3835 case 8: *dst.reg = dst.val; break; 3836 } 3837 break; 3838 case OP_MEM: 3839 if ( !(d & Mov) && (dst.orig_val == dst.val) && 3840 !ctxt->force_writeback ) 3841 /* nothing to do */; 3842 else if ( lock_prefix ) 3843 rc = ops->cmpxchg( 3844 dst.mem.seg, dst.mem.off, &dst.orig_val, 3845 &dst.val, dst.bytes, ctxt); 3846 else 3847 rc = ops->write( 3848 dst.mem.seg, dst.mem.off, &dst.val, dst.bytes, ctxt); 3849 if ( rc != 0 ) 3850 { 3851 ops->smp_unlock(lock_prefix); 3852 goto done; 3853 } 3854 default: 3855 break; 3856 } 3857 ops->smp_unlock(lock_prefix); I can still reproduce the guest hang. But if I lock at the very beginning of x86_emulate() and unlock before each return, no more hangs. I would appreciate any suggestions on how to go about trying to narrow the scope of the lock, or figure out what subtlety is at work here. Thanks, Razvan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |