Re: [Xen-devel] [for-4.7] x86/emulate: synchronize LOCKed instruction emulation
On 05/04/2016 04:42 PM, Jan Beulich wrote:
>>>> On 04.05.16 at 13:32, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> But while implementing a stub that falls back to the actual LOCK CMPXCHG
>> and replacing hvm_copy_to_guest_virt() with it would indeed be an
>> improvement (with the added advantage of being able to treat
>> non-emulated LOCK CMPXCHG cases), I don't understand how that would
>> solve the read-modify-write atomicity problem.
>>
>> AFAICT, this would only solve the write problem. Assuming we have VCPU1
>> and VCPU2 emulating a LOCKed instruction expecting rmw atomicity, the
>> stub alone would not prevent this:
>>
>> VCPU1: read, modify
>> VCPU2: read, modify, write
>> VCPU1: write
>
> I'm not sure I follow what you mean here: Does the above represent
> what the guest does, or what the hypervisor does as steps to emulate
> a _single_ guest instruction? In the former case, I don't see what
> you're after. And in the latter case I don't understand why you think
> using CMPXCHG instead of WRITE wouldn't help.

Briefly, this is the scenario: assuming a guest with two VCPUs and an
introspection application that has restricted access to a page, the
guest runs two LOCK instructions that touch the page, causing a page
fault for each instruction. This further translates to two EPT fault
vm_events being placed in the ring buffer.

By the time the introspection application polls the event channel, both
VCPUs are paused, waiting for replies to the vm_events. If the
monitoring application processes both events (puts both replies, with
the emulate option on, in the ring buffer), then signals the event
channel, it is possible that both VCPUs get woken up, ending up running
x86_emulate() simultaneously.
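
To make the interleaving quoted above concrete, here is a minimal sketch in plain C. It is not Xen code: emu_state, emu_read(), emu_write() and replay_plain_writes() are hypothetical names, and the emulated instruction is assumed to be a LOCKed increment split into a separate read step and a plain write step. The two VCPUs are replayed sequentially in the problematic order, so no threads are needed.

```c
#include <stdint.h>

/* Hypothetical per-VCPU emulation state: the value fetched by the read step. */
typedef struct {
    uint32_t seen;
} emu_state;

/* Read step of the emulated increment: fetch the current memory value. */
static void emu_read(emu_state *s, const uint32_t *mem)
{
    s->seen = *mem;
}

/* Write step of the emulated increment: a plain store of (value read + 1). */
static void emu_write(const emu_state *s, uint32_t *mem)
{
    *mem = s->seen + 1;
}

/*
 * Replays the order
 *   VCPU1: read
 *   VCPU2: read, write
 *   VCPU1: write
 * and returns the final memory value.
 */
static uint32_t replay_plain_writes(uint32_t initial)
{
    uint32_t mem = initial;
    emu_state vcpu1, vcpu2;

    emu_read(&vcpu1, &mem);   /* VCPU1: read  */
    emu_read(&vcpu2, &mem);   /* VCPU2: read  */
    emu_write(&vcpu2, &mem);  /* VCPU2: write */
    emu_write(&vcpu1, &mem);  /* VCPU1: write, based on a stale read */

    return mem;
}
```

Starting from 0, two atomic increments should leave 2 in memory, but replay_plain_writes(0) returns 1: VCPU1's store overwrites VCPU2's update.
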
In this case, my understanding is that just using CMPXCHG will not work
(although it is clearly superior to the current implementation), because
the read part and the write part of x86_emulate() (when LOCKed
instructions are involved) should be executed atomically, but writing
the CMPXCHG stub would only make sure that two simultaneous writes won't
occur.

In other words, this would still be possible (atomicity would still not
be guaranteed for LOCKed instructions):

VCPU1: read
VCPU2: read, write
VCPU1: write

when what we want for LOCKed instructions is:

VCPU1: read, write
VCPU2: read, write

Am I misunderstanding how x86_emulate() works?

Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
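
For comparison, the following sketch shows what a compare-exchange does in the same "read / read, write / write" order, provided the write step is given the value that the read step saw. Whether x86_emulate() can carry that value into its write path is the open question in this thread; the sketch simply assumes it can. It uses C11 <stdatomic.h> rather than any Xen interface, and emu_write_cmpxchg(), emu_locked_inc() and replay_cmpxchg_writes() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical write step: store (seen + 1) only if memory still holds
 * the value the read step saw. Returns false if memory changed meanwhile.
 */
static bool emu_write_cmpxchg(_Atomic uint32_t *mem, uint32_t seen)
{
    return atomic_compare_exchange_strong(mem, &seen, seen + 1);
}

/* Hypothetical emulated increment: redo the read until the write succeeds. */
static void emu_locked_inc(_Atomic uint32_t *mem)
{
    uint32_t seen;

    do {
        seen = atomic_load(mem);
    } while (!emu_write_cmpxchg(mem, seen));
}

/*
 * Replays the order
 *   VCPU1: read
 *   VCPU2: read, write
 *   VCPU1: write
 * with compare-exchange writes, then lets VCPU1 retry if its write failed.
 * *stale_write_rejected reports whether VCPU1's first write was refused.
 */
static uint32_t replay_cmpxchg_writes(uint32_t initial, bool *stale_write_rejected)
{
    _Atomic uint32_t mem;
    atomic_init(&mem, initial);

    uint32_t vcpu1_seen = atomic_load(&mem);              /* VCPU1: read  */
    uint32_t vcpu2_seen = atomic_load(&mem);              /* VCPU2: read  */
    bool vcpu2_ok = emu_write_cmpxchg(&mem, vcpu2_seen);  /* VCPU2: write */
    bool vcpu1_ok = emu_write_cmpxchg(&mem, vcpu1_seen);  /* VCPU1: write */

    *stale_write_rejected = vcpu2_ok && !vcpu1_ok;
    if (!vcpu1_ok)
        emu_locked_inc(&mem);  /* VCPU1 re-reads and tries again */

    return atomic_load(&mem);
}
```

Under that assumption, VCPU1's first write is refused because memory no longer holds the value it read, and after the retry both increments are reflected in the final value. If the write step does not receive the originally read value, the compare-exchange has nothing to compare against and the interleaving above remains possible.
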