
Re: [Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation



On 03/23/2017 03:23 PM, Jan Beulich wrote:
>>>> On 23.03.17 at 11:21, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>> Sadly, I've now written this (rough) patch:
>>>>>>>
>>>>>>> http://pastebin.com/3DJ5WYt0 
>>>>>>>
>>>>>>> only to find that it does not solve our issue. With multiple processors
>>>>>>> per guest and heavy emulation at boot time, the VM got stuck at roughly
>>>>>>> the same point in its life as before the patch. Looks like more than
>>>>>>> CMPXCHG needs synchronizing.
>>>>>>>
>>>>>>> So it would appear that the smp_lock patch is the only way to bring this
>>>>>>> under control. Either that, or my patch misses something.
>>>>>>> Single-processor guests work just fine.
>>>>>>
>>>>>> Well, first of all the code to return RETRY is commented out. I
>>>>>> may guess that you've tried with it not commented out, but I
>>>>>> can't be sure.
>>>>>
>>>>> Indeed I have tried with it on, with the condition for success being at
>>>>> first "val != old", and then, as it is in the pasted code, "val == new".
>>>>> Both of these caused BSODs and random crashes. The guest only boots
>>>>> without any apparent issues (and with 1 VCPU only) after I've stopped
>>>>> returning RETRY.
>>>>
>>>> "val == new" is clearly wrong: Whether to retry depends on whether
>>>> the cmpxchg was unsuccessful.
>>>
>>> You're right, it looks like using "val == old" as a test for success
>>> (which is of course the only correct test) and return RETRY on fail
>>> works (though I'm still seeing a BSOD very seldom, I'll need to test it
>>> carefully and see what's going on).
>>
>> I've made a race more probable by resuming the VCPU only after
>> processing all the events in the ring buffer and now all my Windows 7
>> 32-bit guests BSOD with IRQL_NOT_LESS_OR_EQUAL every time at boot with
>> the updated patch.
>>
>> Changing the guests' configuration to only use 1 VCPU again solves the
>> issue, and also resuming the VCPU after treating each vm_event makes the
>> BSOD much less likely to appear (hence the illusion that the problem is
>> fixed).
> 
> Well, I can only repeat: We need to understand what it is that
> goes wrong.
> 
>> As for an instruction's crossing a page boundary, I'm not sure how that
>> would affect things here - we're simply emulating whatever instructions
>> cause page faults in interesting pages at boot, we're not yet preventing
>> writes at this point. And if it were such an issue, would it have been
>> fixed (and it is) by the smp_lock version of the patch?
> 
> Yes, it likely would have been: The problem with your code is that
> you map just a single page even when the original access crosses
> a page boundary, yet then access the returned pointer as if you
> had mapped two adjacent pages.
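
Just to make sure I'm reading you right, the pattern you mean would be
roughly this (illustration only, with hypothetical names, not the
actual patch code):

/* Map a single guest page... (map_one_guest_page() is a placeholder) */
uint8_t *map = map_one_guest_page(v, gpa >> PAGE_SHIFT);
unsigned long offset = gpa & (PAGE_SIZE - 1);

/*
 * ...but then access it as if two adjacent pages had been mapped:
 * if offset + bytes > PAGE_SIZE, this write runs off the end of the
 * single mapped page.
 */
memcpy(map + offset, p_new, bytes);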

I've printed out all the offsets in the new cmpxchg hook, and the
largest one I've seen across many guest runs was 4064, with a size of 8,
which would not cause the write to go beyond the page. So this is not
what's causing the issue.
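
To spell out the check I'm applying to those numbers (a sketch,
assuming 4K pages):

/* Does an access of 'bytes' bytes at page offset 'offset' spill into
 * the next page?  Assumes PAGE_SIZE == 4096. */
bool crosses_page = (offset + bytes) > PAGE_SIZE;

/* Worst case seen above: offset 4064, size 8 -> 4072 <= 4096, so the
 * write stays within the mapped page. */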

Since it works fine with 1 VCPU per guest, I think this also validates
the new implementation of cmpxchg: if it were wrong, the guest would
have run into trouble soon. This is borne out by my before-and-after
printk()s:

(XEN) offset: 156, size: 4
(XEN) offset: 1708, size: 4
(XEN) - [0] mem: 0, old: 0, new: 1, lock: 1
(XEN) - [1] mem: 3, old: 3, new: 4, lock: 1
(XEN) + [0] mem: 1, old: 0, new: 1, val: 0
(XEN) + [1] mem: 4, old: 3, new: 4, val: 3
(XEN) returning X86EMUL_OKAY
(XEN) returning X86EMUL_OKAY
(XEN) offset: 1708, size: 4
(XEN) - [1] mem: 4, old: 4, new: 3, lock: 1
(XEN) offset: 204, size: 4
(XEN) + [1] mem: 3, old: 4, new: 3, val: 4
(XEN) - [0] mem: 3539, old: 3539, new: 353a, lock: 1
(XEN) returning X86EMUL_OKAY
(XEN) + [0] mem: 353a, old: 3539, new: 353a, val: 3539
(XEN) returning X86EMUL_OKAY
(XEN) offset: 1708, size: 4
(XEN) - [1] mem: 3, old: 3, new: 4, lock: 1
(XEN) + [1] mem: 4, old: 3, new: 4, val: 3
(XEN) returning X86EMUL_OKAY
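
For reference, the success/failure logic we've been discussing boils
down to something like this (a rough sketch only, with the mapping and
locking details omitted; read_guest()/write_guest() are hypothetical
placeholders, the X86EMUL_* values are the usual emulator return codes):

static int cmpxchg_sketch(void *map, unsigned long old, unsigned long new,
                          unsigned int bytes)
{
    unsigned long val = read_guest(map, bytes);  /* current value in memory */

    if ( val != old )
        return X86EMUL_RETRY;   /* compare failed: nothing written, retry */

    /* In the real hook the compare + store must be one atomic cmpxchg. */
    write_guest(map, new, bytes);
    return X86EMUL_OKAY;
}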

With 2 VCPUs, the guest can't seem to make up its mind: it either BSODs
with IRQL_NOT_LESS_OR_EQUAL (when I return RETRY on old != val), BSODs
with BAD_POOL_HEADER (when I always return OKAY), or simply gets stuck.
All of these happen at boot, fairly quickly (my Windows 7 32-bit guest
doesn't get beyond the "Starting Windows" screen).

The last time it got stuck, I collected this info with successive runs
of xenctx:

# ./xenctx -a 6
cs:eip: 0008:82a4209b
flags: 00200046 cid z p
ss:esp: 0010:8273f9bc
eax: 00000000   ebx: 00000000   ecx: 82782480   edx: 000003fd
esi: 80ba0020   edi: 00000000   ebp: 8273fa08
 ds:     0023    es:     0023    fs:     0030    gs:     0000

cr0: 8001003b
cr2: 8d486408
cr3: 00185000
cr4: 000406f9

dr0: 00000000
dr1: 00000000
dr2: 00000000
dr3: 00000000
dr6: fffe0ff0
dr7: 00000400
Code (instr addr 82a4209b)
14 a3 a2 82 cc cc cc cc cc cc cc cc cc cc 33 c0 8b 54 24 04 ec <c2> 04
00 8b ff 33 c0 8b 54 24 04

# ./xenctx -a 6
cs:eip: 0008:8c5065d6
flags: 00200246 cid i z p
ss:esp: 0010:8273fb9c
eax: 00000000   ebx: 8d662338   ecx: 850865f0   edx: 00000000
esi: 40008000   edi: 82742d20   ebp: 8273fc20
 ds:     0023    es:     0023    fs:     0030    gs:     0000

cr0: 8001003b
cr2: 8d486408
cr3: 00185000
cr4: 000406f9

dr0: 00000000
dr1: 00000000
dr2: 00000000
dr3: 00000000
dr6: fffe0ff0
dr7: 00000400
Code (instr addr 8c5065d6)
47 fc 83 c7 14 4e 75 ef 5f 5e c3 cc cc cc cc cc cc 8b ff fb f4 <c3> cc
cc cc cc cc 8b ff 55 8b ec

However, this is likely irrelevant, since by now we're past the trigger
event. I'm not sure where to go from here.


Thanks,
Razvan
