[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation



On 03/23/2017 10:19 AM, Jan Beulich wrote:
>>>> On 22.03.17 at 18:22, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> On 03/21/2017 07:02 PM, Jan Beulich wrote:
>>>>>> On 21.03.17 at 17:44, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> On 03/21/2017 06:29 PM, Jan Beulich wrote:
>>>>>>>> On 21.03.17 at 16:38, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>> On 03/15/2017 06:57 PM, Jan Beulich wrote:
>>>>>>>>>> On 15.03.17 at 17:46, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> On 03/15/2017 06:30 PM, Jan Beulich wrote:
>>>>>>>>>>>> On 15.03.17 at 17:04, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>> ---
>>>>>>>>>> Changes since V1:
>>>>>>>>>>  - Added Andrew Cooper's credit, as he's kept the patch current
>>>>>>>>>>    througout non-trivial code changes since the initial patch.
>>>>>>>>>>  - Significantly more patch testing (with XenServer).
>>>>>>>>>>  - Restricted lock scope.
>>>>>>>>>
>>>>>>>>> Not by much, as it seems. In particular you continue to take the
>>>>>>>>> lock even for instructions not accessing memory at all.
>>>>>>>>
>>>>>>>> I'll take a closer look.
>>>>>>>>
>>>>>>>>> Also, by "reworked" I did assume you mean converted to at least the
>>>>>>>>> cmpxchg based model.
>>>>>>>>
>>>>>>>> I haven't been able to follow the latest emulator changes closely, 
>>>>>>>> could
>>>>>>>> you please clarify what you mean by "the cmpxchg model"? Thanks.
>>>>>>>
>>>>>>> This is unrelated to any recent changes. The idea is to make the
>>>>>>> ->cmpxchg() hook actually behave like what its name says. It's
>>>>>>> being used for LOCKed insn writeback already, and it could
>>>>>>> therefore simply force a retry of the full instruction if the compare
>>>>>>> part of it fails. It may need to be given another parameter, to
>>>>>>> allow the hook function to tell LOCKed from "normal" uses.
>>>>>>
>>>>>> I assume this is what you have in mind?
>>>>>
>>>>> Hmm, not exactly. I'd expect an actual CMPXCHG instruction to
>>>>> be used, with a LOCK prefix when the original access had one.
>>>>> There should certainly not be any tail call to hvmemul_write()
>>>>> anymore.
>>>>
>>>> There seems to be a misunderstanding here: if I understand this reply
>>>> correctly, you'd like hvmemul_cmpxchg() to become a stub that really
>>>> runs CMPXCHG - but if that comes with the smp_lock part as well, it
>>>> doesn't matter (except for the now-ignored p_old part, i.e. the 
>>>> comparison).
>>>>
>>>> I've initially asked if I should bring the old XenServer-queued LOCK
>>>> patch back, and I've understood that it could serve a purpose, at least
>>>> as a temporary workaround until your, and Andrew's emulator changes,
>>>> make it unrequired.
>>>
>>> Well, yes, as said earlier (still visible in context above) I had
>>> originally assumed "reworked" would mean making the cmpxchg
>>> hook actually match its name.
>>>
>>>> However, it appears that you've understood this to
>>>> mean the addition of the CMPXCHG stub (speaking of which, could you
>>>> please point out examples of implementing such stubs in the Xen code? I
>>>> have never done one of those before.)
>>>
>>> There's no talk about stubs here. All I had hoped would have been
>>> done _instead_ of the smp-lock approach was to use the CMPXCHG
>>> instruction inside the hook function, at least for RAM case (I think
>>> we could live with MMIO still being forwarded to the write handler).
>>>
>>> So all this isn't to say that the smp-lock approach might not be
>>> worthwhile to take on its own, provided the locked region gets
>>> shrunk as much as possible without losing correctness. The
>>> CMPXCHG model would just not have this locking granularity
>>> problem in the first place.
>>
>> Sadly, I've now written this (rough) patch:
>>
>> http://pastebin.com/3DJ5WYt0 
>>
>> only to find that it does not solve our issue. With multiple processors
>> per guest and heavy emulation at boot time, the VM got stuck at roughly
>> the same point in its life as before the patch. Looks like more than
>> CMPXCHG needs synchronizing.
>>
>> So it would appear that the smp_lock patch is the only way to bring this
>> under control. Either that, or my patch misses something.
>> Single-processor guests work just fine.
> 
> Well, first of all the code to return RETRY is commented out. I
> may guess that you've tried with it not commented out, but I
> can't be sure.

Indeed I have tried with it on, with the condition for success being at
first "val != old", and then, as it is in the pasted code, "val == new".
Both of these caused BSODs and random crashes. The guest only boots
without any apparent issues (and with 1 VCPU only) after I've stopped
returning RETRY.

> And then "stuck" of course can mean many things, so a better
> description of the state the VM is in and/or the operations that
> it's stuck on would be needed.

I agree, but on a first look it would appear that the hang is just like
the one without the patch, when multiple processors are involved. This
also is in line with my previous observation that when only cmpxchg() is
locked, the problem still occurs (you may remember that initially I've
tried to only synchronize the cmpxchg() call in x86_emulate()).

I'll try to give the patch a few more spins and compare the guest stuck
state with the stuck state without the patch.


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.