[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V2] x86/emulate: synchronize LOCKed instruction emulation



>>> On 22.03.17 at 18:22, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
> On 03/21/2017 07:02 PM, Jan Beulich wrote:
>>>>> On 21.03.17 at 17:44, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>> On 03/21/2017 06:29 PM, Jan Beulich wrote:
>>>>>>> On 21.03.17 at 16:38, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>> On 03/15/2017 06:57 PM, Jan Beulich wrote:
>>>>>>>>> On 15.03.17 at 17:46, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>> On 03/15/2017 06:30 PM, Jan Beulich wrote:
>>>>>>>>>>> On 15.03.17 at 17:04, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>> ---
>>>>>>>>> Changes since V1:
>>>>>>>>>  - Added Andrew Cooper's credit, as he's kept the patch current
>>>>>>>>>    througout non-trivial code changes since the initial patch.
>>>>>>>>>  - Significantly more patch testing (with XenServer).
>>>>>>>>>  - Restricted lock scope.
>>>>>>>>
>>>>>>>> Not by much, as it seems. In particular you continue to take the
>>>>>>>> lock even for instructions not accessing memory at all.
>>>>>>>
>>>>>>> I'll take a closer look.
>>>>>>>
>>>>>>>> Also, by "reworked" I did assume you mean converted to at least the
>>>>>>>> cmpxchg based model.
>>>>>>>
>>>>>>> I haven't been able to follow the latest emulator changes closely, could
>>>>>>> you please clarify what you mean by "the cmpxchg model"? Thanks.
>>>>>>
>>>>>> This is unrelated to any recent changes. The idea is to make the
>>>>>> ->cmpxchg() hook actually behave like what its name says. It's
>>>>>> being used for LOCKed insn writeback already, and it could
>>>>>> therefore simply force a retry of the full instruction if the compare
>>>>>> part of it fails. It may need to be given another parameter, to
>>>>>> allow the hook function to tell LOCKed from "normal" uses.
>>>>>
>>>>> I assume this is what you have in mind?
>>>>
>>>> Hmm, not exactly. I'd expect an actual CMPXCHG instruction to
>>>> be used, with a LOCK prefix when the original access had one.
>>>> There should certainly not be any tail call to hvmemul_write()
>>>> anymore.
>>>
>>> There seems to be a misunderstanding here: if I understand this reply
>>> correctly, you'd like hvmemul_cmpxchg() to become a stub that really
>>> runs CMPXCHG - but if that comes with the smp_lock part as well, it
>>> doesn't matter (except for the now-ignored p_old part, i.e. the comparison).
>>>
>>> I've initially asked if I should bring the old XenServer-queued LOCK
>>> patch back, and I've understood that it could serve a purpose, at least
>>> as a temporary workaround until your, and Andrew's emulator changes,
>>> make it unrequired.
>> 
>> Well, yes, as said earlier (still visible in context above) I had
>> originally assumed "reworked" would mean making the cmpxchg
>> hook actually match its name.
>> 
>>> However, it appears that you've understood this to
>>> mean the addition of the CMPXCHG stub (speaking of which, could you
>>> please point out examples of implementing such stubs in the Xen code? I
>>> have never done one of those before.)
>> 
>> There's no talk about stubs here. All I had hoped would have been
>> done _instead_ of the smp-lock approach was to use the CMPXCHG
>> instruction inside the hook function, at least for RAM case (I think
>> we could live with MMIO still being forwarded to the write handler).
>> 
>> So all this isn't to say that the smp-lock approach might not be
>> worthwhile to take on its own, provided the locked region gets
>> shrunk as much as possible without losing correctness. The
>> CMPXCHG model would just not have this locking granularity
>> problem in the first place.
> 
> Sadly, I've now written this (rough) patch:
> 
> http://pastebin.com/3DJ5WYt0 
> 
> only to find that it does not solve our issue. With multiple processors
> per guest and heavy emulation at boot time, the VM got stuck at roughly
> the same point in its life as before the patch. Looks like more than
> CMPXCHG needs synchronizing.
> 
> So it would appear that the smp_lock patch is the only way to bring this
> under control. Either that, or my patch misses something.
> Single-processor guests work just fine.

Well, first of all the code to return RETRY is commented out. I
may guess that you've tried with it not commented out, but I
can't be sure.

And then "stuck" of course can mean many things, so a better
description of the state the VM is in and/or the operations that
it's stuck on would be needed.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.