|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 4/5] x86/alternative: Implement NMI/#MC-safe patching
>>> On 01.02.18 at 16:20, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 31/01/18 11:00, Jan Beulich wrote:
>>>>> On 30.01.18 at 20:26, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 30/01/18 08:39, Jan Beulich wrote:
>>>>>>> On 29.01.18 at 16:38, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>> + /*
>>>>> + * We are the CPU performing the patching, and might have ended up
>>>>> here by
>>>>> + * hitting a breakpoint.
>>>>> + *
>>>>> + * Either way, we need to complete particular patch to make forwards
>>>>> + * progress. This logic is safe even if executed recursively in the
>>>>> + * breakpoint handler; the worst that will happen when normal
>>>>> execution
>>>>> + * resumes is that we will rewrite the same bytes a second time.
>>>>> + */
>>>>> +
>>>>> + /* First, insert a breakpoint to prevent execution of the patch
>>>>> site. */
>>>>> + i->addr[0] = 0xcc;
>>>>> + smp_wmb();
>>>> This is necessary, but not sufficient when replacing more than a
>>>> single insn: The other CPU may be executing instructions _after_
>>>> the initial one that is being replaced, and ...
>>>>
>>>>> + /* Second, copy the remaining instructions into place. */
>>>>> + memcpy(i->addr + 1, i->opcode + 1, i->len - 1);
>>>> ... this may be altering things underneath its feet.
>>> Hmm.
>>>
>>> It is completely impossible to recover if another CPU hits the middle of
>>> this memcpy(). It is impossible to synchronise a specific write against
>>> the remote CPU pipelines.
>> It is not completely impossible, but the solution would be awkward
>> to use: If the code doing the patching knew instruction boundaries,
>> it could put breakpoints onto all of them. Or maybe it isn't all that
>> bad to use - we have an insn decoder after all.
>
> An instruction decoder doesn't help. Yes - it allows us to avoid
> executing a spliced instruction, but we can't recover/rewind state for a
> cpu which hits one of these latter breakpoints.
Oh, true. At least doing so would be very difficult.
>>> While we can arrange and test full NMI safety, we cannot test #MC safety
>>> at all. Any code to try and implement #MC safety is going to be
>>> complicated, and make things worse if an #MC does hit.
>>>
>>> Therefore, I agree with the Linux view that trying to do anything for
>>> #MC safety is going to be worse than doing nothing at all.
>> I agree. But as said before - the immediate goal ought to be to
>> make alternatives patching safe, and that doesn't require any
>> of these considerations.
>
> ? Of course it requires these considerations.
I don't understand, but see below.
> If we ignore the livepatching side of things (which includes an
> alternatives pass), then we can make the boot-time alternatives pass NMI
> safe by explicitly choosing to patch in NMI context before we've booted
> the APs.
>
> Arranging for boot time alternatives to be NMI-safe is easy. Arranging
> for livepatching to be NMI-safe is also fairly easy.
>
> Arranging for anything, *even at boot time* to be #MC safe, is very
> complicated, and will be a large amount of code we cannot test. Hence
> why I'm leaning towards the Linux solution for the #MC problem.
I could live with this.
>>>>> @@ -153,7 +231,31 @@ void init_or_livepatch add_nops(void *insns,
>>>>> unsigned int len)
>>>>> void init_or_livepatch noinline
>>>>> text_poke(void *addr, const void *opcode, size_t len, bool live)
>>>>> {
>>>>> - memcpy(addr, opcode, len);
>>>>> + if ( !live || len == 1 )
>>>>> + {
>>>>> + /*
>>>>> + * If we know *addr can't be executed, or we are patching a
>>>>> single
>>>>> + * byte, it is safe to use a straight memcpy().
>>>>> + */
>>>>> + memcpy(addr, opcode, len);
>>>> Is it really worth special casing this? Whether to actually ack
>>>> patches 2 and 3 depends on that.
>>> Yes, and even more so if we are going to use an NMI rendezvous. We
>>> definitely don't want to have an NMI rendezvous while applying
>>> alternatives as part of livepatch preparation.
>> Well, again - live patching should be the second step here.
>
> You keep on saying this, and I can only conclude that you are confused
> as to which actions happen when.
>
> At boot time, we do a single alternatives pass on the live .text section.
>
> At livepatch load time, we do an alternatives pass on the incoming
> .text, but this is safe because the text definitely isn't being executed.
>
> At livepatch apply time, we quiesce the system and do a patching pass on
> the live .text, most of which are `jmp disp32`.
>
>
> We therefore have two alternatives passes (one safe, one unsafe), and
> one (unsafe) livepatch pass.
I don't think I'm confused in any way. The approach you've chosen
in the patch - to complete the patching inside the NMI or #MC
handler - is entirely sufficient to deal with NMI and - unless there
are recursive ones - #MC. You don't need NMI context to do NMI
safe patching when only a single CPU is up.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |