
Re: [Xen-devel] [XTF PATCH] XSA-186: Work around suspected Broadwell TLB erratum



>>> On 28.10.16 at 14:39, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 28/10/16 13:03, Jan Beulich wrote:
>>>>> On 28.10.16 at 12:36, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>> (Maybe you want to drop the ...
>>
>>> --- a/tests/xsa-186/main.c
>>> +++ b/tests/xsa-186/main.c
>>> @@ -144,6 +144,29 @@ void test_main(void)
>>>      memcpy(stub, insn_buf_start, insn_buf_end - insn_buf_start);
>>>  
>>>      /*
>>> +     * Work around suspected Broadwell TLB Erratum
>>> +     *
>>> +     * Occasionally, this test fails with:
>>> +     *
>>> +     *   --- Xen Test Framework ---
>>> +     *   Environment: HVM 64bit (Long mode 4 levels)
>>> +     *   XSA-186 PoC
>>> +     *   ******************************
>>> +     *   PANIC: Unhandled exception at 0008:fffffffffffffffa
>>> +     *   Vec 14 #PF[-I-sr-] %cr2 fffffffffffffffa
>>> +     *   ******************************
>>> +     *
>>> +     * on Broadwell hardware.  The mapping is definitely present as the
>>> +     * memcpy() has already succeeded.  Inserting an invlpg resolves the
>>> +     * issue, sugguesting that there is a race condition between dTLB/iTLB
>> ... stray u which slipped into "suggesting".)
>>
>> Btw - would you mind trying something else: Instead of the INVLPG,
>> put a CPUID or some other serializing instruction in here. ISTR that
>> for self modifying code this is required, i.e. the CPU could have been
>> fetching instructions ahead of the memcpy(), and nothing would be
>> there to force it to drop what it has already executed speculatively,
>> including the exception token.
> 
> That is an interesting point, but still doesn't explain the symptoms. 
> If the icache wasn't flushed, we might get junk instructions and a #UD/#GP.

No. As the processor speculates the call, it won't be able to fetch
the target instruction and hence would insert an exception token
into the queue. There would be junk instruction bytes only if there
was a prior mapping for that page, but aiui a mapping for that
address gets established exactly once.

> However, in this case the fault is for an instruction fetch from a
> non-present page, not a failure to execute what it found there.
> 
> I expect a cpuid instruction would resolve the issue, but it also forces
> a vmexit which complicates the microarchitectural interactions here. 
> Something else, like executing an int3 will also serialise the pipeline,
> but not vmexit.  I will try and find some time to experiment.

You're in ring 0, aren't you? That gives you plenty of serializing
instructions which don't directly interact with the TLBs. An LLDT
with a zero selector might be the one with least side effects. And
in case you're not in ring 0, make up an interrupt frame and
execute an IRET.

Jan

