
Re: [Xen-devel] [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option

On Wed, Jul 29, 2015 at 5:29 PM, Andrew Cooper
<andrew.cooper3@xxxxxxxxxx> wrote:
> On 30/07/2015 00:13, Andy Lutomirski wrote:
>> On Wed, Jul 29, 2015 at 4:02 PM, Andrew Cooper
>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 29/07/2015 23:49, Boris Ostrovsky wrote:
>>>> On 07/29/2015 06:46 PM, David Vrabel wrote:
>>>>> On 29/07/2015 23:11, Andrew Cooper wrote:
>>>>>> On 29/07/2015 23:05, Andy Lutomirski wrote:
>>>>>>> On Wed, Jul 29, 2015 at 2:37 PM, Andrew Cooper
>>>>>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>>> On 29/07/2015 22:26, Andy Lutomirski wrote:
>>>>>>>>> On Wed, Jul 29, 2015 at 2:23 PM, Boris Ostrovsky
>>>>>>>>> <boris.ostrovsky@xxxxxxxxxx> wrote:
>>>>>>>>>> On 07/29/2015 03:03 PM, Andrew Cooper wrote:
>>>>>>>>>>> On 29/07/15 15:43, Boris Ostrovsky wrote:
>>>>>>>>>>>> FYI, I have got a repro now and am investigating.
>>>>>>>>>>> Good and bad news.  This bug has nothing to do with LDTs themselves.
>>>>>>>>>>> I have worked out what is going on, but this:
>>>>>>>>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>>>>>>>>> index 5abeaac..7e1a82e 100644
>>>>>>>>>>> --- a/arch/x86/xen/enlighten.c
>>>>>>>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>>>>>>>> @@ -493,6 +493,7 @@ static void set_aliased_prot(void *v, pgprot_t prot)
>>>>>>>>>>>         pte = pfn_pte(pfn, prot);
>>>>>>>>>>> +       (void)*(volatile int*)v;
>>>>>>>>>>>         if (HYPERVISOR_update_va_mapping((unsigned long)v, pte, 0)) {
>>>>>>>>>>>                 pr_err("set_aliased_prot va update failed w/ lazy mode %u\n", paravirt_get_lazy_mode());
>>>>>>>>>>>                 BUG();
>>>>>>>>>>> Is perhaps not the fix we are looking for, and every use of
>>>>>>>>>>> HYPERVISOR_update_va_mapping() is susceptible to the same problem.
>>>>>>>>>> I think in most cases we know that page is mapped so hopefully this is
>>>>>>>>>> the only site that we need to be careful about.
>>>>>>>>> Is there any chance we can get some kind of quick-and-dirty fix that
>>>>>>>>> can go to x86/urgent in the next few days even if a clean fix isn't
>>>>>>>>> available yet?
>>>>>>>> Quick and dirty?
>>>>>>>> Reading from v is the most obvious and quick way, for areas where we are
>>>>>>>> certain v exists, is kernel memory and is expected to have a backing
>>>>>>>> page.  I don't know offhand how many of current
>>>>>>>> HYPERVISOR_update_va_mapping() callsites this applies to.
>>>>>>> __get_user((char *)v, tmp), perhaps, unless there's something better
>>>>>>> in the wings.  Keep in mind that we need this for -stable, and it's
>>>>>>> likely to get backported quite quickly due to CVE-2015-5157.
>>>>>> Hmm - something like that tucked inside HYPERVISOR_update_va_mapping()
>>>>>> would probably work, and certainly be minimal hassle for -stable.
>>>>>> Altering the hypercall used is certainly not something to backport, nor
>>>>>> are we sure it is a viable fix at this time.
>>>>> Changing this one use of update_va_mapping to use mmu_update_normal_pt
>>>>> is the correct fix to unblock this LDT series.  I see no reason why this
>>>>> cannot be backported.
>>>> To properly fix it should include batching and that is not something
>>>> that I think we should target for stable.
>>> Batching is absolutely not necessary to alter update_va_mapping to
>>> mmu_update_normal_pt.  After all, update_va_mapping isn't batched.
>>> However this isn't the first issue we have had with lazy mmu faulting,
>>> and I doubt it is the last.  There are not many callsites of
>>> update_va_mapping - I will audit them tomorrow and see if any similar
>>> issues are lurking elsewhere.
>> One thing I should add: nothing flushes old aliases in xen_alloc_ldt,
>> yet I haven't been able to get xen_alloc_ldt to fail or subsequent LDT
>> access to fault.  Is this something we should be worried about?
> Yes.  update_va_mapping() will function perfectly well taking one RW
> mapping to RO even if there is a second RW mapping.  In such a case, the
> next LDT access will fault.

Which is a problem because that alias might still exist, and also
because Linux really doesn't expect that fault.
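
To make the aliasing hazard concrete, here is a toy userspace model in C.
It is not Xen or Linux code, and every name in it is hypothetical.  It
assumes that a frame can only become an LDT page once no writable mapping
of it remains, which is my reading of the behaviour described above.

```c
#include <stdbool.h>

/* Hypothetical model of one machine frame and the mappings that point at it. */
struct frame {
    int writable_mappings;  /* number of RW mappings of this frame */
    int readonly_mappings;  /* number of RO mappings of this frame */
};

/* Model of remapping ONE alias from RW to RO; any other alias is untouched. */
void make_one_mapping_readonly(struct frame *f)
{
    if (f->writable_mappings > 0) {
        f->writable_mappings--;
        f->readonly_mappings++;
    }
}

/* Assumed rule: the frame may be used as an LDT page only with no RW mapping left. */
bool can_become_ldt_page(const struct frame *f)
{
    return f->writable_mappings == 0;
}
```

With two RW mappings, a single make_one_mapping_readonly() call succeeds
but can_become_ldt_page() still returns false, which corresponds to the
"update succeeds, later LDT access faults" case.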

> On closer inspection, Xen is rather unhelpful with the fault.  Xen's
> lazy #PF will be bounced back to the guest with cr2 adjusted to appear
> in the range passed to set_ldt().  The error code however will be
> unmodified (and limited only by not-user and not-reserved), so will
> appear as a non-present read or write supervisor access to an address
> which the kernel has a valid read mapping of.

More yuck.
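
For anyone trying to recognise that fault, the error code described above
can be checked roughly like this.  The bit positions are the architectural
x86 page-fault error code bits; the macro and function names are made up
for illustration and do not come from Linux or Xen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits (names are illustrative). */
#define PF_PRESENT (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE   (1u << 1)  /* 0: read access, 1: write access */
#define PF_USER    (1u << 2)  /* 0: supervisor access, 1: user access */
#define PF_RSVD    (1u << 3)  /* 1: reserved bit set in a paging entry */

/*
 * Hypothetical helper: true if the error code has the shape described
 * above, i.e. a non-present, supervisor, non-reserved fault.  The
 * read/write bit may be either value, so it is not examined.
 */
bool looks_like_bounced_ldt_fault(uint32_t error_code)
{
    return (error_code & (PF_PRESENT | PF_USER | PF_RSVD)) == 0;
}
```

This only matches the error code.  On its own it cannot tell such a fault
apart from any other non-present supervisor fault, which is part of why
the situation is unhelpful.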

I think I'm just going to stick an unconditional vm_flush_aliases in alloc_ldt.

> Therefore, set_ldt() needs to be confident that there are no writeable
> mappings to the frames used to make up the LDT.  It could proactively
> fault them in by accessing one descriptor in each page inside the limit,
> but by the time a fault is received it is probably too late to work out
> where the other mapping is which prevented the typechange (or indeed,
> whether Xen objected to one of the descriptors instead).

This seems like overkill.
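
For reference, the proactive approach would amount to something like the
userspace sketch below: read one descriptor in each page that the limit
covers.  The function name is hypothetical, and the sketch assumes 4 KiB
pages, 8-byte descriptors and a page-aligned LDT.

```c
#include <stddef.h>
#include <stdint.h>

#define LDT_ENTRY_SIZE  8u     /* an x86 segment descriptor is 8 bytes */
#define PAGE_SIZE_BYTES 4096u  /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: read the first byte of the first descriptor in
 * every page covered by `entries` descriptors, so that each page is
 * accessed once.  Returns the number of pages touched.
 */
size_t touch_ldt_pages(const void *ldt, size_t entries)
{
    const volatile uint8_t *base = ldt;
    size_t bytes = entries * LDT_ENTRY_SIZE;
    size_t pages = 0;

    for (size_t off = 0; off < bytes; off += PAGE_SIZE_BYTES) {
        (void)base[off];  /* volatile read, so the access is not optimised away */
        pages++;
    }
    return pages;
}
```

Even with this in place, a fault taken during one of those reads would
still not say which other mapping blocked the type change.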

I'm still a bit confused, though: the failure is in xen_free_ldt.  How
do we make it all the way to xen_free_ldt without the vmapped page
existing in the guest's page tables?  After all, we had to survive
xen_alloc_ldt first, and ISTM that should fail in exactly the same way.

Anyway, I'll send v6.

