
Re: [Xen-devel] [PATCH v1] x86/mm: Suppresses vm_events caused by page-walks



On 8/28/18 11:13 AM, Jan Beulich wrote:
>>>> On 28.08.18 at 10:04, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 27/08/18 15:08, Jan Beulich wrote:
>>>>>> On 27.08.18 at 15:47, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> On 8/27/18 4:17 PM, Jan Beulich wrote:
>>>>>>>> On 27.08.18 at 15:02, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>> This should be architecturally correct as it is exclusively derived from
>>>>>> information provided by the VMExit, and won't cause dirty bits to be
>>>>>> written in cases where the hardware wouldn't have written them
>>>>>> (speculative or otherwise).  It does mean that an instruction which
>>>>>> would need to set A and D bits in the walk will take two EPT violations
>>>>>> to achieve the end result, but it probably is still quicker than sending
>>>>>> the vm_event out.
>>>>> I'm afraid this is going to be only mostly correct: Atomicity of the page
>>>>> table write is going to be lost. This could become an actual problem if
>>>>> the guest used racing PTE accesses. Such racing accesses might not
>>>>> be a bug - simply consider the OS scanning for set A and/or D bits
>>>>> (and clearing them when they're set). Or an entity temporarily clearing
>>>>> (parts of) PTEs, with recovery logic in place to restore them when
>>>>> needed for a synchronous access. At the very least there's then the
>>>>> risk of a live lock within the guest.
>>>> But it's not clear to me why that can't already happen when just
>>>> emulating the current instruction (as we do now), if emulating said
>>>> instruction would set A or D?
>>> Yes, good point - this is a problem not just to the new handling you
>>> propose.
>>
>> There is no risk of livelock.  The A/D bits we get an EPT-violation on
>> are those which are write protected, so any modification at all will
>> trap.  In particular, an attempt from software to play in weird ways
>> with the pagetable will cause real vm_events which will be sent for
>> auditing.
> 
> Even with multiple views, where only some write-protect the page?

In that scenario it could indeed be problematic, but the feature is
disabled by default even now. We could add a comment above the libxc
enabler function to the effect of "don't use this with combinations of
restricted + unrestricted views", leaving the caller responsible for
calling it only in the proper context.
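
To make that restriction concrete, here is a toy model of the condition
the caller would have to guarantee. It is only a sketch: struct
view_model and views_uniformly_restricted() are made-up names for
illustration, not Xen or libxc interfaces, and the per-view flags merely
stand in for the real per-view access state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT Xen code: per-view state for one guest page-table page. */
struct view_model {
    bool active;          /* the view is in use */
    bool write_protected; /* the page is write-protected in this view */
};

/*
 * Returns true when every active view write-protects the page, i.e. there
 * is no mix of restricted and unrestricted views. Inactive views are
 * ignored. With no active views at all the result is trivially true.
 */
static bool views_uniformly_restricted(const struct view_model *views,
                                       size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (views[i].active && !views[i].write_protected)
            return false;
    }
    return true;
}
```

A caller would only turn the suppression on when such a check holds for
the pages it cares about; with a mixed set of views, a write made through
an unrestricted view would not trap.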

FWIW, our introspection agent uses just one EPT view (even with altp2m,
our main scenario is #VE - so a guest-level optimization). It does
unfortunately need to be view 1 (because whatever we do to view 0
propagates to all other views by design), but that's about the extent of it.


Thanks,
Razvan
