
Re: [Xen-devel] PML (Page Modification Logging) design for Xen




On 02/11/2015 09:13 PM, Jan Beulich wrote:
On 11.02.15 at 12:52, <andrew.cooper3@xxxxxxxxxx> wrote:
On 11/02/15 08:28, Kai Huang wrote:
With PML, we don't have to use write protection but just clear the D-bit
of the EPT entry of guest memory to do dirty logging, with an additional
PML buffer full VMEXIT for every 512 dirty GPAs. Theoretically, this
reduces hypervisor overhead when the guest is in dirty logging mode, so
more CPU cycles can be allocated to the guest, and benchmarks in the
guest are expected to perform better compared to non-PML.
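For reference, the hardware interface being described boils down to roughly the following (a minimal sketch; the constants follow the Intel SDM naming for PML and are written out here for illustration, not taken from an existing Xen patch):

/*
 * PML uses one 4K page per vcpu as a log of dirty GPAs.  The buffer holds
 * 512 8-byte entries; the PML index counts down from 511, and when it
 * would underflow the CPU raises a "PML buffer full" VMEXIT.
 */
#define NR_PML_ENTRIES        512
#define PML_INDEX_INITIAL     (NR_PML_ENTRIES - 1)

/* VMCS fields involved (encodings per the Intel SDM). */
#define PML_ADDRESS           0x0000200eUL  /* 64-bit control: buffer PA */
#define GUEST_PML_INDEX       0x00000812UL  /* 16-bit guest-state field  */

/* Secondary processor-based execution control bit that enables PML. */
#define SECONDARY_EXEC_ENABLE_PML   0x00020000U

/* Exit reason for the "PML buffer full" VMEXIT mentioned above. */
#define EXIT_REASON_PML_FULL  62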
One issue with basic EPT A/D tracking was the scan of the EPT tables.
Here, hardware will give us a list of affected gfns, but how is Xen
supposed to efficiently clear the dirty bits again?  Using EPT
misconfiguration is no better than the existing fault path.
Why not? The misconfiguration exit ought to clear the D bit for all
511 entries in the L1 table (and set it for the one entry that is
currently serving the access). All further D bit handling will then
be PML based.
Indeed, we clear the D-bit in the EPT misconfiguration handler. In my understanding, the sequence is as follows:

1) PML is enabled for the domain.
2) ept_invalidate_emt (or ept_invalidate_emt_range) is called.
3) The guest accesses a specific GPA (which has been invalidated by step 2), and an EPT misconfig VMEXIT is triggered.
4) resolve_misconfig is called, which fixes up the GFN (the above GPA >> 12) to p2m_ram_logdirty and calls ept_p2m_type_to_flags, in which we clear the D-bit of the EPT entry (instead of clearing the W-bit) if the p2m type is p2m_ram_logdirty. Dirty logging of this GFN is then handled by PML.

Steps 2) ~ 4) above are repeated whenever the log-dirty radix tree is cleared.
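To make step 4) concrete, the D-bit-versus-W-bit decision amounts to something like the sketch below. The ept_entry_t layout and the helper are written out for illustration only and are assumptions, not the actual patch:

#include <stdbool.h>
#include <stdint.h>

/* Illustrative EPT leaf layout (bit positions per the Intel SDM). */
typedef union {
    struct {
        uint64_t r:1, w:1, x:1, emt:3, ipat:1, sp:1, a:1, d:1;
    };
    uint64_t epte;
} ept_entry_t;

/* Sketch of what ept_p2m_type_to_flags would do for p2m_ram_logdirty. */
static void set_logdirty_flags(ept_entry_t *e, bool pml_enabled)
{
    e->r = e->x = 1;
    if ( pml_enabled )
    {
        e->w = 1;   /* keep the page writable: writes are reported via PML  */
        e->d = 0;   /* clear D so the next write is actually logged         */
    }
    else
        e->w = 0;   /* no PML: write-protect and log from the EPT violation */
}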


- PML buffer flush

There are two places where we need to flush the PML buffer. The first is
the PML buffer full VMEXIT handler (apparently), and the second is
paging_log_dirty_op (either peek or clean): vcpus run asynchronously
while paging_log_dirty_op is called from userspace via hypercall, so
there may be dirty GPAs logged in vcpus' PML buffers that are not yet
full. Therefore we should flush all vcpus' PML buffers before reporting
dirty GPAs to userspace.
Why apparently?  It would be quite easy for a guest to dirty 512 frames
without otherwise taking a vmexit.
I silently replaced "apparently" with "obviously" while reading...

We handle the above two cases by flushing the PML buffer at the beginning
of all VMEXITs. This solves the first case, and it also solves the
second, as domain_pause is called prior to paging_log_dirty_op; it kicks
vcpus that are in guest mode out of guest mode by sending them an IPI,
which causes a VMEXIT.

This also keeps the log-dirty radix tree more up to date, as the PML
buffer is flushed on every VMEXIT rather than only on the PML buffer full VMEXIT.
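In code terms this is nothing more than a hook at the top of the common exit path, roughly as below; vmx_vcpu_pml_enabled() and vmx_vcpu_flush_pml_buffer() are names assumed for this sketch:

/*
 * Illustrative placement only: drain the PML buffer at the start of every
 * VMEXIT, before the exit reason is dispatched.  This covers both the
 * PML-full exit and the exits induced by domain_pause's IPIs.
 */
static void vmexit_flush_pml(struct vcpu *curr)
{
    if ( vmx_vcpu_pml_enabled(curr) )
        vmx_vcpu_flush_pml_buffer(curr);
}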
My gut feeling is that this is substantial overhead on a common path,
but this largely depends on how the dirty bits can be cleared efficiently.
I agree on the overhead part, but I don't see what relation this has
to the dirty bit clearing - a PML buffer flush doesn't involve any
alterations of D bits.
No, the flush is not related to the dirty bit clearing. The PML buffer flush just does the following (which I should have clarified in my design, sorry):
1) Read out the PML index.
2) Loop over all GPAs logged in the PML buffer according to the PML index, and update them in the log-dirty radix tree.

I agree there's overhead on the VMEXIT common path, but it should not be substantial compared to the overhead of the VMEXIT itself.
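For what it's worth, the flush itself would look roughly like the sketch below, following the two steps above; the function and field names (pml_pg, paging_mark_gfn_dirty, the VMCS accessors, etc.) are assumptions for illustration, not the final code:

/* Sketch: drain one vcpu's PML buffer into the log-dirty radix tree. */
void vmx_vcpu_flush_pml_buffer(struct vcpu *v)
{
    uint64_t *pml_buf;
    unsigned long pml_idx;

    vmx_vmcs_enter(v);

    /* 1) Read out the PML index (it counts down from 511). */
    __vmread(GUEST_PML_INDEX, &pml_idx);

    /* Index still at its initial value: nothing was logged. */
    if ( pml_idx == NR_PML_ENTRIES - 1 )
        goto out;

    pml_buf = __map_domain_page(v->arch.hvm_vmx.pml_pg);

    /*
     * A full buffer leaves the index at 0xffff (decremented below 0);
     * otherwise it points at the next free slot, so valid entries start
     * at pml_idx + 1.
     */
    pml_idx = (pml_idx >= NR_PML_ENTRIES) ? 0 : pml_idx + 1;

    /* 2) Update the log-dirty radix tree with every logged GPA. */
    for ( ; pml_idx < NR_PML_ENTRIES; pml_idx++ )
        paging_mark_gfn_dirty(v->domain, pml_buf[pml_idx] >> PAGE_SHIFT);

    unmap_domain_page(pml_buf);

    /* Reset the index so the whole buffer can be reused. */
    __vmwrite(GUEST_PML_INDEX, NR_PML_ENTRIES - 1);

 out:
    vmx_vmcs_exit(v);
}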

Thanks,
-Kai

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

