[Xen-devel] [PATCH v6 00/15] Alternate p2m: support multiple copies of host p2m

This set of patches adds support to hvm domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

    VM introspection: 

    Secure inter-VM communication:

A more detailed design specification can be found at:

Each p2m copy is populated lazily on EPT violations.
Permissions for pages in alternate p2m's can be changed in a similar
way to the existing memory access interface, and gfn->mfn mappings can be
changed.

All this is done through extra HVMOP types.
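
As a rough illustration of the lazy population described above, here is a
toy model in C. It is only a sketch under stated assumptions: the toy_*
names, the flat array standing in for a p2m, and the return convention are
all hypothetical and are not the code in the patches. The only detail taken
from the series is that a page is copied from the host p2m when its
mfn != INVALID_MFN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_GFNS     8           /* hypothetical guest size, in pages */
#define TOY_INVALID_MFN UINT64_MAX  /* stand-in for Xen's INVALID_MFN */

/* Hypothetical flat p2m: one mfn per gfn, no permissions or page orders. */
struct toy_p2m {
    uint64_t mfn[TOY_NR_GFNS];
};

/* An alternate p2m starts out empty; entries appear only on demand. */
static void toy_p2m_init_empty(struct toy_p2m *p2m)
{
    for (size_t i = 0; i < TOY_NR_GFNS; i++)
        p2m->mfn[i] = TOY_INVALID_MFN;
}

/*
 * Called on a (simulated) EPT violation in the alternate p2m.
 * Returns 1 if the entry was copied from the host p2m and the access
 * should be retried, 0 if nothing was copied and the violation has to
 * be handled some other way.
 */
static int toy_altp2m_lazy_copy(const struct toy_p2m *host,
                                struct toy_p2m *alt, size_t gfn)
{
    assert(gfn < TOY_NR_GFNS);

    /* Already populated: this is a genuine violation, not a missing entry. */
    if (alt->mfn[gfn] != TOY_INVALID_MFN)
        return 0;

    /* The host has no backing page either, so there is nothing to copy. */
    if (host->mfn[gfn] == TOY_INVALID_MFN)
        return 0;

    alt->mfn[gfn] = host->mfn[gfn];
    return 1;
}
```

In this model the first fault on a gfn fills the alternate entry and a later
fault on the same gfn is reported as a real violation.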

The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain
code is hypervisor-only; the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only be
received for pages that have been modified (access permissions and/or gfn->mfn
mapping) intra-domain, and only on VCPU's that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.
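
For readers unfamiliar with VMFUNC leaf 0, the checks an emulation has to
make can be sketched roughly as follows. This is a toy model, not the
emulator code in this series: the toy_* types and the boolean return value
are hypothetical, and what happens when a request is rejected is left out.
The sketch assumes only that leaf 0 is selected with EAX = 0, that ECX
carries the index of the view to switch to, and that a domain has at most
10 alternate p2m's.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ALTP2M 10u  /* the series currently limits a domain to 10 copies */

/* Hypothetical per-domain state: which alternate p2m slots are in use. */
struct toy_domain {
    bool altp2m_valid[TOY_MAX_ALTP2M];
};

/* Hypothetical per-vcpu state: the alternate p2m this vcpu is running on. */
struct toy_vcpu {
    unsigned int altp2m_idx;
};

/*
 * Emulate VMFUNC for one vcpu. Returns true if the vcpu was switched to
 * the requested view, false if the request must be rejected.
 */
static bool toy_emulate_vmfunc(const struct toy_domain *d, struct toy_vcpu *v,
                               uint32_t eax, uint32_t ecx)
{
    /* Only leaf 0 (EPTP switching) is modelled here. */
    if (eax != 0)
        return false;

    /* The index must be in range and must name an active view. */
    if (ecx >= TOY_MAX_ALTP2M || !d->altp2m_valid[ecx])
        return false;

    v->altp2m_idx = ecx;
    return true;
}
```

A rejected request leaves the vcpu on its current view in this model.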

This code is not compatible with nested hvm functionality and will refuse to
work with nested hvm active. It is also not compatible with migration. It should
be considered experimental.

Changes since v5:

    Rebased on staging.

    We believe v6 addresses all ABI issues and actual bugs, but it does
    not address all outstanding maintainer issues.

    Patch 1:
        no changes

    Patch 2:
        no changes

    Patch 3:
        no changes
        removed ack's etc

    Patch 4:
        fixed a markdown formatting error

    Patch 5:
        removed a buggy assert
        removed Andrew's R-b

    Patch 6:
        fixed a bug when disabling #VE due to bad veinfo gfn

    Patch 7:
        addressed Jan's most recent comments

    Patch 8:
        no changes

    Patch 9:
        Added padding to vm_event_t header (per Andrew)

    Patch 10:
        no changes

    Patch 11:
        Reworked structure padding
        Added altp2m_op interface version
        Reworked altp2m_op handling again

    Patch 12:
        Mechanical changes due to patch 11 changes

    Patch 13:
        Mechanical changes due to patch 11 changes

    Patch 14:
        Mechanical changes due to patch 11 changes

    Patch 15:
        Mechanical changes due to an upstream change

Changes since v4:

    Patch 3:  don't set bit 63 of top-level entries.

    Patch 5:  extra locking order description in mm-locks.h
              don't initialise altp2m data unless altp2m is enabled globally
               and hardware supports it
              removed some hardware-specific wrappers that were not being used
              renamed ap2m... interfaces to altp2m...
              fixed error path in p2m_init_altp2m

    Patch 7:  addressed remaining feedback

    Patch 8:  made suppress_ve preservation consistent

    Patch 9:  changed flag bit to avoid collision with recently applied series

    Patch 10: check pad fields for zero
              minor formatting changes

    Patch 11: renamed HVM parameter

    Patch 15: removed v3 workaround

Changes since v3:

Major changes are:

    Replaced patch 8.

    Refactored patch 11 to use a single HVMOP with subcodes.

    Addressed feedback in patch 7, and some other patches.

    Added two tools/test patches from Tamas. Both are optional.

    Added various ack's and reviewed-by's.
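
To make the "single HVMOP with subcodes" shape concrete, here is a minimal
sketch in C of a versioned operation structure with a subop code and pad
fields that are checked for zero. Everything in it is an assumption made
for illustration: the toy_* names, the field layout, the two subops and the
error codes are hypothetical and do not reproduce the structure defined in
xen/include/public/hvm/hvm_op.h.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_ALTP2M_INTERFACE_VERSION 1u  /* hypothetical version number */

/* Hypothetical subop codes carried inside the single operation. */
enum toy_altp2m_cmd {
    TOY_CMD_GET_STATE = 1,
    TOY_CMD_SET_STATE = 2
};

/* Hypothetical argument structure with explicit padding. */
struct toy_altp2m_op {
    uint32_t version;  /* must match the interface version */
    uint32_t cmd;      /* one of enum toy_altp2m_cmd */
    uint16_t domain;   /* unused in this sketch */
    uint16_t pad1;     /* must be zero */
    uint32_t pad2;     /* must be zero */
    uint32_t state;    /* payload: in for SET_STATE, out for GET_STATE */
};

/*
 * Validate the header, then dispatch on the subop code.
 * Returns 0 on success or a negative errno value.
 */
static int toy_do_altp2m_op(struct toy_altp2m_op *op, uint32_t *domain_state)
{
    if (op->version != TOY_ALTP2M_INTERFACE_VERSION)
        return -EINVAL;

    /* Rejecting non-zero padding keeps those bits free for later use. */
    if (op->pad1 != 0 || op->pad2 != 0)
        return -EINVAL;

    switch (op->cmd) {
    case TOY_CMD_GET_STATE:
        op->state = *domain_state;
        return 0;
    case TOY_CMD_SET_STATE:
        *domain_state = op->state ? 1 : 0;
        return 0;
    default:
        return -ENOSYS;
    }
}
```

With this shape a new subop only adds a case to the switch rather than
consuming another top-level op number.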


Ravi Sahita will now be the point of contact for this series.

Changes since v2:

Addressed all v2 feedback *except*:

    In patch 5, the per-domain EPTP list page is still allocated from the
    Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
    hardware when walking the EPTP list during exit processing in patch 6.

    HVM_ops are not merged. Tamas suggested merging the memory access ops,
    but in practice they are not as similar as they appear on the surface.
    Razvan suggested merging the implementation code in p2m.c, but that is
    also not as common as it appears on the surface.
    Andrew suggested merging all altp2m ops into one with a subop code in
    the input structure. His point that only 255 ops can be defined is well
    taken, but altp2m uses only 2 more ops than the recently introduced
    ioreq ops, and <15% of the available ops have been defined. Since we
    don't know how to implement XSM hooks and policy with the subop model,
    we have not adopted this suggestion.

    The p2m set/get interface is not modified. The altp2m code needs to
    write suppress_ve in 2 places and read it in 1 place. The original
    patch series managed this by coupling the state of suppress_ve to the
    p2m memory type, which Tim disliked. In v2 of the series, special
    set/get interfaces were added to access suppress_ve only when required.
    Jan has suggested changing the existing interfaces, but we feel this
    is inappropriate for this experimental patch series. Changing the
    existing interfaces would require a design agreement to be reached
    and would impact a large amount of existing code.

    Andrew kindly added some reviewed-by's to v2. I have not carried
    his reviewed-by of the memory event patch forward because Tamas
    requested significant changes to the patch.

Changes since v1:

Many changes since v1 in response to maintainer feedback, including:

    Suppress_ve state is now decoupled from memory type
    VMFUNC emulation handled in x86 emulator
    Lazy-copy algorithm copies any page where mfn != INVALID_MFN
    All nested page fault handling except lazy-copy is now in
        top-level (hvm.c) nested page fault handler
    Split p2m lock type (as suggested by Tim) to avoid lock order violations
    XSM hooks
    Xen parameter to globally enable altp2m (default disabled) and HVM parameter
    Altp2m reference counting no longer uses dirty_cpu bitmap
    Remapped page tracking to invalidate altp2m's where needed to protect Xen
    Many other minor changes

The altp2m invalidation is implemented to a level that I believe satisfies
the requirements of protecting Xen. Invalidation notification is not yet
implemented, and there may be other cases where invalidation is warranted to
protect the integrity of the restrictions placed through altp2m. We may add
further patches in this area.

Testability is still a potential issue. We have offered to make our internal
Windows test binaries available for intra-domain testing. Tamas has
been working on toolstack support for cross-domain testing with a slightly
earlier patch series, and we hope he will submit that support.

Not all of the patches will be of interest to everyone copied here. I've
copied everyone on this initial mailing to give context.

Andrew Cooper (1):
  common/domain: Helpers to pause a domain while in context

Ed White (9):
  VMX: VMFUNC and #VE definitions and detection.
  VMX: implement suppress #VE.
  x86/HVM: Hardware alternate p2m support detection.
  x86/altp2m: basic data structures and support routines.
  VMX/altp2m: add code to support EPTP switching and #VE.
  x86/altp2m: alternate p2m memory events.
  x86/altp2m: add remaining support routines.
  x86/altp2m: define and implement alternate p2m HVMOP types.
  x86/altp2m: Add altp2mhvm HVM domain parameter.

George Dunlap (1):
  x86/altp2m: add control of suppress_ve.

Ravi Sahita (2):
  VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
  x86/altp2m: XSM hooks for altp2m HVM ops

Tamas K Lengyel (2):
  tools/libxc: add support to altp2m hvmops
  tools/xen-access: altp2m testcases

 docs/man/xl.cfg.pod.5                        |  12 +
 docs/misc/xen-command-line.markdown          |   7 +
 tools/flask/policy/policy/modules/xen/xen.if |   4 +-
 tools/libxc/Makefile                         |   1 +
 tools/libxc/include/xenctrl.h                |  22 ++
 tools/libxc/xc_altp2m.c                      | 248 ++++++++++++
 tools/libxl/libxl.h                          |   6 +
 tools/libxl/libxl_create.c                   |   1 +
 tools/libxl/libxl_dom.c                      |   2 +
 tools/libxl/libxl_types.idl                  |   1 +
 tools/libxl/xl_cmdimpl.c                     |  10 +
 tools/tests/xen-access/xen-access.c          | 173 +++++++--
 xen/arch/x86/hvm/Makefile                    |   1 +
 xen/arch/x86/hvm/altp2m.c                    |  77 ++++
 xen/arch/x86/hvm/emulate.c                   |  18 +-
 xen/arch/x86/hvm/hvm.c                       | 250 +++++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c                  |  42 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 176 +++++++++
 xen/arch/x86/mm/hap/Makefile                 |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c             |  98 +++++
 xen/arch/x86/mm/hap/hap.c                    |  38 +-
 xen/arch/x86/mm/mem_sharing.c                |   4 +-
 xen/arch/x86/mm/mm-locks.h                   |  46 ++-
 xen/arch/x86/mm/p2m-ept.c                    |  37 +-
 xen/arch/x86/mm/p2m-pod.c                    |  12 +-
 xen/arch/x86/mm/p2m-pt.c                     |  10 +-
 xen/arch/x86/mm/p2m.c                        | 554 +++++++++++++++++++++++++--
 xen/arch/x86/x86_emulate/x86_emulate.c       |  19 +-
 xen/arch/x86/x86_emulate/x86_emulate.h       |   4 +
 xen/common/domain.c                          |  28 ++
 xen/common/vm_event.c                        |   4 +
 xen/include/asm-arm/p2m.h                    |   6 +
 xen/include/asm-x86/domain.h                 |  10 +
 xen/include/asm-x86/hvm/altp2m.h             |  42 ++
 xen/include/asm-x86/hvm/hvm.h                |  25 ++
 xen/include/asm-x86/hvm/vcpu.h               |   9 +
 xen/include/asm-x86/hvm/vmx/vmcs.h           |  14 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |  13 +-
 xen/include/asm-x86/msr-index.h              |   1 +
 xen/include/asm-x86/p2m.h                    |  90 ++++-
 xen/include/public/hvm/hvm_op.h              |  89 +++++
 xen/include/public/hvm/params.h              |   5 +-
 xen/include/public/vm_event.h                |  12 +
 xen/include/xen/sched.h                      |   5 +
 xen/include/xsm/dummy.h                      |  12 +
 xen/include/xsm/xsm.h                        |  12 +
 xen/xsm/dummy.c                              |   2 +
 xen/xsm/flask/hooks.c                        |  12 +
 xen/xsm/flask/policy/access_vectors          |   7 +
 49 files changed, 2169 insertions(+), 103 deletions(-)
 create mode 100644 tools/libxc/xc_altp2m.c
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h


