[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH v8 10/25] xsplice: Implement support for applying/reverting/replacing patches.



From: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>

Implement support for the apply, revert and replace actions.

To perform and action on a payload, the hypercall sets up a data
structure to schedule the work.  A hook is added in the reset_stack_and_jump
to check for work and execute it if needed (specifically we check an
per-cpu flag to make this as quick as possible).

In this way, patches can be applied with all CPUs idle and without
stacks.  The first CPU to run check_for_xsplice_work() becomes the
master and triggers a reschedule softirq to trigger all the other CPUs
to enter check_for_xsplice_work() with no stack.  Once all CPUs
have rendezvoused, all CPUs disable IRQs and NMIs are ignored.
The system is then quiscient and the master performs the action.
After this, all CPUs enable IRQs and NMIs are re-enabled.

Note that it is unsafe to patch do_nmi and the xSplice internal functions.
Patching functions on NMI/MCE path is liable to end in disaster.
This is not addressed in this patch and is mentioned in the
design doc as a further TODO.

The action to perform is one of:
- APPLY: For each function in the module, store the first 5 bytes of the
  old function and replace it with a jump to the new function.
- REVERT: Copy the previously stored bytes into the first 5 bytes of the
  old function.
- REPLACE: Revert each applied module and then apply the new module.

To prevent a deadlock with any other barrier in the system, the master
will wait for up to 30ms before timing out.
Measurements found that the patch application to take about 100 μs on a
72 CPU system, whether idle or fully loaded.

We also add an BUILD_ON to make sure that the size of the structure
of the payload is not inadvertly changed and that the offsets are
correct on both 32 and 64-bit hypervisor (ARM32 and ARM64).
To make this work, we make the public header use architecture
neutral members (xen_ulong_t) but for per-architecture have to
fiddle with padding.

Signed-off-by: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

--
Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
Cc: Julien Grall <julien.grall@xxxxxxx>
Cc: Keir Fraser <keir@xxxxxxx>
Cc: Jan Beulich <jbeulich@xxxxxxxx>
Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
Cc: Jun Nakajima <jun.nakajima@xxxxxxxxx>
Cc: Kevin Tian <kevin.tian@xxxxxxxxx>

v2: - Pluck the 'struct xsplice_patch_func' in this patch.
    - Modify code per review comments.
    - Add more data in the keyboard handler.
    - Redo the patching code, split it in functions.
v3: - Add return_ macro for debug builds.
    - Move s/payload_list_lock/payload_list/ to earlier patch
    - Remove const and use ELF types for xsplice_patch_func
     - Add check routine to do simple sanity checks for various
      sections.
    - s/%p/PRIx64/ as ARM builds complain.
    - Move code around. Add more dprintk. Add XSPLICE in front of all
      printks/dprintk.
      Put the NMIs back if we fail patching.
      Add per-cpu to lessen contention for global structure.
      Extract from xsplice_do_single patching code into xsplice_do_action
      Squash xsplice_do_single and check_for_xsplice_work together to
      have all rendezvous in one place.
      Made XSPLICE_ACTION_REPLACE work again (wrong list iterator)
      s/find_special_sections/prepare_payload/
      Use list_del_init and INIT_LIST_HEAD for applied_list
v4:
   - Add comment, adjust spacing for "Timed out on CPU semaphore"
   - Added CR0.WP manipulations when altering the .text of hypervisor.
   - Added fix from Andrew for CR0.WP manipulation.
v5: - Made xsplice_patch_func use uintXX_t instead of ELF_ types to easy
      making it work under ARM (32bit). Add more BUILD-BUG-ON checks.
    - Add more BUILD_ON checks. Sprinkle newlines.
v6: Rebase on "arm/x86: Alter nmi_callback_t typedef"
   - Drop the recursive spinlock usage.
   - Move NMI callbacks in arch specific.
   - Fold the 'check_for_xsplice_work' in reset_stack_and_jump
   - Add arch specific check for .xsplice.funcs.
   - Seperate external and internal structure of .xsplice.funcs.
   - Changed per Jan's review
   - Modified the .xsplice.funcs checks
v7:
   - Modified old_ptr to void* instead of uint8_t*
   - Modified the xsplice_patch_func_internal for ARM32 to have padding.
   - Used #if BITS_PER_LONG == 64 for the xsplice_patch_func_internal along
     with ifndef CONFIG_ARM for the undo (which may be different size on ARM64)
v8:
  - Add "is empty" if special sections are in fact empty.
  - Added Andrew's Reviewed-by:
  - Rebase on v7.2 of  x86/mm: Introduce modify_xen_mappings()
  - Change some of printk to dprintk and some of the dprintk to printk.
  - Make the xsplice_patch_func (and the internal) structure have uint32_t
    (instead of uint64_t) if BITS_PER_LONG==32. This makes the size and
    offset different so note that in the design and common code.
  - Add #undef ACTION
  - Guard struct xsplice_patch_func in sysctl.h with __XEN__ as toolstacks
    will fail to compile. We do have BITS_PER_LONG defined in xc_bitops.h but
    that will go away (and also that macro uses sizeof and the pre-processor
    will choke on that).
  - Dropped Julien's Acked as I replaced BITS_PER_LONG/CONFIG_ARM_32.
    (Stefano is OK with it, but would prefer BITS_PER_LONG, Jan does not want
    BITS_PER_LONG).
---
 docs/misc/xsplice.markdown    |   8 +-
 xen/arch/arm/xsplice.c        |  33 +++
 xen/arch/x86/domain.c         |   2 +
 xen/arch/x86/xsplice.c        |  88 +++++++-
 xen/common/xsplice.c          | 473 +++++++++++++++++++++++++++++++++++++++++-
 xen/include/asm-x86/current.h |  10 +-
 xen/include/public/sysctl.h   |  27 +++
 xen/include/xen/xsplice.h     |  47 +++++
 8 files changed, 670 insertions(+), 18 deletions(-)

diff --git a/docs/misc/xsplice.markdown b/docs/misc/xsplice.markdown
index d4e7d75..7a19b60 100644
--- a/docs/misc/xsplice.markdown
+++ b/docs/misc/xsplice.markdown
@@ -289,8 +289,13 @@ describing the functions to be patched:
 <pre>
 struct xsplice_patch_func {  
     const char *name;  
+#if BITS_PER_LONG == 32  
+    uint32_t new_addr;  
+    uint32_t old_addr;  
+#else  
     uint64_t new_addr;  
     uint64_t old_addr;  
+#endif
     uint32_t new_size;  
     uint32_t old_size;  
     uint8_t version;  
@@ -298,7 +303,8 @@ struct xsplice_patch_func {
 };  
 </pre>
 
-The size of the structure is 64 bytes.
+The size of the structure is 64 bytes or 52 bytes if compiled under 32-bit
+hypervisors.
 
 * `name` is the symbol name of the old function. Only used if `old_addr` is
    zero, otherwise will be used during dynamic linking (when hypervisor loads
diff --git a/xen/arch/arm/xsplice.c b/xen/arch/arm/xsplice.c
index 4465244..5491317 100644
--- a/xen/arch/arm/xsplice.c
+++ b/xen/arch/arm/xsplice.c
@@ -6,6 +6,39 @@
 #include <xen/xsplice_elf.h>
 #include <xen/xsplice.h>
 
+void arch_xsplice_patching_enter(void)
+{
+}
+
+void arch_xsplice_patching_leave(void)
+{
+}
+
+int arch_xsplice_verify_func(const struct xsplice_patch_func_internal *func)
+{
+    return -ENOSYS;
+}
+
+void arch_xsplice_apply_jmp(struct xsplice_patch_func_internal *func)
+{
+}
+
+void arch_xsplice_revert_jmp(const struct xsplice_patch_func_internal *func)
+{
+}
+
+void arch_xsplice_post_action(void)
+{
+}
+
+void arch_xsplice_mask(void)
+{
+}
+
+void arch_xsplice_unmask(void)
+{
+}
+
 int arch_xsplice_verify_elf(const struct xsplice_elf *elf)
 {
     return -ENOSYS;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index e93ff20..2d7932f 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -36,6 +36,7 @@
 #include <xen/cpu.h>
 #include <xen/wait.h>
 #include <xen/guest_access.h>
+#include <xen/xsplice.h>
 #include <public/sysctl.h>
 #include <public/hvm/hvm_vcpu.h>
 #include <asm/regs.h>
@@ -120,6 +121,7 @@ static void idle_loop(void)
         (*pm_idle)();
         do_tasklet();
         do_softirq();
+        check_for_xsplice_work(); /* Must be last. */
     }
 }
 
diff --git a/xen/arch/x86/xsplice.c b/xen/arch/x86/xsplice.c
index 3000ecb..a180af8 100644
--- a/xen/arch/x86/xsplice.c
+++ b/xen/arch/x86/xsplice.c
@@ -10,6 +10,82 @@
 #include <xen/xsplice_elf.h>
 #include <xen/xsplice.h>
 
+#include <asm/nmi.h>
+
+#define PATCH_INSN_SIZE 5
+
+void arch_xsplice_patching_enter(void)
+{
+    /* Disable WP to allow changes to read-only pages. */
+    write_cr0(read_cr0() & ~X86_CR0_WP);
+}
+
+void arch_xsplice_patching_leave(void)
+{
+    /* Reinstate WP. */
+    write_cr0(read_cr0() | X86_CR0_WP);
+}
+
+int arch_xsplice_verify_func(const struct xsplice_patch_func_internal *func)
+{
+    /* No NOP patching yet. */
+    if ( !func->new_size )
+        return -EOPNOTSUPP;
+
+    if ( func->old_size < PATCH_INSN_SIZE )
+        return -EINVAL;
+
+    return 0;
+}
+
+void arch_xsplice_apply_jmp(struct xsplice_patch_func_internal *func)
+{
+    int32_t val;
+    uint8_t *old_ptr;
+
+    BUILD_BUG_ON(PATCH_INSN_SIZE > sizeof(func->u.undo));
+    BUILD_BUG_ON(PATCH_INSN_SIZE != (1 + sizeof val));
+
+    old_ptr = func->old_addr;
+    memcpy(func->u.undo, old_ptr, PATCH_INSN_SIZE);
+
+    *old_ptr++ = 0xe9; /* Relative jump */
+    val = func->new_addr - func->old_addr - PATCH_INSN_SIZE;
+    memcpy(old_ptr, &val, sizeof val);
+}
+
+void arch_xsplice_revert_jmp(const struct xsplice_patch_func_internal *func)
+{
+    memcpy(func->old_addr, func->u.undo, PATCH_INSN_SIZE);
+}
+
+/* Serialise the CPU pipeline. */
+void arch_xsplice_post_action(void)
+{
+    cpuid_eax(0);
+}
+
+static nmi_callback_t *saved_nmi_callback;
+/*
+ * Note that because of this NOP code the do_nmi is not safely patchable.
+ * Also if we do receive 'real' NMIs we have lost them. Ditto for MCE.
+ */
+static int mask_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    /* TODO: Handle missing NMI/MCE.*/
+    return 1;
+}
+
+void arch_xsplice_mask(void)
+{
+    saved_nmi_callback = set_nmi_callback(mask_nmi_callback);
+}
+
+void arch_xsplice_unmask(void)
+{
+    set_nmi_callback(saved_nmi_callback);
+}
+
 int arch_xsplice_verify_elf(const struct xsplice_elf *elf)
 {
 
@@ -148,12 +224,12 @@ int arch_xsplice_secure(void *va, unsigned int pages, 
enum va_type type)
     ASSERT(va);
     ASSERT(pages);
 
-    if ( type == XSPLICE_VA_RX ) /* PAGE_HYPERVISOR_RX */
-        flag = _PAGE_PRESENT;
-    else if ( type == XSPLICE_VA_RW ) /* PAGE_HYPERVISOR_RW */
-        flag = _PAGE_RW | _PAGE_NX | _PAGE_PRESENT;
-    else /* PAGE_HYPERVISOR_RO */
-        flag = _PAGE_NX | _PAGE_PRESENT;
+    if ( type == XSPLICE_VA_RX )
+        flag = PAGE_HYPERVISOR_RX;
+    else if ( type == XSPLICE_VA_RW )
+        flag = PAGE_HYPERVISOR_RW;
+    else
+        flag = PAGE_HYPERVISOR_RO;
 
     /* The ones we are allowed to modify are: _PAGE_NX|_PAGE_RW|_PAGE_PRESENT 
*/
     modify_xen_mappings(start, start + pages * PAGE_SIZE, flag);
diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 809b067..9e6697e 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -3,6 +3,7 @@
  *
  */
 
+#include <xen/cpu.h>
 #include <xen/err.h>
 #include <xen/guest_access.h>
 #include <xen/keyhandler.h>
@@ -11,18 +12,28 @@
 #include <xen/mm.h>
 #include <xen/sched.h>
 #include <xen/smp.h>
+#include <xen/softirq.h>
 #include <xen/spinlock.h>
 #include <xen/vmap.h>
+#include <xen/wait.h>
 #include <xen/xsplice_elf.h>
 #include <xen/xsplice.h>
 
 #include <asm/event.h>
 #include <public/sysctl.h>
 
-/* Protects against payload_list operations. */
+/*
+ * Protects against payload_list operations and also allows only one
+ * caller in schedule_work.
+ */
 static DEFINE_SPINLOCK(payload_lock);
 static LIST_HEAD(payload_list);
 
+/*
+ * Patches which have been applied.
+ */
+static LIST_HEAD(applied_list);
+
 static unsigned int payload_cnt;
 static unsigned int payload_version = 1;
 
@@ -37,9 +48,35 @@ struct payload {
     void *ro_addr;                       /* Virtual address of .rodata. */
     size_t ro_size;                      /* .. and its size (if any). */
     size_t pages;                        /* Total pages for [text,rw,ro]_addr 
*/
+    struct list_head applied_list;       /* Linked to 'applied_list'. */
+    struct xsplice_patch_func_internal *funcs;    /* The array of functions to 
patch. */
+    unsigned int nfuncs;                 /* Nr of functions to patch. */
     char name[XEN_XSPLICE_NAME_SIZE];    /* Name of it. */
 };
 
+/* Defines an outstanding patching action. */
+struct xsplice_work
+{
+    atomic_t semaphore;          /* Used for rendezvous. */
+    atomic_t irq_semaphore;      /* Used to signal all IRQs disabled. */
+    uint32_t timeout;            /* Timeout to do the operation. */
+    struct payload *data;        /* The payload on which to act. */
+    volatile bool_t do_work;     /* Signals work to do. */
+    volatile bool_t ready;       /* Signals all CPUs synchronized. */
+    unsigned int cmd;            /* Action request: XSPLICE_ACTION_* */
+};
+
+/* There can be only one outstanding patching action. */
+static struct xsplice_work xsplice_work;
+
+/*
+ * Indicate whether the CPU needs to consult xsplice_work structure.
+ * We want an per-cpu data structure otherwise the check_for_xsplice_work
+ * would hammer a global xsplice_work structure on every guest VMEXIT.
+ * Having an per-cpu lessens the load.
+ */
+static DEFINE_PER_CPU(bool_t, work_to_do);
+
 static int verify_name(const xen_xsplice_name_t *name, char *n)
 {
     if ( !name->size || name->size > XEN_XSPLICE_NAME_SIZE )
@@ -255,6 +292,98 @@ static int secure_payload(struct payload *payload, struct 
xsplice_elf *elf)
     return rc;
 }
 
+static int check_special_sections(const struct xsplice_elf *elf)
+{
+    unsigned int i;
+    static const char *const names[] = { ".xsplice.funcs" };
+    unsigned int count[ARRAY_SIZE(names)] = { 0 };
+
+    for ( i = 0; i < ARRAY_SIZE(names); i++ )
+    {
+        const struct xsplice_elf_sec *sec;
+
+        sec = xsplice_elf_sec_by_name(elf, names[i]);
+        if ( !sec )
+        {
+            dprintk(XENLOG_ERR, XSPLICE "%s: %s is missing!\n",
+                    elf->name, names[i]);
+            return -EINVAL;
+        }
+
+        if ( !sec->sec->sh_size )
+        {
+            dprintk(XENLOG_ERR, XSPLICE "%s: %s is empty!\n",
+                    elf->name, names[i]);
+            return -EINVAL;
+        }
+        count[i]++;
+    }
+
+    for ( i = 0; i < ARRAY_SIZE(names); i++ )
+    {
+        if ( count[i] > 1 )
+        {
+            dprintk(XENLOG_ERR, XSPLICE "%s: %s was seen more than once!\n",
+                    elf->name, names[i]);
+            return -EINVAL;
+        }
+    }
+
+    return 0;
+}
+
+static int prepare_payload(struct payload *payload,
+                           struct xsplice_elf *elf)
+{
+    const struct xsplice_elf_sec *sec;
+    unsigned int i;
+    struct xsplice_patch_func_internal *f;
+
+    sec = xsplice_elf_sec_by_name(elf, ".xsplice.funcs");
+    ASSERT(sec);
+    if ( sec->sec->sh_size % sizeof(*payload->funcs) )
+    {
+        dprintk(XENLOG_ERR, XSPLICE "%s: Wrong size of .xsplice.funcs!\n",
+                elf->name);
+        return -EINVAL;
+    }
+
+    payload->funcs = sec->load_addr;
+    payload->nfuncs = sec->sec->sh_size / sizeof(*payload->funcs);
+
+    for ( i = 0; i < payload->nfuncs; i++ )
+    {
+        int rc;
+        unsigned int j;
+
+        f = &(payload->funcs[i]);
+
+        if ( f->version != XSPLICE_PAYLOAD_VERSION )
+        {
+            dprintk(XENLOG_ERR, XSPLICE "%s: Wrong version (%u). Expected 
%d!\n",
+                    elf->name, f->version, XSPLICE_PAYLOAD_VERSION);
+            return -EOPNOTSUPP;
+        }
+
+        if ( !f->new_addr || !f->new_size )
+        {
+            dprintk(XENLOG_ERR, XSPLICE "%s: Address or size fields are 
zero!\n",
+                    elf->name);
+            return -EINVAL;
+        }
+
+        rc = arch_xsplice_verify_func(f);
+        if ( rc )
+            return rc;
+
+        for ( j = 0; j < ARRAY_SIZE(f->u.pad); j++ )
+            if ( f->u.pad[j] )
+                return -EINVAL;
+    }
+
+    return 0;
+}
+
 static void free_payload(struct payload *data)
 {
     ASSERT(spin_is_locked(&payload_lock));
@@ -286,6 +415,14 @@ static int load_payload_data(struct payload *payload, void 
*raw, size_t len)
     if ( rc )
         goto out;
 
+    rc = check_special_sections(&elf);
+    if ( rc )
+        goto out;
+
+    rc = prepare_payload(payload, &elf);
+    if ( rc )
+        goto out;
+
     rc = secure_payload(payload, &elf);
 
  out:
@@ -346,6 +483,7 @@ static int xsplice_upload(xen_sysctl_xsplice_upload_t 
*upload)
 
     data->state = XSPLICE_STATE_CHECKED;
     INIT_LIST_HEAD(&data->list);
+    INIT_LIST_HEAD(&data->applied_list);
 
     list_add_tail(&data->list, &payload_list);
     payload_cnt++;
@@ -456,6 +594,295 @@ static int xsplice_list(xen_sysctl_xsplice_list_t *list)
     return rc ? : idx;
 }
 
+/*
+ * The following functions get the CPUs into an appropriate state and
+ * apply (or revert) each of the payload's functions. This is needed
+ * for XEN_SYSCTL_XSPLICE_ACTION operation (see xsplice_action).
+ */
+
+static int apply_payload(struct payload *data)
+{
+    unsigned int i;
+
+    printk(XENLOG_INFO XSPLICE "%s: Applying %u functions.\n",
+            data->name, data->nfuncs);
+
+    arch_xsplice_patching_enter();
+
+    for ( i = 0; i < data->nfuncs; i++ )
+        arch_xsplice_apply_jmp(&data->funcs[i]);
+
+    arch_xsplice_patching_leave();
+
+    list_add_tail(&data->applied_list, &applied_list);
+
+    return 0;
+}
+
+static int revert_payload(struct payload *data)
+{
+    unsigned int i;
+
+    printk(XENLOG_INFO XSPLICE "%s: Reverting.\n", data->name);
+
+    arch_xsplice_patching_enter();
+
+    for ( i = 0; i < data->nfuncs; i++ )
+        arch_xsplice_revert_jmp(&data->funcs[i]);
+
+    arch_xsplice_patching_leave();
+
+    list_del_init(&data->applied_list);
+
+    return 0;
+}
+
+/*
+ * This function is executed having all other CPUs with no stack (we may
+ * have cpu_idle on it) and IRQs disabled. We guard against NMI by temporarily
+ * installing our NOP NMI handler.
+ */
+static void xsplice_do_action(void)
+{
+    int rc;
+    struct payload *data, *other, *tmp;
+
+    data = xsplice_work.data;
+    /*
+     * Now this function should be the only one on any stack.
+     * No need to lock the payload list or applied list.
+     */
+    switch ( xsplice_work.cmd )
+    {
+    case XSPLICE_ACTION_APPLY:
+        rc = apply_payload(data);
+        if ( rc == 0 )
+            data->state = XSPLICE_STATE_APPLIED;
+        break;
+
+    case XSPLICE_ACTION_REVERT:
+        rc = revert_payload(data);
+        if ( rc == 0 )
+            data->state = XSPLICE_STATE_CHECKED;
+        break;
+
+    case XSPLICE_ACTION_REPLACE:
+        rc = 0;
+        /* N.B: Use 'applied_list' member, not 'list'. */
+        list_for_each_entry_safe_reverse ( other, tmp, &applied_list, 
applied_list )
+        {
+            other->rc = revert_payload(other);
+            if ( other->rc == 0 )
+                other->state = XSPLICE_STATE_CHECKED;
+            else
+            {
+                rc = -EINVAL;
+                break;
+            }
+        }
+
+        if ( rc == 0 )
+        {
+            rc = apply_payload(data);
+            if ( rc == 0 )
+                data->state = XSPLICE_STATE_APPLIED;
+        }
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
+
+    /* We must set rc as xsplice_action sets it to -EAGAIN when kicking of. */
+    data->rc = rc;
+}
+
+static int schedule_work(struct payload *data, uint32_t cmd, uint32_t timeout)
+{
+    ASSERT(spin_is_locked(&payload_lock));
+
+    /* Fail if an operation is already scheduled. */
+    if ( xsplice_work.do_work )
+        return -EBUSY;
+
+    if ( !get_cpu_maps() )
+    {
+        printk(XENLOG_ERR XSPLICE "%s: unable to get cpu_maps lock!\n",
+               data->name);
+        return -EBUSY;
+    }
+
+    xsplice_work.cmd = cmd;
+    xsplice_work.data = data;
+    xsplice_work.timeout = timeout ?: MILLISECS(30);
+
+    dprintk(XENLOG_DEBUG, XSPLICE "%s: timeout is %"PRI_stime"ms\n",
+            data->name, xsplice_work.timeout / MILLISECS(1));
+
+    atomic_set(&xsplice_work.semaphore, -1);
+    atomic_set(&xsplice_work.irq_semaphore, -1);
+
+    xsplice_work.ready = 0;
+
+    smp_wmb();
+
+    xsplice_work.do_work = 1;
+    this_cpu(work_to_do) = 1;
+
+    put_cpu_maps();
+
+    return 0;
+}
+
+static void reschedule_fn(void *unused)
+{
+    this_cpu(work_to_do) = 1;
+    raise_softirq(SCHEDULE_SOFTIRQ);
+}
+
+static int xsplice_spin(atomic_t *counter, s_time_t timeout,
+                           unsigned int cpus, const char *s)
+{
+    int rc = 0;
+
+    while ( atomic_read(counter) != cpus && NOW() < timeout )
+        cpu_relax();
+
+    /* Log & abort. */
+    if ( atomic_read(counter) != cpus )
+    {
+        printk(XENLOG_ERR XSPLICE "%s: Timed out on %s semaphore %u/%u\n",
+               xsplice_work.data->name, s, atomic_read(counter), cpus);
+        rc = -EBUSY;
+        xsplice_work.data->rc = rc;
+        xsplice_work.do_work = 0;
+        smp_wmb();
+    }
+
+    return rc;
+}
+
+/*
+ * The main function which manages the work of quiescing the system and
+ * patching code.
+ */
+void check_for_xsplice_work(void)
+{
+#define ACTION(x) [XSPLICE_ACTION_##x] = #x
+    static const char *const names[] = {
+            ACTION(APPLY),
+            ACTION(REVERT),
+            ACTION(REPLACE),
+    };
+#undef ACTION
+    unsigned int cpu = smp_processor_id();
+    s_time_t timeout;
+    unsigned long flags;
+
+    /* Fast path: no work to do. */
+    if ( !per_cpu(work_to_do, cpu ) )
+        return;
+
+    smp_rmb();
+    /* In case we aborted, other CPUs can skip right away. */
+    if ( !xsplice_work.do_work )
+    {
+        per_cpu(work_to_do, cpu) = 0;
+        return;
+    }
+
+    ASSERT(local_irq_is_enabled());
+
+    /* Set at -1, so will go up to num_online_cpus - 1. */
+    if ( atomic_inc_and_test(&xsplice_work.semaphore) )
+    {
+        struct payload *p;
+        unsigned int cpus;
+
+        p = xsplice_work.data;
+        if ( !get_cpu_maps() )
+        {
+            printk(XENLOG_ERR XSPLICE "%s: CPU%u - unable to get cpu_maps 
lock!\n",
+                   p->name, cpu);
+            per_cpu(work_to_do, cpu) = 0;
+            xsplice_work.data->rc = -EBUSY;
+            xsplice_work.do_work = 0;
+            /*
+             * Do NOT decrement semaphore down - as that may cause the other
+             * CPU (which may be at this ready to increment it)
+             * to assume the role of master and then needlessly time out
+             * out (as do_work is zero).
+             */
+            smp_wmb();
+            return;
+        }
+        /* "Mask" NMIs. */
+        arch_xsplice_mask();
+
+        barrier(); /* MUST do it after get_cpu_maps. */
+        cpus = num_online_cpus() - 1;
+
+        if ( cpus )
+        {
+            dprintk(XENLOG_DEBUG, XSPLICE "%s: CPU%u - IPIing the other %u 
CPUs\n",
+                    p->name, cpu, cpus);
+            smp_call_function(reschedule_fn, NULL, 0);
+        }
+
+        timeout = xsplice_work.timeout + NOW();
+        if ( xsplice_spin(&xsplice_work.semaphore, timeout, cpus, "CPU") )
+            goto abort;
+
+        /* All CPUs are waiting, now signal to disable IRQs. */
+        xsplice_work.ready = 1;
+        smp_wmb();
+
+        atomic_inc(&xsplice_work.irq_semaphore);
+        if ( !xsplice_spin(&xsplice_work.irq_semaphore, timeout, cpus, "IRQ") )
+        {
+            local_irq_save(flags);
+            /* Do the patching. */
+            xsplice_do_action();
+            /* Flush the CPU i-cache via CPUID instruction (on x86). */
+            arch_xsplice_post_action();
+            local_irq_restore(flags);
+        }
+        arch_xsplice_unmask();
+
+ abort:
+        per_cpu(work_to_do, cpu) = 0;
+        xsplice_work.do_work = 0;
+
+        smp_wmb(); /* MUST complete writes before put_cpu_maps(). */
+
+        put_cpu_maps();
+
+        printk(XENLOG_INFO XSPLICE "%s finished %s with rc=%d\n",
+               p->name, names[xsplice_work.cmd], p->rc);
+    }
+    else
+    {
+        /* Wait for all CPUs to rendezvous. */
+        while ( xsplice_work.do_work && !xsplice_work.ready )
+            cpu_relax();
+
+        /* Disable IRQs and signal. */
+        local_irq_save(flags);
+        atomic_inc(&xsplice_work.irq_semaphore);
+
+        /* Wait for patching to complete. */
+        while ( xsplice_work.do_work )
+            cpu_relax();
+
+        /* To flush out pipeline. */
+        arch_xsplice_post_action();
+        local_irq_restore(flags);
+
+        per_cpu(work_to_do, cpu) = 0;
+    }
+}
+
 static int xsplice_action(xen_sysctl_xsplice_action_t *action)
 {
     struct payload *data;
@@ -502,27 +929,24 @@ static int xsplice_action(xen_sysctl_xsplice_action_t 
*action)
     case XSPLICE_ACTION_REVERT:
         if ( data->state == XSPLICE_STATE_APPLIED )
         {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_CHECKED;
-            data->rc = 0;
+            data->rc = -EAGAIN;
+            rc = schedule_work(data, action->cmd, action->timeout);
         }
         break;
 
     case XSPLICE_ACTION_APPLY:
         if ( data->state == XSPLICE_STATE_CHECKED )
         {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_APPLIED;
-            data->rc = 0;
+            data->rc = -EAGAIN;
+            rc = schedule_work(data, action->cmd, action->timeout);
         }
         break;
 
     case XSPLICE_ACTION_REPLACE:
         if ( data->state == XSPLICE_STATE_CHECKED )
         {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_CHECKED;
-            data->rc = 0;
+            data->rc = -EAGAIN;
+            rc = schedule_work(data, action->cmd, action->timeout);
         }
         break;
 
@@ -587,6 +1011,7 @@ static const char *state2str(uint32_t state)
 static void xsplice_printall(unsigned char key)
 {
     struct payload *data;
+    unsigned int i;
 
     printk("'%u' pressed - Dumping all xsplice patches\n", key);
 
@@ -597,15 +1022,43 @@ static void xsplice_printall(unsigned char key)
     }
 
     list_for_each_entry ( data, &payload_list, list )
+    {
         printk(" name=%s state=%s(%d) %p (.data=%p, .rodata=%p) using %zu 
pages.\n",
                data->name, state2str(data->state), data->state, 
data->text_addr,
                data->rw_addr, data->ro_addr, data->pages);
 
+        for ( i = 0; i < data->nfuncs; i++ )
+        {
+            struct xsplice_patch_func_internal *f = &(data->funcs[i]);
+            printk("    %s patch %p(%u) with %p (%u)\n",
+                   f->name, f->old_addr, f->old_size, f->new_addr, 
f->new_size);
+
+            if ( i && !(i % 64) )
+            {
+                spin_unlock(&payload_lock);
+                process_pending_softirqs();
+                spin_lock(&payload_lock);
+            }
+        }
+    }
+
     spin_unlock(&payload_lock);
 }
 
 static int __init xsplice_init(void)
 {
+    BUILD_BUG_ON( sizeof(struct xsplice_patch_func) !=
+                  sizeof(struct xsplice_patch_func_internal) );
+    BUILD_BUG_ON( sizeof(struct xsplice_patch_func_internal) !=
+                  XSPLICE_PATCH_FUNC_INTERNAL_SIZE );
+
+#if BITS_PER_LONG == 64
+    BUILD_BUG_ON( offsetof(struct xsplice_patch_func, new_addr) != 8 );
+    BUILD_BUG_ON( offsetof(struct xsplice_patch_func, new_size) != 24 );
+#else
+    BUILD_BUG_ON( offsetof(struct xsplice_patch_func, new_addr) != 4 );
+    BUILD_BUG_ON( offsetof(struct xsplice_patch_func, new_size) != 12 );
+#endif
     register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
 
     arch_xsplice_init();
diff --git a/xen/include/asm-x86/current.h b/xen/include/asm-x86/current.h
index 4083261..027aa0c 100644
--- a/xen/include/asm-x86/current.h
+++ b/xen/include/asm-x86/current.h
@@ -86,10 +86,18 @@ static inline struct cpu_info *get_cpu_info(void)
 unsigned long get_stack_trace_bottom(unsigned long sp);
 unsigned long get_stack_dump_bottom (unsigned long sp);
 
+#ifdef CONFIG_XSPLICE
+# define __CHECK_FOR_XSPLICE_WORK "call check_for_xsplice_work;"
+#else
+# define __CHECK_FOR_XSPLICE_WORK ""
+#endif
+
 #define reset_stack_and_jump(__fn)                                      \
     ({                                                                  \
         __asm__ __volatile__ (                                          \
-            "mov %0,%%"__OP"sp; jmp %c1"                                \
+            "mov %0,%%"__OP"sp;"                                        \
+            __CHECK_FOR_XSPLICE_WORK                                    \
+             "jmp %c1"                                                  \
             : : "r" (guest_cpu_user_regs()), "i" (__fn) : "memory" );   \
         unreachable();                                                  \
     })
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 090fd21..3d47b47 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -833,6 +833,33 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_featureset_t);
  *     If zero exit with success.
  */
 
+#define XSPLICE_PAYLOAD_VERSION 1
+/*
+ * .xsplice.funcs structure layout defined in the `Payload format`
+ * section in the xSplice design document.
+ *
+ * The size should be exactly 64 bytes (64) or 52 bytes (32).
+ *
+ * We guard this with __XEN__ as toolstacks do not need to use it.
+ */
+#ifdef __XEN__
+struct xsplice_patch_func {
+    const char *name;       /* Name of function to be patched. */
+#if CONFIG_ARM_32
+    uint32_t new_addr;
+    uint32_t old_addr;
+#else
+    uint64_t new_addr;
+    uint64_t old_addr;      /* Can be zero and name will be looked up. */
+#endif
+    uint32_t new_size;
+    uint32_t old_size;
+    uint8_t version;        /* MUST be XSPLICE_PAYLOAD_VERSION. */
+    uint8_t pad[31];        /* MUST be zero filled. */
+};
+typedef struct xsplice_patch_func xsplice_patch_func_t;
+#endif
+
 /*
  * Structure describing an ELF payload. Uniquely identifies the
  * payload. Should be human readable.
diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
index 83e4f31..536a9e8 100644
--- a/xen/include/xen/xsplice.h
+++ b/xen/include/xen/xsplice.h
@@ -11,12 +11,43 @@ struct xsplice_elf_sec;
 struct xsplice_elf_sym;
 struct xen_sysctl_xsplice_op;
 
+#include <xen/elfstructs.h>
 #ifdef CONFIG_XSPLICE
 
+/*
+ * The structure which defines the patching. This is what the hypervisor
+ * expects in the '.xsplice.func' section of the ELF file.
+ *
+ * This MUST be in sync with what the tools generate. We expose
+ * for the tools the 'struct xsplice_patch_func' which does not have
+ * platform specific entries.
+ */
+#if BITS_PER_LONG == 64
+#define XSPLICE_PATCH_FUNC_INTERNAL_SIZE    64
+#else
+#define XSPLICE_PATCH_FUNC_INTERNAL_SIZE    52
+#endif
+
+struct xsplice_patch_func_internal {
+    const char *name;
+    void *new_addr;
+    void *old_addr;
+    uint32_t new_size;
+    uint32_t old_size;
+    uint8_t version;
+    union {
+#ifndef CONFIG_ARM
+        uint8_t undo[8];
+#endif
+        uint8_t pad[31];
+    } u;
+};
+
 /* Convenience define for printk. */
 #define XSPLICE "xsplice: "
 
 int xsplice_op(struct xen_sysctl_xsplice_op *);
+void check_for_xsplice_work(void);
 
 /* Arch hooks. */
 int arch_xsplice_verify_elf(const struct xsplice_elf *elf);
@@ -44,6 +75,21 @@ int arch_xsplice_secure(void *va, unsigned int pages, enum 
va_type types);
 void arch_xsplice_free_payload(void *va);
 
 void arch_xsplice_init(void);
+
+int arch_xsplice_verify_func(const struct xsplice_patch_func_internal *func);
+/*
+ * These functions are called around the critical region patching live code,
+ * for an architecture to take make appropratie global state adjustments.
+ */
+void arch_xsplice_patching_enter(void);
+void arch_xsplice_patching_leave(void);
+
+void arch_xsplice_apply_jmp(struct xsplice_patch_func_internal *func);
+void arch_xsplice_revert_jmp(const struct xsplice_patch_func_internal *func);
+void arch_xsplice_post_action(void);
+
+void arch_xsplice_mask(void);
+void arch_xsplice_unmask(void);
 #else
 
 #include <xen/errno.h> /* For -ENOSYS */
@@ -52,6 +98,7 @@ static inline int xsplice_op(struct xen_sysctl_xsplice_op *op)
     return -ENOSYS;
 }
 
+static inline void check_for_xsplice_work(void) { };
 #endif /* CONFIG_XSPLICE */
 
 #endif /* __XEN_XSPLICE_H__ */
-- 
2.5.0


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.