
[Xen-changelog] Merged.



# HG changeset patch
# User emellor@xxxxxxxxxxxxxxxxxxxxxx
# Node ID fb174770f426193a9ae6594c8f683275bbe78451
# Parent  9fcfdab04aa931865de059f3826c91e1a140a127
# Parent  050ad9813cdbbb2b5ecb7ed5a6240cc428ef723e
Merged.

diff -r 9fcfdab04aa9 -r fb174770f426 buildconfigs/linux-defconfig_xen0_x86_32
--- a/buildconfigs/linux-defconfig_xen0_x86_32  Thu Apr  6 13:22:52 2006
+++ b/buildconfigs/linux-defconfig_xen0_x86_32  Fri Apr  7 10:52:00 2006
@@ -1231,6 +1231,7 @@
 #
 # Instrumentation Support
 #
+# CONFIG_PROFILING is not set
 # CONFIG_KPROBES is not set
 
 #
diff -r 9fcfdab04aa9 -r fb174770f426 buildconfigs/linux-defconfig_xen0_x86_64
--- a/buildconfigs/linux-defconfig_xen0_x86_64  Thu Apr  6 13:22:52 2006
+++ b/buildconfigs/linux-defconfig_xen0_x86_64  Fri Apr  7 10:52:00 2006
@@ -1183,6 +1183,7 @@
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
+# CONFIG_DEBUG_INFO is not set
 # CONFIG_DEBUG_FS is not set
 # CONFIG_DEBUG_VM is not set
 CONFIG_FRAME_POINTER=y
diff -r 9fcfdab04aa9 -r fb174770f426 buildconfigs/linux-defconfig_xenU_x86_32
--- a/buildconfigs/linux-defconfig_xenU_x86_32  Thu Apr  6 13:22:52 2006
+++ b/buildconfigs/linux-defconfig_xenU_x86_32  Fri Apr  7 10:52:00 2006
@@ -779,6 +779,7 @@
 #
 # Instrumentation Support
 #
+# CONFIG_PROFILING is not set
 # CONFIG_KPROBES is not set
 
 #
diff -r 9fcfdab04aa9 -r fb174770f426 buildconfigs/linux-defconfig_xenU_x86_64
--- a/buildconfigs/linux-defconfig_xenU_x86_64  Thu Apr  6 13:22:52 2006
+++ b/buildconfigs/linux-defconfig_xenU_x86_64  Fri Apr  7 10:52:00 2006
@@ -1080,6 +1080,7 @@
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
+# CONFIG_DEBUG_INFO is not set
 # CONFIG_DEBUG_FS is not set
 # CONFIG_DEBUG_VM is not set
 CONFIG_FRAME_POINTER=y
diff -r 9fcfdab04aa9 -r fb174770f426 buildconfigs/linux-defconfig_xen_x86_32
--- a/buildconfigs/linux-defconfig_xen_x86_32   Thu Apr  6 13:22:52 2006
+++ b/buildconfigs/linux-defconfig_xen_x86_32   Fri Apr  7 10:52:00 2006
@@ -2892,6 +2892,7 @@
 #
 # Instrumentation Support
 #
+# CONFIG_PROFILING is not set
 # CONFIG_KPROBES is not set
 
 #
diff -r 9fcfdab04aa9 -r fb174770f426 buildconfigs/linux-defconfig_xen_x86_64
--- a/buildconfigs/linux-defconfig_xen_x86_64   Thu Apr  6 13:22:52 2006
+++ b/buildconfigs/linux-defconfig_xen_x86_64   Fri Apr  7 10:52:00 2006
@@ -2587,6 +2587,7 @@
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
+# CONFIG_DEBUG_INFO is not set
 # CONFIG_DEBUG_FS is not set
 # CONFIG_DEBUG_VM is not set
 # CONFIG_FRAME_POINTER is not set
diff -r 9fcfdab04aa9 -r fb174770f426 docs/src/user.tex
--- a/docs/src/user.tex Thu Apr  6 13:22:52 2006
+++ b/docs/src/user.tex Fri Apr  7 10:52:00 2006
@@ -2052,7 +2052,7 @@
 
 If the dev86 package is not available on the x86\_64 distribution, you can install the i386 version of it. The dev86 rpm package for various distributions can be found at {\scriptsize {\tt http://www.rpmfind.net/linux/rpm2html/search.php?query=dev86\&submit=Search}} \\
 
-LibVNCServer & The unmodified guest's VGA display, keyboard, and mouse are virtualized using the vncserver library provided by this package. You can get the sources of libvncserver from {\small {\tt http://sourceforge.net/projects/libvncserver}}. Build and install the sources on the build system to get the libvncserver library. The 0.8pre version of libvncserver is currently working well with Xen.\\
+LibVNCServer & The unmodified guest's VGA display, keyboard, and mouse can be virtualized by the vncserver library. You can get the sources of libvncserver from {\small {\tt http://sourceforge.net/projects/libvncserver}}. Build and install the sources on the build system to get the libvncserver library. The 0.8 version suffers a significant performance degradation that has been fixed in the current CVS sources, so it is highly recommended to download and install the latest CVS sources.\\
 
 SDL-devel, SDL & Simple DirectMedia Layer (SDL) is another way of virtualizing the unmodified guest console. It provides an X window for the guest console.
 
@@ -2076,6 +2076,8 @@
 acpi & Enable VMX guest ACPI, default=0 (disabled)\\
 
 apic & Enable VMX guest APIC, default=0 (disabled)\\
+
+pae & Enable VMX guest PAE, default=0 (disabled)\\
 
 vif     & Optionally defines MAC address and/or bridge for the network interfaces. Random MACs are assigned if not given. {\small {\tt type=ioemu}} means ioemu is used to virtualize the VMX NIC. If no type is specified, vbd is used, as with paravirtualized guests.\\
 
@@ -2229,6 +2231,30 @@
 
 In the default configuration, VNC is on and SDL is off. Therefore VNC windows will open when VMX guests are created. If you want to use SDL to create VMX guests, set {\small {\tt sdl=1}} in your VMX configuration file. You can also turn off VNC by setting {\small {\tt vnc=0}}.
  
+\subsection{Using the mouse in a VNC window}
+The default PS/2 mouse will not work properly in a VMX guest viewed through a VNC window. Summagraphics mouse emulation does work in this environment. A Summagraphics mouse can be enabled by reconfiguring two services:
+
+{\small {\tt 1. General Purpose Mouse (GPM). The GPM daemon is configured in different ways in different Linux distributions. On a Redhat distribution, this is accomplished by changing the file `/etc/sysconfig/mouse' to have the following:\\
+MOUSETYPE="summa"\\
+XMOUSETYPE="SUMMA"\\
+DEVICE=/dev/ttyS0\\
+\\
+2. X11. For all Linux distributions, change the Mouse0 stanza in `/etc/X11/xorg.conf' to:\\
+Section "InputDevice"\\
+Identifier "Mouse0"\\
+Driver "summa"\\
+Option "Device" "/dev/ttyS0"\\
+Option "InputFashion" "Tablet"\\
+Option "Mode" "Absolute"\\
+Option "Name" "EasyPen"\\
+Option "Compatible" "True"\\
+Option "Protocol" "Auto"\\
+Option "SendCoreEvents" "on"\\
+Option "Vendor" "GENIUS"\\
+EndSection}}
+
+If the Summagraphics mouse isn't the default mouse, you can manually kill 'gpm' and restart it with the command "gpm -m /dev/ttyS0 -t summa". Note that the Summagraphics mouse makes no sense in an SDL window and is therefore not available in that environment.
+
 \subsection{Destroy VMX guests}
 VMX guests can be destroyed in the same way as can paravirtualized guests. We recommend that you type the command
 
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/arch/i386/Kconfig
--- a/linux-2.6-xen-sparse/arch/i386/Kconfig    Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/arch/i386/Kconfig    Fri Apr  7 10:52:00 2006
@@ -1116,9 +1116,7 @@
 menu "Instrumentation Support"
        depends on EXPERIMENTAL
 
-if !X86_XEN
 source "arch/i386/oprofile/Kconfig"
-endif
 
 config KPROBES
        bool "Kprobes (EXPERIMENTAL)"
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/arch/i386/Makefile
--- a/linux-2.6-xen-sparse/arch/i386/Makefile   Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/arch/i386/Makefile   Fri Apr  7 10:52:00 2006
@@ -162,3 +162,4 @@
 endef
 
 CLEAN_FILES += arch/$(ARCH)/boot/fdimage arch/$(ARCH)/boot/mtools.conf
+CLEAN_FILES += vmlinuz vmlinux-stripped
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c Fri Apr  7 10:52:00 2006
@@ -1317,6 +1317,11 @@
                }
        }
 #endif
+#ifdef CONFIG_KEXEC
+       if (crashk_res.start != crashk_res.end)
+               reserve_bootmem(crashk_res.start,
+                       crashk_res.end - crashk_res.start + 1);
+#endif
 
        if (!xen_feature(XENFEAT_auto_translated_physmap))
                phys_to_machine_mapping =
@@ -1435,11 +1440,6 @@
 #endif
                }
        }
-#endif
-#ifdef CONFIG_KEXEC
-       if (crashk_res.start != crashk_res.end)
-               reserve_bootmem(crashk_res.start,
-                       crashk_res.end - crashk_res.start + 1);
 #endif
 }
 
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/arch/i386/mm/ioremap-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/mm/ioremap-xen.c   Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/arch/i386/mm/ioremap-xen.c   Fri Apr  7 10:52:00 2006
@@ -177,6 +177,32 @@
 
 EXPORT_SYMBOL(touch_pte_range);
 
+void *vm_map_xen_pages (unsigned long maddr, int vm_size, pgprot_t prot)
+{
+       int error;
+       
+       struct vm_struct *vma;
+       vma = get_vm_area (vm_size, VM_IOREMAP);
+      
+       if (vma == NULL) {
+               printk ("ioremap.c,vm_map_xen_pages(): "
+                       "Failed to get VMA area\n");
+               return NULL;
+       }
+
+       error = direct_kernel_remap_pfn_range((unsigned long) vma->addr,
+                                             maddr >> PAGE_SHIFT, vm_size,
+                                             prot, DOMID_SELF );
+       if (error == 0) {
+               return vma->addr;
+       } else {
+               printk ("ioremap.c,vm_map_xen_pages(): "
+                       "Failed to map xen shared pages into kernel space\n");
+               return NULL;
+       }
+}
+EXPORT_SYMBOL(vm_map_xen_pages);
+
 /*
  * Does @address reside within a non-highmem page that is local to this virtual
  * machine (i.e., not an I/O page, nor a memory page belonging to another VM).
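
A note on the new helper above: vm_map_xen_pages() simply combines get_vm_area() with direct_kernel_remap_pfn_range(). A minimal caller might look like the sketch below; this is illustration only, not part of the patch, and map_shared_sketch(), its machine address, and its page count are hypothetical (vm_size is in bytes, hence the shift):

    /* Sketch: map nr_pages of Xen-shared memory starting at machine
     * address maddr into kernel virtual space with normal rw protection. */
    static void *map_shared_sketch(unsigned long maddr, int nr_pages)
    {
            return vm_map_xen_pages(maddr, nr_pages << PAGE_SHIFT,
                                    PAGE_KERNEL);
    }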
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
--- a/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c        Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c        Fri Apr  7 10:52:00 2006
@@ -215,52 +215,26 @@
 
 int blkif_schedule(void *arg)
 {
-       blkif_t          *blkif = arg;
+       blkif_t *blkif = arg;
 
        blkif_get(blkif);
+
        if (debug_lvl)
                printk(KERN_DEBUG "%s: started\n", current->comm);
-       for (;;) {
-               if (kthread_should_stop()) {
-                       /* asked to quit? */
-                       if (!atomic_read(&blkif->io_pending))
-                               break;
-                       if (debug_lvl)
-                               printk(KERN_DEBUG "%s: I/O pending, "
-                                      "delaying exit\n", current->comm);
-               }
-
-               if (!atomic_read(&blkif->io_pending)) {
-                       /* Wait for work to do. */
-                       wait_event_interruptible(
-                               blkif->wq,
-                               (atomic_read(&blkif->io_pending) ||
-                                kthread_should_stop()));
-               } else if (list_empty(&pending_free)) {
-                       /* Wait for pending_req becoming available. */
-                       wait_event_interruptible(
-                               pending_free_wq,
-                               !list_empty(&pending_free));
-               }
-
-               if (blkif->status != CONNECTED) {
-                       /* make sure we are connected */
-                       if (debug_lvl)
-                               printk(KERN_DEBUG "%s: not connected "
-                                      "(%d pending)\n",
-                                      current->comm,
-                                      atomic_read(&blkif->io_pending));
-                       wait_event_interruptible(
-                               blkif->wq,
-                               (blkif->status == CONNECTED ||
-                                kthread_should_stop()));
-                       continue;
-               }
-
-               /* Schedule I/O */
-               atomic_set(&blkif->io_pending, 0);
+
+       while (!kthread_should_stop()) {
+               wait_event_interruptible(
+                       blkif->wq,
+                       blkif->waiting_reqs || kthread_should_stop());
+               wait_event_interruptible(
+                       pending_free_wq,
+                       !list_empty(&pending_free) || kthread_should_stop());
+
+               blkif->waiting_reqs = 0;
+               smp_mb(); /* clear flag *before* checking for work */
+
                if (do_block_io_op(blkif))
-                       atomic_inc(&blkif->io_pending);
+                       blkif->waiting_reqs = 1;
                unplug_queue(blkif);
 
                if (log_stats && time_after(jiffies, blkif->st_print))
@@ -271,8 +245,10 @@
                print_stats(blkif);
        if (debug_lvl)
                printk(KERN_DEBUG "%s: exiting\n", current->comm);
+
        blkif->xenblkd = NULL;
        blkif_put(blkif);
+
        return 0;
 }
 
@@ -311,12 +287,15 @@
  * NOTIFICATION FROM GUEST OS.
  */
 
+static void blkif_notify_work(blkif_t *blkif)
+{
+       blkif->waiting_reqs = 1;
+       wake_up(&blkif->wq);
+}
+
 irqreturn_t blkif_be_int(int irq, void *dev_id, struct pt_regs *regs)
 {
-       blkif_t *blkif = dev_id;
-
-       atomic_inc(&blkif->io_pending);
-       wake_up(&blkif->wq);
+       blkif_notify_work(dev_id);
        return IRQ_HANDLED;
 }
 
@@ -536,10 +515,8 @@
        }
        spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
 
-       if (more_to_do) {
-               atomic_inc(&blkif->io_pending);
-               wake_up(&blkif->wq);
-       }
+       if (more_to_do)
+               blkif_notify_work(blkif);
        if (notify)
                notify_remote_via_irq(blkif->irq);
 }
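
The rewritten blkif_schedule() above reduces to a standard lost-wakeup-safe flag-and-waitqueue protocol; condensed sketch, drawn directly from the hunks above (illustration only):

    /* Producer (interrupt context): publish work, then wake the thread. */
    blkif->waiting_reqs = 1;
    wake_up(&blkif->wq);

    /* Consumer (kernel thread): clear the flag *before* scanning the ring,
     * so a request arriving mid-scan re-raises the flag rather than being
     * lost; smp_mb() orders the clear against the scan. */
    wait_event_interruptible(blkif->wq,
            blkif->waiting_reqs || kthread_should_stop());
    blkif->waiting_reqs = 0;
    smp_mb();
    if (do_block_io_op(blkif))
            blkif->waiting_reqs = 1;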
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/blkback/common.h
--- a/linux-2.6-xen-sparse/drivers/xen/blkback/common.h Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/blkback/common.h Fri Apr  7 10:52:00 2006
@@ -72,7 +72,6 @@
        /* Back pointer to the backend_info. */
        struct backend_info *be; 
        /* Private fields. */
-       enum { DISCONNECTED, CONNECTED } status;
 #ifdef CONFIG_XEN_BLKDEV_TAP_BE
        /* Is this a blktap frontend */
        unsigned int     is_blktap;
@@ -82,7 +81,7 @@
 
        wait_queue_head_t   wq;
        struct task_struct  *xenblkd;
-       atomic_t            io_pending;
+       unsigned int        waiting_reqs;
        request_queue_t     *plug;
 
        /* statistics */
@@ -133,8 +132,6 @@
 irqreturn_t blkif_be_int(int irq, void *dev_id, struct pt_regs *regs);
 int blkif_schedule(void *arg);
 
-void update_blkif_status(blkif_t *blkif); 
-
 #endif /* __BLKIF__BACKEND__COMMON_H__ */
 
 /*
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/blkback/interface.c
--- a/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c      Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/blkback/interface.c      Fri Apr  7 10:52:00 2006
@@ -45,7 +45,6 @@
 
        memset(blkif, 0, sizeof(*blkif));
        blkif->domid = domid;
-       blkif->status = DISCONNECTED;
        spin_lock_init(&blkif->blk_ring_lock);
        atomic_set(&blkif->refcnt, 1);
        init_waitqueue_head(&blkif->wq);
@@ -138,9 +137,6 @@
        blkif->irq = bind_evtchn_to_irqhandler(
                blkif->evtchn, blkif_be_int, 0, "blkif-backend", blkif);
 
-       /* We're potentially connected now */
-       update_blkif_status(blkif); 
-
        return 0;
 }
 
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c
--- a/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c Fri Apr  7 10:52:00 2006
@@ -16,7 +16,6 @@
     along with this program; if not, write to the Free Software
     Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */
-
 
 #include <stdarg.h>
 #include <linux/module.h>
@@ -25,36 +24,52 @@
 #include "common.h"
 
 #undef DPRINTK
-#define DPRINTK(fmt, args...) \
-    pr_debug("blkback/xenbus (%s:%d) " fmt ".\n", __FUNCTION__, __LINE__, 
##args)
-
+#define DPRINTK(fmt, args...)                          \
+       pr_debug("blkback/xenbus (%s:%d) " fmt ".\n",   \
+                __FUNCTION__, __LINE__, ##args)
 
 struct backend_info
 {
        struct xenbus_device *dev;
        blkif_t *blkif;
        struct xenbus_watch backend_watch;
-
        unsigned major;
        unsigned minor;
        char *mode;
 };
 
-
-static void maybe_connect(struct backend_info *);
 static void connect(struct backend_info *);
 static int connect_ring(struct backend_info *);
 static void backend_changed(struct xenbus_watch *, const char **,
                            unsigned int);
 
 
-void update_blkif_status(blkif_t *blkif)
+static void update_blkif_status(blkif_t *blkif)
 { 
-       if(blkif->irq && blkif->vbd.bdev) {
-               blkif->status = CONNECTED; 
-               (void)blkif_be_int(0, blkif, NULL); 
-       }
-       maybe_connect(blkif->be); 
+       int err;
+
+       /* Not ready to connect? */
+       if (!blkif->irq || !blkif->vbd.bdev)
+               return;
+
+       /* Already connected? */
+       if (blkif->be->dev->state == XenbusStateConnected)
+               return;
+
+       /* Attempt to connect: exit if we fail to. */
+       connect(blkif->be);
+       if (blkif->be->dev->state != XenbusStateConnected)
+               return;
+
+       blkif->xenblkd = kthread_run(blkif_schedule, blkif,
+                                    "xvd %d %02x:%02x",
+                                    blkif->domid,
+                                    blkif->be->major, blkif->be->minor);
+       if (IS_ERR(blkif->xenblkd)) {
+               err = PTR_ERR(blkif->xenblkd);
+               blkif->xenblkd = NULL;
+               xenbus_dev_error(blkif->be->dev, err, "start xenblkd");
+       }
 }
 
 
@@ -91,7 +106,6 @@
                be->backend_watch.node = NULL;
        }
        if (be->blkif) {
-               be->blkif->status = DISCONNECTED; 
                if (be->blkif->xenblkd)
                        kthread_stop(be->blkif->xenblkd);
                blkif_put(be->blkif);
@@ -185,8 +199,8 @@
                return;
        }
 
-       if (be->major && be->minor &&
-           (be->major != major || be->minor != minor)) {
+       if ((be->major || be->minor) &&
+           ((be->major != major) || (be->minor != minor))) {
                printk(KERN_WARNING
                       "blkback: changing physical device (from %x:%x to "
                       "%x:%x) not supported.\n", be->major, be->minor,
@@ -220,17 +234,6 @@
                        return;
                }
 
-               be->blkif->xenblkd = kthread_run(blkif_schedule, be->blkif,
-                                                "xvd %d %02x:%02x",
-                                                be->blkif->domid,
-                                                be->major, be->minor);
-               if (IS_ERR(be->blkif->xenblkd)) {
-                       err = PTR_ERR(be->blkif->xenblkd);
-                       be->blkif->xenblkd = NULL;
-                       xenbus_dev_error(dev, err, "start xenblkd");
-                       return;
-               }
-
                device_create_file(&dev->dev, &dev_attr_physical_device);
                device_create_file(&dev->dev, &dev_attr_mode);
 
@@ -288,14 +291,6 @@
 
 
 /* ** Connection ** */
-
-
-static void maybe_connect(struct backend_info *be)
-{
-       if ((be->major != 0 || be->minor != 0) &&
-           be->blkif->status == CONNECTED)
-               connect(be);
-}
 
 
 /**
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/core/reboot.c
--- a/linux-2.6-xen-sparse/drivers/xen/core/reboot.c    Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/core/reboot.c    Fri Apr  7 10:52:00 2006
@@ -85,6 +85,23 @@
 #define smp_resume()   ((void)0)
 #endif
 
+/* Ensure we run on the idle task page tables so that we will
+   switch page tables before running user space. This is needed
+   on architectures with separate kernel and user page tables
+   because the user page table pointer is not saved/restored. */
+static void switch_idle_mm(void)
+{
+       struct mm_struct *mm = current->active_mm;
+
+       if (mm == &init_mm)
+               return;
+
+       atomic_inc(&init_mm.mm_count);
+       switch_mm(mm, &init_mm, current);
+       current->active_mm = &init_mm;
+       mmdrop(mm);
+}
+
 static int __do_suspend(void *ignore)
 {
        int i, j, k, fpp, err;
@@ -163,6 +180,8 @@
        irq_resume();
 
        time_resume();
+
+       switch_idle_mm();
 
        __sti();
 
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/netback/netback.c
--- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c        Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c        Fri Apr  7 10:52:00 2006
@@ -301,9 +301,6 @@
                netif   = netdev_priv(skb->dev);
                size    = skb->tail - skb->data;
 
-               /* Rederive the machine addresses. */
-               new_mfn = mcl->args[1] >> PAGE_SHIFT;
-               old_mfn = gop->mfn;
                atomic_set(&(skb_shinfo(skb)->dataref), 1);
                skb_shinfo(skb)->nr_frags = 0;
                skb_shinfo(skb)->frag_list = NULL;
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c
--- a/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c      Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c      Fri Apr  7 10:52:00 2006
@@ -993,8 +993,8 @@
         * the RX ring because some of our pages are currently flipped out
         * so we can't just free the RX skbs.
         * NB2. Freelist index entries are always going to be less than
-        *  __PAGE_OFFSET, whereas pointers to skbs will always be equal or
-        * greater than __PAGE_OFFSET: we use this property to distinguish
+        *  PAGE_OFFSET, whereas pointers to skbs will always be equal or
+        * greater than PAGE_OFFSET: we use this property to distinguish
         * them.
         */
 
@@ -1005,7 +1005,7 @@
         * interface has been down.
         */
        for (requeue_idx = 0, i = 1; i <= NET_TX_RING_SIZE; i++) {
-               if ((unsigned long)np->tx_skbs[i] < __PAGE_OFFSET)
+               if ((unsigned long)np->tx_skbs[i] < PAGE_OFFSET)
                        continue;
 
                skb = np->tx_skbs[i];
@@ -1036,7 +1036,7 @@
 
        /* Rebuild the RX buffer freelist and the RX ring itself. */
        for (requeue_idx = 0, i = 1; i <= NET_RX_RING_SIZE; i++) {
-               if ((unsigned long)np->rx_skbs[i] < __PAGE_OFFSET)
+               if ((unsigned long)np->rx_skbs[i] < PAGE_OFFSET)
                        continue;
                gnttab_grant_foreign_transfer_ref(
                        np->grant_rx_ref[i], np->xbdev->otherend_id,
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c
--- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Fri Apr  7 10:52:00 2006
@@ -277,6 +277,7 @@
        set_bit(__HYPERVISOR_mmu_update,       hypercall_permission_map);
        set_bit(__HYPERVISOR_mmuext_op,        hypercall_permission_map);
        set_bit(__HYPERVISOR_xen_version,      hypercall_permission_map);
+       set_bit(__HYPERVISOR_sched_op,         hypercall_permission_map);
 
        privcmd_intf = create_xen_proc_entry("privcmd", 0400);
        if (privcmd_intf != NULL)
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/drivers/xen/tpmfront/tpmfront.c
--- a/linux-2.6-xen-sparse/drivers/xen/tpmfront/tpmfront.c      Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/drivers/xen/tpmfront/tpmfront.c      Fri Apr  7 10:52:00 2006
@@ -65,14 +65,18 @@
                              void *tpm_priv,
                              struct pt_regs *ptregs);
 static void tpmif_rx_action(unsigned long unused);
-static void tpmif_connect(struct tpm_private *tp, domid_t domid);
+static int tpmif_connect(struct xenbus_device *dev,
+                         struct tpm_private *tp,
+                         domid_t domid);
 static DECLARE_TASKLET(tpmif_rx_tasklet, tpmif_rx_action, 0);
-static int tpm_allocate_buffers(struct tpm_private *tp);
+static int tpmif_allocate_tx_buffers(struct tpm_private *tp);
+static void tpmif_free_tx_buffers(struct tpm_private *tp);
 static void tpmif_set_connected_state(struct tpm_private *tp,
                                       u8 newstate);
 static int tpm_xmit(struct tpm_private *tp,
                     const u8 * buf, size_t count, int userbuffer,
                     void *remember);
+static void destroy_tpmring(struct tpm_private *tp);
 
 #define DPRINTK(fmt, args...) \
     pr_debug("xen_tpm_fr (%s:%d) " fmt, __FUNCTION__, __LINE__, ##args)
@@ -80,6 +84,8 @@
     printk(KERN_INFO "xen_tpm_fr: " fmt, ##args)
 #define WPRINTK(fmt, args...) \
     printk(KERN_WARNING "xen_tpm_fr: " fmt, ##args)
+
+#define GRANT_INVALID_REF      0
 
 
 static inline int
@@ -119,6 +125,14 @@
 }
 
 
+static inline void tx_buffer_free(struct tx_buffer *txb)
+{
+       if (txb) {
+               free_page((long)txb->data);
+               kfree(txb);
+       }
+}
+
 /**************************************************************
  Utility function for the tpm_private structure
 **************************************************************/
@@ -128,21 +142,27 @@
        init_waitqueue_head(&tp->wait_q);
 }
 
+static inline void tpm_private_free(void)
+{
+       tpmif_free_tx_buffers(my_priv);
+       kfree(my_priv);
+       my_priv = NULL;
+}
+
 static struct tpm_private *tpm_private_get(void)
 {
+       int err;
        if (!my_priv) {
                my_priv = kzalloc(sizeof(struct tpm_private), GFP_KERNEL);
                if (my_priv) {
                        tpm_private_init(my_priv);
+                       err = tpmif_allocate_tx_buffers(my_priv);
+                       if (err < 0) {
+                               tpm_private_free();
+                       }
                }
        }
        return my_priv;
-}
-
-static inline void tpm_private_free(void)
-{
-       kfree(my_priv);
-       my_priv = NULL;
 }
 
 /**************************************************************
@@ -233,14 +253,14 @@
        tpmif_tx_interface_t *sring;
        int err;
 
+       tp->ring_ref = GRANT_INVALID_REF;
+
        sring = (void *)__get_free_page(GFP_KERNEL);
        if (!sring) {
                xenbus_dev_fatal(dev, -ENOMEM, "allocating shared ring");
                return -ENOMEM;
        }
        tp->tx = sring;
-
-       tpm_allocate_buffers(tp);
 
        err = xenbus_grant_ring(dev, virt_to_mfn(tp->tx));
        if (err < 0) {
@@ -251,14 +271,13 @@
        }
        tp->ring_ref = err;
 
-       err = xenbus_alloc_evtchn(dev, &tp->evtchn);
+       err = tpmif_connect(dev, tp, dev->otherend_id);
        if (err)
                goto fail;
 
-       tpmif_connect(tp, dev->otherend_id);
-
        return 0;
 fail:
+       destroy_tpmring(tp);
        return err;
 }
 
@@ -266,14 +285,17 @@
 static void destroy_tpmring(struct tpm_private *tp)
 {
        tpmif_set_connected_state(tp, 0);
-       if (tp->tx != NULL) {
+
+       if (tp->ring_ref != GRANT_INVALID_REF) {
                gnttab_end_foreign_access(tp->ring_ref, 0,
                                          (unsigned long)tp->tx);
+               tp->ring_ref = GRANT_INVALID_REF;
                tp->tx = NULL;
        }
 
        if (tp->irq)
-               unbind_from_irqhandler(tp->irq, NULL);
+               unbind_from_irqhandler(tp->irq, tp);
+
        tp->evtchn = tp->irq = 0;
 }
 
@@ -377,6 +399,9 @@
        int handle;
        struct tpm_private *tp = tpm_private_get();
 
+       if (!tp)
+               return -ENOMEM;
+
        err = xenbus_scanf(XBT_NULL, dev->nodename,
                           "handle", "%i", &handle);
        if (XENBUS_EXIST_ERR(err))
@@ -402,15 +427,14 @@
 
 static int tpmfront_remove(struct xenbus_device *dev)
 {
-       struct tpm_private *tp = dev->data;
+       struct tpm_private *tp = (struct tpm_private *)dev->data;
        destroy_tpmring(tp);
        return 0;
 }
 
-static int
-tpmfront_suspend(struct xenbus_device *dev)
-{
-       struct tpm_private *tp = dev->data;
+static int tpmfront_suspend(struct xenbus_device *dev)
+{
+       struct tpm_private *tp = (struct tpm_private *)dev->data;
        u32 ctr;
 
        /* lock, so no app can send */
@@ -437,29 +461,35 @@
        return 0;
 }
 
-static int
-tpmfront_resume(struct xenbus_device *dev)
-{
-       struct tpm_private *tp = dev->data;
+static int tpmfront_resume(struct xenbus_device *dev)
+{
+       struct tpm_private *tp = (struct tpm_private *)dev->data;
+       destroy_tpmring(tp);
        return talk_to_backend(dev, tp);
 }
 
-static void
-tpmif_connect(struct tpm_private *tp, domid_t domid)
+static int tpmif_connect(struct xenbus_device *dev,
+                         struct tpm_private *tp,
+                         domid_t domid)
 {
        int err;
 
        tp->backend_id = domid;
+
+       err = xenbus_alloc_evtchn(dev, &tp->evtchn);
+       if (err)
+               return err;
 
        err = bind_evtchn_to_irqhandler(tp->evtchn,
                                        tpmif_int, SA_SAMPLE_RANDOM, "tpmif",
                                        tp);
        if (err <= 0) {
                WPRINTK("bind_evtchn_to_irqhandler failed (err=%d)\n", err);
-               return;
+               return err;
        }
 
        tp->irq = err;
+       return 0;
 }
 
 static struct xenbus_device_id tpmfront_ids[] = {
@@ -488,19 +518,30 @@
        xenbus_unregister_driver(&tpmfront);
 }
 
-
-static int
-tpm_allocate_buffers(struct tpm_private *tp)
+static int tpmif_allocate_tx_buffers(struct tpm_private *tp)
 {
        unsigned int i;
 
-       for (i = 0; i < TPMIF_TX_RING_SIZE; i++)
+       for (i = 0; i < TPMIF_TX_RING_SIZE; i++) {
                tp->tx_buffers[i] = tx_buffer_alloc();
-       return 1;
-}
-
-static void
-tpmif_rx_action(unsigned long priv)
+               if (!tp->tx_buffers[i]) {
+                       tpmif_free_tx_buffers(tp);
+                       return -ENOMEM;
+               }
+       }
+       return 0;
+}
+
+static void tpmif_free_tx_buffers(struct tpm_private *tp)
+{
+       unsigned int i;
+
+       for (i = 0; i < TPMIF_TX_RING_SIZE; i++) {
+               tx_buffer_free(tp->tx_buffers[i]);
+       }
+}
+
+static void tpmif_rx_action(unsigned long priv)
 {
        struct tpm_private *tp = (struct tpm_private *)priv;
 
@@ -545,8 +586,7 @@
 }
 
 
-static irqreturn_t
-tpmif_int(int irq, void *tpm_priv, struct pt_regs *ptregs)
+static irqreturn_t tpmif_int(int irq, void *tpm_priv, struct pt_regs *ptregs)
 {
        struct tpm_private *tp = tpm_priv;
        unsigned long flags;
@@ -560,10 +600,9 @@
 }
 
 
-static int
-tpm_xmit(struct tpm_private *tp,
-         const u8 * buf, size_t count, int isuserbuffer,
-         void *remember)
+static int tpm_xmit(struct tpm_private *tp,
+                    const u8 * buf, size_t count, int isuserbuffer,
+                    void *remember)
 {
        tpmif_tx_request_t *tx;
        TPMIF_RING_IDX i;
@@ -693,8 +732,7 @@
  * =================================================================
  */
 
-static int __init
-tpmif_init(void)
+static int __init tpmif_init(void)
 {
        IPRINTK("Initialising the vTPM driver.\n");
        if ( gnttab_alloc_grant_references ( TPMIF_TX_RING_SIZE,
@@ -709,8 +747,7 @@
 
 module_init(tpmif_init);
 
-static void __exit
-tpmif_exit(void)
+static void __exit tpmif_exit(void)
 {
        exit_tpm_xenbus();
        gnttab_free_grant_references(gref_head);
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/hypercall.h
--- a/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/hypercall.h    Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/hypercall.h    Fri Apr  7 10:52:00 2006
@@ -328,6 +328,21 @@
 {
        return _hypercall2(int, nmi_op, op, arg);
 }
+
+static inline int
+HYPERVISOR_callback_op(
+       int cmd, void *arg)
+{
+       return _hypercall2(int, callback_op, cmd, arg);
+}
+
+static inline int
+HYPERVISOR_xenoprof_op(
+       int op, unsigned long arg1, unsigned long arg2)
+{
+       return _hypercall3(int, xenoprof_op, op, arg1, arg2);
+}
+
 
 #endif /* __HYPERCALL_H__ */
 
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/include/asm-i386/mach-xen/setup_arch_post.h
--- a/linux-2.6-xen-sparse/include/asm-i386/mach-xen/setup_arch_post.h  Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/include/asm-i386/mach-xen/setup_arch_post.h  Fri Apr  7 10:52:00 2006
@@ -5,6 +5,8 @@
  *     This is included late in kernel/setup.c so that it can make
  *     use of all of the static functions.
  **/
+
+#include <xen/interface/callback.h>
 
 static char * __init machine_specific_memory_setup(void)
 {
@@ -23,6 +25,14 @@
 static void __init machine_specific_arch_setup(void)
 {
        struct xen_platform_parameters pp;
+       struct callback_register event = {
+               .type = CALLBACKTYPE_event,
+               .address = { __KERNEL_CS, (unsigned long)hypervisor_callback },
+       };
+       struct callback_register failsafe = {
+               .type = CALLBACKTYPE_failsafe,
+               .address = { __KERNEL_CS, (unsigned long)failsafe_callback },
+       };
        struct xennmi_callback cb;
 
        if (xen_feature(XENFEAT_auto_translated_physmap) &&
@@ -32,9 +42,8 @@
                memset(empty_zero_page, 0, sizeof(empty_zero_page));
        }
 
-       HYPERVISOR_set_callbacks(
-           __KERNEL_CS, (unsigned long)hypervisor_callback,
-           __KERNEL_CS, (unsigned long)failsafe_callback);
+       HYPERVISOR_callback_op(CALLBACKOP_register, &event);
+       HYPERVISOR_callback_op(CALLBACKOP_register, &failsafe);
 
        cb.handler_address = (unsigned long)&nmi;
        HYPERVISOR_nmi_op(XENNMI_register_callback, &cb);
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/hypercall.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/hypercall.h  Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/hypercall.h  Fri Apr  7 10:52:00 2006
@@ -328,6 +328,20 @@
        unsigned long op, void *arg)
 {
        return _hypercall2(int, nmi_op, op, arg);
+}
+
+static inline int
+HYPERVISOR_callback_op(
+       int cmd, void *arg)
+{
+       return _hypercall2(int, callback_op, cmd, arg);
+}
+
+static inline int
+HYPERVISOR_xenoprof_op(
+       int op, unsigned long arg1, unsigned long arg2)
+{
+       return _hypercall3(int, xenoprof_op, op, arg1, arg2);
 }
 
 #endif /* __HYPERCALL_H__ */
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/setup_arch_post.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/setup_arch_post.h        Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/setup_arch_post.h        Fri Apr  7 10:52:00 2006
@@ -6,20 +6,33 @@
  *     use of all of the static functions.
  **/
 
+#include <xen/interface/callback.h>
+
 extern void hypervisor_callback(void);
 extern void failsafe_callback(void);
 extern void nmi(void);
 
 static void __init machine_specific_arch_setup(void)
 {
+       struct callback_register event = {
+               .type = CALLBACKTYPE_event,
+               .address = (unsigned long) hypervisor_callback,
+       };
+       struct callback_register failsafe = {
+               .type = CALLBACKTYPE_failsafe,
+               .address = (unsigned long)failsafe_callback,
+       };
+       struct callback_register syscall = {
+               .type = CALLBACKTYPE_syscall,
+               .address = (unsigned long)system_call,
+       };
 #ifdef CONFIG_X86_LOCAL_APIC
        struct xennmi_callback cb;
 #endif
 
-       HYPERVISOR_set_callbacks(
-                (unsigned long) hypervisor_callback,
-                (unsigned long) failsafe_callback,
-                (unsigned long) system_call);
+       HYPERVISOR_callback_op(CALLBACKOP_register, &event);
+       HYPERVISOR_callback_op(CALLBACKOP_register, &failsafe);
+       HYPERVISOR_callback_op(CALLBACKOP_register, &syscall);
 
 #ifdef CONFIG_X86_LOCAL_APIC
        cb.handler_address = (unsigned long)&nmi;
diff -r 9fcfdab04aa9 -r fb174770f426 tools/examples/init.d/xend
--- a/tools/examples/init.d/xend        Thu Apr  6 13:22:52 2006
+++ b/tools/examples/init.d/xend        Fri Apr  7 10:52:00 2006
@@ -7,7 +7,7 @@
 # chkconfig: 2345 98 01
 # description: Starts and stops the Xen control daemon.
 
-if ! [ -e /proc/xen/privcmd ]; then
+if ! grep -q "control_d" /proc/xen/capabilities ; then
        exit 0
 fi
 
diff -r 9fcfdab04aa9 -r fb174770f426 tools/examples/vtpm-common.sh
--- a/tools/examples/vtpm-common.sh     Thu Apr  6 13:22:52 2006
+++ b/tools/examples/vtpm-common.sh     Fri Apr  7 10:52:00 2006
@@ -261,12 +261,6 @@
 
        if [ "$REASON" == "create" ]; then
                vtpm_reset $instance
-       elif [ "$REASON" == "resume" ]; then
-               vtpm_setup $instance
-       else
-               #default case for 'now'
-               #vtpm_reset $instance
-               true
        fi
        xenstore_write $XENBUS_PATH/instance $instance
 }
diff -r 9fcfdab04aa9 -r fb174770f426 tools/ioemu/target-i386-dm/helper2.c
--- a/tools/ioemu/target-i386-dm/helper2.c      Thu Apr  6 13:22:52 2006
+++ b/tools/ioemu/target-i386-dm/helper2.c      Fri Apr  7 10:52:00 2006
@@ -409,12 +409,20 @@
 void
 destroy_hvm_domain(void)
 {
-    extern FILE* logfile;
-    char destroy_cmd[32];
-
-    sprintf(destroy_cmd, "xm destroy %d", domid);
-    if (system(destroy_cmd) == -1)
-        fprintf(logfile, "%s failed.!\n", destroy_cmd);
+   int xcHandle;
+   int sts;
+ 
+   xcHandle = xc_interface_open();
+   if (xcHandle < 0)
+     fprintf(logfile, "Cannot acquire xenctrl handle\n");
+   else {
+     sts = xc_domain_shutdown(xcHandle, domid, SHUTDOWN_poweroff);
+     if (sts != 0)
+       fprintf(logfile, "? xc_domain_shutdown failed to issue poweroff, sts 
%d, errno %d\n", sts, errno);
+     else
+       fprintf(logfile, "Issued domain %d poweroff\n", domid);
+     xc_interface_close(xcHandle);
+   }
 }
 
 fd_set wakeup_rfds;
@@ -480,13 +488,24 @@
 
 static void qemu_hvm_reset(void *unused)
 {
-    char cmd[64];
-
-    /* pause domain first, to avoid repeated reboot request*/
-    xc_domain_pause(xc_handle, domid);
-
-    sprintf(cmd, "xm shutdown -R %d", domid);
-    system(cmd);
+   int xcHandle;
+   int sts;
+
+   /* pause domain first, to avoid repeated reboot request*/
+   xc_domain_pause(xc_handle, domid);
+
+   xcHandle = xc_interface_open();
+   if (xcHandle < 0)
+     fprintf(logfile, "Cannot acquire xenctrl handle\n");
+   else {
+     sts = xc_domain_shutdown(xcHandle, domid, SHUTDOWN_reboot);
+     if (sts != 0)
+       fprintf(logfile, "? xc_domain_shutdown failed to issue reboot, sts 
%d\n", sts);
+     else
+       fprintf(logfile, "Issued domain %d reboot\n", domid);
+     xc_interface_close(xcHandle);
+   }
+ 
 }
 
 CPUState * cpu_init()
diff -r 9fcfdab04aa9 -r fb174770f426 tools/ioemu/vl.c
--- a/tools/ioemu/vl.c  Thu Apr  6 13:22:52 2006
+++ b/tools/ioemu/vl.c  Fri Apr  7 10:52:00 2006
@@ -2556,8 +2556,10 @@
         return -1;
     }
 
+#if 0 /* Generates lots of log file output - turn on for debugging */
     for (i = 0; i < nr_pages; i++)
         fprintf(stderr, "set_map result i %x result %lx\n", i, 
extent_start[i]);
+#endif
 
     return 0;
 }
diff -r 9fcfdab04aa9 -r fb174770f426 tools/libxc/xc_domain.c
--- a/tools/libxc/xc_domain.c   Thu Apr  6 13:22:52 2006
+++ b/tools/libxc/xc_domain.c   Fri Apr  7 10:52:00 2006
@@ -57,6 +57,35 @@
     op.u.destroydomain.domain = (domid_t)domid;
     return do_dom0_op(xc_handle, &op);
 }
+
+int xc_domain_shutdown(int xc_handle,
+                       uint32_t domid,
+                       int reason)
+{
+    int ret = -1;
+    sched_remote_shutdown_t arg;
+    DECLARE_HYPERCALL;
+
+    hypercall.op     = __HYPERVISOR_sched_op;
+    hypercall.arg[0] = (unsigned long)SCHEDOP_remote_shutdown;
+    hypercall.arg[1] = (unsigned long)&arg;
+    arg.domain_id = domid;
+    arg.reason = reason;
+
+    if ( mlock(&arg, sizeof(arg)) != 0 )
+    {
+        PERROR("Could not lock memory for Xen hypercall");
+        goto out1;
+    }
+
+    ret = do_xen_hypercall(xc_handle, &hypercall);
+
+    safe_munlock(&arg, sizeof(arg));
+
+ out1:
+    return ret;
+}
+
 
 int xc_vcpu_setaffinity(int xc_handle,
                         uint32_t domid, 
diff -r 9fcfdab04aa9 -r fb174770f426 tools/libxc/xc_linux_restore.c
--- a/tools/libxc/xc_linux_restore.c    Thu Apr  6 13:22:52 2006
+++ b/tools/libxc/xc_linux_restore.c    Fri Apr  7 10:52:00 2006
@@ -646,18 +646,14 @@
         goto out;
     }
 
-    if ((pt_levels == 2) && ((pfn_type[pfn]&LTABTYPE_MASK) != L2TAB)) { 
+    if ( (pfn_type[pfn] & LTABTYPE_MASK) != 
+         ((unsigned long)pt_levels<<LTAB_SHIFT) ) {
         ERR("PT base is bad. pfn=%lu nr=%lu type=%08lx %08lx",
-            pfn, max_pfn, pfn_type[pfn], (unsigned long)L2TAB);
-        goto out;
-    }
-
-    if ((pt_levels == 3) && ((pfn_type[pfn]&LTABTYPE_MASK) != L3TAB)) { 
-        ERR("PT base is bad. pfn=%lu nr=%lu type=%08lx %08lx",
-            pfn, max_pfn, pfn_type[pfn], (unsigned long)L3TAB);
-        goto out;
-    }
-    
+            pfn, max_pfn, pfn_type[pfn], 
+            (unsigned long)pt_levels<<LTAB_SHIFT); 
+        goto out;
+    }
+
     ctxt.ctrlreg[3] = p2m[pfn] << PAGE_SHIFT;
 
     /* clear any pending events and the selector */
diff -r 9fcfdab04aa9 -r fb174770f426 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h     Thu Apr  6 13:22:52 2006
+++ b/tools/libxc/xenctrl.h     Fri Apr  7 10:52:00 2006
@@ -206,6 +206,21 @@
 int xc_domain_destroy(int xc_handle, 
                       uint32_t domid);
 
+/**
+ * This function will shut down a domain. This is intended for use in
+ * fully-virtualized domains where this operation is analogous to the
+ * sched_op operations in a paravirtualized domain. The caller is
+ * expected to give the reason for the shutdown.
+ *
+ * @parm xc_handle a handle to an open hypervisor interface
+ * @parm domid the id of the domain to shut down
+ * @parm reason is the reason (SHUTDOWN_xxx) for the shutdown
+ * @return 0 on success, -1 on failure
+ */
+int xc_domain_shutdown(int xc_handle, 
+                       uint32_t domid,
+                       int reason);
+
 int xc_vcpu_setaffinity(int xc_handle,
                         uint32_t domid,
                         int vcpu,
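
For illustration, a caller of the new xc_domain_shutdown() might look like the sketch below, mirroring the ioemu changes earlier in this changeset (poweroff_domain_sketch() is hypothetical; SHUTDOWN_poweroff comes from the public sched.h interface):

    #include <xenctrl.h>

    static int poweroff_domain_sketch(uint32_t domid)
    {
        int rc, xc_handle = xc_interface_open();

        if (xc_handle < 0)
            return -1;                    /* could not acquire a handle */
        rc = xc_domain_shutdown(xc_handle, domid, SHUTDOWN_poweroff);
        xc_interface_close(xc_handle);
        return rc;                        /* 0 on success, -1 on failure */
    }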
diff -r 9fcfdab04aa9 -r fb174770f426 tools/xm-test/tests/vtpm/02_vtpm-cat_pcrs.py
--- a/tools/xm-test/tests/vtpm/02_vtpm-cat_pcrs.py      Thu Apr  6 13:22:52 2006
+++ b/tools/xm-test/tests/vtpm/02_vtpm-cat_pcrs.py      Fri Apr  7 10:52:00 2006
@@ -46,6 +46,7 @@
     FAIL(str(e))
 
 if re.search("No such file",run["output"]):
+    vtpm_cleanup(domName)
     FAIL("TPM frontend support not compiled into (domU?) kernel")
 
 console.closeConsole()
diff -r 9fcfdab04aa9 -r fb174770f426 tools/xm-test/tests/vtpm/03_vtpm-susp_res.py
--- a/tools/xm-test/tests/vtpm/03_vtpm-susp_res.py      Thu Apr  6 13:22:52 2006
+++ b/tools/xm-test/tests/vtpm/03_vtpm-susp_res.py      Fri Apr  7 10:52:00 2006
@@ -47,6 +47,7 @@
     FAIL(str(e))
 
 if re.search("No such file",run["output"]):
+    vtpm_cleanup(domName)
     FAIL("TPM frontend support not compiled into (domU?) kernel")
 
 console.closeConsole()
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/Makefile
--- a/xen/arch/x86/Makefile     Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/Makefile     Fri Apr  7 10:52:00 2006
@@ -2,6 +2,7 @@
 subdir-y += cpu
 subdir-y += genapic
 subdir-y += hvm
+subdir-y += oprofile
 
 subdir-$(x86_32) += x86_32
 subdir-$(x86_64) += x86_64
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/domain.c     Fri Apr  7 10:52:00 2006
@@ -961,6 +961,10 @@
     /* Relinquish every page of memory. */
     relinquish_memory(d, &d->xenpage_list);
     relinquish_memory(d, &d->page_list);
+
+    /* Free page used by xen oprofile buffer */
+    free_xenoprof_pages(d);
+
 }
 
 void arch_dump_domain_info(struct domain *d)
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/shutdown.c
--- a/xen/arch/x86/shutdown.c   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/shutdown.c   Fri Apr  7 10:52:00 2006
@@ -44,7 +44,7 @@
 void __attribute__((noreturn)) __machine_halt(void *unused)
 {
     for ( ; ; )
-        safe_halt();
+        __asm__ __volatile__ ( "hlt" );
 }
 
 void machine_halt(void)
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/traps.c
--- a/xen/arch/x86/traps.c      Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/traps.c      Fri Apr  7 10:52:00 2006
@@ -32,6 +32,7 @@
 #include <xen/errno.h>
 #include <xen/mm.h>
 #include <xen/console.h>
+#include <xen/reboot.h>
 #include <asm/regs.h>
 #include <xen/delay.h>
 #include <xen/event.h>
@@ -318,8 +319,7 @@
     console_force_lock();
 
     /* Wait for manual reset. */
-    for ( ; ; )
-        __asm__ __volatile__ ( "hlt" );
+    machine_halt();
 }
 
 static inline int do_trap(int trapnr, char *str,
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/x86_32/entry.S
--- a/xen/arch/x86/x86_32/entry.S       Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/x86_32/entry.S       Fri Apr  7 10:52:00 2006
@@ -119,7 +119,7 @@
         movl  $DBLFLT1,%eax
         pushl %eax                     # EIP
         pushl %esi                     # error_code/entry_vector
-        jmp   error_code
+        jmp   handle_exception
 DBLFLT1:GET_CURRENT(%ebx)
         jmp   test_all_events
 failsafe_callback:
@@ -381,14 +381,6 @@
         jmp   __domain_crash_synchronous
 
         ALIGN
-process_guest_exception_and_events:
-        leal VCPU_trap_bounce(%ebx),%edx
-        testb $TBF_EXCEPTION,TRAPBOUNCE_flags(%edx)
-        jz   test_all_events
-        call create_bounce_frame
-        jmp  test_all_events
-
-        ALIGN
 ENTRY(ret_from_intr)
         GET_CURRENT(%ebx)
         movl  UREGS_eflags(%esp),%eax
@@ -400,7 +392,7 @@
 ENTRY(divide_error)
        pushl $TRAP_divide_error<<16
        ALIGN
-error_code:
+handle_exception:
         FIXUP_RING0_GUEST_STACK
         SAVE_ALL_NOSEGREGS(a)
         SET_XEN_SEGMENTS(a)
@@ -419,7 +411,11 @@
         movb  UREGS_cs(%esp),%al
         testl $(3|X86_EFLAGS_VM),%eax
        jz    restore_all_xen
-        jmp   process_guest_exception_and_events
+        leal  VCPU_trap_bounce(%ebx),%edx
+        testb $TBF_EXCEPTION,TRAPBOUNCE_flags(%edx)
+        jz    test_all_events
+        call  create_bounce_frame
+        jmp   test_all_events
 
 exception_with_ints_disabled:
         movl  UREGS_eflags(%esp),%eax
@@ -452,71 +448,71 @@
                                         
 ENTRY(coprocessor_error)
        pushl $TRAP_copro_error<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(simd_coprocessor_error)
        pushl $TRAP_simd_error<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(device_not_available)
        pushl $TRAP_no_device<<16
-        jmp   error_code
+        jmp   handle_exception
 
 ENTRY(debug)
        pushl $TRAP_debug<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(int3)
        pushl $TRAP_int3<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(overflow)
        pushl $TRAP_overflow<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(bounds)
        pushl $TRAP_bounds<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(invalid_op)
        pushl $TRAP_invalid_op<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(coprocessor_segment_overrun)
        pushl $TRAP_copro_seg<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(invalid_TSS)
-        movw $TRAP_invalid_tss,2(%esp)
-       jmp error_code
+        movw  $TRAP_invalid_tss,2(%esp)
+       jmp   handle_exception
 
 ENTRY(segment_not_present)
-        movw $TRAP_no_segment,2(%esp)
-       jmp error_code
+        movw  $TRAP_no_segment,2(%esp)
+       jmp   handle_exception
 
 ENTRY(stack_segment)
-        movw $TRAP_stack_error,2(%esp)
-       jmp error_code
+        movw  $TRAP_stack_error,2(%esp)
+       jmp   handle_exception
 
 ENTRY(general_protection)
-        movw $TRAP_gp_fault,2(%esp)
-       jmp error_code
+        movw  $TRAP_gp_fault,2(%esp)
+       jmp   handle_exception
 
 ENTRY(alignment_check)
-        movw $TRAP_alignment_check,2(%esp)
-       jmp error_code
+        movw  $TRAP_alignment_check,2(%esp)
+       jmp   handle_exception
 
 ENTRY(page_fault)
-        movw $TRAP_page_fault,2(%esp)
-       jmp error_code
+        movw  $TRAP_page_fault,2(%esp)
+       jmp   handle_exception
 
 ENTRY(machine_check)
         pushl $TRAP_machine_check<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(spurious_interrupt_bug)
         pushl $TRAP_spurious_int<<16
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(nmi)
 #ifdef CONFIG_X86_SUPERVISOR_MODE_KERNEL
@@ -648,6 +644,8 @@
         .long do_acm_op
         .long do_nmi_op
         .long do_arch_sched_op
+        .long do_callback_op        /* 30 */
+        .long do_xenoprof_op
         .rept NR_hypercalls-((.-hypercall_table)/4)
         .long do_ni_hypercall
         .endr
@@ -683,6 +681,8 @@
         .byte 1 /* do_acm_op            */
         .byte 2 /* do_nmi_op            */
         .byte 2 /* do_arch_sched_op     */
+        .byte 2 /* do_callback_op       */  /* 30 */
+        .byte 3 /* do_xenoprof_op       */
         .rept NR_hypercalls-(.-hypercall_args_table)
         .byte 0 /* do_ni_hypercall      */
         .endr
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/x86_32/traps.c
--- a/xen/arch/x86/x86_32/traps.c       Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/x86_32/traps.c       Fri Apr  7 10:52:00 2006
@@ -9,10 +9,13 @@
 #include <xen/mm.h>
 #include <xen/irq.h>
 #include <xen/symbols.h>
+#include <xen/reboot.h>
 #include <asm/current.h>
 #include <asm/flushtlb.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/support.h>
+
+#include <public/callback.h>
 
 /* All CPUs have their own IDT to allow int80 direct trap. */
 idt_entry_t *idt_tables[NR_CPUS] = { 0 };
@@ -178,8 +181,7 @@
     console_force_lock();
 
     /* Wait for manual reset. */
-    for ( ; ; )
-        __asm__ __volatile__ ( "hlt" );
+    machine_halt();
 }
 
 unsigned long do_iret(void)
@@ -315,20 +317,102 @@
         set_int80_direct_trap(v);
 }
 
+static long register_guest_callback(struct callback_register *reg)
+{
+    long ret = 0;
+    struct vcpu *v = current;
+
+    fixup_guest_code_selector(reg->address.cs);
+
+    switch ( reg->type )
+    {
+    case CALLBACKTYPE_event:
+        v->arch.guest_context.event_callback_cs     = reg->address.cs;
+        v->arch.guest_context.event_callback_eip    = reg->address.eip;
+        break;
+
+    case CALLBACKTYPE_failsafe:
+        v->arch.guest_context.failsafe_callback_cs  = reg->address.cs;
+        v->arch.guest_context.failsafe_callback_eip = reg->address.eip;
+        break;
+
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
+static long unregister_guest_callback(struct callback_unregister *unreg)
+{
+    long ret;
+
+    switch ( unreg->type )
+    {
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
+
+long do_callback_op(int cmd, GUEST_HANDLE(void) arg)
+{
+    long ret;
+
+    switch ( cmd )
+    {
+    case CALLBACKOP_register:
+    {
+        struct callback_register reg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&reg, arg, 1) )
+            break;
+
+        ret = register_guest_callback(&reg);
+    }
+    break;
+
+    case CALLBACKOP_unregister:
+    {
+        struct callback_unregister unreg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&unreg, arg, 1) )
+            break;
+
+        ret = unregister_guest_callback(&unreg);
+    }
+    break;
+
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
 long do_set_callbacks(unsigned long event_selector,
                       unsigned long event_address,
                       unsigned long failsafe_selector,
                       unsigned long failsafe_address)
 {
-    struct vcpu *d = current;
-
-    fixup_guest_code_selector(event_selector);
-    fixup_guest_code_selector(failsafe_selector);
-
-    d->arch.guest_context.event_callback_cs     = event_selector;
-    d->arch.guest_context.event_callback_eip    = event_address;
-    d->arch.guest_context.failsafe_callback_cs  = failsafe_selector;
-    d->arch.guest_context.failsafe_callback_eip = failsafe_address;
+    struct callback_register event = {
+        .type = CALLBACKTYPE_event,
+        .address = { event_selector, event_address },
+    };
+    struct callback_register failsafe = {
+        .type = CALLBACKTYPE_failsafe,
+        .address = { failsafe_selector, failsafe_address },
+    };
+
+    register_guest_callback(&event);
+    register_guest_callback(&failsafe);
 
     return 0;
 }
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/x86_64/entry.S
--- a/xen/arch/x86/x86_64/entry.S       Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/x86_64/entry.S       Fri Apr  7 10:52:00 2006
@@ -68,7 +68,7 @@
         leaq  DBLFLT1(%rip),%rax
         pushq %rax                     # RIP
         pushq %rsi                     # error_code/entry_vector
-        jmp   error_code
+        jmp   handle_exception
 DBLFLT1:GET_CURRENT(%rbx)
         jmp   test_all_events
 failsafe_callback:
@@ -320,15 +320,6 @@
         jmp  __domain_crash_synchronous
 
         ALIGN
-/* %rbx: struct vcpu */
-process_guest_exception_and_events:
-        leaq  VCPU_trap_bounce(%rbx),%rdx
-        testb $TBF_EXCEPTION,TRAPBOUNCE_flags(%rdx)
-        jz    test_all_events
-        call  create_bounce_frame
-        jmp   test_all_events
-
-        ALIGN
 /* No special register assumptions. */
 ENTRY(ret_from_intr)
         GET_CURRENT(%rbx)
@@ -338,7 +329,7 @@
 
         ALIGN
 /* No special register assumptions. */
-error_code:
+handle_exception:
         SAVE_ALL
         testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
         jz    exception_with_ints_disabled
@@ -351,7 +342,11 @@
         callq *(%rdx,%rax,8)
         testb $3,UREGS_cs(%rsp)
         jz    restore_all_xen
-        jmp   process_guest_exception_and_events
+        leaq  VCPU_trap_bounce(%rbx),%rdx
+        testb $TBF_EXCEPTION,TRAPBOUNCE_flags(%rdx)
+        jz    test_all_events
+        call  create_bounce_frame
+        jmp   test_all_events
 
 /* No special register assumptions. */
 exception_with_ints_disabled:
@@ -384,90 +379,90 @@
 ENTRY(divide_error)
         pushq $0
         movl  $TRAP_divide_error,4(%rsp)
-        jmp   error_code
+        jmp   handle_exception
 
 ENTRY(coprocessor_error)
         pushq $0
         movl  $TRAP_copro_error,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(simd_coprocessor_error)
         pushq $0
         movl  $TRAP_simd_error,4(%rsp)
-       jmp error_code
+       jmp   handle_exception
 
 ENTRY(device_not_available)
         pushq $0
         movl  $TRAP_no_device,4(%rsp)
-        jmp   error_code
+        jmp   handle_exception
 
 ENTRY(debug)
         pushq $0
         movl  $TRAP_debug,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(int3)
         pushq $0
        movl  $TRAP_int3,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(overflow)
         pushq $0
        movl  $TRAP_overflow,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(bounds)
         pushq $0
        movl  $TRAP_bounds,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(invalid_op)
         pushq $0
        movl  $TRAP_invalid_op,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(coprocessor_segment_overrun)
         pushq $0
        movl  $TRAP_copro_seg,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(invalid_TSS)
         movl  $TRAP_invalid_tss,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(segment_not_present)
         movl  $TRAP_no_segment,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(stack_segment)
         movl  $TRAP_stack_error,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(general_protection)
         movl  $TRAP_gp_fault,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(alignment_check)
         movl  $TRAP_alignment_check,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(page_fault)
         movl  $TRAP_page_fault,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(machine_check)
         pushq $0
         movl  $TRAP_machine_check,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(spurious_interrupt_bug)
         pushq $0
         movl  $TRAP_spurious_int,4(%rsp)
-       jmp   error_code
+       jmp   handle_exception
 
 ENTRY(double_fault)
         movl  $TRAP_double_fault,4(%rsp)
-        jmp   error_code
+        jmp   handle_exception
 
 ENTRY(nmi)
         pushq $0
@@ -557,6 +552,8 @@
         .quad do_acm_op
         .quad do_nmi_op
         .quad do_arch_sched_op
+        .quad do_callback_op        /* 30 */
+        .quad do_xenoprof_op
         .rept NR_hypercalls-((.-hypercall_table)/8)
         .quad do_ni_hypercall
         .endr
@@ -592,6 +589,8 @@
         .byte 1 /* do_acm_op            */
         .byte 2 /* do_nmi_op            */
         .byte 2 /* do_arch_sched_op     */
+        .byte 2 /* do_callback_op       */  /* 30 */
+        .byte 3 /* do_xenoprof_op       */
         .rept NR_hypercalls-(.-hypercall_args_table)
         .byte 0 /* do_ni_hypercall      */
         .endr
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/x86_64/traps.c
--- a/xen/arch/x86/x86_64/traps.c       Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/x86_64/traps.c       Fri Apr  7 10:52:00 2006
@@ -10,12 +10,15 @@
 #include <xen/symbols.h>
 #include <xen/console.h>
 #include <xen/sched.h>
+#include <xen/reboot.h>
 #include <asm/current.h>
 #include <asm/flushtlb.h>
 #include <asm/msr.h>
 #include <asm/shadow.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/support.h>
+
+#include <public/callback.h>
 
 void show_registers(struct cpu_user_regs *regs)
 {
@@ -164,8 +167,7 @@
     console_force_lock();
 
     /* Wait for manual reset. */
-    for ( ; ; )
-        __asm__ __volatile__ ( "hlt" );
+    machine_halt();
 }
 
 void toggle_guest_mode(struct vcpu *v)
@@ -184,13 +186,19 @@
 
     if ( unlikely(copy_from_user(&iret_saved, (void *)regs->rsp,
                                  sizeof(iret_saved))) )
+    {
+        DPRINTK("Fault while reading IRET context from guest stack\n");
         domain_crash_synchronous();
+    }
 
     /* Returning to user mode? */
     if ( (iret_saved.cs & 3) == 3 )
     {
         if ( unlikely(pagetable_get_paddr(v->arch.guest_table_user) == 0) )
-            return -EFAULT;
+        {
+            DPRINTK("Guest switching to user mode with no user page tables\n");
+            domain_crash_synchronous();
+        }
         toggle_guest_mode(v);
     }
 
@@ -312,15 +320,106 @@
     wrmsr(MSR_SYSCALL_MASK, EF_VM|EF_RF|EF_NT|EF_DF|EF_IE|EF_TF, 0U);
 }
 
+static long register_guest_callback(struct callback_register *reg)
+{
+    long ret = 0;
+    struct vcpu *v = current;
+
+    switch ( reg->type )
+    {
+    case CALLBACKTYPE_event:
+        v->arch.guest_context.event_callback_eip    = reg->address;
+        break;
+
+    case CALLBACKTYPE_failsafe:
+        v->arch.guest_context.failsafe_callback_eip = reg->address;
+        break;
+
+    case CALLBACKTYPE_syscall:
+        v->arch.guest_context.syscall_callback_eip  = reg->address;
+        break;
+
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
+static long unregister_guest_callback(struct callback_unregister *unreg)
+{
+    long ret;
+
+    switch ( unreg->type )
+    {
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
+
+long do_callback_op(int cmd, GUEST_HANDLE(void) arg)
+{
+    long ret;
+
+    switch ( cmd )
+    {
+    case CALLBACKOP_register:
+    {
+        struct callback_register reg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&reg, arg, 1) )
+            break;
+
+        ret = register_guest_callback(&reg);
+    }
+    break;
+
+    case CALLBACKOP_unregister:
+    {
+        struct callback_unregister unreg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&unreg, arg, 1) )
+            break;
+
+        ret = unregister_guest_callback(&unreg);
+    }
+    break;
+
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
 long do_set_callbacks(unsigned long event_address,
                       unsigned long failsafe_address,
                       unsigned long syscall_address)
 {
-    struct vcpu *d = current;
-
-    d->arch.guest_context.event_callback_eip    = event_address;
-    d->arch.guest_context.failsafe_callback_eip = failsafe_address;
-    d->arch.guest_context.syscall_callback_eip  = syscall_address;
+    struct callback_register event = {
+        .type = CALLBACKTYPE_event,
+        .address = event_address,
+    };
+    struct callback_register failsafe = {
+        .type = CALLBACKTYPE_failsafe,
+        .address = failsafe_address,
+    };
+    struct callback_register syscall = {
+        .type = CALLBACKTYPE_syscall,
+        .address = syscall_address,
+    };
+
+    register_guest_callback(&event);
+    register_guest_callback(&failsafe);
+    register_guest_callback(&syscall);
 
     return 0;
 }
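
The callback_op hypercall above generalises do_set_callbacks(): callbacks
are registered one at a time, keyed by type, so new callback types can be
added without changing the hypercall signature. A minimal guest-side
sketch of the register path (hypothetical, not part of this changeset;
the two-argument HYPERVISOR_callback_op() stub and the public callback.h
names are assumed, x86_64 shown, where the address is a plain unsigned
long):

    #include <xen/interface/xen.h>
    #include <xen/interface/callback.h>

    /* Hypothetical sketch: register an event-channel upcall handler. */
    static int register_event_callback(unsigned long handler)
    {
            struct callback_register event = {
                    .type    = CALLBACKTYPE_event,
                    .address = handler,
            };

            /* 0 on success; -EINVAL for an unknown callback type;
             * -EFAULT if the argument cannot be copied from guest memory. */
            return HYPERVISOR_callback_op(CALLBACKOP_register, &event);
    }
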
diff -r 9fcfdab04aa9 -r fb174770f426 xen/common/event_channel.c
--- a/xen/common/event_channel.c        Thu Apr  6 13:22:52 2006
+++ b/xen/common/event_channel.c        Fri Apr  7 10:52:00 2006
@@ -57,6 +57,7 @@
     {
     case VIRQ_TIMER:
     case VIRQ_DEBUG:
+    case VIRQ_XENOPROF:
         rc = 0;
         break;
     default:
diff -r 9fcfdab04aa9 -r fb174770f426 xen/common/schedule.c
--- a/xen/common/schedule.c     Thu Apr  6 13:22:52 2006
+++ b/xen/common/schedule.c     Fri Apr  7 10:52:00 2006
@@ -413,6 +413,30 @@
         break;
     }
 
+    case SCHEDOP_remote_shutdown:
+    {
+        struct domain *d;
+        struct sched_remote_shutdown sched_remote_shutdown;
+
+        if ( !IS_PRIV(current->domain) )
+            return -EPERM;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&sched_remote_shutdown, arg, 1) )
+            break;
+
+        ret = -ESRCH;
+        d = find_domain_by_id(sched_remote_shutdown.domain_id);
+        if ( d == NULL )
+            break;
+
+        domain_shutdown(d, (u8)sched_remote_shutdown.reason);
+        put_domain(d);
+        ret = 0;
+
+        break;
+    }
+
     default:
         ret = -ENOSYS;
     }
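
SCHEDOP_remote_shutdown lets a privileged domain declare a shutdown on
behalf of another domain; its main use is interpreting shutdown requests
and reasons for fully-virtualized guests (see the public header further
down). A sketch of the caller side, assuming the usual two-argument
HYPERVISOR_sched_op() stub (hypothetical, not part of this changeset):

    #include <xen/interface/sched.h>

    /* Hypothetical dom0-side sketch: shut down another domain.
     * Fails with -EPERM for unprivileged callers, -ESRCH if the target
     * domain does not exist, -EFAULT on a bad argument pointer. */
    static int remote_shutdown(domid_t domid, unsigned int reason)
    {
            struct sched_remote_shutdown op = {
                    .domain_id = domid,
                    .reason    = reason,    /* a SHUTDOWN_* reason code */
            };

            return HYPERVISOR_sched_op(SCHEDOP_remote_shutdown, &op);
    }
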
diff -r 9fcfdab04aa9 -r fb174770f426 xen/drivers/char/console.c
--- a/xen/drivers/char/console.c        Thu Apr  6 13:22:52 2006
+++ b/xen/drivers/char/console.c        Fri Apr  7 10:52:00 2006
@@ -520,6 +520,7 @@
 {
     console_lock = SPIN_LOCK_UNLOCKED;
     serial_force_unlock(sercon_handle);
+    console_start_sync();
 }
 
 void console_force_lock(void)
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/public/arch-x86_32.h
--- a/xen/include/public/arch-x86_32.h  Thu Apr  6 13:22:52 2006
+++ b/xen/include/public/arch-x86_32.h  Fri Apr  7 10:52:00 2006
@@ -168,6 +168,11 @@
     unsigned long pad[5]; /* sizeof(vcpu_info_t) == 64 */
 } arch_vcpu_info_t;
 
+typedef struct {
+    unsigned long cs;
+    unsigned long eip;
+} xen_callback_t;
+
 #endif /* !__ASSEMBLY__ */
 
 /*
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/public/arch-x86_64.h
--- a/xen/include/public/arch-x86_64.h  Thu Apr  6 13:22:52 2006
+++ b/xen/include/public/arch-x86_64.h  Fri Apr  7 10:52:00 2006
@@ -244,6 +244,8 @@
     unsigned long pad; /* sizeof(vcpu_info_t) == 64 */
 } arch_vcpu_info_t;
 
+typedef unsigned long xen_callback_t;
+
 #endif /* !__ASSEMBLY__ */
 
 /*
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/public/dom0_ops.h
--- a/xen/include/public/dom0_ops.h     Thu Apr  6 13:22:52 2006
+++ b/xen/include/public/dom0_ops.h     Fri Apr  7 10:52:00 2006
@@ -140,15 +140,16 @@
 DEFINE_GUEST_HANDLE(dom0_settime_t);
 
 #define DOM0_GETPAGEFRAMEINFO 18
+#define LTAB_SHIFT 28
 #define NOTAB 0         /* normal page */
-#define L1TAB (1<<28)
-#define L2TAB (2<<28)
-#define L3TAB (3<<28)
-#define L4TAB (4<<28)
+#define L1TAB (1<<LTAB_SHIFT)
+#define L2TAB (2<<LTAB_SHIFT)
+#define L3TAB (3<<LTAB_SHIFT)
+#define L4TAB (4<<LTAB_SHIFT)
 #define LPINTAB  (1<<31)
-#define XTAB  (0xf<<28) /* invalid page */
+#define XTAB  (0xf<<LTAB_SHIFT) /* invalid page */
 #define LTAB_MASK XTAB
-#define LTABTYPE_MASK (0x7<<28)
+#define LTABTYPE_MASK (0x7<<LTAB_SHIFT)
 
 typedef struct dom0_getpageframeinfo {
     /* IN variables. */
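
With the LTAB_SHIFT rework the page-table type constants are derived from
one shift instead of repeating the magic number 28. An illustrative decode
of the type field returned by DOM0_GETPAGEFRAMEINFO, using only the
constants above (a sketch, not part of this changeset):

    /* Returns the page-table level (1-4), 0 for a normal page (NOTAB),
     * or -1 for an invalid (XTAB) page. */
    static int frame_table_level(unsigned int type)
    {
            if ((type & LTAB_MASK) == XTAB)
                    return -1;
            switch (type & LTABTYPE_MASK) {
            case L1TAB: return 1;
            case L2TAB: return 2;
            case L3TAB: return 3;
            case L4TAB: return 4;
            default:    return 0;
            }
    }
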
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/public/sched.h
--- a/xen/include/public/sched.h        Thu Apr  6 13:22:52 2006
+++ b/xen/include/public/sched.h        Fri Apr  7 10:52:00 2006
@@ -65,6 +65,19 @@
 DEFINE_GUEST_HANDLE(sched_poll_t);
 
 /*
+ * Declare a shutdown for another domain. The main use of this function is
+ * in interpreting shutdown requests and reasons for fully-virtualized
+ * domains.  A para-virtualized domain may use SCHEDOP_shutdown directly.
+ * @arg == pointer to sched_remote_shutdown structure.
+ */
+#define SCHEDOP_remote_shutdown        4
+typedef struct sched_remote_shutdown {
+    domid_t domain_id;         /* Remote domain ID */
+    unsigned int reason;       /* SHUTDOWN_xxx reason */
+} sched_remote_shutdown_t;
+DEFINE_GUEST_HANDLE(sched_remote_shutdown_t);
+
+/*
  * Reason codes for SCHEDOP_shutdown. These may be interpreted by control
  * software to determine the appropriate action. For the most part, Xen does
  * not care about the shutdown code.
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/public/xen.h
--- a/xen/include/public/xen.h  Thu Apr  6 13:22:52 2006
+++ b/xen/include/public/xen.h  Fri Apr  7 10:52:00 2006
@@ -60,6 +60,8 @@
 #define __HYPERVISOR_acm_op               27
 #define __HYPERVISOR_nmi_op               28
 #define __HYPERVISOR_sched_op             29
+#define __HYPERVISOR_callback_op          30
+#define __HYPERVISOR_xenoprof_op          31
 
 /* 
  * VIRTUAL INTERRUPTS
@@ -76,6 +78,7 @@
 #define VIRQ_CONSOLE    2  /* G. (DOM0) Bytes received on emergency console. */
 #define VIRQ_DOM_EXC    3  /* G. (DOM0) Exceptional event for some domain.   */
 #define VIRQ_DEBUGGER   6  /* G. (DOM0) A domain has paused for debugging.   */
+#define VIRQ_XENOPROF   7  /* V. XenOprofile interrupt: new sample available */
 #define NR_VIRQS        8
 
 /*
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/xen/sched.h
--- a/xen/include/xen/sched.h   Thu Apr  6 13:22:52 2006
+++ b/xen/include/xen/sched.h   Fri Apr  7 10:52:00 2006
@@ -14,6 +14,7 @@
 #include <xen/grant_table.h>
 #include <xen/rangeset.h>
 #include <asm/domain.h>
+#include <xen/xenoprof.h>
 
 extern unsigned long volatile jiffies;
 extern rwlock_t domlist_lock;
@@ -155,6 +156,9 @@
 
     /* Control-plane tools handle for this domain. */
     xen_domain_handle_t handle;
+
+    /* OProfile support. */
+    struct xenoprof *xenoprof;
 };
 
 struct domain_setup_info
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/arch/i386/oprofile/Makefile
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/arch/i386/oprofile/Makefile  Fri Apr  7 10:52:00 2006
@@ -0,0 +1,16 @@
+obj-$(CONFIG_OPROFILE) += oprofile.o
+
+DRIVER_OBJS = $(addprefix ../../../drivers/oprofile/, \
+               oprof.o cpu_buffer.o buffer_sync.o \
+               event_buffer.o oprofile_files.o \
+               oprofilefs.o oprofile_stats.o  \
+               timer_int.o )
+
+ifdef CONFIG_XEN
+oprofile-y                             := $(DRIVER_OBJS) xenoprof.o
+else 
+oprofile-y                             := $(DRIVER_OBJS) init.o backtrace.o
+oprofile-$(CONFIG_X86_LOCAL_APIC)      += nmi_int.o op_model_athlon.o \
+                                          op_model_ppro.o op_model_p4.o
+oprofile-$(CONFIG_X86_IO_APIC)         += nmi_timer_int.o
+endif
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/arch/i386/oprofile/xenoprof.c
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/arch/i386/oprofile/xenoprof.c        Fri Apr  7 10:52:00 2006
@@ -0,0 +1,395 @@
+/**
+ * @file xenoprof.c
+ *
+ * @remark Copyright 2002 OProfile authors
+ * @remark Read the file COPYING
+ *
+ * @author John Levon <levon@xxxxxxxxxxxxxxxxx>
+ *
+ * Modified by Aravind Menon and Jose Renato Santos for Xen
+ * These modifications are:
+ * Copyright (C) 2005 Hewlett-Packard Co.
+ */
+
+#include <linux/init.h>
+#include <linux/notifier.h>
+#include <linux/smp.h>
+#include <linux/oprofile.h>
+#include <linux/sysdev.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/vmalloc.h>
+#include <asm/nmi.h>
+#include <asm/msr.h>
+#include <asm/apic.h>
+#include <asm/pgtable.h>
+#include <xen/evtchn.h>
+#include "op_counter.h"
+
+#include <xen/interface/xen.h>
+#include <xen/interface/xenoprof.h>
+
+static int xenoprof_start(void);
+static void xenoprof_stop(void);
+
+void * vm_map_xen_pages(unsigned long maddr, int vm_size, pgprot_t prot);
+
+static int xenoprof_enabled = 0;
+static int num_events = 0;
+static int is_primary = 0;
+
+/* sample buffers shared with Xen */
+xenoprof_buf_t * xenoprof_buf[MAX_VIRT_CPUS];
+/* Shared buffer area */
+char * shared_buffer;
+/* Number of buffers in shared area (one per VCPU) */
+int nbuf;
+/* Mappings of VIRQ_XENOPROF to irq number (per cpu) */
+int ovf_irq[NR_CPUS];
+/* cpu model type string - copied from Xen memory space on XENOPROF_init command */
+char cpu_type[XENOPROF_CPU_TYPE_SIZE];
+
+#ifdef CONFIG_PM
+
+static int xenoprof_suspend(struct sys_device * dev, pm_message_t state)
+{
+       if (xenoprof_enabled == 1)
+               xenoprof_stop();
+       return 0;
+}
+
+
+static int xenoprof_resume(struct sys_device * dev)
+{
+       if (xenoprof_enabled == 1)
+               xenoprof_start();
+       return 0;
+}
+
+
+static struct sysdev_class oprofile_sysclass = {
+       set_kset_name("oprofile"),
+       .resume         = xenoprof_resume,
+       .suspend        = xenoprof_suspend
+};
+
+
+static struct sys_device device_oprofile = {
+       .id     = 0,
+       .cls    = &oprofile_sysclass,
+};
+
+
+static int __init init_driverfs(void)
+{
+       int error;
+       if (!(error = sysdev_class_register(&oprofile_sysclass)))
+               error = sysdev_register(&device_oprofile);
+       return error;
+}
+
+
+static void __exit exit_driverfs(void)
+{
+       sysdev_unregister(&device_oprofile);
+       sysdev_class_unregister(&oprofile_sysclass);
+}
+
+#else
+#define init_driverfs() do { } while (0)
+#define exit_driverfs() do { } while (0)
+#endif /* CONFIG_PM */
+
+unsigned long long oprofile_samples = 0;
+
+static irqreturn_t 
+xenoprof_ovf_interrupt(int irq, void * dev_id, struct pt_regs * regs)
+{
+       int head, tail, size;
+       xenoprof_buf_t * buf;
+       int cpu;
+
+       cpu = smp_processor_id();
+       buf = xenoprof_buf[cpu];
+
+       head = buf->event_head;
+       tail = buf->event_tail;
+       size = buf->event_size;
+
+       if (tail > head) {
+               while (tail < size) {
+                       oprofile_add_pc(buf->event_log[tail].eip,
+                                       buf->event_log[tail].mode,
+                                       buf->event_log[tail].event);
+                       oprofile_samples++;
+                       tail++;
+               }
+               tail = 0;
+       }
+       while (tail < head) {
+               oprofile_add_pc(buf->event_log[tail].eip,
+                               buf->event_log[tail].mode,
+                               buf->event_log[tail].event);
+               oprofile_samples++;
+               tail++;
+       }
+
+       buf->event_tail = tail;
+
+       return IRQ_HANDLED;
+}
+
+
+static void unbind_virq_cpu(void * info)
+{
+       int cpu = smp_processor_id();
+       if (ovf_irq[cpu] >= 0) {
+               unbind_from_irqhandler(ovf_irq[cpu], NULL);
+               ovf_irq[cpu] = -1;
+       }
+}
+
+
+static void unbind_virq(void)
+{
+       on_each_cpu(unbind_virq_cpu, NULL, 0, 1);
+}
+
+
+int bind_virq_error;
+
+static void bind_virq_cpu(void * info)
+{
+       int result;
+       int cpu = smp_processor_id();
+
+       result = bind_virq_to_irqhandler(VIRQ_XENOPROF,
+                                        cpu,
+                                        xenoprof_ovf_interrupt,
+                                        SA_INTERRUPT,
+                                        "xenoprof",
+                                        NULL);
+
+       if (result<0) {
+               bind_virq_error = result;
+               printk("xenoprof.c: binding VIRQ_XENOPROF to IRQ failed on CPU "
+                      "%d\n", cpu);
+       } else {
+               ovf_irq[cpu] = result;
+       }
+}
+
+
+static int bind_virq(void)
+{
+       bind_virq_error = 0;
+       on_each_cpu(bind_virq_cpu, NULL, 0, 1);
+       if (bind_virq_error) {
+               unbind_virq();
+               return bind_virq_error;
+       } else {
+               return 0;
+       }
+}
+
+
+static int xenoprof_setup(void)
+{
+       int ret;
+
+       ret = bind_virq();
+       if (ret)
+               return ret;
+
+       if (is_primary) {
+               ret = HYPERVISOR_xenoprof_op(XENOPROF_reserve_counters,
+                                            (unsigned long)NULL,
+                                            (unsigned long)NULL);
+               if (ret)
+                       goto err;
+
+               ret = HYPERVISOR_xenoprof_op(XENOPROF_setup_events,
+                                            (unsigned long)&counter_config,
+                                            (unsigned long)num_events);
+               if (ret)
+                       goto err;
+       }
+
+       ret = HYPERVISOR_xenoprof_op(XENOPROF_enable_virq,
+                                    (unsigned long)NULL,
+                                    (unsigned long)NULL);
+       if (ret)
+               goto err;
+
+       xenoprof_enabled = 1;
+       return 0;
+ err:
+       unbind_virq();
+       return ret;
+}
+
+
+static void xenoprof_shutdown(void)
+{
+       xenoprof_enabled = 0;
+
+       HYPERVISOR_xenoprof_op(XENOPROF_disable_virq,
+                              (unsigned long)NULL,
+                              (unsigned long)NULL);
+
+       if (is_primary) {
+               HYPERVISOR_xenoprof_op(XENOPROF_release_counters,
+                                      (unsigned long)NULL,
+                                      (unsigned long)NULL);
+       }
+
+       unbind_virq();
+}
+
+
+static int xenoprof_start(void)
+{
+       int ret = 0;
+
+       if (is_primary)
+               ret = HYPERVISOR_xenoprof_op(XENOPROF_start,
+                                            (unsigned long)NULL,
+                                            (unsigned long)NULL);
+       return ret;
+}
+
+
+static void xenoprof_stop(void)
+{
+       if (is_primary)
+               HYPERVISOR_xenoprof_op(XENOPROF_stop,
+                                      (unsigned long)NULL,
+                                      (unsigned long)NULL);
+}
+
+
+static int xenoprof_set_active(int * active_domains,
+                         unsigned int adomains)
+{
+       int ret = 0;
+       if (is_primary)
+               ret = HYPERVISOR_xenoprof_op(XENOPROF_set_active,
+                                            (unsigned long)active_domains,
+                                            (unsigned long)adomains);
+       return ret;
+}
+
+
+struct op_counter_config counter_config[OP_MAX_COUNTER];
+
+static int xenoprof_create_files(struct super_block * sb, struct dentry * root)
+{
+       unsigned int i;
+
+       for (i = 0; i < num_events; ++i) {
+               struct dentry * dir;
+               char buf[2];
+ 
+               snprintf(buf, 2, "%d", i);
+               dir = oprofilefs_mkdir(sb, root, buf);
+               oprofilefs_create_ulong(sb, dir, "enabled",
+                                       &counter_config[i].enabled);
+               oprofilefs_create_ulong(sb, dir, "event",
+                                       &counter_config[i].event);
+               oprofilefs_create_ulong(sb, dir, "count",
+                                       &counter_config[i].count);
+               oprofilefs_create_ulong(sb, dir, "unit_mask",
+                                       &counter_config[i].unit_mask);
+               oprofilefs_create_ulong(sb, dir, "kernel",
+                                       &counter_config[i].kernel);
+               oprofilefs_create_ulong(sb, dir, "user",
+                                       &counter_config[i].user);
+       }
+
+       return 0;
+}
+
+
+struct oprofile_operations xenoprof_ops = {
+       .create_files   = xenoprof_create_files,
+       .set_active     = xenoprof_set_active,
+       .setup          = xenoprof_setup,
+       .shutdown       = xenoprof_shutdown,
+       .start          = xenoprof_start,
+       .stop           = xenoprof_stop
+};
+
+
+/* in order to get driverfs right */
+static int using_xenoprof;
+
+int __init oprofile_arch_init(struct oprofile_operations * ops)
+{
+       xenoprof_init_result_t result;
+       xenoprof_buf_t * buf;
+       int max_samples = 16;
+       int vm_size;
+       int npages;
+       int i;
+
+       int ret = HYPERVISOR_xenoprof_op(XENOPROF_init,
+                                        (unsigned long)max_samples,
+                                        (unsigned long)&result);
+
+       if (!ret) {
+               pgprot_t prot = __pgprot(_KERNPG_TABLE);
+
+               num_events = result.num_events;
+               is_primary = result.is_primary;
+               nbuf = result.nbuf;
+
+               npages = (result.bufsize * nbuf - 1) / PAGE_SIZE + 1;
+               vm_size = npages * PAGE_SIZE;
+
+               shared_buffer = (char *) vm_map_xen_pages(result.buf_maddr,
+                                                         vm_size, prot);
+               if (!shared_buffer) {
+                       ret = -ENOMEM;
+                       goto out;
+               }
+
+               for (i=0; i< nbuf; i++) {
+                       buf = (xenoprof_buf_t*) 
+                               &shared_buffer[i * result.bufsize];
+                       BUG_ON(buf->vcpu_id >= MAX_VIRT_CPUS);
+                       xenoprof_buf[buf->vcpu_id] = buf;
+               }
+
+               /*  cpu_type is detected by Xen */
+               cpu_type[XENOPROF_CPU_TYPE_SIZE-1] = 0;
+               strncpy(cpu_type, result.cpu_type, XENOPROF_CPU_TYPE_SIZE - 1);
+               xenoprof_ops.cpu_type = cpu_type;
+
+               init_driverfs();
+               using_xenoprof = 1;
+               *ops = xenoprof_ops;
+
+               for (i=0; i<NR_CPUS; i++)
+                       ovf_irq[i] = -1;
+       }
+ out:
+       printk(KERN_INFO "oprofile_arch_init: ret %d, events %d, "
+              "is_primary %d\n", ret, num_events, is_primary);
+       return ret;
+}
+
+
+void __exit oprofile_arch_exit(void)
+{
+       if (using_xenoprof)
+               exit_driverfs();
+
+       if (shared_buffer) {
+               vunmap(shared_buffer);
+               shared_buffer = NULL;
+       }
+       if (is_primary)
+               HYPERVISOR_xenoprof_op(XENOPROF_shutdown,
+                                      (unsigned long)NULL,
+                                      (unsigned long)NULL);
+}
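
oprofile_arch_init() above sizes the shared area as one bufsize-byte
sample buffer per VCPU, rounded up to whole pages before mapping the
machine address returned by XENOPROF_init. The rounding is plain ceiling
division; restated as a sketch (assumes a kernel context where PAGE_SIZE
is defined):

    /* Equivalent of the npages computation in oprofile_arch_init():
     * ceil(nbuf * bufsize / PAGE_SIZE), for nbuf, bufsize > 0. */
    static unsigned long shared_area_pages(int nbuf, int bufsize)
    {
            return ((unsigned long)nbuf * bufsize + PAGE_SIZE - 1)
                    / PAGE_SIZE;
    }
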
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/arch/x86_64/oprofile/Makefile
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/linux-2.6-xen-sparse/arch/x86_64/oprofile/Makefile        Fri Apr  7 10:52:00 2006
@@ -0,0 +1,22 @@
+#
+# oprofile for x86-64.
+# Just reuse the one from i386. 
+#
+
+obj-$(CONFIG_OPROFILE) += oprofile.o
+ 
+DRIVER_OBJS = $(addprefix ../../../drivers/oprofile/, \
+       oprof.o cpu_buffer.o buffer_sync.o \
+       event_buffer.o oprofile_files.o \
+       oprofilefs.o oprofile_stats.o \
+       timer_int.o )
+
+ifdef CONFIG_XEN
+OPROFILE-y := xenoprof.o
+else
+OPROFILE-y := init.o backtrace.o
+OPROFILE-$(CONFIG_X86_LOCAL_APIC) += nmi_int.o op_model_athlon.o op_model_p4.o \
+                                    op_model_ppro.o
+OPROFILE-$(CONFIG_X86_IO_APIC)    += nmi_timer_int.o 
+endif
+oprofile-y = $(DRIVER_OBJS) $(addprefix ../../i386/oprofile/, $(OPROFILE-y))
diff -r 9fcfdab04aa9 -r fb174770f426 patches/linux-2.6.16/xenoprof-generic.patch
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/patches/linux-2.6.16/xenoprof-generic.patch       Fri Apr  7 10:52:00 2006
@@ -0,0 +1,384 @@
+diff -pruN ../pristine-linux-2.6.16/drivers/oprofile/buffer_sync.c ./drivers/oprofile/buffer_sync.c
+--- ../pristine-linux-2.6.16/drivers/oprofile/buffer_sync.c    2006-03-20 05:53:29.000000000 +0000
++++ ./drivers/oprofile/buffer_sync.c   2006-04-03 15:53:05.000000000 +0100
+@@ -6,6 +6,10 @@
+  *
+  * @author John Levon <levon@xxxxxxxxxxxxxxxxx>
+  *
++ * Modified by Aravind Menon for Xen
++ * These modifications are:
++ * Copyright (C) 2005 Hewlett-Packard Co.
++ *
+  * This is the core of the buffer management. Each
+  * CPU buffer is processed and entered into the
+  * global event buffer. Such processing is necessary
+@@ -275,15 +279,24 @@ static void add_cpu_switch(int i)
+       last_cookie = INVALID_COOKIE;
+ }
+ 
+-static void add_kernel_ctx_switch(unsigned int in_kernel)
++static void add_cpu_mode_switch(unsigned int cpu_mode)
+ {
+       add_event_entry(ESCAPE_CODE);
+-      if (in_kernel)
+-              add_event_entry(KERNEL_ENTER_SWITCH_CODE); 
+-      else
+-              add_event_entry(KERNEL_EXIT_SWITCH_CODE); 
++      switch (cpu_mode) {
++      case CPU_MODE_USER:
++              add_event_entry(USER_ENTER_SWITCH_CODE);
++              break;
++      case CPU_MODE_KERNEL:
++              add_event_entry(KERNEL_ENTER_SWITCH_CODE);
++              break;
++      case CPU_MODE_XEN:
++              add_event_entry(XEN_ENTER_SWITCH_CODE);
++              break;
++      default:
++              break;
++      }
+ }
+- 
++
+ static void
+ add_user_ctx_switch(struct task_struct const * task, unsigned long cookie)
+ {
+@@ -348,9 +361,9 @@ static int add_us_sample(struct mm_struc
+  * for later lookup from userspace.
+  */
+ static int
+-add_sample(struct mm_struct * mm, struct op_sample * s, int in_kernel)
++add_sample(struct mm_struct * mm, struct op_sample * s, int cpu_mode)
+ {
+-      if (in_kernel) {
++      if (cpu_mode >= CPU_MODE_KERNEL) {
+               add_sample_entry(s->eip, s->event);
+               return 1;
+       } else if (mm) {
+@@ -496,7 +509,7 @@ void sync_buffer(int cpu)
+       struct mm_struct *mm = NULL;
+       struct task_struct * new;
+       unsigned long cookie = 0;
+-      int in_kernel = 1;
++      int cpu_mode = 1;
+       unsigned int i;
+       sync_buffer_state state = sb_buffer_start;
+       unsigned long available;
+@@ -513,12 +526,12 @@ void sync_buffer(int cpu)
+               struct op_sample * s = &cpu_buf->buffer[cpu_buf->tail_pos];
+  
+               if (is_code(s->eip)) {
+-                      if (s->event <= CPU_IS_KERNEL) {
++                      if (s->event <= CPU_MODE_XEN) {
+                               /* kernel/userspace switch */
+-                              in_kernel = s->event;
++                              cpu_mode = s->event;
+                               if (state == sb_buffer_start)
+                                       state = sb_sample_start;
+-                              add_kernel_ctx_switch(s->event);
++                              add_cpu_mode_switch(s->event);
+                       } else if (s->event == CPU_TRACE_BEGIN) {
+                               state = sb_bt_start;
+                               add_trace_begin();
+@@ -536,7 +549,7 @@ void sync_buffer(int cpu)
+                       }
+               } else {
+                       if (state >= sb_bt_start &&
+-                          !add_sample(mm, s, in_kernel)) {
++                          !add_sample(mm, s, cpu_mode)) {
+                               if (state == sb_bt_start) {
+                                       state = sb_bt_ignore;
+                                       atomic_inc(&oprofile_stats.bt_lost_no_mapping);
+diff -pruN ../pristine-linux-2.6.16/drivers/oprofile/cpu_buffer.c ./drivers/oprofile/cpu_buffer.c
+--- ../pristine-linux-2.6.16/drivers/oprofile/cpu_buffer.c     2006-03-20 05:53:29.000000000 +0000
++++ ./drivers/oprofile/cpu_buffer.c    2006-04-03 15:53:05.000000000 +0100
+@@ -6,6 +6,10 @@
+  *
+  * @author John Levon <levon@xxxxxxxxxxxxxxxxx>
+  *
++ * Modified by Aravind Menon for Xen
++ * These modifications are:
++ * Copyright (C) 2005 Hewlett-Packard Co.
++ *
+  * Each CPU has a local buffer that stores PC value/event
+  * pairs. We also log context switches when we notice them.
+  * Eventually each CPU's buffer is processed into the global
+@@ -58,7 +62,7 @@ int alloc_cpu_buffers(void)
+                       goto fail;
+  
+               b->last_task = NULL;
+-              b->last_is_kernel = -1;
++              b->last_cpu_mode = -1;
+               b->tracing = 0;
+               b->buffer_size = buffer_size;
+               b->tail_pos = 0;
+@@ -114,7 +118,7 @@ void cpu_buffer_reset(struct oprofile_cp
+        * collected will populate the buffer with proper
+        * values to initialize the buffer
+        */
+-      cpu_buf->last_is_kernel = -1;
++      cpu_buf->last_cpu_mode = -1;
+       cpu_buf->last_task = NULL;
+ }
+ 
+@@ -164,13 +168,13 @@ add_code(struct oprofile_cpu_buffer * bu
+  * because of the head/tail separation of the writer and reader
+  * of the CPU buffer.
+  *
+- * is_kernel is needed because on some architectures you cannot
++ * cpu_mode is needed because on some architectures you cannot
+  * tell if you are in kernel or user space simply by looking at
+- * pc. We tag this in the buffer by generating kernel enter/exit
+- * events whenever is_kernel changes
++ * pc. We tag this in the buffer by generating kernel/user (and xen)
++ *  enter events whenever cpu_mode changes
+  */
+ static int log_sample(struct oprofile_cpu_buffer * cpu_buf, unsigned long pc,
+-                    int is_kernel, unsigned long event)
++                    int cpu_mode, unsigned long event)
+ {
+       struct task_struct * task;
+ 
+@@ -181,16 +185,16 @@ static int log_sample(struct oprofile_cp
+               return 0;
+       }
+ 
+-      is_kernel = !!is_kernel;
++      WARN_ON(cpu_mode > CPU_MODE_XEN);
+ 
+       task = current;
+ 
+       /* notice a switch from user->kernel or vice versa */
+-      if (cpu_buf->last_is_kernel != is_kernel) {
+-              cpu_buf->last_is_kernel = is_kernel;
+-              add_code(cpu_buf, is_kernel);
++      if (cpu_buf->last_cpu_mode != cpu_mode) {
++              cpu_buf->last_cpu_mode = cpu_mode;
++              add_code(cpu_buf, cpu_mode);
+       }
+-
++      
+       /* notice a task switch */
+       if (cpu_buf->last_task != task) {
+               cpu_buf->last_task = task;
+diff -pruN ../pristine-linux-2.6.16/drivers/oprofile/cpu_buffer.h ./drivers/oprofile/cpu_buffer.h
+--- ../pristine-linux-2.6.16/drivers/oprofile/cpu_buffer.h     2006-03-20 05:53:29.000000000 +0000
++++ ./drivers/oprofile/cpu_buffer.h    2006-04-03 15:53:05.000000000 +0100
+@@ -36,7 +36,7 @@ struct oprofile_cpu_buffer {
+       volatile unsigned long tail_pos;
+       unsigned long buffer_size;
+       struct task_struct * last_task;
+-      int last_is_kernel;
++      int last_cpu_mode;
+       int tracing;
+       struct op_sample * buffer;
+       unsigned long sample_received;
+@@ -51,7 +51,9 @@ extern struct oprofile_cpu_buffer cpu_bu
+ void cpu_buffer_reset(struct oprofile_cpu_buffer * cpu_buf);
+ 
+ /* transient events for the CPU buffer -> event buffer */
+-#define CPU_IS_KERNEL 1
+-#define CPU_TRACE_BEGIN 2
++#define CPU_MODE_USER    0
++#define CPU_MODE_KERNEL  1
++#define CPU_MODE_XEN     2
++#define CPU_TRACE_BEGIN  3
+ 
+ #endif /* OPROFILE_CPU_BUFFER_H */
+diff -pruN ../pristine-linux-2.6.16/drivers/oprofile/event_buffer.h ./drivers/oprofile/event_buffer.h
+--- ../pristine-linux-2.6.16/drivers/oprofile/event_buffer.h   2006-03-20 05:53:29.000000000 +0000
++++ ./drivers/oprofile/event_buffer.h  2006-04-03 15:53:05.000000000 +0100
+@@ -29,11 +29,12 @@ void wake_up_buffer_waiter(void);
+ #define CPU_SWITCH_CODE               2
+ #define COOKIE_SWITCH_CODE            3
+ #define KERNEL_ENTER_SWITCH_CODE      4
+-#define KERNEL_EXIT_SWITCH_CODE               5
++#define USER_ENTER_SWITCH_CODE                5
+ #define MODULE_LOADED_CODE            6
+ #define CTX_TGID_CODE                 7
+ #define TRACE_BEGIN_CODE              8
+ #define TRACE_END_CODE                        9
++#define XEN_ENTER_SWITCH_CODE         10
+  
+ #define INVALID_COOKIE ~0UL
+ #define NO_COOKIE 0UL
+diff -pruN ../pristine-linux-2.6.16/drivers/oprofile/oprof.c ./drivers/oprofile/oprof.c
+--- ../pristine-linux-2.6.16/drivers/oprofile/oprof.c  2006-03-20 05:53:29.000000000 +0000
++++ ./drivers/oprofile/oprof.c 2006-04-03 15:53:05.000000000 +0100
+@@ -5,6 +5,10 @@
+  * @remark Read the file COPYING
+  *
+  * @author John Levon <levon@xxxxxxxxxxxxxxxxx>
++ *
++ * Modified by Aravind Menon for Xen
++ * These modifications are:
++ * Copyright (C) 2005 Hewlett-Packard Co.
+  */
+ 
+ #include <linux/kernel.h>
+@@ -19,7 +23,7 @@
+ #include "cpu_buffer.h"
+ #include "buffer_sync.h"
+ #include "oprofile_stats.h"
+- 
++
+ struct oprofile_operations oprofile_ops;
+ 
+ unsigned long oprofile_started;
+@@ -33,6 +37,17 @@ static DECLARE_MUTEX(start_sem);
+  */
+ static int timer = 0;
+ 
++extern unsigned int adomains;
++extern int active_domains[MAX_OPROF_DOMAINS];
++
++int oprofile_set_active(void)
++{
++      if (oprofile_ops.set_active)
++              return oprofile_ops.set_active(active_domains, adomains);
++
++      return -EINVAL;
++}
++
+ int oprofile_setup(void)
+ {
+       int err;
+diff -pruN ../pristine-linux-2.6.16/drivers/oprofile/oprof.h ./drivers/oprofile/oprof.h
+--- ../pristine-linux-2.6.16/drivers/oprofile/oprof.h  2006-03-20 05:53:29.000000000 +0000
++++ ./drivers/oprofile/oprof.h 2006-04-03 15:53:05.000000000 +0100
+@@ -35,5 +35,7 @@ void oprofile_create_files(struct super_
+ void oprofile_timer_init(struct oprofile_operations * ops);
+ 
+ int oprofile_set_backtrace(unsigned long depth);
++
++int oprofile_set_active(void);
+  
+ #endif /* OPROF_H */
+diff -pruN ../pristine-linux-2.6.16/drivers/oprofile/oprofile_files.c ./drivers/oprofile/oprofile_files.c
+--- ../pristine-linux-2.6.16/drivers/oprofile/oprofile_files.c 2006-03-20 05:53:29.000000000 +0000
++++ ./drivers/oprofile/oprofile_files.c        2006-04-03 15:53:05.000000000 +0100
+@@ -5,15 +5,21 @@
+  * @remark Read the file COPYING
+  *
+  * @author John Levon <levon@xxxxxxxxxxxxxxxxx>
++ *
++ * Modified by Aravind Menon for Xen
++ * These modifications are:
++ * Copyright (C) 2005 Hewlett-Packard Co.     
+  */
+ 
+ #include <linux/fs.h>
+ #include <linux/oprofile.h>
++#include <asm/uaccess.h>
++#include <linux/ctype.h>
+ 
+ #include "event_buffer.h"
+ #include "oprofile_stats.h"
+ #include "oprof.h"
+- 
++
+ unsigned long fs_buffer_size = 131072;
+ unsigned long fs_cpu_buffer_size = 8192;
+ unsigned long fs_buffer_watershed = 32768; /* FIXME: tune */
+@@ -117,11 +123,79 @@ static ssize_t dump_write(struct file * 
+ static struct file_operations dump_fops = {
+       .write          = dump_write,
+ };
+- 
++
++#define TMPBUFSIZE 512
++
++unsigned int adomains = 0;
++long active_domains[MAX_OPROF_DOMAINS];
++
++static ssize_t adomain_write(struct file * file, char const __user * buf, 
++                           size_t count, loff_t * offset)
++{
++      char tmpbuf[TMPBUFSIZE];
++      char * startp = tmpbuf;
++      char * endp = tmpbuf;
++      int i;
++      unsigned long val;
++      
++      if (*offset)
++              return -EINVAL; 
++      if (!count)
++              return 0;
++      if (count > TMPBUFSIZE - 1)
++              return -EINVAL;
++
++      memset(tmpbuf, 0x0, TMPBUFSIZE);
++
++      if (copy_from_user(tmpbuf, buf, count))
++              return -EFAULT;
++      
++      for (i = 0; i < MAX_OPROF_DOMAINS; i++)
++              active_domains[i] = -1;
++      adomains = 0;
++
++      while (1) {
++              val = simple_strtol(startp, &endp, 0);
++              if (endp == startp)
++                      break;
++              while (ispunct(*endp))
++                      endp++;
++              active_domains[adomains++] = val;
++              if (adomains >= MAX_OPROF_DOMAINS)
++                      break;
++              startp = endp;
++      }
++      if (oprofile_set_active())
++              return -EINVAL; 
++      return count;
++}
++
++static ssize_t adomain_read(struct file * file, char __user * buf, 
++                          size_t count, loff_t * offset)
++{
++      char tmpbuf[TMPBUFSIZE];
++      size_t len = 0;
++      int i;
++      /* This is all screwed up if we run out of space */
++      for (i = 0; i < adomains; i++) 
++              len += snprintf(tmpbuf + len, TMPBUFSIZE - len, 
++                              "%u ", (unsigned int)active_domains[i]);
++      len += snprintf(tmpbuf + len, TMPBUFSIZE - len, "\n");
++      return simple_read_from_buffer((void __user *)buf, count, 
++                                     offset, tmpbuf, len);
++}
++
++
++static struct file_operations active_domain_ops = {
++      .read           = adomain_read,
++      .write          = adomain_write,
++};
++
+ void oprofile_create_files(struct super_block * sb, struct dentry * root)
+ {
+       oprofilefs_create_file(sb, root, "enable", &enable_fops);
+       oprofilefs_create_file_perm(sb, root, "dump", &dump_fops, 0666);
++      oprofilefs_create_file(sb, root, "active_domains", &active_domain_ops);
+       oprofilefs_create_file(sb, root, "buffer", &event_buffer_fops);
+       oprofilefs_create_ulong(sb, root, "buffer_size", &fs_buffer_size);
+       oprofilefs_create_ulong(sb, root, "buffer_watershed", &fs_buffer_watershed);
+diff -pruN ../pristine-linux-2.6.16/include/linux/oprofile.h ./include/linux/oprofile.h
+--- ../pristine-linux-2.6.16/include/linux/oprofile.h  2006-03-20 05:53:29.000000000 +0000
++++ ./include/linux/oprofile.h 2006-04-03 15:53:05.000000000 +0100
+@@ -16,6 +16,8 @@
+ #include <linux/types.h>
+ #include <linux/spinlock.h>
+ #include <asm/atomic.h>
++
++#include <xen/interface/xenoprof.h>
+  
+ struct super_block;
+ struct dentry;
+@@ -27,6 +29,8 @@ struct oprofile_operations {
+       /* create any necessary configuration files in the oprofile fs.
+        * Optional. */
+       int (*create_files)(struct super_block * sb, struct dentry * root);
++      /* setup active domains with Xen */
++      int (*set_active)(int *active_domains, unsigned int adomains);
+       /* Do any necessary interrupt setup. Optional. */
+       int (*setup)(void);
+       /* Do any necessary interrupt shutdown. Optional. */
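
The active_domains file added above accepts a list of integer domain IDs
separated by punctuation and hands them to Xen through
oprofile_set_active(). A hypothetical user-space sketch; /dev/oprofile is
the conventional oprofilefs mount point, not something this patch
establishes:

    #include <stdio.h>

    /* Tell the primary profiler which domains to sample; adomain_write()
     * parses the string, so "0,3,5" selects domains 0, 3 and 5. */
    static int set_active_domains(const char *ids)
    {
            FILE *f = fopen("/dev/oprofile/active_domains", "w");
            if (f == NULL)
                    return -1;
            fprintf(f, "%s\n", ids);
            return fclose(f);
    }
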
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/Makefile
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/Makefile    Fri Apr  7 10:52:00 2006
@@ -0,0 +1,5 @@
+obj-y += xenoprof.o
+obj-y += nmi_int.o
+obj-y += op_model_p4.o
+obj-y += op_model_ppro.o
+obj-y += op_model_athlon.o
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/nmi_int.c
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/nmi_int.c   Fri Apr  7 10:52:00 2006
@@ -0,0 +1,391 @@
+/**
+ * @file nmi_int.c
+ *
+ * @remark Copyright 2002 OProfile authors
+ * @remark Read the file COPYING
+ *
+ * @author John Levon <levon@xxxxxxxxxxxxxxxxx>
+ *
+ * Modified for Xen: by Aravind Menon & Jose Renato Santos
+ *   These modifications are:
+ *   Copyright (C) 2005 Hewlett-Packard Co.
+ */
+
+#include <xen/event.h>
+#include <xen/types.h>
+#include <xen/errno.h>
+#include <xen/init.h>
+#include <public/xen.h>
+#include <asm/nmi.h>
+#include <asm/msr.h>
+#include <asm/apic.h>
+#include <asm/regs.h>
+#include <asm/current.h>
+#include <xen/delay.h>
+ 
+#include "op_counter.h"
+#include "op_x86_model.h"
+ 
+static struct op_x86_model_spec const * model;
+static struct op_msrs cpu_msrs[NR_CPUS];
+static unsigned long saved_lvtpc[NR_CPUS];
+
+#define VIRQ_BITMASK_SIZE (MAX_OPROF_DOMAINS/32 + 1)
+extern int active_domains[MAX_OPROF_DOMAINS];
+extern unsigned int adomains;
+extern struct domain *primary_profiler;
+extern struct domain *adomain_ptrs[MAX_OPROF_DOMAINS];
+extern unsigned long virq_ovf_pending[VIRQ_BITMASK_SIZE];
+extern int is_active(struct domain *d);
+extern int active_id(struct domain *d);
+extern int is_profiled(struct domain *d);
+
+extern size_t strlcpy(char *dest, const char *src, size_t size);
+
+
+int nmi_callback(struct cpu_user_regs *regs, int cpu)
+{
+       int xen_mode, ovf;
+
+       ovf = model->check_ctrs(cpu, &cpu_msrs[cpu], regs);
+       xen_mode = ring_0(regs);
+       if ( ovf && is_active(current->domain) && !xen_mode )
+               send_guest_vcpu_virq(current, VIRQ_XENOPROF);
+
+       return 1;
+}
+ 
+ 
+static void nmi_cpu_save_registers(struct op_msrs *msrs)
+{
+       unsigned int const nr_ctrs = model->num_counters;
+       unsigned int const nr_ctrls = model->num_controls; 
+       struct op_msr *counters = msrs->counters;
+       struct op_msr *controls = msrs->controls;
+       unsigned int i;
+
+       for (i = 0; i < nr_ctrs; ++i) {
+               rdmsr(counters[i].addr,
+                       counters[i].saved.low,
+                       counters[i].saved.high);
+       }
+ 
+       for (i = 0; i < nr_ctrls; ++i) {
+               rdmsr(controls[i].addr,
+                       controls[i].saved.low,
+                       controls[i].saved.high);
+       }
+}
+
+
+static void nmi_save_registers(void * dummy)
+{
+       int cpu = smp_processor_id();
+       struct op_msrs * msrs = &cpu_msrs[cpu];
+       model->fill_in_addresses(msrs);
+       nmi_cpu_save_registers(msrs);
+}
+
+
+static void free_msrs(void)
+{
+       int i;
+       for (i = 0; i < NR_CPUS; ++i) {
+               xfree(cpu_msrs[i].counters);
+               cpu_msrs[i].counters = NULL;
+               xfree(cpu_msrs[i].controls);
+               cpu_msrs[i].controls = NULL;
+       }
+}
+
+
+static int allocate_msrs(void)
+{
+       int success = 1;
+       size_t controls_size = sizeof(struct op_msr) * model->num_controls;
+       size_t counters_size = sizeof(struct op_msr) * model->num_counters;
+
+       int i;
+       for (i = 0; i < NR_CPUS; ++i) {
+               if (!test_bit(i, &cpu_online_map))
+                       continue;
+
+               cpu_msrs[i].counters = xmalloc_bytes(counters_size);
+               if (!cpu_msrs[i].counters) {
+                       success = 0;
+                       break;
+               }
+               cpu_msrs[i].controls = xmalloc_bytes(controls_size);
+               if (!cpu_msrs[i].controls) {
+                       success = 0;
+                       break;
+               }
+       }
+
+       if (!success)
+               free_msrs();
+
+       return success;
+}
+
+
+static void nmi_cpu_setup(void * dummy)
+{
+       int cpu = smp_processor_id();
+       struct op_msrs * msrs = &cpu_msrs[cpu];
+       model->setup_ctrs(msrs);
+}
+
+
+int nmi_setup_events(void)
+{
+       on_each_cpu(nmi_cpu_setup, NULL, 0, 1);
+       return 0;
+}
+
+int nmi_reserve_counters(void)
+{
+       if (!allocate_msrs())
+               return -ENOMEM;
+
+       /* We walk a thin line between law and rape here.
+        * We need to be careful to install our NMI handler
+        * without actually triggering any NMIs as this will
+        * break the core code horrifically.
+        */
+       if (reserve_lapic_nmi() < 0) {
+               free_msrs();
+               return -EBUSY;
+       }
+       /* We need to serialize save and setup for HT because the subset
+        * of msrs are distinct for save and setup operations
+        */
+       on_each_cpu(nmi_save_registers, NULL, 0, 1);
+       return 0;
+}
+
+int nmi_enable_virq(void)
+{
+       set_nmi_callback(nmi_callback);
+       return 0;
+}
+
+
+void nmi_disable_virq(void)
+{
+       unset_nmi_callback();
+} 
+
+
+static void nmi_restore_registers(struct op_msrs * msrs)
+{
+       unsigned int const nr_ctrs = model->num_counters;
+       unsigned int const nr_ctrls = model->num_controls; 
+       struct op_msr * counters = msrs->counters;
+       struct op_msr * controls = msrs->controls;
+       unsigned int i;
+
+       for (i = 0; i < nr_ctrls; ++i) {
+               wrmsr(controls[i].addr,
+                       controls[i].saved.low,
+                       controls[i].saved.high);
+       }
+ 
+       for (i = 0; i < nr_ctrs; ++i) {
+               wrmsr(counters[i].addr,
+                       counters[i].saved.low,
+                       counters[i].saved.high);
+       }
+}
+ 
+
+static void nmi_cpu_shutdown(void * dummy)
+{
+       int cpu = smp_processor_id();
+       struct op_msrs * msrs = &cpu_msrs[cpu];
+       nmi_restore_registers(msrs);
+}
+
+ 
+void nmi_release_counters(void)
+{
+       on_each_cpu(nmi_cpu_shutdown, NULL, 0, 1);
+       release_lapic_nmi();
+       free_msrs();
+}
+
+ 
+static void nmi_cpu_start(void * dummy)
+{
+       int cpu = smp_processor_id();
+       struct op_msrs const * msrs = &cpu_msrs[cpu];
+       saved_lvtpc[cpu] = apic_read(APIC_LVTPC);
+       apic_write(APIC_LVTPC, APIC_DM_NMI);
+       model->start(msrs);
+}
+ 
+
+int nmi_start(void)
+{
+       on_each_cpu(nmi_cpu_start, NULL, 0, 1);
+       return 0;
+}
+ 
+ 
+static void nmi_cpu_stop(void * dummy)
+{
+       unsigned int v;
+       int cpu = smp_processor_id();
+       struct op_msrs const * msrs = &cpu_msrs[cpu];
+       model->stop(msrs);
+
+       /* restoring APIC_LVTPC can trigger an apic error because the delivery
+        * mode and vector nr combination can be illegal. That's by design: on
+        * power on apic lvt contain a zero vector nr which are legal only for
+        * NMI delivery mode. So inhibit apic err before restoring lvtpc
+        */
+       if ( !(apic_read(APIC_LVTPC) & APIC_DM_NMI)
+            || (apic_read(APIC_LVTPC) & APIC_LVT_MASKED) )
+       {
+               printk("nmi_stop: APIC not good %u\n", apic_read(APIC_LVTPC));
+               mdelay(5000);
+       }
+       v = apic_read(APIC_LVTERR);
+       apic_write(APIC_LVTERR, v | APIC_LVT_MASKED);
+       apic_write(APIC_LVTPC, saved_lvtpc[cpu]);
+       apic_write(APIC_LVTERR, v);
+}
+ 
+ 
+void nmi_stop(void)
+{
+       on_each_cpu(nmi_cpu_stop, NULL, 0, 1);
+}
+
+
+struct op_counter_config counter_config[OP_MAX_COUNTER];
+
+static int __init p4_init(char * cpu_type)
+{ 
+       __u8 cpu_model = current_cpu_data.x86_model;
+
+       if (cpu_model > 4)
+               return 0;
+
+#ifndef CONFIG_SMP
+       strncpy (cpu_type, "i386/p4", XENOPROF_CPU_TYPE_SIZE - 1);
+       model = &op_p4_spec;
+       return 1;
+#else
+       switch (smp_num_siblings) {
+               case 1:
+                       strncpy (cpu_type, "i386/p4", 
+                                XENOPROF_CPU_TYPE_SIZE - 1);
+                       model = &op_p4_spec;
+                       return 1;
+
+               case 2:
+                       strncpy (cpu_type, "i386/p4-ht", 
+                                XENOPROF_CPU_TYPE_SIZE - 1);
+                       model = &op_p4_ht2_spec;
+                       return 1;
+       }
+#endif
+       printk("Xenoprof ERROR: P4 HyperThreading detected with > 2 threads\n");
+
+       return 0;
+}
+
+
+static int __init ppro_init(char *cpu_type)
+{
+       __u8 cpu_model = current_cpu_data.x86_model;
+
+       if (cpu_model > 0xd)
+               return 0;
+
+       if (cpu_model == 9) {
+               strncpy (cpu_type, "i386/p6_mobile", XENOPROF_CPU_TYPE_SIZE - 1);
+       } else if (cpu_model > 5) {
+               strncpy (cpu_type, "i386/piii", XENOPROF_CPU_TYPE_SIZE - 1);
+       } else if (cpu_model > 2) {
+               strncpy (cpu_type, "i386/pii", XENOPROF_CPU_TYPE_SIZE - 1);
+       } else {
+               strncpy (cpu_type, "i386/ppro", XENOPROF_CPU_TYPE_SIZE - 1);
+       }
+
+       model = &op_ppro_spec;
+       return 1;
+}
+
+int nmi_init(int *num_events, int *is_primary, char *cpu_type)
+{
+       __u8 vendor = current_cpu_data.x86_vendor;
+       __u8 family = current_cpu_data.x86;
+       int prim = 0;
+ 
+       if (!cpu_has_apic)
+               return -ENODEV;
+
+       if (primary_profiler == NULL) {
+               /* For now, only dom0 can be the primary profiler */
+               if (current->domain->domain_id == 0) {
+                       primary_profiler = current->domain;
+                       prim = 1;
+               }
+       }
+ 
+       /* Make sure string is NULL terminated */
+       cpu_type[XENOPROF_CPU_TYPE_SIZE - 1] = 0;
+
+       switch (vendor) {
+               case X86_VENDOR_AMD:
+                       /* Needs to be at least an Athlon (or hammer in 32bit mode) */
+
+                       switch (family) {
+                       default:
+                               return -ENODEV;
+                       case 6:
+                               model = &op_athlon_spec;
+                               strncpy (cpu_type, "i386/athlon", 
+                                        XENOPROF_CPU_TYPE_SIZE - 1);
+                               break;
+                       case 0xf:
+                               model = &op_athlon_spec;
+                               /* Actually it could be i386/hammer too, but give
+                                  user space a consistent name. */
+                               strncpy (cpu_type, "x86-64/hammer", 
+                                        XENOPROF_CPU_TYPE_SIZE - 1);
+                               break;
+                       }
+                       break;
+ 
+               case X86_VENDOR_INTEL:
+                       switch (family) {
+                               /* Pentium IV */
+                               case 0xf:
+                                       if (!p4_init(cpu_type))
+                                               return -ENODEV;
+                                       break;
+
+                               /* A P6-class processor */
+                               case 6:
+                                       if (!ppro_init(cpu_type))
+                                               return -ENODEV;
+                                       break;
+
+                               default:
+                                       return -ENODEV;
+                       }
+                       break;
+
+               default:
+                       return -ENODEV;
+       }
+
+       *num_events = model->num_counters;
+       *is_primary = prim;
+
+       return 0;
+}
+
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/op_counter.h
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/op_counter.h        Fri Apr  7 10:52:00 2006
@@ -0,0 +1,29 @@
+/**
+ * @file op_counter.h
+ *
+ * @remark Copyright 2002 OProfile authors
+ * @remark Read the file COPYING
+ *
+ * @author John Levon
+ */
+ 
+#ifndef OP_COUNTER_H
+#define OP_COUNTER_H
+ 
+#define OP_MAX_COUNTER 8
+ 
+/* Per-perfctr configuration as set via
+ * oprofilefs.
+ */
+struct op_counter_config {
+        unsigned long count;
+        unsigned long enabled;
+        unsigned long event;
+        unsigned long kernel;
+        unsigned long user;
+        unsigned long unit_mask;
+};
+
+extern struct op_counter_config counter_config[];
+
+#endif /* OP_COUNTER_H */
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/op_model_athlon.c
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/op_model_athlon.c   Fri Apr  7 10:52:00 2006
@@ -0,0 +1,168 @@
+/**
+ * @file op_model_athlon.h
+ * athlon / K7 model-specific MSR operations
+ *
+ * @remark Copyright 2002 OProfile authors
+ * @remark Read the file COPYING
+ *
+ * @author John Levon
+ * @author Philippe Elie
+ * @author Graydon Hoare
+ */
+
+#include <xen/types.h>
+#include <asm/msr.h>
+#include <asm/io.h>
+#include <asm/apic.h>
+#include <asm/processor.h>
+#include <xen/sched.h>
+#include <asm/regs.h>
+#include <asm/current.h>
+ 
+#include "op_x86_model.h"
+#include "op_counter.h"
+
+#define NUM_COUNTERS 4
+#define NUM_CONTROLS 4
+
+#define CTR_READ(l,h,msrs,c) do {rdmsr(msrs->counters[(c)].addr, (l), (h));} while (0)
+#define CTR_WRITE(l,msrs,c) do {wrmsr(msrs->counters[(c)].addr, -(unsigned int)(l), -1);} while (0)
+#define CTR_OVERFLOWED(n) (!((n) & (1U<<31)))
+
+#define CTRL_READ(l,h,msrs,c) do {rdmsr(msrs->controls[(c)].addr, (l), (h));} while (0)
+#define CTRL_WRITE(l,h,msrs,c) do {wrmsr(msrs->controls[(c)].addr, (l), (h));} while (0)
+#define CTRL_SET_ACTIVE(n) (n |= (1<<22))
+#define CTRL_SET_INACTIVE(n) (n &= ~(1<<22))
+#define CTRL_CLEAR(x) (x &= (1<<21))
+#define CTRL_SET_ENABLE(val) (val |= 1<<20)
+#define CTRL_SET_USR(val,u) (val |= ((u & 1) << 16))
+#define CTRL_SET_KERN(val,k) (val |= ((k & 1) << 17))
+#define CTRL_SET_UM(val, m) (val |= (m << 8))
+#define CTRL_SET_EVENT(val, e) (val |= e)
+
+static unsigned long reset_value[NUM_COUNTERS];
+
+extern void xenoprof_log_event(struct vcpu *v, unsigned long eip,
+                              int mode, int event);
+ 
+static void athlon_fill_in_addresses(struct op_msrs * const msrs)
+{
+       msrs->counters[0].addr = MSR_K7_PERFCTR0;
+       msrs->counters[1].addr = MSR_K7_PERFCTR1;
+       msrs->counters[2].addr = MSR_K7_PERFCTR2;
+       msrs->counters[3].addr = MSR_K7_PERFCTR3;
+
+       msrs->controls[0].addr = MSR_K7_EVNTSEL0;
+       msrs->controls[1].addr = MSR_K7_EVNTSEL1;
+       msrs->controls[2].addr = MSR_K7_EVNTSEL2;
+       msrs->controls[3].addr = MSR_K7_EVNTSEL3;
+}
+
+ 
+static void athlon_setup_ctrs(struct op_msrs const * const msrs)
+{
+       unsigned int low, high;
+       int i;
+ 
+       /* clear all counters */
+       for (i = 0 ; i < NUM_CONTROLS; ++i) {
+               CTRL_READ(low, high, msrs, i);
+               CTRL_CLEAR(low);
+               CTRL_WRITE(low, high, msrs, i);
+       }
+       
+       /* avoid a false detection of ctr overflows in NMI handler */
+       for (i = 0; i < NUM_COUNTERS; ++i) {
+               CTR_WRITE(1, msrs, i);
+       }
+
+       /* enable active counters */
+       for (i = 0; i < NUM_COUNTERS; ++i) {
+               if (counter_config[i].enabled) {
+                       reset_value[i] = counter_config[i].count;
+
+                       CTR_WRITE(counter_config[i].count, msrs, i);
+
+                       CTRL_READ(low, high, msrs, i);
+                       CTRL_CLEAR(low);
+                       CTRL_SET_ENABLE(low);
+                       CTRL_SET_USR(low, counter_config[i].user);
+                       CTRL_SET_KERN(low, counter_config[i].kernel);
+                       CTRL_SET_UM(low, counter_config[i].unit_mask);
+                       CTRL_SET_EVENT(low, counter_config[i].event);
+                       CTRL_WRITE(low, high, msrs, i);
+               } else {
+                       reset_value[i] = 0;
+               }
+       }
+}
+
+ 
+static int athlon_check_ctrs(unsigned int const cpu,
+                             struct op_msrs const * const msrs,
+                             struct cpu_user_regs * const regs)
+
+{
+       unsigned int low, high;
+       int i;
+       int ovf = 0;
+       unsigned long eip = regs->eip;
+       int mode = 0;
+
+       if (guest_kernel_mode(current, regs))
+               mode = 1;
+       else if (ring_0(regs))
+               mode = 2;
+
+       for (i = 0 ; i < NUM_COUNTERS; ++i) {
+               CTR_READ(low, high, msrs, i);
+               if (CTR_OVERFLOWED(low)) {
+                       xenoprof_log_event(current, eip, mode, i);
+                       CTR_WRITE(reset_value[i], msrs, i);
+                       ovf = 1;
+               }
+       }
+
+       /* See op_model_ppro.c */
+       return ovf;
+}
+
+ 
+static void athlon_start(struct op_msrs const * const msrs)
+{
+       unsigned int low, high;
+       int i;
+       for (i = 0 ; i < NUM_COUNTERS ; ++i) {
+               if (reset_value[i]) {
+                       CTRL_READ(low, high, msrs, i);
+                       CTRL_SET_ACTIVE(low);
+                       CTRL_WRITE(low, high, msrs, i);
+               }
+       }
+}
+
+
+static void athlon_stop(struct op_msrs const * const msrs)
+{
+       unsigned int low,high;
+       int i;
+
+       /* Subtle: stop on all counters to avoid race with
+        * setting our pm callback */
+       for (i = 0 ; i < NUM_COUNTERS ; ++i) {
+               CTRL_READ(low, high, msrs, i);
+               CTRL_SET_INACTIVE(low);
+               CTRL_WRITE(low, high, msrs, i);
+       }
+}
+
+
+struct op_x86_model_spec const op_athlon_spec = {
+       .num_counters = NUM_COUNTERS,
+       .num_controls = NUM_CONTROLS,
+       .fill_in_addresses = &athlon_fill_in_addresses,
+       .setup_ctrs = &athlon_setup_ctrs,
+       .check_ctrs = &athlon_check_ctrs,
+       .start = &athlon_start,
+       .stop = &athlon_stop
+};
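
For reference, the per-counter control word that athlon_setup_ctrs()
composes, written out with the macros above (a sketch that assumes those
macros are in scope; bit positions follow the K7 EVNTSEL layout, and
CTRL_SET_ACTIVE, bit 22, is applied later by athlon_start()):

    static unsigned int athlon_evntsel(unsigned long event, unsigned long um,
                                       unsigned long user, unsigned long kern)
    {
            unsigned int low = 0;
            CTRL_SET_ENABLE(low);       /* bit 20: APIC interrupt on overflow */
            CTRL_SET_USR(low, user);    /* bit 16: count in user mode         */
            CTRL_SET_KERN(low, kern);   /* bit 17: count in kernel mode       */
            CTRL_SET_UM(low, um);       /* bits 8-15: unit mask               */
            CTRL_SET_EVENT(low, event); /* bits 0-7: event select             */
            return low;
    }
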
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/op_model_p4.c
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/op_model_p4.c       Fri Apr  7 10:52:00 2006
@@ -0,0 +1,739 @@
+/**
+ * @file op_model_p4.c
+ * P4 model-specific MSR operations
+ *
+ * @remark Copyright 2002 OProfile authors
+ * @remark Read the file COPYING
+ *
+ * @author Graydon Hoare
+ */
+
+#include <xen/types.h>
+#include <asm/msr.h>
+#include <asm/io.h>
+#include <asm/apic.h>
+#include <asm/processor.h>
+#include <xen/sched.h>
+#include <asm/regs.h>
+#include <asm/current.h>
+
+#include "op_x86_model.h"
+#include "op_counter.h"
+
+#define NUM_EVENTS 39
+
+#define NUM_COUNTERS_NON_HT 8
+#define NUM_ESCRS_NON_HT 45
+#define NUM_CCCRS_NON_HT 18
+#define NUM_CONTROLS_NON_HT (NUM_ESCRS_NON_HT + NUM_CCCRS_NON_HT)
+
+#define NUM_COUNTERS_HT2 4
+#define NUM_ESCRS_HT2 23
+#define NUM_CCCRS_HT2 9
+#define NUM_CONTROLS_HT2 (NUM_ESCRS_HT2 + NUM_CCCRS_HT2)
+
+static unsigned int num_counters = NUM_COUNTERS_NON_HT;
+
+
+/* this has to be checked dynamically since the
+   hyper-threadedness of a chip is discovered at
+   kernel boot-time. */
+static inline void setup_num_counters(void)
+{
+#ifdef CONFIG_SMP
+       if (smp_num_siblings == 2)
+               num_counters = NUM_COUNTERS_HT2;
+#endif
+}
+
+static inline int addr_increment(void)
+{
+#ifdef CONFIG_SMP
+       return smp_num_siblings == 2 ? 2 : 1;
+#else
+       return 1;
+#endif
+}
+
+
+/* tables to simulate simplified hardware view of p4 registers */
+struct p4_counter_binding {
+       int virt_counter;
+       int counter_address;
+       int cccr_address;
+};
+
+struct p4_event_binding {
+       int escr_select;  /* value to put in CCCR */
+       int event_select; /* value to put in ESCR */
+       struct {
+               int virt_counter; /* for this counter... */
+               int escr_address; /* use this ESCR       */
+       } bindings[2];
+};
+
+/* nb: these CTR_* defines are a duplicate of defines in
+   event/i386.p4*events. */
+
+
+#define CTR_BPU_0      (1 << 0)
+#define CTR_MS_0       (1 << 1)
+#define CTR_FLAME_0    (1 << 2)
+#define CTR_IQ_4       (1 << 3)
+#define CTR_BPU_2      (1 << 4)
+#define CTR_MS_2       (1 << 5)
+#define CTR_FLAME_2    (1 << 6)
+#define CTR_IQ_5       (1 << 7)
+
+static struct p4_counter_binding p4_counters [NUM_COUNTERS_NON_HT] = {
+       { CTR_BPU_0,   MSR_P4_BPU_PERFCTR0,   MSR_P4_BPU_CCCR0 },
+       { CTR_MS_0,    MSR_P4_MS_PERFCTR0,    MSR_P4_MS_CCCR0 },
+       { CTR_FLAME_0, MSR_P4_FLAME_PERFCTR0, MSR_P4_FLAME_CCCR0 },
+       { CTR_IQ_4,    MSR_P4_IQ_PERFCTR4,    MSR_P4_IQ_CCCR4 },
+       { CTR_BPU_2,   MSR_P4_BPU_PERFCTR2,   MSR_P4_BPU_CCCR2 },
+       { CTR_MS_2,    MSR_P4_MS_PERFCTR2,    MSR_P4_MS_CCCR2 },
+       { CTR_FLAME_2, MSR_P4_FLAME_PERFCTR2, MSR_P4_FLAME_CCCR2 },
+       { CTR_IQ_5,    MSR_P4_IQ_PERFCTR5,    MSR_P4_IQ_CCCR5 }
+};
+
+#define NUM_UNUSED_CCCRS       NUM_CCCRS_NON_HT - NUM_COUNTERS_NON_HT
+
+/* All cccr we don't use. */
+static int p4_unused_cccr[NUM_UNUSED_CCCRS] = {
+       MSR_P4_BPU_CCCR1,       MSR_P4_BPU_CCCR3,
+       MSR_P4_MS_CCCR1,        MSR_P4_MS_CCCR3,
+       MSR_P4_FLAME_CCCR1,     MSR_P4_FLAME_CCCR3,
+       MSR_P4_IQ_CCCR0,        MSR_P4_IQ_CCCR1,
+       MSR_P4_IQ_CCCR2,        MSR_P4_IQ_CCCR3
+};
+
+/* p4 event codes in libop/op_event.h are indices into this table. */
+
+static struct p4_event_binding p4_events[NUM_EVENTS] = {
+       
+       { /* BRANCH_RETIRED */
+               0x05, 0x06, 
+               { {CTR_IQ_4, MSR_P4_CRU_ESCR2},
+                 {CTR_IQ_5, MSR_P4_CRU_ESCR3} }
+       },
+       
+       { /* MISPRED_BRANCH_RETIRED */
+               0x04, 0x03, 
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR0},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR1} }
+       },
+       
+       { /* TC_DELIVER_MODE */
+               0x01, 0x01,
+               { { CTR_MS_0, MSR_P4_TC_ESCR0},  
+                 { CTR_MS_2, MSR_P4_TC_ESCR1} }
+       },
+       
+       { /* BPU_FETCH_REQUEST */
+               0x00, 0x03, 
+               { { CTR_BPU_0, MSR_P4_BPU_ESCR0},
+                 { CTR_BPU_2, MSR_P4_BPU_ESCR1} }
+       },
+
+       { /* ITLB_REFERENCE */
+               0x03, 0x18,
+               { { CTR_BPU_0, MSR_P4_ITLB_ESCR0},
+                 { CTR_BPU_2, MSR_P4_ITLB_ESCR1} }
+       },
+
+       { /* MEMORY_CANCEL */
+               0x05, 0x02,
+               { { CTR_FLAME_0, MSR_P4_DAC_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_DAC_ESCR1} }
+       },
+
+       { /* MEMORY_COMPLETE */
+               0x02, 0x08,
+               { { CTR_FLAME_0, MSR_P4_SAAT_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_SAAT_ESCR1} }
+       },
+
+       { /* LOAD_PORT_REPLAY */
+               0x02, 0x04, 
+               { { CTR_FLAME_0, MSR_P4_SAAT_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_SAAT_ESCR1} }
+       },
+
+       { /* STORE_PORT_REPLAY */
+               0x02, 0x05,
+               { { CTR_FLAME_0, MSR_P4_SAAT_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_SAAT_ESCR1} }
+       },
+
+       { /* MOB_LOAD_REPLAY */
+               0x02, 0x03,
+               { { CTR_BPU_0, MSR_P4_MOB_ESCR0},
+                 { CTR_BPU_2, MSR_P4_MOB_ESCR1} }
+       },
+
+       { /* PAGE_WALK_TYPE */
+               0x04, 0x01,
+               { { CTR_BPU_0, MSR_P4_PMH_ESCR0},
+                 { CTR_BPU_2, MSR_P4_PMH_ESCR1} }
+       },
+
+       { /* BSQ_CACHE_REFERENCE */
+               0x07, 0x0c, 
+               { { CTR_BPU_0, MSR_P4_BSU_ESCR0},
+                 { CTR_BPU_2, MSR_P4_BSU_ESCR1} }
+       },
+
+       { /* IOQ_ALLOCATION */
+               0x06, 0x03, 
+               { { CTR_BPU_0, MSR_P4_FSB_ESCR0},
+                 { 0, 0 } }
+       },
+
+       { /* IOQ_ACTIVE_ENTRIES */
+               0x06, 0x1a, 
+               { { CTR_BPU_2, MSR_P4_FSB_ESCR1},
+                 { 0, 0 } }
+       },
+
+       { /* FSB_DATA_ACTIVITY */
+               0x06, 0x17, 
+               { { CTR_BPU_0, MSR_P4_FSB_ESCR0},
+                 { CTR_BPU_2, MSR_P4_FSB_ESCR1} }
+       },
+
+       { /* BSQ_ALLOCATION */
+               0x07, 0x05, 
+               { { CTR_BPU_0, MSR_P4_BSU_ESCR0},
+                 { 0, 0 } }
+       },
+
+       { /* BSQ_ACTIVE_ENTRIES */
+               0x07, 0x06,
+               { { CTR_BPU_2, MSR_P4_BSU_ESCR1 /* guess */},  
+                 { 0, 0 } }
+       },
+
+       { /* X87_ASSIST */
+               0x05, 0x03, 
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR2},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR3} }
+       },
+
+       { /* SSE_INPUT_ASSIST */
+               0x01, 0x34,
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+  
+       { /* PACKED_SP_UOP */
+               0x01, 0x08, 
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+  
+       { /* PACKED_DP_UOP */
+               0x01, 0x0c, 
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+
+       { /* SCALAR_SP_UOP */
+               0x01, 0x0a, 
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+
+       { /* SCALAR_DP_UOP */
+               0x01, 0x0e,
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+
+       { /* 64BIT_MMX_UOP */
+               0x01, 0x02, 
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+  
+       { /* 128BIT_MMX_UOP */
+               0x01, 0x1a, 
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+
+       { /* X87_FP_UOP */
+               0x01, 0x04, 
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+  
+       { /* X87_SIMD_MOVES_UOP */
+               0x01, 0x2e, 
+               { { CTR_FLAME_0, MSR_P4_FIRM_ESCR0},
+                 { CTR_FLAME_2, MSR_P4_FIRM_ESCR1} }
+       },
+  
+       { /* MACHINE_CLEAR */
+               0x05, 0x02, 
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR2},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR3} }
+       },
+
+       { /* GLOBAL_POWER_EVENTS */
+               0x06, 0x13 /* older manual says 0x05, newer 0x13 */,
+               { { CTR_BPU_0, MSR_P4_FSB_ESCR0},
+                 { CTR_BPU_2, MSR_P4_FSB_ESCR1} }
+       },
+  
+       { /* TC_MS_XFER */
+               0x00, 0x05, 
+               { { CTR_MS_0, MSR_P4_MS_ESCR0},
+                 { CTR_MS_2, MSR_P4_MS_ESCR1} }
+       },
+
+       { /* UOP_QUEUE_WRITES */
+               0x00, 0x09,
+               { { CTR_MS_0, MSR_P4_MS_ESCR0},
+                 { CTR_MS_2, MSR_P4_MS_ESCR1} }
+       },
+
+       { /* FRONT_END_EVENT */
+               0x05, 0x08,
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR2},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR3} }
+       },
+
+       { /* EXECUTION_EVENT */
+               0x05, 0x0c,
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR2},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR3} }
+       },
+
+       { /* REPLAY_EVENT */
+               0x05, 0x09,
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR2},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR3} }
+       },
+
+       { /* INSTR_RETIRED */
+               0x04, 0x02, 
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR0},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR1} }
+       },
+
+       { /* UOPS_RETIRED */
+               0x04, 0x01,
+               { { CTR_IQ_4, MSR_P4_CRU_ESCR0},
+                 { CTR_IQ_5, MSR_P4_CRU_ESCR1} }
+       },
+
+       { /* UOP_TYPE */    
+               0x02, 0x02, 
+               { { CTR_IQ_4, MSR_P4_RAT_ESCR0},
+                 { CTR_IQ_5, MSR_P4_RAT_ESCR1} }
+       },
+
+       { /* RETIRED_MISPRED_BRANCH_TYPE */
+               0x02, 0x05, 
+               { { CTR_MS_0, MSR_P4_TBPU_ESCR0},
+                 { CTR_MS_2, MSR_P4_TBPU_ESCR1} }
+       },
+
+       { /* RETIRED_BRANCH_TYPE */
+               0x02, 0x04,
+               { { CTR_MS_0, MSR_P4_TBPU_ESCR0},
+                 { CTR_MS_2, MSR_P4_TBPU_ESCR1} }
+       }
+};
+
+
+#define MISC_PMC_ENABLED_P(x) ((x) & 1 << 7)
+
+#define ESCR_RESERVED_BITS 0x80000003
+#define ESCR_CLEAR(escr) ((escr) &= ESCR_RESERVED_BITS)
+#define ESCR_SET_USR_0(escr, usr) ((escr) |= (((usr) & 1) << 2))
+#define ESCR_SET_OS_0(escr, os) ((escr) |= (((os) & 1) << 3))
+#define ESCR_SET_USR_1(escr, usr) ((escr) |= (((usr) & 1)))
+#define ESCR_SET_OS_1(escr, os) ((escr) |= (((os) & 1) << 1))
+#define ESCR_SET_EVENT_SELECT(escr, sel) ((escr) |= (((sel) & 0x3f) << 25))
+#define ESCR_SET_EVENT_MASK(escr, mask) ((escr) |= (((mask) & 0xffff) << 9))
+#define ESCR_READ(escr,high,ev,i) do {rdmsr(ev->bindings[(i)].escr_address, (escr), (high));} while (0)
+#define ESCR_WRITE(escr,high,ev,i) do {wrmsr(ev->bindings[(i)].escr_address, (escr), (high));} while (0)
+
+#define CCCR_RESERVED_BITS 0x38030FFF
+#define CCCR_CLEAR(cccr) ((cccr) &= CCCR_RESERVED_BITS)
+#define CCCR_SET_REQUIRED_BITS(cccr) ((cccr) |= 0x00030000)
+#define CCCR_SET_ESCR_SELECT(cccr, sel) ((cccr) |= (((sel) & 0x07) << 13))
+#define CCCR_SET_PMI_OVF_0(cccr) ((cccr) |= (1<<26))
+#define CCCR_SET_PMI_OVF_1(cccr) ((cccr) |= (1<<27))
+#define CCCR_SET_ENABLE(cccr) ((cccr) |= (1<<12))
+#define CCCR_SET_DISABLE(cccr) ((cccr) &= ~(1<<12))
+#define CCCR_READ(low, high, i) do {rdmsr(p4_counters[(i)].cccr_address, (low), (high));} while (0)
+#define CCCR_WRITE(low, high, i) do {wrmsr(p4_counters[(i)].cccr_address, (low), (high));} while (0)
+#define CCCR_OVF_P(cccr) ((cccr) & (1U<<31))
+#define CCCR_CLEAR_OVF(cccr) ((cccr) &= (~(1U<<31)))
+
+#define CTR_READ(l,h,i) do {rdmsr(p4_counters[(i)].counter_address, (l), (h));} while (0)
+#define CTR_WRITE(l,i) do {wrmsr(p4_counters[(i)].counter_address, -(u32)(l), -1);} while (0)
+#define CTR_OVERFLOW_P(ctr) (!((ctr) & 0x80000000))
+
+
+/* this assigns a "stagger" to the current CPU, which is used throughout
+   the code in this module as an extra array offset, to select the "even"
+   or "odd" part of all the divided resources. */
+static unsigned int get_stagger(void)
+{
+#ifdef CONFIG_SMP
+       int cpu = smp_processor_id();
+       return (cpu != first_cpu(cpu_sibling_map[cpu]));
+#endif 
+       return 0;
+}
+
+
+/* finally, mediate access to a real hardware counter
+   by passing a "virtual" counter number to this macro,
+   along with your stagger setting. */
+#define VIRT_CTR(stagger, i) ((i) + ((num_counters) * (stagger)))
+
+static unsigned long reset_value[NUM_COUNTERS_NON_HT];
+
+
+static void p4_fill_in_addresses(struct op_msrs * const msrs)
+{
+       unsigned int i; 
+       unsigned int addr, stag;
+
+       setup_num_counters();
+       stag = get_stagger();
+
+       /* the counter registers we pay attention to */
+       for (i = 0; i < num_counters; ++i) {
+               msrs->counters[i].addr = 
+                       p4_counters[VIRT_CTR(stag, i)].counter_address;
+       }
+
+       /* FIXME: bad feeling, we don't save the 10 counters we don't use. */
+
+       /* 18 CCCR registers */
+       for (i = 0, addr = MSR_P4_BPU_CCCR0 + stag;
+            addr <= MSR_P4_IQ_CCCR5; ++i, addr += addr_increment()) {
+               msrs->controls[i].addr = addr;
+       }
+       
+       /* 43 ESCR registers in three or four discontiguous groups */
+       for (addr = MSR_P4_BSU_ESCR0 + stag;
+            addr < MSR_P4_IQ_ESCR0; ++i, addr += addr_increment()) {
+               msrs->controls[i].addr = addr;
+       }
+
+       /* some models have no IQ_ESCR0/1, so we save BSU_ESCR0/1 a second
+        * time to avoid a special case in nmi_{save|restore}_registers() */
+       if (boot_cpu_data.x86_model >= 0x3) {
+               for (addr = MSR_P4_BSU_ESCR0 + stag;
+                    addr <= MSR_P4_BSU_ESCR1; ++i, addr += addr_increment()) {
+                       msrs->controls[i].addr = addr;
+               }
+       } else {
+               for (addr = MSR_P4_IQ_ESCR0 + stag;
+                    addr <= MSR_P4_IQ_ESCR1; ++i, addr += addr_increment()) {
+                       msrs->controls[i].addr = addr;
+               }
+       }
+
+       for (addr = MSR_P4_RAT_ESCR0 + stag;
+            addr <= MSR_P4_SSU_ESCR0; ++i, addr += addr_increment()) {
+               msrs->controls[i].addr = addr;
+       }
+       
+       for (addr = MSR_P4_MS_ESCR0 + stag;
+            addr <= MSR_P4_TC_ESCR1; ++i, addr += addr_increment()) { 
+               msrs->controls[i].addr = addr;
+       }
+       
+       for (addr = MSR_P4_IX_ESCR0 + stag;
+            addr <= MSR_P4_CRU_ESCR3; ++i, addr += addr_increment()) { 
+               msrs->controls[i].addr = addr;
+       }
+
+       /* there are 2 remaining non-contiguously located ESCRs */
+
+       if (num_counters == NUM_COUNTERS_NON_HT) {              
+               /* standard non-HT CPUs handle both remaining ESCRs*/
+               msrs->controls[i++].addr = MSR_P4_CRU_ESCR5;
+               msrs->controls[i++].addr = MSR_P4_CRU_ESCR4;
+
+       } else if (stag == 0) {
+               /* HT CPUs give the first remainder to the even thread, as
+                  the 32nd control register */
+               msrs->controls[i++].addr = MSR_P4_CRU_ESCR4;
+
+       } else {
+               /* and two copies of the second to the odd thread,
+                  for the 22nd and 23rd control registers */
+               msrs->controls[i++].addr = MSR_P4_CRU_ESCR5;
+               msrs->controls[i++].addr = MSR_P4_CRU_ESCR5;
+       }
+}
+
+
+static void pmc_setup_one_p4_counter(unsigned int ctr)
+{
+       int i;
+       int const maxbind = 2;
+       unsigned int cccr = 0;
+       unsigned int escr = 0;
+       unsigned int high = 0;
+       unsigned int counter_bit;
+       struct p4_event_binding *ev = NULL;
+       unsigned int stag;
+
+       stag = get_stagger();
+       
+       /* convert from counter *number* to counter *bit* */
+       counter_bit = 1 << VIRT_CTR(stag, ctr);
+       
+       /* find our event binding structure. */
+       if (counter_config[ctr].event <= 0 || counter_config[ctr].event > NUM_EVENTS) {
+               printk(KERN_ERR 
+                      "oprofile: P4 event code 0x%lx out of range\n", 
+                      counter_config[ctr].event);
+               return;
+       }
+       
+       ev = &(p4_events[counter_config[ctr].event - 1]);
+       
+       for (i = 0; i < maxbind; i++) {
+               if (ev->bindings[i].virt_counter & counter_bit) {
+
+                       /* modify ESCR */
+                       ESCR_READ(escr, high, ev, i);
+                       ESCR_CLEAR(escr);
+                       if (stag == 0) {
+                               ESCR_SET_USR_0(escr, counter_config[ctr].user);
+                               ESCR_SET_OS_0(escr, counter_config[ctr].kernel);
+                       } else {
+                               ESCR_SET_USR_1(escr, counter_config[ctr].user);
+                               ESCR_SET_OS_1(escr, counter_config[ctr].kernel);
+                       }
+                       ESCR_SET_EVENT_SELECT(escr, ev->event_select);
+                       ESCR_SET_EVENT_MASK(escr, counter_config[ctr].unit_mask);
+                       ESCR_WRITE(escr, high, ev, i);
+                      
+                       /* modify CCCR */
+                       CCCR_READ(cccr, high, VIRT_CTR(stag, ctr));
+                       CCCR_CLEAR(cccr);
+                       CCCR_SET_REQUIRED_BITS(cccr);
+                       CCCR_SET_ESCR_SELECT(cccr, ev->escr_select);
+                       if (stag == 0) {
+                               CCCR_SET_PMI_OVF_0(cccr);
+                       } else {
+                               CCCR_SET_PMI_OVF_1(cccr);
+                       }
+                       CCCR_WRITE(cccr, high, VIRT_CTR(stag, ctr));
+                       return;
+               }
+       }
+
+       printk(KERN_ERR 
+              "oprofile: P4 event code 0x%lx no binding, stag %d ctr %d\n",
+              counter_config[ctr].event, stag, ctr);
+}
+
+
+static void p4_setup_ctrs(struct op_msrs const * const msrs)
+{
+       unsigned int i;
+       unsigned int low, high;
+       unsigned int addr;
+       unsigned int stag;
+
+       stag = get_stagger();
+
+       rdmsr(MSR_IA32_MISC_ENABLE, low, high);
+       if (! MISC_PMC_ENABLED_P(low)) {
+               printk(KERN_ERR "oprofile: P4 PMC not available\n");
+               return;
+       }
+
+       /* clear the cccrs we will use */
+       for (i = 0 ; i < num_counters ; i++) {
+               rdmsr(p4_counters[VIRT_CTR(stag, i)].cccr_address, low, high);
+               CCCR_CLEAR(low);
+               CCCR_SET_REQUIRED_BITS(low);
+               wrmsr(p4_counters[VIRT_CTR(stag, i)].cccr_address, low, high);
+       }
+
+       /* clear cccrs outside our concern */
+       for (i = stag ; i < NUM_UNUSED_CCCRS ; i += addr_increment()) {
+               rdmsr(p4_unused_cccr[i], low, high);
+               CCCR_CLEAR(low);
+               CCCR_SET_REQUIRED_BITS(low);
+               wrmsr(p4_unused_cccr[i], low, high);
+       }
+
+       /* clear all escrs (including those outside our concern) */
+       for (addr = MSR_P4_BSU_ESCR0 + stag;
+            addr <  MSR_P4_IQ_ESCR0; addr += addr_increment()) {
+               wrmsr(addr, 0, 0);
+       }
+
+       /* On older models, also clear MSR_P4_IQ_ESCR0/1 */
+       if (boot_cpu_data.x86_model < 0x3) {
+               wrmsr(MSR_P4_IQ_ESCR0, 0, 0);
+               wrmsr(MSR_P4_IQ_ESCR1, 0, 0);
+       }
+
+       for (addr = MSR_P4_RAT_ESCR0 + stag;
+            addr <= MSR_P4_SSU_ESCR0; ++i, addr += addr_increment()) {
+               wrmsr(addr, 0, 0);
+       }
+       
+       for (addr = MSR_P4_MS_ESCR0 + stag;
+            addr <= MSR_P4_TC_ESCR1; addr += addr_increment()){ 
+               wrmsr(addr, 0, 0);
+       }
+       
+       for (addr = MSR_P4_IX_ESCR0 + stag;
+            addr <= MSR_P4_CRU_ESCR3; addr += addr_increment()){ 
+               wrmsr(addr, 0, 0);
+       }
+
+       if (num_counters == NUM_COUNTERS_NON_HT) {              
+               wrmsr(MSR_P4_CRU_ESCR4, 0, 0);
+               wrmsr(MSR_P4_CRU_ESCR5, 0, 0);
+       } else if (stag == 0) {
+               wrmsr(MSR_P4_CRU_ESCR4, 0, 0);
+       } else {
+               wrmsr(MSR_P4_CRU_ESCR5, 0, 0);
+       }               
+       
+       /* setup all counters */
+       for (i = 0 ; i < num_counters ; ++i) {
+               if (counter_config[i].enabled) {
+                       reset_value[i] = counter_config[i].count;
+                       pmc_setup_one_p4_counter(i);
+                       CTR_WRITE(counter_config[i].count, VIRT_CTR(stag, i));
+               } else {
+                       reset_value[i] = 0;
+               }
+       }
+}
+
+
+extern void xenoprof_log_event(struct vcpu *v, unsigned long eip,
+                              int mode, int event);
+
+static int p4_check_ctrs(unsigned int const cpu,
+                         struct op_msrs const * const msrs,
+                         struct cpu_user_regs * const regs)
+{
+       unsigned long ctr, low, high, stag, real;
+       int i;
+       int ovf = 0;
+       unsigned long eip = regs->eip;
+       int mode = 0;
+
+       if (guest_kernel_mode(current, regs))
+               mode = 1;
+       else if (ring_0(regs))
+               mode = 2;
+
+       stag = get_stagger();
+
+       for (i = 0; i < num_counters; ++i) {
+               
+               if (!reset_value[i]) 
+                       continue;
+
+               /* 
+                * there is some eccentricity in the hardware which
+                * requires that we perform 2 extra corrections:
+                *
+                * - check both the CCCR:OVF flag for overflow and the
+                *   counter high bit for un-flagged overflows.
+                *
+                * - write the counter back twice to ensure it gets
+                *   updated properly.
+                * 
+                * the former seems to be related to extra NMIs happening
+                * during the current NMI; the latter is reported as errata
+                * N15 in intel doc 249199-029, pentium 4 specification
+                * update, though their suggested work-around does not
+                * appear to solve the problem.
+                */
+               
+               real = VIRT_CTR(stag, i);
+
+               CCCR_READ(low, high, real);
+               CTR_READ(ctr, high, real);
+               if (CCCR_OVF_P(low) || CTR_OVERFLOW_P(ctr)) {
+                       xenoprof_log_event(current, eip, mode, i);
+                       CTR_WRITE(reset_value[i], real);
+                       CCCR_CLEAR_OVF(low);
+                       CCCR_WRITE(low, high, real);
+                       CTR_WRITE(reset_value[i], real);
+                       ovf = 1;
+               }
+       }
+
+       /* P4 quirk: you have to re-unmask the apic vector */
+       apic_write(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED);
+
+       return ovf;
+}
+
+
+static void p4_start(struct op_msrs const * const msrs)
+{
+       unsigned int low, high, stag;
+       int i;
+
+       stag = get_stagger();
+
+       for (i = 0; i < num_counters; ++i) {
+               if (!reset_value[i])
+                       continue;
+               CCCR_READ(low, high, VIRT_CTR(stag, i));
+               CCCR_SET_ENABLE(low);
+               CCCR_WRITE(low, high, VIRT_CTR(stag, i));
+       }
+}
+
+
+static void p4_stop(struct op_msrs const * const msrs)
+{
+       unsigned int low, high, stag;
+       int i;
+
+       stag = get_stagger();
+
+       for (i = 0; i < num_counters; ++i) {
+               CCCR_READ(low, high, VIRT_CTR(stag, i));
+               CCCR_SET_DISABLE(low);
+               CCCR_WRITE(low, high, VIRT_CTR(stag, i));
+       }
+}
+
+
+#ifdef CONFIG_SMP
+struct op_x86_model_spec const op_p4_ht2_spec = {
+       .num_counters = NUM_COUNTERS_HT2,
+       .num_controls = NUM_CONTROLS_HT2,
+       .fill_in_addresses = &p4_fill_in_addresses,
+       .setup_ctrs = &p4_setup_ctrs,
+       .check_ctrs = &p4_check_ctrs,
+       .start = &p4_start,
+       .stop = &p4_stop
+};
+#endif
+
+struct op_x86_model_spec const op_p4_spec = {
+       .num_counters = NUM_COUNTERS_NON_HT,
+       .num_controls = NUM_CONTROLS_NON_HT,
+       .fill_in_addresses = &p4_fill_in_addresses,
+       .setup_ctrs = &p4_setup_ctrs,
+       .check_ctrs = &p4_check_ctrs,
+       .start = &p4_start,
+       .stop = &p4_stop
+};
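
An aside on the get_stagger()/VIRT_CTR() pair above: on a hyper-threaded P4 the even sibling (stagger 0) owns virtual counters 0-3, while the odd sibling (stagger 1) is shifted onto bindings 4-7 of p4_counters[]. A standalone user-space sketch of that arithmetic (nothing below is part of the changeset):

#include <stdio.h>

#define NUM_COUNTERS_HT2 4
/* same arithmetic as VIRT_CTR() in op_model_p4.c, with num_counters
 * pinned to the HT2 value */
#define VIRT_CTR(stagger, i) ((i) + (NUM_COUNTERS_HT2 * (stagger)))

int main(void)
{
    int stag, i;

    for (stag = 0; stag <= 1; stag++)
        for (i = 0; i < NUM_COUNTERS_HT2; i++)
            printf("sibling %d: virtual counter %d -> p4_counters[%d]\n",
                   stag, i, VIRT_CTR(stag, i));
    return 0;
}
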
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/op_model_ppro.c
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/op_model_ppro.c     Fri Apr  7 10:52:00 2006
@@ -0,0 +1,153 @@
+/**
+ * @file op_model_ppro.h
+ * pentium pro / P6 model-specific MSR operations
+ *
+ * @remark Copyright 2002 OProfile authors
+ * @remark Read the file COPYING
+ *
+ * @author John Levon
+ * @author Philippe Elie
+ * @author Graydon Hoare
+ */
+
+#include <xen/types.h>
+#include <asm/msr.h>
+#include <asm/io.h>
+#include <asm/apic.h>
+#include <asm/processor.h>
+#include <xen/sched.h>
+#include <asm/regs.h>
+#include <asm/current.h>
+ 
+#include "op_x86_model.h"
+#include "op_counter.h"
+
+#define NUM_COUNTERS 2
+#define NUM_CONTROLS 2
+
+#define CTR_READ(l,h,msrs,c) do {rdmsr(msrs->counters[(c)].addr, (l), (h));} while (0)
+#define CTR_WRITE(l,msrs,c) do {wrmsr(msrs->counters[(c)].addr, -(u32)(l), -1);} while (0)
+#define CTR_OVERFLOWED(n) (!((n) & (1U<<31)))
+
+#define CTRL_READ(l,h,msrs,c) do {rdmsr((msrs->controls[(c)].addr), (l), (h));} while (0)
+#define CTRL_WRITE(l,h,msrs,c) do {wrmsr((msrs->controls[(c)].addr), (l), (h));} while (0)
+#define CTRL_SET_ACTIVE(n) (n |= (1<<22))
+#define CTRL_SET_INACTIVE(n) (n &= ~(1<<22))
+#define CTRL_CLEAR(x) (x &= (1<<21))
+#define CTRL_SET_ENABLE(val) (val |= 1<<20)
+#define CTRL_SET_USR(val,u) (val |= ((u & 1) << 16))
+#define CTRL_SET_KERN(val,k) (val |= ((k & 1) << 17))
+#define CTRL_SET_UM(val, m) (val |= (m << 8))
+#define CTRL_SET_EVENT(val, e) (val |= e)
+
+static unsigned long reset_value[NUM_COUNTERS];
+ 
+static void ppro_fill_in_addresses(struct op_msrs * const msrs)
+{
+       msrs->counters[0].addr = MSR_P6_PERFCTR0;
+       msrs->counters[1].addr = MSR_P6_PERFCTR1;
+       
+       msrs->controls[0].addr = MSR_P6_EVNTSEL0;
+       msrs->controls[1].addr = MSR_P6_EVNTSEL1;
+}
+
+
+static void ppro_setup_ctrs(struct op_msrs const * const msrs)
+{
+       unsigned int low, high;
+       int i;
+
+       /* clear all counters */
+       for (i = 0 ; i < NUM_CONTROLS; ++i) {
+               CTRL_READ(low, high, msrs, i);
+               CTRL_CLEAR(low);
+               CTRL_WRITE(low, high, msrs, i);
+       }
+       
+       /* avoid a false detection of ctr overflows in NMI handler */
+       for (i = 0; i < NUM_COUNTERS; ++i) {
+               CTR_WRITE(1, msrs, i);
+       }
+
+       /* enable active counters */
+       for (i = 0; i < NUM_COUNTERS; ++i) {
+               if (counter_config[i].enabled) {
+                       reset_value[i] = counter_config[i].count;
+
+                       CTR_WRITE(counter_config[i].count, msrs, i);
+
+                       CTRL_READ(low, high, msrs, i);
+                       CTRL_CLEAR(low);
+                       CTRL_SET_ENABLE(low);
+                       CTRL_SET_USR(low, counter_config[i].user);
+                       CTRL_SET_KERN(low, counter_config[i].kernel);
+                       CTRL_SET_UM(low, counter_config[i].unit_mask);
+                       CTRL_SET_EVENT(low, counter_config[i].event);
+                       CTRL_WRITE(low, high, msrs, i);
+               }
+       }
+}
+
+
+extern void xenoprof_log_event(struct vcpu *v, unsigned long eip,
+                              int mode, int event);
+ 
+static int ppro_check_ctrs(unsigned int const cpu,
+                           struct op_msrs const * const msrs,
+                           struct cpu_user_regs * const regs)
+{
+       unsigned int low, high;
+       int i;
+       int ovf = 0;
+       unsigned long eip = regs->eip;
+       int mode = 0;
+
+       if ( guest_kernel_mode(current, regs) ) 
+               mode = 1;
+       else if ( ring_0(regs) )
+               mode = 2;
+ 
+       for (i = 0 ; i < NUM_COUNTERS; ++i) {
+               CTR_READ(low, high, msrs, i);
+               if (CTR_OVERFLOWED(low)) {
+                       xenoprof_log_event(current, eip, mode, i);
+                       CTR_WRITE(reset_value[i], msrs, i);
+                       ovf = 1;
+               }
+       }
+
+       /* Only the P6-based Pentium M needs to re-unmask the apic vector,
+        * but it doesn't hurt other P6 variants */
+       apic_write(APIC_LVTPC, apic_read(APIC_LVTPC) & ~APIC_LVT_MASKED);
+
+       return ovf;
+}
+
+ 
+static void ppro_start(struct op_msrs const * const msrs)
+{
+       unsigned int low,high;
+       CTRL_READ(low, high, msrs, 0);
+       CTRL_SET_ACTIVE(low);
+       CTRL_WRITE(low, high, msrs, 0);
+}
+
+
+static void ppro_stop(struct op_msrs const * const msrs)
+{
+       unsigned int low,high;
+       CTRL_READ(low, high, msrs, 0);
+       CTRL_SET_INACTIVE(low);
+       CTRL_WRITE(low, high, msrs, 0);
+}
+
+
+struct op_x86_model_spec const op_ppro_spec = {
+       .num_counters = NUM_COUNTERS,
+       .num_controls = NUM_CONTROLS,
+       .fill_in_addresses = &ppro_fill_in_addresses,
+       .setup_ctrs = &ppro_setup_ctrs,
+       .check_ctrs = &ppro_check_ctrs,
+       .start = &ppro_start,
+       .stop = &ppro_stop
+};
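
The CTR_WRITE()/CTR_OVERFLOWED() macros above rely on the usual perfctr trick: arm the counter with the two's complement of the requested count, then treat a clear bit 31 in the low word as overflow, so the NMI fires after exactly count events. A standalone illustration, simplified to 32 bits (the real counters are wider, which is why CTR_WRITE() also writes -1 into the high word):

#include <stdio.h>
#include <stdint.h>

#define CTR_OVERFLOWED(n) (!((n) & (1U << 31)))

int main(void)
{
    uint32_t count = 1000;
    uint32_t ctr = -count;          /* armed value: 0xfffffc18 */

    printf("armed: 0x%08x  overflowed=%d\n",
           (unsigned)ctr, CTR_OVERFLOWED(ctr));
    ctr += count;                   /* 1000 events later... */
    printf("fired: 0x%08x  overflowed=%d\n",
           (unsigned)ctr, CTR_OVERFLOWED(ctr));
    return 0;
}
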
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/op_x86_model.h
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/op_x86_model.h      Fri Apr  7 10:52:00 2006
@@ -0,0 +1,51 @@
+/**
+ * @file op_x86_model.h
+ * interface to x86 model-specific MSR operations
+ *
+ * @remark Copyright 2002 OProfile authors
+ * @remark Read the file COPYING
+ *
+ * @author Graydon Hoare
+ */
+
+#ifndef OP_X86_MODEL_H
+#define OP_X86_MODEL_H
+
+struct op_saved_msr {
+       unsigned int high;
+       unsigned int low;
+};
+
+struct op_msr {
+       unsigned long addr;
+       struct op_saved_msr saved;
+};
+
+struct op_msrs {
+       struct op_msr * counters;
+       struct op_msr * controls;
+};
+
+struct pt_regs;
+
+/* The model vtable abstracts the differences between
+ * various x86 CPU model's perfctr support.
+ */
+struct op_x86_model_spec {
+       unsigned int const num_counters;
+       unsigned int const num_controls;
+       void (*fill_in_addresses)(struct op_msrs * const msrs);
+       void (*setup_ctrs)(struct op_msrs const * const msrs);
+       int (*check_ctrs)(unsigned int const cpu, 
+                         struct op_msrs const * const msrs,
+                         struct cpu_user_regs * const regs);
+       void (*start)(struct op_msrs const * const msrs);
+       void (*stop)(struct op_msrs const * const msrs);
+};
+
+extern struct op_x86_model_spec const op_ppro_spec;
+extern struct op_x86_model_spec const op_p4_spec;
+extern struct op_x86_model_spec const op_p4_ht2_spec;
+extern struct op_x86_model_spec const op_athlon_spec;
+
+#endif /* OP_X86_MODEL_H */
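
These specs are consumed by the NMI driver (the nmi_* functions declared extern in xenoprof.c; that file is not part of this hunk). A minimal mock of the expected call order against the vtable, with every type and function stubbed locally, so none of this is the real interface:

#include <stdio.h>

struct op_msr  { unsigned long addr; };
struct op_msrs { struct op_msr *counters, *controls; };

struct op_x86_model_spec {
    unsigned int num_counters, num_controls;
    void (*fill_in_addresses)(struct op_msrs *msrs);
    void (*setup_ctrs)(const struct op_msrs *msrs);
    void (*start)(const struct op_msrs *msrs);
    void (*stop)(const struct op_msrs *msrs);
};

static void fill(struct op_msrs *m)        { puts("fill_in_addresses"); }
static void setup(const struct op_msrs *m) { puts("setup_ctrs"); }
static void start(const struct op_msrs *m) { puts("start"); }
static void stop(const struct op_msrs *m)  { puts("stop"); }

int main(void)
{
    struct op_msr ctrs[2], ctls[2];
    struct op_msrs msrs = { ctrs, ctls };
    struct op_x86_model_spec model = { 2, 2, fill, setup, start, stop };

    model.fill_in_addresses(&msrs);  /* once per CPU */
    model.setup_ctrs(&msrs);         /* program the chosen events */
    model.start(&msrs);              /* counting; NMIs call check_ctrs() */
    model.stop(&msrs);
    return 0;
}
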
diff -r 9fcfdab04aa9 -r fb174770f426 xen/arch/x86/oprofile/xenoprof.c
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/arch/x86/oprofile/xenoprof.c  Fri Apr  7 10:52:00 2006
@@ -0,0 +1,528 @@
+/*
+ * Copyright (C) 2005 Hewlett-Packard Co.
+ * written by Aravind Menon & Jose Renato Santos
+ *            (email: xenoprof@xxxxxxxxxxxxx)
+ */
+
+#include <xen/sched.h>
+#include <public/xenoprof.h>
+
+#include "op_counter.h"
+
+/* Limit amount of pages used for shared buffer (per domain) */
+#define MAX_OPROF_SHARED_PAGES 32
+
+int active_domains[MAX_OPROF_DOMAINS];
+int active_ready[MAX_OPROF_DOMAINS];
+unsigned int adomains;
+unsigned int activated;
+struct domain *primary_profiler;
+int xenoprof_state = XENOPROF_IDLE;
+
+u64 total_samples;
+u64 invalid_buffer_samples;
+u64 corrupted_buffer_samples;
+u64 lost_samples;
+u64 active_samples;
+u64 idle_samples;
+u64 others_samples;
+
+
+extern int nmi_init(int *num_events, int *is_primary, char *cpu_type);
+extern int nmi_reserve_counters(void);
+extern int nmi_setup_events(void);
+extern int nmi_enable_virq(void);
+extern int nmi_start(void);
+extern void nmi_stop(void);
+extern void nmi_disable_virq(void);
+extern void nmi_release_counters(void);
+
+int is_active(struct domain *d)
+{
+    struct xenoprof *x = d->xenoprof;
+    return ((x != NULL) && (x->domain_type == XENOPROF_DOMAIN_ACTIVE));
+}
+
+int is_profiled(struct domain *d)
+{
+    return is_active(d);
+}
+
+static void xenoprof_reset_stat(void)
+{
+    total_samples = 0;
+    invalid_buffer_samples = 0;
+    corrupted_buffer_samples = 0;
+    lost_samples = 0;
+    active_samples = 0;
+    idle_samples = 0;
+    others_samples = 0;
+}
+
+static void xenoprof_reset_buf(struct domain *d)
+{
+    int j;
+    struct xenoprof_buf *buf;
+
+    if ( d->xenoprof == NULL )
+    {
+        printk("xenoprof_reset_buf: ERROR - Unexpected "
+               "Xenoprof NULL pointer \n");
+        return;
+    }
+
+    for ( j = 0; j < MAX_VIRT_CPUS; j++ )
+    {
+        buf = d->xenoprof->vcpu[j].buffer;
+        if ( buf != NULL )
+        {
+            buf->event_head = 0;
+            buf->event_tail = 0;
+        }
+    }
+}
+
+int active_index(struct domain *d)
+{
+    int i, id = d->domain_id;
+
+    for ( i = 0; i < adomains; i++ )
+        if ( active_domains[i] == id )
+            return i;
+
+    return -1;
+}
+
+int set_active(struct domain *d)
+{
+    int ind;
+    struct xenoprof *x;
+
+    ind = active_index(d);
+    if ( ind < 0 )
+        return -EPERM;
+
+    x = d->xenoprof;
+    if ( x == NULL )
+        return -EPERM;
+
+    x->domain_ready = 1;
+    x->domain_type = XENOPROF_DOMAIN_ACTIVE;
+    active_ready[ind] = 1;
+    activated++;
+
+    return 0;
+}
+
+int reset_active(struct domain *d)
+{
+    int ind;
+    struct xenoprof *x;
+
+    ind = active_index(d);
+    if ( ind < 0 )
+        return -EPERM;
+
+    x = d->xenoprof;
+    if ( x == NULL )
+        return -EPERM;
+
+    x->domain_ready = 0;
+    x->domain_type = XENOPROF_DOMAIN_IGNORED;
+    active_ready[ind] = 0;
+    activated--;
+    if ( activated <= 0 )
+        adomains = 0;
+
+    return 0;
+}
+
+int set_active_domains(int num)
+{
+    int primary;
+    int i;
+    struct domain *d;
+
+    /* Reset any existing active domains from previous runs. */
+    for ( i = 0; i < adomains; i++ )
+    {
+        if ( active_ready[i] )
+        {
+            d = find_domain_by_id(active_domains[i]);
+            if ( d != NULL )
+            {
+                reset_active(d);
+                put_domain(d);
+            }
+        }
+    }
+
+    adomains = num;
+
+    /* Add primary profiler to list of active domains if not there yet */
+    primary = active_index(primary_profiler);
+    if ( primary == -1 )
+    {
+        /* Return if there is no space left on list. */
+        if ( num >= MAX_OPROF_DOMAINS )
+            return -E2BIG;
+        active_domains[num] = primary_profiler->domain_id;
+        num++;
+    }
+
+    adomains = num;
+    activated = 0;
+
+    for ( i = 0; i < adomains; i++ )
+        active_ready[i] = 0;
+
+    return 0;
+}
+
+void xenoprof_log_event(
+    struct vcpu *vcpu, unsigned long eip, int mode, int event)
+{
+    struct xenoprof_vcpu *v;
+    struct xenoprof_buf *buf;
+    int head;
+    int tail;
+    int size;
+
+
+    total_samples++;
+
+    /* ignore samples from unmonitored domains */
+    /* count idle-domain samples separately from other unmonitored domains */
+    if ( !is_profiled(vcpu->domain) )
+    {
+        others_samples++;
+        return;
+    }
+
+    v = &vcpu->domain->xenoprof->vcpu[vcpu->vcpu_id];
+
+    /* Sanity check. Should never happen */ 
+    if ( v->buffer == NULL )
+    {
+        invalid_buffer_samples++;
+        return;
+    }
+
+    buf = vcpu->domain->xenoprof->vcpu[vcpu->vcpu_id].buffer;
+
+    head = buf->event_head;
+    tail = buf->event_tail;
+    size = v->event_size;
+
+    /* make sure indexes in shared buffer are sane */
+    if ( (head < 0) || (head >= size) || (tail < 0) || (tail >= size) )
+    {
+        corrupted_buffer_samples++;
+        return;
+    }
+
+    if ( (head == tail - 1) || (head == size - 1 && tail == 0) )
+    {
+        buf->lost_samples++;
+        lost_samples++;
+    }
+    else
+    {
+        buf->event_log[head].eip = eip;
+        buf->event_log[head].mode = mode;
+        buf->event_log[head].event = event;
+        head++;
+        if ( head >= size )
+            head = 0;
+        buf->event_head = head;
+        active_samples++;
+        if ( mode == 0 )
+            buf->user_samples++;
+        else if ( mode == 1 )
+            buf->kernel_samples++;
+        else
+            buf->xen_samples++;
+    }
+}
+
+char *alloc_xenoprof_buf(struct domain *d, int npages)
+{
+    char *rawbuf;
+    int i, order;
+
+    /* allocate pages to store sample buffer shared with domain */
+    order  = get_order_from_pages(npages);
+    rawbuf = alloc_xenheap_pages(order);
+    if ( rawbuf == NULL )
+    {
+        printk("alloc_xenoprof_buf(): memory allocation failed\n");
+        return 0;
+    }
+
+    /* Share the pages so that the guest kernel can map them */
+    for ( i = 0; i < npages; i++ )
+        share_xen_page_with_guest(
+            virt_to_page(rawbuf + i * PAGE_SIZE), 
+            d, XENSHARE_writable);
+
+    return rawbuf;
+}
+
+int alloc_xenoprof_struct(struct domain *d, int max_samples)
+{
+    struct vcpu *v;
+    int nvcpu, npages, bufsize, max_bufsize;
+    int i;
+
+    d->xenoprof = xmalloc(struct xenoprof);
+
+    if ( d->xenoprof == NULL )
+    {
+        printk ("alloc_xenoprof_struct(): memory "
+                "allocation (xmalloc) failed\n");
+        return -ENOMEM;
+    }
+
+    memset(d->xenoprof, 0, sizeof(*d->xenoprof));
+
+    nvcpu = 0;
+    for_each_vcpu ( d, v )
+        nvcpu++;
+
+    /* reduce buffer size if necessary to limit pages allocated */
+    bufsize = sizeof(struct xenoprof_buf) +
+        (max_samples - 1) * sizeof(struct event_log);
+    max_bufsize = (MAX_OPROF_SHARED_PAGES * PAGE_SIZE) / nvcpu;
+    if ( bufsize > max_bufsize )
+    {
+        bufsize = max_bufsize;
+        max_samples = ( (max_bufsize - sizeof(struct xenoprof_buf)) /
+                        sizeof(struct event_log) ) + 1;
+    }
+
+    npages = (nvcpu * bufsize - 1) / PAGE_SIZE + 1;
+    d->xenoprof->rawbuf = alloc_xenoprof_buf(d, npages);
+    if ( d->xenoprof->rawbuf == NULL )
+    {
+        xfree(d->xenoprof);
+        d->xenoprof = NULL;
+        return -ENOMEM;
+    }
+
+    d->xenoprof->npages = npages;
+    d->xenoprof->nbuf = nvcpu;
+    d->xenoprof->bufsize = bufsize;
+    d->xenoprof->domain_ready = 0;
+    d->xenoprof->domain_type = XENOPROF_DOMAIN_IGNORED;
+
+    /* Update buffer pointers for active vcpus */
+    i = 0;
+    for_each_vcpu ( d, v )
+    {
+        d->xenoprof->vcpu[v->vcpu_id].event_size = max_samples;
+        d->xenoprof->vcpu[v->vcpu_id].buffer =
+            (struct xenoprof_buf *)&d->xenoprof->rawbuf[i * bufsize];
+        d->xenoprof->vcpu[v->vcpu_id].buffer->event_size = max_samples;
+        d->xenoprof->vcpu[v->vcpu_id].buffer->vcpu_id = v->vcpu_id;
+
+        i++;
+        /* in the unlikely case that the number of active vcpus changes */
+        if ( i >= nvcpu )
+            break;
+    }
+
+    return 0;
+}
+
+void free_xenoprof_pages(struct domain *d)
+{
+    struct xenoprof *x;
+    int order;
+
+    x = d->xenoprof;
+    if ( x == NULL )
+        return;
+
+    if ( x->rawbuf != NULL )
+    {
+        order = get_order_from_pages(x->npages);
+        free_xenheap_pages(x->rawbuf, order);
+    }
+
+    xfree(x);
+    d->xenoprof = NULL;
+}
+
+int xenoprof_init(int max_samples, xenoprof_init_result_t *init_result)
+{
+    xenoprof_init_result_t result;
+    int is_primary, num_events;
+    struct domain *d = current->domain;
+    int ret;
+
+    ret = nmi_init(&num_events, &is_primary, result.cpu_type);
+    if ( is_primary )
+        primary_profiler = current->domain;
+
+    if ( ret < 0 )
+        goto err;
+
+    /*
+     * We allocate the xenoprof struct and buffers only the first time
+     * xenoprof_init is called; memory is kept until the domain is destroyed.
+     */
+    if ( (d->xenoprof == NULL) &&
+         ((ret = alloc_xenoprof_struct(d, max_samples)) < 0) )
+        goto err;
+
+    xenoprof_reset_buf(d);
+
+    d->xenoprof->domain_type  = XENOPROF_DOMAIN_IGNORED;
+    d->xenoprof->domain_ready = 0;
+    d->xenoprof->is_primary = is_primary;
+
+    result.is_primary = is_primary;
+    result.num_events = num_events;
+    result.nbuf = d->xenoprof->nbuf;
+    result.bufsize = d->xenoprof->bufsize;
+    result.buf_maddr = __pa(d->xenoprof->rawbuf);
+
+    if ( copy_to_user((void *)init_result, (void *)&result, sizeof(result)) )
+    {
+        ret = -EFAULT;
+        goto err;
+    }
+
+    return ret;
+
+ err:
+    if ( primary_profiler == current->domain )
+        primary_profiler = NULL;
+    return ret;
+}
+
+#define PRIV_OP(op) ( (op == XENOPROF_set_active)       \
+                   || (op == XENOPROF_reserve_counters) \
+                   || (op == XENOPROF_setup_events)     \
+                   || (op == XENOPROF_start)            \
+                   || (op == XENOPROF_stop)             \
+                   || (op == XENOPROF_release_counters) \
+                   || (op == XENOPROF_shutdown))
+
+int do_xenoprof_op(int op, unsigned long arg1, unsigned long arg2)
+{
+    int ret = 0;
+
+    if ( PRIV_OP(op) && (current->domain != primary_profiler) )
+    {
+        printk("xenoprof: dom %d denied privileged operation %d\n",
+               current->domain->domain_id, op);
+        return -EPERM;
+    }
+
+    switch ( op )
+    {
+    case XENOPROF_init:
+        ret = xenoprof_init((int)arg1, (xenoprof_init_result_t *)arg2);
+        break;
+
+    case XENOPROF_set_active:
+        if ( xenoprof_state != XENOPROF_IDLE )
+            return -EPERM;
+        if ( arg2 > MAX_OPROF_DOMAINS )
+            return -E2BIG;
+        if ( copy_from_user((void *)&active_domains, 
+                            (void *)arg1, arg2*sizeof(int)) )
+            return -EFAULT;
+        ret = set_active_domains(arg2);
+        break;
+
+    case XENOPROF_reserve_counters:
+        if ( xenoprof_state != XENOPROF_IDLE )
+            return -EPERM;
+        ret = nmi_reserve_counters();
+        if ( !ret )
+            xenoprof_state = XENOPROF_COUNTERS_RESERVED;
+        break;
+
+    case XENOPROF_setup_events:
+        if ( xenoprof_state != XENOPROF_COUNTERS_RESERVED )
+            return -EPERM;
+        if ( adomains == 0 )
+            set_active_domains(0);
+
+        if ( copy_from_user((void *)&counter_config, (void *)arg1, 
+                            arg2 * sizeof(struct op_counter_config)) )
+            return -EFAULT;
+        ret = nmi_setup_events();
+        if ( !ret )
+            xenoprof_state = XENOPROF_READY;
+        break;
+
+    case XENOPROF_enable_virq:
+        if ( current->domain == primary_profiler )
+        {
+            nmi_enable_virq();
+            xenoprof_reset_stat();
+        }
+        xenoprof_reset_buf(current->domain);
+        ret = set_active(current->domain);
+        break;
+
+    case XENOPROF_start:
+        ret = -EPERM;
+        if ( (xenoprof_state == XENOPROF_READY) &&
+             (activated == adomains) )
+            ret = nmi_start();
+
+        if ( ret == 0 )
+            xenoprof_state = XENOPROF_PROFILING;
+        break;
+
+    case XENOPROF_stop:
+        if ( xenoprof_state != XENOPROF_PROFILING )
+            return -EPERM;
+        nmi_stop();
+        xenoprof_state = XENOPROF_READY;
+        break;
+
+    case XENOPROF_disable_virq:
+        if ( (xenoprof_state == XENOPROF_PROFILING) && 
+             (is_active(current->domain)) )
+            return -EPERM;
+        ret = reset_active(current->domain);
+        break;
+
+    case XENOPROF_release_counters:
+        ret = -EPERM;
+        if ( (xenoprof_state == XENOPROF_COUNTERS_RESERVED) ||
+             (xenoprof_state == XENOPROF_READY) )
+        {
+            xenoprof_state = XENOPROF_IDLE;
+            nmi_release_counters();
+            nmi_disable_virq();
+            ret = 0;
+        }
+        break;
+
+    case XENOPROF_shutdown:
+        ret = -EPERM;
+        if ( xenoprof_state == XENOPROF_IDLE )
+        {
+            activated = 0;
+            adomains=0;
+            primary_profiler = NULL;
+            ret = 0;
+        }
+        break;
+
+    default:
+        ret = -EINVAL;
+    }
+
+    if ( ret < 0 )
+        printk("xenoprof: operation %d failed for dom %d (status : %d)\n",
+               op, current->domain->domain_id, ret);
+
+    return ret;
+}
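
xenoprof_log_event() above is the producer half of a single-producer, single-consumer ring: Xen advances event_head, the guest's oprofile driver is expected to advance event_tail, and a sample is dropped whenever head would run into tail. A hypothetical consumer loop, standalone with the buffer mocked down to four slots (field names follow public/xenoprof.h below):

#include <stdio.h>
#include <stdint.h>

struct event_log { uint64_t eip; uint8_t mode; uint8_t event; };

struct xenoprof_buf {
    uint32_t event_head, event_tail, event_size;
    struct event_log event_log[4];
};

/* drain everything between tail and head, then publish the new tail */
static void consume(struct xenoprof_buf *buf)
{
    uint32_t tail = buf->event_tail;

    while (tail != buf->event_head) {
        struct event_log *e = &buf->event_log[tail];
        printf("eip=0x%llx mode=%u event=%u\n",
               (unsigned long long)e->eip,
               (unsigned)e->mode, (unsigned)e->event);
        if (++tail >= buf->event_size)
            tail = 0;
    }
    buf->event_tail = tail;    /* frees the slots for Xen to refill */
}

int main(void)
{
    struct xenoprof_buf buf = {
        .event_head = 2, .event_tail = 0, .event_size = 4,
        .event_log  = { { 0xc0100000, 1, 0 }, { 0x08048000, 0, 1 } },
    };
    consume(&buf);
    return 0;
}
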
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/public/callback.h
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/include/public/callback.h     Fri Apr  7 10:52:00 2006
@@ -0,0 +1,57 @@
+/******************************************************************************
+ * callback.h
+ *
+ * Register guest OS callbacks with Xen.
+ *
+ * Copyright (c) 2006, Ian Campbell
+ */
+
+#ifndef __XEN_PUBLIC_CALLBACK_H__
+#define __XEN_PUBLIC_CALLBACK_H__
+
+#include "xen.h"
+
+/*
+ * Prototype for this hypercall is:
+ *   long callback_op(int cmd, void *extra_args)
+ * @cmd        == CALLBACKOP_??? (callback operation).
+ * @extra_args == Operation-specific extra arguments (NULL if none).
+ */
+
+#define CALLBACKTYPE_event                 0
+#define CALLBACKTYPE_failsafe              1
+#define CALLBACKTYPE_syscall               2 /* x86_64 only */
+
+/*
+ * Register a callback.
+ */
+#define CALLBACKOP_register                0
+typedef struct callback_register {
+     int type;
+     xen_callback_t address;
+} callback_register_t;
+DEFINE_GUEST_HANDLE(callback_register_t);
+
+/*
+ * Unregister a callback.
+ *
+ * Not all callbacks can be unregistered. -EINVAL will be returned if
+ * you attempt to unregister such a callback.
+ */
+#define CALLBACKOP_unregister              1
+typedef struct callback_unregister {
+     int type;
+} callback_unregister_t;
+DEFINE_GUEST_HANDLE(callback_unregister_t);
+
+#endif /* __XEN_PUBLIC_CALLBACK_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/public/xenoprof.h
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/include/public/xenoprof.h     Fri Apr  7 10:52:00 2006
@@ -0,0 +1,83 @@
+/******************************************************************************
+ * xenoprof.h
+ * 
+ * Interface for enabling system-wide profiling based on hardware
+ * performance counters
+ * 
+ * Copyright (C) 2005 Hewlett-Packard Co.
+ * Written by Aravind Menon & Jose Renato Santos
+ */
+
+#ifndef __XEN_PUBLIC_XENOPROF_H__
+#define __XEN_PUBLIC_XENOPROF_H__
+
+/*
+ * Commands to HYPERVISOR_pmc_op().
+ */
+#define XENOPROF_init               0
+#define XENOPROF_set_active         1
+#define XENOPROF_reserve_counters   3
+#define XENOPROF_setup_events       4
+#define XENOPROF_enable_virq        5
+#define XENOPROF_start              6
+#define XENOPROF_stop               7
+#define XENOPROF_disable_virq       8
+#define XENOPROF_release_counters   9
+#define XENOPROF_shutdown          10
+
+#define MAX_OPROF_EVENTS    32
+#define MAX_OPROF_DOMAINS   25 
+#define XENOPROF_CPU_TYPE_SIZE 64
+
+/* Xenoprof performance events (not Xen events) */
+struct event_log {
+    uint64_t eip;
+    uint8_t mode;
+    uint8_t event;
+};
+
+/* Xenoprof buffer shared between Xen and domain - 1 per VCPU */
+typedef struct xenoprof_buf {
+    uint32_t event_head;
+    uint32_t event_tail;
+    uint32_t event_size;
+    uint32_t vcpu_id;
+    uint64_t xen_samples;
+    uint64_t kernel_samples;
+    uint64_t user_samples;
+    uint64_t lost_samples;
+    struct event_log event_log[1];
+} xenoprof_buf_t;
+DEFINE_GUEST_HANDLE(xenoprof_buf_t);
+
+typedef struct xenoprof_init_result {
+    int32_t  num_events;
+    int32_t  is_primary;
+    int32_t  nbuf;
+    int32_t  bufsize;
+    uint64_t buf_maddr;
+    char cpu_type[XENOPROF_CPU_TYPE_SIZE];
+} xenoprof_init_result_t;
+DEFINE_GUEST_HANDLE(xenoprof_init_result_t);
+
+typedef struct xenoprof_counter_config {
+    unsigned long count;
+    unsigned long enabled;
+    unsigned long event;
+    unsigned long kernel;
+    unsigned long user;
+    unsigned long unit_mask;
+} xenoprof_counter_config_t;
+DEFINE_GUEST_HANDLE(xenoprof_counter_config_t);
+
+#endif /* __XEN_PUBLIC_XENOPROF_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff -r 9fcfdab04aa9 -r fb174770f426 xen/include/xen/xenoprof.h
--- /dev/null   Thu Apr  6 13:22:52 2006
+++ b/xen/include/xen/xenoprof.h        Fri Apr  7 10:52:00 2006
@@ -0,0 +1,42 @@
+/******************************************************************************
+ * xenoprof.h
+ * 
+ * Xenoprof enables performance profiling in Xen
+ * 
+ * Copyright (C) 2005 Hewlett-Packard Co.
+ * written by Aravind Menon & Jose Renato Santos
+ */
+
+#ifndef __XEN_XENOPROF_H__
+#define __XEN_XENOPROF_H__
+
+#include <public/xenoprof.h>
+
+#define XENOPROF_DOMAIN_IGNORED    0
+#define XENOPROF_DOMAIN_ACTIVE     1
+
+#define XENOPROF_IDLE              0
+#define XENOPROF_COUNTERS_RESERVED 1
+#define XENOPROF_READY             2
+#define XENOPROF_PROFILING         3
+
+struct xenoprof_vcpu {
+    int event_size;
+    struct xenoprof_buf *buffer;
+};
+
+struct xenoprof {
+    char* rawbuf;
+    int npages;
+    int nbuf;
+    int bufsize;
+    int domain_type;
+    int domain_ready;
+    int is_primary;
+    struct xenoprof_vcpu vcpu [MAX_VIRT_CPUS];
+};
+
+struct domain;
+void free_xenoprof_pages(struct domain *d);
+
+#endif  /* __XEN_XENOPROF_H__ */
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/include/linux/irq.h
--- a/linux-2.6-xen-sparse/include/linux/irq.h  Thu Apr  6 13:22:52 2006
+++ /dev/null   Fri Apr  7 10:52:00 2006
@@ -1,244 +0,0 @@
-#ifndef __irq_h
-#define __irq_h
-
-/*
- * Please do not include this file in generic code.  There is currently
- * no requirement for any architecture to implement anything held
- * within this file.
- *
- * Thanks. --rmk
- */
-
-#include <linux/config.h>
-#include <linux/smp.h>
-
-#if !defined(CONFIG_S390)
-
-#include <linux/linkage.h>
-#include <linux/cache.h>
-#include <linux/spinlock.h>
-#include <linux/cpumask.h>
-
-#include <asm/irq.h>
-#include <asm/ptrace.h>
-
-/*
- * IRQ line status.
- */
-#define IRQ_INPROGRESS 1       /* IRQ handler active - do not enter! */
-#define IRQ_DISABLED   2       /* IRQ disabled - do not enter! */
-#define IRQ_PENDING    4       /* IRQ pending - replay on enable */
-#define IRQ_REPLAY     8       /* IRQ has been replayed but not acked yet */
-#define IRQ_AUTODETECT 16      /* IRQ is being autodetected */
-#define IRQ_WAITING    32      /* IRQ not yet seen - for autodetection */
-#define IRQ_LEVEL      64      /* IRQ level triggered */
-#define IRQ_MASKED     128     /* IRQ masked - shouldn't be seen again */
-#if defined(ARCH_HAS_IRQ_PER_CPU)
-# define IRQ_PER_CPU   256     /* IRQ is per CPU */
-# define CHECK_IRQ_PER_CPU(var) ((var) & IRQ_PER_CPU)
-#else
-# define CHECK_IRQ_PER_CPU(var) 0
-#endif
-
-/*
- * Interrupt controller descriptor. This is all we need
- * to describe about the low-level hardware. 
- */
-struct hw_interrupt_type {
-       const char * typename;
-       unsigned int (*startup)(unsigned int irq);
-       void (*shutdown)(unsigned int irq);
-       void (*enable)(unsigned int irq);
-       void (*disable)(unsigned int irq);
-       void (*ack)(unsigned int irq);
-       void (*end)(unsigned int irq);
-       void (*set_affinity)(unsigned int irq, cpumask_t dest);
-       /* Currently used only by UML, might disappear one day.*/
-#ifdef CONFIG_IRQ_RELEASE_METHOD
-       void (*release)(unsigned int irq, void *dev_id);
-#endif
-};
-
-typedef struct hw_interrupt_type  hw_irq_controller;
-
-/*
- * This is the "IRQ descriptor", which contains various information
- * about the irq, including what kind of hardware handling it has,
- * whether it is disabled etc etc.
- *
- * Pad this out to 32 bytes for cache and indexing reasons.
- */
-typedef struct irq_desc {
-       hw_irq_controller *handler;
-       void *handler_data;
-       struct irqaction *action;       /* IRQ action list */
-       unsigned int status;            /* IRQ status */
-       unsigned int depth;             /* nested irq disables */
-       unsigned int irq_count;         /* For detecting broken interrupts */
-       unsigned int irqs_unhandled;
-       spinlock_t lock;
-#if defined (CONFIG_GENERIC_PENDING_IRQ) || defined (CONFIG_IRQBALANCE)
-       unsigned int move_irq;          /* Flag need to re-target intr dest*/
-#endif
-} ____cacheline_aligned irq_desc_t;
-
-extern irq_desc_t irq_desc [NR_IRQS];
-
-/* Return a pointer to the irq descriptor for IRQ.  */
-static inline irq_desc_t *
-irq_descp (int irq)
-{
-       return irq_desc + irq;
-}
-
-#include <asm/hw_irq.h> /* the arch dependent stuff */
-
-extern int setup_irq(unsigned int irq, struct irqaction * new);
-#ifdef CONFIG_XEN
-extern int teardown_irq(unsigned int irq, struct irqaction * old);
-#endif
-
-#ifdef CONFIG_GENERIC_HARDIRQS
-extern cpumask_t irq_affinity[NR_IRQS];
-
-#ifdef CONFIG_SMP
-static inline void set_native_irq_info(int irq, cpumask_t mask)
-{
-       irq_affinity[irq] = mask;
-}
-#else
-static inline void set_native_irq_info(int irq, cpumask_t mask)
-{
-}
-#endif
-
-#ifdef CONFIG_SMP
-
-#if defined (CONFIG_GENERIC_PENDING_IRQ) || defined (CONFIG_IRQBALANCE)
-extern cpumask_t pending_irq_cpumask[NR_IRQS];
-
-static inline void set_pending_irq(unsigned int irq, cpumask_t mask)
-{
-       irq_desc_t *desc = irq_desc + irq;
-       unsigned long flags;
-
-       spin_lock_irqsave(&desc->lock, flags);
-       desc->move_irq = 1;
-       pending_irq_cpumask[irq] = mask;
-       spin_unlock_irqrestore(&desc->lock, flags);
-}
-
-static inline void
-move_native_irq(int irq)
-{
-       cpumask_t tmp;
-       irq_desc_t *desc = irq_descp(irq);
-
-       if (likely (!desc->move_irq))
-               return;
-
-       desc->move_irq = 0;
-
-       if (likely(cpus_empty(pending_irq_cpumask[irq])))
-               return;
-
-       if (!desc->handler->set_affinity)
-               return;
-
-       /* note - we hold the desc->lock */
-       cpus_and(tmp, pending_irq_cpumask[irq], cpu_online_map);
-
-       /*
-        * If there was a valid mask to work with, please
-        * do the disable, re-program, enable sequence.
-        * This is *not* particularly important for level triggered
-        * but in a edge trigger case, we might be setting rte
-        * when an active trigger is comming in. This could
-        * cause some ioapics to mal-function.
-        * Being paranoid i guess!
-        */
-       if (unlikely(!cpus_empty(tmp))) {
-               desc->handler->disable(irq);
-               desc->handler->set_affinity(irq,tmp);
-               desc->handler->enable(irq);
-       }
-       cpus_clear(pending_irq_cpumask[irq]);
-}
-
-#ifdef CONFIG_PCI_MSI
-/*
- * Wonder why these are dummies?
- * For e.g the set_ioapic_affinity_vector() calls the set_ioapic_affinity_irq()
- * counter part after translating the vector to irq info. We need to perform
- * this operation on the real irq, when we dont use vector, i.e when
- * pci_use_vector() is false.
- */
-static inline void move_irq(int irq)
-{
-}
-
-static inline void set_irq_info(int irq, cpumask_t mask)
-{
-}
-
-#else // CONFIG_PCI_MSI
-
-static inline void move_irq(int irq)
-{
-       move_native_irq(irq);
-}
-
-static inline void set_irq_info(int irq, cpumask_t mask)
-{
-       set_native_irq_info(irq, mask);
-}
-#endif // CONFIG_PCI_MSI
-
-#else  // CONFIG_GENERIC_PENDING_IRQ || CONFIG_IRQBALANCE
-
-#define move_irq(x)
-#define move_native_irq(x)
-#define set_pending_irq(x,y)
-static inline void set_irq_info(int irq, cpumask_t mask)
-{
-       set_native_irq_info(irq, mask);
-}
-
-#endif // CONFIG_GENERIC_PENDING_IRQ
-
-#else // CONFIG_SMP
-
-#define move_irq(x)
-#define move_native_irq(x)
-
-#endif // CONFIG_SMP
-
-extern int no_irq_affinity;
-extern int noirqdebug_setup(char *str);
-
-extern fastcall int handle_IRQ_event(unsigned int irq, struct pt_regs *regs,
-                                       struct irqaction *action);
-extern fastcall unsigned int __do_IRQ(unsigned int irq, struct pt_regs *regs);
-extern void note_interrupt(unsigned int irq, irq_desc_t *desc,
-                                       int action_ret, struct pt_regs *regs);
-extern int can_request_irq(unsigned int irq, unsigned long irqflags);
-
-extern void init_irq_proc(void);
-
-#ifdef CONFIG_AUTO_IRQ_AFFINITY
-extern int select_smp_affinity(unsigned int irq);
-#else
-static inline int
-select_smp_affinity(unsigned int irq)
-{
-       return 1;
-}
-#endif
-
-#endif
-
-extern hw_irq_controller no_irq_type;  /* needed in every arch ? */
-
-#endif
-
-#endif /* __irq_h */
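
For reference, the affinity helpers removed above implement a deferred
migration pattern: set_pending_irq() only records the target mask, and
move_native_irq() re-programs the line later, from the interrupt path. A
minimal sketch of that pattern follows; the names request_affinity and
apply_pending_affinity are hypothetical, and the data structures mirror
the removed code rather than any current API.

/* Sketch only: deferred IRQ affinity change, modelled on the removed code. */
static cpumask_t pending_mask[NR_IRQS];

static void request_affinity(unsigned int irq, cpumask_t mask)
{
        irq_desc_t *desc = irq_desc + irq;
        unsigned long flags;

        spin_lock_irqsave(&desc->lock, flags);
        desc->move_irq = 1;             /* flag a pending migration */
        pending_mask[irq] = mask;       /* remember the target CPUs */
        spin_unlock_irqrestore(&desc->lock, flags);
}

/* Called from the interrupt path with desc->lock held. */
static void apply_pending_affinity(unsigned int irq, irq_desc_t *desc)
{
        cpumask_t tmp;

        if (!desc->move_irq)
                return;
        desc->move_irq = 0;

        /* Only act on CPUs that are actually online. */
        cpus_and(tmp, pending_mask[irq], cpu_online_map);
        if (!cpus_empty(tmp)) {
                /* Disable, re-program, enable: an edge-triggered line
                 * must not be re-routed while a trigger is arriving. */
                desc->handler->disable(irq);
                desc->handler->set_affinity(irq, tmp);
                desc->handler->enable(irq);
        }
        cpus_clear(pending_mask[irq]);
}
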
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/kernel/irq/manage.c
--- a/linux-2.6-xen-sparse/kernel/irq/manage.c  Thu Apr  6 13:22:52 2006
+++ /dev/null   Fri Apr  7 10:52:00 2006
@@ -1,425 +0,0 @@
-/*
- * linux/kernel/irq/manage.c
- *
- * Copyright (C) 1992, 1998-2004 Linus Torvalds, Ingo Molnar
- *
- * This file contains driver APIs to the irq subsystem.
- */
-
-#include <linux/config.h>
-#include <linux/irq.h>
-#include <linux/module.h>
-#include <linux/random.h>
-#include <linux/interrupt.h>
-
-#include "internals.h"
-
-#ifdef CONFIG_SMP
-
-cpumask_t irq_affinity[NR_IRQS] = { [0 ... NR_IRQS-1] = CPU_MASK_ALL };
-
-#if defined (CONFIG_GENERIC_PENDING_IRQ) || defined (CONFIG_IRQBALANCE)
-cpumask_t __cacheline_aligned pending_irq_cpumask[NR_IRQS];
-#endif
-
-/**
- *     synchronize_irq - wait for pending IRQ handlers (on other CPUs)
- *     @irq: interrupt number to wait for
- *
- *     This function waits for any pending IRQ handlers for this interrupt
- *     to complete before returning. If you use this function while
- *     holding a resource the IRQ handler may need, you will deadlock.
- *
- *     This function may be called - with care - from IRQ context.
- */
-void synchronize_irq(unsigned int irq)
-{
-       struct irq_desc *desc = irq_desc + irq;
-
-       if (irq >= NR_IRQS)
-               return;
-
-       while (desc->status & IRQ_INPROGRESS)
-               cpu_relax();
-}
-
-EXPORT_SYMBOL(synchronize_irq);
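
As the kernel-doc above warns, synchronize_irq() deadlocks if called
while holding a resource (such as a lock) the handler may need. A
minimal sketch of safe teardown ordering, assuming a hypothetical
my_dev structure:

/* Sketch: quiesce the handler before freeing state it touches. */
struct my_dev {                 /* hypothetical device state */
        unsigned int irq;
        void *dma_buf;
};

static void my_dev_quiesce(struct my_dev *dev)
{
        disable_irq_nosync(dev->irq);   /* mask the line */
        synchronize_irq(dev->irq);      /* wait for in-flight handlers */
        kfree(dev->dma_buf);            /* now safe: no handler can run */
}
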
-
-#endif
-
-/**
- *     disable_irq_nosync - disable an irq without waiting
- *     @irq: Interrupt to disable
- *
- *     Disable the selected interrupt line.  Disables and Enables are
- *     nested.
- *     Unlike disable_irq(), this function does not ensure existing
- *     instances of the IRQ handler have completed before returning.
- *
- *     This function may be called from IRQ context.
- */
-void disable_irq_nosync(unsigned int irq)
-{
-       irq_desc_t *desc = irq_desc + irq;
-       unsigned long flags;
-
-       if (irq >= NR_IRQS)
-               return;
-
-       spin_lock_irqsave(&desc->lock, flags);
-       if (!desc->depth++) {
-               desc->status |= IRQ_DISABLED;
-               desc->handler->disable(irq);
-       }
-       spin_unlock_irqrestore(&desc->lock, flags);
-}
-
-EXPORT_SYMBOL(disable_irq_nosync);
-
-/**
- *     disable_irq - disable an irq and wait for completion
- *     @irq: Interrupt to disable
- *
- *     Disable the selected interrupt line.  Enables and Disables are
- *     nested.
- *     This function waits for any pending IRQ handlers for this interrupt
- *     to complete before returning. If you use this function while
- *     holding a resource the IRQ handler may need, you will deadlock.
- *
- *     This function may be called - with care - from IRQ context.
- */
-void disable_irq(unsigned int irq)
-{
-       irq_desc_t *desc = irq_desc + irq;
-
-       if (irq >= NR_IRQS)
-               return;
-
-       disable_irq_nosync(irq);
-       if (desc->action)
-               synchronize_irq(irq);
-}
-
-EXPORT_SYMBOL(disable_irq);
-
-/**
- *     enable_irq - enable handling of an irq
- *     @irq: Interrupt to enable
- *
- *     Undoes the effect of one call to disable_irq().  If this
- *     matches the last disable, processing of interrupts on this
- *     IRQ line is re-enabled.
- *
- *     This function may be called from IRQ context.
- */
-void enable_irq(unsigned int irq)
-{
-       irq_desc_t *desc = irq_desc + irq;
-       unsigned long flags;
-
-       if (irq >= NR_IRQS)
-               return;
-
-       spin_lock_irqsave(&desc->lock, flags);
-       switch (desc->depth) {
-       case 0:
-               WARN_ON(1);
-               break;
-       case 1: {
-               unsigned int status = desc->status & ~IRQ_DISABLED;
-
-               desc->status = status;
-               if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
-                       desc->status = status | IRQ_REPLAY;
-                       hw_resend_irq(desc->handler,irq);
-               }
-               desc->handler->enable(irq);
-               /* fall-through */
-       }
-       default:
-               desc->depth--;
-       }
-       spin_unlock_irqrestore(&desc->lock, flags);
-}
-
-EXPORT_SYMBOL(enable_irq);
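
Note that the depth counter above makes disable/enable strictly nesting:
the line is only unmasked again once every disable_irq() has been
balanced. A brief sketch:

/* Sketch: nested disable/enable semantics for some irq line. */
static void nesting_example(unsigned int irq)
{
        disable_irq(irq);       /* depth 0 -> 1: line masked, handlers drained */
        disable_irq(irq);       /* depth 1 -> 2: still masked */
        enable_irq(irq);        /* depth 2 -> 1: still masked */
        enable_irq(irq);        /* depth 1 -> 0: line unmasked again */
}
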
-
-/*
- * Internal function that tells the architecture code whether a
- * particular irq has been exclusively allocated or is available
- * for driver use.
- */
-int can_request_irq(unsigned int irq, unsigned long irqflags)
-{
-       struct irqaction *action;
-
-       if (irq >= NR_IRQS)
-               return 0;
-
-       action = irq_desc[irq].action;
-       if (action)
-               if (irqflags & action->flags & SA_SHIRQ)
-                       action = NULL;
-
-       return !action;
-}
-
-/**
- *     setup_irq - register an irqaction structure
- *     @irq: Interrupt to register
- *     @irqaction: The irqaction structure to be registered
- *
- *     Normally called by request_irq, this function can be used
- *     directly to allocate special interrupts that are part of the
- *     architecture.
- */
-int setup_irq(unsigned int irq, struct irqaction * new)
-{
-       struct irq_desc *desc = irq_desc + irq;
-       struct irqaction *old, **p;
-       unsigned long flags;
-       int shared = 0;
-
-       if (irq >= NR_IRQS)
-               return -EINVAL;
-
-       if (desc->handler == &no_irq_type)
-               return -ENOSYS;
-       /*
-        * Some drivers like serial.c use request_irq() heavily,
-        * so we have to be careful not to interfere with a
-        * running system.
-        */
-       if (new->flags & SA_SAMPLE_RANDOM) {
-               /*
-                * This function might sleep, so we want to call it first,
-                * outside of the atomic block.
-                * Yes, this might clear the entropy pool if an attempt is
-                * made to load the wrong driver without actually installing
-                * a new handler, but is that really a problem? Only the
-                * sysadmin is able to do this.
-                */
-               rand_initialize_irq(irq);
-       }
-
-       /*
-        * The following block of code has to be executed atomically
-        */
-       spin_lock_irqsave(&desc->lock,flags);
-       p = &desc->action;
-       if ((old = *p) != NULL) {
-               /* Can't share interrupts unless both agree to */
-               if (!(old->flags & new->flags & SA_SHIRQ)) {
-                       spin_unlock_irqrestore(&desc->lock,flags);
-                       return -EBUSY;
-               }
-
-               /* add new interrupt at end of irq queue */
-               do {
-                       p = &old->next;
-                       old = *p;
-               } while (old);
-               shared = 1;
-       }
-
-       *p = new;
-
-       if (!shared) {
-               desc->depth = 0;
-               desc->status &= ~(IRQ_DISABLED | IRQ_AUTODETECT |
-                                 IRQ_WAITING | IRQ_INPROGRESS);
-               if (desc->handler->startup)
-                       desc->handler->startup(irq);
-               else
-                       desc->handler->enable(irq);
-       }
-       spin_unlock_irqrestore(&desc->lock,flags);
-
-       new->irq = irq;
-       register_irq_proc(irq);
-       new->dir = NULL;
-       register_handler_proc(irq, new);
-
-       return 0;
-}
-
-/*
- *     teardown_irq - unregister an irqaction
- *     @irq: Interrupt line being freed
- *     @old: Pointer to the irqaction that is to be unregistered
- *
- *     This function is called by free_irq and does the actual
- *     business of unregistering the handler. It exists as a
- *     separate function to enable handlers to be unregistered
- *     for irqactions that have been allocated statically at
- *     boot time.
- *
- *     This function must not be called from interrupt context.
- */
-#ifndef CONFIG_XEN
-static
-#endif
-int teardown_irq(unsigned int irq, struct irqaction * old)
-{
-       struct irq_desc *desc;
-       struct irqaction **p;
-       unsigned long flags;
-
-       if (irq >= NR_IRQS)
-               return -ENOENT;
-
-       desc = irq_desc + irq;
-       spin_lock_irqsave(&desc->lock,flags);
-       p = &desc->action;
-       for (;;) {
-               struct irqaction * action = *p;
-
-               if (action) {
-                       struct irqaction **pp = p;
-
-                       p = &action->next;
-                       if (action != old)
-                               continue;
-
-                       /* Found it - now remove it from the list of entries */
-                       *pp = action->next;
-
-                       /* Currently used only by UML, might disappear one day. */
-#ifdef CONFIG_IRQ_RELEASE_METHOD
-                       if (desc->handler->release)
-                               desc->handler->release(irq, action->dev_id);
-#endif
-
-                       if (!desc->action) {
-                               desc->status |= IRQ_DISABLED;
-                               if (desc->handler->shutdown)
-                                       desc->handler->shutdown(irq);
-                               else
-                                       desc->handler->disable(irq);
-                       }
-                       spin_unlock_irqrestore(&desc->lock,flags);
-                       unregister_handler_proc(irq, action);
-
-                       /* Make sure it's not being used on another CPU */
-                       synchronize_irq(irq);
-                       return 0;
-               }
-               printk(KERN_ERR "Trying to teardown free IRQ%d\n",irq);
-               spin_unlock_irqrestore(&desc->lock,flags);
-               return -ENOENT;
-       }
-}
-
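
The CONFIG_XEN export of teardown_irq() above exists so that irqactions
allocated statically (rather than via request_irq()) can be unregistered
again. A minimal sketch, where TIMER_IRQ, timer_interrupt and the init/exit
names are illustrative only:

/* Sketch: register and later remove a statically allocated irqaction. */
static irqreturn_t timer_interrupt(int irq, void *dev_id,
                                   struct pt_regs *regs)
{
        return IRQ_HANDLED;
}

static struct irqaction timer_action = {
        .handler = timer_interrupt,
        .flags   = SA_INTERRUPT,
        .name    = "timer",
};

static int __init my_timer_init(void)
{
        return setup_irq(TIMER_IRQ, &timer_action);
}

static void my_timer_exit(void)
{
        teardown_irq(TIMER_IRQ, &timer_action);
}
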
-/**
- *     free_irq - free an interrupt
- *     @irq: Interrupt line to free
- *     @dev_id: Device identity to free
- *
- *     Remove an interrupt handler. The handler is removed and if the
- *     interrupt line is no longer in use by any driver it is disabled.
- *     On a shared IRQ the caller must ensure the interrupt is disabled
- *     on the card it drives before calling this function. The function
- *     does not return until any executing interrupts for this IRQ
- *     have completed.
- *
- *     This function must not be called from interrupt context.
- */
-void free_irq(unsigned int irq, void *dev_id)
-{
-       struct irq_desc *desc;
-       struct irqaction *action;
-       unsigned long flags;
-
-       if (irq >= NR_IRQS)
-               return;
-
-       desc = irq_desc + irq;
-       spin_lock_irqsave(&desc->lock,flags);
-       for (action = desc->action; action != NULL; action = action->next) {
-               if (action->dev_id != dev_id)
-                       continue;
-
-               spin_unlock_irqrestore(&desc->lock,flags);
-
-               if (teardown_irq(irq, action) == 0)
-                       kfree(action);
-               return;
-       }
-       printk(KERN_ERR "Trying to free free IRQ%d\n",irq);
-       spin_unlock_irqrestore(&desc->lock,flags);
-       return;
-}
-
-EXPORT_SYMBOL(free_irq);
-
-/**
- *     request_irq - allocate an interrupt line
- *     @irq: Interrupt line to allocate
- *     @handler: Function to be called when the IRQ occurs
- *     @irqflags: Interrupt type flags
- *     @devname: An ascii name for the claiming device
- *     @dev_id: A cookie passed back to the handler function
- *
- *     This call allocates interrupt resources and enables the
- *     interrupt line and IRQ handling. From the point this
- *     call is made your handler function may be invoked. Since
- *     your handler function must clear any interrupt the board
- *     raises, you must take care both to initialise your hardware
- *     and to set up the interrupt handler in the right order.
- *
- *     Dev_id must be globally unique. Normally the address of the
- *     device data structure is used as the cookie. Since the handler
- *     receives this value it makes sense to use it.
- *
- *     If your interrupt is shared you must pass a non NULL dev_id
- *     as this is required when freeing the interrupt.
- *
- *     Flags:
- *
- *     SA_SHIRQ                Interrupt is shared
- *     SA_INTERRUPT            Disable local interrupts while processing
- *     SA_SAMPLE_RANDOM        The interrupt can be used for entropy
- *
- */
-int request_irq(unsigned int irq,
-               irqreturn_t (*handler)(int, void *, struct pt_regs *),
-               unsigned long irqflags, const char * devname, void *dev_id)
-{
-       struct irqaction * action;
-       int retval;
-
-       /*
-        * Sanity-check: shared interrupts must pass in a real dev-ID,
-        * otherwise we'll have trouble later trying to figure out
-        * which interrupt is which (messes up the interrupt freeing
-        * logic etc).
-        */
-       if ((irqflags & SA_SHIRQ) && !dev_id)
-               return -EINVAL;
-       if (irq >= NR_IRQS)
-               return -EINVAL;
-       if (!handler)
-               return -EINVAL;
-
-       action = kmalloc(sizeof(struct irqaction), GFP_ATOMIC);
-       if (!action)
-               return -ENOMEM;
-
-       action->handler = handler;
-       action->flags = irqflags;
-       cpus_clear(action->mask);
-       action->name = devname;
-       action->next = NULL;
-       action->dev_id = dev_id;
-
-       select_smp_affinity(irq);
-
-       retval = setup_irq(irq, action);
-       if (retval)
-               kfree(action);
-
-       return retval;
-}
-
-EXPORT_SYMBOL(request_irq);
-
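
For context, the canonical use of the API removed above is a
request_irq()/free_irq() pair in a driver. A minimal sketch against the
prototypes shown in this file; the device name, handler body and my_card
structure are hypothetical:

#include <linux/interrupt.h>

static struct my_card {
        unsigned int irq;       /* hypothetical device state */
} card;

static irqreturn_t my_handler(int irq, void *dev_id, struct pt_regs *regs)
{
        struct my_card *c = dev_id;
        /* acknowledge and handle the device interrupt here */
        (void)c;
        return IRQ_HANDLED;
}

static int my_probe(void)
{
        /* Shared line, so a non-NULL dev_id cookie is mandatory. */
        return request_irq(card.irq, my_handler, SA_SHIRQ, "my_card", &card);
}

static void my_remove(void)
{
        /* Must pass the same dev_id that request_irq() was given. */
        free_irq(card.irq, &card);
}
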
diff -r 9fcfdab04aa9 -r fb174770f426 linux-2.6-xen-sparse/lib/Kconfig.debug
--- a/linux-2.6-xen-sparse/lib/Kconfig.debug    Thu Apr  6 13:22:52 2006
+++ /dev/null   Fri Apr  7 10:52:00 2006
@@ -1,224 +0,0 @@
-
-config PRINTK_TIME
-       bool "Show timing information on printks"
-       help
-         Selecting this option causes timing information to be
-         included in printk output.  This allows you to measure
-         the interval between kernel operations, including bootup
-         operations.  This is useful for identifying long delays
-         in kernel startup.
-
-
-config MAGIC_SYSRQ
-       bool "Magic SysRq key"
-       depends on !UML
-       help
-         If you say Y here, you will have some control over the system even
-         if the system crashes for example during kernel debugging (e.g., you
-         will be able to flush the buffer cache to disk, reboot the system
-         immediately or dump some status information). This is accomplished
-         by pressing various keys while holding SysRq (Alt+PrintScreen). It
-         also works on a serial console (on PC hardware at least), if you
-         send a BREAK and then within 5 seconds a command keypress. The
-         keys are documented in <file:Documentation/sysrq.txt>. Don't say Y
-         unless you really know what this hack does.
-
-config DEBUG_KERNEL
-       bool "Kernel debugging"
-       help
-         Say Y here if you are developing drivers or trying to debug and
-         identify kernel problems.
-
-config LOG_BUF_SHIFT
-       int "Kernel log buffer size (16 => 64KB, 17 => 128KB)" if DEBUG_KERNEL
-       range 12 21
-       default 17 if S390
-       default 16 if X86_NUMAQ || IA64
-       default 15 if SMP
-       default 14
-       help
-         Select kernel log buffer size as a power of 2.
-         Defaults and Examples:
-                    17 => 128 KB for S/390
-                    16 => 64 KB for x86 NUMAQ or IA-64
-                    15 => 32 KB for SMP
-                    14 => 16 KB for uniprocessor
-                    13 =>  8 KB
-                    12 =>  4 KB
-
-config DETECT_SOFTLOCKUP
-       bool "Detect Soft Lockups"
-       depends on DEBUG_KERNEL
-       default y
-       help
-         Say Y here to enable the kernel to detect "soft lockups",
-         which are bugs that cause the kernel to loop in kernel
-         mode for more than 10 seconds, without giving other tasks a
-         chance to run.
-
-         When a soft-lockup is detected, the kernel will print the
-         current stack trace (which you should report), but the
-         system will stay locked up. This feature has negligible
-         overhead.
-
-         (Note that "hard lockups" are a separate type of bug that
-          can be detected via the NMI watchdog, on platforms that
-          support it.)
-
-config SCHEDSTATS
-       bool "Collect scheduler statistics"
-       depends on DEBUG_KERNEL && PROC_FS
-       help
-         If you say Y here, additional code will be inserted into the
-         scheduler and related routines to collect statistics about
-         scheduler behavior and provide them in /proc/schedstat.  These
-         stats may be useful for both tuning and debugging the scheduler.
-         If you aren't debugging the scheduler or trying to tune a specific
-         application, you can say N to avoid the very slight overhead
-         this adds.
-
-config DEBUG_SLAB
-       bool "Debug memory allocations"
-       depends on DEBUG_KERNEL && SLAB
-       help
-         Say Y here to have the kernel do limited verification on memory
-         allocation as well as poisoning memory on free to catch use of freed
-         memory. This can make kmalloc/kfree-intensive workloads much slower.
-
-config DEBUG_PREEMPT
-       bool "Debug preemptible kernel"
-       depends on DEBUG_KERNEL && PREEMPT
-       default y
-       help
-         If you say Y here then the kernel will use a debug variant of the
-         commonly used smp_processor_id() function and will print warnings
-         if kernel code uses it in a preemption-unsafe way. Also, the kernel
-         will detect preemption count underflows.
-
-config DEBUG_MUTEXES
-       bool "Mutex debugging, deadlock detection"
-       default y
-       depends on DEBUG_KERNEL
-       help
-        This allows mutex semantics violations and mutex related deadlocks
-        (lockups) to be detected and reported automatically.
-
-config DEBUG_SPINLOCK
-       bool "Spinlock debugging"
-       depends on DEBUG_KERNEL
-       help
-         Say Y here and build SMP to catch missing spinlock initialization
-         and certain other kinds of spinlock errors commonly made.  This is
-         best used in conjunction with the NMI watchdog so that spinlock
-         deadlocks are also debuggable.
-
-config DEBUG_SPINLOCK_SLEEP
-       bool "Sleep-inside-spinlock checking"
-       depends on DEBUG_KERNEL
-       help
-         If you say Y here, various routines which may sleep will become very
-         noisy if they are called with a spinlock held.
-
-config DEBUG_KOBJECT
-       bool "kobject debugging"
-       depends on DEBUG_KERNEL
-       help
-         If you say Y here, some extra kobject debugging messages will be sent
-         to the syslog. 
-
-config DEBUG_HIGHMEM
-       bool "Highmem debugging"
-       depends on DEBUG_KERNEL && HIGHMEM
-       help
-         This option enables additional error checking for high memory systems.
-         Disable for production systems.
-
-config DEBUG_BUGVERBOSE
-       bool "Verbose BUG() reporting (adds 70K)" if DEBUG_KERNEL && EMBEDDED
-       depends on BUG
-       depends on ARM || ARM26 || M32R || M68K || SPARC32 || SPARC64 || X86_32 || FRV
-       default !EMBEDDED
-       help
-         Say Y here to make BUG() panics output the file name and line number
-         of the BUG call as well as the EIP and oops trace.  This aids
-         debugging but costs about 70-100K of memory.
-
-config DEBUG_INFO
-       bool "Compile the kernel with debug info"
-       depends on DEBUG_KERNEL && !X86_64_XEN
-       help
-         If you say Y here, the resulting kernel image will include
-         debugging info, resulting in a larger kernel image.
-         Say Y here only if you plan to debug the kernel.
-
-         If unsure, say N.
-
-config DEBUG_IOREMAP
-       bool "Enable ioremap() debugging"
-       depends on DEBUG_KERNEL && PARISC
-       help
-         Enabling this option will cause the kernel to distinguish between
-         ioremapped and physical addresses.  It will print a backtrace (at
-         most one every 10 seconds), hopefully allowing you to see which
-         drivers need work.  Fixing all these problems is a prerequisite
-         for turning on USE_HPPA_IOREMAP.  The warnings are harmless;
-         the kernel has enough information to fix the broken drivers
-         automatically, but we'd like to make it more efficient by not
-         having to do that.
-
-config DEBUG_FS
-       bool "Debug Filesystem"
-       depends on DEBUG_KERNEL && SYSFS
-       help
-         debugfs is a virtual file system that kernel developers use to put
-         debugging files into.  Enable this option to be able to read and
-         write to these files.
-
-         If unsure, say N.
-
-config DEBUG_VM
-       bool "Debug VM"
-       depends on DEBUG_KERNEL
-       help
-         Enable this to turn on extended checks in the virtual-memory system
-          that may impact performance.
-
-         If unsure, say N.
-
-config FRAME_POINTER
-       bool "Compile the kernel with frame pointers"
-       depends on DEBUG_KERNEL && (X86 || CRIS || M68K || M68KNOMMU || FRV || UML)
-       default y if DEBUG_INFO && UML
-       help
-         If you say Y here the resulting kernel image will be slightly larger
-         and slower, but it might give very useful debugging information on
-         some architectures or if you use external debuggers.
-         If you don't debug the kernel, you can say N.
-
-config FORCED_INLINING
-       bool "Force gcc to inline functions marked 'inline'"
-       depends on DEBUG_KERNEL
-       default y
-       help
-         This option determines if the kernel forces gcc to inline the
-         functions developers have marked 'inline'. Doing so takes away
-         freedom from gcc to do what it thinks is best, which is desirable
-         for the gcc 3.x series of compilers. The gcc 4.x series have a
-         rewritten inlining algorithm, and disabling this option will
-         generate a smaller kernel there. Hopefully this algorithm is so
-         good that allowing gcc4 to make the decision can become the
-         default in the future; until then, this option is there to test
-         gcc for this.
-
-config RCU_TORTURE_TEST
-       tristate "torture tests for RCU"
-       depends on DEBUG_KERNEL
-       default n
-       help
-         This option provides a kernel module that runs torture tests
-         on the RCU infrastructure.  The kernel module may be built
-         after the fact on the running kernel to be tested, if desired.
-
-         Say Y here if you want RCU torture tests to start automatically
-         at boot time (you probably don't).
-         Say M if you want the RCU torture tests to build as a module.
-         Say N if you are unsure.

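With the sparse-tree override above deleted, the debug options presumably
resolve through the stock kernel's lib/Kconfig.debug, and the defconfig
updates at the top of this changeset select them the usual way. An
illustrative .config fragment (values are examples, not recommendations):

CONFIG_DEBUG_KERNEL=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_DEBUG_INFO is not set
CONFIG_FRAME_POINTER=y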
_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-changelog