[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Experimental PCI Emulation Layer for XEN x86(32)



Hello!

   I have explored the idea of running QEMU's PCI emulation layer in XEN's 
(x86-32) exception handler a bit further and the results are very 
promising.
The benefits of doing this are the following:

- Compiling PCI code into a user domain works just fine and the PCI layer 
finds an empty PCI bus; the bootup messages (dmesg) can be seen under (*) 
as well as the devices that the sysfs shows
  - you need to set CONFIG_XEN_PRIVILEGED_GUEST=y and 
CONFIG_XEN_PHYSDEV_ACCESS=y to be queried about PCI when compiling a user 
domain kernel

-  The best: Running a domain 0 kernel as a user domain kernel also works 
! The bootup messages (dmesg) can be seen under (**) below
   - With some further exploration this could be leading to a unified 
build process and kernel for domain 0 and user domains which I think would 
be 'good'.


In the future I would look at this a bit further and try to do the 
following:
- hide PCI devices in domain 0 by passing a command line argument in the 
grub.conf when booting a domain
- learn as much about the hidden PCI device as possible by probing the PCI 
bus or reading it from exising device structures
- making the information about the PCI device available to  XEN through a 
hypercall and programm the  PCI emulation layer  of a driver domain to 
mimic this device's existence
- punch holes into the privilege bitmap of the driver domain to access the 
IO ports of that device and somehow map memory so the user domain can talk 
to this device (need help on this)

About the emulation layer:
- I recycled QEMU's implementation and shrunk it to fit into XEN; it's 
actually quite small also when compiled

        -rw-r--r--  1 root root 8756 Aug 19 12:05 qemu_pci.o
        -rw-r--r--  1 root root 2876 Aug 19 12:05 qemu_vl.o

- The plugability of QEMU's emulation layer has been preserved to register 
'mimic'ed devices' or maybe even the ones that QEMU has :-)
- There's a nob to turn: MAX_IOPORTS; If that is set to 65536 (for all 
inb's and outb's) then XEN will need to allocate a huge amount of memory; 
turning it down to 4096 is good enough to emulate all ports <=4095; this 
part needs some work -> a hash table which hashes by ioport number instead 
of a plain array


The patch is attached and inlined. It works fine against Wendesday's tree.

Please let me know about comments or ideas!



-------------------------
(*) dmesg of user domain kernel with PCI bus 


Linux version 2.6.12-xenU (root@xxxxxxxxxxxxxxxxxxxxx) (gcc version 4.0.1 
20050727 (Red Hat 4.0.1-5)) #6 SMP Thu Aug 18 17:55:21 EDT 2005
BIOS-provided physical RAM map:
 Xen: 0000000000000000 - 0000000002000000 (usable)
0MB HIGHMEM available.
32MB LOWMEM available.
DMI not present.
IRQ lockup detection disabled
Allocating PCI resources starting at 02000000 (gap: 02000000:fe000000)
Built 1 zonelists
Kernel command line:  root=/dev/ram0 console=tty ro ima=1
Initializing CPU#0
PID hash table entries: 256 (order: 8, 4096 bytes)
Xen reported: 2799.900 MHz processor.
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
vmalloc area: c2800000-fb7fe000, maxmem 34000000
Memory: 17644k/32768k available (1974k kernel code, 15060k reserved, 558k 
data, 168k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... 
Ok.
Mount-cache hash table entries: 512
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... disabled
CPU0: Intel(R) Xeon(TM) CPU 2.80GHz stepping 07
SMP motherboard not detected.
Brought up 1 CPUs
checking if image is initramfs... it is
Freeing initrd memory: 11440k freed
NET: Registered protocol family 16
store_evtchn = 2
XENBUS xs_read_watch: 0
PCI: Using configuration type 1
XENBUS xs_read_watch: 0
XENBUS xs_read_watch: 0
xen_mem: Initialising balloon driver.
XENBUS xs_read_watch: 0
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
Grant table initialized
Initializing Cryptographic API
PCI: PIIX3: Enabling Passive Release on 0000:00:01.0
Limiting direct PCI/PCI transfers.
Activating ISA DMA hang workarounds.
i8042.c: No controller found.
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Xen virtual console successfully installed as tty1
Event-channel device installed.
Blkif frontend is using grant tables.
xen_blk: Initialising virtual block device driver
xen_blk: Timeout connecting to device!
Netdev frontend (TX) is using grant tables.
Netdev frontend (RX) is using grant tables.
xen_net: Initialising virtual ethernet driver.
??~?mice: PS/2 mouse device common for all mice
NET: Registered protocol family 2
IP: routing cache hash table of 256 buckets, 4Kbytes
TCP established hash table entries: 2048 (order: 3, 32768 bytes)
TCP bind hash table entries: 2048 (order: 2, 24576 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
NET: Registered protocol family 1
NET: Registered protocol family 17
Freeing unused kernel memory: 168k freed
init started:  BusyBox v1.00 (2005.02.09-15:13+0000) multi-call binary

[...]

# cd /sys/devices/
# find
.
./pci0000:00
./pci0000:00/0000:00:01.0
./pci0000:00/0000:00:01.0/config
./pci0000:00/0000:00:01.0/bus
./pci0000:00/0000:00:01.0/modalias
./pci0000:00/0000:00:01.0/local_cpus
./pci0000:00/0000:00:01.0/irq
./pci0000:00/0000:00:01.0/class
./pci0000:00/0000:00:01.0/subsystem_device
./pci0000:00/0000:00:01.0/subsystem_vendor
./pci0000:00/0000:00:01.0/device
./pci0000:00/0000:00:01.0/vendor
./pci0000:00/0000:00:01.0/resource
./pci0000:00/0000:00:00.0
./pci0000:00/0000:00:00.0/config
./pci0000:00/0000:00:00.0/bus
./pci0000:00/0000:00:00.0/modalias
./pci0000:00/0000:00:00.0/local_cpus
./pci0000:00/0000:00:00.0/irq
./pci0000:00/0000:00:00.0/class
./pci0000:00/0000:00:00.0/subsystem_device
./pci0000:00/0000:00:00.0/subsystem_vendor
./pci0000:00/0000:00:00.0/device
./pci0000:00/0000:00:00.0/vendor
./pci0000:00/0000:00:00.0/resource
./vif-0
./vif-0/bus
./system
./system/ioapic
./system/timer
./system/timer/timer0
./system/cpu
./system/cpu/cpu0
./system/cpu/cpu0/online
./platform




-------------------------
(**) dmesg of domain 0 kernel booting up as a user domain kernel

Linux version 2.6.12-xen0 (root@xxxxxxxxxxxxxxxxxxxxx) (gcc version 4.0.1 
20050727 (
Red Hat 4.0.1-5)) #6 Thu Aug 18 17:54:40 EDT 2005
BIOS-provided physical RAM map:
 Xen: 0000000000000000 - 0000000002000000 (usable)
0MB HIGHMEM available.
32MB LOWMEM available.
On node 0 totalpages: 8192
  DMA zone: 8192 pages, LIFO batch:3
  Normal zone: 0 pages, LIFO batch:1
  HighMem zone: 0 pages, LIFO batch:1
DMI not present.
ACPI in unprivileged domain disabled
IRQ lockup detection disabled
Allocating PCI resources starting at 02000000 (gap: 02000000:fe000000)
Built 1 zonelists
Kernel command line:  root=/dev/ram0 console=tty ro ima=1
Initializing CPU#0
PID hash table entries: 256 (order: 8, 4096 bytes)
Xen reported: 2799.892 MHz processor.
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
vmalloc area: c2800000-fb7fe000, maxmem 34000000
Memory: 15748k/32768k available (3438k kernel code, 16956k reserved, 1016k 
data, 352
k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... 
Ok.
Calibrating delay loop... 5596.77 BogoMIPS (lpj=27983872)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 
00004400 0000
0000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 
00004400 00000
000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: After all inits, caps: bfebc3f1 00000000 00000000 00000080 00004400 
00000000 00
000000
CPU: Intel(R) Xeon(TM) CPU 2.80GHz stepping 07
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... disabled
checking if image is initramfs... it is
Freeing initrd memory: 11440k freed
NET: Registered protocol family 16
store_evtchn = 2
XENBUS xs_read_watch: 0
PCI: Using configuration type 1
XENBUS xs_read_watch: 0
ACPI: Subsystem revision 20050309
ACPI: Interpreter disabled.
xen_mem: Initialising balloon driver.
XENBUS xs_read_watch: 0
SCSI subsystem initialized
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a 
report
Grant table initialized
IA-32 Microcode Update Driver: v1.14-xen <tigran@xxxxxxxxxxx>
Initializing Cryptographic API
i8042.c: No controller found.
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
Floppy drive(s): fd0 is unknown type 15 (usb?), fd1 is unknown type 15 
(usb?)
Failed to obtain physical IRQ 6
floppy0: no floppy controllers found
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
loop: loaded (max 8 devices)
HP CISS Driver (v 2.6.6)
Intel(R) PRO/1000 Network Driver - version 6.0.54-k2
Copyright (c) 1999-2004 Intel Corporation.
pcnet32.c:v1.30j 29.04.2005 tsbogend@xxxxxxxxxxxxxxxx
e100: Intel(R) PRO/100 Network Driver, 3.4.8-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@xxxxxxxxxxxx>
Xen virtual console successfully installed as tty1
Event-channel device installed.
Blkif frontend is using grant tables.
xen_blk: Initialising virtual block device driver
xen_blk: Could not probe disks (0)
xen_blk: failed to probe
Netdev frontend (TX) is using grant tables.
Netdev frontend (RX) is using grant tables.
xen_net: Initialising virtual ethernet driver.
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 50MHz system bus speed for PIO modes; override with 
idebus=xx
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide2...
Probing IDE interface ide3...
Probing IDE interface ide4...
Probing IDE interface ide5...
Red Hat/Adaptec aacraid driver (1.1.2-lk2 Aug 17 2005)
3ware Storage Controller device driver for Linux v1.26.02.001.
libata version 1.11 loaded.
Fusion MPT base driver 3.01.20
Copyright (c) 1999-2004 LSI Logic Corporation
Fusion MPT SCSI Host driver 3.01.20
usbmon: debugs is not available
ohci_hcd: 2004 Nov 08 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
USB Universal Host Controller Interface driver v2.2
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.01:USB HID core driver
mice: PS/2 mouse device common for all mice
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: automatically using best checksumming function: pIII_sse
   pIII_sse  :  1306.800 MB/sec
raid5: using function: pIII_sse (1306.800 MB/sec)
md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel@xxxxxxxxxx
NET: Registered protocol family 2
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP established hash table entries: 2048 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
NET: Registered protocol family 1
NET: Registered protocol family 17
Bridge firewalling registered
Freeing unused kernel memory: 352k freed

----------------------------------

The patch:

Signed-off-by: Stefan Berger <stefanb@xxxxxxxxxx>


Index: root/xen-unstable.hg/xen/arch/x86/qemu_pci.c
===================================================================
--- /dev/null
+++ root/xen-unstable.hg/xen/arch/x86/qemu_pci.c
@@ -0,0 +1,807 @@
+/*
+ * QEMU PCI bus manager
+ *
+ * Copyright (c) 2004 Fabrice Bellard
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining 
a copy
+ * of this software and associated documentation files (the "Software"), 
to deal
+ * in the Software without restriction, including without limitation the 
rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be 
included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
IN
+ * THE SOFTWARE.
+ */
+/*
+ * Adapted to fit XEN by Stefan Berger, stefanb@xxxxxxxxxx
+ */
+
+//#define DEBUG_PCI
+
+#define PCI_VENDOR_ID          0x00    /* 16 bits */
+#define PCI_DEVICE_ID          0x02    /* 16 bits */
+#define PCI_COMMAND            0x04    /* 16 bits */
+#define  PCI_COMMAND_IO                0x1     /* Enable response in I/O 
space */
+#define  PCI_COMMAND_MEMORY    0x2     /* Enable response in Memory space 
*/
+#define PCI_CLASS_DEVICE        0x0a    /* Device class */
+#define PCI_INTERRUPT_LINE     0x3c    /* 8 bits */
+#define PCI_INTERRUPT_PIN      0x3d    /* 8 bits */
+#define PCI_MIN_GNT            0x3e    /* 8 bits */
+#define PCI_MAX_LAT            0x3f    /* 8 bits */
+
+
+#include <xen/qemu_vl.h>
+#include <xen/lib.h>
+#include "qemu_makeup.h"
+
+
+static PCIBase *create_pci_base(void)
+{ 
+    PCIBase *base = qemu_mallocz(sizeof(PCIBase));
+ 
+    if (base) {
+        init_ioports(base);
+    }
+    return base;
+}
+
+static PCIBus *pci_register_bus(PCIBase *base)
+{
+    PCIBus *bus;
+    bus = qemu_mallocz(sizeof(PCIBus));
+    if (bus) {
+        bus->base = base;
+    }
+    return bus;
+}
+
+static void pci_bus_free(PCIBus *bus)
+{
+    xfree(bus);
+}
+
+static void pci_device_free(PCIDevice *dev)
+{
+    xfree(dev);
+}
+
+void pci_base_free(PCIBase *base)
+{
+    PCIBus *bus = base->bus;
+    if (bus) {
+        int devfn;
+        for(devfn = 0; devfn < 256; devfn++) {
+            PCIDevice *d;
+            d = bus->devices[devfn];
+            if (d)
+                pci_device_free(d);
+        }
+        pci_bus_free(bus);
+    }
+ 
+    xfree(base);
+}
+
+/* -1 for devfn means auto assign */
+PCIDevice *pci_register_device(PCIBus *bus, const char *name, 
+                               int instance_size, int devfn,
+                               PCIConfigReadFunc *config_read, 
+                               PCIConfigWriteFunc *config_write)
+{
+    PCIDevice *pci_dev;
+
+ 
+    if (devfn < 0) {
+        for(devfn = bus->devfn_min ; devfn < 256; devfn += 8) {
+            if (!bus->devices[devfn])
+                goto found;
+        }
+        return NULL;
+    found: ;
+    }
+    pci_dev = qemu_mallocz(instance_size);
+    if (!pci_dev)
+        return NULL;
+    pci_dev->bus = bus;
+    pci_dev->devfn = devfn;
+    pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
+
+    if (!config_read)
+        config_read = pci_default_read_config;
+    if (!config_write)
+        config_write = pci_default_write_config;
+    pci_dev->config_read = config_read;
+    pci_dev->config_write = config_write;
+    bus->devices[devfn] = pci_dev;
+    return pci_dev;
+}
+
+void pci_register_io_region(PCIDevice *pci_dev, int region_num, 
+                            uint32_t size, int type, 
+                            PCIMapIORegionFunc *map_func)
+{
+    PCIIORegion *r;
+
+    if ((unsigned int)region_num >= PCI_NUM_REGIONS)
+        return;
+    r = &pci_dev->io_regions[region_num];
+    r->addr = -1;
+    r->size = size;
+    r->type = type;
+    r->map_func = map_func;
+}
+
+static void pci_addr_writel(PCIBase *base, uint32_t addr, uint32_t val)
+{
+    PCIBus *s = base->bus;
+    s->config_reg = val;
+}
+
+static uint32_t pci_addr_readl(PCIBase *base, uint32_t addr)
+{
+    PCIBus *s = base->bus;
+    return s->config_reg;
+}
+
+static void pci_update_mappings(PCIDevice *d)
+{
+    PCIIORegion *r;
+    int cmd, i;
+    uint32_t last_addr, new_addr, config_ofs;
+ 
+    cmd = le16_to_cpu(*(uint16_t *)(d->config + PCI_COMMAND));
+    for(i = 0; i < PCI_NUM_REGIONS; i++) {
+        r = &d->io_regions[i];
+        if (i == PCI_ROM_SLOT) {
+            config_ofs = 0x30;
+        } else {
+            config_ofs = 0x10 + i * 4;
+        }
+        if (r->size != 0) {
+            if (r->type & PCI_ADDRESS_SPACE_IO) {
+                if (cmd & PCI_COMMAND_IO) {
+                    new_addr = le32_to_cpu(*(uint32_t *)(d->config + 
+                                                         config_ofs));
+                    new_addr = new_addr & ~(r->size - 1);
+                    last_addr = new_addr + r->size - 1;
+                    /* NOTE: we have only 64K ioports on PC */
+                    if (last_addr <= new_addr || new_addr == 0 ||
+                        last_addr >= 0x10000) {
+                        new_addr = -1;
+                    }
+                } else {
+                    new_addr = -1;
+                }
+            } else {
+                if (cmd & PCI_COMMAND_MEMORY) {
+                    new_addr = le32_to_cpu(*(uint32_t *)(d->config + 
+                                                         config_ofs));
+                    /* the ROM slot has a specific enable bit */
+                    if (i == PCI_ROM_SLOT && !(new_addr & 1))
+                        goto no_mem_map;
+                    new_addr = new_addr & ~(r->size - 1);
+                    last_addr = new_addr + r->size - 1;
+                    /* NOTE: we do not support wrapping */
+                    /* XXX: as we cannot support really dynamic
+                       mappings, we handle specific values as invalid
+                       mappings. */
+                    if (last_addr <= new_addr || new_addr == 0 ||
+                        last_addr == -1) {
+                        new_addr = -1;
+                    }
+                } else {
+                no_mem_map:
+                    new_addr = -1;
+                }
+            }
+            /* now do the real mapping */
+            if (new_addr != r->addr) {
+                if (r->addr != -1) {
+                    if (r->type & PCI_ADDRESS_SPACE_IO) {
+                        int class;
+                        /* NOTE: specific hack for IDE in PC case:
+                           only one byte must be mapped. */
+                        class = d->config[0x0a] | (d->config[0x0b] << 8);
+                        if (class == 0x0101 && r->size == 4) {
+                           isa_unassign_ioport(d->bus->base, r->addr + 2, 
1);
+                        } else {
+                           isa_unassign_ioport(d->bus->base, r->addr, 
r->size);
+                        }
+                    } else {
+// !!! This should not have been disabled - maybe
+#if 0
+                        cpu_register_physical_memory(r->addr + 
pci_mem_base, 
+                                                     r->size, 
+                                                     IO_MEM_UNASSIGNED);
+#endif
+                    }
+                }
+                r->addr = new_addr;
+                if (r->addr != -1) {
+                    r->map_func(d, i, r->addr, r->size, r->type);
+                }
+            }
+        }
+    }
+}
+
+uint32_t pci_default_read_config(PCIDevice *d, 
+                                 uint32_t address, int len)
+{
+    uint32_t val;
+    switch(len) {
+    case 1:
+        val = d->config[address];
+        break;
+    case 2:
+        val = le16_to_cpu(*(uint16_t *)(d->config + address));
+        break;
+    default:
+    case 4:
+        val = le32_to_cpu(*(uint32_t *)(d->config + address));
+        break;
+    }
+    return val;
+}
+
+void pci_default_write_config(PCIDevice *d, 
+                              uint32_t address, uint32_t val, int len)
+{
+    int can_write, i;
+    uint32_t end, addr;
+
+    if (len == 4 && ((address >= 0x10 && address < 0x10 + 4 * 6) || 
+                     (address >= 0x30 && address < 0x34))) {
+        PCIIORegion *r;
+        int reg;
+
+        if ( address >= 0x30 ) {
+            reg = PCI_ROM_SLOT;
+        }else{
+            reg = (address - 0x10) >> 2;
+        }
+        r = &d->io_regions[reg];
+        if (r->size == 0)
+            goto default_config;
+        /* compute the stored value */
+        if (reg == PCI_ROM_SLOT) {
+            /* keep ROM enable bit */
+            val &= (~(r->size - 1)) | 1;
+        } else {
+            val &= ~(r->size - 1);
+            val |= r->type;
+        }
+        *(uint32_t *)(d->config + address) = cpu_to_le32(val);
+        pci_update_mappings(d);
+        return;
+    }
+ default_config:
+    /* not efficient, but simple */
+    addr = address;
+    for(i = 0; i < len; i++) {
+        /* default read/write accesses */
+        switch(d->config[0x0e]) {
+        case 0x00:
+        case 0x80:
+            switch(addr) {
+            case 0x00:
+            case 0x01:
+            case 0x02:
+            case 0x03:
+            case 0x08:
+            case 0x09:
+            case 0x0a:
+            case 0x0b:
+            case 0x0e:
+            case 0x10 ... 0x27: /* base */
+            case 0x30 ... 0x33: /* rom */
+            case 0x3d:
+                can_write = 0;
+                break;
+            default:
+                can_write = 1;
+                break;
+            }
+            break;
+        default:
+        case 0x01:
+            switch(addr) {
+            case 0x00:
+            case 0x01:
+            case 0x02:
+            case 0x03:
+            case 0x08:
+            case 0x09:
+            case 0x0a:
+            case 0x0b:
+            case 0x0e:
+            case 0x38 ... 0x3b: /* rom */
+            case 0x3d:
+                can_write = 0;
+                break;
+            default:
+                can_write = 1;
+                break;
+            }
+            break;
+        }
+        if (can_write) {
+            d->config[addr] = val;
+        }
+        addr++;
+        val >>= 8;
+    }
+
+    end = address + len;
+    if (end > PCI_COMMAND && address < (PCI_COMMAND + 2)) {
+        /* if the command register is modified, we must modify the 
mappings */
+        pci_update_mappings(d);
+    }
+}
+
+static void pci_data_write(PCIBus *opaque, uint32_t addr, 
+                           uint32_t val, int len)
+{
+    PCIBus *s = opaque;
+    PCIDevice *pci_dev;
+    int config_addr, bus_num;
+ 
+#if defined(DEBUG_PCI) && 0
+    printf("pci_data_write: addr=%08x val=%08x len=%d\n",
+           s->config_reg, val, len);
+#endif
+    if (!(s->config_reg & (1 << 31))) {
+        return;
+    }
+    if ((s->config_reg & 0x3) != 0) {
+        return;
+    }
+    bus_num = (s->config_reg >> 16) & 0xff;
+    if (bus_num != 0)
+        return;
+    pci_dev = s->devices[(s->config_reg >> 8) & 0xff];
+    if (!pci_dev)
+        return;
+    config_addr = (s->config_reg & 0xfc) | (addr & 3);
+#if defined(DEBUG_PCI)
+    printf("pci_config_write: %s: addr=%02x val=%08x len=%d\n",
+           pci_dev->name, config_addr, val, len);
+#endif
+    pci_dev->config_write(pci_dev, config_addr, val, len);
+}
+
+static uint32_t pci_data_read(PCIBus *opaque, uint32_t addr, 
+                              int len)
+{
+    PCIBus *s = opaque;
+    PCIDevice *pci_dev;
+    int config_addr, bus_num;
+    uint32_t val;
+
+    if (!(s->config_reg & (1 << 31)))
+        goto fail;
+    if ((s->config_reg & 0x3) != 0)
+        goto fail;
+    bus_num = (s->config_reg >> 16) & 0xff;
+    if (bus_num != 0)
+        goto fail;
+    pci_dev = s->devices[(s->config_reg >> 8) & 0xff];
+    if (!pci_dev) {
+    fail:
+        switch(len) {
+        case 1:
+            val = 0xff;
+            break;
+        case 2:
+            val = 0xffff;
+            break;
+        default:
+        case 4:
+            val = 0xffffffff;
+            break;
+        }
+        goto the_end;
+    }
+    config_addr = (s->config_reg & 0xfc) | (addr & 3);
+    val = pci_dev->config_read(pci_dev, config_addr, len);
+#if defined(DEBUG_PCI)
+    printf("pci_config_read: %s: addr=%02x val=%08x len=%d\n",
+           pci_dev->name, config_addr, val, len);
+#endif
+ the_end:
+#if defined(DEBUG_PCI) && 0
+    printf("pci_data_read: addr=%08x val=%08x len=%d\n",
+           s->config_reg, val, len);
+#endif
+    return val;
+}
+
+static void pci_data_writeb(PCIBase *base, uint32_t addr, uint32_t val)
+{
+    pci_data_write(base->bus, addr, val, 1);
+}
+
+static void pci_data_writew(PCIBase *base, uint32_t addr, uint32_t val)
+{
+    pci_data_write(base->bus, addr, val, 2);
+}
+
+static void pci_data_writel(PCIBase *base, uint32_t addr, uint32_t val)
+{
+    pci_data_write(base->bus, addr, val, 4);
+}
+
+static uint32_t pci_data_readb(PCIBase *base, uint32_t addr)
+{
+    return pci_data_read(base->bus, addr, 1);
+}
+
+static uint32_t pci_data_readw(PCIBase *base, uint32_t addr)
+{
+    return pci_data_read(base->bus, addr, 2);
+}
+
+static uint32_t pci_data_readl(PCIBase *base, uint32_t addr)
+{
+    return pci_data_read(base->bus, addr, 4);
+}
+
+/* i440FX PCI bridge */
+
+PCIBase *i440fx_init()
+{
+    PCIBus *s;
+    PCIDevice *d;
+    PCIBase *base;
+
+    base = create_pci_base();
+    if (!base) return NULL; 
+
+    base->bus = s = pci_register_bus(base);
+    if (!s) {
+        xfree(base);
+        return NULL;
+    }
+
+    register_ioport_write(0xcf8, 4, 4, pci_addr_writel, s, base);
+    register_ioport_read(0xcf8, 4, 4, pci_addr_readl, s, base);
+
+    register_ioport_write(0xcfc, 4, 1, pci_data_writeb, s, base);
+    register_ioport_write(0xcfc, 4, 2, pci_data_writew, s, base);
+    register_ioport_write(0xcfc, 4, 4, pci_data_writel, s, base);
+    register_ioport_read(0xcfc, 4, 1, pci_data_readb, s, base);
+    register_ioport_read(0xcfc, 4, 2, pci_data_readw, s, base);
+    register_ioport_read(0xcfc, 4, 4, pci_data_readl, s, base);
+
+    d = pci_register_device(s, "i440FX", sizeof(PCIDevice), 0, 
+                            NULL, NULL);
+
+    d->config[0x00] = 0x86; // vendor_id
+    d->config[0x01] = 0x80;
+    d->config[0x02] = 0x37; // device_id
+    d->config[0x03] = 0x12;
+    d->config[0x08] = 0x02; // revision
+    d->config[0x0a] = 0x00; // class_sub = host2pci
+    d->config[0x0b] = 0x06; // class_base = PCI_bridge
+    d->config[0x0e] = 0x00; // header_type
+    return base;
+}
+
+/* PIIX3 PCI to ISA bridge */
+
+typedef struct PIIX3State {
+    PCIDevice dev;
+} PIIX3State;
+
+PIIX3State *piix3_state;
+
+/* return the global irq number corresponding to a given device irq
+   pin. We could also use the bus number to have a more precise
+   mapping. */
+static inline int pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
+{
+    int slot_addend;
+    slot_addend = (pci_dev->devfn >> 3) - 1;
+    return (irq_num + slot_addend) & 3;
+}
+
+static void piix3_reset(PIIX3State *d)
+{
+    uint8_t *pci_conf = d->dev.config;
+
+    pci_conf[0x04] = 0x07; // master, memory and I/O
+    pci_conf[0x05] = 0x00;
+    pci_conf[0x06] = 0x00;
+    pci_conf[0x07] = 0x02; // PCI_status_devsel_medium
+    pci_conf[0x4c] = 0x4d;
+    pci_conf[0x4e] = 0x03;
+    pci_conf[0x4f] = 0x00;
+    pci_conf[0x60] = 0x80;
+    pci_conf[0x69] = 0x02;
+    pci_conf[0x70] = 0x80;
+    pci_conf[0x76] = 0x0c;
+    pci_conf[0x77] = 0x0c;
+    pci_conf[0x78] = 0x02;
+    pci_conf[0x79] = 0x00;
+    pci_conf[0x80] = 0x00;
+    pci_conf[0x82] = 0x00;
+    pci_conf[0xa0] = 0x08;
+    pci_conf[0xa0] = 0x08;
+    pci_conf[0xa2] = 0x00;
+    pci_conf[0xa3] = 0x00;
+    pci_conf[0xa4] = 0x00;
+    pci_conf[0xa5] = 0x00;
+    pci_conf[0xa6] = 0x00;
+    pci_conf[0xa7] = 0x00;
+    pci_conf[0xa8] = 0x0f;
+    pci_conf[0xaa] = 0x00;
+    pci_conf[0xab] = 0x00;
+    pci_conf[0xac] = 0x00;
+    pci_conf[0xae] = 0x00;
+}
+
+void piix3_init(PCIBus *bus)
+{
+    PIIX3State *d;
+    uint8_t *pci_conf;
+
+    d = (PIIX3State *)pci_register_device(bus, "PIIX3", 
sizeof(PIIX3State),
+                                          -1, NULL, NULL);
+#if 0
+    register_savevm("PIIX3", 0, 1, generic_pci_save, generic_pci_load, 
d);
+#endif
+
+    piix3_state = d;
+    pci_conf = d->dev.config;
+
+    pci_conf[0x00] = 0x86; // Intel
+    pci_conf[0x01] = 0x80;
+    pci_conf[0x02] = 0x00; // 82371SB PIIX3 PCI-to-ISA bridge (Step A1)
+    pci_conf[0x03] = 0x70;
+    pci_conf[0x0a] = 0x01; // class_sub = PCI_ISA
+    pci_conf[0x0b] = 0x06; // class_base = PCI_bridge
+    pci_conf[0x0e] = 0x80; // header_type = PCI_multifunction, generic
+
+    piix3_reset(d);
+}
+
+
+
+
+/***********************************************************/
+/* XXX: the following should be moved to the PC BIOS */
+
+static __attribute__((unused)) uint32_t isa_inb(PCIBus *bus,uint32_t 
addr)
+{
+    return cpu_inb(bus->base, addr);
+}
+
+static void isa_outb(PCIBus *bus, uint32_t val, uint32_t addr)
+{
+    cpu_outb(bus->base, addr, val);
+}
+
+static __attribute__((unused)) uint32_t isa_inw(PCIBus *bus,uint32_t 
addr)
+{
+    return cpu_inw(bus->base, addr);
+}
+
+static __attribute__((unused)) void isa_outw(PCIBus *bus,uint32_t val, 
uint32_t addr)
+{
+    cpu_outw(bus->base, addr, val);
+}
+
+static __attribute__((unused)) uint32_t isa_inl(PCIBus *bus,uint32_t 
addr)
+{
+    return cpu_inl(bus->base, addr);
+}
+
+static __attribute__((unused)) void isa_outl(PCIBus *bus,uint32_t val, 
uint32_t addr)
+{
+    cpu_outl(bus->base, addr, val);
+}
+
+static void pci_config_writel(PCIDevice *d, uint32_t addr, uint32_t val)
+{
+    PCIBus *s = d->bus;
+    s->config_reg = 0x80000000 | (s->bus_num << 16) | 
+        (d->devfn << 8) | addr;
+    pci_data_write(s, 0, val, 4);
+}
+
+static void pci_config_writew(PCIDevice *d, uint32_t addr, uint32_t val)
+{
+    PCIBus *s = d->bus;
+    s->config_reg = 0x80000000 | (s->bus_num << 16) | 
+        (d->devfn << 8) | (addr & ~3);
+    pci_data_write(s, addr & 3, val, 2);
+}
+
+
+static void pci_config_writeb(PCIDevice *d, uint32_t addr, uint32_t val)
+{
+    PCIBus *s = d->bus;
+    s->config_reg = 0x80000000 | (s->bus_num << 16) | 
+        (d->devfn << 8) | (addr & ~3);
+    pci_data_write(s, addr & 3, val, 1);
+}
+
+static __attribute__((unused)) uint32_t pci_config_readl(PCIDevice *d, 
uint32_t addr)
+{
+    PCIBus *s = d->bus;
+    s->config_reg = 0x80000000 | (s->bus_num << 16) | 
+        (d->devfn << 8) | addr;
+    return pci_data_read(s, 0, 4);
+}
+
+static uint32_t pci_config_readw(PCIDevice *d, uint32_t addr)
+{
+    PCIBus *s = d->bus;
+    s->config_reg = 0x80000000 | (s->bus_num << 16) | 
+        (d->devfn << 8) | (addr & ~3);
+    return pci_data_read(s, addr & 3, 2);
+}
+
+static uint32_t pci_config_readb(PCIDevice *d, uint32_t addr)
+{
+    PCIBus *s = d->bus;
+    s->config_reg = 0x80000000 | (s->bus_num << 16) | 
+        (d->devfn << 8) | (addr & ~3);
+    return pci_data_read(s, addr & 3, 1);
+}
+
+static uint32_t pci_bios_io_addr;
+static uint32_t pci_bios_mem_addr;
+
+/* host irqs corresponding to PCI irqs A-D */
+static uint8_t pci_irqs[4] = { 11, 9, 11, 9 };
+
+static void pci_set_io_region_addr(PCIDevice *d, int region_num, uint32_t 
addr)
+{
+    PCIIORegion *r;
+    uint16_t cmd;
+    uint32_t ofs;
+
+    if ( region_num == PCI_ROM_SLOT ) {
+        ofs = 0x30;
+    }else{
+        ofs = 0x10 + region_num * 4;
+    }
+
+    pci_config_writel(d, ofs, addr);
+    r = &d->io_regions[region_num];
+
+    /* enable memory mappings */
+    cmd = pci_config_readw(d, PCI_COMMAND);
+    if ( region_num == PCI_ROM_SLOT )
+        cmd |= 2;
+    else if (r->type & PCI_ADDRESS_SPACE_IO)
+        cmd |= 1;
+    else
+        cmd |= 2;
+    pci_config_writew(d, PCI_COMMAND, cmd);
+}
+
+static void pci_bios_init_device(PCIDevice *d)
+{
+    int class;
+    PCIIORegion *r;
+    uint32_t *paddr;
+    int i, pin, pic_irq, vendor_id, device_id;
+
+    class = pci_config_readw(d, PCI_CLASS_DEVICE);
+    vendor_id = pci_config_readw(d, PCI_VENDOR_ID);
+    device_id = pci_config_readw(d, PCI_DEVICE_ID);
+    switch(class) {
+    case 0x0101:
+        if (vendor_id == 0x8086 && device_id == 0x7010) {
+            /* PIIX3 IDE */
+            pci_config_writew(d, 0x40, 0x8000); // enable IDE0
+            pci_config_writew(d, 0x42, 0x8000); // enable IDE1
+            goto default_map;
+        } else {
+            /* IDE: we map it as in ISA mode */
+            pci_set_io_region_addr(d, 0, 0x1f0);
+            pci_set_io_region_addr(d, 1, 0x3f4);
+            pci_set_io_region_addr(d, 2, 0x170);
+            pci_set_io_region_addr(d, 3, 0x374);
+        }
+        break;
+    case 0x0300:
+        if (vendor_id != 0x1234)
+            goto default_map;
+        /* VGA: map frame buffer to default Bochs VBE address */
+        pci_set_io_region_addr(d, 0, 0xE0000000);
+        break;
+    case 0x0800:
+        /* PIC */
+        vendor_id = pci_config_readw(d, PCI_VENDOR_ID);
+        device_id = pci_config_readw(d, PCI_DEVICE_ID);
+        if (vendor_id == 0x1014) {
+            /* IBM */
+            if (device_id == 0x0046 || device_id == 0xFFFF) {
+                /* MPIC & MPIC2 */
+                pci_set_io_region_addr(d, 0, 0x80800000 + 0x00040000);
+            }
+        }
+        break;
+    case 0xff00:
+        if (vendor_id == 0x0106b &&
+            (device_id == 0x0017 || device_id == 0x0022)) {
+            /* macio bridge */
+            pci_set_io_region_addr(d, 0, 0x80800000);
+        }
+        break;
+    default:
+    default_map:
+        /* default memory mappings */
+        for(i = 0; i < PCI_NUM_REGIONS; i++) {
+            r = &d->io_regions[i];
+            if (r->size) {
+                if (r->type & PCI_ADDRESS_SPACE_IO)
+                    paddr = &pci_bios_io_addr;
+                else
+                    paddr = &pci_bios_mem_addr;
+                *paddr = (*paddr + r->size - 1) & ~(r->size - 1);
+                pci_set_io_region_addr(d, i, *paddr);
+                *paddr += r->size;
+            }
+        }
+        break;
+    }
+
+    /* map the interrupt */
+    pin = pci_config_readb(d, PCI_INTERRUPT_PIN);
+    if (pin != 0) {
+        pin = pci_slot_get_pirq(d, pin - 1);
+        pic_irq = pci_irqs[pin];
+        pci_config_writeb(d, PCI_INTERRUPT_LINE, pic_irq);
+    }
+
+}
+
+/*
+ * This function initializes the PCI devices as a normal PCI BIOS
+ * would do. It is provided just in case the BIOS has no support for
+ * PCI.
+ */
+void pci_bios_init(PCIBus *bus)
+{
+    int i, irq;
+    uint8_t elcr[2];
+
+    pci_bios_io_addr = 0xc000;
+    pci_bios_mem_addr = 0xf0000000;
+
+    /* activate IRQ mappings */
+    elcr[0] = 0x00;
+    elcr[1] = 0x00;
+    for(i = 0; i < 4; i++) {
+        irq = pci_irqs[i];
+        /* set to trigger level */
+        elcr[irq >> 3] |= (1 << (irq & 7));
+        /* activate irq remapping in PIIX */
+        pci_config_writeb((PCIDevice *)piix3_state, 0x60 + i, irq);
+    }
+    isa_outb(bus, elcr[0], 0x4d0);
+    isa_outb(bus, elcr[1], 0x4d1);
+
+    if (bus) {
+        int devfn;
+        for(devfn = 0; devfn < 256; devfn++) {
+            PCIDevice *d;
+            d = bus->devices[devfn];
+            if (d)
+                pci_bios_init_device(d);
+        }
+    }
+}
Index: root/xen-unstable.hg/xen/arch/x86/qemu_makeup.h
===================================================================
--- /dev/null
+++ root/xen-unstable.hg/xen/arch/x86/qemu_makeup.h
@@ -0,0 +1,25 @@
+#ifndef QEMU_MAKEUP_H
+#define QEMU_MAKEUP_H
+
+#include <xen/types.h>
+#include <xen/cache.h>
+#include <xen/xmalloc.h>
+
+static inline void *qemu_mallocz(unsigned int size) {
+        void *ptr = xmalloc_bytes(size);
+        if (ptr) {
+            memset(ptr, 0x0, size);
+        }
+        return ptr;
+}
+static inline uint16_t le16_to_cpu(uint16_t data) {
+        return data;
+}
+static inline uint32_t le32_to_cpu(uint32_t data) {
+        return data;
+}
+static inline uint32_t cpu_to_le32(uint32_t data) {
+        return data;
+}
+
+#endif
Index: root/xen-unstable.hg/xen/arch/x86/qemu_vl.c
===================================================================
--- /dev/null
+++ root/xen-unstable.hg/xen/arch/x86/qemu_vl.c
@@ -0,0 +1,262 @@
+/*
+ * QEMU System Emulator
+ * 
+ * Copyright (c) 2003-2005 Fabrice Bellard
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining 
a copy
+ * of this software and associated documentation files (the "Software"), 
to deal
+ * in the Software without restriction, including without limitation the 
rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be 
included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
IN
+ * THE SOFTWARE.
+ */
+/*
+ * Adapted to fit XEN by Stefan Berger, stefanb@xxxxxxxxxx
+ */
+#include <xen/qemu_vl.h>
+#include <xen/lib.h>
+#include "qemu_makeup.h"
+
+
+/* size is the word size in byte */
+int register_ioport_read(int start, int length, int size, 
+                         IOPortReadFunc *func, PCIBus *bus,
+                         PCIBase *base)
+{
+    int i, bsize;
+
+    if (size == 1) {
+        bsize = 0;
+    } else if (size == 2) {
+        bsize = 1;
+    } else if (size == 4) {
+        bsize = 2;
+    } else {
+        printk("register_ioport_read: invalid size");
+        return -1;
+    }
+    for(i = start; i < start + length && i < MAX_IOPORTS; i += size) {
+        base->ioport_read_table[bsize][i] = func;
+#if 0
+        if (base->ioport_opaque[i] != NULL && 
+            base->ioport_opaque[i] != base)
+            printk("register_ioport_read: invalid opaque");
+#endif
+        base->ioport_opaque[i] = base;
+    }
+    return 0;
+}
+
+/* size is the word size in byte */
+int register_ioport_write(int start, int length, int size, 
+                          IOPortWriteFunc *func, PCIBus *bus,
+                          PCIBase *base)
+{
+    int i, bsize;
+
+    if (size == 1) {
+        bsize = 0;
+    } else if (size == 2) {
+        bsize = 1;
+    } else if (size == 4) {
+        bsize = 2;
+    } else {
+        printk("register_ioport_write: invalid size");
+        return -1;
+    }
+    for(i = start; i < start + length && i < MAX_IOPORTS ; i += size) {
+        base->ioport_write_table[bsize][i] = func;
+#if 0
+        if (base->ioport_opaque[i] != NULL && 
+            base->ioport_opaque[i] != base)
+            printk("register_ioport_read: invalid opaque");
+#endif
+        base->ioport_opaque[i] = base;
+    }
+    return 0;
+}
+
+
+void cpu_outb(PCIBase *env, unsigned int addr, int val)
+{
+#ifdef DEBUG_IOPORT
+    if (loglevel & CPU_LOG_IOPORT)
+        fprintf(logfile, "outb: %04x %02x\n", addr, val);
+#endif
+    if (addr <= MAX_IOPORTS)
+        env->ioport_write_table[0][addr](env->ioport_opaque[addr], addr, 
val);
+}
+
+void cpu_outw(PCIBase *env, unsigned int addr, int val)
+{
+#ifdef DEBUG_IOPORT
+    if (loglevel & CPU_LOG_IOPORT)
+        fprintf(logfile, "outw: %04x %04x\n", addr, val);
+#endif 
+    if (addr <= MAX_IOPORTS)
+        env->ioport_write_table[1][addr](env->ioport_opaque[addr], addr, 
val);
+}
+
+void cpu_outl(PCIBase *env, unsigned int addr, int val)
+{
+#ifdef DEBUG_IOPORT
+    if (loglevel & CPU_LOG_IOPORT)
+        fprintf(logfile, "outl: %04x %08x\n", addr, val);
+#endif
+    if (addr <= MAX_IOPORTS)
+        env->ioport_write_table[2][addr](env->ioport_opaque[addr], addr, 
val);
+}
+
+int cpu_inb(PCIBase *env, unsigned int addr)
+{
+    int val;
+    if (addr > MAX_IOPORTS) {
+        return ~0;
+    }
+    val = env->ioport_read_table[0][addr](env->ioport_opaque[addr], 
addr);
+#ifdef DEBUG_IOPORT
+    if (loglevel & CPU_LOG_IOPORT)
+        fprintf(logfile, "inb : %04x %02x\n", addr, val);
+#endif
+    return val;
+}
+
+int cpu_inw(PCIBase *env, unsigned int addr)
+{
+    int val;
+    if (addr > MAX_IOPORTS) {
+        return ~0;
+    }
+    val = env->ioport_read_table[1][addr](env->ioport_opaque[addr], 
addr);
+#ifdef DEBUG_IOPORT
+    if (loglevel & CPU_LOG_IOPORT)
+        fprintf(logfile, "inw : %04x %04x\n", addr, val);
+#endif
+    return val;
+}
+
+int cpu_inl(PCIBase *env, unsigned int addr)
+{
+    int val;
+    if (addr > MAX_IOPORTS) {
+        return ~0;
+    }
+    val = env->ioport_read_table[2][addr](env->ioport_opaque[addr], 
addr);
+#ifdef DEBUG_IOPORT
+    if (loglevel & CPU_LOG_IOPORT)
+        fprintf(logfile, "inl : %04x %08x\n", addr, val);
+#endif
+    return val;
+}
+
+uint32_t default_ioport_readb(PCIBase *base, uint32_t address)
+{
+#ifdef DEBUG_UNUSED_IOPORT
+    fprintf(stderr, "inb: port=0x%04x\n", address);
+#endif
+    return 0xff;
+}
+
+void default_ioport_writeb(PCIBase *base, uint32_t address, uint32_t 
data)
+{
+#ifdef DEBUG_UNUSED_IOPORT
+    fprintf(stderr, "outb: port=0x%04x data=0x%02x\n", address, data);
+#endif
+}
+
+/* default is to make two byte accesses */
+uint32_t default_ioport_readw(PCIBase *base, uint32_t address)
+{
+    uint32_t data = ~0;
+    if (address + 1 < MAX_IOPORTS) {
+        data = 
base->ioport_read_table[0][address](base->ioport_opaque[address], 
address);
+        address += 1;
+        data |= 
base->ioport_read_table[0][address](base->ioport_opaque[address], address) 
<< 8;
+    }
+    return data;
+}
+
+void default_ioport_writew(PCIBase *base, uint32_t address, uint32_t 
data)
+{
+    if (address + 1 < MAX_IOPORTS) {
+ base->ioport_write_table[0][address](base->ioport_opaque[address], 
address, data & 0xff);
+        address += 1;
+ base->ioport_write_table[0][address](base->ioport_opaque[address], 
address, (data >> 8) & 0xff);
+    }
+}
+
+uint32_t default_ioport_readl(PCIBase *base, uint32_t address)
+{
+#ifdef DEBUG_UNUSED_IOPORT
+    fprintf(stderr, "inl: port=0x%04x\n", address);
+#endif
+    return 0xfffffffe;
+}
+
+void default_ioport_writel(PCIBase *base, uint32_t address, uint32_t 
data)
+{
+#ifdef DEBUG_UNUSED_IOPORT
+    fprintf(stderr, "outl: port=0x%04x data=0x%02x\n", address, data);
+#endif
+}
+
+void init_ioports(PCIBase *base)
+{
+    int i;
+
+    for(i = 0; i < MAX_IOPORTS; i++) {
+        base->ioport_read_table[0][i] = default_ioport_readb;
+        base->ioport_write_table[0][i] = default_ioport_writeb;
+        base->ioport_read_table[1][i] = default_ioport_readw;
+        base->ioport_write_table[1][i] = default_ioport_writew;
+        base->ioport_read_table[2][i] = default_ioport_readl;
+        base->ioport_write_table[2][i] = default_ioport_writel;
+    }
+}
+
+
+void isa_unassign_ioport(PCIBase *base, int start, int length)
+{
+    int i;
+
+    for(i = start; i < start + length && i < MAX_IOPORTS; i++) {
+        base->ioport_read_table[0][i] = default_ioport_readb;
+        base->ioport_read_table[1][i] = default_ioport_readw;
+        base->ioport_read_table[2][i] = default_ioport_readl;
+
+        base->ioport_write_table[0][i] = default_ioport_writeb;
+        base->ioport_write_table[1][i] = default_ioport_writew;
+        base->ioport_write_table[2][i] = default_ioport_writel;
+    }
+}
+
+/***********************************************************/
+
+
+void pstrcpy(char *buf, int buf_size, const char *str)
+{
+    int c;
+    char *q = buf;
+
+    if (buf_size <= 0)
+        return;
+
+    for(;;) {
+        c = *str++;
+        if (c == 0 || q >= buf + buf_size - 1)
+            break;
+        *q++ = c;
+    }
+    *q = '\0';
+}
Index: root/xen-unstable.hg/xen/include/xen/qemu_vl.h
===================================================================
--- /dev/null
+++ root/xen-unstable.hg/xen/include/xen/qemu_vl.h
@@ -0,0 +1,149 @@
+/*
+ * QEMU System Emulator header
+ * 
+ * Copyright (c) 2003 Fabrice Bellard
+ * 
+ * Permission is hereby granted, free of charge, to any person obtaining 
a copy
+ * of this software and associated documentation files (the "Software"), 
to deal
+ * in the Software without restriction, including without limitation the 
rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be 
included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
IN
+ * THE SOFTWARE.
+ */
+#ifndef VL_H
+#define VL_H
+
+#include <xen/types.h>
+
+
+typedef struct PCIBus PCIBus;
+typedef struct PCIDevice PCIDevice;
+typedef struct PCIBase PCIBase;
+
+/* vl.c */
+
+void pstrcpy(char *buf, int buf_size, const char *str);
+
+
+typedef void SetIRQFunc(void *opaque, int irq_num, int level);
+
+/* ISA bus */
+
+typedef void (IOPortWriteFunc)(PCIBase *base, uint32_t address, uint32_t 
data);
+typedef uint32_t (IOPortReadFunc)(PCIBase *base, uint32_t address);
+
+int register_ioport_read(int start, int length, int size,
+                         IOPortReadFunc *func, PCIBus *bus,
+                         PCIBase *base);
+int register_ioport_write(int start, int length, int size,
+                          IOPortWriteFunc *func, PCIBus *bus,
+                          PCIBase *base);
+
+/* PCI bus */
+
+
+typedef void PCIConfigWriteFunc(PCIDevice *pci_dev, 
+                                uint32_t address, uint32_t data, int 
len);
+typedef uint32_t PCIConfigReadFunc(PCIDevice *pci_dev, 
+                                   uint32_t address, int len);
+typedef void PCIMapIORegionFunc(PCIDevice *pci_dev, int region_num, 
+                                uint32_t addr, uint32_t size, int type);
+
+
+#define PCI_ADDRESS_SPACE_MEM          0x00
+#define PCI_ADDRESS_SPACE_IO           0x01
+#define PCI_ADDRESS_SPACE_MEM_PREFETCH 0x08
+
+
+typedef struct PCIIORegion {
+    uint32_t addr; /* current PCI mapping address. -1 means not mapped */
+    uint32_t size;
+    uint8_t type;
+    PCIMapIORegionFunc *map_func;
+} PCIIORegion;
+
+#define PCI_ROM_SLOT 6
+#define PCI_NUM_REGIONS 7
+struct PCIDevice {
+    /* PCI config space */
+    uint8_t config[256];
+
+    /* the following fields are read only */
+    PCIBus *bus;
+    int devfn;
+    char name[64];
+    PCIIORegion io_regions[PCI_NUM_REGIONS];
+ 
+    /* do not access the following fields */
+    PCIConfigReadFunc *config_read;
+    PCIConfigWriteFunc *config_write;
+};
+
+struct PCIBus {
+    int bus_num;
+    int devfn_min;
+    uint32_t config_reg; /* XXX: suppress */
+    /* low level pic */
+    SetIRQFunc *low_set_irq;
+    void *irq_opaque;
+    PCIDevice *devices[256];
+    PCIBase *base;
+};
+
+#if 0
+#define MAX_IOPORTS 65536
+#else
+#define MAX_IOPORTS 4096
+#endif
+
+struct PCIBase {
+    PCIBus *bus;
+    PCIBase *ioport_opaque[MAX_IOPORTS];
+    IOPortReadFunc  *ioport_read_table[3][MAX_IOPORTS];
+    IOPortWriteFunc *ioport_write_table[3][MAX_IOPORTS];
+};
+
+PCIDevice *pci_register_device(PCIBus *bus, const char *name,
+                               int instance_size, int devfn,
+                               PCIConfigReadFunc *config_read, 
+                               PCIConfigWriteFunc *config_write);
+
+void pci_register_io_region(PCIDevice *pci_dev, int region_num, 
+                            uint32_t size, int type, 
+                            PCIMapIORegionFunc *map_func);
+
+uint32_t pci_default_read_config(PCIDevice *d, 
+                                 uint32_t address, int len);
+void pci_default_write_config(PCIDevice *d, 
+                              uint32_t address, uint32_t val, int len);
+
+void cpu_outb(PCIBase *env, unsigned int addr, int val);
+void cpu_outw(PCIBase *env, unsigned int addr, int val);
+void cpu_outl(PCIBase *env, unsigned int addr, int val);
+int cpu_inb(PCIBase *env, unsigned int addr);
+int cpu_inw(PCIBase *env, unsigned int addr);
+int cpu_inl(PCIBase *env, unsigned int addr);
+void init_ioports(PCIBase *base);
+void isa_unassign_ioport(PCIBase *base, int start, int length);
+
+extern struct PIIX3State *piix3_state;
+
+PCIBase *i440fx_init(void);
+void piix3_init(PCIBus *bus);
+void pci_bios_init(PCIBus *bus);
+
+void pci_base_free(PCIBase *base);
+
+
+#endif /* VL_H */
Index: root/xen-unstable.hg/xen/arch/x86/traps.c
===================================================================
--- root.orig/xen-unstable.hg/xen/arch/x86/traps.c
+++ root/xen-unstable.hg/xen/arch/x86/traps.c
@@ -553,16 +553,74 @@ static inline int admin_io_okay(
     return 0;
 }
 
+
+u32 static inline _cpu_inb(u32 dummy, int addr)
+{
+    struct vcpu *v = current;
+    struct domain *d  = v->domain;
+    if (d->PCIBase) {
+        return cpu_inb(d->PCIBase, addr);
+    }
+    return ~0;
+}
+
+u32 static inline _cpu_inw(u32 dummy, int addr)
+{
+    struct vcpu *v = current;
+    struct domain *d  = v->domain;
+    if (d->PCIBase) {
+        return cpu_inw(d->PCIBase, addr);
+    }
+    return ~0;
+}
+
+u32 static inline _cpu_inl(u32 dummy, int addr)
+{
+    struct vcpu *v = current;
+    struct domain *d  = v->domain;
+    if (d->PCIBase) {
+        return cpu_inl(d->PCIBase, addr);
+    }
+    return ~0;
+}
+
+void static inline _cpu_outb(u32 dummy, int addr, u8 val)
+{
+    struct vcpu *v = current;
+    struct domain *d  = v->domain;
+    if (d->PCIBase) {
+        cpu_outb(d->PCIBase, addr, val);
+    }
+}
+
+void static inline _cpu_outw(u32 dummy, int addr, u16 val)
+{
+    struct vcpu *v = current;
+    struct domain *d  = v->domain;
+    if (d->PCIBase) {
+        cpu_outw(d->PCIBase, addr, val);
+    }
+}
+
+void static inline _cpu_outl(u32 dummy, int addr, u32 val)
+{
+    struct vcpu *v = current;
+    struct domain *d  = v->domain;
+    if (d->PCIBase) {
+        cpu_outl(d->PCIBase, addr, val);
+    }
+}
+
 /* Check admin limits. Silently fail the access if it is disallowed. */
-#define inb_user(_p, _d, _r) (admin_io_okay(_p, 1, _d, _r) ? inb(_p) : 
~0)
-#define inw_user(_p, _d, _r) (admin_io_okay(_p, 2, _d, _r) ? inw(_p) : 
~0)
-#define inl_user(_p, _d, _r) (admin_io_okay(_p, 4, _d, _r) ? inl(_p) : 
~0)
+#define inb_user(_p, _d, _r) (admin_io_okay(_p, 1, _d, _r) ? inb(_p) : 
_cpu_inb(0,_p))
+#define inw_user(_p, _d, _r) (admin_io_okay(_p, 2, _d, _r) ? inw(_p) : 
_cpu_inw(0,_p))
+#define inl_user(_p, _d, _r) (admin_io_okay(_p, 4, _d, _r) ? inl(_p) : 
_cpu_inl(0,_p))
 #define outb_user(_v, _p, _d, _r) \
-    (admin_io_okay(_p, 1, _d, _r) ? outb(_v, _p) : ((void)0))
+    (admin_io_okay(_p, 1, _d, _r) ? outb(_v, _p) : 
_cpu_outb(0,_p,_v)/*((void)0)*/)
 #define outw_user(_v, _p, _d, _r) \
-    (admin_io_okay(_p, 2, _d, _r) ? outw(_v, _p) : ((void)0))
+    (admin_io_okay(_p, 2, _d, _r) ? outw(_v, _p) : 
_cpu_outw(0,_p,_v)/*((void)0)*/)
 #define outl_user(_v, _p, _d, _r) \
-    (admin_io_okay(_p, 4, _d, _r) ? outl(_v, _p) : ((void)0))
+    (admin_io_okay(_p, 4, _d, _r) ? outl(_v, _p) : 
_cpu_outl(0,_p,_v)/*((void)0)*/)
 
 /* Propagate a fault back to the guest kernel. */
 #define USER_READ_FAULT  4 /* user mode, read fault */
Index: root/xen-unstable.hg/xen/common/domain.c
===================================================================
--- root.orig/xen-unstable.hg/xen/common/domain.c
+++ root/xen-unstable.hg/xen/common/domain.c
@@ -58,6 +58,21 @@ struct domain *do_createdomain(domid_t d
         free_domain_struct(d);
         return NULL;
     }
+
+    if (dom_id != 0 && dom_id != IDLE_DOMAIN_ID) {
+        d->PCIBase = i440fx_init();
+        if (d->PCIBase) {
+            PCIBus *bus = d->PCIBase->bus;
+            piix3_init(bus);
+            pci_bios_init(bus);
+            printk("Fake PCI BUS initialized.\n");
+        } else {
+            printk("Not enough memory for fake PCI bus?\n");
+            evtchn_destroy(d);
+            free_domain_struct(d);
+            return NULL;
+        }
+    }
 
     arch_do_createdomain(v);
 
@@ -77,6 +92,7 @@ struct domain *do_createdomain(domid_t d
         write_unlock(&domlist_lock);
     }
 
+
     return d;
 }
 
@@ -260,6 +276,10 @@ void domain_destruct(struct domain *d)
     *pd = d->next_in_hashbucket;
     write_unlock(&domlist_lock);
 
+    if (d->PCIBase) {
+        pci_base_free(d->PCIBase);
+    }
+
     evtchn_destroy(d);
     grant_table_destroy(d);
 
Index: root/xen-unstable.hg/xen/include/xen/sched.h
===================================================================
--- root.orig/xen-unstable.hg/xen/include/xen/sched.h
+++ root/xen-unstable.hg/xen/include/xen/sched.h
@@ -12,6 +12,7 @@
 #include <xen/ac_timer.h>
 #include <xen/grant_table.h>
 #include <asm/domain.h>
+#include <xen/qemu_vl.h>
 
 extern unsigned long volatile jiffies;
 extern rwlock_t domlist_lock;
@@ -138,6 +139,7 @@ struct domain
     struct arch_domain arch;
 
     void *ssid; /* sHype security subject identifier */
+    PCIBase *PCIBase;
 };
 
 struct domain_setup_info



Attachment: pci_emu_xen.diff
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.