
Re: [Xen-users] Strange failures of Xen 4.3.1, PVHVM storage VM, iSCSI and Windows+GPLPV VM combination

  • To: xen-users@xxxxxxxxxxxxx
  • From: Kuba <kuba.0000@xxxxx>
  • Date: Sat, 01 Feb 2014 20:27:23 +0100
  • Delivery-date: Sat, 01 Feb 2014 19:28:55 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>

W dniu 2014-01-31 01:53, Adam Goryachev pisze:
On 31/01/14 11:08, Kuba wrote:
W dniu 2014-01-30 23:51, Adam Goryachev pisze:
On 31/01/14 00:50, Kuba wrote:
Dear List,

I am trying to set up a following configuration:
1. very simple Linux-based dom0 (Debian 7.3) with Xen 4.3.1 compiled
from sources,
2. one storage VM (FreeBSD 10, HVM+PV) with SATA controller attached
using VT-d, exporting block devices via iSCSI to other VMs and
physical machines,
3. one Windows 7 SP1 64 VM (HVM+GPLPV) with GPU passthrough (Quadro
4000) installed on a block device exported from the storage VM (target
on the storage VM, initiator on dom0).
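Roughly, the two domU configs look like this (an illustrative sketch only, not my exact files; all names, PCI BDFs and paths are made up):

```shell
# storage-vm.cfg -- FreeBSD 10 HVM with the SATA controller passed through
builder = "hvm"
name    = "storage-vm"
memory  = 4096
vcpus   = 2
pci     = [ '03:00.0' ]                          # BDF of the SATA/AHCI controller
disk    = [ 'file:/var/xen/freebsd-root.img,hda,w' ]
vif     = [ 'bridge=xenbr0' ]

# win7.cfg -- Windows 7 HVM; its disk is the iSCSI LUN that dom0 logged
# into (the target lives inside the storage VM)
builder = "hvm"
name    = "win7"
memory  = 8192
vcpus   = 4
disk    = [ 'phy:/dev/disk/by-path/ip-10.0.0.2:3260-iscsi-iqn.2014-01.example:win7-lun-0,hda,w' ]
vif     = [ 'bridge=xenbr0' ]
gfx_passthru = 1
pci     = [ '02:00.0' ]                          # Quadro 4000
```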

Everything works perfectly (including PCI & GPU passthrough) until I
install GPLPV drivers on the Windows VM. After driver installation,
Windows needs to reboot, boots fine, displays a message that PV SCSI
drivers were installed and needs to reboot again, and then cannot
boot. Sometimes it gets stuck at "booting from harddrive" in SeaBIOS,
sometimes BSODs with "unmountable boot volume" message. All of the
following I tried without GPU passthrough to narrow down the problem.

The intriguing part is this:

1. If the storage VM's OS is Linux - it fails with the above symptoms.
2. If the block devices for the storage VM come directly from dom0
(not via pci-passthrough) - it fails.
3. If the storage VM is an HVM without PV drivers (e.g. FreeBSD
9.2-GENERIC) - it all works.
4. If the storage VM's OS is Linux with a kernel compiled without Xen
guest support - it works, but is unstable (see below).
5. If the iSCSI target is on a different physical machine - it all works.
6. If the iSCSI target is on dom0 itself - it works.
7. If I attach the AHCI controller to the Windows VM and install
directly on the hard drive - it works.
8. If the block device for the Windows VM is a disk, partition, file, LVM
volume or even a ZoL zvol (and it comes from dom0 itself, without
iSCSI) - it works.

If I install Windows and the GPLPV drivers on a hard drive attached to
dom0, Windows + GPLPV work perfectly. If I then give the same hard
drive as a block device to the storage VM and re-export it through
iSCSI, Windows usually boots fine, but is unstable. By unstable
I mean random read/write errors, programs that sometimes won't start,
ntdll.dll crashes, and after a couple of reboots Windows won't boot (just
as described above).

The configurations I would like to achieve makes sense only with PV
drivers on both storage and Windows VM. All of the "components" seem
to work perfectly until all put together, so I am not really sure
where the problem is.

I would be very grateful for any suggestions or ideas that could
possibly help to narrow down the problem. Maybe I am just doing
something wrong (I hope so). Or maybe there is a bug that shows itself
only in such a particular configuration (hope not)?

IMHO, it sounds like a resource issue... the domU providing the iSCSI,
plus the dom0, plus the domU (Windows VM) are all asking for CPU, IRQs,
etc., and one of them isn't getting enough in time. It doesn't really help,
but we use a physical iSCSI server; the dom0 then connects to the iSCSI
and provides the disks to the VMs. Maybe look at assigning specific
exclusive CPUs to dom0 and each domU, and see if you can still
reproduce the issue. Also, make absolutely sure that you don't have two
VMs accessing the same iSCSI target.
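That pinning experiment can be done at runtime with xl (domain names and core numbers below are illustrative):

```shell
# Give dom0 two dedicated cores (for a permanent setup, consider
# dom0_max_vcpus=2 dom0_vcpus_pin on the hypervisor command line).
xl vcpu-pin Domain-0 0 0
xl vcpu-pin Domain-0 1 1

# Pin the storage VM and the Windows VM to disjoint cores.
xl vcpu-pin storage-vm 0 2
xl vcpu-pin storage-vm 1 3
xl vcpu-pin win7 0 4
xl vcpu-pin win7 1 5

# Verify the resulting affinities.
xl vcpu-list
```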


Dear Adam,

Thank you for your reply. I will try assigning specific cores to the
VMs this weekend. I will also try to run all this on another machine
and try using something different from Windows for the second VM (why
didn't I think of it earlier?). When the iSCSI target is on another
physical machine, everything works like a charm, but the whole point
is to make it all work on a single machine (anything faster than 1
Gbps is way beyond my reach, while iperf reports ~22 Gbps between dom0
and the storage VM...)

Once again thank you for your suggestions, I will report back with
more results. In the meantime please clarify one thing for me - is
there something inherently wrong with what I'm trying to do?

I don't see anything "inherently" wrong, but I would question why you
want to do this. Why not let dom0 use the SCSI controller and
export the disks as physical devices into the various VMs? You are
adding a whole bunch of "ethernet" overhead to both domUs plus the
dom0, considering the storage is physically local.

The problem you have is that the VMs will not be portable to another
machine, because they are both tied to physical PCI devices, and the
block devices are not available on any other physical machine anyway.

A 1Gbps ethernet link provides roughly 100MB/s (the theoretical maximum
is 125MB/s), so simply bonding 2 x 1Gbps links can usually provide more
disk bandwidth than the disks themselves can deliver.
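On Debian, a two-port bond like that is only a few lines in /etc/network/interfaces (interface names and addresses are illustrative; the ifenslave package is required):

```shell
# /etc/network/interfaces fragment: bond eth1 + eth2 for iSCSI traffic.
auto bond0
iface bond0 inet static
    address 10.0.1.10
    netmask 255.255.255.0
    bond-slaves eth1 eth2
    bond-mode balance-rr     # round-robin; 802.3ad (LACP) needs switch support
    bond-miimon 100          # link monitoring interval in ms
```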

In my setup, the iSCSI server uses 8 x 1Gbps ethernet bonded, and the
Xen machines use 2 x 1Gbps bonded for iSCSI, plus 1 x 1Gbps which is
bridged to the domUs and used for dom0 "management". You can get a couple of
cheap-ish 1Gbps network cards easily enough, and your disk subsystem
probably won't provide more than 200MB/s anyway (we can get a max of
2.5GB/s read from the disk subsystem, but the limited bandwidth for each
dom0 helps to stop any one domU from stealing all the disk IO). In
practice, I can run 4 x dom0 and obtain over 180MB/s on each of them.


I'd like to achieve several things at once. I'm fairly new to Xen (which is an impressive piece of software), and it creates possibilities I'm only beginning to grasp.

I'm trying to build something a little bit similar to Qubes OS. I need several VMs for different tasks, some of them Windows-based, some of them using GPU passthrough, but all of them installed on ZFS-backed storage. I'd really like to have a storage VM that is separated from everything else as much as possible, even from dom0, just as if it were a separate machine. I'm fairly certain that providing storage space directly from dom0 would be faster, but that's a trade-off I'm willing to accept - it's something between dom0 and a separate physical machine. Consequently, if all goes well, I will get the following benefits:
a) everything on one physical machine (less power consumption, cheaper, satisfying performance),
b) ZFS storage for all VMs (data integrity, snapshots, rollbacks, VM cloning, etc.),
c) Windows on ZFS (an idea that started all this),
d) a storage VM separated from everything else,
e) great flexibility.
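On the FreeBSD 10 storage VM, exporting a zvol through the native ctld target would look roughly like this (pool, zvol and IQN names are made up for illustration):

```shell
# Create a zvol to back the Windows VM.
zfs create -V 60G tank/win7

# /etc/ctl.conf:
#   portal-group pg0 {
#       discovery-auth-group no-authentication
#       listen 10.0.0.2
#   }
#   target iqn.2014-01.org.example:win7 {
#       auth-group no-authentication
#       portal-group pg0
#       lun 0 {
#           path /dev/zvol/tank/win7
#       }
#   }

# Start the target (add ctld_enable="YES" to /etc/rc.conf to persist).
service ctld onestart
```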

I'm aware of the VM migration issues, but that's another trade-off I'm willing to accept. Simply put, I'm trying to achieve a lot more with the same hardware at an acceptable performance loss.

Best regards,

Xen-users mailing list


