
Re: [Xen-devel] Fatal crash on xen4.2 HVM + qemu-xen dm + NFS



Ian,

Not as far as I know, but per Trond zero-copy == O_DIRECT, so if you aren't
using O_DIRECT then you aren't using zero copy -- and that agrees with
my recollection. In that case your issue is something totally unrelated.

You could try stracing the qemu-dm and see what it does.

Will do. I'm wondering if AIO + NFS is ever zero copy without O_DIRECT.
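One way to check the O_DIRECT question from an strace capture is sketched below. It is only an illustration: it assumes strace's usual text output for open/openat calls, the helper name is made up for this example, and the sample trace is invented rather than taken from a real qemu-dm run.

```python
import re

# Matches lines such as:
#   openat(AT_FDCWD, "/path/disk.qcow2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11
#   open("/etc/hosts", O_RDONLY) = 5
# The optional first group skips openat's directory-fd argument.
OPEN_CALL = re.compile(r'\bopen(?:at)?\((?:[^",]+,\s*)?"([^"]*)",\s*([A-Z0-9_|]+)')


def find_opens(strace_text):
    """Return (path, uses_o_direct) for each open/openat call in the trace."""
    results = []
    for line in strace_text.splitlines():
        match = OPEN_CALL.search(line)
        if match is None:
            continue
        path, flags = match.group(1), match.group(2).split("|")
        results.append((path, "O_DIRECT" in flags))
    return results


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    sample = (
        'openat(AT_FDCWD, "/my/nfs/directory/testdisk.qcow2", '
        "O_RDWR|O_DIRECT|O_CLOEXEC) = 11\n"
        'open("/etc/hosts", O_RDONLY) = 5\n'
    )
    for path, direct in find_opens(sample):
        print(path, "O_DIRECT" if direct else "buffered")
```

If no open of the disk image carries O_DIRECT, then by the reasoning above the I/O is not zero copy, and the crash would have to come from somewhere else.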

I'm wondering whether what's happening is that when the disk grows
(or there's a backing file in place) some sort of different I/O is
done by qemu. Perhaps irrespective of write cache setting, it does some
form of zero copy I/O when there's a backing file in place.

I doubt that, but I don't really know anything about qdisk.

I'd be much more inclined to suspect a bug in the xen_qdisk backend's
handling of disk resizes, if that's what you are doing.

We aren't resizing the qcow2 disk itself. What we're doing is
creating a 20G (virtual size) qcow2 disk, containing a 3G (or
so) Ubuntu image - i.e. the partition table says it's 3G. We
then take a snapshot of it and use that as a backing file. The
guest then writes to the partition table enlarging it to the
virtual size of the disk, then resizes the file system. This
triggers it. Unless QEMU has some special reason to care about
what is in the partition table (e.g. to support the old xen
'mount a file as a partition' stuff), it's just a pile of sectors
being written.
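The disk layout described above can be sketched as a pair of qemu-img invocations. This is a guess at an equivalent setup, not the exact commands used here: the file names are hypothetical, and the script only builds and prints the command lines rather than running them.

```python
import shlex


def repro_commands(base="base.qcow2", overlay="testdisk.qcow2", virtual_size="20G"):
    """Build qemu-img command lines for a qcow2 base image plus an overlay.

    File names are placeholders; the base image would still need the ~3G
    Ubuntu image installed into it before the overlay is created.
    """
    # 20G virtual-size qcow2 image that will hold the small guest install.
    create_base = ["qemu-img", "create", "-f", "qcow2", base, virtual_size]
    # Overlay whose backing file is the base image; guest writes land here.
    create_overlay = [
        "qemu-img", "create", "-f", "qcow2",
        "-o", f"backing_file={base},backing_fmt=qcow2",
        overlay,
    ]
    return [create_base, create_overlay]


if __name__ == "__main__":
    for argv in repro_commands():
        print(shlex.join(argv))
```

The guest would then be pointed at the overlay file, so that the partition-table and file-system resize writes all go to an image with a backing file.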

tap == blktap2. I don't know if it supports qcow or not but I don't
think xl exposes it if it does.

Well, in xl's conf file we are using
disk = [ 'tap:qcow2:/my/nfs/directory/testdisk.qcow2,xvda,w' ]

I think that's how you are meant to do qcow2 isn't it?
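To make the pieces of that disk line explicit, here is a small sketch that splits the string into its visible parts. It is plain string handling for illustration, not xl's actual parser, and the function name is made up; it also assumes the path itself contains no ':' or ','.

```python
def split_legacy_disk_spec(spec):
    """Split 'prefix:...:target,vdev,access' into its visible components."""
    # Three comma-separated fields: prefixed target, virtual device, access mode.
    head, vdev, access = [part.strip() for part in spec.split(",")]
    # Everything before the last ':' is treated as a prefix (e.g. tap, qcow2).
    *prefixes, target = head.split(":")
    return {"prefixes": prefixes, "target": target, "vdev": vdev, "access": access}


if __name__ == "__main__":
    parts = split_legacy_disk_spec("tap:qcow2:/my/nfs/directory/testdisk.qcow2,xvda,w")
    for key, value in parts.items():
        print(key, "=", value)
```

For the line above this yields the prefixes tap and qcow2, the NFS path as the target, xvda as the virtual device and w as the access mode.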

You could try with a test .vhd or .raw file though.

We can do this but I'm betting it won't fail (at least with .raw)
as it only breaks on qcow2 if there's a backing file associated
with the qcow2 file (i.e. if we're writing to a snapshot).

Unfortunately it won't be zero. There will be at least one reference
from the page being part of the process, which won't be dropped until
the process dies.

OK, well this is my ignorance of how the grant mechanism works.
I had assumed the page from the relevant domU got mapped into the
process in dom0, and that when it was unmapped it would be mapped
back out of the process's memory. Otherwise would the process's
memory map not fill up?

BTW I'm talking about the dom0 kernel's page reference count. Xen's page
reference count is irrelevant here.

Indeed.

I suggest you google up previous discussions on the netdev list about
this issue -- all these sorts of ideas were discussed back then.

OK. I will google.

--
Alex Bligh

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel