Re: [win-pv-devel] Problems with xenvbd
> -----Original Message-----
> From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
> Sent: 02 September 2015 09:54
> To: Paul Durrant; Rafał Wojdyła; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Stefano Stabellini
> Subject: Re: [win-pv-devel] Problems with xenvbd
>
> On 01/09/2015 16:41, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
> >> Sent: 21 August 2015 14:14
> >> To: Rafał Wojdyła; Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> >> Subject: Re: [win-pv-devel] Problems with xenvbd
> >>
> >> On 21/08/2015 10:12, Fabio Fantoni wrote:
> >>> On 21/08/2015 00:03, Rafał Wojdyła wrote:
> >>>> On 2015-08-19 23:25, Paul Durrant wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Rafal Wojdyla
> >>>>>> Sent: 18 August 2015 14:33
> >>>>>> To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> >>>>>> Subject: [win-pv-devel] Problems with xenvbd
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I've been testing the current pvdrivers code in preparation for
> >>>>>> creating upstream patches for my xeniface additions and I noticed
> >>>>>> that xenvbd seems to be very unstable for me. I'm not sure if it's
> >>>>>> a problem with xenvbd itself or my code because it seemed to only
> >>>>>> manifest when the full suite of our guest tools was installed along
> >>>>>> with xenvbd. In short, most of the time the system crashed with
> >>>>>> kernel memory corruption in seemingly random processes shortly
> >>>>>> after start. Driver Verifier didn't seem to catch anything. You can
> >>>>>> see a log from one such crash in the attachment crash1.txt.
> >>>>>>
> >>>>>> Today I tried to perform some more tests but this time without our
> >>>>>> guest tools (only pvdrivers and our shared libraries were
> >>>>>> installed). To my surprise now Driver Verifier was crashing the
> >>>>>> system every time in xenvbd (see crash2.txt).
> >>>>>> I don't know why it didn't catch that previously... If adding some
> >>>>>> timeout to the offending wait doesn't break anything I'll try that
> >>>>>> to see if I can reproduce the previous memory corruptions.
> >>>>>>
> >>>>> Those crashes do look odd. I'm on PTO for the next week but I'll have
> >>>>> a look when I get back to the office. I did run verifier on all the
> >>>>> drivers a week or so back (while running vbd plug/unplug tests) but
> >>>>> there have been a couple of changes since then.
> >>>>>
> >>>>> Paul
> >>>>>
> >>>> No problem. I attached some more logs. The last one was during system
> >>>> shutdown, after that the OS failed to boot (probably corrupted
> >>>> filesystem since the BSOD itself seemed to indicate that). I think every
> >>>> time there is a BLKIF_RSP_ERROR somewhere but I'm not yet familiar with
> >>>> Xen PV device interfaces so not sure what that means.
> >>>>
> >>>> In the meantime I've run more tests on my modified xeniface driver to
> >>>> make sure it's not contributing to these issues but everything seemed to
> >>>> be fine there.
> >>>>
> >>>>
> >>> I also had a disk corruption on windows 10 pro 64 bit with pv drivers
> >>> build of 11 august but I'm not sure that is related to winpv drivers,
> >>> on same domU I started testing also snapshot with qcow2 disk overlay.
> >>> For this case I don't have useful information because don't try to
> >>> boot windows at all but if rehappen I'll try to take other useful
> >>> information.
> >> Happen another time but also this I was unable to understand what is
> >> exactly the cause.
> >> On windows reboot all seems was ok and did a clean shutdown but on
> >> reboot seabios don't found bootable disk and qemu log don't show useful
> >> informations.
> >> qemu-img check show errors:
> >>> /usr/lib/xen/bin/qemu-img check W10.disk1.cow-sn1
> >>> ERROR cluster 143 refcount=1 reference=2
> >>> Leaked cluster 1077 refcount=1 reference=0
> >>> ERROR cluster 1221 refcount=1 reference=2
> >>> Leaked cluster 2703 refcount=1 reference=0
> >>> Leaked cluster 5212 refcount=1 reference=0
> >>> Leaked cluster 13375 refcount=1 reference=0
> >>>
> >>> 2 errors were found on the image.
> >>> Data may be corrupted, or further writes to the image may corrupt it.
> >>>
> >>> 4 leaked clusters were found on the image.
> >>> This means waste of disk space, but no harm to data.
> >>> 27853/819200 = 3.40% allocated, 22.65% fragmented, 0.00% compressed clusters
> >>> Image end offset: 1850736640
> >> I created it with:
> >> /usr/lib/xen/bin/qemu-img create -o backing_file=W10.disk1.xm,backing_fmt=raw -f qcow2 W10.disk1.cow-sn1
> >> and changed the xl domU configuration:
> >> disk=['/mnt/vm2/W10.disk1.cow-sn1,qcow2,hda,rw',...
> >> Dom0 is with xen 4.6-rc1 and qemu 2.4.0
> >> DomU is windows 10 pro 64 bit with pv drivers build of 11 august
> >>
> >> How I can know for sure if it is a winpv or qemu or other problem and
> >> take useful information to report?
> >>
> >> Thanks for any reply and sorry for my bad english.
> > This sounds very much like a lack of synchronization somewhere. I recall
> > seeing other problems of this ilk when someone was messing around with
> > O_DIRECT for opening images. I wonder if we are missing a flush operation
> > on shutdown.
> >
> > Paul
> >
> Thanks for reply.
> I did a fast search but I not found O_DIRECT grepping in libxl, I found
> it only in qemu code.
> After I tried with patch that seems added setting of it for xen:
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=454ae734f1d9f591345fa78376435a8e74bb4edd
> Checking in libxl seems disabled by default and from some old xen post
> seems that O_DIRECT creates problems.
> I should try it enable direct-io-safe in domUs qcow2 disks?
> Added also Stefano Stabellini as cc.
> @Stefano Stabellini: What is the current know status and result of
> direct-io-safe?
> Sorry is the question are stupid by or my english is too bad or many
> post of latest years are confused and in same cases seems also
> contradictory about stability/integrity/performance using it or not.
> In particular seems crash with some kernels but I not understand exactly
> what versions and/or with which patches.
>
> @Paul Durrant: have you see my other mail when I wrote that based on my
> latest test with xen 4.6 without udev file windows domUs with new pv
> driver don't boot and for still boot it correctly I must readd udev
> file, can this cause unexpected case related to this problem or is
> different?
> http://lists.xen.org/archives/html/win-pv-devel/2015-08/msg00033.html
>

I'm not sure why udev would be an issue here. The problem you have appears to be QEMU ignoring the request to unplug emulated disks. I've not seen this behaviour on my test box so I'll need to dig some more.

  Paul

> Thanks for any reply and sorry for my bad english.
>

_______________________________________________
win-pv-devel mailing list
win-pv-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/win-pv-devel
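
For anyone collecting data for a report like the one requested above, the qemu-img check output quoted in this thread can be summarised mechanically. The following is only a rough sketch in Python: the line format is assumed from the output pasted here (it may differ between QEMU versions), and the names `SAMPLE`, `LINE_RE` and `summarize` are made up for this illustration, not part of any Xen or QEMU tooling.

```python
import re

# Lines copied from the qemu-img check output quoted in this thread.
SAMPLE = """\
ERROR cluster 143 refcount=1 reference=2
Leaked cluster 1077 refcount=1 reference=0
ERROR cluster 1221 refcount=1 reference=2
Leaked cluster 2703 refcount=1 reference=0
Leaked cluster 5212 refcount=1 reference=0
Leaked cluster 13375 refcount=1 reference=0
"""

# Assumed line shape, based only on the sample above.
LINE_RE = re.compile(
    r"^(ERROR|Leaked) cluster (\d+) refcount=(\d+) reference=(\d+)$"
)


def summarize(text):
    """Return (error_clusters, leaked_clusters) found in check output."""
    errors, leaks = [], []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip summary lines and anything unrecognised
        kind, cluster = match.group(1), int(match.group(2))
        if kind == "ERROR":
            errors.append(cluster)
        else:
            leaks.append(cluster)
    return errors, leaks


if __name__ == "__main__":
    errors, leaks = summarize(SAMPLE)
    print(f"{len(errors)} refcount errors at clusters {errors}")
    print(f"{len(leaks)} leaked clusters at {leaks}")
```

The two lists line up with the "2 errors" and "4 leaked clusters" totals that qemu-img itself reported. Only the ERROR entries point at possible data corruption; per the tool's own message, the leaked clusters just waste space.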