Re: [win-pv-devel] Problems with xenvbd
> -----Original Message-----
> From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
> Sent: 04 September 2015 12:08
> To: Paul Durrant; Rafał Wojdyła; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Stefano Stabellini
> Subject: Re: [win-pv-devel] Problems with xenvbd
>
> On 04/09/2015 11:30, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx
> >> [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Paul Durrant
> >> Sent: 02 September 2015 10:00
> >> To: Fabio Fantoni; Rafał Wojdyła; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> >> Cc: Stefano Stabellini
> >> Subject: Re: [win-pv-devel] Problems with xenvbd
> >>
> >>> -----Original Message-----
> >>> From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
> >>> Sent: 02 September 2015 09:54
> >>> To: Paul Durrant; Rafał Wojdyła; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> >>> Cc: Stefano Stabellini
> >>> Subject: Re: [win-pv-devel] Problems with xenvbd
> >>>
> >>> On 01/09/2015 16:41, Paul Durrant wrote:
> >>>>> -----Original Message-----
> >>>>> From: Fabio Fantoni [mailto:fabio.fantoni@xxxxxxx]
> >>>>> Sent: 21 August 2015 14:14
> >>>>> To: Rafał Wojdyła; Paul Durrant; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> >>>>> Subject: Re: [win-pv-devel] Problems with xenvbd
> >>>>>
> >>>>> On 21/08/2015 10:12, Fabio Fantoni wrote:
> >>>>>> On 21/08/2015 00:03, Rafał Wojdyła wrote:
> >>>>>>> On 2015-08-19 23:25, Paul Durrant wrote:
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx
> >>>>>>>>> [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Rafal Wojdyla
> >>>>>>>>> Sent: 18 August 2015 14:33
> >>>>>>>>> To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> >>>>>>>>> Subject: [win-pv-devel] Problems with xenvbd
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I've been testing the current pvdrivers code in preparation for
> >>>>>>>>> creating upstream patches for my xeniface additions and I noticed
> >>>>>>>>> that xenvbd
> >>>>>>>>> seems to be very unstable for me. I'm not sure if it's
> >>>>>>>>> a problem with xenvbd itself or my code because it seemed to only
> >>>>>>>>> manifest when the full suite of our guest tools was installed along
> >>>>>>>>> with xenvbd. In short, most of the time the system crashed with
> >>>>>>>>> kernel memory corruption in seemingly random processes shortly
> >>>>>>>>> after start. Driver Verifier didn't seem to catch anything. You can
> >>>>>>>>> see a log from one such crash in the attachment crash1.txt.
> >>>>>>>>>
> >>>>>>>>> Today I tried to perform some more tests but this time without our
> >>>>>>>>> guest tools (only pvdrivers and our shared libraries were
> >>>>>>>>> installed). To my surprise now Driver Verifier was crashing the
> >>>>>>>>> system every time in xenvbd (see crash2.txt). I don't know why it
> >>>>>>>>> didn't catch that previously... If adding some timeout to the
> >>>>>>>>> offending wait doesn't break anything I'll try that to see if I can
> >>>>>>>>> reproduce the previous memory corruptions.
> >>>>>>>>>
> >>>>>>>> Those crashes do look odd. I'm on PTO for the next week but I'll have
> >>>>>>>> a look when I get back to the office. I did run verifier on all the
> >>>>>>>> drivers a week or so back (while running vbd plug/unplug tests) but
> >>>>>>>> there have been a couple of changes since then.
> >>>>>>>>
> >>>>>>>> Paul
> >>>>>>>>
> >>>>>>> No problem. I attached some more logs. The last one was during system
> >>>>>>> shutdown, after that the OS failed to boot (probably corrupted
> >>>>>>> filesystem since the BSOD itself seemed to indicate that). I think every
> >>>>>>> time there is a BLKIF_RSP_ERROR somewhere but I'm not yet familiar with
> >>>>>>> Xen PV device interfaces so not sure what that means.
> >>>>>>>
> >>>>>>> In the meantime I've run more tests on my modified xeniface driver to
> >>>>>>> make sure it's not contributing to these issues but everything seemed to
> >>>>>>> be fine there.
> >>>>>>>
> >>>>>>>
> >>>>>> I also had disk corruption on Windows 10 Pro 64-bit with the PV drivers
> >>>>>> build of 11 August, but I'm not sure it is related to the winpv drivers;
> >>>>>> on the same domU I also started testing snapshots with a qcow2 disk overlay.
> >>>>>> For this case I don't have useful information because it doesn't try to
> >>>>>> boot Windows at all, but if it happens again I'll try to collect other
> >>>>>> useful information.
> >>>>> It happened another time, but again I was unable to understand what
> >>>>> exactly the cause is.
> >>>>> On Windows reboot all seemed OK and it did a clean shutdown, but on
> >>>>> reboot SeaBIOS didn't find a bootable disk and the qemu log doesn't show
> >>>>> useful information.
> >>>>> qemu-img check shows errors:
> >>>>>> /usr/lib/xen/bin/qemu-img check W10.disk1.cow-sn1
> >>>>>> ERROR cluster 143 refcount=1 reference=2
> >>>>>> Leaked cluster 1077 refcount=1 reference=0
> >>>>>> ERROR cluster 1221 refcount=1 reference=2
> >>>>>> Leaked cluster 2703 refcount=1 reference=0
> >>>>>> Leaked cluster 5212 refcount=1 reference=0
> >>>>>> Leaked cluster 13375 refcount=1 reference=0
> >>>>>>
> >>>>>> 2 errors were found on the image.
> >>>>>> Data may be corrupted, or further writes to the image may corrupt it.
> >>>>>>
> >>>>>> 4 leaked clusters were found on the image.
> >>>>>> This means waste of disk space, but no harm to data.
> >>>>>> 27853/819200 = 3.40% allocated, 22.65% fragmented, 0.00% compressed clusters
> >>>>>> Image end offset: 1850736640
> >>>>> I created it with:
> >>>>> /usr/lib/xen/bin/qemu-img create -o backing_file=W10.disk1.xm,backing_fmt=raw -f qcow2 W10.disk1.cow-sn1
> >>>>> and changed the xl domU configuration:
> >>>>> disk=['/mnt/vm2/W10.disk1.cow-sn1,qcow2,hda,rw',...
> >>>>> Dom0 is running Xen 4.6-rc1 and qemu 2.4.0
> >>>>> DomU is Windows 10 Pro 64-bit with the PV drivers build of 11 August
> >>>>>
> >>>>> How can I know for sure whether it is a winpv, qemu or other problem, and
> >>>>> collect useful information to report?
> >>>>>
> >>>>> Thanks for any reply and sorry for my bad English.
> >>>> This sounds very much like a lack of synchronization somewhere. I recall
> >>>> seeing other problems of this ilk when someone was messing around with
> >>>> O_DIRECT for opening images. I wonder if we are missing a flush operation
> >>>> on shutdown.
> >>>> Paul
> >>>>
> >>> Thanks for the reply.
> >>> I did a quick search but did not find O_DIRECT when grepping in libxl; I
> >>> found it only in the qemu code.
> >>> Afterwards I tried with the patch that seems to have added setting it for Xen:
> >>> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=454ae734f1d9f591345fa78376435a8e74bb4edd
> >>> Checking in libxl, it seems to be disabled by default, and from some old Xen
> >>> posts it seems that O_DIRECT creates problems.
> >>> Should I try enabling direct-io-safe on the domUs' qcow2 disks?
> >>> I also added Stefano Stabellini as cc.
> >>> @Stefano Stabellini: What is the current known status and result of
> >>> direct-io-safe?
> >>> Sorry if the questions are stupid or my English is too bad, but many
> >>> posts from recent years are confusing and in some cases also seem
> >>> contradictory about stability/integrity/performance when using it or not.
> >>> In particular it seems to crash with some kernels, but I don't understand
> >>> exactly which versions and/or with which patches.
> >>>
> >>> @Paul Durrant: have you seen my other mail where I wrote that, based on my
> >>> latest test with Xen 4.6, without the udev file Windows domUs with the new PV
> >>> drivers don't boot, and to boot them correctly I must re-add the udev
> >>> file? Can this cause an unexpected case related to this problem, or is it
> >>> different?
> >>> http://lists.xen.org/archives/html/win-pv-devel/2015-08/msg00033.html
> >>>
> >> I'm not sure why udev would be an issue here. The problem you have
> >> appears to be QEMU ignoring the request to unplug emulated disks. I've not
> >> seen this behaviour on my test box so I'll need to dig some more.
> >>
> > I notice you have 6 IDE channels? Are you using AHCI by any chance? If you
> > are then it looks like QEMU is not honouring the unplug request... that would
> > be where the bug is. I'll try to repro myself.
> >
> > Paul
>
> If I remember correctly I already tried with IDE as well for both problems
> (udev and qcow) with the same result.
> I have also been using mainly AHCI on Windows domUs (with the new PV drivers)
> in my test system for some months.
> But if needed, tell me and I'll do more tests.
> Your recent patches seem to be fixes related to unplug, or am I wrong? I'll
> retry with them this afternoon without the udev file if a new PV test build
> is ready.

My recent changes to xenvbd were to do with when unplug should be requested and also cleaning up on driver removal. I don't think either of them affect your case; I think you're experiencing a problem with QEMU.

  Paul

>
> Thanks for any reply and sorry for my bad English.

_______________________________________________
win-pv-devel mailing list
win-pv-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/win-pv-devel