
[Xen-devel] RE: blue screen in windows balloon driver



I am testing Windows 2003, and the PV driver MSI was built in the XP build environment.
 
Well, I checked all the other VMs that did not crash, and all of them have XenVbd_HwScsiResetBus.
What does this mean? Is it reasonable?
 
I will run the debug-mode PV driver on the other two physical hosts to see whether the log appears there.
(No blue screen has ever happened on those two hosts.)
 
Also, how can I check whether xenvbd is stuck?
 
many thanks.
 
> Subject: RE: blue screen in windows balloon driver
> Date: Tue, 1 Mar 2011 20:41:18 +1100
> From: james.harper@xxxxxxxxxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx
>
> Are you testing under Windows 2008?
>
> When you build the drivers under the windows 2008 build environment, you
> should get a storport xenvbd driver, not a scsiport xenvbd driver, but
> in your logs I see scsiport.
>
> This shouldn't affect the crash we are seeing though.
>
> James
>
> > -----Original Message-----
> > From: MaoXiaoyun [mailto:tinnycloud@xxxxxxxxxxx]
> > Sent: Tuesday, 1 March 2011 18:14
> > To: xen devel
> > Cc: James Harper
> > Subject: RE: blue screen in windows balloon driver
> >
> > Hi James:
> >
> > Attached are three logs. (I started testing the PV drivers in debug mode.)
> >
> > qemu-dm-w3.MR_cp7.vhd.log.normal:
> > is the VM that did not crash
> >
> >
> > qemu-dm-w3.MR_cp23.vhd.log.crash:
> > is the VM that crashed; the log shows an assertion failure.
> > *** Assertion failed: srb != NULL
> > *** Source File: e:\download\win-pvdrivers.hg\xenvbd\xenvbd_scsiport.c, line 988
> > Blue screen on "NO_PAGES_AVAILABLE"
> > ***STOP: 0x0000004D (0x0001566c,0x0001566c,0x00000000,0x00000000)
> >
> >
> > qemu-dm-w3.MR_cp6.vhd.log.crash: is the VM that crashed, but with no
> > special error in the log
> > Blue screen on "NO_PAGES_AVAILABLE"
> > ***STOP: 0x0000004D (0x0001590f,0x0001590f,0x00000000,0x00000000)
> >
> > thanks.
> >
> > > Subject: Re: blue screen in windows balloon driver
> > > From: james.harper@xxxxxxxxxxxxxxxx
> > > Date: Tue, 1 Mar 2011 16:01:46 +1100
> > > To: tinnycloud@xxxxxxxxxxx
> > > CC: xen-devel@xxxxxxxxxxxxxxxxxxx
> > >
> > > Please send logs and bug check codes for any future crashes
> > >
> > > Can you also send me your memhog program?
> > >
> > > Sent from my iPhone
> > >
> > > On 01/03/2011, at 13:37, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
> > >
> > > > Thanks James.
> > > >
> > > > Well, what if the memory has already been ballooned down?
> > > > In my test, the memory-eating process (named memhog) is started after
> > > > the server starts (that is, all VMs have already ballooned down to 512M).
> > > > It looks like the "balloon down threads" are not working at that time.
> > > >
> > > > One more question: if memhog eats memory at a very fast speed, will it
> > > > consume the NonPagedPool memory? (I am not sure whether NonPagedPool and
> > > > PagedPool are separate pools.)
> > > > If so, and the memory is exhausted, other places such as
> > > > "ExAllocatePoolWithTag(NonPagedPool,...)" will get no memory. Could
> > > > that cause the blue screen?
> > > >
> > > > I will have the latest driver tested, thanks.
> > > >
> > > >
> > > > > Subject: RE: blue screen in windows balloon driver
> > > > > Date: Tue, 1 Mar 2011 10:45:52 +1100
> > > > > From: james.harper@xxxxxxxxxxxxxxxx
> > > > > To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
> > > > >
> > > > > I have just pushed a change to check the
> > > > > "\KernelObjects\LowMemoryCondition" event before allocating memory for
> > > > > ballooning, and waiting if the event is set. This may resolve the
> > > > > problems you are seeing.
> > > > >
> > > > > What I have seen is that initially the event gets set, but then as
> > > > > Windows pages some active memory out the event gets cleared again and
> > > > > further ballooning down is possible. It may prevent you ballooning down
> > > > > quite as low as you could before, but if it stops Windows crashing then
> > > > > I think it is good.
> > > > >
> > > > > James
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: MaoXiaoyun [mailto:tinnycloud@xxxxxxxxxxx]
> > > > > > Sent: Monday, 28 February 2011 19:30
> > > > > > To: xen devel
> > > > > > Cc: James Harper
> > > > > > Subject: RE: blue screen in windows balloon driver
> > > > > >
> > > > > > Hi James:
> > > > > >
> > > > > > Unfortunately, we still hit the blue screen in the stress test.
> > > > > > (We start a total of 24 HVMs on a single 16-core, 24G host;
> > > > > > each HVM owns 2G of memory and starts with memory=512M,
> > > > > > and inside each are two memory-eating processes, each of which will
> > > > > > eat 1G of memory.)
> > > > > >
> > > > > > As I went through the code, I noticed that all memory allocation
> > > > > > uses "ExAllocatePoolWithTag(NonPagedPool,...)", which allocates
> > > > > > from the NonPagedPool. As far as I know, NonPagedPool memory cannot
> > > > > > be paged out and is limited. For the VMs that hit the blue screen,
> > > > > > I also found that free memory is quite low, with only a few hundred
> > > > > > KB left.
> > > > > >
> > > > > > So, when memory is overcommitted, some of the VMs will not get enough
> > > > > > memory, and if most of a VM's memory is occupied by the memory-eating
> > > > > > processes, then ExAllocatePoolWithTag will fail, thus causing the
> > > > > > "NO_PAGES_AVAILABLE" blue screen. Is this possible?
> > > > > >
> > > > > > Meanwhile, I will test your PV driver to see whether the blue screen
> > > > > > still occurs, thanks.
> > > > > >
> > > > > >
> > > > > > >From: tinnycloud@xxxxxxxxxxx
> > > > > > >To: tinnycloud@xxxxxxxxxxx
> > > > > > >Subject: FW: blue screen in windows balloon driver
> > > > > > >Date: Mon, 28 Feb 2011 16:16:59 +0800
> > > > > > >
> > > > > > >
> > > > > > > >Thanks for fixing PoD. It is better to do it earlier to avoid a
> > > > > > > >crash.
> > > > > > >
> > > > > > > >The meminfo is written every 1 second into the xenstore directory
> > > > > > > >/local/domain/did/memory/meminfo.
> > > > > > > >To avoid too many writes, the thread only does the write when the
> > > > > > > >memory changes by more than 5M.
> > > > > > >
> > > > > > > >As for a userspace daemon, it was our first choice, but we found that
> > > > > > > >it makes the xenstore daemon in dom0 consume a lot of CPU (we tested
> > > > > > > >on Linux only), so we decided to move it into the driver.
> > > > > > >
> > > > > > > >I have merged my code with the latest changeset 866, and will do the
> > > > > > > >stress test later.
> > > > > > >
> > > > > > >many thanks.
> > > > > > >
> > > > > > >> Subject: RE: RE: blue screen in windows balloon driver
> > > > > > > >> Date: Sun, 27 Feb 2011 22:25:28 +1100
> > > > > > >> From: james.harper@xxxxxxxxxxxxxxxx
> > > > > > >> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
> > > > > > >> CC: george.dunlap@xxxxxxxxxxxxx
> > > > > > >>
> > > > > > >> > Thanks James.
> > > > > > >> >
> > > > > > > >> > I think it is GPLPV. The driver is from
> > > > > > > >> > http://xenbits.xen.org/ext/win-pvdrivers.hg
> > > > > > > >> > But I have made some other changes:
> > > > > > >> >
> > > > > > > >> > 1) Add PoD support
> > > > > > > >> > 2) Enable a meminfo thread that periodically writes VM meminfo into
> > > > > > > >> > xenstore.
> > > > > > > >> > We use the Current Memory, Free Memory, and Committed Memory values,
> > > > > > > >> > retrieved through the Native API
> > > > > > > >> > 3) Our code is based on changeset 823; attached is the diff of my
> > > > > > > >> > current code against changeset 853.
> > > > > > >> >
> > > > > > > >> > Maybe I need to add my code to 853 and test again.
> > > > > > >> > Thanks.
> > > > > > >> >
> > > > > > >>
> > > > > > > >> As per other post, I have just committed some patches and PoD
> > > > > > > >> should now be working properly. I can start a DomU with 4GB of maxmem
> > > > > > > >> but only 128MB of populated memory without any problems. This now
> > > > > > > >> works because I do the initial balloon down in DriverEntry, way before
> > > > > > > >> xenpci does anything else. Before it would blow up in DriverEntry. I
> > > > > > > >> think I determine the amount to initially balloon down a little
> > > > > > > >> differently from you too.
> > > > > > >>
> > > > > > > >> It takes a while to balloon down the memory though... I think
> > > > > > > >> Windows tends to delay large allocations or something, because
> > > > > > > >> ballooning up again is pretty much instant.
> > > > > > >>
> > > > > > > >> How often are you writing meminfo stuff into xenstore? Could you do
> > > > > > > >> that in userspace (the interface to xenstore exists and seems to work
> > > > > > > >> well although it's a little tedious)? You would then be able to just
> > > > > > > >> run it as a service and not need to patch GPLPV.
> > > > > > >>
> > > > > > >> James
> > > > > > >
>
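
The meminfo throttle described earlier in this thread (the thread wakes every 1 second but only writes to xenstore when memory has changed by more than 5M) could look roughly like the C sketch below. This is not the actual driver code: the names meminfo_state and meminfo_should_write, the use of KB as the unit, and the choice to always write the first sample are assumptions made for illustration, and the xenstore write itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Write threshold from the thread: only changes larger than 5M are written. */
#define MEMINFO_THRESHOLD_KB (5u * 1024u)

/* Hypothetical per-thread state; not taken from the GPLPV source. */
typedef struct {
    uint64_t last_written_kb; /* value most recently written to xenstore */
    bool     has_written;     /* false until the first write happens */
} meminfo_state;

/*
 * Intended to be called once per polling interval (every 1 second in the
 * description above). Returns true when the caller should write current_kb
 * to xenstore, and records it as the new baseline.
 */
static bool meminfo_should_write(meminfo_state *st, uint64_t current_kb)
{
    uint64_t delta;

    assert(st != NULL);

    if (!st->has_written) {
        /* Assumption: the first sample is always published. */
        st->has_written = true;
        st->last_written_kb = current_kb;
        return true;
    }

    /* Absolute difference without relying on signed arithmetic. */
    delta = current_kb > st->last_written_kb
          ? current_kb - st->last_written_kb
          : st->last_written_kb - current_kb;

    if (delta <= MEMINFO_THRESHOLD_KB)
        return false; /* change too small: skip the xenstore write */

    st->last_written_kb = current_kb;
    return true;
}
```

Comparing against the last written value, rather than against the previous sample, means that slow drift is still reported once it adds up to more than 5M.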

 

