
[Xen-users] RE: VM with PV drivers; Win2003 lock-up



Bart,

I hope you don't mind; I've cc'd xen-users as I'll end up repeating
this there anyway.

> Hi James
> 
> I'm running the latest Windows PV drivers (downloaded and built as a
> checked version) on a Win2k3 SP2 VM image, which has 4 CPUs defined
> in the Xen config script. The host is running Xen version 3.1.4 on
> 2.6.18 -
> 
> The image boots up and runs fine -
> 
> On the VM I added external storage (a volume) via iSCSI; the storage
> was added fine and can be read from and written to.
> 
> When I fire up an iometer test, after about 40 minutes or so the VM
> no longer answers pings; I can connect to the console via VNC but the
> pointing device is dead.
> 
> I destroyed the VM with xm destroy and re-ran the test under the
> debugger, using another VM to debug the image.
> 
> When the VM running iometer locks up, I am able to break into the
> debugger -
> 
> But doing step-overs, all that seems to run is nt!KiIdleLoop and once
> in a while nt!PoSetPowerState.
> 
> Questions -
> 
> Will we get power events if the system is resource starved?
>
> If so, how will the PV drivers handle them?

Not sure. I haven't implemented any power stuff.

> Have you run into this type of condition before?

Nope.

> Any words of wisdom, etc ...
> 
> Do you have an unreleased version of the code (drivers), which would
> aid in tracking down this issue?
> 
> I don't mind running unreleased code -

It's a bit of a tricky time. I have just recently updated the interrupt
triggering code and unfortunately this has resulted in a huge reduction
in network performance. Previously the code went:
(XenPCI IRQ)->(XenPCI Dpc)->(XenNet IRQ)->(XenNet Dpc)
But now it goes:
(XenPCI IRQ)->(XenNet IRQ)->(XenNet Dpc)

This means a huge decrease in latency between receiving the interrupt
and calling the DPC where the actual work gets done. The side effect is
that under high network load there is much less work to be done per
interrupt (fewer packets built up), so the per-packet overhead is much
higher. The difference in performance is mainly on the RX side of
things, and is on the order of a reduction from around 800MB/s to
50MB/s. Nasty.
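
For the curious, the shape of the change in WDM terms is roughly as
follows - a minimal sketch only, with made-up names
(XenPci_InterruptService and XENNET_CONTEXT are illustrative, not the
identifiers from the actual tree):

#include <ntddk.h>

/* Illustrative context only - not the real xenpci/xennet structures. */
typedef struct _XENNET_CONTEXT {
    KDPC RxDpc;  /* XenNet's DPC, where the ring work actually happens */
} XENNET_CONTEXT, *PXENNET_CONTEXT;

/* Old path: the XenPCI ISR queued a XenPCI DPC, which called the
 * child's "IRQ" routine, which queued the XenNet DPC. New path: the
 * XenPCI ISR calls the child's interrupt routine directly at DIRQL,
 * and that queues the XenNet DPC, cutting out one scheduling hop. */
BOOLEAN XenPci_InterruptService(PKINTERRUPT Interrupt, PVOID ServiceContext)
{
    PXENNET_CONTEXT ctx = (PXENNET_CONTEXT)ServiceContext;

    UNREFERENCED_PARAMETER(Interrupt);

    /* Defer the actual packet work to DISPATCH_LEVEL. */
    KeInsertQueueDpc(&ctx->RxDpc, NULL, NULL);
    return TRUE;  /* the interrupt was ours */
}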

I've just started putting in some code to improve efficiency -
basically telling Xen not to interrupt us unless there is more than a
single packet of work to be done, with a timer to make sure that things
don't jam up when the load drops and there isn't much work outstanding
anymore; there's a rough sketch of the idea below. Once I have that
even half working I'll push it to hg and you can have a look at it.
I've also fixed a bunch of little problems along the way that were
making the Windows Driver Verifier crash all over the place.
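
To give a feel for the mechanism - a sketch only, assuming the standard
Xen shared-ring layout from the public io/ring.h headers; RX_BATCH, the
function name, and the timer details are all made up:

/* Xen public ring definitions; the include path varies by tree. */
#include <public/io/ring.h>
#include <public/io/netif.h>

#define RX_BATCH 8  /* made-up threshold, purely illustrative */

/* Ask the backend to hold off notifying us until RX_BATCH responses
 * are outstanding. The backend only sends an event when its response
 * producer passes rsp_event, so moving rsp_event further ahead batches
 * up the work done per interrupt. A periodic timer (not shown - e.g.
 * a KTIMER + DPC that polls the ring every few ms) stops a light
 * trickle of traffic from sitting unprocessed below the threshold. */
static void XenNet_SetRxEventThreshold(netif_rx_front_ring_t *ring)
{
    ring->sring->rsp_event = ring->rsp_cons + RX_BATCH;
}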

None of the above should cause the problems you are seeing, though.
Are your boot device and all your swap devices on xenvbd, or are they
on iSCSI too? I need to know that before I can offer any suggestions.

Are you sure that iSCSI is causing a problem? Can you get it to crash
without using iSCSI?
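
Just so we're talking about the same thing - an illustrative domU
config fragment (the device path is made up): a xenvbd boot disk
appears in the domain config like this, whereas an iSCSI volume
attached from inside the guest never shows up there at all.

# Illustrative config fragment - device path is made up.
disk = [ 'phy:/dev/vg0/win2k3-boot,hda,w' ]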

Thanks for the feedback.

James
