RE: [Xen-devel] Error restoring DomU when using GPLPV

> On 04/08/2009 08:58, "James Harper" <james.harper@xxxxxxxxxxxxxxxx>
> > When the DomU is running, 'xm debug q' looks like:
> >
> > (XEN) General information for domain 23:
> > (XEN)     refcnt=3 dying=0 nr_pages=197611 xenheap_pages=33
> > dirty_cpus={1} max_pages=197632
> >
> > During restore, it looks like this:
> > (XEN) General information for domain 22:
> > (XEN)     refcnt=3 dying=0 nr_pages=196576 xenheap_pages=5
> > max_pages=197632
> Is the host simply out of memory?

No. 5G physical memory free and there is only 768MB assigned to the
DomU. I can start the guest again, I just can't restore it.

> If dom22 above has 196576 pages and
> max_pages=197632 then an allocation of 33 order-0 extents should not
> fail due to over-commitment to the guest.

196576 is just where it happened to be when I took the last 'xm debug
q', before 'xm restore' failed and deleted it. The allocation of '33'
returns '32' so it does appear to be an off-by-one error.
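
As a sanity check on the figures quoted above, here is a rough sketch of the headroom arithmetic. It assumes 4 KiB pages, and the variable names are purely illustrative (they are not taken from any Xen tool or source file):

```python
# Figures copied from the 'xm debug q' output taken during restore.
PAGE_SIZE_KIB = 4  # assumption: 4 KiB x86 pages

nr_pages = 196576        # pages owned by the domain at that moment
max_pages = 197632       # the domain's page limit
requested_extents = 33   # order-0 extents, i.e. one page each

# Pages the domain could still be given before hitting max_pages.
headroom = max_pages - nr_pages
fits = requested_extents <= headroom

# max_pages expressed in MiB, for comparison with the 768MB assignment.
max_mib = max_pages * PAGE_SIZE_KIB // 1024

print(f"headroom: {headroom} pages")
print(f"request of {requested_extents} pages fits: {fits}")
print(f"max_pages corresponds to {max_mib} MiB")
```

With the snapshot values the request fits comfortably, which is why the snapshot alone cannot say whether the limit was reached by the time the allocation failed.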

> The only reason for such a failure is
> inadequate memory available in the host free pools. Perhaps xend
> auto-ballooning is involved? I'd turn it off if so, as it blows. It
> may have freed up one-page-too-few or somesuch.

I assume that what happens is that the memory continues to grow until it
hits max_pages, for some reason.  Is there a way to tell 'xm restore'
not to delete the domain when the restore fails so I can see if nr_pages
really does equal max_pages at the time that it dies?
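
If the domain could be kept around after a failed restore, the comparison itself is easy to script. The following is only a sketch: it assumes the debug dump has been captured as text in the same key=value layout as the output quoted earlier, and parse_domain_info is a hypothetical helper, not part of xm or xend:

```python
import re

def parse_domain_info(text):
    """Collect integer key=value fields from a '(XEN)' debug dump snippet.

    Hypothetical helper; fields with non-numeric values such as
    dirty_cpus={1} are simply skipped.
    """
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", text)}

# Sample text copied from the dump taken during restore.
sample = """(XEN) General information for domain 22:
(XEN)     refcnt=3 dying=0 nr_pages=196576 xenheap_pages=5
max_pages=197632"""

info = parse_domain_info(sample)
remaining = info["max_pages"] - info["nr_pages"]
at_limit = info["nr_pages"] >= info["max_pages"]

print(f"remaining pages: {remaining}, at limit: {at_limit}")
```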

The curious thing is that this only happens when GPLPV is running. A PV
domU or a pure HVM DomU doesn't have this problem (presumably that would
have been noticed during regression testing). It would be interesting to
try a PVonHVM Linux DomU and see how that goes... hopefully someone who
is having the problem with GPLPV also has PVonHVM domains they could test.


