
Re: [Xen-devel] PoD code killing domain before it really gets started



On 26/07/12 15:41, Jan Beulich wrote:
George,

in the hope that you might have some insight, or might be
remembering that something like this was reported before (and
ideally fixed), I'll try to describe a problem a customer of ours
reported. Unfortunately this is with Xen 4.0.x (plus numerous
backports), and it is not known whether the same issue exists
on 4.1.x or -unstable.

For a domain with maxmem=16000M and memory=3200M, what
gets logged is

(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory! tot_pages 480 pod_entries 221184
(XEN) domain_crash called from p2m.c:1150
(XEN) Domain 3 reported crashed by domain 0 on cpu#6:
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory! tot_pages 480 pod_entries 221184
(XEN) domain_crash called from p2m.c:1150

Translated to hex, the numbers are 1e0 and 36000. The latter
varies across the (rather infrequent) occurrences of this (but
was always a multiple of 1000 - see below), and immediate
retries to create the affected domain have always succeeded so
far (i.e. the failure is definitely not due to a lack of free
memory).

Given that the memory= target wasn't reached yet, I would
conclude that this happens in the middle of (4.0.x file name used
here) tools/libxc/xc_hvm_build.c:setup_guest()'s main physmap
population code. However, the way I read the code there, I
would think that the sequence of population should be (using
hex GFNs) 0-9f, c0-7ff, 800-fff, 1000-17ff, etc. That,
however, appears to be inconsistent with the logged numbers
above - tot_pages should always be at least 7e0 (the low 2MB
less the VGA hole), especially when pod_entries is divisible by
800 (the increment in which large-page population happens).
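
For reference, the structure I'm reading is roughly the
following (a paraphrased sketch rather than the verbatim 4.0.x
code; populate_4k_pages() and populate_batch() are hypothetical
stand-ins for the actual libxc populate calls):

    /* Paraphrased sketch of setup_guest()'s physmap population
     * (tools/libxc/xc_hvm_build.c, 4.0.x); the populate_*() helpers
     * are stand-ins for the real libxc calls. */

    /* 4k pages for GFNs 0-9f, skip the VGA hole at a0-bf, then
     * 4k pages for GFNs c0-7ff: 7e0 pages for the low 2MB. */
    rc = populate_4k_pages(dom, 0x00, 0xa0, page_array);
    if ( rc == 0 )
        rc = populate_4k_pages(dom, 0xc0, 0x800 - 0xc0, page_array);
    cur_pages = 0x7e0;

    /* From GFN 800 on, populate in chunks of up to 800 (hex) pages,
     * using 2MB superpages within each chunk where possible - hence
     * the expectation that pod_entries moves in steps of 800. */
    next_gfn = 0x800;
    while ( (rc == 0) && (nr_pages > cur_pages) )
    {
        unsigned long count = nr_pages - cur_pages;

        if ( count > 0x800 )
            count = 0x800;
        rc = populate_batch(dom, next_gfn, count, page_array);
        cur_pages += count;
        next_gfn  += count;
    }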

As a result of this apparent inconsistency I can't really
conclude anything from the logged numbers.

The main question, irrespective of any numbers, of course is:
How would p2m_pod_demand_populate() be invoked at all during
this early phase of domain construction? Nothing should be
touching any of the memory... If this nevertheless is possible
(even if just for a single page), then perhaps the tools ought
to make sure the pages put into the low 2MB actually get
zeroed, so the PoD code has a chance to find victim pages.
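
In case they don't get zeroed today, here is a minimal sketch
of what the tools could do - assuming the 4.0.x
xc_map_foreign_range() interface; zero_guest_pages() is a
hypothetical helper, and this is untested:

    /* Minimal sketch: scrub the already-populated low-2MB 4k pages
     * so the PoD emergency sweep can find them as zeroed victim
     * pages.  Assumes the 4.0.x libxc interface; untested. */
    #include <string.h>
    #include <sys/mman.h>
    #include <xenctrl.h>

    static int zero_guest_pages(int xc_handle, uint32_t dom,
                                xen_pfn_t first_gfn, unsigned long nr)
    {
        unsigned long i;

        for ( i = 0; i < nr; i++ )
        {
            void *p = xc_map_foreign_range(xc_handle, dom,
                                           XC_PAGE_SIZE,
                                           PROT_READ | PROT_WRITE,
                                           first_gfn + i);

            if ( p == NULL )
                return -1;
            memset(p, 0, XC_PAGE_SIZE);
            munmap(p, XC_PAGE_SIZE);
        }

        return 0;
    }

The two low ranges would then be covered by
zero_guest_pages(xc_handle, dom, 0x00, 0xa0) and
zero_guest_pages(xc_handle, dom, 0xc0, 0x740).
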
Yes, this is a very strange circumstance: p2m_pod_demand_populate() shouldn't be called until at least one PoD entry has been created; and that shouldn't happen until after c0-7ff have been populated with 4k pages.

Although, it does look as though, when populating the 4k pages, the code doesn't actually check whether the allocation succeeded... oh wait, no, it does check rc as a condition of the while() loop -- but rc is then clobbered by the xc_domain_set_pod_target() call. Then again, surely if the 4k allocation failed, the set_target() call should fail as well? And in any case, there shouldn't yet be any PoD entries to cause a demand-populate.

We probably should change "if(pod_mode)" to "if(rc == 0 && pod_mode)" or something like that, just to be sure. I'll spin up a patch.
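
Concretely, the tail of that logic looks something like this
(paraphrased from memory rather than the verbatim 4.0 code;
identifiers approximate):

    /* Tail of setup_guest() (tools/libxc/xc_hvm_build.c),
     * paraphrased; identifiers approximate. */
    while ( (rc == 0) && (nr_pages > cur_pages) )
    {
        /* ... populate the next batch of guest pages, setting rc ... */
    }

    /*
     * Was just "if ( pod_mode )": that unconditionally overwrites rc
     * with the result of xc_domain_set_pod_target(), so a failure in
     * the population loop above gets silently masked whenever the
     * set-target call itself succeeds.
     */
    if ( (rc == 0) && pod_mode )
        rc = xc_domain_set_pod_target(xc_handle, dom, pod_pages,
                                      NULL, NULL, NULL);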

I think what I would try to do is to add a stack trace to the demand_populate() failure path, so you can see where the call came from; i.e., whether it came from a guest access, or from someone in dom0 writing to some of the memory. I'd also add a printk to set_pod_target(), so you can see if it was actually called and what it was set to.
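
Something along these lines, for instance (a sketch against the
4.0-era xen/arch/x86/mm/p2m.c; exact names and context will
differ a bit, and it assumes dump_execution_state() is available
in that tree):

    /* In p2m_pod_demand_populate()'s failure path, just before the
     * domain_crash() call: dump a stack trace, so we can see whether
     * the populate came from a guest access or from dom0 touching
     * the guest's memory. */
    dump_execution_state();

    /* In p2m_pod_set_mem_target(), on entry: confirm whether it is
     * called at all, and with what value. */
    printk("p2m_pod_set_mem_target: d%d target %#lx tot_pages %#x\n",
           d->domain_id, target, d->tot_pages);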

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

