[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen: sched: fix (ACPI S3) resume with cpupools with different schedulers.

On Fri, 2015-11-13 at 03:41 -0700, Jan Beulich wrote:
> > > > On 13.11.15 at 11:08, <dario.faggioli@xxxxxxxxxx> wrote:
> > Fix things by properly deallocating scheduler specific
> > data of the pCPU's pool scheduler during pCPU teardown,
> > and re-allocating them --always for &ops-- during pCPU
> > bringup.
> Avoiding this was done for a reason iirc: What if one such allocation
> fails (e.g. due to heap fragmentation resulting from the allocation
> order not being the exact inverse of the freeing order)?
I don't think I'm getting the point.

cpu_schedule_up() allocates, with the alloc_pdata hook, stuff that was
freed by cpu_schedule_down (with the free_pdata hook) already.

It actually allocates two things, the other being the entire idle vCPU
of the pCPU, but that is only if idle_vcpu[cpu] is NULL, which should
not be happening during suspend/resume.

In any case, if one of these allocations fails, it already returns
ENOMEM. What happens then is, of course, caller's call. If it happens
during boot for pCPU 0, we BUG(). If it happens with non-boot pCPUs
(either at boot, during hotplug or during suspend/resume), we fail the
bringup of that pCPU at the CPU_UP_PREPARE phase.

This patch introduces one more occasion for the failure to occur, sure,
if there's no (proper) memory. Without this patch the system crashes
already, and even if there is enough (proper) memory, thuogh.

Furthermore, at a later point on the pCPU bringup path,
schedule_cpu_switch() is called. In the scenario described in the
changelog, what it will do is deallocating the existing scheduler
specific per-vCPU data, and re-allocating a new one. If the allocation
fails at that point, what happens is that we fail the bringup at the
CPU_ONLINE stage and (in cpu_up()) we BUG(). This does not really look
better than what happens with the patch applied to me... Quite the
opposite, TBH.

Actually, re-looking at schedule_cpu_switch(), the situation is
probably even worse than how I described it in the changelog! In fact,
still in the scenario introduced there, when we actually get to call
schedule_cpu_switch() (during CPU_ONLINE), the pCPU --let it be pCPU
6-- is in the following state:
 - it's still formally in cpupool1, but it's scheduler is set to
   &ops, i.e., to Credit1,
 - idle_vcpu[6].sched_priv points to Credit2 vCPU data,
 - it is being switched to pool1, with ops == Credit2.

So, schedule_cpu_switch() itself will:
 - allocate a new Credit2 per vCPU data structure,
 - make idle_vcpu[6].sched_priv point such new structure,
 - try to deallocate the still existing Credit2 per vCPU data 
   structure with Credit1's (old_ops in the function) free_vdata hook!!
 - bad things (or so I expect!)

Note, in particular, the the next to last item: we'd be calling
Credit1's csched_free_vdata() to deallocate something that was
allocated with Credit2's csched_alloc_vdata()... Which looks like
another rather interesting bug to me (I haven't been able to trigger
it, most likely because the one shown in the changelog manifests first,
but I can try, if we're curios :-D).

So, in summary, this patch fixes two bugs, one that is showing and one
latent... Not bad! :-)

Fact is, there is an inconsistency between the pCPU's scheduler (set in
cpu_schedule_up(), to Credit1, in the example) and the pCPU's idle
vCPU's private data (Credit2 data in the example), which, if left in
place, could explode somewhere else, at some future point. In my
opinion, the best solution is to remove such inconsistency, and that's
what the patch tries to do.

Another idea could be trying to completely and fully initialize the
_proper_ scheduler for the pCPU in cpu_schedule_up(), but it's not
guaranteed that this will be doable for all schedulers; e.g., it is
impossible for Credit2. Maybe, but I'd have to try, since it's only the
inconsistency in the status of the idle vCPU that triggers the bug, we
can at least set that straight in cpu_schedule_up()... As I said, I'd
have to check, and I'm fine with trying, if it is deemed worthwhile.

Let me know what you think.

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.