[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen: sched: fix (ACPI S3) resume with cpupools with different schedulers.



>>> On 13.11.15 at 13:30, <dario.faggioli@xxxxxxxxxx> wrote:
> On Fri, 2015-11-13 at 03:41 -0700, Jan Beulich wrote:
>> > > > On 13.11.15 at 11:08, <dario.faggioli@xxxxxxxxxx> wrote:
>> > Fix things by properly deallocating scheduler specific
>> > data of the pCPU's pool scheduler during pCPU teardown,
>> > and re-allocating them --always for &ops-- during pCPU
>> > bringup.
>> 
>> Avoiding this was done for a reason iirc: What if one such allocation
>> fails (e.g. due to heap fragmentation resulting from the allocation
>> order not being the exact inverse of the freeing order)?
>> 
> I don't think I'm getting the point.
> 
> cpu_schedule_up() allocates, with the alloc_pdata hook, stuff that was
> freed by cpu_schedule_down (with the free_pdata hook) already.

If there is no change in the allocation pattern (i.e. including the
order of frees and allocs), then I'm sorry for the noise. I understood
the description of the change differently then.

> It actually allocates two things, the other being the entire idle vCPU
> of the pCPU, but that is only if idle_vcpu[cpu] is NULL, which should
> not be happening during suspend/resume.
> 
> In any case, if one of these allocations fails, it already returns
> ENOMEM. What happens then is, of course, caller's call. If it happens
> during boot for pCPU 0, we BUG(). If it happens with non-boot pCPUs
> (either at boot, during hotplug or during suspend/resume), we fail the
> bringup of that pCPU at the CPU_UP_PREPARE phase.

Except that after resume the system should get into the same state
it was before suspend, i.e. in particular same number of CPUs. Imo
there simply shouldn't be any allocations on the resume path -
everything should be possible to be picked up as it was left during
suspend. Along the lines of e.g. the handling of affinities we once
had (and maybe still have), getting saved away and restored instead
of trashed.

> So, in summary, this patch fixes two bugs, one that is showing and one
> latent... Not bad! :-)

Agreed.

> Fact is, there is an inconsistency between the pCPU's scheduler (set in
> cpu_schedule_up(), to Credit1, in the example) and the pCPU's idle
> vCPU's private data (Credit2 data in the example), which, if left in
> place, could explode somewhere else, at some future point. In my
> opinion, the best solution is to remove such inconsistency, and that's
> what the patch tries to do.

Removing the inconsistency is certainly a requirement. The question
just is how to do so without risking system health in other ways.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.