
Re: [Xen-devel] [Patch] Call sched_destroy_domain before cpupool_rm_domain.


  • To: Nate Studer <nate.studer@xxxxxxxxxxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
  • Date: Tue, 05 Nov 2013 06:59:14 +0100
  • Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Dario Faggioli <dario.faggioli@xxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, xen-devel@xxxxxxxxxxxxx
  • Delivery-date: Tue, 05 Nov 2013 05:59:52 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 04.11.2013 16:22, Nate Studer wrote:
On 11/4/2013 4:58 AM, Juergen Gross wrote:
On 04.11.2013 10:26, Dario Faggioli wrote:
On lun, 2013-11-04 at 07:30 +0100, Juergen Gross wrote:
On 04.11.2013 04:03, Nathan Studer wrote:
From: Nathan Studer <nate.studer@xxxxxxxxxxxxxxx>

The domain destruction code removes a domain from its cpupool
before attempting to destroy its scheduler information.  Since
the scheduler framework uses the domain's cpupool information
to decide which scheduler ops to use, this results in the
wrong scheduler's destroy domain function being called
when the cpupool scheduler and the initial scheduler are
different.

Correct this by destroying the domain's scheduling information
before removing it from the pool.

Signed-off-by: Nathan Studer <nate.studer@xxxxxxxxxxxxxxx>
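
The ordering problem described in the patch can be illustrated with a minimal, self-contained sketch. It is not the Xen code: the types, the `sketch_` prefixed functions, and the `destroyed` counter are hypothetical stand-ins, and the fallback to a default scheduler when the domain has no pool is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the hypervisor's types. */
struct scheduler {
    const char *name;
    int destroyed;              /* number of destroy calls routed here */
};

struct cpupool {
    struct scheduler *sched;    /* scheduler used by this pool */
};

struct domain {
    struct cpupool *cpupool;    /* NULL once removed from its pool */
};

/* Assumed fallback: the initial scheduler, used when no pool is set. */
struct scheduler default_sched = { "initial", 0 };

/* Choose the scheduler from the domain's pool, else the default. */
struct scheduler *sketch_dom_scheduler(const struct domain *d)
{
    return d->cpupool ? d->cpupool->sched : &default_sched;
}

void sketch_sched_destroy_domain(struct domain *d)
{
    assert(d != NULL);
    sketch_dom_scheduler(d)->destroyed++;
}

void sketch_cpupool_rm_domain(struct domain *d)
{
    assert(d != NULL);
    d->cpupool = NULL;
}

/* Old order: the pool is gone before the scheduler lookup happens. */
void sketch_destroy_old_order(struct domain *d)
{
    sketch_cpupool_rm_domain(d);
    sketch_sched_destroy_domain(d);   /* routed to default_sched */
}

/* Patched order: the lookup still sees the pool's scheduler. */
void sketch_destroy_new_order(struct domain *d)
{
    sketch_sched_destroy_domain(d);   /* routed to the pool's scheduler */
    sketch_cpupool_rm_domain(d);
}
```

With a pool whose scheduler differs from `default_sched`, the old order increments the counter of `default_sched`, while the patched order increments the counter of the pool's own scheduler.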

Reviewed-by: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>

I think this is a candidate for backports too, isn't it?

Nathan, what was happening without this patch? Are you able to quickly
figure out which previous Xen versions suffer from the same bug?

Various things:

If I used the credit scheduler in Pool-0 and the arinc653 scheduler in
the other pool, it would:
1.  Hit a BUG_ON in the arinc653 scheduler.
2.  Hit an assert in the scheduling framework code.
3.  Or crash in the credit scheduler's csched_free_domdata function.

The latter clued me in that the wrong scheduler's destroy function was somehow
being called.

If I used the credit2 scheduler in the other pool, I would only ever see the 
latter.

Similarly, if I used the sedf scheduler in the other pool, I would only ever see
the latter.  However when using the sedf scheduler I would have to create and
destroy the domain twice, instead of just once.


In theory this bug has been present since 4.1.

OTOH it will be hit only with arinc653 scheduler in a cpupool other than
Pool-0. And I don't see how this is being supported by arinc653 today (pick_cpu
will always return 0).

Correct, the arinc653 scheduler currently does not work with cpupools.  We are
working on remedying that though, which is how I ran into this.  I would have
just wrapped this patch in with the upcoming arinc653 ones, if I had not run
into the same issue with the other schedulers.


All other schedulers will just call xfree() for the domain-specific data (and
maybe update some statistics data, which is not critical).

The credit and credit2 schedulers do a bit more than that in their free_domdata
functions.

Sorry, I did not get enough sleep on the weekend ;-)

I checked only the 4.1 and 4.2 trees. There, only an xfree of the domain data is done.


The credit scheduler frees the node_affinity_cpumask contained in the domain
data and the credit2 scheduler deletes a list element contained in the domain
data.  Since with this bug they are accessing structures that do not belong to
them, bad things happen.
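
Why a mismatched free is harmful can be sketched as follows. This is not the Xen code: the two structs are hypothetical, and the `owner` tag is a defensive check added purely for illustration so that the mismatch is reported instead of corrupting memory. It only assumes, as described above, that one scheduler's domain data owns a separately allocated member that its free function releases.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner tag, stored first in every private structure. */
enum owner { OWNER_A, OWNER_B };

/* Scheduler A's domain data owns a second allocation (like a mask). */
struct a_dom {
    enum owner owner;
    int *mask;
};

/* Scheduler B's domain data has a different layout. */
struct b_dom {
    enum owner owner;
    int weight;
};

/* Read the tag; valid because it is the first member of both structs. */
static enum owner owner_of(const void *data)
{
    assert(data != NULL);
    return *(const enum owner *)data;
}

void *a_alloc_domdata(void)
{
    struct a_dom *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->owner = OWNER_A;
    p->mask = calloc(4, sizeof *p->mask);
    return p;
}

void *b_alloc_domdata(void)
{
    struct b_dom *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->owner = OWNER_B;
    p->weight = 256;
    return p;
}

/* Returns 0 on success, -1 if the data belongs to another scheduler.
 * Without the tag check, A would treat B's fields as its own and
 * pass a bogus pointer to free(). */
int a_free_domdata(void *data)
{
    if (owner_of(data) != OWNER_A)
        return -1;
    struct a_dom *p = data;
    free(p->mask);
    free(p);
    return 0;
}

int b_free_domdata(void *data)
{
    if (owner_of(data) != OWNER_B)
        return -1;
    free(data);
    return 0;
}
```

In this sketch, handing scheduler B's data to `a_free_domdata` is rejected with -1; in the situation discussed in the thread there is no such check, so the wrong free function operates on a structure with a different layout.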

So the patch would be subject to a 4.3 backport, I think.


Juergen

--
Juergen Gross                 Principal Developer Operating Systems
PBG PDG ES&S SWE OS6                   Telephone: +49 (0) 89 62060 2932
Fujitsu                                   e-mail: juergen.gross@xxxxxxxxxxxxxx
Mies-van-der-Rohe-Str. 8                Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
