
Re: [PATCH-for-4.17] xen/sched: fix race in RTDS scheduler


  • To: Juergen Gross <jgross@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Dario Faggioli <dfaggioli@xxxxxxxx>
  • Date: Fri, 21 Oct 2022 09:23:26 +0000
  • Accept-language: en-US
  • Cc: "Henry.Wang@xxxxxxx" <Henry.Wang@xxxxxxx>, "george.dunlap@xxxxxxxxxx" <george.dunlap@xxxxxxxxxx>, "mengxu@xxxxxxxxxxxxx" <mengxu@xxxxxxxxxxxxx>
  • Delivery-date: Fri, 21 Oct 2022 09:23:31 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHY5RPdyv7HUa7jcEWPdT3S+dSn164Yk2EA
  • Thread-topic: [PATCH-for-4.17] xen/sched: fix race in RTDS scheduler

On Fri, 2022-10-21 at 08:10 +0200, Juergen Gross wrote:
> When a domain gets paused the unit runnable state can change to "not
> runnable" without the scheduling lock being involved. This means that
> a specific scheduler isn't involved in this change of runnable state.
> 
> In the RTDS scheduler this can result in an inconsistency in case a
> unit is losing its "runnable" capability while the RTDS scheduler's
> scheduling function is active. RTDS will remove the unit from the run
> queue, but doesn't do so for the replenish queue, leading to hitting
> an ASSERT() in replq_insert() later when the domain is unpaused
> again.
> 
> Fix that by removing the unit from the replenish queue as well in this
> case.
> 
Ah, ok... So, all was fine as long as what could happen during
rt_schedule() was "just" that the currently running task is not only
descheduled, but has also become !runnable.

In fact, in this case, the unit itself is not in the runq, but it can
be in the replq. However, since it still has the RTDS_scheduled flag
set, either:
1) we reach rt_context_saved(), which removes it from the replq, before
   any replq_insert();
2) rt_unit_wake() is called, but due to RTDS_scheduled, it may only do 
   replq_reinsert(), which is fine with the unit being already there.

However, what can also happen in rt_schedule() is that we remove from
the runq a unit that was not running, and hence does not have the
RTDS_scheduled flag set. In which case, rt_context_saved() doesn't do
anything to it (of course!). And as soon as rt_unit_wake() happens, it
does replq_insert(), which is not fine with finding the replenishment
event in the queue already.
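
To make the two cases above concrete, here is a minimal toy model in C.
It is only a sketch and not the Xen code: struct toy_unit, its fields and
all toy_*() helpers are hypothetical names made up for illustration, the
queues are reduced to boolean flags, and the real ASSERT() in
replq_insert() is modelled as a "false" return value.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an RTDS unit; all names here are illustrative only. */
struct toy_unit {
    bool on_runq;    /* queued in the run queue */
    bool on_replq;   /* has a replenishment event queued */
    bool scheduled;  /* models the RTDS_scheduled flag */
    bool runnable;   /* false while the domain is paused */
};

/* Returns false where the real replq_insert() would hit its ASSERT(). */
static bool toy_replq_insert(struct toy_unit *u)
{
    if (u->on_replq)
        return false;
    u->on_replq = true;
    return true;
}

static void toy_replq_remove(struct toy_unit *u)
{
    u->on_replq = false;
}

/* Scheduling function finds a queued unit that is no longer runnable. */
static void toy_schedule_drop(struct toy_unit *u, bool with_fix)
{
    if (u->on_runq && !u->runnable) {
        u->on_runq = false;
        if (with_fix)
            toy_replq_remove(u);  /* the idea of the fix under discussion */
    }
}

/* Context-saved step: only touches units that were actually running. */
static void toy_context_saved(struct toy_unit *u)
{
    if (u->scheduled) {
        u->scheduled = false;
        if (!u->runnable)
            toy_replq_remove(u);
    }
}

/* Wakeup after unpause: a plain insert unless the unit is still scheduled. */
static bool toy_unit_wake(struct toy_unit *u)
{
    u->runnable = true;
    if (u->scheduled) {
        u->on_replq = true;  /* "reinsert" tolerates an existing entry */
        return true;
    }
    u->on_runq = true;
    return toy_replq_insert(u);
}

/* Pause the domain, let the scheduler run, then unpause and wake the unit. */
static bool toy_pause_then_wake(bool was_running, bool with_fix)
{
    struct toy_unit u = {
        .on_runq = !was_running, .on_replq = true,
        .scheduled = was_running, .runnable = true,
    };

    u.runnable = false;  /* paused without the scheduling lock involved */
    toy_schedule_drop(&u, with_fix);
    toy_context_saved(&u);
    return toy_unit_wake(&u);
}
```

In this model toy_pause_then_wake(true, false) succeeds, since the
context-saved step cleans up the replq for a unit that was running, while
toy_pause_then_wake(false, false) fails in the insert, which is the case
the patch addresses. With with_fix set, both variants succeed.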

So, yes... And good catch! :-P


> Fixes: 7c7b407e7772 ("xen/sched: introduce unit_runnable_state()")
> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>
Acked-by: Dario Faggioli <dfaggioli@xxxxxxxx>

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
