
RE: Discussion on the delayed start of major frame with ARINC653 scheduler


  • To: "Choi, Anderson" <Anderson.Choi@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Nathan Studer <Nathan.Studer@xxxxxxxxxxxxxxx>
  • Date: Wed, 9 Jul 2025 14:21:03 +0000
  • Accept-language: en-US
  • Cc: "stewart@xxxxxxx" <stewart@xxxxxxx>, "Weber (US), Matthew L" <matthew.l.weber3@xxxxxxxxxx>, "Whitehead (US), Joshua C" <joshua.c.whitehead@xxxxxxxxxx>, Jeff Kubascik <Jeff.Kubascik@xxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 09 Jul 2025 14:21:21 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AdvmTXSjjVtbwyT/QSCLI/dKN0kF9QKhMe9A
  • Thread-topic: Discussion on the delayed start of major frame with ARINC653 scheduler

+Jeff

On 6/25/25 23:51, Choi, Anderson wrote:
> We are observing a slight delay in the start of major frame with the current
> implementation of ARINC653 scheduler, which breaks the determinism in the
> periodic execution of domains.
> 
> This seems to result from the logic where the variable "next_major_frame" is
> calculated based on the current timestamp "now" at a653sched_do_schedule().
> 

This is a known issue with the upstream version of the scheduler, so I
appreciate you providing an upstream-compatible patch.

> static void cf_check
> a653sched_do_schedule(
> <snip>
>     else if ( now >= sched_priv->next_major_frame )
>     {
>         /* time to enter a new major frame
>          * the first time this function is called, this will be true */
>         /* start with the first domain in the schedule */
>         sched_priv->sched_index = 0;
>         sched_priv->next_major_frame = now + sched_priv->major_frame;
>         sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
>     }
> 
> Therefore, the inherent delta between "now" and the previous
> "next_major_frame" is added to the next start of major frame represented by
> the variable "next_major_frame".
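
To make the mechanism concrete, here is a minimal standalone sketch (plain C, not Xen code). It assumes every scheduler invocation runs a fixed, hypothetical `latency` after the intended frame start, and the function names are made up for illustration. Re-basing on `now` accumulates that latency every frame, while advancing from the previous `next_major_frame` keeps frame starts on the original grid.

```c
#include <stdint.h>

typedef int64_t s_time_t; /* nanoseconds, mirroring Xen's signed time type */

/*
 * Behaviour described above: the next frame start is re-based on the
 * (late) wakeup time. Returns next_major_frame after `frames` wakeups.
 */
static s_time_t frame_start_rebased_on_now(s_time_t period, s_time_t latency,
                                           unsigned int frames)
{
    s_time_t next_major_frame = 0;

    for ( unsigned int i = 0; i < frames; i++ )
    {
        s_time_t now = next_major_frame + latency; /* scheduler runs late */

        next_major_frame = now + period;           /* latency is absorbed */
    }

    return next_major_frame;
}

/*
 * Alternative: advance from the previous deadline, so lateness of one
 * wakeup never moves later frame starts. Assumes period > 0.
 */
static s_time_t frame_start_on_grid(s_time_t period, s_time_t latency,
                                    unsigned int frames)
{
    s_time_t next_major_frame = 0;

    for ( unsigned int i = 0; i < frames; i++ )
    {
        s_time_t now = next_major_frame + latency; /* same late wakeup */

        while ( now >= next_major_frame )
            next_major_frame += period;            /* stay on the grid */
    }

    return next_major_frame;
}
```

With a 20 msec period and an assumed latency of 175628 ns, the first function ends up 3512560 ns past the 400 msec mark after 20 frames, while the second returns exactly 400 msec.
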
> 
> And I think the issue can be fixed with the following change to use
> "next_major_frame" as the base of calculation.
> 
> diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
> index 930361fa5c..15affad3a3 100644
> --- a/xen/common/sched/arinc653.c
> +++ b/xen/common/sched/arinc653.c
> @@ -534,8 +534,11 @@ a653sched_do_schedule(
>           * the first time this function is called, this will be true */
>          /* start with the first domain in the schedule */
>          sched_priv->sched_index = 0;
> -        sched_priv->next_major_frame = now + sched_priv->major_frame;
> -        sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
> +
> +        do {
> +            sched_priv->next_switch_time = sched_priv->next_major_frame + sched_priv->schedule[0].runtime;
> +            sched_priv->next_major_frame += sched_priv->major_frame;
> +        } while ((now >= sched_priv->next_major_frame) || (now >= sched_priv->next_switch_time));
>      }
>      else

I'm not sure this will work if the first minor frame is also missed (which can 
happen in some odd cases).  In that scenario, you need to iterate through the 
schedule after resyncing the expected next major frame.

Building off your changes, this should work:

-    if ( sched_priv->num_schedule_entries < 1 )
-        sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
-    else if ( now >= sched_priv->next_major_frame )
+    /* Switch to next major frame while handling potentially missed frames */
+    while ( now >= sched_priv->next_major_frame )
     {
-        /* time to enter a new major frame
-         * the first time this function is called, this will be true */
-        /* start with the first domain in the schedule */
         sched_priv->sched_index = 0;
-        sched_priv->next_major_frame = now + sched_priv->major_frame;
-        sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
-    }
-    else
-    {
-        while ( (now >= sched_priv->next_switch_time) &&
-                (sched_priv->sched_index < sched_priv->num_schedule_entries) )
+
+        if ( sched_priv->num_schedule_entries < 1 )
+        {
+            sched_priv->next_major_frame += DEFAULT_TIMESLICE;
+            sched_priv->next_switch_time = sched_priv->next_major_frame;
+        }
+        else
         {
-            /* time to switch to the next domain in this major frame */
-            sched_priv->sched_index++;
-            sched_priv->next_switch_time +=
-                sched_priv->schedule[sched_priv->sched_index].runtime;
+            sched_priv->next_switch_time = sched_priv->next_major_frame +
+                sched_priv->schedule[0].runtime;
+            sched_priv->next_major_frame += sched_priv->major_frame;
         }
     }
 
+    /* Switch minor frame or find correct minor frame after a miss */
+    while ( (now >= sched_priv->next_switch_time) &&
+            (sched_priv->sched_index < sched_priv->num_schedule_entries) )
+    {
+        sched_priv->sched_index++;
+        sched_priv->next_switch_time +=
+            sched_priv->schedule[sched_priv->sched_index].runtime;
+    }
+

Any chance you could give that a test and see if it fixes your issue?
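
If it helps to exercise the logic off-target first, the two loops can be modelled in user space. This is only a sketch under assumptions: `struct sched_model`, `model_do_schedule()`, `MAX_ENTRIES` and the `DEFAULT_TIMESLICE` value are hypothetical stand-ins for the scheduler's private data, times are plain integers in microseconds, and nothing beyond the two loops above is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ENTRIES       4      /* hypothetical limit for this model */
#define DEFAULT_TIMESLICE 10000  /* hypothetical value, in microseconds */

typedef int64_t s_time_t;

struct sched_model {
    /* one spare slot so schedule[sched_index] stays in bounds when the
     * index runs past the last configured entry */
    struct { s_time_t runtime; } schedule[MAX_ENTRIES + 1];
    unsigned int num_schedule_entries;
    unsigned int sched_index;
    s_time_t major_frame;
    s_time_t next_major_frame;
    s_time_t next_switch_time;
};

static void model_do_schedule(struct sched_model *p, s_time_t now)
{
    /* a zero-length major frame would make the first loop spin forever */
    assert( p->num_schedule_entries < 1 || p->major_frame > 0 );
    assert( p->num_schedule_entries <= MAX_ENTRIES );

    /* Switch to next major frame while handling potentially missed frames */
    while ( now >= p->next_major_frame )
    {
        p->sched_index = 0;

        if ( p->num_schedule_entries < 1 )
        {
            p->next_major_frame += DEFAULT_TIMESLICE;
            p->next_switch_time = p->next_major_frame;
        }
        else
        {
            p->next_switch_time = p->next_major_frame +
                p->schedule[0].runtime;
            p->next_major_frame += p->major_frame;
        }
    }

    /* Switch minor frame or find correct minor frame after a miss */
    while ( (now >= p->next_switch_time) &&
            (p->sched_index < p->num_schedule_entries) )
    {
        p->sched_index++;
        p->next_switch_time += p->schedule[p->sched_index].runtime;
    }
}
```

For example, with two 10000 us entries and a 20000 us major frame, calling this at `now = 20200` (200 us late) leaves `next_major_frame` at 40000 rather than 40200, and a call at `now = 75000` (a whole frame and the first minor frame missed) lands on `sched_index == 1` with `next_major_frame == 80000`.
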

> 
> Can I get your advice on this subject?
> 
> Should you have any questions about the description, please let me know.
> 
> Here are the details to reproduce the issue on QEMUARM64.

I assume you are also running on hardware, but just a warning that testing
real-time scheduling on QEMU can be a frustrating experience.

     Nate

> 
> [Xen version]
> - 4.19 (43aeacff8695850ee26ee038159b1f885e69fdf)
> 
> [ARINC653 pool configuration]
> - name="Pool-arinc"
> - sched="arinc653"
> - cpus=["3"]
> 
> [Dom1 configuration]
> - name = "dom1"
> - kernel = "/etc/xen/dom1/Image"
> - ramdisk = "/etc/xen/dom1/guest.cpio.gz"
> - extra = "root=/dev/loop0 rw nohlt"
> - memory = 256
> - vcpus = 1
> - pool = "Pool-arinc"
> 
> [Major frame configuration]
> $ a653_sched -p Pool-arinc dom1:10 :10 //20 msec (Dom1 10 msec : Idle 10 msec)
> 
> [Collecting xentrace dump]
> $ xentrace -D -T 5 -e 0x2f000 /tmp/xentrace.bin
> 
> Parsed xentrace shows that dom1's runstate change from 'runnable' to 'running',
> which marks the start of a major frame, shifts slightly every period.
> Below are the first 21 traces after dom1 started running. With the given
> major frame of 20 msec, the 21st major frame should have started at
> 0.414553536 sec (0.014553536 + 20 msec * 20).
> However, it started running at 0.418066096 sec, a shift of about 3.5 msec,
> which will eventually grow large enough to wrap around the whole major frame
> (roughly after 120 periods).
> 
> 0.014553536 ---x d?v? runstate_change d1v0 runnable->running
> 0.034629712 ---x d?v? runstate_change d1v0 runnable->running
> 0.054771216 ---x d?v? runstate_change d1v0 runnable->running
> 0.075080608 -|-x d?v? runstate_change d1v0 runnable->running
> 0.095236544 ---x d?v? runstate_change d1v0 runnable->running
> 0.115390144 ---x d?v? runstate_change d1v0 runnable->running
> 0.135499040 ---x d?v? runstate_change d1v0 runnable->running
> 0.155614784 ---x d?v? runstate_change d1v0 runnable->running
> 0.175833744 ---x d?v? runstate_change d1v0 runnable->running
> 0.195887488 ---x d?v? runstate_change d1v0 runnable->running
> 0.216028656 ---x d?v? runstate_change d1v0 runnable->running
> 0.236182032 ---x d?v? runstate_change d1v0 runnable->running
> 0.256302368 ---x d?v? runstate_change d1v0 runnable->running
> 0.276457472 ---x d?v? runstate_change d1v0 runnable->running
> 0.296649296 ---x d?v? runstate_change d1v0 runnable->running
> 0.316753856 ---x d?v? runstate_change d1v0 runnable->running
> 0.336909120 ---x d?v? runstate_change d1v0 runnable->running
> 0.357329936 ---x d?v? runstate_change d1v0 runnable->running
> 0.377691744 |||x d?v? runstate_change d1v0 runnable->running
> 0.397747008 |||x d?v? runstate_change d1v0 runnable->running
> 0.418066096 -||x d?v? runstate_change d1v0 runnable->running
> 
> However, with the suggested change applied, we obtain the deterministic
> behavior of the arinc653 scheduler, where every major frame starts 20 msec apart.
> 
> 0.022110320 ---x d?v? runstate_change d1v0 runnable->running
> 0.041985952 ---x d?v? runstate_change d1v0 runnable->running
> 0.062345824 ---x d?v? runstate_change d1v0 runnable->running
> 0.082145808 ---x d?v? runstate_change d1v0 runnable->running
> 0.101957360 ---x d?v? runstate_change d1v0 runnable->running
> 0.122223776 ---x d?v? runstate_change d1v0 runnable->running
> 0.142334352 ---x d?v? runstate_change d1v0 runnable->running
> 0.162126256 ---x d?v? runstate_change d1v0 runnable->running
> 0.182261984 ---x d?v? runstate_change d1v0 runnable->running
> 0.202001840 |--x d?v? runstate_change d1v0 runnable->running
> 0.222070800 ---x d?v? runstate_change d1v0 runnable->running
> 0.242137680 ---x d?v? runstate_change d1v0 runnable->running
> 0.262313040 ---x d?v? runstate_change d1v0 runnable->running
> 0.282178128 ---x d?v? runstate_change d1v0 runnable->running
> 0.302071328 ---x d?v? runstate_change d1v0 runnable->running
> 0.321969216 ---x d?v? runstate_change d1v0 runnable->running
> 0.341958464 ---x d?v? runstate_change d1v0 runnable->running
> 0.362147136 ---x d?v? runstate_change d1v0 runnable->running
> 0.382085296 ---x d?v? runstate_change d1v0 runnable->running
> 0.402076560 ---x d?v? runstate_change d1v0 runnable->running
> 0.421985456 ---x d?v? runstate_change d1v0 runnable->running
> 
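
The drift figures for the two traces can be checked with a few lines of integer arithmetic. This is just a sketch: the helper names are hypothetical, timestamps are the first and 21st trace entries converted to nanoseconds, and the wrap estimate assumes the drift keeps accumulating at the same average rate.

```c
#include <stdint.h>

/*
 * Offset of the last observed frame start from where it should be if
 * frames were spaced exactly `period_ns` apart, `periods` frames after
 * the first observed start. Positive means late.
 */
static int64_t drift_ns(int64_t first_ns, int64_t last_ns,
                        int64_t periods, int64_t period_ns)
{
    int64_t expected_ns = first_ns + periods * period_ns;

    return last_ns - expected_ns;
}

/*
 * Number of periods until the accumulated drift spans one whole major
 * frame, assuming a constant average drift per period. Returns -1 when
 * the trace shows no positive accumulating drift.
 */
static int64_t periods_until_wrap(int64_t period_ns, int64_t drift,
                                  int64_t periods)
{
    int64_t per_period_ns = drift / periods;

    if ( per_period_ns <= 0 )
        return -1;

    return period_ns / per_period_ns;
}
```

For the first trace, `drift_ns(14553536, 418066096, 20, 20000000)` is 3512560 ns (about 3.5 msec late), which at that rate wraps a 20 msec frame after a little over 110 periods. For the second trace, `drift_ns(22110320, 421985456, 20, 20000000)` is -124864 ns, so the 21st start is slightly early rather than progressively later.
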
> Thanks,
> Anderson



 

