
Re: [Xen-devel] [xen-unstable test] 145796: tolerable FAIL - PUSHED



Hi Dario,

Apologies for the late answer.

On 22/01/2020 03:40, Dario Faggioli wrote:
On Fri, 2020-01-10 at 18:24 +0000, Julien Grall wrote:
Hi all,

Hi Julien,

I was looking at this, and I have a couple of questions...

On 08/01/2020 23:14, Julien Grall wrote:
On Wed, 8 Jan 2020 at 21:40, osstest service owner
<osstest-admin@xxxxxxxxxxxxxx> wrote:
****************************************
Jan  8 15:02:26.943794 (XEN) Panic on CPU 1:
Jan  8 15:02:26.945872 (XEN) Assertion '!unit_on_replq(svc)' failed at sched_rt.c:586
Jan  8 15:02:26.951492 (XEN)
****************************************

So I managed to reproduce it on Arm by hacking the hypercall path to
call:

domain_pause_nosync(current->domain);
domain_unpause(current->domain);

With a debug build and a 2-vCPU dom0, the crash happens within a few
seconds. When the unit is not scheduled, rt_unit_wake() expects the
unit to be in none of the queues.

The interaction is as follows:

CPU0                                | CPU1
                                    |
do_domain_pause()                   |
  -> atomic_inc(&d->pause_count)    |
  -> vcpu_sleep_nosync(vCPU A)      | schedule()
                                    |   -> Lock
                                    |   -> rt_schedule()
                                    |      -> snext = runq_pick(...)
                                    |      /* return unit A (aka vCPU A) */
                                    |      /* Unit is not runnable */
                                    |      -> Remove from the q
                                    |   [....]
                                    |   -> Lock
     -> Lock                        |
     -> rt_unit_sleep()             |
        /* Unit not scheduled */    |
        /* Nothing to do */         |

Thanks a lot for the analysis. As said above, just a few questions, to
be sure I'm understanding properly what is happening.

You have a 2 vCPUs dom0, and how many other vCPUs from other domains?
Or do you only have those 2 dom0 vCPUs and you are actually pausing
dom0?

Only dom0 with 2 vCPUs is running. On every hypercall, it will try to pause/unpause itself. This is to roughly match the behavior of the Arm guest atomic helpers.


In general, what is running (I mean which vcpu) on CPU0, when the
domain_pause() happens? And what is running on CPU1 when schedule()
happens?

If you just have the 2 dom0's vCPUs around (and we call them vCPU A and
vCPU B), the only case for which I can imagine runq_pick() returning A
on CPU1 would be if CPU0 would be running vCPU B (and invoked the
hypercall from it) and CPU1 was idle... is this the case?

This is indeed the case. The schedule() on CPU1 happened because vCPU A was woken up (e.g. an interrupt was received and injected into the vCPU).


When schedule() grabs the lock first (as shown above), the unit will
only be removed from the Q. However, when vcpu_sleep_nosync() grabs the
lock first and the unit was not scheduled, rt_unit_sleep() will remove
the unit from two queues (runQ/depleteQ and replenishQ).

So I think we want schedule() to remove the unit from the 2 queues if
it is not runnable. Any opinions?
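
To make the asymmetry described above concrete, here is a minimal toy model in C. It is not Xen code: `struct toy_unit` and all `toy_*` functions are hypothetical names, and two booleans stand in for the real queues. It also assumes the sleep path does nothing once the unit is already off the runQ, as the "Nothing to do" step in the diagram suggests.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a scheduling unit: flags model queue membership. */
struct toy_unit {
    bool on_runq;   /* models membership of runQ/depleteQ */
    bool on_replq;  /* models membership of replenishQ */
    bool runnable;  /* false once pause_count has been raised */
};

/* Models the schedule() path as described in the thread:
 * a non-runnable unit is removed from the runQ only. */
static void toy_schedule_dequeue(struct toy_unit *u)
{
    if (!u->runnable)
        u->on_runq = false;
}

/* Models the proposed change: remove from both queues. */
static void toy_schedule_dequeue_both(struct toy_unit *u)
{
    if (!u->runnable) {
        u->on_runq = false;
        u->on_replq = false;
    }
}

/* Models the sleep path for a unit that is not currently scheduled
 * (assumption: it only acts if the unit is still on the runQ). */
static void toy_unit_sleep(struct toy_unit *u)
{
    if (u->on_runq) {
        u->on_runq = false;
        u->on_replq = false;
    }
    /* else: nothing to do */
}

/* The condition the wake path expects for an unscheduled unit. */
static bool toy_wake_invariant_holds(const struct toy_unit *u)
{
    return !u->on_runq && !u->on_replq;
}
```

In this model, calling `toy_schedule_dequeue()` and then `toy_unit_sleep()` leaves `on_replq` set, which corresponds to the failed assertion on wake-up. The reverse order, or using `toy_schedule_dequeue_both()`, leaves the unit on neither queue.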

Mmm... that may work, but I'm not sure.

In fact, I'm starting to think that patch 7c7b407e777 "xen/sched:
introduce unit_runnable_state()", which added the 'q_remove(snext)' in
rt_schedule() might not be correct.

I have tested Xen before this commit and didn't manage to reproduce the crash. As soon as I applied the commit, it crashed quite quickly.


In fact, if runq_pick() returns a vCPU which is in the runqueue, but is
not runnable (e.g., because we're racing with do_domain_pause(), which
already set pause_count), it's not rt_schedule()'s job to dequeue it from
anything.

We probably should just ignore it and pick another vCPU, if any (and
idle otherwise). Then, after we release the lock, it will be
rt_unit_sleep(), called by do_domain_pause() in this case, that will
finish the job of properly dequeueing it...

Another strange thing is that, as the code looks right now, runq_pick()
returns the first unit in the runq (i.e., the one with the earliest
deadline), without checking whether it is runnable. Then, in
rt_schedule(), if the unit is not runnable, we (only partially, as you
figured out) dequeue it, and use idle instead, as our candidate for
being the next scheduled unit... But what if there were other
*runnable* units in the runqueue?
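
As a rough illustration of the alternative raised here (skip a non-runnable unit and keep looking, rather than dequeue it and fall back to idle), the following C sketch walks a deadline-ordered array. It is only a toy: `struct edf_entry` and `toy_pick_runnable()` are hypothetical names, the array stands in for the real runqueue, and a NULL return stands in for choosing idle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical runqueue entry; the array is assumed sorted by deadline. */
struct edf_entry {
    int deadline;   /* earlier value means higher priority */
    bool runnable;  /* false if, e.g., the domain is being paused */
};

/* Return the earliest-deadline entry that is runnable, without
 * modifying the queue. NULL means no candidate (i.e. run idle). */
static const struct edf_entry *
toy_pick_runnable(const struct edf_entry *runq, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (runq[i].runnable)
            return &runq[i];
    }
    return NULL;
}
```

The non-runnable head entry is left in place, so in this model the sleep path would remain the only place that dequeues it.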

My knowledge of the scheduler is quite limited. Maybe Meng would be able to answer this question?

Cheers,

--
Julien Grall
