
Re: [Xen-devel] dom0less + sched=null => broken in staging



Hi,

On 8/13/19 7:43 PM, Julien Grall wrote:
> 
> 
> On 8/13/19 6:34 PM, Dario Faggioli wrote:
>> On Tue, 2019-08-13 at 17:52 +0100, Julien Grall wrote:
>>> Hi Dario,
>>>
>> Hello!
>>
>>> On 8/13/19 4:27 PM, Dario Faggioli wrote:
>>>> On Fri, 2019-08-09 at 11:30 -0700, Stefano Stabellini wrote:
>>>>>
>>>> In my (x86 and "dom0full") testbox, this seems to come from
>>>> domain_unpause_by_systemcontroller(dom0) called by
>>>> xen/arch/x86/setup.c:init_done(), at the very end of __start_xen().
>>>>
>>>> I don't know if domain construction in an ARM dom0less system works
>>>> similarly, though. What we want, is someone calling either
>>>> vcpu_wake()
>>>> or vcpu_unpause(), after having cleared _VPF_down from pause_flags.
>>>
>>> Looking at create_domUs(), there is a call to
>>> domain_unpause_by_systemcontroller() for each domU.
>>>
>> Yes, I saw that. And I've seen the one done on dom0, at the end of
>> xen/arch/arm/setup.c:start_xen(), as well.
>>
>> Also, both construct_dom0() (still from start_xen()) and
>> construct_domU() (called from create_domUs()) call construct_domain(),
>> which does clear_bit(_VPF_down), setting the domain to online.
>>
>> So, unless the flag gets cleared again, or something else happens that
>> makes the vCPU(s) fail the vcpu_runnable() check in
>> domain_unpause()->vcpu_wake(), I don't see why the wakeup that lets the
>> null scheduler start scheduling the vCPU doesn't happen... as it
>> instead does on x86 or !dom0less ARM (because, as far as I've
>> understood, it's only dom0less that doesn't work, is this correct?)
> 
> Yes, I quickly tried to use the NULL scheduler with just dom0 and it boots.
> 
> Interestingly, I can't see the log:
> 
> (XEN) Freed 328kB init memory.
> 
> This is called as part of init_done before CPU0 goes into the idle loop.
> 
> Adding more debugging, it is getting stuck when calling
> domain_unpause_by_systemcontroller() for dom0. Specifically, in
> vcpu_wake() on dom0v0.
> 
> The loop to assign a pCPU in null_vcpu_wake() turns into an
> infinite loop: it keeps trying to pick CPU0 for dom0v0, but CPU0 is
> already used by dom1v0. So the problem is in pick_cpu() or the data
> used by it.
> 
> It feels to me this is an affinity problem. Note that I didn't request 
> to pin dom0 vCPUs.

I did a bit more digging. As I pointed out before, pick_cpu() is
returning pCPU0. This is because per_cpu(npc, 0) == NULL.

per_cpu(npc, 0) will be set by vcpu_assign(). AFAIU, the function is
called during scheduling. As CPU0 is not able to serve softirqs until
it finishes initializing, per_cpu(npc, 0) will still be NULL when we
try to wake dom0v0.
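
To make that concrete, here is a hand-written sketch of the logic as I
understand it. This is a simplified model, not the actual sched_null.c
code: pick_cpu_sketch() and the flattened npc[] array are mine, purely
for illustration, and the affinity balancing steps are omitted.

#include <stddef.h>

#define NR_CPUS 4

struct vcpu {
    unsigned int processor;        /* pCPU the vCPU was created on */
};

/* Stands in for the per-CPU 'npc' variable that vcpu_assign() sets
 * from the scheduling path. NULL == "this pCPU is free". */
static struct vcpu *npc[NR_CPUS];

static unsigned int pick_cpu_sketch(const struct vcpu *v)
{
    unsigned int cpu = v->processor;

    /*
     * During boot, CPU0 cannot service SCHEDULE_SOFTIRQ until
     * start_xen() completes, so vcpu_assign() has never run and
     * npc[0] is still NULL, even though dom1v0 is de facto sitting
     * on pCPU0. This check therefore keeps returning 0 for dom0v0,
     * and the assignment loop in null_vcpu_wake() spins forever.
     */
    if ( npc[cpu] == NULL )
        return cpu;

    /* Otherwise, fall back to any free pCPU (affinity omitted). */
    for ( cpu = 0; cpu < NR_CPUS; cpu++ )
        if ( npc[cpu] == NULL )
            return cpu;

    return NR_CPUS;                /* nothing free: keep waiting */
}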

My knowledge of the scheduler is pretty limited, so I will leave it to
Dario and George to suggest a fix :).

On a side note, I have tried to hack the Dom0 vCPU allocation a bit to
see if I could help you reproduce it on x86. But I stumbled across
another error while bringing up d0v1:

(XEN) Assertion 'lock == per_cpu(schedule_data, v->processor).schedule_lock' failed at /home/julieng/works/xen/xen/include/xen/sched-if.h:108
(XEN) ----[ Xen-4.13-unstable  arm64  debug=y   Not tainted ]----
(XEN) CPU:    0

[...]

(XEN) Xen call trace:
(XEN)    [<00000000002251b8>] vcpu_wake+0x550/0x554 (PC)
(XEN)    [<0000000000224da4>] vcpu_wake+0x13c/0x554 (LR)
(XEN)    [<0000000000261624>] vpsci.c#do_common_cpu_on+0x134/0x1c4
(XEN)    [<0000000000261a04>] do_vpsci_0_2_call+0x294/0x3d0
(XEN)    [<00000000002612c0>] vsmc.c#vsmccc_handle_call+0x3a0/0x4b0
(XEN)    [<0000000000261484>] do_trap_hvc_smccc+0x28/0x4c
(XEN)    [<0000000000257efc>] do_trap_guest_sync+0x508/0x5d8
(XEN)    [<000000000026542c>] entry.o#guest_sync_slowpath+0x9c/0xcc
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'lock == per_cpu(schedule_data, v->processor).schedule_lock' failed at /home/julieng/works/xen/xen/include/xen/sched-if.h:108
(XEN) ****************************************
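
For what it's worth, that assertion is the re-check in the
schedule-lock helpers in xen/include/xen/sched-if.h: the lock
protecting a vCPU is the schedule_lock of the pCPU it is currently on,
and v->processor can change while we are waiting for the lock. The
helpers look roughly like this (hand-expanded from the
sched_lock()/sched_unlock() macros, so take the exact details with a
grain of salt):

static inline spinlock_t *vcpu_schedule_lock_sketch(struct vcpu *v)
{
    for ( ; ; )
    {
        spinlock_t *lock =
            per_cpu(schedule_data, v->processor).schedule_lock;

        spin_lock(lock);
        /* Did v->processor (or that pCPU's lock) change meanwhile? */
        if ( lock == per_cpu(schedule_data, v->processor).schedule_lock )
            return lock;
        spin_unlock(lock);
    }
}

static inline void vcpu_schedule_unlock_sketch(spinlock_t *lock,
                                               struct vcpu *v)
{
    /* This is the check that fires: by unlock time, the lock we hold
     * is no longer the schedule_lock of v->processor, i.e. the vCPU
     * moved (or the pCPU's lock was switched) while we held it. */
    ASSERT(lock == per_cpu(schedule_data, v->processor).schedule_lock);
    spin_unlock(lock);
}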

All I did was create every vCPU on pCPU 0 with the following code:

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 4c8404155a..ce92e3841f 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2004,7 +2004,7 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     for ( i = 1, cpu = 0; i < d->max_vcpus; i++ )
     {
         cpu = cpumask_cycle(cpu, &cpu_online_map);
-        if ( vcpu_create(d, i, cpu) == NULL )
+        if ( vcpu_create(d, i, 0) == NULL )
         {
             printk("Failed to allocate dom0 vcpu %d on pcpu %d\n", i, cpu);
             break;

I am not entirely sure whether the problem is related.

Anyway, I have written the following patch to reproduce the issue on
Arm without dom0less:

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 4c8404155a..20246ae475 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2004,7 +2004,7 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     for ( i = 1, cpu = 0; i < d->max_vcpus; i++ )
     {
         cpu = cpumask_cycle(cpu, &cpu_online_map);
-        if ( vcpu_create(d, i, cpu) == NULL )
+        if ( vcpu_create(d, i, 0) == NULL )
         {
             printk("Failed to allocate dom0 vcpu %d on pcpu %d\n", i, cpu);
             break;
@@ -2019,6 +2019,10 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     v->is_initialised = 1;
     clear_bit(_VPF_down, &v->pause_flags);
 
+    v = d->vcpu[1];
+    v->is_initialised = 1;
+    clear_bit(_VPF_down, &v->pause_flags);
+
     return 0;
 }
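
To spell out what the patch does: the first hunk creates every dom0
vCPU on pCPU0, and the second marks d0v1 as initialised and online at
domain-building time. So when domain_unpause_by_systemcontroller()
runs at the end of start_xen(), two online vCPUs are competing for
pCPU0 before CPU0 can serve softirqs, which should be the same
situation dom0less creates with dom0v0 and dom1v0.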
 
This could easily be adapted for x86 so you can reproduce it there too :).

Cheers,

-- 
Julien Grall
