Re: [Xen-devel] Core scheduling and cpu offlining



On 02/03/2020 14:05, Jürgen Groß wrote:
> On 02.03.20 14:51, Igor Druzhinin wrote:
>> On 02/03/2020 08:39, Jürgen Groß wrote:
>>> Hi Igor,
>>>
>>> could you please test whether the attached patch fixes your problem
>>> with cpu offlining?
>>
>> It's certainly better and doesn't cause a watchdog hit as before, but I ran
>> the following script to verify:
>>
>> while true
>> do
>>      for i in `seq 1 63`; do xen-hptool cpu-offline $i; done
>>      for i in `seq 1 63`; do xen-hptool cpu-online $i; done
>> done
>>
>> ... and got this a little bit later (note the same script works fine in
>> thread mode):
>>
>> (XEN) [  282.199134] Assertion '!preempt_count()' failed at preempt.c:36
>> (XEN) [  282.199142] ----[ Xen-4.13.0  x86_64  debug=y   Not tainted ]----
>> (XEN) [  282.199147] CPU:    0
>> (XEN) [  282.199150] RIP:    e008:[<ffff82d080228817>] ASSERT_NOT_IN_ATOMIC+0x1f/0x58
>> (XEN) [  282.199159] RFLAGS: 0000000000010202   CONTEXT: hypervisor
>> (XEN) [  282.199165] rax: ffff82d0805c7024   rbx: 0000000000000000   rcx: 0000000000000000
>> (XEN) [  282.199170] rdx: 0000000000000000   rsi: 00000000000026cd   rdi: ffff82d0804b3aac
>> (XEN) [  282.199175] rbp: ffff8300920bfe90   rsp: ffff8300920bfe90   r8:  ffff83042f21ffe0
>> (XEN) [  282.199180] r9:  0000000000000001   r10: 3333333333333333   r11: 0000000000000001
>> (XEN) [  282.199185] r12: ffff82d0805cdb00   r13: 0000000000000000   r14: ffff82d0805c7250
>> (XEN) [  282.199192] r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000003506e0
>> (XEN) [  282.199252] cr3: 00000000920b0000   cr2: 00007f0fff967000
>> (XEN) [  282.199256] fsb: 00007f0fff957740   gsb: ffff88821e000000   gss: 0000000000000000
>> (XEN) [  282.199261] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> (XEN) [  282.199268] Xen code around <ffff82d080228817> (ASSERT_NOT_IN_ATOMIC+0x1f/0x58):
>> (XEN) [  282.199272]  52 d1 83 3c 10 00 74 02 <0f> 0b 48 89 e0 48 0d ff 7f 00 00 8b 40 c1 48 c1
>> (XEN) [  282.199287] Xen stack trace from rsp=ffff8300920bfe90:
>> (XEN) [  282.199290]    ffff8300920bfea0 ffff82d080242680 ffff8300920bfef0 ffff82d08027a171
>> (XEN) [  282.199297]    ffff82d080242635 000000002b3bf000 ffff83042bb1f000 ffff83042bb1f000
>> (XEN) [  282.199304]    ffff83042bb1f000 0000000000000000 ffff82d0805ec620 0000000000000000
>> (XEN) [  282.199311]    ffff8300920bfd60 0000000000000000 00007ffc633001b0 0000000000305000
>> (XEN) [  282.199317]    ffff888212bd28a8 00007ffc633001b0 fffffffffffffff2 0000000000000286
>> (XEN) [  282.199324]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> (XEN) [  282.199329]    ffffffff8100146a 0000000000000000 0000000000000000 deadbeefdeadf00d
>> (XEN) [  282.199335]    0000010000000000 ffffffff8100146a 000000000000e033 0000000000000286
>> (XEN) [  282.199342]    ffffc90042977d70 000000000000e02b 0000000000000000 0000000000000000
>> (XEN) [  282.199347]    0000000000000000 0000000000000000 0000e01000000000 ffff83042bb1f000
>> (XEN) [  282.199353]    0000000000000000 00000000003506e0 0000000000000000 0000000000000000
>> (XEN) [  282.199359]    0000040000000000 0000000000000000
>> (XEN) [  282.199364] Xen call trace:
>> (XEN) [  282.199368]    [<ffff82d080228817>] R ASSERT_NOT_IN_ATOMIC+0x1f/0x58
>> (XEN) [  282.199375]    [<ffff82d080242680>] F do_softirq+0x9/0x15
>> (XEN) [  282.199381]    [<ffff82d08027a171>] F arch/x86/domain.c#idle_loop+0xb4/0xcb
>> (XEN) [  282.199384]
>> (XEN) [  282.438998]
>> (XEN) [  282.440991] ****************************************
>> (XEN) [  282.446459] Panic on CPU 0:
>> (XEN) [  282.449745] Assertion '!preempt_count()' failed at preempt.c:36
>> (XEN) [  282.456156] ****************************************
>> (XEN) [  282.461621]
> 
> Oh, indeed, there are rcu_read_unlock() calls missing (up to now
> relevant for ARM only).
> 
> Is this one better?

I think we're back at square one. For some reason it now throws watchdog
timeouts again. Note: I'm testing without any rcu_barrier-related patches
applied. Do you see the same issues when running the script above on your
machine?
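
For anyone trying to reproduce this, here is a minimal sketch of a bounded
variant of the offline/online loop quoted above. It is not the script that was
actually run. HPTOOL, FIRST_CPU, LAST_CPU, cycle_cpus and stress_cpus are
names introduced here purely for illustration, and the sketch assumes
xen-hptool exits non-zero when an operation fails. Counting rounds makes it
possible to report how long the loop survived before a watchdog hit or panic.

```shell
#!/bin/sh
# Hypothetical bounded variant of the offline/online stress loop.
# HPTOOL, FIRST_CPU and LAST_CPU are illustrative names, not Xen settings.
HPTOOL=${HPTOOL:-xen-hptool}
FIRST_CPU=${FIRST_CPU:-1}
LAST_CPU=${LAST_CPU:-63}

# One offline pass followed by one online pass over the CPU range.
# Returns non-zero at the first command that fails (assumes the tool
# signals failure through its exit status).
cycle_cpus() {
    for op in cpu-offline cpu-online; do
        i=$FIRST_CPU
        while [ "$i" -le "$LAST_CPU" ]; do
            "$HPTOOL" "$op" "$i" || { echo "failed: $op $i" >&2; return 1; }
            i=$((i + 1))
        done
    done
}

# Repeat cycle_cpus for the requested number of rounds, printing progress.
stress_cpus() {
    rounds=$1
    n=1
    while [ "$n" -le "$rounds" ]; do
        cycle_cpus || return 1
        echo "round $n ok"
        n=$((n + 1))
    done
}

# Example invocation (run as root in dom0): stress_cpus 100
```

The functions only wrap the two xen-hptool subcommands already used in the
quoted script; the CPU range defaults to the same 1..63.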

Igor
