Re: xen/evtchn: Dom0 boot hangs using preempt_rt kernel 5.10


  • To: Jürgen Groß <jgross@xxxxxxxx>
  • From: Luca Fancellu <luca.fancellu@xxxxxxx>
  • Date: Fri, 19 Mar 2021 11:50:39 +0000
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "jgrall@xxxxxxxxxx" <jgrall@xxxxxxxxxx>
  • Delivery-date: Fri, 19 Mar 2021 11:51:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Juergen,

Could you confirm whether these two series need to be backported to the
Linux 5.10 kernel before the BUG_ON(…) can be removed?

https://patchwork.kernel.org/project/xen-devel/cover/20201210192536.118432146@xxxxxxxxxxxxx/
https://patchwork.kernel.org/project/xen-devel/cover/20210306161833.4552-1-jgross@xxxxxxxx/

Thank you for your time.

Cheers,

Luca

> On 18 Mar 2021, at 08:47, Luca Fancellu <Luca.Fancellu@xxxxxxx> wrote:
> 
> Hi Juergen,
> 
> If you are willing to write the patch, I think it will be accepted faster.
> What about the BUG_ON(…) in evtchn_2l_unmask in events_2l.c?
> 
> Cheers,
> 
> Luca
> 
>> On 18 Mar 2021, at 07:54, Jürgen Groß <jgross@xxxxxxxx> wrote:
>> 
>> On 17.03.21 15:32, Luca Fancellu wrote:
>>> Hi all,
>>> we've been encountering an issue when using kernel 5.10 with preempt_rt 
>>> support for Dom0: during Dom0 boot, it hits a BUG_ON(!irqs_disabled()) 
>>> in the function evtchn_fifo_unmask, defined in events_fifo.c.
>>> This is the call stack:
>>> [   17.817018] ------------[ cut here ]------------
>>> [   17.817021] kernel BUG at drivers/xen/events/events_fifo.c:258!
>>> [   18.817079] Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
>>> [   18.817081] Modules linked in: bridge stp llc ipv6
>>> [   18.817086] CPU: 3 PID: 558 Comm: xenstored Not tainted 5.10.16-rt25-yocto-preempt-rt #1
>>> [   18.817089] Hardware name: Arm Neoverse N1 System Development Platform (DT)
>>> [   18.817090] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
>>> [   18.817092] pc : evtchn_fifo_unmask+0xd4/0xe0
>>> [   18.817099] lr : xen_irq_lateeoi_locked+0xec/0x200
>>> [   18.817102] sp : ffff8000123f3cc0
>>> [   18.817102] x29: ffff8000123f3cc0 x28: ffff0000427b1d80
>>> [   18.817104] x27: 0000000000000000 x26: 0000000000000000
>>> [   18.817106] x25: 0000000000000001 x24: 0000000000000001
>>> [   18.817107] x23: ffff0000412fc900 x22: 0000000000000004
>>> [   18.817109] x21: 0000000000000000 x20: ffff000042e06990
>>> [   18.817110] x19: ffff0000427b1d80 x18: 0000000000000010
>>> [   18.817112] x17: 0000000000000000 x16: 0000000000000000
>>> [   18.817113] x15: 0000000000000002 x14: 0000000000000001
>>> [   18.817114] x13: 000000000001a7e8 x12: 0000000000000040
>>> [   18.817116] x11: ffff000040400248 x10: ffff00004040024a
>>> [   18.817117] x9 : ffff800011be5200 x8 : ffff000040400270
>>> [   18.817119] x7 : 0000000000000000 x6 : 0000000000000003
>>> [   18.817120] x5 : 0000000000000000 x4 : ffff000040400308
>>> [   18.817121] x3 : ffff0000408a400c x2 : 0000000000000000
>>> [   18.817122] x1 : 0000000000000000 x0 : ffff0000408a4000
>>> [   18.817124] Call trace:
>>> [   18.817125]  evtchn_fifo_unmask+0xd4/0xe0
>>> [   18.817127]  xen_irq_lateeoi_locked+0xec/0x200
>>> [   18.817129]  xen_irq_lateeoi+0x48/0x64
>>> [   18.817131]  evtchn_write+0x124/0x15c
>>> [   18.817134]  vfs_write+0xf0/0x2cc
>>> [   18.817137]  ksys_write+0xe0/0x100
>>> [   18.817139]  __arm64_sys_write+0x20/0x30
>>> [   18.817142]  el0_svc_common.constprop.0+0x78/0x1a0
>>> [   18.817145]  do_el0_svc+0x24/0x90
>>> [   18.817147]  el0_svc+0x14/0x20
>>> [   18.817151]  el0_sync_handler+0x1a4/0x1b0
>>> [   18.817153]  el0_sync+0x174/0x180
>>> [   18.817156] Code: 52800120 b90023e6 97e6d104 17fffff0 (d4210000)
>>> [   18.817158] ---[ end trace 0000000000000002 ]---
>>> Our last tested kernel was 5.4, and our analysis points to the 
>>> introduction of the lateeoi framework (xen/events: add a new "late EOI" 
>>> evtchn framework) in conjunction with the preempt_rt patches (irqs kept 
>>> enabled between spinlock_t/rwlock_t _irqsave/_irqrestore operations) as 
>>> the root cause.
>>> Given that many modifications were made to the mask/unmask operations, 
>>> including a big one from Juergen Gross (xen/events: don't unmask an event 
>>> channel when an eoi is pending), is the BUG_ON(...) still needed?
>>> With the mentioned commit, every call to a mask/unmask operation is 
>>> protected by a spinlock, so I would like some feedback from those who 
>>> have more experience than me with this part of the code.
>> 
>> I think this BUG_ON() can be removed.
>> 
>> Are you planning to send a patch?
>> 
>> 
>> Juergen
>> <OpenPGP_0xB0DE9DD628BF132F.asc>
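
The root cause described in the quoted report can be illustrated with a small
userspace model. This is only a sketch and not kernel code: every name below
(model_lock_irqsave, model_unmask_check, model_lateeoi_path, and so on) is
hypothetical, and it assumes nothing about which lock the real lateeoi path
takes. It models one fact from the report: on a non-RT kernel the
_irqsave lock operations on spinlock_t/rwlock_t disable interrupts, while
with preempt_rt they leave interrupts enabled, so an irqs_disabled() check
made under the lock no longer holds.

```c
/* Userspace model only: none of these functions exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

static bool irqs_off;          /* stand-in for the CPU interrupt mask */
static bool model_preempt_rt;  /* hypothetical switch: true models preempt_rt */

static bool model_irqs_disabled(void)
{
    return irqs_off;
}

/* Models an _irqsave lock operation on a spinlock_t/rwlock_t. */
static bool model_lock_irqsave(void)
{
    bool flags = irqs_off;     /* remember the previous interrupt state */

    if (!model_preempt_rt)
        irqs_off = true;       /* non-RT: interrupts are disabled */
    /* preempt_rt: interrupts are left enabled */
    return flags;
}

/* Models the matching _irqrestore operation. */
static void model_unlock_irqrestore(bool flags)
{
    if (!model_preempt_rt)
        irqs_off = flags;
}

/* Models an unmask path guarded by BUG_ON(!irqs_disabled()).
 * Returns false in the situation where the BUG_ON would fire. */
static bool model_unmask_check(void)
{
    return model_irqs_disabled();
}

/* Models a caller entering from process context with interrupts enabled,
 * taking the lock, and reaching the unmask check. */
bool model_lateeoi_path(bool preempt_rt)
{
    bool flags, ok;

    model_preempt_rt = preempt_rt;
    irqs_off = false;          /* process context, interrupts enabled */

    flags = model_lock_irqsave();
    ok = model_unmask_check();
    model_unlock_irqrestore(flags);

    assert(!irqs_off);         /* interrupts are enabled again on exit */
    return ok;
}
```

In this model, model_lateeoi_path(false) returns true (the check passes),
while model_lateeoi_path(true) returns false, which corresponds to the case
where the real BUG_ON(!irqs_disabled()) would trigger.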