
xen/evtchn: Dom0 boot hangs using preempt_rt kernel 5.10


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Luca Fancellu <Luca.Fancellu@xxxxxxx>
  • Date: Wed, 17 Mar 2021 14:32:26 +0000
  • Cc: Juergen Gross <jgross@xxxxxxxx>, "jgrall@xxxxxxxxxx" <jgrall@xxxxxxxxxx>
  • Delivery-date: Wed, 17 Mar 2021 14:32:51 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

 
Hi all,

We've been encountering an issue when using kernel 5.10 with preempt_rt support for Dom0: during Dom0 boot, it hits a BUG_ON(!irqs_disabled()) in the function evtchn_fifo_unmask, defined in events_fifo.c.

This is the call stack:

[   17.817018] ------------[ cut here ]------------
[   17.817021] kernel BUG at drivers/xen/events/events_fifo.c:258!
[   18.817079] Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
[   18.817081] Modules linked in: bridge stp llc ipv6
[   18.817086] CPU: 3 PID: 558 Comm: xenstored Not tainted 5.10.16-rt25-yocto-preempt-rt #1
[   18.817089] Hardware name: Arm Neoverse N1 System Development Platform (DT)
[   18.817090] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[   18.817092] pc : evtchn_fifo_unmask+0xd4/0xe0
[   18.817099] lr : xen_irq_lateeoi_locked+0xec/0x200
[   18.817102] sp : ffff8000123f3cc0
[   18.817102] x29: ffff8000123f3cc0 x28: ffff0000427b1d80
[   18.817104] x27: 0000000000000000 x26: 0000000000000000
[   18.817106] x25: 0000000000000001 x24: 0000000000000001
[   18.817107] x23: ffff0000412fc900 x22: 0000000000000004
[   18.817109] x21: 0000000000000000 x20: ffff000042e06990
[   18.817110] x19: ffff0000427b1d80 x18: 0000000000000010
[   18.817112] x17: 0000000000000000 x16: 0000000000000000
[   18.817113] x15: 0000000000000002 x14: 0000000000000001
[   18.817114] x13: 000000000001a7e8 x12: 0000000000000040
[   18.817116] x11: ffff000040400248 x10: ffff00004040024a
[   18.817117] x9 : ffff800011be5200 x8 : ffff000040400270
[   18.817119] x7 : 0000000000000000 x6 : 0000000000000003
[   18.817120] x5 : 0000000000000000 x4 : ffff000040400308
[   18.817121] x3 : ffff0000408a400c x2 : 0000000000000000
[   18.817122] x1 : 0000000000000000 x0 : ffff0000408a4000
[   18.817124] Call trace:
[   18.817125]  evtchn_fifo_unmask+0xd4/0xe0
[   18.817127]  xen_irq_lateeoi_locked+0xec/0x200
[   18.817129]  xen_irq_lateeoi+0x48/0x64
[   18.817131]  evtchn_write+0x124/0x15c
[   18.817134]  vfs_write+0xf0/0x2cc
[   18.817137]  ksys_write+0xe0/0x100
[   18.817139]  __arm64_sys_write+0x20/0x30
[   18.817142]  el0_svc_common.constprop.0+0x78/0x1a0
[   18.817145]  do_el0_svc+0x24/0x90
[   18.817147]  el0_svc+0x14/0x20
[   18.817151]  el0_sync_handler+0x1a4/0x1b0
[   18.817153]  el0_sync+0x174/0x180
[   18.817156] Code: 52800120 b90023e6 97e6d104 17fffff0 (d4210000)
[   18.817158] ---[ end trace 0000000000000002 ]---

The last kernel we tested was 5.4. Our analysis points to the introduction of the lateeoi framework (xen/events: add a new "late EOI" evtchn framework), in conjunction with the preempt_rt patches (irqs kept enabled between spinlock_t/rwlock_t _irqsave/_irqrestore operations), as the root cause.
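
To make the interaction concrete, here is a minimal user-space sketch of it. It is not kernel code: every `sim_*` name is hypothetical, and the lock model assumes only what is stated above, namely that on preempt_rt the _irqsave variants of spinlock_t leave interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt state of the current CPU; true means IRQs are masked.
 * Process context (e.g. a write() from xenstored) starts with IRQs enabled. */
static bool sim_irqs_off = false;

/* Hypothetical stand-in for the kernel's irqs_disabled(). */
static bool sim_irqs_disabled(void)
{
    return sim_irqs_off;
}

/* Hypothetical model of an _irqsave lock operation on a spinlock_t.
 * Assumption: without preempt_rt it masks IRQs; with preempt_rt it leaves
 * the interrupt state untouched. Returns the previous state. */
static bool sim_lock_irqsave(bool preempt_rt)
{
    bool saved = sim_irqs_off;

    if (!preempt_rt)
        sim_irqs_off = true;
    return saved;
}

/* Hypothetical model of the matching _irqrestore operation. */
static void sim_unlock_irqrestore(bool saved)
{
    sim_irqs_off = saved;
}

/* Models an unmask reached from process context under the lock.
 * Returns true if a check like BUG_ON(!irqs_disabled()) would NOT fire. */
bool sim_unmask_check_passes(bool preempt_rt)
{
    bool saved = sim_lock_irqsave(preempt_rt);
    bool ok = sim_irqs_disabled();   /* the condition the BUG_ON tests */

    sim_unlock_irqrestore(saved);
    assert(sim_irqs_off == saved);   /* interrupt state is restored */
    return ok;
}
```

In this model, sim_unmask_check_passes(false) returns true and sim_unmask_check_passes(true) returns false, which corresponds to the BUG_ON firing only on the preempt_rt kernel.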

Given that many modifications have been made to the mask/unmask operations, including a big one from Juergen Gross (xen/events: don't unmask an event channel when an eoi is pending), is the BUG_ON(...) still needed?

With the mentioned commit, every call to a mask/unmask operation is protected by a spinlock, so I would like some feedback from anyone with more experience than me in this part of the code.

Thank you,

Luca


 

