Re: xen/evtchn: Dom0 boot hangs using preempt_rt kernel 5.10
> On 23 Mar 2021, at 19:26, Julien Grall <julien@xxxxxxx> wrote:
>
> On 23/03/2021 17:06, Luca Fancellu wrote:
>> Hi all,
>
> Hi,
>
> Please avoid top posting when answering a comment. It makes the thread more
> difficult to follow.
>
>> I have an update: after changing the lock introduced by the series from
>> spinlock_t to raw_spinlock_t, changing the lock/unlock functions to use the
>> raw_* versions, and keeping the BUG_ON(…) (now we can, because the raw_*
>> implementation disables interrupts on preempt_rt), the kernel boots correctly.
>> So it seems that the BUG_ON(…) is needed and the unmask function should run
>> with interrupts disabled. Does anyone know why this change worked?
>
> Do you mean why no-one spotted the issue before? If so, AFAIK, on vanilla
> Linux, spin_lock is still just a wrapper around raw_spinlock. IOW there is no
> option to replace it with an RT spinlock.
>
> So if you don't apply the RT patches, you would not be able to trigger the
> issue.
>
> As to the fix itself, I think using raw_spinlock_t is the correct thing to do
> because the lock is also used in interrupt context (even with RT enabled).
>
> Would you be able to send a patch?

Yes I’ll send a patch soon

>
>>> On 23 Mar 2021, at 15:39, Luca Fancellu <luca.fancellu@xxxxxxx> wrote:
>>>
>>> Hi Jason,
>>>
>>> Thanks for your hints. Unfortunately it does not seem to be an init problem,
>>> because with the same init configuration I tried 5.10.23 (preempt_rt) without
>>> the Juergen patch but with the BUG_ON removed, and it boots without problems.
>>> So it seems that applying the series does something (on a preempt_rt kernel)
>>> and we are trying to figure out what.
>>>
>>>> On 23 Mar 2021, at 12:36, Jason Andryuk <jandryuk@xxxxxxxxx> wrote:
>>>>
>>>> On Mon, Mar 22, 2021 at 3:09 PM Luca Fancellu <luca.fancellu@xxxxxxx>
>>>> wrote:
>>>>>
>>>>> Hi Juergen,
>>>>>
>>>>> Yes, you are right, it was my mistake. As you said, to remove the BUG_ON(…)
>>>>> this series
>>>>> (https://patchwork.kernel.org/project/xen-devel/cover/20210306161833.4552-1-jgross@xxxxxxxx/)
>>>>> is needed. Since I’m using yocto, I’m able to build a preempt_rt kernel
>>>>> up to 5.10.23, so I’m applying that series on top of this version and
>>>>> then removing the BUG_ON(…).
>>>>>
>>>>> What was not expected is that the Dom0 kernel is now stuck at the
>>>>> “Setting domain 0 name, domid and JSON config…” step and the system seems
>>>>> unresponsive. It looks like a deadlock, but looking into the series we
>>>>> can’t spot anything, and that series was also tested by others from the
>>>>> community.
>
> The deadlock is expected. When you enable RT spinlocks, interrupts will
> not be disabled even when you call spin_lock_irqsave().
>
> As the lock is also used in interrupt context (e.g. with interrupts masked),
> this will lead to a deadlock because the lock can be held with interrupts
> unmasked.
>
> This is quite a common error as developers are not yet used to testing RT. I
> remember finding a few other instances like that when I worked on RT a couple
> of years ago.
>
> For future reference, I think CONFIG_PROVE_LOCKING=y could help you to detect
> (potential) deadlocks.
>
> Cheers,
>
> --
> Julien Grall
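
For reference, the lock-debugging suggestion above could be written as a
kernel config fragment along the following lines. This is only a sketch:
CONFIG_PROVE_LOCKING is the one option named in the thread. The other
entries are assumptions about what is useful on a preempt_rt build and
should be checked against the Kconfig help text of the kernel in use.

```
# Already set on the preempt_rt kernel discussed in this thread.
CONFIG_PREEMPT_RT=y

# Named in the thread: lockdep-based detection of potential deadlocks.
CONFIG_PROVE_LOCKING=y

# Assumption: extends lockdep to check raw_spinlock_t vs. spinlock_t nesting.
CONFIG_PROVE_RAW_LOCK_NESTING=y

# Assumption: warns when a sleeping function is called from atomic context.
CONFIG_DEBUG_ATOMIC_SLEEP=y
```

These checks add runtime overhead, so a fragment like this is better suited
to a debug build than to the image used for latency measurements.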