
Re: [Xen-devel] Xen 4.5 random freeze question



BTW - shouldn't the GICH_LR_MAINTENANCE_IRQ flag be set in the LRs once
the maintenance interrupt has been requested?
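
For reference, here is a minimal sketch of how the HW_LR values in the dump quoted below could be decoded to check that flag. It assumes the GICv2 GICH_LR layout as I understand it (bit 19 is the EOI maintenance request and only applies when the HW bit is clear; with HW set, bits 19:10 carry the physical interrupt ID). The macro, struct and function names are local to this sketch and are not the ones used in Xen's gic.c.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Assumed GICv2 GICH_LR field layout (names are local to this sketch):
 *   bit 31      HW
 *   bits 29:28  state (1 = pending, 2 = active, 3 = pending and active)
 *   bits 27:23  priority
 *   bit 19      EOI maintenance request, meaningful only when HW == 0
 *   bits 19:10  physical interrupt ID, meaningful only when HW == 1
 *   bits 9:0    virtual interrupt ID
 */
#define LR_HW              (1u << 31)
#define LR_STATE_SHIFT     28
#define LR_STATE_MASK      0x3u
#define LR_PRIORITY_SHIFT  23
#define LR_PRIORITY_MASK   0x1fu
#define LR_MAINTENANCE_IRQ (1u << 19)
#define LR_PHYSICAL_SHIFT  10
#define LR_ID_MASK         0x3ffu

struct lr_fields {
    bool hw;            /* entry is backed by a physical interrupt */
    unsigned state;     /* raw 2-bit state field */
    unsigned priority;  /* raw 5-bit priority field */
    unsigned virq;      /* virtual interrupt ID */
    unsigned pirq;      /* physical interrupt ID, 0 when hw is false */
    bool maintenance;   /* EOI maintenance requested, false when hw is true */
};

/* Split a raw list register value into its fields. */
struct lr_fields decode_lr(uint32_t lr)
{
    struct lr_fields f;

    f.hw = (lr & LR_HW) != 0;
    f.state = (lr >> LR_STATE_SHIFT) & LR_STATE_MASK;
    f.priority = (lr >> LR_PRIORITY_SHIFT) & LR_PRIORITY_MASK;
    f.virq = lr & LR_ID_MASK;
    /* Bits 19:10 are a physical ID only for HW entries. */
    f.pirq = f.hw ? (lr >> LR_PHYSICAL_SHIFT) & LR_ID_MASK : 0;
    /* Bit 19 is the maintenance flag only for non-HW entries. */
    f.maintenance = !f.hw && (lr & LR_MAINTENANCE_IRQ) != 0;
    return f;
}

/* Print one decoded list register on a single line. */
void print_lr(unsigned idx, uint32_t lr)
{
    struct lr_fields f = decode_lr(lr);

    printf("LR[%u]=%08x hw=%d state=%u prio=%u virq=%u pirq=%u maint=%d\n",
           idx, (unsigned)lr, f.hw, f.state, f.priority, f.virq, f.pirq,
           f.maintenance);
}
```

Calling print_lr() on 0x3a00001f, 0x9a015856, 0x1a00001b and 0x9a00e439 gives virtual IDs 31, 86, 27 and 57, which match the "Inflight irq" lines in the dump. Under the layout assumed above, bit 19 is clear in the two non-HW entries (LR[0] and LR[2]), which is why I am asking about the flag.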

On Wed, Nov 19, 2014 at 6:32 PM, Andrii Tseglytskyi
<andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> GIC dump at the point where the maintenance interrupt is requested:
>
> (XEN) GICH_LRs (vcpu 0) mask=f
> (XEN)    HW_LR[0]=3a00001f
> (XEN)    HW_LR[1]=9a015856
> (XEN)    HW_LR[2]=1a00001b
> (XEN)    HW_LR[3]=9a00e439
> (XEN) Inflight irq=31 lr=0
> (XEN) Inflight irq=86 lr=1
> (XEN) Inflight irq=27 lr=2
> (XEN) Inflight irq=57 lr=3
> (XEN) Inflight irq=2 lr=255
> (XEN) Pending irq=2
>
> On Wed, Nov 19, 2014 at 6:29 PM, Andrii Tseglytskyi
> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
>> On Wed, Nov 19, 2014 at 6:13 PM, Stefano Stabellini
>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>>> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
>>>> On Wed, Nov 19, 2014 at 6:01 PM, Andrii Tseglytskyi
>>>> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
>>>> > On Wed, Nov 19, 2014 at 5:41 PM, Stefano Stabellini
>>>> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>>>> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
>>>> >>> Hi Stefano,
>>>> >>>
>>>> >>> On Wed, Nov 19, 2014 at 4:52 PM, Stefano Stabellini
>>>> >>> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>>>> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
>>>> >>> >> Hi Stefano,
>>>> >>> >>
>>>> >>> >> > >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
>>>> >>> >> > > -        GICH[GICH_HCR] |= GICH_HCR_UIE;
>>>> >>> >> > > +        GICH[GICH_HCR] |= GICH_HCR_NPIE;
>>>> >>> >> > >      else
>>>> >>> >> > > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
>>>> >>> >> > > +        GICH[GICH_HCR] &= ~GICH_HCR_NPIE;
>>>> >>> >> > >
>>>> >>> >> > >  }
>>>> >>> >> >
>>>> >>> >> > Yes, exactly
>>>> >>> >>
>>>> >>> >> I tried it; the hang still occurs with this change.
>>>> >>> >
>>>> >>> > We need to figure out why during the hang you still have all the LRs
>>>> >>> > busy even if you are getting maintenance interrupts that should cause
>>>> >>> > them to be cleared.
>>>> >>> >
>>>> >>>
>>>> >>> I see that I have free LRs during the maintenance interrupt:
>>>> >>>
>>>> >>> (XEN) gic.c:871:d0v0 maintenance interrupt
>>>> >>> (XEN) GICH_LRs (vcpu 0) mask=0
>>>> >>> (XEN)    HW_LR[0]=9a015856
>>>> >>> (XEN)    HW_LR[1]=0
>>>> >>> (XEN)    HW_LR[2]=0
>>>> >>> (XEN)    HW_LR[3]=0
>>>> >>> (XEN) Inflight irq=86 lr=0
>>>> >>> (XEN) Inflight irq=2 lr=255
>>>> >>> (XEN) Pending irq=2
>>>> >>>
>>>> >>> But once the hang occurs, maintenance interrupts are generated
>>>> >>> continuously. The platform keeps printing the same log until reboot.
>>>> >>
>>>> >> Exactly the same log? As in the one above you just pasted?
>>>> >> That is very very suspicious.
>>>> >
>>>> > Yes, exactly the same log. It looks like this means that the LRs are
>>>> > flushed correctly.
>>>> >
>>>> >>
>>>> >> I am thinking that we are not handling GICH_HCR_UIE correctly and
>>>> >> something we do in Xen, maybe writing to an LR register, might trigger a
>>>> >> new maintenance interrupt immediately causing an infinite loop.
>>>> >>
>>>> >
>>>> > Yes, this is what I'm thinking too. Taking into account all the
>>>> > collected debug info, it looks like a maintenance interrupt occurs
>>>> > once the LRs are overloaded with SGIs.
>>>> > It is then not handled properly and occurs again and again, so the
>>>> > platform hangs inside its handler.
>>>> >
>>>> >> Could you please try this patch? It disables GICH_HCR_UIE immediately
>>>> >> on hypervisor entry.
>>>> >>
>>>> >
>>>> > Now trying.
>>>> >
>>>> >>
>>>> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>>>> >> index 4d2a92d..6ae8dc4 100644
>>>> >> --- a/xen/arch/arm/gic.c
>>>> >> +++ b/xen/arch/arm/gic.c
>>>> >> @@ -701,6 +701,8 @@ void gic_clear_lrs(struct vcpu *v)
>>>> >>      if ( is_idle_vcpu(v) )
>>>> >>          return;
>>>> >>
>>>> >> +    GICH[GICH_HCR] &= ~GICH_HCR_UIE;
>>>> >> +
>>>> >>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>>> >>
>>>> >>      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
>>>> >> @@ -821,12 +823,8 @@ void gic_inject(void)
>>>> >>
>>>> >>      gic_restore_pending_irqs(current);
>>>> >>
>>>> >> -
>>>> >>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
>>>> >>          GICH[GICH_HCR] |= GICH_HCR_UIE;
>>>> >> -    else
>>>> >> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
>>>> >> -
>>>> >>  }
>>>> >>
>>>> >>  static void do_sgi(struct cpu_user_regs *regs, int othercpu, enum gic_sgi sgi)
>>>> >
>>>>
>>>> Heh - I don't see hangs with this patch :) But I also see that the
>>>> maintenance interrupt doesn't occur (and so there is no hang as a result).
>>>> Stefano - is this expected?
>>>
>>> No maintenance interrupts at all? That's strange. You should be
>>> receiving them when LRs are full and you still have interrupts pending
>>> to be added to them.
>>>
>>> You could add another printk here to see if you should be receiving
>>> them:
>>>
>>>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
>>> +    {
>>> +        gdprintk(XENLOG_DEBUG, "requesting maintenance interrupt\n");
>>>          GICH[GICH_HCR] |= GICH_HCR_UIE;
>>> -    else
>>> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
>>> -
>>> +    }
>>>  }
>>>
>>
>> It is requested properly:
>>
>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
>> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
>>
>> But the interrupt itself does not occur.
>>
>>
>>>
>>>> >
>>>> >
>>>> > --
>>>> >
>>>> > Andrii Tseglytskyi | Embedded Dev
>>>> > GlobalLogic
>>>> > www.globallogic.com



-- 

Andrii Tseglytskyi | Embedded Dev
GlobalLogic
www.globallogic.com
