[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 14/18] xen/arm: Unmask the Abort/SError bit in the exception entries



On Thu, 23 Mar 2017, Julien Grall wrote:
> Hi Wei,
> 
> On 23/03/17 03:13, Wei Chen wrote:
> > On 2017/3/23 6:22, Stefano Stabellini wrote:
> > > On Wed, 22 Mar 2017, Julien Grall wrote:
> > > > Hi Wei,
> > > > 
> > > > On 22/03/17 08:49, Wei Chen wrote:
> > > > > Hi Stefano,
> > > > > 
> > > > > On 2017/3/21 5:38, Stefano Stabellini wrote:
> > > > > > On Mon, 13 Mar 2017, Wei Chen wrote:
> > > > > > > Currently, we masked the Abort/SError bit in Xen exception
> > > > > > > entries.
> > > > > > > So Xen could not capture any Abort/SError while it's running.
> > > > > > > Now, Xen has the ability to handle the Abort/SError, we should
> > > > > > > unmask
> > > > > > > the Abort/SError bit by default to let Xen capture Abort/SError
> > > > > > > while
> > > > > > > it's running.
> > > > > > > 
> > > > > > > But in order to avoid receiving nested asynchronous abort, we
> > > > > > > don't
> > > > > > > unmask Abort/SError bit in hyp_error and trap_data_abort.
> > > > > > > 
> > > > > > > Signed-off-by: Wei Chen <Wei.Chen@xxxxxxx>
> > > > > > > ---
> > > > > > > We haven't done this before, so I don't know how can this change
> > > > > > > will affect the Xen. If the IRQ and Abort take place at the same
> > > > > > > time, how can we handle them?
> > > > > > 
> > > > > > If the abort is for Xen, the hypervisor will crash and that's the
> > > > > > end of
> > > > > 
> > > > > Before the system crash, we have enable the IRQ, so that would not be
> > > > > the end if an IRQ happens at the same time.
> > > > > 
> > > > > > it. If the abort is for the guest, Xen will inject it into the VM,
> > > > > > then
> > > > > 
> > > > > Before we have inject the abort to VM, we have enable the IRQ.
> > > > > 
> > > > > > it will return from handling the abort, going back to handling the
> > > > > > IRQ
> > > > > > as before. Isn't that right?
> > > > > 
> > > > > If the abort has higher priority then IRQ, it's right.
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > > If an abort is taking place while we're handling the IRQ, the
> > > > > > > program
> > > > > > > jump to abort exception, and then enable the IRQ. In this case,
> > > > > > > what
> > > > > > > will happen? So I think I need more discussions from community.
> > > > > > 
> > > > > > Do you know of an example scenario where Xen could have a problem
> > > > > > with
> > > > > > this?
> > > > > > 
> > > > > 
> > > > > For example,
> > > > > 1. Trigger a SError in hypervisor.
> > > > > 2. Jump to hyp_error to handle SError.
> > > > > 3. Enable IRQ in hyp_error before PANIC
> > > > > 4. A timer IRQ happens.
> > > > > 5. Jump to hyp_irq and unmask abort again.
> > > > > 6. Jump hyp_error again?
> > > > 
> > > > Technically you could end up in an infinite loop if hyp_error code
> > > > generates a
> > > > SError. It will stay pending until the end and then trigger again when
> > > > SError
> > > > is unmasked...
> > > > 
> > > > That's unfortunate but I don't think this is a big issue as if this is
> > > > happening your platform is already doomed.
> > > 
> > > I agree, but the scenario suggested by Wei is not like that: hyp_error
> > > does not generate an serror, it only unmask irqs.
> > > 
> > > I think that it would be safer not to unmask IRQs in hyp_error (remove
> > > msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at
> > > the end of do_trap_hyp_serror (if the hypervisor does not panic).
> > > 
> > 
> > Yes, it would be safer if we defined an end for exception loop. And
> > hyp_error is a good place to end up the exception loop.
> > 1. If hyp_error will PANIC the system, enable IRQ in hyp_error to handle
> >     interrupts is non-significant.
> > 2. If hyp_error will forward the SErrors, enable IRQ before forwarding
> >     SError to guest will make Xen enter exception loop. The SError would
> >     not have chance to be forwarded to guest.
> > 
> > So, I think enable IRQ at the end of do_trap_hyp_serror would be better.
> 
> That's not going to help. If the IRQ path is triggering an SError again and
> again, there is no way to go out even if you re-enable IRQ later.

I understand where the misunderstanding comes from, I think it is due to
a different interpretation of how the hardware behaves. Only one is
correct, and I think it's yours.

In the following case, assuming only one SError is ever generated:

1. A guest causes an SError
2. Xen jumps to hyp_error to handle it
3. Xen enables IRQ in hyp_error
4. Xen receives a timer event
5. Xen jumps to hyp_irq and unmask abort
6. ???

In 6., Xen doesn't jump to hyp_error again because of the first SError,
right? The SError has already been received, Xen wouldn't trap again on
the same abort because something hasn't been cleared, right? In that
case, there is no need to delay unmasking IRQs, as you have been saying:

6. Xen handles the IRQ, injects a virtual irq to guest
7. Xen returns from IRQ, goes back to abort handler
8. Xen injects abort to guest

in that case this patch is good as is.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.