[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 14/18] xen/arm: Unmask the Abort/SError bit in the exception entries



Hi Stefano,

On 2017/3/24 8:10, Stefano Stabellini wrote:
> On Thu, 23 Mar 2017, Julien Grall wrote:
>> Hi Wei,
>>
>> On 23/03/17 03:13, Wei Chen wrote:
>>> On 2017/3/23 6:22, Stefano Stabellini wrote:
>>>> On Wed, 22 Mar 2017, Julien Grall wrote:
>>>>> Hi Wei,
>>>>>
>>>>> On 22/03/17 08:49, Wei Chen wrote:
>>>>>> Hi Stefano,
>>>>>>
>>>>>> On 2017/3/21 5:38, Stefano Stabellini wrote:
>>>>>>> On Mon, 13 Mar 2017, Wei Chen wrote:
>>>>>>>> Currently, we masked the Abort/SError bit in Xen exception
>>>>>>>> entries.
>>>>>>>> So Xen could not capture any Abort/SError while it's running.
>>>>>>>> Now, Xen has the ability to handle the Abort/SError, we should
>>>>>>>> unmask
>>>>>>>> the Abort/SError bit by default to let Xen capture Abort/SError
>>>>>>>> while
>>>>>>>> it's running.
>>>>>>>>
>>>>>>>> But in order to avoid receiving nested asynchronous abort, we
>>>>>>>> don't
>>>>>>>> unmask Abort/SError bit in hyp_error and trap_data_abort.
>>>>>>>>
>>>>>>>> Signed-off-by: Wei Chen <Wei.Chen@xxxxxxx>
>>>>>>>> ---
>>>>>>>> We haven't done this before, so I don't know how can this change
>>>>>>>> will affect the Xen. If the IRQ and Abort take place at the same
>>>>>>>> time, how can we handle them?
>>>>>>>
>>>>>>> If the abort is for Xen, the hypervisor will crash and that's the
>>>>>>> end of
>>>>>>
>>>>>> Before the system crash, we have enable the IRQ, so that would not be
>>>>>> the end if an IRQ happens at the same time.
>>>>>>
>>>>>>> it. If the abort is for the guest, Xen will inject it into the VM,
>>>>>>> then
>>>>>>
>>>>>> Before we have inject the abort to VM, we have enable the IRQ.
>>>>>>
>>>>>>> it will return from handling the abort, going back to handling the
>>>>>>> IRQ
>>>>>>> as before. Isn't that right?
>>>>>>
>>>>>> If the abort has higher priority then IRQ, it's right.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> If an abort is taking place while we're handling the IRQ, the
>>>>>>>> program
>>>>>>>> jump to abort exception, and then enable the IRQ. In this case,
>>>>>>>> what
>>>>>>>> will happen? So I think I need more discussions from community.
>>>>>>>
>>>>>>> Do you know of an example scenario where Xen could have a problem
>>>>>>> with
>>>>>>> this?
>>>>>>>
>>>>>>
>>>>>> For example,
>>>>>> 1. Trigger a SError in hypervisor.
>>>>>> 2. Jump to hyp_error to handle SError.
>>>>>> 3. Enable IRQ in hyp_error before PANIC
>>>>>> 4. A timer IRQ happens.
>>>>>> 5. Jump to hyp_irq and unmask abort again.
>>>>>> 6. Jump hyp_error again?
>>>>>
>>>>> Technically you could end up in an infinite loop if hyp_error code
>>>>> generates a
>>>>> SError. It will stay pending until the end and then trigger again when
>>>>> SError
>>>>> is unmasked...
>>>>>
>>>>> That's unfortunate but I don't think this is a big issue as if this is
>>>>> happening your platform is already doomed.
>>>>
>>>> I agree, but the scenario suggested by Wei is not like that: hyp_error
>>>> does not generate an serror, it only unmask irqs.
>>>>
>>>> I think that it would be safer not to unmask IRQs in hyp_error (remove
>>>> msr daifclr, #2 at the beginning of hyp_error). IRQs can be enabled at
>>>> the end of do_trap_hyp_serror (if the hypervisor does not panic).
>>>>
>>>
>>> Yes, it would be safer if we defined an end for exception loop. And
>>> hyp_error is a good place to end up the exception loop.
>>> 1. If hyp_error will PANIC the system, enable IRQ in hyp_error to handle
>>>     interrupts is non-significant.
>>> 2. If hyp_error will forward the SErrors, enable IRQ before forwarding
>>>     SError to guest will make Xen enter exception loop. The SError would
>>>     not have chance to be forwarded to guest.
>>>
>>> So, I think enable IRQ at the end of do_trap_hyp_serror would be better.
>>
>> That's not going to help. If the IRQ path is triggering an SError again and
>> again, there is no way to go out even if you re-enable IRQ later.
>
> I understand where the misunderstanding comes from, I think it is due to
> a different interpretation of how the hardware behaves. Only one is
> correct, and I think it's yours.
>
> In the following case, assuming only one SError is ever generated:
>
> 1. A guest causes an SError
> 2. Xen jumps to hyp_error to handle it
> 3. Xen enables IRQ in hyp_error
> 4. Xen receives a timer event
> 5. Xen jumps to hyp_irq and unmask abort
> 6. ???
>
> In 6., Xen doesn't jump to hyp_error again because of the first SError,
> right? The SError has already been received, Xen wouldn't trap again on
> the same abort because something hasn't been cleared, right? In that

Yes, I think you are right. I have done a test, the hardware works as
you said. Xen would not jump to hyp_error again by the first SError.

> case, there is no need to delay unmasking IRQs, as you have been saying:
>
> 6. Xen handles the IRQ, injects a virtual irq to guest
> 7. Xen returns from IRQ, goes back to abort handler
> 8. Xen injects abort to guest
>
> in that case this patch is good as is.
>

Thanks, that reassures me :)


-- 
Regards,
Wei Chen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.