
Re: [Xen-devel] xc_hvm_inject_trap() races



>>> Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx> 11/01/16 11:53 AM >>>
>On 11/01/2016 12:30 PM, Jan Beulich wrote:
>>>>> Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx> 11/01/16 10:04 AM >>>
>>> We've stumbled across the following scenario: we're injecting a #PF to
>>> try to bring a swapped page back, but Xen already has a pending
>>> interrupt, and the two collide.
>>>
>>> I've logged what happens in hvm_do_resume() at the point of injection,
>>> and stumbled across this:
>>>
>>> (XEN) [  252.878389] vector: 14, type: 3, error_code: 0,
>>> VM_ENTRY_INTR_INFO: 0x00000000800000e1
>>>
>>> VM_ENTRY_INTR_INFO does have INTR_INFO_VALID_MASK set here.
>> 
>> So a first question I have is this: What are the criteria that made your
>> application decide it needs to inject a trap? Obviously there must have
>> been some kind of event in the guest that triggered this. Hence the
>> question is whether this same event wouldn't re-trigger at the end of the
>> hardware interrupt (or could be made re-trigger reasonably easily).
>> Because in the end what you're trying to do here is something that's
>> architecturally impossible: There can't be a (non-nested) exception once
>> an external interrupt has been accepted (i.e. any subsequent exception
>> can only be related to the delivery of that interrupt vector, not to the code
>> which was running when the interrupt was signaled).
>
>Unfortunately there are two main reasons why relying on the conditions
>for injecting the page fault repeating is problematic:
>
>1. We'd need to be able to differentiate between a failed run (where
>injection doesn't work) and a successful run, with no real possibility to
>know the difference for sure. So we don't know how to adapt the
>application's internal state based on some events: is the event the
>"final" one, or just a duplicate? xc_hvm_inject_trap() does not tell us
>(as indeed it cannot know) if the injection succeeded, and there's no
>other way to know.
>
>2. More importantly (although working around 1. is far from trivial),
>the event may not be repeatable. As an example, #PF injection can occur
>as part of handling an EPT event, but while handling the event the
>introspection engine can decide to lift the restrictions on said page
>after injecting the #PF. So the application relied on the #PF being
>delivered, and with the restrictions lifted from the page there will be
>no further EPT events for that page, therefore the main condition for
>triggering the #PF is lost forever.

Isn't this a problem you also have under other circumstances, e.g.
multiple faults occurring for a single instruction? Which gets us to the
fact that you didn't answer the initial question I raised at all. Apart
from that I also don't really understand the model you describe:
What good does injecting #PF alongside lifting restrictions do? I'd normally
expect just one of the two to occur to deal with whatever caused the
original event.
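
For reference, the VM_ENTRY_INTR_INFO value from the log quoted above
(0x800000e1) can be decoded as sketched below. This is only a minimal
illustration, assuming the bit layout of the VM-entry
interruption-information field as documented in the Intel SDM. Apart from
INTR_INFO_VALID_MASK, which is named in the report, the mask names, the
struct, and the helper functions are defined locally for illustration and
are not taken from any Xen header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout (Intel SDM, VM-entry interruption-information). */
#define INTR_INFO_VECTOR_MASK        0x000000ffu  /* bits 7:0: vector */
#define INTR_INFO_INTR_TYPE_MASK     0x00000700u  /* bits 10:8: event type */
#define INTR_INFO_DELIVER_CODE_MASK  0x00000800u  /* bit 11: deliver error code */
#define INTR_INFO_VALID_MASK         0x80000000u  /* bit 31: field is valid */

/* Event type encodings relevant to this thread. */
#define EVENTTYPE_EXT_INTR      0u  /* external interrupt */
#define EVENTTYPE_HW_EXCEPTION  3u  /* hardware exception, e.g. #PF */

/* Hypothetical decoded view of the raw 32-bit field. */
struct intr_info {
    uint8_t vector;
    uint8_t type;
    bool has_error_code;
    bool valid;
};

/* Split the raw field into its components. */
static struct intr_info decode_intr_info(uint32_t raw)
{
    struct intr_info info;

    info.vector = (uint8_t)(raw & INTR_INFO_VECTOR_MASK);
    info.type = (uint8_t)((raw & INTR_INFO_INTR_TYPE_MASK) >> 8);
    info.has_error_code = (raw & INTR_INFO_DELIVER_CODE_MASK) != 0;
    info.valid = (raw & INTR_INFO_VALID_MASK) != 0;

    return info;
}

/*
 * An event is already queued for the next VM entry if the valid bit is set;
 * writing another event into the field would then replace the queued one.
 */
static bool event_already_pending(uint32_t vm_entry_intr_info)
{
    return decode_intr_info(vm_entry_intr_info).valid;
}
```

Under that layout, 0x800000e1 decodes to a valid entry of type 0 (external
interrupt) with vector 0xe1 and no error code, i.e. an external interrupt is
already queued at the point where the #PF (vector 14, type 3) is about to be
injected.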

Jan

