
Re: [Xen-devel] xc_hvm_inject_trap() races



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: 2 November, 2016 11:38
> To: rcojocaru@xxxxxxxxxxxxxxx; Andrei Vlad LUTAS
> <vlutas@xxxxxxxxxxxxxxx>
> Cc: andrew.cooper3@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx;
> tamas@xxxxxxxxxxxxx
> Subject: RE: RE: [Xen-devel] xc_hvm_inject_trap() races
> 
> >>> On 02.11.16 at 10:13, <vlutas@xxxxxxxxxxxxxxx> wrote:
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: 2 November, 2016 10:50
> >> >>> On 01.11.16 at 23:17, <vlutas@xxxxxxxxxxxxxxx> wrote:
> >> > We don't really care when and how the #PF is handled. We don't care
> >> > if the page is paged out at some random point. What we do know is
> >> > that at a certain point in the future, the page will be swapped in;
> >> > how do we know when? The OS will write the guest page tables, at
> >> > which point we can inspect the physical page itself (so you can see
> >> > here why we don't care about the page being swapped out sometime in
> >> > the future). So we really _can_ lift any restriction we want at that 
> >> > point.
> >>
> >> Hmm, I'm having difficulty seeing the supposedly broken flow of
> >> events
> >> here: Earlier it was said that #PF injection would be a result of EPT
> >> event processing. Here you say that the lifting of the restrictions
> >> would be a result of seeing the guest modify its page tables (which
> >> would in turn be a result of the #PF actually having arrived in the
> >> guest). So if (with this, and as you say
> >> above) you don't care when the #PF gets handled, where's the original
> >> problem?
> >
> > That's not what I wanted to say, sorry if it was unclear. What I'm
> > trying to say is that the decision to inject a #PF can be made when
> > handling an EPT violation - the accessed page needs not be related in
> > any way with the page for which we decide to inject the #PF. For
> > example, we intercept writes in a list that describes the loaded
> > module. Whenever a new module is loaded, an entry would be inserted
> > into that list, and that would generate an EPT write violation. Now,
> > the introspection logic will be able to analyze what module was loaded
> > and where, and it may find out that the module headers (which are
> > needed by the protection logic) are not present in memory - therefore,
> > it would inject a #PF in order to force the OS to swap in said
> > headers. On the other hand, the HVI logic may also decide that it
> > doesn't need to watch for modules loading anymore (for example, all the
> > interesting modules were loaded), so it will remove the write hook from
> > the list of loaded modules.
> > These two (injection of the #PF and the removal of the EPT write
> > protection) would be done in the same event handler, so we can't rely
> > on the event being re-generated in this case. Hopefully this example
> > makes it more clear.
> 
> If the decision whether further events are needed depends on data in a
> page not present in memory, how can that decision be taken before the
> injected #PF has actually been handled? I'm still not seeing a flow of events
> where there is a problem. Furthermore, I don't think it would do much harm
> if you kept the watch in place slightly longer?

The decision whether further events are needed is NOT based on the contents
of the missing page. Assume we have a MODULE structure that contains a "name"
and an "address". The MODULE is inserted into the modules list via a write,
which triggers an EPT violation, which is handled by HVI. HVI sees that "name"
is the module it was waiting for (for example ntdll, kernel32, or whatever)
and decides that it doesn't need to intercept further module loads, so it
removes the write hook from the list. Furthermore, it sees that "address"
points to a swapped-out page, so it injects a #PF to have it swapped in.
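
To make the flow concrete, here is a rough Python sketch of that handler. It is only a toy model: Module, Guest, on_module_list_write and PAGE_SHIFT are names made up for illustration and do not correspond to any Xen, libxc or HVI code, and the #PF injection is modelled as simply appending the address to a list.

```python
from dataclasses import dataclass, field

PAGE_SHIFT = 12  # assumption: 4 KiB pages


@dataclass
class Module:
    """Hypothetical list entry: only the two fields discussed above."""
    name: str
    address: int


@dataclass
class Guest:
    """Toy stand-in for guest state; not a real Xen or libxc interface."""
    present_pages: set = field(default_factory=set)        # page frame numbers in memory
    write_hook_on_module_list: bool = True                  # EPT write protection active?
    pending_page_faults: list = field(default_factory=list)  # addresses we asked a #PF for


def on_module_list_write(guest, module, wanted_name):
    """Handle one EPT write violation caused by inserting `module` in the list."""
    if module.name != wanted_name:
        return  # not the module we are waiting for: keep the write hook

    # The decision to stop watching depends only on "name",
    # not on the contents of the page that "address" points to.
    guest.write_hook_on_module_list = False

    # Independently, ask for the headers to be swapped in if they are missing.
    if (module.address >> PAGE_SHIFT) not in guest.present_pages:
        guest.pending_page_faults.append(module.address)
```

Both actions happen inside the same handler invocation, so once it returns there is no further write violation on the list that could be used to retry the injection if the #PF gets lost.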

> 
> >> The fact that {vmx,svm}_inject_trap() combine the new exception with
> >> an already injected one (and blindly discard events other than hw
> >> exceptions), otoh, looks like indeed wants to be controllable by the
> >> caller: When the event comes from the outside (the hypercall), it
> >> would clearly seem better to simply tell the caller that no injection
> >> happened and the event needs to be kept pending. The main question
> >> then is how to make certain injection gets retried at the right point
> >> in time (read: once the other interrupt handler IRETs back to original
> >> context).
> >
> > Yes, this is basically our problem. Right now, the #PF would overwrite
> > other interrupts, which is very bad. On the other hand, it can't
> > return an error (if I understand the code correctly), since it can't
> > know if another event will be scheduled for injection. As I told
> > Andrew, at least returning an error that would indicate the #PF cannot
> > be injected may help us a lot here
> 
> But an event that prevents the injected one to make it may get generated
> only _after_ the inject hypercall was completed. Once again - the problem
> needs to be solved elsewhere.
> 
> > (I'm sure making the injected trap take precedence over other events
> > would not be acceptable).
> 
> Indeed.
> 
> Jan
> 
