
Re: [Xen-devel] Re: mem-event interface



At 23:25 +0100 on 23 Jun (1277335526), Grzegorz Milos wrote:
> However, I'm a bit wary about putting anything non-essential in libxc,
> and it seems like the event demux might be quite complex and dependent
> on the type of events you are handling. Therefore we don't want to end
> up with a really complex daemon in libxc. Instead I think we should try
> to make use of multiple rings in order to alleviate some of the demux
> headaches (sharing related events would go to the memshr daemon
> through one ring, paging to the pager through another, introspection
> events to XenAccess etc.), and then do further demux in the relevant
> daemon.

I agree that multiple rings are a good idea here - especially if we want
to disaggregate and have event handlers in multiple domains. 

Maybe the ring-registering interface could take a type and a rangeset -
that would reduce the amount of extra chatter at the cost of some more
overhead in Xen.
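
To make that concrete, here is a rough sketch of the shape such a
registration might take (every name is invented for illustration;
nothing like this exists in Xen today):

  #include <stdint.h>

  /* One ring per event type per consumer. */
  #define MEM_EVENT_TYPE_PAGING  0  /* pager */
  #define MEM_EVENT_TYPE_SHARING 1  /* memshr daemon */
  #define MEM_EVENT_TYPE_ACCESS  2  /* introspection / XenAccess */

  /* A [start, end) GFN range the consumer cares about. */
  struct mem_event_range {
      uint64_t start_gfn;
      uint64_t end_gfn;
  };

  /* Register a ring for one event type, filtered by a rangeset.
   * Xen checks each faulting GFN against the ranges, so events
   * nobody asked for never cross into userspace: less chatter on
   * the rings, at the cost of the range lookups in Xen. */
  struct mem_event_register {
      uint32_t type;               /* MEM_EVENT_TYPE_* */
      uint32_t event_channel_port; /* how Xen signals the consumer */
      uint64_t ring_gfn;           /* page shared with Xen for the ring */
      uint32_t nr_ranges;
      struct mem_event_range ranges[]; /* the rangeset */
  };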

> This could potentially introduce some inefficiencies (e.g. one memory
> access could generate multiple events), and could cause the daemons to
> step on each other's toes, but I don't think that's going to be a
> problem in practice, because the types of events we are interested in
> intercepting at the moment seem to be disjoint enough.
> 
> Also, handling sync vs. async events, as well as supporting batching
> and out-of-order replies, may already be complex
> enough without having to worry about demultiplexing ;). So let's do
> things in small steps. I think the priority should be teaching Xen to
> handle multiple rings (the last time I looked at the mem_event code it
> couldn't). What do you think?
> 
> Thanks
> Gregor
> 
> 
> On Wed, Jun 23, 2010 at 11:25 PM, Grzegorz Milos
> <grzegorz.milos@xxxxxxxxx> wrote:
> > [From Patrick]
> >
> > Ah. Well, as long as it's in its own library or API or whatever so
> > other applications can take advantage of it, then it's fine by me :)
> > libintrospec or something like that.
> >
> >
> > Patrick
> >
> >
> > On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos
> > <grzegorz.milos@xxxxxxxxx> wrote:
> >> [From Bryan]
> >>
> >>> I guess I'm more envisioning integrating all this with libxc and
> >> having XenAccess et al. use that. Keeping it as a separate VM
> >>> introspection library makes sense too. In any case, I think having
> >>> XenAccess as part of Xen is a good move. VM introspection is a useful
> >>> thing to have and I think a lot of projects could benefit from it.
> >>
> >> From my experience, the address translations can actually be pretty
> >> tricky.  This is a big chunk of what XenAccess does, and it requires
> >> some memory analysis in the domU to find necessary page tables and
> >> such.  So it may be more than you really want to add to libxc.  But if
> >> you go down this route, then I could certainly simplify the XenAccess
> >> code, so I wouldn't complain about that :-)
> >>
> >> -bryan
> >>
> >> On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos
> >> <grzegorz.milos@xxxxxxxxx> wrote:
> >>> [From Patrick]
> >>>
> >>> I guess I'm more envisioning integrating all this with libxc and
> >>> having XenAccess et al. use that. Keeping it as a separate VM
> >>> introspection library makes sense too. In any case, I think having
> >>> XenAccess as part of Xen is a good move. VM introspection is a useful
> >>> thing to have and I think a lot of projects could benefit from it.
> >>>
> >>>
> >>> Patrick
> >>>
> >>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
> >>> <grzegorz.milos@xxxxxxxxx> wrote:
> >>>> [From Bryan]
> >>>>
> >>>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
> >>>>> translation code out into the library and have the mem_event daemon
> >>>>> use that? I do remember reading through and borrowing XenAccess code
> >>>>
> >>>> This is certainly doable.  But if we decide to make a Xen library
> >>>> depend on XenAccess, then it would make sense to include XenAccess as
> >>>> part of the Xen distribution, IMHO.  This probably isn't too
> >>>> unreasonable to consider, but we'd want to make sure that the
> >>>> XenAccess configuration is either simplified or eliminated to avoid
> >>>> causing headaches for the average person using this stuff.  Something
> >>>> to think about...
> >>>>
> >>>> -bryan
> >>>>
> >>>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
> >>>> <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>> [From Patrick]
> >>>>>
> >>>>>> I like this idea as it keeps Xen as simple as possible and should also
> >>>>>> help to reduce the number of notifications sent from Xen up to user
> >>>>>> space (e.g., one notification to the daemon could then be pushed out
> >>>>>> to multiple clients that care about it).
> >>>>>
> >>>>> Yeah, that was my general thinking as well. So the immediate change to
> >>>>> the mem_event interface for this would be a way to specify sub-page
> >>>>> level stuff. The best way to approach this is probably by specifying a
> >>>>> range (most likely a start address and a size). This way things like
> >>>>> swapping and sharing would specify the start address of the page
> >>>>> they're interested in and PAGE_SIZE (or, more realistically, there
> >>>>> would be an additional lib call to do page-level stuff, which would
> >>>>> just take the pfn and do this translation under the hood).
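> >>>>>
> >>>>> Roughly what I'm picturing, with made-up names for everything (this
> >>>>> is a sketch, not an existing interface):
> >>>>>
> >>>>>   #include <stdint.h>
> >>>>>
> >>>>>   #define PAGE_SHIFT 12
> >>>>>   #define PAGE_SIZE  (1UL << PAGE_SHIFT)
> >>>>>
> >>>>>   /* Core call: notify on accesses to [start, start + size). */
> >>>>>   int mem_event_watch_range(int domid, uint64_t start, uint64_t size);
> >>>>>
> >>>>>   /* Page-level convenience wrapper: swapping and sharing just pass
> >>>>>    * a pfn and the byte-range translation happens under the hood. */
> >>>>>   static inline int mem_event_watch_page(int domid, uint64_t pfn)
> >>>>>   {
> >>>>>       return mem_event_watch_range(domid, pfn << PAGE_SHIFT,
> >>>>>                                    PAGE_SIZE);
> >>>>>   }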
> >>>>>
> >>>>>
> >>>>>> For what it's worth, I'd be happy to build such a daemon into
> >>>>>> XenAccess.  This may be a logical place for it since XenAccess is
> >>>>>> already doing address translations and such, so it would be easier for
> >>>>>> a client app to specify an address range of interest as a virtual
> >>>>>> address or physical address.  This would prevent the need to repeat
> >>>>>> some of that address translation functionality in yet another library.
> >>>>>>
> >>>>>> Alternatively, we could provide the daemon functionality in libxc or
> >>>>>> some other Xen library and only provide support for low level
> >>>>>> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
> >>>>>> that to offer higher level addresses (e.g., pa or va) using its
> >>>>>> existing translation mechanisms.  This approach would more closely
> >>>>>> mirror the current division of labor between XenAccess and libxc.
> >>>>>
> >>>>> This sounds good to me. I'd lean towards the second approach as I
> >>>>> think it's the better long-term solution. I'm a bit rusty on my
> >>>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
> >>>>> translation code out into the library and have the mem_event daemon
> >>>>> use that? I do remember reading through and borrowing XenAccess code
> >>>>> (or at least the general mechanism) to do address translation stuff
> >>>>> for other projects, so it seems like having a general way to do that
> >>>>> would be a win. I think I did it with the CoW stuff, which I actually
> >>>>> want to port to the mem_event interface as well, both to have it
> >>>>> available and as another example of neat things we can do with the
> >>>>> interface.
> >>>>>
> >>>>>
> >>>>> Patrick
> >>>>>
> >>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
> >>>>> <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>> [From Bryan]
> >>>>>>
> >>>>>> needs to know to do sync notification. What are everybody's thoughts on
> >>>>>> this? Does it seem reasonable, or have I gone completely mad?
> >>>>>>
> >>>>>> I like this idea as it keeps Xen as simple as possible and should also
> >>>>>> help to reduce the number of notifications sent from Xen up to user
> >>>>>> space (e.g., one notification to the daemon could then be pushed out
> >>>>>> to multiple clients that care about it).
> >>>>>>
> >>>>>> For what it's worth, I'd be happy to build such a daemon into
> >>>>>> XenAccess.  This may be a logical place for it since XenAccess is
> >>>>>> already doing address translations and such, so it would be easier for
> >>>>>> a client app to specify an address range of interest as a virtual
> >>>>>> address or physical address.  This would prevent the need to repeat
> >>>>>> some of that address translation functionality in yet another library.
> >>>>>>
> >>>>>> Alternatively, we could provide the daemon functionality in libxc or
> >>>>>> some other Xen library and only provide support for low level
> >>>>>> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
> >>>>>> that to offer higher level addresses (e.g., pa or va) using its
> >>>>>> existing translation mechanisms.  This approach would more closely
> >>>>>> mirror the current division of labor between XenAccess and libxc.
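> >>>>>>
> >>>>>> As a sketch of that division of labor (the names are illustrative,
> >>>>>> not the real XenAccess or libxc API):
> >>>>>>
> >>>>>>   #include <stdint.h>
> >>>>>>
> >>>>>>   /* Low level, libxc or similar: watch len bytes at pfn + offset. */
> >>>>>>   int xc_mem_watch(int domid, uint64_t pfn, uint32_t offset,
> >>>>>>                    uint32_t len);
> >>>>>>
> >>>>>>   /* Stand-in for XenAccess's existing va -> pa translation. */
> >>>>>>   uint64_t xa_translate_va(int domid, uint64_t va);
> >>>>>>
> >>>>>>   /* Higher level, XenAccess: take a guest virtual address, reuse
> >>>>>>    * the existing translation machinery, then delegate. */
> >>>>>>   int xa_watch_va(int domid, uint64_t va, uint32_t len)
> >>>>>>   {
> >>>>>>       uint64_t pa = xa_translate_va(domid, va);
> >>>>>>       return xc_mem_watch(domid, pa >> 12, pa & 0xfff, len);
> >>>>>>   }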
> >>>>>>
> >>>>>> -bryan
> >>>>>>
> >>>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
> >>>>>> <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>> [From Patrick]
> >>>>>>>
> >>>>>>>> Since I'm coming in the middle of this discussion, forgive me if I've
> >>>>>>>> missed something.  But is the idea here to create a more general
> >>>>>>>> interface that could support various different types of memory events
> >>>>>>>> + notification?  And the two events listed below are just a subset of
> >>>>>>>> the events that could / would be supported?
> >>>>>>>
> >>>>>>> That's correct.
> >>>>>>>
> >>>>>>>
> >>>>>>>> In general, I like the sound of where this is going but I would like
> >>>>>>>> to see support for notification of events such as when a domU reads /
> >>>>>>> writes / execs pre-specified byte(s) of memory.  As such, there
> >>>>>>> would need to be a notification path (as discussed below) and also a
> >>>>>>> control path to set up the memory regions that the user app cares
> >>>>>>>> about.
> >>>>>>>
> >>>>>>> Sub-page events are something I would like to have included as well.
> >>>>>>> Currently the control path is basically just "nominating" a page (for
> >>>>>>> either swapping or sharing). The best way to go about this isn't
> >>>>>>> entirely clear to me. With swapping and sharing we have code in Xen to
> >>>>>>> handle both cases. However, to just receive notifications (like
> >>>>>>> "read", "write", "execute") I don't think we need specialised support,
> >>>>>>> or at least only one generic piece of it to handle the notifications.
> >>>>>>> I'm thinking it might be good to have a daemon that handles these
> >>>>>>> events in user-space, and to register clients with that user-space
> >>>>>>> daemon. Each client would get a unique client ID which could be used
> >>>>>>> to identify who should get the response. This way, we could just
> >>>>>>> register that somebody is interested in that page (or byte, etc.) and
> >>>>>>> let the user-space tool handle most of the complex logic, i.e. decide
> >>>>>>> which of the clients a particular notification should go to. This
> >>>>>>> requires some notion of priority for memory areas. For example, if one
> >>>>>>> client requests notification for access to a byte of page foo and
> >>>>>>> another requests notification for access to any of page foo, then Xen
> >>>>>>> only needs to store that it should notify for page foo and send along
> >>>>>>> which byte(s) of the page were accessed; the user-space daemon can
> >>>>>>> then determine whether both clients should be notified or just one.
> >>>>>>> Similarly, if one client requests async notification and another
> >>>>>>> requests sync notification, then Xen only needs to know to do sync
> >>>>>>> notification. What are everybody's thoughts on this? Does it seem
> >>>>>>> reasonable, or have I gone completely mad?
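> >>>>>>>
> >>>>>>> A minimal sketch of the daemon-side bookkeeping I have in mind;
> >>>>>>> every name below is made up for illustration, none of it is
> >>>>>>> existing code:
> >>>>>>>
> >>>>>>>   #include <stdint.h>
> >>>>>>>   #include <stdbool.h>
> >>>>>>>
> >>>>>>>   struct client {
> >>>>>>>       uint32_t id;          /* unique ID, routes the response */
> >>>>>>>       uint64_t gfn;         /* page of interest */
> >>>>>>>       uint32_t offset, len; /* byte range within that page */
> >>>>>>>       bool     sync;        /* does this client need sync delivery? */
> >>>>>>>       struct client *next;
> >>>>>>>   };
> >>>>>>>
> >>>>>>>   static struct client *clients;
> >>>>>>>
> >>>>>>>   /* Hypothetical transport back to a registered client. */
> >>>>>>>   void notify_client(uint32_t id, uint64_t gfn, uint32_t off,
> >>>>>>>                      uint32_t len);
> >>>>>>>
> >>>>>>>   /* Merging rule for Xen: use sync delivery for a page if any one
> >>>>>>>    * of the clients watching it asked for sync. */
> >>>>>>>   static bool page_needs_sync(uint64_t gfn)
> >>>>>>>   {
> >>>>>>>       for (struct client *c = clients; c; c = c->next)
> >>>>>>>           if (c->gfn == gfn && c->sync)
> >>>>>>>               return true;
> >>>>>>>       return false;
> >>>>>>>   }
> >>>>>>>
> >>>>>>>   /* On a notification carrying (gfn, off, len), forward to exactly
> >>>>>>>    * those clients whose ranges overlap the accessed bytes. */
> >>>>>>>   static void dispatch(uint64_t gfn, uint32_t off, uint32_t len)
> >>>>>>>   {
> >>>>>>>       for (struct client *c = clients; c; c = c->next)
> >>>>>>>           if (c->gfn == gfn && off < c->offset + c->len
> >>>>>>>                             && c->offset < off + len)
> >>>>>>>               notify_client(c->id, gfn, off, len);
> >>>>>>>   }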
> >>>>>>>
> >>>>>>>
> >>>>>>> Patrick
> >>>>>>>
> >>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos
> >>>>>>> <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>>> [From Bryan]
> >>>>>>>>
> >>>>>>>> Bryan D. Payne, to Patrick, me, george.dunlap, Andrew, Steven
> >>>>>>>> (sent Jun 16)
> >>>>>>>>
> >>>>>>>> Patrick, thanks for the inclusion.
> >>>>>>>>
> >>>>>>>> Since I'm coming in the middle of this discussion, forgive me if I've
> >>>>>>>> missed something.  But is the idea here to create a more general
> >>>>>>>> interface that could support various different types of memory events
> >>>>>>>> + notification?  And the two events listed below are just a subset of
> >>>>>>>> the events that could / would be supported?
> >>>>>>>>
> >>>>>>>> In general, I like the sound of where this is going but I would like
> >>>>>>>> to see support for notification of events such as when a domU reads /
> >>>>>>>> writes / execs pre-specified byte(s) of memory.  As such, there
> >>>>>>>> would need to be a notification path (as discussed below) and also a
> >>>>>>>> control path to set up the memory regions that the user app cares
> >>>>>>>> about.
> >>>>>>>>
> >>>>>>>> -bryan
> >>>>>>>>
> >>>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos
> >>>>>>>> <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>>>> [From Patrick]
> >>>>>>>>>
> >>>>>>>>> I think the idea of multiple rings is a good one. We'll register the
> >>>>>>>>> clients in Xen, and when a mem_event is triggered we can just iterate
> >>>>>>>>> through the list of listeners to see who needs a notification.
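> >>>>>>>>>
> >>>>>>>>> The Xen-side walk could be as simple as this (a sketch only, all
> >>>>>>>>> names invented):
> >>>>>>>>>
> >>>>>>>>>   #include <stdint.h>
> >>>>>>>>>
> >>>>>>>>>   struct mem_event_req { uint32_t type; uint64_t gfn; };
> >>>>>>>>>   struct ring;  /* one shared ring per listener, opaque here */
> >>>>>>>>>   void ring_put(struct ring *r, const struct mem_event_req *req);
> >>>>>>>>>
> >>>>>>>>>   struct listener {
> >>>>>>>>>       struct ring     *ring;
> >>>>>>>>>       uint32_t         type_mask;  /* event types it asked for */
> >>>>>>>>>       struct listener *next;
> >>>>>>>>>   };
> >>>>>>>>>
> >>>>>>>>>   /* Copy the request onto every ring whose owner wants this
> >>>>>>>>>    * event type. */
> >>>>>>>>>   static void mem_event_broadcast(struct listener *head,
> >>>>>>>>>                                   const struct mem_event_req *req)
> >>>>>>>>>   {
> >>>>>>>>>       for (struct listener *l = head; l; l = l->next)
> >>>>>>>>>           if (l->type_mask & (1u << req->type))
> >>>>>>>>>               ring_put(l->ring, req);
> >>>>>>>>>   }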
> >>>>>>>>>
> >>>>>>>>> The person working on the anti-virus stuff is Bryan Payne from
> >>>>>>>>> Georgia Tech. I've CCed him so we can get his input on this stuff
> >>>>>>>>> as well. It's better to hash out a proper interface now rather than
> >>>>>>>>> continually changing it around.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Patrick
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 23, 2010 at 11:19 PM, Grzegorz Milos
> >>>>>>>>> <grzegorz.milos@xxxxxxxxx> wrote:
> >>>>>>>>>> [From Gregor]
> >>>>>>>>>>
> >>>>>>>>>> There are two major events that the memory sharing code needs to
> >>>>>>>>>> communicate over the hypervisor/userspace boundary:
> >>>>>>>>>> 1. GFN unsharing failed due to lack of memory. This will be called
> >>>>>>>>>>    the 'OOM event' from now on.
> >>>>>>>>>> 2. MFN is no longer sharable (actually an opaque sharing handle
> >>>>>>>>>>    would be communicated instead of the MFN). 'Handle invalidate
> >>>>>>>>>>    event' from now on.
> >>>>>>>>>>
> >>>>>>>>>> The requirements on the OOM event are relatively similar to the
> >>>>>>>>>> page-in event. The way this should operate is that the faulting
> >>>>>>>>>> VCPU is paused, and the pager is requested to free up some memory.
> >>>>>>>>>> When it does so, it should generate an appropriate response, and
> >>>>>>>>>> wake the VCPU up again using a domctl. The event is going to be
> >>>>>>>>>> low volume, and since it is going to be handled synchronously,
> >>>>>>>>>> likely in tens of ms, there are no particular requirements on
> >>>>>>>>>> efficiency.
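> >>>>>>>>>>
> >>>>>>>>>> A rough sketch of the pager side of that loop; all helper names
> >>>>>>>>>> are made up (none of them are real libxc calls), and since the
> >>>>>>>>>> event is low volume the simple blocking loop is fine:
> >>>>>>>>>>
> >>>>>>>>>>   #include <stdint.h>
> >>>>>>>>>>   #include <stdbool.h>
> >>>>>>>>>>
> >>>>>>>>>>   struct oom_req { uint16_t domid; uint32_t vcpu; uint64_t gfn; };
> >>>>>>>>>>
> >>>>>>>>>>   /* Assumed helpers, for illustration only. */
> >>>>>>>>>>   void wait_for_event_channel(int port);
> >>>>>>>>>>   bool ring_get_request(struct oom_req *req);
> >>>>>>>>>>   void ring_put_response(const struct oom_req *rsp);
> >>>>>>>>>>   void evict_some_pages(uint16_t domid);
> >>>>>>>>>>   void domctl_wake_vcpu(uint16_t domid, uint32_t vcpu);
> >>>>>>>>>>
> >>>>>>>>>>   void pager_oom_loop(int port)
> >>>>>>>>>>   {
> >>>>>>>>>>       for (;;) {
> >>>>>>>>>>           struct oom_req req;
> >>>>>>>>>>           wait_for_event_channel(port);     /* block on Xen */
> >>>>>>>>>>           while (ring_get_request(&req)) {
> >>>>>>>>>>               evict_some_pages(req.domid);  /* free up memory */
> >>>>>>>>>>               ring_put_response(&req);      /* acknowledge */
> >>>>>>>>>>               domctl_wake_vcpu(req.domid, req.vcpu); /* unpause */
> >>>>>>>>>>           }
> >>>>>>>>>>       }
> >>>>>>>>>>   }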
> >>>>>>>>>>
> >>>>>>>>>> The handle invalidate event type is less important in the short
> >>>>>>>>>> term because the userspace sharing daemon is designed to be
> >>>>>>>>>> resilient to stale sharing state. However, if it is missing it
> >>>>>>>>>> will make the sharing progressively less effective as time goes
> >>>>>>>>>> on. The idea is that the hypervisor communicates which sharing
> >>>>>>>>>> handles are no longer valid, such that the sharing daemon only
> >>>>>>>>>> attempts to share pages in the correct state. This would be a
> >>>>>>>>>> relatively high-volume event, but it doesn't need to be lossless
> >>>>>>>>>> (i.e. events can be dropped if they are not consumed quickly
> >>>>>>>>>> enough). As such this event should be batch delivered, in an
> >>>>>>>>>> asynchronous fashion.
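> >>>>>>>>>>
> >>>>>>>>>> Because dropped events are harmless here, the ring can be lossy.
> >>>>>>>>>> A sketch of the producer side (names and sizes invented):
> >>>>>>>>>>
> >>>>>>>>>>   #include <stdint.h>
> >>>>>>>>>>
> >>>>>>>>>>   #define INVAL_RING_SIZE 1024   /* power of two, arbitrary */
> >>>>>>>>>>
> >>>>>>>>>>   struct inval_ring {
> >>>>>>>>>>       uint64_t handles[INVAL_RING_SIZE]; /* invalidated handles */
> >>>>>>>>>>       uint32_t prod, cons;             /* free-running counters */
> >>>>>>>>>>   };
> >>>>>>>>>>
> >>>>>>>>>>   /* The producer (Xen) never blocks: if the daemon falls behind,
> >>>>>>>>>>    * the oldest unconsumed handles are overwritten.  The daemon
> >>>>>>>>>>    * only wastes a few share attempts on stale handles. */
> >>>>>>>>>>   static void inval_push(struct inval_ring *r, uint64_t handle)
> >>>>>>>>>>   {
> >>>>>>>>>>       r->handles[r->prod++ & (INVAL_RING_SIZE - 1)] = handle;
> >>>>>>>>>>       if (r->prod - r->cons > INVAL_RING_SIZE)
> >>>>>>>>>>           r->cons = r->prod - INVAL_RING_SIZE; /* events lost */
> >>>>>>>>>>   }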
> >>>>>>>>>>
> >>>>>>>>>> The OOM event is coded up in Xen, but it will not be consumed
> >>>>>>>>>> properly in the pager. If I remember correctly, I didn't want to
> >>>>>>>>>> interfere with the page-in events because the event interface
> >>>>>>>>>> assumed that mem-event responses are inserted onto the ring in
> >>>>>>>>>> precisely the same order as the requests. This may not be the
> >>>>>>>>>> case when we start mixing different event types. WRT the handle
> >>>>>>>>>> invalidation, the relevant hooks exist in Xen, and in the mem
> >>>>>>>>>> sharing daemon, but there is no way to communicate events to two
> >>>>>>>>>> different consumers atm.
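> >>>>>>>>>>
> >>>>>>>>>> One way to lift the in-order assumption (again just a sketch with
> >>>>>>>>>> invented names): tag each request with an ID that the response
> >>>>>>>>>> must echo, so consumers can reply in any order and Xen matches
> >>>>>>>>>> responses by ID rather than by ring position:
> >>>>>>>>>>
> >>>>>>>>>>   #include <stdint.h>
> >>>>>>>>>>
> >>>>>>>>>>   struct mem_event_req { uint32_t id; uint32_t type;
> >>>>>>>>>>                          uint64_t gfn; };
> >>>>>>>>>>   struct mem_event_rsp { uint32_t id; uint32_t flags; };
> >>>>>>>>>>
> >>>>>>>>>>   static uint32_t next_req_id;
> >>>>>>>>>>
> >>>>>>>>>>   static void fill_req(struct mem_event_req *req, uint32_t type,
> >>>>>>>>>>                        uint64_t gfn)
> >>>>>>>>>>   {
> >>>>>>>>>>       req->id   = next_req_id++;  /* echoed in the response */
> >>>>>>>>>>       req->type = type;
> >>>>>>>>>>       req->gfn  = gfn;
> >>>>>>>>>>   }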
> >>>>>>>>>>
> >>>>>>>>>> Since the requirements on the two sharing event types are
> >>>>>>>>>> substantially different, I think it may be easier if separate
> >>>>>>>>>> channels (i.e. separate rings) were used to transfer them. This
> >>>>>>>>>> would also fix the multiple-consumers issue relatively easily. Of
> >>>>>>>>>> course you may know of some other mem events that wouldn't fit in
> >>>>>>>>>> that scheme.
> >>>>>>>>>>
> >>>>>>>>>> I remember that there was someone working on external anti-virus
> >>>>>>>>>> software, which prompted the whole mem-event work. I don't remember
> >>>>>>>>>> his/her name or affiliation (could you remind me?), but maybe
> >>>>>>>>>> he/she would be interested in working on some of this?
> >>>>>>>>>>
> >>>>>>>>>> Thanks
> >>>>>>>>>> Gregor
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
> 

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
