
Re: xenstore - Suggestion of batching watch events



On Tue, Jun 24, 2025 at 4:21 PM Christian Lindig
<christian.lindig@xxxxxxxxx> wrote:
>
> I believe what you observe is a major source of inefficiency for the reason 
> you describe: changes are acted upon too early because there is no way to 
> observe that they are part of a transaction. So now heuristics come in like 
> waiting for more changes before acting on the ones observed. I wonder how the 
> tree structure plays into this. Clients watch different sub trees and we 
> don’t exploit this knowledge. I do agree that some protocol or syntax to 
> batch updates would be useful.

This is probably where the Irmin-based prototype had an advantage:
pulling batched updates is something that 'git' is good at, and in
many ways Irmin was similar to git.
When I joined XenServer I was told that all the improvements from
the Irmin prototype had been integrated back into the mainline one, but
that clearly isn't true, because:
* the mainline one had a lot of security issues that the Irmin-based one didn't
* batching of updates is completely absent

Since this "high-bandwidth" event mechanism is only needed by Dom0 I
think we can avoid sending these using the xenstore PV ring protocol,
and even if we don't use Irmin to implement or send the updates, we
shouldn't be constrained anymore by the 4K limitation of a xenstore
packet on the socket interface in Dom0.

If this is a new kind of API message then clients can opt-in (older
clients who can't cope with such larger responses won't know to make
the new API call either).
That would avoid the problem we had with the directory protocol where
O and C versions assigned different semantics *to the same API call*
(O version used >4k packets, C version failed and only worked with the
PARTIAL protocol). Now we finally have behaviour parity on that, so
any change we make, I suggest we do by introducing a new API call that
doesn't inherit legacy limitations like 4K packet sizes, at least on
unix domain sockets.
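
To illustrate the size argument, here is a rough Python sketch (not taken
from either xenstored implementation). It compares today's delivery of one
packet per watch event with a hypothetical single batched reply.
XENSTORE_PAYLOAD_MAX and the "path\0token\0" event layout follow xs_wire.h;
batched_payload(), the example paths and the token are assumptions made up
for illustration only.

```python
XENSTORE_PAYLOAD_MAX = 4096  # per-message payload limit from xen/io/xs_wire.h


def event_payload(path: str, token: str) -> bytes:
    """Payload of a single watch event: path and token, each NUL-terminated."""
    return path.encode() + b"\0" + token.encode() + b"\0"


def legacy_packets(events):
    """Current behaviour: one small packet per changed path."""
    return [event_payload(path, token) for path, token in events]


def batched_payload(events):
    """Hypothetical: every event of one commit concatenated into one reply."""
    return b"".join(event_payload(path, token) for path, token in events)


# Example data (assumed): 200 paths touched by a single transaction.
events = [(f"/local/domain/0/backend/vbd/5/{i}/state", "xenopsd")
          for i in range(200)]

fits_in_legacy_packet = len(batched_payload(events)) <= XENSTORE_PAYLOAD_MAX
```

With this example data each individual event fits easily into a 4K packet,
but the concatenated batch does not, which is why a batched reply would need
either chunking or a new call that is free of the limit.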

Best regards,
--Edwin
>
> — C
>
>
> > On 24 Jun 2025, at 15:51, Andriy Sultanov <sultanovandriy@xxxxxxxxx> wrote:
> >
> > Currently, as far as I am aware, the ability of xenstore clients to properly
> > handle and detect batch updates is somewhat lacking. Transactions are not
> > directly visible to the clients watching a particular directory - they will
> > receive a lot of individual watch_event's once the transaction is committed,
> > without any indication when such updates are going to end.
> >
> > Clients such as xenopsd from the xapi toolstack are reliant on xenstore to
> > track their managed domains, and a flood of individual updates most often
> > results in a flood of events raised from xenopsd to xapi. (There are
> > consolidation mechanisms implemented there, with updates getting merged
> > together, but if xapi picks up update events from the queue quickly enough,
> > it will only get more update events later.)
> >
> > The need for batching is fairly evident from the fact that XenServer's Windows
> > PV drivers, for example, adopted an ad-hoc "batch" optimization (not documented
> > anywhere, of course), where some sequence of writes is followed by a write of
> > the value "1" to "data/updated". This used to be honoured by xapi, which would
> > not consider the guest agent update done until it received notice of such a
> > "batch ended" update, but it caused xapi to miss updates that were not followed
> > by such a write, so xapi now ignores this ad-hoc batching. One could imagine
> > many workarounds here (for example, some sort of a mechanism where xenopsd
> > stalls an update for a second to see if any more related updates show up and
> > only then notifies xapi of it, with obvious trade-offs), but IMO it could be
> > worth considering making this easier on the xenstore side for different
> > use-cases.
> >
> > Suggestion:
> > WATCH_EVENT's req_id and tx_id are currently 0. Could it be possible, for
> > example, to modify this such that watch events coming as a result of a
> > transaction commit (a "batch") have tx_id of the corresponding transaction
> > and req_id of, say, 2 if it's the last such watch event of a batch and 1
> > otherwise? Old clients would still ignore these values, but it would allow
> > some others to detect if an update is part of a logical batch that doesn't end
> > until its last event.
> >
> > Is this beyond the scope of what xenstored wants to do? From a first glance,
> > this does not seem to introduce obvious unwanted information leaks either, but
> > I could be wrong. I would love to hear if this is something that could be
> > interesting to others and if this could be considered at all.
> >
> > Thank you!
> >
> >
>
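
For reference, the suggestion quoted above could look roughly like this on
the client side. This is only a Python sketch under stated assumptions: the
header layout (type, req_id, tx_id, len as four 32-bit fields) and
XS_WATCH_EVENT = 15 follow xs_wire.h, while the req_id values 1 ("more to
come") and 2 ("last event of the batch") exist only in the proposal, and
BatchCollector is a made-up helper name.

```python
import struct
from collections import defaultdict

XS_WATCH_EVENT = 15              # message type value from xen/io/xs_wire.h
HEADER = struct.Struct("=IIII")  # type, req_id, tx_id, len (struct xsd_sockmsg)

BATCH_MORE = 1  # proposed only: more events of this transaction follow
BATCH_LAST = 2  # proposed only: final event of this transaction


def encode_event(path, token, req_id=0, tx_id=0):
    """Build one watch event message (used here only to fake input data)."""
    payload = path.encode() + b"\0" + token.encode() + b"\0"
    return HEADER.pack(XS_WATCH_EVENT, req_id, tx_id, len(payload)) + payload


def decode_events(buf):
    """Yield (req_id, tx_id, path, token) for each watch event in buf."""
    offset = 0
    while offset < len(buf):
        msg_type, req_id, tx_id, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        payload = buf[offset:offset + length]
        offset += length
        if msg_type != XS_WATCH_EVENT:
            continue  # replies to ordinary requests are not handled here
        path, token = payload.split(b"\0")[:2]
        yield req_id, tx_id, path.decode(), token.decode()


class BatchCollector:
    """Hold back events of a batch until its last event has arrived."""

    def __init__(self):
        self.pending = defaultdict(list)  # tx_id -> paths seen so far

    def feed(self, req_id, tx_id, path):
        """Return the paths that are ready to be acted on, or None."""
        if req_id not in (BATCH_MORE, BATCH_LAST):
            return [path]  # unflagged event (req_id 0 today): act at once
        self.pending[tx_id].append(path)
        if req_id == BATCH_LAST:
            return self.pending.pop(tx_id)
        return None
```

A client would call feed() for every decoded event and only notify its
consumer when a list comes back. Events with req_id 0, which is what
xenstored sends today, pass straight through, so such a client would keep
working against an unmodified xenstored.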



 

