
Re: [Xen-API] timing loops


  • To: 'Anil Madhavapeddy' <anil@xxxxxxxxxx>
  • From: Dave Scott <Dave.Scott@xxxxxxxxxxxxx>
  • Date: Tue, 10 Jul 2012 15:36:59 +0100
  • Accept-language: en-US
  • Cc: "xen-api@xxxxxxxxxxxxx" <xen-api@xxxxxxxxxxxxx>
  • Delivery-date: Tue, 10 Jul 2012 14:37:08 +0000
  • List-id: User and development list for XCP and XAPI <xen-api.lists.xen.org>
  • Thread-index: Ac1ep5rCNhebkJw6SuetvxJo/JMcFwAAKisw
  • Thread-topic: [Xen-API] timing loops

Hopefully in the future the whole stack will support cancellation -- so the 
user can apply their own timeout values in their code instead of us doing it 
one-size-fits-all. A lot of the domain-level stuff can now be cancelled (which 
may cause the domain to crash if it happens at a bad time... but this does at 
least usually cause things to unwind). Most of the storage interface is 
uncancellable, which is a big problem since it involves off-box RPCs. We either 
need to fix that directly or offer users the big red button labeled "driver 
domain restart", which will unstick things.

One bad thing about not supporting cancellation is that it encourages people to 
close connections and walk away, unaware that a large number of resources (and 
locks) are still being consumed server-side.

One good thing to do would be to send heartbeats to any running CLIs and 
auto-cancel when the connection is broken, unless some "--async" option is 
given, which would return immediately with a Task.
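
As a rough sketch of that idea (illustrative only, not existing xapi code: the
connection_alive and cancel callbacks here are hypothetical), a monitor thread
could poll the client connection and cancel the server-side work when it dies:

let monitor_connection ~connection_alive ~cancel ~interval =
  (* Poll the client connection every [interval] seconds; as soon as it is
     found to be dead, cancel the server-side task so that its resources and
     locks get released. *)
  let rec loop () =
    Thread.delay interval;
    if connection_alive () then loop () else cancel ()
  in
  Thread.create loop ()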

In the meantime we always tune the timeouts to fail eventually if the system 
gets truly stuck under high load. This leads to fairly long timeouts, which 
isn't ideal for everyone. There's a tension between high timeouts for stress 
testing and low timeouts for user experience -- we can't do both :(
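
On the lwt point in the quoted mail below, the same "wait for p or give up
after a timeout" pattern can be written as a race between the real work and a
sleeping thread; a minimal sketch (not code we actually ship):

exception Timeout

let with_timeout ~seconds t =
  (* Run [t], but fail with [Timeout] if it hasn't completed within [seconds];
     Lwt.pick cancels whichever thread loses the race. *)
  Lwt.pick [
    t;
    Lwt.bind (Lwt_unix.sleep seconds) (fun () -> Lwt.fail Timeout);
  ]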

Cheers,
Dave

> -----Original Message-----
> From: Anil Madhavapeddy [mailto:anil@xxxxxxxxxx]
> Sent: 10 July 2012 15:24
> To: Dave Scott
> Cc: xen-api@xxxxxxxxxxxxx
> Subject: Re: [Xen-API] timing loops
> 
> How do you decide on a reasonable value of n, given that real timeouts
> shift so dramatically with dom0 system load?  Or rather, what areas of
> xapi aren't fully event-driven and require such timeouts?
> 
> I can imagine the device/udev layer being icky in this regard, but a
> good way to wrap all such instances might be to have a single event-
> dispatch daemon which combines all the system events and timeouts, and
> coordinates the remainder of the xapi process cluster (which will not
> need arbitrary timeouts as a result).  Or is it just too impractical since
> there are so many places where such timeouts are required?
> 
> -anil
> 
> On 10 Jul 2012, at 15:18, Dave Scott wrote:
> 
> > Hi,
> >
> > With all the recent xapi disaggregation work, are we now more
> > vulnerable to failures induced by moving the system clock around,
> > affecting timeout logic in our async-style interfaces where we wait for
> > 'n' seconds for an event notification?
> >
> > I've recently added 'oclock' as a dependency which gives us access to
> > a monotonic clock source, which is perfect (I believe) for reliably
> > 'timing out'. I started a patch to convert the whole codebase over but
> > it was getting much too big and hard to test because sometimes we
> > really do want a calendar date, and other times we really want a point
> > in time.
> >
> > Maybe I should make a subset of my patch which fixes all the new
> > timing loops that have been introduced. What do you think? Would you
> > like to confess to having written:
> >
> > let start = Unix.gettimeofday () in
> > while not (p ()) && Unix.gettimeofday () -. start < timeout do
> >   Thread.delay 1.
> > done
> >
> > I've got a nice higher-order function to replace this which does:
> >
> > let until p timeout interval =
> >   let start = Oclock.gettime Oclock.monotonic in
> >   let elapsed () =
> >     Int64.(to_float (sub (Oclock.gettime Oclock.monotonic) start)) /. 1e9
> >   in
> >   while not (p ()) && elapsed () < timeout do Thread.delay interval done
> >
> > I believe this is one of many things that lwt (and JS core) does a
> > nice job of.
> >
> > Cheers,
> > Dave
> >

_______________________________________________
Xen-api mailing list
Xen-api@xxxxxxxxxxxxx
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api


 

