
Re: [Xen-devel] [PATCH] libxl: Increase device model startup timeout to 1min.



Anthony PERARD writes ("Re: [PATCH] libxl: Increase device model startup 
timeout to 1min."):
> On Thu, Jul 02, 2015 at 01:38:37PM +0100, Ian Jackson wrote:
> > I'm starting to think that this might be a real bug but that the bug
> > might be "Linux's I/O subsystem sometimes produces appalling latency
> > under load" (which is hardly news).
> 
> I guess the straces support this; here are a few quotes from different straces:
...
> 04:11:50.602639 mmap(0x7f845bc29000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x31000) = 0x7f845bc29000 <0.000038>
> 04:11:51.257654 close(3)                = 0 <0.000042>
...
> The first quote is a pattern I'm seeing very often on slow dm starts, where
> it takes a long time between the mmap and the next syscall. In the second
> quote, read() is to blame: it took 1s.
> 
> I guess even the first quote implies there is going to be I/O after the mmap
> call, doesn't it?

It's very likely, yes.  The code after mmap will probably start
reading the pages just mapped.
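
To illustrate why the cost shows up after mmap rather than inside it, here
is a minimal Python sketch. It assumes a throwaway temporary file; the names
touch_mapped_pages and demo are hypothetical and have nothing to do with
the device model or libxl. The mapping call only sets up the address range,
and the first access to each page is what may trigger a read from disk.

```python
import mmap
import os
import tempfile
import time

PAGE = mmap.PAGESIZE


def touch_mapped_pages(path):
    """Map a file read-only and read one byte from each page.

    Returns (map_seconds, touch_seconds, pages_touched).
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        t0 = time.monotonic()
        m = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ)
        t1 = time.monotonic()
        pages = 0
        try:
            for offset in range(0, size, PAGE):
                m[offset]  # first access may fault the page in from disk
                pages += 1
        finally:
            m.close()
        t2 = time.monotonic()
    return t1 - t0, t2 - t1, pages


def demo(num_pages=16):
    """Create a temporary file of num_pages pages and touch each page."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (PAGE * num_pages))
        path = tmp.name
    try:
        return touch_mapped_pages(path)
    finally:
        os.unlink(path)
```

A freshly written file is normally still in the page cache, so this demo
will not reproduce the latency itself. It only shows where the page faults
occur; on a loaded machine with cold caches, the touch loop is the part
that would stall.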


Thanks for this investigation.
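
For reference, the stall in the quoted trace can be computed from the two
lines themselves. The sketch below assumes the traces were taken with
timestamps and per-call durations (the "HH:MM:SS.usec ... <seconds>" layout
seen above); the helper names parse and gap_seconds are hypothetical.

```python
import re
from datetime import datetime

# Timestamp, syscall name, and the trailing "<duration>" field.
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) (\w+)\(.*<(\d+\.\d+)>$")


def parse(line):
    """Return (start_time, syscall_name, duration_seconds) for one line."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError("unrecognised strace line: %r" % line)
    start = datetime.strptime(m.group(1), "%H:%M:%S.%f")
    return start, m.group(2), float(m.group(3))


def gap_seconds(first, second):
    """Time between the end of `first` and the start of `second`.

    Does not handle a trace that crosses midnight.
    """
    start1, _, dur1 = parse(first)
    start2, _, _ = parse(second)
    return (start2 - start1).total_seconds() - dur1


MMAP_LINE = ("04:11:50.602639 mmap(0x7f845bc29000, 8192, PROT_READ|PROT_WRITE, "
             "MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x31000) = 0x7f845bc29000 "
             "<0.000038>")
CLOSE_LINE = "04:11:51.257654 close(3)                = 0 <0.000042>"
```

Calling gap_seconds(MMAP_LINE, CLOSE_LINE) gives roughly 0.65 seconds spent
outside any syscall between the mmap and the close.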

I am now convinced that this is indeed the bug "Linux's I/O subsystem
sometimes produces appalling latency under load".  That bug has
existed for at least a decade and seems unlikely to be fixed any time
soon.  Certainly, fixing it is beyond our scope.

So papering over this with an increase in the timeout is probably
proper.  I'm tempted to suggest increasing the timeout only on Linux.
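
A Linux-only increase could look roughly like the following sketch. It is
an illustration in Python, not libxl code: the function name and the
base_timeout_s default are hypothetical placeholders, and only the 60
second figure comes from the patch subject.

```python
import sys


def device_model_start_timeout(platform=sys.platform,
                               base_timeout_s=10,     # placeholder, not from this thread
                               linux_timeout_s=60):   # the 1min from the patch subject
    """Pick a startup timeout, using the longer value only on Linux."""
    if platform.startswith("linux"):
        return linux_timeout_s
    return base_timeout_s
```

Passing the platform as a parameter keeps the choice testable on any host,
for example device_model_start_timeout("linux").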

Ian.
