
Re: [Xen-devel] Possible error restoring machine



On Wed, 2012-05-23 at 11:25 +0100, Ian Campbell wrote:
> CCing the Remus maintainer since all this non-blocking stuff is for
> remus/checkpointing.
> 
> On Wed, 2012-05-23 at 10:39 +0100, Frediano Ziglio wrote:
> > I noticed a possible problem when restoring a machine.
> > 
> > In xc_domain_restore (xc_domain_restore.c), if it's not the last
> > checkpoint we set the O_NONBLOCK flag (search for fcntl) so that we can
> > call pagebuf_get or just load other pages (see the following
> > "goto loadpages;" line).
> > Now we could end up calling xc_tmem_restore/xc_tmem_restore_extra
> > (xc_tmem.c), which call read_exact (xc_private.c) on the same
> > non-blocking socket/file
> 
> There's a bunch of such places in that function, the RDEXACT macro is
> also == rdexact except on Minios.
> 
> >  but read_exact does not handle EAGAIN/EWOULDBLOCK
> > (both can be returned on a non-blocking socket depending on file type and
> > Unix/Linux version), leading to a failure.
> > Does this make sense or is it impossible?
> 
> Isn't this what the if line:
>         len = read(fd, buf + offset, size - offset);
>         if ( (len == -1) && ((errno == EINTR) || (errno == EAGAIN)) )
>             continue;
> 
> is doing?
> 
> > Also note that rdexact (xc_domain_restore.c) handles a data timeout, but
> > we can still block in read_exact called by
> > xc_tmem_restore/xc_tmem_restore_extra.
> 
> Oh, wait! read_exact != rdexact -- ouch! Those are confusingly similar!
> 
> I suspect we need to pull the xc_tmem_{save,restore} into the
> appropriate file and use the non-blocking capable versions or to export
> the non-blocking function, with an improved name, so it can be used from
> xc_tmem.c.
> 

I was working on a patch to reduce CPU usage and the number of read calls
by buffering io_fd.

It currently works, but it is not yet good enough to post.
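
To illustrate the general idea of buffering reads on a descriptor (this is
not the patch mentioned above, which was not posted in this message), here
is a rough sketch. The struct iobuf type, the iobuf_* function names and the
4096-byte buffer size are all assumptions made for the example, and it
assumes a blocking descriptor for simplicity.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define IOBUF_SIZE 4096   /* arbitrary size chosen for this sketch */

/* Hypothetical buffered reader wrapping a blocking file descriptor. */
struct iobuf {
    int fd;
    size_t pos;                      /* next unread byte in data[] */
    size_t end;                      /* number of valid bytes in data[] */
    unsigned char data[IOBUF_SIZE];
};

static void iobuf_init(struct iobuf *b, int fd)
{
    b->fd = fd;
    b->pos = 0;
    b->end = 0;
}

/* Copy exactly `size` bytes to `out`; 0 on success, -1 on error or EOF. */
static int iobuf_read_exact(struct iobuf *b, void *out, size_t size)
{
    unsigned char *dst = out;

    while (size > 0) {
        if (b->pos == b->end) {
            /* Buffer drained: refill it with a single large read(). */
            ssize_t len = read(b->fd, b->data, sizeof(b->data));

            if (len < 0 && errno == EINTR)
                continue;
            if (len <= 0)
                return -1;
            b->pos = 0;
            b->end = (size_t)len;
        }

        /* Serve as much of the request as the buffer currently holds. */
        size_t n = b->end - b->pos;
        if (n > size)
            n = size;
        memcpy(dst, b->data + b->pos, n);
        b->pos += n;
        dst += n;
        size -= n;
    }
    return 0;
}
```

Many small exact reads are then served from memory, and read() is only
called when the buffer runs dry.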

> Shriram, any thoughts?
> 
> > 
> > Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
> > are network problems?
> > 

Frediano

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

