[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Possible error restoring machine



I noted a possible problem restoring a machine.

In xc_domain_restore (xc_domain_restore.c) if it's not the last
checkpoint we set O_NONBLOCK flag (search for fcntl) that we can call
pagebuf_get or just load other pages (see following "goto loadpages;"
line).
Now we could ending up calling xc_tmem_restore/xc_tmem_restore_extra
(xc_tmem.c) which call read_extract (xc_private.c) on the same non
blocking socket/file but read_extract does not handle EAGAIN/EWOULDBLOCK
(both can be returned on non blocking socket depending on file type and
Unix/Linux version) leading to a failure.
Does this make sense or is it impossible ??

Also note that rdexact (xc_domain_restore.c) handle data timeout but we
can still block in read_exact called by
xc_tmem_restore/xc_tmem_restore_extra.

Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
are network problems?

Frediano

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.