Re: [Xen-devel] [Xen-users] "xl restore" leaks a file descriptor?

On Tue, 2015-08-11 at 11:13 -0400, Andrew Armenia wrote:
> It's the checkpoint file - i.e. the command line argument to xl
> restore - that is being leaked.


> So the checkpoint file is clearly being leaked.

Indeed. I have confirmed this with the current development version as well:
ls -l /proc/<pid>/fd shows an fd still open on the deleted file:

# ps aux| grep xl
root     20465  0.0  0.2 106036   984 ?        SLsl 15:42   0:00 xl restore save
# ls -l /proc/20465/fd
lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save
# rm /root/save
# ls -l /proc/20465/fd
lr-x------. 1 root root 64 Aug 11 15:42 7 -> /root/save (deleted)
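
For anyone who wants to do the same check from code rather than by eye,
here is a rough sketch in C. It is not taken from xl; fd_is_deleted() is a
hypothetical helper, and it assumes Linux, where the /proc/<pid>/fd symlink
target gains a " (deleted)" suffix once the file is unlinked.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, Linux only: returns 1 if fd (in the calling
 * process) refers to a file that has been unlinked, 0 if not, and -1 on
 * error.  It relies on the " (deleted)" suffix the kernel appends to the
 * /proc/self/fd/N link target, so a file whose real name ends in
 * " (deleted)" would be misreported.
 */
int fd_is_deleted(int fd)
{
    static const char suffix[] = " (deleted)";
    const size_t slen = sizeof(suffix) - 1;
    char link[64];
    char target[4096];
    ssize_t n;

    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    n = readlink(link, target, sizeof(target) - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';

    if ((size_t)n < slen)
        return 0;
    return strcmp(target + n - slen, suffix) == 0;
}
```

To inspect another process, as in the transcript above, the same idea
applies with /proc/<pid>/fd/<n> in place of /proc/self/fd/<n>.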

>  Its space is not freed
> until the 'xl restore' process is ended by shutting down the domain:
> It seems like xl restore should close the checkpoint file as soon as
> it's done restoring the domain, allowing the space to be freed, but
> that's clearly not happening.

Right. In fact xl sets the file to be close-on-exec right after opening it,
which is before the daemonisation step, so it ought to be closed
automatically, but isn't for some reason.

My working theory is that something in the machinery which spawns the save
helper is defeating the use of CLOEXEC, perhaps by dup2() or perhaps by
unsetting CLOEXEC.
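
To illustrate the dup2() half of that theory: the descriptor created by
dup2() does not inherit FD_CLOEXEC, even when the original has it set. The
following is a minimal standalone sketch of that behaviour, not xl or libxl
code; has_cloexec() and cloexec_after_dup2() are names made up for the
example, and it does not show that this is what actually happens in the
save helper path.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if fd has FD_CLOEXEC set, 0 if it does not, -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

/*
 * Hypothetical demonstration helper: open path read-only, mark it
 * close-on-exec, then duplicate it onto target_fd with dup2().  The
 * close-on-exec state of the original and of the duplicate is reported
 * through the two out parameters.  Returns 0 on success, -1 on error.
 */
int cloexec_after_dup2(const char *path, int target_fd,
                       int *orig_cloexec, int *dup_cloexec)
{
    int fd;

    assert(orig_cloexec && dup_cloexec);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* dup2(fd, fd) is a no-op and would not demonstrate anything. */
    if (fd == target_fd ||
        fcntl(fd, F_SETFD, FD_CLOEXEC) < 0 ||
        dup2(fd, target_fd) < 0) {
        close(fd);
        return -1;
    }

    *orig_cloexec = has_cloexec(fd);         /* flag set above */
    *dup_cloexec = has_cloexec(target_fd);   /* cleared by dup2() */

    close(target_fd);
    close(fd);
    return 0;
}
```

Calling cloexec_after_dup2("/dev/null", 100, &o, &d) should leave o set and
d clear, so a copy made this way would survive an exec unless the flag is
set again on the new descriptor (or dup3() with O_CLOEXEC is used on Linux).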

Anyway, thanks for reporting. I've copied the devel list and the 4.6 RM. Wei,
this probably ought to be a blocker for 4.6 (and the fix ought ultimately
to be backported to 4.4 onwards at least).

NB: This leak seems to be independent of the switch to migration v2.


