[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: xenstored file descriptor leak



On Wed, Feb 03, 2021 at 07:18:51AM +0100, Jürgen Groß wrote:
> On 02.02.21 19:37, Manuel Bouyer wrote:
> > Hello,
> > on NetBSD I'm tracking down an issue where xenstored never closes its
> > file descriptor to /var/run/xenstored/socket and instead loops at 100%
> > CPU on these descriptors.
> > 
> > xenstored loops because poll(2) returns a POLLIN event for these
> > descriptors but they are marked is_ignored = true.
> > 
> > I'm seeing this with xen 4.15, 4.13 and has also been reported with 4.11
> > with latest security patches.
> > It seems to have started with patches for XSA-115 (A user reported this
> > for 4.11)
> > 
> > I've tracked it down to a difference in poll(2) implementation; it seems
> > that linux will return something that is not (POLLIN|POLLOUT) when a
> > socket if closed; while NetBSD returns POLLIN (this matches the NetBSD's
> > man page).
> 
> Yeah, Linux seems to return POLLHUP additionally.

Overall, it seems that the poll(2) behavior with non-regular files
is highly OS-dependant when it comes to border cases (like a remote
close of a socket)

> 
> > 
> > First I think there may be a security issue here, as, even on linux it 
> > should
> > be possible for a client to cause a socket to go to the is_ignored state,
> > while still sending data and cause xenstored to go to a 100% CPU loop.
> 
> No security issue, as sockets are restricted to dom0 and user root.
> 
> In case you are suspecting a security issue, please don't send such
> mails to xen-devel in future, but to security@xxxxxxxxxxxxxxxxxxxx.

Yes, sorry. Will do next time.

> 
> > I think this is needed anyway:
> > 
> > --- xenstored_core.c.orig   2021-02-02 18:06:33.389316841 +0100
> > +++ xenstored_core.c        2021-02-02 19:27:43.761877371 +0100
> > @@ -397,9 +397,12 @@
> >                          !list_empty(&conn->out_list)))
> >                             *ptimeout = 0;
> >             } else {
> > -                   short events = POLLIN|POLLPRI;
> > -                   if (!list_empty(&conn->out_list))
> > -                           events |= POLLOUT;
> > +                   short events = 0;
> > +                   if (!conn->is_ignored) {
> > +                           events |= POLLIN|POLLPRI;
> > +                           if (!list_empty(&conn->out_list))
> > +                                   events |= POLLOUT;
> > +                   }
> >                     conn->pollfd_idx = set_fd(conn->fd, events);
> >             }
> >     }
> 
> Yes, I think this is a good idea.

Well, after some sleep I don't think it is. We should always keep at last
POLLIN or we will never notice a socket close otherwise.

> 
> > 
> > Now I wonder if, on NetBSD at last, a read error or short read shouldn't
> > cause the socket to be closed, as with:
> > 
> > @@ -1561,6 +1565,8 @@
> >   bad_client:
> >     ignore_connection(conn);
> > +   /* we don't want to keep this connection alive */
> > +   talloc_free(conn);
> >   }
> 
> This is wrong for non-socket connections, as we want to keep the domain
> in question to be known to xenstored.
> 
> For socket connections this should be okay, though.

What are "non-socket connections" BTW ? I don't think I've seen one
in my test.

Is there a way to know if a connection is socket or non-socket ?

-- 
Manuel Bouyer <bouyer@xxxxxxxxxxxxxxx>
     NetBSD: 26 ans d'experience feront toujours la difference
--



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.