[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 1/3] evtchn: slightly defer lock acquire where possible



Hi Roger,

On 28/05/2021 14:31, Roger Pau Monné wrote:
On Fri, May 28, 2021 at 11:48:51AM +0100, Julien Grall wrote:
Hi Jan,

On 28/05/2021 11:23, Jan Beulich wrote:
On 28.05.2021 10:30, Roger Pau Monné wrote:
On Thu, May 27, 2021 at 07:48:41PM +0100, Julien Grall wrote:
On 27/05/2021 12:28, Jan Beulich wrote:
port_is_valid() and evtchn_from_port() are fine to use without holding
any locks. Accordingly acquire the per-domain lock slightly later in
evtchn_close() and evtchn_bind_vcpu().

So I agree that port_is_valid() and evtchn_from_port() are fine to use
without holding any locks in evtchn_bind_vcpu(). However, this is misleading
to say there is no problem with evtchn_close().

evtchn_close() can be called with current != d and therefore, there is a

The only instances evtchn_close is called with current != d and the
domain could be unpaused is in free_xen_event_channel AFAICT.

As long as the domain is not paused, ->valid_evtchns can't ever
decrease: The only point where this gets done is in evtchn_destroy().
Hence ...

risk that port_is_valid() may be valid and then invalid because
d->valid_evtchns is decremented in evtchn_destroy().

Hm, I guess you could indeed have parallel calls to
free_xen_event_channel and evtchn_destroy in a way that
free_xen_event_channel could race with valid_evtchns getting
decreased?

... I don't see this as relevant.

Thankfully the memory is still there. So the current code is okayish and I
could reluctantly accept this behavior to be spread. However, I don't think
this should be left uncommented in both the code (maybe on top of
port_is_valid()?) and the commit message.

Indeed, I think we need some expansion of the comment in port_is_valid
to clarify all this. I'm not sure I understand it properly myself when
it's fine to use port_is_valid without holding the per domain event
lock.

Because of the above property plus the fact that even if
->valid_evtchns decreases, the underlying struct evtchn instance
will remain valid (i.e. won't get de-allocated, which happens only
in evtchn_destroy_final()), it is always fine to use it without
lock. With this I'm having trouble seeing what would need adding
to port_is_valid()'s commentary.

Lets take the example of free_xen_event_channel(). The function is checking
if the port is valid. If it is, then evtchn_close() will be called.

At this point, it would be fair for a developper to assume that
port_is_valid() will also return true in event_close().

To push to the extreme, if free_xen_event_channel() was the only caller of
evtchn_close(), one could argue that the check in evtchn_close() could be a
BUG_ON().

However, this can't be because this would sooner or later turn to an XSA.

Effectively, it means that is_port_valid() *cannot* be used in an
ASSERT()/BUG_ON() and every caller should check the return even if the port
was previously validated.

We already have cases of is_port_valid being used in ASSERTs (in the
shim) and a BUG_ON (with the domain event lock held in evtchn_close).

I was likely a bit too restrictive in my remark. The BUG_ON() in evtchn_close() is fine because this is used on a (in theory) an open port with the both domains event lock held.

This would be more a problem if we have either:

if ( !port_is_valid(d1, port1) )
  return -EINVAL;

/* .... */

BUG_ON(port_is_valid(d1, port1));


So I think a comment on top of is_port_valid() would be really useful before
someone rediscover it the hard way.

I think I'm being extremely dull here, sorry. From your text I
understand that the value returned by is_port_valid could be stale by
the time the user reads it?

I think there's some condition that makes this value stale, and it's
not the common case?

It is any code that race with evtchn_destroy().

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.