
Re: [PATCH] evtchn/Flask: pre-allocate node on send path



Hi Jan,

On 25/09/2020 14:58, Jan Beulich wrote:
On 25.09.2020 15:16, Julien Grall wrote:
Hi Jan,

On 25/09/2020 13:21, Jan Beulich wrote:
On 25.09.2020 12:34, Julien Grall wrote:
On 24/09/2020 11:53, Jan Beulich wrote:
xmalloc() & Co may not be called with IRQs off, or else check_lock()
will have its assertion trigger about locks getting acquired
inconsistently. Re-arranging the locking in evtchn_send() doesn't seem
very reasonable, especially since the per-channel lock was introduced to
avoid acquiring the per-domain event lock on the send paths. Issue a
second call to xsm_evtchn_send() instead, before acquiring the lock, to
give XSM / Flask a chance to pre-allocate whatever it may need.
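In code terms, the idea is roughly the following (a sketch with names as I
remember them from evtchn_send(), not the actual diff):

    /*
     * Sketch only: call the XSM hook once before taking the per-channel
     * lock (IRQs still enabled, so Flask may xmalloc() and cache its
     * verdict), then the existing check under the lock finds the cached
     * entry and doesn't need to allocate.
     */
    ret = xsm_evtchn_send(XSM_HOOK, ld, lchn);   /* pre-allocate / warm the cache */
    if ( ret )
        return ret;

    spin_lock_irqsave(&lchn->lock, flags);       /* IRQs off from here on */

    ret = xsm_evtchn_send(XSM_HOOK, ld, lchn);   /* hits the cache, no xmalloc() */
    ...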

This is the sort of fall-out I was expecting when we decided to turn off
interrupts for a big chunk of code. I couldn't find any at the time
though...

Can you remind me which caller of send_guest_{global,vcpu}_virq() will call
them with interrupts off?

I don't recall which one of the two it was that I hit; we wanted
both to use the lock anyway. send_guest_pirq() very clearly also
gets called with IRQs off.

Would it be possible to consider deferring the call to a softirq
tasklet? If so, this would allow us to turn interrupts back on.
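Something along these lines (a rough sketch; the tasklet prototypes are
paraphrased from memory, see xen/include/xen/tasklet.h for the exact
interface):

    /*
     * The IRQs-off caller only schedules the tasklet; the actual event
     * injection then runs in softirq context with interrupts enabled,
     * where xmalloc() is permitted again.
     */
    static struct tasklet virq_send_tasklet;   /* set up elsewhere, e.g. via softirq_tasklet_init() */

    static void virq_send_fn(void *data)
    {
        struct vcpu *v = data;

        send_guest_vcpu_virq(v, VIRQ_XENOPROF);
    }

    /* From the IRQs-off (e.g. interrupt / NMI) path: */
    tasklet_schedule(&virq_send_tasklet);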

Of course this is in principle possible; the question is how
involved this is going to get.
However, on x86 oprofile's call to
send_guest_vcpu_virq() can't easily be replaced - it's dangerous
enough already that it involves locks in NMI context. I don't
fancy seeing it use more commonly used ones.

Fair enough. I would still like to consider a way in which we could
avoid having to hack xsm_* just because interrupts are disabled.

Well, from a conceptual pov it's at least questionable for XSM to
need any memory allocations at all when merely being asked for
permission. And indeed the need for it arises solely from its
desire to cache the result, for the sake of subsequent lookups.
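I.e. conceptually something like this, with made-up helper names rather
than the actual Flask AVC code:

    /*
     * Purely illustrative -- hypothetical helpers, not a quote of
     * xen/xsm/flask/avc.c.  The point: a "read-only" permission query
     * allocates on a cache miss, which is what trips check_lock() when
     * the caller holds an IRQs-off lock.
     */
    static int cached_perm_check(uint32_t ssid, uint32_t tsid, uint32_t requested)
    {
        struct cache_node *node = cache_lookup(ssid, tsid);   /* fast path: no allocation */

        if ( !node )
        {
            node = xmalloc(struct cache_node);                 /* the offending allocation */
            if ( !node )
                return -ENOMEM;                                /* error handling elided */
            node->allowed = compute_allowed(ssid, tsid);       /* query the policy */
            cache_insert(node);
        }

        return (node->allowed & requested) == requested ? 0 : -EACCES;
    }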

I also find it odd that there's an XSM check on the send path in
the first place. This isn't just because it would seem to me that
it should be decided at binding time whether sending is permitted
- I may easily be missing something in the conceptual model here.
It's also because __domain_finalise_shutdown() too uses
evtchn_send(), and I didn't think this one should be subject to
any XSM check (just like send_guest_*() aren't).

Maybe this is the first question we need to answer?


AFAICT, we don't need send_guest_global_virq() and evtchn_send() to be
mutually exclusive. Is that correct?

Yes, any number of sends (even to the same port) could in principle
run concurrently, I think. Or else the FIFO code would have been
broken from the very point where the per-channel lock was
introduced and acquiring of the per-domain one then dropped from
evtchn_send() (other sending paths weren't using the per-domain one
anyway already before that).

So how about splitting the lock in two? One would be used when the
interrupts have to be disabled while the other would be used when we can
keep interrupts enabled.

Now that's an interesting proposal. I thought one lock per channel
was already pretty fine-grained. Now you propose making it two.

The two locks would have to be taken when the event channel needs to be
modified.

Requiring a total of 6 locks to be acquired when fiddling with
interdomain channels... Wow. Definitely more intrusive overall than
the change here.

Well, hacks are always more self-contained, but long term they are a nightmare to maintain :).

6 locks is indeed a lot, but the discussion here shows that the current locking is probably not suitable for the current uses.

I don't have a better solution so far, but I will have a think about it. Meanwhile, it would be good to figure out why xsm_evtchn_send() is needed.

Cheers,

--
Julien Grall
