Re: [PATCH 12/12] evtchn: convert domain event lock to an r/w one
On 28.09.2020 18:44, Roger Pau Monné wrote:
> On Mon, Sep 28, 2020 at 01:02:43PM +0200, Jan Beulich wrote:
>> Especially for the use in evtchn_move_pirqs() (called when moving a vCPU
>> across pCPU-s) and the ones in EOI handling in PCI pass-through code,
>> serializing perhaps an entire domain isn't helpful when no state (which
>> isn't e.g. further protected by the per-channel lock) changes.
>>
>> Unfortunately this implies dropping of lock profiling for this lock,
>> until r/w locks may get enabled for such functionality.
>>
>> While ->notify_vcpu_id is now meant to be consistently updated with the
>> per-channel lock held for writing, an extension applies to ECS_PIRQ: The
>> field is also guaranteed to not change with the per-domain event lock
>> held. Therefore the unlink_pirq_port() call from evtchn_bind_vcpu() as
>> well as the link_pirq_port() one from evtchn_bind_pirq() could in
>> principle be moved out of the per-channel locked regions, but this
>> further code churn didn't seem worth it.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> RFC:
>> * In evtchn_bind_vcpu() the question is whether limiting the use of
>>   write_lock() to just the ECS_PIRQ case is really worth it.
>
> IMO I would just use write_lock() at the top of the function in
> place of the current spin_lock. The more fine grained change should be
> done as a follow up patch if it's worth it. TBH event channels
> shouldn't change vCPU that frequently that using a more fine grained
> approach matters much.

No, using the write lock in place of the spin lock would violate what
the description says: "While ->notify_vcpu_id is now meant to be
consistently updated with the per-channel lock held for writing, ...".
I could only switch to acquiring both.

>> --- a/xen/arch/x86/hvm/vmsi.c
>> +++ b/xen/arch/x86/hvm/vmsi.c
>> @@ -465,7 +465,7 @@ int msixtbl_pt_register(struct domain *d
>>      int r = -EINVAL;
>>
>>      ASSERT(pcidevs_locked());
>> -    ASSERT(spin_is_locked(&d->event_lock));
>> +    ASSERT(rw_is_write_locked(&d->event_lock));
>
> FWIW, we could switch rw_is_write_locked to use
> _is_write_locked_by_me (or introduce rw_is_write_locked_by_me, albeit
> I think all users of rw_is_write_locked care about the lock being
> taken by them).

Probably, but not here and now.

>> @@ -1098,7 +1108,7 @@ int evtchn_reset(struct domain *d, bool
>>      if ( d != current->domain && !d->controller_pause_count )
>>          return -EINVAL;
>>
>> -    spin_lock(&d->event_lock);
>> +    read_lock(&d->event_lock);
>>
>>      /*
>>       * If we are resuming, then start where we stopped. Otherwise, check
>> @@ -1109,7 +1119,7 @@ int evtchn_reset(struct domain *d, bool
>>      if ( i > d->next_evtchn )
>>          d->next_evtchn = i;
>
> Using the read lock to write to d->next_evtchn here...
>
>>
>> -    spin_unlock(&d->event_lock);
>> +    read_unlock(&d->event_lock);
>>
>>      if ( !i )
>>          return -EBUSY;
>> @@ -1121,14 +1131,14 @@ int evtchn_reset(struct domain *d, bool
>>          /* NB: Choice of frequency is arbitrary. */
>>          if ( !(i & 0x3f) && hypercall_preempt_check() )
>>          {
>> -            spin_lock(&d->event_lock);
>> +            write_lock(&d->event_lock);
>>              d->next_evtchn = i;
>
> ... but the write lock here instead seems inconsistent.

Agreed, fixed.
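For illustration, one way to make the two sites consistent - a sketch
only, not necessarily what the actual fix looks like - would be to take
the lock for writing around the early update as well, matching the
continuation path further down:

    write_lock(&d->event_lock);

    /* ... unchanged logic computing i ... */
    if ( i > d->next_evtchn )
        d->next_evtchn = i;

    write_unlock(&d->event_lock);

Keeping read_lock() there would leave the store to d->next_evtchn open
to racing with another CPU on the same path, since readers don't
exclude one another.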
>> @@ -1557,7 +1568,7 @@ static void domain_dump_evtchn_info(stru
>>            "Polling vCPUs: {%*pbl}\n"
>>            "    port [p/m/s]\n", d->domain_id, d->max_vcpus, d->poll_mask);
>>
>> -    spin_lock(&d->event_lock);
>> +    read_lock(&d->event_lock);
>
> Since this is a debug key, I would suggest using read_trylock in order
> to prevent blocking if a CPU is stuck while holding the event_lock in
> write mode.

We should do so in all debug key handlers imo, and I don't think such a
mostly orthogonal change would be reasonable to make right here.

>> --- a/xen/drivers/passthrough/io.c
>> +++ b/xen/drivers/passthrough/io.c
>> @@ -105,7 +105,7 @@ static void pt_pirq_softirq_reset(struct
>>  {
>>      struct domain *d = pirq_dpci->dom;
>>
>> -    ASSERT(spin_is_locked(&d->event_lock));
>> +    ASSERT(rw_is_write_locked(&d->event_lock));
>>
>>      switch ( cmpxchg(&pirq_dpci->state, 1 << STATE_SCHED, 0) )
>>      {
>> @@ -162,7 +162,7 @@ static void pt_irq_time_out(void *data)
>>      const struct hvm_irq_dpci *dpci;
>>      const struct dev_intx_gsi_link *digl;
>>
>> -    spin_lock(&irq_map->dom->event_lock);
>> +    read_lock(&irq_map->dom->event_lock);
>
> Is it fine to use the lock in read mode here? It's likely to change
> the flags by adding HVM_IRQ_DPCI_EOI_LATCH, and hence should use the
> lock in write mode?
>
> As I think that's the lock that's supposed to protect changes to the
> flags field?

Yes, looks like you're right - there simply is no other lock. You're
also right for all other respective pass-through related remarks; I've
switched them all to write_lock(). That's a bit of a shame though, as
especially on the EOI path things ought to be able to work in a more
parallel way.

>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -883,7 +883,7 @@ static int pci_clean_dpci_irqs(struct do
>>      if ( !is_hvm_domain(d) )
>>          return 0;
>>
>> -    spin_lock(&d->event_lock);
>> +    write_lock(&d->event_lock);
>>      hvm_irq_dpci = domain_get_irq_dpci(d);
>>      if ( hvm_irq_dpci != NULL )
>>      {
>> @@ -901,14 +901,14 @@ static int pci_clean_dpci_irqs(struct do
>>          ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
>>          if ( ret )
>>          {
>> -            spin_unlock(&d->event_lock);
>> +            read_unlock(&d->event_lock);
>
> This should be a write_unlock AFAICT.

Oh, of course - I missed converting this one when converting the other
two.

>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -373,7 +373,7 @@ struct domain
>>      unsigned int     xen_evtchns;
>>      /* Port to resume from in evtchn_reset(), when in a continuation. */
>>      unsigned int     next_evtchn;
>> -    spinlock_t       event_lock;
>> +    rwlock_t         event_lock;
>
> It would be nice to add a comment regarding which fields event_lock
> protects. It's kind of a very generic lock name that I think has been
> abused a bit.
>
> Not that it needs to be done in this patch.

Right, and it'll be quite some auditing that needs doing in order to
collect all the pieces.

Jan
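For reference, the read_trylock() pattern suggested for the dump
handler would look roughly like this (a sketch only; the bail-out
message and the choice to simply skip the dump when the lock is busy
are assumptions, not part of the patch):

    /* Debug key handler: don't block if the lock is held for writing. */
    if ( !read_trylock(&d->event_lock) )
    {
        printk("d%d: event_lock busy, skipping event channel dump\n",
               d->domain_id);
        return;
    }

    /* ... existing per-port dumping ... */

    read_unlock(&d->event_lock);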