
Re: [PATCH] x86/xen: remove unneeded preempt_disable() from xen_irq_enable()



On 21.09.21 10:27, Peter Zijlstra wrote:
On Tue, Sep 21, 2021 at 09:02:26AM +0200, Juergen Gross wrote:
Disabling preemption in xen_irq_enable() is not needed. There is no
risk of missing events due to preemption, as preemption can happen
only in case an event is being received, which is just the opposite
of missing an event.

Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---
  arch/x86/xen/irq.c | 18 +++++++-----------
  1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index dfa091d79c2e..ba9b14a97109 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -57,24 +57,20 @@ asmlinkage __visible void xen_irq_enable(void)
  {
        struct vcpu_info *vcpu;
-       /*
-        * We may be preempted as soon as vcpu->evtchn_upcall_mask is
-        * cleared, so disable preemption to ensure we check for
-        * events on the VCPU we are still running on.
-        */
-       preempt_disable();
-
        vcpu = this_cpu_read(xen_vcpu);
        vcpu->evtchn_upcall_mask = 0;
-       /* Doesn't matter if we get preempted here, because any
-          pending event will get dealt with anyway. */
+       /*
+        * Now preemption could happen, but this is only possible if an event
+        * was handled, so missing an event due to preemption is not
+        * possible at all.
+        * The worst possible case is to be preempted and then check events
+        * pending on the old vcpu, but this is not problematic.
+        */
        barrier(); /* unmask then check (avoid races) */
        if (unlikely(vcpu->evtchn_upcall_pending))
                xen_force_evtchn_callback();
-
-       preempt_enable();
  }
  PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
--
2.26.2


So the reason I asked about this is:

   vmlinux.o: warning: objtool: xen_irq_disable()+0xa: call to preempt_count_add() leaves .noinstr.text section
   vmlinux.o: warning: objtool: xen_irq_enable()+0xb: call to preempt_count_add() leaves .noinstr.text section

as reported by sfr here:

   https://lkml.kernel.org/r/20210920113809.18b9b70c@xxxxxxxxxxxxxxxx

(I'm still not entirely sure why I didn't see them in my build, or why
0day didn't either)

Anyway, I can 'fix' xen_irq_disable(), see below, but I'm worried about
that still having a hole vs the preempt model. Consider:

xen_irq_disable()
   preempt_disable();
   <IRQ>
     set_tif_need_resched()
   </IRQ no preemption because preempt_count!=0>
   this_cpu_read(xen_vcpu)->evtchn_upcall_mask = 1; // IRQs are actually disabled
   preempt_enable_no_resched(); // can't resched because IRQs are disabled

   ...

xen_irq_enable()
   preempt_disable();
   vcpu->evtchn_upcall_mask = 0; // IRQs are on
   preempt_enable() // catches the resched from above


Now your patch removes that preempt_enable() and we'll have a missing
preemption.

Trouble is, because this is noinstr, we can't do schedule().. catch-22
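
The sequence above can be replayed in a toy user-space model. This is only a
sketch: every toy_* name is hypothetical and none of it is kernel code. It
assumes a reschedule can happen only when one was requested, preempt_count
is zero and events are unmasked, and it compares xen_irq_enable() with and
without the preempt_disable()/preempt_enable() pair.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; all names are hypothetical, not kernel APIs. */
struct toy_cpu {
	int  preempt_count;  /* > 0 means preemption is disabled */
	bool upcall_mask;    /* true: event delivery (IRQs) is masked */
	bool need_resched;   /* set by an interrupt asking for a reschedule */
	int  reschedules;    /* how often the toy scheduler actually ran */
};

/* Reschedule only if it was requested and is currently allowed. */
static void toy_maybe_resched(struct toy_cpu *c)
{
	assert(c->preempt_count >= 0);
	if (c->need_resched && c->preempt_count == 0 && !c->upcall_mask) {
		c->need_resched = false;
		c->reschedules++;
	}
}

static void toy_preempt_disable(struct toy_cpu *c)
{
	c->preempt_count++;
}

static void toy_preempt_enable_no_resched(struct toy_cpu *c)
{
	c->preempt_count--;
}

static void toy_preempt_enable(struct toy_cpu *c)
{
	c->preempt_count--;
	toy_maybe_resched(c);
}

/* An interrupt that requests a reschedule; on exit it may preempt. */
static void toy_irq_sets_need_resched(struct toy_cpu *c)
{
	c->need_resched = true;
	toy_maybe_resched(c);
}

/* xen_irq_disable() with the IRQ landing inside the preempt-off window. */
static void toy_irq_disable_with_racing_irq(struct toy_cpu *c)
{
	toy_preempt_disable(c);
	toy_irq_sets_need_resched(c);     /* no preemption: count != 0 */
	c->upcall_mask = true;            /* IRQs are actually disabled */
	toy_preempt_enable_no_resched(c); /* cannot resched, IRQs are off */
}

/* xen_irq_enable(), optionally with the preempt_disable/enable pair. */
static void toy_irq_enable(struct toy_cpu *c, bool with_preempt_pair)
{
	if (with_preempt_pair)
		toy_preempt_disable(c);
	c->upcall_mask = false;           /* IRQs are on */
	if (with_preempt_pair)
		toy_preempt_enable(c);    /* catches the pending resched */
}

/* Run one disable/enable cycle and report how many reschedules happened. */
int toy_run(bool with_preempt_pair)
{
	struct toy_cpu c = {0};

	toy_irq_disable_with_racing_irq(&c);
	toy_irq_enable(&c, with_preempt_pair);
	return c.reschedules;
}
```

In this model toy_run(true) sees the pending reschedule when preemption is
re-enabled, while toy_run(false) returns with need_resched still set and no
reschedule done, which is the missing preemption described above.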

I think it is even worse. Looking at xen_save_fl() there is clearly
a missing preempt_disable().

But I think all of this can be resolved by avoiding the need to disable
preemption in those calls (xen_save_fl(), xen_irq_disable() and
xen_irq_enable()).

Right now disabling preemption is needed, because the flag to be tested
or modified is reached via a pointer (xen_vcpu) stored in the percpu
area. Looking where it might point to reveals the target address is
either an array indexed by smp_processor_id() or a percpu variable of
the local cpu (xen_vcpu_info).
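
To illustrate why the pointer indirection matters, here is a toy user-space
model in C. It is a sketch under assumptions: all toy_* names are
hypothetical, "migration" is simulated by changing an index, and the
single-step variant stands in for an access that cannot be split by
preemption. It is not the real xen_vcpu code.

```c
#include <assert.h>

#define TOY_NR_CPUS 2

struct toy_vcpu_info {
	unsigned char evtchn_upcall_mask;
};

/* Stand-in for the per-cpu vcpu info instances. */
static struct toy_vcpu_info toy_info_percpu[TOY_NR_CPUS];
/* Stand-in for the per-cpu pointer (xen_vcpu in the text above). */
static struct toy_vcpu_info *toy_vcpu_ptr[TOY_NR_CPUS];
/* The cpu the toy task is currently running on. */
static int toy_current_cpu;

/* Reset the model: everything masked, task on cpu 0. */
void toy_init(void)
{
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		toy_info_percpu[i].evtchn_upcall_mask = 1;
		toy_vcpu_ptr[i] = &toy_info_percpu[i];
	}
	toy_current_cpu = 0;
}

/* Simulate the task being preempted and resumed on another cpu. */
static void toy_migrate(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_current_cpu = cpu;
}

/*
 * Two-step access: load the pointer, then store through it.
 * migrate_to >= 0 simulates a migration between the two steps,
 * which is the window that disabling preemption closes.
 */
void toy_unmask_via_pointer(int migrate_to)
{
	struct toy_vcpu_info *v = toy_vcpu_ptr[toy_current_cpu];

	if (migrate_to >= 0)
		toy_migrate(migrate_to);
	v->evtchn_upcall_mask = 0; /* may hit the old cpu's instance */
}

/* Single-step access to the local instance; the model has no window here. */
void toy_unmask_local(void)
{
	toy_info_percpu[toy_current_cpu].evtchn_upcall_mask = 0;
}

/* Mask value of the cpu the task is running on right now. */
int toy_mask_of_current_cpu(void)
{
	return toy_info_percpu[toy_current_cpu].evtchn_upcall_mask;
}
```

With a migration injected between the two steps, toy_unmask_via_pointer(1)
clears the mask of cpu 0 while the task continues on cpu 1 with its mask
still set. toy_unmask_local() always modifies the instance of the cpu the
task is running on.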

Nowadays (since Xen 3.4, which is older than our minimal supported Xen
version) the array indexed by smp_processor_id() is used only during
early boot (interrupts are always off, only boot cpu is running) and
just after coming back from suspending the system (e.g. when being
live migrated). Early boot should be no problem, and the suspend case
isn't either, as that is happening under control of stop_machine()
(interrupts off on all cpus).

So I think I can switch the whole mess to only need to work on the
local percpu xen_vcpu_info instance, which will always access the
"correct" area via %gs.

Let me have a try ...


Juergen
