[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH v2] x86/irq: do not release irq until all cleanup is done


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
  • Date: Wed, 16 Nov 2022 13:21:14 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=H0GHGNTLWnzPh0GtHLDMx3fr+H1OyOCnHcmXOA+Mh5Y=; b=cDiKXgfdmo5pR0gAPQkuHxXqvCvgAEk7TEskk2X5EIFgaHEV1PTBkKrKGxBMQcq/SCf+4JzYpgHW1c8DDyWP3pWXTfWcm5luZK/xCE2HmCVn4ytR7MWG9RHJVytzyU5ZKdgrTBlrLhqA1yPy3avLK7XGiMDpSjLo5gXpIL8oNCxt1w7fiaKjaMNuEBv90ojaB0ppnu0S1K1Jus7COMoyedil3i27JB2QrlTn7SGGDrJvpHSK0UqqwLl81olHIbnLU9mKTpe2EPRDemRRqhsBNKoNc+drxEQ7I1yZ2e8rAOt3vAWRFsjxDbs2e1QI+guimOuyg02YQhg4Hf/58M4E4Q==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XpBVgqQwULCPefJ5P29hQi8rbPHtR1TV+ZCVgNrhCSK92hWuMIE2Jhz7FzncSXyEi8matPwS93g7AqI5UTwGuElt22Agd81yVctbGwsMx+OyB2oWOMP5dW0INeV3cYsgjFX7/CmDwM5QNRXHWnq+ACH8zgFHI7szG2IXMpg5MUzJ5KOCVnjKHzxerpsVIpIz/LDc6XeBQgt7RY6ssKmHyDf2xyF85Z8MwNqwHbt4BTEKNza97I++k1s9Wqb4Hm261W1fvyorlWMZBT9fTrN9U6zaI0t1iRIp3MTUNSPPsEgFCJgHYOycnjlIUytm7D9bp7QOD2RB1w2H+3dNkYT+LA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Wed, 16 Nov 2022 12:21:42 +0000
  • Ironport-data: A9a23:jmMVCaihVsOWwgiDdFuueSUGX161RBEKZh0ujC45NGQN5FlHY01je htvUGzXPvbcMGr2Ko9xPI61pEtX6pPVzNZlGVY9+yo9Fngb9cadCdqndUqhZCn6wu8v7q5Ex 55HNoSfdpBcolv0/ErF3m3J9CEkvU2wbuOgTrWCYmUpH1QMpB4J0XpLg/Q+jpNjne+3CgaMv cKai8DEMRqu1iUc3lg8sspvkzsy+qWs0N8klgZmP6oS5QaAzyB94K83fsldEVOpGuG4IcbiL wrz5OnR1n/U+R4rFuSknt7TGqHdauePVeQmoiM+t5mK2nCulARrukoIHKN0hXNsoyeIh7hMJ OBl7vRcf+uL0prkw4zxWzEAe8130DYvFLXveRBTuuTLp6HKnueFL1yDwyjaMKVBktubD12i+ tQbEBERSUzY2t6L/6Obe8BO2u4AIJDSadZ3VnFIlVk1DN4AaLWaGeDv2oUd2z09wMdTAfzZe swVLyJ1awjNaAFOPVFRD48imOCvhT/0dDgwRFC9/PJrpTSMilIvluS8WDbWUoXiqcF9hEGXq 3iA523kKhobKMae2XyO9XfEaurnzX+qA9hDS+3QGvhCnmOiyXEzVxcvf2C2iKa2l3SkfPEOJ BlBksYphe1onKCxdfHtUhv9rHOasxo0X9tLD/Z8+AyL0rDT4QuSGi4DVDEpQN4sudIyRDcq/ kSUhN6vDjtq2JWKTVqN+7HSqim9URX5NkcHbC4ACA4aud/qpdhpigqVFooyVqmoktfyBDf8h SiQqzQzjKkSishN0Lin+VfAgHSnoZ2hohMJ2zg7l1mNtmtRDLNJraTygbQHxZ6s9Lqkc2Q=
  • Ironport-hdrordr: A9a23:bHo+jatDTlA9N5prwtnE9VpO7skCAoAji2hC6mlwRA09TyXGra 2TdaUgvyMc1gx7ZJhBo7+90We7MBHhHPlOkPEs1NaZLXDbUQ6TQL2KgrGSpwEIdxefygc/79 YcT0EBMqyWMbESt6+TjmiF+r4bsaO6GcuT9ILjJhlWPGJXg/YK1XYDNu/XKDwDeCB2Qb4CUL aM7MtOoDStPVwRc8SAH3EAG8TTutHRk5riQBgeQzoq8hOHgz+E4KPzV0Hw5GZXbxp/hZMZtU TVmQ3w4auu99m91x/nzmfWq7hGhdf7zdNHJcqUzuwYMC/lhAqEbJloH5eCoDc2iuey70tCqq iEnz4Qe+BIr1/BdGC8phXgnyHmzTYV8nfnjXuVm2Hqr8DVTC8zT5Mpv/MuTjLpr24b+P1s2q NC2GyU87JREBP7hSz4o/zFTQtjmEaYqWcr1cQTk3tce40Db6I5l/1pwGplVLM7WA7q4oEuF+ djSOna+fZtaFufK0vUu2F+qebcLUgbL1OjeAwvq8aV2z9ZkDRS1E0D3vESmX8G6dYUV4REz/ 6sCNUlqJh+CustKY5tDuYIRsW6TkbXRwjXDW6UKVP7UIkaJnP2rYLt6rld3pDmRHUx9up9pH 39aiIYiYZrEHieSfFmnac7uCwleV/NEggEkapllttEUr6VfsuaDcTMciFtryKamYRgPiTqYY fOBHtoOY6dEYKXI/cu4+TfYeghFZBMarxhhv8LH3Szn+nsFqrG8sTmTde7HsudLd9jYBK1Pk c+
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Current code in _clear_irq_vector() will mark the irq as unused before
doing the cleanup required when move_in_progress is true.

This can lead to races in create_irq() if the function picks an irq
desc that's been marked as unused but has move_in_progress set, as the
call to assign_irq_vector() in that function can then fail with
-EAGAIN.

Prevent that by only marking irq descs as unused when all the cleanup
has been done.  While there also use write_atomic() when setting
IRQ_UNUSED in _clear_irq_vector() and add a barrier in order to
prevent the setting of IRQ_UNUSED getting reordered by the compiler.

The check for move_in_progress cannot be removed from
_assign_irq_vector(), as other users (io_apic_set_pci_routing() and
ioapic_guest_write()) can still pass active irq descs to
assign_irq_vector().

Note the trace point is not moved and is now set before the irq is
marked as unused.  This is done so that the CPU mask provided in the
trace point is the one belonging to the current vector, not the old
one.

Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
Changes since v1:
 - Leave the trace point at the same place (before freeing old
   vector).
 - Add a (write) barrier before the IRQ_UNUSED write.
---
I've only observed this race when using vPCI with a PVH dom0, so I
assume other users of _{clear,assign}_irq_vector() are likely to
already be mutually exclusive by means of other external locks.

The path that triggered this race is using
allocate_and_map_msi_pirq(), which will sadly drop the returned error
code from create_irq() and just return EINVAL, that makes figuring out
the issue more complicated, but fixing that error path should be
done in a separate commit.
---
 xen/arch/x86/irq.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index cd0c8a30a8..20150b1c7f 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -220,27 +220,28 @@ static void _clear_irq_vector(struct irq_desc *desc)
         clear_bit(vector, desc->arch.used_vectors);
     }
 
-    desc->arch.used = IRQ_UNUSED;
-
     trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, tmp_mask);
 
-    if ( likely(!desc->arch.move_in_progress) )
-        return;
+    if ( unlikely(desc->arch.move_in_progress) )
+    {
+        /* If we were in motion, also clear desc->arch.old_vector */
+        old_vector = desc->arch.old_vector;
+        cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
 
-    /* If we were in motion, also clear desc->arch.old_vector */
-    old_vector = desc->arch.old_vector;
-    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
+        for_each_cpu(cpu, tmp_mask)
+        {
+            ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
+            TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
+            per_cpu(vector_irq, cpu)[old_vector] = ~irq;
+        }
 
-    for_each_cpu(cpu, tmp_mask)
-    {
-        ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
-        TRACE_3D(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
-        per_cpu(vector_irq, cpu)[old_vector] = ~irq;
-    }
+        release_old_vec(desc);
 
-    release_old_vec(desc);
+        desc->arch.move_in_progress = 0;
+    }
 
-    desc->arch.move_in_progress = 0;
+    smp_wmb();
+    write_atomic(&desc->arch.used, IRQ_UNUSED);
 }
 
 void __init clear_irq_vector(int irq)
-- 
2.37.3




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.