
Re: [PATCH] x86/fred: Send an EVENT_CHECK IPI on exit from NMI


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 24 Jun 2026 16:27:33 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 24 Jun 2026 15:27:44 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 24/06/2026 3:43 pm, Jan Beulich wrote:
> On 24.06.2026 16:23, Andrew Cooper wrote:
>> Returning from an NMI which hits guest context needs special casing in FRED
>> mode just like it does in IDT mode.
>>
>> Break nmi_exit_to_guest() out of handle_ist_exception(), and use it in
>> entry_FRED_R3() also.
>>
>> Expand the comment a little, and invert the conditional jump to
>> compat_restore_all_guest() to avoid needing an #else clause for CONFIG_PV32.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> provided of course ...
>
>> Slightly RFC, not tested yet.  (My AMD system takes an eternity to reboot)
> ... the results of this won't prove it wrong.
>
>> For 4.22.  Found during testing of FRED.  The consequence is that we can end
>> up scheduling while still in NMI context, after which things like the watchdog
>> and other diagnostics don't work properly.
> May therefore want a Fixes: tag (it'll also want backporting aiui).

Ah yes, I'd meant to set one, but forgot.

Fixes: 87cfcbe9f0b5 ("x86/pv: Guest exception handling in FRED mode")

>
>> --- a/xen/arch/x86/x86_64/entry-fred.S
>> +++ b/xen/arch/x86/x86_64/entry-fred.S
>> @@ -20,6 +20,12 @@ FUNC(entry_FRED_R3, 4096)
>>          GET_STACK_END(14)
>>          movq    STACK_CPUINFO_FIELD(current_vcpu)(%r14), %rbx
>>  
>> +        /* NMIs need special handling on return to guest. */
>> +        movzbl  UREGS_ss + 6(%rsp), %eax
>> +        and     $0xf, %eax
> As you may be aware, I'm not overly happy with such literal numbers. But
> well, alternatives look a little involved. So just a remark, not a request
> to consider any kind of adjustment.

The 0xf cannot usefully be anything else.  It's the mask for the 4-bit
event type field in a FRED frame, and you need to see at a glance that
it's less than 0xff, or the switch from %eax to %al looks wrong.

The +6 can't be generated by asm-offsets because the infrastructure
doesn't work on bitfields.
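
To illustrate the two literals, here is a minimal C sketch, not Xen code.
It assumes a little-endian machine and that the event type occupies bits
51:48 of the 64-bit slot holding %ss in the FRED frame, with NMI encoded
as type 2; both the layout and the macro names are assumptions made for
the example.  Reading byte 6 of the slot and masking with 0xf yields the
same value as shifting the whole slot right by 48.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRED_EVENT_TYPE_MASK 0xfu  /* hypothetical name; 4-bit field */
#define FRED_EVENT_TYPE_NMI  2u    /* assumed encoding of an NMI event */

/* Read the event type the way the asm does: byte 6 of the slot, low nibble. */
static unsigned int event_type_bytewise(const uint64_t *ss_slot)
{
    uint8_t byte6;

    memcpy(&byte6, (const uint8_t *)ss_slot + 6, 1); /* movzbl UREGS_ss + 6 */
    return byte6 & FRED_EVENT_TYPE_MASK;             /* and $0xf */
}

/* Same field, extracted by shifting the whole 64-bit slot. */
static unsigned int event_type_shift(uint64_t ss_slot)
{
    return (ss_slot >> 48) & FRED_EVENT_TYPE_MASK;
}

/* Self-check with an arbitrary slot value: byte 6 is 0xa2, so type is 2. */
static void event_type_demo(void)
{
    uint64_t slot = ((uint64_t)0xa2 << 48) | 0x2b;

    assert(event_type_bytewise(&slot) == FRED_EVENT_TYPE_NMI);
    assert(event_type_shift(slot) == FRED_EVENT_TYPE_NMI);
}
```

Because the masked result always fits in 4 bits, comparing only the low
byte of the register afterwards loses nothing.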

>
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -146,6 +146,35 @@ process_trap:
>>          jmp  test_all_events
>>  END(switch_to_kernel)
>>  
>> +/*
>> + * When returning to guest from an NMI, we must execute an IRET/ERETU to
>> + * re-enable NMIs, and must not process softirqs which can e.g. schedule
>> + * rather than returning to guest context.
>> + *
>> + * If a softirq is pending, send ourselves an EVENT_CHECK IPI to compensate.
>> + * This will cause softirq processing to occur upon leaving NMI context.
>> + *
>> + * %rbx: struct vcpu, %r14 stack_end
>> + */
>> +FUNC(nmi_exit_to_guest)
>> +        mov     STACK_CPUINFO_FIELD(processor_id)(%r14), %eax
>> +        shl     $IRQSTAT_shift, %eax
>> +        lea     irq_stat + IRQSTAT_softirq_pending(%rip), %rcx
>> +        cmpl    $0, (%rcx, %rax, 1)
>> +        je      1f
>> +        mov     $EVENT_CHECK_VECTOR, %edi
>> +        call    send_IPI_self
>> +1:
>> +        /* For restore_all_guest. */
>> +        mov     STACK_CPUINFO_FIELD(current_vcpu)(%r14), %rbx
>> +#ifdef CONFIG_PV32
>> +        mov     VCPU_domain(%rbx), %rax
>> +        cmpb    $0, DOMAIN_is_32bit_pv(%rax)
> Would you be open to a little bit of trickery here while you move the code?
> The low 12 bits of %rbx are clear, so instead of $0 we could use %bl here.

struct vcpu being page aligned is a convenience, not a requirement.  Its
hard alignment requirements are 32b, and even then only with CONFIG_SHADOW.
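
A small C sketch of why the %bl trick depends on the allocation
alignment.  This is an illustration only: low_byte() models %bl as the
low 8 bits of the pointer in %rbx, the base address is arbitrary, and 32
is used purely as an example of a weaker alignment (read here as bytes,
which is an assumption).

```c
#include <assert.h>
#include <stdint.h>

/* Model of %bl: the low 8 bits of the pointer held in %rbx. */
static uint8_t low_byte(uintptr_t ptr)
{
    return (uint8_t)ptr;
}

/* Round an address up to a power-of-two alignment. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Self-check using an arbitrary, made-up base address. */
static void low_byte_demo(void)
{
    uintptr_t base = 0x12345;

    /* Page-aligned: the low byte is zero, so %bl could stand in for $0. */
    assert(low_byte(align_up(base, 4096)) == 0);

    /* Only 32-aligned: the low byte can be nonzero (0x60 here). */
    assert(low_byte(align_up(base, 32)) == 0x60);
}
```

With anything less than 256-byte alignment, the low byte is not
guaranteed to be zero, so the comparison would silently change meaning.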



>
>> +        jne     compat_restore_all_guest
>> +#endif
>> +        jmp     restore_all_guest
>> +END(nmi_exit_to_guest)
> Much like you flipped the Jcc/JMP here, ...
>
>> @@ -1209,25 +1238,7 @@ FUNC(handle_ist_exception)
>>  #ifdef CONFIG_PV
>>          testb $3,UREGS_cs(%rsp)
>>          jz    restore_all_xen
> ... how about also making this plus ...
>
>> -        /* Send an IPI to ourselves to cover for the lack of event checking. */
>> -        mov   STACK_CPUINFO_FIELD(processor_id)(%r14), %eax
>> -        shll  $IRQSTAT_shift,%eax
>> -        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
>> -        cmpl  $0,(%rcx,%rax,1)
>> -        je    1f
>> -        movl  $EVENT_CHECK_VECTOR,%edi
>> -        call  send_IPI_self
>> -1:
>> -        /* For restore_all_guest. */
>> -        mov   STACK_CPUINFO_FIELD(current_vcpu)(%r14), %rbx
>> -#ifdef CONFIG_PV32
>> -        movq  VCPU_domain(%rbx),%rax
>> -        cmpb  $0,DOMAIN_is_32bit_pv(%rax)
>> -        je    restore_all_guest
>> -        jmp   compat_restore_all_guest
>> -#else
>> -        jmp   restore_all_guest
>> -#endif
>> +        jmp   nmi_exit_to_guest
> ... this
>
>         jnz   nmi_exit_to_guest
>         jmp   restore_all_xen
>
> then allowing to fold with ...
>
>>  #else
>>          ASSERT_CONTEXT_IS_XEN
>>          jmp   restore_all_xen
> ... this?

This makes the diff rather less legible (and specifically, far less
obviously a "break out"), and changes the configurations that the ASSERT
lives in.

Perhaps as a followup, but not in this patch.

~Andrew



 

