[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 2 of 2 V4] x86/kexec: Change NMI and MCE handling on kexec path



>>> On 11.12.12 at 16:34, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> -    /* Would it be better to replace the trap vector here? */
> -    set_nmi_callback(crash_nmi_callback);
> +
> +    /* Change NMI trap handlers.  Non-crashing pcpus get nmi_crash which
> +     * invokes do_nmi_crash (above), which cause them to write state and
> +     * fall into a loop.  The crashing pcpu gets the nop handler to
> +     * cause it to return to this function ASAP.
> +     */
> +    for ( i = 0; i < nr_cpu_ids; ++i )
> +        if ( idt_tables[i] )
> +        {
> +
> +            if ( i == cpu )
> +            {
> +                /* Disable the interrupt stack tables for this MCE and
> +                 * NMI handler (shortly to become a nop) as there is a 1
> +                 * instruction race window where NMIs could be
> +                 * re-enabled and corrupt the exception frame, leaving
> +                 * us unable to continue on this crash path (which half
> +                 * defeats the point of using the nop handler in the
> +                 * first place).
> +                 *
> +                 * This update is safe from a security point of view, as
> +                 * this pcpu is never going to try to sysret back to a
> +                 * PV vcpu.
> +                 */

This comment appears to have become stale with the latest
changes.

> +                _set_gate_lower(&idt_tables[i][TRAP_nmi], 14, 0, &trap_nop);

No need for the extra & on functions and arrays.

> +                set_ist(&idt_tables[i][TRAP_machine_check], IST_NONE);
> +            }
> +            else
> +                /* Do not update stack table for other pcpus. */
> +                _update_gate_addr_lower(&idt_tables[i][TRAP_nmi], 
> &nmi_crash);
> +        }
> +

>...

> +/* Write the lower 64 bits of an IDT Entry. This relies on the upper 32
> + * bits of the address not changing, which is a safe aumption until our

assumption

> + * code size exceeds 4GB.

1Gb.

> + *
> + * Ideally, we would use cmpxchg16b, but this is not supported on some
> + * old AMD 64bit capable processors, and has no safe equivelent.

equivalent

> + */
> +static inline void _write_gate_lower(idt_entry_t * gate, idt_entry_t * new)

static inline void _write_gate_lower(idt_entry_t *gate, idt_entry_t *new)

(similar extra blanks elsewhere)

Also, to make clear which of the two is the entry written, const-
qualifying the other one might be a good idea.

> +{
> +    ASSERT(gate->b == new->b);
> +    *(volatile unsigned long *)&gate->a = new->a;

volatile? And if so, why not volatilie-qualify the function parameter?

> +}
> +
>  #define _set_gate(gate_addr,type,dpl,addr)               \
>  do {                                                     \
>      (gate_addr)->a = 0;                                  \
> @@ -122,6 +135,35 @@ do {                                    
>          (1UL << 47);                                     \
>  } while (0)
>  
> +#define _set_gate_lower(gate_addr,type,dpl,addr)         \
> +    do {                                                 \
> +    idt_entry_t idte;                                    \
> +    idte.b = (gate_addr)->b;                             \
> +    idte.a =                                            \
> +        (((unsigned long)(addr) & 0xFFFF0000UL) << 32) | \
> +        ((unsigned long)(dpl) << 45) |                   \
> +        ((unsigned long)(type) << 40) |                  \
> +        ((unsigned long)(addr) & 0xFFFFUL) |             \
> +        ((unsigned long)__HYPERVISOR_CS64 << 16) |       \
> +        (1UL << 47);                                     \
> +    _write_gate_lower((gate_addr), &idte);               \

No need for extra inner parentheses.

> +} while (0)
> +
> +/* Update the lower half handler of an IDT Entry, without changing any
> + * other configuration. */
> +static inline void _update_gate_addr_lower(idt_entry_t * gate, void * addr)

Any reason for this being an inline function and the other being
a macro?

Jan

> +{
> +    idt_entry_t idte;
> +    idte.a = gate->a;
> +
> +    idte.b = ((unsigned long)(addr) >> 32);
> +    idte.a &= 0x0000FFFFFFFF0000ULL;
> +    idte.a |= (((unsigned long)(addr) & 0xFFFF0000UL) << 32) |
> +        ((unsigned long)(addr) & 0xFFFFUL);
> +
> +    _write_gate_lower(gate, &idte);
> +}
> +
>  #define _set_tssldt_desc(desc,addr,limit,type)           \
>  do {                                                     \
>      (desc)[0].b = (desc)[1].b = 0;                       \


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.