[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing




On 14/06/11 13:11, Keir Fraser wrote:
> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx> wrote:
>
>> Introduce the boolean variable 'kexecing' which indicates to functions
>> whether we are on the kexec path or not.  This is used by
>> disable_local_APIC() to try and revert the APIC mode back to how it
>> was found on boot.
> Does this need to be done only on the kexec path, rather than
> unconditionally in disable_local_APIC()?
>
>  -- Keir

I tried doing it unconditionally, but because of the interaction of the
local apic mode and the x2apic_enabled variable, it was causing
protection faults all over the place in the regular reboot code.

Specifically: the apic code switches between MSRs and MMIO based on the
single global x2apic_enabled variable.  Whether you should use MSRs or
MMIO depends on whether your local apic is in xapic or x2apic mode, and
using MSRs when you should be using MMIO results in a protection fault,
which cascades as disable_local_APIC() is on the panic codepath.

The function is also caused as part of __stop_this_cpu() which is on
several codepaths.  While none of those codepaths immediately jump out
as being problematic, I have a feeling that the interaction with
x2apic_enabled will cause problems there as well.

I can't think of any architectural reason for it not to happen, but it
wont work without the original refactoring of the apic code which was
rejected on terms of straying too far from Linux.

~Andrew

>> We also need some fudging of the x2apic_enabled variable.  It is used
>> in multiple places over the codebase to mean multiple things,
>> including:
>>     What did the user specifify on the command line?
>>     Did the BIOS boot me in x2apic mode?
>>     Is the BSP Local APIC in x2apic mode?
>>     What mode is my Local APIC in?
>>
>> Therefore, set it up to prevent a protection fault when disabling the
>> IOAPICs.  (In this case, it is used in the "What mode is my Local APIC
>> in?" case, so the processor doesnt suffer a protection fault because
>> of trying to use x2apic MSRs when it should be using xapic MMIO)
>>
>> Finally, make sure that interrupts are disabled when jumping into the
>> purgatory code.  It would be bad to service interrupts in the Xen
>> context when the next kernel is booting.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c
>> --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -37,6 +37,7 @@
>>  #include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */
>>  #include <mach_apic.h>
>>  #include <io_ports.h>
>> +#include <xen/kexec.h>
>>  
>>  static bool_t tdt_enabled __read_mostly;
>>  static bool_t tdt_enable __initdata = 1;
>> @@ -345,6 +346,33 @@ void disable_local_APIC(void)
>>          wrmsrl(MSR_IA32_APICBASE, msr_content &
>>                 ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
>>      }
>> +
>> +    if ( kexecing )
>> +    {
>> +        uint64_t msr_content;
>> +        rdmsrl(MSR_IA32_APICBASE, msr_content);
>> +        msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
>> +        wrmsrl(MSR_IA32_APICBASE, msr_content);
>> +
>> +        switch ( apic_boot_mode )
>> +        {
>> +        case APIC_MODE_DISABLED:
>> +            break; /* Nothing to do - we did this above */
>> +        case APIC_MODE_XAPIC:
>> +            msr_content |= MSR_IA32_APICBASE_ENABLE;
>> +            wrmsrl(MSR_IA32_APICBASE, msr_content);
>> +            break;
>> +        case APIC_MODE_X2APIC:
>> +            msr_content |= 
>> (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
>> +            wrmsrl(MSR_IA32_APICBASE, msr_content);
>> +            break;
>> +        default:
>> +            printk("Default case when reverting #%d lapic to boot state\n",
>> +                   smp_processor_id());
>> +            break;
>> +        }
>> +    }
>> +
>>  }
>>  
>>  /*
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -27,6 +27,7 @@
>>  #include <asm/hvm/support.h>
>>  #include <asm/apic.h>
>>  #include <asm/io_apic.h>
>> +#include <xen/iommu.h>
>>  
>>  static atomic_t waiting_for_crash_ipi;
>>  static unsigned int crashing_cpu;
>> @@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void)
>>      iommu_crash_shutdown();
>>  
>>      __stop_this_cpu();
>> +
>> +    /* This is a bit of a hack due to the problems with the x2apic_enabled
>> +     * variable, but we can't do any better without a significant 
>> refactoring
>> +     * of the APIC code */
>> +    if ( current_local_apic_mode() == APIC_MODE_X2APIC )
>> +        x2apic_enabled = 1;
>> +    else
>> +        x2apic_enabled = 0;
>> +
>>      disable_IO_APIC();
>>  
>>      local_irq_restore(flags);
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c
>> --- a/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im
>>      if ( hpet_broadcast_is_available() )
>>          hpet_disable_legacy_broadcast();
>>  
>> +    /* We are about to permenantly jump out of the Xen context into the 
>> kexec
>> +     * purgatory code.  We really dont want to be still servicing interupts.
>> +     */
>> +    local_irq_disable();
>> +
>>      /*
>>       * compat_machine_kexec() returns to idle pagetables, which requires us
>>       * to be running on a static GDT mapping (idle pagetables have no GDT
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c
>> --- a/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -29,6 +29,8 @@
>>  #include <compat/kexec.h>
>>  #endif
>>  
>> +bool_t kexecing = FALSE;
>> +
>>  static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes);
>>  
>>  static Elf_Note *xen_crash_note;
>> @@ -220,6 +222,8 @@ void kexec_crash(void)
>>      if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) )
>>          return;
>>  
>> +    kexecing = TRUE;
>> +
>>      kexec_common_shutdown();
>>      kexec_crash_save_cpu();
>>      machine_crash_shutdown();
>> @@ -232,6 +236,8 @@ static long kexec_reboot(void *_image)
>>  {
>>      xen_kexec_image_t *image = _image;
>>  
>> +    kexecing = TRUE;
>> +
>>      kexec_common_shutdown();
>>      machine_reboot_kexec(image);
>>  
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h
>> --- a/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100
>> @@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve {
>>  
>>  extern xen_kexec_reserve_t kexec_crash_area;
>>  
>> +extern bool_t kexecing;
>> +
>>  void set_kexec_crash_area_size(u64 system_ram);
>>  
>>  /* We have space for 4 images to support atomic update
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel
>

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.