
Re: [Xen-devel] [PATCH v2] x86/nmi: correctly check MSB of P6 performance counter MSR in watchdog



On 26/02/2019 13:12, Igor Druzhinin wrote:
> The logic currently tries to work out whether a recent overflow (indicating
> that the NMI came from the watchdog) happened, by checking the MSB of the
> performance counter MSR, which is initially sign-extended from the negative
> value that we program into it. A possibly incorrect assumption here is that
> the MSB is always bit 32, while on modern hardware it is usually bit 47 and
> the actual bit width is reported through CPUID. Checking bit 32 for
> overflows is usually fine, since we never program the counter to anything
> exceeding 32 bits and the NMI is handled shortly after the overflow occurs.
> 
> A problematic scenario that we saw occurs on systems where SMIs taking
> significant time are possible. In that case, NMI handling is deferred until
> the firmware exits the SMI, which might take enough time for the counter
> to count through bit 32 and set it to 1 again. The logic described above
> will then misread it and erroneously report an unknown NMI.
> 
> Fortunately, we can use the actual MSB, which is usually higher than the
> currently hardcoded 32, and treat this case correctly at least on modern
> hardware.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> ---
> Changes in v2:
> * taken care of the case where leaf 0xa is missing or width is 0
> * apply a cap in case width is lower than we're expecting to program
> ---
>  xen/arch/x86/nmi.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
> index bfe9777..6318e64 100644
> --- a/xen/arch/x86/nmi.c
> +++ b/xen/arch/x86/nmi.c
> @@ -37,6 +37,7 @@ unsigned int nmi_watchdog = NMI_NONE;
>  static unsigned int nmi_hz = HZ;
>  static unsigned int nmi_perfctr_msr; /* the MSR to reset in NMI handler */
>  static unsigned int nmi_p4_cccr_val;
> +static unsigned int nmi_p6_event_width;
>  static DEFINE_PER_CPU(struct timer, nmi_timer);
>  static DEFINE_PER_CPU(unsigned int, nmi_timer_ticks);
>  
> @@ -123,7 +124,9 @@ int nmi_active;
>  #define P6_EVNTSEL_USR               (1 << 16)
>  #define P6_EVENT_CPU_CLOCKS_NOT_HALTED        0x79
>  #define CORE_EVENT_CPU_CLOCKS_NOT_HALTED 0x3c
> -#define P6_EVENT_WIDTH          32
> +/* Bit width of IA32_PMCx MSRs is reported using CPUID.0AH:EAX[23:16]. */
> +#define P6_EVENT_WIDTH_MASK  (((1 << 8) - 1) << 16)
> +#define P6_EVENT_WIDTH_DEFAULT       32
>  
>  #define P4_ESCR_EVENT_SELECT(N)      ((N)<<25)
>  #define P4_CCCR_OVF_PMI0     (1<<26)
> @@ -292,6 +295,7 @@ static inline void write_watchdog_counter(const char *descr)
>      u64 count = (u64)cpu_khz * 1000;
>  
>      do_div(count, nmi_hz);
> +    count = min(count, (1UL << (nmi_p6_event_width - 1)) - 1);
>      if(descr)
>          Dprintk("setting %s to -%#"PRIx64"\n", descr, count);
>      wrmsrl(nmi_perfctr_msr, 0 - count);

I mistakenly used nmi_p6_event_width in a call shared by all
architectures - I'll fix it in v3.

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

