[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] RE: [PATCH] Fix cache flush bug of cpu offline



Keir Fraser wrote:
> On 11/03/2011 14:34, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
> 
>> Fix cache flush bug of cpu offline
>> 
>> Current xen cpu offline logic flush cache too early, which
>> potentially break cache coherency. wbinvd should be the last ops
>> before cpu going into dead, otherwise cache may be dirty, i.e,
>> something like setting an A bit on page tables. Pointed out by Arjan
>> van de Ven. 
> 
> The position still seems a bit arbitrary. In the first hunk below,
> why is it safe to wbinvd() outside the for-loop and before reading
> cx->entry_method, but not before reading from processor_powers[]? It
> would be neater if we could put the wbinvd() in a wrapper function
> for calling *dead_idle. 
> 
>  -- Keir

we did experiment, if did wbinvd at current position (at play_dead), sometimes 
it bring strange issue when repeatly cpu offline/online.
so for cpu dead, the near wbinvd to last step, the safer.
wbinvd would better be the last ops before cpu dead, to avoid potential cache 
coherency break.

In fact, it can do wbinvd inside loop, but as cpu_offline_3.patch said, 
at Xen 7400 when hyperthreading, the offlined thread may be spuriously waken up 
by its brother, and frequently waken inside the dead loop. 
In such case, considering heavy workload of wbinvd, we add a light-weight 
clflush instruction inside loop.

Thanks,
Jinsong


> 
>> Signed-off-by: Liu, Jinsong <jinsong.liu@xxxxxxxxx>
>> 
>> diff -r 2dc3c1cc1bba xen/arch/x86/acpi/cpu_idle.c
>> --- a/xen/arch/x86/acpi/cpu_idle.c Mon Mar 07 05:31:46 2022 +0800
>> +++ b/xen/arch/x86/acpi/cpu_idle.c Thu Mar 10 23:40:51 2022 +0800
>> @@ -562,11 +562,14 @@ static void acpi_dead_idle(void)
>>      if ( (cx = &power->states[power->count-1]) == NULL )         
>> goto default_halt; 
>> 
>> +    /*
>> +     * cache must be flashed as the last ops before cpu going into
>> dead, +     * otherwise, cpu may dead with dirty data breaking cache
>> coherency, +     * leading to strange errors.
>> +     */
>> +    wbinvd();
>>      for ( ; ; )
>>      {
>> -        if ( !power->flags.bm_check && cx->type == ACPI_STATE_C3 )
>> -            ACPI_FLUSH_CPU_CACHE();
>> -
>>          switch ( cx->entry_method )
>>          {
>>              case ACPI_CSTATE_EM_FFH:
>> @@ -584,6 +587,7 @@ static void acpi_dead_idle(void)      }
>> 
>>  default_halt:
>> +    wbinvd();
>>      for ( ; ; )
>>          halt();
>>  }
>> diff -r 2dc3c1cc1bba xen/arch/x86/domain.c
>> --- a/xen/arch/x86/domain.c Mon Mar 07 05:31:46 2022 +0800
>> +++ b/xen/arch/x86/domain.c Thu Mar 10 23:40:51 2022 +0800
>> @@ -93,6 +93,12 @@ static void default_idle(void)
>> 
>>  static void default_dead_idle(void)
>>  {
>> +    /*
>> +     * cache must be flashed as the last ops before cpu going into
>> dead, +     * otherwise, cpu may dead with dirty data breaking cache
>> coherency, +     * leading to strange errors.
>> +     */
>> +    wbinvd();
>>      for ( ; ; )
>>          halt();
>>  }
>> @@ -100,7 +106,6 @@ static void play_dead(void)
>>  static void play_dead(void)
>>  {
>>      local_irq_disable();
>> -    wbinvd();
>> 
>>      /*
>>       * NOTE: After cpu_exit_clear, per-cpu variables are no longer
>> accessible,


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.