[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] 答复: 答复: [PATCH v2] x86/hpet: Fix possible ASSERT(cpu < nr_cpu_ids)


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: David Wang <DavidWang@xxxxxxxxxxx>
  • Date: Thu, 19 Apr 2018 08:56:09 +0000
  • Accept-language: zh-CN, en-US
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Fiona Li\(BJ-RD\)" <FionaLi@xxxxxxxxxxx>
  • Delivery-date: Thu, 19 Apr 2018 09:12:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHT1yzdpHsoAoSss0GbDHjFrdP+6aQHaz0a///A2ICAAJTdiw==
  • Thread-topic: 答复: [PATCH v2] x86/hpet: Fix possible ASSERT(cpu < nr_cpu_ids)

Jan , thank you for your explanation. 
Your mean : To avoid  evaluating the cpumask twice in hpet_detach_channel(), 
use a local variable  to hold the result of cpumask_first(). Then  
cpumask_empty() could be  replaced  by comparing the value of variable and 
nr_cpu_ids.

I understand it right?



发件人: Jan Beulich <JBeulich@xxxxxxxx>
发送时间: 2018年4月19日 15:30
收件人: David Wang
抄送: xen-devel; Fiona Li(BJ-RD)
主题: Re: 答复: [PATCH v2] x86/hpet: Fix possible ASSERT(cpu < nr_cpu_ids)
    
>>> On 19.04.18 at 06:50, <DavidWang@xxxxxxxxxxx> wrote:
> Thank you for reply. Maybe My description is not clear. Please allow me to 
> explain it again.
> Multiple CPU may be share an channel(according to  hpet_get_channel(), it is 
> possible ). When  one of  them get the lock of 
> channel(hpet_broadcast_exit()->hpet_detach_channel()->spin_lock_irq(&ch->lock))
> , others shouldn't rewrite/clear the ch->cpumask in hpet_broadcast_exit(). 
> This lead to errors.
> For example:
> CPU zero and CPU one share an channel by executing hpet_get_channel() 
> respectively and  ch->cpumask of channel be set to 0x3.
> Next, CPU zero  execute hpet_broadcast_exit()->cpumask_clear_cpu() and the 
> ch->cpumask is 0x2.  
> CPU zero execute 
> hpet_broadcast_exit()->hpet_detach_channel()->cpumask_empty() 
> and it get a false.  
> After that the next moment, CPU one  execute 
> hpet_broadcast_exit()->cpumask_clear_cpu(). That set the ch->cpumask to 0.
> When CPU zero execute hpet_detach_channel()->cpumask_first(), ch->cpu would 
> be 
> set to nr_cpu_ids for ch->cpumask being 0.
> An assertion would happen through hpet_detach_channel() 
> ->set_channel_irq_affinity() -> cpumask_of() ->cpumask_check();
>  
> I think the cause leading to assertion is that  cpu rewrite  shared zone 
> when other is reading. I've tried two ways.  The CPU must get the lock of 
> channel before  executing cpumask_clear_cpu() as patch v1.  Another way of 
> resolving it is "a variable hold the value of  ch->cpumask at the beginning 
> of 
> hpet_detach_channel() as patch v2" .

Following your earlier description I had been able to work out what you
describe above. But that doesn't mean the description of the patch now
can remain unclear: You need to describe the issue in an understandable
way. This doesn't, however, necessarily mean to make the description
much longer (i.e. I don't think the above would be a suitable replacement).
See below.

> "how about eliminating the cpumask_empty() call in favor of just the 
> cpumask_first()" 
> Do you mean to delete  the cpumask_empty() and leave the cpumask_first()? 

FAOD "leave" isn't the right term: I did suggest to _move_ cpumask_first(),
such that you can use its result both in place of the current cpumask_empty()
one _and_ where it is being used currently.

>>>> On 18.04.18 at 11:25, <Davidwang@xxxxxxxxxxx> wrote:
>> From: David Wang <davidwang@xxxxxxxxxxx>
>> 
>> For the ch->cpumask be cleared by other cpu, cpumask_first() called by
>> hpet_detach_channel() return nr_cpu_ids. That lead an assertion in
>> set_channel_irq_affinity() when cpumask_of() check cpu.
>> Fix this by using a local variable.

I think the description in v1 came closer to something understandable.
Taking that, how about:

"CPUs may share an in-use channel. Hence clearing of a bit from the
 cpumask (in hpet_broadcast_exit()) as well as setting one (in
 hpet_broadcast_enter()) must not race evaluation of that same
 cpumask. Therefore avoid evaluating the cpumask twice in
 hpet_detach_channel(). Otherwise cpumask_empty() may e.g. return
 false while the subsequent cpumask_first() could return nr_cpu_ids,
 which then triggers the assertion in cpumask_of() reached through
 set_channel_irq_affinity()."

?

Jan


    
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.