[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] 答复: 答复: [PATCH v2] x86/hpet: Fix possible ASSERT(cpu < nr_cpu_ids)
Jan , thank you for your explanation. Your mean : To avoid evaluating the cpumask twice in hpet_detach_channel(), use a local variable to hold the result of cpumask_first(). Then cpumask_empty() could be replaced by comparing the value of variable and nr_cpu_ids. I understand it right? 发件人: Jan Beulich <JBeulich@xxxxxxxx> 发送时间: 2018年4月19日 15:30 收件人: David Wang 抄送: xen-devel; Fiona Li(BJ-RD) 主题: Re: 答复: [PATCH v2] x86/hpet: Fix possible ASSERT(cpu < nr_cpu_ids) >>> On 19.04.18 at 06:50, <DavidWang@xxxxxxxxxxx> wrote: > Thank you for reply. Maybe My description is not clear. Please allow me to > explain it again. > Multiple CPU may be share an channel(according to hpet_get_channel(), it is > possible ). When one of them get the lock of > channel(hpet_broadcast_exit()->hpet_detach_channel()->spin_lock_irq(&ch->lock)) > , others shouldn't rewrite/clear the ch->cpumask in hpet_broadcast_exit(). > This lead to errors. > For example: > CPU zero and CPU one share an channel by executing hpet_get_channel() > respectively and ch->cpumask of channel be set to 0x3. > Next, CPU zero execute hpet_broadcast_exit()->cpumask_clear_cpu() and the > ch->cpumask is 0x2. > CPU zero execute > hpet_broadcast_exit()->hpet_detach_channel()->cpumask_empty() > and it get a false. > After that the next moment, CPU one execute > hpet_broadcast_exit()->cpumask_clear_cpu(). That set the ch->cpumask to 0. > When CPU zero execute hpet_detach_channel()->cpumask_first(), ch->cpu would > be > set to nr_cpu_ids for ch->cpumask being 0. > An assertion would happen through hpet_detach_channel() > ->set_channel_irq_affinity() -> cpumask_of() ->cpumask_check(); > > I think the cause leading to assertion is that cpu rewrite shared zone > when other is reading. I've tried two ways. The CPU must get the lock of > channel before executing cpumask_clear_cpu() as patch v1. Another way of > resolving it is "a variable hold the value of ch->cpumask at the beginning > of > hpet_detach_channel() as patch v2" . Following your earlier description I had been able to work out what you describe above. But that doesn't mean the description of the patch now can remain unclear: You need to describe the issue in an understandable way. This doesn't, however, necessarily mean to make the description much longer (i.e. I don't think the above would be a suitable replacement). See below. > "how about eliminating the cpumask_empty() call in favor of just the > cpumask_first()" > Do you mean to delete the cpumask_empty() and leave the cpumask_first()? FAOD "leave" isn't the right term: I did suggest to _move_ cpumask_first(), such that you can use its result both in place of the current cpumask_empty() one _and_ where it is being used currently. >>>> On 18.04.18 at 11:25, <Davidwang@xxxxxxxxxxx> wrote: >> From: David Wang <davidwang@xxxxxxxxxxx> >> >> For the ch->cpumask be cleared by other cpu, cpumask_first() called by >> hpet_detach_channel() return nr_cpu_ids. That lead an assertion in >> set_channel_irq_affinity() when cpumask_of() check cpu. >> Fix this by using a local variable. I think the description in v1 came closer to something understandable. Taking that, how about: "CPUs may share an in-use channel. Hence clearing of a bit from the cpumask (in hpet_broadcast_exit()) as well as setting one (in hpet_broadcast_enter()) must not race evaluation of that same cpumask. Therefore avoid evaluating the cpumask twice in hpet_detach_channel(). Otherwise cpumask_empty() may e.g. return false while the subsequent cpumask_first() could return nr_cpu_ids, which then triggers the assertion in cpumask_of() reached through set_channel_irq_affinity()." ? Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |