Re: [PATCH] timer: fix NR_CPUS=1 build with gcc13
On 13.09.2023 12:25, George Dunlap wrote:
> On Wed, Sep 13, 2023 at 11:05 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>> On 13.09.2023 11:44, George Dunlap wrote:
>>> On Wed, Sep 13, 2023 at 8:32 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>>
>>>> Gcc13 apparently infers from "if ( old_cpu < new_cpu )" that "new_cpu"
>>>> is >= 1, and then (on x86) complains about "per_cpu(timers, new_cpu)"
>>>> exceeding __per_cpu_offset[]'s bounds (being an array of 1 in such a
>>>> configuration). Make the code conditional upon there being at least 2
>>>> CPUs configured (otherwise there simply is nothing to migrate [to]).
>>>
>>> Hmm, without digging into it, migrate_timer() doesn't seem like very
>>> robust code: It doesn't check to make sure that new_cpu is valid, nor
>>> does it give the option of returning an error if anything fails.
>>
>> Question is - what do you expect the callers to do upon getting back
>> failure?
>
> [snip]
>
>>> Would it make more sense to add `||
>>> (new_cpu > CONFIG_NR_CPUS)` to the early-return conditional at the
>>> top of the first `for (; ; )` loop?
>>
>> But that would mean not doing what was requested without any indication
>> to the caller. An out-of-range CPU passed in is generally very likely
>> to result in a crash, I think.
>
> If it's only off by a little bit, there's a good chance it might just
> corrupt some other data, causing a crash further down the line, where
> it's not obvious what went wrong.

In general I would agree, but __per_cpu_offset[] is quite special in the
values it holds. The data immediately following it would therefore also
need to have unusual values within a relatively narrow range for a
crash to not occur right away.

> Generally speaking, passing an
> error up the stack, explicitly crashing, or explicitly doing nothing
> with a warning to the console are all better options.

I guess I'll go that route then, since ...

>>> I guess if we don't expect it ever to be called, it might be better to
>>> get rid of the code entirely; but maybe in that case we should add
>>> something like the following?
>>>
>>> ```
>>> #else
>>> WARN_ONCE("migrate_timer: Request to move to %u on a single-core
>>> system!", new_cpu);
>>> ASSERT_UNREACHABLE();
>>> #endif
>>> ```
>>
>> With the old_cpu == new_cpu case explicitly permitted (and that being
>> the only legal case when NR_CPUS=1, which arguably is an aspect which
>> makes gcc's diagnostic questionable), perhaps only
>>
>> #else
>>     old_cpu = ...;
>>     if ( old_cpu != TIMER_CPU_status_killed )
>>         WARN_ON(new_cpu != old_cpu);
>> #endif
>>
>> (I'm afraid we have no WARN_ON_ONCE() yet, nor WARN_ONCE())?
>
> I think I was looking for `printk_once`.
>
> If there's no reasonable way to fail more gracefully (or no real point
> in making the effort to do so), what if we add the following to the
> top of the function? Does that make gcc13 happy?
>
> ```
> if ( new_cpu >= CONFIG_NR_CPUS )
> {
>     printk_once(/* whatever */);
>     ASSERT_UNREACHABLE();
>     return;
> }
> ```

... this actually makes things worse (then the compiler complains about
old_cpu uses as array index), ...

> Or, if we feel like being passed an invalid cpu means the state is so
> bad it would be better to just crash and have done with it:
>
> ```
> BUG_ON(new_cpu >= CONFIG_NR_CPUS);
> ```

... and this, while it helps when then also done for old_cpu, seems too
hefty to me.

Just to mention it, 'asm volatile ( "" : "+g" (new_cpu) );' placed at the
right location also helps. That's effectively RELOC_HIDE(), which we use
to work around a gcc11 issue in the same area - see gcc11_wrap().

Jan
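
For readers following along, the two remedies mentioned in this message
(compiling the migration path only when more than one CPU is configured,
and hiding a value from the optimizer with an empty asm()) can be sketched
outside of Xen roughly as below. This is a stand-alone illustration under
stated assumptions, not the patch: cpu_offset[], hide_value(), offset_of()
and migrate_sketch() are made-up names standing in for __per_cpu_offset[],
the empty asm() statement, per_cpu() and migrate_timer(), and NR_CPUS
defaults to 1 here purely for the sake of the example.

```c
#include <assert.h>
#include <stdio.h>

#ifndef NR_CPUS
# define NR_CPUS 1  /* assumption: the single-CPU configuration at issue */
#endif

/* Hypothetical stand-in for __per_cpu_offset[]; contents are made up. */
static unsigned long cpu_offset[NR_CPUS];

/*
 * Empty asm with a read-write "+g" operand (GNU C extension). It emits no
 * instructions, but the compiler has to assume the value may have changed,
 * so any range it inferred for the variable earlier is forgotten.
 */
#define hide_value(v) __asm__ volatile ( "" : "+g" (v) )

/* Hypothetical stand-in for a per_cpu() access; callers pass a valid CPU. */
static unsigned long offset_of(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    hide_value(cpu);
    return cpu_offset[cpu];
}

/*
 * Returns 1 if a migration would be carried out, 0 if there is nothing to
 * do, and -1 if the request cannot be honoured.
 */
static int migrate_sketch(unsigned int old_cpu, unsigned int new_cpu)
{
    if ( old_cpu == new_cpu )
        return 0;

#if NR_CPUS > 1
    if ( old_cpu >= NR_CPUS || new_cpu >= NR_CPUS )
        return -1; /* sketch-only guard, not something proposed above */

    /* Stand-in for taking both per-CPU locks, lower CPU number first. */
    if ( old_cpu < new_cpu )
        (void)(offset_of(old_cpu) + offset_of(new_cpu));
    else
        (void)(offset_of(new_cpu) + offset_of(old_cpu));

    return 1;
#else
    /* With one CPU there is nothing to migrate to: warn and do nothing. */
    fprintf(stderr,
            "migrate_sketch: request to move from %u to %u with one CPU\n",
            old_cpu, new_cpu);
    return -1;
#endif
}
```

In the default single-CPU build of this sketch, migrate_sketch(0, 0)
returns 0, while migrate_sketch(0, 1) prints the warning to stderr and
returns -1. Building with -DNR_CPUS=2 compiles the other branch instead.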