[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.19 2/9] xen/cpu: do not get the CPU map in stop_machine_run()



On Wed, May 29, 2024 at 05:31:02PM +0200, Jan Beulich wrote:
> On 29.05.2024 17:20, Roger Pau Monné wrote:
> > On Wed, May 29, 2024 at 03:04:13PM +0200, Jan Beulich wrote:
> >> On 29.05.2024 11:01, Roger Pau Monne wrote:
> >>> The current callers of stop_machine_run() outside of init code already 
> >>> have the
> >>> CPU maps locked, and hence there's no reason for stop_machine_run() to 
> >>> attempt
> >>> to lock again.
> >>
> >> While purely from a description perspective this is okay, ...
> >>
> >>> --- a/xen/common/stop_machine.c
> >>> +++ b/xen/common/stop_machine.c
> >>> @@ -82,9 +82,15 @@ int stop_machine_run(int (*fn)(void *data), void 
> >>> *data, unsigned int cpu)
> >>>      BUG_ON(!local_irq_is_enabled());
> >>>      BUG_ON(!is_idle_vcpu(current));
> >>>  
> >>> -    /* cpu_online_map must not change. */
> >>> -    if ( !get_cpu_maps() )
> >>> +    /*
> >>> +     * cpu_online_map must not change.  The only two callers of
> >>> +     * stop_machine_run() outside of init code already have the CPU map 
> >>> locked.
> >>> +     */
> >>
> >> ... the "two" here is not unlikely to quickly go stale; who knows what PPC
> >> and RISC-V will have as their code becomes more complete?
> >>
> >> I'm also unconvinced that requiring ...
> >>
> >>> +    if ( system_state >= SYS_STATE_active && !cpu_map_locked() )
> >>
> >> ... this for all future (post-init) uses of stop_machine_run() is a good
> >> idea. It is quite a bit more natural, to me at least, for the function to
> >> effect this itself, as is the case prior to your change.
> > 
> > This is mostly a pre-req for the next change that switches
> > get_cpu_maps() to return false if the current CPU is holding the CPU
> > maps lock in write mode.
> > 
> > IF we don't want to go this route we need a way to signal
> > send_IPI_mask() when a CPU hot{,un}plug operation is taking place,
> > because get_cpu_maps() enough is not suitable.
> > 
> > Overall I don't like the corner case where get_cpu_maps() returns true
> > if a CPU hot{,un}plug operation is taking place in the current CPU
> > context.  The guarantee of get_cpu_maps() is that no CPU hot{,un}plug
> > operations can be in progress if it returns true.
> 
> I'm not convinced of looking at it this way. To me the guarantee is
> merely that no CPU operation is taking place _elsewhere_. As indicated,
> imo the local CPU should be well aware of what context it's actually in,
> and hence what is (or is not) appropriate to do at a particular point in
> time.
> 
> I guess what I'm missing is an example of a concrete code path where
> things presently go wrong.

See the specific example in patch 3/9 with time_calibration() and it's
usage of send_IPI_mask() when called from a CPU executing in cpu_up()
context.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.