[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 00/12] Convert cpu_up/down to device_online/offline



Hi Thomas

On 10/30/19 15:38, Qais Yousef wrote:
> Using cpu_up/down directly to bring cpus online/offline loses synchronization
> with sysfs and could suffer from a race similar to what is described in
> commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization
> during LPM").
> 
> cpu_up/down seem to be more of a internal implementation detail for the cpu
> subsystem to use to boot up cpus, perform suspend/resume and low level hotplug
> operations. Users outside of the cpu subsystem would be better using the 
> device
> core API to bring a cpu online/offline which is the interface used to hotplug
> memory and other system devices.
> 
> Several users have already migrated to use the device core API, this series
> converts the remaining users and hides cpu_up/down from internal users at the
> end.
> 
> I still need to update the documentation to remove references to cpu_up/down
> and advocate for device_online/offline instead if this series will make its 
> way
> through.
> 
> I noticed this problem while working on a hack to disable offlining
> a particular CPU but noticed that setting the offline_disabled attribute in 
> the
> device struct isn't enough because users can easily bypass the device core.
> While my hack isn't a valid use case but it did highlight the inconsistency in
> the way cpus are being onlined/offlined and this attempt hopefully improves on
> this.
> 
> The first 6 patches fixes arch users.
> 
> The next 5 patches fixes generic code users. Particularly creating a new
> special exported API for the device core to use instead of cpu_up/down.
> Maybe we can do something more restrictive than that.
> 
> The last patch removes cpu_up/down from cpu.h and unexport the functions.
> 
> In some cases where the use of cpu_up/down seemed legitimate, I encapsulated
> the logic in a higher level - special purposed function; and converted the 
> code
> to use that instead.
> 
> I did run the rcu torture, lock torture and psci checker tests and no problem
> was noticed. I did perform build tests on all arch affected except for parisc.
> 
> Hopefully I got the CC list right for all the patches. Apologies in advance if
> some people were omitted from some patches but they should have been CCed.

I had to make an educated guess that you're probably the 'maintainer' of cpu
hotplug - but there's no explicit entry that says that. Please let me know if
I need to bring the attention of others too.

The series do have few rough edges to address, but it's relatively
straightforward and I think does offer a nice improvement in the form of
consolidating the API for bringing up/down cpus from external
subsystems/drivers. Beside fix the inconsistency of device's core view of the
state of the cpu which can happen when cpu_{up/down} are called directly.

The downside I see is that the external API to bring cpus up/down for
suspend/resume and at boot seem to have grown a bit organically (I've added
a couple in this series to address 2 direct users of cpu_{up,down}). We might
need to rethink this API, but I think this is outside the scope of this series.

Any thoughts/feedback would be appreciated.

Thanks

--
Qais Yousef

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.