[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
Hi Julien, Stefano, Thanks for the feedback and suggestions. On Mon, Apr 16, 2018 at 3:14 PM, Julien Grall <julien.grall@xxxxxxx> wrote: > Hi, > > > On 12/04/18 22:31, Stefano Stabellini wrote: >> >> On Thu, 12 Apr 2018, Julien Grall wrote: >>> >>> On 12/04/18 00:46, Stefano Stabellini wrote: >>>> >>>> On Wed, 11 Apr 2018, Julien Grall wrote: >>>>> >>>>> On 11/04/18 14:19, Mirela Simonovic wrote: >>>>>> >>>>>> Freeing percpu area is done when a non-boot CPU is disabled upon >>>>>> suspend. >>>>>> This use to be scheduled for execution after a period of time, what >>>>>> caused >>>>>> the following racing issues. If CPU is enabled after it is disabled >>>>>> and >>>>>> before the freeing of percpu area is performed, Xen would crash upon >>>>>> initializing percpu area because per cpu offset is not marked as >>>>>> INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed). >>>>>> To resolve the racing issue, free percpu area right away instead >>>>>> scheduling it for later. >>>>> >>>>> >>>>> The reason of using the RCU is you want to make sure that none of the >>>>> other >>>>> CPUs will access that percpu data before freeing it. So I don't think >>>>> this >>>>> patch is valid. >>>>> >>>>> It looks like to me a rcu barrier is missing after calling cpu_down >>>>> somewhere >>>>> in the CPU off path. I am not entirely sure where. >>>> >>>> >>>> We need a rcu_barrier(). Perhaps, it could be added on cpu_on before >>>> initializing the percpu area? >>> >>> >>> Do you mind giving a bit more details on your thought? cpu_up looks a >>> strange >>> place as no one should access the percpu area after the CPU is down. So >>> it >>> feels the rcu_barrier should be somewhere before PSCI_cpu_off is called. >> >> >> Yes, it feels strange to do it on cpu_on, it would be more obvious on >> cpu_off, but we don't actually need to _free_percpu_area on cpu_off, >> right? We only need to make sure it is done before cpu_percpu_callback >> is called on cpu_on. >> >> My suggestion would be to evaluate if it is possible to introduce the >> rcu_barrier() on the resume path before cpu_percpu_callback, maybe in >> start_secondary. > > > Well, cpu_percpu_callback is not called by start_secondary. It is called > when preparing the CPU from another CPU. So anything in start_secondary will > not work. > I have also confirmed that in start_secondary it doesn't work, it's too late. >> >> I was also looking at xen/arch/x86/acpi/power.c:enter_state and noticed >> that they chose to call rcu_barrier() on enable_cpu before >> enable_nonboot_cpus(). > Before the enable_nonboot_cpus() gets invoked seems to be a good place for rcu_barrier(), as it's done for x86. > > I guess the rcu_barrier() in the function handling suspend/resume works. But > that doesn't cover the hotplug case. Looking at x86, suspend/resume case. > For the hotplug case, there are an rcu_barrier in cpu_{up,down}_helper but > they are only present in the case of cpu_{up,down} failed. I am not entirely > sure how this is handled in x86 > > Andrew, Jan, do you know when the percpu will be free on hotplug? It is call > to call_rcu(...) but I am not sure when this is going to be executed. > AFAIK disable/enable_nonboot_cpus() is the only way to do the hotplug and rcu_barrier() is not included in the flow. I suggest to invoke rcu_barrier() before enable_nonboot_cpus() and eventually this could be moved into enable_nonboot_cpus() itself. Thanks, Mirela > Cheers, > > -- > Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |