
Re: [PATCH -next v4 06/19] arm64: entry: Move arm64_preempt_schedule_irq() into exit_to_kernel_mode()




On 2024/10/29 22:52, Mark Rutland wrote:
> On Fri, Oct 25, 2024 at 06:06:47PM +0800, Jinjie Ruan wrote:
>> Move arm64_preempt_schedule_irq() into exit_to_kernel_mode(), so not
>> only __el1_irq() but also every time when kernel mode irq return,
>> there is a chance to reschedule.
> 
> We use exit_to_kernel_mode() for every non-NMI exception return to the
> kernel, not just IRQ returns.

Yes, it is not only IRQs but also other non-NMI exceptions; I will update it.

> 
>> As Mark pointed out, this change will have the following key impact:
>>
>>     "We'll preempt even without taking a "real" interrupt. That
>>     shouldn't result in preemption that wasn't possible before,
>>     but it does change the probability of preempting at certain points,
>>     and might have a performance impact, so probably warrants a
>>     benchmark."
> 
> For anyone following along at home, I said that at:
> 
>   https://lore.kernel.org/linux-arm-kernel/ZxejvAmccYMTa4P1@J2N7QTR9R3/
> 
> ... and there I specifically said:

Thank you!

This one and the next patch will be merged as you suggested.

I had thought it would be clearer to put it in __exit_to_kernel_mode() first
and then move it into the interrupts-enabled block, in two steps.

> 
>> I'd suggest you first write a patch to align arm64's entry code with the
>> generic code, by removing the call to arm64_preempt_schedule_irq() from
>> __el1_irq(), and adding a call to arm64_preempt_schedule_irq() in
>> __exit_to_kernel_mode(), e.g.
>>
>> | static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
>> | {
>> |    ...
>> |    if (interrupts_enabled(regs)) {
>> |            ...
>> |            if (regs->exit_rcu) {
>> |                    ...
>> |            }
>> |            ...
>> |            arm64_preempt_schedule_irq();
>> |            ...
>> |    } else {
>> |            ...
>> |    }
>> | }
> 
> [...]
> 
>> +#ifdef CONFIG_PREEMPT_DYNAMIC
>> +DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
>> +#define need_irq_preemption() \
>> +    (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
>> +#else
>> +#define need_irq_preemption()       (IS_ENABLED(CONFIG_PREEMPTION))
>> +#endif
>> +
>> +static void __sched arm64_preempt_schedule_irq(void)
>> +{
>> +    if (!need_irq_preemption())
>> +            return;
>> +
>> +    /*
>> +     * Note: thread_info::preempt_count includes both thread_info::count
>> +     * and thread_info::need_resched, and is not equivalent to
>> +     * preempt_count().
>> +     */
>> +    if (READ_ONCE(current_thread_info()->preempt_count) != 0)
>> +            return;
>> +
>> +    /*
>> +     * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
>> +     * priority masking is used the GIC irqchip driver will clear DAIF.IF
>> +     * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
>> +     * DAIF we must have handled an NMI, so skip preemption.
>> +     */
>> +    if (system_uses_irq_prio_masking() && read_sysreg(daif))
>> +            return;
>> +
>> +    /*
>> +     * Preempting a task from an IRQ means we leave copies of PSTATE
>> +     * on the stack. cpufeature's enable calls may modify PSTATE, but
>> +     * resuming one of these preempted tasks would undo those changes.
>> +     *
>> +     * Only allow a task to be preempted once cpufeatures have been
>> +     * enabled.
>> +     */
>> +    if (system_capabilities_finalized())
>> +            preempt_schedule_irq();
>> +}
>> +
>>  /*
>>   * Handle IRQ/context state management when exiting to kernel mode.
>>   * After this function returns it is not safe to call regular kernel code,
>> @@ -72,6 +114,8 @@ static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs)
>>  static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
>>                                      irqentry_state_t state)
>>  {
>> +    arm64_preempt_schedule_irq();
> 
> This is broken; exit_to_kernel_mode() is called for any non-NMI exception
> return to the kernel, and this doesn't check that interrupts were enabled
> in the context the exception was taken from.
> 
> This will preempt in cases where we should not, e.g. if we WARN() in a
> section with IRQs disabled.
> 
> Mark.
> 
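For reference, a minimal sketch (mine, not from the patch or from Mark's
mail) of the fix being asked for: the preemption call has to sit under the
interrupts_enabled(regs) check rather than run unconditionally, so that a
return to a context with IRQs disabled (e.g. a WARN() in such a section)
never reschedules. The surrounding structure is assumed to match Mark's
sketch above, with unrelated details elided as "...":

| static void noinstr exit_to_kernel_mode(struct pt_regs *regs,
|                                         irqentry_state_t state)
| {
|        if (interrupts_enabled(regs)) {
|                ...
|                /* The interrupted context had IRQs enabled: safe to preempt. */
|                arm64_preempt_schedule_irq();
|                ...
|        } else {
|                /*
|                 * The interrupted context had IRQs disabled (e.g. a WARN()
|                 * under local_irq_disable()), so preemption must be skipped.
|                 */
|                ...
|        }
| }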
