
Re: [Xen-devel] [PATCH v3 5/7] vpci: fix execution of long running operations

Hi Roger,

On 11/8/18 12:20 PM, Roger Pau Monné wrote:
On Thu, Nov 08, 2018 at 11:52:57AM +0000, Julien Grall wrote:
Hi Roger,

On 11/8/18 11:44 AM, Roger Pau Monné wrote:
On Thu, Nov 08, 2018 at 11:42:35AM +0000, Julien Grall wrote:

Sorry to jump into the conversation late.

On 11/8/18 11:29 AM, Roger Pau Monné wrote:
Why would that be? The do_softirq() invocation sits on the exit-
to-guest path, explicitly avoiding any such nesting unless there
was a do_softirq() invocation somewhere in a softirq handler.

It sits on an exit-to-guest path, but the following chunk:


Would prevent the path from ever reaching the exit-to-guest and
nesting on itself, unless the vCPU is marked as blocked, which
prevents it from being scheduled thus avoiding this recursion.

I can't see how the recursion could happen on Arm. So is it an x86 issue?

This is not an issue with the current code, I was just discussing with
Jan how to properly implement vPCI long running operations that need
to be preempted.

To give more context on my question, we are looking at handling preemption
on Arm in some long running operations (e.g. cache flush) without having to
worry about returning to the guest.

I am thinking of something along the following lines on Arm, in a loop:

for ( .... )
    if ( try_reschedule )

This would require that no lock is held, but I think it would work on Arm
for any long operation. So I am quite interested in the result of the
discussions here.
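
To make the idea concrete, here is a rough, self-contained toy model of such a loop. It is only a sketch: `sim_cpu`, `sim_reschedule()` and `long_operation()` are hypothetical names made up for illustration, not Xen functions, and the "reschedule request" is simulated by a simple counter rather than by a real softirq.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU state; not a Xen structure. */
struct sim_cpu {
    unsigned int locks_held;  /* locks currently held by this context */
    unsigned int pending;     /* simulated "reschedule requested" flag */
    unsigned int yields;      /* number of times the CPU was given up */
};

/*
 * Hypothetical scheduler entry point. In this model it simply returns,
 * which stands for "we were descheduled and later scheduled again".
 */
static void sim_reschedule(struct sim_cpu *cpu)
{
    /* Yielding with a lock held could deadlock, so the model forbids it. */
    assert(cpu->locks_held == 0);
    cpu->pending = 0;
    cpu->yields++;
}

/*
 * Process `total` units of work, checking for a pending reschedule
 * after each unit. A request is simulated every `period` units
 * (period == 0 means no request ever arrives).
 */
static size_t long_operation(struct sim_cpu *cpu, size_t total, size_t period)
{
    size_t done;

    for ( done = 0; done < total; done++ )
    {
        /* ... one unit of the long running work would go here ... */

        if ( period && (done + 1) % period == 0 )
            cpu->pending = 1;      /* simulated request arriving */

        if ( cpu->pending )
            sim_reschedule(cpu);   /* returns, and the loop carries on */
    }

    return done;
}
```

The point of the model is that the loop keeps its own progress (`done`) across each yield, so nothing has to be unwound or restarted, provided the reschedule call really does return to its caller.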

As said to Jan, I don't think this is viable because you could end up
recursing in do_softirq if there are no other guests to run and enough

Let's imagine that there's only 1 vCPU to run, and that it has a long
running operation pending. I assume you will somehow hook the code to
perform such operation in the guest resume path:

-> preempt
-> preempt
-> preempt

As you can see this could overflow the stack if there are enough

This sounds like an x86-specific issue. In the case of Arm, the context_switch() function will return, so we will come back to the loop we were in before.

We can do this because the hypervisor stack is per-vCPU, so there is no stack overflow involved here.
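
As a rough illustration of the difference, the toy model below counts call depth in both cases. It is a sketch under stated assumptions, not hypervisor code: `depth_stats`, `resume_nested()` and `resume_returning()` are hypothetical names, and "depth" is just a counter standing in for stack usage.

```c
#include <stddef.h>

/* Hypothetical bookkeeping: current and maximum simulated call depth. */
struct depth_stats {
    size_t current;
    size_t max;
};

static void enter(struct depth_stats *s)
{
    if ( ++s->current > s->max )
        s->max = s->current;
}

static void leave(struct depth_stats *s)
{
    s->current--;
}

/*
 * Model A: the switch does not return to its caller, so every
 * preemption re-enters the resume path on top of the previous one.
 * Depth grows with the number of chunks of work.
 */
static void resume_nested(struct depth_stats *s, size_t chunks)
{
    enter(s);
    if ( chunks > 1 )
        resume_nested(s, chunks - 1);   /* preempt -> resume again */
    leave(s);
}

/*
 * Model B: the switch returns to its caller, so the operation can sit
 * in a loop. Depth stays constant however many chunks there are.
 */
static void resume_returning(struct depth_stats *s, size_t chunks)
{
    enter(s);
    while ( chunks-- > 1 )
    {
        enter(s);   /* call into the scheduler ...        */
        leave(s);   /* ... which comes back into the loop */
    }
    leave(s);
}
```

In model A the maximum depth equals the number of chunks, while in model B it never exceeds two, which is why a returning context switch on a per-vCPU stack avoids the overflow concern.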


Julien Grall
