
Re: [Xen-devel] [PATCH v3 5/7] vpci: fix execution of long running operations



>>> On 08.11.18 at 12:29, <roger.pau@xxxxxxxxxx> wrote:
> On Thu, Nov 08, 2018 at 02:55:56AM -0700, Jan Beulich wrote:
>> >>> On 07.11.18 at 18:15, <roger.pau@xxxxxxxxxx> wrote:
>> > On Wed, Nov 07, 2018 at 08:06:00AM -0700, Jan Beulich wrote:
>> >> >>> On 07.11.18 at 12:11, <roger.pau@xxxxxxxxxx> wrote:
>> >> > On Mon, Nov 05, 2018 at 09:56:13AM -0700, Jan Beulich wrote:
>> >> >> >>> On 30.10.18 at 16:41, <roger.pau@xxxxxxxxxx> wrote:
>> >> >> > BAR map/unmap is a long running operation that needs to be preempted
>> >> >> > in order to avoid overrunning the assigned vCPU time (or even
>> >> >> > triggering the watchdog).
>> >> >> > 
>> >> >> > Current logic for this preemption is wrong, and won't work at all for
>> >> >> > AMD since only Intel makes use of hvm_io_pending (and even in that
>> >> >> > case the current code is wrong).
>> >> >> 
>> >> >> I'm having trouble understanding this, both for the AMD aspect
>> >> >> (it is only vvmx.c which has a function call not mirrored on the
>> >> >> AMD side) and for the supposed general brokenness. Without
>> >> >> some clarification I can't judge whether re-implementing via
>> >> >> tasklet is actually the best approach.
>> >> > 
>> >> > hvm_io_pending itself cannot block the vCPU from executing; AFAICT
>> >> > it's used by nvmx_switch_guest in order to prevent changing the
>> >> > nested VMCS if there's pending IO emulation work.
>> >> > 
>> >> > The only way I could find to actually prevent a vCPU from running
>> >> > while doing some work on its behalf in a preemptible way is by
>> >> > blocking it and using a tasklet. What's done with IOREQs is not
>> >> > suitable here, since Xen needs to do some work instead of just
>> >> > waiting on an external event (an event channel from the IOREQ).
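
For context, the tasklet-based shape described above would be roughly
the sketch below. vcpu_pause_nosync(), vcpu_unpause() and
tasklet_schedule() are the usual Xen primitives; the handler name and
signature, the helper and the per-vCPU tasklet field are purely
illustrative, not taken from the patch:

    /* Tasklet handler: runs in idle-vCPU context, not in guest context. */
    static void vpci_map_tasklet(unsigned long data)
    {
        struct vcpu *v = (struct vcpu *)data;

        /*
         * The long-running BAR map/unmap loop happens here, on behalf
         * of the (paused) vCPU, instead of eating into its time slice.
         */
        vpci_do_map_unmap(v);                 /* hypothetical helper */

        /* Work done: allow the scheduler to pick the vCPU up again. */
        vcpu_unpause(v);
    }

    /* In the vPCI write handler, instead of doing the map/unmap inline: */
    vcpu_pause_nosync(curr);                  /* curr stops being runnable */
    tasklet_schedule(&curr->vpci_tasklet);    /* hypothetical per-vCPU tasklet */

Like the _VPF_blocked_in_xen path this keeps the vCPU from running
while Xen works on its behalf, but the work is driven by the tasklet
rather than by waiting on an external event.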
>> >> 
>> >> No, there is a second means, I've just confused the functions. The
>> >> question is whether your vpci_process_pending() invocation
>> >> perhaps sits in the wrong function. handle_hvm_io_completion() is
>> >> what hvm_do_resume() calls, and what can prevent a guest from
>> >> resuming execution. The hvm_io_pending() invocation just sits on
>> >> a special case path down from there (through handle_pio()).
>> > 
>> > Yes, handle_hvm_io_completion is the function that actually blocks the
>> > vCPU and waits for an event channel from the ioreq. This is however
>> > not suitable because it uses the following code (simplified):
>> > 
>> > set_bit(_VPF_blocked_in_xen, &current->pause_flags);
>> > raise_softirq(SCHEDULE_SOFTIRQ);
>> > do_softirq();
>> > 
>> > In the vPCI case Xen cannot set the vCPU as blocked_in_xen; Xen needs
>> > the scheduler to schedule the vCPU so that the pending work can be
>> > processed.
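
For reference, the reason _VPF_blocked_in_xen keeps a vCPU off the run
queues is that it lives in pause_flags, which vcpu_runnable() folds
into its check (roughly, from xen/include/xen/sched.h):

    static inline bool vcpu_runnable(struct vcpu *v)
    {
        return !(v->pause_flags |
                 atomic_read(&v->pause_count) |
                 atomic_read(&v->domain->pause_count));
    }

With the bit set the scheduler simply won't pick the vCPU again until
it is cleared, while here the pending work in vpci_process_pending()
is meant to run from the vCPU's own resume path, so the vCPU has to
stay runnable.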
>> 
>> Right, and I didn't say you should set the vCPU to blocked. What
>> I've pointed out is that the mere fact of handle_hvm_io_completion()
>> returning false makes hvm_do_resume() bail, resulting in another
>> round through softirq processing (from entry.S code) as long as
>> _some_ softirq is pending (here: the scheduler one).
> 
> No, hvm_do_resume bailing doesn't prevent the VM from being scheduled.
> Both vmx_do_resume and svm_do_resume (the callers of hvm_do_resume)
> will run the guest regardless of whether hvm_do_resume has bailed
> early.
> 
> Note that at the point where hvm_do_resume is called there's no
> turning back: both {vmx/svm}_do_resume functions are annotated as
> noreturn. The only way to prevent guest execution at that point is to
> raise a scheduler softirq and process it, like it's done in
> wait_on_xen_event_channel.

That's what I've said. You don't need to mark the vCPU as blocked
in Xen for this, though. And that's what I understood was your
original concern.
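
To make that concrete, here is a minimal sketch of the arrangement
(the per-vCPU field, the helper and the exact call site are
illustrative, not the actual patch): the write handler records the
pending work and raises SCHEDULE_SOFTIRQ, and vpci_process_pending()
then consumes a bounded chunk of it each time the vCPU passes through
hvm_do_resume():

    /* In the BAR/command write handler, instead of mapping inline: */
    curr->vpci_pending = mem;            /* hypothetical per-vCPU field */
    raise_softirq(SCHEDULE_SOFTIRQ);     /* entry.S runs softirqs before
                                            the guest is resumed */

    /* Called on the hvm_do_resume() path: */
    void vpci_process_pending(struct vcpu *v)
    {
        if ( !v->vpci_pending )
            return;

        /* Consume a bounded amount of the pending map/unmap work. */
        if ( consume_some_ranges(v) == -ERESTART )   /* hypothetical */
        {
            /*
             * More work left: make sure the scheduler runs again
             * before the guest does, so the vCPU can be preempted
             * instead of overrunning its time slice.
             */
            raise_softirq(SCHEDULE_SOFTIRQ);
            return;
        }

        v->vpci_pending = NULL;          /* all done */
    }

No pause flag is involved: the vCPU stays runnable, and the pending
scheduler softirq is what keeps the exit-to-guest path going through
softirq processing until the work has been completed.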

>> > Then if the blocked bit is not set, the call to do_softirq
>> > would recurse, thus probably leading to a stack overflow if
>> > there's enough pending work, i.e.:
>> > 
>> > <process work>
>> >    <do_softirq>
>> >            <process work>
>> >                    <do_softirq>
>> >                            <...>
>> 
>> Why would that be? The do_softirq() invocation sits on the exit-
>> to-guest path, explicitly avoiding any such nesting unless there
>> was a do_softirq() invocation somewhere in a softirq handler.
> 
> It sits on an exit-to-guest path, but the following chunk:
> 
> raise_softirq(SCHEDULE_SOFTIRQ);
> do_softirq();
> 
> would prevent the path from ever reaching the exit-to-guest point and
> would instead nest on itself, unless the vCPU is marked as blocked,
> which prevents it from being scheduled and thus avoids this recursion.

It would, indeed, but the emphasis is on the subjunctive, unless
you have an example of such malformed code. I'm not
aware of any, and my earlier comments also didn't suggest
introducing such. Perhaps I'm simply missing part of what you
think you see? It is important to note, though, that
vmx_do_resume() has

    reset_stack_and_jump(vmx_asm_do_vmentry);

as its last step, precluding any nesting.

Jan

