[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 16/18] vmx: Add some scheduler hooks for VT-d posted interrupts




> -----Original Message-----
> From: dunlapg@xxxxxxxxx [mailto:dunlapg@xxxxxxxxx] On Behalf Of George
> Dunlap
> Sent: Thursday, September 17, 2015 1:08 AM
> To: Wu, Feng
> Cc: Jan Beulich; Tian, Kevin; Keir Fraser; Andrew Cooper; Dario Faggioli;
> xen-devel@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH v6 16/18] vmx: Add some scheduler hooks for
> VT-d posted interrupts
> 
> On Thu, Sep 10, 2015 at 9:59 AM, Wu, Feng <feng.wu@xxxxxxxxx> wrote:
> >> >> >> Calling vcpu_unblock() in the middle of context_switch()? Why? And
> >> >> >> is this safe?
> >> >> >
> >> >> > I cannot see anything unsafe so far, can some scheduler maintainer
> >> >> > help to confirm it? Dario? George?
> >> >>
> >> >> But first and foremost you ought to answer the "why".
> >> >
> >> > I think if you think this solution is unsafe, you need point out where it
> >> > is, not
> >> > just ask "is this safe ?", I don't think this is an effective comments.
> >>
> >> It is my general understanding that people wanting code to be
> >> included have to prove their code is okay, not reviewers to prove
> >> the code is broken.
> >>
> >> > Anyway, I go through the main path of the code as below, I really don't 
> >> > find
> >> > anything unsafe here.
> >> >
> >> > context_switch() -->
> >> >     pi_ctxt_switch_from() -->
> >> >         vmx_pre_ctx_switch_pi() -->
> >> >             vcpu_unblock() -->
> >> >                 vcpu_wake() -->
> >> >                     SCHED_OP(VCPU2OP(v), wake, v) -->
> >> >                         csched_vcpu_wake() -->
> >> >                             __runq_insert()
> >> >                             __runq_tickle()
> >> >
> >> > If you continue to think it is unsafe, pointing out the place will be
> >> > welcome!
> >>
> >> Once again - I didn't say it's unsafe (and hence can't point at a
> >> particular place). What bothers me is that vcpu_unblock() affects
> >> scheduler state, and altering scheduler state from inside the
> >> context switch path looks problematic at the first glance. I'd be
> >> happy if I was told there is no such problem, but be aware that
> >> this would have to include latent ones: Even if right now things
> >> are safe, having such feedback have the potential of becoming
> >> an actual problem with a later scheduler change is already an
> >> issue, as finding the problem is then likely going to be rather hard
> >> (and I suspect it's not going to be you to debug this).
> >
> > What I can say is that after investigation, I don't find anything bad
> > for this vcpu_unblock(), I tested it in my experiment. So that is why
> > I'd like to ask some ideas from scheduler expects.
> 
> You still didn't answer Jan's question as to why you chose to call it there.

The reason why I need to call vcpu_unblock() here is that we are about
to block the vCPU and put it on the blocking list if everything goes okay,
but if during updating the posted-interrupt descriptor 'ON' is set, which
means there is a new interrupt coming for this vCPU, we shouldn't block
it, so I call vcpu_unblock to do it.

> 
> As to why Jan is suspicious, the hypervisor code is incredibly
> complicated, and even hypervisor maintainers are only mortals. :-) One
> way we deal with the complication is by trying to restrict the ways
> different code interacts, so that people reading, writing, and
> maintaining the code can make simplifying assumptions to make it
> easier to grasp / harder to make mistakes.
> 
> People reading or working on vcpu_wake() code expect it to be called
> from interrupt contexts, and also expect it to be called from normal
> hypercalls.  They *don't* expect it to be called from in the middle of
> a context switch, where "current", the registers, and all those things
> are all up in the air.  It's possible that the code *already*
> implicitly assumes it's not called from context switch (i.e., that v
> == current means certain state), and even if not, it's possible
> someone will make that assumption in the future, causing a very
> hard-to-detect bug.
> 
> Now I haven't looked at the code in detail yet; it may be that the
> only way to make PI work is to make this call.  If that's the case we
> need to carefully examine assumptions, and carefully comment the code
> to make sure people working on the scheduling code continue to think
> carefully about their assumptions in the future.  But if we can get
> away without doing that, it will make things much easier.
> 
> But it's certainly reasonable to expect the maintainers to offer you
> an alternate solution if your proposed solution is unacceptable. :-)
> Let me take a look and see what I think.

Really appreciate your detailed description about this, it is reasonable,
and thank you from your comments!

Thanks,
Feng

> 
>  -George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.