
Re: [Xen-devel] [PATCH] x86/hvm: don't rely on shared ioreq state for completion handling



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: 11 August 2015 17:03
> To: Paul Durrant
> Cc: Andrew Cooper; xen-devel@xxxxxxxxxxxxxxxxxxxx; Keir (Xen.org)
> Subject: RE: [PATCH] x86/hvm: don't rely on shared ioreq state for
> completion handling
> 
> >>> On 11.08.15 at 17:49, <Paul.Durrant@xxxxxxxxxx> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: 11 August 2015 16:46
> >> To: Paul Durrant
> >> Cc: Andrew Cooper; xen-devel@xxxxxxxxxxxxxxxxxxxx; Keir (Xen.org)
> >> Subject: RE: [PATCH] x86/hvm: don't rely on shared ioreq state for
> >> completion handling
> >>
> >> >>> On 11.08.15 at 17:32, <Paul.Durrant@xxxxxxxxxx> wrote:
> >> >>  -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> Sent: 11 August 2015 16:20
> >> >> To: Paul Durrant
> >> >> Cc: Andrew Cooper; xen-devel@xxxxxxxxxxxxxxxxxxxx; Keir (Xen.org)
> >> >> Subject: Re: [PATCH] x86/hvm: don't rely on shared ioreq state for
> >> >> completion handling
> >> >>
> >> >> >>> On 31.07.15 at 17:34, <paul.durrant@xxxxxxxxxx> wrote:
> >> >> > Both hvm_io_pending() and hvm_wait_for_io() use the shared (with
> >> >> > emulator) ioreq structure to determine whether there is a pending
> >> >> > I/O. The latter will misbehave if the shared state is driven to
> >> >> > STATE_IOREQ_NONE by the emulator, or when the shared ioreq page is
> >> >> > cleared for re-insertion into the guest P2M when the ioreq server
> >> >> > is disabled (STATE_IOREQ_NONE == 0), because it will terminate its
> >> >> > wait without calling hvm_io_assist() to adjust Xen's internal I/O
> >> >> > emulation state. This may then lead to an I/O completion handler
> >> >> > finding incorrect internal emulation state and calling
> >> >> > domain_crash().
> >> >> >
> >> >> > This patch fixes the problem by adding a pending flag to the ioreq
> >> >> > server's per-vcpu structure which cannot be directly manipulated by
> >> >> > the emulator and thus can be used to determine whether an I/O is
> >> >> > actually pending for that vcpu on that ioreq server. If an I/O is
> >> >> > pending and the shared state is seen to go to STATE_IOREQ_NONE then
> >> >> > it can be treated as an abnormal completion of emulation (hence the
> >> >> > data placed in the shared structure is not used) and the internal
> >> >> > state is adjusted as for a normal completion. Thus, when a
> >> >> > completion handler subsequently runs, the internal state is as
> >> >> > expected and domain_crash() will not be called.
> >> >> >
> >> >> > Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> >> >> > Reported-by: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
> >> >> > Tested-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> >> >>
> >> >> I realize this went in already, but ...
> >> >>
> >> >> > --- a/xen/arch/x86/hvm/hvm.c
> >> >> > +++ b/xen/arch/x86/hvm/hvm.c
> >> >> > @@ -412,44 +412,57 @@ bool_t hvm_io_pending(struct vcpu *v)
> >> >> >                            &d->arch.hvm_domain.ioreq_server.list,
> >> >> >                            list_entry )
> >> >> >      {
> >> >> > -        ioreq_t *p = get_ioreq(s, v);
> >> >> > +        struct hvm_ioreq_vcpu *sv;
> >> >> >
> >> >> > -        if ( p->state != STATE_IOREQ_NONE )
> >> >> > -            return 1;
> >> >> > +        list_for_each_entry ( sv,
> >> >> > +                              &s->ioreq_vcpu_list,
> >> >> > +                              list_entry )
> >> >> > +        {
> >> >> > +            if ( sv->vcpu == v && sv->pending )
> >> >> > +                return 1;
> >> >> > +        }
> >> >>
> >> >> ... while from the review of the original series I recall that doing the
> >> >> outer loop without any lock is fine (due to using domain_pause()
> >> >> when registering servers), I'm not convinced this extends to the
> >> >> inner loop. Can you clarify please? (There are a couple more such
> >> >> loops that I can't immediately see being protected by a lock.)
> >> >
> >> > Yes, I think you are right. If a cpu were to disappear then the list walk
> >> > would be compromised. It should either be locked or rcu in all places.
> >>
> >> I don't think we need to be concerned about vCPU-s disappearing,
> >> since that doesn't happen during the lifetime of a VM. And the
> >> hvm_do_resume() path is used only for domains still alive. Of
> >> course, if any of the lockless loops sit on paths reachable after
> >> a domain got marked dying, that would need fixing.
> >>
> >> What I'm more concerned about are list manipulations behind
> >> the back of a list traversing CPU. Or do those happen only upon
> >> vCPU creation/destruction?
> >
> > Theoretically we could do vcpu hot remove, couldn't we? That's the case I
> > was thinking of.
> 
> But that only removes the vCPU from what the guest sees. The
> hypervisor never de-allocates a vCPU prior to domain death.
> 

Oh, in that case I think the loops are safe.
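
For the benefit of the archive, here's a rough sketch of the scheme the
patch implements (simplified, not the literal hunks; field and helper
names follow the patch as far as I recall, and the hvm_io_assist()
signature shown is illustrative). The key point is that completion is
keyed off the per-vcpu 'pending' flag, which the emulator cannot write,
so seeing STATE_IOREQ_NONE in the shared page while the flag is set is
treated as an abnormal completion rather than as "nothing outstanding":

struct hvm_ioreq_vcpu {
    struct list_head list_entry;
    struct vcpu      *vcpu;
    evtchn_port_t    ioreq_evtchn;
    bool_t           pending;       /* set when an I/O is issued; cleared
                                       only by hvm_io_assist() */
};

static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
{
    /* Only Xen clears sv->pending, so the wait cannot be terminated
     * early by anything the emulator writes to the shared page. */
    while ( sv->pending )
    {
        switch ( p->state )
        {
        case STATE_IORESP_READY:
            /* Normal completion: consume the emulator's response. */
            hvm_io_assist(sv, p->data);
            break;

        case STATE_IOREQ_NONE:
            /*
             * Abnormal completion: the emulator drove the state to NONE,
             * or the page was cleared for re-insertion into the guest
             * P2M. The shared data is not used, but the internal
             * emulation state is still adjusted so that a subsequent
             * completion handler does not call domain_crash().
             */
            hvm_io_assist(sv, ~0ul);
            break;

        default:
            /* IOREQ_READY / IOREQ_INPROCESS: wait for the emulator
             * (validation of bogus states elided here). */
            wait_on_xen_event_channel(sv->ioreq_evtchn,
                                      p->state != STATE_IOREQ_READY &&
                                      p->state != STATE_IOREQ_INPROCESS);
            break;
        }
    }

    return 1;
}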

  Paul

> Jan