
RE: [PATCH-for-4.14] ioreq: handle pending emulation racing with ioreq server destruction



> -----Original Message-----
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: 08 June 2020 12:10
> To: paul@xxxxxxx
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; 'Paul Durrant' <pdurrant@xxxxxxxxxx>; 
> 'Marek Marczykowski-Górecki'
> <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [PATCH-for-4.14] ioreq: handle pending emulation racing with 
> ioreq server destruction
> 
> On 08.06.2020 13:04, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@xxxxxxxx>
> >> Sent: 08 June 2020 11:58
> >> To: Paul Durrant <paul@xxxxxxx>
> >> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Paul Durrant <pdurrant@xxxxxxxxxx>; 
> >> Marek Marczykowski-Górecki
> >> <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> >> Subject: Re: [PATCH-for-4.14] ioreq: handle pending emulation racing with 
> >> ioreq server destruction
> >>
> >> On 08.06.2020 11:46, Paul Durrant wrote:
> >>> From: Paul Durrant <pdurrant@xxxxxxxxxx>
> >>>
> >>> When an emulation request is initiated in hvm_send_ioreq() the guest vcpu 
> >>> is
> >>> blocked on an event channel until that request is completed. If, however,
> >>> the emulator is killed whilst that emulation is pending then the ioreq
> >>> server may be destroyed. Thus when the vcpu is awoken the code in
> >>> handle_hvm_io_completion() will find no pending request to wait for, but 
> >>> will
> >>> leave the internal vcpu io_req.state set to IOREQ_READY and the vcpu 
> >>> shutdown
> >>> deferral flag in place (because hvm_io_assist() will never be called). The
> >>> emulation request is then completed anyway. This means that any 
> >>> subsequent call
> >>> to hvmemul_do_io() will find an unexpected value in io_req.state and will
> >>> return X86EMUL_UNHANDLEABLE, which in some cases will result in continuous
> >>> re-tries.
> >>>
> >>> This patch fixes the issue by moving the setting of io_req.state and 
> >>> clearing
> >>> of shutdown deferral (as well as MSI-X write completion) out of 
> >>> hvm_io_assist()
> >>> and directly into handle_hvm_io_completion().
> >>>
> >>> Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> >>> Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx>
> >>
> >> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> >> with a question:
> >>
> >>> --- a/xen/arch/x86/hvm/ioreq.c
> >>> +++ b/xen/arch/x86/hvm/ioreq.c
> >>> @@ -109,15 +109,7 @@ static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
> >>>      ioreq_t *ioreq = &v->arch.hvm.hvm_io.io_req;
> >>>
> >>>      if ( hvm_ioreq_needs_completion(ioreq) )
> >>> -    {
> >>> -        ioreq->state = STATE_IORESP_READY;
> >>>          ioreq->data = data;
> >>> -    }
> >>> -    else
> >>> -        ioreq->state = STATE_IOREQ_NONE;
> >>> -
> >>> -    msix_write_completion(v);
> >>> -    vcpu_end_shutdown_deferral(v);
> >>>
> >>>      sv->pending = false;
> >>>  }
> >>
> >> With this, is the function worth keeping at all?
> >
> > I did think about that, but it is called in more than one place. So, in the
> > interest of trying to make back-porting straightforward, I thought it best
> > to keep it simple.
> >
> >>
> >> Also I assume the patch has your implied R-a-b?
> >>
> >
> > Since it has your R-b now, yes :-)
> 
> Thanks. As said in the other thread, I think the change here will mask
> a problem elsewhere (presumably in qemu). I'm therefore unsure whether
> we want to apply it right away, rather than first understanding the
> root cause of Marek's specific problem.
> 

I think it ought to be applied right away since an emulator could die at any 
time and we don't really want the vcpu constantly bouncing in this case. Also, 
if we think about using emulators other than qemu, it's not necessarily a bug 
if they deregister at an arbitrary time... In fact it takes my mind back to 
PCMCIA cards, when device drivers had to be robust to them disappearing during 
an I/O cycle.
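
(As a side note for anyone reading this in the archive: the hunk quoted above 
only shows what is removed from hvm_io_assist(). A rough sketch of the other 
half - not the exact hunk, and with the surrounding context elided - is that 
handle_hvm_io_completion() does the state update and deferral clearing itself, 
after waiting for any ioreq that is still pending, so it happens even if the 
ioreq server went away while the vcpu was blocked:

    /*
     * The ioreq server may have been destroyed while the vcpu was blocked,
     * so bring io_req.state back to a consistent value and end the shutdown
     * deferral here rather than relying on hvm_io_assist() being called.
     */
    vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req) ?
        STATE_IORESP_READY : STATE_IOREQ_NONE;

    msix_write_completion(v);
    vcpu_end_shutdown_deferral(v);

where vio points at the vcpu's v->arch.hvm.hvm_io, as in the hunk above.)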

  Paul

> Jan
