Re: [Xen-devel] [PATCH v3 2/2] x86/hvm: finish IOREQs correctly on completion path
On 15/03/2019 12:27, Jan Beulich wrote:
>>>> On 14.03.19 at 23:30, <igor.druzhinin@xxxxxxxxxx> wrote:
>> Since the introduction of the linear_{read,write}() helpers in 3bdec530a5
>> ("x86/HVM: split page straddling emulated accesses in more cases") the
>> completion path for IOREQs has been broken: if there is an IOREQ in
>> progress but hvm_copy_{to,from}_guest_linear() returns HVMTRANS_okay
>> (e.g. when the P2M type of the source/destination has been changed by the
>> IOREQ handler), execution will never re-enter hvmemul_do_io(), where
>> IOREQs are completed. This usually results in a domain crash when the
>> next IOREQ enters hvmemul_do_io() and finds the remnants of the
>> previous IOREQ in the state machine.
>
> From an archaeological pov I'm not sure you point at the offending
> commit: I'd rather expect d7bff2bc00 ("x86/HVM: __hvm_copy()
> should not write to p2m_ioreq_server pages") to be the culprit,
> which went in two months later.
>
>> This particular issue was discovered in relation to the p2m_ioreq_server
>> type, where an emulator changed the memory type between p2m_ioreq_server
>> and p2m_ram_rw while responding to an IOREQ, which made
>> hvm_copy_..() behave differently on the way back.
>>
>> Fix it for now by checking whether IOREQ completion is required (which
>> can be identified by querying the MMIO cache) before trying to finish
>> a memory access immediately through hvm_copy_..(), and re-entering
>> hvmemul_do_io() otherwise. This change alone addresses the IOREQ
>> completion issue where the P2M type is modified in the middle of
>> emulation, but it is not enough for the more general case where machine
>> state arbitrarily changes behind our back.
>
> I'm afraid this still claims to address cases which don't get fixed
> here.
> For example, take a page changing _to_ p2m_ioreq_server
> behind our backs: you won't find an MMIO cache entry for it,
> hvm_copy_to_guest_linear() will fail, and you'll try to issue an
> MMIO write when in reality the write was already done (emulated
> for whatever other reason, e.g. introspection). This example
> may be pretty contrived, but Andrew's ballooning scenario really
> applies both ways (balloon-in and balloon-out), while the change
> deals only with the balloon-in case.
>
> So while I'm fine with the code change, I'd still like to ask to
> further refine the description.

Thanks for the clarification. I discussed this with Paul - there is
definitely still a hole in the general case where the first half of the
instruction is memory and the second half is MMIO, and the first half is
changed *to* MMIO. But it's hard to deal with these types of accesses
without a complete rewrite of the MMIO cache into a general instruction
access cache - i.e. lifting it up to the linear_{read,write} layer. I
hope my understanding is now correct and I'll put it into the
description. Until then the fix should do fine for the scenarios we're
seeing.

Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel