
Re: [Xen-devel] [PATCH v2] x86/hvm: finish IOREQ correctly on completion path



>>> On 13.03.19 at 13:47, <igor.druzhinin@xxxxxxxxxx> wrote:
> On 13/03/2019 08:53, Jan Beulich wrote:
>>>>> On 12.03.19 at 17:23, <igor.druzhinin@xxxxxxxxxx> wrote:
>>> Since the introduction of linear_{read,write}() helpers in 3bdec530a5
>>> (x86/HVM: split page straddling emulated accesses in more cases) the
>>> completion path for IOREQs has been broken: if there is an IOREQ in
>>> progress but hvm_copy_{to,from}_guest_linear() returns HVMTRANS_okay
>>> (e.g. when the P2M type of the source/destination has been changed by
>>> the IOREQ handler), execution will never re-enter hvmemul_do_io() where
>>> IOREQs are completed. This usually results in a domain crash upon
>>> the execution of the next IOREQ entering hvmemul_do_io() and finding
>>> the remnants of the previous IOREQ in the state machine.
>>>
>>> This particular issue was discovered in relation to the p2m_ioreq_server
>>> type, where an emulator changed the memory type between p2m_ioreq_server
>>> and p2m_ram_rw while responding to an IOREQ, which made hvm_copy_..()
>>> behave differently on the way back. But it could also apply to a case
>>> where e.g. an emulator balloons memory to/from the guest in response
>>> to an MMIO read/write, etc.
>> 
>> An emulator ballooning memory? I think Andrew was hinting towards
>> another vCPU of the guest doing some ballooning work in parallel to
>> the insn emulation.
> 
> Yes, this is another example of what an emulator might be doing with the
> P2M type in the process of IOREQ handling. It has nothing to do with
> what Andrew mentioned.

I'm struggling to see how an emulator could do so without upsetting
the guest.

>>> +    if ( !cache )
>>> +        rc = hvm_copy_from_guest_linear(p_data, addr, bytes, pfec,
>>> +                                        &pfinfo);
>> 
>> This is certainly better than the previous version, but imo still
>> doesn't address the general case. But I'd be willing to accept
>> it as a temporary workaround for the specific case of a write
>> to an (aligned) page table entry leading to a page type change
>> from p2m_ioreq_server to p2m_ram_rw, as it at least no longer
>> seems to break the case that the introduction of
>> linear_{read,write}() was meant to address.
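
For what it's worth, the intended effect of that guard can be sketched as
below; this is a rough, self-contained illustration, with
find_mmio_cache()/copy_from_guest()/mmio_read() as hypothetical stand-ins
for the real helpers, not the code from the patch:

/* Minimal sketch of the guarded read path; helper names are placeholders. */
#include <stdio.h>

struct mmio_cache { unsigned long gla; };   /* stand-in for hvm_mmio_cache */

static struct mmio_cache first_pass_entry = { .gla = 0x1000 };

/* Pretend the first pass through the instruction left a cache entry. */
static const struct mmio_cache *find_mmio_cache(unsigned long gla)
{
    return gla == first_pass_entry.gla ? &first_pass_entry : NULL;
}

static int copy_from_guest(unsigned long gla)        /* direct RAM copy   */
{
    printf("direct copy from %#lx\n", gla);
    return 0;
}

static int mmio_read(unsigned long gla)              /* ends in do_io()   */
{
    printf("MMIO path for %#lx, pending IOREQ completed\n", gla);
    return 0;
}

static int linear_read(unsigned long gla)
{
    const struct mmio_cache *cache = find_mmio_cache(gla);

    if ( !cache )                 /* no IOREQ was started for this range  */
        return copy_from_guest(gla);

    /* An IOREQ is in flight here: take the MMIO path regardless of the
     * current p2m type so the pending response gets consumed. */
    return mmio_read(gla);
}

int main(void)
{
    linear_read(0x2000);   /* untouched range: plain copy                 */
    linear_read(0x1000);   /* range with a pending IOREQ: MMIO path       */
    return 0;
}

The point being that any range for which the first pass already recorded
MMIO activity is forced back onto the MMIO path, giving hvmemul_do_io()
a chance to pick up the pending response.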
> 
> I don't think it's feasible (or that it makes sense) to specify the P2M
> type at that (linear) level. We only have access to the P2M type at the
> physical layer, which is several calls down the stack. And I cannot
> currently see how we could query what the type actually *was* before,
> without introducing some additional stashing array of p2mt somewhere
> (mmio_cache?).

Hmm, some misunderstanding perhaps: I did not mean to suggest
tracking the p2m type here. I've mentioned types in my reply solely
to call out the specific case the change will help with, in order to
separate it from the possible cases where it won't help.

> I agree this is more useful as a temporary workaround - I'll put it into
> the commit description of the next version.

Thanks, and together with this please also mention the known case(s)
where this is not going to help.

>> The more general case that still won't work (afaict) is an
>> access crossing a page boundary, where the second page's
>> type changes behind our backs. The first part of the access
>> won't find a cache entry here, and hence would still take the
>> hvm_copy_{from,to}_guest_linear() path above.
> 
> This could be solved by splitting *all* linear accesses (not only MMIO
> ones, as is done now) at the page boundary and checking each page
> separately. Would you consider this for v3? This shouldn't introduce any
> change in behavior afaics, and is actually more correct imo.

The reason this wasn't done is (iirc) that it's likely to be
less efficient. I'd like to defer to Paul, as the maintainer of the
code, to give you further direction one way or the other.
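
For reference, the kind of per-page splitting discussed above boils down
to a loop along these lines (a minimal sketch with placeholder helpers,
not actual Xen code):

/* Rough sketch: never let a single chunk cross a page boundary. */
#include <stdio.h>

#define PAGE_SIZE      4096UL
#define PAGE_OFFSET(a) ((a) & (PAGE_SIZE - 1))

static void access_one_page(unsigned long addr, unsigned long len)
{
    printf("chunk: %#lx + %lu\n", addr, len);   /* per-page read/write */
}

static void linear_access(unsigned long addr, unsigned long bytes)
{
    while ( bytes )
    {
        /* Clamp the chunk to the end of the current page. */
        unsigned long chunk = PAGE_SIZE - PAGE_OFFSET(addr);

        if ( chunk > bytes )
            chunk = bytes;

        access_one_page(addr, chunk);

        addr  += chunk;
        bytes -= chunk;
    }
}

int main(void)
{
    linear_access(0x1ff8, 16);   /* straddles 0x2000: split into 8 + 8 */
    return 0;
}

Whether doing this for every linear access is acceptable efficiency-wise
is of course exactly the question raised above.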

Jan




 

