Xen project Mailing List

Re: [Xen-devel] x86/vMSI-X emulation issue

To: Jan Beulich <JBeulich@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Paul Durrant <Paul.Durrant@xxxxxxxxxx>

Date: Thu, 24 Mar 2016 09:09:34 +0000

Accept-language: en-GB, en-US

Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>

Delivery-date: Thu, 24 Mar 2016 09:10:19 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Thread-index: AQHRhSswWjeEL9v6CUa53Q6giYkK759oKNuAgAAl6lA=

Thread-topic: [Xen-devel] x86/vMSI-X emulation issue

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of Jan
> Beulich
> Sent: 24 March 2016 07:52
> To: xen-devel
> Cc: Andrew Cooper
> Subject: Re: [Xen-devel] x86/vMSI-X emulation issue
>
> >>> On 23.03.16 at 18:05, <JBeulich@xxxxxxxx> wrote:
> > All,
> >
> > so I've just learned that Windows (at least some versions and some
> > of their code paths) use REP MOVSD to read/write the MSI-X table.
> > The way at least msixtbl_write() works is not compatible with this
> > (msixtbl_read() also seems affected, albeit to a lesser degree), and
> > apparently it just worked by accident until the XSA-120 and 128-131
> > and follow-up changes - most notably commit ad28e42bd1 ("x86/MSI:
> > track host and guest masking separately"), as without the call to
> > guest_mask_msi_irq() interrupts won't ever get unmasked.
> >
> > The problem with emulating REP MOVSD is that msixtbl_write()
> > intentionally returns X86EMUL_UNHANDLEABLE on all writes to
> > words 0, 1, and 2. When in the process of emulating multiple
> > writes, we therefore hand the entire batch of 3 or 4 writes to qemu,
> > and the hypervisor doesn't get to see any other than the initial
> > iteration.
> >
> > Now I see a couple of possible solutions, but none of them look
> > really neat, hence I'm seeking a second opinion (including, of
> > course, further alternative ideas):
> >
> > 1) Introduce another X86EMUL_* like status that's not really to be
> > used by the emulator itself, but only by the two vMSI-X functions
> > to indicate to their caller that prior to forwarding the request it
> > should be chopped to a single repetition.
> >
> > 2) Do aforementioned chopping automatically on seeing
> > X86EMUL_UNHANDLEABLE, on the basis that the .check
> > handler had indicated that the full range was acceptable. That
> > would at once cover other similarly undesirable cases like the
> > vLAPIC code returning this error. However, any stdvga like
> > emulated device would clearly not want such to happen, and
> > would instead prefer the entire batch to get forwarded in one
> > go (stdvga itself sits on a different path). Otoh, with the
> > devices we have currently, this would seem to be the least
> > intrusive solution.
>
> Having thought about it more over night, I think this indeed is
> the most reasonable route, not just because it's least intrusive:
> For non-buffered internally handled I/O requests, no good can
> come from forwarding full batches to qemu, when the respective
> range checking function has indicated that this is an acceptable
> request. And in fact neither vHPET nor vIO-APIC code generate
> X86EMUL_UNHANDLEABLE. And vLAPIC code doing so is also
> just apparently so - I'll submit a patch to make this obvious once
> tested.
>
> Otoh stdvga_intercept_pio() uses X86EMUL_UNHANDLEABLE in
> a manner similar to the vMSI-X code - for internal caching and
> then forwarding to qemu. Clearly that is also broken for
> REP OUTS, and hence a similar rep count reduction is going to
> be needed for the port I/O case.
>

It suggests that such cache-and/or-forward models should probably sit
somewhere else in the flow, possibly being invoked from hvm_send_ioreq()
since there should indeed be a selected ioreq server for these cases.

  Paul

> vRTC code would misbehave too, albeit there it is quite hard to
> see what use REP INS or REP OUTS could be. Yet we can't
> exclude a guest using such, so we should make it behave
> correctly.
>
> For handle_pmt_io(), otoh, forwarding the full batch would be
> okay, but since there shouldn't be any writes breaking up such
> batches wouldn't be a problem. Then again forwarding such
> invalid requests to qemu is kind of pointless - we could as well
> terminate them right in Xen, just like we terminate requests
> of other than 4 byte width - again I'll submit a patch to make
> this obvious once tested.
>
> Jan
>
> > 3) Have emulation backends provide some kind of (static) flag
> > indicating which forwarding behavior they would like.
> >
> > 4) Expose the full ioreq to the emulation backends, so they can
> > fiddle with the request to their liking.
> >
> > Thanks, Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
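
For readers of the archive, option 2 from the quoted message (chop the request to a single repetition when the internal handler returns X86EMUL_UNHANDLEABLE) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Xen code: `emul_status`, `io_req`, `trace`, `toy_msixtbl_write()` and `process_rep_write()` are hypothetical names invented for the example. The only facts taken from the thread and the MSI-X table layout are that each table entry is four 32-bit words and that writes to words 0, 1 and 2 are not handled internally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's X86EMUL_* status codes. */
enum emul_status { EMUL_OKAY, EMUL_UNHANDLEABLE };

/* Hypothetical, simplified request: 'count' repetitions of 'size' bytes. */
struct io_req {
    uint64_t addr;
    unsigned int size;
    unsigned int count;
};

/* Bookkeeping so the two forwarding policies can be compared. */
struct trace {
    unsigned int internal;       /* repetitions seen by the internal handler */
    unsigned int forwarded_reqs; /* requests handed to the device model */
    unsigned int forwarded_reps; /* repetitions contained in those requests */
};

/*
 * Toy model of the vMSI-X write handler: an entry is 16 bytes (4 words).
 * Words 0-2 are reported as unhandleable, word 3 is handled internally.
 */
static enum emul_status toy_msixtbl_write(uint64_t addr)
{
    unsigned int word = (unsigned int)((addr % 16) / 4);

    return word < 3 ? EMUL_UNHANDLEABLE : EMUL_OKAY;
}

/*
 * Process a REP-style batch of writes. With 'chop' false, the remainder of
 * the batch is forwarded as soon as one write is unhandleable. With 'chop'
 * true, only that single repetition is forwarded and the internal handler
 * gets to see the following ones.
 */
static void process_rep_write(struct io_req req, bool chop, struct trace *t)
{
    while (req.count) {
        unsigned int n = 1;

        if (toy_msixtbl_write(req.addr) == EMUL_OKAY) {
            t->internal++;
        } else {
            if (!chop)
                n = req.count;
            t->forwarded_reqs++;
            t->forwarded_reps += n;
        }

        req.addr += (uint64_t)n * req.size;
        req.count -= n;
    }
}
```

In this model, a 4-repetition dword write starting at word 0 of an entry is forwarded as one 4-repetition request when `chop` is false, so the internal handler never sees the word 3 write. With `chop` set, three single-repetition requests are forwarded and the word 3 write is handled internally.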
