[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 10/08/18 17:30, George Dunlap wrote: > On 08/10/2018 05:00 PM, Jan Beulich wrote: >>>>> On 10.08.18 at 17:35, <Paul.Durrant@xxxxxxxxxx> wrote: >>>> -----Original Message----- >>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] >>>> Sent: 10 August 2018 16:31 >>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx> >>>> Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>; xen-devel <xen- >>>> devel@xxxxxxxxxxxxxxxxxxxx> >>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>> >>>>>>> On 10.08.18 at 17:08, <Paul.Durrant@xxxxxxxxxx> wrote: >>>>>> -----Original Message----- >>>>>> From: Andrew Cooper >>>>>> Sent: 10 August 2018 13:56 >>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; 'Jan Beulich' >>>>>> <JBeulich@xxxxxxxx> >>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx> >>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>> >>>>>> On 10/08/18 13:43, Paul Durrant wrote: >>>>>>>> -----Original Message----- >>>>>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] >>>>>>>> Sent: 10 August 2018 13:37 >>>>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx> >>>>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx> >>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>> >>>>>>>>>>> On 10.08.18 at 14:22, <Paul.Durrant@xxxxxxxxxx> wrote: >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] >>>>>>>>>> Sent: 10 August 2018 13:13 >>>>>>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx> >>>>>>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx> >>>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>> >>>>>>>>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@xxxxxxxxxx> wrote: >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] >>>>>>>>>>>> Sent: 10 August 2018 13:02 >>>>>>>>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx> >>>>>>>>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx> >>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes >>>>>>>>>>>> >>>>>>>>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@xxxxxxxxxx> wrote: >>>>>>>>>>>>> These are probably both candidates for back-port. >>>>>>>>>>>>> >>>>>>>>>>>>> Paul Durrant (2): >>>>>>>>>>>>> x86/hvm/ioreq: MMIO range checking completely ignores >>>>>> direction >>>>>>>> flag >>>>>>>>>>>>> x86/hvm/emulate: make sure rep I/O emulation does not cross >>>>>> GFN >>>>>>>>>>>>> boundaries >>>>>>>>>>>>> >>>>>>>>>>>>> xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++- >>>>>>>>>>>>> xen/arch/x86/hvm/ioreq.c | 15 ++++++++++----- >>>>>>>>>>>>> 2 files changed, 26 insertions(+), 6 deletions(-) >>>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc? >>>>>>>>>>>> >>>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep >>>> mov >>>>>>>>>>> walking off MMIO into RAM. >>>>>>>>>> But that's not the problem we're having. In our case the bad >>>> behavior >>>>>>>>>> is with a single MOV. That's why I had assumed that your plan to >>>> fiddle >>>>>>>>>> with null_handler would help in our case as well, while this series >>>>>> clearly >>>>>>>>>> won't (afaict). >>>>>>>>>> >>>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined >>>>>> behaviour >>>>>>>> though >>>>>>>>> as I understand it. Am I incorrect? >>>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the >>>>>>>> specific case where this is being observed as an issue is when >>>>>>>> accessing the last few bytes of a normal RAM page followed by a >>>>>>>> ballooned out one. The balloon driver doesn't remove the virtual >>>>>>>> mapping of such pages (presumably in order to not shatter super >>>>>>>> pages); observation is with the old XenoLinux one, but from code >>>>>>>> inspection the upstream one behaves the same. >>>>>>>> >>>>>>>> Unless we want to change the balloon driver's behavior, at least >>>>>>>> this specific case needs to be considered having defined behavior, >>>>>>>> I think. >>>>>>>> >>>>>>> Ok. I'll see what I can do. >>>>>> It is a software error to try and cross boundaries. Modern processors >>>>>> do their best to try and cause the correct behaviour to occur, albeit >>>>>> with a massive disclaimer about the performance hit. Older processors >>>>>> didn't cope. >>>>>> >>>>>> As far as I'm concerned, its fine to terminate a emulation which crosses >>>>>> a boundary with the null ops. >>>>> Alas we never even get as far as the I/O handlers in some circumstances... >>>>> >>>>> I just set up a variant of an XTF test doing a backwards rep movsd into a >>>>> well aligned stack buffer where source buffer starts 1 byte before a >>>> boundary >>>>> between RAM and MMIO. The code in hvmemul_rep_movs() (rightly) >>>> detects that >>>>> both the source and dest of the initial rep are RAM, skips over the I/O >>>>> emulation calls, and then fails when the hvm_copy_from_guest_phys() >>>>> (unsurprisingly) fails to grab the 8 bytes for the initial rep. >>>>> So, any logic we add to deal with handling page spanning ops is going to >>>>> have to go in at the top level of instruction emulation... which I fear is >>>>> going to be quite a major change and not something that's going to be easy >>>> to >>>>> back-port. >>>> Well, wasn't it clear from the beginning that a proper fix would be too >>>> invasive to backport? And wasn't it for that reason that you intended >>>> to add a small hack first, to deal with just the case(s) that we currently >>>> have issues with? >>> Well, given that I mistakenly understood you were hitting the same rep >>> issue >>> that I was, I thought I could deal with it in a reasonably straightforward >>> way. Maybe I can still do a point fix for what you are hitting though. What >>> precisely are you hitting? Always a single MOV? And always from a page >>> spanning source to a well aligned dest? Or more combinations than that? >> Always a single, misaligned MOV spanning the boundary from a valid >> RAM page to a ballooned out one (Linux'es load_unaligned_zeropad()). >> I meanwhile wonder though whether it might not be better to address >> this at the p2m level, by inserting a r/o mapping of an all-zeros page. >> George, do you have any opinion one way or the other? > Sorry, what exactly is the issue here? Linux has a function called > load_unaligned_zeropad() which is reading into a ballooned region? > > Fundamentally, a ballooned page is one which has been allocated to a > device driver. I'm having a hard time coming up with a justification > for having code which reads memory owned by B in the process of reading > memory owned by A. Or is there some weird architectural reason that I'm > not aware of? The underlying issue is that the emulator can't cope with a single misaligned access which crosses RAM and MMIO. It gives up and presumably throws #UD back. One longstanding Xen bug is that simply ballooning a page out shouldn't be able to trigger MMIO emulation to begin with. It is a side effect of mixed p2m types, and the fix for this to have Xen understand the guest physmap layout. However, the real bug is Linux making such a misaligned access into a ballooned out page in the first place. This is a Linux kernel bug which (presumably) manifests in a very obvious way due to shortcomings in Xen's emulation handling. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |