Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
On 08/10/2018 05:00 PM, Jan Beulich wrote:
>>>> On 10.08.18 at 17:35, <Paul.Durrant@xxxxxxxxxx> wrote:
>>> -----Original Message-----
>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>> Sent: 10 August 2018 16:31
>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
>>> Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>; xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>
>>>>>> On 10.08.18 at 17:08, <Paul.Durrant@xxxxxxxxxx> wrote:
>>>>> -----Original Message-----
>>>>> From: Andrew Cooper
>>>>> Sent: 10 August 2018 13:56
>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; 'Jan Beulich' <JBeulich@xxxxxxxx>
>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>
>>>>> On 10/08/18 13:43, Paul Durrant wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>>>>>> Sent: 10 August 2018 13:37
>>>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
>>>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>
>>>>>>>>>> On 10.08.18 at 14:22, <Paul.Durrant@xxxxxxxxxx> wrote:
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>>>>>>>> Sent: 10 August 2018 13:13
>>>>>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
>>>>>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
>>>>>>>>> Subject: RE: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>>>
>>>>>>>>>>>> On 10.08.18 at 14:08, <Paul.Durrant@xxxxxxxxxx> wrote:
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>>>>>>>>>> Sent: 10 August 2018 13:02
>>>>>>>>>>> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
>>>>>>>>>>> Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
>>>>>>>>>>> Subject: Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes
>>>>>>>>>>>
>>>>>>>>>>>>>> On 10.08.18 at 12:37, <paul.durrant@xxxxxxxxxx> wrote:
>>>>>>>>>>>> These are probably both candidates for back-port.
>>>>>>>>>>>>
>>>>>>>>>>>> Paul Durrant (2):
>>>>>>>>>>>>   x86/hvm/ioreq: MMIO range checking completely ignores direction flag
>>>>>>>>>>>>   x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
>>>>>>>>>>>>
>>>>>>>>>>>>  xen/arch/x86/hvm/emulate.c | 17 ++++++++++++++++-
>>>>>>>>>>>>  xen/arch/x86/hvm/ioreq.c   | 15 ++++++++++-----
>>>>>>>>>>>>  2 files changed, 26 insertions(+), 6 deletions(-)
>>>>>>>>>>> I take it this isn't yet what we've talked about yesterday on irc?
>>>>>>>>>>>
>>>>>>>>>> This is the band-aid fix. I can now show correct handling of a rep mov
>>>>>>>>>> walking off MMIO into RAM.
>>>>>>>>> But that's not the problem we're having. In our case the bad behavior
>>>>>>>>> is with a single MOV. That's why I had assumed that your plan to fiddle
>>>>>>>>> with null_handler would help in our case as well, while this series
>>>>>>>>> clearly won't (afaict).
>>>>>>>>>
>>>>>>>> Oh, I see. A single MOV spanning MMIO and RAM has undefined behaviour
>>>>>>>> though, as I understand it. Am I incorrect?
>>>>>>> I'm not aware of SDM or PM saying anything like this. Anyway, the
>>>>>>> specific case where this is being observed as an issue is when
>>>>>>> accessing the last few bytes of a normal RAM page followed by a
>>>>>>> ballooned out one. The balloon driver doesn't remove the virtual
>>>>>>> mapping of such pages (presumably in order to not shatter super
>>>>>>> pages); observation is with the old XenoLinux one, but from code
>>>>>>> inspection the upstream one behaves the same.
>>>>>>>
>>>>>>> Unless we want to change the balloon driver's behavior, at least
>>>>>>> this specific case needs to be considered having defined behavior,
>>>>>>> I think.
>>>>>>>
>>>>>> Ok. I'll see what I can do.
>>>>>
>>>>> It is a software error to try and cross boundaries.
>>>>> Modern processors do their best to try and cause the correct
>>>>> behaviour to occur, albeit with a massive disclaimer about the
>>>>> performance hit. Older processors didn't cope.
>>>>>
>>>>> As far as I'm concerned, it's fine to terminate an emulation which
>>>>> crosses a boundary with the null ops.
>>>>
>>>> Alas we never even get as far as the I/O handlers in some circumstances...
>>>>
>>>> I just set up a variant of an XTF test doing a backwards rep movsd into a
>>>> well-aligned stack buffer, where the source buffer starts 1 byte before a
>>>> boundary between RAM and MMIO. The code in hvmemul_rep_movs() (rightly)
>>>> detects that both the source and dest of the initial rep are RAM, skips
>>>> over the I/O emulation calls, and then fails when the
>>>> hvm_copy_from_guest_phys() (unsurprisingly) fails to grab the 8 bytes for
>>>> the initial rep.
>>>> So, any logic we add to deal with handling page-spanning ops is going to
>>>> have to go in at the top level of instruction emulation... which I fear is
>>>> going to be quite a major change, and not something that's going to be
>>>> easy to back-port.
>>>
>>> Well, wasn't it clear from the beginning that a proper fix would be too
>>> invasive to backport? And wasn't it for that reason that you intended
>>> to add a small hack first, to deal with just the case(s) that we currently
>>> have issues with?
>>
>> Well, given that I mistakenly understood you were hitting the same rep issue
>> that I was, I thought I could deal with it in a reasonably straightforward
>> way. Maybe I can still do a point fix for what you are hitting though. What
>> precisely are you hitting? Always a single MOV? And always from a
>> page-spanning source to a well-aligned dest? Or more combinations than that?
>
> Always a single, misaligned MOV spanning the boundary from a valid
> RAM page to a ballooned out one (Linux's load_unaligned_zeropad()).
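[Editor's illustration of the GFN-crossing clamp discussed above. This is not the actual hvmemul_rep_movs() change; the function name clamp_reps and the standalone page constants are hypothetical. It only sketches the arithmetic for limiting a rep count so the accessed range stays within one page, covering both the forward case and the backward (direction flag set) case that patch 1's subject refers to:]

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical sketch: clamp 'reps' so that a rep op starting at guest
 * physical address 'addr', accessing 'bytes_per_rep' bytes per
 * iteration, never crosses a page (GFN) boundary.  With the direction
 * flag (df) set the op walks downwards, so the limit is the distance
 * to the start of the page rather than to its end. */
static uint64_t clamp_reps(uint64_t addr, uint64_t bytes_per_rep,
                           uint64_t reps, int df)
{
    uint64_t offset = addr & (PAGE_SIZE - 1);
    /* Number of whole reps that fit before the page boundary. */
    uint64_t max = df ? (offset + bytes_per_rep) / bytes_per_rep
                      : (PAGE_SIZE - offset) / bytes_per_rep;

    return reps < max ? reps : max;
}
```

[E.g. a forward 4-byte rep starting 4 bytes before the end of a page is clamped to a single iteration, and the emulator can then be re-entered for the remainder on the next page. Note this sketch does not help Paul's backwards movsd case above, where even the first rep itself spans the boundary.]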
>
> I meanwhile wonder though whether it might not be better to address
> this at the p2m level, by inserting a r/o mapping of an all-zeros page.
> George, do you have any opinion one way or the other?

Sorry, what exactly is the issue here? Linux has a function called
load_unaligned_zeropad() which is reading into a ballooned region?

Fundamentally, a ballooned page is one which has been allocated to a
device driver. I'm having a hard time coming up with a justification for
having code which reads memory owned by B in the process of reading
memory owned by A.

Or is there some weird architectural reason that I'm not aware of?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
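[Editor's note on the helper George asks about: load_unaligned_zeropad() is a real Linux helper (x86 word-at-a-time string handling) that deliberately loads a full word which may extend past the end of the valid bytes; when the tail of the load faults, an exception fixup makes the unreadable bytes read as zero. The code below is a behavioural model only, assuming little-endian byte order; the name load_zeropad_model is made up, and the real kernel helper uses a plain word load plus a fault fixup rather than a byte loop:]

```c
#include <stddef.h>

/* Behavioural model of Linux's load_unaligned_zeropad(): load a word
 * starting at p, where only the first 'valid' bytes are actually
 * readable; any byte past that reads back as zero.  The real helper
 * does one word load and relies on an exception fixup when the load
 * faults; this loop just models the resulting value (little-endian). */
static unsigned long load_zeropad_model(const unsigned char *p,
                                        size_t valid)
{
    unsigned long ret = 0;

    for ( size_t i = 0; i < sizeof(ret) && i < valid; i++ )
        ret |= (unsigned long)p[i] << (i * 8);

    return ret;
}
```

[This is why the guest legitimately touches the first bytes of the ballooned-out page: the helper intentionally over-reads past the end of a string, relying on the fault being recoverable, which is the property that breaks when the following page has been ballooned out but is still virtually mapped.]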