
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes



>>> On 13.08.18 at 08:50,  wrote:
>>>> On 10.08.18 at 18:37, <andrew.cooper3@xxxxxxxxxx> wrote:
> > On 10/08/18 17:30, George Dunlap wrote:
> >> Sorry, what exactly is the issue here?  Linux has a function called
> >> load_unaligned_zeropad() which is reading into a ballooned region?
> 
> Yes.
> 
> >> Fundamentally, a ballooned page is one which has been allocated to a
> >> device driver.  I'm having a hard time coming up with a justification
> >> for having code which reads memory owned by B in the process of reading
> >> memory owned by A.  Or is there some weird architectural reason that I'm
> >> not aware of?
> 
> Well, they do this no matter who owns the successive page (or
> perhaps at a smaller granularity also the successive allocation).
> I guess their goal is to have just a single MOV in the common
> case (with the caller ignoring the high bytes it has no interest in),
> while recovering gracefully from #PF should one occur.
> 
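
To illustrate the pattern described above, here is a rough user-space
model of a "read a full word, zero-pad on fault" helper. It is only a
sketch and not the Linux implementation (which relies on exception
fixup rather than a flag). The names sketch_load_zeropad,
SKETCH_PAGE_SIZE and next_page_ok are made up for this example, and a
little-endian host is assumed, as on x86.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* hypothetical page size for the model */

/*
 * Model of a single word-sized read starting at offset 'off' within the
 * first of two adjacent pages at 'mem'.  'next_page_ok' stands in for
 * "the access to the following page would not fault".
 */
static uint64_t sketch_load_zeropad(const uint8_t *mem, size_t off,
                                    bool next_page_ok)
{
    uint64_t val = 0;
    size_t len = sizeof(val);

    assert(off < SKETCH_PAGE_SIZE);

    /*
     * Fault case: keep only the bytes that live in the first page.
     * The remaining bytes of 'val' stay zero.
     */
    if (off + len > SKETCH_PAGE_SIZE && !next_page_ok)
        len = SKETCH_PAGE_SIZE - off;

    /* Little-endian assumption: the copied bytes are the low bytes. */
    memcpy(&val, mem + off, len);
    return val;
}
```

A caller that only cares about the first few bytes gets them either
way. The trouble in this thread is that under Xen neither of these two
outcomes is observed: the read returns all ones instead.
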
> > The underlying issue is that the emulator can't cope with a single
> > misaligned access which crosses RAM and MMIO.  It gives up and
> > presumably throws #UD back.
> 
> We wouldn't have observed any problem if there was #UD in
> such a case, as Linux's fault recovery code doesn't care what
> kind of fault has occurred. We're getting back a result of all
> ones, even for the part of the read that has actually hit the
> last few bytes of the present page.
> 
> > One longstanding Xen bug is that simply ballooning a page out shouldn't
> > be able to trigger MMIO emulation to begin with.  It is a side effect of
> > mixed p2m types, and the fix for this is to have Xen understand the guest
> > physmap layout.
> 
> And hence the consideration of mapping in an all zeros page
> instead. This is because of the way __hvmemul_read() /
> __hvm_copy() work: The latter doesn't tell its caller how many
> bytes it was able to read, and hence the former considers the
> entire range MMIO (and forwards the request for emulation).
> Of course all of this is an issue only because
> hvmemul_virtual_to_linear() sees no need to split the request
> at the page boundary, due to the balloon driver having left in
> place the mapping of the ballooned out page.
> 
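
The effect of a copy routine that reports only a status, versus one
that also reports how far it got, can be modelled roughly as follows.
This is a simplified sketch and not Xen's code: sketch_copy,
sketch_read_whole, sketch_read_split and the status codes are
hypothetical, and "MMIO with no handler" is modelled simply as reading
all ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sketch_rc { SKETCH_OKAY, SKETCH_BAD_GFN };  /* hypothetical codes */

/*
 * Copy 'len' bytes from 'src', of which only the first 'ram_len' are
 * RAM-backed.  'done', if non-NULL, receives the number of bytes copied.
 */
static enum sketch_rc sketch_copy(uint8_t *dst, const uint8_t *src,
                                  size_t len, size_t ram_len, size_t *done)
{
    size_t n = len < ram_len ? len : ram_len;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
    if (done != NULL)
        *done = n;
    return n == len ? SKETCH_OKAY : SKETCH_BAD_GFN;
}

/* Status only: on failure the entire range is treated as MMIO. */
static void sketch_read_whole(uint8_t *dst, const uint8_t *src,
                              size_t len, size_t ram_len)
{
    if (sketch_copy(dst, src, len, ram_len, NULL) != SKETCH_OKAY)
        memset(dst, 0xFF, len);
}

/* With a byte count: only the part that failed is treated as MMIO. */
static void sketch_read_split(uint8_t *dst, const uint8_t *src,
                              size_t len, size_t ram_len)
{
    size_t done = 0;

    if (sketch_copy(dst, src, len, ram_len, &done) != SKETCH_OKAY)
        memset(dst + done, 0xFF, len - done);
}
```

With an 8-byte read of which 3 bytes are RAM-backed, the first variant
yields all ones, while the second preserves the 3 valid bytes and
returns ones only for the remainder.
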
> Obviously the opposite case (access starting in a ballooned
> out page and crossing into an "ordinary" one) would have a
> similar issue, which is presumably even harder to fix without
> going the map-a-zero-page route (or Paul's suggested
> null_handler hack).
> 
> > However, the real bug is Linux making such a misaligned access into a
> > ballooned out page in the first place.  This is a Linux kernel bug which
> > (presumably) manifests in a very obvious way due to shortcomings in
> > Xen's emulation handling.
> 
> I wouldn't dare to judge whether this is a bug, especially given
> that they recover gracefully from the #PF that might result in
> the native case. Arguably the caller has to have some knowledge
> about what might live in the following page, so as not to inadvertently
> hit an MMIO page rather than a non-present mapping. But I'd
> leave such judgment to them; our business is to get working a case
> that is working without Xen underneath.

Following some further discussion with Andrew, he looks to be
convinced that the issue is to be fixed in the balloon driver,
which so far (intentionally afaict) does not remove the direct
map entries for ballooned out pages in the HVM case. I'm not
convinced of this, but I'd nevertheless like to inquire whether
such a change (resulting in shattered super page mappings)
would be acceptable in the first place.

Jan




 

