[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 4/5] x86/mm: Reject invalid cacheability in PV guests by default



On Thu, Jan 05, 2023 at 09:00:03PM -0500, Demi Marie Obenour wrote:
> On Thu, Jan 05, 2023 at 08:28:26PM +0000, Andrew Cooper wrote:
> > On 22/12/2022 10:31 pm, Demi Marie Obenour wrote:
> > > diff --git a/docs/misc/xen-command-line.pandoc 
> > > b/docs/misc/xen-command-line.pandoc
> > > index 
> > > 424b12cfb27d6ade2ec63eacb8afe5df82465451..0230a7bc17cbd4362a42ea64cea695f31f5e0f86
> > >  100644
> > > --- a/docs/misc/xen-command-line.pandoc
> > > +++ b/docs/misc/xen-command-line.pandoc
> > > @@ -1417,6 +1417,17 @@ detection of systems known to misbehave upon 
> > > accesses to that port.
> > >  ### idle_latency_factor (x86)
> > >  > `= <integer>`
> > >  
> > > +### invalid-cacheability (x86)
> > > +> `= allow | deny | trap`
> > > +
> > > +> Default: `deny` in release builds, otherwise `trap`
> > > +
> > > +Specify what happens when a PV guest tries to use one of the reserved 
> > > entries in
> > > +the PAT.  `deny` causes the attempt to be rejected with -EINVAL, `allow` 
> > > allows
> > > +the attempt, and `trap` causes a general protection fault to be raised.
> > > +Currently, the reserved entries are marked as uncacheable in Xen's PAT, 
> > > but this
> > > +will change if new memory types are added, so guests must not rely on it.
> > > +
> > 
> > This wants to be scoped under "pv", and not a top-level option.  Current
> > parsing is at the top of xen/arch/x86/pv/domain.c
> > 
> > How about `pv={no-}oob-pat`, and parse it into the opt_pv_oob_pat boolean ?
> 
> Works for me, though I will use ‘invalid’ instead of ‘oob’ as valid PAT
> entries might not be contiguous.

If we're talking about alternative PAT settings, I'm not sure if they
can be called "invalid". With the default Xen's choice of PAT, top two
entries are documented as reserved (in xen.h) and only that makes them
forbidden to use. But with alternative settings, it's only behavior of
Linux parsing that prefers using lower options. When breaking contract
set in xen.h, "reserved" entries stop being reserved too.

So, _if_ that option would be applicable alternative PAT choice, it's
only useful for debugging Linux specifically (assuming Linux won't
change its approach to choosing entries - which I think it's allowed to
do).

> > There really is no need to distinguish between deny and trap.  IMO,
> > injecting #GP unilaterally is fine (to a guest expecting the hypercall
> > to succeed, -EINVAL vs #GP makes no difference, but #GP is far more
> > useful to a human trying to debug issues here), but I could be talked
> > into putting that behind a CONFIG_DEBUG if other feel strongly.
> 
> Marek, thoughts?

With Xen's default PAT, #GP may be useful indeed, but it must come with
a message why it was injected.

> > > @@ -1343,7 +1374,34 @@ static int promote_l1_table(struct page_info *page)
> > >          }
> > >          else
> > >          {
> > > -            switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
> > > +            l1_pgentry_t l1e = pl1e[i];
> > > +
> > > +            if ( invalid_cacheability != INVALID_CACHEABILITY_ALLOW )
> > > +            {
> > > +                switch ( l1e.l1 & PAGE_CACHE_ATTRS )
> > > +                {
> > > +                case _PAGE_WB:
> > > +                case _PAGE_UC:
> > > +                case _PAGE_UCM:
> > > +                case _PAGE_WC:
> > > +                case _PAGE_WT:
> > > +                case _PAGE_WP:
> > > +                    break;
> > > +                default:
> > > +                    /*
> > > +                     * If we get here, a PV guest tried to use one of the
> > > +                     * reserved values in Xen's PAT.  This indicates a 
> > > bug
> > > +                     * in the guest.  If requested by the user, inject 
> > > #GP
> > > +                     * to cause the guest to log a stack trace.
> > > +                     */
> > > +                    if ( invalid_cacheability == 
> > > INVALID_CACHEABILITY_TRAP )
> > > +                        pv_inject_hw_exception(TRAP_gp_fault, 0);
> > > +                    ret = -EINVAL;
> > > +                    goto fail;
> > > +                }
> > > +            }
> > 
> > This will catch cases where the guest writes a full pagetable, then
> > promotes it, but won't catch cases where the guest modifies an
> > individual entry with mmu_update/etc.
> > 
> > The logic needs to be in get_page_from_l1e() to get applied uniformly to
> > all PTE modifications.
> 
> I will move it there, and also update Qubes OS’s patchset.
> 
> > Right now, the l1_disallow_mask() check near the start hides the "can
> > you use a nonzero cacheattr" check.  If I ever get around to cleaning up
> > my post-XSA-402 series, I have a load of modifications to this.
> 
> I came up with some major cleanups too.
> 
> > But this could be something like this:
> > 
> > if ( opt_pv_oob_pat && (l1f & PAGE_CACHE_ATTRS) > _PAGE_WP )
> > {
> >     // #GP
> >     return -EINVAL;
> > }
> > 
> > fairly early on.
> > 
> > It occurs to me that this check is only applicable when we're using the
> > Xen ABI APT, and seeing as we've talked about possibly making patch 5 a
> > Kconfig option, that may want accounting here.  (Perhaps simply making
> > opt_pb_oob_pat be false in a !XEN_PAT build.)
> 
> It’s actually applicable even with other PATs.  While Marek and I were
> tracking down an Intel iGPU cache coherency problem, Marek used it to
> verify that PAT entries that we thought were not being used were in fact
> unused.  This allowed proving that the behavior of the GPU was impacted
> by changes to PAT entries the hardware should not even be looking at,
> and therefore that the hardware itself must be buggy.

In fact, I did that via WARN() on the Linux side, to _not_ have guest
killed in this case, to potentially collect more info. As said above,
with alternative PAT settings, the contract which entries are "valid"
isn't there anymore, so punishing guest for using them isn't
appropriate. It could still be useful feature for debugging Linux (and
it feels, we'll need this feature for some time...). So, with !XEN_PAT
it should be at least disabled by default, or maybe even hidden behind
CONFIG_DEBUG.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
Description: PGP signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.