
Re: [PATCH 3/5] x86/hvm: fix handling of accesses to partial r/o MMIO pages



On Mon, Apr 14, 2025 at 08:33:44AM +0200, Jan Beulich wrote:
> On 11.04.2025 12:54, Roger Pau Monne wrote:
> > The current logic to handle accesses to MMIO pages partially read-only is
> > based on the (now removed) logic used to handle accesses to the r/o MMCFG
> > region(s) for PVH v1 dom0.  However that has issues when running on AMD
> > hardware, as in that case the guest linear address that triggered the fault
> > is not provided as part of the VM exit.  This caused
> > mmio_ro_emulated_write() to always fail before calling
> > subpage_mmio_write_emulate() when running on AMD and called from an HVM
> > context.
> > 
> > Take a different approach and convert the handling of partial read-only
> > MMIO page accesses into an HVM MMIO ops handler, as that's the more natural
> > way to handle this kind of emulation for HVM domains.
> > 
> > This allows getting rid of hvm_emulate_one_mmio() and its single call
> > site in hvm_hap_nested_page_fault().
> > 
> > Note a small adjustment is needed to the `pf-fixup` dom0 PVH logic: avoid
> > attempting to fixup faults resulting from accesses to read-only MMIO
> > regions, as handling of those accesses is now done by handle_mmio().
> > 
> > Fixes: 33c19df9a5a0 ('x86/PCI: intercept accesses to RO MMIO from dom0s in 
> > HVM containers')
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > ---
> > The fixes tag is maybe a bit wonky, it's either this or:
> > 
> > 8847d6e23f97 ('x86/mm: add API for marking only part of a MMIO page read 
> > only')
> > 
> > However the addition of subpage r/o access handling to the existing
> > mmio_ro_emulated_write() function was done based on the assumption that the
> > current code was working - which turned out to not be the case for AMD,
> > hence my preference for blaming the commit that actually introduced the
> > broken logic.
> 
> I agree.
> 
> > ---
> >  xen/arch/x86/hvm/emulate.c             | 47 +-------------
> >  xen/arch/x86/hvm/hvm.c                 | 89 +++++++++++++++++++++++---
> 
> It's quite a bit of rather special code you add to this catch-all file. How
> about introducing a small new one, e.g. mmio.c (to later maybe become home
> to some further stuff)?

Yes, I didn't find any suitable place.  I was also considering
hvm/io.c or hvm/intercept.c, but none looked very good TBH.

The main benefit of placing it in hvm.c is that the functions can all
be static and confined to hvm.c as its only user.
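For reference, this is roughly how I'd expect the wiring to look if it
stays in hvm.c (the ops structure name and the exact registration site
are illustrative here, not lifted from the patch):

    static const struct hvm_mmio_ops subpage_mmio_ops = {
        .check = subpage_mmio_accept,
        .read  = subpage_mmio_read,
        .write = subpage_mmio_write,
    };

    /* From the HVM domain initialization path in hvm.c: */
    register_mmio_handler(d, &subpage_mmio_ops);

Having the handlers and the ops structure in the same file is what
allows all of them to stay static.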

> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -370,7 +370,8 @@ static int hvmemul_do_io(
> >          /* If there is no suitable backing DM, just ignore accesses */
> >          if ( !s )
> >          {
> > -            if ( is_mmio && is_hardware_domain(currd) )
> > +            if ( is_mmio && is_hardware_domain(currd) &&
> > +                 !rangeset_contains_singleton(mmio_ro_ranges, 
> > PFN_DOWN(addr)) )
> 
> I think this needs a brief comment - it otherwise looks misplaced here.
> 
> > @@ -585,9 +585,81 @@ static int cf_check hvm_print_line(
> >      return X86EMUL_OKAY;
> >  }
> >  
> > +static int cf_check subpage_mmio_accept(struct vcpu *v, unsigned long addr)
> > +{
> > +    p2m_type_t t;
> > +    mfn_t mfn = get_gfn_query_unlocked(v->domain, addr, &t);
> > +
> > +    return !mfn_eq(mfn, INVALID_MFN) && t == p2m_mmio_direct &&
> > +           !!subpage_mmio_find_page(mfn);
> 
> The !! isn't needed here, is it?

IIRC clang complained about conversion from pointer to integer without
a cast, but maybe that was before adding the further conditions here.
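As a standalone illustration of that warning (not code from the patch):

    /* Returning the pointer directly from an int function triggers the
     * pointer-to-integer conversion warning; !! (or a comparison) avoids it. */
    int accept_ptr_only(const void *p)
    {
        return !!p;
    }

    /* Once the lookup result is part of a && chain, the chain itself
     * evaluates to 0 or 1, so the !! becomes redundant. */
    int accept_with_conditions(bool ok, const void *p)
    {
        return ok && p;
    }

So with the extra conditions in place the !! can indeed be dropped.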

> > +}
> > +
> > +static int cf_check subpage_mmio_read(
> > +    struct vcpu *v, unsigned long addr, unsigned int len, unsigned long 
> > *data)
> > +{
> > +    struct domain *d = v->domain;
> > +    p2m_type_t t;
> > +    mfn_t mfn = get_gfn_query(d, addr, &t);
> > +    struct subpage_ro_range *entry;
> > +    volatile void __iomem *mem;
> > +
> > +    *data = ~0UL;
> > +
> > +    if ( mfn_eq(mfn, INVALID_MFN) || t != p2m_mmio_direct )
> > +    {
> > +        put_gfn(d, addr);
> > +        return X86EMUL_RETRY;
> > +    }
> > +
> > +    entry = subpage_mmio_find_page(mfn);
> > +    if ( !entry )
> > +    {
> > +        put_gfn(d, addr);
> > +        return X86EMUL_RETRY;
> > +    }
> > +
> > +    mem = subpage_mmio_map_page(entry);
> > +    if ( !mem )
> > +    {
> > +        put_gfn(d, addr);
> > +        gprintk(XENLOG_ERR, "Failed to map page for MMIO read at %#lx\n",
> > +                mfn_to_maddr(mfn) + PAGE_OFFSET(addr));
> 
> Makes me wonder whether the function parameter wouldn't better be named
> "curr" and the local variable "currd", to reflect that this log message
> will report appropriate context.

Sure, can adjust.

> Would also logging the guest physical address perhaps be worthwhile here?

Possibly, my first approach was to print gfn -> mfn, but I ended up
copying the same message from subpage_mmio_write_emulate() for
symmetry reasons.
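Something along these lines would work if we want both addresses in
the message (just a sketch, the exact wording isn't final):

    gprintk(XENLOG_ERR,
            "Failed to map page for MMIO read at gpa %#lx (ma %#lx)\n",
            addr, mfn_to_maddr(mfn) + PAGE_OFFSET(addr));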

> > +        return X86EMUL_OKAY;
> > +    }
> > +
> > +    *data = read_mmio(mem + PAGE_OFFSET(addr), len);
> > +
> > +    put_gfn(d, addr);
> > +    return X86EMUL_OKAY;
> > +}
> > +
> > +static int cf_check subpage_mmio_write(
> > +    struct vcpu *v, unsigned long addr, unsigned int len, unsigned long 
> > data)
> > +{
> > +    struct domain *d = v->domain;
> > +    p2m_type_t t;
> > +    mfn_t mfn = get_gfn_query(d, addr, &t);
> > +
> > +    if ( mfn_eq(mfn, INVALID_MFN) || t != p2m_mmio_direct )
> > +    {
> > +        put_gfn(d, addr);
> > +        return X86EMUL_RETRY;
> > +    }
> > +
> > +    subpage_mmio_write_emulate(mfn, PAGE_OFFSET(addr), data, len);
> > +
> > +    put_gfn(d, addr);
> > +    return X86EMUL_OKAY;
> > +}
> 
> Unlike the read path this doesn't return RETRY when subpage_mmio_find_page()
> fails (indicating something must have changed after "accept" said yes).

Yeah, I've noticed this, but didn't feel like modifying
subpage_mmio_write_emulate() for this.  The list of partial r/o MMIO
pages cannot change after system boot, so accept returning yes and not
finding a page here would likely warrant an ASSERT_UNREACHABLE().

Would you like me to modify subpage_mmio_write_emulate to make it
return an error code?
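Alternatively, the caller here could do the check itself, something
along these lines (this duplicates the lookup that
subpage_mmio_write_emulate() already does internally, and the ASSERT
placement is only a suggestion):

    if ( !subpage_mmio_find_page(mfn) )
    {
        /*
         * The list of partial r/o MMIO pages cannot change after boot,
         * so failing to find the page after "accept" returned true
         * shouldn't happen.
         */
        ASSERT_UNREACHABLE();
        put_gfn(d, addr);
        return X86EMUL_RETRY;
    }

    subpage_mmio_write_emulate(mfn, PAGE_OFFSET(addr), data, len);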

> > @@ -1981,7 +2056,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned 
> > long gla,
> >       */
> >      if ( (p2mt == p2m_mmio_dm) ||
> >           (npfec.write_access &&
> > -          (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server))) )
> > +          (p2m_is_discard_write(p2mt) || (p2mt == p2m_ioreq_server) ||
> > +           /* MMIO entries can be r/o if the target mfn is in 
> > mmio_ro_ranges. */
> > +           (p2mt == p2m_mmio_direct))) )
> >      {
> >          if ( !handle_mmio_with_translation(gla, gfn, npfec) )
> >              hvm_inject_hw_exception(X86_EXC_GP, 0);
> 
> Aren't we handing too many things to handle_mmio_with_translation() this
> way? At the very least you're losing ...
> 
> > @@ -2033,14 +2110,6 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned 
> > long gla,
> >          goto out_put_gfn;
> >      }
> >  
> > -    if ( (p2mt == p2m_mmio_direct) && npfec.write_access && npfec.present 
> > &&
> 
> ... the .present check.

Isn't the p2mt == p2m_mmio_direct check already ensuring the entry is
present?  Otherwise its type would be p2m_invalid or p2m_mmio_dm?

The other checks in this function already seem to assume that a valid
type implies the entry is present.

> I'm also concerned of e.g. VT-x'es APIC access MFN, which is
> p2m_mmio_direct.

But that won't go into hvm_hap_nested_page_fault() when using
cpu_has_vmx_virtualize_apic_accesses (and thus having an APIC page
mapped as p2m_mmio_direct)?

It would instead be an EXIT_REASON_APIC_ACCESS vmexit which is handled
differently?

Thanks, Roger.