[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] x86/mm: add API for marking only part of a MMIO page read only



On Wed, Mar 29, 2023 at 02:39:22PM +0200, Jan Beulich wrote:
> On 29.03.2023 12:51, Marek Marczykowski-Górecki wrote:
> > On Wed, Mar 29, 2023 at 10:50:20AM +0200, Jan Beulich wrote:
> >> On 27.03.2023 12:09, Marek Marczykowski-Górecki wrote:
> >>> In some cases, only few registers on a page needs to be write-protected.
> >>> Examples include USB3 console (64 bytes worth of registers) or MSI-X's
> >>> PBA table (which doesn't need to span the whole table either).
> >>
> >> Yet like the MSI-X table the PBA is not permitted to share a page with
> >> other registers (the table itself excluded). So a need there would
> >> (again) only arise for devices violating the spec.
> > 
> > For PBA, indeed (and due to not seeing device needing such hack, I don't
> > do that for PBA yet). But the XHCI spec doesn't include such limitation, so
> > this case is perfectly valid.
> 
> My remark was merely because you mention PBA in the description.
> Mentioning it is okay, but you would want to further qualify it, I
> think.

Ah, ok.

> >> Taking both remarks together, limiting granularity to 16(?) bytes
> >> would allow using the advanced EPT functionality down the road, and
> >> would at the same time limit the suggested bitmap to just 256 bits /
> >> 32 bytes, which I think gets us below what even an empty rangeset
> >> would require. Plus lookup would also be quite a bit more lightweight.
> > 
> > Indeed, in that case it makes sense.
> 
> Hmm, I've checked the SDM, and I was misremembering: Granularity is
> 128 bytes, which might be too large for the purposes here.

Indeed, it seems so. In case of USB3 console, I want to protect 64 bytes
of registers...

I guess 16 bytes granularity would work, but it feels kinda arbitrary
without any good reason to choose this specific number. More logical
would be 4 bytes (as common register size), but it means 128 bytes for
the bitmask.

> >>> @@ -4893,6 +4906,172 @@ long arch_memory_op(unsigned long cmd, 
> >>> XEN_GUEST_HANDLE_PARAM(void) arg)
> >>>      return 0;
> >>>  }
> >>>  
> >>> +int subpage_mmio_ro_add(
> >>
> >> As long as patch 2 is going to add the only users, __init please, and
> >> there's no need for a "remove" counterpart.
> > 
> > __init makes sense. But as for removing "remove" part, I'm not sure. I
> > realize it is a dead code now, but it's easier to introduce it now to
> > provide complete API, than later by somebody else who would want to use
> > it in other places. Is there some trick to make compiler/linker optimize
> > it out?
> 
> At the very least you could also mark it __init. There are also the .discard
> and .discard.* sections we handle specially in the linker script. But no
> matter what you do to retain the code without impacting the resulting binary,
> iirc Misra tells us that there shouldn't be dead code.

Well, if dropping remove, then I guess I could leave a comment
describing what it would need to do. Or maybe just hint in the
description that earlier version of the patch had remove implemented -
so if anybody needs it in the future, can do some mailing list
archaeology and have something to start with.

> >>> +    if ( !iter || entry->fixmap_idx == fixmap_idx )
> >>> +        return 0;
> >>> +    else
> >>> +        return 1;
> >>
> >> If this case is really needed, this special return value (as documented
> >> in the header) probably needs specially handling in the callers - it's
> >> not necessarily an error after all. But I wonder whether it wouldn't be
> >> easier to check earlier on and fail right away (with e.g. -EBUSY), 
> > 
> > The idea is to allow multiple sub-ranges in a single page. Again, if
> > using ioremap() internally, instead of fixmap provided externally, this
> > case will go away.
> > 
> >> rather
> >> than adding the range and _then_ (kind of, as per patch 2) failing.
> > 
> > Right, I missed "!= 0" there.
> 
> Hmm, adding "!= 0" won't make a difference, will it? Adding "< 0" would.

Right.

> >>> +        }
> >>> +    }
> >>> +    gdprintk(XENLOG_WARNING,
> >>> +             "ignoring write to R/O MMIO mfn %" PRI_mfn " offset %lx len 
> >>> %u\n",
> >>> +             mfn_x(mfn), offset, len);
> >>
> >> ... make it here. (By implication: I'm not convinced this wants to be a
> >> gdprintk(), as I think at least one such warning would better surface in
> >> release builds as well.)
> > 
> > Right, gprintk() would make more sense indeed.
> > 
> >> At the same time I don't think any message should be issued for write
> >> attempts to pages which don't have part of it marked writable. We deal
> >> with such silently right now, and this shouldn't change.
> > 
> > At least for HVM domains, it isn't really silent. It's domain_crash()
> > (before my change, not reaching this function at all).
> 
> Well, yes, but that's one instance in the lifetime of a domain.

What I mean is that it isn't a common or normal case that HVM domain
would exercise in normal operation. It's abnormal situation and as such
it should IMO get a log message.
And even for PV domain, such message would save me quite some time
debugging related issues...

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
Description: PGP signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.