Re: [PATCH] x86/hvm: Improve hvm_set_guest_pat() code generation again
On 16.12.2022 21:53, Andrew Cooper wrote:
> On 10/08/2022 3:06 pm, Jan Beulich wrote:
>> On 10.08.2022 15:36, Andrew Cooper wrote:
>>> From: Edwin Török <edvin.torok@xxxxxxxxxx>
>>>
>>> Following on from cset 9ce0a5e207f3 ("x86/hvm: Improve hvm_set_guest_pat()
>>> code generation"), and the discovery that Clang/LLVM makes some especially
>>> disastrous code generation for the loop at -O2
>>>
>>> https://github.com/llvm/llvm-project/issues/54644
>>>
>>> Edvin decided to remove the loop entirely by fully vectorising it.  This is
>>> substantially more efficient than the loop, and rather harder for a typical
>>> compiler to mess up.
>>>
>>> Signed-off-by: Edwin Török <edvin.torok@xxxxxxxxxx>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>
>> The main downside being that changing the code to fit in a new PAT
>> type will now be harder.
>
> When was the last PAT type change?
>
> Trick question.  Never, because PAT hasn't changed since it was
> introduced 24 years ago in the Pentium III.
>
> I really don't think we're in danger of needing to change this logic.

One way to look at things, sure.

>> I wonder in particular whether with that in mind it wouldn't be better
>> to express the check not in terms of relations, but in terms of set /
>> clear bits ("bits 3-7 clear AND (bit 2 set OR bit 1 clear)").  The code
>> kind of does so already, but the variable names don't reflect that (and
>> would hence need to change in such an event).
>
> That would reduce clarity.
>
> The bits being set or cleared are trivial for any developer, given the
> particularly basic RHS expressions.
>
> The constant names are what relate the bit patterns to the description
> of the problem.

Again - one way to look at things.  Plus, with Demi's series now also in
mind, what's done here is moving us in exactly the opposite direction.
Is this function hot enough to warrant that?

>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -302,24 +302,43 @@ void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
>>>      *guest_pat = v->arch.hvm.pat_cr;
>>>  }
>>>
>>> -int hvm_set_guest_pat(struct vcpu *v, uint64_t guest_pat)
>>> +/*
>>> + * MSR_PAT takes 8 uniform fields, each of which must be a valid architectural
>>> + * memory type (0, 1, 4-7).  This is a fully vectorised form of the
>>> + * 8-iteration loop over bytes looking for PAT_TYPE_* constants.
>>
>> While grep-ing for PAT_TYPE_ will hit this line, I think we want
>> every individual type to also be found here when grep-ing for one.
>> The actual values aren't going to change, but perhaps the best
>> way to do so would still be by way of BUILD_BUG_ON()s.
>
> Why?  What does that solve or improve?
>
> "pat" is the thing people are going to be looking for if they're
> actually trying to find this logic.
>
> (And I bring this patch up specifically after reviewing Demi's series,
> where PAT_TYPE_* changes to X86_MT_* but "pat" is still the useful
> search term IMO.)

I don't think "PAT" is a good thing to grep for when trying to find uses
of particular memory types.  Go grep for "_WP", "_WC", or "_WT" - you'll
find not overly many hits, and with the false positives filtered out
you'll have a good overview of which of these types is used in how many
places.

Jan
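
For readers following the thread, the "fully vectorised" check under
discussion can be illustrated with a standalone sketch.  This is not the
code from the patch itself: the helper name pat_valid(), the constants
and the exact expressions below are chosen purely for illustration.  It
is an independent byte-parallel (SWAR) rendition of the rule quoted
above - every byte must have bits 3-7 clear AND (bit 2 set OR bit 1
clear), i.e. be one of the valid memory types 0, 1, 4, 5, 6 or 7.

    /*
     * Illustrative sketch only, not the patch's implementation: validate
     * all 8 PAT fields at once instead of looping over the bytes.
     */
    #include <stdbool.h>
    #include <stdint.h>

    static bool pat_valid(uint64_t val)
    {
        /* Non-zero in any byte lane whose value is 8 or above (bits 3-7 set). */
        uint64_t ge_8 = val & 0xf8f8f8f8f8f8f8f8ULL;

        /*
         * Bit 2 ends up set in any byte lane which has bit 1 set but bit 2
         * clear - i.e. the invalid values 2 and 3, once lanes >= 8 have been
         * rejected above.  The left shift cannot leak between lanes because
         * only bit 2 of each lane survives the mask.
         */
        uint64_t is_2_or_3 = (val << 1) & ~val & 0x0404040404040404ULL;

        return !(ge_8 | is_2_or_3);
    }

Fed the architectural power-on default 0x0007040600070406 this yields
true, while any byte of 2, 3 or anything 8 and above makes it false.
The real hvm_set_guest_pat() of course also has to store the accepted
value, which the sketch deliberately omits.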