[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2] Subject: x86/PAT: Report PAT on CPUs that support PAT without MTRR
On 7/13/2022 3:22 PM, Chuck Zmudzinski wrote: > On 7/13/2022 3:07 PM, Chuck Zmudzinski wrote: > > On 7/13/2022 9:45 AM, Juergen Gross wrote: > > > >> On 7/13/2022 6:36 AM, Chuck Zmudzinski wrote: > > > >> And in addition, if we are going to backport this patch to > > > >> all current stable branches, we better have a really, really, > > > >> good reason for changing the behavior of "nopat" on Xen. > > > >> > > > >> Does such a reason exist? > > > > > > > > Well, the simple reason is: It doesn't work the same way under Xen > > > > and non-Xen (in turn because, before my patch or whatever equivalent > > > > work, things don't work properly anyway, PAT-wise). Yet it definitely > > > > ought to behave the same everywhere, imo. > > > > > > There is Documentation/x86/pat.rst which rather clearly states, how > > > "nopat" is meant to work. It should not change the contents of the > > > PAT MSR and keep it just as it was set at boot time (the doc talks > > > about the "BIOS" setting of the MSR, and I guess in the Xen case > > > the hypervisor is kind of acting as the BIOS). > > > > > > The question is, whether "nopat" needs to be translated to > > > pat_enabled() returning "false". > > > > When I started working on a re-factoring effort of the logic > > surrounding pat_enabled(), I noticed there are five different > > reasons in the current code for setting pat_disabled to true, > > which IMO is what should be a redundant variable that should > > always be equal !pat_enabled() and !pat_bp_enabled, but that > > unfortunately is not the case. The five reasons for setting > > pat_disabled to true are given as message strings: > > > > 1. "MTRRs disabled, skipping PAT initialization too." > > 2. "PAT support disabled because CONFIG_MTRR is disabled in the kernel." > > 3. "PAT support disabled via boot option." > > 4. "PAT not supported by the CPU." > > 5. "PAT support disabled by the firmware." > > > > The only effect of setting pat_disabled to true is to inhibit > > the execution of pat_init(), but it does not inhibit the execution > > of init_cache_modes(), which is for handling all these cases > > when pat_init() was skipped. The Xen case is one of those > > cases, so in the Xen case, pat_disabled will be true yet the > > only way to fix the current regression and the five-year-old > > commit is by setting pat_bp_enabled to true so pat_enabled() > > will return true. So to fix the five-year-old commit, we must have > > > > pat_enabled() != pat_disabled > > > > Something is wrong with this logic, that is why I wanted to precede > > my fix with some re-factoring that will change some variable > > and function names and modify some comments before trying > > to fix the five-year-old commit, so that we will never have a situation > > when pat_enabled() != pat_disabled. > > > > Chuck > Sorry, I meant to say, > > To fix the five-year-old commit, we must have > > pat_enabled() != !pat_disabled or pat_enabled() == pat_disabled, > > and there is something wrong with that logic. > > Chuck So to summarize, I think this means that to be comfortable fixing the five-year-old commit and the current regression by artificially setting pat_bp_enabled and pat_enabled() to true, something which both my patch and Jan's patch does, we need to come to a new understanding of what the static boolean variable pat_disabled in arch/x86/mm/pat/memtype.c in the code really means. The fact is, we have a regression and the only fix we can find is to try to make pat_enabled() == pat_disabled I need to stop thinking about this for a while. It is time for those who have authority to fix this regression to make some comments about how they think this should be fixed. Chuck
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |