Re: [PATCH 0/3] x86: make pat and mtrr independent from each other
On 7/18/2022 7:32 AM, Chuck Zmudzinski wrote:
> On 7/17/2022 3:55 AM, Thorsten Leemhuis wrote:
> > Hi Juergen!
> >
> > On 15.07.22 16:25, Juergen Gross wrote:
> > > Today PAT can't be used without MTRR being available, unless MTRR is at
> > > least configured via CONFIG_MTRR and the system is running as Xen PV
> > > guest. In this case PAT is automatically available via the hypervisor,
> > > but the PAT MSR can't be modified by the kernel and MTRR is disabled.
> > >
> > > As an additional complexity the availability of PAT can't be queried
> > > via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> > > to be disabled. This leads to some drivers believing that not all cache
> > > modes are available, resulting in failures or degraded functionality.
> > >
> > > The same applies to a kernel built with no MTRR support: it won't
> > > allow to use the PAT MSR, even if there is no technical reason for
> > > that, other than setting up PAT on all cpus the same way (which is a
> > > requirement of the processor's cache management) is relying on some
> > > MTRR specific code.
> > >
> > > Fix all of that by:
> > >
> > > - moving the function needed by PAT from MTRR specific code one level
> > >   up
> > > - adding a PAT indirection layer supporting the 3 cases "no or disabled
> > >   PAT", "PAT under kernel control", and "PAT under Xen control"
> > > - removing the dependency of PAT on MTRR
> >
> > Thx for working on this. If you need to respin these patches for one
> > reason or another, could you do me a favor and add proper 'Link:' tags
> > pointing to all reports about this issue? e.g. like this:
> >
> > Link: https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
> >
> > These tags are considered important by Linus[1] and others, as they
> > allow anyone to look into the backstory weeks or years from now. That is
> > why they should be placed in cases like this, as
> > Documentation/process/submitting-patches.rst and
> > Documentation/process/5.Posting.rst explain in more detail. I care
> > personally, because these tags make my regression tracking efforts a
> > whole lot easier, as they allow my tracking bot 'regzbot' to
> > automatically connect reports with patches posted or committed to fix
> > tracked regressions.
> >
> > [1] see for example:
> > https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@xxxxxxxxxxxxxx/
> > https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nOwVkVzjpM3VFU1zobP37Fwd_h9iAD5JQ@xxxxxxxxxxxxxx/
> > https://lore.kernel.org/all/CAHk-=wjxzafG-=J8oT30s7upn4RhBs6TX-uVFZ5rME+L5_DoJA@xxxxxxxxxxxxxx/
> >
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> >
>
> I echo Thorsten's thx for starting on this now instead of waiting until
> September which I think is when Juergen said he could start working
> on this last week. I agree with Thorsten that Link tags are needed.
> Since multiple patches have been proposed to fix this regression,
> perhaps a Link to each proposed patch, and a note that
> the original report identified a specific commit which when reverted
> also fixes it. IMO, this is all part of the backstory Thorsten refers to.
>
> It looks like with this approach, a fix will not be coming real soon,
> and Borislav Petkov also discouraged me from testing this
> patch set until I receive a ping telling me it is ready for testing,
> which seems to confirm that this regression will not be fixed
> very soon. Please correct me if I am wrong about how long
> it will take to fix it with this approach.
>
> Also, is there any guarantee this approach is endorsed by
> all the maintainers who will need to sign-off, especially
> Linus? I say this because some of the discussion on the
> earlier proposed patches makes me doubt this. I am especially
> referring to this discussion:
>
> https://lore.kernel.org/lkml/4c8c9d4c-1c6b-8e9f-fa47-918a64898a28@xxxxxxxxxxxxx/
>
> and also, here:
>
> https://lore.kernel.org/lkml/YsRjX%2FU1XN8rq+8u@xxxxxxx/
>
> where Borislav Petkov argues that Linux should not be
> patched at all to fix this regression but instead the fix
> should come by patching the Xen hypervisor.
>
> So I have several questions, presuming at least the fix is going
> to be delayed for some time, and also presuming this approach
> is not yet an approach that has the blessing of the maintainers
> who will need to sign-off:
>
> 1. Can you estimate when the patch series will be ready for
> testing and suitable for a prepatch or RC release?
>
> 2. Can you estimate when the patch series will be ready to be
> merged into the mainline release? Is there any hope it will be
> fixed before the next longterm release hosted on kernel.org?
>
> 3. Since a fix is likely not coming soon, can you explain
> why the commit that was mentioned in the original
> report cannot be reverted as a temporary solution while
> we wait for the full fix to come later? I can say that
> reverting that commit (It was a commit affecting
> drm/i915) does fix the issue on my system with no
> negative side effects at all. In such a case, it seems
> contrary to Linus' regression rule to not revert the
> offending commit, even if reverting the offending
> commit is not going to be the final solution. IOW,
> I am trying to argue that an important corollary to
> the Linus regression rule is that we revert commits
> that introduce regressions, especially when there
> are no negative effects when reverting the offending
> commit. Why are we not doing that in this case?
>
> 4. Can you explain why this patch series is superior
> to the other proposed patches that are much more
> simple and have been reported to fix the regression?
>
> 5. This approach seems way too aggressive for backporting
> to the stable releases. Is that correct? Or, will the patches
> be backported to the stable releases? I was told that
> backports to the stable releases are needed to keep things
> consistent across all the supported versions when I submitted
> a patch to fix this regression that identified a specific five year
> old commit that my proposed patch would fix.
>
> Remember, this is a regression that is really bothering
> people now. For example, I am now in a position where
> I cannot install the updates of the Linux kernel that Debian
> pushes out to me without patching the kernel with my
> own private build that has one of the known fixes that
> have already been identified as ways to workaround this
> regression while we wait for the full solution that will
> hopefully come later.
>
> Chuck
>
> > P.S.: As the Linux kernel's regression tracker I deal with a lot of
> > reports and sometimes miss something important when writing mails like
> > this. If that's the case here, don't hesitate to tell me in a public
> > reply, it's in everyone's interest to set the public record straight.
> >
> > BTW, let me tell regzbot to monitor this thread:
> >
> > #regzbot ^backmonitor:
> > https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
>

OK, the comments Boris made on the individual patches of this
patch set answer most of my questions. Thx, Boris.

Chuck