[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] XSA 154 and ISA region (640K -> 1MB) WB cache instead of UC



On Thu, Aug 18, 2016 at 04:35:44PM +0100, One Thousand Gnomes wrote:
> On Thu, 18 Aug 2016 05:12:54 -0600
> "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
> 
> > >>> On 18.08.16 at 12:16, <andrew.cooper3@xxxxxxxxxx> wrote:  
> > > On 18/08/16 11:06, Jan Beulich wrote:  
> > >>>>> On 17.08.16 at 22:32, <konrad.wilk@xxxxxxxxxx> wrote:  
> > >>>    Looking at the kernel it assumes that WB is ok for 640KB->1MB.
> > >>>    The comment says:
> > >>>    " /* Low ISA region is always mapped WB in page table. No need to 
> > >>> track   
> > > *"  
> > >> As per above it's not clear to me what this comment is backed by.  
> > > 
> > > This states what is in the pagetables.  Not the combined result with 
> > > MTRRs.
> > > 
> > > WB in the pagetables and WC/UB in the MTRRs is a legal combination which
> > > functions correctly.  
> > 
> > True, but then again - haven't I been told multiple times that Linux
> > nowadays prefers to run without using MTRRs?
> 
> The BIOS sets up the fixed MTRR registers for the 640K-1MB window. Those
> are separate to the variable range MTRR registers used for main memory
> with specific mappings for segments A000 to BFFF then C000-C7FF /
> C800-CFFF / etc up to FFFF.

OK, so BIOS-inherited.

Looking at the Intel SDM (figure 11-7), if the MTRR is UC for that, then 
having pagetables being either UC or WB are fine. Except Linux's use
of the quirk (is_untracked_pat_range) ends up always requesting WB.

And to combat the splat, the patch:


From 5209635f23786fb88cf0ce77719da8acda63bf65 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Fri, 19 Aug 2016 11:06:44 -0400
Subject: [PATCH] x86/xen: Add x86_platform.is_untracked_pat_range quirk to
 ignore ISA regions.

On x86 whenever VMAs are setup, the 'is_ISA_range quirk' (which this
patch re-implements) is used to figure whether to ignore the
requested PAT type and always use WB (see 'reserve_memtype').
Specifically it forces the WB type for any region in the ISA space.

From the Intel SDM, the combination of MTRR (UC, which is setup by
the BIOS) and PAT (UC or WB) for the ISA region ends up with the same
value - UC.

However on Xen, due to XSA 154 we enforce that mappings that _ANY_
pagetable entry to MMIO ranges MUST have the same the same cachability
mapping - and in this case we enforce UC.

Which means that with XSA 154 (and without this patch) any application
that maps /dev/mem to get SMBIOS information (like mcelog), and pokes
in the ISA region will not have an PTE set. That is due to
reserve_pfn_range returning -EINVAL which results in the PTE not being set.

[These are debug entries added in 'reserve_pfn_range']
mcelog:2471 0xf0000->0xf1000, req_type=write-back new_type=write-back
mcelog:2471 0xeb000->0xed000, req_type=write-back new_type=write-back

.. above are successfull ones, but:
mcelog:2471 0xeb000->0xed000, req_type=uncached new_type=uncached
[again, a debug one:]
mcelog:2471 want=uncached got=write-back strict 0x000eb000-0x000ecfff
mcelog:2471 map pfn expected mapping type uncached for [mem 
0x000eb000-0x000ecfff], got write-back
 ------------[ cut here ]------------

 [<ffffffff816c66f0>] dump_stack+0x63/0x83
 [<ffffffff81084745>] warn_slowpath_common+0x95/0xe0
 [<ffffffff810847aa>] warn_slowpath_null+0x1a/0x20
 [<ffffffff810725f3>] untrack_pfn+0x93/0xc0
 [<ffffffff811b90f9>] unmap_single_vma+0xa9/0x100
 [<ffffffff811b9644>] unmap_vmas+0x54/0xa0
 [<ffffffff811bf0da>] exit_mmap+0x9a/0x150
 [<ffffffff810825d3>] mmput+0x73/0x110
 [<ffffffff81082775>] dup_mm+0x105/0x110
 [<ffffffff81083b1d>] copy_process+0x11ed/0x1240
 [<ffffffff81084009>] do_fork+0x79/0x280
 [<ffffffff810259d3>] ? syscall_trace_enter_phase1+0x153/0x180
 [<ffffffff81084226>] SyS_clone+0x16/0x20
 [<ffffffff816cb3ee>] system_call_fastpath+0x12/0x71

results in that splat.

The effective result of the function below is for 'reserver_memtype'
to ignore the result from 'x86_platform.is_untracked_pat_range' quirk.
Which means that the splat above does not happen.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
---
 arch/x86/xen/enlighten.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 8ffb089..3238d04 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -283,6 +283,27 @@ static void __init xen_banner(void)
               version >> 16, version & 0xffff, extra.extraversion,
               xen_feature(XENFEAT_mmu_pt_update_preserve_ad) ? " 
(preserve-AD)" : "");
 }
+
+/*
+ * On x86 whenever VMAs are setup, the 'is_ISA_range quirk' (which we
+ * re-implement below) is used to figure whether to ignore the
+ * requested PAT type and always use WB (see 'reserve_memtype').
+ *
+ * The combination of MTRR (UC) and PAT (UC or WB) for the ISA region ends
+ * up with the same value - UC.
+ *
+ * However on Xen, due to XSA 154 we enforce that mappings to _ANY_ MMIO
+ * range MUST have the same the same cachability mapping - and in this case
+ * we enforce UC for everything.
+ *
+ * The effective result of the function below is for 'reserver_memtype'
+ * to ignore the result from 'x86_platform.is_untracked_pat_range' quirk.
+ */
+static bool xen_ignore(u64 s, u64 e)
+{
+       return false;
+}
+
 /* Check if running on Xen version (major, minor) or later */
 bool
 xen_running_on_version_or_later(unsigned int major, unsigned int minor)
@@ -1730,6 +1751,8 @@ asmlinkage __visible void __init xen_start_kernel(void)
                x86_init.mpparse.get_smp_config = x86_init_uint_noop;
 
                xen_boot_params_init_edd();
+
+               x86_platform.is_untracked_pat_range = xen_ignore;
        }
 #ifdef CONFIG_PCI
        /* PCI BIOS service won't work from a PV guest. */
-- 
2.5.5

> 
> Alan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.