[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] OVMF/Xen, Debian wheezy can't boot with NX on stack (Was: Re: [edk2] [PATCH] OvmfPkg: prevent code execution from DXE stack)
On Fri, Sep 11, 2015 at 05:28:06PM +0200, Laszlo Ersek wrote: > On 09/11/15 16:10, Josh Triplett wrote: > > On Fri, Sep 11, 2015 at 01:43:53PM +0200, Laszlo Ersek wrote: > >> On 09/09/15 12:48, Laszlo Ersek wrote: > >>> On 09/09/15 11:37, Ian Campbell wrote: > >>>> On Wed, 2015-09-09 at 01:06 -0600, Jan Beulich wrote: > >>>>>>>> On 09.09.15 at 00:23, <lersek@xxxxxxxxxx> wrote: > >>>>>> On 09/08/15 19:26, Anthony PERARD wrote: > >>>>>>> And I get this on the console: > >>>>>>> Welcome to GRUB! > >>>>>>> > >>>>>>> !!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - > >>>>>>> 00000000 !!!! > >>>>>>> RIP - 000000000F5F8918, CS - 0000000000000028, RFLAGS - > >>>>>>> 0000000000210206 > >>>>>>> ExceptionData - 0000000000000011 > >>>>>>> RAX - 0000000000000000, RCX - 0000000007FCE000, RDX - > >>>>>>> 0000000000000000 > >>>>>>> RBX - 000000000B6092C0, RSP - 000000000F5F8590, RBP - > >>>>>>> 000000000B608EA0 > >>>>>>> RSI - 000000000F5F8838, RDI - 000000000B608EA0 > >>>>>>> R8 - 0000000000000000, R9 - 000000000B609200, R10 - > >>>>>>> 0000000000000000 > >>>>>>> R11 - 000000000000000A, R12 - 0000000000000000, R13 - > >>>>>>> 000000000000001B > >>>>>>> R14 - 000000000B609360, R15 - 0000000000000000 > >>>>>>> DS - 0000000000000008, ES - 0000000000000008, FS - > >>>>>>> 0000000000000008 > >>>>>>> GS - 0000000000000008, SS - 0000000000000008 > >>>>>>> CR0 - 0000000080000033, CR2 - 000000000F5F8918, CR3 - > >>>>>>> 000000000F597000 > >>>>>>> CR4 - 0000000000000668, CR8 - 0000000000000000 > >>>>>>> DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - > >>>>>>> 0000000000000000 > >>>>>>> DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - > >>>>>>> 0000000000000400 > >>>>>>> GDTR - 000000000F57BF18 000000000000003F, LDTR - 0000000000000000 > >>>>>>> IDTR - 000000000EEA5018 0000000000000FFF, TR - 0000000000000000 > >>>>>>> FXSAVE_STATE - 000000000F5F81F0 > >>>>>>> !!!! Find PE image > >>>>>> /build/xen-unstable/src/xen-unstable/tools/firmware/ovmf-dir > >>>>>> -remote/Build > >>>>>> /OvmfX64/DEBUG_GCC49/X64/IntelFrameworkModulePkg/Universal/StatusCode/R > >>>>>> untime > >>>>>> Dxe/StatusCodeRuntimeDxe/DEBUG/StatusCodeRuntimeDxe.dll > >>>>>> (ImageBase=000000000F556000, EntryPoint=000000000F55628F) !!!! > >>>>>>> > >>>>>>> I did check with other guest (Windows, Ubuntu, Debian Jessie), and > >>>>>>> they are > >>>>>>> working correctly. Debian Wheezy is the only one that fail. > >>>>>> > >>>>>> I don't have an environment to reproduce this in. I think we should try > >>>>>> to understand this problem better, before deciding how to make it go > >>>>>> away. > >>>>>> > >>>>>> Please locate the "StatusCodeRuntimeDxe.debug" file in your Build > >>>>>> directory (ie. under the location listed in the error report). Then, > >>>>>> please disassemble it with "objdump -S". The fault location in the > >>>>>> disassembly can be found based on RIP, ImageBase and EntryPoint; > >>>>> > >>>>> I don't think the exact instruction at that address really matters. The > >>>>> main question appears to be why RIP and RSP both point into the > >>>>> same page (see also the subject of Anthony's mail). > >>>> > >>>> I'm not 100% what is going on, > >>> > >>> me neither :) > >>> > >>>> but if this (executable code on stack) is > >>>> happening in grub is there something which is explicitly forbidden to > >>>> UEFI > >>>> apps by the UEFI spec? > >>> > >>> Yes, there is. This small OvmfPkg patch only enables the edk2 feature > >>> added by Star Zeng in > >>> <https://github.com/tianocore/edk2/commit/5630cdfe> for OVMF. That patch > >>> (also referenced in my commit message by SVN rev) says, > >>> > >>> This feature is added for UEFI spec that says > >>> "Stack may be marked as non-executable in identity mapped page > >>> tables". > >>> A PCD PcdSetNxForStack is added to turn on/off this feature, and it > >>> is FALSE by default. > >>> > >>> A UEFI app runs (well, *starts*, anyway) before ExitBootServices() / > >>> SetVirtualAddressMap(), so it's bound by the above. > >>> > >>> The spec passage above is quoted from "2.3.2 IA-32 Platforms", and > >>> "2.3.4 x64 Platforms", in chapter "2.3 Calling Conventions", where the > >>> boot services time environment is specified. > >>> > >>> This is new in UEFI-2.5, and it comes from Mantis ticket 1224: "Adding > >>> support for No executable data areas". > >>> > >>> ... The question could be then if grub (in Wheezy) should be adapted to > >>> UEFI-2.5 (if that's possible) or if OVMF should be built without this > >>> feature. > >>> > >>> Hmmm. Actually, I'm torn about the default for PcdSetNxForStack. > >>> > >>> Namely, Mantis ticket 1224 has come up before. There's another edk2 > >>> sub-feature related to this UEFI spec feature / Mantis ticket; the > >>> properties table (controlled by "PcdPropertiesTableEnable"), and the > >>> effects it has on the UEFI memory map, and the requirements it presents > >>> for UEFI OSes. > >>> > >>> *That* sub-feature is extremely intrusive. > >>> "MdeModulePkg/MdeModulePkg.dec" sets "PcdPropertiesTableEnable" TRUE by > >>> default, and OvmfPkg inherits it. I have not overridden that default > >>> just yet in OvmfPkg because the properties table feature depends on > >>> something *else* too: sections in runtime DXE driver binaries must be > >>> aligned at 4KB, and that is not ensured for OvmfPkg (yet?). Therefore, > >>> the properties table feature is not active in OVMF at the moment due to > >>> a "random build tools circumstance" -- and not because the feature flag > >>> is actually disabled. > >>> > >>> Now, the properties table feature absolutely cannot be (effectively) > >>> enabled in OVMF as a default. It would break Windows 7 / Windows Server > >>> 2008 R2. A huge number of users run such guests for GPU passthrough. > >>> > >>> The idea that just crossed my mind was to introduce a new build flag for > >>> OVMF, like -D NOEXEC_DATA_ENABLE. And this would then control > >>> PcdSetNxForStack *and* PcdPropertiesTableEnable *together*: > >>> > >>> if NOEXEC_DATA_ENABLE: > >>> PcdSetNxForStack := TRUE > >>> PcdPropertiesTableEnable := TRUE > >>> else > >>> PcdSetNxForStack := FALSE > >>> PcdPropertiesTableEnable := FALSE > >>> > >>> However, I think that's a bad idea. > >>> > >>> "PcdPropertiesTableEnable" breaks a very widely used guest OS for which > >>> there is no update in sight (ie. no Service Pack 2 for Windows 7 / > >>> Windows Server 2008 R2), but is otherwise supposed to be supported for > >>> years to come. > >>> > >>> Whereas Debian Wheezy is not as widely used as a guest (for one: it > >>> "competes" with all the other Linux distros from the same timeframe). > >>> Debian Wheezy also provides a quite non-intrusive upgrade path (to > >>> Jessie), which is not the case for Windows 7. So the breakage casued by > >>> setting PcdSetNxForStack has much smaller impact, I think, and the > >>> benefits might outweigh the breakage. > >>> > >>> Which just means that a heavy-handed -D NOEXEC_DATA_ENABLE, like above, > >>> would not be appropriate. PcdSetNxForStack set, and > >>> PcdPropertiesTableEnable clear, is a valid configuration. > >>> > >>> In any case, I sort of convinced myself that Wheezy's grub is not at > >>> fault; it was simply written against an earlier release of the UEFI spec. > >>> > >>> How about this: > >>> > >>> - (somewhat unrelatedly, but for completeness): > >>> I post a patch that exposes PcdPropertiesTableEnable with a build > >>> time flag (NX_PROP_TABLE_ENABLE) with an OVMF-default of FALSE, > >>> > >>> - I post another patch that exposes PcdSetNxForStack with a build time > >>> flag (NX_STACK_ENABLE) with an OVMF-default of *TRUE*. You could then > >>> pass -D NX_STACK_ENABLE=FALSE when you build OVMF for any environment > >>> where Wheezy guests are expected. > >>> > >>> Jordan? > >>> > >>> Yet another (quite complex) possibility: > >>> > >>> - According to "MdeModulePkg/MdeModulePkg.dec", > >>> a platform DSC can already type PcdPropertiesTableEnable as a dynamic > >>> PCD. We could allow the same for PcdSetNxForStack. Star? > >>> > >>> Both PCDs are consumed "acceptably late" (the first in the DXE core, > >>> the second in the DXE IPL PEIM). This would allow > >>> OvmfPkg/PlatformPei to set the PCDs dynamically. > >>> > >>> - Then the question is how we pass in the PCD values from the outside. > >>> For QEMU/KVM virtual machines, we could use Gabriel's QEMU feature > >>> > >>> -fw_cfg name=opt/my_item_name,file=./my_blob.bin > >>> > >>> ie. no changes for QEMU would be necessary. Xen virtual machines > >>> don't have fw_cfg however, therefore "some other" info passing should > >>> be implemented in "OvmfPkg/PlatformPei/Xen.c", and (I assume!) in the > >>> host-side Xen tools as well. > >>> > >>> Personally I think that this dynamic approach is overkill (mainly > >>> because I'm fine with being unable to install Debian Wheezy guests, both > >>> wearing and not wearing my red fedora; and because the properties table > >>> feature is not active for *any* OVMF guests anyway, in practice). > >>> > >>> So I'd like to ask you guys to decide which one you prefer: a simple > >>> build time flag called NX_STACK_ENABLE (which will allow you to install > >>> Wheezy guests, but deprive all other guests from a non-exec stack), or > >>> the dynamic switches (which will take host-side work in Xen, plus a > >>> matching OvmfPkg patch). > >> > >> I might go ahead and implement the -fw_cfg solution (for when OVMF runs > >> on QEMU), leaving any parallel Xen functionality to Xen developers, for > >> later. > >> > >> Because, I just found out that the GRUB binary built into the bits-2005 > >> release ISO <http://biosbits.org/download/> blows up the exact same way, > >> and *that* is very annoying to me personally. > >> > >> Maybe we should invert the default. I'm not so convinced any longer that > >> the current default is right. I didn't know that the GRUB version with > >> code on the stack was this wide-spread. > > > > Heh. Our fault for still using old (2.00) GRUB2, which we do plan to > > upgrade at some point, though doing so doesn't top the TODO list. But I > > do agree that with OVMF not having previously enforced NX in the UEFI > > environment, going from that to immediately enforcing it *by default* > > seems a bit quick. I would be surprised if doing so didn't break other > > UEFI programs too. > > > > Could OVMF build with NX support available, but disabled by default, so > > that qemu can then turn it *on* with a -fw_cfg option? > > It would be possible, yes. > > I just posted a patch series that adds that dynamism, but it leaves the > NX stack turned on by default. (Ie. one needs to add the cmdline option > to turn it off.) If there's consensus to revert the default to off, I > can submit a v2. > > > Do any UEFI stacks on systems in the wild have NX turned on? I don't > > think it makes sense for OVMF to enable NX by default until a majority > > of new physical systems do so. > > For me that's not so clear-cut. OVMF is frequently used as a UEFI > development environment (it's better to brick a virtual machine than > your physical dev platform...), so upstream should embrace new UEFI > features reasonably early, unless there are *grave* regressions. > > One example I named was the properties table feature (new in UEFI-2.5). > It would break Windows 7. Given the large number of users running > Windows 7 on OVMF (mainly for GPU passthrough), such a regression would > be serious. > > Breaking Debian Wheezy's and BITS's GRUB is also bad, but the former is > very old (and has a clear upgrade path), while the latter is mainly used > by developers (who can learn about the -fw_cfg switch by googling or > asking on the least without huge trouble). In this case I'm leaning > towards OVMF being "bleeding edge" by default. But, I could be convinced > otherwise. I certainly think it makes sense for OVMF to adopt the feature sooner than normal, and I agree that OVMF serves as a test case. But going directly from "not possible to turn on" to "turned on by default", without any period of "off by default but possible to turn on", seems a bit unfortunate. That said, we could certainly fix BITS to use newer GRUB2, and use (and document) -fw_cfg in the meantime. So I won't push *too* hard for changing the default, just mildly. On a vaguely related note, what's the canonical place to report bugs in OVMF? Github issues on tianocore/edk2 (which a few people have filed issues on) or tianocore/edk2-OvmfPkg (which nobody has filed issues on), or some other bug tracker I'm unaware of and haven't found a link to? - Josh Triplett _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |