[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] OVMF/Xen, Debian wheezy can't boot with NX on stack (Was: Re: [edk2] [PATCH] OvmfPkg: prevent code execution from DXE stack)
On 09/11/15 21:30, Josh Triplett wrote: > On Fri, Sep 11, 2015 at 05:28:06PM +0200, Laszlo Ersek wrote: >> On 09/11/15 16:10, Josh Triplett wrote: >>> On Fri, Sep 11, 2015 at 01:43:53PM +0200, Laszlo Ersek wrote: >>>> On 09/09/15 12:48, Laszlo Ersek wrote: >>>>> On 09/09/15 11:37, Ian Campbell wrote: >>>>>> On Wed, 2015-09-09 at 01:06 -0600, Jan Beulich wrote: >>>>>>>>>> On 09.09.15 at 00:23, <lersek@xxxxxxxxxx> wrote: >>>>>>>> On 09/08/15 19:26, Anthony PERARD wrote: >>>>>>>>> And I get this on the console: >>>>>>>>> Welcome to GRUB! >>>>>>>>> >>>>>>>>> !!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - >>>>>>>>> 00000000 !!!! >>>>>>>>> RIP - 000000000F5F8918, CS - 0000000000000028, RFLAGS - >>>>>>>>> 0000000000210206 >>>>>>>>> ExceptionData - 0000000000000011 >>>>>>>>> RAX - 0000000000000000, RCX - 0000000007FCE000, RDX - >>>>>>>>> 0000000000000000 >>>>>>>>> RBX - 000000000B6092C0, RSP - 000000000F5F8590, RBP - >>>>>>>>> 000000000B608EA0 >>>>>>>>> RSI - 000000000F5F8838, RDI - 000000000B608EA0 >>>>>>>>> R8 - 0000000000000000, R9 - 000000000B609200, R10 - >>>>>>>>> 0000000000000000 >>>>>>>>> R11 - 000000000000000A, R12 - 0000000000000000, R13 - >>>>>>>>> 000000000000001B >>>>>>>>> R14 - 000000000B609360, R15 - 0000000000000000 >>>>>>>>> DS - 0000000000000008, ES - 0000000000000008, FS - >>>>>>>>> 0000000000000008 >>>>>>>>> GS - 0000000000000008, SS - 0000000000000008 >>>>>>>>> CR0 - 0000000080000033, CR2 - 000000000F5F8918, CR3 - >>>>>>>>> 000000000F597000 >>>>>>>>> CR4 - 0000000000000668, CR8 - 0000000000000000 >>>>>>>>> DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - >>>>>>>>> 0000000000000000 >>>>>>>>> DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - >>>>>>>>> 0000000000000400 >>>>>>>>> GDTR - 000000000F57BF18 000000000000003F, LDTR - 0000000000000000 >>>>>>>>> IDTR - 000000000EEA5018 0000000000000FFF, TR - 0000000000000000 >>>>>>>>> FXSAVE_STATE - 000000000F5F81F0 >>>>>>>>> !!!! Find PE image >>>>>>>> /build/xen-unstable/src/xen-unstable/tools/firmware/ovmf-dir >>>>>>>> -remote/Build >>>>>>>> /OvmfX64/DEBUG_GCC49/X64/IntelFrameworkModulePkg/Universal/StatusCode/R >>>>>>>> untime >>>>>>>> Dxe/StatusCodeRuntimeDxe/DEBUG/StatusCodeRuntimeDxe.dll >>>>>>>> (ImageBase=000000000F556000, EntryPoint=000000000F55628F) !!!! >>>>>>>>> >>>>>>>>> I did check with other guest (Windows, Ubuntu, Debian Jessie), and >>>>>>>>> they are >>>>>>>>> working correctly. Debian Wheezy is the only one that fail. >>>>>>>> >>>>>>>> I don't have an environment to reproduce this in. I think we should try >>>>>>>> to understand this problem better, before deciding how to make it go >>>>>>>> away. >>>>>>>> >>>>>>>> Please locate the "StatusCodeRuntimeDxe.debug" file in your Build >>>>>>>> directory (ie. under the location listed in the error report). Then, >>>>>>>> please disassemble it with "objdump -S". The fault location in the >>>>>>>> disassembly can be found based on RIP, ImageBase and EntryPoint; >>>>>>> >>>>>>> I don't think the exact instruction at that address really matters. The >>>>>>> main question appears to be why RIP and RSP both point into the >>>>>>> same page (see also the subject of Anthony's mail). >>>>>> >>>>>> I'm not 100% what is going on, >>>>> >>>>> me neither :) >>>>> >>>>>> but if this (executable code on stack) is >>>>>> happening in grub is there something which is explicitly forbidden to >>>>>> UEFI >>>>>> apps by the UEFI spec? >>>>> >>>>> Yes, there is. This small OvmfPkg patch only enables the edk2 feature >>>>> added by Star Zeng in >>>>> <https://github.com/tianocore/edk2/commit/5630cdfe> for OVMF. That patch >>>>> (also referenced in my commit message by SVN rev) says, >>>>> >>>>> This feature is added for UEFI spec that says >>>>> "Stack may be marked as non-executable in identity mapped page >>>>> tables". >>>>> A PCD PcdSetNxForStack is added to turn on/off this feature, and it >>>>> is FALSE by default. >>>>> >>>>> A UEFI app runs (well, *starts*, anyway) before ExitBootServices() / >>>>> SetVirtualAddressMap(), so it's bound by the above. >>>>> >>>>> The spec passage above is quoted from "2.3.2 IA-32 Platforms", and >>>>> "2.3.4 x64 Platforms", in chapter "2.3 Calling Conventions", where the >>>>> boot services time environment is specified. >>>>> >>>>> This is new in UEFI-2.5, and it comes from Mantis ticket 1224: "Adding >>>>> support for No executable data areas". >>>>> >>>>> ... The question could be then if grub (in Wheezy) should be adapted to >>>>> UEFI-2.5 (if that's possible) or if OVMF should be built without this >>>>> feature. >>>>> >>>>> Hmmm. Actually, I'm torn about the default for PcdSetNxForStack. >>>>> >>>>> Namely, Mantis ticket 1224 has come up before. There's another edk2 >>>>> sub-feature related to this UEFI spec feature / Mantis ticket; the >>>>> properties table (controlled by "PcdPropertiesTableEnable"), and the >>>>> effects it has on the UEFI memory map, and the requirements it presents >>>>> for UEFI OSes. >>>>> >>>>> *That* sub-feature is extremely intrusive. >>>>> "MdeModulePkg/MdeModulePkg.dec" sets "PcdPropertiesTableEnable" TRUE by >>>>> default, and OvmfPkg inherits it. I have not overridden that default >>>>> just yet in OvmfPkg because the properties table feature depends on >>>>> something *else* too: sections in runtime DXE driver binaries must be >>>>> aligned at 4KB, and that is not ensured for OvmfPkg (yet?). Therefore, >>>>> the properties table feature is not active in OVMF at the moment due to >>>>> a "random build tools circumstance" -- and not because the feature flag >>>>> is actually disabled. >>>>> >>>>> Now, the properties table feature absolutely cannot be (effectively) >>>>> enabled in OVMF as a default. It would break Windows 7 / Windows Server >>>>> 2008 R2. A huge number of users run such guests for GPU passthrough. >>>>> >>>>> The idea that just crossed my mind was to introduce a new build flag for >>>>> OVMF, like -D NOEXEC_DATA_ENABLE. And this would then control >>>>> PcdSetNxForStack *and* PcdPropertiesTableEnable *together*: >>>>> >>>>> if NOEXEC_DATA_ENABLE: >>>>> PcdSetNxForStack := TRUE >>>>> PcdPropertiesTableEnable := TRUE >>>>> else >>>>> PcdSetNxForStack := FALSE >>>>> PcdPropertiesTableEnable := FALSE >>>>> >>>>> However, I think that's a bad idea. >>>>> >>>>> "PcdPropertiesTableEnable" breaks a very widely used guest OS for which >>>>> there is no update in sight (ie. no Service Pack 2 for Windows 7 / >>>>> Windows Server 2008 R2), but is otherwise supposed to be supported for >>>>> years to come. >>>>> >>>>> Whereas Debian Wheezy is not as widely used as a guest (for one: it >>>>> "competes" with all the other Linux distros from the same timeframe). >>>>> Debian Wheezy also provides a quite non-intrusive upgrade path (to >>>>> Jessie), which is not the case for Windows 7. So the breakage casued by >>>>> setting PcdSetNxForStack has much smaller impact, I think, and the >>>>> benefits might outweigh the breakage. >>>>> >>>>> Which just means that a heavy-handed -D NOEXEC_DATA_ENABLE, like above, >>>>> would not be appropriate. PcdSetNxForStack set, and >>>>> PcdPropertiesTableEnable clear, is a valid configuration. >>>>> >>>>> In any case, I sort of convinced myself that Wheezy's grub is not at >>>>> fault; it was simply written against an earlier release of the UEFI spec. >>>>> >>>>> How about this: >>>>> >>>>> - (somewhat unrelatedly, but for completeness): >>>>> I post a patch that exposes PcdPropertiesTableEnable with a build >>>>> time flag (NX_PROP_TABLE_ENABLE) with an OVMF-default of FALSE, >>>>> >>>>> - I post another patch that exposes PcdSetNxForStack with a build time >>>>> flag (NX_STACK_ENABLE) with an OVMF-default of *TRUE*. You could then >>>>> pass -D NX_STACK_ENABLE=FALSE when you build OVMF for any environment >>>>> where Wheezy guests are expected. >>>>> >>>>> Jordan? >>>>> >>>>> Yet another (quite complex) possibility: >>>>> >>>>> - According to "MdeModulePkg/MdeModulePkg.dec", >>>>> a platform DSC can already type PcdPropertiesTableEnable as a dynamic >>>>> PCD. We could allow the same for PcdSetNxForStack. Star? >>>>> >>>>> Both PCDs are consumed "acceptably late" (the first in the DXE core, >>>>> the second in the DXE IPL PEIM). This would allow >>>>> OvmfPkg/PlatformPei to set the PCDs dynamically. >>>>> >>>>> - Then the question is how we pass in the PCD values from the outside. >>>>> For QEMU/KVM virtual machines, we could use Gabriel's QEMU feature >>>>> >>>>> -fw_cfg name=opt/my_item_name,file=./my_blob.bin >>>>> >>>>> ie. no changes for QEMU would be necessary. Xen virtual machines >>>>> don't have fw_cfg however, therefore "some other" info passing should >>>>> be implemented in "OvmfPkg/PlatformPei/Xen.c", and (I assume!) in the >>>>> host-side Xen tools as well. >>>>> >>>>> Personally I think that this dynamic approach is overkill (mainly >>>>> because I'm fine with being unable to install Debian Wheezy guests, both >>>>> wearing and not wearing my red fedora; and because the properties table >>>>> feature is not active for *any* OVMF guests anyway, in practice). >>>>> >>>>> So I'd like to ask you guys to decide which one you prefer: a simple >>>>> build time flag called NX_STACK_ENABLE (which will allow you to install >>>>> Wheezy guests, but deprive all other guests from a non-exec stack), or >>>>> the dynamic switches (which will take host-side work in Xen, plus a >>>>> matching OvmfPkg patch). >>>> >>>> I might go ahead and implement the -fw_cfg solution (for when OVMF runs >>>> on QEMU), leaving any parallel Xen functionality to Xen developers, for >>>> later. >>>> >>>> Because, I just found out that the GRUB binary built into the bits-2005 >>>> release ISO <http://biosbits.org/download/> blows up the exact same way, >>>> and *that* is very annoying to me personally. >>>> >>>> Maybe we should invert the default. I'm not so convinced any longer that >>>> the current default is right. I didn't know that the GRUB version with >>>> code on the stack was this wide-spread. >>> >>> Heh. Our fault for still using old (2.00) GRUB2, which we do plan to >>> upgrade at some point, though doing so doesn't top the TODO list. But I >>> do agree that with OVMF not having previously enforced NX in the UEFI >>> environment, going from that to immediately enforcing it *by default* >>> seems a bit quick. I would be surprised if doing so didn't break other >>> UEFI programs too. >>> >>> Could OVMF build with NX support available, but disabled by default, so >>> that qemu can then turn it *on* with a -fw_cfg option? >> >> It would be possible, yes. >> >> I just posted a patch series that adds that dynamism, but it leaves the >> NX stack turned on by default. (Ie. one needs to add the cmdline option >> to turn it off.) If there's consensus to revert the default to off, I >> can submit a v2. >> >>> Do any UEFI stacks on systems in the wild have NX turned on? I don't >>> think it makes sense for OVMF to enable NX by default until a majority >>> of new physical systems do so. >> >> For me that's not so clear-cut. OVMF is frequently used as a UEFI >> development environment (it's better to brick a virtual machine than >> your physical dev platform...), so upstream should embrace new UEFI >> features reasonably early, unless there are *grave* regressions. >> >> One example I named was the properties table feature (new in UEFI-2.5). >> It would break Windows 7. Given the large number of users running >> Windows 7 on OVMF (mainly for GPU passthrough), such a regression would >> be serious. >> >> Breaking Debian Wheezy's and BITS's GRUB is also bad, but the former is >> very old (and has a clear upgrade path), while the latter is mainly used >> by developers (who can learn about the -fw_cfg switch by googling or >> asking on the least without huge trouble). In this case I'm leaning >> towards OVMF being "bleeding edge" by default. But, I could be convinced >> otherwise. > > I certainly think it makes sense for OVMF to adopt the feature sooner > than normal, and I agree that OVMF serves as a test case. But going > directly from "not possible to turn on" to "turned on by default", > without any period of "off by default but possible to turn on", seems a > bit unfortunate. > > That said, we could certainly fix BITS to use newer GRUB2, and use > (and document) -fw_cfg in the meantime. So I won't push *too* hard for > changing the default, just mildly. Okay. If I'll need to send a v2 for any reason, I'll incorporate this. If not, then I can post a followup patch later (stating that it's due to community feedback). > On a vaguely related note, what's the canonical place to report bugs in > OVMF? (Bugs? What bugs? :)) It's this list, <edk2-devel@xxxxxxxxxxxx>. Thanks! Laszlo > Github issues on tianocore/edk2 (which a few people have filed > issues on) or tianocore/edk2-OvmfPkg (which nobody has filed issues on), > or some other bug tracker I'm unaware of and haven't found a link to? > > - Josh Triplett > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |