Re: EFI's -mapbs option may cause Linux to panic()
On 23.11.22 10:18, Jan Beulich wrote:
> On 23.11.2022 08:39, Juergen Gross wrote:
>> On 22.11.22 10:47, Roger Pau Monné wrote:
>>> On Mon, Nov 21, 2022 at 06:01:00PM +0100, Jan Beulich wrote:
>>>> On 21.11.2022 17:48, Roger Pau Monné wrote:
>>>>> On Mon, Nov 21, 2022 at 05:27:16PM +0100, Jan Beulich wrote:
>>>>>> Hello,
>>>>>>
>>>>>> on a system with these first two EFI memory map entries
>>>>>>
>>>>>> (XEN)  0000000000000-000000009dfff type=4 attr=000000000000000f
>>>>>> (XEN)  000000009e000-000000009ffff type=2 attr=000000000000000f
>>>>>>
>>>>>> i.e. except for 2 pages all space below 1M being BootServicesData, the -mapbs option has the effect of marking reserved all that space. Then Linux fails trying to allocate its lowmem trampoline (which really it shouldn't need when running in PV mode), ultimately leading to
>>>>>>
>>>>>> 		panic("Real mode trampoline was not allocated");
>>>>>>
>>>>>> in their init_real_mode().
>>>>>>
>>>>>> While for PV I think it is clear that the easiest is to avoid trampoline setup in the first place, iirc PVH Dom0 also tries to mirror the host memory map to its own address space. Does PVH Linux require a lowmem trampoline?
>>>>>
>>>>> Yes, it does AFAIK. I guess those two pages won't be enough for Linux boot trampoline requirements then.
>>>>>
>>>>> I assume native Linux is fine with this memory map because it reclaims the EfiBootServicesData region and that's enough.
>>>>
>>>> That's my understanding as well.
>>>>
>>>>>> While the two pages here are just enough for Xen's trampoline, I still wonder whether we want to adjust -mapbs behavior. Since whatever we might do leaves a risk of conflicting with true firmware (mis)use of that space, the best I can think of right now would be another option altering behavior (or providing altered behavior). Yet such an option would likely need to be more fine-grained then than covering all of the low Mb in one go. Which feels like both going too far and making it awkward for people to figure out what value(s) to use ...
>>>>>>
>>>>>> Thoughts anyone?
>>>>>
>>>>> I'm unsure what to recommend. The mapbs option is a workaround for broken firmware, and it's not enabled by default, so we might be lucky and never find a system with a memory map like you describe that also requires mapbs in order to boot.
>>>>
>>>> Guess how we've learned of the issue: Systems may boot fine without -mapbs, but they may fail to reboot because of that (in)famous issue of firmware writers not properly separating boot services code paths from runtime services ones. And there we're dealing with a system where I suspect this to be the case, just that - unlike in earlier similar cases - there's no "clean" crash proving the issue (the system simply hangs). Hence my request that they use -mapbs to try to figure out.
>>>>
>>>> And yes, "reboot=acpi" helps there, but they insist on knowing what component is to blame.
>>>
>>> Well, if reboot=acpi fixes it then it's quite clear EFI reboot method is to blame? Or they want to know the exact cause that makes EFI reboot fail, because that's quite difficult to figure out from our end.
>>>
>>> But I'm afraid I don't see any solution to make mapbs work with a PVH dom0 on a system with a memory map like you provided, short of adding some kind of bodge to not map and mark as reserved memory below 1MB (but that kind of defeats the purpose of mapbs).
>>
>> What we could do in such a case would be to inhibit suspending the system, and to run dom0 with a single cpu only.
>>
>> An error message indicating that the system should be booted without mapbs should be issued, of course.
>
> That's going to be awkward: Linux can't very well issue a message suggesting to remove the use of a hypervisor option (behavior of which is an implementation detail to some degree, and hence the message could end up being misleading later).
>
> Xen also can't very well issue such a message, since it doesn't know how much of lowmem is going to be enough for whichever Dom0 OS there's going to be booted. In principle an OS may get away with less than a single page. Hence Xen at best could issue a "may not work" message (unless no space at all was available at some 4k-aligned boundary), and even then it being a false indication on some (many?) systems may lead to people not paying attention when they should.

A kernel message could be phrased more generic, e.g. "Couldn't find a large enough free memory region below 1MB, maybe due to hypervisor settings and/or firmware issues."


Juergen
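
For readers following the arithmetic in this thread, here is a rough sketch of how the two reported memory map entries translate into usable pages below 1MB, with and without mapbs. It is only a simplified model and not Xen's actual code: the function name `usable_low_pages`, the constants, and the rule that LoaderData and ConventionalMemory stay usable while boot services regions become reserved under mapbs are assumptions made for illustration. The type numbers follow the UEFI memory type enumeration (2 = EfiLoaderData, 4 = EfiBootServicesData).

```python
# Hypothetical sketch: none of these names come from Xen or Linux.
PAGE_SIZE = 0x1000
LOW_MEM_END = 0x100000  # 1 MiB

# EFI memory type numbers as enumerated in the UEFI specification.
EFI_LOADER_DATA = 2
EFI_BOOT_SERVICES_CODE = 3
EFI_BOOT_SERVICES_DATA = 4
EFI_CONVENTIONAL_MEMORY = 7

# (start, inclusive end, type) for the two entries quoted in the report.
MEMORY_MAP = [
    (0x00000, 0x9DFFF, EFI_BOOT_SERVICES_DATA),
    (0x9E000, 0x9FFFF, EFI_LOADER_DATA),
]


def usable_low_pages(memory_map, mapbs):
    """Count pages below 1 MiB that remain usable as RAM after boot."""
    # Assumption: loader data and conventional memory are always usable;
    # boot services regions are usable only when mapbs is off.
    usable_types = {EFI_LOADER_DATA, EFI_CONVENTIONAL_MEMORY}
    if not mapbs:
        usable_types |= {EFI_BOOT_SERVICES_CODE, EFI_BOOT_SERVICES_DATA}

    pages = 0
    for start, end, mem_type in memory_map:
        if mem_type not in usable_types:
            continue
        top = min(end + 1, LOW_MEM_END)  # clip the region at 1 MiB
        if top > start:
            pages += (top - start) // PAGE_SIZE
    return pages


with_mapbs = usable_low_pages(MEMORY_MAP, mapbs=True)      # 2 pages
without_mapbs = usable_low_pages(MEMORY_MAP, mapbs=False)  # 160 pages
```

Under these assumptions only the two LoaderData pages at 0x9e000 survive with mapbs, which matches the "except for 2 pages" observation above, while without mapbs the whole 640KB region is available.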