[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 04/12] x86/boot: introduce module release
On 2024-11-06 19:46, Daniel P. Smith wrote: On 11/6/24 17:16, Jason Andryuk wrote:On 2024-11-02 13:25, Daniel P. Smith wrote:A precarious approach was used to release the pages used to hold a boot module. The precariousness stemmed from the fact that in the case of PV dom0, the initrd module pages may be either mapped or explicitly copied into the dom0 address space. So to handle this situation, the PV dom0 construction code will set the size of the module to zero, relying on discard_initial_images() to skipany modules with a size of zero.A function is introduced to release a module when it is no longer needed that accepts a boolean parameter, free_mem, to indicate if the corresponding pages can be freed. To track that a module has been released, the boot module flag`released` is introduced.The previous release model was a free all at once except those of size zeros, which would handle any unused modules passed. The new model is one of, free consumed module after usage is complete, thus unconsumed modules do not have aconsumer to free them.Slightly confusing. Maybe just "The new model is to free modules after they are consumed. Thus unconsumed modules are not freed."okay.To address this, the discard_uknown_boot_modules() is"unknown"Ackintroduced and called after the last module identification occurs, initrd, to free the pages of any boot modules that are identified as not being released. After domain construction completes, all modules should be freed. A walk of the boot modules is added after domain construction to validate and notify if amodule is found not to have been released. Signed-off-by: Daniel P. Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx> --- + { + struct boot_module *bm = &bi->mods[i]; + + if ( bm == NULL || bm->released ) continue; - init_domheap_pages(start, start + PAGE_ALIGN(size)); + release_boot_module(bm, true); + count++; } - bi->nr_modules = 0; + if ( count )+ printk(XENLOG_DEBUG "Releasing pages for uknown boot module %d\n","unknown". Since the operation already happened, maybe "Released pages for %d unknown boot modules"? I'm not sure of the value of that message. It would be more informative to provide the module index and maybe a page count. The index would at least point to which module is unused.Ack to unknown.Can adjust the phrasing, the question is there a desire to have a message for every boot module freed. Guess I could do a single log line split across multiple printks, Thinking about the case where someone tried to abuse the interface by loading a bunch of unused modules. It's 63 modules, so not too many. And this is the boot path, so it's administrator controlled and shouldn't really be subject to abuse. I think printing the index of unused modules makes the message useful. If a module is provided to Xen, but Xen doesn't use it, it is worth flagging which one. As an example, if I thought module 3 was ucode, but it isn't getting used, that is useful information for further investigation. + count); } static void __init init_idle_domain(void) @@ -2111,6 +2139,8 @@ void asmlinkage __init noreturn __start_xen(void) initrdidx); } + discard_unknown_boot_modules(); + /** We're going to setup domain0 using the module(s) that we stashed safely * above our heap. The second module, if present, is an initrd ramdisk.@@ -2122,6 +2152,11 @@ void asmlinkage __init noreturn __start_xen(void) if ( !dom0 ) panic("Could not set up DOM0 guest OS\n"); + /* Check and warn if any modules did not get released */ + for ( i = 0; i < bi->nr_modules; i++ ) + if ( !bi->mods[i].released )+ printk(XENLOG_ERR "Boot module %d not released, memory leaked", i);+Why not release the memory here instead of leaking it?Because you don't know if it was mapped or consumed. So this is more of a sanity check since it should not trigger? i.e. it's a bug to see this message. I feel like the corner case of mapping the dom0 initrd is leading to this manual approach or releasing modules individually. That logic then has to be spread around. discard_initial_images() kept the logic centralized. Maybe just replace the mod_end == 0 special case with a "don't release me" flag that is checked in discard_initial_images()?That is what started me at the options to deal with it. The two I came up with was a flag and this approach. Weighing the pros/cons of the two, the deciding factor is when multi-domain construction is introduced. With multi-domain with a large number of domains, a balance has to be struck between holding all the kernels and ramdisks in memory plus being able to allocate the desired amount of memory for each domain. So you're saying that by piece-wise free-ing memory, you can have more domUs loaded by the boot loader? If free-ing is delayed to the end, memory is tied up that could otherwise be assigned to domUs? Is this the real motivation? If so, it belongs in the commit message. As it is, the commit message is lacking in a specify reason why this change is needed. Perhaps a balance can be struck, with a discard_consumed_boot_modules() that walks the boot module list, and discard the ones consumed. While only dom0 can be constructed, it would be called once after create_dom0 call returns (per Andy's suggestion in response to this comment). An expansion on this could be that instead of using a free flag, use a ref count that is incremented as it is claimed and the decremented when it has been consumed. Are you expecting 1 kernel image gets used to create N domUs? Or are you only handling 1:1? Is mapping the initrd only expected for a single PV dom0, or do you envision mapping multiple initrds for multiple PV domUs? It seems to me that more mapping might help alleviate keeping memory tied up. Freeing XSM and ucode early helps, but those aren't particularly big. An initrd in the 100s of MBs is significant though. If you hyperlaunch Xen and 2 domUs, you'd want to be able to assign ~all the memory. Losing 2x ~100MB as unassigned for copies of the initrds is wasteful. Regards, Jason
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |