Re: slow start of Pod HVM domU with qemu 9.1
On Wed, Jan 29, 2025 at 09:52:19AM +0100, Jan Beulich wrote:
> On 29.01.2025 00:58, Stefano Stabellini wrote:
> > On Tue, 28 Jan 2025, Edgar E. Iglesias wrote:
> >> On Tue, Jan 28, 2025 at 03:15:44PM +0100, Olaf Hering wrote:
> >>> With this change the domU starts fast again:
> >>>
> >>> --- a/hw/xen/xen-mapcache.c
> >>> +++ b/hw/xen/xen-mapcache.c
> >>> @@ -522,6 +522,7 @@ ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
> >>>      ram_addr_t addr;
> >>>
> >>>      addr = xen_ram_addr_from_mapcache_single(mapcache, ptr);
> >>> +    if (1)
> >>>      if (addr == RAM_ADDR_INVALID) {
> >>>          addr = xen_ram_addr_from_mapcache_single(mapcache_grants, ptr);
> >>>      }
> >>>
> >>> @@ -626,6 +627,7 @@ static void xen_invalidate_map_cache_entry_single(MapCache *mc, uint8_t *buffer)
> >>>  static void xen_invalidate_map_cache_entry_all(uint8_t *buffer)
> >>>  {
> >>>      xen_invalidate_map_cache_entry_single(mapcache, buffer);
> >>> +    if (1)
> >>>      xen_invalidate_map_cache_entry_single(mapcache_grants, buffer);
> >>>  }
> >>>
> >>> @@ -700,6 +702,7 @@ void xen_invalidate_map_cache(void)
> >>>      bdrv_drain_all();
> >>>
> >>>      xen_invalidate_map_cache_single(mapcache);
> >>> +    if (0)
> >>>      xen_invalidate_map_cache_single(mapcache_grants);
> >>>  }
> >>>
> >>> I did the testing with libvirt, the domU.cfg equivalent looks like this:
> >>> maxmem = 4096
> >>> memory = 2048
> >>> maxvcpus = 4
> >>> vcpus = 2
> >>> pae = 1
> >>> acpi = 1
> >>> apic = 1
> >>> viridian = 0
> >>> rtc_timeoffset = 0
> >>> localtime = 0
> >>> on_poweroff = "destroy"
> >>> on_reboot = "destroy"
> >>> on_crash = "destroy"
> >>> device_model_override = "/usr/lib64/qemu-9.1/bin/qemu-system-i386"
> >>> sdl = 0
> >>> vnc = 1
> >>> vncunused = 1
> >>> vnclisten = "127.0.0.1"
> >>> vif = [ "mac=52:54:01:23:63:29,bridge=br0,script=vif-bridge" ]
> >>> parallel = "none"
> >>> serial = "pty"
> >>> builder = "hvm"
> >>> kernel = "/bug1236329/linux"
> >>> ramdisk = "/bug1236329/initrd"
> >>> cmdline = "console=ttyS0,115200n8 quiet ignore_loglevel"
> >>> boot = "c"
> >>> disk = [
> >>> "format=qcow2,vdev=hda,access=rw,backendtype=qdisk,target=/bug1236329/sles12sp5.qcow2"
> >>> ]
> >>> usb = 1
> >>> usbdevice = "tablet"
> >>>
> >>> Any idea what can be done to restore boot times?
> >>
> >> A guess is that it's taking a long time to walk the grants mapcache
> >> when invalidating (in QEMU), despite it being unused and empty. We
> >> could try to find a way to keep track of usage and do nothing when
> >> invalidating an empty/unused cache.
> >
> > If mapcache_grants is unused and empty, the call to
> > xen_invalidate_map_cache_single(mapcache_grants) should be really fast?
> >
> > I think it might be the opposite: mapcache_grants is utilized, so going
> > through all the mappings in xen_invalidate_map_cache_single takes time.
> >
> > However, I wonder if it is really needed. At least in the PoD case, the
> > reason for the IOREQ_TYPE_INVALIDATE request is that the underlying DomU
> > memory has changed. But that doesn't affect the grant mappings, because
> > those are mappings of other domains' memory.
> >
> > So I am thinking whether we should remove the call to
> > xen_invalidate_map_cache_single(mapcache_grants)?
> >
> > Adding x86 maintainers: do we need to flush grant table mappings for the
> > PV backends running in QEMU when Xen issues an IOREQ_TYPE_INVALIDATE
> > request to QEMU?
>
> Judging from two of the three uses of ioreq_request_mapcache_invalidate()
> in x86'es p2m.c, I'd say no. The 3rd use there is unconditional, but
> maybe wrongly so.
>
> However, the answer also depends on what qemu does when encountering a
> granted page. Would it enter it into its mapcache? Can it even access it?
> (If it can't, how does emulated I/O work to such pages? If it can, isn't
> this a violation of the grant's permissions, as it's targeted at solely
> the actual HVM domain named in the grant?)

QEMU will only map granted pages if the guest explicitly asks QEMU to DMA
into granted pages. Guests first need to grant pages to the domain running
QEMU, then pass a cookie/address to QEMU with the grant id. QEMU will then
map the granted memory, enter it into a dedicated mapcache (mapcache_grants)
and then emulate device DMA to/from the grant.

So QEMU will only map grants intended for QEMU DMA devices, not any grant
to any domain.

Details:
https://github.com/torvalds/linux/blob/master/drivers/xen/grant-dma-ops.c

Cheers,
Edgar
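
For background on the cookie/address mentioned above: in the linked
grant-dma-ops.c, the guest marks grant-backed DMA addresses with a high bit
and places the grant reference in the remaining page-number bits. The
standalone sketch below only models that encoding; the constant values are
assumptions based on a reading of the linked file and should be checked
against the actual source.

/* Sketch of the guest-side grant "cookie" address scheme, modeled on
 * drivers/xen/grant-dma-ops.c. The top bit marks the address as a grant
 * reference rather than ordinary guest RAM; the grant ref sits in the
 * page-number bits. Constants are assumptions, not kernel definitions. */
#include <stdint.h>
#include <stdio.h>

#define GRANT_DMA_ADDR_OFF  (1ULL << 63)   /* assumed marker bit */
#define PAGE_SHIFT          12             /* 4 KiB pages assumed */

static uint64_t grant_to_dma(uint32_t grant_ref)
{
    return GRANT_DMA_ADDR_OFF | ((uint64_t)grant_ref << PAGE_SHIFT);
}

static uint32_t dma_to_grant(uint64_t dma_addr)
{
    return (uint32_t)((dma_addr & ~GRANT_DMA_ADDR_OFF) >> PAGE_SHIFT);
}

int main(void)
{
    uint64_t cookie = grant_to_dma(42);

    printf("grant 42 -> dma %#llx -> grant %u\n",
           (unsigned long long)cookie, dma_to_grant(cookie));
    return 0;
}

A device model that sees such an address can strip the marker bit and
recover the grant reference to map, which is roughly how the dedicated
grants mapcache described above gets populated.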
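
On the earlier idea of keeping track of usage so that invalidating an
empty/unused grants cache becomes a no-op: below is a minimal,
self-contained sketch of that approach using a toy cache structure. It is
not QEMU code; every name in it (GrantCache, grant_cache_invalidate_all,
and so on) is hypothetical.

/* Toy cache that counts live entries so a global invalidation can bail out
 * early when nothing was ever mapped. Illustrative only. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct GrantEntry {
    uint64_t addr;
    struct GrantEntry *next;
} GrantEntry;

typedef struct GrantCache {
    GrantEntry *entries;   /* list of live mappings */
    size_t nr_entries;     /* updated on map/unmap */
} GrantCache;

static void grant_cache_map(GrantCache *gc, uint64_t addr)
{
    GrantEntry *e = calloc(1, sizeof(*e));

    e->addr = addr;
    e->next = gc->entries;
    gc->entries = e;
    gc->nr_entries++;
}

static void grant_cache_invalidate_all(GrantCache *gc)
{
    if (gc->nr_entries == 0) {
        return;            /* unused cache: skip the walk entirely */
    }
    while (gc->entries) {
        GrantEntry *e = gc->entries;

        gc->entries = e->next;
        free(e);
        gc->nr_entries--;
    }
}

int main(void)
{
    GrantCache gc = { 0 };

    grant_cache_invalidate_all(&gc);   /* cheap no-op while unused */
    grant_cache_map(&gc, 0x1000);
    grant_cache_invalidate_all(&gc);   /* walks and frees the one entry */
    printf("remaining entries: %zu\n", gc.nr_entries);
    return 0;
}

The only point is the early return: a per-cache entry counter would make
the IOREQ-triggered invalidation cheap when the grants cache has never
been used, without changing behaviour when it has.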