Re: [Xen-devel] [Bug] Intel RMRR support with upstream Qemu
On 25/07/17 17:40, Alexey G wrote:
> On Mon, 24 Jul 2017 21:39:08 +0100
> Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> wrote:
>>> But, the problem is that the overall MMIO hole(s) requirements are not
>>> known exactly at the time the HVM domain is being created. Some PCI
>>> devices will be emulated, some will be merely passed through, and yet
>>> there will be some RMRR ranges. libxl can't know all this stuff - some
>>> comes from the host, some comes from the DM. So the actual MMIO
>>> requirements are known to hvmloader at PCI bus enumeration time.
>>
>> IMO hvmloader shouldn't really be allowed to relocate memory under any
>> conditions. As Andrew said, it's much easier to provision the hole
>> statically in libxl during the domain construction process, and it
>> doesn't really compromise any functionality. Having one more entity
>> responsible for the guest memory layout only makes things more convoluted.
>
> If moving most tasks of hvmloader to libxl is a planned feature in Citrix,
> please let it be discussed on xen-devel first as it may affect many
> people... and not all of them might be happy. :)

Everything always goes through the mailing list.

> (tons of IMO and TL;DR ahead, be warned)
>
> Moving PCI BAR allocation from the guest side to libxl is a controversial
> step. In fact, it may be architecturally the wrong way. There are
> properties and areas of responsibility. Among the primary responsibilities
> of a guest's firmware are PCI BAR and MMIO hole size allocation. That's
> the guest's territory. The guest relocates PCI BARs (and not just the BIOS
> is able to do this); the guest firmware relocates the MMIO hole base for
> them. On a real system, all tasks like PCI BAR allocation, remapping part
> of RAM above 4G, etc. would be done by the system BIOS. In our case some
> of SeaBIOS/OVMF's responsibilities were offloaded to hvmloader, like PCI
> BAR allocation, sizing the MMIO hole(s) for them and generating ACPI
> tables. And that's OK, as hvmloader can be considered merely a
> 'supplemental' firmware performing some of SeaBIOS/OVMF's tasks before
> passing control to them. This solution has some architectural logic at
> least and doesn't look bad.

libxl is also a part of the firmware, so to speak. It's incorrect to think
that only hvmloader and BIOS images are "proper" firmware.

> On the other hand, moving the PCI hole calculation to libxl just to let
> Xen/libxl know what the MMIO size value is might be a bad idea. Aside from
> some code duplication, straying too far from the real HW paths, or
> breaking existing (or future) interfaces, this might have other negative
> consequences. E.g. who will initialize the guest's ACPI tables if only
> libxl knows the memory layout? Some new interface between libxl and
> hvmloader just to let the latter know what values to write to the ACPI
> tables being created? Or will libxl initialize the guest's ACPI tables as
> well (another of the guest's internal tasks)? Similar concerns apply to
> the construction of the guest's final E820.

The information is not confined to libxl - it's passed to hvmloader, which
can finish the tasks libxl couldn't. Although ACPI tables could be
harmlessly initialized inside libxl as well (see the PVH implementation).

> Another thing is that handling ioreq/PT MMIO ranges is somewhat a property
> of the device model (at least for now). Right now it's QEMU that traps PCI
> BAR accesses and tells Xen how to handle specific ranges of MMIO space. If
> QEMU already tells Xen which ranges should be passed through or trapped,
> it can tell Xen the current overall MMIO limits as well... or handle these
> limits itself - if the MMIO hole range check is all that is required to
> avoid misusing MMIO space, this check can easily be implemented in QEMU,
> provided that QEMU knows what the memory/MMIO layout is. There is a lot
> of implementation freedom in where to place restrictions and checks,
> Xen or QEMU.
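[Editorial note: a minimal sketch of the kind of QEMU-side range check
described in the paragraph above, added here for illustration only. It
assumes QEMU has somehow been told the current hole boundaries; the struct,
function and values below are hypothetical, not an existing Xen/QEMU
interface.]

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the guest MMIO holes as known to the device model. */
struct mmio_holes {
    uint64_t low_start;   /* below-4G hole base, e.g. 0xf0000000 */
    uint64_t low_end;     /* exclusive, e.g. 0xfc000000           */
    uint64_t high_start;  /* above-4G hole base, 0 if no high hole */
    uint64_t high_end;    /* exclusive                             */
};

/* Would this BAR placement land entirely inside one of the holes? */
static bool bar_fits_mmio_hole(const struct mmio_holes *h,
                               uint64_t addr, uint64_t size)
{
    uint64_t end = addr + size;

    if (size == 0 || end < addr)          /* empty BAR or wrap-around */
        return false;

    if (addr >= h->low_start && end <= h->low_end)
        return true;

    /* The i440 machine has no high hole at all; Q35 could expose one. */
    return h->high_start != 0 && addr >= h->high_start && end <= h->high_end;
}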
> Strictly speaking, the MMIO hole itself can be considered a property of
> the emulated machine and may have implementation differences for
> different emulated chipsets. For example, the real i440's northbridge has
> no notion of a high MMIO hole at all.
>
> We already have a sort of interface between hvmloader and QEMU -
> hvmloader has to do basic initialization of some emulated chipset
> registers (and this depends on the machine). Providing additional
> handling for a few other registers (TOM/TOLUD/etc.) will cost almost
> nothing, and the purpose of these registers will actually match their
> usage in real HW. This way we can use an existing, available interface
> and not stray too far from the real HW ways.
>
> I want to try this approach in the Q35 bringup patches for Xen I'm
> currently working on. I'll send these patches as an RFC and will be glad
> to receive some constructive criticism.

Sure. Static hole size provisioning doesn't prohibit its dynamic counterpart.

Igor

>>> libxl can be taught to retrieve all the missing info from QEMU, but
>>> this way it will have to perform all the grunt work of PCI BAR
>>> allocation itself - in order to calculate the real MMIO hole(s) size,
>>> one needs to take into account all PCI BAR sizes and the diversity of
>>> their alignment requirements, plus the existing gaps due to RMRR
>>> ranges... basically, libxl will need to do most of hvmloader/pci.c's
>>> job.
>>
>> The algorithm implemented in hvmloader for that is not complicated and
>> can be moved to libxl easily. What we can do is provision a hole big
>> enough to include all the initially assigned PCI devices. We can also
>> account for emulated MMIO regions if necessary. But, to be honest, it
>> doesn't really matter, since if there is not enough space in the lower
>> MMIO hole for some BARs, they can easily be relocated to the upper MMIO
>> hole by hvmloader or the guest itself (dynamically).
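[Editorial note: a simplified sketch of the sizing pass referred to above,
in the spirit of hvmloader/pci.c but not a copy of it; all names are
assumed for illustration. It walks the BARs largest-first with natural
alignment down from the top of the low hole, and counts anything that does
not fit as belonging in the high hole.]

#include <stddef.h>
#include <stdint.h>

struct bar { uint64_t size; };            /* BAR size, a power of two */

/*
 * Walk the BARs (assumed sorted largest-first so each allocation stays
 * naturally aligned) downwards from the top of the low hole. Returns the
 * lowest base consumed, i.e. the low MMIO hole base that would be needed;
 * BARs that would drop below low_hole_min are counted towards the high
 * hole instead.
 */
static uint64_t size_low_mmio_hole(const struct bar *bars, size_t nr,
                                   uint64_t pci_mem_end,   /* e.g. 0xfc000000 */
                                   uint64_t low_hole_min,  /* e.g. 0xf0000000 */
                                   uint64_t *high_hole_total)
{
    uint64_t base = pci_mem_end;

    *high_hole_total = 0;

    for (size_t i = 0; i < nr; i++) {
        uint64_t next = (base - bars[i].size) & ~(bars[i].size - 1);

        if (next < low_hole_min)          /* no room below 4G for this BAR */
            *high_hole_total += bars[i].size;
        else
            base = next;                  /* place it and keep going down */
    }

    return base;
}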