Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring
On Fri, 23 Mar 2018 13:57:11 +0000
Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote:
[...]
>> Few related thoughts:
>>
>> 1. MMCONFIG address is chipset-specific. On Q35 it's a PCIEXBAR, on
>> other x86 systems it may be HECBASE or else. So we can assume it is
>> bound to the emulated machine
>
>Xen emulates the machine so it should be emulating PCIEXBAR.

Actually, Xen currently emulates only a few devices. The others are
provided by QEMU, and that's the problem.

>> 2. We rely on QEMU to emulate different machines for us.

>We should not be. It's a historical artefact that we rely on QEMU for
>any part of machine emulation.

HVM guests need to see something more or less close to real hardware
in order to run. Even if we later install PV drivers for
network/storage/etc., we still need to support system firmware
(SeaBIOS/OVMF) and be able to install (ideally) any OS which expects
to be installed only on real x86 hardware. We also need to be ready to
fall back to the emulated hardware if, e.g., the user boots the OS in
safe mode.

It all depends on what you mean by not relying on QEMU for any part of
machine emulation. There are a number of mandatory devices which must
be provided for a typical x86 system. Xen emulates some of them, but
there are a number which it doesn't. Apart from "classic" devices like
RTC, PIT, KBC, etc. we need to provide at least storage and network
interfaces. The Windows installer won't be happy to boot from a PV
storage device; it expects to encounter something like AHCI
(Windows 7+), ATA (for older OSes) or ATAPI if it is an ISO CD image.
Providing emulation for the AHCI+ATA+ATAPI trio alone is a non-trivial
task. QEMU itself provides only a partial implementation of these;
many features are unsupported.

Another very useful thing to emulate is USB. Depending on the
controller version and device classes required, this may be far more
complex to emulate than AHCI+ATA+ATAPI combined.
So, if you suggest dropping QEMU completely, it means that all this
functionality must be replaced by our own. Not that hard, but still a
lot of effort. OTOH, if you mean stripping QEMU of general PCI bus
control and replacing its emulated NB/SB with Xen-owned ones -- well,
in theory it should be possible, with patches on the QEMU side.

In fact, the emulated chipset (NB+SB combo without supplemental
devices) itself is a small part of the required emulation. It's
relatively easy to provide our own analogs of e.g. the 'mch' and
'ICH9-LPC' QEMU PCIDevice's; the problem is to glue all the remaining
parts together.

I assume the final goal in this case is to have only a set of
necessary QEMU PCIDevice's for which we will be providing I/O, MMIO
and PCI conf trapping facilities: only devices such as rtl8139,
ich9-ahci and a few others.

Basically, this means a new, chipset-less QEMU machine type. Well, in
theory it is possible with a bit of effort, I think. The main question
is where the NB/SB/PCI bus emulating part will reside in this case. As
this part must still have some privileges, it's basically the same
decision problem as with QEMU's dwelling place -- stubdomain, Dom0 or
something else.

>> 3. There are users which touch chipset-specific PCIEXBAR directly if
>> they see a Q35 system (OVMF so far)
>
>And we should squash such accesses.
>

Yes, we have that privilege (i.e. allocating all IO/MMIO bases) for
hvmloader. OVMF should not differ from SeaBIOS in this respect.

>The toolstack should be sole
>control of the guest memory map. It should be the only building MCFG
>so it should decide where the MMCONFIG regions go, not the firmware
>running in guest context.

HVM memory layout is another problem which needs a solution, BTW. I
had to implement one for my PT goals, but it's very radical, I'm
afraid.

Right now there are wicked issues in handling the memory layout
between hvmloader and QEMU.
They may see different memory maps, in some cases (depending on MMIO
hole size and content) even with overlaps -- such as an attempt to
place an MMIO BAR over memory which QEMU uses as VRAM backing storage,
causing a variety of issues like emulated I/O errors (with a storage
device) during a guest boot attempt.

Regarding control of the guest memory map in the toolstack only... The
problem is that only the firmware can see the final memory map at the
moment. And only the device model knows about invisible "service"
ranges for emulated devices, like the LFB content (aka "VRAM") when it
is not mapped to a guest.

In order to calculate the final memory/MMIO hole split, we need to
know:

1) all PCI devices on the PCI bus. At the moment Xen contributes only
   devices like PT ones to the final PCI bus (via QMP device_add). The
   others are QEMU ones. Even the Xen platform PCI device relies on
   QEMU emulation. Non-QEMU device emulators are another source of
   virtual PCI devices, I guess.

2) all chipset-specific emulated MMIO ranges. MMCONFIG is one of them
   and the largest (up to 256MB for a segment). There are a few other
   smaller ranges, e.g. Root Complex registers. All these ranges
   depend on the emulated chipset.

3) all reserved memory ranges (this is the one the toolstack already
   knows)

4) all "service" guest memory ranges like backing storage for VRAM in
   QEMU. Emulated Option ROMs should belong here too, but IIRC
   xen-hvm.c either intentionally or by mistake handles them as
   emulated ranges currently.

If we miss any of these (like what the chipset-specific ranges are and
their size and alignment requirements) -- we're in trouble. But if we
know *all* of these, we can pre-calculate the MMIO hole size.

This is a bit fragile to do from the toolstack, though, because the
sizing algorithm in the toolstack and the MMIO BAR allocation code in
the firmware (hvmloader) must be kept synchronized: it is possible to
stuff BARs into the MMIO hole in different ways, especially once
PCI-PCI bridges appear on the scene.
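
To make the fragility concrete, here is a minimal sketch of how the
same set of ranges can consume different amounts of MMIO hole space
depending on placement order. It is only an illustration, not
hvmloader's or the toolstack's actual code: hole_span(), BASE and the
BAR sizes are hypothetical, and the only assumptions are that each
range is a power of two in size and must be naturally aligned to that
size. The 256MB MMCONFIG figure follows from 256 buses x 32 devices x
8 functions x 4KB of config space each.

```python
MB = 1 << 20

# One MMCONFIG segment: 256 buses * 32 devices * 8 functions * 4 KiB each.
MMCONFIG_SIZE = 256 * 32 * 8 * 4096

# Hypothetical hole base, chosen to be 256 MiB aligned for the example.
BASE = 0x80000000


def align_up(addr, alignment):
    """Round addr up to the next multiple of a power-of-two alignment."""
    return (addr + alignment - 1) & ~(alignment - 1)


def hole_span(sizes, base):
    """Place ranges in the given order, each naturally aligned to its
    own size, and return how many bytes of the hole they consume."""
    addr = base
    for size in sizes:
        addr = align_up(addr, size) + size
    return addr - base


# Illustrative sizes only: three device BARs plus the MMCONFIG area.
ranges = [1 * MB, 16 * MB, 4 * MB, MMCONFIG_SIZE]

in_given_order = hole_span(ranges, BASE)
largest_first = hole_span(sorted(ranges, reverse=True), BASE)

print(in_given_order // MB, "MB vs", largest_first // MB, "MB")
```

With these illustrative sizes, placing the ranges in the listed order
consumes 512MB because of alignment gaps, while placing the largest
first consumes 277MB, which is exactly the sum of the sizes. A sizing
pass and an allocation pass that disagree on ordering would therefore
disagree on the required hole size.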
Both the toolstack and hvmloader need to do it in a consistent way
(resulting in a similar set of gaps between allocated BARs), otherwise
the expected MMIO hole sizes won't match, which means we may need to
relocate MMIO BARs to the high MMIO hole, and this in turn may lead to
those overlaps with QEMU's memory regions.

>> Seems like we're pretty limited in freedom of choice in this
>> conditions, I'm afraid.
>
>I don't think so. We're only limited if we use QEMU's Q35 emulation
>and what I'm saying is that we should not be doing that (nor should be
>we be using it to emulate any part of the PIIX today).
>
>  Paul

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel