Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring
On Sat, Mar 24, 2018 at 08:32:44AM +1000, Alexey G wrote:
> On Fri, 23 Mar 2018 13:57:11 +0000
> Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote:
> [...]
> >> Few related thoughts:
> >>
> >> 1. MMCONFIG address is chipset-specific. On Q35 it's a PCIEXBAR, on
> >> other x86 systems it may be HECBASE or else. So we can assume it is
> >> bound to the emulated machine
> >
> >Xen emulates the machine so it should be emulating PCIEXBAR.
>
> Actually, Xen currently emulates only few devices. Others are
> provided by QEMU, that's the problem.
>
> >> 2. We rely on QEMU to emulate different machines for us.
> >We should not be. It's a historical artefact that we rely on QEMU for
> >any part of machine emulation.
>
> HVM guests need to see something more or less close to real hardware to
> run. Even if we later install PV drivers for network/storage/etc usage,
> we still need to support system firmware (SeaBIOS/OVMF) and be able to
> install any (ideally) OS which expects to be installed only on some
> real x86 hw. We also need to be ready to fallback to the emulated hw if
> eg. user will boot OS in the safe mode.

I think Paul means that Xen should be emulating the platform devices
and part of the southbridge/northbridge functionality, but not all the
emulated devices provided to a guest.

>
> It all depends on what you mean by not relying on QEMU for any part
> of machine emulation.
>
> There is a number of mandatory devices which should be provided for a
> typical x86 system. Xen emulates some of them, but there is a number
> which it doesn't. Apart from "classic" devices like RTC, PIT, KBC, etc
> we need to provide at least storage and network interfaces.
>
> Windows installer won't be happy to boot from the PV storage device, it
> prefers to encounter something like AHCI (Windows 7+), ATA (for older
> OSes) or ATAPI if it is an iso cd.
> Providing emulation for the AHCI+ATA+ATAPI trio alone is a non-trivial
> task. QEMU itself provides only partial implementation of these, many
> features are unsupported. Another very useful thing to emulate is USB.
> Depending on the controller version and device classes required, this
> may be far more complex to emulate than AHCI+ATA+ATAPI combined.
>
> So, if you suggest to drop QEMU completely, it means that all this
> functionality must be replaced by own. Not that hard, but still a lot
> of effort.
>
>
> OTOH, if you mean stripping QEMU of general PCI bus control and
> replacing its emulated NB/SB with Xen-owned -- well, in theory it
> should be possible, with patches on QEMU side.
>
> In fact, the emulated chipset (NB+SB combo without supplemental devices)
> itself is a small part of required emulation. It's relatively easy to
> provide own analogs of for eg. 'mch' and 'ICH9-LPC' QEMU PCIDevice's,
> the problem is to glue all remaining parts together.
>
> I assume the final goal in this case is to have only a set of necessary
> QEMU PCIDevice's for which we will be providing I/O, MMIO and PCI conf
> trapping facilities. Only devices such as rtl8139, ich9-ahci and few
> others.
>
> Basically, this means a new, chipset-less QEMU machine type.
> Well, in theory it is possible with a bit of effort I think. The main
> question is where will be the NB/SB/PCIbus emulating part reside in
> this case.

Mostly inside of Xen. Of course the IDE/SATA/USB/Ethernet... part of
the southbridge will be emulated by a device model (ie: QEMU).

As you mention above, I also took a look and it seems like the number
of registers that we should emulate for the Q35 DRAM controller (D0:F0)
is fairly minimal based on the current QEMU implementation. We could
possibly even get away with just emulating PCIEXBAR.

> As this part must still have some privileges, it's basically
> the same decision problem as with QEMU's dwelling place -- stubdomain,
> Dom0 or else.
>
> >> 3. There are users which touch chipset-specific PCIEXBAR directly if
> >> they see a Q35 system (OVMF so far)
> >
> >And we should squash such accesses.
>
> Yes, we have that privilege (i.e. allocating all IO/MMIO bases) for
> hvmloader. OVMF should not differ in this subject to SeaBIOS.
>
> >The toolstack should be sole
> >control of the guest memory map. It should be the only building MCFG
> >so it should decide where the MMCONFIG regions go, not the firmware
> >running in guest context.
>
> HVM memory layout is another problem which needs solution BTW. I had to
> implement one for my PT goals, but it's very radical I'm afraid.
>
> Right now there are wicked issues present in handling memory layout
> between hvmloader and QEMU. They may see a different memory map, even
> with overlaps in some (depending on MMIO hole size and content) cases --
> like an attempt to place MMIO BAR over memory which is used for vram
> backing storage by QEMU, causing variety of issues like emulated I/O
> errors (with a storage device) during guest boot attempt.
>
> Regarding control of the guest memory map in the toolstack only... The
> problem is, only firmware can see a final memory map at the moment.
> And only the device model knows about invisible "service" ranges for
> emulated devices, like the LFB content (aka "VRAM") when it is not
> mapped to a guest.
>
> In order to calculate the final memory/MMIO hole split, we need to know:
>
> 1) all PCI devices on a PCI bus. At the moment Xen contributes only
> devices like PT to the final PCI bus (via QMP device_add). Others are
> QEMU ones. Even Xen platform PCI device relies on QEMU emulation.
> Non-QEMU device emulators are another source of virtual PCI devices I
> guess.
>
> 2) all chipset-specific emulated MMIO ranges. MMCONFIG is one of them
> and largest (up to 256MB for a segment). There are few other smaller
> ranges, eg. Root Complex registers. All these ranges depend on the
> emulated chipset.
>
> 3) all reserved memory ranges (this one what toolstack already knows)
>
> 4) all "service" guest memory ranges like backing storage for VRAM in
> QEMU. Emulated Option ROMs should belong here too, but IIRC xen-hvm.c
> either intentionally or by mistake handles them as emulated ranges
> currently.
>
> If we miss any of these (like what are the chipset-specific ranges and
> their size alignment requirements) -- we're in trouble. But, if we know
> *all* of these, we can pre-calculate the MMIO hole size. Although this
> is a bit fragile to do from the toolstack because both sizing algo in
> the toolstack and MMIO BAR allocation code in the firmware (hvmloader)
> must have their algorithms synchronized, because it is possible to
> stuff BARs to MMIO hole in different ways, especially when PCI-PCI
> bridges will appear on the scene. Both need to do it in a consistent way
> (resulting in similar set of gaps between allocated BARs), otherwise
> expected MMIO hole sizes won't match, which means we may need to
> relocate MMIO BARs to the high MMIO hole and this in turn may lead to
> those overlaps with QEMU memories.

I agree that the current memory layout management (or the lack of it)
is concerning. Although related, I think this should be tackled as a
separate issue from the chipset one. Since you already posted the Q35
series, I would attempt to get that done first before jumping into the
memory layout one.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
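
To make the point about synchronized BAR placement concrete, here is a minimal sketch of why two allocators can disagree on the MMIO hole size for the same set of BARs. It is not hvmloader's or the toolstack's code: the helper names (align_up, mmio_span, sort_largest_first) are hypothetical, and it assumes every BAR size is a power of two and each BAR is placed at an address naturally aligned to its size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: bytes consumed when the BARs are placed in the
 * given order starting at base, each naturally aligned to its own size.
 * Alignment gaps between BARs are counted as part of the span.
 */
static uint64_t mmio_span(const uint64_t *bar_sizes, size_t n, uint64_t base)
{
    uint64_t next = base;

    for (size_t i = 0; i < n; i++) {
        next = align_up(next, bar_sizes[i]);
        next += bar_sizes[i];
    }
    return next - base;
}

/* qsort comparator: larger sizes first. */
static int cmp_desc(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x < y) - (x > y);
}

/* Hypothetical helper: order BARs largest first before placing them. */
static void sort_largest_first(uint64_t *bar_sizes, size_t n)
{
    qsort(bar_sizes, n, sizeof(*bar_sizes), cmp_desc);
}
```

With BAR sizes of 4KB, 1MB and 64KB placed in that order from a 1MB-aligned base, mmio_span() reports 0x210000 bytes because of the gap in front of the 1MB BAR; after sort_largest_first() the same BARs fit in 0x111000 bytes. A sizing pass and an allocation pass that order BARs differently would therefore expect different hole sizes, which is the mismatch described in the message.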