Re: [Xen-devel] PVH CPU hotplug design document
On 01/12/2017 02:00 PM, Andrew Cooper wrote:
> On 12/01/17 12:13, Roger Pau Monné wrote:
>> ## Proposed solution using the STAO
>>
>> The general idea of this method is to use the STAO in order to hide the
>> pCPUs from the hardware domain, and provide processor objects for vCPUs
>> in an extra SSDT table.
>>
>> This method requires one change to the STAO, in order to be able to
>> notify the hardware domain of which processors found in ACPI tables are
>> pCPUs. The description of the new STAO field is as follows:
>>
>> |   Field            | Byte Length | Byte Offset |     Description          |
>> |--------------------|:-----------:|:-----------:|--------------------------|
>> | Processor List [n] |      -      |      -      | A list of ACPI numbers,  |
>> |                    |             |             | where each number is the |
>> |                    |             |             | Processor UID of a       |
>> |                    |             |             | physical CPU, and should |
>> |                    |             |             | be treated specially by  |
>> |                    |             |             | the OSPM                 |
>>
>> The list of UIDs in this new field would be matched against the ACPI
>> Processor UID field found in local/x2 APIC MADT structs and Processor
>> objects in the ACPI namespace, and the OSPM should either ignore those
>> objects, or in case it implements pCPU hotplug, it should notify Xen of
>> changes to these objects.
>>
>> The contents of the MADT provided to the hardware domain are also going
>> to be different from the contents of the MADT as found in native ACPI.
>> The local/x2 APIC entries for all the pCPUs are going to be marked as
>> disabled.
>>
>> Extra entries are going to be added for each vCPU available to the
>> hardware domain, up to the maximum number of supported vCPUs. Note that
>> supported vCPUs might be different than enabled vCPUs, so it's possible
>> that some of these entries are also going to be marked as disabled. The
>> entries for vCPUs on the MADT are going to use a processor local x2 APIC
>> structure, and the ACPI processor ID of the first vCPU is going to be
>> UINT32_MAX - HVM_MAX_VCPUS, in order to avoid clashes with IDs of pCPUs.
>
> This is slightly problematic. There is no restriction (so far as I am
> aware) on which ACPI IDs the firmware picks for its objects.
> They need not be consecutive, logical, or start from 0. If STAO is being
> extended to list the IDs of the physical processor objects, we should go
> one step further and explicitly list the IDs of the virtual processor
> objects. This leaves us flexibility if we have to avoid awkward firmware
> ID layouts.

I don't think I understand how we'd use a vCPU list in STAO. Can you
explain this?

> It is also worth stating that this puts an upper limit on nr_pcpus +
> nr_dom0_vcpus (but 4 billion processors really ought to be enough for
> anyone...)
>
>> In order to be able to perform vCPU hotplug, the vCPUs must have an ACPI
>> processor object in the ACPI namespace, so that the OSPM can request
>> notifications and get the value of the \_STA and \_MAT methods. This can
>> be problematic because Xen doesn't know the ACPI name of the other
>> processor objects, so blindly adding new ones can create namespace
>> clashes.
>>
>> This can be solved by using a different ACPI name in order to describe
>> vCPUs in the ACPI namespace. Most hardware vendors tend to use CPU or PR
>> prefixes for the processor objects, so using a 'VP' (ie: Virtual
>> Processor) prefix should prevent clashes.
>
> One system I have to hand (with more than 255 pcpus) uses Cxxx
>
> To avoid namespace collisions, I can't see any option but to parse the
> DSDT/SSDTs to at least confirm that VPxx is available to use.

You are talking about Xen doing this, right? Meaning that we'd need to add
an AML parser to the hypervisor? If we do that, I wonder whether this will
also help us to deal with _PSS and _CST, which we now have to pass down
from dom0.

>> A Xen GPE device block will be used in order to deliver events related
>> to the vCPUs available to the guest, since Xen doesn't know if there are
>> any bits available in the native GPEs. A SCI interrupt will be injected
>> into the guest in order to trigger the event.
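To make the ID arithmetic discussed above concrete, here is a minimal sketch in Python (chosen for brevity, this is not Xen code). It assumes HVM_MAX_VCPUS is 128, which is what the _UID of 4294967167 given to VP00 in the proposed SSDT implies. The function names are hypothetical. It only shows how the proposed vCPU ID range is derived and how a clash with a firmware-chosen pCPU ID could be detected once the pCPU IDs are known:

```python
UINT32_MAX = 0xFFFFFFFF
HVM_MAX_VCPUS = 128  # assumption: implied by _UID 4294967167 for VP00


def vcpu_acpi_ids(nr_vcpus, max_vcpus=HVM_MAX_VCPUS):
    """Return the ACPI processor IDs the proposal would assign to vCPUs.

    The first vCPU gets UINT32_MAX - max_vcpus and the rest follow
    consecutively, as described in the design document.
    """
    if not 0 <= nr_vcpus <= max_vcpus:
        raise ValueError("nr_vcpus must be between 0 and max_vcpus")
    base = UINT32_MAX - max_vcpus
    return [base + i for i in range(nr_vcpus)]


def find_clashes(pcpu_ids, vcpu_ids):
    """Return the firmware-chosen pCPU IDs that collide with vCPU IDs.

    pcpu_ids need not be consecutive or start from 0, so the check is a
    plain set intersection rather than a range comparison.
    """
    return sorted(set(pcpu_ids) & set(vcpu_ids))
```

For two vCPUs this yields 4294967167 and 4294967168, matching the _UID values in the proposed SSDT. A non-empty result from find_clashes() would be the case where a fixed base is not enough and the vCPU IDs would have to be chosen differently, for example by listing them explicitly as suggested above.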
>> The following snippet is a representation of the ASL SSDT code that is
>> proposed for the hardware domain:
>>
>>     DefinitionBlock ("SSDT.aml", "SSDT", 5, "Xen", "HVM", 0)
>>     {
>>         Scope (\_SB)
>>         {
>>            OperationRegion(XEN, SystemMemory, 0xDEADBEEF, 40)
>>            Field(XEN, ByteAcc, NoLock, Preserve) {
>>                NCPU, 16, /* Number of vCPUs */
>>                MSUA, 32, /* MADT checksum address */
>>                MAPA, 32, /* MADT LAPIC0 address */
>>            }
>>         }
>>         Scope ( \_SB ) {
>>             OperationRegion ( MSUM, SystemMemory, \_SB.MSUA, 1 )
>>             Field ( MSUM, ByteAcc, NoLock, Preserve ) {
>>                 MSU, 8
>>             }
>>             Method ( PMAT, 2 ) {
>>                 If ( LLess(Arg0, NCPU) ) {
>>                     Return ( ToBuffer(Arg1) )
>>                 }
>>                 Return ( Buffer() {0, 8, 0xff, 0xff, 0, 0, 0, 0} )
>>             }
>>             Processor ( VP00, 0, 0x0000b010, 0x06 ) {
>>                 Name ( _HID, "ACPI0007" )
>>                 Name ( _UID, 4294967167 )
>>                 OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 0), 8 )
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     MAT, 64
>>                 }
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     Offset(4),
>>                     FLG, 1
>>                 }
>>                 Method ( _MAT, 0 ) {
>>                     Return ( ToBuffer(MAT) )
>>                 }
>>                 Method ( _STA ) {
>>                     If ( FLG ) {
>>                         Return ( 0xF )
>>                     }
>>                     Return ( 0x0 )
>>                 }
>>                 Method ( _EJ0, 1, NotSerialized ) {
>>                     Sleep ( 0xC8 )
>>                 }
>>             }
>>             Processor ( VP01, 1, 0x0000b010, 0x06 ) {
>>                 Name ( _HID, "ACPI0007" )
>>                 Name ( _UID, 4294967168 )
>>                 OperationRegion ( MATR, SystemMemory, Add(\_SB.MAPA, 8), 8 )
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     MAT, 64
>>                 }
>>                 Field ( MATR, ByteAcc, NoLock, Preserve ) {
>>                     Offset(4),
>>                     FLG, 1
>>                 }
>>                 Method ( _MAT, 0 ) {
>>                     Return ( PMAT (1, MAT) )
>>                 }
>>                 Method ( _STA ) {
>>                     If ( LLess(1, \_SB.NCPU) ) {
>>                         If ( FLG ) {
>>                             Return ( 0xF )
>>                         }
>>                     }
>>                     Return ( 0x0 )
>>                 }
>>                 Method ( _EJ0, 1, NotSerialized ) {
>>                     Sleep ( 0xC8 )
>>                 }
>>             }
>>             OperationRegion ( PRST, SystemIO, 0xaf00, 1 )
>
> This also has a chance of collision, both with the system ACPI
> controller, and also with PCIe devices advertising IO-BARs. (All
> graphics cards ever have IO-BARs, because Windows refuses to bind a
> graphics driver to a PCI graphics device if the PCI device doesn't have
> at least one IO-BAR.
> Because PCIe requires 4k alignment on the upstream bridge IO-windows,
> there is a surprisingly low limit on the number of graphics cards you
> can put in a server and have functioning to Windows' satisfaction.)
>
> As with the other risks of collisions, Xen is going to have to search
> the system to find a free area to use.

I am pretty ignorant about AML but is it possible to have AML dynamically
determine the address? Or is it a compile-time value?

-boris

>>             Field ( PRST, ByteAcc, NoLock, Preserve ) {
>>                 PRS, 2
>>             }
>>             Method ( PRSC, 0 ) {
>>                 Store ( ToBuffer(PRS), Local0 )
>>                 Store ( DerefOf(Index(Local0, 0)), Local1 )
>>                 And ( Local1, 1, Local2 )
>>                 If ( LNotEqual(Local2, \_SB.VP00.FLG) ) {
>>                     Store ( Local2, \_SB.VP00.FLG )
>>                     If ( LEqual(Local2, 1) ) {
>>                         Notify ( VP00, 1 )
>>                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                     Else {
>>                         Notify ( VP00, 3 )
>>                         Add ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                 }
>>                 ShiftRight ( Local1, 1, Local1 )
>>                 And ( Local1, 1, Local2 )
>>                 If ( LNotEqual(Local2, \_SB.VP01.FLG) ) {
>>                     Store ( Local2, \_SB.VP01.FLG )
>>                     If ( LEqual(Local2, 1) ) {
>>                         Notify ( VP01, 1 )
>>                         Subtract ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                     Else {
>>                         Notify ( VP01, 3 )
>>                         Add ( \_SB.MSU, 1, \_SB.MSU )
>>                     }
>>                 }
>>                 Return ( One )
>>             }
>>         }
>>         Device ( \_SB.GPEX ) {
>>             Name ( _HID, "ACPI0006" )
>>             Name ( _UID, "XENGPE" )
>>             Name ( _CRS, ResourceTemplate() {
>>                 IO (Decode16, 0xafe0 , 0xafe0, 0x00, 0x4)
>>             } )
>>             Method ( _E02 ) {
>>                 \_SB.PRSC ()
>>             }
>>         }
>>     }
>>
>> Since the position of the XEN data memory area is not known, the
>> hypervisor will have to replace the address 0xdeadbeef with the actual
>> memory address where this structure has been copied. This will involve
>> a memory search of the AML code resulting from the compilation of the
>> above ASL snippet.
>
> This is also slightly risky. If we need to do this, can we get a
> relocation list from the compiled table from iasl?
>
> ~Andrew
>
>> In order to implement this, the hypervisor build is going to use part
>> of libacpi and the iasl compiler.
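As an illustration of the placeholder replacement described above, here is a minimal sketch in Python (not the proposed hypervisor code). All function names are hypothetical. It relies on two ACPI facts: a 32-bit integer constant in AML is encoded as the DWordPrefix byte 0x0C followed by the value in little-endian order, and the checksum byte sits at offset 9 of the table header so that all bytes of the table sum to zero modulo 256. It refuses to patch unless the placeholder appears exactly once, which is one way to reduce the risk of a blind memory search; it is not a substitute for a relocation list:

```python
import struct

PLACEHOLDER = 0xDEADBEEF   # address used in the ASL source
DWORD_PREFIX = b"\x0c"     # AML encoding prefix for a 32-bit constant
CHECKSUM_OFFSET = 9        # checksum byte in the ACPI table header


def acpi_checksum(table):
    """Return the checksum byte that makes the table sum to 0 mod 256."""
    total = sum(table) - table[CHECKSUM_OFFSET]
    return (-total) & 0xFF


def patch_placeholder(aml, address):
    """Replace the single 0xDEADBEEF constant with the real address.

    Raises ValueError if the placeholder is missing or appears more than
    once. struct.error is raised if the address does not fit in 32 bits.
    """
    needle = DWORD_PREFIX + struct.pack("<I", PLACEHOLDER)
    first = aml.find(needle)
    if first < 0:
        raise ValueError("placeholder not found in AML")
    if aml.find(needle, first + 1) >= 0:
        raise ValueError("placeholder appears more than once")

    patched = bytearray(aml)
    # Skip the one-byte prefix and overwrite only the 4 data bytes.
    patched[first + 1:first + 5] = struct.pack("<I", address)
    # The table contents changed, so the header checksum must be redone.
    patched[CHECKSUM_OFFSET] = acpi_checksum(patched)
    return bytes(patched)
```

Note that because the constant is emitted as a 32-bit value, the replacement address has to be below 4GiB with this approach.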