
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen

On 2015/8/5 18:31, Stefano Stabellini wrote:
> On Wed, 5 Aug 2015, Shannon Zhao wrote:
>> On 2015/8/4 22:37, Stefano Stabellini wrote:
>>> On Tue, 4 Aug 2015, Shannon Zhao wrote:
>>>> This document explains the design details of booting Xen with ACPI
>>>> on ARM. Parts of it may not be appropriate; any comments are
>>>> welcome.
>>> Good start!
>>>> Booting Xen itself with ACPI is similar to booting the Linux kernel,
>>>> except that Xen doesn't parse the DSDT. So I'll skip this part and
>>>> focus on how Xen prepares ACPI tables for DOM0 and how Xen passes
>>>> them to DOM0.
>>>> 1) Copy and change some EFI and ACPI tables.
>>>>   a) Copy EFI_SYSTEM_TABLE and change the value of FirmwareVendor,
>>>>      VendorGuid, VendorTable, ConfigurationTable. These changes are
>>>>      nothing special; they just assign values to these members.
>>>>   b) Create EFI_MEMORY_DESCRIPTOR table. This will add memory start and
>>>>      size information of DOM0. And DOM0 will get the memory information
>>>>      through this EFI table.
>>>>   c) Copy FADT table. Change the value of arm_boot_flags to enable PSCI
>>>>      and HVC. Set hypervisor_id to "XenVMM" in order to tell DOM0
>>>>      that it runs on the Xen hypervisor, so DOM0 can issue hypercalls
>>>>      to get information needed for booting, such as the grant table
>>>>      start address and size. Change the header revision, length and
>>>>      checksum as well.
>>>>   d) Copy GTDT table. Set non_secure_el2_interrupt and
>>>>      non_secure_el2_flags to 0 to mask EL2 timer for DOM0.
>>>>   e) Copy MADT table. Change the number of GICC entries according to
>>>>      the value of dom0_max_vcpus.
>>>>   f) Create STAO table. This is a newly added table that's used to
>>>>      define a list of ACPI namespace names that are to be ignored by the
>>>>      OSPM in DOM0. Currently we use it to tell the OSPM to ignore the
>>>>      UART defined in the SPCR table.
>>>>   g) Copy XSDT table. Add a new table entry for STAO and change other
>>>>      table's entries.
>>>>   h) Change the value of xsdt_physical_address in RSDP table.
>>>>   i) The rest of the tables are not copied or changed. They are
>>>>      reused, including DSDT and SPCR.
>>> OK so far
>>>>   All these tables will be copied or mapped to guest memory.
>>> Are they copied or mapped? Also I think we need to recalculate the
>>> md5sum?
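
On the checksum question: ACPI tables carry a one-byte checksum rather
than an md5sum. All bytes of a table, including the checksum field
itself, must sum to zero modulo 256, so any table we modify needs that
byte recomputed. A minimal sketch, assuming the table is already in a
writable buffer (the function names are hypothetical; offset 9 is the
checksum byte in the standard ACPI table header):

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes in the table, modulo 256. A valid table sums to 0. */
static uint8_t example_acpi_byte_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += table[i];
    return sum;
}

/*
 * Recompute the checksum byte after the table has been edited.
 * checksum_off is the offset of the checksum field: 9 for tables that
 * use the standard ACPI header (FADT, GTDT, MADT, XSDT, ...).
 */
static void example_acpi_fix_checksum(uint8_t *table, size_t len,
                                      size_t checksum_off)
{
    table[checksum_off] = 0;
    table[checksum_off] = (uint8_t)(0u - example_acpi_byte_sum(table, len));
}
```

The RSDP is laid out differently from the other tables: its checksum
field is at offset 8 and covers the first 20 bytes, and its extended
checksum at offset 32 covers the whole structure, so changing
xsdt_physical_address means updating the extended checksum.
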
>>>> 2) Create a minimal DT to pass the required information to DOM0
>>>>   The minimal DT mainly passes DOM0 bootargs, address and size of initrd
>>>>   (if available), address and size of UEFI system table, address and
>>>>   size of UEFI memory table, uefi-mmap-desc-size and uefi-mmap-desc-ver.
>>> I think we need to specify which Linux entry point is called. I think
>>> that will be the proper non-EFI kernel entry point, which requires the
>>> MMU to be off (see Documentation/efi-stub.txt in linux).
>>> Also it would be better to write the full bindings of the generated
>>> minimal DT, see http://marc.info/?l=linux-kernel&m=142362266626403&w=2
>>> and Documentation/arm/uefi.txt in linux.
>> An example of the minimal DT:
>> / {
>>     #address-cells = <2>;
>>     #size-cells = <1>;
>>     chosen {
>>         bootargs = "kernel=Image console=hvc0 earlycon=pl011,0x1c090000
>> root=/dev/vda2 rw rootfstype=ext4 init=/bin/sh acpi=force";
>>         linux,initrd-start = <0xXXXXXXXX>;
>>         linux,initrd-end = <0xXXXXXXXX>;
>>         linux,uefi-system-table = <0xXXXXXXXX>;
>>         linux,uefi-mmap-start = <0xXXXXXXXX>;
>>         linux,uefi-mmap-size = <0xXXXXXXXX>;
>>         linux,uefi-mmap-desc-size = <0xXXXXXXXX>;
>>         linux,uefi-mmap-desc-ver = <0xXXXXXXXX>;
>>     };
>> };
> Good, please include this example in the doc. Please include a pointer
> to Documentation/arm/uefi.txt which lists these parameters.
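
For reference, the memory map that linux,uefi-mmap-start points at is an
array of EFI memory descriptors, and linux,uefi-mmap-desc-size is the
size of one entry. A minimal sketch of filling one entry for a DOM0 RAM
bank follows. The struct and function names are hypothetical, and
describing DOM0 RAM as conventional write-back memory is an assumption
for illustration, not something this design has settled:

```c
#include <stdint.h>

#define EXAMPLE_EFI_CONVENTIONAL_MEMORY 7      /* EfiConventionalMemory */
#define EXAMPLE_EFI_MEMORY_WB           0x8ULL /* write-back cacheable */
#define EXAMPLE_EFI_PAGE_SHIFT          12     /* EFI pages are 4 KiB */

/* Field layout of EFI_MEMORY_DESCRIPTOR as defined by the UEFI spec. */
struct example_efi_memory_desc {
    uint32_t type;
    uint32_t pad;        /* alignment padding before the 64-bit fields */
    uint64_t phys_addr;
    uint64_t virt_addr;
    uint64_t num_pages;
    uint64_t attribute;
};

/* Describe one RAM bank given its start address and size in bytes. */
static void example_fill_ram_desc(struct example_efi_memory_desc *d,
                                  uint64_t start, uint64_t size)
{
    d->type = EXAMPLE_EFI_CONVENTIONAL_MEMORY;
    d->pad = 0;
    d->phys_addr = start;
    d->virt_addr = 0;    /* no runtime virtual mapping assigned */
    d->num_pages = size >> EXAMPLE_EFI_PAGE_SHIFT;
    d->attribute = EXAMPLE_EFI_MEMORY_WB;
}
```

With this layout, sizeof(struct example_efi_memory_desc) is the value
that would go into linux,uefi-mmap-desc-size.
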
>>>> 3) How DOM0 gets the grant table and event channel IRQ information
>>>>   As said above, we set hypervisor_id to "XenVMM" to tell DOM0
>>>>   that it runs on the Xen hypervisor.
>>>>   Then we save the start address and size
>>>>   of the grant table in domain->grant_table->start_addr and
>>>>   domain->grant_table->size. DOM0 can call a new hypercall
>>>>   GNTTABOP_get_start_addr to get this info.
>>>>   Likewise for the event channel: we already save the interrupt number
>>>>   in d->arch.evtchn_irq, so DOM0 can call a new hypercall
>>>>   EVTCHNOP_get_irq to get the irq.
>>> It would be nice to go down into more details and write the parameters
>>> of the hypercalls in the doc as they will become a newly supported ABI.
>> The parameters of GNTTABOP_get_start_addr are as follows:
>>     struct gnttab_get_start_addr {
>>         /* IN parameters */
>>         domid_t dom;
>>         uint16_t pad;
>>         /* OUT parameters */
>>         uint64_t start_addr;
>>         uint64_t size;
>>     };

For grant table start address and size, maybe it could add two new HVM

>> The parameters of EVTCHNOP_get_irq are as follows:
>>     struct evtchn_get_irq {
>>         /* IN parameters. */
>>         domid_t dom;
>>         uint16_t pad;
>>         /* OUT parameters. */
>>         uint32_t irq;
>>     };
> I think that it makes sense to reuse the existing HVM_PARAM_CALLBACK_IRQ
> hvmop call in this case. See
> drivers/xen/events/events_base.c:xen_set_callback_via in Linux and
> xen/include/public/hvm/params.h in Xen.
> I would just add a new delivery type:
> val[63:56] == 3: val[7:0] is a PPI (ARM and ARM64 only)

So for the event channel we could add a new function such as
xen_get_callback_via in events_base.c on the Linux side, and set the
value of the HVM_PARAM_CALLBACK_IRQ param in Xen. We also need to expose
the irq flag, so the new delivery type may be:
val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI (ARM and ARM64 only)

Is this correct?
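
To make the proposed layout concrete, here is a minimal sketch of packing
and unpacking such a value. The macro and function names are hypothetical
and only illustrate the bit positions given above; they are not existing
Xen or Linux identifiers:

```c
#include <stdint.h>

#define EXAMPLE_CALLBACK_VIA_TYPE_SHIFT 56
#define EXAMPLE_CALLBACK_VIA_TYPE_PPI   3ULL /* proposed delivery type */

/* Pack val[63:56] = 3, val[15:8] = flag, val[7:0] = PPI number. */
static uint64_t example_callback_via_ppi(uint8_t flag, uint8_t ppi)
{
    return (EXAMPLE_CALLBACK_VIA_TYPE_PPI << EXAMPLE_CALLBACK_VIA_TYPE_SHIFT) |
           ((uint64_t)flag << 8) | ppi;
}

/*
 * Unpack a value read back from the param.
 * Returns 0 on success, -1 if the delivery type is not the PPI type.
 */
static int example_callback_via_decode(uint64_t val, uint8_t *flag,
                                       uint8_t *ppi)
{
    if ((val >> EXAMPLE_CALLBACK_VIA_TYPE_SHIFT) != EXAMPLE_CALLBACK_VIA_TYPE_PPI)
        return -1;
    *flag = (uint8_t)((val >> 8) & 0xff);
    *ppi = (uint8_t)(val & 0xff);
    return 0;
}
```

Xen would set the param with the packed value, and DOM0 would read it
and decode the PPI and flag before requesting the interrupt.
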

> I would appreciate Jan's feedback on the two hypercalls.
>>> The evtchnop would need to be called something like
>>> EVTCHNOP_get_notification_irq and would need to be ARM specific (on x86
>>> things are different).
>>>> 4) How to map MMIO regions
>>>>   a) The current implementation maps MMIO regions into Dom0 on demand,
>>>>      when an access traps into Xen with a data abort.
>>> I think this approach is prone to failures. A driver could program a
>>> device for DMA involving regions not yet mapped. As a consequence the
>>> DMA operation would fail because the SMMU would stop the transaction.
>>>>   b) Another way is to map all the non-RAM memory regions before booting.
>>>>      But as pointed out by Stefano, this will use a lot of memory to
>>>>      store the pagetables.
>>>>   c) Another suggested way is to use a hypercall from DOM0 to request
>>>>      MMIO region mappings after Linux completes parsing the DSDT. But I
>>>>      didn't find a proper place to issue this call. Does anyone have a
>>>>      suggestion?
>>> I suggested exploiting the bus_notifier callbacks and issuing a hypercall
>>> there. In the case of the PCI bus, we are already handling notifications
>>> in drivers/xen/pci.c:xen_pci_notifier.
>>> Once you have a struct pci_dev pointer in your hand, you can get the
>>> MMIO regions from pdev->resource[bar].
>>> Does that work?
>> I investigated and tested this approach. By adding a bus notifier for
>> the platform bus, it could map the MMIO regions.
> That's great!
> Keep in mind that many ARM platforms have non-PCI busses, so I think
> we'll need an amba and a platform bus_notifier too, in addition to the
> existing pci bus notifier.
>> Stefano, thanks for your suggestion. And does anyone else have other
>> comments on this approach?
>>>> 5) How to route device interrupts to DOM0
>>>>   Currently we route all the SPI interrupts to DOM0 before DOM0 boots.
>>>>   But this may be only a workaround. What's the right choice? Should
>>>>   DOM0 parse the interrupt information from the DSDT and then call a
>>>>   hypercall to route them?
>>> I think that is OK for now, but it is good for you to bring up this
>>> point here.  Dom0 will ask Xen to remap interrupts for any devices
>>> assigned to DomU created after Dom0.
>>> .
>> -- 
>> Shannon
> .

