Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Tuesday, January 19, 2016 7:47 PM
> >>> On 19.01.16 at 12:37, <wei.liu2@xxxxxxxxxx> wrote:
> > On Mon, Jan 18, 2016 at 01:46:29AM -0700, Jan Beulich wrote:
> >> >>> On 18.01.16 at 01:52, <haozhong.zhang@xxxxxxxxx> wrote:
> >> > On 01/15/16 10:10, Jan Beulich wrote:
> >> >> >>> On 29.12.15 at 12:31, <haozhong.zhang@xxxxxxxxx> wrote:
> >> >> > NVDIMM devices are detected and configured by software through
> >> >> > ACPI. Currently, QEMU maintains ACPI tables of vNVDIMM devices. This
> >> >> > patch extends the existing mechanism in hvmloader of loading passthrough
> >> >> > ACPI tables to load extra ACPI tables built by QEMU.
> >> >>
> >> >> Mechanically the patch looks okay, but whether it's actually needed
> >> >> depends on whether indeed we want NV RAM managed in qemu
> >> >> instead of in the hypervisor (where imo it belongs); I didn't see any
> >> >> reply yet to that same comment of mine made (iirc) in the context
> >> >> of another patch.
> >> >
> >> > One purpose of this patch series is to provide vNVDIMM backed by host
> >> > NVDIMM devices. It requires some drivers to detect and manage host
> >> > NVDIMM devices (including parsing ACPI, managing labels, etc.) that
> >> > are not trivial, so I leave this work to the dom0 Linux. The current Linux
> >> > kernel abstracts NVDIMM devices as block devices (/dev/pmemXX). QEMU
> >> > then mmaps them into certain range of dom0's address space and asks
> >> > Xen hypervisor to map that range of address space to a domU.
> >> >
> >
> > OOI Do we have a viable solution to do all these non-trivial things in
> > core hypervisor?  Are you proposing designing a new set of hypercalls
> > for NVDIMM?
> That's certainly a possibility; I lack sufficient detail to form
> an opinion on which route is going to be best.
> Jan

Hi, Haozhong,

Are the NVDIMM-related ACPI tables in plain-text format, or do they require
an ACPI parser to decode? Is there a corresponding E820 entry?

The above information would help decide the direction.

At first glance I like Jan's idea that it's better to let Xen manage NVDIMM,
since it's a type of memory resource and we expect the hypervisor to manage
memory centrally.

On second thought, however, the answer is different if we view this
resource as an MMIO resource, similar to PCI BAR MMIO, ACPI NVS, etc.
Then it should be fine to have Dom0 manage NVDIMM, with Xen just controlling
the mapping based on the existing I/O permission mechanism.

Another possible point in favor of this model is that PMEM is only one mode
of an NVDIMM device, which can also be exposed as a storage device. In the
latter case the management has to be in Dom0. Keeping all management in Dom0
means we don't need to split the management role between Dom0 and Xen
depending on the mode.

Back to your earlier questions:

> (1) The QEMU patches use xc_hvm_map_io_range_to_ioreq_server() to map
>     the host NVDIMM to domU, which results in a VMEXIT for every guest
>     read/write to the corresponding vNVDIMM devices. I'm going to find
>     a way to passthrough the address space range of host NVDIMM to a
>     guest domU (similarly to what xen-pt in QEMU uses)
> (2) Xen currently does not check whether the address that QEMU asks to
>     map to domU is really within the host NVDIMM address
>     space. Therefore, Xen hypervisor needs a way to decide the host
>     NVDIMM address space which can be done by parsing ACPI NFIT
>     tables.

If you look at how ACPI OpRegion is handled for IGD passthrough:

    ret = xc_domain_iomem_permission(xen_xc, xen_domid,
            (unsigned long)(igd_host_opregion >> XC_PAGE_SHIFT),
            [...]

    ret = xc_domain_memory_mapping(xen_xc, xen_domid,
            (unsigned long)(igd_guest_opregion >> XC_PAGE_SHIFT),
            (unsigned long)(igd_host_opregion >> XC_PAGE_SHIFT),
            [...]
            DPCI_ADD_MAPPING);

The above can address both of your questions. Xen doesn't need to know exactly
whether the assigned range actually belongs to NVDIMM, just like
the policy for PCI assignment today.
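
To make the two-step flow concrete, here is a rough, self-contained toy
model of "grant permission first, then map". It does not call libxc or
reflect Xen's internal data structures. Every name in it (toy_domain,
toy_iomem_permission, toy_memory_mapping, TOY_PAGE_SHIFT) is hypothetical
and only for illustration. The assumption it encodes is that the mapping
request is refused unless the frame range was previously permitted for the
domain, and that nothing checks what kind of memory the range is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KiB pages */
#define TOY_MAX_RANGES 8

/* Toy stand-in for a per-domain list of permitted machine frame ranges. */
struct toy_domain {
    struct { uint64_t first_mfn, nr; } allowed[TOY_MAX_RANGES];
    size_t nr_allowed;
};

/* Step 1 (model): permit the domain to access [first_mfn, first_mfn + nr). */
static int toy_iomem_permission(struct toy_domain *d,
                                uint64_t first_mfn, uint64_t nr)
{
    assert(nr > 0);
    if (d->nr_allowed == TOY_MAX_RANGES)
        return -1;              /* permission table is full */
    d->allowed[d->nr_allowed].first_mfn = first_mfn;
    d->allowed[d->nr_allowed].nr = nr;
    d->nr_allowed++;
    return 0;
}

/* True if the whole frame range lies inside one permitted range. */
static bool toy_access_permitted(const struct toy_domain *d,
                                 uint64_t first_mfn, uint64_t nr)
{
    for (size_t i = 0; i < d->nr_allowed; i++) {
        uint64_t start = d->allowed[i].first_mfn;
        uint64_t end = start + d->allowed[i].nr;
        if (first_mfn >= start && first_mfn + nr <= end)
            return true;
    }
    return false;
}

/*
 * Step 2 (model): accept a guest-frame to machine-frame mapping only if
 * the machine range was permitted in step 1. No page tables are touched;
 * the function only reports whether the request would be accepted.
 */
static int toy_memory_mapping(const struct toy_domain *d, uint64_t first_gfn,
                              uint64_t first_mfn, uint64_t nr)
{
    (void)first_gfn;            /* the guest address is not policy-checked */
    return toy_access_permitted(d, first_mfn, nr) ? 0 : -1;
}
```

In this model the caller converts byte addresses to frame numbers with
">> TOY_PAGE_SHIFT", in the same way the snippet above shifts by
XC_PAGE_SHIFT. The only policy input is the permitted range list, which is
why the hypervisor side would not need to parse NFIT itself.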

