Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

Hi Jan, Wei and Kevin,

On 01/18/16 01:46, Jan Beulich wrote:
> >>> On 18.01.16 at 01:52, <haozhong.zhang@xxxxxxxxx> wrote:
> > On 01/15/16 10:10, Jan Beulich wrote:
> >> >>> On 29.12.15 at 12:31, <haozhong.zhang@xxxxxxxxx> wrote:
> >> > NVDIMM devices are detected and configured by software through
> >> > ACPI. Currently, QEMU maintains ACPI tables of vNVDIMM devices. This
> >> > patch extends the existing mechanism in hvmloader of loading passthrough
> >> > ACPI tables to load extra ACPI tables built by QEMU.
> >> 
> >> Mechanically the patch looks okay, but whether it's actually needed
> >> depends on whether indeed we want NV RAM managed in qemu
> >> instead of in the hypervisor (where imo it belongs); I didn't see any
> >> reply yet to that same comment of mine made (iirc) in the context
> >> of another patch.
> > 
> > One purpose of this patch series is to provide vNVDIMM backed by host
> > NVDIMM devices. It requires some drivers to detect and manage host
> > NVDIMM devices (including parsing ACPI, managing labels, etc.) that
> > are not trivial, so I leave this work to the dom0 Linux. The current
> > Linux kernel abstracts NVDIMM devices as block devices (/dev/pmemXX).
> > QEMU then mmaps them into a certain range of dom0's address space and
> > asks the Xen hypervisor to map that range of address space to a domU.
> > 
> > However, there are two problems in this Xen patch series and the
> > corresponding QEMU patch series, which may require further
> > changes in hypervisor and/or toolstack.
> > 
> > (1) The QEMU patches use xc_hvm_map_io_range_to_ioreq_server() to map
> >     the host NVDIMM to domU, which results in a VMEXIT for every guest
> >     read/write to the corresponding vNVDIMM devices. I'm going to find
> >     a way to pass the address space range of host NVDIMM through to a
> >     guest domU (similar to what xen-pt in QEMU does).
> >     
> > (2) Xen currently does not check whether the address that QEMU asks to
> >     map to domU is really within the host NVDIMM address
> >     space. Therefore, Xen hypervisor needs a way to decide the host
> >     NVDIMM address space which can be done by parsing ACPI NFIT
> >     tables.
> 
> These problems are a pretty direct result of the management of
> NVDIMM not being done by the hypervisor.
> 
> Stating what qemu currently does is, I'm afraid, not really serving
> the purpose of hashing out whether the management of NVDIMM,
> just like that of "normal" RAM, wouldn't better be done by the
> hypervisor. In fact so far I haven't seen any rationale (other than
> the desire to share code with KVM) for the presently chosen
> solution. Yet in KVM qemu is - afaict - much more of an integral part
> of the hypervisor than it is in the Xen case (and even there core
> management of the memory is left to the kernel, i.e. what
> constitutes the core hypervisor there).
> 
> Jan

Sorry for the late reply; I was reading some code and trying to
get things clear for myself.

The primary reason for the current solution is to reuse the existing
NVDIMM driver in the Linux kernel.

One responsibility of this driver is to discover NVDIMM devices and
their parameters (e.g. which portion of an NVDIMM device can be mapped
into the system address space and which address it is mapped to) by
parsing ACPI NFIT tables. Judging from the NFIT spec in Sec 5.2.25 of
ACPI Specification v6 and the actual code in the Linux kernel
(drivers/acpi/nfit.*), this is not a trivial task.

Secondly, the driver implements a convenient block device interface to
let software access areas where NVDIMM devices are mapped. The
existing vNVDIMM implementation in QEMU uses this interface.

As the Linux NVDIMM driver already does both of the above, why should
we bother to reimplement them in Xen?

For the two problems raised in my previous reply, my thoughts are as
follows:

(1) (for the first problem) QEMU mmaps /dev/pmemXX into its virtual
    address space. When it works with KVM, it calls a KVM API to map
    that virtual address space range into a guest physical address
    range.

    For Xen, I'm going to do a similar thing, but Xen does not seem to
    provide such an API. The closest one I can find is
    XEN_DOMCTL_memory_mapping (which is used by VGA passthrough in
    QEMU xen_pt_graphics), but it does not accept a guest virtual
    address. Thus, I'm going to add a new one that does similar work
    but can accept a guest virtual address.

(2) (for the second problem) Having looked at the corresponding
    Linux kernel code, and given my comments at the beginning, I now
    doubt whether it's necessary to parse NFIT in Xen. Maybe I can
    follow what xen_pt_graphics does, that is, grant the guest
    permission to access the corresponding host NVDIMM address space
    range and then call the new hypercall added in (1).

    Again, a new hypercall that is similar to
    XEN_DOMCTL_iomem_permission and can accept a guest virtual
    address is needed.
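
To make the mmap step in (1) concrete, here is a minimal sketch in C of
mapping a pmem block device (or any regular file standing in for it)
into a process's virtual address space. It only illustrates the idea
and is not the actual QEMU code: map_backend() is a hypothetical helper
name, and the caller is assumed to already know the length to map.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a QEMU or Xen function): map the first `len`
 * bytes of the file or block device at `path` into the caller's virtual
 * address space. MAP_SHARED is used so that stores go to the backing
 * device rather than to a private copy.
 *
 * Returns the mapped virtual address, or NULL on failure.
 */
static void *map_backend(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    void *va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping remains valid after the descriptor is closed. */
    close(fd);

    return va == MAP_FAILED ? NULL : va;
}
```

The virtual address range returned here is what would then have to be
handed to the hypervisor, which is the part that has no suitable Xen
interface today.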

Any comments?


