
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu



On 01/20/16 13:14, Tian, Kevin wrote:
> > From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> > Sent: Tuesday, January 19, 2016 7:47 PM
> > 
> > >>> On 19.01.16 at 12:37, <wei.liu2@xxxxxxxxxx> wrote:
> > > On Mon, Jan 18, 2016 at 01:46:29AM -0700, Jan Beulich wrote:
> > >> >>> On 18.01.16 at 01:52, <haozhong.zhang@xxxxxxxxx> wrote:
> > >> > On 01/15/16 10:10, Jan Beulich wrote:
> > >> >> >>> On 29.12.15 at 12:31, <haozhong.zhang@xxxxxxxxx> wrote:
> > >> >> > NVDIMM devices are detected and configured by software through
> > >> >> > ACPI. Currently, QEMU maintains ACPI tables of vNVDIMM devices. This
> > >> >> > patch extends the existing mechanism in hvmloader of loading
> > >> >> > passthrough ACPI tables to load extra ACPI tables built by QEMU.
> > >> >>
> > >> >> Mechanically the patch looks okay, but whether it's actually needed
> > >> >> depends on whether indeed we want NV RAM managed in qemu
> > >> >> instead of in the hypervisor (where imo it belongs); I didn't see any
> > >> >> reply yet to that same comment of mine made (iirc) in the context
> > >> >> of another patch.
> > >> >
> > >> > One purpose of this patch series is to provide vNVDIMM backed by host
> > >> > NVDIMM devices. It requires some drivers to detect and manage host
> > >> > NVDIMM devices (including parsing ACPI, managing labels, etc.) that
> > >> > are not trivial, so I leave this work to the dom0 Linux. The current
> > >> > Linux kernel abstracts NVDIMM devices as block devices (/dev/pmemXX).
> > >> > QEMU then mmaps them into a certain range of dom0's address space and
> > >> > asks the Xen hypervisor to map that range of address space to a domU.
> > >> >
> > >
> > > OOI, do we have a viable solution to do all these non-trivial things in
> > > the core hypervisor?  Are you proposing designing a new set of hypercalls
> > > for NVDIMM?
> > 
> > That's certainly a possibility; I lack sufficient detail to form an
> > opinion on which route is going to be best.
> > 
> > Jan
> 
> Hi, Haozhong,
> 
> Are the NVDIMM-related ACPI tables in plain text format, or do they require
> an ACPI parser to decode? Is there a corresponding E820 entry?
>

Most are in plain text format, but the driver still evaluates the _FIT
(firmware interface table) method, and decoding is needed then.
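
As a rough illustration (a minimal sketch in Linux kernel style; the
function name is hypothetical, not actual driver code), evaluating _FIT
returns a binary buffer of NFIT-format sub-tables that still has to be
decoded:

    #include <linux/acpi.h>

    /* Hypothetical sketch: evaluate _FIT on the NVDIMM root device and
     * show that the result is a binary buffer of NFIT-format sub-tables
     * (SPA ranges, region mappings, ...), not plain text. */
    static int example_read_fit(acpi_handle handle)
    {
            struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL };
            union acpi_object *obj;
            acpi_status status;

            status = acpi_evaluate_object(handle, "_FIT", NULL, &buf);
            if (ACPI_FAILURE(status))
                    return -ENODEV;

            obj = buf.pointer;
            if (!obj || obj->type != ACPI_TYPE_BUFFER) {
                    kfree(buf.pointer);
                    return -EINVAL;
            }

            /* obj->buffer.pointer now needs binary decoding by the driver. */
            pr_info("_FIT returned %u bytes\n", obj->buffer.length);
            kfree(buf.pointer);
            return 0;
    }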

> Above information would be useful to help decide the direction.
> 
> At a glance I like Jan's idea that it's better to let Xen manage NVDIMM,
> since it's a type of memory resource and we expect the hypervisor to
> centrally manage memory.
> 
> However, on second thought, the answer is different if we view this
> resource as an MMIO resource, similar to PCI BAR MMIO, ACPI NVS, etc.
> Then it should be fine to have Dom0 manage NVDIMM while Xen just controls
> the mapping based on the existing I/O permission mechanism.
>

It's more like an MMIO device than normal RAM.

> Another possible point for this model is that PMEM is only one mode of an
> NVDIMM device, which can also be exposed as a storage device. In the
> latter case the management has to be in Dom0, so we don't need to
> scatter the management role across Dom0/Xen based on different modes.
>

An NVDIMM device in pmem mode is exposed as a storage device (a block
device /dev/pmemXX) in Linux, and it's also used like a disk drive
(you can make a file system on it, create files on it, and even pass
files rather than the whole /dev/pmemXX to guests).
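
To make that concrete, here is a minimal userspace sketch (assuming a
host device node /dev/pmem0; error handling abbreviated), which is
essentially what QEMU does with the device before asking Xen to map the
range into a domU:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
            /* Open the pmem block device like any other file. */
            int fd = open("/dev/pmem0", O_RDWR);
            if (fd < 0) {
                    perror("open /dev/pmem0");
                    return 1;
            }

            /* Map (for example) the first 1 GiB of the device. */
            size_t len = 1UL << 30;
            void *va = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
            if (va == MAP_FAILED) {
                    perror("mmap");
                    close(fd);
                    return 1;
            }

            /* 'va' is only a dom0 virtual address; QEMU has no easy way
             * to learn the machine addresses behind it. */
            munmap(va, len);
            close(fd);
            return 0;
    }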

> Back to your earlier questions:
> 
> > (1) The QEMU patches use xc_hvm_map_io_range_to_ioreq_server() to map
> >     the host NVDIMM to domU, which results in a VM exit for every guest
> >     read/write to the corresponding vNVDIMM devices. I'm going to find
> >     a way to pass through the address space range of the host NVDIMM to
> >     a guest domU (similar to what xen-pt in QEMU does).
> > 
> > (2) Xen currently does not check whether the address that QEMU asks to
> >     map to domU is really within the host NVDIMM address space.
> >     Therefore, the Xen hypervisor needs a way to determine the host
> >     NVDIMM address space, which can be done by parsing the ACPI NFIT
> >     table.
> 
> If you look at how ACPI OpRegion is handled for IGD passthrough:
> 
>     ret = xc_domain_iomem_permission(xen_xc, xen_domid,
>             (unsigned long)(igd_host_opregion >> XC_PAGE_SHIFT),
>             XEN_PCI_INTEL_OPREGION_PAGES,
>             XEN_PCI_INTEL_OPREGION_ENABLE_ACCESSED);
> 
>     ret = xc_domain_memory_mapping(xen_xc, xen_domid,
>             (unsigned long)(igd_guest_opregion >> XC_PAGE_SHIFT),
>             (unsigned long)(igd_host_opregion >> XC_PAGE_SHIFT),
>             XEN_PCI_INTEL_OPREGION_PAGES,
>             DPCI_ADD_MAPPING);
>

Yes, I've noticed these two functions. The additional work would be
adding new ones that can accept virtual addresses, as QEMU has no easy
way to get the physical address of /dev/pmemXX and can only mmap it
into its virtual address space.
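
To make the gap concrete: xc_domain_memory_mapping() (quoted above)
takes machine frame numbers, so a new call would have to accept a
virtual address instead and let Xen/dom0 do the translation. The
xc_domain_map_vaddr_range() below is purely hypothetical, not an
existing libxc function:

    #include <stdint.h>
    #include <xenctrl.h>

    /* Hypothetical new libxc call: map 'nr_pages' pages starting at the
     * calling process's virtual address 'vaddr' into 'domid' at guest
     * frame 'guest_gfn'. This does not exist today. */
    int xc_domain_map_vaddr_range(xc_interface *xch, uint32_t domid,
                                  void *vaddr, uint64_t guest_gfn,
                                  unsigned long nr_pages);

    static int map_pmem_to_guest(xc_interface *xch, uint32_t domid,
                                 void *pmem_va, uint64_t guest_gfn,
                                 unsigned long nr_pages)
    {
            /* With today's API this would be
             *   xc_domain_memory_mapping(xch, domid, guest_gfn,
             *           first_mfn, nr_pages, DPCI_ADD_MAPPING);
             * but QEMU cannot easily learn first_mfn for /dev/pmemXX. */
            return xc_domain_map_vaddr_range(xch, domid, pmem_va,
                                             guest_gfn, nr_pages);
    }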

> The above can address your two questions. Xen doesn't need to tell
> exactly whether the assigned range actually belongs to an NVDIMM, just
> like the policy for PCI assignment today.
>

Does that mean the Xen hypervisor can trust whatever addresses the dom0
kernel and QEMU provide?

Thanks,
Haozhong
