
Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

On 02/03/16 03:15, Konrad Rzeszutek Wilk wrote:
> > 3. Design of vNVDIMM in Xen
> Thank you for this design!
> > 
> >  Similarly to that in KVM/QEMU, enabling vNVDIMM in Xen is composed of
> >  three parts:
> >  (1) Guest clwb/clflushopt/pcommit enabling,
> >  (2) Memory mapping, and
> >  (3) Guest ACPI emulation.
> .. MCE? and vMCE?

NVDIMM can generate uncorrected recoverable (UCR) errors like normal RAM. Xen
may handle them in a way similar to what mc_memerr_dhandler() does, with some
differences in the data structure and the broken-page offlining parts:

Broken NVDIMM pages should be marked as "offlined" so that the Xen
hypervisor can refuse further requests that map them to a DomU.

The real problem here is what data structure will be used to record
information of NVDIMM pages. Because NVDIMM capacities are usually much
larger than normal RAM, using one struct page_info per NVDIMM page would
occupy too much memory.
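For a sense of scale, the per-page cost can be estimated as follows (the
32-byte size of struct page_info on x86-64 is an assumed figure, used only
for illustration):

```c
#include <stdint.h>

/* Rough estimate of the memory cost of tracking an NVDIMM with one
 * struct page_info per 4 KiB page.  32 bytes per page_info is an
 * assumption for illustration, not the exact Xen structure size. */
uint64_t page_info_overhead(uint64_t nvdimm_bytes)
{
    const uint64_t page_size = 4096;
    const uint64_t page_info_size = 32;

    return nvdimm_bytes / page_size * page_info_size;
}
```

Under that assumption, a 1 TiB NVDIMM would need about 8 GiB of normal RAM
just for the tracking structures.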

Alternatively, we may use a range set to represent NVDIMM pages:

    struct nvdimm_pages
    {
        unsigned long mfn; /* starting MFN of a range of NVDIMM pages */
        unsigned long gfn; /* starting GFN where this range is mapped,
                              initially INVALID_GFN */
        unsigned long len; /* length of this range in bytes */

        int broken;        /* 0: initial value,
                              1: this range of NVDIMM pages is broken
                                 and offlined */
        struct domain *d;  /* NULL: initial value,
                              otherwise: the domain this range is mapped to */

        /*
         * Every nvdimm_pages structure is linked in the global
         * xen_nvdimm_pages_list.
         * If it is mapped to a domain d, it is also linked in
         * d->arch.nvdimm_pages_list.
         */
        struct list_head domain_list;
        struct list_head global_list;
    };

    struct list_head xen_nvdimm_pages_list;

    /* in asm-x86/domain.h */
    struct arch_domain
    {
        ...
        struct list_head nvdimm_pages_list;
        ...
    };

(1) Initially, Xen hypervisor creates a nvdimm_pages structure for each
    pmem region (starting SPA and size reported by Dom0 NVDIMM driver)
    and links all nvdimm_pages structures in xen_nvdimm_pages_list.
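A minimal sketch of step (1), with a plain next pointer standing in for
Xen's struct list_head and only the fields relevant here (names and
simplifications are mine, not part of the proposal):

```c
#include <stdlib.h>

/* Simplified stand-in for struct nvdimm_pages: only the fields needed
 * for the global list are kept, and a plain next pointer replaces
 * struct list_head. */
struct nvdimm_pages {
    unsigned long mfn;      /* starting MFN of the pmem region */
    unsigned long npages;   /* size of the region in pages */
    struct nvdimm_pages *next;
};

static struct nvdimm_pages *xen_nvdimm_pages_list;

/* Called once per pmem region (starting SPA and size reported by the
 * Dom0 NVDIMM driver); links the new structure into the global list. */
static struct nvdimm_pages *track_pmem_region(unsigned long start_mfn,
                                              unsigned long npages)
{
    struct nvdimm_pages *p = calloc(1, sizeof(*p));

    if ( !p )
        return NULL;
    p->mfn = start_mfn;
    p->npages = npages;
    p->next = xen_nvdimm_pages_list;
    xen_nvdimm_pages_list = p;
    return p;
}
```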

(2) If the Xen hypervisor is then requested to map a range of NVDIMM pages
    [start_mfn, end_mfn] to gfn of domain d, it will

   (a) Check whether the GFN range [gfn, gfn + end_mfn - start_mfn]
       of domain d has already been occupied (e.g. by normal RAM, I/O
       or other NVDIMM mappings), and reject the request if it has.
   (b) Search xen_nvdimm_pages_list for one or multiple nvdimm_pages
       structures that [start_mfn, end_mfn] can fit in.

       If a nvdimm_pages structure is entirely covered by [start_mfn,
       end_mfn], then link that nvdimm_pages structure to
       d->arch.nvdimm_pages_list.
       If only a portion of a nvdimm_pages structure is covered by
       [start_mfn, end_mfn], then split that nvdimm_pages structure
       into multiple ones (the one entirely covered and at most two not
       covered), link the covered one to d->arch.nvdimm_pages_list and
       keep all of them linked in xen_nvdimm_pages_list as well.

       The gfn and d fields of nvdimm_pages structures linked to
       d->arch.nvdimm_pages_list are also set accordingly.
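The splitting in step (2)(b) is pure range arithmetic; a sketch (a
simplified range structure with a page count rather than the byte length,
list manipulation omitted):

```c
struct nvdimm_range {
    unsigned long mfn;      /* starting MFN */
    unsigned long npages;   /* length in pages */
};

/* Split @r, which must fully contain [start_mfn, end_mfn] (inclusive),
 * into the covered middle part and up to two uncovered leftovers.
 * A leftover with npages == 0 does not exist and should be dropped. */
static void split_range(const struct nvdimm_range *r,
                        unsigned long start_mfn, unsigned long end_mfn,
                        struct nvdimm_range *left,
                        struct nvdimm_range *mid,
                        struct nvdimm_range *right)
{
    unsigned long r_last = r->mfn + r->npages - 1;

    left->mfn = r->mfn;
    left->npages = start_mfn - r->mfn;

    mid->mfn = start_mfn;
    mid->npages = end_mfn - start_mfn + 1;

    right->mfn = end_mfn + 1;
    right->npages = r_last - end_mfn;
}
```

When [start_mfn, end_mfn] exactly covers @r, both leftovers come back with
npages == 0 and no actual split is needed.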

(3) When a domain d is shut down or destroyed, unlink its nvdimm_pages
    structures (i.e. those in d->arch.nvdimm_pages_list) from
    d->arch.nvdimm_pages_list, reset their gfn and d fields, and merge
    adjacent ones in xen_nvdimm_pages_list.
(4) When an MCE for the host NVDIMM SPA range [start_mfn, end_mfn] happens,
  (a) search xen_nvdimm_pages_list for the affected nvdimm_pages structures,
  (b) for each affected nvdimm_pages, if it belongs to a domain d and
      its broken field is already set, shut down the domain d to
      prevent a malicious guest from accessing the broken pages
      (similarly to what offline_page() does),
  (c) for each affected nvdimm_pages, set its broken field to 1, and
  (d) for each affected nvdimm_pages that belongs to a domain d, inject
      to domain d a vMCE that covers its GFN range.
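Step (4)(a) is an interval-intersection scan over xen_nvdimm_pages_list;
the core test can be sketched as (simplified range structure, names mine):

```c
struct nvdimm_range {
    unsigned long mfn;      /* starting MFN */
    unsigned long npages;   /* length in pages */
};

/* Does the tracked range @r intersect the MCE-affected MFN range
 * [start_mfn, end_mfn] (inclusive)?  Two inclusive ranges overlap
 * iff each one starts no later than the other one ends. */
static int range_affected(const struct nvdimm_range *r,
                          unsigned long start_mfn, unsigned long end_mfn)
{
    unsigned long r_last = r->mfn + r->npages - 1;

    return r->mfn <= end_mfn && start_mfn <= r_last;
}
```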

Comments, please.


Xen-devel mailing list


