[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH 1/7] xen: use domid check in is_hardware_domain



Instead of checking is_privileged to determine if a domain should
control the hardware, check that the domain_id is equal to zero (which
is currently the only domain for which is_privileged is true).  This
allows other places where domain_id is checked for zero to be replaced
with is_hardware_domain.

The distinction between is_hardware_domain, is_control_domain, and
domain 0 is based on the following disaggregation model:

Domain 0 bootstraps the system.  It may remain to perform requested
builds of domains that need a minimal trust chain (i.e. vTPM domains).
Other than being built by the hypervisor, nothing is special about this
domain - although it may be useful to have is_control_domain() return
true depending on the toolstack it uses to build other domains.

The hardware domain manages devices for PCI pass-through to driver
domains or can act as a driver domain itself, depending on the desired
degree of disaggregation.  It is also the domain managing devices that
do not support pass-through: PCI configuration space access, parsing the
hardware ACPI tables and system power or machine check events.  This is
the only domain where is_hardware_domain() is true.  The return of
is_control_domain() may be false for this domain.

The control domain manages other domains, controls guest launch and
shutdown, and manages resource constraints; is_control_domain() returns
true.  The functionality guarded by is_control_domain may in the future
be adapted to use explicit hypercalls, eliminating the special treatment
of this domain.  It may be reasonable to have multiple control domains
on a multi-tenant system.

Guest domains and other service or driver domains are all treated
identically by the hypervisor; the security policy may further constrain
administrative actions on or communication between these domains.

Signed-off-by: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx>
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
Cc: Ian Campbell <ian.campbell@xxxxxxxxxx>
Cc: Keir Fraser <keir@xxxxxxx>
Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
Cc: Tim Deegan <tim@xxxxxxx>
Cc: Xiantao Zhang <xiantao.zhang@xxxxxxxxx>

---
 xen/arch/arm/domain.c                       |  6 +++---
 xen/arch/arm/gic.c                          |  4 ++--
 xen/arch/arm/vgic.c                         |  2 +-
 xen/arch/arm/vtimer.c                       |  5 +++--
 xen/arch/arm/vuart.c                        |  2 +-
 xen/arch/x86/cpu/mcheck/vmce.c              |  2 +-
 xen/arch/x86/domain.c                       |  3 ++-
 xen/arch/x86/hvm/i8254.c                    |  2 +-
 xen/arch/x86/mm.c                           |  2 +-
 xen/arch/x86/time.c                         |  4 ++--
 xen/arch/x86/traps.c                        | 21 +++++++++++----------
 xen/common/domain.c                         |  6 +++---
 xen/common/kernel.c                         |  2 +-
 xen/common/xenoprof.c                       |  8 +++++++-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/iommu.c             |  2 +-
 xen/drivers/passthrough/vtd/iommu.c         | 11 ++++++-----
 xen/drivers/passthrough/vtd/x86/vtd.c       |  2 +-
 xen/include/xen/sched.h                     |  4 ++--
 19 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 46ee486..2c76a8f 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -540,10 +540,10 @@ int arch_domain_create(struct domain *d, unsigned int 
domcr_flags)
 
     /*
      * Virtual UART is only used by linux early printk and decompress code.
-     * Only use it for dom0 because the linux kernel may not support
-     * multi-platform.
+     * Only use it for the hardware domain because the linux kernel may not
+     * support multi-platform.
      */
-    if ( (d->domain_id == 0) && (rc = domain_vuart_init(d)) )
+    if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
         goto fail;
 
     return 0;
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 0095b97..8168b7b 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -865,10 +865,10 @@ int gicv_setup(struct domain *d)
     int ret;
 
     /*
-     * Domain 0 gets the hardware address.
+     * The hardware domain gets the hardware address.
      * Guests get the virtual platform layout.
      */
-    if ( d->domain_id == 0 )
+    if ( is_hardware_domain(d) )
     {
         d->arch.vgic.dbase = gic.dbase;
         d->arch.vgic.cbase = gic.cbase;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 9fc9586..4a7f8c0 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -82,7 +82,7 @@ int domain_vgic_init(struct domain *d)
     /* Currently nr_lines in vgic and gic doesn't have the same meanings
      * Here nr_lines = number of SPIs
      */
-    if ( d->domain_id == 0 )
+    if ( is_hardware_domain(d) )
         d->arch.vgic.nr_lines = gic_number_lines() - 32;
     else
         d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index 3d6a721..cb690bb 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -54,10 +54,11 @@ int vcpu_domain_init(struct domain *d)
 int vcpu_vtimer_init(struct vcpu *v)
 {
     struct vtimer *t = &v->arch.phys_timer;
-    bool_t d0 = (v->domain == dom0);
+    bool_t d0 = is_hardware_domain(v->domain);
 
     /*
-     * Domain 0 uses the hardware interrupts, guests get the virtual platform.
+     * Hardware domain uses the hardware interrupts, guests get the virtual
+     * platform.
      */
 
     init_timer(&t->timer, phys_timer_expired, t, v->processor);
diff --git a/xen/arch/arm/vuart.c b/xen/arch/arm/vuart.c
index b9d3ced..c02a8a9 100644
--- a/xen/arch/arm/vuart.c
+++ b/xen/arch/arm/vuart.c
@@ -46,7 +46,7 @@
 
 int domain_vuart_init(struct domain *d)
 {
-    ASSERT( !d->domain_id );
+    ASSERT( is_hardware_domain(d) );
 
     d->arch.vuart.info = serial_vuart_info(SERHND_DTUART);
     if ( !d->arch.vuart.info )
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index ed00f7c..c83375e 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -430,7 +430,7 @@ int unmmap_broken_page(struct domain *d, mfn_t mfn, 
unsigned long gfn)
     int rc;
 
     /* Always trust dom0's MCE handler will prevent future access */
-    if ( d == dom0 )
+    if ( is_hardware_domain(d) )
         return 0;
 
     if (!mfn_valid(mfn_x(mfn)))
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 796b775..66acc90 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1529,7 +1529,8 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
         }
 
         set_cpuid_faulting(is_pv_vcpu(next) &&
-                           (next->domain->domain_id != 0));
+                           !is_control_domain(next->domain) &&
+                           !is_hardware_domain(next->domain));
     }
 
     if (is_hvm_vcpu(next) && (prev != next) )
diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c
index f7493b8..6a91f76 100644
--- a/xen/arch/x86/hvm/i8254.c
+++ b/xen/arch/x86/hvm/i8254.c
@@ -540,7 +540,7 @@ int pv_pit_handler(int port, int data, int write)
         .data = data
     };
 
-    if ( (current->domain->domain_id == 0) && dom0_pit_access(&ioreq) )
+    if ( is_hardware_domain(current->domain) && dom0_pit_access(&ioreq) )
     {
         /* nothing to do */;
     }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fdc5ed3..ad48acc 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4787,7 +4787,7 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) 
arg)
             .max_mfn = MACH2PHYS_NR_ENTRIES - 1
         };
 
-        if ( !mem_hotplug && current->domain == dom0 )
+        if ( !mem_hotplug && is_hardware_domain(current->domain) )
             mapping.max_mfn = max_page - 1;
         if ( copy_to_guest(arg, &mapping, 1) )
             return -EFAULT;
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index f904af2..89308bd 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1839,7 +1839,7 @@ void tsc_set_info(struct domain *d,
                   uint32_t tsc_mode, uint64_t elapsed_nsec,
                   uint32_t gtsc_khz, uint32_t incarnation)
 {
-    if ( is_idle_domain(d) || (d->domain_id == 0) )
+    if ( is_idle_domain(d) || is_hardware_domain(d) )
     {
         d->arch.vtsc = 0;
         return;
@@ -1943,7 +1943,7 @@ static void dump_softtsc(unsigned char key)
                "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count);
     for_each_domain ( d )
     {
-        if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT )
+        if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT )
             continue;
         printk("dom%u%s: mode=%d",d->domain_id,
                 is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index ed4ae2d..21c8b63 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -732,18 +732,19 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t 
sub_idx,
 void pv_cpuid(struct cpu_user_regs *regs)
 {
     uint32_t a, b, c, d;
+    struct vcpu *curr = current;
 
     a = regs->eax;
     b = regs->ebx;
     c = regs->ecx;
     d = regs->edx;
 
-    if ( current->domain->domain_id != 0 )
+    if ( !is_control_domain(curr->domain) && !is_hardware_domain(curr->domain) 
)
     {
         unsigned int cpuid_leaf = a, sub_leaf = c;
 
         if ( !cpuid_hypervisor_leaves(a, c, &a, &b, &c, &d) )
-            domain_cpuid(current->domain, a, c, &a, &b, &c, &d);
+            domain_cpuid(curr->domain, a, c, &a, &b, &c, &d);
 
         switch ( cpuid_leaf )
         {
@@ -751,15 +752,15 @@ void pv_cpuid(struct cpu_user_regs *regs)
         {
             unsigned int _eax, _ebx, _ecx, _edx;
             /* EBX value of main leaf 0 depends on enabled xsave features */
-            if ( sub_leaf == 0 && current->arch.xcr0 )
+            if ( sub_leaf == 0 && curr->arch.xcr0 )
             {
                 /* reset EBX to default value first */
                 b = XSTATE_AREA_MIN_SIZE;
                 for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ )
                 {
-                    if ( !(current->arch.xcr0 & (1ULL << sub_leaf)) )
+                    if ( !(curr->arch.xcr0 & (1ULL << sub_leaf)) )
                         continue;
-                    domain_cpuid(current->domain, cpuid_leaf, sub_leaf,
+                    domain_cpuid(curr->domain, cpuid_leaf, sub_leaf,
                                  &_eax, &_ebx, &_ecx, &_edx);
                     if ( (_eax + _ebx) > b )
                         b = _eax + _ebx;
@@ -796,7 +797,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
         __clear_bit(X86_FEATURE_DS, &d);
         __clear_bit(X86_FEATURE_ACC, &d);
         __clear_bit(X86_FEATURE_PBE, &d);
-        if ( is_pvh_vcpu(current) )
+        if ( is_pvh_vcpu(curr) )
             __clear_bit(X86_FEATURE_MTRR, &d);
 
         __clear_bit(X86_FEATURE_DTES64 % 32, &c);
@@ -805,7 +806,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
         __clear_bit(X86_FEATURE_VMXE % 32, &c);
         __clear_bit(X86_FEATURE_SMXE % 32, &c);
         __clear_bit(X86_FEATURE_TM2 % 32, &c);
-        if ( is_pv_32bit_vcpu(current) )
+        if ( is_pv_32bit_vcpu(curr) )
             __clear_bit(X86_FEATURE_CX16 % 32, &c);
         __clear_bit(X86_FEATURE_XTPR % 32, &c);
         __clear_bit(X86_FEATURE_PDCM % 32, &c);
@@ -844,12 +845,12 @@ void pv_cpuid(struct cpu_user_regs *regs)
 
     case 0x80000001:
         /* Modify Feature Information. */
-        if ( is_pv_32bit_vcpu(current) )
+        if ( is_pv_32bit_vcpu(curr) )
         {
             __clear_bit(X86_FEATURE_LM % 32, &d);
             __clear_bit(X86_FEATURE_LAHF_LM % 32, &c);
         }
-        if ( is_pv_32on64_vcpu(current) &&
+        if ( is_pv_32on64_vcpu(curr) &&
              boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
             __clear_bit(X86_FEATURE_SYSCALL % 32, &d);
         __clear_bit(X86_FEATURE_PAGE1GB % 32, &d);
@@ -1860,7 +1861,7 @@ static inline uint64_t guest_misc_enable(uint64_t val)
 static int is_cpufreq_controller(struct domain *d)
 {
     return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            (d->domain_id == 0));
+            is_hardware_domain(d));
 }
 
 #include "x86_64/mmconfig.h"
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ad8a1b6..b5868a5 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -237,7 +237,7 @@ struct domain *domain_create(
     else if ( domcr_flags & DOMCRF_pvh )
         d->guest_type = guest_type_pvh;
 
-    if ( domid == 0 )
+    if ( is_hardware_domain(d) )
     {
         d->is_pinned = opt_dom0_vcpus_pin;
         d->disable_migrate = 1;
@@ -262,7 +262,7 @@ struct domain *domain_create(
         d->is_paused_by_controller = 1;
         atomic_inc(&d->pause_count);
 
-        if ( domid )
+        if ( !is_hardware_domain(d) )
             d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
         else
             d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
@@ -594,7 +594,7 @@ void domain_shutdown(struct domain *d, u8 reason)
         d->shutdown_code = reason;
     reason = d->shutdown_code;
 
-    if ( d->domain_id == 0 )
+    if ( is_hardware_domain(d) )
         dom0_shutdown(reason);
 
     if ( d->is_shutting_down )
diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index b371f8f..7e83353 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -303,7 +303,7 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
                     (1U << XENFEAT_auto_translated_physmap);
             if ( supervisor_mode_kernel )
                 fi.submap |= 1U << XENFEAT_supervisor_mode_kernel;
-            if ( current->domain == dom0 )
+            if ( is_hardware_domain(current->domain) )
                 fi.submap |= 1U << XENFEAT_dom0;
 #ifdef CONFIG_X86
             switch ( d->guest_type )
diff --git a/xen/common/xenoprof.c b/xen/common/xenoprof.c
index 52ab00d..3c30c3e 100644
--- a/xen/common/xenoprof.c
+++ b/xen/common/xenoprof.c
@@ -601,9 +601,15 @@ static int xenoprof_op_init(XEN_GUEST_HANDLE_PARAM(void) 
arg)
                                    xenoprof_init.cpu_type)) )
         return ret;
 
+    /* Only the hardware domain may become the primary profiler here because
+     * there is currently no cleanup of xenoprof_primary_profiler or associated
+     * profiling state when the primary profiling domain is shut down or
+     * crashes.  Once a better cleanup method is present, it will be possible 
to
+     * allow another domain to be the primary profiler.
+     */
     xenoprof_init.is_primary = 
         ((xenoprof_primary_profiler == d) ||
-         ((xenoprof_primary_profiler == NULL) && (d->domain_id == 0)));
+         ((xenoprof_primary_profiler == NULL) && is_hardware_domain(d)));
     if ( xenoprof_init.is_primary )
         xenoprof_primary_profiler = current->domain;
 
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index c26aabc..cf67494 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -122,7 +122,7 @@ static void amd_iommu_setup_domain_device(
 
     BUG_ON( !hd->root_table || !hd->paging_mode || !iommu->dev_table.buffer );
 
-    if ( iommu_passthrough && (domain->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(domain) )
         valid = 0;
 
     if ( ats_enabled )
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 0cf0748..54d891f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -817,7 +817,7 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( !d->domain_id )
+        if ( is_hardware_domain(d) )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index beb3723..9c8e4a3 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1327,7 +1327,7 @@ int domain_context_mapping_one(
         return res;
     }
 
-    if ( iommu_passthrough && (domain->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(domain) )
     {
         context_set_translation_type(*context, CONTEXT_TT_PASS_THRU);
         agaw = level_to_agaw(iommu->nr_pt_levels);
@@ -1723,7 +1723,7 @@ static int intel_iommu_map_page(
         return 0;
 
     /* do nothing if dom0 and iommu supports pass thru */
-    if ( iommu_passthrough && (d->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
     spin_lock(&hd->mapping_lock);
@@ -1767,7 +1767,7 @@ static int intel_iommu_map_page(
 static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
 {
     /* Do nothing if dom0 and iommu supports pass thru. */
-    if ( iommu_passthrough && (d->domain_id == 0) )
+    if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
     dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
@@ -1950,8 +1950,9 @@ static int intel_iommu_remove_device(u8 devfn, struct 
pci_dev *pdev)
             continue;
 
         /*
-         * If the device belongs to dom0, and it has RMRR, don't remove
-         * it from dom0, because BIOS may use RMRR at booting time.
+         * If the device belongs to the hardware domain, and it has RMRR, don't
+         * remove it from the hardware domain, because BIOS may use RMRR at
+         * booting time.
          */
         if ( is_hardware_domain(pdev->domain) )
             return 0;
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c 
b/xen/drivers/passthrough/vtd/x86/vtd.c
index ca17cb1..f271a42 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -111,7 +111,7 @@ void __init iommu_set_dom0_mapping(struct domain *d)
 {
     unsigned long i, j, tmp, top;
 
-    BUG_ON(d->domain_id != 0);
+    BUG_ON(!is_hardware_domain(d));
 
     top = max(max_pdx, pfn_to_pdx(0xffffffffUL >> PAGE_SHIFT) + 1);
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b9ba379..1f45f71 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -791,10 +791,10 @@ void watchdog_domain_destroy(struct domain *d);
 /* 
  * Use this check when the following are both true:
  *  - Using this feature or interface requires full access to the hardware
- *    (that is, this is would not be suitable for a driver domain)
+ *    (that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny dom0 access to this
  */
-#define is_hardware_domain(_d) ((_d)->is_privileged)
+#define is_hardware_domain(_d) ((_d)->domain_id == 0)
 
 /* This check is for functionality specific to a control domain */
 #define is_control_domain(_d) ((_d)->is_privileged)
-- 
1.8.5.3


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.