[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] [PATCH 1/7] xen: use domid check in is_hardware_domain
Instead of checking is_privileged to determine if a domain should control the hardware, check that the domain_id is equal to zero (which is currently the only domain for which is_privileged is true). This allows other places where domain_id is checked for zero to be replaced with is_hardware_domain. The distinction between is_hardware_domain, is_control_domain, and domain 0 is based on the following disaggregation model: Domain 0 bootstraps the system. It may remain to perform requested builds of domains that need a minimal trust chain (i.e. vTPM domains). Other than being built by the hypervisor, nothing is special about this domain - although it may be useful to have is_control_domain() return true depending on the toolstack it uses to build other domains. The hardware domain manages devices for PCI pass-through to driver domains or can act as a driver domain itself, depending on the desired degree of disaggregation. It is also the domain managing devices that do not support pass-through: PCI configuration space access, parsing the hardware ACPI tables and system power or machine check events. This is the only domain where is_hardware_domain() is true. The return of is_control_domain() may be false for this domain. The control domain manages other domains, controls guest launch and shutdown, and manages resource constraints; is_control_domain() returns true. The functionality guarded by is_control_domain may in the future be adapted to use explicit hypercalls, eliminating the special treatment of this domain. It may be reasonable to have multiple control domains on a multi-tenant system. Guest domains and other service or driver domains are all treated identically by the hypervisor; the security policy may further constrain administrative actions on or communication between these domains. Signed-off-by: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> Cc: Ian Campbell <ian.campbell@xxxxxxxxxx> Cc: Keir Fraser <keir@xxxxxxx> Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx> Cc: Tim Deegan <tim@xxxxxxx> Cc: Xiantao Zhang <xiantao.zhang@xxxxxxxxx> --- xen/arch/arm/domain.c | 6 +++--- xen/arch/arm/gic.c | 4 ++-- xen/arch/arm/vgic.c | 2 +- xen/arch/arm/vtimer.c | 5 +++-- xen/arch/arm/vuart.c | 2 +- xen/arch/x86/cpu/mcheck/vmce.c | 2 +- xen/arch/x86/domain.c | 3 ++- xen/arch/x86/hvm/i8254.c | 2 +- xen/arch/x86/mm.c | 2 +- xen/arch/x86/time.c | 4 ++-- xen/arch/x86/traps.c | 21 +++++++++++---------- xen/common/domain.c | 6 +++--- xen/common/kernel.c | 2 +- xen/common/xenoprof.c | 8 +++++++- xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- xen/drivers/passthrough/iommu.c | 2 +- xen/drivers/passthrough/vtd/iommu.c | 11 ++++++----- xen/drivers/passthrough/vtd/x86/vtd.c | 2 +- xen/include/xen/sched.h | 4 ++-- 19 files changed, 50 insertions(+), 40 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 46ee486..2c76a8f 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -540,10 +540,10 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags) /* * Virtual UART is only used by linux early printk and decompress code. - * Only use it for dom0 because the linux kernel may not support - * multi-platform. + * Only use it for the hardware domain because the linux kernel may not + * support multi-platform. */ - if ( (d->domain_id == 0) && (rc = domain_vuart_init(d)) ) + if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) ) goto fail; return 0; diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 0095b97..8168b7b 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -865,10 +865,10 @@ int gicv_setup(struct domain *d) int ret; /* - * Domain 0 gets the hardware address. + * The hardware domain gets the hardware address. * Guests get the virtual platform layout. */ - if ( d->domain_id == 0 ) + if ( is_hardware_domain(d) ) { d->arch.vgic.dbase = gic.dbase; d->arch.vgic.cbase = gic.cbase; diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 9fc9586..4a7f8c0 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -82,7 +82,7 @@ int domain_vgic_init(struct domain *d) /* Currently nr_lines in vgic and gic doesn't have the same meanings * Here nr_lines = number of SPIs */ - if ( d->domain_id == 0 ) + if ( is_hardware_domain(d) ) d->arch.vgic.nr_lines = gic_number_lines() - 32; else d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */ diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c index 3d6a721..cb690bb 100644 --- a/xen/arch/arm/vtimer.c +++ b/xen/arch/arm/vtimer.c @@ -54,10 +54,11 @@ int vcpu_domain_init(struct domain *d) int vcpu_vtimer_init(struct vcpu *v) { struct vtimer *t = &v->arch.phys_timer; - bool_t d0 = (v->domain == dom0); + bool_t d0 = is_hardware_domain(v->domain); /* - * Domain 0 uses the hardware interrupts, guests get the virtual platform. + * Hardware domain uses the hardware interrupts, guests get the virtual + * platform. */ init_timer(&t->timer, phys_timer_expired, t, v->processor); diff --git a/xen/arch/arm/vuart.c b/xen/arch/arm/vuart.c index b9d3ced..c02a8a9 100644 --- a/xen/arch/arm/vuart.c +++ b/xen/arch/arm/vuart.c @@ -46,7 +46,7 @@ int domain_vuart_init(struct domain *d) { - ASSERT( !d->domain_id ); + ASSERT( is_hardware_domain(d) ); d->arch.vuart.info = serial_vuart_info(SERHND_DTUART); if ( !d->arch.vuart.info ) diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c index ed00f7c..c83375e 100644 --- a/xen/arch/x86/cpu/mcheck/vmce.c +++ b/xen/arch/x86/cpu/mcheck/vmce.c @@ -430,7 +430,7 @@ int unmmap_broken_page(struct domain *d, mfn_t mfn, unsigned long gfn) int rc; /* Always trust dom0's MCE handler will prevent future access */ - if ( d == dom0 ) + if ( is_hardware_domain(d) ) return 0; if (!mfn_valid(mfn_x(mfn))) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 796b775..66acc90 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1529,7 +1529,8 @@ void context_switch(struct vcpu *prev, struct vcpu *next) } set_cpuid_faulting(is_pv_vcpu(next) && - (next->domain->domain_id != 0)); + !is_control_domain(next->domain) && + !is_hardware_domain(next->domain)); } if (is_hvm_vcpu(next) && (prev != next) ) diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c index f7493b8..6a91f76 100644 --- a/xen/arch/x86/hvm/i8254.c +++ b/xen/arch/x86/hvm/i8254.c @@ -540,7 +540,7 @@ int pv_pit_handler(int port, int data, int write) .data = data }; - if ( (current->domain->domain_id == 0) && dom0_pit_access(&ioreq) ) + if ( is_hardware_domain(current->domain) && dom0_pit_access(&ioreq) ) { /* nothing to do */; } diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index fdc5ed3..ad48acc 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4787,7 +4787,7 @@ long arch_memory_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg) .max_mfn = MACH2PHYS_NR_ENTRIES - 1 }; - if ( !mem_hotplug && current->domain == dom0 ) + if ( !mem_hotplug && is_hardware_domain(current->domain) ) mapping.max_mfn = max_page - 1; if ( copy_to_guest(arg, &mapping, 1) ) return -EFAULT; diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c index f904af2..89308bd 100644 --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -1839,7 +1839,7 @@ void tsc_set_info(struct domain *d, uint32_t tsc_mode, uint64_t elapsed_nsec, uint32_t gtsc_khz, uint32_t incarnation) { - if ( is_idle_domain(d) || (d->domain_id == 0) ) + if ( is_idle_domain(d) || is_hardware_domain(d) ) { d->arch.vtsc = 0; return; @@ -1943,7 +1943,7 @@ static void dump_softtsc(unsigned char key) "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); for_each_domain ( d ) { - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) continue; printk("dom%u%s: mode=%d",d->domain_id, is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index ed4ae2d..21c8b63 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -732,18 +732,19 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t sub_idx, void pv_cpuid(struct cpu_user_regs *regs) { uint32_t a, b, c, d; + struct vcpu *curr = current; a = regs->eax; b = regs->ebx; c = regs->ecx; d = regs->edx; - if ( current->domain->domain_id != 0 ) + if ( !is_control_domain(curr->domain) && !is_hardware_domain(curr->domain) ) { unsigned int cpuid_leaf = a, sub_leaf = c; if ( !cpuid_hypervisor_leaves(a, c, &a, &b, &c, &d) ) - domain_cpuid(current->domain, a, c, &a, &b, &c, &d); + domain_cpuid(curr->domain, a, c, &a, &b, &c, &d); switch ( cpuid_leaf ) { @@ -751,15 +752,15 @@ void pv_cpuid(struct cpu_user_regs *regs) { unsigned int _eax, _ebx, _ecx, _edx; /* EBX value of main leaf 0 depends on enabled xsave features */ - if ( sub_leaf == 0 && current->arch.xcr0 ) + if ( sub_leaf == 0 && curr->arch.xcr0 ) { /* reset EBX to default value first */ b = XSTATE_AREA_MIN_SIZE; for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ ) { - if ( !(current->arch.xcr0 & (1ULL << sub_leaf)) ) + if ( !(curr->arch.xcr0 & (1ULL << sub_leaf)) ) continue; - domain_cpuid(current->domain, cpuid_leaf, sub_leaf, + domain_cpuid(curr->domain, cpuid_leaf, sub_leaf, &_eax, &_ebx, &_ecx, &_edx); if ( (_eax + _ebx) > b ) b = _eax + _ebx; @@ -796,7 +797,7 @@ void pv_cpuid(struct cpu_user_regs *regs) __clear_bit(X86_FEATURE_DS, &d); __clear_bit(X86_FEATURE_ACC, &d); __clear_bit(X86_FEATURE_PBE, &d); - if ( is_pvh_vcpu(current) ) + if ( is_pvh_vcpu(curr) ) __clear_bit(X86_FEATURE_MTRR, &d); __clear_bit(X86_FEATURE_DTES64 % 32, &c); @@ -805,7 +806,7 @@ void pv_cpuid(struct cpu_user_regs *regs) __clear_bit(X86_FEATURE_VMXE % 32, &c); __clear_bit(X86_FEATURE_SMXE % 32, &c); __clear_bit(X86_FEATURE_TM2 % 32, &c); - if ( is_pv_32bit_vcpu(current) ) + if ( is_pv_32bit_vcpu(curr) ) __clear_bit(X86_FEATURE_CX16 % 32, &c); __clear_bit(X86_FEATURE_XTPR % 32, &c); __clear_bit(X86_FEATURE_PDCM % 32, &c); @@ -844,12 +845,12 @@ void pv_cpuid(struct cpu_user_regs *regs) case 0x80000001: /* Modify Feature Information. */ - if ( is_pv_32bit_vcpu(current) ) + if ( is_pv_32bit_vcpu(curr) ) { __clear_bit(X86_FEATURE_LM % 32, &d); __clear_bit(X86_FEATURE_LAHF_LM % 32, &c); } - if ( is_pv_32on64_vcpu(current) && + if ( is_pv_32on64_vcpu(curr) && boot_cpu_data.x86_vendor != X86_VENDOR_AMD ) __clear_bit(X86_FEATURE_SYSCALL % 32, &d); __clear_bit(X86_FEATURE_PAGE1GB % 32, &d); @@ -1860,7 +1861,7 @@ static inline uint64_t guest_misc_enable(uint64_t val) static int is_cpufreq_controller(struct domain *d) { return ((cpufreq_controller == FREQCTL_dom0_kernel) && - (d->domain_id == 0)); + is_hardware_domain(d)); } #include "x86_64/mmconfig.h" diff --git a/xen/common/domain.c b/xen/common/domain.c index ad8a1b6..b5868a5 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -237,7 +237,7 @@ struct domain *domain_create( else if ( domcr_flags & DOMCRF_pvh ) d->guest_type = guest_type_pvh; - if ( domid == 0 ) + if ( is_hardware_domain(d) ) { d->is_pinned = opt_dom0_vcpus_pin; d->disable_migrate = 1; @@ -262,7 +262,7 @@ struct domain *domain_create( d->is_paused_by_controller = 1; atomic_inc(&d->pause_count); - if ( domid ) + if ( !is_hardware_domain(d) ) d->nr_pirqs = nr_static_irqs + extra_domU_irqs; else d->nr_pirqs = nr_static_irqs + extra_dom0_irqs; @@ -594,7 +594,7 @@ void domain_shutdown(struct domain *d, u8 reason) d->shutdown_code = reason; reason = d->shutdown_code; - if ( d->domain_id == 0 ) + if ( is_hardware_domain(d) ) dom0_shutdown(reason); if ( d->is_shutting_down ) diff --git a/xen/common/kernel.c b/xen/common/kernel.c index b371f8f..7e83353 100644 --- a/xen/common/kernel.c +++ b/xen/common/kernel.c @@ -303,7 +303,7 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) (1U << XENFEAT_auto_translated_physmap); if ( supervisor_mode_kernel ) fi.submap |= 1U << XENFEAT_supervisor_mode_kernel; - if ( current->domain == dom0 ) + if ( is_hardware_domain(current->domain) ) fi.submap |= 1U << XENFEAT_dom0; #ifdef CONFIG_X86 switch ( d->guest_type ) diff --git a/xen/common/xenoprof.c b/xen/common/xenoprof.c index 52ab00d..3c30c3e 100644 --- a/xen/common/xenoprof.c +++ b/xen/common/xenoprof.c @@ -601,9 +601,15 @@ static int xenoprof_op_init(XEN_GUEST_HANDLE_PARAM(void) arg) xenoprof_init.cpu_type)) ) return ret; + /* Only the hardware domain may become the primary profiler here because + * there is currently no cleanup of xenoprof_primary_profiler or associated + * profiling state when the primary profiling domain is shut down or + * crashes. Once a better cleanup method is present, it will be possible to + * allow another domain to be the primary profiler. + */ xenoprof_init.is_primary = ((xenoprof_primary_profiler == d) || - ((xenoprof_primary_profiler == NULL) && (d->domain_id == 0))); + ((xenoprof_primary_profiler == NULL) && is_hardware_domain(d))); if ( xenoprof_init.is_primary ) xenoprof_primary_profiler = current->domain; diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c index c26aabc..cf67494 100644 --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -122,7 +122,7 @@ static void amd_iommu_setup_domain_device( BUG_ON( !hd->root_table || !hd->paging_mode || !iommu->dev_table.buffer ); - if ( iommu_passthrough && (domain->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(domain) ) valid = 0; if ( ats_enabled ) diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index 0cf0748..54d891f 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -817,7 +817,7 @@ static void iommu_dump_p2m_table(unsigned char key) ops = iommu_get_ops(); for_each_domain(d) { - if ( !d->domain_id ) + if ( is_hardware_domain(d) ) continue; if ( iommu_use_hap_pt(d) ) diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index beb3723..9c8e4a3 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -1327,7 +1327,7 @@ int domain_context_mapping_one( return res; } - if ( iommu_passthrough && (domain->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(domain) ) { context_set_translation_type(*context, CONTEXT_TT_PASS_THRU); agaw = level_to_agaw(iommu->nr_pt_levels); @@ -1723,7 +1723,7 @@ static int intel_iommu_map_page( return 0; /* do nothing if dom0 and iommu supports pass thru */ - if ( iommu_passthrough && (d->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(d) ) return 0; spin_lock(&hd->mapping_lock); @@ -1767,7 +1767,7 @@ static int intel_iommu_map_page( static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn) { /* Do nothing if dom0 and iommu supports pass thru. */ - if ( iommu_passthrough && (d->domain_id == 0) ) + if ( iommu_passthrough && is_hardware_domain(d) ) return 0; dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K); @@ -1950,8 +1950,9 @@ static int intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev) continue; /* - * If the device belongs to dom0, and it has RMRR, don't remove - * it from dom0, because BIOS may use RMRR at booting time. + * If the device belongs to the hardware domain, and it has RMRR, don't + * remove it from the hardware domain, because BIOS may use RMRR at + * booting time. */ if ( is_hardware_domain(pdev->domain) ) return 0; diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c index ca17cb1..f271a42 100644 --- a/xen/drivers/passthrough/vtd/x86/vtd.c +++ b/xen/drivers/passthrough/vtd/x86/vtd.c @@ -111,7 +111,7 @@ void __init iommu_set_dom0_mapping(struct domain *d) { unsigned long i, j, tmp, top; - BUG_ON(d->domain_id != 0); + BUG_ON(!is_hardware_domain(d)); top = max(max_pdx, pfn_to_pdx(0xffffffffUL >> PAGE_SHIFT) + 1); diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index b9ba379..1f45f71 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -791,10 +791,10 @@ void watchdog_domain_destroy(struct domain *d); /* * Use this check when the following are both true: * - Using this feature or interface requires full access to the hardware - * (that is, this is would not be suitable for a driver domain) + * (that is, this would not be suitable for a driver domain) * - There is never a reason to deny dom0 access to this */ -#define is_hardware_domain(_d) ((_d)->is_privileged) +#define is_hardware_domain(_d) ((_d)->domain_id == 0) /* This check is for functionality specific to a control domain */ #define is_control_domain(_d) ((_d)->is_privileged) -- 1.8.5.3 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |