[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 1/7] xen: use domid check in is_hardware_domain
>>> On 18.03.14 at 22:34, Daniel De Graaf <dgdegra@xxxxxxxxxxxxx> wrote: > Instead of checking is_privileged to determine if a domain should > control the hardware, check that the domain_id is equal to zero (which > is currently the only domain for which is_privileged is true). This > allows other places where domain_id is checked for zero to be replaced > with is_hardware_domain. > > The distinction between is_hardware_domain, is_control_domain, and > domain 0 is based on the following disaggregation model: > > Domain 0 bootstraps the system. It may remain to perform requested > builds of domains that need a minimal trust chain (i.e. vTPM domains). > Other than being built by the hypervisor, nothing is special about this > domain - although it may be useful to have is_control_domain() return > true depending on the toolstack it uses to build other domains. > > The hardware domain manages devices for PCI pass-through to driver > domains or can act as a driver domain itself, depending on the desired > degree of disaggregation. It is also the domain managing devices that > do not support pass-through: PCI configuration space access, parsing the > hardware ACPI tables and system power or machine check events. This is > the only domain where is_hardware_domain() is true. The return of > is_control_domain() may be false for this domain. > > The control domain manages other domains, controls guest launch and > shutdown, and manages resource constraints; is_control_domain() returns > true. The functionality guarded by is_control_domain may in the future > be adapted to use explicit hypercalls, eliminating the special treatment > of this domain. It may be reasonable to have multiple control domains > on a multi-tenant system. > > Guest domains and other service or driver domains are all treated > identically by the hypervisor; the security policy may further constrain > administrative actions on or communication between these domains. > > Signed-off-by: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx> > Cc: Ian Campbell <ian.campbell@xxxxxxxxxx> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > Cc: Keir Fraser <keir@xxxxxxx> > Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx> > Cc: Tim Deegan <tim@xxxxxxx> > Cc: Xiantao Zhang <xiantao.zhang@xxxxxxxxx> > > --- > Changes from v1: > - Fix up some references to dom0 in comments > - Show real CPUID values to hardware domain > - Require xenoprof to be hardware domain due to lack of cleanup on destroy > - Change a few ( d == dom0 ) references to is_hardware_domain > > xen/arch/arm/domain.c | 6 +++--- > xen/arch/arm/gic.c | 4 ++-- > xen/arch/arm/vgic.c | 2 +- > xen/arch/arm/vtimer.c | 5 +++-- > xen/arch/arm/vuart.c | 2 +- > xen/arch/x86/cpu/mcheck/vmce.c | 2 +- > xen/arch/x86/domain.c | 3 ++- > xen/arch/x86/hvm/i8254.c | 2 +- > xen/arch/x86/mm.c | 2 +- > xen/arch/x86/time.c | 4 ++-- > xen/arch/x86/traps.c | 21 +++++++++++---------- > xen/common/domain.c | 6 +++--- > xen/common/kernel.c | 2 +- > xen/common/xenoprof.c | 8 +++++++- > xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- > xen/drivers/passthrough/iommu.c | 2 +- > xen/drivers/passthrough/vtd/iommu.c | 11 ++++++----- > xen/drivers/passthrough/vtd/x86/vtd.c | 2 +- > xen/include/xen/sched.h | 4 ++-- > 19 files changed, 50 insertions(+), 40 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index 46ee486..2c76a8f 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -540,10 +540,10 @@ int arch_domain_create(struct domain *d, unsigned int > domcr_flags) > > /* > * Virtual UART is only used by linux early printk and decompress code. > - * Only use it for dom0 because the linux kernel may not support > - * multi-platform. > + * Only use it for the hardware domain because the linux kernel may not > + * support multi-platform. > */ > - if ( (d->domain_id == 0) && (rc = domain_vuart_init(d)) ) > + if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) ) > goto fail; > > return 0; > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > index 0095b97..8168b7b 100644 > --- a/xen/arch/arm/gic.c > +++ b/xen/arch/arm/gic.c > @@ -865,10 +865,10 @@ int gicv_setup(struct domain *d) > int ret; > > /* > - * Domain 0 gets the hardware address. > + * The hardware domain gets the hardware address. > * Guests get the virtual platform layout. > */ > - if ( d->domain_id == 0 ) > + if ( is_hardware_domain(d) ) > { > d->arch.vgic.dbase = gic.dbase; > d->arch.vgic.cbase = gic.cbase; > diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c > index 553411d..65f48f2 100644 > --- a/xen/arch/arm/vgic.c > +++ b/xen/arch/arm/vgic.c > @@ -82,7 +82,7 @@ int domain_vgic_init(struct domain *d) > /* Currently nr_lines in vgic and gic doesn't have the same meanings > * Here nr_lines = number of SPIs > */ > - if ( d->domain_id == 0 ) > + if ( is_hardware_domain(d) ) > d->arch.vgic.nr_lines = gic_number_lines() - 32; > else > d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */ > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > index 3d6a721..cb690bb 100644 > --- a/xen/arch/arm/vtimer.c > +++ b/xen/arch/arm/vtimer.c > @@ -54,10 +54,11 @@ int vcpu_domain_init(struct domain *d) > int vcpu_vtimer_init(struct vcpu *v) > { > struct vtimer *t = &v->arch.phys_timer; > - bool_t d0 = (v->domain == dom0); > + bool_t d0 = is_hardware_domain(v->domain); > > /* > - * Domain 0 uses the hardware interrupts, guests get the virtual > platform. > + * Hardware domain uses the hardware interrupts, guests get the virtual > + * platform. > */ > > init_timer(&t->timer, phys_timer_expired, t, v->processor); > diff --git a/xen/arch/arm/vuart.c b/xen/arch/arm/vuart.c > index b9d3ced..c02a8a9 100644 > --- a/xen/arch/arm/vuart.c > +++ b/xen/arch/arm/vuart.c > @@ -46,7 +46,7 @@ > > int domain_vuart_init(struct domain *d) > { > - ASSERT( !d->domain_id ); > + ASSERT( is_hardware_domain(d) ); > > d->arch.vuart.info = serial_vuart_info(SERHND_DTUART); > if ( !d->arch.vuart.info ) > diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c > index ed00f7c..c83375e 100644 > --- a/xen/arch/x86/cpu/mcheck/vmce.c > +++ b/xen/arch/x86/cpu/mcheck/vmce.c > @@ -430,7 +430,7 @@ int unmmap_broken_page(struct domain *d, mfn_t mfn, > unsigned long gfn) > int rc; > > /* Always trust dom0's MCE handler will prevent future access */ > - if ( d == dom0 ) > + if ( is_hardware_domain(d) ) > return 0; > > if (!mfn_valid(mfn_x(mfn))) > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index 0d563de..0cfca2a 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -1504,7 +1504,8 @@ void context_switch(struct vcpu *prev, struct vcpu > *next) > } > > set_cpuid_faulting(is_pv_vcpu(next) && > - (next->domain->domain_id != 0)); > + !is_control_domain(next->domain) && > + !is_hardware_domain(next->domain)); > } > > if (is_hvm_vcpu(next) && (prev != next) ) > diff --git a/xen/arch/x86/hvm/i8254.c b/xen/arch/x86/hvm/i8254.c > index f7493b8..6a91f76 100644 > --- a/xen/arch/x86/hvm/i8254.c > +++ b/xen/arch/x86/hvm/i8254.c > @@ -540,7 +540,7 @@ int pv_pit_handler(int port, int data, int write) > .data = data > }; > > - if ( (current->domain->domain_id == 0) && dom0_pit_access(&ioreq) ) > + if ( is_hardware_domain(current->domain) && dom0_pit_access(&ioreq) ) > { > /* nothing to do */; > } > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c > index fdc5ed3..ad48acc 100644 > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -4787,7 +4787,7 @@ long arch_memory_op(int op, > XEN_GUEST_HANDLE_PARAM(void) arg) > .max_mfn = MACH2PHYS_NR_ENTRIES - 1 > }; > > - if ( !mem_hotplug && current->domain == dom0 ) > + if ( !mem_hotplug && is_hardware_domain(current->domain) ) > mapping.max_mfn = max_page - 1; > if ( copy_to_guest(arg, &mapping, 1) ) > return -EFAULT; > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c > index f904af2..89308bd 100644 > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -1839,7 +1839,7 @@ void tsc_set_info(struct domain *d, > uint32_t tsc_mode, uint64_t elapsed_nsec, > uint32_t gtsc_khz, uint32_t incarnation) > { > - if ( is_idle_domain(d) || (d->domain_id == 0) ) > + if ( is_idle_domain(d) || is_hardware_domain(d) ) > { > d->arch.vtsc = 0; > return; > @@ -1943,7 +1943,7 @@ static void dump_softtsc(unsigned char key) > "warp=%lu (count=%lu)\n", tsc_max_warp, tsc_check_count); > for_each_domain ( d ) > { > - if ( d->domain_id == 0 && d->arch.tsc_mode == TSC_MODE_DEFAULT ) > + if ( is_hardware_domain(d) && d->arch.tsc_mode == TSC_MODE_DEFAULT ) > continue; > printk("dom%u%s: mode=%d",d->domain_id, > is_hvm_domain(d) ? "(hvm)" : "", d->arch.tsc_mode); > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c > index ed4ae2d..21c8b63 100644 > --- a/xen/arch/x86/traps.c > +++ b/xen/arch/x86/traps.c > @@ -732,18 +732,19 @@ int cpuid_hypervisor_leaves( uint32_t idx, uint32_t > sub_idx, > void pv_cpuid(struct cpu_user_regs *regs) > { > uint32_t a, b, c, d; > + struct vcpu *curr = current; > > a = regs->eax; > b = regs->ebx; > c = regs->ecx; > d = regs->edx; > > - if ( current->domain->domain_id != 0 ) > + if ( !is_control_domain(curr->domain) && > !is_hardware_domain(curr->domain) ) > { > unsigned int cpuid_leaf = a, sub_leaf = c; > > if ( !cpuid_hypervisor_leaves(a, c, &a, &b, &c, &d) ) > - domain_cpuid(current->domain, a, c, &a, &b, &c, &d); > + domain_cpuid(curr->domain, a, c, &a, &b, &c, &d); > > switch ( cpuid_leaf ) > { > @@ -751,15 +752,15 @@ void pv_cpuid(struct cpu_user_regs *regs) > { > unsigned int _eax, _ebx, _ecx, _edx; > /* EBX value of main leaf 0 depends on enabled xsave features > */ > - if ( sub_leaf == 0 && current->arch.xcr0 ) > + if ( sub_leaf == 0 && curr->arch.xcr0 ) > { > /* reset EBX to default value first */ > b = XSTATE_AREA_MIN_SIZE; > for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ ) > { > - if ( !(current->arch.xcr0 & (1ULL << sub_leaf)) ) > + if ( !(curr->arch.xcr0 & (1ULL << sub_leaf)) ) > continue; > - domain_cpuid(current->domain, cpuid_leaf, sub_leaf, > + domain_cpuid(curr->domain, cpuid_leaf, sub_leaf, > &_eax, &_ebx, &_ecx, &_edx); > if ( (_eax + _ebx) > b ) > b = _eax + _ebx; > @@ -796,7 +797,7 @@ void pv_cpuid(struct cpu_user_regs *regs) > __clear_bit(X86_FEATURE_DS, &d); > __clear_bit(X86_FEATURE_ACC, &d); > __clear_bit(X86_FEATURE_PBE, &d); > - if ( is_pvh_vcpu(current) ) > + if ( is_pvh_vcpu(curr) ) > __clear_bit(X86_FEATURE_MTRR, &d); > > __clear_bit(X86_FEATURE_DTES64 % 32, &c); > @@ -805,7 +806,7 @@ void pv_cpuid(struct cpu_user_regs *regs) > __clear_bit(X86_FEATURE_VMXE % 32, &c); > __clear_bit(X86_FEATURE_SMXE % 32, &c); > __clear_bit(X86_FEATURE_TM2 % 32, &c); > - if ( is_pv_32bit_vcpu(current) ) > + if ( is_pv_32bit_vcpu(curr) ) > __clear_bit(X86_FEATURE_CX16 % 32, &c); > __clear_bit(X86_FEATURE_XTPR % 32, &c); > __clear_bit(X86_FEATURE_PDCM % 32, &c); > @@ -844,12 +845,12 @@ void pv_cpuid(struct cpu_user_regs *regs) > > case 0x80000001: > /* Modify Feature Information. */ > - if ( is_pv_32bit_vcpu(current) ) > + if ( is_pv_32bit_vcpu(curr) ) > { > __clear_bit(X86_FEATURE_LM % 32, &d); > __clear_bit(X86_FEATURE_LAHF_LM % 32, &c); > } > - if ( is_pv_32on64_vcpu(current) && > + if ( is_pv_32on64_vcpu(curr) && > boot_cpu_data.x86_vendor != X86_VENDOR_AMD ) > __clear_bit(X86_FEATURE_SYSCALL % 32, &d); > __clear_bit(X86_FEATURE_PAGE1GB % 32, &d); > @@ -1860,7 +1861,7 @@ static inline uint64_t guest_misc_enable(uint64_t val) > static int is_cpufreq_controller(struct domain *d) > { > return ((cpufreq_controller == FREQCTL_dom0_kernel) && > - (d->domain_id == 0)); > + is_hardware_domain(d)); > } > > #include "x86_64/mmconfig.h" > diff --git a/xen/common/domain.c b/xen/common/domain.c > index ad8a1b6..b5868a5 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -237,7 +237,7 @@ struct domain *domain_create( > else if ( domcr_flags & DOMCRF_pvh ) > d->guest_type = guest_type_pvh; > > - if ( domid == 0 ) > + if ( is_hardware_domain(d) ) > { > d->is_pinned = opt_dom0_vcpus_pin; > d->disable_migrate = 1; > @@ -262,7 +262,7 @@ struct domain *domain_create( > d->is_paused_by_controller = 1; > atomic_inc(&d->pause_count); > > - if ( domid ) > + if ( !is_hardware_domain(d) ) > d->nr_pirqs = nr_static_irqs + extra_domU_irqs; > else > d->nr_pirqs = nr_static_irqs + extra_dom0_irqs; > @@ -594,7 +594,7 @@ void domain_shutdown(struct domain *d, u8 reason) > d->shutdown_code = reason; > reason = d->shutdown_code; > > - if ( d->domain_id == 0 ) > + if ( is_hardware_domain(d) ) > dom0_shutdown(reason); > > if ( d->is_shutting_down ) > diff --git a/xen/common/kernel.c b/xen/common/kernel.c > index b371f8f..7e83353 100644 > --- a/xen/common/kernel.c > +++ b/xen/common/kernel.c > @@ -303,7 +303,7 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) > arg) > (1U << XENFEAT_auto_translated_physmap); > if ( supervisor_mode_kernel ) > fi.submap |= 1U << XENFEAT_supervisor_mode_kernel; > - if ( current->domain == dom0 ) > + if ( is_hardware_domain(current->domain) ) > fi.submap |= 1U << XENFEAT_dom0; > #ifdef CONFIG_X86 > switch ( d->guest_type ) > diff --git a/xen/common/xenoprof.c b/xen/common/xenoprof.c > index 52ab00d..3c30c3e 100644 > --- a/xen/common/xenoprof.c > +++ b/xen/common/xenoprof.c > @@ -601,9 +601,15 @@ static int xenoprof_op_init(XEN_GUEST_HANDLE_PARAM(void) > arg) > xenoprof_init.cpu_type)) ) > return ret; > > + /* Only the hardware domain may become the primary profiler here > because > + * there is currently no cleanup of xenoprof_primary_profiler or > associated > + * profiling state when the primary profiling domain is shut down or > + * crashes. Once a better cleanup method is present, it will be > possible to > + * allow another domain to be the primary profiler. > + */ > xenoprof_init.is_primary = > ((xenoprof_primary_profiler == d) || > - ((xenoprof_primary_profiler == NULL) && (d->domain_id == 0))); > + ((xenoprof_primary_profiler == NULL) && is_hardware_domain(d))); > if ( xenoprof_init.is_primary ) > xenoprof_primary_profiler = current->domain; > > diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c > b/xen/drivers/passthrough/amd/pci_amd_iommu.c > index c26aabc..cf67494 100644 > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -122,7 +122,7 @@ static void amd_iommu_setup_domain_device( > > BUG_ON( !hd->root_table || !hd->paging_mode || !iommu->dev_table.buffer > ); > > - if ( iommu_passthrough && (domain->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(domain) ) > valid = 0; > > if ( ats_enabled ) > diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c > index 0cf0748..54d891f 100644 > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -817,7 +817,7 @@ static void iommu_dump_p2m_table(unsigned char key) > ops = iommu_get_ops(); > for_each_domain(d) > { > - if ( !d->domain_id ) > + if ( is_hardware_domain(d) ) > continue; > > if ( iommu_use_hap_pt(d) ) > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index beb3723..9c8e4a3 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -1327,7 +1327,7 @@ int domain_context_mapping_one( > return res; > } > > - if ( iommu_passthrough && (domain->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(domain) ) > { > context_set_translation_type(*context, CONTEXT_TT_PASS_THRU); > agaw = level_to_agaw(iommu->nr_pt_levels); > @@ -1723,7 +1723,7 @@ static int intel_iommu_map_page( > return 0; > > /* do nothing if dom0 and iommu supports pass thru */ > - if ( iommu_passthrough && (d->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(d) ) > return 0; > > spin_lock(&hd->mapping_lock); > @@ -1767,7 +1767,7 @@ static int intel_iommu_map_page( > static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn) > { > /* Do nothing if dom0 and iommu supports pass thru. */ > - if ( iommu_passthrough && (d->domain_id == 0) ) > + if ( iommu_passthrough && is_hardware_domain(d) ) > return 0; > > dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K); > @@ -1950,8 +1950,9 @@ static int intel_iommu_remove_device(u8 devfn, struct > pci_dev *pdev) > continue; > > /* > - * If the device belongs to dom0, and it has RMRR, don't remove > - * it from dom0, because BIOS may use RMRR at booting time. > + * If the device belongs to the hardware domain, and it has RMRR, > don't > + * remove it from the hardware domain, because BIOS may use RMRR at > + * booting time. > */ > if ( is_hardware_domain(pdev->domain) ) > return 0; > diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c > b/xen/drivers/passthrough/vtd/x86/vtd.c > index ca17cb1..f271a42 100644 > --- a/xen/drivers/passthrough/vtd/x86/vtd.c > +++ b/xen/drivers/passthrough/vtd/x86/vtd.c > @@ -111,7 +111,7 @@ void __init iommu_set_dom0_mapping(struct domain *d) > { > unsigned long i, j, tmp, top; > > - BUG_ON(d->domain_id != 0); > + BUG_ON(!is_hardware_domain(d)); > > top = max(max_pdx, pfn_to_pdx(0xffffffffUL >> PAGE_SHIFT) + 1); > > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index 00f0eba..466949f 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -775,10 +775,10 @@ void watchdog_domain_destroy(struct domain *d); > /* > * Use this check when the following are both true: > * - Using this feature or interface requires full access to the hardware > - * (that is, this is would not be suitable for a driver domain) > + * (that is, this would not be suitable for a driver domain) > * - There is never a reason to deny dom0 access to this > */ > -#define is_hardware_domain(_d) ((_d)->is_privileged) > +#define is_hardware_domain(_d) ((_d)->domain_id == 0) > > /* This check is for functionality specific to a control domain */ > #define is_control_domain(_d) ((_d)->is_privileged) > -- > 1.8.5.3 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |