Re: [Xen-devel] [PATCH v2 05/10] x86/HVM/SVM: Add AVIC initialization code
Hi Andrew,

On 1/2/17 23:37, Andrew Cooper wrote:
> On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
> [...]
>> +/*
>> + * Note: Current max index allowed for physical APIC ID table is 255.
>> + */
>> +#define AVIC_PHY_APIC_ID_MAX    0xFF
>> +
>> +#define AVIC_DOORBELL           0xc001011b
>> +
>> +#define AVIC_HPA_SHIFT          12
>> +#define AVIC_HPA_MASK           (((1ULL << 40) - 1) << AVIC_HPA_SHIFT)
>
> What does this mask represent? I have spotted a truncation bug below,
> but how to fix the issue depends on the meaning of the mask.
>
> The only limit I can see in the spec is that the host physical
> addresses must be present in system RAM, which presumably means they
> should follow maxphysaddr like everything else?

Please see below.

> [...]
>> +/*
>> + * Note:
>> + * Currently, svm-avic mode is not supported with nested virtualization.
>> + * Therefore, it is not yet enabled by default. Once the support is
>> + * in place, this should be enabled by default.
>> + */
>> +bool svm_avic = 0;
>> +boolean_param("svm-avic", svm_avic);
>
> We are trying to avoid the proliferation of top-level options. Please
> could you introduce an "svm=" option which takes avic as a sub-boolean,
> similar to how iommu= currently works?

Sure, I will introduce custom_param("svm", parse_svm_param), along the
lines of the sketch below.
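Rough sketch only, modelled on the existing parse_iommu_param(); the
exact token names, and whether further sub-options are wanted, are
still open:

    /* Command line: svm=<...>, e.g. "svm=avic" or "svm=no-avic". */
    static void __init parse_svm_param(char *s)
    {
        char *ss;

        do {
            /* Split the comma-separated list into single tokens. */
            ss = strchr(s, ',');
            if ( ss )
                *ss = '\0';

            if ( !strcmp(s, "avic") )
                svm_avic = true;
            else if ( !strcmp(s, "no-avic") )
                svm_avic = false;

            s = ss ? ss + 1 : NULL;
        } while ( s );
    }
    custom_param("svm", parse_svm_param);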
> [...]
>> +{
>> +    struct avic_phy_apic_id_ent *avic_phy_apic_id_table;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +
>> +    if ( !d->avic_phy_apic_id_table_mfn )
>> +        return NULL;
>> +
>> +    /*
>> +     * Note: APIC ID = 0xff is used for broadcast.
>> +     *       APIC ID > 0xff is reserved.
>> +     */
>> +    if ( index >= 0xff )
>> +        return NULL;
>> +
>> +    avic_phy_apic_id_table = mfn_to_virt(d->avic_phy_apic_id_table_mfn);
>
> This is unsafe, and will break on hosts with more than 5TB of RAM.
>
> As you allocate avic_phy_apic_id_table_mfn with alloc_domheap_page(),
> you must use {map,unmap}_domain_page() to get temporary mappings (which
> will use mfn_to_virt() in the common case), or use
> {map,unmap}_domain_page_global() to get permanent mappings.
>
> As the values in here need modifying across a schedule, I would suggest
> making a global mapping at create time, and storing the result in a
> properly-typed pointer.

Thanks for pointing this out. I'll update both the logical and physical
APIC ID tables to use {map,unmap}_domain_page_global() instead.

Hm... that means I would also need to handle the mapping in
svm_avic_dom_destroy() and svm_avic_init_vmcb(). Something along the
lines of the sketch below.
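Untested sketch: the field becomes a typed pointer instead of the
current avic_phy_apic_id_table_mfn, and most error handling is elided:

    /* At creation time (e.g. in svm_avic_dom_init()): */
    struct page_info *pg = alloc_domheap_page(d, MEMF_no_owner);

    if ( !pg )
        return -ENOMEM;
    s->avic_phy_apic_id_table = __map_domain_page_global(pg);
    clear_page(s->avic_phy_apic_id_table);

    /* The lookup helper then never touches mfn_to_virt(): */
    static struct avic_phy_apic_id_ent *
    avic_get_phy_apic_id_ent(const struct vcpu *v, unsigned int index)
    {
        struct svm_domain *s = &v->domain->arch.hvm_domain.svm;

        /* APIC ID 0xff is broadcast; IDs above it are reserved. */
        if ( !s->avic_phy_apic_id_table || index >= AVIC_PHY_APIC_ID_MAX )
            return NULL;

        return &s->avic_phy_apic_id_table[index];
    }

    /* And in svm_avic_dom_destroy(): */
    if ( s->avic_phy_apic_id_table )
    {
        unmap_domain_page_global(s->avic_phy_apic_id_table);
        s->avic_phy_apic_id_table = NULL;
    }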
> [...]
>> +
>> +    if ( !svm_avic )
>> +        return 0;
>
> || !has_vlapic(d)
>
> HVMLite domains may legitimately be configured without an LAPIC at all.

> [...]
>> +
>> +static inline u32 *
>> +avic_get_bk_page_entry(const struct vcpu *v, u32 offset)
>> +{
>> +    const struct vlapic *vlapic = vcpu_vlapic(v);
>> +    char *tmp;
>> +
>> +    if ( !vlapic || !vlapic->regs_page )
>> +        return NULL;
>> +
>> +    tmp = (char *)page_to_virt(vlapic->regs_page);
>> +    return (u32 *)(tmp + offset);
>
> This is also unsafe over 5TB. vlapic->regs is already a global mapping
> which you should use.

Good point.

> Also, you should leave a comment by the allocation of regs_page that
> AVIC depends on it being a full page allocation.

Ok, I will add a comment in arch/x86/hvm/vlapic.c: vlapic_init().

>> +}
>> +
>> +void svm_avic_update_vapic_bar(const struct vcpu *v, uint64_t data)
>> +{
>> +    const struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +
>> +    s->vmcb->avic_vapic_bar = data & AVIC_VAPIC_BAR_MASK;
>> +    s->vmcb->cleanbits.fields.avic = 0;
>
> While I can appreciate what you are trying to do here, altering the
> physical address in IA32_APICBASE has never functioned properly under
> Xen, as nothing updates the P2M. Worse, with multi-vcpu guests, the use
> of a single shared p2m makes this impossible to do properly.

Ahh.. good to know.

> I know it is non-architectural behaviour, but IMO vlapic_msr_set()
> should be updated to raise #GP[0] when attempting to change the frame,
> to make it fail obviously, rather than having interrupts go missing.
> Also, I see that the Intel side (via vmx_vlapic_msr_changed()) makes
> the same mistake.

Okay. It seems that hvm_msr_write_intercept() already handles this,
doesn't it? So I should be able to pretty much remove the handling for
the MSR here.

>> +}
>> +
>> +int svm_avic_init_vmcb(struct vcpu *v)
>> +{
>> +    paddr_t ma;
>> +    u32 *apic_id_reg;
>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>> +    struct vmcb_struct *vmcb = s->vmcb;
>> +    struct svm_domain *d = &v->domain->arch.hvm_domain.svm;
>> +    const struct vlapic *vlapic = vcpu_vlapic(v);
>> +    struct avic_phy_apic_id_ent entry;
>> +
>> +    if ( !svm_avic )
>> +        return 0;
>> +
>> +    if ( !vlapic || !vlapic->regs_page )
>> +        return -EINVAL;
>
> Somewhere you need a bounds check against the APIC ID being within AVIC
> limits.

We are checking the bound in avic_get_phy_apic_id_ent().

>> +    if ( !s->avic_last_phy_id )
>> +        return -EINVAL;
>
> Why does a pointer into the physical ID page need storing in
> arch_svm_struct?

This is so that we can quickly access the physical APIC ID entry and
update it, without having to call avic_get_phy_apic_id_ent() each time.

>> +
>> +    vmcb->avic_bk_pg_pa = page_to_maddr(vlapic->regs_page) & AVIC_HPA_MASK;
>
> This use of AVIC_HPA_MASK may truncate the address, which is definitely
> a problem.

I'm not quite sure I see the truncation you are pointing out. Could you
please elaborate?

> If AVIC_HPA_MASK isn't just related to maxphysaddr, then you need to
> pass appropriate memflags into the alloc_domheap_page() calls to get a
> suitable page.

If by "maxphysaddr" you mean the 52-bit physical address limit of PAE
mode, then I think that is what the mask relates to.

>> +
>> +    entry = *(s->avic_last_phy_id);
>> +    smp_rmb();
>> +    entry.bk_pg_ptr = (vmcb->avic_bk_pg_pa & AVIC_HPA_MASK) >> AVIC_HPA_SHIFT;
>> +    entry.is_running = 0;
>> +    entry.valid = 1;
>> +    *(s->avic_last_phy_id) = entry;
>> +    smp_wmb();
>
> During domain creation, no guests are running, so you can edit this
> cleanly. What values are actually being excluded here? This, and other
> patches, look like you are actually just trying to do a read_atomic(),
> modify, write_atomic() update, rather than actually requiring ordering.

Besides using read_atomic(), modify, write_atomic() to update the entry,
I also want to make sure that the compiler does not shuffle the order
around, which I thought could be achieved via smp_rmb() and smp_wmb().
Or do you mean something like the sketch below?
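(Untested sketch, if I understand the suggestion correctly. The entry
is 64 bits wide, so it goes through a union here, as
read_atomic()/write_atomic() operate on scalar values.)

    union {
        struct avic_phy_apic_id_ent ent;
        uint64_t raw;
    } e;

    /* Snapshot the whole entry with a single 64-bit read. */
    e.raw = read_atomic((uint64_t *)s->avic_last_phy_id);

    e.ent.bk_pg_ptr  = (vmcb->avic_bk_pg_pa & AVIC_HPA_MASK) >> AVIC_HPA_SHIFT;
    e.ent.is_running = 0;
    e.ent.valid      = 1;

    /* Publish the updated entry with a single 64-bit write. */
    write_atomic((uint64_t *)s->avic_last_phy_id, e.raw);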
> [...]
>> @@ -1799,6 +1802,10 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
>>      switch ( msr )
>>      {
>> +    case MSR_IA32_APICBASE:
>> +        if ( svm_avic_vcpu_enabled(v) )
>> +            svm_avic_update_vapic_bar(v, msr_content);
>> +        break;
>
> I think this is dead code. MSR_IA32_APICBASE is handled in
> hvm_msr_write_intercept(), and won't fall into the vendor hook. If you
> were to continue down this route, I would add another pi_ops hook, and
> replace the VMX layering violation in vlapic_msr_set().

I see now. I will remove the code added here.

Thanks,
Suravee