Re: [Xen-devel] Nested virtualization off VMware vSphere 6.0 with EL6 guests crashes on Xen 4.6
>>> On 15.01.16 at 22:39, <konrad.wilk@xxxxxxxxxx> wrote:
> On Tue, Jan 12, 2016 at 02:22:03AM -0700, Jan Beulich wrote:
>> Since we can (I hope) pretty much exclude a paging type, the
>> ASSERT() must have triggered because of vapic_pg being NULL.
>> That might be verifiable without extra printk()s, just by checking
>> the disassembly (assuming the value sits in a register). In which
>> case vapic_gpfn would be of interest too.
>
> The vapic_gpfn is 0xffffffffffff.
>
> To be exact:
>
> nvmx_update_virtual_apic_address:vCPU0 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe
>
> Based on this:
>
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index cb6f9b8..8a0abfc 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -695,7 +695,15 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
>
> vapic_gpfn = __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR) >> PAGE_SHIFT;
> vapic_pg = get_page_from_gfn(v->domain, vapic_gpfn, &p2mt, P2M_ALLOC);
> - ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> + if ( !vapic_pg ) {
> + printk("%s:vCPU%d 0x%lx(vAPIC) 0x%lx(APIC), 0x%lx(TPR) ctrl=%x\n", __func__,v->vcpu_id,
> + __get_vvmcs(nvcpu->nv_vvmcx, VIRTUAL_APIC_PAGE_ADDR),
> + __get_vvmcs(nvcpu->nv_vvmcx, APIC_ACCESS_ADDR),
> + __get_vvmcs(nvcpu->nv_vvmcx, TPR_THRESHOLD),
> + ctrl);
> + }
> + ASSERT(vapic_pg);
> + ASSERT(vapic_pg && !p2m_is_paging(p2mt));
> __vmwrite(VIRTUAL_APIC_PAGE_ADDR, page_to_maddr(vapic_pg));
> put_page(vapic_pg);
> }

Interesting: I can't see VIRTUAL_APIC_PAGE_ADDR being written
with all ones anywhere, neither for the real VMCS nor for the virtual
one (page_to_maddr() can't, afaict, return such a value). Could you
check whether the L1 guest itself is writing that value, or whether it
fails to initialize that field and it happens to start out as all ones?

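As a minimal sketch of why an all-ones field value cannot resolve to a page: the helpers below are hypothetical (they are not Xen functions) and assume 4 KiB pages and a caller-supplied physical address width. They show what gpfn the shift in nvmx_update_virtual_apic_address() would derive from the logged value, and that the value is neither page-aligned nor inside any realistic address width.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages, as on x86 */

/* Hypothetical helper, not part of Xen: frame number of a raw field value. */
static inline uint64_t addr_to_gpfn(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

/*
 * Hypothetical helper: true if addr is page-aligned and has no bits set at
 * or above paddr_bits. paddr_bits is assumed to be in the range 13..63.
 */
static inline bool vapic_addr_plausible(uint64_t addr, unsigned int paddr_bits)
{
    uint64_t page_mask = (UINT64_C(1) << PAGE_SHIFT) - 1;

    if ( addr & page_mask )
        return false;                  /* low 12 bits must be clear */

    return (addr >> paddr_bits) == 0;  /* nothing above the address width */
}
```

For the logged 0xffffffffffffffff, addr_to_gpfn() yields 0xfffffffffffff and vapic_addr_plausible() is false for any width, which would be consistent with get_page_from_gfn() returning NULL.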
>> What looks odd to me is the connection between
>> CPU_BASED_TPR_SHADOW being set and the use of a (valid)
>> virtual APIC page: Wouldn't this rather need to depend on
>> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, just like in
>> nvmx_update_apic_access_address()?
>
> Could be. I added a read of the secondary control:
>
> nvmx_update_virtual_apic_address:vCPU2 0xffffffffffffffff(vAPIC) 0x0(APIC), 0x0(TPR) ctrl=b5b9effe sec=0
>
> So trying your recommendation:
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index cb6f9b8..d291c91 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -686,8 +686,8 @@ static void nvmx_update_virtual_apic_address(struct vcpu *v)
> struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
> u32 ctrl;
>
> - ctrl = __n2_exec_control(v);
> - if ( ctrl & CPU_BASED_TPR_SHADOW )
> + ctrl = __n2_secondary_exec_control(v);
> + if ( ctrl & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES )
> {
> p2m_type_t p2mt;
> unsigned long vapic_gpfn;
>
>
> Got me:
> (XEN) stdvga.c:151:d1v0 leaving stdvga mode
> (XEN) stdvga.c:147:d1v0 entering stdvga and caching modes
> (XEN) stdvga.c:520:d1v0 leaving caching mode
> (XEN) vvmx.c:2491:d1v0 Unknown nested vmexit reason 80000021.
> (XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state

Interesting. I've just noticed that a similarly odd-looking (to me)
dependency exists in construct_vmcs(). Perhaps I've overlooked
something in the SDM. In any event I think some words from the
VMX maintainers would be quite nice here.
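
To make the two candidate conditions concrete, here is a small sketch that decodes the control words from the log lines quoted above. The bit positions follow the VM-execution control layout in the Intel SDM (primary bit 21 "use TPR shadow", primary bit 31 "activate secondary controls", secondary bit 0 "virtualize APIC accesses"). The EX_ macro names and both helper functions are made up for this example and are not taken from Xen's headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the Intel SDM; the EX_ names are local to this example. */
#define EX_CPU_BASED_TPR_SHADOW          UINT32_C(0x00200000) /* primary, bit 21 */
#define EX_CPU_BASED_ACTIVATE_SECONDARY  UINT32_C(0x80000000) /* primary, bit 31 */
#define EX_SECONDARY_VIRT_APIC_ACCESSES  UINT32_C(0x00000001) /* secondary, bit 0 */

/* Hypothetical helper: condition currently tested in the function. */
static inline bool uses_tpr_shadow(uint32_t ctrl)
{
    return (ctrl & EX_CPU_BASED_TPR_SHADOW) != 0;
}

/*
 * Hypothetical helper: the alternative condition. Secondary controls only
 * take effect when the primary controls activate them.
 */
static inline bool virtualizes_apic_accesses(uint32_t ctrl, uint32_t sec)
{
    if ( !(ctrl & EX_CPU_BASED_ACTIVATE_SECONDARY) )
        return false;

    return (sec & EX_SECONDARY_VIRT_APIC_ACCESSES) != 0;
}
```

For ctrl=b5b9effe and sec=0, uses_tpr_shadow() is true while virtualizes_apic_accesses() is false, so the two conditions pick different paths for this L1 guest.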

Sadly the VMCS dump doesn't include the two APIC-related
addresses...

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel