Re: [Xen-devel] [xen-unstable test] 106698: regressions - FAIL
On 16/03/17 14:26, Sergey Dyasli wrote:
> On Thu, 2017-03-16 at 05:15 -0600, Jan Beulich wrote:
>>>>> On 16.03.17 at 10:03, <osstest-admin@xxxxxxxxxxxxxx> wrote:
>>> flight 106698 xen-unstable real [real]
>>> http://logs.test-lab.xenproject.org/osstest/logs/106698/
>>>
>>> Regressions :-(
>>>
>>> Tests which did not succeed and are blocking,
>>> including tests which could not be run:
>>>  test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail REGR.
>>> vs. 106652
>> While there's quite a bit of stuff under test, your recent vVMX series
>> would seem to be the most likely candidate for a regression here. I
>> am, however, puzzled by
>>
>> (XEN) d1v0 VMLAUNCH error: 0
>> (XEN) domain_crash_sync called from vmcs.c:1712
>>
>> in the L1 log - error 0 is supposed to be "no error", and I can't see
>> how VM_INSTRUCTION_ERROR would ever be written to zero.
>> Which leaves there being a path (which I can't spot) where it's not
>> being written, or a problem handling the respective vmread by the
>> guest.
>>
>> Could you take a look, please?
> L1:vmlaunch failed and vmx_vmentry_failure() was called. However it
> doesn't check if the fail was Valid or Invalid. In the latter case
> VM_INSTRUCTION_ERROR would be meaningless.
>
> There are only 2 cases for vmfail_invalid() inside nvmx_handle_vmlaunch():
>
> 1. if ( vcpu_nestedhvm(v).nv_vvmcxaddr == INVALID_PADDR )
>
> That would imply that L0:nvmx_handle_vmptrld() returned VMfail
> and L1:__vmptrld() hit BUG() which is not the case.
>
> 2. if ( nvmx->shadow_vmcs )
>
> I have identified one possible issue with that. H/W looks like Haswell
> and L0 has:
>
>     (XEN) - VMCS shadowing
>
> However L1 is missing "VMCS shadowing" in "VMX advanced features".
> I didn't expect that fact since L1 sees VMX_MISC_VMWRITE_ALL
> in MSR_IA32_VMX_MISC. It must be something else that prevents L1 from
> enabling vmcs shadowing.
>
> Above makes the following check inside nvmx_handle_vmptrld() incorrect:
>
>     (!cpu_has_vmx_vmcs_shadowing && nvmx->shadow_vmcs)
>
> Since cpu_has_vmx_vmcs_shadowing tests L0's capability and not L1's.
>
> Shadow bit will be set by L0:nvmx_set_vmcs_pointer() which might
> suggest that there are other cases with nvmx_handle_vmptrld() re-entrancy
> that I have missed. If the following scenario is possible:
>
>     nvmx_handle_vmptrld()
>         nvcpu->nv_vvmcxaddr == INVALID_PADDR
>         nvmx->shadow_vmcs = false
>         vvmcs->vmcs_revision_id |= VMCS_RID_TYPE_MASK;
>
>     // no nvmx_clear_vmcs_pointer() in between
>
>     nvmx_handle_vmptrld()
>         nvcpu->nv_vvmcxaddr == INVALID_PADDR
>         nvmx->shadow_vmcs = true
>         (!cpu_has_vmx_vmcs_shadowing && nvmx->shadow_vmcs) == false
>
>     nvmx_handle_vmlaunch()
>         nvmx->shadow_vmcs == true
>         vmfail_invalid(regs);
>
> Then it would explain the regression.

Ok - we should revert dc05c0ceeb8609b6d60f6a117a0192e9160946b8 and
b22ee98c4ecc4e7c827451dee01181529df4d26c to unblock master.

I will get to this shortly, unless there are sudden objections.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
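
For readers following the quoted scenario, here is a minimal, self-contained C model of it. It is a sketch, not Xen code: every toy_* name is hypothetical, the value of RID_SHADOW_BIT (bit 31) is an assumption standing in for VMCS_RID_TYPE_MASK, and it assumes that shadow_vmcs is derived from that bit each time the vVMCS is loaded. It only shows how two VMPTRLDs of the same vVMCS, with no clear in between, could leave shadow_vmcs set while the capability check against L0 still passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 31 of the revision id marks a shadow VMCS. */
#define RID_SHADOW_BIT 0x80000000u

/* Hypothetical stand-ins for the virtual VMCS and the nested VMX state. */
struct toy_vvmcs { uint32_t revision_id; };
struct toy_nvmx  { bool shadow_vmcs; bool l0_has_shadowing; };

enum toy_result { TOY_SUCCEED, TOY_FAIL_INVALID };

/* Models the VMPTRLD path as described in the quoted scenario. */
static enum toy_result toy_vmptrld(struct toy_nvmx *n, struct toy_vvmcs *v)
{
    /* Assumed: the flag is read back from the vVMCS revision id. */
    n->shadow_vmcs = (v->revision_id & RID_SHADOW_BIT) != 0;

    /* The check under discussion: it consults L0's capability, not L1's. */
    if (!n->l0_has_shadowing && n->shadow_vmcs)
        return TOY_FAIL_INVALID;

    /* Models L0 setting the shadow bit when it uses VMCS shadowing. */
    if (n->l0_has_shadowing)
        v->revision_id |= RID_SHADOW_BIT;

    return TOY_SUCCEED;
}

/* Models the second vmfail_invalid() case in the VMLAUNCH path. */
static enum toy_result toy_vmlaunch(const struct toy_nvmx *n)
{
    return n->shadow_vmcs ? TOY_FAIL_INVALID : TOY_SUCCEED;
}

/* Run `loads` back-to-back VMPTRLDs of one vVMCS, then one VMLAUNCH. */
static enum toy_result toy_scenario(bool l0_has_shadowing, int loads)
{
    struct toy_vvmcs v = { .revision_id = 0x12 };
    struct toy_nvmx n = { .shadow_vmcs = false,
                          .l0_has_shadowing = l0_has_shadowing };

    assert(loads >= 1);
    for (int i = 0; i < loads; i++)
        if (toy_vmptrld(&n, &v) != TOY_SUCCEED)
            return TOY_FAIL_INVALID;

    return toy_vmlaunch(&n);
}
```

In this model toy_scenario(true, 1) succeeds, while toy_scenario(true, 2) ends in TOY_FAIL_INVALID at launch, which matches the sequence hypothesised above. Whether the real code can actually reach the second load without an intervening clear is exactly the open question in the thread.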