[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] xen/x86: ensure copying to L1 guest in update_runstate_area()



On 23/02/17 08:00, Haozhong Zhang wrote:
> On 02/22/17 00:46 -0700, Jan Beulich wrote:
>>>>> On 22.02.17 at 03:20, <haozhong.zhang@xxxxxxxxx> wrote:
>>> On 02/22/17 09:28 +0800, Haozhong Zhang wrote:
>>>> On 02/21/17 02:15 -0700, Jan Beulich wrote:
>>>>>>>> On 21.02.17 at 03:11, <haozhong.zhang@xxxxxxxxx> wrote:
>>> [..] 
>>>>>> +     *
>>>>>> +     * Therefore, we clear the nested guest flag before 
>>>>>> __raw_copy_to_guest()
>>>>>> +     * and __copy_to_guest(), and restore the flag after all guest copy.
>>>>>> +     */
>>>>>> +    if ( is_hvm_vcpu(v) && paging_mode_hap(v->domain) )
>>> I think it would be clearer to use nestedhvm_enabled() here.
>> Indeed - that would eliminate all these open coding of assumption
>> which may be valid at present, but not down the road.
>>
>>>>> And why is this HAP-specific?
>>>>>
>>>> IIUC, nested HVM relies on HAP.
>>> For nested SVM, I find the following check in hvmop_set_param():
>>>     case HVM_PARAM_NESTEDHVM:
>>>         if ( cpu_has_svm && !paging_mode_hap(d) && a.value )
>>>             rc = -EINVAL;
>>>
>>> I don't find the similar check for nested VMX here and in vvmx.c.
>>> Though L1 HVM domain w/ nestedhvm=1 and hap=0 can boot up on Intel
>>> machine (because of the lack of above check?), starting L2 guest does
>>> crash L1 at the very beginning and L0 Xen reports the following debug
>>> messages:
>>>
>>> (XEN) realmode.c:111:d18v9 Failed to emulate insn.
>>> (XEN) Real-mode emulation failed: d18v9 Real @ f000:0000fff0 ->
>>> (XEN) domain_crash called from realmode.c:157
>>> (XEN) Domain 18 (vcpu#9) crashed on cpu#29:
>>> (XEN) ----[ Xen-4.9-unstable  x86_64  debug=y   Not tainted ]----
>>> (XEN) CPU:    29
>>> (XEN) RIP:    f000:[<000000000000fff0>]
>>> (XEN) RFLAGS: 0000000000000002   CONTEXT: hvm guest (d18v9)
>>> (XEN) rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000000
>>> (XEN) rdx: 0000000000000f61   rsi: 0000000000000000   rdi: 0000000000000000
>>> (XEN) rbp: 0000000000000000   rsp: 0000000000000000   r8:  0000000000000000
>>> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>>> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
>>> (XEN) r15: 0000000000000000   cr0: 0000000000000030   cr4: 0000000000002050
>>> (XEN) cr3: 00000000feffc000   cr2: 0000000000000000
>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: f000
>> Now that's of course quite odd: The instruction at that address
>> ought to be a direct far branch, which I think the emulator is able
>> to deal with. Which leaves the possibility of it fetching the wrong
>> bytes (which sadly don't get shown above).
> Yes, the emulator fails at fetching the instruction from L2
> guest. It's at the vm entry to L2 guest after handling the L1
> vmlaunch. The code path from vmx_do_vmentry is shown as below:
>     vmx_asm_do_vmentry
>     vmx_realmode
>     vmx_realmode_emulate_one
>     hvm_emulate_one
>     _hvm_emulate_one
>     x86_emulate
>     x86_decode
>     insn_fetch_type
>     ...
>     hvmemul_insn_fetch
>     __hvmemul_read
>     hvm_fetch_from_guest_linear
>     __hvm_copy(addr = 0xfffffff0, flags = HVMCOPY_from_guest | HVMCOPY_linear)
>     3117    if ( flags & HVMCOPY_linear )
>             {
>     3119        gfn = paging_gva_to_gfn(v, addr, &pfec);
>                 ...
>     3133        gpa |= gfn << PAGE_SHIHT;
>             }
>             ...
>     3151    page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_UNSHARE);
>     3153    if ( !page )
>     3154        return HVMCOPY_bad_gfn_to_mfn;
>
> When hap is not enabled, paging_gva_to_gfn at line 3119 uses hostp2m
> to translate the *L2* gpa rather than walking through the nested
> EPT, so line 3119 actually translates from *L1* gpa 0xfffffff0 to L1
> gfn 0xfffff. The p2m type at gfn 0xfffff is p2m_mmio_dm, so
> get_page_from_gfn() at line 3151 fails.
>
> I don't know whether nested VMX with non-hap ever worked or not, but
> it does not now. I'll include a patch to enforce the hap requirement
> for nested vmx.

This answers one of my open tickets (although being shadow-related, it
wasn't the highest priority item to look at).

Nested VMX definitely used to work with shadow paging, but I am not
surprised if it has bitrotted in the meantime.  One larger area I have
identified in the current swamp of nested-virt problems is that the
current p2m infrastructure makes it far too easy to make mistakes like
this (e.g. c/s bab2bd8e2)

My idea (probably needs refining) to address the issue is to introduce
new TYPE_SAFE()'s for l2 and l2 gfns, and use the compiler to make it
harder to get this wrong.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.