
Re: [Xen-devel] [PATCH v3] xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall



>>> On 11.06.19 at 18:09, <andrii.anisov@xxxxxxxxx> wrote:
> On 07.06.19 17:23, Jan Beulich wrote:
>>>>> On 24.05.19 at 20:12, <andrii.anisov@xxxxxxxxx> wrote:
>>> From: Andrii Anisov <andrii_anisov@xxxxxxxx>
>>>
>>> The existing interface for registering the runstate area by its virtual
>>> address is prone to issues which became more obvious with KPTI enablement
>>> in guests. The nature of those issues is that the guest can be interrupted
>>> by the hypervisor at any time, and there is no guarantee that the
>>> registered virtual address is translatable with the currently active
>>> guest page tables. Before KPTI such a situation was only possible when
>>> the guest was caught in the middle of PT processing (e.g. superpage
>>> shattering). With KPTI it also happens whenever the guest runs userspace,
>>> so with a pretty high probability.
>> 
>> Except when there's no need for KPTI in the guest in the first place,
>> as is the case for x86-64 PV guests. I think this is worth clarifying.
>> 
>>> So it was agreed to register the runstate area by its guest physical
>>> address, so that its mapping is permanent from the hypervisor's point
>>> of view. [1]
>>>
>>> The hypercall employs the same vcpu_register_runstate_memory_area
>>> structure for the interface, but requires the registered area not to
>>> cross a page boundary.
>>>
>>> [1] 
>>> https://lists.xenproject.org/archives/html/xen-devel/2019-02/msg00416.html 
>>>
>>> Signed-off-by: Andrii Anisov <andrii_anisov@xxxxxxxx>
>> 
>> I would have really hoped that you would outline the intended interaction
>> between both interfaces,
> 
> I'd assume no specific interaction between the interfaces. I can hardly
> imagine realistic use-cases where such interaction would be needed.
> 
>> such that while reviewing one can judge whether
>> you're actually matching the goal. I think it is actually mandatory to make
>> explicit in the public header what level of mixing is permitted, what is not
>> permitted, and what mix of requests is simply undefined.
>>
>> In particular, while we did work out during prior discussions that some
>> level of mixing has to be permitted, I'm unconvinced that arbitrary
>> mixing has to be allowed. For example, switching from one model to
>> another could be permitted only with just a single active vCPU. This
>> might allow to do away again with the runstate_in_use field you add.
> 
> Well, my point here is to leave it as is, maybe with more documentation.
> If one likes shooting themselves in the foot, we should only care that the
> ricochet does not harm the hypervisor or other guests.
> If you disagree, please suggest your interaction model, and I'll be happy
> to implement it.

Well, if "mix as you like" is fine for guests to follow, then okay. But
we need to be _really_ certain there's no issue with this. Relaxing
the interface is always possible, while tightening an interface is
almost always at least a problem, if possible at all.
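
For reference, registering by guest physical address would presumably look
something like this from the guest side (just a sketch in Linux-guest style;
virt_to_phys() and HYPERVISOR_vcpu_op() are assumed helpers, and the boundary
check mirrors the restriction stated in the patch description):

    /* Sketch only: register this vCPU's runstate area by guest physical
     * address.  The new hypercall requires the area not to cross a page
     * boundary. */
    static int register_runstate_phys(unsigned int vcpu,
                                      struct vcpu_runstate_info *info)
    {
        struct vcpu_register_runstate_memory_area area = { };

        if ( ((unsigned long)info & ~PAGE_MASK) + sizeof(*info) > PAGE_SIZE )
            return -EINVAL; /* must fit within a single page */

        area.addr.p = virt_to_phys(info); /* guest physical address */
        return HYPERVISOR_vcpu_op(VCPUOP_register_runstate_phys_memory_area,
                                  vcpu, &area);
    }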

>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -274,17 +274,15 @@ static void ctxt_switch_to(struct vcpu *n)
>>>       virt_timer_restore(n);
>>>   }
>>>   
>>> -/* Update per-VCPU guest runstate shared memory area (if registered). */
>>> -static void update_runstate_area(struct vcpu *v)
>>> +static void update_runstate_by_gvaddr(struct vcpu *v)
>>>   {
>>>       void __user *guest_handle = NULL;
>>>   
>>> -    if ( guest_handle_is_null(runstate_guest(v)) )
>>> -        return;
>>> +    ASSERT(!guest_handle_is_null(runstate_guest_virt(v)));
>>>   
>>>       if ( VM_ASSIST(v->domain, runstate_update_flag) )
>>>       {
>>> -        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
>>> +        guest_handle = &v->runstate_guest.virt.p->state_entry_time + 1;
>>>           guest_handle--;
>>>           v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>>>           __raw_copy_to_guest(guest_handle,
>>> @@ -292,7 +290,7 @@ static void update_runstate_area(struct vcpu *v)
>>>           smp_wmb();
>>>       }
>>>   
>>> -    __copy_to_guest(runstate_guest(v), &v->runstate, 1);
>>> +    __copy_to_guest(runstate_guest_virt(v), &v->runstate, 1);
>> 
>> In a prior version you did the mechanical part of adjusting the VA-based
>> code in a prereq patch, aiding review. Is there a particular reason you
>> folded everything into one patch now?
> 
> I silently followed a suggestion from George [1]. Any objections?

Hmm, I can't read this into George's suggestion. As I understand it, he
suggested not to split the definition of the new interface from its
implementation. But that doesn't necessarily mean squashing _everything_
into one patch.

>>> +static bool update_runstate_by_gpaddr_compat(struct vcpu *v)
>>> +{
>>> +    struct compat_vcpu_runstate_info *runstate =
>>> +            (struct compat_vcpu_runstate_info *)v->runstate_guest.phys;
>>> +
>>> +    ASSERT(runstate != NULL);
>>> +
>>> +    if ( VM_ASSIST(v->domain, runstate_update_flag) )
>>> +    {
>>> +        runstate->state_entry_time |= XEN_RUNSTATE_UPDATE;
>>> +        smp_wmb();
>>> +        v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>>> +    }
>>> +
>>> +    {
>>> +        struct compat_vcpu_runstate_info info;
>>> +        XLAT_vcpu_runstate_info(&info, &v->runstate);
>>> +        memcpy(v->runstate_guest.phys, &info, sizeof(info));
>>> +    }
>>> +    else
>>> +        memcpy(v->runstate_guest.phys, &v->runstate, sizeof(v->runstate));
>> 
>> This "else" does not seem to be paired with an if(). Does this build
>> at all?
> 
> This particular one - not!
> And it is really strange; I remember checking that the patch compiles for
> x86. I'm looking through the git reflog to see at which amend it became
> broken.
> But I also do not completely understand the meaning of "_compat", and
> whether it needs to be supported here?

Well, I'm afraid I don't understand what you're after. Of course
compat mode guests need to continue to be supported, and the
new interface would also better be available to them. And it is
a fact that their runstate area layout differs from that of 64-bit
guests.
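
Presumably what was intended is a structure along these lines (a sketch, not
what the patch has to use; has_32bit_shinfo() stands in for whatever compat
check ends up being appropriate):

    if ( has_32bit_shinfo(v->domain) )
    {
        /* Compat guests use the 32-bit runstate layout, so translate. */
        struct compat_vcpu_runstate_info info;

        XLAT_vcpu_runstate_info(&info, &v->runstate);
        memcpy(v->runstate_guest.phys, &info, sizeof(info));
    }
    else
        /* 64-bit guests can take the native layout directly. */
        memcpy(v->runstate_guest.phys, &v->runstate, sizeof(v->runstate));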

>>> +    mfn = domain_page_map_to_mfn(v->runstate_guest.phys);
>>> +
>>> +    unmap_domain_page_global((void *)
>>> +                             ((unsigned long)v->runstate_guest.phys &
>>> +                              PAGE_MASK));
>>> +
>>> +    v->runstate_guest.phys = NULL;
>> 
>> I think you would better store NULL before unmapping.
> 
> Do you mean using a local variable to pass the address to
> unmap_domain_page_global(), and setting v->runstate_guest.phys to NULL
> prior to the unmap?

Yes.
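
I.e. roughly this (a sketch of the ordering I have in mind; whether a barrier
is needed in between depends on what else can race here):

    mfn_t mfn = domain_page_map_to_mfn(v->runstate_guest.phys);
    void *va = (void *)((unsigned long)v->runstate_guest.phys & PAGE_MASK);

    /* Store NULL before unmapping, so no other path can dereference a
     * mapping which is about to go away. */
    v->runstate_guest.phys = NULL;
    smp_wmb();

    unmap_domain_page_global(va);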

>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -163,17 +163,31 @@ struct vcpu
>>>       void            *sched_priv;    /* scheduler-specific data */
>>>   
>>>       struct vcpu_runstate_info runstate;
>>> +
>>> +    enum {
>>> +        RUNSTATE_NONE = 0,
>>> +        RUNSTATE_PADDR = 1,
>>> +        RUNSTATE_VADDR = 2,
>>> +    } runstate_guest_type;
>>> +
>>> +    unsigned long runstate_in_use;
>> 
>> Why "unsigned long"? Isn't a bool all you need?
> 
> A bool should be enough. But it seems we will have a lock here.
> 
>> Also these would now all want to be grouped in a sub-structure named
>> "runstate", rather than having "runstate_" prefixes.
> 
> The `runstate` member already has the type `struct vcpu_runstate_info`,
> which is an interface type.
> `runstate_guest` is a union. I'd not like moving the `runstate_guest` union
> into another substructure, because we would get long lines like
> `v->struct.runstate_guest.virt.p->state_entry_time`.

You didn't get my point then: What I'm after is

    struct {
        struct vcpu_runstate_info info;
        enum {
            RUNSTATE_NONE,
            RUNSTATE_PADDR,
            RUNSTATE_VADDR,
        } guest_type;
        bool in_use;
    } runstate;

(and of course the transformation to runstate.info broken out
into a separate prereq patch).
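
Field accesses then stay reasonably short, e.g. (illustration only):

    v->runstate.guest_type = RUNSTATE_PADDR;
    memcpy(v->runstate_guest.phys, &v->runstate.info,
           sizeof(v->runstate.info));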

Jan

