[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 13/13] x86/domctl: Implement XEN_DOMCTL_set_cpumsr_policy

>>> On 05.07.18 at 19:55, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 05/07/18 10:28, Jan Beulich wrote:
>>>>> +    /*
>>>>> +     * Audit was successful.  Replace existing policies, leaving the old
>>>>> +     * policies to be freed.
>>>>> +     */
>>>>> +    SWAP(new.cp, d->arch.cpuid);
>>>>> +    SWAP(new.dp, d->arch.msr);
>>>>> +    SWAP(new.vp, v->arch.msr);
>>>>> +
>>>>> +    /* Merge the (now audited) vCPU MSRs into every other 
>>>>> msr_vcpu_policy. */
>>>>> +    for ( ; v; v = v->next_in_list )
>>>> This open-coded almost-for_each_domain() doesn't look very nice.
>>> ITYM for_each_vcpu()
>> Oops, of course.
>>> And yes, but for_each_vcpu() is wrong to use here, and we don't have a
>>> for_each_vcpu_other_than_0() helper.
>> Perhaps still better to do
>>     for_each_vcpu(d, v)
>>     {
>>         if ( v->vcpu_id == 0 )
>>             continue;
>>         ...
>>     }
>> ?
> Do you think that looks cleaner?

Yes. There is exactly one other open coded use of next_in_list in x86
code (with the asm-offsets cleanup in place that I've just sent), and I'd
prefer to see that go away as well (but it's less obvious what a
reasonable replacement would look like there).

> I doubt the compiler can optimise that into my version.

I'm pretty sure it can't, but I'm also pretty sure it doesn't matter here.

>>>>> +    {
>>>>> +        /* XXX - Figure out how to avoid a TOCTOU race here.  XLAT area? 
>>>>> */
>>>>> +        if ( (ret = x86_msr_copy_from_buffer(
>>>>> +                  NULL, v->arch.msr, xdpc->msr_policy, xdpc->nr_msrs, 
>>>>> NULL)) )
>>>> Why can't you go from vCPU 0's v->arch.msr here, which is the copied-in
>>>> (and sanitized) representation already? Also, is it really a good idea to
>>>> assume all vCPU-s have the same policies?
>>> There are multiple colliding issues which lead to this code, but as
>>> several people have pointed out, its probably over the top.
>>> First, as to the same policy.  This hypercall can currently only be used
>>> before the vcpu has started executing.
>>> As such, it is setting the init state of the MSRs from the guests point
>>> of view, and there is exactly one MSR I'm aware of which has an init
>>> value which depends on the core (that being APIC_BASE.BSP which can
>>> trivially be handled in Xen).  All other MSRs have identical init state
>>> AFAICT, and I don't want to create an interface which makes it easy to
>>> accidentally end up with wrong values.
>> So what about migration? There are certainly differing incoming values
>> there. Of course there's the MSRs restore record, but no atomic sanity
>> check between those and the policy here is possible.
> Migration is still a problem.  This CPUID/MSR work is the next step on
> the path to fixing the "state before policy" problem we've got when
> restoring a guest.
> Once we have a working CPUID and R/O MSR configuration "blob" which the
> toolstack can manipulate, we can (in Xen) require that the toolstack
> provide the blob before memory and R/W register state.

Hmm, you talk about r/o MSRs here only, but this then covers the
domain policy object only when looking at what we currently have.
Both MSRs in the vCPU policy object are r/w ones, and hence I'd
like it to be at least clear what the interaction between the policy
and other MSR restore is supposed to be in the end.

This is in particular relevant wrt the derivation of data from vCPU 0
here. With what you do currently, you already make the code
dependent upon the MSRs record coming after the configuration
done here, or else the cloning of vCPU 0 register values would
clobber the intended (restored) ones.

> When we get to that point, the toolstack shall call
> DOMCTL_get_cpumsr_policy (modulo whatever plan I device to fix our R/W
> MSR from the VCPU state problem) and place this ahead of the main
> memory/register state in the migration stream.

It is perhaps the case that r/w MSRs weren't actually meant to go
into the policy structures?

> The receiving side will either feed the blob back to Xen, or fabricate
> the policy out of thin air at this point.  The current behaviour is to
> fabricate a policy out of thin air after all migration stream is
> complete, but before unpausing the domain.

I.e., as mentioned above, you clobber the values coming from the
CPU_MSR record. I don't think this can be the way to go, not even


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.