Re: [Xen-devel] [PATCH 2/6] x86/domctl: Fix TOCTOU race with the use of XEN_DOMCTL_getvcpuextstate
>>> On 12.09.16 at 14:02, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 12/09/16 12:17, Jan Beulich wrote:
>>>>> On 12.09.16 at 11:51, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> A toolstack must call XEN_DOMCTL_getvcpuextstate twice; first to find the
>>> size of the buffer to use, and a second time to get the actual content.
>>>
>>> The reported size was based on v->arch.xcr0_accum, but a guest which extends
>>> its xcr0_accum between the two hypercalls will cause the toolstack to fail
>>> the evc->size != size check, as the provided buffer is now too small. This
>>> causes a hard error during the final phase of migration.
>>>
>>> Instead, return a size based on xfeature_mask, which is the maximum
>>> size Xen will ever permit. The hypercall must now tolerate a
>>> toolstack-provided buffer which is overly large (for the case where a guest
>>> isn't using all available xsave states), and should write back how much data
>>> was actually written into the buffer.
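For reference, the two-call pattern the race affects looks roughly like this
on the toolstack side (a sketch against the public domctl interface; the
libxc bounce-buffer handling and error checking are elided):

    struct xen_domctl domctl = {
        .cmd = XEN_DOMCTL_getvcpuextstate,
        .domain = domid,
        .u.vcpuextstate.vcpu = vcpu,
        /* NULL buffer and zero size/mask: this is the size query. */
    };

    do_domctl(xch, &domctl);
    size = domctl.u.vcpuextstate.size;      /* derived from xcr0_accum */

    buf = malloc(size);
    set_xen_guest_handle_raw(domctl.u.vcpuextstate.buffer, buf);
    domctl.u.vcpuextstate.size = size;

    /* If the guest extends xcr0_accum before this second call, the old
     * evc->size != size check fails and the migration aborts. */
    do_domctl(xch, &domctl);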
>> To be honest, I'm of two minds here. Part of me thinks this is an
>> okay change. However, in particular ...
>>
>>> --- a/xen/arch/x86/domctl.c
>>> +++ b/xen/arch/x86/domctl.c
>>> @@ -1054,19 +1054,25 @@ long arch_do_domctl(
>>> unsigned int size;
>>>
>>> ret = 0;
>>> - vcpu_pause(v);
>>>
>>> - size = PV_XSAVE_SIZE(v->arch.xcr0_accum);
>>> if ( (!evc->size && !evc->xfeature_mask) ||
>>> guest_handle_is_null(evc->buffer) )
>>> {
>>> + /*
>>> + * A query for the size of buffer to use. Must return the
>>> + * maximum size we ever might hand back to userspace, bearing
>>> + * in mind that the vcpu might increase its xcr0_accum between
>>> + * this query for size, and the following query for data.
>>> + */
>>> evc->xfeature_mask = xfeature_mask;
>>> - evc->size = size;
>>> - vcpu_unpause(v);
>>> + evc->size = PV_XSAVE_SIZE(xfeature_mask);
>>> goto vcpuextstate_out;
>>> }
>>>
>>> - if ( evc->size != size || evc->xfeature_mask != xfeature_mask )
>>> + vcpu_pause(v);
>>> + size = PV_XSAVE_SIZE(v->arch.xcr0_accum);
>>> +
>>> + if ( evc->size < size || evc->xfeature_mask != xfeature_mask )
>> ... the relaxation from != to < looks somewhat fragile to me, going
>> forward.
>
> What is fragile about it? It is a very common pattern in general, and
> we use it elsewhere.
If we get handed too large a buffer, what meaning do you want to
assign to the extra bytes? Treating them as meaningless is just one
possible interpretation.
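To make the question concrete: with the relaxed check the caller has to
treat everything past the written-back size as undefined, along these lines
(a sketch; consume_xsave() is a hypothetical consumer, and max_size would in
practice come from the preceding size query):

    buf = malloc(max_size);                 /* from the size query */
    evc.size = max_size;
    /* ... data retrieval hypercall ... */
    /* Xen wrote back evc.size <= max_size; the tail of buf has no
     * assigned meaning -- which is exactly the open question. */
    consume_xsave(buf, evc.size);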
>> Did you consider dealing with the issue in the tool stack?
>
> Yes, and rejected doing so.
>
>> It can't be that hard to repeat the size query in case data retrieval fails.
>
> In principle, this would be true if the size case were distinguishable from
> all the other cases which currently return -EINVAL.
It easily is, as long as the caller knows what size it used on the most
recent earlier call (since it would then notice it having grown).
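I.e. something along these lines in the caller (a sketch; the
query_extstate_size()/get_extstate() helpers are hypothetical wrappers
around the two flavours of the hypercall):

    /* Remember the size used on the previous attempt; on failure,
     * re-issue the size query and only treat the failure as fatal if
     * the size did not grow. */
    size = query_extstate_size(xch, domid, vcpu);   /* NULL buffer */
    for ( ;; )
    {
        buf = realloc(buf, size);
        rc = get_extstate(xch, domid, vcpu, buf, size);
        if ( rc == 0 )
            break;
        newsize = query_extstate_size(xch, domid, vcpu);
        if ( newsize <= size )
            goto fail;          /* genuine error, not a grown xcr0_accum */
        size = newsize;         /* xcr0_accum grew in between: retry */
    }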
>>> @@ -1103,6 +1109,10 @@ long arch_do_domctl(
>>> }
>>>
>>> vcpu_unpause(v);
>>> +
>>> + /* Specify how much data we actually wrote into the buffer. */
>>> + if ( !ret )
>>> + evc->size = size;
>> ... if this got written on the earlier error path, there wouldn't even
>> be a need to retry the size query: Data retrieval could be retried
>> with the new size right after enlarging the buffer.
>
> True, but userspace hypercalls are expensive for the domain in question,
> and take a global spinlock in Xen which is expensive for the system as a
> whole. Looking forward to PVH control domains, any hypercall will be
> far more expensive (a vmexit/entry vs syscall/sysret).
>
> As such, I am not happy introducing unnecessary hypercalls, as they are
> entirely detrimental to dom0 and Xen.
Let's be realistic: How often do you expect xcr0_accum to change
in practice between the size query and the data retrieval? I'm
perfectly fine fixing the theoretical issue, but I don't think there's
a performance aspect to be considered when deciding how to deal
with it.
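Concretely, if evc->size were written back on the error path as well, the
caller's retry would collapse to this (same hypothetical helpers as above,
with get_extstate() now assumed to copy Xen's written-back size into *size
even on failure):

    uint64_t used = size;

    rc = get_extstate(xch, domid, vcpu, buf, &size);
    while ( rc < 0 && errno == EINVAL && size > used )
    {
        /* Xen reported the size it now needs: enlarge the buffer and
         * retry the data retrieval directly, with no extra size query. */
        used = size;
        buf = realloc(buf, size);
        rc = get_extstate(xch, domid, vcpu, buf, &size);
    }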
Jan