Re: [Xen-devel] [PATCH 2/6] x86/domctl: Fix TOCTOU race with the use of XEN_DOMCTL_getvcpuextstate
>>> On 12.09.16 at 14:02, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 12/09/16 12:17, Jan Beulich wrote:
>>>>> On 12.09.16 at 11:51, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> A toolstack must call XEN_DOMCTL_getvcpuextstate twice; first to find the
>>> size of the buffer to use, and a second time to get the actual content.
>>>
>>> The reported size was based on v->arch.xcr0_accum, but a guest which extends
>>> its xcr0_accum between the two hypercalls will cause the toolstack to fail
>>> the evc->size != size check, as the provided buffer is now too small. This
>>> causes a hard error during the final phase of migration.
>>>
>>> Instead, return a size based on xfeature_mask, which is the maximum
>>> size Xen will ever permit. The hypercall must now tolerate a
>>> toolstack-provided buffer which is overly large (for the case where a guest
>>> isn't using all available xsave states), and should write back how much data
>>> was actually written into the buffer.
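For reference, the two-call pattern the race affects looks roughly like this
on the toolstack side (a sketch against the public domctl interface; the
libxc bounce-buffer handling and error checking are elided):

    struct xen_domctl domctl = {
        .cmd = XEN_DOMCTL_getvcpuextstate,
        .domain = domid,
        .u.vcpuextstate.vcpu = vcpu,
        /* NULL buffer and zero size/mask: this is the size query. */
    };

    do_domctl(xch, &domctl);
    size = domctl.u.vcpuextstate.size;      /* derived from xcr0_accum */

    buf = malloc(size);
    set_xen_guest_handle_raw(domctl.u.vcpuextstate.buffer, buf);
    domctl.u.vcpuextstate.size = size;

    /* If the guest extends xcr0_accum before this second call, the old
     * evc->size != size check fails and the migration aborts. */
    do_domctl(xch, &domctl);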
>> To be honest, I'm of two minds here. Part of me thinks this is an
>> okay change. However, in particular ...
>>
>>> --- a/xen/arch/x86/domctl.c
>>> +++ b/xen/arch/x86/domctl.c
>>> @@ -1054,19 +1054,25 @@ long arch_do_domctl(
>>> unsigned int size;
>>>
>>> ret = 0;
>>> - vcpu_pause(v);
>>>
>>> - size = PV_XSAVE_SIZE(v->arch.xcr0_accum);
>>> if ( (!evc->size && !evc->xfeature_mask) ||
>>> guest_handle_is_null(evc->buffer) )
>>> {
>>> + /*
>>> + * A query for the size of buffer to use. Must return the
>>> + * maximum size we ever might hand back to userspace, bearing
>>> + * in mind that the vcpu might increase its xcr0_accum between
>>> + * this query for size, and the following query for data.
>>> + */
>>> evc->xfeature_mask = xfeature_mask;
>>> - evc->size = size;
>>> - vcpu_unpause(v);
>>> + evc->size = PV_XSAVE_SIZE(xfeature_mask);
>>> goto vcpuextstate_out;
>>> }
>>>
>>> - if ( evc->size != size || evc->xfeature_mask != xfeature_mask )
>>> + vcpu_pause(v);
>>> + size = PV_XSAVE_SIZE(v->arch.xcr0_accum);
>>> +
>>> + if ( evc->size < size || evc->xfeature_mask != xfeature_mask )
>> ... the relaxation from != to < looks somewhat fragile to me, going
>> forward.
>
> What is fragile about it? It is a very common pattern in general, and
> we use it elsewhere.
If we get handed too large a buffer, what meaning do you want to
assign to the extra bytes? Treating them as meaningless is just one
possible interpretation.
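To make the question concrete: with the relaxed check the caller has to
treat everything past the written-back size as undefined, along these lines
(a sketch; consume_xsave() is a hypothetical consumer, and max_size would in
practice come from the preceding size query):

    buf = malloc(max_size);                 /* from the size query */
    evc.size = max_size;
    /* ... data retrieval hypercall ... */
    /* Xen wrote back evc.size <= max_size; the tail of buf has no
     * assigned meaning -- which is exactly the open question. */
    consume_xsave(buf, evc.size);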
>> Did you consider dealing with the issue in the tool stack?
>
> Yes, and rejected doing so.
>
>> It can't be that hard to repeat the size query in case data retrieval fails.
>
> In principle, this would be true if the size case were distinguishable from
> all the other cases which currently return -EINVAL.
It easily is, as long as the caller knows what size it used on the most
recent earlier call (since it would then notice it having grown).
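I.e. something along these lines in the caller (a sketch; the
query_extstate_size()/get_extstate() helpers are hypothetical wrappers
around the two flavours of the hypercall):

    /* Remember the size used on the previous attempt; on failure,
     * re-issue the size query and only treat the failure as fatal if
     * the size did not grow. */
    size = query_extstate_size(xch, domid, vcpu);   /* NULL buffer */
    for ( ;; )
    {
        buf = realloc(buf, size);
        rc = get_extstate(xch, domid, vcpu, buf, size);
        if ( rc == 0 )
            break;
        newsize = query_extstate_size(xch, domid, vcpu);
        if ( newsize <= size )
            goto fail;          /* genuine error, not a grown xcr0_accum */
        size = newsize;         /* xcr0_accum grew in between: retry */
    }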
>>> @@ -1103,6 +1109,10 @@ long arch_do_domctl(
>>> }
>>>
>>> vcpu_unpause(v);
>>> +
>>> + /* Specify how much data we actually wrote into the buffer. */
>>> + if ( !ret )
>>> + evc->size = size;
>> ... if this got written on the earlier error path, there wouldn't even
>> be a need to retry the size query: Data retrieval could be retried
>> with the new size right after enlarging the buffer.
>
> True, but userspace hypercalls are expensive for the domain in question,
> and take a global spinlock in Xen which is expensive for the system as a
> whole. Looking forward to PVH control domains, any hypercall will be
> far more expensive (a vmexit/entry vs syscall/sysret).
>
> As such, I am not happy introducing unnecessary hypercalls, as they are
> entirely detrimental to dom0 and Xen.
Let's be realistic: How often do you expect xcr0_accum to change
in practice between the size query and the data retrieval? I'm
perfectly fine fixing the theoretical issue, but I don't think there's
a performance aspect to be considered when deciding how to deal
with it.
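Concretely, if evc->size were written back on the error path as well, the
caller's retry would collapse to this (same hypothetical helpers as above,
with get_extstate() now assumed to copy Xen's written-back size into *size
even on failure):

    uint64_t used = size;

    rc = get_extstate(xch, domid, vcpu, buf, &size);
    while ( rc < 0 && errno == EINVAL && size > used )
    {
        /* Xen reported the size it now needs: enlarge the buffer and
         * retry the data retrieval directly, with no extra size query. */
        used = size;
        buf = realloc(buf, size);
        rc = get_extstate(xch, domid, vcpu, buf, &size);
    }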
Jan