
Re: [Xen-devel] [PATCH v2 01/11] x86/domctl: Add XEN_DOMCTL_set_avail_vcpus



On 14/11/16 14:59, Boris Ostrovsky wrote:
> On 11/09/2016 02:47 PM, Boris Ostrovsky wrote:
>> On 11/09/2016 02:23 PM, Andrew Cooper wrote:
>>> On 09/11/16 15:29, Boris Ostrovsky wrote:
>>>> On 11/09/2016 10:04 AM, Andrew Cooper wrote:
>>>>> On 09/11/16 14:39, Boris Ostrovsky wrote:
>>>>>> This domctl is called when a VCPU is hot-(un)plugged into a guest (via
>>>>>> 'xl vcpu-set'). While it is currently expected to be needed only by PVH
>>>>>> guests, we will call this domctl for all (x86) guests for consistency.
>>>>>>
>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
>>>>>> Acked-by: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx>
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> * Added comment in arch_do_domctl() stating that the domctl is only
>>>>>>   required for PVH guests
>>>>> I am not happy with this change until we understand why it is needed.
>>>>>
>>>>> Are we genuinely saying that there is no current enforcement in the
>>>>> PV-hotplug mechanism?
>>>> That's right. Don't call setup_cpu_watcher() in Linux and you will be
>>>> running with maxvcpus.
>>> /sigh - Quality engineering there...
>>>
>>> Yes - let's take the time to actually do this properly.
>>>
>>>>>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>>>>>> index 2a2fe04..b736d4c 100644
>>>>>> --- a/xen/arch/x86/domctl.c
>>>>>> +++ b/xen/arch/x86/domctl.c
>>>>>> @@ -1430,6 +1430,23 @@ long arch_do_domctl(
>>>>>>          }
>>>>>>          break;
>>>>>>  
>>>>>> +    case XEN_DOMCTL_set_avail_vcpus:
>>>>>> +    {
>>>>>> +        unsigned int num = domctl->u.avail_vcpus.num;
>>>>>> +
>>>>>> +        /*
>>>>>> +         * This is currently only needed by PVH guests (but
>>>>>> +         * any guest is free to make this call).
>>>>>> +         */
>>>>>> +        ret = -EINVAL;
>>>>>> +        if ( num > d->max_vcpus )
>>>>>> +            break;
>>>>>> +
>>>>>> +        d->arch.avail_vcpus = num;
>>>>>> +        ret = 0;
>>>>>> +        break;
>>>>>> +    }
>>>>> What do you actually mean by "avail_vcpus"?  What happens if a vcpu
>>>>> higher than the new number is currently online and running?  What
>>>>> happens to the already-existing vcpus-at-startup value?
>>>> It shouldn't happen: we set avail_vcpus at domain creation time to
>>>> vcpus-at-startup.
>>>>
>>>> The name is not great. It would have been better to call it online_vcpus
>>>> but that usually means that VPF_down is set (which can happen, for
>>>> example, when the guest offlines a VCPU).
>>> How about an availability bitmap instead, which always has max_vcpus
>>> bits in it?  Xen should consult the bitmap before allowing a VM to
>>> online a new vcpu.
>> We could indeed use a bitmap (and then it will actually be easier to
>> handle the io request, as we can just copy the appropriate bytes of the
>> map instead of going bit by bit). This will still require the new domctl.

Given the lack of any enforcement mechanism, introducing one is clearly
needed.  Please say so in the commit message.
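
For concreteness, the bitmap-backed handler might look something like
this (a sketch only: the d->arch.avail_vcpus_bitmap field is an
assumption for illustration, not what the posted patch has):

    case XEN_DOMCTL_set_avail_vcpus:
    {
        /*
         * Sketch: keep the count-based interface, but track
         * availability as a bitmap of d->max_vcpus bits so Xen can
         * cheaply consult it on the VCPU online path.
         */
        unsigned int i, num = domctl->u.avail_vcpus.num;

        ret = -EINVAL;
        if ( num > d->max_vcpus )
            break;

        /* VCPUs [0, num) are permitted; everything above is not. */
        for ( i = 0; i < d->max_vcpus; i++ )
        {
            if ( i < num )
                __set_bit(i, d->arch.avail_vcpus_bitmap);
            else
                __clear_bit(i, d->arch.avail_vcpus_bitmap);
        }

        ret = 0;
        break;
    }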

>>
>> I am not convinced, though, that we can start enforcing this new VCPU
>> count, at least for PV guests. They expect to bring up all max VCPUs and
>> then offline them. This can, of course, be fixed, but all non-updated
>> kernels will stop booting.
>
> How about we don't clear _VPF_down if the bit in the availability bitmap
> is not set?

This is yet another PV mess.  We clearly need to quirk PV guests as the
exception to sanity, given that they expect (and have been able) to
online all cpus at start-of-day.
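
That check could be as simple as the following sketch (the field name
and the placement in the VCPUOP_up handler are illustrative, assuming
the bitmap above):

    case VCPUOP_up:
        /*
         * Sketch: refuse to clear _VPF_down for a VCPU outside the
         * permitted set, so a guest cannot online a VCPU the
         * toolstack has not made available.
         */
        ret = -EPERM;
        if ( !test_bit(vcpuid, d->arch.avail_vcpus_bitmap) )
            break;

        ret = 0;
        if ( test_and_clear_bit(_VPF_down, &v->pause_flags) )
            vcpu_wake(v);
        break;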

To avoid race conditions, you necessarily need to be able to set a
reduced permitted map before asking the VM to unplug.

For HVM guests, we can set a proper permitted map at boot, and really
should do so.

For PV guests, we have to wait until the guest has completed its SMP
bringup before reducing the permitted set.  Therefore, the initial
set_avail_vcpus call could perhaps be deferred until the first unplug
request?
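
i.e. something along these lines on the toolstack side (every name
below is hypothetical pseudocode, not an existing libxl/libxc
interface):

    /* Hypothetical toolstack helper illustrating the ordering. */
    static int vcpu_hotunplug(uint32_t domid, unsigned int new_avail)
    {
        int rc;

        /*
         * 1. Shrink the permitted set first, so the guest cannot race
         *    us and re-online a VCPU while the unplug is in flight.
         *    For PV, this is also where the deferred initial
         *    set_avail_vcpus call would be made.
         */
        rc = set_avail_vcpus(domid, new_avail);        /* hypothetical */
        if ( rc )
            return rc;

        /*
         * 2. Only then ask the guest to unplug, e.g. via the
         *    cpu/<N>/availability xenstore nodes used by PV hotplug.
         */
        return request_guest_unplug(domid, new_avail); /* hypothetical */
    }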

It also occurs to me that you necessarily need a get_avail_vcpus
hypercall to be able to use this interface sensibly from the toolstack.
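
i.e. a read-side counterpart along these lines (the subop name is an
assumption; with the bitmap variant it would return the map rather
than a count):

    case XEN_DOMCTL_get_avail_vcpus:
        /* Sketch: mirror the set side, following the existing
         * arch_do_domctl copyback pattern. */
        domctl->u.avail_vcpus.num = d->arch.avail_vcpus;
        copyback = true;
        ret = 0;
        break;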

~Andrew
