
Re: [Xen-devel] Supporting consistency of vcpu_runstate_info across cpus



Since, AFAIUI, you're interested in the perspective of non-Linux guests,
I'm adding Roger (and avoiding trimming, for his benefit), who can tell us
what he thinks of all this from the FreeBSD point of view.

On Thu, May 19, 2016 at 10:49 AM, Juergen Gross <jgross@xxxxxxxx> wrote:
> On 19/05/16 10:09, Andrew Cooper wrote:
>> On 19/05/2016 08:53, Juergen Gross wrote:
>>> A guest kernel can use the vcpu_op hypercall sub-op
>>> VCPUOP_register_runstate_memory_area to get a copy of the
>>> vcpu_runstate_info of a vcpu mapped into its memory. As this structure
>>> has no update indicator it is only safe to be read by the vcpu whose
>>> runstate information it contains.
>>>
>>> Being able to read the runstate info of another cpu is required e.g.
>>> by the Linux kernel to be able to calculate vruntime: see
>>>
>>> http://lists.xen.org/archives/html/xen-devel/2016-05/msg01790.html
>>>
>>> I'd suggest adding an "update in progress" indicator in the highest
>>> bit of vcpu_runstate_info->state_entry_time as this structure element is
>>> already used to detect vcpu scheduling when vcpu_runstate_info is read
>>> by the owning vcpu.
>>>
>>> The question is how to enable setting this indicator, as the guest must
>>> be able to cope with it (I believe the Linux kernel would just run fine,
>>> but we can't be sure this is true for all guests).
>>>
>>> I see the following possible solutions:
>>>
>>> a) Introduce a new vcpu_op hypercall sub-op for mapping the
>>>    vcpu_runstate_info with update indicator support (a guest supporting
>>>    this would try the new sub-op first and could fall back to
>>>    VCPUOP_register_runstate_memory_area in case of ENOSYS).
>>>
>>> b) Add a virtual MSR to switch on the feature (not being able to set the
>>>    appropriate bit would indicate the feature not being available). This
>>>    is the variant KVM is using. Does ARM have something like MSRs?
>>>
>>> c) Add another hypercall to switch on the feature (similar to
>>>    XENVER_get_features we could have a XENVER_set_features).
>>>
>>> Any preferences?
>>
>> However, irrespective of how you signal the request for new behaviour,
>> you should see about using a lockless clock rather than a single bit, as
>> a single bit can't indicate the case where a complete update has
>> occurred between two samplings.  This will probably require an extension
>> to the current implementation, at which point you might be able to add a
>> capability field as well.
>
> That's the reason I've chosen state_entry_time as the home for the new
> bit. state_entry_time is guaranteed to change between two updates. So
> the logic would look like the following:
>
> do {
>   old_entry_time = READ_ONCE(r->state_entry_time);
>   rmb();
>   new_state = READ_ONCE(*r);
>   rmb();
> } while (new_state.state_entry_time != old_entry_time ||
>          (old_entry_time >> 63));
>
>> Alternatively, the easiest way will probably be to add a new VMASSIST,
>> which allows the guest to opt into the new behaviour.
>
> Aah, nice. Yes, this seems to be a sensible option.
>
FWIW, this looks like a good approach to me as well.
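
For anyone following along, here is a rough, self-contained C sketch of
the reader-side retry loop quoted above. It is only an illustration under
assumptions, not the actual guest or hypervisor code: the struct is a
local mirror of the vcpu_runstate_info layout, RUNSTATE_UPDATE_FLAG and
read_runstate() are made-up names, and C11 fences stand in for rmb().
Unlike the quoted loop, it re-reads state_entry_time after the copy
instead of comparing against the copied field.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical name for the proposed "update in progress" bit (bit 63). */
#define RUNSTATE_UPDATE_FLAG (UINT64_C(1) << 63)

/* Local mirror of the vcpu_runstate_info layout, for illustration only. */
struct runstate_snapshot {
    int      state;
    uint64_t state_entry_time;
    uint64_t time[4];
};

/*
 * Copy another vcpu's runstate area, retrying while an update is flagged
 * or state_entry_time changed during the copy.
 */
static struct runstate_snapshot
read_runstate(const volatile struct runstate_snapshot *r)
{
    struct runstate_snapshot snap;
    uint64_t before, after;

    do {
        before = r->state_entry_time;
        atomic_thread_fence(memory_order_acquire);   /* stands in for rmb() */

        snap.state = r->state;
        snap.state_entry_time = r->state_entry_time;
        for (int i = 0; i < 4; i++)
            snap.time[i] = r->time[i];

        atomic_thread_fence(memory_order_acquire);   /* stands in for rmb() */
        after = r->state_entry_time;
    } while (after != before || (before & RUNSTATE_UPDATE_FLAG));

    return snap;
}
```

A caller would pass a pointer to the registered runstate area of the
target vcpu. The loop only terminates once the writer has cleared the
flag, so this relies on the hypervisor keeping the flagged window short.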

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
---------------------------------------------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
