
Re: [Xen-devel] [PATCH 2/2] x86/time: update vtsc_last with cmpxchg and drop vtsc_lock



On 16.12.2019 13:30, Roger Pau Monné wrote:
> On Mon, Dec 16, 2019 at 12:21:09PM +0100, Jan Beulich wrote:
>> On 16.12.2019 11:00, Roger Pau Monné wrote:
>>> On Fri, Dec 13, 2019 at 10:48:02PM +0000, Igor Druzhinin wrote:
>>>> Now that vtsc_last is the only entity protected by vtsc_lock we can
>>>> simply update it using a single atomic operation and drop the spinlock
>>>> entirely. This is extremely important for the case of running nested
>>>> (e.g. a shim instance with lots of vCPUs assigned), since if preemption
>>>> happens somewhere inside the critical section, other vCPUs immediately
>>>> stop progressing (and are probably preempted as well) while waiting for
>>>> the spinlock to be freed.
>>>>
>>>> This fixes constant shim guest boot lockups with ~32 vCPUs when vCPU
>>>> overcommit is present (which increases the likelihood of preemption).
>>>>
>>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
>>>> ---
>>>>  xen/arch/x86/domain.c        |  1 -
>>>>  xen/arch/x86/time.c          | 16 ++++++----------
>>>>  xen/include/asm-x86/domain.h |  1 -
>>>>  3 files changed, 6 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>>>> index bed19fc..94531be 100644
>>>> --- a/xen/arch/x86/domain.c
>>>> +++ b/xen/arch/x86/domain.c
>>>> @@ -539,7 +539,6 @@ int arch_domain_create(struct domain *d,
>>>>      INIT_PAGE_LIST_HEAD(&d->arch.relmem_list);
>>>>  
>>>>      spin_lock_init(&d->arch.e820_lock);
>>>> -    spin_lock_init(&d->arch.vtsc_lock);
>>>>  
>>>>      /* Minimal initialisation for the idle domain. */
>>>>      if ( unlikely(is_idle_domain(d)) )
>>>> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
>>>> index 216169a..202446f 100644
>>>> --- a/xen/arch/x86/time.c
>>>> +++ b/xen/arch/x86/time.c
>>>> @@ -2130,19 +2130,15 @@ u64 gtsc_to_gtime(struct domain *d, u64 tsc)
>>>>  
>>>>  uint64_t pv_soft_rdtsc(const struct vcpu *v, const struct cpu_user_regs *regs)
>>>>  {
>>>> -    s_time_t now = get_s_time();
>>>> +    s_time_t old, new, now = get_s_time();
>>>>      struct domain *d = v->domain;
>>>>  
>>>> -    spin_lock(&d->arch.vtsc_lock);
>>>> -
>>>> -    if ( (int64_t)(now - d->arch.vtsc_last) > 0 )
>>>> -        d->arch.vtsc_last = now;
>>>> -    else
>>>> -        now = ++d->arch.vtsc_last;
>>>> -
>>>> -    spin_unlock(&d->arch.vtsc_lock);
>>>> +    do {
>>>> +        old = d->arch.vtsc_last;
>>>> +        new = (int64_t)(now - d->arch.vtsc_last) > 0 ? now : old + 1;
>>>
>>> Why do you need to do this subtraction? Isn't it easier to just do:
>>>
>>> new = now > d->arch.vtsc_last ? now : old + 1;
>>
>> This wouldn't be reliable when the TSC wraps. Remember that firmware
>> may set the TSC, and it has been seen set to very large values
>> (effectively negative, if they were signed quantities),
> 
> s_time_t is a signed value AFAICT (s64).

Oh, I should have looked at the types rather than inferring uint64_t,
in particular for something like vtsc_last.

>> which
>> will then eventually wrap (whereas we're not typically concerned about
>> 64-bit counters wrapping when they start from zero).
> 
> But get_s_time returns the system time in ns since boot, not the TSC
> value, hence it will start from 0 and we shouldn't be concerned about
> wraps?

Good point, seeing that all parts here are s_time_t. With everything
signed there's indeed no need for the cast, but then comparing the two
values is equivalent to comparing their difference against zero.
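
To illustrate (a standalone sketch with made-up counter values, not code
from the patch): for an unsigned counter that firmware may have set close
to the top of its range, the difference test keeps ordering correctly
across the wrap, while a direct comparison does not:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical values straddling a 64-bit wrap, e.g. a
         * firmware-set TSC that started near the top of the range. */
        uint64_t last = UINT64_MAX - 5;
        uint64_t now = 10;                 /* the counter has wrapped */

        /* Direct comparison misorders the values across the wrap... */
        printf("%d\n", now > last);                /* prints 0 */

        /* ...while the signed difference is a small positive number. */
        printf("%d\n", (int64_t)(now - last) > 0); /* prints 1 */

        return 0;
    }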

Jan
