
Re: [PATCH v2] x86/vpmu: Fix race-condition in vpmu_load


  • To: Tamas K Lengyel <tamas.k.lengyel@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 29 Sep 2022 16:46:36 +0200
  • Cc: Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • Delivery-date: Thu, 29 Sep 2022 14:46:53 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.09.2022 16:28, Tamas K Lengyel wrote:
> On Thu, Sep 29, 2022 at 9:07 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> 
>> On 26.09.2022 16:22, Tamas K Lengyel wrote:
>>> On Mon, Sep 26, 2022 at 10:12 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>> On 22.09.2022 22:48, Tamas K Lengyel wrote:
>>>>> --- a/xen/arch/x86/cpu/vpmu.c
>>>>> +++ b/xen/arch/x86/cpu/vpmu.c
>>>>> @@ -376,57 +376,24 @@ void vpmu_save(struct vcpu *v)
>>>>>      vpmu->last_pcpu = pcpu;
>>>>>      per_cpu(last_vcpu, pcpu) = v;
>>>>>
>>>>> +    vpmu_set(vpmu, VPMU_CONTEXT_SAVE);
>>>>> +
>>>>>      if ( alternative_call(vpmu_ops.arch_vpmu_save, v, 0) )
>>>>>          vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
>>>>>
>>>>> +    vpmu_reset(vpmu, VPMU_CONTEXT_SAVE);
>>>>> +
>>>>>      apic_write(APIC_LVTPC, PMU_APIC_VECTOR | APIC_LVT_MASKED);
>>>>>  }
>>>>>
>>>>>  int vpmu_load(struct vcpu *v, bool_t from_guest)
>>>>>  {
>>>>>      struct vpmu_struct *vpmu = vcpu_vpmu(v);
>>>>> -    int pcpu = smp_processor_id(), ret;
>>>>> -    struct vcpu *prev = NULL;
>>>>> +    int ret;
>>>>>
>>>>>      if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
>>>>>          return 0;
>>>>>
>>>>> -    /* First time this VCPU is running here */
>>>>> -    if ( vpmu->last_pcpu != pcpu )
>>>>> -    {
>>>>> -        /*
>>>>> -         * Get the context from last pcpu that we ran on. Note that if another
>>>>> -         * VCPU is running there it must have saved this VPCU's context before
>>>>> -         * startig to run (see below).
>>>>> -         * There should be no race since remote pcpu will disable interrupts
>>>>> -         * before saving the context.
>>>>> -         */
>>>>> -        if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
>>>>> -        {
>>>>> -            on_selected_cpus(cpumask_of(vpmu->last_pcpu),
>>>>> -                             vpmu_save_force, (void *)v, 1);
>>>>> -            vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
>>>>> -        }
>>>>> -    }
>>>>> -
>>>>> -    /* Prevent forced context save from remote CPU */
>>>>> -    local_irq_disable();
>>>>> -
>>>>> -    prev = per_cpu(last_vcpu, pcpu);
>>>>> -
>>>>> -    if ( prev != v && prev )
>>>>> -    {
>>>>> -        vpmu = vcpu_vpmu(prev);
>>>>> -
>>>>> -        /* Someone ran here before us */
>>>>> -        vpmu_save_force(prev);
>>>>> -        vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
>>>>> -
>>>>> -        vpmu = vcpu_vpmu(v);
>>>>> -    }
>>>>> -
>>>>> -    local_irq_enable();
>>>>> -
>>>>>      /* Only when PMU is counting, we load PMU context immediately. */
>>>>>      if ( !vpmu_is_set(vpmu, VPMU_RUNNING) ||
>>>>>           (!has_vlapic(vpmu_vcpu(vpmu)->domain) &&
>>>>
>>>> What about the other two uses of vpmu_save_force() in this file? It looks
>>>> to me as if only the use in mem_sharing.c needs to be retained.
>>>
>>> I don't know, maybe. I'd rather focus this patch only on the issue and its
>>> fix, as I don't want to introduce unintended side effects by doing
>>> cleanup/consolidation of other code-paths when not strictly necessary.
>>
>> While I see your point, I'm afraid I don't think I can ack this
>> change without knowing whether the other uses don't expose a similar
>> issue. It would feel wrong to fix only one half of a problem. I may,
>> somewhat hesitantly, give an ack if e.g. Boris offered his R-b.
>> Else the only other option I see is that some other maintainer give
>> their ack.
>>
> 
> I may have misunderstood what you are asking. I thought you were asking if
> the other two remaining users of vpmu_save_force could be switched over to
> vpmu_save as a generic cleanup, to which my answer is still maybe. From the
> perspective of this particular bug those use-cases are safe. One is acting
> on the current vcpu and doesn't try to run vpmu_save_force on a remote
> vcpu; the other is called when the domain is being shut down, so the
> vcpu cannot be in a runnable state.

Hmm, yes - I can accept that. Thanks for the clarification.

Acked-by: Jan Beulich <jbeulich@xxxxxxxx>

Jan



 

