[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/vpmu: fix race-condition in vpmu_load


  • To: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 19 Sep 2022 11:27:38 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=RfjF2Ow96NNxWkR0u2rSRqq5+oo8NUCL9NhBeZjuwM4=; b=CJEkqlYkEWTG+385JcW8j91TOh1NvNaBEKEcL9j0HwsjhxIcDat7xS1QOW4u3pnK4zIx64vH6I7/913ZzgYM+n0FH9/cytaQGb8sskXuF83vWZiNKfrzg20gKBGBcuA0cC2guBUXVzd9vWcDkshJsHpaNnICRVP5wM390GpWCeZs7KR/aCxCoJzm919HPZe6G9pR0rHnJSuGoNIcvRarxYQf5QOdzUXSX7zsjH54W3YtVIjyI5mTAqAbgo/Ft05J5mA4i7LF+3/uTikyq10J9IlL3PeiDxEtqvu3mp3kCmvSuHmW7jHy60z5d7y64USbjhL59YUjL1X3dZel/060yg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=PLXivkKfA/atjgTmwy91+llHlSxxy7CUpQpysvRtRYKflu15GD0JCy0cOTdIfDTea5J4Q5BZMZQ5VjdlJFVeLlCWwLtLVUzCyAsFN27um9I6uNgN7HsMrIf8ee41E+IYqHX1rAncc+0ZTvhePh8r99jphvkQTgsuhgt3IV1wOcDIFe5aLPXWuZoNXSzuUgF/x1cwf67rpL8wlqRGkdVd1eAGeQ11Jkigm23jOxHKGTnTAQ6pFeTOmgDX7mPoZmIl8dtEjgIoOU4MVGzMxF8rbRJC79bra0rzGkodBW6vmLSfIjBzr2eiUvSktjEVXGmWnmzTH3vvvY1+NlpAmXjh1w==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Tamas K Lengyel <tamas.lengyel@xxxxxxxxx>
  • Delivery-date: Mon, 19 Sep 2022 09:27:48 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 16.09.2022 23:35, Boris Ostrovsky wrote:
> 
> On 9/16/22 8:52 AM, Jan Beulich wrote:
>> On 15.09.2022 16:01, Tamas K Lengyel wrote:
>>> While experimenting with the vPMU subsystem an ASSERT failure was
>>> observed in vmx_find_msr because the vcpu_runnable state was true.
>>>
>>> The root cause of the bug appears to be the fact that the vPMU subsystem
>>> doesn't save its state on context_switch. The vpmu_load function will 
>>> attempt
>>> to gather the PMU state if its still loaded two different ways:
>>>      1. if the current pcpu is not where the vcpu ran before doing a remote 
>>> save
>>>      2. if the current pcpu had another vcpu active before doing a local 
>>> save
>>>
>>> However, in case the prev vcpu is being rescheduled on another pcpu its 
>>> state
>>> has already changed and vcpu_runnable is returning true, thus #2 will trip 
>>> the
>>> ASSERT. The only way to avoid this race condition is to make sure the
>>> prev vcpu is paused while being checked and its context saved. Once the prev
>>> vcpu is resumed and does #1 it will find its state already saved.
>> While I consider this explanation plausible, I'm worried:
>>
>>> --- a/xen/arch/x86/cpu/vpmu.c
>>> +++ b/xen/arch/x86/cpu/vpmu.c
>>> @@ -419,8 +419,10 @@ int vpmu_load(struct vcpu *v, bool_t from_guest)
>>>           vpmu = vcpu_vpmu(prev);
>>>   
>>>           /* Someone ran here before us */
>>> +        vcpu_pause(prev);
>>>           vpmu_save_force(prev);
>>>           vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
>>> +        vcpu_unpause(prev);
>>>   
>>>           vpmu = vcpu_vpmu(v);
>>>       }
>> We're running with IRQs off here, yet vcpu_pause() waits for the vcpu
>> to actually be de-scheduled. Even with IRQs on this is already a
>> relatively heavy operation (also including its impact on the remote
>> side). Additionally the function is called from context_switch(), and
>> I'm unsure of the usability of vcpu_pause() on such a path. In
>> particular: Is there a risk of two CPUs doing this mutually to one
>> another? If so, is deadlocking excluded?
>>
>> Hence at the very least I think the description wants extending, to
>> discuss the safety of the change.
>>
>> Boris - any chance you could comment here? Iirc that's code you did
>> introduce.
> 
> 
> Is the assertion in vmx_find_msr() really needs to be for runnable vcpu or 
> can it be a check on whether vcpu is actually running (e.g. RUNSTATE_running)?

You cannot safely check for "running", as "runnable" may transition
to/from "running" behind your back.

Jan

> I think call chain vpmu_load()->..->*_vpmu_save()->...->vmx_find_msr() is the 
> only one where we are doing it for non-current vcpu. If we can guarantee that 
> run state is set after vpmu_load() call (maybe it is already, I haven't 
> checked) then testing the state may avoid the assertion.
> 
> 
> -boris
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.