
Re: [PATCH v2] x86/vpt: Do not take pt_migrate rwlock in some cases



On 29.03.2021 12:40, Roger Pau Monné wrote:
> On Mon, Mar 29, 2021 at 11:56:56AM +0200, Jan Beulich wrote:
>> On 27.03.2021 02:51, Boris Ostrovsky wrote:
>>> Commit 8e76aef72820 ("x86/vpt: fix race when migrating timers between
>>> vCPUs") addressed XSA-336 by introducing a per-domain rwlock that was
>>> intended to protect the periodic timer during vCPU migration. Since such
>>> migration is an infrequent event, no performance impact was expected.
>>>
>>> Unfortunately this turned out not to be the case: on a fairly large
>>> guest (92 vCPUs) we have observed as much as a 40% TPC-C performance
>>> regression with some guest kernels. Further investigation pointed to
>>> the pt_migrate read lock taken in pt_update_irq() as the largest
>>> contributor to this regression. With a large number of vCPUs and a large
>>> number of VM exits (from where pt_update_irq() is always called), the
>>> update of an atomic in read_lock() is thought to be the main cause.
>>>
>>> Stephen Brennan analyzed the locking pattern and classified lock users
>>> as follows:
>>>
>>> 1. Functions which read (maybe write) all periodic_time instances
>>> attached to a particular vCPU. These are functions which use pt_vcpu_lock()
>>> after the commit, such as pt_restore_timer(), pt_save_timer(), etc.
>>> 2. Functions which want to modify a particular periodic_time object.
>>> These functions lock whichever vCPU the periodic_time is attached to,
>>> but since the vCPU could be modified without holding any lock, they are
>>> vulnerable to the bug. Functions in this group use pt_lock(), such as
>>> pt_timer_fn() or destroy_periodic_time().
>>> 3. Functions which not only want to modify the periodic_time, but also
>>> would like to modify the ->vcpu field. These are create_periodic_time()
>>> and pt_adjust_vcpu(). They create the locking imbalance bug for group 2,
>>> but we can't simply hold two vCPU locks due to the deadlock risk.
>>>
>>> Roger Pau Monné then pointed out that group 1 functions don't really
>>> need to hold the pt_migrate rwlock, and that group 3 should be hardened
>>> by holding the appropriate vCPU's tm_lock when adding or deleting a
>>> timer from its list.
>>
>> I'm struggling some with the latter aspect: Is this to mean there is
>> something wrong right now?
> 
> There's nothing wrong right now AFAICT, because per-vCPU users (i.e.
> type 1) also hold the rw lock in read mode when iterating over the
> per-vCPU list.
> 
>> Or does "harden" really mean "just to be
>> on the safe side" here?
> 
> If the per-domain rw lock is no longer read-locked by type 1 users
> then type 2 and type 3 users need to make sure the per-vcpu lock is
> taken before adding or removing items from the per-vcpu lists, or else
> a type 1 user could see list corruption as a result of modifications
> done by type 2 or 3 without holding the per-vcpu lock.

Ah, right. Boris, may I then ask that you avoid the somewhat ambiguous
word "harden" here, and instead make clear that the new locking in fact
"balances" the removal of locks elsewhere?

Jan
