
Re: [RFC] Avoid dom0/HVM performance penalty from MSR access tightening


  • To: Alex Olson <this.is.a0lson@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Date: Thu, 10 Feb 2022 18:27:27 +0000
  • Accept-language: en-GB, en-US
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Delivery-date: Thu, 10 Feb 2022 18:27:56 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHYHqOPCVLXb6vjgEiHGtmfOQDTpayNGm6A
  • Thread-topic: [RFC] Avoid dom0/HVM performance penalty from MSR access tightening

On 10/02/2022 17:27, Alex Olson wrote:
> I'm seeing strange performance issues under Xen on a Supermicro server with a 
> Xeon D-1541 CPU caused by an MSR-related commit.
>
> Commit 322ec7c89f6640ee2a99d1040b6f786cf04872cf 'x86/pv: disallow access to
> unknown MSRs' surprisingly introduces a severe performance penalty where dom0
> has about 1/8th the normal CPU performance. Even when 'xenpm' is used to
> select the performance governor and operate the CPU at maximum frequency,
> actual CPU performance is still 1/2 of normal (the same happens when using
> "cpufreq=xen,performance").
>
> The patch below fixes it but I don't fully understand why.
>
> Basically, when *reads* of MSR_IA32_THERM_CONTROL are blocked, dom0 and
> guests (pinned to other CPUs) see the performance issues.
>
> For benchmarking purposes, I built a small C program that runs a "for loop"
> for 4 billion iterations and timed its execution. In dom0, the performance
> issues also cause HVM guest startup time to go from 9-10 seconds to almost
> 80 seconds.
>
> I assumed Xen was managing CPU frequency and thus blocking related MSR
> access by dom0 (or any other domain). However,  clearly something else
> is happening and I don't understand why.
>
> I initially attempted to copy the same logic as the write MSR case. This
> was effective at fixing the dom0 performance issue, but still left other
> domains running at 1/2 speed. Hence, the change below has no access control.
>
>
> If anyone has any insight as to what is really happening, I would be all ears
> as I am unsure if the change below is a proper solution.

Well that's especially entertaining...

So your patch edits pv/emul-priv-op.c#read_msr(), so is only changing
the behaviour for PV dom0.

What exactly is your small C program doing?
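
For concreteness, here is a minimal sketch of the kind of benchmark described above. It is a guess at the shape only, since the actual program was not posted: the names spin() and time_spin() are hypothetical, and it measures CPU time with clock() rather than whatever timing method was really used.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical stand-in for the benchmark: count up to `iterations`.
 * The counter is volatile so the compiler cannot delete the loop.
 */
static uint64_t spin(uint64_t iterations)
{
    volatile uint64_t counter = 0;

    for (uint64_t i = 0; i < iterations; i++)
        counter++;

    return counter;
}

/* CPU time, in seconds, consumed by spin(iterations), measured with clock(). */
static double time_spin(uint64_t iterations)
{
    clock_t start = clock();

    spin(iterations);

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_spin(4000000000ULL) from main() and printing the result would approximate the "4 billion iterations" run. A loop like this touches no MSRs itself, so any slowdown it shows reflects the speed the CPU is running at rather than anything the program does.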


The change that that patch made was to turn a read which previously
succeeded into a #GP fault.
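
As a toy model of that behaviour change (this is NOT Xen's code; toy_read_msr(), the block_unknown flag and the result enum are all made up for illustration), the difference amounts to what the default case does for an MSR the emulator has no explicit handling for:

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_TSC           0x00000010U /* example of an explicitly handled MSR */
#define MSR_IA32_THERM_CONTROL 0x0000019aU /* the MSR under discussion */

enum toy_result { TOY_OKAY, TOY_GP_FAULT };

/*
 * Toy model of an MSR read handler. With block_unknown == false, reads of
 * unhandled MSRs succeed (old behaviour); with block_unknown == true they
 * produce a #GP fault (new behaviour). Values are placeholders, not
 * real hardware reads.
 */
static enum toy_result toy_read_msr(uint32_t msr, bool block_unknown,
                                    uint64_t *val)
{
    switch (msr) {
    case MSR_IA32_TSC:
        *val = 0;               /* placeholder for the emulated value */
        return TOY_OKAY;

    default:
        if (block_unknown)
            return TOY_GP_FAULT;
        *val = 0;               /* placeholder for a pass-through hardware read */
        return TOY_OKAY;
    }
}
```

In this model, toy_read_msr(MSR_IA32_THERM_CONTROL, false, &v) succeeds, while the same call with block_unknown set to true returns TOY_GP_FAULT.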

The reads were already bogus, even if they appeared to work before.
When dom0 is scheduled around, it no longer knows which CPU's MSR it is
actually reading, so at best, the data being read is racy as to
which CPU you're instantaneously scheduled on.
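
To illustrate the raciness from the userspace side, here is a sketch that reads an MSR through Linux's /dev/cpu/N/msr interface (the msr driver must be loaded and the caller needs sufficient privilege). The helper name read_msr_on_cpu() is hypothetical. The point is that N is the CPU number as dom0 sees it: in a PV dom0 that is a vCPU, and unless it is pinned, nothing ties it to a particular physical CPU at the moment of the read.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read one 64-bit MSR via /dev/cpu/<cpu>/msr. Returns 0 on success and
 * -1 on any failure (bad CPU number, missing device, no permission, or
 * the read itself faulting). `cpu` is the OS-visible CPU number, which
 * in a PV dom0 names a vCPU, not a physical CPU.
 */
static int read_msr_on_cpu(int cpu, uint32_t msr, uint64_t *val)
{
    char path[64];
    ssize_t n;
    int fd;

    if (cpu < 0)
        return -1;

    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The msr device uses the file offset as the MSR index. */
    n = pread(fd, val, sizeof(*val), (off_t)msr);
    close(fd);

    return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

Calling read_msr_on_cpu(0, 0x19a, &val) attempts to read MSR_IA32_THERM_CONTROL on "CPU 0". If the hypervisor refuses the read, one would expect the call to fail and return -1 rather than hand back a value.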


At a guess, something in Linux is doing something especially dumb when
given #GP and is falling into a tight loop of trying to read the MSR. 
Do you happen to know which of those two is the more dominating factor?

~Andrew

 

