
Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation


  • To: Olaf Hering <olaf@xxxxxxxxx>, <xen-devel@xxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 1 Oct 2018 13:39:51 +0100
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • Delivery-date: Mon, 01 Oct 2018 12:39:58 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 07/06/18 14:08, Olaf Hering wrote:
> Add an option to control when vTSC emulation will be activated for a
> domU with tsc_mode=default. Without such option each TSC access from
> domU will be emulated, which causes a significant performance drop for
> workloads that make use of rdtsc.
>
> One option to avoid the TSC option is to run domUs with tsc_mode=native.
> This has the drawback that migrating a domU from a "2.3GHz" class host
> to a "2.4GHz" class host may change the rate at which the TSC counter
> increases, the domU may not be prepared for that.
>
> With the new option the host admin can decide how a domU should behave
> when it is migrated across systems of the same class. Since there is
> always some jitter when Xen calibrates the cpu_khz value, all hosts of
> the same class will most likely have slightly different values. As a
> result vTSC emulation is unavoidable. Data collected during the incident
> which triggered this change showed a jitter of up to 200 KHz across
> systems of the same class.

Do you have any further details of the systems involved?  If they are
identical systems, they should all have the same real TSC frequency, and
it's a known issue that Xen isn't very good at working out the
frequency.  TBH, fixing that would be far better overall.

>
> Existing padding fields are reused to store vtsc_khz_tolerance as u16.
> The padding is sent as zero in write_tsc_info to the receiving host.
> The padding is undefined if the changed code runs as receiver.

I'm not sure what you mean by this final sentence.

> handle_tsc_info has no code to verify that padding is indeed zero. Due
> to the lack of a version field it is impossible to know if the sender
> already has the newly introduced vtsc_tolerance field. In the worst
> case the receiving domU will get an unemulated TSC.

The lack of padding verification is deliberate, for forwards
compatibility.  Why does the sending code matter?  One way or another,
if the field is 0, the option wasn't present or wasn't configured.
Neither of these situations affects the decision-making that the
receiving side needs to perform.
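
To make the point concrete, here is a minimal sketch of how a receiver could treat the field. It is not the libxc code: the struct and function names are hypothetical, and the layout simply follows the record diagram in the patch, with the former 32-bit reserved word split into a 16-bit tolerance and 16 bits that stay reserved.

```c
#include <stdint.h>

/*
 * Hypothetical mirror of the TSC_INFO record body after the patch.
 * Field names are illustrative, not taken from the libxc sources.
 */
struct tsc_info_rec {
    uint32_t mode;
    uint32_t khz;
    uint64_t nsec;
    uint32_t incarnation;
    uint16_t tolerance;   /* kHz; 0 means absent or not configured */
    uint16_t reserved;    /* deliberately left unchecked */
};

/*
 * Tolerance the receiver should apply. There is no version check and no
 * check of rec->reserved: a zero from an old sender and a zero from a new
 * sender with the option unset lead to the same decision.
 */
static uint16_t received_tolerance_khz(const struct tsc_info_rec *rec)
{
    return rec->tolerance;
}
```

A record from a sender that predates the field and a record from a new sender with the option left at its default both yield 0 here, so the receiver does not need to know which one it is talking to.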

>
> Signed-off-by: Olaf Hering <olaf@xxxxxxxxx>
> Reviewed-by: Wei Liu <wei.liu2@xxxxxxxxxx> (v07/v08)
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> (v08)

I'm still -0.5 for this patch.  I can appreciate why you want it, but it
is a gross hack which only works when you don't skew time more than NTP
in the guest can cope with.  My gut feeling is that there will be other
more subtle fallout.
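
For a sense of scale, the skew can be put in numbers. This is a rough sketch and not part of the patch: it assumes a "2.3GHz" class host (2,300,000 kHz) and the 200 kHz jitter quoted in the commit message, and the function names are made up for illustration. The comparison point of roughly 500 ppm, often quoted as the largest frequency error ntpd will correct, is likewise an assumption about the time daemon running in the guest.

```c
#include <stdint.h>

/*
 * Frequency error, in parts per million, that a guest sees when its TSC
 * runs natively at host_khz while the guest still assumes guest_khz.
 */
static double tsc_error_ppm(uint32_t guest_khz, uint32_t host_khz)
{
    double diff = (double)host_khz - (double)guest_khz;

    if (diff < 0)
        diff = -diff;

    return diff / (double)guest_khz * 1e6;
}

/* Uncorrected clock drift, in seconds per day, for a given ppm error. */
static double drift_seconds_per_day(double ppm)
{
    return ppm * 1e-6 * 86400.0;
}
```

With guest_khz = 2300000 and host_khz = 2300200 this gives roughly 87 ppm, or about 7.5 seconds of drift per day if nothing in the guest corrects it. That is inside the assumed 500 ppm budget, but a larger tolerance setting, or a guest with no time synchronisation at all, changes the picture.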

As for the implementation itself, a few trivial comments.

> diff --git a/docs/man/xen-tscmode.pod.7 b/docs/man/xen-tscmode.pod.7
> index 3bbc96f201..122ae36679 100644
> --- a/docs/man/xen-tscmode.pod.7
> +++ b/docs/man/xen-tscmode.pod.7
> @@ -99,6 +99,9 @@ whether or not the VM has been saved/restored/migrated
>  
>  =back
>  
> +If the tsc_mode is set to "default" the decision to emulate TSC can be
> +tweaked further with the "vtsc_tolerance_khz" option.
> +
>  To understand this in more detail, the rest of this document must
>  be read.
>  
> @@ -211,6 +214,19 @@ is emulated.  Note that, though emulated, the "apparent" TSC frequency
>  will be the TSC frequency of the initial physical machine, even after
>  migration.
>  
> +Since the calibration of the TSC frequency may not be 100% accurate, the
> +exact value of the frequency can change even across reboots.

It can change across reboots for other reasons, e.g. firmware settings.

I'd phrase this as "Since the calibration of the TSC frequency isn't
100% accurate, the value measured by Xen can vary across reboots".

>  This means
> +also several otherwise identical systems can have a slightly different
> +TSC frequency. As a result TSC access will be emulated if a domU is
> +migrated from one host to another, identical host. To avoid the
> +performance impact of TSC emulation a certain tolerance of the measured
> +host TSC frequency can be specified with "vtsc_tolerance_khz". If the
> +measured "cpu_khz" value is within the tolerance range, TSC access
> +remains native. Otherwise it will be emulated. This allows to migrate
> +domUs between identical hardware. If the domU will be migrated to a
> +different kind of hardware, say from a "2.3GHz" to a "2.5GHz" system,
> +TSC will be emulated to maintain the TSC frequency expected by the domU.
> +
>  For environments where both TSC-safeness AND highest performance
>  even across migration is a requirement, application code can be specially
>  modified to use an algorithm explicitly designed into Xen for this purpose.
> diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
> index 47d88243b1..995277794f 100644
> --- a/docs/man/xl.cfg.pod.5.in
> +++ b/docs/man/xl.cfg.pod.5.in
> @@ -1898,6 +1898,16 @@ determined in a similar way to that of B<default> TSC mode.
>  
>  Please see B<xen-tscmode(7)> for more information on this option.
>  
> +=item B<vtsc_tolerance_khz="KHZ">
> +
> +B<(x86 only, relevant only for tsc_mode=default)>
> +When a domU is started, the CPU frequency of the host is used by the domU for
> +TSC related time measurement. Once the domU is either migrated or
> +saved/restored on another host that CPU frequency has to be emulated to avoid
> +timedrift. To avoid the performance penalty of the TSC emulation, allow a
> +certain amount of jitter of the measured CPU frequency on the hosts the domU
> +is supposed to run on. Default value is 0, i.e. no tolerance.

In one of these two paragraphs, I think there needs to be a warning
about clock drift in the guest.

> +
>  =item B<localtime=BOOLEAN>
>  
>  Set the real time clock to local time or to UTC. False (0) by default,
> diff --git a/docs/specs/libxc-migration-stream.pandoc b/docs/specs/libxc-migration-stream.pandoc
> index 73421ff393..0d0f17edb1 100644
> --- a/docs/specs/libxc-migration-stream.pandoc
> +++ b/docs/specs/libxc-migration-stream.pandoc
> @@ -3,7 +3,7 @@
>    Andrew Cooper <<andrew.cooper3@xxxxxxxxxx>>
>    Wen Congyang <<wency@xxxxxxxxxxxxxx>>
>    Yang Hongyang <<hongyang.yang@xxxxxxxxxxxx>>
> -% Revision 2
> +% Revision 3
>  
>  Introduction
>  ============
> @@ -472,7 +472,7 @@ XEN\_DOMCTL\_{get,set}tscinfo hypercall sub-ops.
>      +------------------------+------------------------+
>      | nsec                                            |
>      +------------------------+------------------------+
> -    | incarnation            | (reserved)             |
> +    | incarnation            | tolerance | (reserved) |
>      +------------------------+------------------------+
>  
>  --------------------------------------------------------------------
> @@ -485,6 +485,8 @@ khz              TSC frequency, in kHz.
>  nsec             Elapsed time, in nanoseconds.
>  
>  incarnation      Incarnation.
> +
> +tolerance        Amount of Jitter the domU can handle after migration

Measurement units?

>  --------------------------------------------------------------------
>  
>  \clearpage
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index c342d00732..4a9c43b718 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -2148,8 +2153,25 @@ void tsc_set_info(struct domain *d,
>           * When a guest is created, gtsc_khz is passed in as zero, making
>           * d->arch.tsc_khz == cpu_khz. Thus no need to check incarnation.
>           */
> +        disable_vtsc = d->arch.tsc_khz == cpu_khz;
> +
> +        if ( tsc_mode == TSC_MODE_DEFAULT && gtsc_khz &&
> +             d->arch.vtsc_tolerance_khz )
> +        {
> +            long khz_diff;
> +
> +            khz_diff = ABS((long)(cpu_khz - gtsc_khz));
> +            disable_vtsc = khz_diff <= d->arch.vtsc_tolerance_khz;
> +
> +            printk(XENLOG_G_INFO "d%d: host has %lu kHz,"
> +                   " domU expects %u kHz,"
> +                   " difference of %ld is %s tolerance of %u\n",
> +                   d->domain_id, cpu_khz, gtsc_khz, khz_diff,
> +                   disable_vtsc ? "within" : "outside",
> +                   d->arch.vtsc_tolerance_khz);
> +        }

Newline here.

>          if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
> -             (d->arch.tsc_khz == cpu_khz ||
> +             (disable_vtsc ||
>                (is_hvm_domain(d) &&
>                 hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )
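
The tolerance check in the hunk above can be modelled on its own. The following is a standalone sketch, not Xen code: the function name is hypothetical, and it covers only the frequency comparison, leaving out the tsc_mode, host_tsc_is_safe() and HVM scaling conditions. It computes the difference on unsigned values instead of using ABS() on a cast, which is one possible way to write it and not what the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the "disable_vtsc" decision: true when the host frequency is
 * close enough to the frequency the guest expects.
 *
 * gtsc_khz == 0 stands for a freshly created guest, which adopts the host
 * frequency. tolerance_khz == 0 reduces to the exact-match comparison.
 */
static bool khz_within_tolerance(unsigned long cpu_khz, uint32_t gtsc_khz,
                                 uint16_t tolerance_khz)
{
    unsigned long guest_khz = gtsc_khz ? gtsc_khz : cpu_khz;
    unsigned long diff = cpu_khz > guest_khz ? cpu_khz - guest_khz
                                             : guest_khz - cpu_khz;

    return diff <= tolerance_khz;
}
```

For example, a guest expecting 2300000 kHz on a host measured at 2300200 kHz passes with a tolerance of 200, while the same guest on a 2400000 kHz host does not and would still need emulation or scaling.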

~Andrew
