
Re: [PATCH v2 2/3] x86/time: adjust time recording time_calibration_tsc_rendezvous()


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 8 Feb 2021 17:39:35 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Claudemir Todo Bom <claudemir@xxxxxxxxxxx>
  • Delivery-date: Mon, 08 Feb 2021 16:39:53 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Feb 08, 2021 at 12:50:09PM +0100, Jan Beulich wrote:
> On 08.02.2021 12:05, Roger Pau Monné wrote:
> > On Mon, Feb 08, 2021 at 11:56:01AM +0100, Jan Beulich wrote:
> >> On 05.02.2021 17:15, Roger Pau Monné wrote:
> >>> I've been thinking this all seems doomed when Xen runs in a virtualized
> >>> environment, and should likely be disabled. There's no point in trying
> >>> to sync the TSC over multiple vCPUs as the scheduling delay between
> >>> them will likely skew any calculations.
> >>
> >> We may want to consider to force the equivalent of
> >> "clocksource=tsc" in that case. Otoh a well behaved hypervisor
> >> underneath shouldn't lead to us finding a need to clear
> >> TSC_RELIABLE, at which point this logic wouldn't get engaged
> >> in the first place.
> > 
> > I got the impression that on a loaded system guests with a non-trivial
> > amount of vCPUs might be in trouble to be able to schedule them all
> > close enough for the rendezvous to not report a big skew, and thus
> > disable TSC_RELIABLE?
> 
> No, check_tsc_warp() / tsc_check_reliability() don't have a
> problem there. Every CPU reads the shared "most advanced"
> stamp before reading its local one. So it doesn't matter how
> large the gaps are here. In fact the possible bad effect is
> the other way around here - if the scheduling effects are
> too heavy, we may mistakenly consider TSCs reliable when
> they aren't.
> 
> A problem of the kind you describe exists in the actual
> rendezvous function. And actually any problem of this kind
> can, on a smaller scale, already be observed with SMT,
> because the individual hyperthreads of a core can't
> possibly all run at the same time.

Indeed, I confused tsc_check_reliability() with the actual rendezvous
function. So IMO the adjustments done by the rendezvous are likely
pointless when running virtualized, since it's unlikely that all the
vCPUs can be scheduled at once to execute the rendezvous.
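
For reference, the check described in the quoted text above (read the
shared stamp first, then compare the local TSC against it) can be
sketched roughly as follows. This is a simplified, single-threaded
model and not the actual Xen code: struct warp_state and
warp_check_step() are hypothetical names, the spinlock that would
protect the shared stamp is omitted, and the TSC values are passed in
by the caller instead of being read with rdtsc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state shared by all CPUs taking part in the check. */
struct warp_state {
    uint64_t last_tsc;  /* stamp published by the previous reader */
    uint64_t max_warp;  /* largest backwards step observed so far */
};

/*
 * One step of the check as seen by a single CPU. A real
 * implementation would hold a lock around the first two statements
 * and obtain local_tsc from the hardware.
 */
static void warp_check_step(struct warp_state *s, uint64_t local_tsc)
{
    uint64_t prev = s->last_tsc;  /* read the shared stamp first */

    s->last_tsc = local_tsc;      /* publish this CPU's reading */

    /* Only a reading that is behind the shared stamp counts as a warp. */
    if (prev > local_tsc && prev - local_tsc > s->max_warp)
        s->max_warp = prev - local_tsc;
}
```

Since a warp is only recorded when a reading is behind the shared
stamp, a long gap between two CPUs' steps can hide a lagging TSC
instead of producing a false positive. With a second CPU lagging by
100 cycles, calling warp_check_step() with 1000 and then 910 (a 10
cycle gap) records a warp of 90, while 1000 followed by 1400 (a 500
cycle gap) records none.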

> As occurs to me only now, I think we can improve accuracy
> some (in particular on big systems) by making sure
> struct calibration_rendezvous's master_tsc_stamp is not
> sharing its cache line with semaphore and master_stime. The
> latter get written by (at least) the BSP, while
> master_tsc_stamp is stable after the 2nd loop iteration.
> Hence on the 3rd and 4th iterations we could even prefetch
> it to reduce the delay on the last one.

Seems like a possibility indeed.
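
A minimal sketch of what such a layout could look like, under stated
assumptions: the struct below is a hypothetical stand-in and not the
real struct calibration_rendezvous, the field types are simplified
(plain int instead of atomic_t, int64_t instead of s_time_t), the
64-byte line size is assumed, and the prefetch helper relies on the
GCC/Clang __builtin_prefetch builtin.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; real code would use the platform constant. */
#define CACHE_LINE 64

/* Hypothetical layout for illustration only. */
struct rendezvous_sketch {
    /* Fields written during every rendezvous iteration. */
    int semaphore;
    int64_t master_stime;

    /*
     * Stable after the 2nd loop iteration, so it is placed on its own
     * cache line to avoid being invalidated by the writes above.
     */
    alignas(CACHE_LINE) uint64_t master_tsc_stamp;
};

/* Index of the cache line that a given struct offset falls into. */
static inline size_t cache_line_of(size_t offset)
{
    return offset / CACHE_LINE;
}

/* Read-prefetch hint, e.g. for the 3rd and 4th iterations. */
static inline void prefetch_master_tsc(const struct rendezvous_sketch *r)
{
    __builtin_prefetch(&r->master_tsc_stamp, 0, 3);
}
```

The alignment also pads the struct to a multiple of the line size, so
nothing placed after an instance shares the line with master_tsc_stamp.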

Thanks, Roger.



 

