
Re: [PATCH] x86/pv: Split pv_hypercall() in two


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 12 Oct 2021 11:26:35 +0200
  • Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 12 Oct 2021 09:26:43 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 11.10.2021 20:05, Andrew Cooper wrote:
> The is_pv_32bit_vcpu() conditionals hide four lfences, with two taken on any
> individual path through the function.  There is very little code common
> between compat and native, and context-dependent conditionals predict very
> badly for a period of time after context switch.
> 
> Move do_entry_int82() from pv/traps.c into pv/hypercall.c, allowing
> _pv_hypercall() to be static and forced inline.  The delta is:
> 
>   add/remove: 0/0 grow/shrink: 1/1 up/down: 300/-282 (18)
>   Function                                     old     new   delta
>   do_entry_int82                                50     350    +300
>   pv_hypercall                                 579     297    -282
> 
> which is tiny, but the perf implications are large:
> 
>   Guest | Naples | Milan  | SKX    | CFL-R  |
>   ------+--------+--------+--------+--------+
>   pv64  |  17.4% |  15.5% |   2.6% |   4.5% |
>   pv32  |   1.9% |  10.9% |   1.4% |   2.5% |
> 
> These are percentage improvements in raw TSC deltas for a xen_version
> hypercall, with obvious outliers excluded.  Therefore, it is an idealised best
> case improvement.
> 
> The pv64 path uses `syscall`, while the pv32 path uses `int $0x82` so
> necessarily has higher overhead.  Therefore, dropping the lfences is less of
> an overall improvement.
> 
> I don't know why the Naples pv32 improvement is so small, but I've double
> checked the numbers and they're correct.  There's something we're doing which
> is a large overhead in the pipeline.
> 
> On the Intel side, both systems are writing to MSR_SPEC_CTRL on
> entry/exit (SKX using the retrofitted microcode implementation, CFL-R using
> the hardware implementation), while SKX is suffering further from XPTI for
> Meltdown protection.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
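
For reference, the general technique described in the quoted commit message
(a shared body that is forced inline into two entry points, each passing a
compile-time constant) can be sketched roughly as below.  This is only an
illustration, not the patch itself: the handler tables, dispatch(),
entry_native() and entry_compat() are hypothetical stand-ins, and the
assumption that the shared helper takes a boolean "compat" argument is not
stated in the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not Xen's real tables or handlers. */
typedef long (*handler_t)(uint64_t arg);

static long native_handler(uint64_t arg) { return (long)(arg + 1); }
/* The compat stand-in only looks at the low 32 bits of its argument. */
static long compat_handler(uint64_t arg) { return (long)(uint32_t)arg; }

static const handler_t native_table[] = { native_handler };
static const handler_t compat_table[] = { compat_handler };

#define NR_HANDLERS (sizeof(native_table) / sizeof(native_table[0]))

/*
 * Shared body.  It is forced inline and every caller passes a literal for
 * 'compat', so the compiler can resolve the conditional at build time in
 * each copy rather than leaving a runtime branch on the guest type.
 * (GCC/Clang attribute syntax is used here as an assumption.)
 */
static inline __attribute__((always_inline))
long dispatch(bool compat, unsigned int nr, uint64_t arg)
{
    if ( nr >= NR_HANDLERS )
        return -1;

    if ( compat )
        return compat_table[nr](arg);

    return native_table[nr](arg);
}

/* Two thin entry points, one per guest type. */
long entry_native(unsigned int nr, uint64_t arg)
{
    return dispatch(false, nr, arg);
}

long entry_compat(unsigned int nr, uint64_t arg)
{
    return dispatch(true, nr, arg);
}
```

Each entry point ends up with its own specialised copy of the body, which is
consistent with one function growing and the other shrinking in the size
delta quoted above.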




 

