
Re: [PATCH 10/12] xen/x86: call hypercall handlers via switch statement


  • To: Juergen Gross <jgross@xxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 2 Nov 2021 10:54:51 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 02 Nov 2021 09:55:16 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 28.10.2021 16:32, Juergen Gross wrote:
> On 21.10.21 16:41, Jan Beulich wrote:
>> On 15.10.2021 14:51, Juergen Gross wrote:
>>> Instead of using a function table use the generated switch statement
>>> macros for calling the appropriate hypercall handlers.
>>>
>>> This is beneficial to performance and avoids speculation issues.
>>>
>>> Now that the handlers are called with the correct number of parameters,
>>> it is possible to do the parameter register clobbering in the NDEBUG
>>> case after returning from the handler. This in turn removes the only
>>> users of hypercall_args_table[], which can be removed now.
>>
>> "removed" reads misleading to me: You really replace it by new tables,
>> using script-generated initializers. Also it looks like you're doubling
>> the data, as the same sets were previously used by pv64/hvm64 and
>> pv32/hvm32 respectively.
> 
> Yes, I'll change that paragraph.
> 
> Regarding having 4 tables on x86 now: merging the pv/hvm tables would be
> possible, but this would add some complexity to the script generating
> the tables (it should test whether the number of parameters of pv and
> hvm match). As the tables are present in debug builds only, I don't think
> this is a real issue.

Sure, but that imo wants saying in the description.
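
(For reference, a minimal sketch of what such a generated, debug-only
argument-count table and the clobbering it drives could look like; the
table name and entries here are hypothetical, while the register/value
pattern follows the pre-existing hypercall_args_table[] code being
replaced:)

#ifndef NDEBUG
/* Hypothetical generated table: number of arguments per PV64 hypercall. */
static const uint8_t pv64_hypercall_nargs[] = {
    [__HYPERVISOR_mmu_update]   = 4,
    [__HYPERVISOR_stack_switch] = 2,
    /* ... */
};

/* Deliberately corrupt the argument registers the handler consumed, so
 * guests can't rely on them being preserved across the hypercall. */
static void clobber_hypercall_args(struct cpu_user_regs *regs,
                                   unsigned int nargs)
{
    switch ( nargs )
    {
    case 6: regs->r9  = 0xdeadbeefdeadf00dUL; /* fall through */
    case 5: regs->r8  = 0xdeadbeefdeadf00dUL; /* fall through */
    case 4: regs->r10 = 0xdeadbeefdeadf00dUL; /* fall through */
    case 3: regs->rdx = 0xdeadbeefdeadf00dUL; /* fall through */
    case 2: regs->rsi = 0xdeadbeefdeadf00dUL; /* fall through */
    case 1: regs->rdi = 0xdeadbeefdeadf00dUL;
    }
}
#endif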

>> Overall, besides these mainly cosmetic aspects the main thing missing
>> is an approach to prioritize the handful of most frequently used functions,
>> for them to be pulled out of the switch() so we don't depend on the
>> compiler's choice for the order of comparisons done.
> 
> I have already prepared that step by generating the complete call
> sequence, so any change for prioritizing some hypercalls can be local to
> the generator script and the used input data.
> 
> The main question is how to do that. I've collected some hypercall
> statistics data for PV and PVH guests running some simple tests (once a
> build of the Xen hypervisor, and once a scp of a large file). The data
> is split between guest and dom0 (PV) counts. There is no clear "winner"
> which hypercall should be fastest, but several hypercalls are clearly
> not important.
> 
> Here is the data:
> 
> PV-hypercall    PV-guest build   PV-guest scp    dom0 build     dom0 scp
> mmu_update           186175729           2865         20936        33725

Builds should be local to the guest and I/O should involve gnttab ops
but no mmu-update. Hence I have a hard time seeing where the huge
difference here would be coming from. Did you have any thoughts here?

> stack_switch           1273311          62381        108589       270764
> multicall              2182803             50           302          524

A fair amount of the mmu-updates is going to be coming through
multicalls, I would guess. Priorities therefore may even differ for
the two separate dispatch points.

> update_va_mapping       571868             10            60           80
> xen_version              73061            850           859         5432
> grant_table_op               0              0         35557       139110
> iret                  75673006         484132        268157       757958

The huge differences for builds are puzzling me here ...

> vcpu_op                 453037          71199        138224       334988
> set_segment_base       1650249          62387        108645       270823
> mmuext_op             11225681            188          7239         3426

... and here as well. Did Dom0 and DomU use identical numbers of
vCPU-s and identical -j make option values?

> sched_op                280153         134645         70729       137943
> event_channel_op        192327          66204         71409       214191
> physdev_op                   0              0          7721         4315
> (the dom0 values are for the guest running the build or scp test, i.e.
> dom0 acting as the backend)
> 
> HVM-hypercall   PVH-guest build    PVH-guest scp
> vcpu_op                  277684             2324
> event_channel_op         350233            57383
> (the related dom0 counter values are in the same range as with the test
> running in the PV guest)
> 
> It should be noted that during boot of the guests the numbers for the PV
> guest are more like the ones for the build test with the exception of
> iret and sched_op being higher, while for PVH sched_op is by far the
> most often used hypercall.
> 
> I'm not sure how to translate those numbers into a good algorithm for
> generating the call sequence.

Well, there's never going to be a clear-cut answer fitting everything, I
suppose.

> I could add priorities to each hypercall in hypercall-defs.c and have a
> cascade of if (likely(foo)) call_foo; else if (likely(bla)) ... else
> switch(rest).

Personally I'd lean to an approach like this one; perhaps there's not
even a need to specify priorities for every hypercall, but just the
ones we deem most frequently used?
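
(As a purely illustrative sketch of what the generator could emit for such
prioritised entries; the hypercall selection, their priorities and the
abridged handler argument lists below are made up for the example, and
everything not singled out stays in the plain switch():)

if ( likely(num == __HYPERVISOR_iret) )
    ret = do_iret();
else if ( likely(num == __HYPERVISOR_mmu_update) )
    ret = do_mmu_update(arg1, arg2, arg3, arg4);
else if ( likely(num == __HYPERVISOR_event_channel_op) )
    ret = do_event_channel_op(arg1, arg2);
else switch ( num )
{
case __HYPERVISOR_xen_version:
    ret = do_xen_version(arg1, arg2);
    break;
/* ... all remaining hypercalls ... */
default:
    ret = -ENOSYS;
    break;
}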

Jan

> Or I could have groups of hypercalls with a priority for each group and:
> 
> mask = 1ULL << num;
> if (likely(mask & prio_1_mask)) switch(num) ...
> else if (likely(mask & prio_2_mask)) switch (num) ...
> ...
> else switch (num) ...
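
(Expanding that sketch a little, with entirely made-up group masks that the
generator would derive from per-hypercall priorities:)

#define PRIO_1_MASK ((1ULL << __HYPERVISOR_iret) | \
                     (1ULL << __HYPERVISOR_mmu_update))
#define PRIO_2_MASK ((1ULL << __HYPERVISOR_event_channel_op) | \
                     (1ULL << __HYPERVISOR_sched_op))

    uint64_t mask = 1ULL << num;

    if ( likely(mask & PRIO_1_MASK) )
        switch ( num ) { /* priority 1 hypercalls only */ }
    else if ( likely(mask & PRIO_2_MASK) )
        switch ( num ) { /* priority 2 hypercalls only */ }
    else
        switch ( num ) { /* all remaining hypercalls */ }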
> 
> Or I could combine those approaches using the mask variant for cases of
> multiple entries having the same priority and the direct call variant
> for the cases of only a single entry having a specific priority.
> 
> And then there is the problem of setting the priorities (fairly simple
> for HVM, PV is more difficult).
> 
> 
> Juergen
> 




 

