Re: [XEN][PATCH] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
On 2025-11-06 11:33, Grygorii Strashko wrote:

Hi Teddy, Jan,

On 06.11.25 17:57, Teddy Astie wrote:

On 31/10/2025 at 22:25, Grygorii Strashko wrote:

From: Grygorii Strashko <grygorii_strashko@xxxxxxxx>

Xen uses the below pattern for the raw_x_guest() functions:

#define raw_copy_to_guest(dst, src, len)        \
    (is_hvm_vcpu(current) ?                     \
     copy_to_user_hvm((dst), (src), (len)) :    \
     copy_to_guest_pv(dst, src, len))

How this pattern works depends on CONFIG_PV/CONFIG_HVM:

- PV=y and HVM=y
  The proper guest access function is selected depending on the domain type.
- PV=y and HVM=n
  Only PV domains are possible. is_hvm_domain/vcpu() constifies to "false",
  so the compiler optimizes the code and skips the HVM-specific part.
- PV=n and HVM=y
  Only HVM domains are possible. is_hvm_domain/vcpu() is not constified,
  so no PV-specific code gets optimized out by the compiler.
- PV=n and HVM=n
  No guests are possible. The code still follows the PV path.

Rework the raw_x_guest() code to use the required functions explicitly for
each combination of CONFIG_PV/CONFIG_HVM, with the main intention of
optimizing code for the (PV=n and HVM=y) case.

For the (PV=n and HVM=n) case empty stubs are created which return 1,
indicating failure. No guests are possible in this case, which means no
access to guest memory should ever happen. The two calls of
__raw_copy_to_guest() in common/domain.c->update_runstate_area() are fixed
for this case by explicitly casting the return value to void
(MISRA C Rule 17.7).

Finally, build arch/x86/usercopy.c only for PV=y.

The measured (bloat-o-meter) improvement for the (PV=n and HVM=y) case is:
add/remove: 0/10 grow/shrink: 2/90 up/down: 163/-30932 (-30769)
Total: Before=1937113, After=1906344, chg -1.59%

Signed-off-by: Grygorii Strashko <grygorii_strashko@xxxxxxxx>
---
 xen/arch/x86/Makefile                   |  2 +-
 xen/arch/x86/include/asm/guest_access.h | 38 ++++++++++++++++++++++++++
 xen/common/domain.c                     | 10 ++++---
 3 files changed, 45 insertions(+), 5 deletions(-)
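(For context, a minimal sketch of the per-Kconfig dispatch described in the quoted commit message; this is illustrative only, not the actual patch body. copy_to_user_hvm()/copy_to_guest_pv() are the existing helpers named in the quote, while raw_copy_fail() is a hypothetical stub name.)

#if defined(CONFIG_PV) && defined(CONFIG_HVM)
/* Both guest types: select at run time based on the domain type. */
#define raw_copy_to_guest(dst, src, len)            \
    (is_hvm_vcpu(current) ?                         \
     copy_to_user_hvm((dst), (src), (len)) :        \
     copy_to_guest_pv(dst, src, len))
#elif defined(CONFIG_HVM)
/* HVM-only build: no run-time domain type check, no dead PV path. */
#define raw_copy_to_guest(dst, src, len) copy_to_user_hvm(dst, src, len)
#elif defined(CONFIG_PV)
/* PV-only build. */
#define raw_copy_to_guest(dst, src, len) copy_to_guest_pv(dst, src, len)
#else
/* Neither guest type configured: guest memory must never be accessed;
 * return 1 to signal failure, as the quoted commit message describes. */
static inline unsigned long raw_copy_fail(void) { return 1; }
#define raw_copy_to_guest(dst, src, len) raw_copy_fail()
#endif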
Yes, I was thinking something like Teddy suggested:
#define raw_copy_to_guest(dst, src, len)                         \
    (is_hvm_vcpu(current) ? copy_to_user_hvm(dst, src, len) :    \
     is_pv_vcpu(current)  ? copy_to_guest_pv(dst, src, len) :    \
                            fail_copy(dst, src, len))
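(fail_copy() is just a placeholder name here; a minimal sketch of what such a stub could look like, assuming the usual "number of bytes not copied" return convention:)

static inline unsigned long fail_copy(void *to, const void *from,
                                      unsigned long len)
{
    /* Reached only when current is neither a PV nor an HVM vCPU:
     * treat the access as fully failed (no bytes copied). */
    return len;
}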
But that made me think the is_{hvm,pv}_{vcpu,domain}() checks could be
optimized for when only one of HVM or PV is enabled.
Regards,
Jason

xen: Optimize is_hvm/pv_domain() for single domain type

is_hvm_domain() and is_pv_domain() hardcode the false result for HVM=n and
PV=n respectively, but they still check the XEN_DOMCTL_CDF_hvm flag. When
only one of PV or HVM is enabled, the result can be hardcoded, since the
other domain type is impossible. Notably, this removes the
evaluate_nospec() lfences.

Signed-off-by: Jason Andryuk <jason.andryuk@xxxxxxx>
---
Untested.

HVM=y PV=n bloat-o-meter:
add/remove: 3/6 grow/shrink: 19/212 up/down: 3060/-60349 (-57289)

Full bloat-o-meter below.
---
 xen/include/xen/sched.h | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index f680fb4fa1..12f10d9cc8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1176,8 +1176,13 @@ static always_inline bool is_hypercall_target(const struct domain *d)
 static always_inline bool is_pv_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_PV) &&
-           evaluate_nospec(!(d->options & XEN_DOMCTL_CDF_hvm));
+    if ( !IS_ENABLED(CONFIG_PV) )
+        return false;
+
+    if ( IS_ENABLED(CONFIG_HVM) )
+        return evaluate_nospec(!(d->options & XEN_DOMCTL_CDF_hvm));
+
+    return true;
 }
 
 static always_inline bool is_pv_vcpu(const struct vcpu *v)
@@ -1218,8 +1223,13 @@ static always_inline bool is_pv_64bit_vcpu(const struct vcpu *v)
 static always_inline bool is_hvm_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_HVM) &&
-           evaluate_nospec(d->options & XEN_DOMCTL_CDF_hvm);
+    if ( !IS_ENABLED(CONFIG_HVM) )
+        return false;
+
+    if ( IS_ENABLED(CONFIG_PV) )
+        return evaluate_nospec(d->options & XEN_DOMCTL_CDF_hvm);
+
+    return true;
 }
 
 static always_inline bool is_hvm_vcpu(const struct vcpu *v)
--
2.51.1
HVM=y PV=n bloat-o-meter:
add/remove: 3/6 grow/shrink: 19/212 up/down: 3060/-60349 (-57289)
Function old new delta
__alt_instructions_end 2 896 +894
__stop___pre_ex_table 40 664 +624
increase_reservation - 418 +418
do_memory_op 11026 11404 +378
decrease_reservation - 360 +360
x86_emulate 217169 217338 +169
x86_msr_copy_to_buffer 117 177 +60
__start_bug_frames_0 - 40 +40
pt_irq_destroy_bind 1490 1515 +25
map_pages_to_xen 4284 4300 +16
domain_cpu_policy_changed 95 108 +13
sched_init_vcpu 1621 1630 +9
is_memory_hole 105 113 +8
hvm_save 549 557 +8
pv_soft_rdtsc 62 69 +7
elf_load_image 84 91 +7
console_send 262 268 +6
wmemchr 58 63 +5
map_grant_ref 5410 5415 +5
__start_xen 8273 8277 +4
p2m_pod_init 138 141 +3
pending_requeue 635 636 +1
set_time_scale 93 91 -2
do_debug 391 389 -2
handle_iomem_range 171 168 -3
fill_vmsr_data 424 421 -3
p2m_pod_offline_or_broken_hit 1720 1716 -4
allocate_pirq 480 476 -4
subpage_mmio_find_page 68 63 -5
x86_cpu_policy_fill_native 999 991 -8
recalculate_cpuid_policy 868 860 -8
irq_load_pci 211 201 -10
local_time_calibration 440 429 -11
ioreq_server_add_vcpu_all 199 188 -11
hvm_get_mem_pinned_cacheattr 233 222 -11
symbols_offsets 16856 16844 -12
send_guest_pirq 386 374 -12
pt_restore_timer 171 158 -13
p2m_pod_zero_check_superpage 1251 1238 -13
inject_vmce 208 195 -13
amd_vpmu_save 278 265 -13
time_resume 324 310 -14
vmcb_dump 300 285 -15
pirq_cleanup_check 97 82 -15
init_hypercall_page 61 46 -15
init_domain_irq_mapping 191 176 -15
cpu_schedule_callback 549 534 -15
__hvm_copy 695 680 -15
fixup_eoi 175 159 -16
evtchn_destroy_final 171 155 -16
context_switch 703 687 -16
__stop_bug_frames_1 3672 3656 -16
p2m_init_one 203 186 -17
p2m_init 133 116 -17
p2m_final_teardown 60 43 -17
iommu_domain_init 251 234 -17
hvmemul_write_cache 938 921 -17
hvm_do_IRQ_dpci 256 239 -17
domain_update_node_aff 387 370 -17
domain_set_time_offset 41 24 -17
cleanup_domain_irq_mapping 61 44 -17
arch_evtchn_bind_pirq 142 125 -17
arch_domain_destroy 130 113 -17
alloc_vcpu_struct 58 41 -17
hvm_get_guest_time_fixed 139 120 -19
hvm_assert_evtchn_irq 183 164 -19
__context_switch 539 520 -19
setup_io_bitmap 123 103 -20
vm_event_toggle_singlestep 65 44 -21
show_registers 626 605 -21
show_execution_state 508 487 -21
p2m_pod_get_mem_target 728 707 -21
ioreq_server_get_frame 333 312 -21
get_free_pirqs 188 167 -21
arch_vcpu_destroy 93 72 -21
amd_vpmu_destroy 90 69 -21
shadow_domain_init 63 41 -22
p2m_mem_access_sanity_check 71 49 -22
hvm_domain_use_pirq 50 28 -22
dump_irq_info 495 473 -22
tsc_get_info 230 207 -23
hvm_set_callback_via 280 257 -23
vcpu_mark_events_pending 50 26 -24
dump_softtsc 511 487 -24
dump_pageframe_info 527 503 -24
arch_pci_clean_pirqs 347 323 -24
vmce_enable_mca_cap 125 100 -25
msixtbl_init 100 75 -25
guest_flush_tlb_flags 46 21 -25
domain_get_irq_dpci 54 29 -25
arch_domain_creation_finished 48 23 -25
unmmap_broken_page 287 261 -26
p2m_pod_active 391 365 -26
gtsc_to_gtime 49 23 -26
domain_relinquish_resources 721 695 -26
do_vm_assist 141 113 -28
vmce_restore_vcpu 160 131 -29
unmap_domain_pirq_emuirq 255 226 -29
construct_dom 206 177 -29
arch_domain_soft_reset 892 863 -29
get_free_pirq 240 210 -30
vm_event_fill_regs 1280 1248 -32
paging_gva_to_gfn 297 265 -32
evtchn_reset 818 786 -32
map_domain_emuirq_pirq 558 524 -34
physdev_unmap_pirq 686 651 -35
cpu_policy_updated 35 - -35
do_IRQ 1496 1459 -37
alloc_pirq_struct 96 59 -37
iommu_unmap 657 619 -38
iommu_map 682 644 -38
evtchn_close 1685 1647 -38
xenctl_bitmap_to_bitmap 274 235 -39
amd_vpmu_load 521 482 -39
iommu_free_pgtables 358 318 -40
dom0_setup_permissions 3762 3722 -40
evtchn_destroy 479 438 -41
arch_do_multicall_call 52 10 -42
symbols_names 50060 50016 -44
p2m_teardown 945 901 -44
_put_page_type 667 623 -44
hvmemul_read_cache 581 536 -45
vcpu_block 156 109 -47
safe_copy_string_from_guest 169 122 -47
map_range 508 461 -47
getdomaininfo 695 648 -47
physdev_map_pirq 921 873 -48
gnttab_release_mappings 1447 1399 -48
guest_rdmsr 1972 1923 -49
x86_msr_copy_from_buffer 280 230 -50
relinquish_memory 1212 1161 -51
p2m_set_mem_access 838 787 -51
p2m_pod_demand_populate 2238 2187 -51
hvmemul_rep_outs 680 629 -51
compat_set_cx_pminfo 539 488 -51
tsc_set_info 480 428 -52
monitor_domctl 600 547 -53
default_initialise_vcpu 208 154 -54
collect_time_info 336 282 -54
xenmem_add_to_physmap 706 651 -55
hvm_set_mem_pinned_cacheattr 772 717 -55
hap_track_dirty_vram 1331 1276 -55
arch_domain_create 998 943 -55
ucode_update_hcall 202 146 -56
copy_leaf_to_buffer 160 104 -56
relinquish_p2m_mapping 988 930 -58
gtime_to_gtsc 81 23 -58
arch_hwdom_irqs 132 74 -58
x86_cpuid_copy_from_buffer 760 701 -59
set_px_pminfo 851 792 -59
set_cx_pminfo 1675 1616 -59
do_dm_op 191 132 -59
hap_set_allocation 510 450 -60
domain_get_maximum_gpfn 79 19 -60
_handle_iomem_range 234 173 -61
compat_dm_op 261 199 -62
paging_free_log_dirty_bitmap 1653 1589 -64
memcpy_to_guest_ring 409 344 -65
efi_compat_get_info 993 928 -65
smt_up_down_helper 437 370 -67
arch_set_info_guest 402 335 -67
guest_wrmsr 2354 2286 -68
read_console_ring 395 325 -70
vcpu_show_execution_state 737 665 -72
hvm_save_one 798 726 -72
efi_get_info 1028 948 -80
arch_initialise_vcpu 197 110 -87
amd_vpmu_do_wrmsr 905 816 -89
iommu_do_pci_domctl 2051 1961 -90
compat_argo_op 405 313 -92
_get_page_type 1439 1347 -92
do_get_pm_info 998 902 -96
p2m_pod_empty_cache 1123 1020 -103
vpmu_do_interrupt 965 851 -114
p2m_pod_set_cache_target 816 701 -115
gwstrlen 284 168 -116
do_paging_domctl_cont 879 758 -121
mem_access_memop 986 859 -127
copy_msr_to_buffer 138 - -138
get_reserved_device_memory 404 259 -145
do_pm_op 2731 2577 -154
p2m_set_mem_access_multi 1023 861 -162
do_xenpmu_op 1690 1524 -166
do_vcpu_op 736 568 -168
bitmap_to_xenctl_bitmap 477 309 -168
do_poll 900 731 -169
subarch_memory_op 720 550 -170
elf_memcpy 170 - -170
arch_vcpu_create 392 221 -171
do_console_io 1069 887 -182
gnttab_copy 1596 1396 -200
gnttab_unmap_grant_ref 778 565 -213
gnttab_unmap_and_replace 778 565 -213
update_runstate_area 814 599 -215
acpi_set_pdc_bits 450 235 -215
compat_arch_memory_op 990 769 -221
__stop_bug_frames_2 59456 59232 -224
compat_vcpu_op 885 656 -229
pci_physdev_op 865 626 -239
update_secondary_system_time 462 217 -245
populate_physmap 1220 970 -250
paging_domctl 3240 2987 -253
gnttab_setup_table 1337 1084 -253
gnttab_get_status_frames 1624 1370 -254
symbols_token_index 118952 118688 -264
vcpu_show_registers 375 88 -287
efi_runtime_call 2621 2333 -288
do_sched_op 1126 821 -305
efi_compat_runtime_call 3036 2724 -312
compat_sched_op 1147 835 -312
compat_common_vcpu_op 803 416 -387
arch_get_info_guest 899 512 -387
compat_multicall 1201 792 -409
guest_cpuid 1835 1418 -417
do_multicall 1165 725 -440
arch_do_sysctl 1293 843 -450
common_vcpu_op 1727 1276 -451
dm_op 3776 3282 -494
_einittext 1937 1439 -498
do_domctl 7108 6505 -603
show_guest_stack 609 - -609
do_hvm_op 4075 3440 -635
do_grant_table_op 6666 6031 -635
arch_memory_op 2888 2233 -655
do_argo_op 7164 6501 -663
compat_grant_table_op 2491 1827 -664
__start___ex_table 1696 1032 -664
do_xen_version 2600 1906 -694
pmstat_get_cx_stat 2660 1822 -838
__alt_call_sites_start 892 - -892
arch_do_domctl 7351 6351 -1000
do_sysctl 5190 4074 -1116
do_event_channel_op 4471 3074 -1397
compat_physdev_op 4989 3552 -1437
do_physdev_op 5142 3688 -1454
do_platform_op 6428 4778 -1650
compat_platform_op 6301 4623 -1678
compat_memory_op 6803 5050 -1753
memory_exchange 4128 - -4128
__alt_instructions 53802 33488 -20314
Total: Before=18446744073714933955, After=18446744073714876666, chg -0.00%