[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC PATCH 22/22] x86/AMD: add IRPerf support
> On 25 Oct 2023, at 21:59, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx> wrote: > > On 25/10/2023 8:29 pm, Edwin Török wrote: >> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c >> index 483b5e4f70..b3cd851d9d 100644 >> --- a/xen/arch/x86/msr.c >> +++ b/xen/arch/x86/msr.c >> @@ -584,6 +584,13 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t >> val) >> } >> break; >> >> + case MSR_K8_HWCR: >> + if ( !(cp->x86_vendor & X86_VENDOR_AMD) || >> + (val & ~K8_HWCR_IRPERF_EN) || >> + wrmsr_safe(msr, val) != 0 ) >> + goto gp_fault; >> + break; >> + >> case MSR_AMD64_DE_CFG: >> /* >> * OpenBSD 6.7 will panic if writing to DE_CFG triggers a #GP: >> diff --git a/xen/include/public/arch-x86/cpufeatureset.h >> b/xen/include/public/arch-x86/cpufeatureset.h >> index 5faca0bf7a..40f74cd5e8 100644 >> --- a/xen/include/public/arch-x86/cpufeatureset.h >> +++ b/xen/include/public/arch-x86/cpufeatureset.h >> @@ -241,7 +241,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) /* APERF/MPERF >> Read Only interface */ >> >> /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ >> XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ >> -XEN_CPUFEATURE(IRPERF, 8*32+ 1) /* Instruction Retired Performance >> Counter */ >> +XEN_CPUFEATURE(IRPERF, 8*32+ 1) /*A! Instruction Retired Performance >> Counter */ >> XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A (F)X{SAVE,RSTOR} always >> saves/restores FPU Error pointers */ >> XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /* WBNOINVD instruction */ >> XEN_CPUFEATURE(IBPB, 8*32+12) /*A IBPB support only (no IBRS, used >> by AMD) */ > > Last things first. You can advertise this bit to guests, but I'm > looking at the AMD manuals and woefully little information on what this > is an how it works. > > It does have an architectural enumeration, but there isn't even a a > description of what HWCR.IPERF_EN does. > > The HWCR documentation says "enables Core::X86::Msr::IRPerfCount", but > the IRPERF MSR says no such think, > noting only that EFRO_LOCK makes the > MSR non-writeable (which in turn means we can't always offer it to > guests anyway. See below). > At a guess, the HWCR bit activates the counter, but nothing states > this. My version (https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf) says: "This is a dedicated counter that is always counting instructions retired. It exists at MSR address C000_00E9. It is enabled by setting a 1 to HWCR[30] and its support is indicated by CPUID Fn8000_0008_EBX[1]." Although digging a bit more there are some erratas around deep C states on some chips and the HWCR register itself is not global but per CCD (which is neatly buried in the _ccd[7:0]_.... smallprint at the top: https://github.com/cyring/CoreFreq/issues/206#issuecomment-722713147 > At a guess, it's a free-running, reset-able counter, but again > nothing states this. The manual just says "is a dedicated counter" and > doesn't even state whether it is (or is not) tied into the other core > fixed counters and whether it is integrated with PMI or not. There is also a LWP 'Instructions Retired' described in 13.4.2.2, which is user-mode only and per-core, but I assume that is completely different from this MSR and part of the features that got dropped in newer cores. > > Which brings us on to the middle hunk. Putting it there short circuits > the PV-specific handling, but you can't do that in general without > arranging for HWCR to be context switched properly for vCPUs, nor in > this case IPERF_COUNT itself. > > Unless you context switch HWCR and IPERF_COUNT, a guest will see it not > counting, or the count going backwards, 50% of the time as the vCPU is > scheduled around. > > So while in principle we could let guests use this facility (it would > have to be off by default, because there doesn't appear to be any > filtering capability in it at all, so we can't restrict it to > instructions retired in guest context), it will need a far larger patch > than this to do it safely. Thanks, I'll move this patch into the 'needs more work' category. Thanks, --Edwin
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |