[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 4/5] x86/pv: Remove deferred RDTSC{, P} handling in pv_emulate_privileged_op()
On Tue, Feb 20, 2018 at 04:37:51PM +0000, Andrew Cooper wrote: > On 20/02/18 16:28, Roger Pau Monné wrote: > > On Tue, Feb 20, 2018 at 11:58:42AM +0000, Andrew Cooper wrote: > >> The handling of RDTSCP for PV guests has been broken (AFAICT forever). > >> > >> To start with, RDTSCP is hidden from PV guests so the MSR_TSC_AUX path > >> should > >> be unreachable. However, this appears to be a "feature" of > >> TSC_MODE_PVRDTSCP, > >> and the emulator doesn't perform appropriate feature checking. > >> (Conversely, > >> we unilaterally advertise RDPID which uses the same path, but it should > >> never > >> trap on #GP to arrive here in the first place). > >> > >> A PV guest typically can see RDTSCP in native CPUID, so userspace will > >> probably end up using it. On a capable pipeline (without TSD, see below), > >> it > >> will execute normally and return non-virtualised data. > >> > >> When a virtual TSC mode is not specified for the domain, CR4.TSD is left > >> clear, so executing RDTSCP will execute without trapping. However, a guest > >> kernel may set TSD itself, at which point the emulator should not suddenly > >> switch to virtualised TSC mode and start handing out differently-scaled > >> values. > >> > >> Drop all the deferral logic, and return scaled or raw TSC values depending > >> only on currd->arch.vtsc. This changes the exact moment at which the > >> timestamp is taken, but that doesn't matter from the guests point of view, > >> and > >> is consistent with the HVM side of things. It also means that RDTSC and > >> RDTSCP are now consistent WRT handing out native or virtualised timestamps. > >> > >> The MSR_TSC_AUX case unconditionally returns the migration incarnation or > >> zero, depending on TSC_MODE_PVRDTSCP, which is faster than re-reading it > >> out > >> of hardware. > >> > >> This is a behavioural change for guests, but the semantics are rather more > >> sane. It lays groundwork for further fixes. > >> > >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > >> --- > >> CC: Jan Beulich <JBeulich@xxxxxxxx> > >> CC: Wei Liu <wei.liu2@xxxxxxxxxx> > >> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx> > >> CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> > >> --- > >> xen/arch/x86/pv/emul-priv-op.c | 35 +++++------------------------------ > >> 1 file changed, 5 insertions(+), 30 deletions(-) > >> > >> diff --git a/xen/arch/x86/pv/emul-priv-op.c > >> b/xen/arch/x86/pv/emul-priv-op.c > >> index d4d64f2..4e3641d 100644 > >> --- a/xen/arch/x86/pv/emul-priv-op.c > >> +++ b/xen/arch/x86/pv/emul-priv-op.c > >> @@ -60,9 +60,6 @@ struct priv_op_ctxt { > >> } cs; > >> char *io_emul_stub; > >> unsigned int bpmatch; > >> - unsigned int tsc; > >> -#define TSC_BASE 1 > >> -#define TSC_AUX 2 > >> }; > >> > >> /* I/O emulation support. Helper routines for, and type of, the stack > >> stub. */ > >> @@ -843,8 +840,7 @@ static inline bool is_cpufreq_controller(const struct > >> domain *d) > >> static int read_msr(unsigned int reg, uint64_t *val, > >> struct x86_emulate_ctxt *ctxt) > >> { > >> - struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, > >> ctxt); > >> - const struct vcpu *curr = current; > >> + struct vcpu *curr = current; > > I think you can keep the const here? > > pv_soft_rdtsc() mutates curr. This change is most certainly not spurious. Oh right, you constified regs but not the vcpu parameter. AFAICT the vcpu parameter of pv_soft_rdtsc could also be constified? It's only used to get the domain and whether the guests is in kernel or user mode. > >> const struct domain *currd = curr->domain; > >> bool vpmu_msr = false; > >> int ret; > >> @@ -880,20 +876,13 @@ static int read_msr(unsigned int reg, uint64_t *val, > >> *val = curr->arch.pv_vcpu.gs_base_user; > >> return X86EMUL_OKAY; > >> > >> - /* > >> - * In order to fully retain original behavior, defer calling > >> - * pv_soft_rdtsc() until after emulation. This may want/need to be > >> - * reconsidered. > >> - */ > >> case MSR_IA32_TSC: > >> - poc->tsc |= TSC_BASE; > >> - goto normal; > >> + *val = currd->arch.vtsc ? pv_soft_rdtsc(curr, ctxt->regs) : > >> rdtsc(); > >> + return X86EMUL_OKAY; > >> > >> case MSR_TSC_AUX: > >> - poc->tsc |= TSC_AUX; > >> - if ( cpu_has_rdtscp ) > >> - goto normal; > >> - *val = 0; > >> + *val = (uint32_t)((currd->arch.tsc_mode == TSC_MODE_PVRDTSCP) > >> + ? currd->arch.incarnation : 0); > > I wonder whether Xen should inject a #GP here if tsc_mode is not > > PVRDTSCP and RDPID is not available. > > RDPID would be a layering violation. Also, a lot of this changes again > in patch 5. I see, patch 5 makes this all much better. And this is not any worse that the previous behavior: Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |