Re: [Xen-devel] [PATCH v5 10/11] viridian: add implementation of synthetic timers
>>> On 11.03.19 at 14:41, <paul.durrant@xxxxxxxxxx> wrote:
> +bool viridian_synic_deliver_timer_msg(struct vcpu *v, unsigned int sintx,
> + unsigned int index,
> + uint64_t expiration,
> + uint64_t delivery)
> +{
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + const union viridian_sint_msr *vs = &vv->sint[sintx];
> + HV_MESSAGE *msg = vv->simp.ptr;
> + struct {
> + uint32_t TimerIndex;
> + uint32_t Reserved;
> + uint64_t ExpirationTime;
> + uint64_t DeliveryTime;
> + } payload = {
> + .TimerIndex = index,
> + .ExpirationTime = expiration,
> + .DeliveryTime = delivery,
> + };
> +
> + if ( test_bit(sintx, &vv->msg_pending) )
> + return false;
> +
> + BUILD_BUG_ON(sizeof(*msg) != HV_MESSAGE_SIZE);
> + msg += sintx;
> +
> + /*
> + * To avoid using an atomic test-and-set, and barrier before calling
> + * vlapic_set_irq(), this function must be called in context of the
> + * vcpu receiving the message.
> + */
> + ASSERT(v == current);
> + if ( msg->Header.MessageType != HvMessageTypeNone )
> + {
> + msg->Header.MessageFlags.MessagePending = 1;
> + __set_bit(sintx, &vv->msg_pending);
> + return false;
> + }
> +
> + msg->Header.MessageType = HvMessageTimerExpired;
> + msg->Header.MessageFlags.MessagePending = 0;
> + msg->Header.PayloadSize = sizeof(payload);
> + memcpy(msg->Payload, &payload, sizeof(payload));
Since you can't use plain assignment here, how about a
BUILD_BUG_ON(sizeof(payload) > sizeof(msg->Payload))?
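Something along these lines, perhaps (field name taken from the
quoted hunk):

    /* The payload must never be able to overflow the message buffer. */
    BUILD_BUG_ON(sizeof(payload) > sizeof(msg->Payload));
    memcpy(msg->Payload, &payload, sizeof(payload));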
> +static uint64_t time_now(struct domain *d)
> +{
> + const struct viridian_page *rt = &d->arch.hvm.viridian->reference_tsc;
> + HV_REFERENCE_TSC_PAGE *p = rt->ptr;
> + uint32_t start, end;
> + uint64_t tsc;
> + uint64_t scale;
> + uint64_t offset;
Would you mind grouping them all on one line, just like you do for
start and end?
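I.e. simply:

    uint64_t tsc, scale, offset;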
> + /*
> + * If the reference TSC page is not enabled, or has been invalidated,
> + * fall back to the partition reference counter.
> + */
> + if ( !p || !p->TscSequence )
> + return time_ref_count(d);
> +
> + /*
> + * The following sampling algorithm for tsc, scale and offset is
> + * documented in the specification.
> + */
> + do {
> + start = p->TscSequence;
> + smp_rmb();
> +
> + tsc = hvm_get_guest_tsc(pt_global_vcpu_target(d));
> + scale = p->TscScale;
> + offset = p->TscOffset;
> +
> + smp_rmb();
> + end = p->TscSequence;
> + } while (end != start);
Blanks immediately inside the parentheses please.
As to safety of this, I have two concerns:
1) TscSequence gets updated as a result of a guest action (an MSR
write). This makes it non-obvious that the loop above will get
exited in due course.
2) The way update_reference_tsc() deals with the two "invalid"
values suggests ~0 and 0 should be special-cased in general. I
_think_ this is not necessary here, but it also seems to me as if
it would be wrong for the VM to ever have a way of observing
either of those two values. Shouldn't the function avoid ever
storing ~0 into that field, i.e. increment into a local variable,
adjust that local variable to skip the two "invalid" values, and
only then store into the field?
Otoh, since entry into that function is the result of an MSR write,
it may well be that the spec precludes the guest from reading
the reference page while an update is in progress on one of its
vCPU-s. If that were the case, then I also wouldn't have to
wonder any longer how this entire mechanism can be race free
in the first place (without a double increment like we do in the
pv-clock protocol).
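To illustrate the increment-into-a-local idea (a sketch only, not
knowing the exact shape of update_reference_tsc()):

    uint32_t seq = p->TscSequence + 1;

    /* Never store the two "invalid" values, ~0 and 0. */
    if ( seq == ~0u || seq == 0 )
        seq = 1;

    p->TscSequence = seq;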
> +static void start_stimer(struct viridian_stimer *vs)
> +{
> + const struct vcpu *v = vs->v;
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + unsigned int stimerx = vs - &vv->stimer[0];
> + int64_t now = time_now(v->domain);
> + s_time_t timeout;
> +
> + if ( !test_and_set_bit(stimerx, &vv->stimer_enabled) )
> + printk(XENLOG_G_INFO "%pv: VIRIDIAN STIMER%u: enabled\n", v,
> + stimerx);
> +
> + if ( vs->config.fields.periodic )
> + {
> + unsigned int missed = 0;
> + int64_t next;
> +
> + /*
> + * The specification says that if the timer is lazy then we
> + * skip over any missed expirations so we can treat this case
> + * as the same as if the timer is currently stopped, i.e. we
> + * just schedule expiration to be 'count' ticks from now.
> + */
> + if ( !vs->started || vs->config.fields.lazy )
> + next = now + vs->count;
> + else
> + {
> + /*
> + * The timer is already started, so we're re-scheduling.
> + * Hence advance the timer expiration by one tick.
> + */
> + next = vs->expiration + vs->count;
> +
> + /* Now check to see if any expirations have been missed */
> + if ( now - next > 0 )
> + missed = (now - next) / vs->count;
> +
> + /*
> + * The specification says that if the timer is not lazy then
> + * a non-zero missed count should be used to reduce the period
> + * of the timer until it catches up, unless the count has
> + * reached a 'significant number', in which case the timer
> + * should be treated as lazy. Unfortunately the specification
> + * does not state what that number is so the choice of number
> + * here is a pure guess.
> + */
> + if ( missed > 3 )
> + {
> + missed = 0;
> + next = now + vs->count;
> + }
> + }
> +
> + vs->expiration = next;
> + timeout = ((next - now) * 100ull) / (missed + 1);
The logic above guarantees next > now only if missed > 3. Yet
with next < now, the multiplication by 100ull will produce a huge
64-bit value, and division by a value larger than 1 will make it a
huge 62- or 63-bit value (sign bit lost). That's hardly the timeout
you mean.
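I'd suggest calculating the difference into a signed local first,
e.g. (sketch only; whether a zero timeout should instead mark the
timer pending right away is a separate question):

    int64_t delta = next - now;

    if ( delta < 0 )
        delta = 0;

    timeout = ((uint64_t)delta * 100) / (missed + 1);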
> + }
> + else
> + {
> + vs->expiration = vs->count;
> + if ( vs->count - now <= 0 )
Unless you really mean != 0 you have an issue here, due to
vs->count being uint64_t. Perhaps, taking also ...
> + {
> + set_bit(stimerx, &vv->stimer_pending);
> + return;
> + }
> +
> + timeout = (vs->expiration - now) * 100ull;
... this, you want to calculate the difference into a local
variable of type int64_t instead, and use it in both places?
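I.e. something along the lines of:

    int64_t expiration = vs->count - now;

    vs->expiration = vs->count;

    if ( expiration <= 0 )
    {
        set_bit(stimerx, &vv->stimer_pending);
        return;
    }

    timeout = expiration * 100ull;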
> +static void poll_stimer(struct vcpu *v, unsigned int stimerx)
> +{
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + struct viridian_stimer *vs = &vv->stimer[stimerx];
> +
> + if ( !test_bit(stimerx, &vv->stimer_pending) )
> + return;
> +
> + if ( !viridian_synic_deliver_timer_msg(v, vs->config.fields.sintx,
> + stimerx, vs->expiration,
> + time_now(v->domain)) )
> + return;
> +
> + clear_bit(stimerx, &vv->stimer_pending);
While perhaps benign, wouldn't it be better to clear the pending bit
before delivering the message?
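I.e. perhaps (re-setting the bit if delivery fails, so the
expiration isn't lost):

    clear_bit(stimerx, &vv->stimer_pending);

    if ( !viridian_synic_deliver_timer_msg(v, vs->config.fields.sintx,
                                           stimerx, vs->expiration,
                                           time_now(v->domain)) )
        set_bit(stimerx, &vv->stimer_pending);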
> +void viridian_time_vcpu_freeze(struct vcpu *v)
> +{
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + unsigned int i;
> +
> + if ( !is_viridian_vcpu(v) )
> + return;
Don't you also want/need to check the HVMPV_stimer bit here (and
also in the thaw counterpart)?
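I.e. something like (viridian_feature_mask() as used in the wrmsr
handling below):

    if ( !is_viridian_vcpu(v) ||
         !(viridian_feature_mask(v->domain) & HVMPV_stimer) )
        return;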
> @@ -149,6 +398,63 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx,
> uint64_t val)
> }
> break;
>
> + case HV_X64_MSR_TIME_REF_COUNT:
> + return X86EMUL_EXCEPTION;
> +
> + case HV_X64_MSR_STIMER0_CONFIG:
> + case HV_X64_MSR_STIMER1_CONFIG:
> + case HV_X64_MSR_STIMER2_CONFIG:
> + case HV_X64_MSR_STIMER3_CONFIG:
> + {
> + unsigned int stimerx =
> + array_index_nospec((idx - HV_X64_MSR_STIMER0_CONFIG) / 2,
> + ARRAY_SIZE(vv->stimer));
> + struct viridian_stimer *vs = &vv->stimer[stimerx];
I again think you'd better use array_access_nospec() here (also
for the rdmsr counterparts).
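I.e.:

    struct viridian_stimer *vs =
        &array_access_nospec(vv->stimer,
                             (idx - HV_X64_MSR_STIMER0_CONFIG) / 2);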
> @@ -160,6 +466,7 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx,
> uint64_t val)
>
> int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val)
> {
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
const?
> @@ -201,6 +508,37 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t
> idx, uint64_t *val)
> break;
> }
>
> + case HV_X64_MSR_STIMER0_CONFIG:
> + case HV_X64_MSR_STIMER1_CONFIG:
> + case HV_X64_MSR_STIMER2_CONFIG:
> + case HV_X64_MSR_STIMER3_CONFIG:
> + {
> + unsigned int stimerx =
> + array_index_nospec((idx - HV_X64_MSR_STIMER0_CONFIG) / 2,
> + ARRAY_SIZE(vv->stimer));
> +
> + if ( !(viridian_feature_mask(d) & HVMPV_stimer) )
> + return X86EMUL_EXCEPTION;
> +
> + *val = vv->stimer[stimerx].config.raw;
> + break;
> + }
> + case HV_X64_MSR_STIMER0_COUNT:
Blank line between case blocks please.
> @@ -231,11 +590,36 @@ void viridian_time_domain_deinit(const struct domain *d)
> void viridian_time_save_vcpu_ctxt(
> const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt)
> {
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
const?
> + unsigned int i;
> +
> + BUILD_BUG_ON(ARRAY_SIZE(vv->stimer) !=
> + ARRAY_SIZE(ctxt->stimer_config_msr));
> + BUILD_BUG_ON(ARRAY_SIZE(vv->stimer) !=
> + ARRAY_SIZE(ctxt->stimer_count_msr));
> +
> + for ( i = 0; i < ARRAY_SIZE(vv->stimer); i++ )
> + {
> + struct viridian_stimer *vs = &vv->stimer[i];
const (if you really consider the variable useful in the first place)?
> @@ -322,6 +324,15 @@ int guest_wrmsr_viridian(struct vcpu *v, uint32_t idx,
> uint64_t val)
> case HV_X64_MSR_TSC_FREQUENCY:
> case HV_X64_MSR_APIC_FREQUENCY:
> case HV_X64_MSR_REFERENCE_TSC:
> + case HV_X64_MSR_TIME_REF_COUNT:
> + case HV_X64_MSR_STIMER0_CONFIG:
> + case HV_X64_MSR_STIMER0_COUNT:
> + case HV_X64_MSR_STIMER1_CONFIG:
> + case HV_X64_MSR_STIMER1_COUNT:
> + case HV_X64_MSR_STIMER2_CONFIG:
> + case HV_X64_MSR_STIMER2_COUNT:
> + case HV_X64_MSR_STIMER3_CONFIG:
> + case HV_X64_MSR_STIMER3_COUNT:
For readability / brevity
case HV_X64_MSR_STIMER0_CONFIG ... HV_X64_MSR_STIMER3_COUNT:
?
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel