Re: [Xen-devel] [PATCH v5 10/11] viridian: add implementation of synthetic timers
>>> On 11.03.19 at 14:41, <paul.durrant@xxxxxxxxxx> wrote:
> +bool viridian_synic_deliver_timer_msg(struct vcpu *v, unsigned int sintx,
> + unsigned int index,
> + uint64_t expiration,
> + uint64_t delivery)
> +{
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + const union viridian_sint_msr *vs = &vv->sint[sintx];
> + HV_MESSAGE *msg = vv->simp.ptr;
> + struct {
> + uint32_t TimerIndex;
> + uint32_t Reserved;
> + uint64_t ExpirationTime;
> + uint64_t DeliveryTime;
> + } payload = {
> + .TimerIndex = index,
> + .ExpirationTime = expiration,
> + .DeliveryTime = delivery,
> + };
> +
> + if ( test_bit(sintx, &vv->msg_pending) )
> + return false;
> +
> + BUILD_BUG_ON(sizeof(*msg) != HV_MESSAGE_SIZE);
> + msg += sintx;
> +
> + /*
> + * To avoid using an atomic test-and-set, and barrier before calling
> + * vlapic_set_irq(), this function must be called in context of the
> + * vcpu receiving the message.
> + */
> + ASSERT(v == current);
> + if ( msg->Header.MessageType != HvMessageTypeNone )
> + {
> + msg->Header.MessageFlags.MessagePending = 1;
> + __set_bit(sintx, &vv->msg_pending);
> + return false;
> + }
> +
> + msg->Header.MessageType = HvMessageTimerExpired;
> + msg->Header.MessageFlags.MessagePending = 0;
> + msg->Header.PayloadSize = sizeof(payload);
> + memcpy(msg->Payload, &payload, sizeof(payload));
Since you can't use plain assignment here, how about a
BUILD_BUG_ON(sizeof(payload) > sizeof(msg->Payload))?
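Something along these lines, perhaps (field name taken from the
quoted hunk):

    /* The payload must never be able to overflow the message buffer. */
    BUILD_BUG_ON(sizeof(payload) > sizeof(msg->Payload));
    memcpy(msg->Payload, &payload, sizeof(payload));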
> +static uint64_t time_now(struct domain *d)
> +{
> + const struct viridian_page *rt = &d->arch.hvm.viridian->reference_tsc;
> + HV_REFERENCE_TSC_PAGE *p = rt->ptr;
> + uint32_t start, end;
> + uint64_t tsc;
> + uint64_t scale;
> + uint64_t offset;
Would you mind grouping them all on one line, just like you do for
start and end?
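I.e. simply:

    uint64_t tsc, scale, offset;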
> + /*
> + * If the reference TSC page is not enabled, or has been invalidated,
> + * fall back to the partition reference counter.
> + */
> + if ( !p || !p->TscSequence )
> + return time_ref_count(d);
> +
> + /*
> + * The following sampling algorithm for tsc, scale and offset is
> + * documented in the specification.
> + */
> + do {
> + start = p->TscSequence;
> + smp_rmb();
> +
> + tsc = hvm_get_guest_tsc(pt_global_vcpu_target(d));
> + scale = p->TscScale;
> + offset = p->TscOffset;
> +
> + smp_rmb();
> + end = p->TscSequence;
> + } while (end != start);
Blanks immediately inside the parentheses please.
As to safety of this, I have two concerns:
1) TscSequence gets updated as a result of a guest action (an MSR
write). This makes it non-obvious that the loop above will get
exited in due course.
2) The way update_reference_tsc() deals with the two "invalid"
values suggests ~0 and 0 should be special-cased in general. I
_think_ this is not necessary here, but it also seems to me as if
it would be wrong for the VM to ever have a way of observing
either of those two values. Shouldn't the function avoid ever
storing ~0 into that field, i.e. increment into a local variable,
adjust that local variable to skip the two "invalid" values, and
only then store into the field?
Otoh, since entry into that function is the result of an MSR write,
it may well be that the spec precludes the guest from reading
the reference page while an update is in progress on one of its
vCPU-s. If that were the case, then I also wouldn't have to
wonder any longer how this entire mechanism can be race free
in the first place (without a double increment like we do in the
pv-clock protocol).
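To illustrate the increment-into-a-local idea (a sketch only, not
knowing the exact shape of update_reference_tsc()):

    uint32_t seq = p->TscSequence + 1;

    /* Never store the two "invalid" values, ~0 and 0. */
    if ( seq == ~0u || seq == 0 )
        seq = 1;

    p->TscSequence = seq;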
> +static void start_stimer(struct viridian_stimer *vs)
> +{
> + const struct vcpu *v = vs->v;
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + unsigned int stimerx = vs - &vv->stimer[0];
> + int64_t now = time_now(v->domain);
> + s_time_t timeout;
> +
> + if ( !test_and_set_bit(stimerx, &vv->stimer_enabled) )
> + printk(XENLOG_G_INFO "%pv: VIRIDIAN STIMER%u: enabled\n", v,
> + stimerx);
> +
> + if ( vs->config.fields.periodic )
> + {
> + unsigned int missed = 0;
> + int64_t next;
> +
> + /*
> + * The specification says that if the timer is lazy then we
> + * skip over any missed expirations so we can treat this case
> + * as the same as if the timer is currently stopped, i.e. we
> + * just schedule expiration to be 'count' ticks from now.
> + */
> + if ( !vs->started || vs->config.fields.lazy )
> + next = now + vs->count;
> + else
> + {
> + /*
> + * The timer is already started, so we're re-scheduling.
> + * Hence advance the timer expiration by one tick.
> + */
> + next = vs->expiration + vs->count;
> +
> + /* Now check to see if any expirations have been missed */
> + if ( now - next > 0 )
> + missed = (now - next) / vs->count;
> +
> + /*
> + * The specification says that if the timer is not lazy then
> + * a non-zero missed count should be used to reduce the period
> + * of the timer until it catches up, unless the count has
> + * reached a 'significant number', in which case the timer
> + * should be treated as lazy. Unfortunately the specification
> + * does not state what that number is so the choice of number
> + * here is a pure guess.
> + */
> + if ( missed > 3 )
> + {
> + missed = 0;
> + next = now + vs->count;
> + }
> + }
> +
> + vs->expiration = next;
> + timeout = ((next - now) * 100ull) / (missed + 1);
The logic above guarantees next > now only if missed > 3. Yet
with next < now, the multiplication by 100ull will produce a huge
64-bit value, and division by a value larger than 1 will make it a
huge 62- or 63-bit value (sign bit lost). That's hardly the timeout
you mean.
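I'd suggest calculating the difference into a signed local first,
e.g. (sketch only; whether a zero timeout should instead mark the
timer pending right away is a separate question):

    int64_t delta = next - now;

    if ( delta < 0 )
        delta = 0;

    timeout = ((uint64_t)delta * 100) / (missed + 1);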
> + }
> + else
> + {
> + vs->expiration = vs->count;
> + if ( vs->count - now <= 0 )
Unless you really mean != 0 you have an issue here, due to
vs->count being uint64_t. Perhaps, taking also ...
> + {
> + set_bit(stimerx, &vv->stimer_pending);
> + return;
> + }
> +
> + timeout = (vs->expiration - now) * 100ull;
... this, you want to calculate the difference into a local
variable of type int64_t instead, and use it in both places?
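I.e. something along the lines of:

    int64_t expiration = vs->count - now;

    vs->expiration = vs->count;

    if ( expiration <= 0 )
    {
        set_bit(stimerx, &vv->stimer_pending);
        return;
    }

    timeout = expiration * 100ull;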
> +static void poll_stimer(struct vcpu *v, unsigned int stimerx)
> +{
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + struct viridian_stimer *vs = &vv->stimer[stimerx];
> +
> + if ( !test_bit(stimerx, &vv->stimer_pending) )
> + return;
> +
> + if ( !viridian_synic_deliver_timer_msg(v, vs->config.fields.sintx,
> + stimerx, vs->expiration,
> + time_now(v->domain)) )
> + return;
> +
> + clear_bit(stimerx, &vv->stimer_pending);
While perhaps benign, wouldn't it be better to clear the pending bit
before delivering the message?
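I.e. perhaps (re-setting the bit if delivery fails, so the
expiration isn't lost):

    clear_bit(stimerx, &vv->stimer_pending);

    if ( !viridian_synic_deliver_timer_msg(v, vs->config.fields.sintx,
                                           stimerx, vs->expiration,
                                           time_now(v->domain)) )
        set_bit(stimerx, &vv->stimer_pending);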
> +void viridian_time_vcpu_freeze(struct vcpu *v)
> +{
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
> + unsigned int i;
> +
> + if ( !is_viridian_vcpu(v) )
> + return;
Don't you also want/need to check the HVMPV_stimer bit here (and
also in the thaw counterpart)?
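I.e. something like (viridian_feature_mask() as used in the wrmsr
handling below):

    if ( !is_viridian_vcpu(v) ||
         !(viridian_feature_mask(v->domain) & HVMPV_stimer) )
        return;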
> @@ -149,6 +398,63 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx,
> uint64_t val)
> }
> break;
>
> + case HV_X64_MSR_TIME_REF_COUNT:
> + return X86EMUL_EXCEPTION;
> +
> + case HV_X64_MSR_STIMER0_CONFIG:
> + case HV_X64_MSR_STIMER1_CONFIG:
> + case HV_X64_MSR_STIMER2_CONFIG:
> + case HV_X64_MSR_STIMER3_CONFIG:
> + {
> + unsigned int stimerx =
> + array_index_nospec((idx - HV_X64_MSR_STIMER0_CONFIG) / 2,
> + ARRAY_SIZE(vv->stimer));
> + struct viridian_stimer *vs = &vv->stimer[stimerx];
I again think you'd better use array_access_nospec() here (also
for the rdmsr counterparts).
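I.e.:

    struct viridian_stimer *vs =
        &array_access_nospec(vv->stimer,
                             (idx - HV_X64_MSR_STIMER0_CONFIG) / 2);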
> @@ -160,6 +466,7 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx,
> uint64_t val)
>
> int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val)
> {
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
const?
> @@ -201,6 +508,37 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t
> idx, uint64_t *val)
> break;
> }
>
> + case HV_X64_MSR_STIMER0_CONFIG:
> + case HV_X64_MSR_STIMER1_CONFIG:
> + case HV_X64_MSR_STIMER2_CONFIG:
> + case HV_X64_MSR_STIMER3_CONFIG:
> + {
> + unsigned int stimerx =
> + array_index_nospec((idx - HV_X64_MSR_STIMER0_CONFIG) / 2,
> + ARRAY_SIZE(vv->stimer));
> +
> + if ( !(viridian_feature_mask(d) & HVMPV_stimer) )
> + return X86EMUL_EXCEPTION;
> +
> + *val = vv->stimer[stimerx].config.raw;
> + break;
> + }
> + case HV_X64_MSR_STIMER0_COUNT:
Blank line between case blocks please.
> @@ -231,11 +590,36 @@ void viridian_time_domain_deinit(const struct domain *d)
> void viridian_time_save_vcpu_ctxt(
> const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt)
> {
> + struct viridian_vcpu *vv = v->arch.hvm.viridian;
const?
> + unsigned int i;
> +
> + BUILD_BUG_ON(ARRAY_SIZE(vv->stimer) !=
> + ARRAY_SIZE(ctxt->stimer_config_msr));
> + BUILD_BUG_ON(ARRAY_SIZE(vv->stimer) !=
> + ARRAY_SIZE(ctxt->stimer_count_msr));
> +
> + for ( i = 0; i < ARRAY_SIZE(vv->stimer); i++ )
> + {
> + struct viridian_stimer *vs = &vv->stimer[i];
const (if you really consider the variable useful in the first place)?
> @@ -322,6 +324,15 @@ int guest_wrmsr_viridian(struct vcpu *v, uint32_t idx,
> uint64_t val)
> case HV_X64_MSR_TSC_FREQUENCY:
> case HV_X64_MSR_APIC_FREQUENCY:
> case HV_X64_MSR_REFERENCE_TSC:
> + case HV_X64_MSR_TIME_REF_COUNT:
> + case HV_X64_MSR_STIMER0_CONFIG:
> + case HV_X64_MSR_STIMER0_COUNT:
> + case HV_X64_MSR_STIMER1_CONFIG:
> + case HV_X64_MSR_STIMER1_COUNT:
> + case HV_X64_MSR_STIMER2_CONFIG:
> + case HV_X64_MSR_STIMER2_COUNT:
> + case HV_X64_MSR_STIMER3_CONFIG:
> + case HV_X64_MSR_STIMER3_COUNT:
For readability / brevity
case HV_X64_MSR_STIMER0_CONFIG ... HV_X64_MSR_STIMER3_COUNT:
?
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel