[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 8/9] viridian: add implementation of synthetic timers
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@xxxxxxxx] > Sent: 25 February 2019 14:54 > To: Paul Durrant <Paul.Durrant@xxxxxxxxxx> > Cc: Julien Grall <julien.grall@xxxxxxx>; Andrew Cooper > <Andrew.Cooper3@xxxxxxxxxx>; Roger Pau Monne > <roger.pau@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; George Dunlap > <George.Dunlap@xxxxxxxxxx>; Ian > Jackson <Ian.Jackson@xxxxxxxxxx>; Stefano Stabellini > <sstabellini@xxxxxxxxxx>; xen-devel <xen- > devel@xxxxxxxxxxxxxxxxxxxx>; Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>; > Tim (Xen.org) > <tim@xxxxxxx> > Subject: Re: [PATCH v3 8/9] viridian: add implementation of synthetic timers > > >>> On 31.01.19 at 11:47, <paul.durrant@xxxxxxxxxx> wrote: > > --- a/xen/arch/x86/hvm/viridian/synic.c > > +++ b/xen/arch/x86/hvm/viridian/synic.c > > @@ -329,7 +329,53 @@ void viridian_synic_domain_deinit(struct domain *d) > > > > void viridian_synic_poll_messages(struct vcpu *v) > > { > > - /* There are currently no message sources */ > > + viridian_time_poll_timers(v); > > +} > > + > > +bool viridian_synic_deliver_timer_msg(struct vcpu *v, unsigned int sintx, > > + unsigned int index, > > + int64_t expiration, int64_t delivery) > > +{ > > + const union viridian_sint_msr *vs = &v->arch.hvm.viridian->sint[sintx]; > > + HV_MESSAGE *msg = v->arch.hvm.viridian->simp.ptr; > > + struct { > > + uint32_t TimerIndex; > > + uint32_t Reserved; > > + uint64_t ExpirationTime; > > + uint64_t DeliveryTime; > > + } payload = { > > + .TimerIndex = index, > > + .ExpirationTime = expiration, > > + .DeliveryTime = delivery, > > + }; > > + > > + if ( test_bit(sintx, &v->arch.hvm.viridian->msg_pending) ) > > + return false; > > + > > + BUILD_BUG_ON(sizeof(*msg) != HV_MESSAGE_SIZE); > > + msg += sintx; > > + > > + /* > > + * To avoid using an atomic test-and-set this function must be called > > + * in context of the vcpu receiving the message. > > + */ > > + ASSERT(v == current); > > + if ( msg->Header.MessageType != HvMessageTypeNone ) > > + { > > + msg->Header.MessageFlags.MessagePending = 1; > > + set_bit(sintx, &v->arch.hvm.viridian->msg_pending); > > As per the comment above this is always in context of the subject > vCPU. It looks to me as if this was also the case for the two > clear_bit() on the field in the prior patch. If so, all three could be > the non-atomic variants instead. The only slight subtlety I think is the one in the wrmsr function, which can be called in context of a domain restore. I think it's still ok for it to be non-atomic in this case but I'll assert (v = current || !v->running), which I think should cover it. > > > + return false; > > + } > > + > > + msg->Header.MessageType = HvMessageTimerExpired; > > + msg->Header.MessageFlags.MessagePending = 0; > > + msg->Header.PayloadSize = sizeof(payload); > > + memcpy(msg->Payload, &payload, sizeof(payload)); > > + > > + if ( !vs->fields.mask ) > > + vlapic_set_irq(vcpu_vlapic(v), vs->fields.vector, 0); > > If this wasn't with v == current, I think you'd also need a barrier > here. Could you extend the comment above to also mention this > aspect? Ok. > > > @@ -118,14 +119,237 @@ static int64_t time_ref_count(struct domain *d) > > return raw_trc_val(d) + trc->off; > > } > > > > +static int64_t time_now(struct domain *d) > > Why would this return a signed value? And can't the function > parameter be const? The function parameter can be const, but I think the result needs to be signed for the missed ticks logic in start_timer() to work correctly. > > > +{ > > + const struct viridian_page *rt = &d->arch.hvm.viridian->reference_tsc; > > + HV_REFERENCE_TSC_PAGE *p = rt->ptr; > > + uint32_t start, end; > > + __int128_t tsc; > > + __int128_t scale; > > I don't think you need both of them be 128 bits wide. I also don't > see why either would want to be of a signed type. The spec says (as in the comment below): "The partition reference time is computed by the following formula: ReferenceTime = ((VirtualTsc * TscScale) >> 64) + TscOffset The multiplication is a 64 bit multiplication, which results in a 128 bit number which is then shifted 64 times to the right to obtain the high 64 bits.TscScale" Again, I'm using signed arithmetic as I think it's necessary for the missed ticks logic to work correctly in the event of an overflow. > > > + int64_t offset; > > + > > + /* > > + * If the reference TSC page is not enabled, or has been invalidated > > + * fall back to the partition reference counter. > > + */ > > + if ( !p || !p->TscSequence ) > > + return time_ref_count(d); > > + > > + /* > > + * The following sampling algorithm for tsc, scale and offset is > > + * documented in the specifiction. > > + */ > > + start = p->TscSequence; > > + > > + do { > > + tsc = rdtsc(); > > + scale = p->TscScale; > > + offset = p->TscOffset; > > + > > + smp_mb(); > > + end = p->TscSequence; > > Why is this a full barrier, rather than just a read one? And don't you need > to add a counterpart in update_reference_tsc()? Yes, a read barrier is enough with the counterpart write barrier added. > > > + } while (end != start); > > update_reference_tsc() increments TscSequence. If end doesn't match > start at this point, you're entering a near infinite loop here as long as > you don't update start inside the loop. I also think that there's a > second read barrier needed between this initial reading of the sequence > number and the reading of the actual values. Yes, the start value should be inside the loop of course. > > > + /* > > + * The specification says: "The partition reference time is computed > > + * by the following formula: > > + * > > + * ReferenceTime = ((VirtualTsc * TscScale) >> 64) + TscOffset > > + * > > + * The multiplication is a 64 bit multiplication, which results in a > > + * 128 bit number which is then shifted 64 times to the right to obtain > > + * the high 64 bits." > > + */ > > + return ((tsc * scale) >> 64) + offset; > > +} > > + > > +static void stop_stimer(struct viridian_stimer *vs) > > +{ > > + struct vcpu *v = vs->v; > > const? Ok. > > > + unsigned int stimerx = vs - &v->arch.hvm.viridian->stimer[0]; > > + > > + if ( !vs->started ) > > + return; > > + > > + stop_timer(&vs->timer); > > + clear_bit(stimerx, &v->arch.hvm.viridian->stimer_pending); > > + vs->started = false; > > +} > > + > > +static void stimer_expire(void *data) > > +{ > > + struct viridian_stimer *vs = data; > > const? Ok. > > > + struct vcpu *v = vs->v; > > + unsigned int stimerx = vs - &v->arch.hvm.viridian->stimer[0]; > > + > > + if ( !vs->config.fields.enabled ) > > + return; > > + > > + set_bit(stimerx, &v->arch.hvm.viridian->stimer_pending); > > + vcpu_kick(v); > > +} > > + > > +static void start_stimer(struct viridian_stimer *vs) > > +{ > > + struct vcpu *v = vs->v; > > + unsigned int stimerx = vs - &v->arch.hvm.viridian->stimer[0]; > > + int64_t now = time_now(v->domain); > > + s_time_t timeout; > > + > > + if ( !test_and_set_bit(stimerx, &v->arch.hvm.viridian->stimer_enabled) > > ) > > + printk(XENLOG_G_INFO "%pv: VIRIDIAN STIMER%u: enabled\n", v, > > + stimerx); > > + > > + if ( vs->config.fields.periodic ) > > + { > > + unsigned int missed = 0; > > + int64_t next; > > + > > + /* > > + * The specification says that if the timer is lazy then we > > + * skip over any missed expirations so we can treat this case > > + * as the same as if the timer is currently stopped, i.e. we > > + * just schedule expiration to be 'count' ticks from now. > > + */ > > + if ( !vs->started || vs->config.fields.lazy ) > > + { > > + next = now + vs->count; > > + } > > Unnecessary braces. Yes. > > > @@ -149,6 +373,57 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, > > uint64_t val) > > } > > break; > > > > + case HV_X64_MSR_TIME_REF_COUNT: > > + return X86EMUL_EXCEPTION; > > Isn't this an unrelated change? It is. I'll call it out in the comment comment. > > > + case HV_X64_MSR_STIMER0_CONFIG: > > + case HV_X64_MSR_STIMER1_CONFIG: > > + case HV_X64_MSR_STIMER2_CONFIG: > > + case HV_X64_MSR_STIMER3_CONFIG: > > + { > > + unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_CONFIG) / 2; > > + struct viridian_stimer *vs = > > &v->arch.hvm.viridian->stimer[stimerx]; > > + > > + if ( !(viridian_feature_mask(d) & HVMPV_stimer) ) > > + return X86EMUL_EXCEPTION; > > + > > + stop_stimer(vs); > > + > > + vs->config.raw = val; > > + > > + if ( !vs->config.fields.sintx ) > > + vs->config.fields.enabled = 0; > > + > > + if ( vs->config.fields.enabled ) > > + start_stimer(vs); > > + > > + break; > > + } > > + case HV_X64_MSR_STIMER0_COUNT: > > Missing blank line again (and also further down here as well as in the > rdmsr code). > Ok. TBH I've always thought the normal style was to omit the blank line if the case statement has braces. > > + case HV_X64_MSR_STIMER1_COUNT: > > + case HV_X64_MSR_STIMER2_COUNT: > > + case HV_X64_MSR_STIMER3_COUNT: > > + { > > + unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_COUNT) / 2; > > + struct viridian_stimer *vs = > > &v->arch.hvm.viridian->stimer[stimerx]; > > + > > + if ( !(viridian_feature_mask(d) & HVMPV_stimer) ) > > + return X86EMUL_EXCEPTION; > > + > > + stop_stimer(vs); > > + > > + vs->count = val; > > + > > + if ( !vs->count ) > > Any reason you don't use val here (which the compiler likely will do > anyway)? Not particularly, I just think it reads better and is more consistent with other code. > > > @@ -201,6 +476,32 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t > > idx, uint64_t *val) > > break; > > } > > > > + case HV_X64_MSR_STIMER0_CONFIG: > > + case HV_X64_MSR_STIMER1_CONFIG: > > + case HV_X64_MSR_STIMER2_CONFIG: > > + case HV_X64_MSR_STIMER3_CONFIG: > > + { > > + unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_CONFIG) / 2; > > + > > + if ( !(viridian_feature_mask(d) & HVMPV_stimer) ) > > + return X86EMUL_EXCEPTION; > > + > > + *val = v->arch.hvm.viridian->stimer[stimerx].config.raw; > > While more noticeable here and ... > > > + break; > > + } > > + case HV_X64_MSR_STIMER0_COUNT: > > + case HV_X64_MSR_STIMER1_COUNT: > > + case HV_X64_MSR_STIMER2_COUNT: > > + case HV_X64_MSR_STIMER3_COUNT: > > + { > > + unsigned int stimerx = (idx - HV_X64_MSR_STIMER0_COUNT) / 2; > > + > > + if ( !(viridian_feature_mask(d) & HVMPV_stimer) ) > > + return X86EMUL_EXCEPTION; > > + > > + *val = v->arch.hvm.viridian->stimer[stimerx].count; > > ... here, array_access_nospec() are probably needed not just here, > but also in the wrmsr logic. Really? stimerx is calculated based on hitting the case statement in the first place. > > > --- a/xen/include/asm-x86/hvm/viridian.h > > +++ b/xen/include/asm-x86/hvm/viridian.h > > @@ -40,6 +40,33 @@ union viridian_sint_msr > > } fields; > > }; > > > > +union viridian_stimer_config_msr > > +{ > > + uint64_t raw; > > + struct > > + { > > + uint64_t enabled:1; > > + uint64_t periodic:1; > > + uint64_t lazy:1; > > + uint64_t auto_enable:1; > > + uint64_t vector:8; > > + uint64_t direct_mode:1; > > + uint64_t reserved_zero1:3; > > + uint64_t sintx:4; > > + uint64_t reserved_zero2:44; > > + } fields; > > +}; > > + > > +struct viridian_stimer { > > + struct vcpu *v; > > Isn't a full 8-byte pointer a little too much overhead here? You could > instead store the timer index ... I think I need it in stimer_expire() which can be called in any vcpu context IIUC. > > > + struct timer timer; > > + union viridian_stimer_config_msr config; > > + uint64_t count; > > + int64_t expiration; > > + s_time_t timeout; > > + bool started; > > ... in a field using the 7-byte padding here, and use container_of() > to get at the outer structure. That would get me as far as viridian_vcpu, but there's no pointer to struct vcpu in there, and I need one to call vcpu_kick(). Paul > > Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |