[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling




> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Friday, January 29, 2016 12:38 AM
> To: Wu, Feng <feng.wu@xxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Dario Faggioli
> <dario.faggioli@xxxxxxxxxx>; George Dunlap <george.dunlap@xxxxxxxxxxxxx>;
> Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx; Keir Fraser
> <keir@xxxxxxx>
> Subject: Re: [PATCH v11 1/2] vmx: VT-d posted-interrupt core logic handling
> 
> >>> On 28.01.16 at 06:12, <feng.wu@xxxxxxxxx> wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -83,7 +83,140 @@ static int vmx_msr_write_intercept(unsigned int msr,
> uint64_t msr_content);
> >  static void vmx_invlpg_intercept(unsigned long vaddr);
> >  static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
> >
> > +static void vmx_vcpu_block(struct vcpu *v)
> > +{
> > +    unsigned long flags;
> > +    unsigned int dest;
> > +
> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> 
> Stray blank line between declarations.
> 
> > +    ASSERT(v->arch.hvm_vmx.pi_block_list_lock == NULL);
> 
> I think this needs to be done after acquiring the lock.

I am not sure what you mean here. The ASSERT here means that
the vCPU is not on any pCPU list when we get here.

> 
> > +static void vmx_pi_switch_from(struct vcpu *v)
> > +{
> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> > +
> > +    if ( test_bit(_VPF_blocked, &v->pause_flags) )
> > +        return;
> > +
> > +    /*
> > +     * The vCPU has been preempted or went to sleep. We don't
> > +     * need to send notification event to a runnable or sleeping
> > +     * vcpu, the interrupt information will be delivered to it
> > +     * before VM-ENTRY when the vcpu is scheduled to run next time.
> > +     */
> > +    pi_set_sn(pi_desc);
> > +}
> 
> The comment confuses me: A runnable vCPU certainly doesn't
> need waking, but a sleeping one? Why wouldn't an interrupt
> coming in not need to wake that vCPU, if this interrupt is what
> it's waiting for?

First, let's put VT-d PI away, just thinking the case without it:
When a vCPU is sleeping, such as, vcpu_sleep_nosync() gets
called, the vCPU is in RUNSTATE_offline state. Then an interrupt
comes in and the slept vCPU is the destination, the vCPU is not
woken up by the interrupts. The slept vCPU would only be woken
up by calling vcpu_wakeup() in the code path responding to
the interrupt, however, this is not in the code. So this is
the current implementation and policy. I don't think VT-d PI needs
to break it. On the other hand, even we clear SN for slept vCPU,
it doesn't help, since it is still not in the run-queue and don't
have chance to run.

Above is my understanding about the scheduler related things,
maybe Dario can double confirm this. Thanks!

> 
> > +static void vmx_pi_do_resume(struct vcpu *v)
> > +{
> > +    unsigned long flags;
> > +    spinlock_t *pi_block_list_lock;
> > +    struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> > +
> > +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
> > +
> > +    /*
> > +     * Set 'NV' field back to posted_intr_vector, so the
> > +     * Posted-Interrupts can be delivered to the vCPU when
> > +     * it is running in non-root mode.
> > +     */
> > +    if ( pi_desc->nv != posted_intr_vector )
> > +        write_atomic(&pi_desc->nv, posted_intr_vector);
> 
> Perhaps this was discussed before, but I don't recall and now
> wonder - why inside an if()? This is a simple memory write on
> x86.

The initial purpose is that if NV is already equal to 'posted_intr_vector',
we can save the following atomically write operation. There are some
volatile stuff and barriers in write_atomic().

> 
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -2293,6 +2293,8 @@ static int reassign_device_ownership(
> >          pdev->domain = target;
> >      }
> >
> > +    vmx_pi_hooks_reassign(source, target);
> > +
> >      return ret;
> >  }
> 
> I think you need to clear source's hooks here, but target's need to
> get set ahead of the actual assignment.

I think this is the place where the device is moved from
&source->arch.pdev_list to &target->arch.pdev_list, if that is the
case, we should clear source and set target here, right?

Please refer to the following code before 'vmx_pi_hooks_reassign'

    if ( devfn == pdev->devfn )
    {
        list_move(&pdev->domain_list, &target->arch.pdev_list);
        pdev->domain = target;
    }

> 
> > --- a/xen/include/asm-x86/hvm/hvm.h
> > +++ b/xen/include/asm-x86/hvm/hvm.h
> > @@ -565,6 +565,11 @@ const char *hvm_efer_valid(const struct vcpu *v,
> uint64_t value,
> >                             signed int cr0_pg);
> >  unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t
> restore);
> >
> > +#define arch_vcpu_block(v) ({                                              
> >  \
> > +    if ( (v)->domain->arch.hvm_domain.vmx.vcpu_block )                     
> >  \
> > +        (v)->domain->arch.hvm_domain.vmx.vcpu_block((v));                  
> >  \
> 
> Stray double parentheses. But - why is this a macro all of the
> sudden anyway?
> 

In the previous version, you suggest to make arch_vcpu_block()
inline, I tried that, but I encountered building failures, which is
related to using '(v)->domain->arch.hvm_domain.vmx.vcpu_block '
here since it refers to some data structure, it is not so straightforward
to address it, so I change it to a macro, just like other micros in this
file, which refers to ' (v)->arch.hvm_vcpu.....'.

Thanks,
Feng

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.