[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 4/7] x86/vmx: add do_vmtrace_op


  • To: Michał Leszczyński <michal.leszczynski@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 22 Jun 2020 15:25:02 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=h/3qiXh/KRx32LrOeslzu7ZsTcQLk4cSzuTZWmdNvO4=; b=Nx7Ky2UR/tkzPsc4dmXkv309cFP19gMTym3rdcLTXls+8zZmgJmMep5UuYeQMZ6xg3aCg6T62VX43p5rB1qSpGB3K+hyVjcjm8w+RiJOoR4HocbtYvk1TIkFBg5EpjSV2aTQy8e4AYyvUDnilgoQJHjBkrs2wimlsBe8yshbxNZhkibUd/zMCSW5Ogsr16gaoOMCRbvT7Q0sCUiX+a0ymdw23e1hqFnwFsKYCSmu5F3puDHK+Imenf7qRjN5qioVPsbbSGkjYkhlKkpmWr1+IoBBiFN4LFpUCVlWtA2t+SOJ0UNzwoF9bhW/cK5w0t5V1GxItY8hqyDsAZncneAWtg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WXzTW0ZXn/0xE2NDPpQaIKFMCxWlIv2mwzj/NUzOBpuhWlhUSadhDoSoSn5zYeZXRa1CafYkGejwd1ZwqECcJc/cZwbsn3x1nvmsaYQEgU8otmhxWZfvOWk8lF8zwpSmDycgpuXm1zw0QoVD7ogslkPs5QHruJt21LMlbaytZI4x3+7C39WAl65iJl7pBkGm3H6lV1/46xH+ZlUjfrHavgEDTRx5X7IPIhbkMdfL0luBjpOzI8fTB6Uz3fFosgaGB+nVmO3edKUz5hPLuF+yvwIzpLIKtUTzcfT0ZUN0MZUUtjIIyKTj01mclocy5DA09I57eekfUYbRxGPLWwggcw==
  • Authentication-results: intel.com; dkim=none (message not signed) header.d=none;intel.com; dmarc=none action=none header.from=suse.com;
  • Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Wei Liu <wl@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, "Kang, Luwei" <luwei.kang@xxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Tamas K Lengyel <tamas.k.lengyel@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Mon, 22 Jun 2020 13:25:18 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 19.06.2020 01:41, Michał Leszczyński wrote:
> @@ -1631,6 +1649,8 @@ void hvm_vcpu_destroy(struct vcpu *v)
>      vlapic_destroy(v);
>  
>      hvm_vcpu_cacheattr_destroy(v);
> +
> +    hvm_vmtrace_destroy(v);
>  }

Whenever possible resource cleanup should occur from
hvm_domain_relinquish_resources(). Is there anything preventing
this here?

> @@ -4066,6 +4086,51 @@ static int hvmop_set_evtchn_upcall_vector(
>      return 0;
>  }
>  
> +static int hvm_set_vmtrace_pt_size(struct domain *d, uint64_t value)
> +{
> +    void *buf;
> +    unsigned int buf_order;
> +    struct page_info *pg;
> +    struct ipt_state *ipt;
> +    struct vcpu *v;
> +
> +    if ( value < PAGE_SIZE ||
> +         value > GB(4) ||
> +         ( value & (value - 1) ) ) {
> +        /* we don't accept trace buffer size smaller than single page
> +         * and the upper bound is defined as 4GB in the specification */
> +        return -EINVAL;
> +    }
> +
> +    for_each_vcpu ( d, v )
> +    {
> +        buf_order = get_order_from_bytes(value);
> +        pg = alloc_domheap_pages(d, buf_order, MEMF_no_refcount);
> +
> +        if ( !pg )
> +            return -EFAULT;
> +
> +        buf = page_to_virt(pg);

In addition to what Roger has said here, just fyi that you can't
use page_to_virt() on anything returned from alloc_domheap_pages(),
unless you suitably restrict the address range such that the
result is guaranteed to have a direct mapping.

> @@ -4949,6 +5018,100 @@ static int compat_altp2m_op(
>      return rc;
>  }
>  
> +static int do_vmtrace_op(XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    struct xen_hvm_vmtrace_op a;
> +    struct domain *d;
> +    int rc;
> +    struct vcpu *v;
> +    struct ipt_state *ipt;
> +
> +    if ( !hvm_pt_supported() )
> +        return -EOPNOTSUPP;
> +
> +    if ( copy_from_guest(&a, arg, 1) )
> +        return -EFAULT;
> +
> +    if ( a.version != HVMOP_VMTRACE_INTERFACE_VERSION )
> +        return -EINVAL;
> +
> +    d = rcu_lock_domain_by_any_id(a.domain);
> +    spin_lock(&d->vmtrace_lock);
> +
> +    if ( d == NULL )
> +        return -ESRCH;
> +
> +    if ( !is_hvm_domain(d) )
> +    {
> +        rc = -EOPNOTSUPP;
> +        goto out;
> +    }
> +
> +    domain_pause(d);

Who's the intended caller of this interface? You making it a hvm-op
suggests the guest may itself call this. But of course a guest
can't pause itself. If this is supposed to be a tools-only interface,
then you should frame it suitably in the public header and of course
you need to enforce this here (which would e.g. mean you shouldn't
use rcu_lock_domain_by_any_id()).

Also please take a look at hvm/ioreq.c, which makes quite a bit of
use of domain_pause(). In particular I think you want to acquire
the lock only after having paused the domain.

> +    if ( a.vcpu >= d->max_vcpus )
> +    {
> +        rc = -EINVAL;
> +        goto out;
> +    }
> +
> +    v = d->vcpu[a.vcpu];
> +    ipt = v->arch.hvm.vmx.ipt_state;
> +
> +    if ( !ipt )
> +    {
> +        /*
> +      * PT must be first initialized upon domain creation.
> +      */
> +        rc = -EINVAL;
> +        goto out;
> +    }
> +
> +    switch ( a.cmd )
> +    {
> +    case HVMOP_vmtrace_ipt_enable:
> +        if ( vmx_add_guest_msr(v, MSR_RTIT_CTL,
> +                               RTIT_CTL_TRACEEN | RTIT_CTL_OS |
> +                               RTIT_CTL_USR | RTIT_CTL_BRANCH_EN) )
> +        {
> +            rc = -EFAULT;
> +            goto out;
> +        }
> +
> +        ipt->active = 1;
> +        break;
> +    case HVMOP_vmtrace_ipt_disable:
> +        if ( vmx_add_guest_msr(v, MSR_RTIT_CTL, 0) )

Shouldn't you rather remove the MSR from the load list here?

Is any of what you do in this switch() actually legitimate without
hvm_set_vmtrace_pt_size() having got called for the guest? From
remarks elsewhere I imply you expect the param that you currently
use to be set upon domain creation time, but at the very least the
potentially big buffer should imo not get allocated up front, but
only when tracing is to actually be enabled.

Also please have blank lines between individual case blocks.

> +        {
> +            rc = -EFAULT;
> +            goto out;
> +        }
> +
> +        ipt->active = 0;
> +        break;
> +    case HVMOP_vmtrace_ipt_get_offset:
> +        a.offset = ipt->output_mask.offset;
> +        break;
> +    default:
> +        rc = -EOPNOTSUPP;
> +        goto out;
> +    }
> +
> +    rc = -EFAULT;
> +    if ( __copy_to_guest(arg, &a, 1) )
> +      goto out;

Only HVMOP_vmtrace_ipt_get_offset needs this - please avoid copying
back on other paths. Even there you could in principle copy back
just the one field; the function taking XEN_GUEST_HANDLE_PARAM(void)
gets in the way of this, though.

The last line above also has an indentation issue, but the use of
"goto" in this case can be avoided anyway.

> +    rc = 0;
> +
> + out:
> +    domain_unpause(d);
> +    spin_unlock(&d->vmtrace_lock);
> +    rcu_unlock_domain(d);
> +
> +    return rc;
> +}
> +
> +DEFINE_XEN_GUEST_HANDLE(compat_hvm_vmtrace_op_t);

I don't see any use of this handle further down - why do you force
it being declared?

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4624,6 +4624,43 @@ int arch_acquire_resource(struct domain *d, unsigned 
> int type,
>          }
>          break;
>      }
> +
> +    case XENMEM_resource_vmtrace_buf:
> +    {
> +        mfn_t mfn;
> +        unsigned int i;
> +        struct ipt_state *ipt;
> +
> +        if ( id >= d->max_vcpus )
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
> +
> +        ipt = d->vcpu[id]->arch.hvm.vmx.ipt_state;

Please use domain_vcpu() here.

> +        if ( !ipt )
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
> +
> +        mfn = mfn_x(ipt->output_base >> PAGE_SHIFT);

maddr_to_mfn()?

> +        if (nr_frames > ( ( ipt->output_mask.size + 1 ) >> PAGE_SHIFT ))
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
> +
> +        rc = 0;
> +        for ( i = 0; i < nr_frames; i++ )
> +        {
> +            mfn_list[i] = mfn_add(mfn, i);
> +        }

Please omit the braces in cases like this one.

> --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> @@ -104,6 +104,19 @@ struct pi_blocking_vcpu {
>      spinlock_t           *lock;
>  };
>  
> +struct ipt_state {
> +    uint64_t active;
> +    uint64_t status;
> +    uint64_t output_base;
> +    union {
> +        uint64_t raw;
> +        struct {
> +            uint32_t size;
> +            uint32_t offset;
> +        };
> +    } output_mask;
> +};

While referenced ...

>  struct vmx_vcpu {
>      /* Physical address of VMCS. */
>      paddr_t              vmcs_pa;
> @@ -186,6 +199,9 @@ struct vmx_vcpu {
>       * pCPU and wakeup the related vCPU.
>       */
>      struct pi_blocking_vcpu pi_blocking;
> +
> +    /* State of Intel Processor Trace feature */
> +    struct ipt_state     *ipt_state;

... here, the struct declaration itself doesn't really belong in this
header, as not being VMCS-related. The better place likely is vmx.h.

> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -382,6 +382,29 @@ struct xen_hvm_altp2m_op {
>  typedef struct xen_hvm_altp2m_op xen_hvm_altp2m_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_hvm_altp2m_op_t);
>  
> +/* HVMOP_vmtrace: Perform VM tracing related operation */
> +#define HVMOP_vmtrace 26
> +
> +#define HVMOP_VMTRACE_INTERFACE_VERSION 0x00000001

I'm unconvinced we want to introduce yet another versioned interface.
In any event, as hinted at earlier, this suggests it wants to be a
tools-only interface instead (which, at least for the time being, is
not required to be a stable interface then, but that's also something
we apparently want to move away from, and hence you may better not
try to rely on it not needing to be stable).

> +struct xen_hvm_vmtrace_op {
> +    /* IN variable */
> +    uint32_t version;   /* HVMOP_VMTRACE_INTERFACE_VERSION */
> +    uint32_t cmd;
> +/* Enable/disable external vmtrace for given domain */
> +#define HVMOP_vmtrace_ipt_enable      1
> +#define HVMOP_vmtrace_ipt_disable     2
> +#define HVMOP_vmtrace_ipt_get_offset  3
> +    domid_t domain;
> +    uint32_t vcpu;
> +    uint64_t size;
> +
> +    /* OUT variable */
> +    uint64_t offset;

If this is to be a tools-only interface, please use uint64_aligned_t.

You also want to add an entry to xen/include/xlat.lst and use the
resulting macro to prove that the struct layout is the same for
native and compat callers.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -457,6 +457,9 @@ struct domain
>      unsigned    pbuf_idx;
>      spinlock_t  pbuf_lock;
>  
> +    /* Used by vmtrace domctls */
> +    spinlock_t  vmtrace_lock;

There's no domctl invovled here anymore, I think.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.