[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 07/15] argo: implement the register op



On Tue, Jan 22, 2019 at 1:59 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Mon, Jan 21, 2019 at 01:59:47AM -0800, Christopher Clark wrote:
> > The register op is used by a domain to register a region of memory for
> > receiving messages from either a specified other domain, or, if specifying a
> > wildcard, any domain.
> >
> > This operation creates a mapping within Xen's private address space that
> > will remain resident for the lifetime of the ring. In subsequent commits,
> > the hypervisor will use this mapping to copy data from a sending domain into
> > this registered ring, making it accessible to the domain that registered the
> > ring to receive data.
> >
> > Wildcard any-sender rings are default disabled and registration will be
> > refused with EPERM unless they have been specifically enabled with the
> > new mac-permissive flag that is added to the argo boot option here. The
> > reason why the default for wildcard rings is 'deny' is that there is
> > currently no means to protect the ring from DoS by a noisy domain
> > spamming the ring, affecting other domains ability to send to it. This
> > will be addressed with XSM policy controls in subsequent work.
> >
> > Since denying access to any-sender rings is a significant functional
> > constraint, the new option "mac-permissive" for the argo bootparam
> > enables overriding this. eg: "argo=1,mac-permissive=1"
> >
> > The p2m type of the memory supplied by the guest for the ring must be
> > p2m_ram_rw and the memory will be pinned as PGT_writable_page while the ring
> > is registered.
> >
> > xen_argo_gfn_t type is defined and is 64-bit on all architectures which
> > assists with avoiding the need for compat code to translate hypercall args.
> > This hypercall op and its interface currently only supports 4K-sized pages.
> >
> > Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx>
>
> Reviewed-by: Roger Pau Mooné <roger.pau@xxxxxxxxxx>
>
> Just some nits that can be taken care of later.
>
> > +static int
> > +find_ring_mfns(struct domain *d, struct argo_ring_info *ring_info,
> > +               const unsigned int npage,
> > +               XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
> > +               const unsigned int len)
> > +{
> > +    unsigned int i;
> > +    int ret = 0;
> > +    mfn_t *mfns;
> > +    void **mfn_mapping;
> > +
> > +    ASSERT(LOCKING_Write_rings_L2(d));
> > +
> > +    if ( ring_info->mfns )
> > +    {
> > +        /* Ring already existed: drop the previous mapping. */
> > +        gprintk(XENLOG_INFO, "argo: vm%u re-register existing ring "
> > +                "(vm%u:%x vm%u) clears mapping\n",
> > +                d->domain_id, ring_info->id.domain_id,
> > +                ring_info->id.aport, ring_info->id.partner_id);
> > +
> > +        ring_remove_mfns(d, ring_info);
> > +        ASSERT(!ring_info->mfns);
> > +    }
> > +
> > +    mfns = xmalloc_array(mfn_t, npage);
> > +    if ( !mfns )
> > +        return -ENOMEM;
> > +
> > +    for ( i = 0; i < npage; i++ )
> > +        mfns[i] = INVALID_MFN;
> > +
> > +    mfn_mapping = xzalloc_array(void *, npage);
> > +    if ( !mfn_mapping )
> > +    {
> > +        xfree(mfns);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    ring_info->mfns = mfns;
> > +    ring_info->mfn_mapping = mfn_mapping;
> > +
> > +    for ( i = 0; i < npage; i++ )
> > +    {
> > +        xen_argo_gfn_t argo_gfn;
> > +        mfn_t mfn;
> > +
> > +        ret = __copy_from_guest_offset(&argo_gfn, gfn_hnd, i, 1) ? -EFAULT 
> > : 0;
> > +        if ( ret )
> > +            break;
> > +
> > +        ret = find_ring_mfn(d, _gfn(argo_gfn), &mfn);
> > +        if ( ret )
> > +        {
> > +            gprintk(XENLOG_ERR, "argo: vm%u: invalid gfn %"PRI_gfn" "
> > +                    "r:(vm%u:%x vm%u) %p %u/%u\n",
> > +                    d->domain_id, gfn_x(_gfn(argo_gfn)),
> > +                    ring_info->id.domain_id, ring_info->id.aport,
> > +                    ring_info->id.partner_id, ring_info, i, npage);
> > +            break;
> > +        }
> > +
> > +        ring_info->mfns[i] = mfn;
> > +
> > +        argo_dprintk("%u: %"PRI_gfn" -> %"PRI_mfn"\n",
> > +                     i, gfn_x(_gfn(argo_gfn)), mfn_x(ring_info->mfns[i]));
> > +    }
> > +
> > +    ring_info->nmfns = i;
> > +
> > +    if ( ret )
> > +        ring_remove_mfns(d, ring_info);
> > +    else
> > +    {
> > +        ASSERT(ring_info->nmfns == NPAGES_RING(len));
> > +
> > +        gprintk(XENLOG_DEBUG, "argo: vm%u ring (vm%u:%x vm%u) %p "
>
> Nit: this likely wants to be an argo_dprintk?

There are not many instances in the Argo code where gprintk(XENLOG_DEBUG
is used, but it's intentional here, because argo_dprintk needs a
recompile to enable it, whereas gprintk does not.

Ring registration is non-datapath and the message is potentially useful
when diagnosing a deployed system.

>
> > +                "mfn_mapping %p len %u nmfns %u\n",
> > +                d->domain_id, ring_info->id.domain_id,
> > +                ring_info->id.aport, ring_info->id.partner_id, ring_info,
> > +                ring_info->mfn_mapping, ring_info->len, ring_info->nmfns);
> > +    }
> > +
> > +    return ret;
> > +}
> > +
> > +static long
> > +register_ring(struct domain *currd,
> > +              XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd,
> > +              XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd,
> > +              unsigned int npage, bool fail_exist)
> > +{
> > +    xen_argo_register_ring_t reg;
> > +    struct argo_ring_id ring_id;
> > +    void *map_ringp;
> > +    xen_argo_ring_t *ringp;
> > +    struct argo_ring_info *ring_info, *new_ring_info = NULL;
> > +    struct argo_send_info *send_info = NULL;
> > +    struct domain *dst_d = NULL;
> > +    int ret = 0;
> > +    unsigned int private_tx_ptr;
> > +
> > +    ASSERT(currd == current->domain);
> > +
> > +    if ( copy_from_guest(&reg, reg_hnd, 1) )
> > +        return -EFAULT;
> > +
> > +    /*
> > +     * A ring must be large enough to transmit messages, so requires space 
> > for:
> > +     * * 1 message header, plus
> > +     * * 1 payload slot (payload is always rounded to a multiple of 16 
> > bytes)
> > +     *   for the message payload to be written into, plus
> > +     * * 1 more slot, so that the ring cannot be filled to capacity with a
> > +     *   single minimum-size message -- see the logic in ringbuf_insert --
> > +     *   allowing for this ensures that there can be space remaining when a
> > +     *   message is present.
> > +     * The above determines the minimum acceptable ring size.
> > +     */
> > +    if ( (reg.len < (sizeof(struct xen_argo_ring_message_header)
> > +                      + ROUNDUP_MESSAGE(1) + ROUNDUP_MESSAGE(1))) ||
> > +         (reg.len > XEN_ARGO_MAX_RING_SIZE) ||
> > +         (reg.len != ROUNDUP_MESSAGE(reg.len)) ||
> > +         (NPAGES_RING(reg.len) != npage) ||
> > +         (reg.pad != 0) )
> > +        return -EINVAL;
> > +
> > +    ring_id.partner_id = reg.partner_id;
> > +    ring_id.aport = reg.aport;
> > +    ring_id.domain_id = currd->domain_id;
> > +
> > +    if ( reg.partner_id == XEN_ARGO_DOMID_ANY )
> > +    {
> > +        if ( !opt_argo_mac_permissive )
> > +            return -EPERM;
> > +    }
> > +    else
> > +    {
> > +        dst_d = get_domain_by_id(reg.partner_id);
> > +        if ( !dst_d )
> > +        {
> > +            argo_dprintk("!dst_d, ESRCH\n");
> > +            return -ESRCH;
> > +        }
> > +
> > +        send_info = xzalloc(struct argo_send_info);
> > +        if ( !send_info )
> > +        {
> > +            ret = -ENOMEM;
> > +            goto out;
> > +        }
> > +        send_info->id = ring_id;
> > +    }
> > +
> > +    /*
> > +     * Common case is that the ring doesn't already exist, so do the alloc 
> > here
> > +     * before picking up any locks.
> > +     */
> > +    new_ring_info = xzalloc(struct argo_ring_info);
> > +    if ( !new_ring_info )
> > +    {
> > +        ret = -ENOMEM;
> > +        goto out;
> > +    }
> > +
> > +    read_lock(&L1_global_argo_rwlock);
> > +
> > +    if ( !currd->argo )
> > +    {
> > +        ret = -ENODEV;
> > +        goto out_unlock;
> > +    }
> > +
> > +    if ( dst_d && !dst_d->argo )
> > +    {
> > +        argo_dprintk("!dst_d->argo, ECONNREFUSED\n");
> > +        ret = -ECONNREFUSED;
> > +        goto out_unlock;
> > +    }
> > +
> > +    write_lock(&currd->argo->rings_L2_rwlock);
> > +
> > +    if ( currd->argo->ring_count >= MAX_RINGS_PER_DOMAIN )
> > +    {
> > +        ret = -ENOSPC;
> > +        goto out_unlock2;
> > +    }
> > +
> > +    ring_info = find_ring_info(currd, &ring_id);
> > +    if ( !ring_info )
> > +    {
> > +        ring_info = new_ring_info;
> > +        new_ring_info = NULL;
> > +
> > +        spin_lock_init(&ring_info->L3_lock);
> > +
> > +        ring_info->id = ring_id;
> > +        INIT_LIST_HEAD(&ring_info->pending);
> > +
> > +        list_add(&ring_info->node,
> > +                 &currd->argo->ring_hash[hash_index(&ring_info->id)]);
> > +
> > +        gprintk(XENLOG_DEBUG, "argo: vm%u registering ring (vm%u:%x 
> > vm%u)\n",
> > +                currd->domain_id, ring_id.domain_id, ring_id.aport,
> > +                ring_id.partner_id);
> > +    }
> > +    else if ( ring_info->len )
> > +    {
> > +        /*
> > +         * If the caller specified that the ring must not already exist,
> > +         * fail at attempt to add a completed ring which already exists.
> > +         */
> > +        if ( fail_exist )
> > +        {
> > +            argo_dprintk("disallowed reregistration of existing ring\n");
>
> And this should likely be gprintk with error type?

ack, yes, thanks.

>
> I think the pattern of using gprintk for error messages and
> argo_dprintk for verbose information is correct, but there are a
> couple of oddities that can be fixed later.
>
> > +            ret = -EEXIST;
> > +            goto out_unlock2;
> > +        }
> > +
> > +        if ( ring_info->len != reg.len )
> > +        {
> > +            /*
> > +             * Change of ring size could result in entries on the pending
> > +             * notifications list that will never trigger.
> > +             * Simple blunt solution: disallow ring resize for now.
> > +             * TODO: investigate enabling ring resize.
> > +             */
> > +            gprintk(XENLOG_ERR, "argo: vm%u attempted to change ring size "
> > +                    "(vm%u:%x vm%u)\n",
> > +                    currd->domain_id, ring_id.domain_id, ring_id.aport,
> > +                    ring_id.partner_id);
> > +            /*
> > +             * Could return EINVAL here, but if the ring didn't already
> > +             * exist then the arguments would have been valid, so: EEXIST.
> > +             */
> > +            ret = -EEXIST;
> > +            goto out_unlock2;
> > +        }
> > +
> > +        gprintk(XENLOG_DEBUG,
> > +                "argo: vm%u re-registering existing ring (vm%u:%x vm%u)\n",
> > +                currd->domain_id, ring_id.domain_id, ring_id.aport,
> > +                ring_id.partner_id);
>
> This again would better be argo_dprintk IMO.

similar to above: message useful when deployed, not just for argo development.

>
> [...]
> > @@ -552,6 +987,38 @@ do_argo_op(unsigned int cmd, 
> > XEN_GUEST_HANDLE_PARAM(void) arg1,
> >
> >      switch (cmd)
> >      {
> > +    case XEN_ARGO_OP_register_ring:
> > +    {
> > +        XEN_GUEST_HANDLE_PARAM(xen_argo_register_ring_t) reg_hnd =
> > +            guest_handle_cast(arg1, xen_argo_register_ring_t);
> > +        XEN_GUEST_HANDLE_PARAM(xen_argo_gfn_t) gfn_hnd =
> > +            guest_handle_cast(arg2, xen_argo_gfn_t);
> > +        /* arg3 is npage */
> > +        /* arg4 is flags */
> > +        bool fail_exist = arg4 & XEN_ARGO_REGISTER_FLAG_FAIL_EXIST;
>
> Nit: I would add a:
>
> BUILD_BUG_ON(!IS_ALIGNED(XEN_ARGO_MAX_RING_SIZE, PAGE_SIZE));

ack

thanks,

Christopher

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.