
Re: [Xen-devel] [PATCH v7 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.



>>> On 08.03.17 at 16:33, <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
> changes in v7: 
>   - Use new ioreq server interface - XEN_DMOP_map_mem_type_to_ioreq_server.
>   - According to comments from George: removed domain_pause/unpause() in
>     hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
>     and we can avoid the: 
>     a> deadlock between p2m lock and ioreq server lock by using these locks
>        in the same order - solved in patch 4;

That is, until patch 4 there is deadlock potential? I think you want to
re-order the patches if so. Or was it that the type can't really be used
until the last patch of the series? (I'm sorry, it's been quite a while
since the previous version.)
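
(Just to spell out the "same order" rule the changelog refers to, purely as a
sketch with placeholder lock names rather than the actual Xen primitives: as
long as every code path nests the two locks in one fixed order, the classic
AB/BA deadlock can't occur.)

    spin_lock(&lock_a);     /* stand-in for the p2m lock, always taken first */
    spin_lock(&lock_b);     /* stand-in for the ioreq server lock, always second */
    /* ... update p2m entries / ioreq server state ... */
    spin_unlock(&lock_b);
    spin_unlock(&lock_a);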

> @@ -365,6 +383,24 @@ static int dm_op(domid_t domid,
>          break;
>      }
>  
> +    case XEN_DMOP_map_mem_type_to_ioreq_server:
> +    {
> +        const struct xen_dm_op_map_mem_type_to_ioreq_server *data =
> +            &op.u.map_mem_type_to_ioreq_server;
> +
> +        rc = -EINVAL;
> +        if ( data->pad )
> +            break;
> +
> +        /* Only support for HAP enabled hvm. */
> +        if ( !hap_enabled(d) )
> +            break;

Perhaps better to give an error other than -EINVAL in this case?
If so, then the same error should likely also be used in your
set_mem_type() addition.
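
For example (the exact value is of course up to you; -EOPNOTSUPP is merely
one candidate):

        rc = -EINVAL;
        if ( data->pad )
            break;

        /* Only support for HAP enabled hvm. */
        rc = -EOPNOTSUPP;
        if ( !hap_enabled(d) )
            break;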

> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -99,6 +99,7 @@ static int hvmemul_do_io(
>      uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
>  {
>      struct vcpu *curr = current;
> +    struct domain *currd = curr->domain;
>      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
>      ioreq_t p = {
>          .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
> @@ -140,7 +141,7 @@ static int hvmemul_do_io(
>               (p.dir != dir) ||
>               (p.df != df) ||
>               (p.data_is_ptr != data_is_addr) )
> -            domain_crash(curr->domain);
> +            domain_crash(currd);

If you mean to do this transformation here, then please do so
consistently for the entire function.

> @@ -177,8 +178,65 @@ static int hvmemul_do_io(
>          break;
>      case X86EMUL_UNHANDLEABLE:
>      {
> -        struct hvm_ioreq_server *s =
> -            hvm_select_ioreq_server(curr->domain, &p);
> +        /*
> +         * Xen isn't emulating the instruction internally, so see if
> +         * there's an ioreq server that can handle it. Rules:
> +         *
> +         * - PIO and "normal" mmio run through hvm_select_ioreq_server()
> +         * to choose the ioreq server by range. If no server is found,
> +         * the access is ignored.
> +         *
> +         * - p2m_ioreq_server accesses are handled by the current
> +         * ioreq_server for the domain, but there are some corner
> +         * cases:

Who or what is "the current ioreq_server for the domain"?

> +         *   - If the domain ioreq_server is NULL, assume this is a
> +         *   race between the unbinding of ioreq server and guest fault
> +         *   so re-try the instruction.
> +         *
> +         *   - If the IOREQ_MEM_ACCESS_WRITE flag is not set, treat it
> +         *   like a normal PIO or MMIO that doesn't have an ioreq
> +         *   server (i.e., by ignoring it).
> +         */
> +        struct hvm_ioreq_server *s = NULL;
> +        p2m_type_t p2mt = p2m_invalid;
> +
> +        if ( is_mmio )
> +        {
> +            unsigned long gmfn = paddr_to_pfn(addr);
> +
> +            (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);

Stray cast.
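I.e. simply

            get_gfn_query_unlocked(currd, gmfn, &p2mt);

- the (void) doesn't add anything here.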

> +            if ( p2mt == p2m_ioreq_server )
> +            {
> +                unsigned int flags;
> +
> +                s = p2m_get_ioreq_server(currd, &flags);
> +
> +                /*
> +                 * If p2mt is ioreq_server but ioreq_server is NULL,
> +                 * we probably lost a race with unbinding of ioreq
> +                 * server, just retry the access.
> +                 */
> +                if ( s == NULL )
> +                {
> +                    rc = X86EMUL_RETRY;
> +                    vio->io_req.state = STATE_IOREQ_NONE;
> +                    break;
> +                }
> +
> +                /*
> +                 * If the IOREQ_MEM_ACCESS_WRITE flag is not set,
> +                 * we should set s to NULL, and just ignore such
> +                 * access.
> +                 */
> +                if ( !(flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE) )
> +                    s = NULL;

What is this about? You only allow WRITE registrations, so this looks
to be dead code. Yet if it is meant to guard against future enabling
of READ, then this clearly should not be done for reads.
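
If the latter is the intent, then - just as a sketch, assuming a
XEN_DMOP_IOREQ_MEM_ACCESS_READ flag would be introduced alongside - something
keyed to the access direction would seem more appropriate:

                /*
                 * Drop the server only when the flag matching the access
                 * direction is clear, so a future READ registration keeps
                 * working for reads.
                 */
                if ( !(flags & (dir == IOREQ_WRITE
                                ? XEN_DMOP_IOREQ_MEM_ACCESS_WRITE
                                : XEN_DMOP_IOREQ_MEM_ACCESS_READ)) )
                    s = NULL;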

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>              entry->r = entry->w = entry->x = 1;
>              entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>              break;
> +        case p2m_ioreq_server:
> +            entry->r = 1;
> +            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);

Along the lines of the previous comment - if you mean to have the
code cope with READ, please also set ->r accordingly, or add a
comment why this won't have the intended effect (yielding a not
present EPTE).
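
As a sketch of what I mean (again assuming a READ flag existed):

        case p2m_ioreq_server:
            entry->r = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_READ);
            entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
            /*
             * Note that ->r clear with ->w set isn't a valid EPT combination,
             * and ->r = ->w = 0 merely yields a not-present entry rather than
             * one trapping both reads and writes - hence the request for a
             * comment.
             */
            break;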

> @@ -92,8 +94,13 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
>      default:
>          return flags | _PAGE_NX_BIT;
>      case p2m_grant_map_ro:
> -    case p2m_ioreq_server:
>          return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
> +    case p2m_ioreq_server:
> +        flags |= P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
> +        if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
> +            return flags & ~_PAGE_RW;
> +        else
> +            return flags;

Stray else. But even better would imo be

        if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
            flags &= ~_PAGE_RW;
        return flags;

> +struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
> +                                              unsigned int *flags)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    struct hvm_ioreq_server *s;
> +
> +    spin_lock(&p2m->ioreq.lock);
> +
> +    s = p2m->ioreq.server;
> +    *flags = p2m->ioreq.flags;
> +
> +    spin_unlock(&p2m->ioreq.lock);
> +    return s;
> +}

I'm afraid this question was asked before, but since there's no
comment here or anywhere, I can't recall if there was a reason why
s potentially being stale by the time the caller looks at it is not a
problem.
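
To make the concern concrete - the caller in hvmemul_do_io() ends up doing
roughly (sketch):

        s = p2m_get_ioreq_server(currd, &flags);
        /*
         * The lock was dropped inside p2m_get_ioreq_server(), so nothing
         * prevents the server from being unmapped (and potentially torn
         * down) before it gets used here.
         */
        rc = hvm_send_ioreq(s, &p, 0);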

> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -318,6 +318,30 @@ struct xen_dm_op_inject_msi {
>      uint64_aligned_t addr;
>  };
>  
> +/*
> + * XEN_DMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
> + *                                      to specific memory type <type>
> + *                                      for specific accesses <flags>
> + *
> + * For now, flags only accept the value of XEN_DMOP_IOREQ_MEM_ACCESS_WRITE,
> + * which means only write operations are to be forwarded to an ioreq server.
> + * Support for the emulation of read operations can be added when an ioreq
> + * server has such a requirement in the future.
> + */
> +#define XEN_DMOP_map_mem_type_to_ioreq_server 15
> +
> +struct xen_dm_op_map_mem_type_to_ioreq_server {
> +    ioservid_t id;      /* IN - ioreq server id */
> +    uint16_t type;      /* IN - memory type */
> +    uint16_t pad;

Perhaps padding was needed when this was still an hvmop, but now the
padding does exactly the wrong thing.
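
I.e., assuming a uint32_t flags field follows the pad (the quote above stops
short of it), simply dropping the pad would leave everything naturally
aligned - sketch only:

    struct xen_dm_op_map_mem_type_to_ioreq_server {
        ioservid_t id;      /* IN - ioreq server id */
        uint16_t type;      /* IN - memory type */
        uint32_t flags;     /* IN - types of accesses to forward, e.g.
                               XEN_DMOP_IOREQ_MEM_ACCESS_WRITE */
    };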

Jan
