
Re: [Xen-devel] [PATCH v3 2/2] ioreq-server: write protected range and forwarding



> -----Original Message-----
> From: Ye, Wei [mailto:wei.ye@xxxxxxxxx]
> Sent: 05 September 2014 01:44
> To: Kevin Tian; Paul Durrant; xen-devel@xxxxxxxxxxxxx
> Cc: JBeulich@xxxxxxxx; Ian Campbell; Ian Jackson; Stefano Stabellini;
> Dugger, Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@xxxxxxxxxx; Keir
> (Xen.org); Tim (Xen.org)
> Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> forwarding
> 
> 
> 
> > -----Original Message-----
> > From: Tian, Kevin
> > Sent: Friday, September 5, 2014 7:11 AM
> > To: Paul Durrant; Ye, Wei; xen-devel@xxxxxxxxxxxxx
> > Cc: JBeulich@xxxxxxxx; Ian Campbell; Ian Jackson; Stefano Stabellini;
> Dugger,
> > Donald D; Zhang, Yang Z; Lv, Zhiyuan; konrad.wilk@xxxxxxxxxx; Keir
> (Xen.org);
> > Tim (Xen.org)
> > Subject: RE: [PATCH v3 2/2] ioreq-server: write protected range and
> > forwarding
> >
> > > From: Paul Durrant [mailto:Paul.Durrant@xxxxxxxxxx]
> > > Sent: Wednesday, September 03, 2014 3:12 AM
> > >
> > > > -----Original Message-----
> > > > From: Wei Ye [mailto:wei.ye@xxxxxxxxx]
> > > > Sent: 03 September 2014 22:53
> > > > To: xen-devel@xxxxxxxxxxxxx
> > > > Cc: JBeulich@xxxxxxxx; Ian Campbell; Paul Durrant; Ian Jackson;
> > > > Stefano Stabellini; donald.d.dugger@xxxxxxxxx; Kevin Tian;
> > > > yang.z.zhang@xxxxxxxxx; zhiyuan.lv@xxxxxxxxx;
> > > > konrad.wilk@xxxxxxxxxx; Keir (Xen.org); Tim (Xen.org); Wei Ye
> > > > Subject: [PATCH v3 2/2] ioreq-server: write protected range and
> > > > forwarding
> > > >
> > > > Extend the ioreq-server to write protect pages and forward the write
> > > > access to the corresponding device model.
> > > >
> > >
> > > Using rangesets did make the patch somewhat smaller then :-)
> > >
> > > > Signed-off-by: Wei Ye <wei.ye@xxxxxxxxx>
> > > > ---
> > > >  tools/libxc/xc_domain.c          |   33
> > > +++++++++++++++++++++++++++
> > > >  tools/libxc/xenctrl.h            |   18 +++++++++++++++
> > > >  xen/arch/x86/hvm/hvm.c           |   46
> > > > ++++++++++++++++++++++++++++++++++++++
> > > >  xen/include/asm-x86/hvm/domain.h |    2 +-
> > > >  xen/include/public/hvm/hvm_op.h  |    1 +
> > > >  5 files changed, 99 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c index
> > > > 37ed141..34e6309 100644
> > > > --- a/tools/libxc/xc_domain.c
> > > > +++ b/tools/libxc/xc_domain.c
> > > > @@ -1485,6 +1485,39 @@ int
> > > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> domid_t
> > > > domid,
> > > >      return rc;
> > > >  }
> > > >
> > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > domid_t
> > > > domid,
> > > > +                                        ioservid_t id, uint16_t set,
> > > > +                                        uint64_t start, uint64_t
> > > end)
> > > > +{
> > >
> > > Given what I said about the notion of MMIO ranges that need read
> > > emulation but no write emulation, and the ranges you're concerned
> > > about which require write emulation but no read emulation, I wonder
> > > whether this could collapse into the existing
> > > xc_hvm_map_io_range_to_ioreq_server() by adding an extra parameter
> > > to say whether, for a given range, you want read emulation, write
> > > emulation or both.
> >
> > Agree. Adding a parameter to indicate r/w permission seems cleaner.
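
(To make that concrete: a rough sketch of what the collapsed libxc call could
look like if it grew an access argument. The xc_ioreq_range_access values and
the extra parameter below are invented purely for illustration; they are not
part of the current interface, and rc/start/end are assumed to be in scope.)

/* Hypothetical access selector for a range - names made up here. */
typedef enum {
    XC_IOREQSVR_ACCESS_READ  = 1,
    XC_IOREQSVR_ACCESS_WRITE = 2,
    XC_IOREQSVR_ACCESS_BOTH  = 3,
} xc_ioreq_range_access;

int xc_hvm_map_io_range_to_ioreq_server(xc_interface *xch, domid_t domid,
                                        ioservid_t id, int is_mmio,
                                        uint64_t start, uint64_t end,
                                        xc_ioreq_range_access access);

/* A page range needing write emulation only would then be registered as: */
rc = xc_hvm_map_io_range_to_ioreq_server(xch, domid, id, 1 /* MMIO */,
                                         start, end,
                                         XC_IOREQSVR_ACCESS_WRITE);
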
> >
> > >
> > > > +    DECLARE_HYPERCALL;
> > > > +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
> > > > +    int rc = -1;
> > > > +
> > > > +    if ( arg == NULL )
> > > > +    {
> > > > +        errno = -EFAULT;
> > > > +        return -1;
> > > > +    }
> > > > +
> > > > +    hypercall.op = __HYPERVISOR_hvm_op;
> > > > +    if ( set )
> > > > +        hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
> > > > +    else
> > > > +        hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
> > > > +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
> > > > +
> > > > +    arg->domid = domid;
> > > > +    arg->id = id;
> > > > +    arg->type = HVMOP_IO_RANGE_WP;
> > > > +    arg->start = start;
> > > > +    arg->end = end;
> > > > +
> > > > +    rc = do_xen_hypercall(xch, &hypercall);
> > > > +
> > > > +    xc_hypercall_buffer_free(xch, arg);
> > > > +    return rc;
> > > > +}
> > > > +
> > > >  int xc_hvm_destroy_ioreq_server(xc_interface *xch,
> > > >                                  domid_t domid,
> > > >                                  ioservid_t id) diff --git
> > > > a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h index
> > > > b55d857..b261602 100644
> > > > --- a/tools/libxc/xenctrl.h
> > > > +++ b/tools/libxc/xenctrl.h
> > > > @@ -1943,6 +1943,24 @@ int
> > > > xc_hvm_unmap_pcidev_from_ioreq_server(xc_interface *xch,
> > > >                                            uint8_t function);
> > > >
> > > >  /**
> > > > + * This function registers/deregisters a set of pages to be write
> > protected.
> > > > + *
> > > > + * @parm xch a handle to an open hypervisor interface.
> > > > + * @parm domid the domain id to be serviced
> > > > + * @parm id the IOREQ Server id.
> > > > + * @parm set whether registers or deregisters: 1 for register, 0
> > > > + for
> > > > deregister
> > > > + * @parm start start of range
> > > > + * @parm end end of range (inclusive).
> > > > + * @return 0 on success, -1 on failure.
> > > > + */
> > > > +int xc_hvm_map_wp_range_to_ioreq_server(xc_interface *xch,
> > > > +                                        domid_t domid,
> > > > +                                        ioservid_t id,
> > > > +                                        uint16_t set,
> > > > +                                        uint64_t start,
> > > > +                                        uint64_t end);
> > > > +
> > > > +/**
> > > >   * This function destroys an IOREQ Server.
> > > >   *
> > > >   * @parm xch a handle to an open hypervisor interface.
> > > > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
> > > > adc0f93..f8c4db8 100644
> > > > --- a/xen/arch/x86/hvm/hvm.c
> > > > +++ b/xen/arch/x86/hvm/hvm.c
> > > > @@ -853,6 +853,7 @@ static int
> > > > hvm_ioreq_server_alloc_rangesets(struct
> > > > hvm_ioreq_server *s,
> > > >                        (i == HVMOP_IO_RANGE_PORT) ? "port" :
> > > >                        (i == HVMOP_IO_RANGE_MEMORY) ?
> > > "memory" :
> > > >                        (i == HVMOP_IO_RANGE_PCI) ? "pci" :
> > > > +                      (i == HVMOP_IO_RANGE_WP) ? "write
> > > protection" :
> > >
> > > Continuing along my train of thought from above, perhaps the rangesets
> > > should now be
> > >
> > > HVMOP_IO_RANGE_PORT_READ
> > > HVMOP_IO_RANGE_PORT_WRITE
> > > HVMOP_IO_RANGE_MEMORY_READ
> > > HVMOP_IO_RANGE_MEMORY_WRITE
> > > HVMOP_IO_RANGE_PCI
> >
> > This looks cleaner. Once we start considering the permission, WP becomes
> > just an attribute which can be attached to either a port or a memory
> > resource, so it is better not to define a new type here.
> >
> 
> Do we also need a type such as HVMOP_IO_RANGE_MEMORY_BOTH? Otherwise, to
> get both read and write emulation, the user has to call
> xc_hvm_map_io_range_to_ioreq_server() twice: once with
> HVMOP_IO_RANGE_MEMORY_READ and once with HVMOP_IO_RANGE_MEMORY_WRITE.
> 

That's true, so messing with the types directly is probably not the correct
answer. Instead, break the direct association between type and rangeset index;
hvmop_map_io_range_to_ioreq_server() can then call
hvm_map_io_range_to_ioreq_server() multiple times if necessary.
You could declare the array in struct hvm_ioreq_server as:

struct rangeset *range[NR_IO_RANGE_TYPES * 2];

and then calculate the index as something like:

if (flags & READ)
  index = type;

if (flags & WRITE)
  index = type + NR_IO_RANGE_TYPES;

Yes, it does mean doubling the number of rangesets if you want both read and
write, but that's a very small amount of extra memory. If necessary it could be
avoided by having a third set of rangesets for READ|WRITE, but that would mean
potentially having to do two lookups in hvm_select_ioreq_server().
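
Spelled out a little more, the hvmop dispatch could end up looking something
like this (only a sketch: the 'flags' field and the flag names are invented,
error unwinding is omitted, and hvm_map_io_range_to_ioreq_server() is assumed
to take a rangeset index rather than a type):

/* Hypothetical flag bits carried in xen_hvm_io_range_t. */
#define IOREQSVR_RANGE_READ   (1u << 0)
#define IOREQSVR_RANGE_WRITE  (1u << 1)

/* In hvmop_map_io_range_to_ioreq_server(), after copying in 'op': */
rc = 0;

if ( op.flags & IOREQSVR_RANGE_READ )
    /* read-emulated ranges live at index 'type' */
    rc = hvm_map_io_range_to_ioreq_server(d, op.id, op.type,
                                          op.start, op.end);

if ( rc == 0 && (op.flags & IOREQSVR_RANGE_WRITE) )
    /* write-emulated ranges live at index 'type + NR_IO_RANGE_TYPES' */
    rc = hvm_map_io_range_to_ioreq_server(d, op.id,
                                          op.type + NR_IO_RANGE_TYPES,
                                          op.start, op.end);

hvm_select_ioreq_server() would then look in range[type] for read accesses and
in range[type + NR_IO_RANGE_TYPES] for write accesses, depending on the
direction of the request.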

  Paul


> Wei.
> 
> > for now READ can simply fail until we see a real usage.
> >
> > One side-effect, though, is that for the default r/w emulated case we would
> > double the rangesets here.
> >
> > >
> > > >                        "");
> > > >          if ( rc )
> > > >              goto fail;
> > > > @@ -1138,6 +1139,37 @@ static int hvm_get_ioreq_server_info(struct
> > > > domain *d, ioservid_t id,
> > > >      return rc;
> > > >  }
> > > >
> > > > +static int hvm_change_p2m_type_ioreq_server(struct domain *d,
> > > > +uint16_t
> > > > set,
> > > > +                                            uint64_t start,
> > > uint64_t end)
> > > > +{
> > > > +    int rc = -EINVAL;
> > > > +    uint64_t gpfn_s, gpfn_e, gpfn;
> > > > +    p2m_type_t ot, nt;
> > > > +
> > > > +    if ( set )
> > > > +    {
> > > > +        ot = p2m_ram_rw;
> > > > +        nt = p2m_mmio_write_dm;
> > > > +    }
> > > > +    else
> > > > +    {
> > > > +        ot = p2m_mmio_write_dm;
> > > > +        nt = p2m_ram_rw;
> > > > +    }
> > > > +
> > > > +    gpfn_s = start >> PAGE_SHIFT;
> > > > +    gpfn_e = end >> PAGE_SHIFT;
> > > > +
> > > > +    for ( gpfn = gpfn_s; gpfn <= gpfn_e; gpfn++)
> > > > +    {
> > > > +        rc = p2m_change_type_one(d, gpfn, ot, nt);
> > > > +        if ( !rc )
> > > > +            break;
> > > > +    }
> > > > +
> > > > +    return rc;
> > > > +}
> > > > +
> > > >  static int hvm_map_io_range_to_ioreq_server(struct domain *d,
> > > ioservid_t
> > > > id,
> > > >                                              uint32_t type,
> > > uint64_t start, uint64_t end)
> > > >  {
> > > > @@ -1163,6 +1195,7 @@ static int
> > > > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > > >              case HVMOP_IO_RANGE_PORT:
> > > >              case HVMOP_IO_RANGE_MEMORY:
> > > >              case HVMOP_IO_RANGE_PCI:
> > > > +            case HVMOP_IO_RANGE_WP:
> > > >                  r = s->range[type];
> > > >                  break;
> > > >
> > > > @@ -1180,6 +1213,10 @@ static int
> > > > hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
> > > >                  break;
> > > >
> > > >              rc = rangeset_add_range(r, start, end);
> > > > +
> > > > +            if ( type == HVMOP_IO_RANGE_WP )
> > > > +                rc = hvm_change_p2m_type_ioreq_server(d, 1, start,
> > > end);
> > > > +
> > >
> > > Here, for memory ranges, you'd then change the p2m types to
> > > p2m_mmio_write_dm, p2m_mmio_read_dm or p2m_mmio_dm depending on
> > > whether the range was being added to the MEMORY_WRITE set, the
> > > MEMORY_READ set or both. Obviously you'd need to be careful about
> > > ranges with distinct semantics that fall into the same page - but that
> > > could be dealt with. E.g. if you had two ranges in the same page, one
> > > needing full emulation and one needing write-only emulation, then you
> > > could set the p2m type to p2m_mmio_dm and handle the read emulation
> > > for the second range directly in hvm_send_assist_req_to_ioreq_server().
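
(Purely as an illustration of the above - p2m_mmio_read_dm does not exist
today, and whether p2m_change_type_one() is the right primitive for the
'both' case is glossed over:)

/* For each gfn in a newly added memory range, pick the p2m type from the
 * sets the range belongs to; in_read_set/in_write_set stand in for the
 * corresponding rangeset lookups. */
p2m_type_t nt;

if ( in_read_set && in_write_set )
    nt = p2m_mmio_dm;          /* both reads and writes trap to the DM */
else if ( in_write_set )
    nt = p2m_mmio_write_dm;    /* reads are satisfied from RAM */
else
    nt = p2m_mmio_read_dm;     /* hypothetical read-only-emulation type */

rc = p2m_change_type_one(d, gfn, p2m_ram_rw, nt);
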
> >
> > I prefer not to introduce such complexity now. Just focus on the existing
> > usages, i.e. either rw-emulation or w-emulation. We can reserve the type in
> > the hypercall path, but p2m_mmio_read_dm can wait until a real usage is
> > required.
> >
> > >
> > >   Paul
> > >
> > > >              break;
> > > >          }
> > > >      }
> > > > @@ -1214,6 +1251,7 @@ static int
> > > > hvm_unmap_io_range_from_ioreq_server(struct domain *d,
> ioservid_t
> > id,
> > > >              case HVMOP_IO_RANGE_PORT:
> > > >              case HVMOP_IO_RANGE_MEMORY:
> > > >              case HVMOP_IO_RANGE_PCI:
> > > > +            case HVMOP_IO_RANGE_WP:
> > > >                  r = s->range[type];
> > > >                  break;
> > > >
> > > > @@ -1231,6 +1269,10 @@ static int
> > > > hvm_unmap_io_range_from_ioreq_server(struct domain *d,
> ioservid_t
> > id,
> > > >                  break;
> > > >
> > > >              rc = rangeset_remove_range(r, start, end);
> > > > +
> > > > +            if ( type == HVMOP_IO_RANGE_WP )
> > > > +                rc = hvm_change_p2m_type_ioreq_server(d, 0, start,
> > > end);
> > > > +
> > > >              break;
> > > >          }
> > > >      }
> > > > @@ -2361,6 +2403,10 @@ static struct hvm_ioreq_server
> > > > *hvm_select_ioreq_server(struct domain *d,
> > > >              if ( rangeset_contains_range(r, addr, end) )
> > > >                  return s;
> > > >
> > > > +            r = s->range[HVMOP_IO_RANGE_WP];
> > > > +            if ( rangeset_contains_range(r, addr, end) )
> > > > +                return s;
> > > > +
> > > >              break;
> > > >          case IOREQ_TYPE_PCI_CONFIG:
> > > >              if ( rangeset_contains_singleton(r, addr >> 32) ) diff
> > > > --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-
> > > > x86/hvm/domain.h index 291a2e0..b194f9a 100644
> > > > --- a/xen/include/asm-x86/hvm/domain.h
> > > > +++ b/xen/include/asm-x86/hvm/domain.h
> > > > @@ -48,7 +48,7 @@ struct hvm_ioreq_vcpu {
> > > >      evtchn_port_t    ioreq_evtchn;
> > > >  };
> > > >
> > > > -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
> > > > +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP + 1)
> > > >  #define MAX_NR_IO_RANGES  256
> > > >
> > > >  struct hvm_ioreq_server {
> > > > diff --git a/xen/include/public/hvm/hvm_op.h
> > > > b/xen/include/public/hvm/hvm_op.h index eeb0a60..aedb97f 100644
> > > > --- a/xen/include/public/hvm/hvm_op.h
> > > > +++ b/xen/include/public/hvm/hvm_op.h
> > > > @@ -323,6 +323,7 @@ struct xen_hvm_io_range {
> > > >  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
> > > >  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
> > > >  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func
> > > range */
> > > > +# define HVMOP_IO_RANGE_WP     3 /* Write protection range */
> > > >      uint64_aligned_t start, end; /* IN - inclusive start and end of
> > > > range */  };  typedef struct xen_hvm_io_range xen_hvm_io_range_t;
> > > > --
> > > > 1.7.9.5
> > >


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

