[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3] ring.h: introduce macros to handle monodirectional rings with multiple req sizes



On Tue, 28 Feb 2017, Oleksandr Andrushchenko wrote:
> On 02/28/2017 12:17 AM, Stefano Stabellini wrote:
> > On Mon, 27 Feb 2017, Oleksandr Andrushchenko wrote:
> > > On 02/23/2017 08:45 PM, Stefano Stabellini wrote:
> > > > On Thu, 23 Feb 2017, Oleksandr Andrushchenko wrote:
> > > > > Hi, Stefano!
> > > > > 
> > > > > On 02/22/2017 07:10 PM, Stefano Stabellini wrote:
> > > > > > On Wed, 22 Feb 2017, Oleksandr Andrushchenko wrote:
> > > > > > > Hi, Stefano, Jan!
> > > > > > > 1. Stefano, are you still NOT considering adding
> > > > > > > functionality to avoid memory copying? We discussed
> > > > > > > this a little bit here [1].
> > > > > > Hi Oleksandr,
> > > > > > 
> > > > > > these macros are the generic versions of what pvcalls and xen-9pfs
> > > > > > need,
> > > > > > and these protocols use memory copies.
> > > > > These macros will live in generic ring.h. It means that
> > > > > they can be used not only by pvcalls/9pfs, but others are
> > > > > allowed to use them too (kbdif/fbif/displif/??).
> > > > That's right, but without an additional use case and test case (code I
> > > > can test them with), it doesn't make sense for me to add more functions
> > > > in this patch now. They would likely not be optimal. Of course they can
> > > > be added later.
> > > > 
> > > > 
> > > > > That being said, I am not convinced that this is a good idea
> > > > > to introduce a memory copying while dealing with the packets
> > > > > in an IRQ handler, for example.
> > > > Let me premise that I have nothing against memory sharing based
> > > > protocols, in fact I welcome them, but these macros are meant for memory
> > > > copy based protocols. (It is a pity that the Intel and ARM architecture
> > > > differ so much in this regard, copy being faster than mapping in most
> > > > cases on Intel, given the high cost of tlb flushes).
> > > > 
> > > > 
> > > > > So, my point is that we should give a possibility to directly
> > > > > access ring's contents, which will not be used in your case.
> > > > That I can do. I could add a simple macro to make it easy to get a
> > > > pointer to the content for reading or writing. Today I am accessing it
> > > > with something like:
> > > >            data->out + pvcalls_mask(intf->out_prod, array_size);
> > > > 
> > > > But I could add some sugar around it.
> > > > 
> > > > 
> > > > > >     I cannot introduce memory sharing
> > > > > > interfaces as part of this patch, because they wouldn't comply to
> > > > > > pvcalls or xen-9pfs
> > > > > Again, you are thinking of pvcalls/9pfs, but none of the macros
> > > > > say they are for these use-cases: anyone can use them
> > > > > >     and I don't have another test case for them.
> > > > > > But if you had a patch to introduce them in ring.h, I would be happy
> > > > > > to
> > > > > > help you review it.
> > > > > > 
> > > > > No, I don't have any, sorry
> > > > > > > 2. Will you also provide macros/inlines for fixed sized packets?
> > > > > > > So, others do not reinvent the wheel again on top of your code.
> > > > > > I thought I already did: you can read/write fixed sized packets
> > > > > > using
> > > > > > the two read/write_packet functions.
> > > > > I was thinking of something like we have for req/resp
> > > > >       DEFINE_RING_TYPES(fsif, struct fsif_request, struct
> > > > > fsif_response);
> > > > > so you don't need to care of sizeof(req/resp/event)
> > > > I think that would be very restrictive: with the read/write_packet
> > > > functions you can actually read and write packets of the same type or
> > > > different types. You could read packets of different sizes. In fact
> > > > maybe, instead of a packet_t type parameter, we should pass an opaque
> > > > struct pointer and a size, so that they can be used to actually read and
> > > > write anything. That way, you get the maximum flexibility out of them.
> > > > 
> > > > Also, if you need a traditional ring, what not use the traditional macro
> > > > for it?
> > > I use it for req/resp (front->back), but I also need an asynchronous event
> > > channel (back->front)
> > > For that reason, Konrad suggested that I can probably re-use what you
> > > do for pvcalls/9pfs [1]. But, now I am not convinced that I should drop
> > > what I already have in displif and what kbdif/fbif use.
> > I see. If you use these macros, you'll end up with two mono-directional
> > rings. One for back->front communications and the other for front->back
> > communications. You'll also end up with two event channels, one for each
> > ring. They are used to notify the other end, both producer and consumer
> > do that. Finally, you'll get two functions (in my latest draft, still to
> > be sent out):
> > 
> > void $NAME_read_packet(const unsigned char *buf,
> >          RING_IDX masked_prod, RING_IDX *masked_cons,
> >          RING_IDX ring_size, void *opaque, size_t size);
> > 
> > void $NAME_write_packet(unsigned char *buf,
> >          RING_IDX *masked_prod, RING_IDX masked_cons,
> >          RING_IDX ring_size, const void *opaque, size_t size);
> > 
> > You can use them to send or receive pretty much anything on the ring,
> > including request/response structures, but also raw data. The code that
> > uses them looks like this (frontend side):
> > 
> >    struct xen_example_data_intf *intf; /* pointer to the indexes page */
> >    unsigned char *in;                  /* pointer to ring buffer */
> >    int irq;                            /* irq number corresponding to event
> > channel */
> >    struct xen_example_request h;       /* opaque struct to read, could be
> >                                           your request struct */
> > 
> >    RING_IDX cons, prod, masked_cons, masked_prod;
> > 
> >    while (1) {
> >             cons = intf->in_cons;
> >             prod = intf->in_prod;
> >             mb();
> >                     if (xen_example_queued(prod, cons,
> > XEN_EXAMPLE_RING_SIZE) < sizeof(h)) {
> >                     notify_remote_via_irq(irq);
> >                     return;
> >             }
> >                     masked_prod = xen_example_mask(prod,
> > XEN_EXAMPLE_RING_SIZE);
> >             masked_cons = xen_example_mask(cons, XEN_EXAMPLE_RING_SIZE);
> >             xen_example_read_packet(in,
> >                               masked_prod,
> >                               &masked_cons,
> >                               XEN_EXAMPLE_RING_SIZE,
> >                               &h,
> >                               sizeof(h));
> > 
> >          do_something_with_packet(&h);
> > 
> >          ring->intf->in_cons = cons + sizeof(h);
> >             wmb();
> >    }
> > 
> > This is an example of the similar code dealing with variable length
> > request structs:
> > 
> >    struct xen_example_data_intf *intf; /* pointer to the indexes page */
> >    unsigned char *in;                  /* pointer to ring buffer */
> >    int irq;                            /* irq number corresponding to event
> > channel */
> >    struct xen_example_header h;        /* header */
> > 
> >    unsigned char *buf = NULL;
> >    RING_IDX cons, prod, masked_cons, masked_prod;
> > 
> >    while (1) {
> >             cons = intf->in_cons;
> >             prod = intf->in_prod;
> >             mb();
> >                     if (xen_example_queued(prod, cons,
> > XEN_EXAMPLE_RING_SIZE) < sizeof(h)) {
> >                     notify_remote_via_irq(irq);
> >                     return;
> >             }
> >                     masked_prod = xen_example_mask(prod,
> > XEN_EXAMPLE_RING_SIZE);
> >             masked_cons = xen_example_mask(cons, XEN_EXAMPLE_RING_SIZE);
> >             xen_example_read_packet(in,
> >                               masked_prod,
> >                               &masked_cons,
> >                               XEN_EXAMPLE_RING_SIZE,
> >                               &h,
> >                               sizeof(h));
> > 
> >          buf = kmalloc(sizeof(h), GFP_KERNEL);
> >          ASSERT(buf != NULL);
> >          /* Assuming that the raw data follows the header and that h.size
> >           * is the length of the data. */
> >          xen_example_read_packet(in,
> >                               masked_prod,
> >                               &masked_cons,
> >                               XEN_EXAMPLE_RING_SIZE,
> >                               buf,
> >                               h.size);
> > 
> >          do_something_with_packet(&h, buf, h.size);
> > 
> >          ring->intf->in_cons = cons + sizeof(h) + h.size;
> >             wmb();
> >          kfree(buf);
> >    }
> > 
> > This code is not even compile tested but I hope it helps! Would this
> > scheme work for you? If not, why?
> At first site it will work, thank you for such a detailed explanation
> But, what we already have in ring.h for req/resp case looks simpler
> for my use-case: already fixed size, no memcpy
> For "back->front" direction - probably I can use your approach, but
> the only benefit I get is that your macros will be the part of ring.h
> 
> That being said:
> 1. Your approach is an excellent fit for arbitrary sized structures and
> buffer size - definitely WIN

Thank you :-)


> 2. For my use-case I wold prefer what I already have because of
>  - simplicity for fixed size
>  - no memcpy
>  - possibility to put "back->front" path into the same page used for
> "front->back"

Fair enough. It is possible to avoid the memcpy by pointing directly to
the memory on the ring:

    struct xen_example_request *h;
    while (1) {
        cons = intf->in_cons;
        prod = intf->in_prod;
        mb();
        if (xen_example_queued(prod, cons,
                    XEN_EXAMPLE_RING_SIZE) < sizeof(*h)) {
            notify_remote_via_irq(irq);
            return;
        }
        masked_prod = xen_example_mask(prod,
                XEN_EXAMPLE_RING_SIZE);
        masked_cons = xen_example_mask(cons, XEN_EXAMPLE_RING_SIZE);

        h = in + masked_cons;

        do_something_with_packet(h, masked_prod, masked_cons, 
XEN_EXAMPLE_RING_SIZE);

        ring->intf->in_cons = cons + sizeof(*h);
        wmb();
    }

However you would have to be careful reading the data because it could
wrap around the ring. The read_packet and write_packet macros do the
calculations for you, wrapping around when necessary, but they require a
memcpy. In this example, do_something_with_packet cannot read a
sizeof(*h) amount of bytes without checking if it causes masked_cons to
go over XEN_EXAMPLE_RING_SIZE. Alternatively, you could write an sg_list
to handle the read or the write:

    struct iovec out_sg[2]; /* sg list to fill up */
    int *num;               /* size of the sg list */
    int out_size;           /* data to write */

    RING_IDX cons, prod, masked_prod, masked_cons;

    cons = intf->out_cons;
    prod = intf->out_prod;
    rmb();
    masked_prod = xen_example_mask(prod, XEN_EXAMPLE_RING_SIZE);
    masked_cons = xen_example_mask(cons, XEN_EXAMPLE_RING_SIZE);

    if (masked_cons < masked_prod) {
        out_sg[0].iov_base = out + masked_cons;
        out_sg[0].iov_len = out_size;
        *num = 1;
    } else {
        if (out_size > (XEN_EXAMPLE_RING_SIZE - masked_cons)) {
            out_sg[0].iov_base = out + masked_cons;
            out_sg[0].iov_len = XEN_EXAMPLE_RING_SIZE - masked_cons;
            out_sg[1].iov_base = out;
            out_sg[1].iov_len = out_size - (XEN_EXAMPLE_RING_SIZE - 
masked_cons);
            *num = 2;
        } else {
            out_sg[0].iov_base = out + masked_cons;
            out_sg[0].iov_len = out_size;
            *num = 1;
        }
    }

I hope this helps. Honestly, it looks like you wouldn't gain much by
using this approach over the traditional macros in your use cases.

Konrad, what do you think?

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.